From flang-commits at lists.llvm.org Thu May 1 00:50:28 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 01 May 2025 00:50:28 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)
In-Reply-To:
Message-ID: <681327c4.050a0220.139300.1cec@mx.google.com>

https://github.com/jofrn updated https://github.com/llvm/llvm-project/pull/123609


From flang-commits at lists.llvm.org Thu May 1 01:13:18 2025
From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits)
Date: Thu, 01 May 2025 01:13:18 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang] Use precompiled headers in Frontend, Lower, Parser, Semantics and Evaluate (PR #131137)
In-Reply-To:
Message-ID: <68132d1e.a70a0220.312596.1f7f@mx.google.com>

llvm-ci wrote:

LLVM Buildbot has detected a new failure on builder `lld-x86_64-win` running on `as-worker-93` while building `flang,llvm` at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/2821

Here is the relevant piece of the build log for the reference
```
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM-Unit :: Support/./SupportTests.exe/90/95' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-9172-90-95.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=95 GTEST_SHARD_INDEX=90 C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe
--

Script:
--
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe --gtest_filter=ProgramEnvTest.CreateProcessLongPath
--
C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): error: Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): error: fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160
Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163
fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied

********************
```
https://github.com/llvm/llvm-project/pull/131137

From flang-commits at lists.llvm.org Thu May 1 03:14:31 2025
From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits)
Date: Thu, 01 May 2025 03:14:31 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix #else with trailing text (PR #138045)
In-Reply-To:
Message-ID: <68134987.a70a0220.3c663f.261b@mx.google.com>

https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/138045

>From e3f856c692eec2d5e82116e369bd51f3964256fa Mon Sep 17 00:00:00 2001
From: Eugene Epshteyn
Date: Wed, 30 Apr 2025 18:07:41 -0400
Subject: [PATCH 1/3] [flang] Fix #else with trailing text

Fixed the issue, where the extra text on #else line (' Z' in the example
below) caused the data from the "else" clause to be processed together with
the data of "then" clause.

```
        PARAMETER(A=2)
        PARAMETER(A=3)
        end
```
---
 flang/lib/Parser/preprocessor.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/flang/lib/Parser/preprocessor.cpp b/flang/lib/Parser/preprocessor.cpp
index a47f9c32ad27c..1e984896ea4ed 100644
--- a/flang/lib/Parser/preprocessor.cpp
+++ b/flang/lib/Parser/preprocessor.cpp
@@ -684,7 +684,9 @@ void Preprocessor::Directive(const TokenSequence &dir, Prescanner &prescanner) {
             dir.GetIntervalProvenanceRange(j, tokens - j),
             "#else: excess tokens at end of directive"_port_en_US);
       }
-    } else if (ifStack_.empty()) {
+    }
+
+    if (ifStack_.empty()) {
       prescanner.Say(dir.GetTokenProvenanceRange(dirOffset),
           "#else: not nested within #if, #ifdef, or #ifndef"_err_en_US);
     } else if (ifStack_.top() != CanDeadElseAppear::Yes) {

>From c7daada0d3a684d7e5c9ff8c484b1d110ed17843 Mon Sep 17 00:00:00 2001
From: Eugene Epshteyn
Date: Wed, 30 Apr 2025 18:23:55 -0400
Subject: [PATCH 2/3] Test

---
 flang/test/Preprocessing/pp048.F | 11 +++++++++++
 1 file changed, 11 insertions(+)
 create mode 100644 flang/test/Preprocessing/pp048.F

diff --git a/flang/test/Preprocessing/pp048.F b/flang/test/Preprocessing/pp048.F
new file mode 100644
index 0000000000000..121262c1840f9
--- /dev/null
+++ b/flang/test/Preprocessing/pp048.F
@@ -0,0 +1,11 @@
+! RUN: %flang -E %s 2>&1 | FileCheck %s
+#ifndef XYZ42
+      PARAMETER(A=2)
+#else Z
+      PARAMETER(A=3)
+#endif
+! Ensure that "PARAMETER(A" is printed only once
+! CHECK: PARAMETER(A
+! CHECK-NOT: PARAMETER(A
+      end
+

>From 17f057b31e857c0bd58c3017559db0e2913e1da5 Mon Sep 17 00:00:00 2001
From: Eugene Epshteyn
Date: Wed, 30 Apr 2025 18:34:01 -0400
Subject: [PATCH 3/3] Removed the blank line

---
 flang/lib/Parser/preprocessor.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/flang/lib/Parser/preprocessor.cpp b/flang/lib/Parser/preprocessor.cpp
index 1e984896ea4ed..6e8e3aee19b09 100644
--- a/flang/lib/Parser/preprocessor.cpp
+++ b/flang/lib/Parser/preprocessor.cpp
@@ -685,7 +685,6 @@ void Preprocessor::Directive(const TokenSequence &dir, Prescanner &prescanner) {
             "#else: excess tokens at end of directive"_port_en_US);
       }
     }
-
     if (ifStack_.empty()) {
       prescanner.Say(dir.GetTokenProvenanceRange(dirOffset),
           "#else: not nested within #if, #ifdef, or #ifndef"_err_en_US);

From flang-commits at lists.llvm.org Thu May 1 03:53:27 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Thu, 01 May 2025 03:53:27 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [flang][flang-driver] Support flag -finstrument-functions (PR #137996)
In-Reply-To:
Message-ID: <681352a7.170a0220.3a6029.0ec9@mx.google.com>

================
@@ -81,6 +81,8 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// Options to add to the linker for the object file
   std::vector DependentLibs;
 
+  bool InstrumentFunctions{false};
----------------
tblah wrote:

This could be stored more efficiently by putting it in CodeGenOptions.def instead

https://github.com/llvm/llvm-project/pull/137996

From flang-commits at lists.llvm.org Thu May 1 05:00:32 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 01 May 2025 
05:00:32 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [flang][llvm][OpenMP][OpenACC] Add implicit casts to omp.atomic and acc.atomic (PR #131603) In-Reply-To: Message-ID: <68136260.170a0220.31275a.0eb3@mx.google.com> https://github.com/NimishMishra updated https://github.com/llvm/llvm-project/pull/131603 >From d00abc7a026b5b17ac2ec6e10cfae2d866288d51 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Thu, 1 May 2025 17:29:56 +0530 Subject: [PATCH] [flang][llvm][OpenMP] Add implicit casts to omp.atomic --- flang/lib/Lower/OpenMP/OpenMP.cpp | 56 ++++++++++++++++++- .../Todo/atomic-capture-implicit-cast.f90 | 48 ++++++++++++++++ .../Lower/OpenMP/atomic-implicit-cast.f90 | 56 +++++++++++++++++++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 31 ---------- mlir/test/Target/LLVMIR/openmp-llvm.mlir | 21 +++---- 5 files changed, 164 insertions(+), 48 deletions(-) create mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 create mode 100644 flang/test/Lower/OpenMP/atomic-implicit-cast.f90 diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index f099028c23323..fe5aa994a76bd 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2889,9 +2889,55 @@ static void genAtomicRead(lower::AbstractConverter &converter, fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); mlir::Value toAddress = fir::getBase(converter.genExprAddr( *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); + + if (fromAddress.getType() != toAddress.getType()) { + // Emit an implicit cast + mlir::Type toType = fir::unwrapRefType(toAddress.getType()); + mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto oldIP = builder.saveInsertionPoint(); + builder.setInsertionPointToStart(builder.getAllocaBlock()); + 
mlir::Value alloca = builder.create(loc, fromType); + builder.restoreInsertionPoint(oldIP); + genAtomicCaptureStatement(converter, fromAddress, alloca, + leftHandClauseList, rightHandClauseList, + elementType, loc); + auto load = builder.create(loc, alloca); + if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { + // Emit an additional `ExtractValueOp` if `fromAddress` is of complex + // type, but `toAddress` is not. + + auto extract = builder.create( + loc, mlir::cast(fromType).getElementType(), load, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + auto cvt = builder.create(loc, toType, extract); + builder.create(loc, cvt, toAddress); + } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { + // Emit an additional `InsertValueOp` if `toAddress` is of complex + // type, but `fromAddress` is not. + mlir::Value undef = builder.create(loc, toType); + mlir::Type complexEleTy = + mlir::cast(toType).getElementType(); + mlir::Value cvt = builder.create(loc, complexEleTy, load); + mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); + mlir::Value idx0 = builder.create( + loc, toType, undef, cvt, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + mlir::Value idx1 = builder.create( + loc, toType, idx0, zero, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 1))); + builder.create(loc, idx1, toAddress); + } else { + auto cvt = builder.create(loc, toType, load); + builder.create(loc, cvt, toAddress); + } + } else + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); } /// Processes an atomic construct with update clause. 
@@ -2976,6 +3022,10 @@ static void genAtomicCapture(lower::AbstractConverter &converter, mlir::Type stmt2VarType = fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + // Check if implicit type is needed + if (stmt1VarType != stmt2VarType) + TODO(loc, "atomic capture requiring implicit type casts"); + mlir::Operation *atomicCaptureOp = nullptr; mlir::IntegerAttr hint = nullptr; mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 new file mode 100644 index 0000000000000..5b61f1169308f --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 @@ -0,0 +1,48 @@ +!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +!CHECK: not yet implemented: atomic capture requiring implicit type casts +subroutine capture_with_convert_f32_to_i32() + implicit none + integer :: k, v, i + + k = 1 + v = 0 + + !$omp atomic capture + v = k + k = (i + 1) * 3.14 + !$omp end atomic +end subroutine + +subroutine capture_with_convert_i32_to_f64() + real(8) :: x + integer :: v + x = 1.0 + v = 0 + !$omp atomic capture + v = x + x = v + !$omp end atomic +end subroutine capture_with_convert_i32_to_f64 + +subroutine capture_with_convert_f64_to_i32() + integer :: x + real(8) :: v + x = 1 + v = 0 + !$omp atomic capture + x = v + v = x + !$omp end atomic +end subroutine capture_with_convert_f64_to_i32 + +subroutine capture_with_convert_i32_to_f32() + real(4) :: x + integer :: v + x = 1.0 + v = 0 + !$omp atomic capture + v = x + x = x + v + !$omp end atomic +end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 new file mode 100644 index 0000000000000..75f1cbfc979b9 --- /dev/null +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -0,0 +1,56 @@ +! REQUIRES : openmp_runtime + +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! CHECK: func.func @_QPatomic_implicit_cast_read() { +subroutine atomic_implicit_cast_read +! CHECK: %[[ALLOCA3:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA2:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA1:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA0:.*]] = fir.alloca f32 + +! CHECK: %[[M:.*]] = fir.alloca complex {bindc_name = "m", uniq_name = "_QFatomic_implicit_cast_readEm"} +! CHECK: %[[M_DECL:.*]]:2 = hlfir.declare %[[M]] {uniq_name = "_QFatomic_implicit_cast_readEm"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %[[W:.*]] = fir.alloca complex {bindc_name = "w", uniq_name = "_QFatomic_implicit_cast_readEw"} +! CHECK: %[[W_DECL:.*]]:2 = hlfir.declare %[[W]] {uniq_name = "_QFatomic_implicit_cast_readEw"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %[[X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFatomic_implicit_cast_readEx"} +! CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X]] {uniq_name = "_QFatomic_implicit_cast_readEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Y:.*]] = fir.alloca f32 {bindc_name = "y", uniq_name = "_QFatomic_implicit_cast_readEy"} +! CHECK: %[[Y_DECL:.*]]:2 = hlfir.declare %[[Y]] {uniq_name = "_QFatomic_implicit_cast_readEy"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Z:.*]] = fir.alloca f64 {bindc_name = "z", uniq_name = "_QFatomic_implicit_cast_readEz"} +! CHECK: %[[Z_DECL:.*]]:2 = hlfir.declare %[[Z]] {uniq_name = "_QFatomic_implicit_cast_readEz"} : (!fir.ref) -> (!fir.ref, !fir.ref) + integer :: x + real :: y + double precision :: z + complex :: w + complex(8) :: m + +! CHECK: omp.atomic.read %[[ALLOCA0:.*]] = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, f32 +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA0]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (f32) -> i32 +! CHECK: fir.store %[[CVT]] to %[[X_DECL]]#0 : !fir.ref + !$omp atomic read + x = y + +! CHECK: omp.atomic.read %[[ALLOCA1:.*]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! 
CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA1]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f64 +! CHECK: fir.store %[[CVT]] to %[[Z_DECL]]#0 : !fir.ref + !$omp atomic read + z = x + +! CHECK: omp.atomic.read %[[ALLOCA2:.*]] = %[[W_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA2]] : !fir.ref> +! CHECK: %[[EXTRACT:.*]] = fir.extract_value %[[LOAD]], [0 : index] : (complex) -> f32 +! CHECK: %[[CVT:.*]] = fir.convert %[[EXTRACT]] : (f32) -> i32 +! CHECK: fir.store %[[CVT]] to %[[X_DECL]]#0 : !fir.ref + !$omp atomic read + x = w + +! CHECK: omp.atomic.read %[[ALLOCA3:.*]] = %[[W_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA3]] : !fir.ref> +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (complex) -> complex +! CHECK: fir.store %[[CVT]] to %[[M_DECL]]#0 : !fir.ref> + !$omp atomic read + m = w +end subroutine diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index 63d7171b06156..06dc1184e7cf5 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -268,33 +268,6 @@ computeOpenMPScheduleType(ScheduleKind ClauseKind, bool HasChunks, return Result; } -/// Emit an implicit cast to convert \p XRead to type of variable \p V -static llvm::Value *emitImplicitCast(IRBuilder<> &Builder, llvm::Value *XRead, - llvm::Value *V) { - // TODO: Add this functionality to the `AtomicInfo` interface - llvm::Type *XReadType = XRead->getType(); - llvm::Type *VType = V->getType(); - if (llvm::AllocaInst *vAlloca = dyn_cast(V)) - VType = vAlloca->getAllocatedType(); - - if (XReadType->isStructTy() && VType->isStructTy()) - // No need to extract or convert. A direct - // `store` will suffice. 
- return XRead; - - if (XReadType->isStructTy()) - XRead = Builder.CreateExtractValue(XRead, /*Idxs=*/0); - if (VType->isIntegerTy() && XReadType->isFloatingPointTy()) - XRead = Builder.CreateFPToSI(XRead, VType); - else if (VType->isFloatingPointTy() && XReadType->isIntegerTy()) - XRead = Builder.CreateSIToFP(XRead, VType); - else if (VType->isIntegerTy() && XReadType->isIntegerTy()) - XRead = Builder.CreateIntCast(XRead, VType, true); - else if (VType->isFloatingPointTy() && XReadType->isFloatingPointTy()) - XRead = Builder.CreateFPCast(XRead, VType); - return XRead; -} - /// Make \p Source branch to \p Target. /// /// Handles two situations: @@ -8685,8 +8658,6 @@ OpenMPIRBuilder::createAtomicRead(const LocationDescription &Loc, } } checkAndEmitFlushAfterAtomic(Loc, AO, AtomicKind::Read); - if (XRead->getType() != V.Var->getType()) - XRead = emitImplicitCast(Builder, XRead, V.Var); Builder.CreateStore(XRead, V.Var, V.IsVolatile); return Builder.saveIP(); } @@ -8983,8 +8954,6 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createAtomicCapture( return AtomicResult.takeError(); Value *CapturedVal = (IsPostfixUpdate ? 
AtomicResult->first : AtomicResult->second); - if (CapturedVal->getType() != V.Var->getType()) - CapturedVal = emitImplicitCast(Builder, CapturedVal, V.Var); Builder.CreateStore(CapturedVal, V.Var, V.IsVolatile); checkAndEmitFlushAfterAtomic(Loc, AO, AtomicKind::Capture); diff --git a/mlir/test/Target/LLVMIR/openmp-llvm.mlir b/mlir/test/Target/LLVMIR/openmp-llvm.mlir index 02a08eec74016..32f0ba5b105ff 100644 --- a/mlir/test/Target/LLVMIR/openmp-llvm.mlir +++ b/mlir/test/Target/LLVMIR/openmp-llvm.mlir @@ -1396,42 +1396,35 @@ llvm.func @omp_atomic_read_implicit_cast () { //CHECK: call void @__atomic_load(i64 8, ptr %[[X_ELEMENT]], ptr %[[ATOMIC_LOAD_TEMP]], i32 0) //CHECK: %[[LOAD:.*]] = load { float, float }, ptr %[[ATOMIC_LOAD_TEMP]], align 8 -//CHECK: %[[EXT:.*]] = extractvalue { float, float } %[[LOAD]], 0 -//CHECK: store float %[[EXT]], ptr %[[Y]], align 4 +//CHECK: store { float, float } %[[LOAD]], ptr %[[Y]], align 4 omp.atomic.read %3 = %17 : !llvm.ptr, !llvm.ptr, !llvm.struct<(f32, f32)> //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[Z]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: %[[LOAD:.*]] = fpext float %[[CAST]] to double -//CHECK: store double %[[LOAD]], ptr %[[Y]], align 8 +//CHECK: store float %[[CAST]], ptr %[[Y]], align 4 omp.atomic.read %3 = %1 : !llvm.ptr, !llvm.ptr, f32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[W]] monotonic, align 4 -//CHECK: %[[LOAD:.*]] = sitofp i32 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: store double %[[LOAD]], ptr %[[Y]], align 8 +//CHECK: store i32 %[[ATOMIC_LOAD_TEMP]], ptr %[[Y]], align 4 omp.atomic.read %3 = %7 : !llvm.ptr, !llvm.ptr, i32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i64, ptr %[[Y]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i64 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: %[[LOAD:.*]] = fptrunc double %[[CAST]] to float -//CHECK: store float %[[LOAD]], ptr %[[Z]], align 4 +//CHECK: store double %[[CAST]], ptr 
%[[Z]], align 8 omp.atomic.read %1 = %3 : !llvm.ptr, !llvm.ptr, f64 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[W]] monotonic, align 4 -//CHECK: %[[LOAD:.*]] = sitofp i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: store float %[[LOAD]], ptr %[[Z]], align 4 +//CHECK: store i32 %[[ATOMIC_LOAD_TEMP]], ptr %[[Z]], align 4 omp.atomic.read %1 = %7 : !llvm.ptr, !llvm.ptr, i32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i64, ptr %[[Y]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i64 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: %[[LOAD:.*]] = fptosi double %[[CAST]] to i32 -//CHECK: store i32 %[[LOAD]], ptr %[[W]], align 4 +//CHECK: store double %[[CAST]], ptr %[[W]], align 8 omp.atomic.read %7 = %3 : !llvm.ptr, !llvm.ptr, f64 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[Z]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: %[[LOAD:.*]] = fptosi float %[[CAST]] to i32 -//CHECK: store i32 %[[LOAD]], ptr %[[W]], align 4 +//CHECK: store float %[[CAST]], ptr %[[W]], align 4 omp.atomic.read %7 = %1 : !llvm.ptr, !llvm.ptr, f32 llvm.return } From flang-commits at lists.llvm.org Thu May 1 05:00:40 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 05:00:40 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [flang][llvm][OpenMP] Add implicit casts to omp.atomic (PR #131603) In-Reply-To: Message-ID: <68136268.050a0220.3a9318.29c8@mx.google.com> https://github.com/NimishMishra edited https://github.com/llvm/llvm-project/pull/131603 From flang-commits at lists.llvm.org Thu May 1 05:04:11 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 05:04:11 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [flang][llvm][OpenMP] Add implicit casts to omp.atomic (PR #131603) In-Reply-To: Message-ID: <6813633b.050a0220.22526d.28ad@mx.google.com> NimishMishra wrote: I have created an issue for the OpenMP atomic capture TODO: 
https://github.com/llvm/llvm-project/issues/138123

Also, results of testing this patch:

**gfortran testsuite**:
Testing Time: 43.93s
Total Discovered Tests: 6568
Passed: 6568 (100.00%)

**fujitsu testsuite** (with / without patch):
Total Discovered Tests: 88889
Passed : 87884 (98.87%)
Failed : 274 (0.31%)
Executable Missing: 731 (0.82%)

I do not have access to an aarch machine, so tested fujitsu on x86.

https://github.com/llvm/llvm-project/pull/131603

From flang-commits at lists.llvm.org Thu May 1 05:25:51 2025
From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits)
Date: Thu, 01 May 2025 05:25:51 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [mlir] [flang][llvm][OpenMP] Add implicit casts to omp.atomic (PR #131603)
In-Reply-To:
Message-ID: <6813684f.050a0220.4bc85.28b5@mx.google.com>

https://github.com/kiranchandramohan edited https://github.com/llvm/llvm-project/pull/131603

From flang-commits at lists.llvm.org Thu May 1 05:25:51 2025
From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits)
Date: Thu, 01 May 2025 05:25:51 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [mlir] [flang][llvm][OpenMP] Add implicit casts to omp.atomic (PR #131603)
In-Reply-To:
Message-ID: <6813684f.630a0220.15c654.2499@mx.google.com>

https://github.com/kiranchandramohan approved this pull request.

Thanks for running the testsuites. LGTM. Have two requests for code comments.

https://github.com/llvm/llvm-project/pull/131603

From flang-commits at lists.llvm.org Thu May 1 05:25:51 2025
From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits)
Date: Thu, 01 May 2025 05:25:51 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [mlir] [flang][llvm][OpenMP] Add implicit casts to omp.atomic (PR #131603)
In-Reply-To:
Message-ID: <6813684f.050a0220.2bac24.2bb9@mx.google.com>

================
@@ -2889,9 +2889,55 @@ static void genAtomicRead(lower::AbstractConverter &converter,
       fir::getBase(converter.genExprAddr(fromExpr, stmtCtx));
   mlir::Value toAddress = fir::getBase(converter.genExprAddr(
       *semantics::GetExpr(assignmentStmtVariable), stmtCtx));
-  genAtomicCaptureStatement(converter, fromAddress, toAddress,
-                            leftHandClauseList, rightHandClauseList,
-                            elementType, loc);
+
+  if (fromAddress.getType() != toAddress.getType()) {
----------------
kiranchandramohan wrote:

Please add comments for:
-> Why can we not use the typedAssignment lowering, and why is custom lowering used here?
-> Why do these casts have to be added?
-> Why is it safe to do so?

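As a reading aid for the questions above, here is a minimal C++ sketch of the three conversion cases that the implicit-cast branch of `genAtomicRead` in this patch appears to distinguish: complex to non-complex (extract the real part, then convert), non-complex to complex (convert, then insert with a zero imaginary part), and everything else (a plain convert). The helper names `complexToScalar`, `scalarToComplex`, and `plainConvert` are hypothetical and are not flang or MLIR APIs; the sketch operates on ordinary C++ values instead of FIR operations, so it only models the value semantics, not the lowering itself.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helpers (assumptions for illustration, not flang APIs).
// They model, on plain C++ values, the three branches of the patch.

// complex -> non-complex: keep only the real part, then convert it.
// Loosely corresponds to the ExtractValueOp + convert branch.
template <typename To, typename T>
To complexToScalar(const std::complex<T> &from) {
  return static_cast<To>(from.real());
}

// non-complex -> complex: the converted value becomes the real part and the
// imaginary part is zero. Loosely corresponds to the InsertValueOp branch.
template <typename Elem, typename From>
std::complex<Elem> scalarToComplex(From from) {
  return std::complex<Elem>(static_cast<Elem>(from), Elem(0));
}

// Both sides scalar, or both sides complex: a single plain conversion.
// Loosely corresponds to the final `else` branch.
template <typename To, typename From>
To plainConvert(const From &from) {
  return static_cast<To>(from);
}

// Small self-check modelled on the assignments in atomic-implicit-cast.f90.
inline void checkImplicitCastExamples() {
  const std::complex<float> w(3.75f, 2.0f);
  const int fromComplex = complexToScalar<int>(w);                 // x = w
  const std::complex<double> toComplex = scalarToComplex<double>(5);
  const int fromReal = plainConvert<int>(2.75f);                   // x = y
  const double fromInt = plainConvert<double>(7);                  // z = x
  assert(fromComplex == 3);
  assert(toComplex.real() == 5.0 && toComplex.imag() == 0.0);
  assert(fromReal == 2);
  assert(fromInt == 7.0);
}
```

Calling `checkImplicitCastExamples()` from a test driver exercises each case once. The float-to-integer cases truncate toward zero here because that is what `static_cast` does in C++; whether the FIR convert behaves identically in every corner case is not something this sketch establishes.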
https://github.com/llvm/llvm-project/pull/131603

From flang-commits at lists.llvm.org Thu May 1 05:25:51 2025
From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits)
Date: Thu, 01 May 2025 05:25:51 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [mlir] [flang][llvm][OpenMP] Add implicit casts to omp.atomic (PR #131603)
In-Reply-To:
Message-ID: <6813684f.170a0220.4f24a.10ba@mx.google.com>

================
@@ -2889,9 +2889,55 @@ static void genAtomicRead(lower::AbstractConverter &converter,
       fir::getBase(converter.genExprAddr(fromExpr, stmtCtx));
   mlir::Value toAddress = fir::getBase(converter.genExprAddr(
       *semantics::GetExpr(assignmentStmtVariable), stmtCtx));
-  genAtomicCaptureStatement(converter, fromAddress, toAddress,
-                            leftHandClauseList, rightHandClauseList,
-                            elementType, loc);
+
+  if (fromAddress.getType() != toAddress.getType()) {
+    // Emit an implicit cast
+    mlir::Type toType = fir::unwrapRefType(toAddress.getType());
+    mlir::Type fromType = fir::unwrapRefType(fromAddress.getType());
+    fir::FirOpBuilder &builder = converter.getFirOpBuilder();
+    auto oldIP = builder.saveInsertionPoint();
+    builder.setInsertionPointToStart(builder.getAllocaBlock());
+    mlir::Value alloca = builder.create(loc, fromType);
----------------
kiranchandramohan wrote:

Please add a comment for the need for this alloca.

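One possible reading of why the temporary is there, sketched in C++ under stated assumptions: the atomic read is performed into a private temporary that has the same type as the source, and the type conversion plus the store to the destination happen afterwards as ordinary non-atomic operations. The function `atomicReadFloatAsInt` is a hypothetical name, `std::atomic` stands in for the OpenMP atomic read, and the relaxed memory order is an assumption chosen only to keep the example small; none of this is taken from flang itself.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical illustration (not flang code): an atomic read whose source and
// destination types differ, split into one atomic step and one ordinary step.
inline int atomicReadFloatAsInt(const std::atomic<float> &source) {
  // Step 1: the only atomic operation. The shared location is read into a
  // private temporary of the *source* type, which is the role the alloca
  // seems to play in the patch.
  float temp = source.load(std::memory_order_relaxed);
  // Step 2: non-atomic conversion of the private copy to the destination
  // type. No other thread can observe or modify `temp`.
  return static_cast<int>(temp);
}

// Usage sketch corresponding to `!$omp atomic read` followed by `x = y`.
inline void checkAtomicReadExample() {
  std::atomic<float> y(2.75f);
  int x = atomicReadFloatAsInt(y);
  assert(x == 2);
}
```

The design point this sketch tries to show is that only the load of the shared variable needs to be atomic. Once the value sits in a thread-private temporary, converting it and storing it to the destination does not race with other threads' accesses to the source.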
https://github.com/llvm/llvm-project/pull/131603 From flang-commits at lists.llvm.org Thu May 1 06:13:33 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 06:13:33 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [flang][llvm][OpenMP] Add implicit casts to omp.atomic (PR #131603) In-Reply-To: Message-ID: <6813737d.170a0220.35025b.20c5@mx.google.com> https://github.com/NimishMishra updated https://github.com/llvm/llvm-project/pull/131603 >From d00abc7a026b5b17ac2ec6e10cfae2d866288d51 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Thu, 1 May 2025 17:29:56 +0530 Subject: [PATCH 1/3] [flang][llvm][OpenMP] Add implicit casts to omp.atomic --- flang/lib/Lower/OpenMP/OpenMP.cpp | 56 ++++++++++++++++++- .../Todo/atomic-capture-implicit-cast.f90 | 48 ++++++++++++++++ .../Lower/OpenMP/atomic-implicit-cast.f90 | 56 +++++++++++++++++++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 31 ---------- mlir/test/Target/LLVMIR/openmp-llvm.mlir | 21 +++---- 5 files changed, 164 insertions(+), 48 deletions(-) create mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 create mode 100644 flang/test/Lower/OpenMP/atomic-implicit-cast.f90 diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index f099028c23323..fe5aa994a76bd 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2889,9 +2889,55 @@ static void genAtomicRead(lower::AbstractConverter &converter, fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); mlir::Value toAddress = fir::getBase(converter.genExprAddr( *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); + + if (fromAddress.getType() != toAddress.getType()) { + // Emit an implicit cast + mlir::Type toType = fir::unwrapRefType(toAddress.getType()); + mlir::Type fromType = 
fir::unwrapRefType(fromAddress.getType()); + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto oldIP = builder.saveInsertionPoint(); + builder.setInsertionPointToStart(builder.getAllocaBlock()); + mlir::Value alloca = builder.create(loc, fromType); + builder.restoreInsertionPoint(oldIP); + genAtomicCaptureStatement(converter, fromAddress, alloca, + leftHandClauseList, rightHandClauseList, + elementType, loc); + auto load = builder.create(loc, alloca); + if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { + // Emit an additional `ExtractValueOp` if `fromAddress` is of complex + // type, but `toAddress` is not. + + auto extract = builder.create( + loc, mlir::cast(fromType).getElementType(), load, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + auto cvt = builder.create(loc, toType, extract); + builder.create(loc, cvt, toAddress); + } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { + // Emit an additional `InsertValueOp` if `toAddress` is of complex + // type, but `fromAddress` is not. + mlir::Value undef = builder.create(loc, toType); + mlir::Type complexEleTy = + mlir::cast(toType).getElementType(); + mlir::Value cvt = builder.create(loc, complexEleTy, load); + mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); + mlir::Value idx0 = builder.create( + loc, toType, undef, cvt, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + mlir::Value idx1 = builder.create( + loc, toType, idx0, zero, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 1))); + builder.create(loc, idx1, toAddress); + } else { + auto cvt = builder.create(loc, toType, load); + builder.create(loc, cvt, toAddress); + } + } else + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); } /// Processes an atomic construct with update clause. 
@@ -2976,6 +3022,10 @@ static void genAtomicCapture(lower::AbstractConverter &converter, mlir::Type stmt2VarType = fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + // Check if implicit type is needed + if (stmt1VarType != stmt2VarType) + TODO(loc, "atomic capture requiring implicit type casts"); + mlir::Operation *atomicCaptureOp = nullptr; mlir::IntegerAttr hint = nullptr; mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 new file mode 100644 index 0000000000000..5b61f1169308f --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 @@ -0,0 +1,48 @@ +!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +!CHECK: not yet implemented: atomic capture requiring implicit type casts +subroutine capture_with_convert_f32_to_i32() + implicit none + integer :: k, v, i + + k = 1 + v = 0 + + !$omp atomic capture + v = k + k = (i + 1) * 3.14 + !$omp end atomic +end subroutine + +subroutine capture_with_convert_i32_to_f64() + real(8) :: x + integer :: v + x = 1.0 + v = 0 + !$omp atomic capture + v = x + x = v + !$omp end atomic +end subroutine capture_with_convert_i32_to_f64 + +subroutine capture_with_convert_f64_to_i32() + integer :: x + real(8) :: v + x = 1 + v = 0 + !$omp atomic capture + x = v + v = x + !$omp end atomic +end subroutine capture_with_convert_f64_to_i32 + +subroutine capture_with_convert_i32_to_f32() + real(4) :: x + integer :: v + x = 1.0 + v = 0 + !$omp atomic capture + v = x + x = x + v + !$omp end atomic +end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 new file mode 100644 index 0000000000000..75f1cbfc979b9 --- /dev/null +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -0,0 +1,56 @@ +! REQUIRES : openmp_runtime + +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! CHECK: func.func @_QPatomic_implicit_cast_read() { +subroutine atomic_implicit_cast_read +! CHECK: %[[ALLOCA3:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA2:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA1:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA0:.*]] = fir.alloca f32 + +! CHECK: %[[M:.*]] = fir.alloca complex {bindc_name = "m", uniq_name = "_QFatomic_implicit_cast_readEm"} +! CHECK: %[[M_DECL:.*]]:2 = hlfir.declare %[[M]] {uniq_name = "_QFatomic_implicit_cast_readEm"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %[[W:.*]] = fir.alloca complex {bindc_name = "w", uniq_name = "_QFatomic_implicit_cast_readEw"} +! CHECK: %[[W_DECL:.*]]:2 = hlfir.declare %[[W]] {uniq_name = "_QFatomic_implicit_cast_readEw"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %[[X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFatomic_implicit_cast_readEx"} +! CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X]] {uniq_name = "_QFatomic_implicit_cast_readEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Y:.*]] = fir.alloca f32 {bindc_name = "y", uniq_name = "_QFatomic_implicit_cast_readEy"} +! CHECK: %[[Y_DECL:.*]]:2 = hlfir.declare %[[Y]] {uniq_name = "_QFatomic_implicit_cast_readEy"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Z:.*]] = fir.alloca f64 {bindc_name = "z", uniq_name = "_QFatomic_implicit_cast_readEz"} +! CHECK: %[[Z_DECL:.*]]:2 = hlfir.declare %[[Z]] {uniq_name = "_QFatomic_implicit_cast_readEz"} : (!fir.ref) -> (!fir.ref, !fir.ref) + integer :: x + real :: y + double precision :: z + complex :: w + complex(8) :: m + +! CHECK: omp.atomic.read %[[ALLOCA0:.*]] = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, f32 +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA0]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (f32) -> i32 +! CHECK: fir.store %[[CVT]] to %[[X_DECL]]#0 : !fir.ref + !$omp atomic read + x = y + +! CHECK: omp.atomic.read %[[ALLOCA1:.*]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! 
CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA1]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f64 +! CHECK: fir.store %[[CVT]] to %[[Z_DECL]]#0 : !fir.ref + !$omp atomic read + z = x + +! CHECK: omp.atomic.read %[[ALLOCA2:.*]] = %[[W_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA2]] : !fir.ref> +! CHECK: %[[EXTRACT:.*]] = fir.extract_value %[[LOAD]], [0 : index] : (complex) -> f32 +! CHECK: %[[CVT:.*]] = fir.convert %[[EXTRACT]] : (f32) -> i32 +! CHECK: fir.store %[[CVT]] to %[[X_DECL]]#0 : !fir.ref + !$omp atomic read + x = w + +! CHECK: omp.atomic.read %[[ALLOCA3:.*]] = %[[W_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA3]] : !fir.ref> +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (complex) -> complex +! CHECK: fir.store %[[CVT]] to %[[M_DECL]]#0 : !fir.ref> + !$omp atomic read + m = w +end subroutine diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index 63d7171b06156..06dc1184e7cf5 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -268,33 +268,6 @@ computeOpenMPScheduleType(ScheduleKind ClauseKind, bool HasChunks, return Result; } -/// Emit an implicit cast to convert \p XRead to type of variable \p V -static llvm::Value *emitImplicitCast(IRBuilder<> &Builder, llvm::Value *XRead, - llvm::Value *V) { - // TODO: Add this functionality to the `AtomicInfo` interface - llvm::Type *XReadType = XRead->getType(); - llvm::Type *VType = V->getType(); - if (llvm::AllocaInst *vAlloca = dyn_cast(V)) - VType = vAlloca->getAllocatedType(); - - if (XReadType->isStructTy() && VType->isStructTy()) - // No need to extract or convert. A direct - // `store` will suffice. 
- return XRead; - - if (XReadType->isStructTy()) - XRead = Builder.CreateExtractValue(XRead, /*Idxs=*/0); - if (VType->isIntegerTy() && XReadType->isFloatingPointTy()) - XRead = Builder.CreateFPToSI(XRead, VType); - else if (VType->isFloatingPointTy() && XReadType->isIntegerTy()) - XRead = Builder.CreateSIToFP(XRead, VType); - else if (VType->isIntegerTy() && XReadType->isIntegerTy()) - XRead = Builder.CreateIntCast(XRead, VType, true); - else if (VType->isFloatingPointTy() && XReadType->isFloatingPointTy()) - XRead = Builder.CreateFPCast(XRead, VType); - return XRead; -} - /// Make \p Source branch to \p Target. /// /// Handles two situations: @@ -8685,8 +8658,6 @@ OpenMPIRBuilder::createAtomicRead(const LocationDescription &Loc, } } checkAndEmitFlushAfterAtomic(Loc, AO, AtomicKind::Read); - if (XRead->getType() != V.Var->getType()) - XRead = emitImplicitCast(Builder, XRead, V.Var); Builder.CreateStore(XRead, V.Var, V.IsVolatile); return Builder.saveIP(); } @@ -8983,8 +8954,6 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createAtomicCapture( return AtomicResult.takeError(); Value *CapturedVal = (IsPostfixUpdate ? 
AtomicResult->first : AtomicResult->second); - if (CapturedVal->getType() != V.Var->getType()) - CapturedVal = emitImplicitCast(Builder, CapturedVal, V.Var); Builder.CreateStore(CapturedVal, V.Var, V.IsVolatile); checkAndEmitFlushAfterAtomic(Loc, AO, AtomicKind::Capture); diff --git a/mlir/test/Target/LLVMIR/openmp-llvm.mlir b/mlir/test/Target/LLVMIR/openmp-llvm.mlir index 02a08eec74016..32f0ba5b105ff 100644 --- a/mlir/test/Target/LLVMIR/openmp-llvm.mlir +++ b/mlir/test/Target/LLVMIR/openmp-llvm.mlir @@ -1396,42 +1396,35 @@ llvm.func @omp_atomic_read_implicit_cast () { //CHECK: call void @__atomic_load(i64 8, ptr %[[X_ELEMENT]], ptr %[[ATOMIC_LOAD_TEMP]], i32 0) //CHECK: %[[LOAD:.*]] = load { float, float }, ptr %[[ATOMIC_LOAD_TEMP]], align 8 -//CHECK: %[[EXT:.*]] = extractvalue { float, float } %[[LOAD]], 0 -//CHECK: store float %[[EXT]], ptr %[[Y]], align 4 +//CHECK: store { float, float } %[[LOAD]], ptr %[[Y]], align 4 omp.atomic.read %3 = %17 : !llvm.ptr, !llvm.ptr, !llvm.struct<(f32, f32)> //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[Z]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: %[[LOAD:.*]] = fpext float %[[CAST]] to double -//CHECK: store double %[[LOAD]], ptr %[[Y]], align 8 +//CHECK: store float %[[CAST]], ptr %[[Y]], align 4 omp.atomic.read %3 = %1 : !llvm.ptr, !llvm.ptr, f32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[W]] monotonic, align 4 -//CHECK: %[[LOAD:.*]] = sitofp i32 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: store double %[[LOAD]], ptr %[[Y]], align 8 +//CHECK: store i32 %[[ATOMIC_LOAD_TEMP]], ptr %[[Y]], align 4 omp.atomic.read %3 = %7 : !llvm.ptr, !llvm.ptr, i32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i64, ptr %[[Y]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i64 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: %[[LOAD:.*]] = fptrunc double %[[CAST]] to float -//CHECK: store float %[[LOAD]], ptr %[[Z]], align 4 +//CHECK: store double %[[CAST]], ptr 
%[[Z]], align 8 omp.atomic.read %1 = %3 : !llvm.ptr, !llvm.ptr, f64 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[W]] monotonic, align 4 -//CHECK: %[[LOAD:.*]] = sitofp i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: store float %[[LOAD]], ptr %[[Z]], align 4 +//CHECK: store i32 %[[ATOMIC_LOAD_TEMP]], ptr %[[Z]], align 4 omp.atomic.read %1 = %7 : !llvm.ptr, !llvm.ptr, i32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i64, ptr %[[Y]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i64 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: %[[LOAD:.*]] = fptosi double %[[CAST]] to i32 -//CHECK: store i32 %[[LOAD]], ptr %[[W]], align 4 +//CHECK: store double %[[CAST]], ptr %[[W]], align 8 omp.atomic.read %7 = %3 : !llvm.ptr, !llvm.ptr, f64 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[Z]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: %[[LOAD:.*]] = fptosi float %[[CAST]] to i32 -//CHECK: store i32 %[[LOAD]], ptr %[[W]], align 4 +//CHECK: store float %[[CAST]], ptr %[[W]], align 4 omp.atomic.read %7 = %1 : !llvm.ptr, !llvm.ptr, f32 llvm.return } >From dae12f1ceb3d906df77a43096763c5553ff83fc9 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Thu, 1 May 2025 18:37:12 +0530 Subject: [PATCH 2/3] Add comment explaining the implicit casting --- flang/lib/Lower/OpenMP/OpenMP.cpp | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fe5aa994a76bd..d00ab6d07d7b8 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2891,13 +2891,39 @@ static void genAtomicRead(lower::AbstractConverter &converter, *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); if (fromAddress.getType() != toAddress.getType()) { - // Emit an implicit cast + // Emit an implicit cast. + // Different yet compatible types on omp.atomic.read constitute valid + // Fortran. 
The OMPIRBuilder will emit atomic instructions (on primitive + // types) and + // __atomic_load libcall (on complex type) without explicitly converting + // between such compatible types, leading to execute issues. The + // OMPIRBuilder relies on the frontend to resolve such inconsistencies + // between omp.atomic.read operand types. Inconsistency between operand + // types in omp.atomic.write are resolved through implicit casting by use of + // typed assignment (i.e. `evaluate::Assignment`). However, use of typed + // assignment in omp.atomic.read (of form `v = x`) leads to an unsafe, + // non-atomic load of `x` into a temporary `alloca`, followed by an atomic + // read of form `v = alloca`. Hence, perform a custom implicit cast. + + // For an atomic read of form `v = x` that would (without implicit casting) + // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, + // type2`, this implicit casting will generate the following FIR: %alloca = + // fir.alloca type2 + // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, + //type2 %load = fir.load %alloca : !fir.ref %cvt = fir.convert %load + //: (type2) -> type1 fir.store %cvt to %v : !fir.ref + + // These sequence of operations is thread-safe since each thread allocates + // the `alloca` in its stack, and performs `%alloca = %x` atomically. Once + // safely read, each thread performs the implicit cast on the local alloca, + // and writes the final result to `%v`. mlir::Type toType = fir::unwrapRefType(toAddress.getType()); mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); auto oldIP = builder.saveInsertionPoint(); builder.setInsertionPointToStart(builder.getAllocaBlock()); - mlir::Value alloca = builder.create(loc, fromType); + mlir::Value alloca = builder.create( + loc, fromType); // Thread scope `alloca` to atomically read `%x`. 
builder.restoreInsertionPoint(oldIP); genAtomicCaptureStatement(converter, fromAddress, alloca, leftHandClauseList, rightHandClauseList, >From a66a09c137833d49b35572636ae11174e8ee89de Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Thu, 1 May 2025 18:43:09 +0530 Subject: [PATCH 3/3] Fix formatting --- flang/lib/Lower/OpenMP/OpenMP.cpp | 43 ++++++++++++++++--------------- 1 file changed, 22 insertions(+), 21 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index d00ab6d07d7b8..2fc45e501f3fc 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2891,32 +2891,34 @@ static void genAtomicRead(lower::AbstractConverter &converter, *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); if (fromAddress.getType() != toAddress.getType()) { - // Emit an implicit cast. - // Different yet compatible types on omp.atomic.read constitute valid - // Fortran. The OMPIRBuilder will emit atomic instructions (on primitive - // types) and - // __atomic_load libcall (on complex type) without explicitly converting - // between such compatible types, leading to execute issues. The - // OMPIRBuilder relies on the frontend to resolve such inconsistencies - // between omp.atomic.read operand types. Inconsistency between operand - // types in omp.atomic.write are resolved through implicit casting by use of - // typed assignment (i.e. `evaluate::Assignment`). However, use of typed - // assignment in omp.atomic.read (of form `v = x`) leads to an unsafe, + // Emit an implicit cast. Different yet compatible types on + // omp.atomic.read constitute valid Fortran. The OMPIRBuilder will + // emit atomic instructions (on primitive types) and `__atomic_load` + // libcall (on complex type) without explicitly converting + // between such compatible types. The OMPIRBuilder relies on the + // frontend to resolve such inconsistencies between `omp.atomic.read ` + // operand types. 
Similar inconsistencies between operand types in + // `omp.atomic.write` are resolved through implicit casting by use of typed + // assignment (i.e. `evaluate::Assignment`). However, use of typed + // assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, // non-atomic load of `x` into a temporary `alloca`, followed by an atomic - // read of form `v = alloca`. Hence, perform a custom implicit cast. + // read of form `v = alloca`. Hence, it is needed to perform a custom + // implicit cast. - // For an atomic read of form `v = x` that would (without implicit casting) + // An atomic read of form `v = x` would (without implicit casting) // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, - // type2`, this implicit casting will generate the following FIR: %alloca = - // fir.alloca type2 - // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, - //type2 %load = fir.load %alloca : !fir.ref %cvt = fir.convert %load - //: (type2) -> type1 fir.store %cvt to %v : !fir.ref + // type2`. This implicit casting will rather generate the following FIR: + // + // %alloca = fir.alloca type2 + // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 + // %load = fir.load %alloca : !fir.ref + // %cvt = fir.convert %load: (type2) -> type1 + // fir.store %cvt to %v : !fir.ref // These sequence of operations is thread-safe since each thread allocates // the `alloca` in its stack, and performs `%alloca = %x` atomically. Once - // safely read, each thread performs the implicit cast on the local alloca, - // and writes the final result to `%v`. + // safely read, each thread performs the implicit cast on the local + // `alloca`, and writes the final result to `%v`. 
mlir::Type toType = fir::unwrapRefType(toAddress.getType()); mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); @@ -2932,7 +2934,6 @@ static void genAtomicRead(lower::AbstractConverter &converter, if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { // Emit an additional `ExtractValueOp` if `fromAddress` is of complex // type, but `toAddress` is not. - auto extract = builder.create( loc, mlir::cast(fromType).getElementType(), load, builder.getArrayAttr( From flang-commits at lists.llvm.org Thu May 1 06:15:45 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 06:15:45 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [flang][llvm][OpenMP] Add implicit casts to omp.atomic (PR #131603) In-Reply-To: Message-ID: <68137401.050a0220.23ac2d.3aed@mx.google.com> https://github.com/NimishMishra updated https://github.com/llvm/llvm-project/pull/131603 >From d00abc7a026b5b17ac2ec6e10cfae2d866288d51 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Thu, 1 May 2025 17:29:56 +0530 Subject: [PATCH 1/4] [flang][llvm][OpenMP] Add implicit casts to omp.atomic --- flang/lib/Lower/OpenMP/OpenMP.cpp | 56 ++++++++++++++++++- .../Todo/atomic-capture-implicit-cast.f90 | 48 ++++++++++++++++ .../Lower/OpenMP/atomic-implicit-cast.f90 | 56 +++++++++++++++++++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 31 ---------- mlir/test/Target/LLVMIR/openmp-llvm.mlir | 21 +++---- 5 files changed, 164 insertions(+), 48 deletions(-) create mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 create mode 100644 flang/test/Lower/OpenMP/atomic-implicit-cast.f90 diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index f099028c23323..fe5aa994a76bd 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2889,9 +2889,55 @@ static void genAtomicRead(lower::AbstractConverter &converter, 
fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); mlir::Value toAddress = fir::getBase(converter.genExprAddr( *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); + + if (fromAddress.getType() != toAddress.getType()) { + // Emit an implicit cast + mlir::Type toType = fir::unwrapRefType(toAddress.getType()); + mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto oldIP = builder.saveInsertionPoint(); + builder.setInsertionPointToStart(builder.getAllocaBlock()); + mlir::Value alloca = builder.create(loc, fromType); + builder.restoreInsertionPoint(oldIP); + genAtomicCaptureStatement(converter, fromAddress, alloca, + leftHandClauseList, rightHandClauseList, + elementType, loc); + auto load = builder.create(loc, alloca); + if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { + // Emit an additional `ExtractValueOp` if `fromAddress` is of complex + // type, but `toAddress` is not. + + auto extract = builder.create( + loc, mlir::cast(fromType).getElementType(), load, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + auto cvt = builder.create(loc, toType, extract); + builder.create(loc, cvt, toAddress); + } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { + // Emit an additional `InsertValueOp` if `toAddress` is of complex + // type, but `fromAddress` is not. 
+ mlir::Value undef = builder.create(loc, toType); + mlir::Type complexEleTy = + mlir::cast(toType).getElementType(); + mlir::Value cvt = builder.create(loc, complexEleTy, load); + mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); + mlir::Value idx0 = builder.create( + loc, toType, undef, cvt, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + mlir::Value idx1 = builder.create( + loc, toType, idx0, zero, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 1))); + builder.create(loc, idx1, toAddress); + } else { + auto cvt = builder.create(loc, toType, load); + builder.create(loc, cvt, toAddress); + } + } else + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); } /// Processes an atomic construct with update clause. @@ -2976,6 +3022,10 @@ static void genAtomicCapture(lower::AbstractConverter &converter, mlir::Type stmt2VarType = fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + // Check if implicit type is needed + if (stmt1VarType != stmt2VarType) + TODO(loc, "atomic capture requiring implicit type casts"); + mlir::Operation *atomicCaptureOp = nullptr; mlir::IntegerAttr hint = nullptr; mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 new file mode 100644 index 0000000000000..5b61f1169308f --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 @@ -0,0 +1,48 @@ +!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +!CHECK: not yet implemented: atomic capture requiring implicit type casts +subroutine capture_with_convert_f32_to_i32() + implicit none + integer :: k, v, i + + k = 1 + v = 0 + + !$omp atomic capture + v = k + k = (i + 1) * 3.14 + !$omp end atomic +end subroutine + +subroutine 
capture_with_convert_i32_to_f64() + real(8) :: x + integer :: v + x = 1.0 + v = 0 + !$omp atomic capture + v = x + x = v + !$omp end atomic +end subroutine capture_with_convert_i32_to_f64 + +subroutine capture_with_convert_f64_to_i32() + integer :: x + real(8) :: v + x = 1 + v = 0 + !$omp atomic capture + x = v + v = x + !$omp end atomic +end subroutine capture_with_convert_f64_to_i32 + +subroutine capture_with_convert_i32_to_f32() + real(4) :: x + integer :: v + x = 1.0 + v = 0 + !$omp atomic capture + v = x + x = x + v + !$omp end atomic +end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 new file mode 100644 index 0000000000000..75f1cbfc979b9 --- /dev/null +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -0,0 +1,56 @@ +! REQUIRES : openmp_runtime + +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! CHECK: func.func @_QPatomic_implicit_cast_read() { +subroutine atomic_implicit_cast_read +! CHECK: %[[ALLOCA3:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA2:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA1:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA0:.*]] = fir.alloca f32 + +! CHECK: %[[M:.*]] = fir.alloca complex {bindc_name = "m", uniq_name = "_QFatomic_implicit_cast_readEm"} +! CHECK: %[[M_DECL:.*]]:2 = hlfir.declare %[[M]] {uniq_name = "_QFatomic_implicit_cast_readEm"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %[[W:.*]] = fir.alloca complex {bindc_name = "w", uniq_name = "_QFatomic_implicit_cast_readEw"} +! CHECK: %[[W_DECL:.*]]:2 = hlfir.declare %[[W]] {uniq_name = "_QFatomic_implicit_cast_readEw"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %[[X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFatomic_implicit_cast_readEx"} +! CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X]] {uniq_name = "_QFatomic_implicit_cast_readEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! 
CHECK: %[[Y:.*]] = fir.alloca f32 {bindc_name = "y", uniq_name = "_QFatomic_implicit_cast_readEy"} +! CHECK: %[[Y_DECL:.*]]:2 = hlfir.declare %[[Y]] {uniq_name = "_QFatomic_implicit_cast_readEy"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Z:.*]] = fir.alloca f64 {bindc_name = "z", uniq_name = "_QFatomic_implicit_cast_readEz"} +! CHECK: %[[Z_DECL:.*]]:2 = hlfir.declare %[[Z]] {uniq_name = "_QFatomic_implicit_cast_readEz"} : (!fir.ref) -> (!fir.ref, !fir.ref) + integer :: x + real :: y + double precision :: z + complex :: w + complex(8) :: m + +! CHECK: omp.atomic.read %[[ALLOCA0:.*]] = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, f32 +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA0]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (f32) -> i32 +! CHECK: fir.store %[[CVT]] to %[[X_DECL]]#0 : !fir.ref + !$omp atomic read + x = y + +! CHECK: omp.atomic.read %[[ALLOCA1:.*]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA1]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f64 +! CHECK: fir.store %[[CVT]] to %[[Z_DECL]]#0 : !fir.ref + !$omp atomic read + z = x + +! CHECK: omp.atomic.read %[[ALLOCA2:.*]] = %[[W_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA2]] : !fir.ref> +! CHECK: %[[EXTRACT:.*]] = fir.extract_value %[[LOAD]], [0 : index] : (complex) -> f32 +! CHECK: %[[CVT:.*]] = fir.convert %[[EXTRACT]] : (f32) -> i32 +! CHECK: fir.store %[[CVT]] to %[[X_DECL]]#0 : !fir.ref + !$omp atomic read + x = w + +! CHECK: omp.atomic.read %[[ALLOCA3:.*]] = %[[W_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA3]] : !fir.ref> +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (complex) -> complex +! 
CHECK: fir.store %[[CVT]] to %[[M_DECL]]#0 : !fir.ref> + !$omp atomic read + m = w +end subroutine diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index 63d7171b06156..06dc1184e7cf5 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -268,33 +268,6 @@ computeOpenMPScheduleType(ScheduleKind ClauseKind, bool HasChunks, return Result; } -/// Emit an implicit cast to convert \p XRead to type of variable \p V -static llvm::Value *emitImplicitCast(IRBuilder<> &Builder, llvm::Value *XRead, - llvm::Value *V) { - // TODO: Add this functionality to the `AtomicInfo` interface - llvm::Type *XReadType = XRead->getType(); - llvm::Type *VType = V->getType(); - if (llvm::AllocaInst *vAlloca = dyn_cast(V)) - VType = vAlloca->getAllocatedType(); - - if (XReadType->isStructTy() && VType->isStructTy()) - // No need to extract or convert. A direct - // `store` will suffice. - return XRead; - - if (XReadType->isStructTy()) - XRead = Builder.CreateExtractValue(XRead, /*Idxs=*/0); - if (VType->isIntegerTy() && XReadType->isFloatingPointTy()) - XRead = Builder.CreateFPToSI(XRead, VType); - else if (VType->isFloatingPointTy() && XReadType->isIntegerTy()) - XRead = Builder.CreateSIToFP(XRead, VType); - else if (VType->isIntegerTy() && XReadType->isIntegerTy()) - XRead = Builder.CreateIntCast(XRead, VType, true); - else if (VType->isFloatingPointTy() && XReadType->isFloatingPointTy()) - XRead = Builder.CreateFPCast(XRead, VType); - return XRead; -} - /// Make \p Source branch to \p Target. 
/// /// Handles two situations: @@ -8685,8 +8658,6 @@ OpenMPIRBuilder::createAtomicRead(const LocationDescription &Loc, } } checkAndEmitFlushAfterAtomic(Loc, AO, AtomicKind::Read); - if (XRead->getType() != V.Var->getType()) - XRead = emitImplicitCast(Builder, XRead, V.Var); Builder.CreateStore(XRead, V.Var, V.IsVolatile); return Builder.saveIP(); } @@ -8983,8 +8954,6 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createAtomicCapture( return AtomicResult.takeError(); Value *CapturedVal = (IsPostfixUpdate ? AtomicResult->first : AtomicResult->second); - if (CapturedVal->getType() != V.Var->getType()) - CapturedVal = emitImplicitCast(Builder, CapturedVal, V.Var); Builder.CreateStore(CapturedVal, V.Var, V.IsVolatile); checkAndEmitFlushAfterAtomic(Loc, AO, AtomicKind::Capture); diff --git a/mlir/test/Target/LLVMIR/openmp-llvm.mlir b/mlir/test/Target/LLVMIR/openmp-llvm.mlir index 02a08eec74016..32f0ba5b105ff 100644 --- a/mlir/test/Target/LLVMIR/openmp-llvm.mlir +++ b/mlir/test/Target/LLVMIR/openmp-llvm.mlir @@ -1396,42 +1396,35 @@ llvm.func @omp_atomic_read_implicit_cast () { //CHECK: call void @__atomic_load(i64 8, ptr %[[X_ELEMENT]], ptr %[[ATOMIC_LOAD_TEMP]], i32 0) //CHECK: %[[LOAD:.*]] = load { float, float }, ptr %[[ATOMIC_LOAD_TEMP]], align 8 -//CHECK: %[[EXT:.*]] = extractvalue { float, float } %[[LOAD]], 0 -//CHECK: store float %[[EXT]], ptr %[[Y]], align 4 +//CHECK: store { float, float } %[[LOAD]], ptr %[[Y]], align 4 omp.atomic.read %3 = %17 : !llvm.ptr, !llvm.ptr, !llvm.struct<(f32, f32)> //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[Z]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: %[[LOAD:.*]] = fpext float %[[CAST]] to double -//CHECK: store double %[[LOAD]], ptr %[[Y]], align 8 +//CHECK: store float %[[CAST]], ptr %[[Y]], align 4 omp.atomic.read %3 = %1 : !llvm.ptr, !llvm.ptr, f32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[W]] monotonic, align 4 -//CHECK: 
%[[LOAD:.*]] = sitofp i32 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: store double %[[LOAD]], ptr %[[Y]], align 8 +//CHECK: store i32 %[[ATOMIC_LOAD_TEMP]], ptr %[[Y]], align 4 omp.atomic.read %3 = %7 : !llvm.ptr, !llvm.ptr, i32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i64, ptr %[[Y]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i64 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: %[[LOAD:.*]] = fptrunc double %[[CAST]] to float -//CHECK: store float %[[LOAD]], ptr %[[Z]], align 4 +//CHECK: store double %[[CAST]], ptr %[[Z]], align 8 omp.atomic.read %1 = %3 : !llvm.ptr, !llvm.ptr, f64 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[W]] monotonic, align 4 -//CHECK: %[[LOAD:.*]] = sitofp i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: store float %[[LOAD]], ptr %[[Z]], align 4 +//CHECK: store i32 %[[ATOMIC_LOAD_TEMP]], ptr %[[Z]], align 4 omp.atomic.read %1 = %7 : !llvm.ptr, !llvm.ptr, i32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i64, ptr %[[Y]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i64 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: %[[LOAD:.*]] = fptosi double %[[CAST]] to i32 -//CHECK: store i32 %[[LOAD]], ptr %[[W]], align 4 +//CHECK: store double %[[CAST]], ptr %[[W]], align 8 omp.atomic.read %7 = %3 : !llvm.ptr, !llvm.ptr, f64 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[Z]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: %[[LOAD:.*]] = fptosi float %[[CAST]] to i32 -//CHECK: store i32 %[[LOAD]], ptr %[[W]], align 4 +//CHECK: store float %[[CAST]], ptr %[[W]], align 4 omp.atomic.read %7 = %1 : !llvm.ptr, !llvm.ptr, f32 llvm.return } >From dae12f1ceb3d906df77a43096763c5553ff83fc9 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Thu, 1 May 2025 18:37:12 +0530 Subject: [PATCH 2/4] Add comment explaining the implicit casting --- flang/lib/Lower/OpenMP/OpenMP.cpp | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git 
a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fe5aa994a76bd..d00ab6d07d7b8 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2891,13 +2891,39 @@ static void genAtomicRead(lower::AbstractConverter &converter, *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); if (fromAddress.getType() != toAddress.getType()) { - // Emit an implicit cast + // Emit an implicit cast. + // Different yet compatible types on omp.atomic.read constitute valid + // Fortran. The OMPIRBuilder will emit atomic instructions (on primitive + // types) and + // __atomic_load libcall (on complex type) without explicitly converting + // between such compatible types, leading to execute issues. The + // OMPIRBuilder relies on the frontend to resolve such inconsistencies + // between omp.atomic.read operand types. Inconsistency between operand + // types in omp.atomic.write are resolved through implicit casting by use of + // typed assignment (i.e. `evaluate::Assignment`). However, use of typed + // assignment in omp.atomic.read (of form `v = x`) leads to an unsafe, + // non-atomic load of `x` into a temporary `alloca`, followed by an atomic + // read of form `v = alloca`. Hence, perform a custom implicit cast. + + // For an atomic read of form `v = x` that would (without implicit casting) + // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, + // type2`, this implicit casting will generate the following FIR: %alloca = + // fir.alloca type2 + // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, + //type2 %load = fir.load %alloca : !fir.ref %cvt = fir.convert %load + //: (type2) -> type1 fir.store %cvt to %v : !fir.ref + + // These sequence of operations is thread-safe since each thread allocates + // the `alloca` in its stack, and performs `%alloca = %x` atomically. Once + // safely read, each thread performs the implicit cast on the local alloca, + // and writes the final result to `%v`. 
mlir::Type toType = fir::unwrapRefType(toAddress.getType()); mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); auto oldIP = builder.saveInsertionPoint(); builder.setInsertionPointToStart(builder.getAllocaBlock()); - mlir::Value alloca = builder.create(loc, fromType); + mlir::Value alloca = builder.create( + loc, fromType); // Thread scope `alloca` to atomically read `%x`. builder.restoreInsertionPoint(oldIP); genAtomicCaptureStatement(converter, fromAddress, alloca, leftHandClauseList, rightHandClauseList, >From a66a09c137833d49b35572636ae11174e8ee89de Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Thu, 1 May 2025 18:43:09 +0530 Subject: [PATCH 3/4] Fix formatting --- flang/lib/Lower/OpenMP/OpenMP.cpp | 43 ++++++++++++++++--------------- 1 file changed, 22 insertions(+), 21 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index d00ab6d07d7b8..2fc45e501f3fc 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2891,32 +2891,34 @@ static void genAtomicRead(lower::AbstractConverter &converter, *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); if (fromAddress.getType() != toAddress.getType()) { - // Emit an implicit cast. - // Different yet compatible types on omp.atomic.read constitute valid - // Fortran. The OMPIRBuilder will emit atomic instructions (on primitive - // types) and - // __atomic_load libcall (on complex type) without explicitly converting - // between such compatible types, leading to execute issues. The - // OMPIRBuilder relies on the frontend to resolve such inconsistencies - // between omp.atomic.read operand types. Inconsistency between operand - // types in omp.atomic.write are resolved through implicit casting by use of - // typed assignment (i.e. `evaluate::Assignment`). 
However, use of typed - // assignment in omp.atomic.read (of form `v = x`) leads to an unsafe, + // Emit an implicit cast. Different yet compatible types on + // omp.atomic.read constitute valid Fortran. The OMPIRBuilder will + // emit atomic instructions (on primitive types) and `__atomic_load` + // libcall (on complex type) without explicitly converting + // between such compatible types. The OMPIRBuilder relies on the + // frontend to resolve such inconsistencies between `omp.atomic.read ` + // operand types. Similar inconsistencies between operand types in + // `omp.atomic.write` are resolved through implicit casting by use of typed + // assignment (i.e. `evaluate::Assignment`). However, use of typed + // assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, // non-atomic load of `x` into a temporary `alloca`, followed by an atomic - // read of form `v = alloca`. Hence, perform a custom implicit cast. + // read of form `v = alloca`. Hence, it is needed to perform a custom + // implicit cast. - // For an atomic read of form `v = x` that would (without implicit casting) + // An atomic read of form `v = x` would (without implicit casting) // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, - // type2`, this implicit casting will generate the following FIR: %alloca = - // fir.alloca type2 - // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, - //type2 %load = fir.load %alloca : !fir.ref %cvt = fir.convert %load - //: (type2) -> type1 fir.store %cvt to %v : !fir.ref + // type2`. This implicit casting will rather generate the following FIR: + // + // %alloca = fir.alloca type2 + // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 + // %load = fir.load %alloca : !fir.ref + // %cvt = fir.convert %load: (type2) -> type1 + // fir.store %cvt to %v : !fir.ref // These sequence of operations is thread-safe since each thread allocates // the `alloca` in its stack, and performs `%alloca = %x` atomically. 
Once - // safely read, each thread performs the implicit cast on the local alloca, - // and writes the final result to `%v`. + // safely read, each thread performs the implicit cast on the local + // `alloca`, and writes the final result to `%v`. mlir::Type toType = fir::unwrapRefType(toAddress.getType()); mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); @@ -2932,7 +2934,6 @@ static void genAtomicRead(lower::AbstractConverter &converter, if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { // Emit an additional `ExtractValueOp` if `fromAddress` is of complex // type, but `toAddress` is not. - auto extract = builder.create( loc, mlir::cast(fromType).getElementType(), load, builder.getArrayAttr( >From d1353cfd0e49b4b35a56bab927d9494e4eb337c8 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Thu, 1 May 2025 18:45:33 +0530 Subject: [PATCH 4/4] Fix minor typo --- flang/lib/Lower/OpenMP/OpenMP.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 2fc45e501f3fc..47e7c266ff7d3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2906,13 +2906,13 @@ static void genAtomicRead(lower::AbstractConverter &converter, // implicit cast. // An atomic read of form `v = x` would (without implicit casting) - // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, + // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, // type2`. 
This implicit casting will rather generate the following FIR: // // %alloca = fir.alloca type2 // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 // %load = fir.load %alloca : !fir.ref - // %cvt = fir.convert %load: (type2) -> type1 + // %cvt = fir.convert %load : (type2) -> type1 // fir.store %cvt to %v : !fir.ref // These sequence of operations is thread-safe since each thread allocates From flang-commits at lists.llvm.org Thu May 1 06:16:03 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 06:16:03 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [flang][llvm][OpenMP] Add implicit casts to omp.atomic (PR #131603) In-Reply-To: Message-ID: <68137413.170a0220.2b2a6.f9d3@mx.google.com> ================ @@ -2889,9 +2889,55 @@ static void genAtomicRead(lower::AbstractConverter &converter, fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); mlir::Value toAddress = fir::getBase(converter.genExprAddr( *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); + + if (fromAddress.getType() != toAddress.getType()) { ---------------- NimishMishra wrote: Thanks. 
I have added an explanation for these https://github.com/llvm/llvm-project/pull/131603 From flang-commits at lists.llvm.org Thu May 1 06:18:36 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 01 May 2025 06:18:36 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <681374ac.170a0220.16d213.2a28@mx.google.com> https://github.com/kparzysz edited https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Thu May 1 06:18:36 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 01 May 2025 06:18:36 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <681374ac.650a0220.2dff6e.279f@mx.google.com> https://github.com/kparzysz approved this pull request. Sorry for the delay in reviewing. Looks ok to me, but please wait for @tblah's input as well. 
https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Thu May 1 06:18:37 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 01 May 2025 06:18:37 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <681374ad.170a0220.22486a.2b28@mx.google.com> ================ @@ -3627,40 +3683,45 @@ void OmpStructureChecker::CheckIsVarPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause) { for (const auto &ompObject : objList.v) { - common::visit( - common::visitors{ - [&](const parser::Designator &designator) { - if (const auto *dataRef{ - std::get_if(&designator.u)}) { - if (IsDataRefTypeParamInquiry(dataRef)) { + CheckIsVarPartOfAnotherVar(source, ompObject, clause); + } +} + +void OmpStructureChecker::CheckIsVarPartOfAnotherVar( + const parser::CharBlock &source, const parser::OmpObject &ompObject, + llvm::StringRef clause) { + common::visit( + common::visitors{ + [&](const parser::Designator &designator) { + if (const auto *dataRef{ + std::get_if(&designator.u)}) { + if (IsDataRefTypeParamInquiry(dataRef)) { + context_.Say(source, + "A type parameter inquiry cannot appear on the %s " + "directive"_err_en_US, ---------------- kparzysz wrote: Please make message strings be in a single line (here and in other places you've changed). This is to make it easier to find them in the source based on the compiler output. 
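A minimal C++ sketch of the point above, under stated assumptions: the message fragment is borrowed from the DETACH check in this thread, while the split point, the `linesContaining` helper, and the variable names are hypothetical and are not Flang code. Adjacent string literals are concatenated by the compiler, so both spellings print the same text, but only the single-line spelling can be found by a line-based search for that text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns how many source lines contain `needle`, which is roughly what a
// line-based search (e.g. grep) reports for text copied from compiler output.
std::size_t linesContaining(const std::vector<std::string> &sourceLines,
                            const std::string &needle) {
  std::size_t hits = 0;
  for (const std::string &line : sourceLines)
    if (line.find(needle) != std::string::npos)
      ++hits;
  return hits;
}

// Adjacent literals are concatenated at compile time, so both spellings
// produce the same message text at run time.
const std::string splitMessage = "the encountering task must not be "
                                 "a FINAL task";
const std::string singleLineMessage =
    "the encountering task must not be a FINAL task";

// The same two spellings as they would appear in a source file, one element
// per physical line (hypothetical layout).
const std::vector<std::string> splitSource = {
    "\"the encountering task must not be \"",
    "\"a FINAL task\""};
const std::vector<std::string> singleLineSource = {
    "\"the encountering task must not be a FINAL task\""};

// Sanity check: the compiler output is identical for both spellings.
void checkSameText() { assert(splitMessage == singleLineMessage); }
```

Searching `splitSource` for a phrase that straddles the split, such as `must not be a FINAL task`, finds no line, whereas the same search over `singleLineSource` finds exactly one.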
https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Thu May 1 06:18:37 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 01 May 2025 06:18:37 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <681374ad.170a0220.14b47a.299e@mx.google.com> ================ @@ -2961,6 +2961,63 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { // clause CheckMultListItems(); + if (GetContext().directive == llvm::omp::Directive::OMPD_task) { + if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { + unsigned version{context_.langOptions().OpenMPVersion}; + if (version == 50 || version == 51) { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_detach, + {llvm::omp::Clause::OMPC_mergeable}); + } else if (version >= 52) { + // OpenMP 5.2: 12.5.2 Detach construct restrictions + if (FindClause(llvm::omp::Clause::OMPC_final)) { + context_.Say(GetContext().clauseSource, + "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); + } + + const auto &detach{ + std::get(detachClause->u)}; + if (const auto *name{parser::Unwrap(detach.v.v)}) { + if (name->symbol) { ---------------- kparzysz wrote: All parser::Name's should have non-null symbols by now (it's an internal compiler error if they don't). 
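One way to read this suggestion is sketched below in self-contained C++. `Symbol`, `Name`, and `symbolOf` are hypothetical stand-ins rather than Flang's `parser::Name` or `semantics::Symbol`, and the assertion merely stands in for whatever internal-error mechanism the checker would actually use. The idea is to state the invariant once instead of guarding the check with `if (name->symbol)`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the parser name and its resolved symbol.
struct Symbol {
  std::string name;
};
struct Name {
  std::string source;
  Symbol *symbol = nullptr; // expected to be filled in by name resolution
};

// Invariant form: by the time semantic checks run, an unresolved name is
// treated as a compiler bug, so assert and then use the symbol directly
// rather than silently skipping the check when the pointer is null.
const Symbol &symbolOf(const Name &name) {
  assert(name.symbol && "name should have been resolved before this check");
  return *name.symbol;
}
```

With a helper like this, the caller can drop one level of nesting: the restriction check runs unconditionally on `symbolOf(*name)`.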
https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Thu May 1 06:39:07 2025 From: flang-commits at lists.llvm.org (David Truby via flang-commits) Date: Thu, 01 May 2025 06:39:07 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] Remove FLANG_INCLUDE_RUNTIME (PR #124126) In-Reply-To: Message-ID: <6813797b.170a0220.1e69a0.4e76@mx.google.com> DavidTruby wrote: This patch seems to have removed the behaviour that flang-rt is added to LLVM_ENABLE_RUNTIMES automatically which has broken all of our internal builds. My read of the commit message, and the patch, is that this behaviour hasn't been intentionally removed? I have a cmake line like: ``` cmake -GNinja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_CCACHE_BUILD=On -DLLVM_ENABLE_PROJECTS="clang;flang;mlir" -DLLVM_ENABLE_RUNTIMES=openmp -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=On ../llvm ``` and I am not getting flang-rt. I also see: ``` -- Not building Flang-RT. For a usable Fortran toolchain, either set FLANG_ENABLE_FLANG_RT=ON, add LLVM_ENABLE_RUNTIMES=flang-rt, or compile a standalone Flang-RT. 
``` https://github.com/llvm/llvm-project/pull/124126 From flang-commits at lists.llvm.org Thu May 1 06:44:05 2025 From: flang-commits at lists.llvm.org (David Truby via flang-commits) Date: Thu, 01 May 2025 06:44:05 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] Remove FLANG_INCLUDE_RUNTIME (PR #124126) In-Reply-To: Message-ID: <68137aa5.170a0220.5ed93.596e@mx.google.com> ================ @@ -149,6 +149,12 @@ if ("flang" IN_LIST LLVM_ENABLE_PROJECTS) message(STATUS "Enabling clang as a dependency to flang") list(APPEND LLVM_ENABLE_PROJECTS "clang") endif() + + option(FLANG_ENABLE_FLANG_RT "Implicitly add LLVM_ENABLE_RUNTIMES=flang-rt when compiling Flang" ON) + if (FLANG_ENABLE_FLANG_RT AND NOT "flang-rt" IN_LIST LLVM_ENABLE_RUNTIMES) + message(STATUS "Enabling Flang-RT as a dependency of Flang") + list(APPEND LLVM_ENABLE_RUNTIMES "flang-rt") ---------------- DavidTruby wrote: I believe you can't `list(APPEND ..)` cache variables like this as they'll get overridden by what was provided on the command line. 
I think it should be `set(LLVM_ENABLE_RUNTIMES "${LLVM_ENABLE_RUNTIMES};flang-rt" CACHE INTERNAL "")` or something along those lines https://github.com/llvm/llvm-project/pull/124126 From flang-commits at lists.llvm.org Thu May 1 06:45:06 2025 From: flang-commits at lists.llvm.org (David Truby via flang-commits) Date: Thu, 01 May 2025 06:45:06 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] Remove FLANG_INCLUDE_RUNTIME (PR #124126) In-Reply-To: Message-ID: <68137ae2.170a0220.738c0.701a@mx.google.com> https://github.com/DavidTruby edited https://github.com/llvm/llvm-project/pull/124126 From flang-commits at lists.llvm.org Thu May 1 06:45:40 2025 From: flang-commits at lists.llvm.org (David Truby via flang-commits) Date: Thu, 01 May 2025 06:45:40 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] Remove FLANG_INCLUDE_RUNTIME (PR #124126) In-Reply-To: Message-ID: <68137b04.050a0220.16fa1b.7735@mx.google.com> https://github.com/DavidTruby edited https://github.com/llvm/llvm-project/pull/124126 From flang-commits at lists.llvm.org Thu May 1 06:49:15 2025 From: flang-commits at lists.llvm.org (David Truby via flang-commits) Date: Thu, 01 May 2025 06:49:15 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] Remove FLANG_INCLUDE_RUNTIME (PR #124126) In-Reply-To: Message-ID: <68137bdb.050a0220.145e91.80a7@mx.google.com> ================ @@ -149,6 +149,12 @@ if ("flang" IN_LIST LLVM_ENABLE_PROJECTS) message(STATUS "Enabling clang as a dependency to flang") list(APPEND LLVM_ENABLE_PROJECTS "clang") endif() + + option(FLANG_ENABLE_FLANG_RT "Implicitly add LLVM_ENABLE_RUNTIMES=flang-rt when compiling Flang" ON) + if (FLANG_ENABLE_FLANG_RT AND NOT "flang-rt" IN_LIST LLVM_ENABLE_RUNTIMES) + message(STATUS "Enabling Flang-RT as a dependency of Flang") + list(APPEND LLVM_ENABLE_RUNTIMES "flang-rt") ---------------- DavidTruby wrote: Hmm, that doesn't seem to work either. 
I actually don't see the message, so for some reason we aren't entering that `if` statement body, I guess? https://github.com/llvm/llvm-project/pull/124126 From flang-commits at lists.llvm.org Thu May 1 06:58:07 2025 From: flang-commits at lists.llvm.org (David Truby via flang-commits) Date: Thu, 01 May 2025 06:58:07 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] Remove FLANG_INCLUDE_RUNTIME (PR #124126) In-Reply-To: Message-ID: <68137def.050a0220.faf06.92a0@mx.google.com> ================ @@ -149,6 +149,12 @@ if ("flang" IN_LIST LLVM_ENABLE_PROJECTS) message(STATUS "Enabling clang as a dependency to flang") list(APPEND LLVM_ENABLE_PROJECTS "clang") endif() + + option(FLANG_ENABLE_FLANG_RT "Implicitly add LLVM_ENABLE_RUNTIMES=flang-rt when compiling Flang" ON) + if (FLANG_ENABLE_FLANG_RT AND NOT "flang-rt" IN_LIST LLVM_ENABLE_RUNTIMES) + message(STATUS "Enabling Flang-RT as a dependency of Flang") + list(APPEND LLVM_ENABLE_RUNTIMES "flang-rt") ---------------- DavidTruby wrote: Sorry, all of that was a false alarm. The issue is that LLVM_ENABLE_RUNTIMES hasn't been defined by this point; it's defined a little after these checks. When it does get defined, it overwrites whatever was done here. If you move the LLVM_ENABLE_RUNTIMES definition above this, then this works as expected. 
https://github.com/llvm/llvm-project/pull/124126 From flang-commits at lists.llvm.org Thu May 1 07:08:35 2025 From: flang-commits at lists.llvm.org (David Truby via flang-commits) Date: Thu, 01 May 2025 07:08:35 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] Remove FLANG_INCLUDE_RUNTIME (PR #124126) In-Reply-To: Message-ID: <68138063.050a0220.25b10d.b316@mx.google.com> DavidTruby wrote: I have posted #138136, which fixes the issues I mentioned above. https://github.com/llvm/llvm-project/pull/124126 From flang-commits at lists.llvm.org Thu May 1 00:03:19 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 00:03:19 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609) In-Reply-To: Message-ID: <68131cb7.170a0220.2f6e16.1399@mx.google.com> https://github.com/jofrn updated https://github.com/llvm/llvm-project/pull/123609 >From 210b6d80bcfbbcd216f98199df386280724561e2 Mon Sep 17 00:00:00 2001 From: jofernau Date: Mon, 20 Jan 2025 04:51:26 -0800 Subject: [PATCH 01/27] [TargetVerifier][AMDGPU] Add TargetVerifier. This pass verifies the IR for an individual backend. This is different from Lint because it consolidates all checks for a given backend in a single pass. A check for Lint may be undefined behavior across all targets, whereas a check in TargetVerifier would only pertain to the specified target but can check more than just undefined behavior, such as IR validity. A use case of this would be to reject programs with invalid IR while fuzzing. 
--- llvm/include/llvm/IR/Module.h | 4 + llvm/include/llvm/Target/TargetVerifier.h | 82 +++++++ .../TargetVerify/AMDGPUTargetVerifier.h | 36 +++ llvm/lib/IR/Verifier.cpp | 18 +- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 213 ++++++++++++++++++ llvm/lib/Target/AMDGPU/CMakeLists.txt | 1 + llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 62 +++++ llvm/tools/llvm-tgt-verify/CMakeLists.txt | 34 +++ .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 172 ++++++++++++++ 9 files changed, 618 insertions(+), 4 deletions(-) create mode 100644 llvm/include/llvm/Target/TargetVerifier.h create mode 100644 llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h create mode 100644 llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify.ll create mode 100644 llvm/tools/llvm-tgt-verify/CMakeLists.txt create mode 100644 llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h index 91ccd76c41e07..03c0cf1cf0924 100644 --- a/llvm/include/llvm/IR/Module.h +++ b/llvm/include/llvm/IR/Module.h @@ -214,6 +214,10 @@ class LLVM_ABI Module { /// @name Constructors /// @{ public: + /// Is this Module valid as determined by one of the verification passes + /// i.e. Lint, Verifier, TargetVerifier. + bool IsValid = true; + /// Is this Module using intrinsics to record the position of debugging /// information, or non-intrinsic records? See IsNewDbgInfoFormat in /// \ref BasicBlock. diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h new file mode 100644 index 0000000000000..e00c6a7b260c9 --- /dev/null +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -0,0 +1,82 @@ +//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines target verifier interfaces that can be used for some +// validation of input to the system, and for checking that transformations +// haven't done something bad. In contrast to the Verifier or Lint, the +// TargetVerifier looks for constructions invalid to a particular target +// machine. +// +// To see what specifically is checked, look at TargetVerifier.cpp or an +// individual backend's TargetVerifier. +// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_TARGET_VERIFIER_H +#define LLVM_TARGET_VERIFIER_H + +#include "llvm/IR/PassManager.h" +#include "llvm/IR/Module.h" +#include "llvm/TargetParser/Triple.h" + +namespace llvm { + +class Function; + +class TargetVerifierPass : public PassInfoMixin { +public: + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {} +}; + +class TargetVerify { +protected: + void WriteValues(ArrayRef Vs) { + for (const Value *V : Vs) { + if (!V) + continue; + if (isa(V)) { + MessagesStr << *V << '\n'; + } else { + V->printAsOperand(MessagesStr, true, Mod); + MessagesStr << '\n'; + } + } + } + + /// A check failed, so printout out the condition and the message. + /// + /// This provides a nice place to put a breakpoint if you want to see why + /// something is not correct. + void CheckFailed(const Twine &Message) { MessagesStr << Message << '\n'; } + + /// A check failed (with values to print). + /// + /// This calls the Message-only version so that the above is easier to set + /// a breakpoint on. + template + void CheckFailed(const Twine &Message, const T1 &V1, const Ts &... 
Vs) { + CheckFailed(Message); + WriteValues({V1, Vs...}); + } +public: + Module *Mod; + Triple TT; + + std::string Messages; + raw_string_ostream MessagesStr; + + TargetVerify(Module *Mod) + : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())), + MessagesStr(Messages) {} + + void run(Function &F) {}; +}; + +} // namespace llvm + +#endif // LLVM_TARGET_VERIFIER_H diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h new file mode 100644 index 0000000000000..e6ff57629b141 --- /dev/null +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -0,0 +1,36 @@ +//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU ---*- C++ -*-===// +//// +//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +//// See https://llvm.org/LICENSE.txt for license information. +//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +//// +////===----------------------------------------------------------------------===// +//// +//// This file defines target verifier interfaces that can be used for some +//// validation of input to the system, and for checking that transformations +//// haven't done something bad. In contrast to the Verifier or Lint, the +//// TargetVerifier looks for constructions invalid to a particular target +//// machine. +//// +//// To see what specifically is checked, look at an individual backend's +//// TargetVerifier. 
+//// +////===----------------------------------------------------------------------===// + +#ifndef LLVM_AMDGPU_TARGET_VERIFIER_H +#define LLVM_AMDGPU_TARGET_VERIFIER_H + +#include "llvm/Target/TargetVerifier.h" + +namespace llvm { + +class Function; + +class AMDGPUTargetVerifierPass : public TargetVerifierPass { +public: + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); +}; + +} // namespace llvm + +#endif // LLVM_AMDGPU_TARGET_VERIFIER_H diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index 8afe360d088bc..9d21ca182ca13 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -135,6 +135,10 @@ static cl::opt VerifyNoAliasScopeDomination( cl::desc("Ensure that llvm.experimental.noalias.scope.decl for identical " "scopes are not dominating")); +static cl::opt + VerifyAbortOnError("verifier-abort-on-error", cl::init(false), + cl::desc("In the Verifier pass, abort on errors.")); + namespace llvm { struct VerifierSupport { @@ -7796,16 +7800,22 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F, PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { auto Res = AM.getResult(M); - if (FatalErrors && (Res.IRBroken || Res.DebugInfoBroken)) - report_fatal_error("Broken module found, compilation aborted!"); + if (Res.IRBroken || Res.DebugInfoBroken) { + M.IsValid = false; + if (VerifyAbortOnError && FatalErrors) + report_fatal_error("Broken module found, compilation aborted!"); + } return PreservedAnalyses::all(); } PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) { auto res = AM.getResult(F); - if (res.IRBroken && FatalErrors) - report_fatal_error("Broken function found, compilation aborted!"); + if (res.IRBroken) { + F.getParent()->IsValid = false; + if (VerifyAbortOnError && FatalErrors) + report_fatal_error("Broken function found, compilation aborted!"); + } return PreservedAnalyses::all(); } diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp new file mode 100644 index 0000000000000..585b19065c142 --- /dev/null +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -0,0 +1,213 @@ +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" + +#include "llvm/Analysis/UniformityAnalysis.h" +#include "llvm/Analysis/PostDominators.h" +#include "llvm/Support/Debug.h" +#include "llvm/IR/Dominators.h" +#include "llvm/IR/Function.h" +#include "llvm/IR/IntrinsicInst.h" +#include "llvm/IR/IntrinsicsAMDGPU.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/Value.h" + +#include "llvm/Support/raw_ostream.h" + +using namespace llvm; + +static cl::opt +MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); + +// Check - We know that cond should be true, if not print an error message. +#define Check(C, ...) \ + do { \ + if (!(C)) { \ + TargetVerify::CheckFailed(__VA_ARGS__); \ + return; \ + } \ + } while (false) + +static bool isMFMA(unsigned IID) { + switch (IID) { + case Intrinsic::amdgcn_mfma_f32_4x4x1f32: + case Intrinsic::amdgcn_mfma_f32_4x4x4f16: + case Intrinsic::amdgcn_mfma_i32_4x4x4i8: + case Intrinsic::amdgcn_mfma_f32_4x4x2bf16: + + case Intrinsic::amdgcn_mfma_f32_16x16x1f32: + case Intrinsic::amdgcn_mfma_f32_16x16x4f32: + case Intrinsic::amdgcn_mfma_f32_16x16x4f16: + case Intrinsic::amdgcn_mfma_f32_16x16x16f16: + case Intrinsic::amdgcn_mfma_i32_16x16x4i8: + case Intrinsic::amdgcn_mfma_i32_16x16x16i8: + case Intrinsic::amdgcn_mfma_f32_16x16x2bf16: + case Intrinsic::amdgcn_mfma_f32_16x16x8bf16: + + case Intrinsic::amdgcn_mfma_f32_32x32x1f32: + case Intrinsic::amdgcn_mfma_f32_32x32x2f32: + case Intrinsic::amdgcn_mfma_f32_32x32x4f16: + case Intrinsic::amdgcn_mfma_f32_32x32x8f16: + case Intrinsic::amdgcn_mfma_i32_32x32x4i8: + case Intrinsic::amdgcn_mfma_i32_32x32x8i8: + case Intrinsic::amdgcn_mfma_f32_32x32x2bf16: + case Intrinsic::amdgcn_mfma_f32_32x32x4bf16: + + case Intrinsic::amdgcn_mfma_f32_4x4x4bf16_1k: + case 
Intrinsic::amdgcn_mfma_f32_16x16x4bf16_1k: + case Intrinsic::amdgcn_mfma_f32_16x16x16bf16_1k: + case Intrinsic::amdgcn_mfma_f32_32x32x4bf16_1k: + case Intrinsic::amdgcn_mfma_f32_32x32x8bf16_1k: + + case Intrinsic::amdgcn_mfma_f64_16x16x4f64: + case Intrinsic::amdgcn_mfma_f64_4x4x4f64: + + case Intrinsic::amdgcn_mfma_i32_16x16x32_i8: + case Intrinsic::amdgcn_mfma_i32_32x32x16_i8: + case Intrinsic::amdgcn_mfma_f32_16x16x8_xf32: + case Intrinsic::amdgcn_mfma_f32_32x32x4_xf32: + + case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_bf8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_fp8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_bf8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_fp8: + + case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_bf8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_fp8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_bf8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_fp8: + return true; + default: + return false; + } +} + +namespace llvm { +class AMDGPUTargetVerify : public TargetVerify { +public: + Module *Mod; + + DominatorTree *DT; + PostDominatorTree *PDT; + UniformityInfo *UA; + + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) + : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} + + void run(Function &F); +}; + +static bool IsValidInt(const Type *Ty) { + return Ty->isIntegerTy(1) || + Ty->isIntegerTy(8) || + Ty->isIntegerTy(16) || + Ty->isIntegerTy(32) || + Ty->isIntegerTy(64) || + Ty->isIntegerTy(128); +} + +static bool isShader(CallingConv::ID CC) { + switch(CC) { + case CallingConv::AMDGPU_VS: + case CallingConv::AMDGPU_LS: + case CallingConv::AMDGPU_HS: + case CallingConv::AMDGPU_ES: + case CallingConv::AMDGPU_GS: + case CallingConv::AMDGPU_PS: + case CallingConv::AMDGPU_CS_Chain: + case CallingConv::AMDGPU_CS_ChainPreserve: + case CallingConv::AMDGPU_CS: + return true; + default: + return false; + } +} + +void AMDGPUTargetVerify::run(Function &F) { + // Ensure shader calling convention 
returns void + if (isShader(F.getCallingConv())) + Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void"); + + for (auto &BB : F) { + + for (auto &I : BB) { + if (MarkUniform) + outs() << UA->isUniform(&I) << ' ' << I << '\n'; + + // Ensure integral types are valid: i8, i16, i32, i64, i128 + if (I.getType()->isIntegerTy()) + Check(IsValidInt(I.getType()), "Int type is invalid.", &I); + for (unsigned i = 0; i < I.getNumOperands(); ++i) + if (I.getOperand(i)->getType()->isIntegerTy()) + Check(IsValidInt(I.getOperand(i)->getType()), + "Int type is invalid.", I.getOperand(i)); + + // Ensure no store to const memory + if (auto *SI = dyn_cast(&I)) + { + unsigned AS = SI->getPointerAddressSpace(); + Check(AS != 4, "Write to const memory", SI); + } + + // Ensure no kernel to kernel calls. + if (auto *CI = dyn_cast(&I)) + { + CallingConv::ID CalleeCC = CI->getCallingConv(); + if (CalleeCC == CallingConv::AMDGPU_KERNEL) + { + CallingConv::ID CallerCC = CI->getParent()->getParent()->getCallingConv(); + Check(CallerCC != CallingConv::AMDGPU_KERNEL, + "A kernel may not call a kernel", CI->getParent()->getParent()); + } + } + + // Ensure MFMA is not in control flow with diverging operands + if (auto *II = dyn_cast(&I)) { + if (isMFMA(II->getIntrinsicID())) { + bool InControlFlow = false; + for (const auto &P : predecessors(&BB)) + if (!PDT->dominates(&BB, P)) { + InControlFlow = true; + break; + } + for (const auto &S : successors(&BB)) + if (!DT->dominates(&BB, S)) { + InControlFlow = true; + break; + } + if (InControlFlow) { + // If operands to MFMA are not uniform, MFMA cannot be in control flow + bool hasUniformOperands = true; + for (unsigned i = 0; i < II->getNumOperands(); i++) { + if (!UA->isUniform(II->getOperand(i))) { + dbgs() << "Not uniform: " << *II->getOperand(i) << '\n'; + hasUniformOperands = false; + } + } + if (!hasUniformOperands) Check(false, "MFMA in control flow", II); + //else Check(false, "MFMA in control flow (uniform 
operands)", II); + } + //else Check(false, "MFMA not in control flow", II); + } + } + } + } +} + +PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { + + auto *Mod = F.getParent(); + + auto UA = &AM.getResult(F); + auto *DT = &AM.getResult(F); + auto *PDT = &AM.getResult(F); + + AMDGPUTargetVerify TV(Mod, DT, PDT, UA); + TV.run(F); + + dbgs() << TV.MessagesStr.str(); + if (!TV.MessagesStr.str().empty()) { + F.getParent()->IsValid = false; + } + + return PreservedAnalyses::all(); +} +} // namespace llvm diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt b/llvm/lib/Target/AMDGPU/CMakeLists.txt index 09a3096602fc3..bcfea0bf8ac94 100644 --- a/llvm/lib/Target/AMDGPU/CMakeLists.txt +++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt @@ -110,6 +110,7 @@ add_llvm_target(AMDGPUCodeGen AMDGPUTargetMachine.cpp AMDGPUTargetObjectFile.cpp AMDGPUTargetTransformInfo.cpp + AMDGPUTargetVerifier.cpp AMDGPUWaitSGPRHazards.cpp AMDGPUUnifyDivergentExitNodes.cpp AMDGPUUnifyMetadata.cpp diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll new file mode 100644 index 0000000000000..f56ff992a56c2 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -0,0 +1,62 @@ +; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s + +define amdgpu_kernel void @test_mfma_f32_32x32x1f32_vecarg(ptr addrspace(1) %arg) #0 { +; CHECK: Not uniform: %in.f32 = load <32 x float>, ptr addrspace(1) %gep, align 128 +; CHECK-NEXT: MFMA in control flow +; CHECK-NEXT: %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) +s: + %tid = call i32 @llvm.amdgcn.workitem.id.x() + %gep = getelementptr inbounds <32 x float>, ptr addrspace(1) %arg, i32 %tid + %in.i32 = load <32 x i32>, ptr addrspace(1) %gep + %in.f32 = load <32 x float>, ptr addrspace(1) %gep + + %0 = icmp eq <32 x i32> %in.i32, zeroinitializer + %div.br = extractelement <32 x i1> %0, 
i32 0 + br i1 %div.br, label %if.3, label %else.0 + +if.3: + br label %join + +else.0: + %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) + br label %join + +join: + ret void +} + +define amdgpu_cs i32 @shader() { +; CHECK: Shaders must return void + ret i32 0 +} + +define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) { +; CHECK: Undefined behavior: Write to memory in const addrspace +; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 +; CHECK-NEXT: Write to const memory +; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 + %r = add i32 %a, %b + store i32 %r, ptr addrspace(4) %out + ret void +} + +define amdgpu_kernel void @kernel_callee(ptr %x) { + ret void +} + +define amdgpu_kernel void @kernel_caller(ptr %x) { +; CHECK: A kernel may not call a kernel +; CHECK-NEXT: ptr @kernel_caller + call amdgpu_kernel void @kernel_callee(ptr %x) + ret void +} + + +; Function Attrs: nounwind +define i65 @invalid_type(i65 %x) #0 { +; CHECK: Int type is invalid. 
+; CHECK-NEXT: %tmp2 = ashr i65 %x, 64 +entry: + %tmp2 = ashr i65 %x, 64 + ret i65 %tmp2 +} diff --git a/llvm/tools/llvm-tgt-verify/CMakeLists.txt b/llvm/tools/llvm-tgt-verify/CMakeLists.txt new file mode 100644 index 0000000000000..fe47c85e6cdce --- /dev/null +++ b/llvm/tools/llvm-tgt-verify/CMakeLists.txt @@ -0,0 +1,34 @@ +set(LLVM_LINK_COMPONENTS + AllTargetsAsmParsers + AllTargetsCodeGens + AllTargetsDescs + AllTargetsInfos + Analysis + AsmPrinter + CodeGen + CodeGenTypes + Core + IRPrinter + IRReader + MC + MIRParser + Passes + Remarks + ScalarOpts + SelectionDAG + Support + Target + TargetParser + TransformUtils + Vectorize + ) + +add_llvm_tool(llvm-tgt-verify + llvm-tgt-verify.cpp + + DEPENDS + intrinsics_gen + SUPPORT_PLUGINS + ) + +export_executable_symbols_for_plugins(llc) diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp new file mode 100644 index 0000000000000..68422abd6f4cc --- /dev/null +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -0,0 +1,172 @@ +//===--- llvm-isel-fuzzer.cpp - Fuzzer for instruction selection ----------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Tool to fuzz instruction selection using libFuzzer. 
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/InitializePasses.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Analysis/Lint.h"
+#include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/Bitcode/BitcodeReader.h"
+#include "llvm/Bitcode/BitcodeWriter.h"
+#include "llvm/CodeGen/CommandFlags.h"
+#include "llvm/CodeGen/TargetPassConfig.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/LLVMContext.h"
+#include "llvm/IR/LegacyPassManager.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/Verifier.h"
+#include "llvm/IRReader/IRReader.h"
+#include "llvm/Passes/PassBuilder.h"
+#include "llvm/Passes/StandardInstrumentations.h"
+#include "llvm/MC/TargetRegistry.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/DataTypes.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/InitLLVM.h"
+#include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/TargetSelect.h"
+#include "llvm/Target/TargetMachine.h"
+#include "llvm/Target/TargetVerifier.h"
+
+#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
+
+#define DEBUG_TYPE "isel-fuzzer"
+
+using namespace llvm;
+
+static codegen::RegisterCodeGenFlags CGF;
+
+static cl::opt<std::string>
+InputFilename(cl::Positional, cl::desc(""), cl::init("-"));
+
+static cl::opt<bool>
+    StacktraceAbort("stacktrace-abort",
+        cl::desc("Turn on stacktrace"), cl::init(false));
+
+static cl::opt<bool>
+    NoLint("no-lint",
+        cl::desc("Turn off Lint"), cl::init(false));
+
+static cl::opt<bool>
+    NoVerify("no-verifier",
+        cl::desc("Turn off Verifier"), cl::init(false));
+
+static cl::opt<char>
+    OptLevel("O",
+             cl::desc("Optimization level. [-O0, -O1, -O2, or -O3] "
+                      "(default = '-O2')"),
+             cl::Prefix, cl::init('2'));
+
+static cl::opt<std::string>
+    TargetTriple("mtriple", cl::desc("Override target triple for module"));
+
+static std::unique_ptr<TargetMachine> TM;
+
+static void handleLLVMFatalError(void *, const char *Message, bool) {
+  if (StacktraceAbort) {
+    dbgs() << "LLVM ERROR: " << Message << "\n"
+           << "Aborting.\n";
+    abort();
+  }
+}
+
+int main(int argc, char **argv) {
+  StringRef ExecName = argv[0];
+  InitLLVM X(argc, argv);
+
+  InitializeAllTargets();
+  InitializeAllTargetMCs();
+  InitializeAllAsmPrinters();
+  InitializeAllAsmParsers();
+
+  PassRegistry *Registry = PassRegistry::getPassRegistry();
+  initializeCore(*Registry);
+  initializeCodeGen(*Registry);
+  initializeAnalysis(*Registry);
+  initializeTarget(*Registry);
+
+  cl::ParseCommandLineOptions(argc, argv);
+
+  if (TargetTriple.empty()) {
+    errs() << ExecName << ": -mtriple must be specified\n";
+    exit(1);
+  }
+
+  CodeGenOptLevel OLvl;
+  if (auto Level = CodeGenOpt::parseLevel(OptLevel)) {
+    OLvl = *Level;
+  } else {
+    errs() << ExecName << ": invalid optimization level.\n";
+    return 1;
+  }
+  ExitOnError ExitOnErr(std::string(ExecName) + ": error:");
+  TM = ExitOnErr(codegen::createTargetMachineForTriple(
+      Triple::normalize(TargetTriple), OLvl));
+  assert(TM && "Could not allocate target machine!");
+
+  // Make sure we print the summary and the current unit when LLVM errors out.
+  install_fatal_error_handler(handleLLVMFatalError, nullptr);
+
+  LLVMContext Context;
+  SMDiagnostic Err;
+  std::unique_ptr<Module> M = parseIRFile(InputFilename, Err, Context);
+  if (!M) {
+    errs() << "Invalid mod\n";
+    return 1;
+  }
+  auto S = Triple::normalize(TargetTriple);
+  M->setTargetTriple(S);
+
+  PassInstrumentationCallbacks PIC;
+  StandardInstrumentations SI(Context, false/*debug PM*/,
+                              false);
+  registerCodeGenCallback(PIC, *TM);
+
+  ModulePassManager MPM;
+  FunctionPassManager FPM;
+  //TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple()));
+
+  MachineFunctionAnalysisManager MFAM;
+  LoopAnalysisManager LAM;
+  FunctionAnalysisManager FAM;
+  CGSCCAnalysisManager CGAM;
+  ModuleAnalysisManager MAM;
+  PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC);
+  PB.registerModuleAnalyses(MAM);
+  PB.registerCGSCCAnalyses(CGAM);
+  PB.registerFunctionAnalyses(FAM);
+  PB.registerLoopAnalyses(LAM);
+  PB.registerMachineFunctionAnalyses(MFAM);
+  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
+
+  SI.registerCallbacks(PIC, &MAM);
+
+  //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); });
+
+  Triple TT(M->getTargetTriple());
+  if (!NoLint)
+    FPM.addPass(LintPass());
+  if (!NoVerify)
+    MPM.addPass(VerifierPass());
+  if (TT.isAMDGPU())
+    FPM.addPass(AMDGPUTargetVerifierPass());
+  else if (false) {} // ...
+ else + FPM.addPass(TargetVerifierPass()); + MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); + + MPM.run(*M, MAM); + + if (!M->IsValid) + return 1; + + return 0; +} >From a808efce8d90524845a44ffa5b90adb6741e488d Mon Sep 17 00:00:00 2001 From: jofernau Date: Mon, 3 Feb 2025 07:15:12 -0800 Subject: [PATCH 02/27] Add hook for target verifier in llc,opt --- .../llvm/Passes/StandardInstrumentations.h | 6 ++++-- llvm/include/llvm/Target/TargetVerifier.h | 1 + .../TargetVerify/AMDGPUTargetVerifier.h | 18 ++++++++++++++++++ llvm/lib/LTO/LTOBackend.cpp | 2 +- llvm/lib/LTO/ThinLTOCodeGenerator.cpp | 2 +- llvm/lib/Passes/CMakeLists.txt | 1 + llvm/lib/Passes/PassBuilderBindings.cpp | 2 +- llvm/lib/Passes/StandardInstrumentations.cpp | 19 +++++++++++++++---- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 12 ++++++------ llvm/lib/Target/CMakeLists.txt | 2 ++ .../CodeGen/AMDGPU/tgt-verify-llc-fail.ll | 6 ++++++ .../CodeGen/AMDGPU/tgt-verify-llc-pass.ll | 6 ++++++ llvm/tools/llc/NewPMDriver.cpp | 2 +- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 2 +- llvm/tools/opt/NewPMDriver.cpp | 2 +- llvm/unittests/IR/PassManagerTest.cpp | 6 +++--- 16 files changed, 68 insertions(+), 21 deletions(-) create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h index f7a65a88ecf5b..988fcb93b2357 100644 --- a/llvm/include/llvm/Passes/StandardInstrumentations.h +++ b/llvm/include/llvm/Passes/StandardInstrumentations.h @@ -476,7 +476,8 @@ class VerifyInstrumentation { public: VerifyInstrumentation(bool DebugLogging) : DebugLogging(DebugLogging) {} void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM); + ModuleAnalysisManager *MAM, + FunctionAnalysisManager *FAM); }; /// This class implements --time-trace functionality for new pass manager. 
@@ -621,7 +622,8 @@ class StandardInstrumentations { // Register all the standard instrumentation callbacks. If \p FAM is nullptr // then PreservedCFGChecker is not enabled. void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM = nullptr); + ModuleAnalysisManager *MAM, + FunctionAnalysisManager *FAM); TimePassesHandler &getTimePasses() { return TimePasses; } }; diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index e00c6a7b260c9..ad5aeb895953d 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -75,6 +75,7 @@ class TargetVerify { MessagesStr(Messages) {} void run(Function &F) {}; + void run(Function &F, FunctionAnalysisManager &AM); }; } // namespace llvm diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index e6ff57629b141..d8a3fda4f87dc 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -22,6 +22,10 @@ #include "llvm/Target/TargetVerifier.h" +#include "llvm/Analysis/UniformityAnalysis.h" +#include "llvm/Analysis/PostDominators.h" +#include "llvm/IR/Dominators.h" + namespace llvm { class Function; @@ -31,6 +35,20 @@ class AMDGPUTargetVerifierPass : public TargetVerifierPass { PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); }; +class AMDGPUTargetVerify : public TargetVerify { +public: + Module *Mod; + + DominatorTree *DT; + PostDominatorTree *PDT; + UniformityInfo *UA; + + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) + : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} + + void run(Function &F); +}; + } // namespace llvm #endif // LLVM_AMDGPU_TARGET_VERIFIER_H diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp index 1c764a0188eda..475e7cf45371b 100644 --- 
a/llvm/lib/LTO/LTOBackend.cpp +++ b/llvm/lib/LTO/LTOBackend.cpp @@ -275,7 +275,7 @@ static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(Mod.getContext(), Conf.DebugPassManager, Conf.VerifyEach); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PassBuilder PB(TM, Conf.PTO, PGOOpt, &PIC); RegisterPassPlugins(Conf.PassPlugins, PB); diff --git a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp index 9e7f8187fe49c..369b003df1364 100644 --- a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp +++ b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp @@ -245,7 +245,7 @@ static void optimizeModule(Module &TheModule, TargetMachine &TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(TheModule.getContext(), DebugPassManager); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PipelineTuningOptions PTO; PTO.LoopVectorization = true; PTO.SLPVectorization = true; diff --git a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt index 6425f4934b210..f171377a8b270 100644 --- a/llvm/lib/Passes/CMakeLists.txt +++ b/llvm/lib/Passes/CMakeLists.txt @@ -29,6 +29,7 @@ add_llvm_component_library(LLVMPasses Scalar Support Target + TargetParser TransformUtils Vectorize Instrumentation diff --git a/llvm/lib/Passes/PassBuilderBindings.cpp b/llvm/lib/Passes/PassBuilderBindings.cpp index 933fe89e53a94..f0e1abb8cebc4 100644 --- a/llvm/lib/Passes/PassBuilderBindings.cpp +++ b/llvm/lib/Passes/PassBuilderBindings.cpp @@ -76,7 +76,7 @@ static LLVMErrorRef runPasses(Module *Mod, Function *Fun, const char *Passes, PB.crossRegisterProxies(LAM, FAM, CGAM, MAM); StandardInstrumentations SI(Mod->getContext(), Debug, VerifyEach); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); // Run the pipeline. 
   if (Fun) {
diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp
index dc1dd5d9c7f4c..7b15f89e361b8 100644
--- a/llvm/lib/Passes/StandardInstrumentations.cpp
+++ b/llvm/lib/Passes/StandardInstrumentations.cpp
@@ -45,6 +45,7 @@
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Support/xxhash.h"
+#include "llvm/Target/TargetVerifier.h"
 #include
 #include
 #include
@@ -1454,9 +1455,10 @@ void PreservedCFGCheckerInstrumentation::registerCallbacks(
 }
 
 void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC,
-                                              ModuleAnalysisManager *MAM) {
+                                              ModuleAnalysisManager *MAM,
+                                              FunctionAnalysisManager *FAM) {
   PIC.registerAfterPassCallback(
-      [this, MAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) {
+      [this, MAM, FAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) {
         if (isIgnored(P) || P == "VerifierPass")
           return;
         const auto *F = unwrapIR<Function>(IR);
@@ -1473,6 +1475,15 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC,
             report_fatal_error(formatv("Broken function found after pass "
                                        "\"{0}\", compilation aborted!",
                                        P));
+
+          if (FAM) {
+            TargetVerify TV(const_cast<Module *>(F->getParent()));
+            TV.run(*const_cast<Function *>(F), *FAM);
+            if (!F->getParent()->IsValid)
+              report_fatal_error(formatv("Broken function found after pass "
+                                         "\"{0}\", compilation aborted!",
+                                         P));
+          }
         } else {
           const auto *M = unwrapIR<Module>(IR);
           if (!M) {
@@ -2512,7 +2523,7 @@ void PrintCrashIRInstrumentation::registerCallbacks(
 }
 
 void StandardInstrumentations::registerCallbacks(
-    PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM) {
+    PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM, FunctionAnalysisManager *FAM) {
   PrintIR.registerCallbacks(PIC);
   PrintPass.registerCallbacks(PIC);
   TimePasses.registerCallbacks(PIC);
@@ -2521,7 +2532,7 @@ void StandardInstrumentations::registerCallbacks(
   PrintChangedIR.registerCallbacks(PIC);
   PseudoProbeVerification.registerCallbacks(PIC);
   if (VerifyEach)
-    Verify.registerCallbacks(PIC, MAM);
+    Verify.registerCallbacks(PIC, MAM, FAM);
   PrintChangedDiff.registerCallbacks(PIC);
   WebsiteChangeReporter.registerCallbacks(PIC);
   ChangeTester.registerCallbacks(PIC);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 585b19065c142..e6cdec7160229 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -14,8 +14,8 @@
 
 using namespace llvm;
 
-static cl::opt<bool>
-MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false));
+//static cl::opt<bool>
+//MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false));
 
 // Check - We know that cond should be true, if not print an error message.
 #define Check(C, ...)                                                          \
@@ -81,7 +81,7 @@ static bool isMFMA(unsigned IID) {
 }
 
 namespace llvm {
-class AMDGPUTargetVerify : public TargetVerify {
+/*class AMDGPUTargetVerify : public TargetVerify {
 public:
   Module *Mod;
 
@@ -93,7 +93,7 @@ class AMDGPUTargetVerify : public TargetVerify {
     : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {}
 
   void run(Function &F);
-};
+};*/
 
 static bool IsValidInt(const Type *Ty) {
   return Ty->isIntegerTy(1) ||
@@ -129,8 +129,8 @@ void AMDGPUTargetVerify::run(Function &F) {
 
   for (auto &BB : F) {
     for (auto &I : BB) {
-      if (MarkUniform)
-        outs() << UA->isUniform(&I) << ' ' << I << '\n';
+      //if (MarkUniform)
+        //outs() << UA->isUniform(&I) << ' ' << I << '\n';
 
       // Ensure integral types are valid: i8, i16, i32, i64, i128
       if (I.getType()->isIntegerTy())
diff --git a/llvm/lib/Target/CMakeLists.txt b/llvm/lib/Target/CMakeLists.txt
index 9472288229cac..f2a5d545ce84f 100644
--- a/llvm/lib/Target/CMakeLists.txt
+++ b/llvm/lib/Target/CMakeLists.txt
@@ -7,6 +7,8 @@ add_llvm_component_library(LLVMTarget
   TargetLoweringObjectFile.cpp
   TargetMachine.cpp
   TargetMachineC.cpp
+  TargetVerifier.cpp
+ 
AMDGPU/AMDGPUTargetVerifier.cpp ADDITIONAL_HEADER_DIRS ${LLVM_MAIN_INCLUDE_DIR}/llvm/Target diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll new file mode 100644 index 0000000000000..584097d7bc134 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll @@ -0,0 +1,6 @@ +; RUN: not not llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each -o - < %s 2>&1 | FileCheck %s + +define amdgpu_cs i32 @nonvoid_shader() { +; CHECK: LLVM ERROR + ret i32 0 +} diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll new file mode 100644 index 0000000000000..0c3a5fe5ac4a5 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll @@ -0,0 +1,6 @@ +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each %s -o - 2>&1 | FileCheck %s + +define amdgpu_cs void @void_shader() { +; CHECK: ModuleToFunctionPassAdaptor + ret void +} diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index fa82689ecf9ae..a060d16e74958 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -126,7 +126,7 @@ int llvm::compileModuleWithNewPM( PB.registerLoopAnalyses(LAM); PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); MAM.registerPass([&] { return MachineModuleAnalysis(MMI); }); diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 68422abd6f4cc..3352d07deff2f 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -147,7 +147,7 @@ int main(int argc, char **argv) { PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - 
SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); diff --git a/llvm/tools/opt/NewPMDriver.cpp b/llvm/tools/opt/NewPMDriver.cpp index 7d168a6ceb17c..a8977d80bdf44 100644 --- a/llvm/tools/opt/NewPMDriver.cpp +++ b/llvm/tools/opt/NewPMDriver.cpp @@ -423,7 +423,7 @@ bool llvm::runPassPipeline( PrintPassOpts.SkipAnalyses = DebugPM == DebugLogging::Quiet; StandardInstrumentations SI(M.getContext(), DebugPM != DebugLogging::None, VK == VerifierKind::EachPass, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); DebugifyEachInstrumentation Debugify; DebugifyStatsMap DIStatsMap; DebugInfoPerPass DebugInfoBeforePass; diff --git a/llvm/unittests/IR/PassManagerTest.cpp b/llvm/unittests/IR/PassManagerTest.cpp index a6487169224c2..bb4db6120035f 100644 --- a/llvm/unittests/IR/PassManagerTest.cpp +++ b/llvm/unittests/IR/PassManagerTest.cpp @@ -828,7 +828,7 @@ TEST_F(PassManagerTest, FunctionPassCFGChecker) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -877,7 +877,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerInvalidateAnalysis) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -945,7 +945,7 @@ TEST_F(PassManagerTest, 
FunctionPassCFGCheckerWrapped) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); >From 64d001858efc994e965071cd319d268b934a6eb3 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 16 Apr 2025 10:19:00 -0400 Subject: [PATCH 03/27] Run AMDGPUTargetVerifier within AMDGPU pipeline. Move IsValid from Module to TargetVerify. --- clang/lib/CodeGen/BackendUtil.cpp | 2 +- llvm/include/llvm/IR/Module.h | 4 ---- llvm/include/llvm/Target/TargetVerifier.h | 2 ++ llvm/lib/IR/Verifier.cpp | 4 ++-- llvm/lib/Passes/StandardInstrumentations.cpp | 2 +- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 5 +++++ llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 3 ++- llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 6 +++--- 8 files changed, 16 insertions(+), 12 deletions(-) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index f7eb853beb23c..9a1c922f5ddef 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -922,7 +922,7 @@ void EmitAssemblyHelper::RunOptimizationPipeline( TheModule->getContext(), (CodeGenOpts.DebugPassManager || DebugPassStructure), CodeGenOpts.VerifyEach, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PassBuilder PB(TM.get(), PTO, PGOOpt, &PIC); // Handle the assignment tracking feature options. 
diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h
index 03c0cf1cf0924..91ccd76c41e07 100644
--- a/llvm/include/llvm/IR/Module.h
+++ b/llvm/include/llvm/IR/Module.h
@@ -214,10 +214,6 @@ class LLVM_ABI Module {
 /// @name Constructors
 /// @{
 public:
-  /// Is this Module valid as determined by one of the verification passes
-  /// i.e. Lint, Verifier, TargetVerifier.
-  bool IsValid = true;
-
   /// Is this Module using intrinsics to record the position of debugging
   /// information, or non-intrinsic records? See IsNewDbgInfoFormat in
   /// \ref BasicBlock.
diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index ad5aeb895953d..2d0c039132c35 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -70,6 +70,8 @@ class TargetVerify {
   std::string Messages;
   raw_string_ostream MessagesStr;
 
+  bool IsValid = true;
+
   TargetVerify(Module *Mod)
     : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())),
       MessagesStr(Messages) {}
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 9d21ca182ca13..d7c514610b4ba 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -7801,7 +7801,7 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F,
 PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) {
   auto Res = AM.getResult<VerifierAnalysis>(M);
   if (Res.IRBroken || Res.DebugInfoBroken) {
-    M.IsValid = false;
+    //M.IsValid = false;
     if (VerifyAbortOnError && FatalErrors)
       report_fatal_error("Broken module found, compilation aborted!");
   }
@@ -7812,7 +7812,7 @@ PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) {
 PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
   auto res = AM.getResult<VerifierAnalysis>(F);
   if (res.IRBroken) {
-    F.getParent()->IsValid = false;
+    //F.getParent()->IsValid = false;
     if (VerifyAbortOnError && FatalErrors)
       report_fatal_error("Broken function found, compilation aborted!");
   }
diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp
index 7b15f89e361b8..879d657c87695 100644
--- a/llvm/lib/Passes/StandardInstrumentations.cpp
+++ b/llvm/lib/Passes/StandardInstrumentations.cpp
@@ -1479,7 +1479,7 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC,
           if (FAM) {
             TargetVerify TV(const_cast<Module *>(F->getParent()));
             TV.run(*const_cast<Function *>(F), *FAM);
-            if (!F->getParent()->IsValid)
+            if (!TV.IsValid)
               report_fatal_error(formatv("Broken function found after pass "
                                          "\"{0}\", compilation aborted!",
                                          P));
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 90e3489ced923..6ec34d6a0fdbf 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -90,6 +90,7 @@
 #include "llvm/MC/TargetRegistry.h"
 #include "llvm/Passes/PassBuilder.h"
 #include "llvm/Support/FormatVariadic.h"
+#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
 #include "llvm/Transforms/HipStdPar/HipStdPar.h"
 #include "llvm/Transforms/IPO.h"
 #include "llvm/Transforms/IPO/AlwaysInliner.h"
@@ -1298,6 +1299,8 @@ void AMDGPUPassConfig::addIRPasses() {
     addPass(createLICMPass());
   }
 
+  //addPass(AMDGPUTargetVerifierPass());
+
   TargetPassConfig::addIRPasses();
 
   // EarlyCSE is not always strong enough to clean up what LSR produces. For
@@ -2040,6 +2043,8 @@ void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const {
   // but EarlyCSE can do neither of them.
if (isPassEnabled(EnableScalarIRPasses)) addEarlyCSEOrGVNPass(addPass); + + addPass(AMDGPUTargetVerifierPass()); } void AMDGPUCodeGenPassBuilder::addCodeGenPrepare(AddIRPass &addPass) const { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index e6cdec7160229..c70a6d1b6fa66 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -205,7 +205,8 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan dbgs() << TV.MessagesStr.str(); if (!TV.MessagesStr.str().empty()) { - F.getParent()->IsValid = false; + TV.IsValid = false; + return PreservedAnalyses::none(); } return PreservedAnalyses::all(); diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 3352d07deff2f..fbe7f6089ff18 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -163,9 +163,9 @@ int main(int argc, char **argv) { FPM.addPass(TargetVerifierPass()); MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); - MPM.run(*M, MAM); - - if (!M->IsValid) + auto PA = MPM.run(*M, MAM); + auto PAC = PA.getChecker(); + if (!PAC.preserved()) return 1; return 0; >From fdae3025942584d0085deb3442f40471548defe5 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 16 Apr 2025 11:08:20 -0400 Subject: [PATCH 04/27] Remove cmd line options that aren't required. Make error message explicit. 
--- llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll | 4 ++-- llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll index 584097d7bc134..c5e59d4a2369e 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll @@ -1,6 +1,6 @@ -; RUN: not not llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each -o - < %s 2>&1 | FileCheck %s +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm -o - < %s 2>&1 | FileCheck %s define amdgpu_cs i32 @nonvoid_shader() { -; CHECK: LLVM ERROR +; CHECK: Shaders must return void ret i32 0 } diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll index 0c3a5fe5ac4a5..8a503b7624a73 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll @@ -1,6 +1,6 @@ -; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each %s -o - 2>&1 | FileCheck %s +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm %s -o - 2>&1 | FileCheck %s --allow-empty define amdgpu_cs void @void_shader() { -; CHECK: ModuleToFunctionPassAdaptor +; CHECK-NOT: Shaders must return void ret void } >From 5ceda58cc5b5d7372c6e43cbdf583f0dda87b956 Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 19:36:34 -0400 Subject: [PATCH 05/27] Return Verifier none status through PreservedAnalyses on fail. 
--- llvm/lib/Analysis/Lint.cpp | 4 +++- llvm/lib/IR/Verifier.cpp | 2 ++ llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 8 +++++--- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/llvm/lib/Analysis/Lint.cpp b/llvm/lib/Analysis/Lint.cpp index f05e36e2025d4..c8e38963e5974 100644 --- a/llvm/lib/Analysis/Lint.cpp +++ b/llvm/lib/Analysis/Lint.cpp @@ -742,9 +742,11 @@ PreservedAnalyses LintPass::run(Function &F, FunctionAnalysisManager &AM) { Lint L(Mod, DL, AA, AC, DT, TLI); L.visit(F); dbgs() << L.MessagesStr.str(); - if (AbortOnError && !L.MessagesStr.str().empty()) + if (AbortOnError && !L.MessagesStr.str().empty()) { report_fatal_error( "linter found errors, aborting. (enabled by abort-on-error)", false); + return PreservedAnalyses::none(); + } return PreservedAnalyses::all(); } diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index d7c514610b4ba..51f6dec53b70f 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -7804,6 +7804,7 @@ PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { //M.IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken module found, compilation aborted!"); + return PreservedAnalyses::none(); } return PreservedAnalyses::all(); @@ -7815,6 +7816,7 @@ PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) { //F.getParent()->IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken function found, compilation aborted!"); + return PreservedAnalyses::none(); } return PreservedAnalyses::all(); diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index fbe7f6089ff18..042824ac37fea 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -164,9 +164,11 @@ int main(int argc, char **argv) { MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); auto PA = MPM.run(*M, MAM); - auto PAC = 
PA.getChecker(); - if (!PAC.preserved()) - return 1; + { + auto PAC = PA.getChecker(); + if (!PAC.preserved()) + return 1; + } return 0; } >From 99c29069cdaf68c92ce7f25ca2f730bf738ca324 Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 21:16:02 -0400 Subject: [PATCH 06/27] Rebase update. --- llvm/include/llvm/Target/TargetVerifier.h | 2 +- llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index 2d0c039132c35..fe683311b901c 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -73,7 +73,7 @@ class TargetVerify { bool IsValid = true; TargetVerify(Module *Mod) - : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())), + : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} void run(Function &F) {}; diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 042824ac37fea..627bc51ef3a43 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -123,7 +123,7 @@ int main(int argc, char **argv) { return 1; } auto S = Triple::normalize(TargetTriple); - M->setTargetTriple(S); + M->setTargetTriple(Triple(S)); PassInstrumentationCallbacks PIC; StandardInstrumentations SI(Context, false/*debug PM*/, @@ -153,7 +153,7 @@ int main(int argc, char **argv) { Triple TT(M->getTargetTriple()); if (!NoLint) - FPM.addPass(LintPass()); + FPM.addPass(LintPass(false)); if (!NoVerify) MPM.addPass(VerifierPass()); if (TT.isAMDGPU()) >From 3ea7eae48a6addbf711716e7a819830dddc1b34a Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 22:49:52 -0400 Subject: [PATCH 07/27] Add generic TargetVerifier. 
---
 llvm/lib/Target/TargetVerifier.cpp | 32 ++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 llvm/lib/Target/TargetVerifier.cpp

diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
new file mode 100644
index 0000000000000..de3ff749e7c3c
--- /dev/null
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -0,0 +1,32 @@
+#include "llvm/Target/TargetVerifier.h"
+#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
+
+#include "llvm/Analysis/UniformityAnalysis.h"
+#include "llvm/Analysis/PostDominators.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/IR/Dominators.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/IntrinsicsAMDGPU.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/Value.h"
+
+namespace llvm {
+
+void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
+  if (TT.isAMDGPU()) {
+    auto *UA = &AM.getResult<UniformityInfoAnalysis>(F);
+    auto *DT = &AM.getResult<DominatorTreeAnalysis>(F);
+    auto *PDT = &AM.getResult<PostDominatorTreeAnalysis>(F);
+
+    AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
+    TV.run(F);
+
+    dbgs() << TV.MessagesStr.str();
+    if (!TV.MessagesStr.str().empty()) {
+      TV.IsValid = false;
+    }
+  }
+}
+
+} // namespace llvm

>From f52c4dbc84952d97266f5f4158729e564de10240 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 19 Apr 2025 23:13:14 -0400
Subject: [PATCH 08/27] Remove store to const check since it is in Lint already

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 8 --------
 llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 2 --
 2 files changed, 10 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index c70a6d1b6fa66..1cf2b277bee26 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -140,14 +140,6 @@ void AMDGPUTargetVerify::run(Function &F) {
           Check(IsValidInt(I.getOperand(i)->getType()),
                 "Int type is invalid.", I.getOperand(i));
 
-      // Ensure no store to const memory
-      if (auto *SI = dyn_cast<StoreInst>(&I))
-      {
-        unsigned AS = SI->getPointerAddressSpace();
-        Check(AS != 4, "Write to const memory", SI);
-      }
-
-      // Ensure no kernel to kernel calls.
       if (auto *CI = dyn_cast<CallInst>(&I))
       {
         CallingConv::ID CalleeCC = CI->getCallingConv();
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
index f56ff992a56c2..c628abbde11d1 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
@@ -32,8 +32,6 @@ define amdgpu_cs i32 @shader() {
 
 define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) {
 ; CHECK: Undefined behavior: Write to memory in const addrspace
-; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4
-; CHECK-NEXT: Write to const memory
 ; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4
   %r = add i32 %a, %b
   store i32 %r, ptr addrspace(4) %out

>From 5c9a4ab3895d6939b12386d1db2081ca388df01a Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 19 Apr 2025 23:14:38 -0400
Subject: [PATCH 09/27] Add chain followed by unreachable check

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 6 ++++++
 llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 10 ++++++++++
 2 files changed, 16 insertions(+)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 1cf2b277bee26..8ea773bc0e66f 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -142,6 +142,7 @@ void AMDGPUTargetVerify::run(Function &F) {
 
       if (auto *CI = dyn_cast<CallInst>(&I))
       {
+        // Ensure no kernel to kernel calls.
         CallingConv::ID CalleeCC = CI->getCallingConv();
         if (CalleeCC == CallingConv::AMDGPU_KERNEL)
         {
@@ -149,6 +150,11 @@ void AMDGPUTargetVerify::run(Function &F) {
           Check(CallerCC != CallingConv::AMDGPU_KERNEL,
             "A kernel may not call a kernel", CI->getParent()->getParent());
         }
+
+        // Ensure chain intrinsics are followed by unreachables.
+        if (CI->getIntrinsicID() == Intrinsic::amdgcn_cs_chain)
+          Check(isa_and_present<UnreachableInst>(CI->getNextNode()),
+            "llvm.amdgcn.cs.chain must be followed by unreachable", CI);
       }
 
       // Ensure MFMA is not in control flow with diverging operands
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
index c628abbde11d1..e620df94ccde4 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
@@ -58,3 +58,13 @@ entry:
   %tmp2 = ashr i65 %x, 64
   ret i65 %tmp2
 }
+
+declare void @llvm.amdgcn.cs.chain.v3i32(ptr, i32, <3 x i32>, <3 x i32>, i32, ...)
+declare amdgpu_cs_chain void @chain_callee(<3 x i32> inreg, <3 x i32>)
+
+define amdgpu_cs void @no_unreachable(<3 x i32> inreg %a, <3 x i32> %b) {
+; CHECK: llvm.amdgcn.cs.chain must be followed by unreachable
+; CHECK-NEXT: call void (ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.p0.i32.v3i32.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0)
+  call void(ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0)
+  ret void
+}

>From 0ff03f792c018e4fd0c11de9da4d3353617707f5 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 19 Apr 2025 23:26:19 -0400
Subject: [PATCH 10/27] Remove mfma check

---
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 89 -------------------
 llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 25 ------
 2 files changed, 114 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 8ea773bc0e66f..684ced5bba574 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -14,9 +14,6 @@
 
 using namespace llvm;
 
-//static cl::opt<bool>
-//MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false));
-
 // Check - We know that cond should be true, if not print an error message.
 #define Check(C, ...)
\ do { \ @@ -26,60 +23,6 @@ using namespace llvm; } \ } while (false) -static bool isMFMA(unsigned IID) { - switch (IID) { - case Intrinsic::amdgcn_mfma_f32_4x4x1f32: - case Intrinsic::amdgcn_mfma_f32_4x4x4f16: - case Intrinsic::amdgcn_mfma_i32_4x4x4i8: - case Intrinsic::amdgcn_mfma_f32_4x4x2bf16: - - case Intrinsic::amdgcn_mfma_f32_16x16x1f32: - case Intrinsic::amdgcn_mfma_f32_16x16x4f32: - case Intrinsic::amdgcn_mfma_f32_16x16x4f16: - case Intrinsic::amdgcn_mfma_f32_16x16x16f16: - case Intrinsic::amdgcn_mfma_i32_16x16x4i8: - case Intrinsic::amdgcn_mfma_i32_16x16x16i8: - case Intrinsic::amdgcn_mfma_f32_16x16x2bf16: - case Intrinsic::amdgcn_mfma_f32_16x16x8bf16: - - case Intrinsic::amdgcn_mfma_f32_32x32x1f32: - case Intrinsic::amdgcn_mfma_f32_32x32x2f32: - case Intrinsic::amdgcn_mfma_f32_32x32x4f16: - case Intrinsic::amdgcn_mfma_f32_32x32x8f16: - case Intrinsic::amdgcn_mfma_i32_32x32x4i8: - case Intrinsic::amdgcn_mfma_i32_32x32x8i8: - case Intrinsic::amdgcn_mfma_f32_32x32x2bf16: - case Intrinsic::amdgcn_mfma_f32_32x32x4bf16: - - case Intrinsic::amdgcn_mfma_f32_4x4x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_16x16x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_16x16x16bf16_1k: - case Intrinsic::amdgcn_mfma_f32_32x32x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_32x32x8bf16_1k: - - case Intrinsic::amdgcn_mfma_f64_16x16x4f64: - case Intrinsic::amdgcn_mfma_f64_4x4x4f64: - - case Intrinsic::amdgcn_mfma_i32_16x16x32_i8: - case Intrinsic::amdgcn_mfma_i32_32x32x16_i8: - case Intrinsic::amdgcn_mfma_f32_16x16x8_xf32: - case Intrinsic::amdgcn_mfma_f32_32x32x4_xf32: - - case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_bf8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_fp8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_bf8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_fp8: - - case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_bf8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_fp8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_bf8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_fp8: - 
return true; - default: - return false; - } -} - namespace llvm { /*class AMDGPUTargetVerify : public TargetVerify { public: @@ -129,8 +72,6 @@ void AMDGPUTargetVerify::run(Function &F) { for (auto &BB : F) { for (auto &I : BB) { - //if (MarkUniform) - //outs() << UA->isUniform(&I) << ' ' << I << '\n'; // Ensure integral types are valid: i8, i16, i32, i64, i128 if (I.getType()->isIntegerTy()) @@ -156,36 +97,6 @@ void AMDGPUTargetVerify::run(Function &F) { Check(isa_and_present(CI->getNextNode()), "llvm.amdgcn.cs.chain must be followed by unreachable", CI); } - - // Ensure MFMA is not in control flow with diverging operands - if (auto *II = dyn_cast(&I)) { - if (isMFMA(II->getIntrinsicID())) { - bool InControlFlow = false; - for (const auto &P : predecessors(&BB)) - if (!PDT->dominates(&BB, P)) { - InControlFlow = true; - break; - } - for (const auto &S : successors(&BB)) - if (!DT->dominates(&BB, S)) { - InControlFlow = true; - break; - } - if (InControlFlow) { - // If operands to MFMA are not uniform, MFMA cannot be in control flow - bool hasUniformOperands = true; - for (unsigned i = 0; i < II->getNumOperands(); i++) { - if (!UA->isUniform(II->getOperand(i))) { - dbgs() << "Not uniform: " << *II->getOperand(i) << '\n'; - hasUniformOperands = false; - } - } - if (!hasUniformOperands) Check(false, "MFMA in control flow", II); - //else Check(false, "MFMA in control flow (uniform operands)", II); - } - //else Check(false, "MFMA not in control flow", II); - } - } } } } diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll index e620df94ccde4..62b220d7d9f49 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -1,30 +1,5 @@ ; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s -define amdgpu_kernel void @test_mfma_f32_32x32x1f32_vecarg(ptr addrspace(1) %arg) #0 { -; CHECK: Not uniform: %in.f32 = load <32 x float>, ptr addrspace(1) %gep, align 128 -; CHECK-NEXT: MFMA in control 
flow -; CHECK-NEXT: %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) -s: - %tid = call i32 @llvm.amdgcn.workitem.id.x() - %gep = getelementptr inbounds <32 x float>, ptr addrspace(1) %arg, i32 %tid - %in.i32 = load <32 x i32>, ptr addrspace(1) %gep - %in.f32 = load <32 x float>, ptr addrspace(1) %gep - - %0 = icmp eq <32 x i32> %in.i32, zeroinitializer - %div.br = extractelement <32 x i1> %0, i32 0 - br i1 %div.br, label %if.3, label %else.0 - -if.3: - br label %join - -else.0: - %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) - br label %join - -join: - ret void -} - define amdgpu_cs i32 @shader() { ; CHECK: Shaders must return void ret i32 0 >From 6b84c73a35a260d64ed45df90052f8212b0ee4e7 Mon Sep 17 00:00:00 2001 From: jofernau Date: Mon, 21 Apr 2025 20:54:10 -0400 Subject: [PATCH 11/27] Add registerVerifierPasses to PassBuilder and add the verifier passes to PassRegistry. 
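Editorial illustration, not part of the submitted patch: the registration scheme this commit introduces (callbacks stored by `registerVerifierCallback`, replayed by `registerVerifierPasses`) can be sketched without LLVM as below. `ToyPassManager`, `ToyPassBuilder` and `collectVerifierPasses` are hypothetical stand-ins, and the `bool(PassManager &)` callback signature is an assumption, since the template arguments did not survive in the archived diff.

```cpp
// Standalone model of the callback-registration pattern. All Toy* names are
// hypothetical stand-ins and are NOT the real LLVM classes.
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a pass manager: it only records the names of added passes.
struct ToyPassManager {
  std::vector<std::string> Passes;
  void addPass(std::string Name) { Passes.push_back(std::move(Name)); }
};

// Assumed callback shape; the archived patch lost the real template arguments.
using ToyCallback = std::function<bool(ToyPassManager &)>;

// Stand-in for PassBuilder, reduced to the two members the patch adds.
class ToyPassBuilder {
  std::vector<ToyCallback> VerifierCallbacks;   // module-level verifiers
  std::vector<ToyCallback> FnVerifierCallbacks; // function-level verifiers

public:
  // Store one module-level and one function-level callback.
  void registerVerifierCallback(ToyCallback C, ToyCallback CF) {
    VerifierCallbacks.push_back(std::move(C));
    FnVerifierCallbacks.push_back(std::move(CF));
  }

  // Replay every stored callback against the matching pass manager.
  void registerVerifierPasses(ToyPassManager &MPM, ToyPassManager &FPM) {
    for (auto &C : VerifierCallbacks)
      C(MPM);
    for (auto &C : FnVerifierCallbacks)
      C(FPM);
  }
};

// Registers one verifier of each kind, replays them, and returns the pass
// names in the order they were added (module passes first).
inline std::vector<std::string> collectVerifierPasses() {
  ToyPassBuilder PB;
  PB.registerVerifierCallback(
      [](ToyPassManager &PM) { PM.addPass("verifier"); return false; },
      [](ToyPassManager &PM) { PM.addPass("amdgpu-tgtverifier"); return false; });

  ToyPassManager MPM, FPM;
  PB.registerVerifierPasses(MPM, FPM);

  std::vector<std::string> All = MPM.Passes;
  All.insert(All.end(), FPM.Passes.begin(), FPM.Passes.end());
  return All;
}
```

In this sketch the target only supplies callbacks, and the driver decides when to replay them into its own pass managers. That appears to be the intent of the `TargetPassRegistry.inc` hunk in this patch.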
--- llvm/include/llvm/InitializePasses.h | 1 + llvm/include/llvm/Passes/PassBuilder.h | 21 +++++++ .../llvm/Passes/TargetPassRegistry.inc | 12 ++++ .../TargetVerify/AMDGPUTargetVerifier.h | 11 ++-- llvm/lib/Passes/PassBuilder.cpp | 7 +++ llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 11 ++++ .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 56 ++++++++++++++++++- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 1 + 8 files changed, 114 insertions(+), 6 deletions(-) diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 9bef8e496c57e..ae398db3dc1da 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -317,6 +317,7 @@ void initializeUnpackMachineBundlesPass(PassRegistry &); void initializeUnreachableBlockElimLegacyPassPass(PassRegistry &); void initializeUnreachableMachineBlockElimLegacyPass(PassRegistry &); void initializeVerifierLegacyPassPass(PassRegistry &); +void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); void initializeVirtRegMapWrapperLegacyPass(PassRegistry &); void initializeVirtRegRewriterPass(PassRegistry &); void initializeWasmEHPreparePass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/PassBuilder.h b/llvm/include/llvm/Passes/PassBuilder.h index 51ccaa53447d7..6000769ce723b 100644 --- a/llvm/include/llvm/Passes/PassBuilder.h +++ b/llvm/include/llvm/Passes/PassBuilder.h @@ -172,6 +172,13 @@ class PassBuilder { /// additional analyses. void registerLoopAnalyses(LoopAnalysisManager &LAM); + /// Registers all available verifier passes. + /// + /// This is an interface that can be used to populate a + /// \c ModuleAnalysisManager with all registered loop analyses. Callers can + /// still manually register any additional analyses. + void registerVerifierPasses(ModulePassManager &PM, FunctionPassManager &); + /// Registers all available machine function analysis passes. 
/// /// This is an interface that can be used to populate a \c @@ -570,6 +577,15 @@ class PassBuilder { } /// @}} + /// Register a callback for parsing an Verifier Name to populate + /// the given managers. + void registerVerifierCallback( + const std::function &C, + const std::function &CF) { + VerifierCallbacks.push_back(C); + FnVerifierCallbacks.push_back(CF); + } + /// {{@ Register pipeline parsing callbacks with this pass builder instance. /// Using these callbacks, callers can parse both a single pass name, as well /// as entire sub-pipelines, and populate the PassManager instance @@ -841,6 +857,11 @@ class PassBuilder { // Callbacks to parse `filter` parameter in register allocation passes SmallVector, 2> RegClassFilterParsingCallbacks; + // Verifier callbacks + SmallVector, 2> + VerifierCallbacks; + SmallVector, 2> + FnVerifierCallbacks; }; /// This utility template takes care of adding require<> and invalidate<> diff --git a/llvm/include/llvm/Passes/TargetPassRegistry.inc b/llvm/include/llvm/Passes/TargetPassRegistry.inc index 521913cb25a4a..2d04b874cf360 100644 --- a/llvm/include/llvm/Passes/TargetPassRegistry.inc +++ b/llvm/include/llvm/Passes/TargetPassRegistry.inc @@ -151,6 +151,18 @@ PB.registerPipelineParsingCallback([=](StringRef Name, FunctionPassManager &PM, return false; }); +PB.registerVerifierCallback([](ModulePassManager &PM) { +#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) PM.addPass(CREATE_PASS) +#include GET_PASS_REGISTRY +#undef VERIFIER_MODULE_ANALYSIS + return false; +}, [](FunctionPassManager &FPM) { +#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) FPM.addPass(CREATE_PASS) +#include GET_PASS_REGISTRY +#undef VERIFIER_FUNCTION_ANALYSIS + return false; +}); + #undef ADD_PASS #undef ADD_PASS_WITH_PARAMS diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index d8a3fda4f87dc..b6a7412e8c1ef 100644 --- 
a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -39,14 +39,17 @@ class AMDGPUTargetVerify : public TargetVerify { public: Module *Mod; - DominatorTree *DT; - PostDominatorTree *PDT; - UniformityInfo *UA; + DominatorTree *DT = nullptr; + PostDominatorTree *PDT = nullptr; + UniformityInfo *UA = nullptr; + + AMDGPUTargetVerify(Module *Mod) + : TargetVerify(Mod), Mod(Mod) {} AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} - void run(Function &F); + bool run(Function &F); }; } // namespace llvm diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index e7057d9a6b625..e942fed8b6a72 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -582,6 +582,13 @@ void PassBuilder::registerLoopAnalyses(LoopAnalysisManager &LAM) { C(LAM); } +void PassBuilder::registerVerifierPasses(ModulePassManager &MPM, FunctionPassManager &FPM) { + for (auto &C : VerifierCallbacks) + C(MPM); + for (auto &C : FnVerifierCallbacks) + C(FPM); +} + static std::optional> parseFunctionPipelineName(StringRef Name) { std::pair Params; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def index 98a1147ef6d66..41e6a399c7239 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def +++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def @@ -81,6 +81,17 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA()) #undef FUNCTION_ALIAS_ANALYSIS #undef FUNCTION_ANALYSIS +#ifndef VERIFIER_MODULE_ANALYSIS +#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) +#endif +#ifndef VERIFIER_FUNCTION_ANALYSIS +#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) +#endif +VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass()) +VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass()) +#undef VERIFIER_MODULE_ANALYSIS 
+#undef VERIFIER_FUNCTION_ANALYSIS + #ifndef FUNCTION_PASS_WITH_PARAMS #define FUNCTION_PASS_WITH_PARAMS(NAME, CLASS, CREATE_PASS, PARSER, PARAMS) #endif diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 684ced5bba574..63a7526b9abdc 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -5,6 +5,7 @@ #include "llvm/Support/Debug.h" #include "llvm/IR/Dominators.h" #include "llvm/IR/Function.h" +#include "llvm/InitializePasses.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/IntrinsicsAMDGPU.h" #include "llvm/IR/Module.h" @@ -19,7 +20,7 @@ using namespace llvm; do { \ if (!(C)) { \ TargetVerify::CheckFailed(__VA_ARGS__); \ - return; \ + return false; \ } \ } while (false) @@ -64,7 +65,7 @@ static bool isShader(CallingConv::ID CC) { } } -void AMDGPUTargetVerify::run(Function &F) { +bool AMDGPUTargetVerify::run(Function &F) { // Ensure shader calling convention returns void if (isShader(F.getCallingConv())) Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void"); @@ -99,6 +100,10 @@ void AMDGPUTargetVerify::run(Function &F) { } } } + + if (!MessagesStr.str().empty()) + return false; + return true; } PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { @@ -120,4 +125,51 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan return PreservedAnalyses::all(); } + +struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { + static char ID; + + std::unique_ptr TV; + bool FatalErrors = true; + + AMDGPUTargetVerifierLegacyPass() : FunctionPass(ID) { + initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + } + AMDGPUTargetVerifierLegacyPass(bool FatalErrors) + : FunctionPass(ID), + FatalErrors(FatalErrors) { + initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + } + + bool 
doInitialization(Module &M) override { + TV = std::make_unique(&M); + return false; + } + + bool runOnFunction(Function &F) override { + if (TV->run(F) && FatalErrors) { + errs() << "in function " << F.getName() << '\n'; + report_fatal_error("Broken function found, compilation aborted!"); + } + return false; + } + + bool doFinalization(Module &M) override { + bool IsValid = true; + for (Function &F : M) + if (F.isDeclaration()) + IsValid &= TV->run(F); + + //IsValid &= TV->run(); + if (FatalErrors && !IsValid) + report_fatal_error("Broken module found, compilation aborted!"); + return false; + } + + void getAnalysisUsage(AnalysisUsage &AU) const override { + AU.setPreservesAll(); + } +}; +char AMDGPUTargetVerifierLegacyPass::ID = 0; } // namespace llvm +INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverify", "AMDGPU Target Verifier", false, false) diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 627bc51ef3a43..503db7b1f8d18 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -144,6 +144,7 @@ int main(int argc, char **argv) { PB.registerCGSCCAnalyses(CGAM); PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); + //PB.registerVerifierPasses(MPM, FPM); PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); >From ec3276b182f3a758a24024291772efe435485857 Mon Sep 17 00:00:00 2001 From: jofernau Date: Tue, 22 Apr 2025 14:57:31 -0400 Subject: [PATCH 12/27] Remove leftovers. Add titles. Add call to registerVerifierCallbacks in llc. 
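Editorial illustration, not part of the submitted patch: besides the cleanups named in the subject, the diff below changes `runOnFunction` to test `!TV->run(F)`. This matches the convention from the previous patch, where `run` returns `true` when the function is valid and the `Check` macro returns `false` on the first failure. A minimal standalone sketch of that convention follows. `ToyVerifier`, `TOY_CHECK` and `isBroken` are hypothetical names, and the accepted widths are an assumption combining the `isIntegerTy(1)` test with the source comment "i8, i16, i32, i64, i128".

```cpp
// Standalone model of the "run() returns true when valid" convention.
// ToyVerifier, TOY_CHECK and isBroken are hypothetical, not LLVM APIs.
#include <cassert>
#include <sstream>
#include <string>

struct ToyVerifier {
  std::ostringstream Messages;

  // Record a diagnostic, loosely mirroring TargetVerify::CheckFailed.
  void CheckFailed(const std::string &Msg) { Messages << Msg << '\n'; }

  // Returns true when the integer width passes every check.
  bool run(int BitWidth);
};

// On failure, record the message and leave run() early with 'false'.
#define TOY_CHECK(C, MSG)                                                      \
  do {                                                                         \
    if (!(C)) {                                                                \
      CheckFailed(MSG);                                                        \
      return false;                                                            \
    }                                                                          \
  } while (false)

bool ToyVerifier::run(int BitWidth) {
  // Assumed set of valid widths: i1, i8, i16, i32, i64, i128.
  TOY_CHECK(BitWidth == 1 || BitWidth == 8 || BitWidth == 16 ||
                BitWidth == 32 || BitWidth == 64 || BitWidth == 128,
            "Int type is invalid.");
  // No check failed, so valid exactly when no messages were recorded.
  return Messages.str().empty();
}

// Caller side: a function is broken when run() reports failure, hence the
// negation that this patch adds in runOnFunction.
inline bool isBroken(int BitWidth) {
  ToyVerifier TV;
  return !TV.run(BitWidth);
}
```

Under these assumptions a width of 65, as in the `i65` test case in `tgt-verify.ll`, is reported as broken, while 32 is accepted. Without the negation the caller would abort on valid input and let invalid input through.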
--- llvm/lib/Passes/CMakeLists.txt | 2 +- .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 4 --- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 35 +++++++++++-------- llvm/lib/Target/TargetVerifier.cpp | 19 ++++++++++ llvm/tools/llc/NewPMDriver.cpp | 6 ++-- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 7 ++-- 6 files changed, 45 insertions(+), 28 deletions(-) diff --git a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt index f171377a8b270..9c348cb89a8c5 100644 --- a/llvm/lib/Passes/CMakeLists.txt +++ b/llvm/lib/Passes/CMakeLists.txt @@ -29,7 +29,7 @@ add_llvm_component_library(LLVMPasses Scalar Support Target - TargetParser + #TargetParser TransformUtils Vectorize Instrumentation diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 6ec34d6a0fdbf..257cc724b3da9 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -1299,8 +1299,6 @@ void AMDGPUPassConfig::addIRPasses() { addPass(createLICMPass()); } - //addPass(AMDGPUTargetVerifierPass()); - TargetPassConfig::addIRPasses(); // EarlyCSE is not always strong enough to clean up what LSR produces. For @@ -2043,8 +2041,6 @@ void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { // but EarlyCSE can do neither of them. if (isPassEnabled(EnableScalarIRPasses)) addEarlyCSEOrGVNPass(addPass); - - addPass(AMDGPUTargetVerifierPass()); } void AMDGPUCodeGenPassBuilder::addCodeGenPrepare(AddIRPass &addPass) const { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 63a7526b9abdc..0eecedaebc7ce 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -1,3 +1,22 @@ +//===-- AMDGPUTargetVerifier.cpp - AMDGPU -------------------------*- C++ -*-===// +//// +//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 
+//// See https://llvm.org/LICENSE.txt for license information. +//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +//// +////===----------------------------------------------------------------------===// +//// +//// This file defines target verifier interfaces that can be used for some +//// validation of input to the system, and for checking that transformations +//// haven't done something bad. In contrast to the Verifier or Lint, the +//// TargetVerifier looks for constructions invalid to a particular target +//// machine. +//// +//// To see what specifically is checked, look at an individual backend's +//// TargetVerifier. +//// +////===----------------------------------------------------------------------===// + #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "llvm/Analysis/UniformityAnalysis.h" @@ -25,19 +44,6 @@ using namespace llvm; } while (false) namespace llvm { -/*class AMDGPUTargetVerify : public TargetVerify { -public: - Module *Mod; - - DominatorTree *DT; - PostDominatorTree *PDT; - UniformityInfo *UA; - - AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) - : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} - - void run(Function &F); -};*/ static bool IsValidInt(const Type *Ty) { return Ty->isIntegerTy(1) || @@ -147,7 +153,7 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { } bool runOnFunction(Function &F) override { - if (TV->run(F) && FatalErrors) { + if (!TV->run(F) && FatalErrors) { errs() << "in function " << F.getName() << '\n'; report_fatal_error("Broken function found, compilation aborted!"); } @@ -160,7 +166,6 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { if (F.isDeclaration()) IsValid &= TV->run(F); - //IsValid &= TV->run(); if (FatalErrors && !IsValid) report_fatal_error("Broken module found, compilation aborted!"); return false; diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp index 
de3ff749e7c3c..992a0c91d93b1 100644 --- a/llvm/lib/Target/TargetVerifier.cpp +++ b/llvm/lib/Target/TargetVerifier.cpp @@ -1,3 +1,22 @@ +//===-- TargetVerifier.cpp - LLVM IR Target Verifier ----------------*- C++ -*-===// +//// +///// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +///// See https://llvm.org/LICENSE.txt for license information. +///// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +///// +/////===----------------------------------------------------------------------===// +///// +///// This file defines target verifier interfaces that can be used for some +///// validation of input to the system, and for checking that transformations +///// haven't done something bad. In contrast to the Verifier or Lint, the +///// TargetVerifier looks for constructions invalid to a particular target +///// machine. +///// +///// To see what specifically is checked, look at TargetVerifier.cpp or an +///// individual backend's TargetVerifier. +///// +/////===----------------------------------------------------------------------===// + #include "llvm/Target/TargetVerifier.h" #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index a060d16e74958..a8f6b999af06e 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -114,6 +114,8 @@ int llvm::compileModuleWithNewPM( VK == VerifierKind::EachPass); registerCodeGenCallback(PIC, *Target); + ModulePassManager MPM; + FunctionPassManager FPM; MachineFunctionAnalysisManager MFAM; LoopAnalysisManager LAM; FunctionAnalysisManager FAM; @@ -125,15 +127,13 @@ int llvm::compileModuleWithNewPM( PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); PB.registerMachineFunctionAnalyses(MFAM); + PB.registerVerifierPasses(MPM, FPM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); SI.registerCallbacks(PIC, &MAM, &FAM); FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); 
}); MAM.registerPass([&] { return MachineModuleAnalysis(MMI); }); - ModulePassManager MPM; - FunctionPassManager FPM; - if (!PassPipeline.empty()) { // Construct a custom pass pipeline that starts after instruction // selection. diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 503db7b1f8d18..b00bab66c6c3e 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -1,4 +1,4 @@ -//===--- llvm-isel-fuzzer.cpp - Fuzzer for instruction selection ----------===// +//===--- llvm-tgt-verify.cpp - Target Verifier ----------------- ----------===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -6,7 +6,7 @@ // //===----------------------------------------------------------------------===// // -// Tool to fuzz instruction selection using libFuzzer. +// Tool to verify a target. // //===----------------------------------------------------------------------===// @@ -144,14 +144,11 @@ int main(int argc, char **argv) { PB.registerCGSCCAnalyses(CGAM); PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); - //PB.registerVerifierPasses(MPM, FPM); PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); SI.registerCallbacks(PIC, &MAM, &FAM); - //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); - Triple TT(M->getTargetTriple()); if (!NoLint) FPM.addPass(LintPass(false)); >From 4f00c83f58a86a0adc26b621cc53e8b568b8c8e0 Mon Sep 17 00:00:00 2001 From: jofernau Date: Thu, 24 Apr 2025 16:02:21 -0400 Subject: [PATCH 13/27] Add pass to legacy PM. 
--- llvm/include/llvm/CodeGen/Passes.h | 2 + llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Target/TargetVerifier.h | 6 +- llvm/lib/Passes/StandardInstrumentations.cpp | 4 +- llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 2 +- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 45 ---------- llvm/lib/Target/TargetVerifier.cpp | 87 ++++++++++++++++++- llvm/tools/llc/NewPMDriver.cpp | 6 +- llvm/tools/llc/llc.cpp | 4 + .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 1 + 10 files changed, 106 insertions(+), 53 deletions(-) diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index d214ab9306c2f..b293315e11c17 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -617,6 +617,8 @@ namespace llvm { /// Lowers KCFI operand bundles for indirect calls. FunctionPass *createKCFIPass(); + + FunctionPass *createTargetVerifierLegacyPass(); } // End llvm namespace #endif diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index ae398db3dc1da..3f9ffc4efd9ec 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -307,6 +307,7 @@ void initializeTailDuplicateLegacyPass(PassRegistry &); void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &); void initializeTargetPassConfigPass(PassRegistry &); void initializeTargetTransformInfoWrapperPassPass(PassRegistry &); +void initializeTargetVerifierLegacyPassPass(PassRegistry &); void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &); void initializeTypeBasedAAWrapperPassPass(PassRegistry &); void initializeTypePromotionLegacyPass(PassRegistry &); @@ -317,7 +318,6 @@ void initializeUnpackMachineBundlesPass(PassRegistry &); void initializeUnreachableBlockElimLegacyPassPass(PassRegistry &); void initializeUnreachableMachineBlockElimLegacyPass(PassRegistry &); void initializeVerifierLegacyPassPass(PassRegistry &); -void 
initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); void initializeVirtRegMapWrapperLegacyPass(PassRegistry &); void initializeVirtRegRewriterPass(PassRegistry &); void initializeWasmEHPreparePass(PassRegistry &); diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index fe683311b901c..23ef2e0b8d4ef 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -30,7 +30,7 @@ class Function; class TargetVerifierPass : public PassInfoMixin { public: - PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {} + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); }; class TargetVerify { @@ -76,8 +76,8 @@ class TargetVerify { : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} - void run(Function &F) {}; - void run(Function &F, FunctionAnalysisManager &AM); + bool run(Function &F); + bool run(Function &F, FunctionAnalysisManager &AM); }; } // namespace llvm diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index 879d657c87695..f125b3daffd5e 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -62,6 +62,8 @@ static cl::opt VerifyAnalysisInvalidation("verify-analysis-invalidation", #endif ); +static cl::opt VerifyTargetEach("verify-tgt-each"); + // An option that supports the -print-changed option. See // the description for -print-changed for an explanation of the use // of this option. Note that this option has no effect without -print-changed. 
@@ -1476,7 +1478,7 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, "\"{0}\", compilation aborted!", P)); - if (FAM) { + if (VerifyTargetEach && FAM) { TargetVerify TV(const_cast(F->getParent())); TV.run(*const_cast(F), *FAM); if (!TV.IsValid) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def index 41e6a399c7239..73f9c60cf588c 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def +++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def @@ -88,7 +88,7 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA()) #define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) #endif VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass()) -VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass()) +VERIFIER_FUNCTION_ANALYSIS("tgtverifier", TargetVerifierPass()) #undef VERIFIER_MODULE_ANALYSIS #undef VERIFIER_FUNCTION_ANALYSIS diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 0eecedaebc7ce..96bcaaf6f2ac9 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -132,49 +132,4 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan return PreservedAnalyses::all(); } -struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { - static char ID; - - std::unique_ptr TV; - bool FatalErrors = true; - - AMDGPUTargetVerifierLegacyPass() : FunctionPass(ID) { - initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); - } - AMDGPUTargetVerifierLegacyPass(bool FatalErrors) - : FunctionPass(ID), - FatalErrors(FatalErrors) { - initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); - } - - bool doInitialization(Module &M) override { - TV = std::make_unique(&M); - return false; - } - - bool runOnFunction(Function &F) override { - if (!TV->run(F) && FatalErrors) { - errs() << "in function " << 
F.getName() << '\n';
-      report_fatal_error("Broken function found, compilation aborted!");
-    }
-    return false;
-  }
-
-  bool doFinalization(Module &M) override {
-    bool IsValid = true;
-    for (Function &F : M)
-      if (F.isDeclaration())
-        IsValid &= TV->run(F);
-
-    if (FatalErrors && !IsValid)
-      report_fatal_error("Broken module found, compilation aborted!");
-    return false;
-  }
-
-  void getAnalysisUsage(AnalysisUsage &AU) const override {
-    AU.setPreservesAll();
-  }
-};
-char AMDGPUTargetVerifierLegacyPass::ID = 0;
 } // namespace llvm
-INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverify", "AMDGPU Target Verifier", false, false)
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 992a0c91d93b1..170fc4769c1d8 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -20,6 +20,7 @@
 #include "llvm/Target/TargetVerifier.h"
 #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
 
+#include "llvm/InitializePasses.h"
 #include "llvm/Analysis/UniformityAnalysis.h"
 #include "llvm/Analysis/PostDominators.h"
 #include "llvm/Support/Debug.h"
@@ -32,7 +33,22 @@
 
 namespace llvm {
 
-void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
+bool TargetVerify::run(Function &F) {
+  if (TT.isAMDGPU()) {
+    AMDGPUTargetVerify TV(Mod);
+    TV.run(F);
+
+    dbgs() << TV.MessagesStr.str();
+    if (!TV.MessagesStr.str().empty()) {
+      TV.IsValid = false;
+      return false;
+    }
+    return true;
+  }
+  report_fatal_error("Target has no verification method\n");
+}
+
+bool TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
   if (TT.isAMDGPU()) {
     auto *UA = &AM.getResult(F);
     auto *DT = &AM.getResult(F);
@@ -44,8 +60,77 @@ void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
     dbgs() << TV.MessagesStr.str();
     if (!TV.MessagesStr.str().empty()) {
       TV.IsValid = false;
+      return false;
+    }
+    return true;
+  }
+  report_fatal_error("Target has no verification method\n");
+}
+
+PreservedAnalyses TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
+  auto TT = F.getParent()->getTargetTriple();
+
+  if (TT.isAMDGPU()) {
+    auto *Mod = F.getParent();
+
+    auto UA = &AM.getResult(F);
+    auto *DT = &AM.getResult(F);
+    auto *PDT = &AM.getResult(F);
+
+    AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
+    TV.run(F);
+
+    dbgs() << TV.MessagesStr.str();
+    if (!TV.MessagesStr.str().empty()) {
+      TV.IsValid = false;
+      return PreservedAnalyses::none();
     }
+    return PreservedAnalyses::all();
   }
+  report_fatal_error("Target has no verification method\n");
 }
 
+struct TargetVerifierLegacyPass : public FunctionPass {
+  static char ID;
+
+  std::unique_ptr TV;
+
+  TargetVerifierLegacyPass() : FunctionPass(ID) {
+    initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
+  }
+
+  bool doInitialization(Module &M) override {
+    TV = std::make_unique(&M);
+    return false;
+  }
+
+  bool runOnFunction(Function &F) override {
+    if (!TV->run(F)) {
+      errs() << "in function " << F.getName() << '\n';
+      report_fatal_error("broken function found, compilation aborted!");
+    }
+    return false;
+  }
+
+  bool doFinalization(Module &M) override {
+    bool IsValid = true;
+    for (Function &F : M)
+      if (F.isDeclaration())
+        IsValid &= TV->run(F);
+
+    if (!IsValid)
+      report_fatal_error("broken module found, compilation aborted!");
+    return false;
+  }
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.setPreservesAll();
+  }
+};
+char TargetVerifierLegacyPass::ID = 0;
+FunctionPass *createTargetVerifierLegacyPass() {
+  return new TargetVerifierLegacyPass();
+}
 } // namespace llvm
+using namespace llvm;
+INITIALIZE_PASS(TargetVerifierLegacyPass, "tgtverifier", "Target Verifier", false, false)
diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp
index a8f6b999af06e..4b95977a10c5f 100644
--- a/llvm/tools/llc/NewPMDriver.cpp
+++ b/llvm/tools/llc/NewPMDriver.cpp
@@ -57,6 +57,9 @@ static cl::opt
     DebugPM("debug-pass-manager", cl::Hidden,
             cl::desc("Print pass management debugging information"));
 
+static cl::opt VerifyTarget("verify-tgt-new-pm",
+                            cl::desc("Verify the target"));
+
 bool LLCDiagnosticHandler::handleDiagnostics(const DiagnosticInfo &DI) {
   DiagnosticHandler::handleDiagnostics(DI);
   if (DI.getKind() == llvm::DK_SrcMgr) {
@@ -127,7 +130,8 @@ int llvm::compileModuleWithNewPM(
   PB.registerFunctionAnalyses(FAM);
   PB.registerLoopAnalyses(LAM);
   PB.registerMachineFunctionAnalyses(MFAM);
-  PB.registerVerifierPasses(MPM, FPM);
+  if (VerifyTarget)
+    PB.registerVerifierPasses(MPM, FPM);
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
   SI.registerCallbacks(PIC, &MAM, &FAM);
 
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 140459ba2de21..1fd8a9f9cd9f8 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -209,6 +209,8 @@ static cl::opt PassPipeline(
 static cl::alias PassPipeline2("p", cl::aliasopt(PassPipeline),
                                cl::desc("Alias for -passes"));
 
+static cl::opt VerifyTarget("verify-tgt", cl::desc("Verify the target"));
+
 namespace {
 
 std::vector &getRunPassNames() {
@@ -658,6 +660,8 @@ static int compileModule(char **argv, LLVMContext &Context) {
 
   // Build up all of the passes that we want to do to the module.
   legacy::PassManager PM;
+  if (VerifyTarget)
+    PM.add(createTargetVerifierLegacyPass());
   PM.add(new TargetLibraryInfoWrapperPass(TLII));
 
   {
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
index b00bab66c6c3e..b86c2318b45b7 100644
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
@@ -141,6 +141,7 @@ int main(int argc, char **argv) {
   ModuleAnalysisManager MAM;
   PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC);
   PB.registerModuleAnalyses(MAM);
+  //PB.registerVerifierPasses(MPM, FPM);
   PB.registerCGSCCAnalyses(CGAM);
   PB.registerFunctionAnalyses(FAM);
   PB.registerLoopAnalyses(LAM);

>From 3013fc91155a7d84c73ac820fe6bc24c47dad38d Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 26 Apr 2025 00:13:42 -0400
Subject: [PATCH 14/27] Add fam in other projects.

---
 flang/lib/Frontend/FrontendActions.cpp      | 2 +-
 llvm/examples/Kaleidoscope/Chapter4/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter5/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter6/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter7/toy.cpp | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index c1f47b12abee2..7c48e35ff68cf 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -911,7 +911,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   std::optional pgoOpt;
   llvm::StandardInstrumentations si(llvmModule->getContext(),
                                     opts.DebugPassManager);
-  si.registerCallbacks(pic, &mam);
+  si.registerCallbacks(pic, &mam, &fam);
   if (ci.isTimingEnabled())
     si.getTimePasses().setOutStream(ci.getTimingStreamLLVM());
   pto.LoopUnrolling = opts.UnrollLoops;
diff --git a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
index 0f58391c50667..f9664025f61f1 100644
--- a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
@@ -577,7 +577,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
 
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
index 7117eaf4982b0..eae06d9f57467 100644
--- a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
@@ -851,7 +851,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
 
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
index cb7b6cc8651c1..30ad79ef2fc58 100644
--- a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
@@ -970,7 +970,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
 
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
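
The call-site updates in this patch hand a function analysis manager to `registerCallbacks` as a third argument, and the series guards the per-function target check with `if (VerifyTargetEach && FAM)`. A minimal standalone sketch of that gating idea follows. Every name in it (`ModuleAnalyses`, `FunctionAnalyses`, `Instrumentation`, `countChecks`) is a hypothetical stand-in rather than LLVM API, and the defaulted pointer parameter is an assumption of the sketch, since the real signature is not shown in these hunks.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the analysis managers; these are NOT LLVM types.
struct ModuleAnalyses {};
struct FunctionAnalyses {};

// Hypothetical instrumentation object that records which checks it registers.
struct Instrumentation {
  std::vector<std::string> Enabled;

  // Assumption of this sketch: the function-level manager is optional.
  // When it is null, the per-function target check is simply not registered.
  void registerCallbacks(ModuleAnalyses *MAM, FunctionAnalyses *FAM = nullptr) {
    if (MAM)
      Enabled.push_back("ir-verify");
    if (FAM)
      Enabled.push_back("target-verify");
  }
};

// Returns how many checks get registered for a given call-site shape.
inline std::size_t countChecks(bool PassFAM) {
  ModuleAnalyses MAM;
  FunctionAnalyses FAM;
  Instrumentation SI;
  if (PassFAM)
    SI.registerCallbacks(&MAM, &FAM); // call site that also passes the FAM
  else
    SI.registerCallbacks(&MAM);       // call site that passes only the MAM
  return SI.Enabled.size();
}
```

In this sketch, a caller that passes only the module-level manager registers one check, and a caller that also passes the function-level manager registers two. A nullable pointer keeps the extra check opt-in per call site.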
diff --git a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
index 91b7191a07c6f..4a39bc33c5591 100644
--- a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
@@ -1139,7 +1139,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
 
   // Add transform passes.
   // Promote allocas to registers.

>From 8745cd135bd27559429f158fc0d678a210af7292 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 26 Apr 2025 02:30:40 -0400
Subject: [PATCH 15/27] Avoid fatal errors in llc.

---
 llvm/include/llvm/CodeGen/Passes.h             |  2 +-
 llvm/lib/Target/TargetVerifier.cpp             | 18 +++++++++++++-----
 .../test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll |  2 +-
 .../test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll |  2 +-
 llvm/tools/llc/llc.cpp                         |  2 +-
 5 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index b293315e11c17..8d88d858c57ad 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -618,7 +618,7 @@ namespace llvm {
   /// Lowers KCFI operand bundles for indirect calls.
   FunctionPass *createKCFIPass();
 
-  FunctionPass *createTargetVerifierLegacyPass();
+  FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors);
 } // End llvm namespace
 
 #endif
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 170fc4769c1d8..3be50f4ef6da3 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -94,8 +94,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   static char ID;
 
   std::unique_ptr TV;
+  bool FatalErrors = false;
 
-  TargetVerifierLegacyPass() : FunctionPass(ID) {
+  TargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID),
+    FatalErrors(FatalErrors) {
     initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
   }
 
@@ -107,7 +109,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   bool runOnFunction(Function &F) override {
     if (!TV->run(F)) {
       errs() << "in function " << F.getName() << '\n';
-      report_fatal_error("broken function found, compilation aborted!");
+      if (FatalErrors)
+        report_fatal_error("broken function found, compilation aborted!");
+      else
+        errs() << "broken function found, compilation aborted!\n";
     }
     return false;
   }
@@ -119,7 +124,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
         IsValid &= TV->run(F);
 
     if (!IsValid)
-      report_fatal_error("broken module found, compilation aborted!");
+      if (FatalErrors)
+        report_fatal_error("broken module found, compilation aborted!");
+      else
+        errs() << "broken module found, compilation aborted!\n";
     return false;
   }
 
@@ -128,8 +136,8 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   }
 };
 char TargetVerifierLegacyPass::ID = 0;
-FunctionPass *createTargetVerifierLegacyPass() {
-  return new TargetVerifierLegacyPass();
+FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors) {
+  return new TargetVerifierLegacyPass(FatalErrors);
 }
 } // namespace llvm
 using namespace llvm;
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
index c5e59d4a2369e..e2d9edda5d008 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm -o - < %s 2>&1 | FileCheck %s
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-tgt -o - < %s 2>&1 | FileCheck %s
 
 define amdgpu_cs i32 @nonvoid_shader() {
 ; CHECK: Shaders must return void
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
index 8a503b7624a73..a2dab0ff47924 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm %s -o - 2>&1 | FileCheck %s --allow-empty
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-tgt %s -o - 2>&1 | FileCheck %s --allow-empty
 
 define amdgpu_cs void @void_shader() {
 ; CHECK-NOT: Shaders must return void
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 1fd8a9f9cd9f8..329d95826551f 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -661,7 +661,7 @@ static int compileModule(char **argv, LLVMContext &Context) {
   // Build up all of the passes that we want to do to the module.
   legacy::PassManager PM;
   if (VerifyTarget)
-    PM.add(createTargetVerifierLegacyPass());
+    PM.add(createTargetVerifierLegacyPass(false));
   PM.add(new TargetLibraryInfoWrapperPass(TLII));
 
   {

>From c7bf730193e39bf838a29de7617d31a900bbc576 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 26 Apr 2025 03:40:47 -0400
Subject: [PATCH 16/27] Add tool to build/test.

---
 llvm/test/CMakeLists.txt                      |  1 +
 llvm/test/lit.cfg.py                          |  1 +
 llvm/utils/gn/secondary/llvm/test/BUILD.gn    |  1 +
 .../llvm/tools/llvm-tgt-verify/BUILD.gn       | 25 +++++++++++++++++++
 4 files changed, 28 insertions(+)
 create mode 100644 llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn

diff --git a/llvm/test/CMakeLists.txt b/llvm/test/CMakeLists.txt
index 66849002eb470..10ca9300e7c66 100644
--- a/llvm/test/CMakeLists.txt
+++ b/llvm/test/CMakeLists.txt
@@ -135,6 +135,7 @@ set(LLVM_TEST_DEPENDS
           llvm-strip
           llvm-symbolizer
           llvm-tblgen
+          llvm-tgt-verify
           llvm-readtapi
           llvm-tli-checker
           llvm-undname
diff --git a/llvm/test/lit.cfg.py b/llvm/test/lit.cfg.py
index aad7a088551b2..8620f2a7014b5 100644
--- a/llvm/test/lit.cfg.py
+++ b/llvm/test/lit.cfg.py
@@ -227,6 +227,7 @@ def get_asan_rtlib():
         "llvm-strings",
         "llvm-strip",
         "llvm-tblgen",
+        "llvm-tgt-verify",
         "llvm-readtapi",
         "llvm-undname",
         "llvm-windres",
diff --git a/llvm/utils/gn/secondary/llvm/test/BUILD.gn b/llvm/utils/gn/secondary/llvm/test/BUILD.gn
index 228642667b41d..157e7991c52a8 100644
--- a/llvm/utils/gn/secondary/llvm/test/BUILD.gn
+++ b/llvm/utils/gn/secondary/llvm/test/BUILD.gn
@@ -319,6 +319,7 @@ group("test") {
     "//llvm/tools/llvm-strings",
     "//llvm/tools/llvm-symbolizer:symlinks",
     "//llvm/tools/llvm-tli-checker",
+    "//llvm/tools/llvm-tgt-verify",
     "//llvm/tools/llvm-undname",
     "//llvm/tools/llvm-xray",
     "//llvm/tools/lto",
diff --git a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn
new file mode 100644
index 0000000000000..b751bafc5052c
--- /dev/null
+++ b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn
@@ -0,0 +1,25 @@
+import("//llvm/utils/TableGen/tablegen.gni")
+
+tgtverifier("llvm-tgt-verify") {
+  deps = [
+    "//llvm/lib/Analysis",
+    "//llvm/lib/AsmPrinter",
+    "//llvm/lib/CodeGen",
+    "//llvm/lib/CodeGenTypes",
+    "//llvm/lib/Core",
+    "//llvm/lib/IRPrinter",
+    "//llvm/lib/IRReader",
+    "//llvm/lib/MC",
+    "//llvm/lib/MIRParser",
+    "//llvm/lib/Passes",
+    "//llvm/lib/Remarks",
+    "//llvm/lib/ScalarOpts",
+    "//llvm/lib/SelectionDAG",
+    "//llvm/lib/Support",
+    "//llvm/lib/Target",
+    "//llvm/lib/TargetParser",
+    "//llvm/lib/TransformUtils",
+    "//llvm/lib/Vectorize",
+  ]
+  sources = [ "llvm-tgt-verify.cpp" ]
+}

>From c8dd3db3fe078f76e822a9646d3d7295fa23752a Mon Sep 17 00:00:00 2001
From: jofernau
Date: Mon, 28 Apr 2025 10:42:24 -0400
Subject: [PATCH 17/27] Cleanup of unrequired functions.

---
 llvm/include/llvm/Target/TargetVerifier.h     |  1 -
 .../TargetVerify/AMDGPUTargetVerifier.h       |  1 -
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp    | 25 +++----------------
 llvm/lib/Target/TargetVerifier.cpp            | 22 ++--------------
 4 files changed, 6 insertions(+), 43 deletions(-)

diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index 23ef2e0b8d4ef..427a05b2648a9 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -77,7 +77,6 @@ class TargetVerify {
         MessagesStr(Messages) {}
 
   bool run(Function &F);
-  bool run(Function &F, FunctionAnalysisManager &AM);
 };
 
 } // namespace llvm
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
index b6a7412e8c1ef..74e5b5f7a1efd 100644
--- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -32,7 +32,6 @@ class Function;
 
 class AMDGPUTargetVerifierPass : public TargetVerifierPass {
 public:
-  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
 };
 
 class AMDGPUTargetVerify : public TargetVerify {
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 96bcaaf6f2ac9..bda412f723242 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -107,29 +107,12 @@ bool AMDGPUTargetVerify::run(Function &F) {
     }
   }
 
-  if (!MessagesStr.str().empty())
+  //dbgs() << MessagesStr.str();
+  if (!MessagesStr.str().empty()) {
+    //IsValid = false;
     return false;
-  return true;
-}
-
-PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
-
-  auto *Mod = F.getParent();
-
-  auto UA = &AM.getResult(F);
-  auto *DT = &AM.getResult(F);
-  auto *PDT = &AM.getResult(F);
-
-  AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
-  TV.run(F);
-
-  dbgs() << TV.MessagesStr.str();
-  if (!TV.MessagesStr.str().empty()) {
-    TV.IsValid = false;
-    return PreservedAnalyses::none();
   }
-
-  return PreservedAnalyses::all();
+  return true;
 }
 
 } // namespace llvm
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 3be50f4ef6da3..6b57c18ff9316 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -48,25 +48,6 @@ bool TargetVerify::run(Function &F) {
   report_fatal_error("Target has no verification method\n");
 }
 
-bool TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
-  if (TT.isAMDGPU()) {
-    auto *UA = &AM.getResult(F);
-    auto *DT = &AM.getResult(F);
-    auto *PDT = &AM.getResult(F);
-
-    AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
-    TV.run(F);
-
-    dbgs() << TV.MessagesStr.str();
-    if (!TV.MessagesStr.str().empty()) {
-      TV.IsValid = false;
-      return false;
-    }
-    return true;
-  }
-  report_fatal_error("Target has no verification method\n");
-}
-
 PreservedAnalyses TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
   auto TT = F.getParent()->getTargetTriple();
 
@@ -123,11 +104,12 @@ struct TargetVerifierLegacyPass : public FunctionPass {
       if (F.isDeclaration())
         IsValid &= TV->run(F);
 
-    if (!IsValid)
+    if (!IsValid) {
       if (FatalErrors)
         report_fatal_error("broken module found, compilation aborted!");
       else
         errs() << "broken module found, compilation aborted!\n";
+    }
     return false;
   }
 

>From 2c12e6a6d7f9a1cb7bcebfb30ccdd0fe7b198727 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Mon, 28 Apr 2025 10:43:32 -0400
Subject: [PATCH 18/27] Make virtual.

---
 llvm/include/llvm/Target/TargetVerifier.h                    | 2 +-
 llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index 427a05b2648a9..ade2676a64325 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -76,7 +76,7 @@ class TargetVerify {
       : Mod(Mod), TT(Mod->getTargetTriple()),
         MessagesStr(Messages) {}
 
-  bool run(Function &F);
+  virtual bool run(Function &F);
 };
 
 } // namespace llvm
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
index 74e5b5f7a1efd..b97fbc046e391 100644
--- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -48,7 +48,7 @@ class AMDGPUTargetVerify : public TargetVerify {
   AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA)
     : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {}
 
-  bool run(Function &F);
+  bool run(Function &F) override;
 };
 
 } // namespace llvm

>From 3267b65e82da4cb7bc0f31f74c76f78d0445512f Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 10:56:43 -0400
Subject: [PATCH 19/27] Remove from legacy PM. Add to target dependent pipeline.

---
 llvm/include/llvm/CodeGen/Passes.h            |  2 +-
 llvm/include/llvm/InitializePasses.h          |  2 +-
 llvm/include/llvm/Target/TargetVerifier.h     |  4 +-
 .../TargetVerify/AMDGPUTargetVerifier.h       |  1 +
 llvm/lib/Passes/StandardInstrumentations.cpp  | 10 +--
 llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def |  2 +-
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |  2 +
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp    | 74 ++++++++++++++-
 llvm/lib/Target/TargetVerifier.cpp            | 90 -------------------
 llvm/tools/llc/llc.cpp                        |  2 -
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp |  2 -
 11 files changed, 85 insertions(+), 106 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index 8d88d858c57ad..da6ad3f612aa8 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -618,7 +618,7 @@ namespace llvm {
   /// Lowers KCFI operand bundles for indirect calls.
   FunctionPass *createKCFIPass();
 
-  FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors);
+  //FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors);
 } // End llvm namespace
 
 #endif
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index 3f9ffc4efd9ec..7d4fad2d87a16 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -307,7 +307,7 @@ void initializeTailDuplicateLegacyPass(PassRegistry &);
 void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &);
 void initializeTargetPassConfigPass(PassRegistry &);
 void initializeTargetTransformInfoWrapperPassPass(PassRegistry &);
-void initializeTargetVerifierLegacyPassPass(PassRegistry &);
+//void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &);
 void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &);
 void initializeTypeBasedAAWrapperPassPass(PassRegistry &);
 void initializeTypePromotionLegacyPass(PassRegistry &);
diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index ade2676a64325..1d12eb55bbf0a 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -30,7 +30,7 @@ class Function;
 
 class TargetVerifierPass : public PassInfoMixin {
 public:
-  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
+  virtual PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) = 0;
 };
 
 class TargetVerify {
@@ -76,7 +76,7 @@ class TargetVerify {
       : Mod(Mod), TT(Mod->getTargetTriple()),
         MessagesStr(Messages) {}
 
-  virtual bool run(Function &F);
+  virtual bool run(Function &F) = 0;
 };
 
 } // namespace llvm
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
index b97fbc046e391..49bcbc8849e3c 100644
--- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -32,6 +32,7 @@ class Function;
 
 class AMDGPUTargetVerifierPass : public TargetVerifierPass {
 public:
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) override;
 };
 
 class AMDGPUTargetVerify : public TargetVerify {
diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp
index f125b3daffd5e..076df47d5b15d 100644
--- a/llvm/lib/Passes/StandardInstrumentations.cpp
+++ b/llvm/lib/Passes/StandardInstrumentations.cpp
@@ -45,7 +45,7 @@
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Support/xxhash.h"
-#include "llvm/Target/TargetVerifier.h"
+//#include "llvm/Target/TargetVerifier.h"
 #include
 #include
 #include
@@ -1479,12 +1479,12 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC,
                                        P));
 
           if (VerifyTargetEach && FAM) {
-            TargetVerify TV(const_cast(F->getParent()));
-            TV.run(*const_cast(F), *FAM);
-            if (!TV.IsValid)
+            //TargetVerify TV(const_cast(F->getParent()));
+            //TV.run(*const_cast(F), *FAM);
+            /*if (!TV.IsValid)
               report_fatal_error(formatv("Broken function found after pass "
                                          "\"{0}\", compilation aborted!",
-                                         P));
+                                         P));*/
           }
         } else {
           const auto *M = unwrapIR(IR);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index 73f9c60cf588c..41e6a399c7239 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -88,7 +88,7 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA())
 #define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS)
 #endif
 VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass())
-VERIFIER_FUNCTION_ANALYSIS("tgtverifier", TargetVerifierPass())
+VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass())
 #undef VERIFIER_MODULE_ANALYSIS
 #undef VERIFIER_FUNCTION_ANALYSIS
 
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 257cc724b3da9..f1a60b8f33140 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1976,6 +1976,8 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder(
 }
 
 void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const {
+  addPass(AMDGPUTargetVerifierPass());
+
   if (RemoveIncompatibleFunctions && TM.getTargetTriple().isAMDGCN())
     addPass(AMDGPURemoveIncompatibleFunctionsPass(TM));
 
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index bda412f723242..cedd9ddc78011 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -107,12 +107,82 @@ bool AMDGPUTargetVerify::run(Function &F) {
     }
   }
 
-  //dbgs() << MessagesStr.str();
+  dbgs() << MessagesStr.str();
   if (!MessagesStr.str().empty()) {
-    //IsValid = false;
+    IsValid = false;
     return false;
   }
   return true;
 }
 
+PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
+  auto *Mod = F.getParent();
+
+  auto UA = &AM.getResult(F);
+  auto *DT = &AM.getResult(F);
+  auto *PDT = &AM.getResult(F);
+
+  AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
+  TV.run(F);
+
+  dbgs() << TV.MessagesStr.str();
+  if (!TV.MessagesStr.str().empty()) {
+    TV.IsValid = false;
+    return PreservedAnalyses::none();
+  }
+  return PreservedAnalyses::all();
+}
+
+/*
+struct AMDGPUTargetVerifierLegacyPass : public FunctionPass {
+  static char ID;
+
+  std::unique_ptr TV;
+  bool FatalErrors = false;
+
+  AMDGPUTargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID),
+    FatalErrors(FatalErrors) {
+    initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
+  }
+
+  bool doInitialization(Module &M) override {
+    TV = std::make_unique(&M);
+    return false;
+  }
+
+  bool runOnFunction(Function &F) override {
+    if (!TV->run(F)) {
+      errs() << "in function " << F.getName() << '\n';
+      if (FatalErrors)
+        report_fatal_error("broken function found, compilation aborted!");
+      else
+        errs() << "broken function found, compilation aborted!\n";
+    }
+    return false;
+  }
+
+  bool doFinalization(Module &M) override {
+    bool IsValid = true;
+    for (Function &F : M)
+      if (F.isDeclaration())
+        IsValid &= TV->run(F);
+
+    if (!IsValid) {
+      if (FatalErrors)
+        report_fatal_error("broken module found, compilation aborted!");
+      else
+        errs() << "broken module found, compilation aborted!\n";
+    }
+    return false;
+  }
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.setPreservesAll();
+  }
+};
+char AMDGPUTargetVerifierLegacyPass::ID = 0;
+FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) {
+  return new AMDGPUTargetVerifierLegacyPass(FatalErrors);
+}*/
 } // namespace llvm
+//INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false)
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 6b57c18ff9316..c63ae2a2c5daf 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -33,94 +33,4 @@
 
 namespace llvm {
 
-bool TargetVerify::run(Function &F) {
-  if (TT.isAMDGPU()) {
-    AMDGPUTargetVerify TV(Mod);
-    TV.run(F);
-
-    dbgs() << TV.MessagesStr.str();
-    if (!TV.MessagesStr.str().empty()) {
-      TV.IsValid = false;
-      return false;
-    }
-    return true;
-  }
-  report_fatal_error("Target has no verification method\n");
-}
-
-PreservedAnalyses TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
-  auto TT = F.getParent()->getTargetTriple();
-
-  if (TT.isAMDGPU()) {
-    auto *Mod = F.getParent();
-
-    auto UA = &AM.getResult(F);
-    auto *DT = &AM.getResult(F);
-    auto *PDT = &AM.getResult(F);
-
-    AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
-    TV.run(F);
-
-    dbgs() << TV.MessagesStr.str();
-    if (!TV.MessagesStr.str().empty()) {
-      TV.IsValid = false;
-      return PreservedAnalyses::none();
-    }
-    return PreservedAnalyses::all();
-  }
-  report_fatal_error("Target has no verification method\n");
-}
-
-struct TargetVerifierLegacyPass : public FunctionPass {
-  static char ID;
-
-  std::unique_ptr TV;
-  bool FatalErrors = false;
-
-  TargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID),
-    FatalErrors(FatalErrors) {
-    initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
-  }
-
-  bool doInitialization(Module &M) override {
-    TV = std::make_unique(&M);
-    return false;
-  }
-
-  bool runOnFunction(Function &F) override {
-    if (!TV->run(F)) {
-      errs() << "in function " << F.getName() << '\n';
-      if (FatalErrors)
-        report_fatal_error("broken function found, compilation aborted!");
-      else
-        errs() << "broken function found, compilation aborted!\n";
-    }
-    return false;
-  }
-
-  bool doFinalization(Module &M) override {
-    bool IsValid = true;
-    for (Function &F : M)
-      if (F.isDeclaration())
-        IsValid &= TV->run(F);
-
-    if (!IsValid) {
-      if (FatalErrors)
-        report_fatal_error("broken module found, compilation aborted!");
-      else
-        errs() << "broken module found, compilation aborted!\n";
-    }
-    return false;
-  }
-
-  void getAnalysisUsage(AnalysisUsage &AU) const override {
-    AU.setPreservesAll();
-  }
-};
-char TargetVerifierLegacyPass::ID = 0;
-FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors) {
-  return new TargetVerifierLegacyPass(FatalErrors);
-}
 } // namespace llvm
-using namespace llvm;
-INITIALIZE_PASS(TargetVerifierLegacyPass, "tgtverifier", "Target Verifier", false, false)
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 329d95826551f..2e9e4837fe467 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -660,8 +660,6 @@ static int compileModule(char **argv, LLVMContext &Context) {
 
   // Build up all of the passes that we want to do to the module.
   legacy::PassManager PM;
-  if (VerifyTarget)
-    PM.add(createTargetVerifierLegacyPass(false));
   PM.add(new TargetLibraryInfoWrapperPass(TLII));
 
   {
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
index b86c2318b45b7..d832dcdff4ad0 100644
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
@@ -158,8 +158,6 @@ int main(int argc, char **argv) {
   if (TT.isAMDGPU())
     FPM.addPass(AMDGPUTargetVerifierPass());
   else if (false) {} // ...
-  else
-    FPM.addPass(TargetVerifierPass());
 
   MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));
   auto PA = MPM.run(*M, MAM);

>From 6401b7517843a03ab114aaf333624ef914d5a5f3 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 11:18:50 -0400
Subject: [PATCH 20/27] Add back to legacy PM.

---
 llvm/include/llvm/CodeGen/Passes.h              | 2 --
 llvm/include/llvm/InitializePasses.h            | 1 -
 llvm/lib/Target/AMDGPU/AMDGPU.h                 | 3 +++
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp  | 1 +
 llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 8 ++++----
 5 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index da6ad3f612aa8..d214ab9306c2f 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -617,8 +617,6 @@ namespace llvm {
 
   /// Lowers KCFI operand bundles for indirect calls.
   FunctionPass *createKCFIPass();
-
-  //FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors);
 } // End llvm namespace
 
 #endif
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index 7d4fad2d87a16..9bef8e496c57e 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -307,7 +307,6 @@ void initializeTailDuplicateLegacyPass(PassRegistry &);
 void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &);
 void initializeTargetPassConfigPass(PassRegistry &);
 void initializeTargetTransformInfoWrapperPassPass(PassRegistry &);
-//void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &);
 void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &);
 void initializeTypeBasedAAWrapperPassPass(PassRegistry &);
 void initializeTypePromotionLegacyPass(PassRegistry &);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index 4ff761ec19b3c..f69956ba44255 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -530,6 +530,9 @@ extern char &GCNRewritePartialRegUsesID;
 void initializeAMDGPUWaitSGPRHazardsLegacyPass(PassRegistry &);
 extern char &AMDGPUWaitSGPRHazardsLegacyID;
 
+FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors);
+void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &);
+
 namespace AMDGPU {
 enum TargetIndex {
   TI_CONSTDATA_START,
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index f1a60b8f33140..42d6764eacda9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1377,6 +1377,7 @@ bool AMDGPUPassConfig::addGCPasses() {
 //===----------------------------------------------------------------------===//
 
 bool GCNPassConfig::addPreISel() {
+  addPass(createAMDGPUTargetVerifierLegacyPass(false));
   AMDGPUPassConfig::addPreISel();
 
   if (TM->getOptLevel() > CodeGenOptLevel::None)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index cedd9ddc78011..c4d303bee6ef8 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -17,6 +17,7 @@
 ////
 ////===----------------------------------------------------------------------===//
 
+#include "AMDGPU.h"
 #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
 
 #include "llvm/Analysis/UniformityAnalysis.h"
@@ -24,7 +25,7 @@
 #include "llvm/Support/Debug.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/Function.h"
-#include "llvm/InitializePasses.h"
+//#include "llvm/InitializePasses.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/IntrinsicsAMDGPU.h"
 #include "llvm/IR/Module.h"
@@ -133,7 +134,6 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan
   return PreservedAnalyses::all();
 }
 
-/*
 struct AMDGPUTargetVerifierLegacyPass : public FunctionPass {
   static char ID;
 
@@ -183,6 +183,6 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass {
 char AMDGPUTargetVerifierLegacyPass::ID = 0;
 FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) {
   return new AMDGPUTargetVerifierLegacyPass(FatalErrors);
-}*/
+}
 } // namespace llvm
-//INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false)
+INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false) >From e2f0225db1439f7d8ee612ee4c4d37a4b44f96b6 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 14:04:10 -0400 Subject: [PATCH 21/27] Remove reference to FAM in registerCallbacks and VerifyEach for TargetVerify in instrumentation --- clang/lib/CodeGen/BackendUtil.cpp | 2 +- flang/lib/Frontend/FrontendActions.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter4/toy.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter5/toy.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter6/toy.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter7/toy.cpp | 2 +- .../llvm/Passes/StandardInstrumentations.h | 6 ++---- llvm/lib/LTO/LTOBackend.cpp | 2 +- llvm/lib/LTO/ThinLTOCodeGenerator.cpp | 2 +- llvm/lib/Passes/PassBuilderBindings.cpp | 2 +- llvm/lib/Passes/StandardInstrumentations.cpp | 21 ++++--------------- llvm/tools/llc/NewPMDriver.cpp | 7 ++++--- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 2 +- llvm/tools/opt/NewPMDriver.cpp | 2 +- llvm/unittests/IR/PassManagerTest.cpp | 6 +++--- 15 files changed, 24 insertions(+), 38 deletions(-) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 9a1c922f5ddef..f7eb853beb23c 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -922,7 +922,7 @@ void EmitAssemblyHelper::RunOptimizationPipeline( TheModule->getContext(), (CodeGenOpts.DebugPassManager || DebugPassStructure), CodeGenOpts.VerifyEach, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); PassBuilder PB(TM.get(), PTO, PGOOpt, &PIC); // Handle the assignment tracking feature options. 
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 7c48e35ff68cf..c1f47b12abee2 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -911,7 +911,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { std::optional pgoOpt; llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); - si.registerCallbacks(pic, &mam, &fam); + si.registerCallbacks(pic, &mam); if (ci.isTimingEnabled()) si.getTimePasses().setOutStream(ci.getTimingStreamLLVM()); pto.LoopUnrolling = opts.UnrollLoops; diff --git a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp index f9664025f61f1..0f58391c50667 100644 --- a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp +++ b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp @@ -577,7 +577,7 @@ static void InitializeModuleAndManagers() { ThePIC = std::make_unique(); TheSI = std::make_unique(*TheContext, /*DebugLogging*/ true); - TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get()); + TheSI->registerCallbacks(*ThePIC, TheMAM.get()); // Add transform passes. // Do simple "peephole" optimizations and bit-twiddling optzns. diff --git a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp index eae06d9f57467..7117eaf4982b0 100644 --- a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp +++ b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp @@ -851,7 +851,7 @@ static void InitializeModuleAndManagers() { ThePIC = std::make_unique(); TheSI = std::make_unique(*TheContext, /*DebugLogging*/ true); - TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get()); + TheSI->registerCallbacks(*ThePIC, TheMAM.get()); // Add transform passes. // Do simple "peephole" optimizations and bit-twiddling optzns. 
diff --git a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp index 30ad79ef2fc58..cb7b6cc8651c1 100644 --- a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp +++ b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp @@ -970,7 +970,7 @@ static void InitializeModuleAndManagers() { ThePIC = std::make_unique(); TheSI = std::make_unique(*TheContext, /*DebugLogging*/ true); - TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get()); + TheSI->registerCallbacks(*ThePIC, TheMAM.get()); // Add transform passes. // Do simple "peephole" optimizations and bit-twiddling optzns. diff --git a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp index 4a39bc33c5591..91b7191a07c6f 100644 --- a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp +++ b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp @@ -1139,7 +1139,7 @@ static void InitializeModuleAndManagers() { ThePIC = std::make_unique(); TheSI = std::make_unique(*TheContext, /*DebugLogging*/ true); - TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get()); + TheSI->registerCallbacks(*ThePIC, TheMAM.get()); // Add transform passes. // Promote allocas to registers. diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h index 988fcb93b2357..65934c93ba614 100644 --- a/llvm/include/llvm/Passes/StandardInstrumentations.h +++ b/llvm/include/llvm/Passes/StandardInstrumentations.h @@ -476,8 +476,7 @@ class VerifyInstrumentation { public: VerifyInstrumentation(bool DebugLogging) : DebugLogging(DebugLogging) {} void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM, - FunctionAnalysisManager *FAM); + ModuleAnalysisManager *MAM); }; /// This class implements --time-trace functionality for new pass manager. @@ -622,8 +621,7 @@ class StandardInstrumentations { // Register all the standard instrumentation callbacks. If \p FAM is nullptr // then PreservedCFGChecker is not enabled. 
void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM, - FunctionAnalysisManager *FAM); + ModuleAnalysisManager *MAM); TimePassesHandler &getTimePasses() { return TimePasses; } }; diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp index 475e7cf45371b..1c764a0188eda 100644 --- a/llvm/lib/LTO/LTOBackend.cpp +++ b/llvm/lib/LTO/LTOBackend.cpp @@ -275,7 +275,7 @@ static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(Mod.getContext(), Conf.DebugPassManager, Conf.VerifyEach); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); PassBuilder PB(TM, Conf.PTO, PGOOpt, &PIC); RegisterPassPlugins(Conf.PassPlugins, PB); diff --git a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp index 369b003df1364..9e7f8187fe49c 100644 --- a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp +++ b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp @@ -245,7 +245,7 @@ static void optimizeModule(Module &TheModule, TargetMachine &TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(TheModule.getContext(), DebugPassManager); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); PipelineTuningOptions PTO; PTO.LoopVectorization = true; PTO.SLPVectorization = true; diff --git a/llvm/lib/Passes/PassBuilderBindings.cpp b/llvm/lib/Passes/PassBuilderBindings.cpp index f0e1abb8cebc4..933fe89e53a94 100644 --- a/llvm/lib/Passes/PassBuilderBindings.cpp +++ b/llvm/lib/Passes/PassBuilderBindings.cpp @@ -76,7 +76,7 @@ static LLVMErrorRef runPasses(Module *Mod, Function *Fun, const char *Passes, PB.crossRegisterProxies(LAM, FAM, CGAM, MAM); StandardInstrumentations SI(Mod->getContext(), Debug, VerifyEach); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); // Run the pipeline. 
if (Fun) { diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index 076df47d5b15d..dc1dd5d9c7f4c 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -45,7 +45,6 @@ #include "llvm/Support/Signals.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Support/xxhash.h" -//#include "llvm/Target/TargetVerifier.h" #include #include #include @@ -62,8 +61,6 @@ static cl::opt VerifyAnalysisInvalidation("verify-analysis-invalidation", #endif ); -static cl::opt VerifyTargetEach("verify-tgt-each"); - // An option that supports the -print-changed option. See // the description for -print-changed for an explanation of the use // of this option. Note that this option has no effect without -print-changed. @@ -1457,10 +1454,9 @@ void PreservedCFGCheckerInstrumentation::registerCallbacks( } void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM, - FunctionAnalysisManager *FAM) { + ModuleAnalysisManager *MAM) { PIC.registerAfterPassCallback( - [this, MAM, FAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { + [this, MAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { if (isIgnored(P) || P == "VerifierPass") return; const auto *F = unwrapIR(IR); @@ -1477,15 +1473,6 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, report_fatal_error(formatv("Broken function found after pass " "\"{0}\", compilation aborted!", P)); - - if (VerifyTargetEach && FAM) { - //TargetVerify TV(const_cast(F->getParent())); - //TV.run(*const_cast(F), *FAM); - /*if (!TV.IsValid) - report_fatal_error(formatv("Broken function found after pass " - "\"{0}\", compilation aborted!", - P));*/ - } } else { const auto *M = unwrapIR(IR); if (!M) { @@ -2525,7 +2512,7 @@ void PrintCrashIRInstrumentation::registerCallbacks( } void StandardInstrumentations::registerCallbacks( - PassInstrumentationCallbacks 
&PIC, ModuleAnalysisManager *MAM, FunctionAnalysisManager *FAM) { + PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM) { PrintIR.registerCallbacks(PIC); PrintPass.registerCallbacks(PIC); TimePasses.registerCallbacks(PIC); @@ -2534,7 +2521,7 @@ void StandardInstrumentations::registerCallbacks( PrintChangedIR.registerCallbacks(PIC); PseudoProbeVerification.registerCallbacks(PIC); if (VerifyEach) - Verify.registerCallbacks(PIC, MAM, FAM); + Verify.registerCallbacks(PIC, MAM); PrintChangedDiff.registerCallbacks(PIC); WebsiteChangeReporter.registerCallbacks(PIC); ChangeTester.registerCallbacks(PIC); diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index 4b95977a10c5f..863a555798dab 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -117,8 +117,6 @@ int llvm::compileModuleWithNewPM( VK == VerifierKind::EachPass); registerCodeGenCallback(PIC, *Target); - ModulePassManager MPM; - FunctionPassManager FPM; MachineFunctionAnalysisManager MFAM; LoopAnalysisManager LAM; FunctionAnalysisManager FAM; @@ -133,11 +131,14 @@ int llvm::compileModuleWithNewPM( if (VerifyTarget) PB.registerVerifierPasses(MPM, FPM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); MAM.registerPass([&] { return MachineModuleAnalysis(MMI); }); + ModulePassManager MPM; + FunctionPassManager FPM; + if (!PassPipeline.empty()) { // Construct a custom pass pipeline that starts after instruction // selection. 
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index d832dcdff4ad0..50f4e56bb6af6 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -148,7 +148,7 @@ int main(int argc, char **argv) { PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); Triple TT(M->getTargetTriple()); if (!NoLint) diff --git a/llvm/tools/opt/NewPMDriver.cpp b/llvm/tools/opt/NewPMDriver.cpp index a8977d80bdf44..7d168a6ceb17c 100644 --- a/llvm/tools/opt/NewPMDriver.cpp +++ b/llvm/tools/opt/NewPMDriver.cpp @@ -423,7 +423,7 @@ bool llvm::runPassPipeline( PrintPassOpts.SkipAnalyses = DebugPM == DebugLogging::Quiet; StandardInstrumentations SI(M.getContext(), DebugPM != DebugLogging::None, VK == VerifierKind::EachPass, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); DebugifyEachInstrumentation Debugify; DebugifyStatsMap DIStatsMap; DebugInfoPerPass DebugInfoBeforePass; diff --git a/llvm/unittests/IR/PassManagerTest.cpp b/llvm/unittests/IR/PassManagerTest.cpp index bb4db6120035f..a6487169224c2 100644 --- a/llvm/unittests/IR/PassManagerTest.cpp +++ b/llvm/unittests/IR/PassManagerTest.cpp @@ -828,7 +828,7 @@ TEST_F(PassManagerTest, FunctionPassCFGChecker) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -877,7 +877,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerInvalidateAnalysis) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; 
StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -945,7 +945,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerWrapped) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); >From b43cec12bbfc6071d4a99e75aad4273bab4e3182 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 14:44:28 -0400 Subject: [PATCH 22/27] Remove references to registry --- llvm/include/llvm/Passes/PassBuilder.h | 21 ------------------- .../llvm/Passes/StandardInstrumentations.h | 2 +- .../llvm/Passes/TargetPassRegistry.inc | 12 ----------- llvm/lib/Passes/PassBuilder.cpp | 7 ------- llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 11 ---------- llvm/tools/llc/NewPMDriver.cpp | 5 ----- llvm/tools/llc/llc.cpp | 2 -- 7 files changed, 1 insertion(+), 59 deletions(-) diff --git a/llvm/include/llvm/Passes/PassBuilder.h b/llvm/include/llvm/Passes/PassBuilder.h index 6000769ce723b..51ccaa53447d7 100644 --- a/llvm/include/llvm/Passes/PassBuilder.h +++ b/llvm/include/llvm/Passes/PassBuilder.h @@ -172,13 +172,6 @@ class PassBuilder { /// additional analyses. void registerLoopAnalyses(LoopAnalysisManager &LAM); - /// Registers all available verifier passes. - /// - /// This is an interface that can be used to populate a - /// \c ModuleAnalysisManager with all registered loop analyses. 
Callers can - /// still manually register any additional analyses. - void registerVerifierPasses(ModulePassManager &PM, FunctionPassManager &); - /// Registers all available machine function analysis passes. /// /// This is an interface that can be used to populate a \c @@ -577,15 +570,6 @@ class PassBuilder { } /// @}} - /// Register a callback for parsing an Verifier Name to populate - /// the given managers. - void registerVerifierCallback( - const std::function &C, - const std::function &CF) { - VerifierCallbacks.push_back(C); - FnVerifierCallbacks.push_back(CF); - } - /// {{@ Register pipeline parsing callbacks with this pass builder instance. /// Using these callbacks, callers can parse both a single pass name, as well /// as entire sub-pipelines, and populate the PassManager instance @@ -857,11 +841,6 @@ class PassBuilder { // Callbacks to parse `filter` parameter in register allocation passes SmallVector, 2> RegClassFilterParsingCallbacks; - // Verifier callbacks - SmallVector, 2> - VerifierCallbacks; - SmallVector, 2> - FnVerifierCallbacks; }; /// This utility template takes care of adding require<> and invalidate<> diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h index 65934c93ba614..f7a65a88ecf5b 100644 --- a/llvm/include/llvm/Passes/StandardInstrumentations.h +++ b/llvm/include/llvm/Passes/StandardInstrumentations.h @@ -621,7 +621,7 @@ class StandardInstrumentations { // Register all the standard instrumentation callbacks. If \p FAM is nullptr // then PreservedCFGChecker is not enabled. 
void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM); + ModuleAnalysisManager *MAM = nullptr); TimePassesHandler &getTimePasses() { return TimePasses; } }; diff --git a/llvm/include/llvm/Passes/TargetPassRegistry.inc b/llvm/include/llvm/Passes/TargetPassRegistry.inc index 2d04b874cf360..521913cb25a4a 100644 --- a/llvm/include/llvm/Passes/TargetPassRegistry.inc +++ b/llvm/include/llvm/Passes/TargetPassRegistry.inc @@ -151,18 +151,6 @@ PB.registerPipelineParsingCallback([=](StringRef Name, FunctionPassManager &PM, return false; }); -PB.registerVerifierCallback([](ModulePassManager &PM) { -#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) PM.addPass(CREATE_PASS) -#include GET_PASS_REGISTRY -#undef VERIFIER_MODULE_ANALYSIS - return false; -}, [](FunctionPassManager &FPM) { -#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) FPM.addPass(CREATE_PASS) -#include GET_PASS_REGISTRY -#undef VERIFIER_FUNCTION_ANALYSIS - return false; -}); - #undef ADD_PASS #undef ADD_PASS_WITH_PARAMS diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index e942fed8b6a72..e7057d9a6b625 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -582,13 +582,6 @@ void PassBuilder::registerLoopAnalyses(LoopAnalysisManager &LAM) { C(LAM); } -void PassBuilder::registerVerifierPasses(ModulePassManager &MPM, FunctionPassManager &FPM) { - for (auto &C : VerifierCallbacks) - C(MPM); - for (auto &C : FnVerifierCallbacks) - C(FPM); -} - static std::optional> parseFunctionPipelineName(StringRef Name) { std::pair Params; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def index 41e6a399c7239..98a1147ef6d66 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def +++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def @@ -81,17 +81,6 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA()) #undef FUNCTION_ALIAS_ANALYSIS #undef FUNCTION_ANALYSIS -#ifndef 
VERIFIER_MODULE_ANALYSIS -#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) -#endif -#ifndef VERIFIER_FUNCTION_ANALYSIS -#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) -#endif -VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass()) -VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass()) -#undef VERIFIER_MODULE_ANALYSIS -#undef VERIFIER_FUNCTION_ANALYSIS - #ifndef FUNCTION_PASS_WITH_PARAMS #define FUNCTION_PASS_WITH_PARAMS(NAME, CLASS, CREATE_PASS, PARSER, PARAMS) #endif diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index 863a555798dab..fa82689ecf9ae 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -57,9 +57,6 @@ static cl::opt DebugPM("debug-pass-manager", cl::Hidden, cl::desc("Print pass management debugging information")); -static cl::opt VerifyTarget("verify-tgt-new-pm", - cl::desc("Verify the target")); - bool LLCDiagnosticHandler::handleDiagnostics(const DiagnosticInfo &DI) { DiagnosticHandler::handleDiagnostics(DI); if (DI.getKind() == llvm::DK_SrcMgr) { @@ -128,8 +125,6 @@ int llvm::compileModuleWithNewPM( PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); PB.registerMachineFunctionAnalyses(MFAM); - if (VerifyTarget) - PB.registerVerifierPasses(MPM, FPM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); SI.registerCallbacks(PIC, &MAM); diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp index 2e9e4837fe467..140459ba2de21 100644 --- a/llvm/tools/llc/llc.cpp +++ b/llvm/tools/llc/llc.cpp @@ -209,8 +209,6 @@ static cl::opt PassPipeline( static cl::alias PassPipeline2("p", cl::aliasopt(PassPipeline), cl::desc("Alias for -passes")); -static cl::opt VerifyTarget("verify-tgt", cl::desc("Verify the target")); - namespace { std::vector &getRunPassNames() { >From b583b3f804758f6b8ca686bf66d59d744fffbe8e Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 19:05:53 -0400 Subject: [PATCH 23/27] Remove int check --- 
 .../lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 18 ------------------
 1 file changed, 18 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index c4d303bee6ef8..2ca0bbeb57653 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -25,7 +25,6 @@
 #include "llvm/Support/Debug.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/Function.h"
-//#include "llvm/InitializePasses.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/IntrinsicsAMDGPU.h"
 #include "llvm/IR/Module.h"
@@ -46,15 +45,6 @@ using namespace llvm;
 
 namespace llvm {
 
-static bool IsValidInt(const Type *Ty) {
-  return Ty->isIntegerTy(1) ||
-         Ty->isIntegerTy(8) ||
-         Ty->isIntegerTy(16) ||
-         Ty->isIntegerTy(32) ||
-         Ty->isIntegerTy(64) ||
-         Ty->isIntegerTy(128);
-}
-
 static bool isShader(CallingConv::ID CC) {
   switch(CC) {
     case CallingConv::AMDGPU_VS:
@@ -81,14 +71,6 @@ bool AMDGPUTargetVerify::run(Function &F) {
 
     for (auto &I : BB) {
 
-      // Ensure integral types are valid: i8, i16, i32, i64, i128
-      if (I.getType()->isIntegerTy())
-        Check(IsValidInt(I.getType()), "Int type is invalid.", &I);
-      for (unsigned i = 0; i < I.getNumOperands(); ++i)
-        if (I.getOperand(i)->getType()->isIntegerTy())
-          Check(IsValidInt(I.getOperand(i)->getType()),
-                "Int type is invalid.", I.getOperand(i));
-
       if (auto *CI = dyn_cast(&I)) {
         // Ensure no kernel to kernel calls.
 
>From 2ba9f5d85326b80bd502116a95353d7e9ad4c9bb Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 21:56:28 -0400
Subject: [PATCH 24/27] Remove modifications to Lint/Verifier.

---
 llvm/lib/Analysis/Lint.cpp |  4 +---
 llvm/lib/IR/Verifier.cpp   | 20 ++++----------------
 2 files changed, 5 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Analysis/Lint.cpp b/llvm/lib/Analysis/Lint.cpp
index c8e38963e5974..f05e36e2025d4 100644
--- a/llvm/lib/Analysis/Lint.cpp
+++ b/llvm/lib/Analysis/Lint.cpp
@@ -742,11 +742,9 @@ PreservedAnalyses LintPass::run(Function &F, FunctionAnalysisManager &AM) {
   Lint L(Mod, DL, AA, AC, DT, TLI);
   L.visit(F);
   dbgs() << L.MessagesStr.str();
-  if (AbortOnError && !L.MessagesStr.str().empty()) {
+  if (AbortOnError && !L.MessagesStr.str().empty())
     report_fatal_error(
         "linter found errors, aborting. (enabled by abort-on-error)", false);
-    return PreservedAnalyses::none();
-  }
   return PreservedAnalyses::all();
 }
 
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 51f6dec53b70f..8afe360d088bc 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -135,10 +135,6 @@ static cl::opt<bool> VerifyNoAliasScopeDomination(
     cl::desc("Ensure that llvm.experimental.noalias.scope.decl for identical "
              "scopes are not dominating"));
 
-static cl::opt<bool>
-    VerifyAbortOnError("verifier-abort-on-error", cl::init(false),
-                       cl::desc("In the Verifier pass, abort on errors."));
-
 namespace llvm {
 
 struct VerifierSupport {
@@ -7800,24 +7796,16 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F,
 
 PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) {
   auto Res = AM.getResult<VerifierAnalysis>(M);
-  if (Res.IRBroken || Res.DebugInfoBroken) {
-    //M.IsValid = false;
-    if (VerifyAbortOnError && FatalErrors)
-      report_fatal_error("Broken module found, compilation aborted!");
-    return PreservedAnalyses::none();
-  }
+  if (FatalErrors && (Res.IRBroken || Res.DebugInfoBroken))
+    report_fatal_error("Broken module found, compilation aborted!");
 
   return PreservedAnalyses::all();
 }
 
 PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
   auto res = AM.getResult<VerifierAnalysis>(F);
-  if (res.IRBroken) {
-    //F.getParent()->IsValid = false;
-    if (VerifyAbortOnError && FatalErrors)
-      report_fatal_error("Broken function found, compilation aborted!");
-    return PreservedAnalyses::none();
-  }
+  if (res.IRBroken && FatalErrors)
+    report_fatal_error("Broken function found, compilation aborted!");
 
   return PreservedAnalyses::all();
 }

>From 0c572440b11b571d0431c2c0bfd83132126e096f Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 22:21:47 -0400
Subject: [PATCH 25/27] Remove llvm-tgt-verify tool.

---
 llvm/test/CMakeLists.txt                      |   1 -
 llvm/test/CodeGen/AMDGPU/tgt-verify.ll        |  45 -----
 llvm/test/lit.cfg.py                          |   1 -
 llvm/tools/llvm-tgt-verify/CMakeLists.txt     |  34 ----
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 171 ------------------
 llvm/utils/gn/secondary/llvm/test/BUILD.gn    |   1 -
 .../llvm/tools/llvm-tgt-verify/BUILD.gn       |  25 ---
 7 files changed, 278 deletions(-)
 delete mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify.ll
 delete mode 100644 llvm/tools/llvm-tgt-verify/CMakeLists.txt
 delete mode 100644 llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
 delete mode 100644 llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn

diff --git a/llvm/test/CMakeLists.txt b/llvm/test/CMakeLists.txt
index 10ca9300e7c66..66849002eb470 100644
--- a/llvm/test/CMakeLists.txt
+++ b/llvm/test/CMakeLists.txt
@@ -135,7 +135,6 @@ set(LLVM_TEST_DEPENDS
           llvm-strip
           llvm-symbolizer
           llvm-tblgen
-          llvm-tgt-verify
           llvm-readtapi
           llvm-tli-checker
           llvm-undname
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
deleted file mode 100644
index 62b220d7d9f49..0000000000000
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
+++ /dev/null
@@ -1,45 +0,0 @@
-; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s
-
-define amdgpu_cs i32 @shader() {
-; CHECK: Shaders must return void
-  ret i32 0
-}
-
-define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) {
-; CHECK: Undefined behavior: Write to memory in const addrspace
-; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4
-  %r = add i32 %a, %b
-  store i32 %r, ptr addrspace(4) %out
-  ret void
-}
-
-define amdgpu_kernel void @kernel_callee(ptr %x) {
-  ret void
-}
-
-define amdgpu_kernel void @kernel_caller(ptr %x) {
-; CHECK: A kernel may not call a kernel
-; CHECK-NEXT: ptr @kernel_caller
-  call amdgpu_kernel void @kernel_callee(ptr %x)
-  ret void
-}
-
-
-; Function Attrs: nounwind
-define i65 @invalid_type(i65 %x) #0 {
-; CHECK: Int type is invalid.
-; CHECK-NEXT: %tmp2 = ashr i65 %x, 64
-entry:
-  %tmp2 = ashr i65 %x, 64
-  ret i65 %tmp2
-}
-
-declare void @llvm.amdgcn.cs.chain.v3i32(ptr, i32, <3 x i32>, <3 x i32>, i32, ...)
-declare amdgpu_cs_chain void @chain_callee(<3 x i32> inreg, <3 x i32>)
-
-define amdgpu_cs void @no_unreachable(<3 x i32> inreg %a, <3 x i32> %b) {
-; CHECK: llvm.amdgcn.cs.chain must be followed by unreachable
-; CHECK-NEXT: call void (ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.p0.i32.v3i32.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0)
-  call void(ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0)
-  ret void
-}
diff --git a/llvm/test/lit.cfg.py b/llvm/test/lit.cfg.py
index 8620f2a7014b5..aad7a088551b2 100644
--- a/llvm/test/lit.cfg.py
+++ b/llvm/test/lit.cfg.py
@@ -227,7 +227,6 @@ def get_asan_rtlib():
         "llvm-strings",
         "llvm-strip",
         "llvm-tblgen",
-        "llvm-tgt-verify",
         "llvm-readtapi",
         "llvm-undname",
         "llvm-windres",
diff --git a/llvm/tools/llvm-tgt-verify/CMakeLists.txt b/llvm/tools/llvm-tgt-verify/CMakeLists.txt
deleted file mode 100644
index fe47c85e6cdce..0000000000000
--- a/llvm/tools/llvm-tgt-verify/CMakeLists.txt
+++ /dev/null
@@ -1,34 +0,0 @@
-set(LLVM_LINK_COMPONENTS
-  AllTargetsAsmParsers
-  AllTargetsCodeGens
-  AllTargetsDescs
-  AllTargetsInfos
-  Analysis
-  AsmPrinter
-  CodeGen
-  CodeGenTypes
-  Core
-  IRPrinter
-  IRReader
-  MC
-  MIRParser
-  Passes
-  Remarks
-  ScalarOpts
-  SelectionDAG
-  Support
-  Target
-  TargetParser
-  TransformUtils
-  Vectorize
-  )
-
-add_llvm_tool(llvm-tgt-verify
-  llvm-tgt-verify.cpp
-
-  DEPENDS
-  intrinsics_gen
-  SUPPORT_PLUGINS
-  )
-
-export_executable_symbols_for_plugins(llc)
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
deleted file mode 100644
index 50f4e56bb6af6..0000000000000
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ /dev/null
@@ -1,171 +0,0 @@
-//===--- llvm-tgt-verify.cpp - Target Verifier ----------------- ----------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-//
-// Tool to verify a target.
-// -//===----------------------------------------------------------------------===// - -#include "llvm/InitializePasses.h" -#include "llvm/ADT/StringRef.h" -#include "llvm/Analysis/Lint.h" -#include "llvm/Analysis/TargetLibraryInfo.h" -#include "llvm/Bitcode/BitcodeReader.h" -#include "llvm/Bitcode/BitcodeWriter.h" -#include "llvm/CodeGen/CommandFlags.h" -#include "llvm/CodeGen/TargetPassConfig.h" -#include "llvm/IR/Constants.h" -#include "llvm/IR/LLVMContext.h" -#include "llvm/IR/LegacyPassManager.h" -#include "llvm/IR/Module.h" -#include "llvm/IR/Verifier.h" -#include "llvm/IRReader/IRReader.h" -#include "llvm/Passes/PassBuilder.h" -#include "llvm/Passes/StandardInstrumentations.h" -#include "llvm/MC/TargetRegistry.h" -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/DataTypes.h" -#include "llvm/Support/Debug.h" -#include "llvm/Support/InitLLVM.h" -#include "llvm/Support/SourceMgr.h" -#include "llvm/Support/TargetSelect.h" -#include "llvm/Target/TargetMachine.h" -#include "llvm/Target/TargetVerifier.h" - -#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" - -#define DEBUG_TYPE "isel-fuzzer" - -using namespace llvm; - -static codegen::RegisterCodeGenFlags CGF; - -static cl::opt -InputFilename(cl::Positional, cl::desc(""), cl::init("-")); - -static cl::opt - StacktraceAbort("stacktrace-abort", - cl::desc("Turn on stacktrace"), cl::init(false)); - -static cl::opt - NoLint("no-lint", - cl::desc("Turn off Lint"), cl::init(false)); - -static cl::opt - NoVerify("no-verifier", - cl::desc("Turn off Verifier"), cl::init(false)); - -static cl::opt - OptLevel("O", - cl::desc("Optimization level. 
[-O0, -O1, -O2, or -O3] " - "(default = '-O2')"), - cl::Prefix, cl::init('2')); - -static cl::opt - TargetTriple("mtriple", cl::desc("Override target triple for module")); - -static std::unique_ptr TM; - -static void handleLLVMFatalError(void *, const char *Message, bool) { - if (StacktraceAbort) { - dbgs() << "LLVM ERROR: " << Message << "\n" - << "Aborting.\n"; - abort(); - } -} - -int main(int argc, char **argv) { - StringRef ExecName = argv[0]; - InitLLVM X(argc, argv); - - InitializeAllTargets(); - InitializeAllTargetMCs(); - InitializeAllAsmPrinters(); - InitializeAllAsmParsers(); - - PassRegistry *Registry = PassRegistry::getPassRegistry(); - initializeCore(*Registry); - initializeCodeGen(*Registry); - initializeAnalysis(*Registry); - initializeTarget(*Registry); - - cl::ParseCommandLineOptions(argc, argv); - - if (TargetTriple.empty()) { - errs() << ExecName << ": -mtriple must be specified\n"; - exit(1); - } - - CodeGenOptLevel OLvl; - if (auto Level = CodeGenOpt::parseLevel(OptLevel)) { - OLvl = *Level; - } else { - errs() << ExecName << ": invalid optimization level.\n"; - return 1; - } - ExitOnError ExitOnErr(std::string(ExecName) + ": error:"); - TM = ExitOnErr(codegen::createTargetMachineForTriple( - Triple::normalize(TargetTriple), OLvl)); - assert(TM && "Could not allocate target machine!"); - - // Make sure we print the summary and the current unit when LLVM errors out. 
- install_fatal_error_handler(handleLLVMFatalError, nullptr); - - LLVMContext Context; - SMDiagnostic Err; - std::unique_ptr M = parseIRFile(InputFilename, Err, Context); - if (!M) { - errs() << "Invalid mod\n"; - return 1; - } - auto S = Triple::normalize(TargetTriple); - M->setTargetTriple(Triple(S)); - - PassInstrumentationCallbacks PIC; - StandardInstrumentations SI(Context, false/*debug PM*/, - false); - registerCodeGenCallback(PIC, *TM); - - ModulePassManager MPM; - FunctionPassManager FPM; - //TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple())); - - MachineFunctionAnalysisManager MFAM; - LoopAnalysisManager LAM; - FunctionAnalysisManager FAM; - CGSCCAnalysisManager CGAM; - ModuleAnalysisManager MAM; - PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC); - PB.registerModuleAnalyses(MAM); - //PB.registerVerifierPasses(MPM, FPM); - PB.registerCGSCCAnalyses(CGAM); - PB.registerFunctionAnalyses(FAM); - PB.registerLoopAnalyses(LAM); - PB.registerMachineFunctionAnalyses(MFAM); - PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - - SI.registerCallbacks(PIC, &MAM); - - Triple TT(M->getTargetTriple()); - if (!NoLint) - FPM.addPass(LintPass(false)); - if (!NoVerify) - MPM.addPass(VerifierPass()); - if (TT.isAMDGPU()) - FPM.addPass(AMDGPUTargetVerifierPass()); - else if (false) {} // ... 
- MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); - - auto PA = MPM.run(*M, MAM); - { - auto PAC = PA.getChecker(); - if (!PAC.preserved()) - return 1; - } - - return 0; -} diff --git a/llvm/utils/gn/secondary/llvm/test/BUILD.gn b/llvm/utils/gn/secondary/llvm/test/BUILD.gn index 157e7991c52a8..228642667b41d 100644 --- a/llvm/utils/gn/secondary/llvm/test/BUILD.gn +++ b/llvm/utils/gn/secondary/llvm/test/BUILD.gn @@ -319,7 +319,6 @@ group("test") { "//llvm/tools/llvm-strings", "//llvm/tools/llvm-symbolizer:symlinks", "//llvm/tools/llvm-tli-checker", - "//llvm/tools/llvm-tgt-verify", "//llvm/tools/llvm-undname", "//llvm/tools/llvm-xray", "//llvm/tools/lto", diff --git a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn deleted file mode 100644 index b751bafc5052c..0000000000000 --- a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn +++ /dev/null @@ -1,25 +0,0 @@ -import("//llvm/utils/TableGen/tablegen.gni") - -tgtverifier("llvm-tgt-verify") { - deps = [ - "//llvm/lib/Analysis", - "//llvm/lib/AsmPrinter", - "//llvm/lib/CodeGen", - "//llvm/lib/CodeGenTypes", - "//llvm/lib/Core", - "//llvm/lib/IRPrinter", - "//llvm/lib/IRReader", - "//llvm/lib/MC", - "//llvm/lib/MIRParser", - "//llvm/lib/Passes", - "//llvm/lib/Remarks", - "//llvm/lib/ScalarOpts", - "//llvm/lib/SelectionDAG", - "//llvm/lib/Support", - "//llvm/lib/Target", - "//llvm/lib/TargetParser", - "//llvm/lib/TransformUtils", - "//llvm/lib/Vectorize", - ] - sources = [ "llvm-tgt-verify.cpp" ] -} >From 94c24ebf4fc1c67872d5d2effa8016b5b04b71a5 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 22:23:15 -0400 Subject: [PATCH 26/27] Remove TargetVerifier.cpp --- llvm/lib/Passes/CMakeLists.txt | 1 - llvm/lib/Target/CMakeLists.txt | 1 - llvm/lib/Target/TargetVerifier.cpp | 36 ------------------------------ 3 files changed, 38 deletions(-) delete mode 100644 llvm/lib/Target/TargetVerifier.cpp diff --git 
a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt index 9c348cb89a8c5..6425f4934b210 100644 --- a/llvm/lib/Passes/CMakeLists.txt +++ b/llvm/lib/Passes/CMakeLists.txt @@ -29,7 +29,6 @@ add_llvm_component_library(LLVMPasses Scalar Support Target - #TargetParser TransformUtils Vectorize Instrumentation diff --git a/llvm/lib/Target/CMakeLists.txt b/llvm/lib/Target/CMakeLists.txt index f2a5d545ce84f..e354fd484a7a9 100644 --- a/llvm/lib/Target/CMakeLists.txt +++ b/llvm/lib/Target/CMakeLists.txt @@ -7,7 +7,6 @@ add_llvm_component_library(LLVMTarget TargetLoweringObjectFile.cpp TargetMachine.cpp TargetMachineC.cpp - TargetVerifier.cpp AMDGPU/AMDGPUTargetVerifier.cpp ADDITIONAL_HEADER_DIRS diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp deleted file mode 100644 index c63ae2a2c5daf..0000000000000 --- a/llvm/lib/Target/TargetVerifier.cpp +++ /dev/null @@ -1,36 +0,0 @@ -//===-- TargetVerifier.cpp - LLVM IR Target Verifier ----------------*- C++ -*-===// -//// -///// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -///// See https://llvm.org/LICENSE.txt for license information. -///// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -///// -/////===----------------------------------------------------------------------===// -///// -///// This file defines target verifier interfaces that can be used for some -///// validation of input to the system, and for checking that transformations -///// haven't done something bad. In contrast to the Verifier or Lint, the -///// TargetVerifier looks for constructions invalid to a particular target -///// machine. -///// -///// To see what specifically is checked, look at TargetVerifier.cpp or an -///// individual backend's TargetVerifier. 
-///// -/////===----------------------------------------------------------------------===// - -#include "llvm/Target/TargetVerifier.h" -#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" - -#include "llvm/InitializePasses.h" -#include "llvm/Analysis/UniformityAnalysis.h" -#include "llvm/Analysis/PostDominators.h" -#include "llvm/Support/Debug.h" -#include "llvm/IR/Dominators.h" -#include "llvm/IR/Function.h" -#include "llvm/IR/IntrinsicInst.h" -#include "llvm/IR/IntrinsicsAMDGPU.h" -#include "llvm/IR/Module.h" -#include "llvm/IR/Value.h" - -namespace llvm { - -} // namespace llvm >From 7a06a11fe08abd2b6bac4eb46b498e640cc6b78e Mon Sep 17 00:00:00 2001 From: jofernau Date: Thu, 1 May 2025 03:02:30 -0400 Subject: [PATCH 27/27] clang-format --- llvm/include/llvm/Target/TargetVerifier.h | 10 +- .../TargetVerify/AMDGPUTargetVerifier.h | 44 ++++----- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 97 ++++++++++--------- 3 files changed, 77 insertions(+), 74 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index 1d12eb55bbf0a..3f8c710a88768 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -1,4 +1,4 @@ -//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier ---*- C++ -*-===// +//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier --*- C++ -*-===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -20,8 +20,8 @@ #ifndef LLVM_TARGET_VERIFIER_H #define LLVM_TARGET_VERIFIER_H -#include "llvm/IR/PassManager.h" #include "llvm/IR/Module.h" +#include "llvm/IR/PassManager.h" #include "llvm/TargetParser/Triple.h" namespace llvm { @@ -59,10 +59,11 @@ class TargetVerify { /// This calls the Message-only version so that the above is easier to set /// a breakpoint on. template - void CheckFailed(const Twine &Message, const T1 &V1, const Ts &... 
Vs) { + void CheckFailed(const Twine &Message, const T1 &V1, const Ts &...Vs) { CheckFailed(Message); WriteValues({V1, Vs...}); } + public: Module *Mod; Triple TT; @@ -73,8 +74,7 @@ class TargetVerify { bool IsValid = true; TargetVerify(Module *Mod) - : Mod(Mod), TT(Mod->getTargetTriple()), - MessagesStr(Messages) {} + : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} virtual bool run(Function &F) = 0; }; diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index 49bcbc8849e3c..8bcbc3ae77483 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -1,29 +1,29 @@ -//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU ---*- C++ -*-===// -//// -//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -//// See https://llvm.org/LICENSE.txt for license information. -//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -//// -////===----------------------------------------------------------------------===// -//// -//// This file defines target verifier interfaces that can be used for some -//// validation of input to the system, and for checking that transformations -//// haven't done something bad. In contrast to the Verifier or Lint, the -//// TargetVerifier looks for constructions invalid to a particular target -//// machine. -//// -//// To see what specifically is checked, look at an individual backend's -//// TargetVerifier. -//// -////===----------------------------------------------------------------------===// +//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU -- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines target verifier interfaces that can be used for some +// validation of input to the system, and for checking that transformations +// haven't done something bad. In contrast to the Verifier or Lint, the +// TargetVerifier looks for constructions invalid to a particular target +// machine. +// +// To see what specifically is checked, look at an individual backend's +// TargetVerifier. +// +//===----------------------------------------------------------------------===// #ifndef LLVM_AMDGPU_TARGET_VERIFIER_H #define LLVM_AMDGPU_TARGET_VERIFIER_H #include "llvm/Target/TargetVerifier.h" -#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/Analysis/PostDominators.h" +#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/IR/Dominators.h" namespace llvm { @@ -43,10 +43,10 @@ class AMDGPUTargetVerify : public TargetVerify { PostDominatorTree *PDT = nullptr; UniformityInfo *UA = nullptr; - AMDGPUTargetVerify(Module *Mod) - : TargetVerify(Mod), Mod(Mod) {} + AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod), Mod(Mod) {} - AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, + UniformityInfo *UA) : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} bool run(Function &F) override; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 2ca0bbeb57653..1b0653f915bd7 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -1,34 +1,34 @@ -//===-- AMDGPUTargetVerifier.cpp - AMDGPU -------------------------*- C++ -*-===// -//// -//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 
-//// See https://llvm.org/LICENSE.txt for license information. -//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -//// -////===----------------------------------------------------------------------===// -//// -//// This file defines target verifier interfaces that can be used for some -//// validation of input to the system, and for checking that transformations -//// haven't done something bad. In contrast to the Verifier or Lint, the -//// TargetVerifier looks for constructions invalid to a particular target -//// machine. -//// -//// To see what specifically is checked, look at an individual backend's -//// TargetVerifier. -//// -////===----------------------------------------------------------------------===// +//===-- AMDGPUTargetVerifier.cpp - AMDGPU -----------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines target verifier interfaces that can be used for some +// validation of input to the system, and for checking that transformations +// haven't done something bad. In contrast to the Verifier or Lint, the +// TargetVerifier looks for constructions invalid to a particular target +// machine. +// +// To see what specifically is checked, look at an individual backend's +// TargetVerifier. 
+// +//===----------------------------------------------------------------------===// -#include "AMDGPU.h" #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" +#include "AMDGPU.h" -#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/Analysis/PostDominators.h" -#include "llvm/Support/Debug.h" +#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/IR/Dominators.h" #include "llvm/IR/Function.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/IntrinsicsAMDGPU.h" #include "llvm/IR/Module.h" #include "llvm/IR/Value.h" +#include "llvm/Support/Debug.h" #include "llvm/Support/raw_ostream.h" @@ -39,7 +39,7 @@ using namespace llvm; do { \ if (!(C)) { \ TargetVerify::CheckFailed(__VA_ARGS__); \ - return false; \ + return false; \ } \ } while (false) @@ -47,45 +47,45 @@ namespace llvm { static bool isShader(CallingConv::ID CC) { switch(CC) { - case CallingConv::AMDGPU_VS: - case CallingConv::AMDGPU_LS: - case CallingConv::AMDGPU_HS: - case CallingConv::AMDGPU_ES: - case CallingConv::AMDGPU_GS: - case CallingConv::AMDGPU_PS: - case CallingConv::AMDGPU_CS_Chain: - case CallingConv::AMDGPU_CS_ChainPreserve: - case CallingConv::AMDGPU_CS: - return true; - default: - return false; + case CallingConv::AMDGPU_VS: + case CallingConv::AMDGPU_LS: + case CallingConv::AMDGPU_HS: + case CallingConv::AMDGPU_ES: + case CallingConv::AMDGPU_GS: + case CallingConv::AMDGPU_PS: + case CallingConv::AMDGPU_CS_Chain: + case CallingConv::AMDGPU_CS_ChainPreserve: + case CallingConv::AMDGPU_CS: + return true; + default: + return false; } } bool AMDGPUTargetVerify::run(Function &F) { // Ensure shader calling convention returns void if (isShader(F.getCallingConv())) - Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void"); + Check(F.getReturnType() == Type::getVoidTy(F.getContext()), + "Shaders must return void"); for (auto &BB : F) { for (auto &I : BB) { - if (auto *CI = dyn_cast(&I)) - { + if (auto *CI = dyn_cast(&I)) { // Ensure no kernel to 
kernel calls. CallingConv::ID CalleeCC = CI->getCallingConv(); - if (CalleeCC == CallingConv::AMDGPU_KERNEL) - { - CallingConv::ID CallerCC = CI->getParent()->getParent()->getCallingConv(); + if (CalleeCC == CallingConv::AMDGPU_KERNEL) { + CallingConv::ID CallerCC = + CI->getParent()->getParent()->getCallingConv(); Check(CallerCC != CallingConv::AMDGPU_KERNEL, - "A kernel may not call a kernel", CI->getParent()->getParent()); + "A kernel may not call a kernel", CI->getParent()->getParent()); } // Ensure chain intrinsics are followed by unreachables. if (CI->getIntrinsicID() == Intrinsic::amdgcn_cs_chain) Check(isa_and_present(CI->getNextNode()), - "llvm.amdgcn.cs.chain must be followed by unreachable", CI); + "llvm.amdgcn.cs.chain must be followed by unreachable", CI); } } } @@ -98,7 +98,8 @@ bool AMDGPUTargetVerify::run(Function &F) { return true; } -PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { +PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, + FunctionAnalysisManager &AM) { auto *Mod = F.getParent(); auto UA = &AM.getResult(F); @@ -122,9 +123,10 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { std::unique_ptr TV; bool FatalErrors = false; - AMDGPUTargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID), - FatalErrors(FatalErrors) { - initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + AMDGPUTargetVerifierLegacyPass(bool FatalErrors) + : FunctionPass(ID), FatalErrors(FatalErrors) { + initializeAMDGPUTargetVerifierLegacyPassPass( + *PassRegistry::getPassRegistry()); } bool doInitialization(Module &M) override { @@ -167,4 +169,5 @@ FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) { return new AMDGPUTargetVerifierLegacyPass(FatalErrors); } } // namespace llvm -INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false) +INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", 
+                "AMDGPU Target Verifier", false, false)

From flang-commits at lists.llvm.org Thu May 1 01:11:49 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 01 May 2025 01:11:49 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)
In-Reply-To:
Message-ID: <68132cc5.170a0220.158f8b.0615@mx.google.com>

https://github.com/jofrn updated https://github.com/llvm/llvm-project/pull/123609

>From 210b6d80bcfbbcd216f98199df386280724561e2 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Mon, 20 Jan 2025 04:51:26 -0800
Subject: [PATCH 01/28] [TargetVerifier][AMDGPU] Add TargetVerifier.

This pass verifies the IR for an individual backend. This is different
from Lint because it consolidates all checks for a given backend in a
single pass. A check for Lint may be undefined behavior across all
targets, whereas a check in TargetVerifier would only pertain to the
specified target but can check more than just undefined behavior, such
as IR validity. A use case of this would be to reject programs with
invalid IR while fuzzing.
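As an illustration of the idea described in the commit message, the following is a minimal sketch of a target-specific check that records diagnostics in a message buffer and reports validity to its caller. It is not code from the patch: `ToyTargetVerify`, `Func`, `Call`, and `CallConv` are hypothetical stand-ins that do not exist in LLVM, and the two rules are modeled loosely on the "Shaders must return void" and "A kernel may not call a kernel" checks. Unlike the patch's `Check` macro, which returns at the first failure, this sketch keeps going and collects every message.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for IR concepts; none of these are LLVM types.
enum class CallConv { C, Kernel, Shader };

struct Call {
  CallConv CalleeCC; // calling convention used at the call site
};

struct Func {
  std::string Name;
  CallConv CC;
  bool ReturnsVoid;
  std::vector<Call> Calls;
};

// Collects diagnostics instead of aborting, so a driver (for example a
// fuzzer) can decide what to do with an input that fails verification.
class ToyTargetVerify {
public:
  std::ostringstream Messages;

  // Returns true when F passes every target-specific check.
  bool run(const Func &F) {
    bool Valid = true;

    // Rule 1 (assumed for illustration): shader functions return void.
    if (F.CC == CallConv::Shader && !F.ReturnsVoid) {
      Messages << "Shaders must return void: " << F.Name << '\n';
      Valid = false;
    }

    // Rule 2 (assumed for illustration): a kernel may not call a kernel.
    for (const Call &C : F.Calls) {
      if (F.CC == CallConv::Kernel && C.CalleeCC == CallConv::Kernel) {
        Messages << "A kernel may not call a kernel: " << F.Name << '\n';
        Valid = false;
      }
    }
    return Valid;
  }
};
```

A driver would construct one `ToyTargetVerify` per module, call `run` on each function, and reject the input when any call returns false; the accumulated text in `Messages` then explains why.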
--- llvm/include/llvm/IR/Module.h | 4 + llvm/include/llvm/Target/TargetVerifier.h | 82 +++++++ .../TargetVerify/AMDGPUTargetVerifier.h | 36 +++ llvm/lib/IR/Verifier.cpp | 18 +- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 213 ++++++++++++++++++ llvm/lib/Target/AMDGPU/CMakeLists.txt | 1 + llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 62 +++++ llvm/tools/llvm-tgt-verify/CMakeLists.txt | 34 +++ .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 172 ++++++++++++++ 9 files changed, 618 insertions(+), 4 deletions(-) create mode 100644 llvm/include/llvm/Target/TargetVerifier.h create mode 100644 llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h create mode 100644 llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify.ll create mode 100644 llvm/tools/llvm-tgt-verify/CMakeLists.txt create mode 100644 llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h index 91ccd76c41e07..03c0cf1cf0924 100644 --- a/llvm/include/llvm/IR/Module.h +++ b/llvm/include/llvm/IR/Module.h @@ -214,6 +214,10 @@ class LLVM_ABI Module { /// @name Constructors /// @{ public: + /// Is this Module valid as determined by one of the verification passes + /// i.e. Lint, Verifier, TargetVerifier. + bool IsValid = true; + /// Is this Module using intrinsics to record the position of debugging /// information, or non-intrinsic records? See IsNewDbgInfoFormat in /// \ref BasicBlock. diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h new file mode 100644 index 0000000000000..e00c6a7b260c9 --- /dev/null +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -0,0 +1,82 @@ +//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines target verifier interfaces that can be used for some
+// validation of input to the system, and for checking that transformations
+// haven't done something bad. In contrast to the Verifier or Lint, the
+// TargetVerifier looks for constructions invalid to a particular target
+// machine.
+//
+// To see what specifically is checked, look at TargetVerifier.cpp or an
+// individual backend's TargetVerifier.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TARGET_VERIFIER_H
+#define LLVM_TARGET_VERIFIER_H
+
+#include "llvm/IR/PassManager.h"
+#include "llvm/IR/Module.h"
+#include "llvm/TargetParser/Triple.h"
+
+namespace llvm {
+
+class Function;
+
+class TargetVerifierPass : public PassInfoMixin<TargetVerifierPass> {
+public:
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {}
+};
+
+class TargetVerify {
+protected:
+  void WriteValues(ArrayRef<const Value *> Vs) {
+    for (const Value *V : Vs) {
+      if (!V)
+        continue;
+      if (isa<Instruction>(V)) {
+        MessagesStr << *V << '\n';
+      } else {
+        V->printAsOperand(MessagesStr, true, Mod);
+        MessagesStr << '\n';
+      }
+    }
+  }
+
+  /// A check failed, so printout out the condition and the message.
+  ///
+  /// This provides a nice place to put a breakpoint if you want to see why
+  /// something is not correct.
+  void CheckFailed(const Twine &Message) { MessagesStr << Message << '\n'; }
+
+  /// A check failed (with values to print).
+  ///
+  /// This calls the Message-only version so that the above is easier to set
+  /// a breakpoint on.
+  template <typename T1, typename... Ts>
+  void CheckFailed(const Twine &Message, const T1 &V1, const Ts &... Vs) {
+    CheckFailed(Message);
+    WriteValues({V1, Vs...});
+  }
+public:
+  Module *Mod;
+  Triple TT;
+
+  std::string Messages;
+  raw_string_ostream MessagesStr;
+
+  TargetVerify(Module *Mod)
+      : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())),
+        MessagesStr(Messages) {}
+
+  void run(Function &F) {};
+};
+
+} // namespace llvm
+
+#endif // LLVM_TARGET_VERIFIER_H
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
new file mode 100644
index 0000000000000..e6ff57629b141
--- /dev/null
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -0,0 +1,36 @@
+//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU ---*- C++ -*-===//
+////
+//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+//// See https://llvm.org/LICENSE.txt for license information.
+//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+////
+////===----------------------------------------------------------------------===//
+////
+//// This file defines target verifier interfaces that can be used for some
+//// validation of input to the system, and for checking that transformations
+//// haven't done something bad. In contrast to the Verifier or Lint, the
+//// TargetVerifier looks for constructions invalid to a particular target
+//// machine.
+////
+//// To see what specifically is checked, look at an individual backend's
+//// TargetVerifier.
+//// +////===----------------------------------------------------------------------===// + +#ifndef LLVM_AMDGPU_TARGET_VERIFIER_H +#define LLVM_AMDGPU_TARGET_VERIFIER_H + +#include "llvm/Target/TargetVerifier.h" + +namespace llvm { + +class Function; + +class AMDGPUTargetVerifierPass : public TargetVerifierPass { +public: + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); +}; + +} // namespace llvm + +#endif // LLVM_AMDGPU_TARGET_VERIFIER_H diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index 8afe360d088bc..9d21ca182ca13 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -135,6 +135,10 @@ static cl::opt VerifyNoAliasScopeDomination( cl::desc("Ensure that llvm.experimental.noalias.scope.decl for identical " "scopes are not dominating")); +static cl::opt + VerifyAbortOnError("verifier-abort-on-error", cl::init(false), + cl::desc("In the Verifier pass, abort on errors.")); + namespace llvm { struct VerifierSupport { @@ -7796,16 +7800,22 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F, PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { auto Res = AM.getResult(M); - if (FatalErrors && (Res.IRBroken || Res.DebugInfoBroken)) - report_fatal_error("Broken module found, compilation aborted!"); + if (Res.IRBroken || Res.DebugInfoBroken) { + M.IsValid = false; + if (VerifyAbortOnError && FatalErrors) + report_fatal_error("Broken module found, compilation aborted!"); + } return PreservedAnalyses::all(); } PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) { auto res = AM.getResult(F); - if (res.IRBroken && FatalErrors) - report_fatal_error("Broken function found, compilation aborted!"); + if (res.IRBroken) { + F.getParent()->IsValid = false; + if (VerifyAbortOnError && FatalErrors) + report_fatal_error("Broken function found, compilation aborted!"); + } return PreservedAnalyses::all(); } diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp new file mode 100644 index 0000000000000..585b19065c142 --- /dev/null +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -0,0 +1,213 @@ +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" + +#include "llvm/Analysis/UniformityAnalysis.h" +#include "llvm/Analysis/PostDominators.h" +#include "llvm/Support/Debug.h" +#include "llvm/IR/Dominators.h" +#include "llvm/IR/Function.h" +#include "llvm/IR/IntrinsicInst.h" +#include "llvm/IR/IntrinsicsAMDGPU.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/Value.h" + +#include "llvm/Support/raw_ostream.h" + +using namespace llvm; + +static cl::opt +MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); + +// Check - We know that cond should be true, if not print an error message. +#define Check(C, ...) \ + do { \ + if (!(C)) { \ + TargetVerify::CheckFailed(__VA_ARGS__); \ + return; \ + } \ + } while (false) + +static bool isMFMA(unsigned IID) { + switch (IID) { + case Intrinsic::amdgcn_mfma_f32_4x4x1f32: + case Intrinsic::amdgcn_mfma_f32_4x4x4f16: + case Intrinsic::amdgcn_mfma_i32_4x4x4i8: + case Intrinsic::amdgcn_mfma_f32_4x4x2bf16: + + case Intrinsic::amdgcn_mfma_f32_16x16x1f32: + case Intrinsic::amdgcn_mfma_f32_16x16x4f32: + case Intrinsic::amdgcn_mfma_f32_16x16x4f16: + case Intrinsic::amdgcn_mfma_f32_16x16x16f16: + case Intrinsic::amdgcn_mfma_i32_16x16x4i8: + case Intrinsic::amdgcn_mfma_i32_16x16x16i8: + case Intrinsic::amdgcn_mfma_f32_16x16x2bf16: + case Intrinsic::amdgcn_mfma_f32_16x16x8bf16: + + case Intrinsic::amdgcn_mfma_f32_32x32x1f32: + case Intrinsic::amdgcn_mfma_f32_32x32x2f32: + case Intrinsic::amdgcn_mfma_f32_32x32x4f16: + case Intrinsic::amdgcn_mfma_f32_32x32x8f16: + case Intrinsic::amdgcn_mfma_i32_32x32x4i8: + case Intrinsic::amdgcn_mfma_i32_32x32x8i8: + case Intrinsic::amdgcn_mfma_f32_32x32x2bf16: + case Intrinsic::amdgcn_mfma_f32_32x32x4bf16: + + case Intrinsic::amdgcn_mfma_f32_4x4x4bf16_1k: + case 
Intrinsic::amdgcn_mfma_f32_16x16x4bf16_1k: + case Intrinsic::amdgcn_mfma_f32_16x16x16bf16_1k: + case Intrinsic::amdgcn_mfma_f32_32x32x4bf16_1k: + case Intrinsic::amdgcn_mfma_f32_32x32x8bf16_1k: + + case Intrinsic::amdgcn_mfma_f64_16x16x4f64: + case Intrinsic::amdgcn_mfma_f64_4x4x4f64: + + case Intrinsic::amdgcn_mfma_i32_16x16x32_i8: + case Intrinsic::amdgcn_mfma_i32_32x32x16_i8: + case Intrinsic::amdgcn_mfma_f32_16x16x8_xf32: + case Intrinsic::amdgcn_mfma_f32_32x32x4_xf32: + + case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_bf8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_fp8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_bf8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_fp8: + + case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_bf8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_fp8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_bf8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_fp8: + return true; + default: + return false; + } +} + +namespace llvm { +class AMDGPUTargetVerify : public TargetVerify { +public: + Module *Mod; + + DominatorTree *DT; + PostDominatorTree *PDT; + UniformityInfo *UA; + + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) + : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} + + void run(Function &F); +}; + +static bool IsValidInt(const Type *Ty) { + return Ty->isIntegerTy(1) || + Ty->isIntegerTy(8) || + Ty->isIntegerTy(16) || + Ty->isIntegerTy(32) || + Ty->isIntegerTy(64) || + Ty->isIntegerTy(128); +} + +static bool isShader(CallingConv::ID CC) { + switch(CC) { + case CallingConv::AMDGPU_VS: + case CallingConv::AMDGPU_LS: + case CallingConv::AMDGPU_HS: + case CallingConv::AMDGPU_ES: + case CallingConv::AMDGPU_GS: + case CallingConv::AMDGPU_PS: + case CallingConv::AMDGPU_CS_Chain: + case CallingConv::AMDGPU_CS_ChainPreserve: + case CallingConv::AMDGPU_CS: + return true; + default: + return false; + } +} + +void AMDGPUTargetVerify::run(Function &F) { + // Ensure shader calling convention 
returns void + if (isShader(F.getCallingConv())) + Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void"); + + for (auto &BB : F) { + + for (auto &I : BB) { + if (MarkUniform) + outs() << UA->isUniform(&I) << ' ' << I << '\n'; + + // Ensure integral types are valid: i8, i16, i32, i64, i128 + if (I.getType()->isIntegerTy()) + Check(IsValidInt(I.getType()), "Int type is invalid.", &I); + for (unsigned i = 0; i < I.getNumOperands(); ++i) + if (I.getOperand(i)->getType()->isIntegerTy()) + Check(IsValidInt(I.getOperand(i)->getType()), + "Int type is invalid.", I.getOperand(i)); + + // Ensure no store to const memory + if (auto *SI = dyn_cast(&I)) + { + unsigned AS = SI->getPointerAddressSpace(); + Check(AS != 4, "Write to const memory", SI); + } + + // Ensure no kernel to kernel calls. + if (auto *CI = dyn_cast(&I)) + { + CallingConv::ID CalleeCC = CI->getCallingConv(); + if (CalleeCC == CallingConv::AMDGPU_KERNEL) + { + CallingConv::ID CallerCC = CI->getParent()->getParent()->getCallingConv(); + Check(CallerCC != CallingConv::AMDGPU_KERNEL, + "A kernel may not call a kernel", CI->getParent()->getParent()); + } + } + + // Ensure MFMA is not in control flow with diverging operands + if (auto *II = dyn_cast(&I)) { + if (isMFMA(II->getIntrinsicID())) { + bool InControlFlow = false; + for (const auto &P : predecessors(&BB)) + if (!PDT->dominates(&BB, P)) { + InControlFlow = true; + break; + } + for (const auto &S : successors(&BB)) + if (!DT->dominates(&BB, S)) { + InControlFlow = true; + break; + } + if (InControlFlow) { + // If operands to MFMA are not uniform, MFMA cannot be in control flow + bool hasUniformOperands = true; + for (unsigned i = 0; i < II->getNumOperands(); i++) { + if (!UA->isUniform(II->getOperand(i))) { + dbgs() << "Not uniform: " << *II->getOperand(i) << '\n'; + hasUniformOperands = false; + } + } + if (!hasUniformOperands) Check(false, "MFMA in control flow", II); + //else Check(false, "MFMA in control flow (uniform 
operands)", II); + } + //else Check(false, "MFMA not in control flow", II); + } + } + } + } +} + +PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { + + auto *Mod = F.getParent(); + + auto UA = &AM.getResult(F); + auto *DT = &AM.getResult(F); + auto *PDT = &AM.getResult(F); + + AMDGPUTargetVerify TV(Mod, DT, PDT, UA); + TV.run(F); + + dbgs() << TV.MessagesStr.str(); + if (!TV.MessagesStr.str().empty()) { + F.getParent()->IsValid = false; + } + + return PreservedAnalyses::all(); +} +} // namespace llvm diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt b/llvm/lib/Target/AMDGPU/CMakeLists.txt index 09a3096602fc3..bcfea0bf8ac94 100644 --- a/llvm/lib/Target/AMDGPU/CMakeLists.txt +++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt @@ -110,6 +110,7 @@ add_llvm_target(AMDGPUCodeGen AMDGPUTargetMachine.cpp AMDGPUTargetObjectFile.cpp AMDGPUTargetTransformInfo.cpp + AMDGPUTargetVerifier.cpp AMDGPUWaitSGPRHazards.cpp AMDGPUUnifyDivergentExitNodes.cpp AMDGPUUnifyMetadata.cpp diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll new file mode 100644 index 0000000000000..f56ff992a56c2 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -0,0 +1,62 @@ +; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s + +define amdgpu_kernel void @test_mfma_f32_32x32x1f32_vecarg(ptr addrspace(1) %arg) #0 { +; CHECK: Not uniform: %in.f32 = load <32 x float>, ptr addrspace(1) %gep, align 128 +; CHECK-NEXT: MFMA in control flow +; CHECK-NEXT: %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) +s: + %tid = call i32 @llvm.amdgcn.workitem.id.x() + %gep = getelementptr inbounds <32 x float>, ptr addrspace(1) %arg, i32 %tid + %in.i32 = load <32 x i32>, ptr addrspace(1) %gep + %in.f32 = load <32 x float>, ptr addrspace(1) %gep + + %0 = icmp eq <32 x i32> %in.i32, zeroinitializer + %div.br = extractelement <32 x i1> %0, 
i32 0 + br i1 %div.br, label %if.3, label %else.0 + +if.3: + br label %join + +else.0: + %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) + br label %join + +join: + ret void +} + +define amdgpu_cs i32 @shader() { +; CHECK: Shaders must return void + ret i32 0 +} + +define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) { +; CHECK: Undefined behavior: Write to memory in const addrspace +; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 +; CHECK-NEXT: Write to const memory +; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 + %r = add i32 %a, %b + store i32 %r, ptr addrspace(4) %out + ret void +} + +define amdgpu_kernel void @kernel_callee(ptr %x) { + ret void +} + +define amdgpu_kernel void @kernel_caller(ptr %x) { +; CHECK: A kernel may not call a kernel +; CHECK-NEXT: ptr @kernel_caller + call amdgpu_kernel void @kernel_callee(ptr %x) + ret void +} + + +; Function Attrs: nounwind +define i65 @invalid_type(i65 %x) #0 { +; CHECK: Int type is invalid. 
+; CHECK-NEXT: %tmp2 = ashr i65 %x, 64 +entry: + %tmp2 = ashr i65 %x, 64 + ret i65 %tmp2 +} diff --git a/llvm/tools/llvm-tgt-verify/CMakeLists.txt b/llvm/tools/llvm-tgt-verify/CMakeLists.txt new file mode 100644 index 0000000000000..fe47c85e6cdce --- /dev/null +++ b/llvm/tools/llvm-tgt-verify/CMakeLists.txt @@ -0,0 +1,34 @@ +set(LLVM_LINK_COMPONENTS + AllTargetsAsmParsers + AllTargetsCodeGens + AllTargetsDescs + AllTargetsInfos + Analysis + AsmPrinter + CodeGen + CodeGenTypes + Core + IRPrinter + IRReader + MC + MIRParser + Passes + Remarks + ScalarOpts + SelectionDAG + Support + Target + TargetParser + TransformUtils + Vectorize + ) + +add_llvm_tool(llvm-tgt-verify + llvm-tgt-verify.cpp + + DEPENDS + intrinsics_gen + SUPPORT_PLUGINS + ) + +export_executable_symbols_for_plugins(llc) diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp new file mode 100644 index 0000000000000..68422abd6f4cc --- /dev/null +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -0,0 +1,172 @@ +//===--- llvm-isel-fuzzer.cpp - Fuzzer for instruction selection ----------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Tool to fuzz instruction selection using libFuzzer. 
+// +//===----------------------------------------------------------------------===// + +#include "llvm/InitializePasses.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Analysis/Lint.h" +#include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/Bitcode/BitcodeReader.h" +#include "llvm/Bitcode/BitcodeWriter.h" +#include "llvm/CodeGen/CommandFlags.h" +#include "llvm/CodeGen/TargetPassConfig.h" +#include "llvm/IR/Constants.h" +#include "llvm/IR/LLVMContext.h" +#include "llvm/IR/LegacyPassManager.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/Verifier.h" +#include "llvm/IRReader/IRReader.h" +#include "llvm/Passes/PassBuilder.h" +#include "llvm/Passes/StandardInstrumentations.h" +#include "llvm/MC/TargetRegistry.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/DataTypes.h" +#include "llvm/Support/Debug.h" +#include "llvm/Support/InitLLVM.h" +#include "llvm/Support/SourceMgr.h" +#include "llvm/Support/TargetSelect.h" +#include "llvm/Target/TargetMachine.h" +#include "llvm/Target/TargetVerifier.h" + +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" + +#define DEBUG_TYPE "isel-fuzzer" + +using namespace llvm; + +static codegen::RegisterCodeGenFlags CGF; + +static cl::opt<std::string> +InputFilename(cl::Positional, cl::desc(""), cl::init("-")); + +static cl::opt<bool> + StacktraceAbort("stacktrace-abort", + cl::desc("Turn on stacktrace"), cl::init(false)); + +static cl::opt<bool> + NoLint("no-lint", + cl::desc("Turn off Lint"), cl::init(false)); + +static cl::opt<bool> + NoVerify("no-verifier", + cl::desc("Turn off Verifier"), cl::init(false)); + +static cl::opt<char> + OptLevel("O", + cl::desc("Optimization level. 
[-O0, -O1, -O2, or -O3] " + "(default = '-O2')"), + cl::Prefix, cl::init('2')); + +static cl::opt<std::string> + TargetTriple("mtriple", cl::desc("Override target triple for module")); + +static std::unique_ptr<TargetMachine> TM; + +static void handleLLVMFatalError(void *, const char *Message, bool) { + if (StacktraceAbort) { + dbgs() << "LLVM ERROR: " << Message << "\n" + << "Aborting.\n"; + abort(); + } +} + +int main(int argc, char **argv) { + StringRef ExecName = argv[0]; + InitLLVM X(argc, argv); + + InitializeAllTargets(); + InitializeAllTargetMCs(); + InitializeAllAsmPrinters(); + InitializeAllAsmParsers(); + + PassRegistry *Registry = PassRegistry::getPassRegistry(); + initializeCore(*Registry); + initializeCodeGen(*Registry); + initializeAnalysis(*Registry); + initializeTarget(*Registry); + + cl::ParseCommandLineOptions(argc, argv); + + if (TargetTriple.empty()) { + errs() << ExecName << ": -mtriple must be specified\n"; + exit(1); + } + + CodeGenOptLevel OLvl; + if (auto Level = CodeGenOpt::parseLevel(OptLevel)) { + OLvl = *Level; + } else { + errs() << ExecName << ": invalid optimization level.\n"; + return 1; + } + ExitOnError ExitOnErr(std::string(ExecName) + ": error:"); + TM = ExitOnErr(codegen::createTargetMachineForTriple( + Triple::normalize(TargetTriple), OLvl)); + assert(TM && "Could not allocate target machine!"); + + // Make sure we print the summary and the current unit when LLVM errors out. 
+ install_fatal_error_handler(handleLLVMFatalError, nullptr); + + LLVMContext Context; + SMDiagnostic Err; + std::unique_ptr<Module> M = parseIRFile(InputFilename, Err, Context); + if (!M) { + errs() << "Invalid mod\n"; + return 1; + } + auto S = Triple::normalize(TargetTriple); + M->setTargetTriple(S); + + PassInstrumentationCallbacks PIC; + StandardInstrumentations SI(Context, false/*debug PM*/, + false); + registerCodeGenCallback(PIC, *TM); + + ModulePassManager MPM; + FunctionPassManager FPM; + //TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple())); + + MachineFunctionAnalysisManager MFAM; + LoopAnalysisManager LAM; + FunctionAnalysisManager FAM; + CGSCCAnalysisManager CGAM; + ModuleAnalysisManager MAM; + PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC); + PB.registerModuleAnalyses(MAM); + PB.registerCGSCCAnalyses(CGAM); + PB.registerFunctionAnalyses(FAM); + PB.registerLoopAnalyses(LAM); + PB.registerMachineFunctionAnalyses(MFAM); + PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); + + SI.registerCallbacks(PIC, &MAM); + + //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); + + Triple TT(M->getTargetTriple()); + if (!NoLint) + FPM.addPass(LintPass()); + if (!NoVerify) + MPM.addPass(VerifierPass()); + if (TT.isAMDGPU()) + FPM.addPass(AMDGPUTargetVerifierPass()); + else if (false) {} // ... 
+ else + FPM.addPass(TargetVerifierPass()); + MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); + + MPM.run(*M, MAM); + + if (!M->IsValid) + return 1; + + return 0; +} >From a808efce8d90524845a44ffa5b90adb6741e488d Mon Sep 17 00:00:00 2001 From: jofernau Date: Mon, 3 Feb 2025 07:15:12 -0800 Subject: [PATCH 02/28] Add hook for target verifier in llc,opt --- .../llvm/Passes/StandardInstrumentations.h | 6 ++++-- llvm/include/llvm/Target/TargetVerifier.h | 1 + .../TargetVerify/AMDGPUTargetVerifier.h | 18 ++++++++++++++++++ llvm/lib/LTO/LTOBackend.cpp | 2 +- llvm/lib/LTO/ThinLTOCodeGenerator.cpp | 2 +- llvm/lib/Passes/CMakeLists.txt | 1 + llvm/lib/Passes/PassBuilderBindings.cpp | 2 +- llvm/lib/Passes/StandardInstrumentations.cpp | 19 +++++++++++++++---- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 12 ++++++------ llvm/lib/Target/CMakeLists.txt | 2 ++ .../CodeGen/AMDGPU/tgt-verify-llc-fail.ll | 6 ++++++ .../CodeGen/AMDGPU/tgt-verify-llc-pass.ll | 6 ++++++ llvm/tools/llc/NewPMDriver.cpp | 2 +- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 2 +- llvm/tools/opt/NewPMDriver.cpp | 2 +- llvm/unittests/IR/PassManagerTest.cpp | 6 +++--- 16 files changed, 68 insertions(+), 21 deletions(-) create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h index f7a65a88ecf5b..988fcb93b2357 100644 --- a/llvm/include/llvm/Passes/StandardInstrumentations.h +++ b/llvm/include/llvm/Passes/StandardInstrumentations.h @@ -476,7 +476,8 @@ class VerifyInstrumentation { public: VerifyInstrumentation(bool DebugLogging) : DebugLogging(DebugLogging) {} void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM); + ModuleAnalysisManager *MAM, + FunctionAnalysisManager *FAM); }; /// This class implements --time-trace functionality for new pass manager. 
@@ -621,7 +622,8 @@ class StandardInstrumentations { // Register all the standard instrumentation callbacks. If \p FAM is nullptr // then PreservedCFGChecker is not enabled. void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM = nullptr); + ModuleAnalysisManager *MAM, + FunctionAnalysisManager *FAM); TimePassesHandler &getTimePasses() { return TimePasses; } }; diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index e00c6a7b260c9..ad5aeb895953d 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -75,6 +75,7 @@ class TargetVerify { MessagesStr(Messages) {} void run(Function &F) {}; + void run(Function &F, FunctionAnalysisManager &AM); }; } // namespace llvm diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index e6ff57629b141..d8a3fda4f87dc 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -22,6 +22,10 @@ #include "llvm/Target/TargetVerifier.h" +#include "llvm/Analysis/UniformityAnalysis.h" +#include "llvm/Analysis/PostDominators.h" +#include "llvm/IR/Dominators.h" + namespace llvm { class Function; @@ -31,6 +35,20 @@ class AMDGPUTargetVerifierPass : public TargetVerifierPass { PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); }; +class AMDGPUTargetVerify : public TargetVerify { +public: + Module *Mod; + + DominatorTree *DT; + PostDominatorTree *PDT; + UniformityInfo *UA; + + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) + : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} + + void run(Function &F); +}; + } // namespace llvm #endif // LLVM_AMDGPU_TARGET_VERIFIER_H diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp index 1c764a0188eda..475e7cf45371b 100644 --- 
a/llvm/lib/LTO/LTOBackend.cpp +++ b/llvm/lib/LTO/LTOBackend.cpp @@ -275,7 +275,7 @@ static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(Mod.getContext(), Conf.DebugPassManager, Conf.VerifyEach); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PassBuilder PB(TM, Conf.PTO, PGOOpt, &PIC); RegisterPassPlugins(Conf.PassPlugins, PB); diff --git a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp index 9e7f8187fe49c..369b003df1364 100644 --- a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp +++ b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp @@ -245,7 +245,7 @@ static void optimizeModule(Module &TheModule, TargetMachine &TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(TheModule.getContext(), DebugPassManager); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PipelineTuningOptions PTO; PTO.LoopVectorization = true; PTO.SLPVectorization = true; diff --git a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt index 6425f4934b210..f171377a8b270 100644 --- a/llvm/lib/Passes/CMakeLists.txt +++ b/llvm/lib/Passes/CMakeLists.txt @@ -29,6 +29,7 @@ add_llvm_component_library(LLVMPasses Scalar Support Target + TargetParser TransformUtils Vectorize Instrumentation diff --git a/llvm/lib/Passes/PassBuilderBindings.cpp b/llvm/lib/Passes/PassBuilderBindings.cpp index 933fe89e53a94..f0e1abb8cebc4 100644 --- a/llvm/lib/Passes/PassBuilderBindings.cpp +++ b/llvm/lib/Passes/PassBuilderBindings.cpp @@ -76,7 +76,7 @@ static LLVMErrorRef runPasses(Module *Mod, Function *Fun, const char *Passes, PB.crossRegisterProxies(LAM, FAM, CGAM, MAM); StandardInstrumentations SI(Mod->getContext(), Debug, VerifyEach); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); // Run the pipeline. 
if (Fun) { diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index dc1dd5d9c7f4c..7b15f89e361b8 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -45,6 +45,7 @@ #include "llvm/Support/Signals.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Support/xxhash.h" +#include "llvm/Target/TargetVerifier.h" #include #include #include @@ -1454,9 +1455,10 @@ void PreservedCFGCheckerInstrumentation::registerCallbacks( } void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM) { + ModuleAnalysisManager *MAM, + FunctionAnalysisManager *FAM) { PIC.registerAfterPassCallback( - [this, MAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { + [this, MAM, FAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { if (isIgnored(P) || P == "VerifierPass") return; const auto *F = unwrapIR<Function>(IR); @@ -1473,6 +1475,15 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, report_fatal_error(formatv("Broken function found after pass " "\"{0}\", compilation aborted!", P)); + + if (FAM) { + TargetVerify TV(const_cast<Module *>(F->getParent())); + TV.run(*const_cast<Function *>(F), *FAM); + if (!F->getParent()->IsValid) + report_fatal_error(formatv("Broken function found after pass " + "\"{0}\", compilation aborted!", + P)); + } } else { const auto *M = unwrapIR<Module>(IR); if (!M) { @@ -2512,7 +2523,7 @@ void PrintCrashIRInstrumentation::registerCallbacks( } void StandardInstrumentations::registerCallbacks( - PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM) { + PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM, FunctionAnalysisManager *FAM) { PrintIR.registerCallbacks(PIC); PrintPass.registerCallbacks(PIC); TimePasses.registerCallbacks(PIC); @@ -2521,7 +2532,7 @@ void StandardInstrumentations::registerCallbacks( PrintChangedIR.registerCallbacks(PIC); 
PseudoProbeVerification.registerCallbacks(PIC); if (VerifyEach) - Verify.registerCallbacks(PIC, MAM); + Verify.registerCallbacks(PIC, MAM, FAM); PrintChangedDiff.registerCallbacks(PIC); WebsiteChangeReporter.registerCallbacks(PIC); ChangeTester.registerCallbacks(PIC); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 585b19065c142..e6cdec7160229 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -14,8 +14,8 @@ using namespace llvm; -static cl::opt<bool> -MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); +//static cl::opt<bool> +//MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); // Check - We know that cond should be true, if not print an error message. #define Check(C, ...) \ @@ -81,7 +81,7 @@ static bool isMFMA(unsigned IID) { } namespace llvm { -class AMDGPUTargetVerify : public TargetVerify { +/*class AMDGPUTargetVerify : public TargetVerify { public: Module *Mod; @@ -93,7 +93,7 @@ class AMDGPUTargetVerify : public TargetVerify { : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} void run(Function &F); -}; +};*/ static bool IsValidInt(const Type *Ty) { return Ty->isIntegerTy(1) || @@ -129,8 +129,8 @@ void AMDGPUTargetVerify::run(Function &F) { for (auto &BB : F) { for (auto &I : BB) { - if (MarkUniform) - outs() << UA->isUniform(&I) << ' ' << I << '\n'; + //if (MarkUniform) + //outs() << UA->isUniform(&I) << ' ' << I << '\n'; // Ensure integral types are valid: i8, i16, i32, i64, i128 if (I.getType()->isIntegerTy()) diff --git a/llvm/lib/Target/CMakeLists.txt b/llvm/lib/Target/CMakeLists.txt index 9472288229cac..f2a5d545ce84f 100644 --- a/llvm/lib/Target/CMakeLists.txt +++ b/llvm/lib/Target/CMakeLists.txt @@ -7,6 +7,8 @@ add_llvm_component_library(LLVMTarget TargetLoweringObjectFile.cpp TargetMachine.cpp TargetMachineC.cpp + TargetVerifier.cpp + 
AMDGPU/AMDGPUTargetVerifier.cpp ADDITIONAL_HEADER_DIRS ${LLVM_MAIN_INCLUDE_DIR}/llvm/Target diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll new file mode 100644 index 0000000000000..584097d7bc134 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll @@ -0,0 +1,6 @@ +; RUN: not not llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each -o - < %s 2>&1 | FileCheck %s + +define amdgpu_cs i32 @nonvoid_shader() { +; CHECK: LLVM ERROR + ret i32 0 +} diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll new file mode 100644 index 0000000000000..0c3a5fe5ac4a5 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll @@ -0,0 +1,6 @@ +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each %s -o - 2>&1 | FileCheck %s + +define amdgpu_cs void @void_shader() { +; CHECK: ModuleToFunctionPassAdaptor + ret void +} diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index fa82689ecf9ae..a060d16e74958 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -126,7 +126,7 @@ int llvm::compileModuleWithNewPM( PB.registerLoopAnalyses(LAM); PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); MAM.registerPass([&] { return MachineModuleAnalysis(MMI); }); diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 68422abd6f4cc..3352d07deff2f 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -147,7 +147,7 @@ int main(int argc, char **argv) { PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - 
SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); diff --git a/llvm/tools/opt/NewPMDriver.cpp b/llvm/tools/opt/NewPMDriver.cpp index 7d168a6ceb17c..a8977d80bdf44 100644 --- a/llvm/tools/opt/NewPMDriver.cpp +++ b/llvm/tools/opt/NewPMDriver.cpp @@ -423,7 +423,7 @@ bool llvm::runPassPipeline( PrintPassOpts.SkipAnalyses = DebugPM == DebugLogging::Quiet; StandardInstrumentations SI(M.getContext(), DebugPM != DebugLogging::None, VK == VerifierKind::EachPass, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); DebugifyEachInstrumentation Debugify; DebugifyStatsMap DIStatsMap; DebugInfoPerPass DebugInfoBeforePass; diff --git a/llvm/unittests/IR/PassManagerTest.cpp b/llvm/unittests/IR/PassManagerTest.cpp index a6487169224c2..bb4db6120035f 100644 --- a/llvm/unittests/IR/PassManagerTest.cpp +++ b/llvm/unittests/IR/PassManagerTest.cpp @@ -828,7 +828,7 @@ TEST_F(PassManagerTest, FunctionPassCFGChecker) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -877,7 +877,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerInvalidateAnalysis) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -945,7 +945,7 @@ TEST_F(PassManagerTest, 
FunctionPassCFGCheckerWrapped) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); >From 64d001858efc994e965071cd319d268b934a6eb3 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 16 Apr 2025 10:19:00 -0400 Subject: [PATCH 03/28] Run AMDGPUTargetVerifier within AMDGPU pipeline. Move IsValid from Module to TargetVerify. --- clang/lib/CodeGen/BackendUtil.cpp | 2 +- llvm/include/llvm/IR/Module.h | 4 ---- llvm/include/llvm/Target/TargetVerifier.h | 2 ++ llvm/lib/IR/Verifier.cpp | 4 ++-- llvm/lib/Passes/StandardInstrumentations.cpp | 2 +- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 5 +++++ llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 3 ++- llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 6 +++--- 8 files changed, 16 insertions(+), 12 deletions(-) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index f7eb853beb23c..9a1c922f5ddef 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -922,7 +922,7 @@ void EmitAssemblyHelper::RunOptimizationPipeline( TheModule->getContext(), (CodeGenOpts.DebugPassManager || DebugPassStructure), CodeGenOpts.VerifyEach, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PassBuilder PB(TM.get(), PTO, PGOOpt, &PIC); // Handle the assignment tracking feature options. 
diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h index 03c0cf1cf0924..91ccd76c41e07 100644 --- a/llvm/include/llvm/IR/Module.h +++ b/llvm/include/llvm/IR/Module.h @@ -214,10 +214,6 @@ class LLVM_ABI Module { /// @name Constructors /// @{ public: - /// Is this Module valid as determined by one of the verification passes - /// i.e. Lint, Verifier, TargetVerifier. - bool IsValid = true; - /// Is this Module using intrinsics to record the position of debugging /// information, or non-intrinsic records? See IsNewDbgInfoFormat in /// \ref BasicBlock. diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index ad5aeb895953d..2d0c039132c35 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -70,6 +70,8 @@ class TargetVerify { std::string Messages; raw_string_ostream MessagesStr; + bool IsValid = true; + TargetVerify(Module *Mod) : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())), MessagesStr(Messages) {} diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index 9d21ca182ca13..d7c514610b4ba 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -7801,7 +7801,7 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F, PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { auto Res = AM.getResult<VerifierAnalysis>(M); if (Res.IRBroken || Res.DebugInfoBroken) { - M.IsValid = false; + //M.IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken module found, compilation aborted!"); } @@ -7812,7 +7812,7 @@ PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) { auto res = AM.getResult<VerifierAnalysis>(F); if (res.IRBroken) { - F.getParent()->IsValid = false; + //F.getParent()->IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken function found, compilation aborted!"); } diff 
--git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index 7b15f89e361b8..879d657c87695 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -1479,7 +1479,7 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, if (FAM) { TargetVerify TV(const_cast<Module *>(F->getParent())); TV.run(*const_cast<Function *>(F), *FAM); - if (!F->getParent()->IsValid) + if (!TV.IsValid) report_fatal_error(formatv("Broken function found after pass " "\"{0}\", compilation aborted!", P)); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 90e3489ced923..6ec34d6a0fdbf 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -90,6 +90,7 @@ #include "llvm/MC/TargetRegistry.h" #include "llvm/Passes/PassBuilder.h" #include "llvm/Support/FormatVariadic.h" +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "llvm/Transforms/HipStdPar/HipStdPar.h" #include "llvm/Transforms/IPO.h" #include "llvm/Transforms/IPO/AlwaysInliner.h" @@ -1298,6 +1299,8 @@ void AMDGPUPassConfig::addIRPasses() { addPass(createLICMPass()); } + //addPass(AMDGPUTargetVerifierPass()); + TargetPassConfig::addIRPasses(); // EarlyCSE is not always strong enough to clean up what LSR produces. For @@ -2040,6 +2043,8 @@ void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { // but EarlyCSE can do neither of them. 
if (isPassEnabled(EnableScalarIRPasses)) addEarlyCSEOrGVNPass(addPass); + + addPass(AMDGPUTargetVerifierPass()); } void AMDGPUCodeGenPassBuilder::addCodeGenPrepare(AddIRPass &addPass) const { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index e6cdec7160229..c70a6d1b6fa66 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -205,7 +205,8 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan dbgs() << TV.MessagesStr.str(); if (!TV.MessagesStr.str().empty()) { - F.getParent()->IsValid = false; + TV.IsValid = false; + return PreservedAnalyses::none(); } return PreservedAnalyses::all(); diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 3352d07deff2f..fbe7f6089ff18 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -163,9 +163,9 @@ int main(int argc, char **argv) { FPM.addPass(TargetVerifierPass()); MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); - MPM.run(*M, MAM); - - if (!M->IsValid) + auto PA = MPM.run(*M, MAM); + auto PAC = PA.getChecker(); + if (!PAC.preserved()) return 1; return 0; >From fdae3025942584d0085deb3442f40471548defe5 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 16 Apr 2025 11:08:20 -0400 Subject: [PATCH 04/28] Remove cmd line options that aren't required. Make error message explicit. 
--- llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll | 4 ++-- llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll index 584097d7bc134..c5e59d4a2369e 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll @@ -1,6 +1,6 @@ -; RUN: not not llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each -o - < %s 2>&1 | FileCheck %s +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm -o - < %s 2>&1 | FileCheck %s define amdgpu_cs i32 @nonvoid_shader() { -; CHECK: LLVM ERROR +; CHECK: Shaders must return void ret i32 0 } diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll index 0c3a5fe5ac4a5..8a503b7624a73 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll @@ -1,6 +1,6 @@ -; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each %s -o - 2>&1 | FileCheck %s +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm %s -o - 2>&1 | FileCheck %s --allow-empty define amdgpu_cs void @void_shader() { -; CHECK: ModuleToFunctionPassAdaptor +; CHECK-NOT: Shaders must return void ret void } >From 5ceda58cc5b5d7372c6e43cbdf583f0dda87b956 Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 19:36:34 -0400 Subject: [PATCH 05/28] Return Verifier none status through PreservedAnalyses on fail. 
--- llvm/lib/Analysis/Lint.cpp | 4 +++- llvm/lib/IR/Verifier.cpp | 2 ++ llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 8 +++++--- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/llvm/lib/Analysis/Lint.cpp b/llvm/lib/Analysis/Lint.cpp index f05e36e2025d4..c8e38963e5974 100644 --- a/llvm/lib/Analysis/Lint.cpp +++ b/llvm/lib/Analysis/Lint.cpp @@ -742,9 +742,11 @@ PreservedAnalyses LintPass::run(Function &F, FunctionAnalysisManager &AM) { Lint L(Mod, DL, AA, AC, DT, TLI); L.visit(F); dbgs() << L.MessagesStr.str(); - if (AbortOnError && !L.MessagesStr.str().empty()) + if (AbortOnError && !L.MessagesStr.str().empty()) { report_fatal_error( "linter found errors, aborting. (enabled by abort-on-error)", false); + return PreservedAnalyses::none(); + } return PreservedAnalyses::all(); } diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index d7c514610b4ba..51f6dec53b70f 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -7804,6 +7804,7 @@ PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { //M.IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken module found, compilation aborted!"); + return PreservedAnalyses::none(); } return PreservedAnalyses::all(); @@ -7815,6 +7816,7 @@ PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) { //F.getParent()->IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken function found, compilation aborted!"); + return PreservedAnalyses::none(); } return PreservedAnalyses::all(); diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index fbe7f6089ff18..042824ac37fea 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -164,9 +164,11 @@ int main(int argc, char **argv) { MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); auto PA = MPM.run(*M, MAM); - auto PAC = 
PA.getChecker(); - if (!PAC.preserved()) - return 1; + { + auto PAC = PA.getChecker(); + if (!PAC.preserved()) + return 1; + } return 0; } >From 99c29069cdaf68c92ce7f25ca2f730bf738ca324 Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 21:16:02 -0400 Subject: [PATCH 06/28] Rebase update. --- llvm/include/llvm/Target/TargetVerifier.h | 2 +- llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index 2d0c039132c35..fe683311b901c 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -73,7 +73,7 @@ class TargetVerify { bool IsValid = true; TargetVerify(Module *Mod) - : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())), + : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} void run(Function &F) {}; diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 042824ac37fea..627bc51ef3a43 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -123,7 +123,7 @@ int main(int argc, char **argv) { return 1; } auto S = Triple::normalize(TargetTriple); - M->setTargetTriple(S); + M->setTargetTriple(Triple(S)); PassInstrumentationCallbacks PIC; StandardInstrumentations SI(Context, false/*debug PM*/, @@ -153,7 +153,7 @@ int main(int argc, char **argv) { Triple TT(M->getTargetTriple()); if (!NoLint) - FPM.addPass(LintPass()); + FPM.addPass(LintPass(false)); if (!NoVerify) MPM.addPass(VerifierPass()); if (TT.isAMDGPU()) >From 3ea7eae48a6addbf711716e7a819830dddc1b34a Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 22:49:52 -0400 Subject: [PATCH 07/28] Add generic TargetVerifier. 
---
 llvm/lib/Target/TargetVerifier.cpp | 32 ++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 llvm/lib/Target/TargetVerifier.cpp

diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
new file mode 100644
index 0000000000000..de3ff749e7c3c
--- /dev/null
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -0,0 +1,32 @@
+#include "llvm/Target/TargetVerifier.h"
+#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
+
+#include "llvm/Analysis/UniformityAnalysis.h"
+#include "llvm/Analysis/PostDominators.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/IR/Dominators.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/IntrinsicsAMDGPU.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/Value.h"
+
+namespace llvm {
+
+void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
+  if (TT.isAMDGPU()) {
+    auto *UA = &AM.getResult<UniformityInfoAnalysis>(F);
+    auto *DT = &AM.getResult<DominatorTreeAnalysis>(F);
+    auto *PDT = &AM.getResult<PostDominatorTreeAnalysis>(F);
+
+    AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
+    TV.run(F);
+
+    dbgs() << TV.MessagesStr.str();
+    if (!TV.MessagesStr.str().empty()) {
+      TV.IsValid = false;
+    }
+  }
+}
+
+} // namespace llvm

>From f52c4dbc84952d97266f5f4158729e564de10240 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 19 Apr 2025 23:13:14 -0400
Subject: [PATCH 08/28] Remove store to const check since it is in Lint already

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 8 --------
 llvm/test/CodeGen/AMDGPU/tgt-verify.ll          | 2 --
 2 files changed, 10 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index c70a6d1b6fa66..1cf2b277bee26 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -140,14 +140,6 @@ void AMDGPUTargetVerify::run(Function &F) {
           Check(IsValidInt(I.getOperand(i)->getType()),
                 "Int type is invalid.", I.getOperand(i));
 
-      // Ensure no store to const memory
-      if
(auto *SI = dyn_cast(&I)) - { - unsigned AS = SI->getPointerAddressSpace(); - Check(AS != 4, "Write to const memory", SI); - } - - // Ensure no kernel to kernel calls. if (auto *CI = dyn_cast(&I)) { CallingConv::ID CalleeCC = CI->getCallingConv(); diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll index f56ff992a56c2..c628abbde11d1 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -32,8 +32,6 @@ define amdgpu_cs i32 @shader() { define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) { ; CHECK: Undefined behavior: Write to memory in const addrspace -; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 -; CHECK-NEXT: Write to const memory ; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 %r = add i32 %a, %b store i32 %r, ptr addrspace(4) %out >From 5c9a4ab3895d6939b12386d1db2081ca388df01a Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 23:14:38 -0400 Subject: [PATCH 09/28] Add chain followed by unreachable check --- llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 6 ++++++ llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 10 ++++++++++ 2 files changed, 16 insertions(+) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 1cf2b277bee26..8ea773bc0e66f 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -142,6 +142,7 @@ void AMDGPUTargetVerify::run(Function &F) { if (auto *CI = dyn_cast(&I)) { + // Ensure no kernel to kernel calls. CallingConv::ID CalleeCC = CI->getCallingConv(); if (CalleeCC == CallingConv::AMDGPU_KERNEL) { @@ -149,6 +150,11 @@ void AMDGPUTargetVerify::run(Function &F) { Check(CallerCC != CallingConv::AMDGPU_KERNEL, "A kernel may not call a kernel", CI->getParent()->getParent()); } + + // Ensure chain intrinsics are followed by unreachables. 
+ if (CI->getIntrinsicID() == Intrinsic::amdgcn_cs_chain) + Check(isa_and_present(CI->getNextNode()), + "llvm.amdgcn.cs.chain must be followed by unreachable", CI); } // Ensure MFMA is not in control flow with diverging operands diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll index c628abbde11d1..e620df94ccde4 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -58,3 +58,13 @@ entry: %tmp2 = ashr i65 %x, 64 ret i65 %tmp2 } + +declare void @llvm.amdgcn.cs.chain.v3i32(ptr, i32, <3 x i32>, <3 x i32>, i32, ...) +declare amdgpu_cs_chain void @chain_callee(<3 x i32> inreg, <3 x i32>) + +define amdgpu_cs void @no_unreachable(<3 x i32> inreg %a, <3 x i32> %b) { +; CHECK: llvm.amdgcn.cs.chain must be followed by unreachable +; CHECK-NEXT: call void (ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.p0.i32.v3i32.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0) + call void(ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0) + ret void +} >From 0ff03f792c018e4fd0c11de9da4d3353617707f5 Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 23:26:19 -0400 Subject: [PATCH 10/28] Remove mfma check --- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 89 ------------------- llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 25 ------ 2 files changed, 114 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 8ea773bc0e66f..684ced5bba574 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -14,9 +14,6 @@ using namespace llvm; -//static cl::opt -//MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); - // Check - We know that cond should be true, if not print an error message. #define Check(C, ...) 
\ do { \ @@ -26,60 +23,6 @@ using namespace llvm; } \ } while (false) -static bool isMFMA(unsigned IID) { - switch (IID) { - case Intrinsic::amdgcn_mfma_f32_4x4x1f32: - case Intrinsic::amdgcn_mfma_f32_4x4x4f16: - case Intrinsic::amdgcn_mfma_i32_4x4x4i8: - case Intrinsic::amdgcn_mfma_f32_4x4x2bf16: - - case Intrinsic::amdgcn_mfma_f32_16x16x1f32: - case Intrinsic::amdgcn_mfma_f32_16x16x4f32: - case Intrinsic::amdgcn_mfma_f32_16x16x4f16: - case Intrinsic::amdgcn_mfma_f32_16x16x16f16: - case Intrinsic::amdgcn_mfma_i32_16x16x4i8: - case Intrinsic::amdgcn_mfma_i32_16x16x16i8: - case Intrinsic::amdgcn_mfma_f32_16x16x2bf16: - case Intrinsic::amdgcn_mfma_f32_16x16x8bf16: - - case Intrinsic::amdgcn_mfma_f32_32x32x1f32: - case Intrinsic::amdgcn_mfma_f32_32x32x2f32: - case Intrinsic::amdgcn_mfma_f32_32x32x4f16: - case Intrinsic::amdgcn_mfma_f32_32x32x8f16: - case Intrinsic::amdgcn_mfma_i32_32x32x4i8: - case Intrinsic::amdgcn_mfma_i32_32x32x8i8: - case Intrinsic::amdgcn_mfma_f32_32x32x2bf16: - case Intrinsic::amdgcn_mfma_f32_32x32x4bf16: - - case Intrinsic::amdgcn_mfma_f32_4x4x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_16x16x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_16x16x16bf16_1k: - case Intrinsic::amdgcn_mfma_f32_32x32x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_32x32x8bf16_1k: - - case Intrinsic::amdgcn_mfma_f64_16x16x4f64: - case Intrinsic::amdgcn_mfma_f64_4x4x4f64: - - case Intrinsic::amdgcn_mfma_i32_16x16x32_i8: - case Intrinsic::amdgcn_mfma_i32_32x32x16_i8: - case Intrinsic::amdgcn_mfma_f32_16x16x8_xf32: - case Intrinsic::amdgcn_mfma_f32_32x32x4_xf32: - - case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_bf8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_fp8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_bf8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_fp8: - - case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_bf8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_fp8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_bf8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_fp8: - 
return true; - default: - return false; - } -} - namespace llvm { /*class AMDGPUTargetVerify : public TargetVerify { public: @@ -129,8 +72,6 @@ void AMDGPUTargetVerify::run(Function &F) { for (auto &BB : F) { for (auto &I : BB) { - //if (MarkUniform) - //outs() << UA->isUniform(&I) << ' ' << I << '\n'; // Ensure integral types are valid: i8, i16, i32, i64, i128 if (I.getType()->isIntegerTy()) @@ -156,36 +97,6 @@ void AMDGPUTargetVerify::run(Function &F) { Check(isa_and_present(CI->getNextNode()), "llvm.amdgcn.cs.chain must be followed by unreachable", CI); } - - // Ensure MFMA is not in control flow with diverging operands - if (auto *II = dyn_cast(&I)) { - if (isMFMA(II->getIntrinsicID())) { - bool InControlFlow = false; - for (const auto &P : predecessors(&BB)) - if (!PDT->dominates(&BB, P)) { - InControlFlow = true; - break; - } - for (const auto &S : successors(&BB)) - if (!DT->dominates(&BB, S)) { - InControlFlow = true; - break; - } - if (InControlFlow) { - // If operands to MFMA are not uniform, MFMA cannot be in control flow - bool hasUniformOperands = true; - for (unsigned i = 0; i < II->getNumOperands(); i++) { - if (!UA->isUniform(II->getOperand(i))) { - dbgs() << "Not uniform: " << *II->getOperand(i) << '\n'; - hasUniformOperands = false; - } - } - if (!hasUniformOperands) Check(false, "MFMA in control flow", II); - //else Check(false, "MFMA in control flow (uniform operands)", II); - } - //else Check(false, "MFMA not in control flow", II); - } - } } } } diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll index e620df94ccde4..62b220d7d9f49 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -1,30 +1,5 @@ ; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s -define amdgpu_kernel void @test_mfma_f32_32x32x1f32_vecarg(ptr addrspace(1) %arg) #0 { -; CHECK: Not uniform: %in.f32 = load <32 x float>, ptr addrspace(1) %gep, align 128 -; CHECK-NEXT: MFMA in control 
flow -; CHECK-NEXT: %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) -s: - %tid = call i32 @llvm.amdgcn.workitem.id.x() - %gep = getelementptr inbounds <32 x float>, ptr addrspace(1) %arg, i32 %tid - %in.i32 = load <32 x i32>, ptr addrspace(1) %gep - %in.f32 = load <32 x float>, ptr addrspace(1) %gep - - %0 = icmp eq <32 x i32> %in.i32, zeroinitializer - %div.br = extractelement <32 x i1> %0, i32 0 - br i1 %div.br, label %if.3, label %else.0 - -if.3: - br label %join - -else.0: - %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) - br label %join - -join: - ret void -} - define amdgpu_cs i32 @shader() { ; CHECK: Shaders must return void ret i32 0 >From 6b84c73a35a260d64ed45df90052f8212b0ee4e7 Mon Sep 17 00:00:00 2001 From: jofernau Date: Mon, 21 Apr 2025 20:54:10 -0400 Subject: [PATCH 11/28] Add registerVerifierPasses to PassBuilder and add the verifier passes to PassRegistry. 
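The registration scheme this patch describes amounts to storing one callback per pass-manager kind and replaying them when registerVerifierPasses is called. A minimal standalone sketch of that pattern follows. It is not the PassBuilder implementation. ToyPassManager, ToyPassBuilder and registerToyAMDGPUVerifiers are hypothetical names, a pass manager is reduced to a list of pass names, and the bool return type of the callbacks is an assumption based on the lambdas in the diff returning false.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in: a "pass manager" is just an ordered list of names.
using ToyPassManager = std::vector<std::string>;
using ToyCallback = std::function<bool(ToyPassManager &)>;

class ToyPassBuilder {
  std::vector<ToyCallback> VerifierCallbacks;   // module-level verifiers
  std::vector<ToyCallback> FnVerifierCallbacks; // function-level verifiers

public:
  // Store one module callback and one function callback for later replay.
  void registerVerifierCallback(ToyCallback C, ToyCallback CF) {
    VerifierCallbacks.push_back(std::move(C));
    FnVerifierCallbacks.push_back(std::move(CF));
  }

  // Replay every stored callback against the caller's pass managers.
  void registerVerifierPasses(ToyPassManager &MPM, ToyPassManager &FPM) {
    for (auto &C : VerifierCallbacks)
      C(MPM);
    for (auto &C : FnVerifierCallbacks)
      C(FPM);
  }
};

// What a target's registration hook might do (names taken from the .def
// entries in this patch, used here only as strings).
inline void registerToyAMDGPUVerifiers(ToyPassBuilder &PB) {
  PB.registerVerifierCallback(
      [](ToyPassManager &PM) {
        PM.push_back("verifier");
        return false;
      },
      [](ToyPassManager &FPM) {
        FPM.push_back("amdgpu-tgtverifier");
        return false;
      });
}
```

A driver would call registerToyAMDGPUVerifiers once and then registerVerifierPasses with its own managers, which is the order llc adopts later in the series.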
--- llvm/include/llvm/InitializePasses.h | 1 + llvm/include/llvm/Passes/PassBuilder.h | 21 +++++++ .../llvm/Passes/TargetPassRegistry.inc | 12 ++++ .../TargetVerify/AMDGPUTargetVerifier.h | 11 ++-- llvm/lib/Passes/PassBuilder.cpp | 7 +++ llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 11 ++++ .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 56 ++++++++++++++++++- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 1 + 8 files changed, 114 insertions(+), 6 deletions(-) diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 9bef8e496c57e..ae398db3dc1da 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -317,6 +317,7 @@ void initializeUnpackMachineBundlesPass(PassRegistry &); void initializeUnreachableBlockElimLegacyPassPass(PassRegistry &); void initializeUnreachableMachineBlockElimLegacyPass(PassRegistry &); void initializeVerifierLegacyPassPass(PassRegistry &); +void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); void initializeVirtRegMapWrapperLegacyPass(PassRegistry &); void initializeVirtRegRewriterPass(PassRegistry &); void initializeWasmEHPreparePass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/PassBuilder.h b/llvm/include/llvm/Passes/PassBuilder.h index 51ccaa53447d7..6000769ce723b 100644 --- a/llvm/include/llvm/Passes/PassBuilder.h +++ b/llvm/include/llvm/Passes/PassBuilder.h @@ -172,6 +172,13 @@ class PassBuilder { /// additional analyses. void registerLoopAnalyses(LoopAnalysisManager &LAM); + /// Registers all available verifier passes. + /// + /// This is an interface that can be used to populate a + /// \c ModuleAnalysisManager with all registered loop analyses. Callers can + /// still manually register any additional analyses. + void registerVerifierPasses(ModulePassManager &PM, FunctionPassManager &); + /// Registers all available machine function analysis passes. 
/// /// This is an interface that can be used to populate a \c @@ -570,6 +577,15 @@ class PassBuilder { } /// @}} + /// Register a callback for parsing an Verifier Name to populate + /// the given managers. + void registerVerifierCallback( + const std::function &C, + const std::function &CF) { + VerifierCallbacks.push_back(C); + FnVerifierCallbacks.push_back(CF); + } + /// {{@ Register pipeline parsing callbacks with this pass builder instance. /// Using these callbacks, callers can parse both a single pass name, as well /// as entire sub-pipelines, and populate the PassManager instance @@ -841,6 +857,11 @@ class PassBuilder { // Callbacks to parse `filter` parameter in register allocation passes SmallVector, 2> RegClassFilterParsingCallbacks; + // Verifier callbacks + SmallVector, 2> + VerifierCallbacks; + SmallVector, 2> + FnVerifierCallbacks; }; /// This utility template takes care of adding require<> and invalidate<> diff --git a/llvm/include/llvm/Passes/TargetPassRegistry.inc b/llvm/include/llvm/Passes/TargetPassRegistry.inc index 521913cb25a4a..2d04b874cf360 100644 --- a/llvm/include/llvm/Passes/TargetPassRegistry.inc +++ b/llvm/include/llvm/Passes/TargetPassRegistry.inc @@ -151,6 +151,18 @@ PB.registerPipelineParsingCallback([=](StringRef Name, FunctionPassManager &PM, return false; }); +PB.registerVerifierCallback([](ModulePassManager &PM) { +#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) PM.addPass(CREATE_PASS) +#include GET_PASS_REGISTRY +#undef VERIFIER_MODULE_ANALYSIS + return false; +}, [](FunctionPassManager &FPM) { +#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) FPM.addPass(CREATE_PASS) +#include GET_PASS_REGISTRY +#undef VERIFIER_FUNCTION_ANALYSIS + return false; +}); + #undef ADD_PASS #undef ADD_PASS_WITH_PARAMS diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index d8a3fda4f87dc..b6a7412e8c1ef 100644 --- 
a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -39,14 +39,17 @@ class AMDGPUTargetVerify : public TargetVerify { public: Module *Mod; - DominatorTree *DT; - PostDominatorTree *PDT; - UniformityInfo *UA; + DominatorTree *DT = nullptr; + PostDominatorTree *PDT = nullptr; + UniformityInfo *UA = nullptr; + + AMDGPUTargetVerify(Module *Mod) + : TargetVerify(Mod), Mod(Mod) {} AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} - void run(Function &F); + bool run(Function &F); }; } // namespace llvm diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index e7057d9a6b625..e942fed8b6a72 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -582,6 +582,13 @@ void PassBuilder::registerLoopAnalyses(LoopAnalysisManager &LAM) { C(LAM); } +void PassBuilder::registerVerifierPasses(ModulePassManager &MPM, FunctionPassManager &FPM) { + for (auto &C : VerifierCallbacks) + C(MPM); + for (auto &C : FnVerifierCallbacks) + C(FPM); +} + static std::optional> parseFunctionPipelineName(StringRef Name) { std::pair Params; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def index 98a1147ef6d66..41e6a399c7239 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def +++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def @@ -81,6 +81,17 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA()) #undef FUNCTION_ALIAS_ANALYSIS #undef FUNCTION_ANALYSIS +#ifndef VERIFIER_MODULE_ANALYSIS +#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) +#endif +#ifndef VERIFIER_FUNCTION_ANALYSIS +#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) +#endif +VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass()) +VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass()) +#undef VERIFIER_MODULE_ANALYSIS 
+#undef VERIFIER_FUNCTION_ANALYSIS + #ifndef FUNCTION_PASS_WITH_PARAMS #define FUNCTION_PASS_WITH_PARAMS(NAME, CLASS, CREATE_PASS, PARSER, PARAMS) #endif diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 684ced5bba574..63a7526b9abdc 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -5,6 +5,7 @@ #include "llvm/Support/Debug.h" #include "llvm/IR/Dominators.h" #include "llvm/IR/Function.h" +#include "llvm/InitializePasses.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/IntrinsicsAMDGPU.h" #include "llvm/IR/Module.h" @@ -19,7 +20,7 @@ using namespace llvm; do { \ if (!(C)) { \ TargetVerify::CheckFailed(__VA_ARGS__); \ - return; \ + return false; \ } \ } while (false) @@ -64,7 +65,7 @@ static bool isShader(CallingConv::ID CC) { } } -void AMDGPUTargetVerify::run(Function &F) { +bool AMDGPUTargetVerify::run(Function &F) { // Ensure shader calling convention returns void if (isShader(F.getCallingConv())) Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void"); @@ -99,6 +100,10 @@ void AMDGPUTargetVerify::run(Function &F) { } } } + + if (!MessagesStr.str().empty()) + return false; + return true; } PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { @@ -120,4 +125,51 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan return PreservedAnalyses::all(); } + +struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { + static char ID; + + std::unique_ptr TV; + bool FatalErrors = true; + + AMDGPUTargetVerifierLegacyPass() : FunctionPass(ID) { + initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + } + AMDGPUTargetVerifierLegacyPass(bool FatalErrors) + : FunctionPass(ID), + FatalErrors(FatalErrors) { + initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + } + + bool 
doInitialization(Module &M) override { + TV = std::make_unique(&M); + return false; + } + + bool runOnFunction(Function &F) override { + if (TV->run(F) && FatalErrors) { + errs() << "in function " << F.getName() << '\n'; + report_fatal_error("Broken function found, compilation aborted!"); + } + return false; + } + + bool doFinalization(Module &M) override { + bool IsValid = true; + for (Function &F : M) + if (F.isDeclaration()) + IsValid &= TV->run(F); + + //IsValid &= TV->run(); + if (FatalErrors && !IsValid) + report_fatal_error("Broken module found, compilation aborted!"); + return false; + } + + void getAnalysisUsage(AnalysisUsage &AU) const override { + AU.setPreservesAll(); + } +}; +char AMDGPUTargetVerifierLegacyPass::ID = 0; } // namespace llvm +INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverify", "AMDGPU Target Verifier", false, false) diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 627bc51ef3a43..503db7b1f8d18 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -144,6 +144,7 @@ int main(int argc, char **argv) { PB.registerCGSCCAnalyses(CGAM); PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); + //PB.registerVerifierPasses(MPM, FPM); PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); >From ec3276b182f3a758a24024291772efe435485857 Mon Sep 17 00:00:00 2001 From: jofernau Date: Tue, 22 Apr 2025 14:57:31 -0400 Subject: [PATCH 12/28] Remove leftovers. Add titles. Add call to registerVerifierCallbacks in llc. 
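One change in this patch flips the legacy pass condition from TV->run(F) to !TV->run(F). Since run() now returns true when the function is valid, the fatal path should be taken only on a false result. A minimal standalone sketch of that convention, together with a Check-style macro that returns false on the first failure, is below. It is not code from the patch. ToyVerify, TOY_CHECK and shouldAbort are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical verifier that accumulates messages like MessagesStr does.
struct ToyVerify {
  std::ostringstream Messages;

  void checkFailed(const std::string &Msg) { Messages << Msg << '\n'; }

  // Returns true when the (toy) function passes every check.
  bool run(int ReturnBits, bool IsShader);
};

// Record the message and leave the enclosing bool function with "invalid".
#define TOY_CHECK(C, MSG)                                                      \
  do {                                                                         \
    if (!(C)) {                                                                \
      checkFailed(MSG);                                                        \
      return false;                                                            \
    }                                                                          \
  } while (false)

bool ToyVerify::run(int ReturnBits, bool IsShader) {
  // A return width of 0 stands in for a void return type.
  TOY_CHECK(!IsShader || ReturnBits == 0, "Shaders must return void");
  return Messages.str().empty();
}

// Caller side: abort only when verification failed and errors are fatal.
inline bool shouldAbort(ToyVerify &TV, int ReturnBits, bool IsShader,
                        bool FatalErrors) {
  return !TV.run(ReturnBits, IsShader) && FatalErrors;
}
```

With the condition written the other way round, a valid function would trigger the abort and an invalid one would pass silently.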
--- llvm/lib/Passes/CMakeLists.txt | 2 +- .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 4 --- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 35 +++++++++++-------- llvm/lib/Target/TargetVerifier.cpp | 19 ++++++++++ llvm/tools/llc/NewPMDriver.cpp | 6 ++-- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 7 ++-- 6 files changed, 45 insertions(+), 28 deletions(-) diff --git a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt index f171377a8b270..9c348cb89a8c5 100644 --- a/llvm/lib/Passes/CMakeLists.txt +++ b/llvm/lib/Passes/CMakeLists.txt @@ -29,7 +29,7 @@ add_llvm_component_library(LLVMPasses Scalar Support Target - TargetParser + #TargetParser TransformUtils Vectorize Instrumentation diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 6ec34d6a0fdbf..257cc724b3da9 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -1299,8 +1299,6 @@ void AMDGPUPassConfig::addIRPasses() { addPass(createLICMPass()); } - //addPass(AMDGPUTargetVerifierPass()); - TargetPassConfig::addIRPasses(); // EarlyCSE is not always strong enough to clean up what LSR produces. For @@ -2043,8 +2041,6 @@ void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { // but EarlyCSE can do neither of them. if (isPassEnabled(EnableScalarIRPasses)) addEarlyCSEOrGVNPass(addPass); - - addPass(AMDGPUTargetVerifierPass()); } void AMDGPUCodeGenPassBuilder::addCodeGenPrepare(AddIRPass &addPass) const { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 63a7526b9abdc..0eecedaebc7ce 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -1,3 +1,22 @@ +//===-- AMDGPUTargetVerifier.cpp - AMDGPU -------------------------*- C++ -*-===// +//// +//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 
+//// See https://llvm.org/LICENSE.txt for license information. +//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +//// +////===----------------------------------------------------------------------===// +//// +//// This file defines target verifier interfaces that can be used for some +//// validation of input to the system, and for checking that transformations +//// haven't done something bad. In contrast to the Verifier or Lint, the +//// TargetVerifier looks for constructions invalid to a particular target +//// machine. +//// +//// To see what specifically is checked, look at an individual backend's +//// TargetVerifier. +//// +////===----------------------------------------------------------------------===// + #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "llvm/Analysis/UniformityAnalysis.h" @@ -25,19 +44,6 @@ using namespace llvm; } while (false) namespace llvm { -/*class AMDGPUTargetVerify : public TargetVerify { -public: - Module *Mod; - - DominatorTree *DT; - PostDominatorTree *PDT; - UniformityInfo *UA; - - AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) - : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} - - void run(Function &F); -};*/ static bool IsValidInt(const Type *Ty) { return Ty->isIntegerTy(1) || @@ -147,7 +153,7 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { } bool runOnFunction(Function &F) override { - if (TV->run(F) && FatalErrors) { + if (!TV->run(F) && FatalErrors) { errs() << "in function " << F.getName() << '\n'; report_fatal_error("Broken function found, compilation aborted!"); } @@ -160,7 +166,6 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { if (F.isDeclaration()) IsValid &= TV->run(F); - //IsValid &= TV->run(); if (FatalErrors && !IsValid) report_fatal_error("Broken module found, compilation aborted!"); return false; diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp index 
de3ff749e7c3c..992a0c91d93b1 100644 --- a/llvm/lib/Target/TargetVerifier.cpp +++ b/llvm/lib/Target/TargetVerifier.cpp @@ -1,3 +1,22 @@ +//===-- TargetVerifier.cpp - LLVM IR Target Verifier ----------------*- C++ -*-===// +//// +///// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +///// See https://llvm.org/LICENSE.txt for license information. +///// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +///// +/////===----------------------------------------------------------------------===// +///// +///// This file defines target verifier interfaces that can be used for some +///// validation of input to the system, and for checking that transformations +///// haven't done something bad. In contrast to the Verifier or Lint, the +///// TargetVerifier looks for constructions invalid to a particular target +///// machine. +///// +///// To see what specifically is checked, look at TargetVerifier.cpp or an +///// individual backend's TargetVerifier. +///// +/////===----------------------------------------------------------------------===// + #include "llvm/Target/TargetVerifier.h" #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index a060d16e74958..a8f6b999af06e 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -114,6 +114,8 @@ int llvm::compileModuleWithNewPM( VK == VerifierKind::EachPass); registerCodeGenCallback(PIC, *Target); + ModulePassManager MPM; + FunctionPassManager FPM; MachineFunctionAnalysisManager MFAM; LoopAnalysisManager LAM; FunctionAnalysisManager FAM; @@ -125,15 +127,13 @@ int llvm::compileModuleWithNewPM( PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); PB.registerMachineFunctionAnalyses(MFAM); + PB.registerVerifierPasses(MPM, FPM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); SI.registerCallbacks(PIC, &MAM, &FAM); FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); 
}); MAM.registerPass([&] { return MachineModuleAnalysis(MMI); }); - ModulePassManager MPM; - FunctionPassManager FPM; - if (!PassPipeline.empty()) { // Construct a custom pass pipeline that starts after instruction // selection. diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 503db7b1f8d18..b00bab66c6c3e 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -1,4 +1,4 @@ -//===--- llvm-isel-fuzzer.cpp - Fuzzer for instruction selection ----------===// +//===--- llvm-tgt-verify.cpp - Target Verifier ----------------- ----------===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -6,7 +6,7 @@ // //===----------------------------------------------------------------------===// // -// Tool to fuzz instruction selection using libFuzzer. +// Tool to verify a target. // //===----------------------------------------------------------------------===// @@ -144,14 +144,11 @@ int main(int argc, char **argv) { PB.registerCGSCCAnalyses(CGAM); PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); - //PB.registerVerifierPasses(MPM, FPM); PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); SI.registerCallbacks(PIC, &MAM, &FAM); - //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); - Triple TT(M->getTargetTriple()); if (!NoLint) FPM.addPass(LintPass(false)); >From 4f00c83f58a86a0adc26b621cc53e8b568b8c8e0 Mon Sep 17 00:00:00 2001 From: jofernau Date: Thu, 24 Apr 2025 16:02:21 -0400 Subject: [PATCH 13/28] Add pass to legacy PM. 
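The legacy pass added in this patch has three hooks: doInitialization creates the verifier, runOnFunction checks each function it is given, and doFinalization checks declarations. A minimal standalone sketch of that lifecycle is below. It is not code from the patch. All Toy* names are hypothetical, exceptions stand in for report_fatal_error, and the driver assumes that the legacy function pass manager does not call runOnFunction on declarations.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for Function and Module.
struct ToyFn {
  std::string Name;
  bool IsDeclaration;
  bool Broken;
};
using ToyModule = std::vector<ToyFn>;

// Hypothetical verifier: remembers what it visited, returns true if valid.
struct ToyVerifier {
  std::vector<std::string> Visited;
  bool run(const ToyFn &F) {
    Visited.push_back(F.Name);
    return !F.Broken;
  }
};

struct ToyLegacyVerifierPass {
  std::unique_ptr<ToyVerifier> TV;

  bool doInitialization(const ToyModule &) {
    TV = std::make_unique<ToyVerifier>();
    return false; // the module is never modified
  }

  bool runOnFunction(const ToyFn &F) {
    if (!TV->run(F))
      throw std::runtime_error("broken function found: " + F.Name);
    return false;
  }

  bool doFinalization(const ToyModule &M) {
    bool IsValid = true;
    for (const ToyFn &F : M)
      if (F.IsDeclaration)
        IsValid &= TV->run(F);
    if (!IsValid)
      throw std::runtime_error("broken module found");
    return false;
  }
};

// Drives the hooks in the order assumed above and returns the visit order.
inline std::vector<std::string> runToyPass(const ToyModule &M) {
  ToyLegacyVerifierPass P;
  P.doInitialization(M);
  for (const ToyFn &F : M)
    if (!F.IsDeclaration)
      P.runOnFunction(F);
  P.doFinalization(M);
  return P.TV->Visited;
}
```

Under these assumptions definitions are verified first, one by one, and declarations are picked up at the end of the module.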
--- llvm/include/llvm/CodeGen/Passes.h | 2 + llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Target/TargetVerifier.h | 6 +- llvm/lib/Passes/StandardInstrumentations.cpp | 4 +- llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 2 +- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 45 ---------- llvm/lib/Target/TargetVerifier.cpp | 87 ++++++++++++++++++- llvm/tools/llc/NewPMDriver.cpp | 6 +- llvm/tools/llc/llc.cpp | 4 + .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 1 + 10 files changed, 106 insertions(+), 53 deletions(-) diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index d214ab9306c2f..b293315e11c17 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -617,6 +617,8 @@ namespace llvm { /// Lowers KCFI operand bundles for indirect calls. FunctionPass *createKCFIPass(); + + FunctionPass *createTargetVerifierLegacyPass(); } // End llvm namespace #endif diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index ae398db3dc1da..3f9ffc4efd9ec 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -307,6 +307,7 @@ void initializeTailDuplicateLegacyPass(PassRegistry &); void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &); void initializeTargetPassConfigPass(PassRegistry &); void initializeTargetTransformInfoWrapperPassPass(PassRegistry &); +void initializeTargetVerifierLegacyPassPass(PassRegistry &); void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &); void initializeTypeBasedAAWrapperPassPass(PassRegistry &); void initializeTypePromotionLegacyPass(PassRegistry &); @@ -317,7 +318,6 @@ void initializeUnpackMachineBundlesPass(PassRegistry &); void initializeUnreachableBlockElimLegacyPassPass(PassRegistry &); void initializeUnreachableMachineBlockElimLegacyPass(PassRegistry &); void initializeVerifierLegacyPassPass(PassRegistry &); -void 
initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); void initializeVirtRegMapWrapperLegacyPass(PassRegistry &); void initializeVirtRegRewriterPass(PassRegistry &); void initializeWasmEHPreparePass(PassRegistry &); diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index fe683311b901c..23ef2e0b8d4ef 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -30,7 +30,7 @@ class Function; class TargetVerifierPass : public PassInfoMixin { public: - PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {} + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); }; class TargetVerify { @@ -76,8 +76,8 @@ class TargetVerify { : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} - void run(Function &F) {}; - void run(Function &F, FunctionAnalysisManager &AM); + bool run(Function &F); + bool run(Function &F, FunctionAnalysisManager &AM); }; } // namespace llvm diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index 879d657c87695..f125b3daffd5e 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -62,6 +62,8 @@ static cl::opt VerifyAnalysisInvalidation("verify-analysis-invalidation", #endif ); +static cl::opt VerifyTargetEach("verify-tgt-each"); + // An option that supports the -print-changed option. See // the description for -print-changed for an explanation of the use // of this option. Note that this option has no effect without -print-changed. 
@@ -1476,7 +1478,7 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC,
                                      "\"{0}\", compilation aborted!",
                                      P));
-        if (FAM) {
+        if (VerifyTargetEach && FAM) {
           TargetVerify TV(const_cast(F->getParent()));
           TV.run(*const_cast(F), *FAM);
           if (!TV.IsValid)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index 41e6a399c7239..73f9c60cf588c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -88,7 +88,7 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA())
 #define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS)
 #endif
 VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass())
-VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass())
+VERIFIER_FUNCTION_ANALYSIS("tgtverifier", TargetVerifierPass())
 #undef VERIFIER_MODULE_ANALYSIS
 #undef VERIFIER_FUNCTION_ANALYSIS
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 0eecedaebc7ce..96bcaaf6f2ac9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -132,49 +132,4 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan
   return PreservedAnalyses::all();
 }
-struct AMDGPUTargetVerifierLegacyPass : public FunctionPass {
-  static char ID;
-
-  std::unique_ptr TV;
-  bool FatalErrors = true;
-
-  AMDGPUTargetVerifierLegacyPass() : FunctionPass(ID) {
-    initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
-  }
-  AMDGPUTargetVerifierLegacyPass(bool FatalErrors)
-      : FunctionPass(ID),
-        FatalErrors(FatalErrors) {
-    initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
-  }
-
-  bool doInitialization(Module &M) override {
-    TV = std::make_unique(&M);
-    return false;
-  }
-
-  bool runOnFunction(Function &F) override {
-    if (!TV->run(F) && FatalErrors) {
-      errs() << "in function " << F.getName() << '\n';
-      report_fatal_error("Broken function found, compilation aborted!");
-    }
-    return false;
-  }
-
-  bool doFinalization(Module &M) override {
-    bool IsValid = true;
-    for (Function &F : M)
-      if (F.isDeclaration())
-        IsValid &= TV->run(F);
-
-    if (FatalErrors && !IsValid)
-      report_fatal_error("Broken module found, compilation aborted!");
-    return false;
-  }
-
-  void getAnalysisUsage(AnalysisUsage &AU) const override {
-    AU.setPreservesAll();
-  }
-};
-char AMDGPUTargetVerifierLegacyPass::ID = 0;
 } // namespace llvm
-INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverify", "AMDGPU Target Verifier", false, false)
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 992a0c91d93b1..170fc4769c1d8 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -20,6 +20,7 @@
 #include "llvm/Target/TargetVerifier.h"
 #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
+#include "llvm/InitializePasses.h"
 #include "llvm/Analysis/UniformityAnalysis.h"
 #include "llvm/Analysis/PostDominators.h"
 #include "llvm/Support/Debug.h"
@@ -32,7 +33,22 @@
 namespace llvm {
-void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
+bool TargetVerify::run(Function &F) {
+  if (TT.isAMDGPU()) {
+    AMDGPUTargetVerify TV(Mod);
+    TV.run(F);
+
+    dbgs() << TV.MessagesStr.str();
+    if (!TV.MessagesStr.str().empty()) {
+      TV.IsValid = false;
+      return false;
+    }
+    return true;
+  }
+  report_fatal_error("Target has no verification method\n");
+}
+
+bool TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
   if (TT.isAMDGPU()) {
     auto *UA = &AM.getResult(F);
     auto *DT = &AM.getResult(F);
@@ -44,8 +60,77 @@ void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
     dbgs() << TV.MessagesStr.str();
     if (!TV.MessagesStr.str().empty()) {
       TV.IsValid = false;
+      return false;
+    }
+    return true;
+  }
+  report_fatal_error("Target has no verification method\n");
+}
+
+PreservedAnalyses TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
+  auto TT = F.getParent()->getTargetTriple();
+
+  if (TT.isAMDGPU()) {
+    auto *Mod = F.getParent();
+
+    auto UA = &AM.getResult(F);
+    auto *DT = &AM.getResult(F);
+    auto *PDT = &AM.getResult(F);
+
+    AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
+    TV.run(F);
+
+    dbgs() << TV.MessagesStr.str();
+    if (!TV.MessagesStr.str().empty()) {
+      TV.IsValid = false;
+      return PreservedAnalyses::none();
     }
+    return PreservedAnalyses::all();
   }
+  report_fatal_error("Target has no verification method\n");
 }
+struct TargetVerifierLegacyPass : public FunctionPass {
+  static char ID;
+
+  std::unique_ptr TV;
+
+  TargetVerifierLegacyPass() : FunctionPass(ID) {
+    initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
+  }
+
+  bool doInitialization(Module &M) override {
+    TV = std::make_unique(&M);
+    return false;
+  }
+
+  bool runOnFunction(Function &F) override {
+    if (!TV->run(F)) {
+      errs() << "in function " << F.getName() << '\n';
+      report_fatal_error("broken function found, compilation aborted!");
+    }
+    return false;
+  }
+
+  bool doFinalization(Module &M) override {
+    bool IsValid = true;
+    for (Function &F : M)
+      if (F.isDeclaration())
+        IsValid &= TV->run(F);
+
+    if (!IsValid)
+      report_fatal_error("broken module found, compilation aborted!");
+    return false;
+  }
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.setPreservesAll();
+  }
+};
+char TargetVerifierLegacyPass::ID = 0;
+FunctionPass *createTargetVerifierLegacyPass() {
+  return new TargetVerifierLegacyPass();
+}
 } // namespace llvm
+using namespace llvm;
+INITIALIZE_PASS(TargetVerifierLegacyPass, "tgtverifier", "Target Verifier", false, false)
diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp
index a8f6b999af06e..4b95977a10c5f 100644
--- a/llvm/tools/llc/NewPMDriver.cpp
+++ b/llvm/tools/llc/NewPMDriver.cpp
@@ -57,6 +57,9 @@ static cl::opt
     DebugPM("debug-pass-manager", cl::Hidden,
             cl::desc("Print pass management debugging information"));
+static cl::opt VerifyTarget("verify-tgt-new-pm",
+                            cl::desc("Verify the target"));
+
 bool LLCDiagnosticHandler::handleDiagnostics(const DiagnosticInfo &DI) {
   DiagnosticHandler::handleDiagnostics(DI);
   if (DI.getKind() == llvm::DK_SrcMgr) {
@@ -127,7 +130,8 @@ int llvm::compileModuleWithNewPM(
   PB.registerFunctionAnalyses(FAM);
   PB.registerLoopAnalyses(LAM);
   PB.registerMachineFunctionAnalyses(MFAM);
-  PB.registerVerifierPasses(MPM, FPM);
+  if (VerifyTarget)
+    PB.registerVerifierPasses(MPM, FPM);
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
   SI.registerCallbacks(PIC, &MAM, &FAM);
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 140459ba2de21..1fd8a9f9cd9f8 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -209,6 +209,8 @@ static cl::opt PassPipeline(
 static cl::alias PassPipeline2("p", cl::aliasopt(PassPipeline),
                                cl::desc("Alias for -passes"));
+static cl::opt VerifyTarget("verify-tgt", cl::desc("Verify the target"));
+
 namespace {
 std::vector &getRunPassNames() {
@@ -658,6 +660,8 @@ static int compileModule(char **argv, LLVMContext &Context) {
   // Build up all of the passes that we want to do to the module.
   legacy::PassManager PM;
+  if (VerifyTarget)
+    PM.add(createTargetVerifierLegacyPass());
   PM.add(new TargetLibraryInfoWrapperPass(TLII));
   {
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
index b00bab66c6c3e..b86c2318b45b7 100644
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
@@ -141,6 +141,7 @@ int main(int argc, char **argv) {
   ModuleAnalysisManager MAM;
   PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC);
   PB.registerModuleAnalyses(MAM);
+  //PB.registerVerifierPasses(MPM, FPM);
   PB.registerCGSCCAnalyses(CGAM);
   PB.registerFunctionAnalyses(FAM);
   PB.registerLoopAnalyses(LAM);
>From 3013fc91155a7d84c73ac820fe6bc24c47dad38d Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 26 Apr 2025 00:13:42 -0400
Subject: [PATCH 14/28] Add fam in other projects.
---
 flang/lib/Frontend/FrontendActions.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter4/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter5/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter6/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter7/toy.cpp | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index c1f47b12abee2..7c48e35ff68cf 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -911,7 +911,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   std::optional pgoOpt;
   llvm::StandardInstrumentations si(llvmModule->getContext(),
                                     opts.DebugPassManager);
-  si.registerCallbacks(pic, &mam);
+  si.registerCallbacks(pic, &mam, &fam);
   if (ci.isTimingEnabled())
     si.getTimePasses().setOutStream(ci.getTimingStreamLLVM());
   pto.LoopUnrolling = opts.UnrollLoops;
diff --git a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
index 0f58391c50667..f9664025f61f1 100644
--- a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
@@ -577,7 +577,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
index 7117eaf4982b0..eae06d9f57467 100644
--- a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
@@ -851,7 +851,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
index cb7b6cc8651c1..30ad79ef2fc58 100644
--- a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
@@ -970,7 +970,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
index 91b7191a07c6f..4a39bc33c5591 100644
--- a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
@@ -1139,7 +1139,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
   // Add transform passes.
   // Promote allocas to registers.
>From 8745cd135bd27559429f158fc0d678a210af7292 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 26 Apr 2025 02:30:40 -0400
Subject: [PATCH 15/28] Avoid fatal errors in llc.
---
 llvm/include/llvm/CodeGen/Passes.h | 2 +-
 llvm/lib/Target/TargetVerifier.cpp | 18 +++++++++++++-----
 .../test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll | 2 +-
 .../test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll | 2 +-
 llvm/tools/llc/llc.cpp | 2 +-
 5 files changed, 17 insertions(+), 9 deletions(-)
diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index b293315e11c17..8d88d858c57ad 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -618,7 +618,7 @@ namespace llvm {
   /// Lowers KCFI operand bundles for indirect calls.
   FunctionPass *createKCFIPass();
-  FunctionPass *createTargetVerifierLegacyPass();
+  FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors);
 } // End llvm namespace
 #endif
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 170fc4769c1d8..3be50f4ef6da3 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -94,8 +94,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   static char ID;
   std::unique_ptr TV;
+  bool FatalErrors = false;
-  TargetVerifierLegacyPass() : FunctionPass(ID) {
+  TargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID),
+      FatalErrors(FatalErrors) {
     initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
   }
@@ -107,7 +109,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   bool runOnFunction(Function &F) override {
     if (!TV->run(F)) {
       errs() << "in function " << F.getName() << '\n';
-      report_fatal_error("broken function found, compilation aborted!");
+      if (FatalErrors)
+        report_fatal_error("broken function found, compilation aborted!");
+      else
+        errs() << "broken function found, compilation aborted!\n";
     }
     return false;
   }
@@ -119,7 +124,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
         IsValid &= TV->run(F);
     if (!IsValid)
-      report_fatal_error("broken module found, compilation aborted!");
+      if (FatalErrors)
+        report_fatal_error("broken module found, compilation aborted!");
+      else
+        errs() << "broken module found, compilation aborted!\n";
     return false;
   }
@@ -128,8 +136,8 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   }
 };
 char TargetVerifierLegacyPass::ID = 0;
-FunctionPass *createTargetVerifierLegacyPass() {
-  return new TargetVerifierLegacyPass();
+FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors) {
+  return new TargetVerifierLegacyPass(FatalErrors);
 }
 } // namespace llvm
 using namespace llvm;
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
index c5e59d4a2369e..e2d9edda5d008 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm -o - < %s 2>&1 | FileCheck %s
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-tgt -o - < %s 2>&1 | FileCheck %s
 define amdgpu_cs i32 @nonvoid_shader() {
 ; CHECK: Shaders must return void
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
index 8a503b7624a73..a2dab0ff47924 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm %s -o - 2>&1 | FileCheck %s --allow-empty
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-tgt %s -o - 2>&1 | FileCheck %s --allow-empty
 define amdgpu_cs void @void_shader() {
 ; CHECK-NOT: Shaders must return void
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 1fd8a9f9cd9f8..329d95826551f 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -661,7 +661,7 @@ static int compileModule(char **argv, LLVMContext &Context) {
   // Build up all of the passes that we want to do to the module.
   legacy::PassManager PM;
   if (VerifyTarget)
-    PM.add(createTargetVerifierLegacyPass());
+    PM.add(createTargetVerifierLegacyPass(false));
   PM.add(new TargetLibraryInfoWrapperPass(TLII));
   {
>From c7bf730193e39bf838a29de7617d31a900bbc576 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 26 Apr 2025 03:40:47 -0400
Subject: [PATCH 16/28] Add tool to build/test.
---
 llvm/test/CMakeLists.txt | 1 +
 llvm/test/lit.cfg.py | 1 +
 llvm/utils/gn/secondary/llvm/test/BUILD.gn | 1 +
 .../llvm/tools/llvm-tgt-verify/BUILD.gn | 25 +++++++++++++++++++
 4 files changed, 28 insertions(+)
 create mode 100644 llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn
diff --git a/llvm/test/CMakeLists.txt b/llvm/test/CMakeLists.txt
index 66849002eb470..10ca9300e7c66 100644
--- a/llvm/test/CMakeLists.txt
+++ b/llvm/test/CMakeLists.txt
@@ -135,6 +135,7 @@ set(LLVM_TEST_DEPENDS
           llvm-strip
           llvm-symbolizer
           llvm-tblgen
+          llvm-tgt-verify
           llvm-readtapi
           llvm-tli-checker
           llvm-undname
diff --git a/llvm/test/lit.cfg.py b/llvm/test/lit.cfg.py
index aad7a088551b2..8620f2a7014b5 100644
--- a/llvm/test/lit.cfg.py
+++ b/llvm/test/lit.cfg.py
@@ -227,6 +227,7 @@ def get_asan_rtlib():
         "llvm-strings",
         "llvm-strip",
         "llvm-tblgen",
+        "llvm-tgt-verify",
         "llvm-readtapi",
         "llvm-undname",
         "llvm-windres",
diff --git a/llvm/utils/gn/secondary/llvm/test/BUILD.gn b/llvm/utils/gn/secondary/llvm/test/BUILD.gn
index 228642667b41d..157e7991c52a8 100644
--- a/llvm/utils/gn/secondary/llvm/test/BUILD.gn
+++ b/llvm/utils/gn/secondary/llvm/test/BUILD.gn
@@ -319,6 +319,7 @@ group("test") {
     "//llvm/tools/llvm-strings",
     "//llvm/tools/llvm-symbolizer:symlinks",
     "//llvm/tools/llvm-tli-checker",
+    "//llvm/tools/llvm-tgt-verify",
     "//llvm/tools/llvm-undname",
     "//llvm/tools/llvm-xray",
     "//llvm/tools/lto",
diff --git a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn
new file mode 100644
index 0000000000000..b751bafc5052c
--- /dev/null
+++ b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn
@@ -0,0 +1,25 @@
+import("//llvm/utils/TableGen/tablegen.gni")
+
+tgtverifier("llvm-tgt-verify") {
+  deps = [
+    "//llvm/lib/Analysis",
+    "//llvm/lib/AsmPrinter",
+    "//llvm/lib/CodeGen",
+    "//llvm/lib/CodeGenTypes",
+    "//llvm/lib/Core",
+    "//llvm/lib/IRPrinter",
+    "//llvm/lib/IRReader",
+    "//llvm/lib/MC",
+    "//llvm/lib/MIRParser",
+    "//llvm/lib/Passes",
+    "//llvm/lib/Remarks",
+    "//llvm/lib/ScalarOpts",
+    "//llvm/lib/SelectionDAG",
+    "//llvm/lib/Support",
+    "//llvm/lib/Target",
+    "//llvm/lib/TargetParser",
+    "//llvm/lib/TransformUtils",
+    "//llvm/lib/Vectorize",
+  ]
+  sources = [ "llvm-tgt-verify.cpp" ]
+}
>From c8dd3db3fe078f76e822a9646d3d7295fa23752a Mon Sep 17 00:00:00 2001
From: jofernau
Date: Mon, 28 Apr 2025 10:42:24 -0400
Subject: [PATCH 17/28] Cleanup of unrequired functions.
---
 llvm/include/llvm/Target/TargetVerifier.h | 1 -
 .../TargetVerify/AMDGPUTargetVerifier.h | 1 -
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 25 +++----------------
 llvm/lib/Target/TargetVerifier.cpp | 22 ++--------------
 4 files changed, 6 insertions(+), 43 deletions(-)
diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index 23ef2e0b8d4ef..427a05b2648a9 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -77,7 +77,6 @@ class TargetVerify {
         MessagesStr(Messages) {}
   bool run(Function &F);
-  bool run(Function &F, FunctionAnalysisManager &AM);
 };
 } // namespace llvm
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
index b6a7412e8c1ef..74e5b5f7a1efd 100644
--- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -32,7 +32,6 @@ class Function;
 class AMDGPUTargetVerifierPass : public TargetVerifierPass {
 public:
-  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
 };
 class AMDGPUTargetVerify : public TargetVerify {
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 96bcaaf6f2ac9..bda412f723242 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -107,29 +107,12 @@ bool AMDGPUTargetVerify::run(Function &F) {
     }
   }
-  if (!MessagesStr.str().empty())
+  //dbgs() << MessagesStr.str();
+  if (!MessagesStr.str().empty()) {
+    //IsValid = false;
     return false;
-  return true;
-}
-
-PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
-
-  auto *Mod = F.getParent();
-
-  auto UA = &AM.getResult(F);
-  auto *DT = &AM.getResult(F);
-  auto *PDT = &AM.getResult(F);
-
-  AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
-  TV.run(F);
-
-  dbgs() << TV.MessagesStr.str();
-  if (!TV.MessagesStr.str().empty()) {
-    TV.IsValid = false;
-    return PreservedAnalyses::none();
   }
-
-  return PreservedAnalyses::all();
+  return true;
 }
 } // namespace llvm
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 3be50f4ef6da3..6b57c18ff9316 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -48,25 +48,6 @@ bool TargetVerify::run(Function &F) {
   report_fatal_error("Target has no verification method\n");
 }
-bool TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
-  if (TT.isAMDGPU()) {
-    auto *UA = &AM.getResult(F);
-    auto *DT = &AM.getResult(F);
-    auto *PDT = &AM.getResult(F);
-
-    AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
-    TV.run(F);
-
-    dbgs() << TV.MessagesStr.str();
-    if (!TV.MessagesStr.str().empty()) {
-      TV.IsValid = false;
-      return false;
-    }
-    return true;
-  }
-  report_fatal_error("Target has no verification method\n");
-}
-
 PreservedAnalyses TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
   auto TT = F.getParent()->getTargetTriple();
@@ -123,11 +104,12 @@ struct TargetVerifierLegacyPass : public FunctionPass {
       if (F.isDeclaration())
         IsValid &= TV->run(F);
-    if (!IsValid)
+    if (!IsValid) {
       if (FatalErrors)
         report_fatal_error("broken module found, compilation aborted!");
       else
         errs() << "broken module found, compilation aborted!\n";
+    }
     return false;
   }
>From 2c12e6a6d7f9a1cb7bcebfb30ccdd0fe7b198727 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Mon, 28 Apr 2025 10:43:32 -0400
Subject: [PATCH 18/28] Make virtual.
---
 llvm/include/llvm/Target/TargetVerifier.h | 2 +-
 llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index 427a05b2648a9..ade2676a64325 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -76,7 +76,7 @@ class TargetVerify {
       : Mod(Mod), TT(Mod->getTargetTriple()),
         MessagesStr(Messages) {}
-  bool run(Function &F);
+  virtual bool run(Function &F);
 };
 } // namespace llvm
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
index 74e5b5f7a1efd..b97fbc046e391 100644
--- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -48,7 +48,7 @@ class AMDGPUTargetVerify : public TargetVerify {
   AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA)
       : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {}
-  bool run(Function &F);
+  bool run(Function &F) override;
 };
 } // namespace llvm
>From 3267b65e82da4cb7bc0f31f74c76f78d0445512f Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 10:56:43 -0400
Subject: [PATCH 19/28] Remove from legacy PM. Add to target dependent pipeline.
---
 llvm/include/llvm/CodeGen/Passes.h | 2 +-
 llvm/include/llvm/InitializePasses.h | 2 +-
 llvm/include/llvm/Target/TargetVerifier.h | 4 +-
 .../TargetVerify/AMDGPUTargetVerifier.h | 1 +
 llvm/lib/Passes/StandardInstrumentations.cpp | 10 +--
 llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 2 +-
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 74 ++++++++++++++-
 llvm/lib/Target/TargetVerifier.cpp | 90 -------------------
 llvm/tools/llc/llc.cpp | 2 -
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 2 -
 11 files changed, 85 insertions(+), 106 deletions(-)
diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index 8d88d858c57ad..da6ad3f612aa8 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -618,7 +618,7 @@ namespace llvm {
   /// Lowers KCFI operand bundles for indirect calls.
   FunctionPass *createKCFIPass();
-  FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors);
+  //FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors);
 } // End llvm namespace
 #endif
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index 3f9ffc4efd9ec..7d4fad2d87a16 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -307,7 +307,7 @@ void initializeTailDuplicateLegacyPass(PassRegistry &);
 void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &);
 void initializeTargetPassConfigPass(PassRegistry &);
 void initializeTargetTransformInfoWrapperPassPass(PassRegistry &);
-void initializeTargetVerifierLegacyPassPass(PassRegistry &);
+//void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &);
 void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &);
 void initializeTypeBasedAAWrapperPassPass(PassRegistry &);
 void initializeTypePromotionLegacyPass(PassRegistry &);
diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index ade2676a64325..1d12eb55bbf0a 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -30,7 +30,7 @@ class Function;
 class TargetVerifierPass : public PassInfoMixin {
 public:
-  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
+  virtual PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) = 0;
 };
 class TargetVerify {
@@ -76,7 +76,7 @@ class TargetVerify {
       : Mod(Mod), TT(Mod->getTargetTriple()),
         MessagesStr(Messages) {}
-  virtual bool run(Function &F);
+  virtual bool run(Function &F) = 0;
 };
 } // namespace llvm
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
index b97fbc046e391..49bcbc8849e3c 100644
--- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -32,6 +32,7 @@ class Function;
 class AMDGPUTargetVerifierPass : public TargetVerifierPass {
 public:
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) override;
 };
 class AMDGPUTargetVerify : public TargetVerify {
diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp
index f125b3daffd5e..076df47d5b15d 100644
--- a/llvm/lib/Passes/StandardInstrumentations.cpp
+++ b/llvm/lib/Passes/StandardInstrumentations.cpp
@@ -45,7 +45,7 @@
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Support/xxhash.h"
-#include "llvm/Target/TargetVerifier.h"
+//#include "llvm/Target/TargetVerifier.h"
 #include
 #include
 #include
@@ -1479,12 +1479,12 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC,
                                      P));
         if (VerifyTargetEach && FAM) {
-          TargetVerify TV(const_cast(F->getParent()));
-          TV.run(*const_cast(F), *FAM);
-          if (!TV.IsValid)
+          //TargetVerify TV(const_cast(F->getParent()));
+          //TV.run(*const_cast(F), *FAM);
+          /*if (!TV.IsValid)
             report_fatal_error(formatv("Broken function found after pass "
                                        "\"{0}\", compilation aborted!",
-                                       P));
+                                       P));*/
         }
       } else {
         const auto *M = unwrapIR(IR);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index 73f9c60cf588c..41e6a399c7239 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -88,7 +88,7 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA())
 #define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS)
 #endif
 VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass())
-VERIFIER_FUNCTION_ANALYSIS("tgtverifier", TargetVerifierPass())
+VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass())
 #undef VERIFIER_MODULE_ANALYSIS
 #undef VERIFIER_FUNCTION_ANALYSIS
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 257cc724b3da9..f1a60b8f33140 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1976,6 +1976,8 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder(
 }
 void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const {
+  addPass(AMDGPUTargetVerifierPass());
+
   if (RemoveIncompatibleFunctions && TM.getTargetTriple().isAMDGCN())
     addPass(AMDGPURemoveIncompatibleFunctionsPass(TM));
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index bda412f723242..cedd9ddc78011 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -107,12 +107,82 @@ bool AMDGPUTargetVerify::run(Function &F) {
     }
   }
-  //dbgs() << MessagesStr.str();
+  dbgs() << MessagesStr.str();
   if (!MessagesStr.str().empty()) {
-    //IsValid = false;
+    IsValid = false;
     return false;
   }
   return true;
 }
+PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
+  auto *Mod = F.getParent();
+
+  auto UA = &AM.getResult(F);
+  auto *DT = &AM.getResult(F);
+  auto *PDT = &AM.getResult(F);
+
+  AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
+  TV.run(F);
+
+  dbgs() << TV.MessagesStr.str();
+  if (!TV.MessagesStr.str().empty()) {
+    TV.IsValid = false;
+    return PreservedAnalyses::none();
+  }
+  return PreservedAnalyses::all();
+}
+
+/*
+struct AMDGPUTargetVerifierLegacyPass : public FunctionPass {
+  static char ID;
+
+  std::unique_ptr TV;
+  bool FatalErrors = false;
+
+  AMDGPUTargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID),
+      FatalErrors(FatalErrors) {
+    initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
+  }
+
+  bool doInitialization(Module &M) override {
+    TV = std::make_unique(&M);
+    return false;
+  }
+
+  bool runOnFunction(Function &F) override {
+    if (!TV->run(F)) {
+      errs() << "in function " << F.getName() << '\n';
+      if (FatalErrors)
+        report_fatal_error("broken function found, compilation aborted!");
+      else
+        errs() << "broken function found, compilation aborted!\n";
+    }
+    return false;
+  }
+
+  bool doFinalization(Module &M) override {
+    bool IsValid = true;
+    for (Function &F : M)
+      if (F.isDeclaration())
+        IsValid &= TV->run(F);
+
+    if (!IsValid) {
+      if (FatalErrors)
+        report_fatal_error("broken module found, compilation aborted!");
+      else
+        errs() << "broken module found, compilation aborted!\n";
+    }
+    return false;
+  }
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.setPreservesAll();
+  }
+};
+char AMDGPUTargetVerifierLegacyPass::ID = 0;
+FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) {
+  return new AMDGPUTargetVerifierLegacyPass(FatalErrors);
+}*/
 } // namespace llvm
+//INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false)
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 6b57c18ff9316..c63ae2a2c5daf 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -33,94 +33,4 @@
 namespace llvm {
-bool TargetVerify::run(Function &F) {
-  if (TT.isAMDGPU()) {
-    AMDGPUTargetVerify TV(Mod);
-    TV.run(F);
-
-    dbgs() << TV.MessagesStr.str();
-    if (!TV.MessagesStr.str().empty()) {
-      TV.IsValid = false;
-      return false;
-    }
-    return true;
-  }
-  report_fatal_error("Target has no verification method\n");
-}
-
-PreservedAnalyses TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
-  auto TT = F.getParent()->getTargetTriple();
-
-  if (TT.isAMDGPU()) {
-    auto *Mod = F.getParent();
-
-    auto UA = &AM.getResult(F);
-    auto *DT = &AM.getResult(F);
-    auto *PDT = &AM.getResult(F);
-
-    AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
-    TV.run(F);
-
-    dbgs() << TV.MessagesStr.str();
-    if (!TV.MessagesStr.str().empty()) {
-      TV.IsValid = false;
-      return PreservedAnalyses::none();
-    }
-    return PreservedAnalyses::all();
-  }
-  report_fatal_error("Target has no verification method\n");
-}
-
-struct TargetVerifierLegacyPass : public FunctionPass {
-  static char ID;
-
-  std::unique_ptr TV;
-  bool FatalErrors = false;
-
-  TargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID),
-      FatalErrors(FatalErrors) {
-    initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
-  }
-
-  bool doInitialization(Module &M) override {
-    TV = std::make_unique(&M);
-    return false;
-  }
-
-  bool runOnFunction(Function &F) override {
-    if (!TV->run(F)) {
-      errs() << "in function " << F.getName() << '\n';
-      if (FatalErrors)
-        report_fatal_error("broken function found, compilation aborted!");
-      else
-        errs() << "broken function found, compilation aborted!\n";
-    }
-    return false;
-  }
-
-  bool doFinalization(Module &M) override {
-    bool IsValid = true;
-    for (Function &F : M)
-      if (F.isDeclaration())
-        IsValid &= TV->run(F);
-
-    if (!IsValid) {
-      if (FatalErrors)
-        report_fatal_error("broken module found, compilation aborted!");
-      else
-        errs() << "broken module found, compilation aborted!\n";
-    }
-    return false;
-  }
-
-  void getAnalysisUsage(AnalysisUsage &AU) const override {
-    AU.setPreservesAll();
-  }
-};
-char TargetVerifierLegacyPass::ID = 0;
-FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors) {
-  return new TargetVerifierLegacyPass(FatalErrors);
-}
 } // namespace llvm
-using namespace llvm;
-INITIALIZE_PASS(TargetVerifierLegacyPass, "tgtverifier", "Target Verifier", false, false)
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 329d95826551f..2e9e4837fe467 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -660,8 +660,6 @@ static int compileModule(char **argv, LLVMContext &Context) {
   // Build up all of the passes that we want to do to the module.
   legacy::PassManager PM;
-  if (VerifyTarget)
-    PM.add(createTargetVerifierLegacyPass(false));
   PM.add(new TargetLibraryInfoWrapperPass(TLII));
   {
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
index b86c2318b45b7..d832dcdff4ad0 100644
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
@@ -158,8 +158,6 @@ int main(int argc, char **argv) {
   if (TT.isAMDGPU())
     FPM.addPass(AMDGPUTargetVerifierPass());
   else if (false) {} // ...
-  else
-    FPM.addPass(TargetVerifierPass());
   MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));
   auto PA = MPM.run(*M, MAM);
>From 6401b7517843a03ab114aaf333624ef914d5a5f3 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 11:18:50 -0400
Subject: [PATCH 20/28] Add back to legacy PM.
---
 llvm/include/llvm/CodeGen/Passes.h | 2 --
 llvm/include/llvm/InitializePasses.h | 1 -
 llvm/lib/Target/AMDGPU/AMDGPU.h | 3 +++
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 1 +
 llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 8 ++++----
 5 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index da6ad3f612aa8..d214ab9306c2f 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -617,8 +617,6 @@ namespace llvm {
   /// Lowers KCFI operand bundles for indirect calls.
   FunctionPass *createKCFIPass();
-
-  //FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors);
 } // End llvm namespace
 #endif
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index 7d4fad2d87a16..9bef8e496c57e 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -307,7 +307,6 @@ void initializeTailDuplicateLegacyPass(PassRegistry &);
 void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &);
 void initializeTargetPassConfigPass(PassRegistry &);
 void initializeTargetTransformInfoWrapperPassPass(PassRegistry &);
-//void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &);
 void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &);
 void initializeTypeBasedAAWrapperPassPass(PassRegistry &);
 void initializeTypePromotionLegacyPass(PassRegistry &);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index 4ff761ec19b3c..f69956ba44255 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -530,6 +530,9 @@ extern char &GCNRewritePartialRegUsesID;
 void initializeAMDGPUWaitSGPRHazardsLegacyPass(PassRegistry &);
 extern char &AMDGPUWaitSGPRHazardsLegacyID;
+FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors);
+void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &);
+
 namespace AMDGPU {
 enum
TargetIndex {
   TI_CONSTDATA_START,
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index f1a60b8f33140..42d6764eacda9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1377,6 +1377,7 @@ bool AMDGPUPassConfig::addGCPasses() {
 //===----------------------------------------------------------------------===//
 bool GCNPassConfig::addPreISel() {
+  addPass(createAMDGPUTargetVerifierLegacyPass(false));
   AMDGPUPassConfig::addPreISel();
   if (TM->getOptLevel() > CodeGenOptLevel::None)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index cedd9ddc78011..c4d303bee6ef8 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -17,6 +17,7 @@
 ////
 ////===----------------------------------------------------------------------===//
+#include "AMDGPU.h"
 #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
 #include "llvm/Analysis/UniformityAnalysis.h"
@@ -24,7 +25,7 @@
 #include "llvm/Support/Debug.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/Function.h"
-#include "llvm/InitializePasses.h"
+//#include "llvm/InitializePasses.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/IntrinsicsAMDGPU.h"
 #include "llvm/IR/Module.h"
@@ -133,7 +134,6 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan
   return PreservedAnalyses::all();
 }
-/*
 struct AMDGPUTargetVerifierLegacyPass : public FunctionPass {
   static char ID;
@@ -183,6 +183,6 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass {
 char AMDGPUTargetVerifierLegacyPass::ID = 0;
 FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) {
   return new AMDGPUTargetVerifierLegacyPass(FatalErrors);
-}*/
+}
 } // namespace llvm
-//INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false)
+INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false)

>From e2f0225db1439f7d8ee612ee4c4d37a4b44f96b6 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 14:04:10 -0400
Subject: [PATCH 21/28] Remove reference to FAM in registerCallbacks and VerifyEach for TargetVerify in instrumentation
---
 clang/lib/CodeGen/BackendUtil.cpp | 2 +-
 flang/lib/Frontend/FrontendActions.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter4/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter5/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter6/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter7/toy.cpp | 2 +-
 .../llvm/Passes/StandardInstrumentations.h | 6 ++----
 llvm/lib/LTO/LTOBackend.cpp | 2 +-
 llvm/lib/LTO/ThinLTOCodeGenerator.cpp | 2 +-
 llvm/lib/Passes/PassBuilderBindings.cpp | 2 +-
 llvm/lib/Passes/StandardInstrumentations.cpp | 21 ++++---------------
 llvm/tools/llc/NewPMDriver.cpp | 7 ++++---
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 2 +-
 llvm/tools/opt/NewPMDriver.cpp | 2 +-
 llvm/unittests/IR/PassManagerTest.cpp | 6 +++---
 15 files changed, 24 insertions(+), 38 deletions(-)
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 9a1c922f5ddef..f7eb853beb23c 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -922,7 +922,7 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
       TheModule->getContext(),
       (CodeGenOpts.DebugPassManager || DebugPassStructure),
       CodeGenOpts.VerifyEach, PrintPassOpts);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   PassBuilder PB(TM.get(), PTO, PGOOpt, &PIC);
   // Handle the assignment tracking feature options.
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 7c48e35ff68cf..c1f47b12abee2 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -911,7 +911,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   std::optional pgoOpt;
   llvm::StandardInstrumentations si(llvmModule->getContext(),
                                     opts.DebugPassManager);
-  si.registerCallbacks(pic, &mam, &fam);
+  si.registerCallbacks(pic, &mam);
   if (ci.isTimingEnabled())
     si.getTimePasses().setOutStream(ci.getTimingStreamLLVM());
   pto.LoopUnrolling = opts.UnrollLoops;
diff --git a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
index f9664025f61f1..0f58391c50667 100644
--- a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
@@ -577,7 +577,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
index eae06d9f57467..7117eaf4982b0 100644
--- a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
@@ -851,7 +851,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
index 30ad79ef2fc58..cb7b6cc8651c1 100644
--- a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
@@ -970,7 +970,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
index 4a39bc33c5591..91b7191a07c6f 100644
--- a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
@@ -1139,7 +1139,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
   // Add transform passes.
   // Promote allocas to registers.
diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h
index 988fcb93b2357..65934c93ba614 100644
--- a/llvm/include/llvm/Passes/StandardInstrumentations.h
+++ b/llvm/include/llvm/Passes/StandardInstrumentations.h
@@ -476,8 +476,7 @@ class VerifyInstrumentation {
 public:
   VerifyInstrumentation(bool DebugLogging) : DebugLogging(DebugLogging) {}
   void registerCallbacks(PassInstrumentationCallbacks &PIC,
-                         ModuleAnalysisManager *MAM,
-                         FunctionAnalysisManager *FAM);
+                         ModuleAnalysisManager *MAM);
 };
 /// This class implements --time-trace functionality for new pass manager.
@@ -622,8 +621,7 @@ class StandardInstrumentations {
   // Register all the standard instrumentation callbacks. If \p FAM is nullptr
   // then PreservedCFGChecker is not enabled.
   void registerCallbacks(PassInstrumentationCallbacks &PIC,
-                         ModuleAnalysisManager *MAM,
-                         FunctionAnalysisManager *FAM);
+                         ModuleAnalysisManager *MAM);
   TimePassesHandler &getTimePasses() { return TimePasses; }
 };
diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp
index 475e7cf45371b..1c764a0188eda 100644
--- a/llvm/lib/LTO/LTOBackend.cpp
+++ b/llvm/lib/LTO/LTOBackend.cpp
@@ -275,7 +275,7 @@ static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM,
   PassInstrumentationCallbacks PIC;
   StandardInstrumentations SI(Mod.getContext(), Conf.DebugPassManager,
                               Conf.VerifyEach);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   PassBuilder PB(TM, Conf.PTO, PGOOpt, &PIC);
   RegisterPassPlugins(Conf.PassPlugins, PB);
diff --git a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp
index 369b003df1364..9e7f8187fe49c 100644
--- a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp
+++ b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp
@@ -245,7 +245,7 @@ static void optimizeModule(Module &TheModule, TargetMachine &TM,
   PassInstrumentationCallbacks PIC;
   StandardInstrumentations SI(TheModule.getContext(), DebugPassManager);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   PipelineTuningOptions PTO;
   PTO.LoopVectorization = true;
   PTO.SLPVectorization = true;
diff --git a/llvm/lib/Passes/PassBuilderBindings.cpp b/llvm/lib/Passes/PassBuilderBindings.cpp
index f0e1abb8cebc4..933fe89e53a94 100644
--- a/llvm/lib/Passes/PassBuilderBindings.cpp
+++ b/llvm/lib/Passes/PassBuilderBindings.cpp
@@ -76,7 +76,7 @@ static LLVMErrorRef runPasses(Module *Mod, Function *Fun, const char *Passes,
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);
   StandardInstrumentations SI(Mod->getContext(), Debug, VerifyEach);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   // Run the pipeline.
   if (Fun) {
diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp
index 076df47d5b15d..dc1dd5d9c7f4c 100644
--- a/llvm/lib/Passes/StandardInstrumentations.cpp
+++ b/llvm/lib/Passes/StandardInstrumentations.cpp
@@ -45,7 +45,6 @@
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Support/xxhash.h"
-//#include "llvm/Target/TargetVerifier.h"
 #include
 #include
 #include
@@ -62,8 +61,6 @@ static cl::opt VerifyAnalysisInvalidation("verify-analysis-invalidation",
 #endif
 );
-static cl::opt VerifyTargetEach("verify-tgt-each");
-
 // An option that supports the -print-changed option. See
 // the description for -print-changed for an explanation of the use
 // of this option. Note that this option has no effect without -print-changed.
@@ -1457,10 +1454,9 @@ void PreservedCFGCheckerInstrumentation::registerCallbacks(
 }
 void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC,
-                                              ModuleAnalysisManager *MAM,
-                                              FunctionAnalysisManager *FAM) {
+                                              ModuleAnalysisManager *MAM) {
   PIC.registerAfterPassCallback(
-      [this, MAM, FAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) {
+      [this, MAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) {
         if (isIgnored(P) || P == "VerifierPass")
           return;
         const auto *F = unwrapIR(IR);
@@ -1477,15 +1473,6 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC,
             report_fatal_error(formatv("Broken function found after pass "
                                        "\"{0}\", compilation aborted!",
                                        P));
-
-          if (VerifyTargetEach && FAM) {
-            //TargetVerify TV(const_cast(F->getParent()));
-            //TV.run(*const_cast(F), *FAM);
-            /*if (!TV.IsValid)
-              report_fatal_error(formatv("Broken function found after pass "
-                                         "\"{0}\", compilation aborted!",
-                                         P));*/
-          }
         } else {
           const auto *M = unwrapIR(IR);
           if (!M) {
@@ -2525,7 +2512,7 @@ void PrintCrashIRInstrumentation::registerCallbacks(
 }
 void StandardInstrumentations::registerCallbacks(
-    PassInstrumentationCallbacks
&PIC, ModuleAnalysisManager *MAM, FunctionAnalysisManager *FAM) {
+    PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM) {
   PrintIR.registerCallbacks(PIC);
   PrintPass.registerCallbacks(PIC);
   TimePasses.registerCallbacks(PIC);
@@ -2534,7 +2521,7 @@ void StandardInstrumentations::registerCallbacks(
   PrintChangedIR.registerCallbacks(PIC);
   PseudoProbeVerification.registerCallbacks(PIC);
   if (VerifyEach)
-    Verify.registerCallbacks(PIC, MAM, FAM);
+    Verify.registerCallbacks(PIC, MAM);
   PrintChangedDiff.registerCallbacks(PIC);
   WebsiteChangeReporter.registerCallbacks(PIC);
   ChangeTester.registerCallbacks(PIC);
diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp
index 4b95977a10c5f..863a555798dab 100644
--- a/llvm/tools/llc/NewPMDriver.cpp
+++ b/llvm/tools/llc/NewPMDriver.cpp
@@ -117,8 +117,6 @@ int llvm::compileModuleWithNewPM(
                               VK == VerifierKind::EachPass);
   registerCodeGenCallback(PIC, *Target);
-  ModulePassManager MPM;
-  FunctionPassManager FPM;
   MachineFunctionAnalysisManager MFAM;
   LoopAnalysisManager LAM;
   FunctionAnalysisManager FAM;
@@ -133,11 +131,14 @@ int llvm::compileModuleWithNewPM(
   if (VerifyTarget)
     PB.registerVerifierPasses(MPM, FPM);
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); });
   MAM.registerPass([&] { return MachineModuleAnalysis(MMI); });
+  ModulePassManager MPM;
+  FunctionPassManager FPM;
+
   if (!PassPipeline.empty()) {
     // Construct a custom pass pipeline that starts after instruction
     // selection.
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
index d832dcdff4ad0..50f4e56bb6af6 100644
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
@@ -148,7 +148,7 @@ int main(int argc, char **argv) {
   PB.registerMachineFunctionAnalyses(MFAM);
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   Triple TT(M->getTargetTriple());
   if (!NoLint)
diff --git a/llvm/tools/opt/NewPMDriver.cpp b/llvm/tools/opt/NewPMDriver.cpp
index a8977d80bdf44..7d168a6ceb17c 100644
--- a/llvm/tools/opt/NewPMDriver.cpp
+++ b/llvm/tools/opt/NewPMDriver.cpp
@@ -423,7 +423,7 @@ bool llvm::runPassPipeline(
   PrintPassOpts.SkipAnalyses = DebugPM == DebugLogging::Quiet;
   StandardInstrumentations SI(M.getContext(), DebugPM != DebugLogging::None,
                               VK == VerifierKind::EachPass, PrintPassOpts);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   DebugifyEachInstrumentation Debugify;
   DebugifyStatsMap DIStatsMap;
   DebugInfoPerPass DebugInfoBeforePass;
diff --git a/llvm/unittests/IR/PassManagerTest.cpp b/llvm/unittests/IR/PassManagerTest.cpp
index bb4db6120035f..a6487169224c2 100644
--- a/llvm/unittests/IR/PassManagerTest.cpp
+++ b/llvm/unittests/IR/PassManagerTest.cpp
@@ -828,7 +828,7 @@ TEST_F(PassManagerTest, FunctionPassCFGChecker) {
   FunctionPassManager FPM;
   PassInstrumentationCallbacks PIC;
   StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });
   MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); });
   FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });
@@ -877,7 +877,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerInvalidateAnalysis) {
   FunctionPassManager FPM;
   PassInstrumentationCallbacks PIC;
   StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); });
   MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });
   FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });
@@ -945,7 +945,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerWrapped) {
   FunctionPassManager FPM;
   PassInstrumentationCallbacks PIC;
   StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); });
   MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });
   FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });

>From b43cec12bbfc6071d4a99e75aad4273bab4e3182 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 14:44:28 -0400
Subject: [PATCH 22/28] Remove references to registry
---
 llvm/include/llvm/Passes/PassBuilder.h | 21 -------------------
 .../llvm/Passes/StandardInstrumentations.h | 2 +-
 .../llvm/Passes/TargetPassRegistry.inc | 12 -----------
 llvm/lib/Passes/PassBuilder.cpp | 7 -------
 llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 11 ----------
 llvm/tools/llc/NewPMDriver.cpp | 5 -----
 llvm/tools/llc/llc.cpp | 2 --
 7 files changed, 1 insertion(+), 59 deletions(-)
diff --git a/llvm/include/llvm/Passes/PassBuilder.h b/llvm/include/llvm/Passes/PassBuilder.h
index 6000769ce723b..51ccaa53447d7 100644
--- a/llvm/include/llvm/Passes/PassBuilder.h
+++ b/llvm/include/llvm/Passes/PassBuilder.h
@@ -172,13 +172,6 @@ class PassBuilder {
   /// additional analyses.
   void registerLoopAnalyses(LoopAnalysisManager &LAM);
-  /// Registers all available verifier passes.
-  ///
-  /// This is an interface that can be used to populate a
-  /// \c ModuleAnalysisManager with all registered loop analyses.
Callers can
-  /// still manually register any additional analyses.
-  void registerVerifierPasses(ModulePassManager &PM, FunctionPassManager &);
-
   /// Registers all available machine function analysis passes.
   ///
   /// This is an interface that can be used to populate a \c
@@ -577,15 +570,6 @@ class PassBuilder {
   }
   /// @}}
-  /// Register a callback for parsing an Verifier Name to populate
-  /// the given managers.
-  void registerVerifierCallback(
-      const std::function &C,
-      const std::function &CF) {
-    VerifierCallbacks.push_back(C);
-    FnVerifierCallbacks.push_back(CF);
-  }
-
   /// {{@ Register pipeline parsing callbacks with this pass builder instance.
   /// Using these callbacks, callers can parse both a single pass name, as well
   /// as entire sub-pipelines, and populate the PassManager instance
@@ -857,11 +841,6 @@ class PassBuilder {
   // Callbacks to parse `filter` parameter in register allocation passes
   SmallVector, 2>
       RegClassFilterParsingCallbacks;
-  // Verifier callbacks
-  SmallVector, 2>
-      VerifierCallbacks;
-  SmallVector, 2>
-      FnVerifierCallbacks;
 };
 /// This utility template takes care of adding require<> and invalidate<>
diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h
index 65934c93ba614..f7a65a88ecf5b 100644
--- a/llvm/include/llvm/Passes/StandardInstrumentations.h
+++ b/llvm/include/llvm/Passes/StandardInstrumentations.h
@@ -621,7 +621,7 @@ class StandardInstrumentations {
   // Register all the standard instrumentation callbacks. If \p FAM is nullptr
   // then PreservedCFGChecker is not enabled.
   void registerCallbacks(PassInstrumentationCallbacks &PIC,
-                         ModuleAnalysisManager *MAM);
+                         ModuleAnalysisManager *MAM = nullptr);
   TimePassesHandler &getTimePasses() { return TimePasses; }
 };
diff --git a/llvm/include/llvm/Passes/TargetPassRegistry.inc b/llvm/include/llvm/Passes/TargetPassRegistry.inc
index 2d04b874cf360..521913cb25a4a 100644
--- a/llvm/include/llvm/Passes/TargetPassRegistry.inc
+++ b/llvm/include/llvm/Passes/TargetPassRegistry.inc
@@ -151,18 +151,6 @@ PB.registerPipelineParsingCallback([=](StringRef Name, FunctionPassManager &PM,
   return false;
 });
-PB.registerVerifierCallback([](ModulePassManager &PM) {
-#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) PM.addPass(CREATE_PASS)
-#include GET_PASS_REGISTRY
-#undef VERIFIER_MODULE_ANALYSIS
-  return false;
-}, [](FunctionPassManager &FPM) {
-#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) FPM.addPass(CREATE_PASS)
-#include GET_PASS_REGISTRY
-#undef VERIFIER_FUNCTION_ANALYSIS
-  return false;
-});
-
 #undef ADD_PASS
 #undef ADD_PASS_WITH_PARAMS
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index e942fed8b6a72..e7057d9a6b625 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -582,13 +582,6 @@ void PassBuilder::registerLoopAnalyses(LoopAnalysisManager &LAM) {
     C(LAM);
 }
-void PassBuilder::registerVerifierPasses(ModulePassManager &MPM, FunctionPassManager &FPM) {
-  for (auto &C : VerifierCallbacks)
-    C(MPM);
-  for (auto &C : FnVerifierCallbacks)
-    C(FPM);
-}
-
 static std::optional>
 parseFunctionPipelineName(StringRef Name) {
   std::pair Params;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index 41e6a399c7239..98a1147ef6d66 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -81,17 +81,6 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA())
 #undef FUNCTION_ALIAS_ANALYSIS
 #undef FUNCTION_ANALYSIS
-#ifndef
VERIFIER_MODULE_ANALYSIS
-#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS)
-#endif
-#ifndef VERIFIER_FUNCTION_ANALYSIS
-#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS)
-#endif
-VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass())
-VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass())
-#undef VERIFIER_MODULE_ANALYSIS
-#undef VERIFIER_FUNCTION_ANALYSIS
-
 #ifndef FUNCTION_PASS_WITH_PARAMS
 #define FUNCTION_PASS_WITH_PARAMS(NAME, CLASS, CREATE_PASS, PARSER, PARAMS)
 #endif
diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp
index 863a555798dab..fa82689ecf9ae 100644
--- a/llvm/tools/llc/NewPMDriver.cpp
+++ b/llvm/tools/llc/NewPMDriver.cpp
@@ -57,9 +57,6 @@ static cl::opt
     DebugPM("debug-pass-manager", cl::Hidden,
             cl::desc("Print pass management debugging information"));
-static cl::opt VerifyTarget("verify-tgt-new-pm",
-                            cl::desc("Verify the target"));
-
 bool LLCDiagnosticHandler::handleDiagnostics(const DiagnosticInfo &DI) {
   DiagnosticHandler::handleDiagnostics(DI);
   if (DI.getKind() == llvm::DK_SrcMgr) {
@@ -128,8 +125,6 @@ int llvm::compileModuleWithNewPM(
   PB.registerFunctionAnalyses(FAM);
   PB.registerLoopAnalyses(LAM);
   PB.registerMachineFunctionAnalyses(MFAM);
-  if (VerifyTarget)
-    PB.registerVerifierPasses(MPM, FPM);
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
   SI.registerCallbacks(PIC, &MAM);
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 2e9e4837fe467..140459ba2de21 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -209,8 +209,6 @@ static cl::opt PassPipeline(
 static cl::alias PassPipeline2("p", cl::aliasopt(PassPipeline),
                                cl::desc("Alias for -passes"));
-static cl::opt VerifyTarget("verify-tgt", cl::desc("Verify the target"));
-
 namespace {
 std::vector &getRunPassNames() {

>From b583b3f804758f6b8ca686bf66d59d744fffbe8e Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 19:05:53 -0400
Subject: [PATCH 23/28] Remove int check
---
 .../lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 18 ------------------
 1 file changed, 18 deletions(-)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index c4d303bee6ef8..2ca0bbeb57653 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -25,7 +25,6 @@
 #include "llvm/Support/Debug.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/Function.h"
-//#include "llvm/InitializePasses.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/IntrinsicsAMDGPU.h"
 #include "llvm/IR/Module.h"
@@ -46,15 +45,6 @@ using namespace llvm;
 namespace llvm {
-static bool IsValidInt(const Type *Ty) {
-  return Ty->isIntegerTy(1) ||
-         Ty->isIntegerTy(8) ||
-         Ty->isIntegerTy(16) ||
-         Ty->isIntegerTy(32) ||
-         Ty->isIntegerTy(64) ||
-         Ty->isIntegerTy(128);
-}
-
 static bool isShader(CallingConv::ID CC) {
   switch(CC) {
     case CallingConv::AMDGPU_VS:
@@ -81,14 +71,6 @@ bool AMDGPUTargetVerify::run(Function &F) {
     for (auto &I : BB) {
-      // Ensure integral types are valid: i8, i16, i32, i64, i128
-      if (I.getType()->isIntegerTy())
-        Check(IsValidInt(I.getType()), "Int type is invalid.", &I);
-      for (unsigned i = 0; i < I.getNumOperands(); ++i)
-        if (I.getOperand(i)->getType()->isIntegerTy())
-          Check(IsValidInt(I.getOperand(i)->getType()),
-                "Int type is invalid.", I.getOperand(i));
-
       if (auto *CI = dyn_cast(&I)) {
         // Ensure no kernel to kernel calls.

>From 2ba9f5d85326b80bd502116a95353d7e9ad4c9bb Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 21:56:28 -0400
Subject: [PATCH 24/28] Remove modifications to Lint/Verifier.
---
 llvm/lib/Analysis/Lint.cpp | 4 +---
 llvm/lib/IR/Verifier.cpp | 20 ++++----------------
 2 files changed, 5 insertions(+), 19 deletions(-)
diff --git a/llvm/lib/Analysis/Lint.cpp b/llvm/lib/Analysis/Lint.cpp
index c8e38963e5974..f05e36e2025d4 100644
--- a/llvm/lib/Analysis/Lint.cpp
+++ b/llvm/lib/Analysis/Lint.cpp
@@ -742,11 +742,9 @@ PreservedAnalyses LintPass::run(Function &F, FunctionAnalysisManager &AM) {
   Lint L(Mod, DL, AA, AC, DT, TLI);
   L.visit(F);
   dbgs() << L.MessagesStr.str();
-  if (AbortOnError && !L.MessagesStr.str().empty()) {
+  if (AbortOnError && !L.MessagesStr.str().empty())
     report_fatal_error(
         "linter found errors, aborting. (enabled by abort-on-error)", false);
-    return PreservedAnalyses::none();
-  }
   return PreservedAnalyses::all();
 }
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 51f6dec53b70f..8afe360d088bc 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -135,10 +135,6 @@ static cl::opt VerifyNoAliasScopeDomination(
     cl::desc("Ensure that llvm.experimental.noalias.scope.decl for identical "
              "scopes are not dominating"));
-static cl::opt
-    VerifyAbortOnError("verifier-abort-on-error", cl::init(false),
-                       cl::desc("In the Verifier pass, abort on errors."));
-
 namespace llvm {
 struct VerifierSupport {
@@ -7800,24 +7796,16 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F,
 PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) {
   auto Res = AM.getResult(M);
-  if (Res.IRBroken || Res.DebugInfoBroken) {
-    //M.IsValid = false;
-    if (VerifyAbortOnError && FatalErrors)
-      report_fatal_error("Broken module found, compilation aborted!");
-    return PreservedAnalyses::none();
-  }
+  if (FatalErrors && (Res.IRBroken || Res.DebugInfoBroken))
+    report_fatal_error("Broken module found, compilation aborted!");
   return PreservedAnalyses::all();
 }
 PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
   auto res = AM.getResult(F);
-  if (res.IRBroken) {
-
//F.getParent()->IsValid = false;
-    if (VerifyAbortOnError && FatalErrors)
-      report_fatal_error("Broken function found, compilation aborted!");
-    return PreservedAnalyses::none();
-  }
+  if (res.IRBroken && FatalErrors)
+    report_fatal_error("Broken function found, compilation aborted!");
   return PreservedAnalyses::all();
 }

>From 0c572440b11b571d0431c2c0bfd83132126e096f Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 22:21:47 -0400
Subject: [PATCH 25/28] Remove llvm-tgt-verify tool.
---
 llvm/test/CMakeLists.txt | 1 -
 llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 45 -----
 llvm/test/lit.cfg.py | 1 -
 llvm/tools/llvm-tgt-verify/CMakeLists.txt | 34 ----
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 171 ------------------
 llvm/utils/gn/secondary/llvm/test/BUILD.gn | 1 -
 .../llvm/tools/llvm-tgt-verify/BUILD.gn | 25 ---
 7 files changed, 278 deletions(-)
 delete mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify.ll
 delete mode 100644 llvm/tools/llvm-tgt-verify/CMakeLists.txt
 delete mode 100644 llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
 delete mode 100644 llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn
diff --git a/llvm/test/CMakeLists.txt b/llvm/test/CMakeLists.txt
index 10ca9300e7c66..66849002eb470 100644
--- a/llvm/test/CMakeLists.txt
+++ b/llvm/test/CMakeLists.txt
@@ -135,7 +135,6 @@ set(LLVM_TEST_DEPENDS
           llvm-strip
           llvm-symbolizer
           llvm-tblgen
-          llvm-tgt-verify
           llvm-readtapi
           llvm-tli-checker
           llvm-undname
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
deleted file mode 100644
index 62b220d7d9f49..0000000000000
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
+++ /dev/null
@@ -1,45 +0,0 @@
-; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s
-
-define amdgpu_cs i32 @shader() {
-; CHECK: Shaders must return void
-  ret i32 0
-}
-
-define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) {
-; CHECK: Undefined behavior: Write to memory in const addrspace
-; CHECK-NEXT: store i32 %r,
ptr addrspace(4) %out, align 4
-  %r = add i32 %a, %b
-  store i32 %r, ptr addrspace(4) %out
-  ret void
-}
-
-define amdgpu_kernel void @kernel_callee(ptr %x) {
-  ret void
-}
-
-define amdgpu_kernel void @kernel_caller(ptr %x) {
-; CHECK: A kernel may not call a kernel
-; CHECK-NEXT: ptr @kernel_caller
-  call amdgpu_kernel void @kernel_callee(ptr %x)
-  ret void
-}
-
-
-; Function Attrs: nounwind
-define i65 @invalid_type(i65 %x) #0 {
-; CHECK: Int type is invalid.
-; CHECK-NEXT: %tmp2 = ashr i65 %x, 64
-entry:
-  %tmp2 = ashr i65 %x, 64
-  ret i65 %tmp2
-}
-
-declare void @llvm.amdgcn.cs.chain.v3i32(ptr, i32, <3 x i32>, <3 x i32>, i32, ...)
-declare amdgpu_cs_chain void @chain_callee(<3 x i32> inreg, <3 x i32>)
-
-define amdgpu_cs void @no_unreachable(<3 x i32> inreg %a, <3 x i32> %b) {
-; CHECK: llvm.amdgcn.cs.chain must be followed by unreachable
-; CHECK-NEXT: call void (ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.p0.i32.v3i32.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0)
-  call void(ptr, i32, <3 x i32>, <3 x i32>, i32, ...)
@llvm.amdgcn.cs.chain.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0)
-  ret void
-}
diff --git a/llvm/test/lit.cfg.py b/llvm/test/lit.cfg.py
index 8620f2a7014b5..aad7a088551b2 100644
--- a/llvm/test/lit.cfg.py
+++ b/llvm/test/lit.cfg.py
@@ -227,7 +227,6 @@ def get_asan_rtlib():
         "llvm-strings",
         "llvm-strip",
         "llvm-tblgen",
-        "llvm-tgt-verify",
         "llvm-readtapi",
         "llvm-undname",
         "llvm-windres",
diff --git a/llvm/tools/llvm-tgt-verify/CMakeLists.txt b/llvm/tools/llvm-tgt-verify/CMakeLists.txt
deleted file mode 100644
index fe47c85e6cdce..0000000000000
--- a/llvm/tools/llvm-tgt-verify/CMakeLists.txt
+++ /dev/null
@@ -1,34 +0,0 @@
-set(LLVM_LINK_COMPONENTS
-  AllTargetsAsmParsers
-  AllTargetsCodeGens
-  AllTargetsDescs
-  AllTargetsInfos
-  Analysis
-  AsmPrinter
-  CodeGen
-  CodeGenTypes
-  Core
-  IRPrinter
-  IRReader
-  MC
-  MIRParser
-  Passes
-  Remarks
-  ScalarOpts
-  SelectionDAG
-  Support
-  Target
-  TargetParser
-  TransformUtils
-  Vectorize
-  )
-
-add_llvm_tool(llvm-tgt-verify
-  llvm-tgt-verify.cpp
-
-  DEPENDS
-  intrinsics_gen
-  SUPPORT_PLUGINS
-  )
-
-export_executable_symbols_for_plugins(llc)
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
deleted file mode 100644
index 50f4e56bb6af6..0000000000000
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ /dev/null
@@ -1,171 +0,0 @@
-//===--- llvm-tgt-verify.cpp - Target Verifier ----------------- ----------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-//
-// Tool to verify a target.
-// -//===----------------------------------------------------------------------===// - -#include "llvm/InitializePasses.h" -#include "llvm/ADT/StringRef.h" -#include "llvm/Analysis/Lint.h" -#include "llvm/Analysis/TargetLibraryInfo.h" -#include "llvm/Bitcode/BitcodeReader.h" -#include "llvm/Bitcode/BitcodeWriter.h" -#include "llvm/CodeGen/CommandFlags.h" -#include "llvm/CodeGen/TargetPassConfig.h" -#include "llvm/IR/Constants.h" -#include "llvm/IR/LLVMContext.h" -#include "llvm/IR/LegacyPassManager.h" -#include "llvm/IR/Module.h" -#include "llvm/IR/Verifier.h" -#include "llvm/IRReader/IRReader.h" -#include "llvm/Passes/PassBuilder.h" -#include "llvm/Passes/StandardInstrumentations.h" -#include "llvm/MC/TargetRegistry.h" -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/DataTypes.h" -#include "llvm/Support/Debug.h" -#include "llvm/Support/InitLLVM.h" -#include "llvm/Support/SourceMgr.h" -#include "llvm/Support/TargetSelect.h" -#include "llvm/Target/TargetMachine.h" -#include "llvm/Target/TargetVerifier.h" - -#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" - -#define DEBUG_TYPE "isel-fuzzer" - -using namespace llvm; - -static codegen::RegisterCodeGenFlags CGF; - -static cl::opt -InputFilename(cl::Positional, cl::desc(""), cl::init("-")); - -static cl::opt - StacktraceAbort("stacktrace-abort", - cl::desc("Turn on stacktrace"), cl::init(false)); - -static cl::opt - NoLint("no-lint", - cl::desc("Turn off Lint"), cl::init(false)); - -static cl::opt - NoVerify("no-verifier", - cl::desc("Turn off Verifier"), cl::init(false)); - -static cl::opt - OptLevel("O", - cl::desc("Optimization level. 
[-O0, -O1, -O2, or -O3] " - "(default = '-O2')"), - cl::Prefix, cl::init('2')); - -static cl::opt - TargetTriple("mtriple", cl::desc("Override target triple for module")); - -static std::unique_ptr TM; - -static void handleLLVMFatalError(void *, const char *Message, bool) { - if (StacktraceAbort) { - dbgs() << "LLVM ERROR: " << Message << "\n" - << "Aborting.\n"; - abort(); - } -} - -int main(int argc, char **argv) { - StringRef ExecName = argv[0]; - InitLLVM X(argc, argv); - - InitializeAllTargets(); - InitializeAllTargetMCs(); - InitializeAllAsmPrinters(); - InitializeAllAsmParsers(); - - PassRegistry *Registry = PassRegistry::getPassRegistry(); - initializeCore(*Registry); - initializeCodeGen(*Registry); - initializeAnalysis(*Registry); - initializeTarget(*Registry); - - cl::ParseCommandLineOptions(argc, argv); - - if (TargetTriple.empty()) { - errs() << ExecName << ": -mtriple must be specified\n"; - exit(1); - } - - CodeGenOptLevel OLvl; - if (auto Level = CodeGenOpt::parseLevel(OptLevel)) { - OLvl = *Level; - } else { - errs() << ExecName << ": invalid optimization level.\n"; - return 1; - } - ExitOnError ExitOnErr(std::string(ExecName) + ": error:"); - TM = ExitOnErr(codegen::createTargetMachineForTriple( - Triple::normalize(TargetTriple), OLvl)); - assert(TM && "Could not allocate target machine!"); - - // Make sure we print the summary and the current unit when LLVM errors out. 
- install_fatal_error_handler(handleLLVMFatalError, nullptr); - - LLVMContext Context; - SMDiagnostic Err; - std::unique_ptr M = parseIRFile(InputFilename, Err, Context); - if (!M) { - errs() << "Invalid mod\n"; - return 1; - } - auto S = Triple::normalize(TargetTriple); - M->setTargetTriple(Triple(S)); - - PassInstrumentationCallbacks PIC; - StandardInstrumentations SI(Context, false/*debug PM*/, - false); - registerCodeGenCallback(PIC, *TM); - - ModulePassManager MPM; - FunctionPassManager FPM; - //TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple())); - - MachineFunctionAnalysisManager MFAM; - LoopAnalysisManager LAM; - FunctionAnalysisManager FAM; - CGSCCAnalysisManager CGAM; - ModuleAnalysisManager MAM; - PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC); - PB.registerModuleAnalyses(MAM); - //PB.registerVerifierPasses(MPM, FPM); - PB.registerCGSCCAnalyses(CGAM); - PB.registerFunctionAnalyses(FAM); - PB.registerLoopAnalyses(LAM); - PB.registerMachineFunctionAnalyses(MFAM); - PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - - SI.registerCallbacks(PIC, &MAM); - - Triple TT(M->getTargetTriple()); - if (!NoLint) - FPM.addPass(LintPass(false)); - if (!NoVerify) - MPM.addPass(VerifierPass()); - if (TT.isAMDGPU()) - FPM.addPass(AMDGPUTargetVerifierPass()); - else if (false) {} // ... 
- MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); - - auto PA = MPM.run(*M, MAM); - { - auto PAC = PA.getChecker(); - if (!PAC.preserved()) - return 1; - } - - return 0; -} diff --git a/llvm/utils/gn/secondary/llvm/test/BUILD.gn b/llvm/utils/gn/secondary/llvm/test/BUILD.gn index 157e7991c52a8..228642667b41d 100644 --- a/llvm/utils/gn/secondary/llvm/test/BUILD.gn +++ b/llvm/utils/gn/secondary/llvm/test/BUILD.gn @@ -319,7 +319,6 @@ group("test") { "//llvm/tools/llvm-strings", "//llvm/tools/llvm-symbolizer:symlinks", "//llvm/tools/llvm-tli-checker", - "//llvm/tools/llvm-tgt-verify", "//llvm/tools/llvm-undname", "//llvm/tools/llvm-xray", "//llvm/tools/lto", diff --git a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn deleted file mode 100644 index b751bafc5052c..0000000000000 --- a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn +++ /dev/null @@ -1,25 +0,0 @@ -import("//llvm/utils/TableGen/tablegen.gni") - -tgtverifier("llvm-tgt-verify") { - deps = [ - "//llvm/lib/Analysis", - "//llvm/lib/AsmPrinter", - "//llvm/lib/CodeGen", - "//llvm/lib/CodeGenTypes", - "//llvm/lib/Core", - "//llvm/lib/IRPrinter", - "//llvm/lib/IRReader", - "//llvm/lib/MC", - "//llvm/lib/MIRParser", - "//llvm/lib/Passes", - "//llvm/lib/Remarks", - "//llvm/lib/ScalarOpts", - "//llvm/lib/SelectionDAG", - "//llvm/lib/Support", - "//llvm/lib/Target", - "//llvm/lib/TargetParser", - "//llvm/lib/TransformUtils", - "//llvm/lib/Vectorize", - ] - sources = [ "llvm-tgt-verify.cpp" ] -} >From 94c24ebf4fc1c67872d5d2effa8016b5b04b71a5 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 22:23:15 -0400 Subject: [PATCH 26/28] Remove TargetVerifier.cpp --- llvm/lib/Passes/CMakeLists.txt | 1 - llvm/lib/Target/CMakeLists.txt | 1 - llvm/lib/Target/TargetVerifier.cpp | 36 ------------------------------ 3 files changed, 38 deletions(-) delete mode 100644 llvm/lib/Target/TargetVerifier.cpp diff --git 
a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt index 9c348cb89a8c5..6425f4934b210 100644 --- a/llvm/lib/Passes/CMakeLists.txt +++ b/llvm/lib/Passes/CMakeLists.txt @@ -29,7 +29,6 @@ add_llvm_component_library(LLVMPasses Scalar Support Target - #TargetParser TransformUtils Vectorize Instrumentation diff --git a/llvm/lib/Target/CMakeLists.txt b/llvm/lib/Target/CMakeLists.txt index f2a5d545ce84f..e354fd484a7a9 100644 --- a/llvm/lib/Target/CMakeLists.txt +++ b/llvm/lib/Target/CMakeLists.txt @@ -7,7 +7,6 @@ add_llvm_component_library(LLVMTarget TargetLoweringObjectFile.cpp TargetMachine.cpp TargetMachineC.cpp - TargetVerifier.cpp AMDGPU/AMDGPUTargetVerifier.cpp ADDITIONAL_HEADER_DIRS diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp deleted file mode 100644 index c63ae2a2c5daf..0000000000000 --- a/llvm/lib/Target/TargetVerifier.cpp +++ /dev/null @@ -1,36 +0,0 @@ -//===-- TargetVerifier.cpp - LLVM IR Target Verifier ----------------*- C++ -*-===// -//// -///// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -///// See https://llvm.org/LICENSE.txt for license information. -///// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -///// -/////===----------------------------------------------------------------------===// -///// -///// This file defines target verifier interfaces that can be used for some -///// validation of input to the system, and for checking that transformations -///// haven't done something bad. In contrast to the Verifier or Lint, the -///// TargetVerifier looks for constructions invalid to a particular target -///// machine. -///// -///// To see what specifically is checked, look at TargetVerifier.cpp or an -///// individual backend's TargetVerifier. 
-///// -/////===----------------------------------------------------------------------===// - -#include "llvm/Target/TargetVerifier.h" -#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" - -#include "llvm/InitializePasses.h" -#include "llvm/Analysis/UniformityAnalysis.h" -#include "llvm/Analysis/PostDominators.h" -#include "llvm/Support/Debug.h" -#include "llvm/IR/Dominators.h" -#include "llvm/IR/Function.h" -#include "llvm/IR/IntrinsicInst.h" -#include "llvm/IR/IntrinsicsAMDGPU.h" -#include "llvm/IR/Module.h" -#include "llvm/IR/Value.h" - -namespace llvm { - -} // namespace llvm >From 37a19c161b7fa97e6bea78cdc6c433c3e5f86efd Mon Sep 17 00:00:00 2001 From: jofernau Date: Thu, 1 May 2025 03:02:30 -0400 Subject: [PATCH 27/28] clang-format --- llvm/include/llvm/Target/TargetVerifier.h | 10 +- .../TargetVerify/AMDGPUTargetVerifier.h | 44 ++++----- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 98 ++++++++++--------- 3 files changed, 77 insertions(+), 75 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index 1d12eb55bbf0a..3f8c710a88768 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -1,4 +1,4 @@ -//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier ---*- C++ -*-===// +//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier --*- C++ -*-===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -20,8 +20,8 @@ #ifndef LLVM_TARGET_VERIFIER_H #define LLVM_TARGET_VERIFIER_H -#include "llvm/IR/PassManager.h" #include "llvm/IR/Module.h" +#include "llvm/IR/PassManager.h" #include "llvm/TargetParser/Triple.h" namespace llvm { @@ -59,10 +59,11 @@ class TargetVerify { /// This calls the Message-only version so that the above is easier to set /// a breakpoint on. template - void CheckFailed(const Twine &Message, const T1 &V1, const Ts &... 
Vs) { + void CheckFailed(const Twine &Message, const T1 &V1, const Ts &...Vs) { CheckFailed(Message); WriteValues({V1, Vs...}); } + public: Module *Mod; Triple TT; @@ -73,8 +74,7 @@ class TargetVerify { bool IsValid = true; TargetVerify(Module *Mod) - : Mod(Mod), TT(Mod->getTargetTriple()), - MessagesStr(Messages) {} + : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} virtual bool run(Function &F) = 0; }; diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index 49bcbc8849e3c..5b8d9ec259b63 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -1,29 +1,29 @@ -//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU ---*- C++ -*-===// -//// -//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -//// See https://llvm.org/LICENSE.txt for license information. -//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -//// -////===----------------------------------------------------------------------===// -//// -//// This file defines target verifier interfaces that can be used for some -//// validation of input to the system, and for checking that transformations -//// haven't done something bad. In contrast to the Verifier or Lint, the -//// TargetVerifier looks for constructions invalid to a particular target -//// machine. -//// -//// To see what specifically is checked, look at an individual backend's -//// TargetVerifier. -//// -////===----------------------------------------------------------------------===// +//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU -- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines target verifier interfaces that can be used for some +// validation of input to the system, and for checking that transformations +// haven't done something bad. In contrast to the Verifier or Lint, the +// TargetVerifier looks for constructions invalid to a particular target +// machine. +// +// To see what specifically is checked, look at an individual backend's +// TargetVerifier. +// +//===----------------------------------------------------------------------===// #ifndef LLVM_AMDGPU_TARGET_VERIFIER_H #define LLVM_AMDGPU_TARGET_VERIFIER_H #include "llvm/Target/TargetVerifier.h" -#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/Analysis/PostDominators.h" +#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/IR/Dominators.h" namespace llvm { @@ -43,10 +43,10 @@ class AMDGPUTargetVerify : public TargetVerify { PostDominatorTree *PDT = nullptr; UniformityInfo *UA = nullptr; - AMDGPUTargetVerify(Module *Mod) - : TargetVerify(Mod), Mod(Mod) {} + AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod), Mod(Mod) {} - AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, + UniformityInfo *UA) : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} bool run(Function &F) override; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 2ca0bbeb57653..eb22eb2177f7f 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -1,34 +1,34 @@ -//===-- AMDGPUTargetVerifier.cpp - AMDGPU -------------------------*- C++ -*-===// -//// -//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 
-//// See https://llvm.org/LICENSE.txt for license information. -//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -//// -////===----------------------------------------------------------------------===// -//// -//// This file defines target verifier interfaces that can be used for some -//// validation of input to the system, and for checking that transformations -//// haven't done something bad. In contrast to the Verifier or Lint, the -//// TargetVerifier looks for constructions invalid to a particular target -//// machine. -//// -//// To see what specifically is checked, look at an individual backend's -//// TargetVerifier. -//// -////===----------------------------------------------------------------------===// +//===-- AMDGPUTargetVerifier.cpp - AMDGPU -----------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines target verifier interfaces that can be used for some +// validation of input to the system, and for checking that transformations +// haven't done something bad. In contrast to the Verifier or Lint, the +// TargetVerifier looks for constructions invalid to a particular target +// machine. +// +// To see what specifically is checked, look at an individual backend's +// TargetVerifier. 
+// +//===----------------------------------------------------------------------===// -#include "AMDGPU.h" #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" +#include "AMDGPU.h" -#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/Analysis/PostDominators.h" -#include "llvm/Support/Debug.h" +#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/IR/Dominators.h" #include "llvm/IR/Function.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/IntrinsicsAMDGPU.h" #include "llvm/IR/Module.h" #include "llvm/IR/Value.h" +#include "llvm/Support/Debug.h" #include "llvm/Support/raw_ostream.h" @@ -39,53 +39,52 @@ using namespace llvm; do { \ if (!(C)) { \ TargetVerify::CheckFailed(__VA_ARGS__); \ - return false; \ } \ } while (false) namespace llvm { static bool isShader(CallingConv::ID CC) { - switch(CC) { - case CallingConv::AMDGPU_VS: - case CallingConv::AMDGPU_LS: - case CallingConv::AMDGPU_HS: - case CallingConv::AMDGPU_ES: - case CallingConv::AMDGPU_GS: - case CallingConv::AMDGPU_PS: - case CallingConv::AMDGPU_CS_Chain: - case CallingConv::AMDGPU_CS_ChainPreserve: - case CallingConv::AMDGPU_CS: - return true; - default: - return false; + switch (CC) { + case CallingConv::AMDGPU_VS: + case CallingConv::AMDGPU_LS: + case CallingConv::AMDGPU_HS: + case CallingConv::AMDGPU_ES: + case CallingConv::AMDGPU_GS: + case CallingConv::AMDGPU_PS: + case CallingConv::AMDGPU_CS_Chain: + case CallingConv::AMDGPU_CS_ChainPreserve: + case CallingConv::AMDGPU_CS: + return true; + default: + return false; } } bool AMDGPUTargetVerify::run(Function &F) { // Ensure shader calling convention returns void if (isShader(F.getCallingConv())) - Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void"); + Check(F.getReturnType() == Type::getVoidTy(F.getContext()), + "Shaders must return void"); for (auto &BB : F) { for (auto &I : BB) { - if (auto *CI = dyn_cast(&I)) - { + if (auto *CI = dyn_cast(&I)) { // Ensure no kernel to kernel calls. 
CallingConv::ID CalleeCC = CI->getCallingConv(); - if (CalleeCC == CallingConv::AMDGPU_KERNEL) - { - CallingConv::ID CallerCC = CI->getParent()->getParent()->getCallingConv(); + if (CalleeCC == CallingConv::AMDGPU_KERNEL) { + CallingConv::ID CallerCC = + CI->getParent()->getParent()->getCallingConv(); Check(CallerCC != CallingConv::AMDGPU_KERNEL, - "A kernel may not call a kernel", CI->getParent()->getParent()); + "A kernel may not call a kernel", CI->getParent()->getParent()); } // Ensure chain intrinsics are followed by unreachables. if (CI->getIntrinsicID() == Intrinsic::amdgcn_cs_chain) Check(isa_and_present(CI->getNextNode()), - "llvm.amdgcn.cs.chain must be followed by unreachable", CI); + "llvm.amdgcn.cs.chain must be followed by unreachable", CI); } } } @@ -98,7 +97,8 @@ bool AMDGPUTargetVerify::run(Function &F) { return true; } -PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { +PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, + FunctionAnalysisManager &AM) { auto *Mod = F.getParent(); auto UA = &AM.getResult(F); @@ -122,9 +122,10 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { std::unique_ptr TV; bool FatalErrors = false; - AMDGPUTargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID), - FatalErrors(FatalErrors) { - initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + AMDGPUTargetVerifierLegacyPass(bool FatalErrors) + : FunctionPass(ID), FatalErrors(FatalErrors) { + initializeAMDGPUTargetVerifierLegacyPassPass( + *PassRegistry::getPassRegistry()); } bool doInitialization(Module &M) override { @@ -167,4 +168,5 @@ FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) { return new AMDGPUTargetVerifierLegacyPass(FatalErrors); } } // namespace llvm -INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false) +INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", + "AMDGPU 
Target Verifier", false, false) >From 0eb626f09a0851a4e95f86804af85400906a451c Mon Sep 17 00:00:00 2001 From: jofernau Date: Thu, 1 May 2025 03:49:20 -0400 Subject: [PATCH 28/28] Add VerifyTarget option --- .../llvm/Target/TargetVerify/AMDGPUTargetVerifier.h | 6 ++---- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 9 +++++++-- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index 5b8d9ec259b63..8ed7dd7ea2f69 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -37,17 +37,15 @@ class AMDGPUTargetVerifierPass : public TargetVerifierPass { class AMDGPUTargetVerify : public TargetVerify { public: - Module *Mod; - DominatorTree *DT = nullptr; PostDominatorTree *PDT = nullptr; UniformityInfo *UA = nullptr; - AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod), Mod(Mod) {} + AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod) {} AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) - : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} + : TargetVerify(Mod), DT(DT), PDT(PDT), UA(UA) {} bool run(Function &F) override; }; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 42d6764eacda9..582090c3c411e 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -482,6 +482,9 @@ static cl::opt HasClosedWorldAssumption( cl::desc("Whether has closed-world assumption at link time"), cl::init(false), cl::Hidden); +static cl::opt VerifyTarget("verify-tgt", + cl::desc("Enable the target verifier")); + extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAMDGPUTarget() { // Register the target RegisterTargetMachine X(getTheR600Target()); @@ -1377,7 +1380,8 @@ bool 
AMDGPUPassConfig::addGCPasses() { //===----------------------------------------------------------------------===// bool GCNPassConfig::addPreISel() { - addPass(createAMDGPUTargetVerifierLegacyPass(false)); + if (VerifyTarget) + addPass(createAMDGPUTargetVerifierLegacyPass(false)); AMDGPUPassConfig::addPreISel(); if (TM->getOptLevel() > CodeGenOptLevel::None) @@ -1977,7 +1981,8 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder( } void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { - addPass(AMDGPUTargetVerifierPass()); + if (VerifyTarget) + addPass(AMDGPUTargetVerifierPass()); if (RemoveIncompatibleFunctions && TM.getTargetTriple().isAMDGCN()) addPass(AMDGPURemoveIncompatibleFunctionsPass(TM)); From flang-commits at lists.llvm.org Thu May 1 05:52:03 2025 From: flang-commits at lists.llvm.org (Mats Petersson via flang-commits) Date: Thu, 01 May 2025 05:52:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #131628) In-Reply-To: Message-ID: <68136e73.050a0220.cf38f.3012@mx.google.com> https://github.com/Leporacanthicus updated https://github.com/llvm/llvm-project/pull/131628 >From a75db6e7529fa9c3f1770e1325e6cd1696a77052 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Thu, 6 Mar 2025 10:41:59 +0000 Subject: [PATCH 01/11] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION This adds another puzzle piece for the support of OpenMP DECLARE REDUCTION functionality. This adds support for operators with derived types, as well as declaring multiple different types with the same name or operator. A new details class, UserReductionDetails, is introduced to hold the list of types supported for a given reduction declaration. Tests for parsing and symbol generation are added. DECLARE REDUCTION is still not supported in lowering; it will generate a "Not yet implemented" fatal error. 
--- flang/include/flang/Semantics/symbol.h | 21 ++- flang/lib/Semantics/check-omp-structure.cpp | 63 ++++++-- flang/lib/Semantics/resolve-names-utils.h | 4 + flang/lib/Semantics/resolve-names.cpp | 77 +++++++++- flang/lib/Semantics/symbol.cpp | 12 +- .../Parser/OpenMP/declare-reduction-multi.f90 | 134 ++++++++++++++++++ .../OpenMP/declare-reduction-operator.f90 | 59 ++++++++ .../OpenMP/declare-reduction-functions.f90 | 126 ++++++++++++++++ .../OpenMP/declare-reduction-mangled.f90 | 51 +++++++ .../OpenMP/declare-reduction-operators.f90 | 55 +++++++ .../OpenMP/declare-reduction-typeerror.f90 | 30 ++++ .../Semantics/OpenMP/declare-reduction.f90 | 4 +- 12 files changed, 616 insertions(+), 20 deletions(-) create mode 100644 flang/test/Parser/OpenMP/declare-reduction-multi.f90 create mode 100644 flang/test/Parser/OpenMP/declare-reduction-operator.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-functions.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-operators.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 715811885c219..12867a5f8ec6f 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -701,6 +701,25 @@ class GenericDetails { }; llvm::raw_ostream &operator<<(llvm::raw_ostream &, const GenericDetails &); +class UserReductionDetails : public WithBindName { +public: + using TypeVector = std::vector; + UserReductionDetails() = default; + + void AddType(const DeclTypeSpec *type) { typeList_.push_back(type); } + const TypeVector &GetTypeList() const { return typeList_; } + + bool SupportsType(const DeclTypeSpec *type) const { + for (auto t : typeList_) + if (t == type) + return true; + return false; + } + +private: + TypeVector typeList_; +}; + class UnknownDetails {}; using 
Details = std::variant; + TypeParamDetails, MiscDetails, UserReductionDetails>; llvm::raw_ostream &operator<<(llvm::raw_ostream &, const Details &); std::string DetailsToString(const Details &); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 717982f66027c..aa8c830e8f2d2 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -8,6 +8,7 @@ #include "check-omp-structure.h" #include "definable.h" +#include "resolve-names-utils.h" #include "flang/Evaluate/check-expression.h" #include "flang/Evaluate/expression.h" #include "flang/Evaluate/type.h" @@ -3403,8 +3404,8 @@ bool OmpStructureChecker::CheckReductionOperator( valid = llvm::is_contained({"max", "min", "iand", "ior", "ieor"}, realName); if (!valid) { - auto *misc{name->symbol->detailsIf()}; - valid = misc && misc->kind() == MiscDetails::Kind::ConstructName; + auto *reductionDetails{name->symbol->detailsIf()}; + valid = reductionDetails != nullptr; } } if (!valid) { @@ -3486,7 +3487,8 @@ void OmpStructureChecker::CheckReductionObjects( } static bool IsReductionAllowedForType( - const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type) { + const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type, + const Scope &scope) { auto isLogical{[](const DeclTypeSpec &type) -> bool { return type.category() == DeclTypeSpec::Logical; }}; @@ -3506,9 +3508,11 @@ static bool IsReductionAllowedForType( case parser::DefinedOperator::IntrinsicOperator::Multiply: case parser::DefinedOperator::IntrinsicOperator::Add: case parser::DefinedOperator::IntrinsicOperator::Subtract: - return type.IsNumeric(TypeCategory::Integer) || + if (type.IsNumeric(TypeCategory::Integer) || type.IsNumeric(TypeCategory::Real) || - type.IsNumeric(TypeCategory::Complex); + type.IsNumeric(TypeCategory::Complex)) + return true; + break; case parser::DefinedOperator::IntrinsicOperator::AND: case 
parser::DefinedOperator::IntrinsicOperator::OR: @@ -3521,8 +3525,18 @@ static bool IsReductionAllowedForType( DIE("This should have been caught in CheckIntrinsicOperator"); return false; } + parser::CharBlock name{MakeNameFromOperator(*intrinsicOp)}; + Symbol *symbol{scope.FindSymbol(name)}; + if (symbol) { + const auto *reductionDetails{symbol->detailsIf()}; + assert(reductionDetails && "Expected to find reductiondetails"); + + return reductionDetails->SupportsType(&type); + } + return false; } - return true; + assert(0 && "Intrinsic Operator not found - parsing gone wrong?"); + return false; // Reject everything else. }}; auto checkDesignator{[&](const parser::ProcedureDesignator &procD) { @@ -3535,18 +3549,42 @@ static bool IsReductionAllowedForType( // IAND: arguments must be integers: F2023 16.9.100 // IEOR: arguments must be integers: F2023 16.9.106 // IOR: arguments must be integers: F2023 16.9.111 - return type.IsNumeric(TypeCategory::Integer); + if (type.IsNumeric(TypeCategory::Integer)) { + return true; + } } else if (realName == "max" || realName == "min") { // MAX: arguments must be integer, real, or character: // F2023 16.9.135 // MIN: arguments must be integer, real, or character: // F2023 16.9.141 - return type.IsNumeric(TypeCategory::Integer) || - type.IsNumeric(TypeCategory::Real) || isCharacter(type); + if (type.IsNumeric(TypeCategory::Integer) || + type.IsNumeric(TypeCategory::Real) || isCharacter(type)) { + return true; + } } + + // If we get here, it may be a user declared reduction, so check + // if the symbol has UserReductionDetails, and if so, the type is + // supported. + if (const auto *reductionDetails{ + name->symbol->detailsIf()}) { + return reductionDetails->SupportsType(&type); + } + + // We also need to check for mangled names (max, min, iand, ieor and ior) + // and then check if the type is there. 
+ parser::CharBlock mangledName = MangleSpecialFunctions(name->source); + if (const auto &symbol{scope.FindSymbol(mangledName)}) { + if (const auto *reductionDetails{ + symbol->detailsIf()}) { + return reductionDetails->SupportsType(&type); + } + } + // Everything else is "not matching type". + return false; } - // TODO: user defined reduction operators. Just allow everything for now. - return true; + assert(0 && "name and name->symbol should be set here..."); + return false; }}; return common::visit( @@ -3561,7 +3599,8 @@ void OmpStructureChecker::CheckReductionObjectTypes( for (auto &[symbol, source] : symbols) { if (auto *type{symbol->GetType()}) { - if (!IsReductionAllowedForType(ident, *type)) { + const auto &scope{context_.FindScope(symbol->name())}; + if (!IsReductionAllowedForType(ident, *type, scope)) { context_.Say(source, "The type of '%s' is incompatible with the reduction operator."_err_en_US, symbol->name()); diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index 64784722ff4f8..de0991d69b61b 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -146,5 +146,9 @@ struct SymbolAndTypeMappings; void MapSubprogramToNewSymbols(const Symbol &oldSymbol, Symbol &newSymbol, Scope &newScope, SymbolAndTypeMappings * = nullptr); +parser::CharBlock MakeNameFromOperator( + const parser::DefinedOperator::IntrinsicOperator &op); +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_RESOLVE_NAMES_H_ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 74367b5229548..f048b374588ca 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1752,15 +1752,75 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, PopScope(); } +parser::CharBlock MakeNameFromOperator( + const 
parser::DefinedOperator::IntrinsicOperator &op) { + switch (op) { + case parser::DefinedOperator::IntrinsicOperator::Multiply: + return parser::CharBlock{"op.*", 4}; + case parser::DefinedOperator::IntrinsicOperator::Add: + return parser::CharBlock{"op.+", 4}; + case parser::DefinedOperator::IntrinsicOperator::Subtract: + return parser::CharBlock{"op.-", 4}; + + case parser::DefinedOperator::IntrinsicOperator::AND: + return parser::CharBlock{"op.AND", 6}; + case parser::DefinedOperator::IntrinsicOperator::OR: + return parser::CharBlock{"op.OR", 6}; + case parser::DefinedOperator::IntrinsicOperator::EQV: + return parser::CharBlock{"op.EQV", 7}; + case parser::DefinedOperator::IntrinsicOperator::NEQV: + return parser::CharBlock{"op.NEQV", 8}; + + default: + assert(0 && "Unsupported operator..."); + return parser::CharBlock{"op.?", 4}; + } +} + +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name) { + if (name == "max") { + return parser::CharBlock{"op.max", 6}; + } + if (name == "min") { + return parser::CharBlock{"op.min", 6}; + } + if (name == "iand") { + return parser::CharBlock{"op.iand", 7}; + } + if (name == "ior") { + return parser::CharBlock{"op.ior", 6}; + } + if (name == "ieor") { + return parser::CharBlock{"op.ieor", 7}; + } + // All other names: return as is. 
+ return name; +} + void OmpVisitor::ProcessReductionSpecifier( const parser::OmpReductionSpecifier &spec, const std::optional &clauses) { + const parser::Name *name{nullptr}; + parser::Name mangledName{}; + UserReductionDetails reductionDetailsTemp{}; const auto &id{std::get(spec.t)}; if (auto procDes{std::get_if(&id.u)}) { - if (auto *name{std::get_if(&procDes->u)}) { - name->symbol = - &MakeSymbol(*name, MiscDetails{MiscDetails::Kind::ConstructName}); + name = std::get_if(&procDes->u); + if (name) { + mangledName.source = MangleSpecialFunctions(name->source); } + } else { + const auto &defOp{std::get(id.u)}; + mangledName.source = MakeNameFromOperator( + std::get(defOp.u)); + name = &mangledName; + } + + UserReductionDetails *reductionDetails{&reductionDetailsTemp}; + Symbol *symbol{name ? name->symbol : nullptr}; + symbol = FindSymbol(mangledName); + if (symbol) { + reductionDetails = symbol->detailsIf(); } auto &typeList{std::get(spec.t)}; @@ -1792,6 +1852,10 @@ void OmpVisitor::ProcessReductionSpecifier( const DeclTypeSpec *typeSpec{GetDeclTypeSpec()}; assert(typeSpec && "We should have a type here"); + if (reductionDetails) { + reductionDetails->AddType(typeSpec); + } + for (auto &nm : ompVarNames) { ObjectEntityDetails details{}; details.set_type(*typeSpec); @@ -1802,6 +1866,13 @@ void OmpVisitor::ProcessReductionSpecifier( Walk(clauses); PopScope(); } + + if (name) { + if (!symbol) { + symbol = &MakeSymbol(mangledName, Attrs{}, std::move(*reductionDetails)); + } + name->symbol = symbol; + } } bool OmpVisitor::Pre(const parser::OmpDirectiveSpecification &x) { diff --git a/flang/lib/Semantics/symbol.cpp b/flang/lib/Semantics/symbol.cpp index 32eb6c2c5a188..e627dd293ba7c 100644 --- a/flang/lib/Semantics/symbol.cpp +++ b/flang/lib/Semantics/symbol.cpp @@ -246,7 +246,7 @@ void GenericDetails::CopyFrom(const GenericDetails &from) { // This is primarily for debugging. 
std::string DetailsToString(const Details &details) { return common::visit( - common::visitors{ + common::visitors{// [](const UnknownDetails &) { return "Unknown"; }, [](const MainProgramDetails &) { return "MainProgram"; }, [](const ModuleDetails &) { return "Module"; }, @@ -266,7 +266,7 @@ std::string DetailsToString(const Details &details) { [](const TypeParamDetails &) { return "TypeParam"; }, [](const MiscDetails &) { return "Misc"; }, [](const AssocEntityDetails &) { return "AssocEntity"; }, - }, + [](const UserReductionDetails &) { return "UserReductionDetails"; }}, details); } @@ -300,6 +300,9 @@ bool Symbol::CanReplaceDetails(const Details &details) const { [&](const HostAssocDetails &) { return this->has(); }, + [&](const UserReductionDetails &) { + return this->has(); + }, [](const auto &) { return false; }, }, details); @@ -598,6 +601,11 @@ llvm::raw_ostream &operator<<(llvm::raw_ostream &os, const Details &details) { [&](const MiscDetails &x) { os << ' ' << MiscDetails::EnumToString(x.kind()); }, + [&](const UserReductionDetails &x) { + for (auto &type : x.GetTypeList()) { + DumpType(os, type); + } + }, [&](const auto &x) { os << x; }, }, details); diff --git a/flang/test/Parser/OpenMP/declare-reduction-multi.f90 b/flang/test/Parser/OpenMP/declare-reduction-multi.f90 new file mode 100644 index 0000000000000..0e1adcc9958d7 --- /dev/null +++ b/flang/test/Parser/OpenMP/declare-reduction-multi.f90 @@ -0,0 +1,134 @@ +! RUN: %flang_fc1 -fdebug-unparse -fopenmp %s | FileCheck --ignore-case %s +! RUN: %flang_fc1 -fdebug-dump-parse-tree -fopenmp %s | FileCheck --check-prefix="PARSE-TREE" %s + +!! Test multiple declarations for the same type, with different operations. 
+module mymod + type :: tt + real r + end type tt +contains + function mymax(a, b) + type(tt) :: a, b, mymax + if (a%r > b%r) then + mymax = a + else + mymax = b + end if + end function mymax +end module mymod + +program omp_examples +!CHECK-LABEL: PROGRAM omp_examples + use mymod + implicit none + integer, parameter :: n = 100 + integer :: i + type(tt) :: values(n), sum, prod, big, small + + !$omp declare reduction(+:tt:omp_out%r = omp_out%r + omp_in%r) initializer(omp_priv%r = 0) +!CHECK: !$OMP DECLARE REDUCTION (+:tt: omp_out%r=omp_out%r+omp_in%r +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=0_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE-NEXT: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE-NEXT: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out%r=omp_out%r+omp_in%r' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=0._4 + !$omp declare reduction(*:tt:omp_out%r = omp_out%r * omp_in%r) initializer(omp_priv%r = 1) +!CHECK-NEXT: !$OMP DECLARE REDUCTION (*:tt: omp_out%r=omp_out%r*omp_in%r +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=1_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Multiply +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE-NEXT: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out%r=omp_out%r*omp_in%r' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=1._4' + !$omp 
declare reduction(max:tt:omp_out = mymax(omp_out, omp_in)) initializer(omp_priv%r = 0) +!CHECK-NEXT: !$OMP DECLARE REDUCTION (max:tt: omp_out=mymax(omp_out,omp_in) +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=0_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> ProcedureDesignator -> Name = 'max' +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out=mymax(omp_out,omp_in)' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=0._4' + !$omp declare reduction(min:tt:omp_out%r = min(omp_out%r, omp_in%r)) initializer(omp_priv%r = 1) +!CHECK-NEXT: !$OMP DECLARE REDUCTION (min:tt: omp_out%r=min(omp_out%r,omp_in%r) +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=1_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> ProcedureDesignator -> Name = 'min' +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out%r=min(omp_out%r,omp_in%r)' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=1._4' + call random_number(values%r) + + sum%r = 0 + !$omp parallel do reduction(+:sum) +!CHECK: !$OMP PARALLEL DO REDUCTION(+: sum) +!PARSE-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!PARSE-TREE: OmpBeginLoopDirective +!PARSE-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!PARSE-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause 
+!PARSE-TREE: Modifier -> OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'sum +!PARSE-TREE: DoConstruct + do i = 1, n + sum%r = sum%r + values(i)%r + end do + + prod%r = 1 + !$omp parallel do reduction(*:prod) +!CHECK: !$OMP PARALLEL DO REDUCTION(*: prod) +!PARSE-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!PARSE-TREE: OmpBeginLoopDirective +!PARSE-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!PARSE-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause +!PARSE-TREE: Modifier -> OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Multiply +!PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'prod' +!PARSE-TREE: DoConstruct + do i = 1, n + prod%r = prod%r * (values(i)%r+0.6) + end do + + big%r = 0 + !$omp parallel do reduction(max:big) +!CHECK: $OMP PARALLEL DO REDUCTION(max: big) +!PARSE-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!PARSE-TREE: OmpBeginLoopDirective +!PARSE-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!PARSE-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause +!PARSE-TREE: Modifier -> OmpReductionIdentifier -> ProcedureDesignator -> Name = 'max' +!PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'big' +!PARSE-TREE: DoConstruct + do i = 1, n + big = mymax(values(i), big) + end do + + small%r = 1 + !$omp parallel do reduction(min:small) +!CHECK: !$OMP PARALLEL DO REDUCTION(min: small) +!CHECK-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!CHECK-TREE: OmpBeginLoopDirective +!CHECK-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!CHECK-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause +!CHECK-TREE: Modifier -> OmpReductionIdentifier -> ProcedureDesignator 
-> Name = 'min' +!CHECK-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'small' +!CHECK-TREE: DoConstruct + do i = 1, n + small%r = min(values(i)%r, small%r) + end do + + print *, values%r + print *, "sum=", sum%r + print *, "prod=", prod%r + print *, "small=", small%r, " big=", big%r +end program omp_examples diff --git a/flang/test/Parser/OpenMP/declare-reduction-operator.f90 b/flang/test/Parser/OpenMP/declare-reduction-operator.f90 new file mode 100644 index 0000000000000..7bfb78115b10d --- /dev/null +++ b/flang/test/Parser/OpenMP/declare-reduction-operator.f90 @@ -0,0 +1,59 @@ +! RUN: %flang_fc1 -fdebug-unparse -fopenmp %s | FileCheck --ignore-case %s +! RUN: %flang_fc1 -fdebug-dump-parse-tree -fopenmp %s | FileCheck --check-prefix="PARSE-TREE" %s + +!CHECK-LABEL: SUBROUTINE reduce_1 (n, tts) +subroutine reduce_1 ( n, tts ) + type :: tt + integer :: x + integer :: y + end type tt + type :: tt2 + real(8) :: x + real(8) :: y + end type + + integer :: n + type(tt) :: tts(n) + type(tt2) :: tts2(n) + +!CHECK: !$OMP DECLARE REDUCTION (+:tt: omp_out=tt(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y) +!CHECK: ) INITIALIZER(omp_priv=tt(x=0_4,y=0_4)) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out=tt(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y)' +!PARSE-TREE: OmpInitializerClause -> AssignmentStmt = 'omp_priv=tt(x=0_4,y=0_4)' + + !$omp declare reduction(+ : tt : omp_out = tt(omp_out%x - omp_in%x , omp_out%y - omp_in%y)) initializer(omp_priv = tt(0,0)) + + +!CHECK: !$OMP DECLARE REDUCTION (+:tt2: omp_out=tt2(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y) +!CHECK: ) INITIALIZER(omp_priv=tt2(x=0._8,y=0._8) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> 
OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out=tt2(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y)' +!PARSE-TREE: OmpInitializerClause -> AssignmentStmt = 'omp_priv=tt2(x=0._8,y=0._8)' + + !$omp declare reduction(+ :tt2 : omp_out = tt2(omp_out%x - omp_in%x , omp_out%y - omp_in%y)) initializer(omp_priv = tt2(0,0)) + + type(tt) :: diffp = tt( 0, 0 ) + type(tt2) :: diffp2 = tt2( 0, 0 ) + integer :: i + + !$omp parallel do reduction(+ : diffp) + do i = 1, n + diffp%x = diffp%x + tts(i)%x + diffp%y = diffp%y + tts(i)%y + end do + + !$omp parallel do reduction(+ : diffp2) + do i = 1, n + diffp2%x = diffp2%x + tts2(i)%x + diffp2%y = diffp2%y + tts2(i)%y + end do + +end subroutine reduce_1 +!CHECK: END SUBROUTINE reduce_1 diff --git a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 new file mode 100644 index 0000000000000..924ef0807ec80 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 @@ -0,0 +1,126 @@ +! 
RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module mm + implicit none + type two + integer(4) :: a, b + end type two + + type three + integer(8) :: a, b, c + end type three + + type twothree + type(two) t2 + type(three) t3 + end type twothree + +contains +!CHECK-LABEL: Subprogram scope: inittwo + subroutine inittwo(x,n) + integer :: n + type(two) :: x + x%a=n + x%b=n + end subroutine inittwo + + subroutine initthree(x,n) + integer :: n + type(three) :: x + x%a=n + x%b=n + end subroutine initthree + + function add_two(x, y) + type(two) add_two, x, y, res + res%a = x%a + y%a + res%b = x%b + y%b + add_two = res + end function add_two + + function add_three(x, y) + type(three) add_three, x, y, res + res%a = x%a + y%a + res%b = x%b + y%b + res%c = x%c + y%c + add_three = res + end function add_three + +!CHECK-LABEL: Subprogram scope: functwo + function functwo(x, n) + type(two) functwo + integer :: n + type(two) :: x(n) + type(two) :: res + integer :: i + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) +!CHECK: adder: UserReductionDetails TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) + + + !$omp simd reduction(adder:res) + do i=1,n + res=add_two(res,x(i)) + enddo + functwo=res + end function functwo + + function functhree(x, n) + implicit none + type(three) :: functhree + type(three) :: x(n) + type(three) :: res + integer :: i + integer :: n + !$omp declare reduction(adder:three:omp_out=add_three(omp_out,omp_in)) initializer(initthree(omp_priv,1)) + + !$omp simd reduction(adder:res) + do i=1,n + res=add_three(res,x(i)) + enddo + functhree=res + end function functhree + + function functtwothree(x, n) + type(twothree) :: functtwothree 
+ type(twothree) :: x(n) + type(twothree) :: res + type(two) :: res2 + type(three) :: res3 + integer :: n + integer :: i + + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) + + !$omp declare reduction(adder:three:omp_out=add_three(omp_out,omp_in)) initializer(initthree(omp_priv,1)) + +!CHECK: adder: UserReductionDetails TYPE(two) TYPE(three) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=24 offset=0: ObjectEntity type: TYPE(three) +!CHECK: omp_orig size=24 offset=24: ObjectEntity type: TYPE(three) +!CHECK: omp_out size=24 offset=48: ObjectEntity type: TYPE(three) +!CHECK: omp_priv size=24 offset=72: ObjectEntity type: TYPE(three) + + !$omp simd reduction(adder:res3) + do i=1,n + res3=add_three(res%t3,x(i)%t3) + enddo + + !$omp simd reduction(adder:res2) + do i=1,n + res2=add_two(res2,x(i)%t2) + enddo + res%t2 = res2 + res%t3 = res3 + end function functtwothree + +end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 b/flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 new file mode 100644 index 0000000000000..f1675b6f251e0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 @@ -0,0 +1,51 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +!! Test that the name mangling for min & max (also used for iand, ieor and ior). 
+module mymod + type :: tt + real r + end type tt +contains + function mymax(a, b) + type(tt) :: a, b, mymax + if (a%r > b%r) then + mymax = a + else + mymax = b + end if + end function mymax +end module mymod + +program omp_examples +!CHECK-LABEL: MainProgram scope: omp_examples + use mymod + implicit none + integer, parameter :: n = 100 + integer :: i + type(tt) :: values(n), big, small + + !$omp declare reduction(max:tt:omp_out = mymax(omp_out, omp_in)) initializer(omp_priv%r = 0) + !$omp declare reduction(min:tt:omp_out%r = min(omp_out%r, omp_in%r)) initializer(omp_priv%r = 1) + +!CHECK: min, ELEMENTAL, INTRINSIC, PURE (Function): ProcEntity +!CHECK: mymax (Function): Use from mymax in mymod +!CHECK: op.max: UserReductionDetails TYPE(tt) +!CHECK: op.min: UserReductionDetails TYPE(tt) + + big%r = 0 + !$omp parallel do reduction(max:big) +!CHECK: big (OmpReduction): HostAssoc +!CHECK: max, INTRINSIC: ProcEntity + do i = 1, n + big = mymax(values(i), big) + end do + + small%r = 1 + !$omp parallel do reduction(min:small) +!CHECK: small (OmpReduction): HostAssoc + do i = 1, n + small%r = min(values(i)%r, small%r) + end do + + print *, "small=", small%r, " big=", big%r +end program omp_examples diff --git a/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 new file mode 100644 index 0000000000000..e7513ab3f95b1 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 @@ -0,0 +1,55 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module vector_mod + implicit none + type :: Vector + real :: x, y, z + contains + procedure :: add_vectors + generic :: operator(+) => add_vectors + end type Vector +contains + ! 
Function implementing vector addition + function add_vectors(a, b) result(res) + class(Vector), intent(in) :: a, b + type(Vector) :: res + res%x = a%x + b%x + res%y = a%y + b%y + res%z = a%z + b%z + end function add_vectors +end module vector_mod + +program test_vector +!CHECK-LABEL: MainProgram scope: test_vector + use vector_mod +!CHECK: add_vectors (Function): Use from add_vectors in vector_mod + implicit none + integer :: i + type(Vector) :: v1(100), v2(100) + + !$OMP declare reduction(+:vector:omp_out=omp_out+omp_in) initializer(omp_priv=Vector(0,0,0)) +!CHECK: op.+: UserReductionDetails TYPE(vector) +!CHECK: v1 size=1200 offset=4: ObjectEntity type: TYPE(vector) shape: 1_8:100_8 +!CHECK: v2 size=1200 offset=1204: ObjectEntity type: TYPE(vector) shape: 1_8:100_8 +!CHECK: vector: Use from vector in vector_mod + +!CHECK: OtherConstruct scope: +!CHECK: omp_in size=12 offset=0: ObjectEntity type: TYPE(vector) +!CHECK: omp_orig size=12 offset=12: ObjectEntity type: TYPE(vector) +!CHECK: omp_out size=12 offset=24: ObjectEntity type: TYPE(vector) +!CHECK: omp_priv size=12 offset=36: ObjectEntity type: TYPE(vector) + + v2 = Vector(0.0, 0.0, 0.0) + v1 = Vector(1.0, 2.0, 3.0) + !$OMP parallel do reduction(+:v2) +!CHECK: OtherConstruct scope +!CHECK: i (OmpPrivate, OmpPreDetermined): HostAssoc +!CHECK: v1: HostAssoc +!CHECK: v2 (OmpReduction): HostAssoc + + do i = 1, 100 + v2(i) = v2(i) + v1(i) ! Invokes add_vectors + end do + + print *, 'v2 components:', v2%x, v2%y, v2%z +end program test_vector diff --git a/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 new file mode 100644 index 0000000000000..14695faf844b6 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 @@ -0,0 +1,30 @@ +! 
RUN: not %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +module mm + implicit none + type two + integer(4) :: a, b + end type two + + type three + integer(8) :: a, b, c + end type three +contains + function add_two(x, y) + type(two) add_two, x, y, res + add_two = res + end function add_two + + function func(n) + type(three) :: func + type(three) :: res3 + integer :: n + integer :: i + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) + !$omp simd reduction(adder:res3) +!CHECK: error: The type of 'res3' is incompatible with the reduction operator. + do i=1,n + enddo + func = res3 + end function func +end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction.f90 b/flang/test/Semantics/OpenMP/declare-reduction.f90 index 11612f01f0f2d..ddca38fd57812 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction.f90 @@ -17,7 +17,7 @@ subroutine initme(x,n) end subroutine initme end interface !$omp declare reduction(red_add:integer(4):omp_out=omp_out+omp_in) initializer(initme(omp_priv,0)) -!CHECK: red_add: Misc ConstructName +!CHECK: red_add: UserReductionDetails !CHECK: Subprogram scope: initme !CHECK: omp_in size=4 offset=0: ObjectEntity type: INTEGER(4) !CHECK: omp_orig size=4 offset=4: ObjectEntity type: INTEGER(4) @@ -35,7 +35,7 @@ program main !$omp declare reduction (my_add_red : integer : omp_out = omp_out + omp_in) initializer (omp_priv=0) -!CHECK: my_add_red: Misc ConstructName +!CHECK: my_add_red: UserReductionDetails !CHECK: omp_in size=4 offset=0: ObjectEntity type: INTEGER(4) !CHECK: omp_orig size=4 offset=4: ObjectEntity type: INTEGER(4) !CHECK: omp_out size=4 offset=8: ObjectEntity type: INTEGER(4) >From ee4f70556e81a47c3c42b3bf781425deac5d44a0 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Wed, 26 Mar 2025 13:42:43 +0000 Subject: [PATCH 02/11] Fix review comments * Add two more tests (multiple operator-based declarations and 
re-using symbol already declared. * Add a few comments. * Fix up logical results. --- flang/include/flang/Semantics/symbol.h | 10 +-- flang/lib/Semantics/check-omp-structure.cpp | 11 +-- flang/lib/Semantics/resolve-names.cpp | 38 +++++++---- .../OpenMP/declare-reduction-dupsym.f90 | 15 ++++ .../OpenMP/declare-reduction-functions.f90 | 68 ++++++++++++++++++- .../OpenMP/declare-reduction-logical.f90 | 32 +++++++++ .../OpenMP/declare-reduction-typeerror.f90 | 4 ++ 7 files changed, 152 insertions(+), 26 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-logical.f90 diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 12867a5f8ec6f..b944912290cf7 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -701,7 +701,10 @@ class GenericDetails { }; llvm::raw_ostream &operator<<(llvm::raw_ostream &, const GenericDetails &); -class UserReductionDetails : public WithBindName { +// Used for OpenMP DECLARE REDUCTION, it holds the information +// needed to resolve which declaration (there could be multiple +// with the same name) to use for a given type. 
+class UserReductionDetails { public: using TypeVector = std::vector; UserReductionDetails() = default; @@ -710,10 +713,7 @@ class UserReductionDetails : public WithBindName { const TypeVector &GetTypeList() const { return typeList_; } bool SupportsType(const DeclTypeSpec *type) const { - for (auto t : typeList_) - if (t == type) - return true; - return false; + return llvm::is_contained(typeList_, type); } private: diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa8c830e8f2d2..1eac0bdfc05bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3518,7 +3518,10 @@ static bool IsReductionAllowedForType( case parser::DefinedOperator::IntrinsicOperator::OR: case parser::DefinedOperator::IntrinsicOperator::EQV: case parser::DefinedOperator::IntrinsicOperator::NEQV: - return isLogical(type); + if (isLogical(type)) { + return true; + } + break; // Reduction identifier is not in OMP5.2 Table 5.2 default: @@ -3535,7 +3538,7 @@ static bool IsReductionAllowedForType( } return false; } - assert(0 && "Intrinsic Operator not found - parsing gone wrong?"); + DIE("Intrinsic Operator not found - parsing gone wrong?"); return false; // Reject everything else. }}; @@ -3573,7 +3576,7 @@ static bool IsReductionAllowedForType( // We also need to check for mangled names (max, min, iand, ieor and ior) // and then check if the type is there. - parser::CharBlock mangledName = MangleSpecialFunctions(name->source); + parser::CharBlock mangledName{MangleSpecialFunctions(name->source)}; if (const auto &symbol{scope.FindSymbol(mangledName)}) { if (const auto *reductionDetails{ symbol->detailsIf()}) { @@ -3583,7 +3586,7 @@ static bool IsReductionAllowedForType( // Everything else is "not matching type". 
return false; } - assert(0 && "name and name->symbol should be set here..."); + DIE("name and name->symbol should be set here..."); return false; }}; diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f048b374588ca..fe63ff31afd3d 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1772,7 +1772,7 @@ parser::CharBlock MakeNameFromOperator( return parser::CharBlock{"op.NEQV", 8}; default: - assert(0 && "Unsupported operator..."); + DIE("Unsupported operator..."); return parser::CharBlock{"op.?", 4}; } } @@ -1801,8 +1801,8 @@ void OmpVisitor::ProcessReductionSpecifier( const parser::OmpReductionSpecifier &spec, const std::optional &clauses) { const parser::Name *name{nullptr}; - parser::Name mangledName{}; - UserReductionDetails reductionDetailsTemp{}; + parser::Name mangledName; + UserReductionDetails reductionDetailsTemp; const auto &id{std::get(spec.t)}; if (auto procDes{std::get_if(&id.u)}) { name = std::get_if(&procDes->u); @@ -1816,11 +1816,22 @@ void OmpVisitor::ProcessReductionSpecifier( name = &mangledName; } + // Use reductionDetailsTemp if we can't find the symbol (this is + // the first, or only, instance with this name). The detaiols then + // gets stored in the symbol when it's created. UserReductionDetails *reductionDetails{&reductionDetailsTemp}; - Symbol *symbol{name ? name->symbol : nullptr}; - symbol = FindSymbol(mangledName); + Symbol *symbol{FindSymbol(mangledName)}; if (symbol) { + // If we found a symbol, we append the type info to the + // existing reductionDetails. reductionDetails = symbol->detailsIf(); + + if (!reductionDetails) { + context().Say(name->source, + "Duplicate defineition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + name->source); + return; + } } auto &typeList{std::get(spec.t)}; @@ -1849,17 +1860,16 @@ void OmpVisitor::ProcessReductionSpecifier( // We need to walk t.u because Walk(t) does it's own BeginDeclTypeSpec. 
Walk(t.u); - const DeclTypeSpec *typeSpec{GetDeclTypeSpec()}; - assert(typeSpec && "We should have a type here"); - - if (reductionDetails) { + // Only process types we can find. There will be an error later on when + // a type isn't found. + if (const DeclTypeSpec * typeSpec{GetDeclTypeSpec()}) { reductionDetails->AddType(typeSpec); - } - for (auto &nm : ompVarNames) { - ObjectEntityDetails details{}; - details.set_type(*typeSpec); - MakeSymbol(nm, Attrs{}, std::move(details)); + for (auto &nm : ompVarNames) { + ObjectEntityDetails details{}; + details.set_type(*typeSpec); + MakeSymbol(nm, Attrs{}, std::move(details)); + } } EndDeclTypeSpec(); Walk(std::get>(spec.t)); diff --git a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 new file mode 100644 index 0000000000000..17f70174e1854 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 @@ -0,0 +1,15 @@ +! RUN: not %flang_fc1 -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +!! Check for duplicate symbol use. 
+subroutine dup_symbol() + type :: loc + integer :: x + integer :: y + end type loc + + integer :: my_red + +!CHECK: error: Duplicate defineition of 'my_red' in !$OMP DECLARE REDUCTION + !$omp declare reduction(my_red : loc : omp_out%x = omp_out%x + omp_in%x) initializer(omp_priv%x = 0) + +end subroutine dup_symbol diff --git a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 index 924ef0807ec80..a2435fca415cd 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 @@ -85,8 +85,8 @@ function functhree(x, n) functhree=res end function functhree - function functtwothree(x, n) - type(twothree) :: functtwothree + function functwothree(x, n) + type(twothree) :: functwothree type(twothree) :: x(n) type(twothree) :: res type(two) :: res2 @@ -121,6 +121,68 @@ function functtwothree(x, n) enddo res%t2 = res2 res%t3 = res3 - end function functtwothree + functwothree=res + end function functwothree + +!CHECK-LABEL: Subprogram scope: funcbtwo + function funcBtwo(x, n) + type(two) funcBtwo + integer :: n + type(two) :: x(n) + type(two) :: res + integer :: i + !$omp declare reduction(+:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) +!CHECK: op.+: UserReductionDetails TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) + + + !$omp simd reduction(+:res) + do i=1,n + res=add_two(res,x(i)) + enddo + funcBtwo=res + end function funcBtwo + + function funcBtwothree(x, n) + type(twothree) :: funcBtwothree + type(twothree) :: x(n) + type(twothree) :: res + type(two) :: res2 + type(three) :: res3 + integer :: n + integer :: i + + !$omp declare 
reduction(+:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) + !$omp declare reduction(+:three:omp_out=add_three(omp_out,omp_in)) initializer(initthree(omp_priv,1)) + +!CHECK: op.+: UserReductionDetails TYPE(two) TYPE(three) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=24 offset=0: ObjectEntity type: TYPE(three) +!CHECK: omp_orig size=24 offset=24: ObjectEntity type: TYPE(three) +!CHECK: omp_out size=24 offset=48: ObjectEntity type: TYPE(three) +!CHECK: omp_priv size=24 offset=72: ObjectEntity type: TYPE(three) + + !$omp simd reduction(+:res3) + do i=1,n + res3=add_three(res%t3,x(i)%t3) + enddo + + !$omp simd reduction(+:res2) + do i=1,n + res2=add_two(res2,x(i)%t2) + enddo + res%t2 = res2 + res%t3 = res3 + end function funcBtwothree + end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-logical.f90 b/flang/test/Semantics/OpenMP/declare-reduction-logical.f90 new file mode 100644 index 0000000000000..7ab7cad473ac8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-logical.f90 @@ -0,0 +1,32 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module mm + implicit none + type logicalwrapper + logical b + end type logicalwrapper + +contains +!CHECK-LABEL: Subprogram scope: func + function func(x, n) + logical func + integer :: n + type(logicalwrapper) :: x(n) + type(logicalwrapper) :: res + integer :: i + !$omp declare reduction(.AND.:type(logicalwrapper):omp_out%b=omp_out%b .AND. omp_in%b) initializer(omp_priv%b=.true.) 
+!CHECK: op.AND: UserReductionDetails TYPE(logicalwrapper) +!CHECK OtherConstruct scope +!CHECK: omp_in size=4 offset=0: ObjectEntity type: TYPE(logicalwrapper) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: TYPE(logicalwrapper) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: TYPE(logicalwrapper) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: TYPE(logicalwrapper) + + !$omp simd reduction(.AND.:res) + do i=1,n + res%b=res%b .and. x(i)%b + enddo + + func=res%b + end function func +end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 index 14695faf844b6..b8ede55aa0ed7 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 @@ -20,6 +20,10 @@ function func(n) type(three) :: res3 integer :: n integer :: i + + !$omp declare reduction(dummy:kerflunk:omp_out=omp_out+omp_in) +!CHECK: error: Derived type 'kerflunk' not found + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) !$omp simd reduction(adder:res3) !CHECK: error: The type of 'res3' is incompatible with the reduction operator. 
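For readers following the name mangling introduced in the patches above, here is a minimal standalone sketch of the scheme, not the code from the PR. It assumes `std::string_view` as a stand-in for `parser::CharBlock`, and the enum and the function names `MangleOperator` and `MangleSpecialFunction` are hypothetical. Because the length comes from the literal itself, there are no hand-counted sizes to keep in sync (in PATCH 01 the `CharBlock` for `"op.OR"` is given length 6 although the literal has five characters, and likewise 7 for `"op.EQV"` and 8 for `"op.NEQV"`).

```cpp
#include <optional>
#include <string_view>

// Hypothetical stand-in for parser::DefinedOperator::IntrinsicOperator.
// Power is listed only to exercise the "unsupported" path.
enum class IntrinsicOperator { Multiply, Add, Subtract, AND, OR, EQV, NEQV, Power };

// Sketch of MakeNameFromOperator: map an intrinsic operator to "op.<symbol>".
// Returns std::nullopt where the patch calls DIE("Unsupported operator...").
std::optional<std::string_view> MangleOperator(IntrinsicOperator op) {
  switch (op) {
  case IntrinsicOperator::Multiply:
    return std::string_view{"op.*"};
  case IntrinsicOperator::Add:
    return std::string_view{"op.+"};
  case IntrinsicOperator::Subtract:
    return std::string_view{"op.-"};
  case IntrinsicOperator::AND:
    return std::string_view{"op.AND"};
  case IntrinsicOperator::OR:
    return std::string_view{"op.OR"};
  case IntrinsicOperator::EQV:
    return std::string_view{"op.EQV"};
  case IntrinsicOperator::NEQV:
    return std::string_view{"op.NEQV"};
  default:
    return std::nullopt;
  }
}

// Sketch of MangleSpecialFunctions: the intrinsic reduction names get an
// "op." prefix so that they cannot clash with the intrinsic procedures of the
// same name; every other identifier is returned unchanged.
std::string_view MangleSpecialFunction(std::string_view name) {
  if (name == "max") return "op.max";
  if (name == "min") return "op.min";
  if (name == "iand") return "op.iand";
  if (name == "ior") return "op.ior";
  if (name == "ieor") return "op.ieor";
  return name;
}
```

With this sketch a user-named reduction such as `adder` keeps its name, while `max` and `+` become `op.max` and `op.+`, matching the symbol names the tests above expect in the `-fdebug-dump-symbols` output.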
>From 3f926bda65b22b18e2524f95cd5041792054c9d2 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Wed, 26 Mar 2025 17:51:25 +0000 Subject: [PATCH 03/11] Use stringswitch and spell details correctly --- flang/lib/Semantics/resolve-names.cpp | 27 +++++++++------------------ 1 file changed, 9 insertions(+), 18 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index fe63ff31afd3d..4f5dde00223bc 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,6 +38,7 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" +#include "llvm/ADT/StringSwitch.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1778,23 +1779,13 @@ parser::CharBlock MakeNameFromOperator( } parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name) { - if (name == "max") { - return parser::CharBlock{"op.max", 6}; - } - if (name == "min") { - return parser::CharBlock{"op.min", 6}; - } - if (name == "iand") { - return parser::CharBlock{"op.iand", 7}; - } - if (name == "ior") { - return parser::CharBlock{"op.ior", 6}; - } - if (name == "ieor") { - return parser::CharBlock{"op.ieor", 7}; - } - // All other names: return as is. - return name; + return llvm::StringSwitch(name.ToString()) + .Case("max", {"op.max", 6}) + .Case("min", {"op.min", 6}) + .Case("iand", {"op.iand", 7}) + .Case("ior", {"op.ior", 6}) + .Case("ieor", {"op.ieor", 7}) + .Default(name); } void OmpVisitor::ProcessReductionSpecifier( @@ -1817,7 +1808,7 @@ void OmpVisitor::ProcessReductionSpecifier( } // Use reductionDetailsTemp if we can't find the symbol (this is - // the first, or only, instance with this name). The detaiols then + // the first, or only, instance with this name). The details then // gets stored in the symbol when it's created. 
UserReductionDetails *reductionDetails{&reductionDetailsTemp}; Symbol *symbol{FindSymbol(mangledName)}; >From d0b5e5cb04a86a7980c7238787a927f881e03205 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 4 Apr 2025 16:27:07 +0100 Subject: [PATCH 04/11] Add support for user defined operators in declare reduction Also print the reduction declaration in the module file. Fix trivial typo. Add/modify tests to cover all the new things, including fixing the duplicated typo in the test... --- flang/include/flang/Semantics/semantics.h | 9 +++ flang/include/flang/Semantics/symbol.h | 10 +++ flang/lib/Parser/unparse.cpp | 7 +++ flang/lib/Semantics/mod-file.cpp | 21 +++++++ flang/lib/Semantics/mod-file.h | 1 + flang/lib/Semantics/resolve-names.cpp | 41 +++++++++--- flang/lib/Semantics/semantics.cpp | 6 ++ .../OpenMP/declare-reduction-dupsym.f90 | 2 +- .../OpenMP/declare-reduction-modfile.f90 | 63 +++++++++++++++++++ .../OpenMP/declare-reduction-operators.f90 | 29 +++++++++ 10 files changed, 180 insertions(+), 9 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 diff --git a/flang/include/flang/Semantics/semantics.h b/flang/include/flang/Semantics/semantics.h index 730513dbe3232..460af89daa0cf 100644 --- a/flang/include/flang/Semantics/semantics.h +++ b/flang/include/flang/Semantics/semantics.h @@ -290,6 +290,10 @@ class SemanticsContext { // Top-level ProgramTrees are owned by the SemanticsContext for persistence. ProgramTree &SaveProgramTree(ProgramTree &&); + // Store (and get a reference to the stored string) for mangled names + // used for OpenMP DECLARE REDUCTION. + std::string &StoreUserReductionName(const std::string &name); + private: struct ScopeIndexComparator { bool operator()(parser::CharBlock, parser::CharBlock) const; @@ -343,6 +347,11 @@ class SemanticsContext { std::map moduleFileOutputRenamings_; UnorderedSymbolSet isDefined_; std::list programTrees_; + + // storage for mangled names used in OMP DECLARE REDUCTION. 
+ // use std::list to avoid re-allocating the string when adding + // more content to the container. + std::list userReductionNames_; }; class Semantics { diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index b944912290cf7..f28a1d6b929eb 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -29,6 +29,8 @@ class raw_ostream; } namespace Fortran::parser { struct Expr; +struct OpenMPDeclareReductionConstruct; +struct OmpDirectiveSpecification; } namespace Fortran::semantics { @@ -707,6 +709,10 @@ llvm::raw_ostream &operator<<(llvm::raw_ostream &, const GenericDetails &); class UserReductionDetails { public: using TypeVector = std::vector; + using DeclInfo = std::variant; + using DeclVector = std::vector; + UserReductionDetails() = default; void AddType(const DeclTypeSpec *type) { typeList_.push_back(type); } @@ -716,8 +722,12 @@ class UserReductionDetails { return llvm::is_contained(typeList_, type); } + void AddDecl(const DeclInfo &decl) { declList_.push_back(decl); } + const DeclVector &GetDeclList() const { return declList_; } + private: TypeVector typeList_; + DeclVector declList_; }; class UnknownDetails {}; diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 47dae0ae753d2..3f6815968b76f 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -3325,4 +3325,11 @@ template void Unparse(llvm::raw_ostream &, const Program &, Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); template void Unparse(llvm::raw_ostream &, const Expr &, Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); + +template void Unparse( + llvm::raw_ostream &, const parser::OpenMPDeclareReductionConstruct &, + Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); +template void Unparse(llvm::raw_ostream &, + const parser::OmpDirectiveSpecification &, Encoding, bool, bool, + preStatementType *, 
AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index ee356e56e4458..93226beb8b5ed 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -8,6 +8,7 @@ #include "mod-file.h" #include "resolve-names.h" +#include "flang/Common/indirection.h" #include "flang/Common/restorer.h" #include "flang/Evaluate/tools.h" #include "flang/Parser/message.h" @@ -887,6 +888,7 @@ void ModFileWriter::PutEntity(llvm::raw_ostream &os, const Symbol &symbol) { [&](const ObjectEntityDetails &) { PutObjectEntity(os, symbol); }, [&](const ProcEntityDetails &) { PutProcEntity(os, symbol); }, [&](const TypeParamDetails &) { PutTypeParam(os, symbol); }, + [&](const UserReductionDetails &) { PutUserReduction(os, symbol); }, [&](const auto &) { common::die("PutEntity: unexpected details: %s", DetailsToString(symbol.details()).c_str()); @@ -1035,6 +1037,25 @@ void ModFileWriter::PutTypeParam(llvm::raw_ostream &os, const Symbol &symbol) { os << '\n'; } +void ModFileWriter::PutUserReduction( + llvm::raw_ostream &os, const Symbol &symbol) { + auto &details{symbol.get()}; + // The module content for a OpenMP Declare Reduction is the OpenMP + // declaration. There may be multiple declarations. + // Decls are pointers, so do not use a referene. 
+ for (const auto decl : details.GetDeclList()) { + if (auto d = std::get_if( + &decl)) { + Unparse(os, **d); + } else if (auto s = std::get_if( + &decl)) { + Unparse(os, **s); + } else { + DIE("Unknown OpenMP DECLARE REDUCTION content"); + } + } +} + void PutInit(llvm::raw_ostream &os, const Symbol &symbol, const MaybeExpr &init, const parser::Expr *unanalyzed) { if (IsNamedConstant(symbol) || symbol.owner().IsDerivedType()) { diff --git a/flang/lib/Semantics/mod-file.h b/flang/lib/Semantics/mod-file.h index 82538fb510873..9e5724089b3c5 100644 --- a/flang/lib/Semantics/mod-file.h +++ b/flang/lib/Semantics/mod-file.h @@ -80,6 +80,7 @@ class ModFileWriter { void PutDerivedType(const Symbol &, const Scope * = nullptr); void PutDECStructure(const Symbol &, const Scope * = nullptr); void PutTypeParam(llvm::raw_ostream &, const Symbol &); + void PutUserReduction(llvm::raw_ostream &, const Symbol &); void PutSubprogram(const Symbol &); void PutGeneric(const Symbol &); void PutUse(const Symbol &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 4f5dde00223bc..3943bdaf0c2c7 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1502,7 +1502,7 @@ class OmpVisitor : public virtual DeclarationVisitor { AddOmpSourceRange(x.source); ProcessReductionSpecifier( std::get>(x.t).value(), - std::get>(x.t)); + std::get>(x.t), x); return false; } bool Pre(const parser::OmpMapClause &); @@ -1658,8 +1658,13 @@ class OmpVisitor : public virtual DeclarationVisitor { private: void ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, const parser::OmpClauseList &clauses); + template void ProcessReductionSpecifier(const parser::OmpReductionSpecifier &spec, - const std::optional &clauses); + const std::optional &clauses, + const T &wholeConstruct); + + parser::CharBlock MangleDefinedOperator(const parser::CharBlock &name); + int metaLevel_{0}; }; @@ -1788,9 +1793,21 @@ parser::CharBlock 
MangleSpecialFunctions(const parser::CharBlock name) { .Default(name); } +parser::CharBlock OmpVisitor::MangleDefinedOperator( + const parser::CharBlock &name) { + // This function should only be used with user defined operators, that have + // the pattern + // .. + CHECK(name[0] == '.' && name[name.size() - 1] == '.'); + return parser::CharBlock{ + context().StoreUserReductionName("op" + name.ToString())}; +} + +template void OmpVisitor::ProcessReductionSpecifier( const parser::OmpReductionSpecifier &spec, - const std::optional &clauses) { + const std::optional &clauses, + const T &wholeOmpConstruct) { const parser::Name *name{nullptr}; parser::Name mangledName; UserReductionDetails reductionDetailsTemp; @@ -1800,11 +1817,17 @@ void OmpVisitor::ProcessReductionSpecifier( if (name) { mangledName.source = MangleSpecialFunctions(name->source); } + } else { const auto &defOp{std::get(id.u)}; - mangledName.source = MakeNameFromOperator( - std::get(defOp.u)); - name = &mangledName; + if (const auto definedOp{std::get_if(&defOp.u)}) { + name = &definedOp->v; + mangledName.source = MangleDefinedOperator(definedOp->v.source); + } else { + mangledName.source = MakeNameFromOperator( + std::get(defOp.u)); + name = &mangledName; + } } // Use reductionDetailsTemp if we can't find the symbol (this is @@ -1819,7 +1842,7 @@ void OmpVisitor::ProcessReductionSpecifier( if (!reductionDetails) { context().Say(name->source, - "Duplicate defineition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + "Duplicate definition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, name->source); return; } @@ -1868,6 +1891,8 @@ void OmpVisitor::ProcessReductionSpecifier( PopScope(); } + reductionDetails->AddDecl(&wholeOmpConstruct); + if (name) { if (!symbol) { symbol = &MakeSymbol(mangledName, Attrs{}, std::move(*reductionDetails)); @@ -1903,7 +1928,7 @@ bool OmpVisitor::Pre(const parser::OmpDirectiveSpecification &x) { if (maybeArgs && maybeClauses) { const parser::OmpArgument 
&first{maybeArgs->v.front()}; if (auto *spec{std::get_if(&first.u)}) { - ProcessReductionSpecifier(*spec, maybeClauses); + ProcessReductionSpecifier(*spec, maybeClauses, x); } } break; diff --git a/flang/lib/Semantics/semantics.cpp b/flang/lib/Semantics/semantics.cpp index 10a01039ea0ae..4a74d9e1dc1bd 100644 --- a/flang/lib/Semantics/semantics.cpp +++ b/flang/lib/Semantics/semantics.cpp @@ -771,4 +771,10 @@ bool SemanticsContext::IsSymbolDefined(const Symbol &symbol) const { return isDefined_.find(symbol) != isDefined_.end(); } +std::string &SemanticsContext::StoreUserReductionName(const std::string &name) { + userReductionNames_.push_back(name); + CHECK(userReductionNames_.back() == name); + return userReductionNames_.back(); +} + } // namespace Fortran::semantics diff --git a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 index 17f70174e1854..2e82cd1a18332 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 @@ -9,7 +9,7 @@ subroutine dup_symbol() integer :: my_red -!CHECK: error: Duplicate defineition of 'my_red' in !$OMP DECLARE REDUCTION +!CHECK: error: Duplicate definition of 'my_red' in !$OMP DECLARE REDUCTION !$omp declare reduction(my_red : loc : omp_out%x = omp_out%x + omp_in%x) initializer(omp_priv%x = 0) end subroutine dup_symbol diff --git a/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 new file mode 100644 index 0000000000000..caed7fd335376 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 @@ -0,0 +1,63 @@ +! RUN: %python %S/../test_modfile.py %s %flang_fc1 -fopenmp +! Check correct modfile generation for OpenMP DECLARE REDUCTION construct. 
+ +!Expect: drm.mod +!module drm +!type::t1 +!integer(4)::val +!endtype +!!$OMP DECLARE REDUCTION (*:t1:omp_out = omp_out*omp_in) INITIALIZER(omp_priv=t& +!!$OMP&1(1)) +!!$OMP DECLARE REDUCTION (.fluffy.:t1:omp_out = omp_out.fluffy.omp_in) INITIALI& +!!$OMP&ZER(omp_priv=t1(0)) +!!$OMP DECLARE REDUCTION (.mul.:t1:omp_out = omp_out.mul.omp_in) INITIALIZER(om& +!!$OMP&p_priv=t1(1)) +!interface operator(.mul.) +!procedure::mul +!end interface +!interface operator(.fluffy.) +!procedure::add +!end interface +!interface operator(*) +!procedure::mul +!end interface +!contains +!function mul(v1,v2) +!type(t1),intent(in)::v1 +!type(t1),intent(in)::v2 +!type(t1)::mul +!end +!function add(v1,v2) +!type(t1),intent(in)::v1 +!type(t1),intent(in)::v2 +!type(t1)::add +!end +!end + +module drm + type t1 + integer :: val + end type t1 + interface operator(.mul.) + procedure mul + end interface + interface operator(.fluffy.) + procedure add + end interface + interface operator(*) + module procedure mul + end interface +!$omp declare reduction(*:t1:omp_out=omp_out*omp_in) initializer(omp_priv=t1(1)) +!$omp declare reduction(.mul.:t1:omp_out=omp_out.mul.omp_in) initializer(omp_priv=t1(1)) +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) initializer(omp_priv=t1(0)) +contains + type(t1) function mul(v1, v2) + type(t1), intent (in):: v1, v2 + mul%val = v1%val * v2%val + end function + type(t1) function add(v1, v2) + type(t1), intent (in):: v1, v2 + add%val = v1%val + v2%val + end function +end module drm + diff --git a/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 index e7513ab3f95b1..73fa1a1fea2c5 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 @@ -19,6 +19,35 @@ function add_vectors(a, b) result(res) end function add_vectors end module vector_mod +!! Test user-defined operators. 
Two different varieties, using conventional and +!! unconventional names. +module m1 + interface operator(.mul.) + procedure my_mul + end interface + interface operator(.fluffy.) + procedure my_add + end interface + type t1 + integer :: val = 1 + end type +!$omp declare reduction(.mul.:t1:omp_out=omp_out.mul.omp_in) +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) +!CHECK: op.fluffy., PUBLIC: UserReductionDetails TYPE(t1) +!CHECK: op.mul., PUBLIC: UserReductionDetails TYPE(t1) +contains + function my_mul(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_mul + my_mul%val = x%val * y%val + end function + function my_add(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_add + my_add%val = x%val + y%val + end function +end module m1 + program test_vector !CHECK-LABEL: MainProgram scope: test_vector use vector_mod >From 38bdc0f7b498150eaa6d7a291456477fc94aff73 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 4 Apr 2025 18:56:14 +0100 Subject: [PATCH 05/11] Fix nit comments and add simple bad operator test --- flang/lib/Semantics/check-omp-structure.cpp | 12 +++++------- flang/lib/Semantics/resolve-names-utils.h | 3 ++- flang/lib/Semantics/resolve-names.cpp | 8 +++++--- flang/lib/Semantics/symbol.cpp | 3 +-- .../OpenMP/declare-reduction-bad-operator.f90 | 6 ++++++ 5 files changed, 19 insertions(+), 13 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 1eac0bdfc05bd..099a58124638d 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3488,7 +3488,7 @@ void OmpStructureChecker::CheckReductionObjects( static bool IsReductionAllowedForType( const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type, - const Scope &scope) { + const Scope &scope, SemanticsContext &context) { auto isLogical{[](const DeclTypeSpec &type) 
-> bool { return type.category() == DeclTypeSpec::Logical; }}; @@ -3528,7 +3528,7 @@ static bool IsReductionAllowedForType( DIE("This should have been caught in CheckIntrinsicOperator"); return false; } - parser::CharBlock name{MakeNameFromOperator(*intrinsicOp)}; + parser::CharBlock name{MakeNameFromOperator(*intrinsicOp, context)}; Symbol *symbol{scope.FindSymbol(name)}; if (symbol) { const auto *reductionDetails{symbol->detailsIf()}; @@ -3539,11 +3539,11 @@ static bool IsReductionAllowedForType( return false; } DIE("Intrinsic Operator not found - parsing gone wrong?"); - return false; // Reject everything else. }}; auto checkDesignator{[&](const parser::ProcedureDesignator &procD) { const parser::Name *name{std::get_if(&procD.u)}; + CHECK(name && name->symbol); if (name && name->symbol) { const SourceName &realName{name->symbol->GetUltimate().name()}; // OMP5.2: The type [...] of a list item that appears in a @@ -3583,10 +3583,8 @@ static bool IsReductionAllowedForType( return reductionDetails->SupportsType(&type); } } - // Everything else is "not matching type". - return false; } - DIE("name and name->symbol should be set here..."); + // Everything else is "not matching type". 
return false; }}; @@ -3603,7 +3601,7 @@ void OmpStructureChecker::CheckReductionObjectTypes( for (auto &[symbol, source] : symbols) { if (auto *type{symbol->GetType()}) { const auto &scope{context_.FindScope(symbol->name())}; - if (!IsReductionAllowedForType(ident, *type, scope)) { + if (!IsReductionAllowedForType(ident, *type, scope, context_)) { context_.Say(source, "The type of '%s' is incompatible with the reduction operator."_err_en_US, symbol->name()); diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index de0991d69b61b..ed74c8203e29a 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -147,7 +147,8 @@ void MapSubprogramToNewSymbols(const Symbol &oldSymbol, Symbol &newSymbol, Scope &newScope, SymbolAndTypeMappings * = nullptr); parser::CharBlock MakeNameFromOperator( - const parser::DefinedOperator::IntrinsicOperator &op); + const parser::DefinedOperator::IntrinsicOperator &op, + SemanticsContext &context); parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name); } // namespace Fortran::semantics diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 3943bdaf0c2c7..552a0efc6aaa5 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1759,7 +1759,8 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, } parser::CharBlock MakeNameFromOperator( - const parser::DefinedOperator::IntrinsicOperator &op) { + const parser::DefinedOperator::IntrinsicOperator &op, + SemanticsContext &context) { switch (op) { case parser::DefinedOperator::IntrinsicOperator::Multiply: return parser::CharBlock{"op.*", 4}; @@ -1778,7 +1779,7 @@ parser::CharBlock MakeNameFromOperator( return parser::CharBlock{"op.NEQV", 8}; default: - DIE("Unsupported operator..."); + context.Say("Unsupported operator in OMP DECLARE REDUCTION"_err_en_US); return 
parser::CharBlock{"op.?", 4}; } } @@ -1825,7 +1826,8 @@ void OmpVisitor::ProcessReductionSpecifier( mangledName.source = MangleDefinedOperator(definedOp->v.source); } else { mangledName.source = MakeNameFromOperator( - std::get(defOp.u)); + std::get(defOp.u), + context()); name = &mangledName; } } diff --git a/flang/lib/Semantics/symbol.cpp b/flang/lib/Semantics/symbol.cpp index e627dd293ba7c..e1e9f1705e452 100644 --- a/flang/lib/Semantics/symbol.cpp +++ b/flang/lib/Semantics/symbol.cpp @@ -246,8 +246,7 @@ void GenericDetails::CopyFrom(const GenericDetails &from) { // This is primarily for debugging. std::string DetailsToString(const Details &details) { return common::visit( - common::visitors{// - [](const UnknownDetails &) { return "Unknown"; }, + common::visitors{[](const UnknownDetails &) { return "Unknown"; }, [](const MainProgramDetails &) { return "MainProgram"; }, [](const ModuleDetails &) { return "Module"; }, [](const SubprogramDetails &) { return "Subprogram"; }, diff --git a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 new file mode 100644 index 0000000000000..3b27c6aa20f13 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 @@ -0,0 +1,6 @@ +! 
RUN: not %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +function func(n) + !$omp declare reduction(/:integer:omp_out=omp_out+omp_in) +!CHECK: error: Unsupported operator in OMP DECLARE REDUCTION +end function func >From 0afc3591885e72d59912811e0e01b158f895b928 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 4 Apr 2025 19:47:42 +0100 Subject: [PATCH 06/11] Fix error messages to be more consistent --- flang/lib/Semantics/resolve-names.cpp | 6 +++--- .../Semantics/OpenMP/declare-reduction-bad-operator.f90 | 2 +- flang/test/Semantics/OpenMP/declare-reduction-error.f90 | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 552a0efc6aaa5..f1c2a0759e2ed 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1492,7 +1492,7 @@ class OmpVisitor : public virtual DeclarationVisitor { auto *symbol{FindSymbol(NonDerivedTypeScope(), name)}; if (!symbol) { context().Say(name.source, - "Implicit subroutine declaration '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + "Implicit subroutine declaration '%s' in DECLARE REDUCTION"_err_en_US, name.source); } return true; @@ -1779,7 +1779,7 @@ parser::CharBlock MakeNameFromOperator( return parser::CharBlock{"op.NEQV", 8}; default: - context.Say("Unsupported operator in OMP DECLARE REDUCTION"_err_en_US); + context.Say("Unsupported operator in DECLARE REDUCTION"_err_en_US); return parser::CharBlock{"op.?", 4}; } } @@ -1844,7 +1844,7 @@ void OmpVisitor::ProcessReductionSpecifier( if (!reductionDetails) { context().Say(name->source, - "Duplicate definition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + "Duplicate definition of '%s' in DECLARE REDUCTION"_err_en_US, name->source); return; } diff --git a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 index 
3b27c6aa20f13..1d1d2903a2780 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 @@ -2,5 +2,5 @@ function func(n) !$omp declare reduction(/:integer:omp_out=omp_out+omp_in) -!CHECK: error: Unsupported operator in OMP DECLARE REDUCTION +!CHECK: error: Unsupported operator in DECLARE REDUCTION end function func diff --git a/flang/test/Semantics/OpenMP/declare-reduction-error.f90 b/flang/test/Semantics/OpenMP/declare-reduction-error.f90 index c22cf106ea507..21f5cc186e037 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-error.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-error.f90 @@ -7,5 +7,5 @@ end subroutine initme subroutine subr !$omp declare reduction(red_add:integer(4):omp_out=omp_out+omp_in) initializer(initme(omp_priv,0)) - !CHECK: error: Implicit subroutine declaration 'initme' in !$OMP DECLARE REDUCTION + !CHECK: error: Implicit subroutine declaration 'initme' in DECLARE REDUCTION end subroutine subr >From 1b08d9cdb6b57640e0f85361c8b60f8b77084d57 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Mon, 7 Apr 2025 10:47:44 +0100 Subject: [PATCH 07/11] add missed test change --- flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 index 2e82cd1a18332..83f8f85299dca 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 @@ -9,7 +9,7 @@ subroutine dup_symbol() integer :: my_red -!CHECK: error: Duplicate definition of 'my_red' in !$OMP DECLARE REDUCTION +!CHECK: error: Duplicate definition of 'my_red' in DECLARE REDUCTION !$omp declare reduction(my_red : loc : omp_out%x = omp_out%x + omp_in%x) initializer(omp_priv%x = 0) end subroutine dup_symbol >From 17a400b89ce08f84c76ebbb4f1b2b7c92f3ae5ae 
Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Tue, 8 Apr 2025 14:41:37 +0100 Subject: [PATCH 08/11] Improve support for metadirective + declare reduction --- flang/include/flang/Semantics/symbol.h | 4 ++-- flang/lib/Parser/unparse.cpp | 4 ++-- flang/lib/Semantics/mod-file.cpp | 2 +- flang/lib/Semantics/resolve-names.cpp | 12 +++++++++++- .../Semantics/OpenMP/declare-reduction-modfile.f90 | 4 +++- 5 files changed, 19 insertions(+), 7 deletions(-) diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index f28a1d6b929eb..5cc47b36c234f 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -30,7 +30,7 @@ class raw_ostream; namespace Fortran::parser { struct Expr; struct OpenMPDeclareReductionConstruct; -struct OmpDirectiveSpecification; +struct OmpMetadirectiveDirective; } namespace Fortran::semantics { @@ -710,7 +710,7 @@ class UserReductionDetails { public: using TypeVector = std::vector; using DeclInfo = std::variant; + const parser::OmpMetadirectiveDirective *>; using DeclVector = std::vector; UserReductionDetails() = default; diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 3f6815968b76f..bcc50a72a84b4 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -3329,7 +3329,7 @@ template void Unparse(llvm::raw_ostream &, const Expr &, Encoding, bool, template void Unparse( llvm::raw_ostream &, const parser::OpenMPDeclareReductionConstruct &, Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); -template void Unparse(llvm::raw_ostream &, - const parser::OmpDirectiveSpecification &, Encoding, bool, bool, +template void Unparse(llvm::raw_ostream &, + const parser::OmpMetadirectiveDirective &, Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index 93226beb8b5ed..c24b4a63a2aeb 
100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1047,7 +1047,7 @@ void ModFileWriter::PutUserReduction( if (auto d = std::get_if( &decl)) { Unparse(os, **d); - } else if (auto s = std::get_if( + } else if (auto s = std::get_if( &decl)) { Unparse(os, **s); } else { diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1c2a0759e2ed..9c3bd00627ff7 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1655,6 +1655,14 @@ class OmpVisitor : public virtual DeclarationVisitor { EndDeclTypeSpec(); } + bool Pre(const parser::OmpMetadirectiveDirective &x) { // + metaDirective_ = &x; + return true; + } + void Post(const parser::OmpMetadirectiveDirective &) { // + metaDirective_ = nullptr; + } + private: void ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, const parser::OmpClauseList &clauses); @@ -1666,6 +1674,7 @@ class OmpVisitor : public virtual DeclarationVisitor { parser::CharBlock MangleDefinedOperator(const parser::CharBlock &name); int metaLevel_{0}; + const parser::OmpMetadirectiveDirective *metaDirective_{nullptr}; }; bool OmpVisitor::NeedsScope(const parser::OpenMPBlockConstruct &x) { @@ -1930,7 +1939,8 @@ bool OmpVisitor::Pre(const parser::OmpDirectiveSpecification &x) { if (maybeArgs && maybeClauses) { const parser::OmpArgument &first{maybeArgs->v.front()}; if (auto *spec{std::get_if(&first.u)}) { - ProcessReductionSpecifier(*spec, maybeClauses, x); + CHECK(metaDirective_); + ProcessReductionSpecifier(*spec, maybeClauses, *metaDirective_); } } break; diff --git a/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 index caed7fd335376..f80eb1097e18a 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 @@ -1,4 +1,4 @@ -! 
RUN: %python %S/../test_modfile.py %s %flang_fc1 -fopenmp +! RUN: %python %S/../test_modfile.py %s %flang_fc1 -fopenmp -fopenmp-version=52 ! Check correct modfile generation for OpenMP DECLARE REDUCTION construct. !Expect: drm.mod @@ -8,6 +8,7 @@ !endtype !!$OMP DECLARE REDUCTION (*:t1:omp_out = omp_out*omp_in) INITIALIZER(omp_priv=t& !!$OMP&1(1)) +!!$OMP METADIRECTIVE OTHERWISE(DECLARE REDUCTION(+:INTEGER)) !!$OMP DECLARE REDUCTION (.fluffy.:t1:omp_out = omp_out.fluffy.omp_in) INITIALI& !!$OMP&ZER(omp_priv=t1(0)) !!$OMP DECLARE REDUCTION (.mul.:t1:omp_out = omp_out.mul.omp_in) INITIALIZER(om& @@ -50,6 +51,7 @@ module drm !$omp declare reduction(*:t1:omp_out=omp_out*omp_in) initializer(omp_priv=t1(1)) !$omp declare reduction(.mul.:t1:omp_out=omp_out.mul.omp_in) initializer(omp_priv=t1(1)) !$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) initializer(omp_priv=t1(0)) +!$omp metadirective otherwise(declare reduction(+: integer)) contains type(t1) function mul(v1, v2) type(t1), intent (in):: v1, v2 >From 93f817902ed042f48c5171b23d95578e87eb5b0e Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Thu, 10 Apr 2025 19:11:10 +0100 Subject: [PATCH 09/11] Fix Klausler reported review comments Also rebase, as the branch was quite a way behind. Small conflict was resolved. 
--- flang/include/flang/Semantics/symbol.h | 6 +++--- flang/lib/Semantics/check-omp-structure.cpp | 9 ++++---- flang/lib/Semantics/mod-file.cpp | 23 ++++++++++++--------- flang/lib/Semantics/resolve-names-utils.h | 2 +- flang/lib/Semantics/resolve-names.cpp | 20 +++++++----------- 5 files changed, 29 insertions(+), 31 deletions(-) diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 5cc47b36c234f..b7b29afe1ceea 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -715,11 +715,11 @@ class UserReductionDetails { UserReductionDetails() = default; - void AddType(const DeclTypeSpec *type) { typeList_.push_back(type); } + void AddType(const DeclTypeSpec &type) { typeList_.push_back(&type); } const TypeVector &GetTypeList() const { return typeList_; } - bool SupportsType(const DeclTypeSpec *type) const { - return llvm::is_contained(typeList_, type); + bool SupportsType(const DeclTypeSpec &type) const { + return llvm::is_contained(typeList_, &type); } void AddDecl(const DeclInfo &decl) { declList_.push_back(decl); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 099a58124638d..91b8a8dd57d3b 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3404,8 +3404,7 @@ bool OmpStructureChecker::CheckReductionOperator( valid = llvm::is_contained({"max", "min", "iand", "ior", "ieor"}, realName); if (!valid) { - auto *reductionDetails{name->symbol->detailsIf()}; - valid = reductionDetails != nullptr; + valid = name->symbol->detailsIf(); } } if (!valid) { @@ -3534,7 +3533,7 @@ static bool IsReductionAllowedForType( const auto *reductionDetails{symbol->detailsIf()}; assert(reductionDetails && "Expected to find reductiondetails"); - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } return false; } @@ -3571,7 +3570,7 @@ static bool 
IsReductionAllowedForType( // supported. if (const auto *reductionDetails{ name->symbol->detailsIf()}) { - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } // We also need to check for mangled names (max, min, iand, ieor and ior) @@ -3580,7 +3579,7 @@ static bool IsReductionAllowedForType( if (const auto &symbol{scope.FindSymbol(mangledName)}) { if (const auto *reductionDetails{ symbol->detailsIf()}) { - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } } } diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index c24b4a63a2aeb..10abb4db159c1 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1039,20 +1039,23 @@ void ModFileWriter::PutTypeParam(llvm::raw_ostream &os, const Symbol &symbol) { void ModFileWriter::PutUserReduction( llvm::raw_ostream &os, const Symbol &symbol) { - auto &details{symbol.get()}; + const auto &details{symbol.get()}; // The module content for a OpenMP Declare Reduction is the OpenMP // declaration. There may be multiple declarations. // Decls are pointers, so do not use a referene. 
for (const auto decl : details.GetDeclList()) { - if (auto d = std::get_if( - &decl)) { - Unparse(os, **d); - } else if (auto s = std::get_if( - &decl)) { - Unparse(os, **s); - } else { - DIE("Unknown OpenMP DECLARE REDUCTION content"); - } + common::visit( // + common::visitors{// + [&](const parser::OpenMPDeclareReductionConstruct *d) { + Unparse(os, *d); + }, + [&](const parser::OmpMetadirectiveDirective *m) { + Unparse(os, *m); + }, + [&](const auto &) { + DIE("Unknown OpenMP DECLARE REDUCTION content"); + }}, + decl); } } diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index ed74c8203e29a..809074031e2cc 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -149,7 +149,7 @@ void MapSubprogramToNewSymbols(const Symbol &oldSymbol, Symbol &newSymbol, parser::CharBlock MakeNameFromOperator( const parser::DefinedOperator::IntrinsicOperator &op, SemanticsContext &context); -parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name); +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_RESOLVE_NAMES_H_ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 9c3bd00627ff7..0416b5d410fec 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1442,11 +1442,15 @@ class OmpVisitor : public virtual DeclarationVisitor { static bool NeedsScope(const parser::OpenMPBlockConstruct &); static bool NeedsScope(const parser::OmpClause &); - bool Pre(const parser::OmpMetadirectiveDirective &) { + bool Pre(const parser::OmpMetadirectiveDirective &x) { // + metaDirective_ = &x; ++metaLevel_; return true; } - void Post(const parser::OmpMetadirectiveDirective &) { --metaLevel_; } + void Post(const parser::OmpMetadirectiveDirective &) { // + metaDirective_ = nullptr; + --metaLevel_; + } bool Pre(const 
parser::OpenMPRequiresConstruct &x) { AddOmpSourceRange(x.source); @@ -1655,14 +1659,6 @@ class OmpVisitor : public virtual DeclarationVisitor { EndDeclTypeSpec(); } - bool Pre(const parser::OmpMetadirectiveDirective &x) { // - metaDirective_ = &x; - return true; - } - void Post(const parser::OmpMetadirectiveDirective &) { // - metaDirective_ = nullptr; - } - private: void ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, const parser::OmpClauseList &clauses); @@ -1793,7 +1789,7 @@ parser::CharBlock MakeNameFromOperator( } } -parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name) { +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name) { return llvm::StringSwitch(name.ToString()) .Case("max", {"op.max", 6}) .Case("min", {"op.min", 6}) @@ -1888,7 +1884,7 @@ void OmpVisitor::ProcessReductionSpecifier( // Only process types we can find. There will be an error later on when // a type isn't found. if (const DeclTypeSpec * typeSpec{GetDeclTypeSpec()}) { - reductionDetails->AddType(typeSpec); + reductionDetails->AddType(*typeSpec); for (auto &nm : ompVarNames) { ObjectEntityDetails details{}; >From ce6ca8fd9de36c0617a06df3f31a413fe317e08a Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Tue, 29 Apr 2025 13:25:19 +0100 Subject: [PATCH 10/11] Fix some semantics issues --- flang/lib/Semantics/assignment.cpp | 10 ++++++ flang/lib/Semantics/assignment.h | 3 ++ flang/lib/Semantics/check-omp-structure.cpp | 39 +++++++++++++-------- flang/lib/Semantics/resolve-names-utils.h | 1 + flang/lib/Semantics/resolve-names.cpp | 14 +++----- 5 files changed, 43 insertions(+), 24 deletions(-) diff --git a/flang/lib/Semantics/assignment.cpp b/flang/lib/Semantics/assignment.cpp index 935f5a03bdb6a..b6d66c1c92aa0 100644 --- a/flang/lib/Semantics/assignment.cpp +++ b/flang/lib/Semantics/assignment.cpp @@ -43,6 +43,7 @@ class AssignmentContext { void Analyze(const parser::PointerAssignmentStmt &); void Analyze(const parser::ConcurrentControl &); 
int deviceConstructDepth_{0}; + SemanticsContext &context() { return context_; } private: bool CheckForPureContext(const SomeExpr &rhs, parser::CharBlock rhsSource); @@ -213,8 +214,17 @@ void AssignmentContext::PopWhereContext() { AssignmentChecker::~AssignmentChecker() {} +SemanticsContext &AssignmentChecker::context() { + return context_.value().context(); +} + AssignmentChecker::AssignmentChecker(SemanticsContext &context) : context_{new AssignmentContext{context}} {} + +void AssignmentChecker::Enter( + const parser::OpenMPDeclareReductionConstruct &x) { + context().set_location(x.source); +} void AssignmentChecker::Enter(const parser::AssignmentStmt &x) { context_.value().Analyze(x); } diff --git a/flang/lib/Semantics/assignment.h b/flang/lib/Semantics/assignment.h index a67bee4a03dfc..4a1bb92037119 100644 --- a/flang/lib/Semantics/assignment.h +++ b/flang/lib/Semantics/assignment.h @@ -37,6 +37,7 @@ class AssignmentChecker : public virtual BaseChecker { public: explicit AssignmentChecker(SemanticsContext &); ~AssignmentChecker(); + void Enter(const parser::OpenMPDeclareReductionConstruct &x); void Enter(const parser::AssignmentStmt &); void Enter(const parser::PointerAssignmentStmt &); void Enter(const parser::WhereStmt &); @@ -54,6 +55,8 @@ class AssignmentChecker : public virtual BaseChecker { void Enter(const parser::OpenACCLoopConstruct &); void Leave(const parser::OpenACCLoopConstruct &); + SemanticsContext &context(); + private: common::Indirection context_; }; diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 91b8a8dd57d3b..0490d6aae8edf 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3391,6 +3391,14 @@ bool OmpStructureChecker::CheckReductionOperator( break; } } + // User-defined operators are OK if there has been a declared reduction + // for that. 
So check if it's a defined operator, and it has + // UserReductionDetails - then it's good. + if (const auto *definedOp{std::get_if(&dOpr.u)}) { + if (definedOp->v.symbol->detailsIf()) { + return true; + } + } context_.Say(source, "Invalid reduction operator in %s clause."_err_en_US, parser::ToUpperCaseLetters(getClauseName(clauseId).str())); return false; @@ -3485,6 +3493,17 @@ void OmpStructureChecker::CheckReductionObjects( } } +static bool CheckSymbolSupportsType(const Scope &scope, + const parser::CharBlock &name, const DeclTypeSpec &type) { + if (const auto &symbol{scope.FindSymbol(name)}) { + if (const auto *reductionDetails{ + symbol->detailsIf()}) { + return reductionDetails->SupportsType(&type); + } + } + return false; +} + static bool IsReductionAllowedForType( const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type, const Scope &scope, SemanticsContext &context) { @@ -3528,14 +3547,11 @@ static bool IsReductionAllowedForType( return false; } parser::CharBlock name{MakeNameFromOperator(*intrinsicOp, context)}; - Symbol *symbol{scope.FindSymbol(name)}; - if (symbol) { - const auto *reductionDetails{symbol->detailsIf()}; - assert(reductionDetails && "Expected to find reductiondetails"); - - return reductionDetails->SupportsType(type); - } - return false; + return CheckSymbolSupportsType(scope, name, type); + } else if (const auto *definedOp{ + std::get_if(&dOpr.u)}) { + // TODO: Figure out if it's valid. + return true; } DIE("Intrinsic Operator not found - parsing gone wrong?"); }}; @@ -3576,12 +3592,7 @@ static bool IsReductionAllowedForType( // We also need to check for mangled names (max, min, iand, ieor and ior) // and then check if the type is there. 
parser::CharBlock mangledName{MangleSpecialFunctions(name->source)}; - if (const auto &symbol{scope.FindSymbol(mangledName)}) { - if (const auto *reductionDetails{ - symbol->detailsIf()}) { - return reductionDetails->SupportsType(type); - } - } + return CheckSymbolSupportsType(scope, mangledName, type); } // Everything else is "not matching type". return false; diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index 809074031e2cc..ee8113a3fda5e 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -150,6 +150,7 @@ parser::CharBlock MakeNameFromOperator( const parser::DefinedOperator::IntrinsicOperator &op, SemanticsContext &context); parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name); +std::string MangleDefinedOperator(const parser::CharBlock &name); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_RESOLVE_NAMES_H_ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 0416b5d410fec..ebbbcc75e5b67 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1667,8 +1667,6 @@ class OmpVisitor : public virtual DeclarationVisitor { const std::optional &clauses, const T &wholeConstruct); - parser::CharBlock MangleDefinedOperator(const parser::CharBlock &name); - int metaLevel_{0}; const parser::OmpMetadirectiveDirective *metaDirective_{nullptr}; }; @@ -1799,14 +1797,9 @@ parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name) { .Default(name); } -parser::CharBlock OmpVisitor::MangleDefinedOperator( - const parser::CharBlock &name) { - // This function should only be used with user defined operators, that have - // the pattern - // .. +std::string MangleDefinedOperator(const parser::CharBlock &name) { CHECK(name[0] == '.' 
&& name[name.size() - 1] == '.'); - return parser::CharBlock{ - context().StoreUserReductionName("op" + name.ToString())}; + return "op" + name.ToString(); } template @@ -1828,7 +1821,8 @@ void OmpVisitor::ProcessReductionSpecifier( const auto &defOp{std::get(id.u)}; if (const auto definedOp{std::get_if(&defOp.u)}) { name = &definedOp->v; - mangledName.source = MangleDefinedOperator(definedOp->v.source); + mangledName.source = parser::CharBlock{context().StoreUserReductionName( + MangleDefinedOperator(definedOp->v.source))}; } else { mangledName.source = MakeNameFromOperator( std::get(defOp.u), >From 4739878424472cc74daf7f58e82916e045626338 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Thu, 1 May 2025 13:39:58 +0100 Subject: [PATCH 11/11] [Flang][OpenMP] Fix review comment failed examples Add code to better handle operators in parsing and semantics. Add a function to set the scope when processing assignments, which caused a crash in "check for pure functions". Add three new tests and amend existing tests to cover a pure function. 
--- flang/include/flang/Semantics/symbol.h | 9 +++- flang/lib/Semantics/check-omp-structure.cpp | 17 ++++--- .../declare-reduction-bad-operator2.f90 | 28 +++++++++++ .../OpenMP/declare-reduction-functions.f90 | 17 ++++++- .../OpenMP/declare-reduction-operator.f90 | 36 ++++++++++++++ .../OpenMP/declare-reduction-renamedop.f90 | 47 +++++++++++++++++++ 6 files changed, 145 insertions(+), 9 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-operator.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index b7b29afe1ceea..7141f5bb3feb4 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -719,7 +719,14 @@ class UserReductionDetails { const TypeVector &GetTypeList() const { return typeList_; } bool SupportsType(const DeclTypeSpec &type) const { - return llvm::is_contained(typeList_, &type); + // We have to compare the actual type, not the pointer, as some + // types are not guaranteed to be the same object. + for (auto t : typeList_) { + if (*t == type) { + return true; + } + } + return false; } void AddDecl(const DeclInfo &decl) { declList_.push_back(decl); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 0490d6aae8edf..cefe80f442727 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3392,11 +3392,14 @@ bool OmpStructureChecker::CheckReductionOperator( } } // User-defined operators are OK if there has been a declared reduction - // for that. So check if it's a defined operator, and it has - // UserReductionDetails - then it's good. + // for that. We mangle those names to store the user details. 
if (const auto *definedOp{std::get_if(&dOpr.u)}) { - if (definedOp->v.symbol->detailsIf()) { - return true; + std::string mangled = MangleDefinedOperator(definedOp->v.symbol->name()); + const Scope &scope = definedOp->v.symbol->owner(); + if (const Symbol *symbol = scope.FindSymbol(mangled)) { + if (symbol->detailsIf()) { + return true; + } } } context_.Say(source, "Invalid reduction operator in %s clause."_err_en_US, @@ -3498,7 +3501,7 @@ static bool CheckSymbolSupportsType(const Scope &scope, if (const auto &symbol{scope.FindSymbol(name)}) { if (const auto *reductionDetails{ symbol->detailsIf()}) { - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } } return false; @@ -3550,8 +3553,8 @@ static bool IsReductionAllowedForType( return CheckSymbolSupportsType(scope, name, type); } else if (const auto *definedOp{ std::get_if(&dOpr.u)}) { - // TODO: Figure out if it's valid. - return true; + return CheckSymbolSupportsType( + scope, MangleDefinedOperator(definedOp->v.symbol->name()), type); } DIE("Intrinsic Operator not found - parsing gone wrong?"); }}; diff --git a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 new file mode 100644 index 0000000000000..9ee223c1c71fe --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 @@ -0,0 +1,28 @@ +! RUN: not %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +module m1 + interface operator(.fluffy.) 
+ procedure my_mul + end interface + type t1 + integer :: val = 1 + end type +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) +contains + function my_mul(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_mul + my_mul%val = x%val * y%val + end function my_mul + + subroutine subr(a, r) + implicit none + integer, intent(in), dimension(10) :: a + integer, intent(out) :: r + integer :: i + !$omp do parallel reduction(.fluffy.:r) +!CHECK: error: The type of 'r' is incompatible with the reduction operator. + do i=1,10 + end do + end subroutine subr +end module m1 diff --git a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 index a2435fca415cd..000d323f522cf 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 @@ -166,7 +166,7 @@ function funcBtwothree(x, n) !CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) !CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) !CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) -!CHECK OtherConstruct scope +!CHECK: OtherConstruct scope !CHECK: omp_in size=24 offset=0: ObjectEntity type: TYPE(three) !CHECK: omp_orig size=24 offset=24: ObjectEntity type: TYPE(three) !CHECK: omp_out size=24 offset=48: ObjectEntity type: TYPE(three) @@ -184,5 +184,20 @@ function funcBtwothree(x, n) res%t2 = res2 res%t3 = res3 end function funcBtwothree + + !! This is checking a special case, where a reduction is declared inside a + !! 
pure function + + pure logical function reduction() +!CHECK: reduction size=4 offset=0: ObjectEntity funcResult type: LOGICAL(4) +!CHECK: rr: UserReductionDetails INTEGER(4) +!CHECK: OtherConstruct scope: size=16 alignment=4 sourceRange=0 bytes +!CHECK: omp_in size=4 offset=0: ObjectEntity type: INTEGER(4) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: INTEGER(4) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: INTEGER(4) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: INTEGER(4) + !$omp declare reduction (rr : integer : omp_out = omp_out + omp_in) initializer (omp_priv = 0) + reduction = .false. + end function reduction end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 b/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 new file mode 100644 index 0000000000000..e4ac7023f4629 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 @@ -0,0 +1,36 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module m1 + interface operator(.fluffy.) 
+!CHECK: .fluffy., PUBLIC (Function): Generic DefinedOp procs: my_mul + procedure my_mul + end interface + type t1 + integer :: val = 1 + end type +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) +!CHECK: op.fluffy., PUBLIC: UserReductionDetails TYPE(t1) +!CHECK: t1, PUBLIC: DerivedType components: val +!CHECK: OtherConstruct scope: size=16 alignment=4 sourceRange=0 bytes +!CHECK: omp_in size=4 offset=0: ObjectEntity type: TYPE(t1) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: TYPE(t1) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: TYPE(t1) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: TYPE(t1) +contains + function my_mul(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_mul + my_mul%val = x%val * y%val + end function my_mul + + subroutine subr(a, r) + implicit none + type(t1), intent(in), dimension(10) :: a + type(t1), intent(out) :: r + integer :: i + !$omp do parallel reduction(.fluffy.:r) + do i=1,10 + r = r .fluffy. a(i) + end do + end subroutine subr +end module m1 diff --git a/flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 b/flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 new file mode 100644 index 0000000000000..12e80cbf7b327 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 @@ -0,0 +1,47 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +!! Test that we can "rename" an operator when using a module's operator. +module module1 +!CHECK: Module scope: module1 size=0 + implicit none + type :: t1 + real :: value + end type t1 + interface operator(.mul.) + module procedure my_mul + end interface operator(.mul.) 
+!CHECK: .mul., PUBLIC (Function): Generic DefinedOp procs: my_mul +!CHECK: my_mul, PUBLIC (Function): Subprogram result:TYPE(t1) r (TYPE(t1) x,TYPE(t1) +!CHECK: t1, PUBLIC: DerivedType components: value +contains + function my_mul(x, y) result(r) + type(t1), intent(in) :: x, y + type(t1) :: r + r%value = x%value * y%value + end function my_mul +end module module1 + +program test_omp_reduction +!CHECK: MainProgram scope: test_omp_reduction + use module1, only: t1, operator(.modmul.) => operator(.mul.) + +!CHECK: .modmul. (Function): Use from .mul. in module1 + implicit none + + type(t1) :: result + integer :: i + !$omp declare reduction (.modmul. : t1 : omp_out = omp_out .modmul. omp_in) initializer(omp_priv = t1(1.0)) +!CHECK: op.modmul.: UserReductionDetails TYPE(t1) +!CHECK: t1: Use from t1 in module1 +!CHECK: OtherConstruct scope: size=16 alignment=4 sourceRange=0 bytes +!CHECK: omp_in size=4 offset=0: ObjectEntity type: TYPE(t1) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: TYPE(t1) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: TYPE(t1) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: TYPE(t1) + result = t1(1.0) + !$omp parallel do reduction(.modmul.:result) + do i = 1, 10 + result = result .modmul. t1(real(i)) + end do + !$omp end parallel do +end program test_omp_reduction From flang-commits at lists.llvm.org Thu May 1 06:08:41 2025 From: flang-commits at lists.llvm.org (Mats Petersson via flang-commits) Date: Thu, 01 May 2025 06:08:41 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #131628) In-Reply-To: Message-ID: <68137259.a70a0220.292e53.349b@mx.google.com> Leporacanthicus wrote: > There is a crash in `Fortran::semantics::IsReductionAllowedForType` for the following test. 
> [snip big chunk of code] > pure logical function is_reduction_valid() > !$omp declare reduction (bar : integer : omp_out = omp_out + omp_in) initializer (omp_priv = 0) > is_reduction_valid = .false. > end function is_reduction_valid > ``` Fixed all three of the code snippets that fail, and added tests for checking that it works. https://github.com/llvm/llvm-project/pull/131628 From flang-commits at lists.llvm.org Thu May 1 08:17:04 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 01 May 2025 08:17:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <68139070.170a0220.34ec91.d980@mx.google.com> ================ @@ -20,6 +21,7 @@ #include "flang/Optimizer/Dialect/FIROpsSupport.h" #include "flang/Optimizer/Support/InternalNames.h" #include "flang/Optimizer/Support/Utils.h" +#include "flang/Parser/parse-tree.h" ---------------- akuhlens wrote: As it turns out, it wasn't. I think this got pulled in while I was iterating and ultimately wasn't needed. Removed! 
https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Thu May 1 08:19:46 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 08:19:46 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <68139112.170a0220.1edde3.a9b0@mx.google.com> ================ @@ -13,6 +13,7 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H +#include ---------------- fanju110 wrote: OK, I have adjusted it as you suggested. https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Thu May 1 08:20:48 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 08:20:48 -0700 (PDT) Subject: [flang-commits] [flang] c617466 - [flang][llvm][OpenMP] Add implicit casts to omp.atomic (#131603) Message-ID: <68139150.630a0220.d3a12.f107@mx.google.com> Author: NimishMishra Date: 2025-05-01T08:20:42-07:00 New Revision: c61746650178c117996e1787617f36ccda7233f7 URL: https://github.com/llvm/llvm-project/commit/c61746650178c117996e1787617f36ccda7233f7 DIFF: https://github.com/llvm/llvm-project/commit/c61746650178c117996e1787617f36ccda7233f7.diff LOG: [flang][llvm][OpenMP] Add implicit casts to omp.atomic (#131603) Currently, implicit casts in Fortran are handled by the OMPIRBuilder. This patch shifts that responsibility to FIR codegen. 
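As a rough illustration of the idea in the log message above (this is not code from the patch), the sketch below mimics in plain C++ what the frontend is now expected to do for an atomic read whose two sides have different types: load the source atomically into a thread-private temporary, then convert that private copy. The names `atomicReadWithCast` and `demo`, the use of `std::atomic<float>`, and the relaxed memory ordering are assumptions made only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for `v = x` under `!$omp atomic read`,
// with x of type REAL(4) and v of type INTEGER(4).
std::int32_t atomicReadWithCast(const std::atomic<float> &x) {
  // Step 1: atomically load x into a thread-private temporary.
  // This plays the role of the per-thread temporary the frontend allocates.
  const float tmp = x.load(std::memory_order_relaxed);
  // Step 2: convert the private copy. No other thread can observe tmp,
  // so the conversion itself does not need to be atomic.
  // The cast truncates toward zero.
  return static_cast<std::int32_t>(tmp);
}

// Small self-check of the sketch (hypothetical helper).
void demo() {
  std::atomic<float> x{3.75f};
  assert(atomicReadWithCast(x) == 3);
}
```

The point of splitting the work this way is that only the load from the shared variable has to be atomic; the type conversion operates on a value that is already private to the reading thread.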
Added: flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 flang/test/Lower/OpenMP/atomic-implicit-cast.f90 Modified: flang/lib/Lower/OpenMP/OpenMP.cpp llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp mlir/test/Target/LLVMIR/openmp-llvm.mlir Removed: ################################################################################ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index f099028c23323..47e7c266ff7d3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2889,9 +2889,82 @@ static void genAtomicRead(lower::AbstractConverter &converter, fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); mlir::Value toAddress = fir::getBase(converter.genExprAddr( *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); + + if (fromAddress.getType() != toAddress.getType()) { + // Emit an implicit cast. Different yet compatible types on + // omp.atomic.read constitute valid Fortran. The OMPIRBuilder will + // emit atomic instructions (on primitive types) and `__atomic_load` + // libcall (on complex type) without explicitly converting + // between such compatible types. The OMPIRBuilder relies on the + // frontend to resolve such inconsistencies between `omp.atomic.read ` + // operand types. Similar inconsistencies between operand types in + // `omp.atomic.write` are resolved through implicit casting by use of typed + // assignment (i.e. `evaluate::Assignment`). However, use of typed + // assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, + // non-atomic load of `x` into a temporary `alloca`, followed by an atomic + // read of form `v = alloca`. Hence, it is needed to perform a custom + // implicit cast. + + // An atomic read of form `v = x` would (without implicit casting) + // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, + // type2`. 
This implicit casting will rather generate the following FIR: + // + // %alloca = fir.alloca type2 + // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 + // %load = fir.load %alloca : !fir.ref + // %cvt = fir.convert %load : (type2) -> type1 + // fir.store %cvt to %v : !fir.ref + + // These sequence of operations is thread-safe since each thread allocates + // the `alloca` in its stack, and performs `%alloca = %x` atomically. Once + // safely read, each thread performs the implicit cast on the local + // `alloca`, and writes the final result to `%v`. + mlir::Type toType = fir::unwrapRefType(toAddress.getType()); + mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto oldIP = builder.saveInsertionPoint(); + builder.setInsertionPointToStart(builder.getAllocaBlock()); + mlir::Value alloca = builder.create( + loc, fromType); // Thread scope `alloca` to atomically read `%x`. + builder.restoreInsertionPoint(oldIP); + genAtomicCaptureStatement(converter, fromAddress, alloca, + leftHandClauseList, rightHandClauseList, + elementType, loc); + auto load = builder.create(loc, alloca); + if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { + // Emit an additional `ExtractValueOp` if `fromAddress` is of complex + // type, but `toAddress` is not. + auto extract = builder.create( + loc, mlir::cast(fromType).getElementType(), load, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + auto cvt = builder.create(loc, toType, extract); + builder.create(loc, cvt, toAddress); + } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { + // Emit an additional `InsertValueOp` if `toAddress` is of complex + // type, but `fromAddress` is not. 
+ mlir::Value undef = builder.create(loc, toType); + mlir::Type complexEleTy = + mlir::cast(toType).getElementType(); + mlir::Value cvt = builder.create(loc, complexEleTy, load); + mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); + mlir::Value idx0 = builder.create( + loc, toType, undef, cvt, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + mlir::Value idx1 = builder.create( + loc, toType, idx0, zero, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 1))); + builder.create(loc, idx1, toAddress); + } else { + auto cvt = builder.create(loc, toType, load); + builder.create(loc, cvt, toAddress); + } + } else + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); } /// Processes an atomic construct with update clause. @@ -2976,6 +3049,10 @@ static void genAtomicCapture(lower::AbstractConverter &converter, mlir::Type stmt2VarType = fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + // Check if implicit type is needed + if (stmt1VarType != stmt2VarType) + TODO(loc, "atomic capture requiring implicit type casts"); + mlir::Operation *atomicCaptureOp = nullptr; mlir::IntegerAttr hint = nullptr; mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 new file mode 100644 index 0000000000000..5b61f1169308f --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 @@ -0,0 +1,48 @@ +!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +!CHECK: not yet implemented: atomic capture requiring implicit type casts +subroutine capture_with_convert_f32_to_i32() + implicit none + integer :: k, v, i + + k = 1 + v = 0 + + !$omp atomic capture + v = k + k = (i + 1) * 3.14 + !$omp end atomic +end subroutine + +subroutine 
capture_with_convert_i32_to_f64() + real(8) :: x + integer :: v + x = 1.0 + v = 0 + !$omp atomic capture + v = x + x = v + !$omp end atomic +end subroutine capture_with_convert_i32_to_f64 + +subroutine capture_with_convert_f64_to_i32() + integer :: x + real(8) :: v + x = 1 + v = 0 + !$omp atomic capture + x = v + v = x + !$omp end atomic +end subroutine capture_with_convert_f64_to_i32 + +subroutine capture_with_convert_i32_to_f32() + real(4) :: x + integer :: v + x = 1.0 + v = 0 + !$omp atomic capture + v = x + x = x + v + !$omp end atomic +end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 new file mode 100644 index 0000000000000..75f1cbfc979b9 --- /dev/null +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -0,0 +1,56 @@ +! REQUIRES : openmp_runtime + +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! CHECK: func.func @_QPatomic_implicit_cast_read() { +subroutine atomic_implicit_cast_read +! CHECK: %[[ALLOCA3:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA2:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA1:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA0:.*]] = fir.alloca f32 + +! CHECK: %[[M:.*]] = fir.alloca complex {bindc_name = "m", uniq_name = "_QFatomic_implicit_cast_readEm"} +! CHECK: %[[M_DECL:.*]]:2 = hlfir.declare %[[M]] {uniq_name = "_QFatomic_implicit_cast_readEm"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %[[W:.*]] = fir.alloca complex {bindc_name = "w", uniq_name = "_QFatomic_implicit_cast_readEw"} +! CHECK: %[[W_DECL:.*]]:2 = hlfir.declare %[[W]] {uniq_name = "_QFatomic_implicit_cast_readEw"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %[[X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFatomic_implicit_cast_readEx"} +! CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X]] {uniq_name = "_QFatomic_implicit_cast_readEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! 
CHECK: %[[Y:.*]] = fir.alloca f32 {bindc_name = "y", uniq_name = "_QFatomic_implicit_cast_readEy"} +! CHECK: %[[Y_DECL:.*]]:2 = hlfir.declare %[[Y]] {uniq_name = "_QFatomic_implicit_cast_readEy"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Z:.*]] = fir.alloca f64 {bindc_name = "z", uniq_name = "_QFatomic_implicit_cast_readEz"} +! CHECK: %[[Z_DECL:.*]]:2 = hlfir.declare %[[Z]] {uniq_name = "_QFatomic_implicit_cast_readEz"} : (!fir.ref) -> (!fir.ref, !fir.ref) + integer :: x + real :: y + double precision :: z + complex :: w + complex(8) :: m + +! CHECK: omp.atomic.read %[[ALLOCA0:.*]] = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, f32 +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA0]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (f32) -> i32 +! CHECK: fir.store %[[CVT]] to %[[X_DECL]]#0 : !fir.ref + !$omp atomic read + x = y + +! CHECK: omp.atomic.read %[[ALLOCA1:.*]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA1]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f64 +! CHECK: fir.store %[[CVT]] to %[[Z_DECL]]#0 : !fir.ref + !$omp atomic read + z = x + +! CHECK: omp.atomic.read %[[ALLOCA2:.*]] = %[[W_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA2]] : !fir.ref> +! CHECK: %[[EXTRACT:.*]] = fir.extract_value %[[LOAD]], [0 : index] : (complex) -> f32 +! CHECK: %[[CVT:.*]] = fir.convert %[[EXTRACT]] : (f32) -> i32 +! CHECK: fir.store %[[CVT]] to %[[X_DECL]]#0 : !fir.ref + !$omp atomic read + x = w + +! CHECK: omp.atomic.read %[[ALLOCA3:.*]] = %[[W_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA3]] : !fir.ref> +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (complex) -> complex +! 
CHECK: fir.store %[[CVT]] to %[[M_DECL]]#0 : !fir.ref> + !$omp atomic read + m = w +end subroutine diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index 63d7171b06156..06dc1184e7cf5 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -268,33 +268,6 @@ computeOpenMPScheduleType(ScheduleKind ClauseKind, bool HasChunks, return Result; } -/// Emit an implicit cast to convert \p XRead to type of variable \p V -static llvm::Value *emitImplicitCast(IRBuilder<> &Builder, llvm::Value *XRead, - llvm::Value *V) { - // TODO: Add this functionality to the `AtomicInfo` interface - llvm::Type *XReadType = XRead->getType(); - llvm::Type *VType = V->getType(); - if (llvm::AllocaInst *vAlloca = dyn_cast(V)) - VType = vAlloca->getAllocatedType(); - - if (XReadType->isStructTy() && VType->isStructTy()) - // No need to extract or convert. A direct - // `store` will suffice. - return XRead; - - if (XReadType->isStructTy()) - XRead = Builder.CreateExtractValue(XRead, /*Idxs=*/0); - if (VType->isIntegerTy() && XReadType->isFloatingPointTy()) - XRead = Builder.CreateFPToSI(XRead, VType); - else if (VType->isFloatingPointTy() && XReadType->isIntegerTy()) - XRead = Builder.CreateSIToFP(XRead, VType); - else if (VType->isIntegerTy() && XReadType->isIntegerTy()) - XRead = Builder.CreateIntCast(XRead, VType, true); - else if (VType->isFloatingPointTy() && XReadType->isFloatingPointTy()) - XRead = Builder.CreateFPCast(XRead, VType); - return XRead; -} - /// Make \p Source branch to \p Target. 
/// /// Handles two situations: @@ -8685,8 +8658,6 @@ OpenMPIRBuilder::createAtomicRead(const LocationDescription &Loc, } } checkAndEmitFlushAfterAtomic(Loc, AO, AtomicKind::Read); - if (XRead->getType() != V.Var->getType()) - XRead = emitImplicitCast(Builder, XRead, V.Var); Builder.CreateStore(XRead, V.Var, V.IsVolatile); return Builder.saveIP(); } @@ -8983,8 +8954,6 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createAtomicCapture( return AtomicResult.takeError(); Value *CapturedVal = (IsPostfixUpdate ? AtomicResult->first : AtomicResult->second); - if (CapturedVal->getType() != V.Var->getType()) - CapturedVal = emitImplicitCast(Builder, CapturedVal, V.Var); Builder.CreateStore(CapturedVal, V.Var, V.IsVolatile); checkAndEmitFlushAfterAtomic(Loc, AO, AtomicKind::Capture); diff --git a/mlir/test/Target/LLVMIR/openmp-llvm.mlir b/mlir/test/Target/LLVMIR/openmp-llvm.mlir index 02a08eec74016..32f0ba5b105ff 100644 --- a/mlir/test/Target/LLVMIR/openmp-llvm.mlir +++ b/mlir/test/Target/LLVMIR/openmp-llvm.mlir @@ -1396,42 +1396,35 @@ llvm.func @omp_atomic_read_implicit_cast () { //CHECK: call void @__atomic_load(i64 8, ptr %[[X_ELEMENT]], ptr %[[ATOMIC_LOAD_TEMP]], i32 0) //CHECK: %[[LOAD:.*]] = load { float, float }, ptr %[[ATOMIC_LOAD_TEMP]], align 8 -//CHECK: %[[EXT:.*]] = extractvalue { float, float } %[[LOAD]], 0 -//CHECK: store float %[[EXT]], ptr %[[Y]], align 4 +//CHECK: store { float, float } %[[LOAD]], ptr %[[Y]], align 4 omp.atomic.read %3 = %17 : !llvm.ptr, !llvm.ptr, !llvm.struct<(f32, f32)> //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[Z]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: %[[LOAD:.*]] = fpext float %[[CAST]] to double -//CHECK: store double %[[LOAD]], ptr %[[Y]], align 8 +//CHECK: store float %[[CAST]], ptr %[[Y]], align 4 omp.atomic.read %3 = %1 : !llvm.ptr, !llvm.ptr, f32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[W]] monotonic, align 4 -//CHECK: 
%[[LOAD:.*]] = sitofp i32 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: store double %[[LOAD]], ptr %[[Y]], align 8 +//CHECK: store i32 %[[ATOMIC_LOAD_TEMP]], ptr %[[Y]], align 4 omp.atomic.read %3 = %7 : !llvm.ptr, !llvm.ptr, i32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i64, ptr %[[Y]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i64 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: %[[LOAD:.*]] = fptrunc double %[[CAST]] to float -//CHECK: store float %[[LOAD]], ptr %[[Z]], align 4 +//CHECK: store double %[[CAST]], ptr %[[Z]], align 8 omp.atomic.read %1 = %3 : !llvm.ptr, !llvm.ptr, f64 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[W]] monotonic, align 4 -//CHECK: %[[LOAD:.*]] = sitofp i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: store float %[[LOAD]], ptr %[[Z]], align 4 +//CHECK: store i32 %[[ATOMIC_LOAD_TEMP]], ptr %[[Z]], align 4 omp.atomic.read %1 = %7 : !llvm.ptr, !llvm.ptr, i32 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i64, ptr %[[Y]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i64 %[[ATOMIC_LOAD_TEMP]] to double -//CHECK: %[[LOAD:.*]] = fptosi double %[[CAST]] to i32 -//CHECK: store i32 %[[LOAD]], ptr %[[W]], align 4 +//CHECK: store double %[[CAST]], ptr %[[W]], align 8 omp.atomic.read %7 = %3 : !llvm.ptr, !llvm.ptr, f64 //CHECK: %[[ATOMIC_LOAD_TEMP:.*]] = load atomic i32, ptr %[[Z]] monotonic, align 4 //CHECK: %[[CAST:.*]] = bitcast i32 %[[ATOMIC_LOAD_TEMP]] to float -//CHECK: %[[LOAD:.*]] = fptosi float %[[CAST]] to i32 -//CHECK: store i32 %[[LOAD]], ptr %[[W]], align 4 +//CHECK: store float %[[CAST]], ptr %[[W]], align 4 omp.atomic.read %7 = %1 : !llvm.ptr, !llvm.ptr, f32 llvm.return } From flang-commits at lists.llvm.org Thu May 1 08:20:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 08:20:50 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [flang][llvm][OpenMP] Add implicit casts to omp.atomic (PR #131603) In-Reply-To: Message-ID: 
<68139152.170a0220.346c58.da65@mx.google.com>

https://github.com/NimishMishra closed https://github.com/llvm/llvm-project/pull/131603

From flang-commits at lists.llvm.org Thu May 1 08:21:16 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 01 May 2025 08:21:16 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098)
In-Reply-To:
Message-ID: <6813916c.650a0220.2e1a20.d073@mx.google.com>

================
@@ -0,0 +1,39 @@
+! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two
+! flags behave similarly to their GCC counterparts:
+!
+! -fprofile-generate Generates the profile file ./default.profraw
+! -fprofile-use=/file Uses the profile file /file
+
+! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto
+! RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s
+! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section
+! PROFILE-GEN: @__profd_{{_?}}main =
+
+
----------------
fanju110 wrote:

It's okay, I referenced clang for this file, and I adjusted the code layout to keep it looking nice

https://github.com/llvm/llvm-project/pull/136098

From flang-commits at lists.llvm.org Thu May 1 08:22:10 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 01 May 2025 08:22:10 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098)
In-Reply-To:
Message-ID: <681391a2.170a0220.2fc68c.d5d7@mx.google.com>

================
@@ -8,8 +8,14 @@
 #include "llvm/Frontend/Driver/CodeGenOptions.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
 #include "llvm/TargetParser/Triple.h"
+namespace llvm {
+extern llvm::cl::opt DebugInfoCorrelate;
+extern llvm::cl::opt
+ ProfileCorrelate;
+} // namespace llvm
 namespace llvm::driver {
----------------
fanju110 wrote:

Thanks for your suggestion, I have corrected it

https://github.com/llvm/llvm-project/pull/136098

From flang-commits at lists.llvm.org Thu May 1 08:14:47 2025
From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits)
Date: Thu, 01 May 2025 08:14:47 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules.
(PR #136012) In-Reply-To: Message-ID: <68138fe7.630a0220.7178d.e0c4@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/136012 >From 0f4591ee621e2e9d7acb0e6066b556cb7e243162 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Wed, 16 Apr 2025 12:01:24 -0700 Subject: [PATCH 1/9] initial commit --- flang/include/flang/Lower/AbstractConverter.h | 4 + flang/include/flang/Lower/OpenACC.h | 10 +- flang/include/flang/Semantics/symbol.h | 23 +- flang/lib/Lower/Bridge.cpp | 7 +- flang/lib/Lower/CallInterface.cpp | 10 + flang/lib/Lower/OpenACC.cpp | 197 ++++++++++++++---- flang/lib/Semantics/mod-file.cpp | 1 + flang/lib/Semantics/resolve-directives.cpp | 83 ++++---- 8 files changed, 233 insertions(+), 102 deletions(-) diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 1d1323642bf9c..59419e829718f 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -14,6 +14,7 @@ #define FORTRAN_LOWER_ABSTRACTCONVERTER_H #include "flang/Lower/LoweringOptions.h" +#include "flang/Lower/OpenACC.h" #include "flang/Lower/PFTDefs.h" #include "flang/Optimizer/Builder/BoxValue.h" #include "flang/Optimizer/Dialect/FIRAttr.h" @@ -357,6 +358,9 @@ class AbstractConverter { /// functions in order to be in sync). virtual mlir::SymbolTable *getMLIRSymbolTable() = 0; + virtual Fortran::lower::AccRoutineInfoMappingList & + getAccDelayedRoutines() = 0; + private: /// Options controlling lowering behavior. 
const Fortran::lower::LoweringOptions &loweringOptions; diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 0d7038a7fd856..7832e8b69ea23 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -22,6 +22,9 @@ class StringRef; } // namespace llvm namespace mlir { +namespace func { +class FuncOp; +} class Location; class Type; class ModuleOp; @@ -42,6 +45,7 @@ struct OpenACCRoutineConstruct; } // namespace parser namespace semantics { +class OpenACCRoutineInfo; class SemanticsContext; class Symbol; } // namespace semantics @@ -79,8 +83,10 @@ void genOpenACCDeclarativeConstruct(AbstractConverter &, void genOpenACCRoutineConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, mlir::ModuleOp, - const parser::OpenACCRoutineConstruct &, - AccRoutineInfoMappingList &); + const parser::OpenACCRoutineConstruct &); +void genOpenACCRoutineConstruct( + AbstractConverter &, mlir::ModuleOp, mlir::func::FuncOp, + const std::vector &); void finalizeOpenACCRoutineAttachment(mlir::ModuleOp, AccRoutineInfoMappingList &); diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 715811885c219..1b6b247c9f5bc 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -127,6 +127,8 @@ class WithBindName { // Device type specific OpenACC routine information class OpenACCRoutineDeviceTypeInfo { public: + OpenACCRoutineDeviceTypeInfo(Fortran::common::OpenACCDeviceType dType) + : deviceType_{dType} {} bool isSeq() const { return isSeq_; } void set_isSeq(bool value = true) { isSeq_ = value; } bool isVector() const { return isVector_; } @@ -141,9 +143,7 @@ class OpenACCRoutineDeviceTypeInfo { return bindName_ ? 
&*bindName_ : nullptr; } void set_bindName(std::string &&name) { bindName_ = std::move(name); } - void set_dType(Fortran::common::OpenACCDeviceType dType) { - deviceType_ = dType; - } + Fortran::common::OpenACCDeviceType dType() const { return deviceType_; } private: @@ -162,13 +162,24 @@ class OpenACCRoutineDeviceTypeInfo { // in as objects in the OpenACCRoutineDeviceTypeInfo list. class OpenACCRoutineInfo : public OpenACCRoutineDeviceTypeInfo { public: + OpenACCRoutineInfo() + : OpenACCRoutineDeviceTypeInfo(Fortran::common::OpenACCDeviceType::None) { + } bool isNohost() const { return isNohost_; } void set_isNohost(bool value = true) { isNohost_ = value; } - std::list &deviceTypeInfos() { + const std::list &deviceTypeInfos() const { return deviceTypeInfos_; } - void add_deviceTypeInfo(OpenACCRoutineDeviceTypeInfo &info) { - deviceTypeInfos_.push_back(info); + + OpenACCRoutineDeviceTypeInfo &add_deviceTypeInfo( + Fortran::common::OpenACCDeviceType type) { + return add_deviceTypeInfo(OpenACCRoutineDeviceTypeInfo(type)); + } + + OpenACCRoutineDeviceTypeInfo &add_deviceTypeInfo( + OpenACCRoutineDeviceTypeInfo &&info) { + deviceTypeInfos_.push_back(std::move(info)); + return deviceTypeInfos_.back(); } private: diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index b4d1197822a43..9285d587585f8 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -443,7 +443,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); Fortran::lower::genOpenACCRoutineConstruct( *this, bridge.getSemanticsContext(), bridge.getModule(), - d.routine, accRoutineInfos); + d.routine); builder = nullptr; }, }, @@ -4287,6 +4287,11 @@ class FirConverter : public Fortran::lower::AbstractConverter { return Fortran::lower::createMutableBox(loc, *this, expr, localSymbols); } + Fortran::lower::AccRoutineInfoMappingList & + getAccDelayedRoutines() override final { + return accRoutineInfos; + } 
+ // Create the [newRank] array with the lower bounds to be passed to the // runtime as a descriptor. mlir::Value createLboundArray(llvm::ArrayRef lbounds, diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 226ba1e52c968..867248f16237e 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -1689,6 +1689,16 @@ class SignatureBuilder "SignatureBuilder should only be used once"); declare(); interfaceDetermined = true; + if (procDesignator && procDesignator->GetInterfaceSymbol() && + procDesignator->GetInterfaceSymbol() + ->has()) { + auto info = procDesignator->GetInterfaceSymbol() + ->get(); + if (!info.openACCRoutineInfos().empty()) { + genOpenACCRoutineConstruct(converter, converter.getModuleOp(), + getFuncOp(), info.openACCRoutineInfos()); + } + } return getFuncOp(); } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 3dd35ed9ae481..37b660408af6c 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -38,6 +38,7 @@ #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" +#include #define DEBUG_TYPE "flang-lower-openacc" @@ -4139,11 +4140,152 @@ static void attachRoutineInfo(mlir::func::FuncOp func, mlir::acc::RoutineInfoAttr::get(func.getContext(), routines)); } +void createOpenACCRoutineConstruct( + Fortran::lower::AbstractConverter &converter, mlir::Location loc, + mlir::ModuleOp mod, mlir::func::FuncOp funcOp, std::string funcName, + bool hasNohost, llvm::SmallVector &bindNames, + llvm::SmallVector &bindNameDeviceTypes, + llvm::SmallVector &gangDeviceTypes, + llvm::SmallVector &gangDimValues, + llvm::SmallVector &gangDimDeviceTypes, + llvm::SmallVector &seqDeviceTypes, + llvm::SmallVector &workerDeviceTypes, + llvm::SmallVector &vectorDeviceTypes) { + + std::stringstream routineOpName; + routineOpName << accRoutinePrefix.str() << routineCounter++; + + for (auto routineOp : 
mod.getOps()) { + if (routineOp.getFuncName().str().compare(funcName) == 0) { + // If the routine is already specified with the same clauses, just skip + // the operation creation. + if (compareDeviceTypeInfo(routineOp, bindNames, bindNameDeviceTypes, + gangDeviceTypes, gangDimValues, + gangDimDeviceTypes, seqDeviceTypes, + workerDeviceTypes, vectorDeviceTypes) && + routineOp.getNohost() == hasNohost) + return; + mlir::emitError(loc, "Routine already specified with different clauses"); + } + } + std::string routineOpStr = routineOpName.str(); + mlir::OpBuilder modBuilder(mod.getBodyRegion()); + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + modBuilder.create( + loc, routineOpStr, funcName, + bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames), + bindNameDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(bindNameDeviceTypes), + workerDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(workerDeviceTypes), + vectorDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(vectorDeviceTypes), + seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes), + hasNohost, /*implicit=*/false, + gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes), + gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues), + gangDimDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(gangDimDeviceTypes)); + + if (funcOp) + attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr)); + else + // FuncOp is not lowered yet. Keep the information so the routine info + // can be attached later to the funcOp. 
+ converter.getAccDelayedRoutines().push_back( + std::make_pair(funcName, builder.getSymbolRefAttr(routineOpStr))); +} + +static void interpretRoutineDeviceInfo( + fir::FirOpBuilder &builder, + const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo, + llvm::SmallVector &seqDeviceTypes, + llvm::SmallVector &vectorDeviceTypes, + llvm::SmallVector &workerDeviceTypes, + llvm::SmallVector &bindNameDeviceTypes, + llvm::SmallVector &bindNames, + llvm::SmallVector &gangDeviceTypes, + llvm::SmallVector &gangDimValues, + llvm::SmallVector &gangDimDeviceTypes) { + mlir::MLIRContext *context{builder.getContext()}; + if (dinfo.isSeq()) { + seqDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } + if (dinfo.isVector()) { + vectorDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } + if (dinfo.isWorker()) { + workerDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } + if (dinfo.isGang()) { + unsigned gangDim = dinfo.gangDim(); + auto deviceType = + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())); + if (!gangDim) { + gangDeviceTypes.push_back(deviceType); + } else { + gangDimValues.push_back( + builder.getIntegerAttr(builder.getI64Type(), gangDim)); + gangDimDeviceTypes.push_back(deviceType); + } + } + if (const std::string *bindName{dinfo.bindName()}) { + bindNames.push_back(builder.getStringAttr(*bindName)); + bindNameDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } +} + +void Fortran::lower::genOpenACCRoutineConstruct( + Fortran::lower::AbstractConverter &converter, mlir::ModuleOp mod, + mlir::func::FuncOp funcOp, + const std::vector &routineInfos) { + CHECK(funcOp && "Expected a valid function operation"); + fir::FirOpBuilder &builder{converter.getFirOpBuilder()}; + mlir::Location loc{funcOp.getLoc()}; + std::string funcName{funcOp.getName()}; + + // 
Collect the routine clauses + bool hasNohost{false}; + + llvm::SmallVector seqDeviceTypes, vectorDeviceTypes, + workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimDeviceTypes, gangDimValues; + + for (const Fortran::semantics::OpenACCRoutineInfo &info : routineInfos) { + // Device Independent Attributes + if (info.isNohost()) { + hasNohost = true; + } + // Note: Device Independent Attributes are set to the + // none device type in `info`. + interpretRoutineDeviceInfo(builder, info, seqDeviceTypes, vectorDeviceTypes, + workerDeviceTypes, bindNameDeviceTypes, + bindNames, gangDeviceTypes, gangDimValues, + gangDimDeviceTypes); + + // Device Dependent Attributes + for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo : + info.deviceTypeInfos()) { + interpretRoutineDeviceInfo( + builder, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, + bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues, + gangDimDeviceTypes); + } + } + createOpenACCRoutineConstruct( + converter, loc, mod, funcOp, funcName, hasNohost, bindNames, + bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes, + seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); +} + void Fortran::lower::genOpenACCRoutineConstruct( Fortran::lower::AbstractConverter &converter, Fortran::semantics::SemanticsContext &semanticsContext, mlir::ModuleOp mod, - const Fortran::parser::OpenACCRoutineConstruct &routineConstruct, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { + const Fortran::parser::OpenACCRoutineConstruct &routineConstruct) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); mlir::Location loc = converter.genLocation(routineConstruct.source); std::optional name = @@ -4174,6 +4316,7 @@ void Fortran::lower::genOpenACCRoutineConstruct( funcName = funcOp.getName(); } } + // TODO: Refactor this to use the OpenACCRoutineInfo bool hasNohost = false; llvm::SmallVector seqDeviceTypes, vectorDeviceTypes, @@ -4226,6 +4369,8 
@@ void Fortran::lower::genOpenACCRoutineConstruct( std::get_if(&clause.u)) { if (const auto *name = std::get_if(&bindClause->v.u)) { + // FIXME: This case mangles the name, the one below does not. + // which is correct? mlir::Attribute bindNameAttr = builder.getStringAttr(converter.mangleName(*name->symbol)); for (auto crtDeviceTypeAttr : crtDeviceTypes) { @@ -4255,47 +4400,10 @@ void Fortran::lower::genOpenACCRoutineConstruct( } } - mlir::OpBuilder modBuilder(mod.getBodyRegion()); - std::stringstream routineOpName; - routineOpName << accRoutinePrefix.str() << routineCounter++; - - for (auto routineOp : mod.getOps()) { - if (routineOp.getFuncName().str().compare(funcName) == 0) { - // If the routine is already specified with the same clauses, just skip - // the operation creation. - if (compareDeviceTypeInfo(routineOp, bindNames, bindNameDeviceTypes, - gangDeviceTypes, gangDimValues, - gangDimDeviceTypes, seqDeviceTypes, - workerDeviceTypes, vectorDeviceTypes) && - routineOp.getNohost() == hasNohost) - return; - mlir::emitError(loc, "Routine already specified with different clauses"); - } - } - - modBuilder.create( - loc, routineOpName.str(), funcName, - bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames), - bindNameDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(bindNameDeviceTypes), - workerDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(workerDeviceTypes), - vectorDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(vectorDeviceTypes), - seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes), - hasNohost, /*implicit=*/false, - gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes), - gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues), - gangDimDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(gangDimDeviceTypes)); - - if (funcOp) - attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpName.str())); - else - // FuncOp is not lowered yet. 
Keep the information so the routine info - // can be attached later to the funcOp. - accRoutineInfos.push_back(std::make_pair( - funcName, builder.getSymbolRefAttr(routineOpName.str()))); + createOpenACCRoutineConstruct( + converter, loc, mod, funcOp, funcName, hasNohost, bindNames, + bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes, + seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); } void Fortran::lower::finalizeOpenACCRoutineAttachment( @@ -4443,8 +4551,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct( fir::FirOpBuilder &builder = converter.getFirOpBuilder(); mlir::ModuleOp mod = builder.getModule(); Fortran::lower::genOpenACCRoutineConstruct( - converter, semanticsContext, mod, routineConstruct, - accRoutineInfos); + converter, semanticsContext, mod, routineConstruct); }, }, accDeclConstruct.u); diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index ee356e56e4458..befd204a671fc 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1387,6 +1387,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, parser::Options options; options.isModuleFile = true; options.features.Enable(common::LanguageFeature::BackslashEscapes); + options.features.Enable(common::LanguageFeature::OpenACC); options.features.Enable(common::LanguageFeature::OpenMP); options.features.Enable(common::LanguageFeature::CUDA); if (!isIntrinsic.value_or(false) && !notAModule) { diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index d75b4ea13d35f..93c334a3ca3cb 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1034,61 +1034,53 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Symbol &symbol, const parser::OpenACCRoutineConstruct &x) { if (symbol.has()) { Fortran::semantics::OpenACCRoutineInfo info; + std::vector currentDevices; + currentDevices.push_back(&info); 
const auto &clauses = std::get(x.t); for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isSeq(); - } else { - info.deviceTypeInfos().back().set_isSeq(); + if (const auto *dTypeClause = + std::get_if(&clause.u)) { + currentDevices.clear(); + for (const auto &deviceTypeExpr : dTypeClause->v.v) { + currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v)); } + } else if (std::get_if(&clause.u)) { + info.set_isNohost(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) + device->set_isSeq(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) + device->set_isVector(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) + device->set_isWorker(); } else if (const auto *gangClause = std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isGang(); - } else { - info.deviceTypeInfos().back().set_isGang(); - } + for (auto &device : currentDevices) + device->set_isGang(); if (gangClause->v) { const Fortran::parser::AccGangArgList &x = *gangClause->v; + int numArgs{0}; for (const Fortran::parser::AccGangArg &gangArg : x.v) { + CHECK(numArgs <= 1 && "expecting 0 or 1 gang dim args"); if (const auto *dim = std::get_if(&gangArg.u)) { if (const auto v{EvaluateInt64(context_, dim->v)}) { - if (info.deviceTypeInfos().empty()) { - info.set_gangDim(*v); - } else { - info.deviceTypeInfos().back().set_gangDim(*v); - } + for (auto &device : currentDevices) + device->set_gangDim(*v); } } + numArgs++; } } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isVector(); - } else { - info.deviceTypeInfos().back().set_isVector(); - } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isWorker(); - } else { - info.deviceTypeInfos().back().set_isWorker(); - } - } else if (std::get_if(&clause.u)) { - info.set_isNohost(); } 
else if (const auto *bindClause = std::get_if(&clause.u)) { + std::string bindName = ""; if (const auto *name = std::get_if(&bindClause->v.u)) { if (Symbol *sym = ResolveFctName(*name)) { - if (info.deviceTypeInfos().empty()) { - info.set_bindName(sym->name().ToString()); - } else { - info.deviceTypeInfos().back().set_bindName( - sym->name().ToString()); - } + bindName = sym->name().ToString(); } else { context_.Say((*name).source, "No function or subroutine declared for '%s'"_err_en_US, @@ -1101,21 +1093,16 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Fortran::parser::Unwrap( *charExpr); std::string str{std::get(charConst->t)}; - std::stringstream bindName; - bindName << "\"" << str << "\""; - if (info.deviceTypeInfos().empty()) { - info.set_bindName(bindName.str()); - } else { - info.deviceTypeInfos().back().set_bindName(bindName.str()); + std::stringstream bindNameStream; + bindNameStream << "\"" << str << "\""; + bindName = bindNameStream.str(); + } + if (!bindName.empty()) { + // Fixme: do we need to ensure there there is only one device? 
+ for (auto &device : currentDevices) { + device->set_bindName(std::string(bindName)); } } - } else if (const auto *dType = - std::get_if( - &clause.u)) { - const parser::AccDeviceTypeExprList &deviceTypeExprList = dType->v; - OpenACCRoutineDeviceTypeInfo dtypeInfo; - dtypeInfo.set_dType(deviceTypeExprList.v.front().v); - info.add_deviceTypeInfo(dtypeInfo); } } symbol.get().add_openACCRoutineInfo(info); >From 1b6da293788edc56eea566f5c15126de6955169c Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Tue, 22 Apr 2025 16:06:33 -0700 Subject: [PATCH 2/9] fix includes --- flang/lib/Lower/OpenACC.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 37b660408af6c..a3ebd9b931dc6 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -32,13 +32,13 @@ #include "flang/Semantics/expression.h" #include "flang/Semantics/scope.h" #include "flang/Semantics/tools.h" -#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" -#include "mlir/Support/LLVM.h" #include "llvm/ADT/STLExtras.h" #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" -#include +#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" +#include "mlir/IR/MLIRContext.h" +#include "mlir/Support/LLVM.h" #define DEBUG_TYPE "flang-lower-openacc" >From 7b65ac4c477e5e46bf369a3a9f94f69cf496ef6b Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Wed, 23 Apr 2025 13:50:19 -0700 Subject: [PATCH 3/9] adding test --- flang/include/flang/Semantics/symbol.h | 7 +++++- flang/lib/Semantics/resolve-directives.cpp | 6 ++++- .../Lower/OpenACC/acc-module-definition.f90 | 17 ++++++++++++++ .../Lower/OpenACC/acc-routine-use-module.f90 | 23 +++++++++++++++++++ 4 files changed, 51 insertions(+), 2 deletions(-) create mode 100644 flang/test/Lower/OpenACC/acc-module-definition.f90 create mode 100644 flang/test/Lower/OpenACC/acc-routine-use-module.f90 diff --git 
a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 1b6b247c9f5bc..fe6c73997733a 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -142,7 +142,11 @@ class OpenACCRoutineDeviceTypeInfo { const std::string *bindName() const { return bindName_ ? &*bindName_ : nullptr; } - void set_bindName(std::string &&name) { bindName_ = std::move(name); } + bool bindNameIsInternal() const {return bindNameIsInternal_;} + void set_bindName(std::string &&name, bool isInternal=false) { + bindName_ = std::move(name); + bindNameIsInternal_ = isInternal; + } Fortran::common::OpenACCDeviceType dType() const { return deviceType_; } @@ -153,6 +157,7 @@ class OpenACCRoutineDeviceTypeInfo { bool isGang_{false}; unsigned gangDim_{0}; std::optional bindName_; + bool bindNameIsInternal_{false}; Fortran::common::OpenACCDeviceType deviceType_{ Fortran::common::OpenACCDeviceType::None}; }; diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 93c334a3ca3cb..8fb3559c34426 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1077,10 +1077,12 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( } else if (const auto *bindClause = std::get_if(&clause.u)) { std::string bindName = ""; + bool isInternal = false; if (const auto *name = std::get_if(&bindClause->v.u)) { if (Symbol *sym = ResolveFctName(*name)) { bindName = sym->name().ToString(); + isInternal = true; } else { context_.Say((*name).source, "No function or subroutine declared for '%s'"_err_en_US, @@ -1100,12 +1102,14 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( if (!bindName.empty()) { // Fixme: do we need to ensure there there is only one device? 
for (auto &device : currentDevices) { - device->set_bindName(std::string(bindName)); + device->set_bindName(std::string(bindName), isInternal); } } } } symbol.get().add_openACCRoutineInfo(info); + } else { + llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() << "\n"; } } diff --git a/flang/test/Lower/OpenACC/acc-module-definition.f90 b/flang/test/Lower/OpenACC/acc-module-definition.f90 new file mode 100644 index 0000000000000..36e41fc631c77 --- /dev/null +++ b/flang/test/Lower/OpenACC/acc-module-definition.f90 @@ -0,0 +1,17 @@ +! RUN: rm -fr %t && mkdir -p %t && cd %t +! RUN: bbc -fopenacc -emit-fir %s +! RUN: cat mod1.mod | FileCheck %s + +!CHECK-LABEL: module mod1 +module mod1 + contains + !CHECK subroutine callee(aa) + subroutine callee(aa) + !CHECK: !$acc routine seq + !$acc routine seq + integer :: aa + aa = 1 + end subroutine + !CHECK: end + !CHECK: end +end module \ No newline at end of file diff --git a/flang/test/Lower/OpenACC/acc-routine-use-module.f90 b/flang/test/Lower/OpenACC/acc-routine-use-module.f90 new file mode 100644 index 0000000000000..7fc96b0ef5684 --- /dev/null +++ b/flang/test/Lower/OpenACC/acc-routine-use-module.f90 @@ -0,0 +1,23 @@ +! RUN: rm -fr %t && mkdir -p %t && cd %t +! RUN: bbc -emit-fir %S/acc-module-definition.f90 +! RUN: bbc -emit-fir %s -o - | FileCheck %s + +! This test module is based off of flang/test/Lower/use_module.f90 +! The first runs ensures the module file is generated. 
+ +module use_mod1 + use mod1 + contains + !CHECK: func.func @_QMuse_mod1Pcaller + !CHECK-SAME { + subroutine caller(aa) + integer :: aa + !$acc serial + !CHECK: fir.call @_QMmod1Pcallee + call callee(aa) + !$acc end serial + end subroutine + !CHECK: } + !CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq + !CHECK: func.func private @_QMmod1Pcallee(!fir.ref) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>} +end module \ No newline at end of file >From 70f8d469346d22597c7b3ff38b2f4a84a82b6d85 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 25 Apr 2025 14:39:04 -0700 Subject: [PATCH 4/9] debugging failure --- flang/include/flang/Lower/OpenACC.h | 7 ++ flang/include/flang/Semantics/symbol.h | 21 +++-- flang/lib/Lower/Bridge.cpp | 21 +++-- flang/lib/Lower/CallInterface.cpp | 21 ++--- flang/lib/Lower/OpenACC.cpp | 87 +++++++++++-------- flang/lib/Semantics/mod-file.cpp | 11 ++- flang/lib/Semantics/resolve-directives.cpp | 16 ++-- flang/lib/Semantics/symbol.cpp | 46 ++++++++++ .../test/Lower/OpenACC/acc-routine-named.f90 | 10 ++- .../Lower/OpenACC/acc-routine-use-module.f90 | 6 +- flang/test/Lower/OpenACC/acc-routine.f90 | 63 ++++++++------ 11 files changed, 199 insertions(+), 110 deletions(-) diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 7832e8b69ea23..dc014a71526c3 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -37,11 +37,16 @@ class FirOpBuilder; } namespace Fortran { +namespace evaluate { +class ProcedureDesignator; +} // namespace evaluate + namespace parser { struct AccClauseList; struct OpenACCConstruct; struct OpenACCDeclarativeConstruct; struct OpenACCRoutineConstruct; +struct ProcedureDesignator; } // namespace parser namespace semantics { @@ -71,6 +76,8 @@ static constexpr llvm::StringRef declarePostDeallocSuffix = static constexpr llvm::StringRef privatizationRecipePrefix = "privatization"; +bool 
needsOpenACCRoutineConstruct(const Fortran::evaluate::ProcedureDesignator *); + mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, pft::Evaluation &, diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index fe6c73997733a..8c60a196bdfc1 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -22,6 +22,7 @@ #include #include #include +#include #include namespace llvm { @@ -139,25 +140,26 @@ class OpenACCRoutineDeviceTypeInfo { void set_isGang(bool value = true) { isGang_ = value; } unsigned gangDim() const { return gangDim_; } void set_gangDim(unsigned value) { gangDim_ = value; } - const std::string *bindName() const { - return bindName_ ? &*bindName_ : nullptr; + const std::variant *bindName() const { + return bindName_.has_value() ? &*bindName_ : nullptr; } - bool bindNameIsInternal() const {return bindNameIsInternal_;} - void set_bindName(std::string &&name, bool isInternal=false) { - bindName_ = std::move(name); - bindNameIsInternal_ = isInternal; + const std::optional> &bindNameOpt() const { + return bindName_; } + void set_bindName(std::string &&name) { bindName_.emplace(std::move(name)); } + void set_bindName(SymbolRef symbol) { bindName_.emplace(symbol); } Fortran::common::OpenACCDeviceType dType() const { return deviceType_; } + friend llvm::raw_ostream &operator<<( + llvm::raw_ostream &, const OpenACCRoutineDeviceTypeInfo &); private: bool isSeq_{false}; bool isVector_{false}; bool isWorker_{false}; bool isGang_{false}; unsigned gangDim_{0}; - std::optional bindName_; - bool bindNameIsInternal_{false}; + std::optional> bindName_; Fortran::common::OpenACCDeviceType deviceType_{ Fortran::common::OpenACCDeviceType::None}; }; @@ -187,6 +189,9 @@ class OpenACCRoutineInfo : public OpenACCRoutineDeviceTypeInfo { return deviceTypeInfos_.back(); } + friend llvm::raw_ostream &operator<<( + llvm::raw_ostream &, const OpenACCRoutineInfo &); + 
private: std::list deviceTypeInfos_; bool isNohost_{false}; diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 9285d587585f8..abe07bcfdfcda 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -438,14 +438,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { [&](Fortran::lower::pft::ModuleLikeUnit &m) { lowerMod(m); }, [&](Fortran::lower::pft::BlockDataUnit &b) {}, [&](Fortran::lower::pft::CompilerDirectiveUnit &d) {}, - [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) { - builder = new fir::FirOpBuilder( - bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); - Fortran::lower::genOpenACCRoutineConstruct( - *this, bridge.getSemanticsContext(), bridge.getModule(), - d.routine); - builder = nullptr; - }, + [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {}, }, u); } @@ -472,6 +465,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { setCurrentPosition(funit.getStartingSourceLoc()); + builder = new fir::FirOpBuilder( + bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { funit.setActiveEntry(entryIndex); @@ -498,6 +493,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList) if (auto *f = std::get_if(&unit)) declareFunction(*f); + builder = nullptr; } /// Get the scope that is defining or using \p sym. 
The returned scope is not @@ -1035,7 +1031,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { return bridge.getSemanticsContext().FindScope(currentPosition); } - fir::FirOpBuilder &getFirOpBuilder() override final { return *builder; } + fir::FirOpBuilder &getFirOpBuilder() override final { + CHECK(builder && "builder is not set before calling getFirOpBuilder"); + return *builder; + } mlir::ModuleOp getModuleOp() override final { return bridge.getModule(); } @@ -5617,6 +5616,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]"; if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym; llvm::dbgs() << "\n"); + // I don't think setting the builder is necessary here, because callee + // always looks up the FuncOp from the module. If there was a function that + // was not declared yet. This call to callee will cause an assertion + //failure. Fortran::lower::CalleeInterface callee(funit, *this); mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments(); builder = diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 867248f16237e..b938354e6bcb3 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -10,6 +10,7 @@ #include "flang/Evaluate/fold.h" #include "flang/Lower/Bridge.h" #include "flang/Lower/Mangler.h" +#include "flang/Lower/OpenACC.h" #include "flang/Lower/PFTBuilder.h" #include "flang/Lower/StatementContext.h" #include "flang/Lower/Support/Utils.h" @@ -20,6 +21,7 @@ #include "flang/Optimizer/Dialect/FIROpsSupport.h" #include "flang/Optimizer/Support/InternalNames.h" #include "flang/Optimizer/Support/Utils.h" +#include "flang/Parser/parse-tree.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" #include "flang/Support/Fortran.h" @@ -715,6 +717,14 @@ void Fortran::lower::CallInterface::declare() { func.setArgAttrs(placeHolder.index(), placeHolder.value().attributes); 
setCUDAAttributes(func, side().getProcedureSymbol(), characteristic); + + if (const Fortran::semantics::Symbol *sym = side().getProcedureSymbol()) { + if (const auto &info{sym->GetUltimate().detailsIf()}) { + if (!info->openACCRoutineInfos().empty()) { + genOpenACCRoutineConstruct(converter, module, func, info->openACCRoutineInfos()); + } + } + } } } } @@ -1688,17 +1698,8 @@ class SignatureBuilder fir::emitFatalError(converter.getCurrentLocation(), "SignatureBuilder should only be used once"); declare(); + interfaceDetermined = true; - if (procDesignator && procDesignator->GetInterfaceSymbol() && - procDesignator->GetInterfaceSymbol() - ->has()) { - auto info = procDesignator->GetInterfaceSymbol() - ->get(); - if (!info.openACCRoutineInfos().empty()) { - genOpenACCRoutineConstruct(converter, converter.getModuleOp(), - getFuncOp(), info.openACCRoutineInfos()); - } - } return getFuncOp(); } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index a3ebd9b931dc6..eefa8fbf12b1a 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -36,6 +36,7 @@ #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" +#include "llvm/Support/ErrorHandling.h" #include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" #include "mlir/IR/MLIRContext.h" #include "mlir/Support/LLVM.h" @@ -4140,6 +4141,14 @@ static void attachRoutineInfo(mlir::func::FuncOp func, mlir::acc::RoutineInfoAttr::get(func.getContext(), routines)); } +static mlir::ArrayAttr getArrayAttrOrNull(fir::FirOpBuilder &builder, llvm::SmallVector &attributes) { + if (attributes.empty()) { + return nullptr; + } else { + return builder.getArrayAttr(attributes); + } +} + void createOpenACCRoutineConstruct( Fortran::lower::AbstractConverter &converter, mlir::Location loc, mlir::ModuleOp mod, mlir::func::FuncOp funcOp, std::string funcName, @@ -4173,31 +4182,29 @@ void createOpenACCRoutineConstruct( fir::FirOpBuilder &builder = 
converter.getFirOpBuilder(); modBuilder.create( loc, routineOpStr, funcName, - bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames), - bindNameDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(bindNameDeviceTypes), - workerDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(workerDeviceTypes), - vectorDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(vectorDeviceTypes), - seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes), + getArrayAttrOrNull(builder, bindNames), + getArrayAttrOrNull(builder, bindNameDeviceTypes), + getArrayAttrOrNull(builder, workerDeviceTypes), + getArrayAttrOrNull(builder, vectorDeviceTypes), + getArrayAttrOrNull(builder, seqDeviceTypes), hasNohost, /*implicit=*/false, - gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes), - gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues), - gangDimDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(gangDimDeviceTypes)); - - if (funcOp) - attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr)); - else + getArrayAttrOrNull(builder, gangDeviceTypes), + getArrayAttrOrNull(builder, gangDimValues), + getArrayAttrOrNull(builder, gangDimDeviceTypes)); + + auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr); + if (funcOp) { + + attachRoutineInfo(funcOp, symbolRefAttr); + } else { // FuncOp is not lowered yet. Keep the information so the routine info // can be attached later to the funcOp. 
- converter.getAccDelayedRoutines().push_back( - std::make_pair(funcName, builder.getSymbolRefAttr(routineOpStr))); + converter.getAccDelayedRoutines().push_back(std::make_pair(funcName, symbolRefAttr)); + } } static void interpretRoutineDeviceInfo( - fir::FirOpBuilder &builder, + Fortran::lower::AbstractConverter &converter, const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo, llvm::SmallVector &seqDeviceTypes, llvm::SmallVector &vectorDeviceTypes, @@ -4207,23 +4214,24 @@ static void interpretRoutineDeviceInfo( llvm::SmallVector &gangDeviceTypes, llvm::SmallVector &gangDimValues, llvm::SmallVector &gangDimDeviceTypes) { - mlir::MLIRContext *context{builder.getContext()}; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto getDeviceTypeAttr = [&]() -> mlir::Attribute { + auto context = builder.getContext(); + auto value = getDeviceType(dinfo.dType()); + return mlir::acc::DeviceTypeAttr::get(context, value ); + }; if (dinfo.isSeq()) { - seqDeviceTypes.push_back( - mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + seqDeviceTypes.push_back(getDeviceTypeAttr()); } if (dinfo.isVector()) { - vectorDeviceTypes.push_back( - mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + vectorDeviceTypes.push_back(getDeviceTypeAttr()); } if (dinfo.isWorker()) { - workerDeviceTypes.push_back( - mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + workerDeviceTypes.push_back(getDeviceTypeAttr()); } if (dinfo.isGang()) { unsigned gangDim = dinfo.gangDim(); - auto deviceType = - mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())); + auto deviceType = getDeviceTypeAttr(); if (!gangDim) { gangDeviceTypes.push_back(deviceType); } else { @@ -4232,10 +4240,18 @@ static void interpretRoutineDeviceInfo( gangDimDeviceTypes.push_back(deviceType); } } - if (const std::string *bindName{dinfo.bindName()}) { - bindNames.push_back(builder.getStringAttr(*bindName)); - 
bindNameDeviceTypes.push_back( - mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + if (dinfo.bindNameOpt().has_value()) { + const auto &bindName = dinfo.bindNameOpt().value(); + mlir::Attribute bindNameAttr; + if (const auto &bindStr{std::get_if(&bindName)}) { + bindNameAttr = builder.getStringAttr(*bindStr); + } else if (const auto &bindSym{std::get_if(&bindName)}) { + bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym)); + } else { + llvm_unreachable("Unsupported bind name type"); + } + bindNames.push_back(bindNameAttr); + bindNameDeviceTypes.push_back(getDeviceTypeAttr()); } } @@ -4244,7 +4260,6 @@ void Fortran::lower::genOpenACCRoutineConstruct( mlir::func::FuncOp funcOp, const std::vector &routineInfos) { CHECK(funcOp && "Expected a valid function operation"); - fir::FirOpBuilder &builder{converter.getFirOpBuilder()}; mlir::Location loc{funcOp.getLoc()}; std::string funcName{funcOp.getName()}; @@ -4262,7 +4277,7 @@ void Fortran::lower::genOpenACCRoutineConstruct( } // Note: Device Independent Attributes are set to the // none device type in `info`. 
- interpretRoutineDeviceInfo(builder, info, seqDeviceTypes, vectorDeviceTypes, + interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues, gangDimDeviceTypes); @@ -4271,7 +4286,7 @@ void Fortran::lower::genOpenACCRoutineConstruct( for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo : info.deviceTypeInfos()) { interpretRoutineDeviceInfo( - builder, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, + converter, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues, gangDimDeviceTypes); } @@ -4369,8 +4384,6 @@ void Fortran::lower::genOpenACCRoutineConstruct( std::get_if(&clause.u)) { if (const auto *name = std::get_if(&bindClause->v.u)) { - // FIXME: This case mangles the name, the one below does not. - // which is correct? mlir::Attribute bindNameAttr = builder.getStringAttr(converter.mangleName(*name->symbol)); for (auto crtDeviceTypeAttr : crtDeviceTypes) { diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index befd204a671fc..76dc8db590f22 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -24,6 +24,7 @@ #include #include #include +#include #include namespace Fortran::semantics { @@ -638,8 +639,14 @@ static void PutOpenACCDeviceTypeRoutineInfo( if (info.isWorker()) { os << " worker"; } - if (info.bindName()) { - os << " bind(" << *info.bindName() << ")"; + if (const std::variant *bindName{info.bindName()}) { + os << " bind("; + if (std::holds_alternative(*bindName)) { + os << "\"" << std::get(*bindName) << "\""; + } else { + os << std::get(*bindName)->name(); + } + os << ")"; } } diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 8fb3559c34426..a8f00b546306e 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ 
b/flang/lib/Semantics/resolve-directives.cpp @@ -1076,13 +1076,13 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( } } else if (const auto *bindClause = std::get_if(&clause.u)) { - std::string bindName = ""; - bool isInternal = false; if (const auto *name = std::get_if(&bindClause->v.u)) { if (Symbol *sym = ResolveFctName(*name)) { - bindName = sym->name().ToString(); - isInternal = true; + Symbol &ultimate{sym->GetUltimate()}; + for (auto &device : currentDevices) { + device->set_bindName(SymbolRef(ultimate)); + } } else { context_.Say((*name).source, "No function or subroutine declared for '%s'"_err_en_US, @@ -1095,14 +1095,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Fortran::parser::Unwrap( *charExpr); std::string str{std::get(charConst->t)}; - std::stringstream bindNameStream; - bindNameStream << "\"" << str << "\""; - bindName = bindNameStream.str(); - } - if (!bindName.empty()) { - // Fixme: do we need to ensure there there is only one device? for (auto &device : currentDevices) { - device->set_bindName(std::string(bindName), isInternal); + device->set_bindName(std::string(str)); } } } diff --git a/flang/lib/Semantics/symbol.cpp b/flang/lib/Semantics/symbol.cpp index 32eb6c2c5a188..d44df4669fa36 100644 --- a/flang/lib/Semantics/symbol.cpp +++ b/flang/lib/Semantics/symbol.cpp @@ -144,6 +144,52 @@ llvm::raw_ostream &operator<<( os << ' ' << x; } } + if (!x.openACCRoutineInfos_.empty()) { + os << " openACCRoutineInfos:"; + for (const auto x : x.openACCRoutineInfos_) { + os << x; + } + } + return os; +} + +llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, const OpenACCRoutineDeviceTypeInfo &x) { + if (x.dType() != common::OpenACCDeviceType::None) { + os << " deviceType(" << common::EnumToString(x.dType()) << ')'; + } + if (x.isSeq()) { + os << " seq"; + } + if (x.isVector()) { + os << " vector"; + } + if (x.isWorker()) { + os << " worker"; + } + if (x.isGang()) { + os << " gang(" << x.gangDim() << ')'; + } + if (const auto 
*bindName{x.bindName()}) { + if (const auto &symbol{std::get_if(bindName)}) { + os << " bindName(\"" << *symbol << "\")"; + } else { + const SymbolRef s{std::get(*bindName)}; + os << " bindName(" << s->name() << ")"; + } + } + return os; +} + +llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, const OpenACCRoutineInfo &x) { + if (x.isNohost()) { + os << " nohost"; + } + os << static_cast(x); + for (const auto &d : x.deviceTypeInfos_) { + os << d; + } return os; } diff --git a/flang/test/Lower/OpenACC/acc-routine-named.f90 b/flang/test/Lower/OpenACC/acc-routine-named.f90 index 2cf6bf8b2bc06..de9784a1146cc 100644 --- a/flang/test/Lower/OpenACC/acc-routine-named.f90 +++ b/flang/test/Lower/OpenACC/acc-routine-named.f90 @@ -4,8 +4,8 @@ module acc_routines -! CHECK: acc.routine @acc_routine_1 func(@_QMacc_routinesPacc2) -! CHECK: acc.routine @acc_routine_0 func(@_QMacc_routinesPacc1) seq +! CHECK: acc.routine @[[r0:.*]] func(@_QMacc_routinesPacc2) +! CHECK: acc.routine @[[r1:.*]] func(@_QMacc_routinesPacc1) seq !$acc routine(acc1) seq @@ -14,12 +14,14 @@ module acc_routines subroutine acc1() end subroutine -! CHECK-LABEL: func.func @_QMacc_routinesPacc1() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>} +! CHECK-LABEL: func.func @_QMacc_routinesPacc1() +! CHECK-SAME:attributes {acc.routine_info = #acc.routine_info<[@[[r1]]]>} subroutine acc2() !$acc routine(acc2) end subroutine -! CHECK-LABEL: func.func @_QMacc_routinesPacc2() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_1]>} +! CHECK-LABEL: func.func @_QMacc_routinesPacc2() +! CHECK-SAME:attributes {acc.routine_info = #acc.routine_info<[@[[r0]]]>} end module diff --git a/flang/test/Lower/OpenACC/acc-routine-use-module.f90 b/flang/test/Lower/OpenACC/acc-routine-use-module.f90 index 7fc96b0ef5684..059324230a746 100644 --- a/flang/test/Lower/OpenACC/acc-routine-use-module.f90 +++ b/flang/test/Lower/OpenACC/acc-routine-use-module.f90 @@ -1,6 +1,6 @@ ! 
RUN: rm -fr %t && mkdir -p %t && cd %t
-! RUN: bbc -emit-fir %S/acc-module-definition.f90
-! RUN: bbc -emit-fir %s -o - | FileCheck %s
+! RUN: bbc -fopenacc -emit-fir %S/acc-module-definition.f90
+! RUN: bbc -fopenacc -emit-fir %s -o - | FileCheck %s
 ! This test module is based off of flang/test/Lower/use_module.f90
 ! The first runs ensures the module file is generated.
@@ -8,6 +8,7 @@
 module use_mod1
 use mod1
 contains
+ !CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq
 !CHECK: func.func @_QMuse_mod1Pcaller
 !CHECK-SAME {
 subroutine caller(aa)
@@ -18,6 +19,5 @@ subroutine caller(aa)
 !$acc end serial
 end subroutine
 !CHECK: }
- !CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq
 !CHECK: func.func private @_QMmod1Pcallee(!fir.ref) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>}
 end module
\ No newline at end of file
diff --git a/flang/test/Lower/OpenACC/acc-routine.f90 b/flang/test/Lower/OpenACC/acc-routine.f90
index 1170af18bc334..789f3a57e1f79 100644
--- a/flang/test/Lower/OpenACC/acc-routine.f90
+++ b/flang/test/Lower/OpenACC/acc-routine.f90
@@ -2,69 +2,77 @@
 ! RUN: bbc -fopenacc -emit-hlfir %s -o - | FileCheck %s
-! CHECK: acc.routine @acc_routine_17 func(@_QPacc_routine19) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type])
-! CHECK: acc.routine @acc_routine_16 func(@_QPacc_routine18) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type])
-! CHECK: acc.routine @acc_routine_15 func(@_QPacc_routine17) worker ([#acc.device_type]) vector ([#acc.device_type])
-! CHECK: acc.routine @acc_routine_14 func(@_QPacc_routine16) gang([#acc.device_type]) seq ([#acc.device_type])
-! CHECK: acc.routine @acc_routine_10 func(@_QPacc_routine11) seq
-! CHECK: acc.routine @acc_routine_9 func(@_QPacc_routine10) seq
-! CHECK: acc.routine @acc_routine_8 func(@_QPacc_routine9) bind("_QPacc_routine9a")
-!
CHECK: acc.routine @acc_routine_7 func(@_QPacc_routine8) bind("routine8_")
-! CHECK: acc.routine @acc_routine_6 func(@_QPacc_routine7) gang(dim: 1 : i64)
-! CHECK: acc.routine @acc_routine_5 func(@_QPacc_routine6) nohost
-! CHECK: acc.routine @acc_routine_4 func(@_QPacc_routine5) worker
-! CHECK: acc.routine @acc_routine_3 func(@_QPacc_routine4) vector
-! CHECK: acc.routine @acc_routine_2 func(@_QPacc_routine3) gang
-! CHECK: acc.routine @acc_routine_1 func(@_QPacc_routine2) seq
-! CHECK: acc.routine @acc_routine_0 func(@_QPacc_routine1)
+! CHECK: acc.routine @[[r14:.*]] func(@_QPacc_routine19) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type])
+! CHECK: acc.routine @[[r13:.*]] func(@_QPacc_routine18) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type])
+! CHECK: acc.routine @[[r12:.*]] func(@_QPacc_routine17) worker ([#acc.device_type]) vector ([#acc.device_type])
+! CHECK: acc.routine @[[r11:.*]] func(@_QPacc_routine16) gang([#acc.device_type]) seq ([#acc.device_type])
+! CHECK: acc.routine @[[r10:.*]] func(@_QPacc_routine11) seq
+! CHECK: acc.routine @[[r09:.*]] func(@_QPacc_routine10) seq
+! CHECK: acc.routine @[[r08:.*]] func(@_QPacc_routine9) bind("_QPacc_routine9a")
+! CHECK: acc.routine @[[r07:.*]] func(@_QPacc_routine8) bind("routine8_")
+! CHECK: acc.routine @[[r06:.*]] func(@_QPacc_routine7) gang(dim: 1 : i64)
+! CHECK: acc.routine @[[r05:.*]] func(@_QPacc_routine6) nohost
+! CHECK: acc.routine @[[r04:.*]] func(@_QPacc_routine5) worker
+! CHECK: acc.routine @[[r03:.*]] func(@_QPacc_routine4) vector
+! CHECK: acc.routine @[[r02:.*]] func(@_QPacc_routine3) gang
+! CHECK: acc.routine @[[r01:.*]] func(@_QPacc_routine2) seq
+! CHECK: acc.routine @[[r00:.*]] func(@_QPacc_routine1)
 subroutine acc_routine1()
 !$acc routine
 end subroutine
-! CHECK-LABEL: func.func @_QPacc_routine1() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>}
+!
CHECK-LABEL: func.func @_QPacc_routine1()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r00]]]>}
 subroutine acc_routine2()
 !$acc routine seq
 end subroutine
-! CHECK-LABEL: func.func @_QPacc_routine2() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_1]>}
+! CHECK-LABEL: func.func @_QPacc_routine2()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r01]]]>}
 subroutine acc_routine3()
 !$acc routine gang
 end subroutine
-! CHECK-LABEL: func.func @_QPacc_routine3() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_2]>}
+! CHECK-LABEL: func.func @_QPacc_routine3()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r02]]]>}
 subroutine acc_routine4()
 !$acc routine vector
 end subroutine
-! CHECK-LABEL: func.func @_QPacc_routine4() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_3]>}
+! CHECK-LABEL: func.func @_QPacc_routine4()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r03]]]>}
 subroutine acc_routine5()
 !$acc routine worker
 end subroutine
-! CHECK-LABEL: func.func @_QPacc_routine5() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_4]>}
+! CHECK-LABEL: func.func @_QPacc_routine5()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r04]]]>}
 subroutine acc_routine6()
 !$acc routine nohost
 end subroutine
-! CHECK-LABEL: func.func @_QPacc_routine6() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_5]>}
+! CHECK-LABEL: func.func @_QPacc_routine6()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r05]]]>}
 subroutine acc_routine7()
 !$acc routine gang(dim:1)
 end subroutine
-! CHECK-LABEL: func.func @_QPacc_routine7() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_6]>}
+! CHECK-LABEL: func.func @_QPacc_routine7()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r06]]]>}
 subroutine acc_routine8()
 !$acc routine bind("routine8_")
 end subroutine
-!
CHECK-LABEL: func.func @_QPacc_routine8() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_7]>}
+! CHECK-LABEL: func.func @_QPacc_routine8()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r07]]]>}
 subroutine acc_routine9a()
 end subroutine
@@ -73,20 +81,23 @@ subroutine acc_routine9()
 !$acc routine bind(acc_routine9a)
 end subroutine
-! CHECK-LABEL: func.func @_QPacc_routine9() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_8]>}
+! CHECK-LABEL: func.func @_QPacc_routine9()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r08]]]>}
 function acc_routine10()
 !$acc routine(acc_routine10) seq
 end function
-! CHECK-LABEL: func.func @_QPacc_routine10() -> f32 attributes {acc.routine_info = #acc.routine_info<[@acc_routine_9]>}
+! CHECK-LABEL: func.func @_QPacc_routine10() -> f32
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r09]]]>}
 subroutine acc_routine11(a)
 real :: a
 !$acc routine(acc_routine11) seq
 end subroutine
-! CHECK-LABEL: func.func @_QPacc_routine11(%arg0: !fir.ref {fir.bindc_name = "a"}) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_10]>}
+! CHECK-LABEL: func.func @_QPacc_routine11(%arg0: !fir.ref {fir.bindc_name = "a"})
+!
CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r10]]]>}
 subroutine acc_routine12()

>From e2d1a05d2de2356644d385e9099a7e6879143cc7 Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 25 Apr 2025 14:40:14 -0700
Subject: [PATCH 5/9] clang-format

---
 flang/include/flang/Lower/OpenACC.h | 3 +-
 flang/include/flang/Semantics/symbol.h | 4 +-
 flang/lib/Lower/Bridge.cpp | 12 ++---
 flang/lib/Lower/CallInterface.cpp | 9 ++--
 flang/lib/Lower/OpenACC.cpp | 56 +++++++++++-----------
 flang/lib/Semantics/resolve-directives.cpp | 3 +-
 6 files changed, 48 insertions(+), 39 deletions(-)

diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h
index dc014a71526c3..35a33e751b52b 100644
--- a/flang/include/flang/Lower/OpenACC.h
+++ b/flang/include/flang/Lower/OpenACC.h
@@ -76,7 +76,8 @@ static constexpr llvm::StringRef declarePostDeallocSuffix =
 static constexpr llvm::StringRef privatizationRecipePrefix = "privatization";
-bool needsOpenACCRoutineConstruct(const Fortran::evaluate::ProcedureDesignator *);
+bool needsOpenACCRoutineConstruct(
+ const Fortran::evaluate::ProcedureDesignator *);
 mlir::Value genOpenACCConstruct(AbstractConverter &,
 Fortran::semantics::SemanticsContext &,
diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h
index 8c60a196bdfc1..eb34aac9c390d 100644
--- a/flang/include/flang/Semantics/symbol.h
+++ b/flang/include/flang/Semantics/symbol.h
@@ -143,7 +143,8 @@ class OpenACCRoutineDeviceTypeInfo {
 const std::variant *bindName() const {
 return bindName_.has_value() ?
&*bindName_ : nullptr;
 }
- const std::optional> &bindNameOpt() const {
+ const std::optional> &
+ bindNameOpt() const {
 return bindName_;
 }
 void set_bindName(std::string &&name) { bindName_.emplace(std::move(name)); }
@@ -153,6 +154,7 @@ class OpenACCRoutineDeviceTypeInfo {
 friend llvm::raw_ostream &operator<<(
 llvm::raw_ostream &, const OpenACCRoutineDeviceTypeInfo &);
+
 private:
 bool isSeq_{false};
 bool isVector_{false};
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index abe07bcfdfcda..5e7b783323bfd 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -465,8 +465,8 @@ class FirConverter : public Fortran::lower::AbstractConverter {
 /// Declare a function.
 void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) {
 setCurrentPosition(funit.getStartingSourceLoc());
- builder = new fir::FirOpBuilder(
- bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable);
+ builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(),
+ &mlirSymbolTable);
 for (int entryIndex = 0, last = funit.entryPointList.size();
 entryIndex < last; ++entryIndex) {
 funit.setActiveEntry(entryIndex);
@@ -1031,9 +1031,9 @@ class FirConverter : public Fortran::lower::AbstractConverter {
 return bridge.getSemanticsContext().FindScope(currentPosition);
 }
- fir::FirOpBuilder &getFirOpBuilder() override final {
+ fir::FirOpBuilder &getFirOpBuilder() override final {
 CHECK(builder && "builder is not set before calling getFirOpBuilder");
- return *builder;
+ return *builder;
 }
 mlir::ModuleOp getModuleOp() override final { return bridge.getModule(); }
@@ -5616,10 +5616,10 @@ class FirConverter : public Fortran::lower::AbstractConverter {
 LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]";
 if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym;
 llvm::dbgs() << "\n");
- // I don't think setting the builder is necessary here, because callee
+ // I don't think setting the builder is necessary here, because callee
 // always looks
up the FuncOp from the module. If there was a function that
 // was not declared yet. This call to callee will cause an assertion
- //failure.
+ // failure.
 Fortran::lower::CalleeInterface callee(funit, *this);
 mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments();
 builder =
diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp
index b938354e6bcb3..611eacfe178e5 100644
--- a/flang/lib/Lower/CallInterface.cpp
+++ b/flang/lib/Lower/CallInterface.cpp
@@ -717,11 +717,14 @@ void Fortran::lower::CallInterface::declare() {
 func.setArgAttrs(placeHolder.index(),
 placeHolder.value().attributes);
 setCUDAAttributes(func, side().getProcedureSymbol(), characteristic);
-
+
 if (const Fortran::semantics::Symbol *sym = side().getProcedureSymbol()) {
- if (const auto &info{sym->GetUltimate().detailsIf()}) {
+ if (const auto &info{
+ sym->GetUltimate()
+ .detailsIf()}) {
 if (!info->openACCRoutineInfos().empty()) {
- genOpenACCRoutineConstruct(converter, module, func, info->openACCRoutineInfos());
+ genOpenACCRoutineConstruct(converter, module, func,
+ info->openACCRoutineInfos());
 }
 }
 }
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index eefa8fbf12b1a..891dc998bc596 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -32,14 +32,14 @@
 #include "flang/Semantics/expression.h"
 #include "flang/Semantics/scope.h"
 #include "flang/Semantics/tools.h"
+#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h"
+#include "mlir/IR/MLIRContext.h"
+#include "mlir/Support/LLVM.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Frontend/OpenACC/ACC.h.inc"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/ErrorHandling.h"
-#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h"
-#include "mlir/IR/MLIRContext.h"
-#include "mlir/Support/LLVM.h"
 #define DEBUG_TYPE "flang-lower-openacc"
@@ -4141,7 +4141,9 @@ static void attachRoutineInfo(mlir::func::FuncOp func,
mlir::acc::RoutineInfoAttr::get(func.getContext(), routines)); } -static mlir::ArrayAttr getArrayAttrOrNull(fir::FirOpBuilder &builder, llvm::SmallVector &attributes) { +static mlir::ArrayAttr +getArrayAttrOrNull(fir::FirOpBuilder &builder, + llvm::SmallVector &attributes) { if (attributes.empty()) { return nullptr; } else { @@ -4181,25 +4183,24 @@ void createOpenACCRoutineConstruct( mlir::OpBuilder modBuilder(mod.getBodyRegion()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); modBuilder.create( - loc, routineOpStr, funcName, - getArrayAttrOrNull(builder, bindNames), + loc, routineOpStr, funcName, getArrayAttrOrNull(builder, bindNames), getArrayAttrOrNull(builder, bindNameDeviceTypes), getArrayAttrOrNull(builder, workerDeviceTypes), getArrayAttrOrNull(builder, vectorDeviceTypes), - getArrayAttrOrNull(builder, seqDeviceTypes), - hasNohost, /*implicit=*/false, - getArrayAttrOrNull(builder, gangDeviceTypes), + getArrayAttrOrNull(builder, seqDeviceTypes), hasNohost, + /*implicit=*/false, getArrayAttrOrNull(builder, gangDeviceTypes), getArrayAttrOrNull(builder, gangDimValues), getArrayAttrOrNull(builder, gangDimDeviceTypes)); auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr); if (funcOp) { - + attachRoutineInfo(funcOp, symbolRefAttr); } else { // FuncOp is not lowered yet. Keep the information so the routine info // can be attached later to the funcOp. 
- converter.getAccDelayedRoutines().push_back(std::make_pair(funcName, symbolRefAttr)); + converter.getAccDelayedRoutines().push_back( + std::make_pair(funcName, symbolRefAttr)); } } @@ -4218,7 +4219,7 @@ static void interpretRoutineDeviceInfo( auto getDeviceTypeAttr = [&]() -> mlir::Attribute { auto context = builder.getContext(); auto value = getDeviceType(dinfo.dType()); - return mlir::acc::DeviceTypeAttr::get(context, value ); + return mlir::acc::DeviceTypeAttr::get(context, value); }; if (dinfo.isSeq()) { seqDeviceTypes.push_back(getDeviceTypeAttr()); @@ -4244,14 +4245,15 @@ static void interpretRoutineDeviceInfo( const auto &bindName = dinfo.bindNameOpt().value(); mlir::Attribute bindNameAttr; if (const auto &bindStr{std::get_if(&bindName)}) { - bindNameAttr = builder.getStringAttr(*bindStr); - } else if (const auto &bindSym{std::get_if(&bindName)}) { - bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym)); - } else { - llvm_unreachable("Unsupported bind name type"); - } - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(getDeviceTypeAttr()); + bindNameAttr = builder.getStringAttr(*bindStr); + } else if (const auto &bindSym{ + std::get_if(&bindName)}) { + bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym)); + } else { + llvm_unreachable("Unsupported bind name type"); + } + bindNames.push_back(bindNameAttr); + bindNameDeviceTypes.push_back(getDeviceTypeAttr()); } } @@ -4277,18 +4279,18 @@ void Fortran::lower::genOpenACCRoutineConstruct( } // Note: Device Independent Attributes are set to the // none device type in `info`. 
- interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, vectorDeviceTypes, - workerDeviceTypes, bindNameDeviceTypes, - bindNames, gangDeviceTypes, gangDimValues, - gangDimDeviceTypes); + interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, + vectorDeviceTypes, workerDeviceTypes, + bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimValues, gangDimDeviceTypes); // Device Dependent Attributes for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo : info.deviceTypeInfos()) { interpretRoutineDeviceInfo( - converter, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, - bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues, - gangDimDeviceTypes); + converter, dinfo, seqDeviceTypes, vectorDeviceTypes, + workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimValues, gangDimDeviceTypes); } } createOpenACCRoutineConstruct( diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index a8f00b546306e..c2df7cddc0025 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1103,7 +1103,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( } symbol.get().add_openACCRoutineInfo(info); } else { - llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() << "\n"; + llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() + << "\n"; } } >From c26093683edb7c0270809d2afb717450f92df6ab Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 25 Apr 2025 14:47:39 -0700 Subject: [PATCH 6/9] tidy up --- flang/include/flang/Lower/OpenACC.h | 1 - flang/lib/Lower/CallInterface.cpp | 1 - 2 files changed, 2 deletions(-) diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 35a33e751b52b..9e71ad0a15c89 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -46,7 +46,6 @@ struct AccClauseList; struct OpenACCConstruct; 
struct OpenACCDeclarativeConstruct; struct OpenACCRoutineConstruct; -struct ProcedureDesignator; } // namespace parser namespace semantics { diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 611eacfe178e5..602b5c7bfa6c6 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -1701,7 +1701,6 @@ class SignatureBuilder fir::emitFatalError(converter.getCurrentLocation(), "SignatureBuilder should only be used once"); declare(); - interfaceDetermined = true; return getFuncOp(); } >From 1b825b55c808ac92cd2866d855611cd585eb28db Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 25 Apr 2025 17:00:27 -0700 Subject: [PATCH 7/9] cleaning up unused code --- flang/include/flang/Lower/AbstractConverter.h | 3 - flang/include/flang/Lower/OpenACC.h | 18 +- flang/lib/Lower/Bridge.cpp | 43 +++-- flang/lib/Lower/OpenACC.cpp | 166 +----------------- flang/lib/Semantics/resolve-directives.cpp | 12 +- 5 files changed, 32 insertions(+), 210 deletions(-) diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 59419e829718f..2fa0da94b0396 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -358,9 +358,6 @@ class AbstractConverter { /// functions in order to be in sync). virtual mlir::SymbolTable *getMLIRSymbolTable() = 0; - virtual Fortran::lower::AccRoutineInfoMappingList & - getAccDelayedRoutines() = 0; - private: /// Options controlling lowering behavior. 
const Fortran::lower::LoweringOptions &loweringOptions; diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 9e71ad0a15c89..d2cd7712fb2c7 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -63,9 +63,6 @@ namespace pft { struct Evaluation; } // namespace pft -using AccRoutineInfoMappingList = - llvm::SmallVector>; - static constexpr llvm::StringRef declarePostAllocSuffix = "_acc_declare_update_desc_post_alloc"; static constexpr llvm::StringRef declarePreDeallocSuffix = @@ -82,22 +79,13 @@ mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, pft::Evaluation &, const parser::OpenACCConstruct &); -void genOpenACCDeclarativeConstruct(AbstractConverter &, - Fortran::semantics::SemanticsContext &, - StatementContext &, - const parser::OpenACCDeclarativeConstruct &, - AccRoutineInfoMappingList &); -void genOpenACCRoutineConstruct(AbstractConverter &, - Fortran::semantics::SemanticsContext &, - mlir::ModuleOp, - const parser::OpenACCRoutineConstruct &); +void genOpenACCDeclarativeConstruct( + AbstractConverter &, Fortran::semantics::SemanticsContext &, + StatementContext &, const parser::OpenACCDeclarativeConstruct &); void genOpenACCRoutineConstruct( AbstractConverter &, mlir::ModuleOp, mlir::func::FuncOp, const std::vector &); -void finalizeOpenACCRoutineAttachment(mlir::ModuleOp, - AccRoutineInfoMappingList &); - /// Get a acc.private.recipe op for the given type or create it if it does not /// exist yet. 
mlir::acc::PrivateRecipeOp createOrGetPrivateRecipe(mlir::OpBuilder &, diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 5e7b783323bfd..1615493003898 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -458,15 +458,25 @@ class FirConverter : public Fortran::lower::AbstractConverter { Fortran::common::LanguageFeature::CUDA)); }); - finalizeOpenACCLowering(); finalizeOpenMPLowering(globalOmpRequiresSymbol); } /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { + // Since this is a recursive function, we only need to create a new builder + // for each top-level declaration. It would be simpler to have a single + // builder for the entire translation unit, but that requires a lot of + // changes to the code. + // FIXME: Once createGlobalOutsideOfFunctionLowering is fixed, we can + // remove this code and share the module builder. + bool newBuilder = false; + if (!builder) { + newBuilder = true; + builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), + &mlirSymbolTable); + } + CHECK(builder && "FirOpBuilder did not instantiate"); setCurrentPosition(funit.getStartingSourceLoc()); - builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), - &mlirSymbolTable); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { funit.setActiveEntry(entryIndex); @@ -493,7 +503,11 @@ class FirConverter : public Fortran::lower::AbstractConverter { for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList) if (auto *f = std::get_if(&unit)) declareFunction(*f); - builder = nullptr; + + if (newBuilder) { + delete builder; + builder = nullptr; + } } /// Get the scope that is defining or using \p sym. 
The returned scope is not @@ -3017,8 +3031,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { void genFIR(const Fortran::parser::OpenACCDeclarativeConstruct &accDecl) { genOpenACCDeclarativeConstruct(*this, bridge.getSemanticsContext(), - bridge.openAccCtx(), accDecl, - accRoutineInfos); + bridge.openAccCtx(), accDecl); for (Fortran::lower::pft::Evaluation &e : getEval().getNestedEvaluations()) genFIR(e); } @@ -4286,11 +4299,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { return Fortran::lower::createMutableBox(loc, *this, expr, localSymbols); } - Fortran::lower::AccRoutineInfoMappingList & - getAccDelayedRoutines() override final { - return accRoutineInfos; - } - // Create the [newRank] array with the lower bounds to be passed to the // runtime as a descriptor. mlir::Value createLboundArray(llvm::ArrayRef lbounds, @@ -5889,7 +5897,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Helper to generate GlobalOps when the builder is not positioned in any /// region block. This is required because the FirOpBuilder assumes it is /// always positioned inside a region block when creating globals, the easiest - /// way comply is to create a dummy function and to throw it afterwards. + /// way comply is to create a dummy function and to throw it away afterwards. 
void createGlobalOutsideOfFunctionLowering( const std::function &createGlobals) { // FIXME: get rid of the bogus function context and instantiate the @@ -5902,6 +5910,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { mlir::FunctionType::get(context, std::nullopt, std::nullopt), symbolTable); func.addEntryBlock(); + CHECK(!builder && "Expected builder to be uninitialized"); builder = new fir::FirOpBuilder(func, bridge.getKindMap(), symbolTable); assert(builder && "FirOpBuilder did not instantiate"); builder->setFastMathFlags(bridge.getLoweringOptions().getMathOptions()); @@ -6331,13 +6340,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { expr.u); } - /// Performing OpenACC lowering action that were deferred to the end of - /// lowering. - void finalizeOpenACCLowering() { - Fortran::lower::finalizeOpenACCRoutineAttachment(getModuleOp(), - accRoutineInfos); - } - /// Performing OpenMP lowering actions that were deferred to the end of /// lowering. void finalizeOpenMPLowering( @@ -6429,9 +6431,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// A counter for uniquing names in `literalNamesMap`. std::uint64_t uniqueLitId = 0; - /// Deferred OpenACC routine attachment. 
- Fortran::lower::AccRoutineInfoMappingList accRoutineInfos; - /// Whether an OpenMP target region or declare target function/subroutine /// intended for device offloading has been detected bool ompDeviceCodeFound = false; diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 891dc998bc596..1a031dce7a487 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -4163,9 +4163,6 @@ void createOpenACCRoutineConstruct( llvm::SmallVector &workerDeviceTypes, llvm::SmallVector &vectorDeviceTypes) { - std::stringstream routineOpName; - routineOpName << accRoutinePrefix.str() << routineCounter++; - for (auto routineOp : mod.getOps()) { if (routineOp.getFuncName().str().compare(funcName) == 0) { // If the routine is already specified with the same clauses, just skip @@ -4179,6 +4176,8 @@ void createOpenACCRoutineConstruct( mlir::emitError(loc, "Routine already specified with different clauses"); } } + std::stringstream routineOpName; + routineOpName << accRoutinePrefix.str() << routineCounter++; std::string routineOpStr = routineOpName.str(); mlir::OpBuilder modBuilder(mod.getBodyRegion()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); @@ -4192,16 +4191,7 @@ void createOpenACCRoutineConstruct( getArrayAttrOrNull(builder, gangDimValues), getArrayAttrOrNull(builder, gangDimDeviceTypes)); - auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr); - if (funcOp) { - - attachRoutineInfo(funcOp, symbolRefAttr); - } else { - // FuncOp is not lowered yet. Keep the information so the routine info - // can be attached later to the funcOp. 
- converter.getAccDelayedRoutines().push_back( - std::make_pair(funcName, symbolRefAttr)); - } + attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr)); } static void interpretRoutineDeviceInfo( @@ -4299,145 +4289,6 @@ void Fortran::lower::genOpenACCRoutineConstruct( seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); } -void Fortran::lower::genOpenACCRoutineConstruct( - Fortran::lower::AbstractConverter &converter, - Fortran::semantics::SemanticsContext &semanticsContext, mlir::ModuleOp mod, - const Fortran::parser::OpenACCRoutineConstruct &routineConstruct) { - fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - mlir::Location loc = converter.genLocation(routineConstruct.source); - std::optional name = - std::get>(routineConstruct.t); - const auto &clauses = - std::get(routineConstruct.t); - mlir::func::FuncOp funcOp; - std::string funcName; - if (name) { - funcName = converter.mangleName(*name->symbol); - funcOp = - builder.getNamedFunction(mod, builder.getMLIRSymbolTable(), funcName); - } else { - Fortran::semantics::Scope &scope = - semanticsContext.FindScope(routineConstruct.source); - const Fortran::semantics::Scope &progUnit{GetProgramUnitContaining(scope)}; - const auto *subpDetails{ - progUnit.symbol() - ? progUnit.symbol() - ->detailsIf() - : nullptr}; - if (subpDetails && subpDetails->isInterface()) { - funcName = converter.mangleName(*progUnit.symbol()); - funcOp = - builder.getNamedFunction(mod, builder.getMLIRSymbolTable(), funcName); - } else { - funcOp = builder.getFunction(); - funcName = funcOp.getName(); - } - } - // TODO: Refactor this to use the OpenACCRoutineInfo - bool hasNohost = false; - - llvm::SmallVector seqDeviceTypes, vectorDeviceTypes, - workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, - gangDimDeviceTypes, gangDimValues; - - // device_type attribute is set to `none` until a device_type clause is - // encountered. 
- llvm::SmallVector crtDeviceTypes; - crtDeviceTypes.push_back(mlir::acc::DeviceTypeAttr::get( - builder.getContext(), mlir::acc::DeviceType::None)); - - for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - seqDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (const auto *gangClause = - std::get_if(&clause.u)) { - if (gangClause->v) { - const Fortran::parser::AccGangArgList &x = *gangClause->v; - for (const Fortran::parser::AccGangArg &gangArg : x.v) { - if (const auto *dim = - std::get_if(&gangArg.u)) { - const std::optional dimValue = Fortran::evaluate::ToInt64( - *Fortran::semantics::GetExpr(dim->v)); - if (!dimValue) - mlir::emitError(loc, - "dim value must be a constant positive integer"); - mlir::Attribute gangDimAttr = - builder.getIntegerAttr(builder.getI64Type(), *dimValue); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - gangDimValues.push_back(gangDimAttr); - gangDimDeviceTypes.push_back(crtDeviceTypeAttr); - } - } - } - } else { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - gangDeviceTypes.push_back(crtDeviceTypeAttr); - } - } else if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - vectorDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - workerDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (std::get_if(&clause.u)) { - hasNohost = true; - } else if (const auto *bindClause = - std::get_if(&clause.u)) { - if (const auto *name = - std::get_if(&bindClause->v.u)) { - mlir::Attribute bindNameAttr = - builder.getStringAttr(converter.mangleName(*name->symbol)); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(crtDeviceTypeAttr); - } - } else if (const auto charExpr = - std::get_if( - &bindClause->v.u)) { - const std::optional name = - 
Fortran::semantics::GetConstExpr(semanticsContext, - *charExpr); - if (!name) - mlir::emitError(loc, "Could not retrieve the bind name"); - - mlir::Attribute bindNameAttr = builder.getStringAttr(*name); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(crtDeviceTypeAttr); - } - } - } else if (const auto *deviceTypeClause = - std::get_if( - &clause.u)) { - crtDeviceTypes.clear(); - gatherDeviceTypeAttrs(builder, deviceTypeClause, crtDeviceTypes); - } - } - - createOpenACCRoutineConstruct( - converter, loc, mod, funcOp, funcName, hasNohost, bindNames, - bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes, - seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); -} - -void Fortran::lower::finalizeOpenACCRoutineAttachment( - mlir::ModuleOp mod, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { - for (auto &mapping : accRoutineInfos) { - mlir::func::FuncOp funcOp = - mod.lookupSymbol(mapping.first); - if (!funcOp) - mlir::emitWarning(mod.getLoc(), - llvm::Twine("function '") + llvm::Twine(mapping.first) + - llvm::Twine("' in acc routine directive is not " - "found in this translation unit.")); - else - attachRoutineInfo(funcOp, mapping.second); - } - accRoutineInfos.clear(); -} - static void genACC(Fortran::lower::AbstractConverter &converter, Fortran::lower::pft::Evaluation &eval, @@ -4551,8 +4402,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct( Fortran::lower::AbstractConverter &converter, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &openAccCtx, - const Fortran::parser::OpenACCDeclarativeConstruct &accDeclConstruct, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { + const Fortran::parser::OpenACCDeclarativeConstruct &accDeclConstruct) { Fortran::common::visit( common::visitors{ @@ -4561,13 +4411,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct( genACC(converter, semanticsContext, openAccCtx, 
standaloneDeclarativeConstruct); }, - [&](const Fortran::parser::OpenACCRoutineConstruct - &routineConstruct) { - fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - mlir::ModuleOp mod = builder.getModule(); - Fortran::lower::genOpenACCRoutineConstruct( - converter, semanticsContext, mod, routineConstruct); - }, + [&](const Fortran::parser::OpenACCRoutineConstruct &x) {}, }, accDeclConstruct.u); } diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index c2df7cddc0025..d74953df1e630 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1041,9 +1041,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( if (const auto *dTypeClause = std::get_if(&clause.u)) { currentDevices.clear(); - for (const auto &deviceTypeExpr : dTypeClause->v.v) { + for (const auto &deviceTypeExpr : dTypeClause->v.v) currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v)); - } } else if (std::get_if(&clause.u)) { info.set_isNohost(); } else if (std::get_if(&clause.u)) { @@ -1080,9 +1079,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( std::get_if(&bindClause->v.u)) { if (Symbol *sym = ResolveFctName(*name)) { Symbol &ultimate{sym->GetUltimate()}; - for (auto &device : currentDevices) { + for (auto &device : currentDevices) device->set_bindName(SymbolRef(ultimate)); - } } else { context_.Say((*name).source, "No function or subroutine declared for '%s'"_err_en_US, @@ -1095,16 +1093,12 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Fortran::parser::Unwrap( *charExpr); std::string str{std::get(charConst->t)}; - for (auto &device : currentDevices) { + for (auto &device : currentDevices) device->set_bindName(std::string(str)); - } } } } symbol.get().add_openACCRoutineInfo(info); - } else { - llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() - << "\n"; } } >From 158481e5eaf63c1b2b4c172b9c143ca4c10722f5 Mon Sep 17 00:00:00 2001 From: Andre 
Kuhlenschmidt Date: Fri, 25 Apr 2025 17:15:26 -0700 Subject: [PATCH 8/9] a little more tidying up --- flang/include/flang/Lower/AbstractConverter.h | 1 - flang/include/flang/Lower/OpenACC.h | 3 --- flang/lib/Lower/Bridge.cpp | 3 ++- 3 files changed, 2 insertions(+), 5 deletions(-) diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 2fa0da94b0396..1d1323642bf9c 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -14,7 +14,6 @@ #define FORTRAN_LOWER_ABSTRACTCONVERTER_H #include "flang/Lower/LoweringOptions.h" -#include "flang/Lower/OpenACC.h" #include "flang/Lower/PFTDefs.h" #include "flang/Optimizer/Builder/BoxValue.h" #include "flang/Optimizer/Dialect/FIRAttr.h" diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index d2cd7712fb2c7..4034953976427 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -72,9 +72,6 @@ static constexpr llvm::StringRef declarePostDeallocSuffix = static constexpr llvm::StringRef privatizationRecipePrefix = "privatization"; -bool needsOpenACCRoutineConstruct( - const Fortran::evaluate::ProcedureDesignator *); - mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, pft::Evaluation &, diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 1615493003898..e50c91654f7bb 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -5897,7 +5897,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Helper to generate GlobalOps when the builder is not positioned in any /// region block. This is required because the FirOpBuilder assumes it is /// always positioned inside a region block when creating globals, the easiest - /// way comply is to create a dummy function and to throw it away afterwards. 
+ /// way to comply is to create a dummy function and to throw it away + /// afterwards. void createGlobalOutsideOfFunctionLowering( const std::function &createGlobals) { // FIXME: get rid of the bogus function context and instantiate the >From 8f6ae035147336c4ed04b5b25487f72ebc52c757 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 1 May 2025 08:12:35 -0700 Subject: [PATCH 9/9] more consistent use of builders --- flang/lib/Lower/Bridge.cpp | 40 ++++++++++--------------------- flang/lib/Lower/CallInterface.cpp | 1 - 2 files changed, 13 insertions(+), 28 deletions(-) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index e50c91654f7bb..fb20dfbaf477e 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -403,18 +403,21 @@ class FirConverter : public Fortran::lower::AbstractConverter { [&](Fortran::lower::pft::FunctionLikeUnit &f) { if (f.isMainProgram()) hasMainProgram = true; - declareFunction(f); + createGlobalOutsideOfFunctionLowering( + [&]() { declareFunction(f); }); if (!globalOmpRequiresSymbol) globalOmpRequiresSymbol = f.getScope().symbol(); }, [&](Fortran::lower::pft::ModuleLikeUnit &m) { lowerModuleDeclScope(m); - for (Fortran::lower::pft::ContainedUnit &unit : - m.containedUnitList) - if (auto *f = - std::get_if( - &unit)) - declareFunction(*f); + createGlobalOutsideOfFunctionLowering([&]() { + for (Fortran::lower::pft::ContainedUnit &unit : + m.containedUnitList) + if (auto *f = + std::get_if( + &unit)) + declareFunction(*f); + }); }, [&](Fortran::lower::pft::BlockDataUnit &b) { if (!globalOmpRequiresSymbol) @@ -463,19 +466,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { - // Since this is a recursive function, we only need to create a new builder - // for each top-level declaration. 
It would be simpler to have a single - // builder for the entire translation unit, but that requires a lot of - // changes to the code. - // FIXME: Once createGlobalOutsideOfFunctionLowering is fixed, we can - // remove this code and share the module builder. - bool newBuilder = false; - if (!builder) { - newBuilder = true; - builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), - &mlirSymbolTable); - } - CHECK(builder && "FirOpBuilder did not instantiate"); + CHECK(builder && "declareFunction called with uninitialized builder"); setCurrentPosition(funit.getStartingSourceLoc()); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { @@ -503,11 +494,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList) if (auto *f = std::get_if(&unit)) declareFunction(*f); - - if (newBuilder) { - delete builder; - builder = nullptr; - } } /// Get the scope that is defining or using \p sym. The returned scope is not @@ -5624,9 +5610,9 @@ class FirConverter : public Fortran::lower::AbstractConverter { LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]"; if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym; llvm::dbgs() << "\n"); - // I don't think setting the builder is necessary here, because callee + // Setting the builder is not necessary here, because callee // always looks up the FuncOp from the module. If there was a function that - // was not declared yet. This call to callee will cause an assertion + // was not declared yet, this call to callee will cause an assertion // failure. 
Fortran::lower::CalleeInterface callee(funit, *this); mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments(); diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 602b5c7bfa6c6..8affa1e1965e8 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -21,7 +21,6 @@ #include "flang/Optimizer/Dialect/FIROpsSupport.h" #include "flang/Optimizer/Support/InternalNames.h" #include "flang/Optimizer/Support/Utils.h" -#include "flang/Parser/parse-tree.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" #include "flang/Support/Fortran.h" From flang-commits at lists.llvm.org Thu May 1 08:18:32 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 01 May 2025 08:18:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681390c8.170a0220.25ad07.ecb7@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/136012 >From 0f4591ee621e2e9d7acb0e6066b556cb7e243162 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Wed, 16 Apr 2025 12:01:24 -0700 Subject: [PATCH 01/10] initial commit --- flang/include/flang/Lower/AbstractConverter.h | 4 + flang/include/flang/Lower/OpenACC.h | 10 +- flang/include/flang/Semantics/symbol.h | 23 +- flang/lib/Lower/Bridge.cpp | 7 +- flang/lib/Lower/CallInterface.cpp | 10 + flang/lib/Lower/OpenACC.cpp | 197 ++++++++++++++---- flang/lib/Semantics/mod-file.cpp | 1 + flang/lib/Semantics/resolve-directives.cpp | 83 ++++---- 8 files changed, 233 insertions(+), 102 deletions(-) diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 1d1323642bf9c..59419e829718f 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -14,6 +14,7 @@ #define FORTRAN_LOWER_ABSTRACTCONVERTER_H #include 
"flang/Lower/LoweringOptions.h" +#include "flang/Lower/OpenACC.h" #include "flang/Lower/PFTDefs.h" #include "flang/Optimizer/Builder/BoxValue.h" #include "flang/Optimizer/Dialect/FIRAttr.h" @@ -357,6 +358,9 @@ class AbstractConverter { /// functions in order to be in sync). virtual mlir::SymbolTable *getMLIRSymbolTable() = 0; + virtual Fortran::lower::AccRoutineInfoMappingList & + getAccDelayedRoutines() = 0; + private: /// Options controlling lowering behavior. const Fortran::lower::LoweringOptions &loweringOptions; diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 0d7038a7fd856..7832e8b69ea23 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -22,6 +22,9 @@ class StringRef; } // namespace llvm namespace mlir { +namespace func { +class FuncOp; +} class Location; class Type; class ModuleOp; @@ -42,6 +45,7 @@ struct OpenACCRoutineConstruct; } // namespace parser namespace semantics { +class OpenACCRoutineInfo; class SemanticsContext; class Symbol; } // namespace semantics @@ -79,8 +83,10 @@ void genOpenACCDeclarativeConstruct(AbstractConverter &, void genOpenACCRoutineConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, mlir::ModuleOp, - const parser::OpenACCRoutineConstruct &, - AccRoutineInfoMappingList &); + const parser::OpenACCRoutineConstruct &); +void genOpenACCRoutineConstruct( + AbstractConverter &, mlir::ModuleOp, mlir::func::FuncOp, + const std::vector &); void finalizeOpenACCRoutineAttachment(mlir::ModuleOp, AccRoutineInfoMappingList &); diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 715811885c219..1b6b247c9f5bc 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -127,6 +127,8 @@ class WithBindName { // Device type specific OpenACC routine information class OpenACCRoutineDeviceTypeInfo { public: + 
OpenACCRoutineDeviceTypeInfo(Fortran::common::OpenACCDeviceType dType) + : deviceType_{dType} {} bool isSeq() const { return isSeq_; } void set_isSeq(bool value = true) { isSeq_ = value; } bool isVector() const { return isVector_; } @@ -141,9 +143,7 @@ class OpenACCRoutineDeviceTypeInfo { return bindName_ ? &*bindName_ : nullptr; } void set_bindName(std::string &&name) { bindName_ = std::move(name); } - void set_dType(Fortran::common::OpenACCDeviceType dType) { - deviceType_ = dType; - } + Fortran::common::OpenACCDeviceType dType() const { return deviceType_; } private: @@ -162,13 +162,24 @@ class OpenACCRoutineDeviceTypeInfo { // in as objects in the OpenACCRoutineDeviceTypeInfo list. class OpenACCRoutineInfo : public OpenACCRoutineDeviceTypeInfo { public: + OpenACCRoutineInfo() + : OpenACCRoutineDeviceTypeInfo(Fortran::common::OpenACCDeviceType::None) { + } bool isNohost() const { return isNohost_; } void set_isNohost(bool value = true) { isNohost_ = value; } - std::list &deviceTypeInfos() { + const std::list &deviceTypeInfos() const { return deviceTypeInfos_; } - void add_deviceTypeInfo(OpenACCRoutineDeviceTypeInfo &info) { - deviceTypeInfos_.push_back(info); + + OpenACCRoutineDeviceTypeInfo &add_deviceTypeInfo( + Fortran::common::OpenACCDeviceType type) { + return add_deviceTypeInfo(OpenACCRoutineDeviceTypeInfo(type)); + } + + OpenACCRoutineDeviceTypeInfo &add_deviceTypeInfo( + OpenACCRoutineDeviceTypeInfo &&info) { + deviceTypeInfos_.push_back(std::move(info)); + return deviceTypeInfos_.back(); } private: diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index b4d1197822a43..9285d587585f8 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -443,7 +443,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); Fortran::lower::genOpenACCRoutineConstruct( *this, bridge.getSemanticsContext(), bridge.getModule(), - d.routine, accRoutineInfos); + 
d.routine); builder = nullptr; }, }, @@ -4287,6 +4287,11 @@ class FirConverter : public Fortran::lower::AbstractConverter { return Fortran::lower::createMutableBox(loc, *this, expr, localSymbols); } + Fortran::lower::AccRoutineInfoMappingList & + getAccDelayedRoutines() override final { + return accRoutineInfos; + } + // Create the [newRank] array with the lower bounds to be passed to the // runtime as a descriptor. mlir::Value createLboundArray(llvm::ArrayRef lbounds, diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 226ba1e52c968..867248f16237e 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -1689,6 +1689,16 @@ class SignatureBuilder "SignatureBuilder should only be used once"); declare(); interfaceDetermined = true; + if (procDesignator && procDesignator->GetInterfaceSymbol() && + procDesignator->GetInterfaceSymbol() + ->has()) { + auto info = procDesignator->GetInterfaceSymbol() + ->get(); + if (!info.openACCRoutineInfos().empty()) { + genOpenACCRoutineConstruct(converter, converter.getModuleOp(), + getFuncOp(), info.openACCRoutineInfos()); + } + } return getFuncOp(); } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 3dd35ed9ae481..37b660408af6c 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -38,6 +38,7 @@ #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" +#include #define DEBUG_TYPE "flang-lower-openacc" @@ -4139,11 +4140,152 @@ static void attachRoutineInfo(mlir::func::FuncOp func, mlir::acc::RoutineInfoAttr::get(func.getContext(), routines)); } +void createOpenACCRoutineConstruct( + Fortran::lower::AbstractConverter &converter, mlir::Location loc, + mlir::ModuleOp mod, mlir::func::FuncOp funcOp, std::string funcName, + bool hasNohost, llvm::SmallVector &bindNames, + llvm::SmallVector &bindNameDeviceTypes, + llvm::SmallVector &gangDeviceTypes, + 
llvm::SmallVector &gangDimValues, + llvm::SmallVector &gangDimDeviceTypes, + llvm::SmallVector &seqDeviceTypes, + llvm::SmallVector &workerDeviceTypes, + llvm::SmallVector &vectorDeviceTypes) { + + std::stringstream routineOpName; + routineOpName << accRoutinePrefix.str() << routineCounter++; + + for (auto routineOp : mod.getOps()) { + if (routineOp.getFuncName().str().compare(funcName) == 0) { + // If the routine is already specified with the same clauses, just skip + // the operation creation. + if (compareDeviceTypeInfo(routineOp, bindNames, bindNameDeviceTypes, + gangDeviceTypes, gangDimValues, + gangDimDeviceTypes, seqDeviceTypes, + workerDeviceTypes, vectorDeviceTypes) && + routineOp.getNohost() == hasNohost) + return; + mlir::emitError(loc, "Routine already specified with different clauses"); + } + } + std::string routineOpStr = routineOpName.str(); + mlir::OpBuilder modBuilder(mod.getBodyRegion()); + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + modBuilder.create( + loc, routineOpStr, funcName, + bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames), + bindNameDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(bindNameDeviceTypes), + workerDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(workerDeviceTypes), + vectorDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(vectorDeviceTypes), + seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes), + hasNohost, /*implicit=*/false, + gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes), + gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues), + gangDimDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(gangDimDeviceTypes)); + + if (funcOp) + attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr)); + else + // FuncOp is not lowered yet. Keep the information so the routine info + // can be attached later to the funcOp. 
+ converter.getAccDelayedRoutines().push_back( + std::make_pair(funcName, builder.getSymbolRefAttr(routineOpStr))); +} + +static void interpretRoutineDeviceInfo( + fir::FirOpBuilder &builder, + const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo, + llvm::SmallVector &seqDeviceTypes, + llvm::SmallVector &vectorDeviceTypes, + llvm::SmallVector &workerDeviceTypes, + llvm::SmallVector &bindNameDeviceTypes, + llvm::SmallVector &bindNames, + llvm::SmallVector &gangDeviceTypes, + llvm::SmallVector &gangDimValues, + llvm::SmallVector &gangDimDeviceTypes) { + mlir::MLIRContext *context{builder.getContext()}; + if (dinfo.isSeq()) { + seqDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } + if (dinfo.isVector()) { + vectorDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } + if (dinfo.isWorker()) { + workerDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } + if (dinfo.isGang()) { + unsigned gangDim = dinfo.gangDim(); + auto deviceType = + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())); + if (!gangDim) { + gangDeviceTypes.push_back(deviceType); + } else { + gangDimValues.push_back( + builder.getIntegerAttr(builder.getI64Type(), gangDim)); + gangDimDeviceTypes.push_back(deviceType); + } + } + if (const std::string *bindName{dinfo.bindName()}) { + bindNames.push_back(builder.getStringAttr(*bindName)); + bindNameDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } +} + +void Fortran::lower::genOpenACCRoutineConstruct( + Fortran::lower::AbstractConverter &converter, mlir::ModuleOp mod, + mlir::func::FuncOp funcOp, + const std::vector &routineInfos) { + CHECK(funcOp && "Expected a valid function operation"); + fir::FirOpBuilder &builder{converter.getFirOpBuilder()}; + mlir::Location loc{funcOp.getLoc()}; + std::string funcName{funcOp.getName()}; + + // 
Collect the routine clauses + bool hasNohost{false}; + + llvm::SmallVector seqDeviceTypes, vectorDeviceTypes, + workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimDeviceTypes, gangDimValues; + + for (const Fortran::semantics::OpenACCRoutineInfo &info : routineInfos) { + // Device Independent Attributes + if (info.isNohost()) { + hasNohost = true; + } + // Note: Device Independent Attributes are set to the + // none device type in `info`. + interpretRoutineDeviceInfo(builder, info, seqDeviceTypes, vectorDeviceTypes, + workerDeviceTypes, bindNameDeviceTypes, + bindNames, gangDeviceTypes, gangDimValues, + gangDimDeviceTypes); + + // Device Dependent Attributes + for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo : + info.deviceTypeInfos()) { + interpretRoutineDeviceInfo( + builder, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, + bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues, + gangDimDeviceTypes); + } + } + createOpenACCRoutineConstruct( + converter, loc, mod, funcOp, funcName, hasNohost, bindNames, + bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes, + seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); +} + void Fortran::lower::genOpenACCRoutineConstruct( Fortran::lower::AbstractConverter &converter, Fortran::semantics::SemanticsContext &semanticsContext, mlir::ModuleOp mod, - const Fortran::parser::OpenACCRoutineConstruct &routineConstruct, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { + const Fortran::parser::OpenACCRoutineConstruct &routineConstruct) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); mlir::Location loc = converter.genLocation(routineConstruct.source); std::optional name = @@ -4174,6 +4316,7 @@ void Fortran::lower::genOpenACCRoutineConstruct( funcName = funcOp.getName(); } } + // TODO: Refactor this to use the OpenACCRoutineInfo bool hasNohost = false; llvm::SmallVector seqDeviceTypes, vectorDeviceTypes, @@ -4226,6 +4369,8 
@@ void Fortran::lower::genOpenACCRoutineConstruct( std::get_if(&clause.u)) { if (const auto *name = std::get_if(&bindClause->v.u)) { + // FIXME: This case mangles the name, the one below does not. + // which is correct? mlir::Attribute bindNameAttr = builder.getStringAttr(converter.mangleName(*name->symbol)); for (auto crtDeviceTypeAttr : crtDeviceTypes) { @@ -4255,47 +4400,10 @@ void Fortran::lower::genOpenACCRoutineConstruct( } } - mlir::OpBuilder modBuilder(mod.getBodyRegion()); - std::stringstream routineOpName; - routineOpName << accRoutinePrefix.str() << routineCounter++; - - for (auto routineOp : mod.getOps()) { - if (routineOp.getFuncName().str().compare(funcName) == 0) { - // If the routine is already specified with the same clauses, just skip - // the operation creation. - if (compareDeviceTypeInfo(routineOp, bindNames, bindNameDeviceTypes, - gangDeviceTypes, gangDimValues, - gangDimDeviceTypes, seqDeviceTypes, - workerDeviceTypes, vectorDeviceTypes) && - routineOp.getNohost() == hasNohost) - return; - mlir::emitError(loc, "Routine already specified with different clauses"); - } - } - - modBuilder.create( - loc, routineOpName.str(), funcName, - bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames), - bindNameDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(bindNameDeviceTypes), - workerDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(workerDeviceTypes), - vectorDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(vectorDeviceTypes), - seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes), - hasNohost, /*implicit=*/false, - gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes), - gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues), - gangDimDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(gangDimDeviceTypes)); - - if (funcOp) - attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpName.str())); - else - // FuncOp is not lowered yet. 
Keep the information so the routine info - // can be attached later to the funcOp. - accRoutineInfos.push_back(std::make_pair( - funcName, builder.getSymbolRefAttr(routineOpName.str()))); + createOpenACCRoutineConstruct( + converter, loc, mod, funcOp, funcName, hasNohost, bindNames, + bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes, + seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); } void Fortran::lower::finalizeOpenACCRoutineAttachment( @@ -4443,8 +4551,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct( fir::FirOpBuilder &builder = converter.getFirOpBuilder(); mlir::ModuleOp mod = builder.getModule(); Fortran::lower::genOpenACCRoutineConstruct( - converter, semanticsContext, mod, routineConstruct, - accRoutineInfos); + converter, semanticsContext, mod, routineConstruct); }, }, accDeclConstruct.u); diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index ee356e56e4458..befd204a671fc 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1387,6 +1387,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, parser::Options options; options.isModuleFile = true; options.features.Enable(common::LanguageFeature::BackslashEscapes); + options.features.Enable(common::LanguageFeature::OpenACC); options.features.Enable(common::LanguageFeature::OpenMP); options.features.Enable(common::LanguageFeature::CUDA); if (!isIntrinsic.value_or(false) && !notAModule) { diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index d75b4ea13d35f..93c334a3ca3cb 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1034,61 +1034,53 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Symbol &symbol, const parser::OpenACCRoutineConstruct &x) { if (symbol.has()) { Fortran::semantics::OpenACCRoutineInfo info; + std::vector currentDevices; + currentDevices.push_back(&info); 
const auto &clauses = std::get(x.t); for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isSeq(); - } else { - info.deviceTypeInfos().back().set_isSeq(); + if (const auto *dTypeClause = + std::get_if(&clause.u)) { + currentDevices.clear(); + for (const auto &deviceTypeExpr : dTypeClause->v.v) { + currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v)); } + } else if (std::get_if(&clause.u)) { + info.set_isNohost(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) + device->set_isSeq(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) + device->set_isVector(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) + device->set_isWorker(); } else if (const auto *gangClause = std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isGang(); - } else { - info.deviceTypeInfos().back().set_isGang(); - } + for (auto &device : currentDevices) + device->set_isGang(); if (gangClause->v) { const Fortran::parser::AccGangArgList &x = *gangClause->v; + int numArgs{0}; for (const Fortran::parser::AccGangArg &gangArg : x.v) { + CHECK(numArgs <= 1 && "expecting 0 or 1 gang dim args"); if (const auto *dim = std::get_if(&gangArg.u)) { if (const auto v{EvaluateInt64(context_, dim->v)}) { - if (info.deviceTypeInfos().empty()) { - info.set_gangDim(*v); - } else { - info.deviceTypeInfos().back().set_gangDim(*v); - } + for (auto &device : currentDevices) + device->set_gangDim(*v); } } + numArgs++; } } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isVector(); - } else { - info.deviceTypeInfos().back().set_isVector(); - } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isWorker(); - } else { - info.deviceTypeInfos().back().set_isWorker(); - } - } else if (std::get_if(&clause.u)) { - info.set_isNohost(); } 
else if (const auto *bindClause = std::get_if(&clause.u)) { + std::string bindName = ""; if (const auto *name = std::get_if(&bindClause->v.u)) { if (Symbol *sym = ResolveFctName(*name)) { - if (info.deviceTypeInfos().empty()) { - info.set_bindName(sym->name().ToString()); - } else { - info.deviceTypeInfos().back().set_bindName( - sym->name().ToString()); - } + bindName = sym->name().ToString(); } else { context_.Say((*name).source, "No function or subroutine declared for '%s'"_err_en_US, @@ -1101,21 +1093,16 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Fortran::parser::Unwrap( *charExpr); std::string str{std::get(charConst->t)}; - std::stringstream bindName; - bindName << "\"" << str << "\""; - if (info.deviceTypeInfos().empty()) { - info.set_bindName(bindName.str()); - } else { - info.deviceTypeInfos().back().set_bindName(bindName.str()); + std::stringstream bindNameStream; + bindNameStream << "\"" << str << "\""; + bindName = bindNameStream.str(); + } + if (!bindName.empty()) { + // Fixme: do we need to ensure there there is only one device? 
+ for (auto &device : currentDevices) { + device->set_bindName(std::string(bindName)); } } - } else if (const auto *dType = - std::get_if( - &clause.u)) { - const parser::AccDeviceTypeExprList &deviceTypeExprList = dType->v; - OpenACCRoutineDeviceTypeInfo dtypeInfo; - dtypeInfo.set_dType(deviceTypeExprList.v.front().v); - info.add_deviceTypeInfo(dtypeInfo); } } symbol.get().add_openACCRoutineInfo(info); >From 1b6da293788edc56eea566f5c15126de6955169c Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Tue, 22 Apr 2025 16:06:33 -0700 Subject: [PATCH 02/10] fix includes --- flang/lib/Lower/OpenACC.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 37b660408af6c..a3ebd9b931dc6 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -32,13 +32,13 @@ #include "flang/Semantics/expression.h" #include "flang/Semantics/scope.h" #include "flang/Semantics/tools.h" -#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" -#include "mlir/Support/LLVM.h" #include "llvm/ADT/STLExtras.h" #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" -#include +#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" +#include "mlir/IR/MLIRContext.h" +#include "mlir/Support/LLVM.h" #define DEBUG_TYPE "flang-lower-openacc" >From 7b65ac4c477e5e46bf369a3a9f94f69cf496ef6b Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Wed, 23 Apr 2025 13:50:19 -0700 Subject: [PATCH 03/10] adding test --- flang/include/flang/Semantics/symbol.h | 7 +++++- flang/lib/Semantics/resolve-directives.cpp | 6 ++++- .../Lower/OpenACC/acc-module-definition.f90 | 17 ++++++++++++++ .../Lower/OpenACC/acc-routine-use-module.f90 | 23 +++++++++++++++++++ 4 files changed, 51 insertions(+), 2 deletions(-) create mode 100644 flang/test/Lower/OpenACC/acc-module-definition.f90 create mode 100644 flang/test/Lower/OpenACC/acc-routine-use-module.f90 diff 
--git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 1b6b247c9f5bc..fe6c73997733a 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -142,7 +142,11 @@ class OpenACCRoutineDeviceTypeInfo { const std::string *bindName() const { return bindName_ ? &*bindName_ : nullptr; } - void set_bindName(std::string &&name) { bindName_ = std::move(name); } + bool bindNameIsInternal() const {return bindNameIsInternal_;} + void set_bindName(std::string &&name, bool isInternal=false) { + bindName_ = std::move(name); + bindNameIsInternal_ = isInternal; + } Fortran::common::OpenACCDeviceType dType() const { return deviceType_; } @@ -153,6 +157,7 @@ class OpenACCRoutineDeviceTypeInfo { bool isGang_{false}; unsigned gangDim_{0}; std::optional bindName_; + bool bindNameIsInternal_{false}; Fortran::common::OpenACCDeviceType deviceType_{ Fortran::common::OpenACCDeviceType::None}; }; diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 93c334a3ca3cb..8fb3559c34426 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1077,10 +1077,12 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( } else if (const auto *bindClause = std::get_if(&clause.u)) { std::string bindName = ""; + bool isInternal = false; if (const auto *name = std::get_if(&bindClause->v.u)) { if (Symbol *sym = ResolveFctName(*name)) { bindName = sym->name().ToString(); + isInternal = true; } else { context_.Say((*name).source, "No function or subroutine declared for '%s'"_err_en_US, @@ -1100,12 +1102,14 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( if (!bindName.empty()) { // Fixme: do we need to ensure there there is only one device? 
for (auto &device : currentDevices) { - device->set_bindName(std::string(bindName)); + device->set_bindName(std::string(bindName), isInternal); } } } } symbol.get().add_openACCRoutineInfo(info); + } else { + llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() << "\n"; } } diff --git a/flang/test/Lower/OpenACC/acc-module-definition.f90 b/flang/test/Lower/OpenACC/acc-module-definition.f90 new file mode 100644 index 0000000000000..36e41fc631c77 --- /dev/null +++ b/flang/test/Lower/OpenACC/acc-module-definition.f90 @@ -0,0 +1,17 @@ +! RUN: rm -fr %t && mkdir -p %t && cd %t +! RUN: bbc -fopenacc -emit-fir %s +! RUN: cat mod1.mod | FileCheck %s + +!CHECK-LABEL: module mod1 +module mod1 + contains + !CHECK subroutine callee(aa) + subroutine callee(aa) + !CHECK: !$acc routine seq + !$acc routine seq + integer :: aa + aa = 1 + end subroutine + !CHECK: end + !CHECK: end +end module \ No newline at end of file diff --git a/flang/test/Lower/OpenACC/acc-routine-use-module.f90 b/flang/test/Lower/OpenACC/acc-routine-use-module.f90 new file mode 100644 index 0000000000000..7fc96b0ef5684 --- /dev/null +++ b/flang/test/Lower/OpenACC/acc-routine-use-module.f90 @@ -0,0 +1,23 @@ +! RUN: rm -fr %t && mkdir -p %t && cd %t +! RUN: bbc -emit-fir %S/acc-module-definition.f90 +! RUN: bbc -emit-fir %s -o - | FileCheck %s + +! This test module is based off of flang/test/Lower/use_module.f90 +! The first runs ensures the module file is generated. 
+ +module use_mod1 + use mod1 + contains + !CHECK: func.func @_QMuse_mod1Pcaller + !CHECK-SAME { + subroutine caller(aa) + integer :: aa + !$acc serial + !CHECK: fir.call @_QMmod1Pcallee + call callee(aa) + !$acc end serial + end subroutine + !CHECK: } + !CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq + !CHECK: func.func private @_QMmod1Pcallee(!fir.ref) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>} +end module \ No newline at end of file >From 70f8d469346d22597c7b3ff38b2f4a84a82b6d85 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 25 Apr 2025 14:39:04 -0700 Subject: [PATCH 04/10] debugging failure --- flang/include/flang/Lower/OpenACC.h | 7 ++ flang/include/flang/Semantics/symbol.h | 21 +++-- flang/lib/Lower/Bridge.cpp | 21 +++-- flang/lib/Lower/CallInterface.cpp | 21 ++--- flang/lib/Lower/OpenACC.cpp | 87 +++++++++++-------- flang/lib/Semantics/mod-file.cpp | 11 ++- flang/lib/Semantics/resolve-directives.cpp | 16 ++-- flang/lib/Semantics/symbol.cpp | 46 ++++++++++ .../test/Lower/OpenACC/acc-routine-named.f90 | 10 ++- .../Lower/OpenACC/acc-routine-use-module.f90 | 6 +- flang/test/Lower/OpenACC/acc-routine.f90 | 63 ++++++++------ 11 files changed, 199 insertions(+), 110 deletions(-) diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 7832e8b69ea23..dc014a71526c3 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -37,11 +37,16 @@ class FirOpBuilder; } namespace Fortran { +namespace evaluate { +class ProcedureDesignator; +} // namespace evaluate + namespace parser { struct AccClauseList; struct OpenACCConstruct; struct OpenACCDeclarativeConstruct; struct OpenACCRoutineConstruct; +struct ProcedureDesignator; } // namespace parser namespace semantics { @@ -71,6 +76,8 @@ static constexpr llvm::StringRef declarePostDeallocSuffix = static constexpr llvm::StringRef privatizationRecipePrefix = "privatization"; +bool 
needsOpenACCRoutineConstruct(const Fortran::evaluate::ProcedureDesignator *); + mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, pft::Evaluation &, diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index fe6c73997733a..8c60a196bdfc1 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -22,6 +22,7 @@ #include #include #include +#include #include namespace llvm { @@ -139,25 +140,26 @@ class OpenACCRoutineDeviceTypeInfo { void set_isGang(bool value = true) { isGang_ = value; } unsigned gangDim() const { return gangDim_; } void set_gangDim(unsigned value) { gangDim_ = value; } - const std::string *bindName() const { - return bindName_ ? &*bindName_ : nullptr; + const std::variant *bindName() const { + return bindName_.has_value() ? &*bindName_ : nullptr; } - bool bindNameIsInternal() const {return bindNameIsInternal_;} - void set_bindName(std::string &&name, bool isInternal=false) { - bindName_ = std::move(name); - bindNameIsInternal_ = isInternal; + const std::optional> &bindNameOpt() const { + return bindName_; } + void set_bindName(std::string &&name) { bindName_.emplace(std::move(name)); } + void set_bindName(SymbolRef symbol) { bindName_.emplace(symbol); } Fortran::common::OpenACCDeviceType dType() const { return deviceType_; } + friend llvm::raw_ostream &operator<<( + llvm::raw_ostream &, const OpenACCRoutineDeviceTypeInfo &); private: bool isSeq_{false}; bool isVector_{false}; bool isWorker_{false}; bool isGang_{false}; unsigned gangDim_{0}; - std::optional bindName_; - bool bindNameIsInternal_{false}; + std::optional> bindName_; Fortran::common::OpenACCDeviceType deviceType_{ Fortran::common::OpenACCDeviceType::None}; }; @@ -187,6 +189,9 @@ class OpenACCRoutineInfo : public OpenACCRoutineDeviceTypeInfo { return deviceTypeInfos_.back(); } + friend llvm::raw_ostream &operator<<( + llvm::raw_ostream &, const OpenACCRoutineInfo &); + 
private: std::list deviceTypeInfos_; bool isNohost_{false}; diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 9285d587585f8..abe07bcfdfcda 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -438,14 +438,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { [&](Fortran::lower::pft::ModuleLikeUnit &m) { lowerMod(m); }, [&](Fortran::lower::pft::BlockDataUnit &b) {}, [&](Fortran::lower::pft::CompilerDirectiveUnit &d) {}, - [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) { - builder = new fir::FirOpBuilder( - bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); - Fortran::lower::genOpenACCRoutineConstruct( - *this, bridge.getSemanticsContext(), bridge.getModule(), - d.routine); - builder = nullptr; - }, + [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {}, }, u); } @@ -472,6 +465,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { setCurrentPosition(funit.getStartingSourceLoc()); + builder = new fir::FirOpBuilder( + bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { funit.setActiveEntry(entryIndex); @@ -498,6 +493,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList) if (auto *f = std::get_if(&unit)) declareFunction(*f); + builder = nullptr; } /// Get the scope that is defining or using \p sym. 
The returned scope is not @@ -1035,7 +1031,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { return bridge.getSemanticsContext().FindScope(currentPosition); } - fir::FirOpBuilder &getFirOpBuilder() override final { return *builder; } + fir::FirOpBuilder &getFirOpBuilder() override final { + CHECK(builder && "builder is not set before calling getFirOpBuilder"); + return *builder; + } mlir::ModuleOp getModuleOp() override final { return bridge.getModule(); } @@ -5617,6 +5616,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]"; if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym; llvm::dbgs() << "\n"); + // I don't think setting the builder is necessary here, because callee + // always looks up the FuncOp from the module. If there was a function that + // was not declared yet. This call to callee will cause an assertion + //failure. Fortran::lower::CalleeInterface callee(funit, *this); mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments(); builder = diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 867248f16237e..b938354e6bcb3 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -10,6 +10,7 @@ #include "flang/Evaluate/fold.h" #include "flang/Lower/Bridge.h" #include "flang/Lower/Mangler.h" +#include "flang/Lower/OpenACC.h" #include "flang/Lower/PFTBuilder.h" #include "flang/Lower/StatementContext.h" #include "flang/Lower/Support/Utils.h" @@ -20,6 +21,7 @@ #include "flang/Optimizer/Dialect/FIROpsSupport.h" #include "flang/Optimizer/Support/InternalNames.h" #include "flang/Optimizer/Support/Utils.h" +#include "flang/Parser/parse-tree.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" #include "flang/Support/Fortran.h" @@ -715,6 +717,14 @@ void Fortran::lower::CallInterface::declare() { func.setArgAttrs(placeHolder.index(), placeHolder.value().attributes); 
setCUDAAttributes(func, side().getProcedureSymbol(), characteristic); + + if (const Fortran::semantics::Symbol *sym = side().getProcedureSymbol()) { + if (const auto &info{sym->GetUltimate().detailsIf()}) { + if (!info->openACCRoutineInfos().empty()) { + genOpenACCRoutineConstruct(converter, module, func, info->openACCRoutineInfos()); + } + } + } } } } @@ -1688,17 +1698,8 @@ class SignatureBuilder fir::emitFatalError(converter.getCurrentLocation(), "SignatureBuilder should only be used once"); declare(); + interfaceDetermined = true; - if (procDesignator && procDesignator->GetInterfaceSymbol() && - procDesignator->GetInterfaceSymbol() - ->has()) { - auto info = procDesignator->GetInterfaceSymbol() - ->get(); - if (!info.openACCRoutineInfos().empty()) { - genOpenACCRoutineConstruct(converter, converter.getModuleOp(), - getFuncOp(), info.openACCRoutineInfos()); - } - } return getFuncOp(); } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index a3ebd9b931dc6..eefa8fbf12b1a 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -36,6 +36,7 @@ #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" +#include "llvm/Support/ErrorHandling.h" #include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" #include "mlir/IR/MLIRContext.h" #include "mlir/Support/LLVM.h" @@ -4140,6 +4141,14 @@ static void attachRoutineInfo(mlir::func::FuncOp func, mlir::acc::RoutineInfoAttr::get(func.getContext(), routines)); } +static mlir::ArrayAttr getArrayAttrOrNull(fir::FirOpBuilder &builder, llvm::SmallVector &attributes) { + if (attributes.empty()) { + return nullptr; + } else { + return builder.getArrayAttr(attributes); + } +} + void createOpenACCRoutineConstruct( Fortran::lower::AbstractConverter &converter, mlir::Location loc, mlir::ModuleOp mod, mlir::func::FuncOp funcOp, std::string funcName, @@ -4173,31 +4182,29 @@ void createOpenACCRoutineConstruct( fir::FirOpBuilder &builder = 
converter.getFirOpBuilder(); modBuilder.create( loc, routineOpStr, funcName, - bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames), - bindNameDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(bindNameDeviceTypes), - workerDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(workerDeviceTypes), - vectorDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(vectorDeviceTypes), - seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes), + getArrayAttrOrNull(builder, bindNames), + getArrayAttrOrNull(builder, bindNameDeviceTypes), + getArrayAttrOrNull(builder, workerDeviceTypes), + getArrayAttrOrNull(builder, vectorDeviceTypes), + getArrayAttrOrNull(builder, seqDeviceTypes), hasNohost, /*implicit=*/false, - gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes), - gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues), - gangDimDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(gangDimDeviceTypes)); - - if (funcOp) - attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr)); - else + getArrayAttrOrNull(builder, gangDeviceTypes), + getArrayAttrOrNull(builder, gangDimValues), + getArrayAttrOrNull(builder, gangDimDeviceTypes)); + + auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr); + if (funcOp) { + + attachRoutineInfo(funcOp, symbolRefAttr); + } else { // FuncOp is not lowered yet. Keep the information so the routine info // can be attached later to the funcOp. 
- converter.getAccDelayedRoutines().push_back( - std::make_pair(funcName, builder.getSymbolRefAttr(routineOpStr))); + converter.getAccDelayedRoutines().push_back(std::make_pair(funcName, symbolRefAttr)); + } } static void interpretRoutineDeviceInfo( - fir::FirOpBuilder &builder, + Fortran::lower::AbstractConverter &converter, const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo, llvm::SmallVector &seqDeviceTypes, llvm::SmallVector &vectorDeviceTypes, @@ -4207,23 +4214,24 @@ static void interpretRoutineDeviceInfo( llvm::SmallVector &gangDeviceTypes, llvm::SmallVector &gangDimValues, llvm::SmallVector &gangDimDeviceTypes) { - mlir::MLIRContext *context{builder.getContext()}; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto getDeviceTypeAttr = [&]() -> mlir::Attribute { + auto context = builder.getContext(); + auto value = getDeviceType(dinfo.dType()); + return mlir::acc::DeviceTypeAttr::get(context, value ); + }; if (dinfo.isSeq()) { - seqDeviceTypes.push_back( - mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + seqDeviceTypes.push_back(getDeviceTypeAttr()); } if (dinfo.isVector()) { - vectorDeviceTypes.push_back( - mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + vectorDeviceTypes.push_back(getDeviceTypeAttr()); } if (dinfo.isWorker()) { - workerDeviceTypes.push_back( - mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + workerDeviceTypes.push_back(getDeviceTypeAttr()); } if (dinfo.isGang()) { unsigned gangDim = dinfo.gangDim(); - auto deviceType = - mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())); + auto deviceType = getDeviceTypeAttr(); if (!gangDim) { gangDeviceTypes.push_back(deviceType); } else { @@ -4232,10 +4240,18 @@ static void interpretRoutineDeviceInfo( gangDimDeviceTypes.push_back(deviceType); } } - if (const std::string *bindName{dinfo.bindName()}) { - bindNames.push_back(builder.getStringAttr(*bindName)); - 
bindNameDeviceTypes.push_back( - mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + if (dinfo.bindNameOpt().has_value()) { + const auto &bindName = dinfo.bindNameOpt().value(); + mlir::Attribute bindNameAttr; + if (const auto &bindStr{std::get_if(&bindName)}) { + bindNameAttr = builder.getStringAttr(*bindStr); + } else if (const auto &bindSym{std::get_if(&bindName)}) { + bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym)); + } else { + llvm_unreachable("Unsupported bind name type"); + } + bindNames.push_back(bindNameAttr); + bindNameDeviceTypes.push_back(getDeviceTypeAttr()); } } @@ -4244,7 +4260,6 @@ void Fortran::lower::genOpenACCRoutineConstruct( mlir::func::FuncOp funcOp, const std::vector &routineInfos) { CHECK(funcOp && "Expected a valid function operation"); - fir::FirOpBuilder &builder{converter.getFirOpBuilder()}; mlir::Location loc{funcOp.getLoc()}; std::string funcName{funcOp.getName()}; @@ -4262,7 +4277,7 @@ void Fortran::lower::genOpenACCRoutineConstruct( } // Note: Device Independent Attributes are set to the // none device type in `info`. 
- interpretRoutineDeviceInfo(builder, info, seqDeviceTypes, vectorDeviceTypes, + interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues, gangDimDeviceTypes); @@ -4271,7 +4286,7 @@ void Fortran::lower::genOpenACCRoutineConstruct( for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo : info.deviceTypeInfos()) { interpretRoutineDeviceInfo( - builder, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, + converter, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues, gangDimDeviceTypes); } @@ -4369,8 +4384,6 @@ void Fortran::lower::genOpenACCRoutineConstruct( std::get_if(&clause.u)) { if (const auto *name = std::get_if(&bindClause->v.u)) { - // FIXME: This case mangles the name, the one below does not. - // which is correct? mlir::Attribute bindNameAttr = builder.getStringAttr(converter.mangleName(*name->symbol)); for (auto crtDeviceTypeAttr : crtDeviceTypes) { diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index befd204a671fc..76dc8db590f22 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -24,6 +24,7 @@ #include #include #include +#include #include namespace Fortran::semantics { @@ -638,8 +639,14 @@ static void PutOpenACCDeviceTypeRoutineInfo( if (info.isWorker()) { os << " worker"; } - if (info.bindName()) { - os << " bind(" << *info.bindName() << ")"; + if (const std::variant *bindName{info.bindName()}) { + os << " bind("; + if (std::holds_alternative(*bindName)) { + os << "\"" << std::get(*bindName) << "\""; + } else { + os << std::get(*bindName)->name(); + } + os << ")"; } } diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 8fb3559c34426..a8f00b546306e 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ 
b/flang/lib/Semantics/resolve-directives.cpp @@ -1076,13 +1076,13 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( } } else if (const auto *bindClause = std::get_if(&clause.u)) { - std::string bindName = ""; - bool isInternal = false; if (const auto *name = std::get_if(&bindClause->v.u)) { if (Symbol *sym = ResolveFctName(*name)) { - bindName = sym->name().ToString(); - isInternal = true; + Symbol &ultimate{sym->GetUltimate()}; + for (auto &device : currentDevices) { + device->set_bindName(SymbolRef(ultimate)); + } } else { context_.Say((*name).source, "No function or subroutine declared for '%s'"_err_en_US, @@ -1095,14 +1095,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Fortran::parser::Unwrap( *charExpr); std::string str{std::get(charConst->t)}; - std::stringstream bindNameStream; - bindNameStream << "\"" << str << "\""; - bindName = bindNameStream.str(); - } - if (!bindName.empty()) { - // Fixme: do we need to ensure there there is only one device? for (auto &device : currentDevices) { - device->set_bindName(std::string(bindName), isInternal); + device->set_bindName(std::string(str)); } } } diff --git a/flang/lib/Semantics/symbol.cpp b/flang/lib/Semantics/symbol.cpp index 32eb6c2c5a188..d44df4669fa36 100644 --- a/flang/lib/Semantics/symbol.cpp +++ b/flang/lib/Semantics/symbol.cpp @@ -144,6 +144,52 @@ llvm::raw_ostream &operator<<( os << ' ' << x; } } + if (!x.openACCRoutineInfos_.empty()) { + os << " openACCRoutineInfos:"; + for (const auto x : x.openACCRoutineInfos_) { + os << x; + } + } + return os; +} + +llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, const OpenACCRoutineDeviceTypeInfo &x) { + if (x.dType() != common::OpenACCDeviceType::None) { + os << " deviceType(" << common::EnumToString(x.dType()) << ')'; + } + if (x.isSeq()) { + os << " seq"; + } + if (x.isVector()) { + os << " vector"; + } + if (x.isWorker()) { + os << " worker"; + } + if (x.isGang()) { + os << " gang(" << x.gangDim() << ')'; + } + if (const auto 
*bindName{x.bindName()}) { + if (const auto &symbol{std::get_if(bindName)}) { + os << " bindName(\"" << *symbol << "\")"; + } else { + const SymbolRef s{std::get(*bindName)}; + os << " bindName(" << s->name() << ")"; + } + } + return os; +} + +llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, const OpenACCRoutineInfo &x) { + if (x.isNohost()) { + os << " nohost"; + } + os << static_cast(x); + for (const auto &d : x.deviceTypeInfos_) { + os << d; + } return os; } diff --git a/flang/test/Lower/OpenACC/acc-routine-named.f90 b/flang/test/Lower/OpenACC/acc-routine-named.f90 index 2cf6bf8b2bc06..de9784a1146cc 100644 --- a/flang/test/Lower/OpenACC/acc-routine-named.f90 +++ b/flang/test/Lower/OpenACC/acc-routine-named.f90 @@ -4,8 +4,8 @@ module acc_routines -! CHECK: acc.routine @acc_routine_1 func(@_QMacc_routinesPacc2) -! CHECK: acc.routine @acc_routine_0 func(@_QMacc_routinesPacc1) seq +! CHECK: acc.routine @[[r0:.*]] func(@_QMacc_routinesPacc2) +! CHECK: acc.routine @[[r1:.*]] func(@_QMacc_routinesPacc1) seq !$acc routine(acc1) seq @@ -14,12 +14,14 @@ module acc_routines subroutine acc1() end subroutine -! CHECK-LABEL: func.func @_QMacc_routinesPacc1() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>} +! CHECK-LABEL: func.func @_QMacc_routinesPacc1() +! CHECK-SAME:attributes {acc.routine_info = #acc.routine_info<[@[[r1]]]>} subroutine acc2() !$acc routine(acc2) end subroutine -! CHECK-LABEL: func.func @_QMacc_routinesPacc2() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_1]>} +! CHECK-LABEL: func.func @_QMacc_routinesPacc2() +! CHECK-SAME:attributes {acc.routine_info = #acc.routine_info<[@[[r0]]]>} end module diff --git a/flang/test/Lower/OpenACC/acc-routine-use-module.f90 b/flang/test/Lower/OpenACC/acc-routine-use-module.f90 index 7fc96b0ef5684..059324230a746 100644 --- a/flang/test/Lower/OpenACC/acc-routine-use-module.f90 +++ b/flang/test/Lower/OpenACC/acc-routine-use-module.f90 @@ -1,6 +1,6 @@ ! 
RUN: rm -fr %t && mkdir -p %t && cd %t -! RUN: bbc -emit-fir %S/acc-module-definition.f90 -! RUN: bbc -emit-fir %s -o - | FileCheck %s +! RUN: bbc -fopenacc -emit-fir %S/acc-module-definition.f90 +! RUN: bbc -fopenacc -emit-fir %s -o - | FileCheck %s ! This test module is based off of flang/test/Lower/use_module.f90 ! The first runs ensures the module file is generated. @@ -8,6 +8,7 @@ module use_mod1 use mod1 contains + !CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq !CHECK: func.func @_QMuse_mod1Pcaller !CHECK-SAME { subroutine caller(aa) @@ -18,6 +19,5 @@ subroutine caller(aa) !$acc end serial end subroutine !CHECK: } - !CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq !CHECK: func.func private @_QMmod1Pcallee(!fir.ref) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>} end module \ No newline at end of file diff --git a/flang/test/Lower/OpenACC/acc-routine.f90 b/flang/test/Lower/OpenACC/acc-routine.f90 index 1170af18bc334..789f3a57e1f79 100644 --- a/flang/test/Lower/OpenACC/acc-routine.f90 +++ b/flang/test/Lower/OpenACC/acc-routine.f90 @@ -2,69 +2,77 @@ ! RUN: bbc -fopenacc -emit-hlfir %s -o - | FileCheck %s -! CHECK: acc.routine @acc_routine_17 func(@_QPacc_routine19) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type]) -! CHECK: acc.routine @acc_routine_16 func(@_QPacc_routine18) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type]) -! CHECK: acc.routine @acc_routine_15 func(@_QPacc_routine17) worker ([#acc.device_type]) vector ([#acc.device_type]) -! CHECK: acc.routine @acc_routine_14 func(@_QPacc_routine16) gang([#acc.device_type]) seq ([#acc.device_type]) -! CHECK: acc.routine @acc_routine_10 func(@_QPacc_routine11) seq -! CHECK: acc.routine @acc_routine_9 func(@_QPacc_routine10) seq -! CHECK: acc.routine @acc_routine_8 func(@_QPacc_routine9) bind("_QPacc_routine9a") -! 
CHECK: acc.routine @acc_routine_7 func(@_QPacc_routine8) bind("routine8_") -! CHECK: acc.routine @acc_routine_6 func(@_QPacc_routine7) gang(dim: 1 : i64) -! CHECK: acc.routine @acc_routine_5 func(@_QPacc_routine6) nohost -! CHECK: acc.routine @acc_routine_4 func(@_QPacc_routine5) worker -! CHECK: acc.routine @acc_routine_3 func(@_QPacc_routine4) vector -! CHECK: acc.routine @acc_routine_2 func(@_QPacc_routine3) gang -! CHECK: acc.routine @acc_routine_1 func(@_QPacc_routine2) seq -! CHECK: acc.routine @acc_routine_0 func(@_QPacc_routine1) +! CHECK: acc.routine @[[r14:.*]] func(@_QPacc_routine19) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type]) +! CHECK: acc.routine @[[r13:.*]] func(@_QPacc_routine18) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type]) +! CHECK: acc.routine @[[r12:.*]] func(@_QPacc_routine17) worker ([#acc.device_type]) vector ([#acc.device_type]) +! CHECK: acc.routine @[[r11:.*]] func(@_QPacc_routine16) gang([#acc.device_type]) seq ([#acc.device_type]) +! CHECK: acc.routine @[[r10:.*]] func(@_QPacc_routine11) seq +! CHECK: acc.routine @[[r09:.*]] func(@_QPacc_routine10) seq +! CHECK: acc.routine @[[r08:.*]] func(@_QPacc_routine9) bind("_QPacc_routine9a") +! CHECK: acc.routine @[[r07:.*]] func(@_QPacc_routine8) bind("routine8_") +! CHECK: acc.routine @[[r06:.*]] func(@_QPacc_routine7) gang(dim: 1 : i64) +! CHECK: acc.routine @[[r05:.*]] func(@_QPacc_routine6) nohost +! CHECK: acc.routine @[[r04:.*]] func(@_QPacc_routine5) worker +! CHECK: acc.routine @[[r03:.*]] func(@_QPacc_routine4) vector +! CHECK: acc.routine @[[r02:.*]] func(@_QPacc_routine3) gang +! CHECK: acc.routine @[[r01:.*]] func(@_QPacc_routine2) seq +! CHECK: acc.routine @[[r00:.*]] func(@_QPacc_routine1) subroutine acc_routine1() !$acc routine end subroutine -! CHECK-LABEL: func.func @_QPacc_routine1() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>} +! 
CHECK-LABEL: func.func @_QPacc_routine1() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r00]]]>} subroutine acc_routine2() !$acc routine seq end subroutine -! CHECK-LABEL: func.func @_QPacc_routine2() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_1]>} +! CHECK-LABEL: func.func @_QPacc_routine2() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r01]]]>} subroutine acc_routine3() !$acc routine gang end subroutine -! CHECK-LABEL: func.func @_QPacc_routine3() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_2]>} +! CHECK-LABEL: func.func @_QPacc_routine3() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r02]]]>} subroutine acc_routine4() !$acc routine vector end subroutine -! CHECK-LABEL: func.func @_QPacc_routine4() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_3]>} +! CHECK-LABEL: func.func @_QPacc_routine4() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r03]]]>} subroutine acc_routine5() !$acc routine worker end subroutine -! CHECK-LABEL: func.func @_QPacc_routine5() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_4]>} +! CHECK-LABEL: func.func @_QPacc_routine5() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r04]]]>} subroutine acc_routine6() !$acc routine nohost end subroutine -! CHECK-LABEL: func.func @_QPacc_routine6() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_5]>} +! CHECK-LABEL: func.func @_QPacc_routine6() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r05]]]>} subroutine acc_routine7() !$acc routine gang(dim:1) end subroutine -! CHECK-LABEL: func.func @_QPacc_routine7() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_6]>} +! CHECK-LABEL: func.func @_QPacc_routine7() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r06]]]>} subroutine acc_routine8() !$acc routine bind("routine8_") end subroutine -! 
CHECK-LABEL: func.func @_QPacc_routine8() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_7]>} +! CHECK-LABEL: func.func @_QPacc_routine8() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r07]]]>} subroutine acc_routine9a() end subroutine @@ -73,20 +81,23 @@ subroutine acc_routine9() !$acc routine bind(acc_routine9a) end subroutine -! CHECK-LABEL: func.func @_QPacc_routine9() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_8]>} +! CHECK-LABEL: func.func @_QPacc_routine9() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r08]]]>} function acc_routine10() !$acc routine(acc_routine10) seq end function -! CHECK-LABEL: func.func @_QPacc_routine10() -> f32 attributes {acc.routine_info = #acc.routine_info<[@acc_routine_9]>} +! CHECK-LABEL: func.func @_QPacc_routine10() -> f32 +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r09]]]>} subroutine acc_routine11(a) real :: a !$acc routine(acc_routine11) seq end subroutine -! CHECK-LABEL: func.func @_QPacc_routine11(%arg0: !fir.ref {fir.bindc_name = "a"}) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_10]>} +! CHECK-LABEL: func.func @_QPacc_routine11(%arg0: !fir.ref {fir.bindc_name = "a"}) +! 
CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r10]]]>} subroutine acc_routine12() >From e2d1a05d2de2356644d385e9099a7e6879143cc7 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 25 Apr 2025 14:40:14 -0700 Subject: [PATCH 05/10] clang-format --- flang/include/flang/Lower/OpenACC.h | 3 +- flang/include/flang/Semantics/symbol.h | 4 +- flang/lib/Lower/Bridge.cpp | 12 ++--- flang/lib/Lower/CallInterface.cpp | 9 ++-- flang/lib/Lower/OpenACC.cpp | 56 +++++++++++----------- flang/lib/Semantics/resolve-directives.cpp | 3 +- 6 files changed, 48 insertions(+), 39 deletions(-) diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index dc014a71526c3..35a33e751b52b 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -76,7 +76,8 @@ static constexpr llvm::StringRef declarePostDeallocSuffix = static constexpr llvm::StringRef privatizationRecipePrefix = "privatization"; -bool needsOpenACCRoutineConstruct(const Fortran::evaluate::ProcedureDesignator *); +bool needsOpenACCRoutineConstruct( + const Fortran::evaluate::ProcedureDesignator *); mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 8c60a196bdfc1..eb34aac9c390d 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -143,7 +143,8 @@ class OpenACCRoutineDeviceTypeInfo { const std::variant *bindName() const { return bindName_.has_value() ? 
&*bindName_ : nullptr; } - const std::optional> &bindNameOpt() const { + const std::optional> & + bindNameOpt() const { return bindName_; } void set_bindName(std::string &&name) { bindName_.emplace(std::move(name)); } @@ -153,6 +154,7 @@ class OpenACCRoutineDeviceTypeInfo { friend llvm::raw_ostream &operator<<( llvm::raw_ostream &, const OpenACCRoutineDeviceTypeInfo &); + private: bool isSeq_{false}; bool isVector_{false}; diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index abe07bcfdfcda..5e7b783323bfd 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -465,8 +465,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { setCurrentPosition(funit.getStartingSourceLoc()); - builder = new fir::FirOpBuilder( - bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); + builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), + &mlirSymbolTable); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { funit.setActiveEntry(entryIndex); @@ -1031,9 +1031,9 @@ class FirConverter : public Fortran::lower::AbstractConverter { return bridge.getSemanticsContext().FindScope(currentPosition); } - fir::FirOpBuilder &getFirOpBuilder() override final { + fir::FirOpBuilder &getFirOpBuilder() override final { CHECK(builder && "builder is not set before calling getFirOpBuilder"); - return *builder; + return *builder; } mlir::ModuleOp getModuleOp() override final { return bridge.getModule(); } @@ -5616,10 +5616,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]"; if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym; llvm::dbgs() << "\n"); - // I don't think setting the builder is necessary here, because callee + // I don't think setting the builder is necessary here, because callee // always looks 
up the FuncOp from the module. If there was a function that // was not declared yet. This call to callee will cause an assertion - //failure. + // failure. Fortran::lower::CalleeInterface callee(funit, *this); mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments(); builder = diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index b938354e6bcb3..611eacfe178e5 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -717,11 +717,14 @@ void Fortran::lower::CallInterface::declare() { func.setArgAttrs(placeHolder.index(), placeHolder.value().attributes); setCUDAAttributes(func, side().getProcedureSymbol(), characteristic); - + if (const Fortran::semantics::Symbol *sym = side().getProcedureSymbol()) { - if (const auto &info{sym->GetUltimate().detailsIf()}) { + if (const auto &info{ + sym->GetUltimate() + .detailsIf()}) { if (!info->openACCRoutineInfos().empty()) { - genOpenACCRoutineConstruct(converter, module, func, info->openACCRoutineInfos()); + genOpenACCRoutineConstruct(converter, module, func, + info->openACCRoutineInfos()); } } } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index eefa8fbf12b1a..891dc998bc596 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -32,14 +32,14 @@ #include "flang/Semantics/expression.h" #include "flang/Semantics/scope.h" #include "flang/Semantics/tools.h" +#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" +#include "mlir/IR/MLIRContext.h" +#include "mlir/Support/LLVM.h" #include "llvm/ADT/STLExtras.h" #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" #include "llvm/Support/ErrorHandling.h" -#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" -#include "mlir/IR/MLIRContext.h" -#include "mlir/Support/LLVM.h" #define DEBUG_TYPE "flang-lower-openacc" @@ -4141,7 +4141,9 @@ static void attachRoutineInfo(mlir::func::FuncOp func, 
mlir::acc::RoutineInfoAttr::get(func.getContext(), routines)); } -static mlir::ArrayAttr getArrayAttrOrNull(fir::FirOpBuilder &builder, llvm::SmallVector &attributes) { +static mlir::ArrayAttr +getArrayAttrOrNull(fir::FirOpBuilder &builder, + llvm::SmallVector &attributes) { if (attributes.empty()) { return nullptr; } else { @@ -4181,25 +4183,24 @@ void createOpenACCRoutineConstruct( mlir::OpBuilder modBuilder(mod.getBodyRegion()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); modBuilder.create( - loc, routineOpStr, funcName, - getArrayAttrOrNull(builder, bindNames), + loc, routineOpStr, funcName, getArrayAttrOrNull(builder, bindNames), getArrayAttrOrNull(builder, bindNameDeviceTypes), getArrayAttrOrNull(builder, workerDeviceTypes), getArrayAttrOrNull(builder, vectorDeviceTypes), - getArrayAttrOrNull(builder, seqDeviceTypes), - hasNohost, /*implicit=*/false, - getArrayAttrOrNull(builder, gangDeviceTypes), + getArrayAttrOrNull(builder, seqDeviceTypes), hasNohost, + /*implicit=*/false, getArrayAttrOrNull(builder, gangDeviceTypes), getArrayAttrOrNull(builder, gangDimValues), getArrayAttrOrNull(builder, gangDimDeviceTypes)); auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr); if (funcOp) { - + attachRoutineInfo(funcOp, symbolRefAttr); } else { // FuncOp is not lowered yet. Keep the information so the routine info // can be attached later to the funcOp. 
- converter.getAccDelayedRoutines().push_back(std::make_pair(funcName, symbolRefAttr)); + converter.getAccDelayedRoutines().push_back( + std::make_pair(funcName, symbolRefAttr)); } } @@ -4218,7 +4219,7 @@ static void interpretRoutineDeviceInfo( auto getDeviceTypeAttr = [&]() -> mlir::Attribute { auto context = builder.getContext(); auto value = getDeviceType(dinfo.dType()); - return mlir::acc::DeviceTypeAttr::get(context, value ); + return mlir::acc::DeviceTypeAttr::get(context, value); }; if (dinfo.isSeq()) { seqDeviceTypes.push_back(getDeviceTypeAttr()); @@ -4244,14 +4245,15 @@ static void interpretRoutineDeviceInfo( const auto &bindName = dinfo.bindNameOpt().value(); mlir::Attribute bindNameAttr; if (const auto &bindStr{std::get_if(&bindName)}) { - bindNameAttr = builder.getStringAttr(*bindStr); - } else if (const auto &bindSym{std::get_if(&bindName)}) { - bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym)); - } else { - llvm_unreachable("Unsupported bind name type"); - } - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(getDeviceTypeAttr()); + bindNameAttr = builder.getStringAttr(*bindStr); + } else if (const auto &bindSym{ + std::get_if(&bindName)}) { + bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym)); + } else { + llvm_unreachable("Unsupported bind name type"); + } + bindNames.push_back(bindNameAttr); + bindNameDeviceTypes.push_back(getDeviceTypeAttr()); } } @@ -4277,18 +4279,18 @@ void Fortran::lower::genOpenACCRoutineConstruct( } // Note: Device Independent Attributes are set to the // none device type in `info`. 
-    interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, vectorDeviceTypes,
-                               workerDeviceTypes, bindNameDeviceTypes,
-                               bindNames, gangDeviceTypes, gangDimValues,
-                               gangDimDeviceTypes);
+    interpretRoutineDeviceInfo(converter, info, seqDeviceTypes,
+                               vectorDeviceTypes, workerDeviceTypes,
+                               bindNameDeviceTypes, bindNames, gangDeviceTypes,
+                               gangDimValues, gangDimDeviceTypes);
 
     // Device Dependent Attributes
     for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo :
          info.deviceTypeInfos()) {
       interpretRoutineDeviceInfo(
-          converter, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes,
-          bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues,
-          gangDimDeviceTypes);
+          converter, dinfo, seqDeviceTypes, vectorDeviceTypes,
+          workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes,
+          gangDimValues, gangDimDeviceTypes);
     }
   }
   createOpenACCRoutineConstruct(
diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp
index a8f00b546306e..c2df7cddc0025 100644
--- a/flang/lib/Semantics/resolve-directives.cpp
+++ b/flang/lib/Semantics/resolve-directives.cpp
@@ -1103,7 +1103,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
     }
     symbol.get().add_openACCRoutineInfo(info);
   } else {
-    llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() << "\n";
+    llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name()
+                 << "\n";
   }
 }
 
>From c26093683edb7c0270809d2afb717450f92df6ab Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 25 Apr 2025 14:47:39 -0700
Subject: [PATCH 06/10] tidy up

---
 flang/include/flang/Lower/OpenACC.h | 1 -
 flang/lib/Lower/CallInterface.cpp   | 1 -
 2 files changed, 2 deletions(-)

diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h
index 35a33e751b52b..9e71ad0a15c89 100644
--- a/flang/include/flang/Lower/OpenACC.h
+++ b/flang/include/flang/Lower/OpenACC.h
@@ -46,7 +46,6 @@ struct AccClauseList;
 struct OpenACCConstruct;
 struct OpenACCDeclarativeConstruct;
 struct OpenACCRoutineConstruct;
-struct ProcedureDesignator;
 } // namespace parser
 
 namespace semantics {
diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp
index 611eacfe178e5..602b5c7bfa6c6 100644
--- a/flang/lib/Lower/CallInterface.cpp
+++ b/flang/lib/Lower/CallInterface.cpp
@@ -1701,7 +1701,6 @@ class SignatureBuilder
       fir::emitFatalError(converter.getCurrentLocation(),
                           "SignatureBuilder should only be used once");
     declare();
-    interfaceDetermined = true;
     return getFuncOp();
   }
 
>From 1b825b55c808ac92cd2866d855611cd585eb28db Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 25 Apr 2025 17:00:27 -0700
Subject: [PATCH 07/10] cleaning up unused code

---
 flang/include/flang/Lower/AbstractConverter.h |   3 -
 flang/include/flang/Lower/OpenACC.h           |  18 +-
 flang/lib/Lower/Bridge.cpp                    |  43 +++--
 flang/lib/Lower/OpenACC.cpp                   | 166 +-----------------
 flang/lib/Semantics/resolve-directives.cpp    |  12 +-
 5 files changed, 32 insertions(+), 210 deletions(-)

diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h
index 59419e829718f..2fa0da94b0396 100644
--- a/flang/include/flang/Lower/AbstractConverter.h
+++ b/flang/include/flang/Lower/AbstractConverter.h
@@ -358,9 +358,6 @@ class AbstractConverter {
   /// functions in order to be in sync).
   virtual mlir::SymbolTable *getMLIRSymbolTable() = 0;
 
-  virtual Fortran::lower::AccRoutineInfoMappingList &
-  getAccDelayedRoutines() = 0;
-
 private:
   /// Options controlling lowering behavior.
const Fortran::lower::LoweringOptions &loweringOptions; diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 9e71ad0a15c89..d2cd7712fb2c7 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -63,9 +63,6 @@ namespace pft { struct Evaluation; } // namespace pft -using AccRoutineInfoMappingList = - llvm::SmallVector>; - static constexpr llvm::StringRef declarePostAllocSuffix = "_acc_declare_update_desc_post_alloc"; static constexpr llvm::StringRef declarePreDeallocSuffix = @@ -82,22 +79,13 @@ mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, pft::Evaluation &, const parser::OpenACCConstruct &); -void genOpenACCDeclarativeConstruct(AbstractConverter &, - Fortran::semantics::SemanticsContext &, - StatementContext &, - const parser::OpenACCDeclarativeConstruct &, - AccRoutineInfoMappingList &); -void genOpenACCRoutineConstruct(AbstractConverter &, - Fortran::semantics::SemanticsContext &, - mlir::ModuleOp, - const parser::OpenACCRoutineConstruct &); +void genOpenACCDeclarativeConstruct( + AbstractConverter &, Fortran::semantics::SemanticsContext &, + StatementContext &, const parser::OpenACCDeclarativeConstruct &); void genOpenACCRoutineConstruct( AbstractConverter &, mlir::ModuleOp, mlir::func::FuncOp, const std::vector &); -void finalizeOpenACCRoutineAttachment(mlir::ModuleOp, - AccRoutineInfoMappingList &); - /// Get a acc.private.recipe op for the given type or create it if it does not /// exist yet. 
mlir::acc::PrivateRecipeOp createOrGetPrivateRecipe(mlir::OpBuilder &, diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 5e7b783323bfd..1615493003898 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -458,15 +458,25 @@ class FirConverter : public Fortran::lower::AbstractConverter { Fortran::common::LanguageFeature::CUDA)); }); - finalizeOpenACCLowering(); finalizeOpenMPLowering(globalOmpRequiresSymbol); } /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { + // Since this is a recursive function, we only need to create a new builder + // for each top-level declaration. It would be simpler to have a single + // builder for the entire translation unit, but that requires a lot of + // changes to the code. + // FIXME: Once createGlobalOutsideOfFunctionLowering is fixed, we can + // remove this code and share the module builder. + bool newBuilder = false; + if (!builder) { + newBuilder = true; + builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), + &mlirSymbolTable); + } + CHECK(builder && "FirOpBuilder did not instantiate"); setCurrentPosition(funit.getStartingSourceLoc()); - builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), - &mlirSymbolTable); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { funit.setActiveEntry(entryIndex); @@ -493,7 +503,11 @@ class FirConverter : public Fortran::lower::AbstractConverter { for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList) if (auto *f = std::get_if(&unit)) declareFunction(*f); - builder = nullptr; + + if (newBuilder) { + delete builder; + builder = nullptr; + } } /// Get the scope that is defining or using \p sym. 
The returned scope is not @@ -3017,8 +3031,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { void genFIR(const Fortran::parser::OpenACCDeclarativeConstruct &accDecl) { genOpenACCDeclarativeConstruct(*this, bridge.getSemanticsContext(), - bridge.openAccCtx(), accDecl, - accRoutineInfos); + bridge.openAccCtx(), accDecl); for (Fortran::lower::pft::Evaluation &e : getEval().getNestedEvaluations()) genFIR(e); } @@ -4286,11 +4299,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { return Fortran::lower::createMutableBox(loc, *this, expr, localSymbols); } - Fortran::lower::AccRoutineInfoMappingList & - getAccDelayedRoutines() override final { - return accRoutineInfos; - } - // Create the [newRank] array with the lower bounds to be passed to the // runtime as a descriptor. mlir::Value createLboundArray(llvm::ArrayRef lbounds, @@ -5889,7 +5897,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Helper to generate GlobalOps when the builder is not positioned in any /// region block. This is required because the FirOpBuilder assumes it is /// always positioned inside a region block when creating globals, the easiest - /// way comply is to create a dummy function and to throw it afterwards. + /// way comply is to create a dummy function and to throw it away afterwards. 
void createGlobalOutsideOfFunctionLowering( const std::function &createGlobals) { // FIXME: get rid of the bogus function context and instantiate the @@ -5902,6 +5910,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { mlir::FunctionType::get(context, std::nullopt, std::nullopt), symbolTable); func.addEntryBlock(); + CHECK(!builder && "Expected builder to be uninitialized"); builder = new fir::FirOpBuilder(func, bridge.getKindMap(), symbolTable); assert(builder && "FirOpBuilder did not instantiate"); builder->setFastMathFlags(bridge.getLoweringOptions().getMathOptions()); @@ -6331,13 +6340,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { expr.u); } - /// Performing OpenACC lowering action that were deferred to the end of - /// lowering. - void finalizeOpenACCLowering() { - Fortran::lower::finalizeOpenACCRoutineAttachment(getModuleOp(), - accRoutineInfos); - } - /// Performing OpenMP lowering actions that were deferred to the end of /// lowering. void finalizeOpenMPLowering( @@ -6429,9 +6431,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// A counter for uniquing names in `literalNamesMap`. std::uint64_t uniqueLitId = 0; - /// Deferred OpenACC routine attachment. 
-  Fortran::lower::AccRoutineInfoMappingList accRoutineInfos;
-
   /// Whether an OpenMP target region or declare target function/subroutine
   /// intended for device offloading has been detected
   bool ompDeviceCodeFound = false;
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 891dc998bc596..1a031dce7a487 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -4163,9 +4163,6 @@ void createOpenACCRoutineConstruct(
     llvm::SmallVector &workerDeviceTypes,
     llvm::SmallVector &vectorDeviceTypes) {
 
-  std::stringstream routineOpName;
-  routineOpName << accRoutinePrefix.str() << routineCounter++;
-
   for (auto routineOp : mod.getOps()) {
     if (routineOp.getFuncName().str().compare(funcName) == 0) {
       // If the routine is already specified with the same clauses, just skip
@@ -4179,6 +4176,8 @@ void createOpenACCRoutineConstruct(
       mlir::emitError(loc, "Routine already specified with different clauses");
     }
   }
+  std::stringstream routineOpName;
+  routineOpName << accRoutinePrefix.str() << routineCounter++;
   std::string routineOpStr = routineOpName.str();
   mlir::OpBuilder modBuilder(mod.getBodyRegion());
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
@@ -4192,16 +4191,7 @@ void createOpenACCRoutineConstruct(
       getArrayAttrOrNull(builder, gangDimValues),
       getArrayAttrOrNull(builder, gangDimDeviceTypes));
 
-  auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr);
-  if (funcOp) {
-
-    attachRoutineInfo(funcOp, symbolRefAttr);
-  } else {
-    // FuncOp is not lowered yet. Keep the information so the routine info
-    // can be attached later to the funcOp.
-    converter.getAccDelayedRoutines().push_back(
-        std::make_pair(funcName, symbolRefAttr));
-  }
+  attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr));
 }
 
 static void interpretRoutineDeviceInfo(
@@ -4299,145 +4289,6 @@ void Fortran::lower::genOpenACCRoutineConstruct(
       seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes);
 }
 
-void Fortran::lower::genOpenACCRoutineConstruct(
-    Fortran::lower::AbstractConverter &converter,
-    Fortran::semantics::SemanticsContext &semanticsContext, mlir::ModuleOp mod,
-    const Fortran::parser::OpenACCRoutineConstruct &routineConstruct) {
-  fir::FirOpBuilder &builder = converter.getFirOpBuilder();
-  mlir::Location loc = converter.genLocation(routineConstruct.source);
-  std::optional name =
-      std::get>(routineConstruct.t);
-  const auto &clauses =
-      std::get(routineConstruct.t);
-  mlir::func::FuncOp funcOp;
-  std::string funcName;
-  if (name) {
-    funcName = converter.mangleName(*name->symbol);
-    funcOp =
-        builder.getNamedFunction(mod, builder.getMLIRSymbolTable(), funcName);
-  } else {
-    Fortran::semantics::Scope &scope =
-        semanticsContext.FindScope(routineConstruct.source);
-    const Fortran::semantics::Scope &progUnit{GetProgramUnitContaining(scope)};
-    const auto *subpDetails{
-        progUnit.symbol()
-            ? progUnit.symbol()
-                  ->detailsIf()
-            : nullptr};
-    if (subpDetails && subpDetails->isInterface()) {
-      funcName = converter.mangleName(*progUnit.symbol());
-      funcOp =
-          builder.getNamedFunction(mod, builder.getMLIRSymbolTable(), funcName);
-    } else {
-      funcOp = builder.getFunction();
-      funcName = funcOp.getName();
-    }
-  }
-  // TODO: Refactor this to use the OpenACCRoutineInfo
-  bool hasNohost = false;
-
-  llvm::SmallVector seqDeviceTypes, vectorDeviceTypes,
-      workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes,
-      gangDimDeviceTypes, gangDimValues;
-
-  // device_type attribute is set to `none` until a device_type clause is
-  // encountered.
-  llvm::SmallVector crtDeviceTypes;
-  crtDeviceTypes.push_back(mlir::acc::DeviceTypeAttr::get(
-      builder.getContext(), mlir::acc::DeviceType::None));
-
-  for (const Fortran::parser::AccClause &clause : clauses.v) {
-    if (std::get_if(&clause.u)) {
-      for (auto crtDeviceTypeAttr : crtDeviceTypes)
-        seqDeviceTypes.push_back(crtDeviceTypeAttr);
-    } else if (const auto *gangClause =
-                   std::get_if(&clause.u)) {
-      if (gangClause->v) {
-        const Fortran::parser::AccGangArgList &x = *gangClause->v;
-        for (const Fortran::parser::AccGangArg &gangArg : x.v) {
-          if (const auto *dim =
-                  std::get_if(&gangArg.u)) {
-            const std::optional dimValue = Fortran::evaluate::ToInt64(
-                *Fortran::semantics::GetExpr(dim->v));
-            if (!dimValue)
-              mlir::emitError(loc,
-                              "dim value must be a constant positive integer");
-            mlir::Attribute gangDimAttr =
-                builder.getIntegerAttr(builder.getI64Type(), *dimValue);
-            for (auto crtDeviceTypeAttr : crtDeviceTypes) {
-              gangDimValues.push_back(gangDimAttr);
-              gangDimDeviceTypes.push_back(crtDeviceTypeAttr);
-            }
-          }
-        }
-      } else {
-        for (auto crtDeviceTypeAttr : crtDeviceTypes)
-          gangDeviceTypes.push_back(crtDeviceTypeAttr);
-      }
-    } else if (std::get_if(&clause.u)) {
-      for (auto crtDeviceTypeAttr : crtDeviceTypes)
-        vectorDeviceTypes.push_back(crtDeviceTypeAttr);
-    } else if (std::get_if(&clause.u)) {
-      for (auto crtDeviceTypeAttr : crtDeviceTypes)
-        workerDeviceTypes.push_back(crtDeviceTypeAttr);
-    } else if (std::get_if(&clause.u)) {
-      hasNohost = true;
-    } else if (const auto *bindClause =
-                   std::get_if(&clause.u)) {
-      if (const auto *name =
-              std::get_if(&bindClause->v.u)) {
-        mlir::Attribute bindNameAttr =
-            builder.getStringAttr(converter.mangleName(*name->symbol));
-        for (auto crtDeviceTypeAttr : crtDeviceTypes) {
-          bindNames.push_back(bindNameAttr);
-          bindNameDeviceTypes.push_back(crtDeviceTypeAttr);
-        }
-      } else if (const auto charExpr =
-                     std::get_if(
-                         &bindClause->v.u)) {
-        const std::optional name =
-            Fortran::semantics::GetConstExpr(semanticsContext,
-                                             *charExpr);
-        if (!name)
-          mlir::emitError(loc, "Could not retrieve the bind name");
-
-        mlir::Attribute bindNameAttr = builder.getStringAttr(*name);
-        for (auto crtDeviceTypeAttr : crtDeviceTypes) {
-          bindNames.push_back(bindNameAttr);
-          bindNameDeviceTypes.push_back(crtDeviceTypeAttr);
-        }
-      }
-    } else if (const auto *deviceTypeClause =
-                   std::get_if(
-                       &clause.u)) {
-      crtDeviceTypes.clear();
-      gatherDeviceTypeAttrs(builder, deviceTypeClause, crtDeviceTypes);
-    }
-  }
-
-  createOpenACCRoutineConstruct(
-      converter, loc, mod, funcOp, funcName, hasNohost, bindNames,
-      bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes,
-      seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes);
-}
-
-void Fortran::lower::finalizeOpenACCRoutineAttachment(
-    mlir::ModuleOp mod,
-    Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) {
-  for (auto &mapping : accRoutineInfos) {
-    mlir::func::FuncOp funcOp =
-        mod.lookupSymbol(mapping.first);
-    if (!funcOp)
-      mlir::emitWarning(mod.getLoc(),
-                        llvm::Twine("function '") + llvm::Twine(mapping.first) +
-                            llvm::Twine("' in acc routine directive is not "
-                                        "found in this translation unit."));
-    else
-      attachRoutineInfo(funcOp, mapping.second);
-  }
-  accRoutineInfos.clear();
-}
-
 static void
 genACC(Fortran::lower::AbstractConverter &converter,
        Fortran::lower::pft::Evaluation &eval,
@@ -4551,8 +4402,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct(
     Fortran::lower::AbstractConverter &converter,
     Fortran::semantics::SemanticsContext &semanticsContext,
     Fortran::lower::StatementContext &openAccCtx,
-    const Fortran::parser::OpenACCDeclarativeConstruct &accDeclConstruct,
-    Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) {
+    const Fortran::parser::OpenACCDeclarativeConstruct &accDeclConstruct) {
 
   Fortran::common::visit(
       common::visitors{
@@ -4561,13 +4411,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct(
             genACC(converter, semanticsContext, openAccCtx,
                    standaloneDeclarativeConstruct);
           },
-          [&](const Fortran::parser::OpenACCRoutineConstruct
-                  &routineConstruct) {
-            fir::FirOpBuilder &builder = converter.getFirOpBuilder();
-            mlir::ModuleOp mod = builder.getModule();
-            Fortran::lower::genOpenACCRoutineConstruct(
-                converter, semanticsContext, mod, routineConstruct);
-          },
+          [&](const Fortran::parser::OpenACCRoutineConstruct &x) {},
       },
       accDeclConstruct.u);
 }
diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp
index c2df7cddc0025..d74953df1e630 100644
--- a/flang/lib/Semantics/resolve-directives.cpp
+++ b/flang/lib/Semantics/resolve-directives.cpp
@@ -1041,9 +1041,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
       if (const auto *dTypeClause =
               std::get_if(&clause.u)) {
         currentDevices.clear();
-        for (const auto &deviceTypeExpr : dTypeClause->v.v) {
+        for (const auto &deviceTypeExpr : dTypeClause->v.v)
           currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v));
-        }
       } else if (std::get_if(&clause.u)) {
         info.set_isNohost();
       } else if (std::get_if(&clause.u)) {
@@ -1080,9 +1079,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
                 std::get_if(&bindClause->v.u)) {
           if (Symbol *sym = ResolveFctName(*name)) {
             Symbol &ultimate{sym->GetUltimate()};
-            for (auto &device : currentDevices) {
+            for (auto &device : currentDevices)
               device->set_bindName(SymbolRef(ultimate));
-            }
           } else {
             context_.Say((*name).source,
                 "No function or subroutine declared for '%s'"_err_en_US,
@@ -1095,16 +1093,12 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
               Fortran::parser::Unwrap(
                   *charExpr);
           std::string str{std::get(charConst->t)};
-          for (auto &device : currentDevices) {
+          for (auto &device : currentDevices)
             device->set_bindName(std::string(str));
-          }
         }
       }
     }
     symbol.get().add_openACCRoutineInfo(info);
-  } else {
-    llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name()
-                 << "\n";
   }
 }
 

>From 158481e5eaf63c1b2b4c172b9c143ca4c10722f5 Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 25 Apr 2025 17:15:26 -0700
Subject: [PATCH 08/10] a little more tidying up

---
 flang/include/flang/Lower/AbstractConverter.h | 1 -
 flang/include/flang/Lower/OpenACC.h           | 3 ---
 flang/lib/Lower/Bridge.cpp                    | 3 ++-
 3 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h
index 2fa0da94b0396..1d1323642bf9c 100644
--- a/flang/include/flang/Lower/AbstractConverter.h
+++ b/flang/include/flang/Lower/AbstractConverter.h
@@ -14,7 +14,6 @@
 #define FORTRAN_LOWER_ABSTRACTCONVERTER_H
 
 #include "flang/Lower/LoweringOptions.h"
-#include "flang/Lower/OpenACC.h"
 #include "flang/Lower/PFTDefs.h"
 #include "flang/Optimizer/Builder/BoxValue.h"
 #include "flang/Optimizer/Dialect/FIRAttr.h"
diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h
index d2cd7712fb2c7..4034953976427 100644
--- a/flang/include/flang/Lower/OpenACC.h
+++ b/flang/include/flang/Lower/OpenACC.h
@@ -72,9 +72,6 @@ static constexpr llvm::StringRef declarePostDeallocSuffix =
 
 static constexpr llvm::StringRef privatizationRecipePrefix = "privatization";
 
-bool needsOpenACCRoutineConstruct(
-    const Fortran::evaluate::ProcedureDesignator *);
-
 mlir::Value genOpenACCConstruct(AbstractConverter &,
                                 Fortran::semantics::SemanticsContext &,
                                 pft::Evaluation &,
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 1615493003898..e50c91654f7bb 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -5897,7 +5897,8 @@ class FirConverter : public Fortran::lower::AbstractConverter {
   /// Helper to generate GlobalOps when the builder is not positioned in any
   /// region block. This is required because the FirOpBuilder assumes it is
   /// always positioned inside a region block when creating globals, the easiest
-  /// way comply is to create a dummy function and to throw it away afterwards.
+  /// way to comply is to create a dummy function and to throw it away
+  /// afterwards.
   void createGlobalOutsideOfFunctionLowering(
       const std::function &createGlobals) {
     // FIXME: get rid of the bogus function context and instantiate the

>From 8f6ae035147336c4ed04b5b25487f72ebc52c757 Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Thu, 1 May 2025 08:12:35 -0700
Subject: [PATCH 09/10] more consistent use of builders

---
 flang/lib/Lower/Bridge.cpp        | 40 ++++++++++---------------------
 flang/lib/Lower/CallInterface.cpp |  1 -
 2 files changed, 13 insertions(+), 28 deletions(-)

diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index e50c91654f7bb..fb20dfbaf477e 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -403,18 +403,21 @@ class FirConverter : public Fortran::lower::AbstractConverter {
               [&](Fortran::lower::pft::FunctionLikeUnit &f) {
                 if (f.isMainProgram())
                   hasMainProgram = true;
-                declareFunction(f);
+                createGlobalOutsideOfFunctionLowering(
+                    [&]() { declareFunction(f); });
                 if (!globalOmpRequiresSymbol)
                   globalOmpRequiresSymbol = f.getScope().symbol();
               },
               [&](Fortran::lower::pft::ModuleLikeUnit &m) {
                 lowerModuleDeclScope(m);
-                for (Fortran::lower::pft::ContainedUnit &unit :
-                     m.containedUnitList)
-                  if (auto *f =
-                          std::get_if(
-                              &unit))
-                    declareFunction(*f);
+                createGlobalOutsideOfFunctionLowering([&]() {
+                  for (Fortran::lower::pft::ContainedUnit &unit :
+                       m.containedUnitList)
+                    if (auto *f =
+                            std::get_if(
+                                &unit))
+                      declareFunction(*f);
+                });
               },
               [&](Fortran::lower::pft::BlockDataUnit &b) {
                 if (!globalOmpRequiresSymbol)
@@ -463,19 +466,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
 
   /// Declare a function.
   void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) {
-    // Since this is a recursive function, we only need to create a new builder
-    // for each top-level declaration. It would be simpler to have a single
-    // builder for the entire translation unit, but that requires a lot of
-    // changes to the code.
-    // FIXME: Once createGlobalOutsideOfFunctionLowering is fixed, we can
-    // remove this code and share the module builder.
-    bool newBuilder = false;
-    if (!builder) {
-      newBuilder = true;
-      builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(),
-                                      &mlirSymbolTable);
-    }
-    CHECK(builder && "FirOpBuilder did not instantiate");
+    CHECK(builder && "declareFunction called with uninitialized builder");
     setCurrentPosition(funit.getStartingSourceLoc());
     for (int entryIndex = 0, last = funit.entryPointList.size();
          entryIndex < last; ++entryIndex) {
@@ -503,11 +494,6 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList)
       if (auto *f = std::get_if(&unit))
         declareFunction(*f);
-
-    if (newBuilder) {
-      delete builder;
-      builder = nullptr;
-    }
   }
 
   /// Get the scope that is defining or using \p sym. The returned scope is not
@@ -5624,9 +5610,9 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]";
                if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym;
                llvm::dbgs() << "\n");
-    // I don't think setting the builder is necessary here, because callee
+    // Setting the builder is not necessary here, because callee
     // always looks up the FuncOp from the module. If there was a function that
-    // was not declared yet. This call to callee will cause an assertion
+    // was not declared yet, this call to callee will cause an assertion
     // failure.
     Fortran::lower::CalleeInterface callee(funit, *this);
     mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments();
diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp
index 602b5c7bfa6c6..8affa1e1965e8 100644
--- a/flang/lib/Lower/CallInterface.cpp
+++ b/flang/lib/Lower/CallInterface.cpp
@@ -21,7 +21,6 @@
 #include "flang/Optimizer/Dialect/FIROpsSupport.h"
 #include "flang/Optimizer/Support/InternalNames.h"
 #include "flang/Optimizer/Support/Utils.h"
-#include "flang/Parser/parse-tree.h"
 #include "flang/Semantics/symbol.h"
 #include "flang/Semantics/tools.h"
 #include "flang/Support/Fortran.h"

>From a52655d0b90055ad6ff062fbf66be3172a95973b Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Thu, 1 May 2025 08:18:14 -0700
Subject: [PATCH 10/10] delete space

---
 flang/lib/Lower/Bridge.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index fb20dfbaf477e..a6ee24edd8381 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -5883,7 +5883,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
   /// Helper to generate GlobalOps when the builder is not positioned in any
   /// region block. This is required because the FirOpBuilder assumes it is
   /// always positioned inside a region block when creating globals, the easiest
-  /// way to comply is to create a dummy function and to throw it away 
+  /// way to comply is to create a dummy function and to throw it away
   /// afterwards.
   void createGlobalOutsideOfFunctionLowering(
       const std::function &createGlobals) {

From flang-commits at lists.llvm.org Thu May 1 08:19:14 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 01 May 2025 08:19:14 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098)
In-Reply-To:
Message-ID: <681390f2.170a0220.38868.d635@mx.google.com>

https://github.com/fanju110 updated https://github.com/llvm/llvm-project/pull/136098

>From 9494c9752400e4708dbc8b6a5ca4993ea9565e95 Mon Sep 17 00:00:00 2001
From: fanyikang
Date: Thu, 17 Apr 2025 15:17:07 +0800
Subject: [PATCH 01/11] Add support for IR PGO (-fprofile-generate/-fprofile-use=/file)

This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags:

-fprofile-generate for instrumentation-based profile generation
-fprofile-use=/file for profile-guided optimization

Co-Authored-By: ict-ql <168183727+ict-ql at users.noreply.github.com>
---
 clang/include/clang/Driver/Options.td         |  4 +-
 clang/lib/Driver/ToolChains/Flang.cpp         |  8 +++
 .../include/flang/Frontend/CodeGenOptions.def |  5 ++
 flang/include/flang/Frontend/CodeGenOptions.h | 49 +++++++++++++++++
 flang/lib/Frontend/CompilerInvocation.cpp     | 12 +++++
 flang/lib/Frontend/FrontendActions.cpp        | 54 +++++++++++++++++++
 flang/test/Driver/flang-f-opts.f90            |  5 ++
 .../Inputs/gcc-flag-compatibility_IR.proftext | 19 +++++++
 .../gcc-flag-compatibility_IR_entry.proftext  | 14 +++++
 flang/test/Profile/gcc-flag-compatibility.f90 | 39 ++++++++++++++
 10 files changed, 207 insertions(+), 2 deletions(-)
 create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
 create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
 create mode 100644 flang/test/Profile/gcc-flag-compatibility.f90

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index affc076a876ad..0b0dbc467c1e0 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1756,7 +1756,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">,
   HelpText<"Maximum number of test vectors in MC/DC coverage">,
   MarshallingInfoInt, "0x7FFFFFFE">;
 def fprofile_generate : Flag<["-"], "fprofile-generate">,
-    Group, Visibility<[ClangOption, CLOption]>,
+    Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>,
     HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">;
 def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">,
     Group, Visibility<[ClangOption, CLOption]>,
@@ -1773,7 +1773,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group,
     Visibility<[ClangOption, CLOption]>, Alias;
 def fprofile_use_EQ : Joined<["-"], "fprofile-use=">,
     Group,
-    Visibility<[ClangOption, CLOption]>,
+    Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>,
     MetaVarName<"">,
     HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. Otherwise, it reads from file .">;
 def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">,
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index a8b4688aed09c..fcdbe8a6aba5a 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -882,6 +882,14 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA,
   // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption
   Args.AddLastArg(CmdArgs, options::OPT_w);
 
+
+  if (Args.hasArg(options::OPT_fprofile_generate)){
+    CmdArgs.push_back("-fprofile-generate");
+  }
+  if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) {
+    CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue()));
+  }
+
   // Forward flags for OpenMP. We don't do this if the current action is an
   // device offloading action other than OpenMP.
   if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ,
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def
index 57830bf51a1b3..4dec86cd8f51b 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.def
+++ b/flang/include/flang/Frontend/CodeGenOptions.def
@@ -24,6 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified.
 CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
                                    ///< pass manager.
 
+ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
+/// Whether emit extra debug info for sample pgo profile collection.
+CODEGENOPT(DebugInfoForProfiling, 1, 0)
+CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level.
 CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module.
 CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the
diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h
index 2b4e823b3fef4..e052250f97e75 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.h
+++ b/flang/include/flang/Frontend/CodeGenOptions.h
@@ -148,6 +148,55 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// OpenMP is enabled.
   using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind;
 
+  enum ProfileInstrKind {
+    ProfileNone,       // Profile instrumentation is turned off.
+    ProfileClangInstr, // Clang instrumentation to generate execution counts
+                       // to use with PGO.
+    ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
+    ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
+  };
+
+
+  /// Name of the profile file to use as output for -fprofile-instr-generate,
+  /// -fprofile-generate, and -fcs-profile-generate.
+  std::string InstrProfileOutput;
+
+  /// Name of the profile file to use as input for -fmemory-profile-use.
+  std::string MemoryProfileUsePath;
+
+  unsigned int DebugInfoForProfiling;
+
+  unsigned int AtomicProfileUpdate;
+
+  /// Name of the profile file to use as input for -fprofile-instr-use
+  std::string ProfileInstrumentUsePath;
+
+  /// Name of the profile remapping file to apply to the profile data supplied 
+  /// by -fprofile-sample-use or -fprofile-instr-use.
+  std::string ProfileRemappingFile;
+
+  /// Check if Clang profile instrumenation is on.
+  bool hasProfileClangInstr() const {
+    return getProfileInstr() == ProfileClangInstr;
+  }
+
+  /// Check if IR level profile instrumentation is on.
+  bool hasProfileIRInstr() const {
+    return getProfileInstr() == ProfileIRInstr;
+  }
+
+  /// Check if CS IR level profile instrumentation is on.
+  bool hasProfileCSIRInstr() const {
+    return getProfileInstr() == ProfileCSIRInstr;
+  }
+    /// Check if IR level profile use is on.
+    bool hasProfileIRUse() const {
+      return getProfileUse() == ProfileIRInstr ||
+             getProfileUse() == ProfileCSIRInstr;
+    }
+  /// Check if CSIR profile use is on.
+  bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }
+
   // Define accessors/mutators for code generation options of enumeration type.
 #define CODEGENOPT(Name, Bits, Default)
 #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 6f87a18d69c3d..f013fce2f3cfc 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -27,6 +27,7 @@
 #include "clang/Driver/DriverDiagnostic.h"
 #include "clang/Driver/OptionUtils.h"
 #include "clang/Driver/Options.h"
+#include "clang/Driver/Driver.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/Frontend/Debug/Options.h"
@@ -431,6 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
     opts.IsPIE = 1;
   }
 
+  if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) {
+    opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+    opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling);
+    opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ);
+  }
+
+  if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) {
+    opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+    opts.ProfileInstrumentUsePath = A->getValue();
+  }
+
   // -mcmodel option.
   if (const llvm::opt::Arg *a =
           args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) {
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index c1f47b12abee2..68880bdeecf8d 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -63,11 +63,14 @@
 #include "llvm/Support/Path.h"
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/ToolOutputFile.h"
+#include "llvm/Support/PGOOptions.h"
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/TargetParser/RISCVISAInfo.h"
 #include "llvm/TargetParser/RISCVTargetParser.h"
 #include "llvm/Transforms/IPO/Internalize.h"
 #include "llvm/Transforms/Utils/ModuleUtils.h"
+#include "llvm/Transforms/Instrumentation/InstrProfiling.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
 #include
 #include
 
@@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci,
 // Custom BeginSourceFileAction
 //===----------------------------------------------------------------------===//
 
+
+static llvm::cl::opt ClPGOColdFuncAttr(
+    "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden,
+    llvm::cl::desc(
+        "Function attribute to apply to cold functions as determined by PGO"),
+    llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default",
+                                "Default (no attribute)"),
+                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize",
+                                "Mark cold functions with optsize."),
+                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize",
+                                "Mark cold functions with minsize."),
+                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone",
+                                "Mark cold functions with optnone.")));
+
 bool PrescanAction::beginSourceFileAction() { return runPrescan(); }
 
 bool PrescanAndParseAction::beginSourceFileAction() {
@@ -892,6 +909,20 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
   delete tlii;
 }
 
+
+// Default filename used for profile generation.
+namespace llvm {
+  extern llvm::cl::opt DebugInfoCorrelate;
+  extern llvm::cl::opt ProfileCorrelate;
+
+
+std::string getDefaultProfileGenName() {
+  return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE
+             ? "default_%m.proflite"
+             : "default_%m.profraw";
+}
+}
+
 void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   CompilerInstance &ci = getInstance();
   const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts();
@@ -909,6 +940,29 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   llvm::PassInstrumentationCallbacks pic;
   llvm::PipelineTuningOptions pto;
   std::optional pgoOpt;
+
+  if (opts.hasProfileIRInstr()){
+    // // -fprofile-generate.
+    pgoOpt = llvm::PGOOptions(
+        opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
+                                        : opts.InstrProfileOutput,
+        "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr,
+        llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr,
+        opts.DebugInfoForProfiling,
+        /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate);
+  }
+  else if (opts.hasProfileIRUse()) {
+    llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem();
+    // -fprofile-use.
+    auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse
+                                             : llvm::PGOOptions::NoCSAction;
+    pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "",
+                              opts.ProfileRemappingFile,
+                              opts.MemoryProfileUsePath, VFS,
+                              llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr,
+                              opts.DebugInfoForProfiling);
+  }
+
   llvm::StandardInstrumentations si(llvmModule->getContext(),
                                     opts.DebugPassManager);
   si.registerCallbacks(pic, &mam);
diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90
index 4493a519e2010..b972b9b7b2a59 100644
--- a/flang/test/Driver/flang-f-opts.f90
+++ b/flang/test/Driver/flang-f-opts.f90
@@ -8,3 +8,8 @@
 ! CHECK-LABEL: "-fc1"
 ! CHECK: -ffp-contract=off
 ! CHECK: -O3
+
+! RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s
+! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate"
+! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s
+! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}"
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
new file mode 100644
index 0000000000000..6a6df8b1d4d5b
--- /dev/null
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
@@ -0,0 +1,19 @@
+# IR level Instrumentation Flag
+:ir
+_QQmain
+# Func Hash:
+146835646621254984
+# Num Counters:
+2
+# Counter Values:
+100
+1
+
+main
+# Func Hash:
+742261418966908927
+# Num Counters:
+1
+# Counter Values:
+1
+
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
new file mode 100644
index 0000000000000..9a46140286673
--- /dev/null
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
@@ -0,0 +1,14 @@
+# IR level Instrumentation Flag
+:ir
+:entry_first
+_QQmain
+# Func Hash:
+146835646621254984
+# Num Counters:
+2
+# Counter Values:
+100
+1
+
+
+
diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90
new file mode 100644
index 0000000000000..0124bc79b87ef
--- /dev/null
+++ b/flang/test/Profile/gcc-flag-compatibility.f90
@@ -0,0 +1,39 @@
+! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two
+! flags behave similarly to their GCC counterparts:
+!
+! -fprofile-generate Generates the profile file ./default.profraw
+! -fprofile-use=/file Uses the profile file /file
+
+! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto
+! RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s
+! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section
+! PROFILE-GEN: @__profd_{{_?}}main =
+
+
+
+! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof
+! This uses LLVM IR format profile.
+! RUN: rm -rf %t.dir
+! RUN: mkdir -p %t.dir/some/path
+! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof
+! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s
+!
+
+
+
+! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof
+! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s
+! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1}
+! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100}
+
+
+program main
+  implicit none
+  integer :: i
+  integer :: X = 0
+
+  do i = 0, 99
+    X = X + i
+  end do
+
+end program main

>From b897c7aa1e21dfe46b4acf709f3ea38d9021c164 Mon Sep 17 00:00:00 2001
From: FYK
Date: Wed, 23 Apr 2025 09:56:14 +0800
Subject: [PATCH 02/11] Update flang/lib/Frontend/FrontendActions.cpp

Remove redundant comment symbols

Co-authored-by: Tom Eccles
---
 flang/lib/Frontend/FrontendActions.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 68880bdeecf8d..cd13a6aca92cd 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -942,7 +942,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   std::optional pgoOpt;
 
   if (opts.hasProfileIRInstr()){
-    // // -fprofile-generate.
+    // -fprofile-generate.
     pgoOpt = llvm::PGOOptions(
         opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
                                         : opts.InstrProfileOutput,

>From bc5adfcc4ac3456f587bedd48c1a8892d27e53ae Mon Sep 17 00:00:00 2001
From: fanju110
Date: Fri, 25 Apr 2025 11:48:30 +0800
Subject: [PATCH 03/11] format code with clang-format

---
 flang/include/flang/Frontend/CodeGenOptions.h | 17 ++--
 flang/lib/Frontend/CompilerInvocation.cpp     | 15 ++--
 flang/lib/Frontend/FrontendActions.cpp        | 83 +++++++++----------
 .../Inputs/gcc-flag-compatibility_IR.proftext |  3 +-
 .../gcc-flag-compatibility_IR_entry.proftext  |  5 +-
 5 files changed, 59 insertions(+), 64 deletions(-)

diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h
index e052250f97e75..c9577862df832 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.h
+++ b/flang/include/flang/Frontend/CodeGenOptions.h
@@ -156,7 +156,6 @@ class CodeGenOptions : public CodeGenOptionsBase {
     ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
   };
 
-
   /// Name of the profile file to use as output for -fprofile-instr-generate,
   /// -fprofile-generate, and -fcs-profile-generate.
   std::string InstrProfileOutput;
@@ -171,7 +170,7 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// Name of the profile file to use as input for -fprofile-instr-use
   std::string ProfileInstrumentUsePath;
 
-  /// Name of the profile remapping file to apply to the profile data supplied 
+  /// Name of the profile remapping file to apply to the profile data supplied
   /// by -fprofile-sample-use or -fprofile-instr-use.
   std::string ProfileRemappingFile;
 
@@ -181,19 +180,17 @@ class CodeGenOptions : public CodeGenOptionsBase {
   }
 
   /// Check if IR level profile instrumentation is on.
-  bool hasProfileIRInstr() const {
-    return getProfileInstr() == ProfileIRInstr;
-  }
+  bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; }
 
   /// Check if CS IR level profile instrumentation is on.
bool hasProfileCSIRInstr() const { return getProfileInstr() == ProfileCSIRInstr; } - /// Check if IR level profile use is on. - bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; - } + /// Check if IR level profile use is on. + bool hasProfileIRUse() const { + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; + } /// Check if CSIR profile use is on. bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index f013fce2f3cfc..b28c2c0047579 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -27,7 +27,6 @@ #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" -#include "clang/Driver/Driver.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" @@ -433,13 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr( + Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.DebugInfoForProfiling = + args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); + opts.AtomicProfileUpdate = + args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); } - + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse( + 
Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); } diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index cd13a6aca92cd..8d1ab670e4db4 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -56,21 +56,21 @@ #include "llvm/Passes/PassBuilder.h" #include "llvm/Passes/PassPlugin.h" #include "llvm/Passes/StandardInstrumentations.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/Error.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/FileSystem.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" -#include "llvm/Support/PGOOptions.h" #include "llvm/Target/TargetMachine.h" #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" -#include "llvm/Transforms/Utils/ModuleUtils.h" #include "llvm/Transforms/Instrumentation/InstrProfiling.h" -#include "llvm/ProfileData/InstrProfCorrelator.h" +#include "llvm/Transforms/Utils/ModuleUtils.h" #include #include @@ -133,19 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// - static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions 
with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); + "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), + llvm::cl::Hidden, + llvm::cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, + "default", "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, + "optsize", "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, + "minsize", "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, + "optnone", + "Mark cold functions with optnone."))); bool PrescanAction::beginSourceFileAction() { return runPrescan(); } @@ -909,19 +910,18 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } - // Default filename used for profile generation. namespace llvm { - extern llvm::cl::opt DebugInfoCorrelate; - extern llvm::cl::opt ProfileCorrelate; - +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt ProfileCorrelate; std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE + return DebugInfoCorrelate || + ProfileCorrelate != llvm::InstrProfCorrelator::NONE ? "default_%m.proflite" : "default_%m.profraw"; } -} +} // namespace llvm void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); @@ -940,29 +940,28 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; - - if (opts.hasProfileIRInstr()){ + + if (opts.hasProfileIRInstr()) { // -fprofile-generate. pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? 
llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); - } - else if (opts.hasProfileIRUse()) { - llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); - // -fprofile-use. - auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse - : llvm::PGOOptions::NoCSAction; - pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "", - opts.ProfileRemappingFile, - opts.MemoryProfileUsePath, VFS, - llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling); - } - + opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + } else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = + llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? 
llvm::PGOOptions::CSIRUse
+ : llvm::PGOOptions::NoCSAction;
+ pgoOpt = llvm::PGOOptions(
+ opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile,
+ opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction,
+ ClPGOColdFuncAttr, opts.DebugInfoForProfiling);
+ }
+
 llvm::StandardInstrumentations si(llvmModule->getContext(),
 opts.DebugPassManager);
 si.registerCallbacks(pic, &mam);
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
index 6a6df8b1d4d5b..2650fb5ebfd35 100644
--- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
@@ -15,5 +15,4 @@ main
 # Num Counters:
 1
 # Counter Values:
-1
-
+1
\ No newline at end of file
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
index 9a46140286673..c4a2a26557e80 100644
--- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
@@ -8,7 +8,4 @@ _QQmain
 2
 # Counter Values:
 100
-1
-
-
-
+1
\ No newline at end of file

>From d64d9d95fb97d6cfa4bf4192bfb20f5c8d6b3bc3 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Fri, 25 Apr 2025 11:53:47 +0800
Subject: [PATCH 04/11] simplify push_back usage

---
 clang/lib/Driver/ToolChains/Flang.cpp | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index fcdbe8a6aba5a..9c7e87c455e44 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -882,13 +882,9 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA,
 // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption
 Args.AddLastArg(CmdArgs, options::OPT_w);
-
- if (Args.hasArg(options::OPT_fprofile_generate)){
- CmdArgs.push_back("-fprofile-generate");
- }
- if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) {
- CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue()));
- }
+ // recognise options: fprofile-generate -fprofile-use=
+ Args.addAllArgs(
+ CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ});
 // Forward flags for OpenMP. We don't do this if the current action is an
 // device offloading action other than OpenMP.

>From 22475a85d24b22fb44ca5a5ce26542b556bae280 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Mon, 28 Apr 2025 20:33:54 +0800
Subject: [PATCH 05/11] Port the getDefaultProfileGenName definition and the ProfileInstrKind definition from clang to the llvm namespace to allow flang to reuse this code.

---
 clang/include/clang/Basic/CodeGenOptions.def | 4 +-
 clang/include/clang/Basic/CodeGenOptions.h | 22 +++++---
 clang/include/clang/Basic/ProfileList.h | 9 ++--
 clang/include/clang/CodeGen/BackendUtil.h | 3 ++
 clang/lib/Basic/ProfileList.cpp | 20 ++++----
 clang/lib/CodeGen/BackendUtil.cpp | 50 ++++++-------------
 clang/lib/CodeGen/CodeGenAction.cpp | 4 +-
 clang/lib/CodeGen/CodeGenFunction.cpp | 3 +-
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 clang/lib/Frontend/CompilerInvocation.cpp | 6 +--
 .../include/flang/Frontend/CodeGenOptions.def | 8 +--
 flang/include/flang/Frontend/CodeGenOptions.h | 28 ++++-------
 .../include/flang/Frontend/FrontendActions.h | 5 ++
 flang/lib/Frontend/CompilerInvocation.cpp | 11 ++--
 flang/lib/Frontend/FrontendActions.cpp | 28 +++--------
 .../llvm/Frontend/Driver/CodeGenOptions.h | 15 +++++-
 llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 25 ++++++++++
 17 files changed, 123 insertions(+), 120 deletions(-)

diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index a436c0ec98d5b..3dbadc85f23cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -223,9 +223,9 @@
AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index e39a73bdb13ac..e4a74cd2c12cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; + return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if any form of instrumentation is on. 
- bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } + bool hasProfileInstr() const { + return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; + } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == ProfileClangInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + } /// Check if type and variable info should be emitted. bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff 
--git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h index 92e0d13bf25b6..d9abf7bf962d2 100644 --- a/clang/include/clang/CodeGen/BackendUtil.h +++ b/clang/include/clang/CodeGen/BackendUtil.h @@ -8,6 +8,8 @@ #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include "clang/Basic/LLVM.h" #include "llvm/IR/ModuleSummaryIndex.h" @@ -19,6 +21,7 @@ template class Expected; template class IntrusiveRefCntPtr; class Module; class MemoryBufferRef; +extern cl::opt ClPGOColdFuncAttr; namespace vfs { class FileSystem; } // namespace vfs diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0c5aa6de3f3c0 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if 
(SCL->inSection(Section, "default", "allow")) @@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 7557cb8408921..963ed321b2cb9 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -103,38 +103,16 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP( "sanitizer-early-opt-ep", cl::Optional, cl::desc("Insert sanitizers on OptimizerEarlyEP.")); -// Experiment to mark cold functions as optsize/minsize/optnone. -// TODO: remove once this is exposed as a proper driver flag. 
-static cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, - cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions with minsize."), - clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); - extern cl::opt ProfileCorrelate; } // namespace llvm namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? "default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -834,12 +812,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. 
- PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse @@ -847,14 +825,14 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) @@ -863,15 +841,15 @@ void EmitAssemblyHelper::RunOptimizationPipeline( ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling - PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", 
nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4d29ceace646f..10f0a12773498 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 0154799498f5e..2e8e90493c370 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, 
// If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index cfc5c069b0849..ece7e5cad774c 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 4dec86cd8f51b..68396c4c40711 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Choose profile instrumenation kind or no instrumentation. 
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Whether emit extra debug info for sample pgo profile collection. -CODEGENOPT(DebugInfoForProfiling, 1, 0) -CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index c9577862df832..6a989f0c0775e 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. - }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fmemory-profile-use. 
std::string MemoryProfileUsePath; - unsigned int DebugInfoForProfiling; - - unsigned int AtomicProfileUpdate; - /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; @@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == llvm::driver::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h index f9a45bd6c0a56..9ba74a9dad9be 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -20,8 +20,13 @@ #include "mlir/IR/OwningOpRef.h" #include "llvm/ADT/StringRef.h" #include "llvm/IR/Module.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include +namespace llvm { +extern cl::opt ClPGOColdFuncAttr; +} // namespace llvm namespace Fortran::frontend { //===----------------------------------------------------------------------===// diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b28c2c0047579..43c8f4d596903 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = - args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = - args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); } if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); 
} diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 8d1ab670e4db4..c758aa18fbb8e 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -28,6 +28,7 @@ #include "flang/Semantics/unparse-with-symbols.h" #include "flang/Support/default-kinds.h" #include "flang/Tools/CrossToolHelpers.h" +#include "clang/CodeGen/BackendUtil.h" #include "mlir/IR/Dialect.h" #include "mlir/Parser/Parser.h" @@ -133,21 +134,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// -static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), - llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, - "default", "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, - "optsize", "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, - "minsize", "Mark cold functions with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, - "optnone", - "Mark cold functions with optnone."))); - bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -944,12 +930,12 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (opts.hasProfileIRInstr()) { // -fprofile-generate. pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, + opts.InstrProfileOutput.empty() + ? 
llvm::driver::getDefaultProfileGenName() + : opts.InstrProfileOutput, "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + llvm::PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, false, + /*PseudoProbeForProfiling=*/false, false); } else if (opts.hasProfileIRUse()) { llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); @@ -959,7 +945,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { pgoOpt = llvm::PGOOptions( opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, - ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + llvm::ClPGOColdFuncAttr, false); } llvm::StandardInstrumentations si(llvmModule->getContext(), diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index c51476e9ad3fe..6188c20cb29cf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,9 +13,14 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H +#include "llvm/ProfileData/InstrProfCorrelator.h" +#include namespace llvm { class Triple; class TargetLibraryInfoImpl; +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt + ProfileCorrelate; } // namespace llvm namespace llvm::driver { @@ -35,7 +40,15 @@ enum class VectorLibrary { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); - +enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
+}; +// Default filename used for profile generation. +std::string getDefaultProfileGenName(); } // end namespace llvm::driver #endif diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index ed7c57a930aca..818dcd3752437 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -8,7 +8,26 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/TargetParser/Triple.h" +namespace llvm { +// Experiment to mark cold functions as optsize/minsize/optnone. +// TODO: remove once this is exposed as a proper driver flag. +cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", cl::init(llvm::PGOOptions::ColdFuncOpt::Default), + cl::Hidden, + cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); +} // namespace llvm namespace llvm::driver { @@ -56,4 +75,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, return TLII; } +std::string getDefaultProfileGenName() { + return llvm::DebugInfoCorrelate || + llvm::ProfileCorrelate != InstrProfCorrelator::NONE + ? 
"default_%m.proflite" + : "default_%m.profraw"; +} } // namespace llvm::driver >From e53e689985088bbcdc253950a2ecc715592b5b3a Mon Sep 17 00:00:00 2001 From: fanju110 Date: Mon, 28 Apr 2025 21:49:36 +0800 Subject: [PATCH 06/11] Remove redundant function definitions --- flang/lib/Frontend/FrontendActions.cpp | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c758aa18fbb8e..cdd2853bcd201 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -896,18 +896,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } -// Default filename used for profile generation. -namespace llvm { -extern llvm::cl::opt DebugInfoCorrelate; -extern llvm::cl::opt ProfileCorrelate; - -std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || - ProfileCorrelate != llvm::InstrProfCorrelator::NONE - ? "default_%m.proflite" - : "default_%m.profraw"; -} -} // namespace llvm void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); >From 248175453354fecd078f5553576d16ce810e7808 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:12:32 +0800 Subject: [PATCH 07/11] Move the interface to the cpp that uses it --- clang/include/clang/Basic/CodeGenOptions.def | 4 +- clang/include/clang/Basic/CodeGenOptions.h | 22 +++++---- clang/include/clang/Basic/ProfileList.h | 9 ++-- clang/lib/Basic/ProfileList.cpp | 20 ++++----- clang/lib/CodeGen/BackendUtil.cpp | 37 +++++++-------- clang/lib/CodeGen/CodeGenAction.cpp | 4 +- clang/lib/CodeGen/CodeGenFunction.cpp | 3 +- clang/lib/CodeGen/CodeGenModule.cpp | 2 +- clang/lib/Frontend/CompilerInvocation.cpp | 6 +-- .../include/flang/Frontend/CodeGenOptions.def | 8 ++-- flang/include/flang/Frontend/CodeGenOptions.h | 28 +++++------- flang/lib/Frontend/CompilerInvocation.cpp | 11 ++--- 
flang/lib/Frontend/FrontendActions.cpp | 45 ++++--------------- .../llvm/Frontend/Driver/CodeGenOptions.h | 10 +++++ llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 12 +++++ 15 files changed, 101 insertions(+), 120 deletions(-) diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index a436c0ec98d5b..3dbadc85f23cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,9 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index e39a73bdb13ac..e4a74cd2c12cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. 
bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; + return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if any form of instrumentation is on. - bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } + bool hasProfileInstr() const { + return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; + } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == ProfileClangInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + } /// Check if type and variable info should be emitted. 
bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0c5aa6de3f3c0 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + 
llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if (SCL->inSection(Section, "default", "allow")) @@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 7557cb8408921..592e3bbbcc1cf 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -124,17 +124,10 @@ namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? 
"default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -834,12 +827,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. - PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? 
PGOOptions::CSIRUse @@ -847,31 +840,31 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); + llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling - PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. 
if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4d29ceace646f..10f0a12773498 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 0154799498f5e..2e8e90493c370 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, // If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. 
if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index cfc5c069b0849..ece7e5cad774c 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 4dec86cd8f51b..68396c4c40711 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Choose profile instrumenation kind or no instrumentation. +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Whether emit extra debug info for sample pgo profile collection. 
-CODEGENOPT(DebugInfoForProfiling, 1, 0) -CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index c9577862df832..6a989f0c0775e 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. - }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fmemory-profile-use. std::string MemoryProfileUsePath; - unsigned int DebugInfoForProfiling; - - unsigned int AtomicProfileUpdate; - /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; @@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == llvm::driver::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. 
- bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } // Define accessors/mutators for code generation options of enumeration type. #define CODEGENOPT(Name, Bits, Default) diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b28c2c0047579..43c8f4d596903 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = - args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = - args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + 
opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); } if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); } diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 8d1ab670e4db4..a650f54620543 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -133,21 +133,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// -static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), - llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, - "default", "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, - "optsize", "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, - "minsize", "Mark cold functions with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, - "optnone", - "Mark cold functions with optnone."))); - bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -910,19 +895,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } -// Default filename used for profile generation. -namespace llvm { -extern llvm::cl::opt DebugInfoCorrelate; -extern llvm::cl::opt ProfileCorrelate; - -std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || - ProfileCorrelate != llvm::InstrProfCorrelator::NONE - ? 
"default_%m.proflite" - : "default_%m.profraw"; -} -} // namespace llvm - void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts(); @@ -943,13 +915,14 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (opts.hasProfileIRInstr()) { // -fprofile-generate. - pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + pgoOpt = llvm::PGOOptions(opts.InstrProfileOutput.empty() + ? llvm::driver::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, + llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, + llvm::PGOOptions::ColdFuncOpt::Default, false, + /*PseudoProbeForProfiling=*/false, false); } else if (opts.hasProfileIRUse()) { llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); @@ -959,7 +932,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { pgoOpt = llvm::PGOOptions( opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, - ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + llvm::PGOOptions::ColdFuncOpt::Default, false); } llvm::StandardInstrumentations si(llvmModule->getContext(), diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index c51476e9ad3fe..3eb03cc3064cf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,6 +13,7 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H +#include 
namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -36,6 +37,15 @@ enum class VectorLibrary { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); +enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. +}; +// Default filename used for profile generation. +std::string getDefaultProfileGenName(); } // end namespace llvm::driver #endif diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index ed7c57a930aca..14b6b89da8465 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -8,8 +8,14 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/TargetParser/Triple.h" +namespace llvm { +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt + ProfileCorrelate; +} // namespace llvm namespace llvm::driver { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, @@ -56,4 +62,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, return TLII; } +std::string getDefaultProfileGenName() { + return llvm::DebugInfoCorrelate || + llvm::ProfileCorrelate != InstrProfCorrelator::NONE + ? 
"default_%m.proflite" + : "default_%m.profraw"; +} } // namespace llvm::driver >From 70fea2265a374f59345691f4ad7653ef4f0b6aa6 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:25:15 +0800 Subject: [PATCH 08/11] Move the interface to the cpp that uses it --- clang/include/clang/CodeGen/BackendUtil.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h index d9abf7bf962d2..92e0d13bf25b6 100644 --- a/clang/include/clang/CodeGen/BackendUtil.h +++ b/clang/include/clang/CodeGen/BackendUtil.h @@ -8,8 +8,6 @@ #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/PGOOptions.h" #include "clang/Basic/LLVM.h" #include "llvm/IR/ModuleSummaryIndex.h" @@ -21,7 +19,6 @@ template class Expected; template class IntrusiveRefCntPtr; class Module; class MemoryBufferRef; -extern cl::opt ClPGOColdFuncAttr; namespace vfs { class FileSystem; } // namespace vfs >From 5705d5eff937ca18eb44bec28a967a8629f0c085 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:26:22 +0800 Subject: [PATCH 09/11] Move the interface to the cpp that uses it --- flang/include/flang/Frontend/FrontendActions.h | 5 ----- 1 file changed, 5 deletions(-) diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h index 9ba74a9dad9be..f9a45bd6c0a56 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -20,13 +20,8 @@ #include "mlir/IR/OwningOpRef.h" #include "llvm/ADT/StringRef.h" #include "llvm/IR/Module.h" -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/PGOOptions.h" #include -namespace llvm { -extern cl::opt ClPGOColdFuncAttr; -} // namespace llvm namespace Fortran::frontend { //===----------------------------------------------------------------------===// >From 
016aab17f4cc73416c6ebca61240f269aac837d2 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:34:00 +0800 Subject: [PATCH 10/11] Fill in the missing code --- clang/lib/CodeGen/BackendUtil.cpp | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 2d33edbb8430d..6eb3a8638b7d1 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -103,6 +103,21 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP( "sanitizer-early-opt-ep", cl::Optional, cl::desc("Insert sanitizers on OptimizerEarlyEP.")); +// Experiment to mark cold functions as optsize/minsize/optnone. +// TODO: remove once this is exposed as a proper driver flag. +static cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, + cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); + extern cl::opt ProfileCorrelate; } // namespace llvm namespace clang { >From f36bfcfbfdc87b896f41be1ba25d8c18c339f1c1 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Thu, 1 May 2025 23:18:34 +0800 Subject: [PATCH 11/11] Adjusting the format of the code --- flang/test/Profile/gcc-flag-compatibility.f90 | 7 ------- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 7 ++++--- 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 index 0124bc79b87ef..4490c45232d28 100644 --- a/flang/test/Profile/gcc-flag-compatibility.f90 +++ 
b/flang/test/Profile/gcc-flag-compatibility.f90 @@ -9,24 +9,17 @@ ! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section ! PROFILE-GEN: @__profd_{{_?}}main = - - ! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof ! This uses LLVM IR format profile. ! RUN: rm -rf %t.dir ! RUN: mkdir -p %t.dir/some/path ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s -! - - - ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s ! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} ! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} - program main implicit none integer :: i diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 3eb03cc3064cf..98b9e1554f317 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -14,6 +14,7 @@ #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #include + namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -34,9 +35,6 @@ enum class VectorLibrary { AMDLIBM // AMD vector math library. }; -TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, - VectorLibrary Veclib); - enum ProfileInstrKind { ProfileNone, // Profile instrumentation is turned off. ProfileClangInstr, // Clang instrumentation to generate execution counts @@ -44,6 +42,9 @@ enum ProfileInstrKind { ProfileIRInstr, // IR level PGO instrumentation in LLVM. ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
}; +TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, + VectorLibrary Veclib); + // Default filename used for profile generation. std::string getDefaultProfileGenName(); } // end namespace llvm::driver From flang-commits at lists.llvm.org Thu May 1 08:35:57 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Thu, 01 May 2025 08:35:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <681394dd.170a0220.45c31.dd96@mx.google.com> https://github.com/abidh edited https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Thu May 1 08:40:55 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 01 May 2025 08:40:55 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Allow UPDATE clause to not have any arguments (PR #137521) In-Reply-To: Message-ID: <68139607.170a0220.8a25f.e89c@mx.google.com> https://github.com/tblah approved this pull request. LGTM, thanks https://github.com/llvm/llvm-project/pull/137521 From flang-commits at lists.llvm.org Thu May 1 08:41:29 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 01 May 2025 08:41:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Always set "openmp_flags" (PR #138153) Message-ID: https://github.com/kparzysz created https://github.com/llvm/llvm-project/pull/138153 Many OpenMP tests use "%openmp_flags" in the RUN line. In many OpenMP lit tests this variable is expected to at least have "-fopenmp" in it. However, in the lit config this variable was only given a value when the OpenMP runtime build was enabled. If the runtime build was not enabled, %openmp_flags would expand to an empty string, and unless a lit test specifically used -fopenmp in the RUN line, OpenMP would be disabled. 
This patch sets %openmp_flags to start with "-fopenmp" regardless of the build configuration. >From bc54eb1091f9cc491ba8e79ec096a2aa5c57e296 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 1 May 2025 10:30:38 -0500 Subject: [PATCH] [flang][OpenMP] Always set "openmp_flags" Many OpenMP tests use "%openmp_flags" in the RUN line. In many OpenMP lit tests this variable is expected to at least have "-fopenmp" in it. However, in the lit config this variable was only given a value when the OpenMP runtime build was enabled. If the runtime build was not enabled, %openmp_flags would expand to an empty string, and unless a lit test specifically used -fopenmp in the RUN line, OpenMP would be disabled. This patch sets %openmp_flags to start with "-fopenmp" regardless of the build configuration. --- flang/test/lit.cfg.py | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/flang/test/lit.cfg.py b/flang/test/lit.cfg.py index aa27fdc2fe412..7eb57670ac767 100644 --- a/flang/test/lit.cfg.py +++ b/flang/test/lit.cfg.py @@ -178,17 +178,15 @@ config.environment["LIBPGMATH"] = True # Determine if OpenMP runtime was built (enable OpenMP tests via REQUIRES in test file) +openmp_flags_substitution = "-fopenmp" if config.have_openmp_rtl: config.available_features.add("openmp_runtime") # For the enabled OpenMP tests, add a substitution that is needed in the tests to find # the omp_lib.{h,mod} files, depending on whether the OpenMP runtime was built as a # project or runtime. if config.openmp_module_dir: - config.substitutions.append( - ("%openmp_flags", f"-fopenmp -J {config.openmp_module_dir}") - ) - else: - config.substitutions.append(("%openmp_flags", "-fopenmp")) + openmp_flags_substitution += f" -J {config.openmp_module_dir}" +config.substitutions.append(("%openmp_flags", openmp_flags_substitution)) # Add features and substitutions to test F128 math support. 
# %f128-lib substitution may be used to generate check prefixes From flang-commits at lists.llvm.org Thu May 1 08:46:32 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Thu, 01 May 2025 08:46:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Always set "openmp_flags" (PR #138153) In-Reply-To: Message-ID: <68139758.650a0220.2b2e2f.df46@mx.google.com> https://github.com/kiranchandramohan approved this pull request. LG. Thanks for the fix. https://github.com/llvm/llvm-project/pull/138153 From flang-commits at lists.llvm.org Thu May 1 08:48:15 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 01 May 2025 08:48:15 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681397bf.630a0220.32acb6.fc56@mx.google.com> akuhlens wrote: Here is the minimal change that I could make that I think makes the use of FirOpBuilder safer. We should consider how to make the builder code more resistant to programmer error; the current interface seems to require the programmer to understand when it is safe to call methods on FirOpBuilder. Maybe we could parameterize the builder by the Op it is building in? https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Thu May 1 08:49:11 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Thu, 01 May 2025 08:49:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp.
(PR #138039) In-Reply-To: Message-ID: <681397f7.050a0220.72253.0acc@mx.google.com> https://github.com/abidh ready_for_review https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Thu May 1 08:51:35 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 01 May 2025 08:51:35 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <68139887.630a0220.cd15c.0632@mx.google.com> https://github.com/tarunprabhu commented: Thank you for seeing this through and making all the little changes. I have requested reviews from @MaskRay and @aeubanks for the clang side of things. https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Thu May 1 08:51:35 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 01 May 2025 08:51:35 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <68139887.170a0220.147a2b.ee7f@mx.google.com> https://github.com/tarunprabhu edited https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Thu May 1 08:51:37 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 01 May 2025 08:51:37 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <68139889.170a0220.224f7c.f3fb@mx.google.com> ================ @@ -33,9 +35,18 @@ enum class VectorLibrary { AMDLIBM // AMD vector math library. }; +enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. 
+ ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. +}; TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, ---------------- tarunprabhu wrote: Nit: Newline here too please. https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Thu May 1 08:51:37 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 01 May 2025 08:51:37 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <68139889.050a0220.245b89.02f1@mx.google.com> ================ @@ -8,8 +8,14 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/TargetParser/Triple.h" +namespace llvm { +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt + ProfileCorrelate; +} // namespace llvm namespace llvm::driver { ---------------- tarunprabhu wrote: Sorry, but I don't see the correction here. https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Thu May 1 09:14:43 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 01 May 2025 09:14:43 -0700 (PDT) Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790) In-Reply-To: Message-ID: <68139df3.170a0220.26728a.0765@mx.google.com> https://github.com/tblah commented: Thank you for upstreaming this. Just a minor comment from me. If you plan to upstream your whole optimization pipeline (very welcome!) 
then please create an RFC at https://discourse.llvm.org/c/subprojects/flang/33 https://github.com/llvm/llvm-project/pull/137790 From flang-commits at lists.llvm.org Thu May 1 09:14:43 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 01 May 2025 09:14:43 -0700 (PDT) Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790) In-Reply-To: Message-ID: <68139df3.630a0220.d3a12.1d6a@mx.google.com> ================ @@ -461,14 +475,28 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_ATTRIBUTE_UNUSED auto loopAnalysis = functionAnalysis.getChildLoopAnalysis(loop); auto &loopOps = loop.getBody()->getOperations(); + auto resultOp = cast(loop.getBody()->getTerminator()); ---------------- tblah wrote: Can a loop have multiple result operations if it contains multiple blocks inside the loop body? As this is an experimental optimization I guess it would be okay to skip loops which are too complicated. But printing something in `LLVM_DEBUG(llvm::dbgs())` would be ideal. https://github.com/llvm/llvm-project/pull/137790 From flang-commits at lists.llvm.org Thu May 1 09:14:44 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 01 May 2025 09:14:44 -0700 (PDT) Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790) In-Reply-To: Message-ID: <68139df4.170a0220.339e59.0a5d@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/137790 From flang-commits at lists.llvm.org Thu May 1 09:15:12 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 01 May 2025 09:15:12 -0700 (PDT) Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. 
(PR #137790) In-Reply-To: Message-ID: <68139e10.050a0220.10dd62.2a04@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/137790 From flang-commits at lists.llvm.org Thu May 1 09:23:45 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 09:23:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) Message-ID: https://github.com/NimishMishra created https://github.com/llvm/llvm-project/pull/138163 This patch adds support for emitting implicit casts for atomic capture if its constituent operations have different yet compatible types. Fixes: https://github.com/llvm/llvm-project/issues/138123 >From e912e7c9e434dc40fbd986f98725bda849a56553 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Thu, 1 May 2025 21:51:19 +0530 Subject: [PATCH] [flang][OpenMP] Add implicit casts for omp.atomic.capture --- flang/docs/OpenMPSupport.md | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 236 ++++++++++++------ .../Todo/atomic-capture-implicit-cast.f90 | 48 ---- .../Lower/OpenMP/atomic-implicit-cast.f90 | 78 ++++++ 4 files changed, 240 insertions(+), 124 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md index 2d4b9dd737777..46be14f4c168c 100644 --- a/flang/docs/OpenMPSupport.md +++ b/flang/docs/OpenMPSupport.md @@ -64,4 +64,4 @@ Note : No distinction is made between the support in Parser/Semantics, MLIR, Low | target teams distribute parallel loop simd construct | P | device, reduction, dist_schedule and linear clauses are not supported | ## OpenMP 3.1, OpenMP 2.5, OpenMP 1.1 -All features except a few corner cases in atomic (complex type, different but compatible types in lhs and rhs), threadprivate (character type) constructs/clauses are supported. 
+All features except a few corner cases in threadprivate (character type) constructs/clauses are supported. diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 47e7c266ff7d3..526148855b113 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2865,6 +2865,85 @@ static void genAtomicWrite(lower::AbstractConverter &converter, rightHandClauseList, loc); } +/* + Emit an implicit cast. Different yet compatible types on + omp.atomic.read constitute valid Fortran. The OMPIRBuilder will + emit atomic instructions (on primitive types) and `__atomic_load` + libcall (on complex type) without explicitly converting + between such compatible types. The OMPIRBuilder relies on the + frontend to resolve such inconsistencies between `omp.atomic.read ` + operand types. Similar inconsistencies between operand types in + `omp.atomic.write` are resolved through implicit casting by use of typed + assignment (i.e. `evaluate::Assignment`). However, use of typed + assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, + non-atomic load of `x` into a temporary `alloca`, followed by an atomic + read of form `v = alloca`. Hence, it is needed to perform a custom + implicit cast. + + An atomic read of form `v = x` would (without implicit casting) + lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, + type2`. This implicit casting will rather generate the following FIR: + + %alloca = fir.alloca type2 + omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 + %load = fir.load %alloca : !fir.ref + %cvt = fir.convert %load : (type2) -> type1 + fir.store %cvt to %v : !fir.ref + + These sequence of operations is thread-safe since each thread allocates + the `alloca` in its stack, and performs `%alloca = %x` atomically. Once + safely read, each thread performs the implicit cast on the local + `alloca`, and writes the final result to `%v`. 
+ +/// \param builder : FirOpBuilder +/// \param loc : Location for FIR generation +/// \param toAddress : Address of %v +/// \param toType : Type of %v +/// \param fromType : Type of %x +/// \param alloca : Thread scoped `alloca` +// It is the responsibility of the callee +// to position the `alloca` at `AllocaIP` +// through `builder.getAllocaBlock()` +*/ + +static void emitAtomicReadImplicitCast(fir::FirOpBuilder &builder, + mlir::Location loc, + mlir::Value toAddress, mlir::Type toType, + mlir::Type fromType, + mlir::Value alloca) { + auto load = builder.create(loc, alloca); + if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { + // Emit an additional `ExtractValueOp` if `fromAddress` is of complex + // type, but `toAddress` is not. + auto extract = builder.create( + loc, mlir::cast(fromType).getElementType(), load, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + auto cvt = builder.create(loc, toType, extract); + builder.create(loc, cvt, toAddress); + } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { + // Emit an additional `InsertValueOp` if `toAddress` is of complex + // type, but `fromAddress` is not. + mlir::Value undef = builder.create(loc, toType); + mlir::Type complexEleTy = + mlir::cast(toType).getElementType(); + mlir::Value cvt = builder.create(loc, complexEleTy, load); + mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); + mlir::Value idx0 = builder.create( + loc, toType, undef, cvt, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + mlir::Value idx1 = builder.create( + loc, toType, idx0, zero, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 1))); + builder.create(loc, idx1, toAddress); + } else { + auto cvt = builder.create(loc, toType, load); + builder.create(loc, cvt, toAddress); + } +} + /// Processes an atomic construct with read clause. 
static void genAtomicRead(lower::AbstractConverter &converter, const parser::OmpAtomicRead &atomicRead, @@ -2891,34 +2970,7 @@ static void genAtomicRead(lower::AbstractConverter &converter, *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); if (fromAddress.getType() != toAddress.getType()) { - // Emit an implicit cast. Different yet compatible types on - // omp.atomic.read constitute valid Fortran. The OMPIRBuilder will - // emit atomic instructions (on primitive types) and `__atomic_load` - // libcall (on complex type) without explicitly converting - // between such compatible types. The OMPIRBuilder relies on the - // frontend to resolve such inconsistencies between `omp.atomic.read ` - // operand types. Similar inconsistencies between operand types in - // `omp.atomic.write` are resolved through implicit casting by use of typed - // assignment (i.e. `evaluate::Assignment`). However, use of typed - // assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, - // non-atomic load of `x` into a temporary `alloca`, followed by an atomic - // read of form `v = alloca`. Hence, it is needed to perform a custom - // implicit cast. - - // An atomic read of form `v = x` would (without implicit casting) - // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, - // type2`. This implicit casting will rather generate the following FIR: - // - // %alloca = fir.alloca type2 - // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 - // %load = fir.load %alloca : !fir.ref - // %cvt = fir.convert %load : (type2) -> type1 - // fir.store %cvt to %v : !fir.ref - - // These sequence of operations is thread-safe since each thread allocates - // the `alloca` in its stack, and performs `%alloca = %x` atomically. Once - // safely read, each thread performs the implicit cast on the local - // `alloca`, and writes the final result to `%v`. 
+ mlir::Type toType = fir::unwrapRefType(toAddress.getType()); mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); @@ -2930,37 +2982,8 @@ static void genAtomicRead(lower::AbstractConverter &converter, genAtomicCaptureStatement(converter, fromAddress, alloca, leftHandClauseList, rightHandClauseList, elementType, loc); - auto load = builder.create(loc, alloca); - if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { - // Emit an additional `ExtractValueOp` if `fromAddress` is of complex - // type, but `toAddress` is not. - auto extract = builder.create( - loc, mlir::cast(fromType).getElementType(), load, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 0))); - auto cvt = builder.create(loc, toType, extract); - builder.create(loc, cvt, toAddress); - } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { - // Emit an additional `InsertValueOp` if `toAddress` is of complex - // type, but `fromAddress` is not. 
- mlir::Value undef = builder.create(loc, toType); - mlir::Type complexEleTy = - mlir::cast(toType).getElementType(); - mlir::Value cvt = builder.create(loc, complexEleTy, load); - mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); - mlir::Value idx0 = builder.create( - loc, toType, undef, cvt, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 0))); - mlir::Value idx1 = builder.create( - loc, toType, idx0, zero, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 1))); - builder.create(loc, idx1, toAddress); - } else { - auto cvt = builder.create(loc, toType, load); - builder.create(loc, cvt, toAddress); - } + emitAtomicReadImplicitCast(builder, loc, toAddress, toType, fromType, + alloca); } else genAtomicCaptureStatement(converter, fromAddress, toAddress, leftHandClauseList, rightHandClauseList, @@ -3049,10 +3072,6 @@ static void genAtomicCapture(lower::AbstractConverter &converter, mlir::Type stmt2VarType = fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - // Check if implicit type is needed - if (stmt1VarType != stmt2VarType) - TODO(loc, "atomic capture requiring implicit type casts"); - mlir::Operation *atomicCaptureOp = nullptr; mlir::IntegerAttr hint = nullptr; mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; @@ -3075,10 +3094,31 @@ static void genAtomicCapture(lower::AbstractConverter &converter, // Atomic capture construct is of the form [capture-stmt, update-stmt] const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt1LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt2LHSArg.getType()); + { + 
mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + genAtomicCaptureStatement(converter, stmt2LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt1LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } genAtomicUpdateStatement( converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, /*leftHandClauseList=*/nullptr, @@ -3091,10 +3131,32 @@ static void genAtomicCapture(lower::AbstractConverter &converter, firOpBuilder.setInsertionPointToStart(&block); const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt1LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt2LHSArg.getType()); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + genAtomicCaptureStatement(converter, stmt2LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt1LHSArg, toType, + fromType, alloca); + } + } else { + 
genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, loc); @@ -3107,10 +3169,34 @@ static void genAtomicCapture(lower::AbstractConverter &converter, converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt2LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt1LHSArg.getType()); + + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + + genAtomicCaptureStatement(converter, stmt1LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt2LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); diff --git a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 deleted file mode 100644 index 5b61f1169308f..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 +++ /dev/null @@ -1,48 +0,0 @@ -!RUN: %not_todo_cmd %flang_fc1 
-emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..4c1be1ca91ac0 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -4,6 +4,10 @@ ! CHECK: func.func @_QPatomic_implicit_cast_read() { subroutine atomic_implicit_cast_read +! CHECK: %[[ALLOCA7:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA6:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA5:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA4:.*]] = fir.alloca i32 ! CHECK: %[[ALLOCA3:.*]] = fir.alloca complex ! CHECK: %[[ALLOCA2:.*]] = fir.alloca complex ! CHECK: %[[ALLOCA1:.*]] = fir.alloca i32 @@ -53,4 +57,78 @@ subroutine atomic_implicit_cast_read ! CHECK: fir.store %[[CVT]] to %[[M_DECL]]#0 : !fir.ref> !$omp atomic read m = w + +! CHECK: %[[CONST:.*]] = arith.constant 1 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[ALLOCA4]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.update %[[X_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[ARG:.*]]: i32): +! 
CHECK: %[[RESULT:.*]] = arith.addi %[[ARG]], %[[CONST]] : i32 +! CHECK: omp.yield(%[[RESULT]] : i32) +! CHECK: } +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA4]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 +! CHECK: fir.store %[[CVT]] to %[[Y_DECL]]#0 : !fir.ref + !$omp atomic capture + y = x + x = x + 1 + !$omp end atomic + +! CHECK: %[[CONST:.*]] = arith.constant 10 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[ALLOCA5:.*]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[X_DECL]]#0 = %[[CONST]] : !fir.ref, i32 +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA5]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f64 +! CHECK: fir.store %[[CVT]] to %[[Z_DECL]]#0 : !fir.ref + !$omp atomic capture + z = x + x = 10 + !$omp end atomic + +! CHECK: %[[CONST:.*]] = arith.constant 1 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[X_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[ARG:.*]]: i32): +! CHECK: %[[RESULT:.*]] = arith.addi %[[ARG]], %[[CONST]] : i32 +! CHECK: omp.yield(%[[RESULT]] : i32) +! CHECK: } +! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref +! CHECK: %[[UNDEF:.*]] = fir.undefined complex +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 +! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex +! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex +! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> + !$omp atomic capture + x = x + 1 + w = x + !$omp end atomic + + +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): +! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 +! 
CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex +! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex +! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex +! CHECK: omp.yield(%[[RESULT]] : complex) +! CHECK: } +! CHECK: omp.atomic.read %[[ALLOCA7]] = %[[M_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA7]] : !fir.ref> +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (complex) -> complex +! CHECK: fir.store %[[CVT]] to %[[W_DECL]]#0 : !fir.ref> + !$omp atomic capture + m = m + 1 + w = m + !$omp end atomic + + end subroutine From flang-commits at lists.llvm.org Thu May 1 09:24:03 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 09:24:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <6813a023.630a0220.28fd9f.268e@mx.google.com> NimishMishra wrote: I will run this PR against fujitsu and gfortran testsuite once and report back the results https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Thu May 1 09:24:19 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 09:24:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <6813a033.170a0220.1d3d32.23e3@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: None (NimishMishra)
Changes This patch adds support for emitting implicit casts for atomic capture if its constituent operations have different yet compatible types. Fixes: https://github.com/llvm/llvm-project/issues/138123 --- Patch is 21.22 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/138163.diff 4 Files Affected: - (modified) flang/docs/OpenMPSupport.md (+1-1) - (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+161-75) - (removed) flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 (-48) - (modified) flang/test/Lower/OpenMP/atomic-implicit-cast.f90 (+78) ``````````diff diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md index 2d4b9dd737777..46be14f4c168c 100644 --- a/flang/docs/OpenMPSupport.md +++ b/flang/docs/OpenMPSupport.md @@ -64,4 +64,4 @@ Note : No distinction is made between the support in Parser/Semantics, MLIR, Low | target teams distribute parallel loop simd construct | P | device, reduction, dist_schedule and linear clauses are not supported | ## OpenMP 3.1, OpenMP 2.5, OpenMP 1.1 -All features except a few corner cases in atomic (complex type, different but compatible types in lhs and rhs), threadprivate (character type) constructs/clauses are supported. +All features except a few corner cases in threadprivate (character type) constructs/clauses are supported. diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 47e7c266ff7d3..526148855b113 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2865,6 +2865,85 @@ static void genAtomicWrite(lower::AbstractConverter &converter, rightHandClauseList, loc); } +/* + Emit an implicit cast. Different yet compatible types on + omp.atomic.read constitute valid Fortran. The OMPIRBuilder will + emit atomic instructions (on primitive types) and `__atomic_load` + libcall (on complex type) without explicitly converting + between such compatible types. 
The OMPIRBuilder relies on the + frontend to resolve such inconsistencies between `omp.atomic.read ` + operand types. Similar inconsistencies between operand types in + `omp.atomic.write` are resolved through implicit casting by use of typed + assignment (i.e. `evaluate::Assignment`). However, use of typed + assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, + non-atomic load of `x` into a temporary `alloca`, followed by an atomic + read of form `v = alloca`. Hence, it is needed to perform a custom + implicit cast. + + An atomic read of form `v = x` would (without implicit casting) + lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, + type2`. This implicit casting will rather generate the following FIR: + + %alloca = fir.alloca type2 + omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 + %load = fir.load %alloca : !fir.ref + %cvt = fir.convert %load : (type2) -> type1 + fir.store %cvt to %v : !fir.ref + + These sequence of operations is thread-safe since each thread allocates + the `alloca` in its stack, and performs `%alloca = %x` atomically. Once + safely read, each thread performs the implicit cast on the local + `alloca`, and writes the final result to `%v`. + +/// \param builder : FirOpBuilder +/// \param loc : Location for FIR generation +/// \param toAddress : Address of %v +/// \param toType : Type of %v +/// \param fromType : Type of %x +/// \param alloca : Thread scoped `alloca` +// It is the responsibility of the callee +// to position the `alloca` at `AllocaIP` +// through `builder.getAllocaBlock()` +*/ + +static void emitAtomicReadImplicitCast(fir::FirOpBuilder &builder, + mlir::Location loc, + mlir::Value toAddress, mlir::Type toType, + mlir::Type fromType, + mlir::Value alloca) { + auto load = builder.create(loc, alloca); + if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { + // Emit an additional `ExtractValueOp` if `fromAddress` is of complex + // type, but `toAddress` is not. 
+ auto extract = builder.create( + loc, mlir::cast(fromType).getElementType(), load, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + auto cvt = builder.create(loc, toType, extract); + builder.create(loc, cvt, toAddress); + } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { + // Emit an additional `InsertValueOp` if `toAddress` is of complex + // type, but `fromAddress` is not. + mlir::Value undef = builder.create(loc, toType); + mlir::Type complexEleTy = + mlir::cast(toType).getElementType(); + mlir::Value cvt = builder.create(loc, complexEleTy, load); + mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); + mlir::Value idx0 = builder.create( + loc, toType, undef, cvt, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + mlir::Value idx1 = builder.create( + loc, toType, idx0, zero, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 1))); + builder.create(loc, idx1, toAddress); + } else { + auto cvt = builder.create(loc, toType, load); + builder.create(loc, cvt, toAddress); + } +} + /// Processes an atomic construct with read clause. static void genAtomicRead(lower::AbstractConverter &converter, const parser::OmpAtomicRead &atomicRead, @@ -2891,34 +2970,7 @@ static void genAtomicRead(lower::AbstractConverter &converter, *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); if (fromAddress.getType() != toAddress.getType()) { - // Emit an implicit cast. Different yet compatible types on - // omp.atomic.read constitute valid Fortran. The OMPIRBuilder will - // emit atomic instructions (on primitive types) and `__atomic_load` - // libcall (on complex type) without explicitly converting - // between such compatible types. The OMPIRBuilder relies on the - // frontend to resolve such inconsistencies between `omp.atomic.read ` - // operand types. 
Similar inconsistencies between operand types in - // `omp.atomic.write` are resolved through implicit casting by use of typed - // assignment (i.e. `evaluate::Assignment`). However, use of typed - // assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, - // non-atomic load of `x` into a temporary `alloca`, followed by an atomic - // read of form `v = alloca`. Hence, it is needed to perform a custom - // implicit cast. - - // An atomic read of form `v = x` would (without implicit casting) - // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, - // type2`. This implicit casting will rather generate the following FIR: - // - // %alloca = fir.alloca type2 - // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 - // %load = fir.load %alloca : !fir.ref - // %cvt = fir.convert %load : (type2) -> type1 - // fir.store %cvt to %v : !fir.ref - - // These sequence of operations is thread-safe since each thread allocates - // the `alloca` in its stack, and performs `%alloca = %x` atomically. Once - // safely read, each thread performs the implicit cast on the local - // `alloca`, and writes the final result to `%v`. + mlir::Type toType = fir::unwrapRefType(toAddress.getType()); mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); @@ -2930,37 +2982,8 @@ static void genAtomicRead(lower::AbstractConverter &converter, genAtomicCaptureStatement(converter, fromAddress, alloca, leftHandClauseList, rightHandClauseList, elementType, loc); - auto load = builder.create(loc, alloca); - if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { - // Emit an additional `ExtractValueOp` if `fromAddress` is of complex - // type, but `toAddress` is not. 
- auto extract = builder.create( - loc, mlir::cast(fromType).getElementType(), load, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 0))); - auto cvt = builder.create(loc, toType, extract); - builder.create(loc, cvt, toAddress); - } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { - // Emit an additional `InsertValueOp` if `toAddress` is of complex - // type, but `fromAddress` is not. - mlir::Value undef = builder.create(loc, toType); - mlir::Type complexEleTy = - mlir::cast(toType).getElementType(); - mlir::Value cvt = builder.create(loc, complexEleTy, load); - mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); - mlir::Value idx0 = builder.create( - loc, toType, undef, cvt, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 0))); - mlir::Value idx1 = builder.create( - loc, toType, idx0, zero, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 1))); - builder.create(loc, idx1, toAddress); - } else { - auto cvt = builder.create(loc, toType, load); - builder.create(loc, cvt, toAddress); - } + emitAtomicReadImplicitCast(builder, loc, toAddress, toType, fromType, + alloca); } else genAtomicCaptureStatement(converter, fromAddress, toAddress, leftHandClauseList, rightHandClauseList, @@ -3049,10 +3072,6 @@ static void genAtomicCapture(lower::AbstractConverter &converter, mlir::Type stmt2VarType = fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - // Check if implicit type is needed - if (stmt1VarType != stmt2VarType) - TODO(loc, "atomic capture requiring implicit type casts"); - mlir::Operation *atomicCaptureOp = nullptr; mlir::IntegerAttr hint = nullptr; mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; @@ -3075,10 +3094,31 @@ static void genAtomicCapture(lower::AbstractConverter &converter, // Atomic capture construct is of the form [capture-stmt, update-stmt] const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); 
mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt1LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt2LHSArg.getType()); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + genAtomicCaptureStatement(converter, stmt2LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt1LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } genAtomicUpdateStatement( converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, /*leftHandClauseList=*/nullptr, @@ -3091,10 +3131,32 @@ static void genAtomicCapture(lower::AbstractConverter &converter, firOpBuilder.setInsertionPointToStart(&block); const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt1LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt2LHSArg.getType()); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = 
firOpBuilder.create(loc, fromType); + } + genAtomicCaptureStatement(converter, stmt2LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt1LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, loc); @@ -3107,10 +3169,34 @@ static void genAtomicCapture(lower::AbstractConverter &converter, converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt2LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt1LHSArg.getType()); + + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + + genAtomicCaptureStatement(converter, stmt1LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt2LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + 
} } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); diff --git a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 deleted file mode 100644 index 5b61f1169308f..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 +++ /dev/null @@ -1,48 +0,0 @@ -!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..4c1be1ca91ac0 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -4,6 +4,10 @@ ! CHECK: func.func @_QPatomic_implicit_cast_read() { subroutine atomic_implicit_cast_read +! CHECK: %[[ALLOCA7:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA6:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA5:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA4:.*]] = fir.alloca i32 ! CHECK: %[[ALLOCA3:.*]] = fir.alloca complex ! CHECK: %[[ALLOCA2:.*]] = fir.alloca complex ! 
CHECK: %[[ALLOCA1:.*]] = fir.alloca i32 @@ -53,4 +57,78 @@ subroutine atomic_implicit_cast_read ! CHECK: fir.store %[[CVT]] to %[[M_DECL]]#0 : !fir.ref> !$omp atomic read m = w + +! CHECK: %[[CONST:.*]] = arith.constant 1 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[ALLOCA4]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.update %[[X_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[ARG:.*]]: i32): +! CHECK: %[[RESULT:.*]] = arith.addi %[[ARG]], %[[CONST]] : i32 +! CHECK: omp.yield(%[[RESULT]] : i32) +! CHECK: } +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA4]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 +! CHECK: fir.store %[[CVT]] to %[[Y_DECL]]#0 : !fir.ref + !$omp atomic capture + y = x + x = x + 1 + !$omp end atomic + +! CHECK: %[[CONST:.*]] = arith.constant 10 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[ALLOCA5:.*]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[X_DECL]]#0 = %[[CONST]] : !fir.ref, i32 +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA5]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f64 +! CHECK: fir.store %[[CVT]] to %[[Z_DECL]]#0 : !fir.ref + !$omp atomic capture + z = x + x = 10 + !$omp end atomic + +! CHECK: %[[CONST:.*]] = arith.constant 1 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[X_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[ARG:.*]]: i32): +! CHECK: %[[RESULT:.*]] = arith.addi %[[ARG]], %[[CONST]] : i32 +! CHECK: omp.yield(%[[RESULT]] : i32) +! CHECK: } +! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref +! CHECK: %[[UNDEF:.*]] = fir.undefined complex +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 +! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex +! 
CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex +! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> + !$om... [truncated] ``````````
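For readers following the thread, the scheme described in the patch comment above (atomically read `x` into a thread-private temporary of `x`'s own type, then convert and store to the destination outside the atomic region) can be illustrated with a rough C++ analogue. This is only a sketch built on `std::atomic`, not the FIR that Flang emits; the function names `captureThenIncrement` and `readWithCast` are hypothetical and exist purely for illustration.

```cpp
#include <atomic>

// Analogue of:
//   !$omp atomic capture
//     y = x        ! y is real, x is integer
//     x = x + 1
//   !$omp end atomic
// The old value of x is captured and x is updated in one indivisible step.
// The captured value lands in a local (per-thread) temporary that has the
// same type as x, mirroring the role of the `alloca` in the patch.
float captureThenIncrement(std::atomic<int> &x) {
  int tmp = x.fetch_add(1); // atomic read-modify-write; returns the old value
  // The implicit cast is applied to the private temporary, after the atomic
  // operation has completed, so no other thread can observe or disturb it.
  return static_cast<float>(tmp);
}

// Analogue of a plain `!$omp atomic read` of the form `v = x`
// where v is real(8) and x is integer.
double readWithCast(const std::atomic<int> &x) {
  int tmp = x.load();              // atomic load into a private temporary
  return static_cast<double>(tmp); // conversion happens on the local copy
}
```

The point of the design is that only the load (or load plus update) needs to be atomic; the type conversion operates on data that no other thread can see, so it can be ordinary non-atomic code.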
https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Thu May 1 09:24:57 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 09:24:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <6813a059.630a0220.15c654.20b8@mx.google.com> ================ @@ -64,4 +64,4 @@ Note : No distinction is made between the support in Parser/Semantics, MLIR, Low | target teams distribute parallel loop simd construct | P | device, reduction, dist_schedule and linear clauses are not supported | ## OpenMP 3.1, OpenMP 2.5, OpenMP 1.1 -All features except a few corner cases in atomic (complex type, different but compatible types in lhs and rhs), threadprivate (character type) constructs/clauses are supported. +All features except a few corner cases in threadprivate (character type) constructs/clauses are supported. ---------------- NimishMishra wrote: @kiranchandramohan @tblah With this PR merged, would we be in a position to close these open issues with atomic 1.1?
https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Thu May 1 09:34:25 2025 From: flang-commits at lists.llvm.org (Anchu Rajendran S via flang-commits) Date: Thu, 01 May 2025 09:34:25 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][flang-driver] Support flag -finstrument-functions (PR #137996) In-Reply-To: Message-ID: <6813a291.170a0220.32816f.17e5@mx.google.com> https://github.com/anchuraj updated https://github.com/llvm/llvm-project/pull/137996 >From bb486c5e7cbe7b1c4a87469e06ca51bf49ddd081 Mon Sep 17 00:00:00 2001 From: Anchu Rajendran Date: Tue, 29 Apr 2025 14:41:55 -0500 Subject: [PATCH 1/4] [flang] Support flag -finstrument-functions --- clang/include/clang/Driver/Options.td | 10 ++++++---- clang/lib/Driver/ToolChains/Flang.cpp | 3 ++- flang/include/flang/Frontend/CodeGenOptions.h | 2 ++ flang/include/flang/Optimizer/Transforms/Passes.td | 8 ++++++++ flang/include/flang/Tools/CrossToolHelpers.h | 8 +++++++- flang/lib/Frontend/CompilerInvocation.cpp | 4 ++++ flang/lib/Optimizer/Passes/Pipelines.cpp | 3 ++- flang/lib/Optimizer/Transforms/FunctionAttr.cpp | 10 ++++++++++ flang/test/Driver/func-attr-instrument-functions.f90 | 9 +++++++++ 9 files changed, 50 insertions(+), 7 deletions(-) create mode 100644 flang/test/Driver/func-attr-instrument-functions.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index c0f469e04375c..8a3b74c397b95 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -2824,10 +2824,12 @@ def finput_charset_EQ : Joined<["-"], "finput-charset=">, Visibility<[ClangOption, FlangOption, FC1Option]>, Group, HelpText<"Specify the default character set for source files">; def fexec_charset_EQ : Joined<["-"], "fexec-charset=">, Group; -def finstrument_functions : Flag<["-"], "finstrument-functions">, Group, - Visibility<[ClangOption, CC1Option]>, - HelpText<"Generate calls to instrument function entry and exit">, - 
MarshallingInfoFlag>; +def finstrument_functions + : Flag<["-"], "finstrument-functions">, + Group, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, + HelpText<"Generate calls to instrument function entry and exit">, + MarshallingInfoFlag>; def finstrument_functions_after_inlining : Flag<["-"], "finstrument-functions-after-inlining">, Group, Visibility<[ClangOption, CC1Option]>, HelpText<"Like -finstrument-functions, but insert the calls after inlining">, diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index e9d5a844ab073..a407e295c09bd 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -128,7 +128,8 @@ void Flang::addOtherOptions(const ArgList &Args, ArgStringList &CmdArgs) const { options::OPT_std_EQ, options::OPT_W_Joined, options::OPT_fconvert_EQ, options::OPT_fpass_plugin_EQ, options::OPT_funderscoring, options::OPT_fno_underscoring, - options::OPT_funsigned, options::OPT_fno_unsigned}); + options::OPT_funsigned, options::OPT_fno_unsigned, + options::OPT_finstrument_functions}); llvm::codegenoptions::DebugInfoKind DebugInfoKind; if (Args.hasArg(options::OPT_gN_Group)) { diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..93711ae382f17 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -81,6 +81,8 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Options to add to the linker for the object file std::vector DependentLibs; + bool InstrumentFunctions{false}; + // The RemarkKind enum class and OptRemark struct are identical to what Clang // has // TODO: Share with clang instead of re-implementing here diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index c59416fa2c024..9b6919eec3f73 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ 
b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -393,6 +393,14 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { clEnumValN(mlir::LLVM::framePointerKind::FramePointerKind::All, "All", ""), clEnumValN(mlir::LLVM::framePointerKind::FramePointerKind::Reserved, "Reserved", "") )}]>, + Option<"instrumentFunctionEntry", "instrument-function-entry", + "std::string", /*default=*/"", + "Sets the name of the profiling function called during function " + "entry">, + Option<"instrumentFunctionExit", "instrument-function-exit", + "std::string", /*default=*/"", + "Sets the name of the profiling function called during function " + "exit">, Option<"noInfsFPMath", "no-infs-fp-math", "bool", /*default=*/"false", "Set the no-infs-fp-math attribute on functions in the module.">, Option<"noNaNsFPMath", "no-nans-fp-math", "bool", /*default=*/"false", diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 1dbc18e2b348b..36828028d3239 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,13 +102,17 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + if (opts.InstrumentFunctions) { + InstrumentFunctionsEntry = "__cyg_profile_func_enter"; + InstrumentFunctionsExit = "__cyg_profile_func_exit"; + } } llvm::OptimizationLevel OptLevel; ///< optimisation level bool StackArrays = false; ///< convert memory allocations to alloca. bool Underscoring = true; ///< add underscores to function names. bool LoopVersioning = false; ///< Run the version loop pass. - bool AliasAnalysis = false; ///< Add TBAA tags to generated LLVMIR + bool AliasAnalysis = false; ///< Add TBAA tags to generated LLVMIR. 
llvm::codegenoptions::DebugInfoKind DebugInfo = llvm::codegenoptions::NoDebugInfo; ///< Debug info generation. llvm::FramePointerKind FramePointerKind = @@ -124,6 +128,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. bool EnableOpenMP = false; ///< Enable OpenMP lowering. + std::string InstrumentFunctionsEntry = ""; + std::string InstrumentFunctionsExit = ""; }; struct OffloadModuleOpts { diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 6f87a18d69c3d..ffb16b11f6af0 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -310,6 +310,10 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) opts.OffloadObjects.push_back(a->getValue()); + if (args.hasFlag(clang::driver::options::OPT_finstrument_functions, + clang::driver::options::OPT_finstrument_functions, false)) + opts.InstrumentFunctions = true; + // -flto=full/thin option. 
if (const llvm::opt::Arg *a = args.getLastArg(clang::driver::options::OPT_flto_EQ)) { diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 130cbe72ec273..795ddcbd821da 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -349,7 +349,8 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None; pm.addPass(fir::createFunctionAttr( - {framePointerKind, config.NoInfsFPMath, config.NoNaNsFPMath, + {framePointerKind, config.InstrumentFunctionsEntry, + config.InstrumentFunctionsExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, ""})); diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index c79843fac4ce2..43e4c1a7af3cd 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -28,6 +28,8 @@ namespace { class FunctionAttrPass : public fir::impl::FunctionAttrBase { public: FunctionAttrPass(const fir::FunctionAttrOptions &options) { + instrumentFunctionEntry = options.instrumentFunctionEntry; + instrumentFunctionExit = options.instrumentFunctionExit; framePointerKind = options.framePointerKind; noInfsFPMath = options.noInfsFPMath; noNaNsFPMath = options.noNaNsFPMath; @@ -72,6 +74,14 @@ void FunctionAttrPass::runOnOperation() { auto llvmFuncOpName = mlir::OperationName(mlir::LLVM::LLVMFuncOp::getOperationName(), context); + if (!instrumentFunctionEntry.empty()) + func->setAttr(mlir::LLVM::LLVMFuncOp::getInstrumentFunctionEntryAttrName( + llvmFuncOpName), + mlir::StringAttr::get(context, instrumentFunctionEntry)); + if (!instrumentFunctionExit.empty()) + func->setAttr(mlir::LLVM::LLVMFuncOp::getInstrumentFunctionExitAttrName( + llvmFuncOpName), + mlir::StringAttr::get(context, instrumentFunctionExit)); if 
(noInfsFPMath) func->setAttr( mlir::LLVM::LLVMFuncOp::getNoInfsFpMathAttrName(llvmFuncOpName), diff --git a/flang/test/Driver/func-attr-instrument-functions.f90 b/flang/test/Driver/func-attr-instrument-functions.f90 new file mode 100644 index 0000000000000..0ef81806e9fb9 --- /dev/null +++ b/flang/test/Driver/func-attr-instrument-functions.f90 @@ -0,0 +1,9 @@ +! RUN: %flang -O1 -finstrument-functions -emit-llvm -S -o - %s 2>&1| FileCheck %s + +subroutine func +end subroutine func + +! CHECK: define void @func_() +! CHECK: {{.*}}call void @__cyg_profile_func_enter(ptr {{.*}}@func_, ptr {{.*}}) +! CHECK: {{.*}}call void @__cyg_profile_func_exit(ptr {{.*}}@func_, ptr {{.*}}) +! CHECK-NEXT: ret {{.*}} >From 88d258f0b2672cecd6a35909a664d7bd29e0132d Mon Sep 17 00:00:00 2001 From: Anchu Rajendran Date: Wed, 30 Apr 2025 15:41:14 -0500 Subject: [PATCH 2/4] Address review comments --- flang/include/flang/Tools/CrossToolHelpers.h | 12 ++++++++---- flang/lib/Frontend/CompilerInvocation.cpp | 3 +-- flang/lib/Optimizer/Passes/Pipelines.cpp | 4 ++-- 3 files changed, 11 insertions(+), 8 deletions(-) diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 36828028d3239..118695bbe2626 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -103,8 +103,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); if (opts.InstrumentFunctions) { - InstrumentFunctionsEntry = "__cyg_profile_func_enter"; - InstrumentFunctionsExit = "__cyg_profile_func_exit"; + InstrumentFunctionEntry = "__cyg_profile_func_enter"; + InstrumentFunctionExit = "__cyg_profile_func_exit"; } } @@ -128,8 +128,12 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. 
bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. bool EnableOpenMP = false; ///< Enable OpenMP lowering. - std::string InstrumentFunctionsEntry = ""; - std::string InstrumentFunctionsExit = ""; + std::string InstrumentFunctionEntry = + ""; ///< Name of the instrument-function that is called on each + ///< function-entry + std::string InstrumentFunctionExit = + ""; ///< Name of the instrument-function that is called on each + ///< function-exit }; struct OffloadModuleOpts { diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ffb16b11f6af0..03b9d87824369 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -310,8 +310,7 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) opts.OffloadObjects.push_back(a->getValue()); - if (args.hasFlag(clang::driver::options::OPT_finstrument_functions, - clang::driver::options::OPT_finstrument_functions, false)) + if (args.hasArg(clang::driver::options::OPT_finstrument_functions)) opts.InstrumentFunctions = true; // -flto=full/thin option. 
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 795ddcbd821da..a3ef473ea39b7 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -349,8 +349,8 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None; pm.addPass(fir::createFunctionAttr( - {framePointerKind, config.InstrumentFunctionsEntry, - config.InstrumentFunctionsExit, config.NoInfsFPMath, config.NoNaNsFPMath, + {framePointerKind, config.InstrumentFunctionEntry, + config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, ""})); >From 3a00558f985edee03f9b5a720fc36908c65b495f Mon Sep 17 00:00:00 2001 From: Anchu Rajendran Date: Wed, 30 Apr 2025 16:28:42 -0500 Subject: [PATCH 3/4] R3: Address review comments --- flang/include/flang/Frontend/CodeGenOptions.h | 1 + 1 file changed, 1 insertion(+) diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 93711ae382f17..6786a5b8b5fa3 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -81,6 +81,7 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Options to add to the linker for the object file std::vector DependentLibs; + /// Indicates whether -finstrument-functions is passed bool InstrumentFunctions{false}; // The RemarkKind enum class and OptRemark struct are identical to what Clang >From 47bb3c1f1783f947fc65973a1f4c525cf779a0ac Mon Sep 17 00:00:00 2001 From: Anchu Rajendran Date: Thu, 1 May 2025 10:16:56 -0500 Subject: [PATCH 4/4] R4: Addressing review comments --- flang/include/flang/Frontend/CodeGenOptions.def | 2 ++ flang/include/flang/Frontend/CodeGenOptions.h | 3 --- flang/lib/Frontend/CompilerInvocation.cpp | 2 +- 3 files changed, 3 insertions(+), 4 deletions(-) 
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 57830bf51a1b3..d9dbd274e83e5 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,6 +24,8 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. +CODEGENOPT(InstrumentFunctions, 1, 0) ///< Set when -finstrument_functions is + ///< enabled on the compile step. CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 6786a5b8b5fa3..2b4e823b3fef4 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -81,9 +81,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Options to add to the linker for the object file std::vector DependentLibs; - /// Indicates whether -finstrument-functions is passed - bool InstrumentFunctions{false}; - // The RemarkKind enum class and OptRemark struct are identical to what Clang // has // TODO: Share with clang instead of re-implementing here diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 03b9d87824369..d6ba644b1400d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -311,7 +311,7 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, opts.OffloadObjects.push_back(a->getValue()); if (args.hasArg(clang::driver::options::OPT_finstrument_functions)) - opts.InstrumentFunctions = true; + opts.InstrumentFunctions = 1; // -flto=full/thin option. 
if (const llvm::opt::Arg *a = From flang-commits at lists.llvm.org Thu May 1 09:35:32 2025 From: flang-commits at lists.llvm.org (Anchu Rajendran S via flang-commits) Date: Thu, 01 May 2025 09:35:32 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][flang-driver] Support flag -finstrument-functions (PR #137996) In-Reply-To: Message-ID: <6813a2d4.170a0220.1603de.196c@mx.google.com> ================ @@ -81,6 +81,8 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Options to add to the linker for the object file std::vector DependentLibs; + bool InstrumentFunctions{false}; ---------------- anchuraj wrote: Thank you for the review @tblah . Updated. https://github.com/llvm/llvm-project/pull/137996 From flang-commits at lists.llvm.org Thu May 1 08:57:48 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 01 May 2025 08:57:48 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] fix predetermined privatization inside section (PR #138159) Message-ID: https://github.com/tblah created https://github.com/llvm/llvm-project/pull/138159 This now produces code equivalent to if there was an explicit private clause on the SECTIONS construct. The problem was that each SECTION construct got its own DSP, which tried to privatize the same symbol for that SECTION. Privatization for SECTION(S) happens on the outer SECTION construct and so the outer construct's DSP should be shared. Fixes #135108 >From 91aa3bb1a0cdb61ff5be27440f88e29e1cb78455 Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Thu, 1 May 2025 10:34:28 +0000 Subject: [PATCH] [flang][OpenMP] fix predetermined privatization inside section This now produces code equivalent to if there was an explicit private clause on the SECTIONS construct. The problem was that each SECTION construct got its own DSP, which tried to privatize the same symbol for that SECTION. 
Privatization for SECTION(S) happens on the outer SECTION construct and so the outer construct's DSP should be shared. Fixes #135108 --- flang/lib/Lower/OpenMP/OpenMP.cpp | 14 ++++++-- .../OpenMP/sections-predetermined-private.f90 | 34 +++++++++++++++++++ 2 files changed, 46 insertions(+), 2 deletions(-) create mode 100644 flang/test/Lower/OpenMP/sections-predetermined-private.f90 diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index f099028c23323..ae28d5ddd1564 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1057,6 +1057,11 @@ struct OpWithBodyGenInfo { return *this; } + OpWithBodyGenInfo &setSkipDspStep2(bool value) { + skipDspStep2 = value; + return *this; + } + OpWithBodyGenInfo &setEntryBlockArgs(const EntryBlockArgs *value) { blockArgs = value; return *this; @@ -1088,6 +1093,8 @@ struct OpWithBodyGenInfo { const List *clauses = nullptr; /// [in] if provided, processes the construct's data-sharing attributes. DataSharingProcessor *dsp = nullptr; + /// [in] if true, skip DataSharingProcessor::processStep2 + bool skipDspStep2 = false; /// [in] if provided, it is used to create the op's region entry block. It is /// overriden when a \see genRegionEntryCB is provided. This is only valid for /// operations implementing the \see mlir::omp::BlockArgOpenMPOpInterface. @@ -1240,7 +1247,7 @@ static void createBodyOfOp(mlir::Operation &op, const OpWithBodyGenInfo &info, // loop (this may not make sense in production code, but a user could // write that and we should handle it). firOpBuilder.setInsertionPoint(term); - if (privatize) { + if (privatize && !info.skipDspStep2) { // DataSharingProcessor::processStep2() may create operations before/after // the one passed as argument. 
We need to treat loop wrappers and their // nested loop as a unit, so we need to pass the bottom level wrapper (if @@ -2162,7 +2169,10 @@ genSectionsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, OpWithBodyGenInfo(converter, symTable, semaCtx, loc, nestedEval, llvm::omp::Directive::OMPD_section) .setClauses(&sectionQueue.begin()->clauses) - .setEntryBlockArgs(&args), + .setEntryBlockArgs(&args) + .setDataSharingProcessor(&dsp) + // lastprivate is handled differently for SECTIONS, see below + .setSkipDspStep2(true), sectionQueue, sectionQueue.begin()); } diff --git a/flang/test/Lower/OpenMP/sections-predetermined-private.f90 b/flang/test/Lower/OpenMP/sections-predetermined-private.f90 new file mode 100644 index 0000000000000..9c2e2e127aa78 --- /dev/null +++ b/flang/test/Lower/OpenMP/sections-predetermined-private.f90 @@ -0,0 +1,34 @@ +! RUN: %flang_fc1 -fopenmp -emit-hlfir -o - %s | FileCheck %s + +!$omp parallel sections +!$omp section + do i = 1, 2 + end do +!$omp section + do i = 1, 2 + end do +!$omp end parallel sections +end +! CHECK-LABEL: func.func @_QQmain() { +! CHECK: omp.parallel { +! CHECK: %[[VAL_3:.*]] = fir.alloca i32 {bindc_name = "i", pinned} +! CHECK: %[[VAL_4:.*]]:2 = hlfir.declare %[[VAL_3]] {uniq_name = "_QFEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>) +! CHECK: omp.sections { +! CHECK: omp.section { +! CHECK: %[[VAL_11:.*]]:2 = fir.do_loop %[[VAL_12:.*]] = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}} -> (index, i32) { +! CHECK: } +! CHECK: fir.store %[[VAL_11]]#1 to %[[VAL_4]]#0 : !fir.ref<i32> +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.section { +! CHECK: %[[VAL_25:.*]]:2 = fir.do_loop %[[VAL_26:.*]] = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}}) -> (index, i32) { +! CHECK: } +! CHECK: fir.store %[[VAL_25]]#1 to %[[VAL_4]]#0 : !fir.ref<i32> +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +!
CHECK: } From flang-commits at lists.llvm.org Thu May 1 08:58:23 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 08:58:23 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] fix predetermined privatization inside section (PR #138159) In-Reply-To: Message-ID: <68139a1f.050a0220.255b16.0b42@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Tom Eccles (tblah)
Changes This now produces code equivalent to if there was an explicit private clause on the SECTIONS construct. The problem was that each SECTION construct got its own DSP, which tried to privatize the same symbol for that SECTION. Privatization for SECTION(S) happens on the outer SECTION construct and so the outer construct's DSP should be shared. Fixes #135108 --- Full diff: https://github.com/llvm/llvm-project/pull/138159.diff 2 Files Affected: - (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+12-2) - (added) flang/test/Lower/OpenMP/sections-predetermined-private.f90 (+34) ``````````diff diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index f099028c23323..ae28d5ddd1564 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1057,6 +1057,11 @@ struct OpWithBodyGenInfo { return *this; } + OpWithBodyGenInfo &setSkipDspStep2(bool value) { + skipDspStep2 = value; + return *this; + } + OpWithBodyGenInfo &setEntryBlockArgs(const EntryBlockArgs *value) { blockArgs = value; return *this; @@ -1088,6 +1093,8 @@ struct OpWithBodyGenInfo { const List *clauses = nullptr; /// [in] if provided, processes the construct's data-sharing attributes. DataSharingProcessor *dsp = nullptr; + /// [in] if true, skip DataSharingProcessor::processStep2 + bool skipDspStep2 = false; /// [in] if provided, it is used to create the op's region entry block. It is /// overriden when a \see genRegionEntryCB is provided. This is only valid for /// operations implementing the \see mlir::omp::BlockArgOpenMPOpInterface. @@ -1240,7 +1247,7 @@ static void createBodyOfOp(mlir::Operation &op, const OpWithBodyGenInfo &info, // loop (this may not make sense in production code, but a user could // write that and we should handle it). firOpBuilder.setInsertionPoint(term); - if (privatize) { + if (privatize && !info.skipDspStep2) { // DataSharingProcessor::processStep2() may create operations before/after // the one passed as argument. 
We need to treat loop wrappers and their // nested loop as a unit, so we need to pass the bottom level wrapper (if @@ -2162,7 +2169,10 @@ genSectionsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, OpWithBodyGenInfo(converter, symTable, semaCtx, loc, nestedEval, llvm::omp::Directive::OMPD_section) .setClauses(&sectionQueue.begin()->clauses) - .setEntryBlockArgs(&args), + .setEntryBlockArgs(&args) + .setDataSharingProcessor(&dsp) + // lastprivate is handled differently for SECTIONS, see below + .setSkipDspStep2(true), sectionQueue, sectionQueue.begin()); } diff --git a/flang/test/Lower/OpenMP/sections-predetermined-private.f90 b/flang/test/Lower/OpenMP/sections-predetermined-private.f90 new file mode 100644 index 0000000000000..9c2e2e127aa78 --- /dev/null +++ b/flang/test/Lower/OpenMP/sections-predetermined-private.f90 @@ -0,0 +1,34 @@ +! RUN: %flang_fc1 -fopenmp -emit-hlfir -o - %s | FileCheck %s + +!$omp parallel sections +!$omp section + do i = 1, 2 + end do +!$omp section + do i = 1, 2 + end do +!$omp end parallel sections +end +! CHECK-LABEL: func.func @_QQmain() { +! CHECK: omp.parallel { +! CHECK: %[[VAL_3:.*]] = fir.alloca i32 {bindc_name = "i", pinned} +! CHECK: %[[VAL_4:.*]]:2 = hlfir.declare %[[VAL_3]] {uniq_name = "_QFEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>) +! CHECK: omp.sections { +! CHECK: omp.section { +! CHECK: %[[VAL_11:.*]]:2 = fir.do_loop %[[VAL_12:.*]] = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}} -> (index, i32) { +! CHECK: } +! CHECK: fir.store %[[VAL_11]]#1 to %[[VAL_4]]#0 : !fir.ref<i32> +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.section { +! CHECK: %[[VAL_25:.*]]:2 = fir.do_loop %[[VAL_26:.*]] = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}}) -> (index, i32) { +! CHECK: } +! CHECK: fir.store %[[VAL_25]]#1 to %[[VAL_4]]#0 : !fir.ref<i32> +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } ``````````
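For readers skimming the diff, the idea can be pictured with a small standalone sketch. This is not flang code: `ToyProcessor`, `ToyBodyInfo` and `lowerBody` are hypothetical stand-ins, and the sketch only assumes what the PR description states, namely that every nested SECTION should reuse the outer construct's data-sharing processor and that the processor's second step is skipped for the nested bodies.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a data-sharing processor. It records which
// symbols were privatized and how often its second step ran.
struct ToyProcessor {
  std::vector<std::string> privatized;
  int step2Runs = 0;

  // Privatize a symbol only if this processor has not seen it yet.
  void privatizeOnce(const std::string &sym) {
    for (const auto &s : privatized)
      if (s == sym)
        return;
    privatized.push_back(sym);
  }

  void processStep2() { ++step2Runs; }
};

// Hypothetical fluent options struct, loosely modelled on the setter style
// visible in the diff (setDataSharingProcessor / setSkipDspStep2).
struct ToyBodyInfo {
  ToyProcessor *dsp = nullptr; // shared processor, if the caller provides one
  bool skipStep2 = false;      // if true, leave step 2 to the outer construct

  ToyBodyInfo &setProcessor(ToyProcessor *p) {
    dsp = p;
    return *this;
  }
  ToyBodyInfo &setSkipStep2(bool v) {
    skipStep2 = v;
    return *this;
  }
};

// Lower one nested body. Without a shared processor a fresh local one is
// used, which is the situation that privatized the same symbol repeatedly.
// Returns true if step 2 ran for this body.
bool lowerBody(const ToyBodyInfo &info, const std::string &loopVar) {
  ToyProcessor local;
  ToyProcessor *p = info.dsp ? info.dsp : &local;
  p->privatizeOnce(loopVar);
  if (info.skipStep2)
    return false;
  p->processStep2();
  return true;
}
```

Calling `lowerBody` twice with the same shared `ToyProcessor` and `setSkipStep2(true)` leaves a single privatized entry for `i` and never runs step 2, which mirrors the single `fir.alloca` for `i` that the test expects inside `omp.parallel`.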
https://github.com/llvm/llvm-project/pull/138159 From flang-commits at lists.llvm.org Thu May 1 09:01:48 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 09:01:48 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <68139aec.050a0220.32797f.0f57@mx.google.com> https://github.com/fanju110 updated https://github.com/llvm/llvm-project/pull/136098 >From 9494c9752400e4708dbc8b6a5ca4993ea9565e95 Mon Sep 17 00:00:00 2001 From: fanyikang Date: Thu, 17 Apr 2025 15:17:07 +0800 Subject: [PATCH 01/12] Add support for IR PGO (-fprofile-generate/-fprofile-use=/file) This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags: -fprofile-generate for instrumentation-based profile generation -fprofile-use=/file for profile-guided optimization Co-Authored-By: ict-ql <168183727+ict-ql at users.noreply.github.com> --- clang/include/clang/Driver/Options.td | 4 +- clang/lib/Driver/ToolChains/Flang.cpp | 8 +++ .../include/flang/Frontend/CodeGenOptions.def | 5 ++ flang/include/flang/Frontend/CodeGenOptions.h | 49 +++++++++++++++++ flang/lib/Frontend/CompilerInvocation.cpp | 12 +++++ flang/lib/Frontend/FrontendActions.cpp | 54 +++++++++++++++++++ flang/test/Driver/flang-f-opts.f90 | 5 ++ .../Inputs/gcc-flag-compatibility_IR.proftext | 19 +++++++ .../gcc-flag-compatibility_IR_entry.proftext | 14 +++++ flang/test/Profile/gcc-flag-compatibility.f90 | 39 ++++++++++++++ 10 files changed, 207 insertions(+), 2 deletions(-) create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext create mode 100644 flang/test/Profile/gcc-flag-compatibility.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index affc076a876ad..0b0dbc467c1e0 100644 --- 
a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1756,7 +1756,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFFFFFE">; def fprofile_generate : Flag<["-"], "fprofile-generate">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">, Group, Visibility<[ClangOption, CLOption]>, @@ -1773,7 +1773,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group, Visibility<[ClangOption, CLOption]>, Alias; def fprofile_use_EQ : Joined<["-"], "fprofile-use=">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, MetaVarName<"">, HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. Otherwise, it reads from file .">; def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">, diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index a8b4688aed09c..fcdbe8a6aba5a 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -882,6 +882,14 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); + + if (Args.hasArg(options::OPT_fprofile_generate)){ + CmdArgs.push_back("-fprofile-generate"); + } + if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) { + CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue())); + } + // Forward flags for OpenMP. 
We don't do this if the current action is an // device offloading action other than OpenMP. if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ, diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 57830bf51a1b3..4dec86cd8f51b 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,6 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Whether emit extra debug info for sample pgo profile collection. +CODEGENOPT(DebugInfoForProfiling, 1, 0) +CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..e052250f97e75 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,6 +148,55 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; + enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. + }; + + + /// Name of the profile file to use as output for -fprofile-instr-generate, + /// -fprofile-generate, and -fcs-profile-generate. 
+ std::string InstrProfileOutput; + + /// Name of the profile file to use as input for -fmemory-profile-use. + std::string MemoryProfileUsePath; + + unsigned int DebugInfoForProfiling; + + unsigned int AtomicProfileUpdate; + + /// Name of the profile file to use as input for -fprofile-instr-use + std::string ProfileInstrumentUsePath; + + /// Name of the profile remapping file to apply to the profile data supplied + /// by -fprofile-sample-use or -fprofile-instr-use. + std::string ProfileRemappingFile; + + /// Check if Clang profile instrumenation is on. + bool hasProfileClangInstr() const { + return getProfileInstr() == ProfileClangInstr; + } + + /// Check if IR level profile instrumentation is on. + bool hasProfileIRInstr() const { + return getProfileInstr() == ProfileIRInstr; + } + + /// Check if CS IR level profile instrumentation is on. + bool hasProfileCSIRInstr() const { + return getProfileInstr() == ProfileCSIRInstr; + } + /// Check if IR level profile use is on. + bool hasProfileIRUse() const { + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; + } + /// Check if CSIR profile use is on. + bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 6f87a18d69c3d..f013fce2f3cfc 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -27,6 +27,7 @@ #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" +#include "clang/Driver/Driver.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" @@ -431,6 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, opts.IsPIE = 1; } + if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { + opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); + opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + } + + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { + opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.ProfileInstrumentUsePath = A->getValue(); + } + // -mcmodel option. 
if (const llvm::opt::Arg *a = args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) { diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c1f47b12abee2..68880bdeecf8d 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -63,11 +63,14 @@ #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Target/TargetMachine.h" #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" #include "llvm/Transforms/Utils/ModuleUtils.h" +#include "llvm/Transforms/Instrumentation/InstrProfiling.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include #include @@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// + +static llvm::cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden, + llvm::cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); + bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -892,6 +909,20 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } + +// Default filename used for profile generation. 
+namespace llvm { + extern llvm::cl::opt DebugInfoCorrelate; + extern llvm::cl::opt ProfileCorrelate; + + +std::string getDefaultProfileGenName() { + return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE + ? "default_%m.proflite" + : "default_%m.profraw"; +} +} + void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts(); @@ -909,6 +940,29 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; + + if (opts.hasProfileIRInstr()){ + // // -fprofile-generate. + pgoOpt = llvm::PGOOptions( + opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + } + else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse + : llvm::PGOOptions::NoCSAction; + pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "", + opts.ProfileRemappingFile, + opts.MemoryProfileUsePath, VFS, + llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling); + } + llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90 index 4493a519e2010..b972b9b7b2a59 100644 --- a/flang/test/Driver/flang-f-opts.f90 +++ b/flang/test/Driver/flang-f-opts.f90 @@ -8,3 +8,8 @@ ! CHECK-LABEL: "-fc1" ! CHECK: -ffp-contract=off ! CHECK: -O3 + +! 
RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s +! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate" +! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s +! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}" diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext new file mode 100644 index 0000000000000..6a6df8b1d4d5b --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext @@ -0,0 +1,19 @@ +# IR level Instrumentation Flag +:ir +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 + +main +# Func Hash: +742261418966908927 +# Num Counters: +1 +# Counter Values: +1 + diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext new file mode 100644 index 0000000000000..9a46140286673 --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext @@ -0,0 +1,14 @@ +# IR level Instrumentation Flag +:ir +:entry_first +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 + + + diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 new file mode 100644 index 0000000000000..0124bc79b87ef --- /dev/null +++ b/flang/test/Profile/gcc-flag-compatibility.f90 @@ -0,0 +1,39 @@ +! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two +! flags behave similarly to their GCC counterparts: +! +! -fprofile-generate Generates the profile file ./default.profraw +! -fprofile-use=/file Uses the profile file /file + +! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto +! 
RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s +! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section +! PROFILE-GEN: @__profd_{{_?}}main = + + + +! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof +! This uses LLVM IR format profile. +! RUN: rm -rf %t.dir +! RUN: mkdir -p %t.dir/some/path +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s +! + + + +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s +! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} +! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} + + +program main + implicit none + integer :: i + integer :: X = 0 + + do i = 0, 99 + X = X + i + end do + +end program main >From b897c7aa1e21dfe46b4acf709f3ea38d9021c164 Mon Sep 17 00:00:00 2001 From: FYK Date: Wed, 23 Apr 2025 09:56:14 +0800 Subject: [PATCH 02/12] Update flang/lib/Frontend/FrontendActions.cpp Remove redundant comment symbols Co-authored-by: Tom Eccles --- flang/lib/Frontend/FrontendActions.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 68880bdeecf8d..cd13a6aca92cd 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -942,7 +942,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { std::optional pgoOpt; if (opts.hasProfileIRInstr()){ - // // -fprofile-generate. + // -fprofile-generate. pgoOpt = llvm::PGOOptions( opts.InstrProfileOutput.empty() ? 
llvm::getDefaultProfileGenName() : opts.InstrProfileOutput, >From bc5adfcc4ac3456f587bedd48c1a8892d27e53ae Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 25 Apr 2025 11:48:30 +0800 Subject: [PATCH 03/12] format code with clang-format --- flang/include/flang/Frontend/CodeGenOptions.h | 17 ++-- flang/lib/Frontend/CompilerInvocation.cpp | 15 ++-- flang/lib/Frontend/FrontendActions.cpp | 83 +++++++++---------- .../Inputs/gcc-flag-compatibility_IR.proftext | 3 +- .../gcc-flag-compatibility_IR_entry.proftext | 5 +- 5 files changed, 59 insertions(+), 64 deletions(-) diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index e052250f97e75..c9577862df832 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -156,7 +156,6 @@ class CodeGenOptions : public CodeGenOptionsBase { ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -171,7 +170,7 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; - /// Name of the profile remapping file to apply to the profile data supplied + /// Name of the profile remapping file to apply to the profile data supplied /// by -fprofile-sample-use or -fprofile-instr-use. std::string ProfileRemappingFile; @@ -181,19 +180,17 @@ class CodeGenOptions : public CodeGenOptionsBase { } /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; - } + bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. 
   bool hasProfileCSIRInstr() const {
     return getProfileInstr() == ProfileCSIRInstr;
   }
-    /// Check if IR level profile use is on.
-    bool hasProfileIRUse() const {
-      return getProfileUse() == ProfileIRInstr ||
-          getProfileUse() == ProfileCSIRInstr;
-    }
+  /// Check if IR level profile use is on.
+  bool hasProfileIRUse() const {
+    return getProfileUse() == ProfileIRInstr ||
+           getProfileUse() == ProfileCSIRInstr;
+  }
   /// Check if CSIR profile use is on.
   bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }

diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index f013fce2f3cfc..b28c2c0047579 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -27,7 +27,6 @@
 #include "clang/Driver/DriverDiagnostic.h"
 #include "clang/Driver/OptionUtils.h"
 #include "clang/Driver/Options.h"
-#include "clang/Driver/Driver.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/Frontend/Debug/Options.h"
@@ -433,13 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
   }

   if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) {
-    opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
-    opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling);
-    opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ);
+    opts.setProfileInstr(
+        Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+    opts.DebugInfoForProfiling =
+        args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling);
+    opts.AtomicProfileUpdate =
+        args.hasArg(clang::driver::options::OPT_fprofile_update_EQ);
   }
-
+
   if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) {
-    opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+    opts.setProfileUse(
+        Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
     opts.ProfileInstrumentUsePath = A->getValue();
   }

diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index cd13a6aca92cd..8d1ab670e4db4 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -56,21 +56,21 @@
 #include "llvm/Passes/PassBuilder.h"
 #include "llvm/Passes/PassPlugin.h"
 #include "llvm/Passes/StandardInstrumentations.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
 #include "llvm/Support/AMDGPUAddrSpace.h"
 #include "llvm/Support/Error.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/FileSystem.h"
+#include "llvm/Support/PGOOptions.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/ToolOutputFile.h"
-#include "llvm/Support/PGOOptions.h"
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/TargetParser/RISCVISAInfo.h"
 #include "llvm/TargetParser/RISCVTargetParser.h"
 #include "llvm/Transforms/IPO/Internalize.h"
-#include "llvm/Transforms/Utils/ModuleUtils.h"
 #include "llvm/Transforms/Instrumentation/InstrProfiling.h"
-#include "llvm/ProfileData/InstrProfCorrelator.h"
+#include "llvm/Transforms/Utils/ModuleUtils.h"
 #include
 #include

@@ -133,19 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci,
 // Custom BeginSourceFileAction
 //===----------------------------------------------------------------------===//

-
 static llvm::cl::opt ClPGOColdFuncAttr(
-    "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden,
-    llvm::cl::desc(
-        "Function attribute to apply to cold functions as determined by PGO"),
-    llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default",
-                          "Default (no attribute)"),
-               clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize",
-                          "Mark cold functions with optsize."),
-               clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize",
-                          "Mark cold functions with minsize."),
-               clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone",
-                          "Mark cold functions with optnone.")));
+    "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default),
+    llvm::cl::Hidden,
+    llvm::cl::desc(
+        "Function attribute to apply to cold functions as determined by PGO"),
+    llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default,
+                                "default", "Default (no attribute)"),
+                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize,
+                                "optsize", "Mark cold functions with optsize."),
+                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize,
+                                "minsize", "Mark cold functions with minsize."),
+                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone,
+                                "optnone",
+                                "Mark cold functions with optnone.")));

 bool PrescanAction::beginSourceFileAction() { return runPrescan(); }

@@ -909,19 +910,18 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
   delete tlii;
 }

-
 // Default filename used for profile generation.
 namespace llvm {
-  extern llvm::cl::opt DebugInfoCorrelate;
-  extern llvm::cl::opt ProfileCorrelate;
-
+extern llvm::cl::opt DebugInfoCorrelate;
+extern llvm::cl::opt ProfileCorrelate;

 std::string getDefaultProfileGenName() {
-  return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE
+  return DebugInfoCorrelate ||
+                 ProfileCorrelate != llvm::InstrProfCorrelator::NONE
              ? "default_%m.proflite"
              : "default_%m.profraw";
 }
-}
+} // namespace llvm

 void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   CompilerInstance &ci = getInstance();
@@ -940,29 +940,28 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   llvm::PassInstrumentationCallbacks pic;
   llvm::PipelineTuningOptions pto;
   std::optional pgoOpt;
-
-  if (opts.hasProfileIRInstr()){
+
+  if (opts.hasProfileIRInstr()) {
     // -fprofile-generate.
     pgoOpt = llvm::PGOOptions(
-      opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
-                                      : opts.InstrProfileOutput,
-      "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr,
-      llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr,
-      opts.DebugInfoForProfiling,
-      /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate);
-    }
-    else if (opts.hasProfileIRUse()) {
-      llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem();
-      // -fprofile-use.
-      auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse
-                                               : llvm::PGOOptions::NoCSAction;
-      pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "",
-                                opts.ProfileRemappingFile,
-                                opts.MemoryProfileUsePath, VFS,
-                                llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr,
-                                opts.DebugInfoForProfiling);
-    }
-
+        opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
+                                        : opts.InstrProfileOutput,
+        "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr,
+        llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr,
+        opts.DebugInfoForProfiling,
+        /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate);
+  } else if (opts.hasProfileIRUse()) {
+    llvm::IntrusiveRefCntPtr VFS =
+        llvm::vfs::getRealFileSystem();
+    // -fprofile-use.
+    auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse
+                                             : llvm::PGOOptions::NoCSAction;
+    pgoOpt = llvm::PGOOptions(
+        opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile,
+        opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction,
+        ClPGOColdFuncAttr, opts.DebugInfoForProfiling);
+  }
+
   llvm::StandardInstrumentations si(llvmModule->getContext(),
                                     opts.DebugPassManager);
   si.registerCallbacks(pic, &mam);
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
index 6a6df8b1d4d5b..2650fb5ebfd35 100644
--- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
@@ -15,5 +15,4 @@ main
 # Num Counters:
 1
 # Counter Values:
-1
-
+1
\ No newline at end of file
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
index 9a46140286673..c4a2a26557e80 100644
--- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
@@ -8,7 +8,4 @@ _QQmain
 2
 # Counter Values:
 100
-1
-
-
-
+1
\ No newline at end of file

From d64d9d95fb97d6cfa4bf4192bfb20f5c8d6b3bc3 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Fri, 25 Apr 2025 11:53:47 +0800
Subject: [PATCH 04/12] simplify push_back usage

---
 clang/lib/Driver/ToolChains/Flang.cpp | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index fcdbe8a6aba5a..9c7e87c455e44 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -882,13 +882,9 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA,
   // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption
   Args.AddLastArg(CmdArgs, options::OPT_w);

-
-  if (Args.hasArg(options::OPT_fprofile_generate)){
-    CmdArgs.push_back("-fprofile-generate");
-  }
-  if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) {
-    CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue()));
-  }
+  // recognise options: fprofile-generate -fprofile-use=
+  Args.addAllArgs(
+      CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ});

   // Forward flags for OpenMP. We don't do this if the current action is an
   // device offloading action other than OpenMP.

From 22475a85d24b22fb44ca5a5ce26542b556bae280 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Mon, 28 Apr 2025 20:33:54 +0800
Subject: [PATCH 05/12] Port the getDefaultProfileGenName definition and the ProfileInstrKind definition from clang to the llvm namespace to allow flang to reuse these code.

---
 clang/include/clang/Basic/CodeGenOptions.def | 4 +-
 clang/include/clang/Basic/CodeGenOptions.h | 22 +++++---
 clang/include/clang/Basic/ProfileList.h | 9 ++--
 clang/include/clang/CodeGen/BackendUtil.h | 3 ++
 clang/lib/Basic/ProfileList.cpp | 20 ++++----
 clang/lib/CodeGen/BackendUtil.cpp | 50 ++++++-------------
 clang/lib/CodeGen/CodeGenAction.cpp | 4 +-
 clang/lib/CodeGen/CodeGenFunction.cpp | 3 +-
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 clang/lib/Frontend/CompilerInvocation.cpp | 6 +--
 .../include/flang/Frontend/CodeGenOptions.def | 8 +--
 flang/include/flang/Frontend/CodeGenOptions.h | 28 ++++-------
 .../include/flang/Frontend/FrontendActions.h | 5 ++
 flang/lib/Frontend/CompilerInvocation.cpp | 11 ++--
 flang/lib/Frontend/FrontendActions.cpp | 28 +++--------
 .../llvm/Frontend/Driver/CodeGenOptions.h | 15 +++++-
 llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 25 ++++++++++
 17 files changed, 123 insertions(+), 120 deletions(-)

diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index a436c0ec98d5b..3dbadc85f23cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -223,9 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is
 CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling
 /// Choose profile instrumenation kind or no instrumentation.
-ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Choose profile kind for PGO use compilation.
-ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Partition functions into N groups and select only functions in group i to be
 /// instrumented. Selected group numbers can be 0 to N-1 inclusive.
 VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1)
diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h
index e39a73bdb13ac..e4a74cd2c12cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase {

   /// Check if Clang profile instrumenation is on.
   bool hasProfileClangInstr() const {
-    return getProfileInstr() == ProfileClangInstr;
+    return getProfileInstr() ==
+           llvm::driver::ProfileInstrKind::ProfileClangInstr;
   }

   /// Check if IR level profile instrumentation is on.
   bool hasProfileIRInstr() const {
-    return getProfileInstr() == ProfileIRInstr;
+    return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr;
   }

   /// Check if CS IR level profile instrumentation is on.
   bool hasProfileCSIRInstr() const {
-    return getProfileInstr() == ProfileCSIRInstr;
+    return getProfileInstr() ==
+           llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
   }

   /// Check if any form of instrumentation is on.
-  bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; }
+  bool hasProfileInstr() const {
+    return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone;
+  }

   /// Check if Clang profile use is on.
   bool hasProfileClangUse() const {
-    return getProfileUse() == ProfileClangInstr;
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr;
   }

   /// Check if IR level profile use is on.
   bool hasProfileIRUse() const {
-    return getProfileUse() == ProfileIRInstr ||
-           getProfileUse() == ProfileCSIRInstr;
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr ||
+           getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
   }

   /// Check if CSIR profile use is on.
-  bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }
+  bool hasProfileCSIRUse() const {
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
+  }

   /// Check if type and variable info should be emitted.
   bool hasReducedDebugInfo() const {
diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h
index b4217e49c18a3..5338ef3992ade 100644
--- a/clang/include/clang/Basic/ProfileList.h
+++ b/clang/include/clang/Basic/ProfileList.h
@@ -49,17 +49,16 @@ class ProfileList {
   ~ProfileList();

   bool isEmpty() const { return Empty; }
-  ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const;
+  ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const;

   std::optional
   isFunctionExcluded(StringRef FunctionName,
-                     CodeGenOptions::ProfileInstrKind Kind) const;
+                     llvm::driver::ProfileInstrKind Kind) const;
   std::optional
   isLocationExcluded(SourceLocation Loc,
-                     CodeGenOptions::ProfileInstrKind Kind) const;
+                     llvm::driver::ProfileInstrKind Kind) const;
   std::optional
-  isFileExcluded(StringRef FileName,
-                 CodeGenOptions::ProfileInstrKind Kind) const;
+  isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const;
 };

 } // namespace clang
diff --git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h
index 92e0d13bf25b6..d9abf7bf962d2 100644
--- a/clang/include/clang/CodeGen/BackendUtil.h
+++ b/clang/include/clang/CodeGen/BackendUtil.h
@@ -8,6 +8,8 @@

 #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H
 #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/PGOOptions.h"

 #include "clang/Basic/LLVM.h"
 #include "llvm/IR/ModuleSummaryIndex.h"
@@ -19,6 +21,7 @@ template class Expected;
 template class IntrusiveRefCntPtr;
 class Module;
 class MemoryBufferRef;
+extern cl::opt ClPGOColdFuncAttr;
 namespace vfs {
 class FileSystem;
 } // namespace vfs
diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp
index 8fa16e2eb069a..0c5aa6de3f3c0 100644
--- a/clang/lib/Basic/ProfileList.cpp
+++ b/clang/lib/Basic/ProfileList.cpp
@@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM)

 ProfileList::~ProfileList() = default;

-static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) {
+static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) {
   switch (Kind) {
-  case CodeGenOptions::ProfileNone:
+  case llvm::driver::ProfileInstrKind::ProfileNone:
     return "";
-  case CodeGenOptions::ProfileClangInstr:
+  case llvm::driver::ProfileInstrKind::ProfileClangInstr:
     return "clang";
-  case CodeGenOptions::ProfileIRInstr:
+  case llvm::driver::ProfileInstrKind::ProfileIRInstr:
     return "llvm";
-  case CodeGenOptions::ProfileCSIRInstr:
+  case llvm::driver::ProfileInstrKind::ProfileCSIRInstr:
     return "csllvm";
   }
-  llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum");
+  llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum");
 }

 ProfileList::ExclusionType
-ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const {
+ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const {
   StringRef Section = getSectionName(Kind);
   // Check for "default:"
   if (SCL->inSection(Section, "default", "allow"))
@@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix,

 std::optional
 ProfileList::isFunctionExcluded(StringRef FunctionName,
-                                CodeGenOptions::ProfileInstrKind Kind) const {
+                                llvm::driver::ProfileInstrKind Kind) const {
   StringRef Section = getSectionName(Kind);
   // Check for "function:="
   if (auto V = inSection(Section, "function", FunctionName))
@@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName,

 std::optional
 ProfileList::isLocationExcluded(SourceLocation Loc,
-                                CodeGenOptions::ProfileInstrKind Kind) const {
+                                llvm::driver::ProfileInstrKind Kind) const {
   return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind);
 }

 std::optional
 ProfileList::isFileExcluded(StringRef FileName,
-                            CodeGenOptions::ProfileInstrKind Kind) const {
+                            llvm::driver::ProfileInstrKind Kind) const {
   StringRef Section = getSectionName(Kind);
   // Check for "source:="
   if (auto V = inSection(Section, "source", FileName))
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 7557cb8408921..963ed321b2cb9 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -103,38 +103,16 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP(
     "sanitizer-early-opt-ep", cl::Optional,
     cl::desc("Insert sanitizers on OptimizerEarlyEP."));

-// Experiment to mark cold functions as optsize/minsize/optnone.
-// TODO: remove once this is exposed as a proper driver flag.
-static cl::opt ClPGOColdFuncAttr(
-    "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden,
-    cl::desc(
-        "Function attribute to apply to cold functions as determined by PGO"),
-    cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default",
-                          "Default (no attribute)"),
-               clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize",
-                          "Mark cold functions with optsize."),
-               clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize",
-                          "Mark cold functions with minsize."),
-               clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone",
-                          "Mark cold functions with optnone.")));
-
 extern cl::opt ProfileCorrelate;
 } // namespace llvm
 namespace clang {
 extern llvm::cl::opt ClSanitizeGuardChecks;
 }

-// Default filename used for profile generation.
-static std::string getDefaultProfileGenName() {
-  return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE
-             ? "default_%m.proflite"
-             : "default_%m.profraw";
-}
-
 // Path and name of file used for profile generation
 static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) {
   std::string FileName = CodeGenOpts.InstrProfileOutput.empty()
-                             ? getDefaultProfileGenName()
+                             ? llvm::driver::getDefaultProfileGenName()
                              : CodeGenOpts.InstrProfileOutput;
   if (CodeGenOpts.ContinuousProfileSync)
     FileName = "%c" + FileName;
@@ -834,12 +812,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline(

   if (CodeGenOpts.hasProfileIRInstr())
     // -fprofile-generate.
-    PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "",
-                        CodeGenOpts.MemoryProfileUsePath, nullptr,
-                        PGOOptions::IRInstr, PGOOptions::NoCSAction,
-                        ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling,
-                        /*PseudoProbeForProfiling=*/false,
-                        CodeGenOpts.AtomicProfileUpdate);
+    PGOOpt = PGOOptions(
+        getProfileGenName(CodeGenOpts), "", "",
+        CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr,
+        PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr,
+        CodeGenOpts.DebugInfoForProfiling,
+        /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate);
   else if (CodeGenOpts.hasProfileIRUse()) {
     // -fprofile-use.
     auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse
@@ -847,14 +825,14 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
     PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "",
                         CodeGenOpts.ProfileRemappingFile,
                         CodeGenOpts.MemoryProfileUsePath, VFS,
-                        PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr,
+                        PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr,
                         CodeGenOpts.DebugInfoForProfiling);
   } else if (!CodeGenOpts.SampleProfileFile.empty())
     // -fprofile-sample-use
     PGOOpt = PGOOptions(
         CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile,
         CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse,
-        PGOOptions::NoCSAction, ClPGOColdFuncAttr,
+        PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr,
         CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling);
   else if (!CodeGenOpts.MemoryProfileUsePath.empty())
     // -fmemory-profile-use (without any of the above options)
@@ -863,15 +841,15 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
                         ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling);
   else if (CodeGenOpts.PseudoProbeForProfiling)
     // -fpseudo-probe-for-profiling
-    PGOOpt =
-        PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr,
-                   PGOOptions::NoAction, PGOOptions::NoCSAction,
-                   ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true);
+    PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr,
+                        PGOOptions::NoAction, PGOOptions::NoCSAction,
+                        llvm::ClPGOColdFuncAttr,
+                        CodeGenOpts.DebugInfoForProfiling, true);
   else if (CodeGenOpts.DebugInfoForProfiling)
     // -fdebug-info-for-profiling
     PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr,
                         PGOOptions::NoAction, PGOOptions::NoCSAction,
-                        ClPGOColdFuncAttr, true);
+                        llvm::ClPGOColdFuncAttr, true);

   // Check to see if we want to generate a CS profile.
   if (CodeGenOpts.hasProfileCSIRInstr()) {
diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp
index 1f5eb427b566f..5493cc92bd8b0 100644
--- a/clang/lib/CodeGen/CodeGenAction.cpp
+++ b/clang/lib/CodeGen/CodeGenAction.cpp
@@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) {
   std::unique_ptr OptRecordFile =
       std::move(*OptRecordFileOrErr);

-  if (OptRecordFile &&
-      CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone)
+  if (OptRecordFile && CodeGenOpts.getProfileUse() !=
+                           llvm::driver::ProfileInstrKind::ProfileNone)
     Ctx.setDiagnosticsHotnessRequested(true);

   if (CodeGenOpts.MisExpect) {
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index 4d29ceace646f..10f0a12773498 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
     }
   }

-  if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) {
+  if (CGM.getCodeGenOpts().getProfileInstr() !=
+      llvm::driver::ProfileInstrKind::ProfileNone) {
     switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) {
     case ProfileList::Skip:
       Fn->addFnAttr(llvm::Attribute::SkipProfile);
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 0154799498f5e..2e8e90493c370 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn,
   // If the profile list is empty, then instrument everything.
   if (ProfileList.isEmpty())
     return ProfileList::Allow;
-  CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr();
+  llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr();
   // First, check the function name.
   if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind))
     return *V;
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index cfc5c069b0849..ece7e5cad774c 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts,
   // which is available (might be one or both).
   if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) {
     if (PGOReader->hasCSIRLevelProfile())
-      Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr);
+      Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr);
     else
-      Opts.setProfileUse(CodeGenOptions::ProfileIRInstr);
+      Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr);
   } else
-    Opts.setProfileUse(CodeGenOptions::ProfileClangInstr);
+    Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr);
 }

 void CompilerInvocation::setDefaultPointerAuthOptions(
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def
index 4dec86cd8f51b..68396c4c40711 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.def
+++ b/flang/include/flang/Frontend/CodeGenOptions.def
@@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified.
 CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
                                    ///< pass manager.

-ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
-ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
+/// Choose profile instrumenation kind or no instrumentation.
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
+/// Choose profile kind for PGO use compilation.
+ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Whether emit extra debug info for sample pgo profile collection.
-CODEGENOPT(DebugInfoForProfiling, 1, 0)
-CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level.
 CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module.
 CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the
diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h
index c9577862df832..6a989f0c0775e 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.h
+++ b/flang/include/flang/Frontend/CodeGenOptions.h
@@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// OpenMP is enabled.
   using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind;

-  enum ProfileInstrKind {
-    ProfileNone,       // Profile instrumentation is turned off.
-    ProfileClangInstr, // Clang instrumentation to generate execution counts
-                       // to use with PGO.
-    ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
-    ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
-  };
-
   /// Name of the profile file to use as output for -fprofile-instr-generate,
   /// -fprofile-generate, and -fcs-profile-generate.
   std::string InstrProfileOutput;
@@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// Name of the profile file to use as input for -fmemory-profile-use.
   std::string MemoryProfileUsePath;

-  unsigned int DebugInfoForProfiling;
-
-  unsigned int AtomicProfileUpdate;
-
   /// Name of the profile file to use as input for -fprofile-instr-use
   std::string ProfileInstrumentUsePath;

@@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase {

   /// Check if Clang profile instrumenation is on.
   bool hasProfileClangInstr() const {
-    return getProfileInstr() == ProfileClangInstr;
+    return getProfileInstr() == llvm::driver::ProfileClangInstr;
   }

   /// Check if IR level profile instrumentation is on.
-  bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; }
+  bool hasProfileIRInstr() const {
+    return getProfileInstr() == llvm::driver::ProfileIRInstr;
+  }

   /// Check if CS IR level profile instrumentation is on.
   bool hasProfileCSIRInstr() const {
-    return getProfileInstr() == ProfileCSIRInstr;
+    return getProfileInstr() == llvm::driver::ProfileCSIRInstr;
   }
   /// Check if IR level profile use is on.
   bool hasProfileIRUse() const {
-    return getProfileUse() == ProfileIRInstr ||
-           getProfileUse() == ProfileCSIRInstr;
+    return getProfileUse() == llvm::driver::ProfileIRInstr ||
+           getProfileUse() == llvm::driver::ProfileCSIRInstr;
   }
   /// Check if CSIR profile use is on.
-  bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }
+  bool hasProfileCSIRUse() const {
+    return getProfileUse() == llvm::driver::ProfileCSIRInstr;
+  }

   // Define accessors/mutators for code generation options of enumeration type.
 #define CODEGENOPT(Name, Bits, Default)
diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h
index f9a45bd6c0a56..9ba74a9dad9be 100644
--- a/flang/include/flang/Frontend/FrontendActions.h
+++ b/flang/include/flang/Frontend/FrontendActions.h
@@ -20,8 +20,13 @@
 #include "mlir/IR/OwningOpRef.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/IR/Module.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/PGOOptions.h"
 #include
+namespace llvm {
+extern cl::opt ClPGOColdFuncAttr;
+} // namespace llvm

 namespace Fortran::frontend {

 //===----------------------------------------------------------------------===//
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index b28c2c0047579..43c8f4d596903 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -30,6 +30,7 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/Frontend/Debug/Options.h"
+#include "llvm/Frontend/Driver/CodeGenOptions.h"
 #include "llvm/Option/Arg.h"
 #include "llvm/Option/ArgList.h"
 #include "llvm/Option/OptTable.h"
@@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
   }

   if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) {
-    opts.setProfileInstr(
-        Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
-    opts.DebugInfoForProfiling =
-        args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling);
-    opts.AtomicProfileUpdate =
-        args.hasArg(clang::driver::options::OPT_fprofile_update_EQ);
+    opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr);
   }

   if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) {
-    opts.setProfileUse(
-        Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+    opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr);
     opts.ProfileInstrumentUsePath = A->getValue();
   }

diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 8d1ab670e4db4..c758aa18fbb8e 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -28,6 +28,7 @@
 #include "flang/Semantics/unparse-with-symbols.h"
 #include "flang/Support/default-kinds.h"
 #include "flang/Tools/CrossToolHelpers.h"
+#include "clang/CodeGen/BackendUtil.h"

 #include "mlir/IR/Dialect.h"
 #include "mlir/Parser/Parser.h"
@@ -133,21 +134,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci,
 // Custom BeginSourceFileAction
 //===----------------------------------------------------------------------===//

-static llvm::cl::opt ClPGOColdFuncAttr(
-    "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default),
-    llvm::cl::Hidden,
-    llvm::cl::desc(
-        "Function attribute to apply to cold functions as determined by PGO"),
-    llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default,
-                                "default", "Default (no attribute)"),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize,
-                                "optsize", "Mark cold functions with optsize."),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize,
-                                "minsize", "Mark cold functions with minsize."),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone,
-                                "optnone",
-                                "Mark cold functions with optnone.")));
-
 bool PrescanAction::beginSourceFileAction() { return runPrescan(); }

 bool PrescanAndParseAction::beginSourceFileAction() {
@@ -944,12 +930,12 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   if (opts.hasProfileIRInstr()) {
     // -fprofile-generate.
     pgoOpt = llvm::PGOOptions(
-        opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
-                                        : opts.InstrProfileOutput,
+        opts.InstrProfileOutput.empty()
+            ? llvm::driver::getDefaultProfileGenName()
+            : opts.InstrProfileOutput,
         "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr,
-        llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr,
-        opts.DebugInfoForProfiling,
-        /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate);
+        llvm::PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, false,
+        /*PseudoProbeForProfiling=*/false, false);
   } else if (opts.hasProfileIRUse()) {
     llvm::IntrusiveRefCntPtr VFS =
         llvm::vfs::getRealFileSystem();
@@ -959,7 +945,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
     pgoOpt = llvm::PGOOptions(
         opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile,
         opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction,
-        ClPGOColdFuncAttr, opts.DebugInfoForProfiling);
+        llvm::ClPGOColdFuncAttr, false);
   }

   llvm::StandardInstrumentations si(llvmModule->getContext(),
diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index c51476e9ad3fe..6188c20cb29cf 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -13,9 +13,14 @@
 #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H
 #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H

+#include "llvm/ProfileData/InstrProfCorrelator.h"
+#include
 namespace llvm {
 class Triple;
 class TargetLibraryInfoImpl;
+extern llvm::cl::opt DebugInfoCorrelate;
+extern llvm::cl::opt
+    ProfileCorrelate;
 } // namespace llvm

 namespace llvm::driver {
@@ -35,7 +40,15 @@ enum class VectorLibrary {
 TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
                                   VectorLibrary Veclib);

-
+enum ProfileInstrKind {
+  ProfileNone,       // Profile instrumentation is turned off.
+  ProfileClangInstr, // Clang instrumentation to generate execution counts
+                     // to use with PGO.
+  ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
+  ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
+}; +// Default filename used for profile generation. +std::string getDefaultProfileGenName(); } // end namespace llvm::driver #endif diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index ed7c57a930aca..818dcd3752437 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -8,7 +8,26 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/TargetParser/Triple.h" +namespace llvm { +// Experiment to mark cold functions as optsize/minsize/optnone. +// TODO: remove once this is exposed as a proper driver flag. +cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", cl::init(llvm::PGOOptions::ColdFuncOpt::Default), + cl::Hidden, + cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); +} // namespace llvm namespace llvm::driver { @@ -56,4 +75,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, return TLII; } +std::string getDefaultProfileGenName() { + return llvm::DebugInfoCorrelate || + llvm::ProfileCorrelate != InstrProfCorrelator::NONE + ? 
"default_%m.proflite"
+             : "default_%m.profraw";
+}
 } // namespace llvm::driver

>From e53e689985088bbcdc253950a2ecc715592b5b3a Mon Sep 17 00:00:00 2001
From: fanju110
Date: Mon, 28 Apr 2025 21:49:36 +0800
Subject: [PATCH 06/12] Remove redundant function definitions

---
 flang/lib/Frontend/FrontendActions.cpp | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index c758aa18fbb8e..cdd2853bcd201 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -896,18 +896,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
   delete tlii;
 }
 
-// Default filename used for profile generation.
-namespace llvm {
-extern llvm::cl::opt DebugInfoCorrelate;
-extern llvm::cl::opt ProfileCorrelate;
-
-std::string getDefaultProfileGenName() {
-  return DebugInfoCorrelate ||
-                 ProfileCorrelate != llvm::InstrProfCorrelator::NONE
-             ? "default_%m.proflite"
-             : "default_%m.profraw";
-}
-} // namespace llvm
 
 void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   CompilerInstance &ci = getInstance();

>From 248175453354fecd078f5553576d16ce810e7808 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Wed, 30 Apr 2025 17:12:32 +0800
Subject: [PATCH 07/12] Move the interface to the cpp that uses it

---
 clang/include/clang/Basic/CodeGenOptions.def | 4 +-
 clang/include/clang/Basic/CodeGenOptions.h | 22 +++++----
 clang/include/clang/Basic/ProfileList.h | 9 ++--
 clang/lib/Basic/ProfileList.cpp | 20 ++++-----
 clang/lib/CodeGen/BackendUtil.cpp | 37 +++++++--------
 clang/lib/CodeGen/CodeGenAction.cpp | 4 +-
 clang/lib/CodeGen/CodeGenFunction.cpp | 3 +-
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 clang/lib/Frontend/CompilerInvocation.cpp | 6 +--
 .../include/flang/Frontend/CodeGenOptions.def | 8 ++--
 flang/include/flang/Frontend/CodeGenOptions.h | 28 +++++-------
 flang/lib/Frontend/CompilerInvocation.cpp | 11 ++---
flang/lib/Frontend/FrontendActions.cpp | 45 ++++--------------- .../llvm/Frontend/Driver/CodeGenOptions.h | 10 +++++ llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 12 +++++ 15 files changed, 101 insertions(+), 120 deletions(-) diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index a436c0ec98d5b..3dbadc85f23cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,9 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index e39a73bdb13ac..e4a74cd2c12cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. 
bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; + return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if any form of instrumentation is on. - bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } + bool hasProfileInstr() const { + return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; + } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == ProfileClangInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + } /// Check if type and variable info should be emitted. 
bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0c5aa6de3f3c0 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + 
llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if (SCL->inSection(Section, "default", "allow")) @@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 7557cb8408921..592e3bbbcc1cf 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -124,17 +124,10 @@ namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? 
"default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -834,12 +827,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. - PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? 
PGOOptions::CSIRUse @@ -847,31 +840,31 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); + llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling - PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. 
if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4d29ceace646f..10f0a12773498 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 0154799498f5e..2e8e90493c370 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, // If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. 
if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index cfc5c069b0849..ece7e5cad774c 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 4dec86cd8f51b..68396c4c40711 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Choose profile instrumenation kind or no instrumentation. +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Whether emit extra debug info for sample pgo profile collection. 
-CODEGENOPT(DebugInfoForProfiling, 1, 0) -CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index c9577862df832..6a989f0c0775e 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. - }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fmemory-profile-use. std::string MemoryProfileUsePath; - unsigned int DebugInfoForProfiling; - - unsigned int AtomicProfileUpdate; - /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; @@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == llvm::driver::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. 
- bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } // Define accessors/mutators for code generation options of enumeration type. #define CODEGENOPT(Name, Bits, Default) diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b28c2c0047579..43c8f4d596903 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = - args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = - args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + 
opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); } if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); } diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 8d1ab670e4db4..a650f54620543 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -133,21 +133,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// -static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), - llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, - "default", "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, - "optsize", "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, - "minsize", "Mark cold functions with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, - "optnone", - "Mark cold functions with optnone."))); - bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -910,19 +895,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } -// Default filename used for profile generation. -namespace llvm { -extern llvm::cl::opt DebugInfoCorrelate; -extern llvm::cl::opt ProfileCorrelate; - -std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || - ProfileCorrelate != llvm::InstrProfCorrelator::NONE - ? 
"default_%m.proflite" - : "default_%m.profraw"; -} -} // namespace llvm - void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts(); @@ -943,13 +915,14 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (opts.hasProfileIRInstr()) { // -fprofile-generate. - pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + pgoOpt = llvm::PGOOptions(opts.InstrProfileOutput.empty() + ? llvm::driver::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, + llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, + llvm::PGOOptions::ColdFuncOpt::Default, false, + /*PseudoProbeForProfiling=*/false, false); } else if (opts.hasProfileIRUse()) { llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); @@ -959,7 +932,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { pgoOpt = llvm::PGOOptions( opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, - ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + llvm::PGOOptions::ColdFuncOpt::Default, false); } llvm::StandardInstrumentations si(llvmModule->getContext(), diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index c51476e9ad3fe..3eb03cc3064cf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,6 +13,7 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H +#include 
namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -36,6 +37,15 @@ enum class VectorLibrary { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); +enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. +}; +// Default filename used for profile generation. +std::string getDefaultProfileGenName(); } // end namespace llvm::driver #endif diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index ed7c57a930aca..14b6b89da8465 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -8,8 +8,14 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/TargetParser/Triple.h" +namespace llvm { +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt + ProfileCorrelate; +} // namespace llvm namespace llvm::driver { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, @@ -56,4 +62,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, return TLII; } +std::string getDefaultProfileGenName() { + return llvm::DebugInfoCorrelate || + llvm::ProfileCorrelate != InstrProfCorrelator::NONE + ? 
"default_%m.proflite" + : "default_%m.profraw"; +} } // namespace llvm::driver >From 70fea2265a374f59345691f4ad7653ef4f0b6aa6 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:25:15 +0800 Subject: [PATCH 08/12] Move the interface to the cpp that uses it --- clang/include/clang/CodeGen/BackendUtil.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h index d9abf7bf962d2..92e0d13bf25b6 100644 --- a/clang/include/clang/CodeGen/BackendUtil.h +++ b/clang/include/clang/CodeGen/BackendUtil.h @@ -8,8 +8,6 @@ #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/PGOOptions.h" #include "clang/Basic/LLVM.h" #include "llvm/IR/ModuleSummaryIndex.h" @@ -21,7 +19,6 @@ template class Expected; template class IntrusiveRefCntPtr; class Module; class MemoryBufferRef; -extern cl::opt ClPGOColdFuncAttr; namespace vfs { class FileSystem; } // namespace vfs >From 5705d5eff937ca18eb44bec28a967a8629f0c085 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:26:22 +0800 Subject: [PATCH 09/12] Move the interface to the cpp that uses it --- flang/include/flang/Frontend/FrontendActions.h | 5 ----- 1 file changed, 5 deletions(-) diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h index 9ba74a9dad9be..f9a45bd6c0a56 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -20,13 +20,8 @@ #include "mlir/IR/OwningOpRef.h" #include "llvm/ADT/StringRef.h" #include "llvm/IR/Module.h" -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/PGOOptions.h" #include -namespace llvm { -extern cl::opt ClPGOColdFuncAttr; -} // namespace llvm namespace Fortran::frontend { //===----------------------------------------------------------------------===// >From 
016aab17f4cc73416c6ebca61240f269aac837d2 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:34:00 +0800 Subject: [PATCH 10/12] Fill in the missing code --- clang/lib/CodeGen/BackendUtil.cpp | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 2d33edbb8430d..6eb3a8638b7d1 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -103,6 +103,21 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP( "sanitizer-early-opt-ep", cl::Optional, cl::desc("Insert sanitizers on OptimizerEarlyEP.")); +// Experiment to mark cold functions as optsize/minsize/optnone. +// TODO: remove once this is exposed as a proper driver flag. +static cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, + cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); + extern cl::opt ProfileCorrelate; } // namespace llvm namespace clang { >From f36bfcfbfdc87b896f41be1ba25d8c18c339f1c1 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Thu, 1 May 2025 23:18:34 +0800 Subject: [PATCH 11/12] Adjusting the format of the code --- flang/test/Profile/gcc-flag-compatibility.f90 | 7 ------- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 7 ++++--- 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 index 0124bc79b87ef..4490c45232d28 100644 --- a/flang/test/Profile/gcc-flag-compatibility.f90 +++ 
b/flang/test/Profile/gcc-flag-compatibility.f90 @@ -9,24 +9,17 @@ ! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section ! PROFILE-GEN: @__profd_{{_?}}main = - - ! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof ! This uses LLVM IR format profile. ! RUN: rm -rf %t.dir ! RUN: mkdir -p %t.dir/some/path ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s -! - - - ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s ! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} ! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} - program main implicit none integer :: i diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 3eb03cc3064cf..98b9e1554f317 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -14,6 +14,7 @@ #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #include + namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -34,9 +35,6 @@ enum class VectorLibrary { AMDLIBM // AMD vector math library. }; -TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, - VectorLibrary Veclib); - enum ProfileInstrKind { ProfileNone, // Profile instrumentation is turned off. ProfileClangInstr, // Clang instrumentation to generate execution counts @@ -44,6 +42,9 @@ enum ProfileInstrKind { ProfileIRInstr, // IR level PGO instrumentation in LLVM. ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
 };
+TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
+                                  VectorLibrary Veclib);
+
 // Default filename used for profile generation.
 std::string getDefaultProfileGenName();
 } // end namespace llvm::driver

>From a5c7da77d2aa6909451bed3fb0f02c9b735dc876 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Fri, 2 May 2025 00:01:26 +0800
Subject: [PATCH 12/12] Adjusting the format of the code

---
 llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index 98b9e1554f317..84bba2a964ecf 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -20,6 +20,7 @@ class Triple;
 class TargetLibraryInfoImpl;
 } // namespace llvm

+
 namespace llvm::driver {

 /// Vector library option used with -fveclib=
@@ -42,6 +43,7 @@ enum ProfileInstrKind {
   ProfileIRInstr, // IR level PGO instrumentation in LLVM.
   ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
 };
+
 TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
                                   VectorLibrary Veclib);

From flang-commits at lists.llvm.org Thu May 1 09:07:47 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 01 May 2025 09:07:47 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098)
In-Reply-To:
Message-ID: <68139c53.a70a0220.1a3e2d.19bc@mx.google.com>

https://github.com/fanju110 updated https://github.com/llvm/llvm-project/pull/136098

>From 9494c9752400e4708dbc8b6a5ca4993ea9565e95 Mon Sep 17 00:00:00 2001
From: fanyikang
Date: Thu, 17 Apr 2025 15:17:07 +0800
Subject: [PATCH 01/13] Add support for IR PGO (-fprofile-generate/-fprofile-use=/file)

This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags:
-fprofile-generate for instrumentation-based profile generation
-fprofile-use=/file for profile-guided optimization

Co-Authored-By: ict-ql <168183727+ict-ql at users.noreply.github.com>
---
 clang/include/clang/Driver/Options.td | 4 +-
 clang/lib/Driver/ToolChains/Flang.cpp | 8 +++
 .../include/flang/Frontend/CodeGenOptions.def | 5 ++
 flang/include/flang/Frontend/CodeGenOptions.h | 49 +++++++++++++++++
 flang/lib/Frontend/CompilerInvocation.cpp | 12 +++++
 flang/lib/Frontend/FrontendActions.cpp | 54 +++++++++++++++++++
 flang/test/Driver/flang-f-opts.f90 | 5 ++
 .../Inputs/gcc-flag-compatibility_IR.proftext | 19 +++++++
 .../gcc-flag-compatibility_IR_entry.proftext | 14 +++++
 flang/test/Profile/gcc-flag-compatibility.f90 | 39 ++++++++++++++
 10 files changed, 207 insertions(+), 2 deletions(-)
 create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
 create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
 create mode 100644 flang/test/Profile/gcc-flag-compatibility.f90

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 
affc076a876ad..0b0dbc467c1e0 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1756,7 +1756,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFFFFFE">; def fprofile_generate : Flag<["-"], "fprofile-generate">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">, Group, Visibility<[ClangOption, CLOption]>, @@ -1773,7 +1773,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group, Visibility<[ClangOption, CLOption]>, Alias; def fprofile_use_EQ : Joined<["-"], "fprofile-use=">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, MetaVarName<"">, HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. Otherwise, it reads from file .">; def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">, diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index a8b4688aed09c..fcdbe8a6aba5a 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -882,6 +882,14 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); + + if (Args.hasArg(options::OPT_fprofile_generate)){ + CmdArgs.push_back("-fprofile-generate"); + } + if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) { + CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue())); + } + // Forward flags for OpenMP. 
We don't do this if the current action is an // device offloading action other than OpenMP. if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ, diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 57830bf51a1b3..4dec86cd8f51b 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,6 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Whether emit extra debug info for sample pgo profile collection. +CODEGENOPT(DebugInfoForProfiling, 1, 0) +CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..e052250f97e75 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,6 +148,55 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; + enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. + }; + + + /// Name of the profile file to use as output for -fprofile-instr-generate, + /// -fprofile-generate, and -fcs-profile-generate. 
+ std::string InstrProfileOutput; + + /// Name of the profile file to use as input for -fmemory-profile-use. + std::string MemoryProfileUsePath; + + unsigned int DebugInfoForProfiling; + + unsigned int AtomicProfileUpdate; + + /// Name of the profile file to use as input for -fprofile-instr-use + std::string ProfileInstrumentUsePath; + + /// Name of the profile remapping file to apply to the profile data supplied + /// by -fprofile-sample-use or -fprofile-instr-use. + std::string ProfileRemappingFile; + + /// Check if Clang profile instrumenation is on. + bool hasProfileClangInstr() const { + return getProfileInstr() == ProfileClangInstr; + } + + /// Check if IR level profile instrumentation is on. + bool hasProfileIRInstr() const { + return getProfileInstr() == ProfileIRInstr; + } + + /// Check if CS IR level profile instrumentation is on. + bool hasProfileCSIRInstr() const { + return getProfileInstr() == ProfileCSIRInstr; + } + /// Check if IR level profile use is on. + bool hasProfileIRUse() const { + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; + } + /// Check if CSIR profile use is on. + bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 6f87a18d69c3d..f013fce2f3cfc 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -27,6 +27,7 @@ #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" +#include "clang/Driver/Driver.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" @@ -431,6 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, opts.IsPIE = 1; } + if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { + opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); + opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + } + + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { + opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.ProfileInstrumentUsePath = A->getValue(); + } + // -mcmodel option. 
if (const llvm::opt::Arg *a = args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) { diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c1f47b12abee2..68880bdeecf8d 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -63,11 +63,14 @@ #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Target/TargetMachine.h" #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" #include "llvm/Transforms/Utils/ModuleUtils.h" +#include "llvm/Transforms/Instrumentation/InstrProfiling.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include #include @@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// + +static llvm::cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden, + llvm::cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); + bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -892,6 +909,20 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } + +// Default filename used for profile generation. 
+namespace llvm { + extern llvm::cl::opt DebugInfoCorrelate; + extern llvm::cl::opt ProfileCorrelate; + + +std::string getDefaultProfileGenName() { + return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE + ? "default_%m.proflite" + : "default_%m.profraw"; +} +} + void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts(); @@ -909,6 +940,29 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; + + if (opts.hasProfileIRInstr()){ + // // -fprofile-generate. + pgoOpt = llvm::PGOOptions( + opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + } + else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse + : llvm::PGOOptions::NoCSAction; + pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "", + opts.ProfileRemappingFile, + opts.MemoryProfileUsePath, VFS, + llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling); + } + llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90 index 4493a519e2010..b972b9b7b2a59 100644 --- a/flang/test/Driver/flang-f-opts.f90 +++ b/flang/test/Driver/flang-f-opts.f90 @@ -8,3 +8,8 @@ ! CHECK-LABEL: "-fc1" ! CHECK: -ffp-contract=off ! CHECK: -O3 + +! 
RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s +! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate" +! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s +! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}" diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext new file mode 100644 index 0000000000000..6a6df8b1d4d5b --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext @@ -0,0 +1,19 @@ +# IR level Instrumentation Flag +:ir +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 + +main +# Func Hash: +742261418966908927 +# Num Counters: +1 +# Counter Values: +1 + diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext new file mode 100644 index 0000000000000..9a46140286673 --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext @@ -0,0 +1,14 @@ +# IR level Instrumentation Flag +:ir +:entry_first +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 + + + diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 new file mode 100644 index 0000000000000..0124bc79b87ef --- /dev/null +++ b/flang/test/Profile/gcc-flag-compatibility.f90 @@ -0,0 +1,39 @@ +! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two +! flags behave similarly to their GCC counterparts: +! +! -fprofile-generate Generates the profile file ./default.profraw +! -fprofile-use=/file Uses the profile file /file + +! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto +! 
RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s +! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section +! PROFILE-GEN: @__profd_{{_?}}main = + + + +! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof +! This uses LLVM IR format profile. +! RUN: rm -rf %t.dir +! RUN: mkdir -p %t.dir/some/path +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s +! + + + +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s +! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} +! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} + + +program main + implicit none + integer :: i + integer :: X = 0 + + do i = 0, 99 + X = X + i + end do + +end program main >From b897c7aa1e21dfe46b4acf709f3ea38d9021c164 Mon Sep 17 00:00:00 2001 From: FYK Date: Wed, 23 Apr 2025 09:56:14 +0800 Subject: [PATCH 02/13] Update flang/lib/Frontend/FrontendActions.cpp Remove redundant comment symbols Co-authored-by: Tom Eccles --- flang/lib/Frontend/FrontendActions.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 68880bdeecf8d..cd13a6aca92cd 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -942,7 +942,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { std::optional pgoOpt; if (opts.hasProfileIRInstr()){ - // // -fprofile-generate. + // -fprofile-generate. pgoOpt = llvm::PGOOptions( opts.InstrProfileOutput.empty() ? 
llvm::getDefaultProfileGenName() : opts.InstrProfileOutput, >From bc5adfcc4ac3456f587bedd48c1a8892d27e53ae Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 25 Apr 2025 11:48:30 +0800 Subject: [PATCH 03/13] format code with clang-format --- flang/include/flang/Frontend/CodeGenOptions.h | 17 ++-- flang/lib/Frontend/CompilerInvocation.cpp | 15 ++-- flang/lib/Frontend/FrontendActions.cpp | 83 +++++++++---------- .../Inputs/gcc-flag-compatibility_IR.proftext | 3 +- .../gcc-flag-compatibility_IR_entry.proftext | 5 +- 5 files changed, 59 insertions(+), 64 deletions(-) diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index e052250f97e75..c9577862df832 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -156,7 +156,6 @@ class CodeGenOptions : public CodeGenOptionsBase { ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -171,7 +170,7 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; - /// Name of the profile remapping file to apply to the profile data supplied + /// Name of the profile remapping file to apply to the profile data supplied /// by -fprofile-sample-use or -fprofile-instr-use. std::string ProfileRemappingFile; @@ -181,19 +180,17 @@ class CodeGenOptions : public CodeGenOptionsBase { } /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; - } + bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. 
bool hasProfileCSIRInstr() const { return getProfileInstr() == ProfileCSIRInstr; } - /// Check if IR level profile use is on. - bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; - } + /// Check if IR level profile use is on. + bool hasProfileIRUse() const { + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; + } /// Check if CSIR profile use is on. bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index f013fce2f3cfc..b28c2c0047579 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -27,7 +27,6 @@ #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" -#include "clang/Driver/Driver.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" @@ -433,13 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr( + Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.DebugInfoForProfiling = + args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); + opts.AtomicProfileUpdate = + args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); } - + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse( + 
Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); } diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index cd13a6aca92cd..8d1ab670e4db4 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -56,21 +56,21 @@ #include "llvm/Passes/PassBuilder.h" #include "llvm/Passes/PassPlugin.h" #include "llvm/Passes/StandardInstrumentations.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/Error.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/FileSystem.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" -#include "llvm/Support/PGOOptions.h" #include "llvm/Target/TargetMachine.h" #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" -#include "llvm/Transforms/Utils/ModuleUtils.h" #include "llvm/Transforms/Instrumentation/InstrProfiling.h" -#include "llvm/ProfileData/InstrProfCorrelator.h" +#include "llvm/Transforms/Utils/ModuleUtils.h" #include #include @@ -133,19 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// - static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions 
with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); + "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), + llvm::cl::Hidden, + llvm::cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, + "default", "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, + "optsize", "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, + "minsize", "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, + "optnone", + "Mark cold functions with optnone."))); bool PrescanAction::beginSourceFileAction() { return runPrescan(); } @@ -909,19 +910,18 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } - // Default filename used for profile generation. namespace llvm { - extern llvm::cl::opt DebugInfoCorrelate; - extern llvm::cl::opt ProfileCorrelate; - +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt ProfileCorrelate; std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE + return DebugInfoCorrelate || + ProfileCorrelate != llvm::InstrProfCorrelator::NONE ? "default_%m.proflite" : "default_%m.profraw"; } -} +} // namespace llvm void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); @@ -940,29 +940,28 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; - - if (opts.hasProfileIRInstr()){ + + if (opts.hasProfileIRInstr()) { // -fprofile-generate. pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? 
llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); - } - else if (opts.hasProfileIRUse()) { - llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); - // -fprofile-use. - auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse - : llvm::PGOOptions::NoCSAction; - pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "", - opts.ProfileRemappingFile, - opts.MemoryProfileUsePath, VFS, - llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling); - } - + opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + } else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = + llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? 
llvm::PGOOptions::CSIRUse + : llvm::PGOOptions::NoCSAction; + pgoOpt = llvm::PGOOptions( + opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, + opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, + ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + } + llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext index 6a6df8b1d4d5b..2650fb5ebfd35 100644 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext @@ -15,5 +15,4 @@ main # Num Counters: 1 # Counter Values: -1 - +1 \ No newline at end of file diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext index 9a46140286673..c4a2a26557e80 100644 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext @@ -8,7 +8,4 @@ _QQmain 2 # Counter Values: 100 -1 - - - +1 \ No newline at end of file >From d64d9d95fb97d6cfa4bf4192bfb20f5c8d6b3bc3 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 25 Apr 2025 11:53:47 +0800 Subject: [PATCH 04/13] simplify push_back usage --- clang/lib/Driver/ToolChains/Flang.cpp | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index fcdbe8a6aba5a..9c7e87c455e44 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -882,13 +882,9 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); - - if (Args.hasArg(options::OPT_fprofile_generate)){ - 
CmdArgs.push_back("-fprofile-generate"); - } - if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) { - CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue())); - } + // recognise options: fprofile-generate -fprofile-use= + Args.addAllArgs( + CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ}); // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. >From 22475a85d24b22fb44ca5a5ce26542b556bae280 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Mon, 28 Apr 2025 20:33:54 +0800 Subject: [PATCH 05/13] Port the getDefaultProfileGenName definition and the ProfileInstrKind definition from clang to the llvm namespace to allow flang to reuse these code. --- clang/include/clang/Basic/CodeGenOptions.def | 4 +- clang/include/clang/Basic/CodeGenOptions.h | 22 +++++--- clang/include/clang/Basic/ProfileList.h | 9 ++-- clang/include/clang/CodeGen/BackendUtil.h | 3 ++ clang/lib/Basic/ProfileList.cpp | 20 ++++---- clang/lib/CodeGen/BackendUtil.cpp | 50 ++++++------------- clang/lib/CodeGen/CodeGenAction.cpp | 4 +- clang/lib/CodeGen/CodeGenFunction.cpp | 3 +- clang/lib/CodeGen/CodeGenModule.cpp | 2 +- clang/lib/Frontend/CompilerInvocation.cpp | 6 +-- .../include/flang/Frontend/CodeGenOptions.def | 8 +-- flang/include/flang/Frontend/CodeGenOptions.h | 28 ++++------- .../include/flang/Frontend/FrontendActions.h | 5 ++ flang/lib/Frontend/CompilerInvocation.cpp | 11 ++-- flang/lib/Frontend/FrontendActions.cpp | 28 +++-------- .../llvm/Frontend/Driver/CodeGenOptions.h | 15 +++++- llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 25 ++++++++++ 17 files changed, 123 insertions(+), 120 deletions(-) diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index a436c0ec98d5b..3dbadc85f23cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,9 +223,9 @@ 
AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index e39a73bdb13ac..e4a74cd2c12cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; + return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if any form of instrumentation is on. 
- bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } + bool hasProfileInstr() const { + return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; + } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == ProfileClangInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + } /// Check if type and variable info should be emitted. bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff 
--git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h index 92e0d13bf25b6..d9abf7bf962d2 100644 --- a/clang/include/clang/CodeGen/BackendUtil.h +++ b/clang/include/clang/CodeGen/BackendUtil.h @@ -8,6 +8,8 @@ #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include "clang/Basic/LLVM.h" #include "llvm/IR/ModuleSummaryIndex.h" @@ -19,6 +21,7 @@ template class Expected; template class IntrusiveRefCntPtr; class Module; class MemoryBufferRef; +extern cl::opt ClPGOColdFuncAttr; namespace vfs { class FileSystem; } // namespace vfs diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0c5aa6de3f3c0 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if 
(SCL->inSection(Section, "default", "allow")) @@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 7557cb8408921..963ed321b2cb9 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -103,38 +103,16 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP( "sanitizer-early-opt-ep", cl::Optional, cl::desc("Insert sanitizers on OptimizerEarlyEP.")); -// Experiment to mark cold functions as optsize/minsize/optnone. -// TODO: remove once this is exposed as a proper driver flag. 
-static cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, - cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions with minsize."), - clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); - extern cl::opt ProfileCorrelate; } // namespace llvm namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? "default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -834,12 +812,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. 
- PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse @@ -847,14 +825,14 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) @@ -863,15 +841,15 @@ void EmitAssemblyHelper::RunOptimizationPipeline( ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling - PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", 
nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4d29ceace646f..10f0a12773498 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 0154799498f5e..2e8e90493c370 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, 
// If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index cfc5c069b0849..ece7e5cad774c 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 4dec86cd8f51b..68396c4c40711 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Choose profile instrumenation kind or no instrumentation. 
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Whether emit extra debug info for sample pgo profile collection. -CODEGENOPT(DebugInfoForProfiling, 1, 0) -CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index c9577862df832..6a989f0c0775e 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. - }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fmemory-profile-use. 
std::string MemoryProfileUsePath; - unsigned int DebugInfoForProfiling; - - unsigned int AtomicProfileUpdate; - /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; @@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == llvm::driver::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h index f9a45bd6c0a56..9ba74a9dad9be 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -20,8 +20,13 @@ #include "mlir/IR/OwningOpRef.h" #include "llvm/ADT/StringRef.h" #include "llvm/IR/Module.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include +namespace llvm { +extern cl::opt ClPGOColdFuncAttr; +} // namespace llvm namespace Fortran::frontend { //===----------------------------------------------------------------------===// diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b28c2c0047579..43c8f4d596903 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = - args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = - args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); } if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); 
} diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 8d1ab670e4db4..c758aa18fbb8e 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -28,6 +28,7 @@ #include "flang/Semantics/unparse-with-symbols.h" #include "flang/Support/default-kinds.h" #include "flang/Tools/CrossToolHelpers.h" +#include "clang/CodeGen/BackendUtil.h" #include "mlir/IR/Dialect.h" #include "mlir/Parser/Parser.h" @@ -133,21 +134,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// -static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), - llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, - "default", "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, - "optsize", "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, - "minsize", "Mark cold functions with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, - "optnone", - "Mark cold functions with optnone."))); - bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -944,12 +930,12 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (opts.hasProfileIRInstr()) { // -fprofile-generate. pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, + opts.InstrProfileOutput.empty() + ? 
llvm::driver::getDefaultProfileGenName() + : opts.InstrProfileOutput, "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + llvm::PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, false, + /*PseudoProbeForProfiling=*/false, false); } else if (opts.hasProfileIRUse()) { llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); @@ -959,7 +945,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { pgoOpt = llvm::PGOOptions( opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, - ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + llvm::ClPGOColdFuncAttr, false); } llvm::StandardInstrumentations si(llvmModule->getContext(), diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index c51476e9ad3fe..6188c20cb29cf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,9 +13,14 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H +#include "llvm/ProfileData/InstrProfCorrelator.h" +#include namespace llvm { class Triple; class TargetLibraryInfoImpl; +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt + ProfileCorrelate; } // namespace llvm namespace llvm::driver { @@ -35,7 +40,15 @@ enum class VectorLibrary { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); - +enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
+}; +// Default filename used for profile generation. +std::string getDefaultProfileGenName(); } // end namespace llvm::driver #endif diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index ed7c57a930aca..818dcd3752437 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -8,7 +8,26 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/TargetParser/Triple.h" +namespace llvm { +// Experiment to mark cold functions as optsize/minsize/optnone. +// TODO: remove once this is exposed as a proper driver flag. +cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", cl::init(llvm::PGOOptions::ColdFuncOpt::Default), + cl::Hidden, + cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); +} // namespace llvm namespace llvm::driver { @@ -56,4 +75,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, return TLII; } +std::string getDefaultProfileGenName() { + return llvm::DebugInfoCorrelate || + llvm::ProfileCorrelate != InstrProfCorrelator::NONE + ? 
"default_%m.proflite" + : "default_%m.profraw"; +} } // namespace llvm::driver >From e53e689985088bbcdc253950a2ecc715592b5b3a Mon Sep 17 00:00:00 2001 From: fanju110 Date: Mon, 28 Apr 2025 21:49:36 +0800 Subject: [PATCH 06/13] Remove redundant function definitions --- flang/lib/Frontend/FrontendActions.cpp | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c758aa18fbb8e..cdd2853bcd201 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -896,18 +896,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } -// Default filename used for profile generation. -namespace llvm { -extern llvm::cl::opt DebugInfoCorrelate; -extern llvm::cl::opt ProfileCorrelate; - -std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || - ProfileCorrelate != llvm::InstrProfCorrelator::NONE - ? "default_%m.proflite" - : "default_%m.profraw"; -} -} // namespace llvm void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); >From 248175453354fecd078f5553576d16ce810e7808 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:12:32 +0800 Subject: [PATCH 07/13] Move the interface to the cpp that uses it --- clang/include/clang/Basic/CodeGenOptions.def | 4 +- clang/include/clang/Basic/CodeGenOptions.h | 22 +++++---- clang/include/clang/Basic/ProfileList.h | 9 ++-- clang/lib/Basic/ProfileList.cpp | 20 ++++----- clang/lib/CodeGen/BackendUtil.cpp | 37 +++++++-------- clang/lib/CodeGen/CodeGenAction.cpp | 4 +- clang/lib/CodeGen/CodeGenFunction.cpp | 3 +- clang/lib/CodeGen/CodeGenModule.cpp | 2 +- clang/lib/Frontend/CompilerInvocation.cpp | 6 +-- .../include/flang/Frontend/CodeGenOptions.def | 8 ++-- flang/include/flang/Frontend/CodeGenOptions.h | 28 +++++------- flang/lib/Frontend/CompilerInvocation.cpp | 11 ++--- 
flang/lib/Frontend/FrontendActions.cpp | 45 ++++--------------- .../llvm/Frontend/Driver/CodeGenOptions.h | 10 +++++ llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 12 +++++ 15 files changed, 101 insertions(+), 120 deletions(-) diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index a436c0ec98d5b..3dbadc85f23cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,9 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index e39a73bdb13ac..e4a74cd2c12cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. 
bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; + return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if any form of instrumentation is on. - bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } + bool hasProfileInstr() const { + return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; + } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == ProfileClangInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + } /// Check if type and variable info should be emitted. 
bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0c5aa6de3f3c0 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + 
llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if (SCL->inSection(Section, "default", "allow")) @@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 7557cb8408921..592e3bbbcc1cf 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -124,17 +124,10 @@ namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? 
"default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -834,12 +827,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. - PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? 
PGOOptions::CSIRUse @@ -847,31 +840,31 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); + llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling - PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. 
if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4d29ceace646f..10f0a12773498 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 0154799498f5e..2e8e90493c370 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, // If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. 
if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index cfc5c069b0849..ece7e5cad774c 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 4dec86cd8f51b..68396c4c40711 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Choose profile instrumenation kind or no instrumentation. +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Whether emit extra debug info for sample pgo profile collection. 
-CODEGENOPT(DebugInfoForProfiling, 1, 0) -CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index c9577862df832..6a989f0c0775e 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. - }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fmemory-profile-use. std::string MemoryProfileUsePath; - unsigned int DebugInfoForProfiling; - - unsigned int AtomicProfileUpdate; - /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; @@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == llvm::driver::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. 
- bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } // Define accessors/mutators for code generation options of enumeration type. #define CODEGENOPT(Name, Bits, Default) diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b28c2c0047579..43c8f4d596903 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = - args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = - args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + 
opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); } if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); } diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 8d1ab670e4db4..a650f54620543 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -133,21 +133,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// -static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), - llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, - "default", "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, - "optsize", "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, - "minsize", "Mark cold functions with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, - "optnone", - "Mark cold functions with optnone."))); - bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -910,19 +895,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } -// Default filename used for profile generation. -namespace llvm { -extern llvm::cl::opt DebugInfoCorrelate; -extern llvm::cl::opt ProfileCorrelate; - -std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || - ProfileCorrelate != llvm::InstrProfCorrelator::NONE - ? 
"default_%m.proflite" - : "default_%m.profraw"; -} -} // namespace llvm - void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts(); @@ -943,13 +915,14 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (opts.hasProfileIRInstr()) { // -fprofile-generate. - pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + pgoOpt = llvm::PGOOptions(opts.InstrProfileOutput.empty() + ? llvm::driver::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, + llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, + llvm::PGOOptions::ColdFuncOpt::Default, false, + /*PseudoProbeForProfiling=*/false, false); } else if (opts.hasProfileIRUse()) { llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); @@ -959,7 +932,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { pgoOpt = llvm::PGOOptions( opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, - ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + llvm::PGOOptions::ColdFuncOpt::Default, false); } llvm::StandardInstrumentations si(llvmModule->getContext(), diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index c51476e9ad3fe..3eb03cc3064cf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,6 +13,7 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H +#include 
namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -36,6 +37,15 @@ enum class VectorLibrary { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); +enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. +}; +// Default filename used for profile generation. +std::string getDefaultProfileGenName(); } // end namespace llvm::driver #endif diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index ed7c57a930aca..14b6b89da8465 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -8,8 +8,14 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/TargetParser/Triple.h" +namespace llvm { +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt + ProfileCorrelate; +} // namespace llvm namespace llvm::driver { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, @@ -56,4 +62,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, return TLII; } +std::string getDefaultProfileGenName() { + return llvm::DebugInfoCorrelate || + llvm::ProfileCorrelate != InstrProfCorrelator::NONE + ? 
"default_%m.proflite" + : "default_%m.profraw"; +} } // namespace llvm::driver >From 70fea2265a374f59345691f4ad7653ef4f0b6aa6 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:25:15 +0800 Subject: [PATCH 08/13] Move the interface to the cpp that uses it --- clang/include/clang/CodeGen/BackendUtil.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h index d9abf7bf962d2..92e0d13bf25b6 100644 --- a/clang/include/clang/CodeGen/BackendUtil.h +++ b/clang/include/clang/CodeGen/BackendUtil.h @@ -8,8 +8,6 @@ #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/PGOOptions.h" #include "clang/Basic/LLVM.h" #include "llvm/IR/ModuleSummaryIndex.h" @@ -21,7 +19,6 @@ template class Expected; template class IntrusiveRefCntPtr; class Module; class MemoryBufferRef; -extern cl::opt ClPGOColdFuncAttr; namespace vfs { class FileSystem; } // namespace vfs >From 5705d5eff937ca18eb44bec28a967a8629f0c085 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:26:22 +0800 Subject: [PATCH 09/13] Move the interface to the cpp that uses it --- flang/include/flang/Frontend/FrontendActions.h | 5 ----- 1 file changed, 5 deletions(-) diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h index 9ba74a9dad9be..f9a45bd6c0a56 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -20,13 +20,8 @@ #include "mlir/IR/OwningOpRef.h" #include "llvm/ADT/StringRef.h" #include "llvm/IR/Module.h" -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/PGOOptions.h" #include -namespace llvm { -extern cl::opt ClPGOColdFuncAttr; -} // namespace llvm namespace Fortran::frontend { //===----------------------------------------------------------------------===// >From 
016aab17f4cc73416c6ebca61240f269aac837d2 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:34:00 +0800 Subject: [PATCH 10/13] Fill in the missing code --- clang/lib/CodeGen/BackendUtil.cpp | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 2d33edbb8430d..6eb3a8638b7d1 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -103,6 +103,21 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP( "sanitizer-early-opt-ep", cl::Optional, cl::desc("Insert sanitizers on OptimizerEarlyEP.")); +// Experiment to mark cold functions as optsize/minsize/optnone. +// TODO: remove once this is exposed as a proper driver flag. +static cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, + cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); + extern cl::opt ProfileCorrelate; } // namespace llvm namespace clang { >From f36bfcfbfdc87b896f41be1ba25d8c18c339f1c1 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Thu, 1 May 2025 23:18:34 +0800 Subject: [PATCH 11/13] Adjusting the format of the code --- flang/test/Profile/gcc-flag-compatibility.f90 | 7 ------- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 7 ++++--- 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 index 0124bc79b87ef..4490c45232d28 100644 --- a/flang/test/Profile/gcc-flag-compatibility.f90 +++ 
b/flang/test/Profile/gcc-flag-compatibility.f90 @@ -9,24 +9,17 @@ ! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section ! PROFILE-GEN: @__profd_{{_?}}main = - - ! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof ! This uses LLVM IR format profile. ! RUN: rm -rf %t.dir ! RUN: mkdir -p %t.dir/some/path ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s -! - - - ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s ! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} ! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} - program main implicit none integer :: i diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 3eb03cc3064cf..98b9e1554f317 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -14,6 +14,7 @@ #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #include + namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -34,9 +35,6 @@ enum class VectorLibrary { AMDLIBM // AMD vector math library. }; -TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, - VectorLibrary Veclib); - enum ProfileInstrKind { ProfileNone, // Profile instrumentation is turned off. ProfileClangInstr, // Clang instrumentation to generate execution counts @@ -44,6 +42,9 @@ enum ProfileInstrKind { ProfileIRInstr, // IR level PGO instrumentation in LLVM. ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
}; +TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, + VectorLibrary Veclib); + // Default filename used for profile generation. std::string getDefaultProfileGenName(); } // end namespace llvm::driver >From a5c7da77d2aa6909451bed3fb0f02c9b735dc876 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 2 May 2025 00:01:26 +0800 Subject: [PATCH 12/13] Adjusting the format of the code --- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 98b9e1554f317..84bba2a964ecf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -20,6 +20,7 @@ class Triple; class TargetLibraryInfoImpl; } // namespace llvm + namespace llvm::driver { /// Vector library option used with -fveclib= @@ -42,6 +43,7 @@ enum ProfileInstrKind { ProfileIRInstr, // IR level PGO instrumentation in LLVM. ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
}; + TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); >From a99e16b29d70d2fea6d16ec06e6ca55f477b74e9 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 2 May 2025 00:07:23 +0800 Subject: [PATCH 13/13] Adjusting the format of the code --- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 1 - llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 84bba2a964ecf..f0baa6fcdbbd3 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -20,7 +20,6 @@ class Triple; class TargetLibraryInfoImpl; } // namespace llvm - namespace llvm::driver { /// Vector library option used with -fveclib= diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index 14b6b89da8465..c48f5ed68b10b 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -16,6 +16,7 @@ extern llvm::cl::opt DebugInfoCorrelate; extern llvm::cl::opt ProfileCorrelate; } // namespace llvm + namespace llvm::driver { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, From flang-commits at lists.llvm.org Thu May 1 09:17:29 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 01 May 2025 09:17:29 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Allow UPDATE clause to not have any arguments (PR #137521) In-Reply-To: Message-ID: <68139e99.170a0220.1898e9.ff27@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137521 >From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 07:58:29 -0500 Subject: [PATCH 1/3] [flang][OpenMP] Mark atomic clauses as unique The current 
implementation of the ATOMIC construct handles these clauses individually, and this change does not have an observable effect. At the same time these clauses are unique as per the OpenMP spec, and this patch reflects that in the OMP.td file.
---
 llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index eff6d57995d2b..cdfd3e3223fa8 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> {
   ];
 }
 def OMP_Atomic : Directive<"atomic"> {
-  let allowedClauses = [
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-  ];
   let allowedOnceClauses = [
     VersionedClause,
     VersionedClause,
+    VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
+    VersionedClause,
   ];
   let association = AS_Block;
   let category = CA_Executable;
@@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> {
   let category = CA_Executable;
 }
 def OMP_Critical : Directive<"critical"> {
-  let allowedClauses = [
+  let allowedOnceClauses = [
     VersionedClause,
   ];
   let association = AS_Block;

>From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 10:08:58 -0500
Subject: [PATCH 2/3] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs

The OpenMP implementation of the ATOMIC construct will change in the near future to accommodate OpenMP 6.0. This patch separates the shared implementations to avoid interfering with OpenACC.
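Note on PATCH 1/3: the distinction between `allowedClauses` and `allowedOnceClauses` can be sketched roughly as follows. This is only an illustration of the "at most once" rule, not the TableGen-generated code or flang's actual clause checker; `DirectiveSpec`, `findClauseViolations`, `exampleUsage`, and the clause name strings are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the two clause sets of a directive.
struct DirectiveSpec {
  std::set<std::string> allowedClauses;     // may appear any number of times
  std::set<std::string> allowedOnceClauses; // may appear at most once
};

// Returns the clauses that violate the spec: clauses that are not allowed
// at all, and "once" clauses that are repeated. A repeated clause is
// reported a single time, on its second occurrence.
std::vector<std::string>
findClauseViolations(const DirectiveSpec &spec,
                     const std::vector<std::string> &clauses) {
  std::map<std::string, int> seen;
  std::vector<std::string> violations;
  for (const std::string &clause : clauses) {
    int count = ++seen[clause];
    bool repeatable = spec.allowedClauses.count(clause) != 0;
    bool once = spec.allowedOnceClauses.count(clause) != 0;
    if (!repeatable && !once)
      violations.push_back(clause); // clause is not allowed on this directive
    else if (!repeatable && count == 2)
      violations.push_back(clause); // unique clause appeared a second time
  }
  return violations;
}

// Usage sketch: a directive with one unique clause (placeholder name).
inline void exampleUsage() {
  DirectiveSpec critical{{}, {"hint"}};
  assert(findClauseViolations(critical, {"hint"}).empty());
  assert(findClauseViolations(critical, {"hint", "hint"}).size() == 1);
}
```

Under this reading, moving a clause from the repeatable set to the "once" set only changes behavior for inputs that repeat the clause, which is consistent with the commit message's remark that the change has no observable effect when each clause is already handled individually.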
--- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t 
hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. - auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite,
-                                                       loc);
+            genAtomicWrite(converter, atomicWrite, loc);
           },
           [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) {
-            Fortran::lower::genOmpAccAtomicUpdate<
-                Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate,
-                                                        loc);
+            genAtomicUpdate(converter, atomicUpdate, loc);
           },
           [&](const Fortran::parser::AccAtomicCapture &atomicCapture) {
-            Fortran::lower::genOmpAccAtomicCapture<
-                Fortran::parser::AccAtomicCapture, void>(converter,
-                                                         atomicCapture, loc);
+            genAtomicCapture(converter, atomicCapture, loc);
           },
       },
       atomicConstruct.u);
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 312557d5da07e..fdd85e94829f3 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
       queue, item, clauseOps);
 }
 
+//===----------------------------------------------------------------------===//
+// Code generation for atomic operations
+//===----------------------------------------------------------------------===//
+
+/// Populates \p hint and \p memoryOrder with appropriate clause information
+/// if present on atomic construct.
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 3/3] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
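As a rough illustration of the shape described in this commit message, the sketch below models a clause whose argument is optional: absent on ATOMIC, present on DEPOBJ. It is a minimal sketch under stated assumptions, not the flang implementation. `Directive`, `DependenceType`, `TaskDependenceType`, `UpdateArg`, `UpdateClause`, `LoweredUpdate`, `toTaskDep`, `lowerUpdate`, and `checkUpdate` are all hypothetical stand-ins, and the enum values and the mapping in `toTaskDep` are arbitrary.

```cpp
#include <cassert>
#include <optional>
#include <variant>

// Hypothetical stand-ins for illustration; the real flang types differ.
enum class Directive { Atomic, Depobj };
enum class DependenceType { Sink, Source };
enum class TaskDependenceType { In, Out, Inout };

// When an argument is present, it is one of two alternatives.
using UpdateArg = std::variant<DependenceType, TaskDependenceType>;

// UPDATE with an optional argument: empty on ATOMIC, set on DEPOBJ.
struct UpdateClause {
  std::optional<UpdateArg> v;
};

// Lowered form: a single optional dependence kind.
struct LoweredUpdate {
  std::optional<TaskDependenceType> depType;
};

// Arbitrary mapping, for illustration only.
inline TaskDependenceType toTaskDep(DependenceType d) {
  return d == DependenceType::Sink ? TaskDependenceType::In
                                   : TaskDependenceType::Out;
}
inline TaskDependenceType toTaskDep(TaskDependenceType t) { return t; }

// Visit the argument if there is one; otherwise produce an empty result.
inline LoweredUpdate lowerUpdate(const UpdateClause &clause) {
  if (clause.v) {
    return std::visit(
        [](auto &&s) { return LoweredUpdate{toTaskDep(s)}; }, *clause.v);
  }
  return LoweredUpdate{std::nullopt};
}

// Returns true when the clause carries an argument that needs further checks.
// An argument-free UPDATE is only expected on ATOMIC in this sketch.
inline bool checkUpdate(Directive dir, const UpdateClause &clause) {
  (void)dir; // silence the unused-parameter warning when asserts are disabled
  if (!clause.v) {
    assert(dir == Directive::Atomic &&
           "argument-free UPDATE expected only on ATOMIC");
    return false;
  }
  return true;
}
```

In this sketch, `lowerUpdate(UpdateClause{})` yields an empty `depType`, while a clause constructed with an `UpdateArg` goes through the visitor. Keeping the optionality in the clause type lets one definition serve both directives.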
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; From flang-commits at lists.llvm.org Thu May 1 10:10:19 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 10:10:19 -0700 (PDT) Subject: [flang-commits] [flang] 35c76eb - [flang][OpenMP] Always set "openmp_flags" (#138153) Message-ID: <6813aafb.050a0220.323a93.55b1@mx.google.com> Author: Krzysztof Parzyszek Date: 2025-05-01T12:10:15-05:00 New Revision: 35c76eb195990865ee63a0aba3c18ad3c4189e73 URL: https://github.com/llvm/llvm-project/commit/35c76eb195990865ee63a0aba3c18ad3c4189e73 DIFF: https://github.com/llvm/llvm-project/commit/35c76eb195990865ee63a0aba3c18ad3c4189e73.diff LOG: [flang][OpenMP] Always set "openmp_flags" (#138153) Many OpenMP tests use "%openmp_flags" in the RUN line. In many OpenMP lit tests this variable is expected to at least have "-fopenmp" in it. However, in the lit config this variable was only given a value when the OpenMP runtime build was enabled. If the runtime build was not enabled, %openmp_flags would expand to an empty string, and unless a lit test specifically used -fopenmp in the RUN line, OpenMP would be disabled. This patch sets %openmp_flags to start with "-fopenmp" regardless of the build configuration. 
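The before/after behavior described in this log message can be sketched as follows. This is a C++ paraphrase for illustration only, since the actual logic lives in the Python lit configuration. `LitConfig`, `flagsBefore`, and `flagsAfter` are hypothetical names, and the empty-string result in `flagsBefore` follows the wording of the message rather than any verified lit behavior.

```cpp
#include <string>

// Hypothetical stand-in for the two configuration values the patch consults.
struct LitConfig {
  bool haveOpenmpRtl;          // was the OpenMP runtime built?
  std::string openmpModuleDir; // empty when no module directory is configured
};

// Before the patch: the substitution only got a value when the runtime
// build was enabled, so it was effectively empty otherwise.
inline std::string flagsBefore(const LitConfig &c) {
  if (!c.haveOpenmpRtl)
    return "";
  if (!c.openmpModuleDir.empty())
    return "-fopenmp -J " + c.openmpModuleDir;
  return "-fopenmp";
}

// After the patch: always start from "-fopenmp" and append the module
// directory only when the runtime was built and a directory is known.
inline std::string flagsAfter(const LitConfig &c) {
  std::string flags = "-fopenmp";
  if (c.haveOpenmpRtl && !c.openmpModuleDir.empty())
    flags += " -J " + c.openmpModuleDir;
  return flags;
}
```

The two functions differ only when the runtime was not built: `flagsBefore` returns an empty string there, while `flagsAfter` still returns `-fopenmp`.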
Added: Modified: flang/test/lit.cfg.py Removed: ################################################################################ diff --git a/flang/test/lit.cfg.py b/flang/test/lit.cfg.py index aa27fdc2fe412..7eb57670ac767 100644 --- a/flang/test/lit.cfg.py +++ b/flang/test/lit.cfg.py @@ -178,17 +178,15 @@ config.environment["LIBPGMATH"] = True # Determine if OpenMP runtime was built (enable OpenMP tests via REQUIRES in test file) +openmp_flags_substitution = "-fopenmp" if config.have_openmp_rtl: config.available_features.add("openmp_runtime") # For the enabled OpenMP tests, add a substitution that is needed in the tests to find # the omp_lib.{h,mod} files, depending on whether the OpenMP runtime was built as a # project or runtime. if config.openmp_module_dir: - config.substitutions.append( - ("%openmp_flags", f"-fopenmp -J {config.openmp_module_dir}") - ) - else: - config.substitutions.append(("%openmp_flags", "-fopenmp")) + openmp_flags_substitution += f" -J {config.openmp_module_dir}" +config.substitutions.append(("%openmp_flags", openmp_flags_substitution)) # Add features and substitutions to test F128 math support. 
# %f128-lib substitution may be used to generate check prefixes From flang-commits at lists.llvm.org Thu May 1 10:10:22 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 01 May 2025 10:10:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Always set "openmp_flags" (PR #138153) In-Reply-To: Message-ID: <6813aafe.170a0220.18fa54.2e58@mx.google.com> https://github.com/kparzysz closed https://github.com/llvm/llvm-project/pull/138153 From flang-commits at lists.llvm.org Thu May 1 10:13:12 2025 From: flang-commits at lists.llvm.org (Arthur Eubanks via flang-commits) Date: Thu, 01 May 2025 10:13:12 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6813aba8.170a0220.35025b.32e5@mx.google.com> ================ @@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// + +static llvm::cl::opt ClPGOColdFuncAttr( ---------------- aeubanks wrote: sorry, I missed this, yeah promoting this to a proper frontend option is fine https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Thu May 1 11:30:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 11:30:50 -0700 (PDT) Subject: [flang-commits] [flang] 760bba4 - [flang][OpenMP] Allow UPDATE clause to not have any arguments (#137521) Message-ID: <6813bdda.170a0220.2b1ed7.4b72@mx.google.com> Author: Krzysztof Parzyszek Date: 2025-05-01T13:30:45-05:00 New Revision: 760bba4666d6cdb7b4aef3c8ce9a242f59e39216 URL: https://github.com/llvm/llvm-project/commit/760bba4666d6cdb7b4aef3c8ce9a242f59e39216 DIFF: https://github.com/llvm/llvm-project/commit/760bba4666d6cdb7b4aef3c8ce9a242f59e39216.diff LOG: [flang][OpenMP] Allow UPDATE clause to not have any arguments 
(#137521) The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. Added: Modified: flang/include/flang/Parser/parse-tree.h flang/lib/Lower/OpenMP/Clauses.cpp flang/lib/Parser/openmp-parsers.cpp flang/lib/Semantics/check-omp-structure.cpp llvm/include/llvm/Frontend/OpenMP/OMP.td Removed: ################################################################################ diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + 
return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - 
"TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. + (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index d9fe32bae1c27..f654fe6e4681a 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4571,10 +4571,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - 
assert(((depType == nullptr) != (taskType == nullptr)) &&
-         "Unexpected alternative in update clause");
+  const parser::OmpDependenceType *depType{nullptr};
+  const parser::OmpTaskDependenceType *taskType{nullptr};
+  if (auto &maybeUpdate{x.v}) {
+    depType = std::get_if(&maybeUpdate->u);
+    taskType = std::get_if(&maybeUpdate->u);
+  }
+
+  if (!depType && !taskType) {
+    assert(dir == llvm::omp::Directive::OMPD_atomic &&
+        "Unexpected alternative in update clause");
+    return;
+  }

   if (depType) {
     CheckDependenceType(depType->v);
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index cdfd3e3223fa8..f4e400b651c31 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> {
 def OMPC_Update : Clause<"update"> {
   let clangClass = "OMPUpdateClause";
   let flangClass = "OmpUpdateClause";
+  let isValueOptional = true;
 }
 def OMPC_Use : Clause<"use"> {
   let clangClass = "OMPUseClause";

From flang-commits at lists.llvm.org Thu May 1 11:30:52 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Thu, 01 May 2025 11:30:52 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Allow UPDATE clause to not have any arguments (PR #137521)
In-Reply-To:
Message-ID: <6813bddc.170a0220.14b47a.4dd1@mx.google.com>

https://github.com/kparzysz closed https://github.com/llvm/llvm-project/pull/137521

From flang-commits at lists.llvm.org Thu May 1 11:31:00 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Thu, 01 May 2025 11:31:00 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <6813bde4.170a0220.d5df.4b42@mx.google.com>

https://github.com/kparzysz edited https://github.com/llvm/llvm-project/pull/137852

From flang-commits at lists.llvm.org Thu May 1 11:54:58 2025
From:
flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Thu, 01 May 2025 11:54:58 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <6813c382.050a0220.16cf37.70c2@mx.google.com>

https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852

>From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 07:58:29 -0500
Subject: [PATCH 01/11] [flang][OpenMP] Mark atomic clauses as unique

The current implementation of the ATOMIC construct handles these clauses
individually, and this change does not have an observable effect. At the
same time these clauses are unique as per the OpenMP spec, and this patch
reflects that in the OMP.td file.
---
 llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index eff6d57995d2b..cdfd3e3223fa8 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> {
   ];
 }
 def OMP_Atomic : Directive<"atomic"> {
-  let allowedClauses = [
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-  ];
   let allowedOnceClauses = [
     VersionedClause,
     VersionedClause,
+    VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
+    VersionedClause,
   ];
   let association = AS_Block;
   let category = CA_Executable;
@@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> {
   let category = CA_Executable;
 }
 def OMP_Critical : Directive<"critical"> {
-  let allowedClauses = [
+  let allowedOnceClauses = [
     VersionedClause,
   ];
   let association = AS_Block;
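
For readers unfamiliar with the distinction the patch above relies on, here is a minimal sketch of what treating a clause as "allowed once" could mean in a checker: a clause from that list may appear on a directive, but at most one time. This is not flang's actual implementation; the function name findRepeatedOnceClauses and the use of plain strings for clause names are assumptions made purely for illustration.

```cpp
// Hypothetical illustration only: not the checker used by flang.
#include <map>
#include <string>
#include <vector>

// Returns the names of clauses that occur more than once on a directive
// even though they are listed as "allowed once" for that directive.
std::vector<std::string> findRepeatedOnceClauses(
    const std::vector<std::string> &clausesOnDirective,
    const std::vector<std::string> &allowedOnce) {
  // Count how often each clause name appears on the directive.
  std::map<std::string, int> counts;
  for (const std::string &name : clausesOnDirective)
    ++counts[name];

  // Report every "allowed once" clause that appears two or more times.
  std::vector<std::string> repeated;
  for (const std::string &name : allowedOnce) {
    auto it = counts.find(name);
    if (it != counts.end() && it->second > 1)
      repeated.push_back(name);
  }
  return repeated;
}
```

Under these assumptions, calling findRepeatedOnceClauses({"read", "hint", "hint"}, {"read", "hint"}) would report only "hint". The clause names in that call are illustrative and are not taken from the patch.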
>From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 10:08:58 -0500
Subject: [PATCH 02/11] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs

The OpenMP implementation of the ATOMIC construct will change in the
near future to accommodate OpenMP 6.0. This patch separates the shared
implementations to avoid interfering with OpenACC.
---
 flang/include/flang/Lower/DirectivesCommon.h | 514 -------------------
 flang/lib/Lower/OpenACC.cpp                  | 320 +++++++++++-
 flang/lib/Lower/OpenMP/OpenMP.cpp            | 473 ++++++++++++++++-
 3 files changed, 767 insertions(+), 540 deletions(-)

diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h
index d1dbaefcd81d0..93ab2e350d035 100644
--- a/flang/include/flang/Lower/DirectivesCommon.h
+++ b/flang/include/flang/Lower/DirectivesCommon.h
@@ -46,520 +46,6 @@ namespace Fortran {
 namespace lower {

-/// Populates \p hint and \p memoryOrder with appropriate clause information
-/// if present on atomic construct.
-static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi 
%arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. - llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite, - loc); + genAtomicWrite(converter, atomicWrite, loc); }, [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) { - Fortran::lower::genOmpAccAtomicUpdate< - Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate, - loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const Fortran::parser::AccAtomicCapture &atomicCapture) { - Fortran::lower::genOmpAccAtomicCapture< - Fortran::parser::AccAtomicCapture, void>(converter, - atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, }, atomicConstruct.u); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 312557d5da07e..fdd85e94829f3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +//===----------------------------------------------------------------------===// +// Code generation for atomic operations +//===----------------------------------------------------------------------===// + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. 
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/11] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
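For readers skimming the series, the optional-argument handling described above can be sketched in isolation. This is only an illustration under stated assumptions: `DependenceType`, `TaskDependenceType`, `ParsedUpdateArg`, `Update`, and `makeUpdate` are hypothetical stand-ins invented for this sketch, not the flang types or the code in the patch. It shows the general shape only: visit the argument when one is present (the DEPOBJ case) and fall back to an empty value when it is absent (the ATOMIC case).

```cpp
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the two kinds of argument UPDATE may carry.
struct DependenceType { std::string name; };
struct TaskDependenceType { std::string name; };

// Hypothetical parser-side argument: one of the two alternatives.
struct ParsedUpdateArg {
  std::variant<DependenceType, TaskDependenceType> u;
};

// Hypothetical lowered clause: the dependence type may be absent.
struct Update {
  std::optional<std::string> dependenceType;
};

// Visit the argument only when it exists; otherwise produce an empty clause.
Update makeUpdate(const std::optional<ParsedUpdateArg> &arg) {
  if (arg) {
    return std::visit([](const auto &s) { return Update{s.name}; }, arg->u);
  }
  return Update{std::nullopt};
}
```

With these stand-ins, `makeUpdate(std::nullopt)` models a bare UPDATE on ATOMIC, while `makeUpdate(ParsedUpdateArg{DependenceType{"in"}})` models UPDATE with an argument on DEPOBJ.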
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/11] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be a parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having a `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed.
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse, and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 | 
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. 
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expecting two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture. 
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expecting single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point. 
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (begin != end && *begin == ' ') { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ...)" is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+  auto &dirSpec{std::get(x.t)};
+  auto &clauseList{std::get>(dirSpec.t)};
   if (clauseList) {
-    atomicDirectiveDefaultOrderFound_ = true;
-    switch (*defaultMemOrder) {
-    case common::OmpMemoryOrderType::Acq_Rel:
-      clauseList->emplace_back(common::visit(
-          common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause {
-                             return parser::OmpClause::Acquire{};
-                           },
-              [](parser::OmpAtomicCapture &) -> parser::OmpClause {
-                return parser::OmpClause::AcqRel{};
-              },
-              [](auto &) -> parser::OmpClause {
-                // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite}
-                return parser::OmpClause::Release{};
-              }},
-          x.u));
-      break;
-    case common::OmpMemoryOrderType::Relaxed:
-      clauseList->emplace_back(
-          parser::OmpClause{parser::OmpClause::Relaxed{}});
-      break;
-    case common::OmpMemoryOrderType::Seq_Cst:
-      clauseList->emplace_back(
-          parser::OmpClause{parser::OmpClause::SeqCst{}});
-      break;
-    default:
-      // FIXME: Don't process other values at the moment since their validity
-      // depends on the OpenMP version (which is unavailable here).
-      break;
+    if (findMemOrderClause(*clauseList)) {
+      return false;
     }
+  } else {
+    clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){});
+  }
+
+  // Add a memory order clause to the atomic directive.
+  atomicDirectiveDefaultOrderFound_ = true;
+  switch (*defaultMemOrder) {
+  case common::OmpMemoryOrderType::Acq_Rel:
+    clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}});
+    break;
+  case common::OmpMemoryOrderType::Relaxed:
+    clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}});
+    break;
+  case common::OmpMemoryOrderType::Seq_Cst:
+    clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}});
+    break;
+  default:
+    // FIXME: Don't process other values at the moment since their validity
+    // depends on the OpenMP version (which is unavailable here).
+    break;
   }
   return false;
diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90
index b82bd13622764..6f58e0939a787 100644
--- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90
+++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90
@@ -1,6 +1,6 @@
 ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s
-! CHECK: not yet implemented: OpenMP atomic compare
+! CHECK: not yet implemented: OpenMP ATOMIC COMPARE
 program p
   integer :: x
   logical :: r
diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90
index 88ec6fe910b9e..6729be6e5cf8b 100644
--- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90
+++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90
@@ -1,6 +1,6 @@
 ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s
-! CHECK: not yet implemented: OpenMP atomic compare
+! CHECK: not yet implemented: OpenMP ATOMIC COMPARE
 program p
   integer :: x
   logical :: r
diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90
index bbb08220af9d9..b8422c8f3b769 100644
--- a/flang/test/Lower/OpenMP/atomic-capture.f90
+++ b/flang/test/Lower/OpenMP/atomic-capture.f90
@@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture()
 !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr
 !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>>
 !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr
+!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr
 !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>>
 !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr
-!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr
 !CHECK: omp.atomic.capture {
 !CHECK: omp.atomic.update %[[VAL_A_BOX_ADDR]] : !fir.ptr {
 !CHECK: ^bb0(%[[ARG:.*]]: i32):
 !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32
 !CHECK: omp.yield(%[[TEMP]] : i32)
 !CHECK: }
-!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32
+!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32
 !CHECK: }
 !CHECK: return
 !CHECK: }
diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90
index 13392ad76471f..6eded49b0b15d 100644
--- a/flang/test/Lower/OpenMP/atomic-write.f90
+++ b/flang/test/Lower/OpenMP/atomic-write.f90
@@ -44,9 +44,9 @@ end program OmpAtomicWrite
 !CHECK-LABEL: func.func @_QPatomic_write_pointer() {
 !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"}
 !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>)
-!CHECK: %[[C1:.*]] = arith.constant 1 : i32
 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>>
 !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr
+!CHECK: %[[C1:.*]] = arith.constant 1 : i32
 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32
 !CHECK: %[[C2:.*]] = arith.constant 2 : i32
 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>>
diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90
deleted file mode 100644
index 5cd02698ff482..0000000000000
--- a/flang/test/Parser/OpenMP/atomic-compare.f90
+++ /dev/null
@@ -1,16 +0,0 @@
-! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s
-! OpenMP version for documentation purposes only - it isn't used until Sema.
-! This is testing for Parser errors that bail out before Sema.
-program main
-  implicit none
-  integer :: i, j = 10
-  logical :: r
-
-  !CHECK: error: expected OpenMP construct
-  !$omp atomic compare write
-  r = i .eq. j + 1
-
-  !CHECK: error: expected end of line
-  !$omp atomic compare num_threads(4)
-  r = i .eq. j
-end program main
diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90
index 54492bf6a22a6..11e23e062bce7 100644
--- a/flang/test/Semantics/OpenMP/atomic-compare.f90
+++ b/flang/test/Semantics/OpenMP/atomic-compare.f90
@@ -44,46 +44,37 @@
   !$omp end atomic
   ! Check for error conditions:
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic seq_cst seq_cst compare
   if (b .eq. c) b = a
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic compare seq_cst seq_cst
   if (b .eq. c) b = a
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic seq_cst compare seq_cst
   if (b .eq. c) b = a
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+  !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
   !$omp atomic acquire acquire compare
   if (b .eq. c) b = a
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+  !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
   !$omp atomic compare acquire acquire
   if (b .eq. c) b = a
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+  !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
   !$omp atomic acquire compare acquire
   if (b .eq. c) b = a
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic relaxed relaxed compare
   if (b .eq. c) b = a
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic compare relaxed relaxed
   if (b .eq. c) b = a
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic relaxed compare relaxed
   if (b .eq. c) b = a
-  !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct
+  !ERROR: At most one FAIL clause can appear on the ATOMIC directive
   !$omp atomic fail(release) compare fail(release)
   if (c .eq. a) a = b
   !$omp end atomic
diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
index c13a11a8dd5dc..deb67e7614659 100644
--- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
+++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
@@ -16,20 +16,21 @@ program sample
   !$omp atomic read hint(2)
   y = x
-  !ERROR: Hint clause value is not a valid OpenMP synchronization value
+  !ERROR: The synchronization hint is not valid
   !$omp atomic hint(3)
   y = y + 10
   !$omp atomic update hint(5)
   y = x + y
-  !ERROR: Hint clause value is not a valid OpenMP synchronization value
+  !ERROR: The synchronization hint is not valid
   !$omp atomic hint(7) capture
+  !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
   y = x
   x = y
   !$omp end atomic
-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Must be a constant value
   !$omp atomic update hint(x)
   y = y * 1
@@ -46,7 +47,7 @@ program sample
   !$omp atomic hint(omp_lock_hint_speculative)
   x = y + x
-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Must be a constant value
   !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read
   y = x
@@ -69,36 +70,36 @@ program sample
   !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative)
   x = y + x
-  !ERROR: Hint clause value is not a valid OpenMP synchronization value
+  !ERROR: The synchronization hint is not valid
   !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read
   y = x
-  !ERROR: Hint clause value is not a valid OpenMP synchronization value
+  !ERROR: The synchronization hint is not valid
   !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative)
   y = y * 9
-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Must have INTEGER type, but is REAL(4)
   !$omp atomic hint(1.0) read
   y = x
-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4)
   !$omp atomic hint(z + omp_sync_hint_nonspeculative) read
   y = x
-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Must be a constant value
   !$omp atomic hint(k + omp_sync_hint_speculative) read
   y = x
-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Must be a constant value
   !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write
   x = 10 * y
   !$omp atomic write hint(a)
-  !ERROR: RHS expression on atomic assignment statement cannot access 'x'
+  !ERROR: Within atomic operation x and y+x access the same storage
   x = y + x
   !$omp atomic hint(abs(-1)) write
diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90
new file mode 100644
index 0000000000000..2619d235380f8
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-read.f90
@@ -0,0 +1,89 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+  integer :: x, v
+  ! The end-directive is optional in ATOMIC READ. Expect no diagnostics.
+  !$omp atomic read
+  v = x
+
+  !$omp atomic read
+  v = x
+  !$omp end atomic
+end
+
+subroutine f01
+  integer, pointer :: x, v
+  ! Intrinsic assignment and pointer assignment are both ok. Expect no
+  ! diagnostics.
+  !$omp atomic read
+  v = x
+
+  !$omp atomic read
+  v => x
+end
+
+subroutine f02(i)
+  integer :: i, v
+  interface
+    function p(i)
+      integer, pointer :: p
+      integer :: i
+    end
+  end interface
+
+  ! Atomic variable can be a function reference. Expect no diagostics.
+  !$omp atomic read
+  v = p(i)
+end
+
+subroutine f03
+  integer :: x(3), y(5), v(3)
+
+  !$omp atomic read
+  !ERROR: Atomic variable x should be a scalar
+  v = x
+
+  !$omp atomic read
+  !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar
+  v = y(2:4)
+end
+
+subroutine f04
+  integer :: x, y(3), v
+
+  !$omp atomic read
+  !ERROR: Within atomic operation x and x access the same storage
+  x = x
+
+  ! Accessing same array, but not the same storage. Expect no diagnostics.
+  !$omp atomic read
+  y(1) = y(2)
+end
+
+subroutine f05
+  integer :: x, v
+
+  !$omp atomic read
+  !ERROR: Atomic expression x+1_4 should be a variable
+  v = x + 1
+end
+
+subroutine f06
+  character :: x, v
+
+  !$omp atomic read
+  !ERROR: Atomic variable x cannot have CHARACTER type
+  v = x
+end
+
+subroutine f07
+  integer, allocatable :: x
+  integer :: v
+
+  allocate(x)
+
+  !$omp atomic read
+  !ERROR: Atomic variable x cannot be ALLOCATABLE
+  v = x
+end
+
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
new file mode 100644
index 0000000000000..64e31a7974b55
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -0,0 +1,77 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+  integer :: x, y, v
+
+  !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements
+  !$omp atomic update capture
+  x = v
+  x = x + 1
+  y = x
+  !$omp end atomic
+end
+
+subroutine f01
+  integer :: x, y, v
+
+  !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments
+  !$omp atomic update capture
+  x = v
+  block
+    x = x + 1
+    y = x
+  end block
+  !$omp end atomic
+end
+
+subroutine f02
+  integer :: x, y
+
+  ! The update and capture statements can be inside of a single BLOCK.
+  ! The end-directive is then optional. Expect no diagnostics.
+  !$omp atomic update capture
+  block
+    x = x + 1
+    y = x
+  end block
+end
+
+subroutine f03
+  integer :: x
+
+  !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+  !$omp atomic update capture
+  x = x + 1
+  x = x + 2
+  !$omp end atomic
+end
+
+subroutine f04
+  integer :: x, v
+
+  !$omp atomic update capture
+  !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+  v = x
+  x = v
+  !$omp end atomic
+end
+
+subroutine f05
+  integer :: x, v, z
+
+  !$omp atomic update capture
+  v = x
+  !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
+  z = x + 1
+  !$omp end atomic
+end
+
+subroutine f06
+  integer :: x, v, z
+
+  !$omp atomic update capture
+  !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
+  z = x + 1
+  v = x
+  !$omp end atomic
+end
diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90
new file mode 100644
index 0000000000000..0aee02d429dc0
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90
@@ -0,0 +1,83 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+  integer :: x, y
+
+  ! The x is a direct argument of the + operator. Expect no diagnostics.
+  !$omp atomic update
+  x = x + (y - 1)
+end
+
+subroutine f01
+  integer :: x
+
+  ! x + 0 is unusual, but legal. Expect no diagnostics.
+  !$omp atomic update
+  x = x + 0
+end
+
+subroutine f02
+  integer :: x
+
+  ! This is formally not allowed by the syntax restrictions of the spec,
+  ! but it's equivalent to either x+0 or x*1, both of which are legal.
+  ! Allow this case. Expect no diagnostics.
+  !$omp atomic update
+  x = x
+end
+
+subroutine f03
+  integer :: x, y
+
+  !$omp atomic update
+  !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator
+  x = (x + y) + 1
+end
+
+subroutine f04
+  integer :: x
+  real :: y
+
+  !$omp atomic update
+  !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation
+  x = floor(x + y)
+end
+
+subroutine f05
+  integer :: x
+  real :: y
+
+  !$omp atomic update
+  !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
+  x = int(x + y)
+end
+
+subroutine f06
+  integer :: x, y
+  interface
+    function f(i, j)
+      integer :: f, i, j
+    end
+  end interface
+
+  !$omp atomic update
+  !ERROR: A call to this function is not a valid ATOMIC UPDATE operation
+  x = f(x, y)
+end
+
+subroutine f07
+  real :: x
+  integer :: y
+
+  !$omp atomic update
+  !ERROR: The ** operator is not a valid ATOMIC UPDATE operation
+  x = x ** y
+end
+
+subroutine f08
+  integer :: x, y
+
+  !$omp atomic update
+  !ERROR: The atomic variable x should appear as an argument in the update operation
+  x = y
+end
diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90
index 21a9b87d26345..3084376b4275d 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90
@@ -22,10 +22,10 @@ program sample
   x = x / y
   !$omp atomic update
-  !ERROR: Invalid or missing operator in atomic update statement
+  !ERROR: A call to this function is not a valid ATOMIC UPDATE operation
   x = x .MYOPERATOR. y
   !$omp atomic
-  !ERROR: Invalid or missing operator in atomic update statement
+  !ERROR: A call to this function is not a valid ATOMIC UPDATE operation
   x = x .MYOPERATOR. y
 end program
diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90
new file mode 100644
index 0000000000000..979568ebf7140
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-write.f90
@@ -0,0 +1,81 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+  integer :: x, v
+  ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics.
+  !$omp atomic write
+  x = v + 1
+
+  !$omp atomic write
+  x = v + 3
+  !$omp end atomic
+end
+
+subroutine f01
+  integer, pointer :: x, v
+  ! Intrinsic assignment and pointer assignment are both ok. Expect no
+  ! diagnostics.
+  !$omp atomic write
+  x = 2 * v + 3
+
+  !$omp atomic write
+  x => v
+end
+
+subroutine f02(i)
+  integer :: i, v
+  interface
+    function p(i)
+      integer, pointer :: p
+      integer :: i
+    end
+  end interface
+
+  ! Atomic variable can be a function reference. Expect no diagostics.
+  !$omp atomic write
+  p(i) = v
+end
+
+subroutine f03
+  integer :: x(3), y(5), v(3)
+
+  !$omp atomic write
+  !ERROR: Atomic variable x should be a scalar
+  x = v
+
+  !$omp atomic write
+  !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar
+  y(2:4) = v
+end
+
+subroutine f04
+  integer :: x, y(3), v
+
+  !$omp atomic write
+  !ERROR: Within atomic operation x and x+1_4 access the same storage
+  x = x + 1
+
+  ! Accessing same array, but not the same storage. Expect no diagnostics.
+  !$omp atomic write
+  y(1) = y(2)
+end
+
+subroutine f06
+  character :: x, v
+
+  !$omp atomic write
+  !ERROR: Atomic variable x cannot have CHARACTER type
+  x = v
+end
+
+subroutine f07
+  integer, allocatable :: x
+  integer :: v
+
+  allocate(x)
+
+  !$omp atomic write
+  !ERROR: Atomic variable x cannot be ALLOCATABLE
+  x = v
+end
+
diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90
index 0e100871ea9b4..0dc8251ae356e 100644
--- a/flang/test/Semantics/OpenMP/atomic.f90
+++ b/flang/test/Semantics/OpenMP/atomic.f90
@@ -1,4 +1,4 @@
-! RUN: %python %S/../test_errors.py %s %flang -fopenmp
+! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src
 use omp_lib
 ! Check OpenMP 2.13.6 atomic Construct
@@ -11,9 +11,13 @@
   a = b
   !$omp end atomic
+  !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
+  !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
   !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED)
   a = b
+  !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
+  !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
   !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write
   a = b
@@ -22,39 +26,32 @@
   a = a + 1
   !$omp end atomic
+  !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
+  !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
   !$omp atomic hint(1) acq_rel capture
   b = a
   a = a + 1
   !$omp end atomic
-  !ERROR: expected end of line
+  !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct
   !$omp atomic read write
+  !ERROR: Atomic expression a+1._4 should be a variable
   a = a + 1
   !$omp atomic
   a = a + 1
-  !ERROR: expected 'UPDATE'
-  !ERROR: expected 'WRITE'
-  !ERROR: expected 'COMPARE'
-  !ERROR: expected 'CAPTURE'
-  !ERROR: expected 'READ'
+  !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive
   !$omp atomic num_threads(4)
   a = a + 1
-  !ERROR: expected end of line
+  !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements
+  !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive
   !$omp atomic capture num_threads(4)
   a = a + 1
+  !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
   !$omp atomic relaxed
   a = a + 1
-  !ERROR: expected 'UPDATE'
-  !ERROR: expected 'WRITE'
-  !ERROR: expected 'COMPARE'
-  !ERROR: expected 'CAPTURE'
-  !ERROR: expected 'READ'
-  !$omp atomic num_threads write
-  a = a + 1
-
   !$omp end parallel
 end
diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90
index 173effe86b69c..f700c381cadd0 100644
--- a/flang/test/Semantics/OpenMP/atomic01.f90
+++ b/flang/test/Semantics/OpenMP/atomic01.f90
@@ -14,322 +14,277 @@
 ! At most one memory-order-clause may appear on the construct.
 !READ
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the READ directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic seq_cst seq_cst read
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the READ directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic read seq_cst seq_cst
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the READ directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic seq_cst read seq_cst
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQUIRE clause can appear on the READ directive
+  !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
   !$omp atomic acquire acquire read
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQUIRE clause can appear on the READ directive
+  !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
   !$omp atomic read acquire acquire
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQUIRE clause can appear on the READ directive
+  !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
   !$omp atomic acquire read acquire
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the READ directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic relaxed relaxed read
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the READ directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic read relaxed relaxed
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the READ directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic relaxed read relaxed
   i = j
 !UPDATE
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic seq_cst seq_cst update
-  !ERROR: Invalid or missing operator in atomic update statement
+  !ERROR: The atomic variable i should appear as an argument in the update operation
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic update seq_cst seq_cst
-  !ERROR: Invalid or missing operator in atomic update statement
+  !ERROR: The atomic variable i should appear as an argument in the update operation
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic seq_cst update seq_cst
-  !ERROR: Invalid or missing operator in atomic update statement
+  !ERROR: The atomic variable i should appear as an argument in the update operation
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELEASE clause can appear on the UPDATE directive
+  !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
   !$omp atomic release release update
-  !ERROR: Invalid or missing operator in atomic update statement
+  !ERROR: The atomic variable i should appear as an argument in the update operation
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELEASE clause can appear on the UPDATE directive
+  !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
   !$omp atomic update release release
-  !ERROR: Invalid or missing operator in atomic update statement
+  !ERROR: The atomic variable i should appear as an argument in the update operation
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELEASE clause can appear on the UPDATE directive
+  !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
   !$omp atomic release update release
-  !ERROR: Invalid or missing operator in atomic update statement
+  !ERROR: The atomic variable i should appear as an argument in the update operation
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the UPDATE directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic relaxed relaxed update
-  !ERROR: Invalid or missing operator in atomic update statement
+  !ERROR: The atomic variable i should appear as an argument in the update operation
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the UPDATE directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic update relaxed relaxed
-  !ERROR: Invalid or missing operator in atomic update statement
+  !ERROR: The atomic variable i should appear as an argument in the update operation
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the UPDATE directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic relaxed update relaxed
-  !ERROR: Invalid or missing operator in atomic update statement
+  !ERROR: The atomic variable i should appear as an argument in the update operation
   i = j
 !CAPTURE
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic seq_cst seq_cst capture
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic capture seq_cst seq_cst
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic seq_cst capture seq_cst
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+  !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
   !$omp atomic release release capture
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+  !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
   !$omp atomic capture release release
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+  !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
   !$omp atomic release capture release
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the CAPTURE directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic relaxed relaxed capture
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the CAPTURE directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic capture relaxed relaxed
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELAXED clause can appear on the CAPTURE directive
+  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
   !$omp atomic relaxed capture relaxed
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive
+  !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive
   !$omp atomic acq_rel acq_rel capture
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive
+  !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive
   !$omp atomic capture acq_rel acq_rel
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive
+  !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive
   !$omp atomic acq_rel capture acq_rel
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive
+  !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
   !$omp atomic acquire acquire capture
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive
+  !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
   !$omp atomic capture acquire acquire
   i = j
   j = k
   !$omp end atomic
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive
+  !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
   !$omp atomic acquire capture acquire
   i = j
   j = k
   !$omp end atomic
 !WRITE
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the WRITE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic seq_cst seq_cst write
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the WRITE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic write seq_cst seq_cst
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one SEQ_CST clause can appear on the WRITE directive
+  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
   !$omp atomic seq_cst write seq_cst
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELEASE clause can appear on the WRITE directive
+  !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
   !$omp atomic release release write
   i = j
-  !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
-  !ERROR: At most one RELEASE clause can appear on the WRITE directive
+  !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
   !$omp atomic write release release
   i = j
-  !ERROR: More than one memory order clause not allowed on
OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release write release i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic write relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed write relaxed i = j !No atomic-clause - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as 
an argument in the update operation i = j ! 2.17.7.3 ! At most one hint clause may appear on the construct. - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At 
most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint 
(omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative) i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j j = k @@ -337,34 +292,26 @@ ! 2.17.7.4 ! If atomic-clause is read then memory-order-clause must not be acq_rel or release. - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic acq_rel read i = j - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read acq_rel i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic release read i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read release i = j ! 2.17.7.5 ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire. 
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acq_rel write i = j - !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acq_rel i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acquire write i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acquire i = j @@ -372,33 +319,27 @@ ! 2.17.7.6 ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire. - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acq_rel update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acquire update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive !$omp atomic acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed on the 
ATOMIC directive !$omp atomic acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j end program diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90 index c66085d00f157..45e41f2552965 100644 --- a/flang/test/Semantics/OpenMP/atomic02.f90 +++ b/flang/test/Semantics/OpenMP/atomic02.f90 @@ -28,36 +28,29 @@ program OmpAtomic !$omp atomic a = a/(b + 1) !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: The atomic variable c should appear as an argument in the update operation c = d !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. 
b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The /= operator is not a valid ATOMIC UPDATE operation l = a .NE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic m = m .AND. n @@ -76,32 +69,26 @@ program OmpAtomic !$omp atomic update a = a/(b + 1) !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: This is not a valid ATOMIC UPDATE operation c = c//d !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. 
b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic update m = m .AND. n diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index 76367495b9861..f5c189fd05318 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -25,28 +25,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur 
exactly once among the arguments of the top-level MAX operator z = MAX(y, 7, b, c) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8, a, d) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -60,26 +58,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX 
operator z = MAX(y, 7) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = MOD(y, 9) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation x = ABS(x) end program OmpAtomic @@ -92,7 +90,7 @@ subroutine conflicting_types() type(simple) ::s z = 1 !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(s%z, 4) end subroutine @@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) :: s !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator a = min(a, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a) !$omp atomic - !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)` a = min(b, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a, b) 
!$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator y = min(z, x) !$omp atomic z = max(z, y) !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k' + !ERROR: Atomic variable k should be a scalar + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation k = max(x, y) - + !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement x = min(x, k) !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - z =z + s%m + z = z + s%m end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index a9644ad95aa30..5c91ab5dc37e4 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -20,12 +20,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x 
= 1 + y !$omp atomic @@ -33,12 +31,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic @@ -46,12 +42,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic @@ -59,12 +53,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or 
explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic @@ -72,8 +64,7 @@ program OmpAtomic !$omp atomic m = n .AND. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic @@ -81,8 +72,7 @@ program OmpAtomic !$omp atomic m = n .OR. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic @@ -90,8 +80,7 @@ program OmpAtomic !$omp atomic m = n .EQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic @@ -99,8 +88,7 @@ program OmpAtomic !$omp atomic m = n .NEQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. 
l !$omp atomic update @@ -108,12 +96,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 + y !$omp atomic update @@ -121,12 +107,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic update @@ -134,12 +118,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' 
expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic update @@ -147,12 +129,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic update @@ -160,8 +140,7 @@ program OmpAtomic !$omp atomic update m = n .AND. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic update @@ -169,8 +148,7 @@ program OmpAtomic !$omp atomic update m = n .OR. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic update @@ -178,8 +156,7 @@ program OmpAtomic !$omp atomic update m = n .EQV. 
m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic update @@ -187,8 +164,7 @@ program OmpAtomic !$omp atomic update m = n .NEQV. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. l end program OmpAtomic @@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) p !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement x = x !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 1 !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and a*b access the same storage a = a * b + a !$omp atomic - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator a = b * (a + 9) !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (a+b) access the same storage a = a * (a + b) !$omp atomic - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (b+a) access the same storage a = (b + a) * a !$omp atomic - !ERROR: Atomic update statement should 
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(7) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(x) y = 2 @@ -54,7 +54,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint) y = 2 @@ -84,35 +84,35 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended) y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp critical (name) hint(1.0) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp critical (name) hint(z + omp_sync_hint_nonspeculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(k + omp_sync_hint_speculative) y = 2 !$omp end critical 
(name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended) y = 2 diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 505cbc48fef90..677b933932b44 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -20,70 +20,64 @@ program sample !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar v = y(1:3) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression x*(10_4+x) should be a variable v = x * (10 + x) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression 4_4 should be a variable v = 4 !$omp atomic read - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE v = k !$omp atomic write - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = x !$omp atomic update - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = k + x * (v * x) !$omp atomic - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = v * k !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'z%y' + !ERROR: Within atomic operation z%y and x+z%y access the same storage z%y = x + z%y !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + 
!ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage m = min(m, x, z%m) + k !$omp atomic read - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Atomic expression min(m,x,z%m)+k should be a variable m = min(m, x, z%m) + k !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar x = a - !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - a = x - !$omp atomic write !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement x = a !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar a = x !$omp atomic capture @@ -93,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = b + (x*1) !$omp end atomic @@ -103,60 +97,58 @@ program sample !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with 
CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct x = x + 10 v = b !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt] v = 1 x = 4 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct x = z%y + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 x = z%y !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct x = y(2) + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 x = y(2) !$omp end atomic !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable r cannot have CHARACTER type l = r !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable l cannot have CHARACTER type l = r end program diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90 index ae9fd086015dd..e8817c3f5ef61 100644 --- 
a/flang/test/Semantics/OpenMP/requires-atomic01.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> SeqCst !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> SeqCst !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> SeqCst !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> SeqCst !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> SeqCst !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90 index 4976a9667eb78..a3724a83456fd 100644 --- a/flang/test/Semantics/OpenMP/requires-atomic02.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> AcqRel !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! 
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> AcqRel !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> AcqRel !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! 
CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> AcqRel !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> AcqRel !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Capture + ! 
CHECK: OmpClause -> Relaxed
   !$omp atomic capture relaxed
   i = j
   j = j + 1

>From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:41:16 -0500
Subject: [PATCH 05/11] Restart build

>From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:45:28 -0500
Subject: [PATCH 06/11] Restart build

>From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 07:54:54 -0500
Subject: [PATCH 07/11] Restart build

>From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 08:18:01 -0500
Subject: [PATCH 08/11] Replace %openmp_flags with -fopenmp in tests, add
 REQUIRES where needed

---
 flang/test/Semantics/OpenMP/atomic-read.f90           | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-only.f90    | 2 +-
 flang/test/Semantics/OpenMP/atomic-write.f90          | 2 +-
 flang/test/Semantics/OpenMP/atomic.f90                | 2 ++
 5 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90
index 2619d235380f8..73899a9ff37f2 100644
--- a/flang/test/Semantics/OpenMP/atomic-read.f90
+++ b/flang/test/Semantics/OpenMP/atomic-read.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 
 subroutine f00
   integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
index 64e31a7974b55..c427ba07d43d8 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 
 subroutine f00
   integer :: x, y, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90
index 0aee02d429dc0..4595e02d01456 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-only.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 
 subroutine f00
   integer :: x, y
diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90
index 979568ebf7140..7965ad2dc7dbf 100644
--- a/flang/test/Semantics/OpenMP/atomic-write.f90
+++ b/flang/test/Semantics/OpenMP/atomic-write.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 
 subroutine f00
   integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90
index 0dc8251ae356e..2caa161507d49 100644
--- a/flang/test/Semantics/OpenMP/atomic.f90
+++ b/flang/test/Semantics/OpenMP/atomic.f90
@@ -1,3 +1,5 @@
+! REQUIRES: openmp_runtime
+
 ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src
 use omp_lib
 ! Check OpenMP 2.13.6 atomic Construct

>From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 09:06:20 -0500
Subject: [PATCH 09/11] Fix examples

---
 flang/examples/FeatureList/FeatureList.cpp    | 10 -------
 .../FlangOmpReport/FlangOmpReportVisitor.cpp  | 26 +++++--------------
 2 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp
index d1407cf0ef239..a36b8719e365d 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -445,13 +445,6 @@ struct NodeVisitor {
   READ_FEATURE(ObjectDecl)
   READ_FEATURE(OldParameterStmt)
   READ_FEATURE(OmpAlignedClause)
-  READ_FEATURE(OmpAtomic)
-  READ_FEATURE(OmpAtomicCapture)
-  READ_FEATURE(OmpAtomicCapture::Stmt1)
-  READ_FEATURE(OmpAtomicCapture::Stmt2)
-  READ_FEATURE(OmpAtomicRead)
-  READ_FEATURE(OmpAtomicUpdate)
-  READ_FEATURE(OmpAtomicWrite)
   READ_FEATURE(OmpBeginBlockDirective)
   READ_FEATURE(OmpBeginLoopDirective)
   READ_FEATURE(OmpBeginSectionsDirective)
@@ -480,7 +473,6 @@ struct NodeVisitor {
   READ_FEATURE(OmpIterationOffset)
   READ_FEATURE(OmpIterationVector)
   READ_FEATURE(OmpEndAllocators)
-  READ_FEATURE(OmpEndAtomic)
   READ_FEATURE(OmpEndBlockDirective)
   READ_FEATURE(OmpEndCriticalDirective)
   READ_FEATURE(OmpEndLoopDirective)
@@ -566,8 +558,6 @@ struct NodeVisitor {
   READ_FEATURE(OpenMPDeclareTargetConstruct)
   READ_FEATURE(OmpMemoryOrderType)
   READ_FEATURE(OmpMemoryOrderClause)
-  READ_FEATURE(OmpAtomicClause)
-  READ_FEATURE(OmpAtomicClauseList)
   READ_FEATURE(OmpAtomicDefaultMemOrderClause)
   READ_FEATURE(OpenMPFlushConstruct)
   READ_FEATURE(OpenMPLoopConstruct)
diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index dbbf86a6c6151..18a4107f9a9d1 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) {
           // the directive field.
           [&](const auto &c) -> SourcePosition {
             const CharBlock &source{std::get<0>(c.t).source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPAtomicConstruct &c) -> SourcePosition {
-            return std::visit(
-                [&](const auto &o) -> SourcePosition {
-                  const CharBlock &source{std::get(o.t).source};
-                  return parsing->allCooked()
-                      .GetSourcePositionRange(source)
-                      ->first;
-                },
-                c.u);
+            const CharBlock &source{c.source};
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPSectionConstruct &c) -> SourcePosition {
             const CharBlock &source{c.source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPUtilityConstruct &c) -> SourcePosition {
             const CharBlock &source{c.source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
       },
       c.u);
@@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
             return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPAtomicConstruct &c) -> std::string {
-            return std::visit(
-                [&](const auto &c) {
-                  // Get source from the verbatim fields
-                  const CharBlock &source{std::get(c.t).source};
-                  return "atomic-" +
-                      normalize_construct_name(source.ToString());
-                },
-                c.u);
+            const CharBlock &source{c.source};
+            return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPUtilityConstruct &c) -> std::string {
             const CharBlock &source{c.source};

>From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:19 -0500
Subject: [PATCH 10/11] Fix example

---
 .../FlangOmpReport/FlangOmpReportVisitor.cpp     |  5 +++--
 flang/test/Examples/omp-atomic.f90               | 16 +++++++++++-----
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index 18a4107f9a9d1..b0964845d3b99 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
             return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPAtomicConstruct &c) -> std::string {
-            const CharBlock &source{c.source};
-            return normalize_construct_name(source.ToString());
+            auto &dirSpec = std::get(c.t);
+            auto &dirName = std::get(dirSpec.t);
+            return normalize_construct_name(dirName.source.ToString());
           },
           [&](const OpenMPUtilityConstruct &c) -> std::string {
             const CharBlock &source{c.source};
diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90
index dcca34b633a3e..934f84f132484 100644
--- a/flang/test/Examples/omp-atomic.f90
+++ b/flang/test/Examples/omp-atomic.f90
@@ -26,25 +26,31 @@
 ! CHECK:---
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 9
-! CHECK-NEXT: construct: atomic-read
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
-! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: - clause: read
 ! CHECK-NEXT: details: ''
+! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 12
-! CHECK-NEXT: construct: atomic-write
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
 ! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
+! CHECK-NEXT: - clause: write
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 16
-! CHECK-NEXT: construct: atomic-capture
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
+! CHECK-NEXT: - clause: capture
+! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;'
 ! CHECK-NEXT: - clause: seq_cst
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 21
-! CHECK-NEXT: construct: atomic-atomic
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses: []
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 8

>From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:47 -0500
Subject: [PATCH 11/11] Remove reference to *nullptr

---
 flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index ab868df76d298..4d9b375553c1e 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
     assert(firstOp && "Should have created an atomic operation");
     atomicAt = getInsertionPointAfter(firstOp);
 
-    mlir::Operation *secondOp = genAtomicOperation(
-        converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom,
-        *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt);
+    mlir::Operation *secondOp = nullptr;
+    if (analysis.op1.what != analysis.None) {
+      secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what,
+                                    atomAddr, atom, *get(analysis.op1.expr),
+                                    hint, memOrder, atomicAt, prepareAt);
+    }
 
     if (secondOp) {
       builder.setInsertionPointAfter(secondOp);

From flang-commits at lists.llvm.org Thu May 1 12:19:13 2025
From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits)
Date: Thu, 01 May 2025 12:19:13 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Hide strict volatility checks behind flag (PR #138183)
Message-ID:

https://github.com/ashermancinelli created
https://github.com/llvm/llvm-project/pull/138183

Enabling volatility lowering by default revealed some issues in lowering and op verification. For example, given a volatile variable of a nested type, accessing a member of a structure member results in a volatility mismatch when the inner structure member is designated, and thus a verification error at compile time. In other cases, also related to allocatable types and how we handle volatile references to boxes, I found that codegen was correct once the checks were disabled.

This hides the strict verification of FIR and HLFIR ops behind a flag so I can iteratively improve lowering of volatile variables without causing compile-time failures, while keeping the strict verification on when running tests.

>From 9adde0b382811bbc39d00d181e3b05ba9141ff3e Mon Sep 17 00:00:00 2001
From: Asher Mancinelli
Date: Thu, 1 May 2025 11:46:49 -0700
Subject: [PATCH] [flang] Hide strict volatility checks behind flag

Enabling volatility lowering by default revealed some issues in lowering and op verification. For example, given a volatile variable of a nested type, accessing a member of a structure member results in a volatility mismatch when the inner structure member is designated, and thus a verification error at compile time. In other cases, also related to allocatable types and how we handle volatile references to boxes, I found that codegen was correct once the checks were disabled.

This hides the strict verification of FIR and HLFIR ops behind a flag so I can iteratively improve lowering of volatile variables without causing compile-time failures.
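The gating pattern described above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not the patch's code: `SketchType`, `strictVolatileVerification` and `verifyConvert` are hypothetical stand-ins. In the patch the switch is an `llvm::cl::opt` registered as `strict-fir-volatile-verifier` and read through `fir::useStrictVolatileVerification()`, and the real `fir.convert` check also exempts conversions to LLVM pointers and integers, which the sketch leaves out.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a FIR type; it only records a printable name
// and whether the type is volatile.
struct SketchType {
  std::string name;
  bool isVolatile;
};

// Stand-in for the command-line option. Like the option in the patch, it
// defaults to false, so the strict check is off unless explicitly enabled.
static bool strictVolatileVerification = false;

bool useStrictVolatileVerification() { return strictVolatileVerification; }

// Returns an error message if verification fails, std::nullopt on success.
// The volatility comparison only runs when the strict mode is enabled.
std::optional<std::string> verifyConvert(const SketchType &in,
                                         const SketchType &out) {
  if (useStrictVolatileVerification() && in.isVolatile != out.isVolatile)
    return "cannot convert between volatile and non-volatile types: " +
           in.name + " / " + out.name;
  return std::nullopt;
}
```

With the switch off, a volatile to non-volatile conversion passes verification; with it on, the same conversion is rejected. That mirrors why the test RUN lines in the patch pass the flag explicitly while ordinary compilations do not.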
--- .../include/flang/Optimizer/Dialect/FIROps.h | 1 + flang/lib/Optimizer/Dialect/FIROps.cpp | 27 +++++++++++++++---- flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp | 5 ++-- flang/test/Fir/invalid.fir | 4 +-- flang/test/Fir/volatile.fir | 2 +- flang/test/Fir/volatile2.fir | 2 +- flang/test/HLFIR/volatile.fir | 2 +- flang/test/HLFIR/volatile1.fir | 2 +- flang/test/HLFIR/volatile2.fir | 2 +- flang/test/HLFIR/volatile3.fir | 2 +- flang/test/HLFIR/volatile4.fir | 2 +- flang/test/Lower/volatile-allocatable1.f90 | 17 ++++++++++++ flang/test/Lower/volatile-openmp.f90 | 2 +- flang/test/Lower/volatile-string.f90 | 2 +- flang/test/Lower/volatile1.f90 | 2 +- flang/test/Lower/volatile2.f90 | 2 +- flang/test/Lower/volatile3.f90 | 2 +- flang/test/Lower/volatile4.f90 | 2 +- 18 files changed, 58 insertions(+), 22 deletions(-) create mode 100644 flang/test/Lower/volatile-allocatable1.f90 diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.h b/flang/include/flang/Optimizer/Dialect/FIROps.h index 15bd512ea85af..1bed227afb50d 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.h +++ b/flang/include/flang/Optimizer/Dialect/FIROps.h @@ -40,6 +40,7 @@ mlir::ParseResult parseSelector(mlir::OpAsmParser &parser, mlir::OperationState &result, mlir::OpAsmParser::UnresolvedOperand &selector, mlir::Type &type); +bool useStrictVolatileVerification(); static constexpr llvm::StringRef getNormalizedLowerBoundAttrName() { return "normalized.lb"; diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index 8a24608336495..05ef69169bae5 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -33,11 +33,21 @@ #include "llvm/ADT/STLExtras.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/TypeSwitch.h" +#include "llvm/Support/CommandLine.h" namespace { #include "flang/Optimizer/Dialect/CanonicalizationPatterns.inc" } // namespace +static llvm::cl::opt clUseStrictVolatileVerification( + 
"strict-fir-volatile-verifier", llvm::cl::init(false), + llvm::cl::desc( + "use stricter verifier for FIR operations with volatile types")); + +bool fir::useStrictVolatileVerification() { + return clUseStrictVolatileVerification; +} + static void propagateAttributes(mlir::Operation *fromOp, mlir::Operation *toOp) { if (!fromOp || !toOp) @@ -1535,11 +1545,14 @@ llvm::LogicalResult fir::ConvertOp::verify() { // represent volatility. const bool toLLVMPointer = mlir::isa(outType); const bool toInteger = fir::isa_integer(outType); - if (fir::isa_volatile_type(inType) != fir::isa_volatile_type(outType) && - !toLLVMPointer && !toInteger) - return emitOpError("cannot convert between volatile and non-volatile " - "types, use fir.volatile_cast instead ") - << inType << " / " << outType; + if (fir::useStrictVolatileVerification()) { + if (fir::isa_volatile_type(inType) != fir::isa_volatile_type(outType) && + !toLLVMPointer && !toInteger) { + return emitOpError("cannot convert between volatile and non-volatile " + "types, use fir.volatile_cast instead ") + << inType << " / " << outType; + } + } if (canBeConverted(inType, outType)) return mlir::success(); return emitOpError("invalid type conversion") @@ -1841,6 +1854,10 @@ llvm::LogicalResult fir::TypeInfoOp::verify() { static llvm::LogicalResult verifyEmboxOpVolatilityInvariants(mlir::Type memrefType, mlir::Type resultType) { + + if (!fir::useStrictVolatileVerification()) + return mlir::success(); + mlir::Type boxElementType = llvm::TypeSwitch(resultType) .Case( diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp index c5ed76753ea0c..eef1377f26961 100644 --- a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp +++ b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp @@ -423,8 +423,9 @@ llvm::LogicalResult hlfir::DesignateOp::verify() { unsigned outputRank = 0; mlir::Type outputElementType; bool hasBoxComponent; - if (fir::isa_volatile_type(memrefType) != - 
fir::isa_volatile_type(getResult().getType())) { + if (fir::useStrictVolatileVerification() && + fir::isa_volatile_type(memrefType) != + fir::isa_volatile_type(getResult().getType())) { return emitOpError("volatility mismatch between memref and result type") << " memref type: " << memrefType << " result type: " << getResult().getType(); diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index 447a6c68b4b0a..f9f5e267dd9bc 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -1,6 +1,6 @@ -// FIR ops diagnotic tests -// RUN: fir-opt -split-input-file -verify-diagnostics %s + +// RUN: fir-opt -split-input-file -verify-diagnostics --strict-fir-volatile-verifier %s // expected-error@+1{{custom op 'fir.string_lit' must have character type}} %0 = fir.string_lit "Hello, World!"(13) : !fir.int<32> diff --git a/flang/test/Fir/volatile.fir b/flang/test/Fir/volatile.fir index 6b3d8709abdeb..9a7853083799f 100644 --- a/flang/test/Fir/volatile.fir +++ b/flang/test/Fir/volatile.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" %s -o - | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" %s -o - | FileCheck %s // CHECK: llvm.store volatile %{{.+}}, %{{.+}} : i32, !llvm.ptr // CHECK: %{{.+}} = llvm.load volatile %{{.+}} : !llvm.ptr -> i32 func.func @foo() { diff --git a/flang/test/Fir/volatile2.fir b/flang/test/Fir/volatile2.fir index 82a8413d2fc02..d7c7351c361dd 100644 --- a/flang/test/Fir/volatile2.fir +++ b/flang/test/Fir/volatile2.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt --fir-to-llvm-ir %s | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier --fir-to-llvm-ir %s | FileCheck %s func.func @_QQmain() { %0 = fir.alloca !fir.box, volatile> %c1 = arith.constant 1 : index diff --git a/flang/test/HLFIR/volatile.fir b/flang/test/HLFIR/volatile.fir index 453413a93af44..6d43bf20a702b 100644 --- a/flang/test/HLFIR/volatile.fir +++
b/flang/test/HLFIR/volatile.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt --convert-hlfir-to-fir %s -o - | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier --convert-hlfir-to-fir %s -o - | FileCheck %s func.func @foo() { %true = arith.constant true diff --git a/flang/test/HLFIR/volatile1.fir b/flang/test/HLFIR/volatile1.fir index 174acd77f9076..c6150fe72ed66 100644 --- a/flang/test/HLFIR/volatile1.fir +++ b/flang/test/HLFIR/volatile1.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s func.func @_QQmain() attributes {fir.bindc_name = "p"} { %0 = fir.address_of(@_QFEarr) : !fir.ref> %c10 = arith.constant 10 : index diff --git a/flang/test/HLFIR/volatile2.fir b/flang/test/HLFIR/volatile2.fir index 86ac683adad3f..0501cfcc8e8ac 100644 --- a/flang/test/HLFIR/volatile2.fir +++ b/flang/test/HLFIR/volatile2.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s func.func private @_QFPa() -> i32 attributes {fir.host_symbol = @_QQmain, llvm.linkage = #llvm.linkage} { %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFFaEa"} %1 = fir.volatile_cast %0 : (!fir.ref) -> !fir.ref diff --git a/flang/test/HLFIR/volatile3.fir b/flang/test/HLFIR/volatile3.fir index 41e42916e8ee5..24ea4e4b6df97 100644 --- a/flang/test/HLFIR/volatile3.fir +++ b/flang/test/HLFIR/volatile3.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s func.func @_QQmain() attributes {fir.bindc_name = "p"} { %0 = fir.address_of(@_QFEarr) : !fir.ref> %c10 = arith.constant 10 : index diff --git a/flang/test/HLFIR/volatile4.fir b/flang/test/HLFIR/volatile4.fir index 
cbf0aa31cb9f3..8980bcf932f81 100644 --- a/flang/test/HLFIR/volatile4.fir +++ b/flang/test/HLFIR/volatile4.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s func.func @_QQmain() attributes {fir.bindc_name = "p"} { %0 = fir.address_of(@_QFEarr) : !fir.ref> %c10 = arith.constant 10 : index diff --git a/flang/test/Lower/volatile-allocatable1.f90 b/flang/test/Lower/volatile-allocatable1.f90 new file mode 100644 index 0000000000000..a21359c3b4225 --- /dev/null +++ b/flang/test/Lower/volatile-allocatable1.f90 @@ -0,0 +1,17 @@ +! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s + +! Requires correct propagation of volatility for allocatable nested types. +! XFAIL: * + +function allocatable_udt() + type :: base_type + integer :: i = 42 + end type + type, extends(base_type) :: ext_type + integer :: j = 100 + end type + integer :: allocatable_udt + type(ext_type), allocatable, volatile :: v2(:,:) + allocate(v2(2,3)) + allocatable_udt = v2(1,1)%i +end function diff --git a/flang/test/Lower/volatile-openmp.f90 b/flang/test/Lower/volatile-openmp.f90 index 3269af9618f10..6277cf942b8ec 100644 --- a/flang/test/Lower/volatile-openmp.f90 +++ b/flang/test/Lower/volatile-openmp.f90 @@ -1,4 +1,4 @@ -! RUN: bbc -fopenmp %s -o - | FileCheck %s +! RUN: bbc --strict-fir-volatile-verifier -fopenmp %s -o - | FileCheck %s type t integer, pointer :: array(:) end type diff --git a/flang/test/Lower/volatile-string.f90 b/flang/test/Lower/volatile-string.f90 index 9173268880ace..88b21d7b245e9 100644 --- a/flang/test/Lower/volatile-string.f90 +++ b/flang/test/Lower/volatile-string.f90 @@ -1,4 +1,4 @@ -! RUN: bbc %s -o - | FileCheck %s +! 
RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s program p character(3), volatile :: string = 'foo' character(3) :: nonvolatile_string diff --git a/flang/test/Lower/volatile1.f90 b/flang/test/Lower/volatile1.f90 index 8447704619db0..385b9fa3bd1ad 100644 --- a/flang/test/Lower/volatile1.f90 +++ b/flang/test/Lower/volatile1.f90 @@ -1,4 +1,4 @@ -! RUN: bbc %s -o - | FileCheck %s +! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s program p integer,volatile::i,arr(10) diff --git a/flang/test/Lower/volatile2.f90 b/flang/test/Lower/volatile2.f90 index 4b7f185f24c41..defacf820bd54 100644 --- a/flang/test/Lower/volatile2.f90 +++ b/flang/test/Lower/volatile2.f90 @@ -1,4 +1,4 @@ -! RUN: bbc %s -o - | FileCheck %s +! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s program p print*,a(),b(),c() diff --git a/flang/test/Lower/volatile3.f90 b/flang/test/Lower/volatile3.f90 index dee6642e82593..8825f8f3afbcb 100644 --- a/flang/test/Lower/volatile3.f90 +++ b/flang/test/Lower/volatile3.f90 @@ -1,4 +1,4 @@ -! RUN: bbc %s -o - | FileCheck %s +! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s ! Test that all combinations of volatile pointer and target are properly lowered - ! note that a volatile pointer implies that the target is volatile, even if not specified diff --git a/flang/test/Lower/volatile4.f90 b/flang/test/Lower/volatile4.f90 index 42d7b68507b53..83ce2b8fdb25a 100644 --- a/flang/test/Lower/volatile4.f90 +++ b/flang/test/Lower/volatile4.f90 @@ -1,4 +1,4 @@ -! RUN: bbc %s -o - | FileCheck %s +! 
RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s program p integer,volatile::i,arr(10) From flang-commits at lists.llvm.org Thu May 1 12:19:48 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 12:19:48 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Hide strict volatility checks behind flag (PR #138183) In-Reply-To: Message-ID: <6813c954.170a0220.16fc7.6949@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Asher Mancinelli (ashermancinelli)
https://github.com/llvm/llvm-project/pull/138183 From flang-commits at lists.llvm.org Thu May 1 12:26:25 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 12:26:25 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][cuda] Use a reference for asyncObject (PR #138186) In-Reply-To: Message-ID: <6813cae1.170a0220.27fed9.6d5f@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-openacc Author: Valentin Clement (バレンタイン クレメン) (clementval)
Changes

Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation. This is a new attempt with some fixes; the previous one was reverted yesterday.

---
Patch is 88.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/138186.diff

52 Files Affected:

- (modified) flang-rt/include/flang-rt/runtime/allocator-registry.h (+2-2)
- (modified) flang-rt/include/flang-rt/runtime/descriptor.h (+3-3)
- (modified) flang-rt/include/flang-rt/runtime/reduction-templates.h (+1-1)
- (modified) flang-rt/lib/cuda/allocatable.cpp (+4-4)
- (modified) flang-rt/lib/cuda/allocator.cpp (+10-10)
- (modified) flang-rt/lib/cuda/descriptor.cpp (+1-1)
- (modified) flang-rt/lib/runtime/allocatable.cpp (+6-6)
- (modified) flang-rt/lib/runtime/array-constructor.cpp (+2-2)
- (modified) flang-rt/lib/runtime/assign.cpp (+2-2)
- (modified) flang-rt/lib/runtime/character.cpp (+11-9)
- (modified) flang-rt/lib/runtime/copy.cpp (+2-2)
- (modified) flang-rt/lib/runtime/derived.cpp (+3-3)
- (modified) flang-rt/lib/runtime/descriptor.cpp (+2-2)
- (modified) flang-rt/lib/runtime/extrema.cpp (+2-2)
- (modified) flang-rt/lib/runtime/findloc.cpp (+1-1)
- (modified) flang-rt/lib/runtime/matmul-transpose.cpp (+1-1)
- (modified) flang-rt/lib/runtime/matmul.cpp (+1-1)
- (modified) flang-rt/lib/runtime/misc-intrinsic.cpp (+1-1)
- (modified) flang-rt/lib/runtime/pointer.cpp (+1-1)
- (modified) flang-rt/lib/runtime/temporary-stack.cpp (+1-1)
- (modified) flang-rt/lib/runtime/tools.cpp (+1-1)
- (modified) flang-rt/lib/runtime/transformational.cpp (+2-2)
- (modified) flang-rt/unittests/Evaluate/reshape.cpp (+1-1)
- (modified) flang-rt/unittests/Runtime/Allocatable.cpp (+2-2)
- (modified) flang-rt/unittests/Runtime/CUDA/Allocatable.cpp (+8-4)
- (modified) flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp (+2-2)
- (modified) flang-rt/unittests/Runtime/CUDA/Memory.cpp (+2-2)
- (modified) flang-rt/unittests/Runtime/CharacterTest.cpp (+1-1)
- (modified)
flang-rt/unittests/Runtime/CommandTest.cpp (+4-4) - (modified) flang-rt/unittests/Runtime/TemporaryStack.cpp (+2-2) - (modified) flang-rt/unittests/Runtime/tools.h (+1-1) - (modified) flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td (+5-6) - (modified) flang/include/flang/Runtime/CUDA/allocatable.h (+4-4) - (modified) flang/include/flang/Runtime/CUDA/allocator.h (+4-4) - (modified) flang/include/flang/Runtime/CUDA/pointer.h (+4-4) - (modified) flang/include/flang/Runtime/allocatable.h (+4-3) - (modified) flang/lib/Lower/Allocatable.cpp (+1-1) - (modified) flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp (+3-4) - (modified) flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp (+11-11) - (modified) flang/lib/Optimizer/Transforms/CUFOpConversion.cpp (+4-6) - (modified) flang/test/Fir/CUDA/cuda-allocate.fir (+8-10) - (modified) flang/test/Fir/cuf-invalid.fir (+2-3) - (modified) flang/test/Fir/cuf.mlir (+3-4) - (modified) flang/test/HLFIR/elemental-codegen.fir (+3-3) - (modified) flang/test/Lower/CUDA/cuda-allocatable.cuf (+4-5) - (modified) flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 (+2-2) - (modified) flang/test/Lower/OpenACC/acc-declare.f90 (+2-2) - (modified) flang/test/Lower/allocatable-polymorphic.f90 (+13-13) - (modified) flang/test/Lower/allocatable-runtime.f90 (+2-2) - (modified) flang/test/Lower/allocate-mold.f90 (+2-2) - (modified) flang/test/Lower/polymorphic.f90 (+1-1) - (modified) flang/test/Transforms/lower-repack-arrays.fir (+4-4) ``````````diff diff --git a/flang-rt/include/flang-rt/runtime/allocator-registry.h b/flang-rt/include/flang-rt/runtime/allocator-registry.h index 33e8e2c7d7850..f0ba77a360736 100644 --- a/flang-rt/include/flang-rt/runtime/allocator-registry.h +++ b/flang-rt/include/flang-rt/runtime/allocator-registry.h @@ -19,7 +19,7 @@ namespace Fortran::runtime { -using AllocFct = void *(*)(std::size_t, std::int64_t); +using AllocFct = void *(*)(std::size_t, std::int64_t *); using FreeFct = void (*)(void *); typedef struct 
Allocator_t { @@ -28,7 +28,7 @@ typedef struct Allocator_t { } Allocator_t; static RT_API_ATTRS void *MallocWrapper( - std::size_t size, [[maybe_unused]] std::int64_t) { + std::size_t size, [[maybe_unused]] std::int64_t *) { return std::malloc(size); } #ifdef RT_DEVICE_COMPILATION diff --git a/flang-rt/include/flang-rt/runtime/descriptor.h b/flang-rt/include/flang-rt/runtime/descriptor.h index 9907e7866e7bf..c98e6b14850cb 100644 --- a/flang-rt/include/flang-rt/runtime/descriptor.h +++ b/flang-rt/include/flang-rt/runtime/descriptor.h @@ -29,8 +29,8 @@ #include #include -/// Value used for asyncId when no specific stream is specified. -static constexpr std::int64_t kNoAsyncId = -1; +/// Value used for asyncObject when no specific stream is specified. +static constexpr std::int64_t *kNoAsyncObject = nullptr; namespace Fortran::runtime { @@ -372,7 +372,7 @@ class Descriptor { // before calling. It (re)computes the byte strides after // allocation. Does not allocate automatic components or // perform default component initialization. - RT_API_ATTRS int Allocate(std::int64_t asyncId); + RT_API_ATTRS int Allocate(std::int64_t *asyncObject); RT_API_ATTRS void SetByteStrides(); // Deallocates storage; does not call FINAL subroutines or diff --git a/flang-rt/include/flang-rt/runtime/reduction-templates.h b/flang-rt/include/flang-rt/runtime/reduction-templates.h index 77f77a592a476..18412708b02c5 100644 --- a/flang-rt/include/flang-rt/runtime/reduction-templates.h +++ b/flang-rt/include/flang-rt/runtime/reduction-templates.h @@ -347,7 +347,7 @@ inline RT_API_ATTRS void DoMaxMinNorm2(Descriptor &result, const Descriptor &x, // as the element size of the source. 
result.Establish(x.type(), x.ElementBytes(), nullptr, 0, nullptr, CFI_attribute_allocatable); - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/cuda/allocatable.cpp b/flang-rt/lib/cuda/allocatable.cpp index 432974d18a3e3..c77819e9440d7 100644 --- a/flang-rt/lib/cuda/allocatable.cpp +++ b/flang-rt/lib/cuda/allocatable.cpp @@ -23,7 +23,7 @@ namespace Fortran::runtime::cuda { extern "C" { RT_EXT_API_GROUP_BEGIN -int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( @@ -41,7 +41,7 @@ int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, return stat; } -int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { if (desc.HasAddendum()) { @@ -63,7 +63,7 @@ int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, } int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; @@ -76,7 +76,7 @@ int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, } int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor 
*errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocateSync)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; diff --git a/flang-rt/lib/cuda/allocator.cpp b/flang-rt/lib/cuda/allocator.cpp index 51119ab251168..f4289c55bd8de 100644 --- a/flang-rt/lib/cuda/allocator.cpp +++ b/flang-rt/lib/cuda/allocator.cpp @@ -98,7 +98,7 @@ static unsigned findAllocation(void *ptr) { return allocNotFound; } -static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { +static void insertAllocation(void *ptr, std::size_t size, cudaStream_t stream) { CriticalSection critical{lock}; initAllocations(); if (numDeviceAllocations >= maxDeviceAllocations) { @@ -106,7 +106,7 @@ static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { } deviceAllocations[numDeviceAllocations].ptr = ptr; deviceAllocations[numDeviceAllocations].size = size; - deviceAllocations[numDeviceAllocations].stream = (cudaStream_t)stream; + deviceAllocations[numDeviceAllocations].stream = stream; ++numDeviceAllocations; qsort(deviceAllocations, numDeviceAllocations, sizeof(DeviceAllocation), compareDeviceAlloc); @@ -136,7 +136,7 @@ void RTDEF(CUFRegisterAllocator)() { } void *CUFAllocPinned( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR(cudaMallocHost((void **)&p, sizeInBytes)); return p; @@ -144,18 +144,18 @@ void *CUFAllocPinned( void CUFFreePinned(void *p) { CUDA_REPORT_IF_ERROR(cudaFreeHost(p)); } -void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t asyncId) { +void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t *asyncObject) { void *p; if (Fortran::runtime::executionEnvironment.cudaDeviceIsManaged) { CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); } else { - if (asyncId == kNoAsyncId) { + if (asyncObject == kNoAsyncObject) { 
CUDA_REPORT_IF_ERROR(cudaMalloc(&p, sizeInBytes)); } else { CUDA_REPORT_IF_ERROR( - cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)asyncId)); - insertAllocation(p, sizeInBytes, asyncId); + cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)*asyncObject)); + insertAllocation(p, sizeInBytes, (cudaStream_t)*asyncObject); } } return p; @@ -174,7 +174,7 @@ void CUFFreeDevice(void *p) { } void *CUFAllocManaged( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); @@ -184,9 +184,9 @@ void *CUFAllocManaged( void CUFFreeManaged(void *p) { CUDA_REPORT_IF_ERROR(cudaFree(p)); } void *CUFAllocUnified( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { // Call alloc managed for the time being. - return CUFAllocManaged(sizeInBytes, asyncId); + return CUFAllocManaged(sizeInBytes, asyncObject); } void CUFFreeUnified(void *p) { diff --git a/flang-rt/lib/cuda/descriptor.cpp b/flang-rt/lib/cuda/descriptor.cpp index 175e8c0ef8438..7b768f91af29d 100644 --- a/flang-rt/lib/cuda/descriptor.cpp +++ b/flang-rt/lib/cuda/descriptor.cpp @@ -21,7 +21,7 @@ RT_EXT_API_GROUP_BEGIN Descriptor *RTDEF(CUFAllocDescriptor)( std::size_t sizeInBytes, const char *sourceFile, int sourceLine) { return reinterpret_cast( - CUFAllocManaged(sizeInBytes, /*asyncId*/ -1)); + CUFAllocManaged(sizeInBytes, /*asyncObject=*/nullptr)); } void RTDEF(CUFFreeDescriptor)( diff --git a/flang-rt/lib/runtime/allocatable.cpp b/flang-rt/lib/runtime/allocatable.cpp index 6acce34eb9a9e..ef18da6ea0786 100644 --- a/flang-rt/lib/runtime/allocatable.cpp +++ b/flang-rt/lib/runtime/allocatable.cpp @@ -133,17 +133,17 @@ void RTDEF(AllocatableApplyMold)( } } -int RTDEF(AllocatableAllocate)(Descriptor &descriptor, std::int64_t asyncId, - bool hasStat, const Descriptor 
*errMsg, const char *sourceFile, - int sourceLine) { +int RTDEF(AllocatableAllocate)(Descriptor &descriptor, + std::int64_t *asyncObject, bool hasStat, const Descriptor *errMsg, + const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; if (!descriptor.IsAllocatable()) { return ReturnError(terminator, StatInvalidDescriptor, errMsg, hasStat); } else if (descriptor.IsAllocated()) { return ReturnError(terminator, StatBaseNotNull, errMsg, hasStat); } else { - int stat{ - ReturnError(terminator, descriptor.Allocate(asyncId), errMsg, hasStat)}; + int stat{ReturnError( + terminator, descriptor.Allocate(asyncObject), errMsg, hasStat)}; if (stat == StatOk) { if (const DescriptorAddendum * addendum{descriptor.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -162,7 +162,7 @@ int RTDEF(AllocatableAllocateSource)(Descriptor &alloc, const Descriptor &source, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(AllocatableAllocate)( - alloc, /*asyncId=*/-1, hasStat, errMsg, sourceFile, sourceLine)}; + alloc, /*asyncObject=*/nullptr, hasStat, errMsg, sourceFile, sourceLine)}; if (stat == StatOk) { Terminator terminator{sourceFile, sourceLine}; DoFromSourceAssign(alloc, source, terminator); diff --git a/flang-rt/lib/runtime/array-constructor.cpp b/flang-rt/lib/runtime/array-constructor.cpp index 67b3b5e1e0f50..858fac7bf2b39 100644 --- a/flang-rt/lib/runtime/array-constructor.cpp +++ b/flang-rt/lib/runtime/array-constructor.cpp @@ -50,7 +50,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( initialAllocationSize(fromElements, to.ElementBytes())}; to.GetDimension(0).SetBounds(1, allocationSize); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); to.GetDimension(0).SetBounds(1, fromElements); vector.actualAllocationSize = 
allocationSize; @@ -59,7 +59,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( // first value: there should be no reallocation. RUNTIME_CHECK(terminator, previousToElements >= fromElements); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); vector.actualAllocationSize = previousToElements; } diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 4a813cd489022..8a4fa36c91479 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -99,7 +99,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; + int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; if (result == StatOk && derived && !derived->noInitializationNeeded()) { result = ReturnError(terminator, Initialize(to, *derived, terminator)); } @@ -277,7 +277,7 @@ RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, // entity, otherwise, the Deallocate() below will not // free the descriptor memory. 
newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; + auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; if (stat == StatOk) { if (HasDynamicComponent(from)) { // If 'from' has allocatable/automatic component, we cannot diff --git a/flang-rt/lib/runtime/character.cpp b/flang-rt/lib/runtime/character.cpp index d1152ee1caefb..f140d202e118e 100644 --- a/flang-rt/lib/runtime/character.cpp +++ b/flang-rt/lib/runtime/character.cpp @@ -118,7 +118,7 @@ static RT_API_ATTRS void Compare(Descriptor &result, const Descriptor &x, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("Compare: could not allocate storage for result"); } std::size_t xChars{x.ElementBytes() >> shift}; @@ -173,7 +173,7 @@ static RT_API_ATTRS void AdjustLRHelper(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("ADJUSTL/R: could not allocate storage for result"); } for (SubscriptValue resultAt{0}; elements-- > 0; @@ -227,7 +227,7 @@ static RT_API_ATTRS void LenTrim(Descriptor &result, const Descriptor &string, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("LEN_TRIM: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -427,7 +427,7 @@ static RT_API_ATTRS void GeneralCharFunc(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { 
terminator.Crash("SCAN/VERIFY: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -530,7 +530,8 @@ static RT_API_ATTRS void MaxMinHelper(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); } for (CHAR *result{accumulator.OffsetElement()}; elements-- > 0; accumData += accumChars, result += chars, x.IncrementSubscripts(xAt)) { @@ -606,7 +607,7 @@ void RTDEF(CharacterConcatenate)(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - if (accumulator.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (accumulator.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash( "CharacterConcatenate: could not allocate storage for result"); } @@ -629,7 +630,8 @@ void RTDEF(CharacterConcatenateScalar1)( accumulator.set_base_addr(nullptr); std::size_t oldLen{accumulator.ElementBytes()}; accumulator.raw().elem_len += chars; - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(accumulator.OffsetElement(oldLen), from, chars); FreeMemory(old); } @@ -831,7 +833,7 @@ void RTDEF(Repeat)(Descriptor &result, const Descriptor &string, std::size_t origBytes{string.ElementBytes()}; result.Establish(string.type(), origBytes * ncopies, nullptr, 0, nullptr, CFI_attribute_allocatable); - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("REPEAT could not allocate storage for result"); } const char *from{string.OffsetElement()}; @@ -865,7 +867,7 @@ void RTDEF(Trim)(Descriptor &result, const Descriptor &string, } result.Establish(string.type(), resultBytes, nullptr, 0, nullptr, 
CFI_attribute_allocatable); - RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(result.OffsetElement(), string.OffsetElement(), resultBytes); } diff --git a/flang-rt/lib/runtime/copy.cpp b/flang-rt/lib/runtime/copy.cpp index 3a0f98cf8d376..f990f46e0be66 100644 --- a/flang-rt/lib/runtime/copy.cpp +++ b/flang-rt/lib/runtime/copy.cpp @@ -171,8 +171,8 @@ RT_API_ATTRS void CopyElement(const Descriptor &to, const SubscriptValue toAt[], *reinterpret_cast(toPtr + component->offset())}; if (toDesc.raw().base_addr != nullptr) { toDesc.set_base_addr(nullptr); - RUNTIME_CHECK( - terminator, toDesc.Allocate(/*asyncId=*/-1) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, + toDesc.Allocate(/*asyncObject=*/nullptr) == CFI_SUCCESS); const Descriptor &fromDesc{*reinterpret_cast( fromPtr + component->offset())}; copyStack.emplace(toDesc, fromDesc); diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..35037036f63e7 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -52,7 +52,7 @@ RT_API_ATTRS int Initialize(const Descriptor &instance, allocDesc.raw().attribute = CFI_attribute_allocatable; if (comp.genre() == typeInfo::Component::Genre::Automatic) { stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); + terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -153,7 +153,7 @@ RT_API_ATTRS int InitializeClone(const Descriptor &clone, if (origDesc.IsAllocated()) { cloneDesc.ApplyMold(origDesc, origDesc.rank()); stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); + terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * 
addendum{cloneDesc.Addendum()}) { if (const typeInfo::DerivedType * @@ -260,7 +260,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy.raw().attribute = CFI_attribute_allocatable; Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } diff --git a/flang-rt/lib/runtime/descriptor.cpp b/flang-rt/lib/runtime/descriptor.cpp index 3debf53bb5290..67336d01380e0 100644 --- a/flang-rt/lib/runtime/descriptor.cpp +++ b/flang-rt/lib/runtime/descriptor.cpp @@ -158,7 +158,7 @@ RT_API_ATTRS static inline int MapAllocIdx(const Descriptor &desc) { #endif } -RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { +RT_API_ATTRS int Descriptor::Allocate(std::int64_t *asyncObject) { std::size_t elementBytes{ElementBytes()}; if (static_cast(elementBytes) < 0) { // F'2023 7.4.4.2 p5: "If the character length parameter value evaluates @@ -170,7 +170,7 @@ RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { // Zero size allocation is possible in Fortran and the resulting // descriptor must be allocated/associated. Since std::malloc(0) // result is implementation defined, always allocate at least one byte. - void *p{alloc(byteSize ? byteSize : 1, asyncId)}; + void *p{alloc(byteSize ? byteSize : 1, asyncObject)}; if (!p) { return CFI_ERROR_MEM_ALLOCATION; } diff --git a/flang-rt/lib/runtime/extrema.cpp b/flang-rt/lib/r... [truncated] ``````````
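The calling convention this patch moves to can be sketched on the host without CUDA. The async argument changes from an `std::int64_t` value with `-1` as the "no stream" sentinel to an `std::int64_t *` with `nullptr` as the sentinel, and the pointer is dereferenced only on the asynchronous path. In the sketch below, `AllocFct` and `kNoAsyncObject` mirror the declarations in the diff; `FakeStream`, `SketchAllocDevice`, `SketchMallocWrapper`, `lastWasAsync` and `lastStreamUsed` are hypothetical stand-ins invented for illustration and are not part of flang-rt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Mirrors the patched signature: the second argument points at the async
// object instead of carrying the async id by value.
using AllocFct = void *(*)(std::size_t, std::int64_t *);

// Same sentinel as in the patch: a null pointer means "no stream requested".
static constexpr std::int64_t *kNoAsyncObject = nullptr;

// Hypothetical stand-in for cudaStream_t so the sketch builds without CUDA.
using FakeStream = std::int64_t;

// Illustration-only bookkeeping: which path the last allocation took.
static bool lastWasAsync = false;
static FakeStream lastStreamUsed = 0;

// Hypothetical analogue of CUFAllocDevice. The real function calls
// cudaMalloc or cudaMallocAsync; here both paths fall back to std::malloc.
void *SketchAllocDevice(std::size_t sizeInBytes, std::int64_t *asyncObject) {
  if (asyncObject == kNoAsyncObject) {
    // Synchronous path: the pointer is never dereferenced.
    lastWasAsync = false;
  } else {
    // Asynchronous path: the stream handle is read through the pointer.
    lastWasAsync = true;
    lastStreamUsed = static_cast<FakeStream>(*asyncObject);
  }
  // Always request at least one byte, as Descriptor::Allocate does.
  return std::malloc(sizeInBytes ? sizeInBytes : 1);
}

// Hypothetical analogue of MallocWrapper: the async object is ignored.
void *SketchMallocWrapper(std::size_t size, [[maybe_unused]] std::int64_t *) {
  return std::malloc(size);
}
```

Under this shape, a call site that used to pass `/*asyncId=*/-1` passes `/*asyncObject=*/nullptr`, and a caller that wants a stream passes the address of the variable holding the handle, for example `SketchAllocDevice(64, &stream)`.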
https://github.com/llvm/llvm-project/pull/138186 From flang-commits at lists.llvm.org Thu May 1 12:26:27 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 12:26:27 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][cuda] Use a reference for asyncObject (PR #138186) In-Reply-To: Message-ID: <6813cae3.650a0220.89192.0190@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Valentin Clement (バレンタイン クレメン) (clementval)
Changes Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation. New attempt with some fixes. The previous one was reverted yesterday. --- Patch is 88.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/138186.diff 52 Files Affected: - (modified) flang-rt/include/flang-rt/runtime/allocator-registry.h (+2-2) - (modified) flang-rt/include/flang-rt/runtime/descriptor.h (+3-3) - (modified) flang-rt/include/flang-rt/runtime/reduction-templates.h (+1-1) - (modified) flang-rt/lib/cuda/allocatable.cpp (+4-4) - (modified) flang-rt/lib/cuda/allocator.cpp (+10-10) - (modified) flang-rt/lib/cuda/descriptor.cpp (+1-1) - (modified) flang-rt/lib/runtime/allocatable.cpp (+6-6) - (modified) flang-rt/lib/runtime/array-constructor.cpp (+2-2) - (modified) flang-rt/lib/runtime/assign.cpp (+2-2) - (modified) flang-rt/lib/runtime/character.cpp (+11-9) - (modified) flang-rt/lib/runtime/copy.cpp (+2-2) - (modified) flang-rt/lib/runtime/derived.cpp (+3-3) - (modified) flang-rt/lib/runtime/descriptor.cpp (+2-2) - (modified) flang-rt/lib/runtime/extrema.cpp (+2-2) - (modified) flang-rt/lib/runtime/findloc.cpp (+1-1) - (modified) flang-rt/lib/runtime/matmul-transpose.cpp (+1-1) - (modified) flang-rt/lib/runtime/matmul.cpp (+1-1) - (modified) flang-rt/lib/runtime/misc-intrinsic.cpp (+1-1) - (modified) flang-rt/lib/runtime/pointer.cpp (+1-1) - (modified) flang-rt/lib/runtime/temporary-stack.cpp (+1-1) - (modified) flang-rt/lib/runtime/tools.cpp (+1-1) - (modified) flang-rt/lib/runtime/transformational.cpp (+2-2) - (modified) flang-rt/unittests/Evaluate/reshape.cpp (+1-1) - (modified) flang-rt/unittests/Runtime/Allocatable.cpp (+2-2) - (modified) flang-rt/unittests/Runtime/CUDA/Allocatable.cpp (+8-4) - (modified) flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp (+2-2) - (modified) flang-rt/unittests/Runtime/CUDA/Memory.cpp (+2-2) - (modified) flang-rt/unittests/Runtime/CharacterTest.cpp (+1-1) - (modified) 
flang-rt/unittests/Runtime/CommandTest.cpp (+4-4) - (modified) flang-rt/unittests/Runtime/TemporaryStack.cpp (+2-2) - (modified) flang-rt/unittests/Runtime/tools.h (+1-1) - (modified) flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td (+5-6) - (modified) flang/include/flang/Runtime/CUDA/allocatable.h (+4-4) - (modified) flang/include/flang/Runtime/CUDA/allocator.h (+4-4) - (modified) flang/include/flang/Runtime/CUDA/pointer.h (+4-4) - (modified) flang/include/flang/Runtime/allocatable.h (+4-3) - (modified) flang/lib/Lower/Allocatable.cpp (+1-1) - (modified) flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp (+3-4) - (modified) flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp (+11-11) - (modified) flang/lib/Optimizer/Transforms/CUFOpConversion.cpp (+4-6) - (modified) flang/test/Fir/CUDA/cuda-allocate.fir (+8-10) - (modified) flang/test/Fir/cuf-invalid.fir (+2-3) - (modified) flang/test/Fir/cuf.mlir (+3-4) - (modified) flang/test/HLFIR/elemental-codegen.fir (+3-3) - (modified) flang/test/Lower/CUDA/cuda-allocatable.cuf (+4-5) - (modified) flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 (+2-2) - (modified) flang/test/Lower/OpenACC/acc-declare.f90 (+2-2) - (modified) flang/test/Lower/allocatable-polymorphic.f90 (+13-13) - (modified) flang/test/Lower/allocatable-runtime.f90 (+2-2) - (modified) flang/test/Lower/allocate-mold.f90 (+2-2) - (modified) flang/test/Lower/polymorphic.f90 (+1-1) - (modified) flang/test/Transforms/lower-repack-arrays.fir (+4-4) ``````````diff diff --git a/flang-rt/include/flang-rt/runtime/allocator-registry.h b/flang-rt/include/flang-rt/runtime/allocator-registry.h index 33e8e2c7d7850..f0ba77a360736 100644 --- a/flang-rt/include/flang-rt/runtime/allocator-registry.h +++ b/flang-rt/include/flang-rt/runtime/allocator-registry.h @@ -19,7 +19,7 @@ namespace Fortran::runtime { -using AllocFct = void *(*)(std::size_t, std::int64_t); +using AllocFct = void *(*)(std::size_t, std::int64_t *); using FreeFct = void (*)(void *); typedef struct 
Allocator_t { @@ -28,7 +28,7 @@ typedef struct Allocator_t { } Allocator_t; static RT_API_ATTRS void *MallocWrapper( - std::size_t size, [[maybe_unused]] std::int64_t) { + std::size_t size, [[maybe_unused]] std::int64_t *) { return std::malloc(size); } #ifdef RT_DEVICE_COMPILATION diff --git a/flang-rt/include/flang-rt/runtime/descriptor.h b/flang-rt/include/flang-rt/runtime/descriptor.h index 9907e7866e7bf..c98e6b14850cb 100644 --- a/flang-rt/include/flang-rt/runtime/descriptor.h +++ b/flang-rt/include/flang-rt/runtime/descriptor.h @@ -29,8 +29,8 @@ #include #include -/// Value used for asyncId when no specific stream is specified. -static constexpr std::int64_t kNoAsyncId = -1; +/// Value used for asyncObject when no specific stream is specified. +static constexpr std::int64_t *kNoAsyncObject = nullptr; namespace Fortran::runtime { @@ -372,7 +372,7 @@ class Descriptor { // before calling. It (re)computes the byte strides after // allocation. Does not allocate automatic components or // perform default component initialization. - RT_API_ATTRS int Allocate(std::int64_t asyncId); + RT_API_ATTRS int Allocate(std::int64_t *asyncObject); RT_API_ATTRS void SetByteStrides(); // Deallocates storage; does not call FINAL subroutines or diff --git a/flang-rt/include/flang-rt/runtime/reduction-templates.h b/flang-rt/include/flang-rt/runtime/reduction-templates.h index 77f77a592a476..18412708b02c5 100644 --- a/flang-rt/include/flang-rt/runtime/reduction-templates.h +++ b/flang-rt/include/flang-rt/runtime/reduction-templates.h @@ -347,7 +347,7 @@ inline RT_API_ATTRS void DoMaxMinNorm2(Descriptor &result, const Descriptor &x, // as the element size of the source. 
result.Establish(x.type(), x.ElementBytes(), nullptr, 0, nullptr, CFI_attribute_allocatable); - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/cuda/allocatable.cpp b/flang-rt/lib/cuda/allocatable.cpp index 432974d18a3e3..c77819e9440d7 100644 --- a/flang-rt/lib/cuda/allocatable.cpp +++ b/flang-rt/lib/cuda/allocatable.cpp @@ -23,7 +23,7 @@ namespace Fortran::runtime::cuda { extern "C" { RT_EXT_API_GROUP_BEGIN -int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( @@ -41,7 +41,7 @@ int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, return stat; } -int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { if (desc.HasAddendum()) { @@ -63,7 +63,7 @@ int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, } int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; @@ -76,7 +76,7 @@ int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, } int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor 
*errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocateSync)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; diff --git a/flang-rt/lib/cuda/allocator.cpp b/flang-rt/lib/cuda/allocator.cpp index 51119ab251168..f4289c55bd8de 100644 --- a/flang-rt/lib/cuda/allocator.cpp +++ b/flang-rt/lib/cuda/allocator.cpp @@ -98,7 +98,7 @@ static unsigned findAllocation(void *ptr) { return allocNotFound; } -static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { +static void insertAllocation(void *ptr, std::size_t size, cudaStream_t stream) { CriticalSection critical{lock}; initAllocations(); if (numDeviceAllocations >= maxDeviceAllocations) { @@ -106,7 +106,7 @@ static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { } deviceAllocations[numDeviceAllocations].ptr = ptr; deviceAllocations[numDeviceAllocations].size = size; - deviceAllocations[numDeviceAllocations].stream = (cudaStream_t)stream; + deviceAllocations[numDeviceAllocations].stream = stream; ++numDeviceAllocations; qsort(deviceAllocations, numDeviceAllocations, sizeof(DeviceAllocation), compareDeviceAlloc); @@ -136,7 +136,7 @@ void RTDEF(CUFRegisterAllocator)() { } void *CUFAllocPinned( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR(cudaMallocHost((void **)&p, sizeInBytes)); return p; @@ -144,18 +144,18 @@ void *CUFAllocPinned( void CUFFreePinned(void *p) { CUDA_REPORT_IF_ERROR(cudaFreeHost(p)); } -void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t asyncId) { +void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t *asyncObject) { void *p; if (Fortran::runtime::executionEnvironment.cudaDeviceIsManaged) { CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); } else { - if (asyncId == kNoAsyncId) { + if (asyncObject == kNoAsyncObject) { 
CUDA_REPORT_IF_ERROR(cudaMalloc(&p, sizeInBytes)); } else { CUDA_REPORT_IF_ERROR( - cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)asyncId)); - insertAllocation(p, sizeInBytes, asyncId); + cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)*asyncObject)); + insertAllocation(p, sizeInBytes, (cudaStream_t)*asyncObject); } } return p; @@ -174,7 +174,7 @@ void CUFFreeDevice(void *p) { } void *CUFAllocManaged( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); @@ -184,9 +184,9 @@ void *CUFAllocManaged( void CUFFreeManaged(void *p) { CUDA_REPORT_IF_ERROR(cudaFree(p)); } void *CUFAllocUnified( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { // Call alloc managed for the time being. - return CUFAllocManaged(sizeInBytes, asyncId); + return CUFAllocManaged(sizeInBytes, asyncObject); } void CUFFreeUnified(void *p) { diff --git a/flang-rt/lib/cuda/descriptor.cpp b/flang-rt/lib/cuda/descriptor.cpp index 175e8c0ef8438..7b768f91af29d 100644 --- a/flang-rt/lib/cuda/descriptor.cpp +++ b/flang-rt/lib/cuda/descriptor.cpp @@ -21,7 +21,7 @@ RT_EXT_API_GROUP_BEGIN Descriptor *RTDEF(CUFAllocDescriptor)( std::size_t sizeInBytes, const char *sourceFile, int sourceLine) { return reinterpret_cast( - CUFAllocManaged(sizeInBytes, /*asyncId*/ -1)); + CUFAllocManaged(sizeInBytes, /*asyncObject=*/nullptr)); } void RTDEF(CUFFreeDescriptor)( diff --git a/flang-rt/lib/runtime/allocatable.cpp b/flang-rt/lib/runtime/allocatable.cpp index 6acce34eb9a9e..ef18da6ea0786 100644 --- a/flang-rt/lib/runtime/allocatable.cpp +++ b/flang-rt/lib/runtime/allocatable.cpp @@ -133,17 +133,17 @@ void RTDEF(AllocatableApplyMold)( } } -int RTDEF(AllocatableAllocate)(Descriptor &descriptor, std::int64_t asyncId, - bool hasStat, const Descriptor 
*errMsg, const char *sourceFile, - int sourceLine) { +int RTDEF(AllocatableAllocate)(Descriptor &descriptor, + std::int64_t *asyncObject, bool hasStat, const Descriptor *errMsg, + const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; if (!descriptor.IsAllocatable()) { return ReturnError(terminator, StatInvalidDescriptor, errMsg, hasStat); } else if (descriptor.IsAllocated()) { return ReturnError(terminator, StatBaseNotNull, errMsg, hasStat); } else { - int stat{ - ReturnError(terminator, descriptor.Allocate(asyncId), errMsg, hasStat)}; + int stat{ReturnError( + terminator, descriptor.Allocate(asyncObject), errMsg, hasStat)}; if (stat == StatOk) { if (const DescriptorAddendum * addendum{descriptor.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -162,7 +162,7 @@ int RTDEF(AllocatableAllocateSource)(Descriptor &alloc, const Descriptor &source, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(AllocatableAllocate)( - alloc, /*asyncId=*/-1, hasStat, errMsg, sourceFile, sourceLine)}; + alloc, /*asyncObject=*/nullptr, hasStat, errMsg, sourceFile, sourceLine)}; if (stat == StatOk) { Terminator terminator{sourceFile, sourceLine}; DoFromSourceAssign(alloc, source, terminator); diff --git a/flang-rt/lib/runtime/array-constructor.cpp b/flang-rt/lib/runtime/array-constructor.cpp index 67b3b5e1e0f50..858fac7bf2b39 100644 --- a/flang-rt/lib/runtime/array-constructor.cpp +++ b/flang-rt/lib/runtime/array-constructor.cpp @@ -50,7 +50,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( initialAllocationSize(fromElements, to.ElementBytes())}; to.GetDimension(0).SetBounds(1, allocationSize); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); to.GetDimension(0).SetBounds(1, fromElements); vector.actualAllocationSize = 
allocationSize; @@ -59,7 +59,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( // first value: there should be no reallocation. RUNTIME_CHECK(terminator, previousToElements >= fromElements); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); vector.actualAllocationSize = previousToElements; } diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 4a813cd489022..8a4fa36c91479 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -99,7 +99,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; + int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; if (result == StatOk && derived && !derived->noInitializationNeeded()) { result = ReturnError(terminator, Initialize(to, *derived, terminator)); } @@ -277,7 +277,7 @@ RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, // entity, otherwise, the Deallocate() below will not // free the descriptor memory. 
newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; + auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; if (stat == StatOk) { if (HasDynamicComponent(from)) { // If 'from' has allocatable/automatic component, we cannot diff --git a/flang-rt/lib/runtime/character.cpp b/flang-rt/lib/runtime/character.cpp index d1152ee1caefb..f140d202e118e 100644 --- a/flang-rt/lib/runtime/character.cpp +++ b/flang-rt/lib/runtime/character.cpp @@ -118,7 +118,7 @@ static RT_API_ATTRS void Compare(Descriptor &result, const Descriptor &x, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("Compare: could not allocate storage for result"); } std::size_t xChars{x.ElementBytes() >> shift}; @@ -173,7 +173,7 @@ static RT_API_ATTRS void AdjustLRHelper(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("ADJUSTL/R: could not allocate storage for result"); } for (SubscriptValue resultAt{0}; elements-- > 0; @@ -227,7 +227,7 @@ static RT_API_ATTRS void LenTrim(Descriptor &result, const Descriptor &string, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("LEN_TRIM: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -427,7 +427,7 @@ static RT_API_ATTRS void GeneralCharFunc(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { 
terminator.Crash("SCAN/VERIFY: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -530,7 +530,8 @@ static RT_API_ATTRS void MaxMinHelper(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); } for (CHAR *result{accumulator.OffsetElement()}; elements-- > 0; accumData += accumChars, result += chars, x.IncrementSubscripts(xAt)) { @@ -606,7 +607,7 @@ void RTDEF(CharacterConcatenate)(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - if (accumulator.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (accumulator.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash( "CharacterConcatenate: could not allocate storage for result"); } @@ -629,7 +630,8 @@ void RTDEF(CharacterConcatenateScalar1)( accumulator.set_base_addr(nullptr); std::size_t oldLen{accumulator.ElementBytes()}; accumulator.raw().elem_len += chars; - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(accumulator.OffsetElement(oldLen), from, chars); FreeMemory(old); } @@ -831,7 +833,7 @@ void RTDEF(Repeat)(Descriptor &result, const Descriptor &string, std::size_t origBytes{string.ElementBytes()}; result.Establish(string.type(), origBytes * ncopies, nullptr, 0, nullptr, CFI_attribute_allocatable); - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("REPEAT could not allocate storage for result"); } const char *from{string.OffsetElement()}; @@ -865,7 +867,7 @@ void RTDEF(Trim)(Descriptor &result, const Descriptor &string, } result.Establish(string.type(), resultBytes, nullptr, 0, nullptr, 
CFI_attribute_allocatable); - RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(result.OffsetElement(), string.OffsetElement(), resultBytes); } diff --git a/flang-rt/lib/runtime/copy.cpp b/flang-rt/lib/runtime/copy.cpp index 3a0f98cf8d376..f990f46e0be66 100644 --- a/flang-rt/lib/runtime/copy.cpp +++ b/flang-rt/lib/runtime/copy.cpp @@ -171,8 +171,8 @@ RT_API_ATTRS void CopyElement(const Descriptor &to, const SubscriptValue toAt[], *reinterpret_cast(toPtr + component->offset())}; if (toDesc.raw().base_addr != nullptr) { toDesc.set_base_addr(nullptr); - RUNTIME_CHECK( - terminator, toDesc.Allocate(/*asyncId=*/-1) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, + toDesc.Allocate(/*asyncObject=*/nullptr) == CFI_SUCCESS); const Descriptor &fromDesc{*reinterpret_cast( fromPtr + component->offset())}; copyStack.emplace(toDesc, fromDesc); diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..35037036f63e7 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -52,7 +52,7 @@ RT_API_ATTRS int Initialize(const Descriptor &instance, allocDesc.raw().attribute = CFI_attribute_allocatable; if (comp.genre() == typeInfo::Component::Genre::Automatic) { stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); + terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -153,7 +153,7 @@ RT_API_ATTRS int InitializeClone(const Descriptor &clone, if (origDesc.IsAllocated()) { cloneDesc.ApplyMold(origDesc, origDesc.rank()); stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); + terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * 
addendum{cloneDesc.Addendum()}) { if (const typeInfo::DerivedType * @@ -260,7 +260,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy.raw().attribute = CFI_attribute_allocatable; Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } diff --git a/flang-rt/lib/runtime/descriptor.cpp b/flang-rt/lib/runtime/descriptor.cpp index 3debf53bb5290..67336d01380e0 100644 --- a/flang-rt/lib/runtime/descriptor.cpp +++ b/flang-rt/lib/runtime/descriptor.cpp @@ -158,7 +158,7 @@ RT_API_ATTRS static inline int MapAllocIdx(const Descriptor &desc) { #endif } -RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { +RT_API_ATTRS int Descriptor::Allocate(std::int64_t *asyncObject) { std::size_t elementBytes{ElementBytes()}; if (static_cast(elementBytes) < 0) { // F'2023 7.4.4.2 p5: "If the character length parameter value evaluates @@ -170,7 +170,7 @@ RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { // Zero size allocation is possible in Fortran and the resulting // descriptor must be allocated/associated. Since std::malloc(0) // result is implementation defined, always allocate at least one byte. - void *p{alloc(byteSize ? byteSize : 1, asyncId)}; + void *p{alloc(byteSize ? byteSize : 1, asyncObject)}; if (!p) { return CFI_ERROR_MEM_ALLOCATION; } diff --git a/flang-rt/lib/runtime/extrema.cpp b/flang-rt/lib/r... [truncated] ``````````
https://github.com/llvm/llvm-project/pull/138186 From flang-commits at lists.llvm.org Thu May 1 13:43:23 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 01 May 2025 13:43:23 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix #else with trailing text (PR #138045) In-Reply-To: Message-ID: <6813dceb.170a0220.192705.0329@mx.google.com> https://github.com/akuhlens approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/138045 From flang-commits at lists.llvm.org Thu May 1 13:45:17 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 13:45:17 -0700 (PDT) Subject: [flang-commits] [flang] 42f5d71 - [flang][frontend] warn when a volatile target is pointer associated with an non-volatile pointer (#136778) Message-ID: <6813dd5d.630a0220.57d49.000d@mx.google.com> Author: Andre Kuhlenschmidt Date: 2025-05-01T13:45:13-07:00 New Revision: 42f5d716cbb8b391203eb880ac81f6272fd611f1 URL: https://github.com/llvm/llvm-project/commit/42f5d716cbb8b391203eb880ac81f6272fd611f1 DIFF: https://github.com/llvm/llvm-project/commit/42f5d716cbb8b391203eb880ac81f6272fd611f1.diff LOG: [flang][frontend] warn when a volatile target is pointer associated with an non-volatile pointer (#136778) closes #135805 Added: Modified: flang/include/flang/Support/Fortran-features.h flang/lib/Semantics/pointer-assignment.cpp flang/lib/Support/Fortran-features.cpp flang/test/Semantics/assign02.f90 flang/test/Semantics/call03.f90 Removed: ################################################################################ diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 5b22313754a0f..6cb1bcdb0003f 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -76,7 +76,7 @@ ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, MismatchingDummyProcedure, SubscriptedEmptyArray, 
UnsignedLiteralTruncation, CompatibleDeclarationsFromDistinctModules, NullActualForDefaultIntentAllocatable, UseAssociationIntoSameNameSubprogram, - HostAssociatedIntentOutInSpecExpr) + HostAssociatedIntentOutInSpecExpr, NonVolatilePointerToVolatile) using LanguageFeatures = EnumSet; using UsageWarnings = EnumSet; diff --git a/flang/lib/Semantics/pointer-assignment.cpp b/flang/lib/Semantics/pointer-assignment.cpp index ab3771c808761..36c9c5b845706 100644 --- a/flang/lib/Semantics/pointer-assignment.cpp +++ b/flang/lib/Semantics/pointer-assignment.cpp @@ -360,6 +360,20 @@ bool PointerAssignmentChecker::Check(const evaluate::Designator &d) { } else { Say(std::get(*msg)); } + } + + // Show warnings after errors + + // 8.5.20(3) A pointer should have the VOLATILE attribute if its target has + // the VOLATILE attribute + // 8.5.20(4) If an object has the VOLATILE attribute, then all of its + // subobjects also have the VOLATILE attribute. + if (!isVolatile_ && base->attrs().test(Attr::VOLATILE)) { + Warn(common::UsageWarning::NonVolatilePointerToVolatile, + "VOLATILE target associated with non-VOLATILE pointer"_warn_en_US); + } + + if (msg) { return false; } else { context_.NoteDefinedSymbol(*base); diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index b3cb62e62f5fb..49a5989849eaa 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -87,6 +87,7 @@ LanguageFeatureControl::LanguageFeatureControl() { warnUsage_.set(UsageWarning::NullActualForDefaultIntentAllocatable); warnUsage_.set(UsageWarning::UseAssociationIntoSameNameSubprogram); warnUsage_.set(UsageWarning::HostAssociatedIntentOutInSpecExpr); + warnUsage_.set(UsageWarning::NonVolatilePointerToVolatile); // New warnings, on by default warnLanguage_.set(LanguageFeature::SavedLocalInSpecExpr); warnLanguage_.set(LanguageFeature::NullActualForAllocatable); diff --git a/flang/test/Semantics/assign02.f90 
b/flang/test/Semantics/assign02.f90 index 6775506c21a3b..d83d126e2734c 100644 --- a/flang/test/Semantics/assign02.f90 +++ b/flang/test/Semantics/assign02.f90 @@ -9,6 +9,9 @@ module m1 sequence real :: t2Field end type + type t3 + type(t2) :: t3Field + end type contains ! C852 @@ -80,6 +83,7 @@ subroutine s5 real, pointer, volatile :: q p => x !ERROR: Pointer must be VOLATILE when target is a VOLATILE coarray + !ERROR: VOLATILE target associated with non-VOLATILE pointer p => y !ERROR: Pointer may not be VOLATILE when target is a non-VOLATILE coarray q => x @@ -165,6 +169,36 @@ subroutine s11 ca[1]%p => x end + subroutine s12 + real, volatile, target :: x + real, pointer :: p + real, pointer, volatile :: q + !ERROR: VOLATILE target associated with non-VOLATILE pointer + p => x + q => x + end + + subroutine s13 + type(t3), target, volatile :: y = t3(t2(4.4)) + real, pointer :: p1 + type(t2), pointer :: p2 + type(t3), pointer :: p3 + real, pointer, volatile :: q1 + type(t2), pointer, volatile :: q2 + type(t3), pointer, volatile :: q3 + !ERROR: VOLATILE target associated with non-VOLATILE pointer + p1 => y%t3Field%t2Field + !ERROR: VOLATILE target associated with non-VOLATILE pointer + p2 => y%t3Field + !ERROR: VOLATILE target associated with non-VOLATILE pointer + p3 => y + !OK: + q1 => y%t3Field%t2Field + !OK: + q2 => y%t3Field + !OK: + q3 => y + end end module m2 diff --git a/flang/test/Semantics/call03.f90 b/flang/test/Semantics/call03.f90 index 8f1be1ebff4eb..59513557324e5 100644 --- a/flang/test/Semantics/call03.f90 +++ b/flang/test/Semantics/call03.f90 @@ -386,7 +386,9 @@ subroutine test16() ! C1540 call contiguous(a) ! ok call pointer(a) ! ok call pointer(b) ! ok + !ERROR: VOLATILE target associated with non-VOLATILE pointer call pointer(c) ! ok + !ERROR: VOLATILE target associated with non-VOLATILE pointer call pointer(d) ! ok call valueassumedsize(a) ! ok call valueassumedsize(b) ! 
ok From flang-commits at lists.llvm.org Thu May 1 13:45:19 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 01 May 2025 13:45:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang][frontend] warn when a volatile target is pointer associated with an non-volatile pointer (PR #136778) In-Reply-To: Message-ID: <6813dd5f.170a0220.34a58a.02e2@mx.google.com> https://github.com/akuhlens closed https://github.com/llvm/llvm-project/pull/136778 From flang-commits at lists.llvm.org Thu May 1 13:50:32 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 01 May 2025 13:50:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix scoping of cray pointer declarations and add check for initialization (PR #136776) In-Reply-To: Message-ID: <6813de98.170a0220.172698.79d6@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/136776 >From 15f0033cb1f0376c59b30f4ed44dd03f35578acf Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Tue, 22 Apr 2025 14:59:37 -0700 Subject: [PATCH 1/3] initial commit --- flang/lib/Semantics/check-declarations.cpp | 12 +++++++++++- flang/lib/Semantics/resolve-names.cpp | 2 +- flang/lib/Semantics/semantics.cpp | 1 + flang/test/Lower/OpenMP/cray-pointers01.f90 | 2 +- flang/test/Semantics/declarations08.f90 | 6 ++++++ 5 files changed, 20 insertions(+), 3 deletions(-) diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 8d5e034f8624b..6cc4665532385 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -963,7 +963,17 @@ void CheckHelper::CheckObjectEntity( "'%s' is a data object and may not be EXTERNAL"_err_en_US, symbol.name()); } - + if (symbol.test(Symbol::Flag::CrayPointee)) { + // NB, IsSaved was too smart here. 
+ if (details.init()) { + messages_.Say( + "Cray pointee '%s' may not be initialized"_err_en_US, symbol.name()); + } else if (symbol.attrs().test(Attr::SAVE) || + symbol.implicitAttrs().test(Attr::SAVE)) { + messages_.Say( + "Cray pointee '%s' may not be SAVE"_err_en_US, symbol.name()); + } + } if (derived) { bool isUnsavedLocal{ isLocalVariable && !IsAllocatable(symbol) && !IsSaved(symbol)}; diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..e0550b3724bef 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6650,7 +6650,7 @@ bool DeclarationVisitor::Pre(const parser::BasedPointer &) { void DeclarationVisitor::Post(const parser::BasedPointer &bp) { const parser::ObjectName &pointerName{std::get<0>(bp.t)}; - auto *pointer{FindSymbol(pointerName)}; + auto *pointer{FindInScope(pointerName)}; if (!pointer) { pointer = &MakeSymbol(pointerName, ObjectEntityDetails{}); } else if (!ConvertToObjectEntity(*pointer)) { diff --git a/flang/lib/Semantics/semantics.cpp b/flang/lib/Semantics/semantics.cpp index 10a01039ea0ae..e07054f8ec564 100644 --- a/flang/lib/Semantics/semantics.cpp +++ b/flang/lib/Semantics/semantics.cpp @@ -731,6 +731,7 @@ void DoDumpSymbols(llvm::raw_ostream &os, const Scope &scope, int indent) { for (const auto &[pointee, pointer] : scope.crayPointers()) { os << " (" << pointer->name() << ',' << pointee << ')'; } + os << '\n'; } for (const auto &pair : scope.commonBlocks()) { const auto &symbol{*pair.second}; diff --git a/flang/test/Lower/OpenMP/cray-pointers01.f90 b/flang/test/Lower/OpenMP/cray-pointers01.f90 index 87692ccbadfe3..d3a5a3cdd39a3 100644 --- a/flang/test/Lower/OpenMP/cray-pointers01.f90 +++ b/flang/test/Lower/OpenMP/cray-pointers01.f90 @@ -33,7 +33,7 @@ subroutine set_cray_pointer end module program test_cray_pointers_01 - real*8, save :: var(*) + real*8 :: var(*) ! CHECK: %[[BOX_ALLOCA:.*]] = fir.alloca !fir.box>> ! 
CHECK: %[[IVAR_ALLOCA:.*]] = fir.alloca i64 {bindc_name = "ivar", uniq_name = "_QFEivar"} ! CHECK: %[[IVAR_DECL_01:.*]]:2 = hlfir.declare %[[IVAR_ALLOCA]] {uniq_name = "_QFEivar"} : (!fir.ref) -> (!fir.ref, !fir.ref) diff --git a/flang/test/Semantics/declarations08.f90 b/flang/test/Semantics/declarations08.f90 index bd14131b33c28..140ff710228c1 100644 --- a/flang/test/Semantics/declarations08.f90 +++ b/flang/test/Semantics/declarations08.f90 @@ -5,4 +5,10 @@ !ERROR: Cray pointee 'x' may not be a member of a COMMON block common x equivalence(y,z) +!ERROR: Cray pointee 'v' may not be initialized +real :: v = 42.0 +pointer(p,v) +!ERROR: Cray pointee 'u' may not be SAVE +save u +pointer(p, u) end >From ece1e456b6a5d8da92dd20a50c04d4b03a01e519 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 24 Apr 2025 14:45:29 -0700 Subject: [PATCH 2/3] address feedback --- flang/lib/Semantics/check-declarations.cpp | 6 +++--- flang/test/Semantics/declarations08.f90 | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 6cc4665532385..1c865f37a4a35 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -968,10 +968,10 @@ void CheckHelper::CheckObjectEntity( if (details.init()) { messages_.Say( "Cray pointee '%s' may not be initialized"_err_en_US, symbol.name()); - } else if (symbol.attrs().test(Attr::SAVE) || - symbol.implicitAttrs().test(Attr::SAVE)) { + } else if (symbol.attrs().test(Attr::SAVE)) { messages_.Say( - "Cray pointee '%s' may not be SAVE"_err_en_US, symbol.name()); + "Cray pointee '%s' may not have the SAVE attribute"_err_en_US, + symbol.name()); } } if (derived) { diff --git a/flang/test/Semantics/declarations08.f90 b/flang/test/Semantics/declarations08.f90 index 140ff710228c1..2c4027d117365 100644 --- a/flang/test/Semantics/declarations08.f90 +++ b/flang/test/Semantics/declarations08.f90 @@ -8,7 
+8,7 @@ !ERROR: Cray pointee 'v' may not be initialized real :: v = 42.0 pointer(p,v) -!ERROR: Cray pointee 'u' may not be SAVE +!ERROR: Cray pointee 'u' may not have the SAVE attribute save u pointer(p, u) end >From 75abd17d037541a13093a4297f3addeac4ce21f9 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Mon, 28 Apr 2025 17:06:15 -0700 Subject: [PATCH 3/3] add a test for resolution --- flang/lib/Semantics/check-declarations.cpp | 3 +- flang/test/Semantics/resolve125.f90 | 64 ++++++++++++++++++++++ 2 files changed, 66 insertions(+), 1 deletion(-) create mode 100644 flang/test/Semantics/resolve125.f90 diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 1c865f37a4a35..318085518cc57 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -968,7 +968,8 @@ void CheckHelper::CheckObjectEntity( if (details.init()) { messages_.Say( "Cray pointee '%s' may not be initialized"_err_en_US, symbol.name()); - } else if (symbol.attrs().test(Attr::SAVE)) { + } + if (symbol.attrs().test(Attr::SAVE)) { messages_.Say( "Cray pointee '%s' may not have the SAVE attribute"_err_en_US, symbol.name()); diff --git a/flang/test/Semantics/resolve125.f90 b/flang/test/Semantics/resolve125.f90 new file mode 100644 index 0000000000000..e040c006ec179 --- /dev/null +++ b/flang/test/Semantics/resolve125.f90 @@ -0,0 +1,64 @@ +! 
RUN: %flang_fc1 -fdebug-dump-symbols %s 2>&1 | FileCheck %s + +!CHECK: Module scope: m1 +!CHECK: i, PUBLIC size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: REAL({{[0-9]+}}) init:{{.+}} +!CHECK: init, PUBLIC (Subroutine): Subprogram () +!CHECK: o, PUBLIC (CrayPointee) size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: REAL({{[0-9]+}}) +!CHECK: ptr, PUBLIC (CrayPointer) size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: INTEGER({{[0-9]+}}) +module m1 + implicit none + real:: o + real:: i = 42.0 + pointer (ptr, o) +contains + !CHECK: Subprogram scope: init + subroutine init + implicit none + ptr=loc(i) + print *, "init : o= ", o + end subroutine init +end module m1 + +!CHECK: Module scope: m2 +!CHECK: i, PUBLIC: Use from i in m1 +!CHECK: i2, PUBLIC size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: REAL({{[0-9]+}}) init:{{.+}} +!CHECK: init, PUBLIC (Subroutine): Use from init in m1 +!CHECK: o, PUBLIC (CrayPointee): Use from o in m1 +!CHECK: ptr, PUBLIC (CrayPointer): Use from ptr in m1 +!CHECK: reset, PUBLIC (Subroutine): Subprogram () +module m2 + use m1 + implicit none + real:: i2 = 777.0 +contains + !CHECK: Subprogram scope: reset + !CHECK: o2 (CrayPointee) size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: REAL({{[0-9]+}}) + !CHECK: ptr (CrayPointer) size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: INTEGER({{[0-9]+}}) + subroutine reset + real::o2 + pointer (ptr, o2) + ptr=loc(i2) + print *, "reset : o= ", o, " o2 = ", o2 + o2 = 666.0 + end subroutine reset +end module m2 + +!CHECK: MainProgram scope: main +!CHECK: i: Use from i in m2 +!CHECK: i2: Use from i2 in m2 +!CHECK: init (Subroutine): Use from init in m2 +!CHECK: o (CrayPointee): Use from o in m2 +!CHECK: ptr (CrayPointer): Use from ptr in m2 +!CHECK: reset (Subroutine): Use from reset in m2 +program main + use m2 + implicit none + call init + call reset + write(6,*) "main : o = ", o + if (o == 42.0) then + print *, "pass" + else + print *, "fail" + end if +end program main \ No 
newline at end of file From flang-commits at lists.llvm.org Thu May 1 14:17:49 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Thu, 01 May 2025 14:17:49 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][cuda] Use a reference for asyncObject (PR #138186) In-Reply-To: Message-ID: <6813e4fd.050a0220.cd703.091d@mx.google.com> ================ @@ -82,19 +83,22 @@ TEST(AllocatableCUFTest, StreamDeviceAllocatable) { RTNAME(AllocatableSetBounds)(*c, 0, 1, 100); RTNAME(AllocatableAllocate) - (*a, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(a->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); RTNAME(AllocatableAllocate) - (*b, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*b, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(b->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); RTNAME(AllocatableAllocate) - (*c, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*c, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, ---------------- wangzpgi wrote: Question: `asyncId=-1` translates to `asyncObject=nullptr`. Here `asyncId=1` also translates to `asyncObject=nullptr`? 
https://github.com/llvm/llvm-project/pull/138186 From flang-commits at lists.llvm.org Thu May 1 15:43:35 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 15:43:35 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) Message-ID: https://github.com/agozillon created https://github.com/llvm/llvm-project/pull/138210 Currently, we do not generate the checks needed to test whether an optional allocatable argument is present before accessing its components. In particular, when creating bounds we must generate a presence check, and we must make sure we do not generate or keep a load outside the presence check; we do this by using the raw address rather than the regular address of the info data structure. Similarly, optional allocatables must be treated like non-allocatable arguments: we generate an intermediate allocation that gives us a location in memory we can access later in the lowering without causing segfaults when we perform "mapping" on it, even if the end result is an empty allocatable (basically, we shouldn't explode if someone tries to map a non-present optional, similar to mapping null data in C++). >From 82387ec13258f67a530ddb615a49e0f36e8575e1 Mon Sep 17 00:00:00 2001 From: agozillon Date: Thu, 1 May 2025 17:40:12 -0500 Subject: [PATCH] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables Currently, we do not generate the appropriate checks to check if an optional allocatable argument is present before accessing relevant components of it, in particular when creating bounds, we must generate a presence check and we must make sure we do not generate/keep a load external to the presence check by utilising the raw address rather than the regular address of the info data structure. 
Similarly in cases for optional allocatables we must treat them like non-allocatable arguments and generate an intermediate allocation that we can have as a location in memory that we can access later in the lowering without causing segfaults when we perform "mapping" on it, even if the end result is an empty allocatable (basically, we shouldn't explode if someone tries to map a non-present optional, similar to C++ when mapping null data). --- .../Optimizer/Builder/DirectivesCommon.h | 25 +++++++- .../Optimizer/OpenMP/MapInfoFinalization.cpp | 16 ++++-- .../Lower/OpenMP/optional-argument-map-2.f90 | 46 +++++++++++++++ .../fortran/optional-mapped-arguments-2.f90 | 57 +++++++++++++++++++ 4 files changed, 137 insertions(+), 7 deletions(-) create mode 100644 flang/test/Lower/OpenMP/optional-argument-map-2.f90 create mode 100644 offload/test/offloading/fortran/optional-mapped-arguments-2.f90 diff --git a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h index 8684299ab6792..e655c8e592364 100644 --- a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h +++ b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h @@ -156,9 +156,9 @@ genBoundsOpsFromBox(fir::FirOpBuilder &builder, mlir::Location loc, builder.genIfOp(loc, resTypes, info.isPresent, /*withElseRegion=*/true) .genThen([&]() { mlir::Value box = - !fir::isBoxAddress(info.addr.getType()) + !fir::isBoxAddress(info.rawInput.getType()) ? info.addr - : builder.create(loc, info.addr); + : builder.create(loc, info.rawInput); llvm::SmallVector boundValues = gatherBoundsOrBoundValues( builder, loc, dataExv, box, @@ -243,6 +243,17 @@ genBaseBoundsOps(fir::FirOpBuilder &builder, mlir::Location loc, return bounds; } +/// Checks if an argument is optional based on the fortran attributes +/// that are tied to it. 
+inline bool isOptionalArgument(mlir::Operation *op) { + if (auto declareOp = mlir::dyn_cast_or_null(op)) + if (declareOp.getFortranAttrs() && + bitEnumContainsAny(*declareOp.getFortranAttrs(), + fir::FortranVariableFlagsEnum::optional)) + return true; + return false; +} + template llvm::SmallVector genImplicitBoundsOps(fir::FirOpBuilder &builder, AddrAndBoundsInfo &info, @@ -251,9 +262,17 @@ genImplicitBoundsOps(fir::FirOpBuilder &builder, AddrAndBoundsInfo &info, llvm::SmallVector bounds; mlir::Value baseOp = info.rawInput; - if (mlir::isa(fir::unwrapRefType(baseOp.getType()))) + if (mlir::isa(fir::unwrapRefType(baseOp.getType()))) { + // if it's an optional argument, it is possible it is not present, in which + // case, emitting loads or stores to access bounds data will result in a + // runtime segfault, so we must emit guards against this. + if (!info.isPresent && isOptionalArgument(info.rawInput.getDefiningOp())) { + info.isPresent = builder.create( + loc, builder.getI1Type(), info.rawInput); + } bounds = genBoundsOpsFromBox(builder, loc, dataExv, info); + } if (mlir::isa(fir::unwrapRefType(baseOp.getType()))) { bounds = genBaseBoundsOps(builder, loc, dataExv, dataExvIsAssumedSize); diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp index 3fcb4b04a7b76..05d17bf71514b 100644 --- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp +++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp @@ -131,7 +131,8 @@ class MapInfoFinalizationPass boxMap.getVarPtr().getDefiningOp())) descriptor = addrOp.getVal(); - if (!mlir::isa(descriptor.getType())) + if (!mlir::isa(descriptor.getType()) && + !fir::factory::isOptionalArgument(descriptor.getDefiningOp())) return descriptor; mlir::Value &slot = localBoxAllocas[descriptor.getDefiningOp()]; @@ -151,7 +152,12 @@ class MapInfoFinalizationPass mlir::Location loc = boxMap->getLoc(); assert(allocaBlock && "No alloca block found for this top level op"); 
builder.setInsertionPointToStart(allocaBlock); - auto alloca = builder.create(loc, descriptor.getType()); + + mlir::Type allocaType = descriptor.getType(); + if (fir::isTypeWithDescriptor(allocaType) && + !mlir::isa(descriptor.getType())) + allocaType = fir::unwrapRefType(allocaType); + auto alloca = builder.create(loc, allocaType); builder.restoreInsertionPoint(insPt); // We should only emit a store if the passed in data is present, it is // possible a user passes in no argument to an optional parameter, in which @@ -159,8 +165,10 @@ class MapInfoFinalizationPass auto isPresent = builder.create(loc, builder.getI1Type(), descriptor); builder.genIfOp(loc, {}, isPresent, false) - .genThen( - [&]() { builder.create(loc, descriptor, alloca); }) + .genThen([&]() { + descriptor = builder.loadIfRef(loc, descriptor); + builder.create(loc, descriptor, alloca); + }) .end(); return slot = alloca; } diff --git a/flang/test/Lower/OpenMP/optional-argument-map-2.f90 b/flang/test/Lower/OpenMP/optional-argument-map-2.f90 new file mode 100644 index 0000000000000..eb89b18063f64 --- /dev/null +++ b/flang/test/Lower/OpenMP/optional-argument-map-2.f90 @@ -0,0 +1,46 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +module mod + implicit none +contains + subroutine routine(a) + implicit none + real(4), allocatable, optional, intent(inout) :: a(:) + integer(4) :: i + + !$omp target teams distribute parallel do shared(a) + do i=1,10 + a(i) = i + a(i) + end do + + end subroutine routine +end module mod + +! CHECK-LABEL: func.func @_QMmodProutine( +! CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>> {fir.bindc_name = "a", fir.optional}) { +! CHECK: %[[VAL_0:.*]] = fir.alloca !fir.box>> +! CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[ARG0]] dummy_scope %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QMmodFroutineEa"} : (!fir.ref>>>, !fir.dscope) -> (!fir.ref>>>, !fir.ref>>>) +! 
CHECK: %[[VAL_8:.*]] = fir.is_present %[[VAL_2]]#1 : (!fir.ref>>>) -> i1 +! CHECK: %[[VAL_9:.*]]:5 = fir.if %[[VAL_8]] -> (index, index, index, index, index) { +! CHECK: %[[VAL_10:.*]] = fir.load %[[VAL_2]]#1 : !fir.ref>>> +! CHECK: %[[VAL_11:.*]] = arith.constant 1 : index +! CHECK: %[[VAL_12:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_2]]#0 : !fir.ref>>> +! CHECK: %[[VAL_14:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_15:.*]]:3 = fir.box_dims %[[VAL_13]], %[[VAL_14]] : (!fir.box>>, index) -> (index, index, index) +! CHECK: %[[VAL_16:.*]]:3 = fir.box_dims %[[VAL_10]], %[[VAL_12]] : (!fir.box>>, index) -> (index, index, index) +! CHECK: %[[VAL_17:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_18:.*]] = arith.subi %[[VAL_16]]#1, %[[VAL_11]] : index +! CHECK: fir.result %[[VAL_17]], %[[VAL_18]], %[[VAL_16]]#1, %[[VAL_16]]#2, %[[VAL_15]]#0 : index, index, index, index, index +! CHECK: } else { +! CHECK: %[[VAL_19:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_20:.*]] = arith.constant -1 : index +! CHECK: fir.result %[[VAL_19]], %[[VAL_20]], %[[VAL_19]], %[[VAL_19]], %[[VAL_19]] : index, index, index, index, index +! CHECK: } +! CHECK: %[[VAL_21:.*]] = omp.map.bounds lower_bound(%[[VAL_22:.*]]#0 : index) upper_bound(%[[VAL_22]]#1 : index) extent(%[[VAL_22]]#2 : index) stride(%[[VAL_22]]#3 : index) start_idx(%[[VAL_22]]#4 : index) {stride_in_bytes = true} +! CHECK: %[[VAL_23:.*]] = fir.is_present %[[VAL_2]]#1 : (!fir.ref>>>) -> i1 +! CHECK: fir.if %[[VAL_23]] { +! CHECK: %[[VAL_24:.*]] = fir.load %[[VAL_2]]#1 : !fir.ref>>> +! CHECK: fir.store %[[VAL_24]] to %[[VAL_0]] : !fir.ref>>> +! CHECK: } diff --git a/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 b/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 new file mode 100644 index 0000000000000..0de6b7730d3a0 --- /dev/null +++ b/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 @@ -0,0 +1,57 @@ +! 
OpenMP offloading regression test that checks we do not cause a segfault when +! implicitly mapping a not present optional allocatable function argument and +! utilise it in the target region. No results requiring checking other than +! that the program compiles and runs to completion with no error. +! REQUIRES: flang, amdgpu + +! RUN: %libomptarget-compile-fortran-run-and-check-generic +module mod + implicit none +contains + subroutine routine(a, b) + implicit none + real(4), allocatable, optional, intent(in) :: a(:) + real(4), intent(out) :: b(:) + integer(4) :: i, ia + if(present(a)) then + ia = 1 + write(*,*) "a is present" + else + ia=0 + write(*,*) "a is not present" + end if + + !$omp target teams distribute parallel do shared(a,b,ia) + do i=1,10 + if (ia>0) then + b(i) = b(i) + a(i) + end if + end do + + end subroutine routine + +end module mod + +program main + use mod + implicit none + real(4), allocatable :: a(:) + real(4), allocatable :: b(:) + integer(4) :: i + allocate(b(10)) + do i=1,10 + b(i)=0 + end do + !$omp target data map(from: b) + + call routine(b=b) + + !$omp end target data + + deallocate(b) + + print *, "success, no segmentation fault" +end program main + +!CHECK: a is not present +!CHECK: success, no segmentation fault From flang-commits at lists.llvm.org Thu May 1 15:43:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 15:43:50 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) In-Reply-To: Message-ID: <6813f926.170a0220.139d.0914@mx.google.com> https://github.com/agozillon edited https://github.com/llvm/llvm-project/pull/138210 From flang-commits at lists.llvm.org Thu May 1 15:44:11 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 15:44:11 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of 
optional allocatables (PR #138210) In-Reply-To: Message-ID: <6813f93b.170a0220.2f9b99.07df@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: None (agozillon)
Changes

Currently, we do not generate the appropriate checks for whether an optional allocatable argument is present before accessing its components. In particular, when creating bounds we must generate a presence check, and we must make sure we do not generate or keep a load outside that presence check; this is done by utilising the raw address rather than the regular address of the info data structure. Similarly, we must treat optional allocatables like non-allocatable arguments and generate an intermediate allocation, giving us a location in memory that we can access later in the lowering without causing segfaults when we perform "mapping" on it, even if the end result is an empty allocatable. Basically, we shouldn't explode if someone tries to map a non-present optional, similar to C++ when mapping null data.

--- Full diff: https://github.com/llvm/llvm-project/pull/138210.diff 4 Files Affected: - (modified) flang/include/flang/Optimizer/Builder/DirectivesCommon.h (+22-3) - (modified) flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp (+12-4) - (added) flang/test/Lower/OpenMP/optional-argument-map-2.f90 (+46) - (added) offload/test/offloading/fortran/optional-mapped-arguments-2.f90 (+57) ``````````diff diff --git a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h index 8684299ab6792..e655c8e592364 100644 --- a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h +++ b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h @@ -156,9 +156,9 @@ genBoundsOpsFromBox(fir::FirOpBuilder &builder, mlir::Location loc, builder.genIfOp(loc, resTypes, info.isPresent, /*withElseRegion=*/true) .genThen([&]() { mlir::Value box = - !fir::isBoxAddress(info.addr.getType()) + !fir::isBoxAddress(info.rawInput.getType()) ?
info.addr - : builder.create(loc, info.addr); + : builder.create(loc, info.rawInput); llvm::SmallVector boundValues = gatherBoundsOrBoundValues( builder, loc, dataExv, box, @@ -243,6 +243,17 @@ genBaseBoundsOps(fir::FirOpBuilder &builder, mlir::Location loc, return bounds; } +/// Checks if an argument is optional based on the fortran attributes +/// that are tied to it. +inline bool isOptionalArgument(mlir::Operation *op) { + if (auto declareOp = mlir::dyn_cast_or_null(op)) + if (declareOp.getFortranAttrs() && + bitEnumContainsAny(*declareOp.getFortranAttrs(), + fir::FortranVariableFlagsEnum::optional)) + return true; + return false; +} + template llvm::SmallVector genImplicitBoundsOps(fir::FirOpBuilder &builder, AddrAndBoundsInfo &info, @@ -251,9 +262,17 @@ genImplicitBoundsOps(fir::FirOpBuilder &builder, AddrAndBoundsInfo &info, llvm::SmallVector bounds; mlir::Value baseOp = info.rawInput; - if (mlir::isa(fir::unwrapRefType(baseOp.getType()))) + if (mlir::isa(fir::unwrapRefType(baseOp.getType()))) { + // if it's an optional argument, it is possible it is not present, in which + // case, emitting loads or stores to access bounds data will result in a + // runtime segfault, so we must emit guards against this. 
+ if (!info.isPresent && isOptionalArgument(info.rawInput.getDefiningOp())) { + info.isPresent = builder.create( + loc, builder.getI1Type(), info.rawInput); + } bounds = genBoundsOpsFromBox(builder, loc, dataExv, info); + } if (mlir::isa(fir::unwrapRefType(baseOp.getType()))) { bounds = genBaseBoundsOps(builder, loc, dataExv, dataExvIsAssumedSize); diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp index 3fcb4b04a7b76..05d17bf71514b 100644 --- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp +++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp @@ -131,7 +131,8 @@ class MapInfoFinalizationPass boxMap.getVarPtr().getDefiningOp())) descriptor = addrOp.getVal(); - if (!mlir::isa(descriptor.getType())) + if (!mlir::isa(descriptor.getType()) && + !fir::factory::isOptionalArgument(descriptor.getDefiningOp())) return descriptor; mlir::Value &slot = localBoxAllocas[descriptor.getDefiningOp()]; @@ -151,7 +152,12 @@ class MapInfoFinalizationPass mlir::Location loc = boxMap->getLoc(); assert(allocaBlock && "No alloca block found for this top level op"); builder.setInsertionPointToStart(allocaBlock); - auto alloca = builder.create(loc, descriptor.getType()); + + mlir::Type allocaType = descriptor.getType(); + if (fir::isTypeWithDescriptor(allocaType) && + !mlir::isa(descriptor.getType())) + allocaType = fir::unwrapRefType(allocaType); + auto alloca = builder.create(loc, allocaType); builder.restoreInsertionPoint(insPt); // We should only emit a store if the passed in data is present, it is // possible a user passes in no argument to an optional parameter, in which @@ -159,8 +165,10 @@ class MapInfoFinalizationPass auto isPresent = builder.create(loc, builder.getI1Type(), descriptor); builder.genIfOp(loc, {}, isPresent, false) - .genThen( - [&]() { builder.create(loc, descriptor, alloca); }) + .genThen([&]() { + descriptor = builder.loadIfRef(loc, descriptor); + builder.create(loc, descriptor, alloca); + 
}) .end(); return slot = alloca; } diff --git a/flang/test/Lower/OpenMP/optional-argument-map-2.f90 b/flang/test/Lower/OpenMP/optional-argument-map-2.f90 new file mode 100644 index 0000000000000..eb89b18063f64 --- /dev/null +++ b/flang/test/Lower/OpenMP/optional-argument-map-2.f90 @@ -0,0 +1,46 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +module mod + implicit none +contains + subroutine routine(a) + implicit none + real(4), allocatable, optional, intent(inout) :: a(:) + integer(4) :: i + + !$omp target teams distribute parallel do shared(a) + do i=1,10 + a(i) = i + a(i) + end do + + end subroutine routine +end module mod + +! CHECK-LABEL: func.func @_QMmodProutine( +! CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>> {fir.bindc_name = "a", fir.optional}) { +! CHECK: %[[VAL_0:.*]] = fir.alloca !fir.box>> +! CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[ARG0]] dummy_scope %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QMmodFroutineEa"} : (!fir.ref>>>, !fir.dscope) -> (!fir.ref>>>, !fir.ref>>>) +! CHECK: %[[VAL_8:.*]] = fir.is_present %[[VAL_2]]#1 : (!fir.ref>>>) -> i1 +! CHECK: %[[VAL_9:.*]]:5 = fir.if %[[VAL_8]] -> (index, index, index, index, index) { +! CHECK: %[[VAL_10:.*]] = fir.load %[[VAL_2]]#1 : !fir.ref>>> +! CHECK: %[[VAL_11:.*]] = arith.constant 1 : index +! CHECK: %[[VAL_12:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_2]]#0 : !fir.ref>>> +! CHECK: %[[VAL_14:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_15:.*]]:3 = fir.box_dims %[[VAL_13]], %[[VAL_14]] : (!fir.box>>, index) -> (index, index, index) +! CHECK: %[[VAL_16:.*]]:3 = fir.box_dims %[[VAL_10]], %[[VAL_12]] : (!fir.box>>, index) -> (index, index, index) +! CHECK: %[[VAL_17:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_18:.*]] = arith.subi %[[VAL_16]]#1, %[[VAL_11]] : index +! 
CHECK: fir.result %[[VAL_17]], %[[VAL_18]], %[[VAL_16]]#1, %[[VAL_16]]#2, %[[VAL_15]]#0 : index, index, index, index, index +! CHECK: } else { +! CHECK: %[[VAL_19:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_20:.*]] = arith.constant -1 : index +! CHECK: fir.result %[[VAL_19]], %[[VAL_20]], %[[VAL_19]], %[[VAL_19]], %[[VAL_19]] : index, index, index, index, index +! CHECK: } +! CHECK: %[[VAL_21:.*]] = omp.map.bounds lower_bound(%[[VAL_22:.*]]#0 : index) upper_bound(%[[VAL_22]]#1 : index) extent(%[[VAL_22]]#2 : index) stride(%[[VAL_22]]#3 : index) start_idx(%[[VAL_22]]#4 : index) {stride_in_bytes = true} +! CHECK: %[[VAL_23:.*]] = fir.is_present %[[VAL_2]]#1 : (!fir.ref>>>) -> i1 +! CHECK: fir.if %[[VAL_23]] { +! CHECK: %[[VAL_24:.*]] = fir.load %[[VAL_2]]#1 : !fir.ref>>> +! CHECK: fir.store %[[VAL_24]] to %[[VAL_0]] : !fir.ref>>> +! CHECK: } diff --git a/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 b/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 new file mode 100644 index 0000000000000..0de6b7730d3a0 --- /dev/null +++ b/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 @@ -0,0 +1,57 @@ +! OpenMP offloading regression test that checks we do not cause a segfault when +! implicitly mapping a not present optional allocatable function argument and +! utilise it in the target region. No results requiring checking other than +! that the program compiles and runs to completion with no error. +! REQUIRES: flang, amdgpu + +! 
RUN: %libomptarget-compile-fortran-run-and-check-generic +module mod + implicit none +contains + subroutine routine(a, b) + implicit none + real(4), allocatable, optional, intent(in) :: a(:) + real(4), intent(out) :: b(:) + integer(4) :: i, ia + if(present(a)) then + ia = 1 + write(*,*) "a is present" + else + ia=0 + write(*,*) "a is not present" + end if + + !$omp target teams distribute parallel do shared(a,b,ia) + do i=1,10 + if (ia>0) then + b(i) = b(i) + a(i) + end if + end do + + end subroutine routine + +end module mod + +program main + use mod + implicit none + real(4), allocatable :: a(:) + real(4), allocatable :: b(:) + integer(4) :: i + allocate(b(10)) + do i=1,10 + b(i)=0 + end do + !$omp target data map(from: b) + + call routine(b=b) + + !$omp end target data + + deallocate(b) + + print *, "success, no segmentation fault" +end program main + +!CHECK: a is not present +!CHECK: success, no segmentation fault ``````````
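
The guard described in the Changes text above can be pictured with a small standalone model. This is only a sketch under stated assumptions, not the Flang implementation: `Descriptor`, `Bounds`, `computeBounds` and `materializeSlot` are hypothetical names, a null pointer stands in for a non-present optional argument, and the zero-initialised slot is a simplification of the intermediate allocation. The five returned values follow the order the FileCheck test expects from the guarded region (lower bound, upper bound, extent, stride, start index), with `(0, -1, 0, 0, 0)` for the absent case.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a rank-1 Fortran array descriptor (not a Flang type).
struct Descriptor {
  std::int64_t lowerBound;  // Fortran lower bound of the dimension
  std::int64_t extent;      // number of elements in the dimension
  std::int64_t strideBytes; // stride between elements, in bytes
};

// lower_bound, upper_bound, extent, stride, start_idx (same order as the test).
using Bounds = std::array<std::int64_t, 5>;

// desc == nullptr models an optional argument that was not passed.
Bounds computeBounds(const Descriptor *desc) {
  if (desc == nullptr) {
    // Absent: never touch the descriptor, yield the placeholder bounds.
    return {0, -1, 0, 0, 0};
  }
  // Present: every read of the descriptor stays inside the presence check.
  return {0, desc->extent - 1, desc->extent, desc->strideBytes,
          desc->lowerBound};
}

// Models the intermediate allocation: the slot always exists, but it is only
// filled from the argument when the argument is present.
Descriptor materializeSlot(const Descriptor *desc) {
  Descriptor slot{}; // assumption: zero-initialised "empty" descriptor
  if (desc != nullptr)
    slot = *desc;
  return slot;
}
```

Calling `computeBounds(nullptr)` is safe in this model because no field is read before the presence check, which is the property the patch restores for the generated loads.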
https://github.com/llvm/llvm-project/pull/138210

From flang-commits at lists.llvm.org Thu May 1 15:44:12 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 01 May 2025 15:44:12 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210)
In-Reply-To:
Message-ID: <6813f93c.170a0220.2b5868.085c@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-fir-hlfir

Author: None (agozillon)
https://github.com/llvm/llvm-project/pull/138210

From flang-commits at lists.llvm.org Thu May 1 12:25:49 2025
From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits)
Date: Thu, 01 May 2025 12:25:49 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][cuda] Use a reference for asyncObject (PR #138186)
Message-ID:

https://github.com/clementval created https://github.com/llvm/llvm-project/pull/138186

Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation. This is a new attempt with some fixes; the previous attempt was reverted yesterday.

>From 09cf791ff58cc70a02a0364a8cfd7705c92aa831 Mon Sep 17 00:00:00 2001
From: Valentin Clement (バレンタイン クレメン)
Date: Wed, 30 Apr 2025 14:02:29 -0700
Subject: [PATCH] [flang][cuda] Use a reference for asyncObject

Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation.
--- .../flang-rt/runtime/allocator-registry.h | 4 +-- .../include/flang-rt/runtime/descriptor.h | 6 ++--- .../flang-rt/runtime/reduction-templates.h | 2 +- flang-rt/lib/cuda/allocatable.cpp | 8 +++--- flang-rt/lib/cuda/allocator.cpp | 20 +++++++------- flang-rt/lib/cuda/descriptor.cpp | 2 +- flang-rt/lib/runtime/allocatable.cpp | 12 ++++----- flang-rt/lib/runtime/array-constructor.cpp | 4 +-- flang-rt/lib/runtime/assign.cpp | 4 +-- flang-rt/lib/runtime/character.cpp | 20 +++++++------- flang-rt/lib/runtime/copy.cpp | 4 +-- flang-rt/lib/runtime/derived.cpp | 6 ++--- flang-rt/lib/runtime/descriptor.cpp | 4 +-- flang-rt/lib/runtime/extrema.cpp | 4 +-- flang-rt/lib/runtime/findloc.cpp | 2 +- flang-rt/lib/runtime/matmul-transpose.cpp | 2 +- flang-rt/lib/runtime/matmul.cpp | 2 +- flang-rt/lib/runtime/misc-intrinsic.cpp | 2 +- flang-rt/lib/runtime/pointer.cpp | 2 +- flang-rt/lib/runtime/temporary-stack.cpp | 2 +- flang-rt/lib/runtime/tools.cpp | 2 +- flang-rt/lib/runtime/transformational.cpp | 4 +-- flang-rt/unittests/Evaluate/reshape.cpp | 2 +- flang-rt/unittests/Runtime/Allocatable.cpp | 4 +-- .../unittests/Runtime/CUDA/Allocatable.cpp | 12 ++++++--- .../unittests/Runtime/CUDA/AllocatorCUF.cpp | 4 +-- flang-rt/unittests/Runtime/CUDA/Memory.cpp | 4 +-- flang-rt/unittests/Runtime/CharacterTest.cpp | 2 +- flang-rt/unittests/Runtime/CommandTest.cpp | 8 +++--- flang-rt/unittests/Runtime/TemporaryStack.cpp | 4 +-- flang-rt/unittests/Runtime/tools.h | 2 +- .../flang/Optimizer/Dialect/CUF/CUFOps.td | 11 ++++---- .../include/flang/Runtime/CUDA/allocatable.h | 8 +++--- flang/include/flang/Runtime/CUDA/allocator.h | 8 +++--- flang/include/flang/Runtime/CUDA/pointer.h | 8 +++--- flang/include/flang/Runtime/allocatable.h | 7 ++--- flang/lib/Lower/Allocatable.cpp | 2 +- .../Optimizer/Builder/Runtime/Allocatable.cpp | 7 +++-- flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp | 22 ++++++++-------- .../Optimizer/Transforms/CUFOpConversion.cpp | 10 +++---- flang/test/Fir/CUDA/cuda-allocate.fir 
| 18 ++++++------- flang/test/Fir/cuf-invalid.fir | 5 ++-- flang/test/Fir/cuf.mlir | 7 +++-- flang/test/HLFIR/elemental-codegen.fir | 6 ++--- flang/test/Lower/CUDA/cuda-allocatable.cuf | 9 +++---- .../acc-declare-unwrap-defaultbounds.f90 | 4 +-- flang/test/Lower/OpenACC/acc-declare.f90 | 4 +-- flang/test/Lower/allocatable-polymorphic.f90 | 26 +++++++++---------- flang/test/Lower/allocatable-runtime.f90 | 4 +-- flang/test/Lower/allocate-mold.f90 | 4 +-- flang/test/Lower/polymorphic.f90 | 2 +- flang/test/Transforms/lower-repack-arrays.fir | 8 +++--- 52 files changed, 169 insertions(+), 171 deletions(-) diff --git a/flang-rt/include/flang-rt/runtime/allocator-registry.h b/flang-rt/include/flang-rt/runtime/allocator-registry.h index 33e8e2c7d7850..f0ba77a360736 100644 --- a/flang-rt/include/flang-rt/runtime/allocator-registry.h +++ b/flang-rt/include/flang-rt/runtime/allocator-registry.h @@ -19,7 +19,7 @@ namespace Fortran::runtime { -using AllocFct = void *(*)(std::size_t, std::int64_t); +using AllocFct = void *(*)(std::size_t, std::int64_t *); using FreeFct = void (*)(void *); typedef struct Allocator_t { @@ -28,7 +28,7 @@ typedef struct Allocator_t { } Allocator_t; static RT_API_ATTRS void *MallocWrapper( - std::size_t size, [[maybe_unused]] std::int64_t) { + std::size_t size, [[maybe_unused]] std::int64_t *) { return std::malloc(size); } #ifdef RT_DEVICE_COMPILATION diff --git a/flang-rt/include/flang-rt/runtime/descriptor.h b/flang-rt/include/flang-rt/runtime/descriptor.h index 9907e7866e7bf..c98e6b14850cb 100644 --- a/flang-rt/include/flang-rt/runtime/descriptor.h +++ b/flang-rt/include/flang-rt/runtime/descriptor.h @@ -29,8 +29,8 @@ #include #include -/// Value used for asyncId when no specific stream is specified. -static constexpr std::int64_t kNoAsyncId = -1; +/// Value used for asyncObject when no specific stream is specified. 
+static constexpr std::int64_t *kNoAsyncObject = nullptr; namespace Fortran::runtime { @@ -372,7 +372,7 @@ class Descriptor { // before calling. It (re)computes the byte strides after // allocation. Does not allocate automatic components or // perform default component initialization. - RT_API_ATTRS int Allocate(std::int64_t asyncId); + RT_API_ATTRS int Allocate(std::int64_t *asyncObject); RT_API_ATTRS void SetByteStrides(); // Deallocates storage; does not call FINAL subroutines or diff --git a/flang-rt/include/flang-rt/runtime/reduction-templates.h b/flang-rt/include/flang-rt/runtime/reduction-templates.h index 77f77a592a476..18412708b02c5 100644 --- a/flang-rt/include/flang-rt/runtime/reduction-templates.h +++ b/flang-rt/include/flang-rt/runtime/reduction-templates.h @@ -347,7 +347,7 @@ inline RT_API_ATTRS void DoMaxMinNorm2(Descriptor &result, const Descriptor &x, // as the element size of the source. result.Establish(x.type(), x.ElementBytes(), nullptr, 0, nullptr, CFI_attribute_allocatable); - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/cuda/allocatable.cpp b/flang-rt/lib/cuda/allocatable.cpp index 432974d18a3e3..c77819e9440d7 100644 --- a/flang-rt/lib/cuda/allocatable.cpp +++ b/flang-rt/lib/cuda/allocatable.cpp @@ -23,7 +23,7 @@ namespace Fortran::runtime::cuda { extern "C" { RT_EXT_API_GROUP_BEGIN -int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( @@ -41,7 +41,7 @@ int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, return stat; } -int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocate)(Descriptor 
&desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { if (desc.HasAddendum()) { @@ -63,7 +63,7 @@ int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, } int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; @@ -76,7 +76,7 @@ int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, } int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocateSync)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; diff --git a/flang-rt/lib/cuda/allocator.cpp b/flang-rt/lib/cuda/allocator.cpp index 51119ab251168..f4289c55bd8de 100644 --- a/flang-rt/lib/cuda/allocator.cpp +++ b/flang-rt/lib/cuda/allocator.cpp @@ -98,7 +98,7 @@ static unsigned findAllocation(void *ptr) { return allocNotFound; } -static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { +static void insertAllocation(void *ptr, std::size_t size, cudaStream_t stream) { CriticalSection critical{lock}; initAllocations(); if (numDeviceAllocations >= maxDeviceAllocations) { @@ -106,7 +106,7 @@ static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { } deviceAllocations[numDeviceAllocations].ptr = ptr; deviceAllocations[numDeviceAllocations].size = size; - deviceAllocations[numDeviceAllocations].stream = (cudaStream_t)stream; + deviceAllocations[numDeviceAllocations].stream = stream; ++numDeviceAllocations; 
qsort(deviceAllocations, numDeviceAllocations, sizeof(DeviceAllocation), compareDeviceAlloc); @@ -136,7 +136,7 @@ void RTDEF(CUFRegisterAllocator)() { } void *CUFAllocPinned( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR(cudaMallocHost((void **)&p, sizeInBytes)); return p; @@ -144,18 +144,18 @@ void *CUFAllocPinned( void CUFFreePinned(void *p) { CUDA_REPORT_IF_ERROR(cudaFreeHost(p)); } -void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t asyncId) { +void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t *asyncObject) { void *p; if (Fortran::runtime::executionEnvironment.cudaDeviceIsManaged) { CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); } else { - if (asyncId == kNoAsyncId) { + if (asyncObject == kNoAsyncObject) { CUDA_REPORT_IF_ERROR(cudaMalloc(&p, sizeInBytes)); } else { CUDA_REPORT_IF_ERROR( - cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)asyncId)); - insertAllocation(p, sizeInBytes, asyncId); + cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)*asyncObject)); + insertAllocation(p, sizeInBytes, (cudaStream_t)*asyncObject); } } return p; @@ -174,7 +174,7 @@ void CUFFreeDevice(void *p) { } void *CUFAllocManaged( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); @@ -184,9 +184,9 @@ void *CUFAllocManaged( void CUFFreeManaged(void *p) { CUDA_REPORT_IF_ERROR(cudaFree(p)); } void *CUFAllocUnified( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { // Call alloc managed for the time being. 
- return CUFAllocManaged(sizeInBytes, asyncId); + return CUFAllocManaged(sizeInBytes, asyncObject); } void CUFFreeUnified(void *p) { diff --git a/flang-rt/lib/cuda/descriptor.cpp b/flang-rt/lib/cuda/descriptor.cpp index 175e8c0ef8438..7b768f91af29d 100644 --- a/flang-rt/lib/cuda/descriptor.cpp +++ b/flang-rt/lib/cuda/descriptor.cpp @@ -21,7 +21,7 @@ RT_EXT_API_GROUP_BEGIN Descriptor *RTDEF(CUFAllocDescriptor)( std::size_t sizeInBytes, const char *sourceFile, int sourceLine) { return reinterpret_cast( - CUFAllocManaged(sizeInBytes, /*asyncId*/ -1)); + CUFAllocManaged(sizeInBytes, /*asyncObject=*/nullptr)); } void RTDEF(CUFFreeDescriptor)( diff --git a/flang-rt/lib/runtime/allocatable.cpp b/flang-rt/lib/runtime/allocatable.cpp index 6acce34eb9a9e..ef18da6ea0786 100644 --- a/flang-rt/lib/runtime/allocatable.cpp +++ b/flang-rt/lib/runtime/allocatable.cpp @@ -133,17 +133,17 @@ void RTDEF(AllocatableApplyMold)( } } -int RTDEF(AllocatableAllocate)(Descriptor &descriptor, std::int64_t asyncId, - bool hasStat, const Descriptor *errMsg, const char *sourceFile, - int sourceLine) { +int RTDEF(AllocatableAllocate)(Descriptor &descriptor, + std::int64_t *asyncObject, bool hasStat, const Descriptor *errMsg, + const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; if (!descriptor.IsAllocatable()) { return ReturnError(terminator, StatInvalidDescriptor, errMsg, hasStat); } else if (descriptor.IsAllocated()) { return ReturnError(terminator, StatBaseNotNull, errMsg, hasStat); } else { - int stat{ - ReturnError(terminator, descriptor.Allocate(asyncId), errMsg, hasStat)}; + int stat{ReturnError( + terminator, descriptor.Allocate(asyncObject), errMsg, hasStat)}; if (stat == StatOk) { if (const DescriptorAddendum * addendum{descriptor.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -162,7 +162,7 @@ int RTDEF(AllocatableAllocateSource)(Descriptor &alloc, const Descriptor &source, bool hasStat, const Descriptor *errMsg, const char 
*sourceFile, int sourceLine) { int stat{RTNAME(AllocatableAllocate)( - alloc, /*asyncId=*/-1, hasStat, errMsg, sourceFile, sourceLine)}; + alloc, /*asyncObject=*/nullptr, hasStat, errMsg, sourceFile, sourceLine)}; if (stat == StatOk) { Terminator terminator{sourceFile, sourceLine}; DoFromSourceAssign(alloc, source, terminator); diff --git a/flang-rt/lib/runtime/array-constructor.cpp b/flang-rt/lib/runtime/array-constructor.cpp index 67b3b5e1e0f50..858fac7bf2b39 100644 --- a/flang-rt/lib/runtime/array-constructor.cpp +++ b/flang-rt/lib/runtime/array-constructor.cpp @@ -50,7 +50,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( initialAllocationSize(fromElements, to.ElementBytes())}; to.GetDimension(0).SetBounds(1, allocationSize); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); to.GetDimension(0).SetBounds(1, fromElements); vector.actualAllocationSize = allocationSize; @@ -59,7 +59,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( // first value: there should be no reallocation. 
RUNTIME_CHECK(terminator, previousToElements >= fromElements); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); vector.actualAllocationSize = previousToElements; } diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 4a813cd489022..8a4fa36c91479 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -99,7 +99,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; + int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; if (result == StatOk && derived && !derived->noInitializationNeeded()) { result = ReturnError(terminator, Initialize(to, *derived, terminator)); } @@ -277,7 +277,7 @@ RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, // entity, otherwise, the Deallocate() below will not // free the descriptor memory. 
newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; + auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; if (stat == StatOk) { if (HasDynamicComponent(from)) { // If 'from' has allocatable/automatic component, we cannot diff --git a/flang-rt/lib/runtime/character.cpp b/flang-rt/lib/runtime/character.cpp index d1152ee1caefb..f140d202e118e 100644 --- a/flang-rt/lib/runtime/character.cpp +++ b/flang-rt/lib/runtime/character.cpp @@ -118,7 +118,7 @@ static RT_API_ATTRS void Compare(Descriptor &result, const Descriptor &x, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("Compare: could not allocate storage for result"); } std::size_t xChars{x.ElementBytes() >> shift}; @@ -173,7 +173,7 @@ static RT_API_ATTRS void AdjustLRHelper(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("ADJUSTL/R: could not allocate storage for result"); } for (SubscriptValue resultAt{0}; elements-- > 0; @@ -227,7 +227,7 @@ static RT_API_ATTRS void LenTrim(Descriptor &result, const Descriptor &string, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("LEN_TRIM: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -427,7 +427,7 @@ static RT_API_ATTRS void GeneralCharFunc(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { 
terminator.Crash("SCAN/VERIFY: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -530,7 +530,8 @@ static RT_API_ATTRS void MaxMinHelper(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); } for (CHAR *result{accumulator.OffsetElement()}; elements-- > 0; accumData += accumChars, result += chars, x.IncrementSubscripts(xAt)) { @@ -606,7 +607,7 @@ void RTDEF(CharacterConcatenate)(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - if (accumulator.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (accumulator.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash( "CharacterConcatenate: could not allocate storage for result"); } @@ -629,7 +630,8 @@ void RTDEF(CharacterConcatenateScalar1)( accumulator.set_base_addr(nullptr); std::size_t oldLen{accumulator.ElementBytes()}; accumulator.raw().elem_len += chars; - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(accumulator.OffsetElement(oldLen), from, chars); FreeMemory(old); } @@ -831,7 +833,7 @@ void RTDEF(Repeat)(Descriptor &result, const Descriptor &string, std::size_t origBytes{string.ElementBytes()}; result.Establish(string.type(), origBytes * ncopies, nullptr, 0, nullptr, CFI_attribute_allocatable); - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("REPEAT could not allocate storage for result"); } const char *from{string.OffsetElement()}; @@ -865,7 +867,7 @@ void RTDEF(Trim)(Descriptor &result, const Descriptor &string, } result.Establish(string.type(), resultBytes, nullptr, 0, nullptr, 
CFI_attribute_allocatable); - RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(result.OffsetElement(), string.OffsetElement(), resultBytes); } diff --git a/flang-rt/lib/runtime/copy.cpp b/flang-rt/lib/runtime/copy.cpp index 3a0f98cf8d376..f990f46e0be66 100644 --- a/flang-rt/lib/runtime/copy.cpp +++ b/flang-rt/lib/runtime/copy.cpp @@ -171,8 +171,8 @@ RT_API_ATTRS void CopyElement(const Descriptor &to, const SubscriptValue toAt[], *reinterpret_cast(toPtr + component->offset())}; if (toDesc.raw().base_addr != nullptr) { toDesc.set_base_addr(nullptr); - RUNTIME_CHECK( - terminator, toDesc.Allocate(/*asyncId=*/-1) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, + toDesc.Allocate(/*asyncObject=*/nullptr) == CFI_SUCCESS); const Descriptor &fromDesc{*reinterpret_cast( fromPtr + component->offset())}; copyStack.emplace(toDesc, fromDesc); diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..35037036f63e7 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -52,7 +52,7 @@ RT_API_ATTRS int Initialize(const Descriptor &instance, allocDesc.raw().attribute = CFI_attribute_allocatable; if (comp.genre() == typeInfo::Component::Genre::Automatic) { stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); + terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -153,7 +153,7 @@ RT_API_ATTRS int InitializeClone(const Descriptor &clone, if (origDesc.IsAllocated()) { cloneDesc.ApplyMold(origDesc, origDesc.rank()); stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); + terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * 
addendum{cloneDesc.Addendum()}) { if (const typeInfo::DerivedType * @@ -260,7 +260,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy.raw().attribute = CFI_attribute_allocatable; Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } diff --git a/flang-rt/lib/runtime/descriptor.cpp b/flang-rt/lib/runtime/descriptor.cpp index 3debf53bb5290..67336d01380e0 100644 --- a/flang-rt/lib/runtime/descriptor.cpp +++ b/flang-rt/lib/runtime/descriptor.cpp @@ -158,7 +158,7 @@ RT_API_ATTRS static inline int MapAllocIdx(const Descriptor &desc) { #endif } -RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { +RT_API_ATTRS int Descriptor::Allocate(std::int64_t *asyncObject) { std::size_t elementBytes{ElementBytes()}; if (static_cast(elementBytes) < 0) { // F'2023 7.4.4.2 p5: "If the character length parameter value evaluates @@ -170,7 +170,7 @@ RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { // Zero size allocation is possible in Fortran and the resulting // descriptor must be allocated/associated. Since std::malloc(0) // result is implementation defined, always allocate at least one byte. - void *p{alloc(byteSize ? 
byteSize : 1, asyncObject)}; if (!p) { return CFI_ERROR_MEM_ALLOCATION; } diff --git a/flang-rt/lib/runtime/extrema.cpp b/flang-rt/lib/runtime/extrema.cpp index 4c7f8e8b99e8f..03e574a8fbff1 100644 --- a/flang-rt/lib/runtime/extrema.cpp +++ b/flang-rt/lib/runtime/extrema.cpp @@ -152,7 +152,7 @@ inline RT_API_ATTRS void CharacterMaxOrMinLoc(const char *intrinsic, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } @@ -181,7 +181,7 @@ inline RT_API_ATTRS void TotalNumericMaxOrMinLoc(const char *intrinsic, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/runtime/findloc.cpp b/flang-rt/lib/runtime/findloc.cpp index e3e98953b0cfc..5485f4b97bd2f 100644 --- a/flang-rt/lib/runtime/findloc.cpp +++ b/flang-rt/lib/runtime/findloc.cpp @@ -220,7 +220,7 @@ void RTDEF(Findloc)(Descriptor &result, const Descriptor &x, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "FINDLOC: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/matmul-transpose.cpp b/flang-rt/lib/runtime/matmul-transpose.cpp index 17987fb73d943..c9e21502b629e 100644 --- a/flang-rt/lib/runtime/matmul-transpose.cpp +++ b/flang-rt/lib/runtime/matmul-transpose.cpp @@ -183,7 +183,7 @@ inline static RT_API_ATTRS void DoMatmulTranspose( for (int j{0}; j < resRank; ++j) { 
result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "MATMUL-TRANSPOSE: could not allocate memory for result; STAT=%d", stat); diff --git a/flang-rt/lib/runtime/matmul.cpp b/flang-rt/lib/runtime/matmul.cpp index 0ff92cecbbcb8..5acb345725212 100644 --- a/flang-rt/lib/runtime/matmul.cpp +++ b/flang-rt/lib/runtime/matmul.cpp @@ -255,7 +255,7 @@ static inline RT_API_ATTRS void DoMatmul( for (int j{0}; j < resRank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "MATMUL: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/misc-intrinsic.cpp b/flang-rt/lib/runtime/misc-intrinsic.cpp index 2fde859869ef0..a8797f48fa667 100644 --- a/flang-rt/lib/runtime/misc-intrinsic.cpp +++ b/flang-rt/lib/runtime/misc-intrinsic.cpp @@ -30,7 +30,7 @@ static RT_API_ATTRS void TransferImpl(Descriptor &result, if (const DescriptorAddendum * addendum{mold.Addendum()}) { *result.Addendum() = *addendum; } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { Terminator{sourceFile, line}.Crash( "TRANSFER: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/pointer.cpp b/flang-rt/lib/runtime/pointer.cpp index fd2427f4124b5..7331f7bbc3a75 100644 --- a/flang-rt/lib/runtime/pointer.cpp +++ b/flang-rt/lib/runtime/pointer.cpp @@ -129,7 +129,7 @@ RT_API_ATTRS void *AllocateValidatedPointerPayload( byteSize = ((byteSize + align - 1) / align) * align; std::size_t total{byteSize + sizeof(std::uintptr_t)}; AllocFct alloc{allocatorRegistry.GetAllocator(allocatorIdx)}; - void *p{alloc(total, /*asyncId=*/-1)}; + void *p{alloc(total, /*asyncObject=*/nullptr)}; if (p && allocatorIdx == 0) { // Fill the footer word with the XOR of the ones' complement of // 
the base address, which is a value that would be highly unlikely diff --git a/flang-rt/lib/runtime/temporary-stack.cpp b/flang-rt/lib/runtime/temporary-stack.cpp index 3a952b1fdbcca..3f6fd8ee15a80 100644 --- a/flang-rt/lib/runtime/temporary-stack.cpp +++ b/flang-rt/lib/runtime/temporary-stack.cpp @@ -148,7 +148,7 @@ void DescriptorStorage::push(const Descriptor &source) { if constexpr (COPY_VALUES) { // copy the data pointed to by the box box.set_base_addr(nullptr); - box.Allocate(kNoAsyncId); + box.Allocate(kNoAsyncObject); RTNAME(AssignTemporary) (box, source, terminator_.sourceFileName(), terminator_.sourceLine()); } diff --git a/flang-rt/lib/runtime/tools.cpp b/flang-rt/lib/runtime/tools.cpp index 5d6e35faca70a..1f965b0b151ce 100644 --- a/flang-rt/lib/runtime/tools.cpp +++ b/flang-rt/lib/runtime/tools.cpp @@ -261,7 +261,7 @@ RT_API_ATTRS void CreatePartialReductionResult(Descriptor &result, for (int j{0}; j + 1 < xRank; ++j) { result.GetDimension(j).SetBounds(1, resultExtent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/runtime/transformational.cpp b/flang-rt/lib/runtime/transformational.cpp index a7d5a48530ee9..3df314a4e966b 100644 --- a/flang-rt/lib/runtime/transformational.cpp +++ b/flang-rt/lib/runtime/transformational.cpp @@ -132,7 +132,7 @@ static inline RT_API_ATTRS std::size_t AllocateResult(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: Could not allocate memory for result (stat=%d)", function, stat); } @@ -157,7 +157,7 @@ static inline RT_API_ATTRS std::size_t AllocateBesselResult(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int 
stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: Could not allocate memory for result (stat=%d)", function, stat); } diff --git a/flang-rt/unittests/Evaluate/reshape.cpp b/flang-rt/unittests/Evaluate/reshape.cpp index 67a0be124e8e0..f84de443965d1 100644 --- a/flang-rt/unittests/Evaluate/reshape.cpp +++ b/flang-rt/unittests/Evaluate/reshape.cpp @@ -26,7 +26,7 @@ int main() { for (int j{0}; j < 3; ++j) { source->GetDimension(j).SetBounds(1, sourceExtent[j]); } - TEST(source->Allocate(kNoAsyncId) == CFI_SUCCESS); + TEST(source->Allocate(kNoAsyncObject) == CFI_SUCCESS); TEST(source->IsAllocated()); MATCH(2, source->GetDimension(0).Extent()); MATCH(3, source->GetDimension(1).Extent()); diff --git a/flang-rt/unittests/Runtime/Allocatable.cpp b/flang-rt/unittests/Runtime/Allocatable.cpp index a6fcdd0d1423c..b394312e5bc5a 100644 --- a/flang-rt/unittests/Runtime/Allocatable.cpp +++ b/flang-rt/unittests/Runtime/Allocatable.cpp @@ -26,7 +26,7 @@ TEST(AllocatableTest, MoveAlloc) { auto b{createAllocatable(TypeCategory::Integer, 4)}; // ALLOCATE(a(20)) a->GetDimension(0).SetBounds(1, 20); - a->Allocate(kNoAsyncId); + a->Allocate(kNoAsyncObject); EXPECT_TRUE(a->IsAllocated()); EXPECT_FALSE(b->IsAllocated()); @@ -46,7 +46,7 @@ TEST(AllocatableTest, MoveAlloc) { // move_alloc with errMsg auto errMsg{Descriptor::Create( sizeof(char), 64, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - errMsg->Allocate(kNoAsyncId); + errMsg->Allocate(kNoAsyncObject); RTNAME(MoveAlloc)(*b, *a, nullptr, false, errMsg.get(), __FILE__, __LINE__); EXPECT_FALSE(a->IsAllocated()); EXPECT_TRUE(b->IsAllocated()); diff --git a/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp b/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp index 89649aa95ad93..9935ae0eaac2f 100644 --- a/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp +++ b/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp @@ -42,7 +42,8 @@ TEST(AllocatableCUFTest, SimpleDeviceAllocatable) { 
CUDA_REPORT_IF_ERROR(cudaMalloc(&device_desc, a->SizeInBytes())); RTNAME(AllocatableAllocate) - (*a, kNoAsyncId, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*a, kNoAsyncObject, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(a->IsAllocated()); RTNAME(CUFDescriptorSync)(device_desc, a.get(), __FILE__, __LINE__); cudaDeviceSynchronize(); @@ -82,19 +83,22 @@ TEST(AllocatableCUFTest, StreamDeviceAllocatable) { RTNAME(AllocatableSetBounds)(*c, 0, 1, 100); RTNAME(AllocatableAllocate) - (*a, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(a->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); RTNAME(AllocatableAllocate) - (*b, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*b, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(b->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); RTNAME(AllocatableAllocate) - (*c, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*c, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(c->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); diff --git a/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp b/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp index 2f1dc64dc8c5a..f1f931e87a86e 100644 --- a/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp +++ b/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp @@ -35,7 +35,7 @@ TEST(AllocatableCUFTest, SimpleDeviceAllocate) { EXPECT_FALSE(a->HasAddendum()); RTNAME(AllocatableSetBounds)(*a, 0, 1, 10); RTNAME(AllocatableAllocate) - (*a, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); 
EXPECT_TRUE(a->IsAllocated()); RTNAME(AllocatableDeallocate) @@ -54,7 +54,7 @@ TEST(AllocatableCUFTest, SimplePinnedAllocate) { EXPECT_FALSE(a->HasAddendum()); RTNAME(AllocatableSetBounds)(*a, 0, 1, 10); RTNAME(AllocatableAllocate) - (*a, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); EXPECT_TRUE(a->IsAllocated()); RTNAME(AllocatableDeallocate) diff --git a/flang-rt/unittests/Runtime/CUDA/Memory.cpp b/flang-rt/unittests/Runtime/CUDA/Memory.cpp index b3612073657ab..7915baca6c203 100644 --- a/flang-rt/unittests/Runtime/CUDA/Memory.cpp +++ b/flang-rt/unittests/Runtime/CUDA/Memory.cpp @@ -50,8 +50,8 @@ TEST(MemoryCUFTest, CUFDataTransferDescDesc) { EXPECT_EQ((int)kDeviceAllocatorPos, dev->GetAllocIdx()); RTNAME(AllocatableSetBounds)(*dev, 0, 1, 10); RTNAME(AllocatableAllocate) - (*dev, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, - __LINE__); + (*dev, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, + __FILE__, __LINE__); EXPECT_TRUE(dev->IsAllocated()); // Create temp array to transfer to device. 
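Note on the pattern in the hunks above: call sites that passed `/*asyncId=*/-1` now pass `/*asyncObject=*/nullptr`, and `CUFAllocDevice` compares the pointer against `kNoAsyncObject` before dereferencing it to obtain the stream. The following is a minimal host-only sketch of that dispatch, not the flang-rt code. `kNoAsyncObjectSketch`, `AllocPath`, `AllocResult` and `allocSketch` are hypothetical names introduced here for illustration, and `std::malloc` stands in for both `cudaMalloc` and `cudaMallocAsync` so the snippet builds without CUDA.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for kNoAsyncObject from the patch: a null pointer
// means "no async object supplied".
static constexpr std::int64_t *kNoAsyncObjectSketch = nullptr;

// Which branch the allocator took (illustration only).
enum class AllocPath { Sync, Async };

struct AllocResult {
  void *ptr;           // allocated storage, released by the caller with std::free
  AllocPath path;      // branch that was selected
  std::int64_t stream; // value read through asyncObject; 0 on the Sync path
};

// Mirrors the branch structure of CUFAllocDevice after the patch.
// Assumption: std::malloc replaces cudaMalloc / cudaMallocAsync.
AllocResult allocSketch(std::size_t sizeInBytes, std::int64_t *asyncObject) {
  // Allocate at least one byte, as Descriptor::Allocate does in the patch.
  void *p = std::malloc(sizeInBytes ? sizeInBytes : 1);
  if (asyncObject == kNoAsyncObjectSketch) {
    // No async object: the pointer is never dereferenced on this path.
    return {p, AllocPath::Sync, 0};
  }
  // Async object present: dereference it to recover the stream value.
  return {p, AllocPath::Async, *asyncObject};
}
```

With a pointer parameter, the "no stream" case is a null check rather than a reserved integer value, so the pointee can hold any stream value, including one that would previously have collided with `-1`.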
diff --git a/flang-rt/unittests/Runtime/CharacterTest.cpp b/flang-rt/unittests/Runtime/CharacterTest.cpp index 0f28e883671bc..2c7af27b9da77 100644 --- a/flang-rt/unittests/Runtime/CharacterTest.cpp +++ b/flang-rt/unittests/Runtime/CharacterTest.cpp @@ -35,7 +35,7 @@ OwningPtr CreateDescriptor(const std::vector &shape, for (int j{0}; j < rank; ++j) { descriptor->GetDimension(j).SetBounds(2, shape[j] + 1); } - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } diff --git a/flang-rt/unittests/Runtime/CommandTest.cpp b/flang-rt/unittests/Runtime/CommandTest.cpp index 9d0da4ce8dd4e..6919a98105b8a 100644 --- a/flang-rt/unittests/Runtime/CommandTest.cpp +++ b/flang-rt/unittests/Runtime/CommandTest.cpp @@ -26,7 +26,7 @@ template static OwningPtr CreateEmptyCharDescriptor() { OwningPtr descriptor{Descriptor::Create( sizeof(char), n, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } return descriptor; @@ -36,7 +36,7 @@ static OwningPtr CharDescriptor(const char *value) { std::size_t n{std::strlen(value)}; OwningPtr descriptor{Descriptor::Create( sizeof(char), n, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } std::memcpy(descriptor->OffsetElement(), value, n); @@ -47,7 +47,7 @@ template static OwningPtr EmptyIntDescriptor() { OwningPtr descriptor{Descriptor::Create(TypeCategory::Integer, kind, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } return descriptor; @@ -57,7 +57,7 @@ template static OwningPtr IntDescriptor(const int &value) { OwningPtr descriptor{Descriptor::Create(TypeCategory::Integer, kind, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if 
(descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } std::memcpy(descriptor->OffsetElement(), &value, sizeof(int)); diff --git a/flang-rt/unittests/Runtime/TemporaryStack.cpp b/flang-rt/unittests/Runtime/TemporaryStack.cpp index 3291794f22fc1..65725840459ab 100644 --- a/flang-rt/unittests/Runtime/TemporaryStack.cpp +++ b/flang-rt/unittests/Runtime/TemporaryStack.cpp @@ -59,7 +59,7 @@ TEST(TemporaryStack, ValueStackBasic) { Descriptor &outputDesc2{testDescriptorStorage[2].descriptor()}; inputDesc.Establish(code, elementBytes, descriptorPtr, rank, extent); - inputDesc.Allocate(kNoAsyncId); + inputDesc.Allocate(kNoAsyncObject); ASSERT_EQ(inputDesc.IsAllocated(), true); uint32_t *inputData = static_cast(inputDesc.raw().base_addr); for (std::size_t i = 0; i < inputDesc.Elements(); ++i) { @@ -123,7 +123,7 @@ TEST(TemporaryStack, ValueStackMultiSize) { boxDims.extent = extent[dim]; boxDims.sm = elementBytes; } - desc->Allocate(kNoAsyncId); + desc->Allocate(kNoAsyncObject); // fill the array with some data to test for (uint32_t i = 0; i < desc->Elements(); ++i) { diff --git a/flang-rt/unittests/Runtime/tools.h b/flang-rt/unittests/Runtime/tools.h index a1eba45647a80..4ada862df110b 100644 --- a/flang-rt/unittests/Runtime/tools.h +++ b/flang-rt/unittests/Runtime/tools.h @@ -42,7 +42,7 @@ static OwningPtr MakeArray(const std::vector &shape, for (int j{0}; j < rank; ++j) { result->GetDimension(j).SetBounds(1, shape[j]); } - int stat{result->Allocate(kNoAsyncId)}; + int stat{result->Allocate(kNoAsyncObject)}; EXPECT_EQ(stat, 0) << stat; EXPECT_LE(data.size(), result->Elements()); char *p{result->OffsetElement()}; diff --git a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td index 46cc59cda1612..e38738230ffbc 100644 --- a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td +++ b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td @@ -95,12 +95,11 @@ def 
cuf_AllocateOp : cuf_Op<"allocate", [AttrSizedOperandSegments, }]; let arguments = (ins Arg:$box, - Arg, "", [MemWrite]>:$errmsg, - Optional:$stream, - Arg, "", [MemWrite]>:$pinned, - Arg, "", [MemRead]>:$source, - cuf_DataAttributeAttr:$data_attr, - UnitAttr:$hasStat); + Arg, "", [MemWrite]>:$errmsg, + Optional:$stream, + Arg, "", [MemWrite]>:$pinned, + Arg, "", [MemRead]>:$source, + cuf_DataAttributeAttr:$data_attr, UnitAttr:$hasStat); let results = (outs AnyIntegerType:$stat); diff --git a/flang/include/flang/Runtime/CUDA/allocatable.h b/flang/include/flang/Runtime/CUDA/allocatable.h index 822f2d4a2b297..6c97afa9e10e8 100644 --- a/flang/include/flang/Runtime/CUDA/allocatable.h +++ b/flang/include/flang/Runtime/CUDA/allocatable.h @@ -17,14 +17,14 @@ namespace Fortran::runtime::cuda { extern "C" { /// Perform allocation of the descriptor. -int RTDECL(CUFAllocatableAllocate)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFAllocatableAllocate)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. -int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); @@ -32,14 +32,14 @@ int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t stream = -1, /// Perform allocation of the descriptor without synchronization. Assign data /// from source. 
int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. Assign data from source. int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/include/flang/Runtime/CUDA/allocator.h b/flang/include/flang/Runtime/CUDA/allocator.h index 18ddf75ac3852..59fdb22b6e663 100644 --- a/flang/include/flang/Runtime/CUDA/allocator.h +++ b/flang/include/flang/Runtime/CUDA/allocator.h @@ -20,16 +20,16 @@ extern "C" { void RTDECL(CUFRegisterAllocator)(); } -void *CUFAllocPinned(std::size_t, std::int64_t); +void *CUFAllocPinned(std::size_t, std::int64_t *); void CUFFreePinned(void *); -void *CUFAllocDevice(std::size_t, std::int64_t); +void *CUFAllocDevice(std::size_t, std::int64_t *); void CUFFreeDevice(void *); -void *CUFAllocManaged(std::size_t, std::int64_t); +void *CUFAllocManaged(std::size_t, std::int64_t *); void CUFFreeManaged(void *); -void *CUFAllocUnified(std::size_t, std::int64_t); +void *CUFAllocUnified(std::size_t, std::int64_t *); void CUFFreeUnified(void *); } // namespace Fortran::runtime::cuda diff --git a/flang/include/flang/Runtime/CUDA/pointer.h b/flang/include/flang/Runtime/CUDA/pointer.h index 7fbd8f8e061f2..bdfc3268e0814 100644 --- a/flang/include/flang/Runtime/CUDA/pointer.h +++ b/flang/include/flang/Runtime/CUDA/pointer.h @@ -17,14 +17,14 @@ namespace Fortran::runtime::cuda { extern "C" { /// Perform allocation of the 
descriptor. -int RTDECL(CUFPointerAllocate)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFPointerAllocate)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. -int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); @@ -32,14 +32,14 @@ int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t stream = -1, /// Perform allocation of the descriptor without synchronization. Assign data /// from source. int RTDEF(CUFPointerAllocateSource)(Descriptor &pointer, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. Assign data from source. 
int RTDEF(CUFPointerAllocateSourceSync)(Descriptor &pointer, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/include/flang/Runtime/allocatable.h b/flang/include/flang/Runtime/allocatable.h index 6895f8af5e2a8..863c07494e7c3 100644 --- a/flang/include/flang/Runtime/allocatable.h +++ b/flang/include/flang/Runtime/allocatable.h @@ -94,9 +94,10 @@ int RTDECL(AllocatableCheckLengthParameter)(Descriptor &, // Successfully allocated memory is initialized if the allocatable has a // derived type, and is always initialized by AllocatableAllocateSource(). // Performs all necessary coarray synchronization and validation actions. -int RTDECL(AllocatableAllocate)(Descriptor &, std::int64_t asyncId = -1, - bool hasStat = false, const Descriptor *errMsg = nullptr, - const char *sourceFile = nullptr, int sourceLine = 0); +int RTDECL(AllocatableAllocate)(Descriptor &, + std::int64_t *asyncObject = nullptr, bool hasStat = false, + const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, + int sourceLine = 0); int RTDECL(AllocatableAllocateSource)(Descriptor &, const Descriptor &source, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/lib/Lower/Allocatable.cpp b/flang/lib/Lower/Allocatable.cpp index 8d0444a6e5bd4..af8169c8e7f7b 100644 --- a/flang/lib/Lower/Allocatable.cpp +++ b/flang/lib/Lower/Allocatable.cpp @@ -773,7 +773,7 @@ class AllocateStmtHelper { mlir::Value errmsg = errMsgExpr ? errorManager.errMsgAddr : nullptr; mlir::Value stream = streamExpr - ? fir::getBase(converter.genExprValue(loc, *streamExpr, stmtCtx)) + ? 
fir::getBase(converter.genExprAddr(loc, *streamExpr, stmtCtx)) : nullptr; mlir::Value pinned = pinnedExpr diff --git a/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp b/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp index 28452d3b486da..cd5f1f6d098c3 100644 --- a/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp +++ b/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp @@ -76,8 +76,7 @@ void fir::runtime::genAllocatableAllocate(fir::FirOpBuilder &builder, mlir::func::FuncOp func{ fir::runtime::getRuntimeFunc(loc, builder)}; mlir::FunctionType fTy{func.getFunctionType()}; - mlir::Value asyncId = - builder.createIntegerConstant(loc, builder.getI64Type(), -1); + mlir::Value asyncObject = builder.createNullConstant(loc); mlir::Value sourceFile{fir::factory::locationToFilename(builder, loc)}; mlir::Value sourceLine{ fir::factory::locationToLineNo(builder, loc, fTy.getInput(5))}; @@ -88,7 +87,7 @@ void fir::runtime::genAllocatableAllocate(fir::FirOpBuilder &builder, errMsg = builder.create(loc, boxNoneTy).getResult(); } llvm::SmallVector args{ - fir::runtime::createArguments(builder, loc, fTy, desc, asyncId, hasStat, - errMsg, sourceFile, sourceLine)}; + fir::runtime::createArguments(builder, loc, fTy, desc, asyncObject, + hasStat, errMsg, sourceFile, sourceLine)}; builder.create(loc, func, args); } diff --git a/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp b/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp index 24033bc15b8eb..687007d957225 100644 --- a/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp +++ b/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp @@ -76,6 +76,16 @@ llvm::LogicalResult cuf::FreeOp::verify() { return checkCudaAttr(*this); } // AllocateOp //===----------------------------------------------------------------------===// +template +static llvm::LogicalResult checkStreamType(OpTy op) { + if (!op.getStream()) + return mlir::success(); + if (auto refTy = mlir::dyn_cast(op.getStream().getType())) + if (!refTy.getEleTy().isInteger(64)) + return 
op.emitOpError("stream is expected to be an i64 reference"); + return mlir::success(); +} + llvm::LogicalResult cuf::AllocateOp::verify() { if (getPinned() && getStream()) return emitOpError("pinned and stream cannot appears at the same time"); @@ -92,7 +102,7 @@ llvm::LogicalResult cuf::AllocateOp::verify() { "expect errmsg to be a reference to/or a box type value"); if (getErrmsg() && !getHasStat()) return emitOpError("expect stat attribute when errmsg is provided"); - return mlir::success(); + return checkStreamType(*this); } //===----------------------------------------------------------------------===// @@ -143,16 +153,6 @@ llvm::LogicalResult cuf::DeallocateOp::verify() { // KernelLaunchOp //===----------------------------------------------------------------------===// -template -static llvm::LogicalResult checkStreamType(OpTy op) { - if (!op.getStream()) - return mlir::success(); - if (auto refTy = mlir::dyn_cast(op.getStream().getType())) - if (!refTy.getEleTy().isInteger(64)) - return op.emitOpError("stream is expected to be an i64 reference"); - return mlir::success(); -} - llvm::LogicalResult cuf::KernelLaunchOp::verify() { return checkStreamType(*this); } diff --git a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp index e70ceb3a67d98..3a3eab9e8e37b 100644 --- a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp @@ -128,17 +128,15 @@ static mlir::LogicalResult convertOpToCall(OpTy op, mlir::IntegerType::get(op.getContext(), 1))); if (op.getSource()) { mlir::Value stream = - op.getStream() - ? op.getStream() - : builder.createIntegerConstant(loc, fTy.getInput(2), -1); + op.getStream() ? 
op.getStream() + : builder.createNullConstant(loc, fTy.getInput(2)); args = fir::runtime::createArguments( builder, loc, fTy, op.getBox(), op.getSource(), stream, pinned, hasStat, errmsg, sourceFile, sourceLine); } else { mlir::Value stream = - op.getStream() - ? op.getStream() - : builder.createIntegerConstant(loc, fTy.getInput(1), -1); + op.getStream() ? op.getStream() + : builder.createNullConstant(loc, fTy.getInput(1)); args = fir::runtime::createArguments(builder, loc, fTy, op.getBox(), stream, pinned, hasStat, errmsg, sourceFile, sourceLine); diff --git a/flang/test/Fir/CUDA/cuda-allocate.fir b/flang/test/Fir/CUDA/cuda-allocate.fir index 095ad92d5deb5..ea7890c9aac52 100644 --- a/flang/test/Fir/CUDA/cuda-allocate.fir +++ b/flang/test/Fir/CUDA/cuda-allocate.fir @@ -19,7 +19,7 @@ func.func @_QPsub1() { // CHECK: %[[DESC:.*]] = fir.convert %[[DESC_RT_CALL]] : (!fir.ref>) -> !fir.ref>>> // CHECK: %[[DECL_DESC:.*]]:2 = hlfir.declare %[[DESC]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: %[[BOX_NONE:.*]] = fir.convert %[[DECL_DESC]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[BOX_NONE:.*]] = fir.convert %[[DECL_DESC]]#1 : (!fir.ref>>>) -> !fir.ref> // CHECK: %{{.*}} = fir.call @_FortranAAllocatableDeallocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -47,7 +47,7 @@ func.func @_QPsub3() { // CHECK: %[[A:.*]]:2 = hlfir.declare %[[A_ADDR]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QMmod1Ea"} : 
(!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: %[[A_BOX:.*]] = fir.convert %[[A]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[A_BOX:.*]] = fir.convert %[[A]]#1 : (!fir.ref>>>) -> !fir.ref> // CHECK: fir.call @_FortranACUFAllocatableDeallocate(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -87,7 +87,7 @@ func.func @_QPsub5() { } // CHECK-LABEL: func.func @_QPsub5() -// CHECK: fir.call @_FortranACUFAllocatableAllocate({{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocate({{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: fir.call @_FortranAAllocatableDeallocate({{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -118,7 +118,7 @@ func.func @_QQsub6() attributes {fir.bindc_name = "test"} { // CHECK: %[[B:.*]]:2 = hlfir.declare %[[B_ADDR]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QMdataEb"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: _FortranAAllocatableSetBounds // CHECK: %[[B_BOX:.*]] = fir.convert %[[B]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[B_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[B_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 func.func @_QPallocate_source() { @@ -142,7 +142,7 @@ func.func 
@_QPallocate_source() { // CHECK: %[[SOURCE:.*]] = fir.load %[[DECL_HOST]] : !fir.ref>>> // CHECK: %[[DEV_CONV:.*]] = fir.convert %[[DECL_DEV]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[SOURCE_CONV:.*]] = fir.convert %[[SOURCE]] : (!fir.box>>) -> !fir.box -// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocateSource(%[[DEV_CONV]], %[[SOURCE_CONV]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.box, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocateSource(%[[DEV_CONV]], %[[SOURCE_CONV]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.box, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 fir.global @_QMmod1Ea_d {data_attr = #cuf.cuda} : !fir.box>> { @@ -170,16 +170,14 @@ func.func @_QQallocate_stream() { %1 = fir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFEa"} : (!fir.ref>>>) -> !fir.ref>>> %2 = fir.alloca i64 {bindc_name = "stream1", uniq_name = "_QFEstream1"} %3 = fir.declare %2 {uniq_name = "_QFEstream1"} : (!fir.ref) -> !fir.ref - %4 = fir.load %3 : !fir.ref - %5 = cuf.allocate %1 : !fir.ref>>> stream(%4 : i64) {data_attr = #cuf.cuda} -> i32 + %5 = cuf.allocate %1 : !fir.ref>>> stream(%3 : !fir.ref) {data_attr = #cuf.cuda} -> i32 return } // CHECK-LABEL: func.func @_QQallocate_stream() // CHECK: %[[STREAM_ALLOCA:.*]] = fir.alloca i64 {bindc_name = "stream1", uniq_name = "_QFEstream1"} // CHECK: %[[STREAM:.*]] = fir.declare %[[STREAM_ALLOCA]] {uniq_name = "_QFEstream1"} : (!fir.ref) -> !fir.ref -// CHECK: %[[STREAM_LOAD:.*]] = fir.load %[[STREAM]] : !fir.ref -// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %[[STREAM_LOAD]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %[[STREAM]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, 
i1, !fir.box, !fir.ref, i32) -> i32 func.func @_QPp_alloc() { @@ -268,6 +266,6 @@ func.func @_QQpinned() attributes {fir.bindc_name = "testasync"} { // CHECK: %[[PINNED:.*]] = fir.alloca !fir.logical<4> {bindc_name = "pinnedflag", uniq_name = "_QFEpinnedflag"} // CHECK: %[[DECL_PINNED:.*]] = fir.declare %[[PINNED]] {uniq_name = "_QFEpinnedflag"} : (!fir.ref>) -> !fir.ref> // CHECK: %[[CONV_PINNED:.*]] = fir.convert %[[DECL_PINNED]] : (!fir.ref>) -> !fir.ref -// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %{{.*}}, %[[CONV_PINNED]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %{{.*}}, %[[CONV_PINNED]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 } // end of module diff --git a/flang/test/Fir/cuf-invalid.fir b/flang/test/Fir/cuf-invalid.fir index a3b9be3ee8223..dceb8f6fde236 100644 --- a/flang/test/Fir/cuf-invalid.fir +++ b/flang/test/Fir/cuf-invalid.fir @@ -2,13 +2,12 @@ func.func @_QPsub1() { %0 = fir.alloca !fir.box>> {bindc_name = "a", uniq_name = "_QFsub1Ea"} - %1 = fir.alloca i32 + %s = fir.alloca i64 %pinned = fir.alloca i1 %4:2 = hlfir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) %11 = fir.convert %4#1 : (!fir.ref>>>) -> !fir.ref> - %s = fir.load %1 : !fir.ref // expected-error at +1{{'cuf.allocate' op pinned and stream cannot appears at the same time}} - %13 = cuf.allocate %11 : !fir.ref> stream(%s : i32) pinned(%pinned : !fir.ref) {data_attr = #cuf.cuda} -> i32 + %13 = cuf.allocate %11 : !fir.ref> stream(%s : !fir.ref) pinned(%pinned : !fir.ref) {data_attr = #cuf.cuda} -> i32 return } diff --git a/flang/test/Fir/cuf.mlir b/flang/test/Fir/cuf.mlir index d38b26a4548ed..f80a70eca34a3 100644 --- a/flang/test/Fir/cuf.mlir +++ b/flang/test/Fir/cuf.mlir @@ -18,15 +18,14 @@ func.func 
@_QPsub1() { func.func @_QPsub1() { %0 = fir.alloca !fir.box>> {bindc_name = "a", uniq_name = "_QFsub1Ea"} - %1 = fir.alloca i32 + %1 = fir.alloca i64 %4:2 = hlfir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) %11 = fir.convert %4#1 : (!fir.ref>>>) -> !fir.ref> - %s = fir.load %1 : !fir.ref - %13 = cuf.allocate %11 : !fir.ref> stream(%s : i32) {data_attr = #cuf.cuda} -> i32 + %13 = cuf.allocate %11 : !fir.ref> stream(%1 : !fir.ref) {data_attr = #cuf.cuda} -> i32 return } -// CHECK: cuf.allocate %{{.*}} : !fir.ref> stream(%{{.*}} : i32) {data_attr = #cuf.cuda} -> i32 +// CHECK: cuf.allocate %{{.*}} : !fir.ref> stream(%{{.*}} : !fir.ref) {data_attr = #cuf.cuda} -> i32 // ----- diff --git a/flang/test/HLFIR/elemental-codegen.fir b/flang/test/HLFIR/elemental-codegen.fir index a715479f16115..67af4261470f7 100644 --- a/flang/test/HLFIR/elemental-codegen.fir +++ b/flang/test/HLFIR/elemental-codegen.fir @@ -191,7 +191,7 @@ func.func @test_polymorphic(%arg0: !fir.class> {fir.bindc_ // CHECK: %[[VAL_35:.*]] = fir.absent !fir.box // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_4]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_37:.*]] = fir.convert %[[VAL_31]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_38:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_36]], %{{.*}}, %[[VAL_34]], %[[VAL_35]], %[[VAL_37]], %[[VAL_33]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_38:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_36]], %{{.*}}, %[[VAL_34]], %[[VAL_35]], %[[VAL_37]], %[[VAL_33]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_12:.*]] = arith.constant true // CHECK: %[[VAL_39:.*]] = fir.load %[[VAL_13]]#0 : !fir.ref>>>> // CHECK: %[[VAL_40:.*]] = arith.constant 1 : index @@ -275,7 +275,7 @@ func.func @test_polymorphic_expr(%arg0: !fir.class> {fir.b // CHECK: %[[VAL_36:.*]] = fir.absent !fir.box // CHECK: %[[VAL_37:.*]] = 
fir.convert %[[VAL_5]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_38:.*]] = fir.convert %[[VAL_32]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_39:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_37]], %{{.*}}, %[[VAL_35]], %[[VAL_36]], %[[VAL_38]], %[[VAL_34]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_39:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_37]], %{{.*}}, %[[VAL_35]], %[[VAL_36]], %[[VAL_38]], %[[VAL_34]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_13:.*]] = arith.constant true // CHECK: %[[VAL_40:.*]] = fir.load %[[VAL_14]]#0 : !fir.ref>>>> // CHECK: %[[VAL_41:.*]] = arith.constant 1 : index @@ -328,7 +328,7 @@ func.func @test_polymorphic_expr(%arg0: !fir.class> {fir.b // CHECK: %[[VAL_85:.*]] = fir.absent !fir.box // CHECK: %[[VAL_86:.*]] = fir.convert %[[VAL_4]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_87:.*]] = fir.convert %[[VAL_81]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_88:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_86]], %{{.*}}, %[[VAL_84]], %[[VAL_85]], %[[VAL_87]], %[[VAL_83]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_88:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_86]], %{{.*}}, %[[VAL_84]], %[[VAL_85]], %[[VAL_87]], %[[VAL_83]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_62:.*]] = arith.constant true // CHECK: %[[VAL_89:.*]] = fir.load %[[VAL_63]]#0 : !fir.ref>>>> // CHECK: %[[VAL_90:.*]] = arith.constant 1 : index diff --git a/flang/test/Lower/CUDA/cuda-allocatable.cuf b/flang/test/Lower/CUDA/cuda-allocatable.cuf index a570f636b8db1..cec10dda839e9 100644 --- a/flang/test/Lower/CUDA/cuda-allocatable.cuf +++ b/flang/test/Lower/CUDA/cuda-allocatable.cuf @@ -90,7 +90,7 @@ end subroutine subroutine sub4() real, allocatable, device :: a(:) - integer :: istream + integer(8) :: istream allocate(a(10), stream=istream) end subroutine @@ -98,11 +98,10 @@ end subroutine ! 
CHECK: %[[BOX:.*]] = cuf.alloc !fir.box>> {bindc_name = "a", data_attr = #cuf.cuda, uniq_name = "_QFsub4Ea"} -> !fir.ref>>> ! CHECK: fir.embox {{.*}} {allocator_idx = 2 : i32} ! CHECK: %[[BOX_DECL:.*]]:2 = hlfir.declare %{{.*}} {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub4Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) -! CHECK: %[[ISTREAM:.*]] = fir.alloca i32 {bindc_name = "istream", uniq_name = "_QFsub4Eistream"} -! CHECK: %[[ISTREAM_DECL:.*]]:2 = hlfir.declare %[[ISTREAM]] {uniq_name = "_QFsub4Eistream"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ISTREAM:.*]] = fir.alloca i64 {bindc_name = "istream", uniq_name = "_QFsub4Eistream"} +! CHECK: %[[ISTREAM_DECL:.*]]:2 = hlfir.declare %[[ISTREAM]] {uniq_name = "_QFsub4Eistream"} : (!fir.ref) -> (!fir.ref, !fir.ref) ! CHECK: fir.call @_FortranAAllocatableSetBounds -! CHECK: %[[STREAM:.*]] = fir.load %[[ISTREAM_DECL]]#0 : !fir.ref -! CHECK: %{{.*}} = cuf.allocate %[[BOX_DECL]]#0 : !fir.ref>>> stream(%[[STREAM]] : i32) {data_attr = #cuf.cuda} -> i32 +! CHECK: %{{.*}} = cuf.allocate %[[BOX_DECL]]#0 : !fir.ref>>> stream(%[[ISTREAM_DECL]]#0 : !fir.ref) {data_attr = #cuf.cuda} -> i32 ! CHECK: fir.if %{{.*}} { ! CHECK: %{{.*}} = cuf.deallocate %[[BOX_DECL]]#0 : !fir.ref>>> {data_attr = #cuf.cuda} -> i32 ! CHECK: } diff --git a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 index 5bb1ae3797346..6869af863644d 100644 --- a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 @@ -473,6 +473,6 @@ subroutine init() end module ! CHECK-LABEL: func.func @_QMacc_declare_post_action_statPinit() -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! 
CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.if -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/OpenACC/acc-declare.f90 b/flang/test/Lower/OpenACC/acc-declare.f90 index 889cdef51f4ce..4d95ffa10edaf 100644 --- a/flang/test/Lower/OpenACC/acc-declare.f90 +++ b/flang/test/Lower/OpenACC/acc-declare.f90 @@ -434,6 +434,6 @@ subroutine init() end module ! CHECK-LABEL: func.func @_QMacc_declare_post_action_statPinit() -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.if -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/allocatable-polymorphic.f90 b/flang/test/Lower/allocatable-polymorphic.f90 index dd8671daeaf8e..cbd7876203424 100644 --- a/flang/test/Lower/allocatable-polymorphic.f90 +++ b/flang/test/Lower/allocatable-polymorphic.f90 @@ -267,7 +267,7 @@ subroutine test_allocatable() ! CHECK: %[[C0:.*]] = arith.constant 0 : i32 ! 
CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[P_CAST]], %[[TYPE_DESC_P1_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[P_CAST:.*]] = fir.convert %[[P_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[P_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[P_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P1:.*]] = fir.type_desc !fir.type<_QMpolyTp1{a:i32,b:i32}> ! CHECK: %[[C1_CAST:.*]] = fir.convert %[[C1_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> @@ -276,7 +276,7 @@ subroutine test_allocatable() ! CHECK: %[[C0:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[C1_CAST]], %[[TYPE_DESC_P1_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[C1_CAST:.*]] = fir.convert %[[C1_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C1_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C1_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P2:.*]] = fir.type_desc !fir.type<_QMpolyTp2{p1:!fir.type<_QMpolyTp1{a:i32,b:i32}>,c:i32}> ! CHECK: %[[C2_CAST:.*]] = fir.convert %[[C2_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> @@ -285,7 +285,7 @@ subroutine test_allocatable() ! CHECK: %[[C0:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[C2_CAST]], %[[TYPE_DESC_P2_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! 
CHECK: %[[C2_CAST:.*]] = fir.convert %[[C2_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C2_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C2_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P1:.*]] = fir.type_desc !fir.type<_QMpolyTp1{a:i32,b:i32}> ! CHECK: %[[C3_CAST:.*]] = fir.convert %[[C3_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> @@ -300,7 +300,7 @@ subroutine test_allocatable() ! CHECK: %[[C10_I64:.*]] = fir.convert %[[C10]] : (i32) -> i64 ! CHECK: fir.call @_FortranAAllocatableSetBounds(%[[C3_CAST]], %[[C0]], %[[C1_I64]], %[[C10_I64]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[C3_CAST:.*]] = fir.convert %[[C3_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C3_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C3_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P2:.*]] = fir.type_desc !fir.type<_QMpolyTp2{p1:!fir.type<_QMpolyTp1{a:i32,b:i32}>,c:i32}> ! CHECK: %[[C4_CAST:.*]] = fir.convert %[[C4_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> @@ -316,7 +316,7 @@ subroutine test_allocatable() ! CHECK: %[[C20_I64:.*]] = fir.convert %[[C20]] : (i32) -> i64 ! CHECK: fir.call @_FortranAAllocatableSetBounds(%[[C4_CAST]], %[[C0]], %[[C1_I64]], %[[C20_I64]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[C4_CAST:.*]] = fir.convert %[[C4_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> -! 
CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C4_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C4_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[C1_LOAD1:.*]] = fir.load %[[C1_DECL]]#0 : !fir.ref>>> ! CHECK: fir.dispatch "proc1"(%[[C1_LOAD1]] : !fir.class>>) @@ -390,7 +390,7 @@ subroutine test_unlimited_polymorphic_with_intrinsic_type_spec() ! CHECK: %[[CORANK:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitIntrinsicForAllocate(%[[BOX_NONE]], %[[CAT]], %[[KIND]], %[[RANK]], %[[CORANK]]) {{.*}} : (!fir.ref>, i32, i32, i32, i32) -> () ! CHECK: %[[BOX_NONE:.*]] = fir.convert %[[P_DECL]]#0 : (!fir.ref>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[BOX_NONE:.*]] = fir.convert %[[PTR_DECL]]#0 : (!fir.ref>>) -> !fir.ref> ! CHECK: %[[CAT:.*]] = arith.constant 2 : i32 @@ -573,7 +573,7 @@ subroutine test_allocatable_up_character() ! CHECK: %[[CORANK:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitCharacterForAllocate(%[[A_NONE]], %[[LEN]], %[[KIND]], %[[RANK]], %[[CORANK]]) {{.*}} : (!fir.ref>, i64, i32, i32, i32) -> () ! CHECK: %[[A_NONE:.*]] = fir.convert %[[A_DECL]]#0 : (!fir.ref>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! 
CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 end module @@ -592,17 +592,17 @@ program test_alloc ! LLVM-LABEL: define void @_QMpolyPtest_allocatable() ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp2, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 1, i32 0) ! LLVM: call void @_FortranAAllocatableSetBounds(ptr %{{.*}}, i32 0, i64 1, i64 10) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp2, i32 1, i32 0) ! 
LLVM: call void @_FortranAAllocatableSetBounds(ptr %{{.*}}, i32 0, i64 1, i64 20) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM-COUNT-2: call void %{{[0-9]*}}() ! LLVM: call void @llvm.memcpy.p0.p0.i32 @@ -683,5 +683,5 @@ program test_alloc ! LLVM: store { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] } { ptr null, i64 8, i32 20240719, i8 0, i8 42, i8 2, i8 1, ptr @_QMpolyEXdtXp1, [1 x i64] zeroinitializer }, ptr %[[ALLOCA1:[0-9]*]] ! LLVM: call void @llvm.memcpy.p0.p0.i32(ptr %[[ALLOCA2:[0-9]+]], ptr %[[ALLOCA1]], i32 40, i1 false) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %[[ALLOCA2]], ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %[[ALLOCA2]], i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %[[ALLOCA2]], ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: %{{.*}} = call i32 @_FortranAAllocatableDeallocatePolymorphic(ptr %[[ALLOCA2]], ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) diff --git a/flang/test/Lower/allocatable-runtime.f90 b/flang/test/Lower/allocatable-runtime.f90 index 37272c90656cc..c63252c68974e 100644 --- a/flang/test/Lower/allocatable-runtime.f90 +++ b/flang/test/Lower/allocatable-runtime.f90 @@ -31,7 +31,7 @@ subroutine foo() ! CHECK: fir.call @{{.*}}AllocatableSetBounds(%[[xBoxCast2]], %c0{{.*}}, %[[xlbCast]], %[[xubCast]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK-DAG: %[[xBoxCast3:.*]] = fir.convert %[[xBoxAddr]] : (!fir.ref>>>) -> !fir.ref> ! CHECK-DAG: %[[sourceFile:.*]] = fir.convert %{{.*}} -> !fir.ref - ! 
CHECK: fir.call @{{.*}}AllocatableAllocate(%[[xBoxCast3]], %{{.*}}, %false{{.*}}, %[[errMsg]], %[[sourceFile]], %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 + ! CHECK: fir.call @{{.*}}AllocatableAllocate(%[[xBoxCast3]], %{{.*}}, %false{{.*}}, %[[errMsg]], %[[sourceFile]], %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! Simply check that we are emitting the right numebr of set bound for y and z. Otherwise, this is just like x. ! CHECK: fir.convert %[[yBoxAddr]] : (!fir.ref>>>) -> !fir.ref> @@ -180,4 +180,4 @@ subroutine mold_allocation() ! CHECK: %[[M_BOX_NONE:.*]] = fir.convert %[[EMBOX_M]] : (!fir.box>) -> !fir.box ! CHECK: fir.call @_FortranAAllocatableApplyMold(%[[A_BOX_NONE]], %[[M_BOX_NONE]], %[[RANK]]) {{.*}} : (!fir.ref>, !fir.box, i32) -> () ! CHECK: %[[A_BOX_NONE:.*]] = fir.convert %[[A]] : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/allocate-mold.f90 b/flang/test/Lower/allocate-mold.f90 index c7985b11397ce..9427c8b08786f 100644 --- a/flang/test/Lower/allocate-mold.f90 +++ b/flang/test/Lower/allocate-mold.f90 @@ -16,7 +16,7 @@ subroutine scalar_mold_allocation() ! CHECK: %[[A_REF_BOX_NONE1:.*]] = fir.convert %[[A]] : (!fir.ref>>) -> !fir.ref> ! CHECK: fir.call @_FortranAAllocatableApplyMold(%[[A_REF_BOX_NONE1]], %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.box, i32) -> () ! CHECK: %[[A_REF_BOX_NONE2:.*]] = fir.convert %[[A]] : (!fir.ref>>) -> !fir.ref> -! 
CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_REF_BOX_NONE2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_REF_BOX_NONE2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 subroutine array_scalar_mold_allocation() real, allocatable :: a(:) @@ -40,4 +40,4 @@ end subroutine array_scalar_mold_allocation ! CHECK: %[[REF_BOX_A1:.*]] = fir.convert %1 : (!fir.ref>>>) -> !fir.ref> ! CHECK: fir.call @_FortranAAllocatableSetBounds(%[[REF_BOX_A1]], {{.*}},{{.*}}, {{.*}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[REF_BOX_A2:.*]] = fir.convert %[[A]] : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[REF_BOX_A2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[REF_BOX_A2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/polymorphic.f90 b/flang/test/Lower/polymorphic.f90 index 485861a838ff6..b7be5f685d9e3 100644 --- a/flang/test/Lower/polymorphic.f90 +++ b/flang/test/Lower/polymorphic.f90 @@ -1149,7 +1149,7 @@ program test ! CHECK-LABEL: func.func @_QQmain() attributes {fir.bindc_name = "test"} { ! CHECK: %[[ADDR_O:.*]] = fir.address_of(@_QFEo) : !fir.ref}>>>> ! CHECK: %[[BOX_NONE:.*]] = fir.convert %[[ADDR_O]] : (!fir.ref}>>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! 
CHECK: %[[O:.*]] = fir.load %[[ADDR_O]] : !fir.ref}>>>> ! CHECK: %[[COORD_INNER:.*]] = fir.coordinate_of %[[O]], inner : (!fir.box}>>>) -> !fir.ref> ! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered iter_args(%arg1 = %{{.*}}) -> (!fir.array<5x!fir.logical<4>>) { diff --git a/flang/test/Transforms/lower-repack-arrays.fir b/flang/test/Transforms/lower-repack-arrays.fir index bbae7ba5b0e0b..0b323b1bb0697 100644 --- a/flang/test/Transforms/lower-repack-arrays.fir +++ b/flang/test/Transforms/lower-repack-arrays.fir @@ -840,7 +840,7 @@ func.func @_QPtest6(%arg0: !fir.class>> {fir.bi // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>>) -> !fir.box @@ -928,7 +928,7 @@ func.func @_QPtest6_stack(%arg0: !fir.class>> { // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], 
%[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>>) -> !fir.box @@ -1015,7 +1015,7 @@ func.func @_QPtest7(%arg0: !fir.class> {fir.bindc_name = "x // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>) -> !fir.box @@ -1103,7 +1103,7 @@ func.func @_QPtest7_stack(%arg0: !fir.class> {fir.bindc_nam // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert 
%[[VAL_38]] : (!fir.class>>) -> !fir.box

From flang-commits at lists.llvm.org Thu May 1 16:19:04 2025
From: flang-commits at lists.llvm.org (Valentin Clement (バレンタイン クレメン) via flang-commits)
Date: Thu, 01 May 2025 16:19:04 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][cuda] Use a reference for asyncObject (PR #138186)
In-Reply-To:
Message-ID: <68140168.170a0220.171368.821d@mx.google.com>

================
@@ -82,19 +83,22 @@ TEST(AllocatableCUFTest, StreamDeviceAllocatable) {
   RTNAME(AllocatableSetBounds)(*c, 0, 1, 100);
   RTNAME(AllocatableAllocate)
-  (*a, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__);
+  (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__,
+      __LINE__);
   EXPECT_TRUE(a->IsAllocated());
   cudaDeviceSynchronize();
   EXPECT_EQ(cudaSuccess, cudaGetLastError());
   RTNAME(AllocatableAllocate)
-  (*b, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__);
+  (*b, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__,
+      __LINE__);
   EXPECT_TRUE(b->IsAllocated());
   cudaDeviceSynchronize();
   EXPECT_EQ(cudaSuccess, cudaGetLastError());
   RTNAME(AllocatableAllocate)
-  (*c, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__);
+  (*c, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__,
----------------
clementval wrote:

Yes since it's a pointer now. This is a unittest so we don't really care which stream is used.

https://github.com/llvm/llvm-project/pull/138186

From flang-commits at lists.llvm.org Thu May 1 16:20:21 2025
From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits)
Date: Thu, 01 May 2025 16:20:21 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][cuda] Use a reference for asyncObject (PR #138186)
In-Reply-To:
Message-ID: <681401b5.170a0220.18175a.0bae@mx.google.com>

https://github.com/wangzpgi approved this pull request.
https://github.com/llvm/llvm-project/pull/138186

From flang-commits at lists.llvm.org Thu May 1 17:01:18 2025
From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits)
Date: Thu, 01 May 2025 17:01:18 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix #else with trailing text (PR #138045)
In-Reply-To:
Message-ID: <68140b4e.050a0220.20acb1.0fd6@mx.google.com>

https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/138045

>From e3f856c692eec2d5e82116e369bd51f3964256fa Mon Sep 17 00:00:00 2001
From: Eugene Epshteyn
Date: Wed, 30 Apr 2025 18:07:41 -0400
Subject: [PATCH 1/3] [flang] Fix #else with trailing text

Fixed the issue, where the extra text on #else line (' Z' in the example below) caused the data from the "else" clause to be processed together with the data of "then" clause.

```
      PARAMETER(A=2)
      PARAMETER(A=3)
      end
```
---
 flang/lib/Parser/preprocessor.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/flang/lib/Parser/preprocessor.cpp b/flang/lib/Parser/preprocessor.cpp
index a47f9c32ad27c..1e984896ea4ed 100644
--- a/flang/lib/Parser/preprocessor.cpp
+++ b/flang/lib/Parser/preprocessor.cpp
@@ -684,7 +684,9 @@ void Preprocessor::Directive(const TokenSequence &dir, Prescanner &prescanner) {
             dir.GetIntervalProvenanceRange(j, tokens - j),
             "#else: excess tokens at end of directive"_port_en_US);
       }
-    } else if (ifStack_.empty()) {
+    }
+
+    if (ifStack_.empty()) {
       prescanner.Say(dir.GetTokenProvenanceRange(dirOffset),
           "#else: not nested within #if, #ifdef, or #ifndef"_err_en_US);
     } else if (ifStack_.top() != CanDeadElseAppear::Yes) {

>From c7daada0d3a684d7e5c9ff8c484b1d110ed17843 Mon Sep 17 00:00:00 2001
From: Eugene Epshteyn
Date: Wed, 30 Apr 2025 18:23:55 -0400
Subject: [PATCH 2/3] Test

---
 flang/test/Preprocessing/pp048.F | 11 +++++++++++
 1 file changed, 11 insertions(+)
 create mode 100644 flang/test/Preprocessing/pp048.F

diff --git a/flang/test/Preprocessing/pp048.F b/flang/test/Preprocessing/pp048.F
new file mode 100644
index 0000000000000..121262c1840f9
--- /dev/null
+++ b/flang/test/Preprocessing/pp048.F
@@ -0,0 +1,11 @@
+! RUN: %flang -E %s 2>&1 | FileCheck %s
+#ifndef XYZ42
+      PARAMETER(A=2)
+#else Z
+      PARAMETER(A=3)
+#endif
+! Ensure that "PARAMETER(A" is printed only once
+! CHECK: PARAMETER(A
+! CHECK-NOT: PARAMETER(A
+      end
+

>From 17f057b31e857c0bd58c3017559db0e2913e1da5 Mon Sep 17 00:00:00 2001
From: Eugene Epshteyn
Date: Wed, 30 Apr 2025 18:34:01 -0400
Subject: [PATCH 3/3] Removed the blank line

---
 flang/lib/Parser/preprocessor.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/flang/lib/Parser/preprocessor.cpp b/flang/lib/Parser/preprocessor.cpp
index 1e984896ea4ed..6e8e3aee19b09 100644
--- a/flang/lib/Parser/preprocessor.cpp
+++ b/flang/lib/Parser/preprocessor.cpp
@@ -685,7 +685,6 @@ void Preprocessor::Directive(const TokenSequence &dir, Prescanner &prescanner) {
             "#else: excess tokens at end of directive"_port_en_US);
       }
     }
-
     if (ifStack_.empty()) {
       prescanner.Say(dir.GetTokenProvenanceRange(dirOffset),
           "#else: not nested within #if, #ifdef, or #ifndef"_err_en_US);

From flang-commits at lists.llvm.org Thu May 1 17:04:19 2025
From: flang-commits at lists.llvm.org (Valentin Clement (バレンタイン クレメン) via flang-commits)
Date: Thu, 01 May 2025 17:04:19 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][cuda] Use a reference for asyncObject (PR #138186)
In-Reply-To:
Message-ID: <68140c03.170a0220.2a48c1.0d0d@mx.google.com>

https://github.com/clementval closed https://github.com/llvm/llvm-project/pull/138186

From flang-commits at lists.llvm.org Thu May 1 17:40:28 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 01 May 2025 17:40:28 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226)
In-Reply-To:
Message-ID: <6814147c.630a0220.39a0e0.11c6@mx.google.com>

================
@@ -32,6 +32,10 @@
 namespace Fortran {
 namespace lower {
 namespace omp {
+// Container type for tracking user specified Defaultmaps for a target region
+using DefaultMapsTy = std::map

Message-ID: <681414ce.050a0220.60ca4.16e3@mx.google.com>

https://github.com/clementval closed https://github.com/llvm/llvm-project/pull/138221

From flang-commits at lists.llvm.org Thu May 1 17:42:21 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 01 May 2025 17:42:21 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] Revert "[flang][cuda] Use a reference for asyncObject" (PR #138221)
In-Reply-To:
Message-ID: <681414ed.170a0220.f345f.13e8@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-openacc
@llvm/pr-subscribers-flang-fir-hlfir

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Changes

Reverts llvm/llvm-project#138186

---

Patch is 88.48 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/138221.diff

52 Files Affected:

- (modified) flang-rt/include/flang-rt/runtime/allocator-registry.h (+2-2)
- (modified) flang-rt/include/flang-rt/runtime/descriptor.h (+3-3)
- (modified) flang-rt/include/flang-rt/runtime/reduction-templates.h (+1-1)
- (modified) flang-rt/lib/cuda/allocatable.cpp (+4-4)
- (modified) flang-rt/lib/cuda/allocator.cpp (+10-10)
- (modified) flang-rt/lib/cuda/descriptor.cpp (+1-1)
- (modified) flang-rt/lib/runtime/allocatable.cpp (+6-6)
- (modified) flang-rt/lib/runtime/array-constructor.cpp (+2-2)
- (modified) flang-rt/lib/runtime/assign.cpp (+2-2)
- (modified) flang-rt/lib/runtime/character.cpp (+9-11)
- (modified) flang-rt/lib/runtime/copy.cpp (+2-2)
- (modified) flang-rt/lib/runtime/derived.cpp (+3-3)
- (modified) flang-rt/lib/runtime/descriptor.cpp (+2-2)
- (modified) flang-rt/lib/runtime/extrema.cpp (+2-2)
- (modified) flang-rt/lib/runtime/findloc.cpp (+1-1)
- (modified) flang-rt/lib/runtime/matmul-transpose.cpp (+1-1)
- (modified) flang-rt/lib/runtime/matmul.cpp (+1-1)
- (modified) flang-rt/lib/runtime/misc-intrinsic.cpp (+1-1)
- (modified) flang-rt/lib/runtime/pointer.cpp (+1-1)
- (modified) flang-rt/lib/runtime/temporary-stack.cpp (+1-1)
- (modified) flang-rt/lib/runtime/tools.cpp (+1-1)
- (modified) flang-rt/lib/runtime/transformational.cpp (+2-2)
- (modified) flang-rt/unittests/Evaluate/reshape.cpp (+1-1)
- (modified) flang-rt/unittests/Runtime/Allocatable.cpp (+2-2)
- (modified) flang-rt/unittests/Runtime/CUDA/Allocatable.cpp (+4-8)
- (modified) flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp (+2-2)
- (modified) flang-rt/unittests/Runtime/CUDA/Memory.cpp (+2-2)
- (modified) flang-rt/unittests/Runtime/CharacterTest.cpp (+1-1)
- (modified) flang-rt/unittests/Runtime/CommandTest.cpp (+4-4)
- (modified) flang-rt/unittests/Runtime/TemporaryStack.cpp (+2-2)
- (modified) flang-rt/unittests/Runtime/tools.h (+1-1)
- (modified) flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td (+6-5)
- (modified) flang/include/flang/Runtime/CUDA/allocatable.h (+4-4)
- (modified) flang/include/flang/Runtime/CUDA/allocator.h (+4-4)
- (modified) flang/include/flang/Runtime/CUDA/pointer.h (+4-4)
- (modified) flang/include/flang/Runtime/allocatable.h (+3-4)
- (modified) flang/lib/Lower/Allocatable.cpp (+1-1)
- (modified) flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp (+4-3)
- (modified) flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp (+11-11)
- (modified) flang/lib/Optimizer/Transforms/CUFOpConversion.cpp (+6-4)
- (modified) flang/test/Fir/CUDA/cuda-allocate.fir (+10-8)
- (modified) flang/test/Fir/cuf-invalid.fir (+3-2)
- (modified) flang/test/Fir/cuf.mlir (+4-3)
- (modified) flang/test/HLFIR/elemental-codegen.fir (+3-3)
- (modified) flang/test/Lower/CUDA/cuda-allocatable.cuf (+5-4)
- (modified) flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 (+2-2)
- (modified) flang/test/Lower/OpenACC/acc-declare.f90 (+2-2)
- (modified) flang/test/Lower/allocatable-polymorphic.f90 (+13-13)
- (modified) flang/test/Lower/allocatable-runtime.f90 (+2-2)
- (modified) flang/test/Lower/allocate-mold.f90 (+2-2)
- (modified) flang/test/Lower/polymorphic.f90 (+1-1)
- (modified) flang/test/Transforms/lower-repack-arrays.fir (+4-4)

``````````diff
diff --git a/flang-rt/include/flang-rt/runtime/allocator-registry.h b/flang-rt/include/flang-rt/runtime/allocator-registry.h
index f0ba77a360736..33e8e2c7d7850 100644
--- a/flang-rt/include/flang-rt/runtime/allocator-registry.h
+++ b/flang-rt/include/flang-rt/runtime/allocator-registry.h
@@ -19,7 +19,7 @@
 namespace Fortran::runtime {
-using AllocFct = void *(*)(std::size_t, std::int64_t *);
+using AllocFct = void *(*)(std::size_t, std::int64_t);
 using FreeFct = void (*)(void *);
 typedef struct Allocator_t {
@@ -28,7 +28,7 @@ typedef struct Allocator_t {
 } Allocator_t;
 static RT_API_ATTRS void *MallocWrapper(
-    std::size_t
size, [[maybe_unused]] std::int64_t *) { + std::size_t size, [[maybe_unused]] std::int64_t) { return std::malloc(size); } #ifdef RT_DEVICE_COMPILATION diff --git a/flang-rt/include/flang-rt/runtime/descriptor.h b/flang-rt/include/flang-rt/runtime/descriptor.h index c98e6b14850cb..9907e7866e7bf 100644 --- a/flang-rt/include/flang-rt/runtime/descriptor.h +++ b/flang-rt/include/flang-rt/runtime/descriptor.h @@ -29,8 +29,8 @@ #include #include -/// Value used for asyncObject when no specific stream is specified. -static constexpr std::int64_t *kNoAsyncObject = nullptr; +/// Value used for asyncId when no specific stream is specified. +static constexpr std::int64_t kNoAsyncId = -1; namespace Fortran::runtime { @@ -372,7 +372,7 @@ class Descriptor { // before calling. It (re)computes the byte strides after // allocation. Does not allocate automatic components or // perform default component initialization. - RT_API_ATTRS int Allocate(std::int64_t *asyncObject); + RT_API_ATTRS int Allocate(std::int64_t asyncId); RT_API_ATTRS void SetByteStrides(); // Deallocates storage; does not call FINAL subroutines or diff --git a/flang-rt/include/flang-rt/runtime/reduction-templates.h b/flang-rt/include/flang-rt/runtime/reduction-templates.h index 18412708b02c5..77f77a592a476 100644 --- a/flang-rt/include/flang-rt/runtime/reduction-templates.h +++ b/flang-rt/include/flang-rt/runtime/reduction-templates.h @@ -347,7 +347,7 @@ inline RT_API_ATTRS void DoMaxMinNorm2(Descriptor &result, const Descriptor &x, // as the element size of the source. 
result.Establish(x.type(), x.ElementBytes(), nullptr, 0, nullptr, CFI_attribute_allocatable); - if (int stat{result.Allocate(kNoAsyncObject)}) { + if (int stat{result.Allocate(kNoAsyncId)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/cuda/allocatable.cpp b/flang-rt/lib/cuda/allocatable.cpp index c77819e9440d7..432974d18a3e3 100644 --- a/flang-rt/lib/cuda/allocatable.cpp +++ b/flang-rt/lib/cuda/allocatable.cpp @@ -23,7 +23,7 @@ namespace Fortran::runtime::cuda { extern "C" { RT_EXT_API_GROUP_BEGIN -int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t *stream, +int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( @@ -41,7 +41,7 @@ int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t *stream, return stat; } -int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t *stream, +int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { if (desc.HasAddendum()) { @@ -63,7 +63,7 @@ int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t *stream, } int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; @@ -76,7 +76,7 @@ int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, } int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, const Descriptor 
*errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocateSync)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; diff --git a/flang-rt/lib/cuda/allocator.cpp b/flang-rt/lib/cuda/allocator.cpp index f4289c55bd8de..51119ab251168 100644 --- a/flang-rt/lib/cuda/allocator.cpp +++ b/flang-rt/lib/cuda/allocator.cpp @@ -98,7 +98,7 @@ static unsigned findAllocation(void *ptr) { return allocNotFound; } -static void insertAllocation(void *ptr, std::size_t size, cudaStream_t stream) { +static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { CriticalSection critical{lock}; initAllocations(); if (numDeviceAllocations >= maxDeviceAllocations) { @@ -106,7 +106,7 @@ static void insertAllocation(void *ptr, std::size_t size, cudaStream_t stream) { } deviceAllocations[numDeviceAllocations].ptr = ptr; deviceAllocations[numDeviceAllocations].size = size; - deviceAllocations[numDeviceAllocations].stream = stream; + deviceAllocations[numDeviceAllocations].stream = (cudaStream_t)stream; ++numDeviceAllocations; qsort(deviceAllocations, numDeviceAllocations, sizeof(DeviceAllocation), compareDeviceAlloc); @@ -136,7 +136,7 @@ void RTDEF(CUFRegisterAllocator)() { } void *CUFAllocPinned( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { void *p; CUDA_REPORT_IF_ERROR(cudaMallocHost((void **)&p, sizeInBytes)); return p; @@ -144,18 +144,18 @@ void *CUFAllocPinned( void CUFFreePinned(void *p) { CUDA_REPORT_IF_ERROR(cudaFreeHost(p)); } -void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t *asyncObject) { +void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t asyncId) { void *p; if (Fortran::runtime::executionEnvironment.cudaDeviceIsManaged) { CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); } else { - if (asyncObject == kNoAsyncObject) { + if (asyncId == kNoAsyncId) { 
CUDA_REPORT_IF_ERROR(cudaMalloc(&p, sizeInBytes)); } else { CUDA_REPORT_IF_ERROR( - cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)*asyncObject)); - insertAllocation(p, sizeInBytes, (cudaStream_t)*asyncObject); + cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)asyncId)); + insertAllocation(p, sizeInBytes, asyncId); } } return p; @@ -174,7 +174,7 @@ void CUFFreeDevice(void *p) { } void *CUFAllocManaged( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { void *p; CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); @@ -184,9 +184,9 @@ void *CUFAllocManaged( void CUFFreeManaged(void *p) { CUDA_REPORT_IF_ERROR(cudaFree(p)); } void *CUFAllocUnified( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { // Call alloc managed for the time being. - return CUFAllocManaged(sizeInBytes, asyncObject); + return CUFAllocManaged(sizeInBytes, asyncId); } void CUFFreeUnified(void *p) { diff --git a/flang-rt/lib/cuda/descriptor.cpp b/flang-rt/lib/cuda/descriptor.cpp index 7b768f91af29d..175e8c0ef8438 100644 --- a/flang-rt/lib/cuda/descriptor.cpp +++ b/flang-rt/lib/cuda/descriptor.cpp @@ -21,7 +21,7 @@ RT_EXT_API_GROUP_BEGIN Descriptor *RTDEF(CUFAllocDescriptor)( std::size_t sizeInBytes, const char *sourceFile, int sourceLine) { return reinterpret_cast( - CUFAllocManaged(sizeInBytes, /*asyncObject=*/nullptr)); + CUFAllocManaged(sizeInBytes, /*asyncId*/ -1)); } void RTDEF(CUFFreeDescriptor)( diff --git a/flang-rt/lib/runtime/allocatable.cpp b/flang-rt/lib/runtime/allocatable.cpp index ef18da6ea0786..6acce34eb9a9e 100644 --- a/flang-rt/lib/runtime/allocatable.cpp +++ b/flang-rt/lib/runtime/allocatable.cpp @@ -133,17 +133,17 @@ void RTDEF(AllocatableApplyMold)( } } -int RTDEF(AllocatableAllocate)(Descriptor &descriptor, - std::int64_t *asyncObject, bool hasStat, const Descriptor 
*errMsg, - const char *sourceFile, int sourceLine) { +int RTDEF(AllocatableAllocate)(Descriptor &descriptor, std::int64_t asyncId, + bool hasStat, const Descriptor *errMsg, const char *sourceFile, + int sourceLine) { Terminator terminator{sourceFile, sourceLine}; if (!descriptor.IsAllocatable()) { return ReturnError(terminator, StatInvalidDescriptor, errMsg, hasStat); } else if (descriptor.IsAllocated()) { return ReturnError(terminator, StatBaseNotNull, errMsg, hasStat); } else { - int stat{ReturnError( - terminator, descriptor.Allocate(asyncObject), errMsg, hasStat)}; + int stat{ + ReturnError(terminator, descriptor.Allocate(asyncId), errMsg, hasStat)}; if (stat == StatOk) { if (const DescriptorAddendum * addendum{descriptor.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -162,7 +162,7 @@ int RTDEF(AllocatableAllocateSource)(Descriptor &alloc, const Descriptor &source, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(AllocatableAllocate)( - alloc, /*asyncObject=*/nullptr, hasStat, errMsg, sourceFile, sourceLine)}; + alloc, /*asyncId=*/-1, hasStat, errMsg, sourceFile, sourceLine)}; if (stat == StatOk) { Terminator terminator{sourceFile, sourceLine}; DoFromSourceAssign(alloc, source, terminator); diff --git a/flang-rt/lib/runtime/array-constructor.cpp b/flang-rt/lib/runtime/array-constructor.cpp index 858fac7bf2b39..67b3b5e1e0f50 100644 --- a/flang-rt/lib/runtime/array-constructor.cpp +++ b/flang-rt/lib/runtime/array-constructor.cpp @@ -50,7 +50,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( initialAllocationSize(fromElements, to.ElementBytes())}; to.GetDimension(0).SetBounds(1, allocationSize); RTNAME(AllocatableAllocate) - (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); to.GetDimension(0).SetBounds(1, fromElements); vector.actualAllocationSize = 
allocationSize; @@ -59,7 +59,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( // first value: there should be no reallocation. RUNTIME_CHECK(terminator, previousToElements >= fromElements); RTNAME(AllocatableAllocate) - (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); vector.actualAllocationSize = previousToElements; } diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 8a4fa36c91479..4a813cd489022 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -99,7 +99,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; + int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; if (result == StatOk && derived && !derived->noInitializationNeeded()) { result = ReturnError(terminator, Initialize(to, *derived, terminator)); } @@ -277,7 +277,7 @@ RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, // entity, otherwise, the Deallocate() below will not // free the descriptor memory. 
newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; + auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; if (stat == StatOk) { if (HasDynamicComponent(from)) { // If 'from' has allocatable/automatic component, we cannot diff --git a/flang-rt/lib/runtime/character.cpp b/flang-rt/lib/runtime/character.cpp index f140d202e118e..d1152ee1caefb 100644 --- a/flang-rt/lib/runtime/character.cpp +++ b/flang-rt/lib/runtime/character.cpp @@ -118,7 +118,7 @@ static RT_API_ATTRS void Compare(Descriptor &result, const Descriptor &x, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { terminator.Crash("Compare: could not allocate storage for result"); } std::size_t xChars{x.ElementBytes() >> shift}; @@ -173,7 +173,7 @@ static RT_API_ATTRS void AdjustLRHelper(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { terminator.Crash("ADJUSTL/R: could not allocate storage for result"); } for (SubscriptValue resultAt{0}; elements-- > 0; @@ -227,7 +227,7 @@ static RT_API_ATTRS void LenTrim(Descriptor &result, const Descriptor &string, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { terminator.Crash("LEN_TRIM: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -427,7 +427,7 @@ static RT_API_ATTRS void GeneralCharFunc(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { 
terminator.Crash("SCAN/VERIFY: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -530,8 +530,7 @@ static RT_API_ATTRS void MaxMinHelper(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - RUNTIME_CHECK( - terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); } for (CHAR *result{accumulator.OffsetElement()}; elements-- > 0; accumData += accumChars, result += chars, x.IncrementSubscripts(xAt)) { @@ -607,7 +606,7 @@ void RTDEF(CharacterConcatenate)(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - if (accumulator.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (accumulator.Allocate(kNoAsyncId) != CFI_SUCCESS) { terminator.Crash( "CharacterConcatenate: could not allocate storage for result"); } @@ -630,8 +629,7 @@ void RTDEF(CharacterConcatenateScalar1)( accumulator.set_base_addr(nullptr); std::size_t oldLen{accumulator.ElementBytes()}; accumulator.raw().elem_len += chars; - RUNTIME_CHECK( - terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); std::memcpy(accumulator.OffsetElement(oldLen), from, chars); FreeMemory(old); } @@ -833,7 +831,7 @@ void RTDEF(Repeat)(Descriptor &result, const Descriptor &string, std::size_t origBytes{string.ElementBytes()}; result.Establish(string.type(), origBytes * ncopies, nullptr, 0, nullptr, CFI_attribute_allocatable); - if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { terminator.Crash("REPEAT could not allocate storage for result"); } const char *from{string.OffsetElement()}; @@ -867,7 +865,7 @@ void RTDEF(Trim)(Descriptor &result, const Descriptor &string, } result.Establish(string.type(), resultBytes, nullptr, 0, nullptr, 
CFI_attribute_allocatable); - RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncId) == CFI_SUCCESS); std::memcpy(result.OffsetElement(), string.OffsetElement(), resultBytes); } diff --git a/flang-rt/lib/runtime/copy.cpp b/flang-rt/lib/runtime/copy.cpp index f990f46e0be66..3a0f98cf8d376 100644 --- a/flang-rt/lib/runtime/copy.cpp +++ b/flang-rt/lib/runtime/copy.cpp @@ -171,8 +171,8 @@ RT_API_ATTRS void CopyElement(const Descriptor &to, const SubscriptValue toAt[], *reinterpret_cast(toPtr + component->offset())}; if (toDesc.raw().base_addr != nullptr) { toDesc.set_base_addr(nullptr); - RUNTIME_CHECK(terminator, - toDesc.Allocate(/*asyncObject=*/nullptr) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, toDesc.Allocate(/*asyncId=*/-1) == CFI_SUCCESS); const Descriptor &fromDesc{*reinterpret_cast( fromPtr + component->offset())}; copyStack.emplace(toDesc, fromDesc); diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index 35037036f63e7..c46ea806a430a 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -52,7 +52,7 @@ RT_API_ATTRS int Initialize(const Descriptor &instance, allocDesc.raw().attribute = CFI_attribute_allocatable; if (comp.genre() == typeInfo::Component::Genre::Automatic) { stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); + terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -153,7 +153,7 @@ RT_API_ATTRS int InitializeClone(const Descriptor &clone, if (origDesc.IsAllocated()) { cloneDesc.ApplyMold(origDesc, origDesc.rank()); stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); + terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * 
addendum{cloneDesc.Addendum()}) { if (const typeInfo::DerivedType * @@ -260,7 +260,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy.raw().attribute = CFI_attribute_allocatable; Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); + copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } diff --git a/flang-rt/lib/runtime/descriptor.cpp b/flang-rt/lib/runtime/descriptor.cpp index 67336d01380e0..3debf53bb5290 100644 --- a/flang-rt/lib/runtime/descriptor.cpp +++ b/flang-rt/lib/runtime/descriptor.cpp @@ -158,7 +158,7 @@ RT_API_ATTRS static inline int MapAllocIdx(const Descriptor &desc) { #endif } -RT_API_ATTRS int Descriptor::Allocate(std::int64_t *asyncObject) { +RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { std::size_t elementBytes{ElementBytes()}; if (static_cast(elementBytes) < 0) { // F'2023 7.4.4.2 p5: "If the character length parameter value evaluates @@ -170,7 +170,7 @@ RT_API_ATTRS int Descriptor::Allocate(std::int64_t *asyncObject) { // Zero size allocation is possible in Fortran and the resulting // descriptor must be allocated/associated. Since std::malloc(0) // result is implementation defined, always allocate at least one byte. - void *p{alloc(byteSize ? byteSize : 1, asyncObject)}; + void *p{alloc(byteSize ? byteSize : 1, asyncId)}; if (!p) { return CFI_ERROR_MEM_ALLOCATION; } diff --git a/flang-rt/lib/runtime/extrema.cpp b/flang-r... [truncated] ``````````
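For readers skimming the revert, the pattern in the descriptor.cpp hunk can be sketched in isolation as below. This is only an illustration, not the flang-rt code: `kNoAsyncIdSketch`, `SketchAlloc` and `SketchAllocate` are hypothetical names, and the sentinel value -1 is an assumption taken from the `/*asyncId=*/-1` argument in the copy.cpp hunk. It shows the two ideas visible in the patch: the async id is passed by value with a "no async" sentinel instead of a nullable pointer, and a zero-byte request is rounded up to one byte because the result of std::malloc(0) is implementation-defined.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for the runtime's "no async id" sentinel.
// The value -1 is an assumption based on the copy.cpp hunk.
constexpr std::int64_t kNoAsyncIdSketch = -1;

// Hypothetical allocator hook. A device allocator could use the id to
// select a stream; this host-only sketch ignores it.
inline void *SketchAlloc(std::size_t bytes, std::int64_t asyncId) {
  (void)asyncId;
  return std::malloc(bytes);
}

// Returns 0 on success and 1 on allocation failure.
// A zero-size request still allocates one byte so that the caller
// always receives a non-null, "allocated" base address.
inline int SketchAllocate(void **out, std::size_t byteSize,
                          std::int64_t asyncId = kNoAsyncIdSketch) {
  void *p = SketchAlloc(byteSize ? byteSize : 1, asyncId);
  if (!p) {
    return 1;
  }
  *out = p;
  return 0;
}
```

With a by-value id the callers in the patch can write `Allocate(kNoAsyncId)` directly, whereas the reverted variant took a `std::int64_t *` and used a null pointer to mean "no async object".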
https://github.com/llvm/llvm-project/pull/138221 From flang-commits at lists.llvm.org Thu May 1 17:44:36 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 17:44:36 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <68141574.170a0220.1d7590.889a@mx.google.com> https://github.com/agozillon updated https://github.com/llvm/llvm-project/pull/135226 >From bcdabc07bfb091785191ba7bf98a24d2f844d1f6 Mon Sep 17 00:00:00 2001 From: agozillon Date: Thu, 1 May 2025 19:43:27 -0500 Subject: [PATCH] [Flang][OpenMP] Initial defaultmap implementation This aims to implement most of the initial arguments for defaultmap aside from firstprivate and none, and some of the more recent OpenMP 6 additions which will come in subsequent updates. --- flang/include/flang/Parser/parse-tree.h | 4 +- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 20 ++ flang/lib/Lower/OpenMP/ClauseProcessor.h | 6 + flang/lib/Lower/OpenMP/Clauses.cpp | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 218 +++++++++++++----- flang/lib/Parser/openmp-parsers.cpp | 5 +- .../Todo/defaultmap-clause-firstprivate.f90 | 11 + .../OpenMP/Todo/defaultmap-clause-none.f90 | 11 + .../Lower/OpenMP/Todo/defaultmap-clause.f90 | 8 - flang/test/Lower/OpenMP/defaultmap.f90 | 105 +++++++++ .../test/Parser/OpenMP/defaultmap-clause.f90 | 16 ++ .../fortran/target-defaultmap-present.f90 | 34 +++ .../offloading/fortran/target-defaultmap.f90 | 166 +++++++++++++ 13 files changed, 533 insertions(+), 73 deletions(-) create mode 100644 flang/test/Lower/OpenMP/Todo/defaultmap-clause-firstprivate.f90 create mode 100644 flang/test/Lower/OpenMP/Todo/defaultmap-clause-none.f90 delete mode 100644 flang/test/Lower/OpenMP/Todo/defaultmap-clause.f90 create mode 100644 flang/test/Lower/OpenMP/defaultmap.f90 create mode 100644 offload/test/offloading/fortran/target-defaultmap-present.f90 create mode 100644 
offload/test/offloading/fortran/target-defaultmap.f90 diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..2720c67399092 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4133,8 +4133,8 @@ struct OmpDefaultClause { // PRESENT // since 5.1 struct OmpDefaultmapClause { TUPLE_CLASS_BOILERPLATE(OmpDefaultmapClause); - ENUM_CLASS( - ImplicitBehavior, Alloc, To, From, Tofrom, Firstprivate, None, Default) + ENUM_CLASS(ImplicitBehavior, Alloc, To, From, Tofrom, Firstprivate, None, + Default, Present) MODIFIER_BOILERPLATE(OmpVariableCategory); std::tuple t; }; diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 77b4622547d7a..98ca5d21d3ad8 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -856,6 +856,26 @@ static bool isVectorSubscript(const evaluate::Expr &expr) { return false; } +bool ClauseProcessor::processDefaultMap(lower::StatementContext &stmtCtx, + DefaultMapsTy &result) const { + auto process = [&](const omp::clause::Defaultmap &clause, + const parser::CharBlock &) { + using Defmap = omp::clause::Defaultmap; + clause::Defaultmap::VariableCategory variableCategory = + Defmap::VariableCategory::All; + // Variable Category is optional, if not specified defaults to all. + // Multiples of the same category are illegal as are any other + // defaultmaps being specified when a user specified all is in place, + // however, this should be handled earlier during semantics. 
+ if (auto varCat = + std::get>(clause.t)) + variableCategory = varCat.value_or(Defmap::VariableCategory::All); + auto behaviour = std::get(clause.t); + result[variableCategory] = behaviour; + }; + return findRepeatableClause(process); +} + bool ClauseProcessor::processDepend(lower::SymMap &symMap, lower::StatementContext &stmtCtx, mlir::omp::DependClauseOps &result) const { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index bdddeb145b496..2d3d2946838d7 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -32,6 +32,10 @@ namespace Fortran { namespace lower { namespace omp { +// Container type for tracking user specified Defaultmaps for a target region +using DefaultMapsTy = std::map; + /// Class that handles the processing of OpenMP clauses. /// /// Its `process()` methods perform MLIR code generation for their @@ -106,6 +110,8 @@ class ClauseProcessor { bool processCopyin() const; bool processCopyprivate(mlir::Location currentLocation, mlir::omp::CopyprivateClauseOps &result) const; + bool processDefaultMap(lower::StatementContext &stmtCtx, + DefaultMapsTy &result) const; bool processDepend(lower::SymMap &symMap, lower::StatementContext &stmtCtx, mlir::omp::DependClauseOps &result) const; bool diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index c258bef2e4427..f3088b18b77ff 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -612,7 +612,7 @@ Defaultmap make(const parser::OmpClause::Defaultmap &inp, MS(Firstprivate, Firstprivate) MS(None, None) MS(Default, Default) - // MS(, Present) missing-in-parser + MS(Present, Present) // clang-format on ); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 47e7c266ff7d3..2ecbbc526378f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1700,11 +1700,13 @@ static void 
genTargetClauses( lower::SymMap &symTable, lower::StatementContext &stmtCtx, lower::pft::Evaluation &eval, const List &clauses, mlir::Location loc, mlir::omp::TargetOperands &clauseOps, + DefaultMapsTy &defaultMaps, llvm::SmallVectorImpl &hasDeviceAddrSyms, llvm::SmallVectorImpl &isDevicePtrSyms, llvm::SmallVectorImpl &mapSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processBare(clauseOps); + cp.processDefaultMap(stmtCtx, defaultMaps); cp.processDepend(symTable, stmtCtx, clauseOps); cp.processDevice(stmtCtx, clauseOps); cp.processHasDeviceAddr(stmtCtx, clauseOps, hasDeviceAddrSyms); @@ -1719,9 +1721,8 @@ static void genTargetClauses( cp.processNowait(clauseOps); cp.processThreadLimit(stmtCtx, clauseOps); - cp.processTODO(loc, - llvm::omp::Directive::OMPD_target); + cp.processTODO( + loc, llvm::omp::Directive::OMPD_target); // `target private(..)` is only supported in delayed privatization mode. if (!enableDelayedPrivatizationStaging) @@ -2231,6 +2232,146 @@ genSingleOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static clause::Defaultmap::ImplicitBehavior +getDefaultmapIfPresent(DefaultMapsTy &defaultMaps, mlir::Type varType) { + using DefMap = clause::Defaultmap; + + if (defaultMaps.empty()) + return DefMap::ImplicitBehavior::Default; + + if (llvm::is_contained(defaultMaps, DefMap::VariableCategory::All)) + return defaultMaps[DefMap::VariableCategory::All]; + + // NOTE: Unsure if complex and/or vector falls into a scalar type + // or aggregate, but the current default implicit behaviour is to + // treat them as such (c_ptr has its own behaviour, so perhaps + // being lumped in as a scalar isn't the right thing). 
+ if ((fir::isa_trivial(varType) || fir::isa_char(varType) || + fir::isa_builtin_cptr_type(varType)) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Scalar)) + return defaultMaps[DefMap::VariableCategory::Scalar]; + + if (fir::isPointerType(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Pointer)) + return defaultMaps[DefMap::VariableCategory::Pointer]; + + if (fir::isAllocatableType(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Allocatable)) + return defaultMaps[DefMap::VariableCategory::Allocatable]; + + if (fir::isa_aggregate(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Aggregate)) { + return defaultMaps[DefMap::VariableCategory::Aggregate]; + } + + return DefMap::ImplicitBehavior::Default; +} + +static std::pair +getImplicitMapTypeAndKind(fir::FirOpBuilder &firOpBuilder, + lower::AbstractConverter &converter, + DefaultMapsTy &defaultMaps, mlir::Type varType, + mlir::Location loc, const semantics::Symbol &sym) { + using DefMap = clause::Defaultmap; + // Check if a value of type `type` can be passed to the kernel by value. + // All kernel parameters are of pointer type, so if the value can be + // represented inside of a pointer, then it can be passed by value. 
+ auto isLiteralType = [&](mlir::Type type) { + const mlir::DataLayout &dl = firOpBuilder.getDataLayout(); + mlir::Type ptrTy = + mlir::LLVM::LLVMPointerType::get(&converter.getMLIRContext()); + uint64_t ptrSize = dl.getTypeSize(ptrTy); + uint64_t ptrAlign = dl.getTypePreferredAlignment(ptrTy); + + auto [size, align] = fir::getTypeSizeAndAlignmentOrCrash( + loc, type, dl, converter.getKindMap()); + return size <= ptrSize && align <= ptrAlign; + }; + + llvm::omp::OpenMPOffloadMappingFlags mapFlag = + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT; + + auto implicitBehaviour = getDefaultmapIfPresent(defaultMaps, varType); + if (implicitBehaviour == DefMap::ImplicitBehavior::Default) { + mlir::omp::VariableCaptureKind captureKind = + mlir::omp::VariableCaptureKind::ByRef; + + // If a variable is specified in declare target link and if device + // type is not specified as `nohost`, it needs to be mapped tofrom + mlir::ModuleOp mod = firOpBuilder.getModule(); + mlir::Operation *op = mod.lookupSymbol(converter.mangleName(sym)); + auto declareTargetOp = + llvm::dyn_cast_if_present(op); + if (declareTargetOp && declareTargetOp.isDeclareTarget()) { + if (declareTargetOp.getDeclareTargetCaptureClause() == + mlir::omp::DeclareTargetCaptureClause::link && + declareTargetOp.getDeclareTargetDeviceType() != + mlir::omp::DeclareTargetDeviceType::nohost) { + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; + } + } else if (fir::isa_trivial(varType) || fir::isa_char(varType)) { + // Scalars behave as if they were "firstprivate". + // TODO: Handle objects that are shared/lastprivate or were listed + // in an in_reduction clause. 
+ if (isLiteralType(varType)) { + captureKind = mlir::omp::VariableCaptureKind::ByCopy; + } else { + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; + } + } else if (!fir::isa_builtin_cptr_type(varType)) { + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; + } + return std::make_pair(mapFlag, captureKind); + } + + switch (implicitBehaviour) { + case DefMap::ImplicitBehavior::Alloc: + return std::make_pair(llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_NONE, + mlir::omp::VariableCaptureKind::ByRef); + break; + case DefMap::ImplicitBehavior::Firstprivate: + case DefMap::ImplicitBehavior::None: + TODO(loc, "Firstprivate and None are currently unsupported defaultmap " + "behaviour"); + break; + case DefMap::ImplicitBehavior::From: + return std::make_pair(mapFlag |= + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM, + mlir::omp::VariableCaptureKind::ByRef); + break; + case DefMap::ImplicitBehavior::Present: + return std::make_pair(mapFlag |= + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_PRESENT, + mlir::omp::VariableCaptureKind::ByRef); + break; + case DefMap::ImplicitBehavior::To: + return std::make_pair(mapFlag |= + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO, + (fir::isa_trivial(varType) || fir::isa_char(varType)) + ? 
mlir::omp::VariableCaptureKind::ByCopy + : mlir::omp::VariableCaptureKind::ByRef); + break; + case DefMap::ImplicitBehavior::Tofrom: + return std::make_pair(mapFlag |= + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM | + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO, + mlir::omp::VariableCaptureKind::ByRef); + break; + case DefMap::ImplicitBehavior::Default: + llvm_unreachable( + "Implicit None Behaviour Should Have Been Handled Earlier"); + break; + } + + return std::make_pair(mapFlag |= + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM | + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO, + mlir::omp::VariableCaptureKind::ByRef); +} + static mlir::omp::TargetOp genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, lower::StatementContext &stmtCtx, @@ -2247,10 +2388,12 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, hostEvalInfo.emplace_back(); mlir::omp::TargetOperands clauseOps; + DefaultMapsTy defaultMaps; llvm::SmallVector mapSyms, isDevicePtrSyms, hasDeviceAddrSyms; genTargetClauses(converter, semaCtx, symTable, stmtCtx, eval, item->clauses, - loc, clauseOps, hasDeviceAddrSyms, isDevicePtrSyms, mapSyms); + loc, clauseOps, defaultMaps, hasDeviceAddrSyms, + isDevicePtrSyms, mapSyms); DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval, /*shouldCollectPreDeterminedSymbols=*/ @@ -2258,21 +2401,6 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, /*useDelayedPrivatization=*/true, symTable); dsp.processStep1(&clauseOps); - // Check if a value of type `type` can be passed to the kernel by value. - // All kernel parameters are of pointer type, so if the value can be - // represented inside of a pointer, then it can be passed by value. 
- auto isLiteralType = [&](mlir::Type type) { - const mlir::DataLayout &dl = firOpBuilder.getDataLayout(); - mlir::Type ptrTy = - mlir::LLVM::LLVMPointerType::get(&converter.getMLIRContext()); - uint64_t ptrSize = dl.getTypeSize(ptrTy); - uint64_t ptrAlign = dl.getTypePreferredAlignment(ptrTy); - - auto [size, align] = fir::getTypeSizeAndAlignmentOrCrash( - loc, type, dl, converter.getKindMap()); - return size <= ptrSize && align <= ptrAlign; - }; - // 5.8.1 Implicit Data-Mapping Attribute Rules // The following code follows the implicit data-mapping rules to map all the // symbols used inside the region that do not have explicit data-environment @@ -2334,56 +2462,25 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, firOpBuilder, info, dataExv, semantics::IsAssumedSizeArray(sym.GetUltimate()), converter.getCurrentLocation()); - - llvm::omp::OpenMPOffloadMappingFlags mapFlag = - llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT; - mlir::omp::VariableCaptureKind captureKind = - mlir::omp::VariableCaptureKind::ByRef; - mlir::Value baseOp = info.rawInput; mlir::Type eleType = baseOp.getType(); if (auto refType = mlir::dyn_cast(baseOp.getType())) eleType = refType.getElementType(); - // If a variable is specified in declare target link and if device - // type is not specified as `nohost`, it needs to be mapped tofrom - mlir::ModuleOp mod = firOpBuilder.getModule(); - mlir::Operation *op = mod.lookupSymbol(converter.mangleName(sym)); - auto declareTargetOp = - llvm::dyn_cast_if_present(op); - if (declareTargetOp && declareTargetOp.isDeclareTarget()) { - if (declareTargetOp.getDeclareTargetCaptureClause() == - mlir::omp::DeclareTargetCaptureClause::link && - declareTargetOp.getDeclareTargetDeviceType() != - mlir::omp::DeclareTargetDeviceType::nohost) { - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; - } - } else if (fir::isa_trivial(eleType) || 
fir::isa_char(eleType)) { - // Scalars behave as if they were "firstprivate". - // TODO: Handle objects that are shared/lastprivate or were listed - // in an in_reduction clause. - if (isLiteralType(eleType)) { - captureKind = mlir::omp::VariableCaptureKind::ByCopy; - } else { - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; - } - } else if (!fir::isa_builtin_cptr_type(eleType)) { - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; - } - auto location = - mlir::NameLoc::get(mlir::StringAttr::get(firOpBuilder.getContext(), - sym.name().ToString()), - baseOp.getLoc()); + std::pair + mapFlagAndKind = getImplicitMapTypeAndKind( + firOpBuilder, converter, defaultMaps, eleType, loc, sym); + mlir::Value mapOp = createMapInfoOp( - firOpBuilder, location, baseOp, /*varPtrPtr=*/mlir::Value{}, - name.str(), bounds, /*members=*/{}, + firOpBuilder, converter.getCurrentLocation(), baseOp, + /*varPtrPtr=*/mlir::Value{}, name.str(), bounds, /*members=*/{}, /*membersIndex=*/mlir::ArrayAttr{}, static_cast< std::underlying_type_t>( - mapFlag), - captureKind, baseOp.getType(), /*partialMap=*/false, mapperId); + std::get<0>(mapFlagAndKind)), + std::get<1>(mapFlagAndKind), baseOp.getType(), + /*partialMap=*/false, mapperId); clauseOps.mapVars.push_back(mapOp); mapSyms.push_back(&sym); @@ -4062,6 +4159,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, !std::holds_alternative(clause.u) && !std::holds_alternative(clause.u) && !std::holds_alternative(clause.u) && + !std::holds_alternative(clause.u) && !std::holds_alternative(clause.u) && !std::holds_alternative(clause.u) && !std::holds_alternative(clause.u) && diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..202c38696eaa5 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -689,7 +689,7 @@ TYPE_PARSER(construct( // 
[OpenMP 5.0] // 2.19.7.2 defaultmap(implicit-behavior[:variable-category]) // implicit-behavior -> ALLOC | TO | FROM | TOFROM | FIRSRTPRIVATE | NONE | -// DEFAULT +// DEFAULT | PRESENT // variable-category -> ALL | SCALAR | AGGREGATE | ALLOCATABLE | POINTER TYPE_PARSER(construct( construct( @@ -700,7 +700,8 @@ TYPE_PARSER(construct( "FIRSTPRIVATE" >> pure(OmpDefaultmapClause::ImplicitBehavior::Firstprivate) || "NONE" >> pure(OmpDefaultmapClause::ImplicitBehavior::None) || - "DEFAULT" >> pure(OmpDefaultmapClause::ImplicitBehavior::Default)), + "DEFAULT" >> pure(OmpDefaultmapClause::ImplicitBehavior::Default) || + "PRESENT" >> pure(OmpDefaultmapClause::ImplicitBehavior::Present)), maybe(":" >> nonemptyList(Parser{})))) TYPE_PARSER(construct( diff --git a/flang/test/Lower/OpenMP/Todo/defaultmap-clause-firstprivate.f90 b/flang/test/Lower/OpenMP/Todo/defaultmap-clause-firstprivate.f90 new file mode 100644 index 0000000000000..0af2c7f5ea818 --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/defaultmap-clause-firstprivate.f90 @@ -0,0 +1,11 @@ +!RUN: %not_todo_cmd bbc -emit-hlfir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s +!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s + +subroutine f00 + implicit none + integer :: i + !CHECK: not yet implemented: Firstprivate and None are currently unsupported defaultmap behaviour + !$omp target defaultmap(firstprivate) + i = 10 + !$omp end target + end diff --git a/flang/test/Lower/OpenMP/Todo/defaultmap-clause-none.f90 b/flang/test/Lower/OpenMP/Todo/defaultmap-clause-none.f90 new file mode 100644 index 0000000000000..287eb4a9dfe8f --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/defaultmap-clause-none.f90 @@ -0,0 +1,11 @@ +!RUN: %not_todo_cmd bbc -emit-hlfir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s +!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s + +subroutine f00 + implicit none + integer :: i + !CHECK: 
not yet implemented: Firstprivate and None are currently unsupported defaultmap behaviour + !$omp target defaultmap(none) + i = 10 + !$omp end target +end diff --git a/flang/test/Lower/OpenMP/Todo/defaultmap-clause.f90 b/flang/test/Lower/OpenMP/Todo/defaultmap-clause.f90 deleted file mode 100644 index 062399d9a1944..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/defaultmap-clause.f90 +++ /dev/null @@ -1,8 +0,0 @@ -!RUN: %not_todo_cmd bbc -emit-hlfir -fopenmp -fopenmp-version=45 -o - %s 2>&1 | FileCheck %s -!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=45 -o - %s 2>&1 | FileCheck %s - -!CHECK: not yet implemented: DEFAULTMAP clause is not implemented yet -subroutine f00 - !$omp target defaultmap(tofrom:scalar) - !$omp end target -end diff --git a/flang/test/Lower/OpenMP/defaultmap.f90 b/flang/test/Lower/OpenMP/defaultmap.f90 new file mode 100644 index 0000000000000..89d86ac1b8cc9 --- /dev/null +++ b/flang/test/Lower/OpenMP/defaultmap.f90 @@ -0,0 +1,105 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=52 %s -o - | FileCheck %s + +subroutine defaultmap_allocatable_present() + implicit none + integer, dimension(:), allocatable :: arr + +! CHECK: %[[MAP_1:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, i32) map_clauses(implicit, present, exit_release_or_enter_alloc) capture(ByRef) var_ptr_ptr({{.*}}) bounds({{.*}}) -> !fir.llvm_ptr>> {name = ""} +! CHECK: %[[MAP_2:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, !fir.box>>) map_clauses(implicit, to) capture(ByRef) members({{.*}}) -> !fir.ref>>> {name = "arr"} +!$omp target defaultmap(present: allocatable) + arr(1) = 10 +!$omp end target + + return +end subroutine + +subroutine defaultmap_scalar_tofrom() + implicit none + integer :: scalar_int + +! 
CHECK: %[[MAP:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref, i32) map_clauses(implicit, tofrom) capture(ByRef) -> !fir.ref {name = "scalar_int"} + !$omp target defaultmap(tofrom: scalar) + scalar_int = 20 + !$omp end target + + return +end subroutine + +subroutine defaultmap_all_default() + implicit none + integer, dimension(:), allocatable :: arr + integer :: aggregate(16) + integer :: scalar_int + +! CHECK: %[[MAP_1:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref {name = "scalar_int"} +! CHECK: %[[MAP_2:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, i32) map_clauses(implicit, tofrom) capture(ByRef) var_ptr_ptr({{.*}}) bounds({{.*}}) -> !fir.llvm_ptr>> {name = ""} +! CHECK: %[[MAP_3:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, !fir.box>>) map_clauses(implicit, to) capture(ByRef) members({{.*}}) -> !fir.ref>>> {name = "arr"} +! CHECK: %[[MAP_4:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>, !fir.array<16xi32>) map_clauses(implicit, tofrom) capture(ByRef) bounds({{.*}}) -> !fir.ref> {name = "aggregate"} + + !$omp target defaultmap(default: all) + scalar_int = 20 + arr(1) = scalar_int + aggregate(1) + !$omp end target + + return +end subroutine + +subroutine defaultmap_pointer_to() + implicit none + integer, dimension(:), pointer :: arr_ptr(:) + integer :: scalar_int + +! CHECK: %[[MAP_1:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, i32) map_clauses(implicit, to) capture(ByRef) var_ptr_ptr({{.*}}) bounds({{.*}}) -> !fir.llvm_ptr>> {name = ""} +! CHECK: %[[MAP_2:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, !fir.box>>) map_clauses(implicit, to) capture(ByRef) members({{.*}}) -> !fir.ref>>> {name = "arr_ptr"} +! 
CHECK: %[[MAP_3:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref {name = "scalar_int"} + !$omp target defaultmap(to: pointer) + arr_ptr(1) = scalar_int + 20 + !$omp end target + + return +end subroutine + +subroutine defaultmap_scalar_from() + implicit none + integer :: scalar_test + +! CHECK:%[[MAP:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref, i32) map_clauses(implicit, from) capture(ByRef) -> !fir.ref {name = "scalar_test"} + !$omp target defaultmap(from: scalar) + scalar_test = 20 + !$omp end target + + return +end subroutine + +subroutine defaultmap_aggregate_to() + implicit none + integer :: aggregate_arr(16) + integer :: scalar_test + +! CHECK: %[[MAP_1:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref {name = "scalar_test"} +! CHECK: %[[MAP_2:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>, !fir.array<16xi32>) map_clauses(implicit, to) capture(ByRef) bounds({{.*}}) -> !fir.ref> {name = "aggregate_arr"} + !$omp target map(tofrom: scalar_test) defaultmap(to: aggregate) + aggregate_arr(1) = 1 + scalar_test = 1 + !$omp end target + + return +end subroutine + +subroutine defaultmap_dtype_aggregate_to() + implicit none + type :: dtype + integer(4) :: array_i(10) + integer(4) :: k + end type dtype + + type(dtype) :: aggregate_type + +! 
CHECK: %[[MAP:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref,k:i32}>>, !fir.type<_QFdefaultmap_dtype_aggregate_toTdtype{array_i:!fir.array<10xi32>,k:i32}>) map_clauses(implicit, to) capture(ByRef) -> !fir.ref,k:i32}>> {name = "aggregate_type"} + !$omp target defaultmap(to: aggregate) + aggregate_type%k = 40 + aggregate_type%array_i(1) = 50 + !$omp end target + + return +end subroutine diff --git a/flang/test/Parser/OpenMP/defaultmap-clause.f90 b/flang/test/Parser/OpenMP/defaultmap-clause.f90 index dc036aedcd003..d908258fac763 100644 --- a/flang/test/Parser/OpenMP/defaultmap-clause.f90 +++ b/flang/test/Parser/OpenMP/defaultmap-clause.f90 @@ -82,3 +82,19 @@ subroutine f04 !PARSE-TREE: | OmpClauseList -> OmpClause -> Defaultmap -> OmpDefaultmapClause !PARSE-TREE: | | ImplicitBehavior = Tofrom !PARSE-TREE: | | Modifier -> OmpVariableCategory -> Value = Scalar + +subroutine f05 + !$omp target defaultmap(present: scalar) + !$omp end target +end + +!UNPARSE: SUBROUTINE f05 +!UNPARSE: !$OMP TARGET DEFAULTMAP(PRESENT:SCALAR) +!UNPARSE: !$OMP END TARGET +!UNPARSE: END SUBROUTINE + +!PARSE-TREE: OmpBeginBlockDirective +!PARSE-TREE: | OmpBlockDirective -> llvm::omp::Directive = target +!PARSE-TREE: | OmpClauseList -> OmpClause -> Defaultmap -> OmpDefaultmapClause +!PARSE-TREE: | | ImplicitBehavior = Present +!PARSE-TREE: | | Modifier -> OmpVariableCategory -> Value = Scalar diff --git a/offload/test/offloading/fortran/target-defaultmap-present.f90 b/offload/test/offloading/fortran/target-defaultmap-present.f90 new file mode 100644 index 0000000000000..3342db21f15c8 --- /dev/null +++ b/offload/test/offloading/fortran/target-defaultmap-present.f90 @@ -0,0 +1,34 @@ +! This checks that the basic functionality of setting the implicit mapping +! behaviour of a target region to present incurs the present behaviour for +! the implicit map capture. +! REQUIRES: flang, amdgpu +! RUN: %libomptarget-compile-fortran-generic +! RUN: %libomptarget-run-fail-generic 2>&1 \ +! 
RUN: | %fcheck-generic + +! NOTE: This should intentionally fatal error in omptarget as it's not +! present, as is intended. +subroutine target_data_not_present() + implicit none + double precision, dimension(:), allocatable :: arr + integer, parameter :: N = 16 + integer :: i + + allocate(arr(N)) + +!$omp target defaultmap(present: allocatable) + do i = 1,N + arr(i) = 42.0d0 + end do +!$omp end target + + deallocate(arr) + return +end subroutine + +program map_present + implicit none + call target_data_not_present() +end program + +!CHECK: omptarget message: device mapping required by 'present' map type modifier does not exist for host address{{.*}} diff --git a/offload/test/offloading/fortran/target-defaultmap.f90 b/offload/test/offloading/fortran/target-defaultmap.f90 new file mode 100644 index 0000000000000..d7184371129d2 --- /dev/null +++ b/offload/test/offloading/fortran/target-defaultmap.f90 @@ -0,0 +1,166 @@ +! Offloading test checking the use of the depend clause on the target construct +! REQUIRES: flang, amdgcn-amd-amdhsa +! UNSUPPORTED: nvptx64-nvidia-cuda +! UNSUPPORTED: nvptx64-nvidia-cuda-LTO +! UNSUPPORTED: aarch64-unknown-linux-gnu +! UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +! UNSUPPORTED: x86_64-unknown-linux-gnu +! UNSUPPORTED: x86_64-unknown-linux-gnu-LTO + +! 
RUN: %libomptarget-compile-fortran-run-and-check-generic +subroutine defaultmap_allocatable_present() + implicit none + integer, dimension(:), allocatable :: arr + integer :: N = 16 + integer :: i + + allocate(arr(N)) + +!$omp target enter data map(to: arr) + +!$omp target defaultmap(present: allocatable) + do i = 1,N + arr(i) = N + 40 + end do +!$omp end target + +!$omp target exit data map(from: arr) + + print *, arr + deallocate(arr) + + return +end subroutine + +subroutine defaultmap_scalar_tofrom() + implicit none + integer :: scalar_int + scalar_int = 10 + + !$omp target defaultmap(tofrom: scalar) + scalar_int = 20 + !$omp end target + + print *, scalar_int + return +end subroutine + +subroutine defaultmap_all_default() + implicit none + integer, dimension(:), allocatable :: arr + integer :: aggregate(16) + integer :: N = 16 + integer :: i, scalar_int + + allocate(arr(N)) + + scalar_int = 10 + aggregate = scalar_int + + !$omp target defaultmap(default: all) + scalar_int = 20 + do i = 1,N + arr(i) = scalar_int + aggregate(i) + end do + !$omp end target + + print *, scalar_int + print *, arr + + deallocate(arr) + return +end subroutine + +subroutine defaultmap_pointer_to() + implicit none + integer, dimension(:), pointer :: arr_ptr(:) + integer :: scalar_int, i + allocate(arr_ptr(10)) + arr_ptr = 10 + scalar_int = 20 + + !$omp target defaultmap(to: pointer) + do i = 1,10 + arr_ptr(i) = scalar_int + 20 + end do + !$omp end target + + print *, arr_ptr + deallocate(arr_ptr) + return +end subroutine + +subroutine defaultmap_scalar_from() + implicit none + integer :: scalar_test + scalar_test = 10 + !$omp target defaultmap(from: scalar) + scalar_test = 20 + !$omp end target + + print *, scalar_test + return +end subroutine + +subroutine defaultmap_aggregate_to() + implicit none + integer :: aggregate_arr(16) + integer :: i, scalar_test = 0 + aggregate_arr = 0 + !$omp target map(tofrom: scalar_test) defaultmap(to: aggregate) + do i = 1,16 + aggregate_arr(i) = i + 
scalar_test = scalar_test + aggregate_arr(i) + enddo + !$omp end target + + print *, scalar_test + print *, aggregate_arr + return +end subroutine + +subroutine defaultmap_dtype_aggregate_to() + implicit none + type :: dtype + real(4) :: i + real(4) :: j + integer(4) :: array_i(10) + integer(4) :: k + integer(4) :: array_j(10) + end type dtype + + type(dtype) :: aggregate_type + + aggregate_type%k = 20 + aggregate_type%array_i = 30 + + !$omp target defaultmap(to: aggregate) + aggregate_type%k = 40 + aggregate_type%array_i(1) = 50 + !$omp end target + + print *, aggregate_type%k + print *, aggregate_type%array_i(1) + return +end subroutine + +program map_present + implicit none +! CHECK: 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 + call defaultmap_allocatable_present() +! CHECK: 20 + call defaultmap_scalar_tofrom() +! CHECK: 10 +! CHECK: 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 + call defaultmap_all_default() +! CHECK: 10 10 10 10 10 10 10 10 10 10 + call defaultmap_pointer_to() +! CHECK: 20 + call defaultmap_scalar_from() +! CHECK: 136 +! CHECK: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 + call defaultmap_aggregate_to() +! CHECK: 20 +! CHECK: 30 + call defaultmap_dtype_aggregate_to() +end program From flang-commits at lists.llvm.org Thu May 1 17:46:20 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 17:46:20 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <681415dc.170a0220.1879a9.0fdd@mx.google.com> agozillon wrote: Thank you very much for the review, @skatrak; it's greatly appreciated! I've now updated the PR with all of the above changes except the IndexedMap one, where I've left a comment explaining my current reasoning! 
:-) https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Thu May 1 17:41:41 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Thu, 01 May 2025 17:41:41 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] Revert "[flang][cuda] Use a reference for asyncObject" (PR #138221) Message-ID: https://github.com/clementval created https://github.com/llvm/llvm-project/pull/138221 Reverts llvm/llvm-project#138186 >From 92beccd4dff8207a56938980e60461e60abb8dbc Mon Sep 17 00:00:00 2001 From: Valentin Clement (バレンタイン クレメン) Date: Thu, 1 May 2025 17:41:29 -0700 Subject: [PATCH] Revert "[flang][cuda] Use a reference for asyncObject (#138186)" This reverts commit 7f922f1400f00a73d1618e3f17556704c6b9436d. --- .../flang-rt/runtime/allocator-registry.h | 4 +-- .../include/flang-rt/runtime/descriptor.h | 6 ++--- .../flang-rt/runtime/reduction-templates.h | 2 +- flang-rt/lib/cuda/allocatable.cpp | 8 +++--- flang-rt/lib/cuda/allocator.cpp | 20 +++++++------- flang-rt/lib/cuda/descriptor.cpp | 2 +- flang-rt/lib/runtime/allocatable.cpp | 12 ++++----- flang-rt/lib/runtime/array-constructor.cpp | 4 +-- flang-rt/lib/runtime/assign.cpp | 4 +-- flang-rt/lib/runtime/character.cpp | 20 +++++++------- flang-rt/lib/runtime/copy.cpp | 4 +-- flang-rt/lib/runtime/derived.cpp | 6 ++--- flang-rt/lib/runtime/descriptor.cpp | 4 +-- flang-rt/lib/runtime/extrema.cpp | 4 +-- flang-rt/lib/runtime/findloc.cpp | 2 +- flang-rt/lib/runtime/matmul-transpose.cpp | 2 +- flang-rt/lib/runtime/matmul.cpp | 2 +- flang-rt/lib/runtime/misc-intrinsic.cpp | 2 +- flang-rt/lib/runtime/pointer.cpp | 2 +- flang-rt/lib/runtime/temporary-stack.cpp | 2 +- flang-rt/lib/runtime/tools.cpp | 2 +- flang-rt/lib/runtime/transformational.cpp | 4 +-- flang-rt/unittests/Evaluate/reshape.cpp | 2 
+- flang-rt/unittests/Runtime/Allocatable.cpp | 4 +-- .../unittests/Runtime/CUDA/Allocatable.cpp | 12 +++------ .../unittests/Runtime/CUDA/AllocatorCUF.cpp | 4 +-- flang-rt/unittests/Runtime/CUDA/Memory.cpp | 4 +-- flang-rt/unittests/Runtime/CharacterTest.cpp | 2 +- flang-rt/unittests/Runtime/CommandTest.cpp | 8 +++--- flang-rt/unittests/Runtime/TemporaryStack.cpp | 4 +-- flang-rt/unittests/Runtime/tools.h | 2 +- .../flang/Optimizer/Dialect/CUF/CUFOps.td | 11 ++++---- .../include/flang/Runtime/CUDA/allocatable.h | 8 +++--- flang/include/flang/Runtime/CUDA/allocator.h | 8 +++--- flang/include/flang/Runtime/CUDA/pointer.h | 8 +++--- flang/include/flang/Runtime/allocatable.h | 7 +++-- flang/lib/Lower/Allocatable.cpp | 2 +- .../Optimizer/Builder/Runtime/Allocatable.cpp | 7 ++--- flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp | 22 ++++++++-------- .../Optimizer/Transforms/CUFOpConversion.cpp | 10 ++++--- flang/test/Fir/CUDA/cuda-allocate.fir | 18 +++++++------ flang/test/Fir/cuf-invalid.fir | 5 ++-- flang/test/Fir/cuf.mlir | 7 ++--- flang/test/HLFIR/elemental-codegen.fir | 6 ++--- flang/test/Lower/CUDA/cuda-allocatable.cuf | 9 ++++--- .../acc-declare-unwrap-defaultbounds.f90 | 4 +-- flang/test/Lower/OpenACC/acc-declare.f90 | 4 +-- flang/test/Lower/allocatable-polymorphic.f90 | 26 +++++++++---------- flang/test/Lower/allocatable-runtime.f90 | 4 +-- flang/test/Lower/allocate-mold.f90 | 4 +-- flang/test/Lower/polymorphic.f90 | 2 +- flang/test/Transforms/lower-repack-arrays.fir | 8 +++--- 52 files changed, 171 insertions(+), 169 deletions(-) diff --git a/flang-rt/include/flang-rt/runtime/allocator-registry.h b/flang-rt/include/flang-rt/runtime/allocator-registry.h index f0ba77a360736..33e8e2c7d7850 100644 --- a/flang-rt/include/flang-rt/runtime/allocator-registry.h +++ b/flang-rt/include/flang-rt/runtime/allocator-registry.h @@ -19,7 +19,7 @@ namespace Fortran::runtime { -using AllocFct = void *(*)(std::size_t, std::int64_t *); +using AllocFct = void *(*)(std::size_t, 
std::int64_t); using FreeFct = void (*)(void *); typedef struct Allocator_t { @@ -28,7 +28,7 @@ typedef struct Allocator_t { } Allocator_t; static RT_API_ATTRS void *MallocWrapper( - std::size_t size, [[maybe_unused]] std::int64_t *) { + std::size_t size, [[maybe_unused]] std::int64_t) { return std::malloc(size); } #ifdef RT_DEVICE_COMPILATION diff --git a/flang-rt/include/flang-rt/runtime/descriptor.h b/flang-rt/include/flang-rt/runtime/descriptor.h index c98e6b14850cb..9907e7866e7bf 100644 --- a/flang-rt/include/flang-rt/runtime/descriptor.h +++ b/flang-rt/include/flang-rt/runtime/descriptor.h @@ -29,8 +29,8 @@ #include #include -/// Value used for asyncObject when no specific stream is specified. -static constexpr std::int64_t *kNoAsyncObject = nullptr; +/// Value used for asyncId when no specific stream is specified. +static constexpr std::int64_t kNoAsyncId = -1; namespace Fortran::runtime { @@ -372,7 +372,7 @@ class Descriptor { // before calling. It (re)computes the byte strides after // allocation. Does not allocate automatic components or // perform default component initialization. - RT_API_ATTRS int Allocate(std::int64_t *asyncObject); + RT_API_ATTRS int Allocate(std::int64_t asyncId); RT_API_ATTRS void SetByteStrides(); // Deallocates storage; does not call FINAL subroutines or diff --git a/flang-rt/include/flang-rt/runtime/reduction-templates.h b/flang-rt/include/flang-rt/runtime/reduction-templates.h index 18412708b02c5..77f77a592a476 100644 --- a/flang-rt/include/flang-rt/runtime/reduction-templates.h +++ b/flang-rt/include/flang-rt/runtime/reduction-templates.h @@ -347,7 +347,7 @@ inline RT_API_ATTRS void DoMaxMinNorm2(Descriptor &result, const Descriptor &x, // as the element size of the source. 
result.Establish(x.type(), x.ElementBytes(), nullptr, 0, nullptr, CFI_attribute_allocatable); - if (int stat{result.Allocate(kNoAsyncObject)}) { + if (int stat{result.Allocate(kNoAsyncId)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/cuda/allocatable.cpp b/flang-rt/lib/cuda/allocatable.cpp index c77819e9440d7..432974d18a3e3 100644 --- a/flang-rt/lib/cuda/allocatable.cpp +++ b/flang-rt/lib/cuda/allocatable.cpp @@ -23,7 +23,7 @@ namespace Fortran::runtime::cuda { extern "C" { RT_EXT_API_GROUP_BEGIN -int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t *stream, +int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( @@ -41,7 +41,7 @@ int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t *stream, return stat; } -int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t *stream, +int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { if (desc.HasAddendum()) { @@ -63,7 +63,7 @@ int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t *stream, } int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; @@ -76,7 +76,7 @@ int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, } int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, const Descriptor 
*errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocateSync)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; diff --git a/flang-rt/lib/cuda/allocator.cpp b/flang-rt/lib/cuda/allocator.cpp index f4289c55bd8de..51119ab251168 100644 --- a/flang-rt/lib/cuda/allocator.cpp +++ b/flang-rt/lib/cuda/allocator.cpp @@ -98,7 +98,7 @@ static unsigned findAllocation(void *ptr) { return allocNotFound; } -static void insertAllocation(void *ptr, std::size_t size, cudaStream_t stream) { +static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { CriticalSection critical{lock}; initAllocations(); if (numDeviceAllocations >= maxDeviceAllocations) { @@ -106,7 +106,7 @@ static void insertAllocation(void *ptr, std::size_t size, cudaStream_t stream) { } deviceAllocations[numDeviceAllocations].ptr = ptr; deviceAllocations[numDeviceAllocations].size = size; - deviceAllocations[numDeviceAllocations].stream = stream; + deviceAllocations[numDeviceAllocations].stream = (cudaStream_t)stream; ++numDeviceAllocations; qsort(deviceAllocations, numDeviceAllocations, sizeof(DeviceAllocation), compareDeviceAlloc); @@ -136,7 +136,7 @@ void RTDEF(CUFRegisterAllocator)() { } void *CUFAllocPinned( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { void *p; CUDA_REPORT_IF_ERROR(cudaMallocHost((void **)&p, sizeInBytes)); return p; @@ -144,18 +144,18 @@ void *CUFAllocPinned( void CUFFreePinned(void *p) { CUDA_REPORT_IF_ERROR(cudaFreeHost(p)); } -void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t *asyncObject) { +void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t asyncId) { void *p; if (Fortran::runtime::executionEnvironment.cudaDeviceIsManaged) { CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); } else { - if (asyncObject == kNoAsyncObject) { + if (asyncId == kNoAsyncId) { 
CUDA_REPORT_IF_ERROR(cudaMalloc(&p, sizeInBytes)); } else { CUDA_REPORT_IF_ERROR( - cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)*asyncObject)); - insertAllocation(p, sizeInBytes, (cudaStream_t)*asyncObject); + cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)asyncId)); + insertAllocation(p, sizeInBytes, asyncId); } } return p; @@ -174,7 +174,7 @@ void CUFFreeDevice(void *p) { } void *CUFAllocManaged( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { void *p; CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); @@ -184,9 +184,9 @@ void *CUFAllocManaged( void CUFFreeManaged(void *p) { CUDA_REPORT_IF_ERROR(cudaFree(p)); } void *CUFAllocUnified( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { // Call alloc managed for the time being. - return CUFAllocManaged(sizeInBytes, asyncObject); + return CUFAllocManaged(sizeInBytes, asyncId); } void CUFFreeUnified(void *p) { diff --git a/flang-rt/lib/cuda/descriptor.cpp b/flang-rt/lib/cuda/descriptor.cpp index 7b768f91af29d..175e8c0ef8438 100644 --- a/flang-rt/lib/cuda/descriptor.cpp +++ b/flang-rt/lib/cuda/descriptor.cpp @@ -21,7 +21,7 @@ RT_EXT_API_GROUP_BEGIN Descriptor *RTDEF(CUFAllocDescriptor)( std::size_t sizeInBytes, const char *sourceFile, int sourceLine) { return reinterpret_cast( - CUFAllocManaged(sizeInBytes, /*asyncObject=*/nullptr)); + CUFAllocManaged(sizeInBytes, /*asyncId*/ -1)); } void RTDEF(CUFFreeDescriptor)( diff --git a/flang-rt/lib/runtime/allocatable.cpp b/flang-rt/lib/runtime/allocatable.cpp index ef18da6ea0786..6acce34eb9a9e 100644 --- a/flang-rt/lib/runtime/allocatable.cpp +++ b/flang-rt/lib/runtime/allocatable.cpp @@ -133,17 +133,17 @@ void RTDEF(AllocatableApplyMold)( } } -int RTDEF(AllocatableAllocate)(Descriptor &descriptor, - std::int64_t *asyncObject, bool hasStat, const Descriptor 
*errMsg, - const char *sourceFile, int sourceLine) { +int RTDEF(AllocatableAllocate)(Descriptor &descriptor, std::int64_t asyncId, + bool hasStat, const Descriptor *errMsg, const char *sourceFile, + int sourceLine) { Terminator terminator{sourceFile, sourceLine}; if (!descriptor.IsAllocatable()) { return ReturnError(terminator, StatInvalidDescriptor, errMsg, hasStat); } else if (descriptor.IsAllocated()) { return ReturnError(terminator, StatBaseNotNull, errMsg, hasStat); } else { - int stat{ReturnError( - terminator, descriptor.Allocate(asyncObject), errMsg, hasStat)}; + int stat{ + ReturnError(terminator, descriptor.Allocate(asyncId), errMsg, hasStat)}; if (stat == StatOk) { if (const DescriptorAddendum * addendum{descriptor.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -162,7 +162,7 @@ int RTDEF(AllocatableAllocateSource)(Descriptor &alloc, const Descriptor &source, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(AllocatableAllocate)( - alloc, /*asyncObject=*/nullptr, hasStat, errMsg, sourceFile, sourceLine)}; + alloc, /*asyncId=*/-1, hasStat, errMsg, sourceFile, sourceLine)}; if (stat == StatOk) { Terminator terminator{sourceFile, sourceLine}; DoFromSourceAssign(alloc, source, terminator); diff --git a/flang-rt/lib/runtime/array-constructor.cpp b/flang-rt/lib/runtime/array-constructor.cpp index 858fac7bf2b39..67b3b5e1e0f50 100644 --- a/flang-rt/lib/runtime/array-constructor.cpp +++ b/flang-rt/lib/runtime/array-constructor.cpp @@ -50,7 +50,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( initialAllocationSize(fromElements, to.ElementBytes())}; to.GetDimension(0).SetBounds(1, allocationSize); RTNAME(AllocatableAllocate) - (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); to.GetDimension(0).SetBounds(1, fromElements); vector.actualAllocationSize = 
allocationSize; @@ -59,7 +59,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( // first value: there should be no reallocation. RUNTIME_CHECK(terminator, previousToElements >= fromElements); RTNAME(AllocatableAllocate) - (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); vector.actualAllocationSize = previousToElements; } diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 8a4fa36c91479..4a813cd489022 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -99,7 +99,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; + int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; if (result == StatOk && derived && !derived->noInitializationNeeded()) { result = ReturnError(terminator, Initialize(to, *derived, terminator)); } @@ -277,7 +277,7 @@ RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, // entity, otherwise, the Deallocate() below will not // free the descriptor memory. 
newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; + auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; if (stat == StatOk) { if (HasDynamicComponent(from)) { // If 'from' has allocatable/automatic component, we cannot diff --git a/flang-rt/lib/runtime/character.cpp b/flang-rt/lib/runtime/character.cpp index f140d202e118e..d1152ee1caefb 100644 --- a/flang-rt/lib/runtime/character.cpp +++ b/flang-rt/lib/runtime/character.cpp @@ -118,7 +118,7 @@ static RT_API_ATTRS void Compare(Descriptor &result, const Descriptor &x, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { terminator.Crash("Compare: could not allocate storage for result"); } std::size_t xChars{x.ElementBytes() >> shift}; @@ -173,7 +173,7 @@ static RT_API_ATTRS void AdjustLRHelper(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { terminator.Crash("ADJUSTL/R: could not allocate storage for result"); } for (SubscriptValue resultAt{0}; elements-- > 0; @@ -227,7 +227,7 @@ static RT_API_ATTRS void LenTrim(Descriptor &result, const Descriptor &string, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { terminator.Crash("LEN_TRIM: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -427,7 +427,7 @@ static RT_API_ATTRS void GeneralCharFunc(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { 
terminator.Crash("SCAN/VERIFY: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -530,8 +530,7 @@ static RT_API_ATTRS void MaxMinHelper(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - RUNTIME_CHECK( - terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); } for (CHAR *result{accumulator.OffsetElement()}; elements-- > 0; accumData += accumChars, result += chars, x.IncrementSubscripts(xAt)) { @@ -607,7 +606,7 @@ void RTDEF(CharacterConcatenate)(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - if (accumulator.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (accumulator.Allocate(kNoAsyncId) != CFI_SUCCESS) { terminator.Crash( "CharacterConcatenate: could not allocate storage for result"); } @@ -630,8 +629,7 @@ void RTDEF(CharacterConcatenateScalar1)( accumulator.set_base_addr(nullptr); std::size_t oldLen{accumulator.ElementBytes()}; accumulator.raw().elem_len += chars; - RUNTIME_CHECK( - terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); std::memcpy(accumulator.OffsetElement(oldLen), from, chars); FreeMemory(old); } @@ -833,7 +831,7 @@ void RTDEF(Repeat)(Descriptor &result, const Descriptor &string, std::size_t origBytes{string.ElementBytes()}; result.Establish(string.type(), origBytes * ncopies, nullptr, 0, nullptr, CFI_attribute_allocatable); - if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { terminator.Crash("REPEAT could not allocate storage for result"); } const char *from{string.OffsetElement()}; @@ -867,7 +865,7 @@ void RTDEF(Trim)(Descriptor &result, const Descriptor &string, } result.Establish(string.type(), resultBytes, nullptr, 0, nullptr, 
CFI_attribute_allocatable); - RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncId) == CFI_SUCCESS); std::memcpy(result.OffsetElement(), string.OffsetElement(), resultBytes); } diff --git a/flang-rt/lib/runtime/copy.cpp b/flang-rt/lib/runtime/copy.cpp index f990f46e0be66..3a0f98cf8d376 100644 --- a/flang-rt/lib/runtime/copy.cpp +++ b/flang-rt/lib/runtime/copy.cpp @@ -171,8 +171,8 @@ RT_API_ATTRS void CopyElement(const Descriptor &to, const SubscriptValue toAt[], *reinterpret_cast(toPtr + component->offset())}; if (toDesc.raw().base_addr != nullptr) { toDesc.set_base_addr(nullptr); - RUNTIME_CHECK(terminator, - toDesc.Allocate(/*asyncObject=*/nullptr) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, toDesc.Allocate(/*asyncId=*/-1) == CFI_SUCCESS); const Descriptor &fromDesc{*reinterpret_cast( fromPtr + component->offset())}; copyStack.emplace(toDesc, fromDesc); diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index 35037036f63e7..c46ea806a430a 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -52,7 +52,7 @@ RT_API_ATTRS int Initialize(const Descriptor &instance, allocDesc.raw().attribute = CFI_attribute_allocatable; if (comp.genre() == typeInfo::Component::Genre::Automatic) { stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); + terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -153,7 +153,7 @@ RT_API_ATTRS int InitializeClone(const Descriptor &clone, if (origDesc.IsAllocated()) { cloneDesc.ApplyMold(origDesc, origDesc.rank()); stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); + terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * 
addendum{cloneDesc.Addendum()}) { if (const typeInfo::DerivedType * @@ -260,7 +260,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy.raw().attribute = CFI_attribute_allocatable; Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); + copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } diff --git a/flang-rt/lib/runtime/descriptor.cpp b/flang-rt/lib/runtime/descriptor.cpp index 67336d01380e0..3debf53bb5290 100644 --- a/flang-rt/lib/runtime/descriptor.cpp +++ b/flang-rt/lib/runtime/descriptor.cpp @@ -158,7 +158,7 @@ RT_API_ATTRS static inline int MapAllocIdx(const Descriptor &desc) { #endif } -RT_API_ATTRS int Descriptor::Allocate(std::int64_t *asyncObject) { +RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { std::size_t elementBytes{ElementBytes()}; if (static_cast(elementBytes) < 0) { // F'2023 7.4.4.2 p5: "If the character length parameter value evaluates @@ -170,7 +170,7 @@ RT_API_ATTRS int Descriptor::Allocate(std::int64_t *asyncObject) { // Zero size allocation is possible in Fortran and the resulting // descriptor must be allocated/associated. Since std::malloc(0) // result is implementation defined, always allocate at least one byte. - void *p{alloc(byteSize ? 
byteSize : 1, asyncId)}; if (!p) { return CFI_ERROR_MEM_ALLOCATION; } diff --git a/flang-rt/lib/runtime/extrema.cpp b/flang-rt/lib/runtime/extrema.cpp index 03e574a8fbff1..4c7f8e8b99e8f 100644 --- a/flang-rt/lib/runtime/extrema.cpp +++ b/flang-rt/lib/runtime/extrema.cpp @@ -152,7 +152,7 @@ inline RT_API_ATTRS void CharacterMaxOrMinLoc(const char *intrinsic, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncObject)}) { + if (int stat{result.Allocate(kNoAsyncId)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } @@ -181,7 +181,7 @@ inline RT_API_ATTRS void TotalNumericMaxOrMinLoc(const char *intrinsic, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncObject)}) { + if (int stat{result.Allocate(kNoAsyncId)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/runtime/findloc.cpp b/flang-rt/lib/runtime/findloc.cpp index 5485f4b97bd2f..e3e98953b0cfc 100644 --- a/flang-rt/lib/runtime/findloc.cpp +++ b/flang-rt/lib/runtime/findloc.cpp @@ -220,7 +220,7 @@ void RTDEF(Findloc)(Descriptor &result, const Descriptor &x, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncObject)}) { + if (int stat{result.Allocate(kNoAsyncId)}) { terminator.Crash( "FINDLOC: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/matmul-transpose.cpp b/flang-rt/lib/runtime/matmul-transpose.cpp index c9e21502b629e..17987fb73d943 100644 --- a/flang-rt/lib/runtime/matmul-transpose.cpp +++ b/flang-rt/lib/runtime/matmul-transpose.cpp @@ -183,7 +183,7 @@ inline static RT_API_ATTRS void DoMatmulTranspose( for (int j{0}; j < resRank; ++j) { 
result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncObject)}) { + if (int stat{result.Allocate(kNoAsyncId)}) { terminator.Crash( "MATMUL-TRANSPOSE: could not allocate memory for result; STAT=%d", stat); diff --git a/flang-rt/lib/runtime/matmul.cpp b/flang-rt/lib/runtime/matmul.cpp index 5acb345725212..0ff92cecbbcb8 100644 --- a/flang-rt/lib/runtime/matmul.cpp +++ b/flang-rt/lib/runtime/matmul.cpp @@ -255,7 +255,7 @@ static inline RT_API_ATTRS void DoMatmul( for (int j{0}; j < resRank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncObject)}) { + if (int stat{result.Allocate(kNoAsyncId)}) { terminator.Crash( "MATMUL: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/misc-intrinsic.cpp b/flang-rt/lib/runtime/misc-intrinsic.cpp index a8797f48fa667..2fde859869ef0 100644 --- a/flang-rt/lib/runtime/misc-intrinsic.cpp +++ b/flang-rt/lib/runtime/misc-intrinsic.cpp @@ -30,7 +30,7 @@ static RT_API_ATTRS void TransferImpl(Descriptor &result, if (const DescriptorAddendum * addendum{mold.Addendum()}) { *result.Addendum() = *addendum; } - if (int stat{result.Allocate(kNoAsyncObject)}) { + if (int stat{result.Allocate(kNoAsyncId)}) { Terminator{sourceFile, line}.Crash( "TRANSFER: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/pointer.cpp b/flang-rt/lib/runtime/pointer.cpp index 7331f7bbc3a75..fd2427f4124b5 100644 --- a/flang-rt/lib/runtime/pointer.cpp +++ b/flang-rt/lib/runtime/pointer.cpp @@ -129,7 +129,7 @@ RT_API_ATTRS void *AllocateValidatedPointerPayload( byteSize = ((byteSize + align - 1) / align) * align; std::size_t total{byteSize + sizeof(std::uintptr_t)}; AllocFct alloc{allocatorRegistry.GetAllocator(allocatorIdx)}; - void *p{alloc(total, /*asyncObject=*/nullptr)}; + void *p{alloc(total, /*asyncId=*/-1)}; if (p && allocatorIdx == 0) { // Fill the footer word with the XOR of the ones' complement of // 
the base address, which is a value that would be highly unlikely diff --git a/flang-rt/lib/runtime/temporary-stack.cpp b/flang-rt/lib/runtime/temporary-stack.cpp index 3f6fd8ee15a80..3a952b1fdbcca 100644 --- a/flang-rt/lib/runtime/temporary-stack.cpp +++ b/flang-rt/lib/runtime/temporary-stack.cpp @@ -148,7 +148,7 @@ void DescriptorStorage::push(const Descriptor &source) { if constexpr (COPY_VALUES) { // copy the data pointed to by the box box.set_base_addr(nullptr); - box.Allocate(kNoAsyncObject); + box.Allocate(kNoAsyncId); RTNAME(AssignTemporary) (box, source, terminator_.sourceFileName(), terminator_.sourceLine()); } diff --git a/flang-rt/lib/runtime/tools.cpp b/flang-rt/lib/runtime/tools.cpp index 1f965b0b151ce..5d6e35faca70a 100644 --- a/flang-rt/lib/runtime/tools.cpp +++ b/flang-rt/lib/runtime/tools.cpp @@ -261,7 +261,7 @@ RT_API_ATTRS void CreatePartialReductionResult(Descriptor &result, for (int j{0}; j + 1 < xRank; ++j) { result.GetDimension(j).SetBounds(1, resultExtent[j]); } - if (int stat{result.Allocate(kNoAsyncObject)}) { + if (int stat{result.Allocate(kNoAsyncId)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/runtime/transformational.cpp b/flang-rt/lib/runtime/transformational.cpp index 3df314a4e966b..a7d5a48530ee9 100644 --- a/flang-rt/lib/runtime/transformational.cpp +++ b/flang-rt/lib/runtime/transformational.cpp @@ -132,7 +132,7 @@ static inline RT_API_ATTRS std::size_t AllocateResult(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncObject)}) { + if (int stat{result.Allocate(kNoAsyncId)}) { terminator.Crash( "%s: Could not allocate memory for result (stat=%d)", function, stat); } @@ -157,7 +157,7 @@ static inline RT_API_ATTRS std::size_t AllocateBesselResult(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int 
stat{result.Allocate(kNoAsyncObject)}) { + if (int stat{result.Allocate(kNoAsyncId)}) { terminator.Crash( "%s: Could not allocate memory for result (stat=%d)", function, stat); } diff --git a/flang-rt/unittests/Evaluate/reshape.cpp b/flang-rt/unittests/Evaluate/reshape.cpp index f84de443965d1..67a0be124e8e0 100644 --- a/flang-rt/unittests/Evaluate/reshape.cpp +++ b/flang-rt/unittests/Evaluate/reshape.cpp @@ -26,7 +26,7 @@ int main() { for (int j{0}; j < 3; ++j) { source->GetDimension(j).SetBounds(1, sourceExtent[j]); } - TEST(source->Allocate(kNoAsyncObject) == CFI_SUCCESS); + TEST(source->Allocate(kNoAsyncId) == CFI_SUCCESS); TEST(source->IsAllocated()); MATCH(2, source->GetDimension(0).Extent()); MATCH(3, source->GetDimension(1).Extent()); diff --git a/flang-rt/unittests/Runtime/Allocatable.cpp b/flang-rt/unittests/Runtime/Allocatable.cpp index b394312e5bc5a..a6fcdd0d1423c 100644 --- a/flang-rt/unittests/Runtime/Allocatable.cpp +++ b/flang-rt/unittests/Runtime/Allocatable.cpp @@ -26,7 +26,7 @@ TEST(AllocatableTest, MoveAlloc) { auto b{createAllocatable(TypeCategory::Integer, 4)}; // ALLOCATE(a(20)) a->GetDimension(0).SetBounds(1, 20); - a->Allocate(kNoAsyncObject); + a->Allocate(kNoAsyncId); EXPECT_TRUE(a->IsAllocated()); EXPECT_FALSE(b->IsAllocated()); @@ -46,7 +46,7 @@ TEST(AllocatableTest, MoveAlloc) { // move_alloc with errMsg auto errMsg{Descriptor::Create( sizeof(char), 64, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - errMsg->Allocate(kNoAsyncObject); + errMsg->Allocate(kNoAsyncId); RTNAME(MoveAlloc)(*b, *a, nullptr, false, errMsg.get(), __FILE__, __LINE__); EXPECT_FALSE(a->IsAllocated()); EXPECT_TRUE(b->IsAllocated()); diff --git a/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp b/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp index 9935ae0eaac2f..89649aa95ad93 100644 --- a/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp +++ b/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp @@ -42,8 +42,7 @@ TEST(AllocatableCUFTest, SimpleDeviceAllocatable) { 
CUDA_REPORT_IF_ERROR(cudaMalloc(&device_desc, a->SizeInBytes())); RTNAME(AllocatableAllocate) - (*a, kNoAsyncObject, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, - __LINE__); + (*a, kNoAsyncId, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); EXPECT_TRUE(a->IsAllocated()); RTNAME(CUFDescriptorSync)(device_desc, a.get(), __FILE__, __LINE__); cudaDeviceSynchronize(); @@ -83,22 +82,19 @@ TEST(AllocatableCUFTest, StreamDeviceAllocatable) { RTNAME(AllocatableSetBounds)(*c, 0, 1, 100); RTNAME(AllocatableAllocate) - (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, - __LINE__); + (*a, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); EXPECT_TRUE(a->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); RTNAME(AllocatableAllocate) - (*b, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, - __LINE__); + (*b, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); EXPECT_TRUE(b->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); RTNAME(AllocatableAllocate) - (*c, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, - __LINE__); + (*c, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); EXPECT_TRUE(c->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); diff --git a/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp b/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp index f1f931e87a86e..2f1dc64dc8c5a 100644 --- a/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp +++ b/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp @@ -35,7 +35,7 @@ TEST(AllocatableCUFTest, SimpleDeviceAllocate) { EXPECT_FALSE(a->HasAddendum()); RTNAME(AllocatableSetBounds)(*a, 0, 1, 10); RTNAME(AllocatableAllocate) - (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + (*a, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); 
EXPECT_TRUE(a->IsAllocated()); RTNAME(AllocatableDeallocate) @@ -54,7 +54,7 @@ TEST(AllocatableCUFTest, SimplePinnedAllocate) { EXPECT_FALSE(a->HasAddendum()); RTNAME(AllocatableSetBounds)(*a, 0, 1, 10); RTNAME(AllocatableAllocate) - (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + (*a, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); EXPECT_TRUE(a->IsAllocated()); RTNAME(AllocatableDeallocate) diff --git a/flang-rt/unittests/Runtime/CUDA/Memory.cpp b/flang-rt/unittests/Runtime/CUDA/Memory.cpp index 7915baca6c203..b3612073657ab 100644 --- a/flang-rt/unittests/Runtime/CUDA/Memory.cpp +++ b/flang-rt/unittests/Runtime/CUDA/Memory.cpp @@ -50,8 +50,8 @@ TEST(MemoryCUFTest, CUFDataTransferDescDesc) { EXPECT_EQ((int)kDeviceAllocatorPos, dev->GetAllocIdx()); RTNAME(AllocatableSetBounds)(*dev, 0, 1, 10); RTNAME(AllocatableAllocate) - (*dev, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, - __FILE__, __LINE__); + (*dev, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(dev->IsAllocated()); // Create temp array to transfer to device. 
diff --git a/flang-rt/unittests/Runtime/CharacterTest.cpp b/flang-rt/unittests/Runtime/CharacterTest.cpp index 2c7af27b9da77..0f28e883671bc 100644 --- a/flang-rt/unittests/Runtime/CharacterTest.cpp +++ b/flang-rt/unittests/Runtime/CharacterTest.cpp @@ -35,7 +35,7 @@ OwningPtr CreateDescriptor(const std::vector &shape, for (int j{0}; j < rank; ++j) { descriptor->GetDimension(j).SetBounds(2, shape[j] + 1); } - if (descriptor->Allocate(kNoAsyncObject) != 0) { + if (descriptor->Allocate(kNoAsyncId) != 0) { return nullptr; } diff --git a/flang-rt/unittests/Runtime/CommandTest.cpp b/flang-rt/unittests/Runtime/CommandTest.cpp index 6919a98105b8a..9d0da4ce8dd4e 100644 --- a/flang-rt/unittests/Runtime/CommandTest.cpp +++ b/flang-rt/unittests/Runtime/CommandTest.cpp @@ -26,7 +26,7 @@ template static OwningPtr CreateEmptyCharDescriptor() { OwningPtr descriptor{Descriptor::Create( sizeof(char), n, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncObject) != 0) { + if (descriptor->Allocate(kNoAsyncId) != 0) { return nullptr; } return descriptor; @@ -36,7 +36,7 @@ static OwningPtr CharDescriptor(const char *value) { std::size_t n{std::strlen(value)}; OwningPtr descriptor{Descriptor::Create( sizeof(char), n, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncObject) != 0) { + if (descriptor->Allocate(kNoAsyncId) != 0) { return nullptr; } std::memcpy(descriptor->OffsetElement(), value, n); @@ -47,7 +47,7 @@ template static OwningPtr EmptyIntDescriptor() { OwningPtr descriptor{Descriptor::Create(TypeCategory::Integer, kind, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncObject) != 0) { + if (descriptor->Allocate(kNoAsyncId) != 0) { return nullptr; } return descriptor; @@ -57,7 +57,7 @@ template static OwningPtr IntDescriptor(const int &value) { OwningPtr descriptor{Descriptor::Create(TypeCategory::Integer, kind, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if 
(descriptor->Allocate(kNoAsyncObject) != 0) { + if (descriptor->Allocate(kNoAsyncId) != 0) { return nullptr; } std::memcpy(descriptor->OffsetElement(), &value, sizeof(int)); diff --git a/flang-rt/unittests/Runtime/TemporaryStack.cpp b/flang-rt/unittests/Runtime/TemporaryStack.cpp index 65725840459ab..3291794f22fc1 100644 --- a/flang-rt/unittests/Runtime/TemporaryStack.cpp +++ b/flang-rt/unittests/Runtime/TemporaryStack.cpp @@ -59,7 +59,7 @@ TEST(TemporaryStack, ValueStackBasic) { Descriptor &outputDesc2{testDescriptorStorage[2].descriptor()}; inputDesc.Establish(code, elementBytes, descriptorPtr, rank, extent); - inputDesc.Allocate(kNoAsyncObject); + inputDesc.Allocate(kNoAsyncId); ASSERT_EQ(inputDesc.IsAllocated(), true); uint32_t *inputData = static_cast(inputDesc.raw().base_addr); for (std::size_t i = 0; i < inputDesc.Elements(); ++i) { @@ -123,7 +123,7 @@ TEST(TemporaryStack, ValueStackMultiSize) { boxDims.extent = extent[dim]; boxDims.sm = elementBytes; } - desc->Allocate(kNoAsyncObject); + desc->Allocate(kNoAsyncId); // fill the array with some data to test for (uint32_t i = 0; i < desc->Elements(); ++i) { diff --git a/flang-rt/unittests/Runtime/tools.h b/flang-rt/unittests/Runtime/tools.h index 4ada862df110b..a1eba45647a80 100644 --- a/flang-rt/unittests/Runtime/tools.h +++ b/flang-rt/unittests/Runtime/tools.h @@ -42,7 +42,7 @@ static OwningPtr MakeArray(const std::vector &shape, for (int j{0}; j < rank; ++j) { result->GetDimension(j).SetBounds(1, shape[j]); } - int stat{result->Allocate(kNoAsyncObject)}; + int stat{result->Allocate(kNoAsyncId)}; EXPECT_EQ(stat, 0) << stat; EXPECT_LE(data.size(), result->Elements()); char *p{result->OffsetElement()}; diff --git a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td index e38738230ffbc..46cc59cda1612 100644 --- a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td +++ b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td @@ -95,11 +95,12 @@ def 
cuf_AllocateOp : cuf_Op<"allocate", [AttrSizedOperandSegments, }]; let arguments = (ins Arg:$box, - Arg, "", [MemWrite]>:$errmsg, - Optional:$stream, - Arg, "", [MemWrite]>:$pinned, - Arg, "", [MemRead]>:$source, - cuf_DataAttributeAttr:$data_attr, UnitAttr:$hasStat); + Arg, "", [MemWrite]>:$errmsg, + Optional:$stream, + Arg, "", [MemWrite]>:$pinned, + Arg, "", [MemRead]>:$source, + cuf_DataAttributeAttr:$data_attr, + UnitAttr:$hasStat); let results = (outs AnyIntegerType:$stat); diff --git a/flang/include/flang/Runtime/CUDA/allocatable.h b/flang/include/flang/Runtime/CUDA/allocatable.h index 6c97afa9e10e8..822f2d4a2b297 100644 --- a/flang/include/flang/Runtime/CUDA/allocatable.h +++ b/flang/include/flang/Runtime/CUDA/allocatable.h @@ -17,14 +17,14 @@ namespace Fortran::runtime::cuda { extern "C" { /// Perform allocation of the descriptor. -int RTDECL(CUFAllocatableAllocate)(Descriptor &, int64_t *stream = nullptr, +int RTDECL(CUFAllocatableAllocate)(Descriptor &, int64_t stream = -1, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. -int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t *stream = nullptr, +int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t stream = -1, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); @@ -32,14 +32,14 @@ int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t *stream = nullptr, /// Perform allocation of the descriptor without synchronization. Assign data /// from source. 
int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, + const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. Assign data from source. int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, + const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/include/flang/Runtime/CUDA/allocator.h b/flang/include/flang/Runtime/CUDA/allocator.h index 59fdb22b6e663..18ddf75ac3852 100644 --- a/flang/include/flang/Runtime/CUDA/allocator.h +++ b/flang/include/flang/Runtime/CUDA/allocator.h @@ -20,16 +20,16 @@ extern "C" { void RTDECL(CUFRegisterAllocator)(); } -void *CUFAllocPinned(std::size_t, std::int64_t *); +void *CUFAllocPinned(std::size_t, std::int64_t); void CUFFreePinned(void *); -void *CUFAllocDevice(std::size_t, std::int64_t *); +void *CUFAllocDevice(std::size_t, std::int64_t); void CUFFreeDevice(void *); -void *CUFAllocManaged(std::size_t, std::int64_t *); +void *CUFAllocManaged(std::size_t, std::int64_t); void CUFFreeManaged(void *); -void *CUFAllocUnified(std::size_t, std::int64_t *); +void *CUFAllocUnified(std::size_t, std::int64_t); void CUFFreeUnified(void *); } // namespace Fortran::runtime::cuda diff --git a/flang/include/flang/Runtime/CUDA/pointer.h b/flang/include/flang/Runtime/CUDA/pointer.h index bdfc3268e0814..7fbd8f8e061f2 100644 --- a/flang/include/flang/Runtime/CUDA/pointer.h +++ b/flang/include/flang/Runtime/CUDA/pointer.h @@ -17,14 +17,14 @@ namespace Fortran::runtime::cuda { extern "C" { /// Perform allocation of the 
descriptor. -int RTDECL(CUFPointerAllocate)(Descriptor &, int64_t *stream = nullptr, +int RTDECL(CUFPointerAllocate)(Descriptor &, int64_t stream = -1, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. -int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t *stream = nullptr, +int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t stream = -1, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); @@ -32,14 +32,14 @@ int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t *stream = nullptr, /// Perform allocation of the descriptor without synchronization. Assign data /// from source. int RTDEF(CUFPointerAllocateSource)(Descriptor &pointer, - const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, + const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. Assign data from source. 
int RTDEF(CUFPointerAllocateSourceSync)(Descriptor &pointer, - const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, + const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/include/flang/Runtime/allocatable.h b/flang/include/flang/Runtime/allocatable.h index 863c07494e7c3..6895f8af5e2a8 100644 --- a/flang/include/flang/Runtime/allocatable.h +++ b/flang/include/flang/Runtime/allocatable.h @@ -94,10 +94,9 @@ int RTDECL(AllocatableCheckLengthParameter)(Descriptor &, // Successfully allocated memory is initialized if the allocatable has a // derived type, and is always initialized by AllocatableAllocateSource(). // Performs all necessary coarray synchronization and validation actions. -int RTDECL(AllocatableAllocate)(Descriptor &, - std::int64_t *asyncObject = nullptr, bool hasStat = false, - const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, - int sourceLine = 0); +int RTDECL(AllocatableAllocate)(Descriptor &, std::int64_t asyncId = -1, + bool hasStat = false, const Descriptor *errMsg = nullptr, + const char *sourceFile = nullptr, int sourceLine = 0); int RTDECL(AllocatableAllocateSource)(Descriptor &, const Descriptor &source, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/lib/Lower/Allocatable.cpp b/flang/lib/Lower/Allocatable.cpp index af8169c8e7f7b..8d0444a6e5bd4 100644 --- a/flang/lib/Lower/Allocatable.cpp +++ b/flang/lib/Lower/Allocatable.cpp @@ -773,7 +773,7 @@ class AllocateStmtHelper { mlir::Value errmsg = errMsgExpr ? errorManager.errMsgAddr : nullptr; mlir::Value stream = streamExpr - ? fir::getBase(converter.genExprAddr(loc, *streamExpr, stmtCtx)) + ? 
fir::getBase(converter.genExprValue(loc, *streamExpr, stmtCtx)) : nullptr; mlir::Value pinned = pinnedExpr diff --git a/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp b/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp index cd5f1f6d098c3..28452d3b486da 100644 --- a/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp +++ b/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp @@ -76,7 +76,8 @@ void fir::runtime::genAllocatableAllocate(fir::FirOpBuilder &builder, mlir::func::FuncOp func{ fir::runtime::getRuntimeFunc(loc, builder)}; mlir::FunctionType fTy{func.getFunctionType()}; - mlir::Value asyncObject = builder.createNullConstant(loc); + mlir::Value asyncId = + builder.createIntegerConstant(loc, builder.getI64Type(), -1); mlir::Value sourceFile{fir::factory::locationToFilename(builder, loc)}; mlir::Value sourceLine{ fir::factory::locationToLineNo(builder, loc, fTy.getInput(5))}; @@ -87,7 +88,7 @@ void fir::runtime::genAllocatableAllocate(fir::FirOpBuilder &builder, errMsg = builder.create(loc, boxNoneTy).getResult(); } llvm::SmallVector args{ - fir::runtime::createArguments(builder, loc, fTy, desc, asyncObject, - hasStat, errMsg, sourceFile, sourceLine)}; + fir::runtime::createArguments(builder, loc, fTy, desc, asyncId, hasStat, + errMsg, sourceFile, sourceLine)}; builder.create(loc, func, args); } diff --git a/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp b/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp index 687007d957225..24033bc15b8eb 100644 --- a/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp +++ b/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp @@ -76,16 +76,6 @@ llvm::LogicalResult cuf::FreeOp::verify() { return checkCudaAttr(*this); } // AllocateOp //===----------------------------------------------------------------------===// -template -static llvm::LogicalResult checkStreamType(OpTy op) { - if (!op.getStream()) - return mlir::success(); - if (auto refTy = mlir::dyn_cast(op.getStream().getType())) - if (!refTy.getEleTy().isInteger(64)) - return 
op.emitOpError("stream is expected to be an i64 reference"); - return mlir::success(); -} - llvm::LogicalResult cuf::AllocateOp::verify() { if (getPinned() && getStream()) return emitOpError("pinned and stream cannot appears at the same time"); @@ -102,7 +92,7 @@ llvm::LogicalResult cuf::AllocateOp::verify() { "expect errmsg to be a reference to/or a box type value"); if (getErrmsg() && !getHasStat()) return emitOpError("expect stat attribute when errmsg is provided"); - return checkStreamType(*this); + return mlir::success(); } //===----------------------------------------------------------------------===// @@ -153,6 +143,16 @@ llvm::LogicalResult cuf::DeallocateOp::verify() { // KernelLaunchOp //===----------------------------------------------------------------------===// +template +static llvm::LogicalResult checkStreamType(OpTy op) { + if (!op.getStream()) + return mlir::success(); + if (auto refTy = mlir::dyn_cast(op.getStream().getType())) + if (!refTy.getEleTy().isInteger(64)) + return op.emitOpError("stream is expected to be an i64 reference"); + return mlir::success(); +} + llvm::LogicalResult cuf::KernelLaunchOp::verify() { return checkStreamType(*this); } diff --git a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp index 3a3eab9e8e37b..e70ceb3a67d98 100644 --- a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp @@ -128,15 +128,17 @@ static mlir::LogicalResult convertOpToCall(OpTy op, mlir::IntegerType::get(op.getContext(), 1))); if (op.getSource()) { mlir::Value stream = - op.getStream() ? op.getStream() - : builder.createNullConstant(loc, fTy.getInput(2)); + op.getStream() + ? op.getStream() + : builder.createIntegerConstant(loc, fTy.getInput(2), -1); args = fir::runtime::createArguments( builder, loc, fTy, op.getBox(), op.getSource(), stream, pinned, hasStat, errmsg, sourceFile, sourceLine); } else { mlir::Value stream = - op.getStream() ? 
op.getStream() - : builder.createNullConstant(loc, fTy.getInput(1)); + op.getStream() + ? op.getStream() + : builder.createIntegerConstant(loc, fTy.getInput(1), -1); args = fir::runtime::createArguments(builder, loc, fTy, op.getBox(), stream, pinned, hasStat, errmsg, sourceFile, sourceLine); diff --git a/flang/test/Fir/CUDA/cuda-allocate.fir b/flang/test/Fir/CUDA/cuda-allocate.fir index ea7890c9aac52..095ad92d5deb5 100644 --- a/flang/test/Fir/CUDA/cuda-allocate.fir +++ b/flang/test/Fir/CUDA/cuda-allocate.fir @@ -19,7 +19,7 @@ func.func @_QPsub1() { // CHECK: %[[DESC:.*]] = fir.convert %[[DESC_RT_CALL]] : (!fir.ref>) -> !fir.ref>>> // CHECK: %[[DECL_DESC:.*]]:2 = hlfir.declare %[[DESC]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: %[[BOX_NONE:.*]] = fir.convert %[[DECL_DESC]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[BOX_NONE:.*]] = fir.convert %[[DECL_DESC]]#1 : (!fir.ref>>>) -> !fir.ref> // CHECK: %{{.*}} = fir.call @_FortranAAllocatableDeallocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -47,7 +47,7 @@ func.func @_QPsub3() { // CHECK: %[[A:.*]]:2 = hlfir.declare %[[A_ADDR]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QMmod1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: %[[A_BOX:.*]] = fir.convert %[[A]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, 
!fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[A_BOX:.*]] = fir.convert %[[A]]#1 : (!fir.ref>>>) -> !fir.ref> // CHECK: fir.call @_FortranACUFAllocatableDeallocate(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -87,7 +87,7 @@ func.func @_QPsub5() { } // CHECK-LABEL: func.func @_QPsub5() -// CHECK: fir.call @_FortranACUFAllocatableAllocate({{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocate({{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: fir.call @_FortranAAllocatableDeallocate({{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -118,7 +118,7 @@ func.func @_QQsub6() attributes {fir.bindc_name = "test"} { // CHECK: %[[B:.*]]:2 = hlfir.declare %[[B_ADDR]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QMdataEb"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: _FortranAAllocatableSetBounds // CHECK: %[[B_BOX:.*]] = fir.convert %[[B]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[B_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[B_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 func.func @_QPallocate_source() { @@ -142,7 +142,7 @@ func.func @_QPallocate_source() { // CHECK: %[[SOURCE:.*]] = fir.load %[[DECL_HOST]] : !fir.ref>>> // CHECK: %[[DEV_CONV:.*]] = fir.convert %[[DECL_DEV]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[SOURCE_CONV:.*]] = fir.convert %[[SOURCE]] : (!fir.box>>) -> !fir.box -// CHECK: 
%{{.*}} = fir.call @_FortranACUFAllocatableAllocateSource(%[[DEV_CONV]], %[[SOURCE_CONV]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.box, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocateSource(%[[DEV_CONV]], %[[SOURCE_CONV]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.box, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 fir.global @_QMmod1Ea_d {data_attr = #cuf.cuda} : !fir.box>> { @@ -170,14 +170,16 @@ func.func @_QQallocate_stream() { %1 = fir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFEa"} : (!fir.ref>>>) -> !fir.ref>>> %2 = fir.alloca i64 {bindc_name = "stream1", uniq_name = "_QFEstream1"} %3 = fir.declare %2 {uniq_name = "_QFEstream1"} : (!fir.ref) -> !fir.ref - %5 = cuf.allocate %1 : !fir.ref>>> stream(%3 : !fir.ref) {data_attr = #cuf.cuda} -> i32 + %4 = fir.load %3 : !fir.ref + %5 = cuf.allocate %1 : !fir.ref>>> stream(%4 : i64) {data_attr = #cuf.cuda} -> i32 return } // CHECK-LABEL: func.func @_QQallocate_stream() // CHECK: %[[STREAM_ALLOCA:.*]] = fir.alloca i64 {bindc_name = "stream1", uniq_name = "_QFEstream1"} // CHECK: %[[STREAM:.*]] = fir.declare %[[STREAM_ALLOCA]] {uniq_name = "_QFEstream1"} : (!fir.ref) -> !fir.ref -// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %[[STREAM]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[STREAM_LOAD:.*]] = fir.load %[[STREAM]] : !fir.ref +// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %[[STREAM_LOAD]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 func.func @_QPp_alloc() { @@ -266,6 +268,6 @@ func.func @_QQpinned() attributes {fir.bindc_name = "testasync"} { // CHECK: %[[PINNED:.*]] = fir.alloca !fir.logical<4> {bindc_name = "pinnedflag", uniq_name = "_QFEpinnedflag"} // 
CHECK: %[[DECL_PINNED:.*]] = fir.declare %[[PINNED]] {uniq_name = "_QFEpinnedflag"} : (!fir.ref>) -> !fir.ref> // CHECK: %[[CONV_PINNED:.*]] = fir.convert %[[DECL_PINNED]] : (!fir.ref>) -> !fir.ref -// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %{{.*}}, %[[CONV_PINNED]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %{{.*}}, %[[CONV_PINNED]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 } // end of module diff --git a/flang/test/Fir/cuf-invalid.fir b/flang/test/Fir/cuf-invalid.fir index dceb8f6fde236..a3b9be3ee8223 100644 --- a/flang/test/Fir/cuf-invalid.fir +++ b/flang/test/Fir/cuf-invalid.fir @@ -2,12 +2,13 @@ func.func @_QPsub1() { %0 = fir.alloca !fir.box>> {bindc_name = "a", uniq_name = "_QFsub1Ea"} - %s = fir.alloca i64 + %1 = fir.alloca i32 %pinned = fir.alloca i1 %4:2 = hlfir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) %11 = fir.convert %4#1 : (!fir.ref>>>) -> !fir.ref> + %s = fir.load %1 : !fir.ref // expected-error@+1{{'cuf.allocate' op pinned and stream cannot appears at the same time}} - %13 = cuf.allocate %11 : !fir.ref> stream(%s : !fir.ref) pinned(%pinned : !fir.ref) {data_attr = #cuf.cuda} -> i32 + %13 = cuf.allocate %11 : !fir.ref> stream(%s : i32) pinned(%pinned : !fir.ref) {data_attr = #cuf.cuda} -> i32 return } diff --git a/flang/test/Fir/cuf.mlir b/flang/test/Fir/cuf.mlir index f80a70eca34a3..d38b26a4548ed 100644 --- a/flang/test/Fir/cuf.mlir +++ b/flang/test/Fir/cuf.mlir @@ -18,14 +18,15 @@ func.func @_QPsub1() { func.func @_QPsub1() { %0 = fir.alloca !fir.box>> {bindc_name = "a", uniq_name = "_QFsub1Ea"} - %1 = fir.alloca i64 + %1 = fir.alloca i32 %4:2 = hlfir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : 
(!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) %11 = fir.convert %4#1 : (!fir.ref>>>) -> !fir.ref> - %13 = cuf.allocate %11 : !fir.ref> stream(%1 : !fir.ref) {data_attr = #cuf.cuda} -> i32 + %s = fir.load %1 : !fir.ref + %13 = cuf.allocate %11 : !fir.ref> stream(%s : i32) {data_attr = #cuf.cuda} -> i32 return } -// CHECK: cuf.allocate %{{.*}} : !fir.ref> stream(%{{.*}} : !fir.ref) {data_attr = #cuf.cuda} -> i32 +// CHECK: cuf.allocate %{{.*}} : !fir.ref> stream(%{{.*}} : i32) {data_attr = #cuf.cuda} -> i32 // ----- diff --git a/flang/test/HLFIR/elemental-codegen.fir b/flang/test/HLFIR/elemental-codegen.fir index 67af4261470f7..a715479f16115 100644 --- a/flang/test/HLFIR/elemental-codegen.fir +++ b/flang/test/HLFIR/elemental-codegen.fir @@ -191,7 +191,7 @@ func.func @test_polymorphic(%arg0: !fir.class> {fir.bindc_ // CHECK: %[[VAL_35:.*]] = fir.absent !fir.box // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_4]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_37:.*]] = fir.convert %[[VAL_31]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_38:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_36]], %{{.*}}, %[[VAL_34]], %[[VAL_35]], %[[VAL_37]], %[[VAL_33]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_38:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_36]], %{{.*}}, %[[VAL_34]], %[[VAL_35]], %[[VAL_37]], %[[VAL_33]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_12:.*]] = arith.constant true // CHECK: %[[VAL_39:.*]] = fir.load %[[VAL_13]]#0 : !fir.ref>>>> // CHECK: %[[VAL_40:.*]] = arith.constant 1 : index @@ -275,7 +275,7 @@ func.func @test_polymorphic_expr(%arg0: !fir.class> {fir.b // CHECK: %[[VAL_36:.*]] = fir.absent !fir.box // CHECK: %[[VAL_37:.*]] = fir.convert %[[VAL_5]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_38:.*]] = fir.convert %[[VAL_32]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_39:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_37]], %{{.*}}, %[[VAL_35]], %[[VAL_36]], %[[VAL_38]], 
%[[VAL_34]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_39:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_37]], %{{.*}}, %[[VAL_35]], %[[VAL_36]], %[[VAL_38]], %[[VAL_34]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_13:.*]] = arith.constant true // CHECK: %[[VAL_40:.*]] = fir.load %[[VAL_14]]#0 : !fir.ref>>>> // CHECK: %[[VAL_41:.*]] = arith.constant 1 : index @@ -328,7 +328,7 @@ func.func @test_polymorphic_expr(%arg0: !fir.class> {fir.b // CHECK: %[[VAL_85:.*]] = fir.absent !fir.box // CHECK: %[[VAL_86:.*]] = fir.convert %[[VAL_4]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_87:.*]] = fir.convert %[[VAL_81]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_88:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_86]], %{{.*}}, %[[VAL_84]], %[[VAL_85]], %[[VAL_87]], %[[VAL_83]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_88:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_86]], %{{.*}}, %[[VAL_84]], %[[VAL_85]], %[[VAL_87]], %[[VAL_83]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_62:.*]] = arith.constant true // CHECK: %[[VAL_89:.*]] = fir.load %[[VAL_63]]#0 : !fir.ref>>>> // CHECK: %[[VAL_90:.*]] = arith.constant 1 : index diff --git a/flang/test/Lower/CUDA/cuda-allocatable.cuf b/flang/test/Lower/CUDA/cuda-allocatable.cuf index cec10dda839e9..a570f636b8db1 100644 --- a/flang/test/Lower/CUDA/cuda-allocatable.cuf +++ b/flang/test/Lower/CUDA/cuda-allocatable.cuf @@ -90,7 +90,7 @@ end subroutine subroutine sub4() real, allocatable, device :: a(:) - integer(8) :: istream + integer :: istream allocate(a(10), stream=istream) end subroutine @@ -98,10 +98,11 @@ end subroutine ! CHECK: %[[BOX:.*]] = cuf.alloc !fir.box>> {bindc_name = "a", data_attr = #cuf.cuda, uniq_name = "_QFsub4Ea"} -> !fir.ref>>> ! CHECK: fir.embox {{.*}} {allocator_idx = 2 : i32} ! 
CHECK: %[[BOX_DECL:.*]]:2 = hlfir.declare %{{.*}} {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub4Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) -! CHECK: %[[ISTREAM:.*]] = fir.alloca i64 {bindc_name = "istream", uniq_name = "_QFsub4Eistream"} -! CHECK: %[[ISTREAM_DECL:.*]]:2 = hlfir.declare %[[ISTREAM]] {uniq_name = "_QFsub4Eistream"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ISTREAM:.*]] = fir.alloca i32 {bindc_name = "istream", uniq_name = "_QFsub4Eistream"} +! CHECK: %[[ISTREAM_DECL:.*]]:2 = hlfir.declare %[[ISTREAM]] {uniq_name = "_QFsub4Eistream"} : (!fir.ref) -> (!fir.ref, !fir.ref) ! CHECK: fir.call @_FortranAAllocatableSetBounds -! CHECK: %{{.*}} = cuf.allocate %[[BOX_DECL]]#0 : !fir.ref>>> stream(%[[ISTREAM_DECL]]#0 : !fir.ref) {data_attr = #cuf.cuda} -> i32 +! CHECK: %[[STREAM:.*]] = fir.load %[[ISTREAM_DECL]]#0 : !fir.ref +! CHECK: %{{.*}} = cuf.allocate %[[BOX_DECL]]#0 : !fir.ref>>> stream(%[[STREAM]] : i32) {data_attr = #cuf.cuda} -> i32 ! CHECK: fir.if %{{.*}} { ! CHECK: %{{.*}} = cuf.deallocate %[[BOX_DECL]]#0 : !fir.ref>>> {data_attr = #cuf.cuda} -> i32 ! CHECK: } diff --git a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 index 6869af863644d..5bb1ae3797346 100644 --- a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 @@ -473,6 +473,6 @@ subroutine init() end module ! CHECK-LABEL: func.func @_QMacc_declare_post_action_statPinit() -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.if -! 
CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/OpenACC/acc-declare.f90 b/flang/test/Lower/OpenACC/acc-declare.f90 index 4d95ffa10edaf..889cdef51f4ce 100644 --- a/flang/test/Lower/OpenACC/acc-declare.f90 +++ b/flang/test/Lower/OpenACC/acc-declare.f90 @@ -434,6 +434,6 @@ subroutine init() end module ! CHECK-LABEL: func.func @_QMacc_declare_post_action_statPinit() -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.if -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/allocatable-polymorphic.f90 b/flang/test/Lower/allocatable-polymorphic.f90 index cbd7876203424..dd8671daeaf8e 100644 --- a/flang/test/Lower/allocatable-polymorphic.f90 +++ b/flang/test/Lower/allocatable-polymorphic.f90 @@ -267,7 +267,7 @@ subroutine test_allocatable() ! CHECK: %[[C0:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[P_CAST]], %[[TYPE_DESC_P1_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[P_CAST:.*]] = fir.convert %[[P_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! 
CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[P_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[P_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P1:.*]] = fir.type_desc !fir.type<_QMpolyTp1{a:i32,b:i32}> ! CHECK: %[[C1_CAST:.*]] = fir.convert %[[C1_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> @@ -276,7 +276,7 @@ subroutine test_allocatable() ! CHECK: %[[C0:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[C1_CAST]], %[[TYPE_DESC_P1_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[C1_CAST:.*]] = fir.convert %[[C1_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C1_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C1_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P2:.*]] = fir.type_desc !fir.type<_QMpolyTp2{p1:!fir.type<_QMpolyTp1{a:i32,b:i32}>,c:i32}> ! CHECK: %[[C2_CAST:.*]] = fir.convert %[[C2_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> @@ -285,7 +285,7 @@ subroutine test_allocatable() ! CHECK: %[[C0:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[C2_CAST]], %[[TYPE_DESC_P2_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[C2_CAST:.*]] = fir.convert %[[C2_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C2_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! 
CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C2_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P1:.*]] = fir.type_desc !fir.type<_QMpolyTp1{a:i32,b:i32}> ! CHECK: %[[C3_CAST:.*]] = fir.convert %[[C3_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> @@ -300,7 +300,7 @@ subroutine test_allocatable() ! CHECK: %[[C10_I64:.*]] = fir.convert %[[C10]] : (i32) -> i64 ! CHECK: fir.call @_FortranAAllocatableSetBounds(%[[C3_CAST]], %[[C0]], %[[C1_I64]], %[[C10_I64]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[C3_CAST:.*]] = fir.convert %[[C3_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C3_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C3_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P2:.*]] = fir.type_desc !fir.type<_QMpolyTp2{p1:!fir.type<_QMpolyTp1{a:i32,b:i32}>,c:i32}> ! CHECK: %[[C4_CAST:.*]] = fir.convert %[[C4_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> @@ -316,7 +316,7 @@ subroutine test_allocatable() ! CHECK: %[[C20_I64:.*]] = fir.convert %[[C20]] : (i32) -> i64 ! CHECK: fir.call @_FortranAAllocatableSetBounds(%[[C4_CAST]], %[[C0]], %[[C1_I64]], %[[C20_I64]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[C4_CAST:.*]] = fir.convert %[[C4_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C4_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C4_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! 
CHECK: %[[C1_LOAD1:.*]] = fir.load %[[C1_DECL]]#0 : !fir.ref>>> ! CHECK: fir.dispatch "proc1"(%[[C1_LOAD1]] : !fir.class>>) @@ -390,7 +390,7 @@ subroutine test_unlimited_polymorphic_with_intrinsic_type_spec() ! CHECK: %[[CORANK:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitIntrinsicForAllocate(%[[BOX_NONE]], %[[CAT]], %[[KIND]], %[[RANK]], %[[CORANK]]) {{.*}} : (!fir.ref>, i32, i32, i32, i32) -> () ! CHECK: %[[BOX_NONE:.*]] = fir.convert %[[P_DECL]]#0 : (!fir.ref>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[BOX_NONE:.*]] = fir.convert %[[PTR_DECL]]#0 : (!fir.ref>>) -> !fir.ref> ! CHECK: %[[CAT:.*]] = arith.constant 2 : i32 @@ -573,7 +573,7 @@ subroutine test_allocatable_up_character() ! CHECK: %[[CORANK:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitCharacterForAllocate(%[[A_NONE]], %[[LEN]], %[[KIND]], %[[RANK]], %[[CORANK]]) {{.*}} : (!fir.ref>, i64, i32, i32, i32) -> () ! CHECK: %[[A_NONE:.*]] = fir.convert %[[A_DECL]]#0 : (!fir.ref>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 end module @@ -592,17 +592,17 @@ program test_alloc ! LLVM-LABEL: define void @_QMpolyPtest_allocatable() ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! 
LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp2, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 1, i32 0) ! LLVM: call void @_FortranAAllocatableSetBounds(ptr %{{.*}}, i32 0, i64 1, i64 10) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp2, i32 1, i32 0) ! LLVM: call void @_FortranAAllocatableSetBounds(ptr %{{.*}}, i32 0, i64 1, i64 20) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM-COUNT-2: call void %{{[0-9]*}}() ! 
LLVM: call void @llvm.memcpy.p0.p0.i32 @@ -683,5 +683,5 @@ program test_alloc ! LLVM: store { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] } { ptr null, i64 8, i32 20240719, i8 0, i8 42, i8 2, i8 1, ptr @_QMpolyEXdtXp1, [1 x i64] zeroinitializer }, ptr %[[ALLOCA1:[0-9]*]] ! LLVM: call void @llvm.memcpy.p0.p0.i32(ptr %[[ALLOCA2:[0-9]+]], ptr %[[ALLOCA1]], i32 40, i1 false) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %[[ALLOCA2]], ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %[[ALLOCA2]], ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %[[ALLOCA2]], i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: %{{.*}} = call i32 @_FortranAAllocatableDeallocatePolymorphic(ptr %[[ALLOCA2]], ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) diff --git a/flang/test/Lower/allocatable-runtime.f90 b/flang/test/Lower/allocatable-runtime.f90 index c63252c68974e..37272c90656cc 100644 --- a/flang/test/Lower/allocatable-runtime.f90 +++ b/flang/test/Lower/allocatable-runtime.f90 @@ -31,7 +31,7 @@ subroutine foo() ! CHECK: fir.call @{{.*}}AllocatableSetBounds(%[[xBoxCast2]], %c0{{.*}}, %[[xlbCast]], %[[xubCast]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK-DAG: %[[xBoxCast3:.*]] = fir.convert %[[xBoxAddr]] : (!fir.ref>>>) -> !fir.ref> ! CHECK-DAG: %[[sourceFile:.*]] = fir.convert %{{.*}} -> !fir.ref - ! CHECK: fir.call @{{.*}}AllocatableAllocate(%[[xBoxCast3]], %{{.*}}, %false{{.*}}, %[[errMsg]], %[[sourceFile]], %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 + ! CHECK: fir.call @{{.*}}AllocatableAllocate(%[[xBoxCast3]], %{{.*}}, %false{{.*}}, %[[errMsg]], %[[sourceFile]], %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! Simply check that we are emitting the right numebr of set bound for y and z. Otherwise, this is just like x. ! 
CHECK: fir.convert %[[yBoxAddr]] : (!fir.ref>>>) -> !fir.ref> @@ -180,4 +180,4 @@ subroutine mold_allocation() ! CHECK: %[[M_BOX_NONE:.*]] = fir.convert %[[EMBOX_M]] : (!fir.box>) -> !fir.box ! CHECK: fir.call @_FortranAAllocatableApplyMold(%[[A_BOX_NONE]], %[[M_BOX_NONE]], %[[RANK]]) {{.*}} : (!fir.ref>, !fir.box, i32) -> () ! CHECK: %[[A_BOX_NONE:.*]] = fir.convert %[[A]] : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/allocate-mold.f90 b/flang/test/Lower/allocate-mold.f90 index 9427c8b08786f..c7985b11397ce 100644 --- a/flang/test/Lower/allocate-mold.f90 +++ b/flang/test/Lower/allocate-mold.f90 @@ -16,7 +16,7 @@ subroutine scalar_mold_allocation() ! CHECK: %[[A_REF_BOX_NONE1:.*]] = fir.convert %[[A]] : (!fir.ref>>) -> !fir.ref> ! CHECK: fir.call @_FortranAAllocatableApplyMold(%[[A_REF_BOX_NONE1]], %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.box, i32) -> () ! CHECK: %[[A_REF_BOX_NONE2:.*]] = fir.convert %[[A]] : (!fir.ref>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_REF_BOX_NONE2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_REF_BOX_NONE2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 subroutine array_scalar_mold_allocation() real, allocatable :: a(:) @@ -40,4 +40,4 @@ end subroutine array_scalar_mold_allocation ! CHECK: %[[REF_BOX_A1:.*]] = fir.convert %1 : (!fir.ref>>>) -> !fir.ref> ! 
CHECK: fir.call @_FortranAAllocatableSetBounds(%[[REF_BOX_A1]], {{.*}},{{.*}}, {{.*}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[REF_BOX_A2:.*]] = fir.convert %[[A]] : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[REF_BOX_A2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[REF_BOX_A2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/polymorphic.f90 b/flang/test/Lower/polymorphic.f90 index b7be5f685d9e3..485861a838ff6 100644 --- a/flang/test/Lower/polymorphic.f90 +++ b/flang/test/Lower/polymorphic.f90 @@ -1149,7 +1149,7 @@ program test ! CHECK-LABEL: func.func @_QQmain() attributes {fir.bindc_name = "test"} { ! CHECK: %[[ADDR_O:.*]] = fir.address_of(@_QFEo) : !fir.ref}>>>> ! CHECK: %[[BOX_NONE:.*]] = fir.convert %[[ADDR_O]] : (!fir.ref}>>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[O:.*]] = fir.load %[[ADDR_O]] : !fir.ref}>>>> ! CHECK: %[[COORD_INNER:.*]] = fir.coordinate_of %[[O]], inner : (!fir.box}>>>) -> !fir.ref> ! 
CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered iter_args(%arg1 = %{{.*}}) -> (!fir.array<5x!fir.logical<4>>) { diff --git a/flang/test/Transforms/lower-repack-arrays.fir b/flang/test/Transforms/lower-repack-arrays.fir index 0b323b1bb0697..bbae7ba5b0e0b 100644 --- a/flang/test/Transforms/lower-repack-arrays.fir +++ b/flang/test/Transforms/lower-repack-arrays.fir @@ -840,7 +840,7 @@ func.func @_QPtest6(%arg0: !fir.class>> {fir.bi // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>>) -> !fir.box @@ -928,7 +928,7 @@ func.func @_QPtest6_stack(%arg0: !fir.class>> { // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>>> // CHECK: 
%[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>>) -> !fir.box @@ -1015,7 +1015,7 @@ func.func @_QPtest7(%arg0: !fir.class> {fir.bindc_name = "x // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>) -> !fir.box @@ -1103,7 +1103,7 @@ func.func @_QPtest7_stack(%arg0: !fir.class> {fir.bindc_nam // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>) -> !fir.box From flang-commits at lists.llvm.org Thu May 1 20:07:56 2025 From: flang-commits at lists.llvm.org (via flang-commits) 
Date: Thu, 01 May 2025 20:07:56 -0700 (PDT)
Subject: [flang-commits] [flang] 36541ec - [flang] Fix #else with trailing text (#138045)
Message-ID: <6814370c.050a0220.1b10d4.2b37@mx.google.com>

Author: Eugene Epshteyn
Date: 2025-05-01T23:07:52-04:00
New Revision: 36541ec3ca7027aec87118262773b35964c0edec

URL: https://github.com/llvm/llvm-project/commit/36541ec3ca7027aec87118262773b35964c0edec
DIFF: https://github.com/llvm/llvm-project/commit/36541ec3ca7027aec87118262773b35964c0edec.diff

LOG: [flang] Fix #else with trailing text (#138045)

Fixed the issue, where the extra text on #else line (' Z' in the example below) caused the data from the "else" clause to be processed together with the data of "then" clause.

```
#ifndef XYZ42
      PARAMETER(A=2)
#else Z
      PARAMETER(A=3)
#endif
      end
```

Added:
    flang/test/Preprocessing/pp048.F

Modified:
    flang/lib/Parser/preprocessor.cpp

Removed:

################################################################################
diff --git a/flang/lib/Parser/preprocessor.cpp b/flang/lib/Parser/preprocessor.cpp
index a47f9c32ad27c..6e8e3aee19b09 100644
--- a/flang/lib/Parser/preprocessor.cpp
+++ b/flang/lib/Parser/preprocessor.cpp
@@ -684,7 +684,8 @@ void Preprocessor::Directive(const TokenSequence &dir, Prescanner &prescanner) {
             dir.GetIntervalProvenanceRange(j, tokens - j),
             "#else: excess tokens at end of directive"_port_en_US);
       }
-    } else if (ifStack_.empty()) {
+    }
+    if (ifStack_.empty()) {
       prescanner.Say(dir.GetTokenProvenanceRange(dirOffset),
           "#else: not nested within #if, #ifdef, or #ifndef"_err_en_US);
     } else if (ifStack_.top() != CanDeadElseAppear::Yes) {
diff --git a/flang/test/Preprocessing/pp048.F b/flang/test/Preprocessing/pp048.F
new file mode 100644
index 0000000000000..121262c1840f9
--- /dev/null
+++ b/flang/test/Preprocessing/pp048.F
@@ -0,0 +1,11 @@
+! RUN: %flang -E %s 2>&1 | FileCheck %s
+#ifndef XYZ42
+      PARAMETER(A=2)
+#else Z
+      PARAMETER(A=3)
+#endif
+! Ensure that "PARAMETER(A" is printed only once
+! CHECK: PARAMETER(A
+! CHECK-NOT: PARAMETER(A
+      end
+

From flang-commits at lists.llvm.org Thu May 1 20:08:00 2025
From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits)
Date: Thu, 01 May 2025 20:08:00 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix #else with trailing text (PR #138045)
In-Reply-To:
Message-ID: <68143710.170a0220.33ff13.5d68@mx.google.com>

https://github.com/eugeneepshteyn closed https://github.com/llvm/llvm-project/pull/138045

From flang-commits at lists.llvm.org Thu May 1 21:04:37 2025
From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits)
Date: Thu, 01 May 2025 21:04:37 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] update LIT test for AIX (NFC) (PR #138228)
Message-ID:

https://github.com/kkwli created https://github.com/llvm/llvm-project/pull/138228

On AIX, `_QM__fortran_type_infoTvalue` and `_QM__fortran_type_infoTspecialbinding` are packed struct type, hence `%T2 = type <{ }>`.

>From 5f425420f0e4dd68b0ebe7478aba0126d02ca368 Mon Sep 17 00:00:00 2001
From: Kelvin Li
Date: Thu, 1 May 2025 23:59:03 -0400
Subject: [PATCH] [flang] update LIT test for AIX

---
 flang/test/Lower/volatile-openmp.f90 | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/flang/test/Lower/volatile-openmp.f90 b/flang/test/Lower/volatile-openmp.f90
index 3269af9618f10..64fd5f04afdd6 100644
--- a/flang/test/Lower/volatile-openmp.f90
+++ b/flang/test/Lower/volatile-openmp.f90
@@ -23,11 +23,11 @@
 ! CHECK: %[[VAL_11:.*]] = fir.address_of(@_QFEcontainer) : !fir.ref>>}>>
 ! CHECK: %[[VAL_12:.*]] = fir.volatile_cast %[[VAL_11]] : (!fir.ref>>}>>) -> !fir.ref>>}>, volatile>
 ! CHECK: %[[VAL_13:.*]]:2 = hlfir.declare %[[VAL_12]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEcontainer"} : (!fir.ref>>}>, volatile>) -> (!fir.ref>>}>, volatile>, !fir.ref>>}>, volatile>)
-! 
CHECK: %[[VAL_14:.*]] = fir.address_of(@_QFE.c.t) : !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>> +! CHECK: %[[VAL_14:.*]] = fir.address_of(@_QFE.c.t) : !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>> ! CHECK: %[[VAL_15:.*]] = fir.shape_shift %[[VAL_0]], %[[VAL_1]] : (index, index) -> !fir.shapeshift<1> -! 
CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_14]](%[[VAL_15]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} : (!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.shapeshift<1>) -> (!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, 
!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>) -! CHECK: %[[VAL_17:.*]] = fir.address_of(@_QFE.dt.t) : !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>> -! 
CHECK: %[[VAL_18:.*]]:2 = hlfir.declare %[[VAL_17]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.dt.t"} : (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) -> (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>, 
!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) +! CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_14]](%[[VAL_15]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} : (!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.shapeshift<1>) -> 
(!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>) +! 
CHECK: %[[VAL_17:.*]] = fir.address_of(@_QFE.dt.t) : !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>> +! CHECK: %[[VAL_18:.*]]:2 = hlfir.declare %[[VAL_17]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.dt.t"} : (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) -> 
(!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>, !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) ! CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_13]]#0{"array"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>}>, volatile>) -> !fir.ref>>, volatile> ! 
CHECK: %[[VAL_20:.*]] = fir.load %[[VAL_19]] : !fir.ref>>, volatile> ! CHECK: %[[VAL_21:.*]]:3 = fir.box_dims %[[VAL_20]], %[[VAL_0]] : (!fir.box>>, index) -> (index, index, index) From flang-commits at lists.llvm.org Thu May 1 21:05:08 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 21:05:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang] update LIT test for AIX (NFC) (PR #138228) In-Reply-To: Message-ID: <68144474.170a0220.3beb16.519f@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Kelvin Li (kkwli)
Changes On AIX, `_QM__fortran_type_infoTvalue` and `_QM__fortran_type_infoTspecialbinding` are packed struct type, hence `%T2 = type <{ <type list> }>`. --- Patch is 31.78 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/138228.diff 1 Files Affected: - (modified) flang/test/Lower/volatile-openmp.f90 (+4-4) ``````````diff diff --git a/flang/test/Lower/volatile-openmp.f90 b/flang/test/Lower/volatile-openmp.f90 index 3269af9618f10..64fd5f04afdd6 100644 --- a/flang/test/Lower/volatile-openmp.f90 +++ b/flang/test/Lower/volatile-openmp.f90 @@ -23,11 +23,11 @@ ! CHECK: %[[VAL_11:.*]] = fir.address_of(@_QFEcontainer) : !fir.ref>>}>> ! CHECK: %[[VAL_12:.*]] = fir.volatile_cast %[[VAL_11]] : (!fir.ref>>}>>) -> !fir.ref>>}>, volatile> ! CHECK: %[[VAL_13:.*]]:2 = hlfir.declare %[[VAL_12]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEcontainer"} : (!fir.ref>>}>, volatile>) -> (!fir.ref>>}>, volatile>, !fir.ref>>}>, volatile>) -! CHECK: %[[VAL_14:.*]] = fir.address_of(@_QFE.c.t) : !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>> +! 
CHECK: %[[VAL_14:.*]] = fir.address_of(@_QFE.c.t) : !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>> ! CHECK: %[[VAL_15:.*]] = fir.shape_shift %[[VAL_0]], %[[VAL_1]] : (index, index) -> !fir.shapeshift<1> -! 
CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_14]](%[[VAL_15]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} : (!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.shapeshift<1>) -> (!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, 
!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>) -! CHECK: %[[VAL_17:.*]] = fir.address_of(@_QFE.dt.t) : !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>> -! 
CHECK: %[[VAL_18:.*]]:2 = hlfir.declare %[[VAL_17]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.dt.t"} : (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) -> (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>, 
!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) +! CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_14]](%[[VAL_15]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} : (!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.shapeshift<1>) -> (!fir.box>>,genre:i8,... [truncated] ``````````
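For readers unfamiliar with the `{{[<]?}}` / `{{[>]?}}` idiom used in the patch above, here is a rough sketch of why one CHECK line can accept both the packed (AIX) and non-packed spellings of the type. It is not FileCheck itself: `filecheck_to_regex` is a hypothetical helper that only handles `{{...}}` regex blocks (no `[[VAR]]` captures), it uses Python's `re` rather than FileCheck's regex engine, and the `PACKED` sample string is inferred from the pattern rather than copied from real AIX output.

```python
import re


def filecheck_to_regex(pattern: str) -> "re.Pattern[str]":
    """Approximate a FileCheck pattern: text is literal, {{...}} is a regex.

    Hypothetical helper for illustration only; [[VAR]] captures and
    FileCheck's whitespace canonicalization are not modelled.
    """
    parts = []
    pos = 0
    for m in re.finditer(r"\{\{(.*?)\}\}", pattern):
        # Everything before the {{...}} block is matched literally.
        parts.append(re.escape(pattern[pos:m.start()]))
        # The block body is used as a regex, wrapped in a non-capturing group.
        parts.append("(?:" + m.group(1) + ")")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


# Fragment of the updated CHECK line from the patch.
CHECK = (
    "!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}"
    "{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>"
)

# Non-packed spelling, as in the original test expectations.
UNPACKED = (
    "!fir.type<_QM__fortran_type_infoTvalue"
    "{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>"
)

# Packed spelling (assumed shape, inferred from the pattern).
PACKED = (
    "!fir.type<_QM__fortran_type_infoTvalue"
    "<{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>>"
)

rx = filecheck_to_regex(CHECK)
for label, text in (("unpacked", UNPACKED), ("packed", PACKED)):
    print(label, "matches" if rx.search(text) else "does not match")
```

The two optional groups are independent of each other, so a pattern written this way would also accept an unbalanced form such as `<{...}>` with only one closing `>`. That is probably acceptable for a test that is not about struct packing, but it is worth keeping in mind.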
https://github.com/llvm/llvm-project/pull/138228 From flang-commits at lists.llvm.org Thu May 1 21:19:52 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Thu, 01 May 2025 21:19:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang] update LIT test for AIX (NFC) (PR #138228) In-Reply-To: Message-ID: <681447e8.170a0220.1154ce.599a@mx.google.com> ashermancinelli wrote: Thank you! I had a hard time reproducing this one. I think matching everything inside fir.type with `fir.type<{{.+}}>` would be okay too - this test is for correct handling of the volatile, pointer and target attributes, and how they interact with openmp, and those should all be on the `fir.ref`s and not the `fir.type`s. Thanks again! https://github.com/llvm/llvm-project/pull/138228 From flang-commits at lists.llvm.org Thu May 1 21:20:04 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Thu, 01 May 2025 21:20:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang] update LIT test for AIX (NFC) (PR #138228) In-Reply-To: Message-ID: <681447f4.050a0220.293c20.3642@mx.google.com> https://github.com/ashermancinelli approved this pull request. 
https://github.com/llvm/llvm-project/pull/138228 From flang-commits at lists.llvm.org Thu May 1 21:57:58 2025 From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits) Date: Thu, 01 May 2025 21:57:58 -0700 (PDT) Subject: [flang-commits] [flang] [flang] update LIT test for AIX (NFC) (PR #138228) In-Reply-To: Message-ID: <681450d6.170a0220.249b01.9b5c@mx.google.com> https://github.com/kkwli updated https://github.com/llvm/llvm-project/pull/138228 >From 5f425420f0e4dd68b0ebe7478aba0126d02ca368 Mon Sep 17 00:00:00 2001 From: Kelvin Li Date: Thu, 1 May 2025 23:59:03 -0400 Subject: [PATCH] [flang] update LIT test for AIX --- flang/test/Lower/volatile-openmp.f90 | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/flang/test/Lower/volatile-openmp.f90 b/flang/test/Lower/volatile-openmp.f90 index 3269af9618f10..64fd5f04afdd6 100644 --- a/flang/test/Lower/volatile-openmp.f90 +++ b/flang/test/Lower/volatile-openmp.f90 @@ -23,11 +23,11 @@ ! CHECK: %[[VAL_11:.*]] = fir.address_of(@_QFEcontainer) : !fir.ref>>}>> ! CHECK: %[[VAL_12:.*]] = fir.volatile_cast %[[VAL_11]] : (!fir.ref>>}>>) -> !fir.ref>>}>, volatile> ! CHECK: %[[VAL_13:.*]]:2 = hlfir.declare %[[VAL_12]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEcontainer"} : (!fir.ref>>}>, volatile>) -> (!fir.ref>>}>, volatile>, !fir.ref>>}>, volatile>) -! 
CHECK: %[[VAL_14:.*]] = fir.address_of(@_QFE.c.t) : !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>> +! CHECK: %[[VAL_14:.*]] = fir.address_of(@_QFE.c.t) : !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>> ! CHECK: %[[VAL_15:.*]] = fir.shape_shift %[[VAL_0]], %[[VAL_1]] : (index, index) -> !fir.shapeshift<1> -! 
CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_14]](%[[VAL_15]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} : (!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.shapeshift<1>) -> (!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, 
!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>) -! CHECK: %[[VAL_17:.*]] = fir.address_of(@_QFE.dt.t) : !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>> -! 
CHECK: %[[VAL_18:.*]]:2 = hlfir.declare %[[VAL_17]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.dt.t"} : (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) -> (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>, 
!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) +! CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_14]](%[[VAL_15]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} : (!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.shapeshift<1>) -> 
(!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>) +! 
CHECK: %[[VAL_17:.*]] = fir.address_of(@_QFE.dt.t) : !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>> +! CHECK: %[[VAL_18:.*]]:2 = hlfir.declare %[[VAL_17]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.dt.t"} : (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) -> 
(!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>, !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) ! CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_13]]#0{"array"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>}>, volatile>) -> !fir.ref>>, volatile> ! 
CHECK: %[[VAL_20:.*]] = fir.load %[[VAL_19]] : !fir.ref>>, volatile> ! CHECK: %[[VAL_21:.*]]:3 = fir.box_dims %[[VAL_20]], %[[VAL_0]] : (!fir.box>>, index) -> (index, index, index) From flang-commits at lists.llvm.org Thu May 1 22:01:33 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 01 May 2025 22:01:33 -0700 (PDT) Subject: [flang-commits] [flang] 8f06f5d - [flang] update LIT test for AIX (NFC) (#138228) Message-ID: <681451ad.170a0220.47dfa.5b5e@mx.google.com> Author: Kelvin Li Date: 2025-05-02T01:01:30-04:00 New Revision: 8f06f5dca05f8c63caf4cfc171b59ce673afecec URL: https://github.com/llvm/llvm-project/commit/8f06f5dca05f8c63caf4cfc171b59ce673afecec DIFF: https://github.com/llvm/llvm-project/commit/8f06f5dca05f8c63caf4cfc171b59ce673afecec.diff LOG: [flang] update LIT test for AIX (NFC) (#138228) Added: Modified: flang/test/Lower/volatile-openmp.f90 Removed: ################################################################################ diff --git a/flang/test/Lower/volatile-openmp.f90 b/flang/test/Lower/volatile-openmp.f90 index 3269af9618f10..64fd5f04afdd6 100644 --- a/flang/test/Lower/volatile-openmp.f90 +++ b/flang/test/Lower/volatile-openmp.f90 @@ -23,11 +23,11 @@ ! CHECK: %[[VAL_11:.*]] = fir.address_of(@_QFEcontainer) : !fir.ref>>}>> ! CHECK: %[[VAL_12:.*]] = fir.volatile_cast %[[VAL_11]] : (!fir.ref>>}>>) -> !fir.ref>>}>, volatile> ! CHECK: %[[VAL_13:.*]]:2 = hlfir.declare %[[VAL_12]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEcontainer"} : (!fir.ref>>}>, volatile>) -> (!fir.ref>>}>, volatile>, !fir.ref>>}>, volatile>) -! 
CHECK: %[[VAL_14:.*]] = fir.address_of(@_QFE.c.t) : !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>> +! CHECK: %[[VAL_14:.*]] = fir.address_of(@_QFE.c.t) : !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>> ! CHECK: %[[VAL_15:.*]] = fir.shape_shift %[[VAL_0]], %[[VAL_1]] : (index, index) -> !fir.shapeshift<1> -! 
CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_14]](%[[VAL_15]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} : (!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.shapeshift<1>) -> (!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, 
!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>) -! CHECK: %[[VAL_17:.*]] = fir.address_of(@_QFE.dt.t) : !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>> -! 
CHECK: %[[VAL_18:.*]]:2 = hlfir.declare %[[VAL_17]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.dt.t"} : (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) -> (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>, 
!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{genre:i8,__padding0:!fir.array<7xi8>,value:i64}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}>>>>,bounds:!fir.box,value:i64}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) +! CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_14]](%[[VAL_15]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} : (!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.shapeshift<1>) -> 
(!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>) +! 
CHECK: %[[VAL_17:.*]] = fir.address_of(@_QFE.dt.t) : !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>> +! CHECK: %[[VAL_18:.*]]:2 = hlfir.declare %[[VAL_17]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.dt.t"} : (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) -> 
(!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>, !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) ! CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_13]]#0{"array"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>}>, volatile>) -> !fir.ref>>, volatile> ! 
CHECK: %[[VAL_20:.*]] = fir.load %[[VAL_19]] : !fir.ref>>, volatile> ! CHECK: %[[VAL_21:.*]]:3 = fir.box_dims %[[VAL_20]], %[[VAL_0]] : (!fir.box>>, index) -> (index, index, index) From flang-commits at lists.llvm.org Thu May 1 22:01:36 2025 From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits) Date: Thu, 01 May 2025 22:01:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang] update LIT test for AIX (NFC) (PR #138228) In-Reply-To: Message-ID: <681451b0.050a0220.3608fb.3c92@mx.google.com> https://github.com/kkwli closed https://github.com/llvm/llvm-project/pull/138228 From flang-commits at lists.llvm.org Fri May 2 04:38:41 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Fri, 02 May 2025 04:38:41 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <6814aec1.170a0220.edd8a.1215@mx.google.com> ================ @@ -2961,6 +2961,63 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { // clause CheckMultListItems(); + if (GetContext().directive == llvm::omp::Directive::OMPD_task) { + if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { + unsigned version{context_.langOptions().OpenMPVersion}; + if (version == 50 || version == 51) { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_detach, + {llvm::omp::Clause::OMPC_mergeable}); + } else if (version >= 52) { + // OpenMP 5.2: 12.5.2 Detach construct restrictions + if (FindClause(llvm::omp::Clause::OMPC_final)) { + context_.Say(GetContext().clauseSource, + "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); + } + + const auto &detach{ + std::get(detachClause->u)}; + if (const auto *name{parser::Unwrap(detach.v.v)}) { + if (name->symbol) { + Symbol *eventHandleSym{name->symbol}; + auto checkVarAppearsInDataEnvClause = [&](const 
parser:: + OmpObjectList &objs, + std::string clause) { + for (const auto &obj : objs.v) { + if (const parser::Name * + objName{parser::Unwrap(obj)}) { + if (&objName->symbol->GetUltimate() == eventHandleSym) { + context_.Say(GetContext().clauseSource, + "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, + objName->source, clause); + } + } + } + }; + if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_private)}) { + const auto &pClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + const auto &fpClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + const auto &irClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause( + std::get(irClause.v.t), + "IN_REDUCTION"); + } ---------------- Thirumalai-Shaktivel wrote: Good catch, I don't know how I missed this. 
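Note on the hunk quoted above: the PRIVATE, FIRSTPRIVATE and IN_REDUCTION lookups are chained with `else if`, so as written only the first clause found on the construct is inspected. The following is a minimal sketch of testing each clause independently. It is an illustration only, not necessarily the change that was made in the PR; `ClauseObjects` and `findDetachConflicts` are hypothetical names that model the clauses with plain standard-library types rather than flang's parser or semantics classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a data-environment clause: the clause name and
// the variables listed on it. This is not a flang type.
struct ClauseObjects {
  std::string clauseName;
  std::vector<std::string> variables;
};

// Returns the name of every clause that lists the DETACH event handle.
// Each clause is tested on its own (no else-if chain), so a conflict on
// FIRSTPRIVATE is still found when a PRIVATE clause is also present.
std::vector<std::string> findDetachConflicts(
    const std::string &eventHandle,
    const std::vector<ClauseObjects> &clauses) {
  std::vector<std::string> conflicts;
  for (const ClauseObjects &clause : clauses) {
    for (const std::string &variable : clause.variables) {
      if (variable == eventHandle) {
        conflicts.push_back(clause.clauseName);
        break; // one entry per clause is enough
      }
    }
  }
  return conflicts;
}
```

With clauses PRIVATE(x), FIRSTPRIVATE(ev, y) and IN_REDUCTION(ev) and an event handle `ev`, this sketch reports both FIRSTPRIVATE and IN_REDUCTION, whereas an else-if chain would stop at the first clause present.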
https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Fri May 2 05:05:54 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Fri, 02 May 2025 05:05:54 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <6814b522.170a0220.6a323.849d@mx.google.com> ================ @@ -3627,40 +3683,45 @@ void OmpStructureChecker::CheckIsVarPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause) { for (const auto &ompObject : objList.v) { - common::visit( - common::visitors{ - [&](const parser::Designator &designator) { - if (const auto *dataRef{ - std::get_if(&designator.u)}) { - if (IsDataRefTypeParamInquiry(dataRef)) { + CheckIsVarPartOfAnotherVar(source, ompObject, clause); + } +} + +void OmpStructureChecker::CheckIsVarPartOfAnotherVar( + const parser::CharBlock &source, const parser::OmpObject &ompObject, + llvm::StringRef clause) { + common::visit( + common::visitors{ + [&](const parser::Designator &designator) { + if (const auto *dataRef{ + std::get_if(&designator.u)}) { + if (IsDataRefTypeParamInquiry(dataRef)) { + context_.Say(source, + "A type parameter inquiry cannot appear on the %s " + "directive"_err_en_US, ---------------- Thirumalai-Shaktivel wrote: Done https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Fri May 2 05:05:56 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Fri, 02 May 2025 05:05:56 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <6814b524.050a0220.97551.54ad@mx.google.com> ================ @@ -2961,6 +2961,63 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { // clause CheckMultListItems(); + if (GetContext().directive == 
llvm::omp::Directive::OMPD_task) { + if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { + unsigned version{context_.langOptions().OpenMPVersion}; + if (version == 50 || version == 51) { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_detach, + {llvm::omp::Clause::OMPC_mergeable}); + } else if (version >= 52) { + // OpenMP 5.2: 12.5.2 Detach construct restrictions + if (FindClause(llvm::omp::Clause::OMPC_final)) { + context_.Say(GetContext().clauseSource, + "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); + } + + const auto &detach{ + std::get(detachClause->u)}; + if (const auto *name{parser::Unwrap(detach.v.v)}) { + if (name->symbol) { ---------------- Thirumalai-Shaktivel wrote: Done https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Fri May 2 05:05:58 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Fri, 02 May 2025 05:05:58 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <6814b526.170a0220.1d4faa.c8db@mx.google.com> ================ @@ -3627,40 +3683,45 @@ void OmpStructureChecker::CheckIsVarPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause) { for (const auto &ompObject : objList.v) { - common::visit( - common::visitors{ - [&](const parser::Designator &designator) { - if (const auto *dataRef{ - std::get_if(&designator.u)}) { - if (IsDataRefTypeParamInquiry(dataRef)) { + CheckIsVarPartOfAnotherVar(source, ompObject, clause); + } +} + +void OmpStructureChecker::CheckIsVarPartOfAnotherVar( ---------------- Thirumalai-Shaktivel wrote: Done https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Fri May 2 05:06:02 2025 From: flang-commits at lists.llvm.org (Thirumalai 
Shaktivel via flang-commits) Date: Fri, 02 May 2025 05:06:02 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <6814b52a.050a0220.3910fc.599f@mx.google.com> Thirumalai-Shaktivel wrote: Thanks for the reviews, @kparzysz and @tblah. @tblah, can you please do a final review on this PR? I have addressed your comments. If I missed anything, please let me know. https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Fri May 2 05:40:22 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 02 May 2025 05:40:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <6814bd36.050a0220.2936bb.5571@mx.google.com> https://github.com/jeanPerier commented: Thanks for the update! I think we should try to get rid of the assumption that the FIROpBuilder insertion point is anywhere but somewhere inside a ModuleOp (that assumption is required to gather compilation attributes). That should be done outside of this patch. But in the meantime, your change looks OK, except I would push it a bit further to make it cleaner. See inline comment. https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Fri May 2 05:40:22 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 02 May 2025 05:40:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. 
(PR #136012) In-Reply-To: Message-ID: <6814bd36.170a0220.39873b.1a2c@mx.google.com> https://github.com/jeanPerier edited https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Fri May 2 05:40:22 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 02 May 2025 05:40:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <6814bd36.170a0220.d302d.1b13@mx.google.com> ================ @@ -403,18 +403,21 @@ class FirConverter : public Fortran::lower::AbstractConverter { [&](Fortran::lower::pft::FunctionLikeUnit &f) { if (f.isMainProgram()) hasMainProgram = true; - declareFunction(f); + createGlobalOutsideOfFunctionLowering( ---------------- jeanPerier wrote: Could you make that a single call that wraps the whole `for (Fortran::lower::pft::Program::Units &u ...` loop? There is no real need to create and delete a mock builder for each declaration. I think that would involve modifying `createGlobalOutsideOfFunctionLowering` to check whether a builder is already set and, in that case, call the callback directly so that nested calls are allowed. Also, this should probably be renamed to something like `createBuilderOutsideOfFuncOpAndDo` if it is used for things other than global creation. 
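The pattern suggested in the comment above can be sketched roughly as follows. This is a minimal, self-contained illustration under stated assumptions: `MockBuilder`, `Converter`, `withBuilderDo`, `declareOne` and `declareAll` are hypothetical names, not flang's `FirConverter` or `fir::FirOpBuilder` API. The wrapper checks whether a builder is already set and, if so, runs the callback directly, so one outer call can wrap a whole loop of declarations.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the real builder; not fir::FirOpBuilder.
struct MockBuilder {
  int opsCreated = 0;
};

class Converter {
public:
  // Runs `callback` with a builder available. If a builder is already set
  // (a nested call), the callback runs directly; otherwise a temporary
  // builder is created before the callback and dropped after it.
  // A real implementation would likely reset the pointer with a scope guard.
  void withBuilderDo(const std::function<void()> &callback) {
    if (builder_) {
      callback();
      return;
    }
    MockBuilder temporary;
    builder_ = &temporary;
    ++buildersCreated_;
    callback();
    builder_ = nullptr;
  }

  // A single declaration; safe to call with or without an outer wrapper.
  void declareOne() {
    withBuilderDo([this] {
      assert(builder_ && "callback must see a builder");
      ++builder_->opsCreated;
      ++declarations_;
    });
  }

  // Wraps the whole loop in one call, so only one builder is created.
  void declareAll(int count) {
    withBuilderDo([this, count] {
      for (int i = 0; i < count; ++i)
        declareOne();
    });
  }

  int buildersCreated() const { return buildersCreated_; }
  int declarations() const { return declarations_; }

private:
  MockBuilder *builder_ = nullptr;
  int buildersCreated_ = 0;
  int declarations_ = 0;
};
```

In this sketch, calling `declareOne()` three times on its own creates three temporary builders, while `declareAll(3)` creates one and reuses it for every nested call.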
https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Fri May 2 05:48:57 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 02 May 2025 05:48:57 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <6814bf39.050a0220.3b3b92.5714@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Fri May 2 05:48:58 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 02 May 2025 05:48:58 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <6814bf3a.170a0220.3683b7.1a69@mx.google.com> https://github.com/tblah approved this pull request. Thanks for the update. No need for another round of review for my nit comment. https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Fri May 2 05:49:00 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 02 May 2025 05:49:00 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <6814bf3c.170a0220.165091.8012@mx.google.com> ================ @@ -4137,6 +4195,33 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } } +void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { + unsigned version{context_.langOptions().OpenMPVersion}; + if (version >= 52) { + SetContextClauseInfo(llvm::omp::Clause::OMPC_detach); + } else { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + } ---------------- tblah wrote: nit: won't CheckAllowedClause understand that this is allowed in newer standard versions? 
https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Fri May 2 05:50:10 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 02 May 2025 05:50:10 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][flang-driver] Support flag -finstrument-functions (PR #137996) In-Reply-To: Message-ID: <6814bf82.050a0220.25f316.5b9c@mx.google.com> https://github.com/tblah approved this pull request. LGTM, thanks https://github.com/llvm/llvm-project/pull/137996 From flang-commits at lists.llvm.org Fri May 2 06:02:35 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 02 May 2025 06:02:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <6814c26b.170a0220.102b18.d332@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Fri May 2 06:02:36 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 02 May 2025 06:02:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <6814c26c.170a0220.39f26d.95a8@mx.google.com> ================ @@ -3091,10 +3131,32 @@ static void genAtomicCapture(lower::AbstractConverter &converter, firOpBuilder.setInsertionPointToStart(&block); const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { ---------------- tblah wrote: nit: These blocks look very similar (identical?). Maybe they should go in a helper function? 
https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Fri May 2 06:02:35 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 02 May 2025 06:02:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <6814c26b.170a0220.2abd70.b26e@mx.google.com> https://github.com/tblah approved this pull request. Looks good. Just a nit. Thanks for following up with this. Save it a few days before merging in case anyone else wants to take a look. https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Fri May 2 06:37:08 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 02 May 2025 06:37:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Hide strict volatility checks behind flag (PR #138183) In-Reply-To: Message-ID: <6814ca84.630a0220.d77f2.32d4@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. 
The use of the new flag LGTM https://github.com/llvm/llvm-project/pull/138183 From flang-commits at lists.llvm.org Fri May 2 07:04:52 2025 From: flang-commits at lists.llvm.org (Susan Tan =?utf-8?b?44K5LeOCtuODs+OAgOOCv+ODsw==?= via flang-commits) Date: Fri, 02 May 2025 07:04:52 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Propogate fir.declare attributes through cg-rewrite (PR #137207) In-Reply-To: Message-ID: <6814d104.050a0220.253e8e.8634@mx.google.com> https://github.com/SusanTan closed https://github.com/llvm/llvm-project/pull/137207 From flang-commits at lists.llvm.org Fri May 2 07:24:05 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 02 May 2025 07:24:05 -0700 (PDT) Subject: [flang-commits] [flang] 1ab70fe - [flang] Enable installing binding header and modules as part of distribution (#133962) Message-ID: <6814d585.170a0220.35760f.762d@mx.google.com> Author: Reilly Brogan Date: 2025-05-02T15:24:02+01:00 New Revision: 1ab70fed622bcbd4a679f66b1c638cdd46f0365c URL: https://github.com/llvm/llvm-project/commit/1ab70fed622bcbd4a679f66b1c638cdd46f0365c DIFF: https://github.com/llvm/llvm-project/commit/1ab70fed622bcbd4a679f66b1c638cdd46f0365c.diff LOG: [flang] Enable installing binding header and modules as part of distribution (#133962) This allows the C/Fortran interop header (`ISO_Fortran_binding.h`) and the Fortran module interfaces (`.mod` files) to be installed with `LLVM_DISTRIBUTION_COMPONENTS`. 
Signed-off-by: Reilly Brogan Added: Modified: flang/CMakeLists.txt flang/tools/f18/CMakeLists.txt Removed: ################################################################################ diff --git a/flang/CMakeLists.txt b/flang/CMakeLists.txt index f43f4ab310a13..f358a93fdd792 100644 --- a/flang/CMakeLists.txt +++ b/flang/CMakeLists.txt @@ -585,5 +585,7 @@ configure_file( get_clang_resource_dir(HEADER_INSTALL_DIR SUBDIR include) install( FILES include/flang/ISO_Fortran_binding.h - DESTINATION ${HEADER_INSTALL_DIR} ) - + DESTINATION ${HEADER_INSTALL_DIR} + COMPONENT flang-fortran-binding) +add_llvm_install_targets(install-flang-fortran-binding + COMPONENT flang-fortran-binding) diff --git a/flang/tools/f18/CMakeLists.txt b/flang/tools/f18/CMakeLists.txt index 817f3687dbcc8..fb5510d7163d1 100644 --- a/flang/tools/f18/CMakeLists.txt +++ b/flang/tools/f18/CMakeLists.txt @@ -111,7 +111,7 @@ if (NOT CMAKE_CROSSCOMPILING) DEPENDS flang ${FLANG_SOURCE_DIR}/module/${filename}.f90 ${FLANG_SOURCE_DIR}/module/__fortran_builtins.f90 ${depends} ) list(APPEND MODULE_FILES ${base}.mod) - install(FILES ${base}.mod DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}/flang") + install(FILES ${base}.mod DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}/flang" COMPONENT flang-module-interfaces) # If a module has been compiled into an object file, add the file to # the link line for the flang_rt.runtime library. 
@@ -144,12 +144,14 @@ if (NOT CMAKE_CROSSCOMPILING) DEPENDS ${base}.mod COMMAND ${CMAKE_COMMAND} -E copy ${base}_kinds.mod ${base}_kinds.f18.mod) list(APPEND MODULE_FILES ${base}.mod ${base}.f18.mod ${base}_kinds.mod ${base}_kinds.f18.mod) - install(FILES ${base}.mod ${base}.f18.mod ${base}_kinds.mod ${base}_kinds.f18.mod DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}/flang") + install(FILES ${base}.mod ${base}.f18.mod ${base}_kinds.mod ${base}_kinds.f18.mod DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}/flang" COMPONENT flang-module-interfaces) elseif ("openmp" IN_LIST LLVM_ENABLE_RUNTIMES) message(STATUS "OpenMP runtime support enabled via LLVM_ENABLE_RUNTIMES, assuming omp_lib.mod is built there") else() message(WARNING "Not building omp_lib.mod, no OpenMP runtime in either LLVM_ENABLE_PROJECTS or LLVM_ENABLE_RUNTIMES") endif() + add_llvm_install_targets(install-flang-module-interfaces + COMPONENT flang-module-interfaces) endif() add_custom_target(module_files ALL DEPENDS ${MODULE_FILES}) From flang-commits at lists.llvm.org Fri May 2 07:24:08 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 02 May 2025 07:24:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Enable installing binding header and modules as part of distribution (PR #133962) In-Reply-To: Message-ID: <6814d588.170a0220.30406e.7688@mx.google.com> https://github.com/kiranchandramohan closed https://github.com/llvm/llvm-project/pull/133962 From flang-commits at lists.llvm.org Fri May 2 07:24:30 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 02 May 2025 07:24:30 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Enable installing binding header and modules as part of distribution (PR #133962) In-Reply-To: Message-ID: <6814d59e.170a0220.954bd.7807@mx.google.com> github-actions[bot] wrote: @ReillyBrogan Congratulations on having your first Pull Request (PR) merged into the LLVM Project! 
Your changes will be combined with recent changes from other authors, then tested by our [build bots](https://lab.llvm.org/buildbot/). If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail [here](https://llvm.org/docs/MyFirstTypoFix.html#myfirsttypofix-issues-after-landing-your-pr).

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of [LLVM development](https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy). You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!
https://github.com/llvm/llvm-project/pull/133962 From flang-commits at lists.llvm.org Fri May 2 07:35:27 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 02 May 2025 07:35:27 -0700 (PDT) Subject: [flang-commits] [flang] a18adb2 - [flang] fix scoping of cray pointer declarations and add check for initialization (#136776) Message-ID: <6814d82f.630a0220.271d88.8f7a@mx.google.com> Author: Andre Kuhlenschmidt Date: 2025-05-02T07:35:24-07:00 New Revision: a18adb2358cac0f14cce5faf27f607c2b1601ed9 URL: https://github.com/llvm/llvm-project/commit/a18adb2358cac0f14cce5faf27f607c2b1601ed9 DIFF: https://github.com/llvm/llvm-project/commit/a18adb2358cac0f14cce5faf27f607c2b1601ed9.diff LOG: [flang] fix scoping of cray pointer declarations and add check for initialization (#136776) This PR: - makes Cray pointer declarations shadow previous bindings instead of modifying them, - errors when a Cray pointee has the SAVE attribute, and - adds a missing newline after dumping the list of Cray pointers in a scope. Closes #135579 Added: flang/test/Semantics/resolve125.f90 Modified: flang/lib/Semantics/check-declarations.cpp flang/lib/Semantics/resolve-names.cpp flang/lib/Semantics/semantics.cpp flang/test/Lower/OpenMP/cray-pointers01.f90 flang/test/Semantics/declarations08.f90 Removed: ################################################################################ diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 8d5e034f8624b..318085518cc57 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -963,7 +963,18 @@ void CheckHelper::CheckObjectEntity( "'%s' is a data object and may not be EXTERNAL"_err_en_US, symbol.name()); } - + if (symbol.test(Symbol::Flag::CrayPointee)) { + // NB, IsSaved was too smart here. 
+ if (details.init()) { + messages_.Say( + "Cray pointee '%s' may not be initialized"_err_en_US, symbol.name()); + } + if (symbol.attrs().test(Attr::SAVE)) { + messages_.Say( + "Cray pointee '%s' may not have the SAVE attribute"_err_en_US, + symbol.name()); + } + } if (derived) { bool isUnsavedLocal{ isLocalVariable && !IsAllocatable(symbol) && !IsSaved(symbol)}; diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..e0550b3724bef 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6650,7 +6650,7 @@ bool DeclarationVisitor::Pre(const parser::BasedPointer &) { void DeclarationVisitor::Post(const parser::BasedPointer &bp) { const parser::ObjectName &pointerName{std::get<0>(bp.t)}; - auto *pointer{FindSymbol(pointerName)}; + auto *pointer{FindInScope(pointerName)}; if (!pointer) { pointer = &MakeSymbol(pointerName, ObjectEntityDetails{}); } else if (!ConvertToObjectEntity(*pointer)) { diff --git a/flang/lib/Semantics/semantics.cpp b/flang/lib/Semantics/semantics.cpp index 10a01039ea0ae..e07054f8ec564 100644 --- a/flang/lib/Semantics/semantics.cpp +++ b/flang/lib/Semantics/semantics.cpp @@ -731,6 +731,7 @@ void DoDumpSymbols(llvm::raw_ostream &os, const Scope &scope, int indent) { for (const auto &[pointee, pointer] : scope.crayPointers()) { os << " (" << pointer->name() << ',' << pointee << ')'; } + os << '\n'; } for (const auto &pair : scope.commonBlocks()) { const auto &symbol{*pair.second}; diff --git a/flang/test/Lower/OpenMP/cray-pointers01.f90 b/flang/test/Lower/OpenMP/cray-pointers01.f90 index 87692ccbadfe3..d3a5a3cdd39a3 100644 --- a/flang/test/Lower/OpenMP/cray-pointers01.f90 +++ b/flang/test/Lower/OpenMP/cray-pointers01.f90 @@ -33,7 +33,7 @@ subroutine set_cray_pointer end module program test_cray_pointers_01 - real*8, save :: var(*) + real*8 :: var(*) ! CHECK: %[[BOX_ALLOCA:.*]] = fir.alloca !fir.box>> ! 
CHECK: %[[IVAR_ALLOCA:.*]] = fir.alloca i64 {bindc_name = "ivar", uniq_name = "_QFEivar"} ! CHECK: %[[IVAR_DECL_01:.*]]:2 = hlfir.declare %[[IVAR_ALLOCA]] {uniq_name = "_QFEivar"} : (!fir.ref) -> (!fir.ref, !fir.ref) diff --git a/flang/test/Semantics/declarations08.f90 b/flang/test/Semantics/declarations08.f90 index bd14131b33c28..2c4027d117365 100644 --- a/flang/test/Semantics/declarations08.f90 +++ b/flang/test/Semantics/declarations08.f90 @@ -5,4 +5,10 @@ !ERROR: Cray pointee 'x' may not be a member of a COMMON block common x equivalence(y,z) +!ERROR: Cray pointee 'v' may not be initialized +real :: v = 42.0 +pointer(p,v) +!ERROR: Cray pointee 'u' may not have the SAVE attribute +save u +pointer(p, u) end diff --git a/flang/test/Semantics/resolve125.f90 b/flang/test/Semantics/resolve125.f90 new file mode 100644 index 0000000000000..e040c006ec179 --- /dev/null +++ b/flang/test/Semantics/resolve125.f90 @@ -0,0 +1,64 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols %s 2>&1 | FileCheck %s + +!CHECK: Module scope: m1 +!CHECK: i, PUBLIC size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: REAL({{[0-9]+}}) init:{{.+}} +!CHECK: init, PUBLIC (Subroutine): Subprogram () +!CHECK: o, PUBLIC (CrayPointee) size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: REAL({{[0-9]+}}) +!CHECK: ptr, PUBLIC (CrayPointer) size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: INTEGER({{[0-9]+}}) +module m1 + implicit none + real:: o + real:: i = 42.0 + pointer (ptr, o) +contains + !CHECK: Subprogram scope: init + subroutine init + implicit none + ptr=loc(i) + print *, "init : o= ", o + end subroutine init +end module m1 + +!CHECK: Module scope: m2 +!CHECK: i, PUBLIC: Use from i in m1 +!CHECK: i2, PUBLIC size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: REAL({{[0-9]+}}) init:{{.+}} +!CHECK: init, PUBLIC (Subroutine): Use from init in m1 +!CHECK: o, PUBLIC (CrayPointee): Use from o in m1 +!CHECK: ptr, PUBLIC (CrayPointer): Use from ptr in m1 +!CHECK: reset, PUBLIC (Subroutine): Subprogram () 
+module m2 + use m1 + implicit none + real:: i2 = 777.0 +contains + !CHECK: Subprogram scope: reset + !CHECK: o2 (CrayPointee) size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: REAL({{[0-9]+}}) + !CHECK: ptr (CrayPointer) size={{[0-9]+}} offset={{[0-9]+}}: ObjectEntity type: INTEGER({{[0-9]+}}) + subroutine reset + real::o2 + pointer (ptr, o2) + ptr=loc(i2) + print *, "reset : o= ", o, " o2 = ", o2 + o2 = 666.0 + end subroutine reset +end module m2 + +!CHECK: MainProgram scope: main +!CHECK: i: Use from i in m2 +!CHECK: i2: Use from i2 in m2 +!CHECK: init (Subroutine): Use from init in m2 +!CHECK: o (CrayPointee): Use from o in m2 +!CHECK: ptr (CrayPointer): Use from ptr in m2 +!CHECK: reset (Subroutine): Use from reset in m2 +program main + use m2 + implicit none + call init + call reset + write(6,*) "main : o = ", o + if (o == 42.0) then + print *, "pass" + else + print *, "fail" + end if +end program main \ No newline at end of file From flang-commits at lists.llvm.org Fri May 2 07:35:31 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 02 May 2025 07:35:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix scoping of cray pointer declarations and add check for initialization (PR #136776) In-Reply-To: Message-ID: <6814d833.170a0220.1f1bda.8c25@mx.google.com> https://github.com/akuhlens closed https://github.com/llvm/llvm-project/pull/136776 From flang-commits at lists.llvm.org Fri May 2 07:38:47 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 02 May 2025 07:38:47 -0700 (PDT) Subject: [flang-commits] [flang] 580da48 - [flang][flang-driver] Support flag -finstrument-functions (#137996) Message-ID: <6814d8f7.050a0220.151ca4.d47b@mx.google.com> Author: Anchu Rajendran S Date: 2025-05-02T07:38:44-07:00 New Revision: 580da48a93ea3065cced426bb37df65a933c21f7 URL: https://github.com/llvm/llvm-project/commit/580da48a93ea3065cced426bb37df65a933c21f7 DIFF: 
https://github.com/llvm/llvm-project/commit/580da48a93ea3065cced426bb37df65a933c21f7.diff LOG: [flang][flang-driver] Support flag -finstrument-functions (#137996) Added: flang/test/Driver/func-attr-instrument-functions.f90 Modified: clang/include/clang/Driver/Options.td clang/lib/Driver/ToolChains/Flang.cpp flang/include/flang/Frontend/CodeGenOptions.def flang/include/flang/Optimizer/Transforms/Passes.td flang/include/flang/Tools/CrossToolHelpers.h flang/lib/Frontend/CompilerInvocation.cpp flang/lib/Optimizer/Passes/Pipelines.cpp flang/lib/Optimizer/Transforms/FunctionAttr.cpp Removed: ################################################################################ diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 561b0498c549c..736088a70d189 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -2824,10 +2824,12 @@ def finput_charset_EQ : Joined<["-"], "finput-charset=">, Visibility<[ClangOption, FlangOption, FC1Option]>, Group, HelpText<"Specify the default character set for source files">; def fexec_charset_EQ : Joined<["-"], "fexec-charset=">, Group; -def finstrument_functions : Flag<["-"], "finstrument-functions">, Group, - Visibility<[ClangOption, CC1Option]>, - HelpText<"Generate calls to instrument function entry and exit">, - MarshallingInfoFlag>; +def finstrument_functions + : Flag<["-"], "finstrument-functions">, + Group, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, + HelpText<"Generate calls to instrument function entry and exit">, + MarshallingInfoFlag>; def finstrument_functions_after_inlining : Flag<["-"], "finstrument-functions-after-inlining">, Group, Visibility<[ClangOption, CC1Option]>, HelpText<"Like -finstrument-functions, but insert the calls after inlining">, diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index e9d5a844ab073..a407e295c09bd 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp 
+++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -128,7 +128,8 @@ void Flang::addOtherOptions(const ArgList &Args, ArgStringList &CmdArgs) const { options::OPT_std_EQ, options::OPT_W_Joined, options::OPT_fconvert_EQ, options::OPT_fpass_plugin_EQ, options::OPT_funderscoring, options::OPT_fno_underscoring, - options::OPT_funsigned, options::OPT_fno_unsigned}); + options::OPT_funsigned, options::OPT_fno_unsigned, + options::OPT_finstrument_functions}); llvm::codegenoptions::DebugInfoKind DebugInfoKind; if (Args.hasArg(options::OPT_gN_Group)) { diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 57830bf51a1b3..d9dbd274e83e5 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,6 +24,8 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. +CODEGENOPT(InstrumentFunctions, 1, 0) ///< Set when -finstrument_functions is + ///< enabled on the compile step. CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. 
CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index c59416fa2c024..9b6919eec3f73 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -393,6 +393,14 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { clEnumValN(mlir::LLVM::framePointerKind::FramePointerKind::All, "All", ""), clEnumValN(mlir::LLVM::framePointerKind::FramePointerKind::Reserved, "Reserved", "") )}]>, + Option<"instrumentFunctionEntry", "instrument-function-entry", + "std::string", /*default=*/"", + "Sets the name of the profiling function called during function " + "entry">, + Option<"instrumentFunctionExit", "instrument-function-exit", + "std::string", /*default=*/"", + "Sets the name of the profiling function called during function " + "exit">, Option<"noInfsFPMath", "no-infs-fp-math", "bool", /*default=*/"false", "Set the no-infs-fp-math attribute on functions in the module.">, Option<"noNaNsFPMath", "no-nans-fp-math", "bool", /*default=*/"false", diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 1dbc18e2b348b..118695bbe2626 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,13 +102,17 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + if (opts.InstrumentFunctions) { + InstrumentFunctionEntry = "__cyg_profile_func_enter"; + InstrumentFunctionExit = "__cyg_profile_func_exit"; + } } llvm::OptimizationLevel OptLevel; ///< optimisation level bool StackArrays = false; ///< convert memory allocations to alloca. bool Underscoring = true; ///< add underscores to function names. 
bool LoopVersioning = false; ///< Run the version loop pass. - bool AliasAnalysis = false; ///< Add TBAA tags to generated LLVMIR + bool AliasAnalysis = false; ///< Add TBAA tags to generated LLVMIR. llvm::codegenoptions::DebugInfoKind DebugInfo = llvm::codegenoptions::NoDebugInfo; ///< Debug info generation. llvm::FramePointerKind FramePointerKind = @@ -124,6 +128,12 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. bool EnableOpenMP = false; ///< Enable OpenMP lowering. + std::string InstrumentFunctionEntry = + ""; ///< Name of the instrument-function that is called on each + ///< function-entry + std::string InstrumentFunctionExit = + ""; ///< Name of the instrument-function that is called on each + ///< function-exit }; struct OffloadModuleOpts { diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 6f87a18d69c3d..d6ba644b1400d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -310,6 +310,9 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) opts.OffloadObjects.push_back(a->getValue()); + if (args.hasArg(clang::driver::options::OPT_finstrument_functions)) + opts.InstrumentFunctions = 1; + // -flto=full/thin option. 
if (const llvm::opt::Arg *a = args.getLastArg(clang::driver::options::OPT_flto_EQ)) { diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 130cbe72ec273..a3ef473ea39b7 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -349,7 +349,8 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None; pm.addPass(fir::createFunctionAttr( - {framePointerKind, config.NoInfsFPMath, config.NoNaNsFPMath, + {framePointerKind, config.InstrumentFunctionEntry, + config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, ""})); diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index c79843fac4ce2..43e4c1a7af3cd 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -28,6 +28,8 @@ namespace { class FunctionAttrPass : public fir::impl::FunctionAttrBase { public: FunctionAttrPass(const fir::FunctionAttrOptions &options) { + instrumentFunctionEntry = options.instrumentFunctionEntry; + instrumentFunctionExit = options.instrumentFunctionExit; framePointerKind = options.framePointerKind; noInfsFPMath = options.noInfsFPMath; noNaNsFPMath = options.noNaNsFPMath; @@ -72,6 +74,14 @@ void FunctionAttrPass::runOnOperation() { auto llvmFuncOpName = mlir::OperationName(mlir::LLVM::LLVMFuncOp::getOperationName(), context); + if (!instrumentFunctionEntry.empty()) + func->setAttr(mlir::LLVM::LLVMFuncOp::getInstrumentFunctionEntryAttrName( + llvmFuncOpName), + mlir::StringAttr::get(context, instrumentFunctionEntry)); + if (!instrumentFunctionExit.empty()) + func->setAttr(mlir::LLVM::LLVMFuncOp::getInstrumentFunctionExitAttrName( + llvmFuncOpName), + mlir::StringAttr::get(context, instrumentFunctionExit)); if 
(noInfsFPMath) func->setAttr( mlir::LLVM::LLVMFuncOp::getNoInfsFpMathAttrName(llvmFuncOpName), diff --git a/flang/test/Driver/func-attr-instrument-functions.f90 b/flang/test/Driver/func-attr-instrument-functions.f90 new file mode 100644 index 0000000000000..0ef81806e9fb9 --- /dev/null +++ b/flang/test/Driver/func-attr-instrument-functions.f90 @@ -0,0 +1,9 @@ +! RUN: %flang -O1 -finstrument-functions -emit-llvm -S -o - %s 2>&1| FileCheck %s + +subroutine func +end subroutine func + +! CHECK: define void @func_() +! CHECK: {{.*}}call void @__cyg_profile_func_enter(ptr {{.*}}@func_, ptr {{.*}}) +! CHECK: {{.*}}call void @__cyg_profile_func_exit(ptr {{.*}}@func_, ptr {{.*}}) +! CHECK-NEXT: ret {{.*}} From flang-commits at lists.llvm.org Fri May 2 07:38:52 2025 From: flang-commits at lists.llvm.org (Anchu Rajendran S via flang-commits) Date: Fri, 02 May 2025 07:38:52 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][flang-driver] Support flag -finstrument-functions (PR #137996) In-Reply-To: Message-ID: <6814d8fc.170a0220.1e1a05.8f6b@mx.google.com> https://github.com/anchuraj closed https://github.com/llvm/llvm-project/pull/137996 From flang-commits at lists.llvm.org Fri May 2 05:05:48 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Fri, 02 May 2025 05:05:48 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <6814b51c.170a0220.92612.cfe3@mx.google.com> https://github.com/Thirumalai-Shaktivel updated https://github.com/llvm/llvm-project/pull/119172 >From 6e491ccd80b902df6946713a372ec9667e0811c3 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Mon, 9 Dec 2024 07:37:01 +0000 Subject: [PATCH 01/14] [Flang] [OpenMP] Add semantic checks for detach clause in Task Fixes: - Add semantic checks along with the tests - Move the detach clause to allowedOnceClauses list in Task construct Restrictions:\ OpenMP 5.0: Task 
construct - At most one detach clause can appear on the directive. - If a detach clause appears on the directive, then a mergeable clause cannot appear on the same directive. OpenMP 5.2: Detach construct - If a detach clause appears on a directive, then the encountering task must not be a final task. - A variable that appears in a detach clause cannot appear as a list item on a data-environment attribute clause on the same construct. - A variable that is part of another variable (as an array element or a structure element) cannot appear in a detach clause. - event-handle must not have the POINTER attribute. --- flang/lib/Semantics/check-omp-structure.cpp | 141 +++++++++++++++----- flang/lib/Semantics/check-omp-structure.h | 2 + flang/test/Semantics/OpenMP/detach01.f90 | 65 +++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 2 +- 4 files changed, 179 insertions(+), 31 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/detach01.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 95b962f5daf57..6641e39c6e358 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2733,6 +2733,59 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { llvm::omp::Clause::OMPC_copyprivate, {llvm::omp::Clause::OMPC_nowait}); } + if (GetContext().directive == llvm::omp::Directive::OMPD_task) { + if (auto *d_clause{FindClause(llvm::omp::Clause::OMPC_detach)}) { + // OpenMP 5.0: Task construct restrictions + CheckNotAllowedIfClause( + llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); + + // OpenMP 5.2: Task construct restrictions + if (FindClause(llvm::omp::Clause::OMPC_final)) { + context_.Say(GetContext().clauseSource, + "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); + } + + const auto &detachClause{ + std::get(d_clause->u)}; + if (const auto 
*name{parser::Unwrap(detachClause.v.v)}) { + if (name->symbol) { + std::string eventHandleSymName{name->ToString()}; + auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList + &objs, + std::string clause) { + for (const auto &obj : objs.v) { + if (const parser::Name *objName{ + parser::Unwrap(obj)}) { + if (objName->ToString() == eventHandleSymName) { + context_.Say(GetContext().clauseSource, + "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, + eventHandleSymName, clause); + } + } + } + }; + if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_private)}) { + const auto &pClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + const auto &fpClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + const auto &irClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause( + std::get(irClause.v.t), "IN_REDUCTION"); + } + } + } + } + } + auto testThreadprivateVarErr = [&](Symbol sym, parser::Name name, llvmOmpClause clauseTy) { if (sym.test(Symbol::Flag::OmpThreadprivate)) @@ -2815,7 +2868,6 @@ CHECK_SIMPLE_CLAUSE(Capture, OMPC_capture) CHECK_SIMPLE_CLAUSE(Contains, OMPC_contains) CHECK_SIMPLE_CLAUSE(Default, OMPC_default) CHECK_SIMPLE_CLAUSE(Depobj, OMPC_depobj) -CHECK_SIMPLE_CLAUSE(Detach, OMPC_detach) CHECK_SIMPLE_CLAUSE(DeviceType, OMPC_device_type) CHECK_SIMPLE_CLAUSE(DistSchedule, OMPC_dist_schedule) CHECK_SIMPLE_CLAUSE(Exclusive, OMPC_exclusive) @@ -3386,40 +3438,45 @@ void OmpStructureChecker::CheckIsVarPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause) { for (const auto &ompObject : objList.v) { - common::visit( - common::visitors{ - [&](const 
parser::Designator &designator) { - if (const auto *dataRef{ - std::get_if(&designator.u)}) { - if (IsDataRefTypeParamInquiry(dataRef)) { + CheckIsVarPartOfAnotherVar(source, ompObject, clause); + } +} + +void OmpStructureChecker::CheckIsVarPartOfAnotherVar( + const parser::CharBlock &source, const parser::OmpObject &ompObject, + llvm::StringRef clause) { + common::visit( + common::visitors{ + [&](const parser::Designator &designator) { + if (const auto *dataRef{ + std::get_if(&designator.u)}) { + if (IsDataRefTypeParamInquiry(dataRef)) { + context_.Say(source, + "A type parameter inquiry cannot appear on the %s " + "directive"_err_en_US, + ContextDirectiveAsFortran()); + } else if (parser::Unwrap( + ompObject) || + parser::Unwrap(ompObject)) { + if (llvm::omp::nonPartialVarSet.test(GetContext().directive)) { context_.Say(source, - "A type parameter inquiry cannot appear on the %s " + "A variable that is part of another variable (as an " + "array or structure element) cannot appear on the %s " "directive"_err_en_US, ContextDirectiveAsFortran()); - } else if (parser::Unwrap( - ompObject) || - parser::Unwrap(ompObject)) { - if (llvm::omp::nonPartialVarSet.test( - GetContext().directive)) { - context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear on the %s " - "directive"_err_en_US, - ContextDirectiveAsFortran()); - } else { - context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear in a " - "%s clause"_err_en_US, - clause.data()); - } + } else { + context_.Say(source, + "A variable that is part of another variable (as an " + "array or structure element) cannot appear in a " + "%s clause"_err_en_US, + clause.data()); } } - }, - [&](const parser::Name &name) {}, - }, - ompObject.u); - } + } + }, + [&](const parser::Name &name) {}, + }, + ompObject.u); } void OmpStructureChecker::Enter(const parser::OmpClause::Firstprivate &x) { @@ -3746,6 
+3803,30 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } } +void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { + // OpenMP 5.0: Task construct restrictions + CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + + // OpenMP 5.2: Detach clause restrictions + CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); + if (const auto *name{parser::Unwrap(x.v.v)}) { + if (name->symbol) { + if (IsPointer(*name->symbol)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, + name->ToString()); + } + } + auto type{name->symbol->GetType()}; + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer) || + evaluate::ToInt64(type->numericTypeSpec().kind()) != 8) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, + name->ToString()); + } + } +} + void OmpStructureChecker::CheckAllowedMapTypes( const parser::OmpMapType::Value &type, const std::list &allowedMapTypeList) { diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 346a7bed9138f..a8f94992ff091 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -186,6 +186,8 @@ class OmpStructureChecker const common::Indirection &, const parser::Name &); void CheckDoacross(const parser::OmpDoacross &doa); bool IsDataRefTypeParamInquiry(const parser::DataRef *dataRef); + void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, + const parser::OmpObject &obj, llvm::StringRef clause = ""); void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause = ""); void CheckThreadprivateOrDeclareTargetVar( diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 new file mode 100644 index 0000000000000..e342fcd1b19b4 --- 
/dev/null +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -0,0 +1,65 @@ +! REQUIRES: openmp_runtime +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=50 + +! OpenMP Version 5.2 +! Various checks for DETACH Clause (12.5.2) + +program test_detach + use omp_lib + implicit none + integer :: e, x + integer(omp_event_handle_kind) :: event_01, event_02(2) + integer(omp_event_handle_kind), pointer :: event_03 + + + type :: t + integer(omp_event_handle_kind) :: event + end type + + type(t) :: t_01 + + !ERROR: The event-handle: `e` must be of type integer(kind=omp_event_handle_kind) + !$omp task detach(e) + x = x + 1 + !$omp end task + + !ERROR: At most one DETACH clause can appear on the TASK directive + !$omp task detach(event_01) detach(event_01) + x = x + 1 + !$omp end task + + !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive + !$omp task detach(event_01) mergeable + x = x + 1 + !$omp end task + + !ERROR: If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task + !$omp task detach(event_01) final(.false.) 
+ x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on PRIVATE clause on the same construct + !$omp task detach(event_01) private(event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on IN_REDUCTION clause on the same construct + !$omp task detach(event_01) in_reduction(+:event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable that is part of another variable (as an array or structure element) cannot appear in a DETACH clause + !$omp task detach(event_02(1)) + x = x + 1 + !$omp end task + + !ERROR: A variable that is part of another variable (as an array or structure element) cannot appear in a DETACH clause + !$omp task detach(t_01%event) + x = x + 1 + !$omp end task + + !ERROR: The event-handle: `event_03` must not have the POINTER attribute + !$omp task detach(event_03) + x = x + 1 + !$omp end task +end program diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index e36eb77cefe7e..aec80decf6039 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1090,7 +1090,6 @@ def OMP_Task : Directive<"task"> { VersionedClause, VersionedClause, VersionedClause, - VersionedClause, VersionedClause, VersionedClause, VersionedClause, @@ -1100,6 +1099,7 @@ def OMP_Task : Directive<"task"> { ]; let allowedOnceClauses = [ VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, >From fa2b051e724898f3be9b2d74d218816088b3975f Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 05:51:11 +0000 Subject: [PATCH 02/14] Do not check for interger kind --- flang/lib/Semantics/check-omp-structure.cpp | 4 +--- flang/test/Semantics/OpenMP/detach01.f90 | 4 ++-- 2 files changed, 3 insertions(+), 5 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 
6641e39c6e358..e2f897f5c9246 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3817,9 +3817,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { name->ToString()); } } - auto type{name->symbol->GetType()}; - if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer) || - evaluate::ToInt64(type->numericTypeSpec().kind()) != 8) { + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { context_.Say(GetContext().clauseSource, "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, name->ToString()); diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index e342fcd1b19b4..7ba2888be9237 100644 --- a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -7,8 +7,8 @@ program test_detach use omp_lib implicit none - integer :: e, x - integer(omp_event_handle_kind) :: event_01, event_02(2) + real :: e, x + integer(omp_event_handle_kind) :: event_01, event_02(2) integer(omp_event_handle_kind), pointer :: event_03 >From 569712fc9b14f3a1f4b3a7d2c55febfe216f5e7f Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 06:14:26 +0000 Subject: [PATCH 03/14] Fix snake_case --- flang/lib/Semantics/check-omp-structure.cpp | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index e2f897f5c9246..aa19b71e69c62 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2734,7 +2734,7 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { } if (GetContext().directive == llvm::omp::Directive::OMPD_task) { - if (auto *d_clause{FindClause(llvm::omp::Clause::OMPC_detach)}) { + if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { // OpenMP 5.0: Task construct restrictions 
CheckNotAllowedIfClause( llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); @@ -2745,9 +2745,9 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); } - const auto &detachClause{ - std::get(d_clause->u)}; - if (const auto *name{parser::Unwrap(detachClause.v.v)}) { + const auto &detach{ + std::get(detachClause->u)}; + if (const auto *name{parser::Unwrap(detach.v.v)}) { if (name->symbol) { std::string eventHandleSymName{name->ToString()}; auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList >From 42cc3e75d041b9935b38e19cb70d7b3273f4087f Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 06:36:11 +0000 Subject: [PATCH 04/14] Compare symbols instead of name --- flang/lib/Semantics/check-omp-structure.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa19b71e69c62..084dbf48c5c50 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2749,17 +2749,17 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { std::get(detachClause->u)}; if (const auto *name{parser::Unwrap(detach.v.v)}) { if (name->symbol) { - std::string eventHandleSymName{name->ToString()}; + Symbol *eventHandleSym{name->symbol->GetUltimate()}; auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList &objs, std::string clause) { for (const auto &obj : objs.v) { if (const parser::Name *objName{ parser::Unwrap(obj)}) { - if (objName->ToString() == eventHandleSymName) { + if (objName->symbol->GetUltimate() == eventHandleSym) { context_.Say(GetContext().clauseSource, "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, - eventHandleSymName, clause); + objName->source, 
clause); } } } >From a3c6a4517354991dc5f166d57d14012571b3f553 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 09:09:15 +0000 Subject: [PATCH 05/14] Add OpenMP version based checks --- flang/lib/Semantics/check-omp-structure.cpp | 105 ++++++++++---------- flang/test/Semantics/OpenMP/detach01.f90 | 22 +--- flang/test/Semantics/OpenMP/detach02.f90 | 22 ++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 2 +- 4 files changed, 83 insertions(+), 68 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/detach02.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 084dbf48c5c50..41ed858ed650f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2735,51 +2735,54 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { if (GetContext().directive == llvm::omp::Directive::OMPD_task) { if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { - // OpenMP 5.0: Task construct restrictions - CheckNotAllowedIfClause( - llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); - - // OpenMP 5.2: Task construct restrictions - if (FindClause(llvm::omp::Clause::OMPC_final)) { - context_.Say(GetContext().clauseSource, - "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); - } + unsigned version{context_.langOptions().OpenMPVersion}; + if (version == 50 || version == 51) { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckNotAllowedIfClause( + llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); + } else if (version >= 52) { + // OpenMP 5.2: 12.5.2 Detach construct restrictions + if (FindClause(llvm::omp::Clause::OMPC_final)) { + context_.Say(GetContext().clauseSource, + "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); + } - const auto &detach{ - 
std::get(detachClause->u)}; - if (const auto *name{parser::Unwrap(detach.v.v)}) { - if (name->symbol) { - Symbol *eventHandleSym{name->symbol->GetUltimate()}; - auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList - &objs, - std::string clause) { - for (const auto &obj : objs.v) { - if (const parser::Name *objName{ - parser::Unwrap(obj)}) { - if (objName->symbol->GetUltimate() == eventHandleSym) { - context_.Say(GetContext().clauseSource, - "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, - objName->source, clause); + const auto &detach{ + std::get(detachClause->u)}; + if (const auto *name{parser::Unwrap(detach.v.v)}) { + if (name->symbol) { + Symbol *eventHandleSym{name->symbol}; + auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList + &objs, + std::string clause) { + for (const auto &obj : objs.v) { + if (const parser::Name *objName{ + parser::Unwrap(obj)}) { + if (&objName->symbol->GetUltimate() == eventHandleSym) { + context_.Say(GetContext().clauseSource, + "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, + objName->source, clause); + } } } + }; + if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_private)}) { + const auto &pClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + const auto &fpClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + const auto &irClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause( + std::get(irClause.v.t), "IN_REDUCTION"); } - }; - if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_private)}) { - const auto &pClause{ - std::get(dataEnvClause->u)}; - 
checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { - const auto &fpClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { - const auto &irClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause( - std::get(irClause.v.t), "IN_REDUCTION"); } } } @@ -3804,23 +3807,25 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { - // OpenMP 5.0: Task construct restrictions CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + unsigned version{context_.langOptions().OpenMPVersion}; + // OpenMP 5.2: 12.5.2 Detach clause restrictions + if (version >= 52) { + CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); + } - // OpenMP 5.2: Detach clause restrictions - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); if (const auto *name{parser::Unwrap(x.v.v)}) { if (name->symbol) { - if (IsPointer(*name->symbol)) { + if (version >= 52 && IsPointer(*name->symbol)) { context_.Say(GetContext().clauseSource, "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, name->ToString()); } - } - if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { - context_.Say(GetContext().clauseSource, - "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, - name->ToString()); + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, + name->ToString()); + } } } } diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index 7ba2888be9237..ea8208c022ef1 100644 --- 
a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -1,21 +1,19 @@ ! REQUIRES: openmp_runtime -! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=50 +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=52 -! OpenMP Version 5.2 -! Various checks for DETACH Clause (12.5.2) +! OpenMP Version 5.2: 12.5.2 +! Various checks for DETACH Clause -program test_detach - use omp_lib +program detach01 + use omp_lib, only: omp_event_handle_kind implicit none real :: e, x integer(omp_event_handle_kind) :: event_01, event_02(2) integer(omp_event_handle_kind), pointer :: event_03 - type :: t integer(omp_event_handle_kind) :: event end type - type(t) :: t_01 !ERROR: The event-handle: `e` must be of type integer(kind=omp_event_handle_kind) @@ -23,16 +21,6 @@ program test_detach x = x + 1 !$omp end task - !ERROR: At most one DETACH clause can appear on the TASK directive - !$omp task detach(event_01) detach(event_01) - x = x + 1 - !$omp end task - - !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive - !$omp task detach(event_01) mergeable - x = x + 1 - !$omp end task - !ERROR: If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task !$omp task detach(event_01) final(.false.) x = x + 1 diff --git a/flang/test/Semantics/OpenMP/detach02.f90 b/flang/test/Semantics/OpenMP/detach02.f90 new file mode 100644 index 0000000000000..1304233976351 --- /dev/null +++ b/flang/test/Semantics/OpenMP/detach02.f90 @@ -0,0 +1,22 @@ +! REQUIRES: openmp_runtime +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=50 +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=51 + +! OpenMP Version 5.0: 2.10.1 +! 
Various checks for DETACH Clause + +program detach02 + use omp_lib, only: omp_event_handle_kind + integer(omp_event_handle_kind) :: event_01, event_02 + + !TODO: Throw following error for the versions 5.0 and 5.1 + !ERR: At most one DETACH clause can appear on the TASK directive + !!$omp task detach(event_01) detach(event_02) + ! x = x + 1 + !!$omp end task + + !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive + !$omp task detach(event_01) mergeable + x = x + 1 + !$omp end task +end program diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index aec80decf6039..e36eb77cefe7e 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1090,6 +1090,7 @@ def OMP_Task : Directive<"task"> { VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, @@ -1099,7 +1100,6 @@ def OMP_Task : Directive<"task"> { ]; let allowedOnceClauses = [ VersionedClause, - VersionedClause, VersionedClause, VersionedClause, VersionedClause, >From 019a24c814b226352a13c18b840cc22a9b28ec7e Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 09:26:33 +0000 Subject: [PATCH 06/14] Fix formatting --- flang/lib/Semantics/check-omp-structure.cpp | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 41ed858ed650f..7ca695acc74b3 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2738,8 +2738,8 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { unsigned version{context_.langOptions().OpenMPVersion}; if (version == 50 || version == 51) { // OpenMP 5.0: 2.10.1 Task construct restrictions - CheckNotAllowedIfClause( - llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); + 
CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_detach, + {llvm::omp::Clause::OMPC_mergeable}); } else if (version >= 52) { // OpenMP 5.2: 12.5.2 Detach construct restrictions if (FindClause(llvm::omp::Clause::OMPC_final)) { @@ -2752,12 +2752,12 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { if (const auto *name{parser::Unwrap(detach.v.v)}) { if (name->symbol) { Symbol *eventHandleSym{name->symbol}; - auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList - &objs, + auto checkVarAppearsInDataEnvClause = [&](const parser:: + OmpObjectList &objs, std::string clause) { for (const auto &obj : objs.v) { - if (const parser::Name *objName{ - parser::Unwrap(obj)}) { + if (const parser::Name * + objName{parser::Unwrap(obj)}) { if (&objName->symbol->GetUltimate() == eventHandleSym) { context_.Say(GetContext().clauseSource, "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, @@ -2772,16 +2772,17 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { const auto &fpClause{ std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { const auto &irClause{ std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause( - std::get(irClause.v.t), "IN_REDUCTION"); + std::get(irClause.v.t), + "IN_REDUCTION"); } } } >From 0348571f198063b2358d52435cd0c62c14b89ef9 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 7 Mar 2025 05:03:09 +0000 Subject: [PATCH 07/14] Allow detach clause only once --- flang/lib/Semantics/check-omp-structure.cpp | 3 ++- 
flang/test/Semantics/OpenMP/detach02.f90 | 9 ++++----- llvm/include/llvm/Frontend/OpenMP/OMP.td | 2 +- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 7ca695acc74b3..c6a4c2472c4af 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3808,9 +3808,10 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { + // OpenMP 5.0: 2.10.1 Task construct restrictions CheckAllowedClause(llvm::omp::Clause::OMPC_detach); - unsigned version{context_.langOptions().OpenMPVersion}; // OpenMP 5.2: 12.5.2 Detach clause restrictions + unsigned version{context_.langOptions().OpenMPVersion}; if (version >= 52) { CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); } diff --git a/flang/test/Semantics/OpenMP/detach02.f90 b/flang/test/Semantics/OpenMP/detach02.f90 index 1304233976351..49d80358fcdb6 100644 --- a/flang/test/Semantics/OpenMP/detach02.f90 +++ b/flang/test/Semantics/OpenMP/detach02.f90 @@ -9,11 +9,10 @@ program detach02 use omp_lib, only: omp_event_handle_kind integer(omp_event_handle_kind) :: event_01, event_02 - !TODO: Throw following error for the versions 5.0 and 5.1 - !ERR: At most one DETACH clause can appear on the TASK directive - !!$omp task detach(event_01) detach(event_02) - ! 
x = x + 1 - !!$omp end task + !ERROR: At most one DETACH clause can appear on the TASK directive + !$omp task detach(event_01) detach(event_02) + x = x + 1 + !$omp end task !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive !$omp task detach(event_01) mergeable diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index e36eb77cefe7e..aec80decf6039 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1090,7 +1090,6 @@ def OMP_Task : Directive<"task"> { VersionedClause, VersionedClause, VersionedClause, - VersionedClause, VersionedClause, VersionedClause, VersionedClause, @@ -1100,6 +1099,7 @@ def OMP_Task : Directive<"task"> { ]; let allowedOnceClauses = [ VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, >From 2c09c87ffd454186d9bd335e09c9bf0cd459c153 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 7 Mar 2025 05:06:18 +0000 Subject: [PATCH 08/14] Add a test for openmp 52 --- flang/test/Semantics/OpenMP/detach01.f90 | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index ea8208c022ef1..8f19dfc1f92a7 100644 --- a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -50,4 +50,9 @@ program detach01 !$omp task detach(event_03) x = x + 1 !$omp end task + + !ERROR: At most one DETACH clause can appear on the TASK directive + !$omp task detach(event_01) detach(event_02) + x = x + 1 + !$omp end task end program >From 6fc96f7795546aa0771f90d38e83cf3b2fffe32c Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:51:19 +0000 Subject: [PATCH 09/14] Rename a function name --- flang/lib/Semantics/check-omp-structure.cpp | 32 ++++++++++----------- flang/lib/Semantics/check-omp-structure.h | 4 +-- 2 files changed, 18 insertions(+), 18 deletions(-) diff 
--git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 346c11e9765b7..012529af0767d 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -1605,7 +1605,7 @@ void OmpStructureChecker::Leave(const parser::OpenMPThreadprivate &c) { const auto &dir{std::get(c.t)}; const auto &objectList{std::get(c.t)}; CheckSymbolNames(dir.source, objectList); - CheckIsVarPartOfAnotherVar(dir.source, objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, objectList); CheckThreadprivateOrDeclareTargetVar(objectList); dirContext_.pop_back(); } @@ -1736,7 +1736,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPDeclarativeAllocate &x) { for (const auto &clause : clauseList.v) { CheckAlignValue(clause); } - CheckIsVarPartOfAnotherVar(dir.source, objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, objectList); } void OmpStructureChecker::Leave(const parser::OpenMPDeclarativeAllocate &x) { @@ -1902,7 +1902,7 @@ void OmpStructureChecker::Leave(const parser::OpenMPDeclareTargetConstruct &x) { if (const auto *objectList{parser::Unwrap(spec.u)}) { deviceConstructFound_ = true; CheckSymbolNames(dir.source, *objectList); - CheckIsVarPartOfAnotherVar(dir.source, *objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, *objectList); CheckThreadprivateOrDeclareTargetVar(*objectList); } else if (const auto *clauseList{ parser::Unwrap(spec.u)}) { @@ -1915,18 +1915,18 @@ void OmpStructureChecker::Leave(const parser::OpenMPDeclareTargetConstruct &x) { toClauseFound = true; auto &objList{std::get(toClause.v.t)}; CheckSymbolNames(dir.source, objList); - CheckIsVarPartOfAnotherVar(dir.source, objList); + CheckVarIsNotPartOfAnotherVar(dir.source, objList); CheckThreadprivateOrDeclareTargetVar(objList); }, [&](const parser::OmpClause::Link &linkClause) { CheckSymbolNames(dir.source, linkClause.v); - CheckIsVarPartOfAnotherVar(dir.source, linkClause.v); + 
CheckVarIsNotPartOfAnotherVar(dir.source, linkClause.v); CheckThreadprivateOrDeclareTargetVar(linkClause.v); }, [&](const parser::OmpClause::Enter &enterClause) { enterClauseFound = true; CheckSymbolNames(dir.source, enterClause.v); - CheckIsVarPartOfAnotherVar(dir.source, enterClause.v); + CheckVarIsNotPartOfAnotherVar(dir.source, enterClause.v); CheckThreadprivateOrDeclareTargetVar(enterClause.v); }, [&](const parser::OmpClause::DeviceType &deviceTypeClause) { @@ -2009,7 +2009,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPExecutableAllocate &x) { CheckAlignValue(clause); } if (objectList) { - CheckIsVarPartOfAnotherVar(dir.source, *objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, *objectList); } } @@ -2029,7 +2029,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPAllocatorsConstruct &x) { for (const auto &clause : clauseList.v) { if (const auto *allocClause{ parser::Unwrap(clause)}) { - CheckIsVarPartOfAnotherVar( + CheckVarIsNotPartOfAnotherVar( dir.source, std::get(allocClause->v.t)); } } @@ -3791,14 +3791,14 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Ordered &x) { void OmpStructureChecker::Enter(const parser::OmpClause::Shared &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_shared); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "SHARED"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "SHARED"); CheckCrayPointee(x.v, "SHARED"); } void OmpStructureChecker::Enter(const parser::OmpClause::Private &x) { SymbolSourceMap symbols; GetSymbolsInObjectList(x.v, symbols); CheckAllowedClause(llvm::omp::Clause::OMPC_private); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "PRIVATE"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "PRIVATE"); CheckIntentInPointer(symbols, llvm::omp::Clause::OMPC_private); CheckCrayPointee(x.v, "PRIVATE"); } @@ -3827,15 +3827,15 @@ bool OmpStructureChecker::IsDataRefTypeParamInquiry( return dataRefIsTypeParamInquiry; } -void 
OmpStructureChecker::CheckIsVarPartOfAnotherVar( +void OmpStructureChecker::CheckVarIsNotPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause) { for (const auto &ompObject : objList.v) { - CheckIsVarPartOfAnotherVar(source, ompObject, clause); + CheckVarIsNotPartOfAnotherVar(source, ompObject, clause); } } -void OmpStructureChecker::CheckIsVarPartOfAnotherVar( +void OmpStructureChecker::CheckVarIsNotPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObject &ompObject, llvm::StringRef clause) { common::visit( @@ -3875,7 +3875,7 @@ void OmpStructureChecker::CheckIsVarPartOfAnotherVar( void OmpStructureChecker::Enter(const parser::OmpClause::Firstprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_firstprivate); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "FIRSTPRIVATE"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "FIRSTPRIVATE"); CheckCrayPointee(x.v, "FIRSTPRIVATE"); CheckIsLoopIvPartOfClause(llvmOmpClause::OMPC_firstprivate, x.v); @@ -4204,7 +4204,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { // OpenMP 5.2: 12.5.2 Detach clause restrictions unsigned version{context_.langOptions().OpenMPVersion}; if (version >= 52) { - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); } if (const auto *name{parser::Unwrap(x.v.v)}) { @@ -4571,7 +4571,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Lastprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_lastprivate); const auto &objectList{std::get(x.v.t)}; - CheckIsVarPartOfAnotherVar( + CheckVarIsNotPartOfAnotherVar( GetContext().clauseSource, objectList, "LASTPRIVATE"); CheckCrayPointee(objectList, "LASTPRIVATE"); diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 1517f89f476f4..be420332d491c 100644 --- 
a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -231,9 +231,9 @@ class OmpStructureChecker const common::Indirection &, const parser::Name &); void CheckDoacross(const parser::OmpDoacross &doa); bool IsDataRefTypeParamInquiry(const parser::DataRef *dataRef); - void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, + void CheckVarIsNotPartOfAnotherVar(const parser::CharBlock &source, const parser::OmpObject &obj, llvm::StringRef clause = ""); - void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, + void CheckVarIsNotPartOfAnotherVar(const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause = ""); void CheckThreadprivateOrDeclareTargetVar( const parser::OmpObjectList &objList); >From 59075c12dd380f607874a4af3e9c0fc0cc67efd6 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:54:18 +0000 Subject: [PATCH 10/14] Remove the check for symbol's nullptr --- flang/lib/Semantics/check-omp-structure.cpp | 79 ++++++++++----------- 1 file changed, 37 insertions(+), 42 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 012529af0767d..e98536fd506cb 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3163,40 +3163,37 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { const auto &detach{ std::get(detachClause->u)}; if (const auto *name{parser::Unwrap(detach.v.v)}) { - if (name->symbol) { - Symbol *eventHandleSym{name->symbol}; - auto checkVarAppearsInDataEnvClause = [&](const parser:: - OmpObjectList &objs, - std::string clause) { - for (const auto &obj : objs.v) { - if (const parser::Name * - objName{parser::Unwrap(obj)}) { - if (&objName->symbol->GetUltimate() == eventHandleSym) { - context_.Say(GetContext().clauseSource, - "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the 
same construct"_err_en_US, - objName->source, clause); - } + Symbol *eventHandleSym{name->symbol}; + auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList + &objs, + std::string clause) { + for (const auto &obj : objs.v) { + if (const parser::Name *objName{ + parser::Unwrap(obj)}) { + if (&objName->symbol->GetUltimate() == eventHandleSym) { + context_.Say(GetContext().clauseSource, + "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, + objName->source, clause); } } - }; - if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_private)}) { - const auto &pClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { - const auto &fpClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { - const auto &irClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause( - std::get(irClause.v.t), - "IN_REDUCTION"); } + }; + if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_private)}) { + const auto &pClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + const auto &fpClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + const auto &irClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause( + std::get(irClause.v.t), "IN_REDUCTION"); } } } @@ -4208,17 +4205,15 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { } if (const auto *name{parser::Unwrap(x.v.v)}) { - if (name->symbol) { - if (version >= 52 && 
IsPointer(*name->symbol)) { - context_.Say(GetContext().clauseSource, - "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, - name->ToString()); - } - if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { - context_.Say(GetContext().clauseSource, - "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, - name->ToString()); - } + if (version >= 52 && IsPointer(*name->symbol)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, + name->ToString()); + } + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, + name->ToString()); } } } >From 965df852abd5ffa1735e19491c0df1b04f58470d Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:54:40 +0000 Subject: [PATCH 11/14] CheckAllowedClause for OpenMP 50 and 51 --- flang/lib/Semantics/check-omp-structure.cpp | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index e98536fd506cb..cb0e06ed6aa32 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4196,10 +4196,14 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { - // OpenMP 5.0: 2.10.1 Task construct restrictions - CheckAllowedClause(llvm::omp::Clause::OMPC_detach); - // OpenMP 5.2: 12.5.2 Detach clause restrictions unsigned version{context_.langOptions().OpenMPVersion}; + if (version >= 52) { + SetContextClauseInfo(llvm::omp::Clause::OMPC_detach); + } else { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + } + // OpenMP 5.2: 12.5.2 Detach clause restrictions 
if (version >= 52) { CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); } >From d13d267f8d8a5641c4556b98b9e4eea119f21adf Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:55:07 +0000 Subject: [PATCH 12/14] Fix the error message string to be in a single line --- flang/lib/Semantics/check-omp-structure.cpp | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index cb0e06ed6aa32..06437dc7a8588 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3842,23 +3842,18 @@ void OmpStructureChecker::CheckVarIsNotPartOfAnotherVar( std::get_if(&designator.u)}) { if (IsDataRefTypeParamInquiry(dataRef)) { context_.Say(source, - "A type parameter inquiry cannot appear on the %s " - "directive"_err_en_US, + "A type parameter inquiry cannot appear on the %s directive"_err_en_US, ContextDirectiveAsFortran()); } else if (parser::Unwrap( ompObject) || parser::Unwrap(ompObject)) { if (llvm::omp::nonPartialVarSet.test(GetContext().directive)) { context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear on the %s " - "directive"_err_en_US, + "A variable that is part of another variable (as an array or structure element) cannot appear on the %s directive"_err_en_US, ContextDirectiveAsFortran()); } else { context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear in a " - "%s clause"_err_en_US, + "A variable that is part of another variable (as an array or structure element) cannot appear in a %s clause"_err_en_US, clause.data()); } } >From 573580e5526d7981fb93077b5fd05b7a3547ec85 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:55:40 +0000 Subject: [PATCH 13/14] Add check for shared clause as well --- 
flang/lib/Semantics/check-omp-structure.cpp | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 06437dc7a8588..3223652a27187 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,6 +3183,11 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { const auto &pClause{ std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_shared)}) { + const auto &sClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(sClause.v, "SHARED"); } else if (auto *dataEnvClause{ FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { const auto &fpClause{ >From 5d9be322e9577b4ac7ba6b3b8b30336f2693b588 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:58:27 +0000 Subject: [PATCH 14/14] Remove a test --- flang/test/Semantics/OpenMP/detach01.f90 | 5 ----- 1 file changed, 5 deletions(-) diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index 8f19dfc1f92a7..ea8208c022ef1 100644 --- a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -50,9 +50,4 @@ program detach01 !$omp task detach(event_03) x = x + 1 !$omp end task - - !ERROR: At most one DETACH clause can appear on the TASK directive - !$omp task detach(event_01) detach(event_02) - x = x + 1 - !$omp end task end program From flang-commits at lists.llvm.org Fri May 2 05:10:30 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Fri, 02 May 2025 05:10:30 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <6814b636.170a0220.2cd887.f65c@mx.google.com> https://github.com/Thirumalai-Shaktivel updated 
https://github.com/llvm/llvm-project/pull/119172

>From 6e491ccd80b902df6946713a372ec9667e0811c3 Mon Sep 17 00:00:00 2001
From: Thirumalai-Shaktivel
Date: Mon, 9 Dec 2024 07:37:01 +0000
Subject: [PATCH 01/15] [Flang] [OpenMP] Add semantic checks for detach clause in Task

Fixes:
- Add semantic checks along with the tests
- Move the detach clause to allowedOnceClauses list in Task construct

Restrictions:
OpenMP 5.0: Task construct
- At most one detach clause can appear on the directive.
- If a detach clause appears on the directive, then a mergeable clause cannot appear on the same directive.

OpenMP 5.2: Detach construct
- If a detach clause appears on a directive, then the encountering task must not be a final task.
- A variable that appears in a detach clause cannot appear as a list item on a data-environment attribute clause on the same construct.
- A variable that is part of another variable (as an array element or a structure element) cannot appear in a detach clause.
- event-handle must not have the POINTER attribute.
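The restrictions listed above can be sketched as a small standalone C++ model. This is only an illustration under stated assumptions: `EventHandle`, `TaskDirective`, and `CheckDetach` are hypothetical names and are not part of flang, the diagnostics are paraphrased, and the version gating (mergeable check for 5.0/5.1, the remaining checks for 5.2 and later) is assumed from the way later patches in this series split the checks.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the parse tree; not flang types.
struct EventHandle {
  std::string name;
  bool isPointer = false;   // has the POINTER attribute
  bool isSubobject = false; // array element or structure component
};

struct TaskDirective {
  std::vector<EventHandle> detach;   // one entry per DETACH clause
  bool hasMergeable = false;         // MERGEABLE clause present
  bool hasFinal = false;             // FINAL clause present
  std::set<std::string> dataEnvVars; // names listed in PRIVATE, SHARED, etc.
};

// Returns one message per violated restriction. `version` follows the
// -fopenmp-version encoding (50, 51, 52, ...).
std::vector<std::string> CheckDetach(const TaskDirective &d, unsigned version) {
  std::vector<std::string> errs;
  if (d.detach.empty()) {
    return errs; // nothing to check without a DETACH clause
  }
  // Task construct: at most one DETACH clause.
  if (d.detach.size() > 1) {
    errs.push_back("At most one DETACH clause can appear on the TASK directive");
  }
  if (version == 50 || version == 51) {
    // OpenMP 5.0 task restriction (assumed to apply to 5.0 and 5.1 only).
    if (d.hasMergeable) {
      errs.push_back("MERGEABLE is not allowed if DETACH appears");
    }
  } else if (version >= 52) {
    // OpenMP 5.2 detach restrictions.
    if (d.hasFinal) {
      errs.push_back("The encountering task must not be a FINAL task");
    }
    for (const EventHandle &h : d.detach) {
      if (d.dataEnvVars.count(h.name) != 0) {
        errs.push_back("`" + h.name + "` also appears on a data-environment clause");
      }
      if (h.isSubobject) {
        errs.push_back("`" + h.name + "` is part of another variable");
      }
      if (h.isPointer) {
        errs.push_back("`" + h.name + "` must not have the POINTER attribute");
      }
    }
  }
  return errs;
}
```

In this model, a directive with one plain event handle and a `mergeable` clause yields one message for version 50 and none for version 52, which mirrors the split between the 5.0 and 5.2 restriction lists.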
--- flang/lib/Semantics/check-omp-structure.cpp | 141 +++++++++++++++----- flang/lib/Semantics/check-omp-structure.h | 2 + flang/test/Semantics/OpenMP/detach01.f90 | 65 +++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 2 +- 4 files changed, 179 insertions(+), 31 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/detach01.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 95b962f5daf57..6641e39c6e358 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2733,6 +2733,59 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { llvm::omp::Clause::OMPC_copyprivate, {llvm::omp::Clause::OMPC_nowait}); } + if (GetContext().directive == llvm::omp::Directive::OMPD_task) { + if (auto *d_clause{FindClause(llvm::omp::Clause::OMPC_detach)}) { + // OpenMP 5.0: Task construct restrictions + CheckNotAllowedIfClause( + llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); + + // OpenMP 5.2: Task construct restrictions + if (FindClause(llvm::omp::Clause::OMPC_final)) { + context_.Say(GetContext().clauseSource, + "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); + } + + const auto &detachClause{ + std::get(d_clause->u)}; + if (const auto *name{parser::Unwrap(detachClause.v.v)}) { + if (name->symbol) { + std::string eventHandleSymName{name->ToString()}; + auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList + &objs, + std::string clause) { + for (const auto &obj : objs.v) { + if (const parser::Name *objName{ + parser::Unwrap(obj)}) { + if (objName->ToString() == eventHandleSymName) { + context_.Say(GetContext().clauseSource, + "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, + eventHandleSymName, clause); + } + } + } + }; + if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_private)}) 
{ + const auto &pClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + const auto &fpClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + const auto &irClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause( + std::get(irClause.v.t), "IN_REDUCTION"); + } + } + } + } + } + auto testThreadprivateVarErr = [&](Symbol sym, parser::Name name, llvmOmpClause clauseTy) { if (sym.test(Symbol::Flag::OmpThreadprivate)) @@ -2815,7 +2868,6 @@ CHECK_SIMPLE_CLAUSE(Capture, OMPC_capture) CHECK_SIMPLE_CLAUSE(Contains, OMPC_contains) CHECK_SIMPLE_CLAUSE(Default, OMPC_default) CHECK_SIMPLE_CLAUSE(Depobj, OMPC_depobj) -CHECK_SIMPLE_CLAUSE(Detach, OMPC_detach) CHECK_SIMPLE_CLAUSE(DeviceType, OMPC_device_type) CHECK_SIMPLE_CLAUSE(DistSchedule, OMPC_dist_schedule) CHECK_SIMPLE_CLAUSE(Exclusive, OMPC_exclusive) @@ -3386,40 +3438,45 @@ void OmpStructureChecker::CheckIsVarPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause) { for (const auto &ompObject : objList.v) { - common::visit( - common::visitors{ - [&](const parser::Designator &designator) { - if (const auto *dataRef{ - std::get_if(&designator.u)}) { - if (IsDataRefTypeParamInquiry(dataRef)) { + CheckIsVarPartOfAnotherVar(source, ompObject, clause); + } +} + +void OmpStructureChecker::CheckIsVarPartOfAnotherVar( + const parser::CharBlock &source, const parser::OmpObject &ompObject, + llvm::StringRef clause) { + common::visit( + common::visitors{ + [&](const parser::Designator &designator) { + if (const auto *dataRef{ + std::get_if(&designator.u)}) { + if (IsDataRefTypeParamInquiry(dataRef)) { + context_.Say(source, + "A type parameter inquiry cannot appear on the %s " + "directive"_err_en_US, 
+ ContextDirectiveAsFortran()); + } else if (parser::Unwrap( + ompObject) || + parser::Unwrap(ompObject)) { + if (llvm::omp::nonPartialVarSet.test(GetContext().directive)) { context_.Say(source, - "A type parameter inquiry cannot appear on the %s " + "A variable that is part of another variable (as an " + "array or structure element) cannot appear on the %s " "directive"_err_en_US, ContextDirectiveAsFortran()); - } else if (parser::Unwrap( - ompObject) || - parser::Unwrap(ompObject)) { - if (llvm::omp::nonPartialVarSet.test( - GetContext().directive)) { - context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear on the %s " - "directive"_err_en_US, - ContextDirectiveAsFortran()); - } else { - context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear in a " - "%s clause"_err_en_US, - clause.data()); - } + } else { + context_.Say(source, + "A variable that is part of another variable (as an " + "array or structure element) cannot appear in a " + "%s clause"_err_en_US, + clause.data()); } } - }, - [&](const parser::Name &name) {}, - }, - ompObject.u); - } + } + }, + [&](const parser::Name &name) {}, + }, + ompObject.u); } void OmpStructureChecker::Enter(const parser::OmpClause::Firstprivate &x) { @@ -3746,6 +3803,30 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } } +void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { + // OpenMP 5.0: Task construct restrictions + CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + + // OpenMP 5.2: Detach clause restrictions + CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); + if (const auto *name{parser::Unwrap(x.v.v)}) { + if (name->symbol) { + if (IsPointer(*name->symbol)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, + name->ToString()); + } + } + auto 
type{name->symbol->GetType()}; + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer) || + evaluate::ToInt64(type->numericTypeSpec().kind()) != 8) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, + name->ToString()); + } + } +} + void OmpStructureChecker::CheckAllowedMapTypes( const parser::OmpMapType::Value &type, const std::list &allowedMapTypeList) { diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 346a7bed9138f..a8f94992ff091 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -186,6 +186,8 @@ class OmpStructureChecker const common::Indirection &, const parser::Name &); void CheckDoacross(const parser::OmpDoacross &doa); bool IsDataRefTypeParamInquiry(const parser::DataRef *dataRef); + void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, + const parser::OmpObject &obj, llvm::StringRef clause = ""); void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause = ""); void CheckThreadprivateOrDeclareTargetVar( diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 new file mode 100644 index 0000000000000..e342fcd1b19b4 --- /dev/null +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -0,0 +1,65 @@ +! REQUIRES: openmp_runtime +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=50 + +! OpenMP Version 5.2 +! 
Various checks for DETACH Clause (12.5.2) + +program test_detach + use omp_lib + implicit none + integer :: e, x + integer(omp_event_handle_kind) :: event_01, event_02(2) + integer(omp_event_handle_kind), pointer :: event_03 + + + type :: t + integer(omp_event_handle_kind) :: event + end type + + type(t) :: t_01 + + !ERROR: The event-handle: `e` must be of type integer(kind=omp_event_handle_kind) + !$omp task detach(e) + x = x + 1 + !$omp end task + + !ERROR: At most one DETACH clause can appear on the TASK directive + !$omp task detach(event_01) detach(event_01) + x = x + 1 + !$omp end task + + !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive + !$omp task detach(event_01) mergeable + x = x + 1 + !$omp end task + + !ERROR: If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task + !$omp task detach(event_01) final(.false.) + x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on PRIVATE clause on the same construct + !$omp task detach(event_01) private(event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on IN_REDUCTION clause on the same construct + !$omp task detach(event_01) in_reduction(+:event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable that is part of another variable (as an array or structure element) cannot appear in a DETACH clause + !$omp task detach(event_02(1)) + x = x + 1 + !$omp end task + + !ERROR: A variable that is part of another variable (as an array or structure element) cannot appear in a DETACH clause + !$omp task detach(t_01%event) + x = x + 1 + !$omp end task + + !ERROR: The event-handle: `event_03` must not have the POINTER attribute + !$omp task detach(event_03) + x = x + 1 + !$omp end task +end program diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index e36eb77cefe7e..aec80decf6039 
100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1090,7 +1090,6 @@ def OMP_Task : Directive<"task"> { VersionedClause, VersionedClause, VersionedClause, - VersionedClause, VersionedClause, VersionedClause, VersionedClause, @@ -1100,6 +1099,7 @@ def OMP_Task : Directive<"task"> { ]; let allowedOnceClauses = [ VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, >From fa2b051e724898f3be9b2d74d218816088b3975f Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 05:51:11 +0000 Subject: [PATCH 02/15] Do not check for interger kind --- flang/lib/Semantics/check-omp-structure.cpp | 4 +--- flang/test/Semantics/OpenMP/detach01.f90 | 4 ++-- 2 files changed, 3 insertions(+), 5 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 6641e39c6e358..e2f897f5c9246 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3817,9 +3817,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { name->ToString()); } } - auto type{name->symbol->GetType()}; - if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer) || - evaluate::ToInt64(type->numericTypeSpec().kind()) != 8) { + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { context_.Say(GetContext().clauseSource, "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, name->ToString()); diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index e342fcd1b19b4..7ba2888be9237 100644 --- a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -7,8 +7,8 @@ program test_detach use omp_lib implicit none - integer :: e, x - integer(omp_event_handle_kind) :: event_01, event_02(2) + real :: e, x + integer(omp_event_handle_kind) :: event_01, event_02(2) 
integer(omp_event_handle_kind), pointer :: event_03 >From 569712fc9b14f3a1f4b3a7d2c55febfe216f5e7f Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 06:14:26 +0000 Subject: [PATCH 03/15] Fix snake_case --- flang/lib/Semantics/check-omp-structure.cpp | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index e2f897f5c9246..aa19b71e69c62 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2734,7 +2734,7 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { } if (GetContext().directive == llvm::omp::Directive::OMPD_task) { - if (auto *d_clause{FindClause(llvm::omp::Clause::OMPC_detach)}) { + if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { // OpenMP 5.0: Task construct restrictions CheckNotAllowedIfClause( llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); @@ -2745,9 +2745,9 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); } - const auto &detachClause{ - std::get(d_clause->u)}; - if (const auto *name{parser::Unwrap(detachClause.v.v)}) { + const auto &detach{ + std::get(detachClause->u)}; + if (const auto *name{parser::Unwrap(detach.v.v)}) { if (name->symbol) { std::string eventHandleSymName{name->ToString()}; auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList >From 42cc3e75d041b9935b38e19cb70d7b3273f4087f Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 06:36:11 +0000 Subject: [PATCH 04/15] Compare symbols instead of name --- flang/lib/Semantics/check-omp-structure.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 
aa19b71e69c62..084dbf48c5c50 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2749,17 +2749,17 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { std::get(detachClause->u)}; if (const auto *name{parser::Unwrap(detach.v.v)}) { if (name->symbol) { - std::string eventHandleSymName{name->ToString()}; + Symbol *eventHandleSym{name->symbol->GetUltimate()}; auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList &objs, std::string clause) { for (const auto &obj : objs.v) { if (const parser::Name *objName{ parser::Unwrap(obj)}) { - if (objName->ToString() == eventHandleSymName) { + if (objName->symbol->GetUltimate() == eventHandleSym) { context_.Say(GetContext().clauseSource, "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, - eventHandleSymName, clause); + objName->source, clause); } } } >From a3c6a4517354991dc5f166d57d14012571b3f553 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 09:09:15 +0000 Subject: [PATCH 05/15] Add OpenMP version based checks --- flang/lib/Semantics/check-omp-structure.cpp | 105 ++++++++++---------- flang/test/Semantics/OpenMP/detach01.f90 | 22 +--- flang/test/Semantics/OpenMP/detach02.f90 | 22 ++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 2 +- 4 files changed, 83 insertions(+), 68 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/detach02.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 084dbf48c5c50..41ed858ed650f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2735,51 +2735,54 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { if (GetContext().directive == llvm::omp::Directive::OMPD_task) { if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { - // OpenMP 5.0: Task construct restrictions - 
CheckNotAllowedIfClause( - llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); - - // OpenMP 5.2: Task construct restrictions - if (FindClause(llvm::omp::Clause::OMPC_final)) { - context_.Say(GetContext().clauseSource, - "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); - } + unsigned version{context_.langOptions().OpenMPVersion}; + if (version == 50 || version == 51) { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckNotAllowedIfClause( + llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); + } else if (version >= 52) { + // OpenMP 5.2: 12.5.2 Detach construct restrictions + if (FindClause(llvm::omp::Clause::OMPC_final)) { + context_.Say(GetContext().clauseSource, + "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); + } - const auto &detach{ - std::get(detachClause->u)}; - if (const auto *name{parser::Unwrap(detach.v.v)}) { - if (name->symbol) { - Symbol *eventHandleSym{name->symbol->GetUltimate()}; - auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList - &objs, - std::string clause) { - for (const auto &obj : objs.v) { - if (const parser::Name *objName{ - parser::Unwrap(obj)}) { - if (objName->symbol->GetUltimate() == eventHandleSym) { - context_.Say(GetContext().clauseSource, - "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, - objName->source, clause); + const auto &detach{ + std::get(detachClause->u)}; + if (const auto *name{parser::Unwrap(detach.v.v)}) { + if (name->symbol) { + Symbol *eventHandleSym{name->symbol}; + auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList + &objs, + std::string clause) { + for (const auto &obj : objs.v) { + if (const parser::Name *objName{ + parser::Unwrap(obj)}) { + if (&objName->symbol->GetUltimate() == eventHandleSym) { + context_.Say(GetContext().clauseSource, + "A 
variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, + objName->source, clause); + } } } + }; + if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_private)}) { + const auto &pClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + const auto &fpClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + const auto &irClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause( + std::get(irClause.v.t), "IN_REDUCTION"); } - }; - if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_private)}) { - const auto &pClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { - const auto &fpClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { - const auto &irClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause( - std::get(irClause.v.t), "IN_REDUCTION"); } } } @@ -3804,23 +3807,25 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { - // OpenMP 5.0: Task construct restrictions CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + unsigned version{context_.langOptions().OpenMPVersion}; + // OpenMP 5.2: 12.5.2 Detach clause restrictions + if (version >= 52) { + CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); + } - // OpenMP 5.2: Detach clause restrictions - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); if (const auto 
*name{parser::Unwrap(x.v.v)}) { if (name->symbol) { - if (IsPointer(*name->symbol)) { + if (version >= 52 && IsPointer(*name->symbol)) { context_.Say(GetContext().clauseSource, "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, name->ToString()); } - } - if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { - context_.Say(GetContext().clauseSource, - "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, - name->ToString()); + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, + name->ToString()); + } } } } diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index 7ba2888be9237..ea8208c022ef1 100644 --- a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -1,21 +1,19 @@ ! REQUIRES: openmp_runtime -! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=50 +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=52 -! OpenMP Version 5.2 -! Various checks for DETACH Clause (12.5.2) +! OpenMP Version 5.2: 12.5.2 +! 
Various checks for DETACH Clause -program test_detach - use omp_lib +program detach01 + use omp_lib, only: omp_event_handle_kind implicit none real :: e, x integer(omp_event_handle_kind) :: event_01, event_02(2) integer(omp_event_handle_kind), pointer :: event_03 - type :: t integer(omp_event_handle_kind) :: event end type - type(t) :: t_01 !ERROR: The event-handle: `e` must be of type integer(kind=omp_event_handle_kind) @@ -23,16 +21,6 @@ program test_detach x = x + 1 !$omp end task - !ERROR: At most one DETACH clause can appear on the TASK directive - !$omp task detach(event_01) detach(event_01) - x = x + 1 - !$omp end task - - !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive - !$omp task detach(event_01) mergeable - x = x + 1 - !$omp end task - !ERROR: If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task !$omp task detach(event_01) final(.false.) x = x + 1 diff --git a/flang/test/Semantics/OpenMP/detach02.f90 b/flang/test/Semantics/OpenMP/detach02.f90 new file mode 100644 index 0000000000000..1304233976351 --- /dev/null +++ b/flang/test/Semantics/OpenMP/detach02.f90 @@ -0,0 +1,22 @@ +! REQUIRES: openmp_runtime +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=50 +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=51 + +! OpenMP Version 5.0: 2.10.1 +! Various checks for DETACH Clause + +program detach02 + use omp_lib, only: omp_event_handle_kind + integer(omp_event_handle_kind) :: event_01, event_02 + + !TODO: Throw following error for the versions 5.0 and 5.1 + !ERR: At most one DETACH clause can appear on the TASK directive + !!$omp task detach(event_01) detach(event_02) + ! 
x = x + 1 + !!$omp end task + + !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive + !$omp task detach(event_01) mergeable + x = x + 1 + !$omp end task +end program diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index aec80decf6039..e36eb77cefe7e 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1090,6 +1090,7 @@ def OMP_Task : Directive<"task"> { VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, @@ -1099,7 +1100,6 @@ def OMP_Task : Directive<"task"> { ]; let allowedOnceClauses = [ VersionedClause, - VersionedClause, VersionedClause, VersionedClause, VersionedClause, >From 019a24c814b226352a13c18b840cc22a9b28ec7e Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 09:26:33 +0000 Subject: [PATCH 06/15] Fix formatting --- flang/lib/Semantics/check-omp-structure.cpp | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 41ed858ed650f..7ca695acc74b3 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2738,8 +2738,8 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { unsigned version{context_.langOptions().OpenMPVersion}; if (version == 50 || version == 51) { // OpenMP 5.0: 2.10.1 Task construct restrictions - CheckNotAllowedIfClause( - llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); + CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_detach, + {llvm::omp::Clause::OMPC_mergeable}); } else if (version >= 52) { // OpenMP 5.2: 12.5.2 Detach construct restrictions if (FindClause(llvm::omp::Clause::OMPC_final)) { @@ -2752,12 +2752,12 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { if (const auto 
*name{parser::Unwrap(detach.v.v)}) { if (name->symbol) { Symbol *eventHandleSym{name->symbol}; - auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList - &objs, + auto checkVarAppearsInDataEnvClause = [&](const parser:: + OmpObjectList &objs, std::string clause) { for (const auto &obj : objs.v) { - if (const parser::Name *objName{ - parser::Unwrap(obj)}) { + if (const parser::Name * + objName{parser::Unwrap(obj)}) { if (&objName->symbol->GetUltimate() == eventHandleSym) { context_.Say(GetContext().clauseSource, "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, @@ -2772,16 +2772,17 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { const auto &fpClause{ std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { const auto &irClause{ std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause( - std::get(irClause.v.t), "IN_REDUCTION"); + std::get(irClause.v.t), + "IN_REDUCTION"); } } } >From 0348571f198063b2358d52435cd0c62c14b89ef9 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 7 Mar 2025 05:03:09 +0000 Subject: [PATCH 07/15] Allow detach clause only once --- flang/lib/Semantics/check-omp-structure.cpp | 3 ++- flang/test/Semantics/OpenMP/detach02.f90 | 9 ++++----- llvm/include/llvm/Frontend/OpenMP/OMP.td | 2 +- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 7ca695acc74b3..c6a4c2472c4af 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -3808,9 +3808,10 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { + // OpenMP 5.0: 2.10.1 Task construct restrictions CheckAllowedClause(llvm::omp::Clause::OMPC_detach); - unsigned version{context_.langOptions().OpenMPVersion}; // OpenMP 5.2: 12.5.2 Detach clause restrictions + unsigned version{context_.langOptions().OpenMPVersion}; if (version >= 52) { CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); } diff --git a/flang/test/Semantics/OpenMP/detach02.f90 b/flang/test/Semantics/OpenMP/detach02.f90 index 1304233976351..49d80358fcdb6 100644 --- a/flang/test/Semantics/OpenMP/detach02.f90 +++ b/flang/test/Semantics/OpenMP/detach02.f90 @@ -9,11 +9,10 @@ program detach02 use omp_lib, only: omp_event_handle_kind integer(omp_event_handle_kind) :: event_01, event_02 - !TODO: Throw following error for the versions 5.0 and 5.1 - !ERR: At most one DETACH clause can appear on the TASK directive - !!$omp task detach(event_01) detach(event_02) - ! 
x = x + 1 - !!$omp end task + !ERROR: At most one DETACH clause can appear on the TASK directive + !$omp task detach(event_01) detach(event_02) + x = x + 1 + !$omp end task !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive !$omp task detach(event_01) mergeable diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index e36eb77cefe7e..aec80decf6039 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1090,7 +1090,6 @@ def OMP_Task : Directive<"task"> { VersionedClause, VersionedClause, VersionedClause, - VersionedClause, VersionedClause, VersionedClause, VersionedClause, @@ -1100,6 +1099,7 @@ def OMP_Task : Directive<"task"> { ]; let allowedOnceClauses = [ VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, >From 2c09c87ffd454186d9bd335e09c9bf0cd459c153 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 7 Mar 2025 05:06:18 +0000 Subject: [PATCH 08/15] Add a test for openmp 52 --- flang/test/Semantics/OpenMP/detach01.f90 | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index ea8208c022ef1..8f19dfc1f92a7 100644 --- a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -50,4 +50,9 @@ program detach01 !$omp task detach(event_03) x = x + 1 !$omp end task + + !ERROR: At most one DETACH clause can appear on the TASK directive + !$omp task detach(event_01) detach(event_02) + x = x + 1 + !$omp end task end program >From 6fc96f7795546aa0771f90d38e83cf3b2fffe32c Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:51:19 +0000 Subject: [PATCH 09/15] Rename a function name --- flang/lib/Semantics/check-omp-structure.cpp | 32 ++++++++++----------- flang/lib/Semantics/check-omp-structure.h | 4 +-- 2 files changed, 18 insertions(+), 18 deletions(-) diff 
--git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 346c11e9765b7..012529af0767d 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -1605,7 +1605,7 @@ void OmpStructureChecker::Leave(const parser::OpenMPThreadprivate &c) { const auto &dir{std::get(c.t)}; const auto &objectList{std::get(c.t)}; CheckSymbolNames(dir.source, objectList); - CheckIsVarPartOfAnotherVar(dir.source, objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, objectList); CheckThreadprivateOrDeclareTargetVar(objectList); dirContext_.pop_back(); } @@ -1736,7 +1736,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPDeclarativeAllocate &x) { for (const auto &clause : clauseList.v) { CheckAlignValue(clause); } - CheckIsVarPartOfAnotherVar(dir.source, objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, objectList); } void OmpStructureChecker::Leave(const parser::OpenMPDeclarativeAllocate &x) { @@ -1902,7 +1902,7 @@ void OmpStructureChecker::Leave(const parser::OpenMPDeclareTargetConstruct &x) { if (const auto *objectList{parser::Unwrap(spec.u)}) { deviceConstructFound_ = true; CheckSymbolNames(dir.source, *objectList); - CheckIsVarPartOfAnotherVar(dir.source, *objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, *objectList); CheckThreadprivateOrDeclareTargetVar(*objectList); } else if (const auto *clauseList{ parser::Unwrap(spec.u)}) { @@ -1915,18 +1915,18 @@ void OmpStructureChecker::Leave(const parser::OpenMPDeclareTargetConstruct &x) { toClauseFound = true; auto &objList{std::get(toClause.v.t)}; CheckSymbolNames(dir.source, objList); - CheckIsVarPartOfAnotherVar(dir.source, objList); + CheckVarIsNotPartOfAnotherVar(dir.source, objList); CheckThreadprivateOrDeclareTargetVar(objList); }, [&](const parser::OmpClause::Link &linkClause) { CheckSymbolNames(dir.source, linkClause.v); - CheckIsVarPartOfAnotherVar(dir.source, linkClause.v); + 
CheckVarIsNotPartOfAnotherVar(dir.source, linkClause.v); CheckThreadprivateOrDeclareTargetVar(linkClause.v); }, [&](const parser::OmpClause::Enter &enterClause) { enterClauseFound = true; CheckSymbolNames(dir.source, enterClause.v); - CheckIsVarPartOfAnotherVar(dir.source, enterClause.v); + CheckVarIsNotPartOfAnotherVar(dir.source, enterClause.v); CheckThreadprivateOrDeclareTargetVar(enterClause.v); }, [&](const parser::OmpClause::DeviceType &deviceTypeClause) { @@ -2009,7 +2009,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPExecutableAllocate &x) { CheckAlignValue(clause); } if (objectList) { - CheckIsVarPartOfAnotherVar(dir.source, *objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, *objectList); } } @@ -2029,7 +2029,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPAllocatorsConstruct &x) { for (const auto &clause : clauseList.v) { if (const auto *allocClause{ parser::Unwrap(clause)}) { - CheckIsVarPartOfAnotherVar( + CheckVarIsNotPartOfAnotherVar( dir.source, std::get(allocClause->v.t)); } } @@ -3791,14 +3791,14 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Ordered &x) { void OmpStructureChecker::Enter(const parser::OmpClause::Shared &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_shared); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "SHARED"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "SHARED"); CheckCrayPointee(x.v, "SHARED"); } void OmpStructureChecker::Enter(const parser::OmpClause::Private &x) { SymbolSourceMap symbols; GetSymbolsInObjectList(x.v, symbols); CheckAllowedClause(llvm::omp::Clause::OMPC_private); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "PRIVATE"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "PRIVATE"); CheckIntentInPointer(symbols, llvm::omp::Clause::OMPC_private); CheckCrayPointee(x.v, "PRIVATE"); } @@ -3827,15 +3827,15 @@ bool OmpStructureChecker::IsDataRefTypeParamInquiry( return dataRefIsTypeParamInquiry; } -void 
OmpStructureChecker::CheckIsVarPartOfAnotherVar( +void OmpStructureChecker::CheckVarIsNotPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause) { for (const auto &ompObject : objList.v) { - CheckIsVarPartOfAnotherVar(source, ompObject, clause); + CheckVarIsNotPartOfAnotherVar(source, ompObject, clause); } } -void OmpStructureChecker::CheckIsVarPartOfAnotherVar( +void OmpStructureChecker::CheckVarIsNotPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObject &ompObject, llvm::StringRef clause) { common::visit( @@ -3875,7 +3875,7 @@ void OmpStructureChecker::CheckIsVarPartOfAnotherVar( void OmpStructureChecker::Enter(const parser::OmpClause::Firstprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_firstprivate); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "FIRSTPRIVATE"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "FIRSTPRIVATE"); CheckCrayPointee(x.v, "FIRSTPRIVATE"); CheckIsLoopIvPartOfClause(llvmOmpClause::OMPC_firstprivate, x.v); @@ -4204,7 +4204,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { // OpenMP 5.2: 12.5.2 Detach clause restrictions unsigned version{context_.langOptions().OpenMPVersion}; if (version >= 52) { - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); } if (const auto *name{parser::Unwrap(x.v.v)}) { @@ -4571,7 +4571,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Lastprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_lastprivate); const auto &objectList{std::get(x.v.t)}; - CheckIsVarPartOfAnotherVar( + CheckVarIsNotPartOfAnotherVar( GetContext().clauseSource, objectList, "LASTPRIVATE"); CheckCrayPointee(objectList, "LASTPRIVATE"); diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 1517f89f476f4..be420332d491c 100644 --- 
a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -231,9 +231,9 @@ class OmpStructureChecker const common::Indirection &, const parser::Name &); void CheckDoacross(const parser::OmpDoacross &doa); bool IsDataRefTypeParamInquiry(const parser::DataRef *dataRef); - void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, + void CheckVarIsNotPartOfAnotherVar(const parser::CharBlock &source, const parser::OmpObject &obj, llvm::StringRef clause = ""); - void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, + void CheckVarIsNotPartOfAnotherVar(const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause = ""); void CheckThreadprivateOrDeclareTargetVar( const parser::OmpObjectList &objList); >From 59075c12dd380f607874a4af3e9c0fc0cc67efd6 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:54:18 +0000 Subject: [PATCH 10/15] Remove the check for symbol's nullptr --- flang/lib/Semantics/check-omp-structure.cpp | 79 ++++++++++----------- 1 file changed, 37 insertions(+), 42 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 012529af0767d..e98536fd506cb 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3163,40 +3163,37 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { const auto &detach{ std::get(detachClause->u)}; if (const auto *name{parser::Unwrap(detach.v.v)}) { - if (name->symbol) { - Symbol *eventHandleSym{name->symbol}; - auto checkVarAppearsInDataEnvClause = [&](const parser:: - OmpObjectList &objs, - std::string clause) { - for (const auto &obj : objs.v) { - if (const parser::Name * - objName{parser::Unwrap(obj)}) { - if (&objName->symbol->GetUltimate() == eventHandleSym) { - context_.Say(GetContext().clauseSource, - "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the 
same construct"_err_en_US, - objName->source, clause); - } + Symbol *eventHandleSym{name->symbol}; + auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList + &objs, + std::string clause) { + for (const auto &obj : objs.v) { + if (const parser::Name *objName{ + parser::Unwrap(obj)}) { + if (&objName->symbol->GetUltimate() == eventHandleSym) { + context_.Say(GetContext().clauseSource, + "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, + objName->source, clause); } } - }; - if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_private)}) { - const auto &pClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { - const auto &fpClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { - const auto &irClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause( - std::get(irClause.v.t), - "IN_REDUCTION"); } + }; + if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_private)}) { + const auto &pClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + const auto &fpClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + const auto &irClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause( + std::get(irClause.v.t), "IN_REDUCTION"); } } } @@ -4208,17 +4205,15 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { } if (const auto *name{parser::Unwrap(x.v.v)}) { - if (name->symbol) { - if (version >= 52 && 
IsPointer(*name->symbol)) { - context_.Say(GetContext().clauseSource, - "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, - name->ToString()); - } - if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { - context_.Say(GetContext().clauseSource, - "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, - name->ToString()); - } + if (version >= 52 && IsPointer(*name->symbol)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, + name->ToString()); + } + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, + name->ToString()); } } } >From 965df852abd5ffa1735e19491c0df1b04f58470d Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:54:40 +0000 Subject: [PATCH 11/15] CheckAllowedClause for OpenMP 50 and 51 --- flang/lib/Semantics/check-omp-structure.cpp | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index e98536fd506cb..cb0e06ed6aa32 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4196,10 +4196,14 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { - // OpenMP 5.0: 2.10.1 Task construct restrictions - CheckAllowedClause(llvm::omp::Clause::OMPC_detach); - // OpenMP 5.2: 12.5.2 Detach clause restrictions unsigned version{context_.langOptions().OpenMPVersion}; + if (version >= 52) { + SetContextClauseInfo(llvm::omp::Clause::OMPC_detach); + } else { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + } + // OpenMP 5.2: 12.5.2 Detach clause restrictions 
if (version >= 52) { CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); } >From d13d267f8d8a5641c4556b98b9e4eea119f21adf Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:55:07 +0000 Subject: [PATCH 12/15] Fix the error message string to be in a single line --- flang/lib/Semantics/check-omp-structure.cpp | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index cb0e06ed6aa32..06437dc7a8588 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3842,23 +3842,18 @@ void OmpStructureChecker::CheckVarIsNotPartOfAnotherVar( std::get_if(&designator.u)}) { if (IsDataRefTypeParamInquiry(dataRef)) { context_.Say(source, - "A type parameter inquiry cannot appear on the %s " - "directive"_err_en_US, + "A type parameter inquiry cannot appear on the %s directive"_err_en_US, ContextDirectiveAsFortran()); } else if (parser::Unwrap( ompObject) || parser::Unwrap(ompObject)) { if (llvm::omp::nonPartialVarSet.test(GetContext().directive)) { context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear on the %s " - "directive"_err_en_US, + "A variable that is part of another variable (as an array or structure element) cannot appear on the %s directive"_err_en_US, ContextDirectiveAsFortran()); } else { context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear in a " - "%s clause"_err_en_US, + "A variable that is part of another variable (as an array or structure element) cannot appear in a %s clause"_err_en_US, clause.data()); } } >From 573580e5526d7981fb93077b5fd05b7a3547ec85 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:55:40 +0000 Subject: [PATCH 13/15] Add check for shared clause as well --- 
flang/lib/Semantics/check-omp-structure.cpp | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 06437dc7a8588..3223652a27187 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,6 +3183,11 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { const auto &pClause{ std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_shared)}) { + const auto &sClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(sClause.v, "SHARED"); } else if (auto *dataEnvClause{ FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { const auto &fpClause{ >From 5d9be322e9577b4ac7ba6b3b8b30336f2693b588 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:58:27 +0000 Subject: [PATCH 14/15] Remove a test --- flang/test/Semantics/OpenMP/detach01.f90 | 5 ----- 1 file changed, 5 deletions(-) diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index 8f19dfc1f92a7..ea8208c022ef1 100644 --- a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -50,9 +50,4 @@ program detach01 !$omp task detach(event_03) x = x + 1 !$omp end task - - !ERROR: At most one DETACH clause can appear on the TASK directive - !$omp task detach(event_01) detach(event_02) - x = x + 1 - !$omp end task end program >From 47fbdc4a08a30cfb70545bb1adf1f72a7c67bbd9 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 12:10:06 +0000 Subject: [PATCH 15/15] Add some more tests --- flang/test/Semantics/OpenMP/detach01.f90 | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index ea8208c022ef1..7729c85ea1128 100644 --- 
a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -31,6 +31,16 @@ program detach01 x = x + 1 !$omp end task + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on FIRSTPRIVATE clause on the same construct + !$omp task detach(event_01) firstprivate(event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on SHARED clause on the same construct + !$omp task detach(event_01) shared(event_01) + x = x + 1 + !$omp end task + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on IN_REDUCTION clause on the same construct !$omp task detach(event_01) in_reduction(+:event_01) x = x + 1 From flang-commits at lists.llvm.org Fri May 2 08:57:44 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 02 May 2025 08:57:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Hide strict volatility checks behind flag (PR #138183) In-Reply-To: Message-ID: <6814eb78.170a0220.bcd9d.0cc6@mx.google.com> https://github.com/vzakhari approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/138183 From flang-commits at lists.llvm.org Fri May 2 09:02:35 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Fri, 02 May 2025 09:02:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Hide strict volatility checks behind flag (PR #138183) In-Reply-To: Message-ID: <6814ec9b.170a0220.94d9f.0de5@mx.google.com> https://github.com/clementval approved this pull request. 
https://github.com/llvm/llvm-project/pull/138183 From flang-commits at lists.llvm.org Fri May 2 09:03:23 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 02 May 2025 09:03:23 -0700 (PDT) Subject: [flang-commits] [flang] 7220fda - [flang] Hide strict volatility checks behind flag (#138183) Message-ID: <6814eccb.170a0220.2f6e16.d49f@mx.google.com> Author: Asher Mancinelli Date: 2025-05-02T09:03:20-07:00 New Revision: 7220fdad0cbf42a9d134e157ba5821192327f4f3 URL: https://github.com/llvm/llvm-project/commit/7220fdad0cbf42a9d134e157ba5821192327f4f3 DIFF: https://github.com/llvm/llvm-project/commit/7220fdad0cbf42a9d134e157ba5821192327f4f3.diff LOG: [flang] Hide strict volatility checks behind flag (#138183) Enabling volatility lowering by default revealed some issues in lowering and op verification. For example, given volatile variable of a nested type, accessing structure members of a structure member would result in a volatility mismatch when the inner structure member is designated (and thus a verification error at compile time). In other cases, I found correct codegen when the checks were disabled, also related to allocatable types and how we handle volatile references of boxes. This hides the strict verification of fir and hlfir ops behind a flag so I can iteratively improve lowering of volatile variables without causing compile-time failures, keeping the strict verification on when running tests. 
Added: flang/test/Lower/volatile-allocatable1.f90 Modified: flang/include/flang/Optimizer/Dialect/FIROps.h flang/lib/Optimizer/Dialect/FIROps.cpp flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp flang/test/Fir/invalid.fir flang/test/Fir/volatile.fir flang/test/Fir/volatile2.fir flang/test/HLFIR/volatile.fir flang/test/HLFIR/volatile1.fir flang/test/HLFIR/volatile2.fir flang/test/HLFIR/volatile3.fir flang/test/HLFIR/volatile4.fir flang/test/Lower/volatile-openmp.f90 flang/test/Lower/volatile-string.f90 flang/test/Lower/volatile1.f90 flang/test/Lower/volatile2.f90 flang/test/Lower/volatile3.f90 flang/test/Lower/volatile4.f90 Removed: ################################################################################ diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.h b/flang/include/flang/Optimizer/Dialect/FIROps.h index 15bd512ea85af..1bed227afb50d 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.h +++ b/flang/include/flang/Optimizer/Dialect/FIROps.h @@ -40,6 +40,7 @@ mlir::ParseResult parseSelector(mlir::OpAsmParser &parser, mlir::OperationState &result, mlir::OpAsmParser::UnresolvedOperand &selector, mlir::Type &type); +bool useStrictVolatileVerification(); static constexpr llvm::StringRef getNormalizedLowerBoundAttrName() { return "normalized.lb"; diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index 8a24608336495..05ef69169bae5 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -33,11 +33,21 @@ #include "llvm/ADT/STLExtras.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/TypeSwitch.h" +#include "llvm/Support/CommandLine.h" namespace { #include "flang/Optimizer/Dialect/CanonicalizationPatterns.inc" } // namespace +static llvm::cl::opt clUseStrictVolatileVerification( + "strict-fir-volatile-verifier", llvm::cl::init(false), + llvm::cl::desc( + "use stricter verifier for FIR operations with volatile types")); + +bool fir::useStrictVolatileVerification() { + 
return clUseStrictVolatileVerification; +} + static void propagateAttributes(mlir::Operation *fromOp, mlir::Operation *toOp) { if (!fromOp || !toOp) @@ -1535,11 +1545,14 @@ llvm::LogicalResult fir::ConvertOp::verify() { // represent volatility. const bool toLLVMPointer = mlir::isa(outType); const bool toInteger = fir::isa_integer(outType); - if (fir::isa_volatile_type(inType) != fir::isa_volatile_type(outType) && - !toLLVMPointer && !toInteger) - return emitOpError("cannot convert between volatile and non-volatile " - "types, use fir.volatile_cast instead ") - << inType << " / " << outType; + if (fir::useStrictVolatileVerification()) { + if (fir::isa_volatile_type(inType) != fir::isa_volatile_type(outType) && + !toLLVMPointer && !toInteger) { + return emitOpError("cannot convert between volatile and non-volatile " + "types, use fir.volatile_cast instead ") + << inType << " / " << outType; + } + } if (canBeConverted(inType, outType)) return mlir::success(); return emitOpError("invalid type conversion") @@ -1841,6 +1854,10 @@ llvm::LogicalResult fir::TypeInfoOp::verify() { static llvm::LogicalResult verifyEmboxOpVolatilityInvariants(mlir::Type memrefType, mlir::Type resultType) { + + if (!fir::useStrictVolatileVerification()) + return mlir::success(); + mlir::Type boxElementType = llvm::TypeSwitch(resultType) .Case( diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp index c5ed76753ea0c..eef1377f26961 100644 --- a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp +++ b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp @@ -423,8 +423,9 @@ llvm::LogicalResult hlfir::DesignateOp::verify() { unsigned outputRank = 0; mlir::Type outputElementType; bool hasBoxComponent; - if (fir::isa_volatile_type(memrefType) != - fir::isa_volatile_type(getResult().getType())) { + if (fir::useStrictVolatileVerification() && + fir::isa_volatile_type(memrefType) != + fir::isa_volatile_type(getResult().getType())) { return emitOpError("volatility mismatch 
between memref and result type") << " memref type: " << memrefType << " result type: " << getResult().getType(); diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index 447a6c68b4b0a..f9f5e267dd9bc 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -1,6 +1,6 @@ -// FIR ops diagnotic tests -// RUN: fir-opt -split-input-file -verify-diagnostics %s + +// RUN: fir-opt -split-input-file -verify-diagnostics --strict-fir-volatile-verifier %s // expected-error at +1{{custom op 'fir.string_lit' must have character type}} %0 = fir.string_lit "Hello, World!"(13) : !fir.int<32> diff --git a/flang/test/Fir/volatile.fir b/flang/test/Fir/volatile.fir index 6b3d8709abdeb..9a7853083799f 100644 --- a/flang/test/Fir/volatile.fir +++ b/flang/test/Fir/volatile.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" %s -o - | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" %s -o - | FileCheck %s // CHECK: llvm.store volatile %{{.+}}, %{{.+}} : i32, !llvm.ptr // CHECK: %{{.+}} = llvm.load volatile %{{.+}} : !llvm.ptr -> i32 func.func @foo() { diff --git a/flang/test/Fir/volatile2.fir b/flang/test/Fir/volatile2.fir index 82a8413d2fc02..d7c7351c361dd 100644 --- a/flang/test/Fir/volatile2.fir +++ b/flang/test/Fir/volatile2.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt --fir-to-llvm-ir %s | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier --fir-to-llvm-ir %s | FileCheck %s func.func @_QQmain() { %0 = fir.alloca !fir.box, volatile> %c1 = arith.constant 1 : index diff --git a/flang/test/HLFIR/volatile.fir b/flang/test/HLFIR/volatile.fir index 453413a93af44..6d43bf20a702b 100644 --- a/flang/test/HLFIR/volatile.fir +++ b/flang/test/HLFIR/volatile.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt --convert-hlfir-to-fir %s -o - | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier --convert-hlfir-to-fir %s -o - | FileCheck %s func.func @foo() { %true = 
arith.constant true diff --git a/flang/test/HLFIR/volatile1.fir b/flang/test/HLFIR/volatile1.fir index 174acd77f9076..c6150fe72ed66 100644 --- a/flang/test/HLFIR/volatile1.fir +++ b/flang/test/HLFIR/volatile1.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s func.func @_QQmain() attributes {fir.bindc_name = "p"} { %0 = fir.address_of(@_QFEarr) : !fir.ref> %c10 = arith.constant 10 : index diff --git a/flang/test/HLFIR/volatile2.fir b/flang/test/HLFIR/volatile2.fir index 86ac683adad3f..0501cfcc8e8ac 100644 --- a/flang/test/HLFIR/volatile2.fir +++ b/flang/test/HLFIR/volatile2.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s func.func private @_QFPa() -> i32 attributes {fir.host_symbol = @_QQmain, llvm.linkage = #llvm.linkage} { %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFFaEa"} %1 = fir.volatile_cast %0 : (!fir.ref) -> !fir.ref diff --git a/flang/test/HLFIR/volatile3.fir b/flang/test/HLFIR/volatile3.fir index 41e42916e8ee5..24ea4e4b6df97 100644 --- a/flang/test/HLFIR/volatile3.fir +++ b/flang/test/HLFIR/volatile3.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s +// RUN: fir-opt --strict-fir-volatile-verifier %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s func.func @_QQmain() attributes {fir.bindc_name = "p"} { %0 = fir.address_of(@_QFEarr) : !fir.ref> %c10 = arith.constant 10 : index diff --git a/flang/test/HLFIR/volatile4.fir b/flang/test/HLFIR/volatile4.fir index cbf0aa31cb9f3..8980bcf932f81 100644 --- a/flang/test/HLFIR/volatile4.fir +++ b/flang/test/HLFIR/volatile4.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s +// RUN: fir-opt 
--strict-fir-volatile-verifier %s --bufferize-hlfir --convert-hlfir-to-fir | FileCheck %s func.func @_QQmain() attributes {fir.bindc_name = "p"} { %0 = fir.address_of(@_QFEarr) : !fir.ref> %c10 = arith.constant 10 : index diff --git a/flang/test/Lower/volatile-allocatable1.f90 b/flang/test/Lower/volatile-allocatable1.f90 new file mode 100644 index 0000000000000..a21359c3b4225 --- /dev/null +++ b/flang/test/Lower/volatile-allocatable1.f90 @@ -0,0 +1,17 @@ +! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s + +! Requires correct propagation of volatility for allocatable nested types. +! XFAIL: * + +function allocatable_udt() + type :: base_type + integer :: i = 42 + end type + type, extends(base_type) :: ext_type + integer :: j = 100 + end type + integer :: allocatable_udt + type(ext_type), allocatable, volatile :: v2(:,:) + allocate(v2(2,3)) + allocatable_udt = v2(1,1)%i +end function diff --git a/flang/test/Lower/volatile-openmp.f90 b/flang/test/Lower/volatile-openmp.f90 index 64fd5f04afdd6..28f0bf78f33c9 100644 --- a/flang/test/Lower/volatile-openmp.f90 +++ b/flang/test/Lower/volatile-openmp.f90 @@ -1,4 +1,4 @@ -! RUN: bbc -fopenmp %s -o - | FileCheck %s +! RUN: bbc --strict-fir-volatile-verifier -fopenmp %s -o - | FileCheck %s type t integer, pointer :: array(:) end type diff --git a/flang/test/Lower/volatile-string.f90 b/flang/test/Lower/volatile-string.f90 index 9173268880ace..88b21d7b245e9 100644 --- a/flang/test/Lower/volatile-string.f90 +++ b/flang/test/Lower/volatile-string.f90 @@ -1,4 +1,4 @@ -! RUN: bbc %s -o - | FileCheck %s +! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s program p character(3), volatile :: string = 'foo' character(3) :: nonvolatile_string diff --git a/flang/test/Lower/volatile1.f90 b/flang/test/Lower/volatile1.f90 index 8447704619db0..385b9fa3bd1ad 100644 --- a/flang/test/Lower/volatile1.f90 +++ b/flang/test/Lower/volatile1.f90 @@ -1,4 +1,4 @@ -! RUN: bbc %s -o - | FileCheck %s +! 
RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s program p integer,volatile::i,arr(10) diff --git a/flang/test/Lower/volatile2.f90 b/flang/test/Lower/volatile2.f90 index 4b7f185f24c41..defacf820bd54 100644 --- a/flang/test/Lower/volatile2.f90 +++ b/flang/test/Lower/volatile2.f90 @@ -1,4 +1,4 @@ -! RUN: bbc %s -o - | FileCheck %s +! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s program p print*,a(),b(),c() diff --git a/flang/test/Lower/volatile3.f90 b/flang/test/Lower/volatile3.f90 index dee6642e82593..8825f8f3afbcb 100644 --- a/flang/test/Lower/volatile3.f90 +++ b/flang/test/Lower/volatile3.f90 @@ -1,4 +1,4 @@ -! RUN: bbc %s -o - | FileCheck %s +! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s ! Test that all combinations of volatile pointer and target are properly lowered - ! note that a volatile pointer implies that the target is volatile, even if not specified diff --git a/flang/test/Lower/volatile4.f90 b/flang/test/Lower/volatile4.f90 index 42d7b68507b53..83ce2b8fdb25a 100644 --- a/flang/test/Lower/volatile4.f90 +++ b/flang/test/Lower/volatile4.f90 @@ -1,4 +1,4 @@ -! RUN: bbc %s -o - | FileCheck %s +! 
RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s program p integer,volatile::i,arr(10) From flang-commits at lists.llvm.org Fri May 2 09:03:26 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Fri, 02 May 2025 09:03:26 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Hide strict volatility checks behind flag (PR #138183) In-Reply-To: Message-ID: <6814ecce.050a0220.309e8a.4b79@mx.google.com> https://github.com/ashermancinelli closed https://github.com/llvm/llvm-project/pull/138183 From flang-commits at lists.llvm.org Fri May 2 09:26:50 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 02 May 2025 09:26:50 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] WIP: Add frontend support for declare variant (PR #130578) In-Reply-To: Message-ID: <6814f24a.170a0220.209b32.231b@mx.google.com> https://github.com/kiranchandramohan updated https://github.com/llvm/llvm-project/pull/130578

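The flag-gated verification described in the log for commit 7220fda ("Hide strict volatility checks behind flag") can be sketched outside of LLVM as follows. This is only a standalone model under stated assumptions: `strictVolatileVerification`, `Conversion`, `verifyConversion`, and `runVerifierExamples` are hypothetical names introduced here, and the plain `bool` stands in for the `llvm::cl::opt` behind `--strict-fir-volatile-verifier`. It is not the code from `FIROps.cpp`.

```cpp
// Standalone model of a verifier check that is only enforced when a
// command-line style flag is set. No LLVM/MLIR headers are required.
// Every name in this block is a hypothetical stand-in, not a flang API.
#include <cassert>
#include <string>

// Stand-in for the --strict-fir-volatile-verifier option (default: off).
static bool strictVolatileVerification = false;

// Minimal description of a type conversion, as seen by the verifier.
struct Conversion {
  bool inIsVolatile;   // source type carries volatility
  bool outIsVolatile;  // result type carries volatility
  bool toLLVMPointer;  // result is an LLVM pointer (exempt from the check)
  bool toInteger;      // result is an integer (exempt from the check)
};

// Returns an empty string on success, or a diagnostic message on failure.
std::string verifyConversion(const Conversion &c) {
  if (strictVolatileVerification) {
    // The volatility mismatch is only diagnosed in strict mode.
    if (c.inIsVolatile != c.outIsVolatile && !c.toLLVMPointer && !c.toInteger)
      return "cannot convert between volatile and non-volatile types";
  }
  return "";
}

// Usage example: the same mismatch passes by default and fails in strict mode.
void runVerifierExamples() {
  const Conversion mismatch{true, false, false, false};
  strictVolatileVerification = false;
  assert(verifyConversion(mismatch).empty());
  strictVolatileVerification = true;
  assert(!verifyConversion(mismatch).empty());
  strictVolatileVerification = false;
}
```

Keeping the default off while passing the flag in every test, as the commit does with its RUN lines, means the strict path stays exercised by the test suite without rejecting user code that lowering does not yet handle.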
From flang-commits at lists.llvm.org Fri May 2 09:27:54 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 02 May 2025 09:27:54 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] WIP: Add frontend support for declare variant (PR #130578) In-Reply-To: Message-ID: <6814f28a.170a0220.1d7590.e27d@mx.google.com> ================ @@ -1494,6 +1494,25 @@ class OmpVisitor : public virtual DeclarationVisitor { return true; } + bool Pre(const parser::OmpDeclareVariantDirective &x) { + AddOmpSourceRange(x.source); + auto FindSymbolOrError = [&](const parser::Name &procName) { + auto *symbol{FindSymbol(NonDerivedTypeScope(), procName)}; + if (!symbol) { + context().Say(procName.source, + "Implicit subroutine declaration '%s' in !$OMP DECLARE VARIANT"_err_en_US, ---------------- kiranchandramohan wrote: Added test https://github.com/llvm/llvm-project/pull/130578 From flang-commits at lists.llvm.org Fri May 2 09:32:10 2025 From: flang-commits at lists.llvm.org (Paul Walker via flang-commits) Date: Fri, 02 May 2025 09:32:10 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Clang][Driver] Fix target parsing for -fveclib=libmvec option. (PR #138288) In-Reply-To: Message-ID: <6814f38a.170a0220.1dc11a.2a3d@mx.google.com> https://github.com/paulwalker-arm updated https://github.com/llvm/llvm-project/pull/138288 Rate limit · GitHub

From flang-commits at lists.llvm.org Fri May 2 09:32:39 2025
From: flang-commits at lists.llvm.org (Paul Walker via flang-commits)
Date: Fri, 02 May 2025 09:32:39 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [Clang][Flang][Driver] Fix target parsing for -fveclib=libmvec option. (PR #138288)
In-Reply-To:
Message-ID: <6814f3a7.170a0220.1d7590.e67f@mx.google.com>

https://github.com/paulwalker-arm edited https://github.com/llvm/llvm-project/pull/138288

From flang-commits at lists.llvm.org Fri May 2 09:33:22 2025
From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits)
Date: Fri, 02 May 2025 09:33:22 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] WIP: Add frontend support for declare variant (PR #130578)
In-Reply-To:
Message-ID: <6814f3d2.630a0220.250127.2c91@mx.google.com>

https://github.com/kiranchandramohan updated https://github.com/llvm/llvm-project/pull/130578

From flang-commits at lists.llvm.org Fri May 2 11:33:59 2025
From: flang-commits at lists.llvm.org (=?UTF-8?B?2YXZh9iv2Yog2LTZitmG2YjZhg==?= via flang-commits)
Date: Fri, 02 May 2025 11:33:59 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Inherit target specific code for BIND(C) types on Windows (PR #129579)
In-Reply-To:
Message-ID: <68151017.630a0220.250127.5fe7@mx.google.com>

MehdiChinoune wrote:

Any progress?

https://github.com/llvm/llvm-project/pull/129579

From flang-commits at lists.llvm.org Fri May 2 13:22:35 2025
From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits)
Date: Fri, 02 May 2025 13:22:35 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Component references are volatile if their parent is (PR #138339)
Message-ID:

https://github.com/ashermancinelli created https://github.com/llvm/llvm-project/pull/138339

Component references inherit volatility from their base derived types. Moved the base type volatility check before the box type is built, and merge it (instead of overwrite it) with the volatility of the base type.

>From 1e4bf1066a919af5b5bcaac18ef43eb197818d7a Mon Sep 17 00:00:00 2001
From: Asher Mancinelli
Date: Fri, 2 May 2025 13:13:09 -0700
Subject: [PATCH] [flang] Component references are volatile if their parent is

Component references inherit volatility from their base derived types. Moved the base type volatility check before the box type is built, and merge it (instead of overwrite it) with the volatility of the base type.

---
 flang/lib/Lower/ConvertExprToHLFIR.cpp     | 12 +++---
 flang/test/Lower/volatile-derived-type.f90 | 48 ++++++++++++++++++++++
 2 files changed, 54 insertions(+), 6 deletions(-)
 create mode 100644 flang/test/Lower/volatile-derived-type.f90

diff --git a/flang/lib/Lower/ConvertExprToHLFIR.cpp b/flang/lib/Lower/ConvertExprToHLFIR.cpp
index a3be50ac072d4..5981116a6d3f7 100644
--- a/flang/lib/Lower/ConvertExprToHLFIR.cpp
+++ b/flang/lib/Lower/ConvertExprToHLFIR.cpp
@@ -236,6 +236,12 @@ class HlfirDesignatorBuilder {
       isVolatile = true;
     }
 
+    // Check if the base type is volatile
+    if (partInfo.base.has_value()) {
+      mlir::Type baseType = partInfo.base.value().getType();
+      isVolatile = isVolatile || fir::isa_volatile_type(baseType);
+    }
+
     // Arrays with non default lower bounds or dynamic length or dynamic extent
     // need a fir.box to hold the dynamic or lower bound information.
     if (fir::hasDynamicSize(resultValueType) ||
@@ -249,12 +255,6 @@ class HlfirDesignatorBuilder {
             /*namedConstantSectionsAreAlwaysContiguous=*/false))
       return fir::BoxType::get(resultValueType, isVolatile);
 
-    // Check if the base type is volatile
-    if (partInfo.base.has_value()) {
-      mlir::Type baseType = partInfo.base.value().getType();
-      isVolatile = fir::isa_volatile_type(baseType);
-    }
-
     // Other designators can be handled as raw addresses.
     return fir::ReferenceType::get(resultValueType, isVolatile);
   }
diff --git a/flang/test/Lower/volatile-derived-type.f90 b/flang/test/Lower/volatile-derived-type.f90
new file mode 100644
index 0000000000000..edd77a9265530
--- /dev/null
+++ b/flang/test/Lower/volatile-derived-type.f90
@@ -0,0 +1,48 @@
+! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s
+! Ensure member access of a volatile derived type is volatile.
+  type t
+    integer :: e(4)=2
+  end type t
+  type(t), volatile :: f
+  call test (f%e(::2))
+contains
+  subroutine test(v)
+    integer, asynchronous :: v(:)
+  end subroutine
+end
+! CHECK-LABEL: func.func @_QQmain() {
+! CHECK: %[[VAL_0:.*]] = arith.constant 4 : index
+! CHECK: %[[VAL_1:.*]] = arith.constant 1 : index
+! CHECK: %[[VAL_2:.*]] = arith.constant 2 : index
+! CHECK: %[[VAL_3:.*]] = arith.constant 0 : index
+! CHECK: %[[VAL_4:.*]] = fir.dummy_scope : !fir.dscope
+! CHECK: %[[VAL_5:.*]] = fir.address_of(@_QFE.b.t.e) : !fir.ref,value:i64}>>>
+! CHECK: %[[VAL_6:.*]] = fir.shape_shift %[[VAL_3]], %[[VAL_2]], %[[VAL_3]], %[[VAL_1]] : (index, index, index, index) -> !fir.shapeshift<2>
+! CHECK: %[[VAL_7:.*]]:2 = hlfir.declare %[[VAL_5]](%[[VAL_6]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.b.t.e"} :
+! CHECK: %[[VAL_8:.*]] = fir.address_of(@_QFE.n.e) : !fir.ref>
+! CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_8]] typeparams %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.n.e"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>)
+! CHECK: %[[VAL_10:.*]] = fir.address_of(@_QFE.di.t.e) : !fir.ref>
+! CHECK: %[[VAL_11:.*]] = fir.shape %[[VAL_0]] : (index) -> !fir.shape<1>
+! CHECK: %[[VAL_12:.*]]:2 = hlfir.declare %[[VAL_10]](%[[VAL_11]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.di.t.e"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>)
+! CHECK: %[[VAL_13:.*]] = fir.address_of(@_QFE.n.t) : !fir.ref>
+! CHECK: %[[VAL_14:.*]]:2 = hlfir.declare %[[VAL_13]] typeparams %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.n.t"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>)
+! CHECK: %[[VAL_15:.*]] = fir.alloca !fir.type<_QFTt{e:!fir.array<4xi32>}> {bindc_name = "f", uniq_name = "_QFEf"}
+! CHECK: %[[VAL_16:.*]] = fir.volatile_cast %[[VAL_15]] : (!fir.ref}>>) -> !fir.ref}>, volatile>
+! CHECK: %[[VAL_17:.*]]:2 = hlfir.declare %[[VAL_16]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEf"} : (!fir.ref}>, volatile>) -> (!fir.ref}>, volatile>, !fir.ref}>, volatile>)
+! CHECK: %[[VAL_18:.*]] = fir.address_of(@_QQ_QFTt.DerivedInit) : !fir.ref}>>
+! CHECK: fir.copy %[[VAL_18]] to %[[VAL_17]]#0 no_overlap : !fir.ref}>>, !fir.ref}>, volatile>
+! CHECK: %[[VAL_20:.*]] = fir.shape_shift %[[VAL_3]], %[[VAL_1]] : (index, index) -> !fir.shapeshift<1>
+! CHECK: %[[VAL_21:.*]]:2 = hlfir.declare %{{.+}}(%[[VAL_20]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} :
+! CHECK: %[[VAL_24:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1>
+! CHECK: %[[VAL_25:.*]] = hlfir.designate %[[VAL_17]]#0{"e"} <%[[VAL_11]]> (%[[VAL_1]]:%[[VAL_0]]:%[[VAL_2]]) shape %[[VAL_24]] : (!fir.ref}>, volatile>, !fir.shape<1>, index, index, index, !fir.shape<1>) -> !fir.box, volatile>
+! CHECK: %[[VAL_26:.*]] = fir.volatile_cast %[[VAL_25]] : (!fir.box, volatile>) -> !fir.box>
+! CHECK: %[[VAL_27:.*]] = fir.convert %[[VAL_26]] : (!fir.box>) -> !fir.box>
+! CHECK: fir.call @_QFPtest(%[[VAL_27]]) fastmath : (!fir.box>) -> ()
+! CHECK: return
+! CHECK: }
+! CHECK-LABEL: func.func private @_QFPtest(
+! CHECK-SAME: %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !fir.box> {fir.asynchronous, fir.bindc_name = "v"}) attributes {fir.host_symbol = @_QQmain, llvm.linkage = #llvm.linkage} {
+! CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope
+! CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_0]] dummy_scope %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFFtestEv"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>)
+! CHECK: return
+! CHECK: }

From flang-commits at lists.llvm.org Fri May 2 13:23:09 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Fri, 02 May 2025 13:23:09 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Component references are volatile if their parent is (PR #138339)
In-Reply-To:
Message-ID: <681529ad.050a0220.9bbc0.b9d9@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-fir-hlfir

Author: Asher Mancinelli (ashermancinelli)
Changes

Component references inherit volatility from their base derived types. Moved the base type volatility check before the box type is built, and merge it (instead of overwrite it) with the volatility of the base type.

---
Full diff: https://github.com/llvm/llvm-project/pull/138339.diff

2 Files Affected:

- (modified) flang/lib/Lower/ConvertExprToHLFIR.cpp (+6-6)
- (added) flang/test/Lower/volatile-derived-type.f90 (+48)

``````````diff
diff --git a/flang/lib/Lower/ConvertExprToHLFIR.cpp b/flang/lib/Lower/ConvertExprToHLFIR.cpp
index a3be50ac072d4..5981116a6d3f7 100644
--- a/flang/lib/Lower/ConvertExprToHLFIR.cpp
+++ b/flang/lib/Lower/ConvertExprToHLFIR.cpp
@@ -236,6 +236,12 @@ class HlfirDesignatorBuilder {
       isVolatile = true;
     }
 
+    // Check if the base type is volatile
+    if (partInfo.base.has_value()) {
+      mlir::Type baseType = partInfo.base.value().getType();
+      isVolatile = isVolatile || fir::isa_volatile_type(baseType);
+    }
+
     // Arrays with non default lower bounds or dynamic length or dynamic extent
     // need a fir.box to hold the dynamic or lower bound information.
     if (fir::hasDynamicSize(resultValueType) ||
@@ -249,12 +255,6 @@ class HlfirDesignatorBuilder {
             /*namedConstantSectionsAreAlwaysContiguous=*/false))
       return fir::BoxType::get(resultValueType, isVolatile);
 
-    // Check if the base type is volatile
-    if (partInfo.base.has_value()) {
-      mlir::Type baseType = partInfo.base.value().getType();
-      isVolatile = fir::isa_volatile_type(baseType);
-    }
-
     // Other designators can be handled as raw addresses.
     return fir::ReferenceType::get(resultValueType, isVolatile);
   }
diff --git a/flang/test/Lower/volatile-derived-type.f90 b/flang/test/Lower/volatile-derived-type.f90
new file mode 100644
index 0000000000000..edd77a9265530
--- /dev/null
+++ b/flang/test/Lower/volatile-derived-type.f90
@@ -0,0 +1,48 @@
+! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s
+! Ensure member access of a volatile derived type is volatile.
+  type t
+    integer :: e(4)=2
+  end type t
+  type(t), volatile :: f
+  call test (f%e(::2))
+contains
+  subroutine test(v)
+    integer, asynchronous :: v(:)
+  end subroutine
+end
+! CHECK-LABEL: func.func @_QQmain() {
+! CHECK: %[[VAL_0:.*]] = arith.constant 4 : index
+! CHECK: %[[VAL_1:.*]] = arith.constant 1 : index
+! CHECK: %[[VAL_2:.*]] = arith.constant 2 : index
+! CHECK: %[[VAL_3:.*]] = arith.constant 0 : index
+! CHECK: %[[VAL_4:.*]] = fir.dummy_scope : !fir.dscope
+! CHECK: %[[VAL_5:.*]] = fir.address_of(@_QFE.b.t.e) : !fir.ref,value:i64}>>>
+! CHECK: %[[VAL_6:.*]] = fir.shape_shift %[[VAL_3]], %[[VAL_2]], %[[VAL_3]], %[[VAL_1]] : (index, index, index, index) -> !fir.shapeshift<2>
+! CHECK: %[[VAL_7:.*]]:2 = hlfir.declare %[[VAL_5]](%[[VAL_6]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.b.t.e"} :
+! CHECK: %[[VAL_8:.*]] = fir.address_of(@_QFE.n.e) : !fir.ref>
+! CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_8]] typeparams %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.n.e"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>)
+! CHECK: %[[VAL_10:.*]] = fir.address_of(@_QFE.di.t.e) : !fir.ref>
+! CHECK: %[[VAL_11:.*]] = fir.shape %[[VAL_0]] : (index) -> !fir.shape<1>
+! CHECK: %[[VAL_12:.*]]:2 = hlfir.declare %[[VAL_10]](%[[VAL_11]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.di.t.e"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>)
+! CHECK: %[[VAL_13:.*]] = fir.address_of(@_QFE.n.t) : !fir.ref>
+! CHECK: %[[VAL_14:.*]]:2 = hlfir.declare %[[VAL_13]] typeparams %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.n.t"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>)
+! CHECK: %[[VAL_15:.*]] = fir.alloca !fir.type<_QFTt{e:!fir.array<4xi32>}> {bindc_name = "f", uniq_name = "_QFEf"}
+! CHECK: %[[VAL_16:.*]] = fir.volatile_cast %[[VAL_15]] : (!fir.ref}>>) -> !fir.ref}>, volatile>
+! CHECK: %[[VAL_17:.*]]:2 = hlfir.declare %[[VAL_16]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEf"} : (!fir.ref}>, volatile>) -> (!fir.ref}>, volatile>, !fir.ref}>, volatile>)
+! CHECK: %[[VAL_18:.*]] = fir.address_of(@_QQ_QFTt.DerivedInit) : !fir.ref}>>
+! CHECK: fir.copy %[[VAL_18]] to %[[VAL_17]]#0 no_overlap : !fir.ref}>>, !fir.ref}>, volatile>
+! CHECK: %[[VAL_20:.*]] = fir.shape_shift %[[VAL_3]], %[[VAL_1]] : (index, index) -> !fir.shapeshift<1>
+! CHECK: %[[VAL_21:.*]]:2 = hlfir.declare %{{.+}}(%[[VAL_20]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} :
+! CHECK: %[[VAL_24:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1>
+! CHECK: %[[VAL_25:.*]] = hlfir.designate %[[VAL_17]]#0{"e"} <%[[VAL_11]]> (%[[VAL_1]]:%[[VAL_0]]:%[[VAL_2]]) shape %[[VAL_24]] : (!fir.ref}>, volatile>, !fir.shape<1>, index, index, index, !fir.shape<1>) -> !fir.box, volatile>
+! CHECK: %[[VAL_26:.*]] = fir.volatile_cast %[[VAL_25]] : (!fir.box, volatile>) -> !fir.box>
+! CHECK: %[[VAL_27:.*]] = fir.convert %[[VAL_26]] : (!fir.box>) -> !fir.box>
+! CHECK: fir.call @_QFPtest(%[[VAL_27]]) fastmath : (!fir.box>) -> ()
+! CHECK: return
+! CHECK: }
+! CHECK-LABEL: func.func private @_QFPtest(
+! CHECK-SAME: %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !fir.box> {fir.asynchronous, fir.bindc_name = "v"}) attributes {fir.host_symbol = @_QQmain, llvm.linkage = #llvm.linkage} {
+! CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope
+! CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_0]] dummy_scope %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFFtestEv"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>)
+! CHECK: return
+! CHECK: }
``````````
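
The control-flow change in the diff above can be modelled outside of flang. The following is only a sketch under stated assumptions: `PartInfo`, `ResultType`, `computeTypeBefore` and `computeTypeAfter` are hypothetical stand-ins invented for illustration, not flang or MLIR APIs. It shows why the old ordering could lose volatility: the box path returned before the base type was inspected, and the reference path overwrote the flag instead of merging it.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration; none of these names exist in flang.
struct PartInfo {
  std::optional<bool> baseIsVolatile; // volatility of the base type, if there is a base
  bool needsBox;                      // true when the designator needs a descriptor
};

struct ResultType {
  std::string kind; // "box" or "ref"
  bool isVolatile;
};

// Ordering before the patch: the base check sits after the box return
// and assigns the flag rather than OR-ing into it.
ResultType computeTypeBefore(const PartInfo &part, bool isVolatile) {
  if (part.needsBox)
    return {"box", isVolatile}; // base volatility is never consulted here
  if (part.baseIsVolatile.has_value())
    isVolatile = *part.baseIsVolatile; // overwrite: an earlier 'true' can be lost
  return {"ref", isVolatile};
}

// Ordering after the patch: merge the base volatility first, then pick
// the box or reference result.
ResultType computeTypeAfter(const PartInfo &part, bool isVolatile) {
  if (part.baseIsVolatile.has_value())
    isVolatile = isVolatile || *part.baseIsVolatile;
  if (part.needsBox)
    return {"box", isVolatile};
  return {"ref", isVolatile};
}
```

In this model, a strided section such as `f%e(::2)` corresponds to `PartInfo{true, true}`: `computeTypeBefore` yields a non-volatile box, while `computeTypeAfter` yields a volatile one, which matches the `!fir.box<..., volatile>` result the new test expects from `hlfir.designate`.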
https://github.com/llvm/llvm-project/pull/138339 From flang-commits at lists.llvm.org Fri May 2 11:31:50 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 02 May 2025 11:31:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <68150f96.650a0220.e2599.41b0@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/136012 >From 0f4591ee621e2e9d7acb0e6066b556cb7e243162 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Wed, 16 Apr 2025 12:01:24 -0700 Subject: [PATCH 01/11] initial commit --- flang/include/flang/Lower/AbstractConverter.h | 4 + flang/include/flang/Lower/OpenACC.h | 10 +- flang/include/flang/Semantics/symbol.h | 23 +- flang/lib/Lower/Bridge.cpp | 7 +- flang/lib/Lower/CallInterface.cpp | 10 + flang/lib/Lower/OpenACC.cpp | 197 ++++++++++++++---- flang/lib/Semantics/mod-file.cpp | 1 + flang/lib/Semantics/resolve-directives.cpp | 83 ++++---- 8 files changed, 233 insertions(+), 102 deletions(-) diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 1d1323642bf9c..59419e829718f 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -14,6 +14,7 @@ #define FORTRAN_LOWER_ABSTRACTCONVERTER_H #include "flang/Lower/LoweringOptions.h" +#include "flang/Lower/OpenACC.h" #include "flang/Lower/PFTDefs.h" #include "flang/Optimizer/Builder/BoxValue.h" #include "flang/Optimizer/Dialect/FIRAttr.h" @@ -357,6 +358,9 @@ class AbstractConverter { /// functions in order to be in sync). virtual mlir::SymbolTable *getMLIRSymbolTable() = 0; + virtual Fortran::lower::AccRoutineInfoMappingList & + getAccDelayedRoutines() = 0; + private: /// Options controlling lowering behavior. 
const Fortran::lower::LoweringOptions &loweringOptions; diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 0d7038a7fd856..7832e8b69ea23 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -22,6 +22,9 @@ class StringRef; } // namespace llvm namespace mlir { +namespace func { +class FuncOp; +} class Location; class Type; class ModuleOp; @@ -42,6 +45,7 @@ struct OpenACCRoutineConstruct; } // namespace parser namespace semantics { +class OpenACCRoutineInfo; class SemanticsContext; class Symbol; } // namespace semantics @@ -79,8 +83,10 @@ void genOpenACCDeclarativeConstruct(AbstractConverter &, void genOpenACCRoutineConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, mlir::ModuleOp, - const parser::OpenACCRoutineConstruct &, - AccRoutineInfoMappingList &); + const parser::OpenACCRoutineConstruct &); +void genOpenACCRoutineConstruct( + AbstractConverter &, mlir::ModuleOp, mlir::func::FuncOp, + const std::vector &); void finalizeOpenACCRoutineAttachment(mlir::ModuleOp, AccRoutineInfoMappingList &); diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 715811885c219..1b6b247c9f5bc 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -127,6 +127,8 @@ class WithBindName { // Device type specific OpenACC routine information class OpenACCRoutineDeviceTypeInfo { public: + OpenACCRoutineDeviceTypeInfo(Fortran::common::OpenACCDeviceType dType) + : deviceType_{dType} {} bool isSeq() const { return isSeq_; } void set_isSeq(bool value = true) { isSeq_ = value; } bool isVector() const { return isVector_; } @@ -141,9 +143,7 @@ class OpenACCRoutineDeviceTypeInfo { return bindName_ ? 
&*bindName_ : nullptr; } void set_bindName(std::string &&name) { bindName_ = std::move(name); } - void set_dType(Fortran::common::OpenACCDeviceType dType) { - deviceType_ = dType; - } + Fortran::common::OpenACCDeviceType dType() const { return deviceType_; } private: @@ -162,13 +162,24 @@ class OpenACCRoutineDeviceTypeInfo { // in as objects in the OpenACCRoutineDeviceTypeInfo list. class OpenACCRoutineInfo : public OpenACCRoutineDeviceTypeInfo { public: + OpenACCRoutineInfo() + : OpenACCRoutineDeviceTypeInfo(Fortran::common::OpenACCDeviceType::None) { + } bool isNohost() const { return isNohost_; } void set_isNohost(bool value = true) { isNohost_ = value; } - std::list &deviceTypeInfos() { + const std::list &deviceTypeInfos() const { return deviceTypeInfos_; } - void add_deviceTypeInfo(OpenACCRoutineDeviceTypeInfo &info) { - deviceTypeInfos_.push_back(info); + + OpenACCRoutineDeviceTypeInfo &add_deviceTypeInfo( + Fortran::common::OpenACCDeviceType type) { + return add_deviceTypeInfo(OpenACCRoutineDeviceTypeInfo(type)); + } + + OpenACCRoutineDeviceTypeInfo &add_deviceTypeInfo( + OpenACCRoutineDeviceTypeInfo &&info) { + deviceTypeInfos_.push_back(std::move(info)); + return deviceTypeInfos_.back(); } private: diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index b4d1197822a43..9285d587585f8 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -443,7 +443,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); Fortran::lower::genOpenACCRoutineConstruct( *this, bridge.getSemanticsContext(), bridge.getModule(), - d.routine, accRoutineInfos); + d.routine); builder = nullptr; }, }, @@ -4287,6 +4287,11 @@ class FirConverter : public Fortran::lower::AbstractConverter { return Fortran::lower::createMutableBox(loc, *this, expr, localSymbols); } + Fortran::lower::AccRoutineInfoMappingList & + getAccDelayedRoutines() override final { + return accRoutineInfos; + } 
+ // Create the [newRank] array with the lower bounds to be passed to the // runtime as a descriptor. mlir::Value createLboundArray(llvm::ArrayRef lbounds, diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 226ba1e52c968..867248f16237e 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -1689,6 +1689,16 @@ class SignatureBuilder "SignatureBuilder should only be used once"); declare(); interfaceDetermined = true; + if (procDesignator && procDesignator->GetInterfaceSymbol() && + procDesignator->GetInterfaceSymbol() + ->has()) { + auto info = procDesignator->GetInterfaceSymbol() + ->get(); + if (!info.openACCRoutineInfos().empty()) { + genOpenACCRoutineConstruct(converter, converter.getModuleOp(), + getFuncOp(), info.openACCRoutineInfos()); + } + } return getFuncOp(); } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 3dd35ed9ae481..37b660408af6c 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -38,6 +38,7 @@ #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" +#include #define DEBUG_TYPE "flang-lower-openacc" @@ -4139,11 +4140,152 @@ static void attachRoutineInfo(mlir::func::FuncOp func, mlir::acc::RoutineInfoAttr::get(func.getContext(), routines)); } +void createOpenACCRoutineConstruct( + Fortran::lower::AbstractConverter &converter, mlir::Location loc, + mlir::ModuleOp mod, mlir::func::FuncOp funcOp, std::string funcName, + bool hasNohost, llvm::SmallVector &bindNames, + llvm::SmallVector &bindNameDeviceTypes, + llvm::SmallVector &gangDeviceTypes, + llvm::SmallVector &gangDimValues, + llvm::SmallVector &gangDimDeviceTypes, + llvm::SmallVector &seqDeviceTypes, + llvm::SmallVector &workerDeviceTypes, + llvm::SmallVector &vectorDeviceTypes) { + + std::stringstream routineOpName; + routineOpName << accRoutinePrefix.str() << routineCounter++; + + for (auto routineOp : 
mod.getOps()) { + if (routineOp.getFuncName().str().compare(funcName) == 0) { + // If the routine is already specified with the same clauses, just skip + // the operation creation. + if (compareDeviceTypeInfo(routineOp, bindNames, bindNameDeviceTypes, + gangDeviceTypes, gangDimValues, + gangDimDeviceTypes, seqDeviceTypes, + workerDeviceTypes, vectorDeviceTypes) && + routineOp.getNohost() == hasNohost) + return; + mlir::emitError(loc, "Routine already specified with different clauses"); + } + } + std::string routineOpStr = routineOpName.str(); + mlir::OpBuilder modBuilder(mod.getBodyRegion()); + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + modBuilder.create( + loc, routineOpStr, funcName, + bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames), + bindNameDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(bindNameDeviceTypes), + workerDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(workerDeviceTypes), + vectorDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(vectorDeviceTypes), + seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes), + hasNohost, /*implicit=*/false, + gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes), + gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues), + gangDimDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(gangDimDeviceTypes)); + + if (funcOp) + attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr)); + else + // FuncOp is not lowered yet. Keep the information so the routine info + // can be attached later to the funcOp. 
+ converter.getAccDelayedRoutines().push_back( + std::make_pair(funcName, builder.getSymbolRefAttr(routineOpStr))); +} + +static void interpretRoutineDeviceInfo( + fir::FirOpBuilder &builder, + const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo, + llvm::SmallVector &seqDeviceTypes, + llvm::SmallVector &vectorDeviceTypes, + llvm::SmallVector &workerDeviceTypes, + llvm::SmallVector &bindNameDeviceTypes, + llvm::SmallVector &bindNames, + llvm::SmallVector &gangDeviceTypes, + llvm::SmallVector &gangDimValues, + llvm::SmallVector &gangDimDeviceTypes) { + mlir::MLIRContext *context{builder.getContext()}; + if (dinfo.isSeq()) { + seqDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } + if (dinfo.isVector()) { + vectorDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } + if (dinfo.isWorker()) { + workerDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } + if (dinfo.isGang()) { + unsigned gangDim = dinfo.gangDim(); + auto deviceType = + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())); + if (!gangDim) { + gangDeviceTypes.push_back(deviceType); + } else { + gangDimValues.push_back( + builder.getIntegerAttr(builder.getI64Type(), gangDim)); + gangDimDeviceTypes.push_back(deviceType); + } + } + if (const std::string *bindName{dinfo.bindName()}) { + bindNames.push_back(builder.getStringAttr(*bindName)); + bindNameDeviceTypes.push_back( + mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()))); + } +} + +void Fortran::lower::genOpenACCRoutineConstruct( + Fortran::lower::AbstractConverter &converter, mlir::ModuleOp mod, + mlir::func::FuncOp funcOp, + const std::vector &routineInfos) { + CHECK(funcOp && "Expected a valid function operation"); + fir::FirOpBuilder &builder{converter.getFirOpBuilder()}; + mlir::Location loc{funcOp.getLoc()}; + std::string funcName{funcOp.getName()}; + + // 
Collect the routine clauses + bool hasNohost{false}; + + llvm::SmallVector seqDeviceTypes, vectorDeviceTypes, + workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimDeviceTypes, gangDimValues; + + for (const Fortran::semantics::OpenACCRoutineInfo &info : routineInfos) { + // Device Independent Attributes + if (info.isNohost()) { + hasNohost = true; + } + // Note: Device Independent Attributes are set to the + // none device type in `info`. + interpretRoutineDeviceInfo(builder, info, seqDeviceTypes, vectorDeviceTypes, + workerDeviceTypes, bindNameDeviceTypes, + bindNames, gangDeviceTypes, gangDimValues, + gangDimDeviceTypes); + + // Device Dependent Attributes + for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo : + info.deviceTypeInfos()) { + interpretRoutineDeviceInfo( + builder, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, + bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues, + gangDimDeviceTypes); + } + } + createOpenACCRoutineConstruct( + converter, loc, mod, funcOp, funcName, hasNohost, bindNames, + bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes, + seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); +} + void Fortran::lower::genOpenACCRoutineConstruct( Fortran::lower::AbstractConverter &converter, Fortran::semantics::SemanticsContext &semanticsContext, mlir::ModuleOp mod, - const Fortran::parser::OpenACCRoutineConstruct &routineConstruct, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { + const Fortran::parser::OpenACCRoutineConstruct &routineConstruct) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); mlir::Location loc = converter.genLocation(routineConstruct.source); std::optional name = @@ -4174,6 +4316,7 @@ void Fortran::lower::genOpenACCRoutineConstruct( funcName = funcOp.getName(); } } + // TODO: Refactor this to use the OpenACCRoutineInfo bool hasNohost = false; llvm::SmallVector seqDeviceTypes, vectorDeviceTypes, @@ -4226,6 +4369,8 
@@ void Fortran::lower::genOpenACCRoutineConstruct( std::get_if(&clause.u)) { if (const auto *name = std::get_if(&bindClause->v.u)) { + // FIXME: This case mangles the name, the one below does not. + // which is correct? mlir::Attribute bindNameAttr = builder.getStringAttr(converter.mangleName(*name->symbol)); for (auto crtDeviceTypeAttr : crtDeviceTypes) { @@ -4255,47 +4400,10 @@ void Fortran::lower::genOpenACCRoutineConstruct( } } - mlir::OpBuilder modBuilder(mod.getBodyRegion()); - std::stringstream routineOpName; - routineOpName << accRoutinePrefix.str() << routineCounter++; - - for (auto routineOp : mod.getOps()) { - if (routineOp.getFuncName().str().compare(funcName) == 0) { - // If the routine is already specified with the same clauses, just skip - // the operation creation. - if (compareDeviceTypeInfo(routineOp, bindNames, bindNameDeviceTypes, - gangDeviceTypes, gangDimValues, - gangDimDeviceTypes, seqDeviceTypes, - workerDeviceTypes, vectorDeviceTypes) && - routineOp.getNohost() == hasNohost) - return; - mlir::emitError(loc, "Routine already specified with different clauses"); - } - } - - modBuilder.create( - loc, routineOpName.str(), funcName, - bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames), - bindNameDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(bindNameDeviceTypes), - workerDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(workerDeviceTypes), - vectorDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(vectorDeviceTypes), - seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes), - hasNohost, /*implicit=*/false, - gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes), - gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues), - gangDimDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(gangDimDeviceTypes)); - - if (funcOp) - attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpName.str())); - else - // FuncOp is not lowered yet. 
Keep the information so the routine info - // can be attached later to the funcOp. - accRoutineInfos.push_back(std::make_pair( - funcName, builder.getSymbolRefAttr(routineOpName.str()))); + createOpenACCRoutineConstruct( + converter, loc, mod, funcOp, funcName, hasNohost, bindNames, + bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes, + seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); } void Fortran::lower::finalizeOpenACCRoutineAttachment( @@ -4443,8 +4551,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct( fir::FirOpBuilder &builder = converter.getFirOpBuilder(); mlir::ModuleOp mod = builder.getModule(); Fortran::lower::genOpenACCRoutineConstruct( - converter, semanticsContext, mod, routineConstruct, - accRoutineInfos); + converter, semanticsContext, mod, routineConstruct); }, }, accDeclConstruct.u); diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index ee356e56e4458..befd204a671fc 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1387,6 +1387,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, parser::Options options; options.isModuleFile = true; options.features.Enable(common::LanguageFeature::BackslashEscapes); + options.features.Enable(common::LanguageFeature::OpenACC); options.features.Enable(common::LanguageFeature::OpenMP); options.features.Enable(common::LanguageFeature::CUDA); if (!isIntrinsic.value_or(false) && !notAModule) { diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index d75b4ea13d35f..93c334a3ca3cb 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1034,61 +1034,53 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Symbol &symbol, const parser::OpenACCRoutineConstruct &x) { if (symbol.has()) { Fortran::semantics::OpenACCRoutineInfo info; + std::vector currentDevices; + currentDevices.push_back(&info); 
 const auto &clauses = std::get(x.t);
 for (const Fortran::parser::AccClause &clause : clauses.v) {
-  if (std::get_if(&clause.u)) {
-    if (info.deviceTypeInfos().empty()) {
-      info.set_isSeq();
-    } else {
-      info.deviceTypeInfos().back().set_isSeq();
+  if (const auto *dTypeClause =
+          std::get_if(&clause.u)) {
+    currentDevices.clear();
+    for (const auto &deviceTypeExpr : dTypeClause->v.v) {
+      currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v));
     }
+  } else if (std::get_if(&clause.u)) {
+    info.set_isNohost();
+  } else if (std::get_if(&clause.u)) {
+    for (auto &device : currentDevices)
+      device->set_isSeq();
+  } else if (std::get_if(&clause.u)) {
+    for (auto &device : currentDevices)
+      device->set_isVector();
+  } else if (std::get_if(&clause.u)) {
+    for (auto &device : currentDevices)
+      device->set_isWorker();
   } else if (const auto *gangClause = std::get_if(&clause.u)) {
-    if (info.deviceTypeInfos().empty()) {
-      info.set_isGang();
-    } else {
-      info.deviceTypeInfos().back().set_isGang();
-    }
+    for (auto &device : currentDevices)
+      device->set_isGang();
     if (gangClause->v) {
       const Fortran::parser::AccGangArgList &x = *gangClause->v;
+      int numArgs{0};
       for (const Fortran::parser::AccGangArg &gangArg : x.v) {
+        CHECK(numArgs <= 1 && "expecting 0 or 1 gang dim args");
         if (const auto *dim = std::get_if(&gangArg.u)) {
           if (const auto v{EvaluateInt64(context_, dim->v)}) {
-            if (info.deviceTypeInfos().empty()) {
-              info.set_gangDim(*v);
-            } else {
-              info.deviceTypeInfos().back().set_gangDim(*v);
-            }
+            for (auto &device : currentDevices)
+              device->set_gangDim(*v);
           }
         }
+        numArgs++;
       }
     }
-  } else if (std::get_if(&clause.u)) {
-    if (info.deviceTypeInfos().empty()) {
-      info.set_isVector();
-    } else {
-      info.deviceTypeInfos().back().set_isVector();
-    }
-  } else if (std::get_if(&clause.u)) {
-    if (info.deviceTypeInfos().empty()) {
-      info.set_isWorker();
-    } else {
-      info.deviceTypeInfos().back().set_isWorker();
-    }
-  } else if (std::get_if(&clause.u)) {
-    info.set_isNohost();
   } else if (const auto *bindClause =
                  std::get_if(&clause.u)) {
+    std::string bindName = "";
     if (const auto *name =
             std::get_if(&bindClause->v.u)) {
       if (Symbol *sym = ResolveFctName(*name)) {
-        if (info.deviceTypeInfos().empty()) {
-          info.set_bindName(sym->name().ToString());
-        } else {
-          info.deviceTypeInfos().back().set_bindName(
-              sym->name().ToString());
-        }
+        bindName = sym->name().ToString();
       } else {
         context_.Say((*name).source,
             "No function or subroutine declared for '%s'"_err_en_US,
@@ -1101,21 +1093,16 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
           Fortran::parser::Unwrap(
               *charExpr);
       std::string str{std::get(charConst->t)};
-      std::stringstream bindName;
-      bindName << "\"" << str << "\"";
-      if (info.deviceTypeInfos().empty()) {
-        info.set_bindName(bindName.str());
-      } else {
-        info.deviceTypeInfos().back().set_bindName(bindName.str());
+      std::stringstream bindNameStream;
+      bindNameStream << "\"" << str << "\"";
+      bindName = bindNameStream.str();
+    }
+    if (!bindName.empty()) {
+      // Fixme: do we need to ensure there there is only one device?
+      for (auto &device : currentDevices) {
+        device->set_bindName(std::string(bindName));
       }
     }
-  } else if (const auto *dType =
-                 std::get_if(
-                     &clause.u)) {
-    const parser::AccDeviceTypeExprList &deviceTypeExprList = dType->v;
-    OpenACCRoutineDeviceTypeInfo dtypeInfo;
-    dtypeInfo.set_dType(deviceTypeExprList.v.front().v);
-    info.add_deviceTypeInfo(dtypeInfo);
   }
 }
 symbol.get().add_openACCRoutineInfo(info);

>From 1b6da293788edc56eea566f5c15126de6955169c Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Tue, 22 Apr 2025 16:06:33 -0700
Subject: [PATCH 02/11] fix includes

---
 flang/lib/Lower/OpenACC.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 37b660408af6c..a3ebd9b931dc6 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -32,13 +32,13 @@
 #include "flang/Semantics/expression.h"
 #include "flang/Semantics/scope.h"
 #include "flang/Semantics/tools.h"
-#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h"
-#include "mlir/Support/LLVM.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Frontend/OpenACC/ACC.h.inc"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
-#include
+#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h"
+#include "mlir/IR/MLIRContext.h"
+#include "mlir/Support/LLVM.h"

 #define DEBUG_TYPE "flang-lower-openacc"

>From 7b65ac4c477e5e46bf369a3a9f94f69cf496ef6b Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Wed, 23 Apr 2025 13:50:19 -0700
Subject: [PATCH 03/11] adding test

---
 flang/include/flang/Semantics/symbol.h | 7 +++++-
 flang/lib/Semantics/resolve-directives.cpp | 6 ++++-
 .../Lower/OpenACC/acc-module-definition.f90 | 17 ++++++++++++++
 .../Lower/OpenACC/acc-routine-use-module.f90 | 23 +++++++++++++++++++
 4 files changed, 51 insertions(+), 2 deletions(-)
 create mode 100644 flang/test/Lower/OpenACC/acc-module-definition.f90
 create mode 100644 flang/test/Lower/OpenACC/acc-routine-use-module.f90

diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h
index 1b6b247c9f5bc..fe6c73997733a 100644
--- a/flang/include/flang/Semantics/symbol.h
+++ b/flang/include/flang/Semantics/symbol.h
@@ -142,7 +142,11 @@ class OpenACCRoutineDeviceTypeInfo {
   const std::string *bindName() const {
     return bindName_ ? &*bindName_ : nullptr;
   }
-  void set_bindName(std::string &&name) { bindName_ = std::move(name); }
+  bool bindNameIsInternal() const {return bindNameIsInternal_;}
+  void set_bindName(std::string &&name, bool isInternal=false) {
+    bindName_ = std::move(name);
+    bindNameIsInternal_ = isInternal;
+  }

   Fortran::common::OpenACCDeviceType dType() const { return deviceType_; }

@@ -153,6 +157,7 @@ class OpenACCRoutineDeviceTypeInfo {
   bool isGang_{false};
   unsigned gangDim_{0};
   std::optional bindName_;
+  bool bindNameIsInternal_{false};
   Fortran::common::OpenACCDeviceType deviceType_{
       Fortran::common::OpenACCDeviceType::None};
 };
diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp
index 93c334a3ca3cb..8fb3559c34426 100644
--- a/flang/lib/Semantics/resolve-directives.cpp
+++ b/flang/lib/Semantics/resolve-directives.cpp
@@ -1077,10 +1077,12 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
       } else if (const auto *bindClause =
                      std::get_if(&clause.u)) {
         std::string bindName = "";
+        bool isInternal = false;
         if (const auto *name =
                 std::get_if(&bindClause->v.u)) {
           if (Symbol *sym = ResolveFctName(*name)) {
             bindName = sym->name().ToString();
+            isInternal = true;
           } else {
             context_.Say((*name).source,
                 "No function or subroutine declared for '%s'"_err_en_US,
@@ -1100,12 +1102,14 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
         if (!bindName.empty()) {
           // Fixme: do we need to ensure there there is only one device?
           for (auto &device : currentDevices) {
-            device->set_bindName(std::string(bindName));
+            device->set_bindName(std::string(bindName), isInternal);
           }
         }
       }
     }
     symbol.get().add_openACCRoutineInfo(info);
+  } else {
+    llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() << "\n";
   }
 }

diff --git a/flang/test/Lower/OpenACC/acc-module-definition.f90 b/flang/test/Lower/OpenACC/acc-module-definition.f90
new file mode 100644
index 0000000000000..36e41fc631c77
--- /dev/null
+++ b/flang/test/Lower/OpenACC/acc-module-definition.f90
@@ -0,0 +1,17 @@
+! RUN: rm -fr %t && mkdir -p %t && cd %t
+! RUN: bbc -fopenacc -emit-fir %s
+! RUN: cat mod1.mod | FileCheck %s
+
+!CHECK-LABEL: module mod1
+module mod1
+contains
+!CHECK subroutine callee(aa)
+subroutine callee(aa)
+!CHECK: !$acc routine seq
+!$acc routine seq
+integer :: aa
+aa = 1
+end subroutine
+!CHECK: end
+!CHECK: end
+end module
\ No newline at end of file
diff --git a/flang/test/Lower/OpenACC/acc-routine-use-module.f90 b/flang/test/Lower/OpenACC/acc-routine-use-module.f90
new file mode 100644
index 0000000000000..7fc96b0ef5684
--- /dev/null
+++ b/flang/test/Lower/OpenACC/acc-routine-use-module.f90
@@ -0,0 +1,23 @@
+! RUN: rm -fr %t && mkdir -p %t && cd %t
+! RUN: bbc -emit-fir %S/acc-module-definition.f90
+! RUN: bbc -emit-fir %s -o - | FileCheck %s
+
+! This test module is based off of flang/test/Lower/use_module.f90
+! The first runs ensures the module file is generated.
+
+module use_mod1
+use mod1
+contains
+!CHECK: func.func @_QMuse_mod1Pcaller
+!CHECK-SAME {
+subroutine caller(aa)
+integer :: aa
+!$acc serial
+!CHECK: fir.call @_QMmod1Pcallee
+call callee(aa)
+!$acc end serial
+end subroutine
+!CHECK: }
+!CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq
+!CHECK: func.func private @_QMmod1Pcallee(!fir.ref) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>}
+end module
\ No newline at end of file

>From 70f8d469346d22597c7b3ff38b2f4a84a82b6d85 Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 25 Apr 2025 14:39:04 -0700
Subject: [PATCH 04/11] debugging failure

---
 flang/include/flang/Lower/OpenACC.h | 7 ++
 flang/include/flang/Semantics/symbol.h | 21 +++--
 flang/lib/Lower/Bridge.cpp | 21 +++--
 flang/lib/Lower/CallInterface.cpp | 21 ++---
 flang/lib/Lower/OpenACC.cpp | 87 +++++++++++--------
 flang/lib/Semantics/mod-file.cpp | 11 ++-
 flang/lib/Semantics/resolve-directives.cpp | 16 ++--
 flang/lib/Semantics/symbol.cpp | 46 ++++++++++
 .../test/Lower/OpenACC/acc-routine-named.f90 | 10 ++-
 .../Lower/OpenACC/acc-routine-use-module.f90 | 6 +-
 flang/test/Lower/OpenACC/acc-routine.f90 | 63 ++++++++------
 11 files changed, 199 insertions(+), 110 deletions(-)

diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h
index 7832e8b69ea23..dc014a71526c3 100644
--- a/flang/include/flang/Lower/OpenACC.h
+++ b/flang/include/flang/Lower/OpenACC.h
@@ -37,11 +37,16 @@ class FirOpBuilder;
 }

 namespace Fortran {
+namespace evaluate {
+class ProcedureDesignator;
+} // namespace evaluate
+
 namespace parser {
 struct AccClauseList;
 struct OpenACCConstruct;
 struct OpenACCDeclarativeConstruct;
 struct OpenACCRoutineConstruct;
+struct ProcedureDesignator;
 } // namespace parser

 namespace semantics {
@@ -71,6 +76,8 @@ static constexpr llvm::StringRef declarePostDeallocSuffix =

 static constexpr llvm::StringRef privatizationRecipePrefix = "privatization";

+bool needsOpenACCRoutineConstruct(const Fortran::evaluate::ProcedureDesignator *);
+
 mlir::Value genOpenACCConstruct(AbstractConverter &,
                                 Fortran::semantics::SemanticsContext &,
                                 pft::Evaluation &,
diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h
index fe6c73997733a..8c60a196bdfc1 100644
--- a/flang/include/flang/Semantics/symbol.h
+++ b/flang/include/flang/Semantics/symbol.h
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 #include

 namespace llvm {
@@ -139,25 +140,26 @@ class OpenACCRoutineDeviceTypeInfo {
   void set_isGang(bool value = true) { isGang_ = value; }
   unsigned gangDim() const { return gangDim_; }
   void set_gangDim(unsigned value) { gangDim_ = value; }
-  const std::string *bindName() const {
-    return bindName_ ? &*bindName_ : nullptr;
+  const std::variant *bindName() const {
+    return bindName_.has_value() ? &*bindName_ : nullptr;
   }
-  bool bindNameIsInternal() const {return bindNameIsInternal_;}
-  void set_bindName(std::string &&name, bool isInternal=false) {
-    bindName_ = std::move(name);
-    bindNameIsInternal_ = isInternal;
+  const std::optional> &bindNameOpt() const {
+    return bindName_;
   }
+  void set_bindName(std::string &&name) { bindName_.emplace(std::move(name)); }
+  void set_bindName(SymbolRef symbol) { bindName_.emplace(symbol); }

   Fortran::common::OpenACCDeviceType dType() const { return deviceType_; }

+  friend llvm::raw_ostream &operator<<(
+      llvm::raw_ostream &, const OpenACCRoutineDeviceTypeInfo &);
 private:
   bool isSeq_{false};
   bool isVector_{false};
   bool isWorker_{false};
   bool isGang_{false};
   unsigned gangDim_{0};
-  std::optional bindName_;
-  bool bindNameIsInternal_{false};
+  std::optional> bindName_;
   Fortran::common::OpenACCDeviceType deviceType_{
       Fortran::common::OpenACCDeviceType::None};
 };
@@ -187,6 +189,9 @@ class OpenACCRoutineInfo : public OpenACCRoutineDeviceTypeInfo {
     return deviceTypeInfos_.back();
   }

+  friend llvm::raw_ostream &operator<<(
+      llvm::raw_ostream &, const OpenACCRoutineInfo &);
+
 private:
   std::list deviceTypeInfos_;
   bool isNohost_{false};
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 9285d587585f8..abe07bcfdfcda 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -438,14 +438,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
               [&](Fortran::lower::pft::ModuleLikeUnit &m) { lowerMod(m); },
               [&](Fortran::lower::pft::BlockDataUnit &b) {},
               [&](Fortran::lower::pft::CompilerDirectiveUnit &d) {},
-              [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {
-                builder = new fir::FirOpBuilder(
-                    bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable);
-                Fortran::lower::genOpenACCRoutineConstruct(
-                    *this, bridge.getSemanticsContext(), bridge.getModule(),
-                    d.routine);
-                builder = nullptr;
-              },
+              [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {},
           },
           u);
     }
@@ -472,6 +465,8 @@ class FirConverter : public Fortran::lower::AbstractConverter {
   /// Declare a function.
   void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) {
     setCurrentPosition(funit.getStartingSourceLoc());
+    builder = new fir::FirOpBuilder(
+        bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable);
     for (int entryIndex = 0, last = funit.entryPointList.size();
          entryIndex < last; ++entryIndex) {
       funit.setActiveEntry(entryIndex);
@@ -498,6 +493,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList)
       if (auto *f = std::get_if(&unit))
         declareFunction(*f);
+    builder = nullptr;
   }

   /// Get the scope that is defining or using \p sym. The returned scope is not
@@ -1035,7 +1031,10 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     return bridge.getSemanticsContext().FindScope(currentPosition);
   }

-  fir::FirOpBuilder &getFirOpBuilder() override final { return *builder; }
+  fir::FirOpBuilder &getFirOpBuilder() override final {
+    CHECK(builder && "builder is not set before calling getFirOpBuilder");
+    return *builder;
+  }

   mlir::ModuleOp getModuleOp() override final { return bridge.getModule(); }

@@ -5617,6 +5616,10 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]";
                if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym;
                llvm::dbgs() << "\n");
+    // I don't think setting the builder is necessary here, because callee
+    // always looks up the FuncOp from the module. If there was a function that
+    // was not declared yet. This call to callee will cause an assertion
+    //failure.
     Fortran::lower::CalleeInterface callee(funit, *this);
     mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments();
     builder =
diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp
index 867248f16237e..b938354e6bcb3 100644
--- a/flang/lib/Lower/CallInterface.cpp
+++ b/flang/lib/Lower/CallInterface.cpp
@@ -10,6 +10,7 @@
 #include "flang/Evaluate/fold.h"
 #include "flang/Lower/Bridge.h"
 #include "flang/Lower/Mangler.h"
+#include "flang/Lower/OpenACC.h"
 #include "flang/Lower/PFTBuilder.h"
 #include "flang/Lower/StatementContext.h"
 #include "flang/Lower/Support/Utils.h"
@@ -20,6 +21,7 @@
 #include "flang/Optimizer/Dialect/FIROpsSupport.h"
 #include "flang/Optimizer/Support/InternalNames.h"
 #include "flang/Optimizer/Support/Utils.h"
+#include "flang/Parser/parse-tree.h"
 #include "flang/Semantics/symbol.h"
 #include "flang/Semantics/tools.h"
 #include "flang/Support/Fortran.h"
@@ -715,6 +717,14 @@ void Fortran::lower::CallInterface::declare() {
         func.setArgAttrs(placeHolder.index(), placeHolder.value().attributes);
       setCUDAAttributes(func, side().getProcedureSymbol(), characteristic);
+
+      if (const Fortran::semantics::Symbol *sym = side().getProcedureSymbol()) {
+        if (const auto &info{sym->GetUltimate().detailsIf()}) {
+          if (!info->openACCRoutineInfos().empty()) {
+            genOpenACCRoutineConstruct(converter, module, func, info->openACCRoutineInfos());
+          }
+        }
+      }
     }
   }
 }
@@ -1688,17 +1698,8 @@ class SignatureBuilder
       fir::emitFatalError(converter.getCurrentLocation(),
                           "SignatureBuilder should only be used once");
     declare();
+
     interfaceDetermined = true;
-    if (procDesignator && procDesignator->GetInterfaceSymbol() &&
-        procDesignator->GetInterfaceSymbol()
-            ->has()) {
-      auto info = procDesignator->GetInterfaceSymbol()
-                      ->get();
-      if (!info.openACCRoutineInfos().empty()) {
-        genOpenACCRoutineConstruct(converter, converter.getModuleOp(),
-                                   getFuncOp(), info.openACCRoutineInfos());
-      }
-    }
     return getFuncOp();
   }

diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index a3ebd9b931dc6..eefa8fbf12b1a 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -36,6 +36,7 @@
 #include "llvm/Frontend/OpenACC/ACC.h.inc"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h"
 #include "mlir/IR/MLIRContext.h"
 #include "mlir/Support/LLVM.h"
@@ -4140,6 +4141,14 @@ static void attachRoutineInfo(mlir::func::FuncOp func,
                 mlir::acc::RoutineInfoAttr::get(func.getContext(), routines));
 }

+static mlir::ArrayAttr getArrayAttrOrNull(fir::FirOpBuilder &builder, llvm::SmallVector &attributes) {
+  if (attributes.empty()) {
+    return nullptr;
+  } else {
+    return builder.getArrayAttr(attributes);
+  }
+}
+
 void createOpenACCRoutineConstruct(
     Fortran::lower::AbstractConverter &converter, mlir::Location loc,
     mlir::ModuleOp mod, mlir::func::FuncOp funcOp, std::string funcName,
@@ -4173,31 +4182,29 @@ void createOpenACCRoutineConstruct(
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   modBuilder.create(
       loc, routineOpStr, funcName,
-      bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames),
-      bindNameDeviceTypes.empty() ? nullptr
-                                  : builder.getArrayAttr(bindNameDeviceTypes),
-      workerDeviceTypes.empty() ? nullptr
-                                : builder.getArrayAttr(workerDeviceTypes),
-      vectorDeviceTypes.empty() ? nullptr
-                                : builder.getArrayAttr(vectorDeviceTypes),
-      seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes),
+      getArrayAttrOrNull(builder, bindNames),
+      getArrayAttrOrNull(builder, bindNameDeviceTypes),
+      getArrayAttrOrNull(builder, workerDeviceTypes),
+      getArrayAttrOrNull(builder, vectorDeviceTypes),
+      getArrayAttrOrNull(builder, seqDeviceTypes),
       hasNohost, /*implicit=*/false,
-      gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes),
-      gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues),
-      gangDimDeviceTypes.empty() ? nullptr
-                                 : builder.getArrayAttr(gangDimDeviceTypes));
-
-  if (funcOp)
-    attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr));
-  else
+      getArrayAttrOrNull(builder, gangDeviceTypes),
+      getArrayAttrOrNull(builder, gangDimValues),
+      getArrayAttrOrNull(builder, gangDimDeviceTypes));
+
+  auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr);
+  if (funcOp) {
+
+    attachRoutineInfo(funcOp, symbolRefAttr);
+  } else {
     // FuncOp is not lowered yet. Keep the information so the routine info
     // can be attached later to the funcOp.
-    converter.getAccDelayedRoutines().push_back(
-        std::make_pair(funcName, builder.getSymbolRefAttr(routineOpStr)));
+    converter.getAccDelayedRoutines().push_back(std::make_pair(funcName, symbolRefAttr));
+  }
 }

 static void interpretRoutineDeviceInfo(
-    fir::FirOpBuilder &builder,
+    Fortran::lower::AbstractConverter &converter,
     const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo,
     llvm::SmallVector &seqDeviceTypes,
     llvm::SmallVector &vectorDeviceTypes,
@@ -4207,23 +4214,24 @@ static void interpretRoutineDeviceInfo(
     llvm::SmallVector &gangDeviceTypes,
     llvm::SmallVector &gangDimValues,
     llvm::SmallVector &gangDimDeviceTypes) {
-  mlir::MLIRContext *context{builder.getContext()};
+  fir::FirOpBuilder &builder = converter.getFirOpBuilder();
+  auto getDeviceTypeAttr = [&]() -> mlir::Attribute {
+    auto context = builder.getContext();
+    auto value = getDeviceType(dinfo.dType());
+    return mlir::acc::DeviceTypeAttr::get(context, value );
+  };
   if (dinfo.isSeq()) {
-    seqDeviceTypes.push_back(
-        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+    seqDeviceTypes.push_back(getDeviceTypeAttr());
   }
   if (dinfo.isVector()) {
-    vectorDeviceTypes.push_back(
-        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+    vectorDeviceTypes.push_back(getDeviceTypeAttr());
   }
   if (dinfo.isWorker()) {
-    workerDeviceTypes.push_back(
-        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+    workerDeviceTypes.push_back(getDeviceTypeAttr());
   }
   if (dinfo.isGang()) {
     unsigned gangDim = dinfo.gangDim();
-    auto deviceType =
-        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()));
+    auto deviceType = getDeviceTypeAttr();
     if (!gangDim) {
       gangDeviceTypes.push_back(deviceType);
     } else {
@@ -4232,10 +4240,18 @@ static void interpretRoutineDeviceInfo(
       gangDimDeviceTypes.push_back(deviceType);
     }
   }
-  if (const std::string *bindName{dinfo.bindName()}) {
-    bindNames.push_back(builder.getStringAttr(*bindName));
-    bindNameDeviceTypes.push_back(
-        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+  if (dinfo.bindNameOpt().has_value()) {
+    const auto &bindName = dinfo.bindNameOpt().value();
+    mlir::Attribute bindNameAttr;
+    if (const auto &bindStr{std::get_if(&bindName)}) {
+      bindNameAttr = builder.getStringAttr(*bindStr);
+    } else if (const auto &bindSym{std::get_if(&bindName)}) {
+      bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym));
+    } else {
+      llvm_unreachable("Unsupported bind name type");
+    }
+    bindNames.push_back(bindNameAttr);
+    bindNameDeviceTypes.push_back(getDeviceTypeAttr());
   }
 }

@@ -4244,7 +4260,6 @@ void Fortran::lower::genOpenACCRoutineConstruct(
     mlir::func::FuncOp funcOp,
     const std::vector &routineInfos) {
   CHECK(funcOp && "Expected a valid function operation");
-  fir::FirOpBuilder &builder{converter.getFirOpBuilder()};
   mlir::Location loc{funcOp.getLoc()};
   std::string funcName{funcOp.getName()};

@@ -4262,7 +4277,7 @@ void Fortran::lower::genOpenACCRoutineConstruct(
     }
     // Note: Device Independent Attributes are set to the
     // none device type in `info`.
-    interpretRoutineDeviceInfo(builder, info, seqDeviceTypes, vectorDeviceTypes,
+    interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, vectorDeviceTypes,
                                workerDeviceTypes, bindNameDeviceTypes, bindNames,
                                gangDeviceTypes, gangDimValues, gangDimDeviceTypes);

@@ -4271,7 +4286,7 @@ void Fortran::lower::genOpenACCRoutineConstruct(
     for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo :
          info.deviceTypeInfos()) {
       interpretRoutineDeviceInfo(
-          builder, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes,
+          converter, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes,
           bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues,
           gangDimDeviceTypes);
     }
@@ -4369,8 +4384,6 @@ void Fortran::lower::genOpenACCRoutineConstruct(
                    std::get_if(&clause.u)) {
       if (const auto *name =
               std::get_if(&bindClause->v.u)) {
-        // FIXME: This case mangles the name, the one below does not.
-        // which is correct?
         mlir::Attribute bindNameAttr =
             builder.getStringAttr(converter.mangleName(*name->symbol));
         for (auto crtDeviceTypeAttr : crtDeviceTypes) {
diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp
index befd204a671fc..76dc8db590f22 100644
--- a/flang/lib/Semantics/mod-file.cpp
+++ b/flang/lib/Semantics/mod-file.cpp
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 #include

 namespace Fortran::semantics {
@@ -638,8 +639,14 @@ static void PutOpenACCDeviceTypeRoutineInfo(
   if (info.isWorker()) {
     os << " worker";
   }
-  if (info.bindName()) {
-    os << " bind(" << *info.bindName() << ")";
+  if (const std::variant *bindName{info.bindName()}) {
+    os << " bind(";
+    if (std::holds_alternative(*bindName)) {
+      os << "\"" << std::get(*bindName) << "\"";
+    } else {
+      os << std::get(*bindName)->name();
+    }
+    os << ")";
   }
 }

diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp
index 8fb3559c34426..a8f00b546306e 100644
--- a/flang/lib/Semantics/resolve-directives.cpp
+++ b/flang/lib/Semantics/resolve-directives.cpp
@@ -1076,13 +1076,13 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
         }
       } else if (const auto *bindClause =
                      std::get_if(&clause.u)) {
-        std::string bindName = "";
-        bool isInternal = false;
         if (const auto *name =
                 std::get_if(&bindClause->v.u)) {
           if (Symbol *sym = ResolveFctName(*name)) {
-            bindName = sym->name().ToString();
-            isInternal = true;
+            Symbol &ultimate{sym->GetUltimate()};
+            for (auto &device : currentDevices) {
+              device->set_bindName(SymbolRef(ultimate));
+            }
           } else {
             context_.Say((*name).source,
                 "No function or subroutine declared for '%s'"_err_en_US,
@@ -1095,14 +1095,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
               Fortran::parser::Unwrap(
                   *charExpr);
           std::string str{std::get(charConst->t)};
-          std::stringstream bindNameStream;
-          bindNameStream << "\"" << str << "\"";
-          bindName = bindNameStream.str();
-        }
-        if (!bindName.empty()) {
-          // Fixme: do we need to ensure there there is only one device?
           for (auto &device : currentDevices) {
-            device->set_bindName(std::string(bindName), isInternal);
+            device->set_bindName(std::string(str));
           }
         }
       }
     }
diff --git a/flang/lib/Semantics/symbol.cpp b/flang/lib/Semantics/symbol.cpp
index 32eb6c2c5a188..d44df4669fa36 100644
--- a/flang/lib/Semantics/symbol.cpp
+++ b/flang/lib/Semantics/symbol.cpp
@@ -144,6 +144,52 @@ llvm::raw_ostream &operator<<(
       os << ' ' << x;
     }
   }
+  if (!x.openACCRoutineInfos_.empty()) {
+    os << " openACCRoutineInfos:";
+    for (const auto x : x.openACCRoutineInfos_) {
+      os << x;
+    }
+  }
+  return os;
+}
+
+llvm::raw_ostream &operator<<(
+    llvm::raw_ostream &os, const OpenACCRoutineDeviceTypeInfo &x) {
+  if (x.dType() != common::OpenACCDeviceType::None) {
+    os << " deviceType(" << common::EnumToString(x.dType()) << ')';
+  }
+  if (x.isSeq()) {
+    os << " seq";
+  }
+  if (x.isVector()) {
+    os << " vector";
+  }
+  if (x.isWorker()) {
+    os << " worker";
+  }
+  if (x.isGang()) {
+    os << " gang(" << x.gangDim() << ')';
+  }
+  if (const auto *bindName{x.bindName()}) {
+    if (const auto &symbol{std::get_if(bindName)}) {
+      os << " bindName(\"" << *symbol << "\")";
+    } else {
+      const SymbolRef s{std::get(*bindName)};
+      os << " bindName(" << s->name() << ")";
+    }
+  }
+  return os;
+}
+
+llvm::raw_ostream &operator<<(
+    llvm::raw_ostream &os, const OpenACCRoutineInfo &x) {
+  if (x.isNohost()) {
+    os << " nohost";
+  }
+  os << static_cast(x);
+  for (const auto &d : x.deviceTypeInfos_) {
+    os << d;
+  }
   return os;
 }

diff --git a/flang/test/Lower/OpenACC/acc-routine-named.f90 b/flang/test/Lower/OpenACC/acc-routine-named.f90
index 2cf6bf8b2bc06..de9784a1146cc 100644
--- a/flang/test/Lower/OpenACC/acc-routine-named.f90
+++ b/flang/test/Lower/OpenACC/acc-routine-named.f90
@@ -4,8 +4,8 @@

 module acc_routines

-! CHECK: acc.routine @acc_routine_1 func(@_QMacc_routinesPacc2)
-! CHECK: acc.routine @acc_routine_0 func(@_QMacc_routinesPacc1) seq
+! CHECK: acc.routine @[[r0:.*]] func(@_QMacc_routinesPacc2)
+! CHECK: acc.routine @[[r1:.*]] func(@_QMacc_routinesPacc1) seq

 !$acc routine(acc1) seq

@@ -14,12 +14,14 @@ module acc_routines
 subroutine acc1()
 end subroutine

-! CHECK-LABEL: func.func @_QMacc_routinesPacc1() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>}
+! CHECK-LABEL: func.func @_QMacc_routinesPacc1()
+! CHECK-SAME:attributes {acc.routine_info = #acc.routine_info<[@[[r1]]]>}

 subroutine acc2()
 !$acc routine(acc2)
 end subroutine

-! CHECK-LABEL: func.func @_QMacc_routinesPacc2() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_1]>}
+! CHECK-LABEL: func.func @_QMacc_routinesPacc2()
+! CHECK-SAME:attributes {acc.routine_info = #acc.routine_info<[@[[r0]]]>}

 end module
diff --git a/flang/test/Lower/OpenACC/acc-routine-use-module.f90 b/flang/test/Lower/OpenACC/acc-routine-use-module.f90
index 7fc96b0ef5684..059324230a746 100644
--- a/flang/test/Lower/OpenACC/acc-routine-use-module.f90
+++ b/flang/test/Lower/OpenACC/acc-routine-use-module.f90
@@ -1,6 +1,6 @@
 ! RUN: rm -fr %t && mkdir -p %t && cd %t
-! RUN: bbc -emit-fir %S/acc-module-definition.f90
-! RUN: bbc -emit-fir %s -o - | FileCheck %s
+! RUN: bbc -fopenacc -emit-fir %S/acc-module-definition.f90
+! RUN: bbc -fopenacc -emit-fir %s -o - | FileCheck %s

 ! This test module is based off of flang/test/Lower/use_module.f90
 ! The first runs ensures the module file is generated.
@@ -8,6 +8,7 @@
 module use_mod1
 use mod1
 contains
+!CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq
 !CHECK: func.func @_QMuse_mod1Pcaller
 !CHECK-SAME {
 subroutine caller(aa)
@@ -18,6 +19,5 @@ subroutine caller(aa)
 !$acc end serial
 end subroutine
 !CHECK: }
-!CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq
 !CHECK: func.func private @_QMmod1Pcallee(!fir.ref) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>}
 end module
\ No newline at end of file
diff --git a/flang/test/Lower/OpenACC/acc-routine.f90 b/flang/test/Lower/OpenACC/acc-routine.f90
index 1170af18bc334..789f3a57e1f79 100644
--- a/flang/test/Lower/OpenACC/acc-routine.f90
+++ b/flang/test/Lower/OpenACC/acc-routine.f90
@@ -2,69 +2,77 @@

 ! RUN: bbc -fopenacc -emit-hlfir %s -o - | FileCheck %s

-! CHECK: acc.routine @acc_routine_17 func(@_QPacc_routine19) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type])
-! CHECK: acc.routine @acc_routine_16 func(@_QPacc_routine18) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type])
-! CHECK: acc.routine @acc_routine_15 func(@_QPacc_routine17) worker ([#acc.device_type]) vector ([#acc.device_type])
-! CHECK: acc.routine @acc_routine_14 func(@_QPacc_routine16) gang([#acc.device_type]) seq ([#acc.device_type])
-! CHECK: acc.routine @acc_routine_10 func(@_QPacc_routine11) seq
-! CHECK: acc.routine @acc_routine_9 func(@_QPacc_routine10) seq
-! CHECK: acc.routine @acc_routine_8 func(@_QPacc_routine9) bind("_QPacc_routine9a")
-! CHECK: acc.routine @acc_routine_7 func(@_QPacc_routine8) bind("routine8_")
-! CHECK: acc.routine @acc_routine_6 func(@_QPacc_routine7) gang(dim: 1 : i64)
-! CHECK: acc.routine @acc_routine_5 func(@_QPacc_routine6) nohost
-! CHECK: acc.routine @acc_routine_4 func(@_QPacc_routine5) worker
-! CHECK: acc.routine @acc_routine_3 func(@_QPacc_routine4) vector
-! CHECK: acc.routine @acc_routine_2 func(@_QPacc_routine3) gang
-! CHECK: acc.routine @acc_routine_1 func(@_QPacc_routine2) seq
-! CHECK: acc.routine @acc_routine_0 func(@_QPacc_routine1)
+! CHECK: acc.routine @[[r14:.*]] func(@_QPacc_routine19) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type])
+! CHECK: acc.routine @[[r13:.*]] func(@_QPacc_routine18) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type])
+! CHECK: acc.routine @[[r12:.*]] func(@_QPacc_routine17) worker ([#acc.device_type]) vector ([#acc.device_type])
+! CHECK: acc.routine @[[r11:.*]] func(@_QPacc_routine16) gang([#acc.device_type]) seq ([#acc.device_type])
+! CHECK: acc.routine @[[r10:.*]] func(@_QPacc_routine11) seq
+! CHECK: acc.routine @[[r09:.*]] func(@_QPacc_routine10) seq
+! CHECK: acc.routine @[[r08:.*]] func(@_QPacc_routine9) bind("_QPacc_routine9a")
+! CHECK: acc.routine @[[r07:.*]] func(@_QPacc_routine8) bind("routine8_")
+! CHECK: acc.routine @[[r06:.*]] func(@_QPacc_routine7) gang(dim: 1 : i64)
+! CHECK: acc.routine @[[r05:.*]] func(@_QPacc_routine6) nohost
+! CHECK: acc.routine @[[r04:.*]] func(@_QPacc_routine5) worker
+! CHECK: acc.routine @[[r03:.*]] func(@_QPacc_routine4) vector
+! CHECK: acc.routine @[[r02:.*]] func(@_QPacc_routine3) gang
+! CHECK: acc.routine @[[r01:.*]] func(@_QPacc_routine2) seq
+! CHECK: acc.routine @[[r00:.*]] func(@_QPacc_routine1)

 subroutine acc_routine1()
 !$acc routine
 end subroutine

-! CHECK-LABEL: func.func @_QPacc_routine1() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>}
+! CHECK-LABEL: func.func @_QPacc_routine1()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r00]]]>}

 subroutine acc_routine2()
 !$acc routine seq
 end subroutine

-! CHECK-LABEL: func.func @_QPacc_routine2() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_1]>}
+! CHECK-LABEL: func.func @_QPacc_routine2()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r01]]]>}

 subroutine acc_routine3()
 !$acc routine gang
 end subroutine

-! CHECK-LABEL: func.func @_QPacc_routine3() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_2]>}
+! CHECK-LABEL: func.func @_QPacc_routine3()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r02]]]>}

 subroutine acc_routine4()
 !$acc routine vector
 end subroutine

-! CHECK-LABEL: func.func @_QPacc_routine4() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_3]>}
+! CHECK-LABEL: func.func @_QPacc_routine4()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r03]]]>}

 subroutine acc_routine5()
 !$acc routine worker
 end subroutine

-! CHECK-LABEL: func.func @_QPacc_routine5() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_4]>}
+! CHECK-LABEL: func.func @_QPacc_routine5()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r04]]]>}

 subroutine acc_routine6()
 !$acc routine nohost
 end subroutine

-! CHECK-LABEL: func.func @_QPacc_routine6() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_5]>}
+! CHECK-LABEL: func.func @_QPacc_routine6()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r05]]]>}

 subroutine acc_routine7()
 !$acc routine gang(dim:1)
 end subroutine

-! CHECK-LABEL: func.func @_QPacc_routine7() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_6]>}
+! CHECK-LABEL: func.func @_QPacc_routine7()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r06]]]>}

 subroutine acc_routine8()
 !$acc routine bind("routine8_")
 end subroutine

-! CHECK-LABEL: func.func @_QPacc_routine8() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_7]>}
+! CHECK-LABEL: func.func @_QPacc_routine8()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r07]]]>}

 subroutine acc_routine9a()
 end subroutine
@@ -73,20 +81,23 @@ subroutine acc_routine9()
 !$acc routine bind(acc_routine9a)
 end subroutine

-! CHECK-LABEL: func.func @_QPacc_routine9() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_8]>}
+! CHECK-LABEL: func.func @_QPacc_routine9()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r08]]]>}

 function acc_routine10()
 !$acc routine(acc_routine10) seq
 end function

-! CHECK-LABEL: func.func @_QPacc_routine10() -> f32 attributes {acc.routine_info = #acc.routine_info<[@acc_routine_9]>}
+! CHECK-LABEL: func.func @_QPacc_routine10() -> f32
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r09]]]>}

 subroutine acc_routine11(a)
 real :: a
 !$acc routine(acc_routine11) seq
 end subroutine

-! CHECK-LABEL: func.func @_QPacc_routine11(%arg0: !fir.ref {fir.bindc_name = "a"}) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_10]>}
+! CHECK-LABEL: func.func @_QPacc_routine11(%arg0: !fir.ref {fir.bindc_name = "a"})
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r10]]]>}

 subroutine acc_routine12()

>From e2d1a05d2de2356644d385e9099a7e6879143cc7 Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 25 Apr 2025 14:40:14 -0700
Subject: [PATCH 05/11] clang-format

---
 flang/include/flang/Lower/OpenACC.h | 3 +-
 flang/include/flang/Semantics/symbol.h | 4 +-
 flang/lib/Lower/Bridge.cpp | 12 ++---
 flang/lib/Lower/CallInterface.cpp | 9 ++--
 flang/lib/Lower/OpenACC.cpp | 56 +++++++++++-----------
 flang/lib/Semantics/resolve-directives.cpp | 3 +-
 6 files changed, 48 insertions(+), 39 deletions(-)

diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h
index dc014a71526c3..35a33e751b52b 100644
--- a/flang/include/flang/Lower/OpenACC.h
+++ b/flang/include/flang/Lower/OpenACC.h
@@ -76,7 +76,8 @@ static constexpr llvm::StringRef declarePostDeallocSuffix =

 static constexpr llvm::StringRef privatizationRecipePrefix = "privatization";

-bool needsOpenACCRoutineConstruct(const Fortran::evaluate::ProcedureDesignator *);
+bool needsOpenACCRoutineConstruct(
+    const Fortran::evaluate::ProcedureDesignator *);

 mlir::Value genOpenACCConstruct(AbstractConverter &,
                                 Fortran::semantics::SemanticsContext &,
diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h
index 8c60a196bdfc1..eb34aac9c390d 100644
--- a/flang/include/flang/Semantics/symbol.h
+++ b/flang/include/flang/Semantics/symbol.h
@@ -143,7 +143,8 @@ class OpenACCRoutineDeviceTypeInfo {
   const std::variant *bindName() const {
     return bindName_.has_value() ?
&*bindName_ : nullptr; } - const std::optional> &bindNameOpt() const { + const std::optional> & + bindNameOpt() const { return bindName_; } void set_bindName(std::string &&name) { bindName_.emplace(std::move(name)); } @@ -153,6 +154,7 @@ class OpenACCRoutineDeviceTypeInfo { friend llvm::raw_ostream &operator<<( llvm::raw_ostream &, const OpenACCRoutineDeviceTypeInfo &); + private: bool isSeq_{false}; bool isVector_{false}; diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index abe07bcfdfcda..5e7b783323bfd 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -465,8 +465,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { setCurrentPosition(funit.getStartingSourceLoc()); - builder = new fir::FirOpBuilder( - bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); + builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), + &mlirSymbolTable); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { funit.setActiveEntry(entryIndex); @@ -1031,9 +1031,9 @@ class FirConverter : public Fortran::lower::AbstractConverter { return bridge.getSemanticsContext().FindScope(currentPosition); } - fir::FirOpBuilder &getFirOpBuilder() override final { + fir::FirOpBuilder &getFirOpBuilder() override final { CHECK(builder && "builder is not set before calling getFirOpBuilder"); - return *builder; + return *builder; } mlir::ModuleOp getModuleOp() override final { return bridge.getModule(); } @@ -5616,10 +5616,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]"; if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym; llvm::dbgs() << "\n"); - // I don't think setting the builder is necessary here, because callee + // I don't think setting the builder is necessary here, because callee // always looks 
up the FuncOp from the module. If there was a function that // was not declared yet. This call to callee will cause an assertion - //failure. + // failure. Fortran::lower::CalleeInterface callee(funit, *this); mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments(); builder = diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index b938354e6bcb3..611eacfe178e5 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -717,11 +717,14 @@ void Fortran::lower::CallInterface::declare() { func.setArgAttrs(placeHolder.index(), placeHolder.value().attributes); setCUDAAttributes(func, side().getProcedureSymbol(), characteristic); - + if (const Fortran::semantics::Symbol *sym = side().getProcedureSymbol()) { - if (const auto &info{sym->GetUltimate().detailsIf()}) { + if (const auto &info{ + sym->GetUltimate() + .detailsIf()}) { if (!info->openACCRoutineInfos().empty()) { - genOpenACCRoutineConstruct(converter, module, func, info->openACCRoutineInfos()); + genOpenACCRoutineConstruct(converter, module, func, + info->openACCRoutineInfos()); } } } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index eefa8fbf12b1a..891dc998bc596 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -32,14 +32,14 @@ #include "flang/Semantics/expression.h" #include "flang/Semantics/scope.h" #include "flang/Semantics/tools.h" +#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" +#include "mlir/IR/MLIRContext.h" +#include "mlir/Support/LLVM.h" #include "llvm/ADT/STLExtras.h" #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" #include "llvm/Support/ErrorHandling.h" -#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" -#include "mlir/IR/MLIRContext.h" -#include "mlir/Support/LLVM.h" #define DEBUG_TYPE "flang-lower-openacc" @@ -4141,7 +4141,9 @@ static void attachRoutineInfo(mlir::func::FuncOp func, 
mlir::acc::RoutineInfoAttr::get(func.getContext(), routines)); } -static mlir::ArrayAttr getArrayAttrOrNull(fir::FirOpBuilder &builder, llvm::SmallVector &attributes) { +static mlir::ArrayAttr +getArrayAttrOrNull(fir::FirOpBuilder &builder, + llvm::SmallVector &attributes) { if (attributes.empty()) { return nullptr; } else { @@ -4181,25 +4183,24 @@ void createOpenACCRoutineConstruct( mlir::OpBuilder modBuilder(mod.getBodyRegion()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); modBuilder.create( - loc, routineOpStr, funcName, - getArrayAttrOrNull(builder, bindNames), + loc, routineOpStr, funcName, getArrayAttrOrNull(builder, bindNames), getArrayAttrOrNull(builder, bindNameDeviceTypes), getArrayAttrOrNull(builder, workerDeviceTypes), getArrayAttrOrNull(builder, vectorDeviceTypes), - getArrayAttrOrNull(builder, seqDeviceTypes), - hasNohost, /*implicit=*/false, - getArrayAttrOrNull(builder, gangDeviceTypes), + getArrayAttrOrNull(builder, seqDeviceTypes), hasNohost, + /*implicit=*/false, getArrayAttrOrNull(builder, gangDeviceTypes), getArrayAttrOrNull(builder, gangDimValues), getArrayAttrOrNull(builder, gangDimDeviceTypes)); auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr); if (funcOp) { - + attachRoutineInfo(funcOp, symbolRefAttr); } else { // FuncOp is not lowered yet. Keep the information so the routine info // can be attached later to the funcOp. 
- converter.getAccDelayedRoutines().push_back(std::make_pair(funcName, symbolRefAttr)); + converter.getAccDelayedRoutines().push_back( + std::make_pair(funcName, symbolRefAttr)); } } @@ -4218,7 +4219,7 @@ static void interpretRoutineDeviceInfo( auto getDeviceTypeAttr = [&]() -> mlir::Attribute { auto context = builder.getContext(); auto value = getDeviceType(dinfo.dType()); - return mlir::acc::DeviceTypeAttr::get(context, value ); + return mlir::acc::DeviceTypeAttr::get(context, value); }; if (dinfo.isSeq()) { seqDeviceTypes.push_back(getDeviceTypeAttr()); @@ -4244,14 +4245,15 @@ static void interpretRoutineDeviceInfo( const auto &bindName = dinfo.bindNameOpt().value(); mlir::Attribute bindNameAttr; if (const auto &bindStr{std::get_if(&bindName)}) { - bindNameAttr = builder.getStringAttr(*bindStr); - } else if (const auto &bindSym{std::get_if(&bindName)}) { - bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym)); - } else { - llvm_unreachable("Unsupported bind name type"); - } - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(getDeviceTypeAttr()); + bindNameAttr = builder.getStringAttr(*bindStr); + } else if (const auto &bindSym{ + std::get_if(&bindName)}) { + bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym)); + } else { + llvm_unreachable("Unsupported bind name type"); + } + bindNames.push_back(bindNameAttr); + bindNameDeviceTypes.push_back(getDeviceTypeAttr()); } } @@ -4277,18 +4279,18 @@ void Fortran::lower::genOpenACCRoutineConstruct( } // Note: Device Independent Attributes are set to the // none device type in `info`. 
- interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, vectorDeviceTypes, - workerDeviceTypes, bindNameDeviceTypes, - bindNames, gangDeviceTypes, gangDimValues, - gangDimDeviceTypes); + interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, + vectorDeviceTypes, workerDeviceTypes, + bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimValues, gangDimDeviceTypes); // Device Dependent Attributes for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo : info.deviceTypeInfos()) { interpretRoutineDeviceInfo( - converter, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, - bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues, - gangDimDeviceTypes); + converter, dinfo, seqDeviceTypes, vectorDeviceTypes, + workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimValues, gangDimDeviceTypes); } } createOpenACCRoutineConstruct( diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index a8f00b546306e..c2df7cddc0025 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1103,7 +1103,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( } symbol.get().add_openACCRoutineInfo(info); } else { - llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() << "\n"; + llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() + << "\n"; } } >From c26093683edb7c0270809d2afb717450f92df6ab Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 25 Apr 2025 14:47:39 -0700 Subject: [PATCH 06/11] tidy up --- flang/include/flang/Lower/OpenACC.h | 1 - flang/lib/Lower/CallInterface.cpp | 1 - 2 files changed, 2 deletions(-) diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 35a33e751b52b..9e71ad0a15c89 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -46,7 +46,6 @@ struct AccClauseList; struct 
OpenACCConstruct; struct OpenACCDeclarativeConstruct; struct OpenACCRoutineConstruct; -struct ProcedureDesignator; } // namespace parser namespace semantics { diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 611eacfe178e5..602b5c7bfa6c6 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -1701,7 +1701,6 @@ class SignatureBuilder fir::emitFatalError(converter.getCurrentLocation(), "SignatureBuilder should only be used once"); declare(); - interfaceDetermined = true; return getFuncOp(); } >From 1b825b55c808ac92cd2866d855611cd585eb28db Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 25 Apr 2025 17:00:27 -0700 Subject: [PATCH 07/11] cleaning up unused code --- flang/include/flang/Lower/AbstractConverter.h | 3 - flang/include/flang/Lower/OpenACC.h | 18 +- flang/lib/Lower/Bridge.cpp | 43 +++-- flang/lib/Lower/OpenACC.cpp | 166 +----------------- flang/lib/Semantics/resolve-directives.cpp | 12 +- 5 files changed, 32 insertions(+), 210 deletions(-) diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 59419e829718f..2fa0da94b0396 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -358,9 +358,6 @@ class AbstractConverter { /// functions in order to be in sync). virtual mlir::SymbolTable *getMLIRSymbolTable() = 0; - virtual Fortran::lower::AccRoutineInfoMappingList & - getAccDelayedRoutines() = 0; - private: /// Options controlling lowering behavior. 
const Fortran::lower::LoweringOptions &loweringOptions; diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 9e71ad0a15c89..d2cd7712fb2c7 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -63,9 +63,6 @@ namespace pft { struct Evaluation; } // namespace pft -using AccRoutineInfoMappingList = - llvm::SmallVector>; - static constexpr llvm::StringRef declarePostAllocSuffix = "_acc_declare_update_desc_post_alloc"; static constexpr llvm::StringRef declarePreDeallocSuffix = @@ -82,22 +79,13 @@ mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, pft::Evaluation &, const parser::OpenACCConstruct &); -void genOpenACCDeclarativeConstruct(AbstractConverter &, - Fortran::semantics::SemanticsContext &, - StatementContext &, - const parser::OpenACCDeclarativeConstruct &, - AccRoutineInfoMappingList &); -void genOpenACCRoutineConstruct(AbstractConverter &, - Fortran::semantics::SemanticsContext &, - mlir::ModuleOp, - const parser::OpenACCRoutineConstruct &); +void genOpenACCDeclarativeConstruct( + AbstractConverter &, Fortran::semantics::SemanticsContext &, + StatementContext &, const parser::OpenACCDeclarativeConstruct &); void genOpenACCRoutineConstruct( AbstractConverter &, mlir::ModuleOp, mlir::func::FuncOp, const std::vector &); -void finalizeOpenACCRoutineAttachment(mlir::ModuleOp, - AccRoutineInfoMappingList &); - /// Get a acc.private.recipe op for the given type or create it if it does not /// exist yet. 
mlir::acc::PrivateRecipeOp createOrGetPrivateRecipe(mlir::OpBuilder &, diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 5e7b783323bfd..1615493003898 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -458,15 +458,25 @@ class FirConverter : public Fortran::lower::AbstractConverter { Fortran::common::LanguageFeature::CUDA)); }); - finalizeOpenACCLowering(); finalizeOpenMPLowering(globalOmpRequiresSymbol); } /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { + // Since this is a recursive function, we only need to create a new builder + // for each top-level declaration. It would be simpler to have a single + // builder for the entire translation unit, but that requires a lot of + // changes to the code. + // FIXME: Once createGlobalOutsideOfFunctionLowering is fixed, we can + // remove this code and share the module builder. + bool newBuilder = false; + if (!builder) { + newBuilder = true; + builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), + &mlirSymbolTable); + } + CHECK(builder && "FirOpBuilder did not instantiate"); setCurrentPosition(funit.getStartingSourceLoc()); - builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), - &mlirSymbolTable); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { funit.setActiveEntry(entryIndex); @@ -493,7 +503,11 @@ class FirConverter : public Fortran::lower::AbstractConverter { for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList) if (auto *f = std::get_if(&unit)) declareFunction(*f); - builder = nullptr; + + if (newBuilder) { + delete builder; + builder = nullptr; + } } /// Get the scope that is defining or using \p sym. 
The returned scope is not @@ -3017,8 +3031,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { void genFIR(const Fortran::parser::OpenACCDeclarativeConstruct &accDecl) { genOpenACCDeclarativeConstruct(*this, bridge.getSemanticsContext(), - bridge.openAccCtx(), accDecl, - accRoutineInfos); + bridge.openAccCtx(), accDecl); for (Fortran::lower::pft::Evaluation &e : getEval().getNestedEvaluations()) genFIR(e); } @@ -4286,11 +4299,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { return Fortran::lower::createMutableBox(loc, *this, expr, localSymbols); } - Fortran::lower::AccRoutineInfoMappingList & - getAccDelayedRoutines() override final { - return accRoutineInfos; - } - // Create the [newRank] array with the lower bounds to be passed to the // runtime as a descriptor. mlir::Value createLboundArray(llvm::ArrayRef lbounds, @@ -5889,7 +5897,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Helper to generate GlobalOps when the builder is not positioned in any /// region block. This is required because the FirOpBuilder assumes it is /// always positioned inside a region block when creating globals, the easiest - /// way comply is to create a dummy function and to throw it afterwards. + /// way comply is to create a dummy function and to throw it away afterwards. 
void createGlobalOutsideOfFunctionLowering( const std::function &createGlobals) { // FIXME: get rid of the bogus function context and instantiate the @@ -5902,6 +5910,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { mlir::FunctionType::get(context, std::nullopt, std::nullopt), symbolTable); func.addEntryBlock(); + CHECK(!builder && "Expected builder to be uninitialized"); builder = new fir::FirOpBuilder(func, bridge.getKindMap(), symbolTable); assert(builder && "FirOpBuilder did not instantiate"); builder->setFastMathFlags(bridge.getLoweringOptions().getMathOptions()); @@ -6331,13 +6340,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { expr.u); } - /// Performing OpenACC lowering action that were deferred to the end of - /// lowering. - void finalizeOpenACCLowering() { - Fortran::lower::finalizeOpenACCRoutineAttachment(getModuleOp(), - accRoutineInfos); - } - /// Performing OpenMP lowering actions that were deferred to the end of /// lowering. void finalizeOpenMPLowering( @@ -6429,9 +6431,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// A counter for uniquing names in `literalNamesMap`. std::uint64_t uniqueLitId = 0; - /// Deferred OpenACC routine attachment. 
- Fortran::lower::AccRoutineInfoMappingList accRoutineInfos; - /// Whether an OpenMP target region or declare target function/subroutine /// intended for device offloading has been detected bool ompDeviceCodeFound = false; diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 891dc998bc596..1a031dce7a487 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -4163,9 +4163,6 @@ void createOpenACCRoutineConstruct( llvm::SmallVector &workerDeviceTypes, llvm::SmallVector &vectorDeviceTypes) { - std::stringstream routineOpName; - routineOpName << accRoutinePrefix.str() << routineCounter++; - for (auto routineOp : mod.getOps()) { if (routineOp.getFuncName().str().compare(funcName) == 0) { // If the routine is already specified with the same clauses, just skip @@ -4179,6 +4176,8 @@ void createOpenACCRoutineConstruct( mlir::emitError(loc, "Routine already specified with different clauses"); } } + std::stringstream routineOpName; + routineOpName << accRoutinePrefix.str() << routineCounter++; std::string routineOpStr = routineOpName.str(); mlir::OpBuilder modBuilder(mod.getBodyRegion()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); @@ -4192,16 +4191,7 @@ void createOpenACCRoutineConstruct( getArrayAttrOrNull(builder, gangDimValues), getArrayAttrOrNull(builder, gangDimDeviceTypes)); - auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr); - if (funcOp) { - - attachRoutineInfo(funcOp, symbolRefAttr); - } else { - // FuncOp is not lowered yet. Keep the information so the routine info - // can be attached later to the funcOp. 
- converter.getAccDelayedRoutines().push_back( - std::make_pair(funcName, symbolRefAttr)); - } + attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr)); } static void interpretRoutineDeviceInfo( @@ -4299,145 +4289,6 @@ void Fortran::lower::genOpenACCRoutineConstruct( seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); } -void Fortran::lower::genOpenACCRoutineConstruct( - Fortran::lower::AbstractConverter &converter, - Fortran::semantics::SemanticsContext &semanticsContext, mlir::ModuleOp mod, - const Fortran::parser::OpenACCRoutineConstruct &routineConstruct) { - fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - mlir::Location loc = converter.genLocation(routineConstruct.source); - std::optional name = - std::get>(routineConstruct.t); - const auto &clauses = - std::get(routineConstruct.t); - mlir::func::FuncOp funcOp; - std::string funcName; - if (name) { - funcName = converter.mangleName(*name->symbol); - funcOp = - builder.getNamedFunction(mod, builder.getMLIRSymbolTable(), funcName); - } else { - Fortran::semantics::Scope &scope = - semanticsContext.FindScope(routineConstruct.source); - const Fortran::semantics::Scope &progUnit{GetProgramUnitContaining(scope)}; - const auto *subpDetails{ - progUnit.symbol() - ? progUnit.symbol() - ->detailsIf() - : nullptr}; - if (subpDetails && subpDetails->isInterface()) { - funcName = converter.mangleName(*progUnit.symbol()); - funcOp = - builder.getNamedFunction(mod, builder.getMLIRSymbolTable(), funcName); - } else { - funcOp = builder.getFunction(); - funcName = funcOp.getName(); - } - } - // TODO: Refactor this to use the OpenACCRoutineInfo - bool hasNohost = false; - - llvm::SmallVector seqDeviceTypes, vectorDeviceTypes, - workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, - gangDimDeviceTypes, gangDimValues; - - // device_type attribute is set to `none` until a device_type clause is - // encountered. 
- llvm::SmallVector crtDeviceTypes; - crtDeviceTypes.push_back(mlir::acc::DeviceTypeAttr::get( - builder.getContext(), mlir::acc::DeviceType::None)); - - for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - seqDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (const auto *gangClause = - std::get_if(&clause.u)) { - if (gangClause->v) { - const Fortran::parser::AccGangArgList &x = *gangClause->v; - for (const Fortran::parser::AccGangArg &gangArg : x.v) { - if (const auto *dim = - std::get_if(&gangArg.u)) { - const std::optional dimValue = Fortran::evaluate::ToInt64( - *Fortran::semantics::GetExpr(dim->v)); - if (!dimValue) - mlir::emitError(loc, - "dim value must be a constant positive integer"); - mlir::Attribute gangDimAttr = - builder.getIntegerAttr(builder.getI64Type(), *dimValue); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - gangDimValues.push_back(gangDimAttr); - gangDimDeviceTypes.push_back(crtDeviceTypeAttr); - } - } - } - } else { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - gangDeviceTypes.push_back(crtDeviceTypeAttr); - } - } else if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - vectorDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - workerDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (std::get_if(&clause.u)) { - hasNohost = true; - } else if (const auto *bindClause = - std::get_if(&clause.u)) { - if (const auto *name = - std::get_if(&bindClause->v.u)) { - mlir::Attribute bindNameAttr = - builder.getStringAttr(converter.mangleName(*name->symbol)); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(crtDeviceTypeAttr); - } - } else if (const auto charExpr = - std::get_if( - &bindClause->v.u)) { - const std::optional name = - 
Fortran::semantics::GetConstExpr(semanticsContext, - *charExpr); - if (!name) - mlir::emitError(loc, "Could not retrieve the bind name"); - - mlir::Attribute bindNameAttr = builder.getStringAttr(*name); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(crtDeviceTypeAttr); - } - } - } else if (const auto *deviceTypeClause = - std::get_if( - &clause.u)) { - crtDeviceTypes.clear(); - gatherDeviceTypeAttrs(builder, deviceTypeClause, crtDeviceTypes); - } - } - - createOpenACCRoutineConstruct( - converter, loc, mod, funcOp, funcName, hasNohost, bindNames, - bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes, - seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); -} - -void Fortran::lower::finalizeOpenACCRoutineAttachment( - mlir::ModuleOp mod, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { - for (auto &mapping : accRoutineInfos) { - mlir::func::FuncOp funcOp = - mod.lookupSymbol(mapping.first); - if (!funcOp) - mlir::emitWarning(mod.getLoc(), - llvm::Twine("function '") + llvm::Twine(mapping.first) + - llvm::Twine("' in acc routine directive is not " - "found in this translation unit.")); - else - attachRoutineInfo(funcOp, mapping.second); - } - accRoutineInfos.clear(); -} - static void genACC(Fortran::lower::AbstractConverter &converter, Fortran::lower::pft::Evaluation &eval, @@ -4551,8 +4402,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct( Fortran::lower::AbstractConverter &converter, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &openAccCtx, - const Fortran::parser::OpenACCDeclarativeConstruct &accDeclConstruct, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { + const Fortran::parser::OpenACCDeclarativeConstruct &accDeclConstruct) { Fortran::common::visit( common::visitors{ @@ -4561,13 +4411,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct( genACC(converter, semanticsContext, openAccCtx, 
standaloneDeclarativeConstruct); }, - [&](const Fortran::parser::OpenACCRoutineConstruct - &routineConstruct) { - fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - mlir::ModuleOp mod = builder.getModule(); - Fortran::lower::genOpenACCRoutineConstruct( - converter, semanticsContext, mod, routineConstruct); - }, + [&](const Fortran::parser::OpenACCRoutineConstruct &x) {}, }, accDeclConstruct.u); } diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index c2df7cddc0025..d74953df1e630 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1041,9 +1041,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( if (const auto *dTypeClause = std::get_if(&clause.u)) { currentDevices.clear(); - for (const auto &deviceTypeExpr : dTypeClause->v.v) { + for (const auto &deviceTypeExpr : dTypeClause->v.v) currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v)); - } } else if (std::get_if(&clause.u)) { info.set_isNohost(); } else if (std::get_if(&clause.u)) { @@ -1080,9 +1079,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( std::get_if(&bindClause->v.u)) { if (Symbol *sym = ResolveFctName(*name)) { Symbol &ultimate{sym->GetUltimate()}; - for (auto &device : currentDevices) { + for (auto &device : currentDevices) device->set_bindName(SymbolRef(ultimate)); - } } else { context_.Say((*name).source, "No function or subroutine declared for '%s'"_err_en_US, @@ -1095,16 +1093,12 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Fortran::parser::Unwrap( *charExpr); std::string str{std::get(charConst->t)}; - for (auto &device : currentDevices) { + for (auto &device : currentDevices) device->set_bindName(std::string(str)); - } } } } symbol.get().add_openACCRoutineInfo(info); - } else { - llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() - << "\n"; } } >From 158481e5eaf63c1b2b4c172b9c143ca4c10722f5 Mon Sep 17 00:00:00 2001 From: Andre 
Kuhlenschmidt Date: Fri, 25 Apr 2025 17:15:26 -0700 Subject: [PATCH 08/11] a little more tidying up --- flang/include/flang/Lower/AbstractConverter.h | 1 - flang/include/flang/Lower/OpenACC.h | 3 --- flang/lib/Lower/Bridge.cpp | 3 ++- 3 files changed, 2 insertions(+), 5 deletions(-) diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 2fa0da94b0396..1d1323642bf9c 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -14,7 +14,6 @@ #define FORTRAN_LOWER_ABSTRACTCONVERTER_H #include "flang/Lower/LoweringOptions.h" -#include "flang/Lower/OpenACC.h" #include "flang/Lower/PFTDefs.h" #include "flang/Optimizer/Builder/BoxValue.h" #include "flang/Optimizer/Dialect/FIRAttr.h" diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index d2cd7712fb2c7..4034953976427 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -72,9 +72,6 @@ static constexpr llvm::StringRef declarePostDeallocSuffix = static constexpr llvm::StringRef privatizationRecipePrefix = "privatization"; -bool needsOpenACCRoutineConstruct( - const Fortran::evaluate::ProcedureDesignator *); - mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, pft::Evaluation &, diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 1615493003898..e50c91654f7bb 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -5897,7 +5897,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Helper to generate GlobalOps when the builder is not positioned in any /// region block. This is required because the FirOpBuilder assumes it is /// always positioned inside a region block when creating globals, the easiest - /// way comply is to create a dummy function and to throw it away afterwards. 
+ /// way to comply is to create a dummy function and to throw it away + /// afterwards. void createGlobalOutsideOfFunctionLowering( const std::function &createGlobals) { // FIXME: get rid of the bogus function context and instantiate the >From 8f6ae035147336c4ed04b5b25487f72ebc52c757 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 1 May 2025 08:12:35 -0700 Subject: [PATCH 09/11] more consistent use of builders --- flang/lib/Lower/Bridge.cpp | 40 ++++++++++--------------------- flang/lib/Lower/CallInterface.cpp | 1 - 2 files changed, 13 insertions(+), 28 deletions(-) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index e50c91654f7bb..fb20dfbaf477e 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -403,18 +403,21 @@ class FirConverter : public Fortran::lower::AbstractConverter { [&](Fortran::lower::pft::FunctionLikeUnit &f) { if (f.isMainProgram()) hasMainProgram = true; - declareFunction(f); + createGlobalOutsideOfFunctionLowering( + [&]() { declareFunction(f); }); if (!globalOmpRequiresSymbol) globalOmpRequiresSymbol = f.getScope().symbol(); }, [&](Fortran::lower::pft::ModuleLikeUnit &m) { lowerModuleDeclScope(m); - for (Fortran::lower::pft::ContainedUnit &unit : - m.containedUnitList) - if (auto *f = - std::get_if( - &unit)) - declareFunction(*f); + createGlobalOutsideOfFunctionLowering([&]() { + for (Fortran::lower::pft::ContainedUnit &unit : + m.containedUnitList) + if (auto *f = + std::get_if( + &unit)) + declareFunction(*f); + }); }, [&](Fortran::lower::pft::BlockDataUnit &b) { if (!globalOmpRequiresSymbol) @@ -463,19 +466,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { - // Since this is a recursive function, we only need to create a new builder - // for each top-level declaration. 
It would be simpler to have a single - // builder for the entire translation unit, but that requires a lot of - // changes to the code. - // FIXME: Once createGlobalOutsideOfFunctionLowering is fixed, we can - // remove this code and share the module builder. - bool newBuilder = false; - if (!builder) { - newBuilder = true; - builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), - &mlirSymbolTable); - } - CHECK(builder && "FirOpBuilder did not instantiate"); + CHECK(builder && "declareFunction called with uninitialized builder"); setCurrentPosition(funit.getStartingSourceLoc()); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { @@ -503,11 +494,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList) if (auto *f = std::get_if(&unit)) declareFunction(*f); - - if (newBuilder) { - delete builder; - builder = nullptr; - } } /// Get the scope that is defining or using \p sym. The returned scope is not @@ -5624,9 +5610,9 @@ class FirConverter : public Fortran::lower::AbstractConverter { LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]"; if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym; llvm::dbgs() << "\n"); - // I don't think setting the builder is necessary here, because callee + // Setting the builder is not necessary here, because callee // always looks up the FuncOp from the module. If there was a function that - // was not declared yet. This call to callee will cause an assertion + // was not declared yet, this call to callee will cause an assertion // failure. 
Fortran::lower::CalleeInterface callee(funit, *this); mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments(); diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 602b5c7bfa6c6..8affa1e1965e8 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -21,7 +21,6 @@ #include "flang/Optimizer/Dialect/FIROpsSupport.h" #include "flang/Optimizer/Support/InternalNames.h" #include "flang/Optimizer/Support/Utils.h" -#include "flang/Parser/parse-tree.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" #include "flang/Support/Fortran.h" >From a52655d0b90055ad6ff062fbf66be3172a95973b Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 1 May 2025 08:18:14 -0700 Subject: [PATCH 10/11] delete space --- flang/lib/Lower/Bridge.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index fb20dfbaf477e..a6ee24edd8381 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -5883,7 +5883,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Helper to generate GlobalOps when the builder is not positioned in any /// region block. This is required because the FirOpBuilder assumes it is /// always positioned inside a region block when creating globals, the easiest - /// way to comply is to create a dummy function and to throw it away + /// way to comply is to create a dummy function and to throw it away /// afterwards. 
void createGlobalOutsideOfFunctionLowering( const std::function &createGlobals) { >From b2a1da313284c9709eae6f757885df697843565c Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 2 May 2025 11:30:33 -0700 Subject: [PATCH 11/11] less builder creation --- flang/lib/Lower/Bridge.cpp | 109 ++++++++++++++++++------------------- 1 file changed, 53 insertions(+), 56 deletions(-) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index a6ee24edd8381..81127ab55a937 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -397,40 +397,39 @@ class FirConverter : public Fortran::lower::AbstractConverter { // they are available before lowering any function that may use them. bool hasMainProgram = false; const Fortran::semantics::Symbol *globalOmpRequiresSymbol = nullptr; - for (Fortran::lower::pft::Program::Units &u : pft.getUnits()) { - Fortran::common::visit( - Fortran::common::visitors{ - [&](Fortran::lower::pft::FunctionLikeUnit &f) { - if (f.isMainProgram()) - hasMainProgram = true; - createGlobalOutsideOfFunctionLowering( - [&]() { declareFunction(f); }); - if (!globalOmpRequiresSymbol) - globalOmpRequiresSymbol = f.getScope().symbol(); - }, - [&](Fortran::lower::pft::ModuleLikeUnit &m) { - lowerModuleDeclScope(m); - createGlobalOutsideOfFunctionLowering([&]() { + createBuilderOutsideOfFuncOpAndDo([&]() { + for (Fortran::lower::pft::Program::Units &u : pft.getUnits()) { + Fortran::common::visit( + Fortran::common::visitors{ + [&](Fortran::lower::pft::FunctionLikeUnit &f) { + if (f.isMainProgram()) + hasMainProgram = true; + declareFunction(f); + if (!globalOmpRequiresSymbol) + globalOmpRequiresSymbol = f.getScope().symbol(); + }, + [&](Fortran::lower::pft::ModuleLikeUnit &m) { + lowerModuleDeclScope(m); for (Fortran::lower::pft::ContainedUnit &unit : m.containedUnitList) if (auto *f = std::get_if( &unit)) declareFunction(*f); - }); - }, - [&](Fortran::lower::pft::BlockDataUnit &b) { - if (!globalOmpRequiresSymbol) - 
globalOmpRequiresSymbol = b.symTab.symbol(); - }, - [&](Fortran::lower::pft::CompilerDirectiveUnit &d) {}, - [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {}, - }, - u); - } + }, + [&](Fortran::lower::pft::BlockDataUnit &b) { + if (!globalOmpRequiresSymbol) + globalOmpRequiresSymbol = b.symTab.symbol(); + }, + [&](Fortran::lower::pft::CompilerDirectiveUnit &d) {}, + [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {}, + }, + u); + } + }); // Create definitions of intrinsic module constants. - createGlobalOutsideOfFunctionLowering( + createBuilderOutsideOfFuncOpAndDo( [&]() { createIntrinsicModuleDefinitions(pft); }); // Primary translation pass. @@ -449,12 +448,12 @@ class FirConverter : public Fortran::lower::AbstractConverter { // Once all the code has been translated, create global runtime type info // data structures for the derived types that have been processed, as well // as fir.type_info operations for the dispatch tables. - createGlobalOutsideOfFunctionLowering( + createBuilderOutsideOfFuncOpAndDo( [&]() { typeInfoConverter.createTypeInfo(*this); }); // Generate the `main` entry point if necessary if (hasMainProgram) - createGlobalOutsideOfFunctionLowering([&]() { + createBuilderOutsideOfFuncOpAndDo([&]() { fir::runtime::genMain(*builder, toLocation(), bridge.getEnvironmentDefaults(), getFoldingContext().languageFeatures().IsEnabled( @@ -5885,7 +5884,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// always positioned inside a region block when creating globals, the easiest /// way to comply is to create a dummy function and to throw it away /// afterwards. - void createGlobalOutsideOfFunctionLowering( + void createBuilderOutsideOfFuncOpAndDo( const std::function &createGlobals) { // FIXME: get rid of the bogus function context and instantiate the // globals directly into the module. @@ -5913,7 +5912,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Instantiate the data from a BLOCK DATA unit. 
void lowerBlockData(Fortran::lower::pft::BlockDataUnit &bdunit) { - createGlobalOutsideOfFunctionLowering([&]() { + createBuilderOutsideOfFuncOpAndDo([&]() { Fortran::lower::AggregateStoreMap fakeMap; for (const auto &[_, sym] : bdunit.symTab) { if (sym->has()) { @@ -5927,7 +5926,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Create fir::Global for all the common blocks that appear in the program. void lowerCommonBlocks(const Fortran::semantics::CommonBlockList &commonBlocks) { - createGlobalOutsideOfFunctionLowering( + createBuilderOutsideOfFuncOpAndDo( [&]() { Fortran::lower::defineCommonBlocks(*this, commonBlocks); }); } @@ -5997,36 +5996,34 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// declarative construct. void lowerModuleDeclScope(Fortran::lower::pft::ModuleLikeUnit &mod) { setCurrentPosition(mod.getStartingSourceLoc()); - createGlobalOutsideOfFunctionLowering([&]() { - auto &scopeVariableListMap = - Fortran::lower::pft::getScopeVariableListMap(mod); - for (const auto &var : Fortran::lower::pft::getScopeVariableList( - mod.getScope(), scopeVariableListMap)) { - - // Only define the variables owned by this module. - const Fortran::semantics::Scope *owningScope = var.getOwningScope(); - if (owningScope && mod.getScope() != *owningScope) - continue; + auto &scopeVariableListMap = + Fortran::lower::pft::getScopeVariableListMap(mod); + for (const auto &var : Fortran::lower::pft::getScopeVariableList( + mod.getScope(), scopeVariableListMap)) { - // Very special case: The value of numeric_storage_size depends on - // compilation options and therefore its value is not yet known when - // building the builtins runtime. Instead, the parameter is folding a - // __numeric_storage_size() expression which is loaded into the user - // program. For the iso_fortran_env object file, omit the symbol as it - // is never used. 
- if (var.hasSymbol()) { - const Fortran::semantics::Symbol &sym = var.getSymbol(); - const Fortran::semantics::Scope &owner = sym.owner(); - if (sym.name() == "numeric_storage_size" && owner.IsModule() && - DEREF(owner.symbol()).name() == "iso_fortran_env") - continue; - } + // Only define the variables owned by this module. + const Fortran::semantics::Scope *owningScope = var.getOwningScope(); + if (owningScope && mod.getScope() != *owningScope) + continue; - Fortran::lower::defineModuleVariable(*this, var); + // Very special case: The value of numeric_storage_size depends on + // compilation options and therefore its value is not yet known when + // building the builtins runtime. Instead, the parameter is folding a + // __numeric_storage_size() expression which is loaded into the user + // program. For the iso_fortran_env object file, omit the symbol as it + // is never used. + if (var.hasSymbol()) { + const Fortran::semantics::Symbol &sym = var.getSymbol(); + const Fortran::semantics::Scope &owner = sym.owner(); + if (sym.name() == "numeric_storage_size" && owner.IsModule() && + DEREF(owner.symbol()).name() == "iso_fortran_env") + continue; } + + Fortran::lower::defineModuleVariable(*this, var); + } for (auto &eval : mod.evaluationList) genFIR(eval); - }); } /// Lower functions contained in a module. From flang-commits at lists.llvm.org Fri May 2 15:30:23 2025 From: flang-commits at lists.llvm.org (Pranav Bhandarkar via flang-commits) Date: Fri, 02 May 2025 15:30:23 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][MLIR] - Access the LEN for a `fir.boxchar` and use it to set the bounds `omp.map.info` ops. (PR #134967) In-Reply-To: Message-ID: <6815477f.a70a0220.1cf102.b189@mx.google.com> https://github.com/bhandarkar-pranav updated https://github.com/llvm/llvm-project/pull/134967 >From 2b2e70f5bf635d573d54465c2d2af0150e5509be Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Tue, 8 Apr 2025 15:31:24 -0500 Subject: [PATCH 01/10] end-to-end test8 working. 
Need to get rid of lots of debugging commits --- flang/include/flang/Lower/DirectivesCommon.h | 25 +++- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 39 +++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 114 ++++++++++++++++-- .../Optimizer/OpenMP/MapInfoFinalization.cpp | 46 ++++++- 4 files changed, 208 insertions(+), 16 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index 93ab2e350d035..ce03b0751b56a 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -424,24 +424,30 @@ fir::factory::AddrAndBoundsInfo gatherDataOperandAddrAndBounds( auto arrayBase = toMaybeExpr(arrayRef->base()); assert(arrayBase); - + llvm::errs() << "arrayBase = "; + arrayBase.value().dump(); if (detail::getRef(*arrayBase)) { + llvm::errs() << "detail::getRef(*arrayBase)\n"; dataExv = converter.genExprAddr(operandLocation, *arrayBase, stmtCtx); info.addr = fir::getBase(dataExv); info.rawInput = info.addr; asFortran << arrayBase->AsFortran(); } else { + llvm::errs() << "ELSE -> detail::getRef(*arrayBase)\n"; const semantics::Symbol &sym = arrayRef->GetLastSymbol(); dataExvIsAssumedSize = Fortran::semantics::IsAssumedSizeArray(sym.GetUltimate()); info = getDataOperandBaseAddr(converter, builder, sym, operandLocation, unwrapFirBox); dataExv = converter.getSymbolExtendedValue(sym); + llvm::errs() << "isAssumedSizeArray? 
= " << dataExvIsAssumedSize << "\n"; + llvm::errs() << "dataExv = " << dataExv << "\n"; asFortran << sym.name().ToString(); } if (!arrayRef->subscript().empty()) { asFortran << '('; + llvm::errs() << "!arrayRef->subscript().empty()\n"; bounds = genBoundsOps( builder, operandLocation, converter, stmtCtx, arrayRef->subscript(), asFortran, dataExv, dataExvIsAssumedSize, info, treatIndexAsSection, @@ -453,8 +459,10 @@ fir::factory::AddrAndBoundsInfo gatherDataOperandAddrAndBounds( converter.genExprAddr(operandLocation, designator, stmtCtx); info.addr = fir::getBase(compExv); info.rawInput = info.addr; + llvm::errs() << "compRef = detail::getRef(designator)\n"; if (genDefaultBounds && - mlir::isa(fir::unwrapRefType(info.addr.getType()))) + mlir::isa(fir::unwrapRefType(info.addr.getType()))) { + llvm::errs() << "genDefaultBounds && mlir::isa(fir::unwrapRefType(info.addr.getType()))\n"; bounds = fir::factory::genBaseBoundsOps( builder, operandLocation, compExv, /*isAssumedSize=*/false, strideIncludeLowerExtent); @@ -492,6 +500,7 @@ fir::factory::AddrAndBoundsInfo gatherDataOperandAddrAndBounds( builder, operandLocation, compExv, info); } } else { + llvm::errs() << "All else\n"; if (detail::getRef(designator)) { fir::ExtendedValue compExv = converter.genExprAddr(operandLocation, designator, stmtCtx); @@ -500,18 +509,26 @@ fir::factory::AddrAndBoundsInfo gatherDataOperandAddrAndBounds( asFortran << designator.AsFortran(); } else if (auto symRef = detail::getRef(designator)) { // Scalar or full array. 
+ llvm::errs() << "symRef = detail::getRef(designator)\n"; fir::ExtendedValue dataExv = converter.getSymbolExtendedValue(*symRef); info = getDataOperandBaseAddr(converter, builder, *symRef, operandLocation, unwrapFirBox); + llvm::errs() << "dataExv = " << dataExv << "\n"; + llvm::errs() << "info is \n"; + info.dump(llvm::errs()); + llvm ::errs() << "info.addr.getType()" << info.addr.getType() << "\n"; if (genDefaultBounds && mlir::isa( fir::unwrapRefType(info.addr.getType()))) { + llvm::errs() << "genDefaultBounds && mlir::isa(fir::unwrapRefType(info.addr.getType())) \n"; info.boxType = fir::unwrapRefType(info.addr.getType()); bounds = fir::factory::genBoundsOpsFromBox( builder, operandLocation, dataExv, info); - } + } else + llvm::errs()<< "ELSE => genDefaultBounds && mlir::isa(fir::unwrapRefType(info.addr.getType())) \n"; bool dataExvIsAssumedSize = Fortran::semantics::IsAssumedSizeArray(symRef->get().GetUltimate()); - if (genDefaultBounds && + llvm::errs() << "isAssumedSizeArray? = " << dataExvIsAssumedSize << "\n"; + if (genDefaultBounds && mlir::isa(fir::unwrapRefType(info.addr.getType()))) bounds = fir::factory::genBaseBoundsOps( builder, operandLocation, dataExv, dataExvIsAssumedSize, diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 77b4622547d7a..73d3f2a1cfe02 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -21,6 +21,9 @@ #include "llvm/Frontend/OpenMP/OMP.h.inc" #include "llvm/Frontend/OpenMP/OMPIRBuilder.h" +#define DEBUG_TYPE "flang-openmp-lowering" +#define PDBGS() (llvm::dbgs() << "[" << DEBUG_TYPE << "]: ") + namespace Fortran { namespace lower { namespace omp { @@ -1054,7 +1057,15 @@ void ClauseProcessor::processMapObjects( llvm::SmallVector bounds; std::stringstream asFortran; std::optional parentObj; - + LLVM_DEBUG( + LLVM_DEBUG(PDBGS() << "Sym = " << *object.sym() << "\n"); + if (object.ref()) { + PDBGS() << "Designator = "; + 
object.ref().value().dump(); + PDBGS() << "\n"; + } + else + PDBGS() << "No Designator\n";); fir::factory::AddrAndBoundsInfo info = lower::gatherDataOperandAddrAndBounds( @@ -1062,6 +1073,17 @@ void ClauseProcessor::processMapObjects( object.ref(), clauseLocation, asFortran, bounds, treatIndexAsSection); + LLVM_DEBUG( + if (bounds.empty()) + PDBGS() << "Bounds empty\n"; + else { + PDBGS() << "Bounds:\n"; + for (auto v : bounds) { + PDBGS() << v << "\n"; + } + } + ); + mlir::Value baseOp = info.rawInput; if (object.sym()->owner().IsDerivedType()) { omp::ObjectList objectList = gatherObjectsOf(object, semaCtx); @@ -1089,13 +1111,26 @@ void ClauseProcessor::processMapObjects( mapperIdName) : mlir::FlatSymbolRefAttr(); } - // Explicit map captures are captured ByRef by default, // optimisation passes may alter this to ByCopy or other capture // types to optimise auto location = mlir::NameLoc::get( mlir::StringAttr::get(firOpBuilder.getContext(), asFortran.str()), baseOp.getLoc()); + // mlir::Type idxTy = firOpBuilder.getIndexType(); + // mlir::Value one = firOpBuilder.createIntegerConstant(location, idxTy, 1); + // mlir::Value zero = firOpBuilder.createIntegerConstant(location, idxTy, 0); + // auto normalizedLB = zero; + // auto ub = firOpBuilder.createIntegerConstant(location, idxTy, 7); + // auto extent = firOpBuilder.createIntegerConstant(location, idxTy, 8); + // auto stride = one; + // mlir::Type boundTy = firOpBuilder.getType(); + // mlir::Value boundsOp = firOpBuilder.create( + // location, boundTy, /*lower_bound=*/normalizedLB, + // /*upper_bound=*/ub, /*extent=*/extent, /*stride=*/stride, + // /*stride_in_bytes = */ true, /*start_idx=*/normalizedLB); + // bounds.push_back(boundsOp); + // LLVM_DEBUG(PDBGS() << "Created bounds " << boundsOp << "\n"); mlir::omp::MapInfoOp mapOp = createMapInfoOp( firOpBuilder, location, baseOp, /*varPtrPtr=*/mlir::Value{}, asFortran.str(), bounds, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp 
index f099028c23323..76748d1bd9476 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -41,6 +41,28 @@ #include "llvm/ADT/STLExtras.h" #include "llvm/Frontend/OpenMP/OMPConstants.h" +#define DEBUG_TYPE "flang-openmp-lowering" +#define PDBGS() (llvm::dbgs() << "[" << DEBUG_TYPE << "]: ") + +static void printOperation(mlir::Operation *op) { llvm::dbgs() << *op << "\n"; } +static void printBlock(mlir::Block &block) { + llvm::dbgs() << "Block with " << block.getNumArguments() << " arguments, " + << block.getNumSuccessors() + << " successors, and " + // Note, this `.size()` is traversing a linked-list and is O(n). + << block.getOperations().size() << " operations\n"; + for (mlir::Operation &op : block.getOperations()) + printOperation(&op); +} + +static void printRegion(mlir::Region &region) { + // A region does not hold anything by itself other than a list of blocks. + llvm::dbgs() << "Region with " << region.getBlocks().size() + << " blocks:\n"; + for (mlir::Block &block : region.getBlocks()) + printBlock(block); +} + using namespace Fortran::lower::omp; using namespace Fortran::common::openmp; @@ -219,12 +241,23 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, auto bindSingleMapLike = [&converter, &firOpBuilder](const semantics::Symbol &sym, + const mlir::Value val, const mlir::BlockArgument &arg) { // Clones the `bounds` placing them inside the entry block and returns // them. 
auto cloneBound = [&](mlir::Value bound) { + LLVM_DEBUG(PDBGS() << "Cloning bound " << bound << "\n"); if (mlir::isMemoryEffectFree(bound.getDefiningOp())) { + if (auto unboxCharOp = mlir::dyn_cast(bound.getDefiningOp())) { + LLVM_DEBUG(PDBGS() << "Defining Op of Bound : " << unboxCharOp << "\n"); + mlir::Operation *clonedOp = firOpBuilder.clone(*unboxCharOp); + LLVM_DEBUG(PDBGS() << "Cloned Op of Bound : " << *clonedOp << "\n"); + return clonedOp->getResult(1); + } + mlir::Operation *defOp = bound.getDefiningOp(); + LLVM_DEBUG(PDBGS() << "Defining Op of Bound : " << *defOp << "\n"); mlir::Operation *clonedOp = firOpBuilder.clone(*bound.getDefiningOp()); + LLVM_DEBUG(PDBGS() << "Cloned Op of Bound : " << *clonedOp << "\n"); return clonedOp->getResult(0); } TODO(converter.getCurrentLocation(), @@ -239,39 +272,72 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, }; fir::ExtendedValue extVal = converter.getSymbolExtendedValue(sym); + LLVM_DEBUG(PDBGS() << "In bindEntryBlockArgs\n"); + LLVM_DEBUG(PDBGS() << "Sym: " << sym << "\n"); + LLVM_DEBUG(PDBGS() << "Extended Value:\n " << extVal << "\n"); + LLVM_DEBUG(PDBGS() << "mapBaseValue: \n" << val << "\n"); auto refType = mlir::dyn_cast(arg.getType()); if (refType && fir::isa_builtin_cptr_type(refType.getElementType())) { + LLVM_DEBUG(PDBGS() << "binding builting_cptr_type\n"); converter.bindSymbol(sym, arg); } else { extVal.match( [&](const fir::BoxValue &v) { + LLVM_DEBUG(PDBGS() << "binding BoxValue " << v << "\n"); converter.bindSymbol(sym, fir::BoxValue(arg, cloneBounds(v.getLBounds()), v.getExplicitParameters(), v.getExplicitExtents())); }, [&](const fir::MutableBoxValue &v) { + LLVM_DEBUG(PDBGS() << "binding MutableBoxValue " << v << "\n"); converter.bindSymbol( sym, fir::MutableBoxValue(arg, cloneBounds(v.getLBounds()), v.getMutableProperties())); }, [&](const fir::ArrayBoxValue &v) { + LLVM_DEBUG(PDBGS() << "binding ArrayBoxValue " << v << "\n"); converter.bindSymbol( sym, 
fir::ArrayBoxValue(arg, cloneBounds(v.getExtents()), cloneBounds(v.getLBounds()), v.getSourceBox())); }, [&](const fir::CharArrayBoxValue &v) { + LLVM_DEBUG(PDBGS() << "binding CharArrayBoxValue " << v << "\n"); converter.bindSymbol( sym, fir::CharArrayBoxValue(arg, cloneBound(v.getLen()), cloneBounds(v.getExtents()), cloneBounds(v.getLBounds()))); }, [&](const fir::CharBoxValue &v) { - converter.bindSymbol( - sym, fir::CharBoxValue(arg, cloneBound(v.getLen()))); + // PDB: THe problem here is that v is + // [flang-openmp-lowering]: Sym: a0, INTENT(IN) (OmpMapTo) size=24 offset=0: ObjectEntity dummy type: CHARACTER(*,1) + // [flang-openmp-lowering]: Extended Value: + // boxchar { addr: %9:2 = "hlfir.declare"(%8#0, %8#1, %7) <{fortran_attrs = #fir.var_attrs, operandSegmentSizes = array, uniq_name = "_QFFrealtest_Ea0"}> : (!fir.ref>, index, !fir.dscope) -> (!fir.boxchar<1>, !fir.ref>), len: %8:2 = "fir.unboxchar"(%arg0) : (!fir.boxchar<1>) -> (!fir.ref>, index) } + // Porblem above is that "len:" references the input to the hlfir.declare. It could get it directly from the hlfir.declare. + // PDB: start thinking from here - it looks like we'll have to map %arg after all because getting the length will still need us to access the defining op of the len, which is the unboxchar. + // Maybe we should use val which could be the hlfir.declare for the symbol. Use the len from that instead of cloning the len from the extended value. 
+ LLVM_DEBUG(PDBGS() << "binding CharBoxValue " << v << "\n"); + mlir::Value len = v.getLen(); + LLVM_DEBUG(PDBGS() << "Starting with len = v.getLen() = " << len << "\n"); + if (auto declareOp = val.getDefiningOp()) { + mlir::Value base = declareOp.getBase(); + if (auto boxCharType = mlir::dyn_cast(base.getType())) { + LLVM_DEBUG(PDBGS() << "Type of declareOp.getBase() = " << boxCharType << "\n"); + mlir::Type lenType = firOpBuilder.getCharacterLengthType(); + mlir::Type refType = firOpBuilder.getRefType(boxCharType.getEleTy()); + mlir::Location loc = converter.getCurrentLocation(); + LLVM_DEBUG(PDBGS() << "lenType = " << lenType << "\n"); + LLVM_DEBUG(PDBGS() << "refType = " << refType << "\n"); + auto unboxed = firOpBuilder.create(loc, refType, lenType, base); + len = unboxed.getResult(1); + } + } + auto charBoxValue = fir::CharBoxValue(arg, cloneBound(len)); + LLVM_DEBUG(PDBGS() << "Binding " << sym << " to " << charBoxValue << "\n"); + converter.bindSymbol(sym, charBoxValue); }, - [&](const fir::UnboxedValue &v) { converter.bindSymbol(sym, arg); }, + [&](const fir::UnboxedValue &v) { LLVM_DEBUG(PDBGS() << "binding Unboxed Value " << v << " \n");converter.bindSymbol(sym, arg); }, [&](const auto &) { TODO(converter.getCurrentLocation(), "target map clause operand unsupported type"); @@ -281,6 +347,7 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, auto bindMapLike = [&bindSingleMapLike](llvm::ArrayRef syms, + llvm::ArrayRef vars, llvm::ArrayRef args) { // Structure component symbols don't have bindings, and can only be // explicitly mapped individually. 
If a member is captured implicitly @@ -289,8 +356,8 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, llvm::copy_if(syms, std::back_inserter(processedSyms), [](auto *sym) { return !sym->owner().IsDerivedType(); }); - for (auto [sym, arg] : llvm::zip_equal(processedSyms, args)) - bindSingleMapLike(*sym, arg); + for (auto [sym, var, arg] : llvm::zip_equal(processedSyms, vars, args)) + bindSingleMapLike(*sym, var, arg); }; auto bindPrivateLike = [&converter, &firOpBuilder]( @@ -321,17 +388,17 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, // Process in clause name alphabetical order to match block arguments order. // Do not bind host_eval variables because they cannot be used inside of the // corresponding region, except for very specific cases handled separately. - bindMapLike(args.hasDeviceAddr.syms, op.getHasDeviceAddrBlockArgs()); + bindMapLike(args.hasDeviceAddr.syms, args.hasDeviceAddr.vars, op.getHasDeviceAddrBlockArgs()); bindPrivateLike(args.inReduction.syms, args.inReduction.vars, op.getInReductionBlockArgs()); - bindMapLike(args.map.syms, op.getMapBlockArgs()); + bindMapLike(args.map.syms, args.map.vars, op.getMapBlockArgs()); bindPrivateLike(args.priv.syms, args.priv.vars, op.getPrivateBlockArgs()); bindPrivateLike(args.reduction.syms, args.reduction.vars, op.getReductionBlockArgs()); bindPrivateLike(args.taskReduction.syms, args.taskReduction.vars, op.getTaskReductionBlockArgs()); - bindMapLike(args.useDeviceAddr.syms, op.getUseDeviceAddrBlockArgs()); - bindMapLike(args.useDevicePtr.syms, op.getUseDevicePtrBlockArgs()); + bindMapLike(args.useDeviceAddr.syms, args.useDeviceAddr.vars, op.getUseDeviceAddrBlockArgs()); + bindMapLike(args.useDevicePtr.syms, args.useDevicePtr.vars, op.getUseDevicePtrBlockArgs()); } /// Get the list of base values that the specified map-like variables point to. 
@@ -1357,8 +1424,13 @@ static void genBodyOfTargetOp( auto argIface = llvm::cast(*targetOp); mlir::Region &region = targetOp.getRegion(); + mlir::func::FuncOp func = targetOp->getParentOfType(); + LLVM_DEBUG(PDBGS() << "Function before genEntryBlock\n " << func << "\n\n"); mlir::Block *entryBlock = genEntryBlock(firOpBuilder, args, region); + LLVM_DEBUG(PDBGS() << "Function after genEntryBlock\n" << func << "\n\n"); + LLVM_DEBUG(PDBGS() << "entryBlock before bindEntryBlockArgs\n" << *entryBlock << "\n\n"); bindEntryBlockArgs(converter, targetOp, args); + LLVM_DEBUG(PDBGS() << "entryBlock after bindEntryBlockArgs\n" << *entryBlock << "\n\n"); if (!hostEvalInfo.empty()) hostEvalInfo.back().bindOperands(argIface.getHostEvalBlockArgs()); @@ -1368,9 +1440,26 @@ static void genBodyOfTargetOp( // lists and replace their uses with the new temporary. llvm::SetVector valuesDefinedAbove; mlir::getUsedValuesDefinedAbove(region, valuesDefinedAbove); + LLVM_DEBUG(PDBGS() << "region is \n"); + LLVM_DEBUG(printRegion(region)); + LLVM_DEBUG(PDBGS() << "valuesDefinedAbove.empty() : " << valuesDefinedAbove.empty() << "\n"); while (!valuesDefinedAbove.empty()) { for (mlir::Value val : valuesDefinedAbove) { + LLVM_DEBUG(PDBGS() << "Value defined above is \n" << val << "\n"); mlir::Operation *valOp = val.getDefiningOp(); + // if (!valOp) { + // // This means val is a blockArg. 
+ // assert(mlir::isa(val)); + // auto blockArg = llvm::cast(val); + // LLVM_DEBUG(PDBGS() << "val is a BlockArgument: Arg Number: " << blockArg.getArgNumber() << "\n"); + // mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + // firOpBuilder.setInsertionPoint(targetOp); + // // firOpBuilder.setInsertionPointAfter(valOp); + // auto copyVal = + // firOpBuilder.createTemporary(val.getLoc(), val.getType()); + // firOpBuilder.createStoreWithConvert(copyVal.getLoc(), val, copyVal); + // LLVM_DEBUG(PDBGS() << "Function after processing null valOp\n" << func << "\n\n"); + // } assert(valOp != nullptr); // NOTE: We skip BoxDimsOp's as the lesser of two evils is to map the @@ -2397,6 +2486,13 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, extractMappedBaseValues(clauseOps.hasDeviceAddrVars, hasDeviceAddrBaseValues); extractMappedBaseValues(clauseOps.mapVars, mapBaseValues); + LLVM_DEBUG(PDBGS() << "mapVars and mapBaseValues are \n"; + for (auto [mapSym, mapVar, mapBaseValue] : llvm::zip(mapSyms, clauseOps.mapVars, mapBaseValues)) { + PDBGS() << "(mapSym): " << *mapSym << "\n"; + PDBGS() << "(mapVar): " << mapVar << "\n"; + PDBGS() << "(mapBaseValue): " << mapBaseValue << "\n"; + } + ); EntryBlockArgs args; args.hasDeviceAddr.syms = hasDeviceAddrSyms; args.hasDeviceAddr.vars = hasDeviceAddrBaseValues; diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp index 3fcb4b04a7b76..1f116d2e243e6 100644 --- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp +++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp @@ -550,7 +550,51 @@ class MapInfoFinalizationPass // iterations from previous function scopes. 
localBoxAllocas.clear(); - // First, walk `omp.map.info` ops to see if any record members should be + // First, walk `omp.map.info` ops to see if any + func->walk([&](mlir::omp::MapInfoOp op) { + mlir::Value varPtr = op.getVarPtr(); + mlir::Type underlyingVarType = fir::unwrapRefType(varPtr.getType()); + if (!mlir::isa(underlyingVarType)) + return mlir::WalkResult::advance(); + + fir::CharacterType cType = mlir::cast(underlyingVarType); + if (!cType.hasDynamicLen()) + return mlir::WalkResult::advance(); + + // This means varPtr is a BlockArgument. I do not know how to get to a + // fir.boxchar<> type of mlir::Value for varPtr. So, skipping this for now. + mlir::Operation *definingOp = varPtr.getDefiningOp(); + if (!definingOp) + return mlir::WalkResult::advance(); + + if (auto declOp = mlir::dyn_cast(definingOp)) { + mlir::Value base = declOp.getBase(); + assert(mlir::isa(base.getType())); + // mlir::value unboxChar + builder.setInsertionPoint(op); + fir::BoxCharType boxCharType = mlir::cast(base.getType()); + mlir::Type idxTy = builder.getIndexType(); + mlir::Type lenType = builder.getCharacterLengthType(); + mlir::Type refType = builder.getRefType(boxCharType.getEleTy()); + mlir::Location location = op.getLoc(); + auto unboxed = builder.create(location, refType, lenType, base); + // len = unboxed.getResult(1); + mlir::Value zero = builder.createIntegerConstant(location, idxTy, 0); + mlir::Value one = builder.createIntegerConstant(location, idxTy, 1); + mlir::Value extent = unboxed.getResult(1); + mlir::Value stride = one; + mlir::Value ub = builder.create(location, extent, one); + mlir::Type boundTy = builder.getType(); + mlir::Value boundsOp = builder.create( + location, boundTy, /*lower_bound=*/zero, + /*upper_bound=*/ub, /*extent=*/extent, /*stride=*/stride, + /*stride_in_bytes = */ true, /*start_idx=*/zero); + op.getBoundsMutable().append({boundsOp}); + } + return mlir::WalkResult::advance(); + }); + + // Next, walk `omp.map.info` ops to see if any record 
members should be // implicitly mapped. func->walk([&](mlir::omp::MapInfoOp op) { mlir::Type underlyingType = >From e245e4b35b960a779ede1f3713f8d48ba0cb8654 Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Tue, 8 Apr 2025 15:56:26 -0500 Subject: [PATCH 02/10] Removed unconditional debugging prints --- flang/include/flang/Lower/DirectivesCommon.h | 22 +--- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 46 ++++---- flang/lib/Lower/OpenMP/OpenMP.cpp | 107 ++++++------------ .../Optimizer/OpenMP/MapInfoFinalization.cpp | 17 ++- 4 files changed, 69 insertions(+), 123 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index ce03b0751b56a..de9008a9010c4 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -424,30 +424,23 @@ fir::factory::AddrAndBoundsInfo gatherDataOperandAddrAndBounds( auto arrayBase = toMaybeExpr(arrayRef->base()); assert(arrayBase); - llvm::errs() << "arrayBase = "; - arrayBase.value().dump(); if (detail::getRef(*arrayBase)) { - llvm::errs() << "detail::getRef(*arrayBase)\n"; dataExv = converter.genExprAddr(operandLocation, *arrayBase, stmtCtx); info.addr = fir::getBase(dataExv); info.rawInput = info.addr; asFortran << arrayBase->AsFortran(); } else { - llvm::errs() << "ELSE -> detail::getRef(*arrayBase)\n"; const semantics::Symbol &sym = arrayRef->GetLastSymbol(); dataExvIsAssumedSize = Fortran::semantics::IsAssumedSizeArray(sym.GetUltimate()); info = getDataOperandBaseAddr(converter, builder, sym, operandLocation, unwrapFirBox); dataExv = converter.getSymbolExtendedValue(sym); - llvm::errs() << "isAssumedSizeArray? 
= " << dataExvIsAssumedSize << "\n"; - llvm::errs() << "dataExv = " << dataExv << "\n"; asFortran << sym.name().ToString(); } if (!arrayRef->subscript().empty()) { asFortran << '('; - llvm::errs() << "!arrayRef->subscript().empty()\n"; bounds = genBoundsOps( builder, operandLocation, converter, stmtCtx, arrayRef->subscript(), asFortran, dataExv, dataExvIsAssumedSize, info, treatIndexAsSection, @@ -459,10 +452,8 @@ fir::factory::AddrAndBoundsInfo gatherDataOperandAddrAndBounds( converter.genExprAddr(operandLocation, designator, stmtCtx); info.addr = fir::getBase(compExv); info.rawInput = info.addr; - llvm::errs() << "compRef = detail::getRef(designator)\n"; if (genDefaultBounds && mlir::isa(fir::unwrapRefType(info.addr.getType()))) { - llvm::errs() << "genDefaultBounds && mlir::isa(fir::unwrapRefType(info.addr.getType()))\n"; bounds = fir::factory::genBaseBoundsOps( builder, operandLocation, compExv, /*isAssumedSize=*/false, strideIncludeLowerExtent); @@ -500,7 +491,6 @@ fir::factory::AddrAndBoundsInfo gatherDataOperandAddrAndBounds( builder, operandLocation, compExv, info); } } else { - llvm::errs() << "All else\n"; if (detail::getRef(designator)) { fir::ExtendedValue compExv = converter.genExprAddr(operandLocation, designator, stmtCtx); @@ -509,26 +499,18 @@ fir::factory::AddrAndBoundsInfo gatherDataOperandAddrAndBounds( asFortran << designator.AsFortran(); } else if (auto symRef = detail::getRef(designator)) { // Scalar or full array. 
- llvm::errs() << "symRef = detail::getRef(designator)\n"; fir::ExtendedValue dataExv = converter.getSymbolExtendedValue(*symRef); info = getDataOperandBaseAddr(converter, builder, *symRef, operandLocation, unwrapFirBox); - llvm::errs() << "dataExv = " << dataExv << "\n"; - llvm::errs() << "info is \n"; - info.dump(llvm::errs()); - llvm ::errs() << "info.addr.getType()" << info.addr.getType() << "\n"; if (genDefaultBounds && mlir::isa( fir::unwrapRefType(info.addr.getType()))) { - llvm::errs() << "genDefaultBounds && mlir::isa(fir::unwrapRefType(info.addr.getType())) \n"; info.boxType = fir::unwrapRefType(info.addr.getType()); bounds = fir::factory::genBoundsOpsFromBox( builder, operandLocation, dataExv, info); - } else - llvm::errs()<< "ELSE => genDefaultBounds && mlir::isa(fir::unwrapRefType(info.addr.getType())) \n"; + } bool dataExvIsAssumedSize = Fortran::semantics::IsAssumedSizeArray(symRef->get().GetUltimate()); - llvm::errs() << "isAssumedSizeArray? = " << dataExvIsAssumedSize << "\n"; - if (genDefaultBounds && + if (genDefaultBounds && mlir::isa(fir::unwrapRefType(info.addr.getType()))) bounds = fir::factory::genBaseBoundsOps( builder, operandLocation, dataExv, dataExvIsAssumedSize, diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 73d3f2a1cfe02..6cf0fccc81b75 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1057,15 +1057,13 @@ void ClauseProcessor::processMapObjects( llvm::SmallVector bounds; std::stringstream asFortran; std::optional parentObj; - LLVM_DEBUG( - LLVM_DEBUG(PDBGS() << "Sym = " << *object.sym() << "\n"); - if (object.ref()) { - PDBGS() << "Designator = "; - object.ref().value().dump(); - PDBGS() << "\n"; - } - else - PDBGS() << "No Designator\n";); + LLVM_DEBUG(LLVM_DEBUG(PDBGS() << "Sym = " << *object.sym() << "\n"); + if (object.ref()) { + PDBGS() << "Designator = "; + object.ref().value().dump(); + PDBGS() << "\n"; + } else 
PDBGS() + << "No Designator\n";); fir::factory::AddrAndBoundsInfo info = lower::gatherDataOperandAddrAndBounds( @@ -1073,16 +1071,12 @@ void ClauseProcessor::processMapObjects( object.ref(), clauseLocation, asFortran, bounds, treatIndexAsSection); - LLVM_DEBUG( - if (bounds.empty()) - PDBGS() << "Bounds empty\n"; - else { - PDBGS() << "Bounds:\n"; - for (auto v : bounds) { - PDBGS() << v << "\n"; - } - } - ); + LLVM_DEBUG(if (bounds.empty()) PDBGS() << "Bounds empty\n"; else { + PDBGS() << "Bounds:\n"; + for (auto v : bounds) { + PDBGS() << v << "\n"; + } + }); mlir::Value baseOp = info.rawInput; if (object.sym()->owner().IsDerivedType()) { @@ -1119,13 +1113,13 @@ void ClauseProcessor::processMapObjects( baseOp.getLoc()); // mlir::Type idxTy = firOpBuilder.getIndexType(); // mlir::Value one = firOpBuilder.createIntegerConstant(location, idxTy, 1); - // mlir::Value zero = firOpBuilder.createIntegerConstant(location, idxTy, 0); - // auto normalizedLB = zero; - // auto ub = firOpBuilder.createIntegerConstant(location, idxTy, 7); - // auto extent = firOpBuilder.createIntegerConstant(location, idxTy, 8); - // auto stride = one; - // mlir::Type boundTy = firOpBuilder.getType(); - // mlir::Value boundsOp = firOpBuilder.create( + // mlir::Value zero = firOpBuilder.createIntegerConstant(location, idxTy, + // 0); auto normalizedLB = zero; auto ub = + // firOpBuilder.createIntegerConstant(location, idxTy, 7); auto extent = + // firOpBuilder.createIntegerConstant(location, idxTy, 8); auto stride = + // one; mlir::Type boundTy = + // firOpBuilder.getType(); mlir::Value boundsOp = + // firOpBuilder.create( // location, boundTy, /*lower_bound=*/normalizedLB, // /*upper_bound=*/ub, /*extent=*/extent, /*stride=*/stride, // /*stride_in_bytes = */ true, /*start_idx=*/normalizedLB); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 76748d1bd9476..aef34cc920bf8 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp 
@@ -41,24 +41,20 @@ #include "llvm/ADT/STLExtras.h" #include "llvm/Frontend/OpenMP/OMPConstants.h" -#define DEBUG_TYPE "flang-openmp-lowering" -#define PDBGS() (llvm::dbgs() << "[" << DEBUG_TYPE << "]: ") - static void printOperation(mlir::Operation *op) { llvm::dbgs() << *op << "\n"; } static void printBlock(mlir::Block &block) { llvm::dbgs() << "Block with " << block.getNumArguments() << " arguments, " - << block.getNumSuccessors() - << " successors, and " - // Note, this `.size()` is traversing a linked-list and is O(n). - << block.getOperations().size() << " operations\n"; + << block.getNumSuccessors() + << " successors, and " + // Note, this `.size()` is traversing a linked-list and is O(n). + << block.getOperations().size() << " operations\n"; for (mlir::Operation &op : block.getOperations()) printOperation(&op); } static void printRegion(mlir::Region &region) { // A region does not hold anything by itself other than a list of blocks. - llvm::dbgs() << "Region with " << region.getBlocks().size() - << " blocks:\n"; + llvm::dbgs() << "Region with " << region.getBlocks().size() << " blocks:\n"; for (mlir::Block &block : region.getBlocks()) printBlock(block); } @@ -246,18 +242,13 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, // Clones the `bounds` placing them inside the entry block and returns // them. 
auto cloneBound = [&](mlir::Value bound) { - LLVM_DEBUG(PDBGS() << "Cloning bound " << bound << "\n"); if (mlir::isMemoryEffectFree(bound.getDefiningOp())) { - if (auto unboxCharOp = mlir::dyn_cast(bound.getDefiningOp())) { - LLVM_DEBUG(PDBGS() << "Defining Op of Bound : " << unboxCharOp << "\n"); + if (auto unboxCharOp = + mlir::dyn_cast(bound.getDefiningOp())) { mlir::Operation *clonedOp = firOpBuilder.clone(*unboxCharOp); - LLVM_DEBUG(PDBGS() << "Cloned Op of Bound : " << *clonedOp << "\n"); return clonedOp->getResult(1); } - mlir::Operation *defOp = bound.getDefiningOp(); - LLVM_DEBUG(PDBGS() << "Defining Op of Bound : " << *defOp << "\n"); mlir::Operation *clonedOp = firOpBuilder.clone(*bound.getDefiningOp()); - LLVM_DEBUG(PDBGS() << "Cloned Op of Bound : " << *clonedOp << "\n"); return clonedOp->getResult(0); } TODO(converter.getCurrentLocation(), @@ -272,38 +263,29 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, }; fir::ExtendedValue extVal = converter.getSymbolExtendedValue(sym); - LLVM_DEBUG(PDBGS() << "In bindEntryBlockArgs\n"); - LLVM_DEBUG(PDBGS() << "Sym: " << sym << "\n"); - LLVM_DEBUG(PDBGS() << "Extended Value:\n " << extVal << "\n"); - LLVM_DEBUG(PDBGS() << "mapBaseValue: \n" << val << "\n"); auto refType = mlir::dyn_cast(arg.getType()); if (refType && fir::isa_builtin_cptr_type(refType.getElementType())) { - LLVM_DEBUG(PDBGS() << "binding builting_cptr_type\n"); converter.bindSymbol(sym, arg); } else { extVal.match( [&](const fir::BoxValue &v) { - LLVM_DEBUG(PDBGS() << "binding BoxValue " << v << "\n"); converter.bindSymbol(sym, fir::BoxValue(arg, cloneBounds(v.getLBounds()), v.getExplicitParameters(), v.getExplicitExtents())); }, [&](const fir::MutableBoxValue &v) { - LLVM_DEBUG(PDBGS() << "binding MutableBoxValue " << v << "\n"); converter.bindSymbol( sym, fir::MutableBoxValue(arg, cloneBounds(v.getLBounds()), v.getMutableProperties())); }, [&](const fir::ArrayBoxValue &v) { - LLVM_DEBUG(PDBGS() << "binding ArrayBoxValue " 
<< v << "\n"); converter.bindSymbol( sym, fir::ArrayBoxValue(arg, cloneBounds(v.getExtents()), cloneBounds(v.getLBounds()), v.getSourceBox())); }, [&](const fir::CharArrayBoxValue &v) { - LLVM_DEBUG(PDBGS() << "binding CharArrayBoxValue " << v << "\n"); converter.bindSymbol( sym, fir::CharArrayBoxValue(arg, cloneBound(v.getLen()), cloneBounds(v.getExtents()), @@ -311,33 +293,41 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, }, [&](const fir::CharBoxValue &v) { // PDB: THe problem here is that v is - // [flang-openmp-lowering]: Sym: a0, INTENT(IN) (OmpMapTo) size=24 offset=0: ObjectEntity dummy type: CHARACTER(*,1) + // [flang-openmp-lowering]: Sym: a0, INTENT(IN) (OmpMapTo) size=24 + // offset=0: ObjectEntity dummy type: CHARACTER(*,1) // [flang-openmp-lowering]: Extended Value: - // boxchar { addr: %9:2 = "hlfir.declare"(%8#0, %8#1, %7) <{fortran_attrs = #fir.var_attrs, operandSegmentSizes = array, uniq_name = "_QFFrealtest_Ea0"}> : (!fir.ref>, index, !fir.dscope) -> (!fir.boxchar<1>, !fir.ref>), len: %8:2 = "fir.unboxchar"(%arg0) : (!fir.boxchar<1>) -> (!fir.ref>, index) } - // Porblem above is that "len:" references the input to the hlfir.declare. It could get it directly from the hlfir.declare. - // PDB: start thinking from here - it looks like we'll have to map %arg after all because getting the length will still need us to access the defining op of the len, which is the unboxchar. - // Maybe we should use val which could be the hlfir.declare for the symbol. Use the len from that instead of cloning the len from the extended value. 
- LLVM_DEBUG(PDBGS() << "binding CharBoxValue " << v << "\n"); + // boxchar { addr: %9:2 = "hlfir.declare"(%8#0, %8#1, %7) + // <{fortran_attrs = #fir.var_attrs, operandSegmentSizes + // = array, uniq_name = "_QFFrealtest_Ea0"}> : + // (!fir.ref>, index, !fir.dscope) -> + // (!fir.boxchar<1>, !fir.ref>), len: %8:2 = + // "fir.unboxchar"(%arg0) : (!fir.boxchar<1>) -> + // (!fir.ref>, index) } Porblem above is that "len:" + // references the input to the hlfir.declare. It could get it + // directly from the hlfir.declare. PDB: start thinking from here - + // it looks like we'll have to map %arg after all because getting + // the length will still need us to access the defining op of the + // len, which is the unboxchar. Maybe we should use val which could + // be the hlfir.declare for the symbol. Use the len from that + // instead of cloning the len from the extended value. mlir::Value len = v.getLen(); - LLVM_DEBUG(PDBGS() << "Starting with len = v.getLen() = " << len << "\n"); if (auto declareOp = val.getDefiningOp()) { mlir::Value base = declareOp.getBase(); - if (auto boxCharType = mlir::dyn_cast(base.getType())) { - LLVM_DEBUG(PDBGS() << "Type of declareOp.getBase() = " << boxCharType << "\n"); + if (auto boxCharType = + mlir::dyn_cast(base.getType())) { mlir::Type lenType = firOpBuilder.getCharacterLengthType(); - mlir::Type refType = firOpBuilder.getRefType(boxCharType.getEleTy()); + mlir::Type refType = + firOpBuilder.getRefType(boxCharType.getEleTy()); mlir::Location loc = converter.getCurrentLocation(); - LLVM_DEBUG(PDBGS() << "lenType = " << lenType << "\n"); - LLVM_DEBUG(PDBGS() << "refType = " << refType << "\n"); - auto unboxed = firOpBuilder.create(loc, refType, lenType, base); + auto unboxed = firOpBuilder.create( + loc, refType, lenType, base); len = unboxed.getResult(1); } } auto charBoxValue = fir::CharBoxValue(arg, cloneBound(len)); - LLVM_DEBUG(PDBGS() << "Binding " << sym << " to " << charBoxValue << "\n"); converter.bindSymbol(sym, 
charBoxValue); }, - [&](const fir::UnboxedValue &v) { LLVM_DEBUG(PDBGS() << "binding Unboxed Value " << v << " \n");converter.bindSymbol(sym, arg); }, + [&](const fir::UnboxedValue &v) { converter.bindSymbol(sym, arg); }, [&](const auto &) { TODO(converter.getCurrentLocation(), "target map clause operand unsupported type"); @@ -388,7 +378,8 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, // Process in clause name alphabetical order to match block arguments order. // Do not bind host_eval variables because they cannot be used inside of the // corresponding region, except for very specific cases handled separately. - bindMapLike(args.hasDeviceAddr.syms, args.hasDeviceAddr.vars, op.getHasDeviceAddrBlockArgs()); + bindMapLike(args.hasDeviceAddr.syms, args.hasDeviceAddr.vars, + op.getHasDeviceAddrBlockArgs()); bindPrivateLike(args.inReduction.syms, args.inReduction.vars, op.getInReductionBlockArgs()); bindMapLike(args.map.syms, args.map.vars, op.getMapBlockArgs()); @@ -397,8 +388,10 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, op.getReductionBlockArgs()); bindPrivateLike(args.taskReduction.syms, args.taskReduction.vars, op.getTaskReductionBlockArgs()); - bindMapLike(args.useDeviceAddr.syms, args.useDeviceAddr.vars, op.getUseDeviceAddrBlockArgs()); - bindMapLike(args.useDevicePtr.syms, args.useDevicePtr.vars, op.getUseDevicePtrBlockArgs()); + bindMapLike(args.useDeviceAddr.syms, args.useDeviceAddr.vars, + op.getUseDeviceAddrBlockArgs()); + bindMapLike(args.useDevicePtr.syms, args.useDevicePtr.vars, + op.getUseDevicePtrBlockArgs()); } /// Get the list of base values that the specified map-like variables point to. 
@@ -1425,12 +1418,8 @@ static void genBodyOfTargetOp( mlir::Region &region = targetOp.getRegion(); mlir::func::FuncOp func = targetOp->getParentOfType(); - LLVM_DEBUG(PDBGS() << "Function before genEntryBlock\n " << func << "\n\n"); mlir::Block *entryBlock = genEntryBlock(firOpBuilder, args, region); - LLVM_DEBUG(PDBGS() << "Function after genEntryBlock\n" << func << "\n\n"); - LLVM_DEBUG(PDBGS() << "entryBlock before bindEntryBlockArgs\n" << *entryBlock << "\n\n"); bindEntryBlockArgs(converter, targetOp, args); - LLVM_DEBUG(PDBGS() << "entryBlock after bindEntryBlockArgs\n" << *entryBlock << "\n\n"); if (!hostEvalInfo.empty()) hostEvalInfo.back().bindOperands(argIface.getHostEvalBlockArgs()); @@ -1440,26 +1429,9 @@ static void genBodyOfTargetOp( // lists and replace their uses with the new temporary. llvm::SetVector valuesDefinedAbove; mlir::getUsedValuesDefinedAbove(region, valuesDefinedAbove); - LLVM_DEBUG(PDBGS() << "region is \n"); - LLVM_DEBUG(printRegion(region)); - LLVM_DEBUG(PDBGS() << "valuesDefinedAbove.empty() : " << valuesDefinedAbove.empty() << "\n"); while (!valuesDefinedAbove.empty()) { for (mlir::Value val : valuesDefinedAbove) { - LLVM_DEBUG(PDBGS() << "Value defined above is \n" << val << "\n"); mlir::Operation *valOp = val.getDefiningOp(); - // if (!valOp) { - // // This means val is a blockArg. 
- // assert(mlir::isa(val)); - // auto blockArg = llvm::cast(val); - // LLVM_DEBUG(PDBGS() << "val is a BlockArgument: Arg Number: " << blockArg.getArgNumber() << "\n"); - // mlir::OpBuilder::InsertionGuard guard(firOpBuilder); - // firOpBuilder.setInsertionPoint(targetOp); - // // firOpBuilder.setInsertionPointAfter(valOp); - // auto copyVal = - // firOpBuilder.createTemporary(val.getLoc(), val.getType()); - // firOpBuilder.createStoreWithConvert(copyVal.getLoc(), val, copyVal); - // LLVM_DEBUG(PDBGS() << "Function after processing null valOp\n" << func << "\n\n"); - // } assert(valOp != nullptr); // NOTE: We skip BoxDimsOp's as the lesser of two evils is to map the @@ -2486,13 +2458,6 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, extractMappedBaseValues(clauseOps.hasDeviceAddrVars, hasDeviceAddrBaseValues); extractMappedBaseValues(clauseOps.mapVars, mapBaseValues); - LLVM_DEBUG(PDBGS() << "mapVars and mapBaseValues are \n"; - for (auto [mapSym, mapVar, mapBaseValue] : llvm::zip(mapSyms, clauseOps.mapVars, mapBaseValues)) { - PDBGS() << "(mapSym): " << *mapSym << "\n"; - PDBGS() << "(mapVar): " << mapVar << "\n"; - PDBGS() << "(mapBaseValue): " << mapBaseValue << "\n"; - } - ); EntryBlockArgs args; args.hasDeviceAddr.syms = hasDeviceAddrSyms; args.hasDeviceAddr.vars = hasDeviceAddrBaseValues; diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp index 1f116d2e243e6..7cb0f63ba9b69 100644 --- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp +++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp @@ -550,19 +550,21 @@ class MapInfoFinalizationPass // iterations from previous function scopes. 
localBoxAllocas.clear(); - // First, walk `omp.map.info` ops to see if any + // First, walk `omp.map.info` ops to see if any func->walk([&](mlir::omp::MapInfoOp op) { mlir::Value varPtr = op.getVarPtr(); mlir::Type underlyingVarType = fir::unwrapRefType(varPtr.getType()); if (!mlir::isa(underlyingVarType)) return mlir::WalkResult::advance(); - fir::CharacterType cType = mlir::cast(underlyingVarType); + fir::CharacterType cType = + mlir::cast(underlyingVarType); if (!cType.hasDynamicLen()) return mlir::WalkResult::advance(); // This means varPtr is a BlockArgument. I do not know how to get to a - // fir.boxchar<> type of mlir::Value for varPtr. So, skipping this for now. + // fir.boxchar<> type of mlir::Value for varPtr. So, skipping this for + // now. mlir::Operation *definingOp = varPtr.getDefiningOp(); if (!definingOp) return mlir::WalkResult::advance(); @@ -572,18 +574,21 @@ class MapInfoFinalizationPass assert(mlir::isa(base.getType())); // mlir::value unboxChar builder.setInsertionPoint(op); - fir::BoxCharType boxCharType = mlir::cast(base.getType()); + fir::BoxCharType boxCharType = + mlir::cast(base.getType()); mlir::Type idxTy = builder.getIndexType(); mlir::Type lenType = builder.getCharacterLengthType(); mlir::Type refType = builder.getRefType(boxCharType.getEleTy()); mlir::Location location = op.getLoc(); - auto unboxed = builder.create(location, refType, lenType, base); + auto unboxed = builder.create(location, refType, + lenType, base); // len = unboxed.getResult(1); mlir::Value zero = builder.createIntegerConstant(location, idxTy, 0); mlir::Value one = builder.createIntegerConstant(location, idxTy, 1); mlir::Value extent = unboxed.getResult(1); mlir::Value stride = one; - mlir::Value ub = builder.create(location, extent, one); + mlir::Value ub = + builder.create(location, extent, one); mlir::Type boundTy = builder.getType(); mlir::Value boundsOp = builder.create( location, boundTy, /*lower_bound=*/zero, >From 79054de361a9e890eccbb401c66a2a38a94d7311 
Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Tue, 8 Apr 2025 17:07:30 -0500 Subject: [PATCH 03/10] More cleanup --- flang/include/flang/Lower/DirectivesCommon.h | 3 +- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 31 -------------------- 2 files changed, 2 insertions(+), 32 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index de9008a9010c4..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -424,6 +424,7 @@ fir::factory::AddrAndBoundsInfo gatherDataOperandAddrAndBounds( auto arrayBase = toMaybeExpr(arrayRef->base()); assert(arrayBase); + if (detail::getRef(*arrayBase)) { dataExv = converter.genExprAddr(operandLocation, *arrayBase, stmtCtx); info.addr = fir::getBase(dataExv); @@ -453,7 +454,7 @@ fir::factory::AddrAndBoundsInfo gatherDataOperandAddrAndBounds( info.addr = fir::getBase(compExv); info.rawInput = info.addr; if (genDefaultBounds && - mlir::isa(fir::unwrapRefType(info.addr.getType()))) { + mlir::isa(fir::unwrapRefType(info.addr.getType()))) bounds = fir::factory::genBaseBoundsOps( builder, operandLocation, compExv, /*isAssumedSize=*/false, strideIncludeLowerExtent); diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 6cf0fccc81b75..6322e7c0c5b41 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -21,9 +21,6 @@ #include "llvm/Frontend/OpenMP/OMP.h.inc" #include "llvm/Frontend/OpenMP/OMPIRBuilder.h" -#define DEBUG_TYPE "flang-openmp-lowering" -#define PDBGS() (llvm::dbgs() << "[" << DEBUG_TYPE << "]: ") - namespace Fortran { namespace lower { namespace omp { @@ -1057,13 +1054,6 @@ void ClauseProcessor::processMapObjects( llvm::SmallVector bounds; std::stringstream asFortran; std::optional parentObj; - LLVM_DEBUG(LLVM_DEBUG(PDBGS() << "Sym = " << *object.sym() << "\n"); - if (object.ref()) { - 
PDBGS() << "Designator = "; - object.ref().value().dump(); - PDBGS() << "\n"; - } else PDBGS() - << "No Designator\n";); fir::factory::AddrAndBoundsInfo info = lower::gatherDataOperandAddrAndBounds( @@ -1071,13 +1061,6 @@ void ClauseProcessor::processMapObjects( object.ref(), clauseLocation, asFortran, bounds, treatIndexAsSection); - LLVM_DEBUG(if (bounds.empty()) PDBGS() << "Bounds empty\n"; else { - PDBGS() << "Bounds:\n"; - for (auto v : bounds) { - PDBGS() << v << "\n"; - } - }); - mlir::Value baseOp = info.rawInput; if (object.sym()->owner().IsDerivedType()) { omp::ObjectList objectList = gatherObjectsOf(object, semaCtx); @@ -1111,20 +1094,6 @@ void ClauseProcessor::processMapObjects( auto location = mlir::NameLoc::get( mlir::StringAttr::get(firOpBuilder.getContext(), asFortran.str()), baseOp.getLoc()); - // mlir::Type idxTy = firOpBuilder.getIndexType(); - // mlir::Value one = firOpBuilder.createIntegerConstant(location, idxTy, 1); - // mlir::Value zero = firOpBuilder.createIntegerConstant(location, idxTy, - // 0); auto normalizedLB = zero; auto ub = - // firOpBuilder.createIntegerConstant(location, idxTy, 7); auto extent = - // firOpBuilder.createIntegerConstant(location, idxTy, 8); auto stride = - // one; mlir::Type boundTy = - // firOpBuilder.getType(); mlir::Value boundsOp = - // firOpBuilder.create( - // location, boundTy, /*lower_bound=*/normalizedLB, - // /*upper_bound=*/ub, /*extent=*/extent, /*stride=*/stride, - // /*stride_in_bytes = */ true, /*start_idx=*/normalizedLB); - // bounds.push_back(boundsOp); - // LLVM_DEBUG(PDBGS() << "Created bounds " << boundsOp << "\n"); mlir::omp::MapInfoOp mapOp = createMapInfoOp( firOpBuilder, location, baseOp, /*varPtrPtr=*/mlir::Value{}, asFortran.str(), bounds, >From ed1f64160be37dd7db98c95c521ac79e30cb9189 Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Tue, 8 Apr 2025 17:18:17 -0500 Subject: [PATCH 04/10] add a testcase --- flang/test/Lower/OpenMP/map-character.f90 | 47 +++++++++++++++++++++++ 1 file 
changed, 47 insertions(+) create mode 100644 flang/test/Lower/OpenMP/map-character.f90 diff --git a/flang/test/Lower/OpenMP/map-character.f90 b/flang/test/Lower/OpenMP/map-character.f90 new file mode 100644 index 0000000000000..2ed2397713b5d --- /dev/null +++ b/flang/test/Lower/OpenMP/map-character.f90 @@ -0,0 +1,47 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +subroutine TestOfCharacter(a0, a1, l) + character(len=*), intent(in) :: a0 + character(len=*), intent(inout):: a1 + integer, intent(in) :: l + + !$omp target map(to:a0) map(from: a1) + a1 = a0 + !$omp end target +end subroutine TestOfCharacter + + +!CHECK: %[[A1_BOXCHAR_ALLOCA:.*]] = fir.alloca !fir.boxchar<1> +!CHECK: %[[A0_BOXCHAR_ALLOCA:.*]] = fir.alloca !fir.boxchar<1> +!CHECK: %[[UNBOXED_ARG0:.*]]:2 = fir.unboxchar %arg0 : (!fir.boxchar<1>) -> (!fir.ref>, index) +!CHECK: %[[A0_DECL:.*]]:2 = hlfir.declare %[[UNBOXED_ARG0]]#0 typeparams %[[UNBOXED_ARG0]]#1 dummy_scope {{.*}} -> (!fir.boxchar<1>, !fir.ref>) +!CHECK: fir.store %[[A0_DECL]]#0 to %[[A0_BOXCHAR_ALLOCA]] : !fir.ref> +!CHECK: %[[UNBOXED_ARG1:.*]]:2 = fir.unboxchar %arg1 : (!fir.boxchar<1>) -> (!fir.ref>, index) +!CHECK: %[[A1_DECL:.*]]:2 = hlfir.declare %[[UNBOXED_ARG1]]#0 typeparams %[[UNBOXED_ARG1]]#1 dummy_scope {{.*}} -> (!fir.boxchar<1>, !fir.ref>) +!CHECK: fir.store %[[A1_DECL]]#0 to %[[A1_BOXCHAR_ALLOCA]] : !fir.ref> +!CHECK: %[[UNBOXED_A0_DECL:.*]]:2 = fir.unboxchar %[[A0_DECL]]#0 : (!fir.boxchar<1>) -> (!fir.ref>, index) +!CHECK: %[[A0_LB:.*]] = arith.constant 0 : index +!CHECK: %[[A0_STRIDE:.*]] = arith.constant 1 : index +!CHECK: %[[A0_UB:.*]] = arith.subi %[[UNBOXED_A0_DECL]]#1, %[[A0_STRIDE]] : index +!CHECK: %[[A0_BOUNDS:.*]] = omp.map.bounds lower_bound(%[[A0_LB]] : index) upper_bound(%[[A0_UB]] : index) extent(%[[UNBOXED_A0_DECL]]#1 : index) +!CHECK-SAME: stride(%[[A0_STRIDE]] : index) start_idx(%[[A0_LB]] : index) {stride_in_bytes = true} +!CHECK: %[[A0_MAP:.*]] = omp.map.info var_ptr(%[[A0_DECL]]#1 : 
!fir.ref>, !fir.char<1,?>) map_clauses(to) capture(ByRef) bounds(%[[A0_BOUNDS]]) -> !fir.ref> {name = "a0"} +!CHECK: %[[UNBOXED_A1_DECL:.*]]:2 = fir.unboxchar %[[A1_DECL]]#0 : (!fir.boxchar<1>) -> (!fir.ref>, index) +!CHECK: %[[A1_LB:.*]] = arith.constant 0 : index +!CHECK: %[[A1_STRIDE:.*]] = arith.constant 1 : index +!CHECK: %[[A1_UB:.*]] = arith.subi %[[UNBOXED_A1_DECL]]#1, %[[A1_STRIDE]] : index +!CHECK: %[[A1_BOUNDS:.*]] = omp.map.bounds lower_bound(%[[A1_LB]] : index) upper_bound(%[[A1_UB]] : index) extent(%[[UNBOXED_A1_DECL]]#1 : index) +!CHECKL-SAME: stride(%[[A1_STRIDE]] : index) start_idx(%[[A1_LB]] : index) {stride_in_bytes = true} +!CHECK: %[[A1_MAP:.*]] = omp.map.info var_ptr(%[[A1_DECL]]#1 : !fir.ref>, !fir.char<1,?>) map_clauses(from) capture(ByRef) bounds(%[[A1_BOUNDS]]) -> !fir.ref> {name = "a1"} + +!CHECK: %[[A0_BOXCHAR_MAP:.*]] = omp.map.info var_ptr(%[[A0_BOXCHAR_ALLOCA]] : !fir.ref>, !fir.boxchar<1>) map_clauses(implicit, to) capture(ByRef) -> !fir.ref> {name = ""} +!CHECK: %[[A1_BOXCHAR_MAP:.*]] = omp.map.info var_ptr(%[[A1_BOXCHAR_ALLOCA]] : !fir.ref>, !fir.boxchar<1>) map_clauses(implicit, to) capture(ByRef) -> !fir.ref> {name = ""} + +!CHECK: omp.target map_entries(%[[A0_MAP]] -> %[[TGT_A0:.*]], %[[A1_MAP]] -> %[[TGT_A1:.*]], %[[A0_BOXCHAR_MAP]] -> %[[TGT_A0_BOXCHAR:.*]], %[[A1_BOXCHAR_MAP]] -> %[[TGT_A1_BOXCHAR:.*]] : !fir.ref>, !fir.ref>, !fir.ref>, !fir.ref>) { +!CHECK: %[[TGT_A1_BC_LD:.*]] = fir.load %[[TGT_A1_BOXCHAR]] : !fir.ref> +!CHECK: %[[TGT_A0_BC_LD:.*]] = fir.load %[[TGT_A0_BOXCHAR]] : !fir.ref> +!CHECK: %[[UNBOXED_TGT_A0:.*]]:2 = fir.unboxchar %[[TGT_A0_BC_LD]] : (!fir.boxchar<1>) -> (!fir.ref>, index) +!CHECK: %[[TGT_A0_DECL:.*]]:2 = hlfir.declare %[[TGT_A0]] typeparams %[[UNBOXED_TGT_A0]]#1 {{.*}} -> (!fir.boxchar<1>, !fir.ref>) +!CHECK: %[[UNBOXED_TGT_A1:.*]]:2 = fir.unboxchar %[[TGT_A1_BC_LD]] : (!fir.boxchar<1>) -> (!fir.ref>, index) +!CHECK: %[[TGT_A1_DECL:.*]]:2 = hlfir.declare %[[TGT_A1]] typeparams 
%[[UNBOXED_TGT_A1]]#1 {{.*}} -> (!fir.boxchar<1>, !fir.ref>) + >From 715baf491e4f9e3456aa12a4f2b2b6ce6aa03fba Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Tue, 8 Apr 2025 17:18:50 -0500 Subject: [PATCH 05/10] clean up som more --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 2 ++ flang/lib/Lower/OpenMP/OpenMP.cpp | 18 ------------------ 2 files changed, 2 insertions(+), 18 deletions(-) diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 6322e7c0c5b41..77b4622547d7a 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1054,6 +1054,7 @@ void ClauseProcessor::processMapObjects( llvm::SmallVector bounds; std::stringstream asFortran; std::optional parentObj; + fir::factory::AddrAndBoundsInfo info = lower::gatherDataOperandAddrAndBounds( @@ -1088,6 +1089,7 @@ void ClauseProcessor::processMapObjects( mapperIdName) : mlir::FlatSymbolRefAttr(); } + // Explicit map captures are captured ByRef by default, // optimisation passes may alter this to ByCopy or other capture // types to optimise diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index aef34cc920bf8..86974dea8f758 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -41,24 +41,6 @@ #include "llvm/ADT/STLExtras.h" #include "llvm/Frontend/OpenMP/OMPConstants.h" -static void printOperation(mlir::Operation *op) { llvm::dbgs() << *op << "\n"; } -static void printBlock(mlir::Block &block) { - llvm::dbgs() << "Block with " << block.getNumArguments() << " arguments, " - << block.getNumSuccessors() - << " successors, and " - // Note, this `.size()` is traversing a linked-list and is O(n). 
- << block.getOperations().size() << " operations\n"; - for (mlir::Operation &op : block.getOperations()) - printOperation(&op); -} - -static void printRegion(mlir::Region &region) { - // A region does not hold anything by itself other than a list of blocks. - llvm::dbgs() << "Region with " << region.getBlocks().size() << " blocks:\n"; - for (mlir::Block &block : region.getBlocks()) - printBlock(block); -} - using namespace Fortran::lower::omp; using namespace Fortran::common::openmp; >From f142c969d59ad9a2e1b389ffc1df054f21c77c8f Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Tue, 8 Apr 2025 17:38:59 -0500 Subject: [PATCH 06/10] clean up flang/lib/Lower/OpenMP/OpenMP.cpp --- flang/lib/Lower/OpenMP/OpenMP.cpp | 60 +++++++++++++++++++------------ 1 file changed, 37 insertions(+), 23 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 86974dea8f758..39c5f82f0101b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -225,12 +225,12 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, // them. auto cloneBound = [&](mlir::Value bound) { if (mlir::isMemoryEffectFree(bound.getDefiningOp())) { - if (auto unboxCharOp = - mlir::dyn_cast(bound.getDefiningOp())) { - mlir::Operation *clonedOp = firOpBuilder.clone(*unboxCharOp); + mlir::Operation *definingOp = bound.getDefiningOp(); + mlir::Operation *clonedOp = firOpBuilder.clone(*definingOp); + // Todo: Do we need to check for more operation types? 
+ // For now, specializing only for fir::UnboxCharOp + if (auto unboxCharOp = mlir::dyn_cast(definingOp)) return clonedOp->getResult(1); - } - mlir::Operation *clonedOp = firOpBuilder.clone(*bound.getDefiningOp()); return clonedOp->getResult(0); } TODO(converter.getCurrentLocation(), @@ -274,24 +274,38 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, cloneBounds(v.getLBounds()))); }, [&](const fir::CharBoxValue &v) { - // PDB: THe problem here is that v is - // [flang-openmp-lowering]: Sym: a0, INTENT(IN) (OmpMapTo) size=24 - // offset=0: ObjectEntity dummy type: CHARACTER(*,1) - // [flang-openmp-lowering]: Extended Value: - // boxchar { addr: %9:2 = "hlfir.declare"(%8#0, %8#1, %7) - // <{fortran_attrs = #fir.var_attrs, operandSegmentSizes - // = array, uniq_name = "_QFFrealtest_Ea0"}> : - // (!fir.ref>, index, !fir.dscope) -> - // (!fir.boxchar<1>, !fir.ref>), len: %8:2 = - // "fir.unboxchar"(%arg0) : (!fir.boxchar<1>) -> - // (!fir.ref>, index) } Porblem above is that "len:" - // references the input to the hlfir.declare. It could get it - // directly from the hlfir.declare. PDB: start thinking from here - - // it looks like we'll have to map %arg after all because getting - // the length will still need us to access the defining op of the - // len, which is the unboxchar. Maybe we should use val which could - // be the hlfir.declare for the symbol. Use the len from that - // instead of cloning the len from the extended value. + // In some cases, v.len could reference the input the hlfir.declare + // which is the corresponding v.addr. While, this isn't a big + // problem by itself, it is desirable to extract this out of v.addr + // itself since it's first result will be of type fir.boxchar<>. 
For + // example, consider the following + // + // func.func private @_QFPrealtest(%arg0: !fir.boxchar<1>) + // %2 = fir.dummy_scope : !fir.dscope + // %3:2 = fir.unboxchar %arg0 : (!fir.boxchar<1>) -> + // (!fir.ref>, index) %4:2 = hlfir.declare %3#0 + // typeparams %3#1 dummy_scope %2 : (!fir.ref>, + // index, + // !fir.dscope) -> (!fir.boxchar<1>, + // !fir.ref>) + + // In the case above, + // v.addr is + // %4:2 = hlfir.declare %3#0 typeparams %3#1 dummy_scope %2 : + // (!fir.ref>, index, + // !fir.dscope) -> (!fir.boxchar<1>, + // !fir.ref>) + // v.len is + // %3:2 = fir.unboxchar %arg0 : (!fir.boxchar<1>) -> + // (!fir.ref>, index) + + // Mapping this to the target will create a use of %arg0 on the + // target. Since omp.target is IsolatedFromAbove, this will have to + // be mapped. Presently, OpenMP lowering of target barfs when it has + // to map a value that doesnt have a defining op. This can be fixed. + // Or we ensure that v.len = fir.unboxchar %4#0. Now if %4:2 is + // mapped to the target, there wont be any use of the block argument + // %arg0 on the target. 
mlir::Value len = v.getLen(); if (auto declareOp = val.getDefiningOp()) { mlir::Value base = declareOp.getBase(); >From 06874f5b341ab97694b565c47b9adaab1f91a591 Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Tue, 8 Apr 2025 17:45:38 -0500 Subject: [PATCH 07/10] clean up a large comment in flang/lib/Lower/OpenMP/OpenMP.cpp --- flang/lib/Lower/OpenMP/OpenMP.cpp | 38 +++++++++++++++---------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 39c5f82f0101b..97331bcbac258 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -274,38 +274,38 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, cloneBounds(v.getLBounds()))); }, [&](const fir::CharBoxValue &v) { - // In some cases, v.len could reference the input the hlfir.declare - // which is the corresponding v.addr. While, this isn't a big - // problem by itself, it is desirable to extract this out of v.addr - // itself since it's first result will be of type fir.boxchar<>. For - // example, consider the following + // In some cases, v.len could reference the input to the + // hlfir.declare which is the corresponding v.addr. While this isn't + // a big problem by itself, it is desirable to extract this out of + // v.addr itself since it's first result will be of type + // fir.boxchar<>. 
For example, consider the following // // func.func private @_QFPrealtest(%arg0: !fir.boxchar<1>) // %2 = fir.dummy_scope : !fir.dscope // %3:2 = fir.unboxchar %arg0 : (!fir.boxchar<1>) -> - // (!fir.ref>, index) %4:2 = hlfir.declare %3#0 - // typeparams %3#1 dummy_scope %2 : (!fir.ref>, - // index, - // !fir.dscope) -> (!fir.boxchar<1>, - // !fir.ref>) + // (!fir.ref>, index) + // %4:2 = hlfir.declare (%3#0, %3#1, %2):(!fir.ref>, + // index,!fir.dscope) -> + // (!fir.boxchar<1>, !fir.ref>) // In the case above, // v.addr is - // %4:2 = hlfir.declare %3#0 typeparams %3#1 dummy_scope %2 : - // (!fir.ref>, index, - // !fir.dscope) -> (!fir.boxchar<1>, - // !fir.ref>) + // %4:2 = hlfir.declare (%3#0, %3#1, %2):(!fir.ref>, + // index,!fir.dscope) -> + // (!fir.boxchar<1>, !fir.ref>) // v.len is // %3:2 = fir.unboxchar %arg0 : (!fir.boxchar<1>) -> - // (!fir.ref>, index) + // (!fir.ref>, index) // Mapping this to the target will create a use of %arg0 on the - // target. Since omp.target is IsolatedFromAbove, this will have to + // target. Since omp.target is IsolatedFromAbove, %arg0 will have to // be mapped. Presently, OpenMP lowering of target barfs when it has // to map a value that doesnt have a defining op. This can be fixed. - // Or we ensure that v.len = fir.unboxchar %4#0. Now if %4:2 is - // mapped to the target, there wont be any use of the block argument - // %arg0 on the target. + // Or we ensure that v.len is fir.unboxchar %4#0 which will + // cause %4#1 to be used on the target and consequently be + // mapped to the target. As such then, there wont be any use of the + // block argument %arg0 on the target. 
+ mlir::Value len = v.getLen(); if (auto declareOp = val.getDefiningOp()) { mlir::Value base = declareOp.getBase(); >From 68993c5398c2f73e53ae2ff869dc8bcdfa89c665 Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Tue, 8 Apr 2025 17:47:35 -0500 Subject: [PATCH 08/10] Remove an unused variable from genBodyofTargetOp in flang/lib/Lower/OpenMP/OpenMP.cpp --- flang/lib/Lower/OpenMP/OpenMP.cpp | 1 - 1 file changed, 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 97331bcbac258..35226ba1332c3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1413,7 +1413,6 @@ static void genBodyOfTargetOp( auto argIface = llvm::cast(*targetOp); mlir::Region &region = targetOp.getRegion(); - mlir::func::FuncOp func = targetOp->getParentOfType(); mlir::Block *entryBlock = genEntryBlock(firOpBuilder, args, region); bindEntryBlockArgs(converter, targetOp, args); if (!hostEvalInfo.empty()) >From 703d7f305b9433dd6dd3280a71ef8e6dd4d7319a Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Tue, 8 Apr 2025 21:13:14 -0500 Subject: [PATCH 09/10] Update a comment in MapInfoFinalizationPass and also add one more check that will advance the walk in case the MapInfoOp already has bounds --- flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp index 7cb0f63ba9b69..d228507d2d771 100644 --- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp +++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp @@ -550,7 +550,9 @@ class MapInfoFinalizationPass // iterations from previous function scopes. localBoxAllocas.clear(); - // First, walk `omp.map.info` ops to see if any + // First, walk `omp.map.info` ops to see if any of them have varPtrs + // with an underlying type of fir.char, i.e a character + // with dynamic length.
If so, check if they need bounds added. func->walk([&](mlir::omp::MapInfoOp op) { mlir::Value varPtr = op.getVarPtr(); mlir::Type underlyingVarType = fir::unwrapRefType(varPtr.getType()); @@ -562,6 +564,8 @@ class MapInfoFinalizationPass if (!cType.hasDynamicLen()) return mlir::WalkResult::advance(); + if (!op.getBounds().empty()) + return mlir::WalkResult::advance(); // This means varPtr is a BlockArgument. I do not know how to get to a // fir.boxchar<> type of mlir::Value for varPtr. So, skipping this for // now. >From a482741cfc925be3a3ae7562cd295e7297c096eb Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Fri, 25 Apr 2025 17:07:42 -0500 Subject: [PATCH 10/10] [Flang][mlir] - In fortran lowering, handle block arguments defined outside omp target This patch handles block arguments that are live-in into the target region of an omp.target op. This is done by simply reusing the mapping mechanism in place for values that are not block arguments. Further, in MapInfoFinalizationPass, this patch adds bounds to maps that map `!fir.ref>` types. Also, we don't clone bounds when binding entry block arguments any more. 
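As a rough illustration of the bounds mentioned in the last paragraph of this commit message, here is a minimal standalone C++ sketch. It assumes, going by the genBoundsOpFromBoxChar helper that this patch adds, that a character of runtime length len is mapped with lower bound 0, upper bound len - 1, extent len and stride 1. The names CharMapBounds and boundsForCharLength are hypothetical and exist only for this sketch; they are not flang or MLIR APIs, and the real code builds an omp.map.bounds op instead of filling a struct.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical plain-data stand-in for the operands of omp.map.bounds.
struct CharMapBounds {
  std::int64_t lowerBound;
  std::int64_t upperBound;
  std::int64_t extent;
  std::int64_t stride;
  std::int64_t startIdx;
  bool strideInBytes;
};

// Illustrative helper (an assumption of this sketch, not a flang function):
// derive the bounds of a dynamic-length character from its runtime length,
// in the same order the patch computes them (zero, one, extent, extent - one).
CharMapBounds boundsForCharLength(std::int64_t len) {
  assert(len >= 0 && "a character length is never negative");
  const std::int64_t zero = 0;
  const std::int64_t one = 1;
  return CharMapBounds{/*lowerBound=*/zero,
                       /*upperBound=*/len - one,
                       /*extent=*/len,
                       /*stride=*/one,
                       /*startIdx=*/zero,
                       /*strideInBytes=*/true};
}
```

For a CHARACTER(*) dummy whose actual length is 5, boundsForCharLength(5) gives the index range 0..4 with extent 5. For a length of 0 the same formula gives an upper bound of -1; the quoted diff does not show any special handling for that case.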
--- .../Optimizer/Builder/DirectivesCommon.h | 31 +++++ flang/lib/Lower/OpenMP/OpenMP.cpp | 128 ++++-------------- .../Optimizer/OpenMP/MapInfoFinalization.cpp | 63 +++------ flang/test/Lower/OpenMP/map-character.f90 | 15 +- 4 files changed, 87 insertions(+), 150 deletions(-) diff --git a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h index 8684299ab6792..864b8938889f8 100644 --- a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h +++ b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h @@ -17,6 +17,8 @@ #ifndef FORTRAN_OPTIMIZER_BUILDER_DIRECTIVESCOMMON_H_ #define FORTRAN_OPTIMIZER_BUILDER_DIRECTIVESCOMMON_H_ +#include "BoxValue.h" +#include "FIRBuilder.h" #include "flang/Optimizer/Builder/BoxValue.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/Todo.h" @@ -131,6 +133,31 @@ gatherBoundsOrBoundValues(fir::FirOpBuilder &builder, mlir::Location loc, } return values; } +template +mlir::Value +genBoundsOpFromBoxChar(fir::FirOpBuilder &builder, mlir::Location loc, + fir::ExtendedValue dataExv, AddrAndBoundsInfo &info) { + // TODO: Handle info.isPresent. 
+ if (auto boxCharType = + mlir::dyn_cast(info.addr.getType())) { + mlir::Type idxTy = builder.getIndexType(); + mlir::Type lenType = builder.getCharacterLengthType(); + mlir::Type refType = builder.getRefType(boxCharType.getEleTy()); + auto unboxed = + builder.create(loc, refType, lenType, info.addr); + mlir::Value zero = builder.createIntegerConstant(loc, idxTy, 0); + mlir::Value one = builder.createIntegerConstant(loc, idxTy, 1); + mlir::Value extent = unboxed.getResult(1); + mlir::Value stride = one; + mlir::Value ub = builder.create(loc, extent, one); + mlir::Type boundTy = builder.getType(); + return builder.create( + loc, boundTy, /*lower_bound=*/zero, + /*upper_bound=*/ub, /*extent=*/extent, /*stride=*/stride, + /*stride_in_bytes = */ true, /*start_idx=*/zero); + } + return mlir::Value{}; +} /// Generate the bounds operation from the descriptor information. template @@ -258,6 +285,10 @@ genImplicitBoundsOps(fir::FirOpBuilder &builder, AddrAndBoundsInfo &info, bounds = genBaseBoundsOps(builder, loc, dataExv, dataExvIsAssumedSize); } + if (characterWithDynamicLen(fir::unwrapRefType(baseOp.getType()))) { + bounds = {genBoundsOpFromBoxChar(builder, loc, + dataExv, info)}; + } return bounds; } diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 35226ba1332c3..ca6a7193fd26a 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -217,33 +217,8 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, assert(args.isValid() && "invalid args"); fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - auto bindSingleMapLike = [&converter, - &firOpBuilder](const semantics::Symbol &sym, - const mlir::Value val, - const mlir::BlockArgument &arg) { - // Clones the `bounds` placing them inside the entry block and returns - // them. 
- auto cloneBound = [&](mlir::Value bound) { - if (mlir::isMemoryEffectFree(bound.getDefiningOp())) { - mlir::Operation *definingOp = bound.getDefiningOp(); - mlir::Operation *clonedOp = firOpBuilder.clone(*definingOp); - // Todo: Do we need to check for more operation types? - // For now, specializing only for fir::UnboxCharOp - if (auto unboxCharOp = mlir::dyn_cast(definingOp)) - return clonedOp->getResult(1); - return clonedOp->getResult(0); - } - TODO(converter.getCurrentLocation(), - "target map-like clause operand unsupported bound type"); - }; - - auto cloneBounds = [cloneBound](llvm::ArrayRef bounds) { - llvm::SmallVector clonedBounds; - llvm::transform(bounds, std::back_inserter(clonedBounds), - [&](mlir::Value bound) { return cloneBound(bound); }); - return clonedBounds; - }; - + auto bindSingleMapLike = [&converter](const semantics::Symbol &sym, + const mlir::BlockArgument &arg) { fir::ExtendedValue extVal = converter.getSymbolExtendedValue(sym); auto refType = mlir::dyn_cast(arg.getType()); if (refType && fir::isa_builtin_cptr_type(refType.getElementType())) { @@ -251,77 +226,27 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, } else { extVal.match( [&](const fir::BoxValue &v) { - converter.bindSymbol(sym, - fir::BoxValue(arg, cloneBounds(v.getLBounds()), - v.getExplicitParameters(), - v.getExplicitExtents())); + converter.bindSymbol(sym, fir::BoxValue(arg, v.getLBounds(), + v.getExplicitParameters(), + v.getExplicitExtents())); }, [&](const fir::MutableBoxValue &v) { converter.bindSymbol( - sym, fir::MutableBoxValue(arg, cloneBounds(v.getLBounds()), + sym, fir::MutableBoxValue(arg, v.getLBounds(), v.getMutableProperties())); }, [&](const fir::ArrayBoxValue &v) { - converter.bindSymbol( - sym, fir::ArrayBoxValue(arg, cloneBounds(v.getExtents()), - cloneBounds(v.getLBounds()), - v.getSourceBox())); + converter.bindSymbol(sym, fir::ArrayBoxValue(arg, v.getExtents(), + v.getLBounds(), + v.getSourceBox())); }, [&](const 
fir::CharArrayBoxValue &v) { - converter.bindSymbol( - sym, fir::CharArrayBoxValue(arg, cloneBound(v.getLen()), - cloneBounds(v.getExtents()), - cloneBounds(v.getLBounds()))); + converter.bindSymbol(sym, fir::CharArrayBoxValue(arg, v.getLen(), + v.getExtents(), + v.getLBounds())); }, [&](const fir::CharBoxValue &v) { - // In some cases, v.len could reference the input to the - // hlfir.declare which is the corresponding v.addr. While this isn't - // a big problem by itself, it is desirable to extract this out of - // v.addr itself since it's first result will be of type - // fir.boxchar<>. For example, consider the following - // - // func.func private @_QFPrealtest(%arg0: !fir.boxchar<1>) - // %2 = fir.dummy_scope : !fir.dscope - // %3:2 = fir.unboxchar %arg0 : (!fir.boxchar<1>) -> - // (!fir.ref>, index) - // %4:2 = hlfir.declare (%3#0, %3#1, %2):(!fir.ref>, - // index,!fir.dscope) -> - // (!fir.boxchar<1>, !fir.ref>) - - // In the case above, - // v.addr is - // %4:2 = hlfir.declare (%3#0, %3#1, %2):(!fir.ref>, - // index,!fir.dscope) -> - // (!fir.boxchar<1>, !fir.ref>) - // v.len is - // %3:2 = fir.unboxchar %arg0 : (!fir.boxchar<1>) -> - // (!fir.ref>, index) - - // Mapping this to the target will create a use of %arg0 on the - // target. Since omp.target is IsolatedFromAbove, %arg0 will have to - // be mapped. Presently, OpenMP lowering of target barfs when it has - // to map a value that doesnt have a defining op. This can be fixed. - // Or we ensure that v.len is fir.unboxchar %4#0 which will - // cause %4#1 to be used on the target and consequently be - // mapped to the target. As such then, there wont be any use of the - // block argument %arg0 on the target. 
- - mlir::Value len = v.getLen(); - if (auto declareOp = val.getDefiningOp()) { - mlir::Value base = declareOp.getBase(); - if (auto boxCharType = - mlir::dyn_cast(base.getType())) { - mlir::Type lenType = firOpBuilder.getCharacterLengthType(); - mlir::Type refType = - firOpBuilder.getRefType(boxCharType.getEleTy()); - mlir::Location loc = converter.getCurrentLocation(); - auto unboxed = firOpBuilder.create( - loc, refType, lenType, base); - len = unboxed.getResult(1); - } - } - auto charBoxValue = fir::CharBoxValue(arg, cloneBound(len)); - converter.bindSymbol(sym, charBoxValue); + converter.bindSymbol(sym, fir::CharBoxValue(arg, v.getLen())); }, [&](const fir::UnboxedValue &v) { converter.bindSymbol(sym, arg); }, [&](const auto &) { @@ -333,7 +258,6 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, auto bindMapLike = [&bindSingleMapLike](llvm::ArrayRef syms, - llvm::ArrayRef vars, llvm::ArrayRef args) { // Structure component symbols don't have bindings, and can only be // explicitly mapped individually. If a member is captured implicitly @@ -342,8 +266,8 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, llvm::copy_if(syms, std::back_inserter(processedSyms), [](auto *sym) { return !sym->owner().IsDerivedType(); }); - for (auto [sym, var, arg] : llvm::zip_equal(processedSyms, vars, args)) - bindSingleMapLike(*sym, var, arg); + for (auto [sym, arg] : llvm::zip_equal(processedSyms, args)) + bindSingleMapLike(*sym, arg); }; auto bindPrivateLike = [&converter, &firOpBuilder]( @@ -374,20 +298,17 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, // Process in clause name alphabetical order to match block arguments order. // Do not bind host_eval variables because they cannot be used inside of the // corresponding region, except for very specific cases handled separately. 
- bindMapLike(args.hasDeviceAddr.syms, args.hasDeviceAddr.vars, - op.getHasDeviceAddrBlockArgs()); + bindMapLike(args.hasDeviceAddr.syms, op.getHasDeviceAddrBlockArgs()); bindPrivateLike(args.inReduction.syms, args.inReduction.vars, op.getInReductionBlockArgs()); - bindMapLike(args.map.syms, args.map.vars, op.getMapBlockArgs()); + bindMapLike(args.map.syms, op.getMapBlockArgs()); bindPrivateLike(args.priv.syms, args.priv.vars, op.getPrivateBlockArgs()); bindPrivateLike(args.reduction.syms, args.reduction.vars, op.getReductionBlockArgs()); bindPrivateLike(args.taskReduction.syms, args.taskReduction.vars, op.getTaskReductionBlockArgs()); - bindMapLike(args.useDeviceAddr.syms, args.useDeviceAddr.vars, - op.getUseDeviceAddrBlockArgs()); - bindMapLike(args.useDevicePtr.syms, args.useDevicePtr.vars, - op.getUseDevicePtrBlockArgs()); + bindMapLike(args.useDeviceAddr.syms, op.getUseDeviceAddrBlockArgs()); + bindMapLike(args.useDevicePtr.syms, op.getUseDevicePtrBlockArgs()); } /// Get the list of base values that the specified map-like variables point to. @@ -1427,14 +1348,13 @@ static void genBodyOfTargetOp( while (!valuesDefinedAbove.empty()) { for (mlir::Value val : valuesDefinedAbove) { mlir::Operation *valOp = val.getDefiningOp(); - assert(valOp != nullptr); // NOTE: We skip BoxDimsOp's as the lesser of two evils is to map the // indices separately, as the alternative is to eventually map the Box, // which comes with a fairly large overhead comparatively. We could be // more robust about this and check using a BackwardsSlice to see if we // run the risk of mapping a box. 
- if (mlir::isMemoryEffectFree(valOp) && + if (valOp && mlir::isMemoryEffectFree(valOp) && !mlir::isa(valOp)) { mlir::Operation *clonedOp = valOp->clone(); entryBlock->push_front(clonedOp); @@ -1447,7 +1367,13 @@ static void genBodyOfTargetOp( valOp->replaceUsesWithIf(clonedOp, replace); } else { auto savedIP = firOpBuilder.getInsertionPoint(); - firOpBuilder.setInsertionPointAfter(valOp); + + if (valOp) + firOpBuilder.setInsertionPointAfter(valOp); + else + // This means val is a block argument + firOpBuilder.setInsertionPoint(targetOp); + auto copyVal = firOpBuilder.createTemporary(val.getLoc(), val.getType()); firOpBuilder.createStoreWithConvert(copyVal.getLoc(), val, copyVal); diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp index d228507d2d771..94797f173a5ef 100644 --- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp +++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp @@ -47,6 +47,8 @@ #include #include +#define DEBUG_TYPE "omp-map-info-finalization" +#define PDBGS() (llvm::dbgs() << "[" << DEBUG_TYPE << "]: ") namespace flangomp { #define GEN_PASS_DEF_MAPINFOFINALIZATIONPASS #include "flang/Optimizer/OpenMP/Passes.h.inc" @@ -554,52 +556,31 @@ class MapInfoFinalizationPass // with an underlying type of fir.char, i.e a character // with dynamic length. If so, check if they need bounds added. 
func->walk([&](mlir::omp::MapInfoOp op) { - mlir::Value varPtr = op.getVarPtr(); - mlir::Type underlyingVarType = fir::unwrapRefType(varPtr.getType()); - if (!mlir::isa(underlyingVarType)) + if (!op.getBounds().empty()) return mlir::WalkResult::advance(); - fir::CharacterType cType = - mlir::cast(underlyingVarType); - if (!cType.hasDynamicLen()) - return mlir::WalkResult::advance(); + mlir::Value varPtr = op.getVarPtr(); + mlir::Type underlyingVarType = fir::unwrapRefType(varPtr.getType()); - if (!op.getBounds().empty()) - return mlir::WalkResult::advance(); - // This means varPtr is a BlockArgument. I do not know how to get to a - // fir.boxchar<> type of mlir::Value for varPtr. So, skipping this for - // now. - mlir::Operation *definingOp = varPtr.getDefiningOp(); - if (!definingOp) + if (!fir::characterWithDynamicLen(underlyingVarType)) return mlir::WalkResult::advance(); - if (auto declOp = mlir::dyn_cast(definingOp)) { - mlir::Value base = declOp.getBase(); - assert(mlir::isa(base.getType())); - // mlir::value unboxChar - builder.setInsertionPoint(op); - fir::BoxCharType boxCharType = - mlir::cast(base.getType()); - mlir::Type idxTy = builder.getIndexType(); - mlir::Type lenType = builder.getCharacterLengthType(); - mlir::Type refType = builder.getRefType(boxCharType.getEleTy()); - mlir::Location location = op.getLoc(); - auto unboxed = builder.create(location, refType, - lenType, base); - // len = unboxed.getResult(1); - mlir::Value zero = builder.createIntegerConstant(location, idxTy, 0); - mlir::Value one = builder.createIntegerConstant(location, idxTy, 1); - mlir::Value extent = unboxed.getResult(1); - mlir::Value stride = one; - mlir::Value ub = - builder.create(location, extent, one); - mlir::Type boundTy = builder.getType(); - mlir::Value boundsOp = builder.create( - location, boundTy, /*lower_bound=*/zero, - /*upper_bound=*/ub, /*extent=*/extent, /*stride=*/stride, - /*stride_in_bytes = */ true, /*start_idx=*/zero); - 
op.getBoundsMutable().append({boundsOp}); - } + fir::factory::AddrAndBoundsInfo info = + fir::factory::getDataOperandBaseAddr( + builder, varPtr, /*isOptional=*/false, varPtr.getLoc()); + fir::ExtendedValue extendedValue = + hlfir::translateToExtendedValue(varPtr.getLoc(), builder, + hlfir::Entity{info.addr}, + /*continguousHint=*/true) + .first; + builder.setInsertionPoint(op); + llvm::SmallVector boundsOps = + fir::factory::genImplicitBoundsOps( + builder, info, extendedValue, + /*dataExvIsAssumedSize=*/false, varPtr.getLoc()); + + op.getBoundsMutable().append(boundsOps); return mlir::WalkResult::advance(); }); diff --git a/flang/test/Lower/OpenMP/map-character.f90 b/flang/test/Lower/OpenMP/map-character.f90 index 2ed2397713b5d..76df43f1c2cb3 100644 --- a/flang/test/Lower/OpenMP/map-character.f90 +++ b/flang/test/Lower/OpenMP/map-character.f90 @@ -11,14 +11,12 @@ subroutine TestOfCharacter(a0, a1, l) end subroutine TestOfCharacter -!CHECK: %[[A1_BOXCHAR_ALLOCA:.*]] = fir.alloca !fir.boxchar<1> !CHECK: %[[A0_BOXCHAR_ALLOCA:.*]] = fir.alloca !fir.boxchar<1> +!CHECK: %[[A1_BOXCHAR_ALLOCA:.*]] = fir.alloca !fir.boxchar<1> !CHECK: %[[UNBOXED_ARG0:.*]]:2 = fir.unboxchar %arg0 : (!fir.boxchar<1>) -> (!fir.ref>, index) !CHECK: %[[A0_DECL:.*]]:2 = hlfir.declare %[[UNBOXED_ARG0]]#0 typeparams %[[UNBOXED_ARG0]]#1 dummy_scope {{.*}} -> (!fir.boxchar<1>, !fir.ref>) -!CHECK: fir.store %[[A0_DECL]]#0 to %[[A0_BOXCHAR_ALLOCA]] : !fir.ref> !CHECK: %[[UNBOXED_ARG1:.*]]:2 = fir.unboxchar %arg1 : (!fir.boxchar<1>) -> (!fir.ref>, index) !CHECK: %[[A1_DECL:.*]]:2 = hlfir.declare %[[UNBOXED_ARG1]]#0 typeparams %[[UNBOXED_ARG1]]#1 dummy_scope {{.*}} -> (!fir.boxchar<1>, !fir.ref>) -!CHECK: fir.store %[[A1_DECL]]#0 to %[[A1_BOXCHAR_ALLOCA]] : !fir.ref> !CHECK: %[[UNBOXED_A0_DECL:.*]]:2 = fir.unboxchar %[[A0_DECL]]#0 : (!fir.boxchar<1>) -> (!fir.ref>, index) !CHECK: %[[A0_LB:.*]] = arith.constant 0 : index !CHECK: %[[A0_STRIDE:.*]] = arith.constant 1 : index @@ -33,15 +31,16 @@ end 
subroutine TestOfCharacter !CHECK: %[[A1_BOUNDS:.*]] = omp.map.bounds lower_bound(%[[A1_LB]] : index) upper_bound(%[[A1_UB]] : index) extent(%[[UNBOXED_A1_DECL]]#1 : index) !CHECKL-SAME: stride(%[[A1_STRIDE]] : index) start_idx(%[[A1_LB]] : index) {stride_in_bytes = true} !CHECK: %[[A1_MAP:.*]] = omp.map.info var_ptr(%[[A1_DECL]]#1 : !fir.ref>, !fir.char<1,?>) map_clauses(from) capture(ByRef) bounds(%[[A1_BOUNDS]]) -> !fir.ref> {name = "a1"} - -!CHECK: %[[A0_BOXCHAR_MAP:.*]] = omp.map.info var_ptr(%[[A0_BOXCHAR_ALLOCA]] : !fir.ref>, !fir.boxchar<1>) map_clauses(implicit, to) capture(ByRef) -> !fir.ref> {name = ""} +!CHECK: fir.store %arg1 to %[[A1_BOXCHAR_ALLOCA]] : !fir.ref> !CHECK: %[[A1_BOXCHAR_MAP:.*]] = omp.map.info var_ptr(%[[A1_BOXCHAR_ALLOCA]] : !fir.ref>, !fir.boxchar<1>) map_clauses(implicit, to) capture(ByRef) -> !fir.ref> {name = ""} +!CHECK: fir.store %arg0 to %[[A0_BOXCHAR_ALLOCA]] : !fir.ref> +!CHECK: %[[A0_BOXCHAR_MAP:.*]] = omp.map.info var_ptr(%[[A0_BOXCHAR_ALLOCA]] : !fir.ref>, !fir.boxchar<1>) map_clauses(implicit, to) capture(ByRef) -> !fir.ref> {name = ""} -!CHECK: omp.target map_entries(%[[A0_MAP]] -> %[[TGT_A0:.*]], %[[A1_MAP]] -> %[[TGT_A1:.*]], %[[A0_BOXCHAR_MAP]] -> %[[TGT_A0_BOXCHAR:.*]], %[[A1_BOXCHAR_MAP]] -> %[[TGT_A1_BOXCHAR:.*]] : !fir.ref>, !fir.ref>, !fir.ref>, !fir.ref>) { -!CHECK: %[[TGT_A1_BC_LD:.*]] = fir.load %[[TGT_A1_BOXCHAR]] : !fir.ref> +!CHECK: omp.target map_entries(%[[A0_MAP]] -> %[[TGT_A0:.*]], %[[A1_MAP]] -> %[[TGT_A1:.*]], %[[A1_BOXCHAR_MAP]] -> %[[TGT_A1_BOXCHAR:.*]], %[[A0_BOXCHAR_MAP]] -> %[[TGT_A0_BOXCHAR:.*]] : !fir.ref>, !fir.ref>, !fir.ref>, !fir.ref>) { !CHECK: %[[TGT_A0_BC_LD:.*]] = fir.load %[[TGT_A0_BOXCHAR]] : !fir.ref> +!CHECK: %[[TGT_A1_BC_LD:.*]] = fir.load %[[TGT_A1_BOXCHAR]] : !fir.ref> +!CHECK: %[[UNBOXED_TGT_A1:.*]]:2 = fir.unboxchar %[[TGT_A1_BC_LD]] : (!fir.boxchar<1>) -> (!fir.ref>, index) !CHECK: %[[UNBOXED_TGT_A0:.*]]:2 = fir.unboxchar %[[TGT_A0_BC_LD]] : (!fir.boxchar<1>) -> (!fir.ref>, 
index) !CHECK: %[[TGT_A0_DECL:.*]]:2 = hlfir.declare %[[TGT_A0]] typeparams %[[UNBOXED_TGT_A0]]#1 {{.*}} -> (!fir.boxchar<1>, !fir.ref>) -!CHECK: %[[UNBOXED_TGT_A1:.*]]:2 = fir.unboxchar %[[TGT_A1_BC_LD]] : (!fir.boxchar<1>) -> (!fir.ref>, index) !CHECK: %[[TGT_A1_DECL:.*]]:2 = hlfir.declare %[[TGT_A1]] typeparams %[[UNBOXED_TGT_A1]]#1 {{.*}} -> (!fir.boxchar<1>, !fir.ref>) From flang-commits at lists.llvm.org Sat May 3 00:24:49 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 03 May 2025 00:24:49 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <6815c4c1.170a0220.f9ee5.ab55@mx.google.com> ================ @@ -3091,10 +3131,32 @@ static void genAtomicCapture(lower::AbstractConverter &converter, firOpBuilder.setInsertionPointToStart(&block); const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { ---------------- NimishMishra wrote: Thanks! The first two blocks are similar (since the capture-stmt appears first in [capture-stmt, update-stmt] and [capture-stmt, write-stmt]), but the third one is different (for [update-stmt, capture-stmt]). I'll try to refactor https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Sat May 3 00:55:20 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 03 May 2025 00:55:20 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <6815cbe8.170a0220.73625.ac0d@mx.google.com> https://github.com/NimishMishra approved this pull request. LGTM. 
The code formatting needs to be fixed though https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Sat May 3 06:28:28 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 03 May 2025 06:28:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Fix fir.convert in omp.atomic.update region (PR #138397) Message-ID: https://github.com/NimishMishra created https://github.com/llvm/llvm-project/pull/138397 Region generation in omp.atomic.update currently emits a direct `fir.convert`. This crashes when the RHS expression involves complex type but the LHS variable is primitive type (say `f32`), since a `fir.convert` from `complex<f32>` to `f32` is emitted, which is illegal. This PR adds a conditional check to emit an additional `ExtractValueOp` in case RHS expression has a complex type. Fixes https://github.com/llvm/llvm-project/issues/138396 >From e667941c15865bba517fde0e0d0882e6b92c78cb Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Sat, 3 May 2025 18:54:15 +0530 Subject: [PATCH] [flang][OpenMP] Fix fir.convert in omp.atomic.update region --- flang/lib/Lower/OpenMP/OpenMP.cpp | 20 +++++++++++++++++--- flang/test/Lower/OpenMP/atomic-update.f90 | 18 ++++++++++++++++++ 2 files changed, 35 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 47e7c266ff7d3..0910d0003356c 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2833,9 +2833,23 @@ static void genAtomicUpdateStatement( lower::StatementContext atomicStmtCtx; mlir::Value rhsExpr = fir::getBase(converter.genExprValue( *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); + mlir::Type exprType = fir::unwrapRefType(rhsExpr.getType()); + if (fir::isa_complex(exprType) && !fir::isa_complex(varType)) { + // Emit
an additional `ExtractValueOp` if the expression is of complex + // type + auto extract = firOpBuilder.create( + currentLocation, + mlir::cast(exprType).getElementType(), rhsExpr, + firOpBuilder.getArrayAttr( + firOpBuilder.getIntegerAttr(firOpBuilder.getIndexType(), 0))); + mlir::Value convertResult = firOpBuilder.create( + currentLocation, varType, extract); + firOpBuilder.create(currentLocation, convertResult); + } else { + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + } converter.resetExprOverrides(); } firOpBuilder.setInsertionPointAfter(atomicUpdateOp); diff --git a/flang/test/Lower/OpenMP/atomic-update.f90 b/flang/test/Lower/OpenMP/atomic-update.f90 index 31bf447006930..257ae8fb497ff 100644 --- a/flang/test/Lower/OpenMP/atomic-update.f90 +++ b/flang/test/Lower/OpenMP/atomic-update.f90 @@ -20,6 +20,8 @@ program OmpAtomicUpdate !CHECK: %[[VAL_C_DECLARE:.*]]:2 = hlfir.declare %[[VAL_C_ADDRESS]] {{.*}} !CHECK: %[[VAL_D_ADDRESS:.*]] = fir.address_of(@_QFEd) : !fir.ref !CHECK: %[[VAL_D_DECLARE:.*]]:2 = hlfir.declare %[[VAL_D_ADDRESS]] {{.}} +!CHECK: %[[VAL_G_ADDRESS:.*]] = fir.alloca complex {bindc_name = "g", uniq_name = "_QFEg"} +!CHECK: %[[VAL_G_DECLARE:.*]]:2 = hlfir.declare %[[VAL_G_ADDRESS]] {uniq_name = "_QFEg"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) !CHECK: %[[VAL_i1_ALLOCA:.*]] = fir.alloca i8 {bindc_name = "i1", uniq_name = "_QFEi1"} !CHECK: %[[VAL_i1_DECLARE:.*]]:2 = hlfir.declare %[[VAL_i1_ALLOCA]] {{.*}} !CHECK: %[[VAL_c5:.*]] = arith.constant 5 : index @@ -40,6 +42,7 @@ program OmpAtomicUpdate integer, target :: c, d integer(1) :: i1 integer, dimension(5) :: k + complex :: g !CHECK: %[[EMBOX:.*]] = fir.embox %[[VAL_C_DECLARE]]#0 : (!fir.ref) -> !fir.box> !CHECK: fir.store %[[EMBOX]] to %[[VAL_A_DECLARE]]#0 : !fir.ref>> @@ -200,4 +203,19 @@ program OmpAtomicUpdate !CHECK: } !$omp atomic update x = x + sum([ (y+2, y=1, z) ]) + +!CHECK: %[[LOAD:.*]] 
= fir.load %[[VAL_G_DECLARE]]#0 : !fir.ref> +!CHECK: omp.atomic.update %[[VAL_W_DECLARE]]#0 : !fir.ref { +!CHECK: ^bb0(%[[ARG:.*]]: i32): +!CHECK: %[[CVT:.*]] = fir.convert %[[ARG]] : (i32) -> f32 +!CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +!CHECK: %[[UNDEF:.*]] = fir.undefined complex +!CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex +!CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex +!CHECK: %[[ADD:.*]] = fir.addc %[[IDX2]], %[[LOAD]] {fastmath = #arith.fastmath} : complex +!CHECK: %[[EXT:.*]] = fir.extract_value %[[ADD]], [0 : index] : (complex) -> f32 +!CHECK: %[[RESULT:.*]] = fir.convert %[[EXT]] : (f32) -> i32 +!CHECK: omp.yield(%[[RESULT]] : i32) + !$omp atomic update + w = w + g end program OmpAtomicUpdate From flang-commits at lists.llvm.org Sat May 3 06:29:00 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 03 May 2025 06:29:00 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Fix fir.convert in omp.atomic.update region (PR #138397) In-Reply-To: Message-ID: <68161a1c.050a0220.221863.28ba@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: None (NimishMishra)
Changes Region generation in omp.atomic.update currently emits a direct `fir.convert`. This crashes when the RHS expression involves complex type but the LHS variable is primitive type (say `f32`), since a `fir.convert` from `complex<f32>` to `f32` is emitted, which is illegal. This PR adds a conditional check to emit an additional `ExtractValueOp` in case RHS expression has a complex type. Fixes https://github.com/llvm/llvm-project/issues/138396 --- Full diff: https://github.com/llvm/llvm-project/pull/138397.diff 2 Files Affected: - (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+17-3) - (modified) flang/test/Lower/OpenMP/atomic-update.f90 (+18) ``````````diff diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 47e7c266ff7d3..0910d0003356c 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2833,9 +2833,23 @@ static void genAtomicUpdateStatement( lower::StatementContext atomicStmtCtx; mlir::Value rhsExpr = fir::getBase(converter.genExprValue( *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); + mlir::Type exprType = fir::unwrapRefType(rhsExpr.getType()); + if (fir::isa_complex(exprType) && !fir::isa_complex(varType)) { + // Emit an additional `ExtractValueOp` if the expression is of complex + // type + auto extract = firOpBuilder.create( + currentLocation, + mlir::cast(exprType).getElementType(), rhsExpr, + firOpBuilder.getArrayAttr( + firOpBuilder.getIntegerAttr(firOpBuilder.getIndexType(), 0))); + mlir::Value convertResult = firOpBuilder.create( + currentLocation, varType, extract); + firOpBuilder.create(currentLocation, convertResult); + } else { + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + } converter.resetExprOverrides(); } 
firOpBuilder.setInsertionPointAfter(atomicUpdateOp); diff --git a/flang/test/Lower/OpenMP/atomic-update.f90 b/flang/test/Lower/OpenMP/atomic-update.f90 index 31bf447006930..257ae8fb497ff 100644 --- a/flang/test/Lower/OpenMP/atomic-update.f90 +++ b/flang/test/Lower/OpenMP/atomic-update.f90 @@ -20,6 +20,8 @@ program OmpAtomicUpdate !CHECK: %[[VAL_C_DECLARE:.*]]:2 = hlfir.declare %[[VAL_C_ADDRESS]] {{.*}} !CHECK: %[[VAL_D_ADDRESS:.*]] = fir.address_of(@_QFEd) : !fir.ref !CHECK: %[[VAL_D_DECLARE:.*]]:2 = hlfir.declare %[[VAL_D_ADDRESS]] {{.}} +!CHECK: %[[VAL_G_ADDRESS:.*]] = fir.alloca complex {bindc_name = "g", uniq_name = "_QFEg"} +!CHECK: %[[VAL_G_DECLARE:.*]]:2 = hlfir.declare %[[VAL_G_ADDRESS]] {uniq_name = "_QFEg"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) !CHECK: %[[VAL_i1_ALLOCA:.*]] = fir.alloca i8 {bindc_name = "i1", uniq_name = "_QFEi1"} !CHECK: %[[VAL_i1_DECLARE:.*]]:2 = hlfir.declare %[[VAL_i1_ALLOCA]] {{.*}} !CHECK: %[[VAL_c5:.*]] = arith.constant 5 : index @@ -40,6 +42,7 @@ program OmpAtomicUpdate integer, target :: c, d integer(1) :: i1 integer, dimension(5) :: k + complex :: g !CHECK: %[[EMBOX:.*]] = fir.embox %[[VAL_C_DECLARE]]#0 : (!fir.ref) -> !fir.box> !CHECK: fir.store %[[EMBOX]] to %[[VAL_A_DECLARE]]#0 : !fir.ref>> @@ -200,4 +203,19 @@ program OmpAtomicUpdate !CHECK: } !$omp atomic update x = x + sum([ (y+2, y=1, z) ]) + +!CHECK: %[[LOAD:.*]] = fir.load %[[VAL_G_DECLARE]]#0 : !fir.ref> +!CHECK: omp.atomic.update %[[VAL_W_DECLARE]]#0 : !fir.ref { +!CHECK: ^bb0(%[[ARG:.*]]: i32): +!CHECK: %[[CVT:.*]] = fir.convert %[[ARG]] : (i32) -> f32 +!CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +!CHECK: %[[UNDEF:.*]] = fir.undefined complex +!CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex +!CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex +!CHECK: %[[ADD:.*]] = fir.addc %[[IDX2]], %[[LOAD]] {fastmath = #arith.fastmath} : complex +!CHECK: 
%[[EXT:.*]] = fir.extract_value %[[ADD]], [0 : index] : (complex) -> f32 +!CHECK: %[[RESULT:.*]] = fir.convert %[[EXT]] : (f32) -> i32 +!CHECK: omp.yield(%[[RESULT]] : i32) + !$omp atomic update + w = w + g end program OmpAtomicUpdate ``````````
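For readers less familiar with FIR, the shape of this fix can be sketched in plain C++: when the right-hand side is complex and the target is not, take the real part first and only then convert. This is a minimal illustration and not flang code. `is_complex`, `convertForUpdate`, `atomicUpdateBody`, and `usageExample` are hypothetical names, and `std::complex<float>` merely stands in for the FIR complex type.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// Hypothetical trait (not a flang API): true only for std::complex<T>.
template <typename T> struct is_complex : std::false_type {};
template <typename T> struct is_complex<std::complex<T>> : std::true_type {};

// Hypothetical helper mirroring the two branches of the patch.
template <typename To, typename From>
To convertForUpdate(const From &rhs) {
  if constexpr (is_complex<From>::value && !is_complex<To>::value) {
    // Analogue of the extra extract_value at index 0: keep the real part.
    auto realPart = rhs.real();
    return static_cast<To>(realPart);
  } else {
    // Analogue of the plain convert path.
    return static_cast<To>(rhs);
  }
}

// Sketch of the region body for `w = w + g` with integer w and complex g.
inline int atomicUpdateBody(int w, std::complex<float> g) {
  std::complex<float> sum =
      std::complex<float>(static_cast<float>(w), 0.0f) + g;
  return convertForUpdate<int>(sum);
}

inline void usageExample() {
  // w = 1, g = (2.5, 1.0): the sum is (3.5, 1.0); its real part truncates to 3.
  assert(atomicUpdateBody(1, std::complex<float>(2.5f, 1.0f)) == 3);
}
```

The `if constexpr` split plays the same role as the `isa_complex` check in the patch: only the complex-to-scalar case takes the extra extraction step.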
https://github.com/llvm/llvm-project/pull/138397

From flang-commits at lists.llvm.org Sat May 3 06:29:26 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Sat, 03 May 2025 06:29:26 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163)
In-Reply-To:
Message-ID: <68161a36.630a0220.187254.cb1e@mx.google.com>

https://github.com/NimishMishra edited https://github.com/llvm/llvm-project/pull/138163

From flang-commits at lists.llvm.org Sat May 3 22:50:38 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Sat, 03 May 2025 22:50:38 -0700 (PDT)
Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790)
In-Reply-To:
Message-ID: <6817002e.170a0220.1edde3.8742@mx.google.com>

================
@@ -461,14 +475,28 @@ class AffineLoopConversion : public mlir::OpRewritePattern {
 LLVM_ATTRIBUTE_UNUSED auto loopAnalysis =
 functionAnalysis.getChildLoopAnalysis(loop);
 auto &loopOps = loop.getBody()->getOperations();
+ auto resultOp = cast(loop.getBody()->getTerminator());
----------------
NexMing wrote:

Yes, it can have multiple results. I can add tests if needed. This optimization is indeed experimental and requires more testing to uncover issues.

https://github.com/llvm/llvm-project/pull/137790

From flang-commits at lists.llvm.org Sun May 4 08:59:40 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Sun, 04 May 2025 08:59:40 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][Evaluate] Fix AsGenericExpr for Relational (PR #138455)
Message-ID:

https://github.com/kparzysz created https://github.com/llvm/llvm-project/pull/138455

The variant in Expr<Type<TypeCategory::Logical, KIND>> only contains Relational<SomeType>, not other, more specific Relational<T> types. When calling AsGenericExpr for a value of type Relational<T>, the AsExpr function will attempt to create Expr<> directly for Relational<T>, which won't work for the above reason.
Implement an overload of AsExpr for Relational, which will wrap the Relational in Relational before creating Expr<>. >From 053e565b483b94d72b183bfd991c39ce2c27af60 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 4 May 2025 10:13:47 -0500 Subject: [PATCH] [flang][Evaluate] Fix AsGenericExpr for Relational The variant in Expr> only contains Relational, not other, more specific Relational types. When calling AsGenericExpr for a value of type Relational, the AsExpr function will attempt to create Expr<> directly for Relational, which won't work for the above reason. Implement an overload of AsExpr for Relational, which will wrap the Relational in Relational before creating Expr<>. --- flang/include/flang/Evaluate/tools.h | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h index 1414eaf14f7d6..cd29872c1e0fd 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -125,11 +125,18 @@ template bool IsCoarray(const A &x) { return GetCorank(x) > 0; } // Generalizing packagers: these take operations and expressions of more // specific types and wrap them in Expr<> containers of more abstract types. - template common::IfNoLvalue>, A> AsExpr(A &&x) { return Expr>{std::move(x)}; } +template ::Result> +Expr AsExpr(Relational &&x) { + // The variant in Expr> only contains + // Relational, not other Relationals. Wrap the Relational + // in Relational before creating Expr<>. 
+ return Expr(Relational{std::move(x)}); +} + template Expr AsExpr(Expr &&x) { static_assert(IsSpecificIntrinsicType); return std::move(x); From flang-commits at lists.llvm.org Sun May 4 09:00:12 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sun, 04 May 2025 09:00:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang][Evaluate] Fix AsGenericExpr for Relational (PR #138455) In-Reply-To: Message-ID: <68178f0c.170a0220.18b641.9895@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Krzysztof Parzyszek (kparzysz)
Changes The variant in Expr<Type<TypeCategory::Logical, KIND>> only contains Relational<SomeType>, not other, more specific Relational<T> types. When calling AsGenericExpr for a value of type Relational<T>, the AsExpr function will attempt to create Expr<> directly for Relational<T>, which won't work for the above reason. Implement an overload of AsExpr for Relational<T>, which will wrap the Relational<T> in Relational<SomeType> before creating Expr<>. --- Full diff: https://github.com/llvm/llvm-project/pull/138455.diff 1 Files Affected: - (modified) flang/include/flang/Evaluate/tools.h (+8-1) ``````````diff diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h index 1414eaf14f7d6..cd29872c1e0fd 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -125,11 +125,18 @@ template bool IsCoarray(const A &x) { return GetCorank(x) > 0; } // Generalizing packagers: these take operations and expressions of more // specific types and wrap them in Expr<> containers of more abstract types. - template common::IfNoLvalue>, A> AsExpr(A &&x) { return Expr>{std::move(x)}; } +template ::Result> +Expr AsExpr(Relational &&x) { + // The variant in Expr> only contains + // Relational, not other Relationals. Wrap the Relational + // in Relational before creating Expr<>. + return Expr(Relational{std::move(x)}); +} + template Expr AsExpr(Expr &&x) { static_assert(IsSpecificIntrinsicType); return std::move(x); ``````````
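The wrapping step described above can be illustrated with a small, self-contained `std::variant` sketch. It is not flang's actual class hierarchy: `IntegerTag`, `RealTag`, `AnyTag` (standing in for `SomeType`), `Relation`, and `LogicalExpr` are hypothetical simplifications, chosen only to show why a specific relation has to be wrapped in the type-erased one before it fits into the expression's variant.

```cpp
#include <cassert>
#include <utility>
#include <variant>

// Hypothetical stand-ins for type tags; AnyTag plays the role of SomeType.
struct IntegerTag {};
struct RealTag {};
struct AnyTag {};

// A comparison over operands of one specific type.
template <typename T> struct Relation {
  int lhs;
  int rhs;
};

// The type-erased form: it can hold any of the specific relations.
template <> struct Relation<AnyTag> {
  std::variant<Relation<IntegerTag>, Relation<RealTag>> u;
  template <typename T>
  explicit Relation(Relation<T> &&x) : u{std::move(x)} {}
};

// A logical expression whose variant lists only the type-erased relation,
// so a Relation<IntegerTag> cannot be stored in it directly.
struct LogicalExpr {
  std::variant<bool, Relation<AnyTag>> u;
  explicit LogicalExpr(bool b) : u{b} {}
  explicit LogicalExpr(Relation<AnyTag> &&r) : u{std::move(r)} {}
};

inline LogicalExpr AsExpr(bool x) { return LogicalExpr{x}; }

// Overload for relations: wrap in the type-erased relation first, then
// build the expression from the wrapper.
template <typename T> LogicalExpr AsExpr(Relation<T> &&x) {
  return LogicalExpr{Relation<AnyTag>{std::move(x)}};
}

inline void usageExample() {
  LogicalExpr e = AsExpr(Relation<IntegerTag>{1, 2});
  assert(std::holds_alternative<Relation<AnyTag>>(e.u));
}
```

Without the relation overload, a caller would have to remember the extra wrapping at every call site; with it, the generic entry point accepts the specific relation types as well.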
https://github.com/llvm/llvm-project/pull/138455

From flang-commits at lists.llvm.org Sun May 4 09:12:11 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Sun, 04 May 2025 09:12:11 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][Evaluate] Restrict ConstantBase constructor overload (PR #138456)
Message-ID:

https://github.com/kparzysz created https://github.com/llvm/llvm-project/pull/138456

ConstantBase has a constructor that takes a value of any type as an input: template <typename T> ConstantBase(const T &). A derived type Constant<T> is a member of many Expr<T> classes (as an alternative in the member variant). When trying (erroneously) to create Expr<T> from a wrong input, if the specific instance of Expr<T> contains Constant<T>, it's that constructor that will be instantiated, leading to cryptic and unexpected errors. Eliminate the constructor from overload for invalid input values to help produce more meaningful diagnostics.

>From f9b0da8ad4fca7d2d40c3ad6eb2bf612402a00e3 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sun, 4 May 2025 09:58:45 -0500
Subject: [PATCH] [flang][Evaluate] Restrict ConstantBase constructor overload

ConstantBase has a constructor that takes a value of any type as an input: template <typename T> ConstantBase(const T &). A derived type Constant<T> is a member of many Expr<T> classes (as an alternative in the member variant). When trying (erroneously) to create Expr<T> from a wrong input, if the specific instance of Expr<T> contains Constant<T>, it's that constructor that will be instantiated, leading to cryptic and unexpected errors. Eliminate the constructor from overload for invalid input values to help produce more meaningful diagnostics.
--- flang/include/flang/Evaluate/constant.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/constant.h b/flang/include/flang/Evaluate/constant.h index 6fc22e3b86aa2..d4c6601c37bca 100644 --- a/flang/include/flang/Evaluate/constant.h +++ b/flang/include/flang/Evaluate/constant.h @@ -110,8 +110,12 @@ class ConstantBase : public ConstantBounds { using Result = RESULT; using Element = ELEMENT; - template + // Constructor for creating ConstantBase from an actual value (i.e. + // literals, etc.) + template >> ConstantBase(const A &x, Result res = Result{}) : result_{res}, values_{x} {} + ConstantBase(ELEMENT &&x, Result res = Result{}) : result_{res}, values_{std::move(x)} {} ConstantBase( From flang-commits at lists.llvm.org Sun May 4 09:12:42 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sun, 04 May 2025 09:12:42 -0700 (PDT) Subject: [flang-commits] [flang] [flang][Evaluate] Restrict ConstantBase constructor overload (PR #138456) In-Reply-To: Message-ID: <681791fa.630a0220.1dd03b.2104@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Krzysztof Parzyszek (kparzysz)
Changes ConstantBase has a constructor that takes a value of any type as an input: template <typename T> ConstantBase(const T &). A derived type Constant<T> is a member of many Expr<T> classes (as an alternative in the member variant). When trying (erroneously) to create Expr<T> from a wrong input, if the specific instance of Expr<T> contains Constant<T>, it's that constructor that will be instantiated, leading to cryptic and unexpected errors. Eliminate the constructor from overload for invalid input values to help produce more meaningful diagnostics. --- Full diff: https://github.com/llvm/llvm-project/pull/138456.diff 1 Files Affected: - (modified) flang/include/flang/Evaluate/constant.h (+5-1) ``````````diff diff --git a/flang/include/flang/Evaluate/constant.h b/flang/include/flang/Evaluate/constant.h index 6fc22e3b86aa2..d4c6601c37bca 100644 --- a/flang/include/flang/Evaluate/constant.h +++ b/flang/include/flang/Evaluate/constant.h @@ -110,8 +110,12 @@ class ConstantBase : public ConstantBounds { using Result = RESULT; using Element = ELEMENT; - template + // Constructor for creating ConstantBase from an actual value (i.e. + // literals, etc.) + template >> ConstantBase(const A &x, Result res = Result{}) : result_{res}, values_{x} {} + ConstantBase(ELEMENT &&x, Result res = Result{}) : result_{res}, values_{std::move(x)} {} ConstantBase( ``````````
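One common way to take a catch-all constructor template out of overload resolution is `std::enable_if`. The sketch below shows the general technique on a toy class. `ConstantHolder` and `usageExample` are hypothetical, and the convertibility condition is an assumption made for illustration; the exact constraint used in the patch is not reproduced here.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical, simplified holder; not flang's ConstantBase.
template <typename ELEMENT> class ConstantHolder {
public:
  // Participates in overload resolution only when A converts to ELEMENT,
  // so unrelated argument types are rejected at the call site instead of
  // failing somewhere inside the constructor body.
  template <typename A,
            typename = std::enable_if_t<
                std::is_convertible_v<const A &, ELEMENT>>>
  explicit ConstantHolder(const A &x) : value_(static_cast<ELEMENT>(x)) {}

  const ELEMENT &value() const { return value_; }

private:
  ELEMENT value_;
};

// An int can initialize a double-valued holder...
static_assert(std::is_constructible_v<ConstantHolder<double>, int>);
// ...but a string literal pointer is no longer accepted.
static_assert(!std::is_constructible_v<ConstantHolder<double>, const char *>);

inline void usageExample() {
  ConstantHolder<double> c(3);
  assert(c.value() == 3.0);
}
```

With the constraint in place, a wrong argument produces a "no matching constructor" diagnostic at the point of use, which is usually easier to read than an error from deep inside the member initializer.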
https://github.com/llvm/llvm-project/pull/138456 From flang-commits at lists.llvm.org Sun May 4 09:15:35 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Sun, 04 May 2025 09:15:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang][Evaluate] Fix AsGenericExpr for Relational (PR #138455) In-Reply-To: Message-ID: <681792a7.170a0220.35dbc3.0afe@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/138455 >From 053e565b483b94d72b183bfd991c39ce2c27af60 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 4 May 2025 10:13:47 -0500 Subject: [PATCH 1/2] [flang][Evaluate] Fix AsGenericExpr for Relational The variant in Expr> only contains Relational, not other, more specific Relational types. When calling AsGenericExpr for a value of type Relational, the AsExpr function will attempt to create Expr<> directly for Relational, which won't work for the above reason. Implement an overload of AsExpr for Relational, which will wrap the Relational in Relational before creating Expr<>. --- flang/include/flang/Evaluate/tools.h | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h index 1414eaf14f7d6..cd29872c1e0fd 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -125,11 +125,18 @@ template bool IsCoarray(const A &x) { return GetCorank(x) > 0; } // Generalizing packagers: these take operations and expressions of more // specific types and wrap them in Expr<> containers of more abstract types. - template common::IfNoLvalue>, A> AsExpr(A &&x) { return Expr>{std::move(x)}; } +template ::Result> +Expr AsExpr(Relational &&x) { + // The variant in Expr> only contains + // Relational, not other Relationals. Wrap the Relational + // in Relational before creating Expr<>. 
+ return Expr(Relational{std::move(x)});
+}
+
 template Expr AsExpr(Expr &&x) {
 static_assert(IsSpecificIntrinsicType);
 return std::move(x);

>From a9daa4c7e4df3b0f0ca79f0abe55fc4f124f50cd Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sun, 4 May 2025 11:14:49 -0500
Subject: [PATCH 2/2] Add deleted empty line back

---
 flang/include/flang/Evaluate/tools.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h
index cd29872c1e0fd..5cdabb3056d8f 100644
--- a/flang/include/flang/Evaluate/tools.h
+++ b/flang/include/flang/Evaluate/tools.h
@@ -125,6 +125,7 @@ template bool IsCoarray(const A &x) { return GetCorank(x) > 0; }
 // Generalizing packagers: these take operations and expressions of more
 // specific types and wrap them in Expr<> containers of more abstract types.
+
 template common::IfNoLvalue>, A> AsExpr(A &&x) {
 return Expr>{std::move(x)};
 }

From flang-commits at lists.llvm.org Sun May 4 09:33:30 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Sun, 04 May 2025 09:33:30 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][Evaluate] Fix AsGenericExpr for Relational (PR #138455)
In-Reply-To:
Message-ID: <681796da.050a0220.24fefa.4ab0@mx.google.com>

kparzysz wrote:

The commit message reads `... only contains Relational<SomeType>, not other, more specific Relational<T> types.` The rendering in the top comment doesn't show some of the <...>s for some reason.
https://github.com/llvm/llvm-project/pull/138455

From flang-commits at lists.llvm.org Sun May 4 10:23:19 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Sun, 04 May 2025 10:23:19 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][Evaluate] Fix AsGenericExpr for Relational (PR #138455)
In-Reply-To:
Message-ID: <6817a287.170a0220.193163.0911@mx.google.com>

kparzysz wrote:

The Windows pre-merge test failure is some kind of glitch: the builder gets shut down for some reason while still building. The buildkite Windows tests pass.

https://github.com/llvm/llvm-project/pull/138455

From flang-commits at lists.llvm.org Sun May 4 23:34:02 2025
From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits)
Date: Sun, 04 May 2025 23:34:02 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Retrieve shape from selector when generating assoc sym type (PR #137117)
In-Reply-To:
Message-ID: <68185bda.170a0220.33f204.2779@mx.google.com>

ergawy wrote:

Ping! Please take another look when you have time.

https://github.com/llvm/llvm-project/pull/137117

From flang-commits at lists.llvm.org Sun May 4 23:46:29 2025
From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits)
Date: Sun, 04 May 2025 23:46:29 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Retrieve shape from selector when generating assoc sym type (PR #137117)
In-Reply-To:
Message-ID: <68185ec5.a70a0220.382055.9578@mx.google.com>

https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/137117

>From 10e4545372442ddb50c0dd6855209499a14bf06e Mon Sep 17 00:00:00 2001
From: ergawy
Date: Thu, 24 Apr 2025 00:34:44 -0500
Subject: [PATCH] [flang] Retrieve shape from selector when generating assoc sym type

This PR extends `genSymbolType` so that the type of an associating symbol carries the shape of the selector expression, if any. This is a fix for a bug that was triggered when an associating symbol is used in a locality specifier.
For example, given the following input: ```fortran associate(a => aa(4:)) do concurrent (i = 4:11) local(a) a(i) = 0 end do end associate ``` before the changes in the PR, flang would assert that we are casting between incompatible types. The issue happened since for the associating symbol (`a`), flang generated its type as `f32` rather than `!fir.array<8xf32>` as it should be in this case. --- flang/lib/Lower/ConvertType.cpp | 17 ++++++++++++++ .../do_concurrent_local_assoc_entity.f90 | 22 +++++++++++++++++++ 2 files changed, 39 insertions(+) create mode 100644 flang/test/Lower/do_concurrent_local_assoc_entity.f90 diff --git a/flang/lib/Lower/ConvertType.cpp b/flang/lib/Lower/ConvertType.cpp index d45f9e7c0bf1b..875bdba6cc6ba 100644 --- a/flang/lib/Lower/ConvertType.cpp +++ b/flang/lib/Lower/ConvertType.cpp @@ -279,6 +279,23 @@ struct TypeBuilderImpl { bool isPolymorphic = (Fortran::semantics::IsPolymorphic(symbol) || Fortran::semantics::IsUnlimitedPolymorphic(symbol)) && !Fortran::semantics::IsAssumedType(symbol); + if (const auto *assocDetails = + ultimate.detailsIf()) { + const auto &selector = assocDetails->expr(); + + if (selector && selector->Rank() > 0) { + auto shapeExpr = Fortran::evaluate::GetShape( + converter.getFoldingContext(), selector); + + fir::SequenceType::Shape shape; + // If there is no shapExpr, this is an assumed-rank, and the empty shape + // will build the desired fir.array<*:T> type. + if (shapeExpr) + translateShape(shape, std::move(*shapeExpr)); + ty = fir::SequenceType::get(shape, ty); + } + } + if (ultimate.IsObjectArray()) { auto shapeExpr = Fortran::evaluate::GetShape(converter.getFoldingContext(), ultimate); diff --git a/flang/test/Lower/do_concurrent_local_assoc_entity.f90 b/flang/test/Lower/do_concurrent_local_assoc_entity.f90 new file mode 100644 index 0000000000000..ca16ecaa5c137 --- /dev/null +++ b/flang/test/Lower/do_concurrent_local_assoc_entity.f90 @@ -0,0 +1,22 @@ +! 
RUN: %flang_fc1 -emit-hlfir -o - %s | FileCheck %s
+
+subroutine local_assoc
+ implicit none
+ integer i
+ real, dimension(2:11) :: aa
+
+ associate(a => aa(4:))
+ do concurrent (i = 4:11) local(a)
+ a(i) = 0
+ end do
+ end associate
+end subroutine local_assoc
+
+! CHECK: %[[C8:.*]] = arith.constant 8 : index
+
+! CHECK: fir.do_loop {{.*}} unordered {
+! CHECK: %[[LOCAL_ALLOC:.*]] = fir.alloca !fir.array<8xf32> {bindc_name = "a", pinned, uniq_name = "{{.*}}local_assocEa"}
+! CHECK: %[[LOCAL_SHAPE:.*]] = fir.shape %[[C8]] :
+! CHECK: %[[LOCAL_DECL:.*]]:2 = hlfir.declare %[[LOCAL_ALLOC]](%[[LOCAL_SHAPE]])
+! CHECK: hlfir.designate %[[LOCAL_DECL]]#0 (%{{.*}})
+! CHECK: }

From flang-commits at lists.llvm.org Mon May 5 00:25:10 2025
From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits)
Date: Mon, 05 May 2025 00:25:10 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][fir] Lower `do concurrent` loop nests to `fir.do_concurrent` (PR #137928)
In-Reply-To:
Message-ID: <681867d6.170a0220.25e6b1.3e98@mx.google.com>

https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/137928

From flang-commits at lists.llvm.org Mon May 5 00:26:40 2025
From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits)
Date: Mon, 05 May 2025 00:26:40 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Retrieve shape from selector when generating assoc sym type (PR #137117)
In-Reply-To:
Message-ID: <68186830.170a0220.6a323.f613@mx.google.com>

https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/137117

From flang-commits at lists.llvm.org Mon May 5 00:35:24 2025
From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits)
Date: Mon, 05 May 2025 00:35:24 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][fir] Lower `do concurrent` loop nests to `fir.do_concurrent` (PR #137928)
In-Reply-To:
Message-ID: <68186a3c.050a0220.142dbb.ae04@mx.google.com>

https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/137928

From flang-commits at lists.llvm.org Mon May 5 00:56:58 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 05 May 2025 00:56:58 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012)
In-Reply-To:
Message-ID: <68186f4a.630a0220.20324f.4fc6@mx.google.com>

https://github.com/jeanPerier approved this pull request.

Bridges.cpp changes look good to me, thanks for addressing my comments! Please wait for @klausler and @razvanlupusoru or @clementval approval to make sure you have addressed their comments in semantics/ACC lowering.

https://github.com/llvm/llvm-project/pull/136012

From flang-commits at lists.llvm.org Mon May 5 01:51:36 2025
From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits)
Date: Mon, 05 May 2025 01:51:36 -0700 (PDT)
Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR (PR #111155)
In-Reply-To:
Message-ID: <68187c18.170a0220.8ec2.47b1@mx.google.com>

kaviya2510 wrote:

According to the OpenMP specification, the rules for variables with implicitly determined data-sharing attributes are:

* In a _parallel_ construct, if no _default_ clause is present, these variables are **shared**.
* In a task generating construct, if no _default_ clause is present, a variable for which the data-sharing attribute is not determined by the rules above and that in the enclosing context is determined to be shared by all implicit tasks bound to the current team is **shared**.
* In an orphaned task generating construct, if no _default_ clause is present, dummy arguments are **firstprivate**.
* In a task generating construct, if no _default_ clause is present, a variable for which the data-sharing attribute is not determined by the rules above is **firstprivate**.

```fortran
subroutine omp_task_in_reduction()
  integer i
  i = 0
  !$omp task in_reduction(+:i)
  i = i + 1
  !$omp end task
end subroutine omp_task_in_reduction
```

In the above example, without any explicit data-sharing attribute, the variable `i` would normally be considered `firstprivate` in the task construct. However, since the variable `i` appears in the `in_reduction` clause, its data-sharing attribute should not be firstprivate.

With the current flow, the compiler incorrectly marks the data-sharing attribute of `i` as `firstprivate`, based on the implicit rule for task constructs, and as a result it generates the MLIR below during lowering.

```mlir
omp.private {type = firstprivate} @_QFomp_task_in_reductionEi_firstprivate_ref_i32 : !fir.ref<i32> alloc {
^bb0(%arg0: !fir.ref<i32>):
  %0 = fir.alloca i32 {bindc_name = "i", pinned, uniq_name = "_QFomp_task_in_reductionEi"}
  %1:2 = hlfir.declare %0 {uniq_name = "_QFomp_task_in_reductionEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
  omp.yield(%1#0 : !fir.ref<i32>)
} copy {
^bb0(%arg0: !fir.ref<i32>, %arg1: !fir.ref<i32>):
  %0 = fir.load %arg0 : !fir.ref<i32>
  hlfir.assign %0 to %arg1 : i32, !fir.ref<i32>
  omp.yield(%arg1 : !fir.ref<i32>)
}
```

I now feel that my earlier fix in DataSharingProcessor.cpp is not correct. Instead of skipping the collection of symbols in DataSharingProcessor.cpp, I added a new change which detects the data-sharing attribute of variable `i` as in_reduction and skips marking it as firstprivate.

@tblah, could you please take a look at it and review my changes?
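The ordering argued for in this comment (an explicit `in_reduction` clause takes precedence over the implicit task rules) can be sketched as a small decision function. This is only an illustration of the precedence, not the flang implementation: `DataSharing`, `classifyInTask`, and `usageExample` are hypothetical names, and the sketch deliberately ignores `default` clauses and the orphaned-construct rule for dummy arguments.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class DataSharing { Shared, FirstPrivate, InReduction };

// Hypothetical helper, not part of flang. Decides the attribute of `name`
// inside a task construct that has no default clause.
inline DataSharing classifyInTask(const std::string &name,
                                  const std::set<std::string> &inReductionVars,
                                  bool sharedInEnclosingContext) {
  // An explicit in_reduction clause wins over any implicit rule.
  if (inReductionVars.count(name) != 0)
    return DataSharing::InReduction;
  // Implicit rule: shared in the enclosing context stays shared.
  if (sharedInEnclosingContext)
    return DataSharing::Shared;
  // Implicit fallback for task generating constructs.
  return DataSharing::FirstPrivate;
}

inline void usageExample() {
  std::set<std::string> inReduction{"i"};
  // `i` is named in in_reduction(+:i), so it must not become firstprivate.
  assert(classifyInTask("i", inReduction, false) == DataSharing::InReduction);
}
```

Checking the explicit clause first is what keeps the firstprivate fallback from ever being reached for `i` in the example subroutine.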
https://github.com/llvm/llvm-project/pull/111155 From flang-commits at lists.llvm.org Mon May 5 01:45:27 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Mon, 05 May 2025 01:45:27 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR (PR #111155) In-Reply-To: Message-ID: <68187aa7.a70a0220.1f25b0.9684@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/111155 >From 60cbcc29d9d0628db19e498377759b6affb2b2b5 Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Fri, 6 Dec 2024 18:40:03 +0530 Subject: [PATCH 1/5] [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 55 +++++++++++++- flang/lib/Lower/OpenMP/ClauseProcessor.h | 7 ++ .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 7 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 71 +++++++++++++------ flang/lib/Lower/OpenMP/ReductionProcessor.cpp | 41 +++++++++-- flang/lib/Lower/OpenMP/ReductionProcessor.h | 4 +- .../Lower/OpenMP/Todo/task-inreduction.f90 | 15 ---- .../OpenMP/Todo/taskgroup-task-reduction.f90 | 10 --- flang/test/Lower/OpenMP/task-inreduction.f90 | 35 +++++++++ .../OpenMP/taskgroup-task-array-reduction.f90 | 49 +++++++++++++ .../OpenMP/taskgroup-task_reduction01.f90 | 34 +++++++++ .../OpenMP/taskgroup-task_reduction02.f90 | 36 ++++++++++ 12 files changed, 305 insertions(+), 59 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/task-inreduction.f90 delete mode 100644 flang/test/Lower/OpenMP/Todo/taskgroup-task-reduction.f90 create mode 100644 flang/test/Lower/OpenMP/task-inreduction.f90 create mode 100644 flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 create mode 100644 flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 create mode 100644 flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp 
b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 48c559a78b9bc..1f94458ff0b97 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -916,6 +916,30 @@ bool ClauseProcessor::processIsDevicePtr( }); } +bool ClauseProcessor::processInReduction( + mlir::Location currentLocation, mlir::omp::InReductionClauseOps &result, + llvm::SmallVectorImpl &outReductionSyms) const { + return findRepeatableClause( + [&](const omp::clause::InReduction &clause, const parser::CharBlock &) { + llvm::SmallVector inReductionVars; + llvm::SmallVector inReduceVarByRef; + llvm::SmallVector inReductionDeclSymbols; + llvm::SmallVector inReductionSyms; + ReductionProcessor rp; + rp.addDeclareReduction( + currentLocation, converter, clause, inReductionVars, + inReduceVarByRef, inReductionDeclSymbols, inReductionSyms); + + // Copy local lists into the output. + llvm::copy(inReductionVars, std::back_inserter(result.inReductionVars)); + llvm::copy(inReduceVarByRef, + std::back_inserter(result.inReductionByref)); + llvm::copy(inReductionDeclSymbols, + std::back_inserter(result.inReductionSyms)); + llvm::copy(inReductionSyms, std::back_inserter(outReductionSyms)); + }); +} + bool ClauseProcessor::processLink( llvm::SmallVectorImpl &result) const { return findRepeatableClause( @@ -1126,9 +1150,10 @@ bool ClauseProcessor::processReduction( llvm::SmallVector reductionDeclSymbols; llvm::SmallVector reductionSyms; ReductionProcessor rp; - rp.addDeclareReduction(currentLocation, converter, clause, - reductionVars, reduceVarByRef, - reductionDeclSymbols, reductionSyms); + + rp.addDeclareReduction( + currentLocation, converter, clause, reductionVars, reduceVarByRef, + reductionDeclSymbols, reductionSyms); // Copy local lists into the output. 
llvm::copy(reductionVars, std::back_inserter(result.reductionVars)); @@ -1139,6 +1164,30 @@ bool ClauseProcessor::processReduction( }); } +bool ClauseProcessor::processTaskReduction( + mlir::Location currentLocation, mlir::omp::TaskReductionClauseOps &result, + llvm::SmallVectorImpl &outReductionSyms) const { + return findRepeatableClause( + [&](const omp::clause::TaskReduction &clause, const parser::CharBlock &) { + llvm::SmallVector taskReductionVars; + llvm::SmallVector TaskReduceVarByRef; + llvm::SmallVector TaskReductionDeclSymbols; + llvm::SmallVector TaskReductionSyms; + ReductionProcessor rp; + rp.addDeclareReduction( + currentLocation, converter, clause, taskReductionVars, + TaskReduceVarByRef, TaskReductionDeclSymbols, TaskReductionSyms); + // Copy local lists into the output. + llvm::copy(taskReductionVars, + std::back_inserter(result.taskReductionVars)); + llvm::copy(TaskReduceVarByRef, + std::back_inserter(result.taskReductionByref)); + llvm::copy(TaskReductionDeclSymbols, + std::back_inserter(result.taskReductionSyms)); + llvm::copy(TaskReductionSyms, std::back_inserter(outReductionSyms)); + }); +} + bool ClauseProcessor::processTo( llvm::SmallVectorImpl &result) const { return findRepeatableClause( diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index e0fe917c50e8f..e042d3a1efdc8 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -105,6 +105,9 @@ class ClauseProcessor { bool processIsDevicePtr( mlir::omp::IsDevicePtrClauseOps &result, llvm::SmallVectorImpl &isDeviceSyms) const; + bool processInReduction( + mlir::Location currentLocation, mlir::omp::InReductionClauseOps &result, + llvm::SmallVectorImpl &InReductionSyms) const; bool processLink(llvm::SmallVectorImpl &result) const; @@ -123,6 +126,10 @@ class ClauseProcessor { bool processReduction( mlir::Location currentLocation, mlir::omp::ReductionClauseOps &result, llvm::SmallVectorImpl 
&reductionSyms) const; + bool processTaskReduction(mlir::Location currentLocation, + mlir::omp::TaskReductionClauseOps &result, + llvm::SmallVectorImpl + &TaskReductionSyms) const; bool processTo(llvm::SmallVectorImpl &result) const; bool processUseDeviceAddr( lower::StatementContext &stmtCtx, diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 99835c515463b..d4377498ccad0 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -344,8 +344,13 @@ void DataSharingProcessor::collectSymbols( // Collect all symbols referenced in the evaluation being processed, // that matches 'flag'. llvm::SetVector allSymbols; + bool collectSymbols = true; + for (const omp::Clause &clause : clauses) { + if (clause.id == llvm::omp::Clause::OMPC_in_reduction) + collectSymbols = false; + } converter.collectSymbolSet(eval, allSymbols, flag, - /*collectSymbols=*/true, + /*collectSymbols=*/collectSymbols, /*collectHostAssociatedSymbols=*/true); llvm::SetVector symbolsInNestedRegions; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index c167d347b4315..f657a2ef0a26d 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1249,34 +1249,35 @@ static void genTargetEnterExitUpdateDataClauses( cp.processNowait(clauseOps); } -static void genTaskClauses(lower::AbstractConverter &converter, - semantics::SemanticsContext &semaCtx, - lower::StatementContext &stmtCtx, - const List &clauses, mlir::Location loc, - mlir::omp::TaskOperands &clauseOps) { +static void genTaskClauses( + lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, + lower::StatementContext &stmtCtx, const List &clauses, + mlir::Location loc, mlir::omp::TaskOperands &clauseOps, + llvm::SmallVectorImpl &InReductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processAllocate(clauseOps); 
cp.processDepend(clauseOps); cp.processFinal(stmtCtx, clauseOps); cp.processIf(llvm::omp::Directive::OMPD_task, clauseOps); + cp.processInReduction(loc, clauseOps, InReductionSyms); cp.processMergeable(clauseOps); cp.processPriority(stmtCtx, clauseOps); cp.processUntied(clauseOps); cp.processDetach(clauseOps); // TODO Support delayed privatization. - cp.processTODO( + cp.processTODO( loc, llvm::omp::Directive::OMPD_task); } -static void genTaskgroupClauses(lower::AbstractConverter &converter, - semantics::SemanticsContext &semaCtx, - const List &clauses, mlir::Location loc, - mlir::omp::TaskgroupOperands &clauseOps) { +static void genTaskgroupClauses( + lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, + const List &clauses, mlir::Location loc, + mlir::omp::TaskgroupOperands &clauseOps, + llvm::SmallVectorImpl &taskReductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processAllocate(clauseOps); - cp.processTODO(loc, - llvm::omp::Directive::OMPD_taskgroup); + cp.processTaskReduction(loc, clauseOps, taskReductionSyms); } static void genTaskwaitClauses(lower::AbstractConverter &converter, @@ -1887,7 +1888,9 @@ genTaskOp(lower::AbstractConverter &converter, lower::SymMap &symTable, ConstructQueue::const_iterator item) { lower::StatementContext stmtCtx; mlir::omp::TaskOperands clauseOps; - genTaskClauses(converter, semaCtx, stmtCtx, item->clauses, loc, clauseOps); + llvm::SmallVector InReductionSyms; + genTaskClauses(converter, semaCtx, stmtCtx, item->clauses, loc, clauseOps, + InReductionSyms); if (!enableDelayedPrivatization) return genOpWithBody( @@ -1904,22 +1907,35 @@ genTaskOp(lower::AbstractConverter &converter, lower::SymMap &symTable, EntryBlockArgs taskArgs; taskArgs.priv.syms = dsp.getDelayedPrivSymbols(); taskArgs.priv.vars = clauseOps.privateVars; + taskArgs.inReduction.syms = InReductionSyms; + taskArgs.inReduction.vars = clauseOps.inReductionVars; auto genRegionEntryCB = [&](mlir::Operation *op) { 
genEntryBlock(converter.getFirOpBuilder(), taskArgs, op->getRegion(0)); bindEntryBlockArgs(converter, llvm::cast(op), taskArgs); - return llvm::to_vector(taskArgs.priv.syms); + return llvm::to_vector(taskArgs.getSyms()); }; - return genOpWithBody( + OpWithBodyGenInfo genInfo = OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, llvm::omp::Directive::OMPD_task) .setClauses(&item->clauses) .setDataSharingProcessor(&dsp) - .setGenRegionEntryCb(genRegionEntryCB), - queue, item, clauseOps); + .setGenRegionEntryCb(genRegionEntryCB); + + auto taskOp = + genOpWithBody(genInfo, queue, item, clauseOps); + + llvm::SmallVector inReductionTypes; + for (const auto &inreductionVar : clauseOps.inReductionVars) + inReductionTypes.push_back(inreductionVar.getType()); + + // Add reduction variables as entry block arguments to the task region + llvm::SmallVector blockArgLocs(InReductionSyms.size(), loc); + taskOp->getRegion(0).addArguments(inReductionTypes, blockArgLocs); + return taskOp; } static mlir::omp::TaskgroupOp @@ -1929,13 +1945,26 @@ genTaskgroupOp(lower::AbstractConverter &converter, lower::SymMap &symTable, const ConstructQueue &queue, ConstructQueue::const_iterator item) { mlir::omp::TaskgroupOperands clauseOps; - genTaskgroupClauses(converter, semaCtx, item->clauses, loc, clauseOps); + llvm::SmallVector taskReductionSyms; + genTaskgroupClauses(converter, semaCtx, item->clauses, loc, clauseOps, + taskReductionSyms); - return genOpWithBody( + OpWithBodyGenInfo genInfo = OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, llvm::omp::Directive::OMPD_taskgroup) - .setClauses(&item->clauses), - queue, item, clauseOps); + .setClauses(&item->clauses); + + auto taskgroupOp = + genOpWithBody(genInfo, queue, item, clauseOps); + + llvm::SmallVector taskReductionTypes; + for (const auto &taskreductionVar : clauseOps.taskReductionVars) + taskReductionTypes.push_back(taskreductionVar.getType()); + + // Add reduction variables as entry block arguments to the taskgroup 
region + llvm::SmallVector blockArgLocs(taskReductionSyms.size(), loc); + taskgroupOp->getRegion(0).addArguments(taskReductionTypes, blockArgLocs); + return taskgroupOp; } static mlir::omp::TaskwaitOp diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp index 736de2ee511be..4bdfda701a9c8 100644 --- a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp @@ -24,6 +24,7 @@ #include "flang/Parser/tools.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "llvm/Support/CommandLine.h" +#include static llvm::cl::opt forceByrefReduction( "force-byref-reduction", @@ -34,6 +35,32 @@ namespace Fortran { namespace lower { namespace omp { +// explicit template declarations +template void ReductionProcessor::addDeclareReduction( + mlir::Location currentLocation, lower::AbstractConverter &converter, + const omp::clause::Reduction &reduction, + llvm::SmallVectorImpl &reductionVars, + llvm::SmallVectorImpl &reduceVarByRef, + llvm::SmallVectorImpl &reductionDeclSymbols, + llvm::SmallVectorImpl &reductionSymbols); + +template void +ReductionProcessor::addDeclareReduction( + mlir::Location currentLocation, lower::AbstractConverter &converter, + const omp::clause::TaskReduction &reduction, + llvm::SmallVectorImpl &reductionVars, + llvm::SmallVectorImpl &reduceVarByRef, + llvm::SmallVectorImpl &reductionDeclSymbols, + llvm::SmallVectorImpl &reductionSymbols); + +template void ReductionProcessor::addDeclareReduction( + mlir::Location currentLocation, lower::AbstractConverter &converter, + const omp::clause::InReduction &reduction, + llvm::SmallVectorImpl &reductionVars, + llvm::SmallVectorImpl &reduceVarByRef, + llvm::SmallVectorImpl &reductionDeclSymbols, + llvm::SmallVectorImpl &reductionSymbols); + ReductionProcessor::ReductionIdentifier ReductionProcessor::getReductionType( const omp::clause::ProcedureDesignator &pd) { auto redType = llvm::StringSwitch>( @@ -716,22 +743,22 @@ 
static bool doReductionByRef(mlir::Value reductionVar) { return false; } +template void ReductionProcessor::addDeclareReduction( mlir::Location currentLocation, lower::AbstractConverter &converter, - const omp::clause::Reduction &reduction, - llvm::SmallVectorImpl &reductionVars, + const T &reduction, llvm::SmallVectorImpl &reductionVars, llvm::SmallVectorImpl &reduceVarByRef, llvm::SmallVectorImpl &reductionDeclSymbols, llvm::SmallVectorImpl &reductionSymbols) { fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - if (std::get>( - reduction.t)) - TODO(currentLocation, "Reduction modifiers are not supported"); + if constexpr (std::is_same::value) { + if (std::get>(reduction.t)) + TODO(currentLocation, "Reduction modifiers are not supported"); + } mlir::omp::DeclareReductionOp decl; const auto &redOperatorList{ - std::get(reduction.t)}; + std::get(reduction.t)}; assert(redOperatorList.size() == 1 && "Expecting single operator"); const auto &redOperator = redOperatorList.front(); const auto &objectList{std::get(reduction.t)}; diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.h b/flang/lib/Lower/OpenMP/ReductionProcessor.h index 5f4d742b62cb1..91b54da314243 100644 --- a/flang/lib/Lower/OpenMP/ReductionProcessor.h +++ b/flang/lib/Lower/OpenMP/ReductionProcessor.h @@ -120,10 +120,10 @@ class ReductionProcessor { /// Creates a reduction declaration and associates it with an OpenMP block /// directive. 
+ template static void addDeclareReduction( mlir::Location currentLocation, lower::AbstractConverter &converter, - const omp::clause::Reduction &reduction, - llvm::SmallVectorImpl &reductionVars, + const T &reduction, llvm::SmallVectorImpl &reductionVars, llvm::SmallVectorImpl &reduceVarByRef, llvm::SmallVectorImpl &reductionDeclSymbols, llvm::SmallVectorImpl &reductionSymbols); diff --git a/flang/test/Lower/OpenMP/Todo/task-inreduction.f90 b/flang/test/Lower/OpenMP/Todo/task-inreduction.f90 deleted file mode 100644 index aeed680a6dba7..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/task-inreduction.f90 +++ /dev/null @@ -1,15 +0,0 @@ -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s - -!=============================================================================== -! `mergeable` clause -!=============================================================================== - -! CHECK: not yet implemented: Unhandled clause IN_REDUCTION in TASK construct -subroutine omp_task_in_reduction() - integer i - i = 0 - !$omp task in_reduction(+:i) - i = i + 1 - !$omp end task -end subroutine omp_task_in_reduction diff --git a/flang/test/Lower/OpenMP/Todo/taskgroup-task-reduction.f90 b/flang/test/Lower/OpenMP/Todo/taskgroup-task-reduction.f90 deleted file mode 100644 index 1cb471d784d76..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/taskgroup-task-reduction.f90 +++ /dev/null @@ -1,10 +0,0 @@ -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s - -! 
CHECK: not yet implemented: Unhandled clause TASK_REDUCTION in TASKGROUP construct -subroutine omp_taskgroup_task_reduction - integer :: res - !$omp taskgroup task_reduction(+:res) - res = res + 1 - !$omp end taskgroup -end subroutine omp_taskgroup_task_reduction diff --git a/flang/test/Lower/OpenMP/task-inreduction.f90 b/flang/test/Lower/OpenMP/task-inreduction.f90 new file mode 100644 index 0000000000000..ded4710d5c13d --- /dev/null +++ b/flang/test/Lower/OpenMP/task-inreduction.f90 @@ -0,0 +1,35 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +!CHECK-LABEL: omp.declare_reduction +!CHECK-SAME: @[[RED_I32_NAME:.*]] : i32 init { +!CHECK: ^bb0(%{{.*}}: i32): +!CHECK: %[[C0_1:.*]] = arith.constant 0 : i32 +!CHECK: omp.yield(%[[C0_1]] : i32) +!CHECK: } combiner { +!CHECK: ^bb0(%[[ARG0:.*]]: i32, %[[ARG1:.*]]: i32): +!CHECK: %[[RES:.*]] = arith.addi %[[ARG0]], %[[ARG1]] : i32 +!CHECK: omp.yield(%[[RES]] : i32) +!CHECK: } + +!CHECK-LABEL: func.func @_QPomp_task_in_reduction() { +! [...] 
+!CHECK: omp.task in_reduction(@[[RED_I32_NAME]] %[[VAL_1:.*]]#0 -> %[[ARG0]] : !fir.ref) { +!CHECK: %[[VAL_4:.*]]:2 = hlfir.declare %[[ARG0]] +!CHECK-SAME: {uniq_name = "_QFomp_task_in_reductionEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[VAL_5:.*]] = fir.load %[[VAL_4]]#0 : !fir.ref +!CHECK: %[[VAL_6:.*]] = arith.constant 1 : i32 +!CHECK: %[[VAL_7:.*]] = arith.addi %[[VAL_5]], %[[VAL_6]] : i32 +!CHECK: hlfir.assign %[[VAL_7]] to %[[VAL_4]]#0 : i32, !fir.ref +!CHECK: omp.terminator +!CHECK: } +!CHECK: return +!CHECK: } + +subroutine omp_task_in_reduction() + integer i + i = 0 + !$omp task in_reduction(+:i) + i = i + 1 + !$omp end task +end subroutine omp_task_in_reduction \ No newline at end of file diff --git a/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 b/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 new file mode 100644 index 0000000000000..7e6d7f09fbc67 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 @@ -0,0 +1,49 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.declare_reduction @add_reduction_byref_box_Uxf32 : !fir.ref>> alloc { +! [...] +! CHECK: omp.yield +! CHECK-LABEL: } init { +! [...] +! CHECK: omp.yield +! CHECK-LABEL: } combiner { +! [...] +! CHECK: omp.yield +! CHECK-LABEL: } cleanup { +! [...] +! CHECK: omp.yield +! CHECK: } + +! CHECK-LABEL: func.func @_QPtaskreduction +! CHECK-SAME: (%[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}) { +! CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_0]] dummy_scope %[[VAL_1]] +! CHECK-SAME {uniq_name = "_QFtaskreductionEx"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +! CHECK: omp.parallel { +! CHECK: %[[VAL_3:.*]] = fir.alloca !fir.box> +! CHECK: fir.store %[[VAL_2]]#1 to %[[VAL_3]] : !fir.ref>> +! 
CHECK: omp.taskgroup task_reduction(byref @add_reduction_byref_box_Uxf32 %[[VAL_3]] -> %[[VAL_4:.*]]: !fir.ref>>) { +! CHECK: %[[VAL_5:.*]] = fir.alloca !fir.box> +! CHECK: fir.store %[[VAL_2]]#1 to %[[VAL_5]] : !fir.ref>> +! CHECK: omp.task in_reduction(byref @add_reduction_byref_box_Uxf32 %[[VAL_5]] -> %[[VAL_6:.*]] : !fir.ref>>) { +! [...] +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } + +subroutine taskReduction(x) + real, dimension(:) :: x + !$omp parallel + !$omp taskgroup task_reduction(+:x) + !$omp task in_reduction(+:x) + x = x + 1 + !$omp end task + !$omp end taskgroup + !$omp end parallel +end subroutine \ No newline at end of file diff --git a/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 b/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 new file mode 100644 index 0000000000000..bc32cee93d47f --- /dev/null +++ b/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 @@ -0,0 +1,34 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +!CHECK-LABEL: omp.declare_reduction +!CHECK-SAME: @[[RED_I32_NAME:.*]] : i32 init { +!CHECK: ^bb0(%{{.*}}: i32): +!CHECK: %[[C0_1:.*]] = arith.constant 0 : i32 +!CHECK: omp.yield(%[[C0_1]] : i32) +!CHECK: } combiner { +!CHECK: ^bb0(%[[ARG0:.*]]: i32, %[[ARG1:.*]]: i32): +!CHECK: %[[RES:.*]] = arith.addi %[[ARG0]], %[[ARG1]] : i32 +!CHECK: omp.yield(%[[RES]] : i32) +!CHECK: } + +!CHECK-LABEL: func.func @_QPomp_taskgroup_task_reduction() { +!CHECK: %[[VAL_0:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskgroup_task_reductionEres"} +!CHECK: %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFomp_taskgroup_task_reductionEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: omp.taskgroup task_reduction(@[[RED_I32_NAME]] %[[VAL_1]]#0 -> %[[VAL_2:.*]] : !fir.ref) { +!CHECK: %[[VAL_3:.*]] = fir.load %[[VAL_1]]#0 : !fir.ref +!CHECK: %[[VAL_4:.*]] = arith.constant 1 : i32 +!CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_3]], %[[VAL_4]] : i32 +!CHECK: hlfir.assign %[[VAL_5]] to %[[VAL_1]]#0 : i32, !fir.ref +!CHECK: omp.terminator +!CHECK: } +!CHECK: return +!CHECK: } + + +subroutine omp_taskgroup_task_reduction() + integer :: res + !$omp taskgroup task_reduction(+:res) + res = res + 1 + !$omp end taskgroup +end subroutine omp_taskgroup_task_reduction \ No newline at end of file diff --git a/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 b/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 new file mode 100644 index 0000000000000..6a5bc568efb8e --- /dev/null +++ b/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 @@ -0,0 +1,36 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +!CHECK-LABEL: omp.declare_reduction +!CHECK-SAME: @[[RED_I32_NAME:.*]] : i32 init { +!CHECK: ^bb0(%{{.*}}: i32): +!CHECK: %[[C0_1:.*]] = arith.constant 0 : i32 +!CHECK: omp.yield(%[[C0_1]] : i32) +!CHECK: } combiner { +!CHECK: ^bb0(%[[ARG0:.*]]: i32, %[[ARG1:.*]]: i32): +!CHECK: %[[RES:.*]] = arith.addi %[[ARG0]], %[[ARG1]] : i32 +!CHECK: omp.yield(%[[RES]] : i32) +!CHECK: } + +!CHECK-LABEL: func.func @_QPin_reduction() { +! [...] +!CHECK: omp.taskgroup task_reduction(@[[RED_I32_NAME]] %[[VAL_1:.*]]#0 -> %[[VAL_3:.*]] : !fir.ref) { +!CHECK: omp.task in_reduction(@[[RED_I32_NAME]] %[[VAL_1]]#0 -> %[[VAL_4:.*]] : !fir.ref) { +!CHECK: %[[VAL_5:.*]]:2 = hlfir.declare %[[VAL_4]] {uniq_name = "_QFin_reductionEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! [...] +!CHECK: omp.terminator +!CHECK: } +!CHECK: omp.terminator +!CHECK: } +!CHECK: return +!CHECK: } + +subroutine in_reduction + integer :: x + x = 0 + !$omp taskgroup task_reduction(+:x) + !$omp task in_reduction(+:x) + x = x + 1 + !$omp end task + !$omp end taskgroup +end subroutine \ No newline at end of file >From 1b5a47ebf3bfd57ce4d0b78657fd09de1268ba93 Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Fri, 13 Dec 2024 17:21:53 +0530 Subject: [PATCH 2/5] [Flang][OpenMP] Addressed review comments --- flang/lib/Lower/OpenMP/ClauseProcessor.h | 9 ++-- .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 12 +++--- flang/lib/Lower/OpenMP/OpenMP.cpp | 41 ++++++++----------- flang/test/Lower/OpenMP/task-inreduction.f90 | 2 +- .../OpenMP/taskgroup-task-array-reduction.f90 | 2 +- .../OpenMP/taskgroup-task_reduction01.f90 | 2 +- .../OpenMP/taskgroup-task_reduction02.f90 | 4 +- 7 files changed, 33 insertions(+), 39 deletions(-) diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index e042d3a1efdc8..764964fc706e4 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ 
b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -107,7 +107,7 @@ class ClauseProcessor { llvm::SmallVectorImpl &isDeviceSyms) const; bool processInReduction( mlir::Location currentLocation, mlir::omp::InReductionClauseOps &result, - llvm::SmallVectorImpl &InReductionSyms) const; + llvm::SmallVectorImpl &outReductionSyms) const; bool processLink(llvm::SmallVectorImpl &result) const; @@ -126,10 +126,9 @@ class ClauseProcessor { bool processReduction( mlir::Location currentLocation, mlir::omp::ReductionClauseOps &result, llvm::SmallVectorImpl &reductionSyms) const; - bool processTaskReduction(mlir::Location currentLocation, - mlir::omp::TaskReductionClauseOps &result, - llvm::SmallVectorImpl - &TaskReductionSyms) const; + bool processTaskReduction( + mlir::Location currentLocation, mlir::omp::TaskReductionClauseOps &result, + llvm::SmallVectorImpl &outReductionSyms) const; bool processTo(llvm::SmallVectorImpl &result) const; bool processUseDeviceAddr( lower::StatementContext &stmtCtx, diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index d4377498ccad0..b4422cdd72546 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -344,11 +344,13 @@ void DataSharingProcessor::collectSymbols( // Collect all symbols referenced in the evaluation being processed, // that matches 'flag'. 
llvm::SetVector allSymbols; - bool collectSymbols = true; - for (const omp::Clause &clause : clauses) { - if (clause.id == llvm::omp::Clause::OMPC_in_reduction) - collectSymbols = false; - } + + auto itr = llvm::find_if(clauses, [](const omp::Clause &clause) { + return clause.id == llvm::omp::Clause::OMPC_in_reduction; + }); + + bool collectSymbols = (itr == clauses.end()); + converter.collectSymbolSet(eval, allSymbols, flag, /*collectSymbols=*/collectSymbols, /*collectHostAssociatedSymbols=*/true); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index f657a2ef0a26d..75b8c047d918e 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1253,21 +1253,20 @@ static void genTaskClauses( lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, lower::StatementContext &stmtCtx, const List &clauses, mlir::Location loc, mlir::omp::TaskOperands &clauseOps, - llvm::SmallVectorImpl &InReductionSyms) { + llvm::SmallVectorImpl &inReductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processAllocate(clauseOps); cp.processDepend(clauseOps); cp.processFinal(stmtCtx, clauseOps); cp.processIf(llvm::omp::Directive::OMPD_task, clauseOps); - cp.processInReduction(loc, clauseOps, InReductionSyms); + cp.processInReduction(loc, clauseOps, inReductionSyms); cp.processMergeable(clauseOps); cp.processPriority(stmtCtx, clauseOps); cp.processUntied(clauseOps); cp.processDetach(clauseOps); // TODO Support delayed privatization. 
- cp.processTODO( - loc, llvm::omp::Directive::OMPD_task); + cp.processTODO(loc, llvm::omp::Directive::OMPD_task); } static void genTaskgroupClauses( @@ -1888,9 +1887,9 @@ genTaskOp(lower::AbstractConverter &converter, lower::SymMap &symTable, ConstructQueue::const_iterator item) { lower::StatementContext stmtCtx; mlir::omp::TaskOperands clauseOps; - llvm::SmallVector InReductionSyms; + llvm::SmallVector inReductionSyms; genTaskClauses(converter, semaCtx, stmtCtx, item->clauses, loc, clauseOps, - InReductionSyms); + inReductionSyms); if (!enableDelayedPrivatization) return genOpWithBody( @@ -1907,7 +1906,7 @@ genTaskOp(lower::AbstractConverter &converter, lower::SymMap &symTable, EntryBlockArgs taskArgs; taskArgs.priv.syms = dsp.getDelayedPrivSymbols(); taskArgs.priv.vars = clauseOps.privateVars; - taskArgs.inReduction.syms = InReductionSyms; + taskArgs.inReduction.syms = inReductionSyms; taskArgs.inReduction.vars = clauseOps.inReductionVars; auto genRegionEntryCB = [&](mlir::Operation *op) { @@ -1927,14 +1926,6 @@ genTaskOp(lower::AbstractConverter &converter, lower::SymMap &symTable, auto taskOp = genOpWithBody(genInfo, queue, item, clauseOps); - - llvm::SmallVector inReductionTypes; - for (const auto &inreductionVar : clauseOps.inReductionVars) - inReductionTypes.push_back(inreductionVar.getType()); - - // Add reduction variables as entry block arguments to the task region - llvm::SmallVector blockArgLocs(InReductionSyms.size(), loc); - taskOp->getRegion(0).addArguments(inReductionTypes, blockArgLocs); return taskOp; } @@ -1949,21 +1940,23 @@ genTaskgroupOp(lower::AbstractConverter &converter, lower::SymMap &symTable, genTaskgroupClauses(converter, semaCtx, item->clauses, loc, clauseOps, taskReductionSyms); + EntryBlockArgs taskgroupArgs; + taskgroupArgs.taskReduction.syms = taskReductionSyms; + taskgroupArgs.taskReduction.vars = clauseOps.taskReductionVars; + + auto genRegionEntryCB = [&](mlir::Operation *op) { + genEntryBlock(converter.getFirOpBuilder(), 
taskgroupArgs, op->getRegion(0)); + return llvm::to_vector(taskgroupArgs.getSyms()); + }; + OpWithBodyGenInfo genInfo = OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, llvm::omp::Directive::OMPD_taskgroup) - .setClauses(&item->clauses); + .setClauses(&item->clauses) + .setGenRegionEntryCb(genRegionEntryCB); auto taskgroupOp = genOpWithBody(genInfo, queue, item, clauseOps); - - llvm::SmallVector taskReductionTypes; - for (const auto &taskreductionVar : clauseOps.taskReductionVars) - taskReductionTypes.push_back(taskreductionVar.getType()); - - // Add reduction variables as entry block arguments to the taskgroup region - llvm::SmallVector blockArgLocs(taskReductionSyms.size(), loc); - taskgroupOp->getRegion(0).addArguments(taskReductionTypes, blockArgLocs); return taskgroupOp; } diff --git a/flang/test/Lower/OpenMP/task-inreduction.f90 b/flang/test/Lower/OpenMP/task-inreduction.f90 index ded4710d5c13d..41657d320f7d2 100644 --- a/flang/test/Lower/OpenMP/task-inreduction.f90 +++ b/flang/test/Lower/OpenMP/task-inreduction.f90 @@ -32,4 +32,4 @@ subroutine omp_task_in_reduction() !$omp task in_reduction(+:i) i = i + 1 !$omp end task -end subroutine omp_task_in_reduction \ No newline at end of file +end subroutine omp_task_in_reduction diff --git a/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 b/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 index 7e6d7f09fbc67..175242bfc5656 100644 --- a/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 +++ b/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 @@ -46,4 +46,4 @@ subroutine taskReduction(x) !$omp end task !$omp end taskgroup !$omp end parallel -end subroutine \ No newline at end of file +end subroutine diff --git a/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 b/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 index bc32cee93d47f..29da1c56e0b3c 100644 --- a/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 +++ 
b/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 @@ -31,4 +31,4 @@ subroutine omp_taskgroup_task_reduction() !$omp taskgroup task_reduction(+:res) res = res + 1 !$omp end taskgroup -end subroutine omp_taskgroup_task_reduction \ No newline at end of file +end subroutine diff --git a/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 b/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 index 6a5bc568efb8e..ad41c1fbc1556 100644 --- a/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 +++ b/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 @@ -25,7 +25,7 @@ !CHECK: return !CHECK: } -subroutine in_reduction +subroutine in_reduction() integer :: x x = 0 !$omp taskgroup task_reduction(+:x) @@ -33,4 +33,4 @@ subroutine in_reduction x = x + 1 !$omp end task !$omp end taskgroup -end subroutine \ No newline at end of file +end subroutine >From c2026f4806b58b73516e830abb1a0fda56344275 Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Wed, 23 Apr 2025 14:48:44 +0530 Subject: [PATCH 3/5] Addressed review comments: Binding Fortran symbols to entry block arguments in taskgroup construct and fixed testcases --- flang/lib/Lower/OpenMP/ClauseProcessor.h | 6 +++--- flang/lib/Lower/OpenMP/DataSharingProcessor.cpp | 3 +-- flang/lib/Lower/OpenMP/OpenMP.cpp | 12 ++++++------ .../OpenMP/taskgroup-task-array-reduction.f90 | 14 +++++++------- .../Lower/OpenMP/taskgroup-task_reduction01.f90 | 10 ++++++---- .../Lower/OpenMP/taskgroup-task_reduction02.f90 | 13 +++++++------ 6 files changed, 30 insertions(+), 28 deletions(-) diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 06bb3685a3dec..3d3f26f06da26 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -112,12 +112,12 @@ class ClauseProcessor { processEnter(llvm::SmallVectorImpl &result) const; bool processIf(omp::clause::If::DirectiveNameModifier directiveName, mlir::omp::IfClauseOps &result) const; - bool 
processIsDevicePtr( - mlir::omp::IsDevicePtrClauseOps &result, - llvm::SmallVectorImpl &isDeviceSyms) const; bool processInReduction( mlir::Location currentLocation, mlir::omp::InReductionClauseOps &result, llvm::SmallVectorImpl &outReductionSyms) const; + bool processIsDevicePtr( + mlir::omp::IsDevicePtrClauseOps &result, + llvm::SmallVectorImpl &isDeviceSyms) const; bool processLink(llvm::SmallVectorImpl &result) const; diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 867e76ad0b1cf..cd64d9918b2d7 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -416,8 +416,7 @@ void DataSharingProcessor::collectSymbols( bool collectSymbols = (itr == clauses.end()); - converter.collectSymbolSet(eval, allSymbols, flag, - /*collectSymbols=*/collectSymbols, + converter.collectSymbolSet(eval, allSymbols, flag, collectSymbols, /*collectHostAssociatedSymbols=*/true); llvm::SetVector symbolsInNestedRegions; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 877cda847868a..6344a30736693 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2537,18 +2537,18 @@ genTaskgroupOp(lower::AbstractConverter &converter, lower::SymMap &symTable, auto genRegionEntryCB = [&](mlir::Operation *op) { genEntryBlock(converter.getFirOpBuilder(), taskgroupArgs, op->getRegion(0)); + bindEntryBlockArgs(converter, + llvm::cast(op), + taskgroupArgs); return llvm::to_vector(taskgroupArgs.getSyms()); }; - OpWithBodyGenInfo genInfo = + return genOpWithBody( OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, llvm::omp::Directive::OMPD_taskgroup) .setClauses(&item->clauses) - .setGenRegionEntryCb(genRegionEntryCB); - - auto taskgroupOp = - genOpWithBody(genInfo, queue, item, clauseOps); - return taskgroupOp; + .setGenRegionEntryCb(genRegionEntryCB), + queue, item, clauseOps); } static 
mlir::omp::TaskwaitOp diff --git a/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 b/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 index 175242bfc5656..18d45217272fc 100644 --- a/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 +++ b/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 @@ -15,18 +15,18 @@ ! CHECK: omp.yield ! CHECK: } -! CHECK-LABEL: func.func @_QPtaskreduction +! CHECK-LABEL: func.func @_QPtask_reduction ! CHECK-SAME: (%[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}) { ! CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope ! CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_0]] dummy_scope %[[VAL_1]] -! CHECK-SAME {uniq_name = "_QFtaskreductionEx"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +! CHECK-SAME: {uniq_name = "_QFtask_reductionEx"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) ! CHECK: omp.parallel { ! CHECK: %[[VAL_3:.*]] = fir.alloca !fir.box> ! CHECK: fir.store %[[VAL_2]]#1 to %[[VAL_3]] : !fir.ref>> -! CHECK: omp.taskgroup task_reduction(byref @add_reduction_byref_box_Uxf32 %[[VAL_3]] -> %[[VAL_4:.*]]: !fir.ref>>) { -! CHECK: %[[VAL_5:.*]] = fir.alloca !fir.box> -! CHECK: fir.store %[[VAL_2]]#1 to %[[VAL_5]] : !fir.ref>> -! CHECK: omp.task in_reduction(byref @add_reduction_byref_box_Uxf32 %[[VAL_5]] -> %[[VAL_6:.*]] : !fir.ref>>) { +! CHECK: omp.taskgroup task_reduction(byref @add_reduction_byref_box_Uxf32 %[[VAL_3]] -> %[[VAL_4:.*]]: !fir.ref>>) { +! CHECK: %[[VAL_5:.*]]:2 = hlfir.declare %[[VAL_4]] +! CHECK-SAME: {uniq_name = "_QFtask_reductionEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +! CHECK: omp.task in_reduction(byref @add_reduction_byref_box_Uxf32 %[[VAL_5]]#0 -> %[[VAL_6:.*]] : !fir.ref>>) { ! [...] ! CHECK: omp.terminator ! CHECK: } @@ -37,7 +37,7 @@ ! CHECK: return ! 
CHECK: } -subroutine taskReduction(x) +subroutine task_reduction(x) real, dimension(:) :: x !$omp parallel !$omp taskgroup task_reduction(+:x) diff --git a/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 b/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 index 29da1c56e0b3c..be4d3193e99f7 100644 --- a/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 +++ b/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 @@ -16,10 +16,12 @@ !CHECK: %[[VAL_0:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskgroup_task_reductionEres"} !CHECK: %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFomp_taskgroup_task_reductionEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) !CHECK: omp.taskgroup task_reduction(@[[RED_I32_NAME]] %[[VAL_1]]#0 -> %[[VAL_2:.*]] : !fir.ref) { -!CHECK: %[[VAL_3:.*]] = fir.load %[[VAL_1]]#0 : !fir.ref -!CHECK: %[[VAL_4:.*]] = arith.constant 1 : i32 -!CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_3]], %[[VAL_4]] : i32 -!CHECK: hlfir.assign %[[VAL_5]] to %[[VAL_1]]#0 : i32, !fir.ref +!CHECK: %[[VAL_3:.*]]:2 = hlfir.declare %[[VAL_2]] +!CHECK-SAME: {uniq_name = "_QFomp_taskgroup_task_reductionEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[VAL_4:.*]] = fir.load %[[VAL_3]]#0 : !fir.ref +!CHECK: %[[VAL_5:.*]] = arith.constant 1 : i32 +!CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_4]], %[[VAL_5]] : i32 +!CHECK: hlfir.assign %[[VAL_6]] to %[[VAL_3]]#0 : i32, !fir.ref !CHECK: omp.terminator !CHECK: } !CHECK: return diff --git a/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 b/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 index ad41c1fbc1556..ed91e582d2bf5 100644 --- a/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 +++ b/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 @@ -15,12 +15,13 @@ !CHECK-LABEL: func.func @_QPin_reduction() { ! [...] 
!CHECK: omp.taskgroup task_reduction(@[[RED_I32_NAME]] %[[VAL_1:.*]]#0 -> %[[VAL_3:.*]] : !fir.ref) { -!CHECK: omp.task in_reduction(@[[RED_I32_NAME]] %[[VAL_1]]#0 -> %[[VAL_4:.*]] : !fir.ref) { -!CHECK: %[[VAL_5:.*]]:2 = hlfir.declare %[[VAL_4]] {uniq_name = "_QFin_reductionEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) -! [...] -!CHECK: omp.terminator -!CHECK: } -!CHECK: omp.terminator +!CHECK: %[[VAL_4:.*]]:2 = hlfir.declare %[[VAL_3]] {uniq_name = "_QFin_reductionEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: omp.task in_reduction(@[[RED_I32_NAME]] %[[VAL_4]]#0 -> %[[VAL_5:.*]] : !fir.ref) { +!CHECK: %[[VAL_6:.*]]:2 = hlfir.declare %[[VAL_5]] {uniq_name = "_QFin_reductionEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! [...] +!CHECK: omp.terminator +!CHECK: } +!CHECK: omp.terminator !CHECK: } !CHECK: return !CHECK: } >From 258259a064bcb8970e8647c9fc06002e0c461f6e Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Mon, 28 Apr 2025 13:40:07 +0530 Subject: [PATCH 4/5] Addressed review comment: Replacing the call setGenRegionEntryCb() with setEntryBlockArgs() --- flang/lib/Lower/OpenMP/OpenMP.cpp | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6344a30736693..2bf03c61abe56 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2502,21 +2502,12 @@ genTaskOp(lower::AbstractConverter &converter, lower::SymMap &symTable, taskArgs.inReduction.syms = inReductionSyms; taskArgs.inReduction.vars = clauseOps.inReductionVars; - auto genRegionEntryCB = [&](mlir::Operation *op) { - genEntryBlock(converter.getFirOpBuilder(), taskArgs, op->getRegion(0)); - bindEntryBlockArgs(converter, - llvm::cast(op), - taskArgs); - return llvm::to_vector(taskArgs.getSyms()); - }; - return genOpWithBody( OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, llvm::omp::Directive::OMPD_task) .setClauses(&item->clauses) 
          .setDataSharingProcessor(&dsp)
-          .setEntryBlockArgs(&taskArgs)
-          .setGenRegionEntryCb(genRegionEntryCB),
+          .setEntryBlockArgs(&taskArgs),
       queue, item, clauseOps);
 }
@@ -2535,19 +2526,11 @@ genTaskgroupOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
   taskgroupArgs.taskReduction.syms = taskReductionSyms;
   taskgroupArgs.taskReduction.vars = clauseOps.taskReductionVars;
-  auto genRegionEntryCB = [&](mlir::Operation *op) {
-    genEntryBlock(converter.getFirOpBuilder(), taskgroupArgs, op->getRegion(0));
-    bindEntryBlockArgs(converter,
-                       llvm::cast(op),
-                       taskgroupArgs);
-    return llvm::to_vector(taskgroupArgs.getSyms());
-  };
-
   return genOpWithBody(
       OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval,
                         llvm::omp::Directive::OMPD_taskgroup)
           .setClauses(&item->clauses)
-          .setGenRegionEntryCb(genRegionEntryCB),
+          .setEntryBlockArgs(&taskgroupArgs),
       queue, item, clauseOps);
 }

>From 4954246f545365692850211f7b3df2d9d845875c Mon Sep 17 00:00:00 2001
From: Kaviya Rajendiran
Date: Mon, 5 May 2025 14:14:54 +0530
Subject: [PATCH 5/5] Fixed dsa of variables in task construct in the presence of in_reduction clause

---
 flang/include/flang/Semantics/symbol.h | 12 +++++-----
 flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 23 +++++++++----------
 .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 8 +------
 flang/lib/Semantics/resolve-directives.cpp | 6 +++++
 4 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h
index 715811885c219..3ea9c31461e24 100644
--- a/flang/include/flang/Semantics/symbol.h
+++ b/flang/include/flang/Semantics/symbol.h
@@ -754,12 +754,12 @@ class Symbol {
       // OpenMP data-copying attribute
       OmpCopyIn, OmpCopyPrivate,
       // OpenMP miscellaneous flags
-      OmpCommonBlock, OmpReduction, OmpAligned, OmpNontemporal, OmpAllocate,
-      OmpDeclarativeAllocateDirective, OmpExecutableAllocateDirective,
-      OmpDeclareSimd, OmpDeclareTarget, OmpThreadprivate, OmpDeclareReduction,
-      OmpFlushed,
OmpCriticalLock, OmpIfSpecified, OmpNone, OmpPreDetermined, - OmpImplicit, OmpDependObject, OmpInclusiveScan, OmpExclusiveScan, - OmpInScanReduction); + OmpCommonBlock, OmpReduction, OmpInReduction, OmpAligned, OmpNontemporal, + OmpAllocate, OmpDeclarativeAllocateDirective, + OmpExecutableAllocateDirective, OmpDeclareSimd, OmpDeclareTarget, + OmpThreadprivate, OmpDeclareReduction, OmpFlushed, OmpCriticalLock, + OmpIfSpecified, OmpNone, OmpPreDetermined, OmpImplicit, OmpDependObject, + OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction); using Flags = common::EnumSet; const Scope &owner() const { return *owner_; } diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 3f5f3d03d9b26..afe9f90b270ab 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -976,18 +976,6 @@ bool ClauseProcessor::processIf( }); return found; } - -bool ClauseProcessor::processIsDevicePtr( - mlir::omp::IsDevicePtrClauseOps &result, - llvm::SmallVectorImpl &isDeviceSyms) const { - return findRepeatableClause( - [&](const omp::clause::IsDevicePtr &devPtrClause, - const parser::CharBlock &) { - addUseDeviceClause(converter, devPtrClause.v, result.isDevicePtrVars, - isDeviceSyms); - }); -} - bool ClauseProcessor::processInReduction( mlir::Location currentLocation, mlir::omp::InReductionClauseOps &result, llvm::SmallVectorImpl &outReductionSyms) const { @@ -1012,6 +1000,17 @@ bool ClauseProcessor::processInReduction( }); } +bool ClauseProcessor::processIsDevicePtr( + mlir::omp::IsDevicePtrClauseOps &result, + llvm::SmallVectorImpl &isDeviceSyms) const { + return findRepeatableClause( + [&](const omp::clause::IsDevicePtr &devPtrClause, + const parser::CharBlock &) { + addUseDeviceClause(converter, devPtrClause.v, result.isDevicePtrVars, + isDeviceSyms); + }); +} + bool ClauseProcessor::processLink( llvm::SmallVectorImpl &result) const { return findRepeatableClause( diff --git 
a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
index cd64d9918b2d7..030fa8c426161 100644
--- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
@@ -410,13 +410,7 @@ void DataSharingProcessor::collectSymbols(
   // that matches 'flag'.
   llvm::SetVector allSymbols;
-  auto itr = llvm::find_if(clauses, [](const omp::Clause &clause) {
-    return clause.id == llvm::omp::Clause::OMPC_in_reduction;
-  });
-
-  bool collectSymbols = (itr == clauses.end());
-
-  converter.collectSymbolSet(eval, allSymbols, flag, collectSymbols,
+  converter.collectSymbolSet(eval, allSymbols, flag, /*collectSymbols=*/true,
                              /*collectHostAssociatedSymbols=*/true);
   llvm::SetVector symbolsInNestedRegions;
diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp
index d75b4ea13d35f..7f16dd631c88f 100644
--- a/flang/lib/Semantics/resolve-directives.cpp
+++ b/flang/lib/Semantics/resolve-directives.cpp
@@ -530,6 +530,12 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor {
     return false;
   }
+  bool Pre(const parser::OmpInReductionClause &x) {
+    auto &objects{std::get(x.t)};
+    ResolveOmpObjectList(objects, Symbol::Flag::OmpInReduction);
+    return false;
+  }
+
   bool Pre(const parser::OmpClause::Reduction &x) {
     const auto &objList{std::get(x.v.t)};
     ResolveOmpObjectList(objList, Symbol::Flag::OmpReduction);

From flang-commits at lists.llvm.org Mon May 5 04:13:39 2025
From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits)
Date: Mon, 05 May 2025 04:13:39 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][fir] Lower `do concurrent` loop nests to `fir.do_concurrent` (PR #137928)
In-Reply-To:
Message-ID: <68189d63.050a0220.31513.9ff4@mx.google.com>

https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/137928

>From 12114384de7c16a689ae67640fb2f4a6ae530dfb Mon Sep 17 00:00:00 2001
From: Kareem Ergawy
Date: Wed, 16 Apr 2025 06:14:38
+0200
Subject: [PATCH] [flang][fir] Lower `do concurrent` loop nests to `fir.do_concurrent`

Adds support for lowering `do concurrent` nests from PFT to the new `fir.do_concurrent` MLIR op as well as its special terminator `fir.do_concurrent.loop` which models the actual loop nest.

To that end, this PR emits the allocations for the iteration variables within the block of the `fir.do_concurrent` op and creates a region for the `fir.do_concurrent.loop` op that accepts arguments equal in number to the number of the input `do concurrent` iteration ranges.

For example, given the following input:
```fortran
do concurrent(i=1:10, j=11:20)
end do
```
the changes in this PR emit the following MLIR:
```mlir
fir.do_concurrent {
  %22 = fir.alloca i32 {bindc_name = "i"}
  %23:2 = hlfir.declare %22 {uniq_name = "_QFsub1Ei"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
  %24 = fir.alloca i32 {bindc_name = "j"}
  %25:2 = hlfir.declare %24 {uniq_name = "_QFsub1Ej"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
  fir.do_concurrent.loop (%arg1, %arg2) = (%18, %20) to (%19, %21) step (%c1, %c1_0) {
    %26 = fir.convert %arg1 : (index) -> i32
    fir.store %26 to %23#0 : !fir.ref<i32>
    %27 = fir.convert %arg2 : (index) -> i32
    fir.store %27 to %25#0 : !fir.ref<i32>
  }
}
```
---
 flang/lib/Lower/Bridge.cpp | 228 +++++++++++-------
 flang/lib/Optimizer/Builder/FIRBuilder.cpp | 3 +
 flang/test/Lower/do_concurrent.f90 | 39 ++-
 .../do_concurrent_local_default_init.f90 | 4 +-
 flang/test/Lower/loops.f90 | 37 +--
 flang/test/Lower/loops3.f90 | 4 +-
 flang/test/Lower/nsw.f90 | 5 +-
 .../Transforms/DoConcurrent/basic_host.f90 | 3 +
 .../DoConcurrent/locally_destroyed_temp.f90 | 3 +
 .../DoConcurrent/loop_nest_test.f90 | 3 +
 .../multiple_iteration_ranges.f90 | 3 +
 .../DoConcurrent/non_const_bounds.f90 | 3 +
 .../DoConcurrent/not_perfectly_nested.f90 | 3 +
 13 files changed, 208 insertions(+), 130 deletions(-)

diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 72c63e4e314d2..8da05255d5f41 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp @@ -94,10 +94,11 @@ struct IncrementLoopInfo { template explicit IncrementLoopInfo(Fortran::semantics::Symbol &sym, const T &lower, const T &upper, const std::optional &step, - bool isUnordered = false) + bool isConcurrent = false) : loopVariableSym{&sym}, lowerExpr{Fortran::semantics::GetExpr(lower)}, upperExpr{Fortran::semantics::GetExpr(upper)}, - stepExpr{Fortran::semantics::GetExpr(step)}, isUnordered{isUnordered} {} + stepExpr{Fortran::semantics::GetExpr(step)}, + isConcurrent{isConcurrent} {} IncrementLoopInfo(IncrementLoopInfo &&) = default; IncrementLoopInfo &operator=(IncrementLoopInfo &&x) = default; @@ -120,7 +121,7 @@ struct IncrementLoopInfo { const Fortran::lower::SomeExpr *upperExpr; const Fortran::lower::SomeExpr *stepExpr; const Fortran::lower::SomeExpr *maskExpr = nullptr; - bool isUnordered; // do concurrent, forall + bool isConcurrent; llvm::SmallVector localSymList; llvm::SmallVector localInitSymList; llvm::SmallVector< @@ -130,7 +131,7 @@ struct IncrementLoopInfo { mlir::Value loopVariable = nullptr; // Data members for structured loops. - fir::DoLoopOp doLoop = nullptr; + mlir::Operation *loopOp = nullptr; // Data members for unstructured loops. bool hasRealControl = false; @@ -1981,7 +1982,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { llvm_unreachable("illegal reduction operator"); } - /// Collect DO CONCURRENT or FORALL loop control information. + /// Collect DO CONCURRENT loop control information. 
IncrementLoopNestInfo getConcurrentControl( const Fortran::parser::ConcurrentHeader &header, const std::list &localityList = {}) { @@ -2292,8 +2293,14 @@ class FirConverter : public Fortran::lower::AbstractConverter { mlir::LLVM::LoopAnnotationAttr la = mlir::LLVM::LoopAnnotationAttr::get( builder->getContext(), {}, /*vectorize=*/va, {}, /*unroll*/ ua, /*unroll_and_jam*/ uja, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}); - if (has_attrs) - info.doLoop.setLoopAnnotationAttr(la); + if (has_attrs) { + if (auto loopOp = mlir::dyn_cast(info.loopOp)) + loopOp.setLoopAnnotationAttr(la); + + if (auto doConcurrentOp = + mlir::dyn_cast(info.loopOp)) + doConcurrentOp.setLoopAnnotationAttr(la); + } } /// Generate FIR to begin a structured or unstructured increment loop nest. @@ -2302,96 +2309,77 @@ class FirConverter : public Fortran::lower::AbstractConverter { llvm::SmallVectorImpl &dirs) { assert(!incrementLoopNestInfo.empty() && "empty loop nest"); mlir::Location loc = toLocation(); - mlir::Operation *boundsAndStepIP = nullptr; mlir::arith::IntegerOverflowFlags iofBackup{}; + llvm::SmallVector nestLBs; + llvm::SmallVector nestUBs; + llvm::SmallVector nestSts; + llvm::SmallVector nestReduceOperands; + llvm::SmallVector nestReduceAttrs; + bool genDoConcurrent = false; + for (IncrementLoopInfo &info : incrementLoopNestInfo) { - mlir::Value lowerValue; - mlir::Value upperValue; - mlir::Value stepValue; + genDoConcurrent = info.isStructured() && info.isConcurrent; - { - mlir::OpBuilder::InsertionGuard guard(*builder); + if (!genDoConcurrent) + info.loopVariable = genLoopVariableAddress(loc, *info.loopVariableSym, + info.isConcurrent); - // Set the IP before the first loop in the nest so that all nest bounds - // and step values are created outside the nest. 
- if (boundsAndStepIP) - builder->setInsertionPointAfter(boundsAndStepIP); + if (!getLoweringOptions().getIntegerWrapAround()) { + iofBackup = builder->getIntegerOverflowFlags(); + builder->setIntegerOverflowFlags( + mlir::arith::IntegerOverflowFlags::nsw); + } - info.loopVariable = genLoopVariableAddress(loc, *info.loopVariableSym, - info.isUnordered); - if (!getLoweringOptions().getIntegerWrapAround()) { - iofBackup = builder->getIntegerOverflowFlags(); - builder->setIntegerOverflowFlags( - mlir::arith::IntegerOverflowFlags::nsw); - } - lowerValue = genControlValue(info.lowerExpr, info); - upperValue = genControlValue(info.upperExpr, info); - bool isConst = true; - stepValue = genControlValue(info.stepExpr, info, - info.isStructured() ? nullptr : &isConst); - if (!getLoweringOptions().getIntegerWrapAround()) - builder->setIntegerOverflowFlags(iofBackup); - boundsAndStepIP = stepValue.getDefiningOp(); - - // Use a temp variable for unstructured loops with non-const step. - if (!isConst) { - info.stepVariable = - builder->createTemporary(loc, stepValue.getType()); - boundsAndStepIP = - builder->create(loc, stepValue, info.stepVariable); + nestLBs.push_back(genControlValue(info.lowerExpr, info)); + nestUBs.push_back(genControlValue(info.upperExpr, info)); + bool isConst = true; + nestSts.push_back(genControlValue( + info.stepExpr, info, info.isStructured() ? nullptr : &isConst)); + + if (!getLoweringOptions().getIntegerWrapAround()) + builder->setIntegerOverflowFlags(iofBackup); + + // Use a temp variable for unstructured loops with non-const step. 
+ if (!isConst) { + mlir::Value stepValue = nestSts.back(); + info.stepVariable = builder->createTemporary(loc, stepValue.getType()); + builder->create(loc, stepValue, info.stepVariable); + } + + if (genDoConcurrent && nestReduceOperands.empty()) { + // Create DO CONCURRENT reduce operands and attributes + for (const auto &reduceSym : info.reduceSymList) { + const fir::ReduceOperationEnum reduceOperation = reduceSym.first; + const Fortran::semantics::Symbol *sym = reduceSym.second; + fir::ExtendedValue exv = getSymbolExtendedValue(*sym, nullptr); + nestReduceOperands.push_back(fir::getBase(exv)); + auto reduceAttr = + fir::ReduceAttr::get(builder->getContext(), reduceOperation); + nestReduceAttrs.push_back(reduceAttr); } } + } + for (auto [info, lowerValue, upperValue, stepValue] : + llvm::zip_equal(incrementLoopNestInfo, nestLBs, nestUBs, nestSts)) { // Structured loop - generate fir.do_loop. if (info.isStructured()) { + if (genDoConcurrent) + continue; + + // The loop variable is a doLoop op argument. mlir::Type loopVarType = info.getLoopVariableType(); - mlir::Value loopValue; - if (info.isUnordered) { - llvm::SmallVector reduceOperands; - llvm::SmallVector reduceAttrs; - // Create DO CONCURRENT reduce operands and attributes - for (const auto &reduceSym : info.reduceSymList) { - const fir::ReduceOperationEnum reduce_operation = reduceSym.first; - const Fortran::semantics::Symbol *sym = reduceSym.second; - fir::ExtendedValue exv = getSymbolExtendedValue(*sym, nullptr); - reduceOperands.push_back(fir::getBase(exv)); - auto reduce_attr = - fir::ReduceAttr::get(builder->getContext(), reduce_operation); - reduceAttrs.push_back(reduce_attr); - } - // The loop variable value is explicitly updated. 
- info.doLoop = builder->create( - loc, lowerValue, upperValue, stepValue, /*unordered=*/true, - /*finalCountValue=*/false, /*iterArgs=*/std::nullopt, - llvm::ArrayRef(reduceOperands), reduceAttrs); - builder->setInsertionPointToStart(info.doLoop.getBody()); - loopValue = builder->createConvert(loc, loopVarType, - info.doLoop.getInductionVar()); - } else { - // The loop variable is a doLoop op argument. - info.doLoop = builder->create( - loc, lowerValue, upperValue, stepValue, /*unordered=*/false, - /*finalCountValue=*/true, - builder->createConvert(loc, loopVarType, lowerValue)); - builder->setInsertionPointToStart(info.doLoop.getBody()); - loopValue = info.doLoop.getRegionIterArgs()[0]; - } + auto loopOp = builder->create( + loc, lowerValue, upperValue, stepValue, /*unordered=*/false, + /*finalCountValue=*/true, + builder->createConvert(loc, loopVarType, lowerValue)); + info.loopOp = loopOp; + builder->setInsertionPointToStart(loopOp.getBody()); + mlir::Value loopValue = loopOp.getRegionIterArgs()[0]; + // Update the loop variable value in case it has non-index references. 
builder->create(loc, loopValue, info.loopVariable); - if (info.maskExpr) { - Fortran::lower::StatementContext stmtCtx; - mlir::Value maskCond = createFIRExpr(loc, info.maskExpr, stmtCtx); - stmtCtx.finalizeAndReset(); - mlir::Value maskCondCast = - builder->createConvert(loc, builder->getI1Type(), maskCond); - auto ifOp = builder->create(loc, maskCondCast, - /*withElseRegion=*/false); - builder->setInsertionPointToStart(&ifOp.getThenRegion().front()); - } - if (info.hasLocalitySpecs()) - handleLocalitySpecs(info); - addLoopAnnotationAttr(info, dirs); continue; } @@ -2455,6 +2443,60 @@ class FirConverter : public Fortran::lower::AbstractConverter { builder->restoreInsertionPoint(insertPt); } } + + if (genDoConcurrent) { + auto loopWrapperOp = builder->create(loc); + builder->setInsertionPointToStart( + builder->createBlock(&loopWrapperOp.getRegion())); + + for (IncrementLoopInfo &info : llvm::reverse(incrementLoopNestInfo)) { + info.loopVariable = genLoopVariableAddress(loc, *info.loopVariableSym, + info.isConcurrent); + } + + builder->setInsertionPointToEnd(loopWrapperOp.getBody()); + auto loopOp = builder->create( + loc, nestLBs, nestUBs, nestSts, nestReduceOperands, + nestReduceAttrs.empty() + ? 
nullptr + : mlir::ArrayAttr::get(builder->getContext(), nestReduceAttrs), + nullptr); + + llvm::SmallVector loopBlockArgTypes( + incrementLoopNestInfo.size(), builder->getIndexType()); + llvm::SmallVector loopBlockArgLocs( + incrementLoopNestInfo.size(), loc); + mlir::Region &loopRegion = loopOp.getRegion(); + mlir::Block *loopBlock = builder->createBlock( + &loopRegion, loopRegion.begin(), loopBlockArgTypes, loopBlockArgLocs); + builder->setInsertionPointToStart(loopBlock); + + for (auto [info, blockArg] : + llvm::zip_equal(incrementLoopNestInfo, loopBlock->getArguments())) { + info.loopOp = loopOp; + mlir::Value loopValue = + builder->createConvert(loc, info.getLoopVariableType(), blockArg); + builder->create(loc, loopValue, info.loopVariable); + + if (info.maskExpr) { + Fortran::lower::StatementContext stmtCtx; + mlir::Value maskCond = createFIRExpr(loc, info.maskExpr, stmtCtx); + stmtCtx.finalizeAndReset(); + mlir::Value maskCondCast = + builder->createConvert(loc, builder->getI1Type(), maskCond); + auto ifOp = builder->create(loc, maskCondCast, + /*withElseRegion=*/false); + builder->setInsertionPointToStart(&ifOp.getThenRegion().front()); + } + } + + IncrementLoopInfo &innermostInfo = incrementLoopNestInfo.back(); + + if (innermostInfo.hasLocalitySpecs()) + handleLocalitySpecs(innermostInfo); + + addLoopAnnotationAttr(innermostInfo, dirs); + } } /// Generate FIR to end a structured or unstructured increment loop nest. @@ -2471,29 +2513,31 @@ class FirConverter : public Fortran::lower::AbstractConverter { it != rend; ++it) { IncrementLoopInfo &info = *it; if (info.isStructured()) { - // End fir.do_loop. - if (info.isUnordered) { - builder->setInsertionPointAfter(info.doLoop); + // End fir.do_concurent.loop. + if (info.isConcurrent) { + builder->setInsertionPointAfter(info.loopOp->getParentOp()); continue; } + + // End fir.do_loop. // Decrement tripVariable. 
- builder->setInsertionPointToEnd(info.doLoop.getBody()); + auto doLoopOp = mlir::cast(info.loopOp); + builder->setInsertionPointToEnd(doLoopOp.getBody()); llvm::SmallVector results; results.push_back(builder->create( - loc, info.doLoop.getInductionVar(), info.doLoop.getStep(), - iofAttr)); + loc, doLoopOp.getInductionVar(), doLoopOp.getStep(), iofAttr)); // Step loopVariable to help optimizations such as vectorization. // Induction variable elimination will clean up as necessary. mlir::Value step = builder->createConvert( - loc, info.getLoopVariableType(), info.doLoop.getStep()); + loc, info.getLoopVariableType(), doLoopOp.getStep()); mlir::Value loopVar = builder->create(loc, info.loopVariable); results.push_back( builder->create(loc, loopVar, step, iofAttr)); builder->create(loc, results); - builder->setInsertionPointAfter(info.doLoop); + builder->setInsertionPointAfter(doLoopOp); // The loop control variable may be used after the loop. - builder->create(loc, info.doLoop.getResult(1), + builder->create(loc, doLoopOp.getResult(1), info.loopVariable); continue; } diff --git a/flang/lib/Optimizer/Builder/FIRBuilder.cpp b/flang/lib/Optimizer/Builder/FIRBuilder.cpp index 1d6e1502ed0f9..86166db355f72 100644 --- a/flang/lib/Optimizer/Builder/FIRBuilder.cpp +++ b/flang/lib/Optimizer/Builder/FIRBuilder.cpp @@ -280,6 +280,9 @@ mlir::Block *fir::FirOpBuilder::getAllocaBlock() { if (auto cufKernelOp = getRegion().getParentOfType()) return &cufKernelOp.getRegion().front(); + if (auto doConcurentOp = getRegion().getParentOfType()) + return doConcurentOp.getBody(); + return getEntryBlock(); } diff --git a/flang/test/Lower/do_concurrent.f90 b/flang/test/Lower/do_concurrent.f90 index ef93d2d6b035b..cc113f59c35e3 100644 --- a/flang/test/Lower/do_concurrent.f90 +++ b/flang/test/Lower/do_concurrent.f90 @@ -14,6 +14,9 @@ subroutine sub1(n) implicit none integer :: n, m, i, j, k integer, dimension(n) :: a +!CHECK: %[[N_DECL:.*]]:2 = hlfir.declare %{{.*}} dummy_scope %{{.*}} 
{uniq_name = "_QFsub1En"} +!CHECK: %[[A_DECL:.*]]:2 = hlfir.declare %{{.*}}(%{{.*}}) {uniq_name = "_QFsub1Ea"} + !CHECK: %[[LB1:.*]] = arith.constant 1 : i32 !CHECK: %[[LB1_CVT:.*]] = fir.convert %[[LB1]] : (i32) -> index !CHECK: %[[UB1:.*]] = fir.load %{{.*}}#0 : !fir.ref @@ -29,10 +32,30 @@ subroutine sub1(n) !CHECK: %[[UB3:.*]] = arith.constant 10 : i32 !CHECK: %[[UB3_CVT:.*]] = fir.convert %[[UB3]] : (i32) -> index -!CHECK: fir.do_loop %{{.*}} = %[[LB1_CVT]] to %[[UB1_CVT]] step %{{.*}} unordered -!CHECK: fir.do_loop %{{.*}} = %[[LB2_CVT]] to %[[UB2_CVT]] step %{{.*}} unordered -!CHECK: fir.do_loop %{{.*}} = %[[LB3_CVT]] to %[[UB3_CVT]] step %{{.*}} unordered +!CHECK: fir.do_concurrent +!CHECK: %[[I:.*]] = fir.alloca i32 {bindc_name = "i"} +!CHECK: %[[I_DECL:.*]]:2 = hlfir.declare %[[I]] +!CHECK: %[[J:.*]] = fir.alloca i32 {bindc_name = "j"} +!CHECK: %[[J_DECL:.*]]:2 = hlfir.declare %[[J]] +!CHECK: %[[K:.*]] = fir.alloca i32 {bindc_name = "k"} +!CHECK: %[[K_DECL:.*]]:2 = hlfir.declare %[[K]] + +!CHECK: fir.do_concurrent.loop (%[[I_IV:.*]], %[[J_IV:.*]], %[[K_IV:.*]]) = +!CHECK-SAME: (%[[LB1_CVT]], %[[LB2_CVT]], %[[LB3_CVT]]) to +!CHECK-SAME: (%[[UB1_CVT]], %[[UB2_CVT]], %[[UB3_CVT]]) step +!CHECK-SAME: (%{{.*}}, %{{.*}}, %{{.*}}) { +!CHECK: %[[I_IV_CVT:.*]] = fir.convert %[[I_IV]] : (index) -> i32 +!CHECK: fir.store %[[I_IV_CVT]] to %[[I_DECL]]#0 : !fir.ref +!CHECK: %[[J_IV_CVT:.*]] = fir.convert %[[J_IV]] : (index) -> i32 +!CHECK: fir.store %[[J_IV_CVT]] to %[[J_DECL]]#0 : !fir.ref +!CHECK: %[[K_IV_CVT:.*]] = fir.convert %[[K_IV]] : (index) -> i32 +!CHECK: fir.store %[[K_IV_CVT]] to %[[K_DECL]]#0 : !fir.ref +!CHECK: %[[N_VAL:.*]] = fir.load %[[N_DECL]]#0 : !fir.ref +!CHECK: %[[I_VAL:.*]] = fir.load %[[I_DECL]]#0 : !fir.ref +!CHECK: %[[I_VAL_CVT:.*]] = fir.convert %[[I_VAL]] : (i32) -> i64 +!CHECK: %[[A_ELEM:.*]] = hlfir.designate %[[A_DECL]]#0 (%[[I_VAL_CVT]]) +!CHECK: hlfir.assign %[[N_VAL]] to %[[A_ELEM]] : i32, !fir.ref do concurrent(i=1:n, j=1:bar(n*m, 
n/m), k=5:10) a(i) = n end do @@ -45,14 +68,17 @@ subroutine sub2(n) integer, dimension(n) :: a !CHECK: %[[LB1:.*]] = arith.constant 1 : i32 !CHECK: %[[LB1_CVT:.*]] = fir.convert %[[LB1]] : (i32) -> index -!CHECK: %[[UB1:.*]] = fir.load %5#0 : !fir.ref +!CHECK: %[[UB1:.*]] = fir.load %{{.*}}#0 : !fir.ref !CHECK: %[[UB1_CVT:.*]] = fir.convert %[[UB1]] : (i32) -> index -!CHECK: fir.do_loop %{{.*}} = %[[LB1_CVT]] to %[[UB1_CVT]] step %{{.*}} unordered +!CHECK: fir.do_concurrent +!CHECK: fir.do_concurrent.loop (%{{.*}}) = (%[[LB1_CVT]]) to (%[[UB1_CVT]]) step (%{{.*}}) + !CHECK: %[[LB2:.*]] = arith.constant 1 : i32 !CHECK: %[[LB2_CVT:.*]] = fir.convert %[[LB2]] : (i32) -> index !CHECK: %[[UB2:.*]] = fir.call @_QPbar(%{{.*}}, %{{.*}}) proc_attrs fastmath : (!fir.ref, !fir.ref) -> i32 !CHECK: %[[UB2_CVT:.*]] = fir.convert %[[UB2]] : (i32) -> index -!CHECK: fir.do_loop %{{.*}} = %[[LB2_CVT]] to %[[UB2_CVT]] step %{{.*}} unordered +!CHECK: fir.do_concurrent +!CHECK: fir.do_concurrent.loop (%{{.*}}) = (%[[LB2_CVT]]) to (%[[UB2_CVT]]) step (%{{.*}}) do concurrent(i=1:n) do concurrent(j=1:bar(n*m, n/m)) a(i) = n @@ -60,7 +86,6 @@ subroutine sub2(n) end do end subroutine - !CHECK-LABEL: unstructured subroutine unstructured(inner_step) integer(4) :: i, j, inner_step diff --git a/flang/test/Lower/do_concurrent_local_default_init.f90 b/flang/test/Lower/do_concurrent_local_default_init.f90 index 7652e4fcd0402..207704ac1a990 100644 --- a/flang/test/Lower/do_concurrent_local_default_init.f90 +++ b/flang/test/Lower/do_concurrent_local_default_init.f90 @@ -29,7 +29,7 @@ subroutine test_default_init() ! CHECK-SAME: %[[VAL_0:.*]]: !fir.ref>>>> {fir.bindc_name = "p"}) { ! CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_0]] : !fir.ref>>>> ! CHECK: %[[VAL_7:.*]] = fir.box_elesize %[[VAL_6]] : (!fir.box>>>) -> index -! CHECK: fir.do_loop +! CHECK: fir.do_concurrent.loop ! CHECK: %[[VAL_16:.*]] = fir.alloca !fir.box>>> {bindc_name = "p", pinned, uniq_name = "_QFtest_ptrEp"} ! 
CHECK: %[[VAL_17:.*]] = fir.zero_bits !fir.ptr>> ! CHECK: %[[VAL_18:.*]] = arith.constant 0 : index @@ -43,7 +43,7 @@ subroutine test_default_init() ! CHECK: } ! CHECK-LABEL: func.func @_QPtest_default_init( -! CHECK: fir.do_loop +! CHECK: fir.do_concurrent.loop ! CHECK: %[[VAL_26:.*]] = fir.alloca !fir.type<_QFtest_default_initTt{i:i32}> {bindc_name = "a", pinned, uniq_name = "_QFtest_default_initEa"} ! CHECK: %[[VAL_27:.*]] = fir.embox %[[VAL_26]] : (!fir.ref>) -> !fir.box> ! CHECK: %[[VAL_30:.*]] = fir.convert %[[VAL_27]] : (!fir.box>) -> !fir.box diff --git a/flang/test/Lower/loops.f90 b/flang/test/Lower/loops.f90 index ea65ba3e4d66d..60df27a591dc3 100644 --- a/flang/test/Lower/loops.f90 +++ b/flang/test/Lower/loops.f90 @@ -2,15 +2,6 @@ ! CHECK-LABEL: loop_test subroutine loop_test - ! CHECK: %[[VAL_2:.*]] = fir.alloca i16 {bindc_name = "i"} - ! CHECK: %[[VAL_3:.*]] = fir.alloca i16 {bindc_name = "i"} - ! CHECK: %[[VAL_4:.*]] = fir.alloca i16 {bindc_name = "i"} - ! CHECK: %[[VAL_5:.*]] = fir.alloca i8 {bindc_name = "k"} - ! CHECK: %[[VAL_6:.*]] = fir.alloca i8 {bindc_name = "j"} - ! CHECK: %[[VAL_7:.*]] = fir.alloca i8 {bindc_name = "i"} - ! CHECK: %[[VAL_8:.*]] = fir.alloca i32 {bindc_name = "k"} - ! CHECK: %[[VAL_9:.*]] = fir.alloca i32 {bindc_name = "j"} - ! CHECK: %[[VAL_10:.*]] = fir.alloca i32 {bindc_name = "i"} ! CHECK: %[[VAL_11:.*]] = fir.alloca !fir.array<5x5x5xi32> {bindc_name = "a", uniq_name = "_QFloop_testEa"} ! CHECK: %[[VAL_12:.*]] = fir.alloca i32 {bindc_name = "asum", uniq_name = "_QFloop_testEasum"} ! CHECK: %[[VAL_13:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFloop_testEi"} @@ -25,7 +16,7 @@ subroutine loop_test j = 200 k = 300 - ! CHECK-COUNT-3: fir.do_loop {{.*}} unordered + ! CHECK: fir.do_concurrent.loop (%{{.*}}, %{{.*}}, %{{.*}}) = {{.*}} do concurrent (i=1:5, j=1:5, k=1:5) ! shared(a) ! CHECK: fir.coordinate_of a(i,j,k) = 0 @@ -33,7 +24,7 @@ subroutine loop_test ! 
CHECK: fir.call @_FortranAioBeginExternalListOutput print*, 'A:', i, j, k - ! CHECK-COUNT-3: fir.do_loop {{.*}} unordered + ! CHECK: fir.do_concurrent.loop (%{{.*}}, %{{.*}}, %{{.*}}) = {{.*}} ! CHECK: fir.if do concurrent (integer(1)::i=1:5, j=1:5, k=1:5, i.ne.j .and. k.ne.3) shared(a) ! CHECK-COUNT-2: fir.coordinate_of @@ -53,7 +44,7 @@ subroutine loop_test ! CHECK: fir.call @_FortranAioBeginExternalListOutput print*, 'B:', i, j, k, '-', asum - ! CHECK: fir.do_loop {{.*}} unordered + ! CHECK: fir.do_concurrent.loop (%{{.*}}) = {{.*}} ! CHECK-COUNT-2: fir.if do concurrent (integer(2)::i=1:5, i.ne.3) if (i.eq.2 .or. i.eq.4) goto 5 ! fir.if @@ -62,7 +53,7 @@ subroutine loop_test 5 continue enddo - ! CHECK: fir.do_loop {{.*}} unordered + ! CHECK: fir.do_concurrent.loop (%{{.*}}) = {{.*}} ! CHECK-COUNT-2: fir.if do concurrent (integer(2)::i=1:5, i.ne.3) if (i.eq.2 .or. i.eq.4) then ! fir.if @@ -93,10 +84,6 @@ end subroutine loop_test ! CHECK-LABEL: c.func @_QPlis subroutine lis(n) - ! CHECK-DAG: fir.alloca i32 {bindc_name = "m"} - ! CHECK-DAG: fir.alloca i32 {bindc_name = "j"} - ! CHECK-DAG: fir.alloca i32 {bindc_name = "i"} - ! CHECK-DAG: fir.alloca i8 {bindc_name = "i"} ! CHECK-DAG: fir.alloca i32 {bindc_name = "j", uniq_name = "_QFlisEj"} ! CHECK-DAG: fir.alloca i32 {bindc_name = "k", uniq_name = "_QFlisEk"} ! CHECK-DAG: fir.alloca !fir.box>> {bindc_name = "p", uniq_name = "_QFlisEp"} @@ -117,8 +104,8 @@ subroutine lis(n) ! CHECK: } r = 0 - ! CHECK: fir.do_loop %arg1 = %{{.*}} to %{{.*}} step %{{.*}} unordered { - ! CHECK: fir.do_loop %arg2 = %{{.*}} to %{{.*}} step %c1{{.*}} iter_args(%arg3 = %{{.*}}) -> (index, i32) { + ! CHECK: fir.do_concurrent { + ! CHECK: fir.do_concurrent.loop (%{{.*}}) = (%{{.*}}) to (%{{.*}}) step (%{{.*}}) { ! CHECK: } ! CHECK: } do concurrent (integer(kind=1)::i=n:1:-1) @@ -128,16 +115,18 @@ subroutine lis(n) enddo enddo - ! CHECK: fir.do_loop %arg1 = %{{.*}} to %{{.*}} step %c1{{.*}} unordered { - ! 
CHECK: fir.do_loop %arg2 = %{{.*}} to %{{.*}} step %c1{{.*}} unordered { + ! CHECK: fir.do_concurrent.loop (%{{.*}}, %{{.*}}) = (%{{.*}}, %{{.*}}) to (%{{.*}}, %{{.*}}) step (%{{.*}}, %{{.*}}) { ! CHECK: fir.if %{{.*}} { ! CHECK: %[[V_95:[0-9]+]] = fir.alloca !fir.array, %{{.*}}, %{{.*}} {bindc_name = "t", pinned, uniq_name = "_QFlisEt"} ! CHECK: %[[V_96:[0-9]+]] = fir.alloca !fir.box>> {bindc_name = "p", pinned, uniq_name = "_QFlisEp"} ! CHECK: fir.store %{{.*}} to %[[V_96]] : !fir.ref>>> ! CHECK: fir.do_loop %arg3 = %{{.*}} to %{{.*}} step %c1{{.*}} iter_args(%arg4 = %{{.*}}) -> (index, i32) { - ! CHECK: fir.do_loop %arg5 = %{{.*}} to %{{.*}} step %c1{{.*}} unordered { - ! CHECK: fir.load %[[V_96]] : !fir.ref>>> - ! CHECK: fir.convert %[[V_95]] : (!fir.ref>) -> !fir.ref> + ! CHECK: fir.do_concurrent { + ! CHECK: fir.alloca i32 {bindc_name = "m"} + ! CHECK: fir.do_concurrent.loop (%{{.*}}) = (%{{.*}}) to (%{{.*}}) step (%{{.*}}) { + ! CHECK: fir.load %[[V_96]] : !fir.ref>>> + ! CHECK: fir.convert %[[V_95]] : (!fir.ref>) -> !fir.ref> + ! CHECK: } ! CHECK: } ! CHECK: } ! CHECK: fir.convert %[[V_95]] : (!fir.ref>) -> !fir.ref> diff --git a/flang/test/Lower/loops3.f90 b/flang/test/Lower/loops3.f90 index 78f39e1013082..84db1972cca16 100644 --- a/flang/test/Lower/loops3.f90 +++ b/flang/test/Lower/loops3.f90 @@ -12,9 +12,7 @@ subroutine loop_test ! CHECK: %[[VAL_0:.*]] = fir.alloca f32 {bindc_name = "m", uniq_name = "_QFloop_testEm"} ! CHECK: %[[VAL_1:.*]] = fir.address_of(@_QFloop_testEsum) : !fir.ref - ! CHECK: fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered reduce(#fir.reduce_attr -> %[[VAL_1:.*]] : !fir.ref, #fir.reduce_attr -> %[[VAL_0:.*]] : !fir.ref) { - ! CHECK: fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered reduce(#fir.reduce_attr -> %[[VAL_1:.*]] : !fir.ref, #fir.reduce_attr -> %[[VAL_0:.*]] : !fir.ref) { - ! 
CHECK: fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered reduce(#fir.reduce_attr -> %[[VAL_1:.*]] : !fir.ref, #fir.reduce_attr -> %[[VAL_0:.*]] : !fir.ref) { + ! CHECK: fir.do_concurrent.loop ({{.*}}) = ({{.*}}) to ({{.*}}) step ({{.*}}) reduce(#fir.reduce_attr -> %[[VAL_1:.*]] : !fir.ref, #fir.reduce_attr -> %[[VAL_0:.*]] : !fir.ref) { do concurrent (i=1:5, j=1:5, k=1:5) local(tmp) reduce(+:sum) reduce(max:m) tmp = i + j + k sum = tmp + sum diff --git a/flang/test/Lower/nsw.f90 b/flang/test/Lower/nsw.f90 index 4ee9e5da829e6..2ec1efb2af42a 100644 --- a/flang/test/Lower/nsw.f90 +++ b/flang/test/Lower/nsw.f90 @@ -139,7 +139,6 @@ subroutine loop_params3(a,lb,ub,st) ! CHECK-LABEL: func.func @_QPloop_params3( ! CHECK: %[[VAL_4:.*]] = arith.constant 2 : i32 ! CHECK: %[[VAL_5:.*]] = arith.constant 1 : i32 -! CHECK: %[[VAL_9:.*]] = fir.declare %{{.*}}i"} : (!fir.ref) -> !fir.ref ! CHECK: %[[VAL_11:.*]] = fir.declare %{{.*}}lb"} : (!fir.ref, !fir.dscope) -> !fir.ref ! CHECK: %[[VAL_12:.*]] = fir.declare %{{.*}}ub"} : (!fir.ref, !fir.dscope) -> !fir.ref ! CHECK: %[[VAL_14:.*]] = fir.declare %{{.*}}i"} : (!fir.ref) -> !fir.ref @@ -153,4 +152,6 @@ subroutine loop_params3(a,lb,ub,st) ! CHECK: %[[VAL_31:.*]] = fir.load %[[VAL_15]] : !fir.ref ! CHECK: %[[VAL_32:.*]] = arith.muli %[[VAL_31]], %[[VAL_4]] overflow : i32 ! CHECK: %[[VAL_33:.*]] = fir.convert %[[VAL_32]] : (i32) -> index -! CHECK: fir.do_loop %[[VAL_34:.*]] = %[[VAL_28]] to %[[VAL_30]] step %[[VAL_33]] unordered { +! CHECK: fir.do_concurrent { +! CHECK: %[[VAL_9:.*]] = fir.declare %{{.*}}i"} : (!fir.ref) -> !fir.ref +! CHECK: fir.do_concurrent.loop (%[[VAL_34:.*]]) = (%[[VAL_28]]) to (%[[VAL_30]]) step (%[[VAL_33]]) { diff --git a/flang/test/Transforms/DoConcurrent/basic_host.f90 b/flang/test/Transforms/DoConcurrent/basic_host.f90 index 12f63031cbaee..b84d4481ac766 100644 --- a/flang/test/Transforms/DoConcurrent/basic_host.f90 +++ b/flang/test/Transforms/DoConcurrent/basic_host.f90 @@ -1,3 +1,6 @@ +! 
Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! Tests mapping of a basic `do concurrent` loop to `!$omp parallel do`. ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host %s -o - \ diff --git a/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 b/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 index f82696669eca6..4e13c0919589a 100644 --- a/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 +++ b/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 @@ -1,3 +1,6 @@ +! Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! Tests that "loop-local values" are properly handled by localizing them to the ! body of the loop nest. See `collectLoopLocalValues` and `localizeLoopLocalValue` ! for a definition of "loop-local values" and how they are handled. diff --git a/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 b/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 index 32bed61fe69e4..adc4a488d1ec9 100644 --- a/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 +++ b/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 @@ -1,3 +1,6 @@ +! Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! Tests loop-nest detection algorithm for do-concurrent mapping. ! REQUIRES: asserts diff --git a/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 b/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 index d0210726de83e..26800678d381c 100644 --- a/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 +++ b/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 @@ -1,3 +1,6 @@ +! Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! Tests mapping of a `do concurrent` loop with multiple iteration ranges. ! 
RUN: split-file %s %t diff --git a/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 b/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 index cd1bd4f98a3f5..23a3aae976c07 100644 --- a/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 +++ b/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 @@ -1,3 +1,6 @@ +! Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host %s -o - \ ! RUN: | FileCheck %s diff --git a/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 b/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 index 184fdfe00d397..d1c02101318ab 100644 --- a/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 +++ b/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 @@ -1,3 +1,6 @@ +! Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! Tests that if `do concurrent` is not perfectly nested in its parent loop, that ! we skip converting the not-perfectly nested `do concurrent` loop. From flang-commits at lists.llvm.org Mon May 5 04:20:47 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Mon, 05 May 2025 04:20:47 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Lower `do concurrent` loop nests to `fir.do_concurrent` (PR #137928) In-Reply-To: Message-ID: <68189f0f.170a0220.19e4a3.4d83@mx.google.com> ergawy wrote: > Is this now ready for review? Are the issues with your downstream fork resolved and is the RFC for the representation of locality specifiers sufficiently discussed that you would like to merge this? Once https://github.com/llvm/llvm-project/pull/138489 is reviewed and approved, I will merge both PRs. 
https://github.com/llvm/llvm-project/pull/137928

From flang-commits at lists.llvm.org Mon May 5 07:01:04 2025
From: flang-commits at lists.llvm.org (Pranav Bhandarkar via flang-commits)
Date: Mon, 05 May 2025 07:01:04 -0700 (PDT)
Subject: [flang-commits] [flang] [Flang][MLIR] - Access the LEN for a `fir.boxchar` and use it to set the bounds `omp.map.info` ops. (PR #134967)
In-Reply-To:
Message-ID: <6818c4a0.050a0220.31ebd1.3c39@mx.google.com>

bhandarkar-pranav wrote:

@vzakhari - I have changed my approach and updated this PR. I have tested this locally to the extent that I could and things work fine. Could you please review this PR?
@agozillon @TIFitis @raghavendhra @ergawy @skatrak - Could you please review this PR? TIA

https://github.com/llvm/llvm-project/pull/134967

From flang-commits at lists.llvm.org Mon May 5 08:07:55 2025
From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits)
Date: Mon, 05 May 2025 08:07:55 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Add lowering of volatile references (PR #132486)
In-Reply-To:
Message-ID: <6818d44b.050a0220.fda0d.838c@mx.google.com>

DanielCChen wrote:

@ashermancinelli We still have a few regressions from this patch after I pulled the latest source just now (May 5).
The following is a reducer.
```
type base
integer :: p(10)
end type
type another
class(*), pointer :: p(:)
end type
type(another), volatile :: a
type(base), volatile, target :: b
a%p => b%p
end
```

Flang complains
```
./t4.f:14:5: warning: VOLATILE target associated with non-VOLATILE pointer
a%p => b%p
^^^^^^^^^^
./t4.f:6:30: Declaration of 'p'
class(*), pointer :: p(:)
^
```

The standard says in [8.5.20 VOLATILE attribute]
`If an object has the VOLATILE attribute, then all of its subobjects also have the VOLATILE attribute.`

So this code should be valid.
The full test case fails at verification of lowering.

https://github.com/llvm/llvm-project/pull/132486

From flang-commits at lists.llvm.org Mon May 5 08:12:04 2025
From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits)
Date: Mon, 05 May 2025 08:12:04 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Add lowering of volatile references (PR #132486)
In-Reply-To:
Message-ID: <6818d544.170a0220.202f91.1f6a@mx.google.com>

ashermancinelli wrote:

Hello @DanielCChen, thank you for the report. #138339 fixes your test case on the machines I just tested on, and I'd like to merge that as soon as I can.

https://github.com/llvm/llvm-project/pull/132486

From flang-commits at lists.llvm.org Mon May 5 08:13:57 2025
From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits)
Date: Mon, 05 May 2025 08:13:57 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Component references are volatile if their parent is (PR #138339)
In-Reply-To:
Message-ID: <6818d5b5.050a0220.1c65cf.8858@mx.google.com>

https://github.com/eugeneepshteyn approved this pull request.

https://github.com/llvm/llvm-project/pull/138339

From flang-commits at lists.llvm.org Mon May 5 08:14:39 2025
From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits)
Date: Mon, 05 May 2025 08:14:39 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Add lowering of volatile references (PR #132486)
In-Reply-To:
Message-ID: <6818d5df.170a0220.3beb16.d129@mx.google.com>

DanielCChen wrote:

Got it! Thanks! I will verify the rest of the regression with that patch.

https://github.com/llvm/llvm-project/pull/132486

From flang-commits at lists.llvm.org Mon May 5 08:22:08 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 05 May 2025 08:22:08 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][docs] Add note about Cray pointers and the TARGET attribute (PR #137993)
In-Reply-To:
Message-ID: <6818d7a0.a70a0220.1ba177.a648@mx.google.com>

https://github.com/jeanPerier approved this pull request.

Thanks Asher! LGTM.

https://github.com/llvm/llvm-project/pull/137993

From flang-commits at lists.llvm.org Mon May 5 08:32:12 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 05 May 2025 08:32:12 -0700 (PDT)
Subject: [flang-commits] [flang] 8870ce1 - [flang][docs] Add note about Cray pointers and the TARGET attribute (#137993)
Message-ID: <6818d9fc.170a0220.28eb84.f366@mx.google.com>

Author: Asher Mancinelli
Date: 2025-05-05T08:32:08-07:00
New Revision: 8870ce1aa0ddf4df4f25ae834a1fab167895892d

URL: https://github.com/llvm/llvm-project/commit/8870ce1aa0ddf4df4f25ae834a1fab167895892d
DIFF: https://github.com/llvm/llvm-project/commit/8870ce1aa0ddf4df4f25ae834a1fab167895892d.diff

LOG: [flang][docs] Add note about Cray pointers and the TARGET attribute (#137993)

We found some tests checking for loops assigning between Cray pointer
handles and their pointees which produced "incorrect" results with
optimizations enabled; this is because the compiler expects Cray
pointers not to alias with any other entity.

[The HPE documentation for Cray Fortran extensions
specifies:](https://support.hpe.com/hpesc/public/docDisplay?docId=a00113911en_us&docLocale=en_US&page=Types.html#cray-poiter-type)

> the compiler assumes that the storage of a pointee is
> never overlaid on the storage of another variable

Jean pointed out that if a user's code uses entities that alias via Cray
pointers, they may add the TARGET attribute to inform Flang of this
aliasing, but that Flang's behavior is in line with Cray's own
documentation and we should not make any changes to our alias analysis
to try and detect this case.

Updating documentation so that users that encounter this situation have
a way to allow their code to compile as they intend.

Added: Modified: flang/docs/Aliasing.md Removed: ################################################################################ diff --git a/flang/docs/Aliasing.md b/flang/docs/Aliasing.md index 652b766541fd4..d5edf59f37c6e 100644 --- a/flang/docs/Aliasing.md +++ b/flang/docs/Aliasing.md @@ -264,11 +264,30 @@ Fortran also has no rule against associating read-only data with a pointer. Cray pointers are, or were, an extension that attempted to provide some of the capabilities of modern pointers and allocatables before those features were standardized. -They had some aliasing restrictions; in particular, Cray pointers were -not allowed to alias each other. -They are now more or less obsolete and we have no plan in place to -support them. +They had some aliasing restrictions; in particular, Cray pointers were not +allowed to alias each other. + +In this example, `handle` aliases with `target`. + +``` +integer(kind=8) :: target(10) +integer(kind=8) :: ptr +integer(kind=8) :: handle(10) +pointer(ptr, handle) +target = 1 +ptr = loc(target) +print *, target +end +``` + +Optimizations assume that Cray pointers do not alias any other variables. +In the above example, it is assumed that `handle` and `target` do not alias, +and optimizations will treat them as separate entities. + +In order to disable optimizations that assume that there is no aliasing between +Cray pointer targets and entities they alias with, add the TARGET attribute to +variables aliasing with a Cray pointer (the `target` variable in this example). 
## Type considerations

From flang-commits at lists.llvm.org Mon May 5 08:32:16 2025
From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits)
Date: Mon, 05 May 2025 08:32:16 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][docs] Add note about Cray pointers and the TARGET attribute (PR #137993)
In-Reply-To:
Message-ID: <6818da00.630a0220.576ff.3cd7@mx.google.com>

https://github.com/ashermancinelli closed https://github.com/llvm/llvm-project/pull/137993

From flang-commits at lists.llvm.org Mon May 5 08:56:33 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 05 May 2025 08:56:33 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210)
In-Reply-To:
Message-ID: <6818dfb1.170a0220.3683b7.4b11@mx.google.com>

https://github.com/jeanPerier edited https://github.com/llvm/llvm-project/pull/138210

From flang-commits at lists.llvm.org Mon May 5 08:56:33 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 05 May 2025 08:56:33 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210)
In-Reply-To:
Message-ID: <6818dfb1.620a0220.184ea2.98a7@mx.google.com>

https://github.com/jeanPerier commented:

Makes sense to me, just a question about why using `info.rawInput` vs `info.addr`.

https://github.com/llvm/llvm-project/pull/138210

From flang-commits at lists.llvm.org Mon May 5 08:56:33 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 05 May 2025 08:56:33 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210)
In-Reply-To:
Message-ID: <6818dfb1.170a0220.32287a.4616@mx.google.com>

================
@@ -156,9 +156,9 @@ genBoundsOpsFromBox(fir::FirOpBuilder &builder, mlir::Location loc,
       builder.genIfOp(loc, resTypes, info.isPresent, /*withElseRegion=*/true)
           .genThen([&]() {
             mlir::Value box =
-                !fir::isBoxAddress(info.addr.getType())
+                !fir::isBoxAddress(info.rawInput.getType())
                     ? info.addr
-                    : builder.create(loc, info.addr);
+                    : builder.create(loc, info.rawInput);
----------------
jeanPerier wrote:

Why is the change from addr to rawInput needed here?

I am asking because I am considering removing the hlfir.declare second result as an IR design simplification, since it should be possible to get anything starting from the first result, and I see that rawInput is specifically set to be the second result [here](https://github.com/llvm/llvm-project/blob/82387ec13258f67a530ddb615a49e0f36e8575e1/flang/include/flang/Optimizer/Builder/DirectivesCommon.h#L63).

https://github.com/llvm/llvm-project/pull/138210

From flang-commits at lists.llvm.org Mon May 5 08:56:33 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 05 May 2025 08:56:33 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210)
In-Reply-To:
Message-ID: <6818dfb1.170a0220.2b1ea4.4353@mx.google.com>

================
@@ -151,16 +152,23 @@ class MapInfoFinalizationPass
     mlir::Location loc = boxMap->getLoc();
     assert(allocaBlock && "No alloca block found for this top level op");
     builder.setInsertionPointToStart(allocaBlock);
-    auto alloca = builder.create(loc, descriptor.getType());
+
+    mlir::Type allocaType = descriptor.getType();
+    if (fir::isTypeWithDescriptor(allocaType) &&
+        !mlir::isa(descriptor.getType()))
----------------
jeanPerier wrote:

```suggestion
    if (fir::isBoxAddress(allocaType))
```

https://github.com/llvm/llvm-project/pull/138210

From flang-commits at lists.llvm.org Mon May 5 09:00:48 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 05 May 2025 09:00:48 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Component references are volatile if their parent is (PR #138339)
In-Reply-To:
Message-ID: <6818e0b0.050a0220.24fefa.a123@mx.google.com>

https://github.com/jeanPerier approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/138339

From flang-commits at lists.llvm.org Mon May 5 09:50:45 2025
From: flang-commits at lists.llvm.org (Razvan Lupusoru via flang-commits)
Date: Mon, 05 May 2025 09:50:45 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012)
In-Reply-To:
Message-ID: <6818ec65.170a0220.2e53e6.57a8@mx.google.com>

https://github.com/razvanlupusoru approved this pull request.

Impressive work! Thank you!

https://github.com/llvm/llvm-project/pull/136012

From flang-commits at lists.llvm.org Mon May 5 09:57:41 2025
From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits)
Date: Mon, 05 May 2025 09:57:41 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012)
In-Reply-To:
Message-ID: <6818ee05.050a0220.1e8272.bc67@mx.google.com>

https://github.com/clementval approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/136012

From flang-commits at lists.llvm.org Mon May 5 10:03:56 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 05 May 2025 10:03:56 -0700 (PDT)
Subject: [flang-commits] [flang] c0e52f3 - [flang] Component references are volatile if their parent is (#138339)
Message-ID: <6818ef7c.170a0220.3053bd.6830@mx.google.com>

Author: Asher Mancinelli
Date: 2025-05-05T10:03:54-07:00
New Revision: c0e52f3ec7f147c2c1414ef0f2a5f08c413a587b

URL: https://github.com/llvm/llvm-project/commit/c0e52f3ec7f147c2c1414ef0f2a5f08c413a587b
DIFF: https://github.com/llvm/llvm-project/commit/c0e52f3ec7f147c2c1414ef0f2a5f08c413a587b.diff

LOG: [flang] Component references are volatile if their parent is (#138339)

Component references inherit volatility from their base derived types.
Moved the base type volatility check before the box type is built, and
merged its result with the volatility already computed for the
designator instead of overwriting it.

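The effect of moving the check and of merging rather than overwriting can be modelled in a few lines of plain C++. This is only a sketch under stated assumptions: `PartInfo`, `ResultType`, `computeBefore` and `computeAfter` are hypothetical names invented for illustration, and `needsBox` stands in for the dynamic-size and contiguity conditions in the real code. None of it is Flang's actual API.

```cpp
#include <cassert>

// Hypothetical stand-in for the designator's base: whether a base exists
// and, if so, whether its type is volatile. Not a Flang type.
struct PartInfo {
  bool hasBase;
  bool baseIsVolatile;
};

enum class Kind { Box, Reference };

// Hypothetical model of the computed result type.
struct ResultType {
  Kind kind;
  bool isVolatile;
};

// Models the removed code: the box type is returned before the base is
// inspected, and on the reference path the base check overwrites the flag.
ResultType computeBefore(bool designatorIsVolatile, PartInfo part,
                         bool needsBox) {
  bool isVolatile = designatorIsVolatile;
  if (needsBox)
    return {Kind::Box, isVolatile};
  if (part.hasBase)
    isVolatile = part.baseIsVolatile; // overwrite: may drop an earlier "true"
  return {Kind::Reference, isVolatile};
}

// Models the added code: the base is inspected first and its volatility is
// OR-ed into the flag, so both the box and the reference path see it.
ResultType computeAfter(bool designatorIsVolatile, PartInfo part,
                        bool needsBox) {
  bool isVolatile = designatorIsVolatile;
  if (part.hasBase)
    isVolatile = isVolatile || part.baseIsVolatile; // merge
  if (needsBox)
    return {Kind::Box, isVolatile};
  return {Kind::Reference, isVolatile};
}
```

In this model, a boxed section of a volatile base (the shape of `f%e(::2)` in the new test) corresponds to `computeAfter(false, PartInfo{true, true}, true)`, which yields a volatile box, while `computeBefore` with the same inputs loses the volatility.
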
Added: flang/test/Lower/volatile-derived-type.f90 Modified: flang/lib/Lower/ConvertExprToHLFIR.cpp Removed: ################################################################################ diff --git a/flang/lib/Lower/ConvertExprToHLFIR.cpp b/flang/lib/Lower/ConvertExprToHLFIR.cpp index a3be50ac072d4..5981116a6d3f7 100644 --- a/flang/lib/Lower/ConvertExprToHLFIR.cpp +++ b/flang/lib/Lower/ConvertExprToHLFIR.cpp @@ -236,6 +236,12 @@ class HlfirDesignatorBuilder { isVolatile = true; } + // Check if the base type is volatile + if (partInfo.base.has_value()) { + mlir::Type baseType = partInfo.base.value().getType(); + isVolatile = isVolatile || fir::isa_volatile_type(baseType); + } + // Arrays with non default lower bounds or dynamic length or dynamic extent // need a fir.box to hold the dynamic or lower bound information. if (fir::hasDynamicSize(resultValueType) || @@ -249,12 +255,6 @@ class HlfirDesignatorBuilder { /*namedConstantSectionsAreAlwaysContiguous=*/false)) return fir::BoxType::get(resultValueType, isVolatile); - // Check if the base type is volatile - if (partInfo.base.has_value()) { - mlir::Type baseType = partInfo.base.value().getType(); - isVolatile = fir::isa_volatile_type(baseType); - } - // Other designators can be handled as raw addresses. return fir::ReferenceType::get(resultValueType, isVolatile); } diff --git a/flang/test/Lower/volatile-derived-type.f90 b/flang/test/Lower/volatile-derived-type.f90 new file mode 100644 index 0000000000000..edd77a9265530 --- /dev/null +++ b/flang/test/Lower/volatile-derived-type.f90 @@ -0,0 +1,48 @@ +! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s +! Ensure member access of a volatile derived type is volatile. + type t + integer :: e(4)=2 + end type t + type(t), volatile :: f + call test (f%e(::2)) +contains + subroutine test(v) + integer, asynchronous :: v(:) + end subroutine +end +! CHECK-LABEL: func.func @_QQmain() { +! CHECK: %[[VAL_0:.*]] = arith.constant 4 : index +! 
CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +! CHECK: %[[VAL_2:.*]] = arith.constant 2 : index +! CHECK: %[[VAL_3:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_4:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[VAL_5:.*]] = fir.address_of(@_QFE.b.t.e) : !fir.ref,value:i64}>>> +! CHECK: %[[VAL_6:.*]] = fir.shape_shift %[[VAL_3]], %[[VAL_2]], %[[VAL_3]], %[[VAL_1]] : (index, index, index, index) -> !fir.shapeshift<2> +! CHECK: %[[VAL_7:.*]]:2 = hlfir.declare %[[VAL_5]](%[[VAL_6]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.b.t.e"} : +! CHECK: %[[VAL_8:.*]] = fir.address_of(@_QFE.n.e) : !fir.ref> +! CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_8]] typeparams %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.n.e"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %[[VAL_10:.*]] = fir.address_of(@_QFE.di.t.e) : !fir.ref> +! CHECK: %[[VAL_11:.*]] = fir.shape %[[VAL_0]] : (index) -> !fir.shape<1> +! CHECK: %[[VAL_12:.*]]:2 = hlfir.declare %[[VAL_10]](%[[VAL_11]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.di.t.e"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) +! CHECK: %[[VAL_13:.*]] = fir.address_of(@_QFE.n.t) : !fir.ref> +! CHECK: %[[VAL_14:.*]]:2 = hlfir.declare %[[VAL_13]] typeparams %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.n.t"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %[[VAL_15:.*]] = fir.alloca !fir.type<_QFTt{e:!fir.array<4xi32>}> {bindc_name = "f", uniq_name = "_QFEf"} +! CHECK: %[[VAL_16:.*]] = fir.volatile_cast %[[VAL_15]] : (!fir.ref}>>) -> !fir.ref}>, volatile> +! CHECK: %[[VAL_17:.*]]:2 = hlfir.declare %[[VAL_16]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEf"} : (!fir.ref}>, volatile>) -> (!fir.ref}>, volatile>, !fir.ref}>, volatile>) +! CHECK: %[[VAL_18:.*]] = fir.address_of(@_QQ_QFTt.DerivedInit) : !fir.ref}>> +! CHECK: fir.copy %[[VAL_18]] to %[[VAL_17]]#0 no_overlap : !fir.ref}>>, !fir.ref}>, volatile> +! 
CHECK: %[[VAL_20:.*]] = fir.shape_shift %[[VAL_3]], %[[VAL_1]] : (index, index) -> !fir.shapeshift<1> +! CHECK: %[[VAL_21:.*]]:2 = hlfir.declare %{{.+}}(%[[VAL_20]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} : +! CHECK: %[[VAL_24:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +! CHECK: %[[VAL_25:.*]] = hlfir.designate %[[VAL_17]]#0{"e"} <%[[VAL_11]]> (%[[VAL_1]]:%[[VAL_0]]:%[[VAL_2]]) shape %[[VAL_24]] : (!fir.ref}>, volatile>, !fir.shape<1>, index, index, index, !fir.shape<1>) -> !fir.box, volatile> +! CHECK: %[[VAL_26:.*]] = fir.volatile_cast %[[VAL_25]] : (!fir.box, volatile>) -> !fir.box> +! CHECK: %[[VAL_27:.*]] = fir.convert %[[VAL_26]] : (!fir.box>) -> !fir.box> +! CHECK: fir.call @_QFPtest(%[[VAL_27]]) fastmath : (!fir.box>) -> () +! CHECK: return +! CHECK: } +! CHECK-LABEL: func.func private @_QFPtest( +! CHECK-SAME: %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !fir.box> {fir.asynchronous, fir.bindc_name = "v"}) attributes {fir.host_symbol = @_QQmain, llvm.linkage = #llvm.linkage} { +! CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_0]] dummy_scope %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFFtestEv"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +! CHECK: return +! 
CHECK: } From flang-commits at lists.llvm.org Mon May 5 10:04:00 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Mon, 05 May 2025 10:04:00 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Component references are volatile if their parent is (PR #138339) In-Reply-To: Message-ID: <6818ef80.630a0220.a4ddf.77d3@mx.google.com> https://github.com/ashermancinelli closed https://github.com/llvm/llvm-project/pull/138339 From flang-commits at lists.llvm.org Mon May 5 08:07:15 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Mon, 05 May 2025 08:07:15 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <6818d423.630a0220.1166b1.ba9c@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/136012 >From 0f4591ee621e2e9d7acb0e6066b556cb7e243162 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Wed, 16 Apr 2025 12:01:24 -0700 Subject: [PATCH 01/12] initial commit --- flang/include/flang/Lower/AbstractConverter.h | 4 + flang/include/flang/Lower/OpenACC.h | 10 +- flang/include/flang/Semantics/symbol.h | 23 +- flang/lib/Lower/Bridge.cpp | 7 +- flang/lib/Lower/CallInterface.cpp | 10 + flang/lib/Lower/OpenACC.cpp | 197 ++++++++++++++---- flang/lib/Semantics/mod-file.cpp | 1 + flang/lib/Semantics/resolve-directives.cpp | 83 ++++---- 8 files changed, 233 insertions(+), 102 deletions(-) diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 1d1323642bf9c..59419e829718f 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -14,6 +14,7 @@ #define FORTRAN_LOWER_ABSTRACTCONVERTER_H #include "flang/Lower/LoweringOptions.h" +#include "flang/Lower/OpenACC.h" #include "flang/Lower/PFTDefs.h" #include "flang/Optimizer/Builder/BoxValue.h" #include 
"flang/Optimizer/Dialect/FIRAttr.h" @@ -357,6 +358,9 @@ class AbstractConverter { /// functions in order to be in sync). virtual mlir::SymbolTable *getMLIRSymbolTable() = 0; + virtual Fortran::lower::AccRoutineInfoMappingList & + getAccDelayedRoutines() = 0; + private: /// Options controlling lowering behavior. const Fortran::lower::LoweringOptions &loweringOptions; diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 0d7038a7fd856..7832e8b69ea23 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -22,6 +22,9 @@ class StringRef; } // namespace llvm namespace mlir { +namespace func { +class FuncOp; +} class Location; class Type; class ModuleOp; @@ -42,6 +45,7 @@ struct OpenACCRoutineConstruct; } // namespace parser namespace semantics { +class OpenACCRoutineInfo; class SemanticsContext; class Symbol; } // namespace semantics @@ -79,8 +83,10 @@ void genOpenACCDeclarativeConstruct(AbstractConverter &, void genOpenACCRoutineConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, mlir::ModuleOp, - const parser::OpenACCRoutineConstruct &, - AccRoutineInfoMappingList &); + const parser::OpenACCRoutineConstruct &); +void genOpenACCRoutineConstruct( + AbstractConverter &, mlir::ModuleOp, mlir::func::FuncOp, + const std::vector &); void finalizeOpenACCRoutineAttachment(mlir::ModuleOp, AccRoutineInfoMappingList &); diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 715811885c219..1b6b247c9f5bc 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -127,6 +127,8 @@ class WithBindName { // Device type specific OpenACC routine information class OpenACCRoutineDeviceTypeInfo { public: + OpenACCRoutineDeviceTypeInfo(Fortran::common::OpenACCDeviceType dType) + : deviceType_{dType} {} bool isSeq() const { return isSeq_; } void set_isSeq(bool value = true) { isSeq_ = value; } bool isVector() const 
{ return isVector_; } @@ -141,9 +143,7 @@ class OpenACCRoutineDeviceTypeInfo { return bindName_ ? &*bindName_ : nullptr; } void set_bindName(std::string &&name) { bindName_ = std::move(name); } - void set_dType(Fortran::common::OpenACCDeviceType dType) { - deviceType_ = dType; - } + Fortran::common::OpenACCDeviceType dType() const { return deviceType_; } private: @@ -162,13 +162,24 @@ class OpenACCRoutineDeviceTypeInfo { // in as objects in the OpenACCRoutineDeviceTypeInfo list. class OpenACCRoutineInfo : public OpenACCRoutineDeviceTypeInfo { public: + OpenACCRoutineInfo() + : OpenACCRoutineDeviceTypeInfo(Fortran::common::OpenACCDeviceType::None) { + } bool isNohost() const { return isNohost_; } void set_isNohost(bool value = true) { isNohost_ = value; } - std::list &deviceTypeInfos() { + const std::list &deviceTypeInfos() const { return deviceTypeInfos_; } - void add_deviceTypeInfo(OpenACCRoutineDeviceTypeInfo &info) { - deviceTypeInfos_.push_back(info); + + OpenACCRoutineDeviceTypeInfo &add_deviceTypeInfo( + Fortran::common::OpenACCDeviceType type) { + return add_deviceTypeInfo(OpenACCRoutineDeviceTypeInfo(type)); + } + + OpenACCRoutineDeviceTypeInfo &add_deviceTypeInfo( + OpenACCRoutineDeviceTypeInfo &&info) { + deviceTypeInfos_.push_back(std::move(info)); + return deviceTypeInfos_.back(); } private: diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index b4d1197822a43..9285d587585f8 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -443,7 +443,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); Fortran::lower::genOpenACCRoutineConstruct( *this, bridge.getSemanticsContext(), bridge.getModule(), - d.routine, accRoutineInfos); + d.routine); builder = nullptr; }, }, @@ -4287,6 +4287,11 @@ class FirConverter : public Fortran::lower::AbstractConverter { return Fortran::lower::createMutableBox(loc, *this, expr, localSymbols); } + 
Fortran::lower::AccRoutineInfoMappingList & + getAccDelayedRoutines() override final { + return accRoutineInfos; + } + // Create the [newRank] array with the lower bounds to be passed to the // runtime as a descriptor. mlir::Value createLboundArray(llvm::ArrayRef lbounds, diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 226ba1e52c968..867248f16237e 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -1689,6 +1689,16 @@ class SignatureBuilder "SignatureBuilder should only be used once"); declare(); interfaceDetermined = true; + if (procDesignator && procDesignator->GetInterfaceSymbol() && + procDesignator->GetInterfaceSymbol() + ->has()) { + auto info = procDesignator->GetInterfaceSymbol() + ->get(); + if (!info.openACCRoutineInfos().empty()) { + genOpenACCRoutineConstruct(converter, converter.getModuleOp(), + getFuncOp(), info.openACCRoutineInfos()); + } + } return getFuncOp(); } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 3dd35ed9ae481..37b660408af6c 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -38,6 +38,7 @@ #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" +#include #define DEBUG_TYPE "flang-lower-openacc" @@ -4139,11 +4140,152 @@ static void attachRoutineInfo(mlir::func::FuncOp func, mlir::acc::RoutineInfoAttr::get(func.getContext(), routines)); } +void createOpenACCRoutineConstruct( + Fortran::lower::AbstractConverter &converter, mlir::Location loc, + mlir::ModuleOp mod, mlir::func::FuncOp funcOp, std::string funcName, + bool hasNohost, llvm::SmallVector &bindNames, + llvm::SmallVector &bindNameDeviceTypes, + llvm::SmallVector &gangDeviceTypes, + llvm::SmallVector &gangDimValues, + llvm::SmallVector &gangDimDeviceTypes, + llvm::SmallVector &seqDeviceTypes, + llvm::SmallVector &workerDeviceTypes, + llvm::SmallVector &vectorDeviceTypes) { + + 
std::stringstream routineOpName; + routineOpName << accRoutinePrefix.str() << routineCounter++; + + for (auto routineOp : mod.getOps()) { + if (routineOp.getFuncName().str().compare(funcName) == 0) { + // If the routine is already specified with the same clauses, just skip + // the operation creation. + if (compareDeviceTypeInfo(routineOp, bindNames, bindNameDeviceTypes, + gangDeviceTypes, gangDimValues, + gangDimDeviceTypes, seqDeviceTypes, + workerDeviceTypes, vectorDeviceTypes) && + routineOp.getNohost() == hasNohost) + return; + mlir::emitError(loc, "Routine already specified with different clauses"); + } + } + std::string routineOpStr = routineOpName.str(); + mlir::OpBuilder modBuilder(mod.getBodyRegion()); + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + modBuilder.create( + loc, routineOpStr, funcName, + bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames), + bindNameDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(bindNameDeviceTypes), + workerDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(workerDeviceTypes), + vectorDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(vectorDeviceTypes), + seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes), + hasNohost, /*implicit=*/false, + gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes), + gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues), + gangDimDeviceTypes.empty() ? nullptr + : builder.getArrayAttr(gangDimDeviceTypes)); + + if (funcOp) + attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr)); + else + // FuncOp is not lowered yet. Keep the information so the routine info + // can be attached later to the funcOp. 
+    converter.getAccDelayedRoutines().push_back(
+        std::make_pair(funcName, builder.getSymbolRefAttr(routineOpStr)));
+}
+
+static void interpretRoutineDeviceInfo(
+    fir::FirOpBuilder &builder,
+    const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo,
+    llvm::SmallVector &seqDeviceTypes,
+    llvm::SmallVector &vectorDeviceTypes,
+    llvm::SmallVector &workerDeviceTypes,
+    llvm::SmallVector &bindNameDeviceTypes,
+    llvm::SmallVector &bindNames,
+    llvm::SmallVector &gangDeviceTypes,
+    llvm::SmallVector &gangDimValues,
+    llvm::SmallVector &gangDimDeviceTypes) {
+  mlir::MLIRContext *context{builder.getContext()};
+  if (dinfo.isSeq()) {
+    seqDeviceTypes.push_back(
+        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+  }
+  if (dinfo.isVector()) {
+    vectorDeviceTypes.push_back(
+        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+  }
+  if (dinfo.isWorker()) {
+    workerDeviceTypes.push_back(
+        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+  }
+  if (dinfo.isGang()) {
+    unsigned gangDim = dinfo.gangDim();
+    auto deviceType =
+        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()));
+    if (!gangDim) {
+      gangDeviceTypes.push_back(deviceType);
+    } else {
+      gangDimValues.push_back(
+          builder.getIntegerAttr(builder.getI64Type(), gangDim));
+      gangDimDeviceTypes.push_back(deviceType);
+    }
+  }
+  if (const std::string *bindName{dinfo.bindName()}) {
+    bindNames.push_back(builder.getStringAttr(*bindName));
+    bindNameDeviceTypes.push_back(
+        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+  }
+}
+
+void Fortran::lower::genOpenACCRoutineConstruct(
+    Fortran::lower::AbstractConverter &converter, mlir::ModuleOp mod,
+    mlir::func::FuncOp funcOp,
+    const std::vector &routineInfos) {
+  CHECK(funcOp && "Expected a valid function operation");
+  fir::FirOpBuilder &builder{converter.getFirOpBuilder()};
+  mlir::Location loc{funcOp.getLoc()};
+  std::string funcName{funcOp.getName()};
+
+  // Collect the routine clauses
+  bool hasNohost{false};
+
+  llvm::SmallVector seqDeviceTypes, vectorDeviceTypes,
+      workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes,
+      gangDimDeviceTypes, gangDimValues;
+
+  for (const Fortran::semantics::OpenACCRoutineInfo &info : routineInfos) {
+    // Device Independent Attributes
+    if (info.isNohost()) {
+      hasNohost = true;
+    }
+    // Note: Device Independent Attributes are set to the
+    // none device type in `info`.
+    interpretRoutineDeviceInfo(builder, info, seqDeviceTypes, vectorDeviceTypes,
+                               workerDeviceTypes, bindNameDeviceTypes,
+                               bindNames, gangDeviceTypes, gangDimValues,
+                               gangDimDeviceTypes);
+
+    // Device Dependent Attributes
+    for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo :
+         info.deviceTypeInfos()) {
+      interpretRoutineDeviceInfo(
+          builder, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes,
+          bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues,
+          gangDimDeviceTypes);
+    }
+  }
+  createOpenACCRoutineConstruct(
+      converter, loc, mod, funcOp, funcName, hasNohost, bindNames,
+      bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes,
+      seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes);
+}
+
 void Fortran::lower::genOpenACCRoutineConstruct(
     Fortran::lower::AbstractConverter &converter,
     Fortran::semantics::SemanticsContext &semanticsContext, mlir::ModuleOp mod,
-    const Fortran::parser::OpenACCRoutineConstruct &routineConstruct,
-    Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) {
+    const Fortran::parser::OpenACCRoutineConstruct &routineConstruct) {
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   mlir::Location loc = converter.genLocation(routineConstruct.source);
   std::optional name =
@@ -4174,6 +4316,7 @@ void Fortran::lower::genOpenACCRoutineConstruct(
       funcName = funcOp.getName();
     }
   }
+  // TODO: Refactor this to use the OpenACCRoutineInfo
   bool hasNohost = false;

   llvm::SmallVector seqDeviceTypes, vectorDeviceTypes,
@@ -4226,6 +4369,8 @@ void Fortran::lower::genOpenACCRoutineConstruct(
                    std::get_if(&clause.u)) {
       if (const auto *name =
               std::get_if(&bindClause->v.u)) {
+        // FIXME: This case mangles the name, the one below does not.
+        // which is correct?
         mlir::Attribute bindNameAttr =
             builder.getStringAttr(converter.mangleName(*name->symbol));
         for (auto crtDeviceTypeAttr : crtDeviceTypes) {
@@ -4255,47 +4400,10 @@ void Fortran::lower::genOpenACCRoutineConstruct(
     }
   }

-  mlir::OpBuilder modBuilder(mod.getBodyRegion());
-  std::stringstream routineOpName;
-  routineOpName << accRoutinePrefix.str() << routineCounter++;
-
-  for (auto routineOp : mod.getOps()) {
-    if (routineOp.getFuncName().str().compare(funcName) == 0) {
-      // If the routine is already specified with the same clauses, just skip
-      // the operation creation.
-      if (compareDeviceTypeInfo(routineOp, bindNames, bindNameDeviceTypes,
-                                gangDeviceTypes, gangDimValues,
-                                gangDimDeviceTypes, seqDeviceTypes,
-                                workerDeviceTypes, vectorDeviceTypes) &&
-          routineOp.getNohost() == hasNohost)
-        return;
-      mlir::emitError(loc, "Routine already specified with different clauses");
-    }
-  }
-
-  modBuilder.create(
-      loc, routineOpName.str(), funcName,
-      bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames),
-      bindNameDeviceTypes.empty() ? nullptr
-                                  : builder.getArrayAttr(bindNameDeviceTypes),
-      workerDeviceTypes.empty() ? nullptr
-                                : builder.getArrayAttr(workerDeviceTypes),
-      vectorDeviceTypes.empty() ? nullptr
-                                : builder.getArrayAttr(vectorDeviceTypes),
-      seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes),
-      hasNohost, /*implicit=*/false,
-      gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes),
-      gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues),
-      gangDimDeviceTypes.empty() ? nullptr
-                                 : builder.getArrayAttr(gangDimDeviceTypes));
-
-  if (funcOp)
-    attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpName.str()));
-  else
-    // FuncOp is not lowered yet. Keep the information so the routine info
-    // can be attached later to the funcOp.
-    accRoutineInfos.push_back(std::make_pair(
-        funcName, builder.getSymbolRefAttr(routineOpName.str())));
+  createOpenACCRoutineConstruct(
+      converter, loc, mod, funcOp, funcName, hasNohost, bindNames,
+      bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes,
+      seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes);
 }

 void Fortran::lower::finalizeOpenACCRoutineAttachment(
@@ -4443,8 +4551,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct(
             fir::FirOpBuilder &builder = converter.getFirOpBuilder();
             mlir::ModuleOp mod = builder.getModule();
             Fortran::lower::genOpenACCRoutineConstruct(
-                converter, semanticsContext, mod, routineConstruct,
-                accRoutineInfos);
+                converter, semanticsContext, mod, routineConstruct);
           },
       },
       accDeclConstruct.u);
diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp
index ee356e56e4458..befd204a671fc 100644
--- a/flang/lib/Semantics/mod-file.cpp
+++ b/flang/lib/Semantics/mod-file.cpp
@@ -1387,6 +1387,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic,
   parser::Options options;
   options.isModuleFile = true;
   options.features.Enable(common::LanguageFeature::BackslashEscapes);
+  options.features.Enable(common::LanguageFeature::OpenACC);
   options.features.Enable(common::LanguageFeature::OpenMP);
   options.features.Enable(common::LanguageFeature::CUDA);
   if (!isIntrinsic.value_or(false) && !notAModule) {
diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp
index d75b4ea13d35f..93c334a3ca3cb 100644
--- a/flang/lib/Semantics/resolve-directives.cpp
+++ b/flang/lib/Semantics/resolve-directives.cpp
@@ -1034,61 +1034,53 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
     Symbol &symbol, const parser::OpenACCRoutineConstruct &x) {
   if (symbol.has()) {
     Fortran::semantics::OpenACCRoutineInfo info;
+    std::vector currentDevices;
+    currentDevices.push_back(&info);
     const auto &clauses = std::get(x.t);
     for (const Fortran::parser::AccClause &clause : clauses.v) {
-      if (std::get_if(&clause.u)) {
-        if (info.deviceTypeInfos().empty()) {
-          info.set_isSeq();
-        } else {
-          info.deviceTypeInfos().back().set_isSeq();
+      if (const auto *dTypeClause =
+              std::get_if(&clause.u)) {
+        currentDevices.clear();
+        for (const auto &deviceTypeExpr : dTypeClause->v.v) {
+          currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v));
         }
+      } else if (std::get_if(&clause.u)) {
+        info.set_isNohost();
+      } else if (std::get_if(&clause.u)) {
+        for (auto &device : currentDevices)
+          device->set_isSeq();
+      } else if (std::get_if(&clause.u)) {
+        for (auto &device : currentDevices)
+          device->set_isVector();
+      } else if (std::get_if(&clause.u)) {
+        for (auto &device : currentDevices)
+          device->set_isWorker();
       } else if (const auto *gangClause =
                      std::get_if(&clause.u)) {
-        if (info.deviceTypeInfos().empty()) {
-          info.set_isGang();
-        } else {
-          info.deviceTypeInfos().back().set_isGang();
-        }
+        for (auto &device : currentDevices)
+          device->set_isGang();
         if (gangClause->v) {
           const Fortran::parser::AccGangArgList &x = *gangClause->v;
+          int numArgs{0};
           for (const Fortran::parser::AccGangArg &gangArg : x.v) {
+            CHECK(numArgs <= 1 && "expecting 0 or 1 gang dim args");
             if (const auto *dim =
                     std::get_if(&gangArg.u)) {
               if (const auto v{EvaluateInt64(context_, dim->v)}) {
-                if (info.deviceTypeInfos().empty()) {
-                  info.set_gangDim(*v);
-                } else {
-                  info.deviceTypeInfos().back().set_gangDim(*v);
-                }
+                for (auto &device : currentDevices)
+                  device->set_gangDim(*v);
               }
             }
+            numArgs++;
           }
         }
-      } else if (std::get_if(&clause.u)) {
-        if (info.deviceTypeInfos().empty()) {
-          info.set_isVector();
-        } else {
-          info.deviceTypeInfos().back().set_isVector();
-        }
-      } else if (std::get_if(&clause.u)) {
-        if (info.deviceTypeInfos().empty()) {
-          info.set_isWorker();
-        } else {
-          info.deviceTypeInfos().back().set_isWorker();
-        }
-      } else if (std::get_if(&clause.u)) {
-        info.set_isNohost();
       } else if (const auto *bindClause =
                      std::get_if(&clause.u)) {
+        std::string bindName = "";
         if (const auto *name =
                 std::get_if(&bindClause->v.u)) {
           if (Symbol *sym = ResolveFctName(*name)) {
-            if (info.deviceTypeInfos().empty()) {
-              info.set_bindName(sym->name().ToString());
-            } else {
-              info.deviceTypeInfos().back().set_bindName(
-                  sym->name().ToString());
-            }
+            bindName = sym->name().ToString();
           } else {
             context_.Say((*name).source,
                 "No function or subroutine declared for '%s'"_err_en_US,
@@ -1101,21 +1093,16 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
               Fortran::parser::Unwrap(
                   *charExpr);
           std::string str{std::get(charConst->t)};
-          std::stringstream bindName;
-          bindName << "\"" << str << "\"";
-          if (info.deviceTypeInfos().empty()) {
-            info.set_bindName(bindName.str());
-          } else {
-            info.deviceTypeInfos().back().set_bindName(bindName.str());
+          std::stringstream bindNameStream;
+          bindNameStream << "\"" << str << "\"";
+          bindName = bindNameStream.str();
+        }
+        if (!bindName.empty()) {
+          // Fixme: do we need to ensure there there is only one device?
+          for (auto &device : currentDevices) {
+            device->set_bindName(std::string(bindName));
           }
         }
-      } else if (const auto *dType =
-                     std::get_if(
-                         &clause.u)) {
-        const parser::AccDeviceTypeExprList &deviceTypeExprList = dType->v;
-        OpenACCRoutineDeviceTypeInfo dtypeInfo;
-        dtypeInfo.set_dType(deviceTypeExprList.v.front().v);
-        info.add_deviceTypeInfo(dtypeInfo);
       }
     }
     symbol.get().add_openACCRoutineInfo(info);

>From 1b6da293788edc56eea566f5c15126de6955169c Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Tue, 22 Apr 2025 16:06:33 -0700
Subject: [PATCH 02/12] fix includes

---
 flang/lib/Lower/OpenACC.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 37b660408af6c..a3ebd9b931dc6 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -32,13 +32,13 @@
 #include "flang/Semantics/expression.h"
 #include "flang/Semantics/scope.h"
 #include "flang/Semantics/tools.h"
-#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h"
-#include "mlir/Support/LLVM.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Frontend/OpenACC/ACC.h.inc"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
-#include
+#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h"
+#include "mlir/IR/MLIRContext.h"
+#include "mlir/Support/LLVM.h"

 #define DEBUG_TYPE "flang-lower-openacc"

>From 7b65ac4c477e5e46bf369a3a9f94f69cf496ef6b Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Wed, 23 Apr 2025 13:50:19 -0700
Subject: [PATCH 03/12] adding test

---
 flang/include/flang/Semantics/symbol.h | 7 +++++-
 flang/lib/Semantics/resolve-directives.cpp | 6 ++++-
 .../Lower/OpenACC/acc-module-definition.f90 | 17 ++++++++++++++
 .../Lower/OpenACC/acc-routine-use-module.f90 | 23 +++++++++++++++++++
 4 files changed, 51 insertions(+), 2 deletions(-)
 create mode 100644 flang/test/Lower/OpenACC/acc-module-definition.f90
 create mode 100644 flang/test/Lower/OpenACC/acc-routine-use-module.f90

diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h
index 1b6b247c9f5bc..fe6c73997733a 100644
--- a/flang/include/flang/Semantics/symbol.h
+++ b/flang/include/flang/Semantics/symbol.h
@@ -142,7 +142,11 @@ class OpenACCRoutineDeviceTypeInfo {
   const std::string *bindName() const {
     return bindName_ ? &*bindName_ : nullptr;
   }
-  void set_bindName(std::string &&name) { bindName_ = std::move(name); }
+  bool bindNameIsInternal() const {return bindNameIsInternal_;}
+  void set_bindName(std::string &&name, bool isInternal=false) {
+    bindName_ = std::move(name);
+    bindNameIsInternal_ = isInternal;
+  }

   Fortran::common::OpenACCDeviceType dType() const { return deviceType_; }

@@ -153,6 +157,7 @@ class OpenACCRoutineDeviceTypeInfo {
   bool isGang_{false};
   unsigned gangDim_{0};
   std::optional bindName_;
+  bool bindNameIsInternal_{false};
   Fortran::common::OpenACCDeviceType deviceType_{
       Fortran::common::OpenACCDeviceType::None};
 };
diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp
index 93c334a3ca3cb..8fb3559c34426 100644
--- a/flang/lib/Semantics/resolve-directives.cpp
+++ b/flang/lib/Semantics/resolve-directives.cpp
@@ -1077,10 +1077,12 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
       } else if (const auto *bindClause =
                      std::get_if(&clause.u)) {
         std::string bindName = "";
+        bool isInternal = false;
         if (const auto *name =
                 std::get_if(&bindClause->v.u)) {
           if (Symbol *sym = ResolveFctName(*name)) {
             bindName = sym->name().ToString();
+            isInternal = true;
           } else {
             context_.Say((*name).source,
                 "No function or subroutine declared for '%s'"_err_en_US,
@@ -1100,12 +1102,14 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
         if (!bindName.empty()) {
           // Fixme: do we need to ensure there there is only one device?
           for (auto &device : currentDevices) {
-            device->set_bindName(std::string(bindName));
+            device->set_bindName(std::string(bindName), isInternal);
           }
         }
       }
     }
     symbol.get().add_openACCRoutineInfo(info);
+  } else {
+    llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() << "\n";
   }
 }

diff --git a/flang/test/Lower/OpenACC/acc-module-definition.f90 b/flang/test/Lower/OpenACC/acc-module-definition.f90
new file mode 100644
index 0000000000000..36e41fc631c77
--- /dev/null
+++ b/flang/test/Lower/OpenACC/acc-module-definition.f90
@@ -0,0 +1,17 @@
+! RUN: rm -fr %t && mkdir -p %t && cd %t
+! RUN: bbc -fopenacc -emit-fir %s
+! RUN: cat mod1.mod | FileCheck %s
+
+!CHECK-LABEL: module mod1
+module mod1
+  contains
+  !CHECK subroutine callee(aa)
+  subroutine callee(aa)
+  !CHECK: !$acc routine seq
+  !$acc routine seq
+    integer :: aa
+    aa = 1
+  end subroutine
+  !CHECK: end
+  !CHECK: end
+end module
\ No newline at end of file
diff --git a/flang/test/Lower/OpenACC/acc-routine-use-module.f90 b/flang/test/Lower/OpenACC/acc-routine-use-module.f90
new file mode 100644
index 0000000000000..7fc96b0ef5684
--- /dev/null
+++ b/flang/test/Lower/OpenACC/acc-routine-use-module.f90
@@ -0,0 +1,23 @@
+! RUN: rm -fr %t && mkdir -p %t && cd %t
+! RUN: bbc -emit-fir %S/acc-module-definition.f90
+! RUN: bbc -emit-fir %s -o - | FileCheck %s
+
+! This test module is based off of flang/test/Lower/use_module.f90
+! The first runs ensures the module file is generated.
+
+module use_mod1
+  use mod1
+  contains
+  !CHECK: func.func @_QMuse_mod1Pcaller
+  !CHECK-SAME {
+  subroutine caller(aa)
+    integer :: aa
+    !$acc serial
+    !CHECK: fir.call @_QMmod1Pcallee
+    call callee(aa)
+    !$acc end serial
+  end subroutine
+  !CHECK: }
+  !CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq
+  !CHECK: func.func private @_QMmod1Pcallee(!fir.ref) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>}
+end module
\ No newline at end of file

>From 70f8d469346d22597c7b3ff38b2f4a84a82b6d85 Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 25 Apr 2025 14:39:04 -0700
Subject: [PATCH 04/12] debugging failure

---
 flang/include/flang/Lower/OpenACC.h | 7 ++
 flang/include/flang/Semantics/symbol.h | 21 +++--
 flang/lib/Lower/Bridge.cpp | 21 +++--
 flang/lib/Lower/CallInterface.cpp | 21 ++---
 flang/lib/Lower/OpenACC.cpp | 87 +++++++++++--------
 flang/lib/Semantics/mod-file.cpp | 11 ++-
 flang/lib/Semantics/resolve-directives.cpp | 16 ++--
 flang/lib/Semantics/symbol.cpp | 46 ++++++++++
 .../test/Lower/OpenACC/acc-routine-named.f90 | 10 ++-
 .../Lower/OpenACC/acc-routine-use-module.f90 | 6 +-
 flang/test/Lower/OpenACC/acc-routine.f90 | 63 ++++++++------
 11 files changed, 199 insertions(+), 110 deletions(-)

diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h
index 7832e8b69ea23..dc014a71526c3 100644
--- a/flang/include/flang/Lower/OpenACC.h
+++ b/flang/include/flang/Lower/OpenACC.h
@@ -37,11 +37,16 @@ class FirOpBuilder;
 }

 namespace Fortran {
+namespace evaluate {
+class ProcedureDesignator;
+} // namespace evaluate
+
 namespace parser {
 struct AccClauseList;
 struct OpenACCConstruct;
 struct OpenACCDeclarativeConstruct;
 struct OpenACCRoutineConstruct;
+struct ProcedureDesignator;
 } // namespace parser

 namespace semantics {
@@ -71,6 +76,8 @@ static constexpr llvm::StringRef declarePostDeallocSuffix =

 static constexpr llvm::StringRef privatizationRecipePrefix = "privatization";

+bool needsOpenACCRoutineConstruct(const Fortran::evaluate::ProcedureDesignator *);
+
 mlir::Value genOpenACCConstruct(AbstractConverter &,
                                 Fortran::semantics::SemanticsContext &,
                                 pft::Evaluation &,
diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h
index fe6c73997733a..8c60a196bdfc1 100644
--- a/flang/include/flang/Semantics/symbol.h
+++ b/flang/include/flang/Semantics/symbol.h
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 #include

 namespace llvm {
@@ -139,25 +140,26 @@ class OpenACCRoutineDeviceTypeInfo {
   void set_isGang(bool value = true) { isGang_ = value; }
   unsigned gangDim() const { return gangDim_; }
   void set_gangDim(unsigned value) { gangDim_ = value; }
-  const std::string *bindName() const {
-    return bindName_ ? &*bindName_ : nullptr;
+  const std::variant *bindName() const {
+    return bindName_.has_value() ? &*bindName_ : nullptr;
   }
-  bool bindNameIsInternal() const {return bindNameIsInternal_;}
-  void set_bindName(std::string &&name, bool isInternal=false) {
-    bindName_ = std::move(name);
-    bindNameIsInternal_ = isInternal;
+  const std::optional> &bindNameOpt() const {
+    return bindName_;
   }
+  void set_bindName(std::string &&name) { bindName_.emplace(std::move(name)); }
+  void set_bindName(SymbolRef symbol) { bindName_.emplace(symbol); }

   Fortran::common::OpenACCDeviceType dType() const { return deviceType_; }

+  friend llvm::raw_ostream &operator<<(
+      llvm::raw_ostream &, const OpenACCRoutineDeviceTypeInfo &);
 private:
   bool isSeq_{false};
   bool isVector_{false};
   bool isWorker_{false};
   bool isGang_{false};
   unsigned gangDim_{0};
-  std::optional bindName_;
-  bool bindNameIsInternal_{false};
+  std::optional> bindName_;
   Fortran::common::OpenACCDeviceType deviceType_{
       Fortran::common::OpenACCDeviceType::None};
 };
@@ -187,6 +189,9 @@ class OpenACCRoutineInfo : public OpenACCRoutineDeviceTypeInfo {
     return deviceTypeInfos_.back();
   }

+  friend llvm::raw_ostream &operator<<(
+      llvm::raw_ostream &, const OpenACCRoutineInfo &);
+
 private:
   std::list deviceTypeInfos_;
   bool isNohost_{false};
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 9285d587585f8..abe07bcfdfcda 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -438,14 +438,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
               [&](Fortran::lower::pft::ModuleLikeUnit &m) { lowerMod(m); },
               [&](Fortran::lower::pft::BlockDataUnit &b) {},
               [&](Fortran::lower::pft::CompilerDirectiveUnit &d) {},
-              [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {
-                builder = new fir::FirOpBuilder(
-                    bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable);
-                Fortran::lower::genOpenACCRoutineConstruct(
-                    *this, bridge.getSemanticsContext(), bridge.getModule(),
-                    d.routine);
-                builder = nullptr;
-              },
+              [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {},
           },
           u);
     }
@@ -472,6 +465,8 @@ class FirConverter : public Fortran::lower::AbstractConverter {
   /// Declare a function.
   void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) {
     setCurrentPosition(funit.getStartingSourceLoc());
+    builder = new fir::FirOpBuilder(
+        bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable);
     for (int entryIndex = 0, last = funit.entryPointList.size();
          entryIndex < last; ++entryIndex) {
       funit.setActiveEntry(entryIndex);
@@ -498,6 +493,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList)
       if (auto *f = std::get_if(&unit))
         declareFunction(*f);
+    builder = nullptr;
   }

   /// Get the scope that is defining or using \p sym. The returned scope is not
@@ -1035,7 +1031,10 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     return bridge.getSemanticsContext().FindScope(currentPosition);
   }

-  fir::FirOpBuilder &getFirOpBuilder() override final { return *builder; }
+  fir::FirOpBuilder &getFirOpBuilder() override final {
+    CHECK(builder && "builder is not set before calling getFirOpBuilder");
+    return *builder;
+  }

   mlir::ModuleOp getModuleOp() override final { return bridge.getModule(); }

@@ -5617,6 +5616,10 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]";
                if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym;
                llvm::dbgs() << "\n");
+    // I don't think setting the builder is necessary here, because callee
+    // always looks up the FuncOp from the module. If there was a function that
+    // was not declared yet. This call to callee will cause an assertion
+    //failure.
     Fortran::lower::CalleeInterface callee(funit, *this);
     mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments();
     builder =
diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp
index 867248f16237e..b938354e6bcb3 100644
--- a/flang/lib/Lower/CallInterface.cpp
+++ b/flang/lib/Lower/CallInterface.cpp
@@ -10,6 +10,7 @@
 #include "flang/Evaluate/fold.h"
 #include "flang/Lower/Bridge.h"
 #include "flang/Lower/Mangler.h"
+#include "flang/Lower/OpenACC.h"
 #include "flang/Lower/PFTBuilder.h"
 #include "flang/Lower/StatementContext.h"
 #include "flang/Lower/Support/Utils.h"
@@ -20,6 +21,7 @@
 #include "flang/Optimizer/Dialect/FIROpsSupport.h"
 #include "flang/Optimizer/Support/InternalNames.h"
 #include "flang/Optimizer/Support/Utils.h"
+#include "flang/Parser/parse-tree.h"
 #include "flang/Semantics/symbol.h"
 #include "flang/Semantics/tools.h"
 #include "flang/Support/Fortran.h"
@@ -715,6 +717,14 @@ void Fortran::lower::CallInterface::declare() {
             func.setArgAttrs(placeHolder.index(),
                              placeHolder.value().attributes);
         setCUDAAttributes(func, side().getProcedureSymbol(), characteristic);
+
+        if (const Fortran::semantics::Symbol *sym = side().getProcedureSymbol()) {
+          if (const auto &info{sym->GetUltimate().detailsIf()}) {
+            if (!info->openACCRoutineInfos().empty()) {
+              genOpenACCRoutineConstruct(converter, module, func, info->openACCRoutineInfos());
+            }
+          }
+        }
       }
     }
   }
@@ -1688,17 +1698,8 @@ class SignatureBuilder
       fir::emitFatalError(converter.getCurrentLocation(),
                           "SignatureBuilder should only be used once");
     declare();
+
     interfaceDetermined = true;
-    if (procDesignator && procDesignator->GetInterfaceSymbol() &&
-        procDesignator->GetInterfaceSymbol()
-            ->has()) {
-      auto info = procDesignator->GetInterfaceSymbol()
-                      ->get();
-      if (!info.openACCRoutineInfos().empty()) {
-        genOpenACCRoutineConstruct(converter, converter.getModuleOp(),
-                                   getFuncOp(), info.openACCRoutineInfos());
-      }
-    }
     return getFuncOp();
   }

diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index a3ebd9b931dc6..eefa8fbf12b1a 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -36,6 +36,7 @@
 #include "llvm/Frontend/OpenACC/ACC.h.inc"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h"
 #include "mlir/IR/MLIRContext.h"
 #include "mlir/Support/LLVM.h"
@@ -4140,6 +4141,14 @@ static void attachRoutineInfo(mlir::func::FuncOp func,
                mlir::acc::RoutineInfoAttr::get(func.getContext(), routines));
 }

+static mlir::ArrayAttr getArrayAttrOrNull(fir::FirOpBuilder &builder, llvm::SmallVector &attributes) {
+  if (attributes.empty()) {
+    return nullptr;
+  } else {
+    return builder.getArrayAttr(attributes);
+  }
+}
+
 void createOpenACCRoutineConstruct(
     Fortran::lower::AbstractConverter &converter, mlir::Location loc,
     mlir::ModuleOp mod, mlir::func::FuncOp funcOp, std::string funcName,
@@ -4173,31 +4182,29 @@ void createOpenACCRoutineConstruct(
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   modBuilder.create(
       loc, routineOpStr, funcName,
-      bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames),
-      bindNameDeviceTypes.empty() ? nullptr
-                                  : builder.getArrayAttr(bindNameDeviceTypes),
-      workerDeviceTypes.empty() ? nullptr
-                                : builder.getArrayAttr(workerDeviceTypes),
-      vectorDeviceTypes.empty() ? nullptr
-                                : builder.getArrayAttr(vectorDeviceTypes),
-      seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes),
+      getArrayAttrOrNull(builder, bindNames),
+      getArrayAttrOrNull(builder, bindNameDeviceTypes),
+      getArrayAttrOrNull(builder, workerDeviceTypes),
+      getArrayAttrOrNull(builder, vectorDeviceTypes),
+      getArrayAttrOrNull(builder, seqDeviceTypes),
       hasNohost, /*implicit=*/false,
-      gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes),
-      gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues),
-      gangDimDeviceTypes.empty() ? nullptr
-                                 : builder.getArrayAttr(gangDimDeviceTypes));
-
-  if (funcOp)
-    attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr));
-  else
+      getArrayAttrOrNull(builder, gangDeviceTypes),
+      getArrayAttrOrNull(builder, gangDimValues),
+      getArrayAttrOrNull(builder, gangDimDeviceTypes));
+
+  auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr);
+  if (funcOp) {
+
+    attachRoutineInfo(funcOp, symbolRefAttr);
+  } else {
     // FuncOp is not lowered yet. Keep the information so the routine info
     // can be attached later to the funcOp.
-    converter.getAccDelayedRoutines().push_back(
-        std::make_pair(funcName, builder.getSymbolRefAttr(routineOpStr)));
+    converter.getAccDelayedRoutines().push_back(std::make_pair(funcName, symbolRefAttr));
+  }
 }

 static void interpretRoutineDeviceInfo(
-    fir::FirOpBuilder &builder,
+    Fortran::lower::AbstractConverter &converter,
     const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo,
     llvm::SmallVector &seqDeviceTypes,
     llvm::SmallVector &vectorDeviceTypes,
@@ -4207,23 +4214,24 @@ static void interpretRoutineDeviceInfo(
     llvm::SmallVector &gangDeviceTypes,
     llvm::SmallVector &gangDimValues,
     llvm::SmallVector &gangDimDeviceTypes) {
-  mlir::MLIRContext *context{builder.getContext()};
+  fir::FirOpBuilder &builder = converter.getFirOpBuilder();
+  auto getDeviceTypeAttr = [&]() -> mlir::Attribute {
+    auto context = builder.getContext();
+    auto value = getDeviceType(dinfo.dType());
+    return mlir::acc::DeviceTypeAttr::get(context, value );
+  };
   if (dinfo.isSeq()) {
-    seqDeviceTypes.push_back(
-        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+    seqDeviceTypes.push_back(getDeviceTypeAttr());
   }
   if (dinfo.isVector()) {
-    vectorDeviceTypes.push_back(
-        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+    vectorDeviceTypes.push_back(getDeviceTypeAttr());
   }
   if (dinfo.isWorker()) {
-    workerDeviceTypes.push_back(
-        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+    workerDeviceTypes.push_back(getDeviceTypeAttr());
   }
   if (dinfo.isGang()) {
     unsigned gangDim = dinfo.gangDim();
-    auto deviceType =
-        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType()));
+    auto deviceType = getDeviceTypeAttr();
     if (!gangDim) {
       gangDeviceTypes.push_back(deviceType);
     } else {
@@ -4232,10 +4240,18 @@ static void interpretRoutineDeviceInfo(
       gangDimDeviceTypes.push_back(deviceType);
     }
   }
-  if (const std::string *bindName{dinfo.bindName()}) {
-    bindNames.push_back(builder.getStringAttr(*bindName));
-    bindNameDeviceTypes.push_back(
-        mlir::acc::DeviceTypeAttr::get(context, getDeviceType(dinfo.dType())));
+  if (dinfo.bindNameOpt().has_value()) {
+    const auto &bindName = dinfo.bindNameOpt().value();
+    mlir::Attribute bindNameAttr;
+    if (const auto &bindStr{std::get_if(&bindName)}) {
+      bindNameAttr = builder.getStringAttr(*bindStr);
+    } else if (const auto &bindSym{std::get_if(&bindName)}) {
+      bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym));
+    } else {
+      llvm_unreachable("Unsupported bind name type");
+    }
+    bindNames.push_back(bindNameAttr);
+    bindNameDeviceTypes.push_back(getDeviceTypeAttr());
   }
 }

@@ -4244,7 +4260,6 @@ void Fortran::lower::genOpenACCRoutineConstruct(
     mlir::func::FuncOp funcOp,
     const std::vector &routineInfos) {
   CHECK(funcOp && "Expected a valid function operation");
-  fir::FirOpBuilder &builder{converter.getFirOpBuilder()};
   mlir::Location loc{funcOp.getLoc()};
   std::string funcName{funcOp.getName()};

@@ -4262,7 +4277,7 @@ void Fortran::lower::genOpenACCRoutineConstruct(
     }
     // Note: Device Independent Attributes are set to the
     // none device type in `info`.
-    interpretRoutineDeviceInfo(builder, info, seqDeviceTypes, vectorDeviceTypes,
+    interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, vectorDeviceTypes,
                                workerDeviceTypes, bindNameDeviceTypes,
                                bindNames, gangDeviceTypes, gangDimValues,
                                gangDimDeviceTypes);
@@ -4271,7 +4286,7 @@ void Fortran::lower::genOpenACCRoutineConstruct(
     for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo :
          info.deviceTypeInfos()) {
       interpretRoutineDeviceInfo(
-          builder, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes,
+          converter, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes,
           bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues,
           gangDimDeviceTypes);
     }
@@ -4369,8 +4384,6 @@ void Fortran::lower::genOpenACCRoutineConstruct(
                    std::get_if(&clause.u)) {
       if (const auto *name =
               std::get_if(&bindClause->v.u)) {
-        // FIXME: This case mangles the name, the one below does not.
-        // which is correct?
         mlir::Attribute bindNameAttr =
             builder.getStringAttr(converter.mangleName(*name->symbol));
         for (auto crtDeviceTypeAttr : crtDeviceTypes) {
diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp
index befd204a671fc..76dc8db590f22 100644
--- a/flang/lib/Semantics/mod-file.cpp
+++ b/flang/lib/Semantics/mod-file.cpp
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 #include

 namespace Fortran::semantics {
@@ -638,8 +639,14 @@ static void PutOpenACCDeviceTypeRoutineInfo(
   if (info.isWorker()) {
     os << " worker";
   }
-  if (info.bindName()) {
-    os << " bind(" << *info.bindName() << ")";
+  if (const std::variant *bindName{info.bindName()}) {
+    os << " bind(";
+    if (std::holds_alternative(*bindName)) {
+      os << "\"" << std::get(*bindName) << "\"";
+    } else {
+      os << std::get(*bindName)->name();
+    }
+    os << ")";
   }
 }

diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp
index 8fb3559c34426..a8f00b546306e 100644
--- a/flang/lib/Semantics/resolve-directives.cpp
+++ b/flang/lib/Semantics/resolve-directives.cpp
@@ -1076,13 +1076,13 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
         }
       } else if (const auto *bindClause =
                      std::get_if(&clause.u)) {
-        std::string bindName = "";
-        bool isInternal = false;
         if (const auto *name =
                 std::get_if(&bindClause->v.u)) {
           if (Symbol *sym = ResolveFctName(*name)) {
-            bindName = sym->name().ToString();
-            isInternal = true;
+            Symbol &ultimate{sym->GetUltimate()};
+            for (auto &device : currentDevices) {
+              device->set_bindName(SymbolRef(ultimate));
+            }
           } else {
             context_.Say((*name).source,
                 "No function or subroutine declared for '%s'"_err_en_US,
@@ -1095,14 +1095,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol(
               Fortran::parser::Unwrap(
                   *charExpr);
           std::string str{std::get(charConst->t)};
-          std::stringstream bindNameStream;
-          bindNameStream << "\"" << str << "\"";
-          bindName = bindNameStream.str();
-        }
-        if (!bindName.empty()) {
-          // Fixme: do we need to ensure there there is only one device?
           for (auto &device : currentDevices) {
-            device->set_bindName(std::string(bindName), isInternal);
+            device->set_bindName(std::string(str));
           }
         }
       }
diff --git a/flang/lib/Semantics/symbol.cpp b/flang/lib/Semantics/symbol.cpp
index 32eb6c2c5a188..d44df4669fa36 100644
--- a/flang/lib/Semantics/symbol.cpp
+++ b/flang/lib/Semantics/symbol.cpp
@@ -144,6 +144,52 @@ llvm::raw_ostream &operator<<(
       os << ' ' << x;
     }
   }
+  if (!x.openACCRoutineInfos_.empty()) {
+    os << " openACCRoutineInfos:";
+    for (const auto x : x.openACCRoutineInfos_) {
+      os << x;
+    }
+  }
+  return os;
+}
+
+llvm::raw_ostream &operator<<(
+    llvm::raw_ostream &os, const OpenACCRoutineDeviceTypeInfo &x) {
+  if (x.dType() != common::OpenACCDeviceType::None) {
+    os << " deviceType(" << common::EnumToString(x.dType()) << ')';
+  }
+  if (x.isSeq()) {
+    os << " seq";
+  }
+  if (x.isVector()) {
+    os << " vector";
+  }
+  if (x.isWorker()) {
+    os << " worker";
+  }
+  if (x.isGang()) {
+    os << " gang(" << x.gangDim() << ')';
+  }
+  if (const auto *bindName{x.bindName()}) {
+    if (const auto &symbol{std::get_if(bindName)}) {
+      os << " bindName(\"" << *symbol << "\")";
+    } else {
+      const SymbolRef s{std::get(*bindName)};
+      os << " bindName(" << s->name() << ")";
+    }
+  }
+  return os;
+}
+
+llvm::raw_ostream &operator<<(
+    llvm::raw_ostream &os, const OpenACCRoutineInfo &x) {
+  if (x.isNohost()) {
+    os << " nohost";
+  }
+  os << static_cast(x);
+  for (const auto &d : x.deviceTypeInfos_) {
+    os << d;
+  }
   return os;
 }

diff --git a/flang/test/Lower/OpenACC/acc-routine-named.f90 b/flang/test/Lower/OpenACC/acc-routine-named.f90
index 2cf6bf8b2bc06..de9784a1146cc 100644
--- a/flang/test/Lower/OpenACC/acc-routine-named.f90
+++ b/flang/test/Lower/OpenACC/acc-routine-named.f90
@@ -4,8 +4,8 @@

 module acc_routines

-! CHECK: acc.routine @acc_routine_1 func(@_QMacc_routinesPacc2)
-! CHECK: acc.routine @acc_routine_0 func(@_QMacc_routinesPacc1) seq
+! CHECK: acc.routine @[[r0:.*]] func(@_QMacc_routinesPacc2)
+! CHECK: acc.routine @[[r1:.*]] func(@_QMacc_routinesPacc1) seq

 !$acc routine(acc1) seq

@@ -14,12 +14,14 @@ module acc_routines
   subroutine acc1()
   end subroutine

-! CHECK-LABEL: func.func @_QMacc_routinesPacc1() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>}
+! CHECK-LABEL: func.func @_QMacc_routinesPacc1()
+! CHECK-SAME:attributes {acc.routine_info = #acc.routine_info<[@[[r1]]]>}

   subroutine acc2()
     !$acc routine(acc2)
   end subroutine

-! CHECK-LABEL: func.func @_QMacc_routinesPacc2() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_1]>}
+! CHECK-LABEL: func.func @_QMacc_routinesPacc2()
+! CHECK-SAME:attributes {acc.routine_info = #acc.routine_info<[@[[r0]]]>}

 end module
diff --git a/flang/test/Lower/OpenACC/acc-routine-use-module.f90 b/flang/test/Lower/OpenACC/acc-routine-use-module.f90
index 7fc96b0ef5684..059324230a746 100644
--- a/flang/test/Lower/OpenACC/acc-routine-use-module.f90
+++ b/flang/test/Lower/OpenACC/acc-routine-use-module.f90
@@ -1,6 +1,6 @@
 ! RUN: rm -fr %t && mkdir -p %t && cd %t
-! RUN: bbc -emit-fir %S/acc-module-definition.f90
-! RUN: bbc -emit-fir %s -o - | FileCheck %s
+! RUN: bbc -fopenacc -emit-fir %S/acc-module-definition.f90
+! RUN: bbc -fopenacc -emit-fir %s -o - | FileCheck %s

 ! This test module is based off of flang/test/Lower/use_module.f90
 ! The first runs ensures the module file is generated.
@@ -8,6 +8,7 @@
 module use_mod1
   use mod1
   contains
+  !CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq
   !CHECK: func.func @_QMuse_mod1Pcaller
   !CHECK-SAME {
   subroutine caller(aa)
@@ -18,6 +19,5 @@ subroutine caller(aa)
     !$acc end serial
   end subroutine
   !CHECK: }
-  !CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq
   !CHECK: func.func private @_QMmod1Pcallee(!fir.ref) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>}
 end module
\ No newline at end of file
diff --git a/flang/test/Lower/OpenACC/acc-routine.f90 b/flang/test/Lower/OpenACC/acc-routine.f90
index 1170af18bc334..789f3a57e1f79 100644
--- a/flang/test/Lower/OpenACC/acc-routine.f90
+++ b/flang/test/Lower/OpenACC/acc-routine.f90
@@ -2,69 +2,77 @@

 ! RUN: bbc -fopenacc -emit-hlfir %s -o - | FileCheck %s

-! CHECK: acc.routine @acc_routine_17 func(@_QPacc_routine19) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type])
-! CHECK: acc.routine @acc_routine_16 func(@_QPacc_routine18) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type])
-! CHECK: acc.routine @acc_routine_15 func(@_QPacc_routine17) worker ([#acc.device_type]) vector ([#acc.device_type])
-! CHECK: acc.routine @acc_routine_14 func(@_QPacc_routine16) gang([#acc.device_type]) seq ([#acc.device_type])
-! CHECK: acc.routine @acc_routine_10 func(@_QPacc_routine11) seq
-! CHECK: acc.routine @acc_routine_9 func(@_QPacc_routine10) seq
-! CHECK: acc.routine @acc_routine_8 func(@_QPacc_routine9) bind("_QPacc_routine9a")
-!
CHECK: acc.routine @acc_routine_7 func(@_QPacc_routine8) bind("routine8_") -! CHECK: acc.routine @acc_routine_6 func(@_QPacc_routine7) gang(dim: 1 : i64) -! CHECK: acc.routine @acc_routine_5 func(@_QPacc_routine6) nohost -! CHECK: acc.routine @acc_routine_4 func(@_QPacc_routine5) worker -! CHECK: acc.routine @acc_routine_3 func(@_QPacc_routine4) vector -! CHECK: acc.routine @acc_routine_2 func(@_QPacc_routine3) gang -! CHECK: acc.routine @acc_routine_1 func(@_QPacc_routine2) seq -! CHECK: acc.routine @acc_routine_0 func(@_QPacc_routine1) +! CHECK: acc.routine @[[r14:.*]] func(@_QPacc_routine19) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type]) +! CHECK: acc.routine @[[r13:.*]] func(@_QPacc_routine18) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type]) +! CHECK: acc.routine @[[r12:.*]] func(@_QPacc_routine17) worker ([#acc.device_type]) vector ([#acc.device_type]) +! CHECK: acc.routine @[[r11:.*]] func(@_QPacc_routine16) gang([#acc.device_type]) seq ([#acc.device_type]) +! CHECK: acc.routine @[[r10:.*]] func(@_QPacc_routine11) seq +! CHECK: acc.routine @[[r09:.*]] func(@_QPacc_routine10) seq +! CHECK: acc.routine @[[r08:.*]] func(@_QPacc_routine9) bind("_QPacc_routine9a") +! CHECK: acc.routine @[[r07:.*]] func(@_QPacc_routine8) bind("routine8_") +! CHECK: acc.routine @[[r06:.*]] func(@_QPacc_routine7) gang(dim: 1 : i64) +! CHECK: acc.routine @[[r05:.*]] func(@_QPacc_routine6) nohost +! CHECK: acc.routine @[[r04:.*]] func(@_QPacc_routine5) worker +! CHECK: acc.routine @[[r03:.*]] func(@_QPacc_routine4) vector +! CHECK: acc.routine @[[r02:.*]] func(@_QPacc_routine3) gang +! CHECK: acc.routine @[[r01:.*]] func(@_QPacc_routine2) seq +! CHECK: acc.routine @[[r00:.*]] func(@_QPacc_routine1) subroutine acc_routine1() !$acc routine end subroutine -! CHECK-LABEL: func.func @_QPacc_routine1() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>} +! 
CHECK-LABEL: func.func @_QPacc_routine1() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r00]]]>} subroutine acc_routine2() !$acc routine seq end subroutine -! CHECK-LABEL: func.func @_QPacc_routine2() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_1]>} +! CHECK-LABEL: func.func @_QPacc_routine2() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r01]]]>} subroutine acc_routine3() !$acc routine gang end subroutine -! CHECK-LABEL: func.func @_QPacc_routine3() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_2]>} +! CHECK-LABEL: func.func @_QPacc_routine3() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r02]]]>} subroutine acc_routine4() !$acc routine vector end subroutine -! CHECK-LABEL: func.func @_QPacc_routine4() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_3]>} +! CHECK-LABEL: func.func @_QPacc_routine4() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r03]]]>} subroutine acc_routine5() !$acc routine worker end subroutine -! CHECK-LABEL: func.func @_QPacc_routine5() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_4]>} +! CHECK-LABEL: func.func @_QPacc_routine5() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r04]]]>} subroutine acc_routine6() !$acc routine nohost end subroutine -! CHECK-LABEL: func.func @_QPacc_routine6() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_5]>} +! CHECK-LABEL: func.func @_QPacc_routine6() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r05]]]>} subroutine acc_routine7() !$acc routine gang(dim:1) end subroutine -! CHECK-LABEL: func.func @_QPacc_routine7() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_6]>} +! CHECK-LABEL: func.func @_QPacc_routine7() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r06]]]>} subroutine acc_routine8() !$acc routine bind("routine8_") end subroutine -! 
CHECK-LABEL: func.func @_QPacc_routine8() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_7]>} +! CHECK-LABEL: func.func @_QPacc_routine8() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r07]]]>} subroutine acc_routine9a() end subroutine @@ -73,20 +81,23 @@ subroutine acc_routine9() !$acc routine bind(acc_routine9a) end subroutine -! CHECK-LABEL: func.func @_QPacc_routine9() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_8]>} +! CHECK-LABEL: func.func @_QPacc_routine9() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r08]]]>} function acc_routine10() !$acc routine(acc_routine10) seq end function -! CHECK-LABEL: func.func @_QPacc_routine10() -> f32 attributes {acc.routine_info = #acc.routine_info<[@acc_routine_9]>} +! CHECK-LABEL: func.func @_QPacc_routine10() -> f32 +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r09]]]>} subroutine acc_routine11(a) real :: a !$acc routine(acc_routine11) seq end subroutine -! CHECK-LABEL: func.func @_QPacc_routine11(%arg0: !fir.ref {fir.bindc_name = "a"}) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_10]>} +! CHECK-LABEL: func.func @_QPacc_routine11(%arg0: !fir.ref {fir.bindc_name = "a"}) +! 
CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r10]]]>} subroutine acc_routine12()
>From e2d1a05d2de2356644d385e9099a7e6879143cc7 Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 25 Apr 2025 14:40:14 -0700
Subject: [PATCH 05/12] clang-format
---
 flang/include/flang/Lower/OpenACC.h | 3 +-
 flang/include/flang/Semantics/symbol.h | 4 +-
 flang/lib/Lower/Bridge.cpp | 12 ++---
 flang/lib/Lower/CallInterface.cpp | 9 ++--
 flang/lib/Lower/OpenACC.cpp | 56 +++++++++++-----------
 flang/lib/Semantics/resolve-directives.cpp | 3 +-
 6 files changed, 48 insertions(+), 39 deletions(-)
diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index dc014a71526c3..35a33e751b52b 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -76,7 +76,8 @@ static constexpr llvm::StringRef declarePostDeallocSuffix = static constexpr llvm::StringRef privatizationRecipePrefix = "privatization"; -bool needsOpenACCRoutineConstruct(const Fortran::evaluate::ProcedureDesignator *); +bool needsOpenACCRoutineConstruct( + const Fortran::evaluate::ProcedureDesignator *); mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 8c60a196bdfc1..eb34aac9c390d 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -143,7 +143,8 @@ class OpenACCRoutineDeviceTypeInfo { const std::variant *bindName() const { return bindName_.has_value() ? 
&*bindName_ : nullptr; } - const std::optional> &bindNameOpt() const { + const std::optional> & + bindNameOpt() const { return bindName_; } void set_bindName(std::string &&name) { bindName_.emplace(std::move(name)); } @@ -153,6 +154,7 @@ class OpenACCRoutineDeviceTypeInfo { friend llvm::raw_ostream &operator<<( llvm::raw_ostream &, const OpenACCRoutineDeviceTypeInfo &); + private: bool isSeq_{false}; bool isVector_{false}; diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index abe07bcfdfcda..5e7b783323bfd 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -465,8 +465,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { setCurrentPosition(funit.getStartingSourceLoc()); - builder = new fir::FirOpBuilder( - bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); + builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), + &mlirSymbolTable); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { funit.setActiveEntry(entryIndex); @@ -1031,9 +1031,9 @@ class FirConverter : public Fortran::lower::AbstractConverter { return bridge.getSemanticsContext().FindScope(currentPosition); } - fir::FirOpBuilder &getFirOpBuilder() override final { + fir::FirOpBuilder &getFirOpBuilder() override final { CHECK(builder && "builder is not set before calling getFirOpBuilder"); - return *builder; + return *builder; } mlir::ModuleOp getModuleOp() override final { return bridge.getModule(); } @@ -5616,10 +5616,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]"; if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym; llvm::dbgs() << "\n"); - // I don't think setting the builder is necessary here, because callee + // I don't think setting the builder is necessary here, because callee // always looks 
up the FuncOp from the module. If there was a function that // was not declared yet. This call to callee will cause an assertion - //failure. + // failure. Fortran::lower::CalleeInterface callee(funit, *this); mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments(); builder = diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index b938354e6bcb3..611eacfe178e5 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -717,11 +717,14 @@ void Fortran::lower::CallInterface::declare() { func.setArgAttrs(placeHolder.index(), placeHolder.value().attributes); setCUDAAttributes(func, side().getProcedureSymbol(), characteristic); - + if (const Fortran::semantics::Symbol *sym = side().getProcedureSymbol()) { - if (const auto &info{sym->GetUltimate().detailsIf()}) { + if (const auto &info{ + sym->GetUltimate() + .detailsIf()}) { if (!info->openACCRoutineInfos().empty()) { - genOpenACCRoutineConstruct(converter, module, func, info->openACCRoutineInfos()); + genOpenACCRoutineConstruct(converter, module, func, + info->openACCRoutineInfos()); } } } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index eefa8fbf12b1a..891dc998bc596 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -32,14 +32,14 @@ #include "flang/Semantics/expression.h" #include "flang/Semantics/scope.h" #include "flang/Semantics/tools.h" +#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" +#include "mlir/IR/MLIRContext.h" +#include "mlir/Support/LLVM.h" #include "llvm/ADT/STLExtras.h" #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" #include "llvm/Support/ErrorHandling.h" -#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" -#include "mlir/IR/MLIRContext.h" -#include "mlir/Support/LLVM.h" #define DEBUG_TYPE "flang-lower-openacc" @@ -4141,7 +4141,9 @@ static void attachRoutineInfo(mlir::func::FuncOp func, 
mlir::acc::RoutineInfoAttr::get(func.getContext(), routines)); } -static mlir::ArrayAttr getArrayAttrOrNull(fir::FirOpBuilder &builder, llvm::SmallVector &attributes) { +static mlir::ArrayAttr +getArrayAttrOrNull(fir::FirOpBuilder &builder, + llvm::SmallVector &attributes) { if (attributes.empty()) { return nullptr; } else { @@ -4181,25 +4183,24 @@ void createOpenACCRoutineConstruct( mlir::OpBuilder modBuilder(mod.getBodyRegion()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); modBuilder.create( - loc, routineOpStr, funcName, - getArrayAttrOrNull(builder, bindNames), + loc, routineOpStr, funcName, getArrayAttrOrNull(builder, bindNames), getArrayAttrOrNull(builder, bindNameDeviceTypes), getArrayAttrOrNull(builder, workerDeviceTypes), getArrayAttrOrNull(builder, vectorDeviceTypes), - getArrayAttrOrNull(builder, seqDeviceTypes), - hasNohost, /*implicit=*/false, - getArrayAttrOrNull(builder, gangDeviceTypes), + getArrayAttrOrNull(builder, seqDeviceTypes), hasNohost, + /*implicit=*/false, getArrayAttrOrNull(builder, gangDeviceTypes), getArrayAttrOrNull(builder, gangDimValues), getArrayAttrOrNull(builder, gangDimDeviceTypes)); auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr); if (funcOp) { - + attachRoutineInfo(funcOp, symbolRefAttr); } else { // FuncOp is not lowered yet. Keep the information so the routine info // can be attached later to the funcOp. 
- converter.getAccDelayedRoutines().push_back(std::make_pair(funcName, symbolRefAttr)); + converter.getAccDelayedRoutines().push_back( + std::make_pair(funcName, symbolRefAttr)); } } @@ -4218,7 +4219,7 @@ static void interpretRoutineDeviceInfo( auto getDeviceTypeAttr = [&]() -> mlir::Attribute { auto context = builder.getContext(); auto value = getDeviceType(dinfo.dType()); - return mlir::acc::DeviceTypeAttr::get(context, value ); + return mlir::acc::DeviceTypeAttr::get(context, value); }; if (dinfo.isSeq()) { seqDeviceTypes.push_back(getDeviceTypeAttr()); @@ -4244,14 +4245,15 @@ static void interpretRoutineDeviceInfo( const auto &bindName = dinfo.bindNameOpt().value(); mlir::Attribute bindNameAttr; if (const auto &bindStr{std::get_if(&bindName)}) { - bindNameAttr = builder.getStringAttr(*bindStr); - } else if (const auto &bindSym{std::get_if(&bindName)}) { - bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym)); - } else { - llvm_unreachable("Unsupported bind name type"); - } - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(getDeviceTypeAttr()); + bindNameAttr = builder.getStringAttr(*bindStr); + } else if (const auto &bindSym{ + std::get_if(&bindName)}) { + bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym)); + } else { + llvm_unreachable("Unsupported bind name type"); + } + bindNames.push_back(bindNameAttr); + bindNameDeviceTypes.push_back(getDeviceTypeAttr()); } } @@ -4277,18 +4279,18 @@ void Fortran::lower::genOpenACCRoutineConstruct( } // Note: Device Independent Attributes are set to the // none device type in `info`. 
- interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, vectorDeviceTypes, - workerDeviceTypes, bindNameDeviceTypes, - bindNames, gangDeviceTypes, gangDimValues, - gangDimDeviceTypes); + interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, + vectorDeviceTypes, workerDeviceTypes, + bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimValues, gangDimDeviceTypes); // Device Dependent Attributes for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo : info.deviceTypeInfos()) { interpretRoutineDeviceInfo( - converter, dinfo, seqDeviceTypes, vectorDeviceTypes, workerDeviceTypes, - bindNameDeviceTypes, bindNames, gangDeviceTypes, gangDimValues, - gangDimDeviceTypes); + converter, dinfo, seqDeviceTypes, vectorDeviceTypes, + workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimValues, gangDimDeviceTypes); } } createOpenACCRoutineConstruct( diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index a8f00b546306e..c2df7cddc0025 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1103,7 +1103,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( } symbol.get().add_openACCRoutineInfo(info); } else { - llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() << "\n"; + llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() + << "\n"; } }
>From c26093683edb7c0270809d2afb717450f92df6ab Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 25 Apr 2025 14:47:39 -0700
Subject: [PATCH 06/12] tidy up
---
 flang/include/flang/Lower/OpenACC.h | 1 -
 flang/lib/Lower/CallInterface.cpp | 1 -
 2 files changed, 2 deletions(-)
diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 35a33e751b52b..9e71ad0a15c89 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -46,7 +46,6 @@ struct AccClauseList; struct 
OpenACCConstruct; struct OpenACCDeclarativeConstruct; struct OpenACCRoutineConstruct; -struct ProcedureDesignator; } // namespace parser namespace semantics { diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 611eacfe178e5..602b5c7bfa6c6 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -1701,7 +1701,6 @@ class SignatureBuilder fir::emitFatalError(converter.getCurrentLocation(), "SignatureBuilder should only be used once"); declare(); - interfaceDetermined = true; return getFuncOp(); }
>From 1b825b55c808ac92cd2866d855611cd585eb28db Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 25 Apr 2025 17:00:27 -0700
Subject: [PATCH 07/12] cleaning up unused code
---
 flang/include/flang/Lower/AbstractConverter.h | 3 -
 flang/include/flang/Lower/OpenACC.h | 18 +-
 flang/lib/Lower/Bridge.cpp | 43 +++--
 flang/lib/Lower/OpenACC.cpp | 166 +-----------------
 flang/lib/Semantics/resolve-directives.cpp | 12 +-
 5 files changed, 32 insertions(+), 210 deletions(-)
diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 59419e829718f..2fa0da94b0396 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -358,9 +358,6 @@ class AbstractConverter { /// functions in order to be in sync). virtual mlir::SymbolTable *getMLIRSymbolTable() = 0; - virtual Fortran::lower::AccRoutineInfoMappingList & - getAccDelayedRoutines() = 0; - private: /// Options controlling lowering behavior. 
const Fortran::lower::LoweringOptions &loweringOptions; diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 9e71ad0a15c89..d2cd7712fb2c7 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -63,9 +63,6 @@ namespace pft { struct Evaluation; } // namespace pft -using AccRoutineInfoMappingList = - llvm::SmallVector>; - static constexpr llvm::StringRef declarePostAllocSuffix = "_acc_declare_update_desc_post_alloc"; static constexpr llvm::StringRef declarePreDeallocSuffix = @@ -82,22 +79,13 @@ mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, pft::Evaluation &, const parser::OpenACCConstruct &); -void genOpenACCDeclarativeConstruct(AbstractConverter &, - Fortran::semantics::SemanticsContext &, - StatementContext &, - const parser::OpenACCDeclarativeConstruct &, - AccRoutineInfoMappingList &); -void genOpenACCRoutineConstruct(AbstractConverter &, - Fortran::semantics::SemanticsContext &, - mlir::ModuleOp, - const parser::OpenACCRoutineConstruct &); +void genOpenACCDeclarativeConstruct( + AbstractConverter &, Fortran::semantics::SemanticsContext &, + StatementContext &, const parser::OpenACCDeclarativeConstruct &); void genOpenACCRoutineConstruct( AbstractConverter &, mlir::ModuleOp, mlir::func::FuncOp, const std::vector &); -void finalizeOpenACCRoutineAttachment(mlir::ModuleOp, - AccRoutineInfoMappingList &); - /// Get a acc.private.recipe op for the given type or create it if it does not /// exist yet. 
mlir::acc::PrivateRecipeOp createOrGetPrivateRecipe(mlir::OpBuilder &, diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 5e7b783323bfd..1615493003898 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -458,15 +458,25 @@ class FirConverter : public Fortran::lower::AbstractConverter { Fortran::common::LanguageFeature::CUDA)); }); - finalizeOpenACCLowering(); finalizeOpenMPLowering(globalOmpRequiresSymbol); } /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { + // Since this is a recursive function, we only need to create a new builder + // for each top-level declaration. It would be simpler to have a single + // builder for the entire translation unit, but that requires a lot of + // changes to the code. + // FIXME: Once createGlobalOutsideOfFunctionLowering is fixed, we can + // remove this code and share the module builder. + bool newBuilder = false; + if (!builder) { + newBuilder = true; + builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), + &mlirSymbolTable); + } + CHECK(builder && "FirOpBuilder did not instantiate"); setCurrentPosition(funit.getStartingSourceLoc()); - builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), - &mlirSymbolTable); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { funit.setActiveEntry(entryIndex); @@ -493,7 +503,11 @@ class FirConverter : public Fortran::lower::AbstractConverter { for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList) if (auto *f = std::get_if(&unit)) declareFunction(*f); - builder = nullptr; + + if (newBuilder) { + delete builder; + builder = nullptr; + } } /// Get the scope that is defining or using \p sym. 
The returned scope is not @@ -3017,8 +3031,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { void genFIR(const Fortran::parser::OpenACCDeclarativeConstruct &accDecl) { genOpenACCDeclarativeConstruct(*this, bridge.getSemanticsContext(), - bridge.openAccCtx(), accDecl, - accRoutineInfos); + bridge.openAccCtx(), accDecl); for (Fortran::lower::pft::Evaluation &e : getEval().getNestedEvaluations()) genFIR(e); } @@ -4286,11 +4299,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { return Fortran::lower::createMutableBox(loc, *this, expr, localSymbols); } - Fortran::lower::AccRoutineInfoMappingList & - getAccDelayedRoutines() override final { - return accRoutineInfos; - } - // Create the [newRank] array with the lower bounds to be passed to the // runtime as a descriptor. mlir::Value createLboundArray(llvm::ArrayRef lbounds, @@ -5889,7 +5897,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Helper to generate GlobalOps when the builder is not positioned in any /// region block. This is required because the FirOpBuilder assumes it is /// always positioned inside a region block when creating globals, the easiest - /// way comply is to create a dummy function and to throw it afterwards. + /// way comply is to create a dummy function and to throw it away afterwards. 
void createGlobalOutsideOfFunctionLowering( const std::function &createGlobals) { // FIXME: get rid of the bogus function context and instantiate the @@ -5902,6 +5910,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { mlir::FunctionType::get(context, std::nullopt, std::nullopt), symbolTable); func.addEntryBlock(); + CHECK(!builder && "Expected builder to be uninitialized"); builder = new fir::FirOpBuilder(func, bridge.getKindMap(), symbolTable); assert(builder && "FirOpBuilder did not instantiate"); builder->setFastMathFlags(bridge.getLoweringOptions().getMathOptions()); @@ -6331,13 +6340,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { expr.u); } - /// Performing OpenACC lowering action that were deferred to the end of - /// lowering. - void finalizeOpenACCLowering() { - Fortran::lower::finalizeOpenACCRoutineAttachment(getModuleOp(), - accRoutineInfos); - } - /// Performing OpenMP lowering actions that were deferred to the end of /// lowering. void finalizeOpenMPLowering( @@ -6429,9 +6431,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// A counter for uniquing names in `literalNamesMap`. std::uint64_t uniqueLitId = 0; - /// Deferred OpenACC routine attachment. 
- Fortran::lower::AccRoutineInfoMappingList accRoutineInfos; - /// Whether an OpenMP target region or declare target function/subroutine /// intended for device offloading has been detected bool ompDeviceCodeFound = false; diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 891dc998bc596..1a031dce7a487 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -4163,9 +4163,6 @@ void createOpenACCRoutineConstruct( llvm::SmallVector &workerDeviceTypes, llvm::SmallVector &vectorDeviceTypes) { - std::stringstream routineOpName; - routineOpName << accRoutinePrefix.str() << routineCounter++; - for (auto routineOp : mod.getOps()) { if (routineOp.getFuncName().str().compare(funcName) == 0) { // If the routine is already specified with the same clauses, just skip @@ -4179,6 +4176,8 @@ void createOpenACCRoutineConstruct( mlir::emitError(loc, "Routine already specified with different clauses"); } } + std::stringstream routineOpName; + routineOpName << accRoutinePrefix.str() << routineCounter++; std::string routineOpStr = routineOpName.str(); mlir::OpBuilder modBuilder(mod.getBodyRegion()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); @@ -4192,16 +4191,7 @@ void createOpenACCRoutineConstruct( getArrayAttrOrNull(builder, gangDimValues), getArrayAttrOrNull(builder, gangDimDeviceTypes)); - auto symbolRefAttr = builder.getSymbolRefAttr(routineOpStr); - if (funcOp) { - - attachRoutineInfo(funcOp, symbolRefAttr); - } else { - // FuncOp is not lowered yet. Keep the information so the routine info - // can be attached later to the funcOp. 
- converter.getAccDelayedRoutines().push_back( - std::make_pair(funcName, symbolRefAttr)); - } + attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr)); } static void interpretRoutineDeviceInfo( @@ -4299,145 +4289,6 @@ void Fortran::lower::genOpenACCRoutineConstruct( seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); } -void Fortran::lower::genOpenACCRoutineConstruct( - Fortran::lower::AbstractConverter &converter, - Fortran::semantics::SemanticsContext &semanticsContext, mlir::ModuleOp mod, - const Fortran::parser::OpenACCRoutineConstruct &routineConstruct) { - fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - mlir::Location loc = converter.genLocation(routineConstruct.source); - std::optional name = - std::get>(routineConstruct.t); - const auto &clauses = - std::get(routineConstruct.t); - mlir::func::FuncOp funcOp; - std::string funcName; - if (name) { - funcName = converter.mangleName(*name->symbol); - funcOp = - builder.getNamedFunction(mod, builder.getMLIRSymbolTable(), funcName); - } else { - Fortran::semantics::Scope &scope = - semanticsContext.FindScope(routineConstruct.source); - const Fortran::semantics::Scope &progUnit{GetProgramUnitContaining(scope)}; - const auto *subpDetails{ - progUnit.symbol() - ? progUnit.symbol() - ->detailsIf() - : nullptr}; - if (subpDetails && subpDetails->isInterface()) { - funcName = converter.mangleName(*progUnit.symbol()); - funcOp = - builder.getNamedFunction(mod, builder.getMLIRSymbolTable(), funcName); - } else { - funcOp = builder.getFunction(); - funcName = funcOp.getName(); - } - } - // TODO: Refactor this to use the OpenACCRoutineInfo - bool hasNohost = false; - - llvm::SmallVector seqDeviceTypes, vectorDeviceTypes, - workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, - gangDimDeviceTypes, gangDimValues; - - // device_type attribute is set to `none` until a device_type clause is - // encountered. 
- llvm::SmallVector crtDeviceTypes; - crtDeviceTypes.push_back(mlir::acc::DeviceTypeAttr::get( - builder.getContext(), mlir::acc::DeviceType::None)); - - for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - seqDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (const auto *gangClause = - std::get_if(&clause.u)) { - if (gangClause->v) { - const Fortran::parser::AccGangArgList &x = *gangClause->v; - for (const Fortran::parser::AccGangArg &gangArg : x.v) { - if (const auto *dim = - std::get_if(&gangArg.u)) { - const std::optional dimValue = Fortran::evaluate::ToInt64( - *Fortran::semantics::GetExpr(dim->v)); - if (!dimValue) - mlir::emitError(loc, - "dim value must be a constant positive integer"); - mlir::Attribute gangDimAttr = - builder.getIntegerAttr(builder.getI64Type(), *dimValue); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - gangDimValues.push_back(gangDimAttr); - gangDimDeviceTypes.push_back(crtDeviceTypeAttr); - } - } - } - } else { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - gangDeviceTypes.push_back(crtDeviceTypeAttr); - } - } else if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - vectorDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - workerDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (std::get_if(&clause.u)) { - hasNohost = true; - } else if (const auto *bindClause = - std::get_if(&clause.u)) { - if (const auto *name = - std::get_if(&bindClause->v.u)) { - mlir::Attribute bindNameAttr = - builder.getStringAttr(converter.mangleName(*name->symbol)); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(crtDeviceTypeAttr); - } - } else if (const auto charExpr = - std::get_if( - &bindClause->v.u)) { - const std::optional name = - 
Fortran::semantics::GetConstExpr(semanticsContext, - *charExpr); - if (!name) - mlir::emitError(loc, "Could not retrieve the bind name"); - - mlir::Attribute bindNameAttr = builder.getStringAttr(*name); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(crtDeviceTypeAttr); - } - } - } else if (const auto *deviceTypeClause = - std::get_if( - &clause.u)) { - crtDeviceTypes.clear(); - gatherDeviceTypeAttrs(builder, deviceTypeClause, crtDeviceTypes); - } - } - - createOpenACCRoutineConstruct( - converter, loc, mod, funcOp, funcName, hasNohost, bindNames, - bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes, - seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); -} - -void Fortran::lower::finalizeOpenACCRoutineAttachment( - mlir::ModuleOp mod, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { - for (auto &mapping : accRoutineInfos) { - mlir::func::FuncOp funcOp = - mod.lookupSymbol(mapping.first); - if (!funcOp) - mlir::emitWarning(mod.getLoc(), - llvm::Twine("function '") + llvm::Twine(mapping.first) + - llvm::Twine("' in acc routine directive is not " - "found in this translation unit.")); - else - attachRoutineInfo(funcOp, mapping.second); - } - accRoutineInfos.clear(); -} - static void genACC(Fortran::lower::AbstractConverter &converter, Fortran::lower::pft::Evaluation &eval, @@ -4551,8 +4402,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct( Fortran::lower::AbstractConverter &converter, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &openAccCtx, - const Fortran::parser::OpenACCDeclarativeConstruct &accDeclConstruct, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { + const Fortran::parser::OpenACCDeclarativeConstruct &accDeclConstruct) { Fortran::common::visit( common::visitors{ @@ -4561,13 +4411,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct( genACC(converter, semanticsContext, openAccCtx, 
standaloneDeclarativeConstruct); }, - [&](const Fortran::parser::OpenACCRoutineConstruct - &routineConstruct) { - fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - mlir::ModuleOp mod = builder.getModule(); - Fortran::lower::genOpenACCRoutineConstruct( - converter, semanticsContext, mod, routineConstruct); - }, + [&](const Fortran::parser::OpenACCRoutineConstruct &x) {}, }, accDeclConstruct.u); } diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index c2df7cddc0025..d74953df1e630 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1041,9 +1041,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( if (const auto *dTypeClause = std::get_if(&clause.u)) { currentDevices.clear(); - for (const auto &deviceTypeExpr : dTypeClause->v.v) { + for (const auto &deviceTypeExpr : dTypeClause->v.v) currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v)); - } } else if (std::get_if(&clause.u)) { info.set_isNohost(); } else if (std::get_if(&clause.u)) { @@ -1080,9 +1079,8 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( std::get_if(&bindClause->v.u)) { if (Symbol *sym = ResolveFctName(*name)) { Symbol &ultimate{sym->GetUltimate()}; - for (auto &device : currentDevices) { + for (auto &device : currentDevices) device->set_bindName(SymbolRef(ultimate)); - } } else { context_.Say((*name).source, "No function or subroutine declared for '%s'"_err_en_US, @@ -1095,16 +1093,12 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Fortran::parser::Unwrap( *charExpr); std::string str{std::get(charConst->t)}; - for (auto &device : currentDevices) { + for (auto &device : currentDevices) device->set_bindName(std::string(str)); - } } } } symbol.get().add_openACCRoutineInfo(info); - } else { - llvm::errs() << "Couldnot add routine info to symbol: " << symbol.name() - << "\n"; } } >From 158481e5eaf63c1b2b4c172b9c143ca4c10722f5 Mon Sep 17 00:00:00 2001 From: Andre 
Kuhlenschmidt Date: Fri, 25 Apr 2025 17:15:26 -0700 Subject: [PATCH 08/12] a little more tidying up --- flang/include/flang/Lower/AbstractConverter.h | 1 - flang/include/flang/Lower/OpenACC.h | 3 --- flang/lib/Lower/Bridge.cpp | 3 ++- 3 files changed, 2 insertions(+), 5 deletions(-) diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 2fa0da94b0396..1d1323642bf9c 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -14,7 +14,6 @@ #define FORTRAN_LOWER_ABSTRACTCONVERTER_H #include "flang/Lower/LoweringOptions.h" -#include "flang/Lower/OpenACC.h" #include "flang/Lower/PFTDefs.h" #include "flang/Optimizer/Builder/BoxValue.h" #include "flang/Optimizer/Dialect/FIRAttr.h" diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index d2cd7712fb2c7..4034953976427 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -72,9 +72,6 @@ static constexpr llvm::StringRef declarePostDeallocSuffix = static constexpr llvm::StringRef privatizationRecipePrefix = "privatization"; -bool needsOpenACCRoutineConstruct( - const Fortran::evaluate::ProcedureDesignator *); - mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, pft::Evaluation &, diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 1615493003898..e50c91654f7bb 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -5897,7 +5897,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Helper to generate GlobalOps when the builder is not positioned in any /// region block. This is required because the FirOpBuilder assumes it is /// always positioned inside a region block when creating globals, the easiest - /// way comply is to create a dummy function and to throw it away afterwards. 
+ /// way to comply is to create a dummy function and to throw it away + /// afterwards. void createGlobalOutsideOfFunctionLowering( const std::function &createGlobals) { // FIXME: get rid of the bogus function context and instantiate the >From 8f6ae035147336c4ed04b5b25487f72ebc52c757 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 1 May 2025 08:12:35 -0700 Subject: [PATCH 09/12] more consistent use of builders --- flang/lib/Lower/Bridge.cpp | 40 ++++++++++--------------------- flang/lib/Lower/CallInterface.cpp | 1 - 2 files changed, 13 insertions(+), 28 deletions(-) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index e50c91654f7bb..fb20dfbaf477e 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -403,18 +403,21 @@ class FirConverter : public Fortran::lower::AbstractConverter { [&](Fortran::lower::pft::FunctionLikeUnit &f) { if (f.isMainProgram()) hasMainProgram = true; - declareFunction(f); + createGlobalOutsideOfFunctionLowering( + [&]() { declareFunction(f); }); if (!globalOmpRequiresSymbol) globalOmpRequiresSymbol = f.getScope().symbol(); }, [&](Fortran::lower::pft::ModuleLikeUnit &m) { lowerModuleDeclScope(m); - for (Fortran::lower::pft::ContainedUnit &unit : - m.containedUnitList) - if (auto *f = - std::get_if( - &unit)) - declareFunction(*f); + createGlobalOutsideOfFunctionLowering([&]() { + for (Fortran::lower::pft::ContainedUnit &unit : + m.containedUnitList) + if (auto *f = + std::get_if( + &unit)) + declareFunction(*f); + }); }, [&](Fortran::lower::pft::BlockDataUnit &b) { if (!globalOmpRequiresSymbol) @@ -463,19 +466,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Declare a function. void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { - // Since this is a recursive function, we only need to create a new builder - // for each top-level declaration. 
It would be simpler to have a single - // builder for the entire translation unit, but that requires a lot of - // changes to the code. - // FIXME: Once createGlobalOutsideOfFunctionLowering is fixed, we can - // remove this code and share the module builder. - bool newBuilder = false; - if (!builder) { - newBuilder = true; - builder = new fir::FirOpBuilder(bridge.getModule(), bridge.getKindMap(), - &mlirSymbolTable); - } - CHECK(builder && "FirOpBuilder did not instantiate"); + CHECK(builder && "declareFunction called with uninitialized builder"); setCurrentPosition(funit.getStartingSourceLoc()); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { @@ -503,11 +494,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { for (Fortran::lower::pft::ContainedUnit &unit : funit.containedUnitList) if (auto *f = std::get_if(&unit)) declareFunction(*f); - - if (newBuilder) { - delete builder; - builder = nullptr; - } } /// Get the scope that is defining or using \p sym. The returned scope is not @@ -5624,9 +5610,9 @@ class FirConverter : public Fortran::lower::AbstractConverter { LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]"; if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym; llvm::dbgs() << "\n"); - // I don't think setting the builder is necessary here, because callee + // Setting the builder is not necessary here, because callee // always looks up the FuncOp from the module. If there was a function that - // was not declared yet. This call to callee will cause an assertion + // was not declared yet, this call to callee will cause an assertion // failure. 
Fortran::lower::CalleeInterface callee(funit, *this); mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments(); diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 602b5c7bfa6c6..8affa1e1965e8 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -21,7 +21,6 @@ #include "flang/Optimizer/Dialect/FIROpsSupport.h" #include "flang/Optimizer/Support/InternalNames.h" #include "flang/Optimizer/Support/Utils.h" -#include "flang/Parser/parse-tree.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" #include "flang/Support/Fortran.h" >From a52655d0b90055ad6ff062fbf66be3172a95973b Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 1 May 2025 08:18:14 -0700 Subject: [PATCH 10/12] delete space --- flang/lib/Lower/Bridge.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index fb20dfbaf477e..a6ee24edd8381 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -5883,7 +5883,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Helper to generate GlobalOps when the builder is not positioned in any /// region block. This is required because the FirOpBuilder assumes it is /// always positioned inside a region block when creating globals, the easiest - /// way to comply is to create a dummy function and to throw it away + /// way to comply is to create a dummy function and to throw it away /// afterwards. 
void createGlobalOutsideOfFunctionLowering( const std::function &createGlobals) { >From b2a1da313284c9709eae6f757885df697843565c Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 2 May 2025 11:30:33 -0700 Subject: [PATCH 11/12] less builder creation --- flang/lib/Lower/Bridge.cpp | 109 ++++++++++++++++++------------------- 1 file changed, 53 insertions(+), 56 deletions(-) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index a6ee24edd8381..81127ab55a937 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -397,40 +397,39 @@ class FirConverter : public Fortran::lower::AbstractConverter { // they are available before lowering any function that may use them. bool hasMainProgram = false; const Fortran::semantics::Symbol *globalOmpRequiresSymbol = nullptr; - for (Fortran::lower::pft::Program::Units &u : pft.getUnits()) { - Fortran::common::visit( - Fortran::common::visitors{ - [&](Fortran::lower::pft::FunctionLikeUnit &f) { - if (f.isMainProgram()) - hasMainProgram = true; - createGlobalOutsideOfFunctionLowering( - [&]() { declareFunction(f); }); - if (!globalOmpRequiresSymbol) - globalOmpRequiresSymbol = f.getScope().symbol(); - }, - [&](Fortran::lower::pft::ModuleLikeUnit &m) { - lowerModuleDeclScope(m); - createGlobalOutsideOfFunctionLowering([&]() { + createBuilderOutsideOfFuncOpAndDo([&]() { + for (Fortran::lower::pft::Program::Units &u : pft.getUnits()) { + Fortran::common::visit( + Fortran::common::visitors{ + [&](Fortran::lower::pft::FunctionLikeUnit &f) { + if (f.isMainProgram()) + hasMainProgram = true; + declareFunction(f); + if (!globalOmpRequiresSymbol) + globalOmpRequiresSymbol = f.getScope().symbol(); + }, + [&](Fortran::lower::pft::ModuleLikeUnit &m) { + lowerModuleDeclScope(m); for (Fortran::lower::pft::ContainedUnit &unit : m.containedUnitList) if (auto *f = std::get_if( &unit)) declareFunction(*f); - }); - }, - [&](Fortran::lower::pft::BlockDataUnit &b) { - if (!globalOmpRequiresSymbol) - 
globalOmpRequiresSymbol = b.symTab.symbol(); - }, - [&](Fortran::lower::pft::CompilerDirectiveUnit &d) {}, - [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {}, - }, - u); - } + }, + [&](Fortran::lower::pft::BlockDataUnit &b) { + if (!globalOmpRequiresSymbol) + globalOmpRequiresSymbol = b.symTab.symbol(); + }, + [&](Fortran::lower::pft::CompilerDirectiveUnit &d) {}, + [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {}, + }, + u); + } + }); // Create definitions of intrinsic module constants. - createGlobalOutsideOfFunctionLowering( + createBuilderOutsideOfFuncOpAndDo( [&]() { createIntrinsicModuleDefinitions(pft); }); // Primary translation pass. @@ -449,12 +448,12 @@ class FirConverter : public Fortran::lower::AbstractConverter { // Once all the code has been translated, create global runtime type info // data structures for the derived types that have been processed, as well // as fir.type_info operations for the dispatch tables. - createGlobalOutsideOfFunctionLowering( + createBuilderOutsideOfFuncOpAndDo( [&]() { typeInfoConverter.createTypeInfo(*this); }); // Generate the `main` entry point if necessary if (hasMainProgram) - createGlobalOutsideOfFunctionLowering([&]() { + createBuilderOutsideOfFuncOpAndDo([&]() { fir::runtime::genMain(*builder, toLocation(), bridge.getEnvironmentDefaults(), getFoldingContext().languageFeatures().IsEnabled( @@ -5885,7 +5884,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// always positioned inside a region block when creating globals, the easiest /// way to comply is to create a dummy function and to throw it away /// afterwards. - void createGlobalOutsideOfFunctionLowering( + void createBuilderOutsideOfFuncOpAndDo( const std::function &createGlobals) { // FIXME: get rid of the bogus function context and instantiate the // globals directly into the module. @@ -5913,7 +5912,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Instantiate the data from a BLOCK DATA unit. 
void lowerBlockData(Fortran::lower::pft::BlockDataUnit &bdunit) { - createGlobalOutsideOfFunctionLowering([&]() { + createBuilderOutsideOfFuncOpAndDo([&]() { Fortran::lower::AggregateStoreMap fakeMap; for (const auto &[_, sym] : bdunit.symTab) { if (sym->has()) { @@ -5927,7 +5926,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Create fir::Global for all the common blocks that appear in the program. void lowerCommonBlocks(const Fortran::semantics::CommonBlockList &commonBlocks) { - createGlobalOutsideOfFunctionLowering( + createBuilderOutsideOfFuncOpAndDo( [&]() { Fortran::lower::defineCommonBlocks(*this, commonBlocks); }); } @@ -5997,36 +5996,34 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// declarative construct. void lowerModuleDeclScope(Fortran::lower::pft::ModuleLikeUnit &mod) { setCurrentPosition(mod.getStartingSourceLoc()); - createGlobalOutsideOfFunctionLowering([&]() { - auto &scopeVariableListMap = - Fortran::lower::pft::getScopeVariableListMap(mod); - for (const auto &var : Fortran::lower::pft::getScopeVariableList( - mod.getScope(), scopeVariableListMap)) { - - // Only define the variables owned by this module. - const Fortran::semantics::Scope *owningScope = var.getOwningScope(); - if (owningScope && mod.getScope() != *owningScope) - continue; + auto &scopeVariableListMap = + Fortran::lower::pft::getScopeVariableListMap(mod); + for (const auto &var : Fortran::lower::pft::getScopeVariableList( + mod.getScope(), scopeVariableListMap)) { - // Very special case: The value of numeric_storage_size depends on - // compilation options and therefore its value is not yet known when - // building the builtins runtime. Instead, the parameter is folding a - // __numeric_storage_size() expression which is loaded into the user - // program. For the iso_fortran_env object file, omit the symbol as it - // is never used. 
- if (var.hasSymbol()) { - const Fortran::semantics::Symbol &sym = var.getSymbol(); - const Fortran::semantics::Scope &owner = sym.owner(); - if (sym.name() == "numeric_storage_size" && owner.IsModule() && - DEREF(owner.symbol()).name() == "iso_fortran_env") - continue; - } + // Only define the variables owned by this module. + const Fortran::semantics::Scope *owningScope = var.getOwningScope(); + if (owningScope && mod.getScope() != *owningScope) + continue; - Fortran::lower::defineModuleVariable(*this, var); + // Very special case: The value of numeric_storage_size depends on + // compilation options and therefore its value is not yet known when + // building the builtins runtime. Instead, the parameter is folding a + // __numeric_storage_size() expression which is loaded into the user + // program. For the iso_fortran_env object file, omit the symbol as it + // is never used. + if (var.hasSymbol()) { + const Fortran::semantics::Symbol &sym = var.getSymbol(); + const Fortran::semantics::Scope &owner = sym.owner(); + if (sym.name() == "numeric_storage_size" && owner.IsModule() && + DEREF(owner.symbol()).name() == "iso_fortran_env") + continue; } + + Fortran::lower::defineModuleVariable(*this, var); + } for (auto &eval : mod.evaluationList) genFIR(eval); - }); } /// Lower functions contained in a module. 
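Editorial note on PATCH 11/12: the change above appears to create one builder around the whole declaration loop instead of one per top-level unit, with `declareFunction` only checking that a builder already exists. A minimal standalone C++ sketch of that shape follows. `ToyBuilder`, `ToyConverter`, `withTemporaryBuilder`, and `declareAll` are hypothetical names invented for this note and are not flang APIs; the dummy function that the real helper creates and throws away is left out.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for fir::FirOpBuilder: it only records names.
struct ToyBuilder {
  std::vector<std::string> declared;
};

// Hypothetical stand-in for the converter that owns the builder.
class ToyConverter {
public:
  // Create a builder, run the callback while it is live, then drop it.
  void withTemporaryBuilder(const std::function<void()> &action) {
    assert(!builder && "expected builder to be uninitialized");
    builder = std::make_unique<ToyBuilder>();
    ++buildersCreated;
    action();
    // Keep what was declared so callers can inspect it afterwards.
    declaredNames.insert(declaredNames.end(), builder->declared.begin(),
                         builder->declared.end());
    builder.reset();
  }

  // Mirrors the patched declareFunction: it no longer creates a builder,
  // it only requires that one is already in place.
  void declareFunction(const std::string &name) {
    assert(builder && "declareFunction called with uninitialized builder");
    builder->declared.push_back(name);
  }

  int buildersCreated = 0;
  std::vector<std::string> declaredNames;

private:
  std::unique_ptr<ToyBuilder> builder;
};

// Declare every unit under a single builder and report how many
// builders the converter has created so far.
int declareAll(ToyConverter &converter,
               const std::vector<std::string> &units) {
  converter.withTemporaryBuilder([&]() {
    for (const std::string &unit : units)
      converter.declareFunction(unit);
  });
  return converter.buildersCreated;
}
```

In this toy version, declaring three units through `declareAll` creates the builder once; wrapping each `declareFunction` call separately, as in PATCH 09/12, would create it three times.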
>From 524841cf4918347b30973e4eb0ee1fbd174cdff2 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Mon, 5 May 2025 08:06:26 -0700 Subject: [PATCH 12/12] guard OpenACC in module files --- flang/lib/Semantics/mod-file.cpp | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index 76dc8db590f22..60b97b401affb 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1394,7 +1394,9 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, parser::Options options; options.isModuleFile = true; options.features.Enable(common::LanguageFeature::BackslashEscapes); - options.features.Enable(common::LanguageFeature::OpenACC); + if (context_.languageFeatures().IsEnabled(common::LanguageFeature::OpenACC)) { + options.features.Enable(common::LanguageFeature::OpenACC); + } options.features.Enable(common::LanguageFeature::OpenMP); options.features.Enable(common::LanguageFeature::CUDA); if (!isIntrinsic.value_or(false) && !notAModule) { From flang-commits at lists.llvm.org Mon May 5 13:19:07 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Mon, 05 May 2025 13:19:07 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add lowering of volatile references (PR #132486) In-Reply-To: Message-ID: <68191d3b.630a0220.235bba.a24b@mx.google.com> DanielCChen wrote: @ashermancinelli The above code still issues the incorrect warning with this patch. https://github.com/llvm/llvm-project/pull/132486 From flang-commits at lists.llvm.org Mon May 5 13:26:45 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Mon, 05 May 2025 13:26:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add lowering of volatile references (PR #132486) In-Reply-To: Message-ID: <68191f05.170a0220.3fdbd.c8b2@mx.google.com> ashermancinelli wrote: Ah, you were asking about the warning. 
This series of patches only deals with the lowering of volatile entities, not anything in the frontend. @akuhlens added a warning last week I think, I'll ask him about this. https://github.com/llvm/llvm-project/pull/132486 From flang-commits at lists.llvm.org Mon May 5 13:42:55 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Mon, 05 May 2025 13:42:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add lowering of volatile references (PR #132486) In-Reply-To: Message-ID: <681922cf.170a0220.92612.ab35@mx.google.com> DanielCChen wrote:

> Ah, you were asking about the warning. This series of patches only deals with the lowering of volatile entities, not anything in the frontend. @akuhlens added a warning last week I think, I'll ask him about this.

Oh I see. Yeah, the warning should be removed. Thanks! I got the assert in lowering for another case.

```
MODULE M
  TYPE :: DT
    CHARACTER :: C0="!"
    INTEGER :: I=0
    CHARACTER :: C1="!"
  END TYPE
END MODULE

PROGRAM dataPtrVolatile
  USE M
  IMPLICIT NONE
  TYPE(DT), VOLATILE , TARGET :: Arr(100, 100), Arr1(10000), T(100,100)
  CLASS(DT), VOLATILE , POINTER :: Ptr(:, :)
  INTEGER :: I, J
  DO I =1, 100
    DO J =I, 100
      Arr(I:, J:) = DT(I=-I)
      Ptr(I:, J:) => Arr(I:, J:)
      T(I:, J:) = Ptr(I:, J:)
    END DO
  END DO
END

error: loc("/home/cdchen/temp/t5.f":26:5): failed to legalize unresolved materialization from ('!fir.class,i:i32,c1:!fir.char<1>}>>, volatile>') to ('!fir.class,i:i32,c1:!fir.char<1>}>>>') that remained live after conversion
error: failure in HLFIR to FIR conversion pass
error: Lowering to LLVM IR failed
error: loc("/home/cdchen/temp/t5.f":1:3): LLVM Translation failed for operation: fir.global
error: failed to create the LLVM module
```

https://github.com/llvm/llvm-project/pull/132486 From flang-commits at lists.llvm.org Mon May 5 14:01:21 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Mon, 05 May 2025 14:01:21 -0700 (PDT) Subject: [flang-commits] [flang]
[flang][nfc] Fix test unneccesarily checking type layout (PR #138585) Message-ID: https://github.com/ashermancinelli created https://github.com/llvm/llvm-project/pull/138585 Test added in #138339 unneccesarily had CHECK lines with the type layout, which fails on aix. >From 72e3fa9b5ad02233b0b511af47ccd1c96b9bc152 Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Mon, 5 May 2025 13:56:08 -0700 Subject: [PATCH] [flang][nfc] Fix test unneccesarily checking type layout Test added in #138339 unneccesarily had CHECK lines with the type layout, which fails on aix. --- flang/test/Lower/volatile-derived-type.f90 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/test/Lower/volatile-derived-type.f90 b/flang/test/Lower/volatile-derived-type.f90 index edd77a9265530..963e4cf45a761 100644 --- a/flang/test/Lower/volatile-derived-type.f90 +++ b/flang/test/Lower/volatile-derived-type.f90 @@ -16,7 +16,7 @@ subroutine test(v) ! CHECK: %[[VAL_2:.*]] = arith.constant 2 : index ! CHECK: %[[VAL_3:.*]] = arith.constant 0 : index ! CHECK: %[[VAL_4:.*]] = fir.dummy_scope : !fir.dscope -! CHECK: %[[VAL_5:.*]] = fir.address_of(@_QFE.b.t.e) : !fir.ref,value:i64}>>> +! CHECK: %[[VAL_5:.*]] = fir.address_of(@_QFE.b.t.e) : !fir.ref>> ! CHECK: %[[VAL_6:.*]] = fir.shape_shift %[[VAL_3]], %[[VAL_2]], %[[VAL_3]], %[[VAL_1]] : (index, index, index, index) -> !fir.shapeshift<2> ! CHECK: %[[VAL_7:.*]]:2 = hlfir.declare %[[VAL_5]](%[[VAL_6]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.b.t.e"} : ! 
CHECK: %[[VAL_8:.*]] = fir.address_of(@_QFE.n.e) : !fir.ref> From flang-commits at lists.llvm.org Mon May 5 14:01:56 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 05 May 2025 14:01:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang][nfc] Fix test unneccesarily checking type layout (PR #138585) In-Reply-To: Message-ID: <68192744.050a0220.1b0382.ff50@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Asher Mancinelli (ashermancinelli)
Changes Test added in #138339 unneccesarily had CHECK lines with the type layout, which fails on aix. --- Full diff: https://github.com/llvm/llvm-project/pull/138585.diff 1 Files Affected: - (modified) flang/test/Lower/volatile-derived-type.f90 (+1-1) ``````````diff diff --git a/flang/test/Lower/volatile-derived-type.f90 b/flang/test/Lower/volatile-derived-type.f90 index edd77a9265530..963e4cf45a761 100644 --- a/flang/test/Lower/volatile-derived-type.f90 +++ b/flang/test/Lower/volatile-derived-type.f90 @@ -16,7 +16,7 @@ subroutine test(v) ! CHECK: %[[VAL_2:.*]] = arith.constant 2 : index ! CHECK: %[[VAL_3:.*]] = arith.constant 0 : index ! CHECK: %[[VAL_4:.*]] = fir.dummy_scope : !fir.dscope -! CHECK: %[[VAL_5:.*]] = fir.address_of(@_QFE.b.t.e) : !fir.ref,value:i64}>>> +! CHECK: %[[VAL_5:.*]] = fir.address_of(@_QFE.b.t.e) : !fir.ref>> ! CHECK: %[[VAL_6:.*]] = fir.shape_shift %[[VAL_3]], %[[VAL_2]], %[[VAL_3]], %[[VAL_1]] : (index, index, index, index) -> !fir.shapeshift<2> ! CHECK: %[[VAL_7:.*]]:2 = hlfir.declare %[[VAL_5]](%[[VAL_6]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.b.t.e"} : ! CHECK: %[[VAL_8:.*]] = fir.address_of(@_QFE.n.e) : !fir.ref> ``````````
https://github.com/llvm/llvm-project/pull/138585 From flang-commits at lists.llvm.org Mon May 5 14:04:43 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 05 May 2025 14:04:43 -0700 (PDT) Subject: [flang-commits] [flang] [flang][nfc] Fix test unneccesarily checking type layout (PR #138585) In-Reply-To: Message-ID: <681927eb.170a0220.738c0.b127@mx.google.com> https://github.com/vzakhari approved this pull request. https://github.com/llvm/llvm-project/pull/138585 From flang-commits at lists.llvm.org Mon May 5 14:10:56 2025 From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits) Date: Mon, 05 May 2025 14:10:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang][nfc] Fix test unneccesarily checking type layout (PR #138585) In-Reply-To: Message-ID: <68192960.050a0220.17264b.1b11@mx.google.com> https://github.com/kkwli approved this pull request. Thanks. https://github.com/llvm/llvm-project/pull/138585 From flang-commits at lists.llvm.org Mon May 5 14:15:19 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Mon, 05 May 2025 14:15:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang][nfc] Fix test unneccesarily checking type layout (PR #138585) In-Reply-To: Message-ID: <68192a67.050a0220.3a40a7.0439@mx.google.com> https://github.com/DanielCChen approved this pull request. LGTM. 
https://github.com/llvm/llvm-project/pull/138585 From flang-commits at lists.llvm.org Mon May 5 14:35:53 2025 From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits) Date: Mon, 05 May 2025 14:35:53 -0700 (PDT) Subject: [flang-commits] [flang] [flang][AIX] Predefine __64BIT__ and _AIX macros (PR #138591) Message-ID: https://github.com/kkwli created https://github.com/llvm/llvm-project/pull/138591 None >From c6a69d92f960923b3662ed0b080314f2e62c6458 Mon Sep 17 00:00:00 2001 From: Kelvin Li Date: Mon, 5 May 2025 17:33:31 -0400 Subject: [PATCH] [flang][AIX] Predefine __64BIT__ and _AIX macros --- flang/lib/Frontend/CompilerInvocation.cpp | 17 ++++++++++++----- .../test/Driver/predefined-macros-powerpc2.f90 | 18 +++++++++++++++--- 2 files changed, 27 insertions(+), 8 deletions(-) diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 6f87a18d69c3d..c3e6471fa9a0f 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -1612,13 +1612,10 @@ void CompilerInvocation::setDefaultPredefinitions() { } llvm::Triple targetTriple{llvm::Triple(this->targetOpts.triple)}; - if (targetTriple.isPPC()) { - // '__powerpc__' is a generic macro for any PowerPC cases. e.g. Max integer - // size. 
- fortranOptions.predefinitions.emplace_back("__powerpc__", "1"); - } if (targetTriple.isOSLinux()) { fortranOptions.predefinitions.emplace_back("__linux__", "1"); + } else if (targetTriple.isOSAIX()) { + fortranOptions.predefinitions.emplace_back("_AIX", "1"); } switch (targetTriple.getArch()) { @@ -1628,6 +1625,16 @@ void CompilerInvocation::setDefaultPredefinitions() { fortranOptions.predefinitions.emplace_back("__x86_64__", "1"); fortranOptions.predefinitions.emplace_back("__x86_64", "1"); break; + case llvm::Triple::ArchType::ppc: + case llvm::Triple::ArchType::ppc64: + case llvm::Triple::ArchType::ppcle: + case llvm::Triple::ArchType::ppc64le: + // '__powerpc__' is a generic macro for any PowerPC. + fortranOptions.predefinitions.emplace_back("__powerpc__", "1"); + if (targetTriple.isOSAIX() && targetTriple.isArch64Bit()) { + fortranOptions.predefinitions.emplace_back("__64BIT__", "1"); + } + break; } } diff --git a/flang/test/Driver/predefined-macros-powerpc2.f90 b/flang/test/Driver/predefined-macros-powerpc2.f90 index 6e10235e21f86..6d235afcf8c3b 100644 --- a/flang/test/Driver/predefined-macros-powerpc2.f90 +++ b/flang/test/Driver/predefined-macros-powerpc2.f90 @@ -1,13 +1,25 @@ ! Test predefined macro for PowerPC architecture -! RUN: %flang_fc1 -triple ppc64le-unknown-linux -cpp -E %s | FileCheck %s +! RUN: %flang_fc1 -triple ppc64le-unknown-linux -cpp -E %s | FileCheck %s -check-prefix=CHECK-LINUX +! RUN: %flang_fc1 -triple powerpc-unknown-aix -cpp -E %s | FileCheck %s -check-prefix=CHECK-AIX32 +! RUN: %flang_fc1 -triple powerpc64-unknown-aix -cpp -E %s | FileCheck %s -check-prefix=CHECK-AIX64 ! REQUIRES: target=powerpc{{.*}} -! CHECK: integer :: var1 = 1 -! CHECK: integer :: var2 = 1 +! CHECK-LINUX: integer :: var1 = 1 +! CHECK-LINUX: integer :: var2 = 1 +! CHECK-AIX32: integer :: var1 = 1 +! CHECK-AIX32: integer :: var2 = 1 +! CHECK-AIX32: integer :: var3 = __64BIT__ +! CHECK-AIX64: integer :: var1 = 1 +! CHECK-AIX64: integer :: var2 = 1 +! 
CHECK-AIX64: integer :: var3 = 1 #if defined(__linux__) && defined(__powerpc__) integer :: var1 = __powerpc__ integer :: var2 = __linux__ +#elif defined(_AIX) && defined(__powerpc__) + integer :: var1 = __powerpc__ + integer :: var2 = _AIX + integer :: var3 = __64BIT__ #endif end program From flang-commits at lists.llvm.org Mon May 5 14:36:27 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 05 May 2025 14:36:27 -0700 (PDT) Subject: [flang-commits] [flang] [flang][AIX] Predefine __64BIT__ and _AIX macros (PR #138591) In-Reply-To: Message-ID: <68192f5b.170a0220.38dc02.a18b@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-driver Author: Kelvin Li (kkwli)
Changes --- Full diff: https://github.com/llvm/llvm-project/pull/138591.diff 2 Files Affected: - (modified) flang/lib/Frontend/CompilerInvocation.cpp (+12-5) - (modified) flang/test/Driver/predefined-macros-powerpc2.f90 (+15-3) ``````````diff diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 6f87a18d69c3d..c3e6471fa9a0f 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -1612,13 +1612,10 @@ void CompilerInvocation::setDefaultPredefinitions() { } llvm::Triple targetTriple{llvm::Triple(this->targetOpts.triple)}; - if (targetTriple.isPPC()) { - // '__powerpc__' is a generic macro for any PowerPC cases. e.g. Max integer - // size. - fortranOptions.predefinitions.emplace_back("__powerpc__", "1"); - } if (targetTriple.isOSLinux()) { fortranOptions.predefinitions.emplace_back("__linux__", "1"); + } else if (targetTriple.isOSAIX()) { + fortranOptions.predefinitions.emplace_back("_AIX", "1"); } switch (targetTriple.getArch()) { @@ -1628,6 +1625,16 @@ void CompilerInvocation::setDefaultPredefinitions() { fortranOptions.predefinitions.emplace_back("__x86_64__", "1"); fortranOptions.predefinitions.emplace_back("__x86_64", "1"); break; + case llvm::Triple::ArchType::ppc: + case llvm::Triple::ArchType::ppc64: + case llvm::Triple::ArchType::ppcle: + case llvm::Triple::ArchType::ppc64le: + // '__powerpc__' is a generic macro for any PowerPC. + fortranOptions.predefinitions.emplace_back("__powerpc__", "1"); + if (targetTriple.isOSAIX() && targetTriple.isArch64Bit()) { + fortranOptions.predefinitions.emplace_back("__64BIT__", "1"); + } + break; } } diff --git a/flang/test/Driver/predefined-macros-powerpc2.f90 b/flang/test/Driver/predefined-macros-powerpc2.f90 index 6e10235e21f86..6d235afcf8c3b 100644 --- a/flang/test/Driver/predefined-macros-powerpc2.f90 +++ b/flang/test/Driver/predefined-macros-powerpc2.f90 @@ -1,13 +1,25 @@ ! 
Test predefined macro for PowerPC architecture -! RUN: %flang_fc1 -triple ppc64le-unknown-linux -cpp -E %s | FileCheck %s +! RUN: %flang_fc1 -triple ppc64le-unknown-linux -cpp -E %s | FileCheck %s -check-prefix=CHECK-LINUX +! RUN: %flang_fc1 -triple powerpc-unknown-aix -cpp -E %s | FileCheck %s -check-prefix=CHECK-AIX32 +! RUN: %flang_fc1 -triple powerpc64-unknown-aix -cpp -E %s | FileCheck %s -check-prefix=CHECK-AIX64 ! REQUIRES: target=powerpc{{.*}} -! CHECK: integer :: var1 = 1 -! CHECK: integer :: var2 = 1 +! CHECK-LINUX: integer :: var1 = 1 +! CHECK-LINUX: integer :: var2 = 1 +! CHECK-AIX32: integer :: var1 = 1 +! CHECK-AIX32: integer :: var2 = 1 +! CHECK-AIX32: integer :: var3 = __64BIT__ +! CHECK-AIX64: integer :: var1 = 1 +! CHECK-AIX64: integer :: var2 = 1 +! CHECK-AIX64: integer :: var3 = 1 #if defined(__linux__) && defined(__powerpc__) integer :: var1 = __powerpc__ integer :: var2 = __linux__ +#elif defined(_AIX) && defined(__powerpc__) + integer :: var1 = __powerpc__ + integer :: var2 = _AIX + integer :: var3 = __64BIT__ #endif end program ``````````
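For readers skimming the diff, the selection logic it introduces can be sketched outside of LLVM. This is only a simplified model: `Os`, `Arch`, `is64Bit`, `predefinedMacros` and `hasMacro` are hypothetical stand-ins for the `llvm::Triple` queries and the `fortranOptions.predefinitions` list used in the patch, and only the macros visible in the diff are modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the llvm::Triple queries used in the patch.
enum class Os { Linux, AIX, Other };
enum class Arch { X86_64, PPC, PPC64, PPCLE, PPC64LE, Other };

// Assumption: mirrors what isArch64Bit() would report for these toy values.
bool is64Bit(Arch arch) {
  return arch == Arch::X86_64 || arch == Arch::PPC64 || arch == Arch::PPC64LE;
}

// Returns the names of the macros predefined for (os, arch).
// In the patch each of them is defined with the value "1".
std::vector<std::string> predefinedMacros(Os os, Arch arch) {
  std::vector<std::string> macros;

  // OS macros: Linux and AIX are mutually exclusive.
  if (os == Os::Linux)
    macros.push_back("__linux__");
  else if (os == Os::AIX)
    macros.push_back("_AIX");

  // Architecture macros.
  switch (arch) {
  case Arch::X86_64:
    macros.push_back("__x86_64__");
    macros.push_back("__x86_64");
    break;
  case Arch::PPC:
  case Arch::PPC64:
  case Arch::PPCLE:
  case Arch::PPC64LE:
    // Generic macro for any PowerPC target.
    macros.push_back("__powerpc__");
    // __64BIT__ is only added for 64-bit AIX targets.
    if (os == Os::AIX && is64Bit(arch))
      macros.push_back("__64BIT__");
    break;
  case Arch::Other:
    break;
  }
  return macros;
}

// Small helper for querying the result.
bool hasMacro(const std::vector<std::string> &macros, const std::string &name) {
  return std::find(macros.begin(), macros.end(), name) != macros.end();
}
```

In this model a 32-bit AIX target gets `_AIX` and `__powerpc__` but not `__64BIT__`, which matches the `CHECK-AIX32` line in the test, where `__64BIT__` is left unexpanded.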
https://github.com/llvm/llvm-project/pull/138591 From flang-commits at lists.llvm.org Mon May 5 15:22:38 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 05 May 2025 15:22:38 -0700 (PDT) Subject: [flang-commits] [flang] a04ab7b - [flang][nfc] Fix test unneccesarily checking type layout (#138585) Message-ID: <68193a2e.170a0220.d9624.a468@mx.google.com> Author: Asher Mancinelli Date: 2025-05-05T15:22:34-07:00 New Revision: a04ab7b81f5f6cc2e00b30007d267b19b0095157 URL: https://github.com/llvm/llvm-project/commit/a04ab7b81f5f6cc2e00b30007d267b19b0095157 DIFF: https://github.com/llvm/llvm-project/commit/a04ab7b81f5f6cc2e00b30007d267b19b0095157.diff LOG: [flang][nfc] Fix test unneccesarily checking type layout (#138585) Test added in #138339 unneccesarily had CHECK lines with the type layout, which fails on aix. Added: Modified: flang/test/Lower/volatile-derived-type.f90 Removed: ################################################################################ diff --git a/flang/test/Lower/volatile-derived-type.f90 b/flang/test/Lower/volatile-derived-type.f90 index edd77a9265530..963e4cf45a761 100644 --- a/flang/test/Lower/volatile-derived-type.f90 +++ b/flang/test/Lower/volatile-derived-type.f90 @@ -16,7 +16,7 @@ subroutine test(v) ! CHECK: %[[VAL_2:.*]] = arith.constant 2 : index ! CHECK: %[[VAL_3:.*]] = arith.constant 0 : index ! CHECK: %[[VAL_4:.*]] = fir.dummy_scope : !fir.dscope -! CHECK: %[[VAL_5:.*]] = fir.address_of(@_QFE.b.t.e) : !fir.ref,value:i64}>>> +! CHECK: %[[VAL_5:.*]] = fir.address_of(@_QFE.b.t.e) : !fir.ref>> ! CHECK: %[[VAL_6:.*]] = fir.shape_shift %[[VAL_3]], %[[VAL_2]], %[[VAL_3]], %[[VAL_1]] : (index, index, index, index) -> !fir.shapeshift<2> ! CHECK: %[[VAL_7:.*]]:2 = hlfir.declare %[[VAL_5]](%[[VAL_6]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.b.t.e"} : ! 
CHECK: %[[VAL_8:.*]] = fir.address_of(@_QFE.n.e) : !fir.ref> From flang-commits at lists.llvm.org Mon May 5 15:22:40 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Mon, 05 May 2025 15:22:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang][nfc] Fix test unneccesarily checking type layout (PR #138585) In-Reply-To: Message-ID: <68193a30.170a0220.2890a1.b12e@mx.google.com> https://github.com/ashermancinelli closed https://github.com/llvm/llvm-project/pull/138585 From flang-commits at lists.llvm.org Mon May 5 15:56:47 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Mon, 05 May 2025 15:56:47 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Lower volatile class types (PR #138607) Message-ID: https://github.com/ashermancinelli created https://github.com/llvm/llvm-project/pull/138607 So far, only boxes and references have had their volatile attribute set during lowering. This patch enables the volatility of classes to be properly represented in the ir, same as box and ref. For simple cases, not much needs to change in the codegen or conversion patterns because the prior work on volatile refs/boxes propagates volatility already. I am running further testing with the strict verification enabled to find remaining cases of incorrect/missing volatile propagation.

From flang-commits at lists.llvm.org Mon May 5 15:57:17 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 05 May 2025 15:57:17 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Lower volatile class types (PR #138607) In-Reply-To: Message-ID: <6819424d.170a0220.38bddb.72dc@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Asher Mancinelli (ashermancinelli)
Changes So far, only boxes and references have had their volatile attribute set during lowering. This patch enables the volatility of classes to be properly represented in the ir, same as box and ref. For simple cases, not much needs to change in the codegen or conversion patterns because the prior work on volatile refs/boxes propagates volatility already. I am running further testing with the strict verification enabled to find remaining cases of incorrect/missing volatile propagation. --- Full diff: https://github.com/llvm/llvm-project/pull/138607.diff 2 Files Affected: - (modified) flang/lib/Lower/ConvertExprToHLFIR.cpp (+13-11) - (added) flang/test/Lower/volatile-derived-type-pointer.f90 (+43) ``````````diff diff --git a/flang/lib/Lower/ConvertExprToHLFIR.cpp b/flang/lib/Lower/ConvertExprToHLFIR.cpp index 5981116a6d3f7..dd7f6db6f8dd4 100644 --- a/flang/lib/Lower/ConvertExprToHLFIR.cpp +++ b/flang/lib/Lower/ConvertExprToHLFIR.cpp @@ -205,17 +205,8 @@ class HlfirDesignatorBuilder { partInfo.resultShape = hlfir::genShape(getLoc(), getBuilder(), *partInfo.base); - // Dynamic type of polymorphic base must be kept if the designator is - // polymorphic. - if (isPolymorphic(designatorNode)) - return fir::ClassType::get(resultValueType); - // Character scalar with dynamic length needs a fir.boxchar to hold the - // designator length. - auto charType = mlir::dyn_cast(resultValueType); - if (charType && charType.hasDynamicLen()) - return fir::BoxCharType::get(charType.getContext(), charType.getFKind()); - - // When volatile is enabled, enable volatility on the designatory type. + // Enable volatility on the designatory type if it has the VOLATILE attribute + // or if the base is volatile. bool isVolatile = false; // Check if this should be a volatile reference @@ -236,6 +227,17 @@ class HlfirDesignatorBuilder { isVolatile = true; } + // Dynamic type of polymorphic base must be kept if the designator is + // polymorphic. 
+ if (isPolymorphic(designatorNode)) + return fir::ClassType::get(resultValueType, isVolatile); + + // Character scalar with dynamic length needs a fir.boxchar to hold the + // designator length. + auto charType = mlir::dyn_cast(resultValueType); + if (charType && charType.hasDynamicLen()) + return fir::BoxCharType::get(charType.getContext(), charType.getFKind()); + // Check if the base type is volatile if (partInfo.base.has_value()) { mlir::Type baseType = partInfo.base.value().getType(); diff --git a/flang/test/Lower/volatile-derived-type-pointer.f90 b/flang/test/Lower/volatile-derived-type-pointer.f90 new file mode 100644 index 0000000000000..64c4e64784510 --- /dev/null +++ b/flang/test/Lower/volatile-derived-type-pointer.f90 @@ -0,0 +1,43 @@ +! RUN: bbc %s -o - --strict-fir-volatile-verifier | FileCheck %s + +! Ensure that assignments between volatile classes/derived type pointer/targets +! lower to the correct hlfir declare/designate operations. + +module m + type :: dt + character :: c0="!" + integer :: i=0 + character :: c1="!" + end type + end module + program dataptrvolatile + use m + implicit none + type(dt), volatile , target :: arr(100, 100), arr1(10000), t(100,100) + class(dt), volatile , pointer :: ptr(:, :) + integer :: i, j + do i =1, 100 + do j =i, 100 + arr(i:, j:) = dt(i=-i) + ptr(i:, j:) => arr(i:, j:) + t(i:, j:) = ptr(i:, j:) + end do + end do +end + +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEarr"} : (!fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>, !fir.shape<2>) -> (!fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>, !fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEarr1"} : (!fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>, !fir.shape<1>) -> (!fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>, !fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>) +! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {uniq_name = "_QFEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEptr"} : (!fir.ref,i:i32,c1:!fir.char<1>}>>>, volatile>, volatile>) -> (!fir.ref,i:i32,c1:!fir.char<1>}>>>, volatile>, volatile>, !fir.ref,i:i32,c1:!fir.char<1>}>>>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEt"} : (!fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>, !fir.shape<2>) -> (!fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>, !fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {uniq_name = "ctor.temp"} : (!fir.ref,i:i32,c1:!fir.char<1>}>>) -> (!fir.ref,i:i32,c1:!fir.char<1>}>>, !fir.ref,i:i32,c1:!fir.char<1>}>>) +! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0{"c0"} typeparams %{{.+}} : (!fir.ref,i:i32,c1:!fir.char<1>}>>, index) -> !fir.ref> +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QQclX21"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0{"i"} : (!fir.ref,i:i32,c1:!fir.char<1>}>>) -> !fir.ref +! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0{"c1"} typeparams %{{.+}} : (!fir.ref,i:i32,c1:!fir.char<1>}>>, index) -> !fir.ref> +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QQclX21"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0 (%{{.+}}:%{{.+}}:%{{.+}}, %{{.+}}:%{{.+}}:%{{.+}}) shape %{{.+}} : (!fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>, index, index, index, index, index, index, !fir.shape<2>) -> !fir.box,i:i32,c1:!fir.char<1>}>>, volatile> +! 
CHECK: %{{.+}} = hlfir.designate %{{.+}}#0 (%{{.+}}:%{{.+}}:%{{.+}}, %{{.+}}:%{{.+}}:%{{.+}}) shape %{{.+}} : (!fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>, index, index, index, index, index, index, !fir.shape<2>) -> !fir.box,i:i32,c1:!fir.char<1>}>>, volatile> +! CHECK: %{{.+}} = hlfir.designate %{{.+}} (%{{.+}}:%{{.+}}:%{{.+}}, %{{.+}}:%{{.+}}:%{{.+}}) shape %{{.+}} : (!fir.class,i:i32,c1:!fir.char<1>}>>>, volatile>, index, index, index, index, index, index, !fir.shape<2>) -> !fir.class,i:i32,c1:!fir.char<1>}>>, volatile> +! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0 (%{{.+}}:%{{.+}}:%{{.+}}, %{{.+}}:%{{.+}}:%{{.+}}) shape %{{.+}} : (!fir.ref,i:i32,c1:!fir.char<1>}>>, volatile>, index, index, index, index, index, index, !fir.shape<2>) -> !fir.box,i:i32,c1:!fir.char<1>}>>, volatile> ``````````
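The reordering in `ConvertExprToHLFIR.cpp` is easier to see in a toy model. The sketch below does not use the real `fir::` type classes; `ToyType`, `ToyDesignatorInfo`, `resultTypeBefore` and `resultTypeAfter` are hypothetical names, and the final `"ref"` fallthrough as well as the later base-type volatility check from the real code are simplified away. It only illustrates why an early return placed before the volatility computation drops the attribute for polymorphic designators.

```cpp
#include <cassert>
#include <string>

// Toy model of a designator result type; not the real fir:: types.
struct ToyType {
  std::string kind; // "class", "boxchar" or "ref"
  bool isVolatile;
};

// Toy summary of the properties the builder inspects.
struct ToyDesignatorInfo {
  bool isPolymorphic;
  bool hasDynamicLenChar;
  bool hasVolatileAttr;
};

// Old order: the polymorphic early return happens before volatility is
// computed, so a class type can never be marked volatile.
ToyType resultTypeBefore(const ToyDesignatorInfo &d) {
  if (d.isPolymorphic)
    return {"class", false};
  if (d.hasDynamicLenChar)
    return {"boxchar", false};
  return {"ref", d.hasVolatileAttr};
}

// New order: volatility is computed first and passed along when the class
// type is built, analogous to ClassType::get(resultValueType, isVolatile).
ToyType resultTypeAfter(const ToyDesignatorInfo &d) {
  bool isVolatile = d.hasVolatileAttr;
  if (d.isPolymorphic)
    return {"class", isVolatile};
  if (d.hasDynamicLenChar)
    return {"boxchar", false};
  return {"ref", isVolatile};
}
```

For a designator like `ptr(i:, j:)` in the new test, which is both polymorphic and volatile, the old order in this model yields a non-volatile class type, while the new order keeps the attribute.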
https://github.com/llvm/llvm-project/pull/138607 From flang-commits at lists.llvm.org Mon May 5 15:58:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 05 May 2025 15:58:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Lower volatile class types (PR #138607) In-Reply-To: Message-ID: <681942af.050a0220.242dd1.0c91@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command: ``````````bash git-clang-format --diff HEAD~1 HEAD --extensions cpp -- flang/lib/Lower/ConvertExprToHLFIR.cpp ``````````
View the diff from clang-format here. ``````````diff diff --git a/flang/lib/Lower/ConvertExprToHLFIR.cpp b/flang/lib/Lower/ConvertExprToHLFIR.cpp index dd7f6db6f..04b63f92a 100644 --- a/flang/lib/Lower/ConvertExprToHLFIR.cpp +++ b/flang/lib/Lower/ConvertExprToHLFIR.cpp @@ -205,8 +205,8 @@ private: partInfo.resultShape = hlfir::genShape(getLoc(), getBuilder(), *partInfo.base); - // Enable volatility on the designatory type if it has the VOLATILE attribute - // or if the base is volatile. + // Enable volatility on the designatory type if it has the VOLATILE + // attribute or if the base is volatile. bool isVolatile = false; // Check if this should be a volatile reference ``````````
https://github.com/llvm/llvm-project/pull/138607 From flang-commits at lists.llvm.org Mon May 5 15:59:18 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Mon, 05 May 2025 15:59:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Lower volatile class types (PR #138607) In-Reply-To: Message-ID: <681942c6.170a0220.af078.b7ae@mx.google.com> https://github.com/ashermancinelli updated https://github.com/llvm/llvm-project/pull/138607

From flang-commits at lists.llvm.org Mon May 5 16:21:29 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Mon, 05 May 2025 16:21:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang][volatile] Get volatility of designators from base instead of component symbol (PR #138611) Message-ID: https://github.com/akuhlens created https://github.com/llvm/llvm-project/pull/138611 The standard says in [8.5.20 VOLATILE attribute]: If an object has the VOLATILE attribute, then all of its sub-objects also have the VOLATILE attribute. This code takes this into account and uses the volatility of the base of the designator instead of that of the component. In fact, fields in a structure are not allowed to have the volatile attribute. So given the code, `A%B => t`, symbol `B` could never directly have the volatile attribute, and the volatility of `A` indicates the volatility of `B`. This PR should address [the comments](https://github.com/llvm/llvm-project/pull/132486#issuecomment-2851313119) on this PR #132486 >From ad18209089951a8a2b482c199baf85d5388e26fb Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Mon, 5 May 2025 15:57:44 -0700 Subject: [PATCH] initial commit --- flang/lib/Semantics/pointer-assignment.cpp | 7 ++++++- flang/test/Semantics/assign02.f90 | 10 ++++++++++ 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/flang/lib/Semantics/pointer-assignment.cpp b/flang/lib/Semantics/pointer-assignment.cpp index 36c9c5b845706..c17eb0aa941ec 100644 --- a/flang/lib/Semantics/pointer-assignment.cpp +++ b/flang/lib/Semantics/pointer-assignment.cpp @@ -329,7 +329,6 @@ bool PointerAssignmentChecker::Check(const evaluate::Designator &d) { " shape"_err_en_US; } else if (rhsType->corank() > 0 && (isVolatile_ != last->attrs().test(Attr::VOLATILE))) { // C1020 - // TODO: what if A is VOLATILE in A%B%C? 
need a better test here if (isVolatile_) { msg = "Pointer may not be VOLATILE when target is a" " non-VOLATILE coarray"_err_en_US; @@ -569,6 +568,12 @@ bool CheckPointerAssignment(SemanticsContext &context, const SomeExpr &lhs, return false; // error was reported } PointerAssignmentChecker checker{context, scope, *pointer}; + const Symbol *base{GetFirstSymbol(lhs)}; + if (base) { + // 8.5.20(4) If an object has the VOLATILE attribute, then all of its + // subobjects also have the VOLATILE attribute. + checker.set_isVolatile(base->attrs().test(Attr::VOLATILE)); + } checker.set_isBoundsRemapping(isBoundsRemapping); checker.set_isAssumedRank(isAssumedRank); bool lhsOk{checker.CheckLeftHandSide(lhs)}; diff --git a/flang/test/Semantics/assign02.f90 b/flang/test/Semantics/assign02.f90 index d83d126e2734c..9fa672025bfe7 100644 --- a/flang/test/Semantics/assign02.f90 +++ b/flang/test/Semantics/assign02.f90 @@ -8,9 +8,11 @@ module m1 type t2 sequence real :: t2Field + real, pointer :: t2FieldPtr end type type t3 type(t2) :: t3Field + type(t2), pointer :: t3FieldPtr end type contains @@ -198,6 +200,14 @@ subroutine s13 q2 => y%t3Field !OK: q3 => y + !ERROR: VOLATILE target associated with non-VOLATILE pointer + p3%t3FieldPtr => y%t3Field + !ERROR: VOLATILE target associated with non-VOLATILE pointer + p3%t3FieldPtr%t2FieldPtr => y%t3Field%t2Field + !OK + q3%t3FieldPtr => y%t3Field + !OK + q3%t3FieldPtr%t2FieldPtr => y%t3Field%t2Field end end From flang-commits at lists.llvm.org Mon May 5 16:22:06 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 05 May 2025 16:22:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang][volatile] Get volatility of designators from base instead of component symbol (PR #138611) In-Reply-To: Message-ID: <6819481e.170a0220.3cda38.ad1e@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Andre Kuhlenschmidt (akuhlens)
Changes The standard says in [8.5.20 VOLATILE attribute]: If an object has the VOLATILE attribute, then all of its sub-objects also have the VOLATILE attribute. This code takes this into account and uses the volatility of the base of the designator instead of that of the component. In fact, fields in a structure are not allowed to have the volatile attribute. So given the code, `A%B => t`, symbol `B` could never directly have the volatile attribute, and the volatility of `A` indicates the volatility of `B`. This PR should address [the comments](https://github.com/llvm/llvm-project/pull/132486#issuecomment-2851313119) on this PR #132486 --- Full diff: https://github.com/llvm/llvm-project/pull/138611.diff 2 Files Affected: - (modified) flang/lib/Semantics/pointer-assignment.cpp (+6-1) - (modified) flang/test/Semantics/assign02.f90 (+10) ``````````diff diff --git a/flang/lib/Semantics/pointer-assignment.cpp b/flang/lib/Semantics/pointer-assignment.cpp index 36c9c5b845706..c17eb0aa941ec 100644 --- a/flang/lib/Semantics/pointer-assignment.cpp +++ b/flang/lib/Semantics/pointer-assignment.cpp @@ -329,7 +329,6 @@ bool PointerAssignmentChecker::Check(const evaluate::Designator &d) { " shape"_err_en_US; } else if (rhsType->corank() > 0 && (isVolatile_ != last->attrs().test(Attr::VOLATILE))) { // C1020 - // TODO: what if A is VOLATILE in A%B%C? need a better test here if (isVolatile_) { msg = "Pointer may not be VOLATILE when target is a" " non-VOLATILE coarray"_err_en_US; @@ -569,6 +568,12 @@ bool CheckPointerAssignment(SemanticsContext &context, const SomeExpr &lhs, return false; // error was reported } PointerAssignmentChecker checker{context, scope, *pointer}; + const Symbol *base{GetFirstSymbol(lhs)}; + if (base) { + // 8.5.20(4) If an object has the VOLATILE attribute, then all of its + // subobjects also have the VOLATILE attribute. 
+ checker.set_isVolatile(base->attrs().test(Attr::VOLATILE)); + } checker.set_isBoundsRemapping(isBoundsRemapping); checker.set_isAssumedRank(isAssumedRank); bool lhsOk{checker.CheckLeftHandSide(lhs)}; diff --git a/flang/test/Semantics/assign02.f90 b/flang/test/Semantics/assign02.f90 index d83d126e2734c..9fa672025bfe7 100644 --- a/flang/test/Semantics/assign02.f90 +++ b/flang/test/Semantics/assign02.f90 @@ -8,9 +8,11 @@ module m1 type t2 sequence real :: t2Field + real, pointer :: t2FieldPtr end type type t3 type(t2) :: t3Field + type(t2), pointer :: t3FieldPtr end type contains @@ -198,6 +200,14 @@ subroutine s13 q2 => y%t3Field !OK: q3 => y + !ERROR: VOLATILE target associated with non-VOLATILE pointer + p3%t3FieldPtr => y%t3Field + !ERROR: VOLATILE target associated with non-VOLATILE pointer + p3%t3FieldPtr%t2FieldPtr => y%t3Field%t2Field + !OK + q3%t3FieldPtr => y%t3Field + !OK + q3%t3FieldPtr%t2FieldPtr => y%t3Field%t2Field end end ``````````
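The rule quoted from 8.5.20 can be illustrated with a small stand-alone sketch. Nothing here is the flang `Symbol` or `GetFirstSymbol` API: `ToySymbol`, `PartChain`, `volatileFromLastComponent` and `volatileFromBase` are hypothetical, and a designator such as `A%B%C` is simply modelled as the list of its part symbols. The sketch also assumes that `q3` in the test is declared VOLATILE and `p3` is not, as the expected errors suggest.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy symbol: a name plus whether it carries the VOLATILE attribute itself.
struct ToySymbol {
  std::string name;
  bool isVolatile;
};

// A%B%C is modelled as {A, B, C}.
using PartChain = std::vector<ToySymbol>;

// Looks only at the last component. Components cannot carry VOLATILE
// themselves, so this misses volatility inherited from the base.
bool volatileFromLastComponent(const PartChain &d) {
  return !d.empty() && d.back().isVolatile;
}

// Looks at the base object: if it is VOLATILE, all of its subobjects are
// VOLATILE as well.
bool volatileFromBase(const PartChain &d) {
  return !d.empty() && d.front().isVolatile;
}
```

With this model, `q3%t3FieldPtr` is treated as volatile because `q3` is, which is why the assignments marked `!OK` in the test are accepted, while the `p3%...` cases still report the error.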
https://github.com/llvm/llvm-project/pull/138611 From flang-commits at lists.llvm.org Mon May 5 16:23:51 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Mon, 05 May 2025 16:23:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add lowering of volatile references (PR #132486) In-Reply-To: Message-ID: <68194887.170a0220.3beb16.6898@mx.google.com> akuhlens wrote: @DanielCChen PR #138611 fixes your issue. Sorry for not catching that. https://github.com/llvm/llvm-project/pull/132486 From flang-commits at lists.llvm.org Mon May 5 17:47:24 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 05 May 2025 17:47:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Lower volatile class types (PR #138607) In-Reply-To: Message-ID: <68195c1c.630a0220.3ab385.b1c5@mx.google.com> https://github.com/vzakhari approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/138607 From flang-commits at lists.llvm.org Mon May 5 19:17:37 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Mon, 05 May 2025 19:17:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang][volatile] Get volatility of designators from base instead of component symbol (PR #138611) In-Reply-To: Message-ID: <68197141.170a0220.1b7e1.bae9@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/138611 From flang-commits at lists.llvm.org Mon May 5 20:53:49 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 05 May 2025 20:53:49 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. (PR #138627) Message-ID: https://github.com/NexMing created https://github.com/llvm/llvm-project/pull/138627 Currently, the FIR dialect is directly lowered to the LLVM dialect. 
We can first convert the FIR dialect to the Affine dialect, perform optimizations on top of it, and then lower it to the FIR dialect. The optimization passes are currently experimental, so it's important to actively identify and address issues. >From 03ead063dd9ece8c8077bacd0663d871c906823f Mon Sep 17 00:00:00 2001 From: yanming Date: Wed, 30 Apr 2025 16:32:14 +0800 Subject: [PATCH] [flang][fir] Add affine optimization pass pipeline. --- .../flang/Optimizer/Passes/CommandLineOpts.h | 1 + .../flang/Optimizer/Passes/Pipelines.h | 4 +- flang/lib/Optimizer/Passes/CMakeLists.txt | 1 + .../lib/Optimizer/Passes/CommandLineOpts.cpp | 1 + flang/lib/Optimizer/Passes/Pipelines.cpp | 17 ++++++ flang/test/Lower/OpenMP/auto-omp.f90 | 52 +++++++++++++++++++ 6 files changed, 74 insertions(+), 2 deletions(-) create mode 100644 flang/test/Lower/OpenMP/auto-omp.f90 diff --git a/flang/include/flang/Optimizer/Passes/CommandLineOpts.h b/flang/include/flang/Optimizer/Passes/CommandLineOpts.h index 1cfaf285e75e6..320c561953213 100644 --- a/flang/include/flang/Optimizer/Passes/CommandLineOpts.h +++ b/flang/include/flang/Optimizer/Passes/CommandLineOpts.h @@ -42,6 +42,7 @@ extern llvm::cl::opt disableCfgConversion; extern llvm::cl::opt disableFirAvc; extern llvm::cl::opt disableFirMao; +extern llvm::cl::opt enableAffineOpt; extern llvm::cl::opt disableFirAliasTags; extern llvm::cl::opt useOldAliasTags; diff --git a/flang/include/flang/Optimizer/Passes/Pipelines.h b/flang/include/flang/Optimizer/Passes/Pipelines.h index a3f59ee8dd013..5c87b1ce609ef 100644 --- a/flang/include/flang/Optimizer/Passes/Pipelines.h +++ b/flang/include/flang/Optimizer/Passes/Pipelines.h @@ -18,8 +18,8 @@ #include "flang/Optimizer/Passes/CommandLineOpts.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Tools/CrossToolHelpers.h" -#include "mlir/Conversion/ReconcileUnrealizedCasts/ReconcileUnrealizedCasts.h" -#include "mlir/Conversion/SCFToControlFlow/SCFToControlFlow.h" +#include 
"mlir/Conversion/Passes.h" +#include "mlir/Dialect/Affine/Passes.h" #include "mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/Dialect/LLVMIR/LLVMAttrs.h" #include "mlir/Pass/PassManager.h" diff --git a/flang/lib/Optimizer/Passes/CMakeLists.txt b/flang/lib/Optimizer/Passes/CMakeLists.txt index 1c19a5765aff1..ad6c714c28bec 100644 --- a/flang/lib/Optimizer/Passes/CMakeLists.txt +++ b/flang/lib/Optimizer/Passes/CMakeLists.txt @@ -21,6 +21,7 @@ add_flang_library(flangPasses MLIRPass MLIRReconcileUnrealizedCasts MLIRSCFToControlFlow + MLIRSCFToOpenMP MLIRSupport MLIRTransforms ) diff --git a/flang/lib/Optimizer/Passes/CommandLineOpts.cpp b/flang/lib/Optimizer/Passes/CommandLineOpts.cpp index f95a280883cba..b8ae6ede423e3 100644 --- a/flang/lib/Optimizer/Passes/CommandLineOpts.cpp +++ b/flang/lib/Optimizer/Passes/CommandLineOpts.cpp @@ -55,6 +55,7 @@ cl::opt useOldAliasTags( cl::desc("Use a single TBAA tree for all functions and do not use " "the FIR alias tags pass"), cl::init(false), cl::Hidden); +EnableOption(AffineOpt, "affine-opt", "affine optimization"); /// CodeGen Passes DisableOption(CodeGenRewrite, "codegen-rewrite", "rewrite FIR for codegen"); diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index a3ef473ea39b7..e1653cdb1e874 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -211,6 +211,23 @@ void createDefaultFIROptimizerPassPipeline(mlir::PassManager &pm, addNestedPassToAllTopLevelOperations( pm, fir::createStackReclaim); + + if (enableAffineOpt && pc.OptLevel.isOptimizingForSpeed()) { + pm.addPass(fir::createPromoteToAffinePass()); + pm.addPass(mlir::createCSEPass()); + pm.addPass(mlir::affine::createAffineLoopInvariantCodeMotionPass()); + pm.addPass(mlir::affine::createAffineLoopNormalizePass()); + pm.addPass(mlir::affine::createSimplifyAffineStructuresPass()); + pm.addPass(mlir::affine::createAffineParallelize( + mlir::affine::AffineParallelizeOptions{1, 
false})); + pm.addPass(fir::createAffineDemotionPass()); + pm.addPass(mlir::createLowerAffinePass()); + if (pc.EnableOpenMP) { + pm.addPass(mlir::createConvertSCFToOpenMPPass()); + pm.addPass(mlir::createCanonicalizerPass()); + } + } + // convert control flow to CFG form fir::addCfgConversionPass(pm, pc); pm.addPass(mlir::createSCFToControlFlowPass()); diff --git a/flang/test/Lower/OpenMP/auto-omp.f90 b/flang/test/Lower/OpenMP/auto-omp.f90 new file mode 100644 index 0000000000000..d66e6c3f3a3a0 --- /dev/null +++ b/flang/test/Lower/OpenMP/auto-omp.f90 @@ -0,0 +1,52 @@ +! RUN: %flang_fc1 -O1 -mllvm --enable-affine-opt -emit-llvm -fopenmp -o - %s \ +! RUN: | FileCheck %s + +subroutine foo(a) + integer, dimension(100, 100), intent(out) :: a + a = 1 +end subroutine foo + +!CHECK-LABEL: entry: +!CHECK: %[[VAL_0:.*]] = alloca { ptr }, align 8 +!CHECK: %[[VAL_1:.*]] = tail call i32 @__kmpc_global_thread_num(ptr nonnull @1) +!CHECK: store ptr %[[VAL_2:.*]], ptr %[[VAL_0]], align 8 +!CHECK: call void (ptr, i32, ptr, ...) 
@__kmpc_fork_call(ptr nonnull @1, i32 1, ptr nonnull @foo_..omp_par, ptr nonnull %[[VAL_0]]) +!CHECK: ret void +!CHECK: omp.par.entry: +!CHECK: %[[VAL_3:.*]] = load ptr, ptr %[[VAL_4:.*]], align 8, !align !3 +!CHECK: %[[VAL_5:.*]] = alloca i32, align 4 +!CHECK: %[[VAL_6:.*]] = alloca i64, align 8 +!CHECK: %[[VAL_7:.*]] = alloca i64, align 8 +!CHECK: %[[VAL_8:.*]] = alloca i64, align 8 +!CHECK: store i64 0, ptr %[[VAL_6]], align 8 +!CHECK: store i64 99, ptr %[[VAL_7]], align 8 +!CHECK: store i64 1, ptr %[[VAL_8]], align 8 +!CHECK: %[[VAL_9:.*]] = tail call i32 @__kmpc_global_thread_num(ptr nonnull @1) +!CHECK: call void @__kmpc_for_static_init_8u(ptr nonnull @1, i32 %[[VAL_9]], i32 34, ptr nonnull %[[VAL_5]], ptr nonnull %[[VAL_6]], ptr nonnull %[[VAL_7]], ptr nonnull %[[VAL_8]], i64 1, i64 0) +!CHECK: %[[VAL_10:.*]] = load i64, ptr %[[VAL_6]], align 8 +!CHECK: %[[VAL_11:.*]] = load i64, ptr %[[VAL_7]], align 8 +!CHECK: %[[VAL_12:.*]] = sub i64 %[[VAL_11]], %[[VAL_10]] +!CHECK: %[[VAL_13:.*]] = icmp eq i64 %[[VAL_12]], -1 +!CHECK: br i1 %[[VAL_13]], label %[[VAL_14:.*]], label %[[VAL_15:.*]] +!CHECK: omp_loop.exit: ; preds = %[[VAL_16:.*]], %[[VAL_17:.*]] +!CHECK: call void @__kmpc_for_static_fini(ptr nonnull @1, i32 %[[VAL_9]]) +!CHECK: %[[VAL_18:.*]] = call i32 @__kmpc_global_thread_num(ptr nonnull @1) +!CHECK: call void @__kmpc_barrier(ptr nonnull @2, i32 %[[VAL_18]]) +!CHECK: ret void +!CHECK: omp_loop.body: ; preds = %[[VAL_17]], %[[VAL_16]] +!CHECK: %[[VAL_19:.*]] = phi i64 [ %[[VAL_20:.*]], %[[VAL_16]] ], [ 0, %[[VAL_17]] ] +!CHECK: %[[VAL_21:.*]] = add i64 %[[VAL_19]], %[[VAL_10]] +!CHECK: %[[VAL_22:.*]] = mul i64 %[[VAL_21]], 400 +!CHECK: %[[VAL_23:.*]] = getelementptr i8, ptr %[[VAL_3]], i64 %[[VAL_22]] +!CHECK: br label %[[VAL_24:.*]] +!CHECK: omp_loop.inc: ; preds = %[[VAL_24]] +!CHECK: %[[VAL_20]] = add nuw i64 %[[VAL_19]], 1 +!CHECK: %[[VAL_25:.*]] = icmp eq i64 %[[VAL_19]], %[[VAL_12]] +!CHECK: br i1 %[[VAL_25]], label %[[VAL_14]], label %[[VAL_15]] 
+!CHECK: omp.loop_nest.region6: ; preds = %[[VAL_15]], %[[VAL_24]] +!CHECK: %[[VAL_26:.*]] = phi i64 [ 0, %[[VAL_15]] ], [ %[[VAL_27:.*]], %[[VAL_24]] ] +!CHECK: %[[VAL_28:.*]] = getelementptr i32, ptr %[[VAL_23]], i64 %[[VAL_26]] +!CHECK: store i32 1, ptr %[[VAL_28]], align 4, !tbaa !4 +!CHECK: %[[VAL_27]] = add nuw nsw i64 %[[VAL_26]], 1 +!CHECK: %[[VAL_29:.*]] = icmp eq i64 %[[VAL_27]], 100 +!CHECK: br i1 %[[VAL_29]], label %[[VAL_16]], label %[[VAL_24]] From flang-commits at lists.llvm.org Mon May 5 21:08:13 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Mon, 05 May 2025 21:08:13 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Fix fir.convert in omp.atomic.update region (PR #138397) In-Reply-To: Message-ID: <68198b2d.170a0220.34d0bf.d506@mx.google.com> https://github.com/Thirumalai-Shaktivel approved this pull request. LGTM, thank you! https://github.com/llvm/llvm-project/pull/138397 From flang-commits at lists.llvm.org Mon May 5 21:08:52 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Mon, 05 May 2025 21:08:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Fix fir.convert in omp.atomic.update region (PR #138397) In-Reply-To: Message-ID: <68198b54.630a0220.1b0cf9.f08b@mx.google.com> https://github.com/Thirumalai-Shaktivel edited https://github.com/llvm/llvm-project/pull/138397 From flang-commits at lists.llvm.org Mon May 5 21:09:48 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Mon, 05 May 2025 21:09:48 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Fix fir.convert in omp.atomic.update region (PR #138397) In-Reply-To: Message-ID: <68198b8c.170a0220.30a145.631b@mx.google.com> https://github.com/Thirumalai-Shaktivel edited https://github.com/llvm/llvm-project/pull/138397 From flang-commits at lists.llvm.org Mon May 5 21:36:04 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via 
flang-commits) Date: Mon, 05 May 2025 21:36:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <681991b4.a70a0220.3751bc.350b@mx.google.com> https://github.com/Thirumalai-Shaktivel commented: LGTM https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Mon May 5 19:43:18 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 05 May 2025 19:43:18 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <68197746.170a0220.d302d.ce44@mx.google.com> https://github.com/fanju110 updated https://github.com/llvm/llvm-project/pull/136098 >From 9494c9752400e4708dbc8b6a5ca4993ea9565e95 Mon Sep 17 00:00:00 2001 From: fanyikang Date: Thu, 17 Apr 2025 15:17:07 +0800 Subject: [PATCH 01/13] Add support for IR PGO (-fprofile-generate/-fprofile-use=/file) This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags: -fprofile-generate for instrumentation-based profile generation -fprofile-use=/file for profile-guided optimization Co-Authored-By: ict-ql <168183727+ict-ql at users.noreply.github.com> --- clang/include/clang/Driver/Options.td | 4 +- clang/lib/Driver/ToolChains/Flang.cpp | 8 +++ .../include/flang/Frontend/CodeGenOptions.def | 5 ++ flang/include/flang/Frontend/CodeGenOptions.h | 49 +++++++++++++++++ flang/lib/Frontend/CompilerInvocation.cpp | 12 +++++ flang/lib/Frontend/FrontendActions.cpp | 54 +++++++++++++++++++ flang/test/Driver/flang-f-opts.f90 | 5 ++ .../Inputs/gcc-flag-compatibility_IR.proftext | 19 +++++++ .../gcc-flag-compatibility_IR_entry.proftext | 14 +++++ flang/test/Profile/gcc-flag-compatibility.f90 | 39 ++++++++++++++ 10 files changed, 207 insertions(+), 2 deletions(-) create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext create mode 
100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext create mode 100644 flang/test/Profile/gcc-flag-compatibility.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index affc076a876ad..0b0dbc467c1e0 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1756,7 +1756,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFFFFFE">; def fprofile_generate : Flag<["-"], "fprofile-generate">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">, Group, Visibility<[ClangOption, CLOption]>, @@ -1773,7 +1773,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group, Visibility<[ClangOption, CLOption]>, Alias; def fprofile_use_EQ : Joined<["-"], "fprofile-use=">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, MetaVarName<"">, HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. 
Otherwise, it reads from file .">; def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">, diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index a8b4688aed09c..fcdbe8a6aba5a 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -882,6 +882,14 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); + + if (Args.hasArg(options::OPT_fprofile_generate)){ + CmdArgs.push_back("-fprofile-generate"); + } + if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) { + CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue())); + } + // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ, diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 57830bf51a1b3..4dec86cd8f51b 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,6 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Whether emit extra debug info for sample pgo profile collection. +CODEGENOPT(DebugInfoForProfiling, 1, 0) +CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. 
CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..e052250f97e75 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,6 +148,55 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; + enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. + }; + + + /// Name of the profile file to use as output for -fprofile-instr-generate, + /// -fprofile-generate, and -fcs-profile-generate. + std::string InstrProfileOutput; + + /// Name of the profile file to use as input for -fmemory-profile-use. + std::string MemoryProfileUsePath; + + unsigned int DebugInfoForProfiling; + + unsigned int AtomicProfileUpdate; + + /// Name of the profile file to use as input for -fprofile-instr-use + std::string ProfileInstrumentUsePath; + + /// Name of the profile remapping file to apply to the profile data supplied + /// by -fprofile-sample-use or -fprofile-instr-use. + std::string ProfileRemappingFile; + + /// Check if Clang profile instrumenation is on. + bool hasProfileClangInstr() const { + return getProfileInstr() == ProfileClangInstr; + } + + /// Check if IR level profile instrumentation is on. + bool hasProfileIRInstr() const { + return getProfileInstr() == ProfileIRInstr; + } + + /// Check if CS IR level profile instrumentation is on. + bool hasProfileCSIRInstr() const { + return getProfileInstr() == ProfileCSIRInstr; + } + /// Check if IR level profile use is on. 
+ bool hasProfileIRUse() const { + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; + } + /// Check if CSIR profile use is on. + bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + // Define accessors/mutators for code generation options of enumeration type. #define CODEGENOPT(Name, Bits, Default) #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 6f87a18d69c3d..f013fce2f3cfc 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -27,6 +27,7 @@ #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" +#include "clang/Driver/Driver.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" @@ -431,6 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, opts.IsPIE = 1; } + if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { + opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); + opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + } + + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { + opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.ProfileInstrumentUsePath = A->getValue(); + } + // -mcmodel option. 
if (const llvm::opt::Arg *a = args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) { diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c1f47b12abee2..68880bdeecf8d 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -63,11 +63,14 @@ #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Target/TargetMachine.h" #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" #include "llvm/Transforms/Utils/ModuleUtils.h" +#include "llvm/Transforms/Instrumentation/InstrProfiling.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include #include @@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// + +static llvm::cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden, + llvm::cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); + bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -892,6 +909,20 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } + +// Default filename used for profile generation. 
+namespace llvm { + extern llvm::cl::opt DebugInfoCorrelate; + extern llvm::cl::opt ProfileCorrelate; + + +std::string getDefaultProfileGenName() { + return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE + ? "default_%m.proflite" + : "default_%m.profraw"; +} +} + void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts(); @@ -909,6 +940,29 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; + + if (opts.hasProfileIRInstr()){ + // // -fprofile-generate. + pgoOpt = llvm::PGOOptions( + opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + } + else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse + : llvm::PGOOptions::NoCSAction; + pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "", + opts.ProfileRemappingFile, + opts.MemoryProfileUsePath, VFS, + llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling); + } + llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90 index 4493a519e2010..b972b9b7b2a59 100644 --- a/flang/test/Driver/flang-f-opts.f90 +++ b/flang/test/Driver/flang-f-opts.f90 @@ -8,3 +8,8 @@ ! CHECK-LABEL: "-fc1" ! CHECK: -ffp-contract=off ! CHECK: -O3 + +! 
RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s +! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate" +! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s +! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}" diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext new file mode 100644 index 0000000000000..6a6df8b1d4d5b --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext @@ -0,0 +1,19 @@ +# IR level Instrumentation Flag +:ir +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 + +main +# Func Hash: +742261418966908927 +# Num Counters: +1 +# Counter Values: +1 + diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext new file mode 100644 index 0000000000000..9a46140286673 --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext @@ -0,0 +1,14 @@ +# IR level Instrumentation Flag +:ir +:entry_first +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 + + + diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 new file mode 100644 index 0000000000000..0124bc79b87ef --- /dev/null +++ b/flang/test/Profile/gcc-flag-compatibility.f90 @@ -0,0 +1,39 @@ +! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two +! flags behave similarly to their GCC counterparts: +! +! -fprofile-generate Generates the profile file ./default.profraw +! -fprofile-use=/file Uses the profile file /file + +! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto +! 
RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s +! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section +! PROFILE-GEN: @__profd_{{_?}}main = + + + +! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof +! This uses LLVM IR format profile. +! RUN: rm -rf %t.dir +! RUN: mkdir -p %t.dir/some/path +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s +! + + + +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s +! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} +! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} + + +program main + implicit none + integer :: i + integer :: X = 0 + + do i = 0, 99 + X = X + i + end do + +end program main >From b897c7aa1e21dfe46b4acf709f3ea38d9021c164 Mon Sep 17 00:00:00 2001 From: FYK Date: Wed, 23 Apr 2025 09:56:14 +0800 Subject: [PATCH 02/13] Update flang/lib/Frontend/FrontendActions.cpp Remove redundant comment symbols Co-authored-by: Tom Eccles --- flang/lib/Frontend/FrontendActions.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 68880bdeecf8d..cd13a6aca92cd 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -942,7 +942,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { std::optional pgoOpt; if (opts.hasProfileIRInstr()){ - // // -fprofile-generate. + // -fprofile-generate. pgoOpt = llvm::PGOOptions( opts.InstrProfileOutput.empty() ? 
llvm::getDefaultProfileGenName() : opts.InstrProfileOutput, >From bc5adfcc4ac3456f587bedd48c1a8892d27e53ae Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 25 Apr 2025 11:48:30 +0800 Subject: [PATCH 03/13] format code with clang-format --- flang/include/flang/Frontend/CodeGenOptions.h | 17 ++-- flang/lib/Frontend/CompilerInvocation.cpp | 15 ++-- flang/lib/Frontend/FrontendActions.cpp | 83 +++++++++---------- .../Inputs/gcc-flag-compatibility_IR.proftext | 3 +- .../gcc-flag-compatibility_IR_entry.proftext | 5 +- 5 files changed, 59 insertions(+), 64 deletions(-) diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index e052250f97e75..c9577862df832 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -156,7 +156,6 @@ class CodeGenOptions : public CodeGenOptionsBase { ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -171,7 +170,7 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; - /// Name of the profile remapping file to apply to the profile data supplied + /// Name of the profile remapping file to apply to the profile data supplied /// by -fprofile-sample-use or -fprofile-instr-use. std::string ProfileRemappingFile; @@ -181,19 +180,17 @@ class CodeGenOptions : public CodeGenOptionsBase { } /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; - } + bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. 
bool hasProfileCSIRInstr() const { return getProfileInstr() == ProfileCSIRInstr; } - /// Check if IR level profile use is on. - bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; - } + /// Check if IR level profile use is on. + bool hasProfileIRUse() const { + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; + } /// Check if CSIR profile use is on. bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index f013fce2f3cfc..b28c2c0047579 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -27,7 +27,6 @@ #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" -#include "clang/Driver/Driver.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" @@ -433,13 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr( + Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.DebugInfoForProfiling = + args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); + opts.AtomicProfileUpdate = + args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); } - + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse( + 
Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); } diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index cd13a6aca92cd..8d1ab670e4db4 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -56,21 +56,21 @@ #include "llvm/Passes/PassBuilder.h" #include "llvm/Passes/PassPlugin.h" #include "llvm/Passes/StandardInstrumentations.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/Error.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/FileSystem.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" -#include "llvm/Support/PGOOptions.h" #include "llvm/Target/TargetMachine.h" #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" -#include "llvm/Transforms/Utils/ModuleUtils.h" #include "llvm/Transforms/Instrumentation/InstrProfiling.h" -#include "llvm/ProfileData/InstrProfCorrelator.h" +#include "llvm/Transforms/Utils/ModuleUtils.h" #include #include @@ -133,19 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// - static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions 
with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); + "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), + llvm::cl::Hidden, + llvm::cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, + "default", "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, + "optsize", "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, + "minsize", "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, + "optnone", + "Mark cold functions with optnone."))); bool PrescanAction::beginSourceFileAction() { return runPrescan(); } @@ -909,19 +910,18 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } - // Default filename used for profile generation. namespace llvm { - extern llvm::cl::opt DebugInfoCorrelate; - extern llvm::cl::opt ProfileCorrelate; - +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt ProfileCorrelate; std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE + return DebugInfoCorrelate || + ProfileCorrelate != llvm::InstrProfCorrelator::NONE ? "default_%m.proflite" : "default_%m.profraw"; } -} +} // namespace llvm void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); @@ -940,29 +940,28 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; - - if (opts.hasProfileIRInstr()){ + + if (opts.hasProfileIRInstr()) { // -fprofile-generate. pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? 
llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); - } - else if (opts.hasProfileIRUse()) { - llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); - // -fprofile-use. - auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse - : llvm::PGOOptions::NoCSAction; - pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "", - opts.ProfileRemappingFile, - opts.MemoryProfileUsePath, VFS, - llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling); - } - + opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + } else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = + llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? 
llvm::PGOOptions::CSIRUse + : llvm::PGOOptions::NoCSAction; + pgoOpt = llvm::PGOOptions( + opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, + opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, + ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + } + llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext index 6a6df8b1d4d5b..2650fb5ebfd35 100644 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext @@ -15,5 +15,4 @@ main # Num Counters: 1 # Counter Values: -1 - +1 \ No newline at end of file diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext index 9a46140286673..c4a2a26557e80 100644 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext @@ -8,7 +8,4 @@ _QQmain 2 # Counter Values: 100 -1 - - - +1 \ No newline at end of file >From d64d9d95fb97d6cfa4bf4192bfb20f5c8d6b3bc3 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 25 Apr 2025 11:53:47 +0800 Subject: [PATCH 04/13] simplify push_back usage --- clang/lib/Driver/ToolChains/Flang.cpp | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index fcdbe8a6aba5a..9c7e87c455e44 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -882,13 +882,9 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); - - if (Args.hasArg(options::OPT_fprofile_generate)){ - 
CmdArgs.push_back("-fprofile-generate"); - } - if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) { - CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue())); - } + // recognise options: fprofile-generate -fprofile-use= + Args.addAllArgs( + CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ}); // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. >From 22475a85d24b22fb44ca5a5ce26542b556bae280 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Mon, 28 Apr 2025 20:33:54 +0800 Subject: [PATCH 05/13] Port the getDefaultProfileGenName definition and the ProfileInstrKind definition from clang to the llvm namespace to allow flang to reuse these code. --- clang/include/clang/Basic/CodeGenOptions.def | 4 +- clang/include/clang/Basic/CodeGenOptions.h | 22 +++++--- clang/include/clang/Basic/ProfileList.h | 9 ++-- clang/include/clang/CodeGen/BackendUtil.h | 3 ++ clang/lib/Basic/ProfileList.cpp | 20 ++++---- clang/lib/CodeGen/BackendUtil.cpp | 50 ++++++------------- clang/lib/CodeGen/CodeGenAction.cpp | 4 +- clang/lib/CodeGen/CodeGenFunction.cpp | 3 +- clang/lib/CodeGen/CodeGenModule.cpp | 2 +- clang/lib/Frontend/CompilerInvocation.cpp | 6 +-- .../include/flang/Frontend/CodeGenOptions.def | 8 +-- flang/include/flang/Frontend/CodeGenOptions.h | 28 ++++------- .../include/flang/Frontend/FrontendActions.h | 5 ++ flang/lib/Frontend/CompilerInvocation.cpp | 11 ++-- flang/lib/Frontend/FrontendActions.cpp | 28 +++-------- .../llvm/Frontend/Driver/CodeGenOptions.h | 15 +++++- llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 25 ++++++++++ 17 files changed, 123 insertions(+), 120 deletions(-) diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index a436c0ec98d5b..3dbadc85f23cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,9 +223,9 @@ 
AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index e39a73bdb13ac..e4a74cd2c12cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; + return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if any form of instrumentation is on. 
- bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } + bool hasProfileInstr() const { + return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; + } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == ProfileClangInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + } /// Check if type and variable info should be emitted. bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff 
--git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h index 92e0d13bf25b6..d9abf7bf962d2 100644 --- a/clang/include/clang/CodeGen/BackendUtil.h +++ b/clang/include/clang/CodeGen/BackendUtil.h @@ -8,6 +8,8 @@ #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include "clang/Basic/LLVM.h" #include "llvm/IR/ModuleSummaryIndex.h" @@ -19,6 +21,7 @@ template class Expected; template class IntrusiveRefCntPtr; class Module; class MemoryBufferRef; +extern cl::opt ClPGOColdFuncAttr; namespace vfs { class FileSystem; } // namespace vfs diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0c5aa6de3f3c0 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if 
(SCL->inSection(Section, "default", "allow")) @@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 7557cb8408921..963ed321b2cb9 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -103,38 +103,16 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP( "sanitizer-early-opt-ep", cl::Optional, cl::desc("Insert sanitizers on OptimizerEarlyEP.")); -// Experiment to mark cold functions as optsize/minsize/optnone. -// TODO: remove once this is exposed as a proper driver flag. 
-static cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, - cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions with minsize."), - clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); - extern cl::opt ProfileCorrelate; } // namespace llvm namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? "default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -834,12 +812,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. 
- PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse @@ -847,14 +825,14 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) @@ -863,15 +841,15 @@ void EmitAssemblyHelper::RunOptimizationPipeline( ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling - PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", 
nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4d29ceace646f..10f0a12773498 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 0154799498f5e..2e8e90493c370 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, 
// If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index cfc5c069b0849..ece7e5cad774c 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 4dec86cd8f51b..68396c4c40711 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Choose profile instrumenation kind or no instrumentation. 
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Whether emit extra debug info for sample pgo profile collection. -CODEGENOPT(DebugInfoForProfiling, 1, 0) -CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index c9577862df832..6a989f0c0775e 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. - }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fmemory-profile-use. 
std::string MemoryProfileUsePath; - unsigned int DebugInfoForProfiling; - - unsigned int AtomicProfileUpdate; - /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; @@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == llvm::driver::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h index f9a45bd6c0a56..9ba74a9dad9be 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -20,8 +20,13 @@ #include "mlir/IR/OwningOpRef.h" #include "llvm/ADT/StringRef.h" #include "llvm/IR/Module.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include +namespace llvm { +extern cl::opt ClPGOColdFuncAttr; +} // namespace llvm namespace Fortran::frontend { //===----------------------------------------------------------------------===// diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b28c2c0047579..43c8f4d596903 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = - args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = - args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); } if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); 
} diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 8d1ab670e4db4..c758aa18fbb8e 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -28,6 +28,7 @@ #include "flang/Semantics/unparse-with-symbols.h" #include "flang/Support/default-kinds.h" #include "flang/Tools/CrossToolHelpers.h" +#include "clang/CodeGen/BackendUtil.h" #include "mlir/IR/Dialect.h" #include "mlir/Parser/Parser.h" @@ -133,21 +134,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// -static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), - llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, - "default", "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, - "optsize", "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, - "minsize", "Mark cold functions with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, - "optnone", - "Mark cold functions with optnone."))); - bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -944,12 +930,12 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (opts.hasProfileIRInstr()) { // -fprofile-generate. pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, + opts.InstrProfileOutput.empty() + ? 
llvm::driver::getDefaultProfileGenName()
+                                        : opts.InstrProfileOutput,
         "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr,
-        llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr,
-        opts.DebugInfoForProfiling,
-        /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate);
+        llvm::PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, false,
+        /*PseudoProbeForProfiling=*/false, false);
   } else if (opts.hasProfileIRUse()) {
     llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem();
@@ -959,7 +945,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
     pgoOpt = llvm::PGOOptions(
         opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile,
         opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction,
-        ClPGOColdFuncAttr, opts.DebugInfoForProfiling);
+        llvm::ClPGOColdFuncAttr, false);
   }
   llvm::StandardInstrumentations si(llvmModule->getContext(),
diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index c51476e9ad3fe..6188c20cb29cf 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -13,9 +13,14 @@
 #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H
 #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H
+#include "llvm/ProfileData/InstrProfCorrelator.h"
+#include
 namespace llvm {
 class Triple;
 class TargetLibraryInfoImpl;
+extern llvm::cl::opt DebugInfoCorrelate;
+extern llvm::cl::opt
+    ProfileCorrelate;
 } // namespace llvm
 namespace llvm::driver {
@@ -35,7 +40,15 @@ enum class VectorLibrary {
 TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
                                   VectorLibrary Veclib);
-
+enum ProfileInstrKind {
+  ProfileNone,       // Profile instrumentation is turned off.
+  ProfileClangInstr, // Clang instrumentation to generate execution counts
+                     // to use with PGO.
+  ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
+  ProfileCSIRInstr,  // IR level PGO context sensitive instrumentation in LLVM.
+};
+// Default filename used for profile generation.
+std::string getDefaultProfileGenName();
 } // end namespace llvm::driver
 #endif
diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
index ed7c57a930aca..818dcd3752437 100644
--- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
+++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
@@ -8,7 +8,26 @@
 #include "llvm/Frontend/Driver/CodeGenOptions.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
+#include "llvm/Support/PGOOptions.h"
 #include "llvm/TargetParser/Triple.h"
+namespace llvm {
+// Experiment to mark cold functions as optsize/minsize/optnone.
+// TODO: remove once this is exposed as a proper driver flag.
+cl::opt ClPGOColdFuncAttr(
+    "pgo-cold-func-opt", cl::init(llvm::PGOOptions::ColdFuncOpt::Default),
+    cl::Hidden,
+    cl::desc(
+        "Function attribute to apply to cold functions as determined by PGO"),
+    cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default",
+                          "Default (no attribute)"),
+               clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize",
+                          "Mark cold functions with optsize."),
+               clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize",
+                          "Mark cold functions with minsize."),
+               clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone",
+                          "Mark cold functions with optnone.")));
+} // namespace llvm
 namespace llvm::driver {
@@ -56,4 +75,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
   return TLII;
 }
+std::string getDefaultProfileGenName() {
+  return llvm::DebugInfoCorrelate ||
+                 llvm::ProfileCorrelate != InstrProfCorrelator::NONE
+             ? "default_%m.proflite"
+             : "default_%m.profraw";
+}
 } // namespace llvm::driver

>From e53e689985088bbcdc253950a2ecc715592b5b3a Mon Sep 17 00:00:00 2001
From: fanju110
Date: Mon, 28 Apr 2025 21:49:36 +0800
Subject: [PATCH 06/13] Remove redundant function definitions

---
 flang/lib/Frontend/FrontendActions.cpp | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index c758aa18fbb8e..cdd2853bcd201 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -896,18 +896,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
   delete tlii;
 }
-// Default filename used for profile generation.
-namespace llvm {
-extern llvm::cl::opt DebugInfoCorrelate;
-extern llvm::cl::opt ProfileCorrelate;
-
-std::string getDefaultProfileGenName() {
-  return DebugInfoCorrelate ||
-                 ProfileCorrelate != llvm::InstrProfCorrelator::NONE
-             ? "default_%m.proflite"
-             : "default_%m.profraw";
-}
-} // namespace llvm
 void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   CompilerInstance &ci = getInstance();

>From 248175453354fecd078f5553576d16ce810e7808 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Wed, 30 Apr 2025 17:12:32 +0800
Subject: [PATCH 07/13] Move the interface to the cpp that uses it

---
 clang/include/clang/Basic/CodeGenOptions.def | 4 +-
 clang/include/clang/Basic/CodeGenOptions.h | 22 +++++----
 clang/include/clang/Basic/ProfileList.h | 9 ++--
 clang/lib/Basic/ProfileList.cpp | 20 ++++-----
 clang/lib/CodeGen/BackendUtil.cpp | 37 +++++++--------
 clang/lib/CodeGen/CodeGenAction.cpp | 4 +-
 clang/lib/CodeGen/CodeGenFunction.cpp | 3 +-
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 clang/lib/Frontend/CompilerInvocation.cpp | 6 +--
 .../include/flang/Frontend/CodeGenOptions.def | 8 ++--
 flang/include/flang/Frontend/CodeGenOptions.h | 28 +++++-------
 flang/lib/Frontend/CompilerInvocation.cpp | 11 ++---
 flang/lib/Frontend/FrontendActions.cpp | 45 ++++---------------
 .../llvm/Frontend/Driver/CodeGenOptions.h | 10 +++++
 llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 12 +++++
 15 files changed, 101 insertions(+), 120 deletions(-)

diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index a436c0ec98d5b..3dbadc85f23cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -223,9 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is
 CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling
 /// Choose profile instrumenation kind or no instrumentation.
-ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Choose profile kind for PGO use compilation.
-ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Partition functions into N groups and select only functions in group i to be
 /// instrumented. Selected group numbers can be 0 to N-1 inclusive.
 VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1)
diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h
index e39a73bdb13ac..e4a74cd2c12cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// Check if Clang profile instrumenation is on.
   bool hasProfileClangInstr() const {
-    return getProfileInstr() == ProfileClangInstr;
+    return getProfileInstr() ==
+           llvm::driver::ProfileInstrKind::ProfileClangInstr;
   }
   /// Check if IR level profile instrumentation is on.
bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; + return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if any form of instrumentation is on. - bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } + bool hasProfileInstr() const { + return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; + } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == ProfileClangInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + } /// Check if type and variable info should be emitted. 
bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0c5aa6de3f3c0 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + 
llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if (SCL->inSection(Section, "default", "allow")) @@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 7557cb8408921..592e3bbbcc1cf 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -124,17 +124,10 @@ namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? 
"default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -834,12 +827,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. - PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? 
PGOOptions::CSIRUse @@ -847,31 +840,31 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); + llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling - PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. 
if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4d29ceace646f..10f0a12773498 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 0154799498f5e..2e8e90493c370 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, // If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. 
if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index cfc5c069b0849..ece7e5cad774c 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 4dec86cd8f51b..68396c4c40711 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Choose profile instrumenation kind or no instrumentation. +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Whether emit extra debug info for sample pgo profile collection. 
-CODEGENOPT(DebugInfoForProfiling, 1, 0) -CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index c9577862df832..6a989f0c0775e 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. - }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fmemory-profile-use. std::string MemoryProfileUsePath; - unsigned int DebugInfoForProfiling; - - unsigned int AtomicProfileUpdate; - /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; @@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == llvm::driver::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. 
- bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } // Define accessors/mutators for code generation options of enumeration type. #define CODEGENOPT(Name, Bits, Default) diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b28c2c0047579..43c8f4d596903 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = - args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = - args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + 
opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); } if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); } diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 8d1ab670e4db4..a650f54620543 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -133,21 +133,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// -static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), - llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, - "default", "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, - "optsize", "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, - "minsize", "Mark cold functions with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, - "optnone", - "Mark cold functions with optnone."))); - bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -910,19 +895,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } -// Default filename used for profile generation. -namespace llvm { -extern llvm::cl::opt DebugInfoCorrelate; -extern llvm::cl::opt ProfileCorrelate; - -std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || - ProfileCorrelate != llvm::InstrProfCorrelator::NONE - ? 
"default_%m.proflite" - : "default_%m.profraw"; -} -} // namespace llvm - void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts(); @@ -943,13 +915,14 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (opts.hasProfileIRInstr()) { // -fprofile-generate. - pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + pgoOpt = llvm::PGOOptions(opts.InstrProfileOutput.empty() + ? llvm::driver::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, + llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, + llvm::PGOOptions::ColdFuncOpt::Default, false, + /*PseudoProbeForProfiling=*/false, false); } else if (opts.hasProfileIRUse()) { llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); @@ -959,7 +932,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { pgoOpt = llvm::PGOOptions( opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, - ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + llvm::PGOOptions::ColdFuncOpt::Default, false); } llvm::StandardInstrumentations si(llvmModule->getContext(), diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index c51476e9ad3fe..3eb03cc3064cf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,6 +13,7 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H +#include 
namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -36,6 +37,15 @@ enum class VectorLibrary { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); +enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. +}; +// Default filename used for profile generation. +std::string getDefaultProfileGenName(); } // end namespace llvm::driver #endif diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index ed7c57a930aca..14b6b89da8465 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -8,8 +8,14 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/TargetParser/Triple.h" +namespace llvm { +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt + ProfileCorrelate; +} // namespace llvm namespace llvm::driver { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, @@ -56,4 +62,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, return TLII; } +std::string getDefaultProfileGenName() { + return llvm::DebugInfoCorrelate || + llvm::ProfileCorrelate != InstrProfCorrelator::NONE + ? 
"default_%m.proflite" + : "default_%m.profraw"; +} } // namespace llvm::driver >From 70fea2265a374f59345691f4ad7653ef4f0b6aa6 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:25:15 +0800 Subject: [PATCH 08/13] Move the interface to the cpp that uses it --- clang/include/clang/CodeGen/BackendUtil.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h index d9abf7bf962d2..92e0d13bf25b6 100644 --- a/clang/include/clang/CodeGen/BackendUtil.h +++ b/clang/include/clang/CodeGen/BackendUtil.h @@ -8,8 +8,6 @@ #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/PGOOptions.h" #include "clang/Basic/LLVM.h" #include "llvm/IR/ModuleSummaryIndex.h" @@ -21,7 +19,6 @@ template class Expected; template class IntrusiveRefCntPtr; class Module; class MemoryBufferRef; -extern cl::opt ClPGOColdFuncAttr; namespace vfs { class FileSystem; } // namespace vfs >From 5705d5eff937ca18eb44bec28a967a8629f0c085 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:26:22 +0800 Subject: [PATCH 09/13] Move the interface to the cpp that uses it --- flang/include/flang/Frontend/FrontendActions.h | 5 ----- 1 file changed, 5 deletions(-) diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h index 9ba74a9dad9be..f9a45bd6c0a56 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -20,13 +20,8 @@ #include "mlir/IR/OwningOpRef.h" #include "llvm/ADT/StringRef.h" #include "llvm/IR/Module.h" -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/PGOOptions.h" #include -namespace llvm { -extern cl::opt ClPGOColdFuncAttr; -} // namespace llvm namespace Fortran::frontend { //===----------------------------------------------------------------------===// >From 
016aab17f4cc73416c6ebca61240f269aac837d2 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:34:00 +0800 Subject: [PATCH 10/13] Fill in the missing code --- clang/lib/CodeGen/BackendUtil.cpp | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 2d33edbb8430d..6eb3a8638b7d1 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -103,6 +103,21 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP( "sanitizer-early-opt-ep", cl::Optional, cl::desc("Insert sanitizers on OptimizerEarlyEP.")); +// Experiment to mark cold functions as optsize/minsize/optnone. +// TODO: remove once this is exposed as a proper driver flag. +static cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, + cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); + extern cl::opt ProfileCorrelate; } // namespace llvm namespace clang { >From f36bfcfbfdc87b896f41be1ba25d8c18c339f1c1 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Thu, 1 May 2025 23:18:34 +0800 Subject: [PATCH 11/13] Adjusting the format of the code --- flang/test/Profile/gcc-flag-compatibility.f90 | 7 ------- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 7 ++++--- 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 index 0124bc79b87ef..4490c45232d28 100644 --- a/flang/test/Profile/gcc-flag-compatibility.f90 +++ 
b/flang/test/Profile/gcc-flag-compatibility.f90 @@ -9,24 +9,17 @@ ! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section ! PROFILE-GEN: @__profd_{{_?}}main = - - ! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof ! This uses LLVM IR format profile. ! RUN: rm -rf %t.dir ! RUN: mkdir -p %t.dir/some/path ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s -! - - - ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s ! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} ! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} - program main implicit none integer :: i diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 3eb03cc3064cf..98b9e1554f317 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -14,6 +14,7 @@ #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #include + namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -34,9 +35,6 @@ enum class VectorLibrary { AMDLIBM // AMD vector math library. }; -TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, - VectorLibrary Veclib); - enum ProfileInstrKind { ProfileNone, // Profile instrumentation is turned off. ProfileClangInstr, // Clang instrumentation to generate execution counts @@ -44,6 +42,9 @@ enum ProfileInstrKind { ProfileIRInstr, // IR level PGO instrumentation in LLVM. ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
}; +TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, + VectorLibrary Veclib); + // Default filename used for profile generation. std::string getDefaultProfileGenName(); } // end namespace llvm::driver >From a5c7da77d2aa6909451bed3fb0f02c9b735dc876 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 2 May 2025 00:01:26 +0800 Subject: [PATCH 12/13] Adjusting the format of the code --- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 98b9e1554f317..84bba2a964ecf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -20,6 +20,7 @@ class Triple; class TargetLibraryInfoImpl; } // namespace llvm + namespace llvm::driver { /// Vector library option used with -fveclib= @@ -42,6 +43,7 @@ enum ProfileInstrKind { ProfileIRInstr, // IR level PGO instrumentation in LLVM. ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
}; + TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); >From a99e16b29d70d2fea6d16ec06e6ca55f477b74e9 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 2 May 2025 00:07:23 +0800 Subject: [PATCH 13/13] Adjusting the format of the code --- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 1 - llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 84bba2a964ecf..f0baa6fcdbbd3 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -20,7 +20,6 @@ class Triple; class TargetLibraryInfoImpl; } // namespace llvm - namespace llvm::driver { /// Vector library option used with -fveclib= diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index 14b6b89da8465..c48f5ed68b10b 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -16,6 +16,7 @@ extern llvm::cl::opt DebugInfoCorrelate; extern llvm::cl::opt ProfileCorrelate; } // namespace llvm + namespace llvm::driver { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, From flang-commits at lists.llvm.org Mon May 5 22:44:43 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 05 May 2025 22:44:43 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. (PR #138627) In-Reply-To: Message-ID: <6819a1cb.630a0220.279b8c.ff16@mx.google.com> https://github.com/NexMing ready_for_review https://github.com/llvm/llvm-project/pull/138627 From flang-commits at lists.llvm.org Mon May 5 22:46:58 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 05 May 2025 22:46:58 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. 
(PR #138627)
In-Reply-To:
Message-ID: <6819a252.630a0220.9cd89.e647@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-openmp

Author: MingYan (NexMing)
Changes

Currently, the FIR dialect is lowered directly to the LLVM dialect. We can first convert the FIR dialect to the Affine dialect, perform optimizations on top of it, and then lower it back to the FIR dialect. The optimization passes are still experimental, so it is important to actively identify and address issues.

---
Full diff: https://github.com/llvm/llvm-project/pull/138627.diff

6 Files Affected:

- (modified) flang/include/flang/Optimizer/Passes/CommandLineOpts.h (+1)
- (modified) flang/include/flang/Optimizer/Passes/Pipelines.h (+2-2)
- (modified) flang/lib/Optimizer/Passes/CMakeLists.txt (+1)
- (modified) flang/lib/Optimizer/Passes/CommandLineOpts.cpp (+1)
- (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+17)
- (added) flang/test/Lower/OpenMP/auto-omp.f90 (+52)

``````````diff
diff --git a/flang/include/flang/Optimizer/Passes/CommandLineOpts.h b/flang/include/flang/Optimizer/Passes/CommandLineOpts.h index 1cfaf285e75e6..320c561953213 100644 --- a/flang/include/flang/Optimizer/Passes/CommandLineOpts.h +++ b/flang/include/flang/Optimizer/Passes/CommandLineOpts.h @@ -42,6 +42,7 @@ extern llvm::cl::opt disableCfgConversion; extern llvm::cl::opt disableFirAvc; extern llvm::cl::opt disableFirMao; +extern llvm::cl::opt enableAffineOpt; extern llvm::cl::opt disableFirAliasTags; extern llvm::cl::opt useOldAliasTags; diff --git a/flang/include/flang/Optimizer/Passes/Pipelines.h b/flang/include/flang/Optimizer/Passes/Pipelines.h index a3f59ee8dd013..5c87b1ce609ef 100644 --- a/flang/include/flang/Optimizer/Passes/Pipelines.h +++ b/flang/include/flang/Optimizer/Passes/Pipelines.h @@ -18,8 +18,8 @@ #include "flang/Optimizer/Passes/CommandLineOpts.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Tools/CrossToolHelpers.h" -#include "mlir/Conversion/ReconcileUnrealizedCasts/ReconcileUnrealizedCasts.h" -#include "mlir/Conversion/SCFToControlFlow/SCFToControlFlow.h" +#include "mlir/Conversion/Passes.h" +#include "mlir/Dialect/Affine/Passes.h" #include
"mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/Dialect/LLVMIR/LLVMAttrs.h" #include "mlir/Pass/PassManager.h" diff --git a/flang/lib/Optimizer/Passes/CMakeLists.txt b/flang/lib/Optimizer/Passes/CMakeLists.txt index 1c19a5765aff1..ad6c714c28bec 100644 --- a/flang/lib/Optimizer/Passes/CMakeLists.txt +++ b/flang/lib/Optimizer/Passes/CMakeLists.txt @@ -21,6 +21,7 @@ add_flang_library(flangPasses MLIRPass MLIRReconcileUnrealizedCasts MLIRSCFToControlFlow + MLIRSCFToOpenMP MLIRSupport MLIRTransforms ) diff --git a/flang/lib/Optimizer/Passes/CommandLineOpts.cpp b/flang/lib/Optimizer/Passes/CommandLineOpts.cpp index f95a280883cba..b8ae6ede423e3 100644 --- a/flang/lib/Optimizer/Passes/CommandLineOpts.cpp +++ b/flang/lib/Optimizer/Passes/CommandLineOpts.cpp @@ -55,6 +55,7 @@ cl::opt useOldAliasTags( cl::desc("Use a single TBAA tree for all functions and do not use " "the FIR alias tags pass"), cl::init(false), cl::Hidden); +EnableOption(AffineOpt, "affine-opt", "affine optimization"); /// CodeGen Passes DisableOption(CodeGenRewrite, "codegen-rewrite", "rewrite FIR for codegen"); diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index a3ef473ea39b7..e1653cdb1e874 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -211,6 +211,23 @@ void createDefaultFIROptimizerPassPipeline(mlir::PassManager &pm, addNestedPassToAllTopLevelOperations( pm, fir::createStackReclaim); + + if (enableAffineOpt && pc.OptLevel.isOptimizingForSpeed()) { + pm.addPass(fir::createPromoteToAffinePass()); + pm.addPass(mlir::createCSEPass()); + pm.addPass(mlir::affine::createAffineLoopInvariantCodeMotionPass()); + pm.addPass(mlir::affine::createAffineLoopNormalizePass()); + pm.addPass(mlir::affine::createSimplifyAffineStructuresPass()); + pm.addPass(mlir::affine::createAffineParallelize( + mlir::affine::AffineParallelizeOptions{1, false})); + pm.addPass(fir::createAffineDemotionPass()); + 
pm.addPass(mlir::createLowerAffinePass()); + if (pc.EnableOpenMP) { + pm.addPass(mlir::createConvertSCFToOpenMPPass()); + pm.addPass(mlir::createCanonicalizerPass()); + } + } + // convert control flow to CFG form fir::addCfgConversionPass(pm, pc); pm.addPass(mlir::createSCFToControlFlowPass()); diff --git a/flang/test/Lower/OpenMP/auto-omp.f90 b/flang/test/Lower/OpenMP/auto-omp.f90 new file mode 100644 index 0000000000000..d66e6c3f3a3a0 --- /dev/null +++ b/flang/test/Lower/OpenMP/auto-omp.f90 @@ -0,0 +1,52 @@ +! RUN: %flang_fc1 -O1 -mllvm --enable-affine-opt -emit-llvm -fopenmp -o - %s \ +! RUN: | FileCheck %s + +subroutine foo(a) + integer, dimension(100, 100), intent(out) :: a + a = 1 +end subroutine foo + +!CHECK-LABEL: entry: +!CHECK: %[[VAL_0:.*]] = alloca { ptr }, align 8 +!CHECK: %[[VAL_1:.*]] = tail call i32 @__kmpc_global_thread_num(ptr nonnull @1) +!CHECK: store ptr %[[VAL_2:.*]], ptr %[[VAL_0]], align 8 +!CHECK: call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr nonnull @1, i32 1, ptr nonnull @foo_..omp_par, ptr nonnull %[[VAL_0]]) +!CHECK: ret void +!CHECK: omp.par.entry: +!CHECK: %[[VAL_3:.*]] = load ptr, ptr %[[VAL_4:.*]], align 8, !align !3 +!CHECK: %[[VAL_5:.*]] = alloca i32, align 4 +!CHECK: %[[VAL_6:.*]] = alloca i64, align 8 +!CHECK: %[[VAL_7:.*]] = alloca i64, align 8 +!CHECK: %[[VAL_8:.*]] = alloca i64, align 8 +!CHECK: store i64 0, ptr %[[VAL_6]], align 8 +!CHECK: store i64 99, ptr %[[VAL_7]], align 8 +!CHECK: store i64 1, ptr %[[VAL_8]], align 8 +!CHECK: %[[VAL_9:.*]] = tail call i32 @__kmpc_global_thread_num(ptr nonnull @1) +!CHECK: call void @__kmpc_for_static_init_8u(ptr nonnull @1, i32 %[[VAL_9]], i32 34, ptr nonnull %[[VAL_5]], ptr nonnull %[[VAL_6]], ptr nonnull %[[VAL_7]], ptr nonnull %[[VAL_8]], i64 1, i64 0) +!CHECK: %[[VAL_10:.*]] = load i64, ptr %[[VAL_6]], align 8 +!CHECK: %[[VAL_11:.*]] = load i64, ptr %[[VAL_7]], align 8 +!CHECK: %[[VAL_12:.*]] = sub i64 %[[VAL_11]], %[[VAL_10]] +!CHECK: %[[VAL_13:.*]] = icmp eq i64 
%[[VAL_12]], -1 +!CHECK: br i1 %[[VAL_13]], label %[[VAL_14:.*]], label %[[VAL_15:.*]] +!CHECK: omp_loop.exit: ; preds = %[[VAL_16:.*]], %[[VAL_17:.*]] +!CHECK: call void @__kmpc_for_static_fini(ptr nonnull @1, i32 %[[VAL_9]]) +!CHECK: %[[VAL_18:.*]] = call i32 @__kmpc_global_thread_num(ptr nonnull @1) +!CHECK: call void @__kmpc_barrier(ptr nonnull @2, i32 %[[VAL_18]]) +!CHECK: ret void +!CHECK: omp_loop.body: ; preds = %[[VAL_17]], %[[VAL_16]] +!CHECK: %[[VAL_19:.*]] = phi i64 [ %[[VAL_20:.*]], %[[VAL_16]] ], [ 0, %[[VAL_17]] ] +!CHECK: %[[VAL_21:.*]] = add i64 %[[VAL_19]], %[[VAL_10]] +!CHECK: %[[VAL_22:.*]] = mul i64 %[[VAL_21]], 400 +!CHECK: %[[VAL_23:.*]] = getelementptr i8, ptr %[[VAL_3]], i64 %[[VAL_22]] +!CHECK: br label %[[VAL_24:.*]] +!CHECK: omp_loop.inc: ; preds = %[[VAL_24]] +!CHECK: %[[VAL_20]] = add nuw i64 %[[VAL_19]], 1 +!CHECK: %[[VAL_25:.*]] = icmp eq i64 %[[VAL_19]], %[[VAL_12]] +!CHECK: br i1 %[[VAL_25]], label %[[VAL_14]], label %[[VAL_15]] +!CHECK: omp.loop_nest.region6: ; preds = %[[VAL_15]], %[[VAL_24]] +!CHECK: %[[VAL_26:.*]] = phi i64 [ 0, %[[VAL_15]] ], [ %[[VAL_27:.*]], %[[VAL_24]] ] +!CHECK: %[[VAL_28:.*]] = getelementptr i32, ptr %[[VAL_23]], i64 %[[VAL_26]] +!CHECK: store i32 1, ptr %[[VAL_28]], align 4, !tbaa !4 +!CHECK: %[[VAL_27]] = add nuw nsw i64 %[[VAL_26]], 1 +!CHECK: %[[VAL_29:.*]] = icmp eq i64 %[[VAL_27]], 100 +!CHECK: br i1 %[[VAL_29]], label %[[VAL_16]], label %[[VAL_24]] ``````````
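The gating logic in the `Pipelines.cpp` hunk of the diff above can be modeled without an MLIR build. The following is a minimal sketch under stated assumptions: plain strings stand in for the real passes, and `PipelineConfig` and `affinePassNames` are hypothetical names introduced only for illustration, not flang or MLIR APIs. It shows only the order in which the hunk registers passes and the two conditions that guard them (the `enableAffineOpt` option together with an optimize-for-speed level, and OpenMP being enabled).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the two fields of the pipeline configuration that
// the hunk consults (pc.OptLevel.isOptimizingForSpeed() and pc.EnableOpenMP).
struct PipelineConfig {
  bool optimizingForSpeed = false;
  bool enableOpenMP = false;
};

// Returns the names of the passes the hunk would add, in registration order.
// Strings replace real pass objects so the sketch compiles on its own.
std::vector<std::string> affinePassNames(bool enableAffineOpt,
                                         const PipelineConfig &pc) {
  std::vector<std::string> passes;

  // The whole block is skipped unless the option is set AND the optimization
  // level optimizes for speed.
  if (!(enableAffineOpt && pc.optimizingForSpeed))
    return passes;

  // FIR -> Affine, affine-level cleanups, then back to FIR and SCF.
  passes = {"PromoteToAffine",
            "CSE",
            "AffineLoopInvariantCodeMotion",
            "AffineLoopNormalize",
            "SimplifyAffineStructures",
            "AffineParallelize",
            "AffineDemotion",
            "LowerAffine"};

  // Invariant of this model: OpenMP conversion only ever follows LowerAffine.
  assert(passes.back() == "LowerAffine");

  // With OpenMP enabled, the parallel loops are converted to OpenMP and the
  // result is canonicalized.
  if (pc.enableOpenMP) {
    passes.push_back("ConvertSCFToOpenMP");
    passes.push_back("Canonicalizer");
  }
  return passes;
}
```

In this model, calling `affinePassNames(true, pc)` with both fields of `pc` set yields ten entries ending in `Canonicalizer`; clearing either the option or the speed flag yields an empty list, which mirrors why the test in the diff passes `-O1 -mllvm --enable-affine-opt -fopenmp` together.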
https://github.com/llvm/llvm-project/pull/138627

From flang-commits at lists.llvm.org Tue May 6 00:20:59 2025
From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits)
Date: Tue, 06 May 2025 00:20:59 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [Flang][OpenMP] Support for lowering of taskloop construct to MLIR (PR #138646)
Message-ID:

https://github.com/kaviya2510 created https://github.com/llvm/llvm-project/pull/138646

Added support for lowering of the taskloop construct and its clauses (Private and Firstprivate) to MLIR.

>From eb5a16e62bd5a372d52d4c8c7d97c3b2097f5807 Mon Sep 17 00:00:00 2001
From: Kaviya Rajendiran
Date: Tue, 6 May 2025 12:49:08 +0530
Subject: [PATCH] [Flang][OpenMP] Support for lowering of taskloop construct to MLIR

---
 flang/lib/Lower/OpenMP/OpenMP.cpp | 44 +++++++++++-
 .../Lower/OpenMP/Todo/taskloop-cancel.f90 | 14 ----
 flang/test/Lower/OpenMP/Todo/taskloop.f90 | 13 ----
 flang/test/Lower/OpenMP/masked_taskloop.f90 | 55 ++++++++++++++
 flang/test/Lower/OpenMP/master_taskloop.f90 | 14 ----
 .../Lower/OpenMP/parallel-masked-taskloop.f90 | 48 +++++++++++++
 .../Lower/OpenMP/parallel-master-taskloop.f90 | 14 ----
 flang/test/Lower/OpenMP/taskloop-cancel.f90 | 37 ++++++++++
 flang/test/Lower/OpenMP/taskloop.f90 | 72 +++++++++++++++++++
 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | 8 +--
 10 files changed, 258 insertions(+), 61 deletions(-)
 delete mode 100644 flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90
 delete mode 100644 flang/test/Lower/OpenMP/Todo/taskloop.f90
 create mode 100644 flang/test/Lower/OpenMP/masked_taskloop.f90
 delete mode 100644 flang/test/Lower/OpenMP/master_taskloop.f90
 create mode 100644 flang/test/Lower/OpenMP/parallel-masked-taskloop.f90
 delete mode 100644 flang/test/Lower/OpenMP/parallel-master-taskloop.f90
 create mode 100644 flang/test/Lower/OpenMP/taskloop-cancel.f90
 create mode 100644 flang/test/Lower/OpenMP/taskloop.f90

diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index
47e7c266ff7d3..5d545f40c7ccb 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1804,6 +1804,22 @@ static void genTaskgroupClauses(lower::AbstractConverter &converter, llvm::omp::Directive::OMPD_taskgroup); } +static void genTaskloopClauses(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + lower::StatementContext &stmtCtx, + const List &clauses, mlir::Location loc, + mlir::omp::TaskloopOperands &clauseOps) { + + ClauseProcessor cp(converter, semaCtx, clauses); + + cp.processTODO( + loc, llvm::omp::Directive::OMPD_taskloop); +} + static void genTaskwaitClauses(lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, const List &clauses, mlir::Location loc, @@ -3256,8 +3272,32 @@ static mlir::omp::TaskloopOp genStandaloneTaskloop( semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, mlir::Location loc, const ConstructQueue &queue, ConstructQueue::const_iterator item) { - TODO(loc, "Taskloop construct"); - return nullptr; + mlir::omp::TaskloopOperands taskloopClauseOps; + lower::StatementContext stmtCtx; + genTaskloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc, + taskloopClauseOps); + + DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval, + /*shouldCollectPreDeterminedSymbols=*/true, + enableDelayedPrivatization, symTable); + dsp.processStep1(&taskloopClauseOps); + + mlir::omp::LoopNestOperands loopNestClauseOps; + llvm::SmallVector iv; + genLoopNestClauses(converter, semaCtx, eval, item->clauses, loc, + loopNestClauseOps, iv); + + EntryBlockArgs taskloopArgs; + taskloopArgs.priv.syms = dsp.getDelayedPrivSymbols(); + taskloopArgs.priv.vars = taskloopClauseOps.privateVars; + + auto taskLoopOp = genWrapperOp( + converter, loc, taskloopClauseOps, taskloopArgs); + + genLoopNestOp(converter, symTable, semaCtx, eval, loc, queue, item, + loopNestClauseOps, iv, {{taskLoopOp, taskloopArgs}}, + llvm::omp::Directive::OMPD_taskloop, dsp); + return 
taskLoopOp; } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 b/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 deleted file mode 100644 index 5045c621e4d77..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s - -! CHECK: not yet implemented: Taskloop construct -subroutine omp_taskloop -integer :: i -!$omp parallel - !$omp taskloop - do i = 1, 10 - !$omp cancel taskgroup - end do - !$omp end taskloop -!$omp end parallel -end subroutine omp_taskloop diff --git a/flang/test/Lower/OpenMP/Todo/taskloop.f90 b/flang/test/Lower/OpenMP/Todo/taskloop.f90 deleted file mode 100644 index aca050584cbbe..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/taskloop.f90 +++ /dev/null @@ -1,13 +0,0 @@ -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s - -! CHECK: not yet implemented: Taskloop construct -subroutine omp_taskloop - integer :: res, i - !$omp taskloop - do i = 1, 10 - res = res + 1 - end do - !$omp end taskloop -end subroutine omp_taskloop - diff --git a/flang/test/Lower/OpenMP/masked_taskloop.f90 b/flang/test/Lower/OpenMP/masked_taskloop.f90 new file mode 100644 index 0000000000000..abe20ec1fd87c --- /dev/null +++ b/flang/test/Lower/OpenMP/masked_taskloop.f90 @@ -0,0 +1,55 @@ +! This test checks lowering of OpenMP masked taskloop Directive. + +! RUN: bbc -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! 
CHECK-SAME: {type = firstprivate} @[[J_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_masked_taskloop() { +! CHECK: %[[VAL_0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_masked_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] +! CHECK-SAME: {uniq_name = "_QFtest_masked_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_J:.*]] = fir.address_of(@_QFtest_masked_taskloopEj) : !fir.ref +! CHECK: %[[DECL_J:.*]]:2 = hlfir.declare %[[ALLOCA_J]] {uniq_name = "_QFtest_masked_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.masked { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private( +! CHECK-SAME: @[[J_FIRSTPRIVATE]] %[[DECL_J]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { +! CHECK: omp.loop_nest (%arg2) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFtest_masked_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL2:.*]]:2 = hlfir.declare %[[ARG1]] {uniq_name = "_QFtest_masked_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %arg2 to %[[VAL2]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_J:.*]] = fir.load %[[VAL1]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! CHECK: %[[RES_J:.*]] = arith.addi %[[LOAD_J]], %[[C1_I32_1]] : i32 +! CHECK: hlfir.assign %[[RES_J]] to %[[VAL1]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } +! CHECK: fir.global internal @_QFtest_masked_taskloopEj : i32 { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: fir.has_value %[[C1_I32]] : i32 +! 
CHECK: } + + +subroutine test_masked_taskloop + integer :: i, j = 1 + !$omp masked taskloop + do i=1,10 + j = j + 1 + end do + !$omp end masked taskloop +end subroutine diff --git a/flang/test/Lower/OpenMP/master_taskloop.f90 b/flang/test/Lower/OpenMP/master_taskloop.f90 deleted file mode 100644 index 26f664b2662dc..0000000000000 --- a/flang/test/Lower/OpenMP/master_taskloop.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! This test checks lowering of OpenMP master taskloop Directive. - -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s - -subroutine test_master_taskloop - integer :: i, j = 1 - !CHECK: not yet implemented: Taskloop construct - !$omp master taskloop - do i=1,10 - j = j + 1 - end do - !$omp end master taskloop -end subroutine diff --git a/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 b/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 new file mode 100644 index 0000000000000..497cc396a5a02 --- /dev/null +++ b/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 @@ -0,0 +1,48 @@ +! This test checks lowering of OpenMP parallel masked taskloop Directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 +! CHECK-LABEL: func.func @_QPtest_parallel_master_taskloop() { +! CHECK: %[[VAL0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_parallel_master_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_parallel_master_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ADDR_J:.*]] = fir.address_of(@_QFtest_parallel_master_taskloopEj) : !fir.ref +! 
CHECK: %[[DECL_J:.*]]:2 = hlfir.declare %[[ADDR_J]] {uniq_name = "_QFtest_parallel_master_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.parallel { +! CHECK: omp.masked { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG0:.*]] : !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG1:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%c1_i32_0) { +! CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFtest_parallel_master_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG1]] to %[[VAL1]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_J:.*]] = fir.load %[[DECL_J]]#0 : !fir.ref +! CHECK: %c1_i32_1 = arith.constant 1 : i32 +! CHECK: %[[RES_ADD:.*]] = arith.addi %[[LOAD_J]], %c1_i32_1 : i32 +! CHECK: hlfir.assign %[[RES_ADD]] to %[[DECL_J]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } +! CHECK: fir.global internal @_QFtest_parallel_master_taskloopEj : i32 { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: fir.has_value %[[C1_I32]] : i32 +! CHECK: } + +subroutine test_parallel_master_taskloop + integer :: i, j = 1 + !$omp parallel masked taskloop + do i=1,10 + j = j + 1 + end do + !$omp end parallel masked taskloop +end subroutine diff --git a/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 b/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 deleted file mode 100644 index 17ceb9496c8d3..0000000000000 --- a/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! This test checks lowering of OpenMP parallel master taskloop Directive. - -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s -! 
RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s - -subroutine test_parallel_master_taskloop - integer :: i, j = 1 - !CHECK: not yet implemented: Taskloop construct - !$omp parallel master taskloop - do i=1,10 - j = j + 1 - end do - !$omp end parallel master taskloop -end subroutine diff --git a/flang/test/Lower/OpenMP/taskloop-cancel.f90 b/flang/test/Lower/OpenMP/taskloop-cancel.f90 new file mode 100644 index 0000000000000..2bc0f17428c36 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-cancel.f90 @@ -0,0 +1,37 @@ +! RUN: bbc -emit-hlfir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: func.func @_QPomp_taskloop() { +! CHECK: %[[VAL_0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %1 {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.parallel { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[I_PRIVATE]] %2#0 -> %[[ARG0:.*]] : !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG1:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! CHECK: %[[IDX:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG1]] to %[[IDX]]#0 : i32, !fir.ref +! CHECK: omp.cancel cancellation_construct_type(taskgroup) +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! 
CHECK: } + +subroutine omp_taskloop +integer :: i +!$omp parallel + !$omp taskloop + do i = 1, 10 + !$omp cancel taskgroup + end do + !$omp end taskloop +!$omp end parallel +end subroutine omp_taskloop diff --git a/flang/test/Lower/OpenMP/taskloop.f90 b/flang/test/Lower/OpenMP/taskloop.f90 new file mode 100644 index 0000000000000..10689a34c7efb --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop.f90 @@ -0,0 +1,72 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[RES_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[RES_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPomp_taskloop +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloopEi"} +! CHECK: %[[I_VAL:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_RES:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskloopEres"} +! CHECK: %[[RES_VAL:.*]]:2 = hlfir.declare %[[ALLOCA_RES]] {uniq_name = "_QFomp_taskloopEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[RES_FIRSTPRIVATE]] %[[RES_VAL]]#0 -> %[[PRIV_RES:.*]], @[[I_PRIVATE]] %[[I_VAL]]#0 -> %[[PRIV_I:.*]] : !fir.ref, !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! 
CHECK: %[[RES_DECL:.*]]:2 = hlfir.declare %[[PRIV_RES]] {uniq_name = "_QFomp_taskloopEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[I_DECL:.*]]:2 = hlfir.declare %[[PRIV_I]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG2]] to %[[I_DECL]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_RES:.*]] = fir.load %[[RES_DECL]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! CHECK: %[[OUT_VAL:.*]] = arith.addi %[[LOAD_RES]], %[[C1_I32_1]] : i32 +! CHECK: hlfir.assign %[[OUT_VAL]] to %[[RES_DECL]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: return +! CHECK: } + +subroutine omp_taskloop + integer :: res, i + !$omp taskloop + do i = 1, 10 + res = res + 1 + end do + !$omp end taskloop +end subroutine omp_taskloop + + +! CHECK-LABEL: func.func @_QPomp_taskloop_private +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloop_privateEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFomp_taskloop_privateEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_RES:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskloop_privateEres"} +! CHECK: %[[DECL_RES:.*]]:2 = hlfir.declare %[[ALLOCA_RES]] {uniq_name = "_QFomp_taskloop_privateEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine omp_taskloop_private + integer :: res, i +! CHECK: omp.taskloop private(@[[RES_PRIVATE_TEST2]] %[[DECL_RES]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE_TEST2]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { +! CHECK: omp.loop_nest (%{{.*}}) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { +! CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFomp_taskloop_privateEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) + !$omp taskloop private(res) + do i = 1, 10 +! CHECK: %[[LOAD_RES:.*]] = fir.load %[[VAL1]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! 
CHECK: %[[ADD_VAL:.*]] = arith.addi %[[LOAD_RES]], %[[C1_I32_1]] : i32
+! CHECK: hlfir.assign %[[ADD_VAL]] to %[[VAL1]]#0 : i32, !fir.ref
+    res = res + 1
+  end do
+! CHECK: return
+! CHECK: }
+  !$omp end taskloop
+end subroutine omp_taskloop_private
\ No newline at end of file
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index 0f65ace0186db..4f1bea24bf6ac 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -2785,16 +2785,16 @@ LogicalResult TaskgroupOp::verify() {
 void TaskloopOp::build(OpBuilder &builder, OperationState &state,
                        const TaskloopOperands &clauses) {
   MLIRContext *ctx = builder.getContext();
-  // TODO Store clauses in op: privateVars, privateSyms.
   TaskloopOp::build(builder, state, clauses.allocateVars, clauses.allocatorVars,
                     clauses.final, clauses.grainsizeMod, clauses.grainsize,
                     clauses.ifExpr, clauses.inReductionVars,
                     makeDenseBoolArrayAttr(ctx, clauses.inReductionByref),
                     makeArrayAttr(ctx, clauses.inReductionSyms),
                     clauses.mergeable, clauses.nogroup, clauses.numTasksMod,
-                    clauses.numTasks, clauses.priority, /*private_vars=*/{},
-                    /*private_syms=*/nullptr, clauses.reductionMod,
-                    clauses.reductionVars,
+                    clauses.numTasks, clauses.priority,
+                    /*private_vars=*/clauses.privateVars,
+                    /*private_syms=*/makeArrayAttr(ctx, clauses.privateSyms),
+                    clauses.reductionMod, clauses.reductionVars,
                     makeDenseBoolArrayAttr(ctx, clauses.reductionByref),
                     makeArrayAttr(ctx, clauses.reductionSyms), clauses.untied);
 }

From flang-commits at lists.llvm.org Tue May 6 00:21:31 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 06 May 2025 00:21:31 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [Flang][OpenMP] Support for lowering of taskloop construct to MLIR (PR #138646)
In-Reply-To:
Message-ID: <6819b87b.050a0220.2064ba.5e32@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-fir-hlfir

Author: Kaviya Rajendiran (kaviya2510)
Changes

Added support for lowering the taskloop construct and its clauses (private and firstprivate) to MLIR.

---

Full diff: https://github.com/llvm/llvm-project/pull/138646.diff

10 Files Affected:

- (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+42-2)
- (removed) flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 (-14)
- (removed) flang/test/Lower/OpenMP/Todo/taskloop.f90 (-13)
- (added) flang/test/Lower/OpenMP/masked_taskloop.f90 (+55)
- (removed) flang/test/Lower/OpenMP/master_taskloop.f90 (-14)
- (added) flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 (+48)
- (removed) flang/test/Lower/OpenMP/parallel-master-taskloop.f90 (-14)
- (added) flang/test/Lower/OpenMP/taskloop-cancel.f90 (+37)
- (added) flang/test/Lower/OpenMP/taskloop.f90 (+72)
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+4-4)

``````````diff
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 47e7c266ff7d3..5d545f40c7ccb 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -1804,6 +1804,22 @@ static void genTaskgroupClauses(lower::AbstractConverter &converter,
       llvm::omp::Directive::OMPD_taskgroup);
 }
 
+static void genTaskloopClauses(lower::AbstractConverter &converter,
+                               semantics::SemanticsContext &semaCtx,
+                               lower::StatementContext &stmtCtx,
+                               const List &clauses, mlir::Location loc,
+                               mlir::omp::TaskloopOperands &clauseOps) {
+
+  ClauseProcessor cp(converter, semaCtx, clauses);
+
+  cp.processTODO(
+      loc, llvm::omp::Directive::OMPD_taskloop);
+}
+
 static void genTaskwaitClauses(lower::AbstractConverter &converter,
                                semantics::SemanticsContext &semaCtx,
                                const List &clauses, mlir::Location loc,
@@ -3256,8 +3272,32 @@ static mlir::omp::TaskloopOp genStandaloneTaskloop(
     semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval,
     mlir::Location loc, const ConstructQueue &queue,
     ConstructQueue::const_iterator item) {
-  TODO(loc, "Taskloop construct");
-  return nullptr;
+  mlir::omp::TaskloopOperands
taskloopClauseOps; + lower::StatementContext stmtCtx; + genTaskloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc, + taskloopClauseOps); + + DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval, + /*shouldCollectPreDeterminedSymbols=*/true, + enableDelayedPrivatization, symTable); + dsp.processStep1(&taskloopClauseOps); + + mlir::omp::LoopNestOperands loopNestClauseOps; + llvm::SmallVector iv; + genLoopNestClauses(converter, semaCtx, eval, item->clauses, loc, + loopNestClauseOps, iv); + + EntryBlockArgs taskloopArgs; + taskloopArgs.priv.syms = dsp.getDelayedPrivSymbols(); + taskloopArgs.priv.vars = taskloopClauseOps.privateVars; + + auto taskLoopOp = genWrapperOp( + converter, loc, taskloopClauseOps, taskloopArgs); + + genLoopNestOp(converter, symTable, semaCtx, eval, loc, queue, item, + loopNestClauseOps, iv, {{taskLoopOp, taskloopArgs}}, + llvm::omp::Directive::OMPD_taskloop, dsp); + return taskLoopOp; } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 b/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 deleted file mode 100644 index 5045c621e4d77..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s - -! CHECK: not yet implemented: Taskloop construct -subroutine omp_taskloop -integer :: i -!$omp parallel - !$omp taskloop - do i = 1, 10 - !$omp cancel taskgroup - end do - !$omp end taskloop -!$omp end parallel -end subroutine omp_taskloop diff --git a/flang/test/Lower/OpenMP/Todo/taskloop.f90 b/flang/test/Lower/OpenMP/Todo/taskloop.f90 deleted file mode 100644 index aca050584cbbe..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/taskloop.f90 +++ /dev/null @@ -1,13 +0,0 @@ -! 
RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s - -! CHECK: not yet implemented: Taskloop construct -subroutine omp_taskloop - integer :: res, i - !$omp taskloop - do i = 1, 10 - res = res + 1 - end do - !$omp end taskloop -end subroutine omp_taskloop - diff --git a/flang/test/Lower/OpenMP/masked_taskloop.f90 b/flang/test/Lower/OpenMP/masked_taskloop.f90 new file mode 100644 index 0000000000000..abe20ec1fd87c --- /dev/null +++ b/flang/test/Lower/OpenMP/masked_taskloop.f90 @@ -0,0 +1,55 @@ +! This test checks lowering of OpenMP masked taskloop Directive. + +! RUN: bbc -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[J_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_masked_taskloop() { +! CHECK: %[[VAL_0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_masked_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] +! CHECK-SAME: {uniq_name = "_QFtest_masked_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_J:.*]] = fir.address_of(@_QFtest_masked_taskloopEj) : !fir.ref +! CHECK: %[[DECL_J:.*]]:2 = hlfir.declare %[[ALLOCA_J]] {uniq_name = "_QFtest_masked_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.masked { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private( +! CHECK-SAME: @[[J_FIRSTPRIVATE]] %[[DECL_J]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { +! 
CHECK: omp.loop_nest (%arg2) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFtest_masked_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL2:.*]]:2 = hlfir.declare %[[ARG1]] {uniq_name = "_QFtest_masked_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %arg2 to %[[VAL2]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_J:.*]] = fir.load %[[VAL1]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! CHECK: %[[RES_J:.*]] = arith.addi %[[LOAD_J]], %[[C1_I32_1]] : i32 +! CHECK: hlfir.assign %[[RES_J]] to %[[VAL1]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } +! CHECK: fir.global internal @_QFtest_masked_taskloopEj : i32 { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: fir.has_value %[[C1_I32]] : i32 +! CHECK: } + + +subroutine test_masked_taskloop + integer :: i, j = 1 + !$omp masked taskloop + do i=1,10 + j = j + 1 + end do + !$omp end masked taskloop +end subroutine diff --git a/flang/test/Lower/OpenMP/master_taskloop.f90 b/flang/test/Lower/OpenMP/master_taskloop.f90 deleted file mode 100644 index 26f664b2662dc..0000000000000 --- a/flang/test/Lower/OpenMP/master_taskloop.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! This test checks lowering of OpenMP master taskloop Directive. - -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s -! 
RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s - -subroutine test_master_taskloop - integer :: i, j = 1 - !CHECK: not yet implemented: Taskloop construct - !$omp master taskloop - do i=1,10 - j = j + 1 - end do - !$omp end master taskloop -end subroutine diff --git a/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 b/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 new file mode 100644 index 0000000000000..497cc396a5a02 --- /dev/null +++ b/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 @@ -0,0 +1,48 @@ +! This test checks lowering of OpenMP parallel masked taskloop Directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 +! CHECK-LABEL: func.func @_QPtest_parallel_master_taskloop() { +! CHECK: %[[VAL0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_parallel_master_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_parallel_master_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ADDR_J:.*]] = fir.address_of(@_QFtest_parallel_master_taskloopEj) : !fir.ref +! CHECK: %[[DECL_J:.*]]:2 = hlfir.declare %[[ADDR_J]] {uniq_name = "_QFtest_parallel_master_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.parallel { +! CHECK: omp.masked { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG0:.*]] : !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG1:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%c1_i32_0) { +! 
CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFtest_parallel_master_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG1]] to %[[VAL1]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_J:.*]] = fir.load %[[DECL_J]]#0 : !fir.ref +! CHECK: %c1_i32_1 = arith.constant 1 : i32 +! CHECK: %[[RES_ADD:.*]] = arith.addi %[[LOAD_J]], %c1_i32_1 : i32 +! CHECK: hlfir.assign %[[RES_ADD]] to %[[DECL_J]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } +! CHECK: fir.global internal @_QFtest_parallel_master_taskloopEj : i32 { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: fir.has_value %[[C1_I32]] : i32 +! CHECK: } + +subroutine test_parallel_master_taskloop + integer :: i, j = 1 + !$omp parallel masked taskloop + do i=1,10 + j = j + 1 + end do + !$omp end parallel masked taskloop +end subroutine diff --git a/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 b/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 deleted file mode 100644 index 17ceb9496c8d3..0000000000000 --- a/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! This test checks lowering of OpenMP parallel master taskloop Directive. - -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s - -subroutine test_parallel_master_taskloop - integer :: i, j = 1 - !CHECK: not yet implemented: Taskloop construct - !$omp parallel master taskloop - do i=1,10 - j = j + 1 - end do - !$omp end parallel master taskloop -end subroutine diff --git a/flang/test/Lower/OpenMP/taskloop-cancel.f90 b/flang/test/Lower/OpenMP/taskloop-cancel.f90 new file mode 100644 index 0000000000000..2bc0f17428c36 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-cancel.f90 @@ -0,0 +1,37 @@ +! 
RUN: bbc -emit-hlfir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: func.func @_QPomp_taskloop() { +! CHECK: %[[VAL_0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %1 {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.parallel { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[I_PRIVATE]] %2#0 -> %[[ARG0:.*]] : !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG1:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! CHECK: %[[IDX:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG1]] to %[[IDX]]#0 : i32, !fir.ref +! CHECK: omp.cancel cancellation_construct_type(taskgroup) +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } + +subroutine omp_taskloop +integer :: i +!$omp parallel + !$omp taskloop + do i = 1, 10 + !$omp cancel taskgroup + end do + !$omp end taskloop +!$omp end parallel +end subroutine omp_taskloop diff --git a/flang/test/Lower/OpenMP/taskloop.f90 b/flang/test/Lower/OpenMP/taskloop.f90 new file mode 100644 index 0000000000000..10689a34c7efb --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop.f90 @@ -0,0 +1,72 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! 
CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[RES_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[RES_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPomp_taskloop +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloopEi"} +! CHECK: %[[I_VAL:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_RES:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskloopEres"} +! CHECK: %[[RES_VAL:.*]]:2 = hlfir.declare %[[ALLOCA_RES]] {uniq_name = "_QFomp_taskloopEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[RES_FIRSTPRIVATE]] %[[RES_VAL]]#0 -> %[[PRIV_RES:.*]], @[[I_PRIVATE]] %[[I_VAL]]#0 -> %[[PRIV_I:.*]] : !fir.ref, !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! CHECK: %[[RES_DECL:.*]]:2 = hlfir.declare %[[PRIV_RES]] {uniq_name = "_QFomp_taskloopEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[I_DECL:.*]]:2 = hlfir.declare %[[PRIV_I]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG2]] to %[[I_DECL]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_RES:.*]] = fir.load %[[RES_DECL]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! CHECK: %[[OUT_VAL:.*]] = arith.addi %[[LOAD_RES]], %[[C1_I32_1]] : i32 +! CHECK: hlfir.assign %[[OUT_VAL]] to %[[RES_DECL]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: return +! 
CHECK: } + +subroutine omp_taskloop + integer :: res, i + !$omp taskloop + do i = 1, 10 + res = res + 1 + end do + !$omp end taskloop +end subroutine omp_taskloop + + +! CHECK-LABEL: func.func @_QPomp_taskloop_private +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloop_privateEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFomp_taskloop_privateEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_RES:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskloop_privateEres"} +! CHECK: %[[DECL_RES:.*]]:2 = hlfir.declare %[[ALLOCA_RES]] {uniq_name = "_QFomp_taskloop_privateEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine omp_taskloop_private + integer :: res, i +! CHECK: omp.taskloop private(@[[RES_PRIVATE_TEST2]] %[[DECL_RES]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE_TEST2]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { +! CHECK: omp.loop_nest (%{{.*}}) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { +! CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFomp_taskloop_privateEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) + !$omp taskloop private(res) + do i = 1, 10 +! CHECK: %[[LOAD_RES:.*]] = fir.load %[[VAL1]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! CHECK: %[[ADD_VAL:.*]] = arith.addi %[[LOAD_RES]], %[[C1_I32_1]] : i32 +! CHECK: hlfir.assign %[[ADD_VAL]] to %[[VAL1]]#0 : i32, !fir.ref + res = res + 1 + end do +! CHECK: return +! 
CHECK: }
+  !$omp end taskloop
+end subroutine omp_taskloop_private
\ No newline at end of file
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index 0f65ace0186db..4f1bea24bf6ac 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -2785,16 +2785,16 @@ LogicalResult TaskgroupOp::verify() {
 void TaskloopOp::build(OpBuilder &builder, OperationState &state,
                        const TaskloopOperands &clauses) {
   MLIRContext *ctx = builder.getContext();
-  // TODO Store clauses in op: privateVars, privateSyms.
   TaskloopOp::build(builder, state, clauses.allocateVars, clauses.allocatorVars,
                     clauses.final, clauses.grainsizeMod, clauses.grainsize,
                     clauses.ifExpr, clauses.inReductionVars,
                     makeDenseBoolArrayAttr(ctx, clauses.inReductionByref),
                     makeArrayAttr(ctx, clauses.inReductionSyms),
                     clauses.mergeable, clauses.nogroup, clauses.numTasksMod,
-                    clauses.numTasks, clauses.priority, /*private_vars=*/{},
-                    /*private_syms=*/nullptr, clauses.reductionMod,
-                    clauses.reductionVars,
+                    clauses.numTasks, clauses.priority,
+                    /*private_vars=*/clauses.privateVars,
+                    /*private_syms=*/makeArrayAttr(ctx, clauses.privateSyms),
+                    clauses.reductionMod, clauses.reductionVars,
                     makeDenseBoolArrayAttr(ctx, clauses.reductionByref),
                     makeArrayAttr(ctx, clauses.reductionSyms), clauses.untied);
 }
``````````
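
As a rough illustration of the last hunk above (the `TaskloopOp::build` change), the sketch below models the difference in plain C++. It is not MLIR code: `ToyTaskloopOperands`, `ToyTaskloopOp`, `buildBefore` and `buildAfter` are hypothetical names invented for this example, and strings stand in for MLIR values and symbol references. The only point it tries to make is that the builder previously discarded the privatization operands and now forwards them to the operation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for mlir::omp::TaskloopOperands. Only the two
// privatization fields are modeled, and plain strings replace MLIR values
// and symbol references.
struct ToyTaskloopOperands {
  std::vector<std::string> privateVars; // variables being privatized
  std::vector<std::string> privateSyms; // names of the privatizer symbols
};

// Hypothetical stand-in for the built omp.taskloop operation.
struct ToyTaskloopOp {
  std::vector<std::string> privateVars;
  std::vector<std::string> privateSyms;
};

// Models the builder before the patch: the privatization operands are
// ignored, so the resulting op carries none of them.
ToyTaskloopOp buildBefore(const ToyTaskloopOperands &) {
  return ToyTaskloopOp{{}, {}};
}

// Models the builder after the patch: both lists are forwarded unchanged.
ToyTaskloopOp buildAfter(const ToyTaskloopOperands &clauses) {
  // Assumption made for this sketch: one privatizer symbol per variable.
  assert(clauses.privateVars.size() == clauses.privateSyms.size());
  return ToyTaskloopOp{clauses.privateVars, clauses.privateSyms};
}
```

Given operands with two privatized variables, `buildBefore` returns an op with empty lists while `buildAfter` returns an op that still carries both entries, which is what the new `omp.taskloop private(...)` CHECK lines in the tests rely on.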
https://github.com/llvm/llvm-project/pull/138646

From flang-commits at lists.llvm.org Tue May 6 00:21:31 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 06 May 2025 00:21:31 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [Flang][OpenMP] Support for lowering of taskloop construct to MLIR (PR #138646)
In-Reply-To:
Message-ID: <6819b87b.170a0220.3069f5.c13c@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-openmp

Author: Kaviya Rajendiran (kaviya2510)
Changes
Added support for lowering the taskloop construct and its private and firstprivate clauses to MLIR.
---
Full diff: https://github.com/llvm/llvm-project/pull/138646.diff
10 Files Affected:
- (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+42-2)
- (removed) flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 (-14)
- (removed) flang/test/Lower/OpenMP/Todo/taskloop.f90 (-13)
- (added) flang/test/Lower/OpenMP/masked_taskloop.f90 (+55)
- (removed) flang/test/Lower/OpenMP/master_taskloop.f90 (-14)
- (added) flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 (+48)
- (removed) flang/test/Lower/OpenMP/parallel-master-taskloop.f90 (-14)
- (added) flang/test/Lower/OpenMP/taskloop-cancel.f90 (+37)
- (added) flang/test/Lower/OpenMP/taskloop.f90 (+72)
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+4-4)
``````````diff
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 47e7c266ff7d3..5d545f40c7ccb 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1804,6 +1804,22 @@ static void genTaskgroupClauses(lower::AbstractConverter &converter, llvm::omp::Directive::OMPD_taskgroup); } +static void genTaskloopClauses(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + lower::StatementContext &stmtCtx, + const List &clauses, mlir::Location loc, + mlir::omp::TaskloopOperands &clauseOps) { + + ClauseProcessor cp(converter, semaCtx, clauses); + + cp.processTODO( + loc, llvm::omp::Directive::OMPD_taskloop); +} + static void genTaskwaitClauses(lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, const List &clauses, mlir::Location loc, @@ -3256,8 +3272,32 @@ static mlir::omp::TaskloopOp genStandaloneTaskloop( semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, mlir::Location loc, const ConstructQueue &queue, ConstructQueue::const_iterator item) { - TODO(loc, "Taskloop construct"); - return nullptr; + mlir::omp::TaskloopOperands
taskloopClauseOps; + lower::StatementContext stmtCtx; + genTaskloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc, + taskloopClauseOps); + + DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval, + /*shouldCollectPreDeterminedSymbols=*/true, + enableDelayedPrivatization, symTable); + dsp.processStep1(&taskloopClauseOps); + + mlir::omp::LoopNestOperands loopNestClauseOps; + llvm::SmallVector iv; + genLoopNestClauses(converter, semaCtx, eval, item->clauses, loc, + loopNestClauseOps, iv); + + EntryBlockArgs taskloopArgs; + taskloopArgs.priv.syms = dsp.getDelayedPrivSymbols(); + taskloopArgs.priv.vars = taskloopClauseOps.privateVars; + + auto taskLoopOp = genWrapperOp( + converter, loc, taskloopClauseOps, taskloopArgs); + + genLoopNestOp(converter, symTable, semaCtx, eval, loc, queue, item, + loopNestClauseOps, iv, {{taskLoopOp, taskloopArgs}}, + llvm::omp::Directive::OMPD_taskloop, dsp); + return taskLoopOp; } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 b/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 deleted file mode 100644 index 5045c621e4d77..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s - -! CHECK: not yet implemented: Taskloop construct -subroutine omp_taskloop -integer :: i -!$omp parallel - !$omp taskloop - do i = 1, 10 - !$omp cancel taskgroup - end do - !$omp end taskloop -!$omp end parallel -end subroutine omp_taskloop diff --git a/flang/test/Lower/OpenMP/Todo/taskloop.f90 b/flang/test/Lower/OpenMP/Todo/taskloop.f90 deleted file mode 100644 index aca050584cbbe..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/taskloop.f90 +++ /dev/null @@ -1,13 +0,0 @@ -! 
RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s - -! CHECK: not yet implemented: Taskloop construct -subroutine omp_taskloop - integer :: res, i - !$omp taskloop - do i = 1, 10 - res = res + 1 - end do - !$omp end taskloop -end subroutine omp_taskloop - diff --git a/flang/test/Lower/OpenMP/masked_taskloop.f90 b/flang/test/Lower/OpenMP/masked_taskloop.f90 new file mode 100644 index 0000000000000..abe20ec1fd87c --- /dev/null +++ b/flang/test/Lower/OpenMP/masked_taskloop.f90 @@ -0,0 +1,55 @@ +! This test checks lowering of OpenMP masked taskloop Directive. + +! RUN: bbc -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[J_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_masked_taskloop() { +! CHECK: %[[VAL_0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_masked_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] +! CHECK-SAME: {uniq_name = "_QFtest_masked_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_J:.*]] = fir.address_of(@_QFtest_masked_taskloopEj) : !fir.ref +! CHECK: %[[DECL_J:.*]]:2 = hlfir.declare %[[ALLOCA_J]] {uniq_name = "_QFtest_masked_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.masked { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private( +! CHECK-SAME: @[[J_FIRSTPRIVATE]] %[[DECL_J]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { +! 
CHECK: omp.loop_nest (%arg2) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFtest_masked_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL2:.*]]:2 = hlfir.declare %[[ARG1]] {uniq_name = "_QFtest_masked_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %arg2 to %[[VAL2]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_J:.*]] = fir.load %[[VAL1]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! CHECK: %[[RES_J:.*]] = arith.addi %[[LOAD_J]], %[[C1_I32_1]] : i32 +! CHECK: hlfir.assign %[[RES_J]] to %[[VAL1]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } +! CHECK: fir.global internal @_QFtest_masked_taskloopEj : i32 { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: fir.has_value %[[C1_I32]] : i32 +! CHECK: } + + +subroutine test_masked_taskloop + integer :: i, j = 1 + !$omp masked taskloop + do i=1,10 + j = j + 1 + end do + !$omp end masked taskloop +end subroutine diff --git a/flang/test/Lower/OpenMP/master_taskloop.f90 b/flang/test/Lower/OpenMP/master_taskloop.f90 deleted file mode 100644 index 26f664b2662dc..0000000000000 --- a/flang/test/Lower/OpenMP/master_taskloop.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! This test checks lowering of OpenMP master taskloop Directive. - -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s -! 
RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s - -subroutine test_master_taskloop - integer :: i, j = 1 - !CHECK: not yet implemented: Taskloop construct - !$omp master taskloop - do i=1,10 - j = j + 1 - end do - !$omp end master taskloop -end subroutine diff --git a/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 b/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 new file mode 100644 index 0000000000000..497cc396a5a02 --- /dev/null +++ b/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 @@ -0,0 +1,48 @@ +! This test checks lowering of OpenMP parallel masked taskloop Directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 +! CHECK-LABEL: func.func @_QPtest_parallel_master_taskloop() { +! CHECK: %[[VAL0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_parallel_master_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_parallel_master_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ADDR_J:.*]] = fir.address_of(@_QFtest_parallel_master_taskloopEj) : !fir.ref +! CHECK: %[[DECL_J:.*]]:2 = hlfir.declare %[[ADDR_J]] {uniq_name = "_QFtest_parallel_master_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.parallel { +! CHECK: omp.masked { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG0:.*]] : !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG1:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%c1_i32_0) { +! 
CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFtest_parallel_master_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG1]] to %[[VAL1]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_J:.*]] = fir.load %[[DECL_J]]#0 : !fir.ref +! CHECK: %c1_i32_1 = arith.constant 1 : i32 +! CHECK: %[[RES_ADD:.*]] = arith.addi %[[LOAD_J]], %c1_i32_1 : i32 +! CHECK: hlfir.assign %[[RES_ADD]] to %[[DECL_J]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } +! CHECK: fir.global internal @_QFtest_parallel_master_taskloopEj : i32 { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: fir.has_value %[[C1_I32]] : i32 +! CHECK: } + +subroutine test_parallel_master_taskloop + integer :: i, j = 1 + !$omp parallel masked taskloop + do i=1,10 + j = j + 1 + end do + !$omp end parallel masked taskloop +end subroutine diff --git a/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 b/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 deleted file mode 100644 index 17ceb9496c8d3..0000000000000 --- a/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! This test checks lowering of OpenMP parallel master taskloop Directive. - -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s - -subroutine test_parallel_master_taskloop - integer :: i, j = 1 - !CHECK: not yet implemented: Taskloop construct - !$omp parallel master taskloop - do i=1,10 - j = j + 1 - end do - !$omp end parallel master taskloop -end subroutine diff --git a/flang/test/Lower/OpenMP/taskloop-cancel.f90 b/flang/test/Lower/OpenMP/taskloop-cancel.f90 new file mode 100644 index 0000000000000..2bc0f17428c36 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-cancel.f90 @@ -0,0 +1,37 @@ +! 
RUN: bbc -emit-hlfir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: func.func @_QPomp_taskloop() { +! CHECK: %[[VAL_0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %1 {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.parallel { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[I_PRIVATE]] %2#0 -> %[[ARG0:.*]] : !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG1:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! CHECK: %[[IDX:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG1]] to %[[IDX]]#0 : i32, !fir.ref +! CHECK: omp.cancel cancellation_construct_type(taskgroup) +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } + +subroutine omp_taskloop +integer :: i +!$omp parallel + !$omp taskloop + do i = 1, 10 + !$omp cancel taskgroup + end do + !$omp end taskloop +!$omp end parallel +end subroutine omp_taskloop diff --git a/flang/test/Lower/OpenMP/taskloop.f90 b/flang/test/Lower/OpenMP/taskloop.f90 new file mode 100644 index 0000000000000..10689a34c7efb --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop.f90 @@ -0,0 +1,72 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! 
CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[RES_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[RES_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPomp_taskloop +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloopEi"} +! CHECK: %[[I_VAL:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_RES:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskloopEres"} +! CHECK: %[[RES_VAL:.*]]:2 = hlfir.declare %[[ALLOCA_RES]] {uniq_name = "_QFomp_taskloopEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[RES_FIRSTPRIVATE]] %[[RES_VAL]]#0 -> %[[PRIV_RES:.*]], @[[I_PRIVATE]] %[[I_VAL]]#0 -> %[[PRIV_I:.*]] : !fir.ref, !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! CHECK: %[[RES_DECL:.*]]:2 = hlfir.declare %[[PRIV_RES]] {uniq_name = "_QFomp_taskloopEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[I_DECL:.*]]:2 = hlfir.declare %[[PRIV_I]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG2]] to %[[I_DECL]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_RES:.*]] = fir.load %[[RES_DECL]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! CHECK: %[[OUT_VAL:.*]] = arith.addi %[[LOAD_RES]], %[[C1_I32_1]] : i32 +! CHECK: hlfir.assign %[[OUT_VAL]] to %[[RES_DECL]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: return +! 
CHECK: } + +subroutine omp_taskloop + integer :: res, i + !$omp taskloop + do i = 1, 10 + res = res + 1 + end do + !$omp end taskloop +end subroutine omp_taskloop + + +! CHECK-LABEL: func.func @_QPomp_taskloop_private +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloop_privateEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFomp_taskloop_privateEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_RES:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskloop_privateEres"} +! CHECK: %[[DECL_RES:.*]]:2 = hlfir.declare %[[ALLOCA_RES]] {uniq_name = "_QFomp_taskloop_privateEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine omp_taskloop_private + integer :: res, i +! CHECK: omp.taskloop private(@[[RES_PRIVATE_TEST2]] %[[DECL_RES]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE_TEST2]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { +! CHECK: omp.loop_nest (%{{.*}}) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { +! CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFomp_taskloop_privateEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) + !$omp taskloop private(res) + do i = 1, 10 +! CHECK: %[[LOAD_RES:.*]] = fir.load %[[VAL1]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! CHECK: %[[ADD_VAL:.*]] = arith.addi %[[LOAD_RES]], %[[C1_I32_1]] : i32 +! CHECK: hlfir.assign %[[ADD_VAL]] to %[[VAL1]]#0 : i32, !fir.ref + res = res + 1 + end do +! CHECK: return +! 
CHECK: } + !$omp end taskloop +end subroutine omp_taskloop_private \ No newline at end of file diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp index 0f65ace0186db..4f1bea24bf6ac 100644 --- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp +++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp @@ -2785,16 +2785,16 @@ LogicalResult TaskgroupOp::verify() { void TaskloopOp::build(OpBuilder &builder, OperationState &state, const TaskloopOperands &clauses) { MLIRContext *ctx = builder.getContext(); - // TODO Store clauses in op: privateVars, privateSyms. TaskloopOp::build(builder, state, clauses.allocateVars, clauses.allocatorVars, clauses.final, clauses.grainsizeMod, clauses.grainsize, clauses.ifExpr, clauses.inReductionVars, makeDenseBoolArrayAttr(ctx, clauses.inReductionByref), makeArrayAttr(ctx, clauses.inReductionSyms), clauses.mergeable, clauses.nogroup, clauses.numTasksMod, - clauses.numTasks, clauses.priority, /*private_vars=*/{}, - /*private_syms=*/nullptr, clauses.reductionMod, - clauses.reductionVars, + clauses.numTasks, clauses.priority, + /*private_vars=*/clauses.privateVars, + /*private_syms=*/makeArrayAttr(ctx, clauses.privateSyms), + clauses.reductionMod, clauses.reductionVars, makeDenseBoolArrayAttr(ctx, clauses.reductionByref), makeArrayAttr(ctx, clauses.reductionSyms), clauses.untied); } ``````````
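The last hunk of the diff above changes what `TaskloopOp::build` passes for the private clause operands: the placeholders `{}` and `nullptr` are replaced by `clauses.privateVars` and `makeArrayAttr(ctx, clauses.privateSyms)`. A minimal sketch of that before/after behaviour follows. It does not use the MLIR API: `TaskloopOperandsSketch`, `TaskloopOpSketch`, `buildBefore` and `buildAfter` are hypothetical stand-ins introduced only for illustration, with plain integers and strings in place of SSA values and privatizer symbols.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the clause operand bundle; NOT mlir::omp::TaskloopOperands.
struct TaskloopOperandsSketch {
  std::vector<int> privateVars;          // stands in for the privatized SSA values
  std::vector<std::string> privateSyms;  // stands in for the omp.private symbol names
};

// Hypothetical stand-in for the built operation; NOT mlir::omp::TaskloopOp.
struct TaskloopOpSketch {
  std::vector<int> privateVars;
  std::vector<std::string> privateSyms;
};

// Models the pre-patch builder: the private operands are ignored,
// so the resulting op carries no privatization information.
TaskloopOpSketch buildBefore(const TaskloopOperandsSketch &clauses) {
  (void)clauses;  // deliberately unused, mirroring the removed TODO
  return TaskloopOpSketch{{}, {}};
}

// Models the post-patch builder: the private operands are forwarded unchanged.
TaskloopOpSketch buildAfter(const TaskloopOperandsSketch &clauses) {
  return TaskloopOpSketch{clauses.privateVars, clauses.privateSyms};
}
```

Under these assumptions, an operand bundle with two private entries yields an op with none from `buildBefore` and both from `buildAfter`, which is consistent with the `omp.taskloop private(...)` lines the new tests check for.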
https://github.com/llvm/llvm-project/pull/138646 From flang-commits at lists.llvm.org Tue May 6 00:25:18 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 00:25:18 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [Flang][OpenMP] Support for lowering of taskloop construct to MLIR (PR #138646) In-Reply-To: Message-ID: <6819b95e.050a0220.226b98.4d8e@mx.google.com> https://github.com/kaviya2510 edited https://github.com/llvm/llvm-project/pull/138646 From flang-commits at lists.llvm.org Tue May 6 00:26:24 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 00:26:24 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [Flang][OpenMP] Support for lowering of taskloop construct to MLIR (PR #138646) In-Reply-To: Message-ID: <6819b9a0.170a0220.36c02e.ba5b@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/138646 >From ec312e73d39c8742b6c7c7cc22451292d15357cc Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Tue, 6 May 2025 12:49:08 +0530 Subject: [PATCH] [Flang][OpenMP] Support for lowering of taskloop construct to MLIR --- flang/lib/Lower/OpenMP/OpenMP.cpp | 44 +++++++++++- .../Lower/OpenMP/Todo/taskloop-cancel.f90 | 14 ---- flang/test/Lower/OpenMP/Todo/taskloop.f90 | 13 ---- flang/test/Lower/OpenMP/masked_taskloop.f90 | 55 ++++++++++++++ flang/test/Lower/OpenMP/master_taskloop.f90 | 14 ---- .../Lower/OpenMP/parallel-masked-taskloop.f90 | 48 +++++++++++++ .../Lower/OpenMP/parallel-master-taskloop.f90 | 14 ---- flang/test/Lower/OpenMP/taskloop-cancel.f90 | 37 ++++++++++ flang/test/Lower/OpenMP/taskloop.f90 | 72 +++++++++++++++++++ mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | 8 +-- 10 files changed, 258 insertions(+), 61 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 delete mode 100644 flang/test/Lower/OpenMP/Todo/taskloop.f90 create mode 100644 
flang/test/Lower/OpenMP/masked_taskloop.f90 delete mode 100644 flang/test/Lower/OpenMP/master_taskloop.f90 create mode 100644 flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 delete mode 100644 flang/test/Lower/OpenMP/parallel-master-taskloop.f90 create mode 100644 flang/test/Lower/OpenMP/taskloop-cancel.f90 create mode 100644 flang/test/Lower/OpenMP/taskloop.f90 diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 47e7c266ff7d3..5d545f40c7ccb 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1804,6 +1804,22 @@ static void genTaskgroupClauses(lower::AbstractConverter &converter, llvm::omp::Directive::OMPD_taskgroup); } +static void genTaskloopClauses(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + lower::StatementContext &stmtCtx, + const List &clauses, mlir::Location loc, + mlir::omp::TaskloopOperands &clauseOps) { + + ClauseProcessor cp(converter, semaCtx, clauses); + + cp.processTODO( + loc, llvm::omp::Directive::OMPD_taskloop); +} + static void genTaskwaitClauses(lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, const List &clauses, mlir::Location loc, @@ -3256,8 +3272,32 @@ static mlir::omp::TaskloopOp genStandaloneTaskloop( semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, mlir::Location loc, const ConstructQueue &queue, ConstructQueue::const_iterator item) { - TODO(loc, "Taskloop construct"); - return nullptr; + mlir::omp::TaskloopOperands taskloopClauseOps; + lower::StatementContext stmtCtx; + genTaskloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc, + taskloopClauseOps); + + DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval, + /*shouldCollectPreDeterminedSymbols=*/true, + enableDelayedPrivatization, symTable); + dsp.processStep1(&taskloopClauseOps); + + mlir::omp::LoopNestOperands loopNestClauseOps; + llvm::SmallVector iv; + genLoopNestClauses(converter, semaCtx, eval, 
item->clauses, loc, + loopNestClauseOps, iv); + + EntryBlockArgs taskloopArgs; + taskloopArgs.priv.syms = dsp.getDelayedPrivSymbols(); + taskloopArgs.priv.vars = taskloopClauseOps.privateVars; + + auto taskLoopOp = genWrapperOp( + converter, loc, taskloopClauseOps, taskloopArgs); + + genLoopNestOp(converter, symTable, semaCtx, eval, loc, queue, item, + loopNestClauseOps, iv, {{taskLoopOp, taskloopArgs}}, + llvm::omp::Directive::OMPD_taskloop, dsp); + return taskLoopOp; } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 b/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 deleted file mode 100644 index 5045c621e4d77..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/taskloop-cancel.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s - -! CHECK: not yet implemented: Taskloop construct -subroutine omp_taskloop -integer :: i -!$omp parallel - !$omp taskloop - do i = 1, 10 - !$omp cancel taskgroup - end do - !$omp end taskloop -!$omp end parallel -end subroutine omp_taskloop diff --git a/flang/test/Lower/OpenMP/Todo/taskloop.f90 b/flang/test/Lower/OpenMP/Todo/taskloop.f90 deleted file mode 100644 index aca050584cbbe..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/taskloop.f90 +++ /dev/null @@ -1,13 +0,0 @@ -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s - -! 
CHECK: not yet implemented: Taskloop construct -subroutine omp_taskloop - integer :: res, i - !$omp taskloop - do i = 1, 10 - res = res + 1 - end do - !$omp end taskloop -end subroutine omp_taskloop - diff --git a/flang/test/Lower/OpenMP/masked_taskloop.f90 b/flang/test/Lower/OpenMP/masked_taskloop.f90 new file mode 100644 index 0000000000000..abe20ec1fd87c --- /dev/null +++ b/flang/test/Lower/OpenMP/masked_taskloop.f90 @@ -0,0 +1,55 @@ +! This test checks lowering of OpenMP masked taskloop Directive. + +! RUN: bbc -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[J_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_masked_taskloop() { +! CHECK: %[[VAL_0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_masked_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] +! CHECK-SAME: {uniq_name = "_QFtest_masked_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_J:.*]] = fir.address_of(@_QFtest_masked_taskloopEj) : !fir.ref +! CHECK: %[[DECL_J:.*]]:2 = hlfir.declare %[[ALLOCA_J]] {uniq_name = "_QFtest_masked_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.masked { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private( +! CHECK-SAME: @[[J_FIRSTPRIVATE]] %[[DECL_J]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { +! CHECK: omp.loop_nest (%arg2) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! 
CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFtest_masked_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL2:.*]]:2 = hlfir.declare %[[ARG1]] {uniq_name = "_QFtest_masked_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %arg2 to %[[VAL2]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_J:.*]] = fir.load %[[VAL1]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! CHECK: %[[RES_J:.*]] = arith.addi %[[LOAD_J]], %[[C1_I32_1]] : i32 +! CHECK: hlfir.assign %[[RES_J]] to %[[VAL1]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } +! CHECK: fir.global internal @_QFtest_masked_taskloopEj : i32 { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: fir.has_value %[[C1_I32]] : i32 +! CHECK: } + + +subroutine test_masked_taskloop + integer :: i, j = 1 + !$omp masked taskloop + do i=1,10 + j = j + 1 + end do + !$omp end masked taskloop +end subroutine diff --git a/flang/test/Lower/OpenMP/master_taskloop.f90 b/flang/test/Lower/OpenMP/master_taskloop.f90 deleted file mode 100644 index 26f664b2662dc..0000000000000 --- a/flang/test/Lower/OpenMP/master_taskloop.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! This test checks lowering of OpenMP master taskloop Directive. - -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s - -subroutine test_master_taskloop - integer :: i, j = 1 - !CHECK: not yet implemented: Taskloop construct - !$omp master taskloop - do i=1,10 - j = j + 1 - end do - !$omp end master taskloop -end subroutine diff --git a/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 b/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 new file mode 100644 index 0000000000000..497cc396a5a02 --- /dev/null +++ b/flang/test/Lower/OpenMP/parallel-masked-taskloop.f90 @@ -0,0 +1,48 @@ +! 
This test checks lowering of OpenMP parallel masked taskloop Directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 +! CHECK-LABEL: func.func @_QPtest_parallel_master_taskloop() { +! CHECK: %[[VAL0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_parallel_master_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_parallel_master_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ADDR_J:.*]] = fir.address_of(@_QFtest_parallel_master_taskloopEj) : !fir.ref +! CHECK: %[[DECL_J:.*]]:2 = hlfir.declare %[[ADDR_J]] {uniq_name = "_QFtest_parallel_master_taskloopEj"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.parallel { +! CHECK: omp.masked { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG0:.*]] : !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG1:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%c1_i32_0) { +! CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFtest_parallel_master_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG1]] to %[[VAL1]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_J:.*]] = fir.load %[[DECL_J]]#0 : !fir.ref +! CHECK: %c1_i32_1 = arith.constant 1 : i32 +! CHECK: %[[RES_ADD:.*]] = arith.addi %[[LOAD_J]], %c1_i32_1 : i32 +! CHECK: hlfir.assign %[[RES_ADD]] to %[[DECL_J]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } +! 
CHECK: fir.global internal @_QFtest_parallel_master_taskloopEj : i32 { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: fir.has_value %[[C1_I32]] : i32 +! CHECK: } + +subroutine test_parallel_master_taskloop + integer :: i, j = 1 + !$omp parallel masked taskloop + do i=1,10 + j = j + 1 + end do + !$omp end parallel masked taskloop +end subroutine diff --git a/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 b/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 deleted file mode 100644 index 17ceb9496c8d3..0000000000000 --- a/flang/test/Lower/OpenMP/parallel-master-taskloop.f90 +++ /dev/null @@ -1,14 +0,0 @@ -! This test checks lowering of OpenMP parallel master taskloop Directive. - -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s - -subroutine test_parallel_master_taskloop - integer :: i, j = 1 - !CHECK: not yet implemented: Taskloop construct - !$omp parallel master taskloop - do i=1,10 - j = j + 1 - end do - !$omp end parallel master taskloop -end subroutine diff --git a/flang/test/Lower/OpenMP/taskloop-cancel.f90 b/flang/test/Lower/OpenMP/taskloop-cancel.f90 new file mode 100644 index 0000000000000..2bc0f17428c36 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-cancel.f90 @@ -0,0 +1,37 @@ +! RUN: bbc -emit-hlfir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private {type = private} +! CHECK-SAME: @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: func.func @_QPomp_taskloop() { +! CHECK: %[[VAL_0:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloopEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %1 {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.parallel { +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! 
CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[I_PRIVATE]] %2#0 -> %[[ARG0:.*]] : !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG1:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! CHECK: %[[IDX:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG1]] to %[[IDX]]#0 : i32, !fir.ref +! CHECK: omp.cancel cancellation_construct_type(taskgroup) +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } + +subroutine omp_taskloop +integer :: i +!$omp parallel + !$omp taskloop + do i = 1, 10 + !$omp cancel taskgroup + end do + !$omp end taskloop +!$omp end parallel +end subroutine omp_taskloop diff --git a/flang/test/Lower/OpenMP/taskloop.f90 b/flang/test/Lower/OpenMP/taskloop.f90 new file mode 100644 index 0000000000000..10689a34c7efb --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop.f90 @@ -0,0 +1,72 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[RES_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[RES_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPomp_taskloop +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloopEi"} +! CHECK: %[[I_VAL:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! 
CHECK: %[[ALLOCA_RES:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskloopEres"} +! CHECK: %[[RES_VAL:.*]]:2 = hlfir.declare %[[ALLOCA_RES]] {uniq_name = "_QFomp_taskloopEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[C1_I32:.*]] = arith.constant 1 : i32 +! CHECK: %[[C10_I32:.*]] = arith.constant 10 : i32 +! CHECK: %[[C1_I32_0:.*]] = arith.constant 1 : i32 +! CHECK: omp.taskloop private(@[[RES_FIRSTPRIVATE]] %[[RES_VAL]]#0 -> %[[PRIV_RES:.*]], @[[I_PRIVATE]] %[[I_VAL]]#0 -> %[[PRIV_I:.*]] : !fir.ref, !fir.ref) { +! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%[[C1_I32]]) to (%[[C10_I32]]) inclusive step (%[[C1_I32_0]]) { +! CHECK: %[[RES_DECL:.*]]:2 = hlfir.declare %[[PRIV_RES]] {uniq_name = "_QFomp_taskloopEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[I_DECL:.*]]:2 = hlfir.declare %[[PRIV_I]] {uniq_name = "_QFomp_taskloopEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[ARG2]] to %[[I_DECL]]#0 : i32, !fir.ref +! CHECK: %[[LOAD_RES:.*]] = fir.load %[[RES_DECL]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! CHECK: %[[OUT_VAL:.*]] = arith.addi %[[LOAD_RES]], %[[C1_I32_1]] : i32 +! CHECK: hlfir.assign %[[OUT_VAL]] to %[[RES_DECL]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: return +! CHECK: } + +subroutine omp_taskloop + integer :: res, i + !$omp taskloop + do i = 1, 10 + res = res + 1 + end do + !$omp end taskloop +end subroutine omp_taskloop + + +! CHECK-LABEL: func.func @_QPomp_taskloop_private +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFomp_taskloop_privateEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFomp_taskloop_privateEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_RES:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskloop_privateEres"} +! 
CHECK: %[[DECL_RES:.*]]:2 = hlfir.declare %[[ALLOCA_RES]] {uniq_name = "_QFomp_taskloop_privateEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine omp_taskloop_private + integer :: res, i +! CHECK: omp.taskloop private(@[[RES_PRIVATE_TEST2]] %[[DECL_RES]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE_TEST2]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { +! CHECK: omp.loop_nest (%{{.*}}) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { +! CHECK: %[[VAL1:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFomp_taskloop_privateEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) + !$omp taskloop private(res) + do i = 1, 10 +! CHECK: %[[LOAD_RES:.*]] = fir.load %[[VAL1]]#0 : !fir.ref +! CHECK: %[[C1_I32_1:.*]] = arith.constant 1 : i32 +! CHECK: %[[ADD_VAL:.*]] = arith.addi %[[LOAD_RES]], %[[C1_I32_1]] : i32 +! CHECK: hlfir.assign %[[ADD_VAL]] to %[[VAL1]]#0 : i32, !fir.ref + res = res + 1 + end do +! CHECK: return +! CHECK: } + !$omp end taskloop +end subroutine omp_taskloop_private \ No newline at end of file diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp index 0f65ace0186db..4f1bea24bf6ac 100644 --- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp +++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp @@ -2785,16 +2785,16 @@ LogicalResult TaskgroupOp::verify() { void TaskloopOp::build(OpBuilder &builder, OperationState &state, const TaskloopOperands &clauses) { MLIRContext *ctx = builder.getContext(); - // TODO Store clauses in op: privateVars, privateSyms. 
TaskloopOp::build(builder, state, clauses.allocateVars, clauses.allocatorVars, clauses.final, clauses.grainsizeMod, clauses.grainsize, clauses.ifExpr, clauses.inReductionVars, makeDenseBoolArrayAttr(ctx, clauses.inReductionByref), makeArrayAttr(ctx, clauses.inReductionSyms), clauses.mergeable, clauses.nogroup, clauses.numTasksMod, - clauses.numTasks, clauses.priority, /*private_vars=*/{}, - /*private_syms=*/nullptr, clauses.reductionMod, - clauses.reductionVars, + clauses.numTasks, clauses.priority, + /*private_vars=*/clauses.privateVars, + /*private_syms=*/makeArrayAttr(ctx, clauses.privateSyms), + clauses.reductionMod, clauses.reductionVars, makeDenseBoolArrayAttr(ctx, clauses.reductionByref), makeArrayAttr(ctx, clauses.reductionSyms), clauses.untied); } From flang-commits at lists.llvm.org Tue May 6 00:27:20 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 00:27:20 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [Flang][OpenMP] Support for lowering of taskloop construct to MLIR (PR #138646) In-Reply-To: Message-ID: <6819b9d8.050a0220.23c0d0.5cc6@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/138646

From flang-commits at lists.llvm.org Tue May 6 01:58:07 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 01:58:07 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [Flang][OpenMP] Support for lowering of taskloop construct to MLIR (PR #138646) In-Reply-To: Message-ID: <6819cf1f.170a0220.16fc7.05ea@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/138646

From flang-commits at lists.llvm.org Tue May 6 02:13:25 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 02:13:25 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering grainsize and num_tasks clause to… (PR #128490) In-Reply-To: Message-ID: <6819d2b5.170a0220.2a225a.f1d5@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/128490

From flang-commits at lists.llvm.org Tue May 6 02:15:44 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 02:15:44 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering grainsize and num_tasks clause to… (PR #128490) In-Reply-To: Message-ID: <6819d340.050a0220.31513.5d86@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

``````````bash
git-clang-format --diff HEAD~1 HEAD --extensions cpp,h -- flang/lib/Lower/OpenMP/ClauseProcessor.cpp flang/lib/Lower/OpenMP/ClauseProcessor.h flang/lib/Lower/OpenMP/OpenMP.cpp
``````````

View the diff from clang-format here.

``````````diff
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 3bfcee26c..8a35413f8 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -3268,13 +3268,11 @@ genStandaloneSimd(lower::AbstractConverter &converter, lower::SymMap &symTable,
   return simdOp;
 }
 
-static mlir::omp::TaskloopOp genStandaloneTaskloop(lower::AbstractConverter &converter,
-                                    lower::SymMap &symTable,
-                                    semantics::SemanticsContext &semaCtx,
-                                    lower::pft::Evaluation &eval,
-                                    mlir::Location loc,
-                                    const ConstructQueue &queue,
-                                    ConstructQueue::const_iterator item) {
+static mlir::omp::TaskloopOp genStandaloneTaskloop(
+    lower::AbstractConverter &converter, lower::SymMap &symTable,
+    semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval,
+    mlir::Location loc, const ConstructQueue &queue,
+    ConstructQueue::const_iterator item) {
   mlir::omp::TaskloopOperands taskloopClauseOps;
   lower::StatementContext stmtCtx;
   genTaskloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc,
@@ -3298,7 +3296,6 @@ static mlir::omp::TaskloopOp genStandaloneTaskloop(lower::AbstractConverter &con
                         llvm::omp::Directive::OMPD_taskloop, dsp);
 
   return taskLoopOp;
-
 }
 
 //===----------------------------------------------------------------------===//
``````````
https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Tue May 6 03:03:23 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 03:03:23 -0700 (PDT) Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790) In-Reply-To: Message-ID: <6819de6b.630a0220.21d449.03a5@mx.google.com> https://github.com/NexMing edited https://github.com/llvm/llvm-project/pull/137790 From flang-commits at lists.llvm.org Tue May 6 03:57:08 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 03:57:08 -0700 (PDT) Subject: [flang-commits] [flang] cb9683f - [Clang][Flang][Driver] Fix target parsing for -fveclib=libmvec option. (#138288) Message-ID: <6819eb04.170a0220.3b236c.fde3@mx.google.com> Author: Paul Walker Date: 2025-05-06T11:57:04+01:00 New Revision: cb9683fad12101417a46b35452cb23dfb7c6c367 URL: https://github.com/llvm/llvm-project/commit/cb9683fad12101417a46b35452cb23dfb7c6c367 DIFF: https://github.com/llvm/llvm-project/commit/cb9683fad12101417a46b35452cb23dfb7c6c367.diff LOG: [Clang][Flang][Driver] Fix target parsing for -fveclib=libmvec option. (#138288) There are various places where the -fveclib option is parsed to determine whether its value is correct for the target. Unfortunately these places assume case-insensitivity and subsequently use "LIBMVEC" where the driver mandates "libmvec", thus rendering the diagnostic useless. This PR corrects the naming along with similar incorrect uses within the test files. 
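The mismatch described in the log above can be illustrated with a minimal, standalone sketch. It is not the driver's code: `mapVecLib` and `isVecLibSupported` are hypothetical helpers, and plain `std::string_view` comparisons stand in for `llvm::StringSwitch`, whose `Case` matches are likewise exact and case-sensitive. Only a few library names from the diff are modelled.

```cpp
#include <optional>
#include <string_view>

// Hypothetical stand-in for the driver-side mapping from the user-facing
// -fveclib= spelling to the backend's vector-library name. Matching is
// exact (case-sensitive), as with llvm::StringSwitch::Case.
std::optional<std::string_view> mapVecLib(std::string_view name) {
  if (name == "Accelerate") return "Accelerate";
  if (name == "libmvec") return "LIBMVEC-X86"; // driver spelling is lowercase
  if (name == "MASSV") return "MASSV";
  if (name == "SVML") return "SVML";
  return std::nullopt; // unknown spelling, e.g. "LIBMVEC"
}

// Hypothetical target check: in this sketch libmvec is accepted only for
// x86 targets; other libraries are not modelled and always pass.
bool isVecLibSupported(std::string_view name, bool isX86) {
  if (name == "libmvec") return isX86;
  return true;
}
```

With the uppercase spelling in the comparison, a user-supplied `libmvec` would never reach the target check, which is the sense in which the log calls the diagnostic useless.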
Added: Modified: clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Driver/ToolChains/CommonArgs.cpp clang/lib/Driver/ToolChains/Flang.cpp clang/test/Driver/fveclib.c flang/lib/Frontend/CompilerInvocation.cpp flang/test/Driver/fveclib-codegen.f90 flang/test/Driver/fveclib.f90 Removed: ################################################################################ diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index b2e0e5c857228..f87549baff5e1 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -5843,7 +5843,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) << Name << Triple.getArchName(); - } else if (Name == "LIBMVEC-X86") { + } else if (Name == "libmvec") { if (Triple.getArch() != llvm::Triple::x86 && Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp index 109316d0a27e7..e4bad39f8332a 100644 --- a/clang/lib/Driver/ToolChains/CommonArgs.cpp +++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp @@ -934,7 +934,7 @@ void tools::addLTOOptions(const ToolChain &ToolChain, const ArgList &Args, std::optional OptVal = llvm::StringSwitch>(ArgVecLib->getValue()) .Case("Accelerate", "Accelerate") - .Case("LIBMVEC", "LIBMVEC-X86") + .Case("libmvec", "LIBMVEC-X86") .Case("MASSV", "MASSV") .Case("SVML", "SVML") .Case("SLEEF", "sleefgnuabi") diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index a407e295c09bd..b1ca747e68b89 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -484,7 +484,7 @@ void Flang::addTargetOptions(const ArgList &Args, Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) << Name << Triple.getArchName(); - } 
else if (Name == "LIBMVEC-X86") { + } else if (Name == "libmvec") { if (Triple.getArch() != llvm::Triple::x86 && Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c index 78b5316b67e47..99baa46cb31c3 100644 --- a/clang/test/Driver/fveclib.c +++ b/clang/test/Driver/fveclib.c @@ -1,6 +1,6 @@ // RUN: %clang -### -c -fveclib=none %s 2>&1 | FileCheck --check-prefix=CHECK-NOLIB %s // RUN: %clang -### -c -fveclib=Accelerate %s 2>&1 | FileCheck --check-prefix=CHECK-ACCELERATE %s -// RUN: %clang -### -c -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-libmvec %s +// RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-libmvec %s // RUN: %clang -### -c -fveclib=MASSV %s 2>&1 | FileCheck --check-prefix=CHECK-MASSV %s // RUN: %clang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck --check-prefix=CHECK-DARWIN_LIBSYSTEM_M %s // RUN: %clang -### -c --target=aarch64 -fveclib=SLEEF %s 2>&1 | FileCheck --check-prefix=CHECK-SLEEF %s @@ -21,7 +21,7 @@ // RUN: not %clang --target=x86 -c -fveclib=SLEEF %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s // RUN: not %clang --target=x86 -c -fveclib=ArmPL %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s -// RUN: not %clang --target=aarch64 -c -fveclib=LIBMVEC-X86 %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s +// RUN: not %clang --target=aarch64 -c -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s // RUN: not %clang --target=aarch64 -c -fveclib=SVML %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s // CHECK-ERROR: unsupported option {{.*}} for target @@ -37,7 +37,7 @@ /* Verify that the correct vector library is passed to LTO flags. 
*/ -// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=libmvec -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86" // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s @@ -58,8 +58,8 @@ /* Verify that -fmath-errno is set correctly for the vector library. */ -// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-LIBMVEC %s -// CHECK-ERRNO-LIBMVEC: "-fveclib=LIBMVEC" +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-LIBMVEC %s +// CHECK-ERRNO-LIBMVEC: "-fveclib=libmvec" // CHECK-ERRNO-LIBMVEC-SAME: "-fmath-errno" // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-MASSV %s @@ -110,7 +110,7 @@ // CHECK-ENABLED-LAST: math errno enabled by '-ffp-model=strict' after it was implicitly disabled by '-fveclib=ArmPL', this may limit the utilization of the vector library [-Wmath-errno-enabled-with-veclib] /* Verify no warning when math-errno is re-enabled for a different veclib (that does not imply -fno-math-errno). */ -// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -fmath-errno -fveclib=LIBMVEC %s 2>&1 | FileCheck --check-prefix=CHECK-REPEAT-VECLIB %s +// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -fmath-errno -fveclib=Accelerate %s 2>&1 | FileCheck --check-prefix=CHECK-REPEAT-VECLIB %s // CHECK-REPEAT-VECLIB-NOT: math errno enabled /// Verify that vectorized routines library is being linked in. 
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index d6ba644b1400d..28f2f69f23baf 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -195,7 +195,7 @@ static bool parseVectorLibArg(Fortran::frontend::CodeGenOptions &opts, std::optional val = llvm::StringSwitch>(arg->getValue()) .Case("Accelerate", VectorLibrary::Accelerate) - .Case("LIBMVEC", VectorLibrary::LIBMVEC) + .Case("libmvec", VectorLibrary::LIBMVEC) .Case("MASSV", VectorLibrary::MASSV) .Case("SVML", VectorLibrary::SVML) .Case("SLEEF", VectorLibrary::SLEEF) diff --git a/flang/test/Driver/fveclib-codegen.f90 b/flang/test/Driver/fveclib-codegen.f90 index 3720b9e597f5b..802fff9772bb3 100644 --- a/flang/test/Driver/fveclib-codegen.f90 +++ b/flang/test/Driver/fveclib-codegen.f90 @@ -1,6 +1,6 @@ ! test that -fveclib= is passed to the backend -! RUN: %if aarch64-registered-target %{ %flang -S -Ofast -target aarch64-unknown-linux-gnu -fveclib=LIBMVEC -o - %s | FileCheck %s %} -! RUN: %if x86-registered-target %{ %flang -S -Ofast -target x86_64-unknown-linux-gnu -fveclib=LIBMVEC -o - %s | FileCheck %s %} +! RUN: %if aarch64-registered-target %{ %flang -S -Ofast -target aarch64-unknown-linux-gnu -fveclib=SLEEF -o - %s | FileCheck %s --check-prefix=SLEEF %} +! RUN: %if x86-registered-target %{ %flang -S -Ofast -target x86_64-unknown-linux-gnu -fveclib=libmvec -o - %s | FileCheck %s %} ! RUN: %flang -S -Ofast -fveclib=NoLibrary -o - %s | FileCheck %s --check-prefix=NOLIB subroutine sb(a, b) @@ -9,6 +9,7 @@ subroutine sb(a, b) do i=1,100 ! check that we used a vectorized call to powf() ! CHECK: _ZGVbN4vv_powf +! SLEEF: _ZGVnN4vv_powf ! NOLIB: powf a(i) = a(i) ** b(i) end do diff --git a/flang/test/Driver/fveclib.f90 b/flang/test/Driver/fveclib.f90 index 7c2540b91ba79..1b536b8ad0f18 100644 --- a/flang/test/Driver/fveclib.f90 +++ b/flang/test/Driver/fveclib.f90 @@ -1,6 +1,6 @@ ! 
RUN: %flang -### -c -fveclib=none %s 2>&1 | FileCheck -check-prefix CHECK-NOLIB %s ! RUN: %flang -### -c -fveclib=Accelerate %s 2>&1 | FileCheck -check-prefix CHECK-ACCELERATE %s -! RUN: %flang -### -c -fveclib=libmvec %s 2>&1 | FileCheck -check-prefix CHECK-libmvec %s +! RUN: %flang -### -c --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck -check-prefix CHECK-libmvec %s ! RUN: %flang -### -c -fveclib=MASSV %s 2>&1 | FileCheck -check-prefix CHECK-MASSV %s ! RUN: %flang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck -check-prefix CHECK-DARWIN_LIBSYSTEM_M %s ! RUN: %flang -### -c --target=aarch64-none-none -fveclib=SLEEF %s 2>&1 | FileCheck -check-prefix CHECK-SLEEF %s @@ -21,7 +21,7 @@ ! RUN: not %flang --target=x86-none-none -c -fveclib=SLEEF %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=x86-none-none -c -fveclib=ArmPL %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s -! RUN: not %flang --target=aarch64-none-none -c -fveclib=LIBMVEC-X86 %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s +! RUN: not %flang --target=aarch64-none-none -c -fveclib=libmvec %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=aarch64-none-none -c -fveclib=SVML %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! CHECK-ERROR: unsupported option {{.*}} for target From flang-commits at lists.llvm.org Tue May 6 03:57:11 2025 From: flang-commits at lists.llvm.org (Paul Walker via flang-commits) Date: Tue, 06 May 2025 03:57:11 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Clang][Flang][Driver] Fix target parsing for -fveclib=libmvec option. 
(PR #138288) In-Reply-To: Message-ID: <6819eb07.050a0220.1cddbb.649f@mx.google.com> https://github.com/paulwalker-arm closed https://github.com/llvm/llvm-project/pull/138288 From flang-commits at lists.llvm.org Tue May 6 04:35:37 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Tue, 06 May 2025 04:35:37 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Clang][Flang][Driver] Fix target parsing for -fveclib=libmvec option. (PR #138288) In-Reply-To: Message-ID: <6819f409.630a0220.21d449.0e7a@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `lldb-arm-ubuntu` running on `linaro-lldb-arm-ubuntu` while building `clang,flang` at step 6 "test". Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/15564
Here is the relevant piece of the build log for the reference ``` Step 6 (test) failure: build (failure) ... PASS: lldb-api :: tools/lldb-dap/io/TestDAP_io.py (1181 of 3016) UNSUPPORTED: lldb-api :: tools/lldb-dap/memory/TestDAP_memory.py (1182 of 3016) PASS: lldb-api :: tools/lldb-dap/locations/TestDAP_locations.py (1183 of 3016) PASS: lldb-api :: terminal/TestEditlineCompletions.py (1184 of 3016) PASS: lldb-api :: tools/lldb-dap/optimized/TestDAP_optimized.py (1185 of 3016) PASS: lldb-api :: tools/lldb-dap/output/TestDAP_output.py (1186 of 3016) PASS: lldb-api :: tools/lldb-dap/repl-mode/TestDAP_repl_mode_detection.py (1187 of 3016) PASS: lldb-api :: tools/lldb-dap/launch/TestDAP_launch.py (1188 of 3016) UNSUPPORTED: lldb-api :: tools/lldb-dap/restart/TestDAP_restart_runInTerminal.py (1189 of 3016) UNRESOLVED: lldb-api :: tools/lldb-dap/restart/TestDAP_restart.py (1190 of 3016) ******************** TEST 'lldb-api :: tools/lldb-dap/restart/TestDAP_restart.py' FAILED ******************** Script: -- /usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --arch armv8l --build-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir 
/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/tools/lldb-dap/restart -p TestDAP_restart.py -- Exit Code: 1 Command Output (stdout): -- lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision cb9683fad12101417a46b35452cb23dfb7c6c367) clang revision cb9683fad12101417a46b35452cb23dfb7c6c367 llvm revision cb9683fad12101417a46b35452cb23dfb7c6c367 Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc'] -- Command Output (stderr): -- FAIL: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_arguments (TestDAP_restart.TestDAP_restart) ========= DEBUG ADAPTER PROTOCOL LOGS ========= 1746531130.699173450 --> (stdin/stdout) {"command":"initialize","type":"request","arguments":{"adapterID":"lldb-native","clientID":"vscode","columnsStartAt1":true,"linesStartAt1":true,"locale":"en-us","pathFormat":"path","supportsRunInTerminalRequest":true,"supportsVariablePaging":true,"supportsVariableType":true,"supportsStartDebuggingRequest":true,"supportsProgressReporting":true,"$__lldb_sourceInitFile":false},"seq":1} 1746531130.703027487 <-- (stdin/stdout) {"body":{"$__lldb_version":"lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision cb9683fad12101417a46b35452cb23dfb7c6c367)\n clang revision cb9683fad12101417a46b35452cb23dfb7c6c367\n llvm revision cb9683fad12101417a46b35452cb23dfb7c6c367","completionTriggerCharacters":["."," ","\t"],"exceptionBreakpointFilters":[{"default":false,"filter":"cpp_catch","label":"C++ Catch"},{"default":false,"filter":"cpp_throw","label":"C++ Throw"},{"default":false,"filter":"objc_catch","label":"Objective-C Catch"},{"default":false,"filter":"objc_throw","label":"Objective-C 
Throw"}],"supportTerminateDebuggee":true,"supportsBreakpointLocationsRequest":true,"supportsCancelRequest":true,"supportsCompletionsRequest":true,"supportsConditionalBreakpoints":true,"supportsConfigurationDoneRequest":true,"supportsDataBreakpoints":true,"supportsDelayedStackTraceLoading":true,"supportsDisassembleRequest":true,"supportsEvaluateForHovers":true,"supportsExceptionInfoRequest":true,"supportsExceptionOptions":true,"supportsFunctionBreakpoints":true,"supportsHitConditionalBreakpoints":true,"supportsInstructionBreakpoints":true,"supportsLogPoints":true,"supportsModulesRequest":true,"supportsReadMemoryRequest":true,"supportsRestartRequest":true,"supportsSetVariable":true,"supportsStepInTargetsRequest":true,"supportsSteppingGranularity":true,"supportsValueFormattingOptions":true},"command":"initialize","request_seq":1,"seq":0,"success":true,"type":"response"} 1746531130.703825712 --> (stdin/stdout) {"command":"launch","type":"request","arguments":{"program":"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/tools/lldb-dap/restart/TestDAP_restart.test_arguments/a.out","initCommands":["settings clear --all","settings set symbols.enable-external-lookup false","settings set target.inherit-tcc true","settings set target.disable-aslr false","settings set target.detach-on-error false","settings set target.auto-apply-fixits false","settings set plugin.process.gdb-remote.packet-timeout 60","settings set symbols.clang-modules-cache-path \"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api\"","settings set use-color false","settings set show-statusline false"],"disableASLR":false,"enableAutoVariableSummaries":false,"enableSyntheticChildDebugging":false,"displayExtendedBacktrace":false},"seq":2} 1746531130.704373837 <-- (stdin/stdout) {"body":{"category":"console","output":"Running initCommands:\n"},"event":"output","seq":0,"type":"event"} 1746531130.704469204 <-- (stdin/stdout) 
{"body":{"category":"console","output":"(lldb) settings clear --all\n"},"event":"output","seq":0,"type":"event"} 1746531130.704485416 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set symbols.enable-external-lookup false\n"},"event":"output","seq":0,"type":"event"} 1746531130.704499006 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.inherit-tcc true\n"},"event":"output","seq":0,"type":"event"} 1746531130.704510689 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.disable-aslr false\n"},"event":"output","seq":0,"type":"event"} 1746531130.704523325 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.detach-on-error false\n"},"event":"output","seq":0,"type":"event"} 1746531130.704534531 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set target.auto-apply-fixits false\n"},"event":"output","seq":0,"type":"event"} 1746531130.704546213 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set plugin.process.gdb-remote.packet-timeout 60\n"},"event":"output","seq":0,"type":"event"} 1746531130.704605103 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set symbols.clang-modules-cache-path \"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api\"\n"},"event":"output","seq":0,"type":"event"} 1746531130.704617262 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set use-color false\n"},"event":"output","seq":0,"type":"event"} 1746531130.704628706 <-- (stdin/stdout) {"body":{"category":"console","output":"(lldb) settings set show-statusline false\n"},"event":"output","seq":0,"type":"event"} 1746531130.867408037 <-- (stdin/stdout) 
{"body":{"module":{"addressRange":"4159479808","debugInfoSize":"983.3KB","id":"0D794E6C-AF7E-D8CB-B9BA-E385B4F8753F-5A793D65","name":"ld-linux-armhf.so.3","path":"/usr/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3","symbolFilePath":"/usr/lib/arm-linux-gnueabihf/ld-linux-armhf.so.3","symbolStatus":"Symbols loaded."},"reason":"new"},"event":"module","seq":0,"type":"event"} 1746531130.867597818 <-- (stdin/stdout) {"body":{"module":{"addressRange":"9568256","debugInfoSize":"1.2KB","id":"EA705DC9","name":"a.out","path":"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/tools/lldb-dap/restart/TestDAP_restart.test_arguments/a.out","symbolFilePath":"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/tools/lldb-dap/restart/TestDAP_restart.test_arguments/a.out","symbolStatus":"Symbols loaded."},"reason":"new"},"event":"module","seq":0,"type":"event"} 1746531130.867681503 <-- (stdin/stdout) {"command":"launch","request_seq":2,"seq":0,"success":true,"type":"response"} 1746531130.867837906 <-- (stdin/stdout) {"body":{"isLocalProcess":true,"name":"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/tools/lldb-dap/restart/TestDAP_restart.test_arguments/a.out","startMethod":"launch","systemProcessId":3152025},"event":"process","seq":0,"type":"event"} 1746531130.867872238 <-- (stdin/stdout) {"event":"initialized","seq":0,"type":"event"} 1746531130.868159294 --> (stdin/stdout) {"command":"setBreakpoints","type":"request","arguments":{"source":{"name":"main.c","path":"main.c"},"sourceModified":false,"lines":[5],"breakpoints":[{"line":5}]},"seq":3} ```
https://github.com/llvm/llvm-project/pull/138288 From flang-commits at lists.llvm.org Tue May 6 04:58:58 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 04:58:58 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [Flang][OpenMP] Support for lowering of taskloop construct to MLIR (PR #138646) In-Reply-To: Message-ID: <6819f982.050a0220.242dd1.6f78@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/138646

From flang-commits at lists.llvm.org Tue May 6 05:23:59 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Tue, 06 May 2025 05:23:59 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] WIP: Add frontend support for declare variant (PR #130578) In-Reply-To: Message-ID: <6819ff5f.050a0220.36233.8bd4@mx.google.com> https://github.com/kiranchandramohan updated https://github.com/llvm/llvm-project/pull/130578

From flang-commits at lists.llvm.org Tue May 6 05:24:15 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Tue, 06 May 2025 05:24:15 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] WIP: Add frontend support for declare variant (PR #130578) In-Reply-To: Message-ID: <6819ff6f.050a0220.f9fc.7d90@mx.google.com> https://github.com/kiranchandramohan edited https://github.com/llvm/llvm-project/pull/130578 From flang-commits at lists.llvm.org Tue May 6 05:49:05 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 06 May 2025 05:49:05 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Fix fir.convert in omp.atomic.update region (PR #138397) In-Reply-To: Message-ID: <681a0541.170a0220.20c4dc.1452@mx.google.com> https://github.com/tblah approved this pull request. LGTM, thanks! https://github.com/llvm/llvm-project/pull/138397 From flang-commits at lists.llvm.org Tue May 6 05:57:56 2025 From: flang-commits at lists.llvm.org (Mark Danial via flang-commits) Date: Tue, 06 May 2025 05:57:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang][AIX] Predefine __64BIT__ and _AIX macros (PR #138591) In-Reply-To: Message-ID: <681a0754.170a0220.1d59aa.0a92@mx.google.com> https://github.com/madanial0 approved this pull request. LGTM! thanks! https://github.com/llvm/llvm-project/pull/138591 From flang-commits at lists.llvm.org Tue May 6 06:28:00 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 06 May 2025 06:28:00 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Add frontend support for declare variant (PR #130578) In-Reply-To: Message-ID: <681a0e60.170a0220.2be1.82e3@mx.google.com> https://github.com/tblah commented: Thanks for this Kiran. The semantic restrictions on adjust_args appear to be unimplemented. 
I think this patch is still useful without implementing all of the semantic checks, but please update the commit message to make that clear. https://github.com/llvm/llvm-project/pull/130578 From flang-commits at lists.llvm.org Tue May 6 06:31:10 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Tue, 06 May 2025 06:31:10 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Add frontend support for declare variant (PR #130578) In-Reply-To: Message-ID: <681a0f1e.170a0220.9c653.3f0f@mx.google.com> https://github.com/kiranchandramohan edited https://github.com/llvm/llvm-project/pull/130578 From flang-commits at lists.llvm.org Tue May 6 06:31:45 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Tue, 06 May 2025 06:31:45 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Add frontend support for declare variant (PR #130578) In-Reply-To: Message-ID: <681a0f41.170a0220.1ff9e8.454c@mx.google.com> kiranchandramohan wrote: > The semantic restrictions on adjust_args appear to be unimplemented. I think this patch is still useful without implementing all of the semantic checks, but please update the commit message to make that clear. Thanks. I have modified the summary. 
https://github.com/llvm/llvm-project/pull/130578 From flang-commits at lists.llvm.org Tue May 6 06:56:53 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 06:56:53 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) In-Reply-To: Message-ID: <681a1525.170a0220.bdc3f.7367@mx.google.com> ================ @@ -156,9 +156,9 @@ genBoundsOpsFromBox(fir::FirOpBuilder &builder, mlir::Location loc, builder.genIfOp(loc, resTypes, info.isPresent, /*withElseRegion=*/true) .genThen([&]() { mlir::Value box = - !fir::isBoxAddress(info.addr.getType()) + !fir::isBoxAddress(info.rawInput.getType()) ? info.addr - : builder.create(loc, info.addr); + : builder.create(loc, info.rawInput); ---------------- agozillon wrote: The issue this is trying to solve (which might not be the correct way to do so, so please do feel free to suggest alternatives :-) ) is the generation of external loads to the presence check, which will itself cause a segmentation fault if the Box isn't present (e.g. optional argument not been supplied). This might only be an issue for OpenMP, but I think it also applies to OpenACC (as it's utilising the same utilities), but when you utilise the getDataOperandBaseAddr we get the AddrAndBoundsInfo which contains our rawInput and addr. In this particular case the rawInput is a hlfir.declare but the addr is a load of the declare (the address), and when we pass the AddrAndBoundsInfo into this function to create the appropriate presence checks before accessing information and utilise the Info.addr directly segment above, we'll skip the load generation inside of the protection of the present check and instead utilise the one generated external to the presence check that getDataOperandBaseAddr created to access the .addr, which at runtime will generate a segfault in certain cases. 
Swapping it to utilise rawInput ensures the load is generated inside of the presence checks! That was at least my understanding of the problem; there are likely better ways to fix it, but this was the simplest and (seemingly) least intrusive one I came across. https://github.com/llvm/llvm-project/pull/138210 From flang-commits at lists.llvm.org Tue May 6 07:11:38 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Tue, 06 May 2025 07:11:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][volatile] Get volatility of designators from base instead of component symbol (PR #138611) In-Reply-To: Message-ID: <681a189a.170a0220.1ad4fc.97d9@mx.google.com> https://github.com/ashermancinelli approved this pull request. https://github.com/llvm/llvm-project/pull/138611 From flang-commits at lists.llvm.org Tue May 6 07:12:16 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 07:12:16 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR (PR #111155) In-Reply-To: Message-ID: <681a18c0.050a0220.1b10d4.072e@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/111155
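Editorial sketch for the presence-check discussion on PR #138210 earlier in this digest: the point agozillon makes is that a load of an optional argument's box must be emitted inside the presence guard, not before it. The following is only a plain C++ analogy under stated assumptions, not flang or FIR code. `Box`, `extentGuarded` and the null pointer standing in for an absent optional argument are all hypothetical names chosen for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for a Fortran descriptor ("box"); not a flang type.
struct Box {
  int extent;
};

// Problematic shape (left as a comment because it dereferences before the
// check, which is undefined behaviour when the argument is absent):
//
//   int extentHoisted(const Box *boxAddr) {
//     Box box = *boxAddr;              // load emitted outside the guard
//     return boxAddr ? box.extent : 0; // check comes too late
//   }

// Guarded shape: the load only happens once presence has been established,
// which is the effect the review comment attributes to using rawInput.
int extentGuarded(const Box *boxAddr) {
  if (boxAddr != nullptr) { // analogue of the presence check
    Box box = *boxAddr;     // load generated inside the guard
    return box.extent;
  }
  return 0; // absent optional argument: never touch the box
}
```

Calling `extentGuarded(nullptr)` takes the else path and never reads through the pointer, whereas the commented-out variant would read through a null pointer first. Whether the real fix should live in `genBoundsOpsFromBox` or elsewhere is the open question in the review thread; this sketch takes no position on that.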

From flang-commits at lists.llvm.org Tue May 6 07:13:32 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 07:13:32 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR (PR #111155) In-Reply-To: Message-ID: <681a190c.170a0220.34d0bf.93d1@mx.google.com> kaviya2510 wrote: Thanks for the review. https://github.com/llvm/llvm-project/pull/111155 From flang-commits at lists.llvm.org Tue May 6 07:14:13 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 07:14:13 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR (PR #111155) In-Reply-To: Message-ID: <681a1935.170a0220.1d4faa.c5a2@mx.google.com> ================ @@ -409,8 +409,8 @@ void DataSharingProcessor::collectSymbols( // Collect all symbols referenced in the evaluation being processed, // that matches 'flag'. llvm::SetVector allSymbols; - converter.collectSymbolSet(eval, allSymbols, flag, - /*collectSymbols=*/true, + + converter.collectSymbolSet(eval, allSymbols, flag, /*collectSymbols=*/true, ---------------- kaviya2510 wrote: Done with the changes. 
https://github.com/llvm/llvm-project/pull/111155 From flang-commits at lists.llvm.org Tue May 6 07:59:48 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 06 May 2025 07:59:48 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <681a23e4.050a0220.9bbc0.7a7b@mx.google.com> https://github.com/skatrak edited https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Tue May 6 07:59:48 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 06 May 2025 07:59:48 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <681a23e4.170a0220.31351.3e41@mx.google.com> https://github.com/skatrak approved this pull request. Thanks again Andrew, LGTM. No need for another review by me after addressing these final nits. https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Tue May 6 07:59:48 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 06 May 2025 07:59:48 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <681a23e4.630a0220.3c0424.f44c@mx.google.com> ================ @@ -2231,6 +2232,146 @@ genSingleOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static clause::Defaultmap::ImplicitBehavior +getDefaultmapIfPresent(DefaultMapsTy &defaultMaps, mlir::Type varType) { ---------------- skatrak wrote: ```suggestion getDefaultmapIfPresent(const DefaultMapsTy &defaultMaps, mlir::Type varType) { ``` https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Tue May 6 07:59:49 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 06 May 2025 07:59:49 -0700 (PDT) 
Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <681a23e5.050a0220.13eeba.6c19@mx.google.com> ================ @@ -2231,6 +2232,146 @@ genSingleOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static clause::Defaultmap::ImplicitBehavior +getDefaultmapIfPresent(DefaultMapsTy &defaultMaps, mlir::Type varType) { ---------------- skatrak wrote: Nit: These new helper functions belong to the "Code generation helper functions" section of the file, like e.g. `markDeclareTarget()`. https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Tue May 6 07:59:49 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 06 May 2025 07:59:49 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <681a23e5.170a0220.68b65.f6a5@mx.google.com> ================ @@ -2231,6 +2232,146 @@ genSingleOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static clause::Defaultmap::ImplicitBehavior +getDefaultmapIfPresent(DefaultMapsTy &defaultMaps, mlir::Type varType) { + using DefMap = clause::Defaultmap; + + if (defaultMaps.empty()) + return DefMap::ImplicitBehavior::Default; + + if (llvm::is_contained(defaultMaps, DefMap::VariableCategory::All)) + return defaultMaps[DefMap::VariableCategory::All]; + + // NOTE: Unsure if complex and/or vector falls into a scalar type + // or aggregate, but the current default implicit behaviour is to + // treat them as such (c_ptr has its own behaviour, so perhaps + // being lumped in as a scalar isn't the right thing). 
+ if ((fir::isa_trivial(varType) || fir::isa_char(varType) || + fir::isa_builtin_cptr_type(varType)) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Scalar)) + return defaultMaps[DefMap::VariableCategory::Scalar]; + + if (fir::isPointerType(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Pointer)) + return defaultMaps[DefMap::VariableCategory::Pointer]; + + if (fir::isAllocatableType(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Allocatable)) + return defaultMaps[DefMap::VariableCategory::Allocatable]; + + if (fir::isa_aggregate(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Aggregate)) { + return defaultMaps[DefMap::VariableCategory::Aggregate]; + } + + return DefMap::ImplicitBehavior::Default; +} + +static std::pair +getImplicitMapTypeAndKind(fir::FirOpBuilder &firOpBuilder, + lower::AbstractConverter &converter, + DefaultMapsTy &defaultMaps, mlir::Type varType, ---------------- skatrak wrote: ```suggestion const DefaultMapsTy &defaultMaps, mlir::Type varType, ``` https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Tue May 6 07:59:50 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 06 May 2025 07:59:50 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <681a23e6.170a0220.96379.9e6c@mx.google.com> ================ @@ -32,6 +32,10 @@ namespace Fortran { namespace lower { namespace omp { +// Container type for tracking user specified Defaultmaps for a target region +using DefaultMapsTy = std::map Message-ID: <681a28fd.170a0220.2b050f.e4fc@mx.google.com> ================ @@ -1,13 +1,25 @@ ! Test predefined macro for PowerPC architecture -! RUN: %flang_fc1 -triple ppc64le-unknown-linux -cpp -E %s | FileCheck %s +! RUN: %flang_fc1 -triple ppc64le-unknown-linux -cpp -E %s | FileCheck %s -check-prefix=CHECK-LINUX +! 
RUN: %flang_fc1 -triple powerpc-unknown-aix -cpp -E %s | FileCheck %s -check-prefix=CHECK-AIX32 +! RUN: %flang_fc1 -triple powerpc64-unknown-aix -cpp -E %s | FileCheck %s -check-prefix=CHECK-AIX64 ! REQUIRES: target=powerpc{{.*}} -! CHECK: integer :: var1 = 1 -! CHECK: integer :: var2 = 1 +! CHECK-LINUX: integer :: var1 = 1 +! CHECK-LINUX: integer :: var2 = 1 +! CHECK-AIX32: integer :: var1 = 1 +! CHECK-AIX32: integer :: var2 = 1 +! CHECK-AIX32: integer :: var3 = __64BIT__ ---------------- DanielCChen wrote: Should this be `... :: var3 = 0`? https://github.com/llvm/llvm-project/pull/138591 From flang-commits at lists.llvm.org Tue May 6 09:03:04 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Tue, 06 May 2025 09:03:04 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Add frontend support for declare variant (PR #130578) In-Reply-To: Message-ID: <681a32b8.050a0220.20be2e.5d70@mx.google.com> https://github.com/kiranchandramohan updated https://github.com/llvm/llvm-project/pull/130578 >From 26bbee2e27ebb370808872cf9056d8946665a01d Mon Sep 17 00:00:00 2001 From: Kiran Chandramohan Date: Mon, 10 Mar 2025 10:56:36 +0000 Subject: [PATCH 1/2] [Flang][OpenMP] Add frontend support for declare variant Support is added for parsing and semantics. Lowering will emit a TODO error. 
--- flang/include/flang/Parser/dump-parse-tree.h | 6 + flang/include/flang/Parser/parse-tree.h | 26 ++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 7 ++ flang/lib/Parser/openmp-parsers.cpp | 26 +++++ flang/lib/Parser/unparse.cpp | 24 +++- flang/lib/Semantics/check-omp-structure.cpp | 10 ++ flang/lib/Semantics/check-omp-structure.h | 2 + flang/lib/Semantics/resolve-names.cpp | 19 ++++ .../Lower/OpenMP/Todo/declare-variant.f90 | 17 +++ flang/test/Parser/OpenMP/declare-variant.f90 | 104 ++++++++++++++++++ .../test/Semantics/OpenMP/declare-variant.f90 | 14 +++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 8 +- 12 files changed, 257 insertions(+), 6 deletions(-) create mode 100644 flang/test/Lower/OpenMP/Todo/declare-variant.f90 create mode 100644 flang/test/Parser/OpenMP/declare-variant.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-variant.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a3721bc8410ba 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -483,6 +483,11 @@ class ParseTreeDumper { NODE(parser, OldParameterStmt) NODE(parser, OmpTypeSpecifier) NODE(parser, OmpTypeNameList) + NODE(parser, OmpAdjustArgsClause) + NODE(OmpAdjustArgsClause, OmpAdjustOp) + NODE_ENUM(OmpAdjustArgsClause::OmpAdjustOp, Value) + NODE(parser, OmpAppendArgsClause) + NODE(OmpAppendArgsClause, OmpAppendOp) NODE(parser, OmpLocator) NODE(parser, OmpLocatorList) NODE(parser, OmpReductionSpecifier) @@ -703,6 +708,7 @@ class ParseTreeDumper { NODE(parser, OpenMPCriticalConstruct) NODE(parser, OpenMPDeclarativeAllocate) NODE(parser, OpenMPDeclarativeConstruct) + NODE(parser, OmpDeclareVariantDirective) NODE(parser, OpenMPDeclareReductionConstruct) NODE(parser, OpenMPDeclareSimdConstruct) NODE(parser, OpenMPDeclareTargetConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 
ca8473c6f9674..20d5fc49426d2 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4013,6 +4013,15 @@ struct OmpAbsentClause { WRAPPER_CLASS_BOILERPLATE(OmpAbsentClause, OmpDirectiveList); }; +struct OmpAdjustArgsClause { + TUPLE_CLASS_BOILERPLATE(OmpAdjustArgsClause); + struct OmpAdjustOp { + ENUM_CLASS(Value, Nothing, Need_Device_Ptr) + WRAPPER_CLASS_BOILERPLATE(OmpAdjustOp, Value); + }; + std::tuple t; +}; + // Ref: [5.0:135-140], [5.1:161-166], [5.2:264-265] // // affinity-clause -> @@ -4056,6 +4065,13 @@ struct OmpAllocateClause { std::tuple t; }; +struct OmpAppendArgsClause { + struct OmpAppendOp { + WRAPPER_CLASS_BOILERPLATE(OmpAppendOp, std::list); + }; + WRAPPER_CLASS_BOILERPLATE(OmpAppendArgsClause, std::list); +}; + // Ref: [5.2:216-217 (sort of, as it's only mentioned in passing) // AT(compilation|execution) struct OmpAtClause { @@ -4693,6 +4709,12 @@ struct OmpBlockDirective { CharBlock source; }; +struct OmpDeclareVariantDirective { + TUPLE_CLASS_BOILERPLATE(OmpDeclareVariantDirective); + CharBlock source; + std::tuple, Name, OmpClauseList> t; +}; + // 2.10.6 declare-target -> DECLARE TARGET (extended-list) | // DECLARE TARGET [declare-target-clause[ [,] // declare-target-clause]...] 
@@ -4771,8 +4793,8 @@ struct OpenMPDeclarativeConstruct { std::variant + OmpDeclareVariantDirective, OpenMPThreadprivate, OpenMPRequiresConstruct, + OpenMPUtilityConstruct, OmpMetadirectiveDirective> u; }; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 47e7c266ff7d3..ca514a8f33c62 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3755,6 +3755,13 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMP ASSUMES declaration"); } +static void +genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, + semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, + const parser::OmpDeclareVariantDirective &declareVariantDirective) { + TODO(converter.getCurrentLocation(), "OpenMPDeclareVariantDirective"); +} + static void genOMP( lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..2f01b3c254701 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -611,6 +611,14 @@ TYPE_PARSER(sourced(construct( TYPE_PARSER(sourced(construct( // Parser{}))) +TYPE_PARSER(construct( + "INTEROP" >> parenthesized(nonemptyList(Parser{})))) + +TYPE_PARSER(construct( + "NOTHING" >> pure(OmpAdjustArgsClause::OmpAdjustOp::Value::Nothing) || + "NEED_DEVICE_PTR" >> + pure(OmpAdjustArgsClause::OmpAdjustOp::Value::Need_Device_Ptr))) + // --- Parsers for clauses -------------------------------------------- /// `MOBClause` is a clause that has a @@ -630,6 +638,10 @@ static inline MOBClause makeMobClause( } } +TYPE_PARSER(construct( + (Parser{} / ":"), + Parser{})) + // [5.0] 2.10.1 affinity([aff-modifier:] locator-list) // aff-modifier: interator-modifier TYPE_PARSER(construct( @@ -653,6 +665,9 @@ 
TYPE_PARSER(construct( TYPE_PARSER(construct( OmpDirectiveNameParser{}, maybe(parenthesized(scalarLogicalExpr)))) +TYPE_PARSER(construct( + nonemptyList(Parser{}))) + // 2.15.3.1 DEFAULT (PRIVATE | FIRSTPRIVATE | SHARED | NONE) TYPE_PARSER(construct( "PRIVATE" >> pure(OmpDefaultClause::DataSharingAttribute::Private) || @@ -901,6 +916,8 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "ACQUIRE" >> construct(construct()) || "ACQ_REL" >> construct(construct()) || + "ADJUST_ARGS" >> construct(construct( + parenthesized(Parser{}))) || "AFFINITY" >> construct(construct( parenthesized(Parser{}))) || "ALIGN" >> construct(construct( @@ -909,6 +926,8 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "ALLOCATE" >> construct(construct( parenthesized(Parser{}))) || + "APPEND_ARGS" >> construct(construct( + parenthesized(Parser{}))) || "ALLOCATOR" >> construct(construct( parenthesized(scalarIntExpr))) || "AT" >> construct(construct( @@ -1342,6 +1361,11 @@ TYPE_PARSER(construct( construct(assignmentStmt) || construct(Parser{}))) +// OpenMP 5.2: 7.5.4 Declare Variant directive +TYPE_PARSER(sourced( + construct(verbatim("DECLARE VARIANT"_tok), + "(" >> maybe(name / ":"), name / ")", Parser{}))) + // 2.16 Declare Reduction Construct TYPE_PARSER(sourced(construct( verbatim("DECLARE REDUCTION"_tok), @@ -1513,6 +1537,8 @@ TYPE_PARSER( Parser{}) || construct( Parser{}) || + construct( + Parser{}) || construct( Parser{}) || construct( diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..1ee9096fcda56 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2743,7 +2743,28 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } - + void Unparse(const OmpAppendArgsClause::OmpAppendOp &x) { + Put("INTEROP("); + Walk(x.v, ","); + Put(")"); + } + void Unparse(const OmpAppendArgsClause &x) { Walk(x.v, ","); } + void Unparse(const OmpAdjustArgsClause &x) { + Walk(std::get(x.t).v); + Put(":"); + Walk(std::get(x.t)); + } + void 
Unparse(const OmpDeclareVariantDirective &x) { + BeginOpenMP(); + Word("!$OMP DECLARE VARIANT "); + Put("("); + Walk(std::get>(x.t), ":"); + Walk(std::get(x.t)); + Put(")"); + Walk(std::get(x.t)); + Put("\n"); + EndOpenMP(); + } void Unparse(const OpenMPInteropConstruct &x) { BeginOpenMP(); Word("!$OMP INTEROP"); @@ -3042,6 +3063,7 @@ class UnparseVisitor { WALK_NESTED_ENUM(InquireSpec::LogVar, Kind) WALK_NESTED_ENUM(ProcedureStmt, Kind) // R1506 WALK_NESTED_ENUM(UseStmt, ModuleNature) // R1410 + WALK_NESTED_ENUM(OmpAdjustArgsClause::OmpAdjustOp, Value) // OMP adjustop WALK_NESTED_ENUM(OmpAtClause, ActionTime) // OMP at WALK_NESTED_ENUM(OmpBindClause, Binding) // OMP bind WALK_NESTED_ENUM(OmpProcBindClause, AffinityPolicy) // OMP proc_bind diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index d9fe32bae1c27..ab8196f95807e 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -1619,6 +1619,16 @@ void OmpStructureChecker::Leave(const parser::OpenMPDeclareSimdConstruct &) { dirContext_.pop_back(); } +void OmpStructureChecker::Enter(const parser::OmpDeclareVariantDirective &x) { + const auto &dir{std::get(x.t)}; + PushContextAndClauseSets( + dir.source, llvm::omp::Directive::OMPD_declare_variant); +} + +void OmpStructureChecker::Leave(const parser::OmpDeclareVariantDirective &) { + dirContext_.pop_back(); +} + void OmpStructureChecker::Enter(const parser::OpenMPDepobjConstruct &x) { const auto &dirName{std::get(x.v.t)}; PushContextAndClauseSets(dirName.source, llvm::omp::Directive::OMPD_depobj); diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..911a6bb08fb87 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -98,6 +98,8 @@ class OmpStructureChecker void Enter(const parser::OmpEndSectionsDirective &); void Leave(const 
parser::OmpEndSectionsDirective &); + void Enter(const parser::OmpDeclareVariantDirective &); + void Leave(const parser::OmpDeclareVariantDirective &); void Enter(const parser::OpenMPDeclareSimdConstruct &); void Leave(const parser::OpenMPDeclareSimdConstruct &); void Enter(const parser::OpenMPDeclarativeAllocate &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..8e1ded1f98e82 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1511,6 +1511,25 @@ class OmpVisitor : public virtual DeclarationVisitor { return true; } + bool Pre(const parser::OmpDeclareVariantDirective &x) { + AddOmpSourceRange(x.source); + auto FindSymbolOrError = [&](const parser::Name &procName) { + auto *symbol{FindSymbol(NonDerivedTypeScope(), procName)}; + if (!symbol) { + context().Say(procName.source, + "Implicit subroutine declaration '%s' in !$OMP DECLARE VARIANT"_err_en_US, + procName.source); + } + }; + auto &baseProcName = std::get>(x.t); + if (baseProcName) { + FindSymbolOrError(*baseProcName); + } + auto &varProcName = std::get(x.t); + FindSymbolOrError(varProcName); + return true; + } + bool Pre(const parser::OpenMPDeclareReductionConstruct &x) { AddOmpSourceRange(x.source); ProcessReductionSpecifier( diff --git a/flang/test/Lower/OpenMP/Todo/declare-variant.f90 b/flang/test/Lower/OpenMP/Todo/declare-variant.f90 new file mode 100644 index 0000000000000..135317be5d02a --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/declare-variant.f90 @@ -0,0 +1,17 @@ +! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s + +! 
CHECK: not yet implemented: OpenMPDeclareVariantDirective + +subroutine sb1 + integer :: x + x = 1 + call sub(x) +contains + subroutine vsub (v1) + integer, value :: v1 + end + subroutine sub (v1) + !$omp declare variant(vsub), match(construct={dispatch}) + integer, value :: v1 + end +end subroutine diff --git a/flang/test/Parser/OpenMP/declare-variant.f90 b/flang/test/Parser/OpenMP/declare-variant.f90 new file mode 100644 index 0000000000000..1b97733ea9525 --- /dev/null +++ b/flang/test/Parser/OpenMP/declare-variant.f90 @@ -0,0 +1,104 @@ +! RUN: %flang_fc1 -fdebug-unparse-no-sema -fopenmp %s | FileCheck --ignore-case %s +! RUN: %flang_fc1 -fdebug-dump-parse-tree-no-sema -fopenmp %s | FileCheck --check-prefix="PARSE-TREE" %s + +subroutine sub0 +!CHECK: !$OMP DECLARE VARIANT (sub:vsub) MATCH(CONSTRUCT={PARALLEL}) +!PARSE-TREE: OpenMPDeclarativeConstruct -> OmpDeclareVariantDirective +!PARSE-TREE: | Verbatim +!PARSE-TREE: | Name = 'sub' +!PARSE-TREE: | Name = 'vsub' +!PARSE-TREE: | OmpClauseList -> OmpClause -> Match -> OmpMatchClause -> OmpContextSelectorSpecification -> OmpTraitSetSelector +!PARSE-TREE: | | OmpTraitSetSelectorName -> Value = Construct +!PARSE-TREE: | | OmpTraitSelector +!PARSE-TREE: | | | OmpTraitSelectorName -> llvm::omp::Directive = parallel + !$omp declare variant (sub:vsub) match (construct={parallel}) +contains + subroutine vsub + end subroutine + + subroutine sub () + end subroutine +end subroutine + +subroutine sb1 + integer :: x + x = 1 + !$omp dispatch device(1) + call sub(x) +contains + subroutine vsub (v1) + integer, value :: v1 + end + subroutine sub (v1) +!CHECK: !$OMP DECLARE VARIANT (vsub) MATCH(CONSTRUCT={DISPATCH} +!PARSE-TREE: OpenMPDeclarativeConstruct -> OmpDeclareVariantDirective +!PARSE-TREE: | Verbatim +!PARSE-TREE: | Name = 'vsub' +!PARSE-TREE: | OmpClauseList -> OmpClause -> Match -> OmpMatchClause -> OmpContextSelectorSpecification -> OmpTraitSetSelector +!PARSE-TREE: | | OmpTraitSetSelectorName -> Value = Construct 
+!PARSE-TREE: | | OmpTraitSelector +!PARSE-TREE: | | | OmpTraitSelectorName -> llvm::omp::Directive = dispatch + !$omp declare variant(vsub), match(construct={dispatch}) + integer, value :: v1 + end +end subroutine + +subroutine sb2 (x1, x2) + use omp_lib, only: omp_interop_kind + integer :: x + x = 1 + !$omp dispatch device(1) + call sub(x) +contains + subroutine vsub (v1, a1, a2) + integer, value :: v1 + integer(omp_interop_kind) :: a1 + integer(omp_interop_kind), value :: a2 + end + subroutine sub (v1) +!CHECK: !$OMP DECLARE VARIANT (vsub) MATCH(CONSTRUCT={DISPATCH}) APPEND_ARGS(INTEROP(T& +!CHECK: !$OMP&ARGET),INTEROP(TARGET)) +!PARSE-TREE: OpenMPDeclarativeConstruct -> OmpDeclareVariantDirective +!PARSE-TREE: | Verbatim +!PARSE-TREE: | Name = 'vsub' +!PARSE-TREE: | OmpClauseList -> OmpClause -> Match -> OmpMatchClause -> OmpContextSelectorSpecification -> OmpTraitSetSelector +!PARSE-TREE: | | OmpTraitSetSelectorName -> Value = Construct +!PARSE-TREE: | | OmpTraitSelector +!PARSE-TREE: | | | OmpTraitSelectorName -> llvm::omp::Directive = dispatch +!PARSE-TREE: | OmpClause -> AppendArgs -> OmpAppendArgsClause -> OmpAppendOp -> OmpInteropType -> Value = Target +!PARSE-TREE: | OmpAppendOp -> OmpInteropType -> Value = Target + !$omp declare variant(vsub), match(construct={dispatch}), append_args (interop(target), interop(target)) + integer, value :: v1 + end +end subroutine + +subroutine sb3 (x1, x2) + use iso_c_binding, only: c_ptr + type(c_ptr), value :: x1, x2 + + !$omp dispatch device(1) + call sub(x1, x2) +contains + subroutine sub (v1, v2) + type(c_ptr), value :: v1, v2 +!CHECK: !$OMP DECLARE VARIANT (vsub) MATCH(CONSTRUCT={DISPATCH}) ADJUST_ARGS(NOTHING:v& +!CHECK: !$OMP&1) ADJUST_ARGS(NEED_DEVICE_PTR:v2) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OmpDeclareVariantDirective +!PARSE-TREE: | Verbatim +!PARSE-TREE: | Name = 'vsub' +!PARSE-TREE: | OmpClauseList -> OmpClause -> Match -> OmpMatchClause -> 
OmpContextSelectorSpecification -> OmpTraitSetSelector +!PARSE-TREE: | | OmpTraitSetSelectorName -> Value = Construct +!PARSE-TREE: | | OmpTraitSelector +!PARSE-TREE: | | | OmpTraitSelectorName -> llvm::omp::Directive = dispatch +!PARSE-TREE: | OmpClause -> AdjustArgs -> OmpAdjustArgsClause +!PARSE-TREE: | | OmpAdjustOp -> Value = Nothing +!PARSE-TREE: | | OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'v1' +!PARSE-TREE: | OmpClause -> AdjustArgs -> OmpAdjustArgsClause +!PARSE-TREE: | | OmpAdjustOp -> Value = Need_Device_Ptr +!PARSE-TREE: | | OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'v2' + !$omp declare variant(vsub) match ( construct = { dispatch } ) adjust_args(nothing : v1 ) adjust_args(need_device_ptr : v2) + end + subroutine vsub(v1, v2) + type(c_ptr), value :: v1, v2 + end +end subroutine diff --git a/flang/test/Semantics/OpenMP/declare-variant.f90 b/flang/test/Semantics/OpenMP/declare-variant.f90 new file mode 100644 index 0000000000000..84a0cdcd10d91 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-variant.f90 @@ -0,0 +1,14 @@ +! 
RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=51 + +subroutine sub0 +!ERROR: Implicit subroutine declaration 'vsub1' in !$OMP DECLARE VARIANT + !$omp declare variant (sub:vsub1) match (construct={parallel}) +!ERROR: Implicit subroutine declaration 'sub1' in !$OMP DECLARE VARIANT + !$omp declare variant (sub1:vsub) match (construct={parallel}) +contains + subroutine vsub + end subroutine + + subroutine sub () + end subroutine +end subroutine diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..a648fd27904fd 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -43,6 +43,7 @@ def OMPC_AcqRel : Clause<"acq_rel"> { let clangClass = "OMPAcqRelClause"; } def OMPC_AdjustArgs : Clause<"adjust_args"> { + let flangClass = "OmpAdjustArgsClause"; } def OMPC_Affinity : Clause<"affinity"> { let clangClass = "OMPAffinityClause"; @@ -65,6 +66,7 @@ def OMPC_Allocator : Clause<"allocator"> { let flangClass = "ScalarIntExpr"; } def OMPC_AppendArgs : Clause<"append_args"> { + let flangClass = "OmpAppendArgsClause"; } def OMPC_At : Clause<"at"> { let clangClass = "OMPAtClause"; @@ -721,10 +723,10 @@ def OMP_EndDeclareTarget : Directive<"end declare target"> { } def OMP_DeclareVariant : Directive<"declare variant"> { let allowedClauses = [ - VersionedClause, - ]; - let allowedExclusiveClauses = [ VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, VersionedClause, ]; let association = AS_Declaration; >From 637008df2a71e54769c5a31e988e3ddbdbef58f1 Mon Sep 17 00:00:00 2001 From: Kiran Chandramohan Date: Tue, 6 May 2025 14:24:20 +0000 Subject: [PATCH 2/2] Fix TODO message --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- flang/test/Lower/OpenMP/Todo/declare-variant.f90 | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ca514a8f33c62..c59c886196c50 
100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3759,7 +3759,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OmpDeclareVariantDirective &declareVariantDirective) { - TODO(converter.getCurrentLocation(), "OpenMPDeclareVariantDirective"); + TODO(converter.getCurrentLocation(), "OmpDeclareVariantDirective"); } static void genOMP( diff --git a/flang/test/Lower/OpenMP/Todo/declare-variant.f90 b/flang/test/Lower/OpenMP/Todo/declare-variant.f90 index 135317be5d02a..5719ef3afdee1 100644 --- a/flang/test/Lower/OpenMP/Todo/declare-variant.f90 +++ b/flang/test/Lower/OpenMP/Todo/declare-variant.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMPDeclareVariantDirective +! CHECK: not yet implemented: OmpDeclareVariantDirective subroutine sb1 integer :: x From flang-commits at lists.llvm.org Tue May 6 09:05:24 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 09:05:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang][Evaluate] Restrict ConstantBase constructor overload (PR #138456) In-Reply-To: Message-ID: <681a3344.170a0220.149288.f252@mx.google.com> https://github.com/klausler approved this pull request. https://github.com/llvm/llvm-project/pull/138456 From flang-commits at lists.llvm.org Tue May 6 09:07:32 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 09:07:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang][Evaluate] Fix AsGenericExpr for Relational (PR #138455) In-Reply-To: Message-ID: <681a33c4.170a0220.1dc8c8.f2e0@mx.google.com> https://github.com/klausler approved this pull request. 
https://github.com/llvm/llvm-project/pull/138455 From flang-commits at lists.llvm.org Tue May 6 09:17:02 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 09:17:02 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681a35fe.170a0220.96379.c522@mx.google.com> ================ @@ -127,6 +128,8 @@ class WithBindName { // Device type specific OpenACC routine information class OpenACCRoutineDeviceTypeInfo { public: + OpenACCRoutineDeviceTypeInfo(Fortran::common::OpenACCDeviceType dType) ---------------- klausler wrote: Should be `explicit` so that it isn't an implicit type conversion. https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Tue May 6 09:17:02 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 09:17:02 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681a35fe.a70a0220.32933e.16f3@mx.google.com> ================ @@ -1034,61 +1034,53 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Symbol &symbol, const parser::OpenACCRoutineConstruct &x) { if (symbol.has()) { Fortran::semantics::OpenACCRoutineInfo info; + std::vector currentDevices; + currentDevices.push_back(&info); const auto &clauses = std::get(x.t); for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isSeq(); - } else { - info.deviceTypeInfos().back().set_isSeq(); - } + if (const auto *dTypeClause = ---------------- klausler wrote: Please always use braced initialization throughout Semantics. 
https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Tue May 6 09:17:02 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 09:17:02 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681a35fe.630a0220.3c0424.19fc@mx.google.com> ================ @@ -1034,61 +1034,53 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Symbol &symbol, const parser::OpenACCRoutineConstruct &x) { if (symbol.has()) { Fortran::semantics::OpenACCRoutineInfo info; + std::vector currentDevices; + currentDevices.push_back(&info); const auto &clauses = std::get(x.t); for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isSeq(); - } else { - info.deviceTypeInfos().back().set_isSeq(); - } + if (const auto *dTypeClause = + std::get_if(&clause.u)) { + currentDevices.clear(); + for (const auto &deviceTypeExpr : dTypeClause->v.v) + currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v)); + } else if (std::get_if(&clause.u)) { + info.set_isNohost(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) ---------------- klausler wrote: Always use braces around the bodies of `if`, `for`, `while`, &c. in Semantics. https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Tue May 6 09:17:02 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 09:17:02 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. 
(PR #136012) In-Reply-To: Message-ID: <681a35fe.050a0220.64430.88f8@mx.google.com> ================ @@ -137,22 +140,28 @@ class OpenACCRoutineDeviceTypeInfo { void set_isGang(bool value = true) { isGang_ = value; } unsigned gangDim() const { return gangDim_; } void set_gangDim(unsigned value) { gangDim_ = value; } - const std::string *bindName() const { - return bindName_ ? &*bindName_ : nullptr; + const std::variant *bindName() const { + return bindName_.has_value() ? &*bindName_ : nullptr; } - void set_bindName(std::string &&name) { bindName_ = std::move(name); } - void set_dType(Fortran::common::OpenACCDeviceType dType) { - deviceType_ = dType; + const std::optional> & ---------------- klausler wrote: Same question here as above. https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Tue May 6 09:17:03 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 09:17:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681a35ff.170a0220.1221c6.de13@mx.google.com> ================ @@ -137,22 +140,28 @@ class OpenACCRoutineDeviceTypeInfo { void set_isGang(bool value = true) { isGang_ = value; } unsigned gangDim() const { return gangDim_; } void set_gangDim(unsigned value) { gangDim_ = value; } - const std::string *bindName() const { - return bindName_ ? &*bindName_ : nullptr; + const std::variant *bindName() const { ---------------- klausler wrote: Why have a variant return type if only one of its alternatives will be returned? 
https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Tue May 6 09:24:06 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Tue, 06 May 2025 09:24:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) Message-ID: https://github.com/mrkajetanp created https://github.com/llvm/llvm-project/pull/138718 hlfir.copy_in implements copying non-contiguous array slices for functions that take in arrays required to be contiguous through a flang-rt function that calls memcpy/memmove separately on each element. For large arrays of trivial types, this can incur considerable overhead compared to a plain copy loop that is better able to take advantage of hardware pipelines. To address that, extend the InlineHLFIRAssign optimisation pass with a new pattern for inlining hlfir.copy_in operations for trivial types. For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in). Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by a factor of about 1/3rd.


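The copy-in behaviour described in the PR summary above can be illustrated with a small standalone sketch. This is not the MLIR pattern from the PR: `Section` and `copyIn` are hypothetical names, and the rank-1, `double`-only layout is an assumption made to keep the example short. It shows the two branches the description implies: pass a contiguous section through unchanged, or gather a strided one into a temporary with a plain loop and report that a copy was made.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a rank-1 array section: base pointer, element
// count, and stride in elements (stride == 1 means contiguous).
struct Section {
  const double *base;
  std::size_t extent;
  std::ptrdiff_t stride;
};

// Returns {pointer to contiguous data, wasCopied}. A contiguous section is
// passed through untouched; a strided one is gathered into `temp` with a
// plain loop instead of one memcpy call per element.
std::pair<const double *, bool> copyIn(const Section &s,
                                       std::vector<double> &temp) {
  if (s.stride == 1) {
    return {s.base, false};
  }
  temp.resize(s.extent);
  for (std::size_t i = 0; i < s.extent; ++i) {
    temp[i] = s.base[static_cast<std::ptrdiff_t>(i) * s.stride];
  }
  return {temp.data(), true};
}
```

The `wasCopied` flag plays the role of the cleanup condition: only when it is true does the caller own a temporary that has to be released after the call.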
From flang-commits at lists.llvm.org Tue May 6 09:27:10 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 09:27:10 -0700 (PDT) Subject: [flang-commits] [flang] [flang][AIX] Predefine __64BIT__ and _AIX macros (PR #138591) In-Reply-To: Message-ID: <681a385e.050a0220.3c3295.9a54@mx.google.com> https://github.com/klausler approved this pull request. https://github.com/llvm/llvm-project/pull/138591 From flang-commits at lists.llvm.org Tue May 6 09:28:45 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 09:28:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang][volatile] Get volatility of designators from base instead of component symbol (PR #138611) In-Reply-To: Message-ID: <681a38bd.050a0220.151ca4.9a64@mx.google.com> https://github.com/klausler approved this pull request. https://github.com/llvm/llvm-project/pull/138611 From flang-commits at lists.llvm.org Tue May 6 09:28:53 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Tue, 06 May 2025 09:28:53 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <681a38c5.170a0220.32e890.f97e@mx.google.com> https://github.com/mrkajetanp edited https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 6 09:37:05 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 06 May 2025 09:37:05 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Add frontend support for declare variant (PR #130578) In-Reply-To: Message-ID: <681a3ab1.630a0220.216826.0b40@mx.google.com> https://github.com/tblah approved this pull request. 
Thanks https://github.com/llvm/llvm-project/pull/130578 From flang-commits at lists.llvm.org Tue May 6 09:45:06 2025 From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits) Date: Tue, 06 May 2025 09:45:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang][AIX] Predefine __64BIT__ and _AIX macros (PR #138591) In-Reply-To: Message-ID: <681a3c92.050a0220.230a99.6c5d@mx.google.com> ================ @@ -1,13 +1,25 @@ ! Test predefined macro for PowerPC architecture -! RUN: %flang_fc1 -triple ppc64le-unknown-linux -cpp -E %s | FileCheck %s +! RUN: %flang_fc1 -triple ppc64le-unknown-linux -cpp -E %s | FileCheck %s -check-prefix=CHECK-LINUX +! RUN: %flang_fc1 -triple powerpc-unknown-aix -cpp -E %s | FileCheck %s -check-prefix=CHECK-AIX32 +! RUN: %flang_fc1 -triple powerpc64-unknown-aix -cpp -E %s | FileCheck %s -check-prefix=CHECK-AIX64 ! REQUIRES: target=powerpc{{.*}} -! CHECK: integer :: var1 = 1 -! CHECK: integer :: var2 = 1 +! CHECK-LINUX: integer :: var1 = 1 +! CHECK-LINUX: integer :: var2 = 1 +! CHECK-AIX32: integer :: var1 = 1 +! CHECK-AIX32: integer :: var2 = 1 +! CHECK-AIX32: integer :: var3 = __64BIT__ ---------------- kkwli wrote: No. It should be `__64BIT__`. The macro `__64BIT__` is not defined so no macro substitution occurs. 
https://github.com/llvm/llvm-project/pull/138591 From flang-commits at lists.llvm.org Tue May 6 09:59:29 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 09:59:29 -0700 (PDT) Subject: [flang-commits] [flang] 96e0930 - [flang][Evaluate] Fix AsGenericExpr for Relational (#138455) Message-ID: <681a3ff1.630a0220.576ff.13d1@mx.google.com> Author: Krzysztof Parzyszek Date: 2025-05-06T11:59:22-05:00 New Revision: 96e09302f9e6617b68b7dca17a5a0c866e147d4d URL: https://github.com/llvm/llvm-project/commit/96e09302f9e6617b68b7dca17a5a0c866e147d4d DIFF: https://github.com/llvm/llvm-project/commit/96e09302f9e6617b68b7dca17a5a0c866e147d4d.diff LOG: [flang][Evaluate] Fix AsGenericExpr for Relational (#138455) The variant in Expr> only contains Relational, not other, more specific Relational types. When calling AsGenericExpr for a value of type Relational, the AsExpr function will attempt to create Expr<> directly for Relational, which won't work for the above reason. Implement an overload of AsExpr for Relational, which will wrap the Relational in Relational before creating Expr<>. Added: Modified: flang/include/flang/Evaluate/tools.h Removed: ################################################################################ diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h index 1414eaf14f7d6..5cdabb3056d8f 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -130,6 +130,14 @@ template common::IfNoLvalue>, A> AsExpr(A &&x) { return Expr>{std::move(x)}; } +template ::Result> +Expr AsExpr(Relational &&x) { + // The variant in Expr> only contains + // Relational, not other Relationals. Wrap the Relational + // in Relational before creating Expr<>. 
+ return Expr(Relational{std::move(x)}); +} + template Expr AsExpr(Expr &&x) { static_assert(IsSpecificIntrinsicType); return std::move(x); From flang-commits at lists.llvm.org Tue May 6 09:59:33 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Tue, 06 May 2025 09:59:33 -0700 (PDT) Subject: [flang-commits] [flang] [flang][Evaluate] Fix AsGenericExpr for Relational (PR #138455) In-Reply-To: Message-ID: <681a3ff5.a70a0220.96a1a.7c49@mx.google.com> https://github.com/kparzysz closed https://github.com/llvm/llvm-project/pull/138455 From flang-commits at lists.llvm.org Tue May 6 10:00:32 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 10:00:32 -0700 (PDT) Subject: [flang-commits] [flang] 304c7a8 - [flang][Evaluate] Restrict ConstantBase constructor overload (#138456) Message-ID: <681a4030.050a0220.156158.6b00@mx.google.com> Author: Krzysztof Parzyszek Date: 2025-05-06T12:00:29-05:00 New Revision: 304c7a87d01bd0d7c75a0b875beed0a6b491383e URL: https://github.com/llvm/llvm-project/commit/304c7a87d01bd0d7c75a0b875beed0a6b491383e DIFF: https://github.com/llvm/llvm-project/commit/304c7a87d01bd0d7c75a0b875beed0a6b491383e.diff LOG: [flang][Evaluate] Restrict ConstantBase constructor overload (#138456) ConstantBase has a constructor that takes a value of any type as an input: template ConstantBase(const T &). A derived type Constant is a member of many Expr classes (as an alternative in the member variant). When trying (erroneously) to create Expr from a wrong input, if the specific instance of Expr contains Constant, it's that constructor that will be instantiated, leading to cryptic and confusing errors. Eliminate the constructor from overload for invalid input values to help produce more meaningful diagnostics. 
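The idea in the commit log above, removing a catch-all template constructor from overload resolution for unsuitable inputs, can be sketched in isolation. The class below is a hypothetical `Holder`, not flang's `ConstantBase`, and the `std::is_convertible_v` condition is an assumption chosen for illustration; the archived diff does not show the exact constraint the commit uses.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a constant holder; not flang's ConstantBase.
template <typename Element> class Holder {
public:
  // Participates in overload resolution only when A converts to Element.
  // For any other A the constructor is removed from the candidate set, so
  // the failure is reported at the call site instead of deep inside the
  // member initializer.
  template <typename A, typename = std::enable_if_t<
                            std::is_convertible_v<const A &, Element>>>
  explicit Holder(const A &x) : values_{static_cast<Element>(x)} {}

  const std::vector<Element> &values() const { return values_; }

private:
  std::vector<Element> values_;
};

struct Unrelated {};

// With the constraint in place, the trait answers the question directly.
static_assert(std::is_constructible_v<Holder<int>, long>);
static_assert(!std::is_constructible_v<Holder<int>, Unrelated>);
```

Without the `enable_if_t` default argument, `std::is_constructible_v<Holder<int>, Unrelated>` would be true and the error would only appear when the constructor body is instantiated, which is the kind of confusing diagnostic the log describes.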
Added: Modified: flang/include/flang/Evaluate/constant.h Removed: ################################################################################ diff --git a/flang/include/flang/Evaluate/constant.h b/flang/include/flang/Evaluate/constant.h index 6fc22e3b86aa2..d4c6601c37bca 100644 --- a/flang/include/flang/Evaluate/constant.h +++ b/flang/include/flang/Evaluate/constant.h @@ -110,8 +110,12 @@ class ConstantBase : public ConstantBounds { using Result = RESULT; using Element = ELEMENT; - template + // Constructor for creating ConstantBase from an actual value (i.e. + // literals, etc.) + template >> ConstantBase(const A &x, Result res = Result{}) : result_{res}, values_{x} {} + ConstantBase(ELEMENT &&x, Result res = Result{}) : result_{res}, values_{std::move(x)} {} ConstantBase( From flang-commits at lists.llvm.org Tue May 6 10:00:35 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Tue, 06 May 2025 10:00:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang][Evaluate] Restrict ConstantBase constructor overload (PR #138456) In-Reply-To: Message-ID: <681a4033.630a0220.2ddc0d.1ec7@mx.google.com> https://github.com/kparzysz closed https://github.com/llvm/llvm-project/pull/138456 From flang-commits at lists.llvm.org Tue May 6 10:13:10 2025 From: flang-commits at lists.llvm.org (Joseph Huber via flang-commits) Date: Tue, 06 May 2025 10:13:10 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Stop building multiple versions of the OpenMP library (PR #126657) In-Reply-To: Message-ID: <681a4326.630a0220.3c0424.316b@mx.google.com> https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/126657 From flang-commits at lists.llvm.org Tue May 6 10:13:11 2025 From: flang-commits at lists.llvm.org (Joseph Huber via flang-commits) Date: Tue, 06 May 2025 10:13:11 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Stop building multiple versions of the OpenMP library (PR #126657) In-Reply-To: Message-ID: 
<681a4327.170a0220.289717.fece@mx.google.com> jhuber6 wrote: Closing in favor of using the `-DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=amdgcn-amd-amdhsa` version. https://github.com/llvm/llvm-project/pull/126657 From flang-commits at lists.llvm.org Tue May 6 10:13:46 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 06 May 2025 10:13:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <681a434a.170a0220.91081.07cb@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 6 10:13:46 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 06 May 2025 10:13:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <681a434a.170a0220.2e18e6.02c2@mx.google.com> https://github.com/tblah commented: The approach used in your draft looks good to me. Just style nits from me. Please add lit tests before taking this out of draft. 
https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 6 10:13:47 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 06 May 2025 10:13:47 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <681a434b.170a0220.1ff9e8.0b52@mx.google.com> ================ @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). 
+ if (::llvm::cast(copyOut).getVar()) ---------------- tblah wrote: nit: use `mlir::cast` to cast mlir operations. https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 6 10:13:47 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 06 May 2025 10:13:47 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <681a434b.170a0220.260633.0ce5@mx.google.com> ================ @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. 
in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); ---------------- tblah wrote: Instead of doing `isa` and then a `cast`, you can do it all in one with `dyn_cast` and then check if the result is `nullptr`. https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 6 10:13:47 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 06 May 2025 10:13:47 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <681a434b.170a0220.2890a1.19b7@mx.google.com> ================ @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto *copyOut = ---------------- tblah wrote: nit: spell out this auto because the type is not obvious (and in other places) The guidelines are a bit ambiguous: https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable What is usually done in practice is to only use `auto` when the type is written in the initializer e.g. ``` auto copyOut = mlir::dyn_cast(...) ``` https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 6 10:13:47 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 06 May 2025 10:13:47 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <681a434b.170a0220.291246.75f4@mx.google.com> ================ @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. 
+ if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + auto results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + auto elem = hlfir::getElementAt(loc, builder, inputVariable, + loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + auto tempElem = hlfir::getElementAt(loc, builder, temp, + loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always 
a boxed array by boxing it + // ourselves if need be. + if (mlir::isa(temp.getType())) { + result = temp; + } else { + auto refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + auto refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + auto addr = results[0]; + auto needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + auto tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. 
+ rewriter.eraseOp(tempBox.getDefiningOp()); ---------------- tblah wrote: I would double check that it now has no uses https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 6 10:13:47 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 06 May 2025 10:13:47 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <681a434b.630a0220.3b3535.1c45@mx.google.com> ================ @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. 
in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + auto results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + auto elem = hlfir::getElementAt(loc, builder, inputVariable, + loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + auto tempElem = hlfir::getElementAt(loc, builder, temp, + loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + auto refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + auto refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + auto addr = results[0]; + auto needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { ---------------- tblah wrote: ```suggestion builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { ``` https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 6 10:37:11 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Tue, 06 May 2025 10:37:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Correctly prepare allocatable runtime call arguments (PR #138727) Message-ID: https://github.com/ashermancinelli created https://github.com/llvm/llvm-project/pull/138727 When lowering allocatables, the generated calls to runtime functions were not using the runtime::createArguments utility which handles the required conversions. createArguments is where I added the implicit volatile casts to handle converting volatile variables to the appropriate type based on their volatility in the callee. Because the calls to allocatable runtime functions were not using this function, their arguments were not casted to have the appropriate volatility. Add a test to demonstrate that volatile and allocatable class/box/reference types are appropriately casted before calling into the runtime library. Instead of using a recursive variadic template to perform the conversions in createArguments, map over the arguments directly so that createArguments can be called with an ArrayRef of arguments. 
Some cases in Allocatable.cpp already had a vector of values at the point where createArguments needed to be called - the new overload allows calling with a vector of args or the variadic version with each argument spelled out at the callsite. This change resulted in the allocatable runtime calls having their arguments converted left-to-right, which changed some of the test results. I used CHECK-DAG to ignore the order. Add some missing handling of volatile class entities, which I previously missed because I had not yet enabled volatile class entities in Lower.
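The refactoring described in the PR text above (a list-based `createArguments` plus a variadic overload that forwards to it) can be sketched in plain C++. This is a minimal illustration under stated assumptions, not the flang code: `Value`, `ParamType`, and `convertTo` are hypothetical stand-ins for the MLIR value, the callee's input type, and the converting cast, and `std::vector` stands in for `llvm::ArrayRef`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not flang or MLIR types.
struct Value { std::string text; };
struct ParamType { std::string name; };

// Hypothetical conversion step: tag the value with the type the callee expects.
inline Value convertTo(const ParamType &type, const Value &arg) {
  return Value{type.name + "(" + arg.text + ")"};
}

// List-based overload: walks the parameter types and the arguments in
// lockstep, left to right, converting each argument to its parameter type.
inline std::vector<Value> createArguments(const std::vector<ParamType> &inputs,
                                          const std::vector<Value> &args) {
  assert(inputs.size() == args.size() && "argument count must match signature");
  std::vector<Value> result;
  result.reserve(args.size());
  for (std::size_t i = 0; i < args.size(); ++i)
    result.push_back(convertTo(inputs[i], args[i]));
  return result;
}

// Variadic convenience overload: packs the arguments into a list and forwards
// to the overload above, so there is a single place that does conversions.
template <typename... As>
std::vector<Value> createArguments(const std::vector<ParamType> &inputs,
                                   const As &...args) {
  return createArguments(inputs, std::vector<Value>{args...});
}
```

A caller that already holds a vector passes it directly; a caller with individual values spells them out. Both paths convert in the same left-to-right order, which is consistent with the ordering change the PR text mentions.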


From flang-commits at lists.llvm.org Tue May 6 10:37:48 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 10:37:48 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Correctly prepare allocatable runtime call arguments (PR #138727) In-Reply-To: Message-ID: <681a48ec.050a0220.1dbf87.c134@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Asher Mancinelli (ashermancinelli)
Changes When lowering allocatables, the generated calls to runtime functions were not using the runtime::createArguments utility which handles the required conversions. createArguments is where I added the implicit volatile casts to handle converting volatile variables to the appropriate type based on their volatility in the callee. Because the calls to allocatable runtime functions were not using this function, their arguments were not casted to have the appropriate volatility. Add a test to demonstrate that volatile and allocatable class/box/reference types are appropriately casted before calling into the runtime library. Instead of using a recursive variadic template to perform the conversions in createArguments, map over the arguments directly so that createArguments can be called with an ArrayRef of arguments. Some cases in Allocatable.cpp already had a vector of values at the point where createArguments needed to be called - the new overload allows calling with a vector of args or the variadic version with each argument spelled out at the callsite. This change resulted in the allocatable runtime calls having their arguments converted left-to-right, which changed some of the test results. I used CHECK-DAG to ignore the order. Add some missing handling of volatile class entities, which I previously missed because I had not yet enabled volatile class entities in Lower. 
--- Patch is 55.61 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/138727.diff 9 Files Affected: - (modified) flang/include/flang/Optimizer/Builder/Runtime/RTBuilder.h (+11-20) - (modified) flang/lib/Lower/Allocatable.cpp (+26-39) - (modified) flang/lib/Lower/ConvertExprToHLFIR.cpp (+6-6) - (modified) flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp (+3-5) - (modified) flang/test/Lower/allocatable-polymorphic.f90 (+71-71) - (modified) flang/test/Lower/allocate-source-allocatables-2.f90 (+12-12) - (modified) flang/test/Lower/allocate-source-allocatables.f90 (+7-7) - (modified) flang/test/Lower/allocate-source-pointers.f90 (+5-5) - (added) flang/test/Lower/volatile-allocatable.f90 (+152) ``````````diff diff --git a/flang/include/flang/Optimizer/Builder/Runtime/RTBuilder.h b/flang/include/flang/Optimizer/Builder/Runtime/RTBuilder.h index 5440b36c0c628..98d7de81c7f08 100644 --- a/flang/include/flang/Optimizer/Builder/Runtime/RTBuilder.h +++ b/flang/include/flang/Optimizer/Builder/Runtime/RTBuilder.h @@ -26,6 +26,7 @@ #include "flang/Support/Fortran.h" #include "mlir/IR/BuiltinTypes.h" #include "mlir/IR/MLIRContext.h" +#include "llvm/ADT/STLExtras.h" #include "llvm/ADT/SmallVector.h" #include #include @@ -824,33 +825,23 @@ static mlir::func::FuncOp getIORuntimeFunc(mlir::Location loc, return getRuntimeFunc(loc, builder, /*isIO=*/true); } -namespace helper { -template -void createArguments(llvm::SmallVectorImpl &result, - fir::FirOpBuilder &builder, mlir::Location loc, - mlir::FunctionType fTy, A arg) { - result.emplace_back( - builder.createConvertWithVolatileCast(loc, fTy.getInput(N), arg)); -} - -template -void createArguments(llvm::SmallVectorImpl &result, - fir::FirOpBuilder &builder, mlir::Location loc, - mlir::FunctionType fTy, A arg, As... 
args) { - result.emplace_back( - builder.createConvertWithVolatileCast(loc, fTy.getInput(N), arg)); - createArguments(result, builder, loc, fTy, args...); +inline llvm::SmallVector +createArguments(fir::FirOpBuilder &builder, mlir::Location loc, + mlir::FunctionType fTy, llvm::ArrayRef args) { + return llvm::map_to_vector(llvm::zip_equal(fTy.getInputs(), args), + [&](const auto &pair) -> mlir::Value { + auto [type, argument] = pair; + return builder.createConvertWithVolatileCast( + loc, type, argument); + }); } -} // namespace helper /// Create a SmallVector of arguments for a runtime call. template llvm::SmallVector createArguments(fir::FirOpBuilder &builder, mlir::Location loc, mlir::FunctionType fTy, As... args) { - llvm::SmallVector result; - helper::createArguments<0>(result, builder, loc, fTy, args...); - return result; + return createArguments(builder, loc, fTy, {args...}); } } // namespace fir::runtime diff --git a/flang/lib/Lower/Allocatable.cpp b/flang/lib/Lower/Allocatable.cpp index 8d0444a6e5bd4..7e32575caad9b 100644 --- a/flang/lib/Lower/Allocatable.cpp +++ b/flang/lib/Lower/Allocatable.cpp @@ -138,12 +138,10 @@ static void genRuntimeSetBounds(fir::FirOpBuilder &builder, mlir::Location loc, builder) : fir::runtime::getRuntimeFunc( loc, builder); - llvm::SmallVector args{box.getAddr(), dimIndex, lowerBound, - upperBound}; - llvm::SmallVector operands; - for (auto [fst, snd] : llvm::zip(args, callee.getFunctionType().getInputs())) - operands.emplace_back(builder.createConvert(loc, snd, fst)); - builder.create(loc, callee, operands); + const auto args = fir::runtime::createArguments( + builder, loc, callee.getFunctionType(), box.getAddr(), dimIndex, + lowerBound, upperBound); + builder.create(loc, callee, args); } /// Generate runtime call to set the lengths of a character allocatable or @@ -162,9 +160,7 @@ static void genRuntimeInitCharacter(fir::FirOpBuilder &builder, if (inputTypes.size() != 5) fir::emitFatalError( loc, "AllocatableInitCharacter 
runtime interface not as expected"); - llvm::SmallVector args; - args.push_back(builder.createConvert(loc, inputTypes[0], box.getAddr())); - args.push_back(builder.createConvert(loc, inputTypes[1], len)); + llvm::SmallVector args = {box.getAddr(), len}; if (kind == 0) kind = mlir::cast(box.getEleTy()).getFKind(); args.push_back(builder.createIntegerConstant(loc, inputTypes[2], kind)); @@ -173,7 +169,9 @@ static void genRuntimeInitCharacter(fir::FirOpBuilder &builder, // TODO: coarrays int corank = 0; args.push_back(builder.createIntegerConstant(loc, inputTypes[4], corank)); - builder.create(loc, callee, args); + const auto convertedArgs = fir::runtime::createArguments( + builder, loc, callee.getFunctionType(), args); + builder.create(loc, callee, convertedArgs); } /// Generate a sequence of runtime calls to allocate memory. @@ -194,10 +192,9 @@ static mlir::Value genRuntimeAllocate(fir::FirOpBuilder &builder, args.push_back(errorManager.errMsgAddr); args.push_back(errorManager.sourceFile); args.push_back(errorManager.sourceLine); - llvm::SmallVector operands; - for (auto [fst, snd] : llvm::zip(args, callee.getFunctionType().getInputs())) - operands.emplace_back(builder.createConvert(loc, snd, fst)); - return builder.create(loc, callee, operands).getResult(0); + const auto convertedArgs = fir::runtime::createArguments( + builder, loc, callee.getFunctionType(), args); + return builder.create(loc, callee, convertedArgs).getResult(0); } /// Generate a sequence of runtime calls to allocate memory and assign with the @@ -213,14 +210,11 @@ static mlir::Value genRuntimeAllocateSource(fir::FirOpBuilder &builder, loc, builder) : fir::runtime::getRuntimeFunc( loc, builder); - llvm::SmallVector args{ - box.getAddr(), fir::getBase(source), - errorManager.hasStat, errorManager.errMsgAddr, - errorManager.sourceFile, errorManager.sourceLine}; - llvm::SmallVector operands; - for (auto [fst, snd] : llvm::zip(args, callee.getFunctionType().getInputs())) - 
operands.emplace_back(builder.createConvert(loc, snd, fst)); - return builder.create(loc, callee, operands).getResult(0); + const auto args = fir::runtime::createArguments( + builder, loc, callee.getFunctionType(), box.getAddr(), + fir::getBase(source), errorManager.hasStat, errorManager.errMsgAddr, + errorManager.sourceFile, errorManager.sourceLine); + return builder.create(loc, callee, args).getResult(0); } /// Generate runtime call to apply mold to the descriptor. @@ -234,14 +228,12 @@ static void genRuntimeAllocateApplyMold(fir::FirOpBuilder &builder, builder) : fir::runtime::getRuntimeFunc( loc, builder); - llvm::SmallVector args{ + const auto args = fir::runtime::createArguments( + builder, loc, callee.getFunctionType(), fir::factory::getMutableIRBox(builder, loc, box), fir::getBase(mold), builder.createIntegerConstant( - loc, callee.getFunctionType().getInputs()[2], rank)}; - llvm::SmallVector operands; - for (auto [fst, snd] : llvm::zip(args, callee.getFunctionType().getInputs())) - operands.emplace_back(builder.createConvert(loc, snd, fst)); - builder.create(loc, callee, operands); + loc, callee.getFunctionType().getInputs()[2], rank)); + builder.create(loc, callee, args); } /// Generate a runtime call to deallocate memory. 
@@ -669,15 +661,13 @@ class AllocateStmtHelper { llvm::ArrayRef inputTypes = callee.getFunctionType().getInputs(); - llvm::SmallVector args; - args.push_back(builder.createConvert(loc, inputTypes[0], box.getAddr())); - args.push_back(builder.createConvert(loc, inputTypes[1], typeDescAddr)); mlir::Value rankValue = builder.createIntegerConstant(loc, inputTypes[2], rank); mlir::Value corankValue = builder.createIntegerConstant(loc, inputTypes[3], corank); - args.push_back(rankValue); - args.push_back(corankValue); + const auto args = fir::runtime::createArguments( + builder, loc, callee.getFunctionType(), box.getAddr(), typeDescAddr, + rankValue, corankValue); builder.create(loc, callee, args); } @@ -696,8 +686,6 @@ class AllocateStmtHelper { llvm::ArrayRef inputTypes = callee.getFunctionType().getInputs(); - llvm::SmallVector args; - args.push_back(builder.createConvert(loc, inputTypes[0], box.getAddr())); mlir::Value categoryValue = builder.createIntegerConstant( loc, inputTypes[1], static_cast(category)); mlir::Value kindValue = @@ -706,10 +694,9 @@ class AllocateStmtHelper { builder.createIntegerConstant(loc, inputTypes[3], rank); mlir::Value corankValue = builder.createIntegerConstant(loc, inputTypes[4], corank); - args.push_back(categoryValue); - args.push_back(kindValue); - args.push_back(rankValue); - args.push_back(corankValue); + const auto args = fir::runtime::createArguments( + builder, loc, callee.getFunctionType(), box.getAddr(), categoryValue, + kindValue, rankValue, corankValue); builder.create(loc, callee, args); } diff --git a/flang/lib/Lower/ConvertExprToHLFIR.cpp b/flang/lib/Lower/ConvertExprToHLFIR.cpp index 04b63f92a1fb4..395f4518efb1e 100644 --- a/flang/lib/Lower/ConvertExprToHLFIR.cpp +++ b/flang/lib/Lower/ConvertExprToHLFIR.cpp @@ -227,6 +227,12 @@ class HlfirDesignatorBuilder { isVolatile = true; } + // Check if the base type is volatile + if (partInfo.base.has_value()) { + mlir::Type baseType = partInfo.base.value().getType(); + isVolatile 
= isVolatile || fir::isa_volatile_type(baseType); + } + // Dynamic type of polymorphic base must be kept if the designator is // polymorphic. if (isPolymorphic(designatorNode)) @@ -238,12 +244,6 @@ class HlfirDesignatorBuilder { if (charType && charType.hasDynamicLen()) return fir::BoxCharType::get(charType.getContext(), charType.getFKind()); - // Check if the base type is volatile - if (partInfo.base.has_value()) { - mlir::Type baseType = partInfo.base.value().getType(); - isVolatile = isVolatile || fir::isa_volatile_type(baseType); - } - // Arrays with non default lower bounds or dynamic length or dynamic extent // need a fir.box to hold the dynamic or lower bound information. if (fir::hasDynamicSize(resultValueType) || diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp index eef1377f26961..711d5d1461b08 100644 --- a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp +++ b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp @@ -210,15 +210,14 @@ static bool hasExplicitLowerBounds(mlir::Value shape) { static std::pair updateDeclareInputTypeWithVolatility( mlir::Type inputType, mlir::Value memref, mlir::OpBuilder &builder, fir::FortranVariableFlagsAttr fortran_attrs) { - if (mlir::isa(inputType) && fortran_attrs && + if (fortran_attrs && bitEnumContainsAny(fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::fortran_volatile)) { const bool isPointer = bitEnumContainsAny( fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::pointer); auto updateType = [&](auto t) { using FIRT = decltype(t); - // If an entity is a pointer, the entity it points to is volatile, as far - // as consumers of the pointer are concerned. + // A volatile pointer's pointee is volatile. 
auto elementType = t.getEleTy(); const bool elementTypeIsVolatile = isPointer || fir::isa_volatile_type(elementType); @@ -227,8 +226,7 @@ static std::pair updateDeclareInputTypeWithVolatility( inputType = FIRT::get(newEleTy, true); }; llvm::TypeSwitch(inputType) - .Case(updateType) - .Default([](mlir::Type t) { return t; }); + .Case(updateType); memref = builder.create(memref.getLoc(), inputType, memref); } diff --git a/flang/test/Lower/allocatable-polymorphic.f90 b/flang/test/Lower/allocatable-polymorphic.f90 index dd8671daeaf8e..10e703210ea61 100644 --- a/flang/test/Lower/allocatable-polymorphic.f90 +++ b/flang/test/Lower/allocatable-polymorphic.f90 @@ -41,7 +41,7 @@ subroutine proc2_p2(this) end subroutine ! ------------------------------------------------------------------------------ -! Test lowering of ALLOCATE statement for polymoprhic pointer +! Test lowering of ALLOCATE statement for polymorphic pointer ! ------------------------------------------------------------------------------ subroutine test_pointer() @@ -98,10 +98,10 @@ subroutine test_pointer() ! CHECK: %[[P_DECL:.*]]:2 = hlfir.declare %[[P_DESC]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QMpolyFtest_pointerEp"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) ! CHECK: %[[TYPE_DESC_P1:.*]] = fir.type_desc !fir.type<_QMpolyTp1{a:i32,b:i32}> -! CHECK: %[[P_DESC_CAST:.*]] = fir.convert %[[P_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %[[TYPE_DESC_P1_CAST:.*]] = fir.convert %[[TYPE_DESC_P1]] : (!fir.tdesc>) -> !fir.ref -! CHECK: %[[RANK:.*]] = arith.constant 0 : i32 -! CHECK: %[[CORANK:.*]] = arith.constant 0 : i32 +! CHECK-DAG: %[[P_DESC_CAST:.*]] = fir.convert %[[P_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> +! CHECK-DAG: %[[TYPE_DESC_P1_CAST:.*]] = fir.convert %[[TYPE_DESC_P1]] : (!fir.tdesc>) -> !fir.ref +! CHECK-DAG: %[[RANK:.*]] = arith.constant 0 : i32 +! CHECK-DAG: %[[CORANK:.*]] = arith.constant 0 : i32 ! 
CHECK: fir.call @_FortranAPointerNullifyDerived(%[[P_DESC_CAST]], %[[TYPE_DESC_P1_CAST]], %[[RANK]], %[[CORANK]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[P_DESC_CAST:.*]] = fir.convert %[[P_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> ! CHECK: %{{.*}} = fir.call @_FortranAPointerAllocate(%[[P_DESC_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -111,19 +111,19 @@ subroutine test_pointer() ! CHECK: fir.dispatch "proc1"(%[[P_LOAD]] : !fir.class>>) ! CHECK: %[[TYPE_DESC_P1:.*]] = fir.type_desc !fir.type<_QMpolyTp1{a:i32,b:i32}> -! CHECK: %[[C1_DESC_CAST:.*]] = fir.convert %[[C1_DECL:.*]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %[[TYPE_DESC_P1_CAST:.*]] = fir.convert %[[TYPE_DESC_P1]] : (!fir.tdesc>) -> !fir.ref -! CHECK: %[[RANK:.*]] = arith.constant 0 : i32 -! CHECK: %[[CORANK:.*]] = arith.constant 0 : i32 +! CHECK-DAG: %[[C1_DESC_CAST:.*]] = fir.convert %[[C1_DECL:.*]]#0 : (!fir.ref>>>) -> !fir.ref> +! CHECK-DAG: %[[TYPE_DESC_P1_CAST:.*]] = fir.convert %[[TYPE_DESC_P1]] : (!fir.tdesc>) -> !fir.ref +! CHECK-DAG: %[[RANK:.*]] = arith.constant 0 : i32 +! CHECK-DAG: %[[CORANK:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAPointerNullifyDerived(%[[C1_DESC_CAST]], %[[TYPE_DESC_P1_CAST]], %[[RANK]], %[[CORANK]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[C1_DESC_CAST:.*]] = fir.convert %[[C1_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> ! CHECK: %{{.*}} = fir.call @_FortranAPointerAllocate(%[[C1_DESC_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P2:.*]] = fir.type_desc !fir.type<_QMpolyTp2{p1:!fir.type<_QMpolyTp1{a:i32,b:i32}>,c:i32}> -! CHECK: %[[C2_DESC_CAST:.*]] = fir.convert %[[C2_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %[[TYPE_DESC_P2_CAST:.*]] = fir.convert %[[TYPE_DESC_P2]] : (!fir.tdesc,c:i32}>>) -> !fir.ref -! CHECK: %[[RANK:.*]] = arith.constant 0 : i32 -! 
CHECK: %[[CORANK:.*]] = arith.constant 0 : i32 +! CHECK-DAG: %[[C2_DESC_CAST:.*]] = fir.convert %[[C2_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> +! CHECK-DAG: %[[TYPE_DESC_P2_CAST:.*]] = fir.convert %[[TYPE_DESC_P2]] : (!fir.tdesc,c:i32}>>) -> !fir.ref +! CHECK-DAG: %[[RANK:.*]] = arith.constant 0 : i32 +! CHECK-DAG: %[[CORANK:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAPointerNullifyDerived(%[[C2_DESC_CAST]], %[[TYPE_DESC_P2_CAST]], %[[RANK]], %[[CORANK]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[C2_DESC_CAST:.*]] = fir.convert %[[C2_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> ! CHECK: %{{.*}} = fir.call @_FortranAPointerAllocate(%[[C2_DESC_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -147,10 +147,10 @@ subroutine test_pointer() ! CHECK: fir.dispatch "proc2"(%[[C2_REBOX]] : !fir.class>) (%[[C2_REBOX]] : !fir.class>) {pass_arg_pos = 0 : i32} ! CHECK: %[[TYPE_DESC_P1:.*]] = fir.type_desc !fir.type<_QMpolyTp1{a:i32,b:i32}> -! CHECK: %[[C3_CAST:.*]] = fir.convert %[[C3_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> -! CHECK: %[[TYPE_DESC_P1_CAST:.*]] = fir.convert %[[TYPE_DESC_P1]] : (!fir.tdesc>) -> !fir.ref -! CHECK: %[[RANK:.*]] = arith.constant 1 : i32 -! CHECK: %[[CORANK:.*]] = arith.constant 0 : i32 +! CHECK-DAG: %[[C3_CAST:.*]] = fir.convert %[[C3_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> +! CHECK-DAG: %[[TYPE_DESC_P1_CAST:.*]] = fir.convert %[[TYPE_DESC_P1]] : (!fir.tdesc>) -> !fir.ref +! CHECK-DAG: %[[RANK:.*]] = arith.constant 1 : i32 +! CHECK-DAG: %[[CORANK:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAPointerNullifyDerived(%[[C3_CAST]], %[[TYPE_DESC_P1_CAST]], %[[RANK]], %[[CORANK]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[C3_CAST:.*]] = fir.convert %[[C3_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> ! 
CHECK: fir.call @_FortranAPointerSetBounds(%[[C3_CAST]], %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i32, i64, i64) -> () @@ -158,10 +158,10 @@ subroutine test_pointer() ! CHECK: %{{.*}} = fir.call @_FortranAPointerAllocate(%[[C3_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P2:.*]] = fir.type_desc !fir.type<_QMpolyTp2{p1:!fir.type<_QMpolyTp1{a:i32,b:i32}>,c:i32}> -! CHECK: %[[C4_CAST:.*]] = fir.convert %[[C4_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> -! CHECK: %[[TYPE_DESC_P2_CAST:.*]] = fir.convert %[[TYPE_DESC_P2]] : (!fir.tdesc,c:i32}>>) -> !fir.ref -! CHECK: %[[RANK:.*]] = arith.constant 1 : i32 -! CHECK: %[[CORANK:.*]] = arith.constant 0 : i32 +! CHECK-DAG: %[[C4_CAST:.*]] = fir.c... [truncated] ``````````
https://github.com/llvm/llvm-project/pull/138727 From flang-commits at lists.llvm.org Tue May 6 10:46:08 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 06 May 2025 10:46:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681a4ae0.650a0220.e2599.9482@mx.google.com> ================ @@ -137,22 +140,28 @@ class OpenACCRoutineDeviceTypeInfo { void set_isGang(bool value = true) { isGang_ = value; } unsigned gangDim() const { return gangDim_; } void set_gangDim(unsigned value) { gangDim_ = value; } - const std::string *bindName() const { - return bindName_ ? &*bindName_ : nullptr; + const std::variant *bindName() const { ---------------- akuhlens wrote: Could you say more about this? I think I am returning the variant here, which means I am returning both alternatives. So I am not certain what you are poking at. If you are referring to the use of optional instead of just storing the possible null pointer, I was just mimicking what had been done previously while adding what I needed. Happy to fix this up and return an optional here. I will add a comment to this effect, but the string is the bindc name and the SymbolRef is the name of a Fortran-managed symbol binding that will require mangling in lowering. Separated so that I wasn't forced to call into name mangling during name resolution. https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Tue May 6 10:46:19 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 06 May 2025 10:46:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules.
(PR #136012) In-Reply-To: Message-ID: <681a4aeb.170a0220.1d5598.14ca@mx.google.com> https://github.com/akuhlens edited https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Tue May 6 10:49:54 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 10:49:54 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681a4bc2.170a0220.1e1a05.1f8b@mx.google.com> ================ @@ -137,22 +140,28 @@ class OpenACCRoutineDeviceTypeInfo { void set_isGang(bool value = true) { isGang_ = value; } unsigned gangDim() const { return gangDim_; } void set_gangDim(unsigned value) { gangDim_ = value; } - const std::string *bindName() const { - return bindName_ ? &*bindName_ : nullptr; + const std::variant *bindName() const { ---------------- klausler wrote: Where in the code does `bindName()` ever return a `SymbolRef`? https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Tue May 6 10:50:52 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 10:50:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681a4bfc.170a0220.291eb2.0f80@mx.google.com> ================ @@ -137,22 +140,28 @@ class OpenACCRoutineDeviceTypeInfo { void set_isGang(bool value = true) { isGang_ = value; } unsigned gangDim() const { return gangDim_; } void set_gangDim(unsigned value) { gangDim_ = value; } - const std::string *bindName() const { - return bindName_ ? &*bindName_ : nullptr; + const std::variant *bindName() const { ---------------- klausler wrote: Oh, now I see, you redeclared the `bindName_` component to be a variant. Never mind. 
https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Tue May 6 13:08:23 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 06 May 2025 13:08:23 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681a6c37.170a0220.e919d.2a12@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/136012
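The `bindName()` exchange above turns on an optional member that holds a variant of two alternatives: a literal bind(C) name, or a reference to a Fortran-managed symbol whose final name is only produced by mangling during lowering. A minimal sketch of that shape follows, assuming C++17. `SymbolRef`, `RoutineInfo`, the setters, and the `mangled_` prefix are hypothetical placeholders and do not reflect flang's actual classes or mangling scheme.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for a reference to a compiler-managed symbol.
struct SymbolRef { std::string unmangledName; };

// Either a literal external name or a symbol whose name is produced later.
using BindName = std::variant<std::string, SymbolRef>;

class RoutineInfo {
public:
  // Returns nullptr when no bind name was recorded, mirroring the
  // pointer-returning accessor quoted in the review.
  const BindName *bindName() const { return bindName_ ? &*bindName_ : nullptr; }
  void set_bindName(std::string name) { bindName_.emplace(std::move(name)); }
  void set_bindName(SymbolRef symbol) { bindName_.emplace(std::move(symbol)); }

private:
  std::optional<BindName> bindName_;
};

// The consumer decides late how to turn each alternative into a final name.
// The "mangled_" prefix is a placeholder, not a real mangling scheme.
inline std::string resolveBindName(const RoutineInfo &info) {
  const BindName *name = info.bindName();
  if (!name)
    return {};
  if (const auto *literal = std::get_if<std::string>(name))
    return *literal;
  return "mangled_" + std::get<SymbolRef>(*name).unmangledName;
}
```

Keeping the symbol alternative unresolved in the data structure is what lets name resolution record the binding without calling into mangling, which is the separation the author describes.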


From flang-commits at lists.llvm.org Tue May 6 13:14:59 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 13:14:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Updated the parsing structure of some OpenAcc constructs to give better/more uniform inspection (PR #138076) In-Reply-To: Message-ID: <681a6dc3.a70a0220.119922.a685@mx.google.com> ================ @@ -5243,38 +5243,65 @@ struct AccEndBlockDirective { }; // ACC END ATOMIC -EMPTY_CLASS(AccEndAtomic); +struct AccEndAtomic { + WRAPPER_CLASS_BOILERPLATE(AccEndAtomic, Verbatim); + CharBlock source; +}; // ACC ATOMIC READ +struct AccAtomicReadDirective { + TUPLE_CLASS_BOILERPLATE(AccAtomicReadDirective); + std::tuple t; + CharBlock source; ---------------- klausler wrote: Both `Verbatim` and `AccClauseList` nodes have `source` data members. Why do you need this one? https://github.com/llvm/llvm-project/pull/138076 From flang-commits at lists.llvm.org Tue May 6 13:14:59 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 13:14:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Updated the parsing structure of some OpenAcc constructs to give better/more uniform inspection (PR #138076) In-Reply-To: Message-ID: <681a6dc3.630a0220.1eda9c.470a@mx.google.com> ================ @@ -57,11 +57,15 @@ class ParseTreeDumper { NODE(format, IntrinsicTypeDataEditDesc) NODE(format::IntrinsicTypeDataEditDesc, Kind) NODE(parser, Abstract) + NODE(parser, AccAtomicCaptureDirective) ---------------- klausler wrote: These used to be in alphabetic order. Now they are not. 
https://github.com/llvm/llvm-project/pull/138076 From flang-commits at lists.llvm.org Tue May 6 13:14:59 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 13:14:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Updated the parsing structure of some OpenAcc constructs to give better/more uniform inspection (PR #138076) In-Reply-To: Message-ID: <681a6dc3.170a0220.169f91.02ac@mx.google.com> ================ @@ -5243,38 +5243,65 @@ struct AccEndBlockDirective { }; // ACC END ATOMIC -EMPTY_CLASS(AccEndAtomic); +struct AccEndAtomic { + WRAPPER_CLASS_BOILERPLATE(AccEndAtomic, Verbatim); + CharBlock source; ---------------- klausler wrote: the `Verbatim` parse tree node has a `source` data member. That is its point. This new class doesn't add any value. https://github.com/llvm/llvm-project/pull/138076 From flang-commits at lists.llvm.org Tue May 6 13:15:00 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 13:15:00 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Updated the parsing structure of some OpenAcc constructs to give better/more uniform inspection (PR #138076) In-Reply-To: Message-ID: <681a6dc4.650a0220.143407.aa24@mx.google.com> ================ @@ -5324,14 +5352,19 @@ struct OpenACCDeclarativeConstruct { }; // OpenACC directives enclosing do loop -EMPTY_CLASS(AccEndLoop); +struct AccEndLoop { + WRAPPER_CLASS_BOILERPLATE(AccEndLoop, Verbatim); + CharBlock source; +}; + struct OpenACCLoopConstruct { TUPLE_CLASS_BOILERPLATE(OpenACCLoopConstruct); OpenACCLoopConstruct(AccBeginLoopDirective &&a) : t({std::move(a), std::nullopt, std::nullopt}) {} std::tuple, std::optional> t; + CharBlock source; ---------------- klausler wrote: Why do you need a `source` data member that covers an entire construct? 
https://github.com/llvm/llvm-project/pull/138076 From flang-commits at lists.llvm.org Tue May 6 14:02:06 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Tue, 06 May 2025 14:02:06 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) Message-ID: https://github.com/wangzpgi created https://github.com/llvm/llvm-project/pull/138762 The point of ignore_tkr(c) is to ignore both contiguous warnings and errors for arguments of all attribute types.


From flang-commits at lists.llvm.org Tue May 6 14:02:43 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 14:02:43 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681a78f3.170a0220.237166.4e3d@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Zhen Wang (wangzpgi)
Changes The point of ignore_tkr(c) is to ignore both contiguous warnings and errors for arguments of all attribute types. --- Full diff: https://github.com/llvm/llvm-project/pull/138762.diff 2 Files Affected: - (modified) flang/lib/Semantics/check-call.cpp (+2-1) - (added) flang/test/Semantics/cuf20.cuf (+42) ``````````diff diff --git a/flang/lib/Semantics/check-call.cpp b/flang/lib/Semantics/check-call.cpp index dfaa0e028d698..58271d7ca2e87 100644 --- a/flang/lib/Semantics/check-call.cpp +++ b/flang/lib/Semantics/check-call.cpp @@ -1016,7 +1016,8 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, } } if (dummyDataAttr == common::CUDADataAttr::Device && - (dummyIsAssumedShape || dummyIsAssumedRank)) { + (dummyIsAssumedShape || dummyIsAssumedRank) + !dummy.ignoreTKR.test(common::IgnoreTKR::Contiguous)) { if (auto contig{evaluate::IsContiguous(actual, foldingContext, /*namedConstantSectionsAreContiguous=*/true, /*firstDimensionStride1=*/true)}) { diff --git a/flang/test/Semantics/cuf20.cuf b/flang/test/Semantics/cuf20.cuf new file mode 100644 index 0000000000000..222ff2a1b7c6d --- /dev/null +++ b/flang/test/Semantics/cuf20.cuf @@ -0,0 +1,42 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 + +! Test case 1: Device arrays with ignore_tkr(c) +subroutine test_device_arrays() + interface bar + subroutine bar1(a) +!dir$ ignore_tkr(c) a + real :: a(..) +!@cuf attributes(device) :: a + end subroutine + end interface + + integer :: n = 10, k = 2 + real, device :: a(10), b(10), c(10) + + call bar(a(1:n)) ! Should not warn about contiguity + call bar(b(1:n:k)) ! Should not warn about contiguity + call bar(c(1:n:2)) ! Should not warn about contiguity +end subroutine + +! Test case 2: Managed arrays with ignore_tkr(c) +subroutine test_managed_arrays() + interface bar + subroutine bar1(a) +!dir$ ignore_tkr(c) a + real :: a(..) 
+!@cuf attributes(device) :: a + end subroutine + end interface + + integer :: n = 10, k = 2 + real, managed :: a(10), b(10), c(10) + + call bar(a(1:n)) ! Should not warn about contiguity + call bar(b(1:n:k)) ! Should not warn about contiguity + call bar(c(1:n:2)) ! Should not warn about contiguity +end subroutine + +program main + call test_device_arrays() + call test_managed_arrays() +end program \ No newline at end of file ``````````
https://github.com/llvm/llvm-project/pull/138762 From flang-commits at lists.llvm.org Tue May 6 14:04:29 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 14:04:29 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681a795d.a70a0220.1e7f6d.e053@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command: ``````````bash git-clang-format --diff HEAD~1 HEAD --extensions cpp -- flang/lib/Semantics/check-call.cpp ``````````
View the diff from clang-format here. ``````````diff diff --git a/flang/lib/Semantics/check-call.cpp b/flang/lib/Semantics/check-call.cpp index 58271d7ca..af8c85d58 100644 --- a/flang/lib/Semantics/check-call.cpp +++ b/flang/lib/Semantics/check-call.cpp @@ -1016,8 +1016,8 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, } } if (dummyDataAttr == common::CUDADataAttr::Device && - (dummyIsAssumedShape || dummyIsAssumedRank) - !dummy.ignoreTKR.test(common::IgnoreTKR::Contiguous)) { + (dummyIsAssumedShape || dummyIsAssumedRank) !dummy.ignoreTKR.test( + common::IgnoreTKR::Contiguous)) { if (auto contig{evaluate::IsContiguous(actual, foldingContext, /*namedConstantSectionsAreContiguous=*/true, /*firstDimensionStride1=*/true)}) { ``````````
https://github.com/llvm/llvm-project/pull/138762 From flang-commits at lists.llvm.org Tue May 6 14:09:02 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 14:09:02 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681a7a6e.a70a0220.1ba177.e5a2@mx.google.com> ================ @@ -1016,7 +1016,8 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, } } if (dummyDataAttr == common::CUDADataAttr::Device && - (dummyIsAssumedShape || dummyIsAssumedRank)) { + (dummyIsAssumedShape || dummyIsAssumedRank) ---------------- klausler wrote: How does this code even compile? https://github.com/llvm/llvm-project/pull/138762 From flang-commits at lists.llvm.org Tue May 6 14:10:52 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Tue, 06 May 2025 14:10:52 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681a7adc.630a0220.a4ddf.5cd3@mx.google.com> https://github.com/wangzpgi updated https://github.com/llvm/llvm-project/pull/138762


From flang-commits at lists.llvm.org Tue May 6 14:14:17 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Tue, 06 May 2025 14:14:17 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681a7ba9.050a0220.293c20.e02c@mx.google.com> ================ @@ -1016,7 +1016,8 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, } } if (dummyDataAttr == common::CUDADataAttr::Device && - (dummyIsAssumedShape || dummyIsAssumedRank)) { + (dummyIsAssumedShape || dummyIsAssumedRank) ---------------- wangzpgi wrote: Missed `&&` in the source. https://github.com/llvm/llvm-project/pull/138762 From flang-commits at lists.llvm.org Tue May 6 17:01:32 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 06 May 2025 17:01:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix crash with USE of hermetic module file (PR #138785) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/138785 When one hermetic module file uses another, a later compilation may crash in semantics when it itself is used, since the module file reader sets the "current hermetic module file scope" to null after reading one rather than saving and restoring that pointer. >From ffb0d3a53a0f072ab59f9b12e8b84be4b73f4873 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Tue, 6 May 2025 16:57:36 -0700 Subject: [PATCH] [flang] Fix crash with USE of hermetic module file When one hermetic module file uses another, a later compilation may crash in semantics when it itself is used, since the module file reader sets the "current hermetic module file scope" to null after reading one rather than saving and restoring that pointer. 
--- flang/lib/Semantics/mod-file.cpp | 3 ++- flang/test/Semantics/modfile75.F90 | 17 +++++++++++++++++ 2 files changed, 19 insertions(+), 1 deletion(-) create mode 100644 flang/test/Semantics/modfile75.F90 diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index ee356e56e4458..3df229cc85587 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1537,6 +1537,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, // created under -fhermetic-module-files? If so, process them first in // their own nested scope that will be visible only to USE statements // within the module file. + Scope *previousHermetic{context_.currentHermeticModuleFileScope()}; if (parseTree.v.size() > 1) { parser::Program hermeticModules{std::move(parseTree.v)}; parseTree.v.emplace_back(std::move(hermeticModules.v.front())); @@ -1552,7 +1553,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, GetModuleDependences(context_.moduleDependences(), sourceFile->content()); ResolveNames(context_, parseTree, topScope); context_.foldingContext().set_moduleFileName(wasModuleFileName); - context_.set_currentHermeticModuleFileScope(nullptr); + context_.set_currentHermeticModuleFileScope(previousHermetic); if (!moduleSymbol) { // Submodule symbols' storage are owned by their parents' scopes, // but their names are not in their parents' dictionaries -- we diff --git a/flang/test/Semantics/modfile75.F90 b/flang/test/Semantics/modfile75.F90 new file mode 100644 index 0000000000000..e8cfb13552437 --- /dev/null +++ b/flang/test/Semantics/modfile75.F90 @@ -0,0 +1,17 @@ +!RUN: (%flang -c -DWHICH=1 %s && %flang -c -DWHICH=2 %s && %flang_fc1 -fdebug-unparse %s) | FileCheck %s + +#if WHICH == 1 +module m1 + use iso_c_binding +end +#elif WHICH == 2 +module m2 + use m1 +end +#else +program test + use m2 +!CHECK: INTEGER(KIND=4_4) n + integer(c_int) n +end +#endif From flang-commits at lists.llvm.org Tue May 6 
17:02:01 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 17:02:01 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix crash with USE of hermetic module file (PR #138785) In-Reply-To: Message-ID: <681aa2f9.630a0220.155cf2.470a@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Peter Klausler (klausler)
Changes When one hermetic module file uses another, a later compilation may crash in semantics when it itself is used, since the module file reader sets the "current hermetic module file scope" to null after reading one rather than saving and restoring that pointer. --- Full diff: https://github.com/llvm/llvm-project/pull/138785.diff 2 Files Affected: - (modified) flang/lib/Semantics/mod-file.cpp (+2-1) - (added) flang/test/Semantics/modfile75.F90 (+17) ``````````diff diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index ee356e56e4458..3df229cc85587 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1537,6 +1537,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, // created under -fhermetic-module-files? If so, process them first in // their own nested scope that will be visible only to USE statements // within the module file. + Scope *previousHermetic{context_.currentHermeticModuleFileScope()}; if (parseTree.v.size() > 1) { parser::Program hermeticModules{std::move(parseTree.v)}; parseTree.v.emplace_back(std::move(hermeticModules.v.front())); @@ -1552,7 +1553,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, GetModuleDependences(context_.moduleDependences(), sourceFile->content()); ResolveNames(context_, parseTree, topScope); context_.foldingContext().set_moduleFileName(wasModuleFileName); - context_.set_currentHermeticModuleFileScope(nullptr); + context_.set_currentHermeticModuleFileScope(previousHermetic); if (!moduleSymbol) { // Submodule symbols' storage are owned by their parents' scopes, // but their names are not in their parents' dictionaries -- we diff --git a/flang/test/Semantics/modfile75.F90 b/flang/test/Semantics/modfile75.F90 new file mode 100644 index 0000000000000..e8cfb13552437 --- /dev/null +++ b/flang/test/Semantics/modfile75.F90 @@ -0,0 +1,17 @@ +!RUN: (%flang -c -DWHICH=1 %s && %flang -c -DWHICH=2 %s && %flang_fc1 
-fdebug-unparse %s) | FileCheck %s + +#if WHICH == 1 +module m1 + use iso_c_binding +end +#elif WHICH == 2 +module m2 + use m1 +end +#else +program test + use m2 +!CHECK: INTEGER(KIND=4_4) n + integer(c_int) n +end +#endif ``````````
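The save-and-restore idea behind this fix can be sketched in isolation. The example below is an illustration under stated assumptions, not flang's code: `Scope`, `Context`, `ReadModuleFile`, and `NestedReadKeepsOuterScope` are hypothetical names, and the "reading" is reduced to setting a pointer. It shows why restoring the previous pointer, rather than resetting it to null, keeps an outer reader's scope intact after a nested read.

```cpp
#include <cassert>

struct Scope {}; // hypothetical placeholder for a semantic scope

// Hypothetical context holding a "current hermetic scope" pointer.
class Context {
public:
  Scope *currentHermeticScope() const { return current_; }
  void setCurrentHermeticScope(Scope *s) { current_ = s; }

private:
  Scope *current_{nullptr};
};

// Simulates reading one module file: it installs its own scope while it
// works, then puts back whatever was current before it started.
void ReadModuleFile(Context &ctx, Scope &own) {
  Scope *previous{ctx.currentHermeticScope()}; // save
  ctx.setCurrentHermeticScope(&own);
  // ... name resolution for this module file would happen here ...
  ctx.setCurrentHermeticScope(previous); // restore, instead of nullptr
}

// Returns true if an outer scope is still current after a nested read.
bool NestedReadKeepsOuterScope() {
  Context ctx;
  Scope outer, inner;
  ctx.setCurrentHermeticScope(&outer);
  ReadModuleFile(ctx, inner);
  return ctx.currentHermeticScope() == &outer;
}
```

Had `ReadModuleFile` ended with `setCurrentHermeticScope(nullptr)`, the outer pointer would be lost after the nested call. An RAII guard that restores the pointer in its destructor would be an alternative way to get the same effect.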
https://github.com/llvm/llvm-project/pull/138785 From flang-commits at lists.llvm.org Tue May 6 18:30:04 2025 From: flang-commits at lists.llvm.org (Sebastian Pop via flang-commits) Date: Tue, 06 May 2025 18:30:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix Werror=dangling-reference (PR #138793) Message-ID: https://github.com/sebpop created https://github.com/llvm/llvm-project/pull/138793 when compiling with g++ 13.3.0 flang build fails with: llvm-project/flang/lib/Semantics/expression.cpp:424:17: error: possibly dangling reference to a temporary [-Werror=dangling-reference] 424 | const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; | ^~~~~~~~~~~~~ llvm-project/flang/lib/Semantics/expression.cpp:424:58: note: the temporary was destroyed at the end of the full expression ‘Fortran::evaluate::CoarrayRef::GetBase() const().Fortran::evaluate::NamedEntity::GetLastSymbol()’ 424 | const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; | ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ Keep the base in a temporary variable to make sure it is not deleted.


From flang-commits at lists.llvm.org Tue May 6 18:30:40 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 18:30:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix Werror=dangling-reference (PR #138793) In-Reply-To: Message-ID: <681ab7c0.050a0220.22c6ae.c45d@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Sebastian Pop (sebpop)
Changes when compiling with g++ 13.3.0 flang build fails with: llvm-project/flang/lib/Semantics/expression.cpp:424:17: error: possibly dangling reference to a temporary [-Werror=dangling-reference] 424 | const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; | ^~~~~~~~~~~~~ llvm-project/flang/lib/Semantics/expression.cpp:424:58: note: the temporary was destroyed at the end of the full expression ‘Fortran::evaluate::CoarrayRef::GetBase() const().Fortran::evaluate::NamedEntity::GetLastSymbol()’ 424 | const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; | ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ Keep the base in a temporary variable to make sure it is not deleted. --- Full diff: https://github.com/llvm/llvm-project/pull/138793.diff 1 Files Affected: - (modified) flang/lib/Semantics/expression.cpp (+2-1) ``````````diff diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index e139bda7e4950..35eb7b61429fb 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -421,7 +421,8 @@ static void CheckSubscripts( static void CheckSubscripts( semantics::SemanticsContext &context, CoarrayRef &ref) { - const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; + const auto &base = ref.GetBase(); + const Symbol &coarraySymbol{base.GetLastSymbol()}; Shape lb, ub; if (FoldSubscripts(context, coarraySymbol, ref.subscript(), lb, ub)) { ValidateSubscripts(context, coarraySymbol, ref.subscript(), lb, ub); ``````````
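The pattern in this diff relies on C++ lifetime extension: binding a temporary directly to a `const` reference keeps it alive for the lifetime of that reference, so a reference obtained through it stays valid. A minimal sketch follows, assuming a getter that returns by value and an accessor that returns a reference into that object. `Base`, `Ref`, and `LastNameOf` are hypothetical names; whether flang's `GetLastSymbol()` actually points into the temporary is not shown in this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical type whose accessor returns a reference into itself.
struct Base {
  std::string name;
  const std::string &GetLastName() const { return name; }
};

// Hypothetical type whose getter returns a Base by value (a temporary).
struct Ref {
  std::string stored;
  Base GetBase() const { return Base{stored}; }
};

std::string LastNameOf(const Ref &ref) {
  // Would dangle in this sketch: the temporary Base dies at the end of the
  // full expression, leaving the reference pointing at a destroyed string.
  //   const std::string &n{ref.GetBase().GetLastName()};

  // Binding the temporary to a const reference extends its lifetime to the
  // end of this function, so the reference below remains valid.
  const auto &base = ref.GetBase();
  const std::string &n{base.GetLastName()};
  return n;
}
```

Calling `LastNameOf(Ref{"coarray"})` returns a copy of the stored name while `base` is still alive.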
https://github.com/llvm/llvm-project/pull/138793 From flang-commits at lists.llvm.org Tue May 6 21:54:26 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 21:54:26 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR (PR #111155) In-Reply-To: Message-ID: <681ae782.170a0220.123fa8.6e75@mx.google.com> kaviya2510 wrote: > Thank you Kaviya, LGTM! I have a minimal nit, but no need for another review by me before merging. Thanks for the review @skatrak https://github.com/llvm/llvm-project/pull/111155 From flang-commits at lists.llvm.org Tue May 6 21:56:00 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 21:56:00 -0700 (PDT) Subject: [flang-commits] [flang] 9e7d529 - [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR (#111155) Message-ID: <681ae7e0.170a0220.349d90.63b2@mx.google.com> Author: Kaviya Rajendiran Date: 2025-05-07T10:25:56+05:30 New Revision: 9e7d529607ebde67af5b214a654de82cfa2ec8c4 URL: https://github.com/llvm/llvm-project/commit/9e7d529607ebde67af5b214a654de82cfa2ec8c4 DIFF: https://github.com/llvm/llvm-project/commit/9e7d529607ebde67af5b214a654de82cfa2ec8c4.diff LOG: [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR (#111155) This patch, - Added support for lowering of task_reduction to MLIR - Added support for lowering of in_reduction to MLIR - Fixed incorrect DSA handling for variables in the presence of 'in_reduction' clause. 
Added: flang/test/Lower/OpenMP/task-inreduction.f90 flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 Modified: flang/include/flang/Semantics/symbol.h flang/lib/Lower/OpenMP/ClauseProcessor.cpp flang/lib/Lower/OpenMP/ClauseProcessor.h flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Lower/OpenMP/ReductionProcessor.cpp flang/lib/Lower/OpenMP/ReductionProcessor.h flang/lib/Semantics/resolve-directives.cpp Removed: flang/test/Lower/OpenMP/Todo/task-inreduction.f90 flang/test/Lower/OpenMP/Todo/taskgroup-task-reduction.f90 ################################################################################ diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 36d926a8a4bc5..1d997abef6dee 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -754,12 +754,12 @@ class Symbol { // OpenMP data-copying attribute OmpCopyIn, OmpCopyPrivate, // OpenMP miscellaneous flags - OmpCommonBlock, OmpReduction, OmpAligned, OmpNontemporal, OmpAllocate, - OmpDeclarativeAllocateDirective, OmpExecutableAllocateDirective, - OmpDeclareSimd, OmpDeclareTarget, OmpThreadprivate, OmpDeclareReduction, - OmpFlushed, OmpCriticalLock, OmpIfSpecified, OmpNone, OmpPreDetermined, - OmpImplicit, OmpDependObject, OmpInclusiveScan, OmpExclusiveScan, - OmpInScanReduction); + OmpCommonBlock, OmpReduction, OmpInReduction, OmpAligned, OmpNontemporal, + OmpAllocate, OmpDeclarativeAllocateDirective, + OmpExecutableAllocateDirective, OmpDeclareSimd, OmpDeclareTarget, + OmpThreadprivate, OmpDeclareReduction, OmpFlushed, OmpCriticalLock, + OmpIfSpecified, OmpNone, OmpPreDetermined, OmpImplicit, OmpDependObject, + OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction); using Flags = common::EnumSet; const Scope &owner() const { return *owner_; } diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp 
b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 77b4622547d7a..318455f0afe80 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -983,6 +983,29 @@ bool ClauseProcessor::processIf( }); return found; } +bool ClauseProcessor::processInReduction( + mlir::Location currentLocation, mlir::omp::InReductionClauseOps &result, + llvm::SmallVectorImpl &outReductionSyms) const { + return findRepeatableClause( + [&](const omp::clause::InReduction &clause, const parser::CharBlock &) { + llvm::SmallVector inReductionVars; + llvm::SmallVector inReduceVarByRef; + llvm::SmallVector inReductionDeclSymbols; + llvm::SmallVector inReductionSyms; + ReductionProcessor rp; + rp.processReductionArguments( + currentLocation, converter, clause, inReductionVars, + inReduceVarByRef, inReductionDeclSymbols, inReductionSyms); + + // Copy local lists into the output. + llvm::copy(inReductionVars, std::back_inserter(result.inReductionVars)); + llvm::copy(inReduceVarByRef, + std::back_inserter(result.inReductionByref)); + llvm::copy(inReductionDeclSymbols, + std::back_inserter(result.inReductionSyms)); + llvm::copy(inReductionSyms, std::back_inserter(outReductionSyms)); + }); +} bool ClauseProcessor::processIsDevicePtr( mlir::omp::IsDevicePtrClauseOps &result, @@ -1257,9 +1280,9 @@ bool ClauseProcessor::processReduction( llvm::SmallVector reductionDeclSymbols; llvm::SmallVector reductionSyms; ReductionProcessor rp; - rp.processReductionArguments( + rp.processReductionArguments( currentLocation, converter, clause, reductionVars, reduceVarByRef, - reductionDeclSymbols, reductionSyms, result.reductionMod); + reductionDeclSymbols, reductionSyms, &result.reductionMod); // Copy local lists into the output. 
llvm::copy(reductionVars, std::back_inserter(result.reductionVars)); llvm::copy(reduceVarByRef, std::back_inserter(result.reductionByref)); @@ -1269,6 +1292,30 @@ bool ClauseProcessor::processReduction( }); } +bool ClauseProcessor::processTaskReduction( + mlir::Location currentLocation, mlir::omp::TaskReductionClauseOps &result, + llvm::SmallVectorImpl &outReductionSyms) const { + return findRepeatableClause( + [&](const omp::clause::TaskReduction &clause, const parser::CharBlock &) { + llvm::SmallVector taskReductionVars; + llvm::SmallVector TaskReduceVarByRef; + llvm::SmallVector TaskReductionDeclSymbols; + llvm::SmallVector TaskReductionSyms; + ReductionProcessor rp; + rp.processReductionArguments( + currentLocation, converter, clause, taskReductionVars, + TaskReduceVarByRef, TaskReductionDeclSymbols, TaskReductionSyms); + // Copy local lists into the output. + llvm::copy(taskReductionVars, + std::back_inserter(result.taskReductionVars)); + llvm::copy(TaskReduceVarByRef, + std::back_inserter(result.taskReductionByref)); + llvm::copy(TaskReductionDeclSymbols, + std::back_inserter(result.taskReductionSyms)); + llvm::copy(TaskReductionSyms, std::back_inserter(outReductionSyms)); + }); +} + bool ClauseProcessor::processTo( llvm::SmallVectorImpl &result) const { return findRepeatableClause( diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index bdddeb145b496..3d3f26f06da26 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -112,6 +112,9 @@ class ClauseProcessor { processEnter(llvm::SmallVectorImpl &result) const; bool processIf(omp::clause::If::DirectiveNameModifier directiveName, mlir::omp::IfClauseOps &result) const; + bool processInReduction( + mlir::Location currentLocation, mlir::omp::InReductionClauseOps &result, + llvm::SmallVectorImpl &outReductionSyms) const; bool processIsDevicePtr( mlir::omp::IsDevicePtrClauseOps &result, llvm::SmallVectorImpl 
&isDeviceSyms) const; @@ -133,6 +136,9 @@ class ClauseProcessor { bool processReduction( mlir::Location currentLocation, mlir::omp::ReductionClauseOps &result, llvm::SmallVectorImpl &reductionSyms) const; + bool processTaskReduction( + mlir::Location currentLocation, mlir::omp::TaskReductionClauseOps &result, + llvm::SmallVectorImpl &outReductionSyms) const; bool processTo(llvm::SmallVectorImpl &result) const; bool processUseDeviceAddr( lower::StatementContext &stmtCtx, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fcd3de9671098..ca53130e5d363 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1774,34 +1774,34 @@ static void genTargetEnterExitUpdateDataClauses( cp.processNowait(clauseOps); } -static void genTaskClauses(lower::AbstractConverter &converter, - semantics::SemanticsContext &semaCtx, - lower::SymMap &symTable, - lower::StatementContext &stmtCtx, - const List &clauses, mlir::Location loc, - mlir::omp::TaskOperands &clauseOps) { +static void genTaskClauses( + lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, + lower::SymMap &symTable, lower::StatementContext &stmtCtx, + const List &clauses, mlir::Location loc, + mlir::omp::TaskOperands &clauseOps, + llvm::SmallVectorImpl &inReductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processAllocate(clauseOps); cp.processDepend(symTable, stmtCtx, clauseOps); cp.processFinal(stmtCtx, clauseOps); cp.processIf(llvm::omp::Directive::OMPD_task, clauseOps); + cp.processInReduction(loc, clauseOps, inReductionSyms); cp.processMergeable(clauseOps); cp.processPriority(stmtCtx, clauseOps); cp.processUntied(clauseOps); cp.processDetach(clauseOps); - cp.processTODO( - loc, llvm::omp::Directive::OMPD_task); + cp.processTODO(loc, llvm::omp::Directive::OMPD_task); } -static void genTaskgroupClauses(lower::AbstractConverter &converter, - semantics::SemanticsContext &semaCtx, - const List &clauses, 
mlir::Location loc, - mlir::omp::TaskgroupOperands &clauseOps) { +static void genTaskgroupClauses( + lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, + const List &clauses, mlir::Location loc, + mlir::omp::TaskgroupOperands &clauseOps, + llvm::SmallVectorImpl &taskReductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processAllocate(clauseOps); - cp.processTODO(loc, - llvm::omp::Directive::OMPD_taskgroup); + cp.processTaskReduction(loc, clauseOps, taskReductionSyms); } static void genTaskloopClauses(lower::AbstractConverter &converter, @@ -2496,8 +2496,9 @@ genTaskOp(lower::AbstractConverter &converter, lower::SymMap &symTable, mlir::Location loc, const ConstructQueue &queue, ConstructQueue::const_iterator item) { mlir::omp::TaskOperands clauseOps; + llvm::SmallVector inReductionSyms; genTaskClauses(converter, semaCtx, symTable, stmtCtx, item->clauses, loc, - clauseOps); + clauseOps, inReductionSyms); if (!enableDelayedPrivatization) return genOpWithBody( @@ -2514,6 +2515,8 @@ genTaskOp(lower::AbstractConverter &converter, lower::SymMap &symTable, EntryBlockArgs taskArgs; taskArgs.priv.syms = dsp.getDelayedPrivSymbols(); taskArgs.priv.vars = clauseOps.privateVars; + taskArgs.inReduction.syms = inReductionSyms; + taskArgs.inReduction.vars = clauseOps.inReductionVars; return genOpWithBody( OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, @@ -2531,12 +2534,19 @@ genTaskgroupOp(lower::AbstractConverter &converter, lower::SymMap &symTable, const ConstructQueue &queue, ConstructQueue::const_iterator item) { mlir::omp::TaskgroupOperands clauseOps; - genTaskgroupClauses(converter, semaCtx, item->clauses, loc, clauseOps); + llvm::SmallVector taskReductionSyms; + genTaskgroupClauses(converter, semaCtx, item->clauses, loc, clauseOps, + taskReductionSyms); + + EntryBlockArgs taskgroupArgs; + taskgroupArgs.taskReduction.syms = taskReductionSyms; + taskgroupArgs.taskReduction.vars = clauseOps.taskReductionVars; return 
genOpWithBody( OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, llvm::omp::Directive::OMPD_taskgroup) - .setClauses(&item->clauses), + .setClauses(&item->clauses) + .setEntryBlockArgs(&taskgroupArgs), queue, item, clauseOps); } diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp index 729bd3689ad4f..b8aa0deb42dd6 100644 --- a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp @@ -25,6 +25,7 @@ #include "flang/Parser/tools.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "llvm/Support/CommandLine.h" +#include static llvm::cl::opt forceByrefReduction( "force-byref-reduction", @@ -38,6 +39,37 @@ namespace Fortran { namespace lower { namespace omp { +// explicit template declarations +template void +ReductionProcessor::processReductionArguments( + mlir::Location currentLocation, lower::AbstractConverter &converter, + const omp::clause::Reduction &reduction, + llvm::SmallVectorImpl &reductionVars, + llvm::SmallVectorImpl &reduceVarByRef, + llvm::SmallVectorImpl &reductionDeclSymbols, + llvm::SmallVectorImpl &reductionSymbols, + mlir::omp::ReductionModifierAttr *reductionMod); + +template void +ReductionProcessor::processReductionArguments( + mlir::Location currentLocation, lower::AbstractConverter &converter, + const omp::clause::TaskReduction &reduction, + llvm::SmallVectorImpl &reductionVars, + llvm::SmallVectorImpl &reduceVarByRef, + llvm::SmallVectorImpl &reductionDeclSymbols, + llvm::SmallVectorImpl &reductionSymbols, + mlir::omp::ReductionModifierAttr *reductionMod); + +template void +ReductionProcessor::processReductionArguments( + mlir::Location currentLocation, lower::AbstractConverter &converter, + const omp::clause::InReduction &reduction, + llvm::SmallVectorImpl &reductionVars, + llvm::SmallVectorImpl &reduceVarByRef, + llvm::SmallVectorImpl &reductionDeclSymbols, + llvm::SmallVectorImpl &reductionSymbols, + 
mlir::omp::ReductionModifierAttr *reductionMod); + ReductionProcessor::ReductionIdentifier ReductionProcessor::getReductionType( const omp::clause::ProcedureDesignator &pd) { auto redType = llvm::StringSwitch>( @@ -538,28 +570,30 @@ mlir::omp::ReductionModifier translateReductionModifier(ReductionModifier mod) { return mlir::omp::ReductionModifier::defaultmod; } +template void ReductionProcessor::processReductionArguments( mlir::Location currentLocation, lower::AbstractConverter &converter, - const omp::clause::Reduction &reduction, - llvm::SmallVectorImpl &reductionVars, + const T &reduction, llvm::SmallVectorImpl &reductionVars, llvm::SmallVectorImpl &reduceVarByRef, llvm::SmallVectorImpl &reductionDeclSymbols, llvm::SmallVectorImpl &reductionSymbols, - mlir::omp::ReductionModifierAttr &reductionMod) { + mlir::omp::ReductionModifierAttr *reductionMod) { fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - auto mod = std::get>(reduction.t); - if (mod.has_value()) { - if (mod.value() == ReductionModifier::Task) - TODO(currentLocation, "Reduction modifier `task` is not supported"); - else - reductionMod = mlir::omp::ReductionModifierAttr::get( - firOpBuilder.getContext(), translateReductionModifier(mod.value())); + if constexpr (std::is_same_v) { + auto mod = std::get>(reduction.t); + if (mod.has_value()) { + if (mod.value() == ReductionModifier::Task) + TODO(currentLocation, "Reduction modifier `task` is not supported"); + else + *reductionMod = mlir::omp::ReductionModifierAttr::get( + firOpBuilder.getContext(), translateReductionModifier(mod.value())); + } } mlir::omp::DeclareReductionOp decl; const auto &redOperatorList{ - std::get(reduction.t)}; + std::get(reduction.t)}; assert(redOperatorList.size() == 1 && "Expecting single operator"); const auto &redOperator = redOperatorList.front(); const auto &objectList{std::get(reduction.t)}; diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.h b/flang/lib/Lower/OpenMP/ReductionProcessor.h index 
11baa839c74b4..a7198b48f6b4e 100644 --- a/flang/lib/Lower/OpenMP/ReductionProcessor.h +++ b/flang/lib/Lower/OpenMP/ReductionProcessor.h @@ -121,14 +121,14 @@ class ReductionProcessor { /// Creates a reduction declaration and associates it with an OpenMP block /// directive. + template static void processReductionArguments( mlir::Location currentLocation, lower::AbstractConverter &converter, - const omp::clause::Reduction &reduction, - llvm::SmallVectorImpl &reductionVars, + const T &reduction, llvm::SmallVectorImpl &reductionVars, llvm::SmallVectorImpl &reduceVarByRef, llvm::SmallVectorImpl &reductionDeclSymbols, llvm::SmallVectorImpl &reductionSymbols, - mlir::omp::ReductionModifierAttr &reductionMod); + mlir::omp::ReductionModifierAttr *reductionMod = nullptr); }; template diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 620a37cb40231..8b1caca34a6a7 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -530,6 +530,12 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { return false; } + bool Pre(const parser::OmpInReductionClause &x) { + auto &objects{std::get(x.t)}; + ResolveOmpObjectList(objects, Symbol::Flag::OmpInReduction); + return false; + } + bool Pre(const parser::OmpClause::Reduction &x) { const auto &objList{std::get(x.v.t)}; ResolveOmpObjectList(objList, Symbol::Flag::OmpReduction); diff --git a/flang/test/Lower/OpenMP/Todo/task-inreduction.f90 b/flang/test/Lower/OpenMP/Todo/task-inreduction.f90 deleted file mode 100644 index aeed680a6dba7..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/task-inreduction.f90 +++ /dev/null @@ -1,15 +0,0 @@ -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s - -!=============================================================================== -! 
`mergeable` clause -!=============================================================================== - -! CHECK: not yet implemented: Unhandled clause IN_REDUCTION in TASK construct -subroutine omp_task_in_reduction() - integer i - i = 0 - !$omp task in_reduction(+:i) - i = i + 1 - !$omp end task -end subroutine omp_task_in_reduction diff --git a/flang/test/Lower/OpenMP/Todo/taskgroup-task-reduction.f90 b/flang/test/Lower/OpenMP/Todo/taskgroup-task-reduction.f90 deleted file mode 100644 index 1cb471d784d76..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/taskgroup-task-reduction.f90 +++ /dev/null @@ -1,10 +0,0 @@ -! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s -! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -o - %s -fopenmp-version=50 2>&1 | FileCheck %s - -! CHECK: not yet implemented: Unhandled clause TASK_REDUCTION in TASKGROUP construct -subroutine omp_taskgroup_task_reduction - integer :: res - !$omp taskgroup task_reduction(+:res) - res = res + 1 - !$omp end taskgroup -end subroutine omp_taskgroup_task_reduction diff --git a/flang/test/Lower/OpenMP/task-inreduction.f90 b/flang/test/Lower/OpenMP/task-inreduction.f90 new file mode 100644 index 0000000000000..41657d320f7d2 --- /dev/null +++ b/flang/test/Lower/OpenMP/task-inreduction.f90 @@ -0,0 +1,35 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +!CHECK-LABEL: omp.declare_reduction +!CHECK-SAME: @[[RED_I32_NAME:.*]] : i32 init { +!CHECK: ^bb0(%{{.*}}: i32): +!CHECK: %[[C0_1:.*]] = arith.constant 0 : i32 +!CHECK: omp.yield(%[[C0_1]] : i32) +!CHECK: } combiner { +!CHECK: ^bb0(%[[ARG0:.*]]: i32, %[[ARG1:.*]]: i32): +!CHECK: %[[RES:.*]] = arith.addi %[[ARG0]], %[[ARG1]] : i32 +!CHECK: omp.yield(%[[RES]] : i32) +!CHECK: } + +!CHECK-LABEL: func.func @_QPomp_task_in_reduction() { +! [...] 
+!CHECK: omp.task in_reduction(@[[RED_I32_NAME]] %[[VAL_1:.*]]#0 -> %[[ARG0]] : !fir.ref) { +!CHECK: %[[VAL_4:.*]]:2 = hlfir.declare %[[ARG0]] +!CHECK-SAME: {uniq_name = "_QFomp_task_in_reductionEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[VAL_5:.*]] = fir.load %[[VAL_4]]#0 : !fir.ref +!CHECK: %[[VAL_6:.*]] = arith.constant 1 : i32 +!CHECK: %[[VAL_7:.*]] = arith.addi %[[VAL_5]], %[[VAL_6]] : i32 +!CHECK: hlfir.assign %[[VAL_7]] to %[[VAL_4]]#0 : i32, !fir.ref +!CHECK: omp.terminator +!CHECK: } +!CHECK: return +!CHECK: } + +subroutine omp_task_in_reduction() + integer i + i = 0 + !$omp task in_reduction(+:i) + i = i + 1 + !$omp end task +end subroutine omp_task_in_reduction diff --git a/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 b/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 new file mode 100644 index 0000000000000..18d45217272fc --- /dev/null +++ b/flang/test/Lower/OpenMP/taskgroup-task-array-reduction.f90 @@ -0,0 +1,49 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.declare_reduction @add_reduction_byref_box_Uxf32 : !fir.ref>> alloc { +! [...] +! CHECK: omp.yield +! CHECK-LABEL: } init { +! [...] +! CHECK: omp.yield +! CHECK-LABEL: } combiner { +! [...] +! CHECK: omp.yield +! CHECK-LABEL: } cleanup { +! [...] +! CHECK: omp.yield +! CHECK: } + +! CHECK-LABEL: func.func @_QPtask_reduction +! CHECK-SAME: (%[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}) { +! CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_0]] dummy_scope %[[VAL_1]] +! CHECK-SAME: {uniq_name = "_QFtask_reductionEx"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +! CHECK: omp.parallel { +! CHECK: %[[VAL_3:.*]] = fir.alloca !fir.box> +! CHECK: fir.store %[[VAL_2]]#1 to %[[VAL_3]] : !fir.ref>> +! 
CHECK: omp.taskgroup task_reduction(byref @add_reduction_byref_box_Uxf32 %[[VAL_3]] -> %[[VAL_4:.*]]: !fir.ref>>) { +! CHECK: %[[VAL_5:.*]]:2 = hlfir.declare %[[VAL_4]] +! CHECK-SAME: {uniq_name = "_QFtask_reductionEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +! CHECK: omp.task in_reduction(byref @add_reduction_byref_box_Uxf32 %[[VAL_5]]#0 -> %[[VAL_6:.*]] : !fir.ref>>) { +! [...] +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } + +subroutine task_reduction(x) + real, dimension(:) :: x + !$omp parallel + !$omp taskgroup task_reduction(+:x) + !$omp task in_reduction(+:x) + x = x + 1 + !$omp end task + !$omp end taskgroup + !$omp end parallel +end subroutine diff --git a/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 b/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 new file mode 100644 index 0000000000000..be4d3193e99f7 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskgroup-task_reduction01.f90 @@ -0,0 +1,36 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +!CHECK-LABEL: omp.declare_reduction +!CHECK-SAME: @[[RED_I32_NAME:.*]] : i32 init { +!CHECK: ^bb0(%{{.*}}: i32): +!CHECK: %[[C0_1:.*]] = arith.constant 0 : i32 +!CHECK: omp.yield(%[[C0_1]] : i32) +!CHECK: } combiner { +!CHECK: ^bb0(%[[ARG0:.*]]: i32, %[[ARG1:.*]]: i32): +!CHECK: %[[RES:.*]] = arith.addi %[[ARG0]], %[[ARG1]] : i32 +!CHECK: omp.yield(%[[RES]] : i32) +!CHECK: } + +!CHECK-LABEL: func.func @_QPomp_taskgroup_task_reduction() { +!CHECK: %[[VAL_0:.*]] = fir.alloca i32 {bindc_name = "res", uniq_name = "_QFomp_taskgroup_task_reductionEres"} +!CHECK: %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFomp_taskgroup_task_reductionEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: omp.taskgroup task_reduction(@[[RED_I32_NAME]] %[[VAL_1]]#0 -> %[[VAL_2:.*]] : !fir.ref) { +!CHECK: %[[VAL_3:.*]]:2 = hlfir.declare %[[VAL_2]] +!CHECK-SAME: {uniq_name = "_QFomp_taskgroup_task_reductionEres"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[VAL_4:.*]] = fir.load %[[VAL_3]]#0 : !fir.ref +!CHECK: %[[VAL_5:.*]] = arith.constant 1 : i32 +!CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_4]], %[[VAL_5]] : i32 +!CHECK: hlfir.assign %[[VAL_6]] to %[[VAL_3]]#0 : i32, !fir.ref +!CHECK: omp.terminator +!CHECK: } +!CHECK: return +!CHECK: } + + +subroutine omp_taskgroup_task_reduction() + integer :: res + !$omp taskgroup task_reduction(+:res) + res = res + 1 + !$omp end taskgroup +end subroutine diff --git a/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 b/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 new file mode 100644 index 0000000000000..ed91e582d2bf5 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskgroup-task_reduction02.f90 @@ -0,0 +1,37 @@ +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +!CHECK-LABEL: omp.declare_reduction +!CHECK-SAME: @[[RED_I32_NAME:.*]] : i32 init { +!CHECK: ^bb0(%{{.*}}: i32): +!CHECK: %[[C0_1:.*]] = arith.constant 0 : i32 +!CHECK: omp.yield(%[[C0_1]] : i32) +!CHECK: } combiner { +!CHECK: ^bb0(%[[ARG0:.*]]: i32, %[[ARG1:.*]]: i32): +!CHECK: %[[RES:.*]] = arith.addi %[[ARG0]], %[[ARG1]] : i32 +!CHECK: omp.yield(%[[RES]] : i32) +!CHECK: } + +!CHECK-LABEL: func.func @_QPin_reduction() { +! [...] +!CHECK: omp.taskgroup task_reduction(@[[RED_I32_NAME]] %[[VAL_1:.*]]#0 -> %[[VAL_3:.*]] : !fir.ref) { +!CHECK: %[[VAL_4:.*]]:2 = hlfir.declare %[[VAL_3]] {uniq_name = "_QFin_reductionEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: omp.task in_reduction(@[[RED_I32_NAME]] %[[VAL_4]]#0 -> %[[VAL_5:.*]] : !fir.ref) { +!CHECK: %[[VAL_6:.*]]:2 = hlfir.declare %[[VAL_5]] {uniq_name = "_QFin_reductionEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! [...] +!CHECK: omp.terminator +!CHECK: } +!CHECK: omp.terminator +!CHECK: } +!CHECK: return +!CHECK: } + +subroutine in_reduction() + integer :: x + x = 0 + !$omp taskgroup task_reduction(+:x) + !$omp task in_reduction(+:x) + x = x + 1 + !$omp end task + !$omp end taskgroup +end subroutine From flang-commits at lists.llvm.org Tue May 6 21:56:05 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 21:56:05 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR (PR #111155) In-Reply-To: Message-ID: <681ae7e5.170a0220.edd8a.6a56@mx.google.com> https://github.com/kaviya2510 closed https://github.com/llvm/llvm-project/pull/111155 From flang-commits at lists.llvm.org Tue May 6 21:56:23 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 21:56:23 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering task_reduction and in_reduction to MLIR (PR 
#111155) In-Reply-To: Message-ID: <681ae7f7.170a0220.149288.74d5@mx.google.com> github-actions[bot] wrote: @kaviya2510 Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our [build bots](https://lab.llvm.org/buildbot/). If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail [here](https://llvm.org/docs/MyFirstTypoFix.html#myfirsttypofix-issues-after-landing-your-pr). If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of [LLVM development](https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy). You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done! 
https://github.com/llvm/llvm-project/pull/111155 From flang-commits at lists.llvm.org Tue May 6 22:44:51 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 22:44:51 -0700 (PDT) Subject: [flang-commits] [flang] e1fed24 - [flang][OpenMP] Fix fir.convert in omp.atomic.update region (#138397) Message-ID: <681af353.170a0220.23ef68.7c24@mx.google.com> Author: NimishMishra Date: 2025-05-06T22:44:48-07:00 New Revision: e1fed24034fee3f45bc17252ced5ee29ab6b5408 URL: https://github.com/llvm/llvm-project/commit/e1fed24034fee3f45bc17252ced5ee29ab6b5408 DIFF: https://github.com/llvm/llvm-project/commit/e1fed24034fee3f45bc17252ced5ee29ab6b5408.diff LOG: [flang][OpenMP] Fix fir.convert in omp.atomic.update region (#138397) Region generation in omp.atomic.update currently emits a direct `fir.convert`. This crashes when the RHS expression involves complex type but the LHS variable is primitive type (say `f32`), since a `fir.convert` from `complex` to `f32` is emitted, which is illegal. This PR adds a conditional check to emit an additional `ExtractValueOp` in case RHS expression has a complex type. 
Fixes https://github.com/llvm/llvm-project/issues/138396 Added: Modified: flang/lib/Lower/OpenMP/OpenMP.cpp flang/test/Lower/OpenMP/atomic-update.f90 Removed: ################################################################################ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ca53130e5d363..49247e2e5cce4 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2858,9 +2858,23 @@ static void genAtomicUpdateStatement( lower::StatementContext atomicStmtCtx; mlir::Value rhsExpr = fir::getBase(converter.genExprValue( *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); + mlir::Type exprType = fir::unwrapRefType(rhsExpr.getType()); + if (fir::isa_complex(exprType) && !fir::isa_complex(varType)) { + // Emit an additional `ExtractValueOp` if the expression is of complex + // type + auto extract = firOpBuilder.create( + currentLocation, + mlir::cast(exprType).getElementType(), rhsExpr, + firOpBuilder.getArrayAttr( + firOpBuilder.getIntegerAttr(firOpBuilder.getIndexType(), 0))); + mlir::Value convertResult = firOpBuilder.create( + currentLocation, varType, extract); + firOpBuilder.create(currentLocation, convertResult); + } else { + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + } converter.resetExprOverrides(); } firOpBuilder.setInsertionPointAfter(atomicUpdateOp); diff --git a/flang/test/Lower/OpenMP/atomic-update.f90 b/flang/test/Lower/OpenMP/atomic-update.f90 index 31bf447006930..257ae8fb497ff 100644 --- a/flang/test/Lower/OpenMP/atomic-update.f90 +++ b/flang/test/Lower/OpenMP/atomic-update.f90 @@ -20,6 +20,8 @@ program OmpAtomicUpdate !CHECK: %[[VAL_C_DECLARE:.*]]:2 = hlfir.declare %[[VAL_C_ADDRESS]] {{.*}} !CHECK: 
%[[VAL_D_ADDRESS:.*]] = fir.address_of(@_QFEd) : !fir.ref !CHECK: %[[VAL_D_DECLARE:.*]]:2 = hlfir.declare %[[VAL_D_ADDRESS]] {{.}} +!CHECK: %[[VAL_G_ADDRESS:.*]] = fir.alloca complex {bindc_name = "g", uniq_name = "_QFEg"} +!CHECK: %[[VAL_G_DECLARE:.*]]:2 = hlfir.declare %[[VAL_G_ADDRESS]] {uniq_name = "_QFEg"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) !CHECK: %[[VAL_i1_ALLOCA:.*]] = fir.alloca i8 {bindc_name = "i1", uniq_name = "_QFEi1"} !CHECK: %[[VAL_i1_DECLARE:.*]]:2 = hlfir.declare %[[VAL_i1_ALLOCA]] {{.*}} !CHECK: %[[VAL_c5:.*]] = arith.constant 5 : index @@ -40,6 +42,7 @@ program OmpAtomicUpdate integer, target :: c, d integer(1) :: i1 integer, dimension(5) :: k + complex :: g !CHECK: %[[EMBOX:.*]] = fir.embox %[[VAL_C_DECLARE]]#0 : (!fir.ref) -> !fir.box> !CHECK: fir.store %[[EMBOX]] to %[[VAL_A_DECLARE]]#0 : !fir.ref>> @@ -200,4 +203,19 @@ program OmpAtomicUpdate !CHECK: } !$omp atomic update x = x + sum([ (y+2, y=1, z) ]) + +!CHECK: %[[LOAD:.*]] = fir.load %[[VAL_G_DECLARE]]#0 : !fir.ref> +!CHECK: omp.atomic.update %[[VAL_W_DECLARE]]#0 : !fir.ref { +!CHECK: ^bb0(%[[ARG:.*]]: i32): +!CHECK: %[[CVT:.*]] = fir.convert %[[ARG]] : (i32) -> f32 +!CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +!CHECK: %[[UNDEF:.*]] = fir.undefined complex +!CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex +!CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex +!CHECK: %[[ADD:.*]] = fir.addc %[[IDX2]], %[[LOAD]] {fastmath = #arith.fastmath} : complex +!CHECK: %[[EXT:.*]] = fir.extract_value %[[ADD]], [0 : index] : (complex) -> f32 +!CHECK: %[[RESULT:.*]] = fir.convert %[[EXT]] : (f32) -> i32 +!CHECK: omp.yield(%[[RESULT]] : i32) + !$omp atomic update + w = w + g end program OmpAtomicUpdate From flang-commits at lists.llvm.org Tue May 6 22:44:53 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 06 May 2025 22:44:53 -0700 (PDT) Subject: [flang-commits] 
[flang] [flang][OpenMP] Fix fir.convert in omp.atomic.update region (PR #138397) In-Reply-To: Message-ID: <681af355.a70a0220.2e11b4.148d@mx.google.com> https://github.com/NimishMishra closed https://github.com/llvm/llvm-project/pull/138397 From flang-commits at lists.llvm.org Tue May 6 23:24:24 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 23:24:24 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering grainsize and num_tasks clause to… (PR #128490) In-Reply-To: Message-ID: <681afc98.170a0220.31efc4.eae0@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/128490 >From 2527a46c3b687a954dcb5b20c1c90add10796443 Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Wed, 7 May 2025 11:53:51 +0530 Subject: [PATCH] [Flang][OpenMP]Support for lowering grainsize and num_tasks clause of taskloop construct to MLIR --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 42 +++++++++++++++ flang/lib/Lower/OpenMP/ClauseProcessor.h | 4 ++ flang/lib/Lower/OpenMP/OpenMP.cpp | 26 ++++----- .../test/Lower/OpenMP/taskloop-grainsize.f90 | 51 ++++++++++++++++++ flang/test/Lower/OpenMP/taskloop-numtasks.f90 | 54 +++++++++++++++++++ 5 files changed, 165 insertions(+), 12 deletions(-) create mode 100644 flang/test/Lower/OpenMP/taskloop-grainsize.f90 create mode 100644 flang/test/Lower/OpenMP/taskloop-numtasks.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 77b4622547d7a..ac940b5c74152 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -365,6 +365,27 @@ bool ClauseProcessor::processHint(mlir::omp::HintClauseOps &result) const { return false; } +bool ClauseProcessor::processGrainsize( + lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const { + using grainsize = 
omp::clause::Grainsize; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier) { + result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( + context, mlir::omp::ClauseGrainsizeType::Strict); + } + const auto &grainsizeExpr = std::get(clause->t); + result.grainsize = + fir::getBase(converter.genExprValue(grainsizeExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processInclusive( mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const { @@ -388,6 +409,27 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const { return markClauseOccurrence(result.nowait); } +bool ClauseProcessor::processNumTasks( + lower::StatementContext &stmtCtx, + mlir::omp::NumTasksClauseOps &result) const { + using numtasks = omp::clause::NumTasks; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier) { + result.numTasksMod = mlir::omp::ClauseNumTasksTypeAttr::get( + context, mlir::omp::ClauseNumTasksType::Strict); + } + const auto &numtasksExpr = std::get(clause->t); + result.numTasks = + fir::getBase(converter.genExprValue(numtasksExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processNumTeams( lower::StatementContext &stmtCtx, mlir::omp::NumTeamsClauseOps &result) const { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index bdddeb145b496..375e24b80fc21 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -78,10 +78,14 @@ class ClauseProcessor { mlir::omp::HasDeviceAddrClauseOps &result, llvm::SmallVectorImpl &hasDeviceSyms) const; bool 
processHint(mlir::omp::HintClauseOps &result) const; + bool processGrainsize(lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const; bool processInclusive(mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const; bool processMergeable(mlir::omp::MergeableClauseOps &result) const; bool processNowait(mlir::omp::NowaitClauseOps &result) const; + bool processNumTasks(lower::StatementContext &stmtCtx, + mlir::omp::NumTasksClauseOps &result) const; bool processNumTeams(lower::StatementContext &stmtCtx, mlir::omp::NumTeamsClauseOps &result) const; bool processNumThreads(lower::StatementContext &stmtCtx, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fcd3de9671098..af227b28d35b3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1806,17 +1806,19 @@ static void genTaskgroupClauses(lower::AbstractConverter &converter, static void genTaskloopClauses(lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, + lower::StatementContext &stmtCtx, const List &clauses, mlir::Location loc, mlir::omp::TaskloopOperands &clauseOps) { ClauseProcessor cp(converter, semaCtx, clauses); + cp.processGrainsize(stmtCtx, clauseOps); + cp.processNumTasks(stmtCtx, clauseOps); cp.processTODO( - loc, llvm::omp::Directive::OMPD_taskloop); + clause::Final, clause::If, clause::InReduction, + clause::Lastprivate, clause::Mergeable, clause::Nogroup, + clause::Priority, clause::Reduction, clause::Shared, + clause::Untied>(loc, llvm::omp::Directive::OMPD_taskloop); } static void genTaskwaitClauses(lower::AbstractConverter &converter, @@ -3268,12 +3270,12 @@ genStandaloneSimd(lower::AbstractConverter &converter, lower::SymMap &symTable, static mlir::omp::TaskloopOp genStandaloneTaskloop( lower::AbstractConverter &converter, lower::SymMap &symTable, - semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - mlir::Location loc, const ConstructQueue 
&queue, - ConstructQueue::const_iterator item) { + lower::StatementContext &stmtCtx, semantics::SemanticsContext &semaCtx, + lower::pft::Evaluation &eval, mlir::Location loc, + const ConstructQueue &queue, ConstructQueue::const_iterator item) { mlir::omp::TaskloopOperands taskloopClauseOps; - genTaskloopClauses(converter, semaCtx, item->clauses, loc, taskloopClauseOps); - + genTaskloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc, + taskloopClauseOps); DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval, /*shouldCollectPreDeterminedSymbols=*/true, enableDelayedPrivatization, symTable); @@ -3734,8 +3736,8 @@ static void genOMPDispatch(lower::AbstractConverter &converter, genTaskgroupOp(converter, symTable, semaCtx, eval, loc, queue, item); break; case llvm::omp::Directive::OMPD_taskloop: - newOp = genStandaloneTaskloop(converter, symTable, semaCtx, eval, loc, - queue, item); + newOp = genStandaloneTaskloop(converter, symTable, stmtCtx, semaCtx, eval, + loc, queue, item); break; case llvm::omp::Directive::OMPD_taskwait: newOp = genTaskwaitOp(converter, symTable, semaCtx, eval, loc, queue, item); diff --git a/flang/test/Lower/OpenMP/taskloop-grainsize.f90 b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 new file mode 100644 index 0000000000000..fa684ad213d0a --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 @@ -0,0 +1,51 @@ +! This test checks lowering of grainsize clause in taskloop directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE_TEST2:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! 
CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_grainsize +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_grainsizeEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_grainsizeEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFtest_grainsizeEx"} +! CHECK: %[[DECL_X:.*]]:2 = hlfir.declare %[[ALLOCA_X]] {uniq_name = "_QFtest_grainsizeEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[GRAINSIZE:.*]] = arith.constant 10 : i32 +subroutine test_grainsize + integer :: i, x + ! CHECK: omp.taskloop grainsize(%[[GRAINSIZE]]: i32) + ! CHECK-SAME: private(@[[X_FIRSTPRIVATE]] %[[DECL_X]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { + ! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { + !$omp taskloop grainsize(10) + do i = 1, 1000 + x = x + 1 + end do + !$omp end taskloop +end subroutine test_grainsize + +!CHECK-LABEL: func.func @_QPtest_grainsize_strict() +subroutine test_grainsize_strict + integer :: i, x + ! CHECK: %[[GRAINSIZE:.*]] = arith.constant 10 : i32 + ! CHECK: omp.taskloop grainsize(strict, %[[GRAINSIZE]]: i32) + !$omp taskloop grainsize(strict:10) + do i = 1, 1000 + !CHECK: arith.addi + x = x + 1 + end do + !$omp end taskloop +end subroutine \ No newline at end of file diff --git a/flang/test/Lower/OpenMP/taskloop-numtasks.f90 b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 new file mode 100644 index 0000000000000..38f3975bbd371 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 @@ -0,0 +1,54 @@ +! This test checks lowering of num_tasks clause in taskloop directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE_TEST2:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_num_tasks +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_num_tasksEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_num_tasksEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFtest_num_tasksEx"} +! CHECK: %[[DECL_X:.*]]:2 = hlfir.declare %[[ALLOCA_X]] {uniq_name = "_QFtest_num_tasksEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL_NUMTASKS:.*]] = arith.constant 10 : i32 +subroutine test_num_tasks + integer :: i, x + ! CHECK: omp.taskloop num_tasks(%[[VAL_NUMTASKS]]: i32) + ! CHECK-SAME: private(@[[X_FIRSTPRIVATE]] %[[DECL_X]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { + ! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { + !$omp taskloop num_tasks(10) + do i = 1, 1000 + x = x + 1 + end do + !$omp end taskloop +end subroutine test_num_tasks + +! CHECK-LABEL: func.func @_QPtest_num_tasks_strict +subroutine test_num_tasks_strict + integer :: x, i + ! CHECK: %[[NUM_TASKS:.*]] = arith.constant 10 : i32 + ! 
CHECK: omp.taskloop num_tasks(strict, %[[NUM_TASKS]]: i32) + !$omp taskloop num_tasks(strict:10) + do i = 1, 100 + !CHECK: arith.addi + x = x + 1 + end do + !$omp end taskloop +end subroutine + + + From flang-commits at lists.llvm.org Tue May 6 23:26:54 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 23:26:54 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering grainsize and num_tasks clause to… (PR #128490) In-Reply-To: Message-ID: <681afd2e.050a0220.13d468.f17a@mx.google.com> https://github.com/kaviya2510 edited https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Tue May 6 23:29:33 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 23:29:33 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP]Support for lowering grainsize and num_tasks clause to… (PR #128490) In-Reply-To: Message-ID: <681afdcd.050a0220.f9fc.f919@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/128490

From flang-commits at lists.llvm.org Tue May 6 23:30:36 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 23:30:36 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681afe0c.050a0220.2f1ad4.f9ac@mx.google.com> https://github.com/kaviya2510 edited https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Tue May 6 23:30:59 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 23:30:59 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681afe23.050a0220.31ebd1.0077@mx.google.com> https://github.com/kaviya2510 edited https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Tue May 6 23:31:05 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 23:31:05 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681afe29.050a0220.15a3dd.c18e@mx.google.com> https://github.com/kaviya2510 ready_for_review https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Tue May 6 23:35:29 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 06 May 2025 23:35:29 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681aff31.050a0220.22c6ae.f5f5@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/128490 >From 2075eb0739938946e80b8e632f6512be735a04c7 Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Wed, 7 May 2025 11:53:51 +0530 Subject: [PATCH 1/2] 
[Flang][OpenMP]Added MLIR lowering for grainsize and num_tasks clause --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 42 +++++++++++++++ flang/lib/Lower/OpenMP/ClauseProcessor.h | 4 ++ flang/lib/Lower/OpenMP/OpenMP.cpp | 26 ++++----- .../test/Lower/OpenMP/taskloop-grainsize.f90 | 51 ++++++++++++++++++ flang/test/Lower/OpenMP/taskloop-numtasks.f90 | 54 +++++++++++++++++++ 5 files changed, 165 insertions(+), 12 deletions(-) create mode 100644 flang/test/Lower/OpenMP/taskloop-grainsize.f90 create mode 100644 flang/test/Lower/OpenMP/taskloop-numtasks.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 77b4622547d7a..ac940b5c74152 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -365,6 +365,27 @@ bool ClauseProcessor::processHint(mlir::omp::HintClauseOps &result) const { return false; } +bool ClauseProcessor::processGrainsize( + lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const { + using grainsize = omp::clause::Grainsize; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier) { + result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( + context, mlir::omp::ClauseGrainsizeType::Strict); + } + const auto &grainsizeExpr = std::get(clause->t); + result.grainsize = + fir::getBase(converter.genExprValue(grainsizeExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processInclusive( mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const { @@ -388,6 +409,27 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const { return markClauseOccurrence(result.nowait); } +bool ClauseProcessor::processNumTasks( + lower::StatementContext &stmtCtx, + mlir::omp::NumTasksClauseOps &result) const 
{ + using numtasks = omp::clause::NumTasks; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier) { + result.numTasksMod = mlir::omp::ClauseNumTasksTypeAttr::get( + context, mlir::omp::ClauseNumTasksType::Strict); + } + const auto &numtasksExpr = std::get(clause->t); + result.numTasks = + fir::getBase(converter.genExprValue(numtasksExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processNumTeams( lower::StatementContext &stmtCtx, mlir::omp::NumTeamsClauseOps &result) const { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index bdddeb145b496..375e24b80fc21 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -78,10 +78,14 @@ class ClauseProcessor { mlir::omp::HasDeviceAddrClauseOps &result, llvm::SmallVectorImpl &hasDeviceSyms) const; bool processHint(mlir::omp::HintClauseOps &result) const; + bool processGrainsize(lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const; bool processInclusive(mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const; bool processMergeable(mlir::omp::MergeableClauseOps &result) const; bool processNowait(mlir::omp::NowaitClauseOps &result) const; + bool processNumTasks(lower::StatementContext &stmtCtx, + mlir::omp::NumTasksClauseOps &result) const; bool processNumTeams(lower::StatementContext &stmtCtx, mlir::omp::NumTeamsClauseOps &result) const; bool processNumThreads(lower::StatementContext &stmtCtx, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fcd3de9671098..af227b28d35b3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1806,17 +1806,19 @@ static void genTaskgroupClauses(lower::AbstractConverter 
&converter, static void genTaskloopClauses(lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, + lower::StatementContext &stmtCtx, const List &clauses, mlir::Location loc, mlir::omp::TaskloopOperands &clauseOps) { ClauseProcessor cp(converter, semaCtx, clauses); + cp.processGrainsize(stmtCtx, clauseOps); + cp.processNumTasks(stmtCtx, clauseOps); cp.processTODO( - loc, llvm::omp::Directive::OMPD_taskloop); + clause::Final, clause::If, clause::InReduction, + clause::Lastprivate, clause::Mergeable, clause::Nogroup, + clause::Priority, clause::Reduction, clause::Shared, + clause::Untied>(loc, llvm::omp::Directive::OMPD_taskloop); } static void genTaskwaitClauses(lower::AbstractConverter &converter, @@ -3268,12 +3270,12 @@ genStandaloneSimd(lower::AbstractConverter &converter, lower::SymMap &symTable, static mlir::omp::TaskloopOp genStandaloneTaskloop( lower::AbstractConverter &converter, lower::SymMap &symTable, - semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - mlir::Location loc, const ConstructQueue &queue, - ConstructQueue::const_iterator item) { + lower::StatementContext &stmtCtx, semantics::SemanticsContext &semaCtx, + lower::pft::Evaluation &eval, mlir::Location loc, + const ConstructQueue &queue, ConstructQueue::const_iterator item) { mlir::omp::TaskloopOperands taskloopClauseOps; - genTaskloopClauses(converter, semaCtx, item->clauses, loc, taskloopClauseOps); - + genTaskloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc, + taskloopClauseOps); DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval, /*shouldCollectPreDeterminedSymbols=*/true, enableDelayedPrivatization, symTable); @@ -3734,8 +3736,8 @@ static void genOMPDispatch(lower::AbstractConverter &converter, genTaskgroupOp(converter, symTable, semaCtx, eval, loc, queue, item); break; case llvm::omp::Directive::OMPD_taskloop: - newOp = genStandaloneTaskloop(converter, symTable, semaCtx, eval, loc, - queue, item); + newOp = 
genStandaloneTaskloop(converter, symTable, stmtCtx, semaCtx, eval, + loc, queue, item); break; case llvm::omp::Directive::OMPD_taskwait: newOp = genTaskwaitOp(converter, symTable, semaCtx, eval, loc, queue, item); diff --git a/flang/test/Lower/OpenMP/taskloop-grainsize.f90 b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 new file mode 100644 index 0000000000000..fa684ad213d0a --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 @@ -0,0 +1,51 @@ +! This test checks lowering of grainsize clause in taskloop directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE_TEST2:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_grainsize +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_grainsizeEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_grainsizeEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFtest_grainsizeEx"} +! CHECK: %[[DECL_X:.*]]:2 = hlfir.declare %[[ALLOCA_X]] {uniq_name = "_QFtest_grainsizeEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[GRAINSIZE:.*]] = arith.constant 10 : i32 +subroutine test_grainsize + integer :: i, x + ! CHECK: omp.taskloop grainsize(%[[GRAINSIZE]]: i32) + ! CHECK-SAME: private(@[[X_FIRSTPRIVATE]] %[[DECL_X]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { + ! 
CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { + !$omp taskloop grainsize(10) + do i = 1, 1000 + x = x + 1 + end do + !$omp end taskloop +end subroutine test_grainsize + +!CHECK-LABEL: func.func @_QPtest_grainsize_strict() +subroutine test_grainsize_strict + integer :: i, x + ! CHECK: %[[GRAINSIZE:.*]] = arith.constant 10 : i32 + ! CHECK: omp.taskloop grainsize(strict, %[[GRAINSIZE]]: i32) + !$omp taskloop grainsize(strict:10) + do i = 1, 1000 + !CHECK: arith.addi + x = x + 1 + end do + !$omp end taskloop +end subroutine \ No newline at end of file diff --git a/flang/test/Lower/OpenMP/taskloop-numtasks.f90 b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 new file mode 100644 index 0000000000000..38f3975bbd371 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 @@ -0,0 +1,54 @@ +! This test checks lowering of num_tasks clause in taskloop directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE_TEST2:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_num_tasks +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_num_tasksEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_num_tasksEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFtest_num_tasksEx"} +! 
CHECK: %[[DECL_X:.*]]:2 = hlfir.declare %[[ALLOCA_X]] {uniq_name = "_QFtest_num_tasksEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL_NUMTASKS:.*]] = arith.constant 10 : i32 +subroutine test_num_tasks + integer :: i, x + ! CHECK: omp.taskloop num_tasks(%[[VAL_NUMTASKS]]: i32) + ! CHECK-SAME: private(@[[X_FIRSTPRIVATE]] %[[DECL_X]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { + ! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { + !$omp taskloop num_tasks(10) + do i = 1, 1000 + x = x + 1 + end do + !$omp end taskloop +end subroutine test_num_tasks + +! CHECK-LABEL: func.func @_QPtest_num_tasks_strict +subroutine test_num_tasks_strict + integer :: x, i + ! CHECK: %[[NUM_TASKS:.*]] = arith.constant 10 : i32 + ! CHECK: omp.taskloop num_tasks(strict, %[[NUM_TASKS]]: i32) + !$omp taskloop num_tasks(strict:10) + do i = 1, 100 + !CHECK: arith.addi + x = x + 1 + end do + !$omp end taskloop +end subroutine + + + >From 916889f13e0a5d2b67d370e338ed1fcfd8c62b2a Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Wed, 7 May 2025 12:04:30 +0530 Subject: [PATCH 2/2] [Flang][OpenMP] Formatting fix --- flang/test/Lower/OpenMP/taskloop-grainsize.f90 | 2 +- flang/test/Lower/OpenMP/taskloop-numtasks.f90 | 3 --- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/flang/test/Lower/OpenMP/taskloop-grainsize.f90 b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 index fa684ad213d0a..43db8acdeceac 100644 --- a/flang/test/Lower/OpenMP/taskloop-grainsize.f90 +++ b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 @@ -48,4 +48,4 @@ subroutine test_grainsize_strict x = x + 1 end do !$omp end taskloop -end subroutine \ No newline at end of file +end subroutine diff --git a/flang/test/Lower/OpenMP/taskloop-numtasks.f90 b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 index 38f3975bbd371..f68f3a2d6ad26 100644 --- a/flang/test/Lower/OpenMP/taskloop-numtasks.f90 +++ 
b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 @@ -49,6 +49,3 @@ subroutine test_num_tasks_strict end do !$omp end taskloop end subroutine - - - From flang-commits at lists.llvm.org Tue May 6 23:37:14 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Tue, 06 May 2025 23:37:14 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681aff9a.050a0220.8498b.1f35@mx.google.com> https://github.com/clementval edited https://github.com/llvm/llvm-project/pull/138762 From flang-commits at lists.llvm.org Tue May 6 23:37:15 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Tue, 06 May 2025 23:37:15 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681aff9b.050a0220.2847f4.2d06@mx.google.com> https://github.com/clementval approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/138762 From flang-commits at lists.llvm.org Tue May 6 23:37:16 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Tue, 06 May 2025 23:37:16 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681aff9c.170a0220.289717.6749@mx.google.com> ================ @@ -0,0 +1,42 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 + +! Test case 1: Device arrays with ignore_tkr(c) +subroutine test_device_arrays() + interface bar + subroutine bar1(a) +!dir$ ignore_tkr(c) a + real :: a(..) +!@cuf attributes(device) :: a + end subroutine + end interface + + integer :: n = 10, k = 2 + real, device :: a(10), b(10), c(10) + + call bar(a(1:n)) ! 
Should not warn about contiguity + call bar(b(1:n:k)) ! Should not warn about contiguity + call bar(c(1:n:2)) ! Should not warn about contiguity +end subroutine + +! Test case 2: Managed arrays with ignore_tkr(c) +subroutine test_managed_arrays() + interface bar + subroutine bar1(a) +!dir$ ignore_tkr(c) a + real :: a(..) +!@cuf attributes(device) :: a + end subroutine + end interface + + integer :: n = 10, k = 2 + real, managed :: a(10), b(10), c(10) + + call bar(a(1:n)) ! Should not warn about contiguity + call bar(b(1:n:k)) ! Should not warn about contiguity + call bar(c(1:n:2)) ! Should not warn about contiguity +end subroutine + +program main + call test_device_arrays() + call test_managed_arrays() +end program ---------------- clementval wrote: New line at end of file https://github.com/llvm/llvm-project/pull/138762 From flang-commits at lists.llvm.org Wed May 7 00:10:40 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 00:10:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix crash with USE of hermetic module file (PR #138785) In-Reply-To: Message-ID: <681b0770.170a0220.186e5e.89c2@mx.google.com> https://github.com/jeanPerier approved this pull request. https://github.com/llvm/llvm-project/pull/138785 From flang-commits at lists.llvm.org Wed May 7 00:45:11 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 00:45:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix crash with USE of hermetic module file (PR #138785) In-Reply-To: Message-ID: <681b0f87.170a0220.3ab31f.7fc5@mx.google.com> ================ @@ -0,0 +1,17 @@ +!RUN: (%flang -c -DWHICH=1 %s && %flang -c -DWHICH=2 %s && %flang_fc1 -fdebug-unparse %s) | FileCheck %s + +#if WHICH == 1 +module m1 + use iso_c_binding +end +#elif WHICH == 2 +module m2 + use m1 +end +#else +program test + use m2 +!CHECK: INTEGER(KIND=4_4) n + integer(c_int) n ---------------- jeanPerier wrote: Something is broken here. 
See the buildbots: "flang/test/Semantics/modfile75.F90:15:11: error: Must be a constant value". I think this could be a lit "race" on module files (I am not sure, but I think all the tests are run in the same directory in parallel, which would cause the issue since names like `m1` are common and the module file could be overwritten before the final part of the test). https://github.com/llvm/llvm-project/pull/138785 From flang-commits at lists.llvm.org Wed May 7 01:17:10 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 01:17:10 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Propagate contiguous attribute through HLFIR. (PR #138797) In-Reply-To: Message-ID: <681b1706.050a0220.2f2dd7.3b9f@mx.google.com> https://github.com/jeanPerier approved this pull request. Thanks! This looks good to me because it will help once inlining has exposed more information, but for the use case in the test, I think lowering could be made nicer by setting the CONTIGUOUS attribute when it knows that the hlfir.designate fir.box result is contiguous [here](https://github.com/llvm/llvm-project/blob/e55172f139a21f3d6da932787a0b221b53eab2cb/flang/lib/Lower/ConvertExprToHLFIR.cpp#L280) using the semantics helper `Fortran::evaluate::IsSimplyContiguous(designatorNode,getConverter().getFoldingContext(),/*namedConstantSectionsAreAlwaysContiguous=*/false)`. https://github.com/llvm/llvm-project/pull/138797 From flang-commits at lists.llvm.org Wed May 7 01:32:24 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 01:32:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Postpone hlfir.end_associate generation for calls. (PR #138786) In-Reply-To: Message-ID: <681b1a98.170a0220.19e4d9.7ea1@mx.google.com> https://github.com/jeanPerier approved this pull request. Looks great to me, thanks! I agree that delaying the copy-out is trickier and is best not done without more thought. 
https://github.com/llvm/llvm-project/pull/138786 From flang-commits at lists.llvm.org Wed May 7 01:41:46 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 01:41:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Correctly prepare allocatable runtime call arguments (PR #138727) In-Reply-To: Message-ID: <681b1cca.170a0220.1ce0a.1121@mx.google.com> https://github.com/jeanPerier edited https://github.com/llvm/llvm-project/pull/138727 From flang-commits at lists.llvm.org Wed May 7 01:41:46 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 01:41:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Correctly prepare allocatable runtime call arguments (PR #138727) In-Reply-To: Message-ID: <681b1cca.050a0220.3520dc.fcf3@mx.google.com> https://github.com/jeanPerier approved this pull request. Thanks Asher, LGTM! https://github.com/llvm/llvm-project/pull/138727 From flang-commits at lists.llvm.org Wed May 7 01:41:46 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 01:41:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Correctly prepare allocatable runtime call arguments (PR #138727) In-Reply-To: Message-ID: <681b1cca.170a0220.f5084.7341@mx.google.com> ================ @@ -210,15 +210,14 @@ static bool hasExplicitLowerBounds(mlir::Value shape) { static std::pair updateDeclareInputTypeWithVolatility( mlir::Type inputType, mlir::Value memref, mlir::OpBuilder &builder, fir::FortranVariableFlagsAttr fortran_attrs) { - if (mlir::isa(inputType) && fortran_attrs && + if (fortran_attrs && bitEnumContainsAny(fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::fortran_volatile)) { const bool isPointer = bitEnumContainsAny( fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::pointer); auto updateType = [&](auto t) { using FIRT = decltype(t); - // If an entity is a pointer, the entity it points to is volatile, as far - // as consumers of the pointer are 
concerned. + // A volatile pointer's pointee is volatile. ---------------- jeanPerier wrote: Unrelated to your patch, but since you updated the comment :), why doesn't this also apply to the target data of an allocatable descriptor? https://github.com/llvm/llvm-project/pull/138727 From flang-commits at lists.llvm.org Wed May 7 01:49:35 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 01:49:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] share global variable initialization code (PR #138672) In-Reply-To: Message-ID: <681b1e9f.170a0220.5b842.1e0c@mx.google.com> https://github.com/jeanPerier approved this pull request. Looks good! https://github.com/llvm/llvm-project/pull/138672 From flang-commits at lists.llvm.org Wed May 7 01:56:49 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 01:56:49 -0700 (PDT) Subject: [flang-commits] [flang] a13c0b6 - [Flang][OpenMP] Add frontend support for declare variant (#130578) Message-ID: <681b2051.050a0220.25f316.42e1@mx.google.com> Author: Kiran Chandramohan Date: 2025-05-07T09:56:45+01:00 New Revision: a13c0b67708173b8033a53ff6ae4c46c5b80bb2b URL: https://github.com/llvm/llvm-project/commit/a13c0b67708173b8033a53ff6ae4c46c5b80bb2b DIFF: https://github.com/llvm/llvm-project/commit/a13c0b67708173b8033a53ff6ae4c46c5b80bb2b.diff LOG: [Flang][OpenMP] Add frontend support for declare variant (#130578) Support is added for parsing. Basic semantics support is added to forward the code to Lowering. Lowering will emit a TODO error. Detailed semantics checks and lowering is further work. 
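For readers following the parser change described in the log above, here is a minimal C++ sketch of the name part of the directive, `( [base-name :] variant-name )`, as exercised by the tests in this commit (`(sub:vsub)` and `(vsub)`). It is only an illustration under stated assumptions and not flang's parser: `VariantNames`, `parseVariantNames`, and the helpers are hypothetical names, and the sketch does not handle clauses, continuation lines, or the full Fortran name rules.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical result type for this sketch (not a flang class).
struct VariantNames {
  std::optional<std::string> base; // name before ':' when present
  std::string variant;             // name of the variant procedure
};

namespace sketch {

// Advance pos past any whitespace.
inline void skipSpaces(std::string_view s, std::size_t &pos) {
  while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
    ++pos;
}

// Parse a simplified name: a letter followed by letters, digits, or '_'.
inline std::optional<std::string> parseName(std::string_view s,
                                            std::size_t &pos) {
  skipSpaces(s, pos);
  const std::size_t start = pos;
  if (pos >= s.size() || !std::isalpha(static_cast<unsigned char>(s[pos])))
    return std::nullopt;
  ++pos;
  while (pos < s.size() &&
         (std::isalnum(static_cast<unsigned char>(s[pos])) || s[pos] == '_'))
    ++pos;
  return std::string(s.substr(start, pos - start));
}

// Consume the character c (after optional whitespace) if it is next.
inline bool expect(std::string_view s, std::size_t &pos, char c) {
  skipSpaces(s, pos);
  if (pos < s.size() && s[pos] == c) {
    ++pos;
    return true;
  }
  return false;
}

} // namespace sketch

// Parse "( [base :] variant )". Returns std::nullopt on malformed input.
inline std::optional<VariantNames> parseVariantNames(std::string_view text) {
  std::size_t pos = 0;
  if (!sketch::expect(text, pos, '('))
    return std::nullopt;
  auto first = sketch::parseName(text, pos);
  if (!first)
    return std::nullopt;
  VariantNames result;
  if (sketch::expect(text, pos, ':')) {
    // A ':' follows, so the first name was the optional base procedure.
    auto second = sketch::parseName(text, pos);
    if (!second)
      return std::nullopt;
    result.base = std::move(*first);
    result.variant = std::move(*second);
  } else {
    // No ':', so the only name is the variant.
    result.variant = std::move(*first);
  }
  if (!sketch::expect(text, pos, ')'))
    return std::nullopt;
  return result;
}
```

The sketch decides between the two forms by looking for `:` after the first name, which plays the role of the optional leading name in the directive; a real implementation would also have to parse the trailing clause list.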
Added: flang/test/Lower/OpenMP/Todo/declare-variant.f90 flang/test/Parser/OpenMP/declare-variant.f90 flang/test/Semantics/OpenMP/declare-variant.f90 Modified: flang/include/flang/Parser/dump-parse-tree.h flang/include/flang/Parser/parse-tree.h flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Parser/openmp-parsers.cpp flang/lib/Parser/unparse.cpp flang/lib/Semantics/check-omp-structure.cpp flang/lib/Semantics/check-omp-structure.h flang/lib/Semantics/resolve-names.cpp llvm/include/llvm/Frontend/OpenMP/OMP.td Removed: ################################################################################ diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a3721bc8410ba 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -483,6 +483,11 @@ class ParseTreeDumper { NODE(parser, OldParameterStmt) NODE(parser, OmpTypeSpecifier) NODE(parser, OmpTypeNameList) + NODE(parser, OmpAdjustArgsClause) + NODE(OmpAdjustArgsClause, OmpAdjustOp) + NODE_ENUM(OmpAdjustArgsClause::OmpAdjustOp, Value) + NODE(parser, OmpAppendArgsClause) + NODE(OmpAppendArgsClause, OmpAppendOp) NODE(parser, OmpLocator) NODE(parser, OmpLocatorList) NODE(parser, OmpReductionSpecifier) @@ -703,6 +708,7 @@ class ParseTreeDumper { NODE(parser, OpenMPCriticalConstruct) NODE(parser, OpenMPDeclarativeAllocate) NODE(parser, OpenMPDeclarativeConstruct) + NODE(parser, OmpDeclareVariantDirective) NODE(parser, OpenMPDeclareReductionConstruct) NODE(parser, OpenMPDeclareSimdConstruct) NODE(parser, OpenMPDeclareTargetConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..a0d7a797e7203 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4013,6 +4013,15 @@ struct OmpAbsentClause { WRAPPER_CLASS_BOILERPLATE(OmpAbsentClause, OmpDirectiveList); }; +struct OmpAdjustArgsClause { + 
TUPLE_CLASS_BOILERPLATE(OmpAdjustArgsClause); + struct OmpAdjustOp { + ENUM_CLASS(Value, Nothing, Need_Device_Ptr) + WRAPPER_CLASS_BOILERPLATE(OmpAdjustOp, Value); + }; + std::tuple t; +}; + // Ref: [5.0:135-140], [5.1:161-166], [5.2:264-265] // // affinity-clause -> @@ -4056,6 +4065,13 @@ struct OmpAllocateClause { std::tuple t; }; +struct OmpAppendArgsClause { + struct OmpAppendOp { + WRAPPER_CLASS_BOILERPLATE(OmpAppendOp, std::list); + }; + WRAPPER_CLASS_BOILERPLATE(OmpAppendArgsClause, std::list); +}; + // Ref: [5.2:216-217 (sort of, as it's only mentioned in passing) // AT(compilation|execution) struct OmpAtClause { @@ -4698,6 +4714,12 @@ struct OmpBlockDirective { CharBlock source; }; +struct OmpDeclareVariantDirective { + TUPLE_CLASS_BOILERPLATE(OmpDeclareVariantDirective); + CharBlock source; + std::tuple, Name, OmpClauseList> t; +}; + // 2.10.6 declare-target -> DECLARE TARGET (extended-list) | // DECLARE TARGET [declare-target-clause[ [,] // declare-target-clause]...] @@ -4776,8 +4798,8 @@ struct OpenMPDeclarativeConstruct { std::variant + OmpDeclareVariantDirective, OpenMPThreadprivate, OpenMPRequiresConstruct, + OpenMPUtilityConstruct, OmpMetadirectiveDirective> u; }; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 49247e2e5cce4..3ba9c2ff85332 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3816,6 +3816,13 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMP ASSUMES declaration"); } +static void +genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, + semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, + const parser::OmpDeclareVariantDirective &declareVariantDirective) { + TODO(converter.getCurrentLocation(), "OmpDeclareVariantDirective"); +} + static void genOMP( lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, 
lower::pft::Evaluation &eval, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..c4728e0fabe61 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -611,6 +611,14 @@ TYPE_PARSER(sourced(construct( TYPE_PARSER(sourced(construct( // Parser{}))) +TYPE_PARSER(construct( + "INTEROP" >> parenthesized(nonemptyList(Parser{})))) + +TYPE_PARSER(construct( + "NOTHING" >> pure(OmpAdjustArgsClause::OmpAdjustOp::Value::Nothing) || + "NEED_DEVICE_PTR" >> + pure(OmpAdjustArgsClause::OmpAdjustOp::Value::Need_Device_Ptr))) + // --- Parsers for clauses -------------------------------------------- /// `MOBClause` is a clause that has a @@ -630,6 +638,10 @@ static inline MOBClause makeMobClause( } } +TYPE_PARSER(construct( + (Parser{} / ":"), + Parser{})) + // [5.0] 2.10.1 affinity([aff-modifier:] locator-list) // aff-modifier: interator-modifier TYPE_PARSER(construct( @@ -653,6 +665,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct( OmpDirectiveNameParser{}, maybe(parenthesized(scalarLogicalExpr)))) +TYPE_PARSER(construct( + nonemptyList(Parser{}))) + // 2.15.3.1 DEFAULT (PRIVATE | FIRSTPRIVATE | SHARED | NONE) TYPE_PARSER(construct( "PRIVATE" >> pure(OmpDefaultClause::DataSharingAttribute::Private) || @@ -901,6 +916,8 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "ACQUIRE" >> construct(construct()) || "ACQ_REL" >> construct(construct()) || + "ADJUST_ARGS" >> construct(construct( + parenthesized(Parser{}))) || "AFFINITY" >> construct(construct( parenthesized(Parser{}))) || "ALIGN" >> construct(construct( @@ -909,6 +926,8 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "ALLOCATE" >> construct(construct( parenthesized(Parser{}))) || + "APPEND_ARGS" >> construct(construct( + parenthesized(Parser{}))) || "ALLOCATOR" >> construct(construct( parenthesized(scalarIntExpr))) || "AT" >> construct(construct( @@ -1348,6 +1367,11 @@ TYPE_PARSER(construct( construct(assignmentStmt) || 
construct(Parser{}))) +// OpenMP 5.2: 7.5.4 Declare Variant directive +TYPE_PARSER(sourced( + construct(verbatim("DECLARE VARIANT"_tok), + "(" >> maybe(name / ":"), name / ")", Parser{}))) + // 2.16 Declare Reduction Construct TYPE_PARSER(sourced(construct( verbatim("DECLARE REDUCTION"_tok), @@ -1519,6 +1543,8 @@ TYPE_PARSER( Parser{}) || construct( Parser{}) || + construct( + Parser{}) || construct( Parser{}) || construct( diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..1ee9096fcda56 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2743,7 +2743,28 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } - + void Unparse(const OmpAppendArgsClause::OmpAppendOp &x) { + Put("INTEROP("); + Walk(x.v, ","); + Put(")"); + } + void Unparse(const OmpAppendArgsClause &x) { Walk(x.v, ","); } + void Unparse(const OmpAdjustArgsClause &x) { + Walk(std::get(x.t).v); + Put(":"); + Walk(std::get(x.t)); + } + void Unparse(const OmpDeclareVariantDirective &x) { + BeginOpenMP(); + Word("!$OMP DECLARE VARIANT "); + Put("("); + Walk(std::get>(x.t), ":"); + Walk(std::get(x.t)); + Put(")"); + Walk(std::get(x.t)); + Put("\n"); + EndOpenMP(); + } void Unparse(const OpenMPInteropConstruct &x) { BeginOpenMP(); Word("!$OMP INTEROP"); @@ -3042,6 +3063,7 @@ class UnparseVisitor { WALK_NESTED_ENUM(InquireSpec::LogVar, Kind) WALK_NESTED_ENUM(ProcedureStmt, Kind) // R1506 WALK_NESTED_ENUM(UseStmt, ModuleNature) // R1410 + WALK_NESTED_ENUM(OmpAdjustArgsClause::OmpAdjustOp, Value) // OMP adjustop WALK_NESTED_ENUM(OmpAtClause, ActionTime) // OMP at WALK_NESTED_ENUM(OmpBindClause, Binding) // OMP bind WALK_NESTED_ENUM(OmpProcBindClause, AffinityPolicy) // OMP proc_bind diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index f654fe6e4681a..f17de42ca2466 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -1619,6 +1619,16 
@@ void OmpStructureChecker::Leave(const parser::OpenMPDeclareSimdConstruct &) { dirContext_.pop_back(); } +void OmpStructureChecker::Enter(const parser::OmpDeclareVariantDirective &x) { + const auto &dir{std::get(x.t)}; + PushContextAndClauseSets( + dir.source, llvm::omp::Directive::OMPD_declare_variant); +} + +void OmpStructureChecker::Leave(const parser::OmpDeclareVariantDirective &) { + dirContext_.pop_back(); +} + void OmpStructureChecker::Enter(const parser::OpenMPDepobjConstruct &x) { const auto &dirName{std::get(x.v.t)}; PushContextAndClauseSets(dirName.source, llvm::omp::Directive::OMPD_depobj); diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..911a6bb08fb87 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -98,6 +98,8 @@ class OmpStructureChecker void Enter(const parser::OmpEndSectionsDirective &); void Leave(const parser::OmpEndSectionsDirective &); + void Enter(const parser::OmpDeclareVariantDirective &); + void Leave(const parser::OmpDeclareVariantDirective &); void Enter(const parser::OpenMPDeclareSimdConstruct &); void Leave(const parser::OpenMPDeclareSimdConstruct &); void Enter(const parser::OpenMPDeclarativeAllocate &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index e0550b3724bef..b2979690f78e7 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1511,6 +1511,25 @@ class OmpVisitor : public virtual DeclarationVisitor { return true; } + bool Pre(const parser::OmpDeclareVariantDirective &x) { + AddOmpSourceRange(x.source); + auto FindSymbolOrError = [&](const parser::Name &procName) { + auto *symbol{FindSymbol(NonDerivedTypeScope(), procName)}; + if (!symbol) { + context().Say(procName.source, + "Implicit subroutine declaration '%s' in !$OMP DECLARE VARIANT"_err_en_US, + procName.source); + } + }; + auto &baseProcName = 
std::get>(x.t); + if (baseProcName) { + FindSymbolOrError(*baseProcName); + } + auto &varProcName = std::get(x.t); + FindSymbolOrError(varProcName); + return true; + } + bool Pre(const parser::OpenMPDeclareReductionConstruct &x) { AddOmpSourceRange(x.source); ProcessReductionSpecifier( diff --git a/flang/test/Lower/OpenMP/Todo/declare-variant.f90 b/flang/test/Lower/OpenMP/Todo/declare-variant.f90 new file mode 100644 index 0000000000000..5719ef3afdee1 --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/declare-variant.f90 @@ -0,0 +1,17 @@ +! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s + +! CHECK: not yet implemented: OmpDeclareVariantDirective + +subroutine sb1 + integer :: x + x = 1 + call sub(x) +contains + subroutine vsub (v1) + integer, value :: v1 + end + subroutine sub (v1) + !$omp declare variant(vsub), match(construct={dispatch}) + integer, value :: v1 + end +end subroutine diff --git a/flang/test/Parser/OpenMP/declare-variant.f90 b/flang/test/Parser/OpenMP/declare-variant.f90 new file mode 100644 index 0000000000000..1b97733ea9525 --- /dev/null +++ b/flang/test/Parser/OpenMP/declare-variant.f90 @@ -0,0 +1,104 @@ +! RUN: %flang_fc1 -fdebug-unparse-no-sema -fopenmp %s | FileCheck --ignore-case %s +! 
RUN: %flang_fc1 -fdebug-dump-parse-tree-no-sema -fopenmp %s | FileCheck --check-prefix="PARSE-TREE" %s + +subroutine sub0 +!CHECK: !$OMP DECLARE VARIANT (sub:vsub) MATCH(CONSTRUCT={PARALLEL}) +!PARSE-TREE: OpenMPDeclarativeConstruct -> OmpDeclareVariantDirective +!PARSE-TREE: | Verbatim +!PARSE-TREE: | Name = 'sub' +!PARSE-TREE: | Name = 'vsub' +!PARSE-TREE: | OmpClauseList -> OmpClause -> Match -> OmpMatchClause -> OmpContextSelectorSpecification -> OmpTraitSetSelector +!PARSE-TREE: | | OmpTraitSetSelectorName -> Value = Construct +!PARSE-TREE: | | OmpTraitSelector +!PARSE-TREE: | | | OmpTraitSelectorName -> llvm::omp::Directive = parallel + !$omp declare variant (sub:vsub) match (construct={parallel}) +contains + subroutine vsub + end subroutine + + subroutine sub () + end subroutine +end subroutine + +subroutine sb1 + integer :: x + x = 1 + !$omp dispatch device(1) + call sub(x) +contains + subroutine vsub (v1) + integer, value :: v1 + end + subroutine sub (v1) +!CHECK: !$OMP DECLARE VARIANT (vsub) MATCH(CONSTRUCT={DISPATCH} +!PARSE-TREE: OpenMPDeclarativeConstruct -> OmpDeclareVariantDirective +!PARSE-TREE: | Verbatim +!PARSE-TREE: | Name = 'vsub' +!PARSE-TREE: | OmpClauseList -> OmpClause -> Match -> OmpMatchClause -> OmpContextSelectorSpecification -> OmpTraitSetSelector +!PARSE-TREE: | | OmpTraitSetSelectorName -> Value = Construct +!PARSE-TREE: | | OmpTraitSelector +!PARSE-TREE: | | | OmpTraitSelectorName -> llvm::omp::Directive = dispatch + !$omp declare variant(vsub), match(construct={dispatch}) + integer, value :: v1 + end +end subroutine + +subroutine sb2 (x1, x2) + use omp_lib, only: omp_interop_kind + integer :: x + x = 1 + !$omp dispatch device(1) + call sub(x) +contains + subroutine vsub (v1, a1, a2) + integer, value :: v1 + integer(omp_interop_kind) :: a1 + integer(omp_interop_kind), value :: a2 + end + subroutine sub (v1) +!CHECK: !$OMP DECLARE VARIANT (vsub) MATCH(CONSTRUCT={DISPATCH}) APPEND_ARGS(INTEROP(T& +!CHECK: !$OMP&ARGET),INTEROP(TARGET)) 
+!PARSE-TREE: OpenMPDeclarativeConstruct -> OmpDeclareVariantDirective +!PARSE-TREE: | Verbatim +!PARSE-TREE: | Name = 'vsub' +!PARSE-TREE: | OmpClauseList -> OmpClause -> Match -> OmpMatchClause -> OmpContextSelectorSpecification -> OmpTraitSetSelector +!PARSE-TREE: | | OmpTraitSetSelectorName -> Value = Construct +!PARSE-TREE: | | OmpTraitSelector +!PARSE-TREE: | | | OmpTraitSelectorName -> llvm::omp::Directive = dispatch +!PARSE-TREE: | OmpClause -> AppendArgs -> OmpAppendArgsClause -> OmpAppendOp -> OmpInteropType -> Value = Target +!PARSE-TREE: | OmpAppendOp -> OmpInteropType -> Value = Target + !$omp declare variant(vsub), match(construct={dispatch}), append_args (interop(target), interop(target)) + integer, value :: v1 + end +end subroutine + +subroutine sb3 (x1, x2) + use iso_c_binding, only: c_ptr + type(c_ptr), value :: x1, x2 + + !$omp dispatch device(1) + call sub(x1, x2) +contains + subroutine sub (v1, v2) + type(c_ptr), value :: v1, v2 +!CHECK: !$OMP DECLARE VARIANT (vsub) MATCH(CONSTRUCT={DISPATCH}) ADJUST_ARGS(NOTHING:v& +!CHECK: !$OMP&1) ADJUST_ARGS(NEED_DEVICE_PTR:v2) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OmpDeclareVariantDirective +!PARSE-TREE: | Verbatim +!PARSE-TREE: | Name = 'vsub' +!PARSE-TREE: | OmpClauseList -> OmpClause -> Match -> OmpMatchClause -> OmpContextSelectorSpecification -> OmpTraitSetSelector +!PARSE-TREE: | | OmpTraitSetSelectorName -> Value = Construct +!PARSE-TREE: | | OmpTraitSelector +!PARSE-TREE: | | | OmpTraitSelectorName -> llvm::omp::Directive = dispatch +!PARSE-TREE: | OmpClause -> AdjustArgs -> OmpAdjustArgsClause +!PARSE-TREE: | | OmpAdjustOp -> Value = Nothing +!PARSE-TREE: | | OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'v1' +!PARSE-TREE: | OmpClause -> AdjustArgs -> OmpAdjustArgsClause +!PARSE-TREE: | | OmpAdjustOp -> Value = Need_Device_Ptr +!PARSE-TREE: | | OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'v2' + !$omp declare 
variant(vsub) match ( construct = { dispatch } ) adjust_args(nothing : v1 ) adjust_args(need_device_ptr : v2) + end + subroutine vsub(v1, v2) + type(c_ptr), value :: v1, v2 + end +end subroutine diff --git a/flang/test/Semantics/OpenMP/declare-variant.f90 b/flang/test/Semantics/OpenMP/declare-variant.f90 new file mode 100644 index 0000000000000..84a0cdcd10d91 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-variant.f90 @@ -0,0 +1,14 @@ +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=51 + +subroutine sub0 +!ERROR: Implicit subroutine declaration 'vsub1' in !$OMP DECLARE VARIANT + !$omp declare variant (sub:vsub1) match (construct={parallel}) +!ERROR: Implicit subroutine declaration 'sub1' in !$OMP DECLARE VARIANT + !$omp declare variant (sub1:vsub) match (construct={parallel}) +contains + subroutine vsub + end subroutine + + subroutine sub () + end subroutine +end subroutine diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 3743b03ec4d71..583718a2396f5 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -43,6 +43,7 @@ def OMPC_AcqRel : Clause<"acq_rel"> { let clangClass = "OMPAcqRelClause"; } def OMPC_AdjustArgs : Clause<"adjust_args"> { + let flangClass = "OmpAdjustArgsClause"; } def OMPC_Affinity : Clause<"affinity"> { let clangClass = "OMPAffinityClause"; @@ -65,6 +66,7 @@ def OMPC_Allocator : Clause<"allocator"> { let flangClass = "ScalarIntExpr"; } def OMPC_AppendArgs : Clause<"append_args"> { + let flangClass = "OmpAppendArgsClause"; } def OMPC_At : Clause<"at"> { let clangClass = "OMPAtClause"; @@ -721,10 +723,10 @@ def OMP_EndDeclareTarget : Directive<"end declare target"> { } def OMP_DeclareVariant : Directive<"declare variant"> { let allowedClauses = [ - VersionedClause, - ]; - let allowedExclusiveClauses = [ VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, VersionedClause, ]; let association = 
AS_Declaration; From flang-commits at lists.llvm.org Wed May 7 01:56:52 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 07 May 2025 01:56:52 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Add frontend support for declare variant (PR #130578) In-Reply-To: Message-ID: <681b2054.170a0220.20c4dc.8599@mx.google.com> https://github.com/kiranchandramohan closed https://github.com/llvm/llvm-project/pull/130578 From flang-commits at lists.llvm.org Wed May 7 02:18:17 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 02:18:17 -0700 (PDT) Subject: [flang-commits] [flang] 75e5643 - [flang][OpenMP] share global variable initialization code (#138672) Message-ID: <681b2559.170a0220.df0b9.7bd2@mx.google.com> Author: Tom Eccles Date: 2025-05-07T10:18:13+01:00 New Revision: 75e5643abf6b59db8dfae6b524e9c3c2ec0ffc29 URL: https://github.com/llvm/llvm-project/commit/75e5643abf6b59db8dfae6b524e9c3c2ec0ffc29 DIFF: https://github.com/llvm/llvm-project/commit/75e5643abf6b59db8dfae6b524e9c3c2ec0ffc29.diff LOG: [flang][OpenMP] share global variable initialization code (#138672) Fixes #108136 In #108136 (the new testcase), flang was missing the length parameter required for the variable length string when boxing the global variable. The code that is initializing global variables for OpenMP did not support types with length parameters. Instead of duplicating this initialization logic in OpenMP, I decided to use the exact same initialization as is used in the base language because this will already be well tested and will be updated for any new types. The difference for OpenMP is that the global variables will be zero initialized instead of left undefined. Previously `Fortran::lower::createGlobalInitialization` was used to share a smaller amount of the logic with the base language lowering. 
I think this bug has demonstrated that helper was too low level to be helpful, and it was only used in OpenMP so I have made it static inside of ConvertVariable.cpp. Added: flang/test/Lower/OpenMP/threadprivate-lenparams.f90 Modified: flang/docs/OpenMPSupport.md flang/include/flang/Lower/ConvertVariable.h flang/lib/Lower/ConvertVariable.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/test/Lower/OpenMP/omp-declare-target-program-var.f90 flang/test/Lower/OpenMP/threadprivate-host-association-2.f90 flang/test/Lower/OpenMP/threadprivate-host-association-3.f90 flang/test/Lower/OpenMP/threadprivate-non-global.f90 Removed: ################################################################################ diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md index 2d4b9dd737777..587723890d226 100644 --- a/flang/docs/OpenMPSupport.md +++ b/flang/docs/OpenMPSupport.md @@ -64,4 +64,4 @@ Note : No distinction is made between the support in Parser/Semantics, MLIR, Low | target teams distribute parallel loop simd construct | P | device, reduction, dist_schedule and linear clauses are not supported | ## OpenMP 3.1, OpenMP 2.5, OpenMP 1.1 -All features except a few corner cases in atomic (complex type, different but compatible types in lhs and rhs), threadprivate (character type) constructs/clauses are supported. +All features except a few corner cases in atomic (complex type, different but compatible types in lhs and rhs) are supported. diff --git a/flang/include/flang/Lower/ConvertVariable.h b/flang/include/flang/Lower/ConvertVariable.h index 8288b814134c8..e05625a229ac7 100644 --- a/flang/include/flang/Lower/ConvertVariable.h +++ b/flang/include/flang/Lower/ConvertVariable.h @@ -134,10 +134,11 @@ mlir::Value genInitialDataTarget(Fortran::lower::AbstractConverter &, const SomeExpr &initialTarget, bool couldBeInEquivalence = false); -/// Call \p genInit to generate code inside \p global initializer region. 
-void createGlobalInitialization( - fir::FirOpBuilder &builder, fir::GlobalOp global, - std::function genInit); +/// Create the global op and its init if it has one +fir::GlobalOp defineGlobal(Fortran::lower::AbstractConverter &converter, + const Fortran::lower::pft::Variable &var, + llvm::StringRef globalName, mlir::StringAttr linkage, + cuf::DataAttributeAttr dataAttr = {}); /// Generate address \p addr inside an initializer. fir::ExtendedValue diff --git a/flang/lib/Lower/ConvertVariable.cpp b/flang/lib/Lower/ConvertVariable.cpp index b277c0d7040a7..372c71b6d4821 100644 --- a/flang/lib/Lower/ConvertVariable.cpp +++ b/flang/lib/Lower/ConvertVariable.cpp @@ -145,11 +145,10 @@ static bool isConstant(const Fortran::semantics::Symbol &sym) { sym.test(Fortran::semantics::Symbol::Flag::ReadOnly); } -static fir::GlobalOp defineGlobal(Fortran::lower::AbstractConverter &converter, - const Fortran::lower::pft::Variable &var, - llvm::StringRef globalName, - mlir::StringAttr linkage, - cuf::DataAttributeAttr dataAttr = {}); +/// Call \p genInit to generate code inside \p global initializer region. +static void +createGlobalInitialization(fir::FirOpBuilder &builder, fir::GlobalOp global, + std::function genInit); static mlir::Location genLocation(Fortran::lower::AbstractConverter &converter, const Fortran::semantics::Symbol &sym) { @@ -467,9 +466,9 @@ static bool globalIsInitialized(fir::GlobalOp global) { } /// Call \p genInit to generate code inside \p global initializer region. 
-void Fortran::lower::createGlobalInitialization( - fir::FirOpBuilder &builder, fir::GlobalOp global, - std::function genInit) { +static void +createGlobalInitialization(fir::FirOpBuilder &builder, fir::GlobalOp global, + std::function genInit) { mlir::Region &region = global.getRegion(); region.push_back(new mlir::Block); mlir::Block &block = region.back(); @@ -479,7 +478,7 @@ void Fortran::lower::createGlobalInitialization( builder.restoreInsertionPoint(insertPt); } -static unsigned getAllocatorIdx(cuf::DataAttributeAttr dataAttr) { +static unsigned getAllocatorIdxFromDataAttr(cuf::DataAttributeAttr dataAttr) { if (dataAttr) { if (dataAttr.getValue() == cuf::DataAttribute::Pinned) return kPinnedAllocatorPos; @@ -494,11 +493,10 @@ static unsigned getAllocatorIdx(cuf::DataAttributeAttr dataAttr) { } /// Create the global op and its init if it has one -static fir::GlobalOp defineGlobal(Fortran::lower::AbstractConverter &converter, - const Fortran::lower::pft::Variable &var, - llvm::StringRef globalName, - mlir::StringAttr linkage, - cuf::DataAttributeAttr dataAttr) { +fir::GlobalOp Fortran::lower::defineGlobal( + Fortran::lower::AbstractConverter &converter, + const Fortran::lower::pft::Variable &var, llvm::StringRef globalName, + mlir::StringAttr linkage, cuf::DataAttributeAttr dataAttr) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); const Fortran::semantics::Symbol &sym = var.getSymbol(); mlir::Location loc = genLocation(converter, sym); @@ -545,27 +543,25 @@ static fir::GlobalOp defineGlobal(Fortran::lower::AbstractConverter &converter, sym.detailsIf(); if (details && details->init()) { auto expr = *details->init(); - Fortran::lower::createGlobalInitialization( - builder, global, [&](fir::FirOpBuilder &b) { - mlir::Value box = Fortran::lower::genInitialDataTarget( - converter, loc, symTy, expr); - b.create(loc, box); - }); + createGlobalInitialization(builder, global, [&](fir::FirOpBuilder &b) { + mlir::Value box = + 
Fortran::lower::genInitialDataTarget(converter, loc, symTy, expr); + b.create(loc, box); + }); } else { // Create unallocated/disassociated descriptor if no explicit init - Fortran::lower::createGlobalInitialization( - builder, global, [&](fir::FirOpBuilder &b) { - mlir::Value box = fir::factory::createUnallocatedBox( - b, loc, symTy, - /*nonDeferredParams=*/std::nullopt, - /*typeSourceBox=*/{}, getAllocatorIdx(dataAttr)); - b.create(loc, box); - }); + createGlobalInitialization(builder, global, [&](fir::FirOpBuilder &b) { + mlir::Value box = fir::factory::createUnallocatedBox( + b, loc, symTy, + /*nonDeferredParams=*/std::nullopt, + /*typeSourceBox=*/{}, getAllocatorIdxFromDataAttr(dataAttr)); + b.create(loc, box); + }); } } else if (const auto *details = sym.detailsIf()) { if (details->init()) { - Fortran::lower::createGlobalInitialization( + createGlobalInitialization( builder, global, [&](fir::FirOpBuilder &builder) { Fortran::lower::StatementContext stmtCtx( /*cleanupProhibited=*/true); @@ -576,7 +572,7 @@ static fir::GlobalOp defineGlobal(Fortran::lower::AbstractConverter &converter, builder.create(loc, castTo); }); } else if (Fortran::lower::hasDefaultInitialization(sym)) { - Fortran::lower::createGlobalInitialization( + createGlobalInitialization( builder, global, [&](fir::FirOpBuilder &builder) { Fortran::lower::StatementContext stmtCtx( /*cleanupProhibited=*/true); @@ -591,7 +587,7 @@ static fir::GlobalOp defineGlobal(Fortran::lower::AbstractConverter &converter, if (details && details->init()) { auto sym{*details->init()}; if (sym) // Has a procedure target. - Fortran::lower::createGlobalInitialization( + createGlobalInitialization( builder, global, [&](fir::FirOpBuilder &b) { Fortran::lower::StatementContext stmtCtx( /*cleanupProhibited=*/true); @@ -601,19 +597,17 @@ static fir::GlobalOp defineGlobal(Fortran::lower::AbstractConverter &converter, b.create(loc, castTo); }); else { // Has NULL() target. 
- Fortran::lower::createGlobalInitialization( - builder, global, [&](fir::FirOpBuilder &b) { - auto box{fir::factory::createNullBoxProc(b, loc, symTy)}; - b.create(loc, box); - }); + createGlobalInitialization(builder, global, [&](fir::FirOpBuilder &b) { + auto box{fir::factory::createNullBoxProc(b, loc, symTy)}; + b.create(loc, box); + }); } } else { // No initialization. - Fortran::lower::createGlobalInitialization( - builder, global, [&](fir::FirOpBuilder &b) { - auto box{fir::factory::createNullBoxProc(b, loc, symTy)}; - b.create(loc, box); - }); + createGlobalInitialization(builder, global, [&](fir::FirOpBuilder &b) { + auto box{fir::factory::createNullBoxProc(b, loc, symTy)}; + b.create(loc, box); + }); } } else if (sym.has()) { mlir::emitError(loc, "COMMON symbol processed elsewhere"); @@ -634,7 +628,7 @@ static fir::GlobalOp defineGlobal(Fortran::lower::AbstractConverter &converter, // file. if (sym.attrs().test(Fortran::semantics::Attr::BIND_C)) global.setLinkName(builder.createCommonLinkage()); - Fortran::lower::createGlobalInitialization( + createGlobalInitialization( builder, global, [&](fir::FirOpBuilder &builder) { mlir::Value initValue; if (converter.getLoweringOptions().getInitGlobalZero()) @@ -826,7 +820,7 @@ void Fortran::lower::defaultInitializeAtRuntime( /*isConst=*/true, /*isTarget=*/false, /*dataAttr=*/{}); - Fortran::lower::createGlobalInitialization( + createGlobalInitialization( builder, global, [&](fir::FirOpBuilder &builder) { Fortran::lower::StatementContext stmtCtx( /*cleanupProhibited=*/true); @@ -842,7 +836,7 @@ void Fortran::lower::defaultInitializeAtRuntime( /*isConst=*/true, /*isTarget=*/false, /*dataAttr=*/{}); - Fortran::lower::createGlobalInitialization( + createGlobalInitialization( builder, global, [&](fir::FirOpBuilder &builder) { Fortran::lower::StatementContext stmtCtx( /*cleanupProhibited=*/true); @@ -1207,7 +1201,7 @@ static fir::GlobalOp defineGlobalAggregateStore( if (const auto *objectDetails = initSym->detailsIf()) if 
(objectDetails->init()) { - Fortran::lower::createGlobalInitialization( + createGlobalInitialization( builder, global, [&](fir::FirOpBuilder &builder) { Fortran::lower::StatementContext stmtCtx; mlir::Value initVal = fir::getBase(genInitializerExprValue( @@ -1219,12 +1213,11 @@ static fir::GlobalOp defineGlobalAggregateStore( // Equivalence has no Fortran initial value. Create an undefined FIR initial // value to ensure this is consider an object definition in the IR regardless // of the linkage. - Fortran::lower::createGlobalInitialization( - builder, global, [&](fir::FirOpBuilder &builder) { - Fortran::lower::StatementContext stmtCtx; - mlir::Value initVal = builder.create(loc, aggTy); - builder.create(loc, initVal); - }); + createGlobalInitialization(builder, global, [&](fir::FirOpBuilder &builder) { + Fortran::lower::StatementContext stmtCtx; + mlir::Value initVal = builder.create(loc, aggTy); + builder.create(loc, initVal); + }); return global; } @@ -1543,7 +1536,7 @@ static void finalizeCommonBlockDefinition( LLVM_DEBUG(llvm::dbgs() << "}\n"); builder.create(loc, cb); }; - Fortran::lower::createGlobalInitialization(builder, global, initFunc); + createGlobalInitialization(builder, global, initFunc); } void Fortran::lower::defineCommonBlocks( diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 3ba9c2ff85332..cc793c683f898 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -662,32 +662,9 @@ static fir::GlobalOp globalInitialization(lower::AbstractConverter &converter, const semantics::Symbol &sym, const lower::pft::Variable &var, mlir::Location currentLocation) { - mlir::Type ty = converter.genType(sym); std::string globalName = converter.mangleName(sym); mlir::StringAttr linkage = firOpBuilder.createInternalLinkage(); - fir::GlobalOp global = - firOpBuilder.createGlobal(currentLocation, ty, globalName, linkage); - - // Create default initialization for non-character scalar. 
- if (semantics::IsAllocatableOrObjectPointer(&sym)) { - mlir::Type baseAddrType = mlir::dyn_cast(ty).getEleTy(); - lower::createGlobalInitialization( - firOpBuilder, global, [&](fir::FirOpBuilder &b) { - mlir::Value nullAddr = - b.createNullConstant(currentLocation, baseAddrType); - mlir::Value box = - b.create(currentLocation, ty, nullAddr); - b.create(currentLocation, box); - }); - } else { - lower::createGlobalInitialization( - firOpBuilder, global, [&](fir::FirOpBuilder &b) { - mlir::Value undef = b.create(currentLocation, ty); - b.create(currentLocation, undef); - }); - } - - return global; + return Fortran::lower::defineGlobal(converter, var, globalName, linkage); } // Get the extended value for \p val by extracting additional variable diff --git a/flang/test/Lower/OpenMP/omp-declare-target-program-var.f90 b/flang/test/Lower/OpenMP/omp-declare-target-program-var.f90 index 20538ff34871f..d18f42ae3ceb0 100644 --- a/flang/test/Lower/OpenMP/omp-declare-target-program-var.f90 +++ b/flang/test/Lower/OpenMP/omp-declare-target-program-var.f90 @@ -6,7 +6,7 @@ PROGRAM main ! HOST-DAG: %[[I_DECL:.*]]:2 = hlfir.declare %[[I_REF]] {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) REAL :: I ! ALL-DAG: fir.global internal @_QFEi {omp.declare_target = #omp.declaretarget} : f32 { - ! ALL-DAG: %[[UNDEF:.*]] = fir.undefined f32 + ! ALL-DAG: %[[UNDEF:.*]] = fir.zero_bits f32 ! ALL-DAG: fir.has_value %[[UNDEF]] : f32 ! 
ALL-DAG: } !$omp declare target(I) diff --git a/flang/test/Lower/OpenMP/threadprivate-host-association-2.f90 b/flang/test/Lower/OpenMP/threadprivate-host-association-2.f90 index 546d4920042d7..5e54cef8c29db 100644 --- a/flang/test/Lower/OpenMP/threadprivate-host-association-2.f90 +++ b/flang/test/Lower/OpenMP/threadprivate-host-association-2.f90 @@ -27,7 +27,7 @@ !CHECK: return !CHECK: } !CHECK: fir.global internal @_QFEa : i32 { -!CHECK: %[[A:.*]] = fir.undefined i32 +!CHECK: %[[A:.*]] = fir.zero_bits i32 !CHECK: fir.has_value %[[A]] : i32 !CHECK: } diff --git a/flang/test/Lower/OpenMP/threadprivate-host-association-3.f90 b/flang/test/Lower/OpenMP/threadprivate-host-association-3.f90 index 22ee51f82bc0f..21547b47cf381 100644 --- a/flang/test/Lower/OpenMP/threadprivate-host-association-3.f90 +++ b/flang/test/Lower/OpenMP/threadprivate-host-association-3.f90 @@ -27,7 +27,7 @@ !CHECK: return !CHECK: } !CHECK: fir.global internal @_QFEa : i32 { -!CHECK: %[[A:.*]] = fir.undefined i32 +!CHECK: %[[A:.*]] = fir.zero_bits i32 !CHECK: fir.has_value %[[A]] : i32 !CHECK: } diff --git a/flang/test/Lower/OpenMP/threadprivate-lenparams.f90 b/flang/test/Lower/OpenMP/threadprivate-lenparams.f90 new file mode 100644 index 0000000000000..a220db2a11b2e --- /dev/null +++ b/flang/test/Lower/OpenMP/threadprivate-lenparams.f90 @@ -0,0 +1,22 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! Regression test for https://github.com/llvm/llvm-project/issues/108136 + +character(:), pointer :: c +character(2), pointer :: c2 +!$omp threadprivate(c, c2) +end + +! CHECK-LABEL: fir.global internal @_QFEc : !fir.box>> { +! CHECK: %[[VAL_0:.*]] = fir.zero_bits !fir.ptr> +! CHECK: %[[VAL_1:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_2:.*]] = fir.embox %[[VAL_0]] typeparams %[[VAL_1]] : (!fir.ptr>, index) -> !fir.box>> +! CHECK: fir.has_value %[[VAL_2]] : !fir.box>> +! CHECK: } + +! CHECK-LABEL: fir.global internal @_QFEc2 : !fir.box>> { +! 
CHECK: %[[VAL_0:.*]] = fir.zero_bits !fir.ptr> +! CHECK: %[[VAL_1:.*]] = fir.embox %[[VAL_0]] : (!fir.ptr>) -> !fir.box>> +! CHECK: fir.has_value %[[VAL_1]] : !fir.box>> +! CHECK: } + diff --git a/flang/test/Lower/OpenMP/threadprivate-non-global.f90 b/flang/test/Lower/OpenMP/threadprivate-non-global.f90 index 0b9abd1d0bcf4..508a67deb698b 100644 --- a/flang/test/Lower/OpenMP/threadprivate-non-global.f90 +++ b/flang/test/Lower/OpenMP/threadprivate-non-global.f90 @@ -85,19 +85,19 @@ program test !CHECK-DAG: fir.has_value [[E1]] : !fir.box> !CHECK-DAG: } !CHECK-DAG: fir.global internal @_QFEw : complex { -!CHECK-DAG: [[Z2:%.*]] = fir.undefined complex +!CHECK-DAG: [[Z2:%.*]] = fir.zero_bits complex !CHECK-DAG: fir.has_value [[Z2]] : complex !CHECK-DAG: } !CHECK-DAG: fir.global internal @_QFEx : i32 { -!CHECK-DAG: [[Z3:%.*]] = fir.undefined i32 +!CHECK-DAG: [[Z3:%.*]] = fir.zero_bits i32 !CHECK-DAG: fir.has_value [[Z3]] : i32 !CHECK-DAG: } !CHECK-DAG: fir.global internal @_QFEy : f32 { -!CHECK-DAG: [[Z4:%.*]] = fir.undefined f32 +!CHECK-DAG: [[Z4:%.*]] = fir.zero_bits f32 !CHECK-DAG: fir.has_value [[Z4]] : f32 !CHECK-DAG: } !CHECK-DAG: fir.global internal @_QFEz : !fir.logical<4> { -!CHECK-DAG: [[Z5:%.*]] = fir.undefined !fir.logical<4> +!CHECK-DAG: [[Z5:%.*]] = fir.zero_bits !fir.logical<4> !CHECK-DAG: fir.has_value [[Z5]] : !fir.logical<4> !CHECK-DAG: } end From flang-commits at lists.llvm.org Wed May 7 02:18:19 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 02:18:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] share global variable initialization code (PR #138672) In-Reply-To: Message-ID: <681b255b.170a0220.2a225a.7d3d@mx.google.com> https://github.com/tblah closed https://github.com/llvm/llvm-project/pull/138672 From flang-commits at lists.llvm.org Wed May 7 02:27:32 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 07 May 2025 02:27:32 -0700 (PDT) Subject: 
[flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. (PR #138627) In-Reply-To: Message-ID: <681b2784.050a0220.3ce759.0b86@mx.google.com> ================ @@ -18,8 +18,8 @@ #include "flang/Optimizer/Passes/CommandLineOpts.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Tools/CrossToolHelpers.h" -#include "mlir/Conversion/ReconcileUnrealizedCasts/ReconcileUnrealizedCasts.h" -#include "mlir/Conversion/SCFToControlFlow/SCFToControlFlow.h" +#include "mlir/Conversion/Passes.h" ---------------- kiranchandramohan wrote: This include probably brings in a lot of conversion headers that are not relevant to Flang. Can we include only the relevant conversion headers? https://github.com/llvm/llvm-project/pull/138627 From flang-commits at lists.llvm.org Wed May 7 02:27:32 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 07 May 2025 02:27:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. (PR #138627) In-Reply-To: Message-ID: <681b2784.170a0220.88e13.8d17@mx.google.com> ================ @@ -211,6 +211,23 @@ void createDefaultFIROptimizerPassPipeline(mlir::PassManager &pm, addNestedPassToAllTopLevelOperations( pm, fir::createStackReclaim); + + if (enableAffineOpt && pc.OptLevel.isOptimizingForSpeed()) { + pm.addPass(fir::createPromoteToAffinePass()); + pm.addPass(mlir::createCSEPass()); + pm.addPass(mlir::affine::createAffineLoopInvariantCodeMotionPass()); + pm.addPass(mlir::affine::createAffineLoopNormalizePass()); + pm.addPass(mlir::affine::createSimplifyAffineStructuresPass()); + pm.addPass(mlir::affine::createAffineParallelize( ---------------- kiranchandramohan wrote: Is `AffineParallelize` specific for parallelizing to multiple threads or is it also applicable for single-thread transformations as well? 
https://github.com/llvm/llvm-project/pull/138627 From flang-commits at lists.llvm.org Wed May 7 02:27:34 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 07 May 2025 02:27:34 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. (PR #138627) In-Reply-To: Message-ID: <681b2786.050a0220.fa533.64ce@mx.google.com> ================ @@ -0,0 +1,52 @@ +! RUN: %flang_fc1 -O1 -mllvm --enable-affine-opt -emit-llvm -fopenmp -o - %s \ ---------------- kiranchandramohan wrote: We usually do not test end to end in this directory. If you really need to test the whole pipeline then you have to move this test to the Integration directory. Also, it might be good to check minimally (only the presence of OpenMP runtime calls) to avoid this test being a burden for others making unrelated changes. https://github.com/llvm/llvm-project/pull/138627 From flang-commits at lists.llvm.org Wed May 7 03:03:38 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 03:03:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. (PR #138627) In-Reply-To: Message-ID: <681b2ffa.050a0220.1f4c5.e0e8@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/138627 From flang-commits at lists.llvm.org Wed May 7 03:03:39 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 03:03:39 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. 
(PR #138627) In-Reply-To: Message-ID: <681b2ffb.170a0220.4912d.7e2a@mx.google.com> https://github.com/tblah commented: Thanks https://github.com/llvm/llvm-project/pull/138627 From flang-commits at lists.llvm.org Wed May 7 03:03:39 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 03:03:39 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. (PR #138627) In-Reply-To: Message-ID: <681b2ffb.620a0220.140fa9.f8a2@mx.google.com> ================ @@ -0,0 +1,52 @@ +! RUN: %flang_fc1 -O1 -mllvm --enable-affine-opt -emit-llvm -fopenmp -o - %s \ +! RUN: | FileCheck %s ---------------- tblah wrote: The lit tests are usually for testing a single step in the compilation flow (e.g. a single pass, just lowering to HLFIR, etc). Testing only one thing at a time makes it clearer what that thing does, and prevents the test from needing to be updated when unrelated code has changed. However, I can see the use of an integration test here. Please could you put it in `flang/test/Integration/OpenMP/`. It is probably also helpful to add a direct test that your affine optimization pipeline is being run. For example something like `flang/test/Driver/mlir-pass-pipeline.f90`. 
https://github.com/llvm/llvm-project/pull/138627 From flang-commits at lists.llvm.org Wed May 7 03:06:13 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Wed, 07 May 2025 03:06:13 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <681b3095.170a0220.3c0b5c.879c@mx.google.com> ================ @@ -4137,6 +4195,33 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } } +void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { + unsigned version{context_.langOptions().OpenMPVersion}; + if (version >= 52) { + SetContextClauseInfo(llvm::omp::Clause::OMPC_detach); + } else { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + } ---------------- Thirumalai-Shaktivel wrote: > nit: won't CheckAllowedClause understand that this is allowed in newer standard versions? Nope, are you suggesting to handle this in CheckAllowedClause itself? ``` At most one detach clause can appear on the directive. ``` The above restriction is available only for OpenMP 50 & 51. I wanted this check not to be applied to versions equal to or above 52. But `CheckAllowedClause` will throw an error for all versions. I tried some methods in OMP.td, but it didn't help much. So, I took this route. 
https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Wed May 7 03:06:27 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Wed, 07 May 2025 03:06:27 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <681b30a3.170a0220.1c4af9.863f@mx.google.com> https://github.com/Thirumalai-Shaktivel edited https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Wed May 7 03:07:05 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 03:07:05 -0700 (PDT) Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790) In-Reply-To: Message-ID: <681b30c9.050a0220.3d8265.3160@mx.google.com> ================ @@ -461,14 +475,28 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_ATTRIBUTE_UNUSED auto loopAnalysis = functionAnalysis.getChildLoopAnalysis(loop); auto &loopOps = loop.getBody()->getOperations(); + auto resultOp = cast(loop.getBody()->getTerminator()); ---------------- tblah wrote: Please modify the logic to skip loops which are unsupported (those with multiple results). Otherwise this could produce incorrect code. https://github.com/llvm/llvm-project/pull/137790 From flang-commits at lists.llvm.org Wed May 7 03:46:53 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 03:46:53 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681b3a1d.050a0220.b246b.dcb1@mx.google.com> https://github.com/tblah commented: Looks good, just minor comments. 
https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Wed May 7 03:46:53 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 03:46:53 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681b3a1d.170a0220.2de5a7.8158@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Wed May 7 03:46:53 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 03:46:53 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681b3a1d.050a0220.3c3295.3f86@mx.google.com> ================ @@ -388,6 +409,27 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const { return markClauseOccurrence(result.nowait); } +bool ClauseProcessor::processNumTasks( + lower::StatementContext &stmtCtx, + mlir::omp::NumTasksClauseOps &result) const { + using numtasks = omp::clause::NumTasks; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier) { + result.numTasksMod = mlir::omp::ClauseNumTasksTypeAttr::get( + context, mlir::omp::ClauseNumTasksType::Strict); + } ---------------- tblah wrote: Same as for the grain size. 
https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Wed May 7 03:46:54 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 03:46:54 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681b3a1e.170a0220.1ad544.7d7c@mx.google.com> ================ @@ -365,6 +365,27 @@ bool ClauseProcessor::processHint(mlir::omp::HintClauseOps &result) const { return false; } +bool ClauseProcessor::processGrainsize( + lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const { + using grainsize = omp::clause::Grainsize; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier) { + result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( + context, mlir::omp::ClauseGrainsizeType::Strict); + } ---------------- tblah wrote: I understand that `strict` is the only grainsize modifier currently in the openmp standard, but I think it would be good practice to check that the modifier is actually `strict` just in case something else is added in the future. 
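The suggestion in the review comment above (only record the modifier when it really is `strict`) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the Flang implementation: `Prescriptiveness`, `GrainsizeClause`, `GrainsizeOps`, and this `processGrainsize` are hypothetical stand-ins for the real clause and `mlir::omp::GrainsizeClauseOps` types, and the `Future` enumerator is invented purely to show what the check would guard against.

```cpp
#include <optional>

// Hypothetical stand-in for the clause modifier. Only Strict exists in the
// OpenMP standard today; Future is invented here for illustration.
enum class Prescriptiveness { Strict, Future };

// Hypothetical parsed clause: an optional modifier plus the grain size.
struct GrainsizeClause {
  std::optional<Prescriptiveness> modifier;
  int value;
};

// Hypothetical lowering result for the clause.
struct GrainsizeOps {
  bool strict = false;
  int grainsize = 0;
};

// Records the clause in `result`. Instead of treating "a modifier is present"
// as "the modifier is strict", the modifier value is compared explicitly, so
// an unknown modifier is rejected rather than silently lowered as strict.
bool processGrainsize(const GrainsizeClause &clause, GrainsizeOps &result) {
  if (clause.modifier) {
    if (*clause.modifier != Prescriptiveness::Strict)
      return false; // unsupported modifier: leave `result` untouched
    result.strict = true;
  }
  result.grainsize = clause.value;
  return true;
}
```

The same shape would apply to `num_tasks`, which the companion comment says should be handled like grainsize; in the real code the failure path would more plausibly be a TODO or diagnostic than a `false` return.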
https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Wed May 7 03:52:29 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 03:52:29 -0700 (PDT) Subject: [flang-commits] [flang] 2fb288d - [flang][fir] Lower `do concurrent` loop nests to `fir.do_concurrent` (#137928) Message-ID: <681b3b6d.170a0220.227fd7.ab3c@mx.google.com> Author: Kareem Ergawy Date: 2025-05-07T12:52:25+02:00 New Revision: 2fb288d4b8e0fb6c08a1a72b64cbf6a0752fdac7 URL: https://github.com/llvm/llvm-project/commit/2fb288d4b8e0fb6c08a1a72b64cbf6a0752fdac7 DIFF: https://github.com/llvm/llvm-project/commit/2fb288d4b8e0fb6c08a1a72b64cbf6a0752fdac7.diff LOG: [flang][fir] Lower `do concurrent` loop nests to `fir.do_concurrent` (#137928) Adds support for lowering `do concurrent` nests from PFT to the new `fir.do_concurrent` MLIR op as well as its special terminator `fir.do_concurrent.loop` which models the actual loop nest. To that end, this PR emits the allocations for the iteration variables within the block of the `fir.do_concurrent` op and creates a region for the `fir.do_concurrent.loop` op that accepts arguments equal in number to the number of the input `do concurrent` iteration ranges. 
For example, given the following input: ```fortran do concurrent(i=1:10, j=11:20) end do ``` the changes in this PR emit the following MLIR: ```mlir fir.do_concurrent { %22 = fir.alloca i32 {bindc_name = "i"} %23:2 = hlfir.declare %22 {uniq_name = "_QFsub1Ei"} : (!fir.ref) -> (!fir.ref, !fir.ref) %24 = fir.alloca i32 {bindc_name = "j"} %25:2 = hlfir.declare %24 {uniq_name = "_QFsub1Ej"} : (!fir.ref) -> (!fir.ref, !fir.ref) fir.do_concurrent.loop (%arg1, %arg2) = (%18, %20) to (%19, %21) step (%c1, %c1_0) { %26 = fir.convert %arg1 : (index) -> i32 fir.store %26 to %23#0 : !fir.ref %27 = fir.convert %arg2 : (index) -> i32 fir.store %27 to %25#0 : !fir.ref } } ``` Added: Modified: flang/lib/Lower/Bridge.cpp flang/lib/Optimizer/Builder/FIRBuilder.cpp flang/test/Lower/do_concurrent.f90 flang/test/Lower/do_concurrent_local_default_init.f90 flang/test/Lower/loops.f90 flang/test/Lower/loops3.f90 flang/test/Lower/nsw.f90 flang/test/Transforms/DoConcurrent/basic_host.f90 flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 flang/test/Transforms/DoConcurrent/loop_nest_test.f90 flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 flang/test/Transforms/DoConcurrent/non_const_bounds.f90 flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 Removed: ################################################################################ diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 72c63e4e314d2..8da05255d5f41 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -94,10 +94,11 @@ struct IncrementLoopInfo { template explicit IncrementLoopInfo(Fortran::semantics::Symbol &sym, const T &lower, const T &upper, const std::optional &step, - bool isUnordered = false) + bool isConcurrent = false) : loopVariableSym{&sym}, lowerExpr{Fortran::semantics::GetExpr(lower)}, upperExpr{Fortran::semantics::GetExpr(upper)}, - stepExpr{Fortran::semantics::GetExpr(step)}, isUnordered{isUnordered} {} + 
stepExpr{Fortran::semantics::GetExpr(step)}, + isConcurrent{isConcurrent} {} IncrementLoopInfo(IncrementLoopInfo &&) = default; IncrementLoopInfo &operator=(IncrementLoopInfo &&x) = default; @@ -120,7 +121,7 @@ struct IncrementLoopInfo { const Fortran::lower::SomeExpr *upperExpr; const Fortran::lower::SomeExpr *stepExpr; const Fortran::lower::SomeExpr *maskExpr = nullptr; - bool isUnordered; // do concurrent, forall + bool isConcurrent; llvm::SmallVector localSymList; llvm::SmallVector localInitSymList; llvm::SmallVector< @@ -130,7 +131,7 @@ struct IncrementLoopInfo { mlir::Value loopVariable = nullptr; // Data members for structured loops. - fir::DoLoopOp doLoop = nullptr; + mlir::Operation *loopOp = nullptr; // Data members for unstructured loops. bool hasRealControl = false; @@ -1981,7 +1982,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { llvm_unreachable("illegal reduction operator"); } - /// Collect DO CONCURRENT or FORALL loop control information. + /// Collect DO CONCURRENT loop control information. IncrementLoopNestInfo getConcurrentControl( const Fortran::parser::ConcurrentHeader &header, const std::list &localityList = {}) { @@ -2292,8 +2293,14 @@ class FirConverter : public Fortran::lower::AbstractConverter { mlir::LLVM::LoopAnnotationAttr la = mlir::LLVM::LoopAnnotationAttr::get( builder->getContext(), {}, /*vectorize=*/va, {}, /*unroll*/ ua, /*unroll_and_jam*/ uja, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}); - if (has_attrs) - info.doLoop.setLoopAnnotationAttr(la); + if (has_attrs) { + if (auto loopOp = mlir::dyn_cast(info.loopOp)) + loopOp.setLoopAnnotationAttr(la); + + if (auto doConcurrentOp = + mlir::dyn_cast(info.loopOp)) + doConcurrentOp.setLoopAnnotationAttr(la); + } } /// Generate FIR to begin a structured or unstructured increment loop nest. 
@@ -2302,96 +2309,77 @@ class FirConverter : public Fortran::lower::AbstractConverter { llvm::SmallVectorImpl &dirs) { assert(!incrementLoopNestInfo.empty() && "empty loop nest"); mlir::Location loc = toLocation(); - mlir::Operation *boundsAndStepIP = nullptr; mlir::arith::IntegerOverflowFlags iofBackup{}; + llvm::SmallVector nestLBs; + llvm::SmallVector nestUBs; + llvm::SmallVector nestSts; + llvm::SmallVector nestReduceOperands; + llvm::SmallVector nestReduceAttrs; + bool genDoConcurrent = false; + for (IncrementLoopInfo &info : incrementLoopNestInfo) { - mlir::Value lowerValue; - mlir::Value upperValue; - mlir::Value stepValue; + genDoConcurrent = info.isStructured() && info.isConcurrent; - { - mlir::OpBuilder::InsertionGuard guard(*builder); + if (!genDoConcurrent) + info.loopVariable = genLoopVariableAddress(loc, *info.loopVariableSym, + info.isConcurrent); - // Set the IP before the first loop in the nest so that all nest bounds - // and step values are created outside the nest. - if (boundsAndStepIP) - builder->setInsertionPointAfter(boundsAndStepIP); + if (!getLoweringOptions().getIntegerWrapAround()) { + iofBackup = builder->getIntegerOverflowFlags(); + builder->setIntegerOverflowFlags( + mlir::arith::IntegerOverflowFlags::nsw); + } - info.loopVariable = genLoopVariableAddress(loc, *info.loopVariableSym, - info.isUnordered); - if (!getLoweringOptions().getIntegerWrapAround()) { - iofBackup = builder->getIntegerOverflowFlags(); - builder->setIntegerOverflowFlags( - mlir::arith::IntegerOverflowFlags::nsw); - } - lowerValue = genControlValue(info.lowerExpr, info); - upperValue = genControlValue(info.upperExpr, info); - bool isConst = true; - stepValue = genControlValue(info.stepExpr, info, - info.isStructured() ? nullptr : &isConst); - if (!getLoweringOptions().getIntegerWrapAround()) - builder->setIntegerOverflowFlags(iofBackup); - boundsAndStepIP = stepValue.getDefiningOp(); - - // Use a temp variable for unstructured loops with non-const step. 
- if (!isConst) { - info.stepVariable = - builder->createTemporary(loc, stepValue.getType()); - boundsAndStepIP = - builder->create(loc, stepValue, info.stepVariable); + nestLBs.push_back(genControlValue(info.lowerExpr, info)); + nestUBs.push_back(genControlValue(info.upperExpr, info)); + bool isConst = true; + nestSts.push_back(genControlValue( + info.stepExpr, info, info.isStructured() ? nullptr : &isConst)); + + if (!getLoweringOptions().getIntegerWrapAround()) + builder->setIntegerOverflowFlags(iofBackup); + + // Use a temp variable for unstructured loops with non-const step. + if (!isConst) { + mlir::Value stepValue = nestSts.back(); + info.stepVariable = builder->createTemporary(loc, stepValue.getType()); + builder->create(loc, stepValue, info.stepVariable); + } + + if (genDoConcurrent && nestReduceOperands.empty()) { + // Create DO CONCURRENT reduce operands and attributes + for (const auto &reduceSym : info.reduceSymList) { + const fir::ReduceOperationEnum reduceOperation = reduceSym.first; + const Fortran::semantics::Symbol *sym = reduceSym.second; + fir::ExtendedValue exv = getSymbolExtendedValue(*sym, nullptr); + nestReduceOperands.push_back(fir::getBase(exv)); + auto reduceAttr = + fir::ReduceAttr::get(builder->getContext(), reduceOperation); + nestReduceAttrs.push_back(reduceAttr); } } + } + for (auto [info, lowerValue, upperValue, stepValue] : + llvm::zip_equal(incrementLoopNestInfo, nestLBs, nestUBs, nestSts)) { // Structured loop - generate fir.do_loop. if (info.isStructured()) { + if (genDoConcurrent) + continue; + + // The loop variable is a doLoop op argument. 
mlir::Type loopVarType = info.getLoopVariableType(); - mlir::Value loopValue; - if (info.isUnordered) { - llvm::SmallVector reduceOperands; - llvm::SmallVector reduceAttrs; - // Create DO CONCURRENT reduce operands and attributes - for (const auto &reduceSym : info.reduceSymList) { - const fir::ReduceOperationEnum reduce_operation = reduceSym.first; - const Fortran::semantics::Symbol *sym = reduceSym.second; - fir::ExtendedValue exv = getSymbolExtendedValue(*sym, nullptr); - reduceOperands.push_back(fir::getBase(exv)); - auto reduce_attr = - fir::ReduceAttr::get(builder->getContext(), reduce_operation); - reduceAttrs.push_back(reduce_attr); - } - // The loop variable value is explicitly updated. - info.doLoop = builder->create( - loc, lowerValue, upperValue, stepValue, /*unordered=*/true, - /*finalCountValue=*/false, /*iterArgs=*/std::nullopt, - llvm::ArrayRef(reduceOperands), reduceAttrs); - builder->setInsertionPointToStart(info.doLoop.getBody()); - loopValue = builder->createConvert(loc, loopVarType, - info.doLoop.getInductionVar()); - } else { - // The loop variable is a doLoop op argument. - info.doLoop = builder->create( - loc, lowerValue, upperValue, stepValue, /*unordered=*/false, - /*finalCountValue=*/true, - builder->createConvert(loc, loopVarType, lowerValue)); - builder->setInsertionPointToStart(info.doLoop.getBody()); - loopValue = info.doLoop.getRegionIterArgs()[0]; - } + auto loopOp = builder->create( + loc, lowerValue, upperValue, stepValue, /*unordered=*/false, + /*finalCountValue=*/true, + builder->createConvert(loc, loopVarType, lowerValue)); + info.loopOp = loopOp; + builder->setInsertionPointToStart(loopOp.getBody()); + mlir::Value loopValue = loopOp.getRegionIterArgs()[0]; + // Update the loop variable value in case it has non-index references. 
builder->create(loc, loopValue, info.loopVariable); - if (info.maskExpr) { - Fortran::lower::StatementContext stmtCtx; - mlir::Value maskCond = createFIRExpr(loc, info.maskExpr, stmtCtx); - stmtCtx.finalizeAndReset(); - mlir::Value maskCondCast = - builder->createConvert(loc, builder->getI1Type(), maskCond); - auto ifOp = builder->create(loc, maskCondCast, - /*withElseRegion=*/false); - builder->setInsertionPointToStart(&ifOp.getThenRegion().front()); - } - if (info.hasLocalitySpecs()) - handleLocalitySpecs(info); - addLoopAnnotationAttr(info, dirs); continue; } @@ -2455,6 +2443,60 @@ class FirConverter : public Fortran::lower::AbstractConverter { builder->restoreInsertionPoint(insertPt); } } + + if (genDoConcurrent) { + auto loopWrapperOp = builder->create(loc); + builder->setInsertionPointToStart( + builder->createBlock(&loopWrapperOp.getRegion())); + + for (IncrementLoopInfo &info : llvm::reverse(incrementLoopNestInfo)) { + info.loopVariable = genLoopVariableAddress(loc, *info.loopVariableSym, + info.isConcurrent); + } + + builder->setInsertionPointToEnd(loopWrapperOp.getBody()); + auto loopOp = builder->create( + loc, nestLBs, nestUBs, nestSts, nestReduceOperands, + nestReduceAttrs.empty() + ? 
nullptr + : mlir::ArrayAttr::get(builder->getContext(), nestReduceAttrs), + nullptr); + + llvm::SmallVector loopBlockArgTypes( + incrementLoopNestInfo.size(), builder->getIndexType()); + llvm::SmallVector loopBlockArgLocs( + incrementLoopNestInfo.size(), loc); + mlir::Region &loopRegion = loopOp.getRegion(); + mlir::Block *loopBlock = builder->createBlock( + &loopRegion, loopRegion.begin(), loopBlockArgTypes, loopBlockArgLocs); + builder->setInsertionPointToStart(loopBlock); + + for (auto [info, blockArg] : + llvm::zip_equal(incrementLoopNestInfo, loopBlock->getArguments())) { + info.loopOp = loopOp; + mlir::Value loopValue = + builder->createConvert(loc, info.getLoopVariableType(), blockArg); + builder->create(loc, loopValue, info.loopVariable); + + if (info.maskExpr) { + Fortran::lower::StatementContext stmtCtx; + mlir::Value maskCond = createFIRExpr(loc, info.maskExpr, stmtCtx); + stmtCtx.finalizeAndReset(); + mlir::Value maskCondCast = + builder->createConvert(loc, builder->getI1Type(), maskCond); + auto ifOp = builder->create(loc, maskCondCast, + /*withElseRegion=*/false); + builder->setInsertionPointToStart(&ifOp.getThenRegion().front()); + } + } + + IncrementLoopInfo &innermostInfo = incrementLoopNestInfo.back(); + + if (innermostInfo.hasLocalitySpecs()) + handleLocalitySpecs(innermostInfo); + + addLoopAnnotationAttr(innermostInfo, dirs); + } } /// Generate FIR to end a structured or unstructured increment loop nest. @@ -2471,29 +2513,31 @@ class FirConverter : public Fortran::lower::AbstractConverter { it != rend; ++it) { IncrementLoopInfo &info = *it; if (info.isStructured()) { - // End fir.do_loop. - if (info.isUnordered) { - builder->setInsertionPointAfter(info.doLoop); + // End fir.do_concurent.loop. + if (info.isConcurrent) { + builder->setInsertionPointAfter(info.loopOp->getParentOp()); continue; } + + // End fir.do_loop. // Decrement tripVariable. 
- builder->setInsertionPointToEnd(info.doLoop.getBody()); + auto doLoopOp = mlir::cast(info.loopOp); + builder->setInsertionPointToEnd(doLoopOp.getBody()); llvm::SmallVector results; results.push_back(builder->create( - loc, info.doLoop.getInductionVar(), info.doLoop.getStep(), - iofAttr)); + loc, doLoopOp.getInductionVar(), doLoopOp.getStep(), iofAttr)); // Step loopVariable to help optimizations such as vectorization. // Induction variable elimination will clean up as necessary. mlir::Value step = builder->createConvert( - loc, info.getLoopVariableType(), info.doLoop.getStep()); + loc, info.getLoopVariableType(), doLoopOp.getStep()); mlir::Value loopVar = builder->create(loc, info.loopVariable); results.push_back( builder->create(loc, loopVar, step, iofAttr)); builder->create(loc, results); - builder->setInsertionPointAfter(info.doLoop); + builder->setInsertionPointAfter(doLoopOp); // The loop control variable may be used after the loop. - builder->create(loc, info.doLoop.getResult(1), + builder->create(loc, doLoopOp.getResult(1), info.loopVariable); continue; } diff --git a/flang/lib/Optimizer/Builder/FIRBuilder.cpp b/flang/lib/Optimizer/Builder/FIRBuilder.cpp index 1d6e1502ed0f9..86166db355f72 100644 --- a/flang/lib/Optimizer/Builder/FIRBuilder.cpp +++ b/flang/lib/Optimizer/Builder/FIRBuilder.cpp @@ -280,6 +280,9 @@ mlir::Block *fir::FirOpBuilder::getAllocaBlock() { if (auto cufKernelOp = getRegion().getParentOfType()) return &cufKernelOp.getRegion().front(); + if (auto doConcurentOp = getRegion().getParentOfType()) + return doConcurentOp.getBody(); + return getEntryBlock(); } diff --git a/flang/test/Lower/do_concurrent.f90 b/flang/test/Lower/do_concurrent.f90 index ef93d2d6b035b..cc113f59c35e3 100644 --- a/flang/test/Lower/do_concurrent.f90 +++ b/flang/test/Lower/do_concurrent.f90 @@ -14,6 +14,9 @@ subroutine sub1(n) implicit none integer :: n, m, i, j, k integer, dimension(n) :: a +!CHECK: %[[N_DECL:.*]]:2 = hlfir.declare %{{.*}} dummy_scope %{{.*}} 
{uniq_name = "_QFsub1En"} +!CHECK: %[[A_DECL:.*]]:2 = hlfir.declare %{{.*}}(%{{.*}}) {uniq_name = "_QFsub1Ea"} + !CHECK: %[[LB1:.*]] = arith.constant 1 : i32 !CHECK: %[[LB1_CVT:.*]] = fir.convert %[[LB1]] : (i32) -> index !CHECK: %[[UB1:.*]] = fir.load %{{.*}}#0 : !fir.ref @@ -29,10 +32,30 @@ subroutine sub1(n) !CHECK: %[[UB3:.*]] = arith.constant 10 : i32 !CHECK: %[[UB3_CVT:.*]] = fir.convert %[[UB3]] : (i32) -> index -!CHECK: fir.do_loop %{{.*}} = %[[LB1_CVT]] to %[[UB1_CVT]] step %{{.*}} unordered -!CHECK: fir.do_loop %{{.*}} = %[[LB2_CVT]] to %[[UB2_CVT]] step %{{.*}} unordered -!CHECK: fir.do_loop %{{.*}} = %[[LB3_CVT]] to %[[UB3_CVT]] step %{{.*}} unordered +!CHECK: fir.do_concurrent +!CHECK: %[[I:.*]] = fir.alloca i32 {bindc_name = "i"} +!CHECK: %[[I_DECL:.*]]:2 = hlfir.declare %[[I]] +!CHECK: %[[J:.*]] = fir.alloca i32 {bindc_name = "j"} +!CHECK: %[[J_DECL:.*]]:2 = hlfir.declare %[[J]] +!CHECK: %[[K:.*]] = fir.alloca i32 {bindc_name = "k"} +!CHECK: %[[K_DECL:.*]]:2 = hlfir.declare %[[K]] + +!CHECK: fir.do_concurrent.loop (%[[I_IV:.*]], %[[J_IV:.*]], %[[K_IV:.*]]) = +!CHECK-SAME: (%[[LB1_CVT]], %[[LB2_CVT]], %[[LB3_CVT]]) to +!CHECK-SAME: (%[[UB1_CVT]], %[[UB2_CVT]], %[[UB3_CVT]]) step +!CHECK-SAME: (%{{.*}}, %{{.*}}, %{{.*}}) { +!CHECK: %[[I_IV_CVT:.*]] = fir.convert %[[I_IV]] : (index) -> i32 +!CHECK: fir.store %[[I_IV_CVT]] to %[[I_DECL]]#0 : !fir.ref +!CHECK: %[[J_IV_CVT:.*]] = fir.convert %[[J_IV]] : (index) -> i32 +!CHECK: fir.store %[[J_IV_CVT]] to %[[J_DECL]]#0 : !fir.ref +!CHECK: %[[K_IV_CVT:.*]] = fir.convert %[[K_IV]] : (index) -> i32 +!CHECK: fir.store %[[K_IV_CVT]] to %[[K_DECL]]#0 : !fir.ref +!CHECK: %[[N_VAL:.*]] = fir.load %[[N_DECL]]#0 : !fir.ref +!CHECK: %[[I_VAL:.*]] = fir.load %[[I_DECL]]#0 : !fir.ref +!CHECK: %[[I_VAL_CVT:.*]] = fir.convert %[[I_VAL]] : (i32) -> i64 +!CHECK: %[[A_ELEM:.*]] = hlfir.designate %[[A_DECL]]#0 (%[[I_VAL_CVT]]) +!CHECK: hlfir.assign %[[N_VAL]] to %[[A_ELEM]] : i32, !fir.ref do concurrent(i=1:n, j=1:bar(n*m, 
n/m), k=5:10) a(i) = n end do @@ -45,14 +68,17 @@ subroutine sub2(n) integer, dimension(n) :: a !CHECK: %[[LB1:.*]] = arith.constant 1 : i32 !CHECK: %[[LB1_CVT:.*]] = fir.convert %[[LB1]] : (i32) -> index -!CHECK: %[[UB1:.*]] = fir.load %5#0 : !fir.ref +!CHECK: %[[UB1:.*]] = fir.load %{{.*}}#0 : !fir.ref !CHECK: %[[UB1_CVT:.*]] = fir.convert %[[UB1]] : (i32) -> index -!CHECK: fir.do_loop %{{.*}} = %[[LB1_CVT]] to %[[UB1_CVT]] step %{{.*}} unordered +!CHECK: fir.do_concurrent +!CHECK: fir.do_concurrent.loop (%{{.*}}) = (%[[LB1_CVT]]) to (%[[UB1_CVT]]) step (%{{.*}}) + !CHECK: %[[LB2:.*]] = arith.constant 1 : i32 !CHECK: %[[LB2_CVT:.*]] = fir.convert %[[LB2]] : (i32) -> index !CHECK: %[[UB2:.*]] = fir.call @_QPbar(%{{.*}}, %{{.*}}) proc_attrs fastmath : (!fir.ref, !fir.ref) -> i32 !CHECK: %[[UB2_CVT:.*]] = fir.convert %[[UB2]] : (i32) -> index -!CHECK: fir.do_loop %{{.*}} = %[[LB2_CVT]] to %[[UB2_CVT]] step %{{.*}} unordered +!CHECK: fir.do_concurrent +!CHECK: fir.do_concurrent.loop (%{{.*}}) = (%[[LB2_CVT]]) to (%[[UB2_CVT]]) step (%{{.*}}) do concurrent(i=1:n) do concurrent(j=1:bar(n*m, n/m)) a(i) = n @@ -60,7 +86,6 @@ subroutine sub2(n) end do end subroutine - !CHECK-LABEL: unstructured subroutine unstructured(inner_step) integer(4) :: i, j, inner_step diff --git a/flang/test/Lower/do_concurrent_local_default_init.f90 b/flang/test/Lower/do_concurrent_local_default_init.f90 index 7652e4fcd0402..207704ac1a990 100644 --- a/flang/test/Lower/do_concurrent_local_default_init.f90 +++ b/flang/test/Lower/do_concurrent_local_default_init.f90 @@ -29,7 +29,7 @@ subroutine test_default_init() ! CHECK-SAME: %[[VAL_0:.*]]: !fir.ref>>>> {fir.bindc_name = "p"}) { ! CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_0]] : !fir.ref>>>> ! CHECK: %[[VAL_7:.*]] = fir.box_elesize %[[VAL_6]] : (!fir.box>>>) -> index -! CHECK: fir.do_loop +! CHECK: fir.do_concurrent.loop ! CHECK: %[[VAL_16:.*]] = fir.alloca !fir.box>>> {bindc_name = "p", pinned, uniq_name = "_QFtest_ptrEp"} ! 
CHECK: %[[VAL_17:.*]] = fir.zero_bits !fir.ptr>> ! CHECK: %[[VAL_18:.*]] = arith.constant 0 : index @@ -43,7 +43,7 @@ subroutine test_default_init() ! CHECK: } ! CHECK-LABEL: func.func @_QPtest_default_init( -! CHECK: fir.do_loop +! CHECK: fir.do_concurrent.loop ! CHECK: %[[VAL_26:.*]] = fir.alloca !fir.type<_QFtest_default_initTt{i:i32}> {bindc_name = "a", pinned, uniq_name = "_QFtest_default_initEa"} ! CHECK: %[[VAL_27:.*]] = fir.embox %[[VAL_26]] : (!fir.ref>) -> !fir.box> ! CHECK: %[[VAL_30:.*]] = fir.convert %[[VAL_27]] : (!fir.box>) -> !fir.box diff --git a/flang/test/Lower/loops.f90 b/flang/test/Lower/loops.f90 index ea65ba3e4d66d..60df27a591dc3 100644 --- a/flang/test/Lower/loops.f90 +++ b/flang/test/Lower/loops.f90 @@ -2,15 +2,6 @@ ! CHECK-LABEL: loop_test subroutine loop_test - ! CHECK: %[[VAL_2:.*]] = fir.alloca i16 {bindc_name = "i"} - ! CHECK: %[[VAL_3:.*]] = fir.alloca i16 {bindc_name = "i"} - ! CHECK: %[[VAL_4:.*]] = fir.alloca i16 {bindc_name = "i"} - ! CHECK: %[[VAL_5:.*]] = fir.alloca i8 {bindc_name = "k"} - ! CHECK: %[[VAL_6:.*]] = fir.alloca i8 {bindc_name = "j"} - ! CHECK: %[[VAL_7:.*]] = fir.alloca i8 {bindc_name = "i"} - ! CHECK: %[[VAL_8:.*]] = fir.alloca i32 {bindc_name = "k"} - ! CHECK: %[[VAL_9:.*]] = fir.alloca i32 {bindc_name = "j"} - ! CHECK: %[[VAL_10:.*]] = fir.alloca i32 {bindc_name = "i"} ! CHECK: %[[VAL_11:.*]] = fir.alloca !fir.array<5x5x5xi32> {bindc_name = "a", uniq_name = "_QFloop_testEa"} ! CHECK: %[[VAL_12:.*]] = fir.alloca i32 {bindc_name = "asum", uniq_name = "_QFloop_testEasum"} ! CHECK: %[[VAL_13:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFloop_testEi"} @@ -25,7 +16,7 @@ subroutine loop_test j = 200 k = 300 - ! CHECK-COUNT-3: fir.do_loop {{.*}} unordered + ! CHECK: fir.do_concurrent.loop (%{{.*}}, %{{.*}}, %{{.*}}) = {{.*}} do concurrent (i=1:5, j=1:5, k=1:5) ! shared(a) ! CHECK: fir.coordinate_of a(i,j,k) = 0 @@ -33,7 +24,7 @@ subroutine loop_test ! 
CHECK: fir.call @_FortranAioBeginExternalListOutput print*, 'A:', i, j, k - ! CHECK-COUNT-3: fir.do_loop {{.*}} unordered + ! CHECK: fir.do_concurrent.loop (%{{.*}}, %{{.*}}, %{{.*}}) = {{.*}} ! CHECK: fir.if do concurrent (integer(1)::i=1:5, j=1:5, k=1:5, i.ne.j .and. k.ne.3) shared(a) ! CHECK-COUNT-2: fir.coordinate_of @@ -53,7 +44,7 @@ subroutine loop_test ! CHECK: fir.call @_FortranAioBeginExternalListOutput print*, 'B:', i, j, k, '-', asum - ! CHECK: fir.do_loop {{.*}} unordered + ! CHECK: fir.do_concurrent.loop (%{{.*}}) = {{.*}} ! CHECK-COUNT-2: fir.if do concurrent (integer(2)::i=1:5, i.ne.3) if (i.eq.2 .or. i.eq.4) goto 5 ! fir.if @@ -62,7 +53,7 @@ subroutine loop_test 5 continue enddo - ! CHECK: fir.do_loop {{.*}} unordered + ! CHECK: fir.do_concurrent.loop (%{{.*}}) = {{.*}} ! CHECK-COUNT-2: fir.if do concurrent (integer(2)::i=1:5, i.ne.3) if (i.eq.2 .or. i.eq.4) then ! fir.if @@ -93,10 +84,6 @@ end subroutine loop_test ! CHECK-LABEL: c.func @_QPlis subroutine lis(n) - ! CHECK-DAG: fir.alloca i32 {bindc_name = "m"} - ! CHECK-DAG: fir.alloca i32 {bindc_name = "j"} - ! CHECK-DAG: fir.alloca i32 {bindc_name = "i"} - ! CHECK-DAG: fir.alloca i8 {bindc_name = "i"} ! CHECK-DAG: fir.alloca i32 {bindc_name = "j", uniq_name = "_QFlisEj"} ! CHECK-DAG: fir.alloca i32 {bindc_name = "k", uniq_name = "_QFlisEk"} ! CHECK-DAG: fir.alloca !fir.box>> {bindc_name = "p", uniq_name = "_QFlisEp"} @@ -117,8 +104,8 @@ subroutine lis(n) ! CHECK: } r = 0 - ! CHECK: fir.do_loop %arg1 = %{{.*}} to %{{.*}} step %{{.*}} unordered { - ! CHECK: fir.do_loop %arg2 = %{{.*}} to %{{.*}} step %c1{{.*}} iter_args(%arg3 = %{{.*}}) -> (index, i32) { + ! CHECK: fir.do_concurrent { + ! CHECK: fir.do_concurrent.loop (%{{.*}}) = (%{{.*}}) to (%{{.*}}) step (%{{.*}}) { ! CHECK: } ! CHECK: } do concurrent (integer(kind=1)::i=n:1:-1) @@ -128,16 +115,18 @@ subroutine lis(n) enddo enddo - ! CHECK: fir.do_loop %arg1 = %{{.*}} to %{{.*}} step %c1{{.*}} unordered { - ! 
CHECK: fir.do_loop %arg2 = %{{.*}} to %{{.*}} step %c1{{.*}} unordered { + ! CHECK: fir.do_concurrent.loop (%{{.*}}, %{{.*}}) = (%{{.*}}, %{{.*}}) to (%{{.*}}, %{{.*}}) step (%{{.*}}, %{{.*}}) { ! CHECK: fir.if %{{.*}} { ! CHECK: %[[V_95:[0-9]+]] = fir.alloca !fir.array, %{{.*}}, %{{.*}} {bindc_name = "t", pinned, uniq_name = "_QFlisEt"} ! CHECK: %[[V_96:[0-9]+]] = fir.alloca !fir.box>> {bindc_name = "p", pinned, uniq_name = "_QFlisEp"} ! CHECK: fir.store %{{.*}} to %[[V_96]] : !fir.ref>>> ! CHECK: fir.do_loop %arg3 = %{{.*}} to %{{.*}} step %c1{{.*}} iter_args(%arg4 = %{{.*}}) -> (index, i32) { - ! CHECK: fir.do_loop %arg5 = %{{.*}} to %{{.*}} step %c1{{.*}} unordered { - ! CHECK: fir.load %[[V_96]] : !fir.ref>>> - ! CHECK: fir.convert %[[V_95]] : (!fir.ref>) -> !fir.ref> + ! CHECK: fir.do_concurrent { + ! CHECK: fir.alloca i32 {bindc_name = "m"} + ! CHECK: fir.do_concurrent.loop (%{{.*}}) = (%{{.*}}) to (%{{.*}}) step (%{{.*}}) { + ! CHECK: fir.load %[[V_96]] : !fir.ref>>> + ! CHECK: fir.convert %[[V_95]] : (!fir.ref>) -> !fir.ref> + ! CHECK: } ! CHECK: } ! CHECK: } ! CHECK: fir.convert %[[V_95]] : (!fir.ref>) -> !fir.ref> diff --git a/flang/test/Lower/loops3.f90 b/flang/test/Lower/loops3.f90 index 78f39e1013082..84db1972cca16 100644 --- a/flang/test/Lower/loops3.f90 +++ b/flang/test/Lower/loops3.f90 @@ -12,9 +12,7 @@ subroutine loop_test ! CHECK: %[[VAL_0:.*]] = fir.alloca f32 {bindc_name = "m", uniq_name = "_QFloop_testEm"} ! CHECK: %[[VAL_1:.*]] = fir.address_of(@_QFloop_testEsum) : !fir.ref - ! CHECK: fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered reduce(#fir.reduce_attr -> %[[VAL_1:.*]] : !fir.ref, #fir.reduce_attr -> %[[VAL_0:.*]] : !fir.ref) { - ! CHECK: fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered reduce(#fir.reduce_attr -> %[[VAL_1:.*]] : !fir.ref, #fir.reduce_attr -> %[[VAL_0:.*]] : !fir.ref) { - ! 
CHECK: fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered reduce(#fir.reduce_attr -> %[[VAL_1:.*]] : !fir.ref, #fir.reduce_attr -> %[[VAL_0:.*]] : !fir.ref) { + ! CHECK: fir.do_concurrent.loop ({{.*}}) = ({{.*}}) to ({{.*}}) step ({{.*}}) reduce(#fir.reduce_attr -> %[[VAL_1:.*]] : !fir.ref, #fir.reduce_attr -> %[[VAL_0:.*]] : !fir.ref) { do concurrent (i=1:5, j=1:5, k=1:5) local(tmp) reduce(+:sum) reduce(max:m) tmp = i + j + k sum = tmp + sum diff --git a/flang/test/Lower/nsw.f90 b/flang/test/Lower/nsw.f90 index 4ee9e5da829e6..2ec1efb2af42a 100644 --- a/flang/test/Lower/nsw.f90 +++ b/flang/test/Lower/nsw.f90 @@ -139,7 +139,6 @@ subroutine loop_params3(a,lb,ub,st) ! CHECK-LABEL: func.func @_QPloop_params3( ! CHECK: %[[VAL_4:.*]] = arith.constant 2 : i32 ! CHECK: %[[VAL_5:.*]] = arith.constant 1 : i32 -! CHECK: %[[VAL_9:.*]] = fir.declare %{{.*}}i"} : (!fir.ref) -> !fir.ref ! CHECK: %[[VAL_11:.*]] = fir.declare %{{.*}}lb"} : (!fir.ref, !fir.dscope) -> !fir.ref ! CHECK: %[[VAL_12:.*]] = fir.declare %{{.*}}ub"} : (!fir.ref, !fir.dscope) -> !fir.ref ! CHECK: %[[VAL_14:.*]] = fir.declare %{{.*}}i"} : (!fir.ref) -> !fir.ref @@ -153,4 +152,6 @@ subroutine loop_params3(a,lb,ub,st) ! CHECK: %[[VAL_31:.*]] = fir.load %[[VAL_15]] : !fir.ref ! CHECK: %[[VAL_32:.*]] = arith.muli %[[VAL_31]], %[[VAL_4]] overflow : i32 ! CHECK: %[[VAL_33:.*]] = fir.convert %[[VAL_32]] : (i32) -> index -! CHECK: fir.do_loop %[[VAL_34:.*]] = %[[VAL_28]] to %[[VAL_30]] step %[[VAL_33]] unordered { +! CHECK: fir.do_concurrent { +! CHECK: %[[VAL_9:.*]] = fir.declare %{{.*}}i"} : (!fir.ref) -> !fir.ref +! CHECK: fir.do_concurrent.loop (%[[VAL_34:.*]]) = (%[[VAL_28]]) to (%[[VAL_30]]) step (%[[VAL_33]]) { diff --git a/flang/test/Transforms/DoConcurrent/basic_host.f90 b/flang/test/Transforms/DoConcurrent/basic_host.f90 index 12f63031cbaee..b84d4481ac766 100644 --- a/flang/test/Transforms/DoConcurrent/basic_host.f90 +++ b/flang/test/Transforms/DoConcurrent/basic_host.f90 @@ -1,3 +1,6 @@ +! 
Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! Tests mapping of a basic `do concurrent` loop to `!$omp parallel do`. ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host %s -o - \ diff --git a/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 b/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 index f82696669eca6..4e13c0919589a 100644 --- a/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 +++ b/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 @@ -1,3 +1,6 @@ +! Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! Tests that "loop-local values" are properly handled by localizing them to the ! body of the loop nest. See `collectLoopLocalValues` and `localizeLoopLocalValue` ! for a definition of "loop-local values" and how they are handled. diff --git a/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 b/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 index 32bed61fe69e4..adc4a488d1ec9 100644 --- a/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 +++ b/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 @@ -1,3 +1,6 @@ +! Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! Tests loop-nest detection algorithm for do-concurrent mapping. ! REQUIRES: asserts diff --git a/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 b/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 index d0210726de83e..26800678d381c 100644 --- a/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 +++ b/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 @@ -1,3 +1,6 @@ +! Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! Tests mapping of a `do concurrent` loop with multiple iteration ranges. ! 
RUN: split-file %s %t diff --git a/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 b/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 index cd1bd4f98a3f5..23a3aae976c07 100644 --- a/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 +++ b/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 @@ -1,3 +1,6 @@ +! Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host %s -o - \ ! RUN: | FileCheck %s diff --git a/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 b/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 index 184fdfe00d397..d1c02101318ab 100644 --- a/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 +++ b/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 @@ -1,3 +1,6 @@ +! Fails until we update the pass to use the `fir.do_concurrent` op. +! XFAIL: * + ! Tests that if `do concurrent` is not perfectly nested in its parent loop, that ! we skip converting the not-perfectly nested `do concurrent` loop. 
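The reworked loop-begin code in the Bridge.cpp diff above appears to work in two phases: it first gathers the lower bounds, upper bounds and steps of every level of the nest into vectors (`nestLBs`, `nestUBs`, `nestSts`), and only afterwards creates a single multi-range loop from them. The following is a minimal, MLIR-free sketch of that pattern, not the actual lowering code. `LoopInfo`, `NestBounds`, `collectBounds`, `joinInts` and `emitLoopHeader` are hypothetical names introduced only for illustration, and the printed header is an invented text format rather than real FIR syntax.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one level of a loop nest (not flang's IncrementLoopInfo).
struct LoopInfo {
  std::string var;
  int lower;
  int upper;
  int step;
};

// Hypothetical stand-in for the nestLBs / nestUBs / nestSts vectors in the diff.
struct NestBounds {
  std::vector<int> lbs;
  std::vector<int> ubs;
  std::vector<int> steps;
};

// Phase 1: collect the control values of the whole nest before any loop is created.
NestBounds collectBounds(const std::vector<LoopInfo> &nest) {
  NestBounds bounds;
  for (const LoopInfo &info : nest) {
    bounds.lbs.push_back(info.lower);
    bounds.ubs.push_back(info.upper);
    bounds.steps.push_back(info.step);
  }
  return bounds;
}

// Formats a list of integers as "(a, b, c)".
std::string joinInts(const std::vector<int> &values) {
  std::string out = "(";
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i != 0)
      out += ", ";
    out += std::to_string(values[i]);
  }
  return out + ")";
}

// Phase 2: emit one header that covers every range of the nest at once,
// instead of one nested loop per range.
std::string emitLoopHeader(const NestBounds &bounds) {
  return "loop " + joinInts(bounds.lbs) + " to " + joinInts(bounds.ubs) +
         " step " + joinInts(bounds.steps);
}
```

For the `do concurrent(i=1:10, j=11:20)` example from the commit message, this sketch would produce one header with two ranges, which is the shape the new `fir.do_concurrent.loop` op takes in the updated tests.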
From flang-commits at lists.llvm.org Wed May 7 03:52:31 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 07 May 2025 03:52:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Lower `do concurrent` loop nests to `fir.do_concurrent` (PR #137928) In-Reply-To: Message-ID: <681b3b6f.170a0220.2858d1.7ca4@mx.google.com> https://github.com/ergawy closed https://github.com/llvm/llvm-project/pull/137928 From flang-commits at lists.llvm.org Wed May 7 03:52:33 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 07 May 2025 03:52:33 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Update `do concurrent` mapping pass to use `fir.do_concurrent` op (PR #138489) In-Reply-To: Message-ID: <681b3b71.170a0220.32cf8f.9e7a@mx.google.com> https://github.com/ergawy edited https://github.com/llvm/llvm-project/pull/138489 From flang-commits at lists.llvm.org Wed May 7 03:52:35 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 07 May 2025 03:52:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add `fir.local` op for locality specifiers (PR #138505) In-Reply-To: Message-ID: <681b3b73.630a0220.37da94.9b4f@mx.google.com> https://github.com/ergawy edited https://github.com/llvm/llvm-project/pull/138505 From flang-commits at lists.llvm.org Wed May 7 03:53:49 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 07 May 2025 03:53:49 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Update `do concurrent` mapping pass to use `fir.do_concurrent` op (PR #138489) In-Reply-To: Message-ID: <681b3bbd.170a0220.34d0bf.893b@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/138489

From flang-commits at lists.llvm.org Wed May 7 03:55:30 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Wed, 07 May 2025 03:55:30 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) In-Reply-To: Message-ID: <681b3c22.170a0220.250d2a.947b@mx.google.com> https://github.com/skatrak edited https://github.com/llvm/llvm-project/pull/138210 From flang-commits at lists.llvm.org Wed May 7 03:55:30 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Wed, 07 May 2025 03:55:30 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) In-Reply-To: Message-ID: <681b3c22.050a0220.bf586.11b2@mx.google.com> ================ @@ -251,9 +262,17 @@ genImplicitBoundsOps(fir::FirOpBuilder &builder, AddrAndBoundsInfo &info, llvm::SmallVector bounds; mlir::Value baseOp = info.rawInput; - if (mlir::isa(fir::unwrapRefType(baseOp.getType()))) + if (mlir::isa(fir::unwrapRefType(baseOp.getType()))) { + // if it's an optional argument, it is possible it is not present, in which ---------------- skatrak wrote: ```suggestion // If it's an optional argument, it is possible it is not present, in which ``` https://github.com/llvm/llvm-project/pull/138210 From flang-commits at lists.llvm.org Wed May 7 03:55:30 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Wed, 07 May 2025 03:55:30 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) In-Reply-To: Message-ID: <681b3c22.170a0220.27129e.7c66@mx.google.com> https://github.com/skatrak approved this pull request. Thank you Andrew, LGTM. Please wait for @jeanPerier's approval before merging, though. 
https://github.com/llvm/llvm-project/pull/138210 From flang-commits at lists.llvm.org Wed May 7 03:55:31 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Wed, 07 May 2025 03:55:31 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) In-Reply-To: Message-ID: <681b3c23.630a0220.382d9.aca2@mx.google.com> ================ @@ -0,0 +1,46 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +module mod + implicit none +contains + subroutine routine(a) + implicit none + real(4), allocatable, optional, intent(inout) :: a(:) + integer(4) :: i + + !$omp target teams distribute parallel do shared(a) + do i=1,10 + a(i) = i + a(i) + end do + + end subroutine routine +end module mod + +! CHECK-LABEL: func.func @_QMmodProutine( +! CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>> {fir.bindc_name = "a", fir.optional}) { +! CHECK: %[[VAL_0:.*]] = fir.alloca !fir.box>> +! CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[ARG0]] dummy_scope %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QMmodFroutineEa"} : (!fir.ref>>>, !fir.dscope) -> (!fir.ref>>>, !fir.ref>>>) +! CHECK: %[[VAL_8:.*]] = fir.is_present %[[VAL_2]]#1 : (!fir.ref>>>) -> i1 +! CHECK: %[[VAL_9:.*]]:5 = fir.if %[[VAL_8]] -> (index, index, index, index, index) { ---------------- skatrak wrote: What are the returned values here used for? https://github.com/llvm/llvm-project/pull/138210 From flang-commits at lists.llvm.org Wed May 7 04:08:38 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 04:08:38 -0700 (PDT) Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. 
(PR #137790) In-Reply-To: Message-ID: <681b3f36.170a0220.302125.82d1@mx.google.com> ================ @@ -461,14 +475,28 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_ATTRIBUTE_UNUSED auto loopAnalysis = functionAnalysis.getChildLoopAnalysis(loop); auto &loopOps = loop.getBody()->getOperations(); + auto resultOp = cast(loop.getBody()->getTerminator()); ---------------- NexMing wrote: My logic takes loops with multiple results into account, and I believe it can be successfully converted. I will add tests later to demonstrate this. https://github.com/llvm/llvm-project/pull/137790 From flang-commits at lists.llvm.org Wed May 7 04:55:55 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Wed, 07 May 2025 04:55:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Update `do concurrent` mapping pass to use `fir.do_concurrent` op (PR #138489) In-Reply-To: Message-ID: <681b4a4b.170a0220.3b1b50.8cc3@mx.google.com> ================ @@ -548,19 +357,16 @@ class DoConcurrentConversionPass return; } - llvm::DenseSet concurrentLoopsToSkip; + llvm::DenseSet concurrentLoopsToSkip; ---------------- skatrak wrote: I think we should be safe to rely on that, since it's documented: https://mlir.llvm.org/docs/DialectConversion/#modes-of-conversion. > In all cases, the framework walks the operations in preorder, examining an op before the ops in any regions it has. 
https://github.com/llvm/llvm-project/pull/138489 From flang-commits at lists.llvm.org Wed May 7 05:00:09 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 05:00:09 -0700 (PDT) Subject: [flang-commits] [flang] a83bb35 - [flang][fir] Add `fir.local` op for locality specifiers (#138505) Message-ID: <681b4b49.170a0220.1f3e6e.82ef@mx.google.com> Author: Kareem Ergawy Date: 2025-05-07T14:00:06+02:00 New Revision: a83bb35e9989f9d27bb6c0578caa4183b8cbefdc URL: https://github.com/llvm/llvm-project/commit/a83bb35e9989f9d27bb6c0578caa4183b8cbefdc DIFF: https://github.com/llvm/llvm-project/commit/a83bb35e9989f9d27bb6c0578caa4183b8cbefdc.diff LOG: [flang][fir] Add `fir.local` op for locality specifiers (#138505) Adds a new `fir.local` op to model `local` and `local_init` locality specifiers. This op is a clone of `omp.private`. In particular, this new op also models the privatization/localization logic of an SSA value in the `fir` dialect just like `omp.private` does for OpenMP. 
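The region constraints that this commit's verifier enforces for the new op can be summarized in a small standalone sketch. This is only a model under stated assumptions: `LocalityKind`, `RegionShape`, `LocalizerShape`, `checkRegion` and `verifyLocalizer` are hypothetical names, regions are reduced to argument and yield counts, and type checking of the yielded value is left out.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration; these are not flang or MLIR types.
enum class LocalityKind { Local, LocalInit };

struct RegionShape {
  unsigned numArgs;    // block arguments of the region
  unsigned numYielded; // values handed to the terminating yield
};

struct LocalizerShape {
  LocalityKind kind;
  std::optional<RegionShape> init;
  std::optional<RegionShape> copy;
  std::optional<RegionShape> dealloc;
};

// Check one region against the expected argument and yield counts.
// Returns an empty string on success, otherwise a short diagnostic.
std::string checkRegion(const RegionShape &region, const std::string &name,
                        unsigned expectedArgs, unsigned expectedYields) {
  if (region.numArgs != expectedArgs)
    return "`" + name + "`: expected " + std::to_string(expectedArgs) +
           " region arguments, got: " + std::to_string(region.numArgs);
  if (region.numYielded != expectedYields)
    return "`" + name + "`: expected " + std::to_string(expectedYields) +
           " yielded values, got: " + std::to_string(region.numYielded);
  return "";
}

// Applies the checks in the same order as the verifier in the diff.
std::string verifyLocalizer(const LocalizerShape &op) {
  if (op.init) {
    std::string err = checkRegion(*op.init, "init", 2, 1);
    if (!err.empty())
      return err;
  }
  if (op.kind == LocalityKind::Local && op.copy)
    return "`local` does not take a `copy` region";
  if (op.kind == LocalityKind::LocalInit && !op.copy)
    return "`local_init` needs a `copy` region";
  if (op.copy) {
    std::string err = checkRegion(*op.copy, "copy", 2, 1);
    if (!err.empty())
      return err;
  }
  if (op.dealloc) {
    std::string err = checkRegion(*op.dealloc, "dealloc", 1, 0);
    if (!err.empty())
      return err;
  }
  return "";
}
```

The diagnostics here are simplified and are not meant to match the messages emitted by the real verifier.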
PR stack: - https://github.com/llvm/llvm-project/pull/137928 - https://github.com/llvm/llvm-project/pull/138505 (this PR) - https://github.com/llvm/llvm-project/pull/138506 - https://github.com/llvm/llvm-project/pull/138512 - https://github.com/llvm/llvm-project/pull/138534 - https://github.com/llvm/llvm-project/pull/138816 Added: Modified: flang/include/flang/Optimizer/Dialect/FIRAttr.td flang/include/flang/Optimizer/Dialect/FIROps.td flang/lib/Optimizer/Dialect/FIROps.cpp flang/test/Fir/do_concurrent.fir flang/test/Fir/invalid.fir Removed: ################################################################################ diff --git a/flang/include/flang/Optimizer/Dialect/FIRAttr.td b/flang/include/flang/Optimizer/Dialect/FIRAttr.td index 3ebc24951cfff..2845080030b92 100644 --- a/flang/include/flang/Optimizer/Dialect/FIRAttr.td +++ b/flang/include/flang/Optimizer/Dialect/FIRAttr.td @@ -200,4 +200,23 @@ def fir_OpenMPSafeTempArrayCopyAttr : fir_Attr<"OpenMPSafeTempArrayCopy"> { }]; } +def LocalitySpecTypeLocal : I32EnumAttrCase<"Local", 0, "local">; +def LocalitySpecTypeLocalInit + : I32EnumAttrCase<"LocalInit", 1, "local_init">; + +def LocalitySpecifierType : I32EnumAttr< + "LocalitySpecifierType", + "Type of a locality specifier", [ + LocalitySpecTypeLocal, + LocalitySpecTypeLocalInit + ]> { + let genSpecializedAttr = 0; + let cppNamespace = "::fir"; +} + +def LocalitySpecifierTypeAttr : EnumAttr { + let assemblyFormat = "`{` `type` `=` $value `}`"; +} + #endif // FIR_DIALECT_FIR_ATTRS diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td index 0ba985641069b..acc0c6967c739 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.td +++ b/flang/include/flang/Optimizer/Dialect/FIROps.td @@ -3485,6 +3485,137 @@ def fir_BoxTotalElementsOp let hasCanonicalizer = 1; } +def YieldOp : fir_Op<"yield", + [Pure, ReturnLike, Terminator, + ParentOneOf<["LocalitySpecifierOp"]>]> { + let summary = "loop yield and 
termination operation"; + let description = [{ + "fir.yield" yields SSA values from a fir dialect op region and + terminates the region. The semantics of how the values are yielded is + defined by the parent operation. + }]; + + let arguments = (ins Variadic:$results); + + let builders = [ + OpBuilder<(ins), [{ build($_builder, $_state, {}); }]> + ]; + + let assemblyFormat = "( `(` $results^ `:` type($results) `)` )? attr-dict"; +} + +def fir_LocalitySpecifierOp : fir_Op<"local", [IsolatedFromAbove]> { + let summary = "Provides declaration of local and local_init logic."; + let description = [{ + This operation provides a declaration of how to implement the + localization of a variable. The dialect users should provide + which type should be allocated for this variable. The allocated (usually by + alloca) variable is passed to the initialization region which does everything + else (e.g. initialization of Fortran runtime descriptors). Information about + how to initialize the copy from the original item should be given in the + copy region, and if needed, how to deallocate memory (allocated by the + initialization region) in the dealloc region. + + Examples: + + * `local(x)` would not need any regions because no initialization is + required by the standard for i32 variables and this is not local_init. + ``` + fir.local {type = local} @x.localizer : i32 + ``` + + * `local_init(x)` would be emitted as: + ``` + fir.local {type = local_init} @x.localizer : i32 copy { + ^bb0(%arg0: !fir.ref, %arg1: !fir.ref): + // %arg0 is the original host variable. + // %arg1 represents the memory allocated for this private variable. + ... copy from host to the localized clone .... + fir.yield(%arg1 : !fir.ref) + } + ``` + + * `local(x)` for "allocatables" would be emitted as: + ``` + fir.local {type = local} @x.localizer : !some.type init { + ^bb0(%arg0: !fir.ref, %arg1: !fir.ref): + // initialize %arg1, using %arg0 as a mold for allocations. 
+ // For example if %arg0 is a heap allocated array with a runtime determined + // length and !some.type is a runtime type descriptor, the init region + // will read the array length from %arg0, and heap allocate an array of the + // right length and initialize %arg1 to contain the array allocation and + // length. + fir.yield(%arg1 : !fir.ref) + } dealloc { + ^bb0(%arg0: !fir.ref): + // ... deallocate memory allocated by the init region... + // In the example above, this will free the heap allocated array data. + fir.yield + } + ``` + + There are no restrictions on the body except for: + - The `dealloc` regions has a single argument. + - The `init` & `copy` regions have 2 arguments. + - All three regions are terminated by `fir.yield` ops. + The above restrictions and other obvious restrictions (e.g. verifying the + type of yielded values) are verified by the custom op verifier. The actual + contents of the blocks inside all regions are not verified. + + Instances of this op would then be used by ops that model directives that + accept data-sharing attribute clauses. + + The `sym_name` attribute provides a symbol by which the privatizer op can be + referenced by other dialect ops. + + The `type` attribute is the type of the value being localized. This type + will be implicitly allocated in MLIR->LLVMIR conversion and passed as the + second argument to the init region. Therefore the type of arguments to + the regions should be a type which represents a pointer to `type`. + + The `locality_specifier_type` attribute specifies whether the localized + corresponds to a `local` or a `local_init` specifier. + }]; + + let arguments = (ins SymbolNameAttr:$sym_name, + TypeAttrOf:$type, + LocalitySpecifierTypeAttr:$locality_specifier_type); + + let regions = (region AnyRegion:$init_region, + AnyRegion:$copy_region, + AnyRegion:$dealloc_region); + + let assemblyFormat = [{ + $locality_specifier_type $sym_name `:` $type + (`init` $init_region^)? + (`copy` $copy_region^)? 
+ (`dealloc` $dealloc_region^)? + attr-dict + }]; + + let builders = [ + OpBuilder<(ins CArg<"mlir::TypeRange">:$result, + CArg<"mlir::StringAttr">:$sym_name, + CArg<"mlir::TypeAttr">:$type)> + ]; + + let extraClassDeclaration = [{ + /// Get the type for arguments to nested regions. This should + /// generally be either the same as getType() or some pointer + /// type (pointing to the type allocated by this op). + /// This method will return Type{nullptr} if there are no nested + /// regions. + mlir::Type getArgType() { + for (mlir::Region *region : getRegions()) + for (mlir::Type ty : region->getArgumentTypes()) + return ty; + return nullptr; + } + }]; + + let hasRegionVerifier = 1; +} + def fir_DoConcurrentOp : fir_Op<"do_concurrent", [SingleBlock, AutomaticAllocationScope]> { let summary = "do concurrent loop wrapper"; diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index 05ef69169bae5..955acbe7018d3 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -4909,6 +4909,105 @@ void fir::BoxTotalElementsOp::getCanonicalizationPatterns( patterns.add(context); } +//===----------------------------------------------------------------------===// +// LocalitySpecifierOp +//===----------------------------------------------------------------------===// + +llvm::LogicalResult fir::LocalitySpecifierOp::verifyRegions() { + mlir::Type argType = getArgType(); + auto verifyTerminator = [&](mlir::Operation *terminator, + bool yieldsValue) -> llvm::LogicalResult { + if (!terminator->getBlock()->getSuccessors().empty()) + return llvm::success(); + + if (!llvm::isa(terminator)) + return mlir::emitError(terminator->getLoc()) + << "expected exit block terminator to be an `fir.yield` op."; + + YieldOp yieldOp = llvm::cast(terminator); + mlir::TypeRange yieldedTypes = yieldOp.getResults().getTypes(); + + if (!yieldsValue) { + if (yieldedTypes.empty()) + return llvm::success(); + + return 
mlir::emitError(terminator->getLoc()) + << "Did not expect any values to be yielded."; + } + + if (yieldedTypes.size() == 1 && yieldedTypes.front() == argType) + return llvm::success(); + + auto error = mlir::emitError(yieldOp.getLoc()) + << "Invalid yielded value. Expected type: " << argType + << ", got: "; + + if (yieldedTypes.empty()) + error << "None"; + else + error << yieldedTypes; + + return error; + }; + + auto verifyRegion = [&](mlir::Region ®ion, unsigned expectedNumArgs, + llvm::StringRef regionName, + bool yieldsValue) -> llvm::LogicalResult { + assert(!region.empty()); + + if (region.getNumArguments() != expectedNumArgs) + return mlir::emitError(region.getLoc()) + << "`" << regionName << "`: " + << "expected " << expectedNumArgs + << " region arguments, got: " << region.getNumArguments(); + + for (mlir::Block &block : region) { + // MLIR will verify the absence of the terminator for us. + if (!block.mightHaveTerminator()) + continue; + + if (failed(verifyTerminator(block.getTerminator(), yieldsValue))) + return llvm::failure(); + } + + return llvm::success(); + }; + + // Ensure all of the region arguments have the same type + for (mlir::Region *region : getRegions()) + for (mlir::Type ty : region->getArgumentTypes()) + if (ty != argType) + return emitError() << "Region argument type mismatch: got " << ty + << " expected " << argType << "."; + + mlir::Region &initRegion = getInitRegion(); + if (!initRegion.empty() && + failed(verifyRegion(getInitRegion(), /*expectedNumArgs=*/2, "init", + /*yieldsValue=*/true))) + return llvm::failure(); + + LocalitySpecifierType dsType = getLocalitySpecifierType(); + + if (dsType == LocalitySpecifierType::Local && !getCopyRegion().empty()) + return emitError("`local` specifiers do not require a `copy` region."); + + if (dsType == LocalitySpecifierType::LocalInit && getCopyRegion().empty()) + return emitError( + "`local_init` specifiers require at least a `copy` region."); + + if (dsType == 
LocalitySpecifierType::LocalInit && + failed(verifyRegion(getCopyRegion(), /*expectedNumArgs=*/2, "copy", + /*yieldsValue=*/true))) + return llvm::failure(); + + if (!getDeallocRegion().empty() && + failed(verifyRegion(getDeallocRegion(), /*expectedNumArgs=*/1, "dealloc", + /*yieldsValue=*/false))) + return llvm::failure(); + + return llvm::success(); +} + //===----------------------------------------------------------------------===// // DoConcurrentOp //===----------------------------------------------------------------------===// diff --git a/flang/test/Fir/do_concurrent.fir b/flang/test/Fir/do_concurrent.fir index 8e80ffb9c7b0b..4e55777402428 100644 --- a/flang/test/Fir/do_concurrent.fir +++ b/flang/test/Fir/do_concurrent.fir @@ -90,3 +90,22 @@ func.func @dc_2d_reduction(%i_lb: index, %i_ub: index, %i_st: index, // CHECK: fir.store %[[J_IV_CVT]] to %[[J]] : !fir.ref // CHECK: } // CHECK: } + + +fir.local {type = local} @local_privatizer : i32 + +// CHECK: fir.local {type = local} @[[LOCAL_PRIV_SYM:local_privatizer]] : i32 + +fir.local {type = local_init} @local_init_privatizer : i32 copy { +^bb0(%arg0: !fir.ref, %arg1: !fir.ref): + %0 = fir.load %arg0 : !fir.ref + fir.store %0 to %arg1 : !fir.ref + fir.yield(%arg1 : !fir.ref) +} + +// CHECK: fir.local {type = local_init} @[[LOCAL_INIT_PRIV_SYM:local_init_privatizer]] : i32 +// CHECK: ^bb0(%[[ORIG_VAL:.*]]: !fir.ref, %[[LOCAL_VAL:.*]]: !fir.ref): +// CHECK: %[[ORIG_VAL_LD:.*]] = fir.load %[[ORIG_VAL]] +// CHECK: fir.store %[[ORIG_VAL_LD]] to %[[LOCAL_VAL]] : !fir.ref +// CHECK: fir.yield(%[[LOCAL_VAL]] : !fir.ref) +// CHECK: } diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index f9f5e267dd9bc..733227339bc39 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -1,5 +1,3 @@ - - // RUN: fir-opt -split-input-file -verify-diagnostics --strict-fir-volatile-verifier %s // expected-error at +1{{custom op 'fir.string_lit' must have character type}} @@ -1311,3 +1309,79 @@ 
func.func @bad_convert_volatile6(%arg0: !fir.ref) -> !fir.ref { %0 = fir.volatile_cast %arg0 : (!fir.ref) -> !fir.ref return %0 : !fir.ref } + +// ----- + +fir.local {type = local} @x.localizer : i32 init { +^bb0(%arg0: i32, %arg1: i32): + %0 = arith.constant 0.0 : f32 + // expected-error @below {{Invalid yielded value. Expected type: 'i32', got: 'f32'}} + fir.yield(%0 : f32) +} + +// ----- + +// expected-error @below {{Region argument type mismatch: got 'f32' expected 'i32'.}} +fir.local {type = local} @x.localizer : i32 init { +^bb0(%arg0: i32, %arg1: f32): + fir.yield +} + +// ----- + +fir.local {type = local} @x.localizer : f32 init { +^bb0(%arg0: f32, %arg1: f32): + fir.yield(%arg0: f32) +} dealloc { +^bb0(%arg0: f32): + // expected-error @below {{Did not expect any values to be yielded.}} + fir.yield(%arg0 : f32) +} + +// ----- + +fir.local {type = local} @x.localizer : i32 init { +^bb0(%arg0: i32, %arg1: i32): + // expected-error @below {{expected exit block terminator to be an `fir.yield` op.}} + fir.unreachable +} + +// ----- + +// expected-error @below {{`init`: expected 2 region arguments, got: 1}} +fir.local {type = local} @x.localizer : f32 init { +^bb0(%arg0: f32): + fir.yield(%arg0 : f32) +} + +// ----- + +// expected-error @below {{`copy`: expected 2 region arguments, got: 1}} +fir.local {type = local_init} @x.privatizer : f32 copy { +^bb0(%arg0: f32): + fir.yield(%arg0 : f32) +} + +// ----- + +// expected-error @below {{`dealloc`: expected 1 region arguments, got: 2}} +fir.local {type = local} @x.localizer : f32 dealloc { +^bb0(%arg0: f32, %arg1: f32): + fir.yield +} + +// ----- + +// expected-error @below {{`local` specifiers do not require a `copy` region.}} +fir.local {type = local} @x.localizer : f32 copy { +^bb0(%arg0: f32, %arg1 : f32): + fir.yield(%arg0 : f32) +} + +// ----- + +// expected-error @below {{`local_init` specifiers require at least a `copy` region.}} +fir.local {type = local_init} @x.localizer : f32 init { +^bb0(%arg0: f32, 
%arg1: f32): + fir.yield(%arg0 : f32) +} From flang-commits at lists.llvm.org Wed May 7 05:00:14 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 07 May 2025 05:00:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add `fir.local` op for locality specifiers (PR #138505) In-Reply-To: Message-ID: <681b4b4e.170a0220.3c0b5c.9046@mx.google.com> https://github.com/ergawy closed https://github.com/llvm/llvm-project/pull/138505 From flang-commits at lists.llvm.org Wed May 7 05:00:15 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 07 May 2025 05:00:15 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add locality specifiers modeling to `fir.do_concurrent.loop` (PR #138506) In-Reply-To: Message-ID: <681b4b4f.170a0220.120fce.946e@mx.google.com> https://github.com/ergawy edited https://github.com/llvm/llvm-project/pull/138506 From flang-commits at lists.llvm.org Wed May 7 05:02:02 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 07 May 2025 05:02:02 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add locality specifiers modeling to `fir.do_concurrent.loop` (PR #138506) In-Reply-To: Message-ID: <681b4bba.170a0220.1b00bf.9606@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/138506

From flang-commits at lists.llvm.org Wed May 7 03:01:16 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Wed, 07 May 2025 03:01:16 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <681b2f6c.170a0220.3b2347.87b0@mx.google.com> https://github.com/Thirumalai-Shaktivel updated https://github.com/llvm/llvm-project/pull/119172 >From 6e491ccd80b902df6946713a372ec9667e0811c3 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Mon, 9 Dec 2024 07:37:01 +0000 Subject: [PATCH 01/16] [Flang] [OpenMP] Add semantic checks for detach clause in Task Fixes: - Add semantic checks along with the tests - Move the detach clause to allowedOnceClauses list in Task construct Restrictions: OpenMP 5.0: Task construct - At most one detach clause can appear on the directive. - If a detach clause appears on the directive, then a mergeable clause cannot appear on the same directive. OpenMP 5.2: Detach construct - If a detach clause appears on a directive, then the encountering task must not be a final task. - A variable that appears in a detach clause cannot appear as a list item on a data-environment attribute clause on the same construct. - A variable that is part of another variable (as an array element or a structure element) cannot appear in a detach clause. - event-handle must not have the POINTER attribute. 
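The restrictions listed above can be read as a handful of independent checks. The following standalone sketch models them on a simplified directive record. Everything in it is hypothetical (`TaskDirective`, `checkDetach`, the string-based test for array or structure elements) and it is not how flang's `OmpStructureChecker` represents clauses. In particular, the presence of a FINAL clause is used as a stand-in for "the encountering task is a final task", which is also the approximation the patch itself makes.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model of a TASK directive; not flang's parse tree.
struct TaskDirective {
  std::vector<std::string> detachHandles; // one entry per DETACH clause
  bool hasMergeable;                      // MERGEABLE clause present
  bool hasFinal;                          // FINAL clause present
  std::set<std::string> dataEnvVars;      // names on data-environment clauses
  std::set<std::string> pointerVars;      // names with the POINTER attribute
};

// Returns one diagnostic per violated restriction; empty when all hold.
std::vector<std::string> checkDetach(const TaskDirective &task) {
  std::vector<std::string> errors;
  if (task.detachHandles.empty())
    return errors; // no DETACH clause, nothing to check

  if (task.detachHandles.size() > 1)
    errors.push_back("at most one DETACH clause may appear");
  if (task.hasMergeable)
    errors.push_back("MERGEABLE is not allowed together with DETACH");
  if (task.hasFinal)
    errors.push_back("DETACH requires that the task is not a FINAL task");

  for (const std::string &handle : task.detachHandles) {
    // Crude textual test for an array element `a(i)` or a component `a%b`.
    if (handle.find('(') != std::string::npos ||
        handle.find('%') != std::string::npos)
      errors.push_back(handle + " is part of another variable");
    if (task.dataEnvVars.count(handle))
      errors.push_back(handle + " also appears on a data-environment clause");
    if (task.pointerVars.count(handle))
      errors.push_back(handle + " must not have the POINTER attribute");
  }
  return errors;
}
```

Each check only adds a diagnostic and never returns early, so a directive that breaks several restrictions reports all of them at once.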
--- flang/lib/Semantics/check-omp-structure.cpp | 141 +++++++++++++++----- flang/lib/Semantics/check-omp-structure.h | 2 + flang/test/Semantics/OpenMP/detach01.f90 | 65 +++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 2 +- 4 files changed, 179 insertions(+), 31 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/detach01.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 95b962f5daf57..6641e39c6e358 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2733,6 +2733,59 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { llvm::omp::Clause::OMPC_copyprivate, {llvm::omp::Clause::OMPC_nowait}); } + if (GetContext().directive == llvm::omp::Directive::OMPD_task) { + if (auto *d_clause{FindClause(llvm::omp::Clause::OMPC_detach)}) { + // OpenMP 5.0: Task construct restrictions + CheckNotAllowedIfClause( + llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); + + // OpenMP 5.2: Task construct restrictions + if (FindClause(llvm::omp::Clause::OMPC_final)) { + context_.Say(GetContext().clauseSource, + "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); + } + + const auto &detachClause{ + std::get(d_clause->u)}; + if (const auto *name{parser::Unwrap(detachClause.v.v)}) { + if (name->symbol) { + std::string eventHandleSymName{name->ToString()}; + auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList + &objs, + std::string clause) { + for (const auto &obj : objs.v) { + if (const parser::Name *objName{ + parser::Unwrap(obj)}) { + if (objName->ToString() == eventHandleSymName) { + context_.Say(GetContext().clauseSource, + "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, + eventHandleSymName, clause); + } + } + } + }; + if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_private)}) 
{ + const auto &pClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + const auto &fpClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + const auto &irClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause( + std::get(irClause.v.t), "IN_REDUCTION"); + } + } + } + } + } + auto testThreadprivateVarErr = [&](Symbol sym, parser::Name name, llvmOmpClause clauseTy) { if (sym.test(Symbol::Flag::OmpThreadprivate)) @@ -2815,7 +2868,6 @@ CHECK_SIMPLE_CLAUSE(Capture, OMPC_capture) CHECK_SIMPLE_CLAUSE(Contains, OMPC_contains) CHECK_SIMPLE_CLAUSE(Default, OMPC_default) CHECK_SIMPLE_CLAUSE(Depobj, OMPC_depobj) -CHECK_SIMPLE_CLAUSE(Detach, OMPC_detach) CHECK_SIMPLE_CLAUSE(DeviceType, OMPC_device_type) CHECK_SIMPLE_CLAUSE(DistSchedule, OMPC_dist_schedule) CHECK_SIMPLE_CLAUSE(Exclusive, OMPC_exclusive) @@ -3386,40 +3438,45 @@ void OmpStructureChecker::CheckIsVarPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause) { for (const auto &ompObject : objList.v) { - common::visit( - common::visitors{ - [&](const parser::Designator &designator) { - if (const auto *dataRef{ - std::get_if(&designator.u)}) { - if (IsDataRefTypeParamInquiry(dataRef)) { + CheckIsVarPartOfAnotherVar(source, ompObject, clause); + } +} + +void OmpStructureChecker::CheckIsVarPartOfAnotherVar( + const parser::CharBlock &source, const parser::OmpObject &ompObject, + llvm::StringRef clause) { + common::visit( + common::visitors{ + [&](const parser::Designator &designator) { + if (const auto *dataRef{ + std::get_if(&designator.u)}) { + if (IsDataRefTypeParamInquiry(dataRef)) { + context_.Say(source, + "A type parameter inquiry cannot appear on the %s " + "directive"_err_en_US, 
+ ContextDirectiveAsFortran()); + } else if (parser::Unwrap( + ompObject) || + parser::Unwrap(ompObject)) { + if (llvm::omp::nonPartialVarSet.test(GetContext().directive)) { context_.Say(source, - "A type parameter inquiry cannot appear on the %s " + "A variable that is part of another variable (as an " + "array or structure element) cannot appear on the %s " "directive"_err_en_US, ContextDirectiveAsFortran()); - } else if (parser::Unwrap( - ompObject) || - parser::Unwrap(ompObject)) { - if (llvm::omp::nonPartialVarSet.test( - GetContext().directive)) { - context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear on the %s " - "directive"_err_en_US, - ContextDirectiveAsFortran()); - } else { - context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear in a " - "%s clause"_err_en_US, - clause.data()); - } + } else { + context_.Say(source, + "A variable that is part of another variable (as an " + "array or structure element) cannot appear in a " + "%s clause"_err_en_US, + clause.data()); } } - }, - [&](const parser::Name &name) {}, - }, - ompObject.u); - } + } + }, + [&](const parser::Name &name) {}, + }, + ompObject.u); } void OmpStructureChecker::Enter(const parser::OmpClause::Firstprivate &x) { @@ -3746,6 +3803,30 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } } +void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { + // OpenMP 5.0: Task construct restrictions + CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + + // OpenMP 5.2: Detach clause restrictions + CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); + if (const auto *name{parser::Unwrap(x.v.v)}) { + if (name->symbol) { + if (IsPointer(*name->symbol)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, + name->ToString()); + } + } + auto 
type{name->symbol->GetType()}; + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer) || + evaluate::ToInt64(type->numericTypeSpec().kind()) != 8) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, + name->ToString()); + } + } +} + void OmpStructureChecker::CheckAllowedMapTypes( const parser::OmpMapType::Value &type, const std::list &allowedMapTypeList) { diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 346a7bed9138f..a8f94992ff091 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -186,6 +186,8 @@ class OmpStructureChecker const common::Indirection &, const parser::Name &); void CheckDoacross(const parser::OmpDoacross &doa); bool IsDataRefTypeParamInquiry(const parser::DataRef *dataRef); + void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, + const parser::OmpObject &obj, llvm::StringRef clause = ""); void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause = ""); void CheckThreadprivateOrDeclareTargetVar( diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 new file mode 100644 index 0000000000000..e342fcd1b19b4 --- /dev/null +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -0,0 +1,65 @@ +! REQUIRES: openmp_runtime +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=50 + +! OpenMP Version 5.2 +! 
Various checks for DETACH Clause (12.5.2) + +program test_detach + use omp_lib + implicit none + integer :: e, x + integer(omp_event_handle_kind) :: event_01, event_02(2) + integer(omp_event_handle_kind), pointer :: event_03 + + + type :: t + integer(omp_event_handle_kind) :: event + end type + + type(t) :: t_01 + + !ERROR: The event-handle: `e` must be of type integer(kind=omp_event_handle_kind) + !$omp task detach(e) + x = x + 1 + !$omp end task + + !ERROR: At most one DETACH clause can appear on the TASK directive + !$omp task detach(event_01) detach(event_01) + x = x + 1 + !$omp end task + + !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive + !$omp task detach(event_01) mergeable + x = x + 1 + !$omp end task + + !ERROR: If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task + !$omp task detach(event_01) final(.false.) + x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on PRIVATE clause on the same construct + !$omp task detach(event_01) private(event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on IN_REDUCTION clause on the same construct + !$omp task detach(event_01) in_reduction(+:event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable that is part of another variable (as an array or structure element) cannot appear in a DETACH clause + !$omp task detach(event_02(1)) + x = x + 1 + !$omp end task + + !ERROR: A variable that is part of another variable (as an array or structure element) cannot appear in a DETACH clause + !$omp task detach(t_01%event) + x = x + 1 + !$omp end task + + !ERROR: The event-handle: `event_03` must not have the POINTER attribute + !$omp task detach(event_03) + x = x + 1 + !$omp end task +end program diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index e36eb77cefe7e..aec80decf6039 
100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1090,7 +1090,6 @@ def OMP_Task : Directive<"task"> { VersionedClause, VersionedClause, VersionedClause, - VersionedClause, VersionedClause, VersionedClause, VersionedClause, @@ -1100,6 +1099,7 @@ def OMP_Task : Directive<"task"> { ]; let allowedOnceClauses = [ VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, >From fa2b051e724898f3be9b2d74d218816088b3975f Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 05:51:11 +0000 Subject: [PATCH 02/16] Do not check for interger kind --- flang/lib/Semantics/check-omp-structure.cpp | 4 +--- flang/test/Semantics/OpenMP/detach01.f90 | 4 ++-- 2 files changed, 3 insertions(+), 5 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 6641e39c6e358..e2f897f5c9246 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3817,9 +3817,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { name->ToString()); } } - auto type{name->symbol->GetType()}; - if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer) || - evaluate::ToInt64(type->numericTypeSpec().kind()) != 8) { + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { context_.Say(GetContext().clauseSource, "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, name->ToString()); diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index e342fcd1b19b4..7ba2888be9237 100644 --- a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -7,8 +7,8 @@ program test_detach use omp_lib implicit none - integer :: e, x - integer(omp_event_handle_kind) :: event_01, event_02(2) + real :: e, x + integer(omp_event_handle_kind) :: event_01, event_02(2) 
integer(omp_event_handle_kind), pointer :: event_03 >From 569712fc9b14f3a1f4b3a7d2c55febfe216f5e7f Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 06:14:26 +0000 Subject: [PATCH 03/16] Fix snake_case --- flang/lib/Semantics/check-omp-structure.cpp | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index e2f897f5c9246..aa19b71e69c62 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2734,7 +2734,7 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { } if (GetContext().directive == llvm::omp::Directive::OMPD_task) { - if (auto *d_clause{FindClause(llvm::omp::Clause::OMPC_detach)}) { + if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { // OpenMP 5.0: Task construct restrictions CheckNotAllowedIfClause( llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); @@ -2745,9 +2745,9 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); } - const auto &detachClause{ - std::get(d_clause->u)}; - if (const auto *name{parser::Unwrap(detachClause.v.v)}) { + const auto &detach{ + std::get(detachClause->u)}; + if (const auto *name{parser::Unwrap(detach.v.v)}) { if (name->symbol) { std::string eventHandleSymName{name->ToString()}; auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList >From 42cc3e75d041b9935b38e19cb70d7b3273f4087f Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 06:36:11 +0000 Subject: [PATCH 04/16] Compare symbols instead of name --- flang/lib/Semantics/check-omp-structure.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 
aa19b71e69c62..084dbf48c5c50 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2749,17 +2749,17 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { std::get(detachClause->u)}; if (const auto *name{parser::Unwrap(detach.v.v)}) { if (name->symbol) { - std::string eventHandleSymName{name->ToString()}; + Symbol *eventHandleSym{name->symbol->GetUltimate()}; auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList &objs, std::string clause) { for (const auto &obj : objs.v) { if (const parser::Name *objName{ parser::Unwrap(obj)}) { - if (objName->ToString() == eventHandleSymName) { + if (objName->symbol->GetUltimate() == eventHandleSym) { context_.Say(GetContext().clauseSource, "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, - eventHandleSymName, clause); + objName->source, clause); } } } >From a3c6a4517354991dc5f166d57d14012571b3f553 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 09:09:15 +0000 Subject: [PATCH 05/16] Add OpenMP version based checks --- flang/lib/Semantics/check-omp-structure.cpp | 105 ++++++++++---------- flang/test/Semantics/OpenMP/detach01.f90 | 22 +--- flang/test/Semantics/OpenMP/detach02.f90 | 22 ++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 2 +- 4 files changed, 83 insertions(+), 68 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/detach02.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 084dbf48c5c50..41ed858ed650f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2735,51 +2735,54 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { if (GetContext().directive == llvm::omp::Directive::OMPD_task) { if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { - // OpenMP 5.0: Task construct restrictions - 
CheckNotAllowedIfClause( - llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); - - // OpenMP 5.2: Task construct restrictions - if (FindClause(llvm::omp::Clause::OMPC_final)) { - context_.Say(GetContext().clauseSource, - "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); - } + unsigned version{context_.langOptions().OpenMPVersion}; + if (version == 50 || version == 51) { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckNotAllowedIfClause( + llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); + } else if (version >= 52) { + // OpenMP 5.2: 12.5.2 Detach construct restrictions + if (FindClause(llvm::omp::Clause::OMPC_final)) { + context_.Say(GetContext().clauseSource, + "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); + } - const auto &detach{ - std::get(detachClause->u)}; - if (const auto *name{parser::Unwrap(detach.v.v)}) { - if (name->symbol) { - Symbol *eventHandleSym{name->symbol->GetUltimate()}; - auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList - &objs, - std::string clause) { - for (const auto &obj : objs.v) { - if (const parser::Name *objName{ - parser::Unwrap(obj)}) { - if (objName->symbol->GetUltimate() == eventHandleSym) { - context_.Say(GetContext().clauseSource, - "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, - objName->source, clause); + const auto &detach{ + std::get(detachClause->u)}; + if (const auto *name{parser::Unwrap(detach.v.v)}) { + if (name->symbol) { + Symbol *eventHandleSym{name->symbol}; + auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList + &objs, + std::string clause) { + for (const auto &obj : objs.v) { + if (const parser::Name *objName{ + parser::Unwrap(obj)}) { + if (&objName->symbol->GetUltimate() == eventHandleSym) { + context_.Say(GetContext().clauseSource, + "A 
variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, + objName->source, clause); + } } } + }; + if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_private)}) { + const auto &pClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + const auto &fpClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + const auto &irClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause( + std::get(irClause.v.t), "IN_REDUCTION"); } - }; - if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_private)}) { - const auto &pClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { - const auto &fpClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { - const auto &irClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause( - std::get(irClause.v.t), "IN_REDUCTION"); } } } @@ -3804,23 +3807,25 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { - // OpenMP 5.0: Task construct restrictions CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + unsigned version{context_.langOptions().OpenMPVersion}; + // OpenMP 5.2: 12.5.2 Detach clause restrictions + if (version >= 52) { + CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); + } - // OpenMP 5.2: Detach clause restrictions - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); if (const auto 
*name{parser::Unwrap(x.v.v)}) { if (name->symbol) { - if (IsPointer(*name->symbol)) { + if (version >= 52 && IsPointer(*name->symbol)) { context_.Say(GetContext().clauseSource, "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, name->ToString()); } - } - if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { - context_.Say(GetContext().clauseSource, - "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, - name->ToString()); + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, + name->ToString()); + } } } } diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index 7ba2888be9237..ea8208c022ef1 100644 --- a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -1,21 +1,19 @@ ! REQUIRES: openmp_runtime -! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=50 +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=52 -! OpenMP Version 5.2 -! Various checks for DETACH Clause (12.5.2) +! OpenMP Version 5.2: 12.5.2 +! 
Various checks for DETACH Clause -program test_detach - use omp_lib +program detach01 + use omp_lib, only: omp_event_handle_kind implicit none real :: e, x integer(omp_event_handle_kind) :: event_01, event_02(2) integer(omp_event_handle_kind), pointer :: event_03 - type :: t integer(omp_event_handle_kind) :: event end type - type(t) :: t_01 !ERROR: The event-handle: `e` must be of type integer(kind=omp_event_handle_kind) @@ -23,16 +21,6 @@ program test_detach x = x + 1 !$omp end task - !ERROR: At most one DETACH clause can appear on the TASK directive - !$omp task detach(event_01) detach(event_01) - x = x + 1 - !$omp end task - - !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive - !$omp task detach(event_01) mergeable - x = x + 1 - !$omp end task - !ERROR: If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task !$omp task detach(event_01) final(.false.) x = x + 1 diff --git a/flang/test/Semantics/OpenMP/detach02.f90 b/flang/test/Semantics/OpenMP/detach02.f90 new file mode 100644 index 0000000000000..1304233976351 --- /dev/null +++ b/flang/test/Semantics/OpenMP/detach02.f90 @@ -0,0 +1,22 @@ +! REQUIRES: openmp_runtime +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=50 +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=51 + +! OpenMP Version 5.0: 2.10.1 +! Various checks for DETACH Clause + +program detach02 + use omp_lib, only: omp_event_handle_kind + integer(omp_event_handle_kind) :: event_01, event_02 + + !TODO: Throw following error for the versions 5.0 and 5.1 + !ERR: At most one DETACH clause can appear on the TASK directive + !!$omp task detach(event_01) detach(event_02) + ! 
x = x + 1 + !!$omp end task + + !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive + !$omp task detach(event_01) mergeable + x = x + 1 + !$omp end task +end program diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index aec80decf6039..e36eb77cefe7e 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1090,6 +1090,7 @@ def OMP_Task : Directive<"task"> { VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, @@ -1099,7 +1100,6 @@ def OMP_Task : Directive<"task"> { ]; let allowedOnceClauses = [ VersionedClause, - VersionedClause, VersionedClause, VersionedClause, VersionedClause, >From 019a24c814b226352a13c18b840cc22a9b28ec7e Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Thu, 26 Dec 2024 09:26:33 +0000 Subject: [PATCH 06/16] Fix formatting --- flang/lib/Semantics/check-omp-structure.cpp | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 41ed858ed650f..7ca695acc74b3 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2738,8 +2738,8 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { unsigned version{context_.langOptions().OpenMPVersion}; if (version == 50 || version == 51) { // OpenMP 5.0: 2.10.1 Task construct restrictions - CheckNotAllowedIfClause( - llvm::omp::Clause::OMPC_detach, {llvm::omp::Clause::OMPC_mergeable}); + CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_detach, + {llvm::omp::Clause::OMPC_mergeable}); } else if (version >= 52) { // OpenMP 5.2: 12.5.2 Detach construct restrictions if (FindClause(llvm::omp::Clause::OMPC_final)) { @@ -2752,12 +2752,12 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { if (const auto 
*name{parser::Unwrap(detach.v.v)}) { if (name->symbol) { Symbol *eventHandleSym{name->symbol}; - auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList - &objs, + auto checkVarAppearsInDataEnvClause = [&](const parser:: + OmpObjectList &objs, std::string clause) { for (const auto &obj : objs.v) { - if (const parser::Name *objName{ - parser::Unwrap(obj)}) { + if (const parser::Name * + objName{parser::Unwrap(obj)}) { if (&objName->symbol->GetUltimate() == eventHandleSym) { context_.Say(GetContext().clauseSource, "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, @@ -2772,16 +2772,17 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { const auto &fpClause{ std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { const auto &irClause{ std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause( - std::get(irClause.v.t), "IN_REDUCTION"); + std::get(irClause.v.t), + "IN_REDUCTION"); } } } >From 0348571f198063b2358d52435cd0c62c14b89ef9 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 7 Mar 2025 05:03:09 +0000 Subject: [PATCH 07/16] Allow detach clause only once --- flang/lib/Semantics/check-omp-structure.cpp | 3 ++- flang/test/Semantics/OpenMP/detach02.f90 | 9 ++++----- llvm/include/llvm/Frontend/OpenMP/OMP.td | 2 +- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 7ca695acc74b3..c6a4c2472c4af 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -3808,9 +3808,10 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { + // OpenMP 5.0: 2.10.1 Task construct restrictions CheckAllowedClause(llvm::omp::Clause::OMPC_detach); - unsigned version{context_.langOptions().OpenMPVersion}; // OpenMP 5.2: 12.5.2 Detach clause restrictions + unsigned version{context_.langOptions().OpenMPVersion}; if (version >= 52) { CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); } diff --git a/flang/test/Semantics/OpenMP/detach02.f90 b/flang/test/Semantics/OpenMP/detach02.f90 index 1304233976351..49d80358fcdb6 100644 --- a/flang/test/Semantics/OpenMP/detach02.f90 +++ b/flang/test/Semantics/OpenMP/detach02.f90 @@ -9,11 +9,10 @@ program detach02 use omp_lib, only: omp_event_handle_kind integer(omp_event_handle_kind) :: event_01, event_02 - !TODO: Throw following error for the versions 5.0 and 5.1 - !ERR: At most one DETACH clause can appear on the TASK directive - !!$omp task detach(event_01) detach(event_02) - ! 
x = x + 1 - !!$omp end task + !ERROR: At most one DETACH clause can appear on the TASK directive + !$omp task detach(event_01) detach(event_02) + x = x + 1 + !$omp end task !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive !$omp task detach(event_01) mergeable diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index e36eb77cefe7e..aec80decf6039 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1090,7 +1090,6 @@ def OMP_Task : Directive<"task"> { VersionedClause, VersionedClause, VersionedClause, - VersionedClause, VersionedClause, VersionedClause, VersionedClause, @@ -1100,6 +1099,7 @@ def OMP_Task : Directive<"task"> { ]; let allowedOnceClauses = [ VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, >From 2c09c87ffd454186d9bd335e09c9bf0cd459c153 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 7 Mar 2025 05:06:18 +0000 Subject: [PATCH 08/16] Add a test for openmp 52 --- flang/test/Semantics/OpenMP/detach01.f90 | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index ea8208c022ef1..8f19dfc1f92a7 100644 --- a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -50,4 +50,9 @@ program detach01 !$omp task detach(event_03) x = x + 1 !$omp end task + + !ERROR: At most one DETACH clause can appear on the TASK directive + !$omp task detach(event_01) detach(event_02) + x = x + 1 + !$omp end task end program >From 6fc96f7795546aa0771f90d38e83cf3b2fffe32c Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:51:19 +0000 Subject: [PATCH 09/16] Rename a function name --- flang/lib/Semantics/check-omp-structure.cpp | 32 ++++++++++----------- flang/lib/Semantics/check-omp-structure.h | 4 +-- 2 files changed, 18 insertions(+), 18 deletions(-) diff 
--git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 346c11e9765b7..012529af0767d 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -1605,7 +1605,7 @@ void OmpStructureChecker::Leave(const parser::OpenMPThreadprivate &c) { const auto &dir{std::get(c.t)}; const auto &objectList{std::get(c.t)}; CheckSymbolNames(dir.source, objectList); - CheckIsVarPartOfAnotherVar(dir.source, objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, objectList); CheckThreadprivateOrDeclareTargetVar(objectList); dirContext_.pop_back(); } @@ -1736,7 +1736,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPDeclarativeAllocate &x) { for (const auto &clause : clauseList.v) { CheckAlignValue(clause); } - CheckIsVarPartOfAnotherVar(dir.source, objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, objectList); } void OmpStructureChecker::Leave(const parser::OpenMPDeclarativeAllocate &x) { @@ -1902,7 +1902,7 @@ void OmpStructureChecker::Leave(const parser::OpenMPDeclareTargetConstruct &x) { if (const auto *objectList{parser::Unwrap(spec.u)}) { deviceConstructFound_ = true; CheckSymbolNames(dir.source, *objectList); - CheckIsVarPartOfAnotherVar(dir.source, *objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, *objectList); CheckThreadprivateOrDeclareTargetVar(*objectList); } else if (const auto *clauseList{ parser::Unwrap(spec.u)}) { @@ -1915,18 +1915,18 @@ void OmpStructureChecker::Leave(const parser::OpenMPDeclareTargetConstruct &x) { toClauseFound = true; auto &objList{std::get(toClause.v.t)}; CheckSymbolNames(dir.source, objList); - CheckIsVarPartOfAnotherVar(dir.source, objList); + CheckVarIsNotPartOfAnotherVar(dir.source, objList); CheckThreadprivateOrDeclareTargetVar(objList); }, [&](const parser::OmpClause::Link &linkClause) { CheckSymbolNames(dir.source, linkClause.v); - CheckIsVarPartOfAnotherVar(dir.source, linkClause.v); + 
CheckVarIsNotPartOfAnotherVar(dir.source, linkClause.v); CheckThreadprivateOrDeclareTargetVar(linkClause.v); }, [&](const parser::OmpClause::Enter &enterClause) { enterClauseFound = true; CheckSymbolNames(dir.source, enterClause.v); - CheckIsVarPartOfAnotherVar(dir.source, enterClause.v); + CheckVarIsNotPartOfAnotherVar(dir.source, enterClause.v); CheckThreadprivateOrDeclareTargetVar(enterClause.v); }, [&](const parser::OmpClause::DeviceType &deviceTypeClause) { @@ -2009,7 +2009,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPExecutableAllocate &x) { CheckAlignValue(clause); } if (objectList) { - CheckIsVarPartOfAnotherVar(dir.source, *objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, *objectList); } } @@ -2029,7 +2029,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPAllocatorsConstruct &x) { for (const auto &clause : clauseList.v) { if (const auto *allocClause{ parser::Unwrap(clause)}) { - CheckIsVarPartOfAnotherVar( + CheckVarIsNotPartOfAnotherVar( dir.source, std::get(allocClause->v.t)); } } @@ -3791,14 +3791,14 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Ordered &x) { void OmpStructureChecker::Enter(const parser::OmpClause::Shared &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_shared); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "SHARED"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "SHARED"); CheckCrayPointee(x.v, "SHARED"); } void OmpStructureChecker::Enter(const parser::OmpClause::Private &x) { SymbolSourceMap symbols; GetSymbolsInObjectList(x.v, symbols); CheckAllowedClause(llvm::omp::Clause::OMPC_private); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "PRIVATE"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "PRIVATE"); CheckIntentInPointer(symbols, llvm::omp::Clause::OMPC_private); CheckCrayPointee(x.v, "PRIVATE"); } @@ -3827,15 +3827,15 @@ bool OmpStructureChecker::IsDataRefTypeParamInquiry( return dataRefIsTypeParamInquiry; } -void 
OmpStructureChecker::CheckIsVarPartOfAnotherVar( +void OmpStructureChecker::CheckVarIsNotPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause) { for (const auto &ompObject : objList.v) { - CheckIsVarPartOfAnotherVar(source, ompObject, clause); + CheckVarIsNotPartOfAnotherVar(source, ompObject, clause); } } -void OmpStructureChecker::CheckIsVarPartOfAnotherVar( +void OmpStructureChecker::CheckVarIsNotPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObject &ompObject, llvm::StringRef clause) { common::visit( @@ -3875,7 +3875,7 @@ void OmpStructureChecker::CheckIsVarPartOfAnotherVar( void OmpStructureChecker::Enter(const parser::OmpClause::Firstprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_firstprivate); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "FIRSTPRIVATE"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "FIRSTPRIVATE"); CheckCrayPointee(x.v, "FIRSTPRIVATE"); CheckIsLoopIvPartOfClause(llvmOmpClause::OMPC_firstprivate, x.v); @@ -4204,7 +4204,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { // OpenMP 5.2: 12.5.2 Detach clause restrictions unsigned version{context_.langOptions().OpenMPVersion}; if (version >= 52) { - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); } if (const auto *name{parser::Unwrap(x.v.v)}) { @@ -4571,7 +4571,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Lastprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_lastprivate); const auto &objectList{std::get(x.v.t)}; - CheckIsVarPartOfAnotherVar( + CheckVarIsNotPartOfAnotherVar( GetContext().clauseSource, objectList, "LASTPRIVATE"); CheckCrayPointee(objectList, "LASTPRIVATE"); diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 1517f89f476f4..be420332d491c 100644 --- 
a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -231,9 +231,9 @@ class OmpStructureChecker const common::Indirection &, const parser::Name &); void CheckDoacross(const parser::OmpDoacross &doa); bool IsDataRefTypeParamInquiry(const parser::DataRef *dataRef); - void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, + void CheckVarIsNotPartOfAnotherVar(const parser::CharBlock &source, const parser::OmpObject &obj, llvm::StringRef clause = ""); - void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, + void CheckVarIsNotPartOfAnotherVar(const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause = ""); void CheckThreadprivateOrDeclareTargetVar( const parser::OmpObjectList &objList); >From 59075c12dd380f607874a4af3e9c0fc0cc67efd6 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:54:18 +0000 Subject: [PATCH 10/16] Remove the check for symbol's nullptr --- flang/lib/Semantics/check-omp-structure.cpp | 79 ++++++++++----------- 1 file changed, 37 insertions(+), 42 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 012529af0767d..e98536fd506cb 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3163,40 +3163,37 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { const auto &detach{ std::get(detachClause->u)}; if (const auto *name{parser::Unwrap(detach.v.v)}) { - if (name->symbol) { - Symbol *eventHandleSym{name->symbol}; - auto checkVarAppearsInDataEnvClause = [&](const parser:: - OmpObjectList &objs, - std::string clause) { - for (const auto &obj : objs.v) { - if (const parser::Name * - objName{parser::Unwrap(obj)}) { - if (&objName->symbol->GetUltimate() == eventHandleSym) { - context_.Say(GetContext().clauseSource, - "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the 
same construct"_err_en_US, - objName->source, clause); - } + Symbol *eventHandleSym{name->symbol}; + auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList + &objs, + std::string clause) { + for (const auto &obj : objs.v) { + if (const parser::Name *objName{ + parser::Unwrap(obj)}) { + if (&objName->symbol->GetUltimate() == eventHandleSym) { + context_.Say(GetContext().clauseSource, + "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, + objName->source, clause); } } - }; - if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_private)}) { - const auto &pClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { - const auto &fpClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); - } else if (auto *dataEnvClause{ - FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { - const auto &irClause{ - std::get(dataEnvClause->u)}; - checkVarAppearsInDataEnvClause( - std::get(irClause.v.t), - "IN_REDUCTION"); } + }; + if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_private)}) { + const auto &pClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + const auto &fpClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + const auto &irClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause( + std::get(irClause.v.t), "IN_REDUCTION"); } } } @@ -4208,17 +4205,15 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { } if (const auto *name{parser::Unwrap(x.v.v)}) { - if (name->symbol) { - if (version >= 52 && 
IsPointer(*name->symbol)) { - context_.Say(GetContext().clauseSource, - "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, - name->ToString()); - } - if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { - context_.Say(GetContext().clauseSource, - "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, - name->ToString()); - } + if (version >= 52 && IsPointer(*name->symbol)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, + name->ToString()); + } + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, + name->ToString()); } } } >From 965df852abd5ffa1735e19491c0df1b04f58470d Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:54:40 +0000 Subject: [PATCH 11/16] CheckAllowedClause for OpenMP 50 and 51 --- flang/lib/Semantics/check-omp-structure.cpp | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index e98536fd506cb..cb0e06ed6aa32 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4196,10 +4196,14 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { - // OpenMP 5.0: 2.10.1 Task construct restrictions - CheckAllowedClause(llvm::omp::Clause::OMPC_detach); - // OpenMP 5.2: 12.5.2 Detach clause restrictions unsigned version{context_.langOptions().OpenMPVersion}; + if (version >= 52) { + SetContextClauseInfo(llvm::omp::Clause::OMPC_detach); + } else { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + } + // OpenMP 5.2: 12.5.2 Detach clause restrictions 
if (version >= 52) { CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); } >From d13d267f8d8a5641c4556b98b9e4eea119f21adf Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:55:07 +0000 Subject: [PATCH 12/16] Fix the error message string to be in a single line --- flang/lib/Semantics/check-omp-structure.cpp | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index cb0e06ed6aa32..06437dc7a8588 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3842,23 +3842,18 @@ void OmpStructureChecker::CheckVarIsNotPartOfAnotherVar( std::get_if(&designator.u)}) { if (IsDataRefTypeParamInquiry(dataRef)) { context_.Say(source, - "A type parameter inquiry cannot appear on the %s " - "directive"_err_en_US, + "A type parameter inquiry cannot appear on the %s directive"_err_en_US, ContextDirectiveAsFortran()); } else if (parser::Unwrap( ompObject) || parser::Unwrap(ompObject)) { if (llvm::omp::nonPartialVarSet.test(GetContext().directive)) { context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear on the %s " - "directive"_err_en_US, + "A variable that is part of another variable (as an array or structure element) cannot appear on the %s directive"_err_en_US, ContextDirectiveAsFortran()); } else { context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear in a " - "%s clause"_err_en_US, + "A variable that is part of another variable (as an array or structure element) cannot appear in a %s clause"_err_en_US, clause.data()); } } >From 573580e5526d7981fb93077b5fd05b7a3547ec85 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:55:40 +0000 Subject: [PATCH 13/16] Add check for shared clause as well --- 
flang/lib/Semantics/check-omp-structure.cpp | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 06437dc7a8588..3223652a27187 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,6 +3183,11 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { const auto &pClause{ std::get(dataEnvClause->u)}; checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_shared)}) { + const auto &sClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(sClause.v, "SHARED"); } else if (auto *dataEnvClause{ FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { const auto &fpClause{ >From 5d9be322e9577b4ac7ba6b3b8b30336f2693b588 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 11:58:27 +0000 Subject: [PATCH 14/16] Remove a test --- flang/test/Semantics/OpenMP/detach01.f90 | 5 ----- 1 file changed, 5 deletions(-) diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index 8f19dfc1f92a7..ea8208c022ef1 100644 --- a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -50,9 +50,4 @@ program detach01 !$omp task detach(event_03) x = x + 1 !$omp end task - - !ERROR: At most one DETACH clause can appear on the TASK directive - !$omp task detach(event_01) detach(event_02) - x = x + 1 - !$omp end task end program >From 47fbdc4a08a30cfb70545bb1adf1f72a7c67bbd9 Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Fri, 2 May 2025 12:10:06 +0000 Subject: [PATCH 15/16] Add some more tests --- flang/test/Semantics/OpenMP/detach01.f90 | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 index ea8208c022ef1..7729c85ea1128 100644 --- 
a/flang/test/Semantics/OpenMP/detach01.f90 +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -31,6 +31,16 @@ program detach01 x = x + 1 !$omp end task + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on FIRSTPRIVATE clause on the same construct + !$omp task detach(event_01) firstprivate(event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on SHARED clause on the same construct + !$omp task detach(event_01) shared(event_01) + x = x + 1 + !$omp end task + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on IN_REDUCTION clause on the same construct !$omp task detach(event_01) in_reduction(+:event_01) x = x + 1 >From 596d3266902ca85fe7cd92fafa42701b0f34a10a Mon Sep 17 00:00:00 2001 From: Thirumalai-Shaktivel Date: Wed, 7 May 2025 10:00:30 +0000 Subject: [PATCH 16/16] Fix clang-format --- flang/lib/Semantics/check-omp-structure.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 3223652a27187..025b0ea4ec66d 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3168,8 +3168,8 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { &objs, std::string clause) { for (const auto &obj : objs.v) { - if (const parser::Name *objName{ - parser::Unwrap(obj)}) { + if (const parser::Name * + objName{parser::Unwrap(obj)}) { if (&objName->symbol->GetUltimate() == eventHandleSym) { context_.Say(GetContext().clauseSource, "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, From flang-commits at lists.llvm.org Wed May 7 06:50:24 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 06:50:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang][driver] do not crash when fc1 process multiple 
files (PR #138875) Message-ID:

https://github.com/jeanPerier created https://github.com/llvm/llvm-project/pull/138875

This is a fix for https://github.com/llvm/llvm-project/issues/137126, which turned out to be a driver issue. The FrontendActions have a loop to process multiple input files, and `flang -fc1` accepts multiple files, but the semantics, lowering, and LLVM codegen actions were not re-entrant.

- The SemanticsContext cannot be reused between two files since it holds the scope, so symbols will clash.
- The MLIR and LLVM modules need to be reset before overriding the previous contexts; otherwise their later deletion will use the old context and cause a use-after-free.

**I am unsure whether processing multiple input files in a single -fc1 command is intended/a feature.** (Flang invokes one `flang -fc1` process per input file, hence the issue does not show up with `flang -c foo.f90 bar.f90`.) I think my fix is fragile (it is hard to keep track of where all the unique pointers are set/deleted, and how they depend on each other), and maybe it would be better to prevent multiple inputs to fc1. @banach-space, what was the intention?
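The ordering constraint described above can be illustrated with a minimal, self-contained C++ sketch. `Context`, `Module`, `Action`, and `liveModulesAfter` are hypothetical stand-ins invented for illustration; they are not the flang, MLIR, or LLVM classes. The sketch only assumes that a module's destructor touches the context it was created in, as the description says.

```cpp
#include <memory>

// Hypothetical stand-in for a context object (not llvm::LLVMContext).
struct Context {
  int liveModules = 0;
};

// Hypothetical stand-in for a module that refers back to its context.
struct Module {
  explicit Module(Context &c) : ctx(c) { ++ctx.liveModules; }
  // The destructor touches the context, so the context must outlive the module.
  ~Module() { --ctx.liveModules; }
  Context &ctx;
};

// Hypothetical action object that is reused for several input files.
struct Action {
  // Declared before `mod`, so it is destroyed after `mod`.
  std::unique_ptr<Context> ctx;
  std::unique_ptr<Module> mod;

  // Called once per input file.
  void beginSourceFile() {
    // Drop the old module first, while its context is still alive.
    mod.reset();
    // Only now is it safe to replace the context.
    ctx = std::make_unique<Context>();
    mod = std::make_unique<Module>(*ctx);
  }
};

// Runs the action over `files` inputs and reports how many modules the
// final context still tracks.
int liveModulesAfter(int files) {
  Action action;
  for (int i = 0; i < files; ++i)
    action.beginSourceFile();
  return action.ctx ? action.ctx->liveModules : 0;
}
```

Calling `liveModulesAfter(3)` models three input files handled by one action object. If the `mod.reset()` call were removed, replacing `ctx` would free the old context first, and the old module's destructor would then run against freed memory when `mod` is reassigned, which is the use-after-free pattern the description refers to.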
>From 216ec3f3d243eb5d8e18d1dac392da093ec91cee Mon Sep 17 00:00:00 2001 From: Jean Perier Date: Wed, 7 May 2025 06:40:41 -0700 Subject: [PATCH] [flang][driver] do not crash when fc1 process multiple files --- flang/include/flang/Frontend/CompilerInstance.h | 6 ++++++ flang/lib/Frontend/CompilerInstance.cpp | 2 -- flang/lib/Frontend/FrontendAction.cpp | 2 +- flang/lib/Frontend/FrontendActions.cpp | 7 +++++++ flang/test/Driver/multiple-fc1-input.f90 | 9 +++++++++ 5 files changed, 23 insertions(+), 3 deletions(-) create mode 100644 flang/test/Driver/multiple-fc1-input.f90 diff --git a/flang/include/flang/Frontend/CompilerInstance.h b/flang/include/flang/Frontend/CompilerInstance.h index e37ef5e236871..4ad95c9df42d7 100644 --- a/flang/include/flang/Frontend/CompilerInstance.h +++ b/flang/include/flang/Frontend/CompilerInstance.h @@ -147,6 +147,12 @@ class CompilerInstance { /// @name Semantic analysis /// { + Fortran::semantics::SemanticsContext &createNewSemanticsContext() { + semaContext = + getInvocation().getSemanticsCtx(*allCookedSources, getTargetMachine()); + return *semaContext; + } + Fortran::semantics::SemanticsContext &getSemanticsContext() { return *semaContext; } diff --git a/flang/lib/Frontend/CompilerInstance.cpp b/flang/lib/Frontend/CompilerInstance.cpp index f7ed969f03bf4..cbd2c58eeeb47 100644 --- a/flang/lib/Frontend/CompilerInstance.cpp +++ b/flang/lib/Frontend/CompilerInstance.cpp @@ -162,8 +162,6 @@ bool CompilerInstance::executeAction(FrontendAction &act) { allSources->set_encoding(invoc.getFortranOpts().encoding); if (!setUpTargetMachine()) return false; - // Create the semantics context - semaContext = invoc.getSemanticsCtx(*allCookedSources, getTargetMachine()); // Set options controlling lowering to FIR. 
invoc.setLoweringOptions(); diff --git a/flang/lib/Frontend/FrontendAction.cpp b/flang/lib/Frontend/FrontendAction.cpp index ab77d143fa4b6..d178fd6a9395d 100644 --- a/flang/lib/Frontend/FrontendAction.cpp +++ b/flang/lib/Frontend/FrontendAction.cpp @@ -183,7 +183,7 @@ bool FrontendAction::runSemanticChecks() { // Transfer any pending non-fatal messages from parsing to semantics // so that they are merged and all printed in order. - auto &semanticsCtx{ci.getSemanticsContext()}; + auto &semanticsCtx{ci.createNewSemanticsContext()}; semanticsCtx.messages().Annex(std::move(ci.getParsing().messages())); semanticsCtx.set_debugModuleWriter(ci.getInvocation().getDebugModuleDir()); diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c1f47b12abee2..e5a15c555fa5e 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -171,6 +171,10 @@ static void addDependentLibs(mlir::ModuleOp mlirModule, CompilerInstance &ci) { } bool CodeGenAction::beginSourceFileAction() { + // Delete previous LLVM module depending on old context before making a new + // one. + if (llvmModule) + llvmModule.reset(nullptr); llvmCtx = std::make_unique<llvm::LLVMContext>(); CompilerInstance &ci = this->getInstance(); mlir::DefaultTimingManager &timingMgr = ci.getTimingManager(); @@ -197,6 +201,9 @@ bool CodeGenAction::beginSourceFileAction() { return true; } + // Reset MLIR module if it was set before overriding the old context. + if (mlirModule) + mlirModule = mlir::OwningOpRef<mlir::ModuleOp>(nullptr); // Load the MLIR dialects required by Flang mlirCtx = std::make_unique<mlir::MLIRContext>(); fir::support::loadDialects(*mlirCtx); diff --git a/flang/test/Driver/multiple-fc1-input.f90 b/flang/test/Driver/multiple-fc1-input.f90 new file mode 100644 index 0000000000000..57f7c5e92b4c4 --- /dev/null +++ b/flang/test/Driver/multiple-fc1-input.f90 @@ -0,0 +1,9 @@ +! Test that flang -fc1 can be called with several input files without +! crashing. +! 
Regression tests for: https://github.com/llvm/llvm-project/issues/137126 + +! RUN: %flang_fc1 -emit-fir %s %s -o - | FileCheck %s +subroutine foo() +end subroutine +! CHECK: func @_QPfoo() +! CHECK: func @_QPfoo() From flang-commits at lists.llvm.org Wed May 7 06:51:16 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 06:51:16 -0700 (PDT) Subject: [flang-commits] [flang] [flang][driver] do not crash when fc1 process multiple files (PR #138875) In-Reply-To: Message-ID: <681b6554.630a0220.668d6.ea27@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-driver Author: None (jeanPerier)
Changes This is a fix for the issue https://github.com/llvm/llvm-project/issues/137126, which turned out to be a driver issue. The FrontendActions have a loop to process multiple input files, and flang -fc1 accepts multiple files, but the semantic, lowering, and LLVM codegen actions were not re-entrant. - The SemanticsContext cannot be reused between two files since it holds the scope, so symbols will clash. - The MLIR and LLVM modules need to be reset before overriding the previous contexts; otherwise their later deletion will use the old context and cause a use-after-free. **I am unsure whether it is the intention/a feature to process multiple input files in a single -fc1 command** (Flang invokes one flang -fc1 process per input file, hence the issue does not show up with `flang -c foo.f90 bar.f90`). I think my fix is fragile (it is hard to keep track of where all the unique pointers are set/deleted, and how they depend on each other) and maybe it would be better to prevent multiple inputs to fc1. @banach-space, what was the intention? 
--- Full diff: https://github.com/llvm/llvm-project/pull/138875.diff 5 Files Affected: - (modified) flang/include/flang/Frontend/CompilerInstance.h (+6) - (modified) flang/lib/Frontend/CompilerInstance.cpp (-2) - (modified) flang/lib/Frontend/FrontendAction.cpp (+1-1) - (modified) flang/lib/Frontend/FrontendActions.cpp (+7) - (added) flang/test/Driver/multiple-fc1-input.f90 (+9) ``````````diff diff --git a/flang/include/flang/Frontend/CompilerInstance.h b/flang/include/flang/Frontend/CompilerInstance.h index e37ef5e236871..4ad95c9df42d7 100644 --- a/flang/include/flang/Frontend/CompilerInstance.h +++ b/flang/include/flang/Frontend/CompilerInstance.h @@ -147,6 +147,12 @@ class CompilerInstance { /// @name Semantic analysis /// { + Fortran::semantics::SemanticsContext &createNewSemanticsContext() { + semaContext = + getInvocation().getSemanticsCtx(*allCookedSources, getTargetMachine()); + return *semaContext; + } + Fortran::semantics::SemanticsContext &getSemanticsContext() { return *semaContext; } diff --git a/flang/lib/Frontend/CompilerInstance.cpp b/flang/lib/Frontend/CompilerInstance.cpp index f7ed969f03bf4..cbd2c58eeeb47 100644 --- a/flang/lib/Frontend/CompilerInstance.cpp +++ b/flang/lib/Frontend/CompilerInstance.cpp @@ -162,8 +162,6 @@ bool CompilerInstance::executeAction(FrontendAction &act) { allSources->set_encoding(invoc.getFortranOpts().encoding); if (!setUpTargetMachine()) return false; - // Create the semantics context - semaContext = invoc.getSemanticsCtx(*allCookedSources, getTargetMachine()); // Set options controlling lowering to FIR. 
invoc.setLoweringOptions(); diff --git a/flang/lib/Frontend/FrontendAction.cpp b/flang/lib/Frontend/FrontendAction.cpp index ab77d143fa4b6..d178fd6a9395d 100644 --- a/flang/lib/Frontend/FrontendAction.cpp +++ b/flang/lib/Frontend/FrontendAction.cpp @@ -183,7 +183,7 @@ bool FrontendAction::runSemanticChecks() { // Transfer any pending non-fatal messages from parsing to semantics // so that they are merged and all printed in order. - auto &semanticsCtx{ci.getSemanticsContext()}; + auto &semanticsCtx{ci.createNewSemanticsContext()}; semanticsCtx.messages().Annex(std::move(ci.getParsing().messages())); semanticsCtx.set_debugModuleWriter(ci.getInvocation().getDebugModuleDir()); diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c1f47b12abee2..e5a15c555fa5e 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -171,6 +171,10 @@ static void addDependentLibs(mlir::ModuleOp mlirModule, CompilerInstance &ci) { } bool CodeGenAction::beginSourceFileAction() { + // Delete previous LLVM module depending on old context before making a new + // one. + if (llvmModule) + llvmModule.reset(nullptr); llvmCtx = std::make_unique<llvm::LLVMContext>(); CompilerInstance &ci = this->getInstance(); mlir::DefaultTimingManager &timingMgr = ci.getTimingManager(); @@ -197,6 +201,9 @@ bool CodeGenAction::beginSourceFileAction() { return true; } + // Reset MLIR module if it was set before overriding the old context. + if (mlirModule) + mlirModule = mlir::OwningOpRef<mlir::ModuleOp>(nullptr); // Load the MLIR dialects required by Flang mlirCtx = std::make_unique<mlir::MLIRContext>(); fir::support::loadDialects(*mlirCtx); diff --git a/flang/test/Driver/multiple-fc1-input.f90 b/flang/test/Driver/multiple-fc1-input.f90 new file mode 100644 index 0000000000000..57f7c5e92b4c4 --- /dev/null +++ b/flang/test/Driver/multiple-fc1-input.f90 @@ -0,0 +1,9 @@ +! Test that flang -fc1 can be called with several input files without +! crashing. +! 
Regression tests for: https://github.com/llvm/llvm-project/issues/137126 + +! RUN: %flang_fc1 -emit-fir %s %s -o - | FileCheck %s +subroutine foo() +end subroutine +! CHECK: func @_QPfoo() +! CHECK: func @_QPfoo() ``````````
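The ordering constraint described in PR #138875 (drop the module that depends on a context before replacing that context) can be illustrated with a minimal sketch. Everything here is hypothetical: `Context`, `Module`, `Action`, and `processTwoFiles` are stand-ins invented for illustration and are not the Flang, MLIR, or LLVM classes. The log only records destruction order; a real module would touch its context while being destroyed, which is where a use-after-free would occur if the context had already gone away.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

using Log = std::vector<std::string>;

// Hypothetical stand-in for an owning context (not llvm::LLVMContext).
struct Context {
  explicit Context(Log &sink) : sink(sink) {}
  ~Context() { sink.push_back("context destroyed"); }
  Log &sink;
};

// Hypothetical stand-in for a module that borrows its context.
struct Module {
  Module(Context &ctx, Log &sink) : ctx(ctx), sink(sink) {}
  // A real module would use `ctx` during destruction, so `ctx` must
  // still be alive at this point.
  ~Module() { sink.push_back("module destroyed"); }
  Context &ctx;
  Log &sink;
};

// Hypothetical action that is re-entered once per input file.
struct Action {
  explicit Action(Log &sink) : sink(sink) {}

  void beginSourceFile() {
    // Release the module from the previous file first, while the
    // context it refers to is still alive.
    mod.reset();
    // Only now replace the context; the old one is destroyed here.
    ctx = std::make_unique<Context>(sink);
    mod = std::make_unique<Module>(*ctx, sink);
  }

  Log &sink;
  // Declared before `mod` so that, on destruction of Action, `mod` is
  // destroyed first (members are destroyed in reverse declaration order).
  std::unique_ptr<Context> ctx;
  std::unique_ptr<Module> mod;
};

// Runs the action on two "files" and returns the destruction order.
Log processTwoFiles() {
  Log log;
  {
    Action action(log);
    action.beginSourceFile();
    action.beginSourceFile();
  }
  return log;
}
```

In this sketch the module is always destroyed before its context, both on re-entry and when the action itself goes away. Without the `mod.reset()` call, the second `beginSourceFile` would destroy the old context while the old module still referred to it.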
https://github.com/llvm/llvm-project/pull/138875 From flang-commits at lists.llvm.org Wed May 7 07:00:24 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 07:00:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] fix predetermined privatization inside section (PR #138159) In-Reply-To: Message-ID: <681b6778.170a0220.37f778.5612@mx.google.com> https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/138159


From flang-commits at lists.llvm.org Wed May 7 07:01:03 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 07:01:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] fix predetermined privatization inside section (PR #138159) In-Reply-To: Message-ID: <681b679f.170a0220.150855.c5ec@mx.google.com> ================ @@ -1240,7 +1247,7 @@ static void createBodyOfOp(mlir::Operation &op, const OpWithBodyGenInfo &info, // loop (this may not make sense in production code, but a user could // write that and we should handle it). firOpBuilder.setInsertionPoint(term); - if (privatize) { + if (privatize && !info.skipDspStep2) { ---------------- tblah wrote: Thanks that was a good idea. https://github.com/llvm/llvm-project/pull/138159 From flang-commits at lists.llvm.org Wed May 7 07:50:25 2025 From: flang-commits at lists.llvm.org (David Truby via flang-commits) Date: Wed, 07 May 2025 07:50:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang][driver] do not crash when fc1 process multiple files (PR #138875) In-Reply-To: Message-ID: <681b7331.170a0220.5b842.d11e@mx.google.com> https://github.com/DavidTruby approved this pull request. LGTM thanks for the fix! https://github.com/llvm/llvm-project/pull/138875 From flang-commits at lists.llvm.org Wed May 7 07:55:43 2025 From: flang-commits at lists.llvm.org (David Truby via flang-commits) Date: Wed, 07 May 2025 07:55:43 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][Windows] Disable PCH on Windows for flangFrontend (PR #134726) In-Reply-To: Message-ID: <681b746f.170a0220.1f1bda.2a1f@mx.google.com> DavidTruby wrote: For now we have disabled PCHs for Windows in general in flang, we may re-enable them at some point if we can reproduce and fix the issue but will be cautious in doing so. Can this be closed now that that is done? 
https://github.com/llvm/llvm-project/pull/134726 From flang-commits at lists.llvm.org Wed May 7 08:04:52 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Wed, 07 May 2025 08:04:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang][driver] do not crash when fc1 process multiple files (PR #138875) In-Reply-To: Message-ID: <681b7694.170a0220.26a4f3.edcf@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. I'm not the deciding vote, but this LGTM https://github.com/llvm/llvm-project/pull/138875 From flang-commits at lists.llvm.org Wed May 7 08:06:07 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Wed, 07 May 2025 08:06:07 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828) In-Reply-To: Message-ID: <681b76df.050a0220.3d155.c896@mx.google.com> DanielCChen wrote: > 1. It should be in `$prefix/lib/clang/finclude//*.h` Just want to make sure: Should it be `$prefix/lib/clang/${LLVM_VERSION_MAJOR}/finclude//*.mod` https://github.com/llvm/llvm-project/pull/137828 From flang-commits at lists.llvm.org Wed May 7 08:48:46 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 07 May 2025 08:48:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. 
(PR #136012) In-Reply-To: Message-ID: <681b80de.630a0220.3c0424.868f@mx.google.com> ================ @@ -22,6 +22,9 @@ class StringRef; } // namespace llvm namespace mlir { +namespace func { +class FuncOp; +} ---------------- klausler wrote: `} // namespace func` https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Wed May 7 08:49:28 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Wed, 07 May 2025 08:49:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix crash with USE of hermetic module file (PR #138785) In-Reply-To: Message-ID: <681b8108.170a0220.237166.6f53@mx.google.com> https://github.com/akuhlens approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/138785 From flang-commits at lists.llvm.org Wed May 7 08:51:38 2025 From: flang-commits at lists.llvm.org (=?UTF-8?Q?I=C3=B1aki_Amatria_Barral?= via flang-commits) Date: Wed, 07 May 2025 08:51:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][driver] do not crash when fc1 process multiple files (PR #138875) In-Reply-To: Message-ID: <681b818a.630a0220.2491e5.63c7@mx.google.com> inaki-amatria wrote: TYSM @jeanPerier! We really appreciate you taking the time to look into this issue. The way I reported the issue, invoking two files with `flang -fc1`, was intentional. We use both Clang and Flang as libraries, and this was the most effective way to report the issue in a form that aligns with your typical workflow, where Flang is primarily used as a binary. Regardless of whether `flang -fc1` is expected to accept multiple input files directly, we believe the fix is still of great value for Flang library users like us. Again, thanks so much for looking into this @jeanPerier. 
https://github.com/llvm/llvm-project/pull/138875 From flang-commits at lists.llvm.org Wed May 7 08:52:53 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 07 May 2025 08:52:53 -0700 (PDT) Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790) In-Reply-To: Message-ID: <681b81d5.170a0220.32f084.c244@mx.google.com> ================ @@ -461,14 +475,28 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_ATTRIBUTE_UNUSED auto loopAnalysis = functionAnalysis.getChildLoopAnalysis(loop); auto &loopOps = loop.getBody()->getOperations(); + auto resultOp = cast(loop.getBody()->getTerminator()); ---------------- tblah wrote: Thanks https://github.com/llvm/llvm-project/pull/137790 From flang-commits at lists.llvm.org Wed May 7 08:53:03 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 07 May 2025 08:53:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. 
(PR #136012) In-Reply-To: Message-ID: <681b81df.050a0220.23c0d0.e80d@mx.google.com> ================ @@ -1034,88 +1034,78 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Symbol &symbol, const parser::OpenACCRoutineConstruct &x) { if (symbol.has()) { Fortran::semantics::OpenACCRoutineInfo info; - const auto &clauses = std::get(x.t); + std::vector currentDevices; + currentDevices.push_back(&info); + const auto &clauses{std::get(x.t)}; for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isSeq(); - } else { - info.deviceTypeInfos().back().set_isSeq(); + if (const auto *dTypeClause{ + std::get_if(&clause.u)}) { + currentDevices.clear(); + for (const auto &deviceTypeExpr : dTypeClause->v.v) { + currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v)); } - } else if (const auto *gangClause = - std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isGang(); - } else { - info.deviceTypeInfos().back().set_isGang(); + } else if (std::get_if(&clause.u)) { + info.set_isNohost(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isSeq(); + } + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isVector(); + } + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isWorker(); + } + } else if (const auto *gangClause{ + std::get_if( + &clause.u)}) { + for (auto &device : currentDevices) { + device->set_isGang(); } if (gangClause->v) { const Fortran::parser::AccGangArgList &x = *gangClause->v; + int numArgs{0}; for (const Fortran::parser::AccGangArg &gangArg : x.v) { - if (const auto *dim = - std::get_if(&gangArg.u)) { + CHECK(numArgs <= 1 && "expecting 0 or 1 gang dim args"); + if (const auto *dim{std::get_if( + &gangArg.u)}) { if (const auto v{EvaluateInt64(context_, dim->v)}) { - if (info.deviceTypeInfos().empty()) { - 
info.set_gangDim(*v); - } else { - info.deviceTypeInfos().back().set_gangDim(*v); + for (auto &device : currentDevices) { + device->set_gangDim(*v); } } } + numArgs++; } } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isVector(); - } else { - info.deviceTypeInfos().back().set_isVector(); - } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isWorker(); - } else { - info.deviceTypeInfos().back().set_isWorker(); - } - } else if (std::get_if(&clause.u)) { - info.set_isNohost(); - } else if (const auto *bindClause = - std::get_if(&clause.u)) { - if (const auto *name = - std::get_if(&bindClause->v.u)) { - if (Symbol *sym = ResolveFctName(*name)) { - if (info.deviceTypeInfos().empty()) { - info.set_bindName(sym->name().ToString()); - } else { - info.deviceTypeInfos().back().set_bindName( - sym->name().ToString()); + } else if (const auto *bindClause{ + std::get_if( + &clause.u)}) { + if (const auto *name{ + std::get_if(&bindClause->v.u)}) { + if (Symbol * sym{ResolveFctName(*name)}) { + Symbol &ultimate{sym->GetUltimate()}; + for (auto &device : currentDevices) { + device->set_bindName(SymbolRef(ultimate)); ---------------- klausler wrote: nit: `SymbolRef{ultimate}` (with braces) would be more consistent with modern usage. https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Wed May 7 08:53:40 2025 From: flang-commits at lists.llvm.org (=?UTF-8?Q?I=C3=B1aki_Amatria_Barral?= via flang-commits) Date: Wed, 07 May 2025 08:53:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang][driver] do not crash when fc1 process multiple files (PR #138875) In-Reply-To: Message-ID: <681b8204.050a0220.2936bb.da9a@mx.google.com> https://github.com/inaki-amatria approved this pull request. 
https://github.com/llvm/llvm-project/pull/138875 From flang-commits at lists.llvm.org Wed May 7 08:53:56 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 07 May 2025 08:53:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681b8214.050a0220.1137bf.d85a@mx.google.com> ================ @@ -3050,4 +3040,4 @@ void OmpAttributeVisitor::IssueNonConformanceWarning( context_.Warn(common::UsageWarning::OpenMPUsage, source, "%s"_warn_en_US, warnStrOS.str()); } -} // namespace Fortran::semantics +} // namespace Fortran::semantics ---------------- klausler wrote: I think that you lost a newline here (not sure what the icon means in the GitHub GUI) https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Wed May 7 08:54:06 2025 From: flang-commits at lists.llvm.org (Pranav Bhandarkar via flang-commits) Date: Wed, 07 May 2025 08:54:06 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][MLIR] - Handle the mapping of subroutine arguments when they are subsequently used inside the region of an `omp.target` Op (PR #134967) In-Reply-To: Message-ID: <681b821e.050a0220.37eef8.db84@mx.google.com> https://github.com/bhandarkar-pranav edited https://github.com/llvm/llvm-project/pull/134967 From flang-commits at lists.llvm.org Wed May 7 08:55:01 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 07 May 2025 08:55:01 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681b8255.620a0220.140fa9.cf8b@mx.google.com> https://github.com/klausler approved this pull request. The non-lowering bits look pretty good to me now; thanks for responding to my comments. I have a few more small items. Please wait for a lowering expert to approve the rest of it. 
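On the `SymbolRef(ultimate)` versus `SymbolRef{ultimate}` nit in the review of PR #136012: as general C++ background, and not a statement about `SymbolRef` itself (its constructors are not shown in this thread), the two spellings can differ when a class has a `std::initializer_list` constructor, and braces additionally reject narrowing conversions. The sketch below uses `std::vector` and a hypothetical `Ref` wrapper, invented for illustration, to show one case where the spellings differ and one where they behave the same.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Parentheses select the (count, value) constructor:
// three elements, each equal to 7.
std::size_t sizeWithParens() {
  std::vector<int> v(3, 7);
  return v.size();
}

// Braces prefer the std::initializer_list constructor:
// two elements, 3 and 7.
std::size_t sizeWithBraces() {
  std::vector<int> v{3, 7};
  return v.size();
}

// Hypothetical single-argument reference wrapper (not Flang's SymbolRef).
// It has no initializer_list constructor, so both spellings call the
// same constructor.
struct Ref {
  explicit Ref(const int &target) : ptr(&target) {}
  const int *ptr;
};

bool sameTarget(const int &x) {
  Ref a(x); // parenthesized construction
  Ref b{x}; // braced construction
  return a.ptr == b.ptr;
}

// Note: braces also forbid narrowing, e.g. `int n{3.5};` is ill-formed,
// while `int n(3.5);` compiles and truncates.
```

For a wrapper shaped like `Ref`, the choice is purely stylistic, which is consistent with treating the review comment as a consistency nit.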
https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Wed May 7 09:04:56 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Wed, 07 May 2025 09:04:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681b84a8.050a0220.1f029e.301c@mx.google.com> ================ @@ -1034,88 +1034,78 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Symbol &symbol, const parser::OpenACCRoutineConstruct &x) { if (symbol.has()) { Fortran::semantics::OpenACCRoutineInfo info; - const auto &clauses = std::get(x.t); + std::vector currentDevices; + currentDevices.push_back(&info); + const auto &clauses{std::get(x.t)}; for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isSeq(); - } else { - info.deviceTypeInfos().back().set_isSeq(); + if (const auto *dTypeClause{ + std::get_if(&clause.u)}) { + currentDevices.clear(); + for (const auto &deviceTypeExpr : dTypeClause->v.v) { + currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v)); } - } else if (const auto *gangClause = - std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isGang(); - } else { - info.deviceTypeInfos().back().set_isGang(); + } else if (std::get_if(&clause.u)) { + info.set_isNohost(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isSeq(); + } + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isVector(); + } + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isWorker(); + } + } else if (const auto *gangClause{ + std::get_if( + &clause.u)}) { + for (auto &device : currentDevices) { + device->set_isGang(); } if (gangClause->v) { const Fortran::parser::AccGangArgList &x = *gangClause->v; + int numArgs{0}; for 
(const Fortran::parser::AccGangArg &gangArg : x.v) { - if (const auto *dim = - std::get_if(&gangArg.u)) { + CHECK(numArgs <= 1 && "expecting 0 or 1 gang dim args"); + if (const auto *dim{std::get_if( + &gangArg.u)}) { if (const auto v{EvaluateInt64(context_, dim->v)}) { - if (info.deviceTypeInfos().empty()) { - info.set_gangDim(*v); - } else { - info.deviceTypeInfos().back().set_gangDim(*v); + for (auto &device : currentDevices) { + device->set_gangDim(*v); } } } + numArgs++; } } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isVector(); - } else { - info.deviceTypeInfos().back().set_isVector(); - } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isWorker(); - } else { - info.deviceTypeInfos().back().set_isWorker(); - } - } else if (std::get_if(&clause.u)) { - info.set_isNohost(); - } else if (const auto *bindClause = - std::get_if(&clause.u)) { - if (const auto *name = - std::get_if(&bindClause->v.u)) { - if (Symbol *sym = ResolveFctName(*name)) { - if (info.deviceTypeInfos().empty()) { - info.set_bindName(sym->name().ToString()); - } else { - info.deviceTypeInfos().back().set_bindName( - sym->name().ToString()); + } else if (const auto *bindClause{ + std::get_if( + &clause.u)}) { + if (const auto *name{ + std::get_if(&bindClause->v.u)}) { + if (Symbol * sym{ResolveFctName(*name)}) { + Symbol &ultimate{sym->GetUltimate()}; + for (auto &device : currentDevices) { + device->set_bindName(SymbolRef(ultimate)); ---------------- akuhlens wrote: For my own edification, what is the semantic difference? https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Wed May 7 09:11:02 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 07 May 2025 09:11:02 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. 
(PR #136012) In-Reply-To: Message-ID: <681b8616.170a0220.169f91.245c@mx.google.com> ================ @@ -1034,88 +1034,78 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Symbol &symbol, const parser::OpenACCRoutineConstruct &x) { if (symbol.has()) { Fortran::semantics::OpenACCRoutineInfo info; - const auto &clauses = std::get(x.t); + std::vector currentDevices; + currentDevices.push_back(&info); + const auto &clauses{std::get(x.t)}; for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isSeq(); - } else { - info.deviceTypeInfos().back().set_isSeq(); + if (const auto *dTypeClause{ + std::get_if(&clause.u)}) { + currentDevices.clear(); + for (const auto &deviceTypeExpr : dTypeClause->v.v) { + currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v)); } - } else if (const auto *gangClause = - std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isGang(); - } else { - info.deviceTypeInfos().back().set_isGang(); + } else if (std::get_if(&clause.u)) { + info.set_isNohost(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isSeq(); + } + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isVector(); + } + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isWorker(); + } + } else if (const auto *gangClause{ + std::get_if( + &clause.u)}) { + for (auto &device : currentDevices) { + device->set_isGang(); } if (gangClause->v) { const Fortran::parser::AccGangArgList &x = *gangClause->v; + int numArgs{0}; for (const Fortran::parser::AccGangArg &gangArg : x.v) { - if (const auto *dim = - std::get_if(&gangArg.u)) { + CHECK(numArgs <= 1 && "expecting 0 or 1 gang dim args"); + if (const auto *dim{std::get_if( + &gangArg.u)}) { if (const auto v{EvaluateInt64(context_, dim->v)}) { - if (info.deviceTypeInfos().empty()) { - 
info.set_gangDim(*v); - } else { - info.deviceTypeInfos().back().set_gangDim(*v); + for (auto &device : currentDevices) { + device->set_gangDim(*v); } } } + numArgs++; } } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isVector(); - } else { - info.deviceTypeInfos().back().set_isVector(); - } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isWorker(); - } else { - info.deviceTypeInfos().back().set_isWorker(); - } - } else if (std::get_if(&clause.u)) { - info.set_isNohost(); - } else if (const auto *bindClause = - std::get_if(&clause.u)) { - if (const auto *name = - std::get_if(&bindClause->v.u)) { - if (Symbol *sym = ResolveFctName(*name)) { - if (info.deviceTypeInfos().empty()) { - info.set_bindName(sym->name().ToString()); - } else { - info.deviceTypeInfos().back().set_bindName( - sym->name().ToString()); + } else if (const auto *bindClause{ + std::get_if( + &clause.u)}) { + if (const auto *name{ + std::get_if(&bindClause->v.u)}) { + if (Symbol * sym{ResolveFctName(*name)}) { + Symbol &ultimate{sym->GetUltimate()}; + for (auto &device : currentDevices) { + device->set_bindName(SymbolRef(ultimate)); ---------------- klausler wrote: None that I know of; it's just a matter of syntactic consistency to use braces for constructor calls. https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Wed May 7 09:12:28 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 07 May 2025 09:12:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. 
(PR #136012) In-Reply-To: Message-ID: <681b866c.050a0220.3a4ee0.e1a9@mx.google.com> ================ @@ -1034,88 +1034,78 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Symbol &symbol, const parser::OpenACCRoutineConstruct &x) { if (symbol.has()) { Fortran::semantics::OpenACCRoutineInfo info; - const auto &clauses = std::get(x.t); + std::vector currentDevices; + currentDevices.push_back(&info); + const auto &clauses{std::get(x.t)}; for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isSeq(); - } else { - info.deviceTypeInfos().back().set_isSeq(); + if (const auto *dTypeClause{ + std::get_if(&clause.u)}) { + currentDevices.clear(); + for (const auto &deviceTypeExpr : dTypeClause->v.v) { + currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v)); } - } else if (const auto *gangClause = - std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isGang(); - } else { - info.deviceTypeInfos().back().set_isGang(); + } else if (std::get_if(&clause.u)) { + info.set_isNohost(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isSeq(); + } + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isVector(); + } + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isWorker(); + } + } else if (const auto *gangClause{ + std::get_if( + &clause.u)}) { + for (auto &device : currentDevices) { + device->set_isGang(); } if (gangClause->v) { const Fortran::parser::AccGangArgList &x = *gangClause->v; + int numArgs{0}; for (const Fortran::parser::AccGangArg &gangArg : x.v) { - if (const auto *dim = - std::get_if(&gangArg.u)) { + CHECK(numArgs <= 1 && "expecting 0 or 1 gang dim args"); + if (const auto *dim{std::get_if( + &gangArg.u)}) { if (const auto v{EvaluateInt64(context_, dim->v)}) { - if (info.deviceTypeInfos().empty()) { - 
info.set_gangDim(*v); - } else { - info.deviceTypeInfos().back().set_gangDim(*v); + for (auto &device : currentDevices) { + device->set_gangDim(*v); } } } + numArgs++; } } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isVector(); - } else { - info.deviceTypeInfos().back().set_isVector(); - } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isWorker(); - } else { - info.deviceTypeInfos().back().set_isWorker(); - } - } else if (std::get_if(&clause.u)) { - info.set_isNohost(); - } else if (const auto *bindClause = - std::get_if(&clause.u)) { - if (const auto *name = - std::get_if(&bindClause->v.u)) { - if (Symbol *sym = ResolveFctName(*name)) { - if (info.deviceTypeInfos().empty()) { - info.set_bindName(sym->name().ToString()); - } else { - info.deviceTypeInfos().back().set_bindName( - sym->name().ToString()); + } else if (const auto *bindClause{ + std::get_if( + &clause.u)}) { + if (const auto *name{ + std::get_if(&bindClause->v.u)}) { + if (Symbol * sym{ResolveFctName(*name)}) { + Symbol &ultimate{sym->GetUltimate()}; + for (auto &device : currentDevices) { + device->set_bindName(SymbolRef(ultimate)); ---------------- klausler wrote: (But there might be a semantic distinction related to usage as conversions or something; let me know if you can find one.) https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Wed May 7 09:15:46 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 07 May 2025 09:15:46 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681b8732.630a0220.a4ddf.8b79@mx.google.com> ================ @@ -0,0 +1,42 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 + +! Test case 1: Device arrays with ignore_tkr(c) +subroutine test_device_arrays() + interface bar + subroutine bar1(a) +!dir$ ignore_tkr(c) a + real :: a(..) 
+!@cuf attributes(device) :: a + end subroutine + end interface + + integer :: n = 10, k = 2 + real, device :: a(10), b(10), c(10) + + call bar(a(1:n)) ! Should not warn about contiguity + call bar(b(1:n:k)) ! Should not warn about contiguity + call bar(c(1:n:2)) ! Should not warn about contiguity +end subroutine + +! Test case 2: Managed arrays with ignore_tkr(c) +subroutine test_managed_arrays() + interface bar + subroutine bar1(a) +!dir$ ignore_tkr(c) a + real :: a(..) +!@cuf attributes(device) :: a + end subroutine + end interface + + integer :: n = 10, k = 2 + real, managed :: a(10), b(10), c(10) + + call bar(a(1:n)) ! Should not warn about contiguity + call bar(b(1:n:k)) ! Should not warn about contiguity + call bar(c(1:n:2)) ! Should not warn about contiguity +end subroutine + +program main + call test_device_arrays() + call test_managed_arrays() +end program ---------------- wangzpgi wrote: Strangely I don't see new line after `end program` in the source code. https://github.com/llvm/llvm-project/pull/138762 From flang-commits at lists.llvm.org Wed May 7 09:16:08 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 07 May 2025 09:16:08 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681b8748.050a0220.156158.de74@mx.google.com> https://github.com/wangzpgi updated https://github.com/llvm/llvm-project/pull/138762 >From 61c452e7384b762e33fb9636c9dfd7f4cbad026e Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Tue, 6 May 2025 13:58:54 -0700 Subject: [PATCH 1/2] Do not emit warnings/errors for contiguous check when ignore_tkr(c) is used for all attributes --- flang/lib/Semantics/check-call.cpp | 3 ++- flang/test/Semantics/cuf20.cuf | 42 ++++++++++++++++++++++++++++++ 2 files changed, 44 insertions(+), 1 deletion(-) create mode 100644 flang/test/Semantics/cuf20.cuf diff --git a/flang/lib/Semantics/check-call.cpp b/flang/lib/Semantics/check-call.cpp 
index dfaa0e028d698..58271d7ca2e87 100644 --- a/flang/lib/Semantics/check-call.cpp +++ b/flang/lib/Semantics/check-call.cpp @@ -1016,7 +1016,8 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, } } if (dummyDataAttr == common::CUDADataAttr::Device && - (dummyIsAssumedShape || dummyIsAssumedRank)) { + (dummyIsAssumedShape || dummyIsAssumedRank) + !dummy.ignoreTKR.test(common::IgnoreTKR::Contiguous)) { if (auto contig{evaluate::IsContiguous(actual, foldingContext, /*namedConstantSectionsAreContiguous=*/true, /*firstDimensionStride1=*/true)}) { diff --git a/flang/test/Semantics/cuf20.cuf b/flang/test/Semantics/cuf20.cuf new file mode 100644 index 0000000000000..222ff2a1b7c6d --- /dev/null +++ b/flang/test/Semantics/cuf20.cuf @@ -0,0 +1,42 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 + +! Test case 1: Device arrays with ignore_tkr(c) +subroutine test_device_arrays() + interface bar + subroutine bar1(a) +!dir$ ignore_tkr(c) a + real :: a(..) +!@cuf attributes(device) :: a + end subroutine + end interface + + integer :: n = 10, k = 2 + real, device :: a(10), b(10), c(10) + + call bar(a(1:n)) ! Should not warn about contiguity + call bar(b(1:n:k)) ! Should not warn about contiguity + call bar(c(1:n:2)) ! Should not warn about contiguity +end subroutine + +! Test case 2: Managed arrays with ignore_tkr(c) +subroutine test_managed_arrays() + interface bar + subroutine bar1(a) +!dir$ ignore_tkr(c) a + real :: a(..) +!@cuf attributes(device) :: a + end subroutine + end interface + + integer :: n = 10, k = 2 + real, managed :: a(10), b(10), c(10) + + call bar(a(1:n)) ! Should not warn about contiguity + call bar(b(1:n:k)) ! Should not warn about contiguity + call bar(c(1:n:2)) ! 
Should not warn about contiguity +end subroutine + +program main + call test_device_arrays() + call test_managed_arrays() +end program \ No newline at end of file >From 33ad28f8ea46f8fc2f5dce2fc7d333be35a05291 Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Tue, 6 May 2025 14:10:41 -0700 Subject: [PATCH 2/2] fix missing && --- flang/lib/Semantics/check-call.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Semantics/check-call.cpp b/flang/lib/Semantics/check-call.cpp index 58271d7ca2e87..11928860fea5f 100644 --- a/flang/lib/Semantics/check-call.cpp +++ b/flang/lib/Semantics/check-call.cpp @@ -1016,7 +1016,7 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, } } if (dummyDataAttr == common::CUDADataAttr::Device && - (dummyIsAssumedShape || dummyIsAssumedRank) + (dummyIsAssumedShape || dummyIsAssumedRank) && !dummy.ignoreTKR.test(common::IgnoreTKR::Contiguous)) { if (auto contig{evaluate::IsContiguous(actual, foldingContext, /*namedConstantSectionsAreContiguous=*/true, From flang-commits at lists.llvm.org Wed May 7 09:20:31 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Wed, 07 May 2025 09:20:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <681b884f.170a0220.309dd7.d55e@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/136012 Rate limit · GitHub


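For readers following PR #136012: the `AddRoutineInfoToSymbol` hunk quoted earlier in this thread stops applying each clause only to `info.deviceTypeInfos().back()` and instead keeps a `currentDevices` list. That list starts out pointing at the routine-wide info and is replaced by every entry named in the most recent `device_type` clause. The block below is a minimal sketch of that scoping rule only. `RoutineInfo`, `Clause`, `RoutineInfoSet` and `applyClauses` are hypothetical stand-ins, not flang types, and the `std::deque` (chosen here so that element addresses stay valid while entries are appended) is an assumption of the sketch rather than something taken from the patch.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for the routine info / per-device-type info records.
struct RoutineInfo {
  std::string deviceType; // empty for the routine-wide entry
  bool isSeq{false};
  bool isGang{false};
};

// Hypothetical, simplified clause representation (not the flang parse tree).
struct Clause {
  enum Kind { DeviceType, Seq, Gang } kind;
  std::vector<std::string> deviceTypes; // only meaningful for DeviceType
};

struct RoutineInfoSet {
  RoutineInfo base;                  // clauses seen before any device_type
  std::deque<RoutineInfo> perDevice; // one entry per named device type
};

inline RoutineInfoSet applyClauses(const std::vector<Clause> &clauses) {
  RoutineInfoSet result;
  // Targets of the next clause; initially just the routine-wide entry.
  std::vector<RoutineInfo *> current{&result.base};
  for (const Clause &clause : clauses) {
    switch (clause.kind) {
    case Clause::DeviceType:
      // A device_type clause replaces the target list with its own entries.
      current.clear();
      for (const std::string &name : clause.deviceTypes) {
        result.perDevice.push_back(RoutineInfo{name});
        current.push_back(&result.perDevice.back());
      }
      break;
    case Clause::Seq:
      for (RoutineInfo *info : current) {
        info->isSeq = true;
      }
      break;
    case Clause::Gang:
      for (RoutineInfo *info : current) {
        info->isGang = true;
      }
      break;
    }
  }
  return result;
}

// Usage: clause sequence seq, device_type(nvidia, host), gang.
inline void demoDeviceTypeScoping() {
  std::vector<Clause> clauses{
      Clause{Clause::Seq, {}},
      Clause{Clause::DeviceType, {"nvidia", "host"}},
      Clause{Clause::Gang, {}}};
  RoutineInfoSet r = applyClauses(clauses);
  assert(r.base.isSeq && !r.base.isGang); // seq came before device_type
  assert(r.perDevice.size() == 2);
  assert(r.perDevice[0].isGang && r.perDevice[1].isGang); // both named devices
  assert(!r.perDevice[0].isSeq && !r.perDevice[1].isSeq);
}
```

In this model a clause that follows `device_type(nvidia, host)` reaches both named entries, whereas the removed `.back()` logic would have reached only the last one.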
From flang-commits at lists.llvm.org Wed May 7 09:21:39 2025 From: flang-commits at lists.llvm.org (Pranav Bhandarkar via flang-commits) Date: Wed, 07 May 2025 09:21:39 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][MLIR] - Handle the mapping of subroutine arguments when they are subsequently used inside the region of an `omp.target` Op (PR #134967) In-Reply-To: Message-ID: <681b8893.a70a0220.6ae43.2c2e@mx.google.com> ================ @@ -217,59 +217,36 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, assert(args.isValid() && "invalid args"); fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - auto bindSingleMapLike = [&converter, - &firOpBuilder](const semantics::Symbol &sym, - const mlir::BlockArgument &arg) { - // Clones the `bounds` placing them inside the entry block and returns - // them. - auto cloneBound = [&](mlir::Value bound) { - if (mlir::isMemoryEffectFree(bound.getDefiningOp())) { - mlir::Operation *clonedOp = firOpBuilder.clone(*bound.getDefiningOp()); - return clonedOp->getResult(0); - } - TODO(converter.getCurrentLocation(), - "target map-like clause operand unsupported bound type"); - }; - - auto cloneBounds = [cloneBound](llvm::ArrayRef bounds) { - llvm::SmallVector clonedBounds; - llvm::transform(bounds, std::back_inserter(clonedBounds), - [&](mlir::Value bound) { return cloneBound(bound); }); - return clonedBounds; - }; - ---------------- bhandarkar-pranav wrote: Thanks @TIFitis for taking a look. The above snippet compiled fine with this PR. 
https://github.com/llvm/llvm-project/pull/134967 From flang-commits at lists.llvm.org Wed May 7 09:23:19 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 07 May 2025 09:23:19 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681b88f7.170a0220.17a064.c5f0@mx.google.com> https://github.com/wangzpgi updated https://github.com/llvm/llvm-project/pull/138762

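As a reading aid for PR #138762: after the second patch restores the `&&` that the first one omitted, the contiguity check in `CheckExplicitDataArg` runs only when the dummy has the CUDA `Device` attribute, is assumed-shape or assumed-rank, and does not carry `ignore_tkr(c)`. The sketch below models just that three-part predicate. `DummyModel`, the reduced `IgnoreTKR` enumeration and `needsContiguityCheck` are hypothetical names for this illustration, and a `std::bitset` stands in for flang's own set type.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical, reduced stand-ins for the enumerations used in the patch.
enum class CUDADataAttr { None, Device, Managed };
enum class IgnoreTKR : std::size_t { Type, Kind, Rank, Contiguous, Count };

// Hypothetical model of the properties of a dummy argument that the guard reads.
struct DummyModel {
  CUDADataAttr attr{CUDADataAttr::None};
  bool isAssumedShape{false};
  bool isAssumedRank{false};
  std::bitset<static_cast<std::size_t>(IgnoreTKR::Count)> ignoreTKR;
};

// Mirrors the shape of the condition in the merged patch: all three parts
// must hold before the contiguity of the actual argument is examined.
inline bool needsContiguityCheck(const DummyModel &dummy) {
  return dummy.attr == CUDADataAttr::Device &&
      (dummy.isAssumedShape || dummy.isAssumedRank) &&
      !dummy.ignoreTKR.test(static_cast<std::size_t>(IgnoreTKR::Contiguous));
}

// Usage: an assumed-rank device dummy, first without and then with
// the equivalent of `!dir$ ignore_tkr(c) a`.
inline void demoIgnoreTkrContiguous() {
  DummyModel dummy;
  dummy.attr = CUDADataAttr::Device;
  dummy.isAssumedRank = true; // like `real :: a(..)` in cuf20.cuf
  assert(needsContiguityCheck(dummy));
  dummy.ignoreTKR.set(static_cast<std::size_t>(IgnoreTKR::Contiguous));
  assert(!needsContiguityCheck(dummy));
}
```

With the flag set, strided sections such as `b(1:n:k)` in the new test never reach the contiguity analysis in this model, which matches the "Should not warn about contiguity" expectations in `cuf20.cuf`.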
From flang-commits at lists.llvm.org Wed May 7 09:23:46 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 09:23:46 -0700 (PDT) Subject: [flang-commits] [flang] ce69a60 - Skip contiguous check when ignore_tkr(c) is used (#138762) Message-ID: <681b8912.170a0220.96fac.d089@mx.google.com> Author: Zhen Wang Date: 2025-05-07T09:23:43-07:00 New Revision: ce69a60bc21024706a90fb36ffc2b43e112fb002 URL: https://github.com/llvm/llvm-project/commit/ce69a60bc21024706a90fb36ffc2b43e112fb002 DIFF: https://github.com/llvm/llvm-project/commit/ce69a60bc21024706a90fb36ffc2b43e112fb002.diff LOG: Skip contiguous check when ignore_tkr(c) is used (#138762) The point of ignore_tkr(c) is to ignore both contiguous warnings and errors for arguments of all attribute types. Added: flang/test/Semantics/cuf20.cuf Modified: flang/lib/Semantics/check-call.cpp Removed: ################################################################################ diff --git a/flang/lib/Semantics/check-call.cpp b/flang/lib/Semantics/check-call.cpp index dfaa0e028d698..11928860fea5f 100644 --- a/flang/lib/Semantics/check-call.cpp +++ b/flang/lib/Semantics/check-call.cpp @@ -1016,7 +1016,8 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, } } if (dummyDataAttr == common::CUDADataAttr::Device && - (dummyIsAssumedShape || dummyIsAssumedRank)) { + (dummyIsAssumedShape || dummyIsAssumedRank) && + !dummy.ignoreTKR.test(common::IgnoreTKR::Contiguous)) { if (auto contig{evaluate::IsContiguous(actual, foldingContext, /*namedConstantSectionsAreContiguous=*/true, /*firstDimensionStride1=*/true)}) { diff --git a/flang/test/Semantics/cuf20.cuf b/flang/test/Semantics/cuf20.cuf new file mode 100644 index 0000000000000..222ff2a1b7c6d --- /dev/null +++ b/flang/test/Semantics/cuf20.cuf @@ -0,0 +1,42 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 + +! 
Test case 1: Device arrays with ignore_tkr(c) +subroutine test_device_arrays() + interface bar + subroutine bar1(a) +!dir$ ignore_tkr(c) a + real :: a(..) +!@cuf attributes(device) :: a + end subroutine + end interface + + integer :: n = 10, k = 2 + real, device :: a(10), b(10), c(10) + + call bar(a(1:n)) ! Should not warn about contiguity + call bar(b(1:n:k)) ! Should not warn about contiguity + call bar(c(1:n:2)) ! Should not warn about contiguity +end subroutine + +! Test case 2: Managed arrays with ignore_tkr(c) +subroutine test_managed_arrays() + interface bar + subroutine bar1(a) +!dir$ ignore_tkr(c) a + real :: a(..) +!@cuf attributes(device) :: a + end subroutine + end interface + + integer :: n = 10, k = 2 + real, managed :: a(10), b(10), c(10) + + call bar(a(1:n)) ! Should not warn about contiguity + call bar(b(1:n:k)) ! Should not warn about contiguity + call bar(c(1:n:2)) ! Should not warn about contiguity +end subroutine + +program main + call test_device_arrays() + call test_managed_arrays() +end program \ No newline at end of file From flang-commits at lists.llvm.org Wed May 7 09:23:50 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 07 May 2025 09:23:50 -0700 (PDT) Subject: [flang-commits] [flang] Skip contiguous check when ignore_tkr(c) is used (PR #138762) In-Reply-To: Message-ID: <681b8916.170a0220.110a7e.dbb3@mx.google.com> https://github.com/wangzpgi closed https://github.com/llvm/llvm-project/pull/138762 From flang-commits at lists.llvm.org Wed May 7 09:28:14 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 07 May 2025 09:28:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Update syncthreads interface (PR #138023) In-Reply-To: Message-ID: <681b8a1e.170a0220.2fbbe3.e318@mx.google.com> https://github.com/wangzpgi approved this pull request. 
https://github.com/llvm/llvm-project/pull/138023 From flang-commits at lists.llvm.org Wed May 7 10:21:57 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Wed, 07 May 2025 10:21:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #131628) In-Reply-To: Message-ID: <681b96b5.050a0220.48da2.fa84@mx.google.com> ================ @@ -8,6 +8,7 @@ #include "mod-file.h" #include "resolve-names.h" +#include "flang/Common/indirection.h" ---------------- kparzysz wrote: Seems unnecessary... https://github.com/llvm/llvm-project/pull/131628 From flang-commits at lists.llvm.org Wed May 7 10:21:57 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Wed, 07 May 2025 10:21:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #131628) In-Reply-To: Message-ID: <681b96b5.170a0220.239b15.b474@mx.google.com> ================ @@ -3485,8 +3496,20 @@ void OmpStructureChecker::CheckReductionObjects( } } +static bool CheckSymbolSupportsType(const Scope &scope, + const parser::CharBlock &name, const DeclTypeSpec &type) { + if (const auto &symbol{scope.FindSymbol(name)}) { ---------------- kparzysz wrote: `const auto &` -> `const auto *` https://github.com/llvm/llvm-project/pull/131628 From flang-commits at lists.llvm.org Wed May 7 10:21:57 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Wed, 07 May 2025 10:21:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #131628) In-Reply-To: Message-ID: <681b96b5.170a0220.303ba1.f057@mx.google.com> ================ @@ -1752,14 +1761,91 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, PopScope(); } +parser::CharBlock MakeNameFromOperator( + const parser::DefinedOperator::IntrinsicOperator &op, + 
SemanticsContext &context) { + switch (op) { + case parser::DefinedOperator::IntrinsicOperator::Multiply: + return parser::CharBlock{"op.*", 4}; + case parser::DefinedOperator::IntrinsicOperator::Add: + return parser::CharBlock{"op.+", 4}; + case parser::DefinedOperator::IntrinsicOperator::Subtract: + return parser::CharBlock{"op.-", 4}; + + case parser::DefinedOperator::IntrinsicOperator::AND: + return parser::CharBlock{"op.AND", 6}; + case parser::DefinedOperator::IntrinsicOperator::OR: + return parser::CharBlock{"op.OR", 6}; + case parser::DefinedOperator::IntrinsicOperator::EQV: + return parser::CharBlock{"op.EQV", 7}; + case parser::DefinedOperator::IntrinsicOperator::NEQV: + return parser::CharBlock{"op.NEQV", 8}; + + default: + context.Say("Unsupported operator in DECLARE REDUCTION"_err_en_US); + return parser::CharBlock{"op.?", 4}; + } +} + +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name) { + return llvm::StringSwitch(name.ToString()) + .Case("max", {"op.max", 6}) + .Case("min", {"op.min", 6}) + .Case("iand", {"op.iand", 7}) + .Case("ior", {"op.ior", 6}) + .Case("ieor", {"op.ieor", 7}) + .Default(name); +} + +std::string MangleDefinedOperator(const parser::CharBlock &name) { + CHECK(name[0] == '.' 
&& name[name.size() - 1] == '.'); + return "op" + name.ToString(); +} + +template void OmpVisitor::ProcessReductionSpecifier( const parser::OmpReductionSpecifier &spec, - const std::optional &clauses) { + const std::optional &clauses, + const T &wholeOmpConstruct) { + const parser::Name *name{nullptr}; + parser::Name mangledName; + UserReductionDetails reductionDetailsTemp; const auto &id{std::get(spec.t)}; if (auto procDes{std::get_if(&id.u)}) { - if (auto *name{std::get_if(&procDes->u)}) { - name->symbol = - &MakeSymbol(*name, MiscDetails{MiscDetails::Kind::ConstructName}); + name = std::get_if(&procDes->u); + if (name) { + mangledName.source = MangleSpecialFunctions(name->source); + } + + } else { + const auto &defOp{std::get(id.u)}; + if (const auto definedOp{std::get_if(&defOp.u)}) { + name = &definedOp->v; + mangledName.source = parser::CharBlock{context().StoreUserReductionName( + MangleDefinedOperator(definedOp->v.source))}; + } else { + mangledName.source = MakeNameFromOperator( + std::get(defOp.u), + context()); + name = &mangledName; + } + } + + // Use reductionDetailsTemp if we can't find the symbol (this is + // the first, or only, instance with this name). The details then + // gets stored in the symbol when it's created. + UserReductionDetails *reductionDetails{&reductionDetailsTemp}; + Symbol *symbol{FindSymbol(mangledName)}; + if (symbol) { + // If we found a symbol, we append the type info to the + // existing reductionDetails. + reductionDetails = symbol->detailsIf(); + + if (!reductionDetails) { + context().Say(name->source, ---------------- kparzysz wrote: The "source" argument to Say must point to somewhere within the cooked sources. It's used to locate the line number and for underlining of the problematic fragment. I'm not sure if that's guaranteed here, since some names are created within this function. 
https://github.com/llvm/llvm-project/pull/131628 From flang-commits at lists.llvm.org Wed May 7 11:09:37 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Wed, 07 May 2025 11:09:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Propagate contiguous attribute through HLFIR. (PR #138797) In-Reply-To: Message-ID: <681ba1e1.050a0220.1f904d.2bab@mx.google.com> https://github.com/vzakhari updated https://github.com/llvm/llvm-project/pull/138797

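A side note on the `MakeNameFromOperator` hunk quoted in the PR #131628 review above: the hand-written lengths passed next to `"op.OR"`, `"op.EQV"` and `"op.NEQV"` (6, 7 and 8) appear to be one larger than the number of visible characters, while `"op.AND"` (6) and `"op.*"` (4) match exactly. The sketch below shows one way to avoid hand counting by deriving the length from the literal's array type. `Literal` is a hypothetical helper, and `std::string_view` is used here as an assumed stand-in for a pointer-plus-length type such as `parser::CharBlock`; whether a trailing NUL matters to flang's name lookup is not something this sketch establishes.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: take the length from the array type of the literal,
// dropping the terminating NUL, so it cannot drift from the text.
template <std::size_t N>
constexpr std::string_view Literal(const char (&s)[N]) {
  return std::string_view{s, N - 1};
}

static_assert(Literal("op.AND").size() == 6);
static_assert(Literal("op.OR").size() == 5);
static_assert(Literal("op.EQV").size() == 6);
static_assert(Literal("op.NEQV").size() == 7);

// Usage: contrast a hand-counted length with the derived one.
inline void demoLiteralLengths() {
  // A length of 6 for "op.OR" also covers the terminating NUL byte.
  constexpr std::string_view handCounted{"op.OR", 6};
  assert(handCounted.size() == 6);
  assert(handCounted.back() == '\0');
  // The derived view holds exactly the visible characters.
  assert(Literal("op.OR") == "op.OR");
  assert(handCounted != Literal("op.OR"));
}
```

The same idea would apply to the pairs given to `.Case(...)` in `MangleSpecialFunctions`, whose lengths all match their literals in the quoted hunk.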
From flang-commits at lists.llvm.org Wed May 7 10:21:57 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Wed, 07 May 2025 10:21:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #131628) In-Reply-To: Message-ID: <681b96b5.050a0220.165230.0243@mx.google.com> ================ @@ -343,6 +347,11 @@ class SemanticsContext { std::map moduleFileOutputRenamings_; UnorderedSymbolSet isDefined_; std::list programTrees_; + + // storage for mangled names used in OMP DECLARE REDUCTION. ---------------- kparzysz wrote: Typo: "storage" -> "Storage". https://github.com/llvm/llvm-project/pull/131628 From flang-commits at lists.llvm.org Wed May 7 11:58:48 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 07 May 2025 11:58:48 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Update `do concurrent` mapping pass to use `fir.do_concurrent` op (PR #138489) In-Reply-To: Message-ID: <681bad68.050a0220.abbe0.2d47@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/138489 >From 7a8633684d3c6e021c4188e5ce90c563b4537196 Mon Sep 17 00:00:00 2001 From: ergawy Date: Mon, 5 May 2025 02:23:04 -0500 Subject: [PATCH] [flang][OpenMP] Update `do concurrent` mapping pass to use `fir.do_concurrent` op This PR updates the `do concurrent` to OpenMP mapping pass to use the newly added `fir.do_concurrent` ops that were recently added upstream instead of handling nests of `fir.do_loop ... unordered` ops. Parent PR: https://github.com/llvm/llvm-project/pull/137928. 
--- .../OpenMP/DoConcurrentConversion.cpp | 362 ++++-------------- .../Transforms/DoConcurrent/basic_device.mlir | 24 +- .../Transforms/DoConcurrent/basic_host.f90 | 3 - .../Transforms/DoConcurrent/basic_host.mlir | 26 +- .../DoConcurrent/locally_destroyed_temp.f90 | 3 - .../DoConcurrent/loop_nest_test.f90 | 92 ----- .../multiple_iteration_ranges.f90 | 3 - .../DoConcurrent/non_const_bounds.f90 | 3 - .../DoConcurrent/not_perfectly_nested.f90 | 24 +- 9 files changed, 122 insertions(+), 418 deletions(-) delete mode 100644 flang/test/Transforms/DoConcurrent/loop_nest_test.f90 diff --git a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp index 2c069860ffdca..0fdb302fe10ca 100644 --- a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp +++ b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp @@ -6,6 +6,7 @@ // //===----------------------------------------------------------------------===// +#include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/OpenMP/Passes.h" #include "flang/Optimizer/OpenMP/Utils.h" @@ -28,8 +29,10 @@ namespace looputils { /// Stores info needed about the induction/iteration variable for each `do /// concurrent` in a loop nest. struct InductionVariableInfo { - InductionVariableInfo(fir::DoLoopOp doLoop) { populateInfo(doLoop); } - + InductionVariableInfo(fir::DoConcurrentLoopOp loop, + mlir::Value inductionVar) { + populateInfo(loop, inductionVar); + } /// The operation allocating memory for iteration variable. mlir::Operation *iterVarMemDef; /// the operation(s) updating the iteration variable with the current @@ -45,7 +48,7 @@ struct InductionVariableInfo { /// ... /// %i:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : ... /// ... 
- /// fir.do_loop %ind_var = %lb to %ub step %s unordered { + /// fir.do_concurrent.loop (%ind_var) = (%lb) to (%ub) step (%s) { /// %ind_var_conv = fir.convert %ind_var : (index) -> i32 /// fir.store %ind_var_conv to %i#1 : !fir.ref /// ... @@ -62,14 +65,14 @@ struct InductionVariableInfo { /// Note: The current implementation is dependent on how flang emits loop /// bodies; which is sufficient for the current simple test/use cases. If this /// proves to be insufficient, this should be made more generic. - void populateInfo(fir::DoLoopOp doLoop) { + void populateInfo(fir::DoConcurrentLoopOp loop, mlir::Value inductionVar) { mlir::Value result = nullptr; // Checks if a StoreOp is updating the memref of the loop's iteration // variable. auto isStoringIV = [&](fir::StoreOp storeOp) { // Direct store into the IV memref. - if (storeOp.getValue() == doLoop.getInductionVar()) { + if (storeOp.getValue() == inductionVar) { indVarUpdateOps.push_back(storeOp); return true; } @@ -77,7 +80,7 @@ struct InductionVariableInfo { // Indirect store into the IV memref. if (auto convertOp = mlir::dyn_cast( storeOp.getValue().getDefiningOp())) { - if (convertOp.getOperand() == doLoop.getInductionVar()) { + if (convertOp.getOperand() == inductionVar) { indVarUpdateOps.push_back(convertOp); indVarUpdateOps.push_back(storeOp); return true; @@ -87,7 +90,7 @@ struct InductionVariableInfo { return false; }; - for (mlir::Operation &op : doLoop) { + for (mlir::Operation &op : loop) { if (auto storeOp = mlir::dyn_cast(op)) if (isStoringIV(storeOp)) { result = storeOp.getMemref(); @@ -100,219 +103,7 @@ struct InductionVariableInfo { } }; -using LoopNestToIndVarMap = - llvm::MapVector; - -/// Loop \p innerLoop is considered perfectly-nested inside \p outerLoop iff -/// there are no operations in \p outerloop's body other than: -/// -/// 1. the operations needed to assign/update \p outerLoop's induction variable. -/// 2. \p innerLoop itself. 
-/// -/// \p return true if \p innerLoop is perfectly nested inside \p outerLoop -/// according to the above definition. -bool isPerfectlyNested(fir::DoLoopOp outerLoop, fir::DoLoopOp innerLoop) { - mlir::ForwardSliceOptions forwardSliceOptions; - forwardSliceOptions.inclusive = true; - // The following will be used as an example to clarify the internals of this - // function: - // ``` - // 1. fir.do_loop %i_idx = %34 to %36 step %c1 unordered { - // 2. %i_idx_2 = fir.convert %i_idx : (index) -> i32 - // 3. fir.store %i_idx_2 to %i_iv#1 : !fir.ref - // - // 4. fir.do_loop %j_idx = %37 to %39 step %c1_3 unordered { - // 5. %j_idx_2 = fir.convert %j_idx : (index) -> i32 - // 6. fir.store %j_idx_2 to %j_iv#1 : !fir.ref - // ... loop nest body, possible uses %i_idx ... - // } - // } - // ``` - // In this example, the `j` loop is perfectly nested inside the `i` loop and - // below is how we find that. - - // We don't care about the outer-loop's induction variable's uses within the - // inner-loop, so we filter out these uses. - // - // This filter tells `getForwardSlice` (below) to only collect operations - // which produce results defined above (i.e. outside) the inner-loop's body. - // - // Since `outerLoop.getInductionVar()` is a block argument (to the - // outer-loop's body), the filter effectively collects uses of - // `outerLoop.getInductionVar()` inside the outer-loop but outside the - // inner-loop. - forwardSliceOptions.filter = [&](mlir::Operation *op) { - return mlir::areValuesDefinedAbove(op->getResults(), innerLoop.getRegion()); - }; - - llvm::SetVector indVarSlice; - // The forward slice of the `i` loop's IV will be the 2 ops in line 1 & 2 - // above. Uses of `%i_idx` inside the `j` loop are not collected because of - // the filter. 
- mlir::getForwardSlice(outerLoop.getInductionVar(), &indVarSlice, - forwardSliceOptions); - llvm::DenseSet indVarSet(indVarSlice.begin(), - indVarSlice.end()); - - llvm::DenseSet outerLoopBodySet; - // The following walk collects ops inside `outerLoop` that are **not**: - // * the outer-loop itself, - // * or the inner-loop, - // * or the `fir.result` op (the outer-loop's terminator). - // - // For the above example, this will also populate `outerLoopBodySet` with ops - // in line 1 & 2 since we skip the `i` loop, the `j` loop, and the terminator. - outerLoop.walk([&](mlir::Operation *op) { - if (op == outerLoop) - return mlir::WalkResult::advance(); - - if (op == innerLoop) - return mlir::WalkResult::skip(); - - if (mlir::isa(op)) - return mlir::WalkResult::advance(); - - outerLoopBodySet.insert(op); - return mlir::WalkResult::advance(); - }); - - // If `outerLoopBodySet` ends up having the same ops as `indVarSet`, then - // `outerLoop` only contains ops that setup its induction variable + - // `innerLoop` + the `fir.result` terminator. In other words, `innerLoop` is - // perfectly nested inside `outerLoop`. - bool result = (outerLoopBodySet == indVarSet); - LLVM_DEBUG(DBGS() << "Loop pair starting at location " << outerLoop.getLoc() - << " is" << (result ? "" : " not") - << " perfectly nested\n"); - - return result; -} - -/// Starting with `currentLoop` collect a perfectly nested loop nest, if any. -/// This function collects as much as possible loops in the nest; it case it -/// fails to recognize a certain nested loop as part of the nest it just returns -/// the parent loops it discovered before. 
-mlir::LogicalResult collectLoopNest(fir::DoLoopOp currentLoop, - LoopNestToIndVarMap &loopNest) { - assert(currentLoop.getUnordered()); - - while (true) { - loopNest.insert({currentLoop, InductionVariableInfo(currentLoop)}); - llvm::SmallVector unorderedLoops; - - for (auto nestedLoop : currentLoop.getRegion().getOps()) - if (nestedLoop.getUnordered()) - unorderedLoops.push_back(nestedLoop); - - if (unorderedLoops.empty()) - break; - - // Having more than one unordered loop means that we are not dealing with a - // perfect loop nest (i.e. a mulit-range `do concurrent` loop); which is the - // case we are after here. - if (unorderedLoops.size() > 1) - return mlir::failure(); - - fir::DoLoopOp nestedUnorderedLoop = unorderedLoops.front(); - - if (!isPerfectlyNested(currentLoop, nestedUnorderedLoop)) - return mlir::failure(); - - currentLoop = nestedUnorderedLoop; - } - - return mlir::success(); -} - -/// Prepares the `fir.do_loop` nest to be easily mapped to OpenMP. In -/// particular, this function would take this input IR: -/// ``` -/// fir.do_loop %i_iv = %i_lb to %i_ub step %i_step unordered { -/// fir.store %i_iv to %i#1 : !fir.ref -/// %j_lb = arith.constant 1 : i32 -/// %j_ub = arith.constant 10 : i32 -/// %j_step = arith.constant 1 : index -/// -/// fir.do_loop %j_iv = %j_lb to %j_ub step %j_step unordered { -/// fir.store %j_iv to %j#1 : !fir.ref -/// ... 
-/// } -/// } -/// ``` -/// -/// into the following form (using generic op form since the result is -/// technically an invalid `fir.do_loop` op: -/// -/// ``` -/// "fir.do_loop"(%i_lb, %i_ub, %i_step) <{unordered}> ({ -/// ^bb0(%i_iv: index): -/// %j_lb = "arith.constant"() <{value = 1 : i32}> : () -> i32 -/// %j_ub = "arith.constant"() <{value = 10 : i32}> : () -> i32 -/// %j_step = "arith.constant"() <{value = 1 : index}> : () -> index -/// -/// "fir.do_loop"(%j_lb, %j_ub, %j_step) <{unordered}> ({ -/// ^bb0(%new_i_iv: index, %new_j_iv: index): -/// "fir.store"(%new_i_iv, %i#1) : (i32, !fir.ref) -> () -/// "fir.store"(%new_j_iv, %j#1) : (i32, !fir.ref) -> () -/// ... -/// }) -/// ``` -/// -/// What happened to the loop nest is the following: -/// -/// * the innermost loop's entry block was updated from having one operand to -/// having `n` operands where `n` is the number of loops in the nest, -/// -/// * the outer loop(s)' ops that update the IVs were sank inside the innermost -/// loop (see the `"fir.store"(%new_i_iv, %i#1)` op above), -/// -/// * the innermost loop's entry block's arguments were mapped in order from the -/// outermost to the innermost IV. -/// -/// With this IR change, we can directly inline the innermost loop's region into -/// the newly generated `omp.loop_nest` op. -/// -/// Note that this function has a pre-condition that \p loopNest consists of -/// perfectly nested loops; i.e. there are no in-between ops between 2 nested -/// loops except for the ops to setup the inner loop's LB, UB, and step. These -/// ops are handled/cloned by `genLoopNestClauseOps(..)`. 
-void sinkLoopIVArgs(mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest) { - if (loopNest.size() <= 1) - return; - - fir::DoLoopOp innermostLoop = loopNest.back().first; - mlir::Operation &innermostFirstOp = innermostLoop.getRegion().front().front(); - - llvm::SmallVector argTypes; - llvm::SmallVector argLocs; - - for (auto &[doLoop, indVarInfo] : llvm::drop_end(loopNest)) { - // Sink the IV update ops to the innermost loop. We need to do for all loops - // except for the innermost one, hence the `drop_end` usage above. - for (mlir::Operation *op : indVarInfo.indVarUpdateOps) - op->moveBefore(&innermostFirstOp); - - argTypes.push_back(doLoop.getInductionVar().getType()); - argLocs.push_back(doLoop.getInductionVar().getLoc()); - } - - mlir::Region &innermmostRegion = innermostLoop.getRegion(); - // Extend the innermost entry block with arguments to represent the outer IVs. - innermmostRegion.addArguments(argTypes, argLocs); - - unsigned idx = 1; - // In reverse, remap the IVs of the loop nest from the old values to the new - // ones. We do that in reverse since the first argument before this loop is - // the old IV for the innermost loop. Therefore, we want to replace it first - // before the old value (1st argument in the block) is remapped to be the IV - // of the outermost loop in the nest. - for (auto &[doLoop, _] : llvm::reverse(loopNest)) { - doLoop.getInductionVar().replaceAllUsesWith( - innermmostRegion.getArgument(innermmostRegion.getNumArguments() - idx)); - ++idx; - } -} +using InductionVariableInfos = llvm::SmallVector; /// Collects values that are local to a loop: "loop-local values". A loop-local /// value is one that is used exclusively inside the loop but allocated outside @@ -326,9 +117,9 @@ void sinkLoopIVArgs(mlir::ConversionPatternRewriter &rewriter, /// used exclusively inside. /// /// \param [out] locals - the list of loop-local values detected for \p doLoop. 
-void collectLoopLocalValues(fir::DoLoopOp doLoop, +void collectLoopLocalValues(fir::DoConcurrentLoopOp loop, llvm::SetVector &locals) { - doLoop.walk([&](mlir::Operation *op) { + loop.walk([&](mlir::Operation *op) { for (mlir::Value operand : op->getOperands()) { if (locals.contains(operand)) continue; @@ -340,11 +131,11 @@ void collectLoopLocalValues(fir::DoLoopOp doLoop, // Values defined inside the loop are not interesting since they do not // need to be localized. - if (doLoop->isAncestor(operand.getDefiningOp())) + if (loop->isAncestor(operand.getDefiningOp())) continue; for (auto *user : operand.getUsers()) { - if (!doLoop->isAncestor(user)) { + if (!loop->isAncestor(user)) { isLocal = false; break; } @@ -373,39 +164,42 @@ static void localizeLoopLocalValue(mlir::Value local, mlir::Region &allocRegion, } } // namespace looputils -class DoConcurrentConversion : public mlir::OpConversionPattern { +class DoConcurrentConversion + : public mlir::OpConversionPattern { public: - using mlir::OpConversionPattern::OpConversionPattern; + using mlir::OpConversionPattern::OpConversionPattern; - DoConcurrentConversion(mlir::MLIRContext *context, bool mapToDevice, - llvm::DenseSet &concurrentLoopsToSkip) + DoConcurrentConversion( + mlir::MLIRContext *context, bool mapToDevice, + llvm::DenseSet &concurrentLoopsToSkip) : OpConversionPattern(context), mapToDevice(mapToDevice), concurrentLoopsToSkip(concurrentLoopsToSkip) {} mlir::LogicalResult - matchAndRewrite(fir::DoLoopOp doLoop, OpAdaptor adaptor, + matchAndRewrite(fir::DoConcurrentOp doLoop, OpAdaptor adaptor, mlir::ConversionPatternRewriter &rewriter) const override { if (mapToDevice) return doLoop.emitError( "not yet implemented: Mapping `do concurrent` loops to device"); - looputils::LoopNestToIndVarMap loopNest; - bool hasRemainingNestedLoops = - failed(looputils::collectLoopNest(doLoop, loopNest)); - if (hasRemainingNestedLoops) - mlir::emitWarning(doLoop.getLoc(), - "Some `do concurent` loops are not 
perfectly-nested. " - "These will be serialized."); + looputils::InductionVariableInfos ivInfos; + auto loop = mlir::cast( + doLoop.getRegion().back().getTerminator()); + + auto indVars = loop.getLoopInductionVars(); + assert(indVars.has_value()); + + for (mlir::Value indVar : *indVars) + ivInfos.emplace_back(loop, indVar); llvm::SetVector locals; - looputils::collectLoopLocalValues(loopNest.back().first, locals); - looputils::sinkLoopIVArgs(rewriter, loopNest); + looputils::collectLoopLocalValues(loop, locals); mlir::IRMapping mapper; mlir::omp::ParallelOp parallelOp = - genParallelOp(doLoop.getLoc(), rewriter, loopNest, mapper); + genParallelOp(doLoop.getLoc(), rewriter, ivInfos, mapper); mlir::omp::LoopNestOperands loopNestClauseOps; - genLoopNestClauseOps(doLoop.getLoc(), rewriter, loopNest, mapper, + genLoopNestClauseOps(doLoop.getLoc(), rewriter, loop, mapper, loopNestClauseOps); for (mlir::Value local : locals) @@ -413,41 +207,56 @@ class DoConcurrentConversion : public mlir::OpConversionPattern { rewriter); mlir::omp::LoopNestOp ompLoopNest = - genWsLoopOp(rewriter, loopNest.back().first, mapper, loopNestClauseOps, + genWsLoopOp(rewriter, loop, mapper, loopNestClauseOps, /*isComposite=*/mapToDevice); - rewriter.eraseOp(doLoop); + rewriter.setInsertionPoint(doLoop); + fir::FirOpBuilder builder( + rewriter, + fir::getKindMapping(doLoop->getParentOfType())); + + // Collect iteration variable(s) allocations so that we can move them + // outside the `fir.do_concurrent` wrapper (before erasing it). + llvm::SmallVector opsToMove; + for (mlir::Operation &op : llvm::drop_end(doLoop)) + opsToMove.push_back(&op); + + mlir::Block *allocBlock = builder.getAllocaBlock(); + + for (mlir::Operation *op : llvm::reverse(opsToMove)) { + rewriter.moveOpBefore(op, allocBlock, allocBlock->begin()); + } // Mark `unordered` loops that are not perfectly nested to be skipped from // the legality check of the `ConversionTarget` since we are not interested // in mapping them to OpenMP. 
- ompLoopNest->walk([&](fir::DoLoopOp doLoop) { - if (doLoop.getUnordered()) { - concurrentLoopsToSkip.insert(doLoop); - } + ompLoopNest->walk([&](fir::DoConcurrentOp doLoop) { + concurrentLoopsToSkip.insert(doLoop); }); + rewriter.eraseOp(doLoop); + return mlir::success(); } private: - mlir::omp::ParallelOp genParallelOp(mlir::Location loc, - mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest, - mlir::IRMapping &mapper) const { + mlir::omp::ParallelOp + genParallelOp(mlir::Location loc, mlir::ConversionPatternRewriter &rewriter, + looputils::InductionVariableInfos &ivInfos, + mlir::IRMapping &mapper) const { auto parallelOp = rewriter.create(loc); rewriter.createBlock(¶llelOp.getRegion()); rewriter.setInsertionPoint(rewriter.create(loc)); - genLoopNestIndVarAllocs(rewriter, loopNest, mapper); + genLoopNestIndVarAllocs(rewriter, ivInfos, mapper); return parallelOp; } void genLoopNestIndVarAllocs(mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest, + looputils::InductionVariableInfos &ivInfos, mlir::IRMapping &mapper) const { - for (auto &[_, indVarInfo] : loopNest) + for (auto &indVarInfo : ivInfos) genInductionVariableAlloc(rewriter, indVarInfo.iterVarMemDef, mapper); } @@ -471,10 +280,11 @@ class DoConcurrentConversion : public mlir::OpConversionPattern { return result; } - void genLoopNestClauseOps( - mlir::Location loc, mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest, mlir::IRMapping &mapper, - mlir::omp::LoopNestOperands &loopNestClauseOps) const { + void + genLoopNestClauseOps(mlir::Location loc, + mlir::ConversionPatternRewriter &rewriter, + fir::DoConcurrentLoopOp loop, mlir::IRMapping &mapper, + mlir::omp::LoopNestOperands &loopNestClauseOps) const { assert(loopNestClauseOps.loopLowerBounds.empty() && "Loop nest bounds were already emitted!"); @@ -483,43 +293,42 @@ class DoConcurrentConversion : public mlir::OpConversionPattern { 
bounds.push_back(var.getDefiningOp()->getResult(0)); }; - for (auto &[doLoop, _] : loopNest) { - populateBounds(doLoop.getLowerBound(), loopNestClauseOps.loopLowerBounds); - populateBounds(doLoop.getUpperBound(), loopNestClauseOps.loopUpperBounds); - populateBounds(doLoop.getStep(), loopNestClauseOps.loopSteps); + for (auto [lb, ub, st] : llvm::zip_equal( + loop.getLowerBound(), loop.getUpperBound(), loop.getStep())) { + populateBounds(lb, loopNestClauseOps.loopLowerBounds); + populateBounds(ub, loopNestClauseOps.loopUpperBounds); + populateBounds(st, loopNestClauseOps.loopSteps); } loopNestClauseOps.loopInclusive = rewriter.getUnitAttr(); } mlir::omp::LoopNestOp - genWsLoopOp(mlir::ConversionPatternRewriter &rewriter, fir::DoLoopOp doLoop, - mlir::IRMapping &mapper, + genWsLoopOp(mlir::ConversionPatternRewriter &rewriter, + fir::DoConcurrentLoopOp loop, mlir::IRMapping &mapper, const mlir::omp::LoopNestOperands &clauseOps, bool isComposite) const { - auto wsloopOp = rewriter.create(doLoop.getLoc()); + auto wsloopOp = rewriter.create(loop.getLoc()); wsloopOp.setComposite(isComposite); rewriter.createBlock(&wsloopOp.getRegion()); auto loopNestOp = - rewriter.create(doLoop.getLoc(), clauseOps); + rewriter.create(loop.getLoc(), clauseOps); // Clone the loop's body inside the loop nest construct using the // mapped values. 
- rewriter.cloneRegionBefore(doLoop.getRegion(), loopNestOp.getRegion(), + rewriter.cloneRegionBefore(loop.getRegion(), loopNestOp.getRegion(), loopNestOp.getRegion().begin(), mapper); - mlir::Operation *terminator = loopNestOp.getRegion().back().getTerminator(); rewriter.setInsertionPointToEnd(&loopNestOp.getRegion().back()); - rewriter.create(terminator->getLoc()); - rewriter.eraseOp(terminator); + rewriter.create(loop->getLoc()); return loopNestOp; } bool mapToDevice; - llvm::DenseSet &concurrentLoopsToSkip; + llvm::DenseSet &concurrentLoopsToSkip; }; class DoConcurrentConversionPass @@ -548,19 +357,16 @@ class DoConcurrentConversionPass return; } - llvm::DenseSet concurrentLoopsToSkip; + llvm::DenseSet concurrentLoopsToSkip; mlir::RewritePatternSet patterns(context); patterns.insert( context, mapTo == flangomp::DoConcurrentMappingKind::DCMK_Device, concurrentLoopsToSkip); mlir::ConversionTarget target(*context); - target.addDynamicallyLegalOp([&](fir::DoLoopOp op) { - // The goal is to handle constructs that eventually get lowered to - // `fir.do_loop` with the `unordered` attribute (e.g. array expressions). - // Currently, this is only enabled for the `do concurrent` construct since - // the pass runs early in the pipeline. 
- return !op.getUnordered() || concurrentLoopsToSkip.contains(op); - }); + target.addDynamicallyLegalOp( + [&](fir::DoConcurrentOp op) { + return concurrentLoopsToSkip.contains(op); + }); target.markUnknownOpDynamicallyLegal( [](mlir::Operation *) { return true; }); diff --git a/flang/test/Transforms/DoConcurrent/basic_device.mlir b/flang/test/Transforms/DoConcurrent/basic_device.mlir index d7fcc40e4a7f9..0ca48943864c8 100644 --- a/flang/test/Transforms/DoConcurrent/basic_device.mlir +++ b/flang/test/Transforms/DoConcurrent/basic_device.mlir @@ -1,8 +1,6 @@ // RUN: fir-opt --omp-do-concurrent-conversion="map-to=device" -verify-diagnostics %s func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_basic"} { - %0 = fir.alloca i32 {bindc_name = "i"} - %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) %2 = fir.address_of(@_QFEa) : !fir.ref> %c10 = arith.constant 10 : index %3 = fir.shape %c10 : (index) -> !fir.shape<1> @@ -14,15 +12,19 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_bas %c1 = arith.constant 1 : index // expected-error at +2 {{not yet implemented: Mapping `do concurrent` loops to device}} - // expected-error at below {{failed to legalize operation 'fir.do_loop'}} - fir.do_loop %arg0 = %7 to %8 step %c1 unordered { - %13 = fir.convert %arg0 : (index) -> i32 - fir.store %13 to %1#1 : !fir.ref - %14 = fir.load %1#0 : !fir.ref - %15 = fir.load %1#0 : !fir.ref - %16 = fir.convert %15 : (i32) -> i64 - %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref - hlfir.assign %14 to %17 : i32, !fir.ref + // expected-error at below {{failed to legalize operation 'fir.do_concurrent'}} + fir.do_concurrent { + %0 = fir.alloca i32 {bindc_name = "i"} + %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) + fir.do_concurrent.loop (%arg0) = (%7) to (%8) step (%c1) { + %13 = fir.convert %arg0 : (index) -> i32 + fir.store %13 to %1#1 : !fir.ref + %14 = 
fir.load %1#0 : !fir.ref + %15 = fir.load %1#0 : !fir.ref + %16 = fir.convert %15 : (i32) -> i64 + %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref + hlfir.assign %14 to %17 : i32, !fir.ref + } } return diff --git a/flang/test/Transforms/DoConcurrent/basic_host.f90 b/flang/test/Transforms/DoConcurrent/basic_host.f90 index b84d4481ac766..12f63031cbaee 100644 --- a/flang/test/Transforms/DoConcurrent/basic_host.f90 +++ b/flang/test/Transforms/DoConcurrent/basic_host.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests mapping of a basic `do concurrent` loop to `!$omp parallel do`. ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host %s -o - \ diff --git a/flang/test/Transforms/DoConcurrent/basic_host.mlir b/flang/test/Transforms/DoConcurrent/basic_host.mlir index b53ecd687c039..5425829404d7b 100644 --- a/flang/test/Transforms/DoConcurrent/basic_host.mlir +++ b/flang/test/Transforms/DoConcurrent/basic_host.mlir @@ -6,8 +6,6 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_basic"} { // CHECK: %[[ARR:.*]]:2 = hlfir.declare %{{.*}}(%{{.*}}) {uniq_name = "_QFEa"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) - %0 = fir.alloca i32 {bindc_name = "i"} - %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) %2 = fir.address_of(@_QFEa) : !fir.ref> %c10 = arith.constant 10 : index %3 = fir.shape %c10 : (index) -> !fir.shape<1> @@ -18,7 +16,7 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_bas %8 = fir.convert %c10_i32 : (i32) -> index %c1 = arith.constant 1 : index - // CHECK-NOT: fir.do_loop + // CHECK-NOT: fir.do_concurrent // CHECK: %[[C1:.*]] = arith.constant 1 : i32 // CHECK: %[[LB:.*]] = fir.convert %[[C1]] : (i32) -> index @@ -46,17 +44,21 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_bas // CHECK-NEXT: omp.terminator // CHECK-NEXT: } - fir.do_loop 
%arg0 = %7 to %8 step %c1 unordered { - %13 = fir.convert %arg0 : (index) -> i32 - fir.store %13 to %1#1 : !fir.ref - %14 = fir.load %1#0 : !fir.ref - %15 = fir.load %1#0 : !fir.ref - %16 = fir.convert %15 : (i32) -> i64 - %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref - hlfir.assign %14 to %17 : i32, !fir.ref + fir.do_concurrent { + %0 = fir.alloca i32 {bindc_name = "i"} + %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) + fir.do_concurrent.loop (%arg0) = (%7) to (%8) step (%c1) { + %13 = fir.convert %arg0 : (index) -> i32 + fir.store %13 to %1#1 : !fir.ref + %14 = fir.load %1#0 : !fir.ref + %15 = fir.load %1#0 : !fir.ref + %16 = fir.convert %15 : (i32) -> i64 + %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref + hlfir.assign %14 to %17 : i32, !fir.ref + } } - // CHECK-NOT: fir.do_loop + // CHECK-NOT: fir.do_concurrent return } diff --git a/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 b/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 index 4e13c0919589a..f82696669eca6 100644 --- a/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 +++ b/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests that "loop-local values" are properly handled by localizing them to the ! body of the loop nest. See `collectLoopLocalValues` and `localizeLoopLocalValue` ! for a definition of "loop-local values" and how they are handled. diff --git a/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 b/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 deleted file mode 100644 index adc4a488d1ec9..0000000000000 --- a/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 +++ /dev/null @@ -1,92 +0,0 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - -! Tests loop-nest detection algorithm for do-concurrent mapping. - -! 
REQUIRES: asserts - -! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host \ -! RUN: -mmlir -debug -mmlir -mlir-disable-threading %s -o - 2> %t.log || true - -! RUN: FileCheck %s < %t.log - -program main - implicit none - -contains - -subroutine foo(n) - implicit none - integer :: n, m - integer :: i, j, k - integer :: x - integer, dimension(n) :: a - integer, dimension(n, n, n) :: b - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is perfectly nested - do concurrent(i=1:n, j=1:bar(n*m, n/m)) - a(i) = n - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is perfectly nested - do concurrent(i=bar(n, x):n, j=1:bar(n*m, n/m)) - a(i) = n - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=bar(n, x):n) - do concurrent(j=1:bar(n*m, n/m)) - a(i) = n - end do - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=1:n) - x = 10 - do concurrent(j=1:m) - b(i,j,k) = i * j + k - end do - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=1:n) - do concurrent(j=1:m) - b(i,j,k) = i * j + k - end do - x = 10 - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=1:n) - do concurrent(j=1:m) - b(i,j,k) = i * j + k - x = 10 - end do - end do - - ! Verify the (i,j) and (j,k) pairs of loops are detected as perfectly nested. - ! - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 3]]:{{.*}}) is perfectly nested - ! CHECK: Loop pair starting at location - ! 
CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is perfectly nested - do concurrent(i=bar(n, x):n, j=1:bar(n*m, n/m), k=1:bar(n*m, bar(n*m, n/m))) - a(i) = n - end do -end subroutine - -pure function bar(n, m) - implicit none - integer, intent(in) :: n, m - integer :: bar - - bar = n + m -end function - -end program main diff --git a/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 b/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 index 26800678d381c..d0210726de83e 100644 --- a/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 +++ b/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests mapping of a `do concurrent` loop with multiple iteration ranges. ! RUN: split-file %s %t diff --git a/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 b/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 index 23a3aae976c07..cd1bd4f98a3f5 100644 --- a/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 +++ b/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host %s -o - \ ! RUN: | FileCheck %s diff --git a/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 b/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 index d1c02101318ab..74799359e0476 100644 --- a/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 +++ b/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests that if `do concurrent` is not perfectly nested in its parent loop, that ! we skip converting the not-perfectly nested `do concurrent` loop. @@ -22,23 +19,24 @@ program main end do end -! 
CHECK: %[[ORIG_K_ALLOC:.*]] = fir.alloca i32 {bindc_name = "k"} -! CHECK: %[[ORIG_K_DECL:.*]]:2 = hlfir.declare %[[ORIG_K_ALLOC]] - -! CHECK: %[[ORIG_J_ALLOC:.*]] = fir.alloca i32 {bindc_name = "j"} -! CHECK: %[[ORIG_J_DECL:.*]]:2 = hlfir.declare %[[ORIG_J_ALLOC]] - ! CHECK: omp.parallel { ! CHECK: omp.wsloop { ! CHECK: omp.loop_nest ({{[^[:space:]]+}}) {{.*}} { -! CHECK: fir.do_loop %[[J_IV:.*]] = {{.*}} { -! CHECK: %[[J_IV_CONV:.*]] = fir.convert %[[J_IV]] : (index) -> i32 +! CHECK: fir.do_concurrent { + +! CHECK: %[[ORIG_J_ALLOC:.*]] = fir.alloca i32 {bindc_name = "j"} +! CHECK: %[[ORIG_J_DECL:.*]]:2 = hlfir.declare %[[ORIG_J_ALLOC]] + +! CHECK: %[[ORIG_K_ALLOC:.*]] = fir.alloca i32 {bindc_name = "k"} +! CHECK: %[[ORIG_K_DECL:.*]]:2 = hlfir.declare %[[ORIG_K_ALLOC]] + +! CHECK: fir.do_concurrent.loop (%[[J_IV:.*]], %[[K_IV:.*]]) = {{.*}} { +! CHECK: %[[J_IV_CONV:.*]] = fir.convert %[[J_IV]] : (index) -> i32 ! CHECK: fir.store %[[J_IV_CONV]] to %[[ORIG_J_DECL]]#0 -! CHECK: fir.do_loop %[[K_IV:.*]] = {{.*}} { ! CHECK: %[[K_IV_CONV:.*]] = fir.convert %[[K_IV]] : (index) -> i32 -! CHECK: fir.store %[[K_IV_CONV]] to %[[ORIG_K_DECL]]#0 +! CHECK: fir.store %[[K_IV_CONV]] to %[[ORIG_K_DECL]]#0 ! CHECK: } ! CHECK: } ! CHECK: omp.yield From flang-commits at lists.llvm.org Wed May 7 12:51:28 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Wed, 07 May 2025 12:51:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add `fir.local` op for locality specifiers (PR #138505) In-Reply-To: Message-ID: <681bb9c0.170a0220.15362c.f8c4@mx.google.com> ================ @@ -94,10 +94,11 @@ struct IncrementLoopInfo { template explicit IncrementLoopInfo(Fortran::semantics::Symbol &sym, const T &lower, const T &upper, const std::optional &step, - bool isUnordered = false) + bool isConcurrent = false) ---------------- clementval wrote: unordered is also used for array operation. 
how is this handled now? https://github.com/llvm/llvm-project/pull/138505 From flang-commits at lists.llvm.org Wed May 7 12:51:55 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Wed, 07 May 2025 12:51:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add `fir.local` op for locality specifiers (PR #138505) In-Reply-To: Message-ID: <681bb9db.170a0220.1fd603.9bde@mx.google.com> ================ @@ -120,7 +121,7 @@ struct IncrementLoopInfo { const Fortran::lower::SomeExpr *upperExpr; const Fortran::lower::SomeExpr *stepExpr; const Fortran::lower::SomeExpr *maskExpr = nullptr; - bool isUnordered; // do concurrent, forall ---------------- clementval wrote: is forall treated as do concurrent? https://github.com/llvm/llvm-project/pull/138505 From flang-commits at lists.llvm.org Wed May 7 12:53:03 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Wed, 07 May 2025 12:53:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add `fir.local` op for locality specifiers (PR #138505) In-Reply-To: Message-ID: <681bba1f.170a0220.2237be.1ddf@mx.google.com> https://github.com/clementval commented: Some post commit questions. 
https://github.com/llvm/llvm-project/pull/138505 From flang-commits at lists.llvm.org Wed May 7 12:56:15 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 12:56:15 -0700 (PDT) Subject: [flang-commits] [flang] 1a7cd92 - [flang][cuda] Update syncthreads interface (#138023) Message-ID: <681bbadf.170a0220.349bc2.5ffe@mx.google.com> Author: Valentin Clement (バレンタイン クレメン) Date: 2025-05-07T21:56:11+02:00 New Revision: 1a7cd92c8607bbad5c212f474a1e46043a8016cd URL: https://github.com/llvm/llvm-project/commit/1a7cd92c8607bbad5c212f474a1e46043a8016cd DIFF: https://github.com/llvm/llvm-project/commit/1a7cd92c8607bbad5c212f474a1e46043a8016cd.diff LOG: [flang][cuda] Update syncthreads interface (#138023) Added: Modified: flang/module/cudadevice.f90 Removed: ################################################################################ diff --git a/flang/module/cudadevice.f90 b/flang/module/cudadevice.f90 index 9bd90bcfc30ec..f8a30da8b9615 100644 --- a/flang/module/cudadevice.f90 +++ b/flang/module/cudadevice.f90 @@ -17,9 +17,8 @@ module cudadevice ! 
Synchronization Functions - interface - attributes(device) subroutine syncthreads() - end subroutine + interface syncthreads + procedure :: syncthreads end interface interface @@ -1614,4 +1613,9 @@ attributes(device,host) logical function on_device() bind(c) end function end interface +contains + + attributes(device) subroutine syncthreads() + end subroutine + end module From flang-commits at lists.llvm.org Wed May 7 12:56:18 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Wed, 07 May 2025 12:56:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Update syncthreads interface (PR #138023) In-Reply-To: Message-ID: <681bbae2.170a0220.bc0ea.969c@mx.google.com> https://github.com/clementval closed https://github.com/llvm/llvm-project/pull/138023 From flang-commits at lists.llvm.org Wed May 7 13:36:03 2025 From: flang-commits at lists.llvm.org (Michael Kruse via flang-commits) Date: Wed, 07 May 2025 13:36:03 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828) In-Reply-To: Message-ID: <681bc433.170a0220.10c52d.b31b@mx.google.com> Meinersbur wrote: > Just want to make sure: Should it be `$prefix/lib/clang/${LLVM_VERSION_MAJOR}/finclude//*.mod`? That is correct, I forgot the version number that is part of the resource directory. https://github.com/llvm/llvm-project/pull/137828 From flang-commits at lists.llvm.org Wed May 7 13:40:46 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 07 May 2025 13:40:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Further refinement of OpenMP !$ lines in -E mode (PR #138956) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/138956 Address failing Fujitsu test suite cases that were broken by the patch to defer the handling of !$ lines in -fopenmp vs. 
normal compilation to actual compilation rather than processing them immediately in -E mode. Tested on the samples in the bug report as well as all of the Fujitsu tests that I could find that use !$ lines. Fixes https://github.com/llvm/llvm-project/issues/136845. >From 3481138345699342b1737cda068df876dc08f92c Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Tue, 29 Apr 2025 13:55:45 -0700 Subject: [PATCH] [flang] Further refinement of OpenMP !$ lines in -E mode Address failing Fujitsu test suite cases that were broken by the patch to defer the handling of !$ lines in -fopenmp vs. normal compilation to actual compilation rather than processing them immediately in -E mode. Tested on the samples in the bug report as well as all of the Fujitsu tests that I could find that use !$ lines. Fixes https://github.com/llvm/llvm-project/issues/136845. --- flang/include/flang/Parser/token-sequence.h | 2 +- flang/lib/Parser/parsing.cpp | 7 +- flang/lib/Parser/prescan.cpp | 193 +++++++++--------- flang/lib/Parser/prescan.h | 5 + flang/lib/Parser/token-sequence.cpp | 6 +- flang/test/Parser/OpenMP/bug518.f | 4 +- .../compiler-directive-continuation.f90 | 12 +- flang/test/Parser/OpenMP/sentinels.f | 4 +- .../continuation-in-conditional-compilation.f | 7 +- flang/test/Preprocessing/bug136845.F | 45 ++++ 10 files changed, 168 insertions(+), 117 deletions(-) create mode 100644 flang/test/Preprocessing/bug136845.F diff --git a/flang/include/flang/Parser/token-sequence.h b/flang/include/flang/Parser/token-sequence.h index 69291e69526e2..05aeacccde097 100644 --- a/flang/include/flang/Parser/token-sequence.h +++ b/flang/include/flang/Parser/token-sequence.h @@ -137,7 +137,7 @@ class TokenSequence { TokenSequence &RemoveRedundantBlanks(std::size_t firstChar = 0); TokenSequence &ClipComment(const Prescanner &, bool skipFirst = false); const TokenSequence &CheckBadFortranCharacters( - Messages &, const Prescanner &, bool allowAmpersand) const; + Messages &, const Prescanner &, bool 
preprocessingOnly) const; bool BadlyNestedParentheses() const; const TokenSequence &CheckBadParentheses(Messages &) const; void Emit(CookedSource &) const; diff --git a/flang/lib/Parser/parsing.cpp b/flang/lib/Parser/parsing.cpp index 17f544194de02..93737d99567dd 100644 --- a/flang/lib/Parser/parsing.cpp +++ b/flang/lib/Parser/parsing.cpp @@ -230,10 +230,11 @@ void Parsing::EmitPreprocessedSource( column = 7; // start of fixed form source field ++sourceLine; inContinuation = true; - } else if (!inDirective && ch != ' ' && (ch < '0' || ch > '9')) { + } else if (!inDirective && !ompConditionalLine && ch != ' ' && + (ch < '0' || ch > '9')) { // Put anything other than a label or directive into the // Fortran fixed form source field (columns [7:72]). - for (; column < 7; ++column) { + for (int toCol{ch == '&' ? 6 : 7}; column < toCol; ++column) { out << ' '; } } @@ -241,7 +242,7 @@ void Parsing::EmitPreprocessedSource( if (ompConditionalLine) { // Only digits can stay in the label field if (!(ch >= '0' && ch <= '9')) { - for (; column < 7; ++column) { + for (int toCol{ch == '&' ? 6 : 7}; column < toCol; ++column) { out << ' '; } } diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp index 46e04c15ade01..ee180d986e39d 100644 --- a/flang/lib/Parser/prescan.cpp +++ b/flang/lib/Parser/prescan.cpp @@ -150,10 +150,7 @@ void Prescanner::Statement() { CHECK(*at_ == '!'); } std::optional condOffset; - bool isOpenMPCondCompilation{ - directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0'}; - if (isOpenMPCondCompilation) { - // OpenMP conditional compilation line. 
+ if (InOpenMPConditionalLine()) { condOffset = 2; } else if (directiveSentinel_[0] == '@' && directiveSentinel_[1] == 'c' && directiveSentinel_[2] == 'u' && directiveSentinel_[3] == 'f' && @@ -167,19 +164,10 @@ void Prescanner::Statement() { FortranInclude(at_ + *payload); return; } - while (true) { - if (auto n{IsSpace(at_)}) { - at_ += n, ++column_; - } else if (*at_ == '\t') { - ++at_, ++column_; - tabInCurrentLine_ = true; - } else if (inFixedForm_ && column_ == 6 && !tabInCurrentLine_ && - *at_ == '0') { - ++at_, ++column_; - } else { - break; - } + if (inFixedForm_) { + LabelField(tokens); } + SkipSpaces(); } else { // Compiler directive. Emit normalized sentinel, squash following spaces. // Conditional compilation lines (!$) take this path in -E mode too @@ -190,35 +178,47 @@ void Prescanner::Statement() { ++sp, ++at_, ++column_) { EmitChar(tokens, *sp); } - if (IsSpaceOrTab(at_)) { - while (int n{IsSpaceOrTab(at_)}) { - if (isOpenMPCondCompilation && inFixedForm_) { + if (inFixedForm_) { + while (column_ < 6) { + if (*at_ == '\t') { + tabInCurrentLine_ = true; + ++at_; + for (; column_ < 7; ++column_) { + EmitChar(tokens, ' '); + } + } else if (int spaceBytes{IsSpace(at_)}) { EmitChar(tokens, ' '); - } - tabInCurrentLine_ |= *at_ == '\t'; - at_ += n, ++column_; - if (inFixedForm_ && column_ > fixedFormColumnLimit_) { + at_ += spaceBytes; + ++column_; + } else { + if (InOpenMPConditionalLine() && column_ == 3 && + IsDecimalDigit(*at_)) { + // subtle: !$ in -E mode can't be immediately followed by a digit + EmitChar(tokens, ' '); + } break; } } - if (isOpenMPCondCompilation && inFixedForm_ && column_ == 6) { - if (*at_ == '0') { - EmitChar(tokens, ' '); - } else { - tokens.CloseToken(); - EmitChar(tokens, '&'); - } - ++at_, ++column_; + } else if (int spaceBytes{IsSpaceOrTab(at_)}) { + EmitChar(tokens, ' '); + at_ += spaceBytes, ++column_; + } + tokens.CloseToken(); + SkipSpaces(); + if (InOpenMPConditionalLine() && inFixedForm_ && !tabInCurrentLine_ && + 
column_ == 6 && *at_ != '\n') { + // !$ 0 - turn '0' into a space + // !$ 1 - turn '1' into '&' + if (int n{IsSpace(at_)}; n || *at_ == '0') { + at_ += n ? n : 1; } else { - EmitChar(tokens, ' '); + ++at_; + EmitChar(tokens, '&'); + tokens.CloseToken(); } + ++column_; + SkipSpaces(); } - tokens.CloseToken(); - } - if (*at_ == '!' || *at_ == '\n' || - (inFixedForm_ && column_ > fixedFormColumnLimit_ && - !tabInCurrentLine_)) { - return; // Directive without payload } break; } @@ -323,8 +323,8 @@ void Prescanner::Statement() { NormalizeCompilerDirectiveCommentMarker(*preprocessed); preprocessed->ToLowerCase(); SourceFormChange(preprocessed->ToString()); - CheckAndEmitLine(preprocessed->ToLowerCase().ClipComment( - *this, true /* skip first ! */), + CheckAndEmitLine( + preprocessed->ClipComment(*this, true /* skip first ! */), newlineProvenance); break; case LineClassification::Kind::Source: @@ -349,6 +349,24 @@ void Prescanner::Statement() { while (CompilerDirectiveContinuation(tokens, line.sentinel)) { newlineProvenance = GetCurrentProvenance(); } + if (preprocessingOnly_ && inFixedForm_ && InOpenMPConditionalLine() && + nextLine_ < limit_) { + // In -E mode, when the line after !$ conditional compilation is a + // regular fixed form continuation line, append a '&' to the line. + const char *p{nextLine_}; + int col{1}; + while (int n{IsSpace(p)}) { + if (*p == '\t') { + break; + } + p += n; + ++col; + } + if (col == 6 && *p != '0' && *p != '\t' && *p != '\n') { + EmitChar(tokens, '&'); + tokens.CloseToken(); + } + } tokens.ToLowerCase(); SourceFormChange(tokens.ToString()); } else { // Kind::Source @@ -544,7 +562,8 @@ void Prescanner::SkipToEndOfLine() { bool Prescanner::MustSkipToEndOfLine() const { if (inFixedForm_ && column_ > fixedFormColumnLimit_ && !tabInCurrentLine_) { return true; // skip over ignored columns in right margin (73:80) - } else if (*at_ == '!' && !inCharLiteral_) { + } else if (*at_ == '!' 
&& !inCharLiteral_ && + (!inFixedForm_ || tabInCurrentLine_ || column_ != 6)) { return !IsCompilerDirectiveSentinel(at_); } else { return false; @@ -569,10 +588,11 @@ void Prescanner::NextChar() { // directives, Fortran ! comments, stuff after the right margin in // fixed form, and all forms of line continuation. bool Prescanner::SkipToNextSignificantCharacter() { - auto anyContinuationLine{false}; if (inPreprocessorDirective_) { SkipCComments(); + return false; } else { + auto anyContinuationLine{false}; bool mightNeedSpace{false}; if (MustSkipToEndOfLine()) { SkipToEndOfLine(); @@ -589,8 +609,8 @@ bool Prescanner::SkipToNextSignificantCharacter() { if (*at_ == '\t') { tabInCurrentLine_ = true; } + return anyContinuationLine; } - return anyContinuationLine; } void Prescanner::SkipCComments() { @@ -1119,12 +1139,10 @@ static bool IsAtProcess(const char *p) { bool Prescanner::IsFixedFormCommentLine(const char *start) const { const char *p{start}; - // The @process directive must start in column 1. if (*p == '@' && IsAtProcess(p)) { return true; } - if (IsFixedFormCommentChar(*p) || *p == '%' || // VAX %list, %eject, &c. 
((*p == 'D' || *p == 'd') && !features_.IsEnabled(LanguageFeature::OldDebugLines))) { @@ -1325,23 +1343,9 @@ const char *Prescanner::FixedFormContinuationLine(bool mightNeedSpace) { nextLine_[1] == ' ' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && nextLine_[4] == ' '}; if (InCompilerDirective()) { - if (directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0') { - if (IsFixedFormCommentChar(col1)) { - if (nextLine_[1] == '$' && - (nextLine_[2] == '&' || IsSpaceOrTab(&nextLine_[2]))) { - // Next line is also !$ conditional compilation, might be continuation - if (preprocessingOnly_) { - return nullptr; - } - } else { - return nullptr; // comment, or distinct directive - } - } else if (!canBeNonDirectiveContinuation) { - return nullptr; - } - } else if (!IsFixedFormCommentChar(col1)) { - return nullptr; // in directive other than !$, but next line is not - } else { // in directive other than !$, next line might be continuation + // !$ under -E is not continued, but deferred to later compilation + if (IsFixedFormCommentChar(col1) && + !(InOpenMPConditionalLine() && preprocessingOnly_)) { int j{1}; for (; j < 5; ++j) { char ch{directiveSentinel_[j - 1]}; @@ -1356,31 +1360,27 @@ const char *Prescanner::FixedFormContinuationLine(bool mightNeedSpace) { return nullptr; } } - } - const char *col6{nextLine_ + 5}; - if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { - if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { - insertASpace_ = true; + const char *col6{nextLine_ + 5}; + if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { + if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { + insertASpace_ = true; + } + return nextLine_ + 6; } - return nextLine_ + 6; } - } else { - // Normal case: not in a compiler directive. 
- if (IsFixedFormCommentChar(col1)) { - if (nextLine_[1] == '$' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && - nextLine_[4] == ' ' && - IsCompilerDirectiveSentinel(&nextLine_[1], 1) && - !preprocessingOnly_) { - // !$ conditional compilation line as a continuation - const char *col6{nextLine_ + 5}; - if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { - if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { - insertASpace_ = true; - } - return nextLine_ + 6; - } + } else { // Normal case: not in a compiler directive. + // !$ conditional compilation lines may be continuations when not + // just preprocessing. + if (!preprocessingOnly_ && IsFixedFormCommentChar(col1) && + nextLine_[1] == '$' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && + nextLine_[4] == ' ' && IsCompilerDirectiveSentinel(&nextLine_[1], 1)) { + if (const char *col6{nextLine_ + 5}; + *col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { + insertASpace_ |= mightNeedSpace && !IsSpace(nextLine_ + 6); + return nextLine_ + 6; + } else { + return nullptr; } - return nullptr; } if (col1 == '&' && features_.IsEnabled( @@ -1422,13 +1422,13 @@ const char *Prescanner::FreeFormContinuationLine(bool ampersand) { } p = SkipWhiteSpaceIncludingEmptyMacros(p); if (InCompilerDirective()) { - if (directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0') { + if (InOpenMPConditionalLine()) { if (preprocessingOnly_) { // in -E mode, don't treat !$ as a continuation return nullptr; } else if (p[0] == '!' 
&& p[1] == '$') { // accept but do not require a matching sentinel - if (!(p[2] == '&' || IsSpaceOrTab(&p[2]))) { + if (p[2] != '&' && !IsSpaceOrTab(&p[2])) { return nullptr; // not !$ } p += 2; @@ -1566,15 +1566,11 @@ Prescanner::IsFixedFormCompilerDirectiveLine(const char *start) const { } char sentinel[5], *sp{sentinel}; int column{2}; - for (; column < 6; ++column, ++p) { - if (*p == '\n' || IsSpaceOrTab(p)) { - break; - } - if (sp == sentinel + 1 && sentinel[0] == '$' && IsDecimalDigit(*p)) { - // OpenMP conditional compilation line: leave the label alone + for (; column < 6; ++column) { + if (*p == '\n' || IsSpaceOrTab(p) || IsDecimalDigit(*p)) { break; } - *sp++ = ToLowerCaseLetter(*p); + *sp++ = ToLowerCaseLetter(*p++); } if (sp == sentinel) { return std::nullopt; @@ -1600,7 +1596,8 @@ Prescanner::IsFixedFormCompilerDirectiveLine(const char *start) const { ++p; } else if (int n{IsSpaceOrTab(p)}) { p += n; - } else if (isOpenMPConditional && preprocessingOnly_ && !hadDigit) { + } else if (isOpenMPConditional && preprocessingOnly_ && !hadDigit && + *p != '\n') { // In -E mode, "!$ &" is treated as a directive } else { // This is a Continuation line, not an initial directive line. @@ -1671,14 +1668,14 @@ const char *Prescanner::IsCompilerDirectiveSentinel(CharBlock token) const { std::optional> Prescanner::IsCompilerDirectiveSentinel(const char *p) const { char sentinel[8]; - for (std::size_t j{0}; j + 1 < sizeof sentinel && *p != '\n'; ++p, ++j) { + for (std::size_t j{0}; j + 1 < sizeof sentinel; ++p, ++j) { if (int n{IsSpaceOrTab(p)}; n || !(IsLetter(*p) || *p == '$' || *p == '@')) { if (j > 0) { - if (j == 1 && sentinel[0] == '$' && n == 0 && *p != '&') { - // OpenMP conditional compilation line sentinels have to + if (j == 1 && sentinel[0] == '$' && n == 0 && *p != '&' && *p != '\n') { + // Free form OpenMP conditional compilation line sentinels have to // be immediately followed by a space or &, not a digit - // or anything else. + // or anything else. 
A newline also works for an initial line. break; } sentinel[j] = '\0'; diff --git a/flang/lib/Parser/prescan.h b/flang/lib/Parser/prescan.h index 53361ba14f378..ec4c53cf3e0f2 100644 --- a/flang/lib/Parser/prescan.h +++ b/flang/lib/Parser/prescan.h @@ -159,6 +159,11 @@ class Prescanner { } bool InCompilerDirective() const { return directiveSentinel_ != nullptr; } + bool InOpenMPConditionalLine() const { + return directiveSentinel_ && directiveSentinel_[0] == '$' && + !directiveSentinel_[1]; + ; + } bool InFixedFormSource() const { return inFixedForm_ && !inPreprocessorDirective_ && !InCompilerDirective(); } diff --git a/flang/lib/Parser/token-sequence.cpp b/flang/lib/Parser/token-sequence.cpp index aee76938550f5..40a074eaf0a47 100644 --- a/flang/lib/Parser/token-sequence.cpp +++ b/flang/lib/Parser/token-sequence.cpp @@ -357,7 +357,7 @@ ProvenanceRange TokenSequence::GetProvenanceRange() const { const TokenSequence &TokenSequence::CheckBadFortranCharacters( Messages &messages, const Prescanner &prescanner, - bool allowAmpersand) const { + bool preprocessingOnly) const { std::size_t tokens{SizeInTokens()}; for (std::size_t j{0}; j < tokens; ++j) { CharBlock token{TokenAt(j)}; @@ -371,8 +371,10 @@ const TokenSequence &TokenSequence::CheckBadFortranCharacters( TokenAt(j + 1))) { // !dir$, &c. 
++j; continue; + } else if (preprocessingOnly) { + continue; } - } else if (ch == '&' && allowAmpersand) { + } else if (ch == '&' && preprocessingOnly) { continue; } if (ch < ' ' || ch >= '\x7f') { diff --git a/flang/test/Parser/OpenMP/bug518.f b/flang/test/Parser/OpenMP/bug518.f index 2dbacef59fa8a..2739de63f8b25 100644 --- a/flang/test/Parser/OpenMP/bug518.f +++ b/flang/test/Parser/OpenMP/bug518.f @@ -9,9 +9,9 @@ !$omp end parallel end -!CHECK-E:{{^}}!$ thread = OMP_GET_MAX_THREADS() +!CHECK-E:{{^}}!$ thread = OMP_GET_MAX_THREADS() !CHECK-E:{{^}}!$omp parallel private(ia) -!CHECK-E:{{^}}!$ continue +!CHECK-E:{{^}}!$ continue !CHECK-E:{{^}}!$omp end parallel !CHECK-OMP:thread=omp_get_max_threads() diff --git a/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 b/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 index 169976d74c0bf..644ab3f723aba 100644 --- a/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 +++ b/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 @@ -7,10 +7,10 @@ ! CHECK-LABEL: subroutine mixed_form1() ! CHECK-E:{{^}} i = 1 & ! CHECK-E:{{^}}!$ +100& -! CHECK-E:{{^}}!$ &+ 1000& -! CHECK-E:{{^}} &+ 10 + 1& -! CHECK-E:{{^}}!$ & +100000& -! CHECK-E:{{^}} &0000 + 1000000 +! CHECK-E:{{^}}!$ &+ 1000& +! CHECK-E:{{^}} &+ 10 + 1& +! CHECK-E:{{^}}!$ & +100000& +! CHECK-E:{{^}} &0000 + 1000000 ! CHECK-OMP: i=1001001112_4 ! CHECK-NO-OMP: i=1010011_4 subroutine mixed_form1() @@ -39,8 +39,8 @@ subroutine mixed_form2() ! CHECK-LABEL: subroutine mixed_form3() ! CHECK-E:{{^}}!$ i=0 ! CHECK-E:{{^}}!$ i = 1 & -! CHECK-E:{{^}}!$ & +10 & -! CHECK-E:{{^}}!$ &+100& +! CHECK-E:{{^}}!$ & +10 & +! CHECK-E:{{^}}!$ &+100& ! CHECK-E:{{^}}!$ +1000 ! CHECK-OMP: i=0_4 ! 
CHECK-OMP: i=1111_4 diff --git a/flang/test/Parser/OpenMP/sentinels.f b/flang/test/Parser/OpenMP/sentinels.f index 299b83e2abba8..f5a2fd4f7f931 100644 --- a/flang/test/Parser/OpenMP/sentinels.f +++ b/flang/test/Parser/OpenMP/sentinels.f @@ -61,12 +61,12 @@ subroutine sub(a, b) ! Test valid chars in initial and continuation lines. ! CHECK: !$ 20 PRINT *, "msg2" -! CHECK: !$ & , "msg3" +! CHECK: !$ &, "msg3" c$ 20 PRINT *, "msg2" c$ & , "msg3" ! CHECK: !$ PRINT *, "msg4", -! CHECK: !$ & "msg5" +! CHECK: !$ &"msg5" c$ 0PRINT *, "msg4", c$ + "msg5" end diff --git a/flang/test/Parser/continuation-in-conditional-compilation.f b/flang/test/Parser/continuation-in-conditional-compilation.f index 57b69de657348..ebc6a3f875b9a 100644 --- a/flang/test/Parser/continuation-in-conditional-compilation.f +++ b/flang/test/Parser/continuation-in-conditional-compilation.f @@ -1,11 +1,12 @@ ! RUN: %flang_fc1 -E %s 2>&1 | FileCheck %s program main ! CHECK: k01=1+ -! CHECK: !$ & 1 +! CHECK: !$ &1 k01=1+ -!$ & 1 +!$ &1 -! CHECK: !$ k02=23 +! CHECK: !$ k02=2 +! CHECK: 3 ! CHECK: !$ &4 !$ k02=2 +3 diff --git a/flang/test/Preprocessing/bug136845.F b/flang/test/Preprocessing/bug136845.F new file mode 100644 index 0000000000000..ce52c2953bb57 --- /dev/null +++ b/flang/test/Preprocessing/bug136845.F @@ -0,0 +1,45 @@ +!RUN: %flang_fc1 -E %s | FileCheck --check-prefix=PREPRO %s +!RUN: %flang_fc1 -fdebug-unparse %s | FileCheck --check-prefix=NORMAL %s +!RUN: %flang_fc1 -fopenmp -fdebug-unparse %s | FileCheck --check-prefix=OMP %s + +c$ ! + +C$ + continue + + k=0 w + k=0 +c$ 0 x +c$ 1 y +c$ 2 k= z +c$ ! 
A +c$ !1 B + print *,k +*$1 continue + end + +!PREPRO:!$ & +!PREPRO: continue +!PREPRO: k=0 +!PREPRO: k=0 +!PREPRO:!$ +!PREPRO:!$ & +!PREPRO:!$ &k= +!PREPRO:!$ & +!PREPRO:!$ &1 +!PREPRO: print *,k +!PREPRO:!$ 1 continue +!PREPRO: end + +!NORMAL: k=0_4 +!NORMAL: k=0_4 +!NORMAL: PRINT *, k +!NORMAL:END PROGRAM + +!OMP: CONTINUE +!OMP: k=0_4 +!OMP: k=0_4 +!OMP: k=1_4 +!OMP: PRINT *, k +!OMP: 1 CONTINUE +!OMP:END PROGRAM

From flang-commits at lists.llvm.org Wed May 7 13:41:27 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Wed, 07 May 2025 13:41:27 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Further refinement of OpenMP !$ lines in -E mode (PR #138956)
In-Reply-To:
Message-ID: <681bc577.050a0220.20be2e.2c41@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-openmp

Author: Peter Klausler (klausler)
Changes

Address failing Fujitsu test suite cases that were broken by the patch to defer the handling of !$ lines in -fopenmp vs. normal compilation to actual compilation rather than processing them immediately in -E mode.

Tested on the samples in the bug report as well as all of the Fujitsu tests that I could find that use !$ lines.

Fixes https://github.com/llvm/llvm-project/issues/136845.

---

Patch is 20.21 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/138956.diff

10 Files Affected:

- (modified) flang/include/flang/Parser/token-sequence.h (+1-1)
- (modified) flang/lib/Parser/parsing.cpp (+4-3)
- (modified) flang/lib/Parser/prescan.cpp (+95-98)
- (modified) flang/lib/Parser/prescan.h (+5)
- (modified) flang/lib/Parser/token-sequence.cpp (+4-2)
- (modified) flang/test/Parser/OpenMP/bug518.f (+2-2)
- (modified) flang/test/Parser/OpenMP/compiler-directive-continuation.f90 (+6-6)
- (modified) flang/test/Parser/OpenMP/sentinels.f (+2-2)
- (modified) flang/test/Parser/continuation-in-conditional-compilation.f (+4-3)
- (added) flang/test/Preprocessing/bug136845.F (+45)

``````````diff
diff --git a/flang/include/flang/Parser/token-sequence.h b/flang/include/flang/Parser/token-sequence.h index 69291e69526e2..05aeacccde097 100644 --- a/flang/include/flang/Parser/token-sequence.h +++ b/flang/include/flang/Parser/token-sequence.h @@ -137,7 +137,7 @@ class TokenSequence { TokenSequence &RemoveRedundantBlanks(std::size_t firstChar = 0); TokenSequence &ClipComment(const Prescanner &, bool skipFirst = false); const TokenSequence &CheckBadFortranCharacters( - Messages &, const Prescanner &, bool allowAmpersand) const; + Messages &, const Prescanner &, bool preprocessingOnly) const; bool BadlyNestedParentheses() const; const TokenSequence &CheckBadParentheses(Messages &) const; void Emit(CookedSource &) const; diff --git a/flang/lib/Parser/parsing.cpp b/flang/lib/Parser/parsing.cpp index 17f544194de02..93737d99567dd 100644 ---
a/flang/lib/Parser/parsing.cpp +++ b/flang/lib/Parser/parsing.cpp @@ -230,10 +230,11 @@ void Parsing::EmitPreprocessedSource( column = 7; // start of fixed form source field ++sourceLine; inContinuation = true; - } else if (!inDirective && ch != ' ' && (ch < '0' || ch > '9')) { + } else if (!inDirective && !ompConditionalLine && ch != ' ' && + (ch < '0' || ch > '9')) { // Put anything other than a label or directive into the // Fortran fixed form source field (columns [7:72]). - for (; column < 7; ++column) { + for (int toCol{ch == '&' ? 6 : 7}; column < toCol; ++column) { out << ' '; } } @@ -241,7 +242,7 @@ void Parsing::EmitPreprocessedSource( if (ompConditionalLine) { // Only digits can stay in the label field if (!(ch >= '0' && ch <= '9')) { - for (; column < 7; ++column) { + for (int toCol{ch == '&' ? 6 : 7}; column < toCol; ++column) { out << ' '; } } diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp index 46e04c15ade01..ee180d986e39d 100644 --- a/flang/lib/Parser/prescan.cpp +++ b/flang/lib/Parser/prescan.cpp @@ -150,10 +150,7 @@ void Prescanner::Statement() { CHECK(*at_ == '!'); } std::optional condOffset; - bool isOpenMPCondCompilation{ - directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0'}; - if (isOpenMPCondCompilation) { - // OpenMP conditional compilation line. + if (InOpenMPConditionalLine()) { condOffset = 2; } else if (directiveSentinel_[0] == '@' && directiveSentinel_[1] == 'c' && directiveSentinel_[2] == 'u' && directiveSentinel_[3] == 'f' && @@ -167,19 +164,10 @@ void Prescanner::Statement() { FortranInclude(at_ + *payload); return; } - while (true) { - if (auto n{IsSpace(at_)}) { - at_ += n, ++column_; - } else if (*at_ == '\t') { - ++at_, ++column_; - tabInCurrentLine_ = true; - } else if (inFixedForm_ && column_ == 6 && !tabInCurrentLine_ && - *at_ == '0') { - ++at_, ++column_; - } else { - break; - } + if (inFixedForm_) { + LabelField(tokens); } + SkipSpaces(); } else { // Compiler directive. 
Emit normalized sentinel, squash following spaces. // Conditional compilation lines (!$) take this path in -E mode too @@ -190,35 +178,47 @@ void Prescanner::Statement() { ++sp, ++at_, ++column_) { EmitChar(tokens, *sp); } - if (IsSpaceOrTab(at_)) { - while (int n{IsSpaceOrTab(at_)}) { - if (isOpenMPCondCompilation && inFixedForm_) { + if (inFixedForm_) { + while (column_ < 6) { + if (*at_ == '\t') { + tabInCurrentLine_ = true; + ++at_; + for (; column_ < 7; ++column_) { + EmitChar(tokens, ' '); + } + } else if (int spaceBytes{IsSpace(at_)}) { EmitChar(tokens, ' '); - } - tabInCurrentLine_ |= *at_ == '\t'; - at_ += n, ++column_; - if (inFixedForm_ && column_ > fixedFormColumnLimit_) { + at_ += spaceBytes; + ++column_; + } else { + if (InOpenMPConditionalLine() && column_ == 3 && + IsDecimalDigit(*at_)) { + // subtle: !$ in -E mode can't be immediately followed by a digit + EmitChar(tokens, ' '); + } break; } } - if (isOpenMPCondCompilation && inFixedForm_ && column_ == 6) { - if (*at_ == '0') { - EmitChar(tokens, ' '); - } else { - tokens.CloseToken(); - EmitChar(tokens, '&'); - } - ++at_, ++column_; + } else if (int spaceBytes{IsSpaceOrTab(at_)}) { + EmitChar(tokens, ' '); + at_ += spaceBytes, ++column_; + } + tokens.CloseToken(); + SkipSpaces(); + if (InOpenMPConditionalLine() && inFixedForm_ && !tabInCurrentLine_ && + column_ == 6 && *at_ != '\n') { + // !$ 0 - turn '0' into a space + // !$ 1 - turn '1' into '&' + if (int n{IsSpace(at_)}; n || *at_ == '0') { + at_ += n ? n : 1; } else { - EmitChar(tokens, ' '); + ++at_; + EmitChar(tokens, '&'); + tokens.CloseToken(); } + ++column_; + SkipSpaces(); } - tokens.CloseToken(); - } - if (*at_ == '!' 
|| *at_ == '\n' || - (inFixedForm_ && column_ > fixedFormColumnLimit_ && - !tabInCurrentLine_)) { - return; // Directive without payload } break; } @@ -323,8 +323,8 @@ void Prescanner::Statement() { NormalizeCompilerDirectiveCommentMarker(*preprocessed); preprocessed->ToLowerCase(); SourceFormChange(preprocessed->ToString()); - CheckAndEmitLine(preprocessed->ToLowerCase().ClipComment( - *this, true /* skip first ! */), + CheckAndEmitLine( + preprocessed->ClipComment(*this, true /* skip first ! */), newlineProvenance); break; case LineClassification::Kind::Source: @@ -349,6 +349,24 @@ void Prescanner::Statement() { while (CompilerDirectiveContinuation(tokens, line.sentinel)) { newlineProvenance = GetCurrentProvenance(); } + if (preprocessingOnly_ && inFixedForm_ && InOpenMPConditionalLine() && + nextLine_ < limit_) { + // In -E mode, when the line after !$ conditional compilation is a + // regular fixed form continuation line, append a '&' to the line. + const char *p{nextLine_}; + int col{1}; + while (int n{IsSpace(p)}) { + if (*p == '\t') { + break; + } + p += n; + ++col; + } + if (col == 6 && *p != '0' && *p != '\t' && *p != '\n') { + EmitChar(tokens, '&'); + tokens.CloseToken(); + } + } tokens.ToLowerCase(); SourceFormChange(tokens.ToString()); } else { // Kind::Source @@ -544,7 +562,8 @@ void Prescanner::SkipToEndOfLine() { bool Prescanner::MustSkipToEndOfLine() const { if (inFixedForm_ && column_ > fixedFormColumnLimit_ && !tabInCurrentLine_) { return true; // skip over ignored columns in right margin (73:80) - } else if (*at_ == '!' && !inCharLiteral_) { + } else if (*at_ == '!' && !inCharLiteral_ && + (!inFixedForm_ || tabInCurrentLine_ || column_ != 6)) { return !IsCompilerDirectiveSentinel(at_); } else { return false; @@ -569,10 +588,11 @@ void Prescanner::NextChar() { // directives, Fortran ! comments, stuff after the right margin in // fixed form, and all forms of line continuation. 
bool Prescanner::SkipToNextSignificantCharacter() { - auto anyContinuationLine{false}; if (inPreprocessorDirective_) { SkipCComments(); + return false; } else { + auto anyContinuationLine{false}; bool mightNeedSpace{false}; if (MustSkipToEndOfLine()) { SkipToEndOfLine(); @@ -589,8 +609,8 @@ bool Prescanner::SkipToNextSignificantCharacter() { if (*at_ == '\t') { tabInCurrentLine_ = true; } + return anyContinuationLine; } - return anyContinuationLine; } void Prescanner::SkipCComments() { @@ -1119,12 +1139,10 @@ static bool IsAtProcess(const char *p) { bool Prescanner::IsFixedFormCommentLine(const char *start) const { const char *p{start}; - // The @process directive must start in column 1. if (*p == '@' && IsAtProcess(p)) { return true; } - if (IsFixedFormCommentChar(*p) || *p == '%' || // VAX %list, %eject, &c. ((*p == 'D' || *p == 'd') && !features_.IsEnabled(LanguageFeature::OldDebugLines))) { @@ -1325,23 +1343,9 @@ const char *Prescanner::FixedFormContinuationLine(bool mightNeedSpace) { nextLine_[1] == ' ' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && nextLine_[4] == ' '}; if (InCompilerDirective()) { - if (directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0') { - if (IsFixedFormCommentChar(col1)) { - if (nextLine_[1] == '$' && - (nextLine_[2] == '&' || IsSpaceOrTab(&nextLine_[2]))) { - // Next line is also !$ conditional compilation, might be continuation - if (preprocessingOnly_) { - return nullptr; - } - } else { - return nullptr; // comment, or distinct directive - } - } else if (!canBeNonDirectiveContinuation) { - return nullptr; - } - } else if (!IsFixedFormCommentChar(col1)) { - return nullptr; // in directive other than !$, but next line is not - } else { // in directive other than !$, next line might be continuation + // !$ under -E is not continued, but deferred to later compilation + if (IsFixedFormCommentChar(col1) && + !(InOpenMPConditionalLine() && preprocessingOnly_)) { int j{1}; for (; j < 5; ++j) { char ch{directiveSentinel_[j - 1]}; 
@@ -1356,31 +1360,27 @@ const char *Prescanner::FixedFormContinuationLine(bool mightNeedSpace) { return nullptr; } } - } - const char *col6{nextLine_ + 5}; - if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { - if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { - insertASpace_ = true; + const char *col6{nextLine_ + 5}; + if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { + if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { + insertASpace_ = true; + } + return nextLine_ + 6; } - return nextLine_ + 6; } - } else { - // Normal case: not in a compiler directive. - if (IsFixedFormCommentChar(col1)) { - if (nextLine_[1] == '$' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && - nextLine_[4] == ' ' && - IsCompilerDirectiveSentinel(&nextLine_[1], 1) && - !preprocessingOnly_) { - // !$ conditional compilation line as a continuation - const char *col6{nextLine_ + 5}; - if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { - if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { - insertASpace_ = true; - } - return nextLine_ + 6; - } + } else { // Normal case: not in a compiler directive. + // !$ conditional compilation lines may be continuations when not + // just preprocessing. 
+ if (!preprocessingOnly_ && IsFixedFormCommentChar(col1) && + nextLine_[1] == '$' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && + nextLine_[4] == ' ' && IsCompilerDirectiveSentinel(&nextLine_[1], 1)) { + if (const char *col6{nextLine_ + 5}; + *col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { + insertASpace_ |= mightNeedSpace && !IsSpace(nextLine_ + 6); + return nextLine_ + 6; + } else { + return nullptr; } - return nullptr; } if (col1 == '&' && features_.IsEnabled( @@ -1422,13 +1422,13 @@ const char *Prescanner::FreeFormContinuationLine(bool ampersand) { } p = SkipWhiteSpaceIncludingEmptyMacros(p); if (InCompilerDirective()) { - if (directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0') { + if (InOpenMPConditionalLine()) { if (preprocessingOnly_) { // in -E mode, don't treat !$ as a continuation return nullptr; } else if (p[0] == '!' && p[1] == '$') { // accept but do not require a matching sentinel - if (!(p[2] == '&' || IsSpaceOrTab(&p[2]))) { + if (p[2] != '&' && !IsSpaceOrTab(&p[2])) { return nullptr; // not !$ } p += 2; @@ -1566,15 +1566,11 @@ Prescanner::IsFixedFormCompilerDirectiveLine(const char *start) const { } char sentinel[5], *sp{sentinel}; int column{2}; - for (; column < 6; ++column, ++p) { - if (*p == '\n' || IsSpaceOrTab(p)) { - break; - } - if (sp == sentinel + 1 && sentinel[0] == '$' && IsDecimalDigit(*p)) { - // OpenMP conditional compilation line: leave the label alone + for (; column < 6; ++column) { + if (*p == '\n' || IsSpaceOrTab(p) || IsDecimalDigit(*p)) { break; } - *sp++ = ToLowerCaseLetter(*p); + *sp++ = ToLowerCaseLetter(*p++); } if (sp == sentinel) { return std::nullopt; @@ -1600,7 +1596,8 @@ Prescanner::IsFixedFormCompilerDirectiveLine(const char *start) const { ++p; } else if (int n{IsSpaceOrTab(p)}) { p += n; - } else if (isOpenMPConditional && preprocessingOnly_ && !hadDigit) { + } else if (isOpenMPConditional && preprocessingOnly_ && !hadDigit && + *p != '\n') { // In -E mode, "!$ &" is treated as a directive } 
else { // This is a Continuation line, not an initial directive line. @@ -1671,14 +1668,14 @@ const char *Prescanner::IsCompilerDirectiveSentinel(CharBlock token) const { std::optional> Prescanner::IsCompilerDirectiveSentinel(const char *p) const { char sentinel[8]; - for (std::size_t j{0}; j + 1 < sizeof sentinel && *p != '\n'; ++p, ++j) { + for (std::size_t j{0}; j + 1 < sizeof sentinel; ++p, ++j) { if (int n{IsSpaceOrTab(p)}; n || !(IsLetter(*p) || *p == '$' || *p == '@')) { if (j > 0) { - if (j == 1 && sentinel[0] == '$' && n == 0 && *p != '&') { - // OpenMP conditional compilation line sentinels have to + if (j == 1 && sentinel[0] == '$' && n == 0 && *p != '&' && *p != '\n') { + // Free form OpenMP conditional compilation line sentinels have to // be immediately followed by a space or &, not a digit - // or anything else. + // or anything else. A newline also works for an initial line. break; } sentinel[j] = '\0'; diff --git a/flang/lib/Parser/prescan.h b/flang/lib/Parser/prescan.h index 53361ba14f378..ec4c53cf3e0f2 100644 --- a/flang/lib/Parser/prescan.h +++ b/flang/lib/Parser/prescan.h @@ -159,6 +159,11 @@ class Prescanner { } bool InCompilerDirective() const { return directiveSentinel_ != nullptr; } + bool InOpenMPConditionalLine() const { + return directiveSentinel_ && directiveSentinel_[0] == '$' && + !directiveSentinel_[1]; + ; + } bool InFixedFormSource() const { return inFixedForm_ && !inPreprocessorDirective_ && !InCompilerDirective(); } diff --git a/flang/lib/Parser/token-sequence.cpp b/flang/lib/Parser/token-sequence.cpp index aee76938550f5..40a074eaf0a47 100644 --- a/flang/lib/Parser/token-sequence.cpp +++ b/flang/lib/Parser/token-sequence.cpp @@ -357,7 +357,7 @@ ProvenanceRange TokenSequence::GetProvenanceRange() const { const TokenSequence &TokenSequence::CheckBadFortranCharacters( Messages &messages, const Prescanner &prescanner, - bool allowAmpersand) const { + bool preprocessingOnly) const { std::size_t tokens{SizeInTokens()}; for (std::size_t 
j{0}; j < tokens; ++j) { CharBlock token{TokenAt(j)}; @@ -371,8 +371,10 @@ const TokenSequence &TokenSequence::CheckBadFortranCharacters( TokenAt(j + 1))) { // !dir$, &c. ++j; continue; + } else if (preprocessingOnly) { + continue; } - } else if (ch == '&' && allowAmpersand) { + } else if (ch == '&' && preprocessingOnly) { continue; } if (ch < ' ' || ch >= '\x7f') { diff --git a/flang/test/Parser/OpenMP/bug518.f b/flang/test/Parser/OpenMP/bug518.f index 2dbacef59fa8a..2739de63f8b25 100644 --- a/flang/test/Parser/OpenMP/bug518.f +++ b/flang/test/Parser/OpenMP/bug518.f @@ -9,9 +9,9 @@ !$omp end parallel end -!CHECK-E:{{^}}!$ thread = OMP_GET_MAX_THREADS() +!CHECK-E:{{^}}!$ thread = OMP_GET_MAX_THREADS() !CHECK-E:{{^}}!$omp parallel private(ia) -!CHECK-E:{{^}}!$ continue +!CHECK-E:{{^}}!$ continue !CHECK-E:{{^}}!$omp end parallel !CHECK-OMP:thread=omp_get_max_threads() diff --git a/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 b/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 index 169976d74c0bf..644ab3f723aba 100644 --- a/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 +++ b/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 @@ -7,10 +7,10 @@ ! CHECK-LABEL: subroutine mixed_form1() ! CHECK-E:{{^}} i = 1 & ! CHECK-E:{{^}}!$ +100& -! CHECK-E:{{^}}!$ &+ 1000& -! CHECK-E:{{^}} &+ 10 + 1& -! CHECK-E:{{^}}!$ & +100000& -! CHECK-E:{{^}} &0000 + 1000000 +! CHECK-E:{{^}}!$ &+ 1000& +! CHECK-E:{{^}} &+ 10 + 1& +! CHECK-E:{{^}}!$ & +100000& +! CHECK-E:{{^}} &0000 + 1000000 ! CHECK-OMP: i=1001001112_4 ! CHECK-NO-OMP: i=1010011_4 subroutine mixed_form1() @@ -39,8 +39,8 @@ subroutine mixed_form2() ! CHECK-LABEL: subroutine mixed_form3() ! CHECK-E:{{^}}!$ i=0 ! CHECK-E:{{^}}!$ i = 1 & -! CHECK-E:{{^}}!$ & +10 & -! CHECK-E:{{^}}!$ &+100& +! CHECK-E:{{^}}!$ & +10 & +! CHECK-E:{{^}}!$ &+100& ! CHECK-E:{{^}}!$ +1000 ! CHECK-OMP: i=0_4 ! 
CHECK-OMP: i=1111_4 diff --git a/flang/test/Parser/OpenMP/sentinels.f b/flang/test/Parser/OpenMP/sentinels.f index 299b83e2abba8..f5a2fd4f7f931 100644 --- a/flang/test/Parser/OpenMP/sentinels.f +++ b/flang/test/Parser/OpenMP/sentinels.f @@ -61,12 +61,12 @@ subroutine sub(a, b) ! Test valid chars in initial and continuation lines. ! CHECK: !$ 20 PRINT *, "msg2" -! CHECK: !$ & , "msg3" +! CHECK: !$ &, "msg3" c$ 20 PRINT *, "msg2" c$ & , "msg3" ! CHECK: !$ PRINT *, "msg4", -! CHECK: !$ & "msg5" +! CHECK: !$ &"msg5" c$ 0PRINT *, "msg4", c$ + "msg5" end diff --git a/flang/test/Parser/continuation-in-conditional-compilation.f b/flang/test/Parser/continuation-in-conditional-compilation.f index 57b69de657348..ebc6a3f875b9a 100644 --- a/flang/test/Parser/continuation-in-conditional-compilation.f +++ b/flang/test/Parser/continuation-in-conditional-compilation.f @@ -1,11 +1,12 @@ ! RUN: %flang_fc1 -E %s 2>&1 | FileCheck %s program main ! CHECK: k01=1+ -! CHECK: !$ & 1 +! CHECK: !$ &1 k01=1+ -!$ & 1 +!$ &1 -! CHECK: !$ k02=23 +! CHECK: !$ k02=2 +! CHECK: 3 ! CHECK: !$ &4 !$ k02=2 +3 diff --git a/flang/test/Preprocessing/bug136845.F b/flang/test/Preprocessing/bug136845.F new file mode 100644 index 0000000000000..ce52c2953bb57 --- /dev/null +++ b/flang/test/Preprocessing/bug136845.F @@ -0,0 +1,45 @@ +!RUN: %flang_fc1 -E %s | FileCheck --check-prefix=PREPRO %s +!RUN: %flang_fc1 -fdebug-unparse %s | FileCheck --check-prefix=NORMAL %s +!RUN: %flang_fc1 -fopenmp -fdebug-unparse %s | FileCheck --check-prefix=OMP %s + +c$ ! + +C$ + continue + + k=0 w + k=0 +c$ 0 x +c$ 1 y +c$ 2 k= z +c$ ! A +c$ !1 B + print *,k +*$1 continue + end + +!PREPRO:!$ & +!PREPRO: continue +!PREPRO: k=0 +!PREPRO: k=0 +!PREPRO:!$ +!PREPRO:!$ & +!PREPRO:!$ &k= +!PREPRO:!$ & +!PREPRO:!$ &1 +!PREPRO: print *,k +!PREPRO:!$ 1 continue +!... [truncated] ``````````
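
For readers following the fixed-form changes in the patch above, the two small tests it relies on can be sketched in isolation: whether a directive sentinel is the bare OpenMP conditional compilation sentinel `$`, and whether column 6 of a fixed-form line marks a continuation. This is a minimal standalone illustration, not flang code. `IsOpenMPConditionalSentinel`, `HasContinuationMarkInColumn6` and `CheckExamples` are hypothetical names, the input lines are made up for the sketch, and only ASCII blank and tab are treated as whitespace, which may be narrower than what the real `IsSpaceOrTab` accepts.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Prescanner::InOpenMPConditionalLine() in the
// patch: true only when the directive sentinel is exactly "$".
bool IsOpenMPConditionalSentinel(const char *sentinel) {
  return sentinel && sentinel[0] == '$' && !sentinel[1];
}

// Hypothetical helper for the column-6 test applied to a fixed-form line.
// Assumption: only ASCII blank and tab count as whitespace here.
bool HasContinuationMarkInColumn6(const std::string &line) {
  if (line.size() < 6) {
    return false; // the line ends before column 6
  }
  const char col6{line[5]}; // column 6 is index 5
  return col6 != '\n' && col6 != '0' && col6 != ' ' && col6 != '\t';
}

// Illustrative inputs only; the spacing is chosen for this sketch.
void CheckExamples() {
  assert(IsOpenMPConditionalSentinel("$"));
  assert(!IsOpenMPConditionalSentinel("$omp"));
  assert(HasContinuationMarkInColumn6("c$   1 y")); // '1' in column 6
  assert(!HasContinuationMarkInColumn6("c$   0 x")); // '0' means initial line
  assert(!HasContinuationMarkInColumn6("c$     k=0")); // blank column 6
}
```

Taking a `std::string` rather than a raw pointer lets the sketch check the length before reading column 6; the patch itself reads `nextLine_ + 5` directly from the prescanner's buffer.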
https://github.com/llvm/llvm-project/pull/138956

From flang-commits at lists.llvm.org Wed May 7 13:41:28 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Wed, 07 May 2025 13:41:28 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Further refinement of OpenMP !$ lines in -E mode (PR #138956)
In-Reply-To:
Message-ID: <681bc578.170a0220.15ed3b.a16c@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-parser

Author: Peter Klausler (klausler)
Changes

Address failing Fujitsu test suite cases that were broken by the patch to defer the handling of !$ lines in -fopenmp vs. normal compilation to actual compilation rather than processing them immediately in -E mode.

Tested on the samples in the bug report as well as all of the Fujitsu tests that I could find that use !$ lines.

Fixes https://github.com/llvm/llvm-project/issues/136845.

---

Patch is 20.21 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/138956.diff

10 Files Affected:

- (modified) flang/include/flang/Parser/token-sequence.h (+1-1)
- (modified) flang/lib/Parser/parsing.cpp (+4-3)
- (modified) flang/lib/Parser/prescan.cpp (+95-98)
- (modified) flang/lib/Parser/prescan.h (+5)
- (modified) flang/lib/Parser/token-sequence.cpp (+4-2)
- (modified) flang/test/Parser/OpenMP/bug518.f (+2-2)
- (modified) flang/test/Parser/OpenMP/compiler-directive-continuation.f90 (+6-6)
- (modified) flang/test/Parser/OpenMP/sentinels.f (+2-2)
- (modified) flang/test/Parser/continuation-in-conditional-compilation.f (+4-3)
- (added) flang/test/Preprocessing/bug136845.F (+45)

``````````diff
diff --git a/flang/include/flang/Parser/token-sequence.h b/flang/include/flang/Parser/token-sequence.h index 69291e69526e2..05aeacccde097 100644 --- a/flang/include/flang/Parser/token-sequence.h +++ b/flang/include/flang/Parser/token-sequence.h @@ -137,7 +137,7 @@ class TokenSequence { TokenSequence &RemoveRedundantBlanks(std::size_t firstChar = 0); TokenSequence &ClipComment(const Prescanner &, bool skipFirst = false); const TokenSequence &CheckBadFortranCharacters( - Messages &, const Prescanner &, bool allowAmpersand) const; + Messages &, const Prescanner &, bool preprocessingOnly) const; bool BadlyNestedParentheses() const; const TokenSequence &CheckBadParentheses(Messages &) const; void Emit(CookedSource &) const; diff --git a/flang/lib/Parser/parsing.cpp b/flang/lib/Parser/parsing.cpp index 17f544194de02..93737d99567dd 100644 ---
a/flang/lib/Parser/parsing.cpp +++ b/flang/lib/Parser/parsing.cpp @@ -230,10 +230,11 @@ void Parsing::EmitPreprocessedSource( column = 7; // start of fixed form source field ++sourceLine; inContinuation = true; - } else if (!inDirective && ch != ' ' && (ch < '0' || ch > '9')) { + } else if (!inDirective && !ompConditionalLine && ch != ' ' && + (ch < '0' || ch > '9')) { // Put anything other than a label or directive into the // Fortran fixed form source field (columns [7:72]). - for (; column < 7; ++column) { + for (int toCol{ch == '&' ? 6 : 7}; column < toCol; ++column) { out << ' '; } } @@ -241,7 +242,7 @@ void Parsing::EmitPreprocessedSource( if (ompConditionalLine) { // Only digits can stay in the label field if (!(ch >= '0' && ch <= '9')) { - for (; column < 7; ++column) { + for (int toCol{ch == '&' ? 6 : 7}; column < toCol; ++column) { out << ' '; } } diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp index 46e04c15ade01..ee180d986e39d 100644 --- a/flang/lib/Parser/prescan.cpp +++ b/flang/lib/Parser/prescan.cpp @@ -150,10 +150,7 @@ void Prescanner::Statement() { CHECK(*at_ == '!'); } std::optional condOffset; - bool isOpenMPCondCompilation{ - directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0'}; - if (isOpenMPCondCompilation) { - // OpenMP conditional compilation line. + if (InOpenMPConditionalLine()) { condOffset = 2; } else if (directiveSentinel_[0] == '@' && directiveSentinel_[1] == 'c' && directiveSentinel_[2] == 'u' && directiveSentinel_[3] == 'f' && @@ -167,19 +164,10 @@ void Prescanner::Statement() { FortranInclude(at_ + *payload); return; } - while (true) { - if (auto n{IsSpace(at_)}) { - at_ += n, ++column_; - } else if (*at_ == '\t') { - ++at_, ++column_; - tabInCurrentLine_ = true; - } else if (inFixedForm_ && column_ == 6 && !tabInCurrentLine_ && - *at_ == '0') { - ++at_, ++column_; - } else { - break; - } + if (inFixedForm_) { + LabelField(tokens); } + SkipSpaces(); } else { // Compiler directive. 
Emit normalized sentinel, squash following spaces. // Conditional compilation lines (!$) take this path in -E mode too @@ -190,35 +178,47 @@ void Prescanner::Statement() { ++sp, ++at_, ++column_) { EmitChar(tokens, *sp); } - if (IsSpaceOrTab(at_)) { - while (int n{IsSpaceOrTab(at_)}) { - if (isOpenMPCondCompilation && inFixedForm_) { + if (inFixedForm_) { + while (column_ < 6) { + if (*at_ == '\t') { + tabInCurrentLine_ = true; + ++at_; + for (; column_ < 7; ++column_) { + EmitChar(tokens, ' '); + } + } else if (int spaceBytes{IsSpace(at_)}) { EmitChar(tokens, ' '); - } - tabInCurrentLine_ |= *at_ == '\t'; - at_ += n, ++column_; - if (inFixedForm_ && column_ > fixedFormColumnLimit_) { + at_ += spaceBytes; + ++column_; + } else { + if (InOpenMPConditionalLine() && column_ == 3 && + IsDecimalDigit(*at_)) { + // subtle: !$ in -E mode can't be immediately followed by a digit + EmitChar(tokens, ' '); + } break; } } - if (isOpenMPCondCompilation && inFixedForm_ && column_ == 6) { - if (*at_ == '0') { - EmitChar(tokens, ' '); - } else { - tokens.CloseToken(); - EmitChar(tokens, '&'); - } - ++at_, ++column_; + } else if (int spaceBytes{IsSpaceOrTab(at_)}) { + EmitChar(tokens, ' '); + at_ += spaceBytes, ++column_; + } + tokens.CloseToken(); + SkipSpaces(); + if (InOpenMPConditionalLine() && inFixedForm_ && !tabInCurrentLine_ && + column_ == 6 && *at_ != '\n') { + // !$ 0 - turn '0' into a space + // !$ 1 - turn '1' into '&' + if (int n{IsSpace(at_)}; n || *at_ == '0') { + at_ += n ? n : 1; } else { - EmitChar(tokens, ' '); + ++at_; + EmitChar(tokens, '&'); + tokens.CloseToken(); } + ++column_; + SkipSpaces(); } - tokens.CloseToken(); - } - if (*at_ == '!' 
|| *at_ == '\n' || - (inFixedForm_ && column_ > fixedFormColumnLimit_ && - !tabInCurrentLine_)) { - return; // Directive without payload } break; } @@ -323,8 +323,8 @@ void Prescanner::Statement() { NormalizeCompilerDirectiveCommentMarker(*preprocessed); preprocessed->ToLowerCase(); SourceFormChange(preprocessed->ToString()); - CheckAndEmitLine(preprocessed->ToLowerCase().ClipComment( - *this, true /* skip first ! */), + CheckAndEmitLine( + preprocessed->ClipComment(*this, true /* skip first ! */), newlineProvenance); break; case LineClassification::Kind::Source: @@ -349,6 +349,24 @@ void Prescanner::Statement() { while (CompilerDirectiveContinuation(tokens, line.sentinel)) { newlineProvenance = GetCurrentProvenance(); } + if (preprocessingOnly_ && inFixedForm_ && InOpenMPConditionalLine() && + nextLine_ < limit_) { + // In -E mode, when the line after !$ conditional compilation is a + // regular fixed form continuation line, append a '&' to the line. + const char *p{nextLine_}; + int col{1}; + while (int n{IsSpace(p)}) { + if (*p == '\t') { + break; + } + p += n; + ++col; + } + if (col == 6 && *p != '0' && *p != '\t' && *p != '\n') { + EmitChar(tokens, '&'); + tokens.CloseToken(); + } + } tokens.ToLowerCase(); SourceFormChange(tokens.ToString()); } else { // Kind::Source @@ -544,7 +562,8 @@ void Prescanner::SkipToEndOfLine() { bool Prescanner::MustSkipToEndOfLine() const { if (inFixedForm_ && column_ > fixedFormColumnLimit_ && !tabInCurrentLine_) { return true; // skip over ignored columns in right margin (73:80) - } else if (*at_ == '!' && !inCharLiteral_) { + } else if (*at_ == '!' && !inCharLiteral_ && + (!inFixedForm_ || tabInCurrentLine_ || column_ != 6)) { return !IsCompilerDirectiveSentinel(at_); } else { return false; @@ -569,10 +588,11 @@ void Prescanner::NextChar() { // directives, Fortran ! comments, stuff after the right margin in // fixed form, and all forms of line continuation. 
bool Prescanner::SkipToNextSignificantCharacter() { - auto anyContinuationLine{false}; if (inPreprocessorDirective_) { SkipCComments(); + return false; } else { + auto anyContinuationLine{false}; bool mightNeedSpace{false}; if (MustSkipToEndOfLine()) { SkipToEndOfLine(); @@ -589,8 +609,8 @@ bool Prescanner::SkipToNextSignificantCharacter() { if (*at_ == '\t') { tabInCurrentLine_ = true; } + return anyContinuationLine; } - return anyContinuationLine; } void Prescanner::SkipCComments() { @@ -1119,12 +1139,10 @@ static bool IsAtProcess(const char *p) { bool Prescanner::IsFixedFormCommentLine(const char *start) const { const char *p{start}; - // The @process directive must start in column 1. if (*p == '@' && IsAtProcess(p)) { return true; } - if (IsFixedFormCommentChar(*p) || *p == '%' || // VAX %list, %eject, &c. ((*p == 'D' || *p == 'd') && !features_.IsEnabled(LanguageFeature::OldDebugLines))) { @@ -1325,23 +1343,9 @@ const char *Prescanner::FixedFormContinuationLine(bool mightNeedSpace) { nextLine_[1] == ' ' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && nextLine_[4] == ' '}; if (InCompilerDirective()) { - if (directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0') { - if (IsFixedFormCommentChar(col1)) { - if (nextLine_[1] == '$' && - (nextLine_[2] == '&' || IsSpaceOrTab(&nextLine_[2]))) { - // Next line is also !$ conditional compilation, might be continuation - if (preprocessingOnly_) { - return nullptr; - } - } else { - return nullptr; // comment, or distinct directive - } - } else if (!canBeNonDirectiveContinuation) { - return nullptr; - } - } else if (!IsFixedFormCommentChar(col1)) { - return nullptr; // in directive other than !$, but next line is not - } else { // in directive other than !$, next line might be continuation + // !$ under -E is not continued, but deferred to later compilation + if (IsFixedFormCommentChar(col1) && + !(InOpenMPConditionalLine() && preprocessingOnly_)) { int j{1}; for (; j < 5; ++j) { char ch{directiveSentinel_[j - 1]}; 
@@ -1356,31 +1360,27 @@ const char *Prescanner::FixedFormContinuationLine(bool mightNeedSpace) { return nullptr; } } - } - const char *col6{nextLine_ + 5}; - if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { - if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { - insertASpace_ = true; + const char *col6{nextLine_ + 5}; + if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { + if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { + insertASpace_ = true; + } + return nextLine_ + 6; } - return nextLine_ + 6; } - } else { - // Normal case: not in a compiler directive. - if (IsFixedFormCommentChar(col1)) { - if (nextLine_[1] == '$' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && - nextLine_[4] == ' ' && - IsCompilerDirectiveSentinel(&nextLine_[1], 1) && - !preprocessingOnly_) { - // !$ conditional compilation line as a continuation - const char *col6{nextLine_ + 5}; - if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { - if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { - insertASpace_ = true; - } - return nextLine_ + 6; - } + } else { // Normal case: not in a compiler directive. + // !$ conditional compilation lines may be continuations when not + // just preprocessing. 
+ if (!preprocessingOnly_ && IsFixedFormCommentChar(col1) && + nextLine_[1] == '$' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && + nextLine_[4] == ' ' && IsCompilerDirectiveSentinel(&nextLine_[1], 1)) { + if (const char *col6{nextLine_ + 5}; + *col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { + insertASpace_ |= mightNeedSpace && !IsSpace(nextLine_ + 6); + return nextLine_ + 6; + } else { + return nullptr; } - return nullptr; } if (col1 == '&' && features_.IsEnabled( @@ -1422,13 +1422,13 @@ const char *Prescanner::FreeFormContinuationLine(bool ampersand) { } p = SkipWhiteSpaceIncludingEmptyMacros(p); if (InCompilerDirective()) { - if (directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0') { + if (InOpenMPConditionalLine()) { if (preprocessingOnly_) { // in -E mode, don't treat !$ as a continuation return nullptr; } else if (p[0] == '!' && p[1] == '$') { // accept but do not require a matching sentinel - if (!(p[2] == '&' || IsSpaceOrTab(&p[2]))) { + if (p[2] != '&' && !IsSpaceOrTab(&p[2])) { return nullptr; // not !$ } p += 2; @@ -1566,15 +1566,11 @@ Prescanner::IsFixedFormCompilerDirectiveLine(const char *start) const { } char sentinel[5], *sp{sentinel}; int column{2}; - for (; column < 6; ++column, ++p) { - if (*p == '\n' || IsSpaceOrTab(p)) { - break; - } - if (sp == sentinel + 1 && sentinel[0] == '$' && IsDecimalDigit(*p)) { - // OpenMP conditional compilation line: leave the label alone + for (; column < 6; ++column) { + if (*p == '\n' || IsSpaceOrTab(p) || IsDecimalDigit(*p)) { break; } - *sp++ = ToLowerCaseLetter(*p); + *sp++ = ToLowerCaseLetter(*p++); } if (sp == sentinel) { return std::nullopt; @@ -1600,7 +1596,8 @@ Prescanner::IsFixedFormCompilerDirectiveLine(const char *start) const { ++p; } else if (int n{IsSpaceOrTab(p)}) { p += n; - } else if (isOpenMPConditional && preprocessingOnly_ && !hadDigit) { + } else if (isOpenMPConditional && preprocessingOnly_ && !hadDigit && + *p != '\n') { // In -E mode, "!$ &" is treated as a directive } 
else { // This is a Continuation line, not an initial directive line. @@ -1671,14 +1668,14 @@ const char *Prescanner::IsCompilerDirectiveSentinel(CharBlock token) const { std::optional> Prescanner::IsCompilerDirectiveSentinel(const char *p) const { char sentinel[8]; - for (std::size_t j{0}; j + 1 < sizeof sentinel && *p != '\n'; ++p, ++j) { + for (std::size_t j{0}; j + 1 < sizeof sentinel; ++p, ++j) { if (int n{IsSpaceOrTab(p)}; n || !(IsLetter(*p) || *p == '$' || *p == '@')) { if (j > 0) { - if (j == 1 && sentinel[0] == '$' && n == 0 && *p != '&') { - // OpenMP conditional compilation line sentinels have to + if (j == 1 && sentinel[0] == '$' && n == 0 && *p != '&' && *p != '\n') { + // Free form OpenMP conditional compilation line sentinels have to // be immediately followed by a space or &, not a digit - // or anything else. + // or anything else. A newline also works for an initial line. break; } sentinel[j] = '\0'; diff --git a/flang/lib/Parser/prescan.h b/flang/lib/Parser/prescan.h index 53361ba14f378..ec4c53cf3e0f2 100644 --- a/flang/lib/Parser/prescan.h +++ b/flang/lib/Parser/prescan.h @@ -159,6 +159,11 @@ class Prescanner { } bool InCompilerDirective() const { return directiveSentinel_ != nullptr; } + bool InOpenMPConditionalLine() const { + return directiveSentinel_ && directiveSentinel_[0] == '$' && + !directiveSentinel_[1]; + ; + } bool InFixedFormSource() const { return inFixedForm_ && !inPreprocessorDirective_ && !InCompilerDirective(); } diff --git a/flang/lib/Parser/token-sequence.cpp b/flang/lib/Parser/token-sequence.cpp index aee76938550f5..40a074eaf0a47 100644 --- a/flang/lib/Parser/token-sequence.cpp +++ b/flang/lib/Parser/token-sequence.cpp @@ -357,7 +357,7 @@ ProvenanceRange TokenSequence::GetProvenanceRange() const { const TokenSequence &TokenSequence::CheckBadFortranCharacters( Messages &messages, const Prescanner &prescanner, - bool allowAmpersand) const { + bool preprocessingOnly) const { std::size_t tokens{SizeInTokens()}; for (std::size_t 
j{0}; j < tokens; ++j) { CharBlock token{TokenAt(j)}; @@ -371,8 +371,10 @@ const TokenSequence &TokenSequence::CheckBadFortranCharacters( TokenAt(j + 1))) { // !dir$, &c. ++j; continue; + } else if (preprocessingOnly) { + continue; } - } else if (ch == '&' && allowAmpersand) { + } else if (ch == '&' && preprocessingOnly) { continue; } if (ch < ' ' || ch >= '\x7f') { diff --git a/flang/test/Parser/OpenMP/bug518.f b/flang/test/Parser/OpenMP/bug518.f index 2dbacef59fa8a..2739de63f8b25 100644 --- a/flang/test/Parser/OpenMP/bug518.f +++ b/flang/test/Parser/OpenMP/bug518.f @@ -9,9 +9,9 @@ !$omp end parallel end -!CHECK-E:{{^}}!$ thread = OMP_GET_MAX_THREADS() +!CHECK-E:{{^}}!$ thread = OMP_GET_MAX_THREADS() !CHECK-E:{{^}}!$omp parallel private(ia) -!CHECK-E:{{^}}!$ continue +!CHECK-E:{{^}}!$ continue !CHECK-E:{{^}}!$omp end parallel !CHECK-OMP:thread=omp_get_max_threads() diff --git a/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 b/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 index 169976d74c0bf..644ab3f723aba 100644 --- a/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 +++ b/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 @@ -7,10 +7,10 @@ ! CHECK-LABEL: subroutine mixed_form1() ! CHECK-E:{{^}} i = 1 & ! CHECK-E:{{^}}!$ +100& -! CHECK-E:{{^}}!$ &+ 1000& -! CHECK-E:{{^}} &+ 10 + 1& -! CHECK-E:{{^}}!$ & +100000& -! CHECK-E:{{^}} &0000 + 1000000 +! CHECK-E:{{^}}!$ &+ 1000& +! CHECK-E:{{^}} &+ 10 + 1& +! CHECK-E:{{^}}!$ & +100000& +! CHECK-E:{{^}} &0000 + 1000000 ! CHECK-OMP: i=1001001112_4 ! CHECK-NO-OMP: i=1010011_4 subroutine mixed_form1() @@ -39,8 +39,8 @@ subroutine mixed_form2() ! CHECK-LABEL: subroutine mixed_form3() ! CHECK-E:{{^}}!$ i=0 ! CHECK-E:{{^}}!$ i = 1 & -! CHECK-E:{{^}}!$ & +10 & -! CHECK-E:{{^}}!$ &+100& +! CHECK-E:{{^}}!$ & +10 & +! CHECK-E:{{^}}!$ &+100& ! CHECK-E:{{^}}!$ +1000 ! CHECK-OMP: i=0_4 ! 
CHECK-OMP: i=1111_4 diff --git a/flang/test/Parser/OpenMP/sentinels.f b/flang/test/Parser/OpenMP/sentinels.f index 299b83e2abba8..f5a2fd4f7f931 100644 --- a/flang/test/Parser/OpenMP/sentinels.f +++ b/flang/test/Parser/OpenMP/sentinels.f @@ -61,12 +61,12 @@ subroutine sub(a, b) ! Test valid chars in initial and continuation lines. ! CHECK: !$ 20 PRINT *, "msg2" -! CHECK: !$ & , "msg3" +! CHECK: !$ &, "msg3" c$ 20 PRINT *, "msg2" c$ & , "msg3" ! CHECK: !$ PRINT *, "msg4", -! CHECK: !$ & "msg5" +! CHECK: !$ &"msg5" c$ 0PRINT *, "msg4", c$ + "msg5" end diff --git a/flang/test/Parser/continuation-in-conditional-compilation.f b/flang/test/Parser/continuation-in-conditional-compilation.f index 57b69de657348..ebc6a3f875b9a 100644 --- a/flang/test/Parser/continuation-in-conditional-compilation.f +++ b/flang/test/Parser/continuation-in-conditional-compilation.f @@ -1,11 +1,12 @@ ! RUN: %flang_fc1 -E %s 2>&1 | FileCheck %s program main ! CHECK: k01=1+ -! CHECK: !$ & 1 +! CHECK: !$ &1 k01=1+ -!$ & 1 +!$ &1 -! CHECK: !$ k02=23 +! CHECK: !$ k02=2 +! CHECK: 3 ! CHECK: !$ &4 !$ k02=2 +3 diff --git a/flang/test/Preprocessing/bug136845.F b/flang/test/Preprocessing/bug136845.F new file mode 100644 index 0000000000000..ce52c2953bb57 --- /dev/null +++ b/flang/test/Preprocessing/bug136845.F @@ -0,0 +1,45 @@ +!RUN: %flang_fc1 -E %s | FileCheck --check-prefix=PREPRO %s +!RUN: %flang_fc1 -fdebug-unparse %s | FileCheck --check-prefix=NORMAL %s +!RUN: %flang_fc1 -fopenmp -fdebug-unparse %s | FileCheck --check-prefix=OMP %s + +c$ ! + +C$ + continue + + k=0 w + k=0 +c$ 0 x +c$ 1 y +c$ 2 k= z +c$ ! A +c$ !1 B + print *,k +*$1 continue + end + +!PREPRO:!$ & +!PREPRO: continue +!PREPRO: k=0 +!PREPRO: k=0 +!PREPRO:!$ +!PREPRO:!$ & +!PREPRO:!$ &k= +!PREPRO:!$ & +!PREPRO:!$ &1 +!PREPRO: print *,k +!PREPRO:!$ 1 continue +!... [truncated] ``````````
https://github.com/llvm/llvm-project/pull/138956 From flang-commits at lists.llvm.org Wed May 7 13:47:29 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 07 May 2025 13:47:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix crash with USE of hermetic module file (PR #138785) In-Reply-To: Message-ID: <681bc6e1.170a0220.1b6385.11e4@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/138785 >From ce7debfdfad2e673efb957edc308e57a903b552b Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Tue, 6 May 2025 16:57:36 -0700 Subject: [PATCH] [flang] Fix crash with USE of hermetic module file When one hermetic module file uses another, a later compilation may crash in semantics when it itself is used, since the module file reader sets the "current hermetic module file scope" to null after reading one rather than saving and restoring that pointer. --- flang/lib/Semantics/mod-file.cpp | 3 ++- flang/test/Semantics/modfile75.F90 | 17 +++++++++++++++++ 2 files changed, 19 insertions(+), 1 deletion(-) create mode 100644 flang/test/Semantics/modfile75.F90 diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index ee356e56e4458..3df229cc85587 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1537,6 +1537,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, // created under -fhermetic-module-files? If so, process them first in // their own nested scope that will be visible only to USE statements // within the module file. 
+ Scope *previousHermetic{context_.currentHermeticModuleFileScope()}; if (parseTree.v.size() > 1) { parser::Program hermeticModules{std::move(parseTree.v)}; parseTree.v.emplace_back(std::move(hermeticModules.v.front())); @@ -1552,7 +1553,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, GetModuleDependences(context_.moduleDependences(), sourceFile->content()); ResolveNames(context_, parseTree, topScope); context_.foldingContext().set_moduleFileName(wasModuleFileName); - context_.set_currentHermeticModuleFileScope(nullptr); + context_.set_currentHermeticModuleFileScope(previousHermetic); if (!moduleSymbol) { // Submodule symbols' storage are owned by their parents' scopes, // but their names are not in their parents' dictionaries -- we diff --git a/flang/test/Semantics/modfile75.F90 b/flang/test/Semantics/modfile75.F90 new file mode 100644 index 0000000000000..aba00ffac848a --- /dev/null +++ b/flang/test/Semantics/modfile75.F90 @@ -0,0 +1,17 @@ +!RUN: %flang -c -fhermetic-module-files -DWHICH=1 %s && %flang -c -fhermetic-module-files -DWHICH=2 %s && %flang_fc1 -fdebug-unparse %s | FileCheck %s + +#if WHICH == 1 +module modfile75a + use iso_c_binding +end +#elif WHICH == 2 +module modfile75b + use modfile75a +end +#else +program test + use modfile75b +!CHECK: INTEGER(KIND=4_4) n + integer(c_int) n +end +#endif From flang-commits at lists.llvm.org Wed May 7 13:48:49 2025 From: flang-commits at lists.llvm.org (Jan Patrick Lehr via flang-commits) Date: Wed, 07 May 2025 13:48:49 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Fix driver test after #125643 (PR #138959) Message-ID: https://github.com/jplehr created https://github.com/llvm/llvm-project/pull/138959 None Rate limit · GitHub


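Editorial sketch for the hermetic module file fix quoted above (PR #138785). That patch replaces a reset-to-null with a save-and-restore of the "current hermetic module file scope" pointer. The snippet below is a minimal model of why this matters when one read is nested inside another. It is not flang code: `Scope`, `Context`, `Read`, and the `saveAndRestore` switch are hypothetical stand-ins invented for illustration, and the real `ModFileReader::Read` does considerably more.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not flang's
// Scope or SemanticsContext classes.
struct Scope {};

class Context {
public:
  Scope *current() const { return current_; }
  void set_current(Scope *scope) { current_ = scope; }

private:
  Scope *current_{nullptr};
};

// Models reading a module file that in turn uses `depth` further module
// files. Returns true if, after each nested read returned, every reader
// still saw its own scope as the current one.
bool Read(Context &context, int depth, bool saveAndRestore) {
  Scope *previous{context.current()}; // remember the caller's scope
  Scope mine;
  context.set_current(&mine);
  bool ok{true};
  if (depth > 0) {
    ok = Read(context, depth - 1, saveAndRestore);
    // The nested read must hand our scope back to us.
    ok = ok && context.current() == &mine;
  }
  // Old behavior in this model: reset to null. Patched behavior: restore.
  context.set_current(saveAndRestore ? previous : nullptr);
  return ok;
}
```

With `saveAndRestore` set, each level gets its own scope back once the nested read finishes. With the reset-to-null variant, the outer reader finds a null pointer after the first nested read, which corresponds to the situation the commit message describes.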
From flang-commits at lists.llvm.org Wed May 7 13:49:29 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 13:49:29 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Fix driver test after #125643 (PR #138959) In-Reply-To: Message-ID: <681bc759.050a0220.2f1ad4.3444@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-driver Author: Jan Patrick Lehr (jplehr)
Changes

---
Full diff: https://github.com/llvm/llvm-project/pull/138959.diff

1 Files Affected:

- (modified) flang/test/Driver/mcmodel.f90 (-2)

``````````diff
diff --git a/flang/test/Driver/mcmodel.f90 b/flang/test/Driver/mcmodel.f90
index 12d90ece2f24f..8a03b17bfbcba 100644
--- a/flang/test/Driver/mcmodel.f90
+++ b/flang/test/Driver/mcmodel.f90
@@ -1,5 +1,4 @@
 ! RUN: not %flang -### -c --target=i686 -mcmodel=medium %s 2>&1 | FileCheck --check-prefix=ERR-MEDIUM %s
-! RUN: %flang --target=x86_64 -### -c -mcmodel=tiny %s 2>&1 | FileCheck --check-prefix=TINY %s
 ! RUN: %flang --target=x86_64 -### -c -mcmodel=small %s 2>&1 | FileCheck --check-prefix=SMALL %s
 ! RUN: %flang --target=x86_64 -### -S -mcmodel=kernel %s 2>&1 | FileCheck --check-prefix=KERNEL %s
 ! RUN: %flang --target=x86_64 -### -c -mcmodel=medium %s 2>&1 | FileCheck --check-prefix=MEDIUM %s
@@ -41,4 +40,3 @@
 ! AARCH64-PIC-LARGE: error: invalid argument '-mcmodel=large' only allowed with '-fno-pic'
 
 ! ERR-AARCH64_32: error: unsupported argument 'small' to option '-mcmodel=' for target 'aarch64_32-unknown-linux'
-
``````````
https://github.com/llvm/llvm-project/pull/138959 From flang-commits at lists.llvm.org Wed May 7 13:50:46 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Wed, 07 May 2025 13:50:46 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828) In-Reply-To: Message-ID: <681bc7a6.050a0220.2f01fa.7a0c@mx.google.com> ================ @@ -88,6 +117,67 @@ set(host_sources unit-map.cpp ) +# Module sources that are required by other modules +set(intrinsics_sources + __fortran_builtins.f90 +) + + +#set_property(SOURCE "__fortran_type_info.f90" APPEND PROPERTY OBJECT_DEPENDS "/home/meinersbur/build/llvm-project-flangrt/release_bootstrap/llvm_flang_runtimes/./lib/../include/flang/__fortran_builtins.mod") +#set_property(SOURCE "__fortran_type_info.f90" APPEND PROPERTY OBJECT_DEPENDS "flang-rt/lib/runtime/CMakeFiles/flang_rt.runtime.static.dir/__fortran_builtins.f90.o") +#set_property(SOURCE "__fortran_type_info.f90" APPEND PROPERTY OBJECT_DEPENDS "/home/meinersbur/build/llvm-project-flangrt/release_bootstrap/llvm_flang_runtimes/./lib/../include/flang/__fortran_builtins.mod") + +message("CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}") +message("CMAKE_HOST_SYSTEM_PROCESSOR: ${CMAKE_HOST_SYSTEM_PROCESSOR}") +if (CMAKE_SYSTEM_PROCESSOR STREQUAL "powerpc") + list(APPEND host_source + __ppc_types.f90 + __ppc_intrinsics.f90 + mma.f90 + ) +endif () + +if (FLANG_RT_EXPERIMENTAL_OFFLOAD_SUPPORT STREQUAL "CUDA") + list(APPEND supported_sources + __cuda_builtins.f90 + __cuda_device.f90 + cudadevice.f90 + mma.f90 ---------------- DanielCChen wrote: This seems a typo to me that CUDA needs `mma.f90`. 
https://github.com/llvm/llvm-project/pull/137828 From flang-commits at lists.llvm.org Wed May 7 14:01:15 2025 From: flang-commits at lists.llvm.org (Jan Patrick Lehr via flang-commits) Date: Wed, 07 May 2025 14:01:15 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Fix driver test after #125643 (PR #138959) In-Reply-To: Message-ID: <681bca1b.170a0220.169f91.645d@mx.google.com> jplehr wrote: I'm gonna push the patch to get the failing bots back to green. https://github.com/llvm/llvm-project/pull/138959 From flang-commits at lists.llvm.org Wed May 7 14:01:18 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 14:01:18 -0700 (PDT) Subject: [flang-commits] [flang] b8461ac - [Flang] Fix driver test after #125643 (#138959) Message-ID: <681bca1e.170a0220.2e53e6.95fa@mx.google.com> Author: Jan Patrick Lehr Date: 2025-05-07T23:01:14+02:00 New Revision: b8461acc5eb41ced70cc5c7f5a324cfd8bf76403 URL: https://github.com/llvm/llvm-project/commit/b8461acc5eb41ced70cc5c7f5a324cfd8bf76403 DIFF: https://github.com/llvm/llvm-project/commit/b8461acc5eb41ced70cc5c7f5a324cfd8bf76403.diff LOG: [Flang] Fix driver test after #125643 (#138959) Added: Modified: flang/test/Driver/mcmodel.f90 Removed: ################################################################################ diff --git a/flang/test/Driver/mcmodel.f90 b/flang/test/Driver/mcmodel.f90 index 12d90ece2f24f..8a03b17bfbcba 100644 --- a/flang/test/Driver/mcmodel.f90 +++ b/flang/test/Driver/mcmodel.f90 @@ -1,5 +1,4 @@ ! RUN: not %flang -### -c --target=i686 -mcmodel=medium %s 2>&1 | FileCheck --check-prefix=ERR-MEDIUM %s -! RUN: %flang --target=x86_64 -### -c -mcmodel=tiny %s 2>&1 | FileCheck --check-prefix=TINY %s ! RUN: %flang --target=x86_64 -### -c -mcmodel=small %s 2>&1 | FileCheck --check-prefix=SMALL %s ! RUN: %flang --target=x86_64 -### -S -mcmodel=kernel %s 2>&1 | FileCheck --check-prefix=KERNEL %s ! 
RUN: %flang --target=x86_64 -### -c -mcmodel=medium %s 2>&1 | FileCheck --check-prefix=MEDIUM %s @@ -41,4 +40,3 @@ ! AARCH64-PIC-LARGE: error: invalid argument '-mcmodel=large' only allowed with '-fno-pic' ! ERR-AARCH64_32: error: unsupported argument 'small' to option '-mcmodel=' for target 'aarch64_32-unknown-linux' - From flang-commits at lists.llvm.org Wed May 7 14:01:20 2025 From: flang-commits at lists.llvm.org (Jan Patrick Lehr via flang-commits) Date: Wed, 07 May 2025 14:01:20 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Fix driver test after #125643 (PR #138959) In-Reply-To: Message-ID: <681bca20.170a0220.2ffb3f.8bea@mx.google.com> https://github.com/jplehr closed https://github.com/llvm/llvm-project/pull/138959 From flang-commits at lists.llvm.org Wed May 7 14:45:53 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 07 May 2025 14:45:53 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727) In-Reply-To: Message-ID: <681bd491.630a0220.36e36b.cdbd@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From ee3ae8e0b579439a41982fbd2e577485059d3b73 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, and Destroy. Default derived type I/O is also recursive, but already disabled. It can be added to this new framework later if the overall approach succeeds. 
Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. --- .../include/flang-rt/runtime/work-queue.h | 289 ++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 542 ++++++++++-------- flang-rt/lib/runtime/derived.cpp | 487 ++++++++-------- flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 175 ++++++ flang/include/flang/Runtime/assign.h | 2 +- 7 files changed, 1017 insertions(+), 486 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..743c9ffcf0ede --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,289 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue is a list of tickets. Each ticket class has a Begin() +// member function that is called once, and a Continue() member function +// that can be called zero or more times. 
A ticket's execution terminates +// when either of these member functions returns a status other than +// StatOkContinue, and if that status is not StatOk, then the whole queue +// is shut down. +// +// By returning StatOkContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the responsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentTicketBase, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. +// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. +// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. Run calls AssignTicket::Begin(), which pushes a ticket via +// BeginFinalize() and returns StatOkContinue. +// 3.
FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatOkContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. + +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; +namespace typeInfo { +class DerivedType; +class Component; +} // namespace typeInfo + +// Ticket workers + +// Ticket workers return status codes. Returning StatOkContinue means +// that the ticket is incomplete and must be resumed; any other value +// means that the ticket is complete, and if not StatOk, the whole +// queue can be shut down due to an error. +static constexpr int StatOkContinue{1234}; + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +// Base class for ticket workers that operate elementwise over descriptors +// TODO: if ComponentTicketBase remains this class' only client, +// merge them for better comprehensibility. 
+class ElementalTicketBase { +protected: + RT_API_ATTRS ElementalTicketBase(const Descriptor &instance) + : instance_{instance} { + instance_.GetLowerBounds(subscripts_); + } + RT_API_ATTRS bool CueUpNextItem() const { return elementAt_ < elements_; } + RT_API_ATTRS void AdvanceToNextElement() { + phase_ = 0; + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + } + + const Descriptor &instance_; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + int phase_{0}; + SubscriptValue subscripts_[common::maxRank]; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentTicketBase : protected ElementalTicketBase { +protected: + RT_API_ATTRS ComponentTicketBase( + const Descriptor &instance, const typeInfo::DerivedType &derived); + RT_API_ATTRS bool CueUpNextItem(); + RT_API_ATTRS void AdvanceToNextComponent() { elementAt_ = elements_; } + + const typeInfo::DerivedType &derived_; + const typeInfo::Component *component_{nullptr}; + std::size_t components_{0}, componentAt_{0}; + StaticDescriptor componentDescriptor_; +}; + +// Implements derived type instance initialization +class InitializeTicket : private ComponentTicketBase { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ComponentTicketBase{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket : private ComponentTicketBase { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ComponentTicketBase{original, derived}, clone_{clone}, + 
hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatOkContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : private ComponentTicketBase { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ComponentTicketBase{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : private ComponentTicketBase { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ComponentTicketBase{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : to_{to}, from_{&from}, flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool 
persist_{false}; + bool done_{false}; +}; + +// Implements derived type intrinsic assignment +class DerivedAssignTicket : private ComponentTicketBase { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ComponentTicketBase{to, derived}, from_{from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS void AdvanceToNextElement(); + +private: + const Descriptor &from_; + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + SubscriptValue fromSubscripts_[common::maxRank]; + StaticDescriptor fromComponentDescriptor_; +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + RT_API_ATTRS void BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived); + RT_API_ATTRS void BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg); + RT_API_ATTRS void BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived); + RT_API_ATTRS void BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize); + RT_API_ATTRS void BeginAssign( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct); + RT_API_ATTRS void BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const 
typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter); + + RT_API_ATTRS int Run(); + +private: + // Most uses of the work queue won't go very deep. + static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. 
@@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 4a813cd489022..054d9672bcf11 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -99,11 +100,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncId)); } // least <= 0, most >= 0 @@ -228,6 +225,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic assignments and // those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -241,274 +240,339 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? 
toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + workQueue.BeginAssign(to, from, flags, memmoveFct); + workQueue.Run(); +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + const typeInfo::SpecialBinding *scalarDefinedAssignment{nullptr}; + const typeInfo::SpecialBinding *elementalDefinedAssignment{nullptr}; + if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. A user-defined assignment TBP defines all of + // the semantics, including allocatable (re)allocation and any + // finalization. + // + // Note that the aliasing and LHS (re)allocation handling below + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (to_.rank() == 0) { + scalarDefinedAssignment = toDerived_->FindSpecialBinding( + typeInfo::SpecialBinding::Which::ScalarAssignment); + } + if (!scalarDefinedAssignment) { + elementalDefinedAssignment = toDerived_->FindSpecialBinding( + typeInfo::SpecialBinding::Which::ElementalAssignment); + } + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + } else if (!IsSimpleMemmove() || scalarDefinedAssignment || + elementalDefinedAssignment) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncId))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + workQueue.BeginInitialize(newFrom, *derived); + } + } } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + workQueue.BeginAssign( + newFrom, *from_, MaybeReallocate | PolymorphicLHS, memmoveFct_); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } + if (toDeallocate_ && toDerived_ && (flags_ & NeedFinalization)) { + // Schedule finalization for the RHS temporary or old LHS. 
+ workQueue.BeginFinalize(*toDeallocate_, *toDerived_); + flags_ &= ~NeedFinalization; + } + } + if (scalarDefinedAssignment) { + DoScalarDefinedAssignment(to_, *from_, *scalarDefinedAssignment); + done_ = true; + return StatOkContinue; + } else if (elementalDefinedAssignment) { + DoElementalDefinedAssignment( + to_, *from_, *toDerived_, *elementalDefinedAssignment); + done_ = true; + return StatOk; } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; - } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. 
+ if (flags_ & NeedFinalization) { + workQueue.BeginFinalize(to_, *toDerived_); + } else if (!toDerived_->noDestructionNeeded()) { + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false); } } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( - typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + return StatOkContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); + } + return StatOk; + } + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. 
+ if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + if (const auto *addendum{from_->Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + toDerived_ = derived; } } - if (const auto *special{toDerived->FindSpecialBinding( - typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + workQueue.BeginInitialize(to_, *toDerived_); } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t toElementBytes{to_.ElementBytes()}; + std::size_t 
fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); - } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } + if (toDerived_) { + workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_); + toDeallocate_ = nullptr; + } else { + if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { + case CFI_type_signed_char: + case CFI_type_char: + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, + toElements, toElementBytes, fromElementBytes); break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. - // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. 
- Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; + case CFI_type_char16_t: + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, + toElements, toElementBytes, fromElementBytes); + break; + case CFI_type_char32_t: + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, + toElements, toElementBytes, fromElementBytes); + break; + default: + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); + } + } else { // elemental copies, possibly with character truncation + for (std::size_t n{toElements}; n-- > 0; + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), + from_->Element(fromAt), toElementBytes); } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); } } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { - case CFI_type_signed_char: - case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, - toElementBytes, fromElementBytes); - break; - case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, - toElements, toElementBytes, fromElementBytes); - break; - case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, - toElements, toElementBytes, fromElementBytes); - break; - default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + } + if (persist_) { + done_ = true; + 
return StatOkContinue; + } + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; +} + +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + from_.GetLowerBounds(fromSubscripts_); + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + std::size_t numProcPtrs{procPtrDesc.Elements()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + memmoveFct_(instance_.Element(subscripts_) + procPtr.offset, + from_.Element(fromSubscripts_) + procPtr.offset, + sizeof(typeInfo::ProcedurePointer)); + } + return StatOkContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + for (; CueUpNextItem(); AdvanceToNextElement()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, from_, workQueue.terminator(), fromSubscripts_); + AdvanceToNextElement(); + workQueue.BeginAssign(toCompDesc, fromCompDesc, flags_, memmoveFct_); + return StatOkContinue; + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_.Element(fromSubscripts_) + component_->offset(), + componentByteSize); } - } else { // elemental copies, possibly with character truncation - for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), - toElementBytes); + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_.Element(fromSubscripts_) + component_->offset(), + componentByteSize); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_.Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + workQueue.BeginDestroy( + *toDesc, 
*componentDerived, /*finalize=*/false); + return StatOkContinue; + } + } + toDesc->Deallocate(); + } + } else { + // Allocatable components of the LHS are unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + workQueue.BeginAssign( + *toDesc, *fromDesc, flags_ | DeallocateLHS, memmoveFct_); + AdvanceToNextElement(); + return StatOkContinue; } + } break; } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. - deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); } + return StatOk; } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS void DerivedAssignTicket::AdvanceToNextElement() { + ComponentTicketBase::AdvanceToNextElement(); + from_.IncrementSubscripts(fromSubscripts_); +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -578,7 +642,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -597,8 +660,9 @@ void RTDEF(CopyOutAssign)( // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
- if (var) + if (var) { Assign(*var, temp, terminator, NoAssignFlags); + } temp.Destroy(/*finalize=*/false, /*destroyPointers=*/false, &terminator); } diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..0f461f529fae6 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,174 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + workQueue.BeginInitialize(instance, derived); + return workQueue.Run(); +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + std::size_t myProcPtrs{procPtrDesc.Elements()}; + for (std::size_t k{0}; k < myProcPtrs; ++k) { const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; + *procPtrDesc.ZeroBasedIndexedElement(k)}; SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; 
instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + instance_.GetLowerBounds(at); + for (std::size_t j{0}; j++ < elements_; instance_.IncrementSubscripts(at)) { + auto &pptr{*instance_.ElementComponent( + at, comp.offset)}; + pptr = comp.procInitialization; + } + } + return StatOkContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (CueUpNextItem()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; elementAt_ < elements_; AdvanceToNextElement()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; elementAt_ < elements_; AdvanceToNextElement()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, 
init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; elementAt_ < elements_; AdvanceToNextElement()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } - } - } - } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + AdvanceToNextElement(); + workQueue.BeginInitialize(compDesc, compType); + return StatOkContinue; + } else { + AdvanceToNextComponent(); } } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + workQueue.BeginInitializeClone(clone, original, derived, hasStat, errMsg); + return workQueue.Run(); } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (CueUpNextItem()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. - stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. 
- if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); - } + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncId), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + workQueue.BeginInitialize(cloneDesc, *derived); + return StatOkContinue; } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_); + return StatOkContinue; + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + AdvanceToNextElement(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_); + AdvanceToNextElement(); + return StatOkContinue; // will resume at next element in this component + } else { + AdvanceToNextComponent(); } + } else { + AdvanceToNextComponent(); } } - return stat; + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + workQueue.BeginFinalize(descriptor, derived); + workQueue.Run(); + } } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +216,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +253,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,86 +277,84 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with the same // descriptor so that the rank is
preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (!finalizableParentType_->noFinalizationNeeded()) { + componentAt_ = 1; + } else { + finalizableParentType_ = nullptr; + } + } + return StatOkContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (CueUpNextItem()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); - } + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + AdvanceToNextElement(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + workQueue.BeginFinalize(compDesc, *compDynamicType); + return StatOkContinue; } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + AdvanceToNextElement(); + if (compDesc.IsAllocated()) { + workQueue.BeginFinalize(compDesc, *compType); } + } else { + AdvanceToNextComponent(); } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && 
!comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + AdvanceToNextElement(); + workQueue.BeginFinalize(compDesc, compType); + return StatOkContinue; + } else { + AdvanceToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + workQueue.BeginFinalize(tmpDesc, *finalizableParentType_); + finalizableParentType_ = nullptr; + return StatOkContinue; + } else { + return StatOk; } } @@ -373,51 +364,61 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + workQueue.BeginDestroy(descriptor, derived, finalize); + workQueue.Run(); } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + workQueue.BeginFinalize(instance_, derived_); } + return StatOkContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. // Contrary to finalization, the order of deallocation does not matter. 
- const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (CueUpNextItem()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + workQueue.BeginDestroy(*d, *componentDerived, /*finalize=*/false); + return StatOkContinue; + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + AdvanceToNextElement(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || componentDerived->noDestructionNeeded()) { + AdvanceToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + 
GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + AdvanceToNextElement(); + workQueue.BeginDestroy(compDesc, *componentDerived, /*finalize=*/false); + return StatOkContinue; } + } else { + AdvanceToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..0ae50e72bb3a9 --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,175 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS ComponentTicketBase::ComponentTicketBase( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ElementalTicketBase{instance}, derived_{derived}, + components_{derived.component().Elements()} {} + +RT_API_ATTRS bool ComponentTicketBase::CueUpNextItem() { + bool elementsDone{!ElementalTicketBase::CueUpNextItem()}; + if (elementsDone) { + component_ = nullptr; + ++componentAt_; + } + if (!component_) { + if (componentAt_ >= components_) { + return false; // done! + } + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + if (elementsDone) { + ElementalTicketBase::Reset(); + } + } + return true; +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + } + firstFree_ = next; + } +} + +RT_API_ATTRS void WorkQueue::BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + StartTicket().u.emplace(descriptor, derived); +} + +RT_API_ATTRS void WorkQueue::BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const 
typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); +} + +RT_API_ATTRS void WorkQueue::BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + StartTicket().u.emplace(descriptor, derived); +} + +RT_API_ATTRS void WorkQueue::BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + StartTicket().u.emplace(descriptor, derived, finalize); +} + +RT_API_ATTRS void WorkQueue::BeginAssign( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) { + StartTicket().u.emplace(to, from, flags, memmoveFct); +} + +RT_API_ATTRS void WorkQueue::BeginDerivedAssign(Descriptor &to, + const Descriptor &from, const typeInfo::DerivedType &derived, int flags, + MemmoveFct memmoveFct, Descriptor *deallocateAfter) { + StartTicket().u.emplace( + to, from, derived, flags, memmoveFct, deallocateAfter); +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? 
insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; + int stat{at->ticket.Continue(*this)}; + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatOkContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime \ No newline at end of file diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION From flang-commits at lists.llvm.org Wed May 7 14:53:52 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 14:53:52 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [lld] [lldb] [llvm] [mlir] [polly] [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS in standalone builds (PR #138587) In-Reply-To: Message-ID: <681bd670.170a0220.35024e.2320@mx.google.com> https://github.com/jeremyd2019 
edited https://github.com/llvm/llvm-project/pull/138587 From flang-commits at lists.llvm.org Wed May 7 16:12:05 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Wed, 07 May 2025 16:12:05 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][MLIR] - Handle the mapping of subroutine arguments when they are subsequently used inside the region of an `omp.target` Op (PR #134967) In-Reply-To: Message-ID: <681be8c5.050a0220.31513.372a@mx.google.com> https://github.com/TIFitis approved this pull request. https://github.com/llvm/llvm-project/pull/134967 From flang-commits at lists.llvm.org Wed May 7 16:54:19 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Wed, 07 May 2025 16:54:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Use box for components with non-default lower bounds (PR #138994) Message-ID: https://github.com/ashermancinelli created https://github.com/llvm/llvm-project/pull/138994 When designating an array component that has non-default lower bounds the bridge was producing hlfir designates yielding reference types, which did not preserve the bounds information. Then, when creating components, unadjusted indices were used when initializing the structure. We could look at the declaration to get the shape parameter, but this would not be preserved if the component were passed as a block argument. These results must be boxed, but we also must not lose the contiguity information either. To address contiguity, annotate these boxes with the `contiguous` attribute during designation. Note that other designated entities are handled inside the HlfirDesignatorBuilder while component designators are built in HlfirBuilder. I am not sure if this handling should be moved into the designator builder or left in the general builder, so feedback is welcome. 
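To make the indexing problem described above concrete, here is a minimal C++ sketch. It is an analogy only, not flang code: `ArrayView`, `elementAssumingDefaultBound`, and `elementWithBounds` are hypothetical names invented for illustration. It models a one-dimensional component declared with bounds 2:11. A bare reference carries only the base address, so a consumer has to assume the default lower bound of 1, while a box-like view carries the lower bound along and can map a Fortran index to the right storage offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a boxed array: the storage plus its lower bound.
struct ArrayView {
  const std::vector<int> *storage;
  std::ptrdiff_t lowerBound; // Fortran lower bound, e.g. 2 for x(2:11)
};

// What a consumer of a bare reference has to do: assume lower bound 1.
inline int elementAssumingDefaultBound(
    const std::vector<int> &storage, std::ptrdiff_t fortranIndex) {
  return storage.at(static_cast<std::size_t>(fortranIndex - 1));
}

// What a consumer of a box can do: subtract the recorded lower bound.
inline int elementWithBounds(
    const ArrayView &view, std::ptrdiff_t fortranIndex) {
  return view.storage->at(
      static_cast<std::size_t>(fortranIndex - view.lowerBound));
}
```

With ten elements stored for x(2:11), asking for x(2) through the bounds-aware accessor yields the first stored element, whereas the accessor that assumes a lower bound of 1 yields the second one. That off-by-(lower bound - 1) mismatch is the kind of error the description attributes to unadjusted indices.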
From cfcafcf57776beaab75c1c69095c8639b37644aa Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Wed, 7 May 2025 16:43:12 -0700 Subject: [PATCH] [flang] Use box for components with non-default lower bounds When designating an array component that has non-default lower bounds the bridge was producing hlfir designates yielding reference types, which did not preserve the bounds information. Then, when creating components, the correct indices were not used when initializing a structure. We could look at the declaration to get the shape parameter, but this would not be preserved if the component were passed as a block argument. These results must be boxed, but we also must not lose the contiguity information either. To address contiguity, annotate these boxes with the `contiguous` attribute during designation. Note that other designated entities are handled inside the HlfirDesignatorBuilder while component designators are built in HlfirBuilder. I am not sure if this handling should be moved into the designator builder or left in the general builder.
--- flang/lib/Lower/ConvertExprToHLFIR.cpp | 22 ++++++++++++++++--- .../Lower/HLFIR/designators-component-ref.f90 | 10 +++++++++ 2 files changed, 29 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/ConvertExprToHLFIR.cpp b/flang/lib/Lower/ConvertExprToHLFIR.cpp index 04b63f92a1fb4..da21d407e927d 100644 --- a/flang/lib/Lower/ConvertExprToHLFIR.cpp +++ b/flang/lib/Lower/ConvertExprToHLFIR.cpp @@ -30,6 +30,7 @@ #include "flang/Optimizer/Builder/Runtime/Derived.h" #include "flang/Optimizer/Builder/Runtime/Pointer.h" #include "flang/Optimizer/Builder/Todo.h" +#include "flang/Optimizer/Dialect/FIRAttr.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "mlir/IR/IRMapping.h" #include "llvm/ADT/TypeSwitch.h" @@ -125,6 +126,19 @@ class HlfirDesignatorBuilder { hlfir::ElementalAddrOp convertVectorSubscriptedExprToElementalAddr( const Fortran::lower::SomeExpr &designatorExpr); + std::tuple + genComponentDesignatorTypeAndAttributes( + const Fortran::semantics::Symbol &componentSym, mlir::Type fieldType, + bool isVolatile) { + if (mayHaveNonDefaultLowerBounds(componentSym)) { + mlir::Type boxType = fir::BoxType::get(fieldType, isVolatile); + return std::make_tuple(boxType, + fir::FortranVariableFlagsEnum::contiguous); + } + auto refType = fir::ReferenceType::get(fieldType, isVolatile); + return std::make_tuple(refType, fir::FortranVariableFlagsEnum{}); + } + mlir::Value genComponentShape(const Fortran::semantics::Symbol &componentSym, mlir::Type fieldType) { // For pointers and allocatable components, the @@ -1863,8 +1877,9 @@ class HlfirBuilder { designatorBuilder.genComponentShape(sym, compType); const bool isDesignatorVolatile = fir::isa_volatile_type(baseOp.getType()); - mlir::Type designatorType = - builder.getRefType(compType, isDesignatorVolatile); + auto [designatorType, extraAttributeFlags] = + designatorBuilder.genComponentDesignatorTypeAndAttributes( + sym, compType, isDesignatorVolatile); mlir::Type fieldElemType = 
hlfir::getFortranElementType(compType); llvm::SmallVector typeParams; @@ -1884,7 +1899,8 @@ class HlfirBuilder { // Convert component symbol attributes to variable attributes. fir::FortranVariableFlagsAttr attrs = - Fortran::lower::translateSymbolAttributes(builder.getContext(), sym); + Fortran::lower::translateSymbolAttributes(builder.getContext(), sym, + extraAttributeFlags); // Get the component designator. auto lhs = builder.create( diff --git a/flang/test/Lower/HLFIR/designators-component-ref.f90 b/flang/test/Lower/HLFIR/designators-component-ref.f90 index 653e28e0a6018..935176becac75 100644 --- a/flang/test/Lower/HLFIR/designators-component-ref.f90 +++ b/flang/test/Lower/HLFIR/designators-component-ref.f90 @@ -126,6 +126,16 @@ subroutine test_array_comp_non_contiguous_slice(a) ! CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_1]]#0{"array_comp"} <%[[VAL_9]]> (%[[VAL_10]]:%[[VAL_11]]:%[[VAL_12]], %[[VAL_14]]:%[[VAL_15]]:%[[VAL_16]]) shape %[[VAL_18]] : (!fir.ref}>>, !fir.shape<2>, index, index, index, index, index, index, !fir.shape<2>) -> !fir.box> end subroutine +subroutine test_array_lbs_array_ctor() + use comp_ref + type(t_array_lbs) :: a(-1:1) + real :: array_comp(2:11,3:22) + a = (/ (t_array_lbs(i, array_comp), i=-1,1) /) +! CHECK: hlfir.designate %{{.+}}#0{"array_comp_lbs"} <%{{.+}}> shape %{{.+}} {fortran_attrs = #fir.var_attrs} +! CHECK-SAME: (!fir.ref}>>, !fir.shapeshift<2>, !fir.shapeshift<2>) +! CHECK-SAME: -> !fir.box> +end subroutine + subroutine test_array_lbs_comp_lbs_1(a) use comp_ref type(t_array_lbs) :: a From flang-commits at lists.llvm.org Wed May 7 16:54:53 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 16:54:53 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Use box for components with non-default lower bounds (PR #138994) In-Reply-To: Message-ID: <681bf2cd.050a0220.d2984.0002@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Asher Mancinelli (ashermancinelli)
Changes When designating an array component that has non-default lower bounds the bridge was producing hlfir designates yielding reference types, which did not preserve the bounds information. Then, when creating components, unadjusted indices were used when initializing the structure. We could look at the declaration to get the shape parameter, but this would not be preserved if the component were passed as a block argument. These results must be boxed, but we also must not lose the contiguity information either. To address contiguity, annotate these boxes with the `contiguous` attribute during designation. Note that other designated entities are handled inside the HlfirDesignatorBuilder while component designators are built in HlfirBuilder. I am not sure if this handling should be moved into the designator builder or left in the general builder, so feedback is welcome. --- Full diff: https://github.com/llvm/llvm-project/pull/138994.diff 2 Files Affected: - (modified) flang/lib/Lower/ConvertExprToHLFIR.cpp (+19-3) - (modified) flang/test/Lower/HLFIR/designators-component-ref.f90 (+10) ``````````diff diff --git a/flang/lib/Lower/ConvertExprToHLFIR.cpp b/flang/lib/Lower/ConvertExprToHLFIR.cpp index 04b63f92a1fb4..da21d407e927d 100644 --- a/flang/lib/Lower/ConvertExprToHLFIR.cpp +++ b/flang/lib/Lower/ConvertExprToHLFIR.cpp @@ -30,6 +30,7 @@ #include "flang/Optimizer/Builder/Runtime/Derived.h" #include "flang/Optimizer/Builder/Runtime/Pointer.h" #include "flang/Optimizer/Builder/Todo.h" +#include "flang/Optimizer/Dialect/FIRAttr.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "mlir/IR/IRMapping.h" #include "llvm/ADT/TypeSwitch.h" @@ -125,6 +126,19 @@ class HlfirDesignatorBuilder { hlfir::ElementalAddrOp convertVectorSubscriptedExprToElementalAddr( const Fortran::lower::SomeExpr &designatorExpr); + std::tuple + genComponentDesignatorTypeAndAttributes( + const Fortran::semantics::Symbol &componentSym, mlir::Type fieldType, + bool isVolatile) { + if 
(mayHaveNonDefaultLowerBounds(componentSym)) { + mlir::Type boxType = fir::BoxType::get(fieldType, isVolatile); + return std::make_tuple(boxType, + fir::FortranVariableFlagsEnum::contiguous); + } + auto refType = fir::ReferenceType::get(fieldType, isVolatile); + return std::make_tuple(refType, fir::FortranVariableFlagsEnum{}); + } + mlir::Value genComponentShape(const Fortran::semantics::Symbol &componentSym, mlir::Type fieldType) { // For pointers and allocatable components, the @@ -1863,8 +1877,9 @@ class HlfirBuilder { designatorBuilder.genComponentShape(sym, compType); const bool isDesignatorVolatile = fir::isa_volatile_type(baseOp.getType()); - mlir::Type designatorType = - builder.getRefType(compType, isDesignatorVolatile); + auto [designatorType, extraAttributeFlags] = + designatorBuilder.genComponentDesignatorTypeAndAttributes( + sym, compType, isDesignatorVolatile); mlir::Type fieldElemType = hlfir::getFortranElementType(compType); llvm::SmallVector typeParams; @@ -1884,7 +1899,8 @@ class HlfirBuilder { // Convert component symbol attributes to variable attributes. fir::FortranVariableFlagsAttr attrs = - Fortran::lower::translateSymbolAttributes(builder.getContext(), sym); + Fortran::lower::translateSymbolAttributes(builder.getContext(), sym, + extraAttributeFlags); // Get the component designator. auto lhs = builder.create( diff --git a/flang/test/Lower/HLFIR/designators-component-ref.f90 b/flang/test/Lower/HLFIR/designators-component-ref.f90 index 653e28e0a6018..935176becac75 100644 --- a/flang/test/Lower/HLFIR/designators-component-ref.f90 +++ b/flang/test/Lower/HLFIR/designators-component-ref.f90 @@ -126,6 +126,16 @@ subroutine test_array_comp_non_contiguous_slice(a) ! 
CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_1]]#0{"array_comp"} <%[[VAL_9]]> (%[[VAL_10]]:%[[VAL_11]]:%[[VAL_12]], %[[VAL_14]]:%[[VAL_15]]:%[[VAL_16]]) shape %[[VAL_18]] : (!fir.ref}>>, !fir.shape<2>, index, index, index, index, index, index, !fir.shape<2>) -> !fir.box> end subroutine +subroutine test_array_lbs_array_ctor() + use comp_ref + type(t_array_lbs) :: a(-1:1) + real :: array_comp(2:11,3:22) + a = (/ (t_array_lbs(i, array_comp), i=-1,1) /) +! CHECK: hlfir.designate %{{.+}}#0{"array_comp_lbs"} <%{{.+}}> shape %{{.+}} {fortran_attrs = #fir.var_attrs} +! CHECK-SAME: (!fir.ref}>>, !fir.shapeshift<2>, !fir.shapeshift<2>) +! CHECK-SAME: -> !fir.box> +end subroutine + subroutine test_array_lbs_comp_lbs_1(a) use comp_ref type(t_array_lbs) :: a ``````````
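To make the lower-bounds problem described in this PR concrete, here is a minimal sketch in plain C++. It is not flang code: `BoxView`, `offsetAssumingDefaultBounds`, `offsetThroughBox`, and `loadThroughBox` are hypothetical names invented for illustration, and a one-dimensional array stands in for the two-dimensional component used in the test. The sketch assumes only what the description states: a bare reference carries no bounds, so a consumer falls back to the Fortran default lower bound of 1, while a box (descriptor) carries the lower bound with the data.

```cpp
#include <cassert>

// Hypothetical stand-in for a boxed (descriptor) view of a 1-D array
// component: the lower bound and extent travel with the base address.
struct BoxView {
  const double *base;
  long lowerBound;
  long extent;
};

// With only a reference there is no bounds information, so the only
// available assumption is the Fortran default lower bound of 1.
inline long offsetAssumingDefaultBounds(long fortranIndex) {
  return fortranIndex - 1;
}

// With a box, the zero-based offset is computed from the recorded
// lower bound, so indices are adjusted correctly.
inline long offsetThroughBox(const BoxView &box, long fortranIndex) {
  assert(fortranIndex >= box.lowerBound &&
         fortranIndex < box.lowerBound + box.extent);
  return fortranIndex - box.lowerBound;
}

// Load the element designated by a Fortran index through the box.
inline double loadThroughBox(const BoxView &box, long fortranIndex) {
  return box.base[offsetThroughBox(box, fortranIndex)];
}
```

For a component declared with bounds `2:4`, index 2 designates the first element. The boxed path yields offset 0, whereas the default-bounds assumption yields offset 1, which is the kind of unadjusted indexing the description refers to.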
https://github.com/llvm/llvm-project/pull/138994 From flang-commits at lists.llvm.org Wed May 7 17:14:44 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Wed, 07 May 2025 17:14:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Use box for components with non-default lower bounds (PR #138994) In-Reply-To: Message-ID: <681bf774.170a0220.35dbc3.bad7@mx.google.com> https://github.com/ashermancinelli edited https://github.com/llvm/llvm-project/pull/138994 From flang-commits at lists.llvm.org Wed May 7 18:56:50 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Wed, 07 May 2025 18:56:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Treat hlfir.associate as Allocate for FIR alias analysis. (PR #139004) Message-ID: https://github.com/vzakhari created https://github.com/llvm/llvm-project/pull/139004 Early HLFIR optimizations may experience problems with values produced by hlfir.associate. In most cases this is a unique local memory allocation, but it can also reuse some other hlfir.expr memory sometimes. It seems to be safe to assume unique allocation for trivial types, since we always allocate new memory for them. >From ae126b8ec224d6ae963ee666839cdef70fd5e4ab Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Wed, 7 May 2025 18:36:10 -0700 Subject: [PATCH] [flang] Treat hlfir.associate as Allocate for FIR alias analysis. Early HLFIR optimizations may experience problems with values produced by hlfir.associate. In most cases this is a unique local memory allocation, but it can also reuse some other hlfir.expr memory sometimes. It seems to be safe to assume unique allocation for trivial types, since we always allocate new memory for them. 
--- .../lib/Optimizer/Analysis/AliasAnalysis.cpp | 14 +++++++++ ...fferization-eval_in_mem-with-associate.fir | 30 +++++++++++++++++++ 2 files changed, 44 insertions(+) create mode 100644 flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir diff --git a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp index cbfc8b63ab64d..73ddd1ff80126 100644 --- a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp +++ b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp @@ -540,6 +540,20 @@ AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v, v = op.getVar(); defOp = v.getDefiningOp(); }) + .Case([&](auto op) { + mlir::Value source = op.getSource(); + if (fir::isa_trivial(source.getType())) { + // Trivial values will always use distinct temp memory, + // so we can classify this as Allocate and stop. + type = SourceKind::Allocate; + breakFromLoop = true; + } else { + // AssociateOp may reuse the expression storage, + // so we have to trace further. + v = source; + defOp = v.getDefiningOp(); + } + }) .Case([&](auto op) { // Unique memory allocation. type = SourceKind::Allocate; diff --git a/flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir b/flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir new file mode 100644 index 0000000000000..2f5f88ff9e7e8 --- /dev/null +++ b/flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir @@ -0,0 +1,30 @@ +// RUN: fir-opt --opt-bufferization %s | FileCheck %s + +// Verify that hlfir.eval_in_mem uses the LHS array instead +// of allocating a temporary. 
+func.func @_QPtest() { + %cst = arith.constant 1.000000e+00 : f32 + %c10 = arith.constant 10 : index + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.alloca !fir.array<10xf32> {bindc_name = "x", uniq_name = "_QFtestEx"} + %2 = fir.shape %c10 : (index) -> !fir.shape<1> + %3:2 = hlfir.declare %1(%2) {uniq_name = "_QFtestEx"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) + %4:3 = hlfir.associate %cst {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) + %5 = hlfir.eval_in_mem shape %2 : (!fir.shape<1>) -> !hlfir.expr<10xf32> { + ^bb0(%arg0: !fir.ref>): + %6 = fir.call @_QParray_func(%4#0) fastmath : (!fir.ref) -> !fir.array<10xf32> + fir.save_result %6 to %arg0(%2) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> + } + hlfir.assign %5 to %3#0 : !hlfir.expr<10xf32>, !fir.ref> + hlfir.end_associate %4#1, %4#2 : !fir.ref, i1 + hlfir.destroy %5 : !hlfir.expr<10xf32> + return +} +// CHECK-LABEL: func.func @_QPtest() { +// CHECK: %[[VAL_0:.*]] = arith.constant 1.000000e+00 : f32 +// CHECK: %[[VAL_3:.*]] = fir.alloca !fir.array<10xf32> {bindc_name = "x", uniq_name = "_QFtestEx"} +// CHECK: %[[VAL_5:.*]]:2 = hlfir.declare %[[VAL_3]](%[[VAL_4:.*]]) {uniq_name = "_QFtestEx"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) +// CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_0]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +// CHECK: %[[VAL_7:.*]] = fir.call @_QParray_func(%[[VAL_6]]#0) fastmath : (!fir.ref) -> !fir.array<10xf32> +// CHECK: fir.save_result %[[VAL_7]] to %[[VAL_5]]#0(%[[VAL_4]]) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +// CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 From flang-commits at lists.llvm.org Wed May 7 18:57:24 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 18:57:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Treat hlfir.associate as Allocate for FIR alias analysis. 
(PR #139004) In-Reply-To: Message-ID: <681c0f84.170a0220.2e53e6.af0d@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Slava Zakharin (vzakhari)
Changes Early HLFIR optimizations may experience problems with values produced by hlfir.associate. In most cases this is a unique local memory allocation, but it can also reuse some other hlfir.expr memory sometimes. It seems to be safe to assume unique allocation for trivial types, since we always allocate new memory for them. --- Full diff: https://github.com/llvm/llvm-project/pull/139004.diff 2 Files Affected: - (modified) flang/lib/Optimizer/Analysis/AliasAnalysis.cpp (+14) - (added) flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir (+30) ``````````diff diff --git a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp index cbfc8b63ab64d..73ddd1ff80126 100644 --- a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp +++ b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp @@ -540,6 +540,20 @@ AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v, v = op.getVar(); defOp = v.getDefiningOp(); }) + .Case([&](auto op) { + mlir::Value source = op.getSource(); + if (fir::isa_trivial(source.getType())) { + // Trivial values will always use distinct temp memory, + // so we can classify this as Allocate and stop. + type = SourceKind::Allocate; + breakFromLoop = true; + } else { + // AssociateOp may reuse the expression storage, + // so we have to trace further. + v = source; + defOp = v.getDefiningOp(); + } + }) .Case([&](auto op) { // Unique memory allocation. type = SourceKind::Allocate; diff --git a/flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir b/flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir new file mode 100644 index 0000000000000..2f5f88ff9e7e8 --- /dev/null +++ b/flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir @@ -0,0 +1,30 @@ +// RUN: fir-opt --opt-bufferization %s | FileCheck %s + +// Verify that hlfir.eval_in_mem uses the LHS array instead +// of allocating a temporary. 
+func.func @_QPtest() { + %cst = arith.constant 1.000000e+00 : f32 + %c10 = arith.constant 10 : index + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.alloca !fir.array<10xf32> {bindc_name = "x", uniq_name = "_QFtestEx"} + %2 = fir.shape %c10 : (index) -> !fir.shape<1> + %3:2 = hlfir.declare %1(%2) {uniq_name = "_QFtestEx"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) + %4:3 = hlfir.associate %cst {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) + %5 = hlfir.eval_in_mem shape %2 : (!fir.shape<1>) -> !hlfir.expr<10xf32> { + ^bb0(%arg0: !fir.ref>): + %6 = fir.call @_QParray_func(%4#0) fastmath : (!fir.ref) -> !fir.array<10xf32> + fir.save_result %6 to %arg0(%2) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> + } + hlfir.assign %5 to %3#0 : !hlfir.expr<10xf32>, !fir.ref> + hlfir.end_associate %4#1, %4#2 : !fir.ref, i1 + hlfir.destroy %5 : !hlfir.expr<10xf32> + return +} +// CHECK-LABEL: func.func @_QPtest() { +// CHECK: %[[VAL_0:.*]] = arith.constant 1.000000e+00 : f32 +// CHECK: %[[VAL_3:.*]] = fir.alloca !fir.array<10xf32> {bindc_name = "x", uniq_name = "_QFtestEx"} +// CHECK: %[[VAL_5:.*]]:2 = hlfir.declare %[[VAL_3]](%[[VAL_4:.*]]) {uniq_name = "_QFtestEx"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) +// CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_0]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +// CHECK: %[[VAL_7:.*]] = fir.call @_QParray_func(%[[VAL_6]]#0) fastmath : (!fir.ref) -> !fir.array<10xf32> +// CHECK: fir.save_result %[[VAL_7]] to %[[VAL_5]]#0(%[[VAL_4]]) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +// CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 ``````````
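The decision added to `AliasAnalysis::getSource` can be modeled outside MLIR. The following is a simplified sketch in plain C++, not the actual flang implementation: `Node`, `OpKind`, and `classify` are hypothetical, the "trivial type" check is reduced to a boolean flag, and the set of source kinds is cut down to three. It only mirrors the control flow from the patch: an associate of a trivial value is classified as a fresh allocation and tracing stops, otherwise tracing continues through the associated expression's source.

```cpp
#include <cassert>

// Reduced, hypothetical set of source kinds for illustration.
enum class SourceKind { Allocate, Argument, Unknown };

// Reduced, hypothetical set of defining operations.
enum class OpKind { Associate, Alloca, BlockArgument };

// A value in the def chain. `sourceIsTrivial` and `source` are only
// meaningful when `kind` is OpKind::Associate.
struct Node {
  OpKind kind;
  bool sourceIsTrivial;
  const Node *source;
};

// Walk the def chain until a source kind can be decided.
inline SourceKind classify(const Node *node) {
  while (node) {
    switch (node->kind) {
    case OpKind::Associate:
      // Trivial values always get distinct temporary memory, so this
      // is treated as a unique allocation and tracing stops.
      if (node->sourceIsTrivial)
        return SourceKind::Allocate;
      // Otherwise the storage may be reused: keep tracing.
      node = node->source;
      continue;
    case OpKind::Alloca:
      return SourceKind::Allocate;
    case OpKind::BlockArgument:
      return SourceKind::Argument;
    }
  }
  // The chain ended without a recognized defining operation.
  return SourceKind::Unknown;
}
```

In this model, an associate of a scalar constant (as in the test, where an `f32` constant is associated) classifies as `Allocate`, while an associate of a non-trivial expression inherits whatever its source traces back to.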
https://github.com/llvm/llvm-project/pull/139004 From flang-commits at lists.llvm.org Wed May 7 19:04:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 19:04:21 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609) In-Reply-To: Message-ID: <681c1125.170a0220.3cda32.af52@mx.google.com> https://github.com/jofrn updated https://github.com/llvm/llvm-project/pull/123609

From flang-commits at lists.llvm.org Wed May 7 22:50:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 22:50:55 -0700 (PDT) Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790) In-Reply-To: Message-ID: <681c463f.050a0220.48da2.525f@mx.google.com> https://github.com/NexMing updated https://github.com/llvm/llvm-project/pull/137790 >From 17dfda28fdf8eb8184283b686e6831a3c8b7a9ab Mon Sep 17 00:00:00 2001 From: yanming Date: Tue, 29 Apr 2025 19:16:48 +0800 Subject: [PATCH] [fir] Support promoting `fir.do_loop` with results to `affine.for`. --- .../Optimizer/Transforms/AffinePromotion.cpp | 39 +++++++++-- flang/test/Fir/affine-promotion.fir | 65 +++++++++++++++++++ 2 files changed, 99 insertions(+), 5 deletions(-) diff --git a/flang/lib/Optimizer/Transforms/AffinePromotion.cpp b/flang/lib/Optimizer/Transforms/AffinePromotion.cpp index 43fccf52dc8ab..ef82e400bea14 100644 --- a/flang/lib/Optimizer/Transforms/AffinePromotion.cpp +++ b/flang/lib/Optimizer/Transforms/AffinePromotion.cpp @@ -49,8 +49,9 @@ struct AffineIfAnalysis; /// second when doing rewrite. 
struct AffineFunctionAnalysis { explicit AffineFunctionAnalysis(mlir::func::FuncOp funcOp) { - for (fir::DoLoopOp op : funcOp.getOps()) - loopAnalysisMap.try_emplace(op, op, *this); + funcOp->walk([&](fir::DoLoopOp doloop) { + loopAnalysisMap.try_emplace(doloop, doloop, *this); + }); } AffineLoopAnalysis getChildLoopAnalysis(fir::DoLoopOp op) const; @@ -102,10 +103,23 @@ struct AffineLoopAnalysis { return true; } + bool analysisResults(fir::DoLoopOp loopOperation) { + if (loopOperation.getFinalValue() && + !loopOperation.getResult(0).use_empty()) { + LLVM_DEBUG( + llvm::dbgs() + << "AffineLoopAnalysis: cannot promote loop final value\n";); + return false; + } + + return true; + } + bool analyzeLoop(fir::DoLoopOp loopOperation, AffineFunctionAnalysis &functionAnalysis) { LLVM_DEBUG(llvm::dbgs() << "AffineLoopAnalysis: \n"; loopOperation.dump();); return analyzeMemoryAccess(loopOperation) && + analysisResults(loopOperation) && analyzeBody(loopOperation, functionAnalysis); } @@ -461,14 +475,28 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_ATTRIBUTE_UNUSED auto loopAnalysis = functionAnalysis.getChildLoopAnalysis(loop); auto &loopOps = loop.getBody()->getOperations(); + auto resultOp = cast(loop.getBody()->getTerminator()); + auto results = resultOp.getOperands(); + auto loopResults = loop->getResults(); auto loopAndIndex = createAffineFor(loop, rewriter); auto affineFor = loopAndIndex.first; auto inductionVar = loopAndIndex.second; + if (loop.getFinalValue()) { + results = results.drop_front(); + loopResults = loopResults.drop_front(); + } + rewriter.startOpModification(affineFor.getOperation()); affineFor.getBody()->getOperations().splice( std::prev(affineFor.getBody()->end()), loopOps, loopOps.begin(), std::prev(loopOps.end())); + rewriter.replaceAllUsesWith(loop.getRegionIterArgs(), + affineFor.getRegionIterArgs()); + if (!results.empty()) { + rewriter.setInsertionPointToEnd(affineFor.getBody()); + rewriter.create(resultOp->getLoc(), 
results); + } rewriter.finalizeOpModification(affineFor.getOperation()); rewriter.startOpModification(loop.getOperation()); @@ -479,7 +507,8 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_DEBUG(llvm::dbgs() << "AffineLoopConversion: loop rewriten to:\n"; affineFor.dump();); - rewriter.replaceOp(loop, affineFor.getOperation()->getResults()); + rewriter.replaceAllUsesWith(loopResults, affineFor->getResults()); + rewriter.eraseOp(loop); return success(); } @@ -503,7 +532,7 @@ class AffineLoopConversion : public mlir::OpRewritePattern { ValueRange(op.getUpperBound()), mlir::AffineMap::get(0, 1, 1 + mlir::getAffineSymbolExpr(0, op.getContext())), - step); + step, op.getIterOperands()); return std::make_pair(affineFor, affineFor.getInductionVar()); } @@ -528,7 +557,7 @@ class AffineLoopConversion : public mlir::OpRewritePattern { genericUpperBound.getResult(), mlir::AffineMap::get(0, 1, 1 + mlir::getAffineSymbolExpr(0, op.getContext())), - 1); + 1, op.getIterOperands()); rewriter.setInsertionPointToStart(affineFor.getBody()); auto actualIndex = rewriter.create( op.getLoc(), actualIndexMap, diff --git a/flang/test/Fir/affine-promotion.fir b/flang/test/Fir/affine-promotion.fir index aae35c6ef5659..f50f851a89eae 100644 --- a/flang/test/Fir/affine-promotion.fir +++ b/flang/test/Fir/affine-promotion.fir @@ -131,3 +131,68 @@ func.func @loop_with_if(%a: !arr_d1, %v: f32) { // CHECK: } // CHECK: return // CHECK: } + +func.func @loop_with_result(%arg0: !fir.ref>, %arg1: !fir.ref>) -> f32 { + %c1 = arith.constant 1 : index + %cst = arith.constant 0.000000e+00 : f32 + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %1 = fir.shape %c100, %c100 : (index, index) -> !fir.shape<2> + %2 = fir.alloca i32 + %3:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %cst) -> (index, f32) { + %6 = fir.array_coor %arg0(%0) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %7 = fir.load %6 : !fir.ref + %8 = arith.addf 
%arg3, %7 fastmath : f32 + %9 = arith.addi %arg2, %c1 overflow : index + fir.result %9, %8 : index, f32 + } + %4:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %3#1) -> (index, f32) { + %6 = fir.array_coor %arg1(%1) %c1, %arg2 : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref + %7 = fir.convert %6 : (!fir.ref) -> !fir.ref> + %8 = fir.do_loop %arg4 = %c1 to %c100 step %c1 iter_args(%arg5 = %arg3) -> (f32) { + %10 = fir.array_coor %7(%0) %arg4 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %11 = fir.load %10 : !fir.ref + %12 = arith.addf %arg5, %11 fastmath : f32 + fir.result %12 : f32 + } + %9 = arith.addi %arg2, %c1 overflow : index + fir.result %9, %8 : index, f32 + } + %5 = fir.convert %4#0 : (index) -> i32 + fir.store %5 to %2 : !fir.ref + return %4#1 : f32 +} + +// CHECK-LABEL: func.func @loop_with_result( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref>) -> f32 { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 0.000000e+00 : f32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]], %[[VAL_2]] : (index, index) -> !fir.shape<2> +// CHECK: %[[VAL_5:.*]] = fir.alloca i32 +// CHECK: %[[VAL_6:.*]] = fir.convert %[[ARG0]] : (!fir.ref>) -> memref +// CHECK: %[[VAL_7:.*]] = affine.for %[[VAL_8:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] iter_args(%[[VAL_9:.*]] = %[[VAL_1]]) -> (f32) { +// CHECK: %[[VAL_10:.*]] = affine.apply #{{.*}}(%[[VAL_8]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_11:.*]] = affine.load %[[VAL_6]]{{\[}}%[[VAL_10]]] : memref +// CHECK: %[[VAL_12:.*]] = arith.addf %[[VAL_9]], %[[VAL_11]] fastmath : f32 +// CHECK: affine.yield %[[VAL_12]] : f32 +// CHECK: } +// CHECK: %[[VAL_13:.*]]:2 = fir.do_loop %[[VAL_14:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_0]] iter_args(%[[VAL_15:.*]] = %[[VAL_7]]) -> (index, 
f32) { +// CHECK: %[[VAL_16:.*]] = fir.array_coor %[[ARG1]](%[[VAL_4]]) %[[VAL_0]], %[[VAL_14]] : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16]] : (!fir.ref) -> !fir.ref> +// CHECK: %[[VAL_18:.*]] = fir.convert %[[VAL_17]] : (!fir.ref>) -> memref +// CHECK: %[[VAL_19:.*]] = affine.for %[[VAL_20:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] iter_args(%[[VAL_21:.*]] = %[[VAL_15]]) -> (f32) { +// CHECK: %[[VAL_22:.*]] = affine.apply #{{.*}}(%[[VAL_20]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_23:.*]] = affine.load %[[VAL_18]]{{\[}}%[[VAL_22]]] : memref +// CHECK: %[[VAL_24:.*]] = arith.addf %[[VAL_21]], %[[VAL_23]] fastmath : f32 +// CHECK: affine.yield %[[VAL_24]] : f32 +// CHECK: } +// CHECK: %[[VAL_25:.*]] = arith.addi %[[VAL_14]], %[[VAL_0]] overflow : index +// CHECK: fir.result %[[VAL_25]], %[[VAL_19]] : index, f32 +// CHECK: } +// CHECK: %[[VAL_26:.*]] = fir.convert %[[VAL_27:.*]]#0 : (index) -> i32 +// CHECK: fir.store %[[VAL_26]] to %[[VAL_5]] : !fir.ref +// CHECK: return %[[VAL_27]]#1 : f32 +// CHECK: } From flang-commits at lists.llvm.org Wed May 7 22:59:48 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 22:59:48 -0700 (PDT) Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790) In-Reply-To: Message-ID: <681c4854.170a0220.305fc2.ca56@mx.google.com> https://github.com/NexMing updated https://github.com/llvm/llvm-project/pull/137790 >From 17dfda28fdf8eb8184283b686e6831a3c8b7a9ab Mon Sep 17 00:00:00 2001 From: yanming Date: Tue, 29 Apr 2025 19:16:48 +0800 Subject: [PATCH 1/2] [fir] Support promoting `fir.do_loop` with results to `affine.for`. 
--- .../Optimizer/Transforms/AffinePromotion.cpp | 39 +++++++++-- flang/test/Fir/affine-promotion.fir | 65 +++++++++++++++++++ 2 files changed, 99 insertions(+), 5 deletions(-) diff --git a/flang/lib/Optimizer/Transforms/AffinePromotion.cpp b/flang/lib/Optimizer/Transforms/AffinePromotion.cpp index 43fccf52dc8ab..ef82e400bea14 100644 --- a/flang/lib/Optimizer/Transforms/AffinePromotion.cpp +++ b/flang/lib/Optimizer/Transforms/AffinePromotion.cpp @@ -49,8 +49,9 @@ struct AffineIfAnalysis; /// second when doing rewrite. struct AffineFunctionAnalysis { explicit AffineFunctionAnalysis(mlir::func::FuncOp funcOp) { - for (fir::DoLoopOp op : funcOp.getOps()) - loopAnalysisMap.try_emplace(op, op, *this); + funcOp->walk([&](fir::DoLoopOp doloop) { + loopAnalysisMap.try_emplace(doloop, doloop, *this); + }); } AffineLoopAnalysis getChildLoopAnalysis(fir::DoLoopOp op) const; @@ -102,10 +103,23 @@ struct AffineLoopAnalysis { return true; } + bool analysisResults(fir::DoLoopOp loopOperation) { + if (loopOperation.getFinalValue() && + !loopOperation.getResult(0).use_empty()) { + LLVM_DEBUG( + llvm::dbgs() + << "AffineLoopAnalysis: cannot promote loop final value\n";); + return false; + } + + return true; + } + bool analyzeLoop(fir::DoLoopOp loopOperation, AffineFunctionAnalysis &functionAnalysis) { LLVM_DEBUG(llvm::dbgs() << "AffineLoopAnalysis: \n"; loopOperation.dump();); return analyzeMemoryAccess(loopOperation) && + analysisResults(loopOperation) && analyzeBody(loopOperation, functionAnalysis); } @@ -461,14 +475,28 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_ATTRIBUTE_UNUSED auto loopAnalysis = functionAnalysis.getChildLoopAnalysis(loop); auto &loopOps = loop.getBody()->getOperations(); + auto resultOp = cast(loop.getBody()->getTerminator()); + auto results = resultOp.getOperands(); + auto loopResults = loop->getResults(); auto loopAndIndex = createAffineFor(loop, rewriter); auto affineFor = loopAndIndex.first; auto inductionVar = loopAndIndex.second; 
+ if (loop.getFinalValue()) { + results = results.drop_front(); + loopResults = loopResults.drop_front(); + } + rewriter.startOpModification(affineFor.getOperation()); affineFor.getBody()->getOperations().splice( std::prev(affineFor.getBody()->end()), loopOps, loopOps.begin(), std::prev(loopOps.end())); + rewriter.replaceAllUsesWith(loop.getRegionIterArgs(), + affineFor.getRegionIterArgs()); + if (!results.empty()) { + rewriter.setInsertionPointToEnd(affineFor.getBody()); + rewriter.create(resultOp->getLoc(), results); + } rewriter.finalizeOpModification(affineFor.getOperation()); rewriter.startOpModification(loop.getOperation()); @@ -479,7 +507,8 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_DEBUG(llvm::dbgs() << "AffineLoopConversion: loop rewriten to:\n"; affineFor.dump();); - rewriter.replaceOp(loop, affineFor.getOperation()->getResults()); + rewriter.replaceAllUsesWith(loopResults, affineFor->getResults()); + rewriter.eraseOp(loop); return success(); } @@ -503,7 +532,7 @@ class AffineLoopConversion : public mlir::OpRewritePattern { ValueRange(op.getUpperBound()), mlir::AffineMap::get(0, 1, 1 + mlir::getAffineSymbolExpr(0, op.getContext())), - step); + step, op.getIterOperands()); return std::make_pair(affineFor, affineFor.getInductionVar()); } @@ -528,7 +557,7 @@ class AffineLoopConversion : public mlir::OpRewritePattern { genericUpperBound.getResult(), mlir::AffineMap::get(0, 1, 1 + mlir::getAffineSymbolExpr(0, op.getContext())), - 1); + 1, op.getIterOperands()); rewriter.setInsertionPointToStart(affineFor.getBody()); auto actualIndex = rewriter.create( op.getLoc(), actualIndexMap, diff --git a/flang/test/Fir/affine-promotion.fir b/flang/test/Fir/affine-promotion.fir index aae35c6ef5659..f50f851a89eae 100644 --- a/flang/test/Fir/affine-promotion.fir +++ b/flang/test/Fir/affine-promotion.fir @@ -131,3 +131,68 @@ func.func @loop_with_if(%a: !arr_d1, %v: f32) { // CHECK: } // CHECK: return // CHECK: } + +func.func @loop_with_result(%arg0: 
!fir.ref>, %arg1: !fir.ref>) -> f32 { + %c1 = arith.constant 1 : index + %cst = arith.constant 0.000000e+00 : f32 + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %1 = fir.shape %c100, %c100 : (index, index) -> !fir.shape<2> + %2 = fir.alloca i32 + %3:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %cst) -> (index, f32) { + %6 = fir.array_coor %arg0(%0) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %7 = fir.load %6 : !fir.ref + %8 = arith.addf %arg3, %7 fastmath : f32 + %9 = arith.addi %arg2, %c1 overflow : index + fir.result %9, %8 : index, f32 + } + %4:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %3#1) -> (index, f32) { + %6 = fir.array_coor %arg1(%1) %c1, %arg2 : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref + %7 = fir.convert %6 : (!fir.ref) -> !fir.ref> + %8 = fir.do_loop %arg4 = %c1 to %c100 step %c1 iter_args(%arg5 = %arg3) -> (f32) { + %10 = fir.array_coor %7(%0) %arg4 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %11 = fir.load %10 : !fir.ref + %12 = arith.addf %arg5, %11 fastmath : f32 + fir.result %12 : f32 + } + %9 = arith.addi %arg2, %c1 overflow : index + fir.result %9, %8 : index, f32 + } + %5 = fir.convert %4#0 : (index) -> i32 + fir.store %5 to %2 : !fir.ref + return %4#1 : f32 +} + +// CHECK-LABEL: func.func @loop_with_result( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref>) -> f32 { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 0.000000e+00 : f32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]], %[[VAL_2]] : (index, index) -> !fir.shape<2> +// CHECK: %[[VAL_5:.*]] = fir.alloca i32 +// CHECK: %[[VAL_6:.*]] = fir.convert %[[ARG0]] : (!fir.ref>) -> memref +// CHECK: %[[VAL_7:.*]] = affine.for %[[VAL_8:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] 
iter_args(%[[VAL_9:.*]] = %[[VAL_1]]) -> (f32) { +// CHECK: %[[VAL_10:.*]] = affine.apply #{{.*}}(%[[VAL_8]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_11:.*]] = affine.load %[[VAL_6]]{{\[}}%[[VAL_10]]] : memref +// CHECK: %[[VAL_12:.*]] = arith.addf %[[VAL_9]], %[[VAL_11]] fastmath : f32 +// CHECK: affine.yield %[[VAL_12]] : f32 +// CHECK: } +// CHECK: %[[VAL_13:.*]]:2 = fir.do_loop %[[VAL_14:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_0]] iter_args(%[[VAL_15:.*]] = %[[VAL_7]]) -> (index, f32) { +// CHECK: %[[VAL_16:.*]] = fir.array_coor %[[ARG1]](%[[VAL_4]]) %[[VAL_0]], %[[VAL_14]] : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16]] : (!fir.ref) -> !fir.ref> +// CHECK: %[[VAL_18:.*]] = fir.convert %[[VAL_17]] : (!fir.ref>) -> memref +// CHECK: %[[VAL_19:.*]] = affine.for %[[VAL_20:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] iter_args(%[[VAL_21:.*]] = %[[VAL_15]]) -> (f32) { +// CHECK: %[[VAL_22:.*]] = affine.apply #{{.*}}(%[[VAL_20]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_23:.*]] = affine.load %[[VAL_18]]{{\[}}%[[VAL_22]]] : memref +// CHECK: %[[VAL_24:.*]] = arith.addf %[[VAL_21]], %[[VAL_23]] fastmath : f32 +// CHECK: affine.yield %[[VAL_24]] : f32 +// CHECK: } +// CHECK: %[[VAL_25:.*]] = arith.addi %[[VAL_14]], %[[VAL_0]] overflow : index +// CHECK: fir.result %[[VAL_25]], %[[VAL_19]] : index, f32 +// CHECK: } +// CHECK: %[[VAL_26:.*]] = fir.convert %[[VAL_27:.*]]#0 : (index) -> i32 +// CHECK: fir.store %[[VAL_26]] to %[[VAL_5]] : !fir.ref +// CHECK: return %[[VAL_27]]#1 : f32 +// CHECK: } >From 2b6bcf2a5b4f6596a1d0a3c7d539729326bbfc23 Mon Sep 17 00:00:00 2001 From: yanming Date: Thu, 8 May 2025 13:50:54 +0800 Subject: [PATCH 2/2] Add a test that loop with multiple results. 
--- flang/test/Fir/affine-promotion.fir | 77 ++++++++++++++++++----------- 1 file changed, 49 insertions(+), 28 deletions(-) diff --git a/flang/test/Fir/affine-promotion.fir b/flang/test/Fir/affine-promotion.fir index f50f851a89eae..1308f7c4b36cf 100644 --- a/flang/test/Fir/affine-promotion.fir +++ b/flang/test/Fir/affine-promotion.fir @@ -132,40 +132,51 @@ func.func @loop_with_if(%a: !arr_d1, %v: f32) { // CHECK: return // CHECK: } -func.func @loop_with_result(%arg0: !fir.ref>, %arg1: !fir.ref>) -> f32 { +func.func @loop_with_result(%arg0: !fir.ref>, %arg1: !fir.ref>, %arg2: !fir.ref>) -> f32 { %c1 = arith.constant 1 : index %cst = arith.constant 0.000000e+00 : f32 %c100 = arith.constant 100 : index %0 = fir.shape %c100 : (index) -> !fir.shape<1> %1 = fir.shape %c100, %c100 : (index, index) -> !fir.shape<2> %2 = fir.alloca i32 - %3:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %cst) -> (index, f32) { - %6 = fir.array_coor %arg0(%0) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref - %7 = fir.load %6 : !fir.ref - %8 = arith.addf %arg3, %7 fastmath : f32 - %9 = arith.addi %arg2, %c1 overflow : index - fir.result %9, %8 : index, f32 + %3:2 = fir.do_loop %arg3 = %c1 to %c100 step %c1 iter_args(%arg4 = %cst) -> (index, f32) { + %8 = fir.array_coor %arg0(%0) %arg3 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %9 = fir.load %8 : !fir.ref + %10 = arith.addf %arg4, %9 fastmath : f32 + %11 = arith.addi %arg3, %c1 overflow : index + fir.result %11, %10 : index, f32 } - %4:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %3#1) -> (index, f32) { - %6 = fir.array_coor %arg1(%1) %c1, %arg2 : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref - %7 = fir.convert %6 : (!fir.ref) -> !fir.ref> - %8 = fir.do_loop %arg4 = %c1 to %c100 step %c1 iter_args(%arg5 = %arg3) -> (f32) { - %10 = fir.array_coor %7(%0) %arg4 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref - %11 = fir.load %10 : !fir.ref - %12 = arith.addf %arg5, %11 fastmath : f32 - 
fir.result %12 : f32 + %4:2 = fir.do_loop %arg3 = %c1 to %c100 step %c1 iter_args(%arg4 = %3#1) -> (index, f32) { + %8 = fir.array_coor %arg1(%1) %c1, %arg3 : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref + %9 = fir.convert %8 : (!fir.ref) -> !fir.ref> + %10 = fir.do_loop %arg5 = %c1 to %c100 step %c1 iter_args(%arg6 = %arg4) -> (f32) { + %12 = fir.array_coor %9(%0) %arg5 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %13 = fir.load %12 : !fir.ref + %14 = arith.addf %arg6, %13 fastmath : f32 + fir.result %14 : f32 } - %9 = arith.addi %arg2, %c1 overflow : index - fir.result %9, %8 : index, f32 + %11 = arith.addi %arg3, %c1 overflow : index + fir.result %11, %10 : index, f32 } - %5 = fir.convert %4#0 : (index) -> i32 - fir.store %5 to %2 : !fir.ref - return %4#1 : f32 + %5:2 = fir.do_loop %arg3 = %c1 to %c100 step %c1 iter_args(%arg4 = %4#1, %arg5 = %cst) -> (f32, f32) { + %8 = fir.array_coor %arg0(%0) %arg3 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %9 = fir.load %8 : !fir.ref + %10 = arith.addf %arg4, %9 fastmath : f32 + %11 = fir.array_coor %arg2(%0) %arg3 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %12 = fir.load %11 : !fir.ref + %13 = arith.addf %arg5, %12 fastmath : f32 + fir.result %10, %13 : f32, f32 + } + %6 = arith.addf %5#0, %5#1 fastmath : f32 + %7 = fir.convert %4#0 : (index) -> i32 + fir.store %7 to %2 : !fir.ref + return %6 : f32 } // CHECK-LABEL: func.func @loop_with_result( // CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, -// CHECK-SAME: %[[ARG1:.*]]: !fir.ref>) -> f32 { +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG2:.*]]: !fir.ref>) -> f32 { // CHECK: %[[VAL_0:.*]] = arith.constant 1 : index // CHECK: %[[VAL_1:.*]] = arith.constant 0.000000e+00 : f32 // CHECK: %[[VAL_2:.*]] = arith.constant 100 : index @@ -173,8 +184,8 @@ func.func @loop_with_result(%arg0: !fir.ref>, %arg1: !fir.re // CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]], %[[VAL_2]] : (index, index) -> !fir.shape<2> // CHECK: %[[VAL_5:.*]] = fir.alloca i32 // 
CHECK: %[[VAL_6:.*]] = fir.convert %[[ARG0]] : (!fir.ref>) -> memref -// CHECK: %[[VAL_7:.*]] = affine.for %[[VAL_8:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] iter_args(%[[VAL_9:.*]] = %[[VAL_1]]) -> (f32) { -// CHECK: %[[VAL_10:.*]] = affine.apply #{{.*}}(%[[VAL_8]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_7:.*]] = affine.for %[[VAL_8:.*]] = %[[VAL_0]] to #[[$ATTR_3]](){{\[}}%[[VAL_2]]] iter_args(%[[VAL_9:.*]] = %[[VAL_1]]) -> (f32) { +// CHECK: %[[VAL_10:.*]] = affine.apply #[[$ATTR_4]](%[[VAL_8]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] // CHECK: %[[VAL_11:.*]] = affine.load %[[VAL_6]]{{\[}}%[[VAL_10]]] : memref // CHECK: %[[VAL_12:.*]] = arith.addf %[[VAL_9]], %[[VAL_11]] fastmath : f32 // CHECK: affine.yield %[[VAL_12]] : f32 @@ -183,8 +194,8 @@ func.func @loop_with_result(%arg0: !fir.ref>, %arg1: !fir.re // CHECK: %[[VAL_16:.*]] = fir.array_coor %[[ARG1]](%[[VAL_4]]) %[[VAL_0]], %[[VAL_14]] : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref // CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16]] : (!fir.ref) -> !fir.ref> // CHECK: %[[VAL_18:.*]] = fir.convert %[[VAL_17]] : (!fir.ref>) -> memref -// CHECK: %[[VAL_19:.*]] = affine.for %[[VAL_20:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] iter_args(%[[VAL_21:.*]] = %[[VAL_15]]) -> (f32) { -// CHECK: %[[VAL_22:.*]] = affine.apply #{{.*}}(%[[VAL_20]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_19:.*]] = affine.for %[[VAL_20:.*]] = %[[VAL_0]] to #[[$ATTR_3]](){{\[}}%[[VAL_2]]] iter_args(%[[VAL_21:.*]] = %[[VAL_15]]) -> (f32) { +// CHECK: %[[VAL_22:.*]] = affine.apply #[[$ATTR_4]](%[[VAL_20]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] // CHECK: %[[VAL_23:.*]] = affine.load %[[VAL_18]]{{\[}}%[[VAL_22]]] : memref // CHECK: %[[VAL_24:.*]] = arith.addf %[[VAL_21]], %[[VAL_23]] fastmath : f32 // CHECK: affine.yield %[[VAL_24]] : f32 @@ -192,7 +203,17 @@ func.func @loop_with_result(%arg0: !fir.ref>, %arg1: !fir.re // CHECK: %[[VAL_25:.*]] = arith.addi %[[VAL_14]], %[[VAL_0]] 
overflow : index // CHECK: fir.result %[[VAL_25]], %[[VAL_19]] : index, f32 // CHECK: } -// CHECK: %[[VAL_26:.*]] = fir.convert %[[VAL_27:.*]]#0 : (index) -> i32 -// CHECK: fir.store %[[VAL_26]] to %[[VAL_5]] : !fir.ref -// CHECK: return %[[VAL_27]]#1 : f32 +// CHECK: %[[VAL_26:.*]] = fir.convert %[[ARG2]] : (!fir.ref>) -> memref +// CHECK: %[[VAL_27:.*]]:2 = affine.for %[[VAL_28:.*]] = %[[VAL_0]] to #[[$ATTR_3]](){{\[}}%[[VAL_2]]] iter_args(%[[VAL_29:.*]] = %[[VAL_30:.*]]#1, %[[VAL_31:.*]] = %[[VAL_1]]) -> (f32, f32) { +// CHECK: %[[VAL_32:.*]] = affine.apply #[[$ATTR_4]](%[[VAL_28]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_33:.*]] = affine.load %[[VAL_6]]{{\[}}%[[VAL_32]]] : memref +// CHECK: %[[VAL_34:.*]] = arith.addf %[[VAL_29]], %[[VAL_33]] fastmath : f32 +// CHECK: %[[VAL_35:.*]] = affine.load %[[VAL_26]]{{\[}}%[[VAL_32]]] : memref +// CHECK: %[[VAL_36:.*]] = arith.addf %[[VAL_31]], %[[VAL_35]] fastmath : f32 +// CHECK: affine.yield %[[VAL_34]], %[[VAL_36]] : f32, f32 +// CHECK: } +// CHECK: %[[VAL_37:.*]] = arith.addf %[[VAL_38:.*]]#0, %[[VAL_38]]#1 fastmath : f32 +// CHECK: %[[VAL_39:.*]] = fir.convert %[[VAL_40:.*]]#0 : (index) -> i32 +// CHECK: fir.store %[[VAL_39]] to %[[VAL_5]] : !fir.ref +// CHECK: return %[[VAL_37]] : f32 // CHECK: } From flang-commits at lists.llvm.org Wed May 7 23:26:03 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 07 May 2025 23:26:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Update `do concurrent` mapping pass to use `fir.do_concurrent` op (PR #138489) In-Reply-To: Message-ID: <681c4e7b.050a0220.3c56f4.a24b@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/138489 >From 96b00e287cb2f8efc29ef66f3f6cbddf592f1295 Mon Sep 17 00:00:00 2001 From: ergawy Date: Mon, 5 May 2025 02:23:04 -0500 Subject: [PATCH] [flang][OpenMP] Update `do concurrent` mapping pass to use `fir.do_concurrent` op This PR updates the `do 
concurrent` to OpenMP mapping pass to use the newly added `fir.do_concurrent` ops that were recently added upstream instead of handling nests of `fir.do_loop ... unordered` ops. Parent PR: https://github.com/llvm/llvm-project/pull/137928. --- .../OpenMP/DoConcurrentConversion.cpp | 362 ++++-------------- .../Transforms/DoConcurrent/basic_device.mlir | 24 +- .../Transforms/DoConcurrent/basic_host.f90 | 3 - .../Transforms/DoConcurrent/basic_host.mlir | 26 +- .../DoConcurrent/locally_destroyed_temp.f90 | 3 - .../DoConcurrent/loop_nest_test.f90 | 92 ----- .../multiple_iteration_ranges.f90 | 3 - .../DoConcurrent/non_const_bounds.f90 | 3 - .../DoConcurrent/not_perfectly_nested.f90 | 24 +- 9 files changed, 122 insertions(+), 418 deletions(-) delete mode 100644 flang/test/Transforms/DoConcurrent/loop_nest_test.f90 diff --git a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp index 2c069860ffdca..0fdb302fe10ca 100644 --- a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp +++ b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp @@ -6,6 +6,7 @@ // //===----------------------------------------------------------------------===// +#include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/OpenMP/Passes.h" #include "flang/Optimizer/OpenMP/Utils.h" @@ -28,8 +29,10 @@ namespace looputils { /// Stores info needed about the induction/iteration variable for each `do /// concurrent` in a loop nest. struct InductionVariableInfo { - InductionVariableInfo(fir::DoLoopOp doLoop) { populateInfo(doLoop); } - + InductionVariableInfo(fir::DoConcurrentLoopOp loop, + mlir::Value inductionVar) { + populateInfo(loop, inductionVar); + } /// The operation allocating memory for iteration variable. mlir::Operation *iterVarMemDef; /// the operation(s) updating the iteration variable with the current @@ -45,7 +48,7 @@ struct InductionVariableInfo { /// ... 
/// %i:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : ... /// ... - /// fir.do_loop %ind_var = %lb to %ub step %s unordered { + /// fir.do_concurrent.loop (%ind_var) = (%lb) to (%ub) step (%s) { /// %ind_var_conv = fir.convert %ind_var : (index) -> i32 /// fir.store %ind_var_conv to %i#1 : !fir.ref /// ... @@ -62,14 +65,14 @@ struct InductionVariableInfo { /// Note: The current implementation is dependent on how flang emits loop /// bodies; which is sufficient for the current simple test/use cases. If this /// proves to be insufficient, this should be made more generic. - void populateInfo(fir::DoLoopOp doLoop) { + void populateInfo(fir::DoConcurrentLoopOp loop, mlir::Value inductionVar) { mlir::Value result = nullptr; // Checks if a StoreOp is updating the memref of the loop's iteration // variable. auto isStoringIV = [&](fir::StoreOp storeOp) { // Direct store into the IV memref. - if (storeOp.getValue() == doLoop.getInductionVar()) { + if (storeOp.getValue() == inductionVar) { indVarUpdateOps.push_back(storeOp); return true; } @@ -77,7 +80,7 @@ struct InductionVariableInfo { // Indirect store into the IV memref. if (auto convertOp = mlir::dyn_cast( storeOp.getValue().getDefiningOp())) { - if (convertOp.getOperand() == doLoop.getInductionVar()) { + if (convertOp.getOperand() == inductionVar) { indVarUpdateOps.push_back(convertOp); indVarUpdateOps.push_back(storeOp); return true; @@ -87,7 +90,7 @@ struct InductionVariableInfo { return false; }; - for (mlir::Operation &op : doLoop) { + for (mlir::Operation &op : loop) { if (auto storeOp = mlir::dyn_cast(op)) if (isStoringIV(storeOp)) { result = storeOp.getMemref(); @@ -100,219 +103,7 @@ struct InductionVariableInfo { } }; -using LoopNestToIndVarMap = - llvm::MapVector; - -/// Loop \p innerLoop is considered perfectly-nested inside \p outerLoop iff -/// there are no operations in \p outerloop's body other than: -/// -/// 1. the operations needed to assign/update \p outerLoop's induction variable. -/// 2. 
\p innerLoop itself. -/// -/// \p return true if \p innerLoop is perfectly nested inside \p outerLoop -/// according to the above definition. -bool isPerfectlyNested(fir::DoLoopOp outerLoop, fir::DoLoopOp innerLoop) { - mlir::ForwardSliceOptions forwardSliceOptions; - forwardSliceOptions.inclusive = true; - // The following will be used as an example to clarify the internals of this - // function: - // ``` - // 1. fir.do_loop %i_idx = %34 to %36 step %c1 unordered { - // 2. %i_idx_2 = fir.convert %i_idx : (index) -> i32 - // 3. fir.store %i_idx_2 to %i_iv#1 : !fir.ref - // - // 4. fir.do_loop %j_idx = %37 to %39 step %c1_3 unordered { - // 5. %j_idx_2 = fir.convert %j_idx : (index) -> i32 - // 6. fir.store %j_idx_2 to %j_iv#1 : !fir.ref - // ... loop nest body, possible uses %i_idx ... - // } - // } - // ``` - // In this example, the `j` loop is perfectly nested inside the `i` loop and - // below is how we find that. - - // We don't care about the outer-loop's induction variable's uses within the - // inner-loop, so we filter out these uses. - // - // This filter tells `getForwardSlice` (below) to only collect operations - // which produce results defined above (i.e. outside) the inner-loop's body. - // - // Since `outerLoop.getInductionVar()` is a block argument (to the - // outer-loop's body), the filter effectively collects uses of - // `outerLoop.getInductionVar()` inside the outer-loop but outside the - // inner-loop. - forwardSliceOptions.filter = [&](mlir::Operation *op) { - return mlir::areValuesDefinedAbove(op->getResults(), innerLoop.getRegion()); - }; - - llvm::SetVector indVarSlice; - // The forward slice of the `i` loop's IV will be the 2 ops in line 1 & 2 - // above. Uses of `%i_idx` inside the `j` loop are not collected because of - // the filter. 
- mlir::getForwardSlice(outerLoop.getInductionVar(), &indVarSlice, - forwardSliceOptions); - llvm::DenseSet indVarSet(indVarSlice.begin(), - indVarSlice.end()); - - llvm::DenseSet outerLoopBodySet; - // The following walk collects ops inside `outerLoop` that are **not**: - // * the outer-loop itself, - // * or the inner-loop, - // * or the `fir.result` op (the outer-loop's terminator). - // - // For the above example, this will also populate `outerLoopBodySet` with ops - // in line 1 & 2 since we skip the `i` loop, the `j` loop, and the terminator. - outerLoop.walk([&](mlir::Operation *op) { - if (op == outerLoop) - return mlir::WalkResult::advance(); - - if (op == innerLoop) - return mlir::WalkResult::skip(); - - if (mlir::isa(op)) - return mlir::WalkResult::advance(); - - outerLoopBodySet.insert(op); - return mlir::WalkResult::advance(); - }); - - // If `outerLoopBodySet` ends up having the same ops as `indVarSet`, then - // `outerLoop` only contains ops that setup its induction variable + - // `innerLoop` + the `fir.result` terminator. In other words, `innerLoop` is - // perfectly nested inside `outerLoop`. - bool result = (outerLoopBodySet == indVarSet); - LLVM_DEBUG(DBGS() << "Loop pair starting at location " << outerLoop.getLoc() - << " is" << (result ? "" : " not") - << " perfectly nested\n"); - - return result; -} - -/// Starting with `currentLoop` collect a perfectly nested loop nest, if any. -/// This function collects as much as possible loops in the nest; it case it -/// fails to recognize a certain nested loop as part of the nest it just returns -/// the parent loops it discovered before. 
-mlir::LogicalResult collectLoopNest(fir::DoLoopOp currentLoop, - LoopNestToIndVarMap &loopNest) { - assert(currentLoop.getUnordered()); - - while (true) { - loopNest.insert({currentLoop, InductionVariableInfo(currentLoop)}); - llvm::SmallVector unorderedLoops; - - for (auto nestedLoop : currentLoop.getRegion().getOps()) - if (nestedLoop.getUnordered()) - unorderedLoops.push_back(nestedLoop); - - if (unorderedLoops.empty()) - break; - - // Having more than one unordered loop means that we are not dealing with a - // perfect loop nest (i.e. a mulit-range `do concurrent` loop); which is the - // case we are after here. - if (unorderedLoops.size() > 1) - return mlir::failure(); - - fir::DoLoopOp nestedUnorderedLoop = unorderedLoops.front(); - - if (!isPerfectlyNested(currentLoop, nestedUnorderedLoop)) - return mlir::failure(); - - currentLoop = nestedUnorderedLoop; - } - - return mlir::success(); -} - -/// Prepares the `fir.do_loop` nest to be easily mapped to OpenMP. In -/// particular, this function would take this input IR: -/// ``` -/// fir.do_loop %i_iv = %i_lb to %i_ub step %i_step unordered { -/// fir.store %i_iv to %i#1 : !fir.ref -/// %j_lb = arith.constant 1 : i32 -/// %j_ub = arith.constant 10 : i32 -/// %j_step = arith.constant 1 : index -/// -/// fir.do_loop %j_iv = %j_lb to %j_ub step %j_step unordered { -/// fir.store %j_iv to %j#1 : !fir.ref -/// ... 
-/// } -/// } -/// ``` -/// -/// into the following form (using generic op form since the result is -/// technically an invalid `fir.do_loop` op: -/// -/// ``` -/// "fir.do_loop"(%i_lb, %i_ub, %i_step) <{unordered}> ({ -/// ^bb0(%i_iv: index): -/// %j_lb = "arith.constant"() <{value = 1 : i32}> : () -> i32 -/// %j_ub = "arith.constant"() <{value = 10 : i32}> : () -> i32 -/// %j_step = "arith.constant"() <{value = 1 : index}> : () -> index -/// -/// "fir.do_loop"(%j_lb, %j_ub, %j_step) <{unordered}> ({ -/// ^bb0(%new_i_iv: index, %new_j_iv: index): -/// "fir.store"(%new_i_iv, %i#1) : (i32, !fir.ref) -> () -/// "fir.store"(%new_j_iv, %j#1) : (i32, !fir.ref) -> () -/// ... -/// }) -/// ``` -/// -/// What happened to the loop nest is the following: -/// -/// * the innermost loop's entry block was updated from having one operand to -/// having `n` operands where `n` is the number of loops in the nest, -/// -/// * the outer loop(s)' ops that update the IVs were sank inside the innermost -/// loop (see the `"fir.store"(%new_i_iv, %i#1)` op above), -/// -/// * the innermost loop's entry block's arguments were mapped in order from the -/// outermost to the innermost IV. -/// -/// With this IR change, we can directly inline the innermost loop's region into -/// the newly generated `omp.loop_nest` op. -/// -/// Note that this function has a pre-condition that \p loopNest consists of -/// perfectly nested loops; i.e. there are no in-between ops between 2 nested -/// loops except for the ops to setup the inner loop's LB, UB, and step. These -/// ops are handled/cloned by `genLoopNestClauseOps(..)`. 
-void sinkLoopIVArgs(mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest) { - if (loopNest.size() <= 1) - return; - - fir::DoLoopOp innermostLoop = loopNest.back().first; - mlir::Operation &innermostFirstOp = innermostLoop.getRegion().front().front(); - - llvm::SmallVector argTypes; - llvm::SmallVector argLocs; - - for (auto &[doLoop, indVarInfo] : llvm::drop_end(loopNest)) { - // Sink the IV update ops to the innermost loop. We need to do for all loops - // except for the innermost one, hence the `drop_end` usage above. - for (mlir::Operation *op : indVarInfo.indVarUpdateOps) - op->moveBefore(&innermostFirstOp); - - argTypes.push_back(doLoop.getInductionVar().getType()); - argLocs.push_back(doLoop.getInductionVar().getLoc()); - } - - mlir::Region &innermmostRegion = innermostLoop.getRegion(); - // Extend the innermost entry block with arguments to represent the outer IVs. - innermmostRegion.addArguments(argTypes, argLocs); - - unsigned idx = 1; - // In reverse, remap the IVs of the loop nest from the old values to the new - // ones. We do that in reverse since the first argument before this loop is - // the old IV for the innermost loop. Therefore, we want to replace it first - // before the old value (1st argument in the block) is remapped to be the IV - // of the outermost loop in the nest. - for (auto &[doLoop, _] : llvm::reverse(loopNest)) { - doLoop.getInductionVar().replaceAllUsesWith( - innermmostRegion.getArgument(innermmostRegion.getNumArguments() - idx)); - ++idx; - } -} +using InductionVariableInfos = llvm::SmallVector; /// Collects values that are local to a loop: "loop-local values". A loop-local /// value is one that is used exclusively inside the loop but allocated outside @@ -326,9 +117,9 @@ void sinkLoopIVArgs(mlir::ConversionPatternRewriter &rewriter, /// used exclusively inside. /// /// \param [out] locals - the list of loop-local values detected for \p doLoop. 
-void collectLoopLocalValues(fir::DoLoopOp doLoop, +void collectLoopLocalValues(fir::DoConcurrentLoopOp loop, llvm::SetVector &locals) { - doLoop.walk([&](mlir::Operation *op) { + loop.walk([&](mlir::Operation *op) { for (mlir::Value operand : op->getOperands()) { if (locals.contains(operand)) continue; @@ -340,11 +131,11 @@ void collectLoopLocalValues(fir::DoLoopOp doLoop, // Values defined inside the loop are not interesting since they do not // need to be localized. - if (doLoop->isAncestor(operand.getDefiningOp())) + if (loop->isAncestor(operand.getDefiningOp())) continue; for (auto *user : operand.getUsers()) { - if (!doLoop->isAncestor(user)) { + if (!loop->isAncestor(user)) { isLocal = false; break; } @@ -373,39 +164,42 @@ static void localizeLoopLocalValue(mlir::Value local, mlir::Region &allocRegion, } } // namespace looputils -class DoConcurrentConversion : public mlir::OpConversionPattern { +class DoConcurrentConversion + : public mlir::OpConversionPattern { public: - using mlir::OpConversionPattern::OpConversionPattern; + using mlir::OpConversionPattern::OpConversionPattern; - DoConcurrentConversion(mlir::MLIRContext *context, bool mapToDevice, - llvm::DenseSet &concurrentLoopsToSkip) + DoConcurrentConversion( + mlir::MLIRContext *context, bool mapToDevice, + llvm::DenseSet &concurrentLoopsToSkip) : OpConversionPattern(context), mapToDevice(mapToDevice), concurrentLoopsToSkip(concurrentLoopsToSkip) {} mlir::LogicalResult - matchAndRewrite(fir::DoLoopOp doLoop, OpAdaptor adaptor, + matchAndRewrite(fir::DoConcurrentOp doLoop, OpAdaptor adaptor, mlir::ConversionPatternRewriter &rewriter) const override { if (mapToDevice) return doLoop.emitError( "not yet implemented: Mapping `do concurrent` loops to device"); - looputils::LoopNestToIndVarMap loopNest; - bool hasRemainingNestedLoops = - failed(looputils::collectLoopNest(doLoop, loopNest)); - if (hasRemainingNestedLoops) - mlir::emitWarning(doLoop.getLoc(), - "Some `do concurent` loops are not 
perfectly-nested. " - "These will be serialized."); + looputils::InductionVariableInfos ivInfos; + auto loop = mlir::cast( + doLoop.getRegion().back().getTerminator()); + + auto indVars = loop.getLoopInductionVars(); + assert(indVars.has_value()); + + for (mlir::Value indVar : *indVars) + ivInfos.emplace_back(loop, indVar); llvm::SetVector locals; - looputils::collectLoopLocalValues(loopNest.back().first, locals); - looputils::sinkLoopIVArgs(rewriter, loopNest); + looputils::collectLoopLocalValues(loop, locals); mlir::IRMapping mapper; mlir::omp::ParallelOp parallelOp = - genParallelOp(doLoop.getLoc(), rewriter, loopNest, mapper); + genParallelOp(doLoop.getLoc(), rewriter, ivInfos, mapper); mlir::omp::LoopNestOperands loopNestClauseOps; - genLoopNestClauseOps(doLoop.getLoc(), rewriter, loopNest, mapper, + genLoopNestClauseOps(doLoop.getLoc(), rewriter, loop, mapper, loopNestClauseOps); for (mlir::Value local : locals) @@ -413,41 +207,56 @@ class DoConcurrentConversion : public mlir::OpConversionPattern { rewriter); mlir::omp::LoopNestOp ompLoopNest = - genWsLoopOp(rewriter, loopNest.back().first, mapper, loopNestClauseOps, + genWsLoopOp(rewriter, loop, mapper, loopNestClauseOps, /*isComposite=*/mapToDevice); - rewriter.eraseOp(doLoop); + rewriter.setInsertionPoint(doLoop); + fir::FirOpBuilder builder( + rewriter, + fir::getKindMapping(doLoop->getParentOfType())); + + // Collect iteration variable(s) allocations so that we can move them + // outside the `fir.do_concurrent` wrapper (before erasing it). + llvm::SmallVector opsToMove; + for (mlir::Operation &op : llvm::drop_end(doLoop)) + opsToMove.push_back(&op); + + mlir::Block *allocBlock = builder.getAllocaBlock(); + + for (mlir::Operation *op : llvm::reverse(opsToMove)) { + rewriter.moveOpBefore(op, allocBlock, allocBlock->begin()); + } // Mark `unordered` loops that are not perfectly nested to be skipped from // the legality check of the `ConversionTarget` since we are not interested // in mapping them to OpenMP. 
- ompLoopNest->walk([&](fir::DoLoopOp doLoop) { - if (doLoop.getUnordered()) { - concurrentLoopsToSkip.insert(doLoop); - } + ompLoopNest->walk([&](fir::DoConcurrentOp doLoop) { + concurrentLoopsToSkip.insert(doLoop); }); + rewriter.eraseOp(doLoop); + return mlir::success(); } private: - mlir::omp::ParallelOp genParallelOp(mlir::Location loc, - mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest, - mlir::IRMapping &mapper) const { + mlir::omp::ParallelOp + genParallelOp(mlir::Location loc, mlir::ConversionPatternRewriter &rewriter, + looputils::InductionVariableInfos &ivInfos, + mlir::IRMapping &mapper) const { auto parallelOp = rewriter.create(loc); rewriter.createBlock(&parallelOp.getRegion()); rewriter.setInsertionPoint(rewriter.create(loc)); - genLoopNestIndVarAllocs(rewriter, loopNest, mapper); + genLoopNestIndVarAllocs(rewriter, ivInfos, mapper); return parallelOp; } void genLoopNestIndVarAllocs(mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest, + looputils::InductionVariableInfos &ivInfos, mlir::IRMapping &mapper) const { - for (auto &[_, indVarInfo] : loopNest) + for (auto &indVarInfo : ivInfos) genInductionVariableAlloc(rewriter, indVarInfo.iterVarMemDef, mapper); } @@ -471,10 +280,11 @@ class DoConcurrentConversion : public mlir::OpConversionPattern { return result; } - void genLoopNestClauseOps( - mlir::Location loc, mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest, mlir::IRMapping &mapper, - mlir::omp::LoopNestOperands &loopNestClauseOps) const { + void + genLoopNestClauseOps(mlir::Location loc, + mlir::ConversionPatternRewriter &rewriter, + fir::DoConcurrentLoopOp loop, mlir::IRMapping &mapper, + mlir::omp::LoopNestOperands &loopNestClauseOps) const { assert(loopNestClauseOps.loopLowerBounds.empty() && "Loop nest bounds were already emitted!"); @@ -483,43 +293,42 @@ class DoConcurrentConversion : public mlir::OpConversionPattern { 
bounds.push_back(var.getDefiningOp()->getResult(0)); }; - for (auto &[doLoop, _] : loopNest) { - populateBounds(doLoop.getLowerBound(), loopNestClauseOps.loopLowerBounds); - populateBounds(doLoop.getUpperBound(), loopNestClauseOps.loopUpperBounds); - populateBounds(doLoop.getStep(), loopNestClauseOps.loopSteps); + for (auto [lb, ub, st] : llvm::zip_equal( + loop.getLowerBound(), loop.getUpperBound(), loop.getStep())) { + populateBounds(lb, loopNestClauseOps.loopLowerBounds); + populateBounds(ub, loopNestClauseOps.loopUpperBounds); + populateBounds(st, loopNestClauseOps.loopSteps); } loopNestClauseOps.loopInclusive = rewriter.getUnitAttr(); } mlir::omp::LoopNestOp - genWsLoopOp(mlir::ConversionPatternRewriter &rewriter, fir::DoLoopOp doLoop, - mlir::IRMapping &mapper, + genWsLoopOp(mlir::ConversionPatternRewriter &rewriter, + fir::DoConcurrentLoopOp loop, mlir::IRMapping &mapper, const mlir::omp::LoopNestOperands &clauseOps, bool isComposite) const { - auto wsloopOp = rewriter.create(doLoop.getLoc()); + auto wsloopOp = rewriter.create(loop.getLoc()); wsloopOp.setComposite(isComposite); rewriter.createBlock(&wsloopOp.getRegion()); auto loopNestOp = - rewriter.create(doLoop.getLoc(), clauseOps); + rewriter.create(loop.getLoc(), clauseOps); // Clone the loop's body inside the loop nest construct using the // mapped values. 
- rewriter.cloneRegionBefore(doLoop.getRegion(), loopNestOp.getRegion(), + rewriter.cloneRegionBefore(loop.getRegion(), loopNestOp.getRegion(), loopNestOp.getRegion().begin(), mapper); - mlir::Operation *terminator = loopNestOp.getRegion().back().getTerminator(); rewriter.setInsertionPointToEnd(&loopNestOp.getRegion().back()); - rewriter.create(terminator->getLoc()); - rewriter.eraseOp(terminator); + rewriter.create(loop->getLoc()); return loopNestOp; } bool mapToDevice; - llvm::DenseSet &concurrentLoopsToSkip; + llvm::DenseSet &concurrentLoopsToSkip; }; class DoConcurrentConversionPass @@ -548,19 +357,16 @@ class DoConcurrentConversionPass return; } - llvm::DenseSet concurrentLoopsToSkip; + llvm::DenseSet concurrentLoopsToSkip; mlir::RewritePatternSet patterns(context); patterns.insert( context, mapTo == flangomp::DoConcurrentMappingKind::DCMK_Device, concurrentLoopsToSkip); mlir::ConversionTarget target(*context); - target.addDynamicallyLegalOp([&](fir::DoLoopOp op) { - // The goal is to handle constructs that eventually get lowered to - // `fir.do_loop` with the `unordered` attribute (e.g. array expressions). - // Currently, this is only enabled for the `do concurrent` construct since - // the pass runs early in the pipeline. 
- return !op.getUnordered() || concurrentLoopsToSkip.contains(op); - }); + target.addDynamicallyLegalOp( + [&](fir::DoConcurrentOp op) { + return concurrentLoopsToSkip.contains(op); + }); target.markUnknownOpDynamicallyLegal( [](mlir::Operation *) { return true; }); diff --git a/flang/test/Transforms/DoConcurrent/basic_device.mlir b/flang/test/Transforms/DoConcurrent/basic_device.mlir index d7fcc40e4a7f9..0ca48943864c8 100644 --- a/flang/test/Transforms/DoConcurrent/basic_device.mlir +++ b/flang/test/Transforms/DoConcurrent/basic_device.mlir @@ -1,8 +1,6 @@ // RUN: fir-opt --omp-do-concurrent-conversion="map-to=device" -verify-diagnostics %s func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_basic"} { - %0 = fir.alloca i32 {bindc_name = "i"} - %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) %2 = fir.address_of(@_QFEa) : !fir.ref> %c10 = arith.constant 10 : index %3 = fir.shape %c10 : (index) -> !fir.shape<1> @@ -14,15 +12,19 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_bas %c1 = arith.constant 1 : index // expected-error@+2 {{not yet implemented: Mapping `do concurrent` loops to device}} - // expected-error@below {{failed to legalize operation 'fir.do_loop'}} - fir.do_loop %arg0 = %7 to %8 step %c1 unordered { - %13 = fir.convert %arg0 : (index) -> i32 - fir.store %13 to %1#1 : !fir.ref - %14 = fir.load %1#0 : !fir.ref - %15 = fir.load %1#0 : !fir.ref - %16 = fir.convert %15 : (i32) -> i64 - %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref - hlfir.assign %14 to %17 : i32, !fir.ref + // expected-error@below {{failed to legalize operation 'fir.do_concurrent'}} + fir.do_concurrent { + %0 = fir.alloca i32 {bindc_name = "i"} + %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) + fir.do_concurrent.loop (%arg0) = (%7) to (%8) step (%c1) { + %13 = fir.convert %arg0 : (index) -> i32 + fir.store %13 to %1#1 : !fir.ref + %14 = 
fir.load %1#0 : !fir.ref + %15 = fir.load %1#0 : !fir.ref + %16 = fir.convert %15 : (i32) -> i64 + %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref + hlfir.assign %14 to %17 : i32, !fir.ref + } } return diff --git a/flang/test/Transforms/DoConcurrent/basic_host.f90 b/flang/test/Transforms/DoConcurrent/basic_host.f90 index b84d4481ac766..12f63031cbaee 100644 --- a/flang/test/Transforms/DoConcurrent/basic_host.f90 +++ b/flang/test/Transforms/DoConcurrent/basic_host.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests mapping of a basic `do concurrent` loop to `!$omp parallel do`. ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host %s -o - \ diff --git a/flang/test/Transforms/DoConcurrent/basic_host.mlir b/flang/test/Transforms/DoConcurrent/basic_host.mlir index b53ecd687c039..5425829404d7b 100644 --- a/flang/test/Transforms/DoConcurrent/basic_host.mlir +++ b/flang/test/Transforms/DoConcurrent/basic_host.mlir @@ -6,8 +6,6 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_basic"} { // CHECK: %[[ARR:.*]]:2 = hlfir.declare %{{.*}}(%{{.*}}) {uniq_name = "_QFEa"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) - %0 = fir.alloca i32 {bindc_name = "i"} - %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) %2 = fir.address_of(@_QFEa) : !fir.ref> %c10 = arith.constant 10 : index %3 = fir.shape %c10 : (index) -> !fir.shape<1> @@ -18,7 +16,7 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_bas %8 = fir.convert %c10_i32 : (i32) -> index %c1 = arith.constant 1 : index - // CHECK-NOT: fir.do_loop + // CHECK-NOT: fir.do_concurrent // CHECK: %[[C1:.*]] = arith.constant 1 : i32 // CHECK: %[[LB:.*]] = fir.convert %[[C1]] : (i32) -> index @@ -46,17 +44,21 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_bas // CHECK-NEXT: omp.terminator // CHECK-NEXT: } - fir.do_loop 
%arg0 = %7 to %8 step %c1 unordered { - %13 = fir.convert %arg0 : (index) -> i32 - fir.store %13 to %1#1 : !fir.ref - %14 = fir.load %1#0 : !fir.ref - %15 = fir.load %1#0 : !fir.ref - %16 = fir.convert %15 : (i32) -> i64 - %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref - hlfir.assign %14 to %17 : i32, !fir.ref + fir.do_concurrent { + %0 = fir.alloca i32 {bindc_name = "i"} + %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) + fir.do_concurrent.loop (%arg0) = (%7) to (%8) step (%c1) { + %13 = fir.convert %arg0 : (index) -> i32 + fir.store %13 to %1#1 : !fir.ref + %14 = fir.load %1#0 : !fir.ref + %15 = fir.load %1#0 : !fir.ref + %16 = fir.convert %15 : (i32) -> i64 + %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref + hlfir.assign %14 to %17 : i32, !fir.ref + } } - // CHECK-NOT: fir.do_loop + // CHECK-NOT: fir.do_concurrent return } diff --git a/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 b/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 index 4e13c0919589a..f82696669eca6 100644 --- a/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 +++ b/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests that "loop-local values" are properly handled by localizing them to the ! body of the loop nest. See `collectLoopLocalValues` and `localizeLoopLocalValue` ! for a definition of "loop-local values" and how they are handled. diff --git a/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 b/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 deleted file mode 100644 index adc4a488d1ec9..0000000000000 --- a/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 +++ /dev/null @@ -1,92 +0,0 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - -! Tests loop-nest detection algorithm for do-concurrent mapping. - -! 
REQUIRES: asserts - -! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host \ -! RUN: -mmlir -debug -mmlir -mlir-disable-threading %s -o - 2> %t.log || true - -! RUN: FileCheck %s < %t.log - -program main - implicit none - -contains - -subroutine foo(n) - implicit none - integer :: n, m - integer :: i, j, k - integer :: x - integer, dimension(n) :: a - integer, dimension(n, n, n) :: b - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is perfectly nested - do concurrent(i=1:n, j=1:bar(n*m, n/m)) - a(i) = n - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is perfectly nested - do concurrent(i=bar(n, x):n, j=1:bar(n*m, n/m)) - a(i) = n - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=bar(n, x):n) - do concurrent(j=1:bar(n*m, n/m)) - a(i) = n - end do - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=1:n) - x = 10 - do concurrent(j=1:m) - b(i,j,k) = i * j + k - end do - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=1:n) - do concurrent(j=1:m) - b(i,j,k) = i * j + k - end do - x = 10 - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=1:n) - do concurrent(j=1:m) - b(i,j,k) = i * j + k - x = 10 - end do - end do - - ! Verify the (i,j) and (j,k) pairs of loops are detected as perfectly nested. - ! - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 3]]:{{.*}}) is perfectly nested - ! CHECK: Loop pair starting at location - ! 
CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is perfectly nested - do concurrent(i=bar(n, x):n, j=1:bar(n*m, n/m), k=1:bar(n*m, bar(n*m, n/m))) - a(i) = n - end do -end subroutine - -pure function bar(n, m) - implicit none - integer, intent(in) :: n, m - integer :: bar - - bar = n + m -end function - -end program main diff --git a/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 b/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 index 26800678d381c..d0210726de83e 100644 --- a/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 +++ b/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests mapping of a `do concurrent` loop with multiple iteration ranges. ! RUN: split-file %s %t diff --git a/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 b/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 index 23a3aae976c07..cd1bd4f98a3f5 100644 --- a/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 +++ b/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host %s -o - \ ! RUN: | FileCheck %s diff --git a/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 b/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 index d1c02101318ab..74799359e0476 100644 --- a/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 +++ b/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests that if `do concurrent` is not perfectly nested in its parent loop, that ! we skip converting the not-perfectly nested `do concurrent` loop. @@ -22,23 +19,24 @@ program main end do end -! 
CHECK: %[[ORIG_K_ALLOC:.*]] = fir.alloca i32 {bindc_name = "k"} -! CHECK: %[[ORIG_K_DECL:.*]]:2 = hlfir.declare %[[ORIG_K_ALLOC]] - -! CHECK: %[[ORIG_J_ALLOC:.*]] = fir.alloca i32 {bindc_name = "j"} -! CHECK: %[[ORIG_J_DECL:.*]]:2 = hlfir.declare %[[ORIG_J_ALLOC]] - ! CHECK: omp.parallel { ! CHECK: omp.wsloop { ! CHECK: omp.loop_nest ({{[^[:space:]]+}}) {{.*}} { -! CHECK: fir.do_loop %[[J_IV:.*]] = {{.*}} { -! CHECK: %[[J_IV_CONV:.*]] = fir.convert %[[J_IV]] : (index) -> i32 +! CHECK: fir.do_concurrent { + +! CHECK: %[[ORIG_J_ALLOC:.*]] = fir.alloca i32 {bindc_name = "j"} +! CHECK: %[[ORIG_J_DECL:.*]]:2 = hlfir.declare %[[ORIG_J_ALLOC]] + +! CHECK: %[[ORIG_K_ALLOC:.*]] = fir.alloca i32 {bindc_name = "k"} +! CHECK: %[[ORIG_K_DECL:.*]]:2 = hlfir.declare %[[ORIG_K_ALLOC]] + +! CHECK: fir.do_concurrent.loop (%[[J_IV:.*]], %[[K_IV:.*]]) = {{.*}} { +! CHECK: %[[J_IV_CONV:.*]] = fir.convert %[[J_IV]] : (index) -> i32 ! CHECK: fir.store %[[J_IV_CONV]] to %[[ORIG_J_DECL]]#0 -! CHECK: fir.do_loop %[[K_IV:.*]] = {{.*}} { ! CHECK: %[[K_IV_CONV:.*]] = fir.convert %[[K_IV]] : (index) -> i32 -! CHECK: fir.store %[[K_IV_CONV]] to %[[ORIG_K_DECL]]#0 +! CHECK: fir.store %[[K_IV_CONV]] to %[[ORIG_K_DECL]]#0 ! CHECK: } ! CHECK: } ! CHECK: omp.yield From flang-commits at lists.llvm.org Thu May 8 00:07:28 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 08 May 2025 00:07:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add `fir.local` op for locality specifiers (PR #138505) In-Reply-To: Message-ID: <681c5830.050a0220.2936bb.672c@mx.google.com> ================ @@ -94,10 +94,11 @@ struct IncrementLoopInfo { template explicit IncrementLoopInfo(Fortran::semantics::Symbol &sym, const T &lower, const T &upper, const std::optional &step, - bool isUnordered = false) + bool isConcurrent = false) ---------------- ergawy wrote: This is done somewhere else though not here. 
In particular, `genImplicitLoops` in `ConvertExpr.cpp` where `fir.do_loop .... unordered` loops are still created. See: https://github.com/llvm/llvm-project/blob/main/flang/lib/Lower/ConvertExpr.cpp#L4393. https://github.com/llvm/llvm-project/pull/138505 From flang-commits at lists.llvm.org Thu May 8 00:15:04 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 08 May 2025 00:15:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add `fir.local` op for locality specifiers (PR #138505) In-Reply-To: Message-ID: <681c59f8.050a0220.1d90d9.0e1c@mx.google.com> ================ @@ -120,7 +121,7 @@ struct IncrementLoopInfo { const Fortran::lower::SomeExpr *upperExpr; const Fortran::lower::SomeExpr *stepExpr; const Fortran::lower::SomeExpr *maskExpr = nullptr; - bool isUnordered; // do concurrent, forall ---------------- ergawy wrote: Nope. This also takes a different codegen path. In particular, `forall` concurrent headers are generated through `void genFIR(const Fortran::parser::ConcurrentHeader &header)` where `fir.do_loop .... unordered` nests are still generated. See: https://github.com/llvm/llvm-project/blob/main/flang/lib/Lower/Bridge.cpp#L2771. https://github.com/llvm/llvm-project/pull/138505 From flang-commits at lists.llvm.org Thu May 8 00:15:28 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 08 May 2025 00:15:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add `fir.local` op for locality specifiers (PR #138505) In-Reply-To: Message-ID: <681c5a10.630a0220.2491e5.dd9e@mx.google.com> ergawy wrote: > Some post commit questions. Thanks for taking a look. Replied to your questions. https://github.com/llvm/llvm-project/pull/138505 From flang-commits at lists.llvm.org Thu May 8 00:15:46 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 00:15:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. 
(PR #138627) In-Reply-To: Message-ID: <681c5a22.050a0220.2126bd.b47b@mx.google.com> https://github.com/NexMing updated https://github.com/llvm/llvm-project/pull/138627


From flang-commits at lists.llvm.org Thu May 8 00:43:46 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Thu, 08 May 2025 00:43:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add `fir.local` op for locality specifiers (PR #138505) In-Reply-To: Message-ID: <681c60b2.050a0220.346761.abfe@mx.google.com> clementval wrote: > > Some post commit questions. > > > > Thanks for taking a look. Replied to your questions. Thanks for the replies. All good from my side https://github.com/llvm/llvm-project/pull/138505 From flang-commits at lists.llvm.org Thu May 8 00:53:05 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 00:53:05 -0700 (PDT) Subject: [flang-commits] [flang] [fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790) In-Reply-To: Message-ID: <681c62e1.170a0220.641a3.553e@mx.google.com> https://github.com/NexMing updated https://github.com/llvm/llvm-project/pull/137790 >From 17dfda28fdf8eb8184283b686e6831a3c8b7a9ab Mon Sep 17 00:00:00 2001 From: yanming Date: Tue, 29 Apr 2025 19:16:48 +0800 Subject: [PATCH 1/2] [fir] Support promoting `fir.do_loop` with results to `affine.for`. --- .../Optimizer/Transforms/AffinePromotion.cpp | 39 +++++++++-- flang/test/Fir/affine-promotion.fir | 65 +++++++++++++++++++ 2 files changed, 99 insertions(+), 5 deletions(-) diff --git a/flang/lib/Optimizer/Transforms/AffinePromotion.cpp b/flang/lib/Optimizer/Transforms/AffinePromotion.cpp index 43fccf52dc8ab..ef82e400bea14 100644 --- a/flang/lib/Optimizer/Transforms/AffinePromotion.cpp +++ b/flang/lib/Optimizer/Transforms/AffinePromotion.cpp @@ -49,8 +49,9 @@ struct AffineIfAnalysis; /// second when doing rewrite.
struct AffineFunctionAnalysis { explicit AffineFunctionAnalysis(mlir::func::FuncOp funcOp) { - for (fir::DoLoopOp op : funcOp.getOps()) - loopAnalysisMap.try_emplace(op, op, *this); + funcOp->walk([&](fir::DoLoopOp doloop) { + loopAnalysisMap.try_emplace(doloop, doloop, *this); + }); } AffineLoopAnalysis getChildLoopAnalysis(fir::DoLoopOp op) const; @@ -102,10 +103,23 @@ struct AffineLoopAnalysis { return true; } + bool analysisResults(fir::DoLoopOp loopOperation) { + if (loopOperation.getFinalValue() && + !loopOperation.getResult(0).use_empty()) { + LLVM_DEBUG( + llvm::dbgs() + << "AffineLoopAnalysis: cannot promote loop final value\n";); + return false; + } + + return true; + } + bool analyzeLoop(fir::DoLoopOp loopOperation, AffineFunctionAnalysis &functionAnalysis) { LLVM_DEBUG(llvm::dbgs() << "AffineLoopAnalysis: \n"; loopOperation.dump();); return analyzeMemoryAccess(loopOperation) && + analysisResults(loopOperation) && analyzeBody(loopOperation, functionAnalysis); } @@ -461,14 +475,28 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_ATTRIBUTE_UNUSED auto loopAnalysis = functionAnalysis.getChildLoopAnalysis(loop); auto &loopOps = loop.getBody()->getOperations(); + auto resultOp = cast(loop.getBody()->getTerminator()); + auto results = resultOp.getOperands(); + auto loopResults = loop->getResults(); auto loopAndIndex = createAffineFor(loop, rewriter); auto affineFor = loopAndIndex.first; auto inductionVar = loopAndIndex.second; + if (loop.getFinalValue()) { + results = results.drop_front(); + loopResults = loopResults.drop_front(); + } + rewriter.startOpModification(affineFor.getOperation()); affineFor.getBody()->getOperations().splice( std::prev(affineFor.getBody()->end()), loopOps, loopOps.begin(), std::prev(loopOps.end())); + rewriter.replaceAllUsesWith(loop.getRegionIterArgs(), + affineFor.getRegionIterArgs()); + if (!results.empty()) { + rewriter.setInsertionPointToEnd(affineFor.getBody()); + rewriter.create(resultOp->getLoc(), 
results); + } rewriter.finalizeOpModification(affineFor.getOperation()); rewriter.startOpModification(loop.getOperation()); @@ -479,7 +507,8 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_DEBUG(llvm::dbgs() << "AffineLoopConversion: loop rewriten to:\n"; affineFor.dump();); - rewriter.replaceOp(loop, affineFor.getOperation()->getResults()); + rewriter.replaceAllUsesWith(loopResults, affineFor->getResults()); + rewriter.eraseOp(loop); return success(); } @@ -503,7 +532,7 @@ class AffineLoopConversion : public mlir::OpRewritePattern { ValueRange(op.getUpperBound()), mlir::AffineMap::get(0, 1, 1 + mlir::getAffineSymbolExpr(0, op.getContext())), - step); + step, op.getIterOperands()); return std::make_pair(affineFor, affineFor.getInductionVar()); } @@ -528,7 +557,7 @@ class AffineLoopConversion : public mlir::OpRewritePattern { genericUpperBound.getResult(), mlir::AffineMap::get(0, 1, 1 + mlir::getAffineSymbolExpr(0, op.getContext())), - 1); + 1, op.getIterOperands()); rewriter.setInsertionPointToStart(affineFor.getBody()); auto actualIndex = rewriter.create( op.getLoc(), actualIndexMap, diff --git a/flang/test/Fir/affine-promotion.fir b/flang/test/Fir/affine-promotion.fir index aae35c6ef5659..f50f851a89eae 100644 --- a/flang/test/Fir/affine-promotion.fir +++ b/flang/test/Fir/affine-promotion.fir @@ -131,3 +131,68 @@ func.func @loop_with_if(%a: !arr_d1, %v: f32) { // CHECK: } // CHECK: return // CHECK: } + +func.func @loop_with_result(%arg0: !fir.ref>, %arg1: !fir.ref>) -> f32 { + %c1 = arith.constant 1 : index + %cst = arith.constant 0.000000e+00 : f32 + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %1 = fir.shape %c100, %c100 : (index, index) -> !fir.shape<2> + %2 = fir.alloca i32 + %3:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %cst) -> (index, f32) { + %6 = fir.array_coor %arg0(%0) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %7 = fir.load %6 : !fir.ref + %8 = arith.addf 
%arg3, %7 fastmath : f32 + %9 = arith.addi %arg2, %c1 overflow : index + fir.result %9, %8 : index, f32 + } + %4:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %3#1) -> (index, f32) { + %6 = fir.array_coor %arg1(%1) %c1, %arg2 : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref + %7 = fir.convert %6 : (!fir.ref) -> !fir.ref> + %8 = fir.do_loop %arg4 = %c1 to %c100 step %c1 iter_args(%arg5 = %arg3) -> (f32) { + %10 = fir.array_coor %7(%0) %arg4 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %11 = fir.load %10 : !fir.ref + %12 = arith.addf %arg5, %11 fastmath : f32 + fir.result %12 : f32 + } + %9 = arith.addi %arg2, %c1 overflow : index + fir.result %9, %8 : index, f32 + } + %5 = fir.convert %4#0 : (index) -> i32 + fir.store %5 to %2 : !fir.ref + return %4#1 : f32 +} + +// CHECK-LABEL: func.func @loop_with_result( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref>) -> f32 { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 0.000000e+00 : f32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]], %[[VAL_2]] : (index, index) -> !fir.shape<2> +// CHECK: %[[VAL_5:.*]] = fir.alloca i32 +// CHECK: %[[VAL_6:.*]] = fir.convert %[[ARG0]] : (!fir.ref>) -> memref +// CHECK: %[[VAL_7:.*]] = affine.for %[[VAL_8:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] iter_args(%[[VAL_9:.*]] = %[[VAL_1]]) -> (f32) { +// CHECK: %[[VAL_10:.*]] = affine.apply #{{.*}}(%[[VAL_8]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_11:.*]] = affine.load %[[VAL_6]]{{\[}}%[[VAL_10]]] : memref +// CHECK: %[[VAL_12:.*]] = arith.addf %[[VAL_9]], %[[VAL_11]] fastmath : f32 +// CHECK: affine.yield %[[VAL_12]] : f32 +// CHECK: } +// CHECK: %[[VAL_13:.*]]:2 = fir.do_loop %[[VAL_14:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_0]] iter_args(%[[VAL_15:.*]] = %[[VAL_7]]) -> (index, 
f32) { +// CHECK: %[[VAL_16:.*]] = fir.array_coor %[[ARG1]](%[[VAL_4]]) %[[VAL_0]], %[[VAL_14]] : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16]] : (!fir.ref) -> !fir.ref> +// CHECK: %[[VAL_18:.*]] = fir.convert %[[VAL_17]] : (!fir.ref>) -> memref +// CHECK: %[[VAL_19:.*]] = affine.for %[[VAL_20:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] iter_args(%[[VAL_21:.*]] = %[[VAL_15]]) -> (f32) { +// CHECK: %[[VAL_22:.*]] = affine.apply #{{.*}}(%[[VAL_20]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_23:.*]] = affine.load %[[VAL_18]]{{\[}}%[[VAL_22]]] : memref +// CHECK: %[[VAL_24:.*]] = arith.addf %[[VAL_21]], %[[VAL_23]] fastmath : f32 +// CHECK: affine.yield %[[VAL_24]] : f32 +// CHECK: } +// CHECK: %[[VAL_25:.*]] = arith.addi %[[VAL_14]], %[[VAL_0]] overflow : index +// CHECK: fir.result %[[VAL_25]], %[[VAL_19]] : index, f32 +// CHECK: } +// CHECK: %[[VAL_26:.*]] = fir.convert %[[VAL_27:.*]]#0 : (index) -> i32 +// CHECK: fir.store %[[VAL_26]] to %[[VAL_5]] : !fir.ref +// CHECK: return %[[VAL_27]]#1 : f32 +// CHECK: } >From 98746e859c3bb9fdd72ecdd562cd3b404b1fc98b Mon Sep 17 00:00:00 2001 From: yanming Date: Thu, 8 May 2025 13:50:54 +0800 Subject: [PATCH 2/2] Add a test that loop with multiple results. 
--- flang/test/Fir/affine-promotion.fir | 69 +++++++++++++++++++---------- 1 file changed, 45 insertions(+), 24 deletions(-) diff --git a/flang/test/Fir/affine-promotion.fir b/flang/test/Fir/affine-promotion.fir index f50f851a89eae..46467ab4a292a 100644 --- a/flang/test/Fir/affine-promotion.fir +++ b/flang/test/Fir/affine-promotion.fir @@ -132,40 +132,51 @@ func.func @loop_with_if(%a: !arr_d1, %v: f32) { // CHECK: return // CHECK: } -func.func @loop_with_result(%arg0: !fir.ref>, %arg1: !fir.ref>) -> f32 { +func.func @loop_with_result(%arg0: !fir.ref>, %arg1: !fir.ref>, %arg2: !fir.ref>) -> f32 { %c1 = arith.constant 1 : index %cst = arith.constant 0.000000e+00 : f32 %c100 = arith.constant 100 : index %0 = fir.shape %c100 : (index) -> !fir.shape<1> %1 = fir.shape %c100, %c100 : (index, index) -> !fir.shape<2> %2 = fir.alloca i32 - %3:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %cst) -> (index, f32) { - %6 = fir.array_coor %arg0(%0) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref - %7 = fir.load %6 : !fir.ref - %8 = arith.addf %arg3, %7 fastmath : f32 - %9 = arith.addi %arg2, %c1 overflow : index - fir.result %9, %8 : index, f32 + %3:2 = fir.do_loop %arg3 = %c1 to %c100 step %c1 iter_args(%arg4 = %cst) -> (index, f32) { + %8 = fir.array_coor %arg0(%0) %arg3 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %9 = fir.load %8 : !fir.ref + %10 = arith.addf %arg4, %9 fastmath : f32 + %11 = arith.addi %arg3, %c1 overflow : index + fir.result %11, %10 : index, f32 } - %4:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %3#1) -> (index, f32) { - %6 = fir.array_coor %arg1(%1) %c1, %arg2 : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref - %7 = fir.convert %6 : (!fir.ref) -> !fir.ref> - %8 = fir.do_loop %arg4 = %c1 to %c100 step %c1 iter_args(%arg5 = %arg3) -> (f32) { - %10 = fir.array_coor %7(%0) %arg4 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref - %11 = fir.load %10 : !fir.ref - %12 = arith.addf %arg5, %11 fastmath : f32 - 
fir.result %12 : f32 + %4:2 = fir.do_loop %arg3 = %c1 to %c100 step %c1 iter_args(%arg4 = %3#1) -> (index, f32) { + %8 = fir.array_coor %arg1(%1) %c1, %arg3 : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref + %9 = fir.convert %8 : (!fir.ref) -> !fir.ref> + %10 = fir.do_loop %arg5 = %c1 to %c100 step %c1 iter_args(%arg6 = %arg4) -> (f32) { + %12 = fir.array_coor %9(%0) %arg5 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %13 = fir.load %12 : !fir.ref + %14 = arith.addf %arg6, %13 fastmath : f32 + fir.result %14 : f32 } - %9 = arith.addi %arg2, %c1 overflow : index - fir.result %9, %8 : index, f32 + %11 = arith.addi %arg3, %c1 overflow : index + fir.result %11, %10 : index, f32 } - %5 = fir.convert %4#0 : (index) -> i32 - fir.store %5 to %2 : !fir.ref - return %4#1 : f32 + %5:2 = fir.do_loop %arg3 = %c1 to %c100 step %c1 iter_args(%arg4 = %4#1, %arg5 = %cst) -> (f32, f32) { + %8 = fir.array_coor %arg0(%0) %arg3 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %9 = fir.load %8 : !fir.ref + %10 = arith.addf %arg4, %9 fastmath : f32 + %11 = fir.array_coor %arg2(%0) %arg3 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %12 = fir.load %11 : !fir.ref + %13 = arith.addf %arg5, %12 fastmath : f32 + fir.result %10, %13 : f32, f32 + } + %6 = arith.addf %5#0, %5#1 fastmath : f32 + %7 = fir.convert %4#0 : (index) -> i32 + fir.store %7 to %2 : !fir.ref + return %6 : f32 } // CHECK-LABEL: func.func @loop_with_result( // CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, -// CHECK-SAME: %[[ARG1:.*]]: !fir.ref>) -> f32 { +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG2:.*]]: !fir.ref>) -> f32 { // CHECK: %[[VAL_0:.*]] = arith.constant 1 : index // CHECK: %[[VAL_1:.*]] = arith.constant 0.000000e+00 : f32 // CHECK: %[[VAL_2:.*]] = arith.constant 100 : index @@ -192,7 +203,17 @@ func.func @loop_with_result(%arg0: !fir.ref>, %arg1: !fir.re // CHECK: %[[VAL_25:.*]] = arith.addi %[[VAL_14]], %[[VAL_0]] overflow : index // CHECK: fir.result %[[VAL_25]], %[[VAL_19]] : index, 
f32 // CHECK: } -// CHECK: %[[VAL_26:.*]] = fir.convert %[[VAL_27:.*]]#0 : (index) -> i32 -// CHECK: fir.store %[[VAL_26]] to %[[VAL_5]] : !fir.ref -// CHECK: return %[[VAL_27]]#1 : f32 +// CHECK: %[[VAL_26:.*]] = fir.convert %[[ARG2]] : (!fir.ref>) -> memref +// CHECK: %[[VAL_27:.*]]:2 = affine.for %[[VAL_28:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] iter_args(%[[VAL_29:.*]] = %[[VAL_30:.*]]#1, %[[VAL_31:.*]] = %[[VAL_1]]) -> (f32, f32) { +// CHECK: %[[VAL_32:.*]] = affine.apply #{{.*}}(%[[VAL_28]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_33:.*]] = affine.load %[[VAL_6]]{{\[}}%[[VAL_32]]] : memref +// CHECK: %[[VAL_34:.*]] = arith.addf %[[VAL_29]], %[[VAL_33]] fastmath : f32 +// CHECK: %[[VAL_35:.*]] = affine.load %[[VAL_26]]{{\[}}%[[VAL_32]]] : memref +// CHECK: %[[VAL_36:.*]] = arith.addf %[[VAL_31]], %[[VAL_35]] fastmath : f32 +// CHECK: affine.yield %[[VAL_34]], %[[VAL_36]] : f32, f32 +// CHECK: } +// CHECK: %[[VAL_37:.*]] = arith.addf %[[VAL_38:.*]]#0, %[[VAL_38]]#1 fastmath : f32 +// CHECK: %[[VAL_39:.*]] = fir.convert %[[VAL_40:.*]]#0 : (index) -> i32 +// CHECK: fir.store %[[VAL_39]] to %[[VAL_5]] : !fir.ref +// CHECK: return %[[VAL_37]] : f32 // CHECK: } From flang-commits at lists.llvm.org Thu May 8 00:55:58 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Thu, 08 May 2025 00:55:58 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681c638e.050a0220.2f6125.8079@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/128490


From flang-commits at lists.llvm.org Thu May 8 00:56:07 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Thu, 08 May 2025 00:56:07 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681c6397.170a0220.254680.f0ba@mx.google.com> ================ @@ -365,6 +365,27 @@ bool ClauseProcessor::processHint(mlir::omp::HintClauseOps &result) const { return false; } +bool ClauseProcessor::processGrainsize( + lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const { + using grainsize = omp::clause::Grainsize; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier) { + result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( + context, mlir::omp::ClauseGrainsizeType::Strict); + } ---------------- kaviya2510 wrote: Thanks for the quick review! I agree with your suggestion. I have made the required changes; kindly check them. Thank you. https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Thu May 8 01:18:11 2025 From: flang-commits at lists.llvm.org (Michael Klemm via flang-commits) Date: Thu, 08 May 2025 01:18:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Update `do concurrent` mapping pass to use `fir.do_concurrent` op (PR #138489) In-Reply-To: Message-ID: <681c68c3.630a0220.2491e5.e1cd@mx.google.com> https://github.com/mjklemm approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/138489 From flang-commits at lists.llvm.org Thu May 8 01:19:23 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 01:19:23 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline.
(PR #138627) In-Reply-To: Message-ID: <681c690b.050a0220.f11e8.1734@mx.google.com> https://github.com/NexMing updated https://github.com/llvm/llvm-project/pull/138627 >From ea6a6e5721d301647770ef05548555b05f1092f7 Mon Sep 17 00:00:00 2001 From: yanming Date: Wed, 30 Apr 2025 16:32:14 +0800 Subject: [PATCH 1/2] [flang][fir] Add affine optimization pass pipeline. --- .../flang/Optimizer/Passes/CommandLineOpts.h | 1 + .../flang/Optimizer/Passes/Pipelines.h | 3 ++ flang/lib/Optimizer/Passes/CMakeLists.txt | 1 + .../lib/Optimizer/Passes/CommandLineOpts.cpp | 1 + flang/lib/Optimizer/Passes/Pipelines.cpp | 17 ++++++ flang/test/Driver/mlir-pass-pipeline.f90 | 14 +++++ flang/test/Integration/OpenMP/auto-omp.f90 | 52 +++++++++++++++++++ 7 files changed, 89 insertions(+) create mode 100644 flang/test/Integration/OpenMP/auto-omp.f90 diff --git a/flang/include/flang/Optimizer/Passes/CommandLineOpts.h b/flang/include/flang/Optimizer/Passes/CommandLineOpts.h index 1cfaf285e75e6..320c561953213 100644 --- a/flang/include/flang/Optimizer/Passes/CommandLineOpts.h +++ b/flang/include/flang/Optimizer/Passes/CommandLineOpts.h @@ -42,6 +42,7 @@ extern llvm::cl::opt disableCfgConversion; extern llvm::cl::opt disableFirAvc; extern llvm::cl::opt disableFirMao; +extern llvm::cl::opt enableAffineOpt; extern llvm::cl::opt disableFirAliasTags; extern llvm::cl::opt useOldAliasTags; diff --git a/flang/include/flang/Optimizer/Passes/Pipelines.h b/flang/include/flang/Optimizer/Passes/Pipelines.h index a3f59ee8dd013..7680987367256 100644 --- a/flang/include/flang/Optimizer/Passes/Pipelines.h +++ b/flang/include/flang/Optimizer/Passes/Pipelines.h @@ -18,8 +18,11 @@ #include "flang/Optimizer/Passes/CommandLineOpts.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Tools/CrossToolHelpers.h" +#include "mlir/Conversion/AffineToStandard/AffineToStandard.h" #include "mlir/Conversion/ReconcileUnrealizedCasts/ReconcileUnrealizedCasts.h" #include 
"mlir/Conversion/SCFToControlFlow/SCFToControlFlow.h" +#include "mlir/Conversion/SCFToOpenMP/SCFToOpenMP.h" +#include "mlir/Dialect/Affine/Passes.h" #include "mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/Dialect/LLVMIR/LLVMAttrs.h" #include "mlir/Pass/PassManager.h" diff --git a/flang/lib/Optimizer/Passes/CMakeLists.txt b/flang/lib/Optimizer/Passes/CMakeLists.txt index 1c19a5765aff1..ad6c714c28bec 100644 --- a/flang/lib/Optimizer/Passes/CMakeLists.txt +++ b/flang/lib/Optimizer/Passes/CMakeLists.txt @@ -21,6 +21,7 @@ add_flang_library(flangPasses MLIRPass MLIRReconcileUnrealizedCasts MLIRSCFToControlFlow + MLIRSCFToOpenMP MLIRSupport MLIRTransforms ) diff --git a/flang/lib/Optimizer/Passes/CommandLineOpts.cpp b/flang/lib/Optimizer/Passes/CommandLineOpts.cpp index f95a280883cba..b8ae6ede423e3 100644 --- a/flang/lib/Optimizer/Passes/CommandLineOpts.cpp +++ b/flang/lib/Optimizer/Passes/CommandLineOpts.cpp @@ -55,6 +55,7 @@ cl::opt useOldAliasTags( cl::desc("Use a single TBAA tree for all functions and do not use " "the FIR alias tags pass"), cl::init(false), cl::Hidden); +EnableOption(AffineOpt, "affine-opt", "affine optimization"); /// CodeGen Passes DisableOption(CodeGenRewrite, "codegen-rewrite", "rewrite FIR for codegen"); diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index a3ef473ea39b7..f85de45f6029d 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -209,8 +209,25 @@ void createDefaultFIROptimizerPassPipeline(mlir::PassManager &pm, if (pc.AliasAnalysis && !disableFirAliasTags && !useOldAliasTags) pm.addPass(fir::createAddAliasTags()); + if (enableAffineOpt && pc.OptLevel.isOptimizingForSpeed()) { + pm.addPass(fir::createPromoteToAffinePass()); + pm.addPass(mlir::createCSEPass()); + pm.addPass(mlir::affine::createAffineLoopInvariantCodeMotionPass()); + pm.addPass(mlir::affine::createAffineLoopNormalizePass()); + 
pm.addPass(mlir::affine::createSimplifyAffineStructuresPass()); + pm.addPass(mlir::affine::createAffineParallelize( + mlir::affine::AffineParallelizeOptions{1, false})); + pm.addPass(fir::createAffineDemotionPass()); + pm.addPass(mlir::createLowerAffinePass()); + if (pc.EnableOpenMP) { + pm.addPass(mlir::createConvertSCFToOpenMPPass()); + pm.addPass(mlir::createCanonicalizerPass()); + } + } + addNestedPassToAllTopLevelOperations( pm, fir::createStackReclaim); + // convert control flow to CFG form fir::addCfgConversionPass(pm, pc); pm.addPass(mlir::createSCFToControlFlowPass()); diff --git a/flang/test/Driver/mlir-pass-pipeline.f90 b/flang/test/Driver/mlir-pass-pipeline.f90 index 45370895db397..188a42d231500 100644 --- a/flang/test/Driver/mlir-pass-pipeline.f90 +++ b/flang/test/Driver/mlir-pass-pipeline.f90 @@ -4,6 +4,7 @@ ! -O0 is the default: ! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline %s -O0 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL %s ! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline %s -O2 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL,O2 %s +! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline -mllvm --enable-affine-opt %s -O2 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL,O2,AFFINE %s ! REQUIRES: asserts @@ -105,6 +106,19 @@ ! ALL-NEXT: SimplifyFIROperations ! O2-NEXT: AddAliasTags +! AFFINE-NEXT: 'func.func' Pipeline +! AFFINE-NEXT: AffineDialectPromotion +! AFFINE-NEXT: CSE +! AFFINE-NEXT: (S) 0 num-cse'd - Number of operations CSE'd +! AFFINE-NEXT: (S) 0 num-dce'd - Number of operations DCE'd +! AFFINE-NEXT: 'func.func' Pipeline +! AFFINE-NEXT: AffineLoopInvariantCodeMotion +! AFFINE-NEXT: AffineLoopNormalize +! AFFINE-NEXT: SimplifyAffineStructures +! AFFINE-NEXT: AffineParallelize +! AFFINE-NEXT: AffineDialectDemotion +! AFFINE-NEXT: LowerAffinePass + ! 
ALL-NEXT: Pipeline Collection : ['fir.global', 'func.func', 'omp.declare_reduction', 'omp.private'] ! ALL-NEXT: 'fir.global' Pipeline ! ALL-NEXT: StackReclaim diff --git a/flang/test/Integration/OpenMP/auto-omp.f90 b/flang/test/Integration/OpenMP/auto-omp.f90 new file mode 100644 index 0000000000000..7e348bfb41c17 --- /dev/null +++ b/flang/test/Integration/OpenMP/auto-omp.f90 @@ -0,0 +1,52 @@ +! RUN: %flang_fc1 -O1 -mllvm --enable-affine-opt -emit-llvm -fopenmp -o - %s \ +! RUN: | FileCheck %s + +!CHECK-LABEL: entry: +!CHECK: %[[VAL_0:.*]] = alloca { ptr }, align 8 +!CHECK: %[[VAL_1:.*]] = tail call i32 @__kmpc_global_thread_num(ptr nonnull @1) +!CHECK: store ptr %[[VAL_2:.*]], ptr %[[VAL_0]], align 8 +!CHECK: call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr nonnull @1, i32 1, ptr nonnull @foo_..omp_par, ptr nonnull %[[VAL_0]]) +!CHECK: ret void +!CHECK: omp.par.entry: +!CHECK: %[[VAL_3:.*]] = load ptr, ptr %[[VAL_4:.*]], align 8, !align !3 +!CHECK: %[[VAL_5:.*]] = alloca i32, align 4 +!CHECK: %[[VAL_6:.*]] = alloca i64, align 8 +!CHECK: %[[VAL_7:.*]] = alloca i64, align 8 +!CHECK: %[[VAL_8:.*]] = alloca i64, align 8 +!CHECK: store i64 0, ptr %[[VAL_6]], align 8 +!CHECK: store i64 99, ptr %[[VAL_7]], align 8 +!CHECK: store i64 1, ptr %[[VAL_8]], align 8 +!CHECK: %[[VAL_9:.*]] = tail call i32 @__kmpc_global_thread_num(ptr nonnull @1) +!CHECK: call void @__kmpc_for_static_init_8u(ptr nonnull @1, i32 %[[VAL_9]], i32 34, ptr nonnull %[[VAL_5]], ptr nonnull %[[VAL_6]], ptr nonnull %[[VAL_7]], ptr nonnull %[[VAL_8]], i64 1, i64 0) +!CHECK: %[[VAL_10:.*]] = load i64, ptr %[[VAL_6]], align 8 +!CHECK: %[[VAL_11:.*]] = load i64, ptr %[[VAL_7]], align 8 +!CHECK: %[[VAL_12:.*]] = sub i64 %[[VAL_11]], %[[VAL_10]] +!CHECK: %[[VAL_13:.*]] = icmp eq i64 %[[VAL_12]], -1 +!CHECK: br i1 %[[VAL_13]], label %[[VAL_14:.*]], label %[[VAL_15:.*]] +!CHECK: omp_loop.exit: ; preds = %[[VAL_16:.*]], %[[VAL_17:.*]] +!CHECK: call void @__kmpc_for_static_fini(ptr nonnull @1, i32 %[[VAL_9]]) 
+!CHECK: %[[VAL_18:.*]] = call i32 @__kmpc_global_thread_num(ptr nonnull @1) +!CHECK: call void @__kmpc_barrier(ptr nonnull @2, i32 %[[VAL_18]]) +!CHECK: ret void +!CHECK: omp_loop.body: ; preds = %[[VAL_17]], %[[VAL_16]] +!CHECK: %[[VAL_19:.*]] = phi i64 [ %[[VAL_20:.*]], %[[VAL_16]] ], [ 0, %[[VAL_17]] ] +!CHECK: %[[VAL_21:.*]] = add i64 %[[VAL_19]], %[[VAL_10]] +!CHECK: %[[VAL_22:.*]] = mul i64 %[[VAL_21]], 400 +!CHECK: %[[VAL_23:.*]] = getelementptr i8, ptr %[[VAL_3]], i64 %[[VAL_22]] +!CHECK: br label %[[VAL_24:.*]] +!CHECK: omp_loop.inc: ; preds = %[[VAL_24]] +!CHECK: %[[VAL_20]] = add nuw i64 %[[VAL_19]], 1 +!CHECK: %[[VAL_25:.*]] = icmp eq i64 %[[VAL_19]], %[[VAL_12]] +!CHECK: br i1 %[[VAL_25]], label %[[VAL_14]], label %[[VAL_15]] +!CHECK: omp.loop_nest.region6: ; preds = %[[VAL_15]], %[[VAL_24]] +!CHECK: %[[VAL_26:.*]] = phi i64 [ 0, %[[VAL_15]] ], [ %[[VAL_27:.*]], %[[VAL_24]] ] +!CHECK: %[[VAL_28:.*]] = getelementptr i32, ptr %[[VAL_23]], i64 %[[VAL_26]] +!CHECK: store i32 1, ptr %[[VAL_28]], align 4, !tbaa !4 +!CHECK: %[[VAL_27]] = add nuw nsw i64 %[[VAL_26]], 1 +!CHECK: %[[VAL_29:.*]] = icmp eq i64 %[[VAL_27]], 100 +!CHECK: br i1 %[[VAL_29]], label %[[VAL_16]], label %[[VAL_24]] + +subroutine foo(a) + integer, dimension(100, 100), intent(out) :: a + a = 1 +end subroutine foo >From 99ecb0b36284e5a6eb42797f6330cf69c0d37b5b Mon Sep 17 00:00:00 2001 From: yanming Date: Thu, 8 May 2025 16:17:48 +0800 Subject: [PATCH 2/2] Fix the failed test. --- flang/test/Integration/OpenMP/auto-omp.f90 | 46 +--------------------- 1 file changed, 2 insertions(+), 44 deletions(-) diff --git a/flang/test/Integration/OpenMP/auto-omp.f90 b/flang/test/Integration/OpenMP/auto-omp.f90 index 7e348bfb41c17..bf7da292552d8 100644 --- a/flang/test/Integration/OpenMP/auto-omp.f90 +++ b/flang/test/Integration/OpenMP/auto-omp.f90 @@ -1,50 +1,8 @@ ! RUN: %flang_fc1 -O1 -mllvm --enable-affine-opt -emit-llvm -fopenmp -o - %s \ ! 
RUN: | FileCheck %s -!CHECK-LABEL: entry: -!CHECK: %[[VAL_0:.*]] = alloca { ptr }, align 8 -!CHECK: %[[VAL_1:.*]] = tail call i32 @__kmpc_global_thread_num(ptr nonnull @1) -!CHECK: store ptr %[[VAL_2:.*]], ptr %[[VAL_0]], align 8 -!CHECK: call void (ptr, i32, ptr, ...) @__kmpc_fork_call(ptr nonnull @1, i32 1, ptr nonnull @foo_..omp_par, ptr nonnull %[[VAL_0]]) -!CHECK: ret void -!CHECK: omp.par.entry: -!CHECK: %[[VAL_3:.*]] = load ptr, ptr %[[VAL_4:.*]], align 8, !align !3 -!CHECK: %[[VAL_5:.*]] = alloca i32, align 4 -!CHECK: %[[VAL_6:.*]] = alloca i64, align 8 -!CHECK: %[[VAL_7:.*]] = alloca i64, align 8 -!CHECK: %[[VAL_8:.*]] = alloca i64, align 8 -!CHECK: store i64 0, ptr %[[VAL_6]], align 8 -!CHECK: store i64 99, ptr %[[VAL_7]], align 8 -!CHECK: store i64 1, ptr %[[VAL_8]], align 8 -!CHECK: %[[VAL_9:.*]] = tail call i32 @__kmpc_global_thread_num(ptr nonnull @1) -!CHECK: call void @__kmpc_for_static_init_8u(ptr nonnull @1, i32 %[[VAL_9]], i32 34, ptr nonnull %[[VAL_5]], ptr nonnull %[[VAL_6]], ptr nonnull %[[VAL_7]], ptr nonnull %[[VAL_8]], i64 1, i64 0) -!CHECK: %[[VAL_10:.*]] = load i64, ptr %[[VAL_6]], align 8 -!CHECK: %[[VAL_11:.*]] = load i64, ptr %[[VAL_7]], align 8 -!CHECK: %[[VAL_12:.*]] = sub i64 %[[VAL_11]], %[[VAL_10]] -!CHECK: %[[VAL_13:.*]] = icmp eq i64 %[[VAL_12]], -1 -!CHECK: br i1 %[[VAL_13]], label %[[VAL_14:.*]], label %[[VAL_15:.*]] -!CHECK: omp_loop.exit: ; preds = %[[VAL_16:.*]], %[[VAL_17:.*]] -!CHECK: call void @__kmpc_for_static_fini(ptr nonnull @1, i32 %[[VAL_9]]) -!CHECK: %[[VAL_18:.*]] = call i32 @__kmpc_global_thread_num(ptr nonnull @1) -!CHECK: call void @__kmpc_barrier(ptr nonnull @2, i32 %[[VAL_18]]) -!CHECK: ret void -!CHECK: omp_loop.body: ; preds = %[[VAL_17]], %[[VAL_16]] -!CHECK: %[[VAL_19:.*]] = phi i64 [ %[[VAL_20:.*]], %[[VAL_16]] ], [ 0, %[[VAL_17]] ] -!CHECK: %[[VAL_21:.*]] = add i64 %[[VAL_19]], %[[VAL_10]] -!CHECK: %[[VAL_22:.*]] = mul i64 %[[VAL_21]], 400 -!CHECK: %[[VAL_23:.*]] = getelementptr i8, ptr %[[VAL_3]], i64 
%[[VAL_22]] -!CHECK: br label %[[VAL_24:.*]] -!CHECK: omp_loop.inc: ; preds = %[[VAL_24]] -!CHECK: %[[VAL_20]] = add nuw i64 %[[VAL_19]], 1 -!CHECK: %[[VAL_25:.*]] = icmp eq i64 %[[VAL_19]], %[[VAL_12]] -!CHECK: br i1 %[[VAL_25]], label %[[VAL_14]], label %[[VAL_15]] -!CHECK: omp.loop_nest.region6: ; preds = %[[VAL_15]], %[[VAL_24]] -!CHECK: %[[VAL_26:.*]] = phi i64 [ 0, %[[VAL_15]] ], [ %[[VAL_27:.*]], %[[VAL_24]] ] -!CHECK: %[[VAL_28:.*]] = getelementptr i32, ptr %[[VAL_23]], i64 %[[VAL_26]] -!CHECK: store i32 1, ptr %[[VAL_28]], align 4, !tbaa !4 -!CHECK: %[[VAL_27]] = add nuw nsw i64 %[[VAL_26]], 1 -!CHECK: %[[VAL_29:.*]] = icmp eq i64 %[[VAL_27]], 100 -!CHECK: br i1 %[[VAL_29]], label %[[VAL_16]], label %[[VAL_24]] +!CHECK-LABEL: define void @foo_(ptr captures(none) %0) {{.*}} { +!CHECK: call void{{.*}}@__kmpc_fork_call{{.*}}@[[OMP_OUTLINED_FN_1:.*]]) subroutine foo(a) integer, dimension(100, 100), intent(out) :: a From flang-commits at lists.llvm.org Thu May 8 02:06:29 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 02:06:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Support promoting `fir.do_loop` with results to `affine.for`. 
(PR #137790) In-Reply-To: Message-ID: <681c7415.170a0220.22506c.2890@mx.google.com> https://github.com/NexMing edited https://github.com/llvm/llvm-project/pull/137790 From flang-commits at lists.llvm.org Thu May 8 02:08:52 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 02:08:52 -0700 (PDT) Subject: [flang-commits] [flang] 2a32d73 - [flang][OpenMP] fix predetermined privatization inside section (#138159) Message-ID: <681c74a4.170a0220.a70bf.5447@mx.google.com> Author: Tom Eccles Date: 2025-05-08T10:08:49+01:00 New Revision: 2a32d738bb213a8a1e814b65beb61e39b7c66834 URL: https://github.com/llvm/llvm-project/commit/2a32d738bb213a8a1e814b65beb61e39b7c66834 DIFF: https://github.com/llvm/llvm-project/commit/2a32d738bb213a8a1e814b65beb61e39b7c66834.diff LOG: [flang][OpenMP] fix predetermined privatization inside section (#138159) This now produces code equivalent to if there was an explicit private clause on the SECTIONS construct. The problem was that each SECTION construct got its own DSP, which tried to privatize the same symbol for that SECTION. Privatization for SECTION(S) happens on the outer SECTION construct and so the outer construct's DSP should be shared. 
Fixes #135108 Added: flang/test/Lower/OpenMP/sections-predetermined-private.f90 Modified: flang/lib/Lower/OpenMP/DataSharingProcessor.cpp flang/lib/Lower/OpenMP/OpenMP.cpp Removed: ################################################################################ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..7eec598645eac 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -67,6 +67,8 @@ void DataSharingProcessor::processStep1( void DataSharingProcessor::processStep2(mlir::Operation *op, bool isLoop) { // 'sections' lastprivate is handled by genOMP() + if (mlir::isa(op)) + return; if (!mlir::isa(op)) { mlir::OpBuilder::InsertionGuard guard(firOpBuilder); copyLastPrivatize(op); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index cc793c683f898..099d5c604060f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2154,6 +2154,7 @@ genSectionsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, OpWithBodyGenInfo(converter, symTable, semaCtx, loc, nestedEval, llvm::omp::Directive::OMPD_section) .setClauses(&sectionQueue.begin()->clauses) + .setDataSharingProcessor(&dsp) .setEntryBlockArgs(&args), sectionQueue, sectionQueue.begin()); } diff --git a/flang/test/Lower/OpenMP/sections-predetermined-private.f90 b/flang/test/Lower/OpenMP/sections-predetermined-private.f90 new file mode 100644 index 0000000000000..9c2e2e127aa78 --- /dev/null +++ b/flang/test/Lower/OpenMP/sections-predetermined-private.f90 @@ -0,0 +1,34 @@ +! RUN: %flang_fc1 -fopenmp -emit-hlfir -o - %s | FileCheck %s + +!$omp parallel sections +!$omp section + do i = 1, 2 + end do +!$omp section + do i = 1, 2 + end do +!$omp end parallel sections +end +! CHECK-LABEL: func.func @_QQmain() { +! CHECK: omp.parallel { +! CHECK: %[[VAL_3:.*]] = fir.alloca i32 {bindc_name = "i", pinned} +!
CHECK: %[[VAL_4:.*]]:2 = hlfir.declare %[[VAL_3]] {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: omp.sections { +! CHECK: omp.section { +! CHECK: %[[VAL_11:.*]]:2 = fir.do_loop %[[VAL_12:.*]] = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}} -> (index, i32) { +! CHECK: } +! CHECK: fir.store %[[VAL_11]]#1 to %[[VAL_4]]#0 : !fir.ref +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.section { +! CHECK: %[[VAL_25:.*]]:2 = fir.do_loop %[[VAL_26:.*]] = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}}) -> (index, i32) { +! CHECK: } +! CHECK: fir.store %[[VAL_25]]#1 to %[[VAL_4]]#0 : !fir.ref +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } From flang-commits at lists.llvm.org Thu May 8 02:11:45 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 02:11:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. (PR #138627) In-Reply-To: Message-ID: <681c7551.050a0220.60ca4.6bdf@mx.google.com> https://github.com/tblah approved this pull request. Thanks for the update. LGTM, but wait for Kiran's approval too. https://github.com/llvm/llvm-project/pull/138627 From flang-commits at lists.llvm.org Thu May 8 02:21:55 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 02:21:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Use box for components with non-default lower bounds (PR #138994) In-Reply-To: Message-ID: <681c77b3.170a0220.3613ba.2366@mx.google.com> https://github.com/tblah approved this pull request. 
LGTM, thanks https://github.com/llvm/llvm-project/pull/138994 From flang-commits at lists.llvm.org Thu May 8 02:30:15 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 02:30:15 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Treat hlfir.associate as Allocate for FIR alias analysis. (PR #139004) In-Reply-To: Message-ID: <681c79a7.050a0220.abbe0.7b7c@mx.google.com> https://github.com/tblah approved this pull request. LGTM. Maybe this assumption should be documented for future reference in the hlfir.associate operation definition, because otherwise the connection between the assumption here and the implementation of HLFIR bufferization is not that obvious. https://github.com/llvm/llvm-project/pull/139004 From flang-commits at lists.llvm.org Thu May 8 02:40:35 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 02:40:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790) In-Reply-To: Message-ID: <681c7c13.170a0220.bed2b.e6ae@mx.google.com> https://github.com/tblah approved this pull request. LGTM, thanks! https://github.com/llvm/llvm-project/pull/137790 From flang-commits at lists.llvm.org Thu May 8 02:41:47 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 02:41:47 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681c7c5b.170a0220.1160f0.59a5@mx.google.com> https://github.com/tblah approved this pull request. LGTM, thanks! 
https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Thu May 8 03:09:24 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 03:09:24 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [mlir][OpenMP] cancel(lation point) taskgroup LLVMIR (PR #137841) In-Reply-To: Message-ID: <681c82d4.050a0220.36835f.7c57@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/137841 From flang-commits at lists.llvm.org Thu May 8 03:15:49 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 03:15:49 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [mlir][OpenMP] cancel(lation point) taskgroup LLVMIR (PR #137841) In-Reply-To: Message-ID: <681c8455.170a0220.151805.e72f@mx.google.com> https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/137841


From flang-commits at lists.llvm.org Thu May 8 03:16:01 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 03:16:01 -0700 (PDT) Subject: [flang-commits] [flang] e402009 - [mlir][OpenMP] cancel(lation point) taskgroup LLVMIR (#137841) Message-ID: <681c8461.170a0220.2fa0d3.2ef9@mx.google.com> Author: Tom Eccles Date: 2025-05-08T11:15:58+01:00 New Revision: e40200901cf1af860db9ded5c03b7b104396e429 URL: https://github.com/llvm/llvm-project/commit/e40200901cf1af860db9ded5c03b7b104396e429 DIFF: https://github.com/llvm/llvm-project/commit/e40200901cf1af860db9ded5c03b7b104396e429.diff LOG: [mlir][OpenMP] cancel(lation point) taskgroup LLVMIR (#137841) A cancel or cancellation point for taskgroup is always nested inside of a task inside of the taskgroup. For the task which is cancelled, it is that task which needs to be cleaned up: not the owning taskgroup. Therefore the cancellation branch handler is done in the conversion of the task not in conversion of taskgroup. I added a firstprivate clause to the test for cancel taskgroup to demonstrate that the block being branched to is the same block where mandatory cleanup code is added. Cancellation point follows exactly the same code path. 
Added: Modified: flang/docs/OpenMPSupport.md mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp mlir/test/Target/LLVMIR/openmp-cancel.mlir mlir/test/Target/LLVMIR/openmp-cancellation-point.mlir mlir/test/Target/LLVMIR/openmp-todo.mlir Removed: ################################################################################ diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md index 587723890d226..28e13d3179bd2 100644 --- a/flang/docs/OpenMPSupport.md +++ b/flang/docs/OpenMPSupport.md @@ -51,8 +51,8 @@ Note : No distinction is made between the support in Parser/Semantics, MLIR, Low | depend clause | P | depend clause with array sections are not supported | | declare reduction construct | N | | | atomic construct extensions | Y | | -| cancel construct | N | | -| cancellation point construct | N | | +| cancel construct | Y | | +| cancellation point construct | Y | | | parallel do simd construct | P | linear clause is not supported | | target teams construct | P | device and reduction clauses are not supported | | teams distribute construct | P | reduction and dist_schedule clauses not supported | diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 2e8e5a6ca5c2a..9f7b5605556e6 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -161,8 +161,18 @@ static LogicalResult checkImplementationStatus(Operation &op) { auto checkCancelDirective = [&todo](auto op, LogicalResult &result) { omp::ClauseCancellationConstructType cancelledDirective = op.getCancelDirective(); - if (cancelledDirective == omp::ClauseCancellationConstructType::Taskgroup) - result = todo("cancel directive construct type not yet supported"); + // Cancelling a taskloop is not yet supported because we don't yet have LLVM + // IR conversion for taskloop + if 
(cancelledDirective == omp::ClauseCancellationConstructType::Taskgroup) { + Operation *parent = op->getParentOp(); + while (parent) { + if (parent->getDialect() == op->getDialect()) + break; + parent = parent->getParentOp(); + } + if (isa_and_nonnull(parent)) + result = todo("cancel directive inside of taskloop"); + } }; auto checkDepend = [&todo](auto op, LogicalResult &result) { if (!op.getDependVars().empty() || op.getDependKinds()) @@ -1889,6 +1899,55 @@ buildDependData(std::optional dependKinds, OperandRange dependVars, } } +/// Shared implementation of a callback which adds a termiator for the new block +/// created for the branch taken when an openmp construct is cancelled. The +/// terminator is saved in \p cancelTerminators. This callback is invoked only +/// if there is cancellation inside of the taskgroup body. +/// The terminator will need to be fixed to branch to the correct block to +/// cleanup the construct. +static void +pushCancelFinalizationCB(SmallVectorImpl &cancelTerminators, + llvm::IRBuilderBase &llvmBuilder, + llvm::OpenMPIRBuilder &ompBuilder, mlir::Operation *op, + llvm::omp::Directive cancelDirective) { + auto finiCB = [&](llvm::OpenMPIRBuilder::InsertPointTy ip) -> llvm::Error { + llvm::IRBuilderBase::InsertPointGuard guard(llvmBuilder); + + // ip is currently in the block branched to if cancellation occured. + // We need to create a branch to terminate that block. + llvmBuilder.restoreIP(ip); + + // We must still clean up the construct after cancelling it, so we need to + // branch to the block that finalizes the taskgroup. + // That block has not been created yet so use this block as a dummy for now + // and fix this after creating the operation. 
+ cancelTerminators.push_back(llvmBuilder.CreateBr(ip.getBlock())); + return llvm::Error::success(); + }; + // We have to add the cleanup to the OpenMPIRBuilder before the body gets + // created in case the body contains omp.cancel (which will then expect to be + // able to find this cleanup callback). + ompBuilder.pushFinalizationCB( + {finiCB, cancelDirective, constructIsCancellable(op)}); +} + +/// If we cancelled the construct, we should branch to the finalization block of +/// that construct. OMPIRBuilder structures the CFG such that the cleanup block +/// is immediately before the continuation block. Now this finalization has +/// been created we can fix the branch. +static void +popCancelFinalizationCB(const ArrayRef cancelTerminators, + llvm::OpenMPIRBuilder &ompBuilder, + const llvm::OpenMPIRBuilder::InsertPointTy &afterIP) { + ompBuilder.popFinalizationCB(); + llvm::BasicBlock *constructFini = afterIP.getBlock()->getSinglePredecessor(); + for (llvm::BranchInst *cancelBranch : cancelTerminators) { + assert(cancelBranch->getNumSuccessors() == 1 && + "cancel branch should have one target"); + cancelBranch->setSuccessor(0, constructFini); + } +} + namespace { /// TaskContextStructManager takes care of creating and freeing a structure /// containing information needed by the task body to execute. @@ -2202,6 +2261,14 @@ convertOmpTaskOp(omp::TaskOp taskOp, llvm::IRBuilderBase &builder, return llvm::Error::success(); }; + llvm::OpenMPIRBuilder &ompBuilder = *moduleTranslation.getOpenMPBuilder(); + SmallVector cancelTerminators; + // The directive to match here is OMPD_taskgroup because it is the taskgroup + // which is canceled. This is handled here because it is the task's cleanup + // block which should be branched to. 
+ pushCancelFinalizationCB(cancelTerminators, builder, ompBuilder, taskOp, + llvm::omp::Directive::OMPD_taskgroup); + SmallVector dds; buildDependData(taskOp.getDependKinds(), taskOp.getDependVars(), moduleTranslation, dds); @@ -2219,6 +2286,9 @@ convertOmpTaskOp(omp::TaskOp taskOp, llvm::IRBuilderBase &builder, if (failed(handleError(afterIP, *taskOp))) return failure(); + // Set the correct branch target for task cancellation + popCancelFinalizationCB(cancelTerminators, ompBuilder, afterIP.get()); + builder.restoreIP(*afterIP); return success(); } @@ -2349,28 +2419,8 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, : llvm::omp::WorksharingLoopType::ForStaticLoop; SmallVector cancelTerminators; - // This callback is invoked only if there is cancellation inside of the wsloop - // body. - auto finiCB = [&](llvm::OpenMPIRBuilder::InsertPointTy ip) -> llvm::Error { - llvm::IRBuilderBase &llvmBuilder = ompBuilder->Builder; - llvm::IRBuilderBase::InsertPointGuard guard(llvmBuilder); - - // ip is currently in the block branched to if cancellation occured. - // We need to create a branch to terminate that block. - llvmBuilder.restoreIP(ip); - - // We must still clean up the wsloop after cancelling it, so we need to - // branch to the block that finalizes the wsloop. - // That block has not been created yet so use this block as a dummy for now - // and fix this after creating the wsloop. - cancelTerminators.push_back(llvmBuilder.CreateBr(ip.getBlock())); - return llvm::Error::success(); - }; - // We have to add the cleanup to the OpenMPIRBuilder before the body gets - // created in case the body contains omp.cancel (which will then expect to be - // able to find this cleanup callback). 
- ompBuilder->pushFinalizationCB({finiCB, llvm::omp::Directive::OMPD_for, - constructIsCancellable(wsloopOp)}); + pushCancelFinalizationCB(cancelTerminators, builder, *ompBuilder, wsloopOp, + llvm::omp::Directive::OMPD_for); llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder); llvm::Expected regionBlock = convertOmpOpRegions( @@ -2393,18 +2443,8 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, if (failed(handleError(wsloopIP, opInst))) return failure(); - ompBuilder->popFinalizationCB(); - if (!cancelTerminators.empty()) { - // If we cancelled the loop, we should branch to the finalization block of - // the wsloop (which is always immediately before the loop continuation - // block). Now the finalization has been created, we can fix the branch. - llvm::BasicBlock *wsloopFini = wsloopIP->getBlock()->getSinglePredecessor(); - for (llvm::BranchInst *cancelBranch : cancelTerminators) { - assert(cancelBranch->getNumSuccessors() == 1 && - "cancel branch should have one target"); - cancelBranch->setSuccessor(0, wsloopFini); - } - } + // Set the correct branch target for task cancellation + popCancelFinalizationCB(cancelTerminators, *ompBuilder, wsloopIP.get()); // Process the reductions if required. 
if (failed(createReductionsAndCleanup( @@ -3060,12 +3100,12 @@ static llvm::omp::Directive convertCancellationConstructType( static LogicalResult convertOmpCancel(omp::CancelOp op, llvm::IRBuilderBase &builder, LLVM::ModuleTranslation &moduleTranslation) { - llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder); - llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder(); - if (failed(checkImplementationStatus(*op.getOperation()))) return failure(); + llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder); + llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder(); + llvm::Value *ifCond = nullptr; if (Value ifVar = op.getIfExpr()) ifCond = moduleTranslation.lookupValue(ifVar); @@ -3088,12 +3128,12 @@ static LogicalResult convertOmpCancellationPoint(omp::CancellationPointOp op, llvm::IRBuilderBase &builder, LLVM::ModuleTranslation &moduleTranslation) { - llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder); - llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder(); - if (failed(checkImplementationStatus(*op.getOperation()))) return failure(); + llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder); + llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder(); + llvm::omp::Directive cancelledDirective = convertCancellationConstructType(op.getCancelDirective()); diff --git a/mlir/test/Target/LLVMIR/openmp-cancel.mlir b/mlir/test/Target/LLVMIR/openmp-cancel.mlir index 3c195a98d1000..21241702ad569 100644 --- a/mlir/test/Target/LLVMIR/openmp-cancel.mlir +++ b/mlir/test/Target/LLVMIR/openmp-cancel.mlir @@ -243,3 +243,51 @@ llvm.func @cancel_wsloop_if(%lb : i32, %ub : i32, %step : i32, %cond : i1) { // CHECK: ret void // CHECK: .cncl: ; preds = %[[VAL_44]] // CHECK: br label %[[VAL_38]] + +omp.private {type = firstprivate} @i32_priv : i32 copy { +^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr): + %0 = llvm.load %arg0 : !llvm.ptr -> i32 + llvm.store %0, %arg1 : i32, !llvm.ptr + omp.yield(%arg1 : 
!llvm.ptr) +} + +llvm.func @do_something(!llvm.ptr) + +llvm.func @cancel_taskgroup(%arg0: !llvm.ptr) { + omp.taskgroup { +// Using firstprivate clause so we have some end of task cleanup to branch to +// after the cancellation. + omp.task private(@i32_priv %arg0 -> %arg1 : !llvm.ptr) { + omp.cancel cancellation_construct_type(taskgroup) + llvm.call @do_something(%arg1) : (!llvm.ptr) -> () + omp.terminator + } + omp.terminator + } + llvm.return +} +// CHECK-LABEL: define internal void @cancel_taskgroup..omp_par( +// CHECK: task.alloca: +// CHECK: %[[VAL_21:.*]] = load ptr, ptr %[[VAL_22:.*]], align 8 +// CHECK: %[[VAL_23:.*]] = getelementptr { ptr }, ptr %[[VAL_21]], i32 0, i32 0 +// CHECK: %[[VAL_24:.*]] = load ptr, ptr %[[VAL_23]], align 8, !align !1 +// CHECK: br label %[[VAL_25:.*]] +// CHECK: task.body: ; preds = %[[VAL_26:.*]] +// CHECK: %[[VAL_27:.*]] = getelementptr { i32 }, ptr %[[VAL_24]], i32 0, i32 0 +// CHECK: br label %[[VAL_28:.*]] +// CHECK: omp.task.region: ; preds = %[[VAL_25]] +// CHECK: %[[VAL_29:.*]] = call i32 @__kmpc_global_thread_num(ptr @1) +// CHECK: %[[VAL_30:.*]] = call i32 @__kmpc_cancel(ptr @1, i32 %[[VAL_29]], i32 4) +// CHECK: %[[VAL_31:.*]] = icmp eq i32 %[[VAL_30]], 0 +// CHECK: br i1 %[[VAL_31]], label %omp.task.region.split, label %omp.task.region.cncl +// CHECK: omp.task.region.cncl: +// CHECK: br label %omp.region.cont2 +// CHECK: omp.region.cont2: +// Both cancellation and normal paths reach the end-of-task cleanup: +// CHECK: tail call void @free(ptr %[[VAL_24]]) +// CHECK: br label %task.exit.exitStub +// CHECK: omp.task.region.split: +// CHECK: call void @do_something(ptr %[[VAL_27]]) +// CHECK: br label %omp.region.cont2 +// CHECK: task.exit.exitStub: +// CHECK: ret void diff --git a/mlir/test/Target/LLVMIR/openmp-cancellation-point.mlir b/mlir/test/Target/LLVMIR/openmp-cancellation-point.mlir index bbb313c113567..5e0d3f9f7e293 100644 --- a/mlir/test/Target/LLVMIR/openmp-cancellation-point.mlir +++ 
b/mlir/test/Target/LLVMIR/openmp-cancellation-point.mlir @@ -186,3 +186,33 @@ llvm.func @cancellation_point_wsloop(%lb : i32, %ub : i32, %step : i32) { // CHECK: ret void // CHECK: omp.loop_nest.region.cncl: ; preds = %[[VAL_100]] // CHECK: br label %[[VAL_96]] + + +llvm.func @cancellation_point_taskgroup() { + omp.taskgroup { + omp.task { + omp.cancellation_point cancellation_construct_type(taskgroup) + omp.terminator + } + omp.terminator + } + llvm.return +} +// CHECK-LABEL: define internal void @cancellation_point_taskgroup..omp_par( +// CHECK: task.alloca: +// CHECK: br label %[[VAL_50:.*]] +// CHECK: task.body: ; preds = %[[VAL_51:.*]] +// CHECK: br label %[[VAL_52:.*]] +// CHECK: omp.task.region: ; preds = %[[VAL_50]] +// CHECK: %[[VAL_53:.*]] = call i32 @__kmpc_global_thread_num(ptr @1) +// CHECK: %[[VAL_54:.*]] = call i32 @__kmpc_cancellationpoint(ptr @1, i32 %[[VAL_53]], i32 4) +// CHECK: %[[VAL_55:.*]] = icmp eq i32 %[[VAL_54]], 0 +// CHECK: br i1 %[[VAL_55]], label %omp.task.region.split, label %omp.task.region.cncl +// CHECK: omp.task.region.cncl: +// CHECK: br label %omp.region.cont1 +// CHECK: omp.region.cont1: +// CHECK: br label %task.exit.exitStub +// CHECK: omp.task.region.split: +// CHECK: br label %omp.region.cont1 +// CHECK: task.exit.exitStub: +// CHECK: ret void diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir index a67c61f23631f..9a83b46efddca 100644 --- a/mlir/test/Target/LLVMIR/openmp-todo.mlir +++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir @@ -26,40 +26,6 @@ llvm.func @atomic_hint(%v : !llvm.ptr, %x : !llvm.ptr, %expr : i32) { // ----- -llvm.func @cancel_taskgroup() { - // expected-error at below {{LLVM Translation failed for operation: omp.taskgroup}} - omp.taskgroup { - // expected-error at below {{LLVM Translation failed for operation: omp.task}} - omp.task { - // expected-error at below {{not yet implemented: Unhandled clause cancel directive construct type not yet supported in 
omp.cancel operation}} - // expected-error at below {{LLVM Translation failed for operation: omp.cancel}} - omp.cancel cancellation_construct_type(taskgroup) - omp.terminator - } - omp.terminator - } - llvm.return -} - -// ----- - -llvm.func @cancellation_point_taskgroup() { - // expected-error at below {{LLVM Translation failed for operation: omp.taskgroup}} - omp.taskgroup { - // expected-error at below {{LLVM Translation failed for operation: omp.task}} - omp.task { - // expected-error at below {{not yet implemented: Unhandled clause cancel directive construct type not yet supported in omp.cancellation_point operation}} - // expected-error at below {{LLVM Translation failed for operation: omp.cancellation_point}} - omp.cancellation_point cancellation_construct_type(taskgroup) - omp.terminator - } - omp.terminator - } - llvm.return -} - -// ----- - llvm.func @do_simd(%lb : i32, %ub : i32, %step : i32) { omp.wsloop { // expected-warning at below {{simd information on composite construct discarded}} From flang-commits at lists.llvm.org Thu May 8 03:16:05 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 03:16:05 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [mlir][OpenMP] cancel(lation point) taskgroup LLVMIR (PR #137841) In-Reply-To: Message-ID: <681c8465.630a0220.d9d9d.b16d@mx.google.com> https://github.com/tblah closed https://github.com/llvm/llvm-project/pull/137841 From flang-commits at lists.llvm.org Thu May 8 04:22:56 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 04:22:56 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) In-Reply-To: Message-ID: <681c9410.050a0220.31bb51.aae9@mx.google.com> ================ @@ -156,9 +156,9 @@ genBoundsOpsFromBox(fir::FirOpBuilder &builder, mlir::Location loc, builder.genIfOp(loc, resTypes, info.isPresent, 
/*withElseRegion=*/true) .genThen([&]() { mlir::Value box = - !fir::isBoxAddress(info.addr.getType()) + !fir::isBoxAddress(info.rawInput.getType()) ? info.addr - : builder.create(loc, info.addr); + : builder.create(loc, info.rawInput); ---------------- jeanPerier wrote: Thanks for the details. Isn't the load suposed to not be done on the `addr` of OPTIONAL (there seems to be some guard for optional [here](https://github.com/llvm/llvm-project/blob/82387ec13258f67a530ddb615a49e0f36e8575e1/flang/include/flang/Optimizer/Builder/DirectivesCommon.h#L85), but maybe that is not covering your case). https://github.com/llvm/llvm-project/pull/138210 From flang-commits at lists.llvm.org Thu May 8 04:23:01 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 04:23:01 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) In-Reply-To: Message-ID: <681c9415.050a0220.3b600.7a5d@mx.google.com> https://github.com/jeanPerier edited https://github.com/llvm/llvm-project/pull/138210 From flang-commits at lists.llvm.org Thu May 8 04:26:44 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 04:26:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Propagate contiguous attribute through HLFIR. (PR #138797) In-Reply-To: Message-ID: <681c94f4.170a0220.3079f9.5fe2@mx.google.com> jeanPerier wrote: > Yes, we've been talking about the pass-through operation interface in the context of FIR alias analysis. I would be glad to introduce some abstraction (especially that I am going to add another case in FIR alias analysis in another patch), but I do not see too much generality between FIR alias analysis and the attribute propagation so far. So I do not have a good proposal right now, but the idea is still perfectly valid. +1 here. I think we are just waiting to have more use cases to come with a reasonable and robust abstraction. 
As long as the backward walks are centralized as much as possible in HLFIRTools.cpp/Alias analysis, changing the model should not be a huge change. https://github.com/llvm/llvm-project/pull/138797 From flang-commits at lists.llvm.org Thu May 8 05:24:35 2025 From: flang-commits at lists.llvm.org (Yussur Mustafa Oraji via flang-commits) Date: Thu, 08 May 2025 05:24:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <681ca283.050a0220.8114f.cc9a@mx.google.com> N00byKing wrote: @akuhlens @eugeneepshteyn Ping for review. I dont have write access and can't specify reviewers. I apologize in advance should this not be correct https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Thu May 8 06:00:25 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Thu, 08 May 2025 06:00:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <681caae9.050a0220.2166c1.d15d@mx.google.com> eugeneepshteyn wrote: @N00byKing , sorry it's the first time I see it, because you tagged me in the comment. In the future, to request review, please add people to Reviewers list. https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Thu May 8 06:06:58 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Thu, 08 May 2025 06:06:58 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <681cac72.630a0220.155cf2.eedf@mx.google.com> ================ @@ -121,6 +121,8 @@ class Preprocessor { std::list names_; std::unordered_map definitions_; std::stack ifStack_; + + unsigned CounterValue = 0; ---------------- eugeneepshteyn wrote: We typically use `unsigned int` in this code. 
Also, the variable names should start with lower case and member variables should end with `_`. Also, this code uses `{}` initialization. ```suggestion unsigned int counter_val_{0}; ``` https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Thu May 8 06:07:28 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Thu, 08 May 2025 06:07:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <681cac90.170a0220.2cf2f6.e200@mx.google.com> eugeneepshteyn wrote: Also adding @klausler , since this change implements GNU extension. https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Thu May 8 06:13:08 2025 From: flang-commits at lists.llvm.org (Yussur Mustafa Oraji via flang-commits) Date: Thu, 08 May 2025 06:13:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <681cade4.170a0220.202f91.e221@mx.google.com> https://github.com/N00byKing updated https://github.com/llvm/llvm-project/pull/136827 >From 0b007f31fbc6f54571cae051f00d071edbe4962f Mon Sep 17 00:00:00 2001 From: Yussur Mustafa Oraji Date: Wed, 23 Apr 2025 10:33:04 +0200 Subject: [PATCH] [flang] Add __COUNTER__ preprocessor macro --- flang/include/flang/Parser/preprocessor.h | 2 ++ flang/lib/Parser/preprocessor.cpp | 4 ++++ flang/test/Preprocessing/counter.F90 | 9 +++++++++ 3 files changed, 15 insertions(+) create mode 100644 flang/test/Preprocessing/counter.F90 diff --git a/flang/include/flang/Parser/preprocessor.h b/flang/include/flang/Parser/preprocessor.h index 86528a7e68def..6209f993f8ed0 100644 --- a/flang/include/flang/Parser/preprocessor.h +++ b/flang/include/flang/Parser/preprocessor.h @@ -121,6 +121,8 @@ class Preprocessor { std::list names_; std::unordered_map definitions_; std::stack ifStack_; + + unsigned int counter_val_{0}; }; } // namespace Fortran::parser 
#endif // FORTRAN_PARSER_PREPROCESSOR_H_ diff --git a/flang/lib/Parser/preprocessor.cpp b/flang/lib/Parser/preprocessor.cpp index a47f9c32ad27c..f4f716f7c1269 100644 --- a/flang/lib/Parser/preprocessor.cpp +++ b/flang/lib/Parser/preprocessor.cpp @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -299,6 +300,7 @@ void Preprocessor::DefineStandardMacros() { Define("__FILE__"s, "__FILE__"s); Define("__LINE__"s, "__LINE__"s); Define("__TIMESTAMP__"s, "__TIMESTAMP__"s); + Define("__COUNTER__"s, "__COUNTER__"s); } void Preprocessor::Define(const std::string &macro, const std::string &value) { @@ -421,6 +423,8 @@ std::optional Preprocessor::MacroReplacement( repl = "\""s + time + '"'; } } + } else if (name == "__COUNTER__") { + repl = std::to_string(CounterValue++); } if (!repl.empty()) { ProvenanceRange insert{allSources_.AddCompilerInsertion(repl)}; diff --git a/flang/test/Preprocessing/counter.F90 b/flang/test/Preprocessing/counter.F90 new file mode 100644 index 0000000000000..9761c8fb7f355 --- /dev/null +++ b/flang/test/Preprocessing/counter.F90 @@ -0,0 +1,9 @@ +! RUN: %flang -E %s | FileCheck %s +! CHECK: print *, 0 +! CHECK: print *, 1 +! CHECK: print *, 2 +! Check incremental counter macro +#define foo bar +print *, __COUNTER__ +print *, __COUNTER__ +print *, __COUNTER__ From flang-commits at lists.llvm.org Thu May 8 06:14:40 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 06:14:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ [NO]INLINE and FORCEINLINE directives (PR #134350) In-Reply-To: Message-ID: <681cae40.170a0220.2e2888.f01b@mx.google.com> tblah wrote: @JDPailleux is this now blocked? 
https://github.com/llvm/llvm-project/pull/134350 From flang-commits at lists.llvm.org Thu May 8 06:15:29 2025 From: flang-commits at lists.llvm.org (Yussur Mustafa Oraji via flang-commits) Date: Thu, 08 May 2025 06:15:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <681cae71.630a0220.668d6.0be6@mx.google.com> N00byKing wrote: Thank you for the quick response! I've adjusted the variable style. I will make sure to check discourse first next time :) https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Thu May 8 06:20:06 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Thu, 08 May 2025 06:20:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <681caf86.050a0220.1c65cf.8bed@mx.google.com> ================ @@ -421,6 +423,8 @@ std::optional Preprocessor::MacroReplacement( repl = "\""s + time + '"'; } } + } else if (name == "__COUNTER__") { + repl = std::to_string(CounterValue++); ---------------- eugeneepshteyn wrote: Well, ok, since you accepted the variable renaming above, you should also rename it here :) https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Thu May 8 06:20:36 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Thu, 08 May 2025 06:20:36 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681cafa4.170a0220.199d49.7520@mx.google.com> ================ @@ -78,10 +78,14 @@ class ClauseProcessor { mlir::omp::HasDeviceAddrClauseOps &result, llvm::SmallVectorImpl &hasDeviceSyms) const; bool processHint(mlir::omp::HintClauseOps &result) const; + bool processGrainsize(lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const; ---------------- skatrak 
wrote: Nit: This should go before `processHasDeviceAddr`. Also in ClauseProcessor.cpp. https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Thu May 8 06:20:36 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Thu, 08 May 2025 06:20:36 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681cafa4.050a0220.77a60.d0c1@mx.google.com> https://github.com/skatrak approved this pull request. Thank you Kaviya, just minor non-blocking nits from me! https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Thu May 8 06:20:36 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Thu, 08 May 2025 06:20:36 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681cafa4.630a0220.271d88.01e3@mx.google.com> https://github.com/skatrak edited https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Thu May 8 06:20:36 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Thu, 08 May 2025 06:20:36 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681cafa4.050a0220.8114f.d453@mx.google.com> ================ @@ -365,6 +365,27 @@ bool ClauseProcessor::processHint(mlir::omp::HintClauseOps &result) const { return false; } +bool ClauseProcessor::processGrainsize( + lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const { + using grainsize = omp::clause::Grainsize; ---------------- skatrak wrote: Nit: Use upper case for types. Same comment for `numtasks -> NumTasks` below. 
```suggestion using Grainsize = omp::clause::Grainsize; ``` https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Thu May 8 06:21:40 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 06:21:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang][docs][OpenMP] array sections with DEPEND are supported (PR #139081) Message-ID: https://github.com/tblah created https://github.com/llvm/llvm-project/pull/139081 This was added in - https://github.com/llvm/llvm-project/pull/132230 - https://github.com/llvm/llvm-project/pull/132994 - https://github.com/llvm/llvm-project/pull/133892
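A note on the `__COUNTER__` thread (PR #136827) earlier in this digest: the patch keeps one counter per preprocessor instance, substitutes its current value whenever the macro name is replaced, and then increments it. The snippet below is only a minimal standalone sketch of that behavior, not flang's code. `TinyPreprocessor` and `Expand` are hypothetical names, and the naive substring search is an assumption standing in for flang's token-based `MacroReplacement`. The member uses the trailing-underscore name and `{}` initialization requested in the review.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a preprocessor; this is NOT flang's Preprocessor.
class TinyPreprocessor {
public:
  // Replace every occurrence of __COUNTER__ in `line` with the next counter
  // value. Assumption: plain substring matching, whereas a real preprocessor
  // works on tokens and would not match inside identifiers or strings.
  std::string Expand(const std::string &line) {
    static const std::string name{"__COUNTER__"};
    std::string out;
    std::size_t pos{0};
    while (true) {
      std::size_t hit{line.find(name, pos)};
      if (hit == std::string::npos) {
        out += line.substr(pos); // copy the remainder unchanged
        break;
      }
      out += line.substr(pos, hit - pos);    // text before the macro name
      out += std::to_string(counter_val_++); // use the value, then increment
      pos = hit + name.size();
    }
    return out;
  }

private:
  unsigned int counter_val_{0}; // lower case, trailing '_', brace-initialized
};
```

Under this sketch, expanding the three `print *, __COUNTER__` lines from the patch's test file in order yields `print *, 0`, `print *, 1` and `print *, 2`, which is what the CHECK lines in `counter.F90` expect.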


From flang-commits at lists.llvm.org Thu May 8 06:50:57 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 06:50:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Propagate contiguous attribute through HLFIR. (PR #138797) In-Reply-To: Message-ID: <681cb6c1.a70a0220.2372e6.eafc@mx.google.com> tblah wrote: > +1 here. I think we are just waiting to have more use cases to come with a reasonable and robust abstraction. Here is another case which may or may not fit depending upon the abstraction https://github.com/llvm/llvm-project/blob/main/flang/lib/Optimizer/OpenMP/LowerNontemporal.cpp https://github.com/llvm/llvm-project/pull/138797 From flang-commits at lists.llvm.org Thu May 8 07:04:39 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Thu, 08 May 2025 07:04:39 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681cb9f7.170a0220.1acd96.a426@mx.google.com> ================ @@ -365,6 +365,27 @@ bool ClauseProcessor::processHint(mlir::omp::HintClauseOps &result) const { return false; } +bool ClauseProcessor::processGrainsize( + lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const { + using grainsize = omp::clause::Grainsize; ---------------- kaviya2510 wrote: Sure, I will update it. 
Thanks for the review @skatrak https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Thu May 8 07:13:08 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 07:13:08 -0700 (PDT) Subject: [flang-commits] [flang] 3ed158f - [flang][docs][OpenMP] array sections with DEPEND are supported (#139081) Message-ID: <681cbbf4.650a0220.2021ee.5020@mx.google.com> Author: Tom Eccles Date: 2025-05-08T15:13:04+01:00 New Revision: 3ed158fab432fd92b9d3d1386477ae12fa493132 URL: https://github.com/llvm/llvm-project/commit/3ed158fab432fd92b9d3d1386477ae12fa493132 DIFF: https://github.com/llvm/llvm-project/commit/3ed158fab432fd92b9d3d1386477ae12fa493132.diff LOG: [flang][docs][OpenMP] array sections with DEPEND are supported (#139081) This was added in - https://github.com/llvm/llvm-project/pull/132230 - https://github.com/llvm/llvm-project/pull/132994 - https://github.com/llvm/llvm-project/pull/133892 Added: Modified: flang/docs/OpenMPSupport.md Removed: ################################################################################ diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md index 28e13d3179bd2..bde0bf724e480 100644 --- a/flang/docs/OpenMPSupport.md +++ b/flang/docs/OpenMPSupport.md @@ -48,7 +48,7 @@ Note : No distinction is made between the support in Parser/Semantics, MLIR, Low | distribute simd construct | P | dist_schedule and linear clauses are not supported | | distribute parallel loop construct | P | dist_schedule clause not supported | | distribute parallel loop simd construct | P | dist_schedule and linear clauses are not supported | -| depend clause | P | depend clause with array sections are not supported | +| depend clause | Y | | | declare reduction construct | N | | | atomic construct extensions | Y | | | cancel construct | Y | | From flang-commits at lists.llvm.org Thu May 8 07:24:45 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Thu, 08 
May 2025 07:24:45 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681cbead.a70a0220.1ade42.0bd5@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/128490 >From 2075eb0739938946e80b8e632f6512be735a04c7 Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Wed, 7 May 2025 11:53:51 +0530 Subject: [PATCH 1/4] [Flang][OpenMP]Added MLIR lowering for grainsize and num_tasks clause --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 42 +++++++++++++++ flang/lib/Lower/OpenMP/ClauseProcessor.h | 4 ++ flang/lib/Lower/OpenMP/OpenMP.cpp | 26 ++++----- .../test/Lower/OpenMP/taskloop-grainsize.f90 | 51 ++++++++++++++++++ flang/test/Lower/OpenMP/taskloop-numtasks.f90 | 54 +++++++++++++++++++ 5 files changed, 165 insertions(+), 12 deletions(-) create mode 100644 flang/test/Lower/OpenMP/taskloop-grainsize.f90 create mode 100644 flang/test/Lower/OpenMP/taskloop-numtasks.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 77b4622547d7a..ac940b5c74152 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -365,6 +365,27 @@ bool ClauseProcessor::processHint(mlir::omp::HintClauseOps &result) const { return false; } +bool ClauseProcessor::processGrainsize( + lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const { + using grainsize = omp::clause::Grainsize; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier) { + result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( + context, mlir::omp::ClauseGrainsizeType::Strict); + } + const auto &grainsizeExpr = std::get(clause->t); + result.grainsize = + 
fir::getBase(converter.genExprValue(grainsizeExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processInclusive( mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const { @@ -388,6 +409,27 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const { return markClauseOccurrence(result.nowait); } +bool ClauseProcessor::processNumTasks( + lower::StatementContext &stmtCtx, + mlir::omp::NumTasksClauseOps &result) const { + using numtasks = omp::clause::NumTasks; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier) { + result.numTasksMod = mlir::omp::ClauseNumTasksTypeAttr::get( + context, mlir::omp::ClauseNumTasksType::Strict); + } + const auto &numtasksExpr = std::get(clause->t); + result.numTasks = + fir::getBase(converter.genExprValue(numtasksExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processNumTeams( lower::StatementContext &stmtCtx, mlir::omp::NumTeamsClauseOps &result) const { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index bdddeb145b496..375e24b80fc21 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -78,10 +78,14 @@ class ClauseProcessor { mlir::omp::HasDeviceAddrClauseOps &result, llvm::SmallVectorImpl &hasDeviceSyms) const; bool processHint(mlir::omp::HintClauseOps &result) const; + bool processGrainsize(lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const; bool processInclusive(mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const; bool processMergeable(mlir::omp::MergeableClauseOps &result) const; bool processNowait(mlir::omp::NowaitClauseOps &result) const; + bool processNumTasks(lower::StatementContext &stmtCtx, + 
mlir::omp::NumTasksClauseOps &result) const; bool processNumTeams(lower::StatementContext &stmtCtx, mlir::omp::NumTeamsClauseOps &result) const; bool processNumThreads(lower::StatementContext &stmtCtx, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fcd3de9671098..af227b28d35b3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1806,17 +1806,19 @@ static void genTaskgroupClauses(lower::AbstractConverter &converter, static void genTaskloopClauses(lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, + lower::StatementContext &stmtCtx, const List &clauses, mlir::Location loc, mlir::omp::TaskloopOperands &clauseOps) { ClauseProcessor cp(converter, semaCtx, clauses); + cp.processGrainsize(stmtCtx, clauseOps); + cp.processNumTasks(stmtCtx, clauseOps); cp.processTODO( - loc, llvm::omp::Directive::OMPD_taskloop); + clause::Final, clause::If, clause::InReduction, + clause::Lastprivate, clause::Mergeable, clause::Nogroup, + clause::Priority, clause::Reduction, clause::Shared, + clause::Untied>(loc, llvm::omp::Directive::OMPD_taskloop); } static void genTaskwaitClauses(lower::AbstractConverter &converter, @@ -3268,12 +3270,12 @@ genStandaloneSimd(lower::AbstractConverter &converter, lower::SymMap &symTable, static mlir::omp::TaskloopOp genStandaloneTaskloop( lower::AbstractConverter &converter, lower::SymMap &symTable, - semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - mlir::Location loc, const ConstructQueue &queue, - ConstructQueue::const_iterator item) { + lower::StatementContext &stmtCtx, semantics::SemanticsContext &semaCtx, + lower::pft::Evaluation &eval, mlir::Location loc, + const ConstructQueue &queue, ConstructQueue::const_iterator item) { mlir::omp::TaskloopOperands taskloopClauseOps; - genTaskloopClauses(converter, semaCtx, item->clauses, loc, taskloopClauseOps); - + genTaskloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc, + 
taskloopClauseOps); DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval, /*shouldCollectPreDeterminedSymbols=*/true, enableDelayedPrivatization, symTable); @@ -3734,8 +3736,8 @@ static void genOMPDispatch(lower::AbstractConverter &converter, genTaskgroupOp(converter, symTable, semaCtx, eval, loc, queue, item); break; case llvm::omp::Directive::OMPD_taskloop: - newOp = genStandaloneTaskloop(converter, symTable, semaCtx, eval, loc, - queue, item); + newOp = genStandaloneTaskloop(converter, symTable, stmtCtx, semaCtx, eval, + loc, queue, item); break; case llvm::omp::Directive::OMPD_taskwait: newOp = genTaskwaitOp(converter, symTable, semaCtx, eval, loc, queue, item); diff --git a/flang/test/Lower/OpenMP/taskloop-grainsize.f90 b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 new file mode 100644 index 0000000000000..fa684ad213d0a --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 @@ -0,0 +1,51 @@ +! This test checks lowering of grainsize clause in taskloop directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE_TEST2:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_grainsize +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_grainsizeEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_grainsizeEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! 
CHECK: %[[ALLOCA_X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFtest_grainsizeEx"} +! CHECK: %[[DECL_X:.*]]:2 = hlfir.declare %[[ALLOCA_X]] {uniq_name = "_QFtest_grainsizeEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[GRAINSIZE:.*]] = arith.constant 10 : i32 +subroutine test_grainsize + integer :: i, x + ! CHECK: omp.taskloop grainsize(%[[GRAINSIZE]]: i32) + ! CHECK-SAME: private(@[[X_FIRSTPRIVATE]] %[[DECL_X]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { + ! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { + !$omp taskloop grainsize(10) + do i = 1, 1000 + x = x + 1 + end do + !$omp end taskloop +end subroutine test_grainsize + +!CHECK-LABEL: func.func @_QPtest_grainsize_strict() +subroutine test_grainsize_strict + integer :: i, x + ! CHECK: %[[GRAINSIZE:.*]] = arith.constant 10 : i32 + ! CHECK: omp.taskloop grainsize(strict, %[[GRAINSIZE]]: i32) + !$omp taskloop grainsize(strict:10) + do i = 1, 1000 + !CHECK: arith.addi + x = x + 1 + end do + !$omp end taskloop +end subroutine \ No newline at end of file diff --git a/flang/test/Lower/OpenMP/taskloop-numtasks.f90 b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 new file mode 100644 index 0000000000000..38f3975bbd371 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 @@ -0,0 +1,54 @@ +! This test checks lowering of num_tasks clause in taskloop directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE_TEST2:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! 
CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_num_tasks +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_num_tasksEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_num_tasksEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFtest_num_tasksEx"} +! CHECK: %[[DECL_X:.*]]:2 = hlfir.declare %[[ALLOCA_X]] {uniq_name = "_QFtest_num_tasksEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL_NUMTASKS:.*]] = arith.constant 10 : i32 +subroutine test_num_tasks + integer :: i, x + ! CHECK: omp.taskloop num_tasks(%[[VAL_NUMTASKS]]: i32) + ! CHECK-SAME: private(@[[X_FIRSTPRIVATE]] %[[DECL_X]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { + ! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { + !$omp taskloop num_tasks(10) + do i = 1, 1000 + x = x + 1 + end do + !$omp end taskloop +end subroutine test_num_tasks + +! CHECK-LABEL: func.func @_QPtest_num_tasks_strict +subroutine test_num_tasks_strict + integer :: x, i + ! CHECK: %[[NUM_TASKS:.*]] = arith.constant 10 : i32 + ! 
CHECK: omp.taskloop num_tasks(strict, %[[NUM_TASKS]]: i32) + !$omp taskloop num_tasks(strict:10) + do i = 1, 100 + !CHECK: arith.addi + x = x + 1 + end do + !$omp end taskloop +end subroutine + + + >From 916889f13e0a5d2b67d370e338ed1fcfd8c62b2a Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Wed, 7 May 2025 12:04:30 +0530 Subject: [PATCH 2/4] [Flang][OpenMP] Formatting fix --- flang/test/Lower/OpenMP/taskloop-grainsize.f90 | 2 +- flang/test/Lower/OpenMP/taskloop-numtasks.f90 | 3 --- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/flang/test/Lower/OpenMP/taskloop-grainsize.f90 b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 index fa684ad213d0a..43db8acdeceac 100644 --- a/flang/test/Lower/OpenMP/taskloop-grainsize.f90 +++ b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 @@ -48,4 +48,4 @@ subroutine test_grainsize_strict x = x + 1 end do !$omp end taskloop -end subroutine \ No newline at end of file +end subroutine diff --git a/flang/test/Lower/OpenMP/taskloop-numtasks.f90 b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 index 38f3975bbd371..f68f3a2d6ad26 100644 --- a/flang/test/Lower/OpenMP/taskloop-numtasks.f90 +++ b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 @@ -49,6 +49,3 @@ subroutine test_num_tasks_strict end do !$omp end taskloop end subroutine - - - >From 9558691252b5ab9ab693aa017a963f8a1a7fa1b1 Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Thu, 8 May 2025 13:13:06 +0530 Subject: [PATCH 3/4] [Flang][OpenMP] Added a condition to checks if the modifier is 'strict' --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index ac940b5c74152..e2a35cee5c75e 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -374,7 +374,7 @@ bool ClauseProcessor::processGrainsize( mlir::MLIRContext *context = firOpBuilder.getContext(); const auto 
&modifier = std::get>(clause->t); - if (modifier) { + if (modifier && *modifier == grainsize::Prescriptiveness::Strict) { result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( context, mlir::omp::ClauseGrainsizeType::Strict); } @@ -418,7 +418,7 @@ bool ClauseProcessor::processNumTasks( mlir::MLIRContext *context = firOpBuilder.getContext(); const auto &modifier = std::get>(clause->t); - if (modifier) { + if (modifier && *modifier == numtasks::Prescriptiveness::Strict) { result.numTasksMod = mlir::omp::ClauseNumTasksTypeAttr::get( context, mlir::omp::ClauseNumTasksType::Strict); } >From eb8d4d78e04f5274028fd55160bc7063db9eae18 Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Thu, 8 May 2025 19:44:09 +0530 Subject: [PATCH 4/4] Formatting fix and update variable names --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 50 +++++++++++----------- flang/lib/Lower/OpenMP/ClauseProcessor.h | 4 +- 2 files changed, 27 insertions(+), 27 deletions(-) diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index e2a35cee5c75e..7fbbcec753221 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -365,27 +365,6 @@ bool ClauseProcessor::processHint(mlir::omp::HintClauseOps &result) const { return false; } -bool ClauseProcessor::processGrainsize( - lower::StatementContext &stmtCtx, - mlir::omp::GrainsizeClauseOps &result) const { - using grainsize = omp::clause::Grainsize; - if (auto *clause = findUniqueClause()) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::MLIRContext *context = firOpBuilder.getContext(); - const auto &modifier = - std::get>(clause->t); - if (modifier && *modifier == grainsize::Prescriptiveness::Strict) { - result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( - context, mlir::omp::ClauseGrainsizeType::Strict); - } - const auto &grainsizeExpr = std::get(clause->t); - result.grainsize = - 
fir::getBase(converter.genExprValue(grainsizeExpr, stmtCtx)); - return true; - } - return false; -} - bool ClauseProcessor::processInclusive( mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const { @@ -412,13 +391,13 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const { bool ClauseProcessor::processNumTasks( lower::StatementContext &stmtCtx, mlir::omp::NumTasksClauseOps &result) const { - using numtasks = omp::clause::NumTasks; - if (auto *clause = findUniqueClause()) { + using NumTasks = omp::clause::NumTasks; + if (auto *clause = findUniqueClause()) { fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); mlir::MLIRContext *context = firOpBuilder.getContext(); const auto &modifier = - std::get>(clause->t); - if (modifier && *modifier == numtasks::Prescriptiveness::Strict) { + std::get>(clause->t); + if (modifier && *modifier == NumTasks::Prescriptiveness::Strict) { result.numTasksMod = mlir::omp::ClauseNumTasksTypeAttr::get( context, mlir::omp::ClauseNumTasksType::Strict); } @@ -976,6 +955,27 @@ bool ClauseProcessor::processDepend(lower::SymMap &symMap, return findRepeatableClause(process); } +bool ClauseProcessor::processGrainsize( + lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const { + using Grainsize = omp::clause::Grainsize; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier && *modifier == Grainsize::Prescriptiveness::Strict) { + result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( + context, mlir::omp::ClauseGrainsizeType::Strict); + } + const auto &grainsizeExpr = std::get(clause->t); + result.grainsize = + fir::getBase(converter.genExprValue(grainsizeExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processHasDeviceAddr( lower::StatementContext 
&stmtCtx, mlir::omp::HasDeviceAddrClauseOps &result, llvm::SmallVectorImpl &hasDeviceSyms) const { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 375e24b80fc21..9541f3585ead3 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -73,13 +73,13 @@ class ClauseProcessor { mlir::omp::FilterClauseOps &result) const; bool processFinal(lower::StatementContext &stmtCtx, mlir::omp::FinalClauseOps &result) const; + bool processGrainsize(lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const; bool processHasDeviceAddr( lower::StatementContext &stmtCtx, mlir::omp::HasDeviceAddrClauseOps &result, llvm::SmallVectorImpl &hasDeviceSyms) const; bool processHint(mlir::omp::HintClauseOps &result) const; - bool processGrainsize(lower::StatementContext &stmtCtx, - mlir::omp::GrainsizeClauseOps &result) const; bool processInclusive(mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const; bool processMergeable(mlir::omp::MergeableClauseOps &result) const; From flang-commits at lists.llvm.org Wed May 7 19:48:46 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 07 May 2025 19:48:46 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609) In-Reply-To: Message-ID: <681c1b8e.170a0220.1b7e1.bc8e@mx.google.com> https://github.com/jofrn updated https://github.com/llvm/llvm-project/pull/123609 >From 210b6d80bcfbbcd216f98199df386280724561e2 Mon Sep 17 00:00:00 2001 From: jofernau Date: Mon, 20 Jan 2025 04:51:26 -0800 Subject: [PATCH 01/30] [TargetVerifier][AMDGPU] Add TargetVerifier. This pass verifies the IR for an individual backend. This is different than Lint because it consolidates all checks for a given backend in a single pass. 
A check for Lint may be undefined behavior across all targets, whereas a check in TargetVerifier would only pertain to the specified target but can check more than just undefined behavior such as IR validity. A use case of this would be to reject programs with invalid IR while fuzzing. --- llvm/include/llvm/IR/Module.h | 4 + llvm/include/llvm/Target/TargetVerifier.h | 82 +++++++ .../TargetVerify/AMDGPUTargetVerifier.h | 36 +++ llvm/lib/IR/Verifier.cpp | 18 +- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 213 ++++++++++++++++++ llvm/lib/Target/AMDGPU/CMakeLists.txt | 1 + llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 62 +++++ llvm/tools/llvm-tgt-verify/CMakeLists.txt | 34 +++ .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 172 ++++++++++++++ 9 files changed, 618 insertions(+), 4 deletions(-) create mode 100644 llvm/include/llvm/Target/TargetVerifier.h create mode 100644 llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h create mode 100644 llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify.ll create mode 100644 llvm/tools/llvm-tgt-verify/CMakeLists.txt create mode 100644 llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h index 91ccd76c41e07..03c0cf1cf0924 100644 --- a/llvm/include/llvm/IR/Module.h +++ b/llvm/include/llvm/IR/Module.h @@ -214,6 +214,10 @@ class LLVM_ABI Module { /// @name Constructors /// @{ public: + /// Is this Module valid as determined by one of the verification passes + /// i.e. Lint, Verifier, TargetVerifier. + bool IsValid = true; + /// Is this Module using intrinsics to record the position of debugging /// information, or non-intrinsic records? See IsNewDbgInfoFormat in /// \ref BasicBlock. 
diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h new file mode 100644 index 0000000000000..e00c6a7b260c9 --- /dev/null +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -0,0 +1,82 @@ +//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines target verifier interfaces that can be used for some +// validation of input to the system, and for checking that transformations +// haven't done something bad. In contrast to the Verifier or Lint, the +// TargetVerifier looks for constructions invalid to a particular target +// machine. +// +// To see what specifically is checked, look at TargetVerifier.cpp or an +// individual backend's TargetVerifier. +// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_TARGET_VERIFIER_H +#define LLVM_TARGET_VERIFIER_H + +#include "llvm/IR/PassManager.h" +#include "llvm/IR/Module.h" +#include "llvm/TargetParser/Triple.h" + +namespace llvm { + +class Function; + +class TargetVerifierPass : public PassInfoMixin { +public: + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {} +}; + +class TargetVerify { +protected: + void WriteValues(ArrayRef Vs) { + for (const Value *V : Vs) { + if (!V) + continue; + if (isa(V)) { + MessagesStr << *V << '\n'; + } else { + V->printAsOperand(MessagesStr, true, Mod); + MessagesStr << '\n'; + } + } + } + + /// A check failed, so printout out the condition and the message. + /// + /// This provides a nice place to put a breakpoint if you want to see why + /// something is not correct. 
+ void CheckFailed(const Twine &Message) { MessagesStr << Message << '\n'; } + + /// A check failed (with values to print). + /// + /// This calls the Message-only version so that the above is easier to set + /// a breakpoint on. + template + void CheckFailed(const Twine &Message, const T1 &V1, const Ts &... Vs) { + CheckFailed(Message); + WriteValues({V1, Vs...}); + } +public: + Module *Mod; + Triple TT; + + std::string Messages; + raw_string_ostream MessagesStr; + + TargetVerify(Module *Mod) + : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())), + MessagesStr(Messages) {} + + void run(Function &F) {}; +}; + +} // namespace llvm + +#endif // LLVM_TARGET_VERIFIER_H diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h new file mode 100644 index 0000000000000..e6ff57629b141 --- /dev/null +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -0,0 +1,36 @@ +//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU ---*- C++ -*-===// +//// +//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +//// See https://llvm.org/LICENSE.txt for license information. +//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +//// +////===----------------------------------------------------------------------===// +//// +//// This file defines target verifier interfaces that can be used for some +//// validation of input to the system, and for checking that transformations +//// haven't done something bad. In contrast to the Verifier or Lint, the +//// TargetVerifier looks for constructions invalid to a particular target +//// machine. +//// +//// To see what specifically is checked, look at an individual backend's +//// TargetVerifier. 
+//// +////===----------------------------------------------------------------------===// + +#ifndef LLVM_AMDGPU_TARGET_VERIFIER_H +#define LLVM_AMDGPU_TARGET_VERIFIER_H + +#include "llvm/Target/TargetVerifier.h" + +namespace llvm { + +class Function; + +class AMDGPUTargetVerifierPass : public TargetVerifierPass { +public: + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); +}; + +} // namespace llvm + +#endif // LLVM_AMDGPU_TARGET_VERIFIER_H diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index 8afe360d088bc..9d21ca182ca13 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -135,6 +135,10 @@ static cl::opt VerifyNoAliasScopeDomination( cl::desc("Ensure that llvm.experimental.noalias.scope.decl for identical " "scopes are not dominating")); +static cl::opt + VerifyAbortOnError("verifier-abort-on-error", cl::init(false), + cl::desc("In the Verifier pass, abort on errors.")); + namespace llvm { struct VerifierSupport { @@ -7796,16 +7800,22 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F, PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { auto Res = AM.getResult(M); - if (FatalErrors && (Res.IRBroken || Res.DebugInfoBroken)) - report_fatal_error("Broken module found, compilation aborted!"); + if (Res.IRBroken || Res.DebugInfoBroken) { + M.IsValid = false; + if (VerifyAbortOnError && FatalErrors) + report_fatal_error("Broken module found, compilation aborted!"); + } return PreservedAnalyses::all(); } PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) { auto res = AM.getResult(F); - if (res.IRBroken && FatalErrors) - report_fatal_error("Broken function found, compilation aborted!"); + if (res.IRBroken) { + F.getParent()->IsValid = false; + if (VerifyAbortOnError && FatalErrors) + report_fatal_error("Broken function found, compilation aborted!"); + } return PreservedAnalyses::all(); } diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp new file mode 100644 index 0000000000000..585b19065c142 --- /dev/null +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -0,0 +1,213 @@ +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" + +#include "llvm/Analysis/UniformityAnalysis.h" +#include "llvm/Analysis/PostDominators.h" +#include "llvm/Support/Debug.h" +#include "llvm/IR/Dominators.h" +#include "llvm/IR/Function.h" +#include "llvm/IR/IntrinsicInst.h" +#include "llvm/IR/IntrinsicsAMDGPU.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/Value.h" + +#include "llvm/Support/raw_ostream.h" + +using namespace llvm; + +static cl::opt +MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); + +// Check - We know that cond should be true, if not print an error message. +#define Check(C, ...) \ + do { \ + if (!(C)) { \ + TargetVerify::CheckFailed(__VA_ARGS__); \ + return; \ + } \ + } while (false) + +static bool isMFMA(unsigned IID) { + switch (IID) { + case Intrinsic::amdgcn_mfma_f32_4x4x1f32: + case Intrinsic::amdgcn_mfma_f32_4x4x4f16: + case Intrinsic::amdgcn_mfma_i32_4x4x4i8: + case Intrinsic::amdgcn_mfma_f32_4x4x2bf16: + + case Intrinsic::amdgcn_mfma_f32_16x16x1f32: + case Intrinsic::amdgcn_mfma_f32_16x16x4f32: + case Intrinsic::amdgcn_mfma_f32_16x16x4f16: + case Intrinsic::amdgcn_mfma_f32_16x16x16f16: + case Intrinsic::amdgcn_mfma_i32_16x16x4i8: + case Intrinsic::amdgcn_mfma_i32_16x16x16i8: + case Intrinsic::amdgcn_mfma_f32_16x16x2bf16: + case Intrinsic::amdgcn_mfma_f32_16x16x8bf16: + + case Intrinsic::amdgcn_mfma_f32_32x32x1f32: + case Intrinsic::amdgcn_mfma_f32_32x32x2f32: + case Intrinsic::amdgcn_mfma_f32_32x32x4f16: + case Intrinsic::amdgcn_mfma_f32_32x32x8f16: + case Intrinsic::amdgcn_mfma_i32_32x32x4i8: + case Intrinsic::amdgcn_mfma_i32_32x32x8i8: + case Intrinsic::amdgcn_mfma_f32_32x32x2bf16: + case Intrinsic::amdgcn_mfma_f32_32x32x4bf16: + + case Intrinsic::amdgcn_mfma_f32_4x4x4bf16_1k: + case 
Intrinsic::amdgcn_mfma_f32_16x16x4bf16_1k: + case Intrinsic::amdgcn_mfma_f32_16x16x16bf16_1k: + case Intrinsic::amdgcn_mfma_f32_32x32x4bf16_1k: + case Intrinsic::amdgcn_mfma_f32_32x32x8bf16_1k: + + case Intrinsic::amdgcn_mfma_f64_16x16x4f64: + case Intrinsic::amdgcn_mfma_f64_4x4x4f64: + + case Intrinsic::amdgcn_mfma_i32_16x16x32_i8: + case Intrinsic::amdgcn_mfma_i32_32x32x16_i8: + case Intrinsic::amdgcn_mfma_f32_16x16x8_xf32: + case Intrinsic::amdgcn_mfma_f32_32x32x4_xf32: + + case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_bf8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_fp8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_bf8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_fp8: + + case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_bf8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_fp8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_bf8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_fp8: + return true; + default: + return false; + } +} + +namespace llvm { +class AMDGPUTargetVerify : public TargetVerify { +public: + Module *Mod; + + DominatorTree *DT; + PostDominatorTree *PDT; + UniformityInfo *UA; + + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) + : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} + + void run(Function &F); +}; + +static bool IsValidInt(const Type *Ty) { + return Ty->isIntegerTy(1) || + Ty->isIntegerTy(8) || + Ty->isIntegerTy(16) || + Ty->isIntegerTy(32) || + Ty->isIntegerTy(64) || + Ty->isIntegerTy(128); +} + +static bool isShader(CallingConv::ID CC) { + switch(CC) { + case CallingConv::AMDGPU_VS: + case CallingConv::AMDGPU_LS: + case CallingConv::AMDGPU_HS: + case CallingConv::AMDGPU_ES: + case CallingConv::AMDGPU_GS: + case CallingConv::AMDGPU_PS: + case CallingConv::AMDGPU_CS_Chain: + case CallingConv::AMDGPU_CS_ChainPreserve: + case CallingConv::AMDGPU_CS: + return true; + default: + return false; + } +} + +void AMDGPUTargetVerify::run(Function &F) { + // Ensure shader calling convention 
returns void + if (isShader(F.getCallingConv())) + Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void"); + + for (auto &BB : F) { + + for (auto &I : BB) { + if (MarkUniform) + outs() << UA->isUniform(&I) << ' ' << I << '\n'; + + // Ensure integral types are valid: i8, i16, i32, i64, i128 + if (I.getType()->isIntegerTy()) + Check(IsValidInt(I.getType()), "Int type is invalid.", &I); + for (unsigned i = 0; i < I.getNumOperands(); ++i) + if (I.getOperand(i)->getType()->isIntegerTy()) + Check(IsValidInt(I.getOperand(i)->getType()), + "Int type is invalid.", I.getOperand(i)); + + // Ensure no store to const memory + if (auto *SI = dyn_cast(&I)) + { + unsigned AS = SI->getPointerAddressSpace(); + Check(AS != 4, "Write to const memory", SI); + } + + // Ensure no kernel to kernel calls. + if (auto *CI = dyn_cast(&I)) + { + CallingConv::ID CalleeCC = CI->getCallingConv(); + if (CalleeCC == CallingConv::AMDGPU_KERNEL) + { + CallingConv::ID CallerCC = CI->getParent()->getParent()->getCallingConv(); + Check(CallerCC != CallingConv::AMDGPU_KERNEL, + "A kernel may not call a kernel", CI->getParent()->getParent()); + } + } + + // Ensure MFMA is not in control flow with diverging operands + if (auto *II = dyn_cast(&I)) { + if (isMFMA(II->getIntrinsicID())) { + bool InControlFlow = false; + for (const auto &P : predecessors(&BB)) + if (!PDT->dominates(&BB, P)) { + InControlFlow = true; + break; + } + for (const auto &S : successors(&BB)) + if (!DT->dominates(&BB, S)) { + InControlFlow = true; + break; + } + if (InControlFlow) { + // If operands to MFMA are not uniform, MFMA cannot be in control flow + bool hasUniformOperands = true; + for (unsigned i = 0; i < II->getNumOperands(); i++) { + if (!UA->isUniform(II->getOperand(i))) { + dbgs() << "Not uniform: " << *II->getOperand(i) << '\n'; + hasUniformOperands = false; + } + } + if (!hasUniformOperands) Check(false, "MFMA in control flow", II); + //else Check(false, "MFMA in control flow (uniform 
operands)", II); + } + //else Check(false, "MFMA not in control flow", II); + } + } + } + } +} + +PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { + + auto *Mod = F.getParent(); + + auto UA = &AM.getResult(F); + auto *DT = &AM.getResult(F); + auto *PDT = &AM.getResult(F); + + AMDGPUTargetVerify TV(Mod, DT, PDT, UA); + TV.run(F); + + dbgs() << TV.MessagesStr.str(); + if (!TV.MessagesStr.str().empty()) { + F.getParent()->IsValid = false; + } + + return PreservedAnalyses::all(); +} +} // namespace llvm diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt b/llvm/lib/Target/AMDGPU/CMakeLists.txt index 09a3096602fc3..bcfea0bf8ac94 100644 --- a/llvm/lib/Target/AMDGPU/CMakeLists.txt +++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt @@ -110,6 +110,7 @@ add_llvm_target(AMDGPUCodeGen AMDGPUTargetMachine.cpp AMDGPUTargetObjectFile.cpp AMDGPUTargetTransformInfo.cpp + AMDGPUTargetVerifier.cpp AMDGPUWaitSGPRHazards.cpp AMDGPUUnifyDivergentExitNodes.cpp AMDGPUUnifyMetadata.cpp diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll new file mode 100644 index 0000000000000..f56ff992a56c2 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -0,0 +1,62 @@ +; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s + +define amdgpu_kernel void @test_mfma_f32_32x32x1f32_vecarg(ptr addrspace(1) %arg) #0 { +; CHECK: Not uniform: %in.f32 = load <32 x float>, ptr addrspace(1) %gep, align 128 +; CHECK-NEXT: MFMA in control flow +; CHECK-NEXT: %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) +s: + %tid = call i32 @llvm.amdgcn.workitem.id.x() + %gep = getelementptr inbounds <32 x float>, ptr addrspace(1) %arg, i32 %tid + %in.i32 = load <32 x i32>, ptr addrspace(1) %gep + %in.f32 = load <32 x float>, ptr addrspace(1) %gep + + %0 = icmp eq <32 x i32> %in.i32, zeroinitializer + %div.br = extractelement <32 x i1> %0, 
i32 0 + br i1 %div.br, label %if.3, label %else.0 + +if.3: + br label %join + +else.0: + %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) + br label %join + +join: + ret void +} + +define amdgpu_cs i32 @shader() { +; CHECK: Shaders must return void + ret i32 0 +} + +define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) { +; CHECK: Undefined behavior: Write to memory in const addrspace +; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 +; CHECK-NEXT: Write to const memory +; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 + %r = add i32 %a, %b + store i32 %r, ptr addrspace(4) %out + ret void +} + +define amdgpu_kernel void @kernel_callee(ptr %x) { + ret void +} + +define amdgpu_kernel void @kernel_caller(ptr %x) { +; CHECK: A kernel may not call a kernel +; CHECK-NEXT: ptr @kernel_caller + call amdgpu_kernel void @kernel_callee(ptr %x) + ret void +} + + +; Function Attrs: nounwind +define i65 @invalid_type(i65 %x) #0 { +; CHECK: Int type is invalid. 
+; CHECK-NEXT: %tmp2 = ashr i65 %x, 64 +entry: + %tmp2 = ashr i65 %x, 64 + ret i65 %tmp2 +} diff --git a/llvm/tools/llvm-tgt-verify/CMakeLists.txt b/llvm/tools/llvm-tgt-verify/CMakeLists.txt new file mode 100644 index 0000000000000..fe47c85e6cdce --- /dev/null +++ b/llvm/tools/llvm-tgt-verify/CMakeLists.txt @@ -0,0 +1,34 @@ +set(LLVM_LINK_COMPONENTS + AllTargetsAsmParsers + AllTargetsCodeGens + AllTargetsDescs + AllTargetsInfos + Analysis + AsmPrinter + CodeGen + CodeGenTypes + Core + IRPrinter + IRReader + MC + MIRParser + Passes + Remarks + ScalarOpts + SelectionDAG + Support + Target + TargetParser + TransformUtils + Vectorize + ) + +add_llvm_tool(llvm-tgt-verify + llvm-tgt-verify.cpp + + DEPENDS + intrinsics_gen + SUPPORT_PLUGINS + ) + +export_executable_symbols_for_plugins(llc) diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp new file mode 100644 index 0000000000000..68422abd6f4cc --- /dev/null +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -0,0 +1,172 @@ +//===--- llvm-isel-fuzzer.cpp - Fuzzer for instruction selection ----------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Tool to fuzz instruction selection using libFuzzer. 
+// +//===----------------------------------------------------------------------===// + +#include "llvm/InitializePasses.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Analysis/Lint.h" +#include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/Bitcode/BitcodeReader.h" +#include "llvm/Bitcode/BitcodeWriter.h" +#include "llvm/CodeGen/CommandFlags.h" +#include "llvm/CodeGen/TargetPassConfig.h" +#include "llvm/IR/Constants.h" +#include "llvm/IR/LLVMContext.h" +#include "llvm/IR/LegacyPassManager.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/Verifier.h" +#include "llvm/IRReader/IRReader.h" +#include "llvm/Passes/PassBuilder.h" +#include "llvm/Passes/StandardInstrumentations.h" +#include "llvm/MC/TargetRegistry.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/DataTypes.h" +#include "llvm/Support/Debug.h" +#include "llvm/Support/InitLLVM.h" +#include "llvm/Support/SourceMgr.h" +#include "llvm/Support/TargetSelect.h" +#include "llvm/Target/TargetMachine.h" +#include "llvm/Target/TargetVerifier.h" + +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" + +#define DEBUG_TYPE "isel-fuzzer" + +using namespace llvm; + +static codegen::RegisterCodeGenFlags CGF; + +static cl::opt +InputFilename(cl::Positional, cl::desc(""), cl::init("-")); + +static cl::opt + StacktraceAbort("stacktrace-abort", + cl::desc("Turn on stacktrace"), cl::init(false)); + +static cl::opt + NoLint("no-lint", + cl::desc("Turn off Lint"), cl::init(false)); + +static cl::opt + NoVerify("no-verifier", + cl::desc("Turn off Verifier"), cl::init(false)); + +static cl::opt + OptLevel("O", + cl::desc("Optimization level. 
[-O0, -O1, -O2, or -O3] " + "(default = '-O2')"), + cl::Prefix, cl::init('2')); + +static cl::opt + TargetTriple("mtriple", cl::desc("Override target triple for module")); + +static std::unique_ptr TM; + +static void handleLLVMFatalError(void *, const char *Message, bool) { + if (StacktraceAbort) { + dbgs() << "LLVM ERROR: " << Message << "\n" + << "Aborting.\n"; + abort(); + } +} + +int main(int argc, char **argv) { + StringRef ExecName = argv[0]; + InitLLVM X(argc, argv); + + InitializeAllTargets(); + InitializeAllTargetMCs(); + InitializeAllAsmPrinters(); + InitializeAllAsmParsers(); + + PassRegistry *Registry = PassRegistry::getPassRegistry(); + initializeCore(*Registry); + initializeCodeGen(*Registry); + initializeAnalysis(*Registry); + initializeTarget(*Registry); + + cl::ParseCommandLineOptions(argc, argv); + + if (TargetTriple.empty()) { + errs() << ExecName << ": -mtriple must be specified\n"; + exit(1); + } + + CodeGenOptLevel OLvl; + if (auto Level = CodeGenOpt::parseLevel(OptLevel)) { + OLvl = *Level; + } else { + errs() << ExecName << ": invalid optimization level.\n"; + return 1; + } + ExitOnError ExitOnErr(std::string(ExecName) + ": error:"); + TM = ExitOnErr(codegen::createTargetMachineForTriple( + Triple::normalize(TargetTriple), OLvl)); + assert(TM && "Could not allocate target machine!"); + + // Make sure we print the summary and the current unit when LLVM errors out. 
+ install_fatal_error_handler(handleLLVMFatalError, nullptr); + + LLVMContext Context; + SMDiagnostic Err; + std::unique_ptr M = parseIRFile(InputFilename, Err, Context); + if (!M) { + errs() << "Invalid mod\n"; + return 1; + } + auto S = Triple::normalize(TargetTriple); + M->setTargetTriple(S); + + PassInstrumentationCallbacks PIC; + StandardInstrumentations SI(Context, false/*debug PM*/, + false); + registerCodeGenCallback(PIC, *TM); + + ModulePassManager MPM; + FunctionPassManager FPM; + //TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple())); + + MachineFunctionAnalysisManager MFAM; + LoopAnalysisManager LAM; + FunctionAnalysisManager FAM; + CGSCCAnalysisManager CGAM; + ModuleAnalysisManager MAM; + PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC); + PB.registerModuleAnalyses(MAM); + PB.registerCGSCCAnalyses(CGAM); + PB.registerFunctionAnalyses(FAM); + PB.registerLoopAnalyses(LAM); + PB.registerMachineFunctionAnalyses(MFAM); + PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); + + SI.registerCallbacks(PIC, &MAM); + + //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); + + Triple TT(M->getTargetTriple()); + if (!NoLint) + FPM.addPass(LintPass()); + if (!NoVerify) + MPM.addPass(VerifierPass()); + if (TT.isAMDGPU()) + FPM.addPass(AMDGPUTargetVerifierPass()); + else if (false) {} // ... 
+ else + FPM.addPass(TargetVerifierPass()); + MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); + + MPM.run(*M, MAM); + + if (!M->IsValid) + return 1; + + return 0; +} >From a808efce8d90524845a44ffa5b90adb6741e488d Mon Sep 17 00:00:00 2001 From: jofernau Date: Mon, 3 Feb 2025 07:15:12 -0800 Subject: [PATCH 02/30] Add hook for target verifier in llc,opt --- .../llvm/Passes/StandardInstrumentations.h | 6 ++++-- llvm/include/llvm/Target/TargetVerifier.h | 1 + .../TargetVerify/AMDGPUTargetVerifier.h | 18 ++++++++++++++++++ llvm/lib/LTO/LTOBackend.cpp | 2 +- llvm/lib/LTO/ThinLTOCodeGenerator.cpp | 2 +- llvm/lib/Passes/CMakeLists.txt | 1 + llvm/lib/Passes/PassBuilderBindings.cpp | 2 +- llvm/lib/Passes/StandardInstrumentations.cpp | 19 +++++++++++++++---- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 12 ++++++------ llvm/lib/Target/CMakeLists.txt | 2 ++ .../CodeGen/AMDGPU/tgt-verify-llc-fail.ll | 6 ++++++ .../CodeGen/AMDGPU/tgt-verify-llc-pass.ll | 6 ++++++ llvm/tools/llc/NewPMDriver.cpp | 2 +- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 2 +- llvm/tools/opt/NewPMDriver.cpp | 2 +- llvm/unittests/IR/PassManagerTest.cpp | 6 +++--- 16 files changed, 68 insertions(+), 21 deletions(-) create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h index f7a65a88ecf5b..988fcb93b2357 100644 --- a/llvm/include/llvm/Passes/StandardInstrumentations.h +++ b/llvm/include/llvm/Passes/StandardInstrumentations.h @@ -476,7 +476,8 @@ class VerifyInstrumentation { public: VerifyInstrumentation(bool DebugLogging) : DebugLogging(DebugLogging) {} void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM); + ModuleAnalysisManager *MAM, + FunctionAnalysisManager *FAM); }; /// This class implements --time-trace functionality for new pass manager. 
@@ -621,7 +622,8 @@ class StandardInstrumentations { // Register all the standard instrumentation callbacks. If \p FAM is nullptr // then PreservedCFGChecker is not enabled. void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM = nullptr); + ModuleAnalysisManager *MAM, + FunctionAnalysisManager *FAM); TimePassesHandler &getTimePasses() { return TimePasses; } }; diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index e00c6a7b260c9..ad5aeb895953d 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -75,6 +75,7 @@ class TargetVerify { MessagesStr(Messages) {} void run(Function &F) {}; + void run(Function &F, FunctionAnalysisManager &AM); }; } // namespace llvm diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index e6ff57629b141..d8a3fda4f87dc 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -22,6 +22,10 @@ #include "llvm/Target/TargetVerifier.h" +#include "llvm/Analysis/UniformityAnalysis.h" +#include "llvm/Analysis/PostDominators.h" +#include "llvm/IR/Dominators.h" + namespace llvm { class Function; @@ -31,6 +35,20 @@ class AMDGPUTargetVerifierPass : public TargetVerifierPass { PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); }; +class AMDGPUTargetVerify : public TargetVerify { +public: + Module *Mod; + + DominatorTree *DT; + PostDominatorTree *PDT; + UniformityInfo *UA; + + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) + : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} + + void run(Function &F); +}; + } // namespace llvm #endif // LLVM_AMDGPU_TARGET_VERIFIER_H diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp index 1c764a0188eda..475e7cf45371b 100644 --- 
a/llvm/lib/LTO/LTOBackend.cpp +++ b/llvm/lib/LTO/LTOBackend.cpp @@ -275,7 +275,7 @@ static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(Mod.getContext(), Conf.DebugPassManager, Conf.VerifyEach); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PassBuilder PB(TM, Conf.PTO, PGOOpt, &PIC); RegisterPassPlugins(Conf.PassPlugins, PB); diff --git a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp index 9e7f8187fe49c..369b003df1364 100644 --- a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp +++ b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp @@ -245,7 +245,7 @@ static void optimizeModule(Module &TheModule, TargetMachine &TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(TheModule.getContext(), DebugPassManager); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PipelineTuningOptions PTO; PTO.LoopVectorization = true; PTO.SLPVectorization = true; diff --git a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt index 6425f4934b210..f171377a8b270 100644 --- a/llvm/lib/Passes/CMakeLists.txt +++ b/llvm/lib/Passes/CMakeLists.txt @@ -29,6 +29,7 @@ add_llvm_component_library(LLVMPasses Scalar Support Target + TargetParser TransformUtils Vectorize Instrumentation diff --git a/llvm/lib/Passes/PassBuilderBindings.cpp b/llvm/lib/Passes/PassBuilderBindings.cpp index 933fe89e53a94..f0e1abb8cebc4 100644 --- a/llvm/lib/Passes/PassBuilderBindings.cpp +++ b/llvm/lib/Passes/PassBuilderBindings.cpp @@ -76,7 +76,7 @@ static LLVMErrorRef runPasses(Module *Mod, Function *Fun, const char *Passes, PB.crossRegisterProxies(LAM, FAM, CGAM, MAM); StandardInstrumentations SI(Mod->getContext(), Debug, VerifyEach); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); // Run the pipeline. 
if (Fun) { diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index dc1dd5d9c7f4c..7b15f89e361b8 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -45,6 +45,7 @@ #include "llvm/Support/Signals.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Support/xxhash.h" +#include "llvm/Target/TargetVerifier.h" #include #include #include @@ -1454,9 +1455,10 @@ void PreservedCFGCheckerInstrumentation::registerCallbacks( } void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM) { + ModuleAnalysisManager *MAM, + FunctionAnalysisManager *FAM) { PIC.registerAfterPassCallback( - [this, MAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { + [this, MAM, FAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { if (isIgnored(P) || P == "VerifierPass") return; const auto *F = unwrapIR(IR); @@ -1473,6 +1475,15 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, report_fatal_error(formatv("Broken function found after pass " "\"{0}\", compilation aborted!", P)); + + if (FAM) { + TargetVerify TV(const_cast(F->getParent())); + TV.run(*const_cast(F), *FAM); + if (!F->getParent()->IsValid) + report_fatal_error(formatv("Broken function found after pass " + "\"{0}\", compilation aborted!", + P)); + } } else { const auto *M = unwrapIR(IR); if (!M) { @@ -2512,7 +2523,7 @@ void PrintCrashIRInstrumentation::registerCallbacks( } void StandardInstrumentations::registerCallbacks( - PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM) { + PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM, FunctionAnalysisManager *FAM) { PrintIR.registerCallbacks(PIC); PrintPass.registerCallbacks(PIC); TimePasses.registerCallbacks(PIC); @@ -2521,7 +2532,7 @@ void StandardInstrumentations::registerCallbacks( PrintChangedIR.registerCallbacks(PIC); 
PseudoProbeVerification.registerCallbacks(PIC); if (VerifyEach) - Verify.registerCallbacks(PIC, MAM); + Verify.registerCallbacks(PIC, MAM, FAM); PrintChangedDiff.registerCallbacks(PIC); WebsiteChangeReporter.registerCallbacks(PIC); ChangeTester.registerCallbacks(PIC); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 585b19065c142..e6cdec7160229 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -14,8 +14,8 @@ using namespace llvm; -static cl::opt -MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); +//static cl::opt +//MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); // Check - We know that cond should be true, if not print an error message. #define Check(C, ...) \ @@ -81,7 +81,7 @@ static bool isMFMA(unsigned IID) { } namespace llvm { -class AMDGPUTargetVerify : public TargetVerify { +/*class AMDGPUTargetVerify : public TargetVerify { public: Module *Mod; @@ -93,7 +93,7 @@ class AMDGPUTargetVerify : public TargetVerify { : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} void run(Function &F); -}; +};*/ static bool IsValidInt(const Type *Ty) { return Ty->isIntegerTy(1) || @@ -129,8 +129,8 @@ void AMDGPUTargetVerify::run(Function &F) { for (auto &BB : F) { for (auto &I : BB) { - if (MarkUniform) - outs() << UA->isUniform(&I) << ' ' << I << '\n'; + //if (MarkUniform) + //outs() << UA->isUniform(&I) << ' ' << I << '\n'; // Ensure integral types are valid: i8, i16, i32, i64, i128 if (I.getType()->isIntegerTy()) diff --git a/llvm/lib/Target/CMakeLists.txt b/llvm/lib/Target/CMakeLists.txt index 9472288229cac..f2a5d545ce84f 100644 --- a/llvm/lib/Target/CMakeLists.txt +++ b/llvm/lib/Target/CMakeLists.txt @@ -7,6 +7,8 @@ add_llvm_component_library(LLVMTarget TargetLoweringObjectFile.cpp TargetMachine.cpp TargetMachineC.cpp + TargetVerifier.cpp + 
AMDGPU/AMDGPUTargetVerifier.cpp ADDITIONAL_HEADER_DIRS ${LLVM_MAIN_INCLUDE_DIR}/llvm/Target diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll new file mode 100644 index 0000000000000..584097d7bc134 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll @@ -0,0 +1,6 @@ +; RUN: not not llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each -o - < %s 2>&1 | FileCheck %s + +define amdgpu_cs i32 @nonvoid_shader() { +; CHECK: LLVM ERROR + ret i32 0 +} diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll new file mode 100644 index 0000000000000..0c3a5fe5ac4a5 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll @@ -0,0 +1,6 @@ +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each %s -o - 2>&1 | FileCheck %s + +define amdgpu_cs void @void_shader() { +; CHECK: ModuleToFunctionPassAdaptor + ret void +} diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index fa82689ecf9ae..a060d16e74958 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -126,7 +126,7 @@ int llvm::compileModuleWithNewPM( PB.registerLoopAnalyses(LAM); PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); MAM.registerPass([&] { return MachineModuleAnalysis(MMI); }); diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 68422abd6f4cc..3352d07deff2f 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -147,7 +147,7 @@ int main(int argc, char **argv) { PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - 
SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); diff --git a/llvm/tools/opt/NewPMDriver.cpp b/llvm/tools/opt/NewPMDriver.cpp index 7d168a6ceb17c..a8977d80bdf44 100644 --- a/llvm/tools/opt/NewPMDriver.cpp +++ b/llvm/tools/opt/NewPMDriver.cpp @@ -423,7 +423,7 @@ bool llvm::runPassPipeline( PrintPassOpts.SkipAnalyses = DebugPM == DebugLogging::Quiet; StandardInstrumentations SI(M.getContext(), DebugPM != DebugLogging::None, VK == VerifierKind::EachPass, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); DebugifyEachInstrumentation Debugify; DebugifyStatsMap DIStatsMap; DebugInfoPerPass DebugInfoBeforePass; diff --git a/llvm/unittests/IR/PassManagerTest.cpp b/llvm/unittests/IR/PassManagerTest.cpp index a6487169224c2..bb4db6120035f 100644 --- a/llvm/unittests/IR/PassManagerTest.cpp +++ b/llvm/unittests/IR/PassManagerTest.cpp @@ -828,7 +828,7 @@ TEST_F(PassManagerTest, FunctionPassCFGChecker) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -877,7 +877,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerInvalidateAnalysis) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -945,7 +945,7 @@ TEST_F(PassManagerTest, 
FunctionPassCFGCheckerWrapped) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); >From 64d001858efc994e965071cd319d268b934a6eb3 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 16 Apr 2025 10:19:00 -0400 Subject: [PATCH 03/30] Run AMDGPUTargetVerifier within AMDGPU pipeline. Move IsValid from Module to TargetVerify. --- clang/lib/CodeGen/BackendUtil.cpp | 2 +- llvm/include/llvm/IR/Module.h | 4 ---- llvm/include/llvm/Target/TargetVerifier.h | 2 ++ llvm/lib/IR/Verifier.cpp | 4 ++-- llvm/lib/Passes/StandardInstrumentations.cpp | 2 +- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 5 +++++ llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 3 ++- llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 6 +++--- 8 files changed, 16 insertions(+), 12 deletions(-) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index f7eb853beb23c..9a1c922f5ddef 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -922,7 +922,7 @@ void EmitAssemblyHelper::RunOptimizationPipeline( TheModule->getContext(), (CodeGenOpts.DebugPassManager || DebugPassStructure), CodeGenOpts.VerifyEach, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PassBuilder PB(TM.get(), PTO, PGOOpt, &PIC); // Handle the assignment tracking feature options. 
diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h index 03c0cf1cf0924..91ccd76c41e07 100644 --- a/llvm/include/llvm/IR/Module.h +++ b/llvm/include/llvm/IR/Module.h @@ -214,10 +214,6 @@ class LLVM_ABI Module { /// @name Constructors /// @{ public: - /// Is this Module valid as determined by one of the verification passes - /// i.e. Lint, Verifier, TargetVerifier. - bool IsValid = true; - /// Is this Module using intrinsics to record the position of debugging /// information, or non-intrinsic records? See IsNewDbgInfoFormat in /// \ref BasicBlock. diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index ad5aeb895953d..2d0c039132c35 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -70,6 +70,8 @@ class TargetVerify { std::string Messages; raw_string_ostream MessagesStr; + bool IsValid = true; + TargetVerify(Module *Mod) : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())), MessagesStr(Messages) {} diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index 9d21ca182ca13..d7c514610b4ba 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -7801,7 +7801,7 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F, PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { auto Res = AM.getResult<VerifierAnalysis>(M); if (Res.IRBroken || Res.DebugInfoBroken) { - M.IsValid = false; + //M.IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken module found, compilation aborted!"); } @@ -7812,7 +7812,7 @@ PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) { auto res = AM.getResult<VerifierAnalysis>(F); if (res.IRBroken) { - F.getParent()->IsValid = false; + //F.getParent()->IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken function found, compilation aborted!"); } diff
--git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index 7b15f89e361b8..879d657c87695 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -1479,7 +1479,7 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, if (FAM) { TargetVerify TV(const_cast<Module *>(F->getParent())); TV.run(*const_cast<Function *>(F), *FAM); - if (!F->getParent()->IsValid) + if (!TV.IsValid) report_fatal_error(formatv("Broken function found after pass " "\"{0}\", compilation aborted!", P)); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 90e3489ced923..6ec34d6a0fdbf 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -90,6 +90,7 @@ #include "llvm/MC/TargetRegistry.h" #include "llvm/Passes/PassBuilder.h" #include "llvm/Support/FormatVariadic.h" +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "llvm/Transforms/HipStdPar/HipStdPar.h" #include "llvm/Transforms/IPO.h" #include "llvm/Transforms/IPO/AlwaysInliner.h" @@ -1298,6 +1299,8 @@ void AMDGPUPassConfig::addIRPasses() { addPass(createLICMPass()); } + //addPass(AMDGPUTargetVerifierPass()); + TargetPassConfig::addIRPasses(); // EarlyCSE is not always strong enough to clean up what LSR produces. For @@ -2040,6 +2043,8 @@ void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { // but EarlyCSE can do neither of them. 
if (isPassEnabled(EnableScalarIRPasses)) addEarlyCSEOrGVNPass(addPass); + + addPass(AMDGPUTargetVerifierPass()); } void AMDGPUCodeGenPassBuilder::addCodeGenPrepare(AddIRPass &addPass) const { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index e6cdec7160229..c70a6d1b6fa66 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -205,7 +205,8 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan dbgs() << TV.MessagesStr.str(); if (!TV.MessagesStr.str().empty()) { - F.getParent()->IsValid = false; + TV.IsValid = false; + return PreservedAnalyses::none(); } return PreservedAnalyses::all(); diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 3352d07deff2f..fbe7f6089ff18 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -163,9 +163,9 @@ int main(int argc, char **argv) { FPM.addPass(TargetVerifierPass()); MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); - MPM.run(*M, MAM); - - if (!M->IsValid) + auto PA = MPM.run(*M, MAM); + auto PAC = PA.getChecker(); + if (!PAC.preserved()) return 1; return 0; >From fdae3025942584d0085deb3442f40471548defe5 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 16 Apr 2025 11:08:20 -0400 Subject: [PATCH 04/30] Remove cmd line options that aren't required. Make error message explicit. 
--- llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll | 4 ++-- llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll index 584097d7bc134..c5e59d4a2369e 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll @@ -1,6 +1,6 @@ -; RUN: not not llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each -o - < %s 2>&1 | FileCheck %s +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm -o - < %s 2>&1 | FileCheck %s define amdgpu_cs i32 @nonvoid_shader() { -; CHECK: LLVM ERROR +; CHECK: Shaders must return void ret i32 0 } diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll index 0c3a5fe5ac4a5..8a503b7624a73 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll @@ -1,6 +1,6 @@ -; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each %s -o - 2>&1 | FileCheck %s +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm %s -o - 2>&1 | FileCheck %s --allow-empty define amdgpu_cs void @void_shader() { -; CHECK: ModuleToFunctionPassAdaptor +; CHECK-NOT: Shaders must return void ret void } >From 5ceda58cc5b5d7372c6e43cbdf583f0dda87b956 Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 19:36:34 -0400 Subject: [PATCH 05/30] Return Verifier none status through PreservedAnalyses on fail. 
--- llvm/lib/Analysis/Lint.cpp | 4 +++- llvm/lib/IR/Verifier.cpp | 2 ++ llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 8 +++++--- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/llvm/lib/Analysis/Lint.cpp b/llvm/lib/Analysis/Lint.cpp index f05e36e2025d4..c8e38963e5974 100644 --- a/llvm/lib/Analysis/Lint.cpp +++ b/llvm/lib/Analysis/Lint.cpp @@ -742,9 +742,11 @@ PreservedAnalyses LintPass::run(Function &F, FunctionAnalysisManager &AM) { Lint L(Mod, DL, AA, AC, DT, TLI); L.visit(F); dbgs() << L.MessagesStr.str(); - if (AbortOnError && !L.MessagesStr.str().empty()) + if (AbortOnError && !L.MessagesStr.str().empty()) { report_fatal_error( "linter found errors, aborting. (enabled by abort-on-error)", false); + return PreservedAnalyses::none(); + } return PreservedAnalyses::all(); } diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index d7c514610b4ba..51f6dec53b70f 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -7804,6 +7804,7 @@ PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { //M.IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken module found, compilation aborted!"); + return PreservedAnalyses::none(); } return PreservedAnalyses::all(); @@ -7815,6 +7816,7 @@ PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) { //F.getParent()->IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken function found, compilation aborted!"); + return PreservedAnalyses::none(); } return PreservedAnalyses::all(); diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index fbe7f6089ff18..042824ac37fea 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -164,9 +164,11 @@ int main(int argc, char **argv) { MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); auto PA = MPM.run(*M, MAM); - auto PAC = 
PA.getChecker(); - if (!PAC.preserved()) - return 1; + { + auto PAC = PA.getChecker(); + if (!PAC.preserved()) + return 1; + } return 0; } >From 99c29069cdaf68c92ce7f25ca2f730bf738ca324 Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 21:16:02 -0400 Subject: [PATCH 06/30] Rebase update. --- llvm/include/llvm/Target/TargetVerifier.h | 2 +- llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index 2d0c039132c35..fe683311b901c 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -73,7 +73,7 @@ class TargetVerify { bool IsValid = true; TargetVerify(Module *Mod) - : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())), + : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} void run(Function &F) {}; diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 042824ac37fea..627bc51ef3a43 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -123,7 +123,7 @@ int main(int argc, char **argv) { return 1; } auto S = Triple::normalize(TargetTriple); - M->setTargetTriple(S); + M->setTargetTriple(Triple(S)); PassInstrumentationCallbacks PIC; StandardInstrumentations SI(Context, false/*debug PM*/, @@ -153,7 +153,7 @@ int main(int argc, char **argv) { Triple TT(M->getTargetTriple()); if (!NoLint) - FPM.addPass(LintPass()); + FPM.addPass(LintPass(false)); if (!NoVerify) MPM.addPass(VerifierPass()); if (TT.isAMDGPU()) >From 3ea7eae48a6addbf711716e7a819830dddc1b34a Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 22:49:52 -0400 Subject: [PATCH 07/30] Add generic TargetVerifier. 
--- llvm/lib/Target/TargetVerifier.cpp | 32 ++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) create mode 100644 llvm/lib/Target/TargetVerifier.cpp diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp new file mode 100644 index 0000000000000..de3ff749e7c3c --- /dev/null +++ b/llvm/lib/Target/TargetVerifier.cpp @@ -0,0 +1,32 @@ +#include "llvm/Target/TargetVerifier.h" +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" + +#include "llvm/Analysis/UniformityAnalysis.h" +#include "llvm/Analysis/PostDominators.h" +#include "llvm/Support/Debug.h" +#include "llvm/IR/Dominators.h" +#include "llvm/IR/Function.h" +#include "llvm/IR/IntrinsicInst.h" +#include "llvm/IR/IntrinsicsAMDGPU.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/Value.h" + +namespace llvm { + +void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) { + if (TT.isAMDGPU()) { + auto *UA = &AM.getResult<UniformityInfoAnalysis>(F); + auto *DT = &AM.getResult<DominatorTreeAnalysis>(F); + auto *PDT = &AM.getResult<PostDominatorTreeAnalysis>(F); + + AMDGPUTargetVerify TV(Mod, DT, PDT, UA); + TV.run(F); + + dbgs() << TV.MessagesStr.str(); + if (!TV.MessagesStr.str().empty()) { + TV.IsValid = false; + } + } +} + +} // namespace llvm >From f52c4dbc84952d97266f5f4158729e564de10240 Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 23:13:14 -0400 Subject: [PATCH 08/30] Remove store to const check since it is in Lint already --- llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 8 -------- llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 2 -- 2 files changed, 10 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index c70a6d1b6fa66..1cf2b277bee26 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -140,14 +140,6 @@ void AMDGPUTargetVerify::run(Function &F) { Check(IsValidInt(I.getOperand(i)->getType()), "Int type is invalid.", I.getOperand(i)); - // Ensure no store to const memory - if 
(auto *SI = dyn_cast(&I)) - { - unsigned AS = SI->getPointerAddressSpace(); - Check(AS != 4, "Write to const memory", SI); - } - - // Ensure no kernel to kernel calls. if (auto *CI = dyn_cast(&I)) { CallingConv::ID CalleeCC = CI->getCallingConv(); diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll index f56ff992a56c2..c628abbde11d1 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -32,8 +32,6 @@ define amdgpu_cs i32 @shader() { define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) { ; CHECK: Undefined behavior: Write to memory in const addrspace -; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 -; CHECK-NEXT: Write to const memory ; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 %r = add i32 %a, %b store i32 %r, ptr addrspace(4) %out >From 5c9a4ab3895d6939b12386d1db2081ca388df01a Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 23:14:38 -0400 Subject: [PATCH 09/30] Add chain followed by unreachable check --- llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 6 ++++++ llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 10 ++++++++++ 2 files changed, 16 insertions(+) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 1cf2b277bee26..8ea773bc0e66f 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -142,6 +142,7 @@ void AMDGPUTargetVerify::run(Function &F) { if (auto *CI = dyn_cast(&I)) { + // Ensure no kernel to kernel calls. CallingConv::ID CalleeCC = CI->getCallingConv(); if (CalleeCC == CallingConv::AMDGPU_KERNEL) { @@ -149,6 +150,11 @@ void AMDGPUTargetVerify::run(Function &F) { Check(CallerCC != CallingConv::AMDGPU_KERNEL, "A kernel may not call a kernel", CI->getParent()->getParent()); } + + // Ensure chain intrinsics are followed by unreachables. 
+ if (CI->getIntrinsicID() == Intrinsic::amdgcn_cs_chain) + Check(isa_and_present(CI->getNextNode()), + "llvm.amdgcn.cs.chain must be followed by unreachable", CI); } // Ensure MFMA is not in control flow with diverging operands diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll index c628abbde11d1..e620df94ccde4 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -58,3 +58,13 @@ entry: %tmp2 = ashr i65 %x, 64 ret i65 %tmp2 } + +declare void @llvm.amdgcn.cs.chain.v3i32(ptr, i32, <3 x i32>, <3 x i32>, i32, ...) +declare amdgpu_cs_chain void @chain_callee(<3 x i32> inreg, <3 x i32>) + +define amdgpu_cs void @no_unreachable(<3 x i32> inreg %a, <3 x i32> %b) { +; CHECK: llvm.amdgcn.cs.chain must be followed by unreachable +; CHECK-NEXT: call void (ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.p0.i32.v3i32.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0) + call void(ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0) + ret void +} >From 0ff03f792c018e4fd0c11de9da4d3353617707f5 Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 23:26:19 -0400 Subject: [PATCH 10/30] Remove mfma check --- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 89 ------------------- llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 25 ------ 2 files changed, 114 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 8ea773bc0e66f..684ced5bba574 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -14,9 +14,6 @@ using namespace llvm; -//static cl::opt -//MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); - // Check - We know that cond should be true, if not print an error message. #define Check(C, ...) 
\ do { \ @@ -26,60 +23,6 @@ using namespace llvm; } \ } while (false) -static bool isMFMA(unsigned IID) { - switch (IID) { - case Intrinsic::amdgcn_mfma_f32_4x4x1f32: - case Intrinsic::amdgcn_mfma_f32_4x4x4f16: - case Intrinsic::amdgcn_mfma_i32_4x4x4i8: - case Intrinsic::amdgcn_mfma_f32_4x4x2bf16: - - case Intrinsic::amdgcn_mfma_f32_16x16x1f32: - case Intrinsic::amdgcn_mfma_f32_16x16x4f32: - case Intrinsic::amdgcn_mfma_f32_16x16x4f16: - case Intrinsic::amdgcn_mfma_f32_16x16x16f16: - case Intrinsic::amdgcn_mfma_i32_16x16x4i8: - case Intrinsic::amdgcn_mfma_i32_16x16x16i8: - case Intrinsic::amdgcn_mfma_f32_16x16x2bf16: - case Intrinsic::amdgcn_mfma_f32_16x16x8bf16: - - case Intrinsic::amdgcn_mfma_f32_32x32x1f32: - case Intrinsic::amdgcn_mfma_f32_32x32x2f32: - case Intrinsic::amdgcn_mfma_f32_32x32x4f16: - case Intrinsic::amdgcn_mfma_f32_32x32x8f16: - case Intrinsic::amdgcn_mfma_i32_32x32x4i8: - case Intrinsic::amdgcn_mfma_i32_32x32x8i8: - case Intrinsic::amdgcn_mfma_f32_32x32x2bf16: - case Intrinsic::amdgcn_mfma_f32_32x32x4bf16: - - case Intrinsic::amdgcn_mfma_f32_4x4x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_16x16x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_16x16x16bf16_1k: - case Intrinsic::amdgcn_mfma_f32_32x32x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_32x32x8bf16_1k: - - case Intrinsic::amdgcn_mfma_f64_16x16x4f64: - case Intrinsic::amdgcn_mfma_f64_4x4x4f64: - - case Intrinsic::amdgcn_mfma_i32_16x16x32_i8: - case Intrinsic::amdgcn_mfma_i32_32x32x16_i8: - case Intrinsic::amdgcn_mfma_f32_16x16x8_xf32: - case Intrinsic::amdgcn_mfma_f32_32x32x4_xf32: - - case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_bf8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_fp8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_bf8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_fp8: - - case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_bf8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_fp8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_bf8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_fp8: - 
return true; - default: - return false; - } -} - namespace llvm { /*class AMDGPUTargetVerify : public TargetVerify { public: @@ -129,8 +72,6 @@ void AMDGPUTargetVerify::run(Function &F) { for (auto &BB : F) { for (auto &I : BB) { - //if (MarkUniform) - //outs() << UA->isUniform(&I) << ' ' << I << '\n'; // Ensure integral types are valid: i8, i16, i32, i64, i128 if (I.getType()->isIntegerTy()) @@ -156,36 +97,6 @@ void AMDGPUTargetVerify::run(Function &F) { Check(isa_and_present(CI->getNextNode()), "llvm.amdgcn.cs.chain must be followed by unreachable", CI); } - - // Ensure MFMA is not in control flow with diverging operands - if (auto *II = dyn_cast(&I)) { - if (isMFMA(II->getIntrinsicID())) { - bool InControlFlow = false; - for (const auto &P : predecessors(&BB)) - if (!PDT->dominates(&BB, P)) { - InControlFlow = true; - break; - } - for (const auto &S : successors(&BB)) - if (!DT->dominates(&BB, S)) { - InControlFlow = true; - break; - } - if (InControlFlow) { - // If operands to MFMA are not uniform, MFMA cannot be in control flow - bool hasUniformOperands = true; - for (unsigned i = 0; i < II->getNumOperands(); i++) { - if (!UA->isUniform(II->getOperand(i))) { - dbgs() << "Not uniform: " << *II->getOperand(i) << '\n'; - hasUniformOperands = false; - } - } - if (!hasUniformOperands) Check(false, "MFMA in control flow", II); - //else Check(false, "MFMA in control flow (uniform operands)", II); - } - //else Check(false, "MFMA not in control flow", II); - } - } } } } diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll index e620df94ccde4..62b220d7d9f49 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -1,30 +1,5 @@ ; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s -define amdgpu_kernel void @test_mfma_f32_32x32x1f32_vecarg(ptr addrspace(1) %arg) #0 { -; CHECK: Not uniform: %in.f32 = load <32 x float>, ptr addrspace(1) %gep, align 128 -; CHECK-NEXT: MFMA in control 
flow -; CHECK-NEXT: %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) -s: - %tid = call i32 @llvm.amdgcn.workitem.id.x() - %gep = getelementptr inbounds <32 x float>, ptr addrspace(1) %arg, i32 %tid - %in.i32 = load <32 x i32>, ptr addrspace(1) %gep - %in.f32 = load <32 x float>, ptr addrspace(1) %gep - - %0 = icmp eq <32 x i32> %in.i32, zeroinitializer - %div.br = extractelement <32 x i1> %0, i32 0 - br i1 %div.br, label %if.3, label %else.0 - -if.3: - br label %join - -else.0: - %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) - br label %join - -join: - ret void -} - define amdgpu_cs i32 @shader() { ; CHECK: Shaders must return void ret i32 0 >From 6b84c73a35a260d64ed45df90052f8212b0ee4e7 Mon Sep 17 00:00:00 2001 From: jofernau Date: Mon, 21 Apr 2025 20:54:10 -0400 Subject: [PATCH 11/30] Add registerVerifierPasses to PassBuilder and add the verifier passes to PassRegistry. 
--- llvm/include/llvm/InitializePasses.h | 1 + llvm/include/llvm/Passes/PassBuilder.h | 21 +++++++ .../llvm/Passes/TargetPassRegistry.inc | 12 ++++ .../TargetVerify/AMDGPUTargetVerifier.h | 11 ++-- llvm/lib/Passes/PassBuilder.cpp | 7 +++ llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 11 ++++ .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 56 ++++++++++++++++++- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 1 + 8 files changed, 114 insertions(+), 6 deletions(-) diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 9bef8e496c57e..ae398db3dc1da 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -317,6 +317,7 @@ void initializeUnpackMachineBundlesPass(PassRegistry &); void initializeUnreachableBlockElimLegacyPassPass(PassRegistry &); void initializeUnreachableMachineBlockElimLegacyPass(PassRegistry &); void initializeVerifierLegacyPassPass(PassRegistry &); +void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); void initializeVirtRegMapWrapperLegacyPass(PassRegistry &); void initializeVirtRegRewriterPass(PassRegistry &); void initializeWasmEHPreparePass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/PassBuilder.h b/llvm/include/llvm/Passes/PassBuilder.h index 51ccaa53447d7..6000769ce723b 100644 --- a/llvm/include/llvm/Passes/PassBuilder.h +++ b/llvm/include/llvm/Passes/PassBuilder.h @@ -172,6 +172,13 @@ class PassBuilder { /// additional analyses. void registerLoopAnalyses(LoopAnalysisManager &LAM); + /// Registers all available verifier passes. + /// + /// This is an interface that can be used to populate a + /// \c ModuleAnalysisManager with all registered loop analyses. Callers can + /// still manually register any additional analyses. + void registerVerifierPasses(ModulePassManager &PM, FunctionPassManager &); + /// Registers all available machine function analysis passes. 
/// /// This is an interface that can be used to populate a \c @@ -570,6 +577,15 @@ class PassBuilder { } /// @}} + /// Register a callback for parsing an Verifier Name to populate + /// the given managers. + void registerVerifierCallback( + const std::function &C, + const std::function &CF) { + VerifierCallbacks.push_back(C); + FnVerifierCallbacks.push_back(CF); + } + /// {{@ Register pipeline parsing callbacks with this pass builder instance. /// Using these callbacks, callers can parse both a single pass name, as well /// as entire sub-pipelines, and populate the PassManager instance @@ -841,6 +857,11 @@ class PassBuilder { // Callbacks to parse `filter` parameter in register allocation passes SmallVector, 2> RegClassFilterParsingCallbacks; + // Verifier callbacks + SmallVector, 2> + VerifierCallbacks; + SmallVector, 2> + FnVerifierCallbacks; }; /// This utility template takes care of adding require<> and invalidate<> diff --git a/llvm/include/llvm/Passes/TargetPassRegistry.inc b/llvm/include/llvm/Passes/TargetPassRegistry.inc index 521913cb25a4a..2d04b874cf360 100644 --- a/llvm/include/llvm/Passes/TargetPassRegistry.inc +++ b/llvm/include/llvm/Passes/TargetPassRegistry.inc @@ -151,6 +151,18 @@ PB.registerPipelineParsingCallback([=](StringRef Name, FunctionPassManager &PM, return false; }); +PB.registerVerifierCallback([](ModulePassManager &PM) { +#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) PM.addPass(CREATE_PASS) +#include GET_PASS_REGISTRY +#undef VERIFIER_MODULE_ANALYSIS + return false; +}, [](FunctionPassManager &FPM) { +#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) FPM.addPass(CREATE_PASS) +#include GET_PASS_REGISTRY +#undef VERIFIER_FUNCTION_ANALYSIS + return false; +}); + #undef ADD_PASS #undef ADD_PASS_WITH_PARAMS diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index d8a3fda4f87dc..b6a7412e8c1ef 100644 --- 
a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -39,14 +39,17 @@ class AMDGPUTargetVerify : public TargetVerify { public: Module *Mod; - DominatorTree *DT; - PostDominatorTree *PDT; - UniformityInfo *UA; + DominatorTree *DT = nullptr; + PostDominatorTree *PDT = nullptr; + UniformityInfo *UA = nullptr; + + AMDGPUTargetVerify(Module *Mod) + : TargetVerify(Mod), Mod(Mod) {} AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} - void run(Function &F); + bool run(Function &F); }; } // namespace llvm diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index e7057d9a6b625..e942fed8b6a72 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -582,6 +582,13 @@ void PassBuilder::registerLoopAnalyses(LoopAnalysisManager &LAM) { C(LAM); } +void PassBuilder::registerVerifierPasses(ModulePassManager &MPM, FunctionPassManager &FPM) { + for (auto &C : VerifierCallbacks) + C(MPM); + for (auto &C : FnVerifierCallbacks) + C(FPM); +} + static std::optional> parseFunctionPipelineName(StringRef Name) { std::pair Params; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def index 98a1147ef6d66..41e6a399c7239 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def +++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def @@ -81,6 +81,17 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA()) #undef FUNCTION_ALIAS_ANALYSIS #undef FUNCTION_ANALYSIS +#ifndef VERIFIER_MODULE_ANALYSIS +#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) +#endif +#ifndef VERIFIER_FUNCTION_ANALYSIS +#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) +#endif +VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass()) +VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass()) +#undef VERIFIER_MODULE_ANALYSIS 
+#undef VERIFIER_FUNCTION_ANALYSIS + #ifndef FUNCTION_PASS_WITH_PARAMS #define FUNCTION_PASS_WITH_PARAMS(NAME, CLASS, CREATE_PASS, PARSER, PARAMS) #endif diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 684ced5bba574..63a7526b9abdc 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -5,6 +5,7 @@ #include "llvm/Support/Debug.h" #include "llvm/IR/Dominators.h" #include "llvm/IR/Function.h" +#include "llvm/InitializePasses.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/IntrinsicsAMDGPU.h" #include "llvm/IR/Module.h" @@ -19,7 +20,7 @@ using namespace llvm; do { \ if (!(C)) { \ TargetVerify::CheckFailed(__VA_ARGS__); \ - return; \ + return false; \ } \ } while (false) @@ -64,7 +65,7 @@ static bool isShader(CallingConv::ID CC) { } } -void AMDGPUTargetVerify::run(Function &F) { +bool AMDGPUTargetVerify::run(Function &F) { // Ensure shader calling convention returns void if (isShader(F.getCallingConv())) Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void"); @@ -99,6 +100,10 @@ void AMDGPUTargetVerify::run(Function &F) { } } } + + if (!MessagesStr.str().empty()) + return false; + return true; } PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { @@ -120,4 +125,51 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan return PreservedAnalyses::all(); } + +struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { + static char ID; + + std::unique_ptr TV; + bool FatalErrors = true; + + AMDGPUTargetVerifierLegacyPass() : FunctionPass(ID) { + initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + } + AMDGPUTargetVerifierLegacyPass(bool FatalErrors) + : FunctionPass(ID), + FatalErrors(FatalErrors) { + initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + } + + bool 
doInitialization(Module &M) override { + TV = std::make_unique(&M); + return false; + } + + bool runOnFunction(Function &F) override { + if (TV->run(F) && FatalErrors) { + errs() << "in function " << F.getName() << '\n'; + report_fatal_error("Broken function found, compilation aborted!"); + } + return false; + } + + bool doFinalization(Module &M) override { + bool IsValid = true; + for (Function &F : M) + if (F.isDeclaration()) + IsValid &= TV->run(F); + + //IsValid &= TV->run(); + if (FatalErrors && !IsValid) + report_fatal_error("Broken module found, compilation aborted!"); + return false; + } + + void getAnalysisUsage(AnalysisUsage &AU) const override { + AU.setPreservesAll(); + } +}; +char AMDGPUTargetVerifierLegacyPass::ID = 0; } // namespace llvm +INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverify", "AMDGPU Target Verifier", false, false) diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 627bc51ef3a43..503db7b1f8d18 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -144,6 +144,7 @@ int main(int argc, char **argv) { PB.registerCGSCCAnalyses(CGAM); PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); + //PB.registerVerifierPasses(MPM, FPM); PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); >From ec3276b182f3a758a24024291772efe435485857 Mon Sep 17 00:00:00 2001 From: jofernau Date: Tue, 22 Apr 2025 14:57:31 -0400 Subject: [PATCH 12/30] Remove leftovers. Add titles. Add call to registerVerifierCallbacks in llc. 
--- llvm/lib/Passes/CMakeLists.txt | 2 +- .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 4 --- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 35 +++++++++++-------- llvm/lib/Target/TargetVerifier.cpp | 19 ++++++++++ llvm/tools/llc/NewPMDriver.cpp | 6 ++-- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 7 ++-- 6 files changed, 45 insertions(+), 28 deletions(-) diff --git a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt index f171377a8b270..9c348cb89a8c5 100644 --- a/llvm/lib/Passes/CMakeLists.txt +++ b/llvm/lib/Passes/CMakeLists.txt @@ -29,7 +29,7 @@ add_llvm_component_library(LLVMPasses Scalar Support Target - TargetParser + #TargetParser TransformUtils Vectorize Instrumentation diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 6ec34d6a0fdbf..257cc724b3da9 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -1299,8 +1299,6 @@ void AMDGPUPassConfig::addIRPasses() { addPass(createLICMPass()); } - //addPass(AMDGPUTargetVerifierPass()); - TargetPassConfig::addIRPasses(); // EarlyCSE is not always strong enough to clean up what LSR produces. For @@ -2043,8 +2041,6 @@ void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { // but EarlyCSE can do neither of them. if (isPassEnabled(EnableScalarIRPasses)) addEarlyCSEOrGVNPass(addPass); - - addPass(AMDGPUTargetVerifierPass()); } void AMDGPUCodeGenPassBuilder::addCodeGenPrepare(AddIRPass &addPass) const { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 63a7526b9abdc..0eecedaebc7ce 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -1,3 +1,22 @@ +//===-- AMDGPUTargetVerifier.cpp - AMDGPU -------------------------*- C++ -*-===// +//// +//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 
+//// See https://llvm.org/LICENSE.txt for license information. +//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +//// +////===----------------------------------------------------------------------===// +//// +//// This file defines target verifier interfaces that can be used for some +//// validation of input to the system, and for checking that transformations +//// haven't done something bad. In contrast to the Verifier or Lint, the +//// TargetVerifier looks for constructions invalid to a particular target +//// machine. +//// +//// To see what specifically is checked, look at an individual backend's +//// TargetVerifier. +//// +////===----------------------------------------------------------------------===// + #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "llvm/Analysis/UniformityAnalysis.h" @@ -25,19 +44,6 @@ using namespace llvm; } while (false) namespace llvm { -/*class AMDGPUTargetVerify : public TargetVerify { -public: - Module *Mod; - - DominatorTree *DT; - PostDominatorTree *PDT; - UniformityInfo *UA; - - AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) - : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} - - void run(Function &F); -};*/ static bool IsValidInt(const Type *Ty) { return Ty->isIntegerTy(1) || @@ -147,7 +153,7 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { } bool runOnFunction(Function &F) override { - if (TV->run(F) && FatalErrors) { + if (!TV->run(F) && FatalErrors) { errs() << "in function " << F.getName() << '\n'; report_fatal_error("Broken function found, compilation aborted!"); } @@ -160,7 +166,6 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { if (F.isDeclaration()) IsValid &= TV->run(F); - //IsValid &= TV->run(); if (FatalErrors && !IsValid) report_fatal_error("Broken module found, compilation aborted!"); return false; diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp index 
de3ff749e7c3c..992a0c91d93b1 100644 --- a/llvm/lib/Target/TargetVerifier.cpp +++ b/llvm/lib/Target/TargetVerifier.cpp @@ -1,3 +1,22 @@ +//===-- TargetVerifier.cpp - LLVM IR Target Verifier ----------------*- C++ -*-===// +//// +///// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +///// See https://llvm.org/LICENSE.txt for license information. +///// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +///// +/////===----------------------------------------------------------------------===// +///// +///// This file defines target verifier interfaces that can be used for some +///// validation of input to the system, and for checking that transformations +///// haven't done something bad. In contrast to the Verifier or Lint, the +///// TargetVerifier looks for constructions invalid to a particular target +///// machine. +///// +///// To see what specifically is checked, look at TargetVerifier.cpp or an +///// individual backend's TargetVerifier. +///// +/////===----------------------------------------------------------------------===// + #include "llvm/Target/TargetVerifier.h" #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index a060d16e74958..a8f6b999af06e 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -114,6 +114,8 @@ int llvm::compileModuleWithNewPM( VK == VerifierKind::EachPass); registerCodeGenCallback(PIC, *Target); + ModulePassManager MPM; + FunctionPassManager FPM; MachineFunctionAnalysisManager MFAM; LoopAnalysisManager LAM; FunctionAnalysisManager FAM; @@ -125,15 +127,13 @@ int llvm::compileModuleWithNewPM( PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); PB.registerMachineFunctionAnalyses(MFAM); + PB.registerVerifierPasses(MPM, FPM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); SI.registerCallbacks(PIC, &MAM, &FAM); FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); 
}); MAM.registerPass([&] { return MachineModuleAnalysis(MMI); }); - ModulePassManager MPM; - FunctionPassManager FPM; - if (!PassPipeline.empty()) { // Construct a custom pass pipeline that starts after instruction // selection. diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 503db7b1f8d18..b00bab66c6c3e 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -1,4 +1,4 @@ -//===--- llvm-isel-fuzzer.cpp - Fuzzer for instruction selection ----------===// +//===--- llvm-tgt-verify.cpp - Target Verifier ----------------- ----------===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -6,7 +6,7 @@ // //===----------------------------------------------------------------------===// // -// Tool to fuzz instruction selection using libFuzzer. +// Tool to verify a target. // //===----------------------------------------------------------------------===// @@ -144,14 +144,11 @@ int main(int argc, char **argv) { PB.registerCGSCCAnalyses(CGAM); PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); - //PB.registerVerifierPasses(MPM, FPM); PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); SI.registerCallbacks(PIC, &MAM, &FAM); - //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); - Triple TT(M->getTargetTriple()); if (!NoLint) FPM.addPass(LintPass(false)); >From 4f00c83f58a86a0adc26b621cc53e8b568b8c8e0 Mon Sep 17 00:00:00 2001 From: jofernau Date: Thu, 24 Apr 2025 16:02:21 -0400 Subject: [PATCH 13/30] Add pass to legacy PM. 
--- llvm/include/llvm/CodeGen/Passes.h | 2 + llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Target/TargetVerifier.h | 6 +- llvm/lib/Passes/StandardInstrumentations.cpp | 4 +- llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 2 +- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 45 ---------- llvm/lib/Target/TargetVerifier.cpp | 87 ++++++++++++++++++- llvm/tools/llc/NewPMDriver.cpp | 6 +- llvm/tools/llc/llc.cpp | 4 + .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 1 + 10 files changed, 106 insertions(+), 53 deletions(-) diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index d214ab9306c2f..b293315e11c17 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -617,6 +617,8 @@ namespace llvm { /// Lowers KCFI operand bundles for indirect calls. FunctionPass *createKCFIPass(); + + FunctionPass *createTargetVerifierLegacyPass(); } // End llvm namespace #endif diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index ae398db3dc1da..3f9ffc4efd9ec 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -307,6 +307,7 @@ void initializeTailDuplicateLegacyPass(PassRegistry &); void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &); void initializeTargetPassConfigPass(PassRegistry &); void initializeTargetTransformInfoWrapperPassPass(PassRegistry &); +void initializeTargetVerifierLegacyPassPass(PassRegistry &); void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &); void initializeTypeBasedAAWrapperPassPass(PassRegistry &); void initializeTypePromotionLegacyPass(PassRegistry &); @@ -317,7 +318,6 @@ void initializeUnpackMachineBundlesPass(PassRegistry &); void initializeUnreachableBlockElimLegacyPassPass(PassRegistry &); void initializeUnreachableMachineBlockElimLegacyPass(PassRegistry &); void initializeVerifierLegacyPassPass(PassRegistry &); -void 
initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); void initializeVirtRegMapWrapperLegacyPass(PassRegistry &); void initializeVirtRegRewriterPass(PassRegistry &); void initializeWasmEHPreparePass(PassRegistry &); diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index fe683311b901c..23ef2e0b8d4ef 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -30,7 +30,7 @@ class Function; class TargetVerifierPass : public PassInfoMixin { public: - PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {} + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); }; class TargetVerify { @@ -76,8 +76,8 @@ class TargetVerify { : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} - void run(Function &F) {}; - void run(Function &F, FunctionAnalysisManager &AM); + bool run(Function &F); + bool run(Function &F, FunctionAnalysisManager &AM); }; } // namespace llvm diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index 879d657c87695..f125b3daffd5e 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -62,6 +62,8 @@ static cl::opt VerifyAnalysisInvalidation("verify-analysis-invalidation", #endif ); +static cl::opt VerifyTargetEach("verify-tgt-each"); + // An option that supports the -print-changed option. See // the description for -print-changed for an explanation of the use // of this option. Note that this option has no effect without -print-changed. 
@@ -1476,7 +1478,7 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, "\"{0}\", compilation aborted!", P)); - if (FAM) { + if (VerifyTargetEach && FAM) { TargetVerify TV(const_cast(F->getParent())); TV.run(*const_cast(F), *FAM); if (!TV.IsValid) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def index 41e6a399c7239..73f9c60cf588c 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def +++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def @@ -88,7 +88,7 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA()) #define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) #endif VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass()) -VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass()) +VERIFIER_FUNCTION_ANALYSIS("tgtverifier", TargetVerifierPass()) #undef VERIFIER_MODULE_ANALYSIS #undef VERIFIER_FUNCTION_ANALYSIS diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 0eecedaebc7ce..96bcaaf6f2ac9 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -132,49 +132,4 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan return PreservedAnalyses::all(); } -struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { - static char ID; - - std::unique_ptr TV; - bool FatalErrors = true; - - AMDGPUTargetVerifierLegacyPass() : FunctionPass(ID) { - initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); - } - AMDGPUTargetVerifierLegacyPass(bool FatalErrors) - : FunctionPass(ID), - FatalErrors(FatalErrors) { - initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); - } - - bool doInitialization(Module &M) override { - TV = std::make_unique(&M); - return false; - } - - bool runOnFunction(Function &F) override { - if (!TV->run(F) && FatalErrors) { - errs() << "in function " << 
F.getName() << '\n'; - report_fatal_error("Broken function found, compilation aborted!"); - } - return false; - } - - bool doFinalization(Module &M) override { - bool IsValid = true; - for (Function &F : M) - if (F.isDeclaration()) - IsValid &= TV->run(F); - - if (FatalErrors && !IsValid) - report_fatal_error("Broken module found, compilation aborted!"); - return false; - } - - void getAnalysisUsage(AnalysisUsage &AU) const override { - AU.setPreservesAll(); - } -}; -char AMDGPUTargetVerifierLegacyPass::ID = 0; } // namespace llvm -INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverify", "AMDGPU Target Verifier", false, false) diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp index 992a0c91d93b1..170fc4769c1d8 100644 --- a/llvm/lib/Target/TargetVerifier.cpp +++ b/llvm/lib/Target/TargetVerifier.cpp @@ -20,6 +20,7 @@ #include "llvm/Target/TargetVerifier.h" #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" +#include "llvm/InitializePasses.h" #include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/Analysis/PostDominators.h" #include "llvm/Support/Debug.h" @@ -32,7 +33,22 @@ namespace llvm { -void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) { +bool TargetVerify::run(Function &F) { + if (TT.isAMDGPU()) { + AMDGPUTargetVerify TV(Mod); + TV.run(F); + + dbgs() << TV.MessagesStr.str(); + if (!TV.MessagesStr.str().empty()) { + TV.IsValid = false; + return false; + } + return true; + } + report_fatal_error("Target has no verification method\n"); +} + +bool TargetVerify::run(Function &F, FunctionAnalysisManager &AM) { if (TT.isAMDGPU()) { auto *UA = &AM.getResult(F); auto *DT = &AM.getResult(F); @@ -44,8 +60,77 @@ void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) { dbgs() << TV.MessagesStr.str(); if (!TV.MessagesStr.str().empty()) { TV.IsValid = false; + return false; + } + return true; + } + report_fatal_error("Target has no verification method\n"); +} + +PreservedAnalyses 
TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { + auto TT = F.getParent()->getTargetTriple(); + + if (TT.isAMDGPU()) { + auto *Mod = F.getParent(); + + auto UA = &AM.getResult(F); + auto *DT = &AM.getResult(F); + auto *PDT = &AM.getResult(F); + + AMDGPUTargetVerify TV(Mod, DT, PDT, UA); + TV.run(F); + + dbgs() << TV.MessagesStr.str(); + if (!TV.MessagesStr.str().empty()) { + TV.IsValid = false; + return PreservedAnalyses::none(); } + return PreservedAnalyses::all(); } + report_fatal_error("Target has no verification method\n"); } +struct TargetVerifierLegacyPass : public FunctionPass { + static char ID; + + std::unique_ptr TV; + + TargetVerifierLegacyPass() : FunctionPass(ID) { + initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + } + + bool doInitialization(Module &M) override { + TV = std::make_unique(&M); + return false; + } + + bool runOnFunction(Function &F) override { + if (!TV->run(F)) { + errs() << "in function " << F.getName() << '\n'; + report_fatal_error("broken function found, compilation aborted!"); + } + return false; + } + + bool doFinalization(Module &M) override { + bool IsValid = true; + for (Function &F : M) + if (F.isDeclaration()) + IsValid &= TV->run(F); + + if (!IsValid) + report_fatal_error("broken module found, compilation aborted!"); + return false; + } + + void getAnalysisUsage(AnalysisUsage &AU) const override { + AU.setPreservesAll(); + } +}; +char TargetVerifierLegacyPass::ID = 0; +FunctionPass *createTargetVerifierLegacyPass() { + return new TargetVerifierLegacyPass(); +} } // namespace llvm +using namespace llvm; +INITIALIZE_PASS(TargetVerifierLegacyPass, "tgtverifier", "Target Verifier", false, false) diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index a8f6b999af06e..4b95977a10c5f 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -57,6 +57,9 @@ static cl::opt DebugPM("debug-pass-manager", cl::Hidden, cl::desc("Print pass 
management debugging information")); +static cl::opt VerifyTarget("verify-tgt-new-pm", + cl::desc("Verify the target")); + bool LLCDiagnosticHandler::handleDiagnostics(const DiagnosticInfo &DI) { DiagnosticHandler::handleDiagnostics(DI); if (DI.getKind() == llvm::DK_SrcMgr) { @@ -127,7 +130,8 @@ int llvm::compileModuleWithNewPM( PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); PB.registerMachineFunctionAnalyses(MFAM); - PB.registerVerifierPasses(MPM, FPM); + if (VerifyTarget) + PB.registerVerifierPasses(MPM, FPM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); SI.registerCallbacks(PIC, &MAM, &FAM); diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp index 140459ba2de21..1fd8a9f9cd9f8 100644 --- a/llvm/tools/llc/llc.cpp +++ b/llvm/tools/llc/llc.cpp @@ -209,6 +209,8 @@ static cl::opt PassPipeline( static cl::alias PassPipeline2("p", cl::aliasopt(PassPipeline), cl::desc("Alias for -passes")); +static cl::opt VerifyTarget("verify-tgt", cl::desc("Verify the target")); + namespace { std::vector &getRunPassNames() { @@ -658,6 +660,8 @@ static int compileModule(char **argv, LLVMContext &Context) { // Build up all of the passes that we want to do to the module. 
legacy::PassManager PM; + if (VerifyTarget) + PM.add(createTargetVerifierLegacyPass()); PM.add(new TargetLibraryInfoWrapperPass(TLII)); { diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index b00bab66c6c3e..b86c2318b45b7 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -141,6 +141,7 @@ int main(int argc, char **argv) { ModuleAnalysisManager MAM; PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC); PB.registerModuleAnalyses(MAM); + //PB.registerVerifierPasses(MPM, FPM); PB.registerCGSCCAnalyses(CGAM); PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); >From 3013fc91155a7d84c73ac820fe6bc24c47dad38d Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 26 Apr 2025 00:13:42 -0400 Subject: [PATCH 14/30] Add fam in other projects. --- flang/lib/Frontend/FrontendActions.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter4/toy.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter5/toy.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter6/toy.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter7/toy.cpp | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c1f47b12abee2..7c48e35ff68cf 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -911,7 +911,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { std::optional pgoOpt; llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); - si.registerCallbacks(pic, &mam); + si.registerCallbacks(pic, &mam, &fam); if (ci.isTimingEnabled()) si.getTimePasses().setOutStream(ci.getTimingStreamLLVM()); pto.LoopUnrolling = opts.UnrollLoops; diff --git a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp index 0f58391c50667..f9664025f61f1 100644 --- 
a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp +++ b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp @@ -577,7 +577,7 @@ static void InitializeModuleAndManagers() { ThePIC = std::make_unique(); TheSI = std::make_unique(*TheContext, /*DebugLogging*/ true); - TheSI->registerCallbacks(*ThePIC, TheMAM.get()); + TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get()); // Add transform passes. // Do simple "peephole" optimizations and bit-twiddling optzns. diff --git a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp index 7117eaf4982b0..eae06d9f57467 100644 --- a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp +++ b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp @@ -851,7 +851,7 @@ static void InitializeModuleAndManagers() { ThePIC = std::make_unique(); TheSI = std::make_unique(*TheContext, /*DebugLogging*/ true); - TheSI->registerCallbacks(*ThePIC, TheMAM.get()); + TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get()); // Add transform passes. // Do simple "peephole" optimizations and bit-twiddling optzns. diff --git a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp index cb7b6cc8651c1..30ad79ef2fc58 100644 --- a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp +++ b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp @@ -970,7 +970,7 @@ static void InitializeModuleAndManagers() { ThePIC = std::make_unique(); TheSI = std::make_unique(*TheContext, /*DebugLogging*/ true); - TheSI->registerCallbacks(*ThePIC, TheMAM.get()); + TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get()); // Add transform passes. // Do simple "peephole" optimizations and bit-twiddling optzns. 
diff --git a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
index 91b7191a07c6f..4a39bc33c5591 100644
--- a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
@@ -1139,7 +1139,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique<PassInstrumentationCallbacks>();
   TheSI = std::make_unique<StandardInstrumentations>(*TheContext,
                                                      /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());

   // Add transform passes.
   // Promote allocas to registers.

>From 8745cd135bd27559429f158fc0d678a210af7292 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 26 Apr 2025 02:30:40 -0400
Subject: [PATCH 15/30] Avoid fatal errors in llc.

---
 llvm/include/llvm/CodeGen/Passes.h             |  2 +-
 llvm/lib/Target/TargetVerifier.cpp             | 18 +++++++++++++-----
 .../test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll |  2 +-
 .../test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll |  2 +-
 llvm/tools/llc/llc.cpp                         |  2 +-
 5 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index b293315e11c17..8d88d858c57ad 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -618,7 +618,7 @@ namespace llvm {
   /// Lowers KCFI operand bundles for indirect calls.
   FunctionPass *createKCFIPass();

-  FunctionPass *createTargetVerifierLegacyPass();
+  FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors);
 } // End llvm namespace

 #endif
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 170fc4769c1d8..3be50f4ef6da3 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -94,8 +94,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   static char ID;

   std::unique_ptr TV;
+  bool FatalErrors = false;

-  TargetVerifierLegacyPass() : FunctionPass(ID) {
+  TargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID),
+    FatalErrors(FatalErrors) {
     initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
   }

@@ -107,7 +109,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   bool runOnFunction(Function &F) override {
     if (!TV->run(F)) {
       errs() << "in function " << F.getName() << '\n';
-      report_fatal_error("broken function found, compilation aborted!");
+      if (FatalErrors)
+        report_fatal_error("broken function found, compilation aborted!");
+      else
+        errs() << "broken function found, compilation aborted!\n";
     }
     return false;
   }
@@ -119,7 +124,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
         IsValid &= TV->run(F);

     if (!IsValid)
-      report_fatal_error("broken module found, compilation aborted!");
+      if (FatalErrors)
+        report_fatal_error("broken module found, compilation aborted!");
+      else
+        errs() << "broken module found, compilation aborted!\n";
     return false;
   }

@@ -128,8 +136,8 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   }
 };
 char TargetVerifierLegacyPass::ID = 0;
-FunctionPass *createTargetVerifierLegacyPass() {
-  return new TargetVerifierLegacyPass();
+FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors) {
+  return new TargetVerifierLegacyPass(FatalErrors);
 }
 } // namespace llvm
 using namespace llvm;
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
index c5e59d4a2369e..e2d9edda5d008 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm -o - < %s 2>&1 | FileCheck %s
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-tgt -o - < %s 2>&1 | FileCheck %s

 define amdgpu_cs i32 @nonvoid_shader() {
 ; CHECK: Shaders must return void
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
index 8a503b7624a73..a2dab0ff47924 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm %s -o - 2>&1 | FileCheck %s --allow-empty
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-tgt %s -o - 2>&1 | FileCheck %s --allow-empty

 define amdgpu_cs void @void_shader() {
 ; CHECK-NOT: Shaders must return void
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 1fd8a9f9cd9f8..329d95826551f 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -661,7 +661,7 @@ static int compileModule(char **argv, LLVMContext &Context) {
   // Build up all of the passes that we want to do to the module.
   legacy::PassManager PM;
   if (VerifyTarget)
-    PM.add(createTargetVerifierLegacyPass());
+    PM.add(createTargetVerifierLegacyPass(false));
   PM.add(new TargetLibraryInfoWrapperPass(TLII));

   {

>From c7bf730193e39bf838a29de7617d31a900bbc576 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 26 Apr 2025 03:40:47 -0400
Subject: [PATCH 16/30] Add tool to build/test.

---
 llvm/test/CMakeLists.txt                      |  1 +
 llvm/test/lit.cfg.py                          |  1 +
 llvm/utils/gn/secondary/llvm/test/BUILD.gn    |  1 +
 .../llvm/tools/llvm-tgt-verify/BUILD.gn       | 25 +++++++++++++++++++
 4 files changed, 28 insertions(+)
 create mode 100644 llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn

diff --git a/llvm/test/CMakeLists.txt b/llvm/test/CMakeLists.txt
index 66849002eb470..10ca9300e7c66 100644
--- a/llvm/test/CMakeLists.txt
+++ b/llvm/test/CMakeLists.txt
@@ -135,6 +135,7 @@ set(LLVM_TEST_DEPENDS
           llvm-strip
           llvm-symbolizer
           llvm-tblgen
+          llvm-tgt-verify
           llvm-readtapi
           llvm-tli-checker
           llvm-undname
diff --git a/llvm/test/lit.cfg.py b/llvm/test/lit.cfg.py
index aad7a088551b2..8620f2a7014b5 100644
--- a/llvm/test/lit.cfg.py
+++ b/llvm/test/lit.cfg.py
@@ -227,6 +227,7 @@ def get_asan_rtlib():
         "llvm-strings",
         "llvm-strip",
         "llvm-tblgen",
+        "llvm-tgt-verify",
         "llvm-readtapi",
         "llvm-undname",
         "llvm-windres",
diff --git a/llvm/utils/gn/secondary/llvm/test/BUILD.gn b/llvm/utils/gn/secondary/llvm/test/BUILD.gn
index 228642667b41d..157e7991c52a8 100644
--- a/llvm/utils/gn/secondary/llvm/test/BUILD.gn
+++ b/llvm/utils/gn/secondary/llvm/test/BUILD.gn
@@ -319,6 +319,7 @@ group("test") {
     "//llvm/tools/llvm-strings",
     "//llvm/tools/llvm-symbolizer:symlinks",
     "//llvm/tools/llvm-tli-checker",
+    "//llvm/tools/llvm-tgt-verify",
     "//llvm/tools/llvm-undname",
     "//llvm/tools/llvm-xray",
     "//llvm/tools/lto",
diff --git a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn
new file mode 100644
index 0000000000000..b751bafc5052c
--- /dev/null
+++ b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn
@@ -0,0 +1,25 @@
+import("//llvm/utils/TableGen/tablegen.gni")
+
+tgtverifier("llvm-tgt-verify") {
+  deps = [
+    "//llvm/lib/Analysis",
+    "//llvm/lib/AsmPrinter",
+    "//llvm/lib/CodeGen",
+    "//llvm/lib/CodeGenTypes",
+    "//llvm/lib/Core",
+    "//llvm/lib/IRPrinter",
+    "//llvm/lib/IRReader",
+    "//llvm/lib/MC",
+    "//llvm/lib/MIRParser",
+    "//llvm/lib/Passes",
+    "//llvm/lib/Remarks",
+    "//llvm/lib/ScalarOpts",
+    "//llvm/lib/SelectionDAG",
+    "//llvm/lib/Support",
+    "//llvm/lib/Target",
+    "//llvm/lib/TargetParser",
+    "//llvm/lib/TransformUtils",
+    "//llvm/lib/Vectorize",
+  ]
+  sources = [ "llvm-tgt-verify.cpp" ]
+}

>From c8dd3db3fe078f76e822a9646d3d7295fa23752a Mon Sep 17 00:00:00 2001
From: jofernau
Date: Mon, 28 Apr 2025 10:42:24 -0400
Subject: [PATCH 17/30] Cleanup of unrequired functions.

---
 llvm/include/llvm/Target/TargetVerifier.h     |  1 -
 .../TargetVerify/AMDGPUTargetVerifier.h       |  1 -
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp    | 25 +++----------------
 llvm/lib/Target/TargetVerifier.cpp            | 22 ++--------------
 4 files changed, 6 insertions(+), 43 deletions(-)

diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index 23ef2e0b8d4ef..427a05b2648a9 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -77,7 +77,6 @@ class TargetVerify {
         MessagesStr(Messages) {}

   bool run(Function &F);
-  bool run(Function &F, FunctionAnalysisManager &AM);
 };

 } // namespace llvm
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
index b6a7412e8c1ef..74e5b5f7a1efd 100644
--- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -32,7 +32,6 @@ class Function;

 class AMDGPUTargetVerifierPass : public TargetVerifierPass {
 public:
-  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
 };

 class AMDGPUTargetVerify : public TargetVerify {
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 96bcaaf6f2ac9..bda412f723242 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -107,29 +107,12 @@ bool AMDGPUTargetVerify::run(Function &F) {
     }
   }

-  if (!MessagesStr.str().empty())
+  //dbgs() << MessagesStr.str();
+  if (!MessagesStr.str().empty()) {
+    //IsValid = false;
     return false;
-  return true;
-}
-
-PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
-
-  auto *Mod = F.getParent();
-
-  auto UA = &AM.getResult(F);
-  auto *DT = &AM.getResult(F);
-  auto *PDT = &AM.getResult(F);
-
-  AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
-  TV.run(F);
-
-  dbgs() << TV.MessagesStr.str();
-  if (!TV.MessagesStr.str().empty()) {
-    TV.IsValid = false;
-    return PreservedAnalyses::none();
   }
-
-  return PreservedAnalyses::all();
+  return true;
 }

 } // namespace llvm
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 3be50f4ef6da3..6b57c18ff9316 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -48,25 +48,6 @@ bool TargetVerify::run(Function &F) {
   report_fatal_error("Target has no verification method\n");
 }

-bool TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
-  if (TT.isAMDGPU()) {
-    auto *UA = &AM.getResult(F);
-    auto *DT = &AM.getResult(F);
-    auto *PDT = &AM.getResult(F);
-
-    AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
-    TV.run(F);
-
-    dbgs() << TV.MessagesStr.str();
-    if (!TV.MessagesStr.str().empty()) {
-      TV.IsValid = false;
-      return false;
-    }
-    return true;
-  }
-  report_fatal_error("Target has no verification method\n");
-}
-
 PreservedAnalyses TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
   auto TT = F.getParent()->getTargetTriple();

@@ -123,11 +104,12 @@ struct TargetVerifierLegacyPass : public FunctionPass {
       if (F.isDeclaration())
         IsValid &= TV->run(F);

-    if (!IsValid)
+    if (!IsValid) {
       if (FatalErrors)
         report_fatal_error("broken module found, compilation aborted!");
       else
         errs() << "broken module found, compilation aborted!\n";
+    }
     return false;
   }

>From 2c12e6a6d7f9a1cb7bcebfb30ccdd0fe7b198727 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Mon, 28 Apr 2025 10:43:32 -0400
Subject: [PATCH 18/30] Make virtual.
---
 llvm/include/llvm/Target/TargetVerifier.h                    | 2 +-
 llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index 427a05b2648a9..ade2676a64325 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -76,7 +76,7 @@ class TargetVerify {
       : Mod(Mod), TT(Mod->getTargetTriple()),
         MessagesStr(Messages) {}

-  bool run(Function &F);
+  virtual bool run(Function &F);
 };

 } // namespace llvm
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
index 74e5b5f7a1efd..b97fbc046e391 100644
--- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -48,7 +48,7 @@ class AMDGPUTargetVerify : public TargetVerify {
   AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA)
     : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {}

-  bool run(Function &F);
+  bool run(Function &F) override;
 };

 } // namespace llvm

>From 3267b65e82da4cb7bc0f31f74c76f78d0445512f Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 10:56:43 -0400
Subject: [PATCH 19/30] Remove from legacy PM. Add to target dependent pipeline.
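
For readers skimming the series: patches 18 and 19 move the verifier from a triple-dispatching base implementation to a virtual `run` that each target overrides (made pure virtual in patch 19). The general pattern can be sketched as below. This is a minimal illustration under stated assumptions, not code from the patch: `Function`, `VerifierBase`, `ExampleVerifier` and `verify` are hypothetical stand-ins, and no LLVM headers are used.

```cpp
// Illustrative sketch only: Function, VerifierBase and ExampleVerifier are
// hypothetical stand-ins, not the LLVM classes touched by this series.
#include <sstream>
#include <string>

struct Function {
  std::string Name;
  bool HasUnsupportedOp; // stand-in for whatever a real target would reject
};

// Abstract base: run() is pure virtual, so every target must provide its own
// check and the base class no longer dispatches on the target triple.
class VerifierBase {
public:
  std::ostringstream Messages; // diagnostics accumulated by run()
  bool IsValid = true;

  virtual ~VerifierBase() = default;
  virtual bool run(const Function &F) = 0;
};

class ExampleVerifier : public VerifierBase {
public:
  bool run(const Function &F) override {
    if (F.HasUnsupportedOp)
      Messages << "unsupported operation in " << F.Name << '\n';
    // Same convention as the diff: any recorded message means failure.
    if (!Messages.str().empty()) {
      IsValid = false;
      return false;
    }
    return true;
  }
};

// Callers hold the base type and rely on virtual dispatch.
bool verify(VerifierBase &V, const Function &F) { return V.run(F); }
```

Because messages accumulate in the verifier object, one instance that has already reported a failure keeps returning `false` in this sketch; a caller wanting per-function results would use a fresh instance or clear the stream.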
---
 llvm/include/llvm/CodeGen/Passes.h            |  2 +-
 llvm/include/llvm/InitializePasses.h          |  2 +-
 llvm/include/llvm/Target/TargetVerifier.h     |  4 +-
 .../TargetVerify/AMDGPUTargetVerifier.h       |  1 +
 llvm/lib/Passes/StandardInstrumentations.cpp  | 10 +--
 llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def |  2 +-
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |  2 +
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp    | 74 ++++++++++++++-
 llvm/lib/Target/TargetVerifier.cpp            | 90 -------------------
 llvm/tools/llc/llc.cpp                        |  2 -
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp |  2 -
 11 files changed, 85 insertions(+), 106 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index 8d88d858c57ad..da6ad3f612aa8 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -618,7 +618,7 @@ namespace llvm {
   /// Lowers KCFI operand bundles for indirect calls.
   FunctionPass *createKCFIPass();

-  FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors);
+  //FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors);
 } // End llvm namespace

 #endif
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index 3f9ffc4efd9ec..7d4fad2d87a16 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -307,7 +307,7 @@ void initializeTailDuplicateLegacyPass(PassRegistry &);
 void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &);
 void initializeTargetPassConfigPass(PassRegistry &);
 void initializeTargetTransformInfoWrapperPassPass(PassRegistry &);
-void initializeTargetVerifierLegacyPassPass(PassRegistry &);
+//void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &);
 void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &);
 void initializeTypeBasedAAWrapperPassPass(PassRegistry &);
 void initializeTypePromotionLegacyPass(PassRegistry &);
diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index ade2676a64325..1d12eb55bbf0a 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -30,7 +30,7 @@ class Function; class TargetVerifierPass : public PassInfoMixin { public: - PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); + virtual PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) = 0; }; class TargetVerify { @@ -76,7 +76,7 @@ class TargetVerify { : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} - virtual bool run(Function &F); + virtual bool run(Function &F) = 0; }; } // namespace llvm diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index b97fbc046e391..49bcbc8849e3c 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -32,6 +32,7 @@ class Function; class AMDGPUTargetVerifierPass : public TargetVerifierPass { public: + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) override; }; class AMDGPUTargetVerify : public TargetVerify { diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index f125b3daffd5e..076df47d5b15d 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -45,7 +45,7 @@ #include "llvm/Support/Signals.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Support/xxhash.h" -#include "llvm/Target/TargetVerifier.h" +//#include "llvm/Target/TargetVerifier.h" #include #include #include @@ -1479,12 +1479,12 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, P)); if (VerifyTargetEach && FAM) { - TargetVerify TV(const_cast(F->getParent())); - TV.run(*const_cast(F), *FAM); - if (!TV.IsValid) + //TargetVerify TV(const_cast(F->getParent())); + //TV.run(*const_cast(F), *FAM); + /*if (!TV.IsValid) 
report_fatal_error(formatv("Broken function found after pass " "\"{0}\", compilation aborted!", - P)); + P));*/ } } else { const auto *M = unwrapIR(IR); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def index 73f9c60cf588c..41e6a399c7239 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def +++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def @@ -88,7 +88,7 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA()) #define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) #endif VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass()) -VERIFIER_FUNCTION_ANALYSIS("tgtverifier", TargetVerifierPass()) +VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass()) #undef VERIFIER_MODULE_ANALYSIS #undef VERIFIER_FUNCTION_ANALYSIS diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 257cc724b3da9..f1a60b8f33140 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -1976,6 +1976,8 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder( } void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { + addPass(AMDGPUTargetVerifierPass()); + if (RemoveIncompatibleFunctions && TM.getTargetTriple().isAMDGCN()) addPass(AMDGPURemoveIncompatibleFunctionsPass(TM)); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index bda412f723242..cedd9ddc78011 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -107,12 +107,82 @@ bool AMDGPUTargetVerify::run(Function &F) { } } - //dbgs() << MessagesStr.str(); + dbgs() << MessagesStr.str(); if (!MessagesStr.str().empty()) { - //IsValid = false; + IsValid = false; return false; } return true; } +PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { + auto *Mod = F.getParent(); + + auto UA = 
&AM.getResult(F); + auto *DT = &AM.getResult(F); + auto *PDT = &AM.getResult(F); + + AMDGPUTargetVerify TV(Mod, DT, PDT, UA); + TV.run(F); + + dbgs() << TV.MessagesStr.str(); + if (!TV.MessagesStr.str().empty()) { + TV.IsValid = false; + return PreservedAnalyses::none(); + } + return PreservedAnalyses::all(); +} + +/* +struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { + static char ID; + + std::unique_ptr TV; + bool FatalErrors = false; + + AMDGPUTargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID), + FatalErrors(FatalErrors) { + initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + } + + bool doInitialization(Module &M) override { + TV = std::make_unique(&M); + return false; + } + + bool runOnFunction(Function &F) override { + if (!TV->run(F)) { + errs() << "in function " << F.getName() << '\n'; + if (FatalErrors) + report_fatal_error("broken function found, compilation aborted!"); + else + errs() << "broken function found, compilation aborted!\n"; + } + return false; + } + + bool doFinalization(Module &M) override { + bool IsValid = true; + for (Function &F : M) + if (F.isDeclaration()) + IsValid &= TV->run(F); + + if (!IsValid) { + if (FatalErrors) + report_fatal_error("broken module found, compilation aborted!"); + else + errs() << "broken module found, compilation aborted!\n"; + } + return false; + } + + void getAnalysisUsage(AnalysisUsage &AU) const override { + AU.setPreservesAll(); + } +}; +char AMDGPUTargetVerifierLegacyPass::ID = 0; +FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) { + return new AMDGPUTargetVerifierLegacyPass(FatalErrors); +}*/ } // namespace llvm +//INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false) diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp index 6b57c18ff9316..c63ae2a2c5daf 100644 --- a/llvm/lib/Target/TargetVerifier.cpp +++ b/llvm/lib/Target/TargetVerifier.cpp @@ 
-33,94 +33,4 @@ namespace llvm { -bool TargetVerify::run(Function &F) { - if (TT.isAMDGPU()) { - AMDGPUTargetVerify TV(Mod); - TV.run(F); - - dbgs() << TV.MessagesStr.str(); - if (!TV.MessagesStr.str().empty()) { - TV.IsValid = false; - return false; - } - return true; - } - report_fatal_error("Target has no verification method\n"); -} - -PreservedAnalyses TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { - auto TT = F.getParent()->getTargetTriple(); - - if (TT.isAMDGPU()) { - auto *Mod = F.getParent(); - - auto UA = &AM.getResult(F); - auto *DT = &AM.getResult(F); - auto *PDT = &AM.getResult(F); - - AMDGPUTargetVerify TV(Mod, DT, PDT, UA); - TV.run(F); - - dbgs() << TV.MessagesStr.str(); - if (!TV.MessagesStr.str().empty()) { - TV.IsValid = false; - return PreservedAnalyses::none(); - } - return PreservedAnalyses::all(); - } - report_fatal_error("Target has no verification method\n"); -} - -struct TargetVerifierLegacyPass : public FunctionPass { - static char ID; - - std::unique_ptr TV; - bool FatalErrors = false; - - TargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID), - FatalErrors(FatalErrors) { - initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); - } - - bool doInitialization(Module &M) override { - TV = std::make_unique(&M); - return false; - } - - bool runOnFunction(Function &F) override { - if (!TV->run(F)) { - errs() << "in function " << F.getName() << '\n'; - if (FatalErrors) - report_fatal_error("broken function found, compilation aborted!"); - else - errs() << "broken function found, compilation aborted!\n"; - } - return false; - } - - bool doFinalization(Module &M) override { - bool IsValid = true; - for (Function &F : M) - if (F.isDeclaration()) - IsValid &= TV->run(F); - - if (!IsValid) { - if (FatalErrors) - report_fatal_error("broken module found, compilation aborted!"); - else - errs() << "broken module found, compilation aborted!\n"; - } - return false; - } - - void getAnalysisUsage(AnalysisUsage 
&AU) const override { - AU.setPreservesAll(); - } -}; -char TargetVerifierLegacyPass::ID = 0; -FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors) { - return new TargetVerifierLegacyPass(FatalErrors); -} } // namespace llvm -using namespace llvm; -INITIALIZE_PASS(TargetVerifierLegacyPass, "tgtverifier", "Target Verifier", false, false) diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp index 329d95826551f..2e9e4837fe467 100644 --- a/llvm/tools/llc/llc.cpp +++ b/llvm/tools/llc/llc.cpp @@ -660,8 +660,6 @@ static int compileModule(char **argv, LLVMContext &Context) { // Build up all of the passes that we want to do to the module. legacy::PassManager PM; - if (VerifyTarget) - PM.add(createTargetVerifierLegacyPass(false)); PM.add(new TargetLibraryInfoWrapperPass(TLII)); { diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index b86c2318b45b7..d832dcdff4ad0 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -158,8 +158,6 @@ int main(int argc, char **argv) { if (TT.isAMDGPU()) FPM.addPass(AMDGPUTargetVerifierPass()); else if (false) {} // ... - else - FPM.addPass(TargetVerifierPass()); MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); auto PA = MPM.run(*M, MAM); >From 6401b7517843a03ab114aaf333624ef914d5a5f3 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 11:18:50 -0400 Subject: [PATCH 20/30] Add back to legacy PM. 
--- llvm/include/llvm/CodeGen/Passes.h | 2 -- llvm/include/llvm/InitializePasses.h | 1 - llvm/lib/Target/AMDGPU/AMDGPU.h | 3 +++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 1 + llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 8 ++++---- 5 files changed, 8 insertions(+), 7 deletions(-) diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index da6ad3f612aa8..d214ab9306c2f 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -617,8 +617,6 @@ namespace llvm { /// Lowers KCFI operand bundles for indirect calls. FunctionPass *createKCFIPass(); - - //FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors); } // End llvm namespace #endif diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 7d4fad2d87a16..9bef8e496c57e 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -307,7 +307,6 @@ void initializeTailDuplicateLegacyPass(PassRegistry &); void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &); void initializeTargetPassConfigPass(PassRegistry &); void initializeTargetTransformInfoWrapperPassPass(PassRegistry &); -//void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &); void initializeTypeBasedAAWrapperPassPass(PassRegistry &); void initializeTypePromotionLegacyPass(PassRegistry &); diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h index 4ff761ec19b3c..f69956ba44255 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPU.h +++ b/llvm/lib/Target/AMDGPU/AMDGPU.h @@ -530,6 +530,9 @@ extern char &GCNRewritePartialRegUsesID; void initializeAMDGPUWaitSGPRHazardsLegacyPass(PassRegistry &); extern char &AMDGPUWaitSGPRHazardsLegacyID; +FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors); +void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); + namespace AMDGPU { enum 
TargetIndex { TI_CONSTDATA_START, diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index f1a60b8f33140..42d6764eacda9 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -1377,6 +1377,7 @@ bool AMDGPUPassConfig::addGCPasses() { //===----------------------------------------------------------------------===// bool GCNPassConfig::addPreISel() { + addPass(createAMDGPUTargetVerifierLegacyPass(false)); AMDGPUPassConfig::addPreISel(); if (TM->getOptLevel() > CodeGenOptLevel::None) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index cedd9ddc78011..c4d303bee6ef8 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -17,6 +17,7 @@ //// ////===----------------------------------------------------------------------===// +#include "AMDGPU.h" #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "llvm/Analysis/UniformityAnalysis.h" @@ -24,7 +25,7 @@ #include "llvm/Support/Debug.h" #include "llvm/IR/Dominators.h" #include "llvm/IR/Function.h" -#include "llvm/InitializePasses.h" +//#include "llvm/InitializePasses.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/IntrinsicsAMDGPU.h" #include "llvm/IR/Module.h" @@ -133,7 +134,6 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan return PreservedAnalyses::all(); } -/* struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { static char ID; @@ -183,6 +183,6 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { char AMDGPUTargetVerifierLegacyPass::ID = 0; FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) { return new AMDGPUTargetVerifierLegacyPass(FatalErrors); -}*/ +} } // namespace llvm -//INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false) 
+INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false)

>From e2f0225db1439f7d8ee612ee4c4d37a4b44f96b6 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 14:04:10 -0400
Subject: [PATCH 21/30] Remove reference to FAM in registerCallbacks and VerifyEach for TargetVerify in instrumentation
---
 clang/lib/CodeGen/BackendUtil.cpp             |  2 +-
 flang/lib/Frontend/FrontendActions.cpp        |  2 +-
 llvm/examples/Kaleidoscope/Chapter4/toy.cpp   |  2 +-
 llvm/examples/Kaleidoscope/Chapter5/toy.cpp   |  2 +-
 llvm/examples/Kaleidoscope/Chapter6/toy.cpp   |  2 +-
 llvm/examples/Kaleidoscope/Chapter7/toy.cpp   |  2 +-
 .../llvm/Passes/StandardInstrumentations.h    |  6 ++----
 llvm/lib/LTO/LTOBackend.cpp                   |  2 +-
 llvm/lib/LTO/ThinLTOCodeGenerator.cpp         |  2 +-
 llvm/lib/Passes/PassBuilderBindings.cpp       |  2 +-
 llvm/lib/Passes/StandardInstrumentations.cpp  | 21 ++++---------------
 llvm/tools/llc/NewPMDriver.cpp                |  7 ++++---
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp |  2 +-
 llvm/tools/opt/NewPMDriver.cpp                |  2 +-
 llvm/unittests/IR/PassManagerTest.cpp         |  6 +++---
 15 files changed, 24 insertions(+), 38 deletions(-)

diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 9a1c922f5ddef..f7eb853beb23c 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -922,7 +922,7 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
       TheModule->getContext(),
       (CodeGenOpts.DebugPassManager || DebugPassStructure),
       CodeGenOpts.VerifyEach, PrintPassOpts);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   PassBuilder PB(TM.get(), PTO, PGOOpt, &PIC);

   // Handle the assignment tracking feature options.
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 7c48e35ff68cf..c1f47b12abee2 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -911,7 +911,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   std::optional<llvm::PGOOptions> pgoOpt;
   llvm::StandardInstrumentations si(llvmModule->getContext(),
                                     opts.DebugPassManager);
-  si.registerCallbacks(pic, &mam, &fam);
+  si.registerCallbacks(pic, &mam);
   if (ci.isTimingEnabled())
     si.getTimePasses().setOutStream(ci.getTimingStreamLLVM());
   pto.LoopUnrolling = opts.UnrollLoops;
diff --git a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
index f9664025f61f1..0f58391c50667 100644
--- a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
@@ -577,7 +577,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique<PassInstrumentationCallbacks>();
   TheSI = std::make_unique<StandardInstrumentations>(*TheContext,
                                                      /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get());

   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
index eae06d9f57467..7117eaf4982b0 100644
--- a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
@@ -851,7 +851,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique<PassInstrumentationCallbacks>();
   TheSI = std::make_unique<StandardInstrumentations>(*TheContext,
                                                      /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get());

   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
index 30ad79ef2fc58..cb7b6cc8651c1 100644
--- a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
@@ -970,7 +970,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique<PassInstrumentationCallbacks>();
   TheSI = std::make_unique<StandardInstrumentations>(*TheContext,
                                                      /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get());

   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
index 4a39bc33c5591..91b7191a07c6f 100644
--- a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
@@ -1139,7 +1139,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique<PassInstrumentationCallbacks>();
   TheSI = std::make_unique<StandardInstrumentations>(*TheContext,
                                                      /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get());

   // Add transform passes.
   // Promote allocas to registers.
diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h
index 988fcb93b2357..65934c93ba614 100644
--- a/llvm/include/llvm/Passes/StandardInstrumentations.h
+++ b/llvm/include/llvm/Passes/StandardInstrumentations.h
@@ -476,8 +476,7 @@ class VerifyInstrumentation {
 public:
   VerifyInstrumentation(bool DebugLogging) : DebugLogging(DebugLogging) {}
   void registerCallbacks(PassInstrumentationCallbacks &PIC,
-                         ModuleAnalysisManager *MAM,
-                         FunctionAnalysisManager *FAM);
+                         ModuleAnalysisManager *MAM);
 };

 /// This class implements --time-trace functionality for new pass manager.
@@ -622,8 +621,7 @@ class StandardInstrumentations {
   // Register all the standard instrumentation callbacks. If \p FAM is nullptr
   // then PreservedCFGChecker is not enabled.
   void registerCallbacks(PassInstrumentationCallbacks &PIC,
-                         ModuleAnalysisManager *MAM,
-                         FunctionAnalysisManager *FAM);
+                         ModuleAnalysisManager *MAM);

   TimePassesHandler &getTimePasses() { return TimePasses; }
 };
diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp
index 475e7cf45371b..1c764a0188eda 100644
--- a/llvm/lib/LTO/LTOBackend.cpp
+++ b/llvm/lib/LTO/LTOBackend.cpp
@@ -275,7 +275,7 @@ static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM,
   PassInstrumentationCallbacks PIC;
   StandardInstrumentations SI(Mod.getContext(), Conf.DebugPassManager,
                               Conf.VerifyEach);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   PassBuilder PB(TM, Conf.PTO, PGOOpt, &PIC);

   RegisterPassPlugins(Conf.PassPlugins, PB);
diff --git a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp
index 369b003df1364..9e7f8187fe49c 100644
--- a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp
+++ b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp
@@ -245,7 +245,7 @@ static void optimizeModule(Module &TheModule, TargetMachine &TM,

   PassInstrumentationCallbacks PIC;
   StandardInstrumentations SI(TheModule.getContext(), DebugPassManager);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   PipelineTuningOptions PTO;
   PTO.LoopVectorization = true;
   PTO.SLPVectorization = true;
diff --git a/llvm/lib/Passes/PassBuilderBindings.cpp b/llvm/lib/Passes/PassBuilderBindings.cpp
index f0e1abb8cebc4..933fe89e53a94 100644
--- a/llvm/lib/Passes/PassBuilderBindings.cpp
+++ b/llvm/lib/Passes/PassBuilderBindings.cpp
@@ -76,7 +76,7 @@ static LLVMErrorRef runPasses(Module *Mod, Function *Fun, const char *Passes,
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

   StandardInstrumentations SI(Mod->getContext(), Debug, VerifyEach);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);

   // Run the pipeline.
if (Fun) { diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index 076df47d5b15d..dc1dd5d9c7f4c 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -45,7 +45,6 @@ #include "llvm/Support/Signals.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Support/xxhash.h" -//#include "llvm/Target/TargetVerifier.h" #include #include #include @@ -62,8 +61,6 @@ static cl::opt VerifyAnalysisInvalidation("verify-analysis-invalidation", #endif ); -static cl::opt VerifyTargetEach("verify-tgt-each"); - // An option that supports the -print-changed option. See // the description for -print-changed for an explanation of the use // of this option. Note that this option has no effect without -print-changed. @@ -1457,10 +1454,9 @@ void PreservedCFGCheckerInstrumentation::registerCallbacks( } void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM, - FunctionAnalysisManager *FAM) { + ModuleAnalysisManager *MAM) { PIC.registerAfterPassCallback( - [this, MAM, FAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { + [this, MAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { if (isIgnored(P) || P == "VerifierPass") return; const auto *F = unwrapIR(IR); @@ -1477,15 +1473,6 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, report_fatal_error(formatv("Broken function found after pass " "\"{0}\", compilation aborted!", P)); - - if (VerifyTargetEach && FAM) { - //TargetVerify TV(const_cast(F->getParent())); - //TV.run(*const_cast(F), *FAM); - /*if (!TV.IsValid) - report_fatal_error(formatv("Broken function found after pass " - "\"{0}\", compilation aborted!", - P));*/ - } } else { const auto *M = unwrapIR(IR); if (!M) { @@ -2525,7 +2512,7 @@ void PrintCrashIRInstrumentation::registerCallbacks( } void StandardInstrumentations::registerCallbacks( - PassInstrumentationCallbacks 
&PIC, ModuleAnalysisManager *MAM, FunctionAnalysisManager *FAM) { + PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM) { PrintIR.registerCallbacks(PIC); PrintPass.registerCallbacks(PIC); TimePasses.registerCallbacks(PIC); @@ -2534,7 +2521,7 @@ void StandardInstrumentations::registerCallbacks( PrintChangedIR.registerCallbacks(PIC); PseudoProbeVerification.registerCallbacks(PIC); if (VerifyEach) - Verify.registerCallbacks(PIC, MAM, FAM); + Verify.registerCallbacks(PIC, MAM); PrintChangedDiff.registerCallbacks(PIC); WebsiteChangeReporter.registerCallbacks(PIC); ChangeTester.registerCallbacks(PIC); diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index 4b95977a10c5f..863a555798dab 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -117,8 +117,6 @@ int llvm::compileModuleWithNewPM( VK == VerifierKind::EachPass); registerCodeGenCallback(PIC, *Target); - ModulePassManager MPM; - FunctionPassManager FPM; MachineFunctionAnalysisManager MFAM; LoopAnalysisManager LAM; FunctionAnalysisManager FAM; @@ -133,11 +131,14 @@ int llvm::compileModuleWithNewPM( if (VerifyTarget) PB.registerVerifierPasses(MPM, FPM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); MAM.registerPass([&] { return MachineModuleAnalysis(MMI); }); + ModulePassManager MPM; + FunctionPassManager FPM; + if (!PassPipeline.empty()) { // Construct a custom pass pipeline that starts after instruction // selection. 
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
index d832dcdff4ad0..50f4e56bb6af6 100644
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
@@ -148,7 +148,7 @@ int main(int argc, char **argv) {
   PB.registerMachineFunctionAnalyses(MFAM);
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);

-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);

   Triple TT(M->getTargetTriple());
   if (!NoLint)
diff --git a/llvm/tools/opt/NewPMDriver.cpp b/llvm/tools/opt/NewPMDriver.cpp
index a8977d80bdf44..7d168a6ceb17c 100644
--- a/llvm/tools/opt/NewPMDriver.cpp
+++ b/llvm/tools/opt/NewPMDriver.cpp
@@ -423,7 +423,7 @@ bool llvm::runPassPipeline(
   PrintPassOpts.SkipAnalyses = DebugPM == DebugLogging::Quiet;
   StandardInstrumentations SI(M.getContext(), DebugPM != DebugLogging::None,
                               VK == VerifierKind::EachPass, PrintPassOpts);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   DebugifyEachInstrumentation Debugify;
   DebugifyStatsMap DIStatsMap;
   DebugInfoPerPass DebugInfoBeforePass;
diff --git a/llvm/unittests/IR/PassManagerTest.cpp b/llvm/unittests/IR/PassManagerTest.cpp
index bb4db6120035f..a6487169224c2 100644
--- a/llvm/unittests/IR/PassManagerTest.cpp
+++ b/llvm/unittests/IR/PassManagerTest.cpp
@@ -828,7 +828,7 @@ TEST_F(PassManagerTest, FunctionPassCFGChecker) {
   FunctionPassManager FPM;
   PassInstrumentationCallbacks PIC;
   StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });
   MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); });
   FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });
@@ -877,7 +877,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerInvalidateAnalysis) {
   FunctionPassManager FPM;
   PassInstrumentationCallbacks PIC;
StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -945,7 +945,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerWrapped) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); >From b43cec12bbfc6071d4a99e75aad4273bab4e3182 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 14:44:28 -0400 Subject: [PATCH 22/30] Remove references to registry --- llvm/include/llvm/Passes/PassBuilder.h | 21 ------------------- .../llvm/Passes/StandardInstrumentations.h | 2 +- .../llvm/Passes/TargetPassRegistry.inc | 12 ----------- llvm/lib/Passes/PassBuilder.cpp | 7 ------- llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 11 ---------- llvm/tools/llc/NewPMDriver.cpp | 5 ----- llvm/tools/llc/llc.cpp | 2 -- 7 files changed, 1 insertion(+), 59 deletions(-) diff --git a/llvm/include/llvm/Passes/PassBuilder.h b/llvm/include/llvm/Passes/PassBuilder.h index 6000769ce723b..51ccaa53447d7 100644 --- a/llvm/include/llvm/Passes/PassBuilder.h +++ b/llvm/include/llvm/Passes/PassBuilder.h @@ -172,13 +172,6 @@ class PassBuilder { /// additional analyses. void registerLoopAnalyses(LoopAnalysisManager &LAM); - /// Registers all available verifier passes. - /// - /// This is an interface that can be used to populate a - /// \c ModuleAnalysisManager with all registered loop analyses. 
Callers can - /// still manually register any additional analyses. - void registerVerifierPasses(ModulePassManager &PM, FunctionPassManager &); - /// Registers all available machine function analysis passes. /// /// This is an interface that can be used to populate a \c @@ -577,15 +570,6 @@ class PassBuilder { } /// @}} - /// Register a callback for parsing an Verifier Name to populate - /// the given managers. - void registerVerifierCallback( - const std::function &C, - const std::function &CF) { - VerifierCallbacks.push_back(C); - FnVerifierCallbacks.push_back(CF); - } - /// {{@ Register pipeline parsing callbacks with this pass builder instance. /// Using these callbacks, callers can parse both a single pass name, as well /// as entire sub-pipelines, and populate the PassManager instance @@ -857,11 +841,6 @@ class PassBuilder { // Callbacks to parse `filter` parameter in register allocation passes SmallVector, 2> RegClassFilterParsingCallbacks; - // Verifier callbacks - SmallVector, 2> - VerifierCallbacks; - SmallVector, 2> - FnVerifierCallbacks; }; /// This utility template takes care of adding require<> and invalidate<> diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h index 65934c93ba614..f7a65a88ecf5b 100644 --- a/llvm/include/llvm/Passes/StandardInstrumentations.h +++ b/llvm/include/llvm/Passes/StandardInstrumentations.h @@ -621,7 +621,7 @@ class StandardInstrumentations { // Register all the standard instrumentation callbacks. If \p FAM is nullptr // then PreservedCFGChecker is not enabled. 
void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM); + ModuleAnalysisManager *MAM = nullptr); TimePassesHandler &getTimePasses() { return TimePasses; } }; diff --git a/llvm/include/llvm/Passes/TargetPassRegistry.inc b/llvm/include/llvm/Passes/TargetPassRegistry.inc index 2d04b874cf360..521913cb25a4a 100644 --- a/llvm/include/llvm/Passes/TargetPassRegistry.inc +++ b/llvm/include/llvm/Passes/TargetPassRegistry.inc @@ -151,18 +151,6 @@ PB.registerPipelineParsingCallback([=](StringRef Name, FunctionPassManager &PM, return false; }); -PB.registerVerifierCallback([](ModulePassManager &PM) { -#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) PM.addPass(CREATE_PASS) -#include GET_PASS_REGISTRY -#undef VERIFIER_MODULE_ANALYSIS - return false; -}, [](FunctionPassManager &FPM) { -#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) FPM.addPass(CREATE_PASS) -#include GET_PASS_REGISTRY -#undef VERIFIER_FUNCTION_ANALYSIS - return false; -}); - #undef ADD_PASS #undef ADD_PASS_WITH_PARAMS diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp index e942fed8b6a72..e7057d9a6b625 100644 --- a/llvm/lib/Passes/PassBuilder.cpp +++ b/llvm/lib/Passes/PassBuilder.cpp @@ -582,13 +582,6 @@ void PassBuilder::registerLoopAnalyses(LoopAnalysisManager &LAM) { C(LAM); } -void PassBuilder::registerVerifierPasses(ModulePassManager &MPM, FunctionPassManager &FPM) { - for (auto &C : VerifierCallbacks) - C(MPM); - for (auto &C : FnVerifierCallbacks) - C(FPM); -} - static std::optional> parseFunctionPipelineName(StringRef Name) { std::pair Params; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def index 41e6a399c7239..98a1147ef6d66 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def +++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def @@ -81,17 +81,6 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA()) #undef FUNCTION_ALIAS_ANALYSIS #undef FUNCTION_ANALYSIS -#ifndef 
VERIFIER_MODULE_ANALYSIS -#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) -#endif -#ifndef VERIFIER_FUNCTION_ANALYSIS -#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) -#endif -VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass()) -VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass()) -#undef VERIFIER_MODULE_ANALYSIS -#undef VERIFIER_FUNCTION_ANALYSIS - #ifndef FUNCTION_PASS_WITH_PARAMS #define FUNCTION_PASS_WITH_PARAMS(NAME, CLASS, CREATE_PASS, PARSER, PARAMS) #endif diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index 863a555798dab..fa82689ecf9ae 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -57,9 +57,6 @@ static cl::opt DebugPM("debug-pass-manager", cl::Hidden, cl::desc("Print pass management debugging information")); -static cl::opt VerifyTarget("verify-tgt-new-pm", - cl::desc("Verify the target")); - bool LLCDiagnosticHandler::handleDiagnostics(const DiagnosticInfo &DI) { DiagnosticHandler::handleDiagnostics(DI); if (DI.getKind() == llvm::DK_SrcMgr) { @@ -128,8 +125,6 @@ int llvm::compileModuleWithNewPM( PB.registerFunctionAnalyses(FAM); PB.registerLoopAnalyses(LAM); PB.registerMachineFunctionAnalyses(MFAM); - if (VerifyTarget) - PB.registerVerifierPasses(MPM, FPM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); SI.registerCallbacks(PIC, &MAM); diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp index 2e9e4837fe467..140459ba2de21 100644 --- a/llvm/tools/llc/llc.cpp +++ b/llvm/tools/llc/llc.cpp @@ -209,8 +209,6 @@ static cl::opt PassPipeline( static cl::alias PassPipeline2("p", cl::aliasopt(PassPipeline), cl::desc("Alias for -passes")); -static cl::opt VerifyTarget("verify-tgt", cl::desc("Verify the target")); - namespace { std::vector &getRunPassNames() { >From b583b3f804758f6b8ca686bf66d59d744fffbe8e Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 19:05:53 -0400 Subject: [PATCH 23/30] Remove int check --- 
 .../lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 18 ------------------
 1 file changed, 18 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index c4d303bee6ef8..2ca0bbeb57653 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -25,7 +25,6 @@
 #include "llvm/Support/Debug.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/Function.h"
-//#include "llvm/InitializePasses.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/IntrinsicsAMDGPU.h"
 #include "llvm/IR/Module.h"
@@ -46,15 +45,6 @@ using namespace llvm;
 namespace llvm {
-static bool IsValidInt(const Type *Ty) {
-  return Ty->isIntegerTy(1) ||
-         Ty->isIntegerTy(8) ||
-         Ty->isIntegerTy(16) ||
-         Ty->isIntegerTy(32) ||
-         Ty->isIntegerTy(64) ||
-         Ty->isIntegerTy(128);
-}
-
 static bool isShader(CallingConv::ID CC) {
   switch(CC) {
     case CallingConv::AMDGPU_VS:
@@ -81,14 +71,6 @@ bool AMDGPUTargetVerify::run(Function &F) {
     for (auto &I : BB) {
-      // Ensure integral types are valid: i8, i16, i32, i64, i128
-      if (I.getType()->isIntegerTy())
-        Check(IsValidInt(I.getType()), "Int type is invalid.", &I);
-      for (unsigned i = 0; i < I.getNumOperands(); ++i)
-        if (I.getOperand(i)->getType()->isIntegerTy())
-          Check(IsValidInt(I.getOperand(i)->getType()),
-                "Int type is invalid.", I.getOperand(i));
-
       if (auto *CI = dyn_cast(&I))
       {
         // Ensure no kernel to kernel calls.

>From 2ba9f5d85326b80bd502116a95353d7e9ad4c9bb Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 21:56:28 -0400
Subject: [PATCH 24/30] Remove modifications to Lint/Verifier.
--- llvm/lib/Analysis/Lint.cpp | 4 +--- llvm/lib/IR/Verifier.cpp | 20 ++++---------------- 2 files changed, 5 insertions(+), 19 deletions(-) diff --git a/llvm/lib/Analysis/Lint.cpp b/llvm/lib/Analysis/Lint.cpp index c8e38963e5974..f05e36e2025d4 100644 --- a/llvm/lib/Analysis/Lint.cpp +++ b/llvm/lib/Analysis/Lint.cpp @@ -742,11 +742,9 @@ PreservedAnalyses LintPass::run(Function &F, FunctionAnalysisManager &AM) { Lint L(Mod, DL, AA, AC, DT, TLI); L.visit(F); dbgs() << L.MessagesStr.str(); - if (AbortOnError && !L.MessagesStr.str().empty()) { + if (AbortOnError && !L.MessagesStr.str().empty()) report_fatal_error( "linter found errors, aborting. (enabled by abort-on-error)", false); - return PreservedAnalyses::none(); - } return PreservedAnalyses::all(); } diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index 51f6dec53b70f..8afe360d088bc 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -135,10 +135,6 @@ static cl::opt VerifyNoAliasScopeDomination( cl::desc("Ensure that llvm.experimental.noalias.scope.decl for identical " "scopes are not dominating")); -static cl::opt - VerifyAbortOnError("verifier-abort-on-error", cl::init(false), - cl::desc("In the Verifier pass, abort on errors.")); - namespace llvm { struct VerifierSupport { @@ -7800,24 +7796,16 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F, PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { auto Res = AM.getResult(M); - if (Res.IRBroken || Res.DebugInfoBroken) { - //M.IsValid = false; - if (VerifyAbortOnError && FatalErrors) - report_fatal_error("Broken module found, compilation aborted!"); - return PreservedAnalyses::none(); - } + if (FatalErrors && (Res.IRBroken || Res.DebugInfoBroken)) + report_fatal_error("Broken module found, compilation aborted!"); return PreservedAnalyses::all(); } PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) { auto res = AM.getResult(F); - if (res.IRBroken) { - 
//F.getParent()->IsValid = false; - if (VerifyAbortOnError && FatalErrors) - report_fatal_error("Broken function found, compilation aborted!"); - return PreservedAnalyses::none(); - } + if (res.IRBroken && FatalErrors) + report_fatal_error("Broken function found, compilation aborted!"); return PreservedAnalyses::all(); } >From 0c572440b11b571d0431c2c0bfd83132126e096f Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 22:21:47 -0400 Subject: [PATCH 25/30] Remove llvm-tgt-verify tool. --- llvm/test/CMakeLists.txt | 1 - llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 45 ----- llvm/test/lit.cfg.py | 1 - llvm/tools/llvm-tgt-verify/CMakeLists.txt | 34 ---- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 171 ------------------ llvm/utils/gn/secondary/llvm/test/BUILD.gn | 1 - .../llvm/tools/llvm-tgt-verify/BUILD.gn | 25 --- 7 files changed, 278 deletions(-) delete mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify.ll delete mode 100644 llvm/tools/llvm-tgt-verify/CMakeLists.txt delete mode 100644 llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp delete mode 100644 llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn diff --git a/llvm/test/CMakeLists.txt b/llvm/test/CMakeLists.txt index 10ca9300e7c66..66849002eb470 100644 --- a/llvm/test/CMakeLists.txt +++ b/llvm/test/CMakeLists.txt @@ -135,7 +135,6 @@ set(LLVM_TEST_DEPENDS llvm-strip llvm-symbolizer llvm-tblgen - llvm-tgt-verify llvm-readtapi llvm-tli-checker llvm-undname diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll deleted file mode 100644 index 62b220d7d9f49..0000000000000 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll +++ /dev/null @@ -1,45 +0,0 @@ -; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s - -define amdgpu_cs i32 @shader() { -; CHECK: Shaders must return void - ret i32 0 -} - -define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) { -; CHECK: Undefined behavior: Write to memory in const addrspace -; CHECK-NEXT: store i32 %r, 
ptr addrspace(4) %out, align 4 - %r = add i32 %a, %b - store i32 %r, ptr addrspace(4) %out - ret void -} - -define amdgpu_kernel void @kernel_callee(ptr %x) { - ret void -} - -define amdgpu_kernel void @kernel_caller(ptr %x) { -; CHECK: A kernel may not call a kernel -; CHECK-NEXT: ptr @kernel_caller - call amdgpu_kernel void @kernel_callee(ptr %x) - ret void -} - - -; Function Attrs: nounwind -define i65 @invalid_type(i65 %x) #0 { -; CHECK: Int type is invalid. -; CHECK-NEXT: %tmp2 = ashr i65 %x, 64 -entry: - %tmp2 = ashr i65 %x, 64 - ret i65 %tmp2 -} - -declare void @llvm.amdgcn.cs.chain.v3i32(ptr, i32, <3 x i32>, <3 x i32>, i32, ...) -declare amdgpu_cs_chain void @chain_callee(<3 x i32> inreg, <3 x i32>) - -define amdgpu_cs void @no_unreachable(<3 x i32> inreg %a, <3 x i32> %b) { -; CHECK: llvm.amdgcn.cs.chain must be followed by unreachable -; CHECK-NEXT: call void (ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.p0.i32.v3i32.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0) - call void(ptr, i32, <3 x i32>, <3 x i32>, i32, ...) 
@llvm.amdgcn.cs.chain.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0) - ret void -} diff --git a/llvm/test/lit.cfg.py b/llvm/test/lit.cfg.py index 8620f2a7014b5..aad7a088551b2 100644 --- a/llvm/test/lit.cfg.py +++ b/llvm/test/lit.cfg.py @@ -227,7 +227,6 @@ def get_asan_rtlib(): "llvm-strings", "llvm-strip", "llvm-tblgen", - "llvm-tgt-verify", "llvm-readtapi", "llvm-undname", "llvm-windres", diff --git a/llvm/tools/llvm-tgt-verify/CMakeLists.txt b/llvm/tools/llvm-tgt-verify/CMakeLists.txt deleted file mode 100644 index fe47c85e6cdce..0000000000000 --- a/llvm/tools/llvm-tgt-verify/CMakeLists.txt +++ /dev/null @@ -1,34 +0,0 @@ -set(LLVM_LINK_COMPONENTS - AllTargetsAsmParsers - AllTargetsCodeGens - AllTargetsDescs - AllTargetsInfos - Analysis - AsmPrinter - CodeGen - CodeGenTypes - Core - IRPrinter - IRReader - MC - MIRParser - Passes - Remarks - ScalarOpts - SelectionDAG - Support - Target - TargetParser - TransformUtils - Vectorize - ) - -add_llvm_tool(llvm-tgt-verify - llvm-tgt-verify.cpp - - DEPENDS - intrinsics_gen - SUPPORT_PLUGINS - ) - -export_executable_symbols_for_plugins(llc) diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp deleted file mode 100644 index 50f4e56bb6af6..0000000000000 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ /dev/null @@ -1,171 +0,0 @@ -//===--- llvm-tgt-verify.cpp - Target Verifier ----------------- ----------===// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. -// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===----------------------------------------------------------------------===// -// -// Tool to verify a target. 
-//
-//===----------------------------------------------------------------------===//
-
-#include "llvm/InitializePasses.h"
-#include "llvm/ADT/StringRef.h"
-#include "llvm/Analysis/Lint.h"
-#include "llvm/Analysis/TargetLibraryInfo.h"
-#include "llvm/Bitcode/BitcodeReader.h"
-#include "llvm/Bitcode/BitcodeWriter.h"
-#include "llvm/CodeGen/CommandFlags.h"
-#include "llvm/CodeGen/TargetPassConfig.h"
-#include "llvm/IR/Constants.h"
-#include "llvm/IR/LLVMContext.h"
-#include "llvm/IR/LegacyPassManager.h"
-#include "llvm/IR/Module.h"
-#include "llvm/IR/Verifier.h"
-#include "llvm/IRReader/IRReader.h"
-#include "llvm/Passes/PassBuilder.h"
-#include "llvm/Passes/StandardInstrumentations.h"
-#include "llvm/MC/TargetRegistry.h"
-#include "llvm/Support/CommandLine.h"
-#include "llvm/Support/DataTypes.h"
-#include "llvm/Support/Debug.h"
-#include "llvm/Support/InitLLVM.h"
-#include "llvm/Support/SourceMgr.h"
-#include "llvm/Support/TargetSelect.h"
-#include "llvm/Target/TargetMachine.h"
-#include "llvm/Target/TargetVerifier.h"
-
-#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
-
-#define DEBUG_TYPE "isel-fuzzer"
-
-using namespace llvm;
-
-static codegen::RegisterCodeGenFlags CGF;
-
-static cl::opt
-InputFilename(cl::Positional, cl::desc(""), cl::init("-"));
-
-static cl::opt
-    StacktraceAbort("stacktrace-abort",
-        cl::desc("Turn on stacktrace"), cl::init(false));
-
-static cl::opt
-    NoLint("no-lint",
-        cl::desc("Turn off Lint"), cl::init(false));
-
-static cl::opt
-    NoVerify("no-verifier",
-        cl::desc("Turn off Verifier"), cl::init(false));
-
-static cl::opt
-    OptLevel("O",
-             cl::desc("Optimization level. [-O0, -O1, -O2, or -O3] "
-                      "(default = '-O2')"),
-             cl::Prefix, cl::init('2'));
-
-static cl::opt
-    TargetTriple("mtriple", cl::desc("Override target triple for module"));
-
-static std::unique_ptr TM;
-
-static void handleLLVMFatalError(void *, const char *Message, bool) {
-  if (StacktraceAbort) {
-    dbgs() << "LLVM ERROR: " << Message << "\n"
-           << "Aborting.\n";
-    abort();
-  }
-}
-
-int main(int argc, char **argv) {
-  StringRef ExecName = argv[0];
-  InitLLVM X(argc, argv);
-
-  InitializeAllTargets();
-  InitializeAllTargetMCs();
-  InitializeAllAsmPrinters();
-  InitializeAllAsmParsers();
-
-  PassRegistry *Registry = PassRegistry::getPassRegistry();
-  initializeCore(*Registry);
-  initializeCodeGen(*Registry);
-  initializeAnalysis(*Registry);
-  initializeTarget(*Registry);
-
-  cl::ParseCommandLineOptions(argc, argv);
-
-  if (TargetTriple.empty()) {
-    errs() << ExecName << ": -mtriple must be specified\n";
-    exit(1);
-  }
-
-  CodeGenOptLevel OLvl;
-  if (auto Level = CodeGenOpt::parseLevel(OptLevel)) {
-    OLvl = *Level;
-  } else {
-    errs() << ExecName << ": invalid optimization level.\n";
-    return 1;
-  }
-  ExitOnError ExitOnErr(std::string(ExecName) + ": error:");
-  TM = ExitOnErr(codegen::createTargetMachineForTriple(
-      Triple::normalize(TargetTriple), OLvl));
-  assert(TM && "Could not allocate target machine!");
-
-  // Make sure we print the summary and the current unit when LLVM errors out.
-  install_fatal_error_handler(handleLLVMFatalError, nullptr);
-
-  LLVMContext Context;
-  SMDiagnostic Err;
-  std::unique_ptr M = parseIRFile(InputFilename, Err, Context);
-  if (!M) {
-    errs() << "Invalid mod\n";
-    return 1;
-  }
-  auto S = Triple::normalize(TargetTriple);
-  M->setTargetTriple(Triple(S));
-
-  PassInstrumentationCallbacks PIC;
-  StandardInstrumentations SI(Context, false/*debug PM*/,
-                              false);
-  registerCodeGenCallback(PIC, *TM);
-
-  ModulePassManager MPM;
-  FunctionPassManager FPM;
-  //TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple()));
-
-  MachineFunctionAnalysisManager MFAM;
-  LoopAnalysisManager LAM;
-  FunctionAnalysisManager FAM;
-  CGSCCAnalysisManager CGAM;
-  ModuleAnalysisManager MAM;
-  PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC);
-  PB.registerModuleAnalyses(MAM);
-  //PB.registerVerifierPasses(MPM, FPM);
-  PB.registerCGSCCAnalyses(CGAM);
-  PB.registerFunctionAnalyses(FAM);
-  PB.registerLoopAnalyses(LAM);
-  PB.registerMachineFunctionAnalyses(MFAM);
-  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
-
-  SI.registerCallbacks(PIC, &MAM);
-
-  Triple TT(M->getTargetTriple());
-  if (!NoLint)
-    FPM.addPass(LintPass(false));
-  if (!NoVerify)
-    MPM.addPass(VerifierPass());
-  if (TT.isAMDGPU())
-    FPM.addPass(AMDGPUTargetVerifierPass());
-  else if (false) {} // ...
- MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); - - auto PA = MPM.run(*M, MAM); - { - auto PAC = PA.getChecker(); - if (!PAC.preserved()) - return 1; - } - - return 0; -} diff --git a/llvm/utils/gn/secondary/llvm/test/BUILD.gn b/llvm/utils/gn/secondary/llvm/test/BUILD.gn index 157e7991c52a8..228642667b41d 100644 --- a/llvm/utils/gn/secondary/llvm/test/BUILD.gn +++ b/llvm/utils/gn/secondary/llvm/test/BUILD.gn @@ -319,7 +319,6 @@ group("test") { "//llvm/tools/llvm-strings", "//llvm/tools/llvm-symbolizer:symlinks", "//llvm/tools/llvm-tli-checker", - "//llvm/tools/llvm-tgt-verify", "//llvm/tools/llvm-undname", "//llvm/tools/llvm-xray", "//llvm/tools/lto", diff --git a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn deleted file mode 100644 index b751bafc5052c..0000000000000 --- a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn +++ /dev/null @@ -1,25 +0,0 @@ -import("//llvm/utils/TableGen/tablegen.gni") - -tgtverifier("llvm-tgt-verify") { - deps = [ - "//llvm/lib/Analysis", - "//llvm/lib/AsmPrinter", - "//llvm/lib/CodeGen", - "//llvm/lib/CodeGenTypes", - "//llvm/lib/Core", - "//llvm/lib/IRPrinter", - "//llvm/lib/IRReader", - "//llvm/lib/MC", - "//llvm/lib/MIRParser", - "//llvm/lib/Passes", - "//llvm/lib/Remarks", - "//llvm/lib/ScalarOpts", - "//llvm/lib/SelectionDAG", - "//llvm/lib/Support", - "//llvm/lib/Target", - "//llvm/lib/TargetParser", - "//llvm/lib/TransformUtils", - "//llvm/lib/Vectorize", - ] - sources = [ "llvm-tgt-verify.cpp" ] -} >From 5fec133fdd54a6864918beb1dfd737a8b8fa25a9 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 22:23:15 -0400 Subject: [PATCH 26/30] Remove TargetVerifier.cpp --- llvm/lib/Passes/CMakeLists.txt | 1 - llvm/lib/Target/CMakeLists.txt | 2 -- llvm/lib/Target/TargetVerifier.cpp | 36 ------------------------------ 3 files changed, 39 deletions(-) delete mode 100644 llvm/lib/Target/TargetVerifier.cpp diff --git 
a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt index 9c348cb89a8c5..6425f4934b210 100644 --- a/llvm/lib/Passes/CMakeLists.txt +++ b/llvm/lib/Passes/CMakeLists.txt @@ -29,7 +29,6 @@ add_llvm_component_library(LLVMPasses Scalar Support Target - #TargetParser TransformUtils Vectorize Instrumentation diff --git a/llvm/lib/Target/CMakeLists.txt b/llvm/lib/Target/CMakeLists.txt index f2a5d545ce84f..9472288229cac 100644 --- a/llvm/lib/Target/CMakeLists.txt +++ b/llvm/lib/Target/CMakeLists.txt @@ -7,8 +7,6 @@ add_llvm_component_library(LLVMTarget TargetLoweringObjectFile.cpp TargetMachine.cpp TargetMachineC.cpp - TargetVerifier.cpp - AMDGPU/AMDGPUTargetVerifier.cpp ADDITIONAL_HEADER_DIRS ${LLVM_MAIN_INCLUDE_DIR}/llvm/Target diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp deleted file mode 100644 index c63ae2a2c5daf..0000000000000 --- a/llvm/lib/Target/TargetVerifier.cpp +++ /dev/null @@ -1,36 +0,0 @@ -//===-- TargetVerifier.cpp - LLVM IR Target Verifier ----------------*- C++ -*-===// -//// -///// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -///// See https://llvm.org/LICENSE.txt for license information. -///// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -///// -/////===----------------------------------------------------------------------===// -///// -///// This file defines target verifier interfaces that can be used for some -///// validation of input to the system, and for checking that transformations -///// haven't done something bad. In contrast to the Verifier or Lint, the -///// TargetVerifier looks for constructions invalid to a particular target -///// machine. -///// -///// To see what specifically is checked, look at TargetVerifier.cpp or an -///// individual backend's TargetVerifier. 
-///// -/////===----------------------------------------------------------------------===// - -#include "llvm/Target/TargetVerifier.h" -#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" - -#include "llvm/InitializePasses.h" -#include "llvm/Analysis/UniformityAnalysis.h" -#include "llvm/Analysis/PostDominators.h" -#include "llvm/Support/Debug.h" -#include "llvm/IR/Dominators.h" -#include "llvm/IR/Function.h" -#include "llvm/IR/IntrinsicInst.h" -#include "llvm/IR/IntrinsicsAMDGPU.h" -#include "llvm/IR/Module.h" -#include "llvm/IR/Value.h" - -namespace llvm { - -} // namespace llvm >From c176578fd5dd63e6097bffac4f5ff51ba8cb76ce Mon Sep 17 00:00:00 2001 From: jofernau Date: Thu, 1 May 2025 03:02:30 -0400 Subject: [PATCH 27/30] clang-format --- llvm/include/llvm/Target/TargetVerifier.h | 10 +- .../TargetVerify/AMDGPUTargetVerifier.h | 44 ++++----- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 98 ++++++++++--------- 3 files changed, 77 insertions(+), 75 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index 1d12eb55bbf0a..3f8c710a88768 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -1,4 +1,4 @@ -//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier ---*- C++ -*-===// +//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier --*- C++ -*-===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -20,8 +20,8 @@ #ifndef LLVM_TARGET_VERIFIER_H #define LLVM_TARGET_VERIFIER_H -#include "llvm/IR/PassManager.h" #include "llvm/IR/Module.h" +#include "llvm/IR/PassManager.h" #include "llvm/TargetParser/Triple.h" namespace llvm { @@ -59,10 +59,11 @@ class TargetVerify { /// This calls the Message-only version so that the above is easier to set /// a breakpoint on. template - void CheckFailed(const Twine &Message, const T1 &V1, const Ts &... 
Vs) { + void CheckFailed(const Twine &Message, const T1 &V1, const Ts &...Vs) { CheckFailed(Message); WriteValues({V1, Vs...}); } + public: Module *Mod; Triple TT; @@ -73,8 +74,7 @@ class TargetVerify { bool IsValid = true; TargetVerify(Module *Mod) - : Mod(Mod), TT(Mod->getTargetTriple()), - MessagesStr(Messages) {} + : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} virtual bool run(Function &F) = 0; }; diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index 49bcbc8849e3c..5b8d9ec259b63 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -1,29 +1,29 @@ -//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU ---*- C++ -*-===// -//// -//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -//// See https://llvm.org/LICENSE.txt for license information. -//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -//// -////===----------------------------------------------------------------------===// -//// -//// This file defines target verifier interfaces that can be used for some -//// validation of input to the system, and for checking that transformations -//// haven't done something bad. In contrast to the Verifier or Lint, the -//// TargetVerifier looks for constructions invalid to a particular target -//// machine. -//// -//// To see what specifically is checked, look at an individual backend's -//// TargetVerifier. -//// -////===----------------------------------------------------------------------===// +//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU -- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines target verifier interfaces that can be used for some +// validation of input to the system, and for checking that transformations +// haven't done something bad. In contrast to the Verifier or Lint, the +// TargetVerifier looks for constructions invalid to a particular target +// machine. +// +// To see what specifically is checked, look at an individual backend's +// TargetVerifier. +// +//===----------------------------------------------------------------------===// #ifndef LLVM_AMDGPU_TARGET_VERIFIER_H #define LLVM_AMDGPU_TARGET_VERIFIER_H #include "llvm/Target/TargetVerifier.h" -#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/Analysis/PostDominators.h" +#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/IR/Dominators.h" namespace llvm { @@ -43,10 +43,10 @@ class AMDGPUTargetVerify : public TargetVerify { PostDominatorTree *PDT = nullptr; UniformityInfo *UA = nullptr; - AMDGPUTargetVerify(Module *Mod) - : TargetVerify(Mod), Mod(Mod) {} + AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod), Mod(Mod) {} - AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, + UniformityInfo *UA) : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} bool run(Function &F) override; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 2ca0bbeb57653..eb22eb2177f7f 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -1,34 +1,34 @@ -//===-- AMDGPUTargetVerifier.cpp - AMDGPU -------------------------*- C++ -*-===// -//// -//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 
-//// See https://llvm.org/LICENSE.txt for license information. -//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -//// -////===----------------------------------------------------------------------===// -//// -//// This file defines target verifier interfaces that can be used for some -//// validation of input to the system, and for checking that transformations -//// haven't done something bad. In contrast to the Verifier or Lint, the -//// TargetVerifier looks for constructions invalid to a particular target -//// machine. -//// -//// To see what specifically is checked, look at an individual backend's -//// TargetVerifier. -//// -////===----------------------------------------------------------------------===// +//===-- AMDGPUTargetVerifier.cpp - AMDGPU -----------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines target verifier interfaces that can be used for some +// validation of input to the system, and for checking that transformations +// haven't done something bad. In contrast to the Verifier or Lint, the +// TargetVerifier looks for constructions invalid to a particular target +// machine. +// +// To see what specifically is checked, look at an individual backend's +// TargetVerifier. 
+// +//===----------------------------------------------------------------------===// -#include "AMDGPU.h" #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" +#include "AMDGPU.h" -#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/Analysis/PostDominators.h" -#include "llvm/Support/Debug.h" +#include "llvm/Analysis/UniformityAnalysis.h" #include "llvm/IR/Dominators.h" #include "llvm/IR/Function.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/IntrinsicsAMDGPU.h" #include "llvm/IR/Module.h" #include "llvm/IR/Value.h" +#include "llvm/Support/Debug.h" #include "llvm/Support/raw_ostream.h" @@ -39,53 +39,52 @@ using namespace llvm; do { \ if (!(C)) { \ TargetVerify::CheckFailed(__VA_ARGS__); \ - return false; \ } \ } while (false) namespace llvm { static bool isShader(CallingConv::ID CC) { - switch(CC) { - case CallingConv::AMDGPU_VS: - case CallingConv::AMDGPU_LS: - case CallingConv::AMDGPU_HS: - case CallingConv::AMDGPU_ES: - case CallingConv::AMDGPU_GS: - case CallingConv::AMDGPU_PS: - case CallingConv::AMDGPU_CS_Chain: - case CallingConv::AMDGPU_CS_ChainPreserve: - case CallingConv::AMDGPU_CS: - return true; - default: - return false; + switch (CC) { + case CallingConv::AMDGPU_VS: + case CallingConv::AMDGPU_LS: + case CallingConv::AMDGPU_HS: + case CallingConv::AMDGPU_ES: + case CallingConv::AMDGPU_GS: + case CallingConv::AMDGPU_PS: + case CallingConv::AMDGPU_CS_Chain: + case CallingConv::AMDGPU_CS_ChainPreserve: + case CallingConv::AMDGPU_CS: + return true; + default: + return false; } } bool AMDGPUTargetVerify::run(Function &F) { // Ensure shader calling convention returns void if (isShader(F.getCallingConv())) - Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void"); + Check(F.getReturnType() == Type::getVoidTy(F.getContext()), + "Shaders must return void"); for (auto &BB : F) { for (auto &I : BB) { - if (auto *CI = dyn_cast(&I)) - { + if (auto *CI = dyn_cast(&I)) { // Ensure no kernel to kernel calls. 
CallingConv::ID CalleeCC = CI->getCallingConv(); - if (CalleeCC == CallingConv::AMDGPU_KERNEL) - { - CallingConv::ID CallerCC = CI->getParent()->getParent()->getCallingConv(); + if (CalleeCC == CallingConv::AMDGPU_KERNEL) { + CallingConv::ID CallerCC = + CI->getParent()->getParent()->getCallingConv(); Check(CallerCC != CallingConv::AMDGPU_KERNEL, - "A kernel may not call a kernel", CI->getParent()->getParent()); + "A kernel may not call a kernel", CI->getParent()->getParent()); } // Ensure chain intrinsics are followed by unreachables. if (CI->getIntrinsicID() == Intrinsic::amdgcn_cs_chain) Check(isa_and_present(CI->getNextNode()), - "llvm.amdgcn.cs.chain must be followed by unreachable", CI); + "llvm.amdgcn.cs.chain must be followed by unreachable", CI); } } } @@ -98,7 +97,8 @@ bool AMDGPUTargetVerify::run(Function &F) { return true; } -PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { +PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, + FunctionAnalysisManager &AM) { auto *Mod = F.getParent(); auto UA = &AM.getResult(F); @@ -122,9 +122,10 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { std::unique_ptr TV; bool FatalErrors = false; - AMDGPUTargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID), - FatalErrors(FatalErrors) { - initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + AMDGPUTargetVerifierLegacyPass(bool FatalErrors) + : FunctionPass(ID), FatalErrors(FatalErrors) { + initializeAMDGPUTargetVerifierLegacyPassPass( + *PassRegistry::getPassRegistry()); } bool doInitialization(Module &M) override { @@ -167,4 +168,5 @@ FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) { return new AMDGPUTargetVerifierLegacyPass(FatalErrors); } } // namespace llvm -INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false) +INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", + "AMDGPU 
Target Verifier", false, false) >From 97d43c16df784fde92cbd0f22324e3e4ecfb8374 Mon Sep 17 00:00:00 2001 From: jofernau Date: Thu, 1 May 2025 03:49:20 -0400 Subject: [PATCH 28/30] Add VerifyTarget option --- .../llvm/Target/TargetVerify/AMDGPUTargetVerifier.h | 6 ++---- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 9 +++++++-- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index 5b8d9ec259b63..8ed7dd7ea2f69 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -37,17 +37,15 @@ class AMDGPUTargetVerifierPass : public TargetVerifierPass { class AMDGPUTargetVerify : public TargetVerify { public: - Module *Mod; - DominatorTree *DT = nullptr; PostDominatorTree *PDT = nullptr; UniformityInfo *UA = nullptr; - AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod), Mod(Mod) {} + AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod) {} AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) - : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} + : TargetVerify(Mod), DT(DT), PDT(PDT), UA(UA) {} bool run(Function &F) override; }; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 42d6764eacda9..582090c3c411e 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -482,6 +482,9 @@ static cl::opt HasClosedWorldAssumption( cl::desc("Whether has closed-world assumption at link time"), cl::init(false), cl::Hidden); +static cl::opt VerifyTarget("verify-tgt", + cl::desc("Enable the target verifier")); + extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAMDGPUTarget() { // Register the target RegisterTargetMachine X(getTheR600Target()); @@ -1377,7 +1380,8 @@ bool 
AMDGPUPassConfig::addGCPasses() { //===----------------------------------------------------------------------===// bool GCNPassConfig::addPreISel() { - addPass(createAMDGPUTargetVerifierLegacyPass(false)); + if (VerifyTarget) + addPass(createAMDGPUTargetVerifierLegacyPass(false)); AMDGPUPassConfig::addPreISel(); if (TM->getOptLevel() > CodeGenOptLevel::None) @@ -1977,7 +1981,8 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder( } void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { - addPass(AMDGPUTargetVerifierPass()); + if (VerifyTarget) + addPass(AMDGPUTargetVerifierPass()); if (RemoveIncompatibleFunctions && TM.getTargetTriple().isAMDGCN()) addPass(AMDGPURemoveIncompatibleFunctionsPass(TM)); >From ea2f05032ced1046ec5a5615d5e1d9e6a411da2e Mon Sep 17 00:00:00 2001 From: Joey F Date: Wed, 7 May 2025 21:59:43 -0400 Subject: [PATCH 29/30] Remove AMDGPUTargetVerifier.h --- .../TargetVerify/AMDGPUTargetVerifier.h | 55 ------------------- llvm/lib/Target/AMDGPU/AMDGPU.h | 5 ++ .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 1 - .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 16 +++++- 4 files changed, 20 insertions(+), 57 deletions(-) delete mode 100644 llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h deleted file mode 100644 index 8ed7dd7ea2f69..0000000000000 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ /dev/null @@ -1,55 +0,0 @@ -//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU -- C++ -*-===// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. 
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===----------------------------------------------------------------------===// -// -// This file defines target verifier interfaces that can be used for some -// validation of input to the system, and for checking that transformations -// haven't done something bad. In contrast to the Verifier or Lint, the -// TargetVerifier looks for constructions invalid to a particular target -// machine. -// -// To see what specifically is checked, look at an individual backend's -// TargetVerifier. -// -//===----------------------------------------------------------------------===// - -#ifndef LLVM_AMDGPU_TARGET_VERIFIER_H -#define LLVM_AMDGPU_TARGET_VERIFIER_H - -#include "llvm/Target/TargetVerifier.h" - -#include "llvm/Analysis/PostDominators.h" -#include "llvm/Analysis/UniformityAnalysis.h" -#include "llvm/IR/Dominators.h" - -namespace llvm { - -class Function; - -class AMDGPUTargetVerifierPass : public TargetVerifierPass { -public: - PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) override; -}; - -class AMDGPUTargetVerify : public TargetVerify { -public: - DominatorTree *DT = nullptr; - PostDominatorTree *PDT = nullptr; - UniformityInfo *UA = nullptr; - - AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod) {} - - AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, - UniformityInfo *UA) - : TargetVerify(Mod), DT(DT), PDT(PDT), UA(UA) {} - - bool run(Function &F) override; -}; - -} // namespace llvm - -#endif // LLVM_AMDGPU_TARGET_VERIFIER_H diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h index f69956ba44255..ca273b20e9baf 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPU.h +++ b/llvm/lib/Target/AMDGPU/AMDGPU.h @@ -15,6 +15,7 @@ #include "llvm/Pass.h" #include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/CodeGen.h" +#include "llvm/Target/TargetVerifier.h" namespace llvm { @@ -532,6 +533,10 @@ extern char &AMDGPUWaitSGPRHazardsLegacyID; 
FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors); void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); +class AMDGPUTargetVerifierPass : public TargetVerifierPass { +public: + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) override; +}; namespace AMDGPU { enum TargetIndex { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 582090c3c411e..1350492b9cf32 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -90,7 +90,6 @@ #include "llvm/MC/TargetRegistry.h" #include "llvm/Passes/PassBuilder.h" #include "llvm/Support/FormatVariadic.h" -#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "llvm/Transforms/HipStdPar/HipStdPar.h" #include "llvm/Transforms/IPO.h" #include "llvm/Transforms/IPO/AlwaysInliner.h" diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index eb22eb2177f7f..cc96164a1a558 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -17,7 +17,6 @@ // //===----------------------------------------------------------------------===// -#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "AMDGPU.h" #include "llvm/Analysis/PostDominators.h" @@ -44,6 +43,21 @@ using namespace llvm; namespace llvm { +class AMDGPUTargetVerify : public TargetVerify { +public: + DominatorTree *DT = nullptr; + PostDominatorTree *PDT = nullptr; + UniformityInfo *UA = nullptr; + + AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod) {} + + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, + UniformityInfo *UA) + : TargetVerify(Mod), DT(DT), PDT(PDT), UA(UA) {} + + bool run(Function &F) override; +}; + static bool isShader(CallingConv::ID CC) { switch (CC) { case CallingConv::AMDGPU_VS: >From de38b78d2bd598c6168c6235d408726c68251f4c Mon Sep 
17 00:00:00 2001 From: jofernau Date: Wed, 7 May 2025 22:47:55 -0400 Subject: [PATCH 30/30] Remove analyses. --- .../lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 18 +----------------- 1 file changed, 1 insertion(+), 17 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index cc96164a1a558..d07b5fb668c14 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -19,9 +19,6 @@ #include "AMDGPU.h" -#include "llvm/Analysis/PostDominators.h" -#include "llvm/Analysis/UniformityAnalysis.h" -#include "llvm/IR/Dominators.h" #include "llvm/IR/Function.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/IntrinsicsAMDGPU.h" @@ -45,16 +42,7 @@ namespace llvm { class AMDGPUTargetVerify : public TargetVerify { public: - DominatorTree *DT = nullptr; - PostDominatorTree *PDT = nullptr; - UniformityInfo *UA = nullptr; - AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod) {} - - AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, - UniformityInfo *UA) - : TargetVerify(Mod), DT(DT), PDT(PDT), UA(UA) {} - bool run(Function &F) override; }; @@ -115,11 +103,7 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { auto *Mod = F.getParent(); - auto UA = &AM.getResult(F); - auto *DT = &AM.getResult(F); - auto *PDT = &AM.getResult(F); - - AMDGPUTargetVerify TV(Mod, DT, PDT, UA); + AMDGPUTargetVerify TV(Mod); TV.run(F); dbgs() << TV.MessagesStr.str(); From flang-commits at lists.llvm.org Thu May 8 02:08:55 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 08 May 2025 02:08:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] fix predetermined privatization inside section (PR #138159) In-Reply-To: Message-ID: <681c74a7.170a0220.247d49.5d29@mx.google.com> https://github.com/tblah closed https://github.com/llvm/llvm-project/pull/138159 From flang-commits 
at lists.llvm.org Thu May 8 07:11:50 2025
From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits)
Date: Thu, 08 May 2025 07:11:50 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][docs][OpenMP] array sections with DEPEND are supported (PR #139081)
In-Reply-To:
Message-ID: <681cbba6.170a0220.145f43.1db8@mx.google.com>

https://github.com/kiranchandramohan approved this pull request.

LG. Thanks Tom.

https://github.com/llvm/llvm-project/pull/139081

From flang-commits at lists.llvm.org Thu May 8 07:13:10 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Thu, 08 May 2025 07:13:10 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][docs][OpenMP] array sections with DEPEND are supported (PR #139081)
In-Reply-To:
Message-ID: <681cbbf6.a70a0220.2b34fd.049d@mx.google.com>

https://github.com/tblah closed https://github.com/llvm/llvm-project/pull/139081

From flang-commits at lists.llvm.org Thu May 8 07:48:38 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 08 May 2025 07:48:38 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827)
In-Reply-To:
Message-ID: <681cc446.630a0220.23e28a.e6f3@mx.google.com>

https://github.com/klausler requested changes to this pull request.

There is no documentation for users.
https://github.com/llvm/llvm-project/pull/136827

From flang-commits at lists.llvm.org Thu May 8 07:48:38 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 08 May 2025 07:48:38 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827)
In-Reply-To:
Message-ID: <681cc446.630a0220.36e36b.7bb7@mx.google.com>

https://github.com/klausler edited https://github.com/llvm/llvm-project/pull/136827

From flang-commits at lists.llvm.org Thu May 8 07:48:39 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 08 May 2025 07:48:39 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827)
In-Reply-To:
Message-ID: <681cc447.170a0220.3312f5.a3a7@mx.google.com>

================
@@ -121,6 +121,8 @@ class Preprocessor {
 std::list names_;
 std::unordered_map definitions_;
 std::stack ifStack_;
+
+ unsigned int counter_val_{0};
----------------
klausler wrote:

`int counterVal_{0};`

https://github.com/llvm/llvm-project/pull/136827

From flang-commits at lists.llvm.org Thu May 8 08:07:15 2025
From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits)
Date: Thu, 08 May 2025 08:07:15 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Propagate contiguous attribute through HLFIR. (PR #138797)
In-Reply-To:
Message-ID: <681cc8a3.170a0220.247d49.e91a@mx.google.com>

vzakhari wrote:

Note that the contiguous propagation works in a forward manner, so it is a bit different from the backward walking in the alias analysis. I guess we may need to have different operation interfaces that abstract the walk with the particular property in mind, e.g. MemSourceOpInterface can be used for finding the producer of the memory being accessed through a result of the operation. I do not think this interface works directly for the contiguous propagation, so we will need a different one.
Again, I do not have a good proposal right now :) https://github.com/llvm/llvm-project/pull/138797 From flang-commits at lists.llvm.org Thu May 8 08:18:43 2025 From: flang-commits at lists.llvm.org (Shilei Tian via flang-commits) Date: Thu, 08 May 2025 08:18:43 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609) In-Reply-To: Message-ID: <681ccb53.050a0220.3b93f9.4d11@mx.google.com> ================ @@ -481,6 +481,9 @@ static cl::opt HasClosedWorldAssumption( cl::desc("Whether has closed-world assumption at link time"), cl::init(false), cl::Hidden); +static cl::opt VerifyTarget("verify-tgt", ---------------- shiltian wrote: `verify-tgt` is too broad. I'd add an `amdgpu-` prefix. https://github.com/llvm/llvm-project/pull/123609 From flang-commits at lists.llvm.org Thu May 8 08:19:17 2025 From: flang-commits at lists.llvm.org (Shilei Tian via flang-commits) Date: Thu, 08 May 2025 08:19:17 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609) In-Reply-To: Message-ID: <681ccb75.170a0220.382185.0b51@mx.google.com> ================ @@ -0,0 +1,84 @@ +//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines target verifier interfaces that can be used for some +// validation of input to the system, and for checking that transformations +// haven't done something bad. In contrast to the Verifier or Lint, the +// TargetVerifier looks for constructions invalid to a particular target +// machine. +// +// To see what specifically is checked, look at TargetVerifier.cpp or an +// individual backend's TargetVerifier. 
+// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_TARGET_VERIFIER_H +#define LLVM_TARGET_VERIFIER_H + +#include "llvm/IR/Module.h" +#include "llvm/IR/PassManager.h" +#include "llvm/TargetParser/Triple.h" + +namespace llvm { + +class Function; + +class TargetVerifierPass : public PassInfoMixin { +public: + virtual PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) = 0; +}; + +class TargetVerify { +protected: + void WriteValues(ArrayRef Vs) { ---------------- shiltian wrote: ```suggestion void writeValues(ArrayRef Vs) { ``` https://github.com/llvm/llvm-project/pull/123609 From flang-commits at lists.llvm.org Thu May 8 08:20:29 2025 From: flang-commits at lists.llvm.org (Shilei Tian via flang-commits) Date: Thu, 08 May 2025 08:20:29 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609) In-Reply-To: Message-ID: <681ccbbd.170a0220.3a1a59.0a31@mx.google.com> ================ @@ -0,0 +1,84 @@ +//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file defines target verifier interfaces that can be used for some +// validation of input to the system, and for checking that transformations +// haven't done something bad. In contrast to the Verifier or Lint, the +// TargetVerifier looks for constructions invalid to a particular target +// machine. +// +// To see what specifically is checked, look at TargetVerifier.cpp or an +// individual backend's TargetVerifier. 
+// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_TARGET_VERIFIER_H +#define LLVM_TARGET_VERIFIER_H + +#include "llvm/IR/Module.h" +#include "llvm/IR/PassManager.h" +#include "llvm/TargetParser/Triple.h" + +namespace llvm { + +class Function; + +class TargetVerifierPass : public PassInfoMixin { +public: + virtual PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) = 0; +}; + +class TargetVerify { +protected: + void WriteValues(ArrayRef Vs) { + for (const Value *V : Vs) { ---------------- shiltian wrote: I'd rename this to something more accurate because currently I don't know where it writes to. https://github.com/llvm/llvm-project/pull/123609 From flang-commits at lists.llvm.org Thu May 8 08:21:29 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Thu, 08 May 2025 08:21:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <681ccbf9.a70a0220.2c83dd.539b@mx.google.com> https://github.com/mrkajetanp updated https://github.com/llvm/llvm-project/pull/138718 >From b454ca4b28272f9c440f9d24c70cfbee776cc0ad Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 10 Apr 2025 14:04:52 +0000 Subject: [PATCH 1/3] [flang] Inline hlfir.copy_in for trivial types hlfir.copy_in implements copying non-contiguous array slices for functions that take in arrays required to be contiguous through a flang-rt function that calls memcpy/memmove separately on each element. For large arrays of trivial types, this can incur considerable overhead compared to a plain copy loop that is better able to take advantage of hardware pipelines. To address that, extend the InlineHLFIRAssign optimisation pass with a new pattern for inlining hlfir.copy_in operations for trivial types. 
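
[Editorial illustration, not part of the patch.] The copy-in behaviour described above can be sketched in plain C++ as follows. This is only a minimal model under stated assumptions: `Slice`, `CopyInResult` and `copyIn` are hypothetical names invented for this note, not flang or flang-rt APIs, and the sketch handles a 1-D slice of `double` with a positive stride. It shows the shape of the inlined code: a contiguity check, the original address on the fast path, and a plain element loop into a temporary otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stand-in for a strided 1-D array slice (not a flang type).
struct Slice {
  const double *base;  // address of the first element of the slice
  std::size_t extent;  // number of elements in the slice
  std::size_t stride;  // distance between consecutive elements, in elements
};

// What a copy-in hands back: the address to pass to the callee and a flag
// saying whether a temporary was created (and so needs cleanup afterwards).
struct CopyInResult {
  const double *addr;
  bool wasCopied;
  std::unique_ptr<double[]> temp;  // owns the temporary, if one was made
};

CopyInResult copyIn(const Slice &s) {
  assert(s.stride >= 1 && "this sketch only models positive strides");

  // Contiguous case: pass the original storage through, nothing to free.
  if (s.stride == 1)
    return {s.base, false, nullptr};

  // Non-contiguous case: gather the elements with a plain loop instead of
  // issuing one runtime copy call per element.
  auto temp = std::make_unique<double[]>(s.extent);
  for (std::size_t i = 0; i < s.extent; ++i)
    temp[i] = s.base[i * s.stride];

  const double *addr = temp.get();
  return {addr, true, std::move(temp)};
}
```

Because the callee only reads the data in the intent(in) case, there is nothing to write back after the call; the only remaining work is releasing the temporary, which `std::unique_ptr` does here and which the patch models with a conditional free guarded by the "was copied" flag.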
For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in). Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by a factor of about 1/3rd. Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 117 ++++++++++++++++++ 1 file changed, 117 insertions(+) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 6e209cce07ad4..38c684eaceb7d 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,6 +13,7 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There 
should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + auto results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + auto elem = hlfir::getElementAt(loc, builder, inputVariable, + loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + auto tempElem = hlfir::getElementAt(loc, builder, temp, + loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + 
builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. + if (mlir::isa(temp.getType())) { + result = temp; + } else { + auto refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + auto refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + auto addr = results[0]; + auto needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + auto tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. 
+ rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); +} + class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -140,6 +256,7 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); + patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { >From 2e03475b633233c96d379b707e5e75881d95eba8 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 7 May 2025 16:04:07 +0000 Subject: [PATCH 2/3] Add tests --- flang/test/HLFIR/inline-hlfir-assign.fir | 144 +++++++++++++++++++++++ 1 file changed, 144 insertions(+) diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index f834e7971e3d5..df7681b9c5c16 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,3 +353,147 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // CHECK: } + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index 
+ %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: 
%[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] 
{adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 
= hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// 
CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 0c9ef1c69769e652277ce614f1fc72b2f31abdfa Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 8 May 2025 15:15:56 +0000 Subject: [PATCH 3/3] Address Tom's review comments --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 41 +++++++++++-------- 1 file changed, 23 insertions(+), 18 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 38c684eaceb7d..dc545ece8adff 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -158,16 +158,16 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, "CopyInOp's WasCopied has no uses"); // The copy out should always be present, either to actually copy or just // deallocate memory. 
-  auto *copyOut =
-      copyIn.getWasCopied().getUsers().begin().getCurrent().getUser();
+  auto copyOut = mlir::dyn_cast(
+      copyIn.getWasCopied().getUsers().begin().getCurrent().getUser());
-  if (!mlir::isa(copyOut))
+  if (!copyOut)
     return rewriter.notifyMatchFailure(copyIn,
                                        "CopyInOp has no direct CopyOut");
   // Only inline the copy_in when copy_out does not need to be done, i.e. in
   // case of intent(in).
-  if (::llvm::cast(copyOut).getVar())
+  if (copyOut.getVar())
     return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out");
   inputVariable =
@@ -175,7 +175,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
   mlir::Type resultAddrType = copyIn.getCopiedIn().getType();
   mlir::Value isContiguous =
       builder.create(loc, inputVariable);
-  auto results =
+  mlir::Operation::result_range results =
       builder
           .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous,
                    /*withElseRegion=*/true)
@@ -195,11 +195,11 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
         loc, builder, extents, /*isUnordered=*/true,
         flangomp::shouldUseWorkshareLowering(copyIn));
     builder.setInsertionPointToStart(loopNest.body);
-    auto elem = hlfir::getElementAt(loc, builder, inputVariable,
-                                    loopNest.oneBasedIndices);
+    hlfir::Entity elem = hlfir::getElementAt(
+        loc, builder, inputVariable, loopNest.oneBasedIndices);
     elem = hlfir::loadTrivialScalar(loc, builder, elem);
-    auto tempElem = hlfir::getElementAt(loc, builder, temp,
-                                        loopNest.oneBasedIndices);
+    hlfir::Entity tempElem = hlfir::getElementAt(
+        loc, builder, temp, loopNest.oneBasedIndices);
     builder.create(loc, elem, tempElem);
     builder.setInsertionPointAfter(loopNest.outerOp);
@@ -209,9 +209,9 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
     if (mlir::isa(temp.getType())) {
       result = temp;
     } else {
-      auto refTy =
+      fir::ReferenceType refTy =
           fir::ReferenceType::get(temp.getElementOrSequenceType());
-      auto refVal = builder.createConvert(loc, refTy, temp);
+      mlir::Value refVal = builder.createConvert(loc, refTy, temp);
       result = builder.create(loc, resultAddrType, refVal);
     }
@@ -221,25 +221,30 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
           })
           .getResults();
-  auto addr = results[0];
-  auto needsCleanup = results[1];
+  mlir::OpResult addr = results[0];
+  mlir::OpResult needsCleanup = results[1];
   builder.setInsertionPoint(copyOut);
-  builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] {
+  builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] {
     auto boxAddr = builder.create(loc, addr);
-    auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy());
-    auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult());
+    fir::HeapType heapType =
+        fir::HeapType::get(fir::BoxValue(addr).getBaseTy());
+    mlir::Value heapVal =
+        builder.createConvert(loc, heapType, boxAddr.getResult());
     builder.create(loc, heapVal);
   });
   rewriter.eraseOp(copyOut);
-  auto tempBox = copyIn.getTempBox();
+  mlir::Value tempBox = copyIn.getTempBox();
   rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)});
   // The TempBox is only needed for flang-rt calls which we're no longer
-  // generating.
+  // generating. It should have no uses left at this stage.
+  if (!tempBox.getUses().empty())
+    return mlir::failure();
   rewriter.eraseOp(tempBox.getDefiningOp());
+
   return mlir::success();
 }

From flang-commits at lists.llvm.org Thu May 8 08:25:47 2025
From: flang-commits at lists.llvm.org (Shilei Tian via flang-commits)
Date: Thu, 08 May 2025 08:25:47 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)
In-Reply-To:
Message-ID: <681cccfb.170a0220.8d34e.281e@mx.google.com>

================
@@ -0,0 +1,84 @@
+//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines target verifier interfaces that can be used for some
+// validation of input to the system, and for checking that transformations
+// haven't done something bad. In contrast to the Verifier or Lint, the
+// TargetVerifier looks for constructions invalid to a particular target
+// machine.
+//
+// To see what specifically is checked, look at TargetVerifier.cpp or an
----------------
shiltian wrote:

`TargetVerifier.cpp` no longer exists. Is it still necessary to have this class? TBH I don't see how this base class can benefit others. Typically, I can see it would be beneficial if we do something like:

```
// This is in target specific file.
Base *Ptr = new Derived(...);
// This is in target independent file and Ptr is passed from target specific part
Ptr->run(...);
```

However, this is not how the pass pipeline works.

https://github.com/llvm/llvm-project/pull/123609

From flang-commits at lists.llvm.org Thu May 8 08:46:22 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 08 May 2025 08:46:22 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)
In-Reply-To:
Message-ID: <681cd1ce.170a0220.6435c.2078@mx.google.com>

================
@@ -0,0 +1,84 @@
+//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines target verifier interfaces that can be used for some
+// validation of input to the system, and for checking that transformations
+// haven't done something bad. In contrast to the Verifier or Lint, the
+// TargetVerifier looks for constructions invalid to a particular target
+// machine.
+//
+// To see what specifically is checked, look at TargetVerifier.cpp or an
----------------
jofrn wrote:

The `TargetVerify` base class exists so that the `writeValues` and `checkFailed` functionality is shared across targets. We won't use it the way you describe, both for the reason you give and because it is an abstract base class. It is no longer meant to be instantiated. (Previously it was meant to be instantiated in a target-independent location such as instrumentation, but not anymore. I'll fix the comment.)

https://github.com/llvm/llvm-project/pull/123609

From flang-commits at lists.llvm.org Thu May 8 08:51:09 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 08 May 2025 08:51:09 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)
In-Reply-To:
Message-ID: <681cd2ed.050a0220.2036f5.6639@mx.google.com>

================
@@ -0,0 +1,84 @@
+//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines target verifier interfaces that can be used for some
+// validation of input to the system, and for checking that transformations
+// haven't done something bad. In contrast to the Verifier or Lint, the
+// TargetVerifier looks for constructions invalid to a particular target
+// machine.
+//
+// To see what specifically is checked, look at TargetVerifier.cpp or an
+// individual backend's TargetVerifier.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TARGET_VERIFIER_H
+#define LLVM_TARGET_VERIFIER_H
+
+#include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
+#include "llvm/TargetParser/Triple.h"
+
+namespace llvm {
+
+class Function;
+
+class TargetVerifierPass : public PassInfoMixin {
+public:
+  virtual PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) = 0;
+};
+
+class TargetVerify {
+protected:
+  void WriteValues(ArrayRef Vs) {
+    for (const Value *V : Vs) {
----------------
jofrn wrote:

It can be written to anywhere, but `MessagesStr` is currently being written to `dbgs()`. https://github.com/llvm/llvm-project/blob/de38b78d2bd598c6168c6235d408726c68251f4c/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp#L109

https://github.com/llvm/llvm-project/pull/123609

From flang-commits at lists.llvm.org Thu May 8 08:55:27 2025
From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits)
Date: Thu, 08 May 2025 08:55:27 -0700 (PDT)
Subject: [flang-commits] [flang] [Flang] Add missing dependency to AddDebugInfo pass (PR #139099)
Message-ID:

https://github.com/skatrak created https://github.com/llvm/llvm-project/pull/139099

The `AddDebugInfo` pass currently has a dependency on the `DLTI` MLIR dialect caused by a call to the `fir::support::getOrSetMLIRDataLayout()` utility function. This dependency is not captured in the pass definition.

This patch adds the dependency and simplifies several unit tests that had to explicitly use the `DLTI` dialect to prevent the missing dependency from causing compiler failures.

>From ae3c5f362d94c9c535893625cbabf72111d35889 Mon Sep 17 00:00:00 2001
From: Sergio Afonso
Date: Thu, 8 May 2025 16:27:25 +0100
Subject: [PATCH] [Flang] Add missing dependency to AddDebugInfo pass

The `AddDebugInfo` pass currently has a dependency on the `DLTI` MLIR dialect caused by a call to the `fir::support::getOrSetMLIRDataLayout()` utility function.
This dependency is not captured in the pass definition.

This patch adds the dependency and simplifies several unit tests that had to explicitly use the `DLTI` dialect to prevent the missing dependency from causing compiler failures.
---
 flang/include/flang/Optimizer/Transforms/Passes.td | 3 ++-
 flang/lib/Optimizer/Transforms/AddDebugInfo.cpp | 1 +
 flang/test/Transforms/debug-107988.fir | 2 +-
 flang/test/Transforms/debug-96314.fir | 2 +-
 flang/test/Transforms/debug-allocatable-1.fir | 2 +-
 flang/test/Transforms/debug-assumed-rank-array.fir | 2 +-
 flang/test/Transforms/debug-assumed-shape-array-2.fir | 2 +-
 flang/test/Transforms/debug-assumed-size-array.fir | 2 +-
 flang/test/Transforms/debug-char-type-1.fir | 2 +-
 flang/test/Transforms/debug-class-type.fir | 2 +-
 flang/test/Transforms/debug-common-block.fir | 2 +-
 flang/test/Transforms/debug-complex-1.fir | 2 +-
 flang/test/Transforms/debug-derived-type-2.fir | 2 +-
 flang/test/Transforms/debug-extra-global.fir | 2 +-
 flang/test/Transforms/debug-fixed-array-type.fir | 2 +-
 flang/test/Transforms/debug-fn-info.fir | 2 +-
 flang/test/Transforms/debug-imported-entity.fir | 2 +-
 flang/test/Transforms/debug-index-type.fir | 2 +-
 flang/test/Transforms/debug-line-table-existing.fir | 2 +-
 flang/test/Transforms/debug-line-table-inc-file.fir | 2 +-
 flang/test/Transforms/debug-line-table-inc-same-file.fir | 2 +-
 flang/test/Transforms/debug-line-table.fir | 2 +-
 flang/test/Transforms/debug-local-var.fir | 2 +-
 flang/test/Transforms/debug-module-1.fir | 2 +-
 flang/test/Transforms/debug-ptr-type.fir | 2 +-
 flang/test/Transforms/debug-ref-type.fir | 2 +-
 flang/test/Transforms/debug-tuple-type.fir | 2 +-
 flang/test/Transforms/debug-variable-array-dim.fir | 2 +-
 flang/test/Transforms/debug-variable-char-len.fir | 2 +-
 flang/test/Transforms/debug-vector-type.fir | 2 +-
 30 files changed, 31 insertions(+), 29 deletions(-)

diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index 9b6919eec3f73..3243b44df9c7a 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -210,7 +210,8 @@ def AddDebugInfo : Pass<"add-debug-info", "mlir::ModuleOp"> {
   }];
   let constructor = "::fir::createAddDebugInfoPass()";
   let dependentDialects = [
-    "fir::FIROpsDialect", "mlir::func::FuncDialect", "mlir::LLVM::LLVMDialect"
+    "fir::FIROpsDialect", "mlir::func::FuncDialect", "mlir::LLVM::LLVMDialect",
+    "mlir::DLTIDialect"
   ];
   let options = [
     Option<"debugLevel", "debug-level",
diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
index c479c1a0892b5..8fa2f38818c02 100644
--- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
+++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
@@ -23,6 +23,7 @@
 #include "flang/Optimizer/Support/InternalNames.h"
 #include "flang/Optimizer/Transforms/Passes.h"
 #include "flang/Support/Version.h"
+#include "mlir/Dialect/DLTI/DLTI.h"
 #include "mlir/Dialect/Func/IR/FuncOps.h"
 #include "mlir/Dialect/LLVMIR/LLVMDialect.h"
 #include "mlir/IR/Matchers.h"
diff --git a/flang/test/Transforms/debug-107988.fir b/flang/test/Transforms/debug-107988.fir
index 0ba4296138f50..674ce287a29ec 100644
--- a/flang/test/Transforms/debug-107988.fir
+++ b/flang/test/Transforms/debug-107988.fir
@@ -1,6 +1,6 @@
 // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s -o - | FileCheck %s
-module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QMhelperPmod_sub(%arg0: !fir.ref {fir.bindc_name = "a"} ) { return } loc(#loc1) diff --git a/flang/test/Transforms/debug-allocatable-1.fir b/flang/test/Transforms/debug-allocatable-1.fir index fd0beaddcdb70..f523025f5945e 100644 --- a/flang/test/Transforms/debug-allocatable-1.fir +++ b/flang/test/Transforms/debug-allocatable-1.fir @@ -1,7 +1,7 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @_QFPff() { %c1 = arith.constant 1 : index %c0 = arith.constant 0 : index diff --git a/flang/test/Transforms/debug-assumed-rank-array.fir b/flang/test/Transforms/debug-assumed-rank-array.fir index ce474cd259619..41e0396b076f7 100644 --- a/flang/test/Transforms/debug-assumed-rank-array.fir +++ b/flang/test/Transforms/debug-assumed-rank-array.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QFPfn(%arg0: !fir.box> ) { %1 = fir.undefined !fir.dscope %2 = fircg.ext_declare %arg0 dummy_scope %1 {uniq_name = "_QFFfnEx"} : (!fir.box>, !fir.dscope) -> !fir.box> loc(#loc2) diff --git a/flang/test/Transforms/debug-assumed-shape-array-2.fir b/flang/test/Transforms/debug-assumed-shape-array-2.fir index 212a3453d110d..acad57a710205 100644 --- a/flang/test/Transforms/debug-assumed-shape-array-2.fir +++ b/flang/test/Transforms/debug-assumed-shape-array-2.fir @@ -2,7 +2,7 @@ // Test assumed shape array with variable lower bound. 
-module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @_QFPfn(%arg0: !fir.box> {fir.bindc_name = "b"}, %arg1: !fir.ref {fir.bindc_name = "n"}) attributes {} { %c23_i32 = arith.constant 23 : i32 %c6_i32 = arith.constant 6 : i32 diff --git a/flang/test/Transforms/debug-assumed-size-array.fir b/flang/test/Transforms/debug-assumed-size-array.fir index 892502cb64a59..40e57100fd9ff 100644 --- a/flang/test/Transforms/debug-assumed-size-array.fir +++ b/flang/test/Transforms/debug-assumed-size-array.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QMhelperPfn(%arg0: !fir.ref> {fir.bindc_name = "a1"}, %arg1: !fir.ref> {fir.bindc_name = "a2"}, %arg2: !fir.ref> {fir.bindc_name = "a3"}) { %c5 = arith.constant 5 : index %c1 = arith.constant 1 : index diff --git a/flang/test/Transforms/debug-char-type-1.fir b/flang/test/Transforms/debug-char-type-1.fir index 630b52d96cb85..49f230f7307fa 100644 --- a/flang/test/Transforms/debug-char-type-1.fir +++ b/flang/test/Transforms/debug-char-type-1.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMhelperEstr1 : !fir.char<1,40> { %0 = fir.zero_bits !fir.char<1,40> fir.has_value %0 : !fir.char<1,40> diff --git a/flang/test/Transforms/debug-class-type.fir b/flang/test/Transforms/debug-class-type.fir index aad15a831fd2f..23af60b71ca50 100644 --- a/flang/test/Transforms/debug-class-type.fir +++ b/flang/test/Transforms/debug-class-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.type_info @_QTtest_type nofinal : !fir.type<_QTtest_type{a:i32,b:!fir.box>>}> dispatch_table { fir.dt_entry "test_proc", @_QPtest_proc } loc(#loc1) diff --git 
a/flang/test/Transforms/debug-common-block.fir b/flang/test/Transforms/debug-common-block.fir index 481b26369a92c..d68b524225df5 100644 --- a/flang/test/Transforms/debug-common-block.fir +++ b/flang/test/Transforms/debug-common-block.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @__BLNK__ {alignment = 4 : i64} : tuple> {} loc(#loc1) fir.global @a_ {alignment = 4 : i64} : tuple> {} loc(#loc2) func.func @f1() { diff --git a/flang/test/Transforms/debug-complex-1.fir b/flang/test/Transforms/debug-complex-1.fir index 633f27af99fb1..f7be6b2d4a931 100644 --- a/flang/test/Transforms/debug-complex-1.fir +++ b/flang/test/Transforms/debug-complex-1.fir @@ -3,7 +3,7 @@ // check conversion of complex type of different size. Both fir and mlir // variants are checked. -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @test1(%x : complex) -> complex { %1 = fir.convert %x : (complex) -> complex return %1 : complex diff --git a/flang/test/Transforms/debug-derived-type-2.fir b/flang/test/Transforms/debug-derived-type-2.fir index 63e842619edbe..1e128d702b347 100644 --- a/flang/test/Transforms/debug-derived-type-2.fir +++ b/flang/test/Transforms/debug-derived-type-2.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMmEvar : !fir.type<_QMmTt1{elm:!fir.array<5xi32>,elm2:!fir.array<5x8xi32>}> {} loc(#loc1) fir.type_info @_QMmTt1 noinit nodestroy nofinal : !fir.type<_QMmTt1{elm:!fir.array<5xi32>,elm2:!fir.array<5x8xi32>}> component_info { fir.dt_component "elm" lbs [2] diff --git a/flang/test/Transforms/debug-extra-global.fir b/flang/test/Transforms/debug-extra-global.fir index d3bc22ad0c59b..e3a33e4cfdf40 100644 --- a/flang/test/Transforms/debug-extra-global.fir +++ b/flang/test/Transforms/debug-extra-global.fir @@ -1,6 +1,6 
@@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global linkonce_odr @_QFEXnXxcx constant target : !fir.char<1,3> { %0 = fir.string_lit "xcx"(3) : !fir.char<1,3> fir.has_value %0 : !fir.char<1,3> diff --git a/flang/test/Transforms/debug-fixed-array-type.fir b/flang/test/Transforms/debug-fixed-array-type.fir index a15975c7cc92a..75cb88b08b248 100644 --- a/flang/test/Transforms/debug-fixed-array-type.fir +++ b/flang/test/Transforms/debug-fixed-array-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QQmain() attributes {fir.bindc_name = "mn"} { %c7 = arith.constant 7 : index %c8 = arith.constant 8 : index diff --git a/flang/test/Transforms/debug-fn-info.fir b/flang/test/Transforms/debug-fn-info.fir index 85cfd13643ec3..c02835be50af5 100644 --- a/flang/test/Transforms/debug-fn-info.fir +++ b/flang/test/Transforms/debug-fn-info.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QQmain() attributes {fir.bindc_name = "mn"} { %0 = fir.alloca i32 {bindc_name = "i4", uniq_name = "_QFEi4"} %1 = fircg.ext_declare %0 {uniq_name = "_QFEi4"} : (!fir.ref) -> !fir.ref diff --git a/flang/test/Transforms/debug-imported-entity.fir b/flang/test/Transforms/debug-imported-entity.fir index 7be6531a703a8..194bc82724583 100644 --- a/flang/test/Transforms/debug-imported-entity.fir +++ b/flang/test/Transforms/debug-imported-entity.fir @@ -1,7 +1,7 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMfooEv1 : i32 { %0 = fir.zero_bits i32 fir.has_value %0 : i32 diff --git a/flang/test/Transforms/debug-index-type.fir b/flang/test/Transforms/debug-index-type.fir 
index 20bd8471d7cf6..751e2e156dc20 100644 --- a/flang/test/Transforms/debug-index-type.fir +++ b/flang/test/Transforms/debug-index-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @str(%arg0: index) -> i32 loc(#loc1) } #loc1 = loc("test.f90":5:1) diff --git a/flang/test/Transforms/debug-line-table-existing.fir b/flang/test/Transforms/debug-line-table-existing.fir index 0e006303c8a81..03eefd08a4379 100644 --- a/flang/test/Transforms/debug-line-table-existing.fir +++ b/flang/test/Transforms/debug-line-table-existing.fir @@ -3,7 +3,7 @@ // REQUIRES: system-linux // Test that there are no changes to a function with existed fused loc debug -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPs1() { return loc(#loc1) } loc(#loc2) diff --git a/flang/test/Transforms/debug-line-table-inc-file.fir b/flang/test/Transforms/debug-line-table-inc-file.fir index 216cd5e016f2f..32c9f515ead43 100644 --- a/flang/test/Transforms/debug-line-table-inc-file.fir +++ b/flang/test/Transforms/debug-line-table-inc-file.fir @@ -3,7 +3,7 @@ // REQUIRES: system-linux // Test for included functions that have a different debug location than the current file -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPsinc() { return loc(#loc2) } loc(#loc1) diff --git a/flang/test/Transforms/debug-line-table-inc-same-file.fir b/flang/test/Transforms/debug-line-table-inc-same-file.fir index bcaf449798231..aaa8d03a76ef0 100644 --- a/flang/test/Transforms/debug-line-table-inc-same-file.fir +++ b/flang/test/Transforms/debug-line-table-inc-same-file.fir @@ -4,7 +4,7 @@ // Test that there is only one FileAttribute generated for multiple functions // in the same file. 
-module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPs1() { return loc(#loc2) } loc(#loc1) diff --git a/flang/test/Transforms/debug-line-table.fir b/flang/test/Transforms/debug-line-table.fir index d6e54fd1ac467..81aebf026882a 100644 --- a/flang/test/Transforms/debug-line-table.fir +++ b/flang/test/Transforms/debug-line-table.fir @@ -3,7 +3,7 @@ // RUN: fir-opt --add-debug-info="debug-level=LineTablesOnly" --mlir-print-debuginfo %s | FileCheck %s --check-prefix=LINETABLE // RUN: fir-opt --add-debug-info="is-optimized=true" --mlir-print-debuginfo %s | FileCheck %s --check-prefix=OPT -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPsb() { return loc(#loc_sb) } loc(#loc_sb) diff --git a/flang/test/Transforms/debug-local-var.fir b/flang/test/Transforms/debug-local-var.fir index b7a1ff7185a63..06c9b01e75a61 100644 --- a/flang/test/Transforms/debug-local-var.fir +++ b/flang/test/Transforms/debug-local-var.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QQmain() attributes {fir.bindc_name = "mn"} { %0 = fir.alloca i32 {bindc_name = "i4", uniq_name = "_QFEi4"} %1 = fircg.ext_declare %0 {uniq_name = "_QFEi4"} : (!fir.ref) -> !fir.ref loc(#loc1) diff --git a/flang/test/Transforms/debug-module-1.fir b/flang/test/Transforms/debug-module-1.fir index ede996f053835..c1e4c2eeffefe 100644 --- a/flang/test/Transforms/debug-module-1.fir +++ b/flang/test/Transforms/debug-module-1.fir @@ -1,7 +1,7 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMhelperEgli : i32 { %0 = fir.zero_bits i32 fir.has_value %0 : i32 diff --git a/flang/test/Transforms/debug-ptr-type.fir b/flang/test/Transforms/debug-ptr-type.fir index 64e64cb1a19ae..2bbece56a7ab5 100644 --- a/flang/test/Transforms/debug-ptr-type.fir +++ 
b/flang/test/Transforms/debug-ptr-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMhelperEpar : !fir.box>> { %0 = fir.zero_bits !fir.ptr> %c0 = arith.constant 0 : index diff --git a/flang/test/Transforms/debug-ref-type.fir b/flang/test/Transforms/debug-ref-type.fir index 2b3af485385d8..745aebee778be 100644 --- a/flang/test/Transforms/debug-ref-type.fir +++ b/flang/test/Transforms/debug-ref-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @_FortranAioBeginExternalListOutput(i8) -> !fir.ref loc(#loc1) } #loc1 = loc("test.f90":5:1) diff --git a/flang/test/Transforms/debug-tuple-type.fir b/flang/test/Transforms/debug-tuple-type.fir index c9b0d16c06e1a..e3b0bafdf3cd4 100644 --- a/flang/test/Transforms/debug-tuple-type.fir +++ b/flang/test/Transforms/debug-tuple-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @fn1(!fir.ref>) func.func private @_FortranAioOutputDerivedType(!fir.ref>) } diff --git a/flang/test/Transforms/debug-variable-array-dim.fir b/flang/test/Transforms/debug-variable-array-dim.fir index 1f401041dee57..a376133cf449a 100644 --- a/flang/test/Transforms/debug-variable-array-dim.fir +++ b/flang/test/Transforms/debug-variable-array-dim.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @foo_(%arg0: !fir.ref> {fir.bindc_name = "a"}, %arg1: !fir.ref {fir.bindc_name = "n"}, %arg2: !fir.ref {fir.bindc_name = "m"}, %arg3: !fir.ref {fir.bindc_name = "p"}) attributes {fir.internal_name = "_QPfoo"} { %c5_i32 = arith.constant 5 : i32 %c6_i32 = arith.constant 6 : i32 
diff --git a/flang/test/Transforms/debug-variable-char-len.fir b/flang/test/Transforms/debug-variable-char-len.fir index 9e177d22d5b10..907b65a4c6d4f 100644 --- a/flang/test/Transforms/debug-variable-char-len.fir +++ b/flang/test/Transforms/debug-variable-char-len.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s -o - | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @foo(%arg0: !fir.ref> {fir.bindc_name = "str1"} , %arg1: !fir.ref {fir.bindc_name = "len1"} loc("/home/haqadeer/work/fortran/t1/../str.f90":1:1), %arg2: i64) { %0 = fir.emboxchar %arg0, %arg2 : (!fir.ref>, i64) -> !fir.boxchar<1> %c4_i32 = arith.constant 4 : i32 diff --git a/flang/test/Transforms/debug-vector-type.fir b/flang/test/Transforms/debug-vector-type.fir index 63846ce006c6c..d3e1f6ec28d0f 100644 --- a/flang/test/Transforms/debug-vector-type.fir +++ b/flang/test/Transforms/debug-vector-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @foo1(%arg0: !fir.vector<20:bf16>) // CHECK-DAG: #[[F16:.*]] = #llvm.di_basic_type // CHECK-DAG: #llvm.di_composite_type> From flang-commits at lists.llvm.org Thu May 8 08:56:00 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 08:56:00 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add missing dependency to AddDebugInfo pass (PR #139099) In-Reply-To: Message-ID: <681cd410.050a0220.11bc41.64c7@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Sergio Afonso (skatrak)
Changes The `AddDebugInfo` pass currently has a dependency on the `DLTI` MLIR dialect caused by a call to the `fir::support::getOrSetMLIRDataLayout()` utility function. This dependency is not captured in the pass definition. This patch adds the dependency and simplifies several unit tests that had to explicitly use the `DLTI` dialect to prevent the missing dependency from causing compiler failures. --- Full diff: https://github.com/llvm/llvm-project/pull/139099.diff 30 Files Affected: - (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+2-1) - (modified) flang/lib/Optimizer/Transforms/AddDebugInfo.cpp (+1) - (modified) flang/test/Transforms/debug-107988.fir (+1-1) - (modified) flang/test/Transforms/debug-96314.fir (+1-1) - (modified) flang/test/Transforms/debug-allocatable-1.fir (+1-1) - (modified) flang/test/Transforms/debug-assumed-rank-array.fir (+1-1) - (modified) flang/test/Transforms/debug-assumed-shape-array-2.fir (+1-1) - (modified) flang/test/Transforms/debug-assumed-size-array.fir (+1-1) - (modified) flang/test/Transforms/debug-char-type-1.fir (+1-1) - (modified) flang/test/Transforms/debug-class-type.fir (+1-1) - (modified) flang/test/Transforms/debug-common-block.fir (+1-1) - (modified) flang/test/Transforms/debug-complex-1.fir (+1-1) - (modified) flang/test/Transforms/debug-derived-type-2.fir (+1-1) - (modified) flang/test/Transforms/debug-extra-global.fir (+1-1) - (modified) flang/test/Transforms/debug-fixed-array-type.fir (+1-1) - (modified) flang/test/Transforms/debug-fn-info.fir (+1-1) - (modified) flang/test/Transforms/debug-imported-entity.fir (+1-1) - (modified) flang/test/Transforms/debug-index-type.fir (+1-1) - (modified) flang/test/Transforms/debug-line-table-existing.fir (+1-1) - (modified) flang/test/Transforms/debug-line-table-inc-file.fir (+1-1) - (modified) flang/test/Transforms/debug-line-table-inc-same-file.fir (+1-1) - (modified) flang/test/Transforms/debug-line-table.fir (+1-1) - (modified) 
flang/test/Transforms/debug-local-var.fir (+1-1) - (modified) flang/test/Transforms/debug-module-1.fir (+1-1) - (modified) flang/test/Transforms/debug-ptr-type.fir (+1-1) - (modified) flang/test/Transforms/debug-ref-type.fir (+1-1) - (modified) flang/test/Transforms/debug-tuple-type.fir (+1-1) - (modified) flang/test/Transforms/debug-variable-array-dim.fir (+1-1) - (modified) flang/test/Transforms/debug-variable-char-len.fir (+1-1) - (modified) flang/test/Transforms/debug-vector-type.fir (+1-1) ``````````diff diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index 9b6919eec3f73..3243b44df9c7a 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -210,7 +210,8 @@ def AddDebugInfo : Pass<"add-debug-info", "mlir::ModuleOp"> { }]; let constructor = "::fir::createAddDebugInfoPass()"; let dependentDialects = [ - "fir::FIROpsDialect", "mlir::func::FuncDialect", "mlir::LLVM::LLVMDialect" + "fir::FIROpsDialect", "mlir::func::FuncDialect", "mlir::LLVM::LLVMDialect", + "mlir::DLTIDialect" ]; let options = [ Option<"debugLevel", "debug-level", diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp index c479c1a0892b5..8fa2f38818c02 100644 --- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp +++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp @@ -23,6 +23,7 @@ #include "flang/Optimizer/Support/InternalNames.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Support/Version.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/Func/IR/FuncOps.h" #include "mlir/Dialect/LLVMIR/LLVMDialect.h" #include "mlir/IR/Matchers.h" diff --git a/flang/test/Transforms/debug-107988.fir b/flang/test/Transforms/debug-107988.fir index 0ba4296138f50..674ce287a29ec 100644 --- a/flang/test/Transforms/debug-107988.fir +++ b/flang/test/Transforms/debug-107988.fir @@ -1,6 +1,6 @@ // RUN: 
fir-opt --add-debug-info --mlir-print-debuginfo %s -o - | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @test(%arg0: !fir.ref> {fir.bindc_name = "str"}, %arg1: i64) { %0 = fir.emboxchar %arg0, %arg1 : (!fir.ref>, i64) -> !fir.boxchar<1> %1 = fir.undefined !fir.dscope diff --git a/flang/test/Transforms/debug-96314.fir b/flang/test/Transforms/debug-96314.fir index e2d0f24a1105c..4df0c4a555d39 100644 --- a/flang/test/Transforms/debug-96314.fir +++ b/flang/test/Transforms/debug-96314.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s -o - | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QMhelperPmod_sub(%arg0: !fir.ref {fir.bindc_name = "a"} ) { return } loc(#loc1) diff --git a/flang/test/Transforms/debug-allocatable-1.fir b/flang/test/Transforms/debug-allocatable-1.fir index fd0beaddcdb70..f523025f5945e 100644 --- a/flang/test/Transforms/debug-allocatable-1.fir +++ b/flang/test/Transforms/debug-allocatable-1.fir @@ -1,7 +1,7 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @_QFPff() { %c1 = arith.constant 1 : index %c0 = arith.constant 0 : index diff --git a/flang/test/Transforms/debug-assumed-rank-array.fir b/flang/test/Transforms/debug-assumed-rank-array.fir index ce474cd259619..41e0396b076f7 100644 --- a/flang/test/Transforms/debug-assumed-rank-array.fir +++ b/flang/test/Transforms/debug-assumed-rank-array.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QFPfn(%arg0: !fir.box> ) { %1 = fir.undefined !fir.dscope %2 = fircg.ext_declare %arg0 dummy_scope %1 {uniq_name = "_QFFfnEx"} : (!fir.box>, !fir.dscope) -> !fir.box> loc(#loc2) diff --git a/flang/test/Transforms/debug-assumed-shape-array-2.fir 
b/flang/test/Transforms/debug-assumed-shape-array-2.fir index 212a3453d110d..acad57a710205 100644 --- a/flang/test/Transforms/debug-assumed-shape-array-2.fir +++ b/flang/test/Transforms/debug-assumed-shape-array-2.fir @@ -2,7 +2,7 @@ // Test assumed shape array with variable lower bound. -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @_QFPfn(%arg0: !fir.box> {fir.bindc_name = "b"}, %arg1: !fir.ref {fir.bindc_name = "n"}) attributes {} { %c23_i32 = arith.constant 23 : i32 %c6_i32 = arith.constant 6 : i32 diff --git a/flang/test/Transforms/debug-assumed-size-array.fir b/flang/test/Transforms/debug-assumed-size-array.fir index 892502cb64a59..40e57100fd9ff 100644 --- a/flang/test/Transforms/debug-assumed-size-array.fir +++ b/flang/test/Transforms/debug-assumed-size-array.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QMhelperPfn(%arg0: !fir.ref> {fir.bindc_name = "a1"}, %arg1: !fir.ref> {fir.bindc_name = "a2"}, %arg2: !fir.ref> {fir.bindc_name = "a3"}) { %c5 = arith.constant 5 : index %c1 = arith.constant 1 : index diff --git a/flang/test/Transforms/debug-char-type-1.fir b/flang/test/Transforms/debug-char-type-1.fir index 630b52d96cb85..49f230f7307fa 100644 --- a/flang/test/Transforms/debug-char-type-1.fir +++ b/flang/test/Transforms/debug-char-type-1.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMhelperEstr1 : !fir.char<1,40> { %0 = fir.zero_bits !fir.char<1,40> fir.has_value %0 : !fir.char<1,40> diff --git a/flang/test/Transforms/debug-class-type.fir b/flang/test/Transforms/debug-class-type.fir index aad15a831fd2f..23af60b71ca50 100644 --- a/flang/test/Transforms/debug-class-type.fir +++ b/flang/test/Transforms/debug-class-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info 
--mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.type_info @_QTtest_type nofinal : !fir.type<_QTtest_type{a:i32,b:!fir.box>>}> dispatch_table { fir.dt_entry "test_proc", @_QPtest_proc } loc(#loc1) diff --git a/flang/test/Transforms/debug-common-block.fir b/flang/test/Transforms/debug-common-block.fir index 481b26369a92c..d68b524225df5 100644 --- a/flang/test/Transforms/debug-common-block.fir +++ b/flang/test/Transforms/debug-common-block.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @__BLNK__ {alignment = 4 : i64} : tuple> {} loc(#loc1) fir.global @a_ {alignment = 4 : i64} : tuple> {} loc(#loc2) func.func @f1() { diff --git a/flang/test/Transforms/debug-complex-1.fir b/flang/test/Transforms/debug-complex-1.fir index 633f27af99fb1..f7be6b2d4a931 100644 --- a/flang/test/Transforms/debug-complex-1.fir +++ b/flang/test/Transforms/debug-complex-1.fir @@ -3,7 +3,7 @@ // check conversion of complex type of different size. Both fir and mlir // variants are checked. 
-module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @test1(%x : complex) -> complex { %1 = fir.convert %x : (complex) -> complex return %1 : complex diff --git a/flang/test/Transforms/debug-derived-type-2.fir b/flang/test/Transforms/debug-derived-type-2.fir index 63e842619edbe..1e128d702b347 100644 --- a/flang/test/Transforms/debug-derived-type-2.fir +++ b/flang/test/Transforms/debug-derived-type-2.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMmEvar : !fir.type<_QMmTt1{elm:!fir.array<5xi32>,elm2:!fir.array<5x8xi32>}> {} loc(#loc1) fir.type_info @_QMmTt1 noinit nodestroy nofinal : !fir.type<_QMmTt1{elm:!fir.array<5xi32>,elm2:!fir.array<5x8xi32>}> component_info { fir.dt_component "elm" lbs [2] diff --git a/flang/test/Transforms/debug-extra-global.fir b/flang/test/Transforms/debug-extra-global.fir index d3bc22ad0c59b..e3a33e4cfdf40 100644 --- a/flang/test/Transforms/debug-extra-global.fir +++ b/flang/test/Transforms/debug-extra-global.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global linkonce_odr @_QFEXnXxcx constant target : !fir.char<1,3> { %0 = fir.string_lit "xcx"(3) : !fir.char<1,3> fir.has_value %0 : !fir.char<1,3> diff --git a/flang/test/Transforms/debug-fixed-array-type.fir b/flang/test/Transforms/debug-fixed-array-type.fir index a15975c7cc92a..75cb88b08b248 100644 --- a/flang/test/Transforms/debug-fixed-array-type.fir +++ b/flang/test/Transforms/debug-fixed-array-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QQmain() attributes {fir.bindc_name = "mn"} { %c7 = arith.constant 7 : index %c8 = arith.constant 8 : index diff --git a/flang/test/Transforms/debug-fn-info.fir 
b/flang/test/Transforms/debug-fn-info.fir index 85cfd13643ec3..c02835be50af5 100644 --- a/flang/test/Transforms/debug-fn-info.fir +++ b/flang/test/Transforms/debug-fn-info.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QQmain() attributes {fir.bindc_name = "mn"} { %0 = fir.alloca i32 {bindc_name = "i4", uniq_name = "_QFEi4"} %1 = fircg.ext_declare %0 {uniq_name = "_QFEi4"} : (!fir.ref) -> !fir.ref diff --git a/flang/test/Transforms/debug-imported-entity.fir b/flang/test/Transforms/debug-imported-entity.fir index 7be6531a703a8..194bc82724583 100644 --- a/flang/test/Transforms/debug-imported-entity.fir +++ b/flang/test/Transforms/debug-imported-entity.fir @@ -1,7 +1,7 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMfooEv1 : i32 { %0 = fir.zero_bits i32 fir.has_value %0 : i32 diff --git a/flang/test/Transforms/debug-index-type.fir b/flang/test/Transforms/debug-index-type.fir index 20bd8471d7cf6..751e2e156dc20 100644 --- a/flang/test/Transforms/debug-index-type.fir +++ b/flang/test/Transforms/debug-index-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @str(%arg0: index) -> i32 loc(#loc1) } #loc1 = loc("test.f90":5:1) diff --git a/flang/test/Transforms/debug-line-table-existing.fir b/flang/test/Transforms/debug-line-table-existing.fir index 0e006303c8a81..03eefd08a4379 100644 --- a/flang/test/Transforms/debug-line-table-existing.fir +++ b/flang/test/Transforms/debug-line-table-existing.fir @@ -3,7 +3,7 @@ // REQUIRES: system-linux // Test that there are no changes to a function with existed fused loc debug -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPs1() { return loc(#loc1) } 
loc(#loc2) diff --git a/flang/test/Transforms/debug-line-table-inc-file.fir b/flang/test/Transforms/debug-line-table-inc-file.fir index 216cd5e016f2f..32c9f515ead43 100644 --- a/flang/test/Transforms/debug-line-table-inc-file.fir +++ b/flang/test/Transforms/debug-line-table-inc-file.fir @@ -3,7 +3,7 @@ // REQUIRES: system-linux // Test for included functions that have a different debug location than the current file -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPsinc() { return loc(#loc2) } loc(#loc1) diff --git a/flang/test/Transforms/debug-line-table-inc-same-file.fir b/flang/test/Transforms/debug-line-table-inc-same-file.fir index bcaf449798231..aaa8d03a76ef0 100644 --- a/flang/test/Transforms/debug-line-table-inc-same-file.fir +++ b/flang/test/Transforms/debug-line-table-inc-same-file.fir @@ -4,7 +4,7 @@ // Test that there is only one FileAttribute generated for multiple functions // in the same file. -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPs1() { return loc(#loc2) } loc(#loc1) diff --git a/flang/test/Transforms/debug-line-table.fir b/flang/test/Transforms/debug-line-table.fir index d6e54fd1ac467..81aebf026882a 100644 --- a/flang/test/Transforms/debug-line-table.fir +++ b/flang/test/Transforms/debug-line-table.fir @@ -3,7 +3,7 @@ // RUN: fir-opt --add-debug-info="debug-level=LineTablesOnly" --mlir-print-debuginfo %s | FileCheck %s --check-prefix=LINETABLE // RUN: fir-opt --add-debug-info="is-optimized=true" --mlir-print-debuginfo %s | FileCheck %s --check-prefix=OPT -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPsb() { return loc(#loc_sb) } loc(#loc_sb) diff --git a/flang/test/Transforms/debug-local-var.fir b/flang/test/Transforms/debug-local-var.fir index b7a1ff7185a63..06c9b01e75a61 100644 --- a/flang/test/Transforms/debug-local-var.fir +++ b/flang/test/Transforms/debug-local-var.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | 
FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QQmain() attributes {fir.bindc_name = "mn"} { %0 = fir.alloca i32 {bindc_name = "i4", uniq_name = "_QFEi4"} %1 = fircg.ext_declare %0 {uniq_name = "_QFEi4"} : (!fir.ref) -> !fir.ref loc(#loc1) diff --git a/flang/test/Transforms/debug-module-1.fir b/flang/test/Transforms/debug-module-1.fir index ede996f053835..c1e4c2eeffefe 100644 --- a/flang/test/Transforms/debug-module-1.fir +++ b/flang/test/Transforms/debug-module-1.fir @@ -1,7 +1,7 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMhelperEgli : i32 { %0 = fir.zero_bits i32 fir.has_value %0 : i32 diff --git a/flang/test/Transforms/debug-ptr-type.fir b/flang/test/Transforms/debug-ptr-type.fir index 64e64cb1a19ae..2bbece56a7ab5 100644 --- a/flang/test/Transforms/debug-ptr-type.fir +++ b/flang/test/Transforms/debug-ptr-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMhelperEpar : !fir.box>> { %0 = fir.zero_bits !fir.ptr> %c0 = arith.constant 0 : index diff --git a/flang/test/Transforms/debug-ref-type.fir b/flang/test/Transforms/debug-ref-type.fir index 2b3af485385d8..745aebee778be 100644 --- a/flang/test/Transforms/debug-ref-type.fir +++ b/flang/test/Transforms/debug-ref-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @_FortranAioBeginExternalListOutput(i8) -> !fir.ref loc(#loc1) } #loc1 = loc("test.f90":5:1) diff --git a/flang/test/Transforms/debug-tuple-type.fir b/flang/test/Transforms/debug-tuple-type.fir index c9b0d16c06e1a..e3b0bafdf3cd4 100644 --- a/flang/test/Transforms/debug-tuple-type.fir +++ b/flang/test/Transforms/debug-tuple-type.fir @@ -1,6 +1,6 @@ // RUN: 
fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @fn1(!fir.ref>) func.func private @_FortranAioOutputDerivedType(!fir.ref>) } diff --git a/flang/test/Transforms/debug-variable-array-dim.fir b/flang/test/Transforms/debug-variable-array-dim.fir index 1f401041dee57..a376133cf449a 100644 --- a/flang/test/Transforms/debug-variable-array-dim.fir +++ b/flang/test/Transforms/debug-variable-array-dim.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @foo_(%arg0: !fir.ref> {fir.bindc_name = "a"}, %arg1: !fir.ref {fir.bindc_name = "n"}, %arg2: !fir.ref {fir.bindc_name = "m"}, %arg3: !fir.ref {fir.bindc_name = "p"}) attributes {fir.internal_name = "_QPfoo"} { %c5_i32 = arith.constant 5 : i32 %c6_i32 = arith.constant 6 : i32 diff --git a/flang/test/Transforms/debug-variable-char-len.fir b/flang/test/Transforms/debug-variable-char-len.fir index 9e177d22d5b10..907b65a4c6d4f 100644 --- a/flang/test/Transforms/debug-variable-char-len.fir +++ b/flang/test/Transforms/debug-variable-char-len.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s -o - | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @foo(%arg0: !fir.ref> {fir.bindc_name = "str1"} , %arg1: !fir.ref {fir.bindc_name = "len1"} loc("/home/haqadeer/work/fortran/t1/../str.f90":1:1), %arg2: i64) { %0 = fir.emboxchar %arg0, %arg2 : (!fir.ref>, i64) -> !fir.boxchar<1> %c4_i32 = arith.constant 4 : i32 diff --git a/flang/test/Transforms/debug-vector-type.fir b/flang/test/Transforms/debug-vector-type.fir index 63846ce006c6c..d3e1f6ec28d0f 100644 --- a/flang/test/Transforms/debug-vector-type.fir +++ b/flang/test/Transforms/debug-vector-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes 
{dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @foo1(%arg0: !fir.vector<20:bf16>) // CHECK-DAG: #[[F16:.*]] = #llvm.di_basic_type // CHECK-DAG: #llvm.di_composite_type> ``````````
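The reason the missing `DLTI` dependency only showed up when a test module did not already mention `dlti.dl_spec` can be illustrated with a toy model. This is a minimal sketch, not MLIR's actual API: `ToyContext`, `ToyPass` and `runPass` are hypothetical names, and the assumption (based on how MLIR loads dialects as I understand it) is that the pass manager loads a pass's declared dependent dialects before running it, while parsing an input that already uses a dialect loads that dialect as a side effect.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for a context: it only tracks which dialects are loaded.
// (Hypothetical type, not part of MLIR.)
struct ToyContext {
  std::set<std::string> loadedDialects;

  // Using an entity from a dialect that is not loaded is an error.
  void requireDialect(const std::string &name) const {
    if (!loadedDialects.count(name))
      throw std::runtime_error("dialect not loaded: " + name);
  }
};

// Toy stand-in for a pass: declared dependencies plus a body.
struct ToyPass {
  std::vector<std::string> dependentDialects;

  // Mimics a helper that attaches a data layout and therefore needs "dlti".
  void run(ToyContext &ctx) const { ctx.requireDialect("dlti"); }
};

// The toy pass manager loads every declared dependency, then runs the pass.
// Returns false if the pass body hit an unloaded dialect.
inline bool runPass(const ToyPass &pass, ToyContext &ctx) {
  for (const std::string &dialect : pass.dependentDialects)
    ctx.loadedDialects.insert(dialect);
  try {
    pass.run(ctx);
    return true;
  } catch (const std::runtime_error &) {
    return false;
  }
}
```

In this model a pass that omits `"dlti"` fails on a bare context, succeeds if the context already has `"dlti"` loaded (the analogue of the tests' `module attributes {dlti.dl_spec = ...}` workaround), and always succeeds once `"dlti"` is declared as a dependency.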
https://github.com/llvm/llvm-project/pull/139099

From flang-commits at lists.llvm.org Thu May 8 09:08:31 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Thu, 08 May 2025 09:08:31 -0700 (PDT)
Subject: [flang-commits] [flang] [Flang] Add missing dependency to AddDebugInfo pass (PR #139099)
In-Reply-To:
Message-ID: <681cd6ff.050a0220.1cd98c.6ec1@mx.google.com>

tblah wrote:

What about the other passes that call `fir::support::getOrSetMLIRDataLayout` (and similar) helpers?

https://github.com/llvm/llvm-project/pull/139099

From flang-commits at lists.llvm.org Thu May 8 09:40:18 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 08 May 2025 09:40:18 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)
In-Reply-To:
Message-ID: <681cde72.050a0220.1cd9af.7cd6@mx.google.com>

https://github.com/jofrn edited https://github.com/llvm/llvm-project/pull/123609

From flang-commits at lists.llvm.org Thu May 8 09:47:07 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 08 May 2025 09:47:07 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210)
In-Reply-To:
Message-ID: <681ce00b.170a0220.2c13b.1710@mx.google.com>

https://github.com/agozillon updated https://github.com/llvm/llvm-project/pull/138210

>From fd8e8a39e9437eca3e3fdbd0397a58d4a756b9ee Mon Sep 17 00:00:00 2001
From: agozillon
Date: Thu, 1 May 2025 17:40:12 -0500
Subject: [PATCH] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables

Currently, we do not generate the appropriate checks to check if an optional allocatable argument is present before accessing relevant components of it, in particular when creating bounds, we must generate a presence check and we must make sure we do not generate/keep a load external to the presence check by utilising the raw
address rather than the regular address of the info data structure. Similarly in cases for optional allocatables we must treat them like non-allocatable arguments and generate an intermediate allocation that we can have as a location in memory that we can access later in the lowering without causing segfaults when we perform "mapping" on it, even if the end result is an empty allocatable (basically, we shouldn't explode if someone tries to map a non-present optional, similar to C++ when mapping null data). --- .../Optimizer/Builder/DirectivesCommon.h | 11 ++++ flang/lib/Lower/OpenMP/OpenMP.cpp | 3 +- .../Optimizer/OpenMP/MapInfoFinalization.cpp | 16 ++++-- .../Lower/OpenMP/optional-argument-map-2.f90 | 46 +++++++++++++++ .../fortran/optional-mapped-arguments-2.f90 | 57 +++++++++++++++++++ 5 files changed, 128 insertions(+), 5 deletions(-) create mode 100644 flang/test/Lower/OpenMP/optional-argument-map-2.f90 create mode 100644 offload/test/offloading/fortran/optional-mapped-arguments-2.f90 diff --git a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h index 8684299ab6792..183e5711213eb 100644 --- a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h +++ b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h @@ -243,6 +243,17 @@ genBaseBoundsOps(fir::FirOpBuilder &builder, mlir::Location loc, return bounds; } +/// Checks if an argument is optional based on the fortran attributes +/// that are tied to it. 
+inline bool isOptionalArgument(mlir::Operation *op) { + if (auto declareOp = mlir::dyn_cast_or_null(op)) + if (declareOp.getFortranAttrs() && + bitEnumContainsAny(*declareOp.getFortranAttrs(), + fir::FortranVariableFlagsEnum::optional)) + return true; + return false; +} + template llvm::SmallVector genImplicitBoundsOps(fir::FirOpBuilder &builder, AddrAndBoundsInfo &info, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 099d5c604060f..2ff88bb27d186 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2320,7 +2320,8 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, fir::factory::AddrAndBoundsInfo info = Fortran::lower::getDataOperandBaseAddr( - converter, firOpBuilder, sym, converter.getCurrentLocation()); + converter, firOpBuilder, sym.GetUltimate(), + converter.getCurrentLocation()); llvm::SmallVector bounds = fir::factory::genImplicitBoundsOps( diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp index 3fcb4b04a7b76..05d17bf71514b 100644 --- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp +++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp @@ -131,7 +131,8 @@ class MapInfoFinalizationPass boxMap.getVarPtr().getDefiningOp())) descriptor = addrOp.getVal(); - if (!mlir::isa(descriptor.getType())) + if (!mlir::isa(descriptor.getType()) && + !fir::factory::isOptionalArgument(descriptor.getDefiningOp())) return descriptor; mlir::Value &slot = localBoxAllocas[descriptor.getDefiningOp()]; @@ -151,7 +152,12 @@ class MapInfoFinalizationPass mlir::Location loc = boxMap->getLoc(); assert(allocaBlock && "No alloca block found for this top level op"); builder.setInsertionPointToStart(allocaBlock); - auto alloca = builder.create(loc, descriptor.getType()); + + mlir::Type allocaType = descriptor.getType(); + if (fir::isTypeWithDescriptor(allocaType) && + !mlir::isa(descriptor.getType())) + allocaType = 
fir::unwrapRefType(allocaType); + auto alloca = builder.create(loc, allocaType); builder.restoreInsertionPoint(insPt); // We should only emit a store if the passed in data is present, it is // possible a user passes in no argument to an optional parameter, in which @@ -159,8 +165,10 @@ class MapInfoFinalizationPass auto isPresent = builder.create(loc, builder.getI1Type(), descriptor); builder.genIfOp(loc, {}, isPresent, false) - .genThen( - [&]() { builder.create(loc, descriptor, alloca); }) + .genThen([&]() { + descriptor = builder.loadIfRef(loc, descriptor); + builder.create(loc, descriptor, alloca); + }) .end(); return slot = alloca; } diff --git a/flang/test/Lower/OpenMP/optional-argument-map-2.f90 b/flang/test/Lower/OpenMP/optional-argument-map-2.f90 new file mode 100644 index 0000000000000..3b629cfc06d3a --- /dev/null +++ b/flang/test/Lower/OpenMP/optional-argument-map-2.f90 @@ -0,0 +1,46 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +module mod + implicit none +contains + subroutine routine(a) + implicit none + real(4), allocatable, optional, intent(inout) :: a(:) + integer(4) :: i + + !$omp target teams distribute parallel do shared(a) + do i=1,10 + a(i) = i + a(i) + end do + + end subroutine routine +end module mod + +! CHECK-LABEL: func.func @_QMmodProutine( +! CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>> {fir.bindc_name = "a", fir.optional}) { +! CHECK: %[[VAL_0:.*]] = fir.alloca !fir.box>> +! CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[ARG0]] dummy_scope %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QMmodFroutineEa"} : (!fir.ref>>>, !fir.dscope) -> (!fir.ref>>>, !fir.ref>>>) +! CHECK: %[[VAL_8:.*]] = fir.is_present %[[VAL_2]]#1 : (!fir.ref>>>) -> i1 +! CHECK: %[[VAL_9:.*]]:5 = fir.if %[[VAL_8]] -> (index, index, index, index, index) { +! CHECK: %[[VAL_10:.*]] = fir.load %[[VAL_2]]#0 : !fir.ref>>> +! CHECK: %[[VAL_11:.*]] = arith.constant 1 : index +! 
CHECK: %[[VAL_12:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_2]]#0 : !fir.ref>>> +! CHECK: %[[VAL_14:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_15:.*]]:3 = fir.box_dims %[[VAL_13]], %[[VAL_14]] : (!fir.box>>, index) -> (index, index, index) +! CHECK: %[[VAL_16:.*]]:3 = fir.box_dims %[[VAL_10]], %[[VAL_12]] : (!fir.box>>, index) -> (index, index, index) +! CHECK: %[[VAL_17:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_18:.*]] = arith.subi %[[VAL_16]]#1, %[[VAL_11]] : index +! CHECK: fir.result %[[VAL_17]], %[[VAL_18]], %[[VAL_16]]#1, %[[VAL_16]]#2, %[[VAL_15]]#0 : index, index, index, index, index +! CHECK: } else { +! CHECK: %[[VAL_19:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_20:.*]] = arith.constant -1 : index +! CHECK: fir.result %[[VAL_19]], %[[VAL_20]], %[[VAL_19]], %[[VAL_19]], %[[VAL_19]] : index, index, index, index, index +! CHECK: } +! CHECK: %[[VAL_21:.*]] = omp.map.bounds lower_bound(%[[VAL_9]]#0 : index) upper_bound(%[[VAL_9]]#1 : index) extent(%[[VAL_9]]#2 : index) stride(%[[VAL_9]]#3 : index) start_idx(%[[VAL_9]]#4 : index) {stride_in_bytes = true} +! CHECK: %[[VAL_23:.*]] = fir.is_present %[[VAL_2]]#1 : (!fir.ref>>>) -> i1 +! CHECK: fir.if %[[VAL_23]] { +! CHECK: %[[VAL_24:.*]] = fir.load %[[VAL_2]]#1 : !fir.ref>>> +! CHECK: fir.store %[[VAL_24]] to %[[VAL_0]] : !fir.ref>>> +! CHECK: } diff --git a/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 b/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 new file mode 100644 index 0000000000000..0de6b7730d3a0 --- /dev/null +++ b/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 @@ -0,0 +1,57 @@ +! OpenMP offloading regression test that checks we do not cause a segfault when +! implicitly mapping a not present optional allocatable function argument and +! utilise it in the target region. No results requiring checking other than +! that the program compiles and runs to completion with no error. +! 
REQUIRES: flang, amdgpu + +! RUN: %libomptarget-compile-fortran-run-and-check-generic +module mod + implicit none +contains + subroutine routine(a, b) + implicit none + real(4), allocatable, optional, intent(in) :: a(:) + real(4), intent(out) :: b(:) + integer(4) :: i, ia + if(present(a)) then + ia = 1 + write(*,*) "a is present" + else + ia=0 + write(*,*) "a is not present" + end if + + !$omp target teams distribute parallel do shared(a,b,ia) + do i=1,10 + if (ia>0) then + b(i) = b(i) + a(i) + end if + end do + + end subroutine routine + +end module mod + +program main + use mod + implicit none + real(4), allocatable :: a(:) + real(4), allocatable :: b(:) + integer(4) :: i + allocate(b(10)) + do i=1,10 + b(i)=0 + end do + !$omp target data map(from: b) + + call routine(b=b) + + !$omp end target data + + deallocate(b) + + print *, "success, no segmentation fault" +end program main + +!CHECK: a is not present +!CHECK: success, no segmentation fault From flang-commits at lists.llvm.org Thu May 8 09:51:59 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 09:51:59 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) In-Reply-To: Message-ID: <681ce12f.a70a0220.145002.8324@mx.google.com> ================ @@ -156,9 +156,9 @@ genBoundsOpsFromBox(fir::FirOpBuilder &builder, mlir::Location loc, builder.genIfOp(loc, resTypes, info.isPresent, /*withElseRegion=*/true) .genThen([&]() { mlir::Value box = - !fir::isBoxAddress(info.addr.getType()) + !fir::isBoxAddress(info.rawInput.getType()) ? info.addr - : builder.create(loc, info.addr); + : builder.create(loc, info.rawInput); ---------------- agozillon wrote: Thank you very much for pointing that out Jean! 
I looked into why it wasn't being triggered and it turns out we need to use the ultimate symbol when we're generating our AddrAndBoundsInfo, as the base symbol we have at that time refers to the map (or in this case shared clause) symbols which don't necessarily have the optional tag/information. So I've reverted the changes to this bit of code and swapped to utilising the ultimate symbol. Thank you again for helping with the simplification :-)

https://github.com/llvm/llvm-project/pull/138210

From flang-commits at lists.llvm.org Thu May 8 09:56:05 2025
From: flang-commits at lists.llvm.org (David Truby via flang-commits)
Date: Thu, 08 May 2025 09:56:05 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][driver] do not crash when fc1 process multiple files (PR #138875)
In-Reply-To:
Message-ID: <681ce225.170a0220.5c999.547e@mx.google.com>

DavidTruby wrote:

Sorry, @inaki-amatria's comment has made me realise I didn't look at this closely enough; I clearly only skim read the commit message, and the code change looked sensible to me.

I've checked what clang -cc1 does here, and that only accepts the last flag. I.e. `clang -cc1 -o test.o -x c test.c -o test2.o -x c test2.c` will only generate `test2.o`. It does this silently, i.e. it doesn't give you any warning that you've passed multiple files.

I'm not opposed to `flang -fc1` accepting it even though `clang -cc1` doesn't, but I'm not actually clear what the expected behaviour would be with `flang -fc1` accepting multiple input files. What happens with this change?

I do think that making this re-entrant is sensible anyway for library users as @inaki-amatria mentioned so my accept still stands.

https://github.com/llvm/llvm-project/pull/138875

From flang-commits at lists.llvm.org Thu May 8 10:00:10 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 08 May 2025 10:00:10 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210)
In-Reply-To:
Message-ID: <681ce31a.170a0220.3cf55c.4481@mx.google.com>

https://github.com/agozillon updated https://github.com/llvm/llvm-project/pull/138210
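The presence-guarded bounds generation that PR #138210 tests for can be sketched in plain C++. This is a simplified illustration under stated assumptions, not the pass's code: `ToyDescriptor`, `ToyBounds` and `computeBounds` are hypothetical names, a null pointer stands in for a non-present optional argument, and the five returned values follow the order of the `fir.result` operands in the CHECK lines of `optional-argument-map-2.f90` (lower bound, upper bound, extent, stride, start index).

```cpp
#include <cstdint>

// Hypothetical one-dimensional descriptor, loosely modelled on the three
// values fir.box_dims yields: lower bound, extent and stride.
struct ToyDescriptor {
  std::int64_t lowerBound;
  std::int64_t extent;
  std::int64_t stride;
};

// The five values fed to omp.map.bounds in the test's CHECK lines.
struct ToyBounds {
  std::int64_t lowerBound;
  std::int64_t upperBound;
  std::int64_t extent;
  std::int64_t stride;
  std::int64_t startIdx;
};

inline bool operator==(const ToyBounds &a, const ToyBounds &b) {
  return a.lowerBound == b.lowerBound && a.upperBound == b.upperBound &&
         a.extent == b.extent && a.stride == b.stride &&
         a.startIdx == b.startIdx;
}

// A null pointer models an optional dummy argument that was not passed.
inline ToyBounds computeBounds(const ToyDescriptor *desc) {
  // Absent argument: return the empty bounds of the "else" branch
  // (0, -1, 0, 0, 0) without ever reading the descriptor.
  if (desc == nullptr)
    return {0, -1, 0, 0, 0};
  // Present argument: every read of the descriptor stays inside the guard.
  return {0, desc->extent - 1, desc->extent, desc->stride, desc->lowerBound};
}
```

The point of the sketch is that every read of the descriptor sits inside the presence check, which corresponds to the commit message's requirement that no load is generated or kept outside the `fir.if`.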

From flang-commits at lists.llvm.org Thu May 8 10:01:44 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 08 May 2025 10:01:44 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210)
In-Reply-To:
Message-ID: <681ce378.170a0220.799e7.357a@mx.google.com>

agozillon wrote:

I believe I addressed all reviewer comments in the last update and simplified the PR a fair bit thanks to some information from @jeanPerier

https://github.com/llvm/llvm-project/pull/138210

From flang-commits at lists.llvm.org Thu May 8 10:07:42 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 08 May 2025 10:07:42 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)
In-Reply-To:
Message-ID: <681ce4de.170a0220.29a93d.4f63@mx.google.com>

https://github.com/jofrn updated https://github.com/llvm/llvm-project/pull/123609

From flang-commits at lists.llvm.org Thu May 8 11:22:32 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 11:22:32 -0700 (PDT) Subject: [flang-commits] [flang] 5fe69fd - [flang][OpenMP] Update `do concurrent` mapping pass to use `fir.do_concurrent` op (#138489) Message-ID: <681cf668.170a0220.1762a8.6b50@mx.google.com> Author: Kareem Ergawy Date: 2025-05-08T20:22:29+02:00 New Revision: 5fe69fd95c4e2bc55a41a41047d08522a5f26d57 URL: https://github.com/llvm/llvm-project/commit/5fe69fd95c4e2bc55a41a41047d08522a5f26d57 DIFF: https://github.com/llvm/llvm-project/commit/5fe69fd95c4e2bc55a41a41047d08522a5f26d57.diff LOG: [flang][OpenMP] Update `do concurrent` mapping pass to use `fir.do_concurrent` op (#138489) This PR updates the `do concurrent` to OpenMP mapping pass to use the newly added `fir.do_concurrent` ops that were recently added upstream instead of handling nests of `fir.do_loop ... unordered` ops. Parent PR: https://github.com/llvm/llvm-project/pull/137928. 
Added: Modified: flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp flang/test/Transforms/DoConcurrent/basic_device.mlir flang/test/Transforms/DoConcurrent/basic_host.f90 flang/test/Transforms/DoConcurrent/basic_host.mlir flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 flang/test/Transforms/DoConcurrent/non_const_bounds.f90 flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 Removed: flang/test/Transforms/DoConcurrent/loop_nest_test.f90 ################################################################################ diff --git a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp index 2c069860ffdca..0fdb302fe10ca 100644 --- a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp +++ b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp @@ -6,6 +6,7 @@ // //===----------------------------------------------------------------------===// +#include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/OpenMP/Passes.h" #include "flang/Optimizer/OpenMP/Utils.h" @@ -28,8 +29,10 @@ namespace looputils { /// Stores info needed about the induction/iteration variable for each `do /// concurrent` in a loop nest. struct InductionVariableInfo { - InductionVariableInfo(fir::DoLoopOp doLoop) { populateInfo(doLoop); } - + InductionVariableInfo(fir::DoConcurrentLoopOp loop, + mlir::Value inductionVar) { + populateInfo(loop, inductionVar); + } /// The operation allocating memory for iteration variable. mlir::Operation *iterVarMemDef; /// the operation(s) updating the iteration variable with the current @@ -45,7 +48,7 @@ struct InductionVariableInfo { /// ... /// %i:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : ... /// ... 
- /// fir.do_loop %ind_var = %lb to %ub step %s unordered { + /// fir.do_concurrent.loop (%ind_var) = (%lb) to (%ub) step (%s) { /// %ind_var_conv = fir.convert %ind_var : (index) -> i32 /// fir.store %ind_var_conv to %i#1 : !fir.ref /// ... @@ -62,14 +65,14 @@ struct InductionVariableInfo { /// Note: The current implementation is dependent on how flang emits loop /// bodies; which is sufficient for the current simple test/use cases. If this /// proves to be insufficient, this should be made more generic. - void populateInfo(fir::DoLoopOp doLoop) { + void populateInfo(fir::DoConcurrentLoopOp loop, mlir::Value inductionVar) { mlir::Value result = nullptr; // Checks if a StoreOp is updating the memref of the loop's iteration // variable. auto isStoringIV = [&](fir::StoreOp storeOp) { // Direct store into the IV memref. - if (storeOp.getValue() == doLoop.getInductionVar()) { + if (storeOp.getValue() == inductionVar) { indVarUpdateOps.push_back(storeOp); return true; } @@ -77,7 +80,7 @@ struct InductionVariableInfo { // Indirect store into the IV memref. if (auto convertOp = mlir::dyn_cast( storeOp.getValue().getDefiningOp())) { - if (convertOp.getOperand() == doLoop.getInductionVar()) { + if (convertOp.getOperand() == inductionVar) { indVarUpdateOps.push_back(convertOp); indVarUpdateOps.push_back(storeOp); return true; @@ -87,7 +90,7 @@ struct InductionVariableInfo { return false; }; - for (mlir::Operation &op : doLoop) { + for (mlir::Operation &op : loop) { if (auto storeOp = mlir::dyn_cast(op)) if (isStoringIV(storeOp)) { result = storeOp.getMemref(); @@ -100,219 +103,7 @@ struct InductionVariableInfo { } }; -using LoopNestToIndVarMap = - llvm::MapVector; - -/// Loop \p innerLoop is considered perfectly-nested inside \p outerLoop iff -/// there are no operations in \p outerloop's body other than: -/// -/// 1. the operations needed to assign/update \p outerLoop's induction variable. -/// 2. \p innerLoop itself. 
-/// -/// \p return true if \p innerLoop is perfectly nested inside \p outerLoop -/// according to the above definition. -bool isPerfectlyNested(fir::DoLoopOp outerLoop, fir::DoLoopOp innerLoop) { - mlir::ForwardSliceOptions forwardSliceOptions; - forwardSliceOptions.inclusive = true; - // The following will be used as an example to clarify the internals of this - // function: - // ``` - // 1. fir.do_loop %i_idx = %34 to %36 step %c1 unordered { - // 2. %i_idx_2 = fir.convert %i_idx : (index) -> i32 - // 3. fir.store %i_idx_2 to %i_iv#1 : !fir.ref - // - // 4. fir.do_loop %j_idx = %37 to %39 step %c1_3 unordered { - // 5. %j_idx_2 = fir.convert %j_idx : (index) -> i32 - // 6. fir.store %j_idx_2 to %j_iv#1 : !fir.ref - // ... loop nest body, possible uses %i_idx ... - // } - // } - // ``` - // In this example, the `j` loop is perfectly nested inside the `i` loop and - // below is how we find that. - - // We don't care about the outer-loop's induction variable's uses within the - // inner-loop, so we filter out these uses. - // - // This filter tells `getForwardSlice` (below) to only collect operations - // which produce results defined above (i.e. outside) the inner-loop's body. - // - // Since `outerLoop.getInductionVar()` is a block argument (to the - // outer-loop's body), the filter effectively collects uses of - // `outerLoop.getInductionVar()` inside the outer-loop but outside the - // inner-loop. - forwardSliceOptions.filter = [&](mlir::Operation *op) { - return mlir::areValuesDefinedAbove(op->getResults(), innerLoop.getRegion()); - }; - - llvm::SetVector indVarSlice; - // The forward slice of the `i` loop's IV will be the 2 ops in line 1 & 2 - // above. Uses of `%i_idx` inside the `j` loop are not collected because of - // the filter. 
- mlir::getForwardSlice(outerLoop.getInductionVar(), &indVarSlice, - forwardSliceOptions); - llvm::DenseSet indVarSet(indVarSlice.begin(), - indVarSlice.end()); - - llvm::DenseSet outerLoopBodySet; - // The following walk collects ops inside `outerLoop` that are **not**: - // * the outer-loop itself, - // * or the inner-loop, - // * or the `fir.result` op (the outer-loop's terminator). - // - // For the above example, this will also populate `outerLoopBodySet` with ops - // in line 1 & 2 since we skip the `i` loop, the `j` loop, and the terminator. - outerLoop.walk([&](mlir::Operation *op) { - if (op == outerLoop) - return mlir::WalkResult::advance(); - - if (op == innerLoop) - return mlir::WalkResult::skip(); - - if (mlir::isa(op)) - return mlir::WalkResult::advance(); - - outerLoopBodySet.insert(op); - return mlir::WalkResult::advance(); - }); - - // If `outerLoopBodySet` ends up having the same ops as `indVarSet`, then - // `outerLoop` only contains ops that setup its induction variable + - // `innerLoop` + the `fir.result` terminator. In other words, `innerLoop` is - // perfectly nested inside `outerLoop`. - bool result = (outerLoopBodySet == indVarSet); - LLVM_DEBUG(DBGS() << "Loop pair starting at location " << outerLoop.getLoc() - << " is" << (result ? "" : " not") - << " perfectly nested\n"); - - return result; -} - -/// Starting with `currentLoop` collect a perfectly nested loop nest, if any. -/// This function collects as much as possible loops in the nest; it case it -/// fails to recognize a certain nested loop as part of the nest it just returns -/// the parent loops it discovered before. 
-mlir::LogicalResult collectLoopNest(fir::DoLoopOp currentLoop, - LoopNestToIndVarMap &loopNest) { - assert(currentLoop.getUnordered()); - - while (true) { - loopNest.insert({currentLoop, InductionVariableInfo(currentLoop)}); - llvm::SmallVector unorderedLoops; - - for (auto nestedLoop : currentLoop.getRegion().getOps()) - if (nestedLoop.getUnordered()) - unorderedLoops.push_back(nestedLoop); - - if (unorderedLoops.empty()) - break; - - // Having more than one unordered loop means that we are not dealing with a - // perfect loop nest (i.e. a mulit-range `do concurrent` loop); which is the - // case we are after here. - if (unorderedLoops.size() > 1) - return mlir::failure(); - - fir::DoLoopOp nestedUnorderedLoop = unorderedLoops.front(); - - if (!isPerfectlyNested(currentLoop, nestedUnorderedLoop)) - return mlir::failure(); - - currentLoop = nestedUnorderedLoop; - } - - return mlir::success(); -} - -/// Prepares the `fir.do_loop` nest to be easily mapped to OpenMP. In -/// particular, this function would take this input IR: -/// ``` -/// fir.do_loop %i_iv = %i_lb to %i_ub step %i_step unordered { -/// fir.store %i_iv to %i#1 : !fir.ref -/// %j_lb = arith.constant 1 : i32 -/// %j_ub = arith.constant 10 : i32 -/// %j_step = arith.constant 1 : index -/// -/// fir.do_loop %j_iv = %j_lb to %j_ub step %j_step unordered { -/// fir.store %j_iv to %j#1 : !fir.ref -/// ... 
-/// } -/// } -/// ``` -/// -/// into the following form (using generic op form since the result is -/// technically an invalid `fir.do_loop` op: -/// -/// ``` -/// "fir.do_loop"(%i_lb, %i_ub, %i_step) <{unordered}> ({ -/// ^bb0(%i_iv: index): -/// %j_lb = "arith.constant"() <{value = 1 : i32}> : () -> i32 -/// %j_ub = "arith.constant"() <{value = 10 : i32}> : () -> i32 -/// %j_step = "arith.constant"() <{value = 1 : index}> : () -> index -/// -/// "fir.do_loop"(%j_lb, %j_ub, %j_step) <{unordered}> ({ -/// ^bb0(%new_i_iv: index, %new_j_iv: index): -/// "fir.store"(%new_i_iv, %i#1) : (i32, !fir.ref) -> () -/// "fir.store"(%new_j_iv, %j#1) : (i32, !fir.ref) -> () -/// ... -/// }) -/// ``` -/// -/// What happened to the loop nest is the following: -/// -/// * the innermost loop's entry block was updated from having one operand to -/// having `n` operands where `n` is the number of loops in the nest, -/// -/// * the outer loop(s)' ops that update the IVs were sank inside the innermost -/// loop (see the `"fir.store"(%new_i_iv, %i#1)` op above), -/// -/// * the innermost loop's entry block's arguments were mapped in order from the -/// outermost to the innermost IV. -/// -/// With this IR change, we can directly inline the innermost loop's region into -/// the newly generated `omp.loop_nest` op. -/// -/// Note that this function has a pre-condition that \p loopNest consists of -/// perfectly nested loops; i.e. there are no in-between ops between 2 nested -/// loops except for the ops to setup the inner loop's LB, UB, and step. These -/// ops are handled/cloned by `genLoopNestClauseOps(..)`. 
-void sinkLoopIVArgs(mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest) { - if (loopNest.size() <= 1) - return; - - fir::DoLoopOp innermostLoop = loopNest.back().first; - mlir::Operation &innermostFirstOp = innermostLoop.getRegion().front().front(); - - llvm::SmallVector argTypes; - llvm::SmallVector argLocs; - - for (auto &[doLoop, indVarInfo] : llvm::drop_end(loopNest)) { - // Sink the IV update ops to the innermost loop. We need to do for all loops - // except for the innermost one, hence the `drop_end` usage above. - for (mlir::Operation *op : indVarInfo.indVarUpdateOps) - op->moveBefore(&innermostFirstOp); - - argTypes.push_back(doLoop.getInductionVar().getType()); - argLocs.push_back(doLoop.getInductionVar().getLoc()); - } - - mlir::Region &innermmostRegion = innermostLoop.getRegion(); - // Extend the innermost entry block with arguments to represent the outer IVs. - innermmostRegion.addArguments(argTypes, argLocs); - - unsigned idx = 1; - // In reverse, remap the IVs of the loop nest from the old values to the new - // ones. We do that in reverse since the first argument before this loop is - // the old IV for the innermost loop. Therefore, we want to replace it first - // before the old value (1st argument in the block) is remapped to be the IV - // of the outermost loop in the nest. - for (auto &[doLoop, _] : llvm::reverse(loopNest)) { - doLoop.getInductionVar().replaceAllUsesWith( - innermmostRegion.getArgument(innermmostRegion.getNumArguments() - idx)); - ++idx; - } -} +using InductionVariableInfos = llvm::SmallVector; /// Collects values that are local to a loop: "loop-local values". A loop-local /// value is one that is used exclusively inside the loop but allocated outside @@ -326,9 +117,9 @@ void sinkLoopIVArgs(mlir::ConversionPatternRewriter &rewriter, /// used exclusively inside. /// /// \param [out] locals - the list of loop-local values detected for \p doLoop. 
-void collectLoopLocalValues(fir::DoLoopOp doLoop, +void collectLoopLocalValues(fir::DoConcurrentLoopOp loop, llvm::SetVector &locals) { - doLoop.walk([&](mlir::Operation *op) { + loop.walk([&](mlir::Operation *op) { for (mlir::Value operand : op->getOperands()) { if (locals.contains(operand)) continue; @@ -340,11 +131,11 @@ void collectLoopLocalValues(fir::DoLoopOp doLoop, // Values defined inside the loop are not interesting since they do not // need to be localized. - if (doLoop->isAncestor(operand.getDefiningOp())) + if (loop->isAncestor(operand.getDefiningOp())) continue; for (auto *user : operand.getUsers()) { - if (!doLoop->isAncestor(user)) { + if (!loop->isAncestor(user)) { isLocal = false; break; } @@ -373,39 +164,42 @@ static void localizeLoopLocalValue(mlir::Value local, mlir::Region &allocRegion, } } // namespace looputils -class DoConcurrentConversion : public mlir::OpConversionPattern { +class DoConcurrentConversion + : public mlir::OpConversionPattern { public: - using mlir::OpConversionPattern::OpConversionPattern; + using mlir::OpConversionPattern::OpConversionPattern; - DoConcurrentConversion(mlir::MLIRContext *context, bool mapToDevice, - llvm::DenseSet &concurrentLoopsToSkip) + DoConcurrentConversion( + mlir::MLIRContext *context, bool mapToDevice, + llvm::DenseSet &concurrentLoopsToSkip) : OpConversionPattern(context), mapToDevice(mapToDevice), concurrentLoopsToSkip(concurrentLoopsToSkip) {} mlir::LogicalResult - matchAndRewrite(fir::DoLoopOp doLoop, OpAdaptor adaptor, + matchAndRewrite(fir::DoConcurrentOp doLoop, OpAdaptor adaptor, mlir::ConversionPatternRewriter &rewriter) const override { if (mapToDevice) return doLoop.emitError( "not yet implemented: Mapping `do concurrent` loops to device"); - looputils::LoopNestToIndVarMap loopNest; - bool hasRemainingNestedLoops = - failed(looputils::collectLoopNest(doLoop, loopNest)); - if (hasRemainingNestedLoops) - mlir::emitWarning(doLoop.getLoc(), - "Some `do concurent` loops are not 
perfectly-nested. " - "These will be serialized."); + looputils::InductionVariableInfos ivInfos; + auto loop = mlir::cast( + doLoop.getRegion().back().getTerminator()); + + auto indVars = loop.getLoopInductionVars(); + assert(indVars.has_value()); + + for (mlir::Value indVar : *indVars) + ivInfos.emplace_back(loop, indVar); llvm::SetVector locals; - looputils::collectLoopLocalValues(loopNest.back().first, locals); - looputils::sinkLoopIVArgs(rewriter, loopNest); + looputils::collectLoopLocalValues(loop, locals); mlir::IRMapping mapper; mlir::omp::ParallelOp parallelOp = - genParallelOp(doLoop.getLoc(), rewriter, loopNest, mapper); + genParallelOp(doLoop.getLoc(), rewriter, ivInfos, mapper); mlir::omp::LoopNestOperands loopNestClauseOps; - genLoopNestClauseOps(doLoop.getLoc(), rewriter, loopNest, mapper, + genLoopNestClauseOps(doLoop.getLoc(), rewriter, loop, mapper, loopNestClauseOps); for (mlir::Value local : locals) @@ -413,41 +207,56 @@ class DoConcurrentConversion : public mlir::OpConversionPattern { rewriter); mlir::omp::LoopNestOp ompLoopNest = - genWsLoopOp(rewriter, loopNest.back().first, mapper, loopNestClauseOps, + genWsLoopOp(rewriter, loop, mapper, loopNestClauseOps, /*isComposite=*/mapToDevice); - rewriter.eraseOp(doLoop); + rewriter.setInsertionPoint(doLoop); + fir::FirOpBuilder builder( + rewriter, + fir::getKindMapping(doLoop->getParentOfType())); + + // Collect iteration variable(s) allocations so that we can move them + // outside the `fir.do_concurrent` wrapper (before erasing it). + llvm::SmallVector opsToMove; + for (mlir::Operation &op : llvm::drop_end(doLoop)) + opsToMove.push_back(&op); + + mlir::Block *allocBlock = builder.getAllocaBlock(); + + for (mlir::Operation *op : llvm::reverse(opsToMove)) { + rewriter.moveOpBefore(op, allocBlock, allocBlock->begin()); + } // Mark `unordered` loops that are not perfectly nested to be skipped from // the legality check of the `ConversionTarget` since we are not interested // in mapping them to OpenMP. 
- ompLoopNest->walk([&](fir::DoLoopOp doLoop) { - if (doLoop.getUnordered()) { - concurrentLoopsToSkip.insert(doLoop); - } + ompLoopNest->walk([&](fir::DoConcurrentOp doLoop) { + concurrentLoopsToSkip.insert(doLoop); }); + rewriter.eraseOp(doLoop); + return mlir::success(); } private: - mlir::omp::ParallelOp genParallelOp(mlir::Location loc, - mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest, - mlir::IRMapping &mapper) const { + mlir::omp::ParallelOp + genParallelOp(mlir::Location loc, mlir::ConversionPatternRewriter &rewriter, + looputils::InductionVariableInfos &ivInfos, + mlir::IRMapping &mapper) const { auto parallelOp = rewriter.create(loc); rewriter.createBlock(¶llelOp.getRegion()); rewriter.setInsertionPoint(rewriter.create(loc)); - genLoopNestIndVarAllocs(rewriter, loopNest, mapper); + genLoopNestIndVarAllocs(rewriter, ivInfos, mapper); return parallelOp; } void genLoopNestIndVarAllocs(mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest, + looputils::InductionVariableInfos &ivInfos, mlir::IRMapping &mapper) const { - for (auto &[_, indVarInfo] : loopNest) + for (auto &indVarInfo : ivInfos) genInductionVariableAlloc(rewriter, indVarInfo.iterVarMemDef, mapper); } @@ -471,10 +280,11 @@ class DoConcurrentConversion : public mlir::OpConversionPattern { return result; } - void genLoopNestClauseOps( - mlir::Location loc, mlir::ConversionPatternRewriter &rewriter, - looputils::LoopNestToIndVarMap &loopNest, mlir::IRMapping &mapper, - mlir::omp::LoopNestOperands &loopNestClauseOps) const { + void + genLoopNestClauseOps(mlir::Location loc, + mlir::ConversionPatternRewriter &rewriter, + fir::DoConcurrentLoopOp loop, mlir::IRMapping &mapper, + mlir::omp::LoopNestOperands &loopNestClauseOps) const { assert(loopNestClauseOps.loopLowerBounds.empty() && "Loop nest bounds were already emitted!"); @@ -483,43 +293,42 @@ class DoConcurrentConversion : public mlir::OpConversionPattern { 
bounds.push_back(var.getDefiningOp()->getResult(0)); }; - for (auto &[doLoop, _] : loopNest) { - populateBounds(doLoop.getLowerBound(), loopNestClauseOps.loopLowerBounds); - populateBounds(doLoop.getUpperBound(), loopNestClauseOps.loopUpperBounds); - populateBounds(doLoop.getStep(), loopNestClauseOps.loopSteps); + for (auto [lb, ub, st] : llvm::zip_equal( + loop.getLowerBound(), loop.getUpperBound(), loop.getStep())) { + populateBounds(lb, loopNestClauseOps.loopLowerBounds); + populateBounds(ub, loopNestClauseOps.loopUpperBounds); + populateBounds(st, loopNestClauseOps.loopSteps); } loopNestClauseOps.loopInclusive = rewriter.getUnitAttr(); } mlir::omp::LoopNestOp - genWsLoopOp(mlir::ConversionPatternRewriter &rewriter, fir::DoLoopOp doLoop, - mlir::IRMapping &mapper, + genWsLoopOp(mlir::ConversionPatternRewriter &rewriter, + fir::DoConcurrentLoopOp loop, mlir::IRMapping &mapper, const mlir::omp::LoopNestOperands &clauseOps, bool isComposite) const { - auto wsloopOp = rewriter.create(doLoop.getLoc()); + auto wsloopOp = rewriter.create(loop.getLoc()); wsloopOp.setComposite(isComposite); rewriter.createBlock(&wsloopOp.getRegion()); auto loopNestOp = - rewriter.create(doLoop.getLoc(), clauseOps); + rewriter.create(loop.getLoc(), clauseOps); // Clone the loop's body inside the loop nest construct using the // mapped values. 
- rewriter.cloneRegionBefore(doLoop.getRegion(), loopNestOp.getRegion(), + rewriter.cloneRegionBefore(loop.getRegion(), loopNestOp.getRegion(), loopNestOp.getRegion().begin(), mapper); - mlir::Operation *terminator = loopNestOp.getRegion().back().getTerminator(); rewriter.setInsertionPointToEnd(&loopNestOp.getRegion().back()); - rewriter.create(terminator->getLoc()); - rewriter.eraseOp(terminator); + rewriter.create(loop->getLoc()); return loopNestOp; } bool mapToDevice; - llvm::DenseSet &concurrentLoopsToSkip; + llvm::DenseSet &concurrentLoopsToSkip; }; class DoConcurrentConversionPass @@ -548,19 +357,16 @@ class DoConcurrentConversionPass return; } - llvm::DenseSet concurrentLoopsToSkip; + llvm::DenseSet concurrentLoopsToSkip; mlir::RewritePatternSet patterns(context); patterns.insert( context, mapTo == flangomp::DoConcurrentMappingKind::DCMK_Device, concurrentLoopsToSkip); mlir::ConversionTarget target(*context); - target.addDynamicallyLegalOp([&](fir::DoLoopOp op) { - // The goal is to handle constructs that eventually get lowered to - // `fir.do_loop` with the `unordered` attribute (e.g. array expressions). - // Currently, this is only enabled for the `do concurrent` construct since - // the pass runs early in the pipeline. 
- return !op.getUnordered() || concurrentLoopsToSkip.contains(op); - }); + target.addDynamicallyLegalOp( + [&](fir::DoConcurrentOp op) { + return concurrentLoopsToSkip.contains(op); + }); target.markUnknownOpDynamicallyLegal( [](mlir::Operation *) { return true; }); diff --git a/flang/test/Transforms/DoConcurrent/basic_device.mlir b/flang/test/Transforms/DoConcurrent/basic_device.mlir index d7fcc40e4a7f9..0ca48943864c8 100644 --- a/flang/test/Transforms/DoConcurrent/basic_device.mlir +++ b/flang/test/Transforms/DoConcurrent/basic_device.mlir @@ -1,8 +1,6 @@ // RUN: fir-opt --omp-do-concurrent-conversion="map-to=device" -verify-diagnostics %s func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_basic"} { - %0 = fir.alloca i32 {bindc_name = "i"} - %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) %2 = fir.address_of(@_QFEa) : !fir.ref> %c10 = arith.constant 10 : index %3 = fir.shape %c10 : (index) -> !fir.shape<1> @@ -14,15 +12,19 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_bas %c1 = arith.constant 1 : index // expected-error at +2 {{not yet implemented: Mapping `do concurrent` loops to device}} - // expected-error at below {{failed to legalize operation 'fir.do_loop'}} - fir.do_loop %arg0 = %7 to %8 step %c1 unordered { - %13 = fir.convert %arg0 : (index) -> i32 - fir.store %13 to %1#1 : !fir.ref - %14 = fir.load %1#0 : !fir.ref - %15 = fir.load %1#0 : !fir.ref - %16 = fir.convert %15 : (i32) -> i64 - %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref - hlfir.assign %14 to %17 : i32, !fir.ref + // expected-error at below {{failed to legalize operation 'fir.do_concurrent'}} + fir.do_concurrent { + %0 = fir.alloca i32 {bindc_name = "i"} + %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) + fir.do_concurrent.loop (%arg0) = (%7) to (%8) step (%c1) { + %13 = fir.convert %arg0 : (index) -> i32 + fir.store %13 to %1#1 : !fir.ref + %14 = 
fir.load %1#0 : !fir.ref + %15 = fir.load %1#0 : !fir.ref + %16 = fir.convert %15 : (i32) -> i64 + %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref + hlfir.assign %14 to %17 : i32, !fir.ref + } } return diff --git a/flang/test/Transforms/DoConcurrent/basic_host.f90 b/flang/test/Transforms/DoConcurrent/basic_host.f90 index b84d4481ac766..12f63031cbaee 100644 --- a/flang/test/Transforms/DoConcurrent/basic_host.f90 +++ b/flang/test/Transforms/DoConcurrent/basic_host.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests mapping of a basic `do concurrent` loop to `!$omp parallel do`. ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host %s -o - \ diff --git a/flang/test/Transforms/DoConcurrent/basic_host.mlir b/flang/test/Transforms/DoConcurrent/basic_host.mlir index b53ecd687c039..5425829404d7b 100644 --- a/flang/test/Transforms/DoConcurrent/basic_host.mlir +++ b/flang/test/Transforms/DoConcurrent/basic_host.mlir @@ -6,8 +6,6 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_basic"} { // CHECK: %[[ARR:.*]]:2 = hlfir.declare %{{.*}}(%{{.*}}) {uniq_name = "_QFEa"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) - %0 = fir.alloca i32 {bindc_name = "i"} - %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) %2 = fir.address_of(@_QFEa) : !fir.ref> %c10 = arith.constant 10 : index %3 = fir.shape %c10 : (index) -> !fir.shape<1> @@ -18,7 +16,7 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_bas %8 = fir.convert %c10_i32 : (i32) -> index %c1 = arith.constant 1 : index - // CHECK-NOT: fir.do_loop + // CHECK-NOT: fir.do_concurrent // CHECK: %[[C1:.*]] = arith.constant 1 : i32 // CHECK: %[[LB:.*]] = fir.convert %[[C1]] : (i32) -> index @@ -46,17 +44,21 @@ func.func @do_concurrent_basic() attributes {fir.bindc_name = "do_concurrent_bas // CHECK-NEXT: omp.terminator // CHECK-NEXT: } - fir.do_loop 
%arg0 = %7 to %8 step %c1 unordered { - %13 = fir.convert %arg0 : (index) -> i32 - fir.store %13 to %1#1 : !fir.ref - %14 = fir.load %1#0 : !fir.ref - %15 = fir.load %1#0 : !fir.ref - %16 = fir.convert %15 : (i32) -> i64 - %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref - hlfir.assign %14 to %17 : i32, !fir.ref + fir.do_concurrent { + %0 = fir.alloca i32 {bindc_name = "i"} + %1:2 = hlfir.declare %0 {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) + fir.do_concurrent.loop (%arg0) = (%7) to (%8) step (%c1) { + %13 = fir.convert %arg0 : (index) -> i32 + fir.store %13 to %1#1 : !fir.ref + %14 = fir.load %1#0 : !fir.ref + %15 = fir.load %1#0 : !fir.ref + %16 = fir.convert %15 : (i32) -> i64 + %17 = hlfir.designate %4#0 (%16) : (!fir.ref>, i64) -> !fir.ref + hlfir.assign %14 to %17 : i32, !fir.ref + } } - // CHECK-NOT: fir.do_loop + // CHECK-NOT: fir.do_concurrent return } diff --git a/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 b/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 index 4e13c0919589a..f82696669eca6 100644 --- a/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 +++ b/flang/test/Transforms/DoConcurrent/locally_destroyed_temp.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests that "loop-local values" are properly handled by localizing them to the ! body of the loop nest. See `collectLoopLocalValues` and `localizeLoopLocalValue` ! for a definition of "loop-local values" and how they are handled. diff --git a/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 b/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 deleted file mode 100644 index adc4a488d1ec9..0000000000000 --- a/flang/test/Transforms/DoConcurrent/loop_nest_test.f90 +++ /dev/null @@ -1,92 +0,0 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - -! Tests loop-nest detection algorithm for do-concurrent mapping. - -! 
REQUIRES: asserts - -! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host \ -! RUN: -mmlir -debug -mmlir -mlir-disable-threading %s -o - 2> %t.log || true - -! RUN: FileCheck %s < %t.log - -program main - implicit none - -contains - -subroutine foo(n) - implicit none - integer :: n, m - integer :: i, j, k - integer :: x - integer, dimension(n) :: a - integer, dimension(n, n, n) :: b - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is perfectly nested - do concurrent(i=1:n, j=1:bar(n*m, n/m)) - a(i) = n - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is perfectly nested - do concurrent(i=bar(n, x):n, j=1:bar(n*m, n/m)) - a(i) = n - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=bar(n, x):n) - do concurrent(j=1:bar(n*m, n/m)) - a(i) = n - end do - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=1:n) - x = 10 - do concurrent(j=1:m) - b(i,j,k) = i * j + k - end do - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=1:n) - do concurrent(j=1:m) - b(i,j,k) = i * j + k - end do - x = 10 - end do - - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is not perfectly nested - do concurrent(i=1:n) - do concurrent(j=1:m) - b(i,j,k) = i * j + k - x = 10 - end do - end do - - ! Verify the (i,j) and (j,k) pairs of loops are detected as perfectly nested. - ! - ! CHECK: Loop pair starting at location - ! CHECK: loc("{{.*}}":[[# @LINE + 3]]:{{.*}}) is perfectly nested - ! CHECK: Loop pair starting at location - ! 
CHECK: loc("{{.*}}":[[# @LINE + 1]]:{{.*}}) is perfectly nested - do concurrent(i=bar(n, x):n, j=1:bar(n*m, n/m), k=1:bar(n*m, bar(n*m, n/m))) - a(i) = n - end do -end subroutine - -pure function bar(n, m) - implicit none - integer, intent(in) :: n, m - integer :: bar - - bar = n + m -end function - -end program main diff --git a/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 b/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 index 26800678d381c..d0210726de83e 100644 --- a/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 +++ b/flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests mapping of a `do concurrent` loop with multiple iteration ranges. ! RUN: split-file %s %t diff --git a/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 b/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 index 23a3aae976c07..cd1bd4f98a3f5 100644 --- a/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 +++ b/flang/test/Transforms/DoConcurrent/non_const_bounds.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fdo-concurrent-to-openmp=host %s -o - \ ! RUN: | FileCheck %s diff --git a/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 b/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 index d1c02101318ab..74799359e0476 100644 --- a/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 +++ b/flang/test/Transforms/DoConcurrent/not_perfectly_nested.f90 @@ -1,6 +1,3 @@ -! Fails until we update the pass to use the `fir.do_concurrent` op. -! XFAIL: * - ! Tests that if `do concurrent` is not perfectly nested in its parent loop, that ! we skip converting the not-perfectly nested `do concurrent` loop. @@ -22,23 +19,24 @@ program main end do end -! 
CHECK: %[[ORIG_K_ALLOC:.*]] = fir.alloca i32 {bindc_name = "k"} -! CHECK: %[[ORIG_K_DECL:.*]]:2 = hlfir.declare %[[ORIG_K_ALLOC]] - -! CHECK: %[[ORIG_J_ALLOC:.*]] = fir.alloca i32 {bindc_name = "j"} -! CHECK: %[[ORIG_J_DECL:.*]]:2 = hlfir.declare %[[ORIG_J_ALLOC]] - ! CHECK: omp.parallel { ! CHECK: omp.wsloop { ! CHECK: omp.loop_nest ({{[^[:space:]]+}}) {{.*}} { -! CHECK: fir.do_loop %[[J_IV:.*]] = {{.*}} { -! CHECK: %[[J_IV_CONV:.*]] = fir.convert %[[J_IV]] : (index) -> i32 +! CHECK: fir.do_concurrent { + +! CHECK: %[[ORIG_J_ALLOC:.*]] = fir.alloca i32 {bindc_name = "j"} +! CHECK: %[[ORIG_J_DECL:.*]]:2 = hlfir.declare %[[ORIG_J_ALLOC]] + +! CHECK: %[[ORIG_K_ALLOC:.*]] = fir.alloca i32 {bindc_name = "k"} +! CHECK: %[[ORIG_K_DECL:.*]]:2 = hlfir.declare %[[ORIG_K_ALLOC]] + +! CHECK: fir.do_concurrent.loop (%[[J_IV:.*]], %[[K_IV:.*]]) = {{.*}} { +! CHECK: %[[J_IV_CONV:.*]] = fir.convert %[[J_IV]] : (index) -> i32 ! CHECK: fir.store %[[J_IV_CONV]] to %[[ORIG_J_DECL]]#0 -! CHECK: fir.do_loop %[[K_IV:.*]] = {{.*}} { ! CHECK: %[[K_IV_CONV:.*]] = fir.convert %[[K_IV]] : (index) -> i32 -! CHECK: fir.store %[[K_IV_CONV]] to %[[ORIG_K_DECL]]#0 +! CHECK: fir.store %[[K_IV_CONV]] to %[[ORIG_K_DECL]]#0 ! CHECK: } ! CHECK: } ! 
CHECK: omp.yield

From flang-commits at lists.llvm.org Thu May 8 11:22:35 2025
From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits)
Date: Thu, 08 May 2025 11:22:35 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][OpenMP] Update `do concurrent` mapping pass to use `fir.do_concurrent` op (PR #138489)
In-Reply-To:
Message-ID: <681cf66b.050a0220.cba47.a96c@mx.google.com>

https://github.com/ergawy closed https://github.com/llvm/llvm-project/pull/138489

From flang-commits at lists.llvm.org Thu May 8 13:08:11 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 08 May 2025 13:08:11 -0700 (PDT)
Subject: [flang-commits] [flang] 02f61ab - [flang] Use box for components with non-default lower bounds (#138994)
Message-ID: <681d0f2b.a70a0220.313daf.ad4f@mx.google.com>

Author: Asher Mancinelli
Date: 2025-05-08T13:08:08-07:00
New Revision: 02f61ab46b1608c26fd72862d4b46cbb7b034889

URL: https://github.com/llvm/llvm-project/commit/02f61ab46b1608c26fd72862d4b46cbb7b034889
DIFF: https://github.com/llvm/llvm-project/commit/02f61ab46b1608c26fd72862d4b46cbb7b034889.diff

LOG: [flang] Use box for components with non-default lower bounds (#138994)

When designating an array component that has non-default lower bounds the bridge was producing hlfir designates yielding reference types, which did not preserve the bounds information. Then, when creating components, unadjusted indices were used when initializing the structure. We could look at the declaration to get the shape parameter, but this would not be preserved if the component were passed as a block argument. These results must be boxed, but we also must not lose the contiguity information either. To address contiguity, annotate these boxes with the `contiguous` attribute during designation. Note that other designated entities are handled inside the HlfirDesignatorBuilder while component designators are built in HlfirBuilder.
I am not sure if this handling should be moved into the designator builder or left in the general builder, so feedback is welcome. Also, I wouldn't mind finding a test that demonstrates a box-designated component with the contiguous attribute really is determined to be contiguous by any passes down the line checking for that. I don't have a test like that yet. Added: Modified: flang/lib/Lower/ConvertExprToHLFIR.cpp flang/test/Lower/HLFIR/designators-component-ref.f90 Removed: ################################################################################ diff --git a/flang/lib/Lower/ConvertExprToHLFIR.cpp b/flang/lib/Lower/ConvertExprToHLFIR.cpp index 395f4518efb1e..808928bc97adf 100644 --- a/flang/lib/Lower/ConvertExprToHLFIR.cpp +++ b/flang/lib/Lower/ConvertExprToHLFIR.cpp @@ -30,6 +30,7 @@ #include "flang/Optimizer/Builder/Runtime/Derived.h" #include "flang/Optimizer/Builder/Runtime/Pointer.h" #include "flang/Optimizer/Builder/Todo.h" +#include "flang/Optimizer/Dialect/FIRAttr.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "mlir/IR/IRMapping.h" #include "llvm/ADT/TypeSwitch.h" @@ -125,6 +126,19 @@ class HlfirDesignatorBuilder { hlfir::ElementalAddrOp convertVectorSubscriptedExprToElementalAddr( const Fortran::lower::SomeExpr &designatorExpr); + std::tuple + genComponentDesignatorTypeAndAttributes( + const Fortran::semantics::Symbol &componentSym, mlir::Type fieldType, + bool isVolatile) { + if (mayHaveNonDefaultLowerBounds(componentSym)) { + mlir::Type boxType = fir::BoxType::get(fieldType, isVolatile); + return std::make_tuple(boxType, + fir::FortranVariableFlagsEnum::contiguous); + } + auto refType = fir::ReferenceType::get(fieldType, isVolatile); + return std::make_tuple(refType, fir::FortranVariableFlagsEnum{}); + } + mlir::Value genComponentShape(const Fortran::semantics::Symbol &componentSym, mlir::Type fieldType) { // For pointers and allocatable components, the @@ -1863,8 +1877,9 @@ class HlfirBuilder { designatorBuilder.genComponentShape(sym, 
compType); const bool isDesignatorVolatile = fir::isa_volatile_type(baseOp.getType()); - mlir::Type designatorType = - builder.getRefType(compType, isDesignatorVolatile); + auto [designatorType, extraAttributeFlags] = + designatorBuilder.genComponentDesignatorTypeAndAttributes( + sym, compType, isDesignatorVolatile); mlir::Type fieldElemType = hlfir::getFortranElementType(compType); llvm::SmallVector typeParams; @@ -1884,7 +1899,8 @@ class HlfirBuilder { // Convert component symbol attributes to variable attributes. fir::FortranVariableFlagsAttr attrs = - Fortran::lower::translateSymbolAttributes(builder.getContext(), sym); + Fortran::lower::translateSymbolAttributes(builder.getContext(), sym, + extraAttributeFlags); // Get the component designator. auto lhs = builder.create( diff --git a/flang/test/Lower/HLFIR/designators-component-ref.f90 b/flang/test/Lower/HLFIR/designators-component-ref.f90 index 653e28e0a6018..935176becac75 100644 --- a/flang/test/Lower/HLFIR/designators-component-ref.f90 +++ b/flang/test/Lower/HLFIR/designators-component-ref.f90 @@ -126,6 +126,16 @@ subroutine test_array_comp_non_contiguous_slice(a) ! CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_1]]#0{"array_comp"} <%[[VAL_9]]> (%[[VAL_10]]:%[[VAL_11]]:%[[VAL_12]], %[[VAL_14]]:%[[VAL_15]]:%[[VAL_16]]) shape %[[VAL_18]] : (!fir.ref}>>, !fir.shape<2>, index, index, index, index, index, index, !fir.shape<2>) -> !fir.box> end subroutine +subroutine test_array_lbs_array_ctor() + use comp_ref + type(t_array_lbs) :: a(-1:1) + real :: array_comp(2:11,3:22) + a = (/ (t_array_lbs(i, array_comp), i=-1,1) /) +! CHECK: hlfir.designate %{{.+}}#0{"array_comp_lbs"} <%{{.+}}> shape %{{.+}} {fortran_attrs = #fir.var_attrs} +! CHECK-SAME: (!fir.ref}>>, !fir.shapeshift<2>, !fir.shapeshift<2>) +! 
CHECK-SAME: -> !fir.box>
+end subroutine
+
 subroutine test_array_lbs_comp_lbs_1(a)
   use comp_ref
   type(t_array_lbs) :: a

From flang-commits at lists.llvm.org Thu May 8 13:08:14 2025
From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits)
Date: Thu, 08 May 2025 13:08:14 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Use box for components with non-default lower bounds (PR #138994)
In-Reply-To:
Message-ID: <681d0f2e.050a0220.26e301.b42d@mx.google.com>

https://github.com/ashermancinelli closed https://github.com/llvm/llvm-project/pull/138994

From flang-commits at lists.llvm.org Thu May 8 16:19:47 2025
From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits)
Date: Thu, 08 May 2025 16:19:47 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Postpone hlfir.end_associate generation for calls. (PR #138786)
In-Reply-To:
Message-ID: <681d3c13.170a0220.3c68c7.79fb@mx.google.com>

https://github.com/vzakhari updated https://github.com/llvm/llvm-project/pull/138786

>From f65c8b369ce0e866996095c239293f0716608d11 Mon Sep 17 00:00:00 2001
From: Slava Zakharin
Date: Tue, 6 May 2025 16:38:48 -0700
Subject: [PATCH 1/4] [flang] Postpone hlfir.end_associate generation for calls.

If we generate hlfir.end_associate at the end of the statement, we get more easily optimizable HLFIR, because there are no compiler-generated operations with side effects between the call and the consumers. This allows more hlfir.eval_in_mem to reuse the LHS instead of allocating a temporary buffer.

I do not think the same can always be done for hlfir.copy_out, e.g.:
```
subroutine test2(x)
  interface
    function array_func2(x,y)
      real:: x(*), array_func2(10), y
    end function array_func2
  end interface
  real :: x(:)
  x = array_func2(x, 1.0)
end subroutine test2
```

If we postpone the copy-out until after the assignment, then the result may be wrong.
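
The ordering described above can be sketched with a toy model in C++. This is an illustration only, not flang's implementation: `StatementContext`, `lowerAssignmentFromCall`, and the string trace are hypothetical names made up for this sketch. It assumes the association clean-up is attached to a statement-level context and emitted when the statement ends, while the copy-out stays right after the call.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a statement-level context that collects
// clean-up actions and runs them once the whole statement is lowered.
class StatementContext {
public:
  void attachCleanup(std::function<void()> cleanup) {
    cleanups.push_back(std::move(cleanup));
  }

  // Run the pending clean-ups in the order they were attached.
  void finalize() {
    for (auto &cleanup : cleanups)
      cleanup();
    cleanups.clear();
  }

private:
  std::vector<std::function<void()>> cleanups;
};

// Toy "lowering" of `x = array_func2(x, 1.0)`.
// Returns a trace of the operations in the order they would be emitted.
std::vector<std::string> lowerAssignmentFromCall(bool postponeAssociates) {
  std::vector<std::string> trace;
  StatementContext stmtCtx;

  trace.push_back("copy_in");   // make the first argument contiguous
  trace.push_back("associate"); // materialize the by-reference literal
  trace.push_back("call");

  // In this sketch the copy-out is never postponed: it must run before
  // the assignment so that the assignment result is not overwritten.
  trace.push_back("copy_out");

  if (postponeAssociates)
    stmtCtx.attachCleanup([&trace] { trace.push_back("end_associate"); });
  else
    trace.push_back("end_associate");

  trace.push_back("assign");

  // End of the statement: emit whatever clean-ups were postponed.
  stmtCtx.finalize();
  return trace;
}
```

With `postponeAssociates` set, the trace ends with `assign` followed by `end_associate`, so nothing with side effects sits between the call and the assignment other than the copy-out.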
--- flang/lib/Lower/ConvertCall.cpp | 42 +++++++-- .../Lower/HLFIR/call-postponed-associate.f90 | 85 +++++++++++++++++++ 2 files changed, 121 insertions(+), 6 deletions(-) create mode 100644 flang/test/Lower/HLFIR/call-postponed-associate.f90 diff --git a/flang/lib/Lower/ConvertCall.cpp b/flang/lib/Lower/ConvertCall.cpp index a5b85e25b1af0..d37d51f6ec634 100644 --- a/flang/lib/Lower/ConvertCall.cpp +++ b/flang/lib/Lower/ConvertCall.cpp @@ -960,9 +960,26 @@ struct CallCleanUp { mlir::Value tempVar; mlir::Value mustFree; }; - void genCleanUp(mlir::Location loc, fir::FirOpBuilder &builder) { - Fortran::common::visit([&](auto &c) { c.genCleanUp(loc, builder); }, + + /// Generate clean-up code. + /// If \p postponeAssociates is true, the ExprAssociate clean-up + /// is not generated, and instead the corresponding CallCleanUp + /// object is returned as the result. + std::optional genCleanUp(mlir::Location loc, + fir::FirOpBuilder &builder, + bool postponeAssociates) { + std::optional postponed; + Fortran::common::visit(Fortran::common::visitors{ + [&](CopyIn &c) { c.genCleanUp(loc, builder); }, + [&](ExprAssociate &c) { + if (postponeAssociates) + postponed = CallCleanUp{c}; + else + c.genCleanUp(loc, builder); + }, + }, cleanUp); + return postponed; } std::variant cleanUp; }; @@ -1729,10 +1746,23 @@ genUserCall(Fortran::lower::PreparedActualArguments &loweredActuals, caller, callSiteType, callContext.resultType, callContext.isElementalProcWithArrayArgs()); - /// Clean-up associations and copy-in. - for (auto cleanUp : callCleanUps) - cleanUp.genCleanUp(loc, builder); - + // Clean-up associations and copy-in. + // The association clean-ups are postponed to the end of the statement + // lowering. The copy-in clean-ups may be delayed as well, + // but they are done immediately after the call currently. 
+ llvm::SmallVector associateCleanups; + for (auto cleanUp : callCleanUps) { + auto postponed = + cleanUp.genCleanUp(loc, builder, /*postponeAssociates=*/true); + if (postponed) + associateCleanups.push_back(*postponed); + } + + fir::FirOpBuilder *bldr = &builder; + callContext.stmtCtx.attachCleanup([=]() { + for (auto cleanUp : associateCleanups) + (void)cleanUp.genCleanUp(loc, *bldr, /*postponeAssociates=*/false); + }); if (auto *entity = std::get_if(&loweredResult)) return *entity; diff --git a/flang/test/Lower/HLFIR/call-postponed-associate.f90 b/flang/test/Lower/HLFIR/call-postponed-associate.f90 new file mode 100644 index 0000000000000..18df62b44324b --- /dev/null +++ b/flang/test/Lower/HLFIR/call-postponed-associate.f90 @@ -0,0 +1,85 @@ +! RUN: bbc -emit-hlfir -o - %s -I nowhere | FileCheck %s + +subroutine test1 + interface + function array_func1(x) + real:: x, array_func1(10) + end function array_func1 + end interface + real :: x(10) + x = array_func1(1.0) +end subroutine test1 +! CHECK-LABEL: func.func @_QPtest1() { +! CHECK: %[[VAL_5:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_17:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: fir.call @_QParray_func1 +! CHECK: fir.save_result +! CHECK: } +! CHECK: hlfir.assign %[[VAL_17]] to %{{.*}} : !hlfir.expr<10xf32>, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 + +subroutine test2(x) + interface + function array_func2(x,y) + real:: x(*), array_func2(10), y + end function array_func2 + end interface + real :: x(:) + x = array_func2(x, 1.0) +end subroutine test2 +! CHECK-LABEL: func.func @_QPtest2( +! CHECK: %[[VAL_3:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_4:.*]]:2 = hlfir.copy_in %{{.*}} to %{{.*}} : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +! 
CHECK: %[[VAL_5:.*]] = fir.box_addr %[[VAL_4]]#0 : (!fir.box>) -> !fir.ref> +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_3]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_17:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: ^bb0(%[[VAL_18:.*]]: !fir.ref>): +! CHECK: %[[VAL_19:.*]] = fir.call @_QParray_func2(%[[VAL_5]], %[[VAL_6]]#0) fastmath : (!fir.ref>, !fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_19]] to %[[VAL_18]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: hlfir.copy_out %{{.*}}, %[[VAL_4]]#1 to %{{.*}} : (!fir.ref>>>, i1, !fir.box>) -> () +! CHECK: hlfir.assign %[[VAL_17]] to %{{.*}} : !hlfir.expr<10xf32>, !fir.box> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_17]] : !hlfir.expr<10xf32> + +subroutine test3(x) + interface + function array_func3(x) + real :: x, array_func3(10) + end function array_func3 + end interface + logical :: x + if (any(array_func3(1.0).le.array_func3(2.0))) x = .true. +end subroutine test3 +! CHECK-LABEL: func.func @_QPtest3( +! CHECK: %[[VAL_2:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_3:.*]]:3 = hlfir.associate %[[VAL_2]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_14:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: ^bb0(%[[VAL_15:.*]]: !fir.ref>): +! CHECK: %[[VAL_16:.*]] = fir.call @_QParray_func3(%[[VAL_3]]#0) fastmath : (!fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_16]] to %[[VAL_15]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: %[[VAL_17:.*]] = arith.constant 2.000000e+00 : f32 +! CHECK: %[[VAL_18:.*]]:3 = hlfir.associate %[[VAL_17]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_29:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! 
CHECK: ^bb0(%[[VAL_30:.*]]: !fir.ref>): +! CHECK: %[[VAL_31:.*]] = fir.call @_QParray_func3(%[[VAL_18]]#0) fastmath : (!fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_31]] to %[[VAL_30]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: %[[VAL_32:.*]] = hlfir.elemental %{{.*}} unordered : (!fir.shape<1>) -> !hlfir.expr> { +! CHECK: ^bb0(%[[VAL_33:.*]]: index): +! CHECK: %[[VAL_34:.*]] = hlfir.apply %[[VAL_14]], %[[VAL_33]] : (!hlfir.expr<10xf32>, index) -> f32 +! CHECK: %[[VAL_35:.*]] = hlfir.apply %[[VAL_29]], %[[VAL_33]] : (!hlfir.expr<10xf32>, index) -> f32 +! CHECK: %[[VAL_36:.*]] = arith.cmpf ole, %[[VAL_34]], %[[VAL_35]] fastmath : f32 +! CHECK: %[[VAL_37:.*]] = fir.convert %[[VAL_36]] : (i1) -> !fir.logical<4> +! CHECK: hlfir.yield_element %[[VAL_37]] : !fir.logical<4> +! CHECK: } +! CHECK: %[[VAL_38:.*]] = hlfir.any %[[VAL_32]] : (!hlfir.expr>) -> !fir.logical<4> +! CHECK: hlfir.destroy %[[VAL_32]] : !hlfir.expr> +! CHECK: hlfir.end_associate %[[VAL_18]]#1, %[[VAL_18]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_29]] : !hlfir.expr<10xf32> +! CHECK: hlfir.end_associate %[[VAL_3]]#1, %[[VAL_3]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_14]] : !hlfir.expr<10xf32> +! CHECK: %[[VAL_39:.*]] = fir.convert %[[VAL_38]] : (!fir.logical<4>) -> i1 +! CHECK: fir.if %[[VAL_39]] { >From 8ddc8be77b16303127470993be7333d35fb9fb56 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Tue, 6 May 2025 19:14:39 -0700 Subject: [PATCH 2/4] Added test changes missing from the original patch. --- flang/test/Lower/HLFIR/entry_return.f90 | 8 ++++---- flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/flang/test/Lower/HLFIR/entry_return.f90 b/flang/test/Lower/HLFIR/entry_return.f90 index 5d3e160af2df6..18fb2b571b950 100644 --- a/flang/test/Lower/HLFIR/entry_return.f90 +++ b/flang/test/Lower/HLFIR/entry_return.f90 @@ -51,13 +51,13 @@ logical function f2() ! 
CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_4]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_8:.*]] = fir.call @_QPcomplex(%[[VAL_6]]#0, %[[VAL_7]]#0) fastmath : (!fir.ref, !fir.ref) -> f32 -! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 -! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_9:.*]] = arith.constant 0.000000e+00 : f32 ! CHECK: %[[VAL_10:.*]] = fir.undefined complex ! CHECK: %[[VAL_11:.*]] = fir.insert_value %[[VAL_10]], %[[VAL_8]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[VAL_12:.*]] = fir.insert_value %[[VAL_11]], %[[VAL_9]], [1 : index] : (complex, f32) -> complex ! CHECK: hlfir.assign %[[VAL_12]] to %[[VAL_1]]#0 : complex, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_3]]#0 : !fir.ref> ! CHECK: return %[[VAL_13]] : !fir.logical<4> ! CHECK: } @@ -74,13 +74,13 @@ logical function f2() ! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_4]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_8:.*]] = fir.call @_QPcomplex(%[[VAL_6]]#0, %[[VAL_7]]#0) fastmath : (!fir.ref, !fir.ref) -> f32 -! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 -! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_9:.*]] = arith.constant 0.000000e+00 : f32 ! CHECK: %[[VAL_10:.*]] = fir.undefined complex ! CHECK: %[[VAL_11:.*]] = fir.insert_value %[[VAL_10]], %[[VAL_8]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[VAL_12:.*]] = fir.insert_value %[[VAL_11]], %[[VAL_9]], [1 : index] : (complex, f32) -> complex ! 
CHECK: hlfir.assign %[[VAL_12]] to %[[VAL_1]]#0 : complex, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_1]]#0 : !fir.ref> ! CHECK: return %[[VAL_13]] : complex ! CHECK: } diff --git a/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 b/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 index 28659a33d0893..206b6e4e9b797 100644 --- a/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 +++ b/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 @@ -32,8 +32,8 @@ real function test1(x) ! CHECK: %[[VAL_7:.*]] = fir.load %[[VAL_6]] : !fir.ref) -> f32>> ! CHECK: %[[VAL_8:.*]] = fir.box_addr %[[VAL_7]] : (!fir.boxproc<(!fir.ref) -> f32>) -> ((!fir.ref) -> f32) ! CHECK: %[[VAL_9:.*]] = fir.call %[[VAL_8]](%[[VAL_5]]#0) fastmath : (!fir.ref) -> f32 -! CHECK: hlfir.end_associate %[[VAL_5]]#1, %[[VAL_5]]#2 : !fir.ref, i1 ! CHECK: hlfir.assign %[[VAL_9]] to %[[VAL_2]]#0 : f32, !fir.ref +! CHECK: hlfir.end_associate %[[VAL_5]]#1, %[[VAL_5]]#2 : !fir.ref, i1 subroutine test2(x) use proc_comp_defs, only : t, iface >From d843f4eb45ad474adef13371540b2d22817074f0 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Thu, 8 May 2025 12:18:07 -0700 Subject: [PATCH 3/4] Fixed clean-ups insertion for atomic capture. 
--- flang/lib/Lower/OpenMP/OpenMP.cpp | 4 +++- flang/test/Lower/OpenMP/atomic-capture.f90 | 20 ++++++++++++++++++++ flang/test/Lower/OpenMP/atomic-update.f90 | 21 +++++++++++++++++++++ 3 files changed, 44 insertions(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fcd3de9671098..d1a77a2624628 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3129,7 +3129,9 @@ static void genAtomicCapture(lower::AbstractConverter &converter, } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); + // The clean-ups associated with the statements inside the capture + // construct must be generated after the AtomicCaptureOp. + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b5c8edc8f31c1 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -97,3 +97,23 @@ subroutine pointers_in_atomic_capture() b = a !$omp end atomic end subroutine + +! Check that the clean-ups associated with the function call +! are generated after the omp.atomic.capture operation: +! CHECK-LABEL: func.func @_QPfunc_call_cleanup( +subroutine func_call_cleanup(x, v, vv) + integer :: x, v, vv + +! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_8:.*]] = fir.call @_QPfunc(%[[VAL_7]]#0) fastmath : (!fir.ref) -> f32 +! CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_8]] : (f32) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %{{.*}} = %[[VAL_3:.*]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[VAL_3]]#0 = %[[VAL_9]] : !fir.ref, i32 +! CHECK: } +! 
CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 + !$omp atomic capture + v = x + x = func(vv + 1) + !$omp end atomic +end subroutine func_call_cleanup diff --git a/flang/test/Lower/OpenMP/atomic-update.f90 b/flang/test/Lower/OpenMP/atomic-update.f90 index 31bf447006930..e0269ea1f8af1 100644 --- a/flang/test/Lower/OpenMP/atomic-update.f90 +++ b/flang/test/Lower/OpenMP/atomic-update.f90 @@ -201,3 +201,24 @@ program OmpAtomicUpdate !$omp atomic update x = x + sum([ (y+2, y=1, z) ]) end program OmpAtomicUpdate + +! Check that the clean-ups associated with the function call +! are generated after the omp.atomic.update operation: +! CHECK-LABEL: func.func @_QPfunc_call_cleanup( +subroutine func_call_cleanup(v, vv) + integer v, vv + +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_7:.*]] = fir.call @_QPfunc(%[[VAL_6]]#0) fastmath : (!fir.ref) -> f32 +! CHECK: omp.atomic.update %{{.*}} : !fir.ref { +! CHECK: ^bb0(%[[VAL_8:.*]]: i32): +! CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_8]] : (i32) -> f32 +! CHECK: %[[VAL_10:.*]] = arith.addf %[[VAL_9]], %[[VAL_7]] fastmath : f32 +! CHECK: %[[VAL_11:.*]] = fir.convert %[[VAL_10]] : (f32) -> i32 +! CHECK: omp.yield(%[[VAL_11]] : i32) +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 + !$omp atomic update + v = v + func(vv + 1) + !$omp end atomic +end subroutine func_call_cleanup >From 22c381e753d137e02ef81996c2ec65ca91f86410 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Thu, 8 May 2025 16:13:01 -0700 Subject: [PATCH 4/4] Fixed atomic capture cases with atomic update inside. 
--- flang/lib/Lower/OpenMP/OpenMP.cpp | 20 +++++++--- flang/test/Lower/OpenMP/atomic-capture.f90 | 44 ++++++++++++++++++++-- 2 files changed, 55 insertions(+), 9 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index d1a77a2624628..5b0b54b9e0377 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2729,7 +2729,8 @@ static void genAtomicUpdateStatement( const parser::Expr &assignmentStmtExpr, const parser::OmpAtomicClauseList *leftHandClauseList, const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { + mlir::Operation *atomicCaptureOp = nullptr, + lower::StatementContext *atomicCaptureStmtCtx = nullptr) { // Generate `atomic.update` operation for atomic assignment statements fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); mlir::Location currentLocation = converter.getCurrentLocation(); @@ -2803,15 +2804,24 @@ static void genAtomicUpdateStatement( }, assignmentStmtExpr.u); lower::StatementContext nonAtomicStmtCtx; + lower::StatementContext *stmtCtxPtr = &nonAtomicStmtCtx; if (!nonAtomicSubExprs.empty()) { // Generate non atomic part before all the atomic operations. auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) + if (atomicCaptureOp) { + assert(atomicCaptureStmtCtx && "must specify statement context"); firOpBuilder.setInsertionPoint(atomicCaptureOp); + // Any clean-ups associated with the expression lowering + // must also be generated outside of the atomic update operation + // and after the atomic capture operation. + // The atomicCaptureStmtCtx will be finalized at the end + // of the atomic capture operation generation. 
+ stmtCtxPtr = atomicCaptureStmtCtx; + } mlir::Value nonAtomicVal; for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + currentLocation, *nonAtomicSubExpr, *stmtCtxPtr)); exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); } if (atomicCaptureOp) @@ -3097,7 +3107,7 @@ static void genAtomicCapture(lower::AbstractConverter &converter, genAtomicUpdateStatement( converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp, &stmtCtx); } else { // Atomic capture construct is of the form [capture-stmt, write-stmt] firOpBuilder.setInsertionPoint(atomicCaptureOp); @@ -3121,7 +3131,7 @@ static void genAtomicCapture(lower::AbstractConverter &converter, genAtomicUpdateStatement( converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp, &stmtCtx); genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, elementType, diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index b5c8edc8f31c1..2f800d534dc36 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -102,18 +102,54 @@ subroutine pointers_in_atomic_capture() ! are generated after the omp.atomic.capture operation: ! CHECK-LABEL: func.func @_QPfunc_call_cleanup( subroutine func_call_cleanup(x, v, vv) + interface + integer function func(x) + integer :: x + end function func + end interface integer :: x, v, vv ! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -! 
CHECK: %[[VAL_8:.*]] = fir.call @_QPfunc(%[[VAL_7]]#0) fastmath : (!fir.ref) -> f32 -! CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_8]] : (f32) -> i32 +! CHECK: %[[VAL_8:.*]] = fir.call @_QPfunc(%[[VAL_7]]#0) fastmath : (!fir.ref) -> i32 ! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.read %{{.*}} = %[[VAL_3:.*]]#0 : !fir.ref, !fir.ref, i32 -! CHECK: omp.atomic.write %[[VAL_3]]#0 = %[[VAL_9]] : !fir.ref, i32 +! CHECK: omp.atomic.read %[[VAL_1:.*]]#0 = %[[VAL_3:.*]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[VAL_3]]#0 = %[[VAL_8]] : !fir.ref, i32 ! CHECK: } ! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 !$omp atomic capture v = x x = func(vv + 1) !$omp end atomic + +! CHECK: %[[VAL_12:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_13:.*]] = fir.call @_QPfunc(%[[VAL_12]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[VAL_1]]#0 = %[[VAL_3]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.update %[[VAL_3]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_14:.*]]: i32): +! CHECK: %[[VAL_15:.*]] = arith.addi %[[VAL_13]], %[[VAL_14]] : i32 +! CHECK: omp.yield(%[[VAL_15]] : i32) +! CHECK: } +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_12]]#1, %[[VAL_12]]#2 : !fir.ref, i1 + !$omp atomic capture + v = x + x = func(vv + 1) + x + !$omp end atomic + +! CHECK: %[[VAL_19:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_20:.*]] = fir.call @_QPfunc(%[[VAL_19]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[VAL_3]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_21:.*]]: i32): +! CHECK: %[[VAL_22:.*]] = arith.addi %[[VAL_20]], %[[VAL_21]] : i32 +! CHECK: omp.yield(%[[VAL_22]] : i32) +! CHECK: } +! CHECK: omp.atomic.read %[[VAL_1]]#0 = %[[VAL_3]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: } +! 
CHECK: hlfir.end_associate %[[VAL_19]]#1, %[[VAL_19]]#2 : !fir.ref, i1
+  !$omp atomic capture
+  x = func(vv + 1) + x
+  v = x
+  !$omp end atomic
 end subroutine func_call_cleanup

From flang-commits at lists.llvm.org Thu May 8 17:47:45 2025
From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits)
Date: Thu, 08 May 2025 17:47:45 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183)
Message-ID:

https://github.com/ashermancinelli created https://github.com/llvm/llvm-project/pull/139183

Ensure volatility is reflected not just on the reference to an allocatable, but on the box, too. When we declare a volatile allocatable, we now get a volatile reference to a volatile box.

Some related cleanups:

* SELECT TYPE constructs check the selector's type for volatility when creating and designating the type used in the selecting block.
* Refine the verifier for fir.convert. In general, I think it is ok to implicitly drop volatility in any ptr-to-int conversion because it means we are in codegen (and representing volatility on the LLVM ops and intrinsics) or we are calling an external function (are there any cases I'm not thinking of?)
* An allocatable test that was XFAILed is now passing. Making allocatables' boxes volatile resulted in accesses of those boxes being volatile, which resolved some errors coming from the strict verifier.
* I noticed a runtime function was missing the fir.runtime attribute.

>From 637a768af6a2a9c53785ab4655e8384061480c1b Mon Sep 17 00:00:00 2001
From: Asher Mancinelli
Date: Thu, 8 May 2025 17:30:04 -0700
Subject: [PATCH 1/2] [flang] Fix volatile attr propagation on allocatables

Ensure volatility is reflected not just on the reference of an allocatable, but on the box, too. When we designate a volatile allocatable, we now get a volatile reference to a volatile box.
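
The refined fir.convert verifier rule mentioned in the PR description can be sketched roughly as follows. This is a simplified model, not the code from the patch: `Kind`, `ToyType`, and `conversionPreservesVolatility` are hypothetical names, and only the two ptr-to-int-like targets named in the description (integers and LLVM pointers) are modelled here.

```cpp
// Hypothetical, heavily simplified type description for this sketch only.
enum class Kind { Reference, Box, Integer, LLVMPointer };

struct ToyType {
  Kind kind;
  bool isVolatile;
};

// Returns true when converting `in` to `out` is acceptable under the rule:
// ptr-to-int-like conversions may silently drop volatility, while every
// other conversion must keep the volatility of the input unchanged.
bool conversionPreservesVolatility(const ToyType &in, const ToyType &out) {
  const bool isPtrToIntLike =
      out.kind == Kind::LLVMPointer || out.kind == Kind::Integer;
  if (isPtrToIntLike)
    return true;
  return in.isVolatile == out.isVolatile;
}
```

Under this model a volatile reference may be converted to an integer or an LLVM pointer, but converting it to a non-volatile reference is rejected.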
Some related cleanups:

* SELECT TYPE constructs properly handle volatility by checking the selector's type for volatility when creating the target type.
* Refine the verifier for fir.convert. In general, I think it should be ok to implicitly drop volatility in any ptr-to-int conversion because it means we are in codegen or we are calling an external function, and it's okay to drop volatility from the Fir type system in these cases.
* An allocatable test that was XFAILed is now passing.
* I noticed a runtime function was missing the fir.runtime attribute. Fix that.
---
 flang/lib/Lower/Bridge.cpp                    | 12 +++--
 flang/lib/Optimizer/Dialect/FIROps.cpp        | 47 +++++++++++++++----
 flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp     | 15 ++++--
 .../Transforms/PolymorphicOpConversion.cpp    | 11 +++--
 flang/test/Fir/invalid.fir                    |  4 +-
 flang/test/Lower/volatile-allocatable.f90     | 21 +++++----
 flang/test/Lower/volatile-allocatable1.f90    | 17 ++++++-
 7 files changed, 91 insertions(+), 36 deletions(-)

diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 0a61f61ab8f75..c981abaffad18 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -3845,6 +3845,10 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     bool hasLocalScope = false;
     llvm::SmallVector typeCaseScopes;
+    const auto selectorIsVolatile = [&selector]() {
+      return fir::isa_volatile_type(fir::getBase(selector).getType());
+    };
+
     const auto &typeCaseList = std::get>(
         selectTypeConstruct.t);
@@ -3998,7 +4002,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
           addrTy = fir::HeapType::get(addrTy);
         if (std::holds_alternative(
                 typeSpec->u)) {
-          mlir::Type refTy = fir::ReferenceType::get(addrTy);
+          mlir::Type refTy = fir::ReferenceType::get(addrTy, selectorIsVolatile());
           if (isPointer || isAllocatable)
             refTy = addrTy;
           exactValue = builder->create(
@@ -4007,7 +4011,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
               typeSpec->declTypeSpec->AsIntrinsic();
           if 
(isArray) { mlir::Value exact = builder->create( - loc, fir::BoxType::get(addrTy), fir::getBase(selector)); + loc, fir::BoxType::get(addrTy, selectorIsVolatile()), fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exact)); } else if (intrinsic->category() == Fortran::common::TypeCategory::Character) { @@ -4022,7 +4026,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { } else if (std::holds_alternative( typeSpec->u)) { exactValue = builder->create( - loc, fir::BoxType::get(addrTy), fir::getBase(selector)); + loc, fir::BoxType::get(addrTy, selectorIsVolatile()), fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exactValue)); } } else if (std::holds_alternative( @@ -4040,7 +4044,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { addrTy = fir::PointerType::get(addrTy); if (isAllocatable) addrTy = fir::HeapType::get(addrTy); - mlir::Type classTy = fir::ClassType::get(addrTy); + mlir::Type classTy = fir::ClassType::get(addrTy, selectorIsVolatile()); if (classTy == baseTy) { addAssocEntitySymbol(selector); } else { diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index 332cca1ab9f95..9b58578e55474 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -1536,20 +1536,47 @@ bool fir::ConvertOp::canBeConverted(mlir::Type inType, mlir::Type outType) { areRecordsCompatible(inType, outType); } +// In general, ptrtoint-like conversions are allowed to lose volatility information +// because they are either: +// +// 1. passing an entity to an external function and there's nothing we can do +// about volatility after that happens, or +// 2. for code generation, at which point we represent volatility with attributes +// on the LLVM instructions and intrinsics. +// +// For all other cases, volatility ought to match exactly. 
+static mlir::LogicalResult verifyVolatility(mlir::Type inType, mlir::Type outType) { + const bool toLLVMPointer = mlir::isa(outType); + const bool toInteger = fir::isa_integer(outType); + + // When converting references to classes or allocatables into boxes for runtime arguments, + // we cast away all the volatility information and pass a box. This is allowed. + const bool isBoxNoneLike = [&]() { + if (fir::isBoxNone(outType)) + return true; + if (auto referenceType = mlir::dyn_cast(outType)) { + if (fir::isBoxNone(referenceType.getElementType())) { + return true; + } + } + return false; + }(); + + const bool isPtrToIntLike = toLLVMPointer || toInteger || isBoxNoneLike; + if (isPtrToIntLike) { + return mlir::success(); + } + + // In all other cases, we need to check for an exact volatility match. + return mlir::success(fir::isa_volatile_type(inType) == fir::isa_volatile_type(outType)); +} + llvm::LogicalResult fir::ConvertOp::verify() { mlir::Type inType = getValue().getType(); mlir::Type outType = getType(); - // If we're converting to an LLVM pointer type or an integer, we don't - // need to check for volatility mismatch - volatility will be handled by the - // memory operations themselves in llvm code generation and ptr-to-int can't - // represent volatility. 
- const bool toLLVMPointer = mlir::isa(outType); - const bool toInteger = fir::isa_integer(outType); if (fir::useStrictVolatileVerification()) { - if (fir::isa_volatile_type(inType) != fir::isa_volatile_type(outType) && - !toLLVMPointer && !toInteger) { - return emitOpError("cannot convert between volatile and non-volatile " - "types, use fir.volatile_cast instead ") + if (failed(verifyVolatility(inType, outType))) { + return emitOpError("this conversion does not preserve volatility: ") << inType << " / " << outType; } } diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp index 711d5d1461b08..52517eef2890d 100644 --- a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp +++ b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp @@ -207,20 +207,25 @@ static bool hasExplicitLowerBounds(mlir::Value shape) { mlir::isa(shape.getType()); } -static std::pair updateDeclareInputTypeWithVolatility( +static std::pair updateDeclaredInputTypeWithVolatility( mlir::Type inputType, mlir::Value memref, mlir::OpBuilder &builder, fir::FortranVariableFlagsAttr fortran_attrs) { if (fortran_attrs && bitEnumContainsAny(fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::fortran_volatile)) { + // A volatile pointer's pointee is volatile. const bool isPointer = bitEnumContainsAny( fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::pointer); + // An allocatable's inner type's volatility matches that of the reference. + const bool isAllocatable = bitEnumContainsAny( + fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::allocatable); auto updateType = [&](auto t) { using FIRT = decltype(t); - // A volatile pointer's pointee is volatile. 
auto elementType = t.getEleTy(); - const bool elementTypeIsVolatile = - isPointer || fir::isa_volatile_type(elementType); + const bool elementTypeIsBox = mlir::isa(elementType); + const bool elementTypeIsVolatile = isPointer || isAllocatable || + elementTypeIsBox || + fir::isa_volatile_type(elementType); auto newEleTy = fir::updateTypeWithVolatility(elementType, elementTypeIsVolatile); inputType = FIRT::get(newEleTy, true); @@ -243,7 +248,7 @@ void hlfir::DeclareOp::build(mlir::OpBuilder &builder, auto nameAttr = builder.getStringAttr(uniq_name); mlir::Type inputType = memref.getType(); bool hasExplicitLbs = hasExplicitLowerBounds(shape); - std::tie(inputType, memref) = updateDeclareInputTypeWithVolatility( + std::tie(inputType, memref) = updateDeclaredInputTypeWithVolatility( inputType, memref, builder, fortran_attrs); mlir::Type hlfirVariableType = getHLFIRVariableType(inputType, hasExplicitLbs); diff --git a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp index 0c78a878cdc53..309e557e409c0 100644 --- a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp @@ -401,10 +401,13 @@ llvm::LogicalResult SelectTypeConv::genTypeLadderStep( { // Since conversion is done in parallel for each fir.select_type // operation, the runtime function insertion must be threadsafe. 
- callee = - fir::createFuncOp(rewriter.getUnknownLoc(), mod, fctName, - rewriter.getFunctionType({descNoneTy, typeDescTy}, - rewriter.getI1Type())); + auto runtimeAttr = + mlir::NamedAttribute(fir::FIROpsDialect::getFirRuntimeAttrName(), + mlir::UnitAttr::get(rewriter.getContext())); + callee = fir::createFuncOp(rewriter.getUnknownLoc(), mod, fctName, + rewriter.getFunctionType({descNoneTy, typeDescTy}, + rewriter.getI1Type()), + {runtimeAttr}); } cmp = rewriter .create(loc, callee, diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index 1de48b87365b3..834eea7df8ebe 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -1260,7 +1260,7 @@ func.func @dc_invalid_reduction(%arg0: index, %arg1: index) { // Should fail when volatility changes from a fir.convert func.func @bad_convert_volatile(%arg0: !fir.ref) -> !fir.ref { - // expected-error at +1 {{'fir.convert' op cannot convert between volatile and non-volatile types, use fir.volatile_cast instead}} + // expected-error at +1 {{op this conversion does not preserve volatilit}} %0 = fir.convert %arg0 : (!fir.ref) -> !fir.ref return %0 : !fir.ref } @@ -1269,7 +1269,7 @@ func.func @bad_convert_volatile(%arg0: !fir.ref) -> !fir.ref // Should fail when volatility changes from a fir.convert func.func @bad_convert_volatile2(%arg0: !fir.ref) -> !fir.ref { - // expected-error at +1 {{'fir.convert' op cannot convert between volatile and non-volatile types, use fir.volatile_cast instead}} + // expected-error at +1 {{op this conversion does not preserve volatility}} %0 = fir.convert %arg0 : (!fir.ref) -> !fir.ref return %0 : !fir.ref } diff --git a/flang/test/Lower/volatile-allocatable.f90 b/flang/test/Lower/volatile-allocatable.f90 index 5f75a5425422a..e182fe8a4d9c9 100644 --- a/flang/test/Lower/volatile-allocatable.f90 +++ b/flang/test/Lower/volatile-allocatable.f90 @@ -119,10 +119,10 @@ subroutine test_unlimited_polymorphic() end subroutine ! 
CHECK-LABEL: func.func @_QPtest_scalar_volatile() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEc1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv2"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv3"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEc1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv2"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv3"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () @@ -140,8 +140,8 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_volatile_asynchronous() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEi1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEv1"} : (!fir.ref>>>, volatile>) -> (!fir.ref>>>, volatile>, !fir.ref>>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEi1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEv1"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 @@ -151,10 +151,11 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_select_base_type_volatile() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.ref>>>, volatile>) -> (!fir.ref>>>, volatile>, !fir.ref>>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAClassIs(%{{.+}}, %{{.+}}) : (!fir.box, !fir.ref) -> i1 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.class>>, volatile>, !fir.shift<1>) -> (!fir.class>>, volatile>, !fir.class>>, volatile>) ! 
CHECK: %{{.+}} = hlfir.designate %{{.+}}#0 (%{{.+}}) : (!fir.class>>, volatile>, index) -> !fir.class, volatile> ! CHECK: %{{.+}} = hlfir.designate %{{.+}}{"i"} : (!fir.class, volatile>) -> !fir.ref @@ -162,7 +163,7 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_mold_allocation() { ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {uniq_name = "_QFtest_mold_allocationEtemplate"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_mold_allocationEv"} : (!fir.ref>>>, volatile>) -> (!fir.ref>>>, volatile>, !fir.ref>>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_mold_allocationEv"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QQclX6D6F6C642074657374"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) ! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0{"str"} typeparams %{{.+}} : (!fir.ref>, index) -> !fir.ref> ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QQro.2xi4.2"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) @@ -173,8 +174,8 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_unlimited_polymorphic() { -! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.ref>, volatile>) -> (!fir.ref>, volatile>, !fir.ref>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEupa"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.ref, volatile>, volatile>) -> (!fir.ref, volatile>, volatile>, !fir.ref, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEupa"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitIntrinsicForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i32, i32, i32) -> () ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.heap) -> (!fir.heap, !fir.heap) diff --git a/flang/test/Lower/volatile-allocatable1.f90 b/flang/test/Lower/volatile-allocatable1.f90 index a21359c3b4225..d2a07c8763885 100644 --- a/flang/test/Lower/volatile-allocatable1.f90 +++ b/flang/test/Lower/volatile-allocatable1.f90 @@ -1,7 +1,6 @@ ! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s ! Requires correct propagation of volatility for allocatable nested types. -! XFAIL: * function allocatable_udt() type :: base_type @@ -15,3 +14,19 @@ function allocatable_udt() allocate(v2(2,3)) allocatable_udt = v2(1,1)%i end function +! CHECK-LABEL: func.func @_QPallocatable_udt() -> i32 { +! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.i"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.di.base_type.i"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.base_type"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.j"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.di.ext_type.j"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.ext_type"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {uniq_name = "_QFallocatable_udtEallocatable_udt"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtEv2"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.c.base_type"} : (!fir.ref>>, !fir.shapeshift<1>) -> (!fir.box>>, !fir.ref>>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.dt.base_type"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.dt.ext_type"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.c.ext_type"} : (!fir.ref>>, !fir.shapeshift<1>) -> (!fir.box>>, !fir.ref>>) +! CHECK: %{{.+}} = hlfir.designate %{{.+}} (%{{.+}}, %{{.+}}) : (!fir.box>>, volatile>, index, index) -> !fir.ref, volatile> +! CHECK: %{{.+}} = hlfir.designate %{{.+}}{"base_type"} : (!fir.ref, volatile>) -> !fir.ref, volatile> +! CHECK: %{{.+}} = hlfir.designate %{{.+}}{"i"} : (!fir.ref, volatile>) -> !fir.ref >From 209110578a57333be1511f0ffc60ea889023d107 Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Thu, 8 May 2025 17:44:10 -0700 Subject: [PATCH 2/2] format --- flang/lib/Lower/Bridge.cpp | 12 +++++++---- flang/lib/Optimizer/Dialect/FIROps.cpp | 20 +++++++++++-------- .../Transforms/PolymorphicOpConversion.cpp | 9 +++++---- 3 files changed, 25 insertions(+), 16 deletions(-) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index c981abaffad18..3706dcf37a204 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -4002,7 +4002,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { addrTy = fir::HeapType::get(addrTy); if (std::holds_alternative( typeSpec->u)) { - mlir::Type refTy = fir::ReferenceType::get(addrTy, selectorIsVolatile()); + mlir::Type refTy = + fir::ReferenceType::get(addrTy, selectorIsVolatile()); if (isPointer || isAllocatable) refTy = addrTy; exactValue = builder->create( @@ -4011,7 +4012,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { typeSpec->declTypeSpec->AsIntrinsic(); if (isArray) { mlir::Value exact = builder->create( - loc, fir::BoxType::get(addrTy, selectorIsVolatile()), fir::getBase(selector)); + loc, fir::BoxType::get(addrTy, selectorIsVolatile()), + fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exact)); } else if (intrinsic->category() == Fortran::common::TypeCategory::Character) { @@ -4026,7 +4028,8 @@ class FirConverter : public 
Fortran::lower::AbstractConverter { } else if (std::holds_alternative( typeSpec->u)) { exactValue = builder->create( - loc, fir::BoxType::get(addrTy, selectorIsVolatile()), fir::getBase(selector)); + loc, fir::BoxType::get(addrTy, selectorIsVolatile()), + fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exactValue)); } } else if (std::holds_alternative( @@ -4044,7 +4047,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { addrTy = fir::PointerType::get(addrTy); if (isAllocatable) addrTy = fir::HeapType::get(addrTy); - mlir::Type classTy = fir::ClassType::get(addrTy, selectorIsVolatile()); + mlir::Type classTy = + fir::ClassType::get(addrTy, selectorIsVolatile()); if (classTy == baseTy) { addAssocEntitySymbol(selector); } else { diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index 9b58578e55474..75185e719393f 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -1536,21 +1536,24 @@ bool fir::ConvertOp::canBeConverted(mlir::Type inType, mlir::Type outType) { areRecordsCompatible(inType, outType); } -// In general, ptrtoint-like conversions are allowed to lose volatility information -// because they are either: +// In general, ptrtoint-like conversions are allowed to lose volatility +// information because they are either: // // 1. passing an entity to an external function and there's nothing we can do // about volatility after that happens, or -// 2. for code generation, at which point we represent volatility with attributes +// 2. for code generation, at which point we represent volatility with +// attributes // on the LLVM instructions and intrinsics. // // For all other cases, volatility ought to match exactly. 
-static mlir::LogicalResult verifyVolatility(mlir::Type inType, mlir::Type outType) { +static mlir::LogicalResult verifyVolatility(mlir::Type inType, + mlir::Type outType) { const bool toLLVMPointer = mlir::isa(outType); const bool toInteger = fir::isa_integer(outType); - // When converting references to classes or allocatables into boxes for runtime arguments, - // we cast away all the volatility information and pass a box. This is allowed. + // When converting references to classes or allocatables into boxes for + // runtime arguments, we cast away all the volatility information and pass a + // box. This is allowed. const bool isBoxNoneLike = [&]() { if (fir::isBoxNone(outType)) return true; @@ -1561,14 +1564,15 @@ static mlir::LogicalResult verifyVolatility(mlir::Type inType, mlir::Type outTyp } return false; }(); - + const bool isPtrToIntLike = toLLVMPointer || toInteger || isBoxNoneLike; if (isPtrToIntLike) { return mlir::success(); } // In all other cases, we need to check for an exact volatility match. 
- return mlir::success(fir::isa_volatile_type(inType) == fir::isa_volatile_type(outType)); + return mlir::success(fir::isa_volatile_type(inType) == + fir::isa_volatile_type(outType)); } llvm::LogicalResult fir::ConvertOp::verify() { diff --git a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp index 309e557e409c0..f9a4c4d0283c7 100644 --- a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp @@ -404,10 +404,11 @@ llvm::LogicalResult SelectTypeConv::genTypeLadderStep( auto runtimeAttr = mlir::NamedAttribute(fir::FIROpsDialect::getFirRuntimeAttrName(), mlir::UnitAttr::get(rewriter.getContext())); - callee = fir::createFuncOp(rewriter.getUnknownLoc(), mod, fctName, - rewriter.getFunctionType({descNoneTy, typeDescTy}, - rewriter.getI1Type()), - {runtimeAttr}); + callee = + fir::createFuncOp(rewriter.getUnknownLoc(), mod, fctName, + rewriter.getFunctionType({descNoneTy, typeDescTy}, + rewriter.getI1Type()), + {runtimeAttr}); } cmp = rewriter .create(loc, callee,

From flang-commits at lists.llvm.org Thu May 8 17:48:16 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 17:48:16 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <681d50d0.050a0220.460ce.c2f9@mx.google.com>

llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir

Author: Asher Mancinelli (ashermancinelli)
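
As a rough standalone model of the fir.convert volatility rule that this pull request's `verifyVolatility` helper encodes: a conversion whose result is an LLVM pointer, an integer, or a box-of-none-like type may drop volatility, and every other conversion must keep volatility unchanged. The `TypeModel` struct, the `volatilityIsPreserved` function, and the `k...` sample values are hypothetical names invented for this sketch; they are not part of flang or MLIR, and the real check inspects `mlir::Type` values rather than boolean flags.

```cpp
// Hypothetical stand-in for an MLIR type. It records only the properties
// that the volatility rule looks at; it is not a flang or MLIR type.
struct TypeModel {
  bool isVolatile;     // models fir::isa_volatile_type(type)
  bool isLLVMPointer;  // models "the type is an LLVM pointer type"
  bool isInteger;      // models fir::isa_integer(type)
  bool isBoxNoneLike;  // models a box of none, or a reference to one
};

// Returns true when converting `in` to `out` is acceptable with respect to
// volatility: ptr-to-int-like results may drop it, all others must match.
constexpr bool volatilityIsPreserved(const TypeModel &in,
                                     const TypeModel &out) {
  return out.isLLVMPointer || out.isInteger || out.isBoxNoneLike ||
         (in.isVolatile == out.isVolatile);
}

// Sample values (assumptions chosen for illustration only).
constexpr TypeModel kVolatileRef{true, false, false, false};
constexpr TypeModel kPlainRef{false, false, false, false};
constexpr TypeModel kInteger{false, false, true, false};
constexpr TypeModel kBoxNone{false, false, false, true};

static_assert(!volatilityIsPreserved(kVolatileRef, kPlainRef),
              "dropping volatility between plain references is rejected");
static_assert(volatilityIsPreserved(kVolatileRef, kInteger),
              "a ptr-to-int conversion may drop volatility");
static_assert(volatilityIsPreserved(kVolatileRef, kBoxNone),
              "a box-of-none runtime argument may drop volatility");
```

In this model the exemptions depend only on the result type, which mirrors how the helper in the patch classifies a conversion as ptr-to-int-like before it compares volatility.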
Changes

Ensure volatility is reflected not just on the reference to an allocatable, but on the box, too. When we declare a volatile allocatable, we now get a volatile reference to a volatile box.

Some related cleanups:

* SELECT TYPE constructs check the selector's type for volatility when creating and designating the type used in the selecting block.
* Refine the verifier for fir.convert. In general, I think it is ok to implicitly drop volatility in any ptr-to-int conversion because it means we are in codegen (and representing volatility on the LLVM ops and intrinsics) or we are calling an external function (are there any cases I'm not thinking of?)
* An allocatable test that was XFAILed is now passing. Making allocatables' boxes volatile resulted in accesses of those boxes being volatile, which resolved some errors coming from the strict verifier.
* I noticed a runtime function was missing the fir.runtime attribute.

---

Patch is 28.10 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139183.diff

7 Files Affected:

- (modified) flang/lib/Lower/Bridge.cpp (+12-4)
- (modified) flang/lib/Optimizer/Dialect/FIROps.cpp (+41-10)
- (modified) flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp (+10-5)
- (modified) flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp (+5-1)
- (modified) flang/test/Fir/invalid.fir (+2-2)
- (modified) flang/test/Lower/volatile-allocatable.f90 (+11-10)
- (modified) flang/test/Lower/volatile-allocatable1.f90 (+16-1)

``````````diff
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 0a61f61ab8f75..3706dcf37a204 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -3845,6 +3845,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { bool hasLocalScope = false; llvm::SmallVector typeCaseScopes; + const auto selectorIsVolatile = [&selector]() { + return fir::isa_volatile_type(fir::getBase(selector).getType()); + }; + const auto &typeCaseList = std::get>(
selectTypeConstruct.t); @@ -3998,7 +4002,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { addrTy = fir::HeapType::get(addrTy); if (std::holds_alternative( typeSpec->u)) { - mlir::Type refTy = fir::ReferenceType::get(addrTy); + mlir::Type refTy = + fir::ReferenceType::get(addrTy, selectorIsVolatile()); if (isPointer || isAllocatable) refTy = addrTy; exactValue = builder->create( @@ -4007,7 +4012,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { typeSpec->declTypeSpec->AsIntrinsic(); if (isArray) { mlir::Value exact = builder->create( - loc, fir::BoxType::get(addrTy), fir::getBase(selector)); + loc, fir::BoxType::get(addrTy, selectorIsVolatile()), + fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exact)); } else if (intrinsic->category() == Fortran::common::TypeCategory::Character) { @@ -4022,7 +4028,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { } else if (std::holds_alternative( typeSpec->u)) { exactValue = builder->create( - loc, fir::BoxType::get(addrTy), fir::getBase(selector)); + loc, fir::BoxType::get(addrTy, selectorIsVolatile()), + fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exactValue)); } } else if (std::holds_alternative( @@ -4040,7 +4047,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { addrTy = fir::PointerType::get(addrTy); if (isAllocatable) addrTy = fir::HeapType::get(addrTy); - mlir::Type classTy = fir::ClassType::get(addrTy); + mlir::Type classTy = + fir::ClassType::get(addrTy, selectorIsVolatile()); if (classTy == baseTy) { addAssocEntitySymbol(selector); } else { diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index 332cca1ab9f95..75185e719393f 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -1536,20 +1536,51 @@ bool fir::ConvertOp::canBeConverted(mlir::Type inType, mlir::Type outType) { areRecordsCompatible(inType, outType); } +// In 
general, ptrtoint-like conversions are allowed to lose volatility +// information because they are either: +// +// 1. passing an entity to an external function and there's nothing we can do +// about volatility after that happens, or +// 2. for code generation, at which point we represent volatility with +// attributes +// on the LLVM instructions and intrinsics. +// +// For all other cases, volatility ought to match exactly. +static mlir::LogicalResult verifyVolatility(mlir::Type inType, + mlir::Type outType) { + const bool toLLVMPointer = mlir::isa(outType); + const bool toInteger = fir::isa_integer(outType); + + // When converting references to classes or allocatables into boxes for + // runtime arguments, we cast away all the volatility information and pass a + // box. This is allowed. + const bool isBoxNoneLike = [&]() { + if (fir::isBoxNone(outType)) + return true; + if (auto referenceType = mlir::dyn_cast(outType)) { + if (fir::isBoxNone(referenceType.getElementType())) { + return true; + } + } + return false; + }(); + + const bool isPtrToIntLike = toLLVMPointer || toInteger || isBoxNoneLike; + if (isPtrToIntLike) { + return mlir::success(); + } + + // In all other cases, we need to check for an exact volatility match. + return mlir::success(fir::isa_volatile_type(inType) == + fir::isa_volatile_type(outType)); +} + llvm::LogicalResult fir::ConvertOp::verify() { mlir::Type inType = getValue().getType(); mlir::Type outType = getType(); - // If we're converting to an LLVM pointer type or an integer, we don't - // need to check for volatility mismatch - volatility will be handled by the - // memory operations themselves in llvm code generation and ptr-to-int can't - // represent volatility. 
- const bool toLLVMPointer = mlir::isa(outType); - const bool toInteger = fir::isa_integer(outType); if (fir::useStrictVolatileVerification()) { - if (fir::isa_volatile_type(inType) != fir::isa_volatile_type(outType) && - !toLLVMPointer && !toInteger) { - return emitOpError("cannot convert between volatile and non-volatile " - "types, use fir.volatile_cast instead ") + if (failed(verifyVolatility(inType, outType))) { + return emitOpError("this conversion does not preserve volatility: ") << inType << " / " << outType; } } diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp index 711d5d1461b08..52517eef2890d 100644 --- a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp +++ b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp @@ -207,20 +207,25 @@ static bool hasExplicitLowerBounds(mlir::Value shape) { mlir::isa(shape.getType()); } -static std::pair updateDeclareInputTypeWithVolatility( +static std::pair updateDeclaredInputTypeWithVolatility( mlir::Type inputType, mlir::Value memref, mlir::OpBuilder &builder, fir::FortranVariableFlagsAttr fortran_attrs) { if (fortran_attrs && bitEnumContainsAny(fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::fortran_volatile)) { + // A volatile pointer's pointee is volatile. const bool isPointer = bitEnumContainsAny( fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::pointer); + // An allocatable's inner type's volatility matches that of the reference. + const bool isAllocatable = bitEnumContainsAny( + fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::allocatable); auto updateType = [&](auto t) { using FIRT = decltype(t); - // A volatile pointer's pointee is volatile. 
auto elementType = t.getEleTy(); - const bool elementTypeIsVolatile = - isPointer || fir::isa_volatile_type(elementType); + const bool elementTypeIsBox = mlir::isa(elementType); + const bool elementTypeIsVolatile = isPointer || isAllocatable || + elementTypeIsBox || + fir::isa_volatile_type(elementType); auto newEleTy = fir::updateTypeWithVolatility(elementType, elementTypeIsVolatile); inputType = FIRT::get(newEleTy, true); @@ -243,7 +248,7 @@ void hlfir::DeclareOp::build(mlir::OpBuilder &builder, auto nameAttr = builder.getStringAttr(uniq_name); mlir::Type inputType = memref.getType(); bool hasExplicitLbs = hasExplicitLowerBounds(shape); - std::tie(inputType, memref) = updateDeclareInputTypeWithVolatility( + std::tie(inputType, memref) = updateDeclaredInputTypeWithVolatility( inputType, memref, builder, fortran_attrs); mlir::Type hlfirVariableType = getHLFIRVariableType(inputType, hasExplicitLbs); diff --git a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp index 0c78a878cdc53..f9a4c4d0283c7 100644 --- a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp @@ -401,10 +401,14 @@ llvm::LogicalResult SelectTypeConv::genTypeLadderStep( { // Since conversion is done in parallel for each fir.select_type // operation, the runtime function insertion must be threadsafe. 
+ auto runtimeAttr = + mlir::NamedAttribute(fir::FIROpsDialect::getFirRuntimeAttrName(), + mlir::UnitAttr::get(rewriter.getContext())); callee = fir::createFuncOp(rewriter.getUnknownLoc(), mod, fctName, rewriter.getFunctionType({descNoneTy, typeDescTy}, - rewriter.getI1Type())); + rewriter.getI1Type()), + {runtimeAttr}); } cmp = rewriter .create(loc, callee, diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index 1de48b87365b3..834eea7df8ebe 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -1260,7 +1260,7 @@ func.func @dc_invalid_reduction(%arg0: index, %arg1: index) { // Should fail when volatility changes from a fir.convert func.func @bad_convert_volatile(%arg0: !fir.ref) -> !fir.ref { - // expected-error at +1 {{'fir.convert' op cannot convert between volatile and non-volatile types, use fir.volatile_cast instead}} + // expected-error at +1 {{op this conversion does not preserve volatilit}} %0 = fir.convert %arg0 : (!fir.ref) -> !fir.ref return %0 : !fir.ref } @@ -1269,7 +1269,7 @@ func.func @bad_convert_volatile(%arg0: !fir.ref) -> !fir.ref // Should fail when volatility changes from a fir.convert func.func @bad_convert_volatile2(%arg0: !fir.ref) -> !fir.ref { - // expected-error at +1 {{'fir.convert' op cannot convert between volatile and non-volatile types, use fir.volatile_cast instead}} + // expected-error at +1 {{op this conversion does not preserve volatility}} %0 = fir.convert %arg0 : (!fir.ref) -> !fir.ref return %0 : !fir.ref } diff --git a/flang/test/Lower/volatile-allocatable.f90 b/flang/test/Lower/volatile-allocatable.f90 index 5f75a5425422a..e182fe8a4d9c9 100644 --- a/flang/test/Lower/volatile-allocatable.f90 +++ b/flang/test/Lower/volatile-allocatable.f90 @@ -119,10 +119,10 @@ subroutine test_unlimited_polymorphic() end subroutine ! CHECK-LABEL: func.func @_QPtest_scalar_volatile() { -! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEc1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv2"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv3"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEc1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv2"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv3"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () @@ -140,8 +140,8 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_volatile_asynchronous() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEi1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEv1"} : (!fir.ref>>>, volatile>) -> (!fir.ref>>>, volatile>, !fir.ref>>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEi1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEv1"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 @@ -151,10 +151,11 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_select_base_type_volatile() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.ref>>>, volatile>) -> (!fir.ref>>>, volatile>, !fir.ref>>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAClassIs(%{{.+}}, %{{.+}}) : (!fir.box, !fir.ref) -> i1 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.class>>, volatile>, !fir.shift<1>) -> (!fir.class>>, volatile>, !fir.class>>, volatile>) ! 
CHECK: %{{.+}} = hlfir.designate %{{.+}}#0 (%{{.+}}) : (!fir.class>>, volatile>, index) -> !fir.class, volatile> ! CHECK: %{{.+}} = hlfir.designate %{{.+}}{"i"} : (!fir.class, volatile>) -> !fir.ref @@ -162,7 +163,7 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_mold_allocation() { ! CHECK: %{{.+}}:2 = h... [truncated] ``````````
https://github.com/llvm/llvm-project/pull/139183

From flang-commits at lists.llvm.org Thu May 8 17:56:40 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 08 May 2025 17:56:40 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix spurious error on defined assignment in PURE (PR #139186)
Message-ID:

https://github.com/klausler created https://github.com/llvm/llvm-project/pull/139186

An assignment to a whole polymorphic object in a PURE subprogram that is implemented by means of a defined assignment procedure shouldn't be subjected to the same definability checks as it would be for an intrinsic assignment (which would also require it to be allocatable).

Fixes https://github.com/llvm/llvm-project/issues/139129.


From flang-commits at lists.llvm.org Thu May 8 17:57:19 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 08 May 2025 17:57:19 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix spurious error on defined assignment in PURE (PR #139186)
In-Reply-To:
Message-ID: <681d52ef.170a0220.3d8973.6e28@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-semantics

Author: Peter Klausler (klausler)
Changes

An assignment to a whole polymorphic object in a PURE subprogram that is implemented by means of a defined assignment procedure shouldn't be subjected to the same definability checks as it would be for an intrinsic assignment (which would also require it to be allocatable).

Fixes https://github.com/llvm/llvm-project/issues/139129.

---
Full diff: https://github.com/llvm/llvm-project/pull/139186.diff

11 Files Affected:

- (modified) flang/lib/Semantics/assignment.cpp (+5)
- (modified) flang/lib/Semantics/check-deallocate.cpp (+2-1)
- (modified) flang/lib/Semantics/check-declarations.cpp (+2-2)
- (modified) flang/lib/Semantics/definable.cpp (+21-21)
- (modified) flang/lib/Semantics/definable.h (+1-1)
- (modified) flang/lib/Semantics/expression.cpp (+3-3)
- (modified) flang/test/Semantics/assign11.f90 (+3-3)
- (added) flang/test/Semantics/bug139129.f90 (+17)
- (modified) flang/test/Semantics/call28.f90 (+1-3)
- (modified) flang/test/Semantics/deallocate07.f90 (+3-3)
- (modified) flang/test/Semantics/declarations05.f90 (+1-1)

``````````diff
diff --git a/flang/lib/Semantics/assignment.cpp b/flang/lib/Semantics/assignment.cpp
index 935f5a03bdb6a..6e55d0210ee0e 100644
--- a/flang/lib/Semantics/assignment.cpp
+++ b/flang/lib/Semantics/assignment.cpp
@@ -72,6 +72,11 @@ void AssignmentContext::Analyze(const parser::AssignmentStmt &stmt) {
         std::holds_alternative(assignment->u)};
     if (isDefinedAssignment) {
       flags.set(DefinabilityFlag::AllowEventLockOrNotifyType);
+    } else if (const Symbol *
+        whole{evaluate::UnwrapWholeSymbolOrComponentDataRef(lhs)}) {
+      if (IsAllocatable(whole->GetUltimate())) {
+        flags.set(DefinabilityFlag::PotentialDeallocation);
+      }
     }
     if (auto whyNot{WhyNotDefinable(lhsLoc, scope, flags, lhs)}) {
       if (whyNot->IsFatal()) {
diff --git a/flang/lib/Semantics/check-deallocate.cpp b/flang/lib/Semantics/check-deallocate.cpp
index 3bcd4d87b0906..332e6b52e1c9a 100644
--- a/flang/lib/Semantics/check-deallocate.cpp
+++ b/flang/lib/Semantics/check-deallocate.cpp
@@ -36,7 +36,8 @@ void DeallocateChecker::Leave(const parser::DeallocateStmt &deallocateStmt) {
             } else if (auto whyNot{WhyNotDefinable(name.source,
                            context_.FindScope(name.source),
                            {DefinabilityFlag::PointerDefinition,
-                               DefinabilityFlag::AcceptAllocatable},
+                               DefinabilityFlag::AcceptAllocatable,
+                               DefinabilityFlag::PotentialDeallocation},
                            *symbol)}) {
               // Catch problems with non-definability of the
               // pointer/allocatable
diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp
index 318085518cc57..c3a228f3ab8a9 100644
--- a/flang/lib/Semantics/check-declarations.cpp
+++ b/flang/lib/Semantics/check-declarations.cpp
@@ -949,8 +949,8 @@ void CheckHelper::CheckObjectEntity(
         !IsFunctionResult(symbol) /*ditto*/) {
       // Check automatically deallocated local variables for possible
       // problems with finalization in PURE.
-      if (auto whyNot{
-              WhyNotDefinable(symbol.name(), symbol.owner(), {}, symbol)}) {
+      if (auto whyNot{WhyNotDefinable(symbol.name(), symbol.owner(),
+              {DefinabilityFlag::PotentialDeallocation}, symbol)}) {
         if (auto *msg{messages_.Say(
                 "'%s' may not be a local variable in a pure subprogram"_err_en_US,
                 symbol.name())}) {
diff --git a/flang/lib/Semantics/definable.cpp b/flang/lib/Semantics/definable.cpp
index 99a31553f2782..931c8e52fc6d7 100644
--- a/flang/lib/Semantics/definable.cpp
+++ b/flang/lib/Semantics/definable.cpp
@@ -193,6 +193,15 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at,
       return WhyNotDefinableLast(at, scope, flags, dataRef->GetLastSymbol());
     }
   }
+  auto dyType{evaluate::DynamicType::From(ultimate)};
+  const auto *inPure{FindPureProcedureContaining(scope)};
+  if (inPure && !flags.test(DefinabilityFlag::PolymorphicOkInPure) &&
+      flags.test(DefinabilityFlag::PotentialDeallocation) && dyType &&
+      dyType->IsPolymorphic()) {
+    return BlameSymbol(at,
+        "'%s' is a whole polymorphic object in a pure subprogram"_en_US,
+        original);
+  }
   if (flags.test(DefinabilityFlag::PointerDefinition)) {
     if (flags.test(DefinabilityFlag::AcceptAllocatable)) {
       if (!IsAllocatableOrObjectPointer(&ultimate)) {
@@ -210,26 +219,17 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at,
         "'%s' is an entity with either an EVENT_TYPE or LOCK_TYPE"_en_US,
         original);
   }
-  if (FindPureProcedureContaining(scope)) {
-    if (auto dyType{evaluate::DynamicType::From(ultimate)}) {
-      if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) {
-        if (dyType->IsPolymorphic()) { // C1596
-          return BlameSymbol(
-              at, "'%s' is polymorphic in a pure subprogram"_en_US, original);
-        }
-      }
-      if (const Symbol * impure{HasImpureFinal(ultimate)}) {
-        return BlameSymbol(at, "'%s' has an impure FINAL procedure '%s'"_en_US,
-            original, impure->name());
-      }
+  if (dyType && inPure) {
+    if (const Symbol * impure{HasImpureFinal(ultimate)}) {
+      return BlameSymbol(at, "'%s' has an impure FINAL procedure '%s'"_en_US,
+          original, impure->name());
+    }
+    if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) {
       if (const DerivedTypeSpec * derived{GetDerivedTypeSpec(dyType)}) {
-        if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) {
-          if (auto bad{
-                  FindPolymorphicAllocatablePotentialComponent(*derived)}) {
-            return BlameSymbol(at,
-                "'%s' has polymorphic component '%s' in a pure subprogram"_en_US,
-                original, bad.BuildResultDesignatorName());
-          }
+        if (auto bad{FindPolymorphicAllocatablePotentialComponent(*derived)}) {
+          return BlameSymbol(at,
+              "'%s' has polymorphic component '%s' in a pure subprogram"_en_US,
+              original, bad.BuildResultDesignatorName());
         }
       }
     }
@@ -241,10 +241,10 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at,
 static std::optional WhyNotDefinable(parser::CharBlock at,
     const Scope &scope, DefinabilityFlags flags,
     const evaluate::DataRef &dataRef) {
+  bool isWholeSymbol{std::holds_alternative(dataRef.u)};
   auto whyNotBase{
       WhyNotDefinableBase(at, scope, flags, dataRef.GetFirstSymbol(),
-          std::holds_alternative(dataRef.u),
-          DefinesComponentPointerTarget(dataRef, flags))};
+          isWholeSymbol, DefinesComponentPointerTarget(dataRef, flags))};
   if (!whyNotBase || !whyNotBase->IsFatal()) {
     if (auto whyNotLast{
             WhyNotDefinableLast(at, scope, flags, dataRef.GetLastSymbol())}) {
diff --git a/flang/lib/Semantics/definable.h b/flang/lib/Semantics/definable.h
index 902702dbccbf3..0d027961417be 100644
--- a/flang/lib/Semantics/definable.h
+++ b/flang/lib/Semantics/definable.h
@@ -33,7 +33,7 @@ ENUM_CLASS(DefinabilityFlag,
     SourcedAllocation, // ALLOCATE(a,SOURCE=)
     PolymorphicOkInPure, // don't check for polymorphic type in pure subprogram
     DoNotNoteDefinition, // context does not imply definition
-    AllowEventLockOrNotifyType)
+    AllowEventLockOrNotifyType, PotentialDeallocation)
 
 using DefinabilityFlags = common::EnumSet;
 
diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp
index e139bda7e4950..96d039edf89d7 100644
--- a/flang/lib/Semantics/expression.cpp
+++ b/flang/lib/Semantics/expression.cpp
@@ -3385,15 +3385,15 @@ const Assignment *ExpressionAnalyzer::Analyze(const parser::AssignmentStmt &x) {
           const Symbol *lastWhole{
               lastWhole0 ? &ResolveAssociations(*lastWhole0) : nullptr};
           if (!lastWhole || !IsAllocatable(*lastWhole)) {
-            Say("Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable"_err_en_US);
+            Say("Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable"_err_en_US);
           } else if (evaluate::IsCoarray(*lastWhole)) {
-            Say("Left-hand side of assignment may not be polymorphic if it is a coarray"_err_en_US);
+            Say("Left-hand side of intrinsic assignment may not be polymorphic if it is a coarray"_err_en_US);
           }
         }
         if (auto *derived{GetDerivedTypeSpec(*dyType)}) {
           if (auto iter{FindAllocatableUltimateComponent(*derived)}) {
             if (ExtractCoarrayRef(lhs)) {
-              Say("Left-hand side of assignment must not be coindexed due to allocatable ultimate component '%s'"_err_en_US,
+              Say("Left-hand side of intrinsic assignment must not be coindexed due to allocatable ultimate component '%s'"_err_en_US,
                   iter.BuildResultDesignatorName());
             }
           }
diff --git a/flang/test/Semantics/assign11.f90 b/flang/test/Semantics/assign11.f90
index 37216526b5f33..9d70d7109e75e 100644
--- a/flang/test/Semantics/assign11.f90
+++ b/flang/test/Semantics/assign11.f90
@@ -9,10 +9,10 @@ program test
   end type
   type(t) auc[*]
   pa = 1 ! ok
-  !ERROR: Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable
+  !ERROR: Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable
   pp = 1
-  !ERROR: Left-hand side of assignment may not be polymorphic if it is a coarray
+  !ERROR: Left-hand side of intrinsic assignment may not be polymorphic if it is a coarray
   pac = 1
-  !ERROR: Left-hand side of assignment must not be coindexed due to allocatable ultimate component '%a'
+  !ERROR: Left-hand side of intrinsic assignment must not be coindexed due to allocatable ultimate component '%a'
   auc[1] = t()
 end
diff --git a/flang/test/Semantics/bug139129.f90 b/flang/test/Semantics/bug139129.f90
new file mode 100644
index 0000000000000..2f0f865854706
--- /dev/null
+++ b/flang/test/Semantics/bug139129.f90
@@ -0,0 +1,17 @@
+!RUN: %flang_fc1 -fsyntax-only %s
+module m
+  type t
+   contains
+    procedure asst
+    generic :: assignment(=) => asst
+  end type
+ contains
+  pure subroutine asst(lhs, rhs)
+    class(t), intent(in out) :: lhs
+    class(t), intent(in) :: rhs
+  end
+  pure subroutine test(x, y)
+    class(t), intent(in out) :: x, y
+    x = y ! spurious definability error
+  end
+end
diff --git a/flang/test/Semantics/call28.f90 b/flang/test/Semantics/call28.f90
index 51430853d663f..f133276f7547e 100644
--- a/flang/test/Semantics/call28.f90
+++ b/flang/test/Semantics/call28.f90
@@ -11,9 +11,7 @@ pure subroutine s1(x)
   end subroutine
   pure subroutine s2(x)
     class(t), intent(in out) :: x
-    !ERROR: Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable
-    !ERROR: Left-hand side of assignment is not definable
-    !BECAUSE: 'x' is polymorphic in a pure subprogram
+    !ERROR: Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable
     x = t()
   end subroutine
   pure subroutine s3(x)
diff --git a/flang/test/Semantics/deallocate07.f90 b/flang/test/Semantics/deallocate07.f90
index 154c680f47c82..dd2885e2cab35 100644
--- a/flang/test/Semantics/deallocate07.f90
+++ b/flang/test/Semantics/deallocate07.f90
@@ -19,11 +19,11 @@ pure subroutine subr(pp1, pp2, mp2)
   !ERROR: Name in DEALLOCATE statement is not definable
   !BECAUSE: 'mv1' may not be defined in pure subprogram 'subr' because it is host-associated
   deallocate(mv1%pc)
-  !ERROR: Object in DEALLOCATE statement is not deallocatable
-  !BECAUSE: 'pp1' is polymorphic in a pure subprogram
+  !ERROR: Name in DEALLOCATE statement is not definable
+  !BECAUSE: 'pp1' is a whole polymorphic object in a pure subprogram
   deallocate(pp1)
   !ERROR: Object in DEALLOCATE statement is not deallocatable
-  !BECAUSE: 'pc' is polymorphic in a pure subprogram
+  !BECAUSE: 'pc' has polymorphic component '%pc' in a pure subprogram
   deallocate(pp2%pc)
   !ERROR: Object in DEALLOCATE statement is not deallocatable
   !BECAUSE: 'mp2' has polymorphic component '%pc' in a pure subprogram
diff --git a/flang/test/Semantics/declarations05.f90 b/flang/test/Semantics/declarations05.f90
index b6dab7aeea0bc..b1e3d3c773160 100644
--- a/flang/test/Semantics/declarations05.f90
+++ b/flang/test/Semantics/declarations05.f90
@@ -22,7 +22,7 @@ impure subroutine final(x)
   end
   pure subroutine test
     !ERROR: 'x0' may not be a local variable in a pure subprogram
-    !BECAUSE: 'x0' is polymorphic in a pure subprogram
+    !BECAUSE: 'x0' is a whole polymorphic object in a pure subprogram
     class(t0), allocatable :: x0
     !ERROR: 'x1' may not be a local variable in a pure subprogram
     !BECAUSE: 'x1' has an impure FINAL procedure 'final'
``````````
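
The flag logic in the diff above can be illustrated with a small standalone sketch. This is not flang code: `ToySymbol`, `Flags`, `flagsForAssignment`, and `whyNotDefinableInPure` are hypothetical stand-ins that model only two of the definability flags, under the assumption that the "whole polymorphic object" complaint is raised only when the caller has set `PotentialDeallocation`.

```cpp
#include <bitset>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical stand-in for the two flags relevant to this patch.
enum class DefinabilityFlag : std::size_t {
  PolymorphicOkInPure,
  PotentialDeallocation,
  Count
};

// Hypothetical, minimal replacement for an enum set.
class Flags {
public:
  Flags &set(DefinabilityFlag f) {
    bits_.set(static_cast<std::size_t>(f));
    return *this;
  }
  bool test(DefinabilityFlag f) const {
    return bits_.test(static_cast<std::size_t>(f));
  }

private:
  std::bitset<static_cast<std::size_t>(DefinabilityFlag::Count)> bits_;
};

// Hypothetical symbol model: only the properties the check looks at.
struct ToySymbol {
  std::string name;
  bool isPolymorphic;
  bool isAllocatable;
};

// Mirrors the assignment.cpp change: only an intrinsic assignment to a
// whole allocatable is treated as a potential deallocation.
Flags flagsForAssignment(const ToySymbol &lhs, bool isDefinedAssignment) {
  Flags flags;
  if (!isDefinedAssignment && lhs.isAllocatable) {
    flags.set(DefinabilityFlag::PotentialDeallocation);
  }
  return flags;
}

// Mirrors the new early check in definable.cpp: the polymorphic complaint
// is gated on PotentialDeallocation rather than applied unconditionally.
std::optional<std::string> whyNotDefinableInPure(
    const ToySymbol &sym, const Flags &flags, bool inPure) {
  if (inPure && !flags.test(DefinabilityFlag::PolymorphicOkInPure) &&
      flags.test(DefinabilityFlag::PotentialDeallocation) &&
      sym.isPolymorphic) {
    return "'" + sym.name +
        "' is a whole polymorphic object in a pure subprogram";
  }
  return std::nullopt;
}
```

In this model, a defined assignment to a polymorphic dummy such as `x` in `bug139129.f90` never sets `PotentialDeallocation`, so `whyNotDefinableInPure` returns no message, while an intrinsic assignment to a polymorphic allocatable in a pure subprogram still does.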
https://github.com/llvm/llvm-project/pull/139186

From flang-commits at lists.llvm.org Thu May 8 19:55:25 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 08 May 2025 19:55:25 -0700 (PDT)
Subject: [flang-commits] [flang] db2d576 - [flang][fir] Support promoting `fir.do_loop` with results to `affine.for`. (#137790)
Message-ID: <681d6e9d.170a0220.24470.8d37@mx.google.com>

Author: MingYan
Date: 2025-05-09T10:55:21+08:00
New Revision: db2d5762ebf61b95b0e414b461db68ac49d06b8c

URL: https://github.com/llvm/llvm-project/commit/db2d5762ebf61b95b0e414b461db68ac49d06b8c
DIFF: https://github.com/llvm/llvm-project/commit/db2d5762ebf61b95b0e414b461db68ac49d06b8c.diff

LOG: [flang][fir] Support promoting `fir.do_loop` with results to `affine.for`. (#137790)

Co-authored-by: yanming

Added:

Modified:
    flang/lib/Optimizer/Transforms/AffinePromotion.cpp
    flang/test/Fir/affine-promotion.fir

Removed:

################################################################################
diff --git a/flang/lib/Optimizer/Transforms/AffinePromotion.cpp b/flang/lib/Optimizer/Transforms/AffinePromotion.cpp
index 43fccf52dc8ab..ef82e400bea14 100644
--- a/flang/lib/Optimizer/Transforms/AffinePromotion.cpp
+++ b/flang/lib/Optimizer/Transforms/AffinePromotion.cpp
@@ -49,8 +49,9 @@ struct AffineIfAnalysis;
 /// second when doing rewrite.
struct AffineFunctionAnalysis { explicit AffineFunctionAnalysis(mlir::func::FuncOp funcOp) { - for (fir::DoLoopOp op : funcOp.getOps()) - loopAnalysisMap.try_emplace(op, op, *this); + funcOp->walk([&](fir::DoLoopOp doloop) { + loopAnalysisMap.try_emplace(doloop, doloop, *this); + }); } AffineLoopAnalysis getChildLoopAnalysis(fir::DoLoopOp op) const; @@ -102,10 +103,23 @@ struct AffineLoopAnalysis { return true; } + bool analysisResults(fir::DoLoopOp loopOperation) { + if (loopOperation.getFinalValue() && + !loopOperation.getResult(0).use_empty()) { + LLVM_DEBUG( + llvm::dbgs() + << "AffineLoopAnalysis: cannot promote loop final value\n";); + return false; + } + + return true; + } + bool analyzeLoop(fir::DoLoopOp loopOperation, AffineFunctionAnalysis &functionAnalysis) { LLVM_DEBUG(llvm::dbgs() << "AffineLoopAnalysis: \n"; loopOperation.dump();); return analyzeMemoryAccess(loopOperation) && + analysisResults(loopOperation) && analyzeBody(loopOperation, functionAnalysis); } @@ -461,14 +475,28 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_ATTRIBUTE_UNUSED auto loopAnalysis = functionAnalysis.getChildLoopAnalysis(loop); auto &loopOps = loop.getBody()->getOperations(); + auto resultOp = cast(loop.getBody()->getTerminator()); + auto results = resultOp.getOperands(); + auto loopResults = loop->getResults(); auto loopAndIndex = createAffineFor(loop, rewriter); auto affineFor = loopAndIndex.first; auto inductionVar = loopAndIndex.second; + if (loop.getFinalValue()) { + results = results.drop_front(); + loopResults = loopResults.drop_front(); + } + rewriter.startOpModification(affineFor.getOperation()); affineFor.getBody()->getOperations().splice( std::prev(affineFor.getBody()->end()), loopOps, loopOps.begin(), std::prev(loopOps.end())); + rewriter.replaceAllUsesWith(loop.getRegionIterArgs(), + affineFor.getRegionIterArgs()); + if (!results.empty()) { + rewriter.setInsertionPointToEnd(affineFor.getBody()); + rewriter.create(resultOp->getLoc(), 
results); + } rewriter.finalizeOpModification(affineFor.getOperation()); rewriter.startOpModification(loop.getOperation()); @@ -479,7 +507,8 @@ class AffineLoopConversion : public mlir::OpRewritePattern { LLVM_DEBUG(llvm::dbgs() << "AffineLoopConversion: loop rewriten to:\n"; affineFor.dump();); - rewriter.replaceOp(loop, affineFor.getOperation()->getResults()); + rewriter.replaceAllUsesWith(loopResults, affineFor->getResults()); + rewriter.eraseOp(loop); return success(); } @@ -503,7 +532,7 @@ class AffineLoopConversion : public mlir::OpRewritePattern { ValueRange(op.getUpperBound()), mlir::AffineMap::get(0, 1, 1 + mlir::getAffineSymbolExpr(0, op.getContext())), - step); + step, op.getIterOperands()); return std::make_pair(affineFor, affineFor.getInductionVar()); } @@ -528,7 +557,7 @@ class AffineLoopConversion : public mlir::OpRewritePattern { genericUpperBound.getResult(), mlir::AffineMap::get(0, 1, 1 + mlir::getAffineSymbolExpr(0, op.getContext())), - 1); + 1, op.getIterOperands()); rewriter.setInsertionPointToStart(affineFor.getBody()); auto actualIndex = rewriter.create( op.getLoc(), actualIndexMap, diff --git a/flang/test/Fir/affine-promotion.fir b/flang/test/Fir/affine-promotion.fir index aae35c6ef5659..46467ab4a292a 100644 --- a/flang/test/Fir/affine-promotion.fir +++ b/flang/test/Fir/affine-promotion.fir @@ -131,3 +131,89 @@ func.func @loop_with_if(%a: !arr_d1, %v: f32) { // CHECK: } // CHECK: return // CHECK: } + +func.func @loop_with_result(%arg0: !fir.ref>, %arg1: !fir.ref>, %arg2: !fir.ref>) -> f32 { + %c1 = arith.constant 1 : index + %cst = arith.constant 0.000000e+00 : f32 + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %1 = fir.shape %c100, %c100 : (index, index) -> !fir.shape<2> + %2 = fir.alloca i32 + %3:2 = fir.do_loop %arg3 = %c1 to %c100 step %c1 iter_args(%arg4 = %cst) -> (index, f32) { + %8 = fir.array_coor %arg0(%0) %arg3 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %9 = fir.load %8 : !fir.ref + 
%10 = arith.addf %arg4, %9 fastmath : f32 + %11 = arith.addi %arg3, %c1 overflow : index + fir.result %11, %10 : index, f32 + } + %4:2 = fir.do_loop %arg3 = %c1 to %c100 step %c1 iter_args(%arg4 = %3#1) -> (index, f32) { + %8 = fir.array_coor %arg1(%1) %c1, %arg3 : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref + %9 = fir.convert %8 : (!fir.ref) -> !fir.ref> + %10 = fir.do_loop %arg5 = %c1 to %c100 step %c1 iter_args(%arg6 = %arg4) -> (f32) { + %12 = fir.array_coor %9(%0) %arg5 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %13 = fir.load %12 : !fir.ref + %14 = arith.addf %arg6, %13 fastmath : f32 + fir.result %14 : f32 + } + %11 = arith.addi %arg3, %c1 overflow : index + fir.result %11, %10 : index, f32 + } + %5:2 = fir.do_loop %arg3 = %c1 to %c100 step %c1 iter_args(%arg4 = %4#1, %arg5 = %cst) -> (f32, f32) { + %8 = fir.array_coor %arg0(%0) %arg3 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %9 = fir.load %8 : !fir.ref + %10 = arith.addf %arg4, %9 fastmath : f32 + %11 = fir.array_coor %arg2(%0) %arg3 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %12 = fir.load %11 : !fir.ref + %13 = arith.addf %arg5, %12 fastmath : f32 + fir.result %10, %13 : f32, f32 + } + %6 = arith.addf %5#0, %5#1 fastmath : f32 + %7 = fir.convert %4#0 : (index) -> i32 + fir.store %7 to %2 : !fir.ref + return %6 : f32 +} + +// CHECK-LABEL: func.func @loop_with_result( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG2:.*]]: !fir.ref>) -> f32 { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 0.000000e+00 : f32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]], %[[VAL_2]] : (index, index) -> !fir.shape<2> +// CHECK: %[[VAL_5:.*]] = fir.alloca i32 +// CHECK: %[[VAL_6:.*]] = fir.convert %[[ARG0]] : (!fir.ref>) -> memref +// CHECK: %[[VAL_7:.*]] = affine.for 
%[[VAL_8:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] iter_args(%[[VAL_9:.*]] = %[[VAL_1]]) -> (f32) { +// CHECK: %[[VAL_10:.*]] = affine.apply #{{.*}}(%[[VAL_8]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_11:.*]] = affine.load %[[VAL_6]]{{\[}}%[[VAL_10]]] : memref +// CHECK: %[[VAL_12:.*]] = arith.addf %[[VAL_9]], %[[VAL_11]] fastmath : f32 +// CHECK: affine.yield %[[VAL_12]] : f32 +// CHECK: } +// CHECK: %[[VAL_13:.*]]:2 = fir.do_loop %[[VAL_14:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_0]] iter_args(%[[VAL_15:.*]] = %[[VAL_7]]) -> (index, f32) { +// CHECK: %[[VAL_16:.*]] = fir.array_coor %[[ARG1]](%[[VAL_4]]) %[[VAL_0]], %[[VAL_14]] : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16]] : (!fir.ref) -> !fir.ref> +// CHECK: %[[VAL_18:.*]] = fir.convert %[[VAL_17]] : (!fir.ref>) -> memref +// CHECK: %[[VAL_19:.*]] = affine.for %[[VAL_20:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] iter_args(%[[VAL_21:.*]] = %[[VAL_15]]) -> (f32) { +// CHECK: %[[VAL_22:.*]] = affine.apply #{{.*}}(%[[VAL_20]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_23:.*]] = affine.load %[[VAL_18]]{{\[}}%[[VAL_22]]] : memref +// CHECK: %[[VAL_24:.*]] = arith.addf %[[VAL_21]], %[[VAL_23]] fastmath : f32 +// CHECK: affine.yield %[[VAL_24]] : f32 +// CHECK: } +// CHECK: %[[VAL_25:.*]] = arith.addi %[[VAL_14]], %[[VAL_0]] overflow : index +// CHECK: fir.result %[[VAL_25]], %[[VAL_19]] : index, f32 +// CHECK: } +// CHECK: %[[VAL_26:.*]] = fir.convert %[[ARG2]] : (!fir.ref>) -> memref +// CHECK: %[[VAL_27:.*]]:2 = affine.for %[[VAL_28:.*]] = %[[VAL_0]] to #{{.*}}(){{\[}}%[[VAL_2]]] iter_args(%[[VAL_29:.*]] = %[[VAL_30:.*]]#1, %[[VAL_31:.*]] = %[[VAL_1]]) -> (f32, f32) { +// CHECK: %[[VAL_32:.*]] = affine.apply #{{.*}}(%[[VAL_28]]){{\[}}%[[VAL_0]], %[[VAL_2]], %[[VAL_0]]] +// CHECK: %[[VAL_33:.*]] = affine.load %[[VAL_6]]{{\[}}%[[VAL_32]]] : memref +// CHECK: %[[VAL_34:.*]] = arith.addf %[[VAL_29]], %[[VAL_33]] 
fastmath : f32 +// CHECK: %[[VAL_35:.*]] = affine.load %[[VAL_26]]{{\[}}%[[VAL_32]]] : memref +// CHECK: %[[VAL_36:.*]] = arith.addf %[[VAL_31]], %[[VAL_35]] fastmath : f32 +// CHECK: affine.yield %[[VAL_34]], %[[VAL_36]] : f32, f32 +// CHECK: } +// CHECK: %[[VAL_37:.*]] = arith.addf %[[VAL_38:.*]]#0, %[[VAL_38]]#1 fastmath : f32 +// CHECK: %[[VAL_39:.*]] = fir.convert %[[VAL_40:.*]]#0 : (index) -> i32 +// CHECK: fir.store %[[VAL_39]] to %[[VAL_5]] : !fir.ref +// CHECK: return %[[VAL_37]] : f32 +// CHECK: } From flang-commits at lists.llvm.org Thu May 8 19:55:28 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 19:55:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Support promoting `fir.do_loop` with results to `affine.for`. (PR #137790) In-Reply-To: Message-ID: <681d6ea0.050a0220.10efa6.cbf7@mx.google.com> https://github.com/NexMing closed https://github.com/llvm/llvm-project/pull/137790 From flang-commits at lists.llvm.org Thu May 8 09:59:51 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 09:59:51 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609) In-Reply-To: Message-ID: <681ce307.170a0220.1acd96.28ff@mx.google.com> https://github.com/jofrn updated https://github.com/llvm/llvm-project/pull/123609 >From 210b6d80bcfbbcd216f98199df386280724561e2 Mon Sep 17 00:00:00 2001 From: jofernau Date: Mon, 20 Jan 2025 04:51:26 -0800 Subject: [PATCH 01/31] [TargetVerifier][AMDGPU] Add TargetVerifier. This pass verifies the IR for an individual backend. This is different than Lint because it consolidates all checks for a given backend in a single pass. A check for Lint may be undefined behavior across all targets, whereas a check in TargetVerifier would only pertain to the specified target but can check more than just undefined behavior such are IR validity. 
A use case of this would be to reject programs with invalid IR while fuzzing.
---
 llvm/include/llvm/IR/Module.h                 |   4 +
 llvm/include/llvm/Target/TargetVerifier.h     |  82 +++++++
 .../TargetVerify/AMDGPUTargetVerifier.h       |  36 +++
 llvm/lib/IR/Verifier.cpp                      |  18 +-
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp    | 213 ++++++++++++++++++
 llvm/lib/Target/AMDGPU/CMakeLists.txt         |   1 +
 llvm/test/CodeGen/AMDGPU/tgt-verify.ll        |  62 +++++
 llvm/tools/llvm-tgt-verify/CMakeLists.txt     |  34 +++
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 172 ++++++++++++++
 9 files changed, 618 insertions(+), 4 deletions(-)
 create mode 100644 llvm/include/llvm/Target/TargetVerifier.h
 create mode 100644 llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
 create mode 100644 llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
 create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify.ll
 create mode 100644 llvm/tools/llvm-tgt-verify/CMakeLists.txt
 create mode 100644 llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp

diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h
index 91ccd76c41e07..03c0cf1cf0924 100644
--- a/llvm/include/llvm/IR/Module.h
+++ b/llvm/include/llvm/IR/Module.h
@@ -214,6 +214,10 @@ class LLVM_ABI Module {
 /// @name Constructors
 /// @{
 public:
+  /// Is this Module valid as determined by one of the verification passes
+  /// i.e. Lint, Verifier, TargetVerifier.
+  bool IsValid = true;
+
   /// Is this Module using intrinsics to record the position of debugging
   /// information, or non-intrinsic records? See IsNewDbgInfoFormat in
   /// \ref BasicBlock.
diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
new file mode 100644
index 0000000000000..e00c6a7b260c9
--- /dev/null
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -0,0 +1,82 @@
+//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines target verifier interfaces that can be used for some
+// validation of input to the system, and for checking that transformations
+// haven't done something bad. In contrast to the Verifier or Lint, the
+// TargetVerifier looks for constructions invalid to a particular target
+// machine.
+//
+// To see what specifically is checked, look at TargetVerifier.cpp or an
+// individual backend's TargetVerifier.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TARGET_VERIFIER_H
+#define LLVM_TARGET_VERIFIER_H
+
+#include "llvm/IR/PassManager.h"
+#include "llvm/IR/Module.h"
+#include "llvm/TargetParser/Triple.h"
+
+namespace llvm {
+
+class Function;
+
+class TargetVerifierPass : public PassInfoMixin<TargetVerifierPass> {
+public:
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {}
+};
+
+class TargetVerify {
+protected:
+  void WriteValues(ArrayRef<const Value *> Vs) {
+    for (const Value *V : Vs) {
+      if (!V)
+        continue;
+      if (isa<Instruction>(V)) {
+        MessagesStr << *V << '\n';
+      } else {
+        V->printAsOperand(MessagesStr, true, Mod);
+        MessagesStr << '\n';
+      }
+    }
+  }
+
+  /// A check failed, so print out the condition and the message.
+  ///
+  /// This provides a nice place to put a breakpoint if you want to see why
+  /// something is not correct.
+  void CheckFailed(const Twine &Message) { MessagesStr << Message << '\n'; }
+
+  /// A check failed (with values to print).
+  ///
+  /// This calls the Message-only version so that the above is easier to set
+  /// a breakpoint on.
+  template <typename T1, typename... Ts>
+  void CheckFailed(const Twine &Message, const T1 &V1, const Ts &... Vs) {
+    CheckFailed(Message);
+    WriteValues({V1, Vs...});
+  }
+public:
+  Module *Mod;
+  Triple TT;
+
+  std::string Messages;
+  raw_string_ostream MessagesStr;
+
+  TargetVerify(Module *Mod)
+      : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())),
+        MessagesStr(Messages) {}
+
+  void run(Function &F) {};
+};
+
+} // namespace llvm
+
+#endif // LLVM_TARGET_VERIFIER_H
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
new file mode 100644
index 0000000000000..e6ff57629b141
--- /dev/null
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -0,0 +1,36 @@
+//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU ---*- C++ -*-===//
+////
+//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+//// See https://llvm.org/LICENSE.txt for license information.
+//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+////
+////===----------------------------------------------------------------------===//
+////
+//// This file defines target verifier interfaces that can be used for some
+//// validation of input to the system, and for checking that transformations
+//// haven't done something bad. In contrast to the Verifier or Lint, the
+//// TargetVerifier looks for constructions invalid to a particular target
+//// machine.
+////
+//// To see what specifically is checked, look at an individual backend's
+//// TargetVerifier.
+////
+////===----------------------------------------------------------------------===//
+
+#ifndef LLVM_AMDGPU_TARGET_VERIFIER_H
+#define LLVM_AMDGPU_TARGET_VERIFIER_H
+
+#include "llvm/Target/TargetVerifier.h"
+
+namespace llvm {
+
+class Function;
+
+class AMDGPUTargetVerifierPass : public TargetVerifierPass {
+public:
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
+};
+
+} // namespace llvm
+
+#endif // LLVM_AMDGPU_TARGET_VERIFIER_H
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 8afe360d088bc..9d21ca182ca13 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -135,6 +135,10 @@ static cl::opt<bool> VerifyNoAliasScopeDomination(
     cl::desc("Ensure that llvm.experimental.noalias.scope.decl for identical "
              "scopes are not dominating"));
 
+static cl::opt<bool>
+    VerifyAbortOnError("verifier-abort-on-error", cl::init(false),
+                       cl::desc("In the Verifier pass, abort on errors."));
+
 namespace llvm {
 
 struct VerifierSupport {
@@ -7796,16 +7800,22 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F,
 
 PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) {
   auto Res = AM.getResult<VerifierAnalysis>(M);
-  if (FatalErrors && (Res.IRBroken || Res.DebugInfoBroken))
-    report_fatal_error("Broken module found, compilation aborted!");
+  if (Res.IRBroken || Res.DebugInfoBroken) {
+    M.IsValid = false;
+    if (VerifyAbortOnError && FatalErrors)
+      report_fatal_error("Broken module found, compilation aborted!");
+  }
 
   return PreservedAnalyses::all();
 }
 
 PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
   auto res = AM.getResult<VerifierAnalysis>(F);
-  if (res.IRBroken && FatalErrors)
-    report_fatal_error("Broken function found, compilation aborted!");
+  if (res.IRBroken) {
+    F.getParent()->IsValid = false;
+    if (VerifyAbortOnError && FatalErrors)
+      report_fatal_error("Broken function found, compilation aborted!");
+  }
 
   return PreservedAnalyses::all();
 }
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp new file mode 100644 index 0000000000000..585b19065c142 --- /dev/null +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -0,0 +1,213 @@ +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" + +#include "llvm/Analysis/UniformityAnalysis.h" +#include "llvm/Analysis/PostDominators.h" +#include "llvm/Support/Debug.h" +#include "llvm/IR/Dominators.h" +#include "llvm/IR/Function.h" +#include "llvm/IR/IntrinsicInst.h" +#include "llvm/IR/IntrinsicsAMDGPU.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/Value.h" + +#include "llvm/Support/raw_ostream.h" + +using namespace llvm; + +static cl::opt +MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); + +// Check - We know that cond should be true, if not print an error message. +#define Check(C, ...) \ + do { \ + if (!(C)) { \ + TargetVerify::CheckFailed(__VA_ARGS__); \ + return; \ + } \ + } while (false) + +static bool isMFMA(unsigned IID) { + switch (IID) { + case Intrinsic::amdgcn_mfma_f32_4x4x1f32: + case Intrinsic::amdgcn_mfma_f32_4x4x4f16: + case Intrinsic::amdgcn_mfma_i32_4x4x4i8: + case Intrinsic::amdgcn_mfma_f32_4x4x2bf16: + + case Intrinsic::amdgcn_mfma_f32_16x16x1f32: + case Intrinsic::amdgcn_mfma_f32_16x16x4f32: + case Intrinsic::amdgcn_mfma_f32_16x16x4f16: + case Intrinsic::amdgcn_mfma_f32_16x16x16f16: + case Intrinsic::amdgcn_mfma_i32_16x16x4i8: + case Intrinsic::amdgcn_mfma_i32_16x16x16i8: + case Intrinsic::amdgcn_mfma_f32_16x16x2bf16: + case Intrinsic::amdgcn_mfma_f32_16x16x8bf16: + + case Intrinsic::amdgcn_mfma_f32_32x32x1f32: + case Intrinsic::amdgcn_mfma_f32_32x32x2f32: + case Intrinsic::amdgcn_mfma_f32_32x32x4f16: + case Intrinsic::amdgcn_mfma_f32_32x32x8f16: + case Intrinsic::amdgcn_mfma_i32_32x32x4i8: + case Intrinsic::amdgcn_mfma_i32_32x32x8i8: + case Intrinsic::amdgcn_mfma_f32_32x32x2bf16: + case Intrinsic::amdgcn_mfma_f32_32x32x4bf16: + + case Intrinsic::amdgcn_mfma_f32_4x4x4bf16_1k: + case 
Intrinsic::amdgcn_mfma_f32_16x16x4bf16_1k: + case Intrinsic::amdgcn_mfma_f32_16x16x16bf16_1k: + case Intrinsic::amdgcn_mfma_f32_32x32x4bf16_1k: + case Intrinsic::amdgcn_mfma_f32_32x32x8bf16_1k: + + case Intrinsic::amdgcn_mfma_f64_16x16x4f64: + case Intrinsic::amdgcn_mfma_f64_4x4x4f64: + + case Intrinsic::amdgcn_mfma_i32_16x16x32_i8: + case Intrinsic::amdgcn_mfma_i32_32x32x16_i8: + case Intrinsic::amdgcn_mfma_f32_16x16x8_xf32: + case Intrinsic::amdgcn_mfma_f32_32x32x4_xf32: + + case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_bf8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_fp8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_bf8: + case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_fp8: + + case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_bf8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_fp8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_bf8: + case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_fp8: + return true; + default: + return false; + } +} + +namespace llvm { +class AMDGPUTargetVerify : public TargetVerify { +public: + Module *Mod; + + DominatorTree *DT; + PostDominatorTree *PDT; + UniformityInfo *UA; + + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) + : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} + + void run(Function &F); +}; + +static bool IsValidInt(const Type *Ty) { + return Ty->isIntegerTy(1) || + Ty->isIntegerTy(8) || + Ty->isIntegerTy(16) || + Ty->isIntegerTy(32) || + Ty->isIntegerTy(64) || + Ty->isIntegerTy(128); +} + +static bool isShader(CallingConv::ID CC) { + switch(CC) { + case CallingConv::AMDGPU_VS: + case CallingConv::AMDGPU_LS: + case CallingConv::AMDGPU_HS: + case CallingConv::AMDGPU_ES: + case CallingConv::AMDGPU_GS: + case CallingConv::AMDGPU_PS: + case CallingConv::AMDGPU_CS_Chain: + case CallingConv::AMDGPU_CS_ChainPreserve: + case CallingConv::AMDGPU_CS: + return true; + default: + return false; + } +} + +void AMDGPUTargetVerify::run(Function &F) { + // Ensure shader calling convention 
returns void + if (isShader(F.getCallingConv())) + Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void"); + + for (auto &BB : F) { + + for (auto &I : BB) { + if (MarkUniform) + outs() << UA->isUniform(&I) << ' ' << I << '\n'; + + // Ensure integral types are valid: i8, i16, i32, i64, i128 + if (I.getType()->isIntegerTy()) + Check(IsValidInt(I.getType()), "Int type is invalid.", &I); + for (unsigned i = 0; i < I.getNumOperands(); ++i) + if (I.getOperand(i)->getType()->isIntegerTy()) + Check(IsValidInt(I.getOperand(i)->getType()), + "Int type is invalid.", I.getOperand(i)); + + // Ensure no store to const memory + if (auto *SI = dyn_cast(&I)) + { + unsigned AS = SI->getPointerAddressSpace(); + Check(AS != 4, "Write to const memory", SI); + } + + // Ensure no kernel to kernel calls. + if (auto *CI = dyn_cast(&I)) + { + CallingConv::ID CalleeCC = CI->getCallingConv(); + if (CalleeCC == CallingConv::AMDGPU_KERNEL) + { + CallingConv::ID CallerCC = CI->getParent()->getParent()->getCallingConv(); + Check(CallerCC != CallingConv::AMDGPU_KERNEL, + "A kernel may not call a kernel", CI->getParent()->getParent()); + } + } + + // Ensure MFMA is not in control flow with diverging operands + if (auto *II = dyn_cast(&I)) { + if (isMFMA(II->getIntrinsicID())) { + bool InControlFlow = false; + for (const auto &P : predecessors(&BB)) + if (!PDT->dominates(&BB, P)) { + InControlFlow = true; + break; + } + for (const auto &S : successors(&BB)) + if (!DT->dominates(&BB, S)) { + InControlFlow = true; + break; + } + if (InControlFlow) { + // If operands to MFMA are not uniform, MFMA cannot be in control flow + bool hasUniformOperands = true; + for (unsigned i = 0; i < II->getNumOperands(); i++) { + if (!UA->isUniform(II->getOperand(i))) { + dbgs() << "Not uniform: " << *II->getOperand(i) << '\n'; + hasUniformOperands = false; + } + } + if (!hasUniformOperands) Check(false, "MFMA in control flow", II); + //else Check(false, "MFMA in control flow (uniform 
operands)", II); + } + //else Check(false, "MFMA not in control flow", II); + } + } + } + } +} + +PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { + + auto *Mod = F.getParent(); + + auto UA = &AM.getResult(F); + auto *DT = &AM.getResult(F); + auto *PDT = &AM.getResult(F); + + AMDGPUTargetVerify TV(Mod, DT, PDT, UA); + TV.run(F); + + dbgs() << TV.MessagesStr.str(); + if (!TV.MessagesStr.str().empty()) { + F.getParent()->IsValid = false; + } + + return PreservedAnalyses::all(); +} +} // namespace llvm diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt b/llvm/lib/Target/AMDGPU/CMakeLists.txt index 09a3096602fc3..bcfea0bf8ac94 100644 --- a/llvm/lib/Target/AMDGPU/CMakeLists.txt +++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt @@ -110,6 +110,7 @@ add_llvm_target(AMDGPUCodeGen AMDGPUTargetMachine.cpp AMDGPUTargetObjectFile.cpp AMDGPUTargetTransformInfo.cpp + AMDGPUTargetVerifier.cpp AMDGPUWaitSGPRHazards.cpp AMDGPUUnifyDivergentExitNodes.cpp AMDGPUUnifyMetadata.cpp diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll new file mode 100644 index 0000000000000..f56ff992a56c2 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -0,0 +1,62 @@ +; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s + +define amdgpu_kernel void @test_mfma_f32_32x32x1f32_vecarg(ptr addrspace(1) %arg) #0 { +; CHECK: Not uniform: %in.f32 = load <32 x float>, ptr addrspace(1) %gep, align 128 +; CHECK-NEXT: MFMA in control flow +; CHECK-NEXT: %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) +s: + %tid = call i32 @llvm.amdgcn.workitem.id.x() + %gep = getelementptr inbounds <32 x float>, ptr addrspace(1) %arg, i32 %tid + %in.i32 = load <32 x i32>, ptr addrspace(1) %gep + %in.f32 = load <32 x float>, ptr addrspace(1) %gep + + %0 = icmp eq <32 x i32> %in.i32, zeroinitializer + %div.br = extractelement <32 x i1> %0, 
i32 0 + br i1 %div.br, label %if.3, label %else.0 + +if.3: + br label %join + +else.0: + %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) + br label %join + +join: + ret void +} + +define amdgpu_cs i32 @shader() { +; CHECK: Shaders must return void + ret i32 0 +} + +define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) { +; CHECK: Undefined behavior: Write to memory in const addrspace +; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 +; CHECK-NEXT: Write to const memory +; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4 + %r = add i32 %a, %b + store i32 %r, ptr addrspace(4) %out + ret void +} + +define amdgpu_kernel void @kernel_callee(ptr %x) { + ret void +} + +define amdgpu_kernel void @kernel_caller(ptr %x) { +; CHECK: A kernel may not call a kernel +; CHECK-NEXT: ptr @kernel_caller + call amdgpu_kernel void @kernel_callee(ptr %x) + ret void +} + + +; Function Attrs: nounwind +define i65 @invalid_type(i65 %x) #0 { +; CHECK: Int type is invalid. 
+; CHECK-NEXT: %tmp2 = ashr i65 %x, 64 +entry: + %tmp2 = ashr i65 %x, 64 + ret i65 %tmp2 +} diff --git a/llvm/tools/llvm-tgt-verify/CMakeLists.txt b/llvm/tools/llvm-tgt-verify/CMakeLists.txt new file mode 100644 index 0000000000000..fe47c85e6cdce --- /dev/null +++ b/llvm/tools/llvm-tgt-verify/CMakeLists.txt @@ -0,0 +1,34 @@ +set(LLVM_LINK_COMPONENTS + AllTargetsAsmParsers + AllTargetsCodeGens + AllTargetsDescs + AllTargetsInfos + Analysis + AsmPrinter + CodeGen + CodeGenTypes + Core + IRPrinter + IRReader + MC + MIRParser + Passes + Remarks + ScalarOpts + SelectionDAG + Support + Target + TargetParser + TransformUtils + Vectorize + ) + +add_llvm_tool(llvm-tgt-verify + llvm-tgt-verify.cpp + + DEPENDS + intrinsics_gen + SUPPORT_PLUGINS + ) + +export_executable_symbols_for_plugins(llc) diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp new file mode 100644 index 0000000000000..68422abd6f4cc --- /dev/null +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -0,0 +1,172 @@ +//===--- llvm-isel-fuzzer.cpp - Fuzzer for instruction selection ----------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Tool to fuzz instruction selection using libFuzzer. 
+// +//===----------------------------------------------------------------------===// + +#include "llvm/InitializePasses.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Analysis/Lint.h" +#include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/Bitcode/BitcodeReader.h" +#include "llvm/Bitcode/BitcodeWriter.h" +#include "llvm/CodeGen/CommandFlags.h" +#include "llvm/CodeGen/TargetPassConfig.h" +#include "llvm/IR/Constants.h" +#include "llvm/IR/LLVMContext.h" +#include "llvm/IR/LegacyPassManager.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/Verifier.h" +#include "llvm/IRReader/IRReader.h" +#include "llvm/Passes/PassBuilder.h" +#include "llvm/Passes/StandardInstrumentations.h" +#include "llvm/MC/TargetRegistry.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/DataTypes.h" +#include "llvm/Support/Debug.h" +#include "llvm/Support/InitLLVM.h" +#include "llvm/Support/SourceMgr.h" +#include "llvm/Support/TargetSelect.h" +#include "llvm/Target/TargetMachine.h" +#include "llvm/Target/TargetVerifier.h" + +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" + +#define DEBUG_TYPE "isel-fuzzer" + +using namespace llvm; + +static codegen::RegisterCodeGenFlags CGF; + +static cl::opt +InputFilename(cl::Positional, cl::desc(""), cl::init("-")); + +static cl::opt + StacktraceAbort("stacktrace-abort", + cl::desc("Turn on stacktrace"), cl::init(false)); + +static cl::opt + NoLint("no-lint", + cl::desc("Turn off Lint"), cl::init(false)); + +static cl::opt + NoVerify("no-verifier", + cl::desc("Turn off Verifier"), cl::init(false)); + +static cl::opt + OptLevel("O", + cl::desc("Optimization level. 
[-O0, -O1, -O2, or -O3] " + "(default = '-O2')"), + cl::Prefix, cl::init('2')); + +static cl::opt + TargetTriple("mtriple", cl::desc("Override target triple for module")); + +static std::unique_ptr TM; + +static void handleLLVMFatalError(void *, const char *Message, bool) { + if (StacktraceAbort) { + dbgs() << "LLVM ERROR: " << Message << "\n" + << "Aborting.\n"; + abort(); + } +} + +int main(int argc, char **argv) { + StringRef ExecName = argv[0]; + InitLLVM X(argc, argv); + + InitializeAllTargets(); + InitializeAllTargetMCs(); + InitializeAllAsmPrinters(); + InitializeAllAsmParsers(); + + PassRegistry *Registry = PassRegistry::getPassRegistry(); + initializeCore(*Registry); + initializeCodeGen(*Registry); + initializeAnalysis(*Registry); + initializeTarget(*Registry); + + cl::ParseCommandLineOptions(argc, argv); + + if (TargetTriple.empty()) { + errs() << ExecName << ": -mtriple must be specified\n"; + exit(1); + } + + CodeGenOptLevel OLvl; + if (auto Level = CodeGenOpt::parseLevel(OptLevel)) { + OLvl = *Level; + } else { + errs() << ExecName << ": invalid optimization level.\n"; + return 1; + } + ExitOnError ExitOnErr(std::string(ExecName) + ": error:"); + TM = ExitOnErr(codegen::createTargetMachineForTriple( + Triple::normalize(TargetTriple), OLvl)); + assert(TM && "Could not allocate target machine!"); + + // Make sure we print the summary and the current unit when LLVM errors out. 
+ install_fatal_error_handler(handleLLVMFatalError, nullptr); + + LLVMContext Context; + SMDiagnostic Err; + std::unique_ptr M = parseIRFile(InputFilename, Err, Context); + if (!M) { + errs() << "Invalid mod\n"; + return 1; + } + auto S = Triple::normalize(TargetTriple); + M->setTargetTriple(S); + + PassInstrumentationCallbacks PIC; + StandardInstrumentations SI(Context, false/*debug PM*/, + false); + registerCodeGenCallback(PIC, *TM); + + ModulePassManager MPM; + FunctionPassManager FPM; + //TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple())); + + MachineFunctionAnalysisManager MFAM; + LoopAnalysisManager LAM; + FunctionAnalysisManager FAM; + CGSCCAnalysisManager CGAM; + ModuleAnalysisManager MAM; + PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC); + PB.registerModuleAnalyses(MAM); + PB.registerCGSCCAnalyses(CGAM); + PB.registerFunctionAnalyses(FAM); + PB.registerLoopAnalyses(LAM); + PB.registerMachineFunctionAnalyses(MFAM); + PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); + + SI.registerCallbacks(PIC, &MAM); + + //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); + + Triple TT(M->getTargetTriple()); + if (!NoLint) + FPM.addPass(LintPass()); + if (!NoVerify) + MPM.addPass(VerifierPass()); + if (TT.isAMDGPU()) + FPM.addPass(AMDGPUTargetVerifierPass()); + else if (false) {} // ... 
+  else
+    FPM.addPass(TargetVerifierPass());
+  MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));
+
+  MPM.run(*M, MAM);
+
+  if (!M->IsValid)
+    return 1;
+
+  return 0;
+}

>From a808efce8d90524845a44ffa5b90adb6741e488d Mon Sep 17 00:00:00 2001
From: jofernau
Date: Mon, 3 Feb 2025 07:15:12 -0800
Subject: [PATCH 02/31] Add hook for target verifier in llc,opt

---
 .../llvm/Passes/StandardInstrumentations.h    |  6 ++++--
 llvm/include/llvm/Target/TargetVerifier.h     |  1 +
 .../TargetVerify/AMDGPUTargetVerifier.h       | 18 ++++++++++++++++++
 llvm/lib/LTO/LTOBackend.cpp                   |  2 +-
 llvm/lib/LTO/ThinLTOCodeGenerator.cpp         |  2 +-
 llvm/lib/Passes/CMakeLists.txt                |  1 +
 llvm/lib/Passes/PassBuilderBindings.cpp       |  2 +-
 llvm/lib/Passes/StandardInstrumentations.cpp  | 19 +++++++++++++++----
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp    | 12 ++++++------
 llvm/lib/Target/CMakeLists.txt                |  2 ++
 .../CodeGen/AMDGPU/tgt-verify-llc-fail.ll     |  6 ++++++
 .../CodeGen/AMDGPU/tgt-verify-llc-pass.ll     |  6 ++++++
 llvm/tools/llc/NewPMDriver.cpp                |  2 +-
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp |  2 +-
 llvm/tools/opt/NewPMDriver.cpp                |  2 +-
 llvm/unittests/IR/PassManagerTest.cpp         |  6 +++---
 16 files changed, 68 insertions(+), 21 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
 create mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll

diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h
index f7a65a88ecf5b..988fcb93b2357 100644
--- a/llvm/include/llvm/Passes/StandardInstrumentations.h
+++ b/llvm/include/llvm/Passes/StandardInstrumentations.h
@@ -476,7 +476,8 @@ class VerifyInstrumentation {
 public:
   VerifyInstrumentation(bool DebugLogging) : DebugLogging(DebugLogging) {}
   void registerCallbacks(PassInstrumentationCallbacks &PIC,
-                         ModuleAnalysisManager *MAM);
+                         ModuleAnalysisManager *MAM,
+                         FunctionAnalysisManager *FAM);
 };
 
 /// This class implements --time-trace functionality for new pass manager.
@@ -621,7 +622,8 @@ class StandardInstrumentations { // Register all the standard instrumentation callbacks. If \p FAM is nullptr // then PreservedCFGChecker is not enabled. void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM = nullptr); + ModuleAnalysisManager *MAM, + FunctionAnalysisManager *FAM); TimePassesHandler &getTimePasses() { return TimePasses; } }; diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index e00c6a7b260c9..ad5aeb895953d 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -75,6 +75,7 @@ class TargetVerify { MessagesStr(Messages) {} void run(Function &F) {}; + void run(Function &F, FunctionAnalysisManager &AM); }; } // namespace llvm diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index e6ff57629b141..d8a3fda4f87dc 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -22,6 +22,10 @@ #include "llvm/Target/TargetVerifier.h" +#include "llvm/Analysis/UniformityAnalysis.h" +#include "llvm/Analysis/PostDominators.h" +#include "llvm/IR/Dominators.h" + namespace llvm { class Function; @@ -31,6 +35,20 @@ class AMDGPUTargetVerifierPass : public TargetVerifierPass { PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); }; +class AMDGPUTargetVerify : public TargetVerify { +public: + Module *Mod; + + DominatorTree *DT; + PostDominatorTree *PDT; + UniformityInfo *UA; + + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) + : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} + + void run(Function &F); +}; + } // namespace llvm #endif // LLVM_AMDGPU_TARGET_VERIFIER_H diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp index 1c764a0188eda..475e7cf45371b 100644 --- 
a/llvm/lib/LTO/LTOBackend.cpp +++ b/llvm/lib/LTO/LTOBackend.cpp @@ -275,7 +275,7 @@ static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(Mod.getContext(), Conf.DebugPassManager, Conf.VerifyEach); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PassBuilder PB(TM, Conf.PTO, PGOOpt, &PIC); RegisterPassPlugins(Conf.PassPlugins, PB); diff --git a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp index 9e7f8187fe49c..369b003df1364 100644 --- a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp +++ b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp @@ -245,7 +245,7 @@ static void optimizeModule(Module &TheModule, TargetMachine &TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(TheModule.getContext(), DebugPassManager); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PipelineTuningOptions PTO; PTO.LoopVectorization = true; PTO.SLPVectorization = true; diff --git a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt index 6425f4934b210..f171377a8b270 100644 --- a/llvm/lib/Passes/CMakeLists.txt +++ b/llvm/lib/Passes/CMakeLists.txt @@ -29,6 +29,7 @@ add_llvm_component_library(LLVMPasses Scalar Support Target + TargetParser TransformUtils Vectorize Instrumentation diff --git a/llvm/lib/Passes/PassBuilderBindings.cpp b/llvm/lib/Passes/PassBuilderBindings.cpp index 933fe89e53a94..f0e1abb8cebc4 100644 --- a/llvm/lib/Passes/PassBuilderBindings.cpp +++ b/llvm/lib/Passes/PassBuilderBindings.cpp @@ -76,7 +76,7 @@ static LLVMErrorRef runPasses(Module *Mod, Function *Fun, const char *Passes, PB.crossRegisterProxies(LAM, FAM, CGAM, MAM); StandardInstrumentations SI(Mod->getContext(), Debug, VerifyEach); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); // Run the pipeline. 
if (Fun) { diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index dc1dd5d9c7f4c..7b15f89e361b8 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -45,6 +45,7 @@ #include "llvm/Support/Signals.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Support/xxhash.h" +#include "llvm/Target/TargetVerifier.h" #include #include #include @@ -1454,9 +1455,10 @@ void PreservedCFGCheckerInstrumentation::registerCallbacks( } void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM) { + ModuleAnalysisManager *MAM, + FunctionAnalysisManager *FAM) { PIC.registerAfterPassCallback( - [this, MAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { + [this, MAM, FAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { if (isIgnored(P) || P == "VerifierPass") return; const auto *F = unwrapIR(IR); @@ -1473,6 +1475,15 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, report_fatal_error(formatv("Broken function found after pass " "\"{0}\", compilation aborted!", P)); + + if (FAM) { + TargetVerify TV(const_cast(F->getParent())); + TV.run(*const_cast(F), *FAM); + if (!F->getParent()->IsValid) + report_fatal_error(formatv("Broken function found after pass " + "\"{0}\", compilation aborted!", + P)); + } } else { const auto *M = unwrapIR(IR); if (!M) { @@ -2512,7 +2523,7 @@ void PrintCrashIRInstrumentation::registerCallbacks( } void StandardInstrumentations::registerCallbacks( - PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM) { + PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM, FunctionAnalysisManager *FAM) { PrintIR.registerCallbacks(PIC); PrintPass.registerCallbacks(PIC); TimePasses.registerCallbacks(PIC); @@ -2521,7 +2532,7 @@ void StandardInstrumentations::registerCallbacks( PrintChangedIR.registerCallbacks(PIC); 
PseudoProbeVerification.registerCallbacks(PIC); if (VerifyEach) - Verify.registerCallbacks(PIC, MAM); + Verify.registerCallbacks(PIC, MAM, FAM); PrintChangedDiff.registerCallbacks(PIC); WebsiteChangeReporter.registerCallbacks(PIC); ChangeTester.registerCallbacks(PIC); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 585b19065c142..e6cdec7160229 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -14,8 +14,8 @@ using namespace llvm; -static cl::opt -MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); +//static cl::opt +//MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); // Check - We know that cond should be true, if not print an error message. #define Check(C, ...) \ @@ -81,7 +81,7 @@ static bool isMFMA(unsigned IID) { } namespace llvm { -class AMDGPUTargetVerify : public TargetVerify { +/*class AMDGPUTargetVerify : public TargetVerify { public: Module *Mod; @@ -93,7 +93,7 @@ class AMDGPUTargetVerify : public TargetVerify { : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} void run(Function &F); -}; +};*/ static bool IsValidInt(const Type *Ty) { return Ty->isIntegerTy(1) || @@ -129,8 +129,8 @@ void AMDGPUTargetVerify::run(Function &F) { for (auto &BB : F) { for (auto &I : BB) { - if (MarkUniform) - outs() << UA->isUniform(&I) << ' ' << I << '\n'; + //if (MarkUniform) + //outs() << UA->isUniform(&I) << ' ' << I << '\n'; // Ensure integral types are valid: i8, i16, i32, i64, i128 if (I.getType()->isIntegerTy()) diff --git a/llvm/lib/Target/CMakeLists.txt b/llvm/lib/Target/CMakeLists.txt index 9472288229cac..f2a5d545ce84f 100644 --- a/llvm/lib/Target/CMakeLists.txt +++ b/llvm/lib/Target/CMakeLists.txt @@ -7,6 +7,8 @@ add_llvm_component_library(LLVMTarget TargetLoweringObjectFile.cpp TargetMachine.cpp TargetMachineC.cpp + TargetVerifier.cpp + 
AMDGPU/AMDGPUTargetVerifier.cpp ADDITIONAL_HEADER_DIRS ${LLVM_MAIN_INCLUDE_DIR}/llvm/Target diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll new file mode 100644 index 0000000000000..584097d7bc134 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll @@ -0,0 +1,6 @@ +; RUN: not not llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each -o - < %s 2>&1 | FileCheck %s + +define amdgpu_cs i32 @nonvoid_shader() { +; CHECK: LLVM ERROR + ret i32 0 +} diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll new file mode 100644 index 0000000000000..0c3a5fe5ac4a5 --- /dev/null +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll @@ -0,0 +1,6 @@ +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each %s -o - 2>&1 | FileCheck %s + +define amdgpu_cs void @void_shader() { +; CHECK: ModuleToFunctionPassAdaptor + ret void +} diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index fa82689ecf9ae..a060d16e74958 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -126,7 +126,7 @@ int llvm::compileModuleWithNewPM( PB.registerLoopAnalyses(LAM); PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); MAM.registerPass([&] { return MachineModuleAnalysis(MMI); }); diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 68422abd6f4cc..3352d07deff2f 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -147,7 +147,7 @@ int main(int argc, char **argv) { PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - 
SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); diff --git a/llvm/tools/opt/NewPMDriver.cpp b/llvm/tools/opt/NewPMDriver.cpp index 7d168a6ceb17c..a8977d80bdf44 100644 --- a/llvm/tools/opt/NewPMDriver.cpp +++ b/llvm/tools/opt/NewPMDriver.cpp @@ -423,7 +423,7 @@ bool llvm::runPassPipeline( PrintPassOpts.SkipAnalyses = DebugPM == DebugLogging::Quiet; StandardInstrumentations SI(M.getContext(), DebugPM != DebugLogging::None, VK == VerifierKind::EachPass, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); DebugifyEachInstrumentation Debugify; DebugifyStatsMap DIStatsMap; DebugInfoPerPass DebugInfoBeforePass; diff --git a/llvm/unittests/IR/PassManagerTest.cpp b/llvm/unittests/IR/PassManagerTest.cpp index a6487169224c2..bb4db6120035f 100644 --- a/llvm/unittests/IR/PassManagerTest.cpp +++ b/llvm/unittests/IR/PassManagerTest.cpp @@ -828,7 +828,7 @@ TEST_F(PassManagerTest, FunctionPassCFGChecker) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -877,7 +877,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerInvalidateAnalysis) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -945,7 +945,7 @@ TEST_F(PassManagerTest, 
FunctionPassCFGCheckerWrapped) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); >From 64d001858efc994e965071cd319d268b934a6eb3 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 16 Apr 2025 10:19:00 -0400 Subject: [PATCH 03/31] Run AMDGPUTargetVerifier within AMDGPU pipeline. Move IsValid from Module to TargetVerify. --- clang/lib/CodeGen/BackendUtil.cpp | 2 +- llvm/include/llvm/IR/Module.h | 4 ---- llvm/include/llvm/Target/TargetVerifier.h | 2 ++ llvm/lib/IR/Verifier.cpp | 4 ++-- llvm/lib/Passes/StandardInstrumentations.cpp | 2 +- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 5 +++++ llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 3 ++- llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 6 +++--- 8 files changed, 16 insertions(+), 12 deletions(-) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index f7eb853beb23c..9a1c922f5ddef 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -922,7 +922,7 @@ void EmitAssemblyHelper::RunOptimizationPipeline( TheModule->getContext(), (CodeGenOpts.DebugPassManager || DebugPassStructure), CodeGenOpts.VerifyEach, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM); + SI.registerCallbacks(PIC, &MAM, &FAM); PassBuilder PB(TM.get(), PTO, PGOOpt, &PIC); // Handle the assignment tracking feature options. 
diff --git a/llvm/include/llvm/IR/Module.h b/llvm/include/llvm/IR/Module.h index 03c0cf1cf0924..91ccd76c41e07 100644 --- a/llvm/include/llvm/IR/Module.h +++ b/llvm/include/llvm/IR/Module.h @@ -214,10 +214,6 @@ class LLVM_ABI Module { /// @name Constructors /// @{ public: - /// Is this Module valid as determined by one of the verification passes - /// i.e. Lint, Verifier, TargetVerifier. - bool IsValid = true; - /// Is this Module using intrinsics to record the position of debugging /// information, or non-intrinsic records? See IsNewDbgInfoFormat in /// \ref BasicBlock. diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index ad5aeb895953d..2d0c039132c35 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -70,6 +70,8 @@ class TargetVerify { std::string Messages; raw_string_ostream MessagesStr; + bool IsValid = true; + TargetVerify(Module *Mod) : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())), MessagesStr(Messages) {} diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp index 9d21ca182ca13..d7c514610b4ba 100644 --- a/llvm/lib/IR/Verifier.cpp +++ b/llvm/lib/IR/Verifier.cpp @@ -7801,7 +7801,7 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F, PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { auto Res = AM.getResult(M); if (Res.IRBroken || Res.DebugInfoBroken) { - M.IsValid = false; + //M.IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken module found, compilation aborted!"); } @@ -7812,7 +7812,7 @@ PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) { PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) { auto res = AM.getResult(F); if (res.IRBroken) { - F.getParent()->IsValid = false; + //F.getParent()->IsValid = false; if (VerifyAbortOnError && FatalErrors) report_fatal_error("Broken function found, compilation aborted!"); } diff 
--git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index 7b15f89e361b8..879d657c87695 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -1479,7 +1479,7 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, if (FAM) { TargetVerify TV(const_cast(F->getParent())); TV.run(*const_cast(F), *FAM); - if (!F->getParent()->IsValid) + if (!TV.IsValid) report_fatal_error(formatv("Broken function found after pass " "\"{0}\", compilation aborted!", P)); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 90e3489ced923..6ec34d6a0fdbf 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -90,6 +90,7 @@ #include "llvm/MC/TargetRegistry.h" #include "llvm/Passes/PassBuilder.h" #include "llvm/Support/FormatVariadic.h" +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "llvm/Transforms/HipStdPar/HipStdPar.h" #include "llvm/Transforms/IPO.h" #include "llvm/Transforms/IPO/AlwaysInliner.h" @@ -1298,6 +1299,8 @@ void AMDGPUPassConfig::addIRPasses() { addPass(createLICMPass()); } + //addPass(AMDGPUTargetVerifierPass()); + TargetPassConfig::addIRPasses(); // EarlyCSE is not always strong enough to clean up what LSR produces. For @@ -2040,6 +2043,8 @@ void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { // but EarlyCSE can do neither of them. 
if (isPassEnabled(EnableScalarIRPasses)) addEarlyCSEOrGVNPass(addPass); + + addPass(AMDGPUTargetVerifierPass()); } void AMDGPUCodeGenPassBuilder::addCodeGenPrepare(AddIRPass &addPass) const { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index e6cdec7160229..c70a6d1b6fa66 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -205,7 +205,8 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan dbgs() << TV.MessagesStr.str(); if (!TV.MessagesStr.str().empty()) { - F.getParent()->IsValid = false; + TV.IsValid = false; + return PreservedAnalyses::none(); } return PreservedAnalyses::all(); diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index 3352d07deff2f..fbe7f6089ff18 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -163,9 +163,9 @@ int main(int argc, char **argv) { FPM.addPass(TargetVerifierPass()); MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); - MPM.run(*M, MAM); - - if (!M->IsValid) + auto PA = MPM.run(*M, MAM); + auto PAC = PA.getChecker(); + if (!PAC.preserved()) return 1; return 0; >From fdae3025942584d0085deb3442f40471548defe5 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 16 Apr 2025 11:08:20 -0400 Subject: [PATCH 04/31] Remove cmd line options that aren't required. Make error message explicit. 
---
 llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll | 4 ++--
 llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
index 584097d7bc134..c5e59d4a2369e 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
@@ -1,6 +1,6 @@
-; RUN: not not llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each -o - < %s 2>&1 | FileCheck %s
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm -o - < %s 2>&1 | FileCheck %s
 
 define amdgpu_cs i32 @nonvoid_shader() {
-; CHECK: LLVM ERROR
+; CHECK: Shaders must return void
   ret i32 0
 }
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
index 0c3a5fe5ac4a5..8a503b7624a73 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
@@ -1,6 +1,6 @@
-; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs -enable-new-pm -verify-each %s -o - 2>&1 | FileCheck %s
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm %s -o - 2>&1 | FileCheck %s --allow-empty
 
 define amdgpu_cs void @void_shader() {
-; CHECK: ModuleToFunctionPassAdaptor
+; CHECK-NOT: Shaders must return void
   ret void
 }

>From 5ceda58cc5b5d7372c6e43cbdf583f0dda87b956 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 19 Apr 2025 19:36:34 -0400
Subject: [PATCH 05/31] Return Verifier none status through PreservedAnalyses on fail.
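
The convention this patch relies on (a verifier pass returns PreservedAnalyses::none() on failure and the driver turns a non-preserving result into exit code 1) can be sketched in isolation. The following is only a minimal illustration, not LLVM code: MockPreserved, runMockVerifier and exitCodeFor are hypothetical names, and the mock models just the two states the patch distinguishes.

```cpp
#include <string>

// Hypothetical stand-in for llvm::PreservedAnalyses. Only the "all" and
// "none" states used by the patch are modelled here.
class MockPreserved {
  bool AllPreserved;
  explicit MockPreserved(bool All) : AllPreserved(All) {}

public:
  static MockPreserved all() { return MockPreserved(true); }
  static MockPreserved none() { return MockPreserved(false); }
  bool preservedEverything() const { return AllPreserved; }
};

// Hypothetical verifier pass: any recorded diagnostic text counts as a
// failure and is reported by returning none().
inline MockPreserved runMockVerifier(const std::string &Messages) {
  if (!Messages.empty())
    return MockPreserved::none();
  return MockPreserved::all();
}

// Mirrors the driver logic in the patch: a result that does not preserve
// everything becomes a non-zero exit code.
inline int exitCodeFor(const std::string &Messages) {
  return runMockVerifier(Messages).preservedEverything() ? 0 : 1;
}
```

Under these assumptions, exitCodeFor("") yields 0 and exitCodeFor("Shaders must return void") yields 1. In LLVM proper the returned set also drives analysis invalidation, so reusing it as a status channel is a design choice of this patch series rather than an established API contract.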
---
 llvm/lib/Analysis/Lint.cpp | 4 +++-
 llvm/lib/IR/Verifier.cpp | 2 ++
 llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 8 +++++---
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Analysis/Lint.cpp b/llvm/lib/Analysis/Lint.cpp
index f05e36e2025d4..c8e38963e5974 100644
--- a/llvm/lib/Analysis/Lint.cpp
+++ b/llvm/lib/Analysis/Lint.cpp
@@ -742,9 +742,11 @@ PreservedAnalyses LintPass::run(Function &F, FunctionAnalysisManager &AM) {
   Lint L(Mod, DL, AA, AC, DT, TLI);
   L.visit(F);
   dbgs() << L.MessagesStr.str();
-  if (AbortOnError && !L.MessagesStr.str().empty())
+  if (AbortOnError && !L.MessagesStr.str().empty()) {
     report_fatal_error(
         "linter found errors, aborting. (enabled by abort-on-error)", false);
+    return PreservedAnalyses::none();
+  }
   return PreservedAnalyses::all();
 }
 
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index d7c514610b4ba..51f6dec53b70f 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -7804,6 +7804,7 @@ PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) {
     //M.IsValid = false;
     if (VerifyAbortOnError && FatalErrors)
       report_fatal_error("Broken module found, compilation aborted!");
+    return PreservedAnalyses::none();
   }
 
   return PreservedAnalyses::all();
@@ -7815,6 +7816,7 @@ PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
     //F.getParent()->IsValid = false;
     if (VerifyAbortOnError && FatalErrors)
       report_fatal_error("Broken function found, compilation aborted!");
+    return PreservedAnalyses::none();
   }
 
   return PreservedAnalyses::all();
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
index fbe7f6089ff18..042824ac37fea 100644
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
@@ -164,9 +164,11 @@ int main(int argc, char **argv) {
   MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));
 
   auto PA = MPM.run(*M, MAM);
-  auto PAC = 
PA.getChecker();
-  if (!PAC.preserved())
-    return 1;
+  {
+    auto PAC = PA.getChecker();
+    if (!PAC.preserved())
+      return 1;
+  }
 
   return 0;
 }

>From 99c29069cdaf68c92ce7f25ca2f730bf738ca324 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 19 Apr 2025 21:16:02 -0400
Subject: [PATCH 06/31] Rebase update.
---
 llvm/include/llvm/Target/TargetVerifier.h | 2 +-
 llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index 2d0c039132c35..fe683311b901c 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -73,7 +73,7 @@ class TargetVerify {
   bool IsValid = true;
 
   TargetVerify(Module *Mod)
-      : Mod(Mod), TT(Triple::normalize(Mod->getTargetTriple())),
+      : Mod(Mod), TT(Mod->getTargetTriple()),
         MessagesStr(Messages) {}
 
   void run(Function &F) {};
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
index 042824ac37fea..627bc51ef3a43 100644
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
@@ -123,7 +123,7 @@ int main(int argc, char **argv) {
     return 1;
   }
   auto S = Triple::normalize(TargetTriple);
-  M->setTargetTriple(S);
+  M->setTargetTriple(Triple(S));
 
   PassInstrumentationCallbacks PIC;
   StandardInstrumentations SI(Context, false/*debug PM*/,
@@ -153,7 +153,7 @@ int main(int argc, char **argv) {
 
   Triple TT(M->getTargetTriple());
   if (!NoLint)
-    FPM.addPass(LintPass());
+    FPM.addPass(LintPass(false));
   if (!NoVerify)
     MPM.addPass(VerifierPass());
   if (TT.isAMDGPU())

>From 3ea7eae48a6addbf711716e7a819830dddc1b34a Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 19 Apr 2025 22:49:52 -0400
Subject: [PATCH 07/31] Add generic TargetVerifier.
---
 llvm/lib/Target/TargetVerifier.cpp | 32 ++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 llvm/lib/Target/TargetVerifier.cpp

diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
new file mode 100644
index 0000000000000..de3ff749e7c3c
--- /dev/null
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -0,0 +1,32 @@
+#include "llvm/Target/TargetVerifier.h"
+#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
+
+#include "llvm/Analysis/UniformityAnalysis.h"
+#include "llvm/Analysis/PostDominators.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/IR/Dominators.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/IntrinsicsAMDGPU.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/Value.h"
+
+namespace llvm {
+
+void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
+  if (TT.isAMDGPU()) {
+    auto *UA = &AM.getResult(F);
+    auto *DT = &AM.getResult(F);
+    auto *PDT = &AM.getResult(F);
+
+    AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
+    TV.run(F);
+
+    dbgs() << TV.MessagesStr.str();
+    if (!TV.MessagesStr.str().empty()) {
+      TV.IsValid = false;
+    }
+  }
+}
+
+} // namespace llvm

>From f52c4dbc84952d97266f5f4158729e564de10240 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 19 Apr 2025 23:13:14 -0400
Subject: [PATCH 08/31] Remove store to const check since it is in Lint already
---
 llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 8 --------
 llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 2 --
 2 files changed, 10 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index c70a6d1b6fa66..1cf2b277bee26 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -140,14 +140,6 @@ void AMDGPUTargetVerify::run(Function &F) {
           Check(IsValidInt(I.getOperand(i)->getType()),
                 "Int type is invalid.", I.getOperand(i));
 
-      // Ensure no store to const memory
-      if 
(auto *SI = dyn_cast(&I))
-      {
-        unsigned AS = SI->getPointerAddressSpace();
-        Check(AS != 4, "Write to const memory", SI);
-      }
-
-      // Ensure no kernel to kernel calls.
       if (auto *CI = dyn_cast(&I))
       {
         CallingConv::ID CalleeCC = CI->getCallingConv();
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
index f56ff992a56c2..c628abbde11d1 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
@@ -32,8 +32,6 @@ define amdgpu_cs i32 @shader() {
 
 define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) {
 ; CHECK: Undefined behavior: Write to memory in const addrspace
-; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4
-; CHECK-NEXT: Write to const memory
 ; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4
   %r = add i32 %a, %b
   store i32 %r, ptr addrspace(4) %out

>From 5c9a4ab3895d6939b12386d1db2081ca388df01a Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 19 Apr 2025 23:14:38 -0400
Subject: [PATCH 09/31] Add chain followed by unreachable check
---
 llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 6 ++++++
 llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 10 ++++++++++
 2 files changed, 16 insertions(+)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 1cf2b277bee26..8ea773bc0e66f 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -142,6 +142,7 @@ void AMDGPUTargetVerify::run(Function &F) {
 
       if (auto *CI = dyn_cast(&I))
       {
+        // Ensure no kernel to kernel calls.
         CallingConv::ID CalleeCC = CI->getCallingConv();
         if (CalleeCC == CallingConv::AMDGPU_KERNEL)
         {
@@ -149,6 +150,11 @@ void AMDGPUTargetVerify::run(Function &F) {
           Check(CallerCC != CallingConv::AMDGPU_KERNEL,
             "A kernel may not call a kernel", CI->getParent()->getParent());
         }
+
+        // Ensure chain intrinsics are followed by unreachables.
+ if (CI->getIntrinsicID() == Intrinsic::amdgcn_cs_chain) + Check(isa_and_present(CI->getNextNode()), + "llvm.amdgcn.cs.chain must be followed by unreachable", CI); } // Ensure MFMA is not in control flow with diverging operands diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll index c628abbde11d1..e620df94ccde4 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -58,3 +58,13 @@ entry: %tmp2 = ashr i65 %x, 64 ret i65 %tmp2 } + +declare void @llvm.amdgcn.cs.chain.v3i32(ptr, i32, <3 x i32>, <3 x i32>, i32, ...) +declare amdgpu_cs_chain void @chain_callee(<3 x i32> inreg, <3 x i32>) + +define amdgpu_cs void @no_unreachable(<3 x i32> inreg %a, <3 x i32> %b) { +; CHECK: llvm.amdgcn.cs.chain must be followed by unreachable +; CHECK-NEXT: call void (ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.p0.i32.v3i32.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0) + call void(ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0) + ret void +} >From 0ff03f792c018e4fd0c11de9da4d3353617707f5 Mon Sep 17 00:00:00 2001 From: jofernau Date: Sat, 19 Apr 2025 23:26:19 -0400 Subject: [PATCH 10/31] Remove mfma check --- .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 89 ------------------- llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 25 ------ 2 files changed, 114 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 8ea773bc0e66f..684ced5bba574 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -14,9 +14,6 @@ using namespace llvm; -//static cl::opt -//MarkUniform("mark-uniform", cl::desc("Mark instructions as uniform"), cl::init(false)); - // Check - We know that cond should be true, if not print an error message. #define Check(C, ...) 
\ do { \ @@ -26,60 +23,6 @@ using namespace llvm; } \ } while (false) -static bool isMFMA(unsigned IID) { - switch (IID) { - case Intrinsic::amdgcn_mfma_f32_4x4x1f32: - case Intrinsic::amdgcn_mfma_f32_4x4x4f16: - case Intrinsic::amdgcn_mfma_i32_4x4x4i8: - case Intrinsic::amdgcn_mfma_f32_4x4x2bf16: - - case Intrinsic::amdgcn_mfma_f32_16x16x1f32: - case Intrinsic::amdgcn_mfma_f32_16x16x4f32: - case Intrinsic::amdgcn_mfma_f32_16x16x4f16: - case Intrinsic::amdgcn_mfma_f32_16x16x16f16: - case Intrinsic::amdgcn_mfma_i32_16x16x4i8: - case Intrinsic::amdgcn_mfma_i32_16x16x16i8: - case Intrinsic::amdgcn_mfma_f32_16x16x2bf16: - case Intrinsic::amdgcn_mfma_f32_16x16x8bf16: - - case Intrinsic::amdgcn_mfma_f32_32x32x1f32: - case Intrinsic::amdgcn_mfma_f32_32x32x2f32: - case Intrinsic::amdgcn_mfma_f32_32x32x4f16: - case Intrinsic::amdgcn_mfma_f32_32x32x8f16: - case Intrinsic::amdgcn_mfma_i32_32x32x4i8: - case Intrinsic::amdgcn_mfma_i32_32x32x8i8: - case Intrinsic::amdgcn_mfma_f32_32x32x2bf16: - case Intrinsic::amdgcn_mfma_f32_32x32x4bf16: - - case Intrinsic::amdgcn_mfma_f32_4x4x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_16x16x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_16x16x16bf16_1k: - case Intrinsic::amdgcn_mfma_f32_32x32x4bf16_1k: - case Intrinsic::amdgcn_mfma_f32_32x32x8bf16_1k: - - case Intrinsic::amdgcn_mfma_f64_16x16x4f64: - case Intrinsic::amdgcn_mfma_f64_4x4x4f64: - - case Intrinsic::amdgcn_mfma_i32_16x16x32_i8: - case Intrinsic::amdgcn_mfma_i32_32x32x16_i8: - case Intrinsic::amdgcn_mfma_f32_16x16x8_xf32: - case Intrinsic::amdgcn_mfma_f32_32x32x4_xf32: - - case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_bf8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_bf8_fp8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_bf8: - case Intrinsic::amdgcn_mfma_f32_16x16x32_fp8_fp8: - - case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_bf8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_bf8_fp8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_bf8: - case Intrinsic::amdgcn_mfma_f32_32x32x16_fp8_fp8: - 
return true; - default: - return false; - } -} - namespace llvm { /*class AMDGPUTargetVerify : public TargetVerify { public: @@ -129,8 +72,6 @@ void AMDGPUTargetVerify::run(Function &F) { for (auto &BB : F) { for (auto &I : BB) { - //if (MarkUniform) - //outs() << UA->isUniform(&I) << ' ' << I << '\n'; // Ensure integral types are valid: i8, i16, i32, i64, i128 if (I.getType()->isIntegerTy()) @@ -156,36 +97,6 @@ void AMDGPUTargetVerify::run(Function &F) { Check(isa_and_present(CI->getNextNode()), "llvm.amdgcn.cs.chain must be followed by unreachable", CI); } - - // Ensure MFMA is not in control flow with diverging operands - if (auto *II = dyn_cast(&I)) { - if (isMFMA(II->getIntrinsicID())) { - bool InControlFlow = false; - for (const auto &P : predecessors(&BB)) - if (!PDT->dominates(&BB, P)) { - InControlFlow = true; - break; - } - for (const auto &S : successors(&BB)) - if (!DT->dominates(&BB, S)) { - InControlFlow = true; - break; - } - if (InControlFlow) { - // If operands to MFMA are not uniform, MFMA cannot be in control flow - bool hasUniformOperands = true; - for (unsigned i = 0; i < II->getNumOperands(); i++) { - if (!UA->isUniform(II->getOperand(i))) { - dbgs() << "Not uniform: " << *II->getOperand(i) << '\n'; - hasUniformOperands = false; - } - } - if (!hasUniformOperands) Check(false, "MFMA in control flow", II); - //else Check(false, "MFMA in control flow (uniform operands)", II); - } - //else Check(false, "MFMA not in control flow", II); - } - } } } } diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll index e620df94ccde4..62b220d7d9f49 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll @@ -1,30 +1,5 @@ ; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s -define amdgpu_kernel void @test_mfma_f32_32x32x1f32_vecarg(ptr addrspace(1) %arg) #0 { -; CHECK: Not uniform: %in.f32 = load <32 x float>, ptr addrspace(1) %gep, align 128 -; CHECK-NEXT: MFMA in control 
flow -; CHECK-NEXT: %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) -s: - %tid = call i32 @llvm.amdgcn.workitem.id.x() - %gep = getelementptr inbounds <32 x float>, ptr addrspace(1) %arg, i32 %tid - %in.i32 = load <32 x i32>, ptr addrspace(1) %gep - %in.f32 = load <32 x float>, ptr addrspace(1) %gep - - %0 = icmp eq <32 x i32> %in.i32, zeroinitializer - %div.br = extractelement <32 x i1> %0, i32 0 - br i1 %div.br, label %if.3, label %else.0 - -if.3: - br label %join - -else.0: - %mfma = tail call <32 x float> @llvm.amdgcn.mfma.f32.32x32x1f32(float 1.000000e+00, float 2.000000e+00, <32 x float> %in.f32, i32 1, i32 2, i32 3) - br label %join - -join: - ret void -} - define amdgpu_cs i32 @shader() { ; CHECK: Shaders must return void ret i32 0 >From 6b84c73a35a260d64ed45df90052f8212b0ee4e7 Mon Sep 17 00:00:00 2001 From: jofernau Date: Mon, 21 Apr 2025 20:54:10 -0400 Subject: [PATCH 11/31] Add registerVerifierPasses to PassBuilder and add the verifier passes to PassRegistry. 
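
The registration scheme named in this subject line (callbacks are stored on the builder, then replayed into a module and a function pass manager) can be illustrated without LLVM. This is a rough sketch under stated assumptions: MockPassManager and MockPassBuilder are hypothetical stand-ins, and the callback signature void(MockPassManager &) is an assumption, since the template arguments of the real std::function types are not visible in this archive.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pass manager: it only records pass names.
struct MockPassManager {
  std::vector<std::string> Passes;
  void addPass(const std::string &Name) { Passes.push_back(Name); }
};

// Hypothetical stand-in for the PassBuilder additions in this patch.
class MockPassBuilder {
  // Assumed callback shape; the real signature is not recoverable here.
  using Callback = std::function<void(MockPassManager &)>;
  std::vector<Callback> VerifierCallbacks;   // module-level verifiers
  std::vector<Callback> FnVerifierCallbacks; // function-level verifiers

public:
  // Store one module-level and one function-level callback together.
  void registerVerifierCallback(Callback C, Callback CF) {
    VerifierCallbacks.push_back(std::move(C));
    FnVerifierCallbacks.push_back(std::move(CF));
  }

  // Replay every stored callback against the matching pass manager.
  void registerVerifierPasses(MockPassManager &MPM, MockPassManager &FPM) {
    for (auto &C : VerifierCallbacks)
      C(MPM);
    for (auto &C : FnVerifierCallbacks)
      C(FPM);
  }
};
```

A target would call registerVerifierCallback once with lambdas that add its verifier passes, and a tool would later call registerVerifierPasses to populate its own pass managers. Keeping the two callback lists separate mirrors the VerifierCallbacks and FnVerifierCallbacks members introduced by the patch.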
--- llvm/include/llvm/InitializePasses.h | 1 + llvm/include/llvm/Passes/PassBuilder.h | 21 +++++++ .../llvm/Passes/TargetPassRegistry.inc | 12 ++++ .../TargetVerify/AMDGPUTargetVerifier.h | 11 ++-- llvm/lib/Passes/PassBuilder.cpp | 7 +++ llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 11 ++++ .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 56 ++++++++++++++++++- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 1 + 8 files changed, 114 insertions(+), 6 deletions(-) diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 9bef8e496c57e..ae398db3dc1da 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -317,6 +317,7 @@ void initializeUnpackMachineBundlesPass(PassRegistry &); void initializeUnreachableBlockElimLegacyPassPass(PassRegistry &); void initializeUnreachableMachineBlockElimLegacyPass(PassRegistry &); void initializeVerifierLegacyPassPass(PassRegistry &); +void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); void initializeVirtRegMapWrapperLegacyPass(PassRegistry &); void initializeVirtRegRewriterPass(PassRegistry &); void initializeWasmEHPreparePass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/PassBuilder.h b/llvm/include/llvm/Passes/PassBuilder.h index 51ccaa53447d7..6000769ce723b 100644 --- a/llvm/include/llvm/Passes/PassBuilder.h +++ b/llvm/include/llvm/Passes/PassBuilder.h @@ -172,6 +172,13 @@ class PassBuilder { /// additional analyses. void registerLoopAnalyses(LoopAnalysisManager &LAM); + /// Registers all available verifier passes. + /// + /// This is an interface that can be used to populate a + /// \c ModuleAnalysisManager with all registered loop analyses. Callers can + /// still manually register any additional analyses. + void registerVerifierPasses(ModulePassManager &PM, FunctionPassManager &); + /// Registers all available machine function analysis passes. 
   ///
   /// This is an interface that can be used to populate a \c
@@ -570,6 +577,15 @@ class PassBuilder {
   }
   /// @}}
 
+  /// Register a callback for parsing an Verifier Name to populate
+  /// the given managers.
+  void registerVerifierCallback(
+      const std::function &C,
+      const std::function &CF) {
+    VerifierCallbacks.push_back(C);
+    FnVerifierCallbacks.push_back(CF);
+  }
+
   /// {{@ Register pipeline parsing callbacks with this pass builder instance.
   /// Using these callbacks, callers can parse both a single pass name, as well
   /// as entire sub-pipelines, and populate the PassManager instance
@@ -841,6 +857,11 @@ class PassBuilder {
   // Callbacks to parse `filter` parameter in register allocation passes
   SmallVector, 2>
       RegClassFilterParsingCallbacks;
+  // Verifier callbacks
+  SmallVector, 2>
+      VerifierCallbacks;
+  SmallVector, 2>
+      FnVerifierCallbacks;
 };
 
 /// This utility template takes care of adding require<> and invalidate<>
diff --git a/llvm/include/llvm/Passes/TargetPassRegistry.inc b/llvm/include/llvm/Passes/TargetPassRegistry.inc
index 521913cb25a4a..2d04b874cf360 100644
--- a/llvm/include/llvm/Passes/TargetPassRegistry.inc
+++ b/llvm/include/llvm/Passes/TargetPassRegistry.inc
@@ -151,6 +151,18 @@ PB.registerPipelineParsingCallback([=](StringRef Name, FunctionPassManager &PM,
   return false;
 });
 
+PB.registerVerifierCallback([](ModulePassManager &PM) {
+#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) PM.addPass(CREATE_PASS)
+#include GET_PASS_REGISTRY
+#undef VERIFIER_MODULE_ANALYSIS
+  return false;
+}, [](FunctionPassManager &FPM) {
+#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) FPM.addPass(CREATE_PASS)
+#include GET_PASS_REGISTRY
+#undef VERIFIER_FUNCTION_ANALYSIS
+  return false;
+});
+
 #undef ADD_PASS
 #undef ADD_PASS_WITH_PARAMS
 
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
index d8a3fda4f87dc..b6a7412e8c1ef 100644
--- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -39,14 +39,17 @@ class AMDGPUTargetVerify : public TargetVerify {
 public:
   Module *Mod;
 
-  DominatorTree *DT;
-  PostDominatorTree *PDT;
-  UniformityInfo *UA;
+  DominatorTree *DT = nullptr;
+  PostDominatorTree *PDT = nullptr;
+  UniformityInfo *UA = nullptr;
+
+  AMDGPUTargetVerify(Module *Mod)
+    : TargetVerify(Mod), Mod(Mod) {}
 
   AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA)
     : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {}
 
-  void run(Function &F);
+  bool run(Function &F);
 };
 
 } // namespace llvm
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index e7057d9a6b625..e942fed8b6a72 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -582,6 +582,13 @@ void PassBuilder::registerLoopAnalyses(LoopAnalysisManager &LAM) {
     C(LAM);
 }
 
+void PassBuilder::registerVerifierPasses(ModulePassManager &MPM, FunctionPassManager &FPM) {
+  for (auto &C : VerifierCallbacks)
+    C(MPM);
+  for (auto &C : FnVerifierCallbacks)
+    C(FPM);
+}
+
 static std::optional>
 parseFunctionPipelineName(StringRef Name) {
   std::pair Params;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index 98a1147ef6d66..41e6a399c7239 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -81,6 +81,17 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA())
 #undef FUNCTION_ALIAS_ANALYSIS
 #undef FUNCTION_ANALYSIS
 
+#ifndef VERIFIER_MODULE_ANALYSIS
+#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS)
+#endif
+#ifndef VERIFIER_FUNCTION_ANALYSIS
+#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS)
+#endif
+VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass())
+VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass())
+#undef VERIFIER_MODULE_ANALYSIS
+#undef VERIFIER_FUNCTION_ANALYSIS
+
 #ifndef FUNCTION_PASS_WITH_PARAMS
 #define FUNCTION_PASS_WITH_PARAMS(NAME, CLASS, CREATE_PASS, PARSER, PARAMS)
 #endif
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 684ced5bba574..63a7526b9abdc 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -5,6 +5,7 @@
 #include "llvm/Support/Debug.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/Function.h"
+#include "llvm/InitializePasses.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/IntrinsicsAMDGPU.h"
 #include "llvm/IR/Module.h"
@@ -19,7 +20,7 @@ using namespace llvm;
   do { \
     if (!(C)) { \
       TargetVerify::CheckFailed(__VA_ARGS__); \
-      return; \
+      return false; \
     } \
   } while (false)
 
@@ -64,7 +65,7 @@ static bool isShader(CallingConv::ID CC) {
   }
 }
 
-void AMDGPUTargetVerify::run(Function &F) {
+bool AMDGPUTargetVerify::run(Function &F) {
   // Ensure shader calling convention returns void
   if (isShader(F.getCallingConv()))
     Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void");
@@ -99,6 +100,10 @@ void AMDGPUTargetVerify::run(Function &F) {
       }
     }
   }
+
+  if (!MessagesStr.str().empty())
+    return false;
+  return true;
 }
 
 PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
@@ -120,4 +125,51 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan
   return PreservedAnalyses::all();
 }
 
+
+struct AMDGPUTargetVerifierLegacyPass : public FunctionPass {
+  static char ID;
+
+  std::unique_ptr TV;
+  bool FatalErrors = true;
+
+  AMDGPUTargetVerifierLegacyPass() : FunctionPass(ID) {
+    initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
+  }
+  AMDGPUTargetVerifierLegacyPass(bool FatalErrors)
+    : FunctionPass(ID),
+      FatalErrors(FatalErrors) {
+    initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
+  }
+
+  bool doInitialization(Module &M) override {
+    TV = std::make_unique(&M);
+    return false;
+  }
+
+  bool runOnFunction(Function &F) override {
+    if (TV->run(F) && FatalErrors) {
+      errs() << "in function " << F.getName() << '\n';
+      report_fatal_error("Broken function found, compilation aborted!");
+    }
+    return false;
+  }
+
+  bool doFinalization(Module &M) override {
+    bool IsValid = true;
+    for (Function &F : M)
+      if (F.isDeclaration())
+        IsValid &= TV->run(F);
+
+    //IsValid &= TV->run();
+    if (FatalErrors && !IsValid)
+      report_fatal_error("Broken module found, compilation aborted!");
+    return false;
+  }
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.setPreservesAll();
+  }
+};
+char AMDGPUTargetVerifierLegacyPass::ID = 0;
 } // namespace llvm
+INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverify", "AMDGPU Target Verifier", false, false)
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
index 627bc51ef3a43..503db7b1f8d18 100644
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
@@ -144,6 +144,7 @@ int main(int argc, char **argv) {
   PB.registerCGSCCAnalyses(CGAM);
   PB.registerFunctionAnalyses(FAM);
   PB.registerLoopAnalyses(LAM);
+  //PB.registerVerifierPasses(MPM, FPM);
   PB.registerMachineFunctionAnalyses(MFAM);
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
 

From ec3276b182f3a758a24024291772efe435485857 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Tue, 22 Apr 2025 14:57:31 -0400
Subject: [PATCH 12/31] Remove leftovers. Add titles. Add call to registerVerifierCallbacks in llc.
---
 llvm/lib/Passes/CMakeLists.txt                |  2 +-
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |  4 ---
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp    | 35 +++++++++++--------
 llvm/lib/Target/TargetVerifier.cpp            | 19 ++++++++++
 llvm/tools/llc/NewPMDriver.cpp                |  6 ++--
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp |  7 ++--
 6 files changed, 45 insertions(+), 28 deletions(-)

diff --git a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt
index f171377a8b270..9c348cb89a8c5 100644
--- a/llvm/lib/Passes/CMakeLists.txt
+++ b/llvm/lib/Passes/CMakeLists.txt
@@ -29,7 +29,7 @@ add_llvm_component_library(LLVMPasses
   Scalar
   Support
   Target
-  TargetParser
+  #TargetParser
   TransformUtils
   Vectorize
   Instrumentation
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 6ec34d6a0fdbf..257cc724b3da9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1299,8 +1299,6 @@ void AMDGPUPassConfig::addIRPasses() {
     addPass(createLICMPass());
   }
 
-  //addPass(AMDGPUTargetVerifierPass());
-
   TargetPassConfig::addIRPasses();
 
   // EarlyCSE is not always strong enough to clean up what LSR produces. For
@@ -2043,8 +2041,6 @@ void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const {
   // but EarlyCSE can do neither of them.
   if (isPassEnabled(EnableScalarIRPasses))
     addEarlyCSEOrGVNPass(addPass);
-
-  addPass(AMDGPUTargetVerifierPass());
 }
 
 void AMDGPUCodeGenPassBuilder::addCodeGenPrepare(AddIRPass &addPass) const {
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 63a7526b9abdc..0eecedaebc7ce 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -1,3 +1,22 @@
+//===-- AMDGPUTargetVerifier.cpp - AMDGPU -------------------------*- C++ -*-===//
+////
+//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+//// See https://llvm.org/LICENSE.txt for license information.
+//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+////
+////===----------------------------------------------------------------------===//
+////
+//// This file defines target verifier interfaces that can be used for some
+//// validation of input to the system, and for checking that transformations
+//// haven't done something bad. In contrast to the Verifier or Lint, the
+//// TargetVerifier looks for constructions invalid to a particular target
+//// machine.
+////
+//// To see what specifically is checked, look at an individual backend's
+//// TargetVerifier.
+////
+////===----------------------------------------------------------------------===//
+
 #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
 
 #include "llvm/Analysis/UniformityAnalysis.h"
@@ -25,19 +44,6 @@ using namespace llvm;
   } while (false)
 
 namespace llvm {
-/*class AMDGPUTargetVerify : public TargetVerify {
-public:
-  Module *Mod;
-
-  DominatorTree *DT;
-  PostDominatorTree *PDT;
-  UniformityInfo *UA;
-
-  AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA)
-    : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {}
-
-  void run(Function &F);
-};*/
 
 static bool IsValidInt(const Type *Ty) {
   return Ty->isIntegerTy(1) ||
@@ -147,7 +153,7 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass {
   }
 
   bool runOnFunction(Function &F) override {
-    if (TV->run(F) && FatalErrors) {
+    if (!TV->run(F) && FatalErrors) {
       errs() << "in function " << F.getName() << '\n';
       report_fatal_error("Broken function found, compilation aborted!");
     }
@@ -160,7 +166,6 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass {
       if (F.isDeclaration())
         IsValid &= TV->run(F);
 
-    //IsValid &= TV->run();
     if (FatalErrors && !IsValid)
       report_fatal_error("Broken module found, compilation aborted!");
     return false;
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index de3ff749e7c3c..992a0c91d93b1 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -1,3 +1,22 @@
+//===-- TargetVerifier.cpp - LLVM IR Target Verifier ----------------*- C++ -*-===//
+////
+///// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+///// See https://llvm.org/LICENSE.txt for license information.
+///// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+/////
+/////===----------------------------------------------------------------------===//
+/////
+///// This file defines target verifier interfaces that can be used for some
+///// validation of input to the system, and for checking that transformations
+///// haven't done something bad. In contrast to the Verifier or Lint, the
+///// TargetVerifier looks for constructions invalid to a particular target
+///// machine.
+/////
+///// To see what specifically is checked, look at TargetVerifier.cpp or an
+///// individual backend's TargetVerifier.
+/////
+/////===----------------------------------------------------------------------===//
+
 #include "llvm/Target/TargetVerifier.h"
 #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
 
diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp
index a060d16e74958..a8f6b999af06e 100644
--- a/llvm/tools/llc/NewPMDriver.cpp
+++ b/llvm/tools/llc/NewPMDriver.cpp
@@ -114,6 +114,8 @@ int llvm::compileModuleWithNewPM(
       VK == VerifierKind::EachPass);
   registerCodeGenCallback(PIC, *Target);
 
+  ModulePassManager MPM;
+  FunctionPassManager FPM;
   MachineFunctionAnalysisManager MFAM;
   LoopAnalysisManager LAM;
   FunctionAnalysisManager FAM;
@@ -125,15 +127,13 @@ int llvm::compileModuleWithNewPM(
   PB.registerFunctionAnalyses(FAM);
   PB.registerLoopAnalyses(LAM);
   PB.registerMachineFunctionAnalyses(MFAM);
+  PB.registerVerifierPasses(MPM, FPM);
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
   SI.registerCallbacks(PIC, &MAM, &FAM);
 
   FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); });
   MAM.registerPass([&] { return MachineModuleAnalysis(MMI); });
 
-  ModulePassManager MPM;
-  FunctionPassManager FPM;
-
   if (!PassPipeline.empty()) {
     // Construct a custom pass pipeline that starts after instruction
     // selection.
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
index 503db7b1f8d18..b00bab66c6c3e 100644
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
@@ -1,4 +1,4 @@
-//===--- llvm-isel-fuzzer.cpp - Fuzzer for instruction selection ----------===//
+//===--- llvm-tgt-verify.cpp - Target Verifier ----------------- ----------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -6,7 +6,7 @@
 //
 //===----------------------------------------------------------------------===//
 //
-// Tool to fuzz instruction selection using libFuzzer.
+// Tool to verify a target.
 //
 //===----------------------------------------------------------------------===//
 
@@ -144,14 +144,11 @@ int main(int argc, char **argv) {
   PB.registerCGSCCAnalyses(CGAM);
   PB.registerFunctionAnalyses(FAM);
   PB.registerLoopAnalyses(LAM);
-  //PB.registerVerifierPasses(MPM, FPM);
   PB.registerMachineFunctionAnalyses(MFAM);
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
 
   SI.registerCallbacks(PIC, &MAM, &FAM);
 
-  //FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); });
-
   Triple TT(M->getTargetTriple());
   if (!NoLint)
     FPM.addPass(LintPass(false));

From 4f00c83f58a86a0adc26b621cc53e8b568b8c8e0 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Thu, 24 Apr 2025 16:02:21 -0400
Subject: [PATCH 13/31] Add pass to legacy PM.
---
 llvm/include/llvm/CodeGen/Passes.h            |  2 +
 llvm/include/llvm/InitializePasses.h          |  2 +-
 llvm/include/llvm/Target/TargetVerifier.h     |  6 +-
 llvm/lib/Passes/StandardInstrumentations.cpp  |  4 +-
 llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def |  2 +-
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp    | 45 ----------
 llvm/lib/Target/TargetVerifier.cpp            | 87 ++++++++++++++++++-
 llvm/tools/llc/NewPMDriver.cpp                |  6 +-
 llvm/tools/llc/llc.cpp                        |  4 +
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp |  1 +
 10 files changed, 106 insertions(+), 53 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index d214ab9306c2f..b293315e11c17 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -617,6 +617,8 @@ namespace llvm {
 
   /// Lowers KCFI operand bundles for indirect calls.
   FunctionPass *createKCFIPass();
+
+  FunctionPass *createTargetVerifierLegacyPass();
 } // End llvm namespace
 
 #endif
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index ae398db3dc1da..3f9ffc4efd9ec 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -307,6 +307,7 @@ void initializeTailDuplicateLegacyPass(PassRegistry &);
 void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &);
 void initializeTargetPassConfigPass(PassRegistry &);
 void initializeTargetTransformInfoWrapperPassPass(PassRegistry &);
+void initializeTargetVerifierLegacyPassPass(PassRegistry &);
 void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &);
 void initializeTypeBasedAAWrapperPassPass(PassRegistry &);
 void initializeTypePromotionLegacyPass(PassRegistry &);
@@ -317,7 +318,6 @@ void initializeUnpackMachineBundlesPass(PassRegistry &);
 void initializeUnreachableBlockElimLegacyPassPass(PassRegistry &);
 void initializeUnreachableMachineBlockElimLegacyPass(PassRegistry &);
 void initializeVerifierLegacyPassPass(PassRegistry &);
-void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &);
 void initializeVirtRegMapWrapperLegacyPass(PassRegistry &);
 void initializeVirtRegRewriterPass(PassRegistry &);
 void initializeWasmEHPreparePass(PassRegistry &);
diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index fe683311b901c..23ef2e0b8d4ef 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -30,7 +30,7 @@ class Function;
 
 class TargetVerifierPass : public PassInfoMixin {
 public:
-  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {}
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
 };
 
 class TargetVerify {
@@ -76,8 +76,8 @@ class TargetVerify {
       : Mod(Mod), TT(Mod->getTargetTriple()),
         MessagesStr(Messages) {}
 
-  void run(Function &F) {};
-  void run(Function &F, FunctionAnalysisManager &AM);
+  bool run(Function &F);
+  bool run(Function &F, FunctionAnalysisManager &AM);
 };
 
 } // namespace llvm
diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp
index 879d657c87695..f125b3daffd5e 100644
--- a/llvm/lib/Passes/StandardInstrumentations.cpp
+++ b/llvm/lib/Passes/StandardInstrumentations.cpp
@@ -62,6 +62,8 @@ static cl::opt VerifyAnalysisInvalidation("verify-analysis-invalidation",
 #endif
                                                );
 
+static cl::opt VerifyTargetEach("verify-tgt-each");
+
 // An option that supports the -print-changed option. See
 // the description for -print-changed for an explanation of the use
 // of this option. Note that this option has no effect without -print-changed.
@@ -1476,7 +1478,7 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC,
                                        "\"{0}\", compilation aborted!",
                                        P));
 
-          if (FAM) {
+          if (VerifyTargetEach && FAM) {
             TargetVerify TV(const_cast(F->getParent()));
             TV.run(*const_cast(F), *FAM);
             if (!TV.IsValid)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index 41e6a399c7239..73f9c60cf588c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -88,7 +88,7 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA())
 #define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS)
 #endif
 VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass())
-VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass())
+VERIFIER_FUNCTION_ANALYSIS("tgtverifier", TargetVerifierPass())
 #undef VERIFIER_MODULE_ANALYSIS
 #undef VERIFIER_FUNCTION_ANALYSIS
 
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 0eecedaebc7ce..96bcaaf6f2ac9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -132,49 +132,4 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan
   return PreservedAnalyses::all();
 }
 
-struct AMDGPUTargetVerifierLegacyPass : public FunctionPass {
-  static char ID;
-
-  std::unique_ptr TV;
-  bool FatalErrors = true;
-
-  AMDGPUTargetVerifierLegacyPass() : FunctionPass(ID) {
-    initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
-  }
-  AMDGPUTargetVerifierLegacyPass(bool FatalErrors)
-    : FunctionPass(ID),
-      FatalErrors(FatalErrors) {
-    initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
-  }
-
-  bool doInitialization(Module &M) override {
-    TV = std::make_unique(&M);
-    return false;
-  }
-
-  bool runOnFunction(Function &F) override {
-    if (!TV->run(F) && FatalErrors) {
-      errs() << "in function " << F.getName() << '\n';
-      report_fatal_error("Broken function found, compilation aborted!");
-    }
-    return false;
-  }
-
-  bool doFinalization(Module &M) override {
-    bool IsValid = true;
-    for (Function &F : M)
-      if (F.isDeclaration())
-        IsValid &= TV->run(F);
-
-    if (FatalErrors && !IsValid)
-      report_fatal_error("Broken module found, compilation aborted!");
-    return false;
-  }
-
-  void getAnalysisUsage(AnalysisUsage &AU) const override {
-    AU.setPreservesAll();
-  }
-};
-char AMDGPUTargetVerifierLegacyPass::ID = 0;
 } // namespace llvm
-INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverify", "AMDGPU Target Verifier", false, false)
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 992a0c91d93b1..170fc4769c1d8 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -20,6 +20,7 @@
 #include "llvm/Target/TargetVerifier.h"
 #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
 
+#include "llvm/InitializePasses.h"
 #include "llvm/Analysis/UniformityAnalysis.h"
 #include "llvm/Analysis/PostDominators.h"
 #include "llvm/Support/Debug.h"
@@ -32,7 +33,22 @@
 
 namespace llvm {
 
-void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
+bool TargetVerify::run(Function &F) {
+  if (TT.isAMDGPU()) {
+    AMDGPUTargetVerify TV(Mod);
+    TV.run(F);
+
+    dbgs() << TV.MessagesStr.str();
+    if (!TV.MessagesStr.str().empty()) {
+      TV.IsValid = false;
+      return false;
+    }
+    return true;
+  }
+  report_fatal_error("Target has no verification method\n");
+}
+
+bool TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
   if (TT.isAMDGPU()) {
     auto *UA = &AM.getResult(F);
     auto *DT = &AM.getResult(F);
@@ -44,8 +60,77 @@ void TargetVerify::run(Function &F, FunctionAnalysisManager &AM) {
     dbgs() << TV.MessagesStr.str();
     if (!TV.MessagesStr.str().empty()) {
       TV.IsValid = false;
+      return false;
+    }
+    return true;
+  }
+  report_fatal_error("Target has no verification method\n");
+}
+
+PreservedAnalyses TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
+  auto TT = F.getParent()->getTargetTriple();
+
+  if (TT.isAMDGPU()) {
+    auto *Mod = F.getParent();
+
+    auto UA = &AM.getResult(F);
+    auto *DT = &AM.getResult(F);
+    auto *PDT = &AM.getResult(F);
+
+    AMDGPUTargetVerify TV(Mod, DT, PDT, UA);
+    TV.run(F);
+
+    dbgs() << TV.MessagesStr.str();
+    if (!TV.MessagesStr.str().empty()) {
+      TV.IsValid = false;
+      return PreservedAnalyses::none();
     }
+    return PreservedAnalyses::all();
   }
+  report_fatal_error("Target has no verification method\n");
 }
 
+struct TargetVerifierLegacyPass : public FunctionPass {
+  static char ID;
+
+  std::unique_ptr TV;
+
+  TargetVerifierLegacyPass() : FunctionPass(ID) {
+    initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
+  }
+
+  bool doInitialization(Module &M) override {
+    TV = std::make_unique(&M);
+    return false;
+  }
+
+  bool runOnFunction(Function &F) override {
+    if (!TV->run(F)) {
+      errs() << "in function " << F.getName() << '\n';
+      report_fatal_error("broken function found, compilation aborted!");
+    }
+    return false;
+  }
+
+  bool doFinalization(Module &M) override {
+    bool IsValid = true;
+    for (Function &F : M)
+      if (F.isDeclaration())
+        IsValid &= TV->run(F);
+
+    if (!IsValid)
+      report_fatal_error("broken module found, compilation aborted!");
+    return false;
+  }
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.setPreservesAll();
+  }
+};
+char TargetVerifierLegacyPass::ID = 0;
+FunctionPass *createTargetVerifierLegacyPass() {
+  return new TargetVerifierLegacyPass();
+}
 } // namespace llvm
+using namespace llvm;
+INITIALIZE_PASS(TargetVerifierLegacyPass, "tgtverifier", "Target Verifier", false, false)
diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp
index a8f6b999af06e..4b95977a10c5f 100644
--- a/llvm/tools/llc/NewPMDriver.cpp
+++ b/llvm/tools/llc/NewPMDriver.cpp
@@ -57,6 +57,9 @@ static cl::opt
     DebugPM("debug-pass-manager", cl::Hidden,
             cl::desc("Print pass management debugging information"));
 
+static cl::opt VerifyTarget("verify-tgt-new-pm",
+                            cl::desc("Verify the target"));
+
 bool LLCDiagnosticHandler::handleDiagnostics(const DiagnosticInfo &DI) {
   DiagnosticHandler::handleDiagnostics(DI);
   if (DI.getKind() == llvm::DK_SrcMgr) {
@@ -127,7 +130,8 @@ int llvm::compileModuleWithNewPM(
   PB.registerFunctionAnalyses(FAM);
   PB.registerLoopAnalyses(LAM);
   PB.registerMachineFunctionAnalyses(MFAM);
-  PB.registerVerifierPasses(MPM, FPM);
+  if (VerifyTarget)
+    PB.registerVerifierPasses(MPM, FPM);
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
   SI.registerCallbacks(PIC, &MAM, &FAM);
 
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 140459ba2de21..1fd8a9f9cd9f8 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -209,6 +209,8 @@ static cl::opt PassPipeline(
 static cl::alias PassPipeline2("p", cl::aliasopt(PassPipeline),
                                cl::desc("Alias for -passes"));
 
+static cl::opt VerifyTarget("verify-tgt", cl::desc("Verify the target"));
+
 namespace {
 
 std::vector &getRunPassNames() {
@@ -658,6 +660,8 @@ static int compileModule(char **argv, LLVMContext &Context) {
 
   // Build up all of the passes that we want to do to the module.
   legacy::PassManager PM;
+  if (VerifyTarget)
+    PM.add(createTargetVerifierLegacyPass());
   PM.add(new TargetLibraryInfoWrapperPass(TLII));
 
   {
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
index b00bab66c6c3e..b86c2318b45b7 100644
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
@@ -141,6 +141,7 @@ int main(int argc, char **argv) {
   ModuleAnalysisManager MAM;
   PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC);
   PB.registerModuleAnalyses(MAM);
+  //PB.registerVerifierPasses(MPM, FPM);
   PB.registerCGSCCAnalyses(CGAM);
   PB.registerFunctionAnalyses(FAM);
   PB.registerLoopAnalyses(LAM);

From 3013fc91155a7d84c73ac820fe6bc24c47dad38d Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 26 Apr 2025 00:13:42 -0400
Subject: [PATCH 14/31] Add fam in other projects.

---
 flang/lib/Frontend/FrontendActions.cpp      | 2 +-
 llvm/examples/Kaleidoscope/Chapter4/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter5/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter6/toy.cpp | 2 +-
 llvm/examples/Kaleidoscope/Chapter7/toy.cpp | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index c1f47b12abee2..7c48e35ff68cf 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -911,7 +911,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   std::optional pgoOpt;
   llvm::StandardInstrumentations si(llvmModule->getContext(),
                                     opts.DebugPassManager);
-  si.registerCallbacks(pic, &mam);
+  si.registerCallbacks(pic, &mam, &fam);
   if (ci.isTimingEnabled())
     si.getTimePasses().setOutStream(ci.getTimingStreamLLVM());
   pto.LoopUnrolling = opts.UnrollLoops;
diff --git a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
index 0f58391c50667..f9664025f61f1 100644
--- a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp
@@ -577,7 +577,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
 
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
index 7117eaf4982b0..eae06d9f57467 100644
--- a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp
@@ -851,7 +851,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
 
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
index cb7b6cc8651c1..30ad79ef2fc58 100644
--- a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp
@@ -970,7 +970,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
 
   // Add transform passes.
   // Do simple "peephole" optimizations and bit-twiddling optzns.
diff --git a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
index 91b7191a07c6f..4a39bc33c5591 100644
--- a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
+++ b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp
@@ -1139,7 +1139,7 @@ static void InitializeModuleAndManagers() {
   ThePIC = std::make_unique();
   TheSI = std::make_unique(*TheContext,
                            /*DebugLogging*/ true);
-  TheSI->registerCallbacks(*ThePIC, TheMAM.get());
+  TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get());
 
   // Add transform passes.
   // Promote allocas to registers.

From 8745cd135bd27559429f158fc0d678a210af7292 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 26 Apr 2025 02:30:40 -0400
Subject: [PATCH 15/31] Avoid fatal errors in llc.

---
 llvm/include/llvm/CodeGen/Passes.h             |  2 +-
 llvm/lib/Target/TargetVerifier.cpp             | 18 +++++++++++++-----
 .../test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll |  2 +-
 .../test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll |  2 +-
 llvm/tools/llc/llc.cpp                         |  2 +-
 5 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index b293315e11c17..8d88d858c57ad 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -618,7 +618,7 @@ namespace llvm {
   /// Lowers KCFI operand bundles for indirect calls.
   FunctionPass *createKCFIPass();
 
-  FunctionPass *createTargetVerifierLegacyPass();
+  FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors);
 } // End llvm namespace
 
 #endif
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
index 170fc4769c1d8..3be50f4ef6da3 100644
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ b/llvm/lib/Target/TargetVerifier.cpp
@@ -94,8 +94,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   static char ID;
 
   std::unique_ptr TV;
+  bool FatalErrors = false;
 
-  TargetVerifierLegacyPass() : FunctionPass(ID) {
+  TargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID),
+    FatalErrors(FatalErrors) {
     initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry());
   }
 
@@ -107,7 +109,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   bool runOnFunction(Function &F) override {
     if (!TV->run(F)) {
       errs() << "in function " << F.getName() << '\n';
-      report_fatal_error("broken function found, compilation aborted!");
+      if (FatalErrors)
+        report_fatal_error("broken function found, compilation aborted!");
+      else
+        errs() << "broken function found, compilation aborted!\n";
     }
     return false;
   }
@@ -119,7 +124,10 @@ struct TargetVerifierLegacyPass : public FunctionPass {
         IsValid &= TV->run(F);
 
     if (!IsValid)
-      report_fatal_error("broken module found, compilation aborted!");
+      if (FatalErrors)
+        report_fatal_error("broken module found, compilation aborted!");
+      else
+        errs() << "broken module found, compilation aborted!\n";
     return false;
   }
 
@@ -128,8 +136,8 @@ struct TargetVerifierLegacyPass : public FunctionPass {
   }
 };
 char TargetVerifierLegacyPass::ID = 0;
-FunctionPass *createTargetVerifierLegacyPass() {
-  return new TargetVerifierLegacyPass();
+FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors) {
+  return new TargetVerifierLegacyPass(FatalErrors);
 }
 } // namespace llvm
 using namespace llvm;
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
index c5e59d4a2369e..e2d9edda5d008 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm -o - < %s 2>&1 | FileCheck %s
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-tgt -o - < %s 2>&1 | FileCheck %s
 
 define amdgpu_cs i32 @nonvoid_shader() {
 ; CHECK: Shaders must return void
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
index 8a503b7624a73..a2dab0ff47924 100644
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
+++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -enable-new-pm %s -o - 2>&1 | FileCheck %s --allow-empty
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-tgt %s -o - 2>&1 | FileCheck %s --allow-empty
 
 define amdgpu_cs void @void_shader() {
 ; CHECK-NOT: Shaders must return void
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 1fd8a9f9cd9f8..329d95826551f 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -661,7 +661,7 @@ static int compileModule(char **argv, LLVMContext &Context) {
   // Build up all of the passes that we want to do to the module.
   legacy::PassManager PM;
   if (VerifyTarget)
-    PM.add(createTargetVerifierLegacyPass());
+    PM.add(createTargetVerifierLegacyPass(false));
   PM.add(new TargetLibraryInfoWrapperPass(TLII));
 
   {

From c7bf730193e39bf838a29de7617d31a900bbc576 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Sat, 26 Apr 2025 03:40:47 -0400
Subject: [PATCH 16/31] Add tool to build/test.
--- llvm/test/CMakeLists.txt | 1 + llvm/test/lit.cfg.py | 1 + llvm/utils/gn/secondary/llvm/test/BUILD.gn | 1 + .../llvm/tools/llvm-tgt-verify/BUILD.gn | 25 +++++++++++++++++++ 4 files changed, 28 insertions(+) create mode 100644 llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn diff --git a/llvm/test/CMakeLists.txt b/llvm/test/CMakeLists.txt index 66849002eb470..10ca9300e7c66 100644 --- a/llvm/test/CMakeLists.txt +++ b/llvm/test/CMakeLists.txt @@ -135,6 +135,7 @@ set(LLVM_TEST_DEPENDS llvm-strip llvm-symbolizer llvm-tblgen + llvm-tgt-verify llvm-readtapi llvm-tli-checker llvm-undname diff --git a/llvm/test/lit.cfg.py b/llvm/test/lit.cfg.py index aad7a088551b2..8620f2a7014b5 100644 --- a/llvm/test/lit.cfg.py +++ b/llvm/test/lit.cfg.py @@ -227,6 +227,7 @@ def get_asan_rtlib(): "llvm-strings", "llvm-strip", "llvm-tblgen", + "llvm-tgt-verify", "llvm-readtapi", "llvm-undname", "llvm-windres", diff --git a/llvm/utils/gn/secondary/llvm/test/BUILD.gn b/llvm/utils/gn/secondary/llvm/test/BUILD.gn index 228642667b41d..157e7991c52a8 100644 --- a/llvm/utils/gn/secondary/llvm/test/BUILD.gn +++ b/llvm/utils/gn/secondary/llvm/test/BUILD.gn @@ -319,6 +319,7 @@ group("test") { "//llvm/tools/llvm-strings", "//llvm/tools/llvm-symbolizer:symlinks", "//llvm/tools/llvm-tli-checker", + "//llvm/tools/llvm-tgt-verify", "//llvm/tools/llvm-undname", "//llvm/tools/llvm-xray", "//llvm/tools/lto", diff --git a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn new file mode 100644 index 0000000000000..b751bafc5052c --- /dev/null +++ b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn @@ -0,0 +1,25 @@ +import("//llvm/utils/TableGen/tablegen.gni") + +tgtverifier("llvm-tgt-verify") { + deps = [ + "//llvm/lib/Analysis", + "//llvm/lib/AsmPrinter", + "//llvm/lib/CodeGen", + "//llvm/lib/CodeGenTypes", + "//llvm/lib/Core", + "//llvm/lib/IRPrinter", + "//llvm/lib/IRReader", + "//llvm/lib/MC", + 
"//llvm/lib/MIRParser", + "//llvm/lib/Passes", + "//llvm/lib/Remarks", + "//llvm/lib/ScalarOpts", + "//llvm/lib/SelectionDAG", + "//llvm/lib/Support", + "//llvm/lib/Target", + "//llvm/lib/TargetParser", + "//llvm/lib/TransformUtils", + "//llvm/lib/Vectorize", + ] + sources = [ "llvm-tgt-verify.cpp" ] +} >From c8dd3db3fe078f76e822a9646d3d7295fa23752a Mon Sep 17 00:00:00 2001 From: jofernau Date: Mon, 28 Apr 2025 10:42:24 -0400 Subject: [PATCH 17/31] Cleanup of unrequired functions. --- llvm/include/llvm/Target/TargetVerifier.h | 1 - .../TargetVerify/AMDGPUTargetVerifier.h | 1 - .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 25 +++---------------- llvm/lib/Target/TargetVerifier.cpp | 22 ++-------------- 4 files changed, 6 insertions(+), 43 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index 23ef2e0b8d4ef..427a05b2648a9 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -77,7 +77,6 @@ class TargetVerify { MessagesStr(Messages) {} bool run(Function &F); - bool run(Function &F, FunctionAnalysisManager &AM); }; } // namespace llvm diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index b6a7412e8c1ef..74e5b5f7a1efd 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -32,7 +32,6 @@ class Function; class AMDGPUTargetVerifierPass : public TargetVerifierPass { public: - PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); }; class AMDGPUTargetVerify : public TargetVerify { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index 96bcaaf6f2ac9..bda412f723242 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -107,29 +107,12 @@ bool 
AMDGPUTargetVerify::run(Function &F) { } } - if (!MessagesStr.str().empty()) + //dbgs() << MessagesStr.str(); + if (!MessagesStr.str().empty()) { + //IsValid = false; return false; - return true; -} - -PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { - - auto *Mod = F.getParent(); - - auto UA = &AM.getResult<UniformityInfoAnalysis>(F); - auto *DT = &AM.getResult<DominatorTreeAnalysis>(F); - auto *PDT = &AM.getResult<PostDominatorTreeAnalysis>(F); - - AMDGPUTargetVerify TV(Mod, DT, PDT, UA); - TV.run(F); - - dbgs() << TV.MessagesStr.str(); - if (!TV.MessagesStr.str().empty()) { - TV.IsValid = false; - return PreservedAnalyses::none(); } - - return PreservedAnalyses::all(); + return true; } } // namespace llvm diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp index 3be50f4ef6da3..6b57c18ff9316 100644 --- a/llvm/lib/Target/TargetVerifier.cpp +++ b/llvm/lib/Target/TargetVerifier.cpp @@ -48,25 +48,6 @@ bool TargetVerify::run(Function &F) { report_fatal_error("Target has no verification method\n"); } -bool TargetVerify::run(Function &F, FunctionAnalysisManager &AM) { - if (TT.isAMDGPU()) { - auto *UA = &AM.getResult<UniformityInfoAnalysis>(F); - auto *DT = &AM.getResult<DominatorTreeAnalysis>(F); - auto *PDT = &AM.getResult<PostDominatorTreeAnalysis>(F); - - AMDGPUTargetVerify TV(Mod, DT, PDT, UA); - TV.run(F); - - dbgs() << TV.MessagesStr.str(); - if (!TV.MessagesStr.str().empty()) { - TV.IsValid = false; - return false; - } - return true; - } - report_fatal_error("Target has no verification method\n"); -} - PreservedAnalyses TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { auto TT = F.getParent()->getTargetTriple(); @@ -123,11 +104,12 @@ struct TargetVerifierLegacyPass : public FunctionPass { if (F.isDeclaration()) IsValid &= TV->run(F); - if (!IsValid) + if (!IsValid) { if (FatalErrors) report_fatal_error("broken module found, compilation aborted!"); else errs() << "broken module found, compilation aborted!\n"; + } return false; } >From 2c12e6a6d7f9a1cb7bcebfb30ccdd0fe7b198727 Mon Sep 17 00:00:00 2001 From: jofernau 
Date: Mon, 28 Apr 2025 10:43:32 -0400 Subject: [PATCH 18/31] Make virtual. --- llvm/include/llvm/Target/TargetVerifier.h | 2 +- llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index 427a05b2648a9..ade2676a64325 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -76,7 +76,7 @@ class TargetVerify { : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} - bool run(Function &F); + virtual bool run(Function &F); }; } // namespace llvm diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index 74e5b5f7a1efd..b97fbc046e391 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -48,7 +48,7 @@ class AMDGPUTargetVerify : public TargetVerify { AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} - bool run(Function &F); + bool run(Function &F) override; }; } // namespace llvm >From 3267b65e82da4cb7bc0f31f74c76f78d0445512f Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 10:56:43 -0400 Subject: [PATCH 19/31] Remove from legacy PM. Add to target dependent pipeline. 
--- llvm/include/llvm/CodeGen/Passes.h | 2 +- llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Target/TargetVerifier.h | 4 +- .../TargetVerify/AMDGPUTargetVerifier.h | 1 + llvm/lib/Passes/StandardInstrumentations.cpp | 10 +-- llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 2 +- .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 + .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 74 ++++++++++++++- llvm/lib/Target/TargetVerifier.cpp | 90 ------------------- llvm/tools/llc/llc.cpp | 2 - .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 2 - 11 files changed, 85 insertions(+), 106 deletions(-) diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index 8d88d858c57ad..da6ad3f612aa8 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -618,7 +618,7 @@ namespace llvm { /// Lowers KCFI operand bundles for indirect calls. FunctionPass *createKCFIPass(); - FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors); + //FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors); } // End llvm namespace #endif diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 3f9ffc4efd9ec..7d4fad2d87a16 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -307,7 +307,7 @@ void initializeTailDuplicateLegacyPass(PassRegistry &); void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &); void initializeTargetPassConfigPass(PassRegistry &); void initializeTargetTransformInfoWrapperPassPass(PassRegistry &); -void initializeTargetVerifierLegacyPassPass(PassRegistry &); +//void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &); void initializeTypeBasedAAWrapperPassPass(PassRegistry &); void initializeTypePromotionLegacyPass(PassRegistry &); diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h 
index ade2676a64325..1d12eb55bbf0a 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -30,7 +30,7 @@ class Function; class TargetVerifierPass : public PassInfoMixin<TargetVerifierPass> { public: - PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); + virtual PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) = 0; }; class TargetVerify { @@ -76,7 +76,7 @@ class TargetVerify { : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {} - virtual bool run(Function &F); + virtual bool run(Function &F) = 0; }; } // namespace llvm diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index b97fbc046e391..49bcbc8849e3c 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -32,6 +32,7 @@ class Function; class AMDGPUTargetVerifierPass : public TargetVerifierPass { public: + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) override; }; class AMDGPUTargetVerify : public TargetVerify { diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index f125b3daffd5e..076df47d5b15d 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -45,7 +45,7 @@ #include "llvm/Support/Signals.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Support/xxhash.h" -#include "llvm/Target/TargetVerifier.h" +//#include "llvm/Target/TargetVerifier.h" #include #include #include @@ -1479,12 +1479,12 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, P)); if (VerifyTargetEach && FAM) { - TargetVerify TV(const_cast<Module *>(F->getParent())); - TV.run(*const_cast<Function *>(F), *FAM); - if (!TV.IsValid) + //TargetVerify TV(const_cast<Module *>(F->getParent())); + //TV.run(*const_cast<Function *>(F), *FAM); + /*if (!TV.IsValid) 
report_fatal_error(formatv("Broken function found after pass " "\"{0}\", compilation aborted!", - P)); + P));*/ } } else { const auto *M = unwrapIR(IR); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def index 73f9c60cf588c..41e6a399c7239 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def +++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def @@ -88,7 +88,7 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA()) #define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) #endif VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass()) -VERIFIER_FUNCTION_ANALYSIS("tgtverifier", TargetVerifierPass()) +VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass()) #undef VERIFIER_MODULE_ANALYSIS #undef VERIFIER_FUNCTION_ANALYSIS diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 257cc724b3da9..f1a60b8f33140 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -1976,6 +1976,8 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder( } void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { + addPass(AMDGPUTargetVerifierPass()); + if (RemoveIncompatibleFunctions && TM.getTargetTriple().isAMDGCN()) addPass(AMDGPURemoveIncompatibleFunctionsPass(TM)); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index bda412f723242..cedd9ddc78011 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -107,12 +107,82 @@ bool AMDGPUTargetVerify::run(Function &F) { } } - //dbgs() << MessagesStr.str(); + dbgs() << MessagesStr.str(); if (!MessagesStr.str().empty()) { - //IsValid = false; + IsValid = false; return false; } return true; } +PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { + auto *Mod = F.getParent(); + + auto UA = 
&AM.getResult<UniformityInfoAnalysis>(F); + auto *DT = &AM.getResult<DominatorTreeAnalysis>(F); + auto *PDT = &AM.getResult<PostDominatorTreeAnalysis>(F); + + AMDGPUTargetVerify TV(Mod, DT, PDT, UA); + TV.run(F); + + dbgs() << TV.MessagesStr.str(); + if (!TV.MessagesStr.str().empty()) { + TV.IsValid = false; + return PreservedAnalyses::none(); + } + return PreservedAnalyses::all(); +} + +/* +struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { + static char ID; + + std::unique_ptr<AMDGPUTargetVerify> TV; + bool FatalErrors = false; + + AMDGPUTargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID), + FatalErrors(FatalErrors) { + initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + } + + bool doInitialization(Module &M) override { + TV = std::make_unique<AMDGPUTargetVerify>(&M); + return false; + } + + bool runOnFunction(Function &F) override { + if (!TV->run(F)) { + errs() << "in function " << F.getName() << '\n'; + if (FatalErrors) + report_fatal_error("broken function found, compilation aborted!"); + else + errs() << "broken function found, compilation aborted!\n"; + } + return false; + } + + bool doFinalization(Module &M) override { + bool IsValid = true; + for (Function &F : M) + if (F.isDeclaration()) + IsValid &= TV->run(F); + + if (!IsValid) { + if (FatalErrors) + report_fatal_error("broken module found, compilation aborted!"); + else + errs() << "broken module found, compilation aborted!\n"; + } + return false; + } + + void getAnalysisUsage(AnalysisUsage &AU) const override { + AU.setPreservesAll(); + } +}; +char AMDGPUTargetVerifierLegacyPass::ID = 0; +FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) { + return new AMDGPUTargetVerifierLegacyPass(FatalErrors); +}*/ } // namespace llvm +//INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false) diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp index 6b57c18ff9316..c63ae2a2c5daf 100644 --- a/llvm/lib/Target/TargetVerifier.cpp +++ b/llvm/lib/Target/TargetVerifier.cpp @@ 
-33,94 +33,4 @@ namespace llvm { -bool TargetVerify::run(Function &F) { - if (TT.isAMDGPU()) { - AMDGPUTargetVerify TV(Mod); - TV.run(F); - - dbgs() << TV.MessagesStr.str(); - if (!TV.MessagesStr.str().empty()) { - TV.IsValid = false; - return false; - } - return true; - } - report_fatal_error("Target has no verification method\n"); -} - -PreservedAnalyses TargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { - auto TT = F.getParent()->getTargetTriple(); - - if (TT.isAMDGPU()) { - auto *Mod = F.getParent(); - - auto UA = &AM.getResult<UniformityInfoAnalysis>(F); - auto *DT = &AM.getResult<DominatorTreeAnalysis>(F); - auto *PDT = &AM.getResult<PostDominatorTreeAnalysis>(F); - - AMDGPUTargetVerify TV(Mod, DT, PDT, UA); - TV.run(F); - - dbgs() << TV.MessagesStr.str(); - if (!TV.MessagesStr.str().empty()) { - TV.IsValid = false; - return PreservedAnalyses::none(); - } - return PreservedAnalyses::all(); - } - report_fatal_error("Target has no verification method\n"); -} - -struct TargetVerifierLegacyPass : public FunctionPass { - static char ID; - - std::unique_ptr<TargetVerify> TV; - bool FatalErrors = false; - - TargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID), - FatalErrors(FatalErrors) { - initializeTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); - } - - bool doInitialization(Module &M) override { - TV = std::make_unique<TargetVerify>(&M); - return false; - } - - bool runOnFunction(Function &F) override { - if (!TV->run(F)) { - errs() << "in function " << F.getName() << '\n'; - if (FatalErrors) - report_fatal_error("broken function found, compilation aborted!"); - else - errs() << "broken function found, compilation aborted!\n"; - } - return false; - } - - bool doFinalization(Module &M) override { - bool IsValid = true; - for (Function &F : M) - if (F.isDeclaration()) - IsValid &= TV->run(F); - - if (!IsValid) { - if (FatalErrors) - report_fatal_error("broken module found, compilation aborted!"); - else - errs() << "broken module found, compilation aborted!\n"; - } - return false; - } - - void getAnalysisUsage(AnalysisUsage 
&AU) const override { - AU.setPreservesAll(); - } -}; -char TargetVerifierLegacyPass::ID = 0; -FunctionPass *createTargetVerifierLegacyPass(bool FatalErrors) { - return new TargetVerifierLegacyPass(FatalErrors); -} } // namespace llvm -using namespace llvm; -INITIALIZE_PASS(TargetVerifierLegacyPass, "tgtverifier", "Target Verifier", false, false) diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp index 329d95826551f..2e9e4837fe467 100644 --- a/llvm/tools/llc/llc.cpp +++ b/llvm/tools/llc/llc.cpp @@ -660,8 +660,6 @@ static int compileModule(char **argv, LLVMContext &Context) { // Build up all of the passes that we want to do to the module. legacy::PassManager PM; - if (VerifyTarget) - PM.add(createTargetVerifierLegacyPass(false)); PM.add(new TargetLibraryInfoWrapperPass(TLII)); { diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index b86c2318b45b7..d832dcdff4ad0 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -158,8 +158,6 @@ int main(int argc, char **argv) { if (TT.isAMDGPU()) FPM.addPass(AMDGPUTargetVerifierPass()); else if (false) {} // ... - else - FPM.addPass(TargetVerifierPass()); MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM))); auto PA = MPM.run(*M, MAM); >From 6401b7517843a03ab114aaf333624ef914d5a5f3 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 11:18:50 -0400 Subject: [PATCH 20/31] Add back to legacy PM. 
--- llvm/include/llvm/CodeGen/Passes.h | 2 -- llvm/include/llvm/InitializePasses.h | 1 - llvm/lib/Target/AMDGPU/AMDGPU.h | 3 +++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 1 + llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 8 ++++---- 5 files changed, 8 insertions(+), 7 deletions(-) diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index da6ad3f612aa8..d214ab9306c2f 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -617,8 +617,6 @@ namespace llvm { /// Lowers KCFI operand bundles for indirect calls. FunctionPass *createKCFIPass(); - - //FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors); } // End llvm namespace #endif diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 7d4fad2d87a16..9bef8e496c57e 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -307,7 +307,6 @@ void initializeTailDuplicateLegacyPass(PassRegistry &); void initializeTargetLibraryInfoWrapperPassPass(PassRegistry &); void initializeTargetPassConfigPass(PassRegistry &); void initializeTargetTransformInfoWrapperPassPass(PassRegistry &); -//void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); void initializeTwoAddressInstructionLegacyPassPass(PassRegistry &); void initializeTypeBasedAAWrapperPassPass(PassRegistry &); void initializeTypePromotionLegacyPass(PassRegistry &); diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h index 4ff761ec19b3c..f69956ba44255 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPU.h +++ b/llvm/lib/Target/AMDGPU/AMDGPU.h @@ -530,6 +530,9 @@ extern char &GCNRewritePartialRegUsesID; void initializeAMDGPUWaitSGPRHazardsLegacyPass(PassRegistry &); extern char &AMDGPUWaitSGPRHazardsLegacyID; +FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors); +void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); + namespace AMDGPU { enum 
TargetIndex { TI_CONSTDATA_START, diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index f1a60b8f33140..42d6764eacda9 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -1377,6 +1377,7 @@ bool AMDGPUPassConfig::addGCPasses() { //===----------------------------------------------------------------------===// bool GCNPassConfig::addPreISel() { + addPass(createAMDGPUTargetVerifierLegacyPass(false)); AMDGPUPassConfig::addPreISel(); if (TM->getOptLevel() > CodeGenOptLevel::None) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index cedd9ddc78011..c4d303bee6ef8 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -17,6 +17,7 @@ //// ////===----------------------------------------------------------------------===// +#include "AMDGPU.h" #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "llvm/Analysis/UniformityAnalysis.h" @@ -24,7 +25,7 @@ #include "llvm/Support/Debug.h" #include "llvm/IR/Dominators.h" #include "llvm/IR/Function.h" -#include "llvm/InitializePasses.h" +//#include "llvm/InitializePasses.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/IntrinsicsAMDGPU.h" #include "llvm/IR/Module.h" @@ -133,7 +134,6 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisMan return PreservedAnalyses::all(); } -/* struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { static char ID; @@ -183,6 +183,6 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { char AMDGPUTargetVerifierLegacyPass::ID = 0; FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) { return new AMDGPUTargetVerifierLegacyPass(FatalErrors); -}*/ +} } // namespace llvm -//INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false) 
+INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false) >From e2f0225db1439f7d8ee612ee4c4d37a4b44f96b6 Mon Sep 17 00:00:00 2001 From: jofernau Date: Wed, 30 Apr 2025 14:04:10 -0400 Subject: [PATCH 21/31] Remove reference to FAM in registerCallbacks and VerifyEach for TargetVerify in instrumentation --- clang/lib/CodeGen/BackendUtil.cpp | 2 +- flang/lib/Frontend/FrontendActions.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter4/toy.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter5/toy.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter6/toy.cpp | 2 +- llvm/examples/Kaleidoscope/Chapter7/toy.cpp | 2 +- .../llvm/Passes/StandardInstrumentations.h | 6 ++---- llvm/lib/LTO/LTOBackend.cpp | 2 +- llvm/lib/LTO/ThinLTOCodeGenerator.cpp | 2 +- llvm/lib/Passes/PassBuilderBindings.cpp | 2 +- llvm/lib/Passes/StandardInstrumentations.cpp | 21 ++++--------------- llvm/tools/llc/NewPMDriver.cpp | 7 ++++--- .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 2 +- llvm/tools/opt/NewPMDriver.cpp | 2 +- llvm/unittests/IR/PassManagerTest.cpp | 6 +++--- 15 files changed, 24 insertions(+), 38 deletions(-) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 9a1c922f5ddef..f7eb853beb23c 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -922,7 +922,7 @@ void EmitAssemblyHelper::RunOptimizationPipeline( TheModule->getContext(), (CodeGenOpts.DebugPassManager || DebugPassStructure), CodeGenOpts.VerifyEach, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); PassBuilder PB(TM.get(), PTO, PGOOpt, &PIC); // Handle the assignment tracking feature options. 
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 7c48e35ff68cf..c1f47b12abee2 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -911,7 +911,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { std::optional<llvm::PGOOptions> pgoOpt; llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); - si.registerCallbacks(pic, &mam, &fam); + si.registerCallbacks(pic, &mam); if (ci.isTimingEnabled()) si.getTimePasses().setOutStream(ci.getTimingStreamLLVM()); pto.LoopUnrolling = opts.UnrollLoops; diff --git a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp index f9664025f61f1..0f58391c50667 100644 --- a/llvm/examples/Kaleidoscope/Chapter4/toy.cpp +++ b/llvm/examples/Kaleidoscope/Chapter4/toy.cpp @@ -577,7 +577,7 @@ static void InitializeModuleAndManagers() { ThePIC = std::make_unique<PassInstrumentationCallbacks>(); TheSI = std::make_unique<StandardInstrumentations>(*TheContext, /*DebugLogging*/ true); - TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get()); + TheSI->registerCallbacks(*ThePIC, TheMAM.get()); // Add transform passes. // Do simple "peephole" optimizations and bit-twiddling optzns. diff --git a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp index eae06d9f57467..7117eaf4982b0 100644 --- a/llvm/examples/Kaleidoscope/Chapter5/toy.cpp +++ b/llvm/examples/Kaleidoscope/Chapter5/toy.cpp @@ -851,7 +851,7 @@ static void InitializeModuleAndManagers() { ThePIC = std::make_unique<PassInstrumentationCallbacks>(); TheSI = std::make_unique<StandardInstrumentations>(*TheContext, /*DebugLogging*/ true); - TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get()); + TheSI->registerCallbacks(*ThePIC, TheMAM.get()); // Add transform passes. // Do simple "peephole" optimizations and bit-twiddling optzns. 
diff --git a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp index 30ad79ef2fc58..cb7b6cc8651c1 100644 --- a/llvm/examples/Kaleidoscope/Chapter6/toy.cpp +++ b/llvm/examples/Kaleidoscope/Chapter6/toy.cpp @@ -970,7 +970,7 @@ static void InitializeModuleAndManagers() { ThePIC = std::make_unique<PassInstrumentationCallbacks>(); TheSI = std::make_unique<StandardInstrumentations>(*TheContext, /*DebugLogging*/ true); - TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get()); + TheSI->registerCallbacks(*ThePIC, TheMAM.get()); // Add transform passes. // Do simple "peephole" optimizations and bit-twiddling optzns. diff --git a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp index 4a39bc33c5591..91b7191a07c6f 100644 --- a/llvm/examples/Kaleidoscope/Chapter7/toy.cpp +++ b/llvm/examples/Kaleidoscope/Chapter7/toy.cpp @@ -1139,7 +1139,7 @@ static void InitializeModuleAndManagers() { ThePIC = std::make_unique<PassInstrumentationCallbacks>(); TheSI = std::make_unique<StandardInstrumentations>(*TheContext, /*DebugLogging*/ true); - TheSI->registerCallbacks(*ThePIC, TheMAM.get(), TheFAM.get()); + TheSI->registerCallbacks(*ThePIC, TheMAM.get()); // Add transform passes. // Promote allocas to registers. diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h index 988fcb93b2357..65934c93ba614 100644 --- a/llvm/include/llvm/Passes/StandardInstrumentations.h +++ b/llvm/include/llvm/Passes/StandardInstrumentations.h @@ -476,8 +476,7 @@ class VerifyInstrumentation { public: VerifyInstrumentation(bool DebugLogging) : DebugLogging(DebugLogging) {} void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM, - FunctionAnalysisManager *FAM); + ModuleAnalysisManager *MAM); }; /// This class implements --time-trace functionality for new pass manager. @@ -622,8 +621,7 @@ class StandardInstrumentations { // Register all the standard instrumentation callbacks. If \p FAM is nullptr // then PreservedCFGChecker is not enabled. 
void registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM, - FunctionAnalysisManager *FAM); + ModuleAnalysisManager *MAM); TimePassesHandler &getTimePasses() { return TimePasses; } }; diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp index 475e7cf45371b..1c764a0188eda 100644 --- a/llvm/lib/LTO/LTOBackend.cpp +++ b/llvm/lib/LTO/LTOBackend.cpp @@ -275,7 +275,7 @@ static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(Mod.getContext(), Conf.DebugPassManager, Conf.VerifyEach); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); PassBuilder PB(TM, Conf.PTO, PGOOpt, &PIC); RegisterPassPlugins(Conf.PassPlugins, PB); diff --git a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp index 369b003df1364..9e7f8187fe49c 100644 --- a/llvm/lib/LTO/ThinLTOCodeGenerator.cpp +++ b/llvm/lib/LTO/ThinLTOCodeGenerator.cpp @@ -245,7 +245,7 @@ static void optimizeModule(Module &TheModule, TargetMachine &TM, PassInstrumentationCallbacks PIC; StandardInstrumentations SI(TheModule.getContext(), DebugPassManager); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); PipelineTuningOptions PTO; PTO.LoopVectorization = true; PTO.SLPVectorization = true; diff --git a/llvm/lib/Passes/PassBuilderBindings.cpp b/llvm/lib/Passes/PassBuilderBindings.cpp index f0e1abb8cebc4..933fe89e53a94 100644 --- a/llvm/lib/Passes/PassBuilderBindings.cpp +++ b/llvm/lib/Passes/PassBuilderBindings.cpp @@ -76,7 +76,7 @@ static LLVMErrorRef runPasses(Module *Mod, Function *Fun, const char *Passes, PB.crossRegisterProxies(LAM, FAM, CGAM, MAM); StandardInstrumentations SI(Mod->getContext(), Debug, VerifyEach); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); // Run the pipeline. 
if (Fun) { diff --git a/llvm/lib/Passes/StandardInstrumentations.cpp b/llvm/lib/Passes/StandardInstrumentations.cpp index 076df47d5b15d..dc1dd5d9c7f4c 100644 --- a/llvm/lib/Passes/StandardInstrumentations.cpp +++ b/llvm/lib/Passes/StandardInstrumentations.cpp @@ -45,7 +45,6 @@ #include "llvm/Support/Signals.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Support/xxhash.h" -//#include "llvm/Target/TargetVerifier.h" #include #include #include @@ -62,8 +61,6 @@ static cl::opt<bool> VerifyAnalysisInvalidation("verify-analysis-invalidation", #endif ); -static cl::opt<bool> VerifyTargetEach("verify-tgt-each"); - // An option that supports the -print-changed option. See // the description for -print-changed for an explanation of the use // of this option. Note that this option has no effect without -print-changed. @@ -1457,10 +1454,9 @@ void PreservedCFGCheckerInstrumentation::registerCallbacks( } void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, - ModuleAnalysisManager *MAM, - FunctionAnalysisManager *FAM) { + ModuleAnalysisManager *MAM) { PIC.registerAfterPassCallback( - [this, MAM, FAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { + [this, MAM](StringRef P, Any IR, const PreservedAnalyses &PassPA) { if (isIgnored(P) || P == "VerifierPass") return; const auto *F = unwrapIR<Function>(IR); @@ -1477,15 +1473,6 @@ void VerifyInstrumentation::registerCallbacks(PassInstrumentationCallbacks &PIC, report_fatal_error(formatv("Broken function found after pass " "\"{0}\", compilation aborted!", P)); - - if (VerifyTargetEach && FAM) { - //TargetVerify TV(const_cast<Module *>(F->getParent())); - //TV.run(*const_cast<Function *>(F), *FAM); - /*if (!TV.IsValid) - report_fatal_error(formatv("Broken function found after pass " - "\"{0}\", compilation aborted!", - P));*/ - } } else { const auto *M = unwrapIR<Module>(IR); if (!M) { @@ -2525,7 +2512,7 @@ void PrintCrashIRInstrumentation::registerCallbacks( } void StandardInstrumentations::registerCallbacks( - PassInstrumentationCallbacks 
&PIC, ModuleAnalysisManager *MAM, FunctionAnalysisManager *FAM) { + PassInstrumentationCallbacks &PIC, ModuleAnalysisManager *MAM) { PrintIR.registerCallbacks(PIC); PrintPass.registerCallbacks(PIC); TimePasses.registerCallbacks(PIC); @@ -2534,7 +2521,7 @@ void StandardInstrumentations::registerCallbacks( PrintChangedIR.registerCallbacks(PIC); PseudoProbeVerification.registerCallbacks(PIC); if (VerifyEach) - Verify.registerCallbacks(PIC, MAM, FAM); + Verify.registerCallbacks(PIC, MAM); PrintChangedDiff.registerCallbacks(PIC); WebsiteChangeReporter.registerCallbacks(PIC); ChangeTester.registerCallbacks(PIC); diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp index 4b95977a10c5f..863a555798dab 100644 --- a/llvm/tools/llc/NewPMDriver.cpp +++ b/llvm/tools/llc/NewPMDriver.cpp @@ -117,8 +117,6 @@ int llvm::compileModuleWithNewPM( VK == VerifierKind::EachPass); registerCodeGenCallback(PIC, *Target); - ModulePassManager MPM; - FunctionPassManager FPM; MachineFunctionAnalysisManager MFAM; LoopAnalysisManager LAM; FunctionAnalysisManager FAM; @@ -133,11 +131,14 @@ int llvm::compileModuleWithNewPM( if (VerifyTarget) PB.registerVerifierPasses(MPM, FPM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); FAM.registerPass([&] { return TargetLibraryAnalysis(TLII); }); MAM.registerPass([&] { return MachineModuleAnalysis(MMI); }); + ModulePassManager MPM; + FunctionPassManager FPM; + if (!PassPipeline.empty()) { // Construct a custom pass pipeline that starts after instruction // selection. 
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp index d832dcdff4ad0..50f4e56bb6af6 100644 --- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp +++ b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp @@ -148,7 +148,7 @@ int main(int argc, char **argv) { PB.registerMachineFunctionAnalyses(MFAM); PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); Triple TT(M->getTargetTriple()); if (!NoLint) diff --git a/llvm/tools/opt/NewPMDriver.cpp b/llvm/tools/opt/NewPMDriver.cpp index a8977d80bdf44..7d168a6ceb17c 100644 --- a/llvm/tools/opt/NewPMDriver.cpp +++ b/llvm/tools/opt/NewPMDriver.cpp @@ -423,7 +423,7 @@ bool llvm::runPassPipeline( PrintPassOpts.SkipAnalyses = DebugPM == DebugLogging::Quiet; StandardInstrumentations SI(M.getContext(), DebugPM != DebugLogging::None, VK == VerifierKind::EachPass, PrintPassOpts); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); DebugifyEachInstrumentation Debugify; DebugifyStatsMap DIStatsMap; DebugInfoPerPass DebugInfoBeforePass; diff --git a/llvm/unittests/IR/PassManagerTest.cpp b/llvm/unittests/IR/PassManagerTest.cpp index bb4db6120035f..a6487169224c2 100644 --- a/llvm/unittests/IR/PassManagerTest.cpp +++ b/llvm/unittests/IR/PassManagerTest.cpp @@ -828,7 +828,7 @@ TEST_F(PassManagerTest, FunctionPassCFGChecker) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true); - SI.registerCallbacks(PIC, &MAM, &FAM); + SI.registerCallbacks(PIC, &MAM); MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); }); FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); }); @@ -877,7 +877,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerInvalidateAnalysis) { FunctionPassManager FPM; PassInstrumentationCallbacks PIC; 
   StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); });
   MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });
   FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });
@@ -945,7 +945,7 @@ TEST_F(PassManagerTest, FunctionPassCFGCheckerWrapped) {
   FunctionPassManager FPM;
   PassInstrumentationCallbacks PIC;
   StandardInstrumentations SI(M->getContext(), /*DebugLogging*/ true);
-  SI.registerCallbacks(PIC, &MAM, &FAM);
+  SI.registerCallbacks(PIC, &MAM);
   MAM.registerPass([&] { return FunctionAnalysisManagerModuleProxy(FAM); });
   MAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });
   FAM.registerPass([&] { return PassInstrumentationAnalysis(&PIC); });

>From b43cec12bbfc6071d4a99e75aad4273bab4e3182 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 14:44:28 -0400
Subject: [PATCH 22/31] Remove references to registry

---
 llvm/include/llvm/Passes/PassBuilder.h | 21 -------------------
 .../llvm/Passes/StandardInstrumentations.h | 2 +-
 .../llvm/Passes/TargetPassRegistry.inc | 12 -----------
 llvm/lib/Passes/PassBuilder.cpp | 7 -------
 llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 11 ----------
 llvm/tools/llc/NewPMDriver.cpp | 5 -----
 llvm/tools/llc/llc.cpp | 2 --
 7 files changed, 1 insertion(+), 59 deletions(-)

diff --git a/llvm/include/llvm/Passes/PassBuilder.h b/llvm/include/llvm/Passes/PassBuilder.h
index 6000769ce723b..51ccaa53447d7 100644
--- a/llvm/include/llvm/Passes/PassBuilder.h
+++ b/llvm/include/llvm/Passes/PassBuilder.h
@@ -172,13 +172,6 @@ class PassBuilder {
   /// additional analyses.
   void registerLoopAnalyses(LoopAnalysisManager &LAM);

-  /// Registers all available verifier passes.
-  ///
-  /// This is an interface that can be used to populate a
-  /// \c ModuleAnalysisManager with all registered loop analyses. Callers can
-  /// still manually register any additional analyses.
-  void registerVerifierPasses(ModulePassManager &PM, FunctionPassManager &);
-
   /// Registers all available machine function analysis passes.
   ///
   /// This is an interface that can be used to populate a \c
@@ -577,15 +570,6 @@ class PassBuilder {
   }
   /// @}}

-  /// Register a callback for parsing an Verifier Name to populate
-  /// the given managers.
-  void registerVerifierCallback(
-      const std::function &C,
-      const std::function &CF) {
-    VerifierCallbacks.push_back(C);
-    FnVerifierCallbacks.push_back(CF);
-  }
-
   /// {{@ Register pipeline parsing callbacks with this pass builder instance.
   /// Using these callbacks, callers can parse both a single pass name, as well
   /// as entire sub-pipelines, and populate the PassManager instance
@@ -857,11 +841,6 @@ class PassBuilder {
   // Callbacks to parse `filter` parameter in register allocation passes
   SmallVector, 2>
       RegClassFilterParsingCallbacks;
-  // Verifier callbacks
-  SmallVector, 2>
-      VerifierCallbacks;
-  SmallVector, 2>
-      FnVerifierCallbacks;
 };

 /// This utility template takes care of adding require<> and invalidate<>
diff --git a/llvm/include/llvm/Passes/StandardInstrumentations.h b/llvm/include/llvm/Passes/StandardInstrumentations.h
index 65934c93ba614..f7a65a88ecf5b 100644
--- a/llvm/include/llvm/Passes/StandardInstrumentations.h
+++ b/llvm/include/llvm/Passes/StandardInstrumentations.h
@@ -621,7 +621,7 @@ class StandardInstrumentations {
   // Register all the standard instrumentation callbacks. If \p FAM is nullptr
   // then PreservedCFGChecker is not enabled.
   void registerCallbacks(PassInstrumentationCallbacks &PIC,
-                         ModuleAnalysisManager *MAM);
+                         ModuleAnalysisManager *MAM = nullptr);

   TimePassesHandler &getTimePasses() { return TimePasses; }
 };
diff --git a/llvm/include/llvm/Passes/TargetPassRegistry.inc b/llvm/include/llvm/Passes/TargetPassRegistry.inc
index 2d04b874cf360..521913cb25a4a 100644
--- a/llvm/include/llvm/Passes/TargetPassRegistry.inc
+++ b/llvm/include/llvm/Passes/TargetPassRegistry.inc
@@ -151,18 +151,6 @@ PB.registerPipelineParsingCallback([=](StringRef Name, FunctionPassManager &PM,
   return false;
 });

-PB.registerVerifierCallback([](ModulePassManager &PM) {
-#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS) PM.addPass(CREATE_PASS)
-#include GET_PASS_REGISTRY
-#undef VERIFIER_MODULE_ANALYSIS
-  return false;
-}, [](FunctionPassManager &FPM) {
-#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS) FPM.addPass(CREATE_PASS)
-#include GET_PASS_REGISTRY
-#undef VERIFIER_FUNCTION_ANALYSIS
-  return false;
-});
-
 #undef ADD_PASS
 #undef ADD_PASS_WITH_PARAMS

diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index e942fed8b6a72..e7057d9a6b625 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -582,13 +582,6 @@ void PassBuilder::registerLoopAnalyses(LoopAnalysisManager &LAM) {
     C(LAM);
 }

-void PassBuilder::registerVerifierPasses(ModulePassManager &MPM, FunctionPassManager &FPM) {
-  for (auto &C : VerifierCallbacks)
-    C(MPM);
-  for (auto &C : FnVerifierCallbacks)
-    C(FPM);
-}
-
 static std::optional>
 parseFunctionPipelineName(StringRef Name) {
   std::pair Params;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index 41e6a399c7239..98a1147ef6d66 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -81,17 +81,6 @@ FUNCTION_ALIAS_ANALYSIS("amdgpu-aa", AMDGPUAA())
 #undef FUNCTION_ALIAS_ANALYSIS
 #undef FUNCTION_ANALYSIS

-#ifndef VERIFIER_MODULE_ANALYSIS
-#define VERIFIER_MODULE_ANALYSIS(NAME, CREATE_PASS)
-#endif
-#ifndef VERIFIER_FUNCTION_ANALYSIS
-#define VERIFIER_FUNCTION_ANALYSIS(NAME, CREATE_PASS)
-#endif
-VERIFIER_MODULE_ANALYSIS("verifier", VerifierPass())
-VERIFIER_FUNCTION_ANALYSIS("amdgpu-tgtverifier", AMDGPUTargetVerifierPass())
-#undef VERIFIER_MODULE_ANALYSIS
-#undef VERIFIER_FUNCTION_ANALYSIS
-
 #ifndef FUNCTION_PASS_WITH_PARAMS
 #define FUNCTION_PASS_WITH_PARAMS(NAME, CLASS, CREATE_PASS, PARSER, PARAMS)
 #endif
diff --git a/llvm/tools/llc/NewPMDriver.cpp b/llvm/tools/llc/NewPMDriver.cpp
index 863a555798dab..fa82689ecf9ae 100644
--- a/llvm/tools/llc/NewPMDriver.cpp
+++ b/llvm/tools/llc/NewPMDriver.cpp
@@ -57,9 +57,6 @@ static cl::opt
     DebugPM("debug-pass-manager", cl::Hidden,
             cl::desc("Print pass management debugging information"));

-static cl::opt VerifyTarget("verify-tgt-new-pm",
-                            cl::desc("Verify the target"));
-
 bool LLCDiagnosticHandler::handleDiagnostics(const DiagnosticInfo &DI) {
   DiagnosticHandler::handleDiagnostics(DI);
   if (DI.getKind() == llvm::DK_SrcMgr) {
@@ -128,8 +125,6 @@ int llvm::compileModuleWithNewPM(
   PB.registerFunctionAnalyses(FAM);
   PB.registerLoopAnalyses(LAM);
   PB.registerMachineFunctionAnalyses(MFAM);
-  if (VerifyTarget)
-    PB.registerVerifierPasses(MPM, FPM);
   PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);

   SI.registerCallbacks(PIC, &MAM);
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index 2e9e4837fe467..140459ba2de21 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -209,8 +209,6 @@ static cl::opt PassPipeline(
 static cl::alias PassPipeline2("p", cl::aliasopt(PassPipeline),
                                cl::desc("Alias for -passes"));

-static cl::opt VerifyTarget("verify-tgt", cl::desc("Verify the target"));
-
 namespace {

 std::vector &getRunPassNames() {

>From b583b3f804758f6b8ca686bf66d59d744fffbe8e Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 19:05:53 -0400
Subject: [PATCH 23/31] Remove int check

---
 .../lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 18 ------------------
 1 file changed, 18 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index c4d303bee6ef8..2ca0bbeb57653 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -25,7 +25,6 @@
 #include "llvm/Support/Debug.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/Function.h"
-//#include "llvm/InitializePasses.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/IntrinsicsAMDGPU.h"
 #include "llvm/IR/Module.h"
@@ -46,15 +45,6 @@ using namespace llvm;

 namespace llvm {

-static bool IsValidInt(const Type *Ty) {
-  return Ty->isIntegerTy(1) ||
-         Ty->isIntegerTy(8) ||
-         Ty->isIntegerTy(16) ||
-         Ty->isIntegerTy(32) ||
-         Ty->isIntegerTy(64) ||
-         Ty->isIntegerTy(128);
-}
-
 static bool isShader(CallingConv::ID CC) {
   switch(CC) {
     case CallingConv::AMDGPU_VS:
@@ -81,14 +71,6 @@ bool AMDGPUTargetVerify::run(Function &F) {

     for (auto &I : BB) {

-      // Ensure integral types are valid: i8, i16, i32, i64, i128
-      if (I.getType()->isIntegerTy())
-        Check(IsValidInt(I.getType()), "Int type is invalid.", &I);
-      for (unsigned i = 0; i < I.getNumOperands(); ++i)
-        if (I.getOperand(i)->getType()->isIntegerTy())
-          Check(IsValidInt(I.getOperand(i)->getType()),
-                "Int type is invalid.", I.getOperand(i));
-
       if (auto *CI = dyn_cast(&I))
       {
         // Ensure no kernel to kernel calls.

>From 2ba9f5d85326b80bd502116a95353d7e9ad4c9bb Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 21:56:28 -0400
Subject: [PATCH 24/31] Remove modifications to Lint/Verifier.

---
 llvm/lib/Analysis/Lint.cpp | 4 +---
 llvm/lib/IR/Verifier.cpp | 20 ++++----------------
 2 files changed, 5 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Analysis/Lint.cpp b/llvm/lib/Analysis/Lint.cpp
index c8e38963e5974..f05e36e2025d4 100644
--- a/llvm/lib/Analysis/Lint.cpp
+++ b/llvm/lib/Analysis/Lint.cpp
@@ -742,11 +742,9 @@ PreservedAnalyses LintPass::run(Function &F, FunctionAnalysisManager &AM) {
   Lint L(Mod, DL, AA, AC, DT, TLI);
   L.visit(F);
   dbgs() << L.MessagesStr.str();
-  if (AbortOnError && !L.MessagesStr.str().empty()) {
+  if (AbortOnError && !L.MessagesStr.str().empty())
     report_fatal_error(
         "linter found errors, aborting. (enabled by abort-on-error)", false);
-    return PreservedAnalyses::none();
-  }
   return PreservedAnalyses::all();
 }

diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 51f6dec53b70f..8afe360d088bc 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -135,10 +135,6 @@ static cl::opt VerifyNoAliasScopeDomination(
     cl::desc("Ensure that llvm.experimental.noalias.scope.decl for identical "
              "scopes are not dominating"));

-static cl::opt
-    VerifyAbortOnError("verifier-abort-on-error", cl::init(false),
-                       cl::desc("In the Verifier pass, abort on errors."));
-
 namespace llvm {

 struct VerifierSupport {
@@ -7800,24 +7796,16 @@ VerifierAnalysis::Result VerifierAnalysis::run(Function &F,

 PreservedAnalyses VerifierPass::run(Module &M, ModuleAnalysisManager &AM) {
   auto Res = AM.getResult(M);
-  if (Res.IRBroken || Res.DebugInfoBroken) {
-    //M.IsValid = false;
-    if (VerifyAbortOnError && FatalErrors)
-      report_fatal_error("Broken module found, compilation aborted!");
-    return PreservedAnalyses::none();
-  }
+  if (FatalErrors && (Res.IRBroken || Res.DebugInfoBroken))
+    report_fatal_error("Broken module found, compilation aborted!");

   return PreservedAnalyses::all();
 }

 PreservedAnalyses VerifierPass::run(Function &F, FunctionAnalysisManager &AM) {
   auto res = AM.getResult(F);
-  if (res.IRBroken) {
-    //F.getParent()->IsValid = false;
-    if (VerifyAbortOnError && FatalErrors)
-      report_fatal_error("Broken function found, compilation aborted!");
-    return PreservedAnalyses::none();
-  }
+  if (res.IRBroken && FatalErrors)
+    report_fatal_error("Broken function found, compilation aborted!");

   return PreservedAnalyses::all();
 }

>From 0c572440b11b571d0431c2c0bfd83132126e096f Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 22:21:47 -0400
Subject: [PATCH 25/31] Remove llvm-tgt-verify tool.

---
 llvm/test/CMakeLists.txt | 1 -
 llvm/test/CodeGen/AMDGPU/tgt-verify.ll | 45 -----
 llvm/test/lit.cfg.py | 1 -
 llvm/tools/llvm-tgt-verify/CMakeLists.txt | 34 ----
 .../tools/llvm-tgt-verify/llvm-tgt-verify.cpp | 171 ------------------
 llvm/utils/gn/secondary/llvm/test/BUILD.gn | 1 -
 .../llvm/tools/llvm-tgt-verify/BUILD.gn | 25 ---
 7 files changed, 278 deletions(-)
 delete mode 100644 llvm/test/CodeGen/AMDGPU/tgt-verify.ll
 delete mode 100644 llvm/tools/llvm-tgt-verify/CMakeLists.txt
 delete mode 100644 llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
 delete mode 100644 llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn

diff --git a/llvm/test/CMakeLists.txt b/llvm/test/CMakeLists.txt
index 10ca9300e7c66..66849002eb470 100644
--- a/llvm/test/CMakeLists.txt
+++ b/llvm/test/CMakeLists.txt
@@ -135,7 +135,6 @@ set(LLVM_TEST_DEPENDS
           llvm-strip
           llvm-symbolizer
           llvm-tblgen
-          llvm-tgt-verify
           llvm-readtapi
           llvm-tli-checker
           llvm-undname
diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
deleted file mode 100644
index 62b220d7d9f49..0000000000000
--- a/llvm/test/CodeGen/AMDGPU/tgt-verify.ll
+++ /dev/null
@@ -1,45 +0,0 @@
-; RUN: not llvm-tgt-verify %s -mtriple=amdgcn |& FileCheck %s
-
-define amdgpu_cs i32 @shader() {
-; CHECK: Shaders must return void
-  ret i32 0
-}
-
-define amdgpu_kernel void @store_const(ptr addrspace(4) %out, i32 %a, i32 %b) {
-; CHECK: Undefined behavior: Write to memory in const addrspace
-; CHECK-NEXT: store i32 %r, ptr addrspace(4) %out, align 4
-  %r = add i32 %a, %b
-  store i32 %r, ptr addrspace(4) %out
-  ret void
-}
-
-define amdgpu_kernel void @kernel_callee(ptr %x) {
-  ret void
-}
-
-define amdgpu_kernel void @kernel_caller(ptr %x) {
-; CHECK: A kernel may not call a kernel
-; CHECK-NEXT: ptr @kernel_caller
-  call amdgpu_kernel void @kernel_callee(ptr %x)
-  ret void
-}
-
-
-; Function Attrs: nounwind
-define i65 @invalid_type(i65 %x) #0 {
-; CHECK: Int type is invalid.
-; CHECK-NEXT: %tmp2 = ashr i65 %x, 64
-entry:
-  %tmp2 = ashr i65 %x, 64
-  ret i65 %tmp2
-}
-
-declare void @llvm.amdgcn.cs.chain.v3i32(ptr, i32, <3 x i32>, <3 x i32>, i32, ...)
-declare amdgpu_cs_chain void @chain_callee(<3 x i32> inreg, <3 x i32>)
-
-define amdgpu_cs void @no_unreachable(<3 x i32> inreg %a, <3 x i32> %b) {
-; CHECK: llvm.amdgcn.cs.chain must be followed by unreachable
-; CHECK-NEXT: call void (ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.p0.i32.v3i32.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0)
-  call void(ptr, i32, <3 x i32>, <3 x i32>, i32, ...) @llvm.amdgcn.cs.chain.v3i32(ptr @chain_callee, i32 -1, <3 x i32> inreg %a, <3 x i32> %b, i32 0)
-  ret void
-}
diff --git a/llvm/test/lit.cfg.py b/llvm/test/lit.cfg.py
index 8620f2a7014b5..aad7a088551b2 100644
--- a/llvm/test/lit.cfg.py
+++ b/llvm/test/lit.cfg.py
@@ -227,7 +227,6 @@ def get_asan_rtlib():
         "llvm-strings",
         "llvm-strip",
         "llvm-tblgen",
-        "llvm-tgt-verify",
         "llvm-readtapi",
         "llvm-undname",
         "llvm-windres",
diff --git a/llvm/tools/llvm-tgt-verify/CMakeLists.txt b/llvm/tools/llvm-tgt-verify/CMakeLists.txt
deleted file mode 100644
index fe47c85e6cdce..0000000000000
--- a/llvm/tools/llvm-tgt-verify/CMakeLists.txt
+++ /dev/null
@@ -1,34 +0,0 @@
-set(LLVM_LINK_COMPONENTS
-  AllTargetsAsmParsers
-  AllTargetsCodeGens
-  AllTargetsDescs
-  AllTargetsInfos
-  Analysis
-  AsmPrinter
-  CodeGen
-  CodeGenTypes
-  Core
-  IRPrinter
-  IRReader
-  MC
-  MIRParser
-  Passes
-  Remarks
-  ScalarOpts
-  SelectionDAG
-  Support
-  Target
-  TargetParser
-  TransformUtils
-  Vectorize
-  )
-
-add_llvm_tool(llvm-tgt-verify
-  llvm-tgt-verify.cpp
-
-  DEPENDS
-  intrinsics_gen
-  SUPPORT_PLUGINS
-  )
-
-export_executable_symbols_for_plugins(llc)
diff --git a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp b/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
deleted file mode 100644
index 50f4e56bb6af6..0000000000000
--- a/llvm/tools/llvm-tgt-verify/llvm-tgt-verify.cpp
+++ /dev/null
@@ -1,171 +0,0 @@
-//===--- llvm-tgt-verify.cpp - Target Verifier ----------------- ----------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-//
-// Tool to verify a target.
-//
-//===----------------------------------------------------------------------===//
-
-#include "llvm/InitializePasses.h"
-#include "llvm/ADT/StringRef.h"
-#include "llvm/Analysis/Lint.h"
-#include "llvm/Analysis/TargetLibraryInfo.h"
-#include "llvm/Bitcode/BitcodeReader.h"
-#include "llvm/Bitcode/BitcodeWriter.h"
-#include "llvm/CodeGen/CommandFlags.h"
-#include "llvm/CodeGen/TargetPassConfig.h"
-#include "llvm/IR/Constants.h"
-#include "llvm/IR/LLVMContext.h"
-#include "llvm/IR/LegacyPassManager.h"
-#include "llvm/IR/Module.h"
-#include "llvm/IR/Verifier.h"
-#include "llvm/IRReader/IRReader.h"
-#include "llvm/Passes/PassBuilder.h"
-#include "llvm/Passes/StandardInstrumentations.h"
-#include "llvm/MC/TargetRegistry.h"
-#include "llvm/Support/CommandLine.h"
-#include "llvm/Support/DataTypes.h"
-#include "llvm/Support/Debug.h"
-#include "llvm/Support/InitLLVM.h"
-#include "llvm/Support/SourceMgr.h"
-#include "llvm/Support/TargetSelect.h"
-#include "llvm/Target/TargetMachine.h"
-#include "llvm/Target/TargetVerifier.h"
-
-#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
-
-#define DEBUG_TYPE "isel-fuzzer"
-
-using namespace llvm;
-
-static codegen::RegisterCodeGenFlags CGF;
-
-static cl::opt
-InputFilename(cl::Positional, cl::desc(""), cl::init("-"));
-
-static cl::opt
-    StacktraceAbort("stacktrace-abort",
-                    cl::desc("Turn on stacktrace"), cl::init(false));
-
-static cl::opt
-    NoLint("no-lint",
-           cl::desc("Turn off Lint"), cl::init(false));
-
-static cl::opt
-    NoVerify("no-verifier",
-             cl::desc("Turn off Verifier"), cl::init(false));
-
-static cl::opt
-    OptLevel("O",
-             cl::desc("Optimization level. [-O0, -O1, -O2, or -O3] "
-                      "(default = '-O2')"),
-             cl::Prefix, cl::init('2'));
-
-static cl::opt
-    TargetTriple("mtriple", cl::desc("Override target triple for module"));
-
-static std::unique_ptr TM;
-
-static void handleLLVMFatalError(void *, const char *Message, bool) {
-  if (StacktraceAbort) {
-    dbgs() << "LLVM ERROR: " << Message << "\n"
-           << "Aborting.\n";
-    abort();
-  }
-}
-
-int main(int argc, char **argv) {
-  StringRef ExecName = argv[0];
-  InitLLVM X(argc, argv);
-
-  InitializeAllTargets();
-  InitializeAllTargetMCs();
-  InitializeAllAsmPrinters();
-  InitializeAllAsmParsers();
-
-  PassRegistry *Registry = PassRegistry::getPassRegistry();
-  initializeCore(*Registry);
-  initializeCodeGen(*Registry);
-  initializeAnalysis(*Registry);
-  initializeTarget(*Registry);
-
-  cl::ParseCommandLineOptions(argc, argv);
-
-  if (TargetTriple.empty()) {
-    errs() << ExecName << ": -mtriple must be specified\n";
-    exit(1);
-  }
-
-  CodeGenOptLevel OLvl;
-  if (auto Level = CodeGenOpt::parseLevel(OptLevel)) {
-    OLvl = *Level;
-  } else {
-    errs() << ExecName << ": invalid optimization level.\n";
-    return 1;
-  }
-  ExitOnError ExitOnErr(std::string(ExecName) + ": error:");
-  TM = ExitOnErr(codegen::createTargetMachineForTriple(
-      Triple::normalize(TargetTriple), OLvl));
-  assert(TM && "Could not allocate target machine!");
-
-  // Make sure we print the summary and the current unit when LLVM errors out.
-  install_fatal_error_handler(handleLLVMFatalError, nullptr);
-
-  LLVMContext Context;
-  SMDiagnostic Err;
-  std::unique_ptr M = parseIRFile(InputFilename, Err, Context);
-  if (!M) {
-    errs() << "Invalid mod\n";
-    return 1;
-  }
-  auto S = Triple::normalize(TargetTriple);
-  M->setTargetTriple(Triple(S));
-
-  PassInstrumentationCallbacks PIC;
-  StandardInstrumentations SI(Context, false/*debug PM*/,
-                              false);
-  registerCodeGenCallback(PIC, *TM);
-
-  ModulePassManager MPM;
-  FunctionPassManager FPM;
-  //TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple()));
-
-  MachineFunctionAnalysisManager MFAM;
-  LoopAnalysisManager LAM;
-  FunctionAnalysisManager FAM;
-  CGSCCAnalysisManager CGAM;
-  ModuleAnalysisManager MAM;
-  PassBuilder PB(TM.get(), PipelineTuningOptions(), std::nullopt, &PIC);
-  PB.registerModuleAnalyses(MAM);
-  //PB.registerVerifierPasses(MPM, FPM);
-  PB.registerCGSCCAnalyses(CGAM);
-  PB.registerFunctionAnalyses(FAM);
-  PB.registerLoopAnalyses(LAM);
-  PB.registerMachineFunctionAnalyses(MFAM);
-  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM, &MFAM);
-
-  SI.registerCallbacks(PIC, &MAM);
-
-  Triple TT(M->getTargetTriple());
-  if (!NoLint)
-    FPM.addPass(LintPass(false));
-  if (!NoVerify)
-    MPM.addPass(VerifierPass());
-  if (TT.isAMDGPU())
-    FPM.addPass(AMDGPUTargetVerifierPass());
-  else if (false) {} // ...
-  MPM.addPass(createModuleToFunctionPassAdaptor(std::move(FPM)));
-
-  auto PA = MPM.run(*M, MAM);
-  {
-    auto PAC = PA.getChecker();
-    if (!PAC.preserved())
-      return 1;
-  }
-
-  return 0;
-}
diff --git a/llvm/utils/gn/secondary/llvm/test/BUILD.gn b/llvm/utils/gn/secondary/llvm/test/BUILD.gn
index 157e7991c52a8..228642667b41d 100644
--- a/llvm/utils/gn/secondary/llvm/test/BUILD.gn
+++ b/llvm/utils/gn/secondary/llvm/test/BUILD.gn
@@ -319,7 +319,6 @@ group("test") {
     "//llvm/tools/llvm-strings",
     "//llvm/tools/llvm-symbolizer:symlinks",
     "//llvm/tools/llvm-tli-checker",
-    "//llvm/tools/llvm-tgt-verify",
     "//llvm/tools/llvm-undname",
     "//llvm/tools/llvm-xray",
     "//llvm/tools/lto",
diff --git a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn b/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn
deleted file mode 100644
index b751bafc5052c..0000000000000
--- a/llvm/utils/gn/secondary/llvm/tools/llvm-tgt-verify/BUILD.gn
+++ /dev/null
@@ -1,25 +0,0 @@
-import("//llvm/utils/TableGen/tablegen.gni")
-
-tgtverifier("llvm-tgt-verify") {
-  deps = [
-    "//llvm/lib/Analysis",
-    "//llvm/lib/AsmPrinter",
-    "//llvm/lib/CodeGen",
-    "//llvm/lib/CodeGenTypes",
-    "//llvm/lib/Core",
-    "//llvm/lib/IRPrinter",
-    "//llvm/lib/IRReader",
-    "//llvm/lib/MC",
-    "//llvm/lib/MIRParser",
-    "//llvm/lib/Passes",
-    "//llvm/lib/Remarks",
-    "//llvm/lib/ScalarOpts",
-    "//llvm/lib/SelectionDAG",
-    "//llvm/lib/Support",
-    "//llvm/lib/Target",
-    "//llvm/lib/TargetParser",
-    "//llvm/lib/TransformUtils",
-    "//llvm/lib/Vectorize",
-  ]
-  sources = [ "llvm-tgt-verify.cpp" ]
-}

>From 5fec133fdd54a6864918beb1dfd737a8b8fa25a9 Mon Sep 17 00:00:00 2001
From: jofernau
Date: Wed, 30 Apr 2025 22:23:15 -0400
Subject: [PATCH 26/31] Remove TargetVerifier.cpp

---
 llvm/lib/Passes/CMakeLists.txt | 1 -
 llvm/lib/Target/CMakeLists.txt | 2 --
 llvm/lib/Target/TargetVerifier.cpp | 36 ------------------------------
 3 files changed, 39 deletions(-)
 delete mode 100644 llvm/lib/Target/TargetVerifier.cpp

diff --git a/llvm/lib/Passes/CMakeLists.txt b/llvm/lib/Passes/CMakeLists.txt
index 9c348cb89a8c5..6425f4934b210 100644
--- a/llvm/lib/Passes/CMakeLists.txt
+++ b/llvm/lib/Passes/CMakeLists.txt
@@ -29,7 +29,6 @@ add_llvm_component_library(LLVMPasses
   Scalar
   Support
   Target
-  #TargetParser
   TransformUtils
   Vectorize
   Instrumentation
diff --git a/llvm/lib/Target/CMakeLists.txt b/llvm/lib/Target/CMakeLists.txt
index f2a5d545ce84f..9472288229cac 100644
--- a/llvm/lib/Target/CMakeLists.txt
+++ b/llvm/lib/Target/CMakeLists.txt
@@ -7,8 +7,6 @@ add_llvm_component_library(LLVMTarget
   TargetLoweringObjectFile.cpp
   TargetMachine.cpp
   TargetMachineC.cpp
-  TargetVerifier.cpp
-  AMDGPU/AMDGPUTargetVerifier.cpp

   ADDITIONAL_HEADER_DIRS
   ${LLVM_MAIN_INCLUDE_DIR}/llvm/Target
diff --git a/llvm/lib/Target/TargetVerifier.cpp b/llvm/lib/Target/TargetVerifier.cpp
deleted file mode 100644
index c63ae2a2c5daf..0000000000000
--- a/llvm/lib/Target/TargetVerifier.cpp
+++ /dev/null
@@ -1,36 +0,0 @@
-//===-- TargetVerifier.cpp - LLVM IR Target Verifier ----------------*- C++ -*-===//
-////
-///// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-///// See https://llvm.org/LICENSE.txt for license information.
-///// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-/////
-/////===----------------------------------------------------------------------===//
-/////
-///// This file defines target verifier interfaces that can be used for some
-///// validation of input to the system, and for checking that transformations
-///// haven't done something bad. In contrast to the Verifier or Lint, the
-///// TargetVerifier looks for constructions invalid to a particular target
-///// machine.
-/////
-///// To see what specifically is checked, look at TargetVerifier.cpp or an
-///// individual backend's TargetVerifier.
-/////
-/////===----------------------------------------------------------------------===//
-
-#include "llvm/Target/TargetVerifier.h"
-#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
-
-#include "llvm/InitializePasses.h"
-#include "llvm/Analysis/UniformityAnalysis.h"
-#include "llvm/Analysis/PostDominators.h"
-#include "llvm/Support/Debug.h"
-#include "llvm/IR/Dominators.h"
-#include "llvm/IR/Function.h"
-#include "llvm/IR/IntrinsicInst.h"
-#include "llvm/IR/IntrinsicsAMDGPU.h"
-#include "llvm/IR/Module.h"
-#include "llvm/IR/Value.h"
-
-namespace llvm {
-
-} // namespace llvm

>From c176578fd5dd63e6097bffac4f5ff51ba8cb76ce Mon Sep 17 00:00:00 2001
From: jofernau
Date: Thu, 1 May 2025 03:02:30 -0400
Subject: [PATCH 27/31] clang-format

---
 llvm/include/llvm/Target/TargetVerifier.h | 10 +-
 .../TargetVerify/AMDGPUTargetVerifier.h | 44 ++++-----
 .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 98 ++++++++++---------
 3 files changed, 77 insertions(+), 75 deletions(-)

diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h
index 1d12eb55bbf0a..3f8c710a88768 100644
--- a/llvm/include/llvm/Target/TargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerifier.h
@@ -1,4 +1,4 @@
-//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier ---*- C++ -*-===//
+//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier --*- C++ -*-===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -20,8 +20,8 @@
 #ifndef LLVM_TARGET_VERIFIER_H
 #define LLVM_TARGET_VERIFIER_H

-#include "llvm/IR/PassManager.h"
 #include "llvm/IR/Module.h"
+#include "llvm/IR/PassManager.h"
 #include "llvm/TargetParser/Triple.h"

 namespace llvm {
@@ -59,10 +59,11 @@ class TargetVerify {
   /// This calls the Message-only version so that the above is easier to set
   /// a breakpoint on.
   template
-  void CheckFailed(const Twine &Message, const T1 &V1, const Ts &... Vs) {
+  void CheckFailed(const Twine &Message, const T1 &V1, const Ts &...Vs) {
     CheckFailed(Message);
     WriteValues({V1, Vs...});
   }
+
 public:
   Module *Mod;
   Triple TT;
@@ -73,8 +74,7 @@ class TargetVerify {
   bool IsValid = true;

   TargetVerify(Module *Mod)
-      : Mod(Mod), TT(Mod->getTargetTriple()),
-        MessagesStr(Messages) {}
+      : Mod(Mod), TT(Mod->getTargetTriple()), MessagesStr(Messages) {}

   virtual bool run(Function &F) = 0;
 };
diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
index 49bcbc8849e3c..5b8d9ec259b63 100644
--- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
+++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h
@@ -1,29 +1,29 @@
-//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU ---*- C++ -*-===//
-////
-//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-//// See https://llvm.org/LICENSE.txt for license information.
-//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-////
-////===----------------------------------------------------------------------===//
-////
-//// This file defines target verifier interfaces that can be used for some
-//// validation of input to the system, and for checking that transformations
-//// haven't done something bad. In contrast to the Verifier or Lint, the
-//// TargetVerifier looks for constructions invalid to a particular target
-//// machine.
-////
-//// To see what specifically is checked, look at an individual backend's
-//// TargetVerifier.
-////
-////===----------------------------------------------------------------------===//
+//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU -- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines target verifier interfaces that can be used for some
+// validation of input to the system, and for checking that transformations
+// haven't done something bad. In contrast to the Verifier or Lint, the
+// TargetVerifier looks for constructions invalid to a particular target
+// machine.
+//
+// To see what specifically is checked, look at an individual backend's
+// TargetVerifier.
+//
+//===----------------------------------------------------------------------===//

 #ifndef LLVM_AMDGPU_TARGET_VERIFIER_H
 #define LLVM_AMDGPU_TARGET_VERIFIER_H

 #include "llvm/Target/TargetVerifier.h"

-#include "llvm/Analysis/UniformityAnalysis.h"
 #include "llvm/Analysis/PostDominators.h"
+#include "llvm/Analysis/UniformityAnalysis.h"
 #include "llvm/IR/Dominators.h"

 namespace llvm {
@@ -43,10 +43,10 @@ class AMDGPUTargetVerify : public TargetVerify {
   PostDominatorTree *PDT = nullptr;
   UniformityInfo *UA = nullptr;

-  AMDGPUTargetVerify(Module *Mod)
-      : TargetVerify(Mod), Mod(Mod) {}
+  AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod), Mod(Mod) {}

-  AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA)
+  AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT,
+                     UniformityInfo *UA)
       : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {}

   bool run(Function &F) override;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
index 2ca0bbeb57653..eb22eb2177f7f 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp
@@ -1,34 +1,34 @@
-//===-- AMDGPUTargetVerifier.cpp - AMDGPU -------------------------*- C++ -*-===//
-////
-//// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-//// See https://llvm.org/LICENSE.txt for license information.
-//// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-////
-////===----------------------------------------------------------------------===//
-////
-//// This file defines target verifier interfaces that can be used for some
-//// validation of input to the system, and for checking that transformations
-//// haven't done something bad. In contrast to the Verifier or Lint, the
-//// TargetVerifier looks for constructions invalid to a particular target
-//// machine.
-////
-//// To see what specifically is checked, look at an individual backend's
-//// TargetVerifier.
-////
-////===----------------------------------------------------------------------===//
+//===-- AMDGPUTargetVerifier.cpp - AMDGPU -----------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines target verifier interfaces that can be used for some
+// validation of input to the system, and for checking that transformations
+// haven't done something bad. In contrast to the Verifier or Lint, the
+// TargetVerifier looks for constructions invalid to a particular target
+// machine.
+//
+// To see what specifically is checked, look at an individual backend's
+// TargetVerifier.
+//
+//===----------------------------------------------------------------------===//

-#include "AMDGPU.h"
 #include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h"
+#include "AMDGPU.h"

-#include "llvm/Analysis/UniformityAnalysis.h"
 #include "llvm/Analysis/PostDominators.h"
-#include "llvm/Support/Debug.h"
+#include "llvm/Analysis/UniformityAnalysis.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/IntrinsicsAMDGPU.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Value.h"
+#include "llvm/Support/Debug.h"

 #include "llvm/Support/raw_ostream.h"

@@ -39,53 +39,52 @@ using namespace llvm;
   do { \
     if (!(C)) { \
       TargetVerify::CheckFailed(__VA_ARGS__); \
-      return false; \
     } \
   } while (false)

 namespace llvm {

 static bool isShader(CallingConv::ID CC) {
-  switch(CC) {
-    case CallingConv::AMDGPU_VS:
-    case CallingConv::AMDGPU_LS:
-    case CallingConv::AMDGPU_HS:
-    case CallingConv::AMDGPU_ES:
-    case CallingConv::AMDGPU_GS:
-    case CallingConv::AMDGPU_PS:
-    case CallingConv::AMDGPU_CS_Chain:
-    case CallingConv::AMDGPU_CS_ChainPreserve:
-    case CallingConv::AMDGPU_CS:
-      return true;
-    default:
-      return false;
+  switch (CC) {
+  case CallingConv::AMDGPU_VS:
+  case CallingConv::AMDGPU_LS:
+  case CallingConv::AMDGPU_HS:
+  case CallingConv::AMDGPU_ES:
+  case CallingConv::AMDGPU_GS:
+  case CallingConv::AMDGPU_PS:
+  case CallingConv::AMDGPU_CS_Chain:
+  case CallingConv::AMDGPU_CS_ChainPreserve:
+  case CallingConv::AMDGPU_CS:
+    return true;
+  default:
+    return false;
   }
 }

 bool AMDGPUTargetVerify::run(Function &F) {
   // Ensure shader calling convention returns void
   if (isShader(F.getCallingConv()))
-    Check(F.getReturnType() == Type::getVoidTy(F.getContext()), "Shaders must return void");
+    Check(F.getReturnType() == Type::getVoidTy(F.getContext()),
+          "Shaders must return void");

   for (auto &BB : F) {

     for (auto &I : BB) {

-      if (auto *CI = dyn_cast(&I))
-      {
+      if (auto *CI = dyn_cast(&I)) {
         // Ensure no kernel to kernel calls.
CallingConv::ID CalleeCC = CI->getCallingConv(); - if (CalleeCC == CallingConv::AMDGPU_KERNEL) - { - CallingConv::ID CallerCC = CI->getParent()->getParent()->getCallingConv(); + if (CalleeCC == CallingConv::AMDGPU_KERNEL) { + CallingConv::ID CallerCC = + CI->getParent()->getParent()->getCallingConv(); Check(CallerCC != CallingConv::AMDGPU_KERNEL, - "A kernel may not call a kernel", CI->getParent()->getParent()); + "A kernel may not call a kernel", CI->getParent()->getParent()); } // Ensure chain intrinsics are followed by unreachables. if (CI->getIntrinsicID() == Intrinsic::amdgcn_cs_chain) Check(isa_and_present(CI->getNextNode()), - "llvm.amdgcn.cs.chain must be followed by unreachable", CI); + "llvm.amdgcn.cs.chain must be followed by unreachable", CI); } } } @@ -98,7 +97,8 @@ bool AMDGPUTargetVerify::run(Function &F) { return true; } -PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { +PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, + FunctionAnalysisManager &AM) { auto *Mod = F.getParent(); auto UA = &AM.getResult(F); @@ -122,9 +122,10 @@ struct AMDGPUTargetVerifierLegacyPass : public FunctionPass { std::unique_ptr TV; bool FatalErrors = false; - AMDGPUTargetVerifierLegacyPass(bool FatalErrors) : FunctionPass(ID), - FatalErrors(FatalErrors) { - initializeAMDGPUTargetVerifierLegacyPassPass(*PassRegistry::getPassRegistry()); + AMDGPUTargetVerifierLegacyPass(bool FatalErrors) + : FunctionPass(ID), FatalErrors(FatalErrors) { + initializeAMDGPUTargetVerifierLegacyPassPass( + *PassRegistry::getPassRegistry()); } bool doInitialization(Module &M) override { @@ -167,4 +168,5 @@ FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors) { return new AMDGPUTargetVerifierLegacyPass(FatalErrors); } } // namespace llvm -INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", "AMDGPU Target Verifier", false, false) +INITIALIZE_PASS(AMDGPUTargetVerifierLegacyPass, "amdgpu-tgtverifier", + "AMDGPU 
Target Verifier", false, false) >From 97d43c16df784fde92cbd0f22324e3e4ecfb8374 Mon Sep 17 00:00:00 2001 From: jofernau Date: Thu, 1 May 2025 03:49:20 -0400 Subject: [PATCH 28/31] Add VerifyTarget option --- .../llvm/Target/TargetVerify/AMDGPUTargetVerifier.h | 6 ++---- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 9 +++++++-- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h index 5b8d9ec259b63..8ed7dd7ea2f69 100644 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h @@ -37,17 +37,15 @@ class AMDGPUTargetVerifierPass : public TargetVerifierPass { class AMDGPUTargetVerify : public TargetVerify { public: - Module *Mod; - DominatorTree *DT = nullptr; PostDominatorTree *PDT = nullptr; UniformityInfo *UA = nullptr; - AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod), Mod(Mod) {} + AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod) {} AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, UniformityInfo *UA) - : TargetVerify(Mod), Mod(Mod), DT(DT), PDT(PDT), UA(UA) {} + : TargetVerify(Mod), DT(DT), PDT(PDT), UA(UA) {} bool run(Function &F) override; }; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 42d6764eacda9..582090c3c411e 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -482,6 +482,9 @@ static cl::opt HasClosedWorldAssumption( cl::desc("Whether has closed-world assumption at link time"), cl::init(false), cl::Hidden); +static cl::opt VerifyTarget("verify-tgt", + cl::desc("Enable the target verifier")); + extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAMDGPUTarget() { // Register the target RegisterTargetMachine X(getTheR600Target()); @@ -1377,7 +1380,8 @@ bool 
AMDGPUPassConfig::addGCPasses() { //===----------------------------------------------------------------------===// bool GCNPassConfig::addPreISel() { - addPass(createAMDGPUTargetVerifierLegacyPass(false)); + if (VerifyTarget) + addPass(createAMDGPUTargetVerifierLegacyPass(false)); AMDGPUPassConfig::addPreISel(); if (TM->getOptLevel() > CodeGenOptLevel::None) @@ -1977,7 +1981,8 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder( } void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { - addPass(AMDGPUTargetVerifierPass()); + if (VerifyTarget) + addPass(AMDGPUTargetVerifierPass()); if (RemoveIncompatibleFunctions && TM.getTargetTriple().isAMDGCN()) addPass(AMDGPURemoveIncompatibleFunctionsPass(TM)); >From ea2f05032ced1046ec5a5615d5e1d9e6a411da2e Mon Sep 17 00:00:00 2001 From: Joey F Date: Wed, 7 May 2025 21:59:43 -0400 Subject: [PATCH 29/31] Remove AMDGPUTargetVerifier.h --- .../TargetVerify/AMDGPUTargetVerifier.h | 55 ------------------- llvm/lib/Target/AMDGPU/AMDGPU.h | 5 ++ .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 1 - .../Target/AMDGPU/AMDGPUTargetVerifier.cpp | 16 +++++- 4 files changed, 20 insertions(+), 57 deletions(-) delete mode 100644 llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h diff --git a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h b/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h deleted file mode 100644 index 8ed7dd7ea2f69..0000000000000 --- a/llvm/include/llvm/Target/TargetVerify/AMDGPUTargetVerifier.h +++ /dev/null @@ -1,55 +0,0 @@ -//===-- llvm/Target/TargetVerify/AMDGPUTargetVerifier.h - AMDGPU -- C++ -*-===// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. 
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===----------------------------------------------------------------------===// -// -// This file defines target verifier interfaces that can be used for some -// validation of input to the system, and for checking that transformations -// haven't done something bad. In contrast to the Verifier or Lint, the -// TargetVerifier looks for constructions invalid to a particular target -// machine. -// -// To see what specifically is checked, look at an individual backend's -// TargetVerifier. -// -//===----------------------------------------------------------------------===// - -#ifndef LLVM_AMDGPU_TARGET_VERIFIER_H -#define LLVM_AMDGPU_TARGET_VERIFIER_H - -#include "llvm/Target/TargetVerifier.h" - -#include "llvm/Analysis/PostDominators.h" -#include "llvm/Analysis/UniformityAnalysis.h" -#include "llvm/IR/Dominators.h" - -namespace llvm { - -class Function; - -class AMDGPUTargetVerifierPass : public TargetVerifierPass { -public: - PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) override; -}; - -class AMDGPUTargetVerify : public TargetVerify { -public: - DominatorTree *DT = nullptr; - PostDominatorTree *PDT = nullptr; - UniformityInfo *UA = nullptr; - - AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod) {} - - AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, - UniformityInfo *UA) - : TargetVerify(Mod), DT(DT), PDT(PDT), UA(UA) {} - - bool run(Function &F) override; -}; - -} // namespace llvm - -#endif // LLVM_AMDGPU_TARGET_VERIFIER_H diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h index f69956ba44255..ca273b20e9baf 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPU.h +++ b/llvm/lib/Target/AMDGPU/AMDGPU.h @@ -15,6 +15,7 @@ #include "llvm/Pass.h" #include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/CodeGen.h" +#include "llvm/Target/TargetVerifier.h" namespace llvm { @@ -532,6 +533,10 @@ extern char &AMDGPUWaitSGPRHazardsLegacyID; 
FunctionPass *createAMDGPUTargetVerifierLegacyPass(bool FatalErrors); void initializeAMDGPUTargetVerifierLegacyPassPass(PassRegistry &); +class AMDGPUTargetVerifierPass : public TargetVerifierPass { +public: + PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) override; +}; namespace AMDGPU { enum TargetIndex { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 582090c3c411e..1350492b9cf32 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -90,7 +90,6 @@ #include "llvm/MC/TargetRegistry.h" #include "llvm/Passes/PassBuilder.h" #include "llvm/Support/FormatVariadic.h" -#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "llvm/Transforms/HipStdPar/HipStdPar.h" #include "llvm/Transforms/IPO.h" #include "llvm/Transforms/IPO/AlwaysInliner.h" diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index eb22eb2177f7f..cc96164a1a558 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -17,7 +17,6 @@ // //===----------------------------------------------------------------------===// -#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" #include "AMDGPU.h" #include "llvm/Analysis/PostDominators.h" @@ -44,6 +43,21 @@ using namespace llvm; namespace llvm { +class AMDGPUTargetVerify : public TargetVerify { +public: + DominatorTree *DT = nullptr; + PostDominatorTree *PDT = nullptr; + UniformityInfo *UA = nullptr; + + AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod) {} + + AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, + UniformityInfo *UA) + : TargetVerify(Mod), DT(DT), PDT(PDT), UA(UA) {} + + bool run(Function &F) override; +}; + static bool isShader(CallingConv::ID CC) { switch (CC) { case CallingConv::AMDGPU_VS: >From de38b78d2bd598c6168c6235d408726c68251f4c Mon Sep 
17 00:00:00 2001 From: jofernau Date: Wed, 7 May 2025 22:47:55 -0400 Subject: [PATCH 30/31] Remove analyses. --- .../lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 18 +----------------- 1 file changed, 1 insertion(+), 17 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index cc96164a1a558..d07b5fb668c14 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -19,9 +19,6 @@ #include "AMDGPU.h" -#include "llvm/Analysis/PostDominators.h" -#include "llvm/Analysis/UniformityAnalysis.h" -#include "llvm/IR/Dominators.h" #include "llvm/IR/Function.h" #include "llvm/IR/IntrinsicInst.h" #include "llvm/IR/IntrinsicsAMDGPU.h" @@ -45,16 +42,7 @@ namespace llvm { class AMDGPUTargetVerify : public TargetVerify { public: - DominatorTree *DT = nullptr; - PostDominatorTree *PDT = nullptr; - UniformityInfo *UA = nullptr; - AMDGPUTargetVerify(Module *Mod) : TargetVerify(Mod) {} - - AMDGPUTargetVerify(Module *Mod, DominatorTree *DT, PostDominatorTree *PDT, - UniformityInfo *UA) - : TargetVerify(Mod), DT(DT), PDT(PDT), UA(UA) {} - bool run(Function &F) override; }; @@ -115,11 +103,7 @@ PreservedAnalyses AMDGPUTargetVerifierPass::run(Function &F, FunctionAnalysisManager &AM) { auto *Mod = F.getParent(); - auto UA = &AM.getResult(F); - auto *DT = &AM.getResult(F); - auto *PDT = &AM.getResult(F); - - AMDGPUTargetVerify TV(Mod, DT, PDT, UA); + AMDGPUTargetVerify TV(Mod); TV.run(F); dbgs() << TV.MessagesStr.str(); >From b0b705d480a756632d238384f269806c6f1b80c9 Mon Sep 17 00:00:00 2001 From: jofernau Date: Thu, 8 May 2025 12:59:01 -0400 Subject: [PATCH 31/31] function name style, TargetVerifier comment, option name generality --- llvm/include/llvm/Target/TargetVerifier.h | 14 +++++++------- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +- llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp | 2 +- llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll | 2 +- 
llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll | 2 +- 5 files changed, 11 insertions(+), 11 deletions(-) diff --git a/llvm/include/llvm/Target/TargetVerifier.h b/llvm/include/llvm/Target/TargetVerifier.h index 3f8c710a88768..eff846694ac10 100644 --- a/llvm/include/llvm/Target/TargetVerifier.h +++ b/llvm/include/llvm/Target/TargetVerifier.h @@ -12,8 +12,8 @@ // TargetVerifier looks for constructions invalid to a particular target // machine. // -// To see what specifically is checked, look at TargetVerifier.cpp or an -// individual backend's TargetVerifier. +// To see what specifically is checked, look at an individual backend's +// TargetVerifier. // //===----------------------------------------------------------------------===// @@ -35,7 +35,7 @@ class TargetVerifierPass : public PassInfoMixin { class TargetVerify { protected: - void WriteValues(ArrayRef Vs) { + void writeValues(ArrayRef Vs) { for (const Value *V : Vs) { if (!V) continue; @@ -52,16 +52,16 @@ class TargetVerify { /// /// This provides a nice place to put a breakpoint if you want to see why /// something is not correct. - void CheckFailed(const Twine &Message) { MessagesStr << Message << '\n'; } + void checkFailed(const Twine &Message) { MessagesStr << Message << '\n'; } /// A check failed (with values to print). /// /// This calls the Message-only version so that the above is easier to set /// a breakpoint on. 
template - void CheckFailed(const Twine &Message, const T1 &V1, const Ts &...Vs) { - CheckFailed(Message); - WriteValues({V1, Vs...}); + void checkFailed(const Twine &Message, const T1 &V1, const Ts &...Vs) { + checkFailed(Message); + writeValues({V1, Vs...}); } public: diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 1350492b9cf32..89e045a92f3ea 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -481,7 +481,7 @@ static cl::opt HasClosedWorldAssumption( cl::desc("Whether has closed-world assumption at link time"), cl::init(false), cl::Hidden); -static cl::opt VerifyTarget("verify-tgt", +static cl::opt VerifyTarget("amdgpu-verify-tgt", cl::desc("Enable the target verifier")); extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAMDGPUTarget() { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp index d07b5fb668c14..3246b0ac0c624 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetVerifier.cpp @@ -34,7 +34,7 @@ using namespace llvm; #define Check(C, ...) 
\ do { \ if (!(C)) { \ - TargetVerify::CheckFailed(__VA_ARGS__); \ + TargetVerify::checkFailed(__VA_ARGS__); \ } \ } while (false) diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll index e2d9edda5d008..819e66bb46b0a 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-fail.ll @@ -1,4 +1,4 @@ -; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-tgt -o - < %s 2>&1 | FileCheck %s +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -amdgpu-verify-tgt -o - < %s 2>&1 | FileCheck %s define amdgpu_cs i32 @nonvoid_shader() { ; CHECK: Shaders must return void diff --git a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll index a2dab0ff47924..08854df01acb2 100644 --- a/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll +++ b/llvm/test/CodeGen/AMDGPU/tgt-verify-llc-pass.ll @@ -1,4 +1,4 @@ -; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-tgt %s -o - 2>&1 | FileCheck %s --allow-empty +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -amdgpu-verify-tgt %s -o - 2>&1 | FileCheck %s --allow-empty define amdgpu_cs void @void_shader() { ; CHECK-NOT: Shaders must return void From flang-commits at lists.llvm.org Thu May 8 20:10:03 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 20:10:03 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <681d720b.a70a0220.218476.cbee@mx.google.com> https://github.com/fanju110 updated https://github.com/llvm/llvm-project/pull/136098 >From 9494c9752400e4708dbc8b6a5ca4993ea9565e95 Mon Sep 17 00:00:00 2001 From: fanyikang Date: Thu, 17 Apr 2025 15:17:07 +0800 Subject: [PATCH 01/13] Add support for IR PGO (-fprofile-generate/-fprofile-use=/file) This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags: -fprofile-generate 
for instrumentation-based profile generation -fprofile-use=/file for profile-guided optimization Co-Authored-By: ict-ql <168183727+ict-ql at users.noreply.github.com> --- clang/include/clang/Driver/Options.td | 4 +- clang/lib/Driver/ToolChains/Flang.cpp | 8 +++ .../include/flang/Frontend/CodeGenOptions.def | 5 ++ flang/include/flang/Frontend/CodeGenOptions.h | 49 +++++++++++++++++ flang/lib/Frontend/CompilerInvocation.cpp | 12 +++++ flang/lib/Frontend/FrontendActions.cpp | 54 +++++++++++++++++++ flang/test/Driver/flang-f-opts.f90 | 5 ++ .../Inputs/gcc-flag-compatibility_IR.proftext | 19 +++++++ .../gcc-flag-compatibility_IR_entry.proftext | 14 +++++ flang/test/Profile/gcc-flag-compatibility.f90 | 39 ++++++++++++++ 10 files changed, 207 insertions(+), 2 deletions(-) create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext create mode 100644 flang/test/Profile/gcc-flag-compatibility.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index affc076a876ad..0b0dbc467c1e0 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1756,7 +1756,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFFFFFE">; def fprofile_generate : Flag<["-"], "fprofile-generate">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">, Group, Visibility<[ClangOption, CLOption]>, @@ -1773,7 +1773,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group, Visibility<[ClangOption, CLOption]>, Alias; def fprofile_use_EQ : Joined<["-"], 
"fprofile-use=">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, MetaVarName<"">, HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. Otherwise, it reads from file .">; def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">, diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index a8b4688aed09c..fcdbe8a6aba5a 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -882,6 +882,14 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); + + if (Args.hasArg(options::OPT_fprofile_generate)){ + CmdArgs.push_back("-fprofile-generate"); + } + if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) { + CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue())); + } + // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ, diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 57830bf51a1b3..4dec86cd8f51b 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,6 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Whether emit extra debug info for sample pgo profile collection. 
+CODEGENOPT(DebugInfoForProfiling, 1, 0) +CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..e052250f97e75 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,6 +148,55 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; + enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. + }; + + + /// Name of the profile file to use as output for -fprofile-instr-generate, + /// -fprofile-generate, and -fcs-profile-generate. + std::string InstrProfileOutput; + + /// Name of the profile file to use as input for -fmemory-profile-use. + std::string MemoryProfileUsePath; + + unsigned int DebugInfoForProfiling; + + unsigned int AtomicProfileUpdate; + + /// Name of the profile file to use as input for -fprofile-instr-use + std::string ProfileInstrumentUsePath; + + /// Name of the profile remapping file to apply to the profile data supplied + /// by -fprofile-sample-use or -fprofile-instr-use. + std::string ProfileRemappingFile; + + /// Check if Clang profile instrumenation is on. + bool hasProfileClangInstr() const { + return getProfileInstr() == ProfileClangInstr; + } + + /// Check if IR level profile instrumentation is on. 
+ bool hasProfileIRInstr() const { + return getProfileInstr() == ProfileIRInstr; + } + + /// Check if CS IR level profile instrumentation is on. + bool hasProfileCSIRInstr() const { + return getProfileInstr() == ProfileCSIRInstr; + } + /// Check if IR level profile use is on. + bool hasProfileIRUse() const { + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; + } + /// Check if CSIR profile use is on. + bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + // Define accessors/mutators for code generation options of enumeration type. #define CODEGENOPT(Name, Bits, Default) #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 6f87a18d69c3d..f013fce2f3cfc 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -27,6 +27,7 @@ #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" +#include "clang/Driver/Driver.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" @@ -431,6 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, opts.IsPIE = 1; } + if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { + opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); + opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + } + + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { + opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.ProfileInstrumentUsePath = A->getValue(); + } + // -mcmodel option. 
if (const llvm::opt::Arg *a = args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) { diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c1f47b12abee2..68880bdeecf8d 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -63,11 +63,14 @@ #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Target/TargetMachine.h" #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" #include "llvm/Transforms/Utils/ModuleUtils.h" +#include "llvm/Transforms/Instrumentation/InstrProfiling.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include #include @@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// + +static llvm::cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden, + llvm::cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); + bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -892,6 +909,20 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } + +// Default filename used for profile generation. 
+namespace llvm { + extern llvm::cl::opt DebugInfoCorrelate; + extern llvm::cl::opt ProfileCorrelate; + + +std::string getDefaultProfileGenName() { + return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE + ? "default_%m.proflite" + : "default_%m.profraw"; +} +} + void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts(); @@ -909,6 +940,29 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; + + if (opts.hasProfileIRInstr()){ + // // -fprofile-generate. + pgoOpt = llvm::PGOOptions( + opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + } + else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse + : llvm::PGOOptions::NoCSAction; + pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "", + opts.ProfileRemappingFile, + opts.MemoryProfileUsePath, VFS, + llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling); + } + llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90 index 4493a519e2010..b972b9b7b2a59 100644 --- a/flang/test/Driver/flang-f-opts.f90 +++ b/flang/test/Driver/flang-f-opts.f90 @@ -8,3 +8,8 @@ ! CHECK-LABEL: "-fc1" ! CHECK: -ffp-contract=off ! CHECK: -O3 + +! 
RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s +! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate" +! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s +! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}" diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext new file mode 100644 index 0000000000000..6a6df8b1d4d5b --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext @@ -0,0 +1,19 @@ +# IR level Instrumentation Flag +:ir +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 + +main +# Func Hash: +742261418966908927 +# Num Counters: +1 +# Counter Values: +1 + diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext new file mode 100644 index 0000000000000..9a46140286673 --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext @@ -0,0 +1,14 @@ +# IR level Instrumentation Flag +:ir +:entry_first +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 + + + diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 new file mode 100644 index 0000000000000..0124bc79b87ef --- /dev/null +++ b/flang/test/Profile/gcc-flag-compatibility.f90 @@ -0,0 +1,39 @@ +! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two +! flags behave similarly to their GCC counterparts: +! +! -fprofile-generate Generates the profile file ./default.profraw +! -fprofile-use=/file Uses the profile file /file + +! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto +! 
RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s +! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section +! PROFILE-GEN: @__profd_{{_?}}main = + + + +! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof +! This uses LLVM IR format profile. +! RUN: rm -rf %t.dir +! RUN: mkdir -p %t.dir/some/path +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s +! + + + +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s +! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} +! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} + + +program main + implicit none + integer :: i + integer :: X = 0 + + do i = 0, 99 + X = X + i + end do + +end program main >From b897c7aa1e21dfe46b4acf709f3ea38d9021c164 Mon Sep 17 00:00:00 2001 From: FYK Date: Wed, 23 Apr 2025 09:56:14 +0800 Subject: [PATCH 02/13] Update flang/lib/Frontend/FrontendActions.cpp Remove redundant comment symbols Co-authored-by: Tom Eccles --- flang/lib/Frontend/FrontendActions.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 68880bdeecf8d..cd13a6aca92cd 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -942,7 +942,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { std::optional pgoOpt; if (opts.hasProfileIRInstr()){ - // // -fprofile-generate. + // -fprofile-generate. pgoOpt = llvm::PGOOptions( opts.InstrProfileOutput.empty() ? 
llvm::getDefaultProfileGenName() : opts.InstrProfileOutput, >From bc5adfcc4ac3456f587bedd48c1a8892d27e53ae Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 25 Apr 2025 11:48:30 +0800 Subject: [PATCH 03/13] format code with clang-format --- flang/include/flang/Frontend/CodeGenOptions.h | 17 ++-- flang/lib/Frontend/CompilerInvocation.cpp | 15 ++-- flang/lib/Frontend/FrontendActions.cpp | 83 +++++++++---------- .../Inputs/gcc-flag-compatibility_IR.proftext | 3 +- .../gcc-flag-compatibility_IR_entry.proftext | 5 +- 5 files changed, 59 insertions(+), 64 deletions(-) diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index e052250f97e75..c9577862df832 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -156,7 +156,6 @@ class CodeGenOptions : public CodeGenOptionsBase { ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -171,7 +170,7 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; - /// Name of the profile remapping file to apply to the profile data supplied + /// Name of the profile remapping file to apply to the profile data supplied /// by -fprofile-sample-use or -fprofile-instr-use. std::string ProfileRemappingFile; @@ -181,19 +180,17 @@ class CodeGenOptions : public CodeGenOptionsBase { } /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; - } + bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. 
bool hasProfileCSIRInstr() const { return getProfileInstr() == ProfileCSIRInstr; } - /// Check if IR level profile use is on. - bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; - } + /// Check if IR level profile use is on. + bool hasProfileIRUse() const { + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; + } /// Check if CSIR profile use is on. bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index f013fce2f3cfc..b28c2c0047579 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -27,7 +27,6 @@ #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" -#include "clang/Driver/Driver.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" @@ -433,13 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr( + Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.DebugInfoForProfiling = + args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); + opts.AtomicProfileUpdate = + args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); } - + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse( + 
Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); } diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index cd13a6aca92cd..8d1ab670e4db4 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -56,21 +56,21 @@ #include "llvm/Passes/PassBuilder.h" #include "llvm/Passes/PassPlugin.h" #include "llvm/Passes/StandardInstrumentations.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/Error.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/FileSystem.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" -#include "llvm/Support/PGOOptions.h" #include "llvm/Target/TargetMachine.h" #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" -#include "llvm/Transforms/Utils/ModuleUtils.h" #include "llvm/Transforms/Instrumentation/InstrProfiling.h" -#include "llvm/ProfileData/InstrProfCorrelator.h" +#include "llvm/Transforms/Utils/ModuleUtils.h" #include #include @@ -133,19 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// - static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions 
with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); + "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), + llvm::cl::Hidden, + llvm::cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, + "default", "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, + "optsize", "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, + "minsize", "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, + "optnone", + "Mark cold functions with optnone."))); bool PrescanAction::beginSourceFileAction() { return runPrescan(); } @@ -909,19 +910,18 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } - // Default filename used for profile generation. namespace llvm { - extern llvm::cl::opt DebugInfoCorrelate; - extern llvm::cl::opt ProfileCorrelate; - +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt ProfileCorrelate; std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE + return DebugInfoCorrelate || + ProfileCorrelate != llvm::InstrProfCorrelator::NONE ? "default_%m.proflite" : "default_%m.profraw"; } -} +} // namespace llvm void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); @@ -940,29 +940,28 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; - - if (opts.hasProfileIRInstr()){ + + if (opts.hasProfileIRInstr()) { // -fprofile-generate. pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? 
llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); - } - else if (opts.hasProfileIRUse()) { - llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); - // -fprofile-use. - auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse - : llvm::PGOOptions::NoCSAction; - pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "", - opts.ProfileRemappingFile, - opts.MemoryProfileUsePath, VFS, - llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling); - } - + opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + } else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = + llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? 
llvm::PGOOptions::CSIRUse + : llvm::PGOOptions::NoCSAction; + pgoOpt = llvm::PGOOptions( + opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, + opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, + ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + } + llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext index 6a6df8b1d4d5b..2650fb5ebfd35 100644 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext @@ -15,5 +15,4 @@ main # Num Counters: 1 # Counter Values: -1 - +1 \ No newline at end of file diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext index 9a46140286673..c4a2a26557e80 100644 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext @@ -8,7 +8,4 @@ _QQmain 2 # Counter Values: 100 -1 - - - +1 \ No newline at end of file >From d64d9d95fb97d6cfa4bf4192bfb20f5c8d6b3bc3 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 25 Apr 2025 11:53:47 +0800 Subject: [PATCH 04/13] simplify push_back usage --- clang/lib/Driver/ToolChains/Flang.cpp | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index fcdbe8a6aba5a..9c7e87c455e44 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -882,13 +882,9 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); - - if (Args.hasArg(options::OPT_fprofile_generate)){ - 
CmdArgs.push_back("-fprofile-generate"); - } - if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) { - CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue())); - } + // recognise options: fprofile-generate -fprofile-use= + Args.addAllArgs( + CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ}); // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. >From 22475a85d24b22fb44ca5a5ce26542b556bae280 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Mon, 28 Apr 2025 20:33:54 +0800 Subject: [PATCH 05/13] Port the getDefaultProfileGenName definition and the ProfileInstrKind definition from clang to the llvm namespace to allow flang to reuse these code. --- clang/include/clang/Basic/CodeGenOptions.def | 4 +- clang/include/clang/Basic/CodeGenOptions.h | 22 +++++--- clang/include/clang/Basic/ProfileList.h | 9 ++-- clang/include/clang/CodeGen/BackendUtil.h | 3 ++ clang/lib/Basic/ProfileList.cpp | 20 ++++---- clang/lib/CodeGen/BackendUtil.cpp | 50 ++++++------------- clang/lib/CodeGen/CodeGenAction.cpp | 4 +- clang/lib/CodeGen/CodeGenFunction.cpp | 3 +- clang/lib/CodeGen/CodeGenModule.cpp | 2 +- clang/lib/Frontend/CompilerInvocation.cpp | 6 +-- .../include/flang/Frontend/CodeGenOptions.def | 8 +-- flang/include/flang/Frontend/CodeGenOptions.h | 28 ++++------- .../include/flang/Frontend/FrontendActions.h | 5 ++ flang/lib/Frontend/CompilerInvocation.cpp | 11 ++-- flang/lib/Frontend/FrontendActions.cpp | 28 +++-------- .../llvm/Frontend/Driver/CodeGenOptions.h | 15 +++++- llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 25 ++++++++++ 17 files changed, 123 insertions(+), 120 deletions(-) diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index a436c0ec98d5b..3dbadc85f23cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,9 +223,9 @@ 
AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index e39a73bdb13ac..e4a74cd2c12cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; + return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if any form of instrumentation is on. 
- bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } + bool hasProfileInstr() const { + return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; + } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == ProfileClangInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + } /// Check if type and variable info should be emitted. bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff 
--git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h index 92e0d13bf25b6..d9abf7bf962d2 100644 --- a/clang/include/clang/CodeGen/BackendUtil.h +++ b/clang/include/clang/CodeGen/BackendUtil.h @@ -8,6 +8,8 @@ #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include "clang/Basic/LLVM.h" #include "llvm/IR/ModuleSummaryIndex.h" @@ -19,6 +21,7 @@ template class Expected; template class IntrusiveRefCntPtr; class Module; class MemoryBufferRef; +extern cl::opt ClPGOColdFuncAttr; namespace vfs { class FileSystem; } // namespace vfs diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0c5aa6de3f3c0 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if 
(SCL->inSection(Section, "default", "allow")) @@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 7557cb8408921..963ed321b2cb9 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -103,38 +103,16 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP( "sanitizer-early-opt-ep", cl::Optional, cl::desc("Insert sanitizers on OptimizerEarlyEP.")); -// Experiment to mark cold functions as optsize/minsize/optnone. -// TODO: remove once this is exposed as a proper driver flag. 
-static cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, - cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions with minsize."), - clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); - extern cl::opt ProfileCorrelate; } // namespace llvm namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? "default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -834,12 +812,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. 
- PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse @@ -847,14 +825,14 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) @@ -863,15 +841,15 @@ void EmitAssemblyHelper::RunOptimizationPipeline( ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling - PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", 
nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4d29ceace646f..10f0a12773498 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 0154799498f5e..2e8e90493c370 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, 
// If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index cfc5c069b0849..ece7e5cad774c 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 4dec86cd8f51b..68396c4c40711 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Choose profile instrumenation kind or no instrumentation. 
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Whether emit extra debug info for sample pgo profile collection. -CODEGENOPT(DebugInfoForProfiling, 1, 0) -CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index c9577862df832..6a989f0c0775e 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. - }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fmemory-profile-use. 
std::string MemoryProfileUsePath; - unsigned int DebugInfoForProfiling; - - unsigned int AtomicProfileUpdate; - /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; @@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == llvm::driver::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h index f9a45bd6c0a56..9ba74a9dad9be 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -20,8 +20,13 @@ #include "mlir/IR/OwningOpRef.h" #include "llvm/ADT/StringRef.h" #include "llvm/IR/Module.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include +namespace llvm { +extern cl::opt ClPGOColdFuncAttr; +} // namespace llvm namespace Fortran::frontend { //===----------------------------------------------------------------------===// diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b28c2c0047579..43c8f4d596903 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = - args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = - args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); } if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); 
   }
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 8d1ab670e4db4..c758aa18fbb8e 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -28,6 +28,7 @@
 #include "flang/Semantics/unparse-with-symbols.h"
 #include "flang/Support/default-kinds.h"
 #include "flang/Tools/CrossToolHelpers.h"
+#include "clang/CodeGen/BackendUtil.h"
 #include "mlir/IR/Dialect.h"
 #include "mlir/Parser/Parser.h"
@@ -133,21 +134,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci,
 // Custom BeginSourceFileAction
 //===----------------------------------------------------------------------===//
-static llvm::cl::opt ClPGOColdFuncAttr(
-    "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default),
-    llvm::cl::Hidden,
-    llvm::cl::desc(
-        "Function attribute to apply to cold functions as determined by PGO"),
-    llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default,
-                                "default", "Default (no attribute)"),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize,
-                                "optsize", "Mark cold functions with optsize."),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize,
-                                "minsize", "Mark cold functions with minsize."),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone,
-                                "optnone",
-                                "Mark cold functions with optnone.")));
-
 bool PrescanAction::beginSourceFileAction() { return runPrescan(); }
 bool PrescanAndParseAction::beginSourceFileAction() {
@@ -944,12 +930,12 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   if (opts.hasProfileIRInstr()) {
     // -fprofile-generate.
     pgoOpt = llvm::PGOOptions(
-        opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
-                                        : opts.InstrProfileOutput,
+        opts.InstrProfileOutput.empty()
+            ? llvm::driver::getDefaultProfileGenName()
+            : opts.InstrProfileOutput,
         "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr,
-        llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr,
-        opts.DebugInfoForProfiling,
-        /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate);
+        llvm::PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, false,
+        /*PseudoProbeForProfiling=*/false, false);
   } else if (opts.hasProfileIRUse()) {
     llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem();
@@ -959,7 +945,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
     pgoOpt = llvm::PGOOptions(
         opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile,
         opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction,
-        ClPGOColdFuncAttr, opts.DebugInfoForProfiling);
+        llvm::ClPGOColdFuncAttr, false);
   }
   llvm::StandardInstrumentations si(llvmModule->getContext(),
diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index c51476e9ad3fe..6188c20cb29cf 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -13,9 +13,14 @@
 #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H
 #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H
+#include "llvm/ProfileData/InstrProfCorrelator.h"
+#include
 namespace llvm {
 class Triple;
 class TargetLibraryInfoImpl;
+extern llvm::cl::opt DebugInfoCorrelate;
+extern llvm::cl::opt
+    ProfileCorrelate;
 } // namespace llvm
 namespace llvm::driver {
@@ -35,7 +40,15 @@ enum class VectorLibrary {
 TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
                                   VectorLibrary Veclib);
-
+enum ProfileInstrKind {
+  ProfileNone,       // Profile instrumentation is turned off.
+  ProfileClangInstr, // Clang instrumentation to generate execution counts
+                     // to use with PGO.
+  ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
+  ProfileCSIRInstr,  // IR level PGO context sensitive instrumentation in LLVM.
+};
+// Default filename used for profile generation.
+std::string getDefaultProfileGenName();
 } // end namespace llvm::driver
 #endif
diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
index ed7c57a930aca..818dcd3752437 100644
--- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
+++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
@@ -8,7 +8,26 @@
 #include "llvm/Frontend/Driver/CodeGenOptions.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
+#include "llvm/Support/PGOOptions.h"
 #include "llvm/TargetParser/Triple.h"
+namespace llvm {
+// Experiment to mark cold functions as optsize/minsize/optnone.
+// TODO: remove once this is exposed as a proper driver flag.
+cl::opt ClPGOColdFuncAttr(
+    "pgo-cold-func-opt", cl::init(llvm::PGOOptions::ColdFuncOpt::Default),
+    cl::Hidden,
+    cl::desc(
+        "Function attribute to apply to cold functions as determined by PGO"),
+    cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default",
+                          "Default (no attribute)"),
+               clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize",
+                          "Mark cold functions with optsize."),
+               clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize",
+                          "Mark cold functions with minsize."),
+               clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone",
+                          "Mark cold functions with optnone.")));
+} // namespace llvm
 namespace llvm::driver {
@@ -56,4 +75,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
   return TLII;
 }
+std::string getDefaultProfileGenName() {
+  return llvm::DebugInfoCorrelate ||
+                 llvm::ProfileCorrelate != InstrProfCorrelator::NONE
+             ? "default_%m.proflite"
+             : "default_%m.profraw";
+}
 } // namespace llvm::driver
>From e53e689985088bbcdc253950a2ecc715592b5b3a Mon Sep 17 00:00:00 2001
From: fanju110
Date: Mon, 28 Apr 2025 21:49:36 +0800
Subject: [PATCH 06/13] Remove redundant function definitions
---
 flang/lib/Frontend/FrontendActions.cpp | 12 ------------
 1 file changed, 12 deletions(-)
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index c758aa18fbb8e..cdd2853bcd201 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -896,18 +896,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
   delete tlii;
 }
-// Default filename used for profile generation.
-namespace llvm {
-extern llvm::cl::opt DebugInfoCorrelate;
-extern llvm::cl::opt ProfileCorrelate;
-
-std::string getDefaultProfileGenName() {
-  return DebugInfoCorrelate ||
-                 ProfileCorrelate != llvm::InstrProfCorrelator::NONE
-             ? "default_%m.proflite"
-             : "default_%m.profraw";
-}
-} // namespace llvm
 void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   CompilerInstance &ci = getInstance();
>From 248175453354fecd078f5553576d16ce810e7808 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Wed, 30 Apr 2025 17:12:32 +0800
Subject: [PATCH 07/13] Move the interface to the cpp that uses it
---
 clang/include/clang/Basic/CodeGenOptions.def | 4 +-
 clang/include/clang/Basic/CodeGenOptions.h | 22 +++++----
 clang/include/clang/Basic/ProfileList.h | 9 ++--
 clang/lib/Basic/ProfileList.cpp | 20 ++++-----
 clang/lib/CodeGen/BackendUtil.cpp | 37 +++++++--------
 clang/lib/CodeGen/CodeGenAction.cpp | 4 +-
 clang/lib/CodeGen/CodeGenFunction.cpp | 3 +-
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 clang/lib/Frontend/CompilerInvocation.cpp | 6 +--
 .../include/flang/Frontend/CodeGenOptions.def | 8 ++--
 flang/include/flang/Frontend/CodeGenOptions.h | 28 +++++-------
 flang/lib/Frontend/CompilerInvocation.cpp | 11 ++---
 flang/lib/Frontend/FrontendActions.cpp | 45 ++++---------------
 .../llvm/Frontend/Driver/CodeGenOptions.h | 10 +++++
 llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 12 +++++
 15 files changed, 101 insertions(+), 120 deletions(-)
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index a436c0ec98d5b..3dbadc85f23cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -223,9 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is
 CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling
 /// Choose profile instrumenation kind or no instrumentation.
-ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Choose profile kind for PGO use compilation.
-ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Partition functions into N groups and select only functions in group i to be
 /// instrumented. Selected group numbers can be 0 to N-1 inclusive.
 VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1)
diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h
index e39a73bdb13ac..e4a74cd2c12cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// Check if Clang profile instrumenation is on.
   bool hasProfileClangInstr() const {
-    return getProfileInstr() == ProfileClangInstr;
+    return getProfileInstr() ==
+           llvm::driver::ProfileInstrKind::ProfileClangInstr;
   }
   /// Check if IR level profile instrumentation is on.
   bool hasProfileIRInstr() const {
-    return getProfileInstr() == ProfileIRInstr;
+    return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr;
   }
   /// Check if CS IR level profile instrumentation is on.
   bool hasProfileCSIRInstr() const {
-    return getProfileInstr() == ProfileCSIRInstr;
+    return getProfileInstr() ==
+           llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
   }
   /// Check if any form of instrumentation is on.
-  bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; }
+  bool hasProfileInstr() const {
+    return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone;
+  }
   /// Check if Clang profile use is on.
   bool hasProfileClangUse() const {
-    return getProfileUse() == ProfileClangInstr;
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr;
   }
   /// Check if IR level profile use is on.
   bool hasProfileIRUse() const {
-    return getProfileUse() == ProfileIRInstr ||
-           getProfileUse() == ProfileCSIRInstr;
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr ||
+           getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
   }
   /// Check if CSIR profile use is on.
-  bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }
+  bool hasProfileCSIRUse() const {
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
+  }
   /// Check if type and variable info should be emitted.
   bool hasReducedDebugInfo() const {
diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h
index b4217e49c18a3..5338ef3992ade 100644
--- a/clang/include/clang/Basic/ProfileList.h
+++ b/clang/include/clang/Basic/ProfileList.h
@@ -49,17 +49,16 @@ class ProfileList {
   ~ProfileList();
   bool isEmpty() const { return Empty; }
-  ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const;
+  ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const;
   std::optional
   isFunctionExcluded(StringRef FunctionName,
-                     CodeGenOptions::ProfileInstrKind Kind) const;
+                     llvm::driver::ProfileInstrKind Kind) const;
   std::optional
   isLocationExcluded(SourceLocation Loc,
-                     CodeGenOptions::ProfileInstrKind Kind) const;
+                     llvm::driver::ProfileInstrKind Kind) const;
   std::optional
-  isFileExcluded(StringRef FileName,
-                 CodeGenOptions::ProfileInstrKind Kind) const;
+  isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const;
 };
 } // namespace clang
diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp
index 8fa16e2eb069a..0c5aa6de3f3c0 100644
--- a/clang/lib/Basic/ProfileList.cpp
+++ b/clang/lib/Basic/ProfileList.cpp
@@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM)
 ProfileList::~ProfileList() = default;
-static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) {
+static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) {
   switch (Kind) {
-  case CodeGenOptions::ProfileNone:
+  case llvm::driver::ProfileInstrKind::ProfileNone:
     return "";
-  case CodeGenOptions::ProfileClangInstr:
+  case llvm::driver::ProfileInstrKind::ProfileClangInstr:
     return "clang";
-  case CodeGenOptions::ProfileIRInstr:
+  case llvm::driver::ProfileInstrKind::ProfileIRInstr:
     return "llvm";
-  case CodeGenOptions::ProfileCSIRInstr:
+  case llvm::driver::ProfileInstrKind::ProfileCSIRInstr:
     return "csllvm";
   }
-  llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum");
+  llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum");
 }
 ProfileList::ExclusionType
-ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const {
+ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const {
   StringRef Section = getSectionName(Kind);
   // Check for "default:"
   if (SCL->inSection(Section, "default", "allow"))
@@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix,
 std::optional
 ProfileList::isFunctionExcluded(StringRef FunctionName,
-                                CodeGenOptions::ProfileInstrKind Kind) const {
+                                llvm::driver::ProfileInstrKind Kind) const {
   StringRef Section = getSectionName(Kind);
   // Check for "function:="
   if (auto V = inSection(Section, "function", FunctionName))
@@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName,
 std::optional
 ProfileList::isLocationExcluded(SourceLocation Loc,
-                                CodeGenOptions::ProfileInstrKind Kind) const {
+                                llvm::driver::ProfileInstrKind Kind) const {
   return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind);
 }
 std::optional
 ProfileList::isFileExcluded(StringRef FileName,
-                            CodeGenOptions::ProfileInstrKind Kind) const {
+                            llvm::driver::ProfileInstrKind Kind) const {
   StringRef Section = getSectionName(Kind);
   // Check for "source:="
   if (auto V = inSection(Section, "source", FileName))
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 7557cb8408921..592e3bbbcc1cf 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -124,17 +124,10 @@ namespace clang {
 extern llvm::cl::opt ClSanitizeGuardChecks;
 }
-// Default filename used for profile generation.
-static std::string getDefaultProfileGenName() {
-  return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE
-             ? "default_%m.proflite"
-             : "default_%m.profraw";
-}
-
 // Path and name of file used for profile generation
 static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) {
   std::string FileName = CodeGenOpts.InstrProfileOutput.empty()
-                             ? getDefaultProfileGenName()
+                             ? llvm::driver::getDefaultProfileGenName()
                              : CodeGenOpts.InstrProfileOutput;
   if (CodeGenOpts.ContinuousProfileSync)
     FileName = "%c" + FileName;
@@ -834,12 +827,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
   if (CodeGenOpts.hasProfileIRInstr())
     // -fprofile-generate.
-    PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "",
-                        CodeGenOpts.MemoryProfileUsePath, nullptr,
-                        PGOOptions::IRInstr, PGOOptions::NoCSAction,
-                        ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling,
-                        /*PseudoProbeForProfiling=*/false,
-                        CodeGenOpts.AtomicProfileUpdate);
+    PGOOpt = PGOOptions(
+        getProfileGenName(CodeGenOpts), "", "",
+        CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr,
+        PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr,
+        CodeGenOpts.DebugInfoForProfiling,
+        /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate);
   else if (CodeGenOpts.hasProfileIRUse()) {
     // -fprofile-use.
     auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse
@@ -847,31 +840,31 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
     PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "",
                         CodeGenOpts.ProfileRemappingFile,
                         CodeGenOpts.MemoryProfileUsePath, VFS,
-                        PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr,
+                        PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr,
                         CodeGenOpts.DebugInfoForProfiling);
   } else if (!CodeGenOpts.SampleProfileFile.empty())
     // -fprofile-sample-use
     PGOOpt = PGOOptions(
         CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile,
         CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse,
-        PGOOptions::NoCSAction, ClPGOColdFuncAttr,
+        PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr,
         CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling);
   else if (!CodeGenOpts.MemoryProfileUsePath.empty())
     // -fmemory-profile-use (without any of the above options)
     PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS,
                         PGOOptions::NoAction, PGOOptions::NoCSAction,
-                        ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling);
+                        llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling);
   else if (CodeGenOpts.PseudoProbeForProfiling)
     // -fpseudo-probe-for-profiling
-    PGOOpt =
-        PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr,
-                   PGOOptions::NoAction, PGOOptions::NoCSAction,
-                   ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true);
+    PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr,
+                        PGOOptions::NoAction, PGOOptions::NoCSAction,
+                        llvm::ClPGOColdFuncAttr,
+                        CodeGenOpts.DebugInfoForProfiling, true);
   else if (CodeGenOpts.DebugInfoForProfiling)
     // -fdebug-info-for-profiling
     PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr,
                         PGOOptions::NoAction, PGOOptions::NoCSAction,
-                        ClPGOColdFuncAttr, true);
+                        llvm::ClPGOColdFuncAttr, true);
   // Check to see if we want to generate a CS profile.
   if (CodeGenOpts.hasProfileCSIRInstr()) {
diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp
index 1f5eb427b566f..5493cc92bd8b0 100644
--- a/clang/lib/CodeGen/CodeGenAction.cpp
+++ b/clang/lib/CodeGen/CodeGenAction.cpp
@@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) {
   std::unique_ptr OptRecordFile =
       std::move(*OptRecordFileOrErr);
-  if (OptRecordFile &&
-      CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone)
+  if (OptRecordFile && CodeGenOpts.getProfileUse() !=
+                           llvm::driver::ProfileInstrKind::ProfileNone)
     Ctx.setDiagnosticsHotnessRequested(true);
   if (CodeGenOpts.MisExpect) {
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index 4d29ceace646f..10f0a12773498 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
     }
   }
-  if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) {
+  if (CGM.getCodeGenOpts().getProfileInstr() !=
+      llvm::driver::ProfileInstrKind::ProfileNone) {
     switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) {
     case ProfileList::Skip:
       Fn->addFnAttr(llvm::Attribute::SkipProfile);
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 0154799498f5e..2e8e90493c370 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn,
   // If the profile list is empty, then instrument everything.
   if (ProfileList.isEmpty())
     return ProfileList::Allow;
-  CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr();
+  llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr();
   // First, check the function name.
   if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind))
     return *V;
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index cfc5c069b0849..ece7e5cad774c 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts,
   // which is available (might be one or both).
   if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) {
     if (PGOReader->hasCSIRLevelProfile())
-      Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr);
+      Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr);
     else
-      Opts.setProfileUse(CodeGenOptions::ProfileIRInstr);
+      Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr);
   } else
-    Opts.setProfileUse(CodeGenOptions::ProfileClangInstr);
+    Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr);
 }
 void CompilerInvocation::setDefaultPointerAuthOptions(
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def
index 4dec86cd8f51b..68396c4c40711 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.def
+++ b/flang/include/flang/Frontend/CodeGenOptions.def
@@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified.
 CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
                                    ///< pass manager.
-ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
-ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
+/// Choose profile instrumenation kind or no instrumentation.
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
+/// Choose profile kind for PGO use compilation.
+ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Whether emit extra debug info for sample pgo profile collection.
-CODEGENOPT(DebugInfoForProfiling, 1, 0)
-CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level.
 CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module.
 CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the
diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h
index c9577862df832..6a989f0c0775e 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.h
+++ b/flang/include/flang/Frontend/CodeGenOptions.h
@@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// OpenMP is enabled.
   using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind;
-  enum ProfileInstrKind {
-    ProfileNone,       // Profile instrumentation is turned off.
-    ProfileClangInstr, // Clang instrumentation to generate execution counts
-                       // to use with PGO.
-    ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
-    ProfileCSIRInstr,  // IR level PGO context sensitive instrumentation in LLVM.
-  };
-
   /// Name of the profile file to use as output for -fprofile-instr-generate,
   /// -fprofile-generate, and -fcs-profile-generate.
   std::string InstrProfileOutput;
@@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// Name of the profile file to use as input for -fmemory-profile-use.
   std::string MemoryProfileUsePath;
-  unsigned int DebugInfoForProfiling;
-
-  unsigned int AtomicProfileUpdate;
-
   /// Name of the profile file to use as input for -fprofile-instr-use
   std::string ProfileInstrumentUsePath;
@@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// Check if Clang profile instrumenation is on.
   bool hasProfileClangInstr() const {
-    return getProfileInstr() == ProfileClangInstr;
+    return getProfileInstr() == llvm::driver::ProfileClangInstr;
   }
   /// Check if IR level profile instrumentation is on.
-  bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; }
+  bool hasProfileIRInstr() const {
+    return getProfileInstr() == llvm::driver::ProfileIRInstr;
+  }
   /// Check if CS IR level profile instrumentation is on.
   bool hasProfileCSIRInstr() const {
-    return getProfileInstr() == ProfileCSIRInstr;
+    return getProfileInstr() == llvm::driver::ProfileCSIRInstr;
   }
   /// Check if IR level profile use is on.
   bool hasProfileIRUse() const {
-    return getProfileUse() == ProfileIRInstr ||
-           getProfileUse() == ProfileCSIRInstr;
+    return getProfileUse() == llvm::driver::ProfileIRInstr ||
+           getProfileUse() == llvm::driver::ProfileCSIRInstr;
   }
   /// Check if CSIR profile use is on.
-  bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }
+  bool hasProfileCSIRUse() const {
+    return getProfileUse() == llvm::driver::ProfileCSIRInstr;
+  }
   // Define accessors/mutators for code generation options of enumeration type.
 #define CODEGENOPT(Name, Bits, Default)
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index b28c2c0047579..43c8f4d596903 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -30,6 +30,7 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/Frontend/Debug/Options.h"
+#include "llvm/Frontend/Driver/CodeGenOptions.h"
 #include "llvm/Option/Arg.h"
 #include "llvm/Option/ArgList.h"
 #include "llvm/Option/OptTable.h"
@@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
   }
   if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) {
-    opts.setProfileInstr(
-        Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
-    opts.DebugInfoForProfiling =
-        args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling);
-    opts.AtomicProfileUpdate =
-        args.hasArg(clang::driver::options::OPT_fprofile_update_EQ);
+    opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr);
   }
   if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) {
-    opts.setProfileUse(
-        Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+    opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr);
     opts.ProfileInstrumentUsePath = A->getValue();
   }
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 8d1ab670e4db4..a650f54620543 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -133,21 +133,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci,
 // Custom BeginSourceFileAction
 //===----------------------------------------------------------------------===//
-static llvm::cl::opt ClPGOColdFuncAttr(
-    "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default),
-    llvm::cl::Hidden,
-    llvm::cl::desc(
-        "Function attribute to apply to cold functions as determined by PGO"),
-    llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default,
-                                "default", "Default (no attribute)"),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize,
-                                "optsize", "Mark cold functions with optsize."),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize,
-                                "minsize", "Mark cold functions with minsize."),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone,
-                                "optnone",
-                                "Mark cold functions with optnone.")));
-
 bool PrescanAction::beginSourceFileAction() { return runPrescan(); }
 bool PrescanAndParseAction::beginSourceFileAction() {
@@ -910,19 +895,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
   delete tlii;
 }
-// Default filename used for profile generation.
-namespace llvm {
-extern llvm::cl::opt DebugInfoCorrelate;
-extern llvm::cl::opt ProfileCorrelate;
-
-std::string getDefaultProfileGenName() {
-  return DebugInfoCorrelate ||
-                 ProfileCorrelate != llvm::InstrProfCorrelator::NONE
-             ? "default_%m.proflite"
-             : "default_%m.profraw";
-}
-} // namespace llvm
-
 void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   CompilerInstance &ci = getInstance();
   const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts();
@@ -943,13 +915,14 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   if (opts.hasProfileIRInstr()) {
     // -fprofile-generate.
-    pgoOpt = llvm::PGOOptions(
-        opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
-                                        : opts.InstrProfileOutput,
-        "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr,
-        llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr,
-        opts.DebugInfoForProfiling,
-        /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate);
+    pgoOpt = llvm::PGOOptions(opts.InstrProfileOutput.empty()
+                                  ? llvm::driver::getDefaultProfileGenName()
+                                  : opts.InstrProfileOutput,
+                              "", "", opts.MemoryProfileUsePath, nullptr,
+                              llvm::PGOOptions::IRInstr,
+                              llvm::PGOOptions::NoCSAction,
+                              llvm::PGOOptions::ColdFuncOpt::Default, false,
+                              /*PseudoProbeForProfiling=*/false, false);
   } else if (opts.hasProfileIRUse()) {
     llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem();
@@ -959,7 +932,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
     pgoOpt = llvm::PGOOptions(
         opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile,
         opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction,
-        ClPGOColdFuncAttr, opts.DebugInfoForProfiling);
+        llvm::PGOOptions::ColdFuncOpt::Default, false);
   }
   llvm::StandardInstrumentations si(llvmModule->getContext(),
diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index c51476e9ad3fe..3eb03cc3064cf 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -13,6 +13,7 @@
 #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H
 #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H
+#include
 namespace llvm {
 class Triple;
 class TargetLibraryInfoImpl;
@@ -36,6 +37,15 @@ enum class VectorLibrary {
 TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
                                   VectorLibrary Veclib);
+enum ProfileInstrKind {
+  ProfileNone,       // Profile instrumentation is turned off.
+  ProfileClangInstr, // Clang instrumentation to generate execution counts
+                     // to use with PGO.
+  ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
+  ProfileCSIRInstr,  // IR level PGO context sensitive instrumentation in LLVM.
+};
+// Default filename used for profile generation.
+std::string getDefaultProfileGenName();
 } // end namespace llvm::driver
 #endif
diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
index ed7c57a930aca..14b6b89da8465 100644
--- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
+++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
@@ -8,8 +8,14 @@
 #include "llvm/Frontend/Driver/CodeGenOptions.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
 #include "llvm/TargetParser/Triple.h"
+namespace llvm {
+extern llvm::cl::opt DebugInfoCorrelate;
+extern llvm::cl::opt
+    ProfileCorrelate;
+} // namespace llvm
 namespace llvm::driver {
 TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
@@ -56,4 +62,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
   return TLII;
 }
+std::string getDefaultProfileGenName() {
+  return llvm::DebugInfoCorrelate ||
+                 llvm::ProfileCorrelate != InstrProfCorrelator::NONE
+             ? "default_%m.proflite"
+             : "default_%m.profraw";
+}
 } // namespace llvm::driver
>From 70fea2265a374f59345691f4ad7653ef4f0b6aa6 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Wed, 30 Apr 2025 17:25:15 +0800
Subject: [PATCH 08/13] Move the interface to the cpp that uses it
---
 clang/include/clang/CodeGen/BackendUtil.h | 3 ---
 1 file changed, 3 deletions(-)
diff --git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h
index d9abf7bf962d2..92e0d13bf25b6 100644
--- a/clang/include/clang/CodeGen/BackendUtil.h
+++ b/clang/include/clang/CodeGen/BackendUtil.h
@@ -8,8 +8,6 @@
 #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H
 #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H
-#include "llvm/Support/CommandLine.h"
-#include "llvm/Support/PGOOptions.h"
 #include "clang/Basic/LLVM.h"
 #include "llvm/IR/ModuleSummaryIndex.h"
@@ -21,7 +19,6 @@ template class Expected;
 template class IntrusiveRefCntPtr;
 class Module;
 class MemoryBufferRef;
-extern cl::opt ClPGOColdFuncAttr;
 namespace vfs {
 class FileSystem;
 } // namespace vfs
>From 5705d5eff937ca18eb44bec28a967a8629f0c085 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Wed, 30 Apr 2025 17:26:22 +0800
Subject: [PATCH 09/13] Move the interface to the cpp that uses it
---
 flang/include/flang/Frontend/FrontendActions.h | 5 -----
 1 file changed, 5 deletions(-)
diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h
index 9ba74a9dad9be..f9a45bd6c0a56 100644
--- a/flang/include/flang/Frontend/FrontendActions.h
+++ b/flang/include/flang/Frontend/FrontendActions.h
@@ -20,13 +20,8 @@
 #include "mlir/IR/OwningOpRef.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/IR/Module.h"
-#include "llvm/Support/CommandLine.h"
-#include "llvm/Support/PGOOptions.h"
 #include
-namespace llvm {
-extern cl::opt ClPGOColdFuncAttr;
-} // namespace llvm
 namespace Fortran::frontend {
 //===----------------------------------------------------------------------===//
>From 016aab17f4cc73416c6ebca61240f269aac837d2 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Wed, 30 Apr 2025 17:34:00 +0800
Subject: [PATCH 10/13] Fill in the missing code
---
 clang/lib/CodeGen/BackendUtil.cpp | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 2d33edbb8430d..6eb3a8638b7d1 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -103,6 +103,21 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP(
     "sanitizer-early-opt-ep", cl::Optional,
     cl::desc("Insert sanitizers on OptimizerEarlyEP."));
+// Experiment to mark cold functions as optsize/minsize/optnone.
+// TODO: remove once this is exposed as a proper driver flag.
+static cl::opt ClPGOColdFuncAttr(
+    "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden,
+    cl::desc(
+        "Function attribute to apply to cold functions as determined by PGO"),
+    cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default",
+                          "Default (no attribute)"),
+               clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize",
+                          "Mark cold functions with optsize."),
+               clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize",
+                          "Mark cold functions with minsize."),
+               clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone",
+                          "Mark cold functions with optnone.")));
+
 extern cl::opt ProfileCorrelate;
 } // namespace llvm
 namespace clang {
>From f36bfcfbfdc87b896f41be1ba25d8c18c339f1c1 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Thu, 1 May 2025 23:18:34 +0800
Subject: [PATCH 11/13] Adjusting the format of the code
---
 flang/test/Profile/gcc-flag-compatibility.f90 | 7 -------
 llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 7 ++++---
 2 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90
index 0124bc79b87ef..4490c45232d28 100644
--- a/flang/test/Profile/gcc-flag-compatibility.f90
+++ b/flang/test/Profile/gcc-flag-compatibility.f90
@@ -9,24 +9,17 @@
 ! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section
 ! PROFILE-GEN: @__profd_{{_?}}main =
-
-
 ! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof
 ! This uses LLVM IR format profile.
 ! RUN: rm -rf %t.dir
 ! RUN: mkdir -p %t.dir/some/path
 ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof
 ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s
-!
-
-
-
 ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof
 ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s
 ! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1}
 ! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100}
-
 program main
   implicit none
   integer :: i
diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index 3eb03cc3064cf..98b9e1554f317 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -14,6 +14,7 @@
 #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H
 #include
+
 namespace llvm {
 class Triple;
 class TargetLibraryInfoImpl;
@@ -34,9 +35,6 @@ enum class VectorLibrary {
   AMDLIBM // AMD vector math library.
 };
-TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
-                                  VectorLibrary Veclib);
-
 enum ProfileInstrKind {
   ProfileNone,       // Profile instrumentation is turned off.
   ProfileClangInstr, // Clang instrumentation to generate execution counts
@@ -44,6 +42,9 @@ enum ProfileInstrKind {
   ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
   ProfileCSIRInstr,  // IR level PGO context sensitive instrumentation in LLVM.
}; +TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, + VectorLibrary Veclib); + // Default filename used for profile generation. std::string getDefaultProfileGenName(); } // end namespace llvm::driver >From a5c7da77d2aa6909451bed3fb0f02c9b735dc876 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 2 May 2025 00:01:26 +0800 Subject: [PATCH 12/13] Adjusting the format of the code --- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 98b9e1554f317..84bba2a964ecf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -20,6 +20,7 @@ class Triple; class TargetLibraryInfoImpl; } // namespace llvm + namespace llvm::driver { /// Vector library option used with -fveclib= @@ -42,6 +43,7 @@ enum ProfileInstrKind { ProfileIRInstr, // IR level PGO instrumentation in LLVM. ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
}; + TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); >From a99e16b29d70d2fea6d16ec06e6ca55f477b74e9 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 2 May 2025 00:07:23 +0800 Subject: [PATCH 13/13] Adjusting the format of the code --- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 1 - llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 84bba2a964ecf..f0baa6fcdbbd3 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -20,7 +20,6 @@ class Triple; class TargetLibraryInfoImpl; } // namespace llvm - namespace llvm::driver { /// Vector library option used with -fveclib= diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index 14b6b89da8465..c48f5ed68b10b 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -16,6 +16,7 @@ extern llvm::cl::opt DebugInfoCorrelate; extern llvm::cl::opt ProfileCorrelate; } // namespace llvm + namespace llvm::driver { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, From flang-commits at lists.llvm.org Thu May 8 20:16:59 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 20:16:59 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <681d73ab.170a0220.304aef.d4e4@mx.google.com> fanju110 wrote: @MaskRay @aeubanks when you have a moment, could you please take a look at this PR? Let me know if there’s anything I should revise or clarify. Thanks! 
https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Thu May 8 22:25:43 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Thu, 08 May 2025 22:25:43 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <681d91d7.170a0220.156ead.e06a@mx.google.com> Thirumalai-Shaktivel wrote: Thanks for the reviews! https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Thu May 8 22:25:58 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 22:25:58 -0700 (PDT) Subject: [flang-commits] [flang] 3b9b377 - [Flang] [OpenMP] Add semantic checks for detach clause in task (#119172) Message-ID: <681d91e6.630a0220.17b710.6928@mx.google.com> Author: Thirumalai Shaktivel Date: 2025-05-09T10:55:54+05:30 New Revision: 3b9b377f6df78c390815a54786b742d96ccd11f0 URL: https://github.com/llvm/llvm-project/commit/3b9b377f6df78c390815a54786b742d96ccd11f0 DIFF: https://github.com/llvm/llvm-project/commit/3b9b377f6df78c390815a54786b742d96ccd11f0.diff LOG: [Flang] [OpenMP] Add semantic checks for detach clause in task (#119172) Fixes: - Add semantic checks along with the tests - Move the detach clause to allowedOnceClauses list in Task construct Restrictions: OpenMP 5.0: Task construct - At most one detach clause can appear on the directive. - If a detach clause appears on the directive, then a mergeable clause cannot appear on the same directive. OpenMP 5.2: Detach construct - If a detach clause appears on a directive, then the encountering task must not be a final task. - A variable that appears in a detach clause cannot appear as a list item on a data-environment attribute clause on the same construct. - A variable that is part of another variable (as an array element or a structure element) cannot appear in a detach clause. 
- event-handle must not have the POINTER attribute. Added: flang/test/Semantics/OpenMP/detach01.f90 flang/test/Semantics/OpenMP/detach02.f90 Modified: flang/lib/Semantics/check-omp-structure.cpp flang/lib/Semantics/check-omp-structure.h llvm/include/llvm/Frontend/OpenMP/OMP.td Removed: ################################################################################ diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index f17de42ca2466..dd8e511642976 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -1605,7 +1605,7 @@ void OmpStructureChecker::Leave(const parser::OpenMPThreadprivate &c) { const auto &dir{std::get(c.t)}; const auto &objectList{std::get(c.t)}; CheckSymbolNames(dir.source, objectList); - CheckIsVarPartOfAnotherVar(dir.source, objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, objectList); CheckThreadprivateOrDeclareTargetVar(objectList); dirContext_.pop_back(); } @@ -1746,7 +1746,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPDeclarativeAllocate &x) { for (const auto &clause : clauseList.v) { CheckAlignValue(clause); } - CheckIsVarPartOfAnotherVar(dir.source, objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, objectList); } void OmpStructureChecker::Leave(const parser::OpenMPDeclarativeAllocate &x) { @@ -1912,7 +1912,7 @@ void OmpStructureChecker::Leave(const parser::OpenMPDeclareTargetConstruct &x) { if (const auto *objectList{parser::Unwrap(spec.u)}) { deviceConstructFound_ = true; CheckSymbolNames(dir.source, *objectList); - CheckIsVarPartOfAnotherVar(dir.source, *objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, *objectList); CheckThreadprivateOrDeclareTargetVar(*objectList); } else if (const auto *clauseList{ parser::Unwrap(spec.u)}) { @@ -1925,18 +1925,18 @@ void OmpStructureChecker::Leave(const parser::OpenMPDeclareTargetConstruct &x) { toClauseFound = true; auto &objList{std::get(toClause.v.t)}; 
CheckSymbolNames(dir.source, objList); - CheckIsVarPartOfAnotherVar(dir.source, objList); + CheckVarIsNotPartOfAnotherVar(dir.source, objList); CheckThreadprivateOrDeclareTargetVar(objList); }, [&](const parser::OmpClause::Link &linkClause) { CheckSymbolNames(dir.source, linkClause.v); - CheckIsVarPartOfAnotherVar(dir.source, linkClause.v); + CheckVarIsNotPartOfAnotherVar(dir.source, linkClause.v); CheckThreadprivateOrDeclareTargetVar(linkClause.v); }, [&](const parser::OmpClause::Enter &enterClause) { enterClauseFound = true; CheckSymbolNames(dir.source, enterClause.v); - CheckIsVarPartOfAnotherVar(dir.source, enterClause.v); + CheckVarIsNotPartOfAnotherVar(dir.source, enterClause.v); CheckThreadprivateOrDeclareTargetVar(enterClause.v); }, [&](const parser::OmpClause::DeviceType &deviceTypeClause) { @@ -2019,7 +2019,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPExecutableAllocate &x) { CheckAlignValue(clause); } if (objectList) { - CheckIsVarPartOfAnotherVar(dir.source, *objectList); + CheckVarIsNotPartOfAnotherVar(dir.source, *objectList); } } @@ -2039,7 +2039,7 @@ void OmpStructureChecker::Enter(const parser::OpenMPAllocatorsConstruct &x) { for (const auto &clause : clauseList.v) { if (const auto *allocClause{ parser::Unwrap(clause)}) { - CheckIsVarPartOfAnotherVar( + CheckVarIsNotPartOfAnotherVar( dir.source, std::get(allocClause->v.t)); } } @@ -3156,6 +3156,65 @@ void OmpStructureChecker::Leave(const parser::OmpClauseList &) { // clause CheckMultListItems(); + if (GetContext().directive == llvm::omp::Directive::OMPD_task) { + if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { + unsigned version{context_.langOptions().OpenMPVersion}; + if (version == 50 || version == 51) { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_detach, + {llvm::omp::Clause::OMPC_mergeable}); + } else if (version >= 52) { + // OpenMP 5.2: 12.5.2 Detach construct restrictions + if 
(FindClause(llvm::omp::Clause::OMPC_final)) { + context_.Say(GetContext().clauseSource, + "If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task"_err_en_US); + } + + const auto &detach{ + std::get(detachClause->u)}; + if (const auto *name{parser::Unwrap(detach.v.v)}) { + Symbol *eventHandleSym{name->symbol}; + auto checkVarAppearsInDataEnvClause = [&](const parser::OmpObjectList + &objs, + std::string clause) { + for (const auto &obj : objs.v) { + if (const parser::Name * + objName{parser::Unwrap(obj)}) { + if (&objName->symbol->GetUltimate() == eventHandleSym) { + context_.Say(GetContext().clauseSource, + "A variable: `%s` that appears in a DETACH clause cannot appear on %s clause on the same construct"_err_en_US, + objName->source, clause); + } + } + } + }; + if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_private)}) { + const auto &pClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(pClause.v, "PRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_shared)}) { + const auto &sClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(sClause.v, "SHARED"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_firstprivate)}) { + const auto &fpClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause(fpClause.v, "FIRSTPRIVATE"); + } else if (auto *dataEnvClause{ + FindClause(llvm::omp::Clause::OMPC_in_reduction)}) { + const auto &irClause{ + std::get(dataEnvClause->u)}; + checkVarAppearsInDataEnvClause( + std::get(irClause.v.t), "IN_REDUCTION"); + } + } + } + } + } + auto testThreadprivateVarErr = [&](Symbol sym, parser::Name name, llvmOmpClause clauseTy) { if (sym.test(Symbol::Flag::OmpThreadprivate)) @@ -3238,7 +3297,6 @@ CHECK_SIMPLE_CLAUSE(Capture, OMPC_capture) CHECK_SIMPLE_CLAUSE(Contains, OMPC_contains) CHECK_SIMPLE_CLAUSE(Default, OMPC_default) CHECK_SIMPLE_CLAUSE(Depobj, OMPC_depobj) -CHECK_SIMPLE_CLAUSE(Detach, 
OMPC_detach) CHECK_SIMPLE_CLAUSE(DeviceType, OMPC_device_type) CHECK_SIMPLE_CLAUSE(DistSchedule, OMPC_dist_schedule) CHECK_SIMPLE_CLAUSE(Exclusive, OMPC_exclusive) @@ -3745,14 +3803,14 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Ordered &x) { void OmpStructureChecker::Enter(const parser::OmpClause::Shared &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_shared); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "SHARED"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "SHARED"); CheckCrayPointee(x.v, "SHARED"); } void OmpStructureChecker::Enter(const parser::OmpClause::Private &x) { SymbolSourceMap symbols; GetSymbolsInObjectList(x.v, symbols); CheckAllowedClause(llvm::omp::Clause::OMPC_private); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "PRIVATE"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "PRIVATE"); CheckIntentInPointer(symbols, llvm::omp::Clause::OMPC_private); CheckCrayPointee(x.v, "PRIVATE"); } @@ -3781,50 +3839,50 @@ bool OmpStructureChecker::IsDataRefTypeParamInquiry( return dataRefIsTypeParamInquiry; } -void OmpStructureChecker::CheckIsVarPartOfAnotherVar( +void OmpStructureChecker::CheckVarIsNotPartOfAnotherVar( const parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause) { for (const auto &ompObject : objList.v) { - common::visit( - common::visitors{ - [&](const parser::Designator &designator) { - if (const auto *dataRef{ - std::get_if(&designator.u)}) { - if (IsDataRefTypeParamInquiry(dataRef)) { + CheckVarIsNotPartOfAnotherVar(source, ompObject, clause); + } +} + +void OmpStructureChecker::CheckVarIsNotPartOfAnotherVar( + const parser::CharBlock &source, const parser::OmpObject &ompObject, + llvm::StringRef clause) { + common::visit( + common::visitors{ + [&](const parser::Designator &designator) { + if (const auto *dataRef{ + std::get_if(&designator.u)}) { + if (IsDataRefTypeParamInquiry(dataRef)) { + context_.Say(source, + "A type 
parameter inquiry cannot appear on the %s directive"_err_en_US, + ContextDirectiveAsFortran()); + } else if (parser::Unwrap( + ompObject) || + parser::Unwrap(ompObject)) { + if (llvm::omp::nonPartialVarSet.test(GetContext().directive)) { context_.Say(source, - "A type parameter inquiry cannot appear on the %s " - "directive"_err_en_US, + "A variable that is part of another variable (as an array or structure element) cannot appear on the %s directive"_err_en_US, ContextDirectiveAsFortran()); - } else if (parser::Unwrap( - ompObject) || - parser::Unwrap(ompObject)) { - if (llvm::omp::nonPartialVarSet.test( - GetContext().directive)) { - context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear on the %s " - "directive"_err_en_US, - ContextDirectiveAsFortran()); - } else { - context_.Say(source, - "A variable that is part of another variable (as an " - "array or structure element) cannot appear in a " - "%s clause"_err_en_US, - clause.data()); - } + } else { + context_.Say(source, + "A variable that is part of another variable (as an array or structure element) cannot appear in a %s clause"_err_en_US, + clause.data()); } } - }, - [&](const parser::Name &name) {}, - }, - ompObject.u); - } + } + }, + [&](const parser::Name &name) {}, + }, + ompObject.u); } void OmpStructureChecker::Enter(const parser::OmpClause::Firstprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_firstprivate); - CheckIsVarPartOfAnotherVar(GetContext().clauseSource, x.v, "FIRSTPRIVATE"); + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v, "FIRSTPRIVATE"); CheckCrayPointee(x.v, "FIRSTPRIVATE"); CheckIsLoopIvPartOfClause(llvmOmpClause::OMPC_firstprivate, x.v); @@ -4147,6 +4205,33 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Linear &x) { } } +void OmpStructureChecker::Enter(const parser::OmpClause::Detach &x) { + unsigned version{context_.langOptions().OpenMPVersion}; + if (version >= 52) { + 
SetContextClauseInfo(llvm::omp::Clause::OMPC_detach); + } else { + // OpenMP 5.0: 2.10.1 Task construct restrictions + CheckAllowedClause(llvm::omp::Clause::OMPC_detach); + } + // OpenMP 5.2: 12.5.2 Detach clause restrictions + if (version >= 52) { + CheckVarIsNotPartOfAnotherVar(GetContext().clauseSource, x.v.v, "DETACH"); + } + + if (const auto *name{parser::Unwrap(x.v.v)}) { + if (version >= 52 && IsPointer(*name->symbol)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must not have the POINTER attribute"_err_en_US, + name->ToString()); + } + if (!name->symbol->GetType()->IsNumeric(TypeCategory::Integer)) { + context_.Say(GetContext().clauseSource, + "The event-handle: `%s` must be of type integer(kind=omp_event_handle_kind)"_err_en_US, + name->ToString()); + } + } +} + void OmpStructureChecker::CheckAllowedMapTypes( const parser::OmpMapType::Value &type, const std::list &allowedMapTypeList) { @@ -4495,7 +4580,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Lastprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_lastprivate); const auto &objectList{std::get(x.v.t)}; - CheckIsVarPartOfAnotherVar( + CheckVarIsNotPartOfAnotherVar( GetContext().clauseSource, objectList, "LASTPRIVATE"); CheckCrayPointee(objectList, "LASTPRIVATE"); diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 911a6bb08fb87..587959f7d506f 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -233,7 +233,9 @@ class OmpStructureChecker const common::Indirection &, const parser::Name &); void CheckDoacross(const parser::OmpDoacross &doa); bool IsDataRefTypeParamInquiry(const parser::DataRef *dataRef); - void CheckIsVarPartOfAnotherVar(const parser::CharBlock &source, + void CheckVarIsNotPartOfAnotherVar(const parser::CharBlock &source, + const parser::OmpObject &obj, llvm::StringRef clause = ""); + void CheckVarIsNotPartOfAnotherVar(const 
parser::CharBlock &source, const parser::OmpObjectList &objList, llvm::StringRef clause = ""); void CheckThreadprivateOrDeclareTargetVar( const parser::OmpObjectList &objList); diff --git a/flang/test/Semantics/OpenMP/detach01.f90 b/flang/test/Semantics/OpenMP/detach01.f90 new file mode 100644 index 0000000000000..7729c85ea1128 --- /dev/null +++ b/flang/test/Semantics/OpenMP/detach01.f90 @@ -0,0 +1,63 @@ +! REQUIRES: openmp_runtime +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=52 + +! OpenMP Version 5.2: 12.5.2 +! Various checks for DETACH Clause + +program detach01 + use omp_lib, only: omp_event_handle_kind + implicit none + real :: e, x + integer(omp_event_handle_kind) :: event_01, event_02(2) + integer(omp_event_handle_kind), pointer :: event_03 + + type :: t + integer(omp_event_handle_kind) :: event + end type + type(t) :: t_01 + + !ERROR: The event-handle: `e` must be of type integer(kind=omp_event_handle_kind) + !$omp task detach(e) + x = x + 1 + !$omp end task + + !ERROR: If a DETACH clause appears on a directive, then the encountering task must not be a FINAL task + !$omp task detach(event_01) final(.false.) 
+ x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on PRIVATE clause on the same construct + !$omp task detach(event_01) private(event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on FIRSTPRIVATE clause on the same construct + !$omp task detach(event_01) firstprivate(event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on SHARED clause on the same construct + !$omp task detach(event_01) shared(event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable: `event_01` that appears in a DETACH clause cannot appear on IN_REDUCTION clause on the same construct + !$omp task detach(event_01) in_reduction(+:event_01) + x = x + 1 + !$omp end task + + !ERROR: A variable that is part of another variable (as an array or structure element) cannot appear in a DETACH clause + !$omp task detach(event_02(1)) + x = x + 1 + !$omp end task + + !ERROR: A variable that is part of another variable (as an array or structure element) cannot appear in a DETACH clause + !$omp task detach(t_01%event) + x = x + 1 + !$omp end task + + !ERROR: The event-handle: `event_03` must not have the POINTER attribute + !$omp task detach(event_03) + x = x + 1 + !$omp end task +end program diff --git a/flang/test/Semantics/OpenMP/detach02.f90 b/flang/test/Semantics/OpenMP/detach02.f90 new file mode 100644 index 0000000000000..49d80358fcdb6 --- /dev/null +++ b/flang/test/Semantics/OpenMP/detach02.f90 @@ -0,0 +1,21 @@ +! REQUIRES: openmp_runtime +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=50 +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags -fopenmp-version=51 + +! OpenMP Version 5.0: 2.10.1 +! 
Various checks for DETACH Clause + +program detach02 + use omp_lib, only: omp_event_handle_kind + integer(omp_event_handle_kind) :: event_01, event_02 + + !ERROR: At most one DETACH clause can appear on the TASK directive + !$omp task detach(event_01) detach(event_02) + x = x + 1 + !$omp end task + + !ERROR: Clause MERGEABLE is not allowed if clause DETACH appears on the TASK directive + !$omp task detach(event_01) mergeable + x = x + 1 + !$omp end task +end program diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 583718a2396f5..194b1e657c493 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1140,7 +1140,6 @@ def OMP_Task : Directive<"task"> { VersionedClause, VersionedClause, VersionedClause, - VersionedClause, VersionedClause, VersionedClause, VersionedClause, @@ -1150,6 +1149,7 @@ def OMP_Task : Directive<"task"> { ]; let allowedOnceClauses = [ VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, From flang-commits at lists.llvm.org Thu May 8 22:26:08 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Thu, 08 May 2025 22:26:08 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <681d91f0.170a0220.247d49.8b3d@mx.google.com> https://github.com/Thirumalai-Shaktivel closed https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Thu May 8 22:26:41 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Thu, 08 May 2025 22:26:41 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <681d9211.170a0220.19c0e7.a359@mx.google.com> https://github.com/Thirumalai-Shaktivel edited https://github.com/llvm/llvm-project/pull/119172 From 
flang-commits at lists.llvm.org Thu May 8 22:47:56 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Thu, 08 May 2025 22:47:56 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681d970c.a70a0220.348493.d24b@mx.google.com> https://github.com/kaviya2510 updated https://github.com/llvm/llvm-project/pull/128490 >From 2075eb0739938946e80b8e632f6512be735a04c7 Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Wed, 7 May 2025 11:53:51 +0530 Subject: [PATCH 1/4] [Flang][OpenMP]Added MLIR lowering for grainsize and num_tasks clause --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 42 +++++++++++++++ flang/lib/Lower/OpenMP/ClauseProcessor.h | 4 ++ flang/lib/Lower/OpenMP/OpenMP.cpp | 26 ++++----- .../test/Lower/OpenMP/taskloop-grainsize.f90 | 51 ++++++++++++++++++ flang/test/Lower/OpenMP/taskloop-numtasks.f90 | 54 +++++++++++++++++++ 5 files changed, 165 insertions(+), 12 deletions(-) create mode 100644 flang/test/Lower/OpenMP/taskloop-grainsize.f90 create mode 100644 flang/test/Lower/OpenMP/taskloop-numtasks.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 77b4622547d7a..ac940b5c74152 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -365,6 +365,27 @@ bool ClauseProcessor::processHint(mlir::omp::HintClauseOps &result) const { return false; } +bool ClauseProcessor::processGrainsize( + lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const { + using grainsize = omp::clause::Grainsize; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier) { + result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( + context, 
mlir::omp::ClauseGrainsizeType::Strict); + } + const auto &grainsizeExpr = std::get(clause->t); + result.grainsize = + fir::getBase(converter.genExprValue(grainsizeExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processInclusive( mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const { @@ -388,6 +409,27 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const { return markClauseOccurrence(result.nowait); } +bool ClauseProcessor::processNumTasks( + lower::StatementContext &stmtCtx, + mlir::omp::NumTasksClauseOps &result) const { + using numtasks = omp::clause::NumTasks; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier) { + result.numTasksMod = mlir::omp::ClauseNumTasksTypeAttr::get( + context, mlir::omp::ClauseNumTasksType::Strict); + } + const auto &numtasksExpr = std::get(clause->t); + result.numTasks = + fir::getBase(converter.genExprValue(numtasksExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processNumTeams( lower::StatementContext &stmtCtx, mlir::omp::NumTeamsClauseOps &result) const { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index bdddeb145b496..375e24b80fc21 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -78,10 +78,14 @@ class ClauseProcessor { mlir::omp::HasDeviceAddrClauseOps &result, llvm::SmallVectorImpl &hasDeviceSyms) const; bool processHint(mlir::omp::HintClauseOps &result) const; + bool processGrainsize(lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const; bool processInclusive(mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const; bool processMergeable(mlir::omp::MergeableClauseOps &result) const; bool 
processNowait(mlir::omp::NowaitClauseOps &result) const; + bool processNumTasks(lower::StatementContext &stmtCtx, + mlir::omp::NumTasksClauseOps &result) const; bool processNumTeams(lower::StatementContext &stmtCtx, mlir::omp::NumTeamsClauseOps &result) const; bool processNumThreads(lower::StatementContext &stmtCtx, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fcd3de9671098..af227b28d35b3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1806,17 +1806,19 @@ static void genTaskgroupClauses(lower::AbstractConverter &converter, static void genTaskloopClauses(lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, + lower::StatementContext &stmtCtx, const List &clauses, mlir::Location loc, mlir::omp::TaskloopOperands &clauseOps) { ClauseProcessor cp(converter, semaCtx, clauses); + cp.processGrainsize(stmtCtx, clauseOps); + cp.processNumTasks(stmtCtx, clauseOps); cp.processTODO( - loc, llvm::omp::Directive::OMPD_taskloop); + clause::Final, clause::If, clause::InReduction, + clause::Lastprivate, clause::Mergeable, clause::Nogroup, + clause::Priority, clause::Reduction, clause::Shared, + clause::Untied>(loc, llvm::omp::Directive::OMPD_taskloop); } static void genTaskwaitClauses(lower::AbstractConverter &converter, @@ -3268,12 +3270,12 @@ genStandaloneSimd(lower::AbstractConverter &converter, lower::SymMap &symTable, static mlir::omp::TaskloopOp genStandaloneTaskloop( lower::AbstractConverter &converter, lower::SymMap &symTable, - semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - mlir::Location loc, const ConstructQueue &queue, - ConstructQueue::const_iterator item) { + lower::StatementContext &stmtCtx, semantics::SemanticsContext &semaCtx, + lower::pft::Evaluation &eval, mlir::Location loc, + const ConstructQueue &queue, ConstructQueue::const_iterator item) { mlir::omp::TaskloopOperands taskloopClauseOps; - genTaskloopClauses(converter, semaCtx, 
item->clauses, loc, taskloopClauseOps); - + genTaskloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc, + taskloopClauseOps); DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval, /*shouldCollectPreDeterminedSymbols=*/true, enableDelayedPrivatization, symTable); @@ -3734,8 +3736,8 @@ static void genOMPDispatch(lower::AbstractConverter &converter, genTaskgroupOp(converter, symTable, semaCtx, eval, loc, queue, item); break; case llvm::omp::Directive::OMPD_taskloop: - newOp = genStandaloneTaskloop(converter, symTable, semaCtx, eval, loc, - queue, item); + newOp = genStandaloneTaskloop(converter, symTable, stmtCtx, semaCtx, eval, + loc, queue, item); break; case llvm::omp::Directive::OMPD_taskwait: newOp = genTaskwaitOp(converter, symTable, semaCtx, eval, loc, queue, item); diff --git a/flang/test/Lower/OpenMP/taskloop-grainsize.f90 b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 new file mode 100644 index 0000000000000..fa684ad213d0a --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 @@ -0,0 +1,51 @@ +! This test checks lowering of grainsize clause in taskloop directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE_TEST2:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_grainsize +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_grainsizeEi"} +! 
CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_grainsizeEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFtest_grainsizeEx"} +! CHECK: %[[DECL_X:.*]]:2 = hlfir.declare %[[ALLOCA_X]] {uniq_name = "_QFtest_grainsizeEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[GRAINSIZE:.*]] = arith.constant 10 : i32 +subroutine test_grainsize + integer :: i, x + ! CHECK: omp.taskloop grainsize(%[[GRAINSIZE]]: i32) + ! CHECK-SAME: private(@[[X_FIRSTPRIVATE]] %[[DECL_X]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { + ! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { + !$omp taskloop grainsize(10) + do i = 1, 1000 + x = x + 1 + end do + !$omp end taskloop +end subroutine test_grainsize + +!CHECK-LABEL: func.func @_QPtest_grainsize_strict() +subroutine test_grainsize_strict + integer :: i, x + ! CHECK: %[[GRAINSIZE:.*]] = arith.constant 10 : i32 + ! CHECK: omp.taskloop grainsize(strict, %[[GRAINSIZE]]: i32) + !$omp taskloop grainsize(strict:10) + do i = 1, 1000 + !CHECK: arith.addi + x = x + 1 + end do + !$omp end taskloop +end subroutine \ No newline at end of file diff --git a/flang/test/Lower/OpenMP/taskloop-numtasks.f90 b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 new file mode 100644 index 0000000000000..38f3975bbd371 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 @@ -0,0 +1,54 @@ +! This test checks lowering of num_tasks clause in taskloop directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE_TEST2:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! 
CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_num_tasks +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_num_tasksEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_num_tasksEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFtest_num_tasksEx"} +! CHECK: %[[DECL_X:.*]]:2 = hlfir.declare %[[ALLOCA_X]] {uniq_name = "_QFtest_num_tasksEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL_NUMTASKS:.*]] = arith.constant 10 : i32 +subroutine test_num_tasks + integer :: i, x + ! CHECK: omp.taskloop num_tasks(%[[VAL_NUMTASKS]]: i32) + ! CHECK-SAME: private(@[[X_FIRSTPRIVATE]] %[[DECL_X]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { + ! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { + !$omp taskloop num_tasks(10) + do i = 1, 1000 + x = x + 1 + end do + !$omp end taskloop +end subroutine test_num_tasks + +! CHECK-LABEL: func.func @_QPtest_num_tasks_strict +subroutine test_num_tasks_strict + integer :: x, i + ! CHECK: %[[NUM_TASKS:.*]] = arith.constant 10 : i32 + ! 
CHECK: omp.taskloop num_tasks(strict, %[[NUM_TASKS]]: i32) + !$omp taskloop num_tasks(strict:10) + do i = 1, 100 + !CHECK: arith.addi + x = x + 1 + end do + !$omp end taskloop +end subroutine + + + >From 916889f13e0a5d2b67d370e338ed1fcfd8c62b2a Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Wed, 7 May 2025 12:04:30 +0530 Subject: [PATCH 2/4] [Flang][OpenMP] Formatting fix --- flang/test/Lower/OpenMP/taskloop-grainsize.f90 | 2 +- flang/test/Lower/OpenMP/taskloop-numtasks.f90 | 3 --- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/flang/test/Lower/OpenMP/taskloop-grainsize.f90 b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 index fa684ad213d0a..43db8acdeceac 100644 --- a/flang/test/Lower/OpenMP/taskloop-grainsize.f90 +++ b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 @@ -48,4 +48,4 @@ subroutine test_grainsize_strict x = x + 1 end do !$omp end taskloop -end subroutine \ No newline at end of file +end subroutine diff --git a/flang/test/Lower/OpenMP/taskloop-numtasks.f90 b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 index 38f3975bbd371..f68f3a2d6ad26 100644 --- a/flang/test/Lower/OpenMP/taskloop-numtasks.f90 +++ b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 @@ -49,6 +49,3 @@ subroutine test_num_tasks_strict end do !$omp end taskloop end subroutine - - - >From 9558691252b5ab9ab693aa017a963f8a1a7fa1b1 Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Thu, 8 May 2025 13:13:06 +0530 Subject: [PATCH 3/4] [Flang][OpenMP] Added a condition to checks if the modifier is 'strict' --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index ac940b5c74152..e2a35cee5c75e 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -374,7 +374,7 @@ bool ClauseProcessor::processGrainsize( mlir::MLIRContext *context = firOpBuilder.getContext(); const auto 
&modifier = std::get>(clause->t); - if (modifier) { + if (modifier && *modifier == grainsize::Prescriptiveness::Strict) { result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( context, mlir::omp::ClauseGrainsizeType::Strict); } @@ -418,7 +418,7 @@ bool ClauseProcessor::processNumTasks( mlir::MLIRContext *context = firOpBuilder.getContext(); const auto &modifier = std::get>(clause->t); - if (modifier) { + if (modifier && *modifier == numtasks::Prescriptiveness::Strict) { result.numTasksMod = mlir::omp::ClauseNumTasksTypeAttr::get( context, mlir::omp::ClauseNumTasksType::Strict); } >From eb8d4d78e04f5274028fd55160bc7063db9eae18 Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Thu, 8 May 2025 19:44:09 +0530 Subject: [PATCH 4/4] Formatting fix and update variable names --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 50 +++++++++++----------- flang/lib/Lower/OpenMP/ClauseProcessor.h | 4 +- 2 files changed, 27 insertions(+), 27 deletions(-) diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index e2a35cee5c75e..7fbbcec753221 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -365,27 +365,6 @@ bool ClauseProcessor::processHint(mlir::omp::HintClauseOps &result) const { return false; } -bool ClauseProcessor::processGrainsize( - lower::StatementContext &stmtCtx, - mlir::omp::GrainsizeClauseOps &result) const { - using grainsize = omp::clause::Grainsize; - if (auto *clause = findUniqueClause()) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::MLIRContext *context = firOpBuilder.getContext(); - const auto &modifier = - std::get>(clause->t); - if (modifier && *modifier == grainsize::Prescriptiveness::Strict) { - result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( - context, mlir::omp::ClauseGrainsizeType::Strict); - } - const auto &grainsizeExpr = std::get(clause->t); - result.grainsize = - 
fir::getBase(converter.genExprValue(grainsizeExpr, stmtCtx)); - return true; - } - return false; -} - bool ClauseProcessor::processInclusive( mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const { @@ -412,13 +391,13 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const { bool ClauseProcessor::processNumTasks( lower::StatementContext &stmtCtx, mlir::omp::NumTasksClauseOps &result) const { - using numtasks = omp::clause::NumTasks; - if (auto *clause = findUniqueClause()) { + using NumTasks = omp::clause::NumTasks; + if (auto *clause = findUniqueClause()) { fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); mlir::MLIRContext *context = firOpBuilder.getContext(); const auto &modifier = - std::get>(clause->t); - if (modifier && *modifier == numtasks::Prescriptiveness::Strict) { + std::get>(clause->t); + if (modifier && *modifier == NumTasks::Prescriptiveness::Strict) { result.numTasksMod = mlir::omp::ClauseNumTasksTypeAttr::get( context, mlir::omp::ClauseNumTasksType::Strict); } @@ -976,6 +955,27 @@ bool ClauseProcessor::processDepend(lower::SymMap &symMap, return findRepeatableClause(process); } +bool ClauseProcessor::processGrainsize( + lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const { + using Grainsize = omp::clause::Grainsize; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier && *modifier == Grainsize::Prescriptiveness::Strict) { + result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( + context, mlir::omp::ClauseGrainsizeType::Strict); + } + const auto &grainsizeExpr = std::get(clause->t); + result.grainsize = + fir::getBase(converter.genExprValue(grainsizeExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processHasDeviceAddr( lower::StatementContext 
&stmtCtx, mlir::omp::HasDeviceAddrClauseOps &result, llvm::SmallVectorImpl &hasDeviceSyms) const { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 375e24b80fc21..9541f3585ead3 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -73,13 +73,13 @@ class ClauseProcessor { mlir::omp::FilterClauseOps &result) const; bool processFinal(lower::StatementContext &stmtCtx, mlir::omp::FinalClauseOps &result) const; + bool processGrainsize(lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const; bool processHasDeviceAddr( lower::StatementContext &stmtCtx, mlir::omp::HasDeviceAddrClauseOps &result, llvm::SmallVectorImpl &hasDeviceSyms) const; bool processHint(mlir::omp::HintClauseOps &result) const; - bool processGrainsize(lower::StatementContext &stmtCtx, - mlir::omp::GrainsizeClauseOps &result) const; bool processInclusive(mlir::Location currentLocation, mlir::omp::InclusiveClauseOps &result) const; bool processMergeable(mlir::omp::MergeableClauseOps &result) const; From flang-commits at lists.llvm.org Thu May 8 22:59:02 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Thu, 08 May 2025 22:59:02 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang] [OpenMP] Add semantic checks for detach clause in task (PR #119172) In-Reply-To: Message-ID: <681d99a6.170a0220.352c29.c393@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `clang-m68k-linux-cross` running on `suse-gary-m68k-cross` while building `flang,llvm` at step 5 "ninja check 1". Full details are available at: https://lab.llvm.org/buildbot/#/builders/27/builds/9785
Here is the relevant piece of the build log for the reference ``` Step 5 (ninja check 1) failure: stage 1 checked (failure) ... [152/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/StaticAnalyzer/RangeSetTest.cpp.o [153/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/StaticAnalyzer/RegisterCustomCheckersTest.cpp.o [154/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ByteCode/BitcastBuffer.cpp.o [155/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/Analysis/FlowSensitive/SignAnalysisTest.cpp.o [156/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/StaticAnalyzer/ObjcBug-124477.cpp.o [157/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/StaticAnalyzer/Z3CrosscheckOracleTest.cpp.o [158/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ASTVectorTest.cpp.o [159/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/StaticAnalyzer/TestReturnValueUnderConstruction.cpp.o [160/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/StaticAnalyzer/SValSimplifyerTest.cpp.o [161/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/Analysis/FlowSensitive/TransferTest.cpp.o FAILED: tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/Analysis/FlowSensitive/TransferTest.cpp.o /usr/bin/c++ -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_STATIC -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/tools/clang/unittests -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/unittests -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/include 
-I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/tools/clang/include -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/stage1/include -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/llvm/include -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/unittests/Tooling -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/third-party/unittest/googletest/include -I/var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/third-party/unittest/googlemock/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wno-unnecessary-virtual-specifier -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -fno-strict-aliasing -O3 -DNDEBUG -std=c++17 -Wno-variadic-macros -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -Wno-suggest-override -MD -MT tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/Analysis/FlowSensitive/TransferTest.cpp.o -MF tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/Analysis/FlowSensitive/TransferTest.cpp.o.d -o tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/Analysis/FlowSensitive/TransferTest.cpp.o -c /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp c++: fatal error: Killed signal terminated program cc1plus compilation terminated. 
[162/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ASTDumperTest.cpp.o [163/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ByteCode/toAPValue.cpp.o /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/unittests/AST/ByteCode/toAPValue.cpp: In member function ‘virtual void ToAPValue_Pointers_Test::TestBody()’: /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/unittests/AST/ByteCode/toAPValue.cpp:97:20: warning: possibly dangling reference to a temporary [-Wdangling-reference] 97 | const Pointer &GP = getGlobalPtr("arrp").deref(); | ^~ /var/lib/buildbot/workers/suse-gary-m68k-cross/clang-m68k-linux-cross/llvm/clang/unittests/AST/ByteCode/toAPValue.cpp:97:60: note: the temporary was destroyed at the end of the full expression ‘ToAPValue_Pointers_Test::TestBody()::(((const char*)"arrp")).clang::interp::Pointer::deref()’ 97 | const Pointer &GP = getGlobalPtr("arrp").deref(); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ At global scope: cc1plus: note: unrecognized command-line option ‘-Wno-unnecessary-virtual-specifier’ may have been intended to silence earlier diagnostics [164/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/ASTMatchers/GtestMatchersTest.cpp.o [165/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ASTImporterObjCTest.cpp.o [166/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/StaticAnalyzer/SValTest.cpp.o [167/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/ASTMatchers/ASTMatchersInternalTest.cpp.o [168/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ASTContextParentMapTest.cpp.o [169/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ExternalASTSourceTest.cpp.o [170/369] Building CXX object 
tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/EvaluateAsRValueTest.cpp.o [171/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/AttrTest.cpp.o [172/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ASTExprTest.cpp.o [173/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/ASTMatchers/Dynamic/ParserTest.cpp.o [174/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ConceptPrinterTest.cpp.o [175/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/DataCollectionTest.cpp.o [176/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/DeclBaseTest.cpp.o [177/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp.o [178/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/StaticAnalyzer/SymbolReaperTest.cpp.o [179/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ASTImporterFixtures.cpp.o [180/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ByteCode/Descriptor.cpp.o [181/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/ASTMatchers/Dynamic/VariantValueTest.cpp.o [182/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ASTTypeTraitsTest.cpp.o [183/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/StaticAnalyzer/StoreTest.cpp.o [184/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/ASTMatchers/Dynamic/RegistryTest.cpp.o [185/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/DeclPrinterTest.cpp.o [186/369] Building CXX object tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/ASTImporterVisibilityTest.cpp.o [187/369] Building CXX object 
tools/clang/unittests/CMakeFiles/AllClangUnitTests.dir/AST/DeclTest.cpp.o ```
https://github.com/llvm/llvm-project/pull/119172 From flang-commits at lists.llvm.org Thu May 8 23:22:40 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 23:22:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <681d9f30.630a0220.d8b12.7693@mx.google.com> https://github.com/NimishMishra updated https://github.com/llvm/llvm-project/pull/138163 >From e912e7c9e434dc40fbd986f98725bda849a56553 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Thu, 1 May 2025 21:51:19 +0530 Subject: [PATCH] [flang][OpenMP] Add implicit casts for omp.atomic.capture --- flang/docs/OpenMPSupport.md | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 236 ++++++++++++------ .../Todo/atomic-capture-implicit-cast.f90 | 48 ---- .../Lower/OpenMP/atomic-implicit-cast.f90 | 78 ++++++ 4 files changed, 240 insertions(+), 124 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md index 2d4b9dd737777..46be14f4c168c 100644 --- a/flang/docs/OpenMPSupport.md +++ b/flang/docs/OpenMPSupport.md @@ -64,4 +64,4 @@ Note : No distinction is made between the support in Parser/Semantics, MLIR, Low | target teams distribute parallel loop simd construct | P | device, reduction, dist_schedule and linear clauses are not supported | ## OpenMP 3.1, OpenMP 2.5, OpenMP 1.1 -All features except a few corner cases in atomic (complex type, different but compatible types in lhs and rhs), threadprivate (character type) constructs/clauses are supported. +All features except a few corner cases in threadprivate (character type) constructs/clauses are supported. 
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 47e7c266ff7d3..526148855b113 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2865,6 +2865,85 @@ static void genAtomicWrite(lower::AbstractConverter &converter, rightHandClauseList, loc); } +/* + Emit an implicit cast. Different yet compatible types on + omp.atomic.read constitute valid Fortran. The OMPIRBuilder will + emit atomic instructions (on primitive types) and `__atomic_load` + libcall (on complex type) without explicitly converting + between such compatible types. The OMPIRBuilder relies on the + frontend to resolve such inconsistencies between `omp.atomic.read ` + operand types. Similar inconsistencies between operand types in + `omp.atomic.write` are resolved through implicit casting by use of typed + assignment (i.e. `evaluate::Assignment`). However, use of typed + assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, + non-atomic load of `x` into a temporary `alloca`, followed by an atomic + read of form `v = alloca`. Hence, it is needed to perform a custom + implicit cast. + + An atomic read of form `v = x` would (without implicit casting) + lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, + type2`. This implicit casting will rather generate the following FIR: + + %alloca = fir.alloca type2 + omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 + %load = fir.load %alloca : !fir.ref + %cvt = fir.convert %load : (type2) -> type1 + fir.store %cvt to %v : !fir.ref + + These sequence of operations is thread-safe since each thread allocates + the `alloca` in its stack, and performs `%alloca = %x` atomically. Once + safely read, each thread performs the implicit cast on the local + `alloca`, and writes the final result to `%v`. 
+ +/// \param builder : FirOpBuilder +/// \param loc : Location for FIR generation +/// \param toAddress : Address of %v +/// \param toType : Type of %v +/// \param fromType : Type of %x +/// \param alloca : Thread scoped `alloca` +// It is the responsibility of the callee +// to position the `alloca` at `AllocaIP` +// through `builder.getAllocaBlock()` +*/ + +static void emitAtomicReadImplicitCast(fir::FirOpBuilder &builder, + mlir::Location loc, + mlir::Value toAddress, mlir::Type toType, + mlir::Type fromType, + mlir::Value alloca) { + auto load = builder.create(loc, alloca); + if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { + // Emit an additional `ExtractValueOp` if `fromAddress` is of complex + // type, but `toAddress` is not. + auto extract = builder.create( + loc, mlir::cast(fromType).getElementType(), load, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + auto cvt = builder.create(loc, toType, extract); + builder.create(loc, cvt, toAddress); + } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { + // Emit an additional `InsertValueOp` if `toAddress` is of complex + // type, but `fromAddress` is not. + mlir::Value undef = builder.create(loc, toType); + mlir::Type complexEleTy = + mlir::cast(toType).getElementType(); + mlir::Value cvt = builder.create(loc, complexEleTy, load); + mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); + mlir::Value idx0 = builder.create( + loc, toType, undef, cvt, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + mlir::Value idx1 = builder.create( + loc, toType, idx0, zero, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 1))); + builder.create(loc, idx1, toAddress); + } else { + auto cvt = builder.create(loc, toType, load); + builder.create(loc, cvt, toAddress); + } +} + /// Processes an atomic construct with read clause. 
static void genAtomicRead(lower::AbstractConverter &converter, const parser::OmpAtomicRead &atomicRead, @@ -2891,34 +2970,7 @@ static void genAtomicRead(lower::AbstractConverter &converter, *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); if (fromAddress.getType() != toAddress.getType()) { - // Emit an implicit cast. Different yet compatible types on - // omp.atomic.read constitute valid Fortran. The OMPIRBuilder will - // emit atomic instructions (on primitive types) and `__atomic_load` - // libcall (on complex type) without explicitly converting - // between such compatible types. The OMPIRBuilder relies on the - // frontend to resolve such inconsistencies between `omp.atomic.read ` - // operand types. Similar inconsistencies between operand types in - // `omp.atomic.write` are resolved through implicit casting by use of typed - // assignment (i.e. `evaluate::Assignment`). However, use of typed - // assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, - // non-atomic load of `x` into a temporary `alloca`, followed by an atomic - // read of form `v = alloca`. Hence, it is needed to perform a custom - // implicit cast. - - // An atomic read of form `v = x` would (without implicit casting) - // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, - // type2`. This implicit casting will rather generate the following FIR: - // - // %alloca = fir.alloca type2 - // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 - // %load = fir.load %alloca : !fir.ref - // %cvt = fir.convert %load : (type2) -> type1 - // fir.store %cvt to %v : !fir.ref - - // These sequence of operations is thread-safe since each thread allocates - // the `alloca` in its stack, and performs `%alloca = %x` atomically. Once - // safely read, each thread performs the implicit cast on the local - // `alloca`, and writes the final result to `%v`. 
+ mlir::Type toType = fir::unwrapRefType(toAddress.getType()); mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); @@ -2930,37 +2982,8 @@ static void genAtomicRead(lower::AbstractConverter &converter, genAtomicCaptureStatement(converter, fromAddress, alloca, leftHandClauseList, rightHandClauseList, elementType, loc); - auto load = builder.create(loc, alloca); - if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { - // Emit an additional `ExtractValueOp` if `fromAddress` is of complex - // type, but `toAddress` is not. - auto extract = builder.create( - loc, mlir::cast(fromType).getElementType(), load, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 0))); - auto cvt = builder.create(loc, toType, extract); - builder.create(loc, cvt, toAddress); - } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { - // Emit an additional `InsertValueOp` if `toAddress` is of complex - // type, but `fromAddress` is not. 
- mlir::Value undef = builder.create(loc, toType); - mlir::Type complexEleTy = - mlir::cast(toType).getElementType(); - mlir::Value cvt = builder.create(loc, complexEleTy, load); - mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); - mlir::Value idx0 = builder.create( - loc, toType, undef, cvt, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 0))); - mlir::Value idx1 = builder.create( - loc, toType, idx0, zero, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 1))); - builder.create(loc, idx1, toAddress); - } else { - auto cvt = builder.create(loc, toType, load); - builder.create(loc, cvt, toAddress); - } + emitAtomicReadImplicitCast(builder, loc, toAddress, toType, fromType, + alloca); } else genAtomicCaptureStatement(converter, fromAddress, toAddress, leftHandClauseList, rightHandClauseList, @@ -3049,10 +3072,6 @@ static void genAtomicCapture(lower::AbstractConverter &converter, mlir::Type stmt2VarType = fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - // Check if implicit type is needed - if (stmt1VarType != stmt2VarType) - TODO(loc, "atomic capture requiring implicit type casts"); - mlir::Operation *atomicCaptureOp = nullptr; mlir::IntegerAttr hint = nullptr; mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; @@ -3075,10 +3094,31 @@ static void genAtomicCapture(lower::AbstractConverter &converter, // Atomic capture construct is of the form [capture-stmt, update-stmt] const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt1LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt2LHSArg.getType()); + { + 
mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + genAtomicCaptureStatement(converter, stmt2LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt1LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } genAtomicUpdateStatement( converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, /*leftHandClauseList=*/nullptr, @@ -3091,10 +3131,32 @@ static void genAtomicCapture(lower::AbstractConverter &converter, firOpBuilder.setInsertionPointToStart(&block); const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt1LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt2LHSArg.getType()); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + genAtomicCaptureStatement(converter, stmt2LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt1LHSArg, toType, + fromType, alloca); + } + } else { + 
genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, loc); @@ -3107,10 +3169,34 @@ static void genAtomicCapture(lower::AbstractConverter &converter, converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt2LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt1LHSArg.getType()); + + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + + genAtomicCaptureStatement(converter, stmt1LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt2LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); diff --git a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 deleted file mode 100644 index 5b61f1169308f..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 +++ /dev/null @@ -1,48 +0,0 @@ -!RUN: %not_todo_cmd %flang_fc1 
-emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..4c1be1ca91ac0 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -4,6 +4,10 @@ ! CHECK: func.func @_QPatomic_implicit_cast_read() { subroutine atomic_implicit_cast_read +! CHECK: %[[ALLOCA7:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA6:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA5:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA4:.*]] = fir.alloca i32 ! CHECK: %[[ALLOCA3:.*]] = fir.alloca complex ! CHECK: %[[ALLOCA2:.*]] = fir.alloca complex ! CHECK: %[[ALLOCA1:.*]] = fir.alloca i32 @@ -53,4 +57,78 @@ subroutine atomic_implicit_cast_read ! CHECK: fir.store %[[CVT]] to %[[M_DECL]]#0 : !fir.ref> !$omp atomic read m = w + +! CHECK: %[[CONST:.*]] = arith.constant 1 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[ALLOCA4]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.update %[[X_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[ARG:.*]]: i32): +! 
CHECK: %[[RESULT:.*]] = arith.addi %[[ARG]], %[[CONST]] : i32 +! CHECK: omp.yield(%[[RESULT]] : i32) +! CHECK: } +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA4]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 +! CHECK: fir.store %[[CVT]] to %[[Y_DECL]]#0 : !fir.ref + !$omp atomic capture + y = x + x = x + 1 + !$omp end atomic + +! CHECK: %[[CONST:.*]] = arith.constant 10 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[ALLOCA5:.*]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[X_DECL]]#0 = %[[CONST]] : !fir.ref, i32 +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA5]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f64 +! CHECK: fir.store %[[CVT]] to %[[Z_DECL]]#0 : !fir.ref + !$omp atomic capture + z = x + x = 10 + !$omp end atomic + +! CHECK: %[[CONST:.*]] = arith.constant 1 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[X_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[ARG:.*]]: i32): +! CHECK: %[[RESULT:.*]] = arith.addi %[[ARG]], %[[CONST]] : i32 +! CHECK: omp.yield(%[[RESULT]] : i32) +! CHECK: } +! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref +! CHECK: %[[UNDEF:.*]] = fir.undefined complex +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 +! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex +! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex +! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> + !$omp atomic capture + x = x + 1 + w = x + !$omp end atomic + + +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): +! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 +! 
CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex +! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex +! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex +! CHECK: omp.yield(%[[RESULT]] : complex) +! CHECK: } +! CHECK: omp.atomic.read %[[ALLOCA7]] = %[[M_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA7]] : !fir.ref> +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (complex) -> complex +! CHECK: fir.store %[[CVT]] to %[[W_DECL]]#0 : !fir.ref> + !$omp atomic capture + m = m + 1 + w = m + !$omp end atomic + + end subroutine From flang-commits at lists.llvm.org Thu May 8 23:46:10 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Thu, 08 May 2025 23:46:10 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <681da4b2.170a0220.9a29b.bfb9@mx.google.com> https://github.com/kiranchandramohan edited https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Thu May 8 23:46:10 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Thu, 08 May 2025 23:46:10 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <681da4b2.170a0220.37cf9a.a59b@mx.google.com> https://github.com/kiranchandramohan approved this pull request. LG. 
https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Thu May 8 23:46:11 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Thu, 08 May 2025 23:46:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <681da4b3.170a0220.385598.a371@mx.google.com> ================ @@ -16,9 +16,7 @@ local: This document outlines the OpenMP API features supported by Flang. It is intended as a general reference. For the most accurate information on unimplemented features, rely on the compiler’s TODO or “Not Yet Implemented” -messages, which are considered authoritative. With the exception of a few corner cases, Flang -offers full support for OpenMP 3.1 ([See details here](#openmp-31-openmp-25-openmp-11)). -Partial support for OpenMP 4.0 is also available and currently under active development. +messages, which are considered authoritative. Partial support for OpenMP 4.0 is also available and currently under active development. ---------------- kiranchandramohan wrote: ```suggestion messages, which are considered authoritative. Partial support for OpenMP 4.0 is available and currently under active development. ``` https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Thu May 8 23:47:43 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 23:47:43 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <681da50f.170a0220.735ff.c040@mx.google.com> https://github.com/NimishMishra updated https://github.com/llvm/llvm-project/pull/138163


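As a rough illustration of the lowering that the atomic-implicit-cast.f90 checks in PR #138163 describe (an `omp.atomic.read` into a temporary of the source type inside `omp.atomic.capture`, then `fir.load`, `fir.convert`, and `fir.store` after the region), here is a minimal C++ analogy of the `y = x; x = x + 1` case with an integer `x` and a real `y`. This is not Flang code: `captureThenIncrement` and the use of `std::atomic` are assumptions made purely for illustration.

```cpp
#include <atomic>

// Hypothetical helper (not part of Flang): mimics
//   !$omp atomic capture
//   y = x        ! y is real, x is integer
//   x = x + 1
//   !$omp end atomic
float captureThenIncrement(std::atomic<int> &x) {
  // Atomic part: the temporary has the *source* type (int), so the
  // atomic read-and-update itself involves no type conversion.
  int tmp = x.fetch_add(1);
  // Non-atomic part, performed after the atomic operation: convert the
  // temporary to the destination type before it is stored to y.
  return static_cast<float>(tmp);
}
```

The point of the split is that only the read and the update need to be atomic; the conversion acts on a private temporary, so it can safely happen outside the atomic region.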
From flang-commits at lists.llvm.org Thu May 8 23:48:49 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 08 May 2025 23:48:49 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <681da551.170a0220.2ad15a.9b05@mx.google.com> ================ @@ -16,9 +16,7 @@ local: This document outlines the OpenMP API features supported by Flang. It is intended as a general reference. For the most accurate information on unimplemented features, rely on the compiler’s TODO or “Not Yet Implemented” -messages, which are considered authoritative. With the exception of a few corner cases, Flang -offers full support for OpenMP 3.1 ([See details here](#openmp-31-openmp-25-openmp-11)). -Partial support for OpenMP 4.0 is also available and currently under active development. +messages, which are considered authoritative. Partial support for OpenMP 4.0 is also available and currently under active development. ---------------- NimishMishra wrote: I realized I redacted the OpenMP 3.1 line too. I have added the following statement now: `Currently, Flang offers full support for OpenMP 3.1; partial support for OpenMP 4.0 is also available and currently under active development.` Is this okay? https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Fri May 9 00:21:27 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 00:21:27 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. 
(PR #138627) In-Reply-To: Message-ID: <681dacf7.170a0220.11c973.a807@mx.google.com> ================ @@ -209,8 +209,25 @@ void createDefaultFIROptimizerPassPipeline(mlir::PassManager &pm, if (pc.AliasAnalysis && !disableFirAliasTags && !useOldAliasTags) pm.addPass(fir::createAddAliasTags()); + if (enableAffineOpt && pc.OptLevel.isOptimizingForSpeed()) { ---------------- jeanPerier wrote: Can you add a comment that this is experimental? The PromoteToAffinePass was an internship prototype created 4-5 years ago, and it has not been actively tested on a big Fortran code base (unless you have). I also share the opinion that automatic promotion of array expressions to OpenMP parallel regions is tricky without a very good heuristic that guesses whether it is worth it (e.g., big enough to justify launching threads/kernel). Also, while Affine has very nice features when you can map to it, my experience from this early experiment is that the loops we can map to affine starting from Fortran loops are limited. Therefore, I am not convinced affine should be a required step/the solution towards automatic parallelization of Fortran code. Going towards SCF and core dialects would be more promising in my opinion. Getting more experience/metrics is still interesting though, thank you for the patch! 
https://github.com/llvm/llvm-project/pull/138627 From flang-commits at lists.llvm.org Fri May 9 00:53:03 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 00:53:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <681db45f.170a0220.141e58.c3ff@mx.google.com> https://github.com/jeanPerier edited https://github.com/llvm/llvm-project/pull/139183 From flang-commits at lists.llvm.org Fri May 9 00:53:03 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 00:53:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <681db45f.170a0220.100a61.b094@mx.google.com> https://github.com/jeanPerier approved this pull request. Thanks, LGTM. If you need special handling in the SELECT TYPE, do you also need something for SELECT RANK? https://github.com/llvm/llvm-project/pull/139183 From flang-commits at lists.llvm.org Fri May 9 00:53:04 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 00:53:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <681db460.170a0220.8cac6.d21b@mx.google.com> ================ @@ -1260,7 +1260,7 @@ func.func @dc_invalid_reduction(%arg0: index, %arg1: index) { // Should fail when volatility changes from a fir.convert func.func @bad_convert_volatile(%arg0: !fir.ref) -> !fir.ref { - // expected-error at +1 {{'fir.convert' op cannot convert between volatile and non-volatile types, use fir.volatile_cast instead}} + // expected-error at +1 {{op this conversion does not preserve volatilit}} ---------------- jeanPerier wrote: nit: `volatilit` -> `volatility` https://github.com/llvm/llvm-project/pull/139183 From flang-commits at lists.llvm.org Fri May 9 00:53:05 2025 From: flang-commits at 
lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 00:53:05 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <681db461.170a0220.35ba38.9ecd@mx.google.com> ================ @@ -207,20 +207,25 @@ static bool hasExplicitLowerBounds(mlir::Value shape) { mlir::isa(shape.getType()); } -static std::pair updateDeclareInputTypeWithVolatility( +static std::pair updateDeclaredInputTypeWithVolatility( mlir::Type inputType, mlir::Value memref, mlir::OpBuilder &builder, fir::FortranVariableFlagsAttr fortran_attrs) { if (fortran_attrs && bitEnumContainsAny(fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::fortran_volatile)) { + // A volatile pointer's pointee is volatile. const bool isPointer = bitEnumContainsAny( fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::pointer); + // An allocatable's inner type's volatility matches that of the reference. + const bool isAllocatable = bitEnumContainsAny( + fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::allocatable); auto updateType = [&](auto t) { using FIRT = decltype(t); - // A volatile pointer's pointee is volatile. auto elementType = t.getEleTy(); - const bool elementTypeIsVolatile = - isPointer || fir::isa_volatile_type(elementType); + const bool elementTypeIsBox = mlir::isa(elementType); ---------------- jeanPerier wrote: Shouldn't this be `mlir::isa` to cover the fir.class case? https://github.com/llvm/llvm-project/pull/139183 From flang-commits at lists.llvm.org Fri May 9 01:16:06 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 01:16:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fixed designator codegen for contiguous boxes. 
(PR #139003) In-Reply-To: Message-ID: <681db9c6.170a0220.381f6.a395@mx.google.com> ================ @@ -412,12 +412,44 @@ class DesignateOpConversion auto indices = designate.getIndices(); int i = 0; auto attrs = designate.getIsTripletAttr(); + + // If the shape specifies a shift and the base is not a box, + // then we have to subtract the lower bounds, as long as + // fir.array_coor does not support non-default lower bounds + // for non-box accesses. ---------------- jeanPerier wrote: I am not sure I get it. Is fir.array_coor not honoring fir.shape_shift? That seems like a big bug in its codegen. I think this may be related to my comment in the HLFIRTools.cpp helper though, genVariableFirBaseShapeAndParams should not return a fir.shift if the exv is not a box (it should be a fir.shape_shift). https://github.com/llvm/llvm-project/pull/139003 From flang-commits at lists.llvm.org Fri May 9 01:16:06 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 01:16:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fixed designator codegen for contiguous boxes. (PR #139003) In-Reply-To: Message-ID: <681db9c6.a70a0220.5ad31.f03e@mx.google.com> ================ @@ -753,9 +754,24 @@ std::pair hlfir::genVariableFirBaseShapeAndParams( } if (entity.isScalar()) return {fir::getBase(exv), mlir::Value{}}; + + // Contiguous variables that are represented with a box + // may require the shape to be extracted from the box (i.e. evx), + // because they itself may not have shape specified. + // This happens during late propagationg of contiguous + // attribute, e.g.: + // %9:2 = hlfir.declare %6 + // {fortran_attrs = #fir.var_attrs} : + // (!fir.box>) -> + // (!fir.box>, !fir.box>) + // The extended value is an ArrayBoxValue with base being + // the raw address of the array. 
if (auto variableInterface = entity.getIfVariableInterface()) - return {fir::getBase(exv), - asEmboxShape(loc, builder, exv, variableInterface.getShape())}; + if (mlir::isa(fir::getBase(exv).getType()) || + !mlir::isa(entity.getType()) || + variableInterface.getShape()) ---------------- jeanPerier wrote: If the CONTIGUOUS variable has custom lower bounds, I think there will be a `fir.shift` returned by `variableInterface.getShape()` instead of the needed `fir.shape_shift`. https://github.com/llvm/llvm-project/pull/139003 From flang-commits at lists.llvm.org Fri May 9 01:16:06 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 01:16:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fixed designator codegen for contiguous boxes. (PR #139003) In-Reply-To: Message-ID: <681db9c6.050a0220.1d9236.e1ec@mx.google.com> ================ @@ -213,3 +213,75 @@ func.func @test_polymorphic_array_elt(%arg0: !fir.class>, !fir.class>>) -> !fir.class> // CHECK: return // CHECK: } + +// Test proper generation of fir.array_coor for contiguous box with default lbounds. 
+func.func @_QPtest_contiguous_derived_default(%arg0: !fir.class>> {fir.bindc_name = "d1", fir.contiguous, fir.optional}) { + %c1 = arith.constant 1 : index + %c16_i32 = arith.constant 16 : i32 + %0 = fir.dummy_scope : !fir.dscope + %1:2 = hlfir.declare %arg0 dummy_scope %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.class>>, !fir.dscope) -> (!fir.class>>, !fir.class>>) + fir.select_type %1#1 : !fir.class>> [#fir.type_is,i:i32}>>, ^bb1, unit, ^bb2] +^bb1: // pred: ^bb0 + %2 = fir.convert %1#1 : (!fir.class>>) -> !fir.box,i:i32}>>> + %3:2 = hlfir.declare %2 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.box,i:i32}>>>) -> (!fir.box,i:i32}>>>, !fir.box,i:i32}>>>) + %4 = hlfir.designate %3#0 (%c1, %c1) : (!fir.box,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> + %5 = hlfir.designate %4{"i"} : (!fir.ref,i:i32}>>) -> !fir.ref + hlfir.assign %c16_i32 to %5 : i32, !fir.ref + cf.br ^bb3 +^bb2: // pred: ^bb0 + %6:2 = hlfir.declare %1#1 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.class>>) -> (!fir.class>>, !fir.class>>) + cf.br ^bb3 +^bb3: // 2 preds: ^bb1, ^bb2 + return +} +// CHECK-LABEL: func.func @_QPtest_contiguous_derived_default( +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_9:.*]] = fir.declare %{{.*}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.box,i:i32}>>>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_10:.*]] = fir.rebox %[[VAL_9]] : (!fir.box,i:i32}>>>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_11:.*]] = fir.box_addr %[[VAL_10]] : (!fir.box,i:i32}>>>) -> !fir.ref,i:i32}>>> +// CHECK: %[[VAL_12:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10]], %[[VAL_12]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_15:.*]]:3 = fir.box_dims %[[VAL_10]], 
%[[VAL_14]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_16:.*]] = fir.shape %[[VAL_13]]#1, %[[VAL_15]]#1 : (index, index) -> !fir.shape<2> +// CHECK: %[[VAL_17:.*]] = fir.array_coor %[[VAL_11]](%[[VAL_16]]) %[[VAL_0]], %[[VAL_0]] : (!fir.ref,i:i32}>>>, !fir.shape<2>, index, index) -> !fir.ref,i:i32}>> + +// Test proper generation of fir.array_coor for contiguous box with non-default lbounds. +func.func @_QPtest_contiguous_derived_lbounds(%arg0: !fir.class>> {fir.bindc_name = "d1", fir.contiguous}) { + %c3 = arith.constant 3 : index + %c1 = arith.constant 1 : index + %c16_i32 = arith.constant 16 : i32 + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.shift %c1, %c3 : (index, index) -> !fir.shift<2> + %2:2 = hlfir.declare %arg0(%1) dummy_scope %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.class>>, !fir.shift<2>, !fir.dscope) -> (!fir.class>>, !fir.class>>) + fir.select_type %2#1 : !fir.class>> [#fir.type_is,i:i32}>>, ^bb1, unit, ^bb2] +^bb1: // pred: ^bb0 + %3 = fir.convert %2#1 : (!fir.class>>) -> !fir.box,i:i32}>>> + %4:2 = hlfir.declare %3(%1) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.box,i:i32}>>>, !fir.shift<2>) -> (!fir.box,i:i32}>>>, !fir.box,i:i32}>>>) + %5 = hlfir.designate %4#0 (%c1, %c3) : (!fir.box,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> + %6 = hlfir.designate %5{"i"} : (!fir.ref,i:i32}>>) -> !fir.ref + hlfir.assign %c16_i32 to %6 : i32, !fir.ref + cf.br ^bb3 +^bb2: // pred: ^bb0 + %7:2 = hlfir.declare %2#1(%1) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.class>>, !fir.shift<2>) -> (!fir.class>>, !fir.class>>) + cf.br ^bb3 +^bb3: // 2 preds: ^bb1, ^bb2 + return +} +// CHECK-LABEL: func.func @_QPtest_contiguous_derived_lbounds( +// CHECK: %[[VAL_0:.*]] = arith.constant 3 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.declare 
%{{.*}}(%[[VAL_4:.*]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.box,i:i32}>>>, !fir.shift<2>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_9:.*]] = fir.rebox %[[VAL_8]](%[[VAL_4]]) : (!fir.box,i:i32}>>>, !fir.shift<2>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_10:.*]] = fir.box_addr %[[VAL_9]] : (!fir.box,i:i32}>>>) -> !fir.ref,i:i32}>>> +// CHECK: %[[VAL_11:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_12:.*]]:3 = fir.box_dims %[[VAL_9]], %[[VAL_11]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_13:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_14:.*]]:3 = fir.box_dims %[[VAL_9]], %[[VAL_13]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_15:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_16:.*]] = arith.subi %[[VAL_1]], %[[VAL_1]] : index +// CHECK: %[[VAL_17:.*]] = arith.addi %[[VAL_16]], %[[VAL_15]] : index +// CHECK: %[[VAL_18:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_19:.*]] = arith.subi %[[VAL_0]], %[[VAL_0]] : index +// CHECK: %[[VAL_20:.*]] = arith.addi %[[VAL_19]], %[[VAL_18]] : index +// CHECK: %[[VAL_21:.*]] = fir.array_coor %[[VAL_10]] %[[VAL_17]], %[[VAL_20]] : (!fir.ref,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> ---------------- jeanPerier wrote: How is this possible that this fir.array_coor has no shape argument? How can codegen compute the stride for the second dimension? https://github.com/llvm/llvm-project/pull/139003 From flang-commits at lists.llvm.org Fri May 9 01:18:13 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 01:18:13 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Treat hlfir.associate as Allocate for FIR alias analysis. (PR #139004) In-Reply-To: Message-ID: <681dba45.050a0220.1417e0.e70c@mx.google.com> https://github.com/jeanPerier approved this pull request. 
LGTM https://github.com/llvm/llvm-project/pull/139004 From flang-commits at lists.llvm.org Fri May 9 01:32:33 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 01:32:33 -0700 (PDT) Subject: [flang-commits] [flang] [flang][driver] do not crash when fc1 process multiple files (PR #138875) In-Reply-To: Message-ID: <681dbda1.a70a0220.3a79da.e1fd@mx.google.com> jeanPerier wrote: > I'm not opposed to flang -fc1 accepting it even though clang -cc1 doesn't, but I'm not actually clear what the expected behavior would be with flang -fc1 accepting multiple input files. What happens with this change? With this change, instead of crashing, `flang -fc1 -emit-obj bar.f90 foo.f90` now emits two object files, `bar.o` and `foo.o` (just like if they had been processed individually). https://github.com/llvm/llvm-project/pull/138875 From flang-commits at lists.llvm.org Fri May 9 01:36:41 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 01:36:41 -0700 (PDT) Subject: [flang-commits] [flang] 92d2e13 - [flang][driver] do not crash when fc1 process multiple files (#138875) Message-ID: <681dbe99.a70a0220.19c00a.ed77@mx.google.com> Author: jeanPerier Date: 2025-05-09T10:36:38+02:00 New Revision: 92d2e13b99ba1770e6307af7ed7ee877bfabde8c URL: https://github.com/llvm/llvm-project/commit/92d2e13b99ba1770e6307af7ed7ee877bfabde8c DIFF: https://github.com/llvm/llvm-project/commit/92d2e13b99ba1770e6307af7ed7ee877bfabde8c.diff LOG: [flang][driver] do not crash when fc1 process multiple files (#138875) This is a fix for the issue https://github.com/llvm/llvm-project/issues/137126 that turned out to be a driver issue. FrontendActions has a loop to process multiple input files and `flang -fc1` accept multiple files, but the semantic, lowering, and llvm codegen actions were not re-entrant, and crash or weird behaviors occurred when processing multiple files with `-fc1`. 
This patch makes the actions reentrant by cleaning-up the contexts/modules if needed on entry. Added: flang/test/Driver/multiple-fc1-input.f90 Modified: flang/include/flang/Frontend/CompilerInstance.h flang/lib/Frontend/CompilerInstance.cpp flang/lib/Frontend/FrontendAction.cpp flang/lib/Frontend/FrontendActions.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Frontend/CompilerInstance.h b/flang/include/flang/Frontend/CompilerInstance.h index e37ef5e236871..4ad95c9df42d7 100644 --- a/flang/include/flang/Frontend/CompilerInstance.h +++ b/flang/include/flang/Frontend/CompilerInstance.h @@ -147,6 +147,12 @@ class CompilerInstance { /// @name Semantic analysis /// { + Fortran::semantics::SemanticsContext &createNewSemanticsContext() { + semaContext = + getInvocation().getSemanticsCtx(*allCookedSources, getTargetMachine()); + return *semaContext; + } + Fortran::semantics::SemanticsContext &getSemanticsContext() { return *semaContext; } diff --git a/flang/lib/Frontend/CompilerInstance.cpp b/flang/lib/Frontend/CompilerInstance.cpp index f7ed969f03bf4..cbd2c58eeeb47 100644 --- a/flang/lib/Frontend/CompilerInstance.cpp +++ b/flang/lib/Frontend/CompilerInstance.cpp @@ -162,8 +162,6 @@ bool CompilerInstance::executeAction(FrontendAction &act) { allSources->set_encoding(invoc.getFortranOpts().encoding); if (!setUpTargetMachine()) return false; - // Create the semantics context - semaContext = invoc.getSemanticsCtx(*allCookedSources, getTargetMachine()); // Set options controlling lowering to FIR. 
invoc.setLoweringOptions(); diff --git a/flang/lib/Frontend/FrontendAction.cpp b/flang/lib/Frontend/FrontendAction.cpp index ab77d143fa4b6..d178fd6a9395d 100644 --- a/flang/lib/Frontend/FrontendAction.cpp +++ b/flang/lib/Frontend/FrontendAction.cpp @@ -183,7 +183,7 @@ bool FrontendAction::runSemanticChecks() { // Transfer any pending non-fatal messages from parsing to semantics // so that they are merged and all printed in order. - auto &semanticsCtx{ci.getSemanticsContext()}; + auto &semanticsCtx{ci.createNewSemanticsContext()}; semanticsCtx.messages().Annex(std::move(ci.getParsing().messages())); semanticsCtx.set_debugModuleWriter(ci.getInvocation().getDebugModuleDir()); diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c1f47b12abee2..e5a15c555fa5e 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -171,6 +171,10 @@ static void addDependentLibs(mlir::ModuleOp mlirModule, CompilerInstance &ci) { } bool CodeGenAction::beginSourceFileAction() { + // Delete previous LLVM module depending on old context before making a new + // one. + if (llvmModule) + llvmModule.reset(nullptr); llvmCtx = std::make_unique(); CompilerInstance &ci = this->getInstance(); mlir::DefaultTimingManager &timingMgr = ci.getTimingManager(); @@ -197,6 +201,9 @@ bool CodeGenAction::beginSourceFileAction() { return true; } + // Reset MLIR module if it was set before overriding the old context. + if (mlirModule) + mlirModule = mlir::OwningOpRef(nullptr); // Load the MLIR dialects required by Flang mlirCtx = std::make_unique(); fir::support::loadDialects(*mlirCtx); diff --git a/flang/test/Driver/multiple-fc1-input.f90 b/flang/test/Driver/multiple-fc1-input.f90 new file mode 100644 index 0000000000000..57f7c5e92b4c4 --- /dev/null +++ b/flang/test/Driver/multiple-fc1-input.f90 @@ -0,0 +1,9 @@ +! Test that flang -fc1 can be called with several input files without +! crashing. +! 
Regression tests for: https://github.com/llvm/llvm-project/issues/137126 + +! RUN: %flang_fc1 -emit-fir %s %s -o - | FileCheck %s +subroutine foo() +end subroutine +! CHECK: func @_QPfoo() +! CHECK: func @_QPfoo() From flang-commits at lists.llvm.org Fri May 9 01:36:45 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 01:36:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang][driver] do not crash when fc1 process multiple files (PR #138875) In-Reply-To: Message-ID: <681dbe9d.170a0220.19c0e7.ad9b@mx.google.com> https://github.com/jeanPerier closed https://github.com/llvm/llvm-project/pull/138875 From flang-commits at lists.llvm.org Fri May 9 02:51:53 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Fri, 09 May 2025 02:51:53 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add missing dependency to AddDebugInfo pass (PR #139099) In-Reply-To: Message-ID: <681dd039.a70a0220.3995f1.e2e5@mx.google.com> https://github.com/abidh approved this pull request. Apart from the point raised by @tblah, LGTM too. https://github.com/llvm/llvm-project/pull/139099 From flang-commits at lists.llvm.org Fri May 9 03:09:59 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 09 May 2025 03:09:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <681dd477.170a0220.11d141.ad6e@mx.google.com> ================ @@ -138,6 +138,18 @@ text. OpenMP-style directives that look like comments are not addressed by this scheme but are obvious extensions. +## Currently implemented built-ins + +* `__DATE__`: Date, given as e.g. "Jun 16 1904" +* `__TIME__`: Time in 24-hour format including seconds, e.g. "09:24:13" +* `__TIMESTAMP__`: Date, time and year of last modification, given as e.g. 
"Fri May 9 09:16:17 2025" +* `__FILE__`: Current file +* `__LINE__`: Current line + +### Non-standard extensions ---------------- eugeneepshteyn wrote: Well, we do have "Preprocessing behavior" section in https://github.com/llvm/llvm-project/blob/main/flang/docs/Extensions.md , but I'm not sure if it's worth separating this information. https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Fri May 9 03:13:37 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Fri, 09 May 2025 03:13:37 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add missing dependency to AddDebugInfo pass (PR #139099) In-Reply-To: Message-ID: <681dd551.170a0220.1519d2.cf76@mx.google.com> skatrak wrote: Thanks for the quick reviews! > A quick look seems to suggest that several cuda and loop versioning tests (two passes among others that use `fir::support::getOrSetMLIRDataLayout()`) also include the `dlti.dl_spec`. May be the same changes are applicable there as well. I did notice that the loop versioning pass might potentially have the same issue. I'll take a more detailed look at other passes using that utility and see if we should be adding the dialect dependency to them. I'll make another PR with these changes, if I find any. https://github.com/llvm/llvm-project/pull/139099 From flang-commits at lists.llvm.org Fri May 9 03:31:37 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Fri, 09 May 2025 03:31:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Basic lowering `fir.do_concurrent` locality specs to `fir.do_loop ... 
unordered` (PR #138512) In-Reply-To: Message-ID: <681dd989.170a0220.2e43a0.e7f6@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/138512 >From b7bce438182451342a5b2ecfe0b7522af307c2ae Mon Sep 17 00:00:00 2001 From: ergawy Date: Mon, 5 May 2025 06:50:49 -0500 Subject: [PATCH 1/2] [flang][fir] Basci lowering `fir.do_concurrent` locality specs to `fir.do_loop ... unordered` Extends lowering `fir.do_concurrent` to `fir.do_loop ... unordered` by adding support for locality specifiers. In particular, for `local` specifiers, a `fir.alloca` op is created using the localizer type. For `local_init` specifiers, the `copy` region is additionally inlined in the `do concurrent` loop's body. --- .../Transforms/SimplifyFIROperations.cpp | 58 +++++++++++++++++- .../do_concurrent-to-do_loop-unodered.fir | 61 +++++++++++++++++++ 2 files changed, 118 insertions(+), 1 deletion(-) diff --git a/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp b/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp index 6d106046b70f2..e2dc4e14ff650 100644 --- a/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp +++ b/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp @@ -149,6 +149,17 @@ mlir::LogicalResult BoxTotalElementsConversion::matchAndRewrite( class DoConcurrentConversion : public mlir::OpRewritePattern { + /// Looks up from the operation from and returns the LocalitySpecifierOp with + /// name symbolName + static fir::LocalitySpecifierOp + findLocalizer(mlir::Operation *from, mlir::SymbolRefAttr symbolName) { + fir::LocalitySpecifierOp localizer = + mlir::SymbolTable::lookupNearestSymbolFrom( + from, symbolName); + assert(localizer && "localizer not found in the symbol table"); + return localizer; + } + public: using mlir::OpRewritePattern::OpRewritePattern; @@ -162,7 +173,52 @@ class DoConcurrentConversion assert(loop.getRegion().hasOneBlock()); mlir::Block &loopBlock = loop.getRegion().getBlocks().front(); - // Collect iteration 
variable(s) allocations do that we can move them + // Handle localization + if (!loop.getLocalVars().empty()) { + mlir::OpBuilder::InsertionGuard guard(rewriter); + rewriter.setInsertionPointToStart(&loop.getRegion().front()); + + std::optional localSyms = loop.getLocalSyms(); + + for (auto [localVar, localArg, localizerSym] : llvm::zip_equal( + loop.getLocalVars(), loop.getRegionLocalArgs(), *localSyms)) { + mlir::SymbolRefAttr localizerName = + llvm::cast(localizerSym); + fir::LocalitySpecifierOp localizer = findLocalizer(loop, localizerName); + + mlir::Value localAlloc = + rewriter.create(loop.getLoc(), localizer.getType()); + + if (localizer.getLocalitySpecifierType() == + fir::LocalitySpecifierType::LocalInit) { + // It is reasonable to make this assumption since, at this stage, + // control-flow ops are not converted yet. Therefore, things like `if` + // conditions will still be represented by their encapsulating `fir` + // dialect ops. + assert(localizer.getCopyRegion().hasOneBlock() && + "Expected localizer to have a single block."); + mlir::Block *beforeLocalInit = rewriter.getInsertionBlock(); + mlir::Block *afterLocalInit = rewriter.splitBlock( + rewriter.getInsertionBlock(), rewriter.getInsertionPoint()); + rewriter.cloneRegionBefore(localizer.getCopyRegion(), afterLocalInit); + mlir::Block *copyRegionBody = beforeLocalInit->getNextNode(); + + rewriter.eraseOp(copyRegionBody->getTerminator()); + rewriter.mergeBlocks(afterLocalInit, copyRegionBody); + rewriter.mergeBlocks(copyRegionBody, beforeLocalInit, + {localVar, localArg}); + } + + rewriter.replaceAllUsesWith(localArg, localAlloc); + } + + loop.getRegion().front().eraseArguments(loop.getNumInductionVars(), + loop.getNumLocalOperands()); + loop.getLocalVarsMutable().clear(); + loop.setLocalSymsAttr(nullptr); + } + + // Collect iteration variable(s) allocations so that we can move them // outside the `fir.do_concurrent` wrapper. 
llvm::SmallVector opsToMove; for (mlir::Operation &op : llvm::drop_end(wrapperBlock)) diff --git a/flang/test/Transforms/do_concurrent-to-do_loop-unodered.fir b/flang/test/Transforms/do_concurrent-to-do_loop-unodered.fir index d2ceafdda5b22..d9ef36b175598 100644 --- a/flang/test/Transforms/do_concurrent-to-do_loop-unodered.fir +++ b/flang/test/Transforms/do_concurrent-to-do_loop-unodered.fir @@ -121,3 +121,64 @@ func.func @dc_2d_reduction(%i_lb: index, %i_ub: index, %i_st: index, // CHECK: } // CHECK: return // CHECK: } + +// ----- + +fir.local {type = local} @local_localizer : i32 + +fir.local {type = local_init} @local_init_localizer : i32 copy { +^bb0(%arg0: !fir.ref, %arg1: !fir.ref): + %0 = fir.load %arg0 : !fir.ref + fir.store %0 to %arg1 : !fir.ref + fir.yield(%arg1 : !fir.ref) +} + +func.func @do_concurrent_locality_specs() { + %3 = fir.alloca i32 {bindc_name = "local_init_var", uniq_name = "_QFdo_concurrentElocal_init_var"} + %4:2 = hlfir.declare %3 {uniq_name = "_QFdo_concurrentElocal_init_var"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %5 = fir.alloca i32 {bindc_name = "local_var", uniq_name = "_QFdo_concurrentElocal_var"} + %6:2 = hlfir.declare %5 {uniq_name = "_QFdo_concurrentElocal_var"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %c1 = arith.constant 1 : index + %c10 = arith.constant 1 : index + fir.do_concurrent { + %9 = fir.alloca i32 {bindc_name = "i"} + %10:2 = hlfir.declare %9 {uniq_name = "_QFdo_concurrentEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) + fir.do_concurrent.loop (%arg0) = (%c1) to (%c10) step (%c1) local(@local_localizer %6#0 -> %arg1, @local_init_localizer %4#0 -> %arg2 : !fir.ref, !fir.ref) { + %11 = fir.convert %arg0 : (index) -> i32 + fir.store %11 to %10#0 : !fir.ref + %13:2 = hlfir.declare %arg1 {uniq_name = "_QFdo_concurrentElocal_var"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %15:2 = hlfir.declare %arg2 {uniq_name = "_QFdo_concurrentElocal_init_var"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %17 = fir.load %10#0 : !fir.ref + %c5_i32 = 
arith.constant 5 : i32 + %18 = arith.cmpi slt, %17, %c5_i32 : i32 + fir.if %18 { + %c42_i32 = arith.constant 42 : i32 + hlfir.assign %c42_i32 to %13#0 : i32, !fir.ref + } else { + %c84_i32 = arith.constant 84 : i32 + hlfir.assign %c84_i32 to %15#0 : i32, !fir.ref + } + } + } + return +} + +// CHECK-LABEL: func.func @do_concurrent_locality_specs() { +// CHECK: %[[LOC_INIT_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "{{.*}}Elocal_init_var"} +// CHECK: fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered { +// Verify localization of the `local` var. +// CHECK: %[[PRIV_LOC_ALLOC:.*]] = fir.alloca i32 + +// Verify localization of the `local_init` var. +// CHECK: %[[PRIV_LOC_INIT_ALLOC:.*]] = fir.alloca i32 +// CHECK: %[[LOC_INIT_VAL:.*]] = fir.load %[[LOC_INIT_DECL]]#0 : !fir.ref +// CHECK: fir.store %[[LOC_INIT_VAL]] to %[[PRIV_LOC_INIT_ALLOC]] : !fir.ref + +// CHECK: %[[VAL_15:.*]]:2 = hlfir.declare %[[PRIV_LOC_ALLOC]] +// CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[PRIV_LOC_INIT_ALLOC]] + +// CHECK: hlfir.assign %{{.*}} to %[[VAL_15]]#0 : i32, !fir.ref +// CHECK: hlfir.assign %{{.*}} to %[[VAL_16]]#0 : i32, !fir.ref +// CHECK: } +// CHECK: return +// CHECK: } >From a505cf52410b116879971b9706700a1fc34c5e24 Mon Sep 17 00:00:00 2001 From: ergawy Date: Wed, 7 May 2025 02:18:05 -0500 Subject: [PATCH 2/2] add todos --- flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp b/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp index e2dc4e14ff650..43ffed914ffe9 100644 --- a/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp +++ b/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp @@ -186,6 +186,8 @@ class DoConcurrentConversion llvm::cast(localizerSym); fir::LocalitySpecifierOp localizer = findLocalizer(loop, localizerName); + // TODO Should this be a heap allocation instead? 
For now, we allocate + // on the stack for each loop iteration. mlir::Value localAlloc = rewriter.create(loop.getLoc(), localizer.getType()); @@ -210,6 +212,9 @@ class DoConcurrentConversion } rewriter.replaceAllUsesWith(localArg, localAlloc); + + // TODO localizers with `init` and `dealloc` regions are not handled + // yet. } loop.getRegion().front().eraseArguments(loop.getNumInductionVars(), From flang-commits at lists.llvm.org Fri May 9 03:34:08 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 09 May 2025 03:34:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Postpone hlfir.end_associate generation for calls. (PR #138786) In-Reply-To: Message-ID: <681dda20.620a0220.68429.f942@mx.google.com> https://github.com/tblah approved this pull request. OpenMP changes look good. Well caught. OpenMP and OpenACC shared implementation of atomic constructs until very recently. It is quite likely that OpenACC has the same bug. https://github.com/llvm/llvm-project/pull/138786 From flang-commits at lists.llvm.org Fri May 9 03:36:17 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Fri, 09 May 2025 03:36:17 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Basic lowering `fir.do_concurrent` locality specs to `fir.do_loop ... unordered` (PR #138512) In-Reply-To: Message-ID: <681ddaa1.170a0220.2e4c51.b452@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/138512 Rate limit · GitHub


From flang-commits at lists.llvm.org Fri May 9 03:36:28 2025
From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits)
Date: Fri, 09 May 2025 03:36:28 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][fir] Basic lowering `fir.do_concurrent` locality specs to `fir.do_loop ... unordered` (PR #138512)
In-Reply-To:
Message-ID: <681ddaac.170a0220.228864.cb23@mx.google.com>

================
@@ -162,7 +173,52 @@ class DoConcurrentConversion
     assert(loop.getRegion().hasOneBlock());
     mlir::Block &loopBlock = loop.getRegion().getBlocks().front();
-    // Collect iteration variable(s) allocations do that we can move them
+    // Handle localization
+    if (!loop.getLocalVars().empty()) {
+      mlir::OpBuilder::InsertionGuard guard(rewriter);
+      rewriter.setInsertionPointToStart(&loop.getRegion().front());
+
+      std::optional localSyms = loop.getLocalSyms();
+
+      for (auto [localVar, localArg, localizerSym] : llvm::zip_equal(
+               loop.getLocalVars(), loop.getRegionLocalArgs(), *localSyms)) {
+        mlir::SymbolRefAttr localizerName =
+            llvm::cast(localizerSym);
+        fir::LocalitySpecifierOp localizer = findLocalizer(loop, localizerName);
+
+        mlir::Value localAlloc =
+            rewriter.create(loop.getLoc(), localizer.getType());
----------------
ergawy wrote:

All good. Done.

https://github.com/llvm/llvm-project/pull/138512

From flang-commits at lists.llvm.org Fri May 9 03:37:54 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Fri, 09 May 2025 03:37:54 -0700 (PDT)
Subject: [flang-commits] [flang] [Flang] Add missing dependency to AddDebugInfo pass (PR #139099)
In-Reply-To:
Message-ID: <681ddb02.170a0220.27d521.c907@mx.google.com>

https://github.com/tblah approved this pull request.

https://github.com/llvm/llvm-project/pull/139099

From flang-commits at lists.llvm.org Fri May 9 03:56:39 2025
From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits)
Date: Fri, 09 May 2025 03:56:39 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163)
In-Reply-To:
Message-ID: <681ddf67.170a0220.e868f.c8a3@mx.google.com>

================
@@ -16,9 +16,7 @@ local:
 This document outlines the OpenMP API features supported by Flang. It is intended as a general reference.
 For the most accurate information on unimplemented features, rely on the compiler’s TODO or “Not Yet Implemented”
-messages, which are considered authoritative. With the exception of a few corner cases, Flang
-offers full support for OpenMP 3.1 ([See details here](#openmp-31-openmp-25-openmp-11)).
-Partial support for OpenMP 4.0 is also available and currently under active development.
+messages, which are considered authoritative. Partial support for OpenMP 4.0 is also available and currently under active development.
----------------
kiranchandramohan wrote:

Looks alright. A more formal way would be the following:

```
Flang provides complete implementation of the OpenMP 3.1 specification and partial implementation of OpenMP 4.0, with continued development efforts aimed at extending full support for the latter.
``` https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Fri May 9 03:59:04 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 03:59:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <681ddff8.170a0220.29a93d.bc67@mx.google.com> https://github.com/NimishMishra updated https://github.com/llvm/llvm-project/pull/138163 >From e912e7c9e434dc40fbd986f98725bda849a56553 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Thu, 1 May 2025 21:51:19 +0530 Subject: [PATCH 1/4] [flang][OpenMP] Add implicit casts for omp.atomic.capture --- flang/docs/OpenMPSupport.md | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 236 ++++++++++++------ .../Todo/atomic-capture-implicit-cast.f90 | 48 ---- .../Lower/OpenMP/atomic-implicit-cast.f90 | 78 ++++++ 4 files changed, 240 insertions(+), 124 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md index 2d4b9dd737777..46be14f4c168c 100644 --- a/flang/docs/OpenMPSupport.md +++ b/flang/docs/OpenMPSupport.md @@ -64,4 +64,4 @@ Note : No distinction is made between the support in Parser/Semantics, MLIR, Low | target teams distribute parallel loop simd construct | P | device, reduction, dist_schedule and linear clauses are not supported | ## OpenMP 3.1, OpenMP 2.5, OpenMP 1.1 -All features except a few corner cases in atomic (complex type, different but compatible types in lhs and rhs), threadprivate (character type) constructs/clauses are supported. +All features except a few corner cases in threadprivate (character type) constructs/clauses are supported. 
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 47e7c266ff7d3..526148855b113 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2865,6 +2865,85 @@ static void genAtomicWrite(lower::AbstractConverter &converter, rightHandClauseList, loc); } +/* + Emit an implicit cast. Different yet compatible types on + omp.atomic.read constitute valid Fortran. The OMPIRBuilder will + emit atomic instructions (on primitive types) and `__atomic_load` + libcall (on complex type) without explicitly converting + between such compatible types. The OMPIRBuilder relies on the + frontend to resolve such inconsistencies between `omp.atomic.read ` + operand types. Similar inconsistencies between operand types in + `omp.atomic.write` are resolved through implicit casting by use of typed + assignment (i.e. `evaluate::Assignment`). However, use of typed + assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, + non-atomic load of `x` into a temporary `alloca`, followed by an atomic + read of form `v = alloca`. Hence, it is needed to perform a custom + implicit cast. + + An atomic read of form `v = x` would (without implicit casting) + lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, + type2`. This implicit casting will rather generate the following FIR: + + %alloca = fir.alloca type2 + omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 + %load = fir.load %alloca : !fir.ref + %cvt = fir.convert %load : (type2) -> type1 + fir.store %cvt to %v : !fir.ref + + These sequence of operations is thread-safe since each thread allocates + the `alloca` in its stack, and performs `%alloca = %x` atomically. Once + safely read, each thread performs the implicit cast on the local + `alloca`, and writes the final result to `%v`. 
+ +/// \param builder : FirOpBuilder +/// \param loc : Location for FIR generation +/// \param toAddress : Address of %v +/// \param toType : Type of %v +/// \param fromType : Type of %x +/// \param alloca : Thread scoped `alloca` +// It is the responsibility of the callee +// to position the `alloca` at `AllocaIP` +// through `builder.getAllocaBlock()` +*/ + +static void emitAtomicReadImplicitCast(fir::FirOpBuilder &builder, + mlir::Location loc, + mlir::Value toAddress, mlir::Type toType, + mlir::Type fromType, + mlir::Value alloca) { + auto load = builder.create(loc, alloca); + if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { + // Emit an additional `ExtractValueOp` if `fromAddress` is of complex + // type, but `toAddress` is not. + auto extract = builder.create( + loc, mlir::cast(fromType).getElementType(), load, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + auto cvt = builder.create(loc, toType, extract); + builder.create(loc, cvt, toAddress); + } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { + // Emit an additional `InsertValueOp` if `toAddress` is of complex + // type, but `fromAddress` is not. + mlir::Value undef = builder.create(loc, toType); + mlir::Type complexEleTy = + mlir::cast(toType).getElementType(); + mlir::Value cvt = builder.create(loc, complexEleTy, load); + mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); + mlir::Value idx0 = builder.create( + loc, toType, undef, cvt, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + mlir::Value idx1 = builder.create( + loc, toType, idx0, zero, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 1))); + builder.create(loc, idx1, toAddress); + } else { + auto cvt = builder.create(loc, toType, load); + builder.create(loc, cvt, toAddress); + } +} + /// Processes an atomic construct with read clause. 
static void genAtomicRead(lower::AbstractConverter &converter, const parser::OmpAtomicRead &atomicRead, @@ -2891,34 +2970,7 @@ static void genAtomicRead(lower::AbstractConverter &converter, *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); if (fromAddress.getType() != toAddress.getType()) { - // Emit an implicit cast. Different yet compatible types on - // omp.atomic.read constitute valid Fortran. The OMPIRBuilder will - // emit atomic instructions (on primitive types) and `__atomic_load` - // libcall (on complex type) without explicitly converting - // between such compatible types. The OMPIRBuilder relies on the - // frontend to resolve such inconsistencies between `omp.atomic.read ` - // operand types. Similar inconsistencies between operand types in - // `omp.atomic.write` are resolved through implicit casting by use of typed - // assignment (i.e. `evaluate::Assignment`). However, use of typed - // assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, - // non-atomic load of `x` into a temporary `alloca`, followed by an atomic - // read of form `v = alloca`. Hence, it is needed to perform a custom - // implicit cast. - - // An atomic read of form `v = x` would (without implicit casting) - // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, - // type2`. This implicit casting will rather generate the following FIR: - // - // %alloca = fir.alloca type2 - // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 - // %load = fir.load %alloca : !fir.ref - // %cvt = fir.convert %load : (type2) -> type1 - // fir.store %cvt to %v : !fir.ref - - // These sequence of operations is thread-safe since each thread allocates - // the `alloca` in its stack, and performs `%alloca = %x` atomically. Once - // safely read, each thread performs the implicit cast on the local - // `alloca`, and writes the final result to `%v`. 
+ mlir::Type toType = fir::unwrapRefType(toAddress.getType()); mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); @@ -2930,37 +2982,8 @@ static void genAtomicRead(lower::AbstractConverter &converter, genAtomicCaptureStatement(converter, fromAddress, alloca, leftHandClauseList, rightHandClauseList, elementType, loc); - auto load = builder.create(loc, alloca); - if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { - // Emit an additional `ExtractValueOp` if `fromAddress` is of complex - // type, but `toAddress` is not. - auto extract = builder.create( - loc, mlir::cast(fromType).getElementType(), load, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 0))); - auto cvt = builder.create(loc, toType, extract); - builder.create(loc, cvt, toAddress); - } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { - // Emit an additional `InsertValueOp` if `toAddress` is of complex - // type, but `fromAddress` is not. 
- mlir::Value undef = builder.create(loc, toType); - mlir::Type complexEleTy = - mlir::cast(toType).getElementType(); - mlir::Value cvt = builder.create(loc, complexEleTy, load); - mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); - mlir::Value idx0 = builder.create( - loc, toType, undef, cvt, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 0))); - mlir::Value idx1 = builder.create( - loc, toType, idx0, zero, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 1))); - builder.create(loc, idx1, toAddress); - } else { - auto cvt = builder.create(loc, toType, load); - builder.create(loc, cvt, toAddress); - } + emitAtomicReadImplicitCast(builder, loc, toAddress, toType, fromType, + alloca); } else genAtomicCaptureStatement(converter, fromAddress, toAddress, leftHandClauseList, rightHandClauseList, @@ -3049,10 +3072,6 @@ static void genAtomicCapture(lower::AbstractConverter &converter, mlir::Type stmt2VarType = fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - // Check if implicit type is needed - if (stmt1VarType != stmt2VarType) - TODO(loc, "atomic capture requiring implicit type casts"); - mlir::Operation *atomicCaptureOp = nullptr; mlir::IntegerAttr hint = nullptr; mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; @@ -3075,10 +3094,31 @@ static void genAtomicCapture(lower::AbstractConverter &converter, // Atomic capture construct is of the form [capture-stmt, update-stmt] const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt1LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt2LHSArg.getType()); + { + 
mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + genAtomicCaptureStatement(converter, stmt2LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt1LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } genAtomicUpdateStatement( converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, /*leftHandClauseList=*/nullptr, @@ -3091,10 +3131,32 @@ static void genAtomicCapture(lower::AbstractConverter &converter, firOpBuilder.setInsertionPointToStart(&block); const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt1LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt2LHSArg.getType()); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + genAtomicCaptureStatement(converter, stmt2LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt1LHSArg, toType, + fromType, alloca); + } + } else { + 
genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, loc); @@ -3107,10 +3169,34 @@ static void genAtomicCapture(lower::AbstractConverter &converter, converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt2LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt1LHSArg.getType()); + + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + + genAtomicCaptureStatement(converter, stmt1LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt2LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); diff --git a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 deleted file mode 100644 index 5b61f1169308f..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 +++ /dev/null @@ -1,48 +0,0 @@ -!RUN: %not_todo_cmd %flang_fc1 
-emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..4c1be1ca91ac0 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -4,6 +4,10 @@ ! CHECK: func.func @_QPatomic_implicit_cast_read() { subroutine atomic_implicit_cast_read +! CHECK: %[[ALLOCA7:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA6:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA5:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA4:.*]] = fir.alloca i32 ! CHECK: %[[ALLOCA3:.*]] = fir.alloca complex ! CHECK: %[[ALLOCA2:.*]] = fir.alloca complex ! CHECK: %[[ALLOCA1:.*]] = fir.alloca i32 @@ -53,4 +57,78 @@ subroutine atomic_implicit_cast_read ! CHECK: fir.store %[[CVT]] to %[[M_DECL]]#0 : !fir.ref> !$omp atomic read m = w + +! CHECK: %[[CONST:.*]] = arith.constant 1 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[ALLOCA4]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.update %[[X_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[ARG:.*]]: i32): +! 
CHECK: %[[RESULT:.*]] = arith.addi %[[ARG]], %[[CONST]] : i32 +! CHECK: omp.yield(%[[RESULT]] : i32) +! CHECK: } +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA4]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 +! CHECK: fir.store %[[CVT]] to %[[Y_DECL]]#0 : !fir.ref + !$omp atomic capture + y = x + x = x + 1 + !$omp end atomic + +! CHECK: %[[CONST:.*]] = arith.constant 10 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[ALLOCA5:.*]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[X_DECL]]#0 = %[[CONST]] : !fir.ref, i32 +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA5]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f64 +! CHECK: fir.store %[[CVT]] to %[[Z_DECL]]#0 : !fir.ref + !$omp atomic capture + z = x + x = 10 + !$omp end atomic + +! CHECK: %[[CONST:.*]] = arith.constant 1 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[X_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[ARG:.*]]: i32): +! CHECK: %[[RESULT:.*]] = arith.addi %[[ARG]], %[[CONST]] : i32 +! CHECK: omp.yield(%[[RESULT]] : i32) +! CHECK: } +! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref +! CHECK: %[[UNDEF:.*]] = fir.undefined complex +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 +! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex +! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex +! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> + !$omp atomic capture + x = x + 1 + w = x + !$omp end atomic + + +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): +! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 +! 
CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex +! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex +! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex +! CHECK: omp.yield(%[[RESULT]] : complex) +! CHECK: } +! CHECK: omp.atomic.read %[[ALLOCA7]] = %[[M_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA7]] : !fir.ref> +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (complex) -> complex +! CHECK: fir.store %[[CVT]] to %[[W_DECL]]#0 : !fir.ref> + !$omp atomic capture + m = m + 1 + w = m + !$omp end atomic + + end subroutine >From eb3c167320583e1aa0ad47a85e382ec6f4f3798c Mon Sep 17 00:00:00 2001 From: NimishMishra <42909663+NimishMishra at users.noreply.github.com> Date: Thu, 8 May 2025 23:38:42 -0700 Subject: [PATCH 2/4] Remove broken references in documentation --- flang/docs/OpenMPSupport.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md index 5fd609e9c998d..e708cb9bd8b67 100644 --- a/flang/docs/OpenMPSupport.md +++ b/flang/docs/OpenMPSupport.md @@ -16,9 +16,7 @@ local: This document outlines the OpenMP API features supported by Flang. It is intended as a general reference. For the most accurate information on unimplemented features, rely on the compiler’s TODO or “Not Yet Implemented” -messages, which are considered authoritative. With the exception of a few corner cases, Flang -offers full support for OpenMP 3.1 ([See details here](#openmp-31-openmp-25-openmp-11)). -Partial support for OpenMP 4.0 is also available and currently under active development. +messages, which are considered authoritative. Partial support for OpenMP 4.0 is also available and currently under active development. 
The table below outlines the current status of OpenMP 4.0 feature support. Work is ongoing to add support for OpenMP 4.5 and newer versions; a support statement for these will be shared in the future. The table entries are derived from the information provided in the Version Differences subsection of the Features History section in the OpenMP standard. >From de8e7004b711c1f81b9a35db5c9b806d878d08f5 Mon Sep 17 00:00:00 2001 From: NimishMishra <42909663+NimishMishra at users.noreply.github.com> Date: Thu, 8 May 2025 23:47:34 -0700 Subject: [PATCH 3/4] Clarify support for OpenMP 3.1 --- flang/docs/OpenMPSupport.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md index e708cb9bd8b67..15298fbe0489f 100644 --- a/flang/docs/OpenMPSupport.md +++ b/flang/docs/OpenMPSupport.md @@ -16,7 +16,8 @@ local: This document outlines the OpenMP API features supported by Flang. It is intended as a general reference. For the most accurate information on unimplemented features, rely on the compiler’s TODO or “Not Yet Implemented” -messages, which are considered authoritative. Partial support for OpenMP 4.0 is also available and currently under active development. +messages, which are considered authoritative. Currently, Flang offers full support for OpenMP 3.1; partial support for OpenMP 4.0 is also available and currently under active development. + The table below outlines the current status of OpenMP 4.0 feature support. Work is ongoing to add support for OpenMP 4.5 and newer versions; a support statement for these will be shared in the future. The table entries are derived from the information provided in the Version Differences subsection of the Features History section in the OpenMP standard. 
>From c59e959c8da84dbfa27b9829c065c265294302ef Mon Sep 17 00:00:00 2001 From: NimishMishra <42909663+NimishMishra at users.noreply.github.com> Date: Fri, 9 May 2025 03:58:54 -0700 Subject: [PATCH 4/4] Clarify support for OpenMP 3.1 and 4.0 --- flang/docs/OpenMPSupport.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md index 15298fbe0489f..7a4f95693a89c 100644 --- a/flang/docs/OpenMPSupport.md +++ b/flang/docs/OpenMPSupport.md @@ -16,8 +16,7 @@ local: This document outlines the OpenMP API features supported by Flang. It is intended as a general reference. For the most accurate information on unimplemented features, rely on the compiler’s TODO or “Not Yet Implemented” -messages, which are considered authoritative. Currently, Flang offers full support for OpenMP 3.1; partial support for OpenMP 4.0 is also available and currently under active development. - +messages, which are considered authoritative. Flang provides complete implementation of the OpenMP 3.1 specification and partial implementation of OpenMP 4.0, with continued development efforts aimed at extending full support for the latter. The table below outlines the current status of OpenMP 4.0 feature support. Work is ongoing to add support for OpenMP 4.5 and newer versions; a support statement for these will be shared in the future. The table entries are derived from the information provided in the Version Differences subsection of the Features History section in the OpenMP standard. 
From flang-commits at lists.llvm.org Fri May 9 04:19:57 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Fri, 09 May 2025 04:19:57 -0700 (PDT)
Subject: [flang-commits] [flang] 4923cff - [Flang] Add missing dependency to AddDebugInfo pass (#139099)
Message-ID: <681de4dd.170a0220.19c0e7.b7db@mx.google.com>

Author: Sergio Afonso
Date: 2025-05-09T12:19:53+01:00
New Revision: 4923cffc1d3333cf679dd304dd6a8e7232d77d54

URL: https://github.com/llvm/llvm-project/commit/4923cffc1d3333cf679dd304dd6a8e7232d77d54
DIFF: https://github.com/llvm/llvm-project/commit/4923cffc1d3333cf679dd304dd6a8e7232d77d54.diff

LOG: [Flang] Add missing dependency to AddDebugInfo pass (#139099)

The `AddDebugInfo` pass currently has a dependency on the `DLTI` MLIR dialect caused by a call to the `fir::support::getOrSetMLIRDataLayout()` utility function. This dependency is not captured in the pass definition.

This patch adds the dependency and simplifies several unit tests that had to explicitly use the `DLTI` dialect to prevent the missing dependency from causing compiler failures.
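For readers unfamiliar with why an undeclared dialect dependency matters, the following is a toy model of the situation described in the log message, not the MLIR API. `ToyContext`, `ToyPass`, `runWithManager`, `beforeFix` and `afterFix` are all hypothetical names invented for this sketch. It assumes only that a pass manager loads the dialects a pass declares before running it, and that the pass body fails if it needs a dialect that was never loaded.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a compiler context: it only tracks loaded dialect names.
// Hypothetical type, not part of MLIR.
struct ToyContext {
  std::set<std::string> loadedDialects;
  bool isLoaded(const std::string &name) const {
    return loadedDialects.count(name) != 0;
  }
};

// Toy stand-in for a pass: it declares its dependent dialects, and its body
// needs the "dlti" dialect (mirroring the role of a data-layout helper).
struct ToyPass {
  std::vector<std::string> dependentDialects;

  // Returns false when the pass needs a dialect that was never loaded.
  bool run(const ToyContext &ctx) const { return ctx.isLoaded("dlti"); }
};

// Toy pass manager: loads every declared dependency, then runs the pass.
bool runWithManager(const ToyPass &pass, ToyContext ctx) {
  for (const std::string &dialect : pass.dependentDialects)
    ctx.loadedDialects.insert(dialect);
  return pass.run(ctx);
}

// Before the fix: "dlti" is used but not declared, so an input that does not
// already mention it leaves the dialect unloaded.
bool beforeFix() {
  ToyPass pass{{"fir", "func", "llvm"}};
  return runWithManager(pass, ToyContext{});
}

// After the fix: the dependency is declared, so the manager loads it.
bool afterFix() {
  ToyPass pass{{"fir", "func", "llvm", "dlti"}};
  return runWithManager(pass, ToyContext{});
}

// Small self-check of the two scenarios.
void checkSketch() {
  assert(!beforeFix());
  assert(afterFix());
}
```

In this model, the old tests' `module attributes {dlti.dl_spec = ...}` wrapper plays the role of pre-populating `loadedDialects` with "dlti" from the input, which is why the wrapper can be dropped once the pass declares the dependency itself.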
Added: Modified: flang/include/flang/Optimizer/Transforms/Passes.td flang/lib/Optimizer/Transforms/AddDebugInfo.cpp flang/test/Transforms/debug-107988.fir flang/test/Transforms/debug-96314.fir flang/test/Transforms/debug-allocatable-1.fir flang/test/Transforms/debug-assumed-rank-array.fir flang/test/Transforms/debug-assumed-shape-array-2.fir flang/test/Transforms/debug-assumed-size-array.fir flang/test/Transforms/debug-char-type-1.fir flang/test/Transforms/debug-class-type.fir flang/test/Transforms/debug-common-block.fir flang/test/Transforms/debug-complex-1.fir flang/test/Transforms/debug-derived-type-2.fir flang/test/Transforms/debug-extra-global.fir flang/test/Transforms/debug-fixed-array-type.fir flang/test/Transforms/debug-fn-info.fir flang/test/Transforms/debug-imported-entity.fir flang/test/Transforms/debug-index-type.fir flang/test/Transforms/debug-line-table-existing.fir flang/test/Transforms/debug-line-table-inc-file.fir flang/test/Transforms/debug-line-table-inc-same-file.fir flang/test/Transforms/debug-line-table.fir flang/test/Transforms/debug-local-var.fir flang/test/Transforms/debug-module-1.fir flang/test/Transforms/debug-ptr-type.fir flang/test/Transforms/debug-ref-type.fir flang/test/Transforms/debug-tuple-type.fir flang/test/Transforms/debug-variable-array-dim.fir flang/test/Transforms/debug-variable-char-len.fir flang/test/Transforms/debug-vector-type.fir Removed: ################################################################################ diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index 9b6919eec3f73..3243b44df9c7a 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -210,7 +210,8 @@ def AddDebugInfo : Pass<"add-debug-info", "mlir::ModuleOp"> { }]; let constructor = "::fir::createAddDebugInfoPass()"; let dependentDialects = [ - "fir::FIROpsDialect", "mlir::func::FuncDialect", 
"mlir::LLVM::LLVMDialect" + "fir::FIROpsDialect", "mlir::func::FuncDialect", "mlir::LLVM::LLVMDialect", + "mlir::DLTIDialect" ]; let options = [ Option<"debugLevel", "debug-level", diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp index c479c1a0892b5..8fa2f38818c02 100644 --- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp +++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp @@ -23,6 +23,7 @@ #include "flang/Optimizer/Support/InternalNames.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Support/Version.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/Func/IR/FuncOps.h" #include "mlir/Dialect/LLVMIR/LLVMDialect.h" #include "mlir/IR/Matchers.h" diff --git a/flang/test/Transforms/debug-107988.fir b/flang/test/Transforms/debug-107988.fir index 0ba4296138f50..674ce287a29ec 100644 --- a/flang/test/Transforms/debug-107988.fir +++ b/flang/test/Transforms/debug-107988.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s -o - | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @test(%arg0: !fir.ref> {fir.bindc_name = "str"}, %arg1: i64) { %0 = fir.emboxchar %arg0, %arg1 : (!fir.ref>, i64) -> !fir.boxchar<1> %1 = fir.undefined !fir.dscope diff --git a/flang/test/Transforms/debug-96314.fir b/flang/test/Transforms/debug-96314.fir index e2d0f24a1105c..4df0c4a555d39 100644 --- a/flang/test/Transforms/debug-96314.fir +++ b/flang/test/Transforms/debug-96314.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s -o - | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QMhelperPmod_sub(%arg0: !fir.ref {fir.bindc_name = "a"} ) { return } loc(#loc1) diff --git a/flang/test/Transforms/debug-allocatable-1.fir b/flang/test/Transforms/debug-allocatable-1.fir index fd0beaddcdb70..f523025f5945e 100644 --- a/flang/test/Transforms/debug-allocatable-1.fir +++ 
b/flang/test/Transforms/debug-allocatable-1.fir @@ -1,7 +1,7 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @_QFPff() { %c1 = arith.constant 1 : index %c0 = arith.constant 0 : index diff --git a/flang/test/Transforms/debug-assumed-rank-array.fir b/flang/test/Transforms/debug-assumed-rank-array.fir index ce474cd259619..41e0396b076f7 100644 --- a/flang/test/Transforms/debug-assumed-rank-array.fir +++ b/flang/test/Transforms/debug-assumed-rank-array.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QFPfn(%arg0: !fir.box> ) { %1 = fir.undefined !fir.dscope %2 = fircg.ext_declare %arg0 dummy_scope %1 {uniq_name = "_QFFfnEx"} : (!fir.box>, !fir.dscope) -> !fir.box> loc(#loc2) diff --git a/flang/test/Transforms/debug-assumed-shape-array-2.fir b/flang/test/Transforms/debug-assumed-shape-array-2.fir index 212a3453d110d..acad57a710205 100644 --- a/flang/test/Transforms/debug-assumed-shape-array-2.fir +++ b/flang/test/Transforms/debug-assumed-shape-array-2.fir @@ -2,7 +2,7 @@ // Test assumed shape array with variable lower bound. 
-module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @_QFPfn(%arg0: !fir.box> {fir.bindc_name = "b"}, %arg1: !fir.ref {fir.bindc_name = "n"}) attributes {} { %c23_i32 = arith.constant 23 : i32 %c6_i32 = arith.constant 6 : i32 diff --git a/flang/test/Transforms/debug-assumed-size-array.fir b/flang/test/Transforms/debug-assumed-size-array.fir index 892502cb64a59..40e57100fd9ff 100644 --- a/flang/test/Transforms/debug-assumed-size-array.fir +++ b/flang/test/Transforms/debug-assumed-size-array.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QMhelperPfn(%arg0: !fir.ref> {fir.bindc_name = "a1"}, %arg1: !fir.ref> {fir.bindc_name = "a2"}, %arg2: !fir.ref> {fir.bindc_name = "a3"}) { %c5 = arith.constant 5 : index %c1 = arith.constant 1 : index diff --git a/flang/test/Transforms/debug-char-type-1.fir b/flang/test/Transforms/debug-char-type-1.fir index 630b52d96cb85..49f230f7307fa 100644 --- a/flang/test/Transforms/debug-char-type-1.fir +++ b/flang/test/Transforms/debug-char-type-1.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMhelperEstr1 : !fir.char<1,40> { %0 = fir.zero_bits !fir.char<1,40> fir.has_value %0 : !fir.char<1,40> diff --git a/flang/test/Transforms/debug-class-type.fir b/flang/test/Transforms/debug-class-type.fir index aad15a831fd2f..23af60b71ca50 100644 --- a/flang/test/Transforms/debug-class-type.fir +++ b/flang/test/Transforms/debug-class-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.type_info @_QTtest_type nofinal : !fir.type<_QTtest_type{a:i32,b:!fir.box>>}> dispatch_table { fir.dt_entry "test_proc", @_QPtest_proc } loc(#loc1) diff --git 
a/flang/test/Transforms/debug-common-block.fir b/flang/test/Transforms/debug-common-block.fir index 481b26369a92c..d68b524225df5 100644 --- a/flang/test/Transforms/debug-common-block.fir +++ b/flang/test/Transforms/debug-common-block.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @__BLNK__ {alignment = 4 : i64} : tuple> {} loc(#loc1) fir.global @a_ {alignment = 4 : i64} : tuple> {} loc(#loc2) func.func @f1() { diff --git a/flang/test/Transforms/debug-complex-1.fir b/flang/test/Transforms/debug-complex-1.fir index 633f27af99fb1..f7be6b2d4a931 100644 --- a/flang/test/Transforms/debug-complex-1.fir +++ b/flang/test/Transforms/debug-complex-1.fir @@ -3,7 +3,7 @@ // check conversion of complex type of different size. Both fir and mlir // variants are checked. -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @test1(%x : complex) -> complex { %1 = fir.convert %x : (complex) -> complex return %1 : complex diff --git a/flang/test/Transforms/debug-derived-type-2.fir b/flang/test/Transforms/debug-derived-type-2.fir index 63e842619edbe..1e128d702b347 100644 --- a/flang/test/Transforms/debug-derived-type-2.fir +++ b/flang/test/Transforms/debug-derived-type-2.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMmEvar : !fir.type<_QMmTt1{elm:!fir.array<5xi32>,elm2:!fir.array<5x8xi32>}> {} loc(#loc1) fir.type_info @_QMmTt1 noinit nodestroy nofinal : !fir.type<_QMmTt1{elm:!fir.array<5xi32>,elm2:!fir.array<5x8xi32>}> component_info { fir.dt_component "elm" lbs [2] diff --git a/flang/test/Transforms/debug-extra-global.fir b/flang/test/Transforms/debug-extra-global.fir index d3bc22ad0c59b..e3a33e4cfdf40 100644 --- a/flang/test/Transforms/debug-extra-global.fir +++ b/flang/test/Transforms/debug-extra-global.fir @@ -1,6
+1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global linkonce_odr @_QFEXnXxcx constant target : !fir.char<1,3> { %0 = fir.string_lit "xcx"(3) : !fir.char<1,3> fir.has_value %0 : !fir.char<1,3> diff --git a/flang/test/Transforms/debug-fixed-array-type.fir b/flang/test/Transforms/debug-fixed-array-type.fir index a15975c7cc92a..75cb88b08b248 100644 --- a/flang/test/Transforms/debug-fixed-array-type.fir +++ b/flang/test/Transforms/debug-fixed-array-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QQmain() attributes {fir.bindc_name = "mn"} { %c7 = arith.constant 7 : index %c8 = arith.constant 8 : index diff --git a/flang/test/Transforms/debug-fn-info.fir b/flang/test/Transforms/debug-fn-info.fir index 85cfd13643ec3..c02835be50af5 100644 --- a/flang/test/Transforms/debug-fn-info.fir +++ b/flang/test/Transforms/debug-fn-info.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QQmain() attributes {fir.bindc_name = "mn"} { %0 = fir.alloca i32 {bindc_name = "i4", uniq_name = "_QFEi4"} %1 = fircg.ext_declare %0 {uniq_name = "_QFEi4"} : (!fir.ref) -> !fir.ref diff --git a/flang/test/Transforms/debug-imported-entity.fir b/flang/test/Transforms/debug-imported-entity.fir index 7be6531a703a8..194bc82724583 100644 --- a/flang/test/Transforms/debug-imported-entity.fir +++ b/flang/test/Transforms/debug-imported-entity.fir @@ -1,7 +1,7 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMfooEv1 : i32 { %0 = fir.zero_bits i32 fir.has_value %0 : i32 diff --git a/flang/test/Transforms/debug-index-type.fir 
b/flang/test/Transforms/debug-index-type.fir index 20bd8471d7cf6..751e2e156dc20 100644 --- a/flang/test/Transforms/debug-index-type.fir +++ b/flang/test/Transforms/debug-index-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @str(%arg0: index) -> i32 loc(#loc1) } #loc1 = loc("test.f90":5:1) diff --git a/flang/test/Transforms/debug-line-table-existing.fir b/flang/test/Transforms/debug-line-table-existing.fir index 0e006303c8a81..03eefd08a4379 100644 --- a/flang/test/Transforms/debug-line-table-existing.fir +++ b/flang/test/Transforms/debug-line-table-existing.fir @@ -3,7 +3,7 @@ // REQUIRES: system-linux // Test that there are no changes to a function with existed fused loc debug -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPs1() { return loc(#loc1) } loc(#loc2) diff --git a/flang/test/Transforms/debug-line-table-inc-file.fir b/flang/test/Transforms/debug-line-table-inc-file.fir index 216cd5e016f2f..32c9f515ead43 100644 --- a/flang/test/Transforms/debug-line-table-inc-file.fir +++ b/flang/test/Transforms/debug-line-table-inc-file.fir @@ -3,7 +3,7 @@ // REQUIRES: system-linux // Test for included functions that have a different debug location than the current file -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPsinc() { return loc(#loc2) } loc(#loc1) diff --git a/flang/test/Transforms/debug-line-table-inc-same-file.fir b/flang/test/Transforms/debug-line-table-inc-same-file.fir index bcaf449798231..aaa8d03a76ef0 100644 --- a/flang/test/Transforms/debug-line-table-inc-same-file.fir +++ b/flang/test/Transforms/debug-line-table-inc-same-file.fir @@ -4,7 +4,7 @@ // Test that there is only one FileAttribute generated for multiple functions // in the same file.
-module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPs1() { return loc(#loc2) } loc(#loc1) diff --git a/flang/test/Transforms/debug-line-table.fir b/flang/test/Transforms/debug-line-table.fir index d6e54fd1ac467..81aebf026882a 100644 --- a/flang/test/Transforms/debug-line-table.fir +++ b/flang/test/Transforms/debug-line-table.fir @@ -3,7 +3,7 @@ // RUN: fir-opt --add-debug-info="debug-level=LineTablesOnly" --mlir-print-debuginfo %s | FileCheck %s --check-prefix=LINETABLE // RUN: fir-opt --add-debug-info="is-optimized=true" --mlir-print-debuginfo %s | FileCheck %s --check-prefix=OPT -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QPsb() { return loc(#loc_sb) } loc(#loc_sb) diff --git a/flang/test/Transforms/debug-local-var.fir b/flang/test/Transforms/debug-local-var.fir index b7a1ff7185a63..06c9b01e75a61 100644 --- a/flang/test/Transforms/debug-local-var.fir +++ b/flang/test/Transforms/debug-local-var.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @_QQmain() attributes {fir.bindc_name = "mn"} { %0 = fir.alloca i32 {bindc_name = "i4", uniq_name = "_QFEi4"} %1 = fircg.ext_declare %0 {uniq_name = "_QFEi4"} : (!fir.ref) -> !fir.ref loc(#loc1) diff --git a/flang/test/Transforms/debug-module-1.fir b/flang/test/Transforms/debug-module-1.fir index ede996f053835..c1e4c2eeffefe 100644 --- a/flang/test/Transforms/debug-module-1.fir +++ b/flang/test/Transforms/debug-module-1.fir @@ -1,7 +1,7 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMhelperEgli : i32 { %0 = fir.zero_bits i32 fir.has_value %0 : i32 diff --git a/flang/test/Transforms/debug-ptr-type.fir b/flang/test/Transforms/debug-ptr-type.fir index 64e64cb1a19ae..2bbece56a7ab5 100644 --- a/flang/test/Transforms/debug-ptr-type.fir +++ 
b/flang/test/Transforms/debug-ptr-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { fir.global @_QMhelperEpar : !fir.box>> { %0 = fir.zero_bits !fir.ptr> %c0 = arith.constant 0 : index diff --git a/flang/test/Transforms/debug-ref-type.fir b/flang/test/Transforms/debug-ref-type.fir index 2b3af485385d8..745aebee778be 100644 --- a/flang/test/Transforms/debug-ref-type.fir +++ b/flang/test/Transforms/debug-ref-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @_FortranAioBeginExternalListOutput(i8) -> !fir.ref loc(#loc1) } #loc1 = loc("test.f90":5:1) diff --git a/flang/test/Transforms/debug-tuple-type.fir b/flang/test/Transforms/debug-tuple-type.fir index c9b0d16c06e1a..e3b0bafdf3cd4 100644 --- a/flang/test/Transforms/debug-tuple-type.fir +++ b/flang/test/Transforms/debug-tuple-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @fn1(!fir.ref>) func.func private @_FortranAioOutputDerivedType(!fir.ref>) } diff --git a/flang/test/Transforms/debug-variable-array-dim.fir b/flang/test/Transforms/debug-variable-array-dim.fir index 1f401041dee57..a376133cf449a 100644 --- a/flang/test/Transforms/debug-variable-array-dim.fir +++ b/flang/test/Transforms/debug-variable-array-dim.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @foo_(%arg0: !fir.ref> {fir.bindc_name = "a"}, %arg1: !fir.ref {fir.bindc_name = "n"}, %arg2: !fir.ref {fir.bindc_name = "m"}, %arg3: !fir.ref {fir.bindc_name = "p"}) attributes {fir.internal_name = "_QPfoo"} { %c5_i32 = arith.constant 5 : i32 %c6_i32 = arith.constant 6 : i32 
diff --git a/flang/test/Transforms/debug-variable-char-len.fir b/flang/test/Transforms/debug-variable-char-len.fir index 9e177d22d5b10..907b65a4c6d4f 100644 --- a/flang/test/Transforms/debug-variable-char-len.fir +++ b/flang/test/Transforms/debug-variable-char-len.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s -o - | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func @foo(%arg0: !fir.ref> {fir.bindc_name = "str1"} , %arg1: !fir.ref {fir.bindc_name = "len1"} loc("/home/haqadeer/work/fortran/t1/../str.f90":1:1), %arg2: i64) { %0 = fir.emboxchar %arg0, %arg2 : (!fir.ref>, i64) -> !fir.boxchar<1> %c4_i32 = arith.constant 4 : i32 diff --git a/flang/test/Transforms/debug-vector-type.fir b/flang/test/Transforms/debug-vector-type.fir index 63846ce006c6c..d3e1f6ec28d0f 100644 --- a/flang/test/Transforms/debug-vector-type.fir +++ b/flang/test/Transforms/debug-vector-type.fir @@ -1,6 +1,6 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s -module attributes {dlti.dl_spec = #dlti.dl_spec<>} { +module { func.func private @foo1(%arg0: !fir.vector<20:bf16>) // CHECK-DAG: #[[F16:.*]] = #llvm.di_basic_type // CHECK-DAG: #llvm.di_composite_type> From flang-commits at lists.llvm.org Fri May 9 04:19:59 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Fri, 09 May 2025 04:19:59 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add missing dependency to AddDebugInfo pass (PR #139099) In-Reply-To: Message-ID: <681de4df.170a0220.37e90a.b782@mx.google.com> https://github.com/skatrak closed https://github.com/llvm/llvm-project/pull/139099 From flang-commits at lists.llvm.org Fri May 9 04:23:32 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 09 May 2025 04:23:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: 
<681de5b4.170a0220.2d9e58.bb9f@mx.google.com> ================ @@ -207,20 +207,25 @@ static bool hasExplicitLowerBounds(mlir::Value shape) { mlir::isa(shape.getType()); } -static std::pair updateDeclareInputTypeWithVolatility( +static std::pair updateDeclaredInputTypeWithVolatility( mlir::Type inputType, mlir::Value memref, mlir::OpBuilder &builder, fir::FortranVariableFlagsAttr fortran_attrs) { if (fortran_attrs && bitEnumContainsAny(fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::fortran_volatile)) { + // A volatile pointer's pointee is volatile. const bool isPointer = bitEnumContainsAny( fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::pointer); + // An allocatable's inner type's volatility matches that of the reference. + const bool isAllocatable = bitEnumContainsAny( + fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::allocatable); ---------------- tblah wrote: Line 213 checks if `fortran_attrs` is null before calling `bitEnumContainsAny`. Is that needed here too? https://github.com/llvm/llvm-project/pull/139183 From flang-commits at lists.llvm.org Fri May 9 04:23:32 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 09 May 2025 04:23:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <681de5b4.170a0220.1b2979.ba9f@mx.google.com> ================ @@ -1536,20 +1536,51 @@ bool fir::ConvertOp::canBeConverted(mlir::Type inType, mlir::Type outType) { areRecordsCompatible(inType, outType); } +// In general, ptrtoint-like conversions are allowed to lose volatility +// information because they are either: +// +// 1. passing an entity to an external function and there's nothing we can do +// about volatility after that happens, or +// 2. for code generation, at which point we represent volatility with +// attributes +// on the LLVM instructions and intrinsics. ---------------- tblah wrote: ultra-nit ```suggestion // 1. 
passing an entity to an external function and there's nothing we can do // about volatility after that happens, or // 2. for code generation, at which point we represent volatility with // attributes // on the LLVM instructions and intrinsics. ``` https://github.com/llvm/llvm-project/pull/139183 From flang-commits at lists.llvm.org Fri May 9 04:29:36 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 09 May 2025 04:29:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Basic lowering `fir.do_concurrent` locality specs to `fir.do_loop ... unordered` (PR #138512) In-Reply-To: Message-ID: <681de720.170a0220.174124.b5e3@mx.google.com> https://github.com/tblah approved this pull request. Thanks for the updates https://github.com/llvm/llvm-project/pull/138512 From flang-commits at lists.llvm.org Fri May 9 04:47:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 04:47:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Use box for components with non-default lower bounds (PR #138994) In-Reply-To: Message-ID: <681deb66.170a0220.951e2.f86a@mx.google.com> jeanPerier wrote: Thank you Asher, LGTM https://github.com/llvm/llvm-project/pull/138994 From flang-commits at lists.llvm.org Fri May 9 04:56:32 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 04:56:32 -0700 (PDT) Subject: [flang-commits] [flang] dd42112 - [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (#128490) Message-ID: <681ded70.170a0220.37e90a.b925@mx.google.com> Author: Kaviya Rajendiran Date: 2025-05-09T17:26:28+05:30 New Revision: dd42112c82d7b12669513dca4048167664b211b2 URL: https://github.com/llvm/llvm-project/commit/dd42112c82d7b12669513dca4048167664b211b2 DIFF: https://github.com/llvm/llvm-project/commit/dd42112c82d7b12669513dca4048167664b211b2.diff LOG: [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (#128490) - Added MLIR lowering for 
grainsize and num_tasks clauses of taskloop construct. Added: flang/test/Lower/OpenMP/taskloop-grainsize.f90 flang/test/Lower/OpenMP/taskloop-numtasks.f90 Modified: flang/lib/Lower/OpenMP/ClauseProcessor.cpp flang/lib/Lower/OpenMP/ClauseProcessor.h flang/lib/Lower/OpenMP/OpenMP.cpp Removed: ################################################################################ diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 318455f0afe80..79b5087e4da68 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -388,6 +388,27 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const { return markClauseOccurrence(result.nowait); } +bool ClauseProcessor::processNumTasks( + lower::StatementContext &stmtCtx, + mlir::omp::NumTasksClauseOps &result) const { + using NumTasks = omp::clause::NumTasks; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier && *modifier == NumTasks::Prescriptiveness::Strict) { + result.numTasksMod = mlir::omp::ClauseNumTasksTypeAttr::get( + context, mlir::omp::ClauseNumTasksType::Strict); + } + const auto &numtasksExpr = std::get(clause->t); + result.numTasks = + fir::getBase(converter.genExprValue(numtasksExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processNumTeams( lower::StatementContext &stmtCtx, mlir::omp::NumTeamsClauseOps &result) const { @@ -934,6 +955,27 @@ bool ClauseProcessor::processDepend(lower::SymMap &symMap, return findRepeatableClause(process); } +bool ClauseProcessor::processGrainsize( + lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const { + using Grainsize = omp::clause::Grainsize; + if (auto *clause = findUniqueClause()) { + fir::FirOpBuilder &firOpBuilder = 
converter.getFirOpBuilder(); + mlir::MLIRContext *context = firOpBuilder.getContext(); + const auto &modifier = + std::get>(clause->t); + if (modifier && *modifier == Grainsize::Prescriptiveness::Strict) { + result.grainsizeMod = mlir::omp::ClauseGrainsizeTypeAttr::get( + context, mlir::omp::ClauseGrainsizeType::Strict); + } + const auto &grainsizeExpr = std::get(clause->t); + result.grainsize = + fir::getBase(converter.genExprValue(grainsizeExpr, stmtCtx)); + return true; + } + return false; +} + bool ClauseProcessor::processHasDeviceAddr( lower::StatementContext &stmtCtx, mlir::omp::HasDeviceAddrClauseOps &result, llvm::SmallVectorImpl &hasDeviceSyms) const { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 3d3f26f06da26..2e4d911aab35e 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -73,6 +73,8 @@ class ClauseProcessor { mlir::omp::FilterClauseOps &result) const; bool processFinal(lower::StatementContext &stmtCtx, mlir::omp::FinalClauseOps &result) const; + bool processGrainsize(lower::StatementContext &stmtCtx, + mlir::omp::GrainsizeClauseOps &result) const; bool processHasDeviceAddr( lower::StatementContext &stmtCtx, mlir::omp::HasDeviceAddrClauseOps &result, @@ -82,6 +84,8 @@ class ClauseProcessor { mlir::omp::InclusiveClauseOps &result) const; bool processMergeable(mlir::omp::MergeableClauseOps &result) const; bool processNowait(mlir::omp::NowaitClauseOps &result) const; + bool processNumTasks(lower::StatementContext &stmtCtx, + mlir::omp::NumTasksClauseOps &result) const; bool processNumTeams(lower::StatementContext &stmtCtx, mlir::omp::NumTeamsClauseOps &result) const; bool processNumThreads(lower::StatementContext &stmtCtx, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 099d5c604060f..1a326345379f5 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1783,17 +1783,19 @@ 
static void genTaskgroupClauses( static void genTaskloopClauses(lower::AbstractConverter &converter, semantics::SemanticsContext &semaCtx, + lower::StatementContext &stmtCtx, const List &clauses, mlir::Location loc, mlir::omp::TaskloopOperands &clauseOps) { ClauseProcessor cp(converter, semaCtx, clauses); + cp.processGrainsize(stmtCtx, clauseOps); + cp.processNumTasks(stmtCtx, clauseOps); cp.processTODO( - loc, llvm::omp::Directive::OMPD_taskloop); + clause::Final, clause::If, clause::InReduction, + clause::Lastprivate, clause::Mergeable, clause::Nogroup, + clause::Priority, clause::Reduction, clause::Shared, + clause::Untied>(loc, llvm::omp::Directive::OMPD_taskloop); } static void genTaskwaitClauses(lower::AbstractConverter &converter, @@ -3270,12 +3272,12 @@ genStandaloneSimd(lower::AbstractConverter &converter, lower::SymMap &symTable, static mlir::omp::TaskloopOp genStandaloneTaskloop( lower::AbstractConverter &converter, lower::SymMap &symTable, - semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - mlir::Location loc, const ConstructQueue &queue, - ConstructQueue::const_iterator item) { + lower::StatementContext &stmtCtx, semantics::SemanticsContext &semaCtx, + lower::pft::Evaluation &eval, mlir::Location loc, + const ConstructQueue &queue, ConstructQueue::const_iterator item) { mlir::omp::TaskloopOperands taskloopClauseOps; - genTaskloopClauses(converter, semaCtx, item->clauses, loc, taskloopClauseOps); - + genTaskloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc, + taskloopClauseOps); DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval, /*shouldCollectPreDeterminedSymbols=*/true, enableDelayedPrivatization, symTable); @@ -3736,8 +3738,8 @@ static void genOMPDispatch(lower::AbstractConverter &converter, genTaskgroupOp(converter, symTable, semaCtx, eval, loc, queue, item); break; case llvm::omp::Directive::OMPD_taskloop: - newOp = genStandaloneTaskloop(converter, symTable, semaCtx, eval, loc, - queue, item); + newOp = 
genStandaloneTaskloop(converter, symTable, stmtCtx, semaCtx, eval, + loc, queue, item); break; case llvm::omp::Directive::OMPD_taskwait: newOp = genTaskwaitOp(converter, symTable, semaCtx, eval, loc, queue, item); diff --git a/flang/test/Lower/OpenMP/taskloop-grainsize.f90 b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 new file mode 100644 index 0000000000000..43db8acdeceac --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-grainsize.f90 @@ -0,0 +1,51 @@ +! This test checks lowering of grainsize clause in taskloop directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE_TEST2:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_grainsize +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_grainsizeEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_grainsizeEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFtest_grainsizeEx"} +! CHECK: %[[DECL_X:.*]]:2 = hlfir.declare %[[ALLOCA_X]] {uniq_name = "_QFtest_grainsizeEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[GRAINSIZE:.*]] = arith.constant 10 : i32 +subroutine test_grainsize + integer :: i, x + ! CHECK: omp.taskloop grainsize(%[[GRAINSIZE]]: i32) + ! CHECK-SAME: private(@[[X_FIRSTPRIVATE]] %[[DECL_X]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { + ! 
CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { + !$omp taskloop grainsize(10) + do i = 1, 1000 + x = x + 1 + end do + !$omp end taskloop +end subroutine test_grainsize + +!CHECK-LABEL: func.func @_QPtest_grainsize_strict() +subroutine test_grainsize_strict + integer :: i, x + ! CHECK: %[[GRAINSIZE:.*]] = arith.constant 10 : i32 + ! CHECK: omp.taskloop grainsize(strict, %[[GRAINSIZE]]: i32) + !$omp taskloop grainsize(strict:10) + do i = 1, 1000 + !CHECK: arith.addi + x = x + 1 + end do + !$omp end taskloop +end subroutine diff --git a/flang/test/Lower/OpenMP/taskloop-numtasks.f90 b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 new file mode 100644 index 0000000000000..f68f3a2d6ad26 --- /dev/null +++ b/flang/test/Lower/OpenMP/taskloop-numtasks.f90 @@ -0,0 +1,51 @@ +! This test checks lowering of num_tasks clause in taskloop directive. + +! RUN: bbc -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE_TEST2:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE_TEST2:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = private} @[[I_PRIVATE:.*]] : i32 + +! CHECK-LABEL: omp.private +! CHECK-SAME: {type = firstprivate} @[[X_FIRSTPRIVATE:.*]] : i32 +! CHECK-SAME: copy { +! CHECK: hlfir.assign + +! CHECK-LABEL: func.func @_QPtest_num_tasks +! CHECK: %[[ALLOCA_I:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtest_num_tasksEi"} +! CHECK: %[[DECL_I:.*]]:2 = hlfir.declare %[[ALLOCA_I]] {uniq_name = "_QFtest_num_tasksEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ALLOCA_X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFtest_num_tasksEx"} +! 
CHECK: %[[DECL_X:.*]]:2 = hlfir.declare %[[ALLOCA_X]] {uniq_name = "_QFtest_num_tasksEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL_NUMTASKS:.*]] = arith.constant 10 : i32 +subroutine test_num_tasks + integer :: i, x + ! CHECK: omp.taskloop num_tasks(%[[VAL_NUMTASKS]]: i32) + ! CHECK-SAME: private(@[[X_FIRSTPRIVATE]] %[[DECL_X]]#0 -> %[[ARG0:.*]], @[[I_PRIVATE]] %[[DECL_I]]#0 -> %[[ARG1:.*]] : !fir.ref, !fir.ref) { + ! CHECK: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%{{.*}}) to (%{{.*}}) inclusive step (%{{.*}}) { + !$omp taskloop num_tasks(10) + do i = 1, 1000 + x = x + 1 + end do + !$omp end taskloop +end subroutine test_num_tasks + +! CHECK-LABEL: func.func @_QPtest_num_tasks_strict +subroutine test_num_tasks_strict + integer :: x, i + ! CHECK: %[[NUM_TASKS:.*]] = arith.constant 10 : i32 + ! CHECK: omp.taskloop num_tasks(strict, %[[NUM_TASKS]]: i32) + !$omp taskloop num_tasks(strict:10) + do i = 1, 100 + !CHECK: arith.addi + x = x + 1 + end do + !$omp end taskloop +end subroutine From flang-commits at lists.llvm.org Fri May 9 04:56:36 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Fri, 09 May 2025 04:56:36 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] MLIR lowering support for grainsize and num_tasks clause (PR #128490) In-Reply-To: Message-ID: <681ded74.170a0220.31b952.f33d@mx.google.com> https://github.com/kaviya2510 closed https://github.com/llvm/llvm-project/pull/128490 From flang-commits at lists.llvm.org Fri May 9 04:57:40 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 04:57:40 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) In-Reply-To: Message-ID: <681dedb4.630a0220.2c2a2.83e1@mx.google.com> agozillon wrote: Thank you both for the review! I'll land this now. 
https://github.com/llvm/llvm-project/pull/138210 From flang-commits at lists.llvm.org Fri May 9 04:57:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 04:57:50 -0700 (PDT) Subject: [flang-commits] [flang] b291cfc - [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (#138210) Message-ID: <681dedbe.170a0220.9d76a.d04e@mx.google.com> Author: agozillon Date: 2025-05-09T13:57:45+02:00 New Revision: b291cfcad4815568dc1eaca58185d25dceed3f1c URL: https://github.com/llvm/llvm-project/commit/b291cfcad4815568dc1eaca58185d25dceed3f1c DIFF: https://github.com/llvm/llvm-project/commit/b291cfcad4815568dc1eaca58185d25dceed3f1c.diff LOG: [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (#138210) Currently, we do not generate the appropriate checks to test whether an optional allocatable argument is present before accessing its components. In particular, when creating bounds we must generate a presence check, and we must make sure we do not generate or keep a load outside the presence check, by using the raw address rather than the regular address of the info data structure. Similarly, optional allocatables must be treated like non-allocatable arguments: we generate an intermediate allocation that gives us a location in memory we can access later in the lowering without causing segfaults when we perform "mapping" on it, even if the end result is an empty allocatable (basically, we shouldn't explode if someone tries to map a non-present optional, similar to C++ when mapping null data).
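The two rules in the log message (load the descriptor only inside the presence check, and always materialise a local slot that is filled only when the argument is present) can be sketched in plain C++. This is only an illustration under stated assumptions, not the Flang lowering code: `Descriptor`, `Bounds`, `computeBounds` and `makeLocalSlot` are hypothetical names, and a null pointer stands in for an absent optional argument. The empty-range values (0, -1, 0) mirror the `else` branch checked in the lowering test added by this commit.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an allocatable array descriptor (not a FIR type).
struct Descriptor {
  float *base = nullptr;
  std::ptrdiff_t extent = 0;
};

// Hypothetical zero-based bounds triple, loosely modelled on omp.map.bounds.
struct Bounds {
  std::ptrdiff_t lower;
  std::ptrdiff_t upper;
  std::ptrdiff_t extent;
};

// Compute bounds for a possibly absent argument. The descriptor is read only
// inside the presence check; an absent argument yields the empty range.
Bounds computeBounds(const Descriptor *maybeDesc) {
  if (maybeDesc != nullptr) {
    const Descriptor desc = *maybeDesc; // load stays inside the check
    return {0, desc.extent - 1, desc.extent};
  }
  return {0, -1, 0};
}

// Always produce a local slot that later code can address safely. The copy
// from the caller's descriptor happens only when the argument is present.
Descriptor makeLocalSlot(const Descriptor *maybeDesc) {
  Descriptor slot; // default state: empty, null base
  if (maybeDesc != nullptr)
    slot = *maybeDesc;
  return slot;
}
```

With an absent argument, `computeBounds(nullptr)` returns the empty range and `makeLocalSlot(nullptr)` returns an empty descriptor, so code that later maps the slot never dereferences a missing argument.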
Added: flang/test/Lower/OpenMP/optional-argument-map-2.f90 offload/test/offloading/fortran/optional-mapped-arguments-2.f90 Modified: flang/include/flang/Optimizer/Builder/DirectivesCommon.h flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h index 8684299ab6792..183e5711213eb 100644 --- a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h +++ b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h @@ -243,6 +243,17 @@ genBaseBoundsOps(fir::FirOpBuilder &builder, mlir::Location loc, return bounds; } +/// Checks if an argument is optional based on the fortran attributes +/// that are tied to it. +inline bool isOptionalArgument(mlir::Operation *op) { + if (auto declareOp = mlir::dyn_cast_or_null(op)) + if (declareOp.getFortranAttrs() && + bitEnumContainsAny(*declareOp.getFortranAttrs(), + fir::FortranVariableFlagsEnum::optional)) + return true; + return false; +} + template llvm::SmallVector genImplicitBoundsOps(fir::FirOpBuilder &builder, AddrAndBoundsInfo &info, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1a326345379f5..544f31bb5054f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2322,7 +2322,8 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, fir::factory::AddrAndBoundsInfo info = Fortran::lower::getDataOperandBaseAddr( - converter, firOpBuilder, sym, converter.getCurrentLocation()); + converter, firOpBuilder, sym.GetUltimate(), + converter.getCurrentLocation()); llvm::SmallVector bounds = fir::factory::genImplicitBoundsOps( diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp index 3fcb4b04a7b76..e19594ace2992 100644 --- 
a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp +++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp @@ -131,7 +131,8 @@ class MapInfoFinalizationPass boxMap.getVarPtr().getDefiningOp())) descriptor = addrOp.getVal(); - if (!mlir::isa(descriptor.getType())) + if (!mlir::isa(descriptor.getType()) && + !fir::factory::isOptionalArgument(descriptor.getDefiningOp())) return descriptor; mlir::Value &slot = localBoxAllocas[descriptor.getDefiningOp()]; @@ -151,7 +152,11 @@ class MapInfoFinalizationPass mlir::Location loc = boxMap->getLoc(); assert(allocaBlock && "No alloca block found for this top level op"); builder.setInsertionPointToStart(allocaBlock); - auto alloca = builder.create(loc, descriptor.getType()); + + mlir::Type allocaType = descriptor.getType(); + if (fir::isBoxAddress(allocaType)) + allocaType = fir::unwrapRefType(allocaType); + auto alloca = builder.create(loc, allocaType); builder.restoreInsertionPoint(insPt); // We should only emit a store if the passed in data is present, it is // possible a user passes in no argument to an optional parameter, in which @@ -159,8 +164,10 @@ class MapInfoFinalizationPass auto isPresent = builder.create(loc, builder.getI1Type(), descriptor); builder.genIfOp(loc, {}, isPresent, false) - .genThen( - [&]() { builder.create(loc, descriptor, alloca); }) + .genThen([&]() { + descriptor = builder.loadIfRef(loc, descriptor); + builder.create(loc, descriptor, alloca); + }) .end(); return slot = alloca; } diff --git a/flang/test/Lower/OpenMP/optional-argument-map-2.f90 b/flang/test/Lower/OpenMP/optional-argument-map-2.f90 new file mode 100644 index 0000000000000..3b629cfc06d3a --- /dev/null +++ b/flang/test/Lower/OpenMP/optional-argument-map-2.f90 @@ -0,0 +1,46 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +module mod + implicit none +contains + subroutine routine(a) + implicit none + real(4), allocatable, optional, intent(inout) :: a(:) + integer(4) :: i + + !$omp target teams distribute parallel do 
shared(a) + do i=1,10 + a(i) = i + a(i) + end do + + end subroutine routine +end module mod + +! CHECK-LABEL: func.func @_QMmodProutine( +! CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>> {fir.bindc_name = "a", fir.optional}) { +! CHECK: %[[VAL_0:.*]] = fir.alloca !fir.box>> +! CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[ARG0]] dummy_scope %[[VAL_1]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QMmodFroutineEa"} : (!fir.ref>>>, !fir.dscope) -> (!fir.ref>>>, !fir.ref>>>) +! CHECK: %[[VAL_8:.*]] = fir.is_present %[[VAL_2]]#1 : (!fir.ref>>>) -> i1 +! CHECK: %[[VAL_9:.*]]:5 = fir.if %[[VAL_8]] -> (index, index, index, index, index) { +! CHECK: %[[VAL_10:.*]] = fir.load %[[VAL_2]]#0 : !fir.ref>>> +! CHECK: %[[VAL_11:.*]] = arith.constant 1 : index +! CHECK: %[[VAL_12:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_2]]#0 : !fir.ref>>> +! CHECK: %[[VAL_14:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_15:.*]]:3 = fir.box_dims %[[VAL_13]], %[[VAL_14]] : (!fir.box>>, index) -> (index, index, index) +! CHECK: %[[VAL_16:.*]]:3 = fir.box_dims %[[VAL_10]], %[[VAL_12]] : (!fir.box>>, index) -> (index, index, index) +! CHECK: %[[VAL_17:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_18:.*]] = arith.subi %[[VAL_16]]#1, %[[VAL_11]] : index +! CHECK: fir.result %[[VAL_17]], %[[VAL_18]], %[[VAL_16]]#1, %[[VAL_16]]#2, %[[VAL_15]]#0 : index, index, index, index, index +! CHECK: } else { +! CHECK: %[[VAL_19:.*]] = arith.constant 0 : index +! CHECK: %[[VAL_20:.*]] = arith.constant -1 : index +! CHECK: fir.result %[[VAL_19]], %[[VAL_20]], %[[VAL_19]], %[[VAL_19]], %[[VAL_19]] : index, index, index, index, index +! CHECK: } +! CHECK: %[[VAL_21:.*]] = omp.map.bounds lower_bound(%[[VAL_9]]#0 : index) upper_bound(%[[VAL_9]]#1 : index) extent(%[[VAL_9]]#2 : index) stride(%[[VAL_9]]#3 : index) start_idx(%[[VAL_9]]#4 : index) {stride_in_bytes = true} +! 
CHECK: %[[VAL_23:.*]] = fir.is_present %[[VAL_2]]#1 : (!fir.ref>>>) -> i1 +! CHECK: fir.if %[[VAL_23]] { +! CHECK: %[[VAL_24:.*]] = fir.load %[[VAL_2]]#1 : !fir.ref>>> +! CHECK: fir.store %[[VAL_24]] to %[[VAL_0]] : !fir.ref>>> +! CHECK: } diff --git a/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 b/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 new file mode 100644 index 0000000000000..0de6b7730d3a0 --- /dev/null +++ b/offload/test/offloading/fortran/optional-mapped-arguments-2.f90 @@ -0,0 +1,57 @@ +! OpenMP offloading regression test that checks we do not cause a segfault when +! implicitly mapping a not present optional allocatable function argument and +! utilise it in the target region. No results requiring checking other than +! that the program compiles and runs to completion with no error. +! REQUIRES: flang, amdgpu + +! RUN: %libomptarget-compile-fortran-run-and-check-generic +module mod + implicit none +contains + subroutine routine(a, b) + implicit none + real(4), allocatable, optional, intent(in) :: a(:) + real(4), intent(out) :: b(:) + integer(4) :: i, ia + if(present(a)) then + ia = 1 + write(*,*) "a is present" + else + ia=0 + write(*,*) "a is not present" + end if + + !$omp target teams distribute parallel do shared(a,b,ia) + do i=1,10 + if (ia>0) then + b(i) = b(i) + a(i) + end if + end do + + end subroutine routine + +end module mod + +program main + use mod + implicit none + real(4), allocatable :: a(:) + real(4), allocatable :: b(:) + integer(4) :: i + allocate(b(10)) + do i=1,10 + b(i)=0 + end do + !$omp target data map(from: b) + + call routine(b=b) + + !$omp end target data + + deallocate(b) + + print *, "success, no segmentation fault" +end program main + +!CHECK: a is not present +!CHECK: success, no segmentation fault From flang-commits at lists.llvm.org Fri May 9 04:57:51 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 04:57:51 -0700 (PDT) Subject: [flang-commits] 
[flang] [llvm] [Flang][OpenMP] Generate correct present checks for implicit maps of optional allocatables (PR #138210) In-Reply-To: Message-ID: <681dedbf.050a0220.150b63.f8a4@mx.google.com> https://github.com/agozillon closed https://github.com/llvm/llvm-project/pull/138210 From flang-commits at lists.llvm.org Fri May 9 05:34:03 2025 From: flang-commits at lists.llvm.org (Yussur Mustafa Oraji via flang-commits) Date: Fri, 09 May 2025 05:34:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <681df63b.050a0220.2b3d53.62a4@mx.google.com> ================ @@ -138,6 +138,18 @@ text. OpenMP-style directives that look like comments are not addressed by this scheme but are obvious extensions. +## Currently implemented built-ins + +* `__DATE__`: Date, given as e.g. "Jun 16 1904" +* `__TIME__`: Time in 24-hour format including seconds, e.g. "09:24:13" +* `__TIMESTAMP__`: Date, time and year of last modification, given as e.g. "Fri May 9 09:16:17 2025" +* `__FILE__`: Current file +* `__LINE__`: Current line + +### Non-standard extensions ---------------- N00byKing wrote: Hm, to me this reads more like exceptions, or documenting quirks, instead of extensions to the preprocessor. 
If you think it fits there more, I'll move it though https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Fri May 9 05:42:07 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 09 May 2025 05:42:07 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (PR #139131) In-Reply-To: Message-ID: <681df81f.170a0220.1519d2.d6a2@mx.google.com> https://github.com/kparzysz edited https://github.com/llvm/llvm-project/pull/139131 From flang-commits at lists.llvm.org Fri May 9 05:42:18 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 05:42:18 -0700 (PDT) Subject: [flang-commits] [flang] 41aa674 - [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (#139131) Message-ID: <681df82a.630a0220.291a11.7fc3@mx.google.com> Author: Krzysztof Parzyszek Date: 2025-05-09T07:42:15-05:00 New Revision: 41aa67488c3ca33334ec79fb5216145c3644277c URL: https://github.com/llvm/llvm-project/commit/41aa67488c3ca33334ec79fb5216145c3644277c DIFF: https://github.com/llvm/llvm-project/commit/41aa67488c3ca33334ec79fb5216145c3644277c.diff LOG: [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (#139131) The OpenMP version is stored in LangOptions in SemanticsContext. Use the fallback version where SemanticsContext is unavailable (mostly in case of debug dumps). 
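The version-selection rule described in the log can be sketched roughly as follows. This is a standalone illustration, not the LLVM API: `Directive`, `directiveName`, `ContextLike`, `nameFor` and `kFallbackVersion` are hypothetical names, the fallback value 52 is assumed for the example, and the spelling table is invented to show a version-dependent name.

```cpp
#include <cassert>
#include <string>

// Hypothetical directive ids (not llvm::omp::Directive).
enum class Directive { TargetData, Taskloop };

// Assumed fallback version, used when no context is available.
constexpr unsigned kFallbackVersion = 52;

// Hypothetical name lookup: the spelling may depend on the OpenMP version.
// The version threshold and spellings here are for illustration only.
std::string directiveName(Directive d, unsigned version) {
  switch (d) {
  case Directive::TargetData:
    return version >= 60 ? "target_data" : "target data";
  case Directive::Taskloop:
    return "taskloop";
  }
  return "unknown";
}

// Hypothetical stand-in for a context object that carries language options.
struct ContextLike {
  unsigned openMPVersion;
};

// Use the version from the context when there is one; otherwise fall back,
// as a debug dump would have to do.
std::string nameFor(Directive d, const ContextLike *ctx) {
  const unsigned version = ctx ? ctx->openMPVersion : kFallbackVersion;
  return directiveName(d, version);
}
```

Callers that hold a context pass it through, so diagnostics use the spelling for the selected OpenMP version; callers without one pass a null pointer and get the fallback spelling.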
RFC: https://discourse.llvm.org/t/rfc-alternative-spellings-of-openmp-directives/85507 Added: Modified: flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp flang/include/flang/Parser/dump-parse-tree.h flang/include/flang/Parser/unparse.h flang/include/flang/Semantics/unparse-with-symbols.h flang/lib/Frontend/ParserActions.cpp flang/lib/Lower/OpenMP/ClauseProcessor.h flang/lib/Lower/OpenMP/Decomposer.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Parser/openmp-parsers.cpp flang/lib/Parser/parse-tree.cpp flang/lib/Parser/unparse.cpp flang/lib/Semantics/check-omp-structure.cpp flang/lib/Semantics/mod-file.cpp flang/lib/Semantics/resolve-directives.cpp flang/lib/Semantics/unparse-with-symbols.cpp Removed: ################################################################################ diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index dbbf86a6c6151..bf66151d59950 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -267,8 +267,9 @@ void OpenMPCounterVisitor::Post(const OmpScheduleClause::Kind &c) { "type=" + std::string{OmpScheduleClause::EnumToString(c)} + ";"; } void OpenMPCounterVisitor::Post(const OmpDirectiveNameModifier &c) { - clauseDetails += - "name_modifier=" + llvm::omp::getOpenMPDirectiveName(c.v).str() + ";"; + clauseDetails += "name_modifier=" + + llvm::omp::getOpenMPDirectiveName(c.v, llvm::omp::FallbackVersion).str() + + ";"; } void OpenMPCounterVisitor::Post(const OmpClause &c) { PostClauseCommon(normalize_clause_name(c.source.ToString())); diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index a3721bc8410ba..df9278697346f 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -17,6 +17,7 @@ #include "flang/Common/idioms.h" #include "flang/Common/indirection.h" #include 
"flang/Support/Fortran.h" +#include "llvm/Frontend/OpenMP/OMP.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -545,8 +546,8 @@ class ParseTreeDumper { NODE(parser, OmpBeginSectionsDirective) NODE(parser, OmpBlockDirective) static std::string GetNodeName(const llvm::omp::Directive &x) { - return llvm::Twine( - "llvm::omp::Directive = ", llvm::omp::getOpenMPDirectiveName(x)) + return llvm::Twine("llvm::omp::Directive = ", + llvm::omp::getOpenMPDirectiveName(x, llvm::omp::FallbackVersion)) .str(); } NODE(parser, OmpClause) diff --git a/flang/include/flang/Parser/unparse.h b/flang/include/flang/Parser/unparse.h index 40094ecbc85e5..d796109ca8f86 100644 --- a/flang/include/flang/Parser/unparse.h +++ b/flang/include/flang/Parser/unparse.h @@ -18,6 +18,10 @@ namespace llvm { class raw_ostream; } +namespace Fortran::common { +class LangOptions; +} + namespace Fortran::evaluate { struct GenericExprWrapper; struct GenericAssignmentWrapper; @@ -47,15 +51,18 @@ struct AnalyzedObjectsAsFortran { // Converts parsed program (or fragment) to out as Fortran. 
template void Unparse(llvm::raw_ostream &out, const A &root, - Encoding encoding = Encoding::UTF_8, bool capitalizeKeywords = true, - bool backslashEscapes = true, preStatementType *preStatement = nullptr, + const common::LangOptions &langOpts, Encoding encoding = Encoding::UTF_8, + bool capitalizeKeywords = true, bool backslashEscapes = true, + preStatementType *preStatement = nullptr, AnalyzedObjectsAsFortran * = nullptr); extern template void Unparse(llvm::raw_ostream &out, const Program &program, - Encoding encoding, bool capitalizeKeywords, bool backslashEscapes, + const common::LangOptions &langOpts, Encoding encoding, + bool capitalizeKeywords, bool backslashEscapes, preStatementType *preStatement, AnalyzedObjectsAsFortran *); extern template void Unparse(llvm::raw_ostream &out, const Expr &expr, - Encoding encoding, bool capitalizeKeywords, bool backslashEscapes, + const common::LangOptions &langOpts, Encoding encoding, + bool capitalizeKeywords, bool backslashEscapes, preStatementType *preStatement, AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/include/flang/Semantics/unparse-with-symbols.h b/flang/include/flang/Semantics/unparse-with-symbols.h index 5e18b3fc3063d..702911bbab627 100644 --- a/flang/include/flang/Semantics/unparse-with-symbols.h +++ b/flang/include/flang/Semantics/unparse-with-symbols.h @@ -16,6 +16,10 @@ namespace llvm { class raw_ostream; } +namespace Fortran::common { +class LangOptions; +} + namespace Fortran::parser { struct Program; } @@ -23,6 +27,7 @@ struct Program; namespace Fortran::semantics { class SemanticsContext; void UnparseWithSymbols(llvm::raw_ostream &, const parser::Program &, + const common::LangOptions &, parser::Encoding encoding = parser::Encoding::UTF_8); void UnparseWithModules(llvm::raw_ostream &, SemanticsContext &, const parser::Program &, diff --git a/flang/lib/Frontend/ParserActions.cpp b/flang/lib/Frontend/ParserActions.cpp index cc7e72f696f96..4fe575b06d29f 100644 --- 
a/flang/lib/Frontend/ParserActions.cpp +++ b/flang/lib/Frontend/ParserActions.cpp @@ -119,7 +119,7 @@ void debugUnparseNoSema(CompilerInstance &ci, llvm::raw_ostream &out) { auto &parseTree{ci.getParsing().parseTree()}; // TODO: Options should come from CompilerInvocation - Unparse(out, *parseTree, + Unparse(out, *parseTree, ci.getInvocation().getLangOpts(), /*encoding=*/parser::Encoding::UTF_8, /*capitalizeKeywords=*/true, /*backslashEscapes=*/false, /*preStatement=*/nullptr, @@ -131,6 +131,7 @@ void debugUnparseWithSymbols(CompilerInstance &ci) { auto &parseTree{*ci.getParsing().parseTree()}; semantics::UnparseWithSymbols(llvm::outs(), parseTree, + ci.getInvocation().getLangOpts(), /*encoding=*/parser::Encoding::UTF_8); } diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 2e4d911aab35e..7857ba3fd0845 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -200,9 +200,11 @@ void ClauseProcessor::processTODO(mlir::Location currentLocation, auto checkUnhandledClause = [&](llvm::omp::Clause id, const auto *x) { if (!x) return; + unsigned version = semaCtx.langOptions().OpenMPVersion; TODO(currentLocation, "Unhandled clause " + llvm::omp::getOpenMPClauseName(id).upper() + - " in " + llvm::omp::getOpenMPDirectiveName(directive).upper() + + " in " + + llvm::omp::getOpenMPDirectiveName(directive, version).upper() + " construct"); }; diff --git a/flang/lib/Lower/OpenMP/Decomposer.cpp b/flang/lib/Lower/OpenMP/Decomposer.cpp index 33568bf96b5df..251cba9204adc 100644 --- a/flang/lib/Lower/OpenMP/Decomposer.cpp +++ b/flang/lib/Lower/OpenMP/Decomposer.cpp @@ -70,7 +70,7 @@ struct ConstructDecomposition { namespace Fortran::lower::omp { LLVM_DUMP_METHOD llvm::raw_ostream &operator<<(llvm::raw_ostream &os, const UnitConstruct &uc) { - os << llvm::omp::getOpenMPDirectiveName(uc.id); + os << llvm::omp::getOpenMPDirectiveName(uc.id, llvm::omp::FallbackVersion); for (auto [index, 
clause] : llvm::enumerate(uc.clauses)) { os << (index == 0 ? '\t' : ' '); os << llvm::omp::getOpenMPClauseName(clause.id); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 544f31bb5054f..62a00ae8f3714 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3754,9 +3754,11 @@ static void genOMPDispatch(lower::AbstractConverter &converter, item); break; case llvm::omp::Directive::OMPD_tile: - case llvm::omp::Directive::OMPD_unroll: + case llvm::omp::Directive::OMPD_unroll: { + unsigned version = semaCtx.langOptions().OpenMPVersion; TODO(loc, "Unhandled loop directive (" + - llvm::omp::getOpenMPDirectiveName(dir) + ")"); + llvm::omp::getOpenMPDirectiveName(dir, version) + ")"); + } // case llvm::omp::Directive::OMPD_workdistribute: case llvm::omp::Directive::OMPD_workshare: newOp = genWorkshareOp(converter, symTable, stmtCtx, semaCtx, eval, loc, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index c4728e0fabe61..ffee57144f7fb 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -125,7 +125,8 @@ OmpDirectiveNameParser::directives() const { void OmpDirectiveNameParser::initTokens(NameWithId *table) const { for (size_t i{0}, e{llvm::omp::Directive_enumSize}; i != e; ++i) { auto id{static_cast(i)}; - llvm::StringRef name{llvm::omp::getOpenMPDirectiveName(id)}; + llvm::StringRef name{ + llvm::omp::getOpenMPDirectiveName(id, llvm::omp::FallbackVersion)}; table[i] = std::make_pair(name.str(), id); } // Sort the table with respect to the directive name length in a descending diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..3dd87ad9a3650 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -11,6 +11,7 @@ #include "flang/Common/indirection.h" #include "flang/Parser/tools.h" #include "flang/Parser/user-state.h" +#include 
"llvm/Frontend/OpenMP/OMP.h" #include "llvm/Support/raw_ostream.h" #include @@ -305,7 +306,9 @@ std::string OmpTraitSelectorName::ToString() const { return std::string(EnumToString(v)); }, [&](llvm::omp::Directive d) { - return llvm::omp::getOpenMPDirectiveName(d).str(); + return llvm::omp::getOpenMPDirectiveName( + d, llvm::omp::FallbackVersion) + .str(); }, [&](const std::string &s) { // return s; diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 1ee9096fcda56..a626888b7dfe5 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -17,6 +17,7 @@ #include "flang/Parser/parse-tree.h" #include "flang/Parser/tools.h" #include "flang/Support/Fortran.h" +#include "flang/Support/LangOptions.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -27,12 +28,14 @@ namespace Fortran::parser { class UnparseVisitor { public: - UnparseVisitor(llvm::raw_ostream &out, int indentationAmount, - Encoding encoding, bool capitalize, bool backslashEscapes, - preStatementType *preStatement, AnalyzedObjectsAsFortran *asFortran) - : out_{out}, indentationAmount_{indentationAmount}, encoding_{encoding}, - capitalizeKeywords_{capitalize}, backslashEscapes_{backslashEscapes}, - preStatement_{preStatement}, asFortran_{asFortran} {} + UnparseVisitor(llvm::raw_ostream &out, const common::LangOptions &langOpts, + int indentationAmount, Encoding encoding, bool capitalize, + bool backslashEscapes, preStatementType *preStatement, + AnalyzedObjectsAsFortran *asFortran) + : out_{out}, langOpts_{langOpts}, indentationAmount_{indentationAmount}, + encoding_{encoding}, capitalizeKeywords_{capitalize}, + backslashEscapes_{backslashEscapes}, preStatement_{preStatement}, + asFortran_{asFortran} {} // In nearly all cases, this code avoids defining Boolean-valued Pre() // callbacks for the parse tree walking framework in favor of two void @@ -2102,7 +2105,8 @@ class UnparseVisitor { Walk(":", std::get>(x.t)); } void Unparse(const 
llvm::omp::Directive &x) { - Word(llvm::omp::getOpenMPDirectiveName(x).str()); + unsigned ompVersion{langOpts_.OpenMPVersion}; + Word(llvm::omp::getOpenMPDirectiveName(x, ompVersion).str()); } void Unparse(const OmpDirectiveSpecification &x) { auto unparseArgs{[&]() { @@ -2167,7 +2171,8 @@ class UnparseVisitor { x.u); } void Unparse(const OmpDirectiveNameModifier &x) { - Word(llvm::omp::getOpenMPDirectiveName(x.v)); + unsigned ompVersion{langOpts_.OpenMPVersion}; + Word(llvm::omp::getOpenMPDirectiveName(x.v, ompVersion)); } void Unparse(const OmpIteratorSpecifier &x) { Walk(std::get(x.t)); @@ -3249,6 +3254,7 @@ class UnparseVisitor { } llvm::raw_ostream &out_; + const common::LangOptions &langOpts_; int indent_{0}; const int indentationAmount_{1}; int column_{1}; @@ -3341,17 +3347,20 @@ void UnparseVisitor::Word(const std::string_view &str) { } template -void Unparse(llvm::raw_ostream &out, const A &root, Encoding encoding, +void Unparse(llvm::raw_ostream &out, const A &root, + const common::LangOptions &langOpts, Encoding encoding, bool capitalizeKeywords, bool backslashEscapes, preStatementType *preStatement, AnalyzedObjectsAsFortran *asFortran) { - UnparseVisitor visitor{out, 1, encoding, capitalizeKeywords, backslashEscapes, - preStatement, asFortran}; + UnparseVisitor visitor{out, langOpts, 1, encoding, capitalizeKeywords, + backslashEscapes, preStatement, asFortran}; Walk(root, visitor); visitor.Done(); } -template void Unparse(llvm::raw_ostream &, const Program &, Encoding, - bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); -template void Unparse(llvm::raw_ostream &, const Expr &, Encoding, bool, - bool, preStatementType *, AnalyzedObjectsAsFortran *); +template void Unparse(llvm::raw_ostream &, const Program &, + const common::LangOptions &, Encoding, bool, bool, preStatementType *, + AnalyzedObjectsAsFortran *); +template void Unparse(llvm::raw_ostream &, const Expr &, + const common::LangOptions &, Encoding, bool, bool, preStatementType *, + 
AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dd8e511642976..8f6a623508aa7 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2434,8 +2434,7 @@ void OmpStructureChecker::Enter( break; default: context_.Say(dirName.source, "%s is not a cancellable construct"_err_en_US, - parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(dirName.v).str())); + parser::ToUpperCaseLetters(getDirectiveName(dirName.v).str())); break; } } @@ -2468,7 +2467,7 @@ std::optional OmpStructureChecker::GetCancelType( // Given clauses from CANCEL or CANCELLATION_POINT, identify the construct // to which the cancellation applies. std::optional cancelee; - llvm::StringRef cancelName{llvm::omp::getOpenMPDirectiveName(cancelDir)}; + llvm::StringRef cancelName{getDirectiveName(cancelDir)}; for (const parser::OmpClause &clause : maybeClauses->v) { using CancellationConstructType = @@ -2496,7 +2495,7 @@ std::optional OmpStructureChecker::GetCancelType( void OmpStructureChecker::CheckCancellationNest( const parser::CharBlock &source, llvm::omp::Directive type) { - llvm::StringRef typeName{llvm::omp::getOpenMPDirectiveName(type)}; + llvm::StringRef typeName{getDirectiveName(type)}; if (CurrentDirectiveIsNested()) { // If construct-type-clause is taskgroup, the cancellation construct must be @@ -4060,10 +4059,10 @@ void OmpStructureChecker::Enter(const parser::OmpClause::If &x) { if (auto *dnm{OmpGetUniqueModifier( modifiers)}) { llvm::omp::Directive sub{dnm->v}; - std::string subName{parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(sub).str())}; - std::string dirName{parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(dir).str())}; + std::string subName{ + parser::ToUpperCaseLetters(getDirectiveName(sub).str())}; + std::string dirName{ + 
parser::ToUpperCaseLetters(getDirectiveName(dir).str())}; parser::CharBlock modifierSource{OmpGetModifierSource(modifiers, dnm)}; auto desc{OmpGetDescriptor()}; @@ -5433,7 +5432,8 @@ llvm::StringRef OmpStructureChecker::getClauseName(llvm::omp::Clause clause) { llvm::StringRef OmpStructureChecker::getDirectiveName( llvm::omp::Directive directive) { - return llvm::omp::getOpenMPDirectiveName(directive); + unsigned version{context_.langOptions().OpenMPVersion}; + return llvm::omp::getOpenMPDirectiveName(directive, version); } const Symbol *OmpStructureChecker::GetObjectSymbol( diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index ee356e56e4458..12fc553518cfd 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -50,7 +50,7 @@ static void CollectSymbols( const Scope &, SymbolVector &, SymbolVector &, SourceOrderedSymbolSet &); static void PutPassName(llvm::raw_ostream &, const std::optional &); static void PutInit(llvm::raw_ostream &, const Symbol &, const MaybeExpr &, - const parser::Expr *); + const parser::Expr *, SemanticsContext &); static void PutInit(llvm::raw_ostream &, const MaybeIntExpr &); static void PutBound(llvm::raw_ostream &, const Bound &); static void PutShapeSpec(llvm::raw_ostream &, const ShapeSpec &); @@ -605,7 +605,7 @@ void ModFileWriter::PutDECStructure( } decls_ << ref->name(); PutShape(decls_, object->shape(), '(', ')'); - PutInit(decls_, *ref, object->init(), nullptr); + PutInit(decls_, *ref, object->init(), nullptr, context_); emittedDECFields_.insert(*ref); } else if (any) { break; // any later use of this structure will use RECORD/str/ @@ -944,7 +944,8 @@ void ModFileWriter::PutObjectEntity( getSymbolAttrsToWrite(symbol)); PutShape(os, details.shape(), '(', ')'); PutShape(os, details.coshape(), '[', ']'); - PutInit(os, symbol, details.init(), details.unanalyzedPDTComponentInit()); + PutInit(os, symbol, details.init(), details.unanalyzedPDTComponentInit(), + context_); os 
<< '\n'; if (auto tkr{GetIgnoreTKR(symbol)}; !tkr.empty()) { os << "!dir$ ignore_tkr("; @@ -1036,11 +1037,11 @@ void ModFileWriter::PutTypeParam(llvm::raw_ostream &os, const Symbol &symbol) { } void PutInit(llvm::raw_ostream &os, const Symbol &symbol, const MaybeExpr &init, - const parser::Expr *unanalyzed) { + const parser::Expr *unanalyzed, SemanticsContext &context) { if (IsNamedConstant(symbol) || symbol.owner().IsDerivedType()) { const char *assign{symbol.attrs().test(Attr::POINTER) ? "=>" : "="}; if (unanalyzed) { - parser::Unparse(os << assign, *unanalyzed); + parser::Unparse(os << assign, *unanalyzed, context.langOptions()); } else if (init) { init->AsFortran(os << assign); } diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 8b1caca34a6a7..60531538e6d59 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1887,6 +1887,7 @@ std::int64_t OmpAttributeVisitor::GetAssociatedLoopLevelFromClauses( // construct with multiple associated do-loops are lastprivate. 
void OmpAttributeVisitor::PrivatizeAssociatedLoopIndexAndCheckLoopLevel( const parser::OpenMPLoopConstruct &x) { + unsigned version{context_.langOptions().OpenMPVersion}; std::int64_t level{GetContext().associatedLoopLevel}; if (level <= 0) { return; @@ -1922,7 +1923,8 @@ void OmpAttributeVisitor::PrivatizeAssociatedLoopIndexAndCheckLoopLevel( context_.Say(GetContext().directiveSource, "A DO loop must follow the %s directive"_err_en_US, parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(GetContext().directive).str())); + llvm::omp::getOpenMPDirectiveName(GetContext().directive, version) + .str())); } } void OmpAttributeVisitor::CheckAssocLoopLevel( @@ -2442,6 +2444,7 @@ static bool SymbolOrEquivalentIsInNamelist(const Symbol &symbol) { void OmpAttributeVisitor::ResolveOmpObject( const parser::OmpObject &ompObject, Symbol::Flag ompFlag) { + unsigned version{context_.langOptions().OpenMPVersion}; common::visit( common::visitors{ [&](const parser::Designator &designator) { @@ -2464,7 +2467,7 @@ void OmpAttributeVisitor::ResolveOmpObject( Symbol::OmpFlagToClauseName(secondOmpFlag), parser::ToUpperCaseLetters( llvm::omp::getOpenMPDirectiveName( - GetContext().directive) + GetContext().directive, version) .str())); } }; @@ -2500,7 +2503,7 @@ void OmpAttributeVisitor::ResolveOmpObject( "in which the %s directive appears"_err_en_US, parser::ToUpperCaseLetters( llvm::omp::getOpenMPDirectiveName( - GetContext().directive) + GetContext().directive, version) .str())); } if (ompFlag == Symbol::Flag::OmpReduction) { @@ -2924,6 +2927,7 @@ void OmpAttributeVisitor::CheckSourceLabel(const parser::Label &label) { void OmpAttributeVisitor::CheckLabelContext(const parser::CharBlock source, const parser::CharBlock target, std::optional sourceContext, std::optional targetContext) { + unsigned version{context_.langOptions().OpenMPVersion}; if (targetContext && (!sourceContext || (sourceContext->scope != targetContext->scope && @@ -2932,8 +2936,8 @@ void 
OmpAttributeVisitor::CheckLabelContext(const parser::CharBlock source, context_ .Say(source, "invalid branch into an OpenMP structured block"_err_en_US) .Attach(target, "In the enclosing %s directive branched into"_en_US, - parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(targetContext->directive) + parser::ToUpperCaseLetters(llvm::omp::getOpenMPDirectiveName( + targetContext->directive, version) .str())); } if (sourceContext && @@ -2945,8 +2949,8 @@ void OmpAttributeVisitor::CheckLabelContext(const parser::CharBlock source, .Say(source, "invalid branch leaving an OpenMP structured block"_err_en_US) .Attach(target, "Outside the enclosing %s directive"_en_US, - parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(sourceContext->directive) + parser::ToUpperCaseLetters(llvm::omp::getOpenMPDirectiveName( + sourceContext->directive, version) .str())); } } @@ -2979,12 +2983,14 @@ void OmpAttributeVisitor::CheckNameInAllocateStmt( } } } + unsigned version{context_.langOptions().OpenMPVersion}; context_.Say(source, "Object '%s' in %s directive not " "found in corresponding ALLOCATE statement"_err_en_US, name.ToString(), parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(GetContext().directive).str())); + llvm::omp::getOpenMPDirectiveName(GetContext().directive, version) + .str())); } void OmpAttributeVisitor::AddOmpRequiresToScope(Scope &scope, @@ -3030,9 +3036,10 @@ void OmpAttributeVisitor::IssueNonConformanceWarning( llvm::omp::Directive D, parser::CharBlock source) { std::string warnStr; llvm::raw_string_ostream warnStrOS(warnStr); + unsigned version{context_.langOptions().OpenMPVersion}; warnStrOS << "OpenMP directive " << parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(D).str()) + llvm::omp::getOpenMPDirectiveName(D, version).str()) << " has been deprecated"; auto setAlternativeStr = [&warnStrOS](llvm::StringRef alt) { diff --git a/flang/lib/Semantics/unparse-with-symbols.cpp 
b/flang/lib/Semantics/unparse-with-symbols.cpp index 2716d88efb9fb..634d46b8ccf40 100644 --- a/flang/lib/Semantics/unparse-with-symbols.cpp +++ b/flang/lib/Semantics/unparse-with-symbols.cpp @@ -107,13 +107,13 @@ void SymbolDumpVisitor::Post(const parser::Name &name) { } void UnparseWithSymbols(llvm::raw_ostream &out, const parser::Program &program, - parser::Encoding encoding) { + const common::LangOptions &langOpts, parser::Encoding encoding) { SymbolDumpVisitor visitor; parser::Walk(program, visitor); parser::preStatementType preStatement{ [&](const parser::CharBlock &location, llvm::raw_ostream &out, int indent) { visitor.PrintSymbols(location, out, indent); }}; - parser::Unparse(out, program, encoding, false, true, &preStatement); + parser::Unparse(out, program, langOpts, encoding, false, true, &preStatement); } // UnparseWithModules() @@ -150,6 +150,6 @@ void UnparseWithModules(llvm::raw_ostream &out, SemanticsContext &context, for (SymbolRef moduleRef : visitor.modulesUsed()) { writer.WriteClosure(out, *moduleRef, nonIntrinsicModulesWritten); } - parser::Unparse(out, program, encoding, false, true); + parser::Unparse(out, program, context.langOptions(), encoding, false, true); } } // namespace Fortran::semantics From flang-commits at lists.llvm.org Fri May 9 05:42:26 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 09 May 2025 05:42:26 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (PR #139131) In-Reply-To: Message-ID: <681df832.170a0220.dfa74.be5e@mx.google.com> https://github.com/kparzysz closed https://github.com/llvm/llvm-project/pull/139131 From flang-commits at lists.llvm.org Fri May 9 05:56:19 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 09 May 2025 05:56:19 -0700 (PDT) Subject: [flang-commits] [flang] 89822ff - Revert "[flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName 
(#139131)" Message-ID: <681dfb73.170a0220.167f10.d184@mx.google.com> Author: Krzysztof Parzyszek Date: 2025-05-09T07:56:10-05:00 New Revision: 89822ff5a8608570897c21a3c40fb450c53f603f URL: https://github.com/llvm/llvm-project/commit/89822ff5a8608570897c21a3c40fb450c53f603f DIFF: https://github.com/llvm/llvm-project/commit/89822ff5a8608570897c21a3c40fb450c53f603f.diff LOG: Revert "[flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (#139131)" This reverts commit 41aa67488c3ca33334ec79fb5216145c3644277c. Breaks build: https://lab.llvm.org/buildbot/#/builders/140/builds/22826 Added: Modified: flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp flang/include/flang/Parser/dump-parse-tree.h flang/include/flang/Parser/unparse.h flang/include/flang/Semantics/unparse-with-symbols.h flang/lib/Frontend/ParserActions.cpp flang/lib/Lower/OpenMP/ClauseProcessor.h flang/lib/Lower/OpenMP/Decomposer.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Parser/openmp-parsers.cpp flang/lib/Parser/parse-tree.cpp flang/lib/Parser/unparse.cpp flang/lib/Semantics/check-omp-structure.cpp flang/lib/Semantics/mod-file.cpp flang/lib/Semantics/resolve-directives.cpp flang/lib/Semantics/unparse-with-symbols.cpp Removed: ################################################################################ diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index bf66151d59950..dbbf86a6c6151 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -267,9 +267,8 @@ void OpenMPCounterVisitor::Post(const OmpScheduleClause::Kind &c) { "type=" + std::string{OmpScheduleClause::EnumToString(c)} + ";"; } void OpenMPCounterVisitor::Post(const OmpDirectiveNameModifier &c) { - clauseDetails += "name_modifier=" + - llvm::omp::getOpenMPDirectiveName(c.v, llvm::omp::FallbackVersion).str() + - ";"; + clauseDetails += + "name_modifier=" + 
llvm::omp::getOpenMPDirectiveName(c.v).str() + ";"; } void OpenMPCounterVisitor::Post(const OmpClause &c) { PostClauseCommon(normalize_clause_name(c.source.ToString())); diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index df9278697346f..a3721bc8410ba 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -17,7 +17,6 @@ #include "flang/Common/idioms.h" #include "flang/Common/indirection.h" #include "flang/Support/Fortran.h" -#include "llvm/Frontend/OpenMP/OMP.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -546,8 +545,8 @@ class ParseTreeDumper { NODE(parser, OmpBeginSectionsDirective) NODE(parser, OmpBlockDirective) static std::string GetNodeName(const llvm::omp::Directive &x) { - return llvm::Twine("llvm::omp::Directive = ", - llvm::omp::getOpenMPDirectiveName(x, llvm::omp::FallbackVersion)) + return llvm::Twine( + "llvm::omp::Directive = ", llvm::omp::getOpenMPDirectiveName(x)) .str(); } NODE(parser, OmpClause) diff --git a/flang/include/flang/Parser/unparse.h b/flang/include/flang/Parser/unparse.h index d796109ca8f86..40094ecbc85e5 100644 --- a/flang/include/flang/Parser/unparse.h +++ b/flang/include/flang/Parser/unparse.h @@ -18,10 +18,6 @@ namespace llvm { class raw_ostream; } -namespace Fortran::common { -class LangOptions; -} - namespace Fortran::evaluate { struct GenericExprWrapper; struct GenericAssignmentWrapper; @@ -51,18 +47,15 @@ struct AnalyzedObjectsAsFortran { // Converts parsed program (or fragment) to out as Fortran. 
template void Unparse(llvm::raw_ostream &out, const A &root, - const common::LangOptions &langOpts, Encoding encoding = Encoding::UTF_8, - bool capitalizeKeywords = true, bool backslashEscapes = true, - preStatementType *preStatement = nullptr, + Encoding encoding = Encoding::UTF_8, bool capitalizeKeywords = true, + bool backslashEscapes = true, preStatementType *preStatement = nullptr, AnalyzedObjectsAsFortran * = nullptr); extern template void Unparse(llvm::raw_ostream &out, const Program &program, - const common::LangOptions &langOpts, Encoding encoding, - bool capitalizeKeywords, bool backslashEscapes, + Encoding encoding, bool capitalizeKeywords, bool backslashEscapes, preStatementType *preStatement, AnalyzedObjectsAsFortran *); extern template void Unparse(llvm::raw_ostream &out, const Expr &expr, - const common::LangOptions &langOpts, Encoding encoding, - bool capitalizeKeywords, bool backslashEscapes, + Encoding encoding, bool capitalizeKeywords, bool backslashEscapes, preStatementType *preStatement, AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/include/flang/Semantics/unparse-with-symbols.h b/flang/include/flang/Semantics/unparse-with-symbols.h index 702911bbab627..5e18b3fc3063d 100644 --- a/flang/include/flang/Semantics/unparse-with-symbols.h +++ b/flang/include/flang/Semantics/unparse-with-symbols.h @@ -16,10 +16,6 @@ namespace llvm { class raw_ostream; } -namespace Fortran::common { -class LangOptions; -} - namespace Fortran::parser { struct Program; } @@ -27,7 +23,6 @@ struct Program; namespace Fortran::semantics { class SemanticsContext; void UnparseWithSymbols(llvm::raw_ostream &, const parser::Program &, - const common::LangOptions &, parser::Encoding encoding = parser::Encoding::UTF_8); void UnparseWithModules(llvm::raw_ostream &, SemanticsContext &, const parser::Program &, diff --git a/flang/lib/Frontend/ParserActions.cpp b/flang/lib/Frontend/ParserActions.cpp index 4fe575b06d29f..cc7e72f696f96 100644 --- 
a/flang/lib/Frontend/ParserActions.cpp +++ b/flang/lib/Frontend/ParserActions.cpp @@ -119,7 +119,7 @@ void debugUnparseNoSema(CompilerInstance &ci, llvm::raw_ostream &out) { auto &parseTree{ci.getParsing().parseTree()}; // TODO: Options should come from CompilerInvocation - Unparse(out, *parseTree, ci.getInvocation().getLangOpts(), + Unparse(out, *parseTree, /*encoding=*/parser::Encoding::UTF_8, /*capitalizeKeywords=*/true, /*backslashEscapes=*/false, /*preStatement=*/nullptr, @@ -131,7 +131,6 @@ void debugUnparseWithSymbols(CompilerInstance &ci) { auto &parseTree{*ci.getParsing().parseTree()}; semantics::UnparseWithSymbols(llvm::outs(), parseTree, - ci.getInvocation().getLangOpts(), /*encoding=*/parser::Encoding::UTF_8); } diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 7857ba3fd0845..2e4d911aab35e 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -200,11 +200,9 @@ void ClauseProcessor::processTODO(mlir::Location currentLocation, auto checkUnhandledClause = [&](llvm::omp::Clause id, const auto *x) { if (!x) return; - unsigned version = semaCtx.langOptions().OpenMPVersion; TODO(currentLocation, "Unhandled clause " + llvm::omp::getOpenMPClauseName(id).upper() + - " in " + - llvm::omp::getOpenMPDirectiveName(directive, version).upper() + + " in " + llvm::omp::getOpenMPDirectiveName(directive).upper() + " construct"); }; diff --git a/flang/lib/Lower/OpenMP/Decomposer.cpp b/flang/lib/Lower/OpenMP/Decomposer.cpp index 251cba9204adc..33568bf96b5df 100644 --- a/flang/lib/Lower/OpenMP/Decomposer.cpp +++ b/flang/lib/Lower/OpenMP/Decomposer.cpp @@ -70,7 +70,7 @@ struct ConstructDecomposition { namespace Fortran::lower::omp { LLVM_DUMP_METHOD llvm::raw_ostream &operator<<(llvm::raw_ostream &os, const UnitConstruct &uc) { - os << llvm::omp::getOpenMPDirectiveName(uc.id, llvm::omp::FallbackVersion); + os << llvm::omp::getOpenMPDirectiveName(uc.id); for (auto [index, 
clause] : llvm::enumerate(uc.clauses)) { os << (index == 0 ? '\t' : ' '); os << llvm::omp::getOpenMPClauseName(clause.id); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 62a00ae8f3714..544f31bb5054f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3754,11 +3754,9 @@ static void genOMPDispatch(lower::AbstractConverter &converter, item); break; case llvm::omp::Directive::OMPD_tile: - case llvm::omp::Directive::OMPD_unroll: { - unsigned version = semaCtx.langOptions().OpenMPVersion; + case llvm::omp::Directive::OMPD_unroll: TODO(loc, "Unhandled loop directive (" + - llvm::omp::getOpenMPDirectiveName(dir, version) + ")"); - } + llvm::omp::getOpenMPDirectiveName(dir) + ")"); // case llvm::omp::Directive::OMPD_workdistribute: case llvm::omp::Directive::OMPD_workshare: newOp = genWorkshareOp(converter, symTable, stmtCtx, semaCtx, eval, loc, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index ffee57144f7fb..c4728e0fabe61 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -125,8 +125,7 @@ OmpDirectiveNameParser::directives() const { void OmpDirectiveNameParser::initTokens(NameWithId *table) const { for (size_t i{0}, e{llvm::omp::Directive_enumSize}; i != e; ++i) { auto id{static_cast(i)}; - llvm::StringRef name{ - llvm::omp::getOpenMPDirectiveName(id, llvm::omp::FallbackVersion)}; + llvm::StringRef name{llvm::omp::getOpenMPDirectiveName(id)}; table[i] = std::make_pair(name.str(), id); } // Sort the table with respect to the directive name length in a descending diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 3dd87ad9a3650..5839e7862b38b 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -11,7 +11,6 @@ #include "flang/Common/indirection.h" #include "flang/Parser/tools.h" #include "flang/Parser/user-state.h" -#include 
"llvm/Frontend/OpenMP/OMP.h" #include "llvm/Support/raw_ostream.h" #include @@ -306,9 +305,7 @@ std::string OmpTraitSelectorName::ToString() const { return std::string(EnumToString(v)); }, [&](llvm::omp::Directive d) { - return llvm::omp::getOpenMPDirectiveName( - d, llvm::omp::FallbackVersion) - .str(); + return llvm::omp::getOpenMPDirectiveName(d).str(); }, [&](const std::string &s) { // return s; diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..1ee9096fcda56 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -17,7 +17,6 @@ #include "flang/Parser/parse-tree.h" #include "flang/Parser/tools.h" #include "flang/Support/Fortran.h" -#include "flang/Support/LangOptions.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -28,14 +27,12 @@ namespace Fortran::parser { class UnparseVisitor { public: - UnparseVisitor(llvm::raw_ostream &out, const common::LangOptions &langOpts, - int indentationAmount, Encoding encoding, bool capitalize, - bool backslashEscapes, preStatementType *preStatement, - AnalyzedObjectsAsFortran *asFortran) - : out_{out}, langOpts_{langOpts}, indentationAmount_{indentationAmount}, - encoding_{encoding}, capitalizeKeywords_{capitalize}, - backslashEscapes_{backslashEscapes}, preStatement_{preStatement}, - asFortran_{asFortran} {} + UnparseVisitor(llvm::raw_ostream &out, int indentationAmount, + Encoding encoding, bool capitalize, bool backslashEscapes, + preStatementType *preStatement, AnalyzedObjectsAsFortran *asFortran) + : out_{out}, indentationAmount_{indentationAmount}, encoding_{encoding}, + capitalizeKeywords_{capitalize}, backslashEscapes_{backslashEscapes}, + preStatement_{preStatement}, asFortran_{asFortran} {} // In nearly all cases, this code avoids defining Boolean-valued Pre() // callbacks for the parse tree walking framework in favor of two void @@ -2105,8 +2102,7 @@ class UnparseVisitor { Walk(":", std::get>(x.t)); } void Unparse(const 
llvm::omp::Directive &x) { - unsigned ompVersion{langOpts_.OpenMPVersion}; - Word(llvm::omp::getOpenMPDirectiveName(x, ompVersion).str()); + Word(llvm::omp::getOpenMPDirectiveName(x).str()); } void Unparse(const OmpDirectiveSpecification &x) { auto unparseArgs{[&]() { @@ -2171,8 +2167,7 @@ class UnparseVisitor { x.u); } void Unparse(const OmpDirectiveNameModifier &x) { - unsigned ompVersion{langOpts_.OpenMPVersion}; - Word(llvm::omp::getOpenMPDirectiveName(x.v, ompVersion)); + Word(llvm::omp::getOpenMPDirectiveName(x.v)); } void Unparse(const OmpIteratorSpecifier &x) { Walk(std::get(x.t)); @@ -3254,7 +3249,6 @@ class UnparseVisitor { } llvm::raw_ostream &out_; - const common::LangOptions &langOpts_; int indent_{0}; const int indentationAmount_{1}; int column_{1}; @@ -3347,20 +3341,17 @@ void UnparseVisitor::Word(const std::string_view &str) { } template -void Unparse(llvm::raw_ostream &out, const A &root, - const common::LangOptions &langOpts, Encoding encoding, +void Unparse(llvm::raw_ostream &out, const A &root, Encoding encoding, bool capitalizeKeywords, bool backslashEscapes, preStatementType *preStatement, AnalyzedObjectsAsFortran *asFortran) { - UnparseVisitor visitor{out, langOpts, 1, encoding, capitalizeKeywords, - backslashEscapes, preStatement, asFortran}; + UnparseVisitor visitor{out, 1, encoding, capitalizeKeywords, backslashEscapes, + preStatement, asFortran}; Walk(root, visitor); visitor.Done(); } -template void Unparse(llvm::raw_ostream &, const Program &, - const common::LangOptions &, Encoding, bool, bool, preStatementType *, - AnalyzedObjectsAsFortran *); -template void Unparse(llvm::raw_ostream &, const Expr &, - const common::LangOptions &, Encoding, bool, bool, preStatementType *, - AnalyzedObjectsAsFortran *); +template void Unparse(llvm::raw_ostream &, const Program &, Encoding, + bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); +template void Unparse(llvm::raw_ostream &, const Expr &, Encoding, bool, + bool, preStatementType *, 
AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 8f6a623508aa7..dd8e511642976 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2434,7 +2434,8 @@ void OmpStructureChecker::Enter( break; default: context_.Say(dirName.source, "%s is not a cancellable construct"_err_en_US, - parser::ToUpperCaseLetters(getDirectiveName(dirName.v).str())); + parser::ToUpperCaseLetters( + llvm::omp::getOpenMPDirectiveName(dirName.v).str())); break; } } @@ -2467,7 +2468,7 @@ std::optional OmpStructureChecker::GetCancelType( // Given clauses from CANCEL or CANCELLATION_POINT, identify the construct // to which the cancellation applies. std::optional cancelee; - llvm::StringRef cancelName{getDirectiveName(cancelDir)}; + llvm::StringRef cancelName{llvm::omp::getOpenMPDirectiveName(cancelDir)}; for (const parser::OmpClause &clause : maybeClauses->v) { using CancellationConstructType = @@ -2495,7 +2496,7 @@ std::optional OmpStructureChecker::GetCancelType( void OmpStructureChecker::CheckCancellationNest( const parser::CharBlock &source, llvm::omp::Directive type) { - llvm::StringRef typeName{getDirectiveName(type)}; + llvm::StringRef typeName{llvm::omp::getOpenMPDirectiveName(type)}; if (CurrentDirectiveIsNested()) { // If construct-type-clause is taskgroup, the cancellation construct must be @@ -4059,10 +4060,10 @@ void OmpStructureChecker::Enter(const parser::OmpClause::If &x) { if (auto *dnm{OmpGetUniqueModifier( modifiers)}) { llvm::omp::Directive sub{dnm->v}; - std::string subName{ - parser::ToUpperCaseLetters(getDirectiveName(sub).str())}; - std::string dirName{ - parser::ToUpperCaseLetters(getDirectiveName(dir).str())}; + std::string subName{parser::ToUpperCaseLetters( + llvm::omp::getOpenMPDirectiveName(sub).str())}; + std::string dirName{parser::ToUpperCaseLetters( + 
llvm::omp::getOpenMPDirectiveName(dir).str())}; parser::CharBlock modifierSource{OmpGetModifierSource(modifiers, dnm)}; auto desc{OmpGetDescriptor()}; @@ -5432,8 +5433,7 @@ llvm::StringRef OmpStructureChecker::getClauseName(llvm::omp::Clause clause) { llvm::StringRef OmpStructureChecker::getDirectiveName( llvm::omp::Directive directive) { - unsigned version{context_.langOptions().OpenMPVersion}; - return llvm::omp::getOpenMPDirectiveName(directive, version); + return llvm::omp::getOpenMPDirectiveName(directive); } const Symbol *OmpStructureChecker::GetObjectSymbol( diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index 12fc553518cfd..ee356e56e4458 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -50,7 +50,7 @@ static void CollectSymbols( const Scope &, SymbolVector &, SymbolVector &, SourceOrderedSymbolSet &); static void PutPassName(llvm::raw_ostream &, const std::optional &); static void PutInit(llvm::raw_ostream &, const Symbol &, const MaybeExpr &, - const parser::Expr *, SemanticsContext &); + const parser::Expr *); static void PutInit(llvm::raw_ostream &, const MaybeIntExpr &); static void PutBound(llvm::raw_ostream &, const Bound &); static void PutShapeSpec(llvm::raw_ostream &, const ShapeSpec &); @@ -605,7 +605,7 @@ void ModFileWriter::PutDECStructure( } decls_ << ref->name(); PutShape(decls_, object->shape(), '(', ')'); - PutInit(decls_, *ref, object->init(), nullptr, context_); + PutInit(decls_, *ref, object->init(), nullptr); emittedDECFields_.insert(*ref); } else if (any) { break; // any later use of this structure will use RECORD/str/ @@ -944,8 +944,7 @@ void ModFileWriter::PutObjectEntity( getSymbolAttrsToWrite(symbol)); PutShape(os, details.shape(), '(', ')'); PutShape(os, details.coshape(), '[', ']'); - PutInit(os, symbol, details.init(), details.unanalyzedPDTComponentInit(), - context_); + PutInit(os, symbol, details.init(), details.unanalyzedPDTComponentInit()); os << '\n'; if 
(auto tkr{GetIgnoreTKR(symbol)}; !tkr.empty()) { os << "!dir$ ignore_tkr("; @@ -1037,11 +1036,11 @@ void ModFileWriter::PutTypeParam(llvm::raw_ostream &os, const Symbol &symbol) { } void PutInit(llvm::raw_ostream &os, const Symbol &symbol, const MaybeExpr &init, - const parser::Expr *unanalyzed, SemanticsContext &context) { + const parser::Expr *unanalyzed) { if (IsNamedConstant(symbol) || symbol.owner().IsDerivedType()) { const char *assign{symbol.attrs().test(Attr::POINTER) ? "=>" : "="}; if (unanalyzed) { - parser::Unparse(os << assign, *unanalyzed, context.langOptions()); + parser::Unparse(os << assign, *unanalyzed); } else if (init) { init->AsFortran(os << assign); } diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 60531538e6d59..8b1caca34a6a7 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1887,7 +1887,6 @@ std::int64_t OmpAttributeVisitor::GetAssociatedLoopLevelFromClauses( // construct with multiple associated do-loops are lastprivate. 
void OmpAttributeVisitor::PrivatizeAssociatedLoopIndexAndCheckLoopLevel( const parser::OpenMPLoopConstruct &x) { - unsigned version{context_.langOptions().OpenMPVersion}; std::int64_t level{GetContext().associatedLoopLevel}; if (level <= 0) { return; @@ -1923,8 +1922,7 @@ void OmpAttributeVisitor::PrivatizeAssociatedLoopIndexAndCheckLoopLevel( context_.Say(GetContext().directiveSource, "A DO loop must follow the %s directive"_err_en_US, parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(GetContext().directive, version) - .str())); + llvm::omp::getOpenMPDirectiveName(GetContext().directive).str())); } } void OmpAttributeVisitor::CheckAssocLoopLevel( @@ -2444,7 +2442,6 @@ static bool SymbolOrEquivalentIsInNamelist(const Symbol &symbol) { void OmpAttributeVisitor::ResolveOmpObject( const parser::OmpObject &ompObject, Symbol::Flag ompFlag) { - unsigned version{context_.langOptions().OpenMPVersion}; common::visit( common::visitors{ [&](const parser::Designator &designator) { @@ -2467,7 +2464,7 @@ void OmpAttributeVisitor::ResolveOmpObject( Symbol::OmpFlagToClauseName(secondOmpFlag), parser::ToUpperCaseLetters( llvm::omp::getOpenMPDirectiveName( - GetContext().directive, version) + GetContext().directive) .str())); } }; @@ -2503,7 +2500,7 @@ void OmpAttributeVisitor::ResolveOmpObject( "in which the %s directive appears"_err_en_US, parser::ToUpperCaseLetters( llvm::omp::getOpenMPDirectiveName( - GetContext().directive, version) + GetContext().directive) .str())); } if (ompFlag == Symbol::Flag::OmpReduction) { @@ -2927,7 +2924,6 @@ void OmpAttributeVisitor::CheckSourceLabel(const parser::Label &label) { void OmpAttributeVisitor::CheckLabelContext(const parser::CharBlock source, const parser::CharBlock target, std::optional sourceContext, std::optional targetContext) { - unsigned version{context_.langOptions().OpenMPVersion}; if (targetContext && (!sourceContext || (sourceContext->scope != targetContext->scope && @@ -2936,8 +2932,8 @@ void 
OmpAttributeVisitor::CheckLabelContext(const parser::CharBlock source, context_ .Say(source, "invalid branch into an OpenMP structured block"_err_en_US) .Attach(target, "In the enclosing %s directive branched into"_en_US, - parser::ToUpperCaseLetters(llvm::omp::getOpenMPDirectiveName( - targetContext->directive, version) + parser::ToUpperCaseLetters( + llvm::omp::getOpenMPDirectiveName(targetContext->directive) .str())); } if (sourceContext && @@ -2949,8 +2945,8 @@ void OmpAttributeVisitor::CheckLabelContext(const parser::CharBlock source, .Say(source, "invalid branch leaving an OpenMP structured block"_err_en_US) .Attach(target, "Outside the enclosing %s directive"_en_US, - parser::ToUpperCaseLetters(llvm::omp::getOpenMPDirectiveName( - sourceContext->directive, version) + parser::ToUpperCaseLetters( + llvm::omp::getOpenMPDirectiveName(sourceContext->directive) .str())); } } @@ -2983,14 +2979,12 @@ void OmpAttributeVisitor::CheckNameInAllocateStmt( } } } - unsigned version{context_.langOptions().OpenMPVersion}; context_.Say(source, "Object '%s' in %s directive not " "found in corresponding ALLOCATE statement"_err_en_US, name.ToString(), parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(GetContext().directive, version) - .str())); + llvm::omp::getOpenMPDirectiveName(GetContext().directive).str())); } void OmpAttributeVisitor::AddOmpRequiresToScope(Scope &scope, @@ -3036,10 +3030,9 @@ void OmpAttributeVisitor::IssueNonConformanceWarning( llvm::omp::Directive D, parser::CharBlock source) { std::string warnStr; llvm::raw_string_ostream warnStrOS(warnStr); - unsigned version{context_.langOptions().OpenMPVersion}; warnStrOS << "OpenMP directive " << parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(D, version).str()) + llvm::omp::getOpenMPDirectiveName(D).str()) << " has been deprecated"; auto setAlternativeStr = [&warnStrOS](llvm::StringRef alt) { diff --git a/flang/lib/Semantics/unparse-with-symbols.cpp 
b/flang/lib/Semantics/unparse-with-symbols.cpp index 634d46b8ccf40..2716d88efb9fb 100644 --- a/flang/lib/Semantics/unparse-with-symbols.cpp +++ b/flang/lib/Semantics/unparse-with-symbols.cpp @@ -107,13 +107,13 @@ void SymbolDumpVisitor::Post(const parser::Name &name) { } void UnparseWithSymbols(llvm::raw_ostream &out, const parser::Program &program, - const common::LangOptions &langOpts, parser::Encoding encoding) { + parser::Encoding encoding) { SymbolDumpVisitor visitor; parser::Walk(program, visitor); parser::preStatementType preStatement{ [&](const parser::CharBlock &location, llvm::raw_ostream &out, int indent) { visitor.PrintSymbols(location, out, indent); }}; - parser::Unparse(out, program, langOpts, encoding, false, true, &preStatement); + parser::Unparse(out, program, encoding, false, true, &preStatement); } // UnparseWithModules() @@ -150,6 +150,6 @@ void UnparseWithModules(llvm::raw_ostream &out, SemanticsContext &context, for (SymbolRef moduleRef : visitor.modulesUsed()) { writer.WriteClosure(out, *moduleRef, nonIntrinsicModulesWritten); } - parser::Unparse(out, program, context.langOptions(), encoding, false, true); + parser::Unparse(out, program, encoding, false, true); } } // namespace Fortran::semantics From flang-commits at lists.llvm.org Fri May 9 06:05:11 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Fri, 09 May 2025 06:05:11 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add missing dependent dialects to MLIR passes (PR #139260) Message-ID: https://github.com/skatrak created https://github.com/llvm/llvm-project/pull/139260 This patch updates several passes to include the DLTI dialect, since their use of the `fir::support::getOrSetMLIRDataLayout()` utility function could, in some cases, require this dialect to be loaded in advance. 
Also, the `CUFComputeSharedMemoryOffsetsAndSize` pass has been updated with a dependency to the GPU dialect, as its invocation to `cuf::getOrCreateGPUModule()` would result in the same kind of error if no other operations or attributes from that dialect were present in the input MLIR module.
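As a rough illustration of why a pass has to declare the dialects it may create entities from, here is a toy model in plain C++. It is only a sketch: `ToyContext`, `ToyPass`, `runPass`, `succeedsOnFirOnlyInput` and the attribute name `dl_spec` are hypothetical stand-ins invented for this example, not MLIR or Flang APIs, and the real loading rules are more involved.

```cpp
// Toy model only: every type and name here is a hypothetical stand-in,
// not an MLIR or Flang class.
#include <functional>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for a context that tracks which dialects are loaded.
struct ToyContext {
  std::set<std::string> loaded;

  // In this model, creating an attribute from an unloaded dialect is an error.
  std::string makeAttr(const std::string &dialect,
                       const std::string &name) const {
    if (!loaded.count(dialect))
      throw std::runtime_error("dialect '" + dialect + "' is not loaded");
    return dialect + "." + name;
  }
};

// Stand-in for a pass: its declared dependencies plus the work it does.
struct ToyPass {
  std::vector<std::string> dependentDialects;
  std::function<std::string(ToyContext &)> run;
};

// Stand-in for a pass manager: load declared dependencies, then run the pass.
std::string runPass(ToyContext &ctx, const ToyPass &pass) {
  for (const std::string &dialect : pass.dependentDialects)
    ctx.loaded.insert(dialect);
  return pass.run(ctx);
}

// Runs a pass that needs a "dlti" attribute on an input that only uses "fir".
// Returns true if the pass completes, false if the dialect was missing.
bool succeedsOnFirOnlyInput(const std::vector<std::string> &deps) {
  ToyContext ctx;
  ctx.loaded = {"fir"};
  ToyPass pass{deps,
               [](ToyContext &c) { return c.makeAttr("dlti", "dl_spec"); }};
  try {
    runPass(ctx, pass);
    return true;
  } catch (const std::runtime_error &) {
    return false;
  }
}
```

In this model the pass only fails when the input does not already happen to load the dialect, which matches the "in some cases" wording of the description: a pass that omits the declaration can appear to work until it meets such an input.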


From flang-commits at lists.llvm.org Fri May 9 06:05:47 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 06:05:47 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add missing dependent dialects to MLIR passes (PR #139260) In-Reply-To: Message-ID: <681dfdab.170a0220.2c5fd2.b983@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Sergio Afonso (skatrak)
Changes This patch updates several passes to include the DLTI dialect, since their use of the `fir::support::getOrSetMLIRDataLayout()` utility function could, in some cases, require this dialect to be loaded in advance. Also, the `CUFComputeSharedMemoryOffsetsAndSize` pass has been updated with a dependency to the GPU dialect, as its invocation to `cuf::getOrCreateGPUModule()` would result in the same kind of error if no other operations or attributes from that dialect were present in the input MLIR module. --- Full diff: https://github.com/llvm/llvm-project/pull/139260.diff 7 Files Affected: - (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+8-5) - (modified) flang/lib/Optimizer/Transforms/CUFAddConstructor.cpp (+1) - (modified) flang/lib/Optimizer/Transforms/CUFComputeSharedMemoryOffsetsAndSize.cpp (+1) - (modified) flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp (+1) - (modified) flang/lib/Optimizer/Transforms/CUFOpConversion.cpp (+1) - (modified) flang/lib/Optimizer/Transforms/LoopVersioning.cpp (+1) - (added) flang/test/Transforms/dlti-dependency.fir (+21) ``````````diff diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index 3243b44df9c7a..c0d88a8e19f80 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -356,7 +356,7 @@ def LoopVersioning : Pass<"loop-versioning", "mlir::func::FuncOp"> { an array has element sized stride. The element sizes stride allows some loops to be vectorized as well as other loop optimizations. 
}]; - let dependentDialects = [ "fir::FIROpsDialect" ]; + let dependentDialects = [ "fir::FIROpsDialect", "mlir::DLTIDialect" ]; } def VScaleAttr : Pass<"vscale-attr", "mlir::func::FuncOp"> { @@ -436,7 +436,7 @@ def AssumedRankOpConversion : Pass<"fir-assumed-rank-op", "mlir::ModuleOp"> { def CUFOpConversion : Pass<"cuf-convert", "mlir::ModuleOp"> { let summary = "Convert some CUF operations to runtime calls"; let dependentDialects = [ - "fir::FIROpsDialect", "mlir::gpu::GPUDialect" + "fir::FIROpsDialect", "mlir::gpu::GPUDialect", "mlir::DLTIDialect" ]; } @@ -451,14 +451,14 @@ def CUFDeviceGlobal : def CUFAddConstructor : Pass<"cuf-add-constructor", "mlir::ModuleOp"> { let summary = "Add constructor to register CUDA Fortran allocators"; let dependentDialects = [ - "cuf::CUFDialect", "mlir::func::FuncDialect" + "cuf::CUFDialect", "mlir::func::FuncDialect", "mlir::DLTIDialect" ]; } def CUFGPUToLLVMConversion : Pass<"cuf-gpu-convert-to-llvm", "mlir::ModuleOp"> { let summary = "Convert some GPU operations lowered from CUF to runtime calls"; let dependentDialects = [ - "mlir::LLVM::LLVMDialect" + "mlir::LLVM::LLVMDialect", "mlir::DLTIDialect" ]; } @@ -472,7 +472,10 @@ def CUFComputeSharedMemoryOffsetsAndSize the global and set it. 
}]; - let dependentDialects = ["cuf::CUFDialect", "fir::FIROpsDialect"]; + let dependentDialects = [ + "cuf::CUFDialect", "fir::FIROpsDialect", "mlir::gpu::GPUDialect", + "mlir::DLTIDialect" + ]; } def SetRuntimeCallAttributes diff --git a/flang/lib/Optimizer/Transforms/CUFAddConstructor.cpp b/flang/lib/Optimizer/Transforms/CUFAddConstructor.cpp index 064f0f363f699..2dd6950b34897 100644 --- a/flang/lib/Optimizer/Transforms/CUFAddConstructor.cpp +++ b/flang/lib/Optimizer/Transforms/CUFAddConstructor.cpp @@ -22,6 +22,7 @@ #include "flang/Optimizer/Support/DataLayout.h" #include "flang/Runtime/CUDA/registration.h" #include "flang/Runtime/entry-names.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/Dialect/LLVMIR/LLVMAttrs.h" #include "mlir/Dialect/LLVMIR/LLVMDialect.h" diff --git a/flang/lib/Optimizer/Transforms/CUFComputeSharedMemoryOffsetsAndSize.cpp b/flang/lib/Optimizer/Transforms/CUFComputeSharedMemoryOffsetsAndSize.cpp index 8009522a82e27..f6381ef8a8a21 100644 --- a/flang/lib/Optimizer/Transforms/CUFComputeSharedMemoryOffsetsAndSize.cpp +++ b/flang/lib/Optimizer/Transforms/CUFComputeSharedMemoryOffsetsAndSize.cpp @@ -22,6 +22,7 @@ #include "flang/Optimizer/Support/DataLayout.h" #include "flang/Runtime/CUDA/registration.h" #include "flang/Runtime/entry-names.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/Dialect/LLVMIR/LLVMDialect.h" #include "mlir/IR/Value.h" diff --git a/flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp b/flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp index 2549fdcb8baee..fe69ffa8350af 100644 --- a/flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp +++ b/flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp @@ -14,6 +14,7 @@ #include "flang/Runtime/CUDA/common.h" #include "flang/Support/Fortran.h" #include "mlir/Conversion/LLVMCommon/Pattern.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include 
"mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/Dialect/LLVMIR/NVVMDialect.h" #include "mlir/Pass/Pass.h" diff --git a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp index e70ceb3a67d98..7477a3c53c3ef 100644 --- a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp @@ -24,6 +24,7 @@ #include "flang/Runtime/allocatable.h" #include "flang/Support/Fortran.h" #include "mlir/Conversion/LLVMCommon/Pattern.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/IR/Matchers.h" #include "mlir/Pass/Pass.h" diff --git a/flang/lib/Optimizer/Transforms/LoopVersioning.cpp b/flang/lib/Optimizer/Transforms/LoopVersioning.cpp index 42e149bb3dba2..94dd8db2bafcb 100644 --- a/flang/lib/Optimizer/Transforms/LoopVersioning.cpp +++ b/flang/lib/Optimizer/Transforms/LoopVersioning.cpp @@ -51,6 +51,7 @@ #include "flang/Optimizer/Dialect/Support/KindMapping.h" #include "flang/Optimizer/Support/DataLayout.h" #include "flang/Optimizer/Transforms/Passes.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/LLVMIR/LLVMDialect.h" #include "mlir/IR/Dominance.h" #include "mlir/IR/Matchers.h" diff --git a/flang/test/Transforms/dlti-dependency.fir b/flang/test/Transforms/dlti-dependency.fir new file mode 100644 index 0000000000000..c1c3da19fb8d6 --- /dev/null +++ b/flang/test/Transforms/dlti-dependency.fir @@ -0,0 +1,21 @@ +// This test only makes sure that passes with a DLTI dialect dependency are able +// to obtain the dlti.dl_spec module attribute from an llvm.data_layout string. +// +// If dependencies for the pass are not properly set, this test causes a +// compiler error due to the DLTI dialect not being loaded. 
+ +// RUN: fir-opt --add-debug-info %s +// RUN: fir-opt --cuf-add-constructor %s +// RUN: fir-opt --cuf-compute-shared-memory %s +// RUN: fir-opt --cuf-gpu-convert-to-llvm %s +// RUN: fir-opt --cuf-convert %s +// RUN: fir-opt --loop-versioning %s + +module attributes {llvm.data_layout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"} { + llvm.func @foo(%arg0 : i32) { + llvm.return + } +} + +// CHECK: module attributes { +// CHECK-SAME: dlti.dl_spec = #dlti.dl_spec< ``````````
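The reason a pass has to declare its dependent dialects can be illustrated with a small toy model. This is only a sketch and not MLIR code: `ToyContext`, `ToyPass`, and `runPass` are hypothetical names invented here. The assumption is that a context rejects attributes from dialects that have not been loaded, and that the pass manager loads each pass's declared dependencies before the pass runs.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a context: attributes can only be created for
// dialects that were loaded beforehand.
struct ToyContext {
  std::set<std::string> loaded;

  void load(const std::string &dialect) { loaded.insert(dialect); }

  // Throws if the dialect owning the attribute has not been loaded.
  void createAttr(const std::string &dialect) const {
    if (loaded.count(dialect) == 0)
      throw std::runtime_error("dialect '" + dialect + "' not loaded");
  }
};

// Hypothetical pass description: what it declares and what it really uses.
struct ToyPass {
  std::string name;
  std::vector<std::string> dependentDialects; // declared up front
  std::vector<std::string> dialectsUsedAtRun; // touched while running
};

// Toy pass manager: loads the declared dependencies, then "runs" the pass.
// Returns false if the pass touched a dialect that was never loaded.
bool runPass(ToyContext &ctx, const ToyPass &pass) {
  for (const std::string &dialect : pass.dependentDialects)
    ctx.load(dialect);
  try {
    for (const std::string &dialect : pass.dialectsUsedAtRun)
      ctx.createAttr(dialect);
  } catch (const std::runtime_error &) {
    return false;
  }
  return true;
}
```

In this model, a pass that declares only `fir` but creates a `dlti` attribute fails on an input that contains nothing from `dlti`, while the same pass succeeds once `dlti` is added to its declared list. That mirrors the kind of failure the new `dlti-dependency.fir` test is meant to catch.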
https://github.com/llvm/llvm-project/pull/139260 From flang-commits at lists.llvm.org Fri May 9 06:08:37 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 09 May 2025 06:08:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix spurious error on defined assignment in PURE (PR #139186) In-Reply-To: Message-ID: <681dfe55.170a0220.2f9c30.cb9d@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/139186 From flang-commits at lists.llvm.org Fri May 9 06:14:08 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 09 May 2025 06:14:08 -0700 (PDT) Subject: [flang-commits] [flang] a68f35a - [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (#139131) Message-ID: <681dffa0.170a0220.12d964.cbb4@mx.google.com> Author: Krzysztof Parzyszek Date: 2025-05-09T08:13:52-05:00 New Revision: a68f35a17db03a6633a660d310156f4e2f17197f URL: https://github.com/llvm/llvm-project/commit/a68f35a17db03a6633a660d310156f4e2f17197f DIFF: https://github.com/llvm/llvm-project/commit/a68f35a17db03a6633a660d310156f4e2f17197f.diff LOG: [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (#139131) The OpenMP version is stored in LangOptions in SemanticsContext. Use the fallback version where SemanticsContext is unavailable (mostly in case of debug dumps). RFC: https://discourse.llvm.org/t/rfc-alternative-spellings-of-openmp-directives/85507 Reland with a fix for build break in f18-parse-demo. 
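The pattern described in the log message (take the version from the language options when they are available, otherwise use a fallback) can be sketched in isolation. Everything below is a hypothetical toy, not the LLVM API: `ToyDirective`, `ToyLangOptions`, `toyDirectiveName`, `nameFor`, and the value of `kToyFallbackVersion` are assumptions made for illustration, and the version-dependent spelling is only an example of what the linked RFC is about.

```cpp
#include <cassert>
#include <string>

// Hypothetical directive ids for the sketch.
enum class ToyDirective { TargetData, Parallel };

// Hypothetical options struct carrying the OpenMP version (e.g. 52 for 5.2).
struct ToyLangOptions {
  unsigned OpenMPVersion;
};

// Assumed value, chosen for this sketch only.
constexpr unsigned kToyFallbackVersion = 52;

// Returns a spelling that may depend on the version. The underscore form
// for newer versions is illustrative.
std::string toyDirectiveName(ToyDirective d, unsigned version) {
  switch (d) {
  case ToyDirective::TargetData:
    return version >= 60 ? "target_data" : "target data";
  case ToyDirective::Parallel:
    return "parallel";
  }
  return "unknown";
}

// Uses the version from the options when present; otherwise falls back,
// as a debug dump without a semantics context would have to.
std::string nameFor(ToyDirective d, const ToyLangOptions *opts) {
  unsigned version = opts ? opts->OpenMPVersion : kToyFallbackVersion;
  return toyDirectiveName(d, version);
}
```

Call sites that have a context pass its options; call sites such as dump routines pass a null pointer and get the fallback spelling.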
Added: Modified: flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp flang/include/flang/Parser/dump-parse-tree.h flang/include/flang/Parser/unparse.h flang/include/flang/Semantics/unparse-with-symbols.h flang/lib/Frontend/ParserActions.cpp flang/lib/Lower/OpenMP/ClauseProcessor.h flang/lib/Lower/OpenMP/Decomposer.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Parser/openmp-parsers.cpp flang/lib/Parser/parse-tree.cpp flang/lib/Parser/unparse.cpp flang/lib/Semantics/check-omp-structure.cpp flang/lib/Semantics/mod-file.cpp flang/lib/Semantics/resolve-directives.cpp flang/lib/Semantics/unparse-with-symbols.cpp flang/tools/f18-parse-demo/f18-parse-demo.cpp Removed: ################################################################################ diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index dbbf86a6c6151..bf66151d59950 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -267,8 +267,9 @@ void OpenMPCounterVisitor::Post(const OmpScheduleClause::Kind &c) { "type=" + std::string{OmpScheduleClause::EnumToString(c)} + ";"; } void OpenMPCounterVisitor::Post(const OmpDirectiveNameModifier &c) { - clauseDetails += - "name_modifier=" + llvm::omp::getOpenMPDirectiveName(c.v).str() + ";"; + clauseDetails += "name_modifier=" + + llvm::omp::getOpenMPDirectiveName(c.v, llvm::omp::FallbackVersion).str() + + ";"; } void OpenMPCounterVisitor::Post(const OmpClause &c) { PostClauseCommon(normalize_clause_name(c.source.ToString())); diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index a3721bc8410ba..df9278697346f 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -17,6 +17,7 @@ #include "flang/Common/idioms.h" #include "flang/Common/indirection.h" #include "flang/Support/Fortran.h" +#include 
"llvm/Frontend/OpenMP/OMP.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -545,8 +546,8 @@ class ParseTreeDumper { NODE(parser, OmpBeginSectionsDirective) NODE(parser, OmpBlockDirective) static std::string GetNodeName(const llvm::omp::Directive &x) { - return llvm::Twine( - "llvm::omp::Directive = ", llvm::omp::getOpenMPDirectiveName(x)) + return llvm::Twine("llvm::omp::Directive = ", + llvm::omp::getOpenMPDirectiveName(x, llvm::omp::FallbackVersion)) .str(); } NODE(parser, OmpClause) diff --git a/flang/include/flang/Parser/unparse.h b/flang/include/flang/Parser/unparse.h index 40094ecbc85e5..d796109ca8f86 100644 --- a/flang/include/flang/Parser/unparse.h +++ b/flang/include/flang/Parser/unparse.h @@ -18,6 +18,10 @@ namespace llvm { class raw_ostream; } +namespace Fortran::common { +class LangOptions; +} + namespace Fortran::evaluate { struct GenericExprWrapper; struct GenericAssignmentWrapper; @@ -47,15 +51,18 @@ struct AnalyzedObjectsAsFortran { // Converts parsed program (or fragment) to out as Fortran. 
template void Unparse(llvm::raw_ostream &out, const A &root, - Encoding encoding = Encoding::UTF_8, bool capitalizeKeywords = true, - bool backslashEscapes = true, preStatementType *preStatement = nullptr, + const common::LangOptions &langOpts, Encoding encoding = Encoding::UTF_8, + bool capitalizeKeywords = true, bool backslashEscapes = true, + preStatementType *preStatement = nullptr, AnalyzedObjectsAsFortran * = nullptr); extern template void Unparse(llvm::raw_ostream &out, const Program &program, - Encoding encoding, bool capitalizeKeywords, bool backslashEscapes, + const common::LangOptions &langOpts, Encoding encoding, + bool capitalizeKeywords, bool backslashEscapes, preStatementType *preStatement, AnalyzedObjectsAsFortran *); extern template void Unparse(llvm::raw_ostream &out, const Expr &expr, - Encoding encoding, bool capitalizeKeywords, bool backslashEscapes, + const common::LangOptions &langOpts, Encoding encoding, + bool capitalizeKeywords, bool backslashEscapes, preStatementType *preStatement, AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/include/flang/Semantics/unparse-with-symbols.h b/flang/include/flang/Semantics/unparse-with-symbols.h index 5e18b3fc3063d..702911bbab627 100644 --- a/flang/include/flang/Semantics/unparse-with-symbols.h +++ b/flang/include/flang/Semantics/unparse-with-symbols.h @@ -16,6 +16,10 @@ namespace llvm { class raw_ostream; } +namespace Fortran::common { +class LangOptions; +} + namespace Fortran::parser { struct Program; } @@ -23,6 +27,7 @@ struct Program; namespace Fortran::semantics { class SemanticsContext; void UnparseWithSymbols(llvm::raw_ostream &, const parser::Program &, + const common::LangOptions &, parser::Encoding encoding = parser::Encoding::UTF_8); void UnparseWithModules(llvm::raw_ostream &, SemanticsContext &, const parser::Program &, diff --git a/flang/lib/Frontend/ParserActions.cpp b/flang/lib/Frontend/ParserActions.cpp index cc7e72f696f96..4fe575b06d29f 100644 --- 
a/flang/lib/Frontend/ParserActions.cpp +++ b/flang/lib/Frontend/ParserActions.cpp @@ -119,7 +119,7 @@ void debugUnparseNoSema(CompilerInstance &ci, llvm::raw_ostream &out) { auto &parseTree{ci.getParsing().parseTree()}; // TODO: Options should come from CompilerInvocation - Unparse(out, *parseTree, + Unparse(out, *parseTree, ci.getInvocation().getLangOpts(), /*encoding=*/parser::Encoding::UTF_8, /*capitalizeKeywords=*/true, /*backslashEscapes=*/false, /*preStatement=*/nullptr, @@ -131,6 +131,7 @@ void debugUnparseWithSymbols(CompilerInstance &ci) { auto &parseTree{*ci.getParsing().parseTree()}; semantics::UnparseWithSymbols(llvm::outs(), parseTree, + ci.getInvocation().getLangOpts(), /*encoding=*/parser::Encoding::UTF_8); } diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 2e4d911aab35e..7857ba3fd0845 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -200,9 +200,11 @@ void ClauseProcessor::processTODO(mlir::Location currentLocation, auto checkUnhandledClause = [&](llvm::omp::Clause id, const auto *x) { if (!x) return; + unsigned version = semaCtx.langOptions().OpenMPVersion; TODO(currentLocation, "Unhandled clause " + llvm::omp::getOpenMPClauseName(id).upper() + - " in " + llvm::omp::getOpenMPDirectiveName(directive).upper() + + " in " + + llvm::omp::getOpenMPDirectiveName(directive, version).upper() + " construct"); }; diff --git a/flang/lib/Lower/OpenMP/Decomposer.cpp b/flang/lib/Lower/OpenMP/Decomposer.cpp index 33568bf96b5df..251cba9204adc 100644 --- a/flang/lib/Lower/OpenMP/Decomposer.cpp +++ b/flang/lib/Lower/OpenMP/Decomposer.cpp @@ -70,7 +70,7 @@ struct ConstructDecomposition { namespace Fortran::lower::omp { LLVM_DUMP_METHOD llvm::raw_ostream &operator<<(llvm::raw_ostream &os, const UnitConstruct &uc) { - os << llvm::omp::getOpenMPDirectiveName(uc.id); + os << llvm::omp::getOpenMPDirectiveName(uc.id, llvm::omp::FallbackVersion); for (auto [index, 
clause] : llvm::enumerate(uc.clauses)) { os << (index == 0 ? '\t' : ' '); os << llvm::omp::getOpenMPClauseName(clause.id); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 544f31bb5054f..62a00ae8f3714 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3754,9 +3754,11 @@ static void genOMPDispatch(lower::AbstractConverter &converter, item); break; case llvm::omp::Directive::OMPD_tile: - case llvm::omp::Directive::OMPD_unroll: + case llvm::omp::Directive::OMPD_unroll: { + unsigned version = semaCtx.langOptions().OpenMPVersion; TODO(loc, "Unhandled loop directive (" + - llvm::omp::getOpenMPDirectiveName(dir) + ")"); + llvm::omp::getOpenMPDirectiveName(dir, version) + ")"); + } // case llvm::omp::Directive::OMPD_workdistribute: case llvm::omp::Directive::OMPD_workshare: newOp = genWorkshareOp(converter, symTable, stmtCtx, semaCtx, eval, loc, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index c4728e0fabe61..ffee57144f7fb 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -125,7 +125,8 @@ OmpDirectiveNameParser::directives() const { void OmpDirectiveNameParser::initTokens(NameWithId *table) const { for (size_t i{0}, e{llvm::omp::Directive_enumSize}; i != e; ++i) { auto id{static_cast(i)}; - llvm::StringRef name{llvm::omp::getOpenMPDirectiveName(id)}; + llvm::StringRef name{ + llvm::omp::getOpenMPDirectiveName(id, llvm::omp::FallbackVersion)}; table[i] = std::make_pair(name.str(), id); } // Sort the table with respect to the directive name length in a descending diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..3dd87ad9a3650 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -11,6 +11,7 @@ #include "flang/Common/indirection.h" #include "flang/Parser/tools.h" #include "flang/Parser/user-state.h" +#include 
"llvm/Frontend/OpenMP/OMP.h" #include "llvm/Support/raw_ostream.h" #include @@ -305,7 +306,9 @@ std::string OmpTraitSelectorName::ToString() const { return std::string(EnumToString(v)); }, [&](llvm::omp::Directive d) { - return llvm::omp::getOpenMPDirectiveName(d).str(); + return llvm::omp::getOpenMPDirectiveName( + d, llvm::omp::FallbackVersion) + .str(); }, [&](const std::string &s) { // return s; diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 1ee9096fcda56..a626888b7dfe5 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -17,6 +17,7 @@ #include "flang/Parser/parse-tree.h" #include "flang/Parser/tools.h" #include "flang/Support/Fortran.h" +#include "flang/Support/LangOptions.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -27,12 +28,14 @@ namespace Fortran::parser { class UnparseVisitor { public: - UnparseVisitor(llvm::raw_ostream &out, int indentationAmount, - Encoding encoding, bool capitalize, bool backslashEscapes, - preStatementType *preStatement, AnalyzedObjectsAsFortran *asFortran) - : out_{out}, indentationAmount_{indentationAmount}, encoding_{encoding}, - capitalizeKeywords_{capitalize}, backslashEscapes_{backslashEscapes}, - preStatement_{preStatement}, asFortran_{asFortran} {} + UnparseVisitor(llvm::raw_ostream &out, const common::LangOptions &langOpts, + int indentationAmount, Encoding encoding, bool capitalize, + bool backslashEscapes, preStatementType *preStatement, + AnalyzedObjectsAsFortran *asFortran) + : out_{out}, langOpts_{langOpts}, indentationAmount_{indentationAmount}, + encoding_{encoding}, capitalizeKeywords_{capitalize}, + backslashEscapes_{backslashEscapes}, preStatement_{preStatement}, + asFortran_{asFortran} {} // In nearly all cases, this code avoids defining Boolean-valued Pre() // callbacks for the parse tree walking framework in favor of two void @@ -2102,7 +2105,8 @@ class UnparseVisitor { Walk(":", std::get>(x.t)); } void Unparse(const 
llvm::omp::Directive &x) { - Word(llvm::omp::getOpenMPDirectiveName(x).str()); + unsigned ompVersion{langOpts_.OpenMPVersion}; + Word(llvm::omp::getOpenMPDirectiveName(x, ompVersion).str()); } void Unparse(const OmpDirectiveSpecification &x) { auto unparseArgs{[&]() { @@ -2167,7 +2171,8 @@ class UnparseVisitor { x.u); } void Unparse(const OmpDirectiveNameModifier &x) { - Word(llvm::omp::getOpenMPDirectiveName(x.v)); + unsigned ompVersion{langOpts_.OpenMPVersion}; + Word(llvm::omp::getOpenMPDirectiveName(x.v, ompVersion)); } void Unparse(const OmpIteratorSpecifier &x) { Walk(std::get(x.t)); @@ -3249,6 +3254,7 @@ class UnparseVisitor { } llvm::raw_ostream &out_; + const common::LangOptions &langOpts_; int indent_{0}; const int indentationAmount_{1}; int column_{1}; @@ -3341,17 +3347,20 @@ void UnparseVisitor::Word(const std::string_view &str) { } template -void Unparse(llvm::raw_ostream &out, const A &root, Encoding encoding, +void Unparse(llvm::raw_ostream &out, const A &root, + const common::LangOptions &langOpts, Encoding encoding, bool capitalizeKeywords, bool backslashEscapes, preStatementType *preStatement, AnalyzedObjectsAsFortran *asFortran) { - UnparseVisitor visitor{out, 1, encoding, capitalizeKeywords, backslashEscapes, - preStatement, asFortran}; + UnparseVisitor visitor{out, langOpts, 1, encoding, capitalizeKeywords, + backslashEscapes, preStatement, asFortran}; Walk(root, visitor); visitor.Done(); } -template void Unparse(llvm::raw_ostream &, const Program &, Encoding, - bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); -template void Unparse(llvm::raw_ostream &, const Expr &, Encoding, bool, - bool, preStatementType *, AnalyzedObjectsAsFortran *); +template void Unparse(llvm::raw_ostream &, const Program &, + const common::LangOptions &, Encoding, bool, bool, preStatementType *, + AnalyzedObjectsAsFortran *); +template void Unparse(llvm::raw_ostream &, const Expr &, + const common::LangOptions &, Encoding, bool, bool, preStatementType *, + 
AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dd8e511642976..8f6a623508aa7 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2434,8 +2434,7 @@ void OmpStructureChecker::Enter( break; default: context_.Say(dirName.source, "%s is not a cancellable construct"_err_en_US, - parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(dirName.v).str())); + parser::ToUpperCaseLetters(getDirectiveName(dirName.v).str())); break; } } @@ -2468,7 +2467,7 @@ std::optional OmpStructureChecker::GetCancelType( // Given clauses from CANCEL or CANCELLATION_POINT, identify the construct // to which the cancellation applies. std::optional cancelee; - llvm::StringRef cancelName{llvm::omp::getOpenMPDirectiveName(cancelDir)}; + llvm::StringRef cancelName{getDirectiveName(cancelDir)}; for (const parser::OmpClause &clause : maybeClauses->v) { using CancellationConstructType = @@ -2496,7 +2495,7 @@ std::optional OmpStructureChecker::GetCancelType( void OmpStructureChecker::CheckCancellationNest( const parser::CharBlock &source, llvm::omp::Directive type) { - llvm::StringRef typeName{llvm::omp::getOpenMPDirectiveName(type)}; + llvm::StringRef typeName{getDirectiveName(type)}; if (CurrentDirectiveIsNested()) { // If construct-type-clause is taskgroup, the cancellation construct must be @@ -4060,10 +4059,10 @@ void OmpStructureChecker::Enter(const parser::OmpClause::If &x) { if (auto *dnm{OmpGetUniqueModifier( modifiers)}) { llvm::omp::Directive sub{dnm->v}; - std::string subName{parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(sub).str())}; - std::string dirName{parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(dir).str())}; + std::string subName{ + parser::ToUpperCaseLetters(getDirectiveName(sub).str())}; + std::string dirName{ + 
parser::ToUpperCaseLetters(getDirectiveName(dir).str())}; parser::CharBlock modifierSource{OmpGetModifierSource(modifiers, dnm)}; auto desc{OmpGetDescriptor()}; @@ -5433,7 +5432,8 @@ llvm::StringRef OmpStructureChecker::getClauseName(llvm::omp::Clause clause) { llvm::StringRef OmpStructureChecker::getDirectiveName( llvm::omp::Directive directive) { - return llvm::omp::getOpenMPDirectiveName(directive); + unsigned version{context_.langOptions().OpenMPVersion}; + return llvm::omp::getOpenMPDirectiveName(directive, version); } const Symbol *OmpStructureChecker::GetObjectSymbol( diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index ee356e56e4458..12fc553518cfd 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -50,7 +50,7 @@ static void CollectSymbols( const Scope &, SymbolVector &, SymbolVector &, SourceOrderedSymbolSet &); static void PutPassName(llvm::raw_ostream &, const std::optional &); static void PutInit(llvm::raw_ostream &, const Symbol &, const MaybeExpr &, - const parser::Expr *); + const parser::Expr *, SemanticsContext &); static void PutInit(llvm::raw_ostream &, const MaybeIntExpr &); static void PutBound(llvm::raw_ostream &, const Bound &); static void PutShapeSpec(llvm::raw_ostream &, const ShapeSpec &); @@ -605,7 +605,7 @@ void ModFileWriter::PutDECStructure( } decls_ << ref->name(); PutShape(decls_, object->shape(), '(', ')'); - PutInit(decls_, *ref, object->init(), nullptr); + PutInit(decls_, *ref, object->init(), nullptr, context_); emittedDECFields_.insert(*ref); } else if (any) { break; // any later use of this structure will use RECORD/str/ @@ -944,7 +944,8 @@ void ModFileWriter::PutObjectEntity( getSymbolAttrsToWrite(symbol)); PutShape(os, details.shape(), '(', ')'); PutShape(os, details.coshape(), '[', ']'); - PutInit(os, symbol, details.init(), details.unanalyzedPDTComponentInit()); + PutInit(os, symbol, details.init(), details.unanalyzedPDTComponentInit(), + context_); os 
<< '\n'; if (auto tkr{GetIgnoreTKR(symbol)}; !tkr.empty()) { os << "!dir$ ignore_tkr("; @@ -1036,11 +1037,11 @@ void ModFileWriter::PutTypeParam(llvm::raw_ostream &os, const Symbol &symbol) { } void PutInit(llvm::raw_ostream &os, const Symbol &symbol, const MaybeExpr &init, - const parser::Expr *unanalyzed) { + const parser::Expr *unanalyzed, SemanticsContext &context) { if (IsNamedConstant(symbol) || symbol.owner().IsDerivedType()) { const char *assign{symbol.attrs().test(Attr::POINTER) ? "=>" : "="}; if (unanalyzed) { - parser::Unparse(os << assign, *unanalyzed); + parser::Unparse(os << assign, *unanalyzed, context.langOptions()); } else if (init) { init->AsFortran(os << assign); } diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 8b1caca34a6a7..60531538e6d59 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1887,6 +1887,7 @@ std::int64_t OmpAttributeVisitor::GetAssociatedLoopLevelFromClauses( // construct with multiple associated do-loops are lastprivate. 
void OmpAttributeVisitor::PrivatizeAssociatedLoopIndexAndCheckLoopLevel( const parser::OpenMPLoopConstruct &x) { + unsigned version{context_.langOptions().OpenMPVersion}; std::int64_t level{GetContext().associatedLoopLevel}; if (level <= 0) { return; @@ -1922,7 +1923,8 @@ void OmpAttributeVisitor::PrivatizeAssociatedLoopIndexAndCheckLoopLevel( context_.Say(GetContext().directiveSource, "A DO loop must follow the %s directive"_err_en_US, parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(GetContext().directive).str())); + llvm::omp::getOpenMPDirectiveName(GetContext().directive, version) + .str())); } } void OmpAttributeVisitor::CheckAssocLoopLevel( @@ -2442,6 +2444,7 @@ static bool SymbolOrEquivalentIsInNamelist(const Symbol &symbol) { void OmpAttributeVisitor::ResolveOmpObject( const parser::OmpObject &ompObject, Symbol::Flag ompFlag) { + unsigned version{context_.langOptions().OpenMPVersion}; common::visit( common::visitors{ [&](const parser::Designator &designator) { @@ -2464,7 +2467,7 @@ void OmpAttributeVisitor::ResolveOmpObject( Symbol::OmpFlagToClauseName(secondOmpFlag), parser::ToUpperCaseLetters( llvm::omp::getOpenMPDirectiveName( - GetContext().directive) + GetContext().directive, version) .str())); } }; @@ -2500,7 +2503,7 @@ void OmpAttributeVisitor::ResolveOmpObject( "in which the %s directive appears"_err_en_US, parser::ToUpperCaseLetters( llvm::omp::getOpenMPDirectiveName( - GetContext().directive) + GetContext().directive, version) .str())); } if (ompFlag == Symbol::Flag::OmpReduction) { @@ -2924,6 +2927,7 @@ void OmpAttributeVisitor::CheckSourceLabel(const parser::Label &label) { void OmpAttributeVisitor::CheckLabelContext(const parser::CharBlock source, const parser::CharBlock target, std::optional sourceContext, std::optional targetContext) { + unsigned version{context_.langOptions().OpenMPVersion}; if (targetContext && (!sourceContext || (sourceContext->scope != targetContext->scope && @@ -2932,8 +2936,8 @@ void 
OmpAttributeVisitor::CheckLabelContext(const parser::CharBlock source, context_ .Say(source, "invalid branch into an OpenMP structured block"_err_en_US) .Attach(target, "In the enclosing %s directive branched into"_en_US, - parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(targetContext->directive) + parser::ToUpperCaseLetters(llvm::omp::getOpenMPDirectiveName( + targetContext->directive, version) .str())); } if (sourceContext && @@ -2945,8 +2949,8 @@ void OmpAttributeVisitor::CheckLabelContext(const parser::CharBlock source, .Say(source, "invalid branch leaving an OpenMP structured block"_err_en_US) .Attach(target, "Outside the enclosing %s directive"_en_US, - parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(sourceContext->directive) + parser::ToUpperCaseLetters(llvm::omp::getOpenMPDirectiveName( + sourceContext->directive, version) .str())); } } @@ -2979,12 +2983,14 @@ void OmpAttributeVisitor::CheckNameInAllocateStmt( } } } + unsigned version{context_.langOptions().OpenMPVersion}; context_.Say(source, "Object '%s' in %s directive not " "found in corresponding ALLOCATE statement"_err_en_US, name.ToString(), parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(GetContext().directive).str())); + llvm::omp::getOpenMPDirectiveName(GetContext().directive, version) + .str())); } void OmpAttributeVisitor::AddOmpRequiresToScope(Scope &scope, @@ -3030,9 +3036,10 @@ void OmpAttributeVisitor::IssueNonConformanceWarning( llvm::omp::Directive D, parser::CharBlock source) { std::string warnStr; llvm::raw_string_ostream warnStrOS(warnStr); + unsigned version{context_.langOptions().OpenMPVersion}; warnStrOS << "OpenMP directive " << parser::ToUpperCaseLetters( - llvm::omp::getOpenMPDirectiveName(D).str()) + llvm::omp::getOpenMPDirectiveName(D, version).str()) << " has been deprecated"; auto setAlternativeStr = [&warnStrOS](llvm::StringRef alt) { diff --git a/flang/lib/Semantics/unparse-with-symbols.cpp 
b/flang/lib/Semantics/unparse-with-symbols.cpp index 2716d88efb9fb..634d46b8ccf40 100644 --- a/flang/lib/Semantics/unparse-with-symbols.cpp +++ b/flang/lib/Semantics/unparse-with-symbols.cpp @@ -107,13 +107,13 @@ void SymbolDumpVisitor::Post(const parser::Name &name) { } void UnparseWithSymbols(llvm::raw_ostream &out, const parser::Program &program, - parser::Encoding encoding) { + const common::LangOptions &langOpts, parser::Encoding encoding) { SymbolDumpVisitor visitor; parser::Walk(program, visitor); parser::preStatementType preStatement{ [&](const parser::CharBlock &location, llvm::raw_ostream &out, int indent) { visitor.PrintSymbols(location, out, indent); }}; - parser::Unparse(out, program, encoding, false, true, &preStatement); + parser::Unparse(out, program, langOpts, encoding, false, true, &preStatement); } // UnparseWithModules() @@ -150,6 +150,6 @@ void UnparseWithModules(llvm::raw_ostream &out, SemanticsContext &context, for (SymbolRef moduleRef : visitor.modulesUsed()) { writer.WriteClosure(out, *moduleRef, nonIntrinsicModulesWritten); } - parser::Unparse(out, program, encoding, false, true); + parser::Unparse(out, program, context.langOptions(), encoding, false, true); } } // namespace Fortran::semantics diff --git a/flang/tools/f18-parse-demo/f18-parse-demo.cpp b/flang/tools/f18-parse-demo/f18-parse-demo.cpp index a50c88dc84064..e25785bd784d4 100644 --- a/flang/tools/f18-parse-demo/f18-parse-demo.cpp +++ b/flang/tools/f18-parse-demo/f18-parse-demo.cpp @@ -30,6 +30,7 @@ #include "flang/Parser/provenance.h" #include "flang/Parser/unparse.h" #include "flang/Support/Fortran-features.h" +#include "flang/Support/LangOptions.h" #include "flang/Support/default-kinds.h" #include "llvm/Support/Errno.h" #include "llvm/Support/FileSystem.h" @@ -83,6 +84,7 @@ struct DriverOptions { bool compileOnly{false}; // -c std::string outputPath; // -o path std::vector searchDirectories{"."s}; // -I dir + Fortran::common::LangOptions langOpts; bool forcedForm{false}; // 
-Mfixed or -Mfree appeared bool warnOnNonstandardUsage{false}; // -Mstandard bool warnOnSuspiciousUsage{false}; // -pedantic @@ -219,7 +221,8 @@ std::string CompileFortran( return {}; } if (driver.dumpUnparse) { - Unparse(llvm::outs(), parseTree, driver.encoding, true /*capitalize*/, + Unparse(llvm::outs(), parseTree, driver.langOpts, driver.encoding, + true /*capitalize*/, options.features.IsEnabled( Fortran::common::LanguageFeature::BackslashEscapes)); return {}; @@ -240,7 +243,8 @@ std::string CompileFortran( std::exit(EXIT_FAILURE); } llvm::raw_fd_ostream tmpSource(fd, /*shouldClose*/ true); - Unparse(tmpSource, parseTree, driver.encoding, true /*capitalize*/, + Unparse(tmpSource, parseTree, driver.langOpts, driver.encoding, + true /*capitalize*/, options.features.IsEnabled( Fortran::common::LanguageFeature::BackslashEscapes)); } From flang-commits at lists.llvm.org Fri May 9 06:28:49 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Fri, 09 May 2025 06:28:49 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (PR #139131) In-Reply-To: Message-ID: <681e0311.050a0220.7e1d9.f52b@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `amdgpu-offload-rhel-8-cmake-build-only` running on `rocm-docker-rhel-8` while building `flang` at step 4 "annotate". Full details are available at: https://lab.llvm.org/buildbot/#/builders/204/builds/8855
Here is the relevant piece of the build log for the reference ``` Step 4 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py --jobs=32' (failure) ... [7271/7799] Building CXX object tools/flang/lib/Optimizer/Builder/CMakeFiles/FIRBuilder.dir/Runtime/Support.cpp.o [7272/7799] Building CXX object tools/flang/lib/Parser/CMakeFiles/FortranParser.dir/cmake_pch.hxx.gch [7273/7799] Building CXX object tools/flang/lib/Optimizer/Builder/CMakeFiles/FIRBuilder.dir/Runtime/Transformational.cpp.o [7274/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/common.cpp.o [7275/7799] Building CXX object tools/flang/lib/Optimizer/Builder/CMakeFiles/FIRBuilder.dir/FIRBuilder.cpp.o [7276/7799] Building CXX object tools/flang/unittests/Evaluate/CMakeFiles/folding.test.dir/folding.cpp.o [7277/7799] Building CXX object tools/flang/lib/Optimizer/Builder/CMakeFiles/FIRBuilder.dir/Runtime/Reduction.cpp.o [7278/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/complex.cpp.o [7279/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/host.cpp.o [7280/7799] Building CXX object tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o FAILED: tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o ccache /usr/bin/c++ -DFLANG_INCLUDE_TESTS=1 -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_STATIC -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/build/tools/flang/tools/f18-parse-demo -I/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/tools/f18-parse-demo -I/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/include -I/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/build/tools/flang/include 
-I/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/build/include -I/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/llvm/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/../mlir/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/build/tools/mlir/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/build/tools/clang/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/llvm/../clang/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-noexcept-type -Wno-unnecessary-virtual-specifier -Wdelete-non-virtual-dtor -Wno-comment -Wno-misleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-ctad-maybe-unsupported -fno-strict-aliasing -fno-semantic-interposition -fpch-preprocess -O3 -DNDEBUG -fno-semantic-interposition -std=c++17 -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o -MF tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o.d -o tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o -c /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp: In function ‘std::__cxx11::string CompileFortran(std::__cxx11::string, Fortran::parser::Options, DriverOptions&)’: /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:224:64: error: no 
matching function for call to ‘Unparse(llvm::raw_fd_ostream&, Fortran::parser::Program&, Fortran::parser::Encoding&, bool, bool)’ Fortran::common::LanguageFeature::BackslashEscapes)); ^ In file included from /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/include/flang/Parser/dump-parse-tree.h:16, from /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:25: /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: candidate: ‘template<class A> void Fortran::parser::Unparse(llvm::raw_ostream&, const A&, const Fortran::common::LangOptions&, Fortran::parser::Encoding, bool, bool, Fortran::parser::preStatementType*, Fortran::parser::AnalyzedObjectsAsFortran*)’ void Unparse(llvm::raw_ostream &out, const A &root, ^~~~~~~ /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: template argument deduction/substitution failed: /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:222:45: note: cannot convert ‘driver.DriverOptions::encoding’ (type ‘Fortran::parser::Encoding’) to type ‘const Fortran::common::LangOptions&’ Unparse(llvm::outs(), parseTree, driver.encoding, true /*capitalize*/, ~~~~~~~^~~~~~~~ /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:245:64: error: no matching function for call to ‘Unparse(llvm::raw_fd_ostream&, Fortran::parser::Program&, Fortran::parser::Encoding&, bool, bool)’ Fortran::common::LanguageFeature::BackslashEscapes)); ^ In file included from /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/include/flang/Parser/dump-parse-tree.h:16, from /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:25:
/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: candidate: ‘template<class A> void Fortran::parser::Unparse(llvm::raw_ostream&, const A&, const Fortran::common::LangOptions&, Fortran::parser::Encoding, bool, bool, Fortran::parser::preStatementType*, Fortran::parser::AnalyzedObjectsAsFortran*)’ void Unparse(llvm::raw_ostream &out, const A &root, ^~~~~~~ /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: template argument deduction/substitution failed: /home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:243:42: note: cannot convert ‘driver.DriverOptions::encoding’ (type ‘Fortran::parser::Encoding’) to type ‘const Fortran::common::LangOptions&’ Unparse(tmpSource, parseTree, driver.encoding, true /*capitalize*/, ~~~~~~~^~~~~~~~ At global scope: cc1plus: warning: unrecognized command line option ‘-Wno-ctad-maybe-unsupported’ cc1plus: warning: unrecognized command line option ‘-Wno-deprecated-copy’ cc1plus: warning: unrecognized command line option ‘-Wno-unnecessary-virtual-specifier’ [7281/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/constant.cpp.o [7282/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/integer.cpp.o [7283/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/static-data.cpp.o [7284/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/logical.cpp.o [7285/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/target.cpp.o [7286/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/call.cpp.o [7287/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/real.cpp.o [7288/7799] Building CXX object
tools/flang/lib/FrontendTool/CMakeFiles/flangFrontendTool.dir/ExecuteCompilerInvocation.cpp.o Step 7 (build cmake config) failure: build cmake config (failure) ... ```
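The errors in the log above come from call sites that still pass `driver.encoding` as the third argument, where the updated `Unparse` now expects a `const Fortran::common::LangOptions&`. The following is only a minimal sketch of that failure mode, not flang's code: `Encoding` and `LangOptions` are hypothetical stand-ins, the default arguments and the `CanUnparse` and `UnparseToString` helpers are invented for illustration, and only the leading parameter order is taken from the compiler's candidate note.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for Fortran::parser::Encoding and
// Fortran::common::LangOptions; only the parameter order matters here.
enum class Encoding { UTF_8, LATIN_1 };
struct LangOptions {
  unsigned OpenMPVersion{0}; // placeholder value, not flang's default
};

// Same leading parameter order as the candidate in the compiler note:
// (out, root, langOpts, encoding, capitalize, backslashEscapes).
// The default arguments are illustrative assumptions.
template <typename A>
void Unparse(std::ostream &out, const A &root, const LangOptions &langOpts,
    Encoding encoding, bool capitalize = true, bool backslashEscapes = true) {
  out << root << " omp=" << langOpts.OpenMPVersion << " cap=" << capitalize;
  (void)encoding;
  (void)backslashEscapes;
}

// Detects at compile time whether Unparse(Args...) is a well-formed call.
template <typename... Args>
auto CanUnparse(int)
    -> decltype(Unparse(std::declval<Args>()...), std::true_type{});
template <typename... Args> std::false_type CanUnparse(...);

// Old call shape (no LangOptions): Encoding cannot bind to const LangOptions&.
static_assert(!decltype(CanUnparse<std::ostream &, const std::string &,
                  Encoding, bool, bool>(0))::value,
    "old argument order must be rejected");
// Updated call shape, with the options object ahead of the encoding.
static_assert(decltype(CanUnparse<std::ostream &, const std::string &,
                  const LangOptions &, Encoding, bool, bool>(0))::value,
    "new argument order must be accepted");

// Renders a sample root using the updated call shape.
std::string UnparseToString(const std::string &root, const LangOptions &opts) {
  std::ostringstream os;
  Unparse(os, root, opts, Encoding::UTF_8, /*capitalize=*/true,
      /*backslashEscapes=*/false);
  return os.str();
}
```

Because the new parameter was inserted ahead of `encoding` rather than appended with a default, every existing caller has to be updated in the same change; the two `static_assert`s make that visible without needing a failing build.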
https://github.com/llvm/llvm-project/pull/139131 From flang-commits at lists.llvm.org Fri May 9 06:33:38 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 09 May 2025 06:33:38 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add missing dependent dialects to MLIR passes (PR #139260) In-Reply-To: Message-ID: <681e0432.170a0220.116692.c3ed@mx.google.com> https://github.com/tblah approved this pull request. LGTM but please wait for @clementval https://github.com/llvm/llvm-project/pull/139260 From flang-commits at lists.llvm.org Fri May 9 06:37:33 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Fri, 09 May 2025 06:37:33 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (PR #139131) In-Reply-To: Message-ID: <681e051d.050a0220.35a46c.f0b6@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `amdgpu-offload-rhel-9-cmake-build-only` running on `rocm-docker-rhel-9` while building `flang` at step 4 "annotate". Full details are available at: https://lab.llvm.org/buildbot/#/builders/205/builds/8833
Here is the relevant piece of the build log for the reference ``` Step 4 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py --jobs=32' (failure) ... [7297/7799] Building CXX object tools/flang/lib/Optimizer/Builder/CMakeFiles/FIRBuilder.dir/CUFCommon.cpp.o [7298/7799] Building CXX object tools/flang/unittests/Evaluate/CMakeFiles/folding.test.dir/folding.cpp.o [7299/7799] Building CXX object tools/flang/lib/Optimizer/Analysis/CMakeFiles/FIRAnalysis.dir/AliasAnalysis.cpp.o [7300/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/common.cpp.o [7301/7799] Building CXX object tools/flang/lib/Optimizer/Builder/CMakeFiles/FIRBuilder.dir/Runtime/Reduction.cpp.o [7302/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/complex.cpp.o [7303/7799] Building CXX object tools/flang/unittests/Evaluate/CMakeFiles/expression.test.dir/expression.cpp.o [7304/7799] Building CXX object tools/flang/tools/flang-driver/CMakeFiles/flang.dir/driver.cpp.o [7305/7799] Building CXX object tools/flang/lib/Optimizer/HLFIR/Transforms/CMakeFiles/HLFIRTransforms.dir/SimplifyHLFIRIntrinsics.cpp.o [7306/7799] Building CXX object tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o FAILED: tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o ccache /usr/bin/c++ -DFLANG_INCLUDE_TESTS=1 -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_STATIC -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/tools/flang/tools/f18-parse-demo -I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo -I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include -I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/tools/flang/include 
-I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/include -I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/../mlir/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/tools/mlir/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/tools/clang/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/../clang/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wno-unnecessary-virtual-specifier -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-ctad-maybe-unsupported -fno-strict-aliasing -fno-semantic-interposition -fpch-preprocess -O3 -DNDEBUG -fno-semantic-interposition -std=c++17 -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o -MF tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o.d -o tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o -c /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp: In function ‘std::string CompileFortran(std::string, Fortran::parser::Options, DriverOptions&)’: 
/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:222:12: error: no matching function for call to ‘Unparse(llvm::raw_fd_ostream&, Fortran::parser::Program&, Fortran::parser::Encoding&, bool, bool)’ 222 | Unparse(llvm::outs(), parseTree, driver.encoding, true /*capitalize*/, | ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 223 | options.features.IsEnabled( | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 224 | Fortran::common::LanguageFeature::BackslashEscapes)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/dump-parse-tree.h:16, from /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:25: /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: candidate: ‘template<class A> void Fortran::parser::Unparse(llvm::raw_ostream&, const A&, const Fortran::common::LangOptions&, Fortran::parser::Encoding, bool, bool, Fortran::parser::preStatementType*, Fortran::parser::AnalyzedObjectsAsFortran*)’ 53 | void Unparse(llvm::raw_ostream &out, const A &root, | ^~~~~~~ /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: template argument deduction/substitution failed: /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:222:45: note: cannot convert ‘driver.DriverOptions::encoding’ (type ‘Fortran::parser::Encoding’) to type ‘const Fortran::common::LangOptions&’ 222 | Unparse(llvm::outs(), parseTree, driver.encoding, true /*capitalize*/, | ~~~~~~~^~~~~~~~ /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:243:12: error: no matching function for call to
‘Unparse(llvm::raw_fd_ostream&, Fortran::parser::Program&, Fortran::parser::Encoding&, bool, bool)’ 243 | Unparse(tmpSource, parseTree, driver.encoding, true /*capitalize*/, | ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 244 | options.features.IsEnabled( | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 245 | Fortran::common::LanguageFeature::BackslashEscapes)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/dump-parse-tree.h:16, from /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:25: /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: candidate: ‘template<class A> void Fortran::parser::Unparse(llvm::raw_ostream&, const A&, const Fortran::common::LangOptions&, Fortran::parser::Encoding, bool, bool, Fortran::parser::preStatementType*, Fortran::parser::AnalyzedObjectsAsFortran*)’ 53 | void Unparse(llvm::raw_ostream &out, const A &root, | ^~~~~~~ /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: template argument deduction/substitution failed: /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:243:42: note: cannot convert ‘driver.DriverOptions::encoding’ (type ‘Fortran::parser::Encoding’) to type ‘const Fortran::common::LangOptions&’ 243 | Unparse(tmpSource, parseTree, driver.encoding, true /*capitalize*/, | ~~~~~~~^~~~~~~~ At global scope: cc1plus: note: unrecognized command-line option ‘-Wno-unnecessary-virtual-specifier’ may have been intended to silence earlier diagnostics [7307/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/host.cpp.o [7308/7799] Building CXX object
tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIROps.cpp.o Step 7 (build cmake config) failure: build cmake config (failure) ... [7297/7799] Building CXX object tools/flang/lib/Optimizer/Builder/CMakeFiles/FIRBuilder.dir/CUFCommon.cpp.o [7298/7799] Building CXX object tools/flang/unittests/Evaluate/CMakeFiles/folding.test.dir/folding.cpp.o [7299/7799] Building CXX object tools/flang/lib/Optimizer/Analysis/CMakeFiles/FIRAnalysis.dir/AliasAnalysis.cpp.o [7300/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/common.cpp.o [7301/7799] Building CXX object tools/flang/lib/Optimizer/Builder/CMakeFiles/FIRBuilder.dir/Runtime/Reduction.cpp.o [7302/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/complex.cpp.o [7303/7799] Building CXX object tools/flang/unittests/Evaluate/CMakeFiles/expression.test.dir/expression.cpp.o [7304/7799] Building CXX object tools/flang/tools/flang-driver/CMakeFiles/flang.dir/driver.cpp.o [7305/7799] Building CXX object tools/flang/lib/Optimizer/HLFIR/Transforms/CMakeFiles/HLFIRTransforms.dir/SimplifyHLFIRIntrinsics.cpp.o [7306/7799] Building CXX object tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o FAILED: tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o ccache /usr/bin/c++ -DFLANG_INCLUDE_TESTS=1 -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_STATIC -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/tools/flang/tools/f18-parse-demo -I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo -I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include -I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/tools/flang/include -I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/include 
-I/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/../mlir/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/tools/mlir/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/tools/clang/include -isystem /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/../clang/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wno-unnecessary-virtual-specifier -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-ctad-maybe-unsupported -fno-strict-aliasing -fno-semantic-interposition -fpch-preprocess -O3 -DNDEBUG -fno-semantic-interposition -std=c++17 -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o -MF tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o.d -o tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o -c /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp: In function ‘std::string CompileFortran(std::string, Fortran::parser::Options, DriverOptions&)’: /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:222:12: error: no matching 
function for call to ‘Unparse(llvm::raw_fd_ostream&, Fortran::parser::Program&, Fortran::parser::Encoding&, bool, bool)’ 222 | Unparse(llvm::outs(), parseTree, driver.encoding, true /*capitalize*/, | ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 223 | options.features.IsEnabled( | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 224 | Fortran::common::LanguageFeature::BackslashEscapes)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/dump-parse-tree.h:16, from /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:25: /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: candidate: ‘template void Fortran::parser::Unparse(llvm::raw_ostream&, const A&, const Fortran::common::LangOptions&, Fortran::parser::Encoding, bool, bool, Fortran::parser::preStatementType*, Fortran::parser::AnalyzedObjectsAsFortran*)’ 53 | void Unparse(llvm::raw_ostream &out, const A &root, | ^~~~~~~ /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: template argument deduction/substitution failed: /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:222:45: note: cannot convert ‘driver.DriverOptions::encoding’ (type ‘Fortran::parser::Encoding’) to type ‘const Fortran::common::LangOptions&’ 222 | Unparse(llvm::outs(), parseTree, driver.encoding, true /*capitalize*/, | ~~~~~~~^~~~~~~~ /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:243:12: error: no matching function for call to ‘Unparse(llvm::raw_fd_ostream&, Fortran::parser::Program&, Fortran::parser::Encoding&, bool, bool)’ 243 | Unparse(tmpSource, parseTree, driver.encoding, 
true /*capitalize*/, | ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 244 | options.features.IsEnabled( | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 245 | Fortran::common::LanguageFeature::BackslashEscapes)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/dump-parse-tree.h:16, from /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:25: /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: candidate: ‘template void Fortran::parser::Unparse(llvm::raw_ostream&, const A&, const Fortran::common::LangOptions&, Fortran::parser::Encoding, bool, bool, Fortran::parser::preStatementType*, Fortran::parser::AnalyzedObjectsAsFortran*)’ 53 | void Unparse(llvm::raw_ostream &out, const A &root, | ^~~~~~~ /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: template argument deduction/substitution failed: /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:243:42: note: cannot convert ‘driver.DriverOptions::encoding’ (type ‘Fortran::parser::Encoding’) to type ‘const Fortran::common::LangOptions&’ 243 | Unparse(tmpSource, parseTree, driver.encoding, true /*capitalize*/, | ~~~~~~~^~~~~~~~ At global scope: cc1plus: note: unrecognized command-line option ‘-Wno-unnecessary-virtual-specifier’ may have been intended to silence earlier diagnostics [7307/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/host.cpp.o [7308/7799] Building CXX object tools/flang/lib/Optimizer/Dialect/CMakeFiles/FIRDialect.dir/FIROps.cpp.o ```
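The diagnostic in the log above reduces to a call site that still passes the encoding as the third argument, while the only candidate now expects a `const LangOptions &` in that position. The following is a minimal sketch of that situation and of one plausible way to update such a call. Everything in the `sketch` namespace (`LangOptions`, `Encoding`, `Program`, `Unparse`, `unparseToString`) is a hypothetical stand-in written for illustration; only the parameter order is taken from the candidate note in the log, and this is not the actual flang code or the fix that landed.

```cpp
#include <cctype>
#include <ostream>
#include <sstream>
#include <string>

namespace sketch {

// Hypothetical stand-ins for illustration only; these are NOT the real
// Fortran::common::LangOptions or Fortran::parser types.
struct LangOptions {
  int openMPVersion = 0;
};
enum class Encoding { LATIN_1, UTF_8 };
struct Program {
  std::string text;
};

// Signature with a LangOptions reference placed before the encoding,
// mirroring the parameter order shown in the compiler's candidate note.
template <typename A>
void Unparse(std::ostream &out, const A &root, const LangOptions &langOpts,
             Encoding encoding, bool capitalize, bool backslashEscapes) {
  // The sketch ignores these parameters; it only demonstrates the call shape.
  (void)langOpts;
  (void)encoding;
  (void)backslashEscapes;
  for (char c : root.text) {
    const unsigned char uc = static_cast<unsigned char>(c);
    out << static_cast<char>(capitalize ? std::toupper(uc) : uc);
  }
}

// Assumed helper: builds a default LangOptions and forwards it.
inline std::string unparseToString(const Program &program, bool capitalize) {
  std::ostringstream out;
  LangOptions langOpts;
  // An old-style call such as
  //   Unparse(out, program, Encoding::UTF_8, capitalize, false);
  // would be rejected here: Encoding does not convert to const LangOptions &.
  Unparse(out, program, langOpts, Encoding::UTF_8, capitalize,
          /*backslashEscapes=*/false);
  return out.str();
}

} // namespace sketch
```

With this shape, the caller has to have a `LangOptions` object available before it can unparse; where that object should come from in the real tool is not stated in the log.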
https://github.com/llvm/llvm-project/pull/139131 From flang-commits at lists.llvm.org Fri May 9 06:38:10 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Fri, 09 May 2025 06:38:10 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <681e0542.170a0220.2761a8.ab11@mx.google.com> ashermancinelli wrote: > Thanks, LGTM. > > If you need special handling in the SELECT TYPE, do you also need something for SELECT RANK? I see the SELECT RANK cases and statements using only simple reference types, which I handled in earlier patches. I had not propagated volatile on CLASSes in SELECT TYPE because I had not yet enabled the propagation of the VOLATILE attribute to class entities when creating designators. This patch brings SELECT TYPE in line with what is already in SELECT RANK, but I'll reread those cases in the bridge to see if I'm missing anything. Thanks! https://github.com/llvm/llvm-project/pull/139183 From flang-commits at lists.llvm.org Fri May 9 06:39:13 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Fri, 09 May 2025 06:39:13 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (PR #139131) In-Reply-To: Message-ID: <681e0581.170a0220.3c596b.4ee0@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `amdgpu-offload-ubuntu-22-cmake-build-only` running on `rocm-docker-ubu-22` while building `flang` at step 4 "annotate". Full details are available at: https://lab.llvm.org/buildbot/#/builders/203/builds/10042
Here is the relevant piece of the build log for the reference ``` Step 4 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py --jobs=32' (failure) ... [7283/7799] Building CXX object tools/flang/lib/Optimizer/Transforms/CMakeFiles/FIRTransforms.dir/CUFGPUToLLVMConversion.cpp.o [7284/7799] Building CXX object tools/flang/lib/Parser/CMakeFiles/FortranParser.dir/message.cpp.o [7285/7799] Building CXX object tools/flang/unittests/Evaluate/CMakeFiles/folding.test.dir/folding.cpp.o [7286/7799] Building CXX object tools/flang/lib/Optimizer/Analysis/CMakeFiles/FIRAnalysis.dir/AliasAnalysis.cpp.o [7287/7799] Building CXX object tools/flang/lib/Optimizer/Builder/CMakeFiles/FIRBuilder.dir/HLFIRTools.cpp.o [7288/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/common.cpp.o [7289/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/complex.cpp.o [7290/7799] Building CXX object tools/flang/unittests/Evaluate/CMakeFiles/expression.test.dir/expression.cpp.o [7291/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/host.cpp.o [7292/7799] Building CXX object tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o FAILED: tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o ccache /usr/bin/c++ -DFLANG_INCLUDE_TESTS=1 -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_STATIC -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/tools/flang/tools/f18-parse-demo -I/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/tools/f18-parse-demo -I/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/include -I/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/tools/flang/include 
-I/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/include -I/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/llvm/include -isystem /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/../mlir/include -isystem /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/tools/mlir/include -isystem /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/tools/clang/include -isystem /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/llvm/../clang/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wno-unnecessary-virtual-specifier -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-ctad-maybe-unsupported -fno-strict-aliasing -fno-semantic-interposition -fpch-preprocess -O3 -DNDEBUG -fno-semantic-interposition -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -std=c++17 -MD -MT tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o -MF tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o.d -o tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o -c /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp: In function ‘std::string CompileFortran(std::string, Fortran::parser::Options, DriverOptions&)’: 
/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:222:12: error: no matching function for call to ‘Unparse(llvm::raw_fd_ostream&, Fortran::parser::Program&, Fortran::parser::Encoding&, bool, bool)’ 222 | Unparse(llvm::outs(), parseTree, driver.encoding, true /*capitalize*/, | ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 223 | options.features.IsEnabled( | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 224 | Fortran::common::LanguageFeature::BackslashEscapes)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/include/flang/Parser/dump-parse-tree.h:16, from /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:25: /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: candidate: ‘template void Fortran::parser::Unparse(llvm::raw_ostream&, const A&, const Fortran::common::LangOptions&, Fortran::parser::Encoding, bool, bool, Fortran::parser::preStatementType*, Fortran::parser::AnalyzedObjectsAsFortran*)’ 53 | void Unparse(llvm::raw_ostream &out, const A &root, | ^~~~~~~ /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: template argument deduction/substitution failed: /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:222:45: note: cannot convert ‘driver.DriverOptions::encoding’ (type ‘Fortran::parser::Encoding’) to type ‘const Fortran::common::LangOptions&’ 222 | Unparse(llvm::outs(), parseTree, driver.encoding, true /*capitalize*/, | ~~~~~~~^~~~~~~~ /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:243:12: error: no matching function 
for call to ‘Unparse(llvm::raw_fd_ostream&, Fortran::parser::Program&, Fortran::parser::Encoding&, bool, bool)’ 243 | Unparse(tmpSource, parseTree, driver.encoding, true /*capitalize*/, | ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 244 | options.features.IsEnabled( | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 245 | Fortran::common::LanguageFeature::BackslashEscapes)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/include/flang/Parser/dump-parse-tree.h:16, from /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:25: /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: candidate: ‘template void Fortran::parser::Unparse(llvm::raw_ostream&, const A&, const Fortran::common::LangOptions&, Fortran::parser::Encoding, bool, bool, Fortran::parser::preStatementType*, Fortran::parser::AnalyzedObjectsAsFortran*)’ 53 | void Unparse(llvm::raw_ostream &out, const A &root, | ^~~~~~~ /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: template argument deduction/substitution failed: /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:243:42: note: cannot convert ‘driver.DriverOptions::encoding’ (type ‘Fortran::parser::Encoding’) to type ‘const Fortran::common::LangOptions&’ 243 | Unparse(tmpSource, parseTree, driver.encoding, true /*capitalize*/, | ~~~~~~~^~~~~~~~ At global scope: cc1plus: note: unrecognized command-line option ‘-Wno-unnecessary-virtual-specifier’ may have been intended to silence earlier diagnostics [7293/7799] Building CXX object tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/logical.cpp.o [7294/7799] Building CXX object 
tools/flang/lib/Evaluate/CMakeFiles/FortranEvaluate.dir/integer.cpp.o Step 7 (build cmake config) failure: build cmake config (failure) ... ```
https://github.com/llvm/llvm-project/pull/139131 From flang-commits at lists.llvm.org Fri May 9 06:42:04 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Fri, 09 May 2025 06:42:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <681e062c.170a0220.1fc173.ab27@mx.google.com> ================ @@ -207,20 +207,25 @@ static bool hasExplicitLowerBounds(mlir::Value shape) { mlir::isa(shape.getType()); } -static std::pair updateDeclareInputTypeWithVolatility( +static std::pair updateDeclaredInputTypeWithVolatility( mlir::Type inputType, mlir::Value memref, mlir::OpBuilder &builder, fir::FortranVariableFlagsAttr fortran_attrs) { if (fortran_attrs && bitEnumContainsAny(fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::fortran_volatile)) { + // A volatile pointer's pointee is volatile. const bool isPointer = bitEnumContainsAny( fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::pointer); + // An allocatable's inner type's volatility matches that of the reference. + const bool isAllocatable = bitEnumContainsAny( + fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::allocatable); ---------------- ashermancinelli wrote: We should have only entered this if-body if the attrs were non-null, but I think this is not very clear. Let me rewrite this a bit so it's more obvious. Thanks! https://github.com/llvm/llvm-project/pull/139183 From flang-commits at lists.llvm.org Fri May 9 07:27:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 07:27:21 -0700 (PDT) Subject: [flang-commits] [flang] 2cc8734 - [flang][fir] Basic lowering `fir.do_concurrent` locality specs to `fir.do_loop ... 
unordered` (#138512) Message-ID: <681e10c9.050a0220.212b14.2328@mx.google.com> Author: Kareem Ergawy Date: 2025-05-09T16:27:18+02:00 New Revision: 2cc8734c4505a0c8ce1f8d6a915ce0fc57cb6ea4 URL: https://github.com/llvm/llvm-project/commit/2cc8734c4505a0c8ce1f8d6a915ce0fc57cb6ea4 DIFF: https://github.com/llvm/llvm-project/commit/2cc8734c4505a0c8ce1f8d6a915ce0fc57cb6ea4.diff LOG: [flang][fir] Basic lowering `fir.do_concurrent` locality specs to `fir.do_loop ... unordered` (#138512) Extends lowering `fir.do_concurrent` to `fir.do_loop ... unordered` by adding support for locality specifiers. In particular, for `local` specifiers, a `fir.alloca` op is created using the localizer type. For `local_init` specifiers, the `copy` region is additionally inlined in the `do concurrent` loop's body. PR stack: - https://github.com/llvm/llvm-project/pull/137928 - https://github.com/llvm/llvm-project/pull/138505 - https://github.com/llvm/llvm-project/pull/138506 - https://github.com/llvm/llvm-project/pull/138512 (this PR) - https://github.com/llvm/llvm-project/pull/138534 - https://github.com/llvm/llvm-project/pull/138816 Added: Modified: flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp flang/test/Transforms/do_concurrent-to-do_loop-unodered.fir Removed: ################################################################################ diff --git a/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp b/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp index 6d106046b70f2..cb9e48cced2a1 100644 --- a/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp +++ b/flang/lib/Optimizer/Transforms/SimplifyFIROperations.cpp @@ -149,6 +149,17 @@ mlir::LogicalResult BoxTotalElementsConversion::matchAndRewrite( class DoConcurrentConversion : public mlir::OpRewritePattern { + /// Looks up from the operation from and returns the LocalitySpecifierOp with + /// name symbolName + static fir::LocalitySpecifierOp + findLocalizer(mlir::Operation *from, mlir::SymbolRefAttr symbolName) { 
+ fir::LocalitySpecifierOp localizer = + mlir::SymbolTable::lookupNearestSymbolFrom( + from, symbolName); + assert(localizer && "localizer not found in the symbol table"); + return localizer; + } + public: using mlir::OpRewritePattern::OpRewritePattern; @@ -162,7 +173,59 @@ class DoConcurrentConversion assert(loop.getRegion().hasOneBlock()); mlir::Block &loopBlock = loop.getRegion().getBlocks().front(); - // Collect iteration variable(s) allocations do that we can move them + // Handle localization + if (!loop.getLocalVars().empty()) { + mlir::OpBuilder::InsertionGuard guard(rewriter); + rewriter.setInsertionPointToStart(&loop.getRegion().front()); + + std::optional localSyms = loop.getLocalSyms(); + + for (auto [localVar, localArg, localizerSym] : llvm::zip_equal( + loop.getLocalVars(), loop.getRegionLocalArgs(), *localSyms)) { + mlir::SymbolRefAttr localizerName = + llvm::cast(localizerSym); + fir::LocalitySpecifierOp localizer = findLocalizer(loop, localizerName); + + if (!localizer.getInitRegion().empty() || + !localizer.getDeallocRegion().empty()) + TODO(localizer.getLoc(), "localizers with `init` and `dealloc` " + "regions are not handled yet."); + + // TODO Should this be a heap allocation instead? For now, we allocate + // on the stack for each loop iteration. + mlir::Value localAlloc = + rewriter.create(loop.getLoc(), localizer.getType()); + + if (localizer.getLocalitySpecifierType() == + fir::LocalitySpecifierType::LocalInit) { + // It is reasonable to make this assumption since, at this stage, + // control-flow ops are not converted yet. Therefore, things like `if` + // conditions will still be represented by their encapsulating `fir` + // dialect ops. 
+ assert(localizer.getCopyRegion().hasOneBlock() && + "Expected localizer to have a single block."); + mlir::Block *beforeLocalInit = rewriter.getInsertionBlock(); + mlir::Block *afterLocalInit = rewriter.splitBlock( + rewriter.getInsertionBlock(), rewriter.getInsertionPoint()); + rewriter.cloneRegionBefore(localizer.getCopyRegion(), afterLocalInit); + mlir::Block *copyRegionBody = beforeLocalInit->getNextNode(); + + rewriter.eraseOp(copyRegionBody->getTerminator()); + rewriter.mergeBlocks(afterLocalInit, copyRegionBody); + rewriter.mergeBlocks(copyRegionBody, beforeLocalInit, + {localVar, localArg}); + } + + rewriter.replaceAllUsesWith(localArg, localAlloc); + } + + loop.getRegion().front().eraseArguments(loop.getNumInductionVars(), + loop.getNumLocalOperands()); + loop.getLocalVarsMutable().clear(); + loop.setLocalSymsAttr(nullptr); + } + + // Collect iteration variable(s) allocations so that we can move them // outside the `fir.do_concurrent` wrapper. llvm::SmallVector opsToMove; for (mlir::Operation &op : llvm::drop_end(wrapperBlock)) diff --git a/flang/test/Transforms/do_concurrent-to-do_loop-unodered.fir b/flang/test/Transforms/do_concurrent-to-do_loop-unodered.fir index d2ceafdda5b22..d9ef36b175598 100644 --- a/flang/test/Transforms/do_concurrent-to-do_loop-unodered.fir +++ b/flang/test/Transforms/do_concurrent-to-do_loop-unodered.fir @@ -121,3 +121,64 @@ func.func @dc_2d_reduction(%i_lb: index, %i_ub: index, %i_st: index, // CHECK: } // CHECK: return // CHECK: } + +// ----- + +fir.local {type = local} @local_localizer : i32 + +fir.local {type = local_init} @local_init_localizer : i32 copy { +^bb0(%arg0: !fir.ref, %arg1: !fir.ref): + %0 = fir.load %arg0 : !fir.ref + fir.store %0 to %arg1 : !fir.ref + fir.yield(%arg1 : !fir.ref) +} + +func.func @do_concurrent_locality_specs() { + %3 = fir.alloca i32 {bindc_name = "local_init_var", uniq_name = "_QFdo_concurrentElocal_init_var"} + %4:2 = hlfir.declare %3 {uniq_name = "_QFdo_concurrentElocal_init_var"} : 
(!fir.ref) -> (!fir.ref, !fir.ref) + %5 = fir.alloca i32 {bindc_name = "local_var", uniq_name = "_QFdo_concurrentElocal_var"} + %6:2 = hlfir.declare %5 {uniq_name = "_QFdo_concurrentElocal_var"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %c1 = arith.constant 1 : index + %c10 = arith.constant 1 : index + fir.do_concurrent { + %9 = fir.alloca i32 {bindc_name = "i"} + %10:2 = hlfir.declare %9 {uniq_name = "_QFdo_concurrentEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) + fir.do_concurrent.loop (%arg0) = (%c1) to (%c10) step (%c1) local(@local_localizer %6#0 -> %arg1, @local_init_localizer %4#0 -> %arg2 : !fir.ref, !fir.ref) { + %11 = fir.convert %arg0 : (index) -> i32 + fir.store %11 to %10#0 : !fir.ref + %13:2 = hlfir.declare %arg1 {uniq_name = "_QFdo_concurrentElocal_var"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %15:2 = hlfir.declare %arg2 {uniq_name = "_QFdo_concurrentElocal_init_var"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %17 = fir.load %10#0 : !fir.ref + %c5_i32 = arith.constant 5 : i32 + %18 = arith.cmpi slt, %17, %c5_i32 : i32 + fir.if %18 { + %c42_i32 = arith.constant 42 : i32 + hlfir.assign %c42_i32 to %13#0 : i32, !fir.ref + } else { + %c84_i32 = arith.constant 84 : i32 + hlfir.assign %c84_i32 to %15#0 : i32, !fir.ref + } + } + } + return +} + +// CHECK-LABEL: func.func @do_concurrent_locality_specs() { +// CHECK: %[[LOC_INIT_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "{{.*}}Elocal_init_var"} +// CHECK: fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered { +// Verify localization of the `local` var. +// CHECK: %[[PRIV_LOC_ALLOC:.*]] = fir.alloca i32 + +// Verify localization of the `local_init` var. 
+// CHECK: %[[PRIV_LOC_INIT_ALLOC:.*]] = fir.alloca i32 +// CHECK: %[[LOC_INIT_VAL:.*]] = fir.load %[[LOC_INIT_DECL]]#0 : !fir.ref +// CHECK: fir.store %[[LOC_INIT_VAL]] to %[[PRIV_LOC_INIT_ALLOC]] : !fir.ref + +// CHECK: %[[VAL_15:.*]]:2 = hlfir.declare %[[PRIV_LOC_ALLOC]] +// CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[PRIV_LOC_INIT_ALLOC]] + +// CHECK: hlfir.assign %{{.*}} to %[[VAL_15]]#0 : i32, !fir.ref +// CHECK: hlfir.assign %{{.*}} to %[[VAL_16]]#0 : i32, !fir.ref +// CHECK: } +// CHECK: return +// CHECK: } From flang-commits at lists.llvm.org Fri May 9 07:27:28 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Fri, 09 May 2025 07:27:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Basic PFT to MLIR lowering for do concurrent locality specifiers (PR #138534) In-Reply-To: Message-ID: <681e10d0.170a0220.15279a.db3d@mx.google.com> https://github.com/ergawy edited https://github.com/llvm/llvm-project/pull/138534 From flang-commits at lists.llvm.org Fri May 9 07:27:25 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Fri, 09 May 2025 07:27:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Basic lowering `fir.do_concurrent` locality specs to `fir.do_loop ... unordered` (PR #138512) In-Reply-To: Message-ID: <681e10cd.170a0220.38556f.ddaa@mx.google.com> https://github.com/ergawy closed https://github.com/llvm/llvm-project/pull/138512 From flang-commits at lists.llvm.org Fri May 9 08:59:27 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 08:59:27 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Clang][Driver][Test] Created test for unsupported driver options (PR #120900) In-Reply-To: Message-ID: <681e265f.170a0220.2b617e.337d@mx.google.com> GeorgeKA wrote: Ping @Maetveis Any more thoughts on this one? I think I've addressed the existing comments. 
https://github.com/llvm/llvm-project/pull/120900 From flang-commits at lists.llvm.org Fri May 9 09:17:48 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 09 May 2025 09:17:48 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix spurious error on defined assignment in PURE (PR #139186) In-Reply-To: Message-ID: <681e2aac.170a0220.f69e3.3db1@mx.google.com> https://github.com/akuhlens approved this pull request. https://github.com/llvm/llvm-project/pull/139186 From flang-commits at lists.llvm.org Fri May 9 09:23:44 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Fri, 09 May 2025 09:23:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <681e2c10.170a0220.1f846f.3d96@mx.google.com> https://github.com/ashermancinelli updated https://github.com/llvm/llvm-project/pull/139183
From flang-commits at lists.llvm.org Fri May 9 09:24:29 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 09 May 2025 09:24:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <681e2c3d.170a0220.2bd230.5f1d@mx.google.com> ================ @@ -138,6 +138,18 @@ text. OpenMP-style directives that look like comments are not addressed by this scheme but are obvious extensions. +## Currently implemented built-ins + +* `__DATE__`: Date, given as e.g. "Jun 16 1904" +* `__TIME__`: Time in 24-hour format including seconds, e.g. "09:24:13" +* `__TIMESTAMP__`: Date, time and year of last modification, given as e.g. "Fri May 9 09:16:17 2025" +* `__FILE__`: Current file +* `__LINE__`: Current line + +### Non-standard extensions ---------------- akuhlens wrote: I would leave this documentation here, and add a bullet point in extensions for the `__COUNTER__` macro in the [preprocessing behavior section](https://github.com/llvm/llvm-project/blob/main/flang/docs/Extensions.md#preprocessing-behavior). Doesn't have to be anything elaborate, maybe just the fact that it exists and a link to this bullet point. 
https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Fri May 9 09:25:25 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 09:25:25 -0700 (PDT) Subject: [flang-commits] [flang] 7aed77e - [flang][OpenMP] Add implicit casts for omp.atomic.capture (#138163) Message-ID: <681e2c75.170a0220.2060f9.5e36@mx.google.com> Author: NimishMishra Date: 2025-05-09T09:25:21-07:00 New Revision: 7aed77ef954f83cc52dad3eba4f51470e21b1cb0 URL: https://github.com/llvm/llvm-project/commit/7aed77ef954f83cc52dad3eba4f51470e21b1cb0 DIFF: https://github.com/llvm/llvm-project/commit/7aed77ef954f83cc52dad3eba4f51470e21b1cb0.diff LOG: [flang][OpenMP] Add implicit casts for omp.atomic.capture (#138163) This patch adds support for emitting implicit casts for atomic capture if its constituent operations have different yet compatible types. Fixes: https://github.com/llvm/llvm-project/issues/138123 and https://github.com/llvm/llvm-project/issues/94177 Added: Modified: flang/docs/OpenMPSupport.md flang/lib/Lower/OpenMP/OpenMP.cpp flang/test/Lower/OpenMP/atomic-implicit-cast.f90 Removed: flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 ################################################################################ diff --git a/flang/docs/OpenMPSupport.md b/flang/docs/OpenMPSupport.md index bde0bf724e480..7a4f95693a89c 100644 --- a/flang/docs/OpenMPSupport.md +++ b/flang/docs/OpenMPSupport.md @@ -16,9 +16,7 @@ local: This document outlines the OpenMP API features supported by Flang. It is intended as a general reference. For the most accurate information on unimplemented features, rely on the compiler’s TODO or “Not Yet Implemented” -messages, which are considered authoritative. With the exception of a few corner cases, Flang -offers full support for OpenMP 3.1 ([See details here](#openmp-31-openmp-25-openmp-11)). -Partial support for OpenMP 4.0 is also available and currently under active development. 
+messages, which are considered authoritative. Flang provides complete implementation of the OpenMP 3.1 specification and partial implementation of OpenMP 4.0, with continued development efforts aimed at extending full support for the latter. The table below outlines the current status of OpenMP 4.0 feature support. Work is ongoing to add support for OpenMP 4.5 and newer versions; a support statement for these will be shared in the future. The table entries are derived from the information provided in the Version Differences subsection of the Features History section in the OpenMP standard. @@ -62,6 +60,3 @@ Note : No distinction is made between the support in Parser/Semantics, MLIR, Low | target teams distribute parallel loop construct | P | device, reduction and dist_schedule clauses are not supported | | teams distribute parallel loop simd construct | P | reduction, dist_schedule, and linear clauses are not supported | | target teams distribute parallel loop simd construct | P | device, reduction, dist_schedule and linear clauses are not supported | - -## OpenMP 3.1, OpenMP 2.5, OpenMP 1.1 -All features except a few corner cases in atomic (complex type, different but compatible types in lhs and rhs) are supported. diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 62a00ae8f3714..54560729eb4af 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2885,6 +2885,85 @@ static void genAtomicWrite(lower::AbstractConverter &converter, rightHandClauseList, loc); } +/* + Emit an implicit cast. Different yet compatible types on + omp.atomic.read constitute valid Fortran. The OMPIRBuilder will + emit atomic instructions (on primitive types) and `__atomic_load` + libcall (on complex type) without explicitly converting + between such compatible types. The OMPIRBuilder relies on the + frontend to resolve such inconsistencies between `omp.atomic.read ` + operand types. 
Similar inconsistencies between operand types in + `omp.atomic.write` are resolved through implicit casting by use of typed + assignment (i.e. `evaluate::Assignment`). However, use of typed + assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, + non-atomic load of `x` into a temporary `alloca`, followed by an atomic + read of form `v = alloca`. Hence, it is needed to perform a custom + implicit cast. + + An atomic read of form `v = x` would (without implicit casting) + lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, + type2`. This implicit casting will rather generate the following FIR: + + %alloca = fir.alloca type2 + omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 + %load = fir.load %alloca : !fir.ref + %cvt = fir.convert %load : (type2) -> type1 + fir.store %cvt to %v : !fir.ref + + These sequence of operations is thread-safe since each thread allocates + the `alloca` in its stack, and performs `%alloca = %x` atomically. Once + safely read, each thread performs the implicit cast on the local + `alloca`, and writes the final result to `%v`. + +/// \param builder : FirOpBuilder +/// \param loc : Location for FIR generation +/// \param toAddress : Address of %v +/// \param toType : Type of %v +/// \param fromType : Type of %x +/// \param alloca : Thread scoped `alloca` +// It is the responsibility of the callee +// to position the `alloca` at `AllocaIP` +// through `builder.getAllocaBlock()` +*/ + +static void emitAtomicReadImplicitCast(fir::FirOpBuilder &builder, + mlir::Location loc, + mlir::Value toAddress, mlir::Type toType, + mlir::Type fromType, + mlir::Value alloca) { + auto load = builder.create(loc, alloca); + if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { + // Emit an additional `ExtractValueOp` if `fromAddress` is of complex + // type, but `toAddress` is not. 
+ auto extract = builder.create( + loc, mlir::cast(fromType).getElementType(), load, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + auto cvt = builder.create(loc, toType, extract); + builder.create(loc, cvt, toAddress); + } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { + // Emit an additional `InsertValueOp` if `toAddress` is of complex + // type, but `fromAddress` is not. + mlir::Value undef = builder.create(loc, toType); + mlir::Type complexEleTy = + mlir::cast(toType).getElementType(); + mlir::Value cvt = builder.create(loc, complexEleTy, load); + mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); + mlir::Value idx0 = builder.create( + loc, toType, undef, cvt, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 0))); + mlir::Value idx1 = builder.create( + loc, toType, idx0, zero, + builder.getArrayAttr( + builder.getIntegerAttr(builder.getIndexType(), 1))); + builder.create(loc, idx1, toAddress); + } else { + auto cvt = builder.create(loc, toType, load); + builder.create(loc, cvt, toAddress); + } +} + /// Processes an atomic construct with read clause. static void genAtomicRead(lower::AbstractConverter &converter, const parser::OmpAtomicRead &atomicRead, @@ -2911,34 +2990,7 @@ static void genAtomicRead(lower::AbstractConverter &converter, *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); if (fromAddress.getType() != toAddress.getType()) { - // Emit an implicit cast. Different yet compatible types on - // omp.atomic.read constitute valid Fortran. The OMPIRBuilder will - // emit atomic instructions (on primitive types) and `__atomic_load` - // libcall (on complex type) without explicitly converting - // between such compatible types. The OMPIRBuilder relies on the - // frontend to resolve such inconsistencies between `omp.atomic.read ` - // operand types. 
Similar inconsistencies between operand types in - // `omp.atomic.write` are resolved through implicit casting by use of typed - // assignment (i.e. `evaluate::Assignment`). However, use of typed - // assignment in `omp.atomic.read` (of form `v = x`) leads to an unsafe, - // non-atomic load of `x` into a temporary `alloca`, followed by an atomic - // read of form `v = alloca`. Hence, it is needed to perform a custom - // implicit cast. - - // An atomic read of form `v = x` would (without implicit casting) - // lower to `omp.atomic.read %v = %x : !fir.ref, !fir.ref, - // type2`. This implicit casting will rather generate the following FIR: - // - // %alloca = fir.alloca type2 - // omp.atomic.read %alloca = %x : !fir.ref, !fir.ref, type2 - // %load = fir.load %alloca : !fir.ref - // %cvt = fir.convert %load : (type2) -> type1 - // fir.store %cvt to %v : !fir.ref - - // These sequence of operations is thread-safe since each thread allocates - // the `alloca` in its stack, and performs `%alloca = %x` atomically. Once - // safely read, each thread performs the implicit cast on the local - // `alloca`, and writes the final result to `%v`. + mlir::Type toType = fir::unwrapRefType(toAddress.getType()); mlir::Type fromType = fir::unwrapRefType(fromAddress.getType()); fir::FirOpBuilder &builder = converter.getFirOpBuilder(); @@ -2950,37 +3002,8 @@ static void genAtomicRead(lower::AbstractConverter &converter, genAtomicCaptureStatement(converter, fromAddress, alloca, leftHandClauseList, rightHandClauseList, elementType, loc); - auto load = builder.create(loc, alloca); - if (fir::isa_complex(fromType) && !fir::isa_complex(toType)) { - // Emit an additional `ExtractValueOp` if `fromAddress` is of complex - // type, but `toAddress` is not. 
- auto extract = builder.create( - loc, mlir::cast(fromType).getElementType(), load, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 0))); - auto cvt = builder.create(loc, toType, extract); - builder.create(loc, cvt, toAddress); - } else if (!fir::isa_complex(fromType) && fir::isa_complex(toType)) { - // Emit an additional `InsertValueOp` if `toAddress` is of complex - // type, but `fromAddress` is not. - mlir::Value undef = builder.create(loc, toType); - mlir::Type complexEleTy = - mlir::cast(toType).getElementType(); - mlir::Value cvt = builder.create(loc, complexEleTy, load); - mlir::Value zero = builder.createRealZeroConstant(loc, complexEleTy); - mlir::Value idx0 = builder.create( - loc, toType, undef, cvt, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 0))); - mlir::Value idx1 = builder.create( - loc, toType, idx0, zero, - builder.getArrayAttr( - builder.getIntegerAttr(builder.getIndexType(), 1))); - builder.create(loc, idx1, toAddress); - } else { - auto cvt = builder.create(loc, toType, load); - builder.create(loc, cvt, toAddress); - } + emitAtomicReadImplicitCast(builder, loc, toAddress, toType, fromType, + alloca); } else genAtomicCaptureStatement(converter, fromAddress, toAddress, leftHandClauseList, rightHandClauseList, @@ -3069,10 +3092,6 @@ static void genAtomicCapture(lower::AbstractConverter &converter, mlir::Type stmt2VarType = fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - // Check if implicit type is needed - if (stmt1VarType != stmt2VarType) - TODO(loc, "atomic capture requiring implicit type casts"); - mlir::Operation *atomicCaptureOp = nullptr; mlir::IntegerAttr hint = nullptr; mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; @@ -3095,10 +3114,31 @@ static void genAtomicCapture(lower::AbstractConverter &converter, // Atomic capture construct is of the form [capture-stmt, update-stmt] const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); 
mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt1LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt2LHSArg.getType()); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + genAtomicCaptureStatement(converter, stmt2LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt1LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } genAtomicUpdateStatement( converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, /*leftHandClauseList=*/nullptr, @@ -3111,10 +3151,32 @@ static void genAtomicCapture(lower::AbstractConverter &converter, firOpBuilder.setInsertionPointToStart(&block); const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt1LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt2LHSArg.getType()); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = 
firOpBuilder.create(loc, fromType); + } + genAtomicCaptureStatement(converter, stmt2LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt1LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, loc); @@ -3127,10 +3189,34 @@ static void genAtomicCapture(lower::AbstractConverter &converter, converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + + if (stmt1VarType != stmt2VarType) { + mlir::Value alloca; + mlir::Type toType = fir::unwrapRefType(stmt2LHSArg.getType()); + mlir::Type fromType = fir::unwrapRefType(stmt1LHSArg.getType()); + + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + alloca = firOpBuilder.create(loc, fromType); + } + + genAtomicCaptureStatement(converter, stmt1LHSArg, alloca, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + { + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); + emitAtomicReadImplicitCast(firOpBuilder, loc, stmt2LHSArg, toType, + fromType, alloca); + } + } else { + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + 
} } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); diff --git a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 b/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 deleted file mode 100644 index 5b61f1169308f..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 +++ /dev/null @@ -1,48 +0,0 @@ -!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..4c1be1ca91ac0 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -4,6 +4,10 @@ ! CHECK: func.func @_QPatomic_implicit_cast_read() { subroutine atomic_implicit_cast_read +! CHECK: %[[ALLOCA7:.*]] = fir.alloca complex +! CHECK: %[[ALLOCA6:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA5:.*]] = fir.alloca i32 +! CHECK: %[[ALLOCA4:.*]] = fir.alloca i32 ! CHECK: %[[ALLOCA3:.*]] = fir.alloca complex ! CHECK: %[[ALLOCA2:.*]] = fir.alloca complex ! 
CHECK: %[[ALLOCA1:.*]] = fir.alloca i32 @@ -53,4 +57,78 @@ subroutine atomic_implicit_cast_read ! CHECK: fir.store %[[CVT]] to %[[M_DECL]]#0 : !fir.ref> !$omp atomic read m = w + +! CHECK: %[[CONST:.*]] = arith.constant 1 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[ALLOCA4]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.update %[[X_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[ARG:.*]]: i32): +! CHECK: %[[RESULT:.*]] = arith.addi %[[ARG]], %[[CONST]] : i32 +! CHECK: omp.yield(%[[RESULT]] : i32) +! CHECK: } +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA4]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 +! CHECK: fir.store %[[CVT]] to %[[Y_DECL]]#0 : !fir.ref + !$omp atomic capture + y = x + x = x + 1 + !$omp end atomic + +! CHECK: %[[CONST:.*]] = arith.constant 10 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[ALLOCA5:.*]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[X_DECL]]#0 = %[[CONST]] : !fir.ref, i32 +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA5]] : !fir.ref +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f64 +! CHECK: fir.store %[[CVT]] to %[[Z_DECL]]#0 : !fir.ref + !$omp atomic capture + z = x + x = 10 + !$omp end atomic + +! CHECK: %[[CONST:.*]] = arith.constant 1 : i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[X_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[ARG:.*]]: i32): +! CHECK: %[[RESULT:.*]] = arith.addi %[[ARG]], %[[CONST]] : i32 +! CHECK: omp.yield(%[[RESULT]] : i32) +! CHECK: } +! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref +! CHECK: %[[UNDEF:.*]] = fir.undefined complex +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 +! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex +! 
CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex +! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> + !$omp atomic capture + x = x + 1 + w = x + !$omp end atomic + + +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): +! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 +! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex +! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex +! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex +! CHECK: omp.yield(%[[RESULT]] : complex) +! CHECK: } +! CHECK: omp.atomic.read %[[ALLOCA7]] = %[[M_DECL]]#0 : !fir.ref>, !fir.ref>, complex +! CHECK: } +! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA7]] : !fir.ref> +! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (complex) -> complex +! 
CHECK: fir.store %[[CVT]] to %[[W_DECL]]#0 : !fir.ref> + !$omp atomic capture + m = m + 1 + w = m + !$omp end atomic + + end subroutine From flang-commits at lists.llvm.org Fri May 9 09:25:29 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 09:25:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Add implicit casts for omp.atomic.capture (PR #138163) In-Reply-To: Message-ID: <681e2c79.170a0220.19f797.35e7@mx.google.com> https://github.com/NimishMishra closed https://github.com/llvm/llvm-project/pull/138163 From flang-commits at lists.llvm.org Fri May 9 09:27:45 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 09 May 2025 09:27:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Acknowledge non-enforcement of C7108 (PR #139169) In-Reply-To: Message-ID: <681e2d01.170a0220.d71f2.53fb@mx.google.com> https://github.com/akuhlens approved this pull request. LGTM, it kinda makes sense that if the user went out of the way to create a "constructor" function they probably meant the constructor. So fair enough. https://github.com/llvm/llvm-project/pull/139169 From flang-commits at lists.llvm.org Fri May 9 09:29:25 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 09 May 2025 09:29:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <681e2d65.170a0220.11c97c.4dfe@mx.google.com> https://github.com/tblah approved this pull request. Thanks! 
https://github.com/llvm/llvm-project/pull/139183 From flang-commits at lists.llvm.org Fri May 9 09:30:26 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 09 May 2025 09:30:26 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Further refinement of OpenMP !$ lines in -E mode (PR #138956) In-Reply-To: Message-ID: <681e2da2.170a0220.d4a0d.3ee0@mx.google.com> https://github.com/akuhlens approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/138956 From flang-commits at lists.llvm.org Fri May 9 09:33:56 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 09:33:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix issue when macro is followed by OpenMP pragma (PR #123035) In-Reply-To: Message-ID: <681e2e74.170a0220.1454c2.547d@mx.google.com> https://github.com/shivaramaarao updated https://github.com/llvm/llvm-project/pull/123035 >From 7f303b7645b0ec2bd3b140232c10974bf8bf837e Mon Sep 17 00:00:00 2001 From: Shivarama Rao Date: Wed, 15 Jan 2025 09:44:04 +0000 Subject: [PATCH] When calling IsCompilerDirectiveSentinel,the prefix comment character need to be skipped. 
Fixes #117693 and also handles continuation lines beginning with Macros --- flang/include/flang/Parser/token-sequence.h | 1 + flang/lib/Parser/prescan.cpp | 9 ++--- flang/lib/Parser/token-sequence.cpp | 40 +++++++++++++++++++++ flang/test/Preprocessing/bug117693.f90 | 14 ++++++++ flang/test/Preprocessing/bug117693_2.f90 | 15 ++++++++ flang/test/Preprocessing/bug117693_3.f90 | 7 ++++ 6 files changed, 82 insertions(+), 4 deletions(-) create mode 100644 flang/test/Preprocessing/bug117693.f90 create mode 100644 flang/test/Preprocessing/bug117693_2.f90 create mode 100644 flang/test/Preprocessing/bug117693_3.f90 diff --git a/flang/include/flang/Parser/token-sequence.h b/flang/include/flang/Parser/token-sequence.h index 047c0bed00762..a5f9b86ddd1eb 100644 --- a/flang/include/flang/Parser/token-sequence.h +++ b/flang/include/flang/Parser/token-sequence.h @@ -123,6 +123,7 @@ class TokenSequence { bool HasRedundantBlanks(std::size_t firstChar = 0) const; TokenSequence &RemoveBlanks(std::size_t firstChar = 0); TokenSequence &RemoveRedundantBlanks(std::size_t firstChar = 0); + TokenSequence &RemoveRedundantCompilerDirectives(const Prescanner &); TokenSequence &ClipComment(const Prescanner &, bool skipFirst = false); const TokenSequence &CheckBadFortranCharacters( Messages &, const Prescanner &, bool allowAmpersand) const; diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp index c5939a1e0b6c2..3aa177c80b704 100644 --- a/flang/lib/Parser/prescan.cpp +++ b/flang/lib/Parser/prescan.cpp @@ -175,7 +175,7 @@ void Prescanner::Statement() { EmitChar(tokens, '!'); ++at_, ++column_; for (const char *sp{directiveSentinel_}; *sp != '\0'; - ++sp, ++at_, ++column_) { + ++sp, ++at_, ++column_) { EmitChar(tokens, *sp); } if (IsSpaceOrTab(at_)) { @@ -346,6 +346,7 @@ void Prescanner::CheckAndEmitLine( tokens.CheckBadParentheses(messages_); } } + tokens.RemoveRedundantCompilerDirectives(*this); tokens.Emit(cooked_); if (omitNewline_) { omitNewline_ = false; @@ -511,7 +512,7 
@@ bool Prescanner::MustSkipToEndOfLine() const { if (inFixedForm_ && column_ > fixedFormColumnLimit_ && !tabInCurrentLine_) { return true; // skip over ignored columns in right margin (73:80) } else if (*at_ == '!' && !inCharLiteral_) { - return !IsCompilerDirectiveSentinel(at_); + return !IsCompilerDirectiveSentinel(at_ + 1); } else { return false; } @@ -1073,7 +1074,7 @@ std::optional Prescanner::IsIncludeLine(const char *start) const { } if (IsDecimalDigit(*p)) { // accept & ignore a numeric kind prefix for (p = SkipWhiteSpace(p + 1); IsDecimalDigit(*p); - p = SkipWhiteSpace(p + 1)) { + p = SkipWhiteSpace(p + 1)) { } if (*p != '_') { return std::nullopt; @@ -1121,7 +1122,7 @@ void Prescanner::FortranInclude(const char *firstQuote) { llvm::raw_string_ostream error{buf}; Provenance provenance{GetProvenance(nextLine_)}; std::optional prependPath; - if (const SourceFile * currentFile{allSources_.GetSourceFile(provenance)}) { + if (const SourceFile *currentFile{allSources_.GetSourceFile(provenance)}) { prependPath = DirectoryName(currentFile->path()); } const SourceFile *included{ diff --git a/flang/lib/Parser/token-sequence.cpp b/flang/lib/Parser/token-sequence.cpp index cdbe89b1eb441..9695b8d67be0b 100644 --- a/flang/lib/Parser/token-sequence.cpp +++ b/flang/lib/Parser/token-sequence.cpp @@ -304,6 +304,46 @@ TokenSequence &TokenSequence::ClipComment( return *this; } +TokenSequence &TokenSequence::RemoveRedundantCompilerDirectives( + const Prescanner &prescanner) { + // When the toekn sqeuence is " " convert it to " + // " + std::size_t tokens{SizeInTokens()}; + TokenSequence result; + bool firstDirective{true}; + for (std::size_t j{0}; j < tokens; ++j) { + CharBlock tok{TokenAt(j)}; + bool isSentinel{false}; + std::size_t blanks{tok.CountLeadingBlanks()}; + if (blanks < tok.size() && tok[blanks] == '!') { + // Retain active compiler directive sentinels (e.g. 
"!dir$", "!$omp") + for (std::size_t k{j + 1}; k < tokens && tok.size() <= blanks + 5; ++k) { + if (tok.begin() + tok.size() == TokenAt(k).begin()) { + tok.ExtendToCover(TokenAt(k)); + } else { + break; + } + } + } + if (tok.size() > blanks + 5) { + isSentinel = + prescanner.IsCompilerDirectiveSentinel(&tok[blanks + 1]).has_value(); + } + if (isSentinel && + !firstDirective) { // skip the directives if not the first one + j++; + } else { + result.Put(*this, j); + } + if (isSentinel && firstDirective) { + firstDirective = false; + } + } + swap(result); + return *this; +} + void TokenSequence::Emit(CookedSource &cooked) const { if (auto n{char_.size()}) { cooked.Put(&char_[0], n); diff --git a/flang/test/Preprocessing/bug117693.f90 b/flang/test/Preprocessing/bug117693.f90 new file mode 100644 index 0000000000000..ced7927606e62 --- /dev/null +++ b/flang/test/Preprocessing/bug117693.f90 @@ -0,0 +1,14 @@ +! RUN: %flang -fopenmp -E %s 2>&1 | FileCheck %s +!CHECK: !$OMP DO SCHEDULE(STATIC) +program main +IMPLICIT NONE +INTEGER:: I +#define OMPSUPPORT +!$ INTEGER :: omp_id +!$OMP PARALLEL DO +OMPSUPPORT !$OMP DO SCHEDULE(STATIC) +DO I=1,100 +print *, omp_id +ENDDO +!$OMP END PARALLEL DO +end program diff --git a/flang/test/Preprocessing/bug117693_2.f90 b/flang/test/Preprocessing/bug117693_2.f90 new file mode 100644 index 0000000000000..fe5027ceddf09 --- /dev/null +++ b/flang/test/Preprocessing/bug117693_2.f90 @@ -0,0 +1,15 @@ +! 
RUN: %flang -fopenmp -E %s 2>&1 | FileCheck %s +!CHECK: !$OMP DO SCHEDULE(STATIC) DEFAULT(NONE) +!CHECK-NOT: !$OMP DEFAULT(NONE) +program main +IMPLICIT NONE +INTEGER:: I +#define OMPSUPPORT +!$ INTEGER :: omp_id +!$OMP PARALLEL DO +OMPSUPPORT !$OMP DO SCHEDULE(STATIC) !$OMP DEFAULT(NONE) +DO I=1,100 +print *, omp_id +ENDDO +!$OMP END PARALLEL DO +end program diff --git a/flang/test/Preprocessing/bug117693_3.f90 b/flang/test/Preprocessing/bug117693_3.f90 new file mode 100644 index 0000000000000..4fdff04542889 --- /dev/null +++ b/flang/test/Preprocessing/bug117693_3.f90 @@ -0,0 +1,7 @@ +! RUN: %flang -fopenmp -E %s 2>&1 | FileCheck %s +!CHECK-NOT: DO I=1,100 !$OMP +program main +INTEGER::n +DO I=1,100 !!$OMP +ENDDO +END PROGRAM From flang-commits at lists.llvm.org Fri May 9 09:37:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 09:37:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix issue when macro is followed by OpenMP pragma (PR #123035) In-Reply-To: Message-ID: <681e2f5e.170a0220.31ae39.5a43@mx.google.com> shivaramaarao wrote: Is the PR ok to submit? If So, I can ask some of our AMD engineers to commit and we will follow up on any regressions. https://github.com/llvm/llvm-project/pull/123035 From flang-commits at lists.llvm.org Fri May 9 09:58:52 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 09 May 2025 09:58:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fixed designator codegen for contiguous boxes. (PR #139003) In-Reply-To: Message-ID: <681e344c.050a0220.8d898.040d@mx.google.com> ================ @@ -213,3 +213,75 @@ func.func @test_polymorphic_array_elt(%arg0: !fir.class>, !fir.class>>) -> !fir.class> // CHECK: return // CHECK: } + +// Test proper generation of fir.array_coor for contiguous box with default lbounds. 
+func.func @_QPtest_contiguous_derived_default(%arg0: !fir.class>> {fir.bindc_name = "d1", fir.contiguous, fir.optional}) { + %c1 = arith.constant 1 : index + %c16_i32 = arith.constant 16 : i32 + %0 = fir.dummy_scope : !fir.dscope + %1:2 = hlfir.declare %arg0 dummy_scope %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.class>>, !fir.dscope) -> (!fir.class>>, !fir.class>>) + fir.select_type %1#1 : !fir.class>> [#fir.type_is,i:i32}>>, ^bb1, unit, ^bb2] +^bb1: // pred: ^bb0 + %2 = fir.convert %1#1 : (!fir.class>>) -> !fir.box,i:i32}>>> + %3:2 = hlfir.declare %2 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.box,i:i32}>>>) -> (!fir.box,i:i32}>>>, !fir.box,i:i32}>>>) + %4 = hlfir.designate %3#0 (%c1, %c1) : (!fir.box,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> + %5 = hlfir.designate %4{"i"} : (!fir.ref,i:i32}>>) -> !fir.ref + hlfir.assign %c16_i32 to %5 : i32, !fir.ref + cf.br ^bb3 +^bb2: // pred: ^bb0 + %6:2 = hlfir.declare %1#1 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.class>>) -> (!fir.class>>, !fir.class>>) + cf.br ^bb3 +^bb3: // 2 preds: ^bb1, ^bb2 + return +} +// CHECK-LABEL: func.func @_QPtest_contiguous_derived_default( +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_9:.*]] = fir.declare %{{.*}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.box,i:i32}>>>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_10:.*]] = fir.rebox %[[VAL_9]] : (!fir.box,i:i32}>>>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_11:.*]] = fir.box_addr %[[VAL_10]] : (!fir.box,i:i32}>>>) -> !fir.ref,i:i32}>>> +// CHECK: %[[VAL_12:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10]], %[[VAL_12]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_15:.*]]:3 = fir.box_dims %[[VAL_10]], 
%[[VAL_14]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_16:.*]] = fir.shape %[[VAL_13]]#1, %[[VAL_15]]#1 : (index, index) -> !fir.shape<2> +// CHECK: %[[VAL_17:.*]] = fir.array_coor %[[VAL_11]](%[[VAL_16]]) %[[VAL_0]], %[[VAL_0]] : (!fir.ref,i:i32}>>>, !fir.shape<2>, index, index) -> !fir.ref,i:i32}>> + +// Test proper generation of fir.array_coor for contiguous box with non-default lbounds. +func.func @_QPtest_contiguous_derived_lbounds(%arg0: !fir.class>> {fir.bindc_name = "d1", fir.contiguous}) { + %c3 = arith.constant 3 : index + %c1 = arith.constant 1 : index + %c16_i32 = arith.constant 16 : i32 + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.shift %c1, %c3 : (index, index) -> !fir.shift<2> + %2:2 = hlfir.declare %arg0(%1) dummy_scope %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.class>>, !fir.shift<2>, !fir.dscope) -> (!fir.class>>, !fir.class>>) + fir.select_type %2#1 : !fir.class>> [#fir.type_is,i:i32}>>, ^bb1, unit, ^bb2] +^bb1: // pred: ^bb0 + %3 = fir.convert %2#1 : (!fir.class>>) -> !fir.box,i:i32}>>> + %4:2 = hlfir.declare %3(%1) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.box,i:i32}>>>, !fir.shift<2>) -> (!fir.box,i:i32}>>>, !fir.box,i:i32}>>>) + %5 = hlfir.designate %4#0 (%c1, %c3) : (!fir.box,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> + %6 = hlfir.designate %5{"i"} : (!fir.ref,i:i32}>>) -> !fir.ref + hlfir.assign %c16_i32 to %6 : i32, !fir.ref + cf.br ^bb3 +^bb2: // pred: ^bb0 + %7:2 = hlfir.declare %2#1(%1) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.class>>, !fir.shift<2>) -> (!fir.class>>, !fir.class>>) + cf.br ^bb3 +^bb3: // 2 preds: ^bb1, ^bb2 + return +} +// CHECK-LABEL: func.func @_QPtest_contiguous_derived_lbounds( +// CHECK: %[[VAL_0:.*]] = arith.constant 3 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.declare 
%{{.*}}(%[[VAL_4:.*]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.box,i:i32}>>>, !fir.shift<2>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_9:.*]] = fir.rebox %[[VAL_8]](%[[VAL_4]]) : (!fir.box,i:i32}>>>, !fir.shift<2>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_10:.*]] = fir.box_addr %[[VAL_9]] : (!fir.box,i:i32}>>>) -> !fir.ref,i:i32}>>> +// CHECK: %[[VAL_11:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_12:.*]]:3 = fir.box_dims %[[VAL_9]], %[[VAL_11]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_13:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_14:.*]]:3 = fir.box_dims %[[VAL_9]], %[[VAL_13]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_15:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_16:.*]] = arith.subi %[[VAL_1]], %[[VAL_1]] : index +// CHECK: %[[VAL_17:.*]] = arith.addi %[[VAL_16]], %[[VAL_15]] : index +// CHECK: %[[VAL_18:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_19:.*]] = arith.subi %[[VAL_0]], %[[VAL_0]] : index +// CHECK: %[[VAL_20:.*]] = arith.addi %[[VAL_19]], %[[VAL_18]] : index +// CHECK: %[[VAL_21:.*]] = fir.array_coor %[[VAL_10]] %[[VAL_17]], %[[VAL_20]] : (!fir.ref,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> ---------------- vzakhari wrote: Thanks for catching this! It should not work, so I missed something. https://github.com/llvm/llvm-project/pull/139003 From flang-commits at lists.llvm.org Fri May 9 10:01:13 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 09 May 2025 10:01:13 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fixed designator codegen for contiguous boxes. 
(PR #139003) In-Reply-To: Message-ID: <681e34d9.630a0220.6549e.3c6d@mx.google.com> ================ @@ -412,12 +412,44 @@ class DesignateOpConversion auto indices = designate.getIndices(); int i = 0; auto attrs = designate.getIsTripletAttr(); + + // If the shape specifies a shift and the base is not a box, + // then we have to subtract the lower bounds, as long as + // fir.array_coor does not support non-default lower bounds + // for non-box accesses. ---------------- vzakhari wrote: I was hitting "shift can only be provided with fir.box memref" error in `fir::ArrayCoorOp::verify`, but I now see a comment there saying that the codegen does support it: ``` // TODO: it looks like PreCGRewrite and CodeGen can support // fir.shift with plain array reference, so we may consider // removing this check. if (!mlir::isa(getMemref().getType())) return emitOpError("shift can only be provided with fir.box memref"); ``` I will do more investigation. https://github.com/llvm/llvm-project/pull/139003 From flang-commits at lists.llvm.org Fri May 9 10:01:18 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 09 May 2025 10:01:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang] (PR #139291) Message-ID: https://github.com/akuhlens created https://github.com/llvm/llvm-project/pull/139291 Previously the following program would have failed with a runtime assertion violation. This PR restricts the type information such that this assertion failure isn't reachable. The example below demonstrates the change. 
```bash $ cat error.f90 integer (kind=1) :: i call get_command(length=i) print *, i end $ cat good.f90 integer (kind=2) :: i call get_command(length=i) print *, i end $ prior/flang error.f90 && ./a.out fatal Fortran runtime error(/home/akuhlenschmi/work/lorado/src/llvm-project/t.f90:2): Internal error: RUNTIME_CHECK(IsValidIntDescriptor(length)) failed at /home/akuhlenschmi/work/lorado/src/llvm-project/flang-rt/lib/runtime/command.cpp(154) Aborted (core dumped) $ prior/flang good.f90 && ./a.out 7 $ current/flang error.f90 && ./a.out error: Semantic errors in t.f90 ./t.f90:2:25: error: Actual argument for 'length=' has bad type or kind 'INTEGER(1)' call get_command(length=i) ^ $ current/flang good.f90 && ./a.out 7 ``` >From 7bd59cf06ad52ab217416f815672034eca15ba9c Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 8 May 2025 08:32:37 -0700 Subject: [PATCH] initial commit --- flang/lib/Evaluate/intrinsics.cpp | 8 ++++---- flang/test/Semantics/command.f90 | 30 ++++++++++++++++++++++++++++++ 2 files changed, 34 insertions(+), 4 deletions(-) create mode 100644 flang/test/Semantics/command.f90 diff --git a/flang/lib/Evaluate/intrinsics.cpp b/flang/lib/Evaluate/intrinsics.cpp index 709f2e6c85bb2..d64a008e3db84 100644 --- a/flang/lib/Evaluate/intrinsics.cpp +++ b/flang/lib/Evaluate/intrinsics.cpp @@ -1587,8 +1587,8 @@ static const IntrinsicInterface intrinsicSubroutine[]{ {"get_command", {{"command", DefaultChar, Rank::scalar, Optionality::optional, common::Intent::Out}, - {"length", AnyInt, Rank::scalar, Optionality::optional, - common::Intent::Out}, + {"length", TypePattern{IntType, KindCode::greaterOrEqualToKind, 2}, + Rank::scalar, Optionality::optional, common::Intent::Out}, {"status", AnyInt, Rank::scalar, Optionality::optional, common::Intent::Out}, {"errmsg", DefaultChar, Rank::scalar, Optionality::optional, @@ -1598,8 +1598,8 @@ static const IntrinsicInterface intrinsicSubroutine[]{ {{"number", AnyInt, Rank::scalar}, {"value", DefaultChar, Rank::scalar, 
Optionality::optional, common::Intent::Out}, - {"length", AnyInt, Rank::scalar, Optionality::optional, - common::Intent::Out}, + {"length", TypePattern{IntType, KindCode::greaterOrEqualToKind, 2}, + Rank::scalar, Optionality::optional, common::Intent::Out}, {"status", AnyInt, Rank::scalar, Optionality::optional, common::Intent::Out}, {"errmsg", DefaultChar, Rank::scalar, Optionality::optional, diff --git a/flang/test/Semantics/command.f90 b/flang/test/Semantics/command.f90 new file mode 100644 index 0000000000000..b5f24cddbd052 --- /dev/null +++ b/flang/test/Semantics/command.f90 @@ -0,0 +1,30 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 +program command + implicit none + Integer(1) :: i1 + Integer(2) :: i2 + Integer(4) :: i4 + Integer(8) :: i8 + Integer(16) :: i16 + Integer :: a + !ERROR: Actual argument for 'length=' has bad type or kind 'INTEGER(1)' + call get_command(length=i1) + !OK: + call get_command(length=i2) + !OK: + call get_command(length=i4) + !OK: + call get_command(length=i8) + !OK: + call get_command(length=i16) + !ERROR: Actual argument for 'length=' has bad type or kind 'INTEGER(1)' + call get_command_argument(number=a,length=i1) + !OK: + call get_command_argument(number=a,length=i2) + !OK: + call get_command_argument(number=a,length=i4) + !OK: + call get_command_argument(number=a,length=i8) + !OK: + call get_command_argument(number=a,length=i16) +end program \ No newline at end of file From flang-commits at lists.llvm.org Fri May 9 10:01:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 10:01:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] (PR #139291) In-Reply-To: Message-ID: <681e3503.a70a0220.197931.bfa5@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Andre Kuhlenschmidt (akuhlens)
Full diff: https://github.com/llvm/llvm-project/pull/139291.diff 2 Files Affected: - (modified) flang/lib/Evaluate/intrinsics.cpp (+4-4) - (added) flang/test/Semantics/command.f90 (+30)
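The effect of the new `length=` pattern in PR #139291 can be sketched in isolation. The following is a minimal model under stated assumptions, not flang's implementation: `kMinLengthKind`, `acceptsLengthKind` and `checkCommandTestKinds` are hypothetical names standing in for the `TypePattern{IntType, KindCode::greaterOrEqualToKind, 2}` entry in the patch, and the kinds checked are the ones exercised by `flang/test/Semantics/command.f90`.

```cpp
#include <cassert>

// Hypothetical model of the patched pattern for `length=`:
// an INTEGER actual argument is accepted only if its kind is >= 2.
// This is an illustration, not flang's TypePattern machinery.
constexpr int kMinLengthKind = 2;

constexpr bool acceptsLengthKind(int integerKind) {
  return integerKind >= kMinLengthKind;
}

// Mirrors the expectations in command.f90: INTEGER(1) is diagnosed,
// while INTEGER(2), (4), (8) and (16) are accepted.
inline void checkCommandTestKinds() {
  assert(!acceptsLengthKind(1));
  assert(acceptsLengthKind(2));
  assert(acceptsLengthKind(4));
  assert(acceptsLengthKind(8));
  assert(acceptsLengthKind(16));
}
```

In this model the rejection happens before any runtime call is made, which corresponds to the PR turning the earlier runtime `RUNTIME_CHECK` failure into a semantic error at compile time.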
https://github.com/llvm/llvm-project/pull/139291 From flang-commits at lists.llvm.org Fri May 9 10:03:24 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 09 May 2025 10:03:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang] (PR #139291) In-Reply-To: Message-ID: <681e355c.170a0220.5982b.7a53@mx.google.com> https://github.com/akuhlens edited https://github.com/llvm/llvm-project/pull/139291 From flang-commits at lists.llvm.org Fri May 9 10:06:10 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 10:06:10 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" and clause LoopRange (PR #139293) In-Reply-To: Message-ID: <681e3602.170a0220.272ffa.7d3d@mx.google.com> github-actions[bot] wrote: Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using `@` followed by their GitHub username. If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the [LLVM GitHub User Guide](https://llvm.org/docs/GitHub.html). You can also ask questions in a comment on this PR, on the [LLVM Discord](https://discord.com/invite/xS7Z362) or on the [forums](https://discourse.llvm.org/). 
https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 10:06:44 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 10:06:44 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" and clause LoopRange (PR #139293) In-Reply-To: Message-ID: <681e3624.050a0220.2371d8.fdda@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp @llvm/pr-subscribers-clang Author: Walter J.T.V (eZWALT)
Changes This pull request introduces full support for the #pragma omp fuse directive, as specified in the OpenMP 6.0 specification, along with initial support for the looprange clause in Clang. To enable this functionality, infrastructure for the Loop Sequence construct, also new in OpenMP 6.0, has been implemented. Additionally, a minimal code skeleton has been added to Flang to ensure compatibility and avoid integration issues, although a full implementation in Flang is still pending. https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-6-0.pdf P.S. As a follow-up to this loop transformation work, I'm currently preparing a patch that implements the "#pragma omp split" directive, also introduced in OpenMP 6.0. --- Patch is 277.11 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139293.diff 47 Files Affected: - (modified) clang/docs/OpenMPSupport.rst (+2) - (modified) clang/include/clang-c/Index.h (+4) - (modified) clang/include/clang/AST/OpenMPClause.h (+100) - (modified) clang/include/clang/AST/RecursiveASTVisitor.h (+11) - (modified) clang/include/clang/AST/StmtOpenMP.h (+105-3) - (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+15) - (modified) clang/include/clang/Basic/StmtNodes.td (+1) - (modified) clang/include/clang/Parse/Parser.h (+3) - (modified) clang/include/clang/Sema/SemaOpenMP.h (+115) - (modified) clang/include/clang/Serialization/ASTBitCodes.h (+1) - (modified) clang/lib/AST/OpenMPClause.cpp (+35) - (modified) clang/lib/AST/StmtOpenMP.cpp (+41) - (modified) clang/lib/AST/StmtPrinter.cpp (+5) - (modified) clang/lib/AST/StmtProfile.cpp (+11) - (modified) clang/lib/Basic/OpenMPKinds.cpp (+4-1) - (modified) clang/lib/CodeGen/CGExpr.cpp (+2) - (modified) clang/lib/CodeGen/CGStmt.cpp (+3) - (modified) clang/lib/CodeGen/CGStmtOpenMP.cpp (+8) - (modified) clang/lib/CodeGen/CodeGenFunction.h (+5) - (modified) clang/lib/Parse/ParseOpenMP.cpp (+36) - (modified) 
clang/lib/Sema/SemaExceptionSpec.cpp (+1) - (modified) clang/lib/Sema/SemaOpenMP.cpp (+904-17) - (modified) clang/lib/Sema/TreeTransform.h (+44) - (modified) clang/lib/Serialization/ASTReader.cpp (+11) - (modified) clang/lib/Serialization/ASTReaderStmt.cpp (+13) - (modified) clang/lib/Serialization/ASTWriter.cpp (+8) - (modified) clang/lib/Serialization/ASTWriterStmt.cpp (+6) - (modified) clang/lib/StaticAnalyzer/Core/ExprEngine.cpp (+1) - (added) clang/test/OpenMP/fuse_ast_print.cpp (+400) - (added) clang/test/OpenMP/fuse_codegen.cpp (+2328) - (added) clang/test/OpenMP/fuse_messages.cpp (+186) - (modified) clang/tools/libclang/CIndex.cpp (+12) - (modified) clang/tools/libclang/CXCursor.cpp (+3) - (modified) flang/include/flang/Parser/dump-parse-tree.h (+1) - (modified) flang/include/flang/Parser/parse-tree.h (+9) - (modified) flang/lib/Lower/OpenMP/Clauses.cpp (+5) - (modified) flang/lib/Lower/OpenMP/Clauses.h (+1) - (modified) flang/lib/Parser/openmp-parsers.cpp (+7) - (modified) flang/lib/Parser/unparse.cpp (+7) - (modified) flang/lib/Semantics/check-omp-structure.cpp (+9) - (modified) llvm/include/llvm/Frontend/OpenMP/ClauseT.h (+13-3) - (modified) llvm/include/llvm/Frontend/OpenMP/OMP.td (+11) - (added) openmp/runtime/test/transform/fuse/foreach.cpp (+192) - (added) openmp/runtime/test/transform/fuse/intfor.c (+50) - (added) openmp/runtime/test/transform/fuse/iterfor.cpp (+194) - (added) openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp (+208) - (added) openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c (+45) ``````````diff diff --git a/clang/docs/OpenMPSupport.rst b/clang/docs/OpenMPSupport.rst index d6507071d4693..b39f9d3634a63 100644 --- a/clang/docs/OpenMPSupport.rst +++ b/clang/docs/OpenMPSupport.rst @@ -376,6 +376,8 @@ implementation. 
+-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | loop stripe transformation | :good:`done` | https://github.com/llvm/llvm-project/pull/119891 | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ +| loop fuse transformation | :good:`prototyped` | :none:`unclaimed` | | ++-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | work distribute construct | :none:`unclaimed` | :none:`unclaimed` | | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | task_iteration | :none:`unclaimed` | :none:`unclaimed` | | diff --git a/clang/include/clang-c/Index.h b/clang/include/clang-c/Index.h index d30d15e53802a..00046de62a742 100644 --- a/clang/include/clang-c/Index.h +++ b/clang/include/clang-c/Index.h @@ -2162,6 +2162,10 @@ enum CXCursorKind { */ CXCursor_OMPStripeDirective = 310, + /** OpenMP fuse directive + */ + CXCursor_OMPFuseDirective = 318, + /** OpenACC Compute Construct. 
*/ CXCursor_OpenACCComputeConstruct = 320, diff --git a/clang/include/clang/AST/OpenMPClause.h b/clang/include/clang/AST/OpenMPClause.h index 757873fd6d414..9adf41aee6f1c 100644 --- a/clang/include/clang/AST/OpenMPClause.h +++ b/clang/include/clang/AST/OpenMPClause.h @@ -1151,6 +1151,106 @@ class OMPFullClause final : public OMPNoChildClause { static OMPFullClause *CreateEmpty(const ASTContext &C); }; +/// This class represents the 'looprange' clause in the +/// '#pragma omp fuse' directive +/// +/// \code {c} +/// #pragma omp fuse looprange(1,2) +/// { +/// for(int i = 0; i < 64; ++i) +/// for(int j = 0; j < 256; j+=2) +/// for(int k = 127; k >= 0; --k) +/// \endcode +class OMPLoopRangeClause final : public OMPClause { + friend class OMPClauseReader; + + explicit OMPLoopRangeClause() + : OMPClause(llvm::omp::OMPC_looprange, {}, {}) {} + + /// Location of '(' + SourceLocation LParenLoc; + + /// Location of 'first' + SourceLocation FirstLoc; + + /// Location of 'count' + SourceLocation CountLoc; + + /// Expr associated with 'first' argument + Expr *First = nullptr; + + /// Expr associated with 'count' argument + Expr *Count = nullptr; + + /// Set 'first' + void setFirst(Expr *First) { this->First = First; } + + /// Set 'count' + void setCount(Expr *Count) { this->Count = Count; } + + /// Set location of '('. + void setLParenLoc(SourceLocation Loc) { LParenLoc = Loc; } + + /// Set location of 'first' argument + void setFirstLoc(SourceLocation Loc) { FirstLoc = Loc; } + + /// Set location of 'count' argument + void setCountLoc(SourceLocation Loc) { CountLoc = Loc; } + +public: + /// Build an AST node for a 'looprange' clause + /// + /// \param StartLoc Starting location of the clause. + /// \param LParenLoc Location of '('. + /// \param ModifierLoc Modifier location. 
+ /// \param + static OMPLoopRangeClause * + Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation LParenLoc, + SourceLocation FirstLoc, SourceLocation CountLoc, + SourceLocation EndLoc, Expr *First, Expr *Count); + + /// Build an empty 'looprange' node for deserialization + /// + /// \param C Context of the AST. + static OMPLoopRangeClause *CreateEmpty(const ASTContext &C); + + /// Returns the location of '(' + SourceLocation getLParenLoc() const { return LParenLoc; } + + /// Returns the location of 'first' + SourceLocation getFirstLoc() const { return FirstLoc; } + + /// Returns the location of 'count' + SourceLocation getCountLoc() const { return CountLoc; } + + /// Returns the argument 'first' or nullptr if not set + Expr *getFirst() const { return cast_or_null(First); } + + /// Returns the argument 'count' or nullptr if not set + Expr *getCount() const { return cast_or_null(Count); } + + child_range children() { + return child_range(reinterpret_cast(&First), + reinterpret_cast(&Count) + 1); + } + + const_child_range children() const { + auto Children = const_cast(this)->children(); + return const_child_range(Children.begin(), Children.end()); + } + + child_range used_children() { + return child_range(child_iterator(), child_iterator()); + } + const_child_range used_children() const { + return const_child_range(const_child_iterator(), const_child_iterator()); + } + + static bool classof(const OMPClause *T) { + return T->getClauseKind() == llvm::omp::OMPC_looprange; + } +}; + /// Representation of the 'partial' clause of the '#pragma omp unroll' /// directive. 
/// diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h index 3edc8684d0a19..fbc93796ab46a 100644 --- a/clang/include/clang/AST/RecursiveASTVisitor.h +++ b/clang/include/clang/AST/RecursiveASTVisitor.h @@ -3078,6 +3078,9 @@ DEF_TRAVERSE_STMT(OMPUnrollDirective, DEF_TRAVERSE_STMT(OMPReverseDirective, { TRY_TO(TraverseOMPExecutableDirective(S)); }) +DEF_TRAVERSE_STMT(OMPFuseDirective, + { TRY_TO(TraverseOMPExecutableDirective(S)); }) + DEF_TRAVERSE_STMT(OMPInterchangeDirective, { TRY_TO(TraverseOMPExecutableDirective(S)); }) @@ -3395,6 +3398,14 @@ bool RecursiveASTVisitor::VisitOMPFullClause(OMPFullClause *C) { return true; } +template +bool RecursiveASTVisitor::VisitOMPLoopRangeClause( + OMPLoopRangeClause *C) { + TRY_TO(TraverseStmt(C->getFirst())); + TRY_TO(TraverseStmt(C->getCount())); + return true; +} + template bool RecursiveASTVisitor::VisitOMPPartialClause(OMPPartialClause *C) { TRY_TO(TraverseStmt(C->getFactor())); diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index 736bcabbad1f7..b6a948a8c6020 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -962,6 +962,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Number of loops generated by this loop transformation. unsigned NumGeneratedLoops = 0; + /// Number of top level canonical loop nests generated by this loop + /// transformation + unsigned NumGeneratedLoopNests = 0; protected: explicit OMPLoopTransformationDirective(StmtClass SC, @@ -973,6 +976,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Set the number of loops generated by this loop transformation. 
void setNumGeneratedLoops(unsigned Num) { NumGeneratedLoops = Num; } + /// Set the number of top level canonical loop nests generated by this loop + /// transformation + void setNumGeneratedLoopNests(unsigned Num) { NumGeneratedLoopNests = Num; } public: /// Return the number of associated (consumed) loops. @@ -981,6 +987,10 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Return the number of loops generated by this loop transformation. unsigned getNumGeneratedLoops() const { return NumGeneratedLoops; } + /// Return the number of top level canonical loop nests generated by this loop + /// transformation + unsigned getNumGeneratedLoopNests() const { return NumGeneratedLoopNests; } + /// Get the de-sugared statements after the loop transformation. /// /// Might be nullptr if either the directive generates no loops and is handled @@ -995,7 +1005,7 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { Stmt::StmtClass C = T->getStmtClass(); return C == OMPTileDirectiveClass || C == OMPUnrollDirectiveClass || C == OMPReverseDirectiveClass || C == OMPInterchangeDirectiveClass || - C == OMPStripeDirectiveClass; + C == OMPStripeDirectiveClass || C == OMPFuseDirectiveClass; } }; @@ -5561,7 +5571,10 @@ class OMPTileDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPTileDirectiveClass, llvm::omp::OMPD_tile, StartLoc, EndLoc, NumLoops) { + // Tiling doubles the original number of loops setNumGeneratedLoops(2 * NumLoops); + // Produces a single top-level canonical loop nest + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5639,6 +5652,8 @@ class OMPStripeDirective final : public OMPLoopTransformationDirective { llvm::omp::OMPD_stripe, StartLoc, EndLoc, NumLoops) { setNumGeneratedLoops(2 * NumLoops); + // Similar to Tile, it only generates a single top level loop nest + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5790,7 +5805,11 @@ class 
OMPReverseDirective final : public OMPLoopTransformationDirective { explicit OMPReverseDirective(SourceLocation StartLoc, SourceLocation EndLoc) : OMPLoopTransformationDirective(OMPReverseDirectiveClass, llvm::omp::OMPD_reverse, StartLoc, - EndLoc, 1) {} + EndLoc, 1) { + // Reverse produces a single top-level canonical loop nest + setNumGeneratedLoops(1); + setNumGeneratedLoopNests(1); + } void setPreInits(Stmt *PreInits) { Data->getChildren()[PreInitsOffset] = PreInits; @@ -5857,7 +5876,10 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPInterchangeDirectiveClass, llvm::omp::OMPD_interchange, StartLoc, EndLoc, NumLoops) { - setNumGeneratedLoops(3 * NumLoops); + // Interchange produces a single top-level canonical loop + // nest, with the exact same amount of total loops + setNumGeneratedLoops(NumLoops); + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5908,6 +5930,86 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { } }; +/// Represents the '#pragma omp fuse' loop transformation directive +/// +/// \code{c} +/// #pragma omp fuse +/// { +/// for(int i = 0; i < m1; ++i) {...} +/// for(int j = 0; j < m2; ++j) {...} +/// ... +/// } +/// \endcode + +class OMPFuseDirective final : public OMPLoopTransformationDirective { + friend class ASTStmtReader; + friend class OMPExecutableDirective; + + // Offsets of child members. 
+ enum { + PreInitsOffset = 0, + TransformedStmtOffset, + }; + + explicit OMPFuseDirective(SourceLocation StartLoc, SourceLocation EndLoc, + unsigned NumLoops) + : OMPLoopTransformationDirective(OMPFuseDirectiveClass, + llvm::omp::OMPD_fuse, StartLoc, EndLoc, + NumLoops) {} + + void setPreInits(Stmt *PreInits) { + Data->getChildren()[PreInitsOffset] = PreInits; + } + + void setTransformedStmt(Stmt *S) { + Data->getChildren()[TransformedStmtOffset] = S; + } + +public: + /// Create a new AST node representation for #pragma omp fuse' + /// + /// \param C Context of the AST + /// \param StartLoc Location of the introducer (e.g the 'omp' token) + /// \param EndLoc Location of the directive's end (e.g the tok::eod) + /// \param Clauses The directive's clauses + /// \param NumLoops Number of total affected loops + /// \param NumLoopNests Number of affected top level canonical loops + /// (number of items in the 'looprange' clause if present) + /// \param AssociatedStmt The outermost associated loop + /// \param TransformedStmt The loop nest after fusion, or nullptr in + /// dependent + /// \param PreInits Helper preinits statements for the loop nest + static OMPFuseDirective *Create(const ASTContext &C, SourceLocation StartLoc, + SourceLocation EndLoc, + ArrayRef Clauses, + unsigned NumLoops, unsigned NumLoopNests, + Stmt *AssociatedStmt, Stmt *TransformedStmt, + Stmt *PreInits); + + /// Build an empty '#pragma omp fuse' AST node for deserialization + /// + /// \param C Context of the AST + /// \param NumClauses Number of clauses to allocate + /// \param NumLoops Number of associated loops to allocate + /// \param NumLoopNests Number of top level loops to allocate + static OMPFuseDirective *CreateEmpty(const ASTContext &C, unsigned NumClauses, + unsigned NumLoops, + unsigned NumLoopNests); + + /// Gets the associated loops after the transformation. This is the de-sugared + /// replacement or nulltpr in dependent contexts. 
+ Stmt *getTransformedStmt() const { + return Data->getChildren()[TransformedStmtOffset]; + } + + /// Return preinits statement. + Stmt *getPreInits() const { return Data->getChildren()[PreInitsOffset]; } + + static bool classof(const Stmt *T) { + return T->getStmtClass() == OMPFuseDirectiveClass; + } +}; + /// This represents '#pragma omp scan' directive. /// /// \code diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index e1b9ed0647bb9..94d1f3c3e6349 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -11516,6 +11516,21 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; +def warn_omp_different_loop_ind_var_types : Warning < + "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">, + InGroup; +def err_omp_not_canonical_loop : Error < + "loop after '#pragma omp %0' is not in canonical form">; +def err_omp_not_a_loop_sequence : Error < + "statement after '#pragma omp %0' must be a loop sequence containing canonical loops or loop-generating constructs">; +def err_omp_empty_loop_sequence : Error < + "loop sequence after '#pragma omp %0' must contain at least 1 canonical loop or loop-generating construct">; +def err_omp_invalid_looprange : Error < + "loop range in '#pragma omp %0' exceeds the number of available loops: " + "range end '%1' is greater than the total number of loops '%2'">; +def warn_omp_redundant_fusion : Warning < + "loop range in '#pragma omp %0' contains only a single loop, resulting in redundant fusion">, + InGroup; def err_omp_not_for : Error< "%select{statement after '#pragma omp %1' must be a for loop|" "expected %2 for loops after '#pragma omp %1'%select{|, but found only %4}3}0">; diff --git 
a/clang/include/clang/Basic/StmtNodes.td b/clang/include/clang/Basic/StmtNodes.td index 9526fa5808aa5..739160342062c 100644 --- a/clang/include/clang/Basic/StmtNodes.td +++ b/clang/include/clang/Basic/StmtNodes.td @@ -234,6 +234,7 @@ def OMPStripeDirective : StmtNode; def OMPUnrollDirective : StmtNode; def OMPReverseDirective : StmtNode; def OMPInterchangeDirective : StmtNode; +def OMPFuseDirective : StmtNode; def OMPForDirective : StmtNode; def OMPForSimdDirective : StmtNode; def OMPSectionsDirective : StmtNode; diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h index e0b8850493b49..0c4c4fc4ba417 100644 --- a/clang/include/clang/Parse/Parser.h +++ b/clang/include/clang/Parse/Parser.h @@ -3622,6 +3622,9 @@ class Parser : public CodeCompletionHandler { OpenMPClauseKind Kind, bool ParseOnly); + /// Parses the 'looprange' clause of a '#pragma omp fuse' directive. + OMPClause *ParseOpenMPLoopRangeClause(); + /// Parses the 'sizes' clause of a '#pragma omp tile' directive. OMPClause *ParseOpenMPSizesClause(); diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index 6498390fe96f7..ac4cbe3709a0d 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -457,6 +457,13 @@ class SemaOpenMP : public SemaBase { Stmt *AStmt, SourceLocation StartLoc, SourceLocation EndLoc); + + /// Called on well-formed '#pragma omp fuse' after parsing of its + /// clauses and the associated statement. + StmtResult ActOnOpenMPFuseDirective(ArrayRef Clauses, + Stmt *AStmt, SourceLocation StartLoc, + SourceLocation EndLoc); + /// Called on well-formed '\#pragma omp for' after parsing /// of the associated statement. StmtResult @@ -914,6 +921,12 @@ class SemaOpenMP : public SemaBase { SourceLocation StartLoc, SourceLocation LParenLoc, SourceLocation EndLoc); + + /// Called on well-form 'looprange' clause after parsing its arguments. 
+ OMPClause * + ActOnOpenMPLoopRangeClause(Expr *First, Expr *Count, SourceLocation StartLoc, + SourceLocation LParenLoc, SourceLocation FirstLoc, + SourceLocation CountLoc, SourceLocation EndLoc); /// Called on well-formed 'ordered' clause. OMPClause * ActOnOpenMPOrderedClause(SourceLocation StartLoc, SourceLocation EndLoc, @@ -1480,6 +1493,108 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, Stmt *&Body, SmallVectorImpl> &OriginalInits); + /// @brief Categories of loops encountered during semantic OpenMP loop + /// analysis + /// + /// This enumeration identifies the structural category of a loop or sequence + /// of loops analyzed in the context of OpenMP transformations and directives. + /// This categorization helps differentiate between original source loops + /// and the structures resulting from applying OpenMP loop transformations. + enum class OMPLoopCategory { + + /// @var OMPLoopCategory::RegularLoop + /// Represents a standard canonical loop nest found in the + /// original source code or an intact loop after transformations + /// (i.e Post/Pre loops of a loopranged fusion) + RegularLoop, + + /// @var OMPLoopCategory::TransformSingleLoop + /// Represents the resulting loop structure when an OpenMP loop + // transformation, generates a single, top-level loop + TransformSingleLoop, + + /// @var OMPLoopCategory::TransformLoopSequence + /// Represents the resulting loop structure when an OpenMP loop + /// transformation + /// generates a ... [truncated] ``````````
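The two `looprange` diagnostics added above (`err_omp_invalid_looprange` and `warn_omp_redundant_fusion`) can be read as a small range check. The following is only a sketch of that check, not the code from the PR: the names `LoopRangeCheck` and `checkLoopRange` are hypothetical, and it assumes that `first` is 1-based and that the "range end" in the diagnostic is `first + count - 1`.

```cpp
#include <cassert>

// Hypothetical result categories mirroring the two diagnostics in the patch.
enum class LoopRangeCheck { Ok, ExceedsLoopCount, RedundantFusion };

// Sketch of the looprange(first, count) validation.
// Assumptions (not confirmed by the excerpt): 'first' is 1-based, both
// arguments are already known to be positive, and the range end is
// first + count - 1.
LoopRangeCheck checkLoopRange(unsigned first, unsigned count,
                              unsigned numLoops) {
  unsigned rangeEnd = first + count - 1;
  if (rangeEnd > numLoops)
    return LoopRangeCheck::ExceedsLoopCount; // cf. err_omp_invalid_looprange
  if (count == 1)
    return LoopRangeCheck::RedundantFusion;  // cf. warn_omp_redundant_fusion
  return LoopRangeCheck::Ok;
}
```

Under these assumptions, `looprange(1,2)` on a sequence of three loops passes, `looprange(2,3)` on the same sequence is rejected, and `looprange(1,1)` only triggers the redundancy warning.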
https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 10:06:59 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Fri, 09 May 2025 10:06:59 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e3633.a70a0220.196f34.a036@mx.google.com> https://github.com/eZWALT edited https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 10:08:58 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 09 May 2025 10:08:58 -0700 (PDT) Subject: [flang-commits] [flang] [flang] (PR #139291) In-Reply-To: Message-ID: <681e36aa.170a0220.1330.5224@mx.google.com> https://github.com/klausler approved this pull request. https://github.com/llvm/llvm-project/pull/139291 From flang-commits at lists.llvm.org Fri May 9 10:11:11 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 09 May 2025 10:11:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang] (PR #139291) In-Reply-To: Message-ID: <681e372f.170a0220.9d710.8629@mx.google.com> eugeneepshteyn wrote: I think the Fortran spec actually agrees with this change for `get_command_argument`: "LENGTH (optional) shall be a scalar of type integer with a decimal exponent range of at least four." I read that as requiring at least integer kind 2.
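The "decimal exponent range of at least four" reading can be checked with a little arithmetic. The sketch below is an illustration only: it assumes that integer kind N occupies N bytes (the usual flang convention, not stated in the message) and uses `std::numeric_limits<T>::digits10` as a stand-in for Fortran's `RANGE` of the matching integer kind. The helper name `hasDecimalRangeAtLeast` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// For an integer type, digits10 is the largest r such that every value
// with at most r decimal digits is representable; this is used here as a
// stand-in for Fortran's RANGE() of the corresponding integer kind.
template <typename T>
constexpr bool hasDecimalRangeAtLeast(int r) {
  return std::numeric_limits<T>::digits10 >= r;
}

// Assumption: integer kind 1 ~ int8_t, integer kind 2 ~ int16_t.
static_assert(!hasDecimalRangeAtLeast<std::int8_t>(4),
              "an 8-bit integer (max 127) only has decimal range 2");
static_assert(hasDecimalRangeAtLeast<std::int16_t>(4),
              "a 16-bit integer (max 32767) has decimal range 4");
```

Under that assumption an 8-bit integer falls short of the requirement and a 16-bit integer is the smallest kind that meets it, which matches the "at least integer kind 2" reading.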
https://github.com/llvm/llvm-project/pull/139291 From flang-commits at lists.llvm.org Fri May 9 10:11:42 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Fri, 09 May 2025 10:11:42 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e374e.170a0220.141e58.8597@mx.google.com> https://github.com/eZWALT edited https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 10:12:29 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 09 May 2025 10:12:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] (PR #139291) In-Reply-To: Message-ID: <681e377d.630a0220.291a11.2e60@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/139291 From flang-commits at lists.llvm.org Fri May 9 10:52:07 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 09 May 2025 10:52:07 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Require contiguous actual pointer for contiguous dummy pointer (PR #139298) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/139298 When the actual argument associated with an explicitly CONTIGUOUS pointer dummy argument is itself a pointer, it must also be contiguous. (A non-pointer actual argument can associate with a CONTIGUOUS pointer dummy argument if it's INTENT(IN), and in that case it's still just a warning if we can't prove at compilation time that the actual is contiguous.) Fixes https://github.com/llvm/llvm-project/issues/138899. 
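The rule described in this message can be summarized as a three-way decision. The sketch below is a simplified model, not flang's implementation: `Verdict` and `checkContiguousDummy` are hypothetical names, and the tri-state `std::optional<bool>` (known contiguous, known discontiguous, unknown) is an assumption about how the contiguity analysis reports its result.

```cpp
#include <cassert>
#include <optional>

// Hypothetical outcome of checking an actual argument against a
// CONTIGUOUS pointer dummy argument.
enum class Verdict { Ok, Warning, Error };

// actualIsContiguous: true/false when contiguity is known at compile
// time, std::nullopt when it cannot be determined (an assumption about
// the shape of the analysis result).
// actualIsPointer: whether the actual argument is itself a pointer.
Verdict checkContiguousDummy(std::optional<bool> actualIsContiguous,
                             bool actualIsPointer) {
  if (actualIsContiguous.has_value())
    return *actualIsContiguous ? Verdict::Ok : Verdict::Error;
  // Contiguity unknown: a pointer actual is now an error, while a
  // non-pointer actual still only gets a warning.
  return actualIsPointer ? Verdict::Error : Verdict::Warning;
}
```

Read against the updated `call07.f90` expectations: `a03` is accepted, `a02` (a pointer not known to be contiguous) is an error, `a02(:)` (not a pointer, contiguity unknown) stays a warning, and `a03(::2)` is an error because it is known to be discontiguous.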
>From 1b38bdca8f6d8eac9166bb5ec0a809457736ee19 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 9 May 2025 10:47:34 -0700 Subject: [PATCH] [flang] Require contiguous actual pointer for contiguous dummy pointer When the actual argument associated with an explicitly CONTIGUOUS pointer dummy argument is itself a pointer, it must also be contiguous. (A non-pointer actual argument can associate with a CONTIGUOUS pointer dummy argument if it's INTENT(IN), and in that case it's still just a warning if we can't prove at compilation time that the actual is contiguous.) Fixes https://github.com/llvm/llvm-project/issues/138899. --- flang/lib/Semantics/check-call.cpp | 9 +++++---- flang/lib/Semantics/pointer-assignment.cpp | 15 ++++++++++++++- flang/lib/Semantics/pointer-assignment.h | 2 +- flang/test/Semantics/call07.f90 | 4 +++- 4 files changed, 23 insertions(+), 7 deletions(-) diff --git a/flang/lib/Semantics/check-call.cpp b/flang/lib/Semantics/check-call.cpp index 11928860fea5f..2b1881868b8b3 100644 --- a/flang/lib/Semantics/check-call.cpp +++ b/flang/lib/Semantics/check-call.cpp @@ -754,12 +754,13 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, } } - // Cases when temporaries might be needed but must not be permitted. + bool dummyIsContiguous{ + dummy.attrs.test(characteristics::DummyDataObject::Attr::Contiguous)}; bool actualIsContiguous{IsSimplyContiguous(actual, foldingContext)}; + + // Cases when temporaries might be needed but must not be permitted. 
bool dummyIsAssumedShape{dummy.type.attrs().test( characteristics::TypeAndShape::Attr::AssumedShape)}; - bool dummyIsContiguous{ - dummy.attrs.test(characteristics::DummyDataObject::Attr::Contiguous)}; if ((actualIsAsynchronous || actualIsVolatile) && (dummyIsAsynchronous || dummyIsVolatile) && !dummyIsValue) { if (actualCoarrayRef) { // C1538 @@ -834,7 +835,7 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, if (scope) { semantics::CheckPointerAssignment(context, messages.at(), dummyName, dummy, actual, *scope, - /*isAssumedRank=*/dummyIsAssumedRank); + /*isAssumedRank=*/dummyIsAssumedRank, actualIsPointer); } } else if (!actualIsPointer) { messages.Say( diff --git a/flang/lib/Semantics/pointer-assignment.cpp b/flang/lib/Semantics/pointer-assignment.cpp index 36c9c5b845706..18a61af8c56f3 100644 --- a/flang/lib/Semantics/pointer-assignment.cpp +++ b/flang/lib/Semantics/pointer-assignment.cpp @@ -59,6 +59,7 @@ class PointerAssignmentChecker { PointerAssignmentChecker &set_isBoundsRemapping(bool); PointerAssignmentChecker &set_isAssumedRank(bool); PointerAssignmentChecker &set_pointerComponentLHS(const Symbol *); + PointerAssignmentChecker &set_isRHSPointerActualArgument(bool); bool CheckLeftHandSide(const SomeExpr &); bool Check(const SomeExpr &); @@ -94,6 +95,7 @@ class PointerAssignmentChecker { bool isVolatile_{false}; bool isBoundsRemapping_{false}; bool isAssumedRank_{false}; + bool isRHSPointerActualArgument_{false}; const Symbol *pointerComponentLHS_{nullptr}; }; @@ -133,6 +135,12 @@ PointerAssignmentChecker &PointerAssignmentChecker::set_pointerComponentLHS( return *this; } +PointerAssignmentChecker & +PointerAssignmentChecker::set_isRHSPointerActualArgument(bool isPointerActual) { + isRHSPointerActualArgument_ = isPointerActual; + return *this; +} + bool PointerAssignmentChecker::CharacterizeProcedure() { if (!characterizedProcedure_) { characterizedProcedure_ = true; @@ -221,6 +229,9 @@ bool 
PointerAssignmentChecker::Check(const SomeExpr &rhs) { Say("CONTIGUOUS pointer may not be associated with a discontiguous target"_err_en_US); return false; } + } else if (isRHSPointerActualArgument_) { + Say("CONTIGUOUS pointer dummy argument may not be associated with non-CONTIGUOUS pointer actual argument"_err_en_US); + return false; } else { Warn(common::UsageWarning::PointerToPossibleNoncontiguous, "Target of CONTIGUOUS pointer association is not known to be contiguous"_warn_en_US); @@ -585,12 +596,14 @@ bool CheckStructConstructorPointerComponent(SemanticsContext &context, bool CheckPointerAssignment(SemanticsContext &context, parser::CharBlock source, const std::string &description, const DummyDataObject &lhs, - const SomeExpr &rhs, const Scope &scope, bool isAssumedRank) { + const SomeExpr &rhs, const Scope &scope, bool isAssumedRank, + bool isPointerActualArgument) { return PointerAssignmentChecker{context, scope, source, description} .set_lhsType(common::Clone(lhs.type)) .set_isContiguous(lhs.attrs.test(DummyDataObject::Attr::Contiguous)) .set_isVolatile(lhs.attrs.test(DummyDataObject::Attr::Volatile)) .set_isAssumedRank(isAssumedRank) + .set_isRHSPointerActualArgument(isPointerActualArgument) .Check(rhs); } diff --git a/flang/lib/Semantics/pointer-assignment.h b/flang/lib/Semantics/pointer-assignment.h index 269d64112fd29..ad7c6554d5a13 100644 --- a/flang/lib/Semantics/pointer-assignment.h +++ b/flang/lib/Semantics/pointer-assignment.h @@ -31,7 +31,7 @@ bool CheckPointerAssignment(SemanticsContext &, const SomeExpr &lhs, bool CheckPointerAssignment(SemanticsContext &, parser::CharBlock source, const std::string &description, const evaluate::characteristics::DummyDataObject &, const SomeExpr &rhs, - const Scope &, bool isAssumedRank); + const Scope &, bool isAssumedRank, bool IsPointerActualArgument); bool CheckStructConstructorPointerComponent( SemanticsContext &, const Symbol &lhs, const SomeExpr &rhs, const Scope &); diff --git 
a/flang/test/Semantics/call07.f90 b/flang/test/Semantics/call07.f90 index 3b5c2838fadf7..92f2bdba882d5 100644 --- a/flang/test/Semantics/call07.f90 +++ b/flang/test/Semantics/call07.f90 @@ -27,8 +27,10 @@ subroutine test !PORTABILITY: CONTIGUOUS entity 'scalar' should be an array pointer, assumed-shape, or assumed-rank real, contiguous :: scalar call s01(a03) ! ok - !WARNING: Target of CONTIGUOUS pointer association is not known to be contiguous + !ERROR: CONTIGUOUS pointer dummy argument may not be associated with non-CONTIGUOUS pointer actual argument call s01(a02) + !WARNING: Target of CONTIGUOUS pointer association is not known to be contiguous + call s01(a02(:)) !ERROR: CONTIGUOUS pointer may not be associated with a discontiguous target call s01(a03(::2)) call s02(a02) ! ok From flang-commits at lists.llvm.org Fri May 9 10:52:39 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 10:52:39 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Require contiguous actual pointer for contiguous dummy pointer (PR #139298) In-Reply-To: Message-ID: <681e40e7.170a0220.1d536f.9e22@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Peter Klausler (klausler)
Changes When the actual argument associated with an explicitly CONTIGUOUS pointer dummy argument is itself a pointer, it must also be contiguous. (A non-pointer actual argument can associate with a CONTIGUOUS pointer dummy argument if it's INTENT(IN), and in that case it's still just a warning if we can't prove at compilation time that the actual is contiguous.) Fixes https://github.com/llvm/llvm-project/issues/138899. --- Full diff: https://github.com/llvm/llvm-project/pull/139298.diff 4 Files Affected: - (modified) flang/lib/Semantics/check-call.cpp (+5-4) - (modified) flang/lib/Semantics/pointer-assignment.cpp (+14-1) - (modified) flang/lib/Semantics/pointer-assignment.h (+1-1) - (modified) flang/test/Semantics/call07.f90 (+3-1)
https://github.com/llvm/llvm-project/pull/139298 From flang-commits at lists.llvm.org Fri May 9 11:12:05 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:05 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4575.170a0220.262044.b528@mx.google.com> ================ @@ -5790,7 +5805,11 @@ class OMPReverseDirective final : public OMPLoopTransformationDirective { explicit OMPReverseDirective(SourceLocation StartLoc, SourceLocation EndLoc) : OMPLoopTransformationDirective(OMPReverseDirectiveClass, llvm::omp::OMPD_reverse, StartLoc, - EndLoc, 1) {} + EndLoc, 1) { + // Reverse produces a single top-level canonical loop nest + setNumGeneratedLoops(1); ---------------- alexey-bataev wrote: Should be in a separate patch https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:05 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:05 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4575.170a0220.2ad15a.6cf2@mx.google.com> ================ @@ -962,6 +962,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Number of loops generated by this loop transformation. unsigned NumGeneratedLoops = 0; + /// Number of top level canonical loop nests generated by this loop + /// transformation + unsigned NumGeneratedLoopNests = 0; ---------------- alexey-bataev wrote: Why do you need this new field? 
https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:05 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:05 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4575.170a0220.262044.b52a@mx.google.com> ================ @@ -1151,6 +1151,106 @@ class OMPFullClause final : public OMPNoChildClause { static OMPFullClause *CreateEmpty(const ASTContext &C); }; +/// This class represents the 'looprange' clause in the +/// '#pragma omp fuse' directive +/// +/// \code {c} +/// #pragma omp fuse looprange(1,2) +/// { +/// for(int i = 0; i < 64; ++i) +/// for(int j = 0; j < 256; j+=2) +/// for(int k = 127; k >= 0; --k) +/// \endcode +class OMPLoopRangeClause final : public OMPClause { + friend class OMPClauseReader; + + explicit OMPLoopRangeClause() + : OMPClause(llvm::omp::OMPC_looprange, {}, {}) {} + + /// Location of '(' + SourceLocation LParenLoc; + + /// Location of 'first' + SourceLocation FirstLoc; + + /// Location of 'count' + SourceLocation CountLoc; + + /// Expr associated with 'first' argument + Expr *First = nullptr; + + /// Expr associated with 'count' argument + Expr *Count = nullptr; + + /// Set 'first' + void setFirst(Expr *First) { this->First = First; } + + /// Set 'count' + void setCount(Expr *Count) { this->Count = Count; } + + /// Set location of '('. + void setLParenLoc(SourceLocation Loc) { LParenLoc = Loc; } + + /// Set location of 'first' argument + void setFirstLoc(SourceLocation Loc) { FirstLoc = Loc; } + + /// Set location of 'count' argument + void setCountLoc(SourceLocation Loc) { CountLoc = Loc; } + +public: + /// Build an AST node for a 'looprange' clause + /// + /// \param StartLoc Starting location of the clause. + /// \param LParenLoc Location of '('. 
+ /// \param ModifierLoc Modifier location. + /// \param + static OMPLoopRangeClause * + Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation LParenLoc, + SourceLocation FirstLoc, SourceLocation CountLoc, + SourceLocation EndLoc, Expr *First, Expr *Count); + + /// Build an empty 'looprange' node for deserialization + /// + /// \param C Context of the AST. + static OMPLoopRangeClause *CreateEmpty(const ASTContext &C); + + /// Returns the location of '(' + SourceLocation getLParenLoc() const { return LParenLoc; } + + /// Returns the location of 'first' + SourceLocation getFirstLoc() const { return FirstLoc; } + + /// Returns the location of 'count' + SourceLocation getCountLoc() const { return CountLoc; } + + /// Returns the argument 'first' or nullptr if not set + Expr *getFirst() const { return cast_or_null(First); } + + /// Returns the argument 'count' or nullptr if not set + Expr *getCount() const { return cast_or_null(Count); } + + child_range children() { + return child_range(reinterpret_cast(&First), + reinterpret_cast(&Count) + 1); ---------------- alexey-bataev wrote: It does not work safely, to do this safely you need to store both associated expressions in a single array https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:06 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:06 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4576.170a0220.7f76.8cd4@mx.google.com> ================ @@ -11516,6 +11516,21 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; +def warn_omp_different_loop_ind_var_types : 
Warning < + "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">, ---------------- alexey-bataev wrote: I thought that the number of iterations should be the limitation here, not the type of indices https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:06 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:06 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4576.170a0220.32a5c1.8da2@mx.google.com> ================ @@ -14175,27 +14222,350 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + +public: + explicit NestedLoopCounterVisitor() {} + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; + } + + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; + + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. + // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) 
+ return true; + } +}; + +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { + + VarsWithInheritedDSAType TmpDSA; + QualType BaseInductionVarType; + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { + if (auto *For = dyn_cast(LoopStmt)) { + OriginalInits.back().push_back(For->getInit()); + ForStmts.push_back(For); + // Extract induction variable + if (auto *InitStmt = dyn_cast_or_null(For->getInit())) { + if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { + QualType InductionVarType = InitDecl->getType().getCanonicalType(); + + // Compare with first loop type + if (BaseInductionVarType.isNull()) { + BaseInductionVarType = InductionVarType; + } else if (!Context.hasSameType(BaseInductionVarType, + InductionVarType)) { + Diag(InitDecl->getBeginLoc(), + diag::warn_omp_different_loop_ind_var_types) + << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType + << InductionVarType; + } + } + } + } else { + auto *CXXFor = cast(LoopStmt); + OriginalInits.back().push_back(CXXFor->getBeginStmt()); + ForStmts.push_back(CXXFor); + } + }; + + // Helper lambda functions to encapsulate the processing of different + // derivations of the canonical loop sequence grammar + // + // Modularized code for handling loop generation and transformations + auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &TransformsPreInits, + &LoopCategories, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, 
&ForStmts, &Context, + &LoopSequencePreInits, this](Stmt *Child) { + auto LoopTransform = dyn_cast(Child); ---------------- alexey-bataev wrote: ```suggestion auto *LoopTransform = dyn_cast(Child); ``` https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:06 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:06 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4576.170a0220.59663.b11f@mx.google.com> ================ @@ -14175,27 +14222,350 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + +public: + explicit NestedLoopCounterVisitor() {} + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; + } + + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; + + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. + // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) 
+ return true; + } +}; + +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { + + VarsWithInheritedDSAType TmpDSA; + QualType BaseInductionVarType; + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { + if (auto *For = dyn_cast(LoopStmt)) { + OriginalInits.back().push_back(For->getInit()); + ForStmts.push_back(For); + // Extract induction variable + if (auto *InitStmt = dyn_cast_or_null(For->getInit())) { + if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { + QualType InductionVarType = InitDecl->getType().getCanonicalType(); + + // Compare with first loop type + if (BaseInductionVarType.isNull()) { + BaseInductionVarType = InductionVarType; + } else if (!Context.hasSameType(BaseInductionVarType, + InductionVarType)) { + Diag(InitDecl->getBeginLoc(), + diag::warn_omp_different_loop_ind_var_types) + << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType + << InductionVarType; + } + } + } + } else { + auto *CXXFor = cast(LoopStmt); + OriginalInits.back().push_back(CXXFor->getBeginStmt()); + ForStmts.push_back(CXXFor); + } + }; + + // Helper lambda functions to encapsulate the processing of different + // derivations of the canonical loop sequence grammar + // + // Modularized code for handling loop generation and transformations + auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &TransformsPreInits, + &LoopCategories, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, 
&ForStmts, &Context, + &LoopSequencePreInits, this](Stmt *Child) { + auto LoopTransform = dyn_cast(Child); + Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); + unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); + unsigned NumGeneratedLoops = LoopTransform->getNumGeneratedLoops(); + // Handle the case where transformed statement is not available due to + // dependent contexts + if (!TransformedStmt) { + if (NumGeneratedLoopNests > 0) { + LoopSeqSize += NumGeneratedLoopNests; + NumLoops += NumGeneratedLoops; + return true; + } + // Unroll full (0 loops produced) + else { ---------------- alexey-bataev wrote: ```suggestion } else { // Unroll full (0 loops produced) ``` https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:06 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:06 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4576.170a0220.d71a0.8112@mx.google.com> ================ @@ -14145,6 +14152,46 @@ StmtResult SemaOpenMP::ActOnOpenMPTargetTeamsDistributeSimdDirective( getASTContext(), StartLoc, EndLoc, NestedLoopCount, Clauses, AStmt, B); } +// Overloaded base case function ---------------- alexey-bataev wrote: ```suggestion /// Overloaded base case function ``` https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:06 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:06 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: 
<681e4576.050a0220.19b7df.ab1c@mx.google.com> ================ @@ -14175,27 +14222,350 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + +public: + explicit NestedLoopCounterVisitor() {} + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; + } + + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; + + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. 
+ // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } ---------------- alexey-bataev wrote: ```suggestion if (isa(S)) return DynamicRecursiveASTVisitor::TraverseStmt(S); ``` https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:06 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:06 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4576.a70a0220.218476.b6c6@mx.google.com> ================ @@ -14175,27 +14222,350 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. 
The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. +// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + +public: + explicit NestedLoopCounterVisitor() {} + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; + } + + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; + + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. + // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) 
+ return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) + return true; + } +}; + +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { + + VarsWithInheritedDSAType TmpDSA; + QualType BaseInductionVarType; + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { ---------------- alexey-bataev wrote: ```suggestion auto StoreLoopStatements = [&](Stmt *LoopStmt) { ``` https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:06 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:06 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4576.170a0220.176244.7d92@mx.google.com> ================ @@ -5378,6 +5379,10 @@ class CodeGenFunction : public CodeGenTypeCache { /// Set the address of a local variable. 
void setAddrOfLocalVar(const VarDecl *VD, Address Addr) { + if (LocalDeclMap.count(VD)) { + llvm::errs() << "Warning: VarDecl already exists in map: "; + VD->dumpColor(); + } ---------------- alexey-bataev wrote: Remove https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:06 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:06 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4576.a70a0220.145309.aa6c@mx.google.com> ================ @@ -15451,6 +15819,500 @@ StmtResult SemaOpenMP::ActOnOpenMPInterchangeDirective( buildPreInits(Context, PreInits)); } +StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, + Stmt *AStmt, + SourceLocation StartLoc, + SourceLocation EndLoc) { + + ASTContext &Context = getASTContext(); + DeclContext *CurrContext = SemaRef.CurContext; + Scope *CurScope = SemaRef.getCurScope(); + CaptureVars CopyTransformer(SemaRef); + + // Ensure the structured block is not empty + if (!AStmt) { + return StmtError(); + } + + unsigned NumLoops = 1; + unsigned LoopSeqSize = 1; + + // Defer transformation in dependent contexts + // The NumLoopNests argument is set to a placeholder 1 (even though + // using looprange fuse could yield up to 3 top level loop nests) + // because a dependent context could prevent determining its true value + if (CurrContext->isDependentContext()) { + return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, + NumLoops, LoopSeqSize, AStmt, nullptr, + nullptr); + } + + // Validate that the potential loop sequence is transformable for fusion + // Also collect the HelperExprs, Loop Stmts, Inits, and Number of loops + SmallVector LoopHelpers; + SmallVector LoopStmts; + SmallVector> OriginalInits; + SmallVector> 
TransformsPreInits; + SmallVector> LoopSequencePreInits; + SmallVector LoopCategories; + if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, + LoopHelpers, LoopStmts, OriginalInits, + TransformsPreInits, LoopSequencePreInits, + LoopCategories, Context)) { + return StmtError(); + } + + // Handle clauses, which can be any of the following: [looprange, apply] + const OMPLoopRangeClause *LRC = + OMPExecutableDirective::getSingleClause(Clauses); + + // The clause arguments are invalidated if any error arises + // such as non-constant or non-positive arguments + if (LRC && (!LRC->getFirst() || !LRC->getCount())) + return StmtError(); + + // Delayed semantic check of LoopRange constraint + // Evaluates the loop range arguments and returns the first and count values + auto EvaluateLoopRangeArguments = [&Context](Expr *First, Expr *Count, + uint64_t &FirstVal, + uint64_t &CountVal) { + llvm::APSInt FirstInt = First->EvaluateKnownConstInt(Context); + llvm::APSInt CountInt = Count->EvaluateKnownConstInt(Context); + FirstVal = FirstInt.getZExtValue(); + CountVal = CountInt.getZExtValue(); + }; + + // OpenMP [6.0, Restrictions] + // first + count - 1 must not evaluate to a value greater than the + // loop sequence length of the associated canonical loop sequence. + auto ValidLoopRange = [](uint64_t FirstVal, uint64_t CountVal, + unsigned NumLoops) -> bool { + return FirstVal + CountVal - 1 <= NumLoops; + }; + uint64_t FirstVal = 1, CountVal = 0, LastVal = LoopSeqSize; + + // Validates the loop range after evaluating the semantic information + // and ensures that the range is valid for the given loop sequence size. + // Expressions are evaluated at compile time to obtain constant values. 
+ if (LRC) { + EvaluateLoopRangeArguments(LRC->getFirst(), LRC->getCount(), FirstVal, + CountVal); + if (CountVal == 1) + SemaRef.Diag(LRC->getCountLoc(), diag::warn_omp_redundant_fusion) + << getOpenMPDirectiveName(OMPD_fuse); + + if (!ValidLoopRange(FirstVal, CountVal, LoopSeqSize)) { + SemaRef.Diag(LRC->getFirstLoc(), diag::err_omp_invalid_looprange) + << getOpenMPDirectiveName(OMPD_fuse) << (FirstVal + CountVal - 1) + << LoopSeqSize; + return StmtError(); + } + + LastVal = FirstVal + CountVal - 1; + } + + // Complete fusion generates a single canonical loop nest + // However looprange clause generates several loop nests + unsigned NumLoopNests = LRC ? LoopSeqSize - CountVal + 1 : 1; + + // Emit a warning for redundant loop fusion when the sequence contains only + // one loop. + if (LoopSeqSize == 1) + SemaRef.Diag(AStmt->getBeginLoc(), diag::warn_omp_redundant_fusion) + << getOpenMPDirectiveName(OMPD_fuse); + + assert(LoopHelpers.size() == LoopSeqSize && + "Expecting loop iteration space dimensionality to match number of " + "affected loops"); + assert(OriginalInits.size() == LoopSeqSize && + "Expecting loop iteration space dimensionality to match number of " + "affected loops"); + + // Select the type with the largest bit width among all induction variables + QualType IVType = LoopHelpers[FirstVal - 1].IterationVarRef->getType(); + for (unsigned int I = FirstVal; I < LastVal; ++I) { + QualType CurrentIVType = LoopHelpers[I].IterationVarRef->getType(); + if (Context.getTypeSize(CurrentIVType) > Context.getTypeSize(IVType)) { + IVType = CurrentIVType; + } + } + uint64_t IVBitWidth = Context.getIntWidth(IVType); + + // Create pre-init declarations for all loops lower bounds, upper bounds, + // strides and num-iterations for every top level loop in the fusion + SmallVector LBVarDecls; + SmallVector STVarDecls; + SmallVector NIVarDecls; + SmallVector UBVarDecls; + SmallVector IVVarDecls; + + // Helper lambda to create variables for bounds, strides, and other + // 
expressions. Generates both the variable declaration and the corresponding + // initialization statement. + auto CreateHelperVarAndStmt = + [&SemaRef = this->SemaRef, &Context, &CopyTransformer, + &IVType](Expr *ExprToCopy, const std::string &BaseName, unsigned I, ---------------- alexey-bataev wrote: ```suggestion [&, &SemaRef = SemaRef](Expr *ExprToCopy, const std::string &BaseName, unsigned I, ``` https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:07 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:07 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4577.050a0220.196370.a9f9@mx.google.com> ================ @@ -0,0 +1,186 @@ +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -std=c++20 -fopenmp -fopenmp-version=60 -fsyntax-only -Wuninitialized -verify %s + +void func() { + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a loop sequence containing canonical loops or loop-generating constructs}} + #pragma omp fuse + ; + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + {int bar = 0;} + + // expected-error@+4 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + { + for(int i = 0; i < 10; ++i); + int x = 2; + } + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a loop sequence containing canonical loops or loop-generating constructs}} + #pragma omp fuse + #pragma omp for + for (int i = 0; i < 7; ++i) + ; + + { + // expected-error@+2 {{expected statement}} + #pragma omp fuse + } + + // expected-warning@+1 {{extra tokens at the end of '#pragma omp fuse' are ignored}} + #pragma omp fuse foo + { + for (int i = 0; i < 7; ++i) + ; + for(int j = 0; j < 
100; ++j); + + } + + + // expected-error@+1 {{unexpected OpenMP clause 'final' in directive '#pragma omp fuse'}} + #pragma omp fuse final(0) + { + for (int i = 0; i < 7; ++i) + ; + for(int j = 0; j < 100; ++j); + + } + + //expected-error@+4 {{loop after '#pragma omp fuse' is not in canonical form}} ---------------- alexey-bataev wrote: Why do you need the second message, if there's already another one, requiring canonical loop form? https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:07 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:07 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4577.170a0220.efc04.86fb@mx.google.com> ================ @@ -14145,6 +14152,46 @@ StmtResult SemaOpenMP::ActOnOpenMPTargetTeamsDistributeSimdDirective( getASTContext(), StartLoc, EndLoc, NestedLoopCount, Clauses, AStmt, B); } +// Overloaded base case function +template static bool tryHandleAs(T *t, F &&) { + return false; +} + +/** + * Tries to recursively cast `t` to one of the given types and invokes `f` if + * successful. + * + * @tparam Class The first type to check. + * @tparam Rest The remaining types to check. + * @tparam T The base type of `t`. + * @tparam F The callable type for the function to invoke upon a successful + * cast. + * @param t The object to be checked. + * @param f The function to invoke if `t` matches `Class`. + * @return `true` if `t` matched any type and `f` was called, otherwise `false`. 
+ */ ---------------- alexey-bataev wrote: Use `///` style https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:07 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:07 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4577.170a0220.2598dd.564e@mx.google.com> ================ @@ -14175,27 +14222,350 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + +public: + explicit NestedLoopCounterVisitor() {} + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; + } + + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; + + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. + // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) 
+ return true; + } +}; + +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { + + VarsWithInheritedDSAType TmpDSA; + QualType BaseInductionVarType; + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { + if (auto *For = dyn_cast(LoopStmt)) { + OriginalInits.back().push_back(For->getInit()); + ForStmts.push_back(For); + // Extract induction variable + if (auto *InitStmt = dyn_cast_or_null(For->getInit())) { + if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { + QualType InductionVarType = InitDecl->getType().getCanonicalType(); + + // Compare with first loop type + if (BaseInductionVarType.isNull()) { + BaseInductionVarType = InductionVarType; + } else if (!Context.hasSameType(BaseInductionVarType, + InductionVarType)) { + Diag(InitDecl->getBeginLoc(), + diag::warn_omp_different_loop_ind_var_types) + << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType + << InductionVarType; + } + } + } + } else { + auto *CXXFor = cast(LoopStmt); + OriginalInits.back().push_back(CXXFor->getBeginStmt()); + ForStmts.push_back(CXXFor); + } + }; + + // Helper lambda functions to encapsulate the processing of different + // derivations of the canonical loop sequence grammar + // + // Modularized code for handling loop generation and transformations + auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &TransformsPreInits, + &LoopCategories, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, 
&ForStmts, &Context, + &LoopSequencePreInits, this](Stmt *Child) { ---------------- alexey-bataev wrote: ```suggestion auto AnalyzeLoopGeneration = [&](Stmt *Child) { ``` https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:07 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:07 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4577.a70a0220.2b34fd.b6c1@mx.google.com> ================ @@ -14175,27 +14222,350 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + +public: + explicit NestedLoopCounterVisitor() {} + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; + } + + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; + + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. + // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) 
+ return true; + } +}; + +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { + + VarsWithInheritedDSAType TmpDSA; + QualType BaseInductionVarType; + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { + if (auto *For = dyn_cast(LoopStmt)) { + OriginalInits.back().push_back(For->getInit()); + ForStmts.push_back(For); + // Extract induction variable + if (auto *InitStmt = dyn_cast_or_null(For->getInit())) { + if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { + QualType InductionVarType = InitDecl->getType().getCanonicalType(); + + // Compare with first loop type + if (BaseInductionVarType.isNull()) { + BaseInductionVarType = InductionVarType; + } else if (!Context.hasSameType(BaseInductionVarType, + InductionVarType)) { + Diag(InitDecl->getBeginLoc(), + diag::warn_omp_different_loop_ind_var_types) + << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType + << InductionVarType; + } + } + } + } else { + auto *CXXFor = cast(LoopStmt); + OriginalInits.back().push_back(CXXFor->getBeginStmt()); + ForStmts.push_back(CXXFor); + } + }; + + // Helper lambda functions to encapsulate the processing of different ---------------- alexey-bataev wrote: Use `\\\` https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:08 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:08 -0700 (PDT) Subject: [flang-commits] 
[clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4578.630a0220.2bf84e.3d5c@mx.google.com> ================ @@ -14175,27 +14222,350 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + +public: + explicit NestedLoopCounterVisitor() {} + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; + } + + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; + + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. + // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) 
+ return true; + } +}; + +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { + + VarsWithInheritedDSAType TmpDSA; + QualType BaseInductionVarType; + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { + if (auto *For = dyn_cast(LoopStmt)) { + OriginalInits.back().push_back(For->getInit()); + ForStmts.push_back(For); + // Extract induction variable + if (auto *InitStmt = dyn_cast_or_null(For->getInit())) { + if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { + QualType InductionVarType = InitDecl->getType().getCanonicalType(); + + // Compare with first loop type + if (BaseInductionVarType.isNull()) { + BaseInductionVarType = InductionVarType; + } else if (!Context.hasSameType(BaseInductionVarType, + InductionVarType)) { + Diag(InitDecl->getBeginLoc(), + diag::warn_omp_different_loop_ind_var_types) + << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType + << InductionVarType; + } + } + } + } else { + auto *CXXFor = cast(LoopStmt); + OriginalInits.back().push_back(CXXFor->getBeginStmt()); + ForStmts.push_back(CXXFor); + } + }; + + // Helper lambda functions to encapsulate the processing of different + // derivations of the canonical loop sequence grammar + // + // Modularized code for handling loop generation and transformations + auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &TransformsPreInits, + &LoopCategories, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, 
&ForStmts, &Context, + &LoopSequencePreInits, this](Stmt *Child) { + auto LoopTransform = dyn_cast(Child); + Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); + unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); + unsigned NumGeneratedLoops = LoopTransform->getNumGeneratedLoops(); + // Handle the case where transformed statement is not available due to + // dependent contexts + if (!TransformedStmt) { + if (NumGeneratedLoopNests > 0) { + LoopSeqSize += NumGeneratedLoopNests; + NumLoops += NumGeneratedLoops; + return true; + } + // Unroll full (0 loops produced) + else { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + } + // Handle loop transformations with multiple loop nests + // Unroll full + if (NumGeneratedLoopNests <= 0) { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + // Loop transformatons such as split or loopranged fuse + else if (NumGeneratedLoopNests > 1) { + // Get the preinits related to this loop sequence generating + // loop transformation (i.e loopranged fuse, split...) 
+ LoopSequencePreInits.emplace_back(); + // These preinits differ slightly from regular inits/pre-inits related + // to single loop generating loop transformations (interchange, unroll) + // given that they are not bounded to a particular loop nest + // so they need to be treated independently + updatePreInits(LoopTransform, LoopSequencePreInits); + return analyzeLoopSequence(TransformedStmt, LoopSeqSize, NumLoops, + LoopHelpers, ForStmts, OriginalInits, + TransformsPreInits, LoopSequencePreInits, + LoopCategories, Context, Kind); + } + // Vast majority: (Tile, Unroll, Stripe, Reverse, Interchange, Fuse all) + else { + // Process the transformed loop statement + OriginalInits.emplace_back(); + TransformsPreInits.emplace_back(); + LoopHelpers.emplace_back(); + LoopCategories.push_back(OMPLoopCategory::TransformSingleLoop); + + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, TransformedStmt, SemaRef, + *DSAStack, TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(TransformedStmt->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + storeLoopStatements(TransformedStmt); + updatePreInits(LoopTransform, TransformsPreInits); + + NumLoops += NumGeneratedLoops; + ++LoopSeqSize; + return true; + } + }; + + // Modularized code for handling regular canonical loops + auto analyzeRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, + &LoopSeqSize, &NumLoops, Kind, &TmpDSA, + &LoopCategories, this](Stmt *Child) { + OriginalInits.emplace_back(); + LoopHelpers.emplace_back(); + LoopCategories.push_back(OMPLoopCategory::RegularLoop); + + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, + TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(Child->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + + storeLoopStatements(Child); + auto NLCV = NestedLoopCounterVisitor(); + 
NLCV.TraverseStmt(Child); + NumLoops += NLCV.getNestedLoopCount(); + return true; + }; + + // Helper functions to validate canonical loop sequence grammar is valid + auto isLoopSequenceDerivation = [](auto *Child) { + return isa(Child) || isa(Child) || + isa(Child); + }; + auto isLoopGeneratingStmt = [](auto *Child) { + return isa(Child); + }; + + // High level grammar validation + for (auto *Child : LoopSeqStmt->children()) { + + if (!Child) + continue; + + // Skip over non-loop-sequence statements + if (!isLoopSequenceDerivation(Child)) { + Child = Child->IgnoreContainers(); + + // Ignore empty compound statement + if (!Child) + continue; + + // In the case of a nested loop sequence ignoring containers would not + // be enough, a recurisve transversal of the loop sequence is required + if (isa(Child)) { + if (!analyzeLoopSequence(Child, LoopSeqSize, NumLoops, LoopHelpers, + ForStmts, OriginalInits, TransformsPreInits, + LoopSequencePreInits, LoopCategories, Context, + Kind)) + return false; + // Already been treated, skip this children + continue; + } + } + // Regular loop sequence handling + if (isLoopSequenceDerivation(Child)) { + if (isLoopGeneratingStmt(Child)) { + if (!analyzeLoopGeneration(Child)) { + return false; + } + // analyzeLoopGeneration updates Loop Sequence size accordingly + + } else { + if (!analyzeRegularLoop(Child)) { + return false; + } + // Update the Loop Sequence size by one + ++LoopSeqSize; + } + } else { + // Report error for invalid statement inside canonical loop sequence + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + } + return true; +} + +bool SemaOpenMP::checkTransformableLoopSequence( + OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, + unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl 
&LoopCategories, ASTContext &Context) { + + // Checks whether the given statement is a compound statement + if (!isa(AStmt)) { + Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) + << getOpenMPDirectiveName(Kind); + return false; + } + // Number of top level canonical loop nests observed (And acts as index) + LoopSeqSize = 0; + // Number of total observed loops + NumLoops = 0; + + // Following OpenMP 6.0 API Specification, a Canonical Loop Sequence follows + // the grammar: + // + // canonical-loop-sequence: + // { + // loop-sequence+ + // } + // where loop-sequence can be any of the following: + // 1. canonical-loop-sequence + // 2. loop-nest + // 3. loop-sequence-generating-construct (i.e OMPLoopTransformationDirective) + // + // To recognise and traverse this structure the following helper functions + // have been defined. analyzeLoopSequence serves as the recurisve entry point + // and tries to match the input AST to the canonical loop sequence grammar + // structure. This function will perform both a semantic and syntactical + // analysis of the given statement according to OpenMP 6.0 definition of + // the aforementioned canonical loop sequence + + // Recursive entry point to process the main loop sequence + if (!analyzeLoopSequence(AStmt, LoopSeqSize, NumLoops, LoopHelpers, ForStmts, + OriginalInits, TransformsPreInits, + LoopSequencePreInits, LoopCategories, Context, + Kind)) { + return false; + } ---------------- alexey-bataev wrote: ```suggestion Kind)) return false; ``` https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:09 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:09 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: 
<681e4579.170a0220.1166a9.77bf@mx.google.com> ================ @@ -14175,27 +14222,350 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + +public: + explicit NestedLoopCounterVisitor() {} + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; + } + + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; + + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. + // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) 
+ return true; + } +}; + +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { + + VarsWithInheritedDSAType TmpDSA; + QualType BaseInductionVarType; + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { + if (auto *For = dyn_cast(LoopStmt)) { + OriginalInits.back().push_back(For->getInit()); + ForStmts.push_back(For); + // Extract induction variable + if (auto *InitStmt = dyn_cast_or_null(For->getInit())) { + if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { + QualType InductionVarType = InitDecl->getType().getCanonicalType(); + + // Compare with first loop type + if (BaseInductionVarType.isNull()) { + BaseInductionVarType = InductionVarType; + } else if (!Context.hasSameType(BaseInductionVarType, + InductionVarType)) { + Diag(InitDecl->getBeginLoc(), + diag::warn_omp_different_loop_ind_var_types) + << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType + << InductionVarType; + } + } + } + } else { + auto *CXXFor = cast(LoopStmt); + OriginalInits.back().push_back(CXXFor->getBeginStmt()); + ForStmts.push_back(CXXFor); + } + }; + + // Helper lambda functions to encapsulate the processing of different + // derivations of the canonical loop sequence grammar + // + // Modularized code for handling loop generation and transformations + auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &TransformsPreInits, + &LoopCategories, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, 
&ForStmts, &Context, + &LoopSequencePreInits, this](Stmt *Child) { + auto LoopTransform = dyn_cast(Child); + Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); + unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); + unsigned NumGeneratedLoops = LoopTransform->getNumGeneratedLoops(); + // Handle the case where transformed statement is not available due to + // dependent contexts + if (!TransformedStmt) { + if (NumGeneratedLoopNests > 0) { + LoopSeqSize += NumGeneratedLoopNests; + NumLoops += NumGeneratedLoops; + return true; + } + // Unroll full (0 loops produced) + else { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + } + // Handle loop transformations with multiple loop nests + // Unroll full + if (NumGeneratedLoopNests <= 0) { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + // Loop transformatons such as split or loopranged fuse + else if (NumGeneratedLoopNests > 1) { + // Get the preinits related to this loop sequence generating + // loop transformation (i.e loopranged fuse, split...) 
+ LoopSequencePreInits.emplace_back(); + // These preinits differ slightly from regular inits/pre-inits related + // to single loop generating loop transformations (interchange, unroll) + // given that they are not bounded to a particular loop nest + // so they need to be treated independently + updatePreInits(LoopTransform, LoopSequencePreInits); + return analyzeLoopSequence(TransformedStmt, LoopSeqSize, NumLoops, + LoopHelpers, ForStmts, OriginalInits, + TransformsPreInits, LoopSequencePreInits, + LoopCategories, Context, Kind); + } + // Vast majority: (Tile, Unroll, Stripe, Reverse, Interchange, Fuse all) + else { + // Process the transformed loop statement + OriginalInits.emplace_back(); + TransformsPreInits.emplace_back(); + LoopHelpers.emplace_back(); + LoopCategories.push_back(OMPLoopCategory::TransformSingleLoop); + + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, TransformedStmt, SemaRef, + *DSAStack, TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(TransformedStmt->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + storeLoopStatements(TransformedStmt); + updatePreInits(LoopTransform, TransformsPreInits); + + NumLoops += NumGeneratedLoops; + ++LoopSeqSize; + return true; + } + }; + + // Modularized code for handling regular canonical loops + auto analyzeRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, + &LoopSeqSize, &NumLoops, Kind, &TmpDSA, + &LoopCategories, this](Stmt *Child) { + OriginalInits.emplace_back(); + LoopHelpers.emplace_back(); + LoopCategories.push_back(OMPLoopCategory::RegularLoop); + + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, + TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(Child->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + + storeLoopStatements(Child); + auto NLCV = NestedLoopCounterVisitor(); + 
NLCV.TraverseStmt(Child); + NumLoops += NLCV.getNestedLoopCount(); + return true; + }; + + // Helper functions to validate canonical loop sequence grammar is valid + auto isLoopSequenceDerivation = [](auto *Child) { + return isa(Child) || isa(Child) || + isa(Child); + }; + auto isLoopGeneratingStmt = [](auto *Child) { ---------------- alexey-bataev wrote: ```suggestion auto IsLoopGeneratingStmt = [](auto *Child) { ``` https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:10 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 11:12:10 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e457a.a70a0220.145309.aa73@mx.google.com> ================ @@ -1480,6 +1493,108 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, Stmt *&Body, SmallVectorImpl> &OriginalInits); + /// @brief Categories of loops encountered during semantic OpenMP loop + /// analysis + /// + /// This enumeration identifies the structural category of a loop or sequence + /// of loops analyzed in the context of OpenMP transformations and directives. + /// This categorization helps differentiate between original source loops + /// and the structures resulting from applying OpenMP loop transformations. 
+ enum class OMPLoopCategory { + + /// @var OMPLoopCategory::RegularLoop + /// Represents a standard canonical loop nest found in the + /// original source code or an intact loop after transformations + /// (i.e Post/Pre loops of a loopranged fusion) + RegularLoop, + + /// @var OMPLoopCategory::TransformSingleLoop + /// Represents the resulting loop structure when an OpenMP loop + // transformation, generates a single, top-level loop + TransformSingleLoop, + + /// @var OMPLoopCategory::TransformLoopSequence + /// Represents the resulting loop structure when an OpenMP loop + /// transformation + /// generates a sequence of two or more canonical loop nests + TransformLoopSequence + }; + + /// The main recursive process of `checkTransformableLoopSequence` that + /// performs grammatical parsing of a canonical loop sequence. It extracts + /// key information, such as the number of top-level loops, loop statements, + /// helper expressions, and other relevant loop-related data, all in a single + /// execution to avoid redundant traversals. This analysis flattens inner + /// Loop Sequences + /// + /// \param LoopSeqStmt The AST of the original statement. + /// \param LoopSeqSize [out] Number of top level canonical loops. + /// \param NumLoops [out] Number of total canonical loops (nested too). + /// \param LoopHelpers [out] The multiple loop analyses results. + /// \param ForStmts [out] The multiple Stmt of each For loop. 
+ /// \param OriginalInits [out] The raw original initialization statements + /// of each belonging to a loop of the loop sequence + /// \param TransformPreInits [out] The multiple collection of statements and + /// declarations that must have been executed/declared + /// before entering the loop (each belonging to a + /// particular loop transformation, nullptr otherwise) + /// \param LoopSequencePreInits [out] Additional general collection of loop + /// transformation related statements and declarations + /// not bounded to a particular loop that must be + /// executed before entering the loop transformation + /// \param LoopCategories [out] A sequence of OMPLoopCategory values, + /// one for each loop or loop transformation node + /// successfully analyzed. + /// \param Context + /// \param Kind The loop transformation directive kind. + /// \return Whether the original statement is both syntactically and + /// semantically correct according to OpenMP 6.0 canonical loop + /// sequence definition. + bool analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, ---------------- alexey-bataev wrote: ```suggestion SmallVectorImpl> &OriginalInits, SmallVectorImpl> &TransformsPreInits, SmallVectorImpl> &LoopSequencePreInits, ``` https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:12:38 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 09 May 2025 11:12:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. 
(PR #136012) In-Reply-To: Message-ID: <681e4596.170a0220.2b4042.7c40@mx.google.com> https://github.com/akuhlens closed https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Fri May 9 11:35:05 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Fri, 09 May 2025 11:35:05 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4ad9.170a0220.215e02.a108@mx.google.com> ================ @@ -3223,6 +3223,8 @@ LValue CodeGenFunction::EmitDeclRefLValue(const DeclRefExpr *E) { // No other cases for now. } else { + llvm::dbgs() << "THE DAMN DECLREFEXPR HASN'T BEEN ENTERED IN LOCALDECLMAP\n"; + VD->dumpColor(); ---------------- eZWALT wrote: Oops, thanks! https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:35:11 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Fri, 09 May 2025 11:35:11 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4adf.630a0220.7f314.500d@mx.google.com> ================ @@ -5378,6 +5379,10 @@ class CodeGenFunction : public CodeGenTypeCache { /// Set the address of a local variable. void setAddrOfLocalVar(const VarDecl *VD, Address Addr) { + if (LocalDeclMap.count(VD)) { + llvm::errs() << "Warning: VarDecl already exists in map: "; + VD->dumpColor(); + } ---------------- eZWALT wrote: Oops, thanks! 
https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:37:01 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Fri, 09 May 2025 11:37:01 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4b4d.170a0220.d71f2.8d9c@mx.google.com> ================ @@ -5790,7 +5805,11 @@ class OMPReverseDirective final : public OMPLoopTransformationDirective { explicit OMPReverseDirective(SourceLocation StartLoc, SourceLocation EndLoc) : OMPLoopTransformationDirective(OMPReverseDirectiveClass, llvm::omp::OMPD_reverse, StartLoc, - EndLoc, 1) {} + EndLoc, 1) { + // Reverse produces a single top-level canonical loop nest + setNumGeneratedLoops(1); ---------------- eZWALT wrote: Okay, do I make another pull request just for the loop corrections, though? https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:40:51 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Fri, 09 May 2025 11:40:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <681e4c33.a70a0220.f979.c6bf@mx.google.com> abidh wrote: Polite ping.
https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Fri May 9 11:49:58 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Fri, 09 May 2025 11:49:58 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4e56.050a0220.ff745.bfff@mx.google.com> ================ @@ -962,6 +962,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Number of loops generated by this loop transformation. unsigned NumGeneratedLoops = 0; + /// Number of top level canonical loop nests generated by this loop + /// transformation + unsigned NumGeneratedLoopNests = 0; ---------------- eZWALT wrote: Maybe the name is a bit unfortunate and could be improved, but these are two conceptually different fields. The top-level loop nests are the ones actually managed by loop-sequence constructs like fuse and the upcoming split. A loop sequence contains loops that may themselves contain several inner nested loops, but those inner loops should not be taken into account when performing fusion or splitting. This distinction was not needed originally because every transformation generated a fixed number of top-level nests (1). However, fuse or split may generate several loop nests, each with inner nested loops.
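A minimal sketch of the distinction drawn in the comment above, counting top-level loop nests separately from the total number of loops. The names `ToyLoop`, `ToyCounts`, `countLoopsInNest`, `countSequence` and `makeExampleSequence` are hypothetical and do not exist in Clang; the toy tree only stands in for a loop sequence and is not how the PR stores or computes these values.

```cpp
#include <vector>

// Hypothetical stand-in for one loop of a loop sequence; not a Clang AST node.
struct ToyLoop {
  std::vector<ToyLoop> Inner; // loops nested directly in this loop's body
};

// Counts L itself plus every loop nested inside it, at any depth.
unsigned countLoopsInNest(const ToyLoop &L) {
  unsigned Count = 1;
  for (const ToyLoop &Child : L.Inner)
    Count += countLoopsInNest(Child);
  return Count;
}

// The two quantities the comment distinguishes, for a whole sequence.
struct ToyCounts {
  unsigned LoopNests = 0; // top-level nests only
  unsigned Loops = 0;     // every loop, including inner nested ones
};

ToyCounts countSequence(const std::vector<ToyLoop> &Sequence) {
  ToyCounts C;
  for (const ToyLoop &Nest : Sequence) {
    ++C.LoopNests;                     // one more top-level nest
    C.Loops += countLoopsInNest(Nest); // plus all loops inside that nest
  }
  return C;
}

// Builds a sequence of two nests: a 2-deep nest followed by a 3-deep nest.
std::vector<ToyLoop> makeExampleSequence() {
  ToyLoop Leaf;
  ToyLoop TwoDeep;
  TwoDeep.Inner.push_back(Leaf);
  ToyLoop ThreeDeep;
  ThreeDeep.Inner.push_back(TwoDeep);
  return {TwoDeep, ThreeDeep};
}
```

For the example sequence, the nest count is 2 while the loop count is 5, which is the kind of divergence that cannot occur when every transformation yields exactly one top-level nest.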
https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 11:50:51 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Fri, 09 May 2025 11:50:51 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e4e8b.170a0220.17bb0c.a33d@mx.google.com> ================ @@ -962,6 +962,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Number of loops generated by this loop transformation. unsigned NumGeneratedLoops = 0; + /// Number of top level canonical loop nests generated by this loop + /// transformation + unsigned NumGeneratedLoopNests = 0; ---------------- eZWALT wrote: Note that unroll is an exception: it could have 0 or 1, but it coincides perfectly with the original number of loops. https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 12:54:53 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 09 May 2025 12:54:53 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Require contiguous actual pointer for contiguous dummy pointer (PR #139298) In-Reply-To: Message-ID: <681e5d8d.a70a0220.3cb45c.d20e@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. 
https://github.com/llvm/llvm-project/pull/139298 From flang-commits at lists.llvm.org Fri May 9 13:22:58 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 13:22:58 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e6422.170a0220.261c1d.d440@mx.google.com> ================ @@ -5790,7 +5805,11 @@ class OMPReverseDirective final : public OMPLoopTransformationDirective { explicit OMPReverseDirective(SourceLocation StartLoc, SourceLocation EndLoc) : OMPLoopTransformationDirective(OMPReverseDirectiveClass, llvm::omp::OMPD_reverse, StartLoc, - EndLoc, 1) {} + EndLoc, 1) { + // Reverse produces a single top-level canonical loop nest + setNumGeneratedLoops(1); ---------------- alexey-bataev wrote: Yes https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 13:23:28 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Fri, 09 May 2025 13:23:28 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681e6440.170a0220.1ef954.ca4e@mx.google.com> ================ @@ -962,6 +962,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Number of loops generated by this loop transformation. unsigned NumGeneratedLoops = 0; + /// Number of top level canonical loop nests generated by this loop + /// transformation + unsigned NumGeneratedLoopNests = 0; ---------------- alexey-bataev wrote: The question is how it is used. 
I did not see it is being read anywhere https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Fri May 9 13:56:13 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 09 May 2025 13:56:13 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Emit error when DEFERRED binding overrides non-DEFERRED (PR #139325) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/139325 Fixes https://github.com/llvm/llvm-project/issues/138915. >From 30f50f232d79729991efadcf2dcf56a3b34d69de Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 9 May 2025 13:54:36 -0700 Subject: [PATCH] [flang] Emit error when DEFERRED binding overrides non-DEFERRED Fixes https://github.com/llvm/llvm-project/issues/138915. --- flang/lib/Evaluate/tools.cpp | 18 ++++++++---------- flang/lib/Semantics/check-declarations.cpp | 12 +++++++++--- flang/test/Semantics/bug138915.f90 | 15 +++++++++++++++ 3 files changed, 32 insertions(+), 13 deletions(-) create mode 100644 flang/test/Semantics/bug138915.f90 diff --git a/flang/lib/Evaluate/tools.cpp b/flang/lib/Evaluate/tools.cpp index 702711e3cff53..865020e050b03 100644 --- a/flang/lib/Evaluate/tools.cpp +++ b/flang/lib/Evaluate/tools.cpp @@ -1196,16 +1196,6 @@ parser::Message *AttachDeclaration( const auto *assoc{unhosted->detailsIf()}) { unhosted = &assoc->symbol(); } - if (const auto *binding{ - unhosted->detailsIf()}) { - if (binding->symbol().name() != symbol.name()) { - message.Attach(binding->symbol().name(), - "Procedure '%s' of type '%s' is bound to '%s'"_en_US, symbol.name(), - symbol.owner().GetName().value(), binding->symbol().name()); - return &message; - } - unhosted = &binding->symbol(); - } if (const auto *use{symbol.detailsIf()}) { message.Attach(use->location(), "'%s' is USE-associated with '%s' in module '%s'"_en_US, symbol.name(), @@ -1214,6 +1204,14 @@ parser::Message *AttachDeclaration( message.Attach( unhosted->name(), "Declaration of 
'%s'"_en_US, unhosted->name()); } + if (const auto *binding{ + unhosted->detailsIf()}) { + if (binding->symbol().name() != symbol.name()) { + message.Attach(binding->symbol().name(), + "Procedure '%s' of type '%s' is bound to '%s'"_en_US, symbol.name(), + symbol.owner().GetName().value(), binding->symbol().name()); + } + } return &message; } diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 318085518cc57..94258444cf7ef 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -2555,6 +2555,9 @@ void CheckHelper::CheckProcBinding( const Symbol &symbol, const ProcBindingDetails &binding) { const Scope &dtScope{symbol.owner()}; CHECK(dtScope.kind() == Scope::Kind::DerivedType); + bool isInaccessibleDeferred{false}; + const Symbol *overridden{ + FindOverriddenBinding(symbol, isInaccessibleDeferred)}; if (symbol.attrs().test(Attr::DEFERRED)) { if (const Symbol *dtSymbol{dtScope.symbol()}) { if (!dtSymbol->attrs().test(Attr::ABSTRACT)) { // C733 @@ -2568,6 +2571,11 @@ void CheckHelper::CheckProcBinding( "Type-bound procedure '%s' may not be both DEFERRED and NON_OVERRIDABLE"_err_en_US, symbol.name()); } + if (overridden && !overridden->attrs().test(Attr::DEFERRED)) { + SayWithDeclaration(*overridden, + "Override of non-DEFERRED '%s' must not be DEFERRED"_err_en_US, + symbol.name()); + } } if (binding.symbol().attrs().test(Attr::INTRINSIC) && !context_.intrinsics().IsSpecificIntrinsicFunction( @@ -2576,9 +2584,7 @@ void CheckHelper::CheckProcBinding( "Intrinsic procedure '%s' is not a specific intrinsic permitted for use in the definition of binding '%s'"_err_en_US, binding.symbol().name(), symbol.name()); } - bool isInaccessibleDeferred{false}; - if (const Symbol * - overridden{FindOverriddenBinding(symbol, isInaccessibleDeferred)}) { + if (overridden) { if (isInaccessibleDeferred) { SayWithDeclaration(*overridden, "Override of PRIVATE DEFERRED '%s' must appear in 
its module"_err_en_US, diff --git a/flang/test/Semantics/bug138915.f90 b/flang/test/Semantics/bug138915.f90 new file mode 100644 index 0000000000000..786a4ac2d930b --- /dev/null +++ b/flang/test/Semantics/bug138915.f90 @@ -0,0 +1,15 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 +module m + type base + contains + procedure, nopass :: tbp + end type + type, extends(base), abstract :: child + contains + !ERROR: Override of non-DEFERRED 'tbp' must not be DEFERRED + procedure(tbp), deferred, nopass :: tbp + end type + contains + subroutine tbp + end +end From flang-commits at lists.llvm.org Fri May 9 13:56:33 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 09 May 2025 13:56:33 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727) In-Reply-To: Message-ID: <681e6c01.050a0220.1cedd.cff9@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 Rate limit · GitHub

From flang-commits at lists.llvm.org Fri May 9 13:56:42 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 13:56:42 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Emit error when DEFERRED binding overrides non-DEFERRED (PR #139325) In-Reply-To: Message-ID: <681e6c0a.050a0220.1cbea8.3bb4@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Peter Klausler (klausler)
Changes Fixes https://github.com/llvm/llvm-project/issues/138915. --- Full diff: https://github.com/llvm/llvm-project/pull/139325.diff 3 Files Affected: - (modified) flang/lib/Evaluate/tools.cpp (+8-10) - (modified) flang/lib/Semantics/check-declarations.cpp (+9-3) - (added) flang/test/Semantics/bug138915.f90 (+15) ``````````diff diff --git a/flang/lib/Evaluate/tools.cpp b/flang/lib/Evaluate/tools.cpp index 702711e3cff53..865020e050b03 100644 --- a/flang/lib/Evaluate/tools.cpp +++ b/flang/lib/Evaluate/tools.cpp @@ -1196,16 +1196,6 @@ parser::Message *AttachDeclaration( const auto *assoc{unhosted->detailsIf()}) { unhosted = &assoc->symbol(); } - if (const auto *binding{ - unhosted->detailsIf()}) { - if (binding->symbol().name() != symbol.name()) { - message.Attach(binding->symbol().name(), - "Procedure '%s' of type '%s' is bound to '%s'"_en_US, symbol.name(), - symbol.owner().GetName().value(), binding->symbol().name()); - return &message; - } - unhosted = &binding->symbol(); - } if (const auto *use{symbol.detailsIf()}) { message.Attach(use->location(), "'%s' is USE-associated with '%s' in module '%s'"_en_US, symbol.name(), @@ -1214,6 +1204,14 @@ parser::Message *AttachDeclaration( message.Attach( unhosted->name(), "Declaration of '%s'"_en_US, unhosted->name()); } + if (const auto *binding{ + unhosted->detailsIf()}) { + if (binding->symbol().name() != symbol.name()) { + message.Attach(binding->symbol().name(), + "Procedure '%s' of type '%s' is bound to '%s'"_en_US, symbol.name(), + symbol.owner().GetName().value(), binding->symbol().name()); + } + } return &message; } diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 318085518cc57..94258444cf7ef 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -2555,6 +2555,9 @@ void CheckHelper::CheckProcBinding( const Symbol &symbol, const ProcBindingDetails &binding) { const Scope &dtScope{symbol.owner()}; 
CHECK(dtScope.kind() == Scope::Kind::DerivedType); + bool isInaccessibleDeferred{false}; + const Symbol *overridden{ + FindOverriddenBinding(symbol, isInaccessibleDeferred)}; if (symbol.attrs().test(Attr::DEFERRED)) { if (const Symbol *dtSymbol{dtScope.symbol()}) { if (!dtSymbol->attrs().test(Attr::ABSTRACT)) { // C733 @@ -2568,6 +2571,11 @@ void CheckHelper::CheckProcBinding( "Type-bound procedure '%s' may not be both DEFERRED and NON_OVERRIDABLE"_err_en_US, symbol.name()); } + if (overridden && !overridden->attrs().test(Attr::DEFERRED)) { + SayWithDeclaration(*overridden, + "Override of non-DEFERRED '%s' must not be DEFERRED"_err_en_US, + symbol.name()); + } } if (binding.symbol().attrs().test(Attr::INTRINSIC) && !context_.intrinsics().IsSpecificIntrinsicFunction( @@ -2576,9 +2584,7 @@ void CheckHelper::CheckProcBinding( "Intrinsic procedure '%s' is not a specific intrinsic permitted for use in the definition of binding '%s'"_err_en_US, binding.symbol().name(), symbol.name()); } - bool isInaccessibleDeferred{false}; - if (const Symbol * - overridden{FindOverriddenBinding(symbol, isInaccessibleDeferred)}) { + if (overridden) { if (isInaccessibleDeferred) { SayWithDeclaration(*overridden, "Override of PRIVATE DEFERRED '%s' must appear in its module"_err_en_US, diff --git a/flang/test/Semantics/bug138915.f90 b/flang/test/Semantics/bug138915.f90 new file mode 100644 index 0000000000000..786a4ac2d930b --- /dev/null +++ b/flang/test/Semantics/bug138915.f90 @@ -0,0 +1,15 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 +module m + type base + contains + procedure, nopass :: tbp + end type + type, extends(base), abstract :: child + contains + !ERROR: Override of non-DEFERRED 'tbp' must not be DEFERRED + procedure(tbp), deferred, nopass :: tbp + end type + contains + subroutine tbp + end +end ``````````
https://github.com/llvm/llvm-project/pull/139325 From flang-commits at lists.llvm.org Fri May 9 13:56:50 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 09 May 2025 13:56:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Acknowledge non-enforcement of C7108 (PR #139169) In-Reply-To: Message-ID: <681e6c12.170a0220.304aef.f3f3@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/139169 Rate limit · GitHub

From flang-commits at lists.llvm.org Fri May 9 13:57:37 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 09 May 2025 13:57:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix spurious error on defined assignment in PURE (PR #139186) In-Reply-To: Message-ID: <681e6c41.170a0220.37cf9a.9c5a@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/139186 >From 2cd11298d20e64ae90284a03d6b394ec45fd8899 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Thu, 8 May 2025 17:46:35 -0700 Subject: [PATCH] [flang] Fix spurious error on defined assignment in PURE An assignment to a whole polymorphic object in a PURE subprogram that is implemented by means of a defined assignment procedure shouldn't be subjected to the same definability checks as it would be for an intrinsic assignment (which would also require it to be allocatable). Fixes https://github.com/llvm/llvm-project/issues/139129. --- flang/lib/Semantics/assignment.cpp | 5 +++ flang/lib/Semantics/check-deallocate.cpp | 3 +- flang/lib/Semantics/check-declarations.cpp | 4 +-- flang/lib/Semantics/definable.cpp | 42 +++++++++++----------- flang/lib/Semantics/definable.h | 2 +- flang/lib/Semantics/expression.cpp | 6 ++-- flang/test/Semantics/assign11.f90 | 6 ++-- flang/test/Semantics/bug139129.f90 | 17 +++++++++ flang/test/Semantics/call28.f90 | 4 +-- flang/test/Semantics/deallocate07.f90 | 6 ++-- flang/test/Semantics/declarations05.f90 | 2 +- 11 files changed, 59 insertions(+), 38 deletions(-) create mode 100644 flang/test/Semantics/bug139129.f90 diff --git a/flang/lib/Semantics/assignment.cpp b/flang/lib/Semantics/assignment.cpp index 935f5a03bdb6a..6e55d0210ee0e 100644 --- a/flang/lib/Semantics/assignment.cpp +++ b/flang/lib/Semantics/assignment.cpp @@ -72,6 +72,11 @@ void AssignmentContext::Analyze(const parser::AssignmentStmt &stmt) { std::holds_alternative(assignment->u)}; if (isDefinedAssignment) { 
flags.set(DefinabilityFlag::AllowEventLockOrNotifyType); + } else if (const Symbol * + whole{evaluate::UnwrapWholeSymbolOrComponentDataRef(lhs)}) { + if (IsAllocatable(whole->GetUltimate())) { + flags.set(DefinabilityFlag::PotentialDeallocation); + } } if (auto whyNot{WhyNotDefinable(lhsLoc, scope, flags, lhs)}) { if (whyNot->IsFatal()) { diff --git a/flang/lib/Semantics/check-deallocate.cpp b/flang/lib/Semantics/check-deallocate.cpp index 3bcd4d87b0906..332e6b52e1c9a 100644 --- a/flang/lib/Semantics/check-deallocate.cpp +++ b/flang/lib/Semantics/check-deallocate.cpp @@ -36,7 +36,8 @@ void DeallocateChecker::Leave(const parser::DeallocateStmt &deallocateStmt) { } else if (auto whyNot{WhyNotDefinable(name.source, context_.FindScope(name.source), {DefinabilityFlag::PointerDefinition, - DefinabilityFlag::AcceptAllocatable}, + DefinabilityFlag::AcceptAllocatable, + DefinabilityFlag::PotentialDeallocation}, *symbol)}) { // Catch problems with non-definability of the // pointer/allocatable diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 318085518cc57..c3a228f3ab8a9 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -949,8 +949,8 @@ void CheckHelper::CheckObjectEntity( !IsFunctionResult(symbol) /*ditto*/) { // Check automatically deallocated local variables for possible // problems with finalization in PURE. 
- if (auto whyNot{ - WhyNotDefinable(symbol.name(), symbol.owner(), {}, symbol)}) { + if (auto whyNot{WhyNotDefinable(symbol.name(), symbol.owner(), + {DefinabilityFlag::PotentialDeallocation}, symbol)}) { if (auto *msg{messages_.Say( "'%s' may not be a local variable in a pure subprogram"_err_en_US, symbol.name())}) { diff --git a/flang/lib/Semantics/definable.cpp b/flang/lib/Semantics/definable.cpp index 99a31553f2782..931c8e52fc6d7 100644 --- a/flang/lib/Semantics/definable.cpp +++ b/flang/lib/Semantics/definable.cpp @@ -193,6 +193,15 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at, return WhyNotDefinableLast(at, scope, flags, dataRef->GetLastSymbol()); } } + auto dyType{evaluate::DynamicType::From(ultimate)}; + const auto *inPure{FindPureProcedureContaining(scope)}; + if (inPure && !flags.test(DefinabilityFlag::PolymorphicOkInPure) && + flags.test(DefinabilityFlag::PotentialDeallocation) && dyType && + dyType->IsPolymorphic()) { + return BlameSymbol(at, + "'%s' is a whole polymorphic object in a pure subprogram"_en_US, + original); + } if (flags.test(DefinabilityFlag::PointerDefinition)) { if (flags.test(DefinabilityFlag::AcceptAllocatable)) { if (!IsAllocatableOrObjectPointer(&ultimate)) { @@ -210,26 +219,17 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at, "'%s' is an entity with either an EVENT_TYPE or LOCK_TYPE"_en_US, original); } - if (FindPureProcedureContaining(scope)) { - if (auto dyType{evaluate::DynamicType::From(ultimate)}) { - if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { - if (dyType->IsPolymorphic()) { // C1596 - return BlameSymbol( - at, "'%s' is polymorphic in a pure subprogram"_en_US, original); - } - } - if (const Symbol * impure{HasImpureFinal(ultimate)}) { - return BlameSymbol(at, "'%s' has an impure FINAL procedure '%s'"_en_US, - original, impure->name()); - } + if (dyType && inPure) { + if (const Symbol * impure{HasImpureFinal(ultimate)}) { + return BlameSymbol(at, "'%s' has an impure FINAL 
procedure '%s'"_en_US, + original, impure->name()); + } + if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { if (const DerivedTypeSpec * derived{GetDerivedTypeSpec(dyType)}) { - if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { - if (auto bad{ - FindPolymorphicAllocatablePotentialComponent(*derived)}) { - return BlameSymbol(at, - "'%s' has polymorphic component '%s' in a pure subprogram"_en_US, - original, bad.BuildResultDesignatorName()); - } + if (auto bad{FindPolymorphicAllocatablePotentialComponent(*derived)}) { + return BlameSymbol(at, + "'%s' has polymorphic component '%s' in a pure subprogram"_en_US, + original, bad.BuildResultDesignatorName()); } } } @@ -241,10 +241,10 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at, static std::optional WhyNotDefinable(parser::CharBlock at, const Scope &scope, DefinabilityFlags flags, const evaluate::DataRef &dataRef) { + bool isWholeSymbol{std::holds_alternative(dataRef.u)}; auto whyNotBase{ WhyNotDefinableBase(at, scope, flags, dataRef.GetFirstSymbol(), - std::holds_alternative(dataRef.u), - DefinesComponentPointerTarget(dataRef, flags))}; + isWholeSymbol, DefinesComponentPointerTarget(dataRef, flags))}; if (!whyNotBase || !whyNotBase->IsFatal()) { if (auto whyNotLast{ WhyNotDefinableLast(at, scope, flags, dataRef.GetLastSymbol())}) { diff --git a/flang/lib/Semantics/definable.h b/flang/lib/Semantics/definable.h index 902702dbccbf3..0d027961417be 100644 --- a/flang/lib/Semantics/definable.h +++ b/flang/lib/Semantics/definable.h @@ -33,7 +33,7 @@ ENUM_CLASS(DefinabilityFlag, SourcedAllocation, // ALLOCATE(a,SOURCE=) PolymorphicOkInPure, // don't check for polymorphic type in pure subprogram DoNotNoteDefinition, // context does not imply definition - AllowEventLockOrNotifyType) + AllowEventLockOrNotifyType, PotentialDeallocation) using DefinabilityFlags = common::EnumSet; diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index e139bda7e4950..96d039edf89d7 
100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -3385,15 +3385,15 @@ const Assignment *ExpressionAnalyzer::Analyze(const parser::AssignmentStmt &x) { const Symbol *lastWhole{ lastWhole0 ? &ResolveAssociations(*lastWhole0) : nullptr}; if (!lastWhole || !IsAllocatable(*lastWhole)) { - Say("Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable"_err_en_US); + Say("Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable"_err_en_US); } else if (evaluate::IsCoarray(*lastWhole)) { - Say("Left-hand side of assignment may not be polymorphic if it is a coarray"_err_en_US); + Say("Left-hand side of intrinsic assignment may not be polymorphic if it is a coarray"_err_en_US); } } if (auto *derived{GetDerivedTypeSpec(*dyType)}) { if (auto iter{FindAllocatableUltimateComponent(*derived)}) { if (ExtractCoarrayRef(lhs)) { - Say("Left-hand side of assignment must not be coindexed due to allocatable ultimate component '%s'"_err_en_US, + Say("Left-hand side of intrinsic assignment must not be coindexed due to allocatable ultimate component '%s'"_err_en_US, iter.BuildResultDesignatorName()); } } diff --git a/flang/test/Semantics/assign11.f90 b/flang/test/Semantics/assign11.f90 index 37216526b5f33..9d70d7109e75e 100644 --- a/flang/test/Semantics/assign11.f90 +++ b/flang/test/Semantics/assign11.f90 @@ -9,10 +9,10 @@ program test end type type(t) auc[*] pa = 1 ! 
ok - !ERROR: Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable pp = 1 - !ERROR: Left-hand side of assignment may not be polymorphic if it is a coarray + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic if it is a coarray pac = 1 - !ERROR: Left-hand side of assignment must not be coindexed due to allocatable ultimate component '%a' + !ERROR: Left-hand side of intrinsic assignment must not be coindexed due to allocatable ultimate component '%a' auc[1] = t() end diff --git a/flang/test/Semantics/bug139129.f90 b/flang/test/Semantics/bug139129.f90 new file mode 100644 index 0000000000000..2f0f865854706 --- /dev/null +++ b/flang/test/Semantics/bug139129.f90 @@ -0,0 +1,17 @@ +!RUN: %flang_fc1 -fsyntax-only %s +module m + type t + contains + procedure asst + generic :: assignment(=) => asst + end type + contains + pure subroutine asst(lhs, rhs) + class(t), intent(in out) :: lhs + class(t), intent(in) :: rhs + end + pure subroutine test(x, y) + class(t), intent(in out) :: x, y + x = y ! 
spurious definability error + end +end diff --git a/flang/test/Semantics/call28.f90 b/flang/test/Semantics/call28.f90 index 51430853d663f..f133276f7547e 100644 --- a/flang/test/Semantics/call28.f90 +++ b/flang/test/Semantics/call28.f90 @@ -11,9 +11,7 @@ pure subroutine s1(x) end subroutine pure subroutine s2(x) class(t), intent(in out) :: x - !ERROR: Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable - !ERROR: Left-hand side of assignment is not definable - !BECAUSE: 'x' is polymorphic in a pure subprogram + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable x = t() end subroutine pure subroutine s3(x) diff --git a/flang/test/Semantics/deallocate07.f90 b/flang/test/Semantics/deallocate07.f90 index 154c680f47c82..dd2885e2cab35 100644 --- a/flang/test/Semantics/deallocate07.f90 +++ b/flang/test/Semantics/deallocate07.f90 @@ -19,11 +19,11 @@ pure subroutine subr(pp1, pp2, mp2) !ERROR: Name in DEALLOCATE statement is not definable !BECAUSE: 'mv1' may not be defined in pure subprogram 'subr' because it is host-associated deallocate(mv1%pc) - !ERROR: Object in DEALLOCATE statement is not deallocatable - !BECAUSE: 'pp1' is polymorphic in a pure subprogram + !ERROR: Name in DEALLOCATE statement is not definable + !BECAUSE: 'pp1' is a whole polymorphic object in a pure subprogram deallocate(pp1) !ERROR: Object in DEALLOCATE statement is not deallocatable - !BECAUSE: 'pc' is polymorphic in a pure subprogram + !BECAUSE: 'pc' has polymorphic component '%pc' in a pure subprogram deallocate(pp2%pc) !ERROR: Object in DEALLOCATE statement is not deallocatable !BECAUSE: 'mp2' has polymorphic component '%pc' in a pure subprogram diff --git a/flang/test/Semantics/declarations05.f90 b/flang/test/Semantics/declarations05.f90 index b6dab7aeea0bc..b1e3d3c773160 100644 --- a/flang/test/Semantics/declarations05.f90 +++ b/flang/test/Semantics/declarations05.f90 @@ -22,7 +22,7 @@ 
impure subroutine final(x) end pure subroutine test !ERROR: 'x0' may not be a local variable in a pure subprogram - !BECAUSE: 'x0' is polymorphic in a pure subprogram + !BECAUSE: 'x0' is a whole polymorphic object in a pure subprogram class(t0), allocatable :: x0 !ERROR: 'x1' may not be a local variable in a pure subprogram !BECAUSE: 'x1' has an impure FINAL procedure 'final' From flang-commits at lists.llvm.org Fri May 9 14:10:21 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 09 May 2025 14:10:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang][volatile] Get volatility of designators from base instead of component symbol (PR #138611) In-Reply-To: Message-ID: <681e6f3d.050a0220.1f30d2.c79a@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/138611 Rate limit · GitHub

From flang-commits at lists.llvm.org Fri May 9 14:12:38 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 09 May 2025 14:12:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][intrinsic] restrict kind of get_command(_argument) to >= 2 (PR #139291) In-Reply-To: Message-ID: <681e6fc6.630a0220.29c20d.63c9@mx.google.com> https://github.com/akuhlens edited https://github.com/llvm/llvm-project/pull/139291 From flang-commits at lists.llvm.org Fri May 9 14:48:14 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 09 May 2025 14:48:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Stricter checking of v_list DIO arguments (PR #139329) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/139329 Catch assumed-rank arguments to defined I/O subroutines, and ensure that v_list dummy arguments are vectors. Fixes https://github.com/llvm/llvm-project/issues/138933. >From b0934774b62ed171e9da4e278099d6ab5371720e Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 9 May 2025 14:43:11 -0700 Subject: [PATCH] [flang] Stricter checking of v_list DIO arguments Catch assumed-rank arguments to defined I/O subroutines, and ensure that v_list dummy arguments are vectors. Fixes https://github.com/llvm/llvm-project/issues/138933. 
--- flang/lib/Semantics/check-declarations.cpp | 15 +++++-- flang/test/Semantics/io11.f90 | 49 ++++++++++++++++++++-- 2 files changed, 57 insertions(+), 7 deletions(-) diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 318085518cc57..25117964e078f 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -1192,7 +1192,7 @@ void CheckHelper::CheckObjectEntity( typeName); } else if (evaluate::IsAssumedRank(symbol)) { SayWithDeclaration(symbol, - "Assumed Rank entity of %s type is not supported"_err_en_US, + "Assumed rank entity of %s type is not supported"_err_en_US, typeName); } } @@ -3414,7 +3414,13 @@ void CheckHelper::CheckBindC(const Symbol &symbol) { bool CheckHelper::CheckDioDummyIsData( const Symbol &subp, const Symbol *arg, std::size_t position) { if (arg && arg->detailsIf()) { - return true; + if (evaluate::IsAssumedRank(*arg)) { + messages_.Say(arg->name(), + "Dummy argument '%s' may not be assumed-rank"_err_en_US, arg->name()); + return false; + } else { + return true; + } } else { if (arg) { messages_.Say(arg->name(), @@ -3592,9 +3598,10 @@ void CheckHelper::CheckDioVlistArg( CheckDioDummyIsDefaultInteger(subp, *arg); CheckDioDummyAttrs(subp, *arg, Attr::INTENT_IN); const auto *objectDetails{arg->detailsIf()}; - if (!objectDetails || !objectDetails->shape().CanBeAssumedShape()) { + if (!objectDetails || !objectDetails->shape().CanBeAssumedShape() || + objectDetails->shape().Rank() != 1) { messages_.Say(arg->name(), - "Dummy argument '%s' of a defined input/output procedure must be assumed shape"_err_en_US, + "Dummy argument '%s' of a defined input/output procedure must be assumed shape vector"_err_en_US, arg->name()); } } diff --git a/flang/test/Semantics/io11.f90 b/flang/test/Semantics/io11.f90 index 3529929003b01..c00deede6b516 100644 --- a/flang/test/Semantics/io11.f90 +++ b/flang/test/Semantics/io11.f90 @@ -342,7 +342,7 @@ subroutine 
formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) end subroutine end module m15 -module m16 +module m16a type,public :: t integer c contains @@ -355,15 +355,58 @@ subroutine formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) class(t), intent(inout) :: dtv integer, intent(in) :: unit character(len=*), intent(in) :: iotype - !ERROR: Dummy argument 'vlist' of a defined input/output procedure must be assumed shape + !ERROR: Dummy argument 'vlist' of a defined input/output procedure must be assumed shape vector integer, intent(in) :: vlist(5) integer, intent(out) :: iostat character(len=*), intent(inout) :: iomsg + iostat = 343 + stop 'fail' + end subroutine +end module m16a +module m16b + type,public :: t + integer c + contains + procedure, pass :: tbp=>formattedReadProc + generic :: read(formatted) => tbp + end type + private +contains + subroutine formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) + class(t), intent(inout) :: dtv + integer, intent(in) :: unit + character(len=*), intent(in) :: iotype + !ERROR: Dummy argument 'vlist' of a defined input/output procedure must be assumed shape vector + integer, intent(in) :: vlist(:,:) + integer, intent(out) :: iostat + character(len=*), intent(inout) :: iomsg + iostat = 343 + stop 'fail' + end subroutine +end module m16b + +module m16c + type,public :: t + integer c + contains + procedure, pass :: tbp=>formattedReadProc + generic :: read(formatted) => tbp + end type + private +contains + subroutine formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) + class(t), intent(inout) :: dtv + integer, intent(in) :: unit + character(len=*), intent(in) :: iotype + !ERROR: Dummy argument 'vlist' may not be assumed-rank + integer, intent(in) :: vlist(..) + integer, intent(out) :: iostat + character(len=*), intent(inout) :: iomsg iostat = 343 stop 'fail' end subroutine -end module m16 +end module m16c module m17 ! 
Test the same defined input/output procedure specified as a generic From flang-commits at lists.llvm.org Fri May 9 14:48:51 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 14:48:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Stricter checking of v_list DIO arguments (PR #139329) In-Reply-To: Message-ID: <681e7843.630a0220.9e92c.57ac@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Peter Klausler (klausler)
Changes Catch assumed-rank arguments to defined I/O subroutines, and ensure that v_list dummy arguments are vectors. Fixes https://github.com/llvm/llvm-project/issues/138933. --- Full diff: https://github.com/llvm/llvm-project/pull/139329.diff 2 Files Affected: - (modified) flang/lib/Semantics/check-declarations.cpp (+11-4) - (modified) flang/test/Semantics/io11.f90 (+46-3) ``````````diff diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 318085518cc57..25117964e078f 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -1192,7 +1192,7 @@ void CheckHelper::CheckObjectEntity( typeName); } else if (evaluate::IsAssumedRank(symbol)) { SayWithDeclaration(symbol, - "Assumed Rank entity of %s type is not supported"_err_en_US, + "Assumed rank entity of %s type is not supported"_err_en_US, typeName); } } @@ -3414,7 +3414,13 @@ void CheckHelper::CheckBindC(const Symbol &symbol) { bool CheckHelper::CheckDioDummyIsData( const Symbol &subp, const Symbol *arg, std::size_t position) { if (arg && arg->detailsIf()) { - return true; + if (evaluate::IsAssumedRank(*arg)) { + messages_.Say(arg->name(), + "Dummy argument '%s' may not be assumed-rank"_err_en_US, arg->name()); + return false; + } else { + return true; + } } else { if (arg) { messages_.Say(arg->name(), @@ -3592,9 +3598,10 @@ void CheckHelper::CheckDioVlistArg( CheckDioDummyIsDefaultInteger(subp, *arg); CheckDioDummyAttrs(subp, *arg, Attr::INTENT_IN); const auto *objectDetails{arg->detailsIf()}; - if (!objectDetails || !objectDetails->shape().CanBeAssumedShape()) { + if (!objectDetails || !objectDetails->shape().CanBeAssumedShape() || + objectDetails->shape().Rank() != 1) { messages_.Say(arg->name(), - "Dummy argument '%s' of a defined input/output procedure must be assumed shape"_err_en_US, + "Dummy argument '%s' of a defined input/output procedure must be assumed shape vector"_err_en_US, arg->name()); } } diff 
--git a/flang/test/Semantics/io11.f90 b/flang/test/Semantics/io11.f90 index 3529929003b01..c00deede6b516 100644 --- a/flang/test/Semantics/io11.f90 +++ b/flang/test/Semantics/io11.f90 @@ -342,7 +342,7 @@ subroutine formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) end subroutine end module m15 -module m16 +module m16a type,public :: t integer c contains @@ -355,15 +355,58 @@ subroutine formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) class(t), intent(inout) :: dtv integer, intent(in) :: unit character(len=*), intent(in) :: iotype - !ERROR: Dummy argument 'vlist' of a defined input/output procedure must be assumed shape + !ERROR: Dummy argument 'vlist' of a defined input/output procedure must be assumed shape vector integer, intent(in) :: vlist(5) integer, intent(out) :: iostat character(len=*), intent(inout) :: iomsg + iostat = 343 + stop 'fail' + end subroutine +end module m16a +module m16b + type,public :: t + integer c + contains + procedure, pass :: tbp=>formattedReadProc + generic :: read(formatted) => tbp + end type + private +contains + subroutine formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) + class(t), intent(inout) :: dtv + integer, intent(in) :: unit + character(len=*), intent(in) :: iotype + !ERROR: Dummy argument 'vlist' of a defined input/output procedure must be assumed shape vector + integer, intent(in) :: vlist(:,:) + integer, intent(out) :: iostat + character(len=*), intent(inout) :: iomsg + iostat = 343 + stop 'fail' + end subroutine +end module m16b + +module m16c + type,public :: t + integer c + contains + procedure, pass :: tbp=>formattedReadProc + generic :: read(formatted) => tbp + end type + private +contains + subroutine formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) + class(t), intent(inout) :: dtv + integer, intent(in) :: unit + character(len=*), intent(in) :: iotype + !ERROR: Dummy argument 'vlist' may not be assumed-rank + integer, intent(in) :: vlist(..) 
+ integer, intent(out) :: iostat + character(len=*), intent(inout) :: iomsg iostat = 343 stop 'fail' end subroutine -end module m16 +end module m16c module m17 ! Test the same defined input/output procedure specified as a generic ``````````
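As a rough illustration of the rule this patch enforces, the sketch below models the v_list check as a standalone predicate. It is a toy model under stated assumptions: `ShapeKind`, `DummyShape`, `checkVlist`, and `demoVlist` are hypothetical names invented for this example and are not part of flang's `CheckHelper`; only the two diagnostic strings are taken from the diff above.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical, simplified description of how a dummy argument is declared.
// This is NOT flang's Symbol/ArraySpec representation.
enum class ShapeKind { Scalar, Explicit, AssumedShape, AssumedRank };

struct DummyShape {
  ShapeKind kind;
  int rank; // ignored when kind == AssumedRank
};

// Returns the diagnostic text for a 'vlist' dummy argument, or std::nullopt
// when the declaration is acceptable (assumed-shape and rank 1).
std::optional<std::string> checkVlist(const DummyShape &shape) {
  if (shape.kind == ShapeKind::AssumedRank) {
    return "Dummy argument 'vlist' may not be assumed-rank";
  }
  if (shape.kind != ShapeKind::AssumedShape || shape.rank != 1) {
    return "Dummy argument 'vlist' of a defined input/output procedure must "
           "be assumed shape vector";
  }
  return std::nullopt;
}

// Mirrors the cases exercised by modules m16a, m16b, and m16c in io11.f90.
void demoVlist() {
  assert(!checkVlist({ShapeKind::AssumedShape, 1})); // vlist(:)   accepted
  assert(checkVlist({ShapeKind::Explicit, 1}));      // vlist(5)   rejected
  assert(checkVlist({ShapeKind::AssumedShape, 2}));  // vlist(:,:) rejected
  assert(checkVlist({ShapeKind::AssumedRank, 0}));   // vlist(..)  rejected
}
```

In this model the assumed-rank case is reported first and suppresses the vector check, which matches the single `!ERROR:` expectation in module m16c; how the real checks are sequenced inside flang is not shown here.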
https://github.com/llvm/llvm-project/pull/139329 From flang-commits at lists.llvm.org Fri May 9 15:01:49 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 09 May 2025 15:01:49 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Stricter checking of v_list DIO arguments (PR #139329) In-Reply-To: Message-ID: <681e7b4d.a70a0220.197931.f22f@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/139329 From flang-commits at lists.llvm.org Fri May 9 15:06:59 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 09 May 2025 15:06:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Treat hlfir.associate as Allocate for FIR alias analysis. (PR #139004) In-Reply-To: Message-ID: <681e7c83.170a0220.5735b.9f10@mx.google.com> https://github.com/vzakhari updated https://github.com/llvm/llvm-project/pull/139004 >From ae126b8ec224d6ae963ee666839cdef70fd5e4ab Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Wed, 7 May 2025 18:36:10 -0700 Subject: [PATCH 1/2] [flang] Treat hlfir.associate as Allocate for FIR alias analysis. Early HLFIR optimizations may experience problems with values produced by hlfir.associate. In most cases this is a unique local memory allocation, but it can also reuse some other hlfir.expr memory sometimes. It seems to be safe to assume unique allocation for trivial types, since we always allocate new memory for them. 
--- .../lib/Optimizer/Analysis/AliasAnalysis.cpp | 14 +++++++++ ...fferization-eval_in_mem-with-associate.fir | 30 +++++++++++++++++++ 2 files changed, 44 insertions(+) create mode 100644 flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir diff --git a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp index cbfc8b63ab64d..73ddd1ff80126 100644 --- a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp +++ b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp @@ -540,6 +540,20 @@ AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v, v = op.getVar(); defOp = v.getDefiningOp(); }) + .Case([&](auto op) { + mlir::Value source = op.getSource(); + if (fir::isa_trivial(source.getType())) { + // Trivial values will always use distinct temp memory, + // so we can classify this as Allocate and stop. + type = SourceKind::Allocate; + breakFromLoop = true; + } else { + // AssociateOp may reuse the expression storage, + // so we have to trace further. + v = source; + defOp = v.getDefiningOp(); + } + }) .Case([&](auto op) { // Unique memory allocation. type = SourceKind::Allocate; diff --git a/flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir b/flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir new file mode 100644 index 0000000000000..2f5f88ff9e7e8 --- /dev/null +++ b/flang/test/HLFIR/opt-bufferization-eval_in_mem-with-associate.fir @@ -0,0 +1,30 @@ +// RUN: fir-opt --opt-bufferization %s | FileCheck %s + +// Verify that hlfir.eval_in_mem uses the LHS array instead +// of allocating a temporary. 
+func.func @_QPtest() { + %cst = arith.constant 1.000000e+00 : f32 + %c10 = arith.constant 10 : index + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.alloca !fir.array<10xf32> {bindc_name = "x", uniq_name = "_QFtestEx"} + %2 = fir.shape %c10 : (index) -> !fir.shape<1> + %3:2 = hlfir.declare %1(%2) {uniq_name = "_QFtestEx"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) + %4:3 = hlfir.associate %cst {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) + %5 = hlfir.eval_in_mem shape %2 : (!fir.shape<1>) -> !hlfir.expr<10xf32> { + ^bb0(%arg0: !fir.ref>): + %6 = fir.call @_QParray_func(%4#0) fastmath : (!fir.ref) -> !fir.array<10xf32> + fir.save_result %6 to %arg0(%2) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> + } + hlfir.assign %5 to %3#0 : !hlfir.expr<10xf32>, !fir.ref> + hlfir.end_associate %4#1, %4#2 : !fir.ref, i1 + hlfir.destroy %5 : !hlfir.expr<10xf32> + return +} +// CHECK-LABEL: func.func @_QPtest() { +// CHECK: %[[VAL_0:.*]] = arith.constant 1.000000e+00 : f32 +// CHECK: %[[VAL_3:.*]] = fir.alloca !fir.array<10xf32> {bindc_name = "x", uniq_name = "_QFtestEx"} +// CHECK: %[[VAL_5:.*]]:2 = hlfir.declare %[[VAL_3]](%[[VAL_4:.*]]) {uniq_name = "_QFtestEx"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) +// CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_0]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +// CHECK: %[[VAL_7:.*]] = fir.call @_QParray_func(%[[VAL_6]]#0) fastmath : (!fir.ref) -> !fir.array<10xf32> +// CHECK: fir.save_result %[[VAL_7]] to %[[VAL_5]]#0(%[[VAL_4]]) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +// CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 >From 451712c10e80d987091e3411dfe42fd6542d19f7 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Fri, 9 May 2025 15:06:32 -0700 Subject: [PATCH 2/2] Added aliasing comment in hlfir.associate definition. 
--- flang/include/flang/Optimizer/HLFIR/HLFIROps.td | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/flang/include/flang/Optimizer/HLFIR/HLFIROps.td b/flang/include/flang/Optimizer/HLFIR/HLFIROps.td index f69930d5b53b3..08cfa8168a922 100644 --- a/flang/include/flang/Optimizer/HLFIR/HLFIROps.td +++ b/flang/include/flang/Optimizer/HLFIR/HLFIROps.td @@ -758,6 +758,13 @@ def hlfir_AssociateOp : hlfir_Op<"associate", [AttrSizedOperandSegments, For expressions, this operation is an incentive to re-use the expression storage, if any, after the bufferization pass when possible (if the expression is not used afterwards). + + For aliasing purposes, hlfir.associate with the source being + a trivial FIR value is considered to be a unique allocation + that does not alias with anything else. For non-trivial cases + it may be a unique allocation or an alias for the source expression + storage, so FIR alias analysis will look through it for finding + the source. }]; let arguments = (ins From flang-commits at lists.llvm.org Fri May 9 15:07:31 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 09 May 2025 15:07:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Treat hlfir.associate as Allocate for FIR alias analysis. (PR #139004) In-Reply-To: Message-ID: <681e7ca3.170a0220.2e20fd.756c@mx.google.com> vzakhari wrote: Thanks for the review! Tom, I added a comment in the op definition. https://github.com/llvm/llvm-project/pull/139004 From flang-commits at lists.llvm.org Fri May 9 15:19:44 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 09 May 2025 15:19:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Catch deferred type parameters in ALLOCATE(type-spec::) (PR #139334) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/139334 The type-spec in ALLOCATE may not have any deferred type parameters. Fixes https://github.com/llvm/llvm-project/issues/138979. 
>From a2f442c4f772a98ecf45284a026b3328c03b51c4 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 9 May 2025 15:17:52 -0700 Subject: [PATCH] [flang] Catch deferred type parameters in ALLOCATE(type-spec::) The type-spec in ALLOCATE may not have any deferred type parameters. Fixes https://github.com/llvm/llvm-project/issues/138979. --- flang/lib/Semantics/check-allocate.cpp | 10 ++++++++-- flang/test/Semantics/allocate01.f90 | 4 ++++ 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/check-allocate.cpp b/flang/lib/Semantics/check-allocate.cpp index b426dd81334bb..2c215f45bf516 100644 --- a/flang/lib/Semantics/check-allocate.cpp +++ b/flang/lib/Semantics/check-allocate.cpp @@ -116,13 +116,19 @@ static std::optional CheckAllocateOptions( // C937 if (auto it{FindCoarrayUltimateComponent(*derived)}) { context - .Say("Type-spec in ALLOCATE must not specify a type with a coarray" - " ultimate component"_err_en_US) + .Say( + "Type-spec in ALLOCATE must not specify a type with a coarray ultimate component"_err_en_US) .Attach(it->name(), "Type '%s' has coarray ultimate component '%s' declared here"_en_US, info.typeSpec->AsFortran(), it.BuildResultDesignatorName()); } } + if (auto dyType{evaluate::DynamicType::From(*info.typeSpec)}) { + if (dyType->HasDeferredTypeParameter()) { + context.Say( + "Type-spec in ALLOCATE must not have a deferred type parameter"_err_en_US); + } + } } const parser::Expr *parserSourceExpr{nullptr}; diff --git a/flang/test/Semantics/allocate01.f90 b/flang/test/Semantics/allocate01.f90 index a66e2467cbe4e..a10a7259ae94f 100644 --- a/flang/test/Semantics/allocate01.f90 +++ b/flang/test/Semantics/allocate01.f90 @@ -62,6 +62,7 @@ subroutine bar() real, pointer, save :: okp3 real, allocatable, save :: oka3, okac4[:,:] real, allocatable :: okacd5(:, :)[:] + character(:), allocatable :: chvar !ERROR: Name in ALLOCATE statement must be a variable name allocate(foo) @@ -102,6 +103,8 @@ subroutine bar() allocate(edc9%nok) 
!ERROR: Entity in ALLOCATE statement must have the ALLOCATABLE or POINTER attribute allocate(edc10) + !ERROR: Type-spec in ALLOCATE must not have a deferred type parameter + allocate(character(:) :: chvar) ! No errors expected below: allocate(a_var) @@ -117,4 +120,5 @@ subroutine bar() allocate(edc9%ok(4)) allocate(edc10%ok) allocate(rp) + allocate(character(123) :: chvar) end subroutine From flang-commits at lists.llvm.org Fri May 9 15:20:16 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 15:20:16 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Catch deferred type parameters in ALLOCATE(type-spec::) (PR #139334) In-Reply-To: Message-ID: <681e7fa0.170a0220.2264eb.dbbb@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Peter Klausler (klausler)
Changes The type-spec in ALLOCATE may not have any deferred type parameters. Fixes https://github.com/llvm/llvm-project/issues/138979. --- Full diff: https://github.com/llvm/llvm-project/pull/139334.diff 2 Files Affected: - (modified) flang/lib/Semantics/check-allocate.cpp (+8-2) - (modified) flang/test/Semantics/allocate01.f90 (+4) ``````````diff diff --git a/flang/lib/Semantics/check-allocate.cpp b/flang/lib/Semantics/check-allocate.cpp index b426dd81334bb..2c215f45bf516 100644 --- a/flang/lib/Semantics/check-allocate.cpp +++ b/flang/lib/Semantics/check-allocate.cpp @@ -116,13 +116,19 @@ static std::optional CheckAllocateOptions( // C937 if (auto it{FindCoarrayUltimateComponent(*derived)}) { context - .Say("Type-spec in ALLOCATE must not specify a type with a coarray" - " ultimate component"_err_en_US) + .Say( + "Type-spec in ALLOCATE must not specify a type with a coarray ultimate component"_err_en_US) .Attach(it->name(), "Type '%s' has coarray ultimate component '%s' declared here"_en_US, info.typeSpec->AsFortran(), it.BuildResultDesignatorName()); } } + if (auto dyType{evaluate::DynamicType::From(*info.typeSpec)}) { + if (dyType->HasDeferredTypeParameter()) { + context.Say( + "Type-spec in ALLOCATE must not have a deferred type parameter"_err_en_US); + } + } } const parser::Expr *parserSourceExpr{nullptr}; diff --git a/flang/test/Semantics/allocate01.f90 b/flang/test/Semantics/allocate01.f90 index a66e2467cbe4e..a10a7259ae94f 100644 --- a/flang/test/Semantics/allocate01.f90 +++ b/flang/test/Semantics/allocate01.f90 @@ -62,6 +62,7 @@ subroutine bar() real, pointer, save :: okp3 real, allocatable, save :: oka3, okac4[:,:] real, allocatable :: okacd5(:, :)[:] + character(:), allocatable :: chvar !ERROR: Name in ALLOCATE statement must be a variable name allocate(foo) @@ -102,6 +103,8 @@ subroutine bar() allocate(edc9%nok) !ERROR: Entity in ALLOCATE statement must have the ALLOCATABLE or POINTER attribute allocate(edc10) + !ERROR: Type-spec in ALLOCATE must not 
have a deferred type parameter + allocate(character(:) :: chvar) ! No errors expected below: allocate(a_var) @@ -117,4 +120,5 @@ subroutine bar() allocate(edc9%ok(4)) allocate(edc10%ok) allocate(rp) + allocate(character(123) :: chvar) end subroutine ``````````
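The new check can be pictured with the small sketch below. It is only a simplified model: `CharLength`, `checkAllocateTypeSpec`, and `demoAllocate` are hypothetical names made up for illustration and do not correspond to flang's `DynamicType` API; the diagnostic string is the one added by the diff above.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the length parameter of a CHARACTER type-spec.
struct CharLength {
  bool deferred;            // true for character(:)
  std::optional<int> value; // set for an explicit length such as character(123)
};

// Returns the diagnostic for the type-spec in ALLOCATE(type-spec :: ...),
// or std::nullopt when the type-spec has no deferred type parameter.
std::optional<std::string> checkAllocateTypeSpec(const CharLength &len) {
  if (len.deferred) {
    return "Type-spec in ALLOCATE must not have a deferred type parameter";
  }
  return std::nullopt;
}

// Mirrors the two new statements in allocate01.f90.
void demoAllocate() {
  assert(checkAllocateTypeSpec({true, std::nullopt})); // allocate(character(:) :: chvar)
  assert(!checkAllocateTypeSpec({false, 123}));        // allocate(character(123) :: chvar)
}
```

The declaration `character(:), allocatable :: chvar` remains legal in this picture; only the type-spec inside the ALLOCATE statement is rejected when its length is deferred.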
https://github.com/llvm/llvm-project/pull/139334 From flang-commits at lists.llvm.org Fri May 9 16:15:59 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 09 May 2025 16:15:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang] PRIVATE statement in derived type applies to proc components (PR #139336) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/139336 A PRIVATE statement in a derived type definition is failing to set the default accessibility of procedure pointer components; fix. Fixes https://github.com/llvm/llvm-project/issues/138911. >From 5f5cc9f86ed20906e6f2096170d6ef0056f8fdd1 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 9 May 2025 16:13:36 -0700 Subject: [PATCH] [flang] PRIVATE statement in derived type applies to proc components A PRIVATE statement in a derived type definition is failing to set the default accessibility of procedure pointer components; fix. Fixes https://github.com/llvm/llvm-project/issues/138911. 
--- flang/lib/Semantics/resolve-names.cpp | 4 ++++ flang/lib/Semantics/tools.cpp | 2 +- flang/test/Semantics/c_loc01.f90 | 4 ++-- flang/test/Semantics/resolve34.f90 | 33 ++++++++++++++++++++++----- 4 files changed, 34 insertions(+), 9 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index b2979690f78e7..bdafc03ad2c05 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6350,6 +6350,10 @@ void DeclarationVisitor::Post(const parser::ProcDecl &x) { if (!dtDetails) { attrs.set(Attr::EXTERNAL); } + if (derivedTypeInfo_.privateComps && + !attrs.HasAny({Attr::PUBLIC, Attr::PRIVATE})) { + attrs.set(Attr::PRIVATE); + } Symbol &symbol{DeclareProcEntity(name, attrs, procInterface)}; SetCUDADataAttr(name.source, symbol, cudaDataAttr()); // for error symbol.ReplaceName(name.source); diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 08d260555f37e..1d1e3ac044166 100644 --- a/flang/lib/Semantics/tools.cpp +++ b/flang/lib/Semantics/tools.cpp @@ -1076,7 +1076,7 @@ std::optional CheckAccessibleSymbol( return std::nullopt; } else { return parser::MessageFormattedText{ - "PRIVATE name '%s' is only accessible within module '%s'"_err_en_US, + "PRIVATE name '%s' is accessible only within module '%s'"_err_en_US, symbol.name(), DEREF(FindModuleContaining(symbol.owner())).GetName().value()}; } diff --git a/flang/test/Semantics/c_loc01.f90 b/flang/test/Semantics/c_loc01.f90 index abae1e263e2e2..a515a7a64f02a 100644 --- a/flang/test/Semantics/c_loc01.f90 +++ b/flang/test/Semantics/c_loc01.f90 @@ -48,9 +48,9 @@ subroutine test(assumedType, poly, nclen, n) cp = c_loc(ch(1:1)) ! ok cp = c_loc(deferred) ! ok cp = c_loc(p2ch) ! 
ok - !ERROR: PRIVATE name '__address' is only accessible within module '__fortran_builtins' + !ERROR: PRIVATE name '__address' is accessible only within module '__fortran_builtins' cp = c_ptr(0) - !ERROR: PRIVATE name '__address' is only accessible within module '__fortran_builtins' + !ERROR: PRIVATE name '__address' is accessible only within module '__fortran_builtins' cfp = c_funptr(0) !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches operand types TYPE(c_ptr) and TYPE(c_funptr) cp = cfp diff --git a/flang/test/Semantics/resolve34.f90 b/flang/test/Semantics/resolve34.f90 index 39709a362b363..da1b80b5a50b0 100644 --- a/flang/test/Semantics/resolve34.f90 +++ b/flang/test/Semantics/resolve34.f90 @@ -90,16 +90,37 @@ module m7 integer :: i2 integer, private :: i3 end type + type :: t3 + private + integer :: i4 = 0 + procedure(real), pointer, nopass :: pp1 => null() + end type + type, extends(t3) :: t4 + private + integer :: i5 + procedure(real), pointer, nopass :: pp2 + end type end subroutine s7 use m7 type(t2) :: x + type(t4) :: y integer :: j j = x%i2 - !ERROR: PRIVATE name 'i3' is only accessible within module 'm7' + !ERROR: PRIVATE name 'i3' is accessible only within module 'm7' j = x%i3 - !ERROR: PRIVATE name 't1' is only accessible within module 'm7' + !ERROR: PRIVATE name 't1' is accessible only within module 'm7' j = x%t1%i1 + !ok, parent component is not affected by PRIVATE in t4 + y%t3 = t3() + !ERROR: PRIVATE name 'i4' is accessible only within module 'm7' + y%i4 = 0 + !ERROR: PRIVATE name 'pp1' is accessible only within module 'm7' + y%pp1 => null() + !ERROR: PRIVATE name 'i5' is accessible only within module 'm7' + y%i5 = 0 + !ERROR: PRIVATE name 'pp2' is accessible only within module 'm7' + y%pp2 => null() end ! 
7.5.4.8(2) @@ -122,11 +143,11 @@ subroutine s1 subroutine s8 use m8 type(t) :: x - !ERROR: PRIVATE name 'i2' is only accessible within module 'm8' + !ERROR: PRIVATE name 'i2' is accessible only within module 'm8' x = t(2, 5) - !ERROR: PRIVATE name 'i2' is only accessible within module 'm8' + !ERROR: PRIVATE name 'i2' is accessible only within module 'm8' x = t(i1=2, i2=5) - !ERROR: PRIVATE name 'i2' is only accessible within module 'm8' + !ERROR: PRIVATE name 'i2' is accessible only within module 'm8' a = [y%i2] end @@ -166,6 +187,6 @@ subroutine s10 use m10 type(t) x x = t(1) - !ERROR: PRIVATE name 'operator(+)' is only accessible within module 'm10' + !ERROR: PRIVATE name 'operator(+)' is accessible only within module 'm10' x = x + x end subroutine From flang-commits at lists.llvm.org Fri May 9 16:16:31 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 16:16:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang] PRIVATE statement in derived type applies to proc components (PR #139336) In-Reply-To: Message-ID: <681e8ccf.170a0220.1b247.b710@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Peter Klausler (klausler)
Changes A PRIVATE statement in a derived type definition is failing to set the default accessibility of procedure pointer components; fix. Fixes https://github.com/llvm/llvm-project/issues/138911. --- Full diff: https://github.com/llvm/llvm-project/pull/139336.diff 4 Files Affected: - (modified) flang/lib/Semantics/resolve-names.cpp (+4) - (modified) flang/lib/Semantics/tools.cpp (+1-1) - (modified) flang/test/Semantics/c_loc01.f90 (+2-2) - (modified) flang/test/Semantics/resolve34.f90 (+27-6) ``````````diff diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index b2979690f78e7..bdafc03ad2c05 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6350,6 +6350,10 @@ void DeclarationVisitor::Post(const parser::ProcDecl &x) { if (!dtDetails) { attrs.set(Attr::EXTERNAL); } + if (derivedTypeInfo_.privateComps && + !attrs.HasAny({Attr::PUBLIC, Attr::PRIVATE})) { + attrs.set(Attr::PRIVATE); + } Symbol &symbol{DeclareProcEntity(name, attrs, procInterface)}; SetCUDADataAttr(name.source, symbol, cudaDataAttr()); // for error symbol.ReplaceName(name.source); diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 08d260555f37e..1d1e3ac044166 100644 --- a/flang/lib/Semantics/tools.cpp +++ b/flang/lib/Semantics/tools.cpp @@ -1076,7 +1076,7 @@ std::optional CheckAccessibleSymbol( return std::nullopt; } else { return parser::MessageFormattedText{ - "PRIVATE name '%s' is only accessible within module '%s'"_err_en_US, + "PRIVATE name '%s' is accessible only within module '%s'"_err_en_US, symbol.name(), DEREF(FindModuleContaining(symbol.owner())).GetName().value()}; } diff --git a/flang/test/Semantics/c_loc01.f90 b/flang/test/Semantics/c_loc01.f90 index abae1e263e2e2..a515a7a64f02a 100644 --- a/flang/test/Semantics/c_loc01.f90 +++ b/flang/test/Semantics/c_loc01.f90 @@ -48,9 +48,9 @@ subroutine test(assumedType, poly, nclen, n) cp = c_loc(ch(1:1)) ! ok cp = c_loc(deferred) ! 
ok cp = c_loc(p2ch) ! ok - !ERROR: PRIVATE name '__address' is only accessible within module '__fortran_builtins' + !ERROR: PRIVATE name '__address' is accessible only within module '__fortran_builtins' cp = c_ptr(0) - !ERROR: PRIVATE name '__address' is only accessible within module '__fortran_builtins' + !ERROR: PRIVATE name '__address' is accessible only within module '__fortran_builtins' cfp = c_funptr(0) !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches operand types TYPE(c_ptr) and TYPE(c_funptr) cp = cfp diff --git a/flang/test/Semantics/resolve34.f90 b/flang/test/Semantics/resolve34.f90 index 39709a362b363..da1b80b5a50b0 100644 --- a/flang/test/Semantics/resolve34.f90 +++ b/flang/test/Semantics/resolve34.f90 @@ -90,16 +90,37 @@ module m7 integer :: i2 integer, private :: i3 end type + type :: t3 + private + integer :: i4 = 0 + procedure(real), pointer, nopass :: pp1 => null() + end type + type, extends(t3) :: t4 + private + integer :: i5 + procedure(real), pointer, nopass :: pp2 + end type end subroutine s7 use m7 type(t2) :: x + type(t4) :: y integer :: j j = x%i2 - !ERROR: PRIVATE name 'i3' is only accessible within module 'm7' + !ERROR: PRIVATE name 'i3' is accessible only within module 'm7' j = x%i3 - !ERROR: PRIVATE name 't1' is only accessible within module 'm7' + !ERROR: PRIVATE name 't1' is accessible only within module 'm7' j = x%t1%i1 + !ok, parent component is not affected by PRIVATE in t4 + y%t3 = t3() + !ERROR: PRIVATE name 'i4' is accessible only within module 'm7' + y%i4 = 0 + !ERROR: PRIVATE name 'pp1' is accessible only within module 'm7' + y%pp1 => null() + !ERROR: PRIVATE name 'i5' is accessible only within module 'm7' + y%i5 = 0 + !ERROR: PRIVATE name 'pp2' is accessible only within module 'm7' + y%pp2 => null() end ! 
7.5.4.8(2) @@ -122,11 +143,11 @@ subroutine s1 subroutine s8 use m8 type(t) :: x - !ERROR: PRIVATE name 'i2' is only accessible within module 'm8' + !ERROR: PRIVATE name 'i2' is accessible only within module 'm8' x = t(2, 5) - !ERROR: PRIVATE name 'i2' is only accessible within module 'm8' + !ERROR: PRIVATE name 'i2' is accessible only within module 'm8' x = t(i1=2, i2=5) - !ERROR: PRIVATE name 'i2' is only accessible within module 'm8' + !ERROR: PRIVATE name 'i2' is accessible only within module 'm8' a = [y%i2] end @@ -166,6 +187,6 @@ subroutine s10 use m10 type(t) x x = t(1) - !ERROR: PRIVATE name 'operator(+)' is only accessible within module 'm10' + !ERROR: PRIVATE name 'operator(+)' is accessible only within module 'm10' x = x + x end subroutine ``````````
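The accessibility rule that the fix extends to procedure pointer components can be sketched as follows. This is an assumption-labeled toy: `Access`, `effectiveAccess`, and `demoAccess` are hypothetical names for this example and do not reflect flang's `Attrs` or `derivedTypeInfo_` data structures.

```cpp
#include <cassert>

// Hypothetical model of a component's accessibility attribute.
enum class Access { Unspecified, Public, Private };

// An explicit PUBLIC or PRIVATE attribute on the component wins; otherwise a
// PRIVATE statement in the derived type definition makes the component
// PRIVATE, and without such a statement the component is PUBLIC.
Access effectiveAccess(Access explicitAttr, bool typeHasPrivateStmt) {
  if (explicitAttr != Access::Unspecified) {
    return explicitAttr;
  }
  return typeHasPrivateStmt ? Access::Private : Access::Public;
}

// Cases loosely following types t3 and t4 in resolve34.f90.
void demoAccess() {
  // pp1 in t3: no explicit attribute, type has PRIVATE statement.
  assert(effectiveAccess(Access::Unspecified, true) == Access::Private);
  // A component explicitly marked PUBLIC stays public.
  assert(effectiveAccess(Access::Public, true) == Access::Public);
  // No PRIVATE statement and no attribute: public by default.
  assert(effectiveAccess(Access::Unspecified, false) == Access::Public);
}
```

Per the PR description, data components already followed this rule; the patch applies the same default to procedure pointer components such as `pp1` and `pp2`.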
https://github.com/llvm/llvm-project/pull/139336 From flang-commits at lists.llvm.org Fri May 9 16:30:24 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 09 May 2025 16:30:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Catch deferred type parameters in ALLOCATE(type-spec::) (PR #139334) In-Reply-To: Message-ID: <681e9010.170a0220.326dc.9d6c@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/139334 From flang-commits at lists.llvm.org Fri May 9 16:36:06 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 09 May 2025 16:36:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang] PRIVATE statement in derived type applies to proc components (PR #139336) In-Reply-To: Message-ID: <681e9166.170a0220.3bf523.cd16@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/139336 From flang-commits at lists.llvm.org Fri May 9 16:57:47 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 09 May 2025 16:57:47 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Extend assumed-size array checking in intrinsic functions (PR #139339) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/139339 The array argument of a reference to the intrinsic function SHAPE can't be assumed-size; and for SIZE and UBOUND, it can be assumed-size only if DIM= is present. The checks for these restrictions don't allow for host association, or for associate entities (ASSOCIATE, SELECT TYPE) that are variables. Fixes https://github.com/llvm/llvm-project/issues/138926. 
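The restriction described above can be summarized in a small decision function. The sketch is a toy model under stated assumptions: `IntrinsicRef`, `checkAssumedSize`, and `demoAssumedSize` are hypothetical names for this example, and the `arrayIsAssumedSize` flag is assumed to have been computed after looking through host association and ASSOCIATE/SELECT TYPE entities, which is the part the patch addresses. The diagnostic strings follow the test expectations in misc-intrinsics.f90.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical summary of one reference to SHAPE, SIZE, or UBOUND.
struct IntrinsicRef {
  std::string name;        // "shape", "size", or "ubound"
  bool arrayIsAssumedSize; // determined after resolving associations
  bool hasDimArg;          // whether DIM= was supplied
};

// Returns the diagnostic for the reference, or std::nullopt if it is allowed.
std::optional<std::string> checkAssumedSize(const IntrinsicRef &ref) {
  if (!ref.arrayIsAssumedSize) {
    return std::nullopt;
  }
  if (ref.name == "shape") {
    // SHAPE never accepts an assumed-size array.
    return "The 'source=' argument to the intrinsic function 'shape' may not "
           "be assumed-size";
  }
  if ((ref.name == "size" || ref.name == "ubound") && !ref.hasDimArg) {
    // SIZE and UBOUND accept an assumed-size array only with DIM=.
    return "A dim= argument is required for '" + ref.name +
           "' when the array is assumed-size";
  }
  return std::nullopt;
}

void demoAssumedSize() {
  assert(checkAssumedSize({"size", true, false}));   // size(arg)          rejected
  assert(!checkAssumedSize({"size", true, true}));   // size(arg, dim=1)   accepted
  assert(checkAssumedSize({"ubound", true, false})); // ubound(arg)        rejected
  assert(checkAssumedSize({"shape", true, false}));  // shape(arg)         rejected
  assert(!checkAssumedSize({"size", false, false})); // ordinary array     accepted
}
```

The model deliberately ignores which dimension DIM= names; it only captures the presence or absence of the argument, as the PR description does.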
>From cbde5af2d03fdf197084bb0bf2e900ae79fbe563 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 9 May 2025 16:37:02 -0700 Subject: [PATCH] [flang] Extend assumed-size array checking in intrinsic functions The array argument of a reference to the intrinsic functions SHAPE can't be assumed-size; and for SIZE and UBOUND, it can be assumed-size only if DIM= is present. The checks for thes restrictions don't allow for host association, or for associate entities (ASSOCIATE, SELECT TYPE) that are variables. Fixes https://github.com/llvm/llvm-project/issues/138926. --- flang/lib/Evaluate/intrinsics.cpp | 10 ++++--- flang/test/Semantics/misc-intrinsics.f90 | 35 ++++++++++++++++++++++-- 2 files changed, 39 insertions(+), 6 deletions(-) diff --git a/flang/lib/Evaluate/intrinsics.cpp b/flang/lib/Evaluate/intrinsics.cpp index 709f2e6c85bb2..389d8e6c57763 100644 --- a/flang/lib/Evaluate/intrinsics.cpp +++ b/flang/lib/Evaluate/intrinsics.cpp @@ -2340,7 +2340,7 @@ std::optional IntrinsicInterface::Match( if (!knownArg) { knownArg = arg; } - if (!dimArg && rank > 0 && + if (rank > 0 && (std::strcmp(name, "shape") == 0 || std::strcmp(name, "size") == 0 || std::strcmp(name, "ubound") == 0)) { @@ -2351,16 +2351,18 @@ std::optional IntrinsicInterface::Match( // over this one, as this error is caught by the second entry // for UBOUND.) 
if (auto named{ExtractNamedEntity(*arg)}) { - if (semantics::IsAssumedSizeArray(named->GetLastSymbol())) { + if (semantics::IsAssumedSizeArray(ResolveAssociations( + named->GetLastSymbol().GetUltimate()))) { if (strcmp(name, "shape") == 0) { messages.Say(arg->sourceLocation(), "The 'source=' argument to the intrinsic function 'shape' may not be assumed-size"_err_en_US); - } else { + return std::nullopt; + } else if (!dimArg) { messages.Say(arg->sourceLocation(), "A dim= argument is required for '%s' when the array is assumed-size"_err_en_US, name); + return std::nullopt; } - return std::nullopt; } } } diff --git a/flang/test/Semantics/misc-intrinsics.f90 b/flang/test/Semantics/misc-intrinsics.f90 index 14dcdb05ac6c6..a7895f7b7f16f 100644 --- a/flang/test/Semantics/misc-intrinsics.f90 +++ b/flang/test/Semantics/misc-intrinsics.f90 @@ -3,17 +3,37 @@ program test_size real :: scalar real, dimension(5, 5) :: array - call test(array, array) + call test(array, array, array) contains - subroutine test(arg, assumedRank) + subroutine test(arg, assumedRank, poly) real, dimension(5, *) :: arg real, dimension(..) :: assumedRank + class(*) :: poly(5, *) !ERROR: A dim= argument is required for 'size' when the array is assumed-size print *, size(arg) + print *, size(arg, dim=1) ! ok + select type (poly) + type is (real) + !ERROR: A dim= argument is required for 'size' when the array is assumed-size + print *, size(poly) + print *, size(poly, dim=1) ! ok + end select !ERROR: A dim= argument is required for 'ubound' when the array is assumed-size print *, ubound(arg) + print *, ubound(arg, dim=1) ! ok + select type (poly) + type is (real) + !ERROR: A dim= argument is required for 'ubound' when the array is assumed-size + print *, ubound(poly) + print *, ubound(poly, dim=1) ! 
ok + end select !ERROR: The 'source=' argument to the intrinsic function 'shape' may not be assumed-size print *, shape(arg) + select type (poly) + type is (real) + !ERROR: The 'source=' argument to the intrinsic function 'shape' may not be assumed-size + print *, shape(poly) + end select !ERROR: The 'harvest=' argument to the intrinsic procedure 'random_number' may not be assumed-size call random_number(arg) !ERROR: 'array=' argument has unacceptable rank 0 @@ -85,5 +105,16 @@ subroutine test(arg, assumedRank) print *, lbound(assumedRank, dim=2) print *, ubound(assumedRank, dim=2) end select + contains + subroutine inner + !ERROR: A dim= argument is required for 'size' when the array is assumed-size + print *, size(arg) + print *, size(arg, dim=1) ! ok + !ERROR: A dim= argument is required for 'ubound' when the array is assumed-size + print *, ubound(arg) + print *, ubound(arg, dim=1) ! ok + !ERROR: The 'source=' argument to the intrinsic function 'shape' may not be assumed-size + print *, shape(arg) + end end subroutine end From flang-commits at lists.llvm.org Fri May 9 16:58:20 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 16:58:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Extend assumed-size array checking in intrinsic functions (PR #139339) In-Reply-To: Message-ID: <681e969c.170a0220.19ef22.2364@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Peter Klausler (klausler)
Changes The array argument of a reference to the intrinsic function SHAPE can't be assumed-size; and for SIZE and UBOUND, it can be assumed-size only if DIM= is present. The checks for these restrictions don't allow for host association, or for associate entities (ASSOCIATE, SELECT TYPE) that are variables. Fixes https://github.com/llvm/llvm-project/issues/138926. --- Full diff: https://github.com/llvm/llvm-project/pull/139339.diff 2 Files Affected: - (modified) flang/lib/Evaluate/intrinsics.cpp (+6-4) - (modified) flang/test/Semantics/misc-intrinsics.f90 (+33-2) ``````````diff diff --git a/flang/lib/Evaluate/intrinsics.cpp b/flang/lib/Evaluate/intrinsics.cpp index 709f2e6c85bb2..389d8e6c57763 100644 --- a/flang/lib/Evaluate/intrinsics.cpp +++ b/flang/lib/Evaluate/intrinsics.cpp @@ -2340,7 +2340,7 @@ std::optional IntrinsicInterface::Match( if (!knownArg) { knownArg = arg; } - if (!dimArg && rank > 0 && + if (rank > 0 && (std::strcmp(name, "shape") == 0 || std::strcmp(name, "size") == 0 || std::strcmp(name, "ubound") == 0)) { @@ -2351,16 +2351,18 @@ std::optional IntrinsicInterface::Match( // over this one, as this error is caught by the second entry // for UBOUND.) 
if (auto named{ExtractNamedEntity(*arg)}) { - if (semantics::IsAssumedSizeArray(named->GetLastSymbol())) { + if (semantics::IsAssumedSizeArray(ResolveAssociations( + named->GetLastSymbol().GetUltimate()))) { if (strcmp(name, "shape") == 0) { messages.Say(arg->sourceLocation(), "The 'source=' argument to the intrinsic function 'shape' may not be assumed-size"_err_en_US); - } else { + return std::nullopt; + } else if (!dimArg) { messages.Say(arg->sourceLocation(), "A dim= argument is required for '%s' when the array is assumed-size"_err_en_US, name); + return std::nullopt; } - return std::nullopt; } } } diff --git a/flang/test/Semantics/misc-intrinsics.f90 b/flang/test/Semantics/misc-intrinsics.f90 index 14dcdb05ac6c6..a7895f7b7f16f 100644 --- a/flang/test/Semantics/misc-intrinsics.f90 +++ b/flang/test/Semantics/misc-intrinsics.f90 @@ -3,17 +3,37 @@ program test_size real :: scalar real, dimension(5, 5) :: array - call test(array, array) + call test(array, array, array) contains - subroutine test(arg, assumedRank) + subroutine test(arg, assumedRank, poly) real, dimension(5, *) :: arg real, dimension(..) :: assumedRank + class(*) :: poly(5, *) !ERROR: A dim= argument is required for 'size' when the array is assumed-size print *, size(arg) + print *, size(arg, dim=1) ! ok + select type (poly) + type is (real) + !ERROR: A dim= argument is required for 'size' when the array is assumed-size + print *, size(poly) + print *, size(poly, dim=1) ! ok + end select !ERROR: A dim= argument is required for 'ubound' when the array is assumed-size print *, ubound(arg) + print *, ubound(arg, dim=1) ! ok + select type (poly) + type is (real) + !ERROR: A dim= argument is required for 'ubound' when the array is assumed-size + print *, ubound(poly) + print *, ubound(poly, dim=1) ! 
ok + end select !ERROR: The 'source=' argument to the intrinsic function 'shape' may not be assumed-size print *, shape(arg) + select type (poly) + type is (real) + !ERROR: The 'source=' argument to the intrinsic function 'shape' may not be assumed-size + print *, shape(poly) + end select !ERROR: The 'harvest=' argument to the intrinsic procedure 'random_number' may not be assumed-size call random_number(arg) !ERROR: 'array=' argument has unacceptable rank 0 @@ -85,5 +105,16 @@ subroutine test(arg, assumedRank) print *, lbound(assumedRank, dim=2) print *, ubound(assumedRank, dim=2) end select + contains + subroutine inner + !ERROR: A dim= argument is required for 'size' when the array is assumed-size + print *, size(arg) + print *, size(arg, dim=1) ! ok + !ERROR: A dim= argument is required for 'ubound' when the array is assumed-size + print *, ubound(arg) + print *, ubound(arg, dim=1) ! ok + !ERROR: The 'source=' argument to the intrinsic function 'shape' may not be assumed-size + print *, shape(arg) + end end subroutine end ``````````
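For readers skimming the flattened diff above, the decision it encodes can be sketched as a small standalone function. This is only a model of the control flow, not flang's code: `ArrayArg` and `CheckAssumedSize` are hypothetical names, and the `isAssumedSize` flag stands in for the result of `IsAssumedSizeArray(ResolveAssociations(...GetUltimate()))`, i.e. it is assumed to be computed after host and construct association have been looked through.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for what the real check learns from the symbol table.
struct ArrayArg {
  bool isAssumedSize; // assumed already resolved through host/ASSOCIATE/SELECT TYPE
};

// Returns the diagnostic text, or std::nullopt when the reference is acceptable.
std::optional<std::string> CheckAssumedSize(
    const std::string &name, const ArrayArg &array, bool hasDim) {
  if (name != "shape" && name != "size" && name != "ubound") {
    return std::nullopt; // the check only concerns these three intrinsics
  }
  if (!array.isAssumedSize) {
    return std::nullopt;
  }
  if (name == "shape") {
    // SHAPE has no way to accept an assumed-size array.
    return "The 'source=' argument to the intrinsic function 'shape' may not "
           "be assumed-size";
  }
  if (!hasDim) {
    // SIZE and UBOUND are only rejected when DIM= is absent.
    return "A dim= argument is required for '" + name +
        "' when the array is assumed-size";
  }
  return std::nullopt; // SIZE/UBOUND with DIM= is fine
}
```

Under this model, `CheckAssumedSize("size", ArrayArg{true}, true)` yields no diagnostic, matching the `! ok` lines in the test, while the same call with `hasDim` false yields the dim= message.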
https://github.com/llvm/llvm-project/pull/139339 From flang-commits at lists.llvm.org Fri May 9 17:02:32 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 17:02:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Extend assumed-size array checking in intrinsic functions (PR #139339) In-Reply-To: Message-ID: <681e9798.170a0220.27e633.b833@mx.google.com> scriptforfivem wrote: Your approach to this problem is very elegant. I'm learning a lot from reviewing your code. https://github.com/llvm/llvm-project/pull/139339 From flang-commits at lists.llvm.org Fri May 9 17:20:42 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 17:20:42 -0700 (PDT) Subject: [flang-commits] [flang] 82982d7 - [flang][intrinsic] restrict kind of get_command(_argument) to >= 2 (#139291) Message-ID: <681e9bda.170a0220.1c9a2e.b20c@mx.google.com> Author: Andre Kuhlenschmidt Date: 2025-05-09T17:20:38-07:00 New Revision: 82982d74e75a7f304009263486ab1f698cc94229 URL: https://github.com/llvm/llvm-project/commit/82982d74e75a7f304009263486ab1f698cc94229 DIFF: https://github.com/llvm/llvm-project/commit/82982d74e75a7f304009263486ab1f698cc94229.diff LOG: [flang][intrinsic] restrict kind of get_command(_argument) to >= 2 (#139291) Previously the following program would have failed with a runtime assertion violation. This PR restricts the type information such that this assertion failure isn't reachable. The example below demonstrates the change. 
```bash $ cat error.f90 integer (kind=1) :: i call get_command(length=i) print *, i end $ cat good.f90 integer (kind=2) :: i call get_command(length=i) print *, i end $ prior/flang error.f90 && ./a.out fatal Fortran runtime error(/home/akuhlenschmi/work/lorado/src/llvm-project/t.f90:2): Internal error: RUNTIME_CHECK(IsValidIntDescriptor(length)) failed at /home/akuhlenschmi/work/lorado/src/llvm-project/flang-rt/lib/runtime/command.cpp(154) Aborted (core dumped) $ prior/flang good.f90 && ./a.out 7 $ current/flang error.f90 && ./a.out error: Semantic errors in t.f90 ./t.f90:2:25: error: Actual argument for 'length=' has bad type or kind 'INTEGER(1)' call get_command(length=i) ^ $ current/flang good.f90 && ./a.out 7 ``` Also while making the change, I noticed that "get_command_argument" suffers from the same issue, so I made a similar change for it. Added: flang/test/Semantics/command.f90 Modified: flang/lib/Evaluate/intrinsics.cpp Removed: ################################################################################ diff --git a/flang/lib/Evaluate/intrinsics.cpp b/flang/lib/Evaluate/intrinsics.cpp index 709f2e6c85bb2..d64a008e3db84 100644 --- a/flang/lib/Evaluate/intrinsics.cpp +++ b/flang/lib/Evaluate/intrinsics.cpp @@ -1587,8 +1587,8 @@ static const IntrinsicInterface intrinsicSubroutine[]{ {"get_command", {{"command", DefaultChar, Rank::scalar, Optionality::optional, common::Intent::Out}, - {"length", AnyInt, Rank::scalar, Optionality::optional, - common::Intent::Out}, + {"length", TypePattern{IntType, KindCode::greaterOrEqualToKind, 2}, + Rank::scalar, Optionality::optional, common::Intent::Out}, {"status", AnyInt, Rank::scalar, Optionality::optional, common::Intent::Out}, {"errmsg", DefaultChar, Rank::scalar, Optionality::optional, @@ -1598,8 +1598,8 @@ static const IntrinsicInterface intrinsicSubroutine[]{ {{"number", AnyInt, Rank::scalar}, {"value", DefaultChar, Rank::scalar, Optionality::optional, common::Intent::Out}, - {"length", AnyInt, Rank::scalar, 
Optionality::optional, - common::Intent::Out}, + {"length", TypePattern{IntType, KindCode::greaterOrEqualToKind, 2}, + Rank::scalar, Optionality::optional, common::Intent::Out}, {"status", AnyInt, Rank::scalar, Optionality::optional, common::Intent::Out}, {"errmsg", DefaultChar, Rank::scalar, Optionality::optional, diff --git a/flang/test/Semantics/command.f90 b/flang/test/Semantics/command.f90 new file mode 100644 index 0000000000000..b5f24cddbd052 --- /dev/null +++ b/flang/test/Semantics/command.f90 @@ -0,0 +1,30 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 +program command + implicit none + Integer(1) :: i1 + Integer(2) :: i2 + Integer(4) :: i4 + Integer(8) :: i8 + Integer(16) :: i16 + Integer :: a + !ERROR: Actual argument for 'length=' has bad type or kind 'INTEGER(1)' + call get_command(length=i1) + !OK: + call get_command(length=i2) + !OK: + call get_command(length=i4) + !OK: + call get_command(length=i8) + !OK: + call get_command(length=i16) + !ERROR: Actual argument for 'length=' has bad type or kind 'INTEGER(1)' + call get_command_argument(number=a,length=i1) + !OK: + call get_command_argument(number=a,length=i2) + !OK: + call get_command_argument(number=a,length=i4) + !OK: + call get_command_argument(number=a,length=i8) + !OK: + call get_command_argument(number=a,length=i16) +end program \ No newline at end of file From flang-commits at lists.llvm.org Fri May 9 17:20:44 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 09 May 2025 17:20:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang][intrinsic] restrict kind of get_command(_argument) to >= 2 (PR #139291) In-Reply-To: Message-ID: <681e9bdc.170a0220.d71f2.ad0e@mx.google.com> https://github.com/akuhlens closed https://github.com/llvm/llvm-project/pull/139291 From flang-commits at lists.llvm.org Fri May 9 17:20:59 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 17:20:59 -0700 (PDT) Subject: [flang-commits] [flang] 
d21534f - [flang][volatile] Get volatility of designators from base instead of component symbol (#138611) Message-ID: <681e9beb.170a0220.8cac6.ce37@mx.google.com> Author: Andre Kuhlenschmidt Date: 2025-05-09T17:20:56-07:00 New Revision: d21534f482f21356531f25671b18ae385e04c296 URL: https://github.com/llvm/llvm-project/commit/d21534f482f21356531f25671b18ae385e04c296 DIFF: https://github.com/llvm/llvm-project/commit/d21534f482f21356531f25671b18ae385e04c296.diff LOG: [flang][volatile] Get volatility of designators from base instead of component symbol (#138611) The standard says in [8.5.20 VOLATILE attribute]: If an object has the VOLATILE attribute, then all of its sub-objects also have the VOLATILE attribute. This code takes this into account and uses the volatility of the base of the designator instead of that of the component. In fact, fields in a structure are not allowed to have the volatile attribute. So given the code, `A%B => t`, symbol `B` could never directly have the volatile attribute, and the volatility of `A` indicates the volatility of `B`. This PR should address [the comments](https://github.com/llvm/llvm-project/pull/132486#issuecomment-2851313119) on this PR #132486 Added: Modified: flang/lib/Semantics/pointer-assignment.cpp flang/test/Semantics/assign02.f90 Removed: ################################################################################ diff --git a/flang/lib/Semantics/pointer-assignment.cpp b/flang/lib/Semantics/pointer-assignment.cpp index 36c9c5b845706..c17eb0aa941ec 100644 --- a/flang/lib/Semantics/pointer-assignment.cpp +++ b/flang/lib/Semantics/pointer-assignment.cpp @@ -329,7 +329,6 @@ bool PointerAssignmentChecker::Check(const evaluate::Designator &d) { " shape"_err_en_US; } else if (rhsType->corank() > 0 && (isVolatile_ != last->attrs().test(Attr::VOLATILE))) { // C1020 - // TODO: what if A is VOLATILE in A%B%C? 
need a better test here if (isVolatile_) { msg = "Pointer may not be VOLATILE when target is a" " non-VOLATILE coarray"_err_en_US; @@ -569,6 +568,12 @@ bool CheckPointerAssignment(SemanticsContext &context, const SomeExpr &lhs, return false; // error was reported } PointerAssignmentChecker checker{context, scope, *pointer}; + const Symbol *base{GetFirstSymbol(lhs)}; + if (base) { + // 8.5.20(4) If an object has the VOLATILE attribute, then all of its + // subobjects also have the VOLATILE attribute. + checker.set_isVolatile(base->attrs().test(Attr::VOLATILE)); + } checker.set_isBoundsRemapping(isBoundsRemapping); checker.set_isAssumedRank(isAssumedRank); bool lhsOk{checker.CheckLeftHandSide(lhs)}; diff --git a/flang/test/Semantics/assign02.f90 b/flang/test/Semantics/assign02.f90 index d83d126e2734c..9fa672025bfe7 100644 --- a/flang/test/Semantics/assign02.f90 +++ b/flang/test/Semantics/assign02.f90 @@ -8,9 +8,11 @@ module m1 type t2 sequence real :: t2Field + real, pointer :: t2FieldPtr end type type t3 type(t2) :: t3Field + type(t2), pointer :: t3FieldPtr end type contains @@ -198,6 +200,14 @@ subroutine s13 q2 => y%t3Field !OK: q3 => y + !ERROR: VOLATILE target associated with non-VOLATILE pointer + p3%t3FieldPtr => y%t3Field + !ERROR: VOLATILE target associated with non-VOLATILE pointer + p3%t3FieldPtr%t2FieldPtr => y%t3Field%t2Field + !OK + q3%t3FieldPtr => y%t3Field + !OK + q3%t3FieldPtr%t2FieldPtr => y%t3Field%t2Field end end From flang-commits at lists.llvm.org Fri May 9 17:21:02 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 09 May 2025 17:21:02 -0700 (PDT) Subject: [flang-commits] [flang] [flang][volatile] Get volatility of designators from base instead of component symbol (PR #138611) In-Reply-To: Message-ID: <681e9bee.170a0220.211b10.d63d@mx.google.com> https://github.com/akuhlens closed https://github.com/llvm/llvm-project/pull/138611 From flang-commits at lists.llvm.org Fri May 9 17:39:26 2025 From: 
flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Fri, 09 May 2025 17:39:26 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (PR #139131) In-Reply-To: Message-ID: <681ea03e.170a0220.572ff.a84e@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `ppc64le-flang-rhel-clang` running on `ppc64le-flang-rhel-test` while building `flang` at step 5 "build-unified-tree". Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/27555
Here is the relevant piece of the build log for the reference ``` Step 5 (build-unified-tree) failure: build (failure) ... 66.323 [98/21/6688] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/scope.cpp.o 66.324 [98/20/6689] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/check-do-forall.cpp.o 66.325 [98/19/6690] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/runtime-type-info.cpp.o 66.331 [98/18/6691] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/resolve-names-utils.cpp.o 66.347 [98/17/6692] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/symbol.cpp.o 66.355 [98/16/6693] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/pointer-assignment.cpp.o 74.576 [98/15/6694] Building CXX object tools/flang/tools/flang-driver/CMakeFiles/flang.dir/fc1_main.cpp.o 77.127 [98/14/6695] Building CXX object tools/flang/tools/flang-driver/CMakeFiles/flang.dir/driver.cpp.o 85.943 [98/13/6696] Building CXX object tools/flang/lib/FrontendTool/CMakeFiles/flangFrontendTool.dir/ExecuteCompilerInvocation.cpp.o 95.031 [98/12/6697] Building CXX object tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o FAILED: tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o ccache /home/buildbots/llvm-external-buildbots/clang.19.1.7/bin/clang++ -DFLANG_INCLUDE_TESTS=1 -DGTEST_HAS_RTTI=0 -DLLVM_BUILD_STATIC -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/tools/flang/tools/f18-parse-demo -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/tools/f18-parse-demo 
-I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/tools/flang/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/llvm/include -isystem /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/../mlir/include -isystem /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/tools/mlir/include -isystem /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/tools/clang/include -isystem /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/llvm/../clang/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Werror -Wno-deprecated-copy -Wno-string-conversion -Wno-ctad-maybe-unsupported -Wno-unused-command-line-argument -Wstring-conversion -Wcovered-switch-default -Wno-nested-anon-types -Xclang -fno-pch-timestamp -O3 -DNDEBUG -std=c++17 -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o -MF 
tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o.d -o tools/flang/tools/f18-parse-demo/CMakeFiles/f18-parse-demo.dir/f18-parse-demo.cpp.o -c /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:222:5: error: no matching function for call to 'Unparse' 222 | Unparse(llvm::outs(), parseTree, driver.encoding, true /*capitalize*/, | ^~~~~~~ /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: candidate function template not viable: no known conversion from 'Fortran::parser::Encoding' to 'const common::LangOptions' for 3rd argument 53 | void Unparse(llvm::raw_ostream &out, const A &root, | ^ 54 | const common::LangOptions &langOpts, Encoding encoding = Encoding::UTF_8, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/tools/f18-parse-demo/f18-parse-demo.cpp:243:5: error: no matching function for call to 'Unparse' 243 | Unparse(tmpSource, parseTree, driver.encoding, true /*capitalize*/, | ^~~~~~~ /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/include/flang/Parser/unparse.h:53:6: note: candidate function template not viable: no known conversion from 'Fortran::parser::Encoding' to 'const common::LangOptions' for 3rd argument 53 | void Unparse(llvm::raw_ostream &out, const A &root, | ^ 54 | const common::LangOptions &langOpts, Encoding encoding = Encoding::UTF_8, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2 errors generated. 
110.700 [98/11/6698] Building CXX object tools/flang/lib/Frontend/CMakeFiles/flangFrontend.dir/cmake_pch.hxx.pch 117.087 [98/10/6699] Building CXX object tools/flang/tools/bbc/CMakeFiles/bbc.dir/bbc.cpp.o 125.393 [98/9/6700] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/tools.cpp.o 131.111 [98/8/6701] Building CXX object tools/flang/lib/Parser/CMakeFiles/FortranParser.dir/unparse.cpp.o 132.110 [98/7/6702] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o 137.546 [98/6/6703] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/cmake_pch.hxx.pch 138.541 [98/5/6704] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/mod-file.cpp.o 160.017 [98/4/6705] Building CXX object tools/flang/lib/Parser/CMakeFiles/FortranParser.dir/openmp-parsers.cpp.o 193.883 [98/3/6706] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/resolve-directives.cpp.o 253.208 [98/2/6707] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/expression.cpp.o 263.784 [98/1/6708] Building CXX object tools/flang/lib/Semantics/CMakeFiles/FortranSemantics.dir/check-omp-structure.cpp.o ninja: build stopped: subcommand failed. ```
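The two errors in the log have the same shape: the declaration at unparse.h:53 takes a `const common::LangOptions &` as its third parameter, while f18-parse-demo.cpp still passes an `Encoding` there. A minimal sketch of that mismatch follows. Everything in it is a stand-in: the `sketch` namespace, the `openMPVersion` field, the `capitalize` parameter position and the `UnparseToString` helper are assumptions for illustration, and the log does not show how the caller was eventually fixed.

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace sketch {

enum class Encoding { UTF_8, LATIN_1 };

// Hypothetical options type; the field name is an assumption.
struct LangOptions {
  unsigned openMPVersion{0};
};

// Mirrors the parameter order reported in the log: out, root, langOpts,
// then encoding. The trailing 'capitalize' parameter is assumed.
void Unparse(std::ostream &out, const std::string &root,
    const LangOptions &langOpts, Encoding encoding = Encoding::UTF_8,
    bool capitalize = false) {
  out << root << " [omp=" << langOpts.openMPVersion
      << ", utf8=" << (encoding == Encoding::UTF_8)
      << ", caps=" << capitalize << "]";
}

std::string UnparseToString(const std::string &root,
    const LangOptions &langOpts, Encoding encoding) {
  std::ostringstream out;
  // An old-style call such as
  //   Unparse(out, root, encoding, true);
  // would not compile: a scoped enum has no conversion to LangOptions.
  Unparse(out, root, langOpts, encoding, true);
  return out.str();
}

} // namespace sketch
```

In this sketch the caller compiles once a `LangOptions` value is supplied in third position, which is the kind of change the candidate note in the log points toward.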
https://github.com/llvm/llvm-project/pull/139131 From flang-commits at lists.llvm.org Fri May 9 18:28:07 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 09 May 2025 18:28:07 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fixed designator codegen for contiguous boxes. (PR #139003) In-Reply-To: Message-ID: <681eaba7.170a0220.1b63d2.caf0@mx.google.com> https://github.com/vzakhari updated https://github.com/llvm/llvm-project/pull/139003 >From 5d3fc1fbeb088d04c1dbe863dd0ce6657bc92949 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Wed, 7 May 2025 18:07:53 -0700 Subject: [PATCH 1/2] [flang] Fixed designator codegen for contiguous boxes. Contiguous variables represented with a box do not have explicit shape, but it looks like the base/shape computation was assuming that. This caused generation of raw address fir.array_coor without the shape. This patch is needed to fix failures happening with #138797. --- .../flang/Optimizer/Builder/HLFIRTools.h | 6 ++ flang/lib/Optimizer/Builder/HLFIRTools.cpp | 30 ++++++-- .../HLFIR/Transforms/ConvertToFIR.cpp | 36 +++++++++- flang/test/HLFIR/designate-codegen.fir | 72 +++++++++++++++++++ 4 files changed, 135 insertions(+), 9 deletions(-) diff --git a/flang/include/flang/Optimizer/Builder/HLFIRTools.h b/flang/include/flang/Optimizer/Builder/HLFIRTools.h index ac80873dc374f..bcba38ed8bd5d 100644 --- a/flang/include/flang/Optimizer/Builder/HLFIRTools.h +++ b/flang/include/flang/Optimizer/Builder/HLFIRTools.h @@ -533,6 +533,12 @@ Entity gen1DSection(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ArrayRef extents, mlir::ValueRange oneBasedIndices, mlir::ArrayRef typeParams); + +/// Return explicit lower bounds from a fir.shape result. +/// Only fir.shape, fir.shift and fir.shape_shift are currently +/// supported as \p shape. 
+llvm::SmallVector getExplicitLboundsFromShape(mlir::Value shape); + } // namespace hlfir #endif // FORTRAN_OPTIMIZER_BUILDER_HLFIRTOOLS_H diff --git a/flang/lib/Optimizer/Builder/HLFIRTools.cpp b/flang/lib/Optimizer/Builder/HLFIRTools.cpp index 51ea7305d3d26..752dc0cf86414 100644 --- a/flang/lib/Optimizer/Builder/HLFIRTools.cpp +++ b/flang/lib/Optimizer/Builder/HLFIRTools.cpp @@ -70,10 +70,8 @@ getExplicitExtents(fir::FortranVariableOpInterface var, return {}; } -// Return explicit lower bounds. For pointers and allocatables, this will not -// read the lower bounds and instead return an empty vector. -static llvm::SmallVector -getExplicitLboundsFromShape(mlir::Value shape) { +llvm::SmallVector +hlfir::getExplicitLboundsFromShape(mlir::Value shape) { llvm::SmallVector result; auto *shapeOp = shape.getDefiningOp(); if (auto s = mlir::dyn_cast_or_null(shapeOp)) { @@ -89,10 +87,13 @@ getExplicitLboundsFromShape(mlir::Value shape) { } return result; } + +// Return explicit lower bounds. For pointers and allocatables, this will not +// read the lower bounds and instead return an empty vector. static llvm::SmallVector getExplicitLbounds(fir::FortranVariableOpInterface var) { if (mlir::Value shape = var.getShape()) - return getExplicitLboundsFromShape(shape); + return hlfir::getExplicitLboundsFromShape(shape); return {}; } @@ -753,9 +754,24 @@ std::pair hlfir::genVariableFirBaseShapeAndParams( } if (entity.isScalar()) return {fir::getBase(exv), mlir::Value{}}; + + // Contiguous variables that are represented with a box + // may require the shape to be extracted from the box (i.e. evx), + // because they itself may not have shape specified. + // This happens during late propagationg of contiguous + // attribute, e.g.: + // %9:2 = hlfir.declare %6 + // {fortran_attrs = #fir.var_attrs} : + // (!fir.box>) -> + // (!fir.box>, !fir.box>) + // The extended value is an ArrayBoxValue with base being + // the raw address of the array. 
if (auto variableInterface = entity.getIfVariableInterface()) - return {fir::getBase(exv), - asEmboxShape(loc, builder, exv, variableInterface.getShape())}; + if (mlir::isa(fir::getBase(exv).getType()) || + !mlir::isa(entity.getType()) || + variableInterface.getShape()) + return {fir::getBase(exv), + asEmboxShape(loc, builder, exv, variableInterface.getShape())}; return {fir::getBase(exv), builder.createShape(loc, exv)}; } diff --git a/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp b/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp index 8721a895b5e05..495f11a365185 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp @@ -412,12 +412,44 @@ class DesignateOpConversion auto indices = designate.getIndices(); int i = 0; auto attrs = designate.getIsTripletAttr(); + + // If the shape specifies a shift and the base is not a box, + // then we have to subtract the lower bounds, as long as + // fir.array_coor does not support non-default lower bounds + // for non-box accesses. + llvm::SmallVector lbounds; + if (shape && !mlir::isa(base.getType())) + lbounds = hlfir::getExplicitLboundsFromShape(shape); + std::size_t lboundIdx = 0; for (auto isTriplet : attrs.asArrayRef()) { // Coordinate of the first element are the index and triplets lower - // bounds - firstElementIndices.push_back(indices[i]); + // bounds. + mlir::Value index = indices[i]; + if (!lbounds.empty()) { + assert(lboundIdx < lbounds.size() && "missing lbound"); + mlir::Type indexType = builder.getIndexType(); + mlir::Value one = builder.createIntegerConstant(loc, indexType, 1); + mlir::Value orig = builder.createConvert(loc, indexType, index); + mlir::Value lb = + builder.createConvert(loc, indexType, lbounds[lboundIdx]); + index = builder.create(loc, orig, lb); + index = builder.create(loc, index, one); + ++lboundIdx; + } + firstElementIndices.push_back(index); i = i + (isTriplet ? 
3 : 1); } + + // Remove the shift from the shape, if needed. + if (!lbounds.empty()) { + mlir::Operation *op = shape.getDefiningOp(); + if (mlir::isa(op)) + shape = nullptr; + else if (auto shiftOp = mlir::dyn_cast(op)) + shape = builder.create(loc, shiftOp.getExtents()); + else + TODO(loc, "read fir.shape to get lower bounds"); + } mlir::Type originalDesignateType = designate.getResult().getType(); const bool isVolatile = fir::isa_volatile_type(originalDesignateType); mlir::Type arrayCoorType = fir::ReferenceType::get(baseEleTy, isVolatile); diff --git a/flang/test/HLFIR/designate-codegen.fir b/flang/test/HLFIR/designate-codegen.fir index da0a1f82b516e..d3e264941264f 100644 --- a/flang/test/HLFIR/designate-codegen.fir +++ b/flang/test/HLFIR/designate-codegen.fir @@ -213,3 +213,75 @@ func.func @test_polymorphic_array_elt(%arg0: !fir.class>, !fir.class>>) -> !fir.class> // CHECK: return // CHECK: } + +// Test proper generation of fir.array_coor for contiguous box with default lbounds. +func.func @_QPtest_contiguous_derived_default(%arg0: !fir.class>> {fir.bindc_name = "d1", fir.contiguous, fir.optional}) { + %c1 = arith.constant 1 : index + %c16_i32 = arith.constant 16 : i32 + %0 = fir.dummy_scope : !fir.dscope + %1:2 = hlfir.declare %arg0 dummy_scope %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.class>>, !fir.dscope) -> (!fir.class>>, !fir.class>>) + fir.select_type %1#1 : !fir.class>> [#fir.type_is,i:i32}>>, ^bb1, unit, ^bb2] +^bb1: // pred: ^bb0 + %2 = fir.convert %1#1 : (!fir.class>>) -> !fir.box,i:i32}>>> + %3:2 = hlfir.declare %2 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.box,i:i32}>>>) -> (!fir.box,i:i32}>>>, !fir.box,i:i32}>>>) + %4 = hlfir.designate %3#0 (%c1, %c1) : (!fir.box,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> + %5 = hlfir.designate %4{"i"} : (!fir.ref,i:i32}>>) -> !fir.ref + hlfir.assign %c16_i32 to %5 : i32, !fir.ref + cf.br ^bb3 +^bb2: // 
pred: ^bb0 + %6:2 = hlfir.declare %1#1 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.class>>) -> (!fir.class>>, !fir.class>>) + cf.br ^bb3 +^bb3: // 2 preds: ^bb1, ^bb2 + return +} +// CHECK-LABEL: func.func @_QPtest_contiguous_derived_default( +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_9:.*]] = fir.declare %{{.*}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.box,i:i32}>>>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_10:.*]] = fir.rebox %[[VAL_9]] : (!fir.box,i:i32}>>>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_11:.*]] = fir.box_addr %[[VAL_10]] : (!fir.box,i:i32}>>>) -> !fir.ref,i:i32}>>> +// CHECK: %[[VAL_12:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10]], %[[VAL_12]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_15:.*]]:3 = fir.box_dims %[[VAL_10]], %[[VAL_14]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_16:.*]] = fir.shape %[[VAL_13]]#1, %[[VAL_15]]#1 : (index, index) -> !fir.shape<2> +// CHECK: %[[VAL_17:.*]] = fir.array_coor %[[VAL_11]](%[[VAL_16]]) %[[VAL_0]], %[[VAL_0]] : (!fir.ref,i:i32}>>>, !fir.shape<2>, index, index) -> !fir.ref,i:i32}>> + +// Test proper generation of fir.array_coor for contiguous box with non-default lbounds. 
+func.func @_QPtest_contiguous_derived_lbounds(%arg0: !fir.class>> {fir.bindc_name = "d1", fir.contiguous}) { + %c3 = arith.constant 3 : index + %c1 = arith.constant 1 : index + %c16_i32 = arith.constant 16 : i32 + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.shift %c1, %c3 : (index, index) -> !fir.shift<2> + %2:2 = hlfir.declare %arg0(%1) dummy_scope %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.class>>, !fir.shift<2>, !fir.dscope) -> (!fir.class>>, !fir.class>>) + fir.select_type %2#1 : !fir.class>> [#fir.type_is,i:i32}>>, ^bb1, unit, ^bb2] +^bb1: // pred: ^bb0 + %3 = fir.convert %2#1 : (!fir.class>>) -> !fir.box,i:i32}>>> + %4:2 = hlfir.declare %3(%1) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.box,i:i32}>>>, !fir.shift<2>) -> (!fir.box,i:i32}>>>, !fir.box,i:i32}>>>) + %5 = hlfir.designate %4#0 (%c1, %c3) : (!fir.box,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> + %6 = hlfir.designate %5{"i"} : (!fir.ref,i:i32}>>) -> !fir.ref + hlfir.assign %c16_i32 to %6 : i32, !fir.ref + cf.br ^bb3 +^bb2: // pred: ^bb0 + %7:2 = hlfir.declare %2#1(%1) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.class>>, !fir.shift<2>) -> (!fir.class>>, !fir.class>>) + cf.br ^bb3 +^bb3: // 2 preds: ^bb1, ^bb2 + return +} +// CHECK-LABEL: func.func @_QPtest_contiguous_derived_lbounds( +// CHECK: %[[VAL_0:.*]] = arith.constant 3 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.declare %{{.*}}(%[[VAL_4:.*]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.box,i:i32}>>>, !fir.shift<2>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_9:.*]] = fir.rebox %[[VAL_8]](%[[VAL_4]]) : (!fir.box,i:i32}>>>, !fir.shift<2>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_10:.*]] = fir.box_addr %[[VAL_9]] : (!fir.box,i:i32}>>>) -> !fir.ref,i:i32}>>> +// CHECK: %[[VAL_11:.*]] = arith.constant 0 
: index +// CHECK: %[[VAL_12:.*]]:3 = fir.box_dims %[[VAL_9]], %[[VAL_11]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_13:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_14:.*]]:3 = fir.box_dims %[[VAL_9]], %[[VAL_13]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_15:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_16:.*]] = arith.subi %[[VAL_1]], %[[VAL_1]] : index +// CHECK: %[[VAL_17:.*]] = arith.addi %[[VAL_16]], %[[VAL_15]] : index +// CHECK: %[[VAL_18:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_19:.*]] = arith.subi %[[VAL_0]], %[[VAL_0]] : index +// CHECK: %[[VAL_20:.*]] = arith.addi %[[VAL_19]], %[[VAL_18]] : index +// CHECK: %[[VAL_21:.*]] = fir.array_coor %[[VAL_10]] %[[VAL_17]], %[[VAL_20]] : (!fir.ref,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> >From ac7905e6878385069b41eb39f20f1b760d01929b Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Fri, 9 May 2025 15:32:24 -0700 Subject: [PATCH 2/2] Fixed the fir.shift case, and got rid of the shift handling in ConvertToFIR. --- .../flang/Optimizer/Builder/HLFIRTools.h | 6 ---- flang/lib/Optimizer/Builder/HLFIRTools.cpp | 19 ++++++++--- .../HLFIR/Transforms/ConvertToFIR.cpp | 33 +------------------ flang/test/HLFIR/designate-codegen.fir | 9 ++--- 4 files changed, 17 insertions(+), 50 deletions(-) diff --git a/flang/include/flang/Optimizer/Builder/HLFIRTools.h b/flang/include/flang/Optimizer/Builder/HLFIRTools.h index bcba38ed8bd5d..ac80873dc374f 100644 --- a/flang/include/flang/Optimizer/Builder/HLFIRTools.h +++ b/flang/include/flang/Optimizer/Builder/HLFIRTools.h @@ -533,12 +533,6 @@ Entity gen1DSection(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ArrayRef extents, mlir::ValueRange oneBasedIndices, mlir::ArrayRef typeParams); - -/// Return explicit lower bounds from a fir.shape result. -/// Only fir.shape, fir.shift and fir.shape_shift are currently -/// supported as \p shape. 
-llvm::SmallVector getExplicitLboundsFromShape(mlir::Value shape); - } // namespace hlfir #endif // FORTRAN_OPTIMIZER_BUILDER_HLFIRTOOLS_H diff --git a/flang/lib/Optimizer/Builder/HLFIRTools.cpp b/flang/lib/Optimizer/Builder/HLFIRTools.cpp index 752dc0cf86414..f2b084cb760b9 100644 --- a/flang/lib/Optimizer/Builder/HLFIRTools.cpp +++ b/flang/lib/Optimizer/Builder/HLFIRTools.cpp @@ -70,8 +70,11 @@ getExplicitExtents(fir::FortranVariableOpInterface var, return {}; } -llvm::SmallVector -hlfir::getExplicitLboundsFromShape(mlir::Value shape) { +// Return explicit lower bounds from a shape result. +// Only fir.shape, fir.shift and fir.shape_shift are currently +// supported as shape. +static llvm::SmallVector +getExplicitLboundsFromShape(mlir::Value shape) { llvm::SmallVector result; auto *shapeOp = shape.getDefiningOp(); if (auto s = mlir::dyn_cast_or_null(shapeOp)) { @@ -93,7 +96,7 @@ hlfir::getExplicitLboundsFromShape(mlir::Value shape) { static llvm::SmallVector getExplicitLbounds(fir::FortranVariableOpInterface var) { if (mlir::Value shape = var.getShape()) - return hlfir::getExplicitLboundsFromShape(shape); + return getExplicitLboundsFromShape(shape); return {}; } @@ -766,12 +769,18 @@ std::pair hlfir::genVariableFirBaseShapeAndParams( // (!fir.box>, !fir.box>) // The extended value is an ArrayBoxValue with base being // the raw address of the array. - if (auto variableInterface = entity.getIfVariableInterface()) + if (auto variableInterface = entity.getIfVariableInterface()) { + mlir::Value shape = variableInterface.getShape(); if (mlir::isa(fir::getBase(exv).getType()) || !mlir::isa(entity.getType()) || - variableInterface.getShape()) + // Still use the variable's shape if it is present. + // If it only specifies a shift, then we have to create + // a shape from the exv. 
+ (shape && (shape.getDefiningOp() || + shape.getDefiningOp()))) return {fir::getBase(exv), asEmboxShape(loc, builder, exv, variableInterface.getShape())}; + } return {fir::getBase(exv), builder.createShape(loc, exv)}; } diff --git a/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp b/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp index 495f11a365185..8f206b5a1ade7 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp @@ -412,44 +412,13 @@ class DesignateOpConversion auto indices = designate.getIndices(); int i = 0; auto attrs = designate.getIsTripletAttr(); - - // If the shape specifies a shift and the base is not a box, - // then we have to subtract the lower bounds, as long as - // fir.array_coor does not support non-default lower bounds - // for non-box accesses. - llvm::SmallVector lbounds; - if (shape && !mlir::isa(base.getType())) - lbounds = hlfir::getExplicitLboundsFromShape(shape); - std::size_t lboundIdx = 0; for (auto isTriplet : attrs.asArrayRef()) { // Coordinate of the first element are the index and triplets lower // bounds. - mlir::Value index = indices[i]; - if (!lbounds.empty()) { - assert(lboundIdx < lbounds.size() && "missing lbound"); - mlir::Type indexType = builder.getIndexType(); - mlir::Value one = builder.createIntegerConstant(loc, indexType, 1); - mlir::Value orig = builder.createConvert(loc, indexType, index); - mlir::Value lb = - builder.createConvert(loc, indexType, lbounds[lboundIdx]); - index = builder.create(loc, orig, lb); - index = builder.create(loc, index, one); - ++lboundIdx; - } - firstElementIndices.push_back(index); + firstElementIndices.push_back(indices[i]); i = i + (isTriplet ? 3 : 1); } - // Remove the shift from the shape, if needed. 
- if (!lbounds.empty()) { - mlir::Operation *op = shape.getDefiningOp(); - if (mlir::isa(op)) - shape = nullptr; - else if (auto shiftOp = mlir::dyn_cast(op)) - shape = builder.create(loc, shiftOp.getExtents()); - else - TODO(loc, "read fir.shape to get lower bounds"); - } mlir::Type originalDesignateType = designate.getResult().getType(); const bool isVolatile = fir::isa_volatile_type(originalDesignateType); mlir::Type arrayCoorType = fir::ReferenceType::get(baseEleTy, isVolatile); diff --git a/flang/test/HLFIR/designate-codegen.fir b/flang/test/HLFIR/designate-codegen.fir index d3e264941264f..5c3ae202fd3b9 100644 --- a/flang/test/HLFIR/designate-codegen.fir +++ b/flang/test/HLFIR/designate-codegen.fir @@ -278,10 +278,5 @@ func.func @_QPtest_contiguous_derived_lbounds(%arg0: !fir.class,i:i32}>>>, index) -> (index, index, index) // CHECK: %[[VAL_13:.*]] = arith.constant 1 : index // CHECK: %[[VAL_14:.*]]:3 = fir.box_dims %[[VAL_9]], %[[VAL_13]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) -// CHECK: %[[VAL_15:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_16:.*]] = arith.subi %[[VAL_1]], %[[VAL_1]] : index -// CHECK: %[[VAL_17:.*]] = arith.addi %[[VAL_16]], %[[VAL_15]] : index -// CHECK: %[[VAL_18:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_19:.*]] = arith.subi %[[VAL_0]], %[[VAL_0]] : index -// CHECK: %[[VAL_20:.*]] = arith.addi %[[VAL_19]], %[[VAL_18]] : index -// CHECK: %[[VAL_21:.*]] = fir.array_coor %[[VAL_10]] %[[VAL_17]], %[[VAL_20]] : (!fir.ref,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> +// CHECK: %[[VAL_15:.*]] = fir.shape_shift %[[VAL_1]], %[[VAL_12]]#1, %[[VAL_0]], %[[VAL_14]]#1 : (index, index, index, index) -> !fir.shapeshift<2> +// CHECK: %[[VAL_16:.*]] = fir.array_coor %[[VAL_10]](%[[VAL_15]]) %[[VAL_1]], %[[VAL_0]] : (!fir.ref,i:i32}>>>, !fir.shapeshift<2>, index, index) -> !fir.ref,i:i32}>> From flang-commits at lists.llvm.org Fri May 9 18:29:29 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) 
Date: Fri, 09 May 2025 18:29:29 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fixed designator codegen for contiguous boxes. (PR #139003)
In-Reply-To:
Message-ID: <681eabf9.170a0220.10baaf.cc25@mx.google.com>

vzakhari wrote:

Thank you for the comments, Jean! I uploaded an update.

https://github.com/llvm/llvm-project/pull/139003

From flang-commits at lists.llvm.org Fri May 9 19:04:34 2025
From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits)
Date: Fri, 09 May 2025 19:04:34 -0700 (PDT)
Subject: [flang-commits] [flang] [Flang][MLIR] - Handle the mapping of subroutine arguments when they are subsequently used inside the region of an `omp.target` Op (PR #134967)
In-Reply-To:
Message-ID: <681eb432.050a0220.460ce.ea45@mx.google.com>

https://github.com/vzakhari approved this pull request.

Looks much clearer to me. Thank you!

https://github.com/llvm/llvm-project/pull/134967

From flang-commits at lists.llvm.org Fri May 9 19:37:51 2025
From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits)
Date: Fri, 09 May 2025 19:37:51 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Postpone hlfir.end_associate generation for calls. (PR #138786)
In-Reply-To:
Message-ID: <681ebbff.170a0220.2761a8.a1f9@mx.google.com>

https://github.com/vzakhari updated https://github.com/llvm/llvm-project/pull/138786

>From f65c8b369ce0e866996095c239293f0716608d11 Mon Sep 17 00:00:00 2001
From: Slava Zakharin
Date: Tue, 6 May 2025 16:38:48 -0700
Subject: [PATCH 1/5] [flang] Postpone hlfir.end_associate generation for calls.

If we generate hlfir.end_associate at the end of the statement, we get HLFIR that is easier to optimize, because there are no compiler-generated operations with side effects between the call and the consumers. This allows more hlfir.eval_in_mem operations to reuse the LHS instead of allocating a temporary buffer.
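
The ordering change described above can be illustrated with a minimal sketch. It does not use flang's actual classes: `StmtCtx`, `lowerAssignment`, and the operation names pushed into the vector are hypothetical stand-ins, assumed here only to show how attaching a clean-up to a statement-level context moves it past the assignment.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a statement context (not flang's class):
// it collects clean-up callbacks and runs them once the whole
// statement has been lowered.
class StmtCtx {
public:
  void attachCleanup(std::function<void()> fn) {
    cleanups.push_back(std::move(fn));
  }
  void finalize() {
    for (auto &fn : cleanups)
      fn();
    cleanups.clear();
  }

private:
  std::vector<std::function<void()>> cleanups;
};

// Models lowering of "x = f(1.0)" as a list of pseudo-operation names.
// With postpone == false the association is ended right after the call;
// with postpone == true it is ended when the statement context is
// finalized, that is, after the assignment.
std::vector<std::string> lowerAssignment(bool postpone) {
  std::vector<std::string> ops;
  StmtCtx stmtCtx;
  ops.push_back("associate");
  ops.push_back("call");
  if (postpone)
    stmtCtx.attachCleanup([&ops] { ops.push_back("end_associate"); });
  else
    ops.push_back("end_associate");
  ops.push_back("assign");
  stmtCtx.finalize(); // end of the statement
  return ops;
}
```

In this model, `lowerAssignment(true)` leaves nothing between the call and the assignment that consumes its result, which is the property the commit message relies on.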

I do not think the same can be done for hlfir.copy_out always, e.g.:
```
subroutine test2(x)
  interface
    function array_func2(x,y)
      real:: x(*), array_func2(10), y
    end function array_func2
  end interface
  real :: x(:)
  x = array_func2(x, 1.0)
end subroutine test2
```

If we postpone the copy-out until after the assignment, then the result may be wrong.
---
 flang/lib/Lower/ConvertCall.cpp | 42 +++++++--
 .../Lower/HLFIR/call-postponed-associate.f90 | 85 +++++++++++++++++++
 2 files changed, 121 insertions(+), 6 deletions(-)
 create mode 100644 flang/test/Lower/HLFIR/call-postponed-associate.f90

diff --git a/flang/lib/Lower/ConvertCall.cpp b/flang/lib/Lower/ConvertCall.cpp
index a5b85e25b1af0..d37d51f6ec634 100644
--- a/flang/lib/Lower/ConvertCall.cpp
+++ b/flang/lib/Lower/ConvertCall.cpp
@@ -960,9 +960,26 @@ struct CallCleanUp {
     mlir::Value tempVar;
     mlir::Value mustFree;
   };
-  void genCleanUp(mlir::Location loc, fir::FirOpBuilder &builder) {
-    Fortran::common::visit([&](auto &c) { c.genCleanUp(loc, builder); },
+
+  /// Generate clean-up code.
+  /// If \p postponeAssociates is true, the ExprAssociate clean-up
+  /// is not generated, and instead the corresponding CallCleanUp
+  /// object is returned as the result.
+  std::optional genCleanUp(mlir::Location loc,
+                           fir::FirOpBuilder &builder,
+                           bool postponeAssociates) {
+    std::optional postponed;
+    Fortran::common::visit(Fortran::common::visitors{
+                               [&](CopyIn &c) { c.genCleanUp(loc, builder); },
+                               [&](ExprAssociate &c) {
+                                 if (postponeAssociates)
+                                   postponed = CallCleanUp{c};
+                                 else
+                                   c.genCleanUp(loc, builder);
+                               },
+                           },
                            cleanUp);
+    return postponed;
   }
   std::variant cleanUp;
 };
@@ -1729,10 +1746,23 @@ genUserCall(Fortran::lower::PreparedActualArguments &loweredActuals,
       caller, callSiteType, callContext.resultType,
       callContext.isElementalProcWithArrayArgs());
-  /// Clean-up associations and copy-in.
-  for (auto cleanUp : callCleanUps)
-    cleanUp.genCleanUp(loc, builder);
-
+  // Clean-up associations and copy-in.
+ // The association clean-ups are postponed to the end of the statement + // lowering. The copy-in clean-ups may be delayed as well, + // but they are done immediately after the call currently. + llvm::SmallVector associateCleanups; + for (auto cleanUp : callCleanUps) { + auto postponed = + cleanUp.genCleanUp(loc, builder, /*postponeAssociates=*/true); + if (postponed) + associateCleanups.push_back(*postponed); + } + + fir::FirOpBuilder *bldr = &builder; + callContext.stmtCtx.attachCleanup([=]() { + for (auto cleanUp : associateCleanups) + (void)cleanUp.genCleanUp(loc, *bldr, /*postponeAssociates=*/false); + }); if (auto *entity = std::get_if(&loweredResult)) return *entity; diff --git a/flang/test/Lower/HLFIR/call-postponed-associate.f90 b/flang/test/Lower/HLFIR/call-postponed-associate.f90 new file mode 100644 index 0000000000000..18df62b44324b --- /dev/null +++ b/flang/test/Lower/HLFIR/call-postponed-associate.f90 @@ -0,0 +1,85 @@ +! RUN: bbc -emit-hlfir -o - %s -I nowhere | FileCheck %s + +subroutine test1 + interface + function array_func1(x) + real:: x, array_func1(10) + end function array_func1 + end interface + real :: x(10) + x = array_func1(1.0) +end subroutine test1 +! CHECK-LABEL: func.func @_QPtest1() { +! CHECK: %[[VAL_5:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_17:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: fir.call @_QParray_func1 +! CHECK: fir.save_result +! CHECK: } +! CHECK: hlfir.assign %[[VAL_17]] to %{{.*}} : !hlfir.expr<10xf32>, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 + +subroutine test2(x) + interface + function array_func2(x,y) + real:: x(*), array_func2(10), y + end function array_func2 + end interface + real :: x(:) + x = array_func2(x, 1.0) +end subroutine test2 +! CHECK-LABEL: func.func @_QPtest2( +! 
CHECK: %[[VAL_3:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_4:.*]]:2 = hlfir.copy_in %{{.*}} to %{{.*}} : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +! CHECK: %[[VAL_5:.*]] = fir.box_addr %[[VAL_4]]#0 : (!fir.box>) -> !fir.ref> +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_3]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_17:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: ^bb0(%[[VAL_18:.*]]: !fir.ref>): +! CHECK: %[[VAL_19:.*]] = fir.call @_QParray_func2(%[[VAL_5]], %[[VAL_6]]#0) fastmath : (!fir.ref>, !fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_19]] to %[[VAL_18]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: hlfir.copy_out %{{.*}}, %[[VAL_4]]#1 to %{{.*}} : (!fir.ref>>>, i1, !fir.box>) -> () +! CHECK: hlfir.assign %[[VAL_17]] to %{{.*}} : !hlfir.expr<10xf32>, !fir.box> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_17]] : !hlfir.expr<10xf32> + +subroutine test3(x) + interface + function array_func3(x) + real :: x, array_func3(10) + end function array_func3 + end interface + logical :: x + if (any(array_func3(1.0).le.array_func3(2.0))) x = .true. +end subroutine test3 +! CHECK-LABEL: func.func @_QPtest3( +! CHECK: %[[VAL_2:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_3:.*]]:3 = hlfir.associate %[[VAL_2]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_14:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: ^bb0(%[[VAL_15:.*]]: !fir.ref>): +! CHECK: %[[VAL_16:.*]] = fir.call @_QParray_func3(%[[VAL_3]]#0) fastmath : (!fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_16]] to %[[VAL_15]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: %[[VAL_17:.*]] = arith.constant 2.000000e+00 : f32 +! 
CHECK: %[[VAL_18:.*]]:3 = hlfir.associate %[[VAL_17]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_29:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: ^bb0(%[[VAL_30:.*]]: !fir.ref>): +! CHECK: %[[VAL_31:.*]] = fir.call @_QParray_func3(%[[VAL_18]]#0) fastmath : (!fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_31]] to %[[VAL_30]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: %[[VAL_32:.*]] = hlfir.elemental %{{.*}} unordered : (!fir.shape<1>) -> !hlfir.expr> { +! CHECK: ^bb0(%[[VAL_33:.*]]: index): +! CHECK: %[[VAL_34:.*]] = hlfir.apply %[[VAL_14]], %[[VAL_33]] : (!hlfir.expr<10xf32>, index) -> f32 +! CHECK: %[[VAL_35:.*]] = hlfir.apply %[[VAL_29]], %[[VAL_33]] : (!hlfir.expr<10xf32>, index) -> f32 +! CHECK: %[[VAL_36:.*]] = arith.cmpf ole, %[[VAL_34]], %[[VAL_35]] fastmath : f32 +! CHECK: %[[VAL_37:.*]] = fir.convert %[[VAL_36]] : (i1) -> !fir.logical<4> +! CHECK: hlfir.yield_element %[[VAL_37]] : !fir.logical<4> +! CHECK: } +! CHECK: %[[VAL_38:.*]] = hlfir.any %[[VAL_32]] : (!hlfir.expr>) -> !fir.logical<4> +! CHECK: hlfir.destroy %[[VAL_32]] : !hlfir.expr> +! CHECK: hlfir.end_associate %[[VAL_18]]#1, %[[VAL_18]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_29]] : !hlfir.expr<10xf32> +! CHECK: hlfir.end_associate %[[VAL_3]]#1, %[[VAL_3]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_14]] : !hlfir.expr<10xf32> +! CHECK: %[[VAL_39:.*]] = fir.convert %[[VAL_38]] : (!fir.logical<4>) -> i1 +! CHECK: fir.if %[[VAL_39]] { >From 8ddc8be77b16303127470993be7333d35fb9fb56 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Tue, 6 May 2025 19:14:39 -0700 Subject: [PATCH 2/5] Added test changes missing from the original patch. 
--- flang/test/Lower/HLFIR/entry_return.f90 | 8 ++++---- flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/flang/test/Lower/HLFIR/entry_return.f90 b/flang/test/Lower/HLFIR/entry_return.f90 index 5d3e160af2df6..18fb2b571b950 100644 --- a/flang/test/Lower/HLFIR/entry_return.f90 +++ b/flang/test/Lower/HLFIR/entry_return.f90 @@ -51,13 +51,13 @@ logical function f2() ! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_4]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_8:.*]] = fir.call @_QPcomplex(%[[VAL_6]]#0, %[[VAL_7]]#0) fastmath : (!fir.ref, !fir.ref) -> f32 -! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 -! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_9:.*]] = arith.constant 0.000000e+00 : f32 ! CHECK: %[[VAL_10:.*]] = fir.undefined complex ! CHECK: %[[VAL_11:.*]] = fir.insert_value %[[VAL_10]], %[[VAL_8]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[VAL_12:.*]] = fir.insert_value %[[VAL_11]], %[[VAL_9]], [1 : index] : (complex, f32) -> complex ! CHECK: hlfir.assign %[[VAL_12]] to %[[VAL_1]]#0 : complex, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_3]]#0 : !fir.ref> ! CHECK: return %[[VAL_13]] : !fir.logical<4> ! CHECK: } @@ -74,13 +74,13 @@ logical function f2() ! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_4]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_8:.*]] = fir.call @_QPcomplex(%[[VAL_6]]#0, %[[VAL_7]]#0) fastmath : (!fir.ref, !fir.ref) -> f32 -! 
CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 -! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_9:.*]] = arith.constant 0.000000e+00 : f32 ! CHECK: %[[VAL_10:.*]] = fir.undefined complex ! CHECK: %[[VAL_11:.*]] = fir.insert_value %[[VAL_10]], %[[VAL_8]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[VAL_12:.*]] = fir.insert_value %[[VAL_11]], %[[VAL_9]], [1 : index] : (complex, f32) -> complex ! CHECK: hlfir.assign %[[VAL_12]] to %[[VAL_1]]#0 : complex, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_1]]#0 : !fir.ref> ! CHECK: return %[[VAL_13]] : complex ! CHECK: } diff --git a/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 b/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 index 28659a33d0893..206b6e4e9b797 100644 --- a/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 +++ b/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 @@ -32,8 +32,8 @@ real function test1(x) ! CHECK: %[[VAL_7:.*]] = fir.load %[[VAL_6]] : !fir.ref) -> f32>> ! CHECK: %[[VAL_8:.*]] = fir.box_addr %[[VAL_7]] : (!fir.boxproc<(!fir.ref) -> f32>) -> ((!fir.ref) -> f32) ! CHECK: %[[VAL_9:.*]] = fir.call %[[VAL_8]](%[[VAL_5]]#0) fastmath : (!fir.ref) -> f32 -! CHECK: hlfir.end_associate %[[VAL_5]]#1, %[[VAL_5]]#2 : !fir.ref, i1 ! CHECK: hlfir.assign %[[VAL_9]] to %[[VAL_2]]#0 : f32, !fir.ref +! CHECK: hlfir.end_associate %[[VAL_5]]#1, %[[VAL_5]]#2 : !fir.ref, i1 subroutine test2(x) use proc_comp_defs, only : t, iface >From d843f4eb45ad474adef13371540b2d22817074f0 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Thu, 8 May 2025 12:18:07 -0700 Subject: [PATCH 3/5] Fixed clean-ups insertion for atomic capture. 
--- flang/lib/Lower/OpenMP/OpenMP.cpp | 4 +++- flang/test/Lower/OpenMP/atomic-capture.f90 | 20 ++++++++++++++++++++ flang/test/Lower/OpenMP/atomic-update.f90 | 21 +++++++++++++++++++++ 3 files changed, 44 insertions(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fcd3de9671098..d1a77a2624628 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3129,7 +3129,9 @@ static void genAtomicCapture(lower::AbstractConverter &converter, } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); + // The clean-ups associated with the statements inside the capture + // construct must be generated after the AtomicCaptureOp. + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b5c8edc8f31c1 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -97,3 +97,23 @@ subroutine pointers_in_atomic_capture() b = a !$omp end atomic end subroutine + +! Check that the clean-ups associated with the function call +! are generated after the omp.atomic.capture operation: +! CHECK-LABEL: func.func @_QPfunc_call_cleanup( +subroutine func_call_cleanup(x, v, vv) + integer :: x, v, vv + +! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_8:.*]] = fir.call @_QPfunc(%[[VAL_7]]#0) fastmath : (!fir.ref) -> f32 +! CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_8]] : (f32) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %{{.*}} = %[[VAL_3:.*]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[VAL_3]]#0 = %[[VAL_9]] : !fir.ref, i32 +! CHECK: } +! 
CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 + !$omp atomic capture + v = x + x = func(vv + 1) + !$omp end atomic +end subroutine func_call_cleanup diff --git a/flang/test/Lower/OpenMP/atomic-update.f90 b/flang/test/Lower/OpenMP/atomic-update.f90 index 31bf447006930..e0269ea1f8af1 100644 --- a/flang/test/Lower/OpenMP/atomic-update.f90 +++ b/flang/test/Lower/OpenMP/atomic-update.f90 @@ -201,3 +201,24 @@ program OmpAtomicUpdate !$omp atomic update x = x + sum([ (y+2, y=1, z) ]) end program OmpAtomicUpdate + +! Check that the clean-ups associated with the function call +! are generated after the omp.atomic.update operation: +! CHECK-LABEL: func.func @_QPfunc_call_cleanup( +subroutine func_call_cleanup(v, vv) + integer v, vv + +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_7:.*]] = fir.call @_QPfunc(%[[VAL_6]]#0) fastmath : (!fir.ref) -> f32 +! CHECK: omp.atomic.update %{{.*}} : !fir.ref { +! CHECK: ^bb0(%[[VAL_8:.*]]: i32): +! CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_8]] : (i32) -> f32 +! CHECK: %[[VAL_10:.*]] = arith.addf %[[VAL_9]], %[[VAL_7]] fastmath : f32 +! CHECK: %[[VAL_11:.*]] = fir.convert %[[VAL_10]] : (f32) -> i32 +! CHECK: omp.yield(%[[VAL_11]] : i32) +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 + !$omp atomic update + v = v + func(vv + 1) + !$omp end atomic +end subroutine func_call_cleanup >From 22c381e753d137e02ef81996c2ec65ca91f86410 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Thu, 8 May 2025 16:13:01 -0700 Subject: [PATCH 4/5] Fixed atomic capture cases with atomic update inside. 
--- flang/lib/Lower/OpenMP/OpenMP.cpp | 20 +++++++--- flang/test/Lower/OpenMP/atomic-capture.f90 | 44 ++++++++++++++++++++-- 2 files changed, 55 insertions(+), 9 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index d1a77a2624628..5b0b54b9e0377 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2729,7 +2729,8 @@ static void genAtomicUpdateStatement( const parser::Expr &assignmentStmtExpr, const parser::OmpAtomicClauseList *leftHandClauseList, const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { + mlir::Operation *atomicCaptureOp = nullptr, + lower::StatementContext *atomicCaptureStmtCtx = nullptr) { // Generate `atomic.update` operation for atomic assignment statements fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); mlir::Location currentLocation = converter.getCurrentLocation(); @@ -2803,15 +2804,24 @@ static void genAtomicUpdateStatement( }, assignmentStmtExpr.u); lower::StatementContext nonAtomicStmtCtx; + lower::StatementContext *stmtCtxPtr = &nonAtomicStmtCtx; if (!nonAtomicSubExprs.empty()) { // Generate non atomic part before all the atomic operations. auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) + if (atomicCaptureOp) { + assert(atomicCaptureStmtCtx && "must specify statement context"); firOpBuilder.setInsertionPoint(atomicCaptureOp); + // Any clean-ups associated with the expression lowering + // must also be generated outside of the atomic update operation + // and after the atomic capture operation. + // The atomicCaptureStmtCtx will be finalized at the end + // of the atomic capture operation generation. 
+ stmtCtxPtr = atomicCaptureStmtCtx; + } mlir::Value nonAtomicVal; for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + currentLocation, *nonAtomicSubExpr, *stmtCtxPtr)); exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); } if (atomicCaptureOp) @@ -3097,7 +3107,7 @@ static void genAtomicCapture(lower::AbstractConverter &converter, genAtomicUpdateStatement( converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp, &stmtCtx); } else { // Atomic capture construct is of the form [capture-stmt, write-stmt] firOpBuilder.setInsertionPoint(atomicCaptureOp); @@ -3121,7 +3131,7 @@ static void genAtomicCapture(lower::AbstractConverter &converter, genAtomicUpdateStatement( converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp, &stmtCtx); genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, /*leftHandClauseList=*/nullptr, /*rightHandClauseList=*/nullptr, elementType, diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index b5c8edc8f31c1..2f800d534dc36 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -102,18 +102,54 @@ subroutine pointers_in_atomic_capture() ! are generated after the omp.atomic.capture operation: ! CHECK-LABEL: func.func @_QPfunc_call_cleanup( subroutine func_call_cleanup(x, v, vv) + interface + integer function func(x) + integer :: x + end function func + end interface integer :: x, v, vv ! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -! 
CHECK: %[[VAL_8:.*]] = fir.call @_QPfunc(%[[VAL_7]]#0) fastmath : (!fir.ref) -> f32 -! CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_8]] : (f32) -> i32 +! CHECK: %[[VAL_8:.*]] = fir.call @_QPfunc(%[[VAL_7]]#0) fastmath : (!fir.ref) -> i32 ! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.read %{{.*}} = %[[VAL_3:.*]]#0 : !fir.ref, !fir.ref, i32 -! CHECK: omp.atomic.write %[[VAL_3]]#0 = %[[VAL_9]] : !fir.ref, i32 +! CHECK: omp.atomic.read %[[VAL_1:.*]]#0 = %[[VAL_3:.*]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[VAL_3]]#0 = %[[VAL_8]] : !fir.ref, i32 ! CHECK: } ! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 !$omp atomic capture v = x x = func(vv + 1) !$omp end atomic + +! CHECK: %[[VAL_12:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_13:.*]] = fir.call @_QPfunc(%[[VAL_12]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[VAL_1]]#0 = %[[VAL_3]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.update %[[VAL_3]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_14:.*]]: i32): +! CHECK: %[[VAL_15:.*]] = arith.addi %[[VAL_13]], %[[VAL_14]] : i32 +! CHECK: omp.yield(%[[VAL_15]] : i32) +! CHECK: } +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_12]]#1, %[[VAL_12]]#2 : !fir.ref, i1 + !$omp atomic capture + v = x + x = func(vv + 1) + x + !$omp end atomic + +! CHECK: %[[VAL_19:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_20:.*]] = fir.call @_QPfunc(%[[VAL_19]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[VAL_3]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_21:.*]]: i32): +! CHECK: %[[VAL_22:.*]] = arith.addi %[[VAL_20]], %[[VAL_21]] : i32 +! CHECK: omp.yield(%[[VAL_22]] : i32) +! CHECK: } +! CHECK: omp.atomic.read %[[VAL_1]]#0 = %[[VAL_3]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: } +! 
CHECK: hlfir.end_associate %[[VAL_19]]#1, %[[VAL_19]]#2 : !fir.ref, i1 + !$omp atomic capture + x = func(vv + 1) + x + v = x + !$omp end atomic end subroutine func_call_cleanup >From 94df928e0031c9d26cad29841701959b7322508b Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Fri, 9 May 2025 19:33:34 -0700 Subject: [PATCH 5/5] Fixed atomic handling for OpenACC. --- flang/lib/Lower/OpenACC.cpp | 24 ++++++-- .../test/Lower/OpenACC/acc-atomic-capture.f90 | 57 +++++++++++++++++++ .../test/Lower/OpenACC/acc-atomic-update.f90 | 18 +++++- 3 files changed, 92 insertions(+), 7 deletions(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 82daa05c165cb..c4529a3115996 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -414,7 +414,8 @@ static inline void genAtomicUpdateStatement( Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { + mlir::Operation *atomicCaptureOp = nullptr, + Fortran::lower::StatementContext *atomicCaptureStmtCtx = nullptr) { // Generate `atomic.update` operation for atomic assignment statements fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); mlir::Location currentLocation = converter.getCurrentLocation(); @@ -494,15 +495,24 @@ static inline void genAtomicUpdateStatement( }, assignmentStmtExpr.u); Fortran::lower::StatementContext nonAtomicStmtCtx; + Fortran::lower::StatementContext *stmtCtxPtr = &nonAtomicStmtCtx; if (!nonAtomicSubExprs.empty()) { // Generate non atomic part before all the atomic operations. 
auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) + if (atomicCaptureOp) { + assert(atomicCaptureStmtCtx && "must specify statement context"); firOpBuilder.setInsertionPoint(atomicCaptureOp); + // Any clean-ups associated with the expression lowering + // must also be generated outside of the atomic update operation + // and after the atomic capture operation. + // The atomicCaptureStmtCtx will be finalized at the end + // of the atomic capture operation generation. + stmtCtxPtr = atomicCaptureStmtCtx; + } mlir::Value nonAtomicVal; for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + currentLocation, *nonAtomicSubExpr, *stmtCtxPtr)); exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); } if (atomicCaptureOp) @@ -650,7 +660,7 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, elementType, loc); genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, - stmt2Expr, loc, atomicCaptureOp); + stmt2Expr, loc, atomicCaptureOp, &stmtCtx); } else { // Atomic capture construct is of the form [capture-stmt, write-stmt] firOpBuilder.setInsertionPoint(atomicCaptureOp); @@ -670,13 +680,15 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, *Fortran::semantics::GetExpr(stmt2Expr); mlir::Type elementType = converter.genType(fromExpr); genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, - stmt1Expr, loc, atomicCaptureOp); + stmt1Expr, loc, atomicCaptureOp, &stmtCtx); genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, loc); } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); + // The clean-ups associated with the statements inside the capture + // construct must be generated after the AtomicCaptureOp. 
+ firOpBuilder.setInsertionPointAfter(atomicCaptureOp); } template diff --git a/flang/test/Lower/OpenACC/acc-atomic-capture.f90 b/flang/test/Lower/OpenACC/acc-atomic-capture.f90 index 82059908bcd0b..ee38ab6ce826a 100644 --- a/flang/test/Lower/OpenACC/acc-atomic-capture.f90 +++ b/flang/test/Lower/OpenACC/acc-atomic-capture.f90 @@ -306,3 +306,60 @@ end subroutine comp_ref_in_atomic_capture2 ! CHECK: } ! CHECK: acc.atomic.read %[[V_DECL]]#0 = %[[C]] : !fir.ref, !fir.ref, i32 ! CHECK: } + +! CHECK-LABEL: func.func @_QPatomic_capture_with_associate() { +subroutine atomic_capture_with_associate + interface + integer function func(x) + integer :: x + end function func + end interface +! CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "_QFatomic_capture_with_associateEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Y_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "_QFatomic_capture_with_associateEy"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Z_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "_QFatomic_capture_with_associateEz"} : (!fir.ref) -> (!fir.ref, !fir.ref) + integer :: x, y, z + +! CHECK: %[[VAL_10:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_11:.*]] = fir.call @_QPfunc(%[[VAL_10]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: acc.atomic.capture { +! CHECK: acc.atomic.read %[[X_DECL]]#0 = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: acc.atomic.write %[[Y_DECL]]#0 = %[[VAL_11]] : !fir.ref, i32 +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_10]]#1, %[[VAL_10]]#2 : !fir.ref, i1 + !$acc atomic capture + x = y + y = func(z + 1) + !$acc end atomic + +! CHECK: %[[VAL_15:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_16:.*]] = fir.call @_QPfunc(%[[VAL_15]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: acc.atomic.capture { +! CHECK: acc.atomic.update %[[Y_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_17:.*]]: i32): +! 
CHECK: %[[VAL_18:.*]] = arith.muli %[[VAL_16]], %[[VAL_17]] : i32 +! CHECK: acc.yield %[[VAL_18]] : i32 +! CHECK: } +! CHECK: acc.atomic.read %[[X_DECL]]#0 = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_15]]#1, %[[VAL_15]]#2 : !fir.ref, i1 + !$acc atomic capture + y = func(z + 1) * y + x = y + !$acc end atomic + +! CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_23:.*]] = fir.call @_QPfunc(%[[VAL_22]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: acc.atomic.capture { +! CHECK: acc.atomic.read %[[X_DECL]]#0 = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: acc.atomic.update %[[Y_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_24:.*]]: i32): +! CHECK: %[[VAL_25:.*]] = arith.addi %[[VAL_23]], %[[VAL_24]] : i32 +! CHECK: acc.yield %[[VAL_25]] : i32 +! CHECK: } +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_22]]#1, %[[VAL_22]]#2 : !fir.ref, i1 + !$acc atomic capture + x = y + y = func(z + 1) + y + !$acc end atomic +end subroutine atomic_capture_with_associate diff --git a/flang/test/Lower/OpenACC/acc-atomic-update.f90 b/flang/test/Lower/OpenACC/acc-atomic-update.f90 index da2972877244c..71aa69fd64eba 100644 --- a/flang/test/Lower/OpenACC/acc-atomic-update.f90 +++ b/flang/test/Lower/OpenACC/acc-atomic-update.f90 @@ -3,6 +3,11 @@ ! 
RUN: %flang_fc1 -fopenacc -emit-hlfir %s -o - | FileCheck %s program acc_atomic_update_test + interface + integer function func(x) + integer :: x + end function func + end interface integer :: x, y, z integer, pointer :: a, b integer, target :: c, d @@ -67,7 +72,18 @@ program acc_atomic_update_test !$acc atomic i1 = i1 + 1 !$acc end atomic + +!CHECK: %[[VAL_44:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +!CHECK: %[[VAL_45:.*]] = fir.call @_QPfunc(%[[VAL_44]]#0) fastmath : (!fir.ref) -> i32 +!CHECK: acc.atomic.update %[[X_DECL]]#0 : !fir.ref { +!CHECK: ^bb0(%[[VAL_46:.*]]: i32): +!CHECK: %[[VAL_47:.*]] = arith.addi %[[VAL_46]], %[[VAL_45]] : i32 +!CHECK: acc.yield %[[VAL_47]] : i32 +!CHECK: } +!CHECK: hlfir.end_associate %[[VAL_44]]#1, %[[VAL_44]]#2 : !fir.ref, i1 + !$acc atomic update + x = x + func(z + 1) + !$acc end atomic !CHECK: return !CHECK: } end program acc_atomic_update_test - From flang-commits at lists.llvm.org Fri May 9 19:38:37 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 09 May 2025 19:38:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Postpone hlfir.end_associate generation for calls. (PR #138786) In-Reply-To: Message-ID: <681ebc2d.a70a0220.ddcdc.ec85@mx.google.com> vzakhari wrote: Tom, thank you for pointing out the OpenACC side! @razvanlupusoru can you please review the change for OpenACC (same as for OpenMP)? 
https://github.com/llvm/llvm-project/pull/138786 From flang-commits at lists.llvm.org Fri May 9 22:25:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 22:25:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix issue when macro is followed by OpenMP pragma (PR #123035) In-Reply-To: Message-ID: <681ee341.170a0220.255e74.d41a@mx.google.com> https://github.com/shivaramaarao updated https://github.com/llvm/llvm-project/pull/123035 >From 7f303b7645b0ec2bd3b140232c10974bf8bf837e Mon Sep 17 00:00:00 2001 From: Shivarama Rao Date: Wed, 15 Jan 2025 09:44:04 +0000 Subject: [PATCH] When calling IsCompilerDirectiveSentinel,the prefix comment character need to be skipped. Fixes #117693 and also handles continuation lines beginning with Macros --- flang/include/flang/Parser/token-sequence.h | 1 + flang/lib/Parser/prescan.cpp | 9 ++--- flang/lib/Parser/token-sequence.cpp | 40 +++++++++++++++++++++ flang/test/Preprocessing/bug117693.f90 | 14 ++++++++ flang/test/Preprocessing/bug117693_2.f90 | 15 ++++++++ flang/test/Preprocessing/bug117693_3.f90 | 7 ++++ 6 files changed, 82 insertions(+), 4 deletions(-) create mode 100644 flang/test/Preprocessing/bug117693.f90 create mode 100644 flang/test/Preprocessing/bug117693_2.f90 create mode 100644 flang/test/Preprocessing/bug117693_3.f90 diff --git a/flang/include/flang/Parser/token-sequence.h b/flang/include/flang/Parser/token-sequence.h index 047c0bed00762..a5f9b86ddd1eb 100644 --- a/flang/include/flang/Parser/token-sequence.h +++ b/flang/include/flang/Parser/token-sequence.h @@ -123,6 +123,7 @@ class TokenSequence { bool HasRedundantBlanks(std::size_t firstChar = 0) const; TokenSequence &RemoveBlanks(std::size_t firstChar = 0); TokenSequence &RemoveRedundantBlanks(std::size_t firstChar = 0); + TokenSequence &RemoveRedundantCompilerDirectives(const Prescanner &); TokenSequence &ClipComment(const Prescanner &, bool skipFirst = false); const TokenSequence &CheckBadFortranCharacters( Messages &, 
const Prescanner &, bool allowAmpersand) const; diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp index c5939a1e0b6c2..3aa177c80b704 100644 --- a/flang/lib/Parser/prescan.cpp +++ b/flang/lib/Parser/prescan.cpp @@ -175,7 +175,7 @@ void Prescanner::Statement() { EmitChar(tokens, '!'); ++at_, ++column_; for (const char *sp{directiveSentinel_}; *sp != '\0'; - ++sp, ++at_, ++column_) { + ++sp, ++at_, ++column_) { EmitChar(tokens, *sp); } if (IsSpaceOrTab(at_)) { @@ -346,6 +346,7 @@ void Prescanner::CheckAndEmitLine( tokens.CheckBadParentheses(messages_); } } + tokens.RemoveRedundantCompilerDirectives(*this); tokens.Emit(cooked_); if (omitNewline_) { omitNewline_ = false; @@ -511,7 +512,7 @@ bool Prescanner::MustSkipToEndOfLine() const { if (inFixedForm_ && column_ > fixedFormColumnLimit_ && !tabInCurrentLine_) { return true; // skip over ignored columns in right margin (73:80) } else if (*at_ == '!' && !inCharLiteral_) { - return !IsCompilerDirectiveSentinel(at_); + return !IsCompilerDirectiveSentinel(at_ + 1); } else { return false; } @@ -1073,7 +1074,7 @@ std::optional Prescanner::IsIncludeLine(const char *start) const { } if (IsDecimalDigit(*p)) { // accept & ignore a numeric kind prefix for (p = SkipWhiteSpace(p + 1); IsDecimalDigit(*p); - p = SkipWhiteSpace(p + 1)) { + p = SkipWhiteSpace(p + 1)) { } if (*p != '_') { return std::nullopt; @@ -1121,7 +1122,7 @@ void Prescanner::FortranInclude(const char *firstQuote) { llvm::raw_string_ostream error{buf}; Provenance provenance{GetProvenance(nextLine_)}; std::optional prependPath; - if (const SourceFile * currentFile{allSources_.GetSourceFile(provenance)}) { + if (const SourceFile *currentFile{allSources_.GetSourceFile(provenance)}) { prependPath = DirectoryName(currentFile->path()); } const SourceFile *included{ diff --git a/flang/lib/Parser/token-sequence.cpp b/flang/lib/Parser/token-sequence.cpp index cdbe89b1eb441..9695b8d67be0b 100644 --- a/flang/lib/Parser/token-sequence.cpp +++ 
b/flang/lib/Parser/token-sequence.cpp @@ -304,6 +304,46 @@ TokenSequence &TokenSequence::ClipComment( return *this; } +TokenSequence &TokenSequence::RemoveRedundantCompilerDirectives( + const Prescanner &prescanner) { + // When the toekn sqeuence is " " convert it to " + // " + std::size_t tokens{SizeInTokens()}; + TokenSequence result; + bool firstDirective{true}; + for (std::size_t j{0}; j < tokens; ++j) { + CharBlock tok{TokenAt(j)}; + bool isSentinel{false}; + std::size_t blanks{tok.CountLeadingBlanks()}; + if (blanks < tok.size() && tok[blanks] == '!') { + // Retain active compiler directive sentinels (e.g. "!dir$", "!$omp") + for (std::size_t k{j + 1}; k < tokens && tok.size() <= blanks + 5; ++k) { + if (tok.begin() + tok.size() == TokenAt(k).begin()) { + tok.ExtendToCover(TokenAt(k)); + } else { + break; + } + } + } + if (tok.size() > blanks + 5) { + isSentinel = + prescanner.IsCompilerDirectiveSentinel(&tok[blanks + 1]).has_value(); + } + if (isSentinel && + !firstDirective) { // skip the directives if not the first one + j++; + } else { + result.Put(*this, j); + } + if (isSentinel && firstDirective) { + firstDirective = false; + } + } + swap(result); + return *this; +} + void TokenSequence::Emit(CookedSource &cooked) const { if (auto n{char_.size()}) { cooked.Put(&char_[0], n); diff --git a/flang/test/Preprocessing/bug117693.f90 b/flang/test/Preprocessing/bug117693.f90 new file mode 100644 index 0000000000000..ced7927606e62 --- /dev/null +++ b/flang/test/Preprocessing/bug117693.f90 @@ -0,0 +1,14 @@ +! 
RUN: %flang -fopenmp -E %s 2>&1 | FileCheck %s +!CHECK: !$OMP DO SCHEDULE(STATIC) +program main +IMPLICIT NONE +INTEGER:: I +#define OMPSUPPORT +!$ INTEGER :: omp_id +!$OMP PARALLEL DO +OMPSUPPORT !$OMP DO SCHEDULE(STATIC) +DO I=1,100 +print *, omp_id +ENDDO +!$OMP END PARALLEL DO +end program diff --git a/flang/test/Preprocessing/bug117693_2.f90 b/flang/test/Preprocessing/bug117693_2.f90 new file mode 100644 index 0000000000000..fe5027ceddf09 --- /dev/null +++ b/flang/test/Preprocessing/bug117693_2.f90 @@ -0,0 +1,15 @@ +! RUN: %flang -fopenmp -E %s 2>&1 | FileCheck %s +!CHECK: !$OMP DO SCHEDULE(STATIC) DEFAULT(NONE) +!CHECK-NOT: !$OMP DEFAULT(NONE) +program main +IMPLICIT NONE +INTEGER:: I +#define OMPSUPPORT +!$ INTEGER :: omp_id +!$OMP PARALLEL DO +OMPSUPPORT !$OMP DO SCHEDULE(STATIC) !$OMP DEFAULT(NONE) +DO I=1,100 +print *, omp_id +ENDDO +!$OMP END PARALLEL DO +end program diff --git a/flang/test/Preprocessing/bug117693_3.f90 b/flang/test/Preprocessing/bug117693_3.f90 new file mode 100644 index 0000000000000..4fdff04542889 --- /dev/null +++ b/flang/test/Preprocessing/bug117693_3.f90 @@ -0,0 +1,7 @@ +! RUN: %flang -fopenmp -E %s 2>&1 | FileCheck %s +!CHECK-NOT: DO I=1,100 !$OMP +program main +INTEGER::n +DO I=1,100 !!$OMP +ENDDO +END PROGRAM From flang-commits at lists.llvm.org Fri May 9 22:48:22 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 09 May 2025 22:48:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix issue when macro is followed by OpenMP pragma (PR #123035) In-Reply-To: Message-ID: <681ee8a6.a70a0220.14f131.f4d3@mx.google.com> https://github.com/shivaramaarao updated https://github.com/llvm/llvm-project/pull/123035

From flang-commits at lists.llvm.org Sat May 10 02:47:31 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Sat, 10 May 2025 02:47:31 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation direcrive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <681f20b3.170a0220.8749f.c7bc@mx.google.com> eZWALT wrote: I want to notify that the following week I'll be unavailable, so expect this patch to be updated on the 20th of May. Thanks for the feedback @alexey-bataev https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Sat May 10 03:21:25 2025 From: flang-commits at lists.llvm.org (=?UTF-8?B?TWljaGHFgiBHw7Nybnk=?= via flang-commits) Date: Sat, 10 May 2025 03:21:25 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (PR #139131) In-Reply-To: Message-ID: <681f28a5.170a0220.7e312.d54c@mx.google.com> mgorny wrote: I'm also seeing this failure, so I'm going to revert. https://github.com/llvm/llvm-project/pull/139131 From flang-commits at lists.llvm.org Sat May 10 03:23:27 2025 From: flang-commits at lists.llvm.org (=?UTF-8?B?TWljaGHFgiBHw7Nybnk=?= via flang-commits) Date: Sat, 10 May 2025 03:23:27 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (PR #139131) In-Reply-To: Message-ID: <681f291f.170a0220.2395e1.14fc@mx.google.com> mgorny wrote: …or I'll try fixing it first, it seems to be a trivial case of missing include. 
https://github.com/llvm/llvm-project/pull/139131 From flang-commits at lists.llvm.org Sat May 10 03:34:12 2025 From: flang-commits at lists.llvm.org (=?UTF-8?B?TWljaGHFgiBHw7Nybnk=?= via flang-commits) Date: Sat, 10 May 2025 03:34:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add missing `#include` to fix build (PR #139371) Message-ID: https://github.com/mgorny created https://github.com/llvm/llvm-project/pull/139371 Add missing `#include` to `lib/Semantics/unparse-with-symbols.cpp`, in order to fix the build failure introduced in a68f35a17db03a6633a660d310156f4e2f17197f: ``` FAILED: lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o /usr/lib/ccache/bin/x86_64-pc-linux-gnu-g++ -DFLANG_INCLUDE_TESTS=1 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang_build/lib/Semantics -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/include -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang_build/include -isystem /usr/lib/llvm/21/include -O2 -pipe -march=native -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wno-unnecessary-virtual-specifier -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-ctad-maybe-unsupported -fno-strict-aliasing -fno-semantic-interposition -std=c++17 -D_GNU_SOURCE -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -UNDEBUG -MD -MT 
lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o -MF lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o.d -o lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o -c /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp: In function ‘void Fortran::semantics::UnparseWithModules(llvm::raw_ostream&, SemanticsContext&, const Fortran::parser::Program&, Fortran::parser::Encoding)’: /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp:153:33: error: invalid use of incomplete type ‘class Fortran::semantics::SemanticsContext’ 153 | parser::Unparse(out, program, context.langOptions(), encoding, false, true); | ^~~~~~~ In file included from /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp:9: /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/include/flang/Semantics/unparse-with-symbols.h:28:7: note: forward declaration of ‘class Fortran::semantics::SemanticsContext’ 28 | class SemanticsContext; | ^~~~~~~~~~~~~~~~ At global scope: cc1plus: note: unrecognized command-line option ‘-Wno-unnecessary-virtual-specifier’ may have been intended to silence earlier diagnostics ``` >From 34b077f20ebdca92c835d325c8f75dff41145f3a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20G=C3=B3rny?= Date: Sat, 10 May 2025 12:27:16 +0200 Subject: [PATCH] [flang] Add missing `#include` to fix build MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add missing `#include` to `lib/Semantics/unparse-with-symbols.cpp`, in order to fix the build failure introduced in a68f35a17db03a6633a660d310156f4e2f17197f: ``` FAILED: lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o /usr/lib/ccache/bin/x86_64-pc-linux-gnu-g++ -DFLANG_INCLUDE_TESTS=1 -D_DEBUG 
-D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang_build/lib/Semantics -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/include -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang_build/include -isystem /usr/lib/llvm/21/include -O2 -pipe -march=native -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wno-unnecessary-virtual-specifier -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-ctad-maybe-unsupported -fno-strict-aliasing -fno-semantic-interposition -std=c++17 -D_GNU_SOURCE -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -UNDEBUG -MD -MT lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o -MF lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o.d -o lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o -c /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp: In function ‘void Fortran::semantics::UnparseWithModules(llvm::raw_ostream&, SemanticsContext&, const Fortran::parser::Program&, Fortran::parser::Encoding)’: /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp:153:33: error: invalid use of incomplete type ‘class Fortran::semantics::SemanticsContext’ 153 | 
parser::Unparse(out, program, context.langOptions(), encoding, false, true); | ^~~~~~~ In file included from /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp:9: /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/include/flang/Semantics/unparse-with-symbols.h:28:7: note: forward declaration of ‘class Fortran::semantics::SemanticsContext’ 28 | class SemanticsContext; | ^~~~~~~~~~~~~~~~ At global scope: cc1plus: note: unrecognized command-line option ‘-Wno-unnecessary-virtual-specifier’ may have been intended to silence earlier diagnostics ``` --- flang/lib/Semantics/unparse-with-symbols.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/flang/lib/Semantics/unparse-with-symbols.cpp b/flang/lib/Semantics/unparse-with-symbols.cpp index 634d46b8ccf40..f1e2e4ea7f119 100644 --- a/flang/lib/Semantics/unparse-with-symbols.cpp +++ b/flang/lib/Semantics/unparse-with-symbols.cpp @@ -11,6 +11,7 @@ #include "flang/Parser/parse-tree-visitor.h" #include "flang/Parser/parse-tree.h" #include "flang/Parser/unparse.h" +#include "flang/Semantics/semantics.h" #include "flang/Semantics/symbol.h" #include "llvm/Support/raw_ostream.h" #include From flang-commits at lists.llvm.org Sat May 10 03:34:30 2025 From: flang-commits at lists.llvm.org (=?UTF-8?B?TWljaGHFgiBHw7Nybnk=?= via flang-commits) Date: Sat, 10 May 2025 03:34:30 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [flang][OpenMP] Pass OpenMP version to getOpenMPDirectiveName (PR #139131) In-Reply-To: Message-ID: <681f2bb6.050a0220.3cbdfa.fe48@mx.google.com> mgorny wrote: Filed https://github.com/llvm/llvm-project/pull/139371 https://github.com/llvm/llvm-project/pull/139131 From flang-commits at lists.llvm.org Sat May 10 03:34:44 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 03:34:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add missing `#include` to fix build (PR #139371) In-Reply-To: Message-ID: 
<681f2bc4.170a0220.1d245d.ea2d@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Michał Górny (mgorny)
Changes Add missing `#include` to `lib/Semantics/unparse-with-symbols.cpp`, in order to fix the build failure introduced in a68f35a17db03a6633a660d310156f4e2f17197f: ``` FAILED: lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o /usr/lib/ccache/bin/x86_64-pc-linux-gnu-g++ -DFLANG_INCLUDE_TESTS=1 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang_build/lib/Semantics -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/include -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang_build/include -isystem /usr/lib/llvm/21/include -O2 -pipe -march=native -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wno-unnecessary-virtual-specifier -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-ctad-maybe-unsupported -fno-strict-aliasing -fno-semantic-interposition -std=c++17 -D_GNU_SOURCE -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -UNDEBUG -MD -MT lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o -MF lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o.d -o lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o -c /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp: In function ‘void 
Fortran::semantics::UnparseWithModules(llvm::raw_ostream&, SemanticsContext&, const Fortran::parser::Program&, Fortran::parser::Encoding)’: /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp:153:33: error: invalid use of incomplete type ‘class Fortran::semantics::SemanticsContext’ 153 | parser::Unparse(out, program, context.langOptions(), encoding, false, true); | ^~~~~~~ In file included from /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp:9: /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/include/flang/Semantics/unparse-with-symbols.h:28:7: note: forward declaration of ‘class Fortran::semantics::SemanticsContext’ 28 | class SemanticsContext; | ^~~~~~~~~~~~~~~~ At global scope: cc1plus: note: unrecognized command-line option ‘-Wno-unnecessary-virtual-specifier’ may have been intended to silence earlier diagnostics ``` --- Full diff: https://github.com/llvm/llvm-project/pull/139371.diff 1 Files Affected: - (modified) flang/lib/Semantics/unparse-with-symbols.cpp (+1) ``````````diff diff --git a/flang/lib/Semantics/unparse-with-symbols.cpp b/flang/lib/Semantics/unparse-with-symbols.cpp index 634d46b8ccf40..f1e2e4ea7f119 100644 --- a/flang/lib/Semantics/unparse-with-symbols.cpp +++ b/flang/lib/Semantics/unparse-with-symbols.cpp @@ -11,6 +11,7 @@ #include "flang/Parser/parse-tree-visitor.h" #include "flang/Parser/parse-tree.h" #include "flang/Parser/unparse.h" +#include "flang/Semantics/semantics.h" #include "flang/Semantics/symbol.h" #include "llvm/Support/raw_ostream.h" #include ``````````
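For readers less familiar with the diagnostic in the log above, the following is a minimal sketch of the general C++ rule behind "invalid use of incomplete type". It is not flang code: `Context` and `describe` are hypothetical stand-ins, with `Context` playing the role of a class that one header only forward-declares and another header defines in full.

```cpp
#include <string>

// Hypothetical stand-in for a header that only forward-declares the class.
class Context;

// Declaring a function that takes a reference to an incomplete type is fine.
std::string describe(const Context &ctx);

// Hypothetical stand-in for the header that provides the full definition.
// If this definition were not visible before the body of describe() below,
// the member call would be rejected as a use of an incomplete type.
class Context {
public:
  std::string langOptions() const { return "default-options"; }
};

// Calling a member requires the complete type, which is now available.
std::string describe(const Context &ctx) { return ctx.langOptions(); }
```

In this sketch, moving the class definition below the body of `describe` would reproduce the same category of error, which is why adding the defining header is the usual remedy.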
https://github.com/llvm/llvm-project/pull/139371 From flang-commits at lists.llvm.org Sat May 10 03:46:56 2025 From: flang-commits at lists.llvm.org (Sam James via flang-commits) Date: Sat, 10 May 2025 03:46:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add missing `#include` to fix build (PR #139371) In-Reply-To: Message-ID: <681f2ea0.170a0220.1159b6.ce1c@mx.google.com> https://github.com/thesamesam approved this pull request. https://github.com/llvm/llvm-project/pull/139371 From flang-commits at lists.llvm.org Sat May 10 03:49:43 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 03:49:43 -0700 (PDT) Subject: [flang-commits] [flang] fcb4bda - [flang] Add missing `#include` to fix build (#139371) Message-ID: <681f2f47.050a0220.8d898.4f44@mx.google.com> Author: Michał Górny Date: 2025-05-10T12:49:40+02:00 New Revision: fcb4bda9dcfcdb64d8b069e8416c75d7a1a62e52 URL: https://github.com/llvm/llvm-project/commit/fcb4bda9dcfcdb64d8b069e8416c75d7a1a62e52 DIFF: https://github.com/llvm/llvm-project/commit/fcb4bda9dcfcdb64d8b069e8416c75d7a1a62e52.diff LOG: [flang] Add missing `#include` to fix build (#139371) Add missing `#include` to `lib/Semantics/unparse-with-symbols.cpp`, in order to fix the build failure introduced in a68f35a17db03a6633a660d310156f4e2f17197f: ``` FAILED: lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o /usr/lib/ccache/bin/x86_64-pc-linux-gnu-g++ -DFLANG_INCLUDE_TESTS=1 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang_build/lib/Semantics -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/include -I/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang_build/include -isystem /usr/lib/llvm/21/include -O2 -pipe -march=native -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time 
-Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wno-unnecessary-virtual-specifier -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-ctad-maybe-unsupported -fno-strict-aliasing -fno-semantic-interposition -std=c++17 -D_GNU_SOURCE -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -UNDEBUG -MD -MT lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o -MF lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o.d -o lib/Semantics/CMakeFiles/FortranSemantics.dir/unparse-with-symbols.cpp.o -c /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp: In function ‘void Fortran::semantics::UnparseWithModules(llvm::raw_ostream&, SemanticsContext&, const Fortran::parser::Program&, Fortran::parser::Encoding)’: /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp:153:33: error: invalid use of incomplete type ‘class Fortran::semantics::SemanticsContext’ 153 | parser::Unparse(out, program, context.langOptions(), encoding, false, true); | ^~~~~~~ In file included from /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/lib/Semantics/unparse-with-symbols.cpp:9: /var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang/include/flang/Semantics/unparse-with-symbols.h:28:7: note: forward declaration of ‘class Fortran::semantics::SemanticsContext’ 28 | class SemanticsContext; | ^~~~~~~~~~~~~~~~ At global scope: cc1plus: note: unrecognized command-line option 
‘-Wno-unnecessary-virtual-specifier’ may have been intended to silence earlier diagnostics ``` Added: Modified: flang/lib/Semantics/unparse-with-symbols.cpp Removed: ################################################################################ diff --git a/flang/lib/Semantics/unparse-with-symbols.cpp b/flang/lib/Semantics/unparse-with-symbols.cpp index 634d46b8ccf40..f1e2e4ea7f119 100644 --- a/flang/lib/Semantics/unparse-with-symbols.cpp +++ b/flang/lib/Semantics/unparse-with-symbols.cpp @@ -11,6 +11,7 @@ #include "flang/Parser/parse-tree-visitor.h" #include "flang/Parser/parse-tree.h" #include "flang/Parser/unparse.h" +#include "flang/Semantics/semantics.h" #include "flang/Semantics/symbol.h" #include "llvm/Support/raw_ostream.h" #include From flang-commits at lists.llvm.org Sat May 10 03:49:46 2025 From: flang-commits at lists.llvm.org (=?UTF-8?B?TWljaGHFgiBHw7Nybnk=?= via flang-commits) Date: Sat, 10 May 2025 03:49:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add missing `#include` to fix build (PR #139371) In-Reply-To: Message-ID: <681f2f4a.630a0220.9bdbe.7228@mx.google.com> https://github.com/mgorny closed https://github.com/llvm/llvm-project/pull/139371 From flang-commits at lists.llvm.org Sat May 10 06:40:58 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Sat, 10 May 2025 06:40:58 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <681f576a.630a0220.2bf84e.8d67@mx.google.com> https://github.com/kparzysz approved this pull request. 
https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Sat May 10 06:40:59 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Sat, 10 May 2025 06:40:59 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <681f576b.050a0220.c6648.16f3@mx.google.com> ================ @@ -856,6 +856,26 @@ static bool isVectorSubscript(const evaluate::Expr &expr) { return false; } +bool ClauseProcessor::processDefaultMap(lower::StatementContext &stmtCtx, + DefaultMapsTy &result) const { + auto process = [&](const omp::clause::Defaultmap &clause, + const parser::CharBlock &) { + using Defmap = omp::clause::Defaultmap; + clause::Defaultmap::VariableCategory variableCategory = + Defmap::VariableCategory::All; + // Variable Category is optional, if not specified defaults to all. + // Multiples of the same category are illegal as are any other + // defaultmaps being specified when a user specified all is in place, + // however, this should be handled earlier during semantics. + if (auto varCat = + std::get>(clause.t)) + variableCategory = varCat.value_or(Defmap::VariableCategory::All); ---------------- kparzysz wrote: `varCat` has a value here (it's guarded by `if`). 
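The point of the review comment can be illustrated with a small standalone sketch. It assumes nothing about the flang types involved: `Category`, `pickGuarded` and `pickFallback` are hypothetical names used only to show that a `std::optional` tested in an `if` condition is known to hold a value inside the branch, so a `value_or` fallback there is never used.

```cpp
#include <optional>

// Hypothetical enum standing in for a variable-category type.
enum class Category { All, Scalar, Pointer };

// Guarded form: inside the branch the optional is known to hold a value,
// so it can be dereferenced directly and no fallback is needed.
Category pickGuarded(const std::optional<Category> &varCat) {
  Category result = Category::All; // default when nothing was specified
  if (varCat)
    result = *varCat;
  return result;
}

// Unguarded form: value_or supplies the default without a separate check.
Category pickFallback(const std::optional<Category> &varCat) {
  return varCat.value_or(Category::All);
}
```

Both functions return the same result for every input; combining the `if` guard with `value_or` in one place is redundant rather than incorrect.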
https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Sat May 10 06:40:59 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Sat, 10 May 2025 06:40:59 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <681f576b.a70a0220.14f131.057e@mx.google.com> ================ @@ -2231,6 +2232,146 @@ genSingleOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static clause::Defaultmap::ImplicitBehavior +getDefaultmapIfPresent(DefaultMapsTy &defaultMaps, mlir::Type varType) { + using DefMap = clause::Defaultmap; + + if (defaultMaps.empty()) + return DefMap::ImplicitBehavior::Default; + + if (llvm::is_contained(defaultMaps, DefMap::VariableCategory::All)) + return defaultMaps[DefMap::VariableCategory::All]; + + // NOTE: Unsure if complex and/or vector falls into a scalar type + // or aggregate, but the current default implicit behaviour is to + // treat them as such (c_ptr has its own behaviour, so perhaps + // being lumped in as a scalar isn't the right thing). 
+ if ((fir::isa_trivial(varType) || fir::isa_char(varType) || + fir::isa_builtin_cptr_type(varType)) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Scalar)) + return defaultMaps[DefMap::VariableCategory::Scalar]; + + if (fir::isPointerType(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Pointer)) + return defaultMaps[DefMap::VariableCategory::Pointer]; + + if (fir::isAllocatableType(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Allocatable)) + return defaultMaps[DefMap::VariableCategory::Allocatable]; + + if (fir::isa_aggregate(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Aggregate)) { + return defaultMaps[DefMap::VariableCategory::Aggregate]; + } ---------------- kparzysz wrote: You didn't use braces in the 3 other similar ifs above. https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Sat May 10 07:05:56 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 07:05:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. (PR #138627) In-Reply-To: Message-ID: <681f5d44.050a0220.164b8f.04a9@mx.google.com> https://github.com/NexMing updated https://github.com/llvm/llvm-project/pull/138627 >From 6dfdeafb1efbac97261a485213b451e24ce16a23 Mon Sep 17 00:00:00 2001 From: yanming Date: Wed, 30 Apr 2025 16:32:14 +0800 Subject: [PATCH] [flang][fir] Add affine optimization pass pipeline. 
--- .../flang/Optimizer/Passes/CommandLineOpts.h | 1 + .../flang/Optimizer/Passes/Pipelines.h | 3 +++ flang/lib/Optimizer/Passes/CMakeLists.txt | 1 + .../lib/Optimizer/Passes/CommandLineOpts.cpp | 1 + flang/lib/Optimizer/Passes/Pipelines.cpp | 20 +++++++++++++++++++ flang/test/Driver/mlir-pass-pipeline.f90 | 14 +++++++++++++ flang/test/Integration/OpenMP/auto-omp.f90 | 10 ++++++++++ 7 files changed, 50 insertions(+) create mode 100644 flang/test/Integration/OpenMP/auto-omp.f90 diff --git a/flang/include/flang/Optimizer/Passes/CommandLineOpts.h b/flang/include/flang/Optimizer/Passes/CommandLineOpts.h index 1cfaf285e75e6..320c561953213 100644 --- a/flang/include/flang/Optimizer/Passes/CommandLineOpts.h +++ b/flang/include/flang/Optimizer/Passes/CommandLineOpts.h @@ -42,6 +42,7 @@ extern llvm::cl::opt disableCfgConversion; extern llvm::cl::opt disableFirAvc; extern llvm::cl::opt disableFirMao; +extern llvm::cl::opt enableAffineOpt; extern llvm::cl::opt disableFirAliasTags; extern llvm::cl::opt useOldAliasTags; diff --git a/flang/include/flang/Optimizer/Passes/Pipelines.h b/flang/include/flang/Optimizer/Passes/Pipelines.h index a3f59ee8dd013..7680987367256 100644 --- a/flang/include/flang/Optimizer/Passes/Pipelines.h +++ b/flang/include/flang/Optimizer/Passes/Pipelines.h @@ -18,8 +18,11 @@ #include "flang/Optimizer/Passes/CommandLineOpts.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Tools/CrossToolHelpers.h" +#include "mlir/Conversion/AffineToStandard/AffineToStandard.h" #include "mlir/Conversion/ReconcileUnrealizedCasts/ReconcileUnrealizedCasts.h" #include "mlir/Conversion/SCFToControlFlow/SCFToControlFlow.h" +#include "mlir/Conversion/SCFToOpenMP/SCFToOpenMP.h" +#include "mlir/Dialect/Affine/Passes.h" #include "mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/Dialect/LLVMIR/LLVMAttrs.h" #include "mlir/Pass/PassManager.h" diff --git a/flang/lib/Optimizer/Passes/CMakeLists.txt b/flang/lib/Optimizer/Passes/CMakeLists.txt index 
1c19a5765aff1..ad6c714c28bec 100644 --- a/flang/lib/Optimizer/Passes/CMakeLists.txt +++ b/flang/lib/Optimizer/Passes/CMakeLists.txt @@ -21,6 +21,7 @@ add_flang_library(flangPasses MLIRPass MLIRReconcileUnrealizedCasts MLIRSCFToControlFlow + MLIRSCFToOpenMP MLIRSupport MLIRTransforms ) diff --git a/flang/lib/Optimizer/Passes/CommandLineOpts.cpp b/flang/lib/Optimizer/Passes/CommandLineOpts.cpp index f95a280883cba..b8ae6ede423e3 100644 --- a/flang/lib/Optimizer/Passes/CommandLineOpts.cpp +++ b/flang/lib/Optimizer/Passes/CommandLineOpts.cpp @@ -55,6 +55,7 @@ cl::opt useOldAliasTags( cl::desc("Use a single TBAA tree for all functions and do not use " "the FIR alias tags pass"), cl::init(false), cl::Hidden); +EnableOption(AffineOpt, "affine-opt", "affine optimization"); /// CodeGen Passes DisableOption(CodeGenRewrite, "codegen-rewrite", "rewrite FIR for codegen"); diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index a3ef473ea39b7..6add597e0dabc 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -209,6 +209,26 @@ void createDefaultFIROptimizerPassPipeline(mlir::PassManager &pm, if (pc.AliasAnalysis && !disableFirAliasTags && !useOldAliasTags) pm.addPass(fir::createAddAliasTags()); + // We can first convert the FIR dialect to the Affine dialect, perform + // optimizations on top of it, and then lower it to the FIR dialect. + // TODO: These optimization passes (like PromoteToAffinePass) are currently + // experimental, so it's important to actively identify and address issues. 
+ if (enableAffineOpt && pc.OptLevel.isOptimizingForSpeed()) { + pm.addPass(fir::createPromoteToAffinePass()); + pm.addPass(mlir::createCSEPass()); + pm.addPass(mlir::affine::createAffineLoopInvariantCodeMotionPass()); + pm.addPass(mlir::affine::createAffineLoopNormalizePass()); + pm.addPass(mlir::affine::createSimplifyAffineStructuresPass()); + pm.addPass(mlir::affine::createAffineParallelize( + mlir::affine::AffineParallelizeOptions{1, false})); + pm.addPass(fir::createAffineDemotionPass()); + pm.addPass(mlir::createLowerAffinePass()); + if (pc.EnableOpenMP) { + pm.addPass(mlir::createConvertSCFToOpenMPPass()); + pm.addPass(mlir::createCanonicalizerPass()); + } + } + addNestedPassToAllTopLevelOperations( pm, fir::createStackReclaim); // convert control flow to CFG form diff --git a/flang/test/Driver/mlir-pass-pipeline.f90 b/flang/test/Driver/mlir-pass-pipeline.f90 index 45370895db397..188a42d231500 100644 --- a/flang/test/Driver/mlir-pass-pipeline.f90 +++ b/flang/test/Driver/mlir-pass-pipeline.f90 @@ -4,6 +4,7 @@ ! -O0 is the default: ! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline %s -O0 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL %s ! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline %s -O2 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL,O2 %s +! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline -mllvm --enable-affine-opt %s -O2 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL,O2,AFFINE %s ! REQUIRES: asserts @@ -105,6 +106,19 @@ ! ALL-NEXT: SimplifyFIROperations ! O2-NEXT: AddAliasTags +! AFFINE-NEXT: 'func.func' Pipeline +! AFFINE-NEXT: AffineDialectPromotion +! AFFINE-NEXT: CSE +! AFFINE-NEXT: (S) 0 num-cse'd - Number of operations CSE'd +! AFFINE-NEXT: (S) 0 num-dce'd - Number of operations DCE'd +! AFFINE-NEXT: 'func.func' Pipeline +! AFFINE-NEXT: AffineLoopInvariantCodeMotion +! 
AFFINE-NEXT: AffineLoopNormalize +! AFFINE-NEXT: SimplifyAffineStructures +! AFFINE-NEXT: AffineParallelize +! AFFINE-NEXT: AffineDialectDemotion +! AFFINE-NEXT: LowerAffinePass + ! ALL-NEXT: Pipeline Collection : ['fir.global', 'func.func', 'omp.declare_reduction', 'omp.private'] ! ALL-NEXT: 'fir.global' Pipeline ! ALL-NEXT: StackReclaim diff --git a/flang/test/Integration/OpenMP/auto-omp.f90 b/flang/test/Integration/OpenMP/auto-omp.f90 new file mode 100644 index 0000000000000..bf7da292552d8 --- /dev/null +++ b/flang/test/Integration/OpenMP/auto-omp.f90 @@ -0,0 +1,10 @@ +! RUN: %flang_fc1 -O1 -mllvm --enable-affine-opt -emit-llvm -fopenmp -o - %s \ +! RUN: | FileCheck %s + +!CHECK-LABEL: define void @foo_(ptr captures(none) %0) {{.*}} { +!CHECK: call void{{.*}}@__kmpc_fork_call{{.*}}@[[OMP_OUTLINED_FN_1:.*]]) + +subroutine foo(a) + integer, dimension(100, 100), intent(out) :: a + a = 1 +end subroutine foo From flang-commits at lists.llvm.org Sat May 10 08:25:02 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 08:25:02 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Support MLIR lowering of linear clause for omp.wsloop (PR #139385) Message-ID: https://github.com/NimishMishra created https://github.com/llvm/llvm-project/pull/139385 This patch adds support for MLIR lowering of linear clause on omp.wsloop (except for linear modifiers). 
>From e4f3cb2553f8ef03a3ad347cf14a187e31064153 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Sat, 10 May 2025 19:34:16 +0530 Subject: [PATCH] [flang][OpenMP] Support MLIR lowering of linear clause for omp.wsloop --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 34 +++++++++++ flang/lib/Lower/OpenMP/ClauseProcessor.h | 1 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 5 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 4 +- flang/test/Lower/OpenMP/wsloop-linear.f90 | 57 +++++++++++++++++++ 5 files changed, 97 insertions(+), 4 deletions(-) create mode 100644 flang/test/Lower/OpenMP/wsloop-linear.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 79b5087e4da68..8ba2f604df80a 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1060,6 +1060,40 @@ bool ClauseProcessor::processIsDevicePtr( }); } +bool ClauseProcessor::processLinear(mlir::omp::LinearClauseOps &result) const { + lower::StatementContext stmtCtx; + return findRepeatableClause< + omp::clause::Linear>([&](const omp::clause::Linear &clause, + const parser::CharBlock &) { + auto &objects = std::get(clause.t); + for (const omp::Object &object : objects) { + semantics::Symbol *sym = object.sym(); + const mlir::Value variable = converter.getSymbolAddress(*sym); + result.linearVars.push_back(variable); + } + if (objects.size()) { + if (auto &mod = + std::get>( + clause.t)) { + mlir::Value operand = + fir::getBase(converter.genExprValue(toEvExpr(*mod), stmtCtx)); + result.linearStepVars.append(objects.size(), operand); + } else if (std::get>( + clause.t)) { + mlir::Location currentLocation = converter.getCurrentLocation(); + TODO(currentLocation, "Linear modifiers not yet implemented"); + } else { + // If nothing is present, add the default step of 1. 
+ fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + mlir::Value operand = firOpBuilder.createIntegerConstant( + currentLocation, firOpBuilder.getI32Type(), 1); + result.linearStepVars.append(objects.size(), operand); + } + } + }); +} + bool ClauseProcessor::processLink( llvm::SmallVectorImpl &result) const { return findRepeatableClause( diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 7857ba3fd0845..0ec41bdd33256 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -122,6 +122,7 @@ class ClauseProcessor { bool processIsDevicePtr( mlir::omp::IsDevicePtrClauseOps &result, llvm::SmallVectorImpl &isDeviceSyms) const; + bool processLinear(mlir::omp::LinearClauseOps &result) const; bool processLink(llvm::SmallVectorImpl &result) const; diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 7eec598645eac..2a1c94407e1c8 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -213,14 +213,15 @@ void DataSharingProcessor::collectSymbolsForPrivatization() { // so, we won't need to explicitely handle block objects (or forget to do // so). for (auto *sym : explicitlyPrivatizedSymbols) - allPrivatizedSymbols.insert(sym); + if (!sym->test(Fortran::semantics::Symbol::Flag::OmpLinear)) + allPrivatizedSymbols.insert(sym); } bool DataSharingProcessor::needBarrier() { // Emit implicit barrier to synchronize threads and avoid data races on // initialization of firstprivate variables and post-update of lastprivate // variables. - // Emit implicit barrier for linear clause. Maybe on somewhere else. + // Emit implicit barrier for linear clause in the OpenMPIRBuilder. 
for (const semantics::Symbol *sym : allPrivatizedSymbols) { if (sym->test(semantics::Symbol::Flag::OmpLastPrivate) && (sym->test(semantics::Symbol::Flag::OmpFirstPrivate) || diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 54560729eb4af..6fa915b4364f9 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1841,13 +1841,13 @@ static void genWsloopClauses( llvm::SmallVectorImpl &reductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processNowait(clauseOps); + cp.processLinear(clauseOps); cp.processOrder(clauseOps); cp.processOrdered(clauseOps); cp.processReduction(loc, clauseOps, reductionSyms); cp.processSchedule(stmtCtx, clauseOps); - cp.processTODO( - loc, llvm::omp::Directive::OMPD_do); + cp.processTODO(loc, llvm::omp::Directive::OMPD_do); } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/wsloop-linear.f90 b/flang/test/Lower/OpenMP/wsloop-linear.f90 new file mode 100644 index 0000000000000..b99677108be2f --- /dev/null +++ b/flang/test/Lower/OpenMP/wsloop-linear.f90 @@ -0,0 +1,57 @@ +! This test checks lowering of OpenMP DO Directive (Worksharing) +! with linear clause + +! 
RUN: %flang_fc1 -fopenmp -emit-hlfir %s -o - 2>&1 | FileCheck %s + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFsimple_linearEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFsimple_linearEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[const:.*]] = arith.constant 1 : i32 +subroutine simple_linear + implicit none + integer :: x, y, i + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_stepEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_stepEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_step + implicit none + integer :: x, y, i + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x:4) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + +!CHECK: %[[A_alloca:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFlinear_exprEa"} +!CHECK: %[[A:.*]]:2 = hlfir.declare %[[A_alloca]] {uniq_name = "_QFlinear_exprEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_exprEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_exprEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_expr + implicit none + integer :: x, y, i, a + !CHECK: %[[LOAD_A:.*]] = fir.load %[[A]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: 
%[[LINEAR_EXPR:.*]] = arith.addi %[[LOAD_A]], %[[const]] : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[LINEAR_EXPR]] : !fir.ref) {{.*}} + !$omp do linear(x:a+4) + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine From flang-commits at lists.llvm.org Sat May 10 08:25:35 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 08:25:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Support MLIR lowering of linear clause for omp.wsloop (PR #139385) In-Reply-To: Message-ID: <681f6fef.170a0220.30d981.d0d0@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: None (NimishMishra)
Changes This patch adds support for MLIR lowering of linear clause on omp.wsloop (except for linear modifiers). --- Full diff: https://github.com/llvm/llvm-project/pull/139385.diff 5 Files Affected: - (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+34) - (modified) flang/lib/Lower/OpenMP/ClauseProcessor.h (+1) - (modified) flang/lib/Lower/OpenMP/DataSharingProcessor.cpp (+3-2) - (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+2-2) - (added) flang/test/Lower/OpenMP/wsloop-linear.f90 (+57) ``````````diff diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 79b5087e4da68..8ba2f604df80a 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1060,6 +1060,40 @@ bool ClauseProcessor::processIsDevicePtr( }); } +bool ClauseProcessor::processLinear(mlir::omp::LinearClauseOps &result) const { + lower::StatementContext stmtCtx; + return findRepeatableClause< + omp::clause::Linear>([&](const omp::clause::Linear &clause, + const parser::CharBlock &) { + auto &objects = std::get(clause.t); + for (const omp::Object &object : objects) { + semantics::Symbol *sym = object.sym(); + const mlir::Value variable = converter.getSymbolAddress(*sym); + result.linearVars.push_back(variable); + } + if (objects.size()) { + if (auto &mod = + std::get>( + clause.t)) { + mlir::Value operand = + fir::getBase(converter.genExprValue(toEvExpr(*mod), stmtCtx)); + result.linearStepVars.append(objects.size(), operand); + } else if (std::get>( + clause.t)) { + mlir::Location currentLocation = converter.getCurrentLocation(); + TODO(currentLocation, "Linear modifiers not yet implemented"); + } else { + // If nothing is present, add the default step of 1. 
+ fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + mlir::Value operand = firOpBuilder.createIntegerConstant( + currentLocation, firOpBuilder.getI32Type(), 1); + result.linearStepVars.append(objects.size(), operand); + } + } + }); +} + bool ClauseProcessor::processLink( llvm::SmallVectorImpl &result) const { return findRepeatableClause( diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 7857ba3fd0845..0ec41bdd33256 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -122,6 +122,7 @@ class ClauseProcessor { bool processIsDevicePtr( mlir::omp::IsDevicePtrClauseOps &result, llvm::SmallVectorImpl &isDeviceSyms) const; + bool processLinear(mlir::omp::LinearClauseOps &result) const; bool processLink(llvm::SmallVectorImpl &result) const; diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 7eec598645eac..2a1c94407e1c8 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -213,14 +213,15 @@ void DataSharingProcessor::collectSymbolsForPrivatization() { // so, we won't need to explicitely handle block objects (or forget to do // so). for (auto *sym : explicitlyPrivatizedSymbols) - allPrivatizedSymbols.insert(sym); + if (!sym->test(Fortran::semantics::Symbol::Flag::OmpLinear)) + allPrivatizedSymbols.insert(sym); } bool DataSharingProcessor::needBarrier() { // Emit implicit barrier to synchronize threads and avoid data races on // initialization of firstprivate variables and post-update of lastprivate // variables. - // Emit implicit barrier for linear clause. Maybe on somewhere else. + // Emit implicit barrier for linear clause in the OpenMPIRBuilder. 
for (const semantics::Symbol *sym : allPrivatizedSymbols) { if (sym->test(semantics::Symbol::Flag::OmpLastPrivate) && (sym->test(semantics::Symbol::Flag::OmpFirstPrivate) || diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 54560729eb4af..6fa915b4364f9 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1841,13 +1841,13 @@ static void genWsloopClauses( llvm::SmallVectorImpl &reductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processNowait(clauseOps); + cp.processLinear(clauseOps); cp.processOrder(clauseOps); cp.processOrdered(clauseOps); cp.processReduction(loc, clauseOps, reductionSyms); cp.processSchedule(stmtCtx, clauseOps); - cp.processTODO( - loc, llvm::omp::Directive::OMPD_do); + cp.processTODO(loc, llvm::omp::Directive::OMPD_do); } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/wsloop-linear.f90 b/flang/test/Lower/OpenMP/wsloop-linear.f90 new file mode 100644 index 0000000000000..b99677108be2f --- /dev/null +++ b/flang/test/Lower/OpenMP/wsloop-linear.f90 @@ -0,0 +1,57 @@ +! This test checks lowering of OpenMP DO Directive (Worksharing) +! with linear clause + +! 
RUN: %flang_fc1 -fopenmp -emit-hlfir %s -o - 2>&1 | FileCheck %s + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFsimple_linearEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFsimple_linearEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[const:.*]] = arith.constant 1 : i32 +subroutine simple_linear + implicit none + integer :: x, y, i + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_stepEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_stepEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_step + implicit none + integer :: x, y, i + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x:4) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + +!CHECK: %[[A_alloca:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFlinear_exprEa"} +!CHECK: %[[A:.*]]:2 = hlfir.declare %[[A_alloca]] {uniq_name = "_QFlinear_exprEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_exprEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_exprEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_expr + implicit none + integer :: x, y, i, a + !CHECK: %[[LOAD_A:.*]] = fir.load %[[A]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: 
%[[LINEAR_EXPR:.*]] = arith.addi %[[LOAD_A]], %[[const]] : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[LINEAR_EXPR]] : !fir.ref) {{.*}} + !$omp do linear(x:a+4) + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine ``````````
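The step handling in `processLinear` above can be sketched outside of MLIR as follows. This is a simplified model under stated assumptions, not the patch's code: `LinearOps` and `addLinearClause` are hypothetical names, strings and integers stand in for `mlir::Value` operands, and the linear-modifier case (which the patch reports as a TODO) is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for mlir::omp::LinearClauseOps:
// variable names and integer steps replace mlir::Value operands.
struct LinearOps {
  std::vector<std::string> linearVars;
  std::vector<int> linearStepVars;
};

// Records every listed object as a linear variable and appends one step
// per object. When the clause carries no explicit step, the default step
// of 1 is used, matching the fallback branch in the patch.
void addLinearClause(LinearOps &result,
                     const std::vector<std::string> &objects,
                     std::optional<int> step) {
  for (const std::string &name : objects)
    result.linearVars.push_back(name);
  if (!objects.empty())
    result.linearStepVars.insert(result.linearStepVars.end(), objects.size(),
                                 step.value_or(1));
}
```

In this model, `linear(x)` corresponds to `addLinearClause(ops, {"x"}, std::nullopt)` and `linear(y, z : 4)` to `addLinearClause(ops, {"y", "z"}, 4)`; the two vectors stay the same length, so each variable is paired with exactly one step.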
https://github.com/llvm/llvm-project/pull/139385 From flang-commits at lists.llvm.org Sat May 10 08:25:34 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 08:25:34 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Support MLIR lowering of linear clause for omp.wsloop (PR #139385) In-Reply-To: Message-ID: <681f6fee.170a0220.1166a9.e5e2@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: None (NimishMishra)
https://github.com/llvm/llvm-project/pull/139385 From flang-commits at lists.llvm.org Sat May 10 08:28:02 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 08:28:02 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) Message-ID: https://github.com/NimishMishra created https://github.com/llvm/llvm-project/pull/139386 This patch adds support for LLVM translation of linear clause on omp.wsloop (except for linear modifiers). >From e4f3cb2553f8ef03a3ad347cf14a187e31064153 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Sat, 10 May 2025 19:34:16 +0530 Subject: [PATCH 1/2] [flang][OpenMP] Support MLIR lowering of linear clause for omp.wsloop --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 34 +++++++++++ flang/lib/Lower/OpenMP/ClauseProcessor.h | 1 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 5 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 4 +- flang/test/Lower/OpenMP/wsloop-linear.f90 | 57 +++++++++++++++++++ 5 files changed, 97 insertions(+), 4 deletions(-) create mode 100644 flang/test/Lower/OpenMP/wsloop-linear.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 79b5087e4da68..8ba2f604df80a 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1060,6 +1060,40 @@ bool ClauseProcessor::processIsDevicePtr( }); } +bool ClauseProcessor::processLinear(mlir::omp::LinearClauseOps &result) const { + lower::StatementContext stmtCtx; + return findRepeatableClause< + omp::clause::Linear>([&](const omp::clause::Linear &clause, + const parser::CharBlock &) { + auto &objects = std::get(clause.t); + for (const omp::Object &object : objects) { + semantics::Symbol *sym = object.sym(); + const mlir::Value variable = converter.getSymbolAddress(*sym); + result.linearVars.push_back(variable); + } + if (objects.size()) { + if (auto &mod = + std::get>( 
+ clause.t)) { + mlir::Value operand = + fir::getBase(converter.genExprValue(toEvExpr(*mod), stmtCtx)); + result.linearStepVars.append(objects.size(), operand); + } else if (std::get>( + clause.t)) { + mlir::Location currentLocation = converter.getCurrentLocation(); + TODO(currentLocation, "Linear modifiers not yet implemented"); + } else { + // If nothing is present, add the default step of 1. + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + mlir::Value operand = firOpBuilder.createIntegerConstant( + currentLocation, firOpBuilder.getI32Type(), 1); + result.linearStepVars.append(objects.size(), operand); + } + } + }); +} + bool ClauseProcessor::processLink( llvm::SmallVectorImpl &result) const { return findRepeatableClause( diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 7857ba3fd0845..0ec41bdd33256 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -122,6 +122,7 @@ class ClauseProcessor { bool processIsDevicePtr( mlir::omp::IsDevicePtrClauseOps &result, llvm::SmallVectorImpl &isDeviceSyms) const; + bool processLinear(mlir::omp::LinearClauseOps &result) const; bool processLink(llvm::SmallVectorImpl &result) const; diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 7eec598645eac..2a1c94407e1c8 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -213,14 +213,15 @@ void DataSharingProcessor::collectSymbolsForPrivatization() { // so, we won't need to explicitely handle block objects (or forget to do // so). 
for (auto *sym : explicitlyPrivatizedSymbols) - allPrivatizedSymbols.insert(sym); + if (!sym->test(Fortran::semantics::Symbol::Flag::OmpLinear)) + allPrivatizedSymbols.insert(sym); } bool DataSharingProcessor::needBarrier() { // Emit implicit barrier to synchronize threads and avoid data races on // initialization of firstprivate variables and post-update of lastprivate // variables. - // Emit implicit barrier for linear clause. Maybe on somewhere else. + // Emit implicit barrier for linear clause in the OpenMPIRBuilder. for (const semantics::Symbol *sym : allPrivatizedSymbols) { if (sym->test(semantics::Symbol::Flag::OmpLastPrivate) && (sym->test(semantics::Symbol::Flag::OmpFirstPrivate) || diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 54560729eb4af..6fa915b4364f9 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1841,13 +1841,13 @@ static void genWsloopClauses( llvm::SmallVectorImpl &reductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processNowait(clauseOps); + cp.processLinear(clauseOps); cp.processOrder(clauseOps); cp.processOrdered(clauseOps); cp.processReduction(loc, clauseOps, reductionSyms); cp.processSchedule(stmtCtx, clauseOps); - cp.processTODO( - loc, llvm::omp::Directive::OMPD_do); + cp.processTODO(loc, llvm::omp::Directive::OMPD_do); } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/wsloop-linear.f90 b/flang/test/Lower/OpenMP/wsloop-linear.f90 new file mode 100644 index 0000000000000..b99677108be2f --- /dev/null +++ b/flang/test/Lower/OpenMP/wsloop-linear.f90 @@ -0,0 +1,57 @@ +! This test checks lowering of OpenMP DO Directive (Worksharing) +! with linear clause + +! 
RUN: %flang_fc1 -fopenmp -emit-hlfir %s -o - 2>&1 | FileCheck %s + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFsimple_linearEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFsimple_linearEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[const:.*]] = arith.constant 1 : i32 +subroutine simple_linear + implicit none + integer :: x, y, i + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_stepEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_stepEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_step + implicit none + integer :: x, y, i + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x:4) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + +!CHECK: %[[A_alloca:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFlinear_exprEa"} +!CHECK: %[[A:.*]]:2 = hlfir.declare %[[A_alloca]] {uniq_name = "_QFlinear_exprEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_exprEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_exprEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_expr + implicit none + integer :: x, y, i, a + !CHECK: %[[LOAD_A:.*]] = fir.load %[[A]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: 
%[[LINEAR_EXPR:.*]] = arith.addi %[[LOAD_A]], %[[const]] : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[LINEAR_EXPR]] : !fir.ref) {{.*}} + !$omp do linear(x:a+4) + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine

From 616d6377bbb94bdc742023d71c63ba89df293e3c Mon Sep 17 00:00:00 2001
From: Nimish Mishra
Date: Sat, 10 May 2025 20:51:39 +0530
Subject: [PATCH 2/2] [mlir][llvm][OpenMP] Support translation for linear clause in omp.wsloop
---
 .../llvm/Frontend/OpenMP/OMPIRBuilder.h | 15 ++
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 3 +
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 185 +++++++++++++++++-
 mlir/test/Target/LLVMIR/openmp-llvm.mlir | 88 +++++++++
 mlir/test/Target/LLVMIR/openmp-todo.mlir | 13 --
 5 files changed, 289 insertions(+), 15 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h index ffc0fd0a0bdac..68f15d5c7d41e 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h +++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h @@ -3580,6 +3580,9 @@ class CanonicalLoopInfo { BasicBlock *Latch = nullptr; BasicBlock *Exit = nullptr; + // Hold the MLIR value for the `lastiter` of the canonical loop. + Value *LastIter = nullptr; + /// Add the control blocks of this loop to \p BBs. /// /// This does not include any block from the body, including the one returned @@ -3612,6 +3615,18 @@ class CanonicalLoopInfo { void mapIndVar(llvm::function_ref Updater); public: + /// Sets the last iteration variable for this loop. + void setLastIter(Value *IterVar) { LastIter = std::move(IterVar); } + + /// Returns the last iteration variable for this loop. + /// Certain use-cases (like translation of linear clause) may access + /// this variable even after a loop transformation. Hence, do not guard + /// this getter function by `isValid`.
It is the responsibility of the + /// callee to ensure this functionality is not invoked by a non-outlined + /// CanonicalLoopInfo object (in which case, `setLastIter` will never be + /// invoked and `LastIter` will be by default `nullptr`). + Value *getLastIter() { return LastIter; } + /// Returns whether this object currently represents the IR of a loop. If /// returning false, it may have been consumed by a loop transformation or not /// been intialized. Do not use in this case; diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index a1268ca76b2d5..991cdb7b6b416 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -4254,6 +4254,7 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::applyStaticWorkshareLoop( Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); Value *PUpperBound = Builder.CreateAlloca(IVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(IVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // At the end of the preheader, prepare for calling the "init" function by // storing the current loop bounds into the allocated space. A canonical loop @@ -4361,6 +4362,7 @@ OpenMPIRBuilder::applyStaticChunkedWorkshareLoop(DebugLoc DL, Value *PUpperBound = Builder.CreateAlloca(InternalIVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(InternalIVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // Set up the source location value for the OpenMP runtime. 
Builder.restoreIP(CLI->getPreheaderIP()); @@ -4844,6 +4846,7 @@ OpenMPIRBuilder::applyDynamicWorkshareLoop(DebugLoc DL, CanonicalLoopInfo *CLI, Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); Value *PUpperBound = Builder.CreateAlloca(IVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(IVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // At the end of the preheader, prepare for calling the "init" function by // storing the current loop bounds into the allocated space. A canonical loop diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 9f7b5605556e6..571505ab9b9aa 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. 
+ */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + linearSteps.push_back(moduleTranslation.lookupValue(linearStep)); + } + + // Emit IR for initialization of linear variables + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + initLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::BasicBlock *loopPreHeader) { + builder.SetInsertPoint(loopPreHeader->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarLoad = builder.CreateLoad( + linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]); + builder.CreateStore(linearVarLoad, linearPreconditionVars[index]); + } + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + return afterBarrierIP; + } + + // Emit IR for 
updating Linear variables + void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody, + llvm::Value *loopInductionVar) { + builder.SetInsertPoint(loopBody->getTerminator()); + for (size_t index = 0; index < linearPreconditionVars.size(); index++) { + // Emit increments for linear vars + llvm::LoadInst *linearVarStart = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + + linearPreconditionVars[index]); + auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]); + auto addInst = builder.CreateAdd(linearVarStart, mulInst); + builder.CreateStore(addInst, linearLoopBodyTemps[index]); + } + } + + // Linear variable finalization is conditional on the last logical iteration. + // Create BB splits to manage the same. + void outlineLinearFinalizationBB(llvm::IRBuilderBase &builder, + llvm::BasicBlock *loopExit) { + linearFinalizationBB = loopExit->splitBasicBlock( + loopExit->getTerminator(), "omp_loop.linear_finalization"); + linearExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_exit"); + linearLastIterExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_lastiter_exit"); + } + + // Finalize the linear vars + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + finalizeLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::Value *lastIter) { + // Emit condition to check whether last logical iteration is being executed + builder.SetInsertPoint(linearFinalizationBB->getTerminator()); + llvm::Value *loopLastIterLoad = builder.CreateLoad( + llvm::Type::getInt32Ty(builder.getContext()), lastIter); + llvm::Value *isLast = + builder.CreateCmp(llvm::CmpInst::ICMP_NE, loopLastIterLoad, + llvm::ConstantInt::get( + llvm::Type::getInt32Ty(builder.getContext()), 0)); + // Store the linear variable values to original variables. 
+ builder.SetInsertPoint(linearLastIterExitBB->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarTemp = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + linearLoopBodyTemps[index]); + builder.CreateStore(linearVarTemp, linearOrigVars[index]); + } + + // Create conditional branch such that the linear variable + // values are stored to original variables only at the + // last logical iteration + builder.SetInsertPoint(linearFinalizationBB->getTerminator()); + builder.CreateCondBr(isLast, linearLastIterExitBB, linearExitBB); + linearFinalizationBB->getTerminator()->eraseFromParent(); + // Emit barrier + builder.SetInsertPoint(linearExitBB->getTerminator()); + return moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + } + + // Rewrite all uses of the original variable in `BBName` + // with the linear variable in-place + void rewriteInPlace(llvm::IRBuilderBase &builder, std::string BBName, + size_t varIndex) { + llvm::SmallVector users; + for (llvm::User *user : linearOrigVal[varIndex]->users()) + users.push_back(user); + for (auto *user : users) { + if (auto *userInst = dyn_cast(user)) { + if (userInst->getParent()->getName().str() == BBName) + user->replaceUsesOfWith(linearOrigVal[varIndex], + linearLoopBodyTemps[varIndex]); + } + } + } +}; + } // namespace /// Looks up from the operation from and returns the PrivateClauseOp with @@ -292,7 +432,6 @@ static LogicalResult checkImplementationStatus(Operation &op) { }) .Case([&](omp::WsloopOp op) { checkAllocate(op, result); - checkLinear(op, result); checkOrder(op, result); checkReduction(op, result); }) @@ -2423,15 +2562,40 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, llvm::omp::Directive::OMPD_for); llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder); + + // Initialize linear variables and linear step + LinearClauseProcessor linearClauseProcessor; + if 
(wsloopOp.getLinearVars().size()) { + for (mlir::Value linearVar : wsloopOp.getLinearVars()) + linearClauseProcessor.createLinearVar(builder, moduleTranslation, + linearVar); + for (mlir::Value linearStep : wsloopOp.getLinearStepVars()) + linearClauseProcessor.initLinearStep(moduleTranslation, linearStep); + } + llvm::Expected regionBlock = convertOmpOpRegions( wsloopOp.getRegion(), "omp.wsloop.region", builder, moduleTranslation); if (failed(handleError(regionBlock, opInst))) return failure(); - builder.SetInsertPoint(*regionBlock, (*regionBlock)->begin()); llvm::CanonicalLoopInfo *loopInfo = findCurrentLoopInfo(moduleTranslation); + // Emit Initialization and Update IR for linear variables + if (wsloopOp.getLinearVars().size()) { + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + linearClauseProcessor.initLinearVar(builder, moduleTranslation, + loopInfo->getPreheader()); + if (failed(handleError(afterBarrierIP, *loopOp))) + return failure(); + builder.restoreIP(*afterBarrierIP); + linearClauseProcessor.updateLinearVar(builder, loopInfo->getBody(), + loopInfo->getIndVar()); + linearClauseProcessor.outlineLinearFinalizationBB(builder, + loopInfo->getExit()); + } + + builder.SetInsertPoint(*regionBlock, (*regionBlock)->begin()); llvm::OpenMPIRBuilder::InsertPointOrErrorTy wsloopIP = ompBuilder->applyWorkshareLoop( ompLoc.DL, loopInfo, allocaIP, loopNeedsBarrier, @@ -2443,6 +2607,23 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, if (failed(handleError(wsloopIP, opInst))) return failure(); + // Emit finalization and in-place rewrites for linear vars. 
+ if (wsloopOp.getLinearVars().size()) { + llvm::OpenMPIRBuilder::InsertPointTy oldIP = builder.saveIP(); + assert(loopInfo->getLastIter() && + "`lastiter` in CanonicalLoopInfo is nullptr"); + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + linearClauseProcessor.finalizeLinearVar(builder, moduleTranslation, + loopInfo->getLastIter()); + if (failed(handleError(afterBarrierIP, *loopOp))) + return failure(); + builder.restoreIP(*afterBarrierIP); + for (size_t index = 0; index < wsloopOp.getLinearVars().size(); index++) + linearClauseProcessor.rewriteInPlace(builder, "omp.loop_nest.region", + index); + builder.restoreIP(oldIP); + } + // Set the correct branch target for task cancellation popCancelFinalizationCB(cancelTerminators, *ompBuilder, wsloopIP.get()); diff --git a/mlir/test/Target/LLVMIR/openmp-llvm.mlir b/mlir/test/Target/LLVMIR/openmp-llvm.mlir index 32f0ba5b105ff..9ad9e93301239 100644 --- a/mlir/test/Target/LLVMIR/openmp-llvm.mlir +++ b/mlir/test/Target/LLVMIR/openmp-llvm.mlir @@ -358,6 +358,94 @@ llvm.func @wsloop_simple(%arg0: !llvm.ptr) { // ----- +// CHECK-LABEL: wsloop_linear + +// CHECK: {{.*}} = alloca i32, i64 1, align 4 +// CHECK: %[[Y:.*]] = alloca i32, i64 1, align 4 +// CHECK: %[[X:.*]] = alloca i32, i64 1, align 4 + +// CHECK: entry: +// CHECK: %[[LINEAR_VAR:.*]] = alloca i32, align 4 +// CHECK: %[[LINEAR_RESULT:.*]] = alloca i32, align 4 +// CHECK: br label %omp_loop.preheader + +// CHECK: omp_loop.preheader: +// CHECK: %[[LOAD:.*]] = load i32, ptr %[[X]], align 4 +// CHECK: store i32 %[[LOAD]], ptr %[[LINEAR_VAR]], align 4 +// CHECK: %omp_global_thread_num = call i32 @__kmpc_global_thread_num(ptr @2) +// CHECK: call void @__kmpc_barrier(ptr @1, i32 %omp_global_thread_num) + +// CHECK: omp_loop.body: +// CHECK: %[[LOOP_IV:.*]] = add i32 %omp_loop.iv, {{.*}} +// CHECK: %[[LINEAR_LOAD:.*]] = load i32, ptr %[[LINEAR_VAR]], align 4 +// CHECK: %[[MUL:.*]] = mul i32 %[[LOOP_IV]], 1 +// CHECK: %[[ADD:.*]] = add i32 %[[LINEAR_LOAD]], 
%[[MUL]] +// CHECK: store i32 %[[ADD]], ptr %[[LINEAR_RESULT]], align 4 +// CHECK: br label %omp.loop_nest.region + +// CHECK: omp.loop_nest.region: +// CHECK: %[[LINEAR_LOAD:.*]] = load i32, ptr %[[LINEAR_RESULT]], align 4 +// CHECK: %[[ADD:.*]] = add i32 %[[LINEAR_LOAD]], 2 +// CHECK: store i32 %[[ADD]], ptr %[[Y]], align 4 + +// CHECK: omp_loop.exit: +// CHECK: call void @__kmpc_for_static_fini(ptr @2, i32 %omp_global_thread_num4) +// CHECK: %omp_global_thread_num5 = call i32 @__kmpc_global_thread_num(ptr @2) +// CHECK: call void @__kmpc_barrier(ptr @3, i32 %omp_global_thread_num5) +// CHECK: br label %omp_loop.linear_finalization + +// CHECK: omp_loop.linear_finalization: +// CHECK: %[[LAST_ITER:.*]] = load i32, ptr %p.lastiter, align 4 +// CHECK: %[[CMP:.*]] = icmp ne i32 %[[LAST_ITER]], 0 +// CHECK: br i1 %[[CMP]], label %omp_loop.linear_lastiter_exit, label %omp_loop.linear_exit + +// CHECK: omp_loop.linear_lastiter_exit: +// CHECK: %[[LINEAR_RESULT_LOAD:.*]] = load i32, ptr %[[LINEAR_RESULT]], align 4 +// CHECK: store i32 %[[LINEAR_RESULT_LOAD]], ptr %[[X]], align 4 +// CHECK: br label %omp_loop.linear_exit + +// CHECK: omp_loop.linear_exit: +// CHECK: %omp_global_thread_num6 = call i32 @__kmpc_global_thread_num(ptr @2) +// CHECK: call void @__kmpc_barrier(ptr @1, i32 %omp_global_thread_num6) +// CHECK: br label %omp_loop.after + +llvm.func @wsloop_linear() { + %0 = llvm.mlir.constant(1 : i64) : i64 + %1 = llvm.alloca %0 x i32 {bindc_name = "i", pinned} : (i64) -> !llvm.ptr + %2 = llvm.mlir.constant(1 : i64) : i64 + %3 = llvm.alloca %2 x i32 {bindc_name = "y"} : (i64) -> !llvm.ptr + %4 = llvm.mlir.constant(1 : i64) : i64 + %5 = llvm.alloca %4 x i32 {bindc_name = "x"} : (i64) -> !llvm.ptr + %6 = llvm.mlir.constant(1 : i64) : i64 + %7 = llvm.alloca %6 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr + %8 = llvm.mlir.constant(2 : i32) : i32 + %9 = llvm.mlir.constant(10 : i32) : i32 + %10 = llvm.mlir.constant(1 : i32) : i32 + %11 = llvm.mlir.constant(1 : i64) : 
i64 + %12 = llvm.mlir.constant(1 : i64) : i64 + %13 = llvm.mlir.constant(1 : i64) : i64 + %14 = llvm.mlir.constant(1 : i64) : i64 + omp.wsloop linear(%5 = %10 : !llvm.ptr) { + omp.loop_nest (%arg0) : i32 = (%10) to (%9) inclusive step (%10) { + llvm.store %arg0, %1 : i32, !llvm.ptr + %15 = llvm.load %5 : !llvm.ptr -> i32 + %16 = llvm.add %15, %8 : i32 + llvm.store %16, %3 : i32, !llvm.ptr + %17 = llvm.add %arg0, %10 : i32 + %18 = llvm.icmp "sgt" %17, %9 : i32 + llvm.cond_br %18, ^bb1, ^bb2 + ^bb1: // pred: ^bb0 + llvm.store %17, %1 : i32, !llvm.ptr + llvm.br ^bb2 + ^bb2: // 2 preds: ^bb0, ^bb1 + omp.yield + } + } + llvm.return +} + +// ----- + // CHECK-LABEL: @wsloop_inclusive_1 llvm.func @wsloop_inclusive_1(%arg0: !llvm.ptr) { %0 = llvm.mlir.constant(42 : index) : i64 diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir index 9a83b46efddca..98fccb1a80f67 100644 --- a/mlir/test/Target/LLVMIR/openmp-todo.mlir +++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir @@ -511,19 +511,6 @@ llvm.func @wsloop_allocate(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) { // ----- -llvm.func @wsloop_linear(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) { - // expected-error at below {{not yet implemented: Unhandled clause linear in omp.wsloop operation}} - // expected-error at below {{LLVM Translation failed for operation: omp.wsloop}} - omp.wsloop linear(%x = %step : !llvm.ptr) { - omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) { - omp.yield - } - } - llvm.return -} - -// ----- - llvm.func @wsloop_order(%lb : i32, %ub : i32, %step : i32) { // expected-error at below {{not yet implemented: Unhandled clause order in omp.wsloop operation}} // expected-error at below {{LLVM Translation failed for operation: omp.wsloop}} From flang-commits at lists.llvm.org Sat May 10 08:28:22 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 08:28:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] 
Support MLIR lowering of linear clause for omp.wsloop (PR #139385)
In-Reply-To:
Message-ID: <681f7096.050a0220.164b8f.16f3@mx.google.com>

NimishMishra wrote:

Current TODOs in linear clause following this patch:

- Add support for translation to LLVM IR (support added in PR https://github.com/llvm/llvm-project/pull/139386)
- Extend support to omp.simd (WIP)
- Extend support for linear modifiers (WIP)

https://github.com/llvm/llvm-project/pull/139385

From flang-commits at lists.llvm.org Sat May 10 08:28:31 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Sat, 10 May 2025 08:28:31 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386)
In-Reply-To:
Message-ID: <681f709f.170a0220.3079f9.d372@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-openmp

Author: None (NimishMishra)
Changes

This patch adds support for LLVM translation of linear clause on omp.wsloop (except for linear modifiers).

---
Patch is 25.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139386.diff

10 Files Affected:

- (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+34)
- (modified) flang/lib/Lower/OpenMP/ClauseProcessor.h (+1)
- (modified) flang/lib/Lower/OpenMP/DataSharingProcessor.cpp (+3-2)
- (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+2-2)
- (added) flang/test/Lower/OpenMP/wsloop-linear.f90 (+57)
- (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+15)
- (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+3)
- (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+183-2)
- (modified) mlir/test/Target/LLVMIR/openmp-llvm.mlir (+88)
- (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (-13)

``````````diff
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 79b5087e4da68..8ba2f604df80a 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1060,6 +1060,40 @@ bool ClauseProcessor::processIsDevicePtr( }); } +bool ClauseProcessor::processLinear(mlir::omp::LinearClauseOps &result) const { + lower::StatementContext stmtCtx; + return findRepeatableClause< + omp::clause::Linear>([&](const omp::clause::Linear &clause, + const parser::CharBlock &) { + auto &objects = std::get(clause.t); + for (const omp::Object &object : objects) { + semantics::Symbol *sym = object.sym(); + const mlir::Value variable = converter.getSymbolAddress(*sym); + result.linearVars.push_back(variable); + } + if (objects.size()) { + if (auto &mod = + std::get>( + clause.t)) { + mlir::Value operand = + fir::getBase(converter.genExprValue(toEvExpr(*mod), stmtCtx)); + result.linearStepVars.append(objects.size(), operand); + } else if (std::get>( + clause.t)) { + mlir::Location currentLocation =
converter.getCurrentLocation(); + TODO(currentLocation, "Linear modifiers not yet implemented"); + } else { + // If nothing is present, add the default step of 1. + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + mlir::Value operand = firOpBuilder.createIntegerConstant( + currentLocation, firOpBuilder.getI32Type(), 1); + result.linearStepVars.append(objects.size(), operand); + } + } + }); +} + bool ClauseProcessor::processLink( llvm::SmallVectorImpl &result) const { return findRepeatableClause( diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 7857ba3fd0845..0ec41bdd33256 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -122,6 +122,7 @@ class ClauseProcessor { bool processIsDevicePtr( mlir::omp::IsDevicePtrClauseOps &result, llvm::SmallVectorImpl &isDeviceSyms) const; + bool processLinear(mlir::omp::LinearClauseOps &result) const; bool processLink(llvm::SmallVectorImpl &result) const; diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 7eec598645eac..2a1c94407e1c8 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -213,14 +213,15 @@ void DataSharingProcessor::collectSymbolsForPrivatization() { // so, we won't need to explicitely handle block objects (or forget to do // so). for (auto *sym : explicitlyPrivatizedSymbols) - allPrivatizedSymbols.insert(sym); + if (!sym->test(Fortran::semantics::Symbol::Flag::OmpLinear)) + allPrivatizedSymbols.insert(sym); } bool DataSharingProcessor::needBarrier() { // Emit implicit barrier to synchronize threads and avoid data races on // initialization of firstprivate variables and post-update of lastprivate // variables. - // Emit implicit barrier for linear clause. Maybe on somewhere else. 
+ // Emit implicit barrier for linear clause in the OpenMPIRBuilder. for (const semantics::Symbol *sym : allPrivatizedSymbols) { if (sym->test(semantics::Symbol::Flag::OmpLastPrivate) && (sym->test(semantics::Symbol::Flag::OmpFirstPrivate) || diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 54560729eb4af..6fa915b4364f9 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1841,13 +1841,13 @@ static void genWsloopClauses( llvm::SmallVectorImpl &reductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processNowait(clauseOps); + cp.processLinear(clauseOps); cp.processOrder(clauseOps); cp.processOrdered(clauseOps); cp.processReduction(loc, clauseOps, reductionSyms); cp.processSchedule(stmtCtx, clauseOps); - cp.processTODO( - loc, llvm::omp::Directive::OMPD_do); + cp.processTODO(loc, llvm::omp::Directive::OMPD_do); } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/wsloop-linear.f90 b/flang/test/Lower/OpenMP/wsloop-linear.f90 new file mode 100644 index 0000000000000..b99677108be2f --- /dev/null +++ b/flang/test/Lower/OpenMP/wsloop-linear.f90 @@ -0,0 +1,57 @@ +! This test checks lowering of OpenMP DO Directive (Worksharing) +! with linear clause + +! 
RUN: %flang_fc1 -fopenmp -emit-hlfir %s -o - 2>&1 | FileCheck %s + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFsimple_linearEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFsimple_linearEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[const:.*]] = arith.constant 1 : i32 +subroutine simple_linear + implicit none + integer :: x, y, i + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_stepEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_stepEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_step + implicit none + integer :: x, y, i + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x:4) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + +!CHECK: %[[A_alloca:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFlinear_exprEa"} +!CHECK: %[[A:.*]]:2 = hlfir.declare %[[A_alloca]] {uniq_name = "_QFlinear_exprEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_exprEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_exprEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_expr + implicit none + integer :: x, y, i, a + !CHECK: %[[LOAD_A:.*]] = fir.load %[[A]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: 
%[[LINEAR_EXPR:.*]] = arith.addi %[[LOAD_A]], %[[const]] : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[LINEAR_EXPR]] : !fir.ref) {{.*}} + !$omp do linear(x:a+4) + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h index ffc0fd0a0bdac..68f15d5c7d41e 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h +++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h @@ -3580,6 +3580,9 @@ class CanonicalLoopInfo { BasicBlock *Latch = nullptr; BasicBlock *Exit = nullptr; + // Hold the MLIR value for the `lastiter` of the canonical loop. + Value *LastIter = nullptr; + /// Add the control blocks of this loop to \p BBs. /// /// This does not include any block from the body, including the one returned @@ -3612,6 +3615,18 @@ class CanonicalLoopInfo { void mapIndVar(llvm::function_ref Updater); public: + /// Sets the last iteration variable for this loop. + void setLastIter(Value *IterVar) { LastIter = std::move(IterVar); } + + /// Returns the last iteration variable for this loop. + /// Certain use-cases (like translation of linear clause) may access + /// this variable even after a loop transformation. Hence, do not guard + /// this getter function by `isValid`. It is the responsibility of the + /// callee to ensure this functionality is not invoked by a non-outlined + /// CanonicalLoopInfo object (in which case, `setLastIter` will never be + /// invoked and `LastIter` will be by default `nullptr`). + Value *getLastIter() { return LastIter; } + /// Returns whether this object currently represents the IR of a loop. If /// returning false, it may have been consumed by a loop transformation or not /// been intialized. 
Do not use in this case; diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index a1268ca76b2d5..991cdb7b6b416 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -4254,6 +4254,7 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::applyStaticWorkshareLoop( Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); Value *PUpperBound = Builder.CreateAlloca(IVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(IVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // At the end of the preheader, prepare for calling the "init" function by // storing the current loop bounds into the allocated space. A canonical loop @@ -4361,6 +4362,7 @@ OpenMPIRBuilder::applyStaticChunkedWorkshareLoop(DebugLoc DL, Value *PUpperBound = Builder.CreateAlloca(InternalIVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(InternalIVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // Set up the source location value for the OpenMP runtime. Builder.restoreIP(CLI->getPreheaderIP()); @@ -4844,6 +4846,7 @@ OpenMPIRBuilder::applyDynamicWorkshareLoop(DebugLoc DL, CanonicalLoopInfo *CLI, Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); Value *PUpperBound = Builder.CreateAlloca(IVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(IVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // At the end of the preheader, prepare for calling the "init" function by // storing the current loop bounds into the allocated space. 
A canonical loop diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 9f7b5605556e6..571505ab9b9aa 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. + */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + linearSteps.push_back(moduleTranslation.lookupValue(linearStep)); + } + + // 
Emit IR for initialization of linear variables + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + initLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::BasicBlock *loopPreHeader) { + builder.SetInsertPoint(loopPreHeader->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarLoad = builder.CreateLoad( + linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]); + builder.CreateStore(linearVarLoad, linearPreconditionVars[index]); + } + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + return afterBarrierIP; + } + + // Emit IR for updating Linear variables + void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody, + llvm::Value *loopInductionVar) { + builder.SetInsertPoint(loopBody->getTerminator()); + for (size_t index = 0; index < linearPreconditionVars.size(); index++) { + // Emit increments for linear vars + llvm::LoadInst *linearVarStart = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + + linearPreconditionVars[index]); + auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]); + auto addInst = builder.CreateAdd(linearVarStart, mulInst); + builder.CreateStore(addInst, linearLoopBodyTemps[index]); + } + } + + // Linear variable finalization is conditional on the last logical iteration. + // Create BB splits to manage the same. 
+ void outlineLinearFinalizationBB(llvm::IRBuilderBase &builder, + llvm::BasicBlock *loopExit) { + linearFinalizationBB = loopExit->splitBasicBlock( + loopExit->getTerminator(), "omp_loop.linear_finalization"); + linearExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_exit"); + linearLastIterExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_lastiter_exit"); + } + + // Finalize the linear vars + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + finalizeLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::Value *lastIter) { + // Emit condition to check whether last logical iteration is being executed + builder.SetInsertPoint(linearFinalizationBB->getTerminator()); + llvm::Value *loopLastIterLoad = builder.CreateLoad( + llvm::Type::getInt32Ty(builder.getContext()), lastIter); + llvm::Value *isLast = + builder.CreateCmp(llvm::CmpInst::ICMP_NE, loopLastIterLoad, + llvm::ConstantInt::get( + llvm::Type::getInt32Ty(builder.getContext()), 0)); + // Store the linear variable values to original variables. 
+ builder.SetInsertPoint(linearLastIterExitBB->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarTemp = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + linearLoopBodyTemps[index]); + builder.CreateStore(linearVarTemp, linearOrigVars[index]); + } + + // Create conditional branch such that the linear variable + // values are stored to original variables only at the + // last logical iteration + builder.SetInsertPoint(linearFinalizationBB->getTerminator()); + builder.CreateCondBr(isLast, linearLastIterExitBB, linearExitBB); + linearFinalizationBB->getTerminator()->eraseFromParent(); + // Emit barrier + builder.SetInsertPoint(linearExitBB->getTerminator()); + return moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + } + + // Rewrite all uses of the original variable in `BBName` + // with the linear variable in-place + void rewriteInPlace(llvm::IRBuilderBase &builder, std::string BBName, + size_t varIndex) { + llvm::SmallVector users; + for (llvm::User *user : linearOrigVal[varIndex]->users()) + users.push_back(user); + for (auto *user : users) { + if (auto *userInst = dyn_cast(user)) { + if (userInst->getParent()->getName().str() == BBName) + user->replaceUsesOfWith(linearOrigVal[varIndex], + linearLoopBodyTemps[varIndex]); + } + } + } +}; + } // namespace /// Looks up from the operation from and returns the PrivateClauseOp with @@ -292,7 +432,6 @@ static LogicalResult checkImplementationStatus(Operation &op) { }) .Case([&](omp::WsloopOp op) { checkAllocate(op, result); - checkLinear(op, result); checkOrder(op, result); checkReduction(op, result); }) @@ -2423,15 +2562,40 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, llvm::omp::Directive::OMPD_for); llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder); + + // Initialize linear variables and linear step + LinearClauseProcessor linearClauseProcessor; + if 
(wsloopOp.getLinearVars().size()) { + for (mlir::Value linearVar : wsloopOp.getLinearVars()) + linearClauseProcessor.createLinearVar(builder, moduleTranslation, + linearVar); + for (mlir::Value linearStep : wsloopOp.getLinearStepVars()) + linearClauseProcessor.initLinearStep(moduleTranslation, linearStep); + } + llvm::Expected regionBlock = convertOmpOpRegions( wsloopOp.getRegion(), "omp.wsloop.region", builder, moduleTranslation); if (failed(handleError(regionBlock, opInst))) return failure(); - builder.SetInsertPoint(*regionBlock, (*regionBlock)->begin()); llvm::CanonicalLoopInfo *loopInfo = findCurrentLoopInfo(moduleTranslation); + // Emit Initialization and Update IR for linear variables + if (wsloopOp.getLinearVars().size()) { + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + linearClauseProcessor.initLinearVar(builder, moduleTranslation, + loopInfo->getPreheader()); + if (failed(handleError(afterBarrierIP, *loopOp))) + return failure(); + builder.restoreIP(*afterBarrierIP); + linearClauseProcessor.updateLinearVar(builder, loopInfo->getBody(), + loopInfo->getIndVar()); + linearClauseProcessor.outlineLinearFinalizationBB(builder, + loopInfo->getExit()); + } + + builder.SetInsertPoint(*regionBlock, (*regionBlock)->begin()); llvm::OpenMPIRBuilder::InsertPointOrErrorTy wsloopIP = ompBuilder->applyWorkshareLoop( ompLoc.DL, loopInfo, allocaIP, loopNeedsBarrier, @@ -2443,6 +2607,23 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, if (failed(handleError(wsloopIP, opInst))) return failure(); + // Emit finalization and in-place rewrites for linear vars. + if (wsloopOp.getLinearVars().size()) { + llvm::OpenMPIRBuilder::InsertPointTy oldIP = builder.saveIP(); + assert(loopInfo->getLastIter() && + "`lastiter` in CanonicalLoopInfo is nullptr"... [truncated] ``````````
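The stages that `LinearClauseProcessor` emits in the diff above (a snapshot of the original variable in the preheader, a per-iteration update of `start + iv * step`, and a write-back guarded by the `lastiter` flag) can be modeled in plain C++. This is only a sketch under stated assumptions, not the patch's implementation: the names `linearValueAt` and `runLinearLoop` are hypothetical, `iv` is assumed to be the 0-based logical iteration number, the worker threads are simulated sequentially with a simple static partition, and the barriers are not modeled. The write-back mirrors what the diff stores, namely the temporary computed in the last executed iteration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of the per-iteration update: the value a linear
// variable takes in logical iteration `iv` (assumed 0-based).
inline int32_t linearValueAt(int32_t start, int32_t step, int32_t iv) {
  return start + iv * step;
}

// Hypothetical model of a worksharing loop with one linear variable.
// Threads are simulated one after another; each gets a contiguous chunk.
// Returns the value of the original variable after the loop.
int32_t runLinearLoop(int32_t orig, int32_t step, int32_t tripCount,
                      int32_t numThreads) {
  assert(numThreads > 0 && tripCount >= 0);

  // Initialization: snapshot the original value before any iteration runs
  // (the diff does this in the loop preheader, followed by a barrier).
  const int32_t start = orig;

  const int32_t chunk = (tripCount + numThreads - 1) / numThreads;
  for (int32_t tid = 0; tid < numThreads; ++tid) {
    const int32_t lo = tid * chunk;
    const int32_t hi = std::min(lo + chunk, tripCount);

    int32_t temp = start;  // per-thread loop-body temporary
    bool lastIter = false; // stands in for the runtime's `lastiter` flag

    for (int32_t iv = lo; iv < hi; ++iv) {
      // Update: recompute the linear value from the snapshot each iteration.
      temp = linearValueAt(start, step, iv);
      lastIter = (iv == tripCount - 1);
    }

    // Finalization: only the thread that executed the last logical
    // iteration stores its temporary back to the original variable.
    if (lastIter)
      orig = temp;
  }
  return orig;
}
```

With these assumptions, `runLinearLoop(0, 4, 10, 3)` returns 36, because the thread that runs iteration 9 writes back `0 + 9 * 4`. A loop with zero iterations leaves the original value untouched, since no thread sees the last iteration.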
https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Sat May 10 08:28:31 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 08:28:31 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <681f709f.170a0220.3bc26a.e8f2@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: None (NimishMishra)
Changes This patch adds support for LLVM translation of linear clause on omp.wsloop (except for linear modifiers). --- Patch is 25.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139386.diff 10 Files Affected: - (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+34) - (modified) flang/lib/Lower/OpenMP/ClauseProcessor.h (+1) - (modified) flang/lib/Lower/OpenMP/DataSharingProcessor.cpp (+3-2) - (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+2-2) - (added) flang/test/Lower/OpenMP/wsloop-linear.f90 (+57) - (modified) llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h (+15) - (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+3) - (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+183-2) - (modified) mlir/test/Target/LLVMIR/openmp-llvm.mlir (+88) - (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (-13) ``````````diff diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 79b5087e4da68..8ba2f604df80a 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1060,6 +1060,40 @@ bool ClauseProcessor::processIsDevicePtr( }); } +bool ClauseProcessor::processLinear(mlir::omp::LinearClauseOps &result) const { + lower::StatementContext stmtCtx; + return findRepeatableClause< + omp::clause::Linear>([&](const omp::clause::Linear &clause, + const parser::CharBlock &) { + auto &objects = std::get(clause.t); + for (const omp::Object &object : objects) { + semantics::Symbol *sym = object.sym(); + const mlir::Value variable = converter.getSymbolAddress(*sym); + result.linearVars.push_back(variable); + } + if (objects.size()) { + if (auto &mod = + std::get>( + clause.t)) { + mlir::Value operand = + fir::getBase(converter.genExprValue(toEvExpr(*mod), stmtCtx)); + result.linearStepVars.append(objects.size(), operand); + } else if (std::get>( + clause.t)) { + mlir::Location currentLocation = 
converter.getCurrentLocation(); + TODO(currentLocation, "Linear modifiers not yet implemented"); + } else { + // If nothing is present, add the default step of 1. + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + mlir::Value operand = firOpBuilder.createIntegerConstant( + currentLocation, firOpBuilder.getI32Type(), 1); + result.linearStepVars.append(objects.size(), operand); + } + } + }); +} + bool ClauseProcessor::processLink( llvm::SmallVectorImpl &result) const { return findRepeatableClause( diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 7857ba3fd0845..0ec41bdd33256 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -122,6 +122,7 @@ class ClauseProcessor { bool processIsDevicePtr( mlir::omp::IsDevicePtrClauseOps &result, llvm::SmallVectorImpl &isDeviceSyms) const; + bool processLinear(mlir::omp::LinearClauseOps &result) const; bool processLink(llvm::SmallVectorImpl &result) const; diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 7eec598645eac..2a1c94407e1c8 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -213,14 +213,15 @@ void DataSharingProcessor::collectSymbolsForPrivatization() { // so, we won't need to explicitely handle block objects (or forget to do // so). for (auto *sym : explicitlyPrivatizedSymbols) - allPrivatizedSymbols.insert(sym); + if (!sym->test(Fortran::semantics::Symbol::Flag::OmpLinear)) + allPrivatizedSymbols.insert(sym); } bool DataSharingProcessor::needBarrier() { // Emit implicit barrier to synchronize threads and avoid data races on // initialization of firstprivate variables and post-update of lastprivate // variables. - // Emit implicit barrier for linear clause. Maybe on somewhere else. 
+ // Emit implicit barrier for linear clause in the OpenMPIRBuilder. for (const semantics::Symbol *sym : allPrivatizedSymbols) { if (sym->test(semantics::Symbol::Flag::OmpLastPrivate) && (sym->test(semantics::Symbol::Flag::OmpFirstPrivate) || diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 54560729eb4af..6fa915b4364f9 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1841,13 +1841,13 @@ static void genWsloopClauses( llvm::SmallVectorImpl &reductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processNowait(clauseOps); + cp.processLinear(clauseOps); cp.processOrder(clauseOps); cp.processOrdered(clauseOps); cp.processReduction(loc, clauseOps, reductionSyms); cp.processSchedule(stmtCtx, clauseOps); - cp.processTODO( - loc, llvm::omp::Directive::OMPD_do); + cp.processTODO(loc, llvm::omp::Directive::OMPD_do); } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/wsloop-linear.f90 b/flang/test/Lower/OpenMP/wsloop-linear.f90 new file mode 100644 index 0000000000000..b99677108be2f --- /dev/null +++ b/flang/test/Lower/OpenMP/wsloop-linear.f90 @@ -0,0 +1,57 @@ +! This test checks lowering of OpenMP DO Directive (Worksharing) +! with linear clause + +! 
RUN: %flang_fc1 -fopenmp -emit-hlfir %s -o - 2>&1 | FileCheck %s + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFsimple_linearEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFsimple_linearEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[const:.*]] = arith.constant 1 : i32 +subroutine simple_linear + implicit none + integer :: x, y, i + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_stepEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_stepEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_step + implicit none + integer :: x, y, i + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x:4) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + +!CHECK: %[[A_alloca:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFlinear_exprEa"} +!CHECK: %[[A:.*]]:2 = hlfir.declare %[[A_alloca]] {uniq_name = "_QFlinear_exprEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_exprEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_exprEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_expr + implicit none + integer :: x, y, i, a + !CHECK: %[[LOAD_A:.*]] = fir.load %[[A]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: 
%[[LINEAR_EXPR:.*]] = arith.addi %[[LOAD_A]], %[[const]] : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[LINEAR_EXPR]] : !fir.ref) {{.*}} + !$omp do linear(x:a+4) + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h index ffc0fd0a0bdac..68f15d5c7d41e 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h +++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h @@ -3580,6 +3580,9 @@ class CanonicalLoopInfo { BasicBlock *Latch = nullptr; BasicBlock *Exit = nullptr; + // Hold the MLIR value for the `lastiter` of the canonical loop. + Value *LastIter = nullptr; + /// Add the control blocks of this loop to \p BBs. /// /// This does not include any block from the body, including the one returned @@ -3612,6 +3615,18 @@ class CanonicalLoopInfo { void mapIndVar(llvm::function_ref Updater); public: + /// Sets the last iteration variable for this loop. + void setLastIter(Value *IterVar) { LastIter = std::move(IterVar); } + + /// Returns the last iteration variable for this loop. + /// Certain use-cases (like translation of linear clause) may access + /// this variable even after a loop transformation. Hence, do not guard + /// this getter function by `isValid`. It is the responsibility of the + /// callee to ensure this functionality is not invoked by a non-outlined + /// CanonicalLoopInfo object (in which case, `setLastIter` will never be + /// invoked and `LastIter` will be by default `nullptr`). + Value *getLastIter() { return LastIter; } + /// Returns whether this object currently represents the IR of a loop. If /// returning false, it may have been consumed by a loop transformation or not /// been intialized. 
Do not use in this case; diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index a1268ca76b2d5..991cdb7b6b416 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -4254,6 +4254,7 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::applyStaticWorkshareLoop( Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); Value *PUpperBound = Builder.CreateAlloca(IVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(IVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // At the end of the preheader, prepare for calling the "init" function by // storing the current loop bounds into the allocated space. A canonical loop @@ -4361,6 +4362,7 @@ OpenMPIRBuilder::applyStaticChunkedWorkshareLoop(DebugLoc DL, Value *PUpperBound = Builder.CreateAlloca(InternalIVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(InternalIVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // Set up the source location value for the OpenMP runtime. Builder.restoreIP(CLI->getPreheaderIP()); @@ -4844,6 +4846,7 @@ OpenMPIRBuilder::applyDynamicWorkshareLoop(DebugLoc DL, CanonicalLoopInfo *CLI, Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); Value *PUpperBound = Builder.CreateAlloca(IVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(IVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // At the end of the preheader, prepare for calling the "init" function by // storing the current loop bounds into the allocated space. 
A canonical loop diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 9f7b5605556e6..571505ab9b9aa 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. + */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + linearSteps.push_back(moduleTranslation.lookupValue(linearStep)); + } + + // 
Emit IR for initialization of linear variables + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + initLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::BasicBlock *loopPreHeader) { + builder.SetInsertPoint(loopPreHeader->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarLoad = builder.CreateLoad( + linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]); + builder.CreateStore(linearVarLoad, linearPreconditionVars[index]); + } + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + return afterBarrierIP; + } + + // Emit IR for updating Linear variables + void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody, + llvm::Value *loopInductionVar) { + builder.SetInsertPoint(loopBody->getTerminator()); + for (size_t index = 0; index < linearPreconditionVars.size(); index++) { + // Emit increments for linear vars + llvm::LoadInst *linearVarStart = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + + linearPreconditionVars[index]); + auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]); + auto addInst = builder.CreateAdd(linearVarStart, mulInst); + builder.CreateStore(addInst, linearLoopBodyTemps[index]); + } + } + + // Linear variable finalization is conditional on the last logical iteration. + // Create BB splits to manage the same. 
+ void outlineLinearFinalizationBB(llvm::IRBuilderBase &builder, + llvm::BasicBlock *loopExit) { + linearFinalizationBB = loopExit->splitBasicBlock( + loopExit->getTerminator(), "omp_loop.linear_finalization"); + linearExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_exit"); + linearLastIterExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_lastiter_exit"); + } + + // Finalize the linear vars + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + finalizeLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::Value *lastIter) { + // Emit condition to check whether last logical iteration is being executed + builder.SetInsertPoint(linearFinalizationBB->getTerminator()); + llvm::Value *loopLastIterLoad = builder.CreateLoad( + llvm::Type::getInt32Ty(builder.getContext()), lastIter); + llvm::Value *isLast = + builder.CreateCmp(llvm::CmpInst::ICMP_NE, loopLastIterLoad, + llvm::ConstantInt::get( + llvm::Type::getInt32Ty(builder.getContext()), 0)); + // Store the linear variable values to original variables. 
+ builder.SetInsertPoint(linearLastIterExitBB->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarTemp = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + linearLoopBodyTemps[index]); + builder.CreateStore(linearVarTemp, linearOrigVars[index]); + } + + // Create conditional branch such that the linear variable + // values are stored to original variables only at the + // last logical iteration + builder.SetInsertPoint(linearFinalizationBB->getTerminator()); + builder.CreateCondBr(isLast, linearLastIterExitBB, linearExitBB); + linearFinalizationBB->getTerminator()->eraseFromParent(); + // Emit barrier + builder.SetInsertPoint(linearExitBB->getTerminator()); + return moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + } + + // Rewrite all uses of the original variable in `BBName` + // with the linear variable in-place + void rewriteInPlace(llvm::IRBuilderBase &builder, std::string BBName, + size_t varIndex) { + llvm::SmallVector users; + for (llvm::User *user : linearOrigVal[varIndex]->users()) + users.push_back(user); + for (auto *user : users) { + if (auto *userInst = dyn_cast(user)) { + if (userInst->getParent()->getName().str() == BBName) + user->replaceUsesOfWith(linearOrigVal[varIndex], + linearLoopBodyTemps[varIndex]); + } + } + } +}; + } // namespace /// Looks up from the operation from and returns the PrivateClauseOp with @@ -292,7 +432,6 @@ static LogicalResult checkImplementationStatus(Operation &op) { }) .Case([&](omp::WsloopOp op) { checkAllocate(op, result); - checkLinear(op, result); checkOrder(op, result); checkReduction(op, result); }) @@ -2423,15 +2562,40 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, llvm::omp::Directive::OMPD_for); llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder); + + // Initialize linear variables and linear step + LinearClauseProcessor linearClauseProcessor; + if 
(wsloopOp.getLinearVars().size()) { + for (mlir::Value linearVar : wsloopOp.getLinearVars()) + linearClauseProcessor.createLinearVar(builder, moduleTranslation, + linearVar); + for (mlir::Value linearStep : wsloopOp.getLinearStepVars()) + linearClauseProcessor.initLinearStep(moduleTranslation, linearStep); + } + llvm::Expected regionBlock = convertOmpOpRegions( wsloopOp.getRegion(), "omp.wsloop.region", builder, moduleTranslation); if (failed(handleError(regionBlock, opInst))) return failure(); - builder.SetInsertPoint(*regionBlock, (*regionBlock)->begin()); llvm::CanonicalLoopInfo *loopInfo = findCurrentLoopInfo(moduleTranslation); + // Emit Initialization and Update IR for linear variables + if (wsloopOp.getLinearVars().size()) { + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + linearClauseProcessor.initLinearVar(builder, moduleTranslation, + loopInfo->getPreheader()); + if (failed(handleError(afterBarrierIP, *loopOp))) + return failure(); + builder.restoreIP(*afterBarrierIP); + linearClauseProcessor.updateLinearVar(builder, loopInfo->getBody(), + loopInfo->getIndVar()); + linearClauseProcessor.outlineLinearFinalizationBB(builder, + loopInfo->getExit()); + } + + builder.SetInsertPoint(*regionBlock, (*regionBlock)->begin()); llvm::OpenMPIRBuilder::InsertPointOrErrorTy wsloopIP = ompBuilder->applyWorkshareLoop( ompLoc.DL, loopInfo, allocaIP, loopNeedsBarrier, @@ -2443,6 +2607,23 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, if (failed(handleError(wsloopIP, opInst))) return failure(); + // Emit finalization and in-place rewrites for linear vars. + if (wsloopOp.getLinearVars().size()) { + llvm::OpenMPIRBuilder::InsertPointTy oldIP = builder.saveIP(); + assert(loopInfo->getLastIter() && + "`lastiter` in CanonicalLoopInfo is nullptr"... [truncated] ``````````
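As a reading aid (not part of the patch): the stages that `LinearClauseProcessor` emits in the diff above (snapshot the start value, recompute the variable from the induction variable and step at the top of each iteration, write back only on the last logical iteration) can be sketched as a plain sequential C++ simulation. This is a minimal sketch under stated assumptions: `linearValueAt`, `LinearLoopResult` and `runLinearLoop` are hypothetical names, no LLVM or OpenMP API is used, and the loop shape (`start 1, stride 2, linear step 1`) is chosen only for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the patch): the value a linear variable
// holds at the start of logical iteration `iv`, i.e. start + iv * step.
// This mirrors the mul/add pair emitted in updateLinearVar.
static int linearValueAt(int start, int step, int iv) {
  return start + iv * step;
}

// Hypothetical result type for the simulation below.
struct LinearLoopResult {
  int j;                // linear variable after the loop
  std::vector<float> b; // 1-based output array (index 0 unused)
};

// Sequentially simulates a loop `for i = 1..n step 2` with `linear(j:1)`.
LinearLoopResult runLinearLoop(int n) {
  std::vector<float> a(n + 1), b(n / 2 + 1, 0.0f);
  for (int i = 1; i <= n; ++i)
    a[i] = static_cast<float>(i);

  int j = 0;
  const int jStart = j; // "init" stage: snapshot taken before the loop
  const int step = 1;
  const int tripCount = (n + 1) / 2;

  for (int iv = 0; iv < tripCount; ++iv) {
    const int i = 1 + 2 * iv;
    // "update" stage: private copy derived from the logical iteration.
    int jPriv = linearValueAt(jStart, step, iv);
    // Loop body operating on the private copy.
    jPriv = jPriv + 1;
    b[jPriv] = a[i] * 2.0f;
    // "finalize" stage: only the last logical iteration writes back.
    if (iv == tripCount - 1)
      j = jPriv;
  }
  return {j, b};
}
```

Because each iteration derives its private copy from `iv` alone, iterations could run in any order or on any thread and still index `b` consistently; calling `runLinearLoop(100)` in this sketch leaves `j` at 50 with `b[1]` equal to 2.0 and `b[50]` equal to 198.0.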
https://github.com/llvm/llvm-project/pull/139386
From flang-commits at lists.llvm.org Sat May 10 08:30:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 08:30:55 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <681f712f.170a0220.5e126.0509@mx.google.com>
NimishMishra wrote: The following test case from OpenMP examples document compiles successfully with a combination of PR https://github.com/llvm/llvm-project/pull/139385 and this PR:
```
program linear_loop
   use omp_lib
   implicit none
   integer, parameter :: N = 100
   real :: a(N), b(N/2)
   integer :: i, j
   do i = 1, N
      a(i) = i
   end do
   j = 0
   !$omp parallel
   !$omp do linear(j:1)
   do i = 1, N, 2
      j = j + 1
      b(j) = a(i) * 2.0
   end do
   !$omp end parallel
   print *, j, b(1), b(j) ! print out: 50 2.0 198.0
end program
```
Flang output: `50 2. 198.` as expected.
https://github.com/llvm/llvm-project/pull/139386
From flang-commits at lists.llvm.org Sat May 10 08:43:46 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 08:43:46 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <681f7432.630a0220.162f8.bce8@mx.google.com>
NimishMishra wrote: This PR is stacked over https://github.com/llvm/llvm-project/pull/139385 to allow for easy testing. I intend to merge this PR only after https://github.com/llvm/llvm-project/pull/139385 is merged.
https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Sat May 10 08:51:01 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 08:51:01 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [flang][OpenMP] Support MLIR lowering of linear clause for omp.wsloop (PR #139385) In-Reply-To: Message-ID: <681f75e5.170a0220.32f084.d46d@mx.google.com> https://github.com/NimishMishra updated https://github.com/llvm/llvm-project/pull/139385 >From e4f3cb2553f8ef03a3ad347cf14a187e31064153 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Sat, 10 May 2025 19:34:16 +0530 Subject: [PATCH 1/3] [flang][OpenMP] Support MLIR lowering of linear clause for omp.wsloop --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 34 +++++++++++ flang/lib/Lower/OpenMP/ClauseProcessor.h | 1 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 5 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 4 +- flang/test/Lower/OpenMP/wsloop-linear.f90 | 57 +++++++++++++++++++ 5 files changed, 97 insertions(+), 4 deletions(-) create mode 100644 flang/test/Lower/OpenMP/wsloop-linear.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 79b5087e4da68..8ba2f604df80a 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1060,6 +1060,40 @@ bool ClauseProcessor::processIsDevicePtr( }); } +bool ClauseProcessor::processLinear(mlir::omp::LinearClauseOps &result) const { + lower::StatementContext stmtCtx; + return findRepeatableClause< + omp::clause::Linear>([&](const omp::clause::Linear &clause, + const parser::CharBlock &) { + auto &objects = std::get(clause.t); + for (const omp::Object &object : objects) { + semantics::Symbol *sym = object.sym(); + const mlir::Value variable = converter.getSymbolAddress(*sym); + result.linearVars.push_back(variable); + } + if (objects.size()) { + if (auto &mod = + std::get>( + clause.t)) { + mlir::Value operand = + 
fir::getBase(converter.genExprValue(toEvExpr(*mod), stmtCtx)); + result.linearStepVars.append(objects.size(), operand); + } else if (std::get>( + clause.t)) { + mlir::Location currentLocation = converter.getCurrentLocation(); + TODO(currentLocation, "Linear modifiers not yet implemented"); + } else { + // If nothing is present, add the default step of 1. + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + mlir::Value operand = firOpBuilder.createIntegerConstant( + currentLocation, firOpBuilder.getI32Type(), 1); + result.linearStepVars.append(objects.size(), operand); + } + } + }); +} + bool ClauseProcessor::processLink( llvm::SmallVectorImpl &result) const { return findRepeatableClause( diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 7857ba3fd0845..0ec41bdd33256 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -122,6 +122,7 @@ class ClauseProcessor { bool processIsDevicePtr( mlir::omp::IsDevicePtrClauseOps &result, llvm::SmallVectorImpl &isDeviceSyms) const; + bool processLinear(mlir::omp::LinearClauseOps &result) const; bool processLink(llvm::SmallVectorImpl &result) const; diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 7eec598645eac..2a1c94407e1c8 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -213,14 +213,15 @@ void DataSharingProcessor::collectSymbolsForPrivatization() { // so, we won't need to explicitely handle block objects (or forget to do // so). 
for (auto *sym : explicitlyPrivatizedSymbols) - allPrivatizedSymbols.insert(sym); + if (!sym->test(Fortran::semantics::Symbol::Flag::OmpLinear)) + allPrivatizedSymbols.insert(sym); } bool DataSharingProcessor::needBarrier() { // Emit implicit barrier to synchronize threads and avoid data races on // initialization of firstprivate variables and post-update of lastprivate // variables. - // Emit implicit barrier for linear clause. Maybe on somewhere else. + // Emit implicit barrier for linear clause in the OpenMPIRBuilder. for (const semantics::Symbol *sym : allPrivatizedSymbols) { if (sym->test(semantics::Symbol::Flag::OmpLastPrivate) && (sym->test(semantics::Symbol::Flag::OmpFirstPrivate) || diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 54560729eb4af..6fa915b4364f9 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1841,13 +1841,13 @@ static void genWsloopClauses( llvm::SmallVectorImpl &reductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processNowait(clauseOps); + cp.processLinear(clauseOps); cp.processOrder(clauseOps); cp.processOrdered(clauseOps); cp.processReduction(loc, clauseOps, reductionSyms); cp.processSchedule(stmtCtx, clauseOps); - cp.processTODO( - loc, llvm::omp::Directive::OMPD_do); + cp.processTODO(loc, llvm::omp::Directive::OMPD_do); } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/wsloop-linear.f90 b/flang/test/Lower/OpenMP/wsloop-linear.f90 new file mode 100644 index 0000000000000..b99677108be2f --- /dev/null +++ b/flang/test/Lower/OpenMP/wsloop-linear.f90 @@ -0,0 +1,57 @@ +! This test checks lowering of OpenMP DO Directive (Worksharing) +! with linear clause + +! 
RUN: %flang_fc1 -fopenmp -emit-hlfir %s -o - 2>&1 | FileCheck %s + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFsimple_linearEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFsimple_linearEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[const:.*]] = arith.constant 1 : i32 +subroutine simple_linear + implicit none + integer :: x, y, i + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_stepEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_stepEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_step + implicit none + integer :: x, y, i + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x:4) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + +!CHECK: %[[A_alloca:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFlinear_exprEa"} +!CHECK: %[[A:.*]]:2 = hlfir.declare %[[A_alloca]] {uniq_name = "_QFlinear_exprEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_exprEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_exprEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_expr + implicit none + integer :: x, y, i, a + !CHECK: %[[LOAD_A:.*]] = fir.load %[[A]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: 
%[[LINEAR_EXPR:.*]] = arith.addi %[[LOAD_A]], %[[const]] : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[LINEAR_EXPR]] : !fir.ref) {{.*}} + !$omp do linear(x:a+4) + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine >From 616d6377bbb94bdc742023d71c63ba89df293e3c Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Sat, 10 May 2025 20:51:39 +0530 Subject: [PATCH 2/3] [mlir][llvm][OpenMP] Support translation for linear clause in omp.wsloop --- .../llvm/Frontend/OpenMP/OMPIRBuilder.h | 15 ++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 3 + .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 185 +++++++++++++++++- mlir/test/Target/LLVMIR/openmp-llvm.mlir | 88 +++++++++ mlir/test/Target/LLVMIR/openmp-todo.mlir | 13 -- 5 files changed, 289 insertions(+), 15 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h index ffc0fd0a0bdac..68f15d5c7d41e 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h +++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h @@ -3580,6 +3580,9 @@ class CanonicalLoopInfo { BasicBlock *Latch = nullptr; BasicBlock *Exit = nullptr; + // Hold the MLIR value for the `lastiter` of the canonical loop. + Value *LastIter = nullptr; + /// Add the control blocks of this loop to \p BBs. /// /// This does not include any block from the body, including the one returned @@ -3612,6 +3615,18 @@ class CanonicalLoopInfo { void mapIndVar(llvm::function_ref Updater); public: + /// Sets the last iteration variable for this loop. + void setLastIter(Value *IterVar) { LastIter = std::move(IterVar); } + + /// Returns the last iteration variable for this loop. + /// Certain use-cases (like translation of linear clause) may access + /// this variable even after a loop transformation. Hence, do not guard + /// this getter function by `isValid`. 
It is the responsibility of the + /// callee to ensure this functionality is not invoked by a non-outlined + /// CanonicalLoopInfo object (in which case, `setLastIter` will never be + /// invoked and `LastIter` will be by default `nullptr`). + Value *getLastIter() { return LastIter; } + /// Returns whether this object currently represents the IR of a loop. If /// returning false, it may have been consumed by a loop transformation or not /// been intialized. Do not use in this case; diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index a1268ca76b2d5..991cdb7b6b416 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -4254,6 +4254,7 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::applyStaticWorkshareLoop( Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); Value *PUpperBound = Builder.CreateAlloca(IVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(IVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // At the end of the preheader, prepare for calling the "init" function by // storing the current loop bounds into the allocated space. A canonical loop @@ -4361,6 +4362,7 @@ OpenMPIRBuilder::applyStaticChunkedWorkshareLoop(DebugLoc DL, Value *PUpperBound = Builder.CreateAlloca(InternalIVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(InternalIVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // Set up the source location value for the OpenMP runtime. 
Builder.restoreIP(CLI->getPreheaderIP()); @@ -4844,6 +4846,7 @@ OpenMPIRBuilder::applyDynamicWorkshareLoop(DebugLoc DL, CanonicalLoopInfo *CLI, Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); Value *PUpperBound = Builder.CreateAlloca(IVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(IVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // At the end of the preheader, prepare for calling the "init" function by // storing the current loop bounds into the allocated space. A canonical loop diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 9f7b5605556e6..571505ab9b9aa 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. 
+ */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + linearSteps.push_back(moduleTranslation.lookupValue(linearStep)); + } + + // Emit IR for initialization of linear variables + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + initLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::BasicBlock *loopPreHeader) { + builder.SetInsertPoint(loopPreHeader->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarLoad = builder.CreateLoad( + linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]); + builder.CreateStore(linearVarLoad, linearPreconditionVars[index]); + } + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + return afterBarrierIP; + } + + // Emit IR for 
updating Linear variables + void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody, + llvm::Value *loopInductionVar) { + builder.SetInsertPoint(loopBody->getTerminator()); + for (size_t index = 0; index < linearPreconditionVars.size(); index++) { + // Emit increments for linear vars + llvm::LoadInst *linearVarStart = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + + linearPreconditionVars[index]); + auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]); + auto addInst = builder.CreateAdd(linearVarStart, mulInst); + builder.CreateStore(addInst, linearLoopBodyTemps[index]); + } + } + + // Linear variable finalization is conditional on the last logical iteration. + // Create BB splits to manage the same. + void outlineLinearFinalizationBB(llvm::IRBuilderBase &builder, + llvm::BasicBlock *loopExit) { + linearFinalizationBB = loopExit->splitBasicBlock( + loopExit->getTerminator(), "omp_loop.linear_finalization"); + linearExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_exit"); + linearLastIterExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_lastiter_exit"); + } + + // Finalize the linear vars + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + finalizeLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::Value *lastIter) { + // Emit condition to check whether last logical iteration is being executed + builder.SetInsertPoint(linearFinalizationBB->getTerminator()); + llvm::Value *loopLastIterLoad = builder.CreateLoad( + llvm::Type::getInt32Ty(builder.getContext()), lastIter); + llvm::Value *isLast = + builder.CreateCmp(llvm::CmpInst::ICMP_NE, loopLastIterLoad, + llvm::ConstantInt::get( + llvm::Type::getInt32Ty(builder.getContext()), 0)); + // Store the linear variable values to original variables. 
+ builder.SetInsertPoint(linearLastIterExitBB->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarTemp = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + linearLoopBodyTemps[index]); + builder.CreateStore(linearVarTemp, linearOrigVars[index]); + } + + // Create conditional branch such that the linear variable + // values are stored to original variables only at the + // last logical iteration + builder.SetInsertPoint(linearFinalizationBB->getTerminator()); + builder.CreateCondBr(isLast, linearLastIterExitBB, linearExitBB); + linearFinalizationBB->getTerminator()->eraseFromParent(); + // Emit barrier + builder.SetInsertPoint(linearExitBB->getTerminator()); + return moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + } + + // Rewrite all uses of the original variable in `BBName` + // with the linear variable in-place + void rewriteInPlace(llvm::IRBuilderBase &builder, std::string BBName, + size_t varIndex) { + llvm::SmallVector users; + for (llvm::User *user : linearOrigVal[varIndex]->users()) + users.push_back(user); + for (auto *user : users) { + if (auto *userInst = dyn_cast(user)) { + if (userInst->getParent()->getName().str() == BBName) + user->replaceUsesOfWith(linearOrigVal[varIndex], + linearLoopBodyTemps[varIndex]); + } + } + } +}; + } // namespace /// Looks up from the operation from and returns the PrivateClauseOp with @@ -292,7 +432,6 @@ static LogicalResult checkImplementationStatus(Operation &op) { }) .Case([&](omp::WsloopOp op) { checkAllocate(op, result); - checkLinear(op, result); checkOrder(op, result); checkReduction(op, result); }) @@ -2423,15 +2562,40 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, llvm::omp::Directive::OMPD_for); llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder); + + // Initialize linear variables and linear step + LinearClauseProcessor linearClauseProcessor; + if 
(wsloopOp.getLinearVars().size()) { + for (mlir::Value linearVar : wsloopOp.getLinearVars()) + linearClauseProcessor.createLinearVar(builder, moduleTranslation, + linearVar); + for (mlir::Value linearStep : wsloopOp.getLinearStepVars()) + linearClauseProcessor.initLinearStep(moduleTranslation, linearStep); + } + llvm::Expected regionBlock = convertOmpOpRegions( wsloopOp.getRegion(), "omp.wsloop.region", builder, moduleTranslation); if (failed(handleError(regionBlock, opInst))) return failure(); - builder.SetInsertPoint(*regionBlock, (*regionBlock)->begin()); llvm::CanonicalLoopInfo *loopInfo = findCurrentLoopInfo(moduleTranslation); + // Emit Initialization and Update IR for linear variables + if (wsloopOp.getLinearVars().size()) { + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + linearClauseProcessor.initLinearVar(builder, moduleTranslation, + loopInfo->getPreheader()); + if (failed(handleError(afterBarrierIP, *loopOp))) + return failure(); + builder.restoreIP(*afterBarrierIP); + linearClauseProcessor.updateLinearVar(builder, loopInfo->getBody(), + loopInfo->getIndVar()); + linearClauseProcessor.outlineLinearFinalizationBB(builder, + loopInfo->getExit()); + } + + builder.SetInsertPoint(*regionBlock, (*regionBlock)->begin()); llvm::OpenMPIRBuilder::InsertPointOrErrorTy wsloopIP = ompBuilder->applyWorkshareLoop( ompLoc.DL, loopInfo, allocaIP, loopNeedsBarrier, @@ -2443,6 +2607,23 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, if (failed(handleError(wsloopIP, opInst))) return failure(); + // Emit finalization and in-place rewrites for linear vars. 
+ if (wsloopOp.getLinearVars().size()) { + llvm::OpenMPIRBuilder::InsertPointTy oldIP = builder.saveIP(); + assert(loopInfo->getLastIter() && + "`lastiter` in CanonicalLoopInfo is nullptr"); + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + linearClauseProcessor.finalizeLinearVar(builder, moduleTranslation, + loopInfo->getLastIter()); + if (failed(handleError(afterBarrierIP, *loopOp))) + return failure(); + builder.restoreIP(*afterBarrierIP); + for (size_t index = 0; index < wsloopOp.getLinearVars().size(); index++) + linearClauseProcessor.rewriteInPlace(builder, "omp.loop_nest.region", + index); + builder.restoreIP(oldIP); + } + // Set the correct branch target for task cancellation popCancelFinalizationCB(cancelTerminators, *ompBuilder, wsloopIP.get()); diff --git a/mlir/test/Target/LLVMIR/openmp-llvm.mlir b/mlir/test/Target/LLVMIR/openmp-llvm.mlir index 32f0ba5b105ff..9ad9e93301239 100644 --- a/mlir/test/Target/LLVMIR/openmp-llvm.mlir +++ b/mlir/test/Target/LLVMIR/openmp-llvm.mlir @@ -358,6 +358,94 @@ llvm.func @wsloop_simple(%arg0: !llvm.ptr) { // ----- +// CHECK-LABEL: wsloop_linear + +// CHECK: {{.*}} = alloca i32, i64 1, align 4 +// CHECK: %[[Y:.*]] = alloca i32, i64 1, align 4 +// CHECK: %[[X:.*]] = alloca i32, i64 1, align 4 + +// CHECK: entry: +// CHECK: %[[LINEAR_VAR:.*]] = alloca i32, align 4 +// CHECK: %[[LINEAR_RESULT:.*]] = alloca i32, align 4 +// CHECK: br label %omp_loop.preheader + +// CHECK: omp_loop.preheader: +// CHECK: %[[LOAD:.*]] = load i32, ptr %[[X]], align 4 +// CHECK: store i32 %[[LOAD]], ptr %[[LINEAR_VAR]], align 4 +// CHECK: %omp_global_thread_num = call i32 @__kmpc_global_thread_num(ptr @2) +// CHECK: call void @__kmpc_barrier(ptr @1, i32 %omp_global_thread_num) + +// CHECK: omp_loop.body: +// CHECK: %[[LOOP_IV:.*]] = add i32 %omp_loop.iv, {{.*}} +// CHECK: %[[LINEAR_LOAD:.*]] = load i32, ptr %[[LINEAR_VAR]], align 4 +// CHECK: %[[MUL:.*]] = mul i32 %[[LOOP_IV]], 1 +// CHECK: %[[ADD:.*]] = add i32 %[[LINEAR_LOAD]], 
%[[MUL]] +// CHECK: store i32 %[[ADD]], ptr %[[LINEAR_RESULT]], align 4 +// CHECK: br label %omp.loop_nest.region + +// CHECK: omp.loop_nest.region: +// CHECK: %[[LINEAR_LOAD:.*]] = load i32, ptr %[[LINEAR_RESULT]], align 4 +// CHECK: %[[ADD:.*]] = add i32 %[[LINEAR_LOAD]], 2 +// CHECK: store i32 %[[ADD]], ptr %[[Y]], align 4 + +// CHECK: omp_loop.exit: +// CHECK: call void @__kmpc_for_static_fini(ptr @2, i32 %omp_global_thread_num4) +// CHECK: %omp_global_thread_num5 = call i32 @__kmpc_global_thread_num(ptr @2) +// CHECK: call void @__kmpc_barrier(ptr @3, i32 %omp_global_thread_num5) +// CHECK: br label %omp_loop.linear_finalization + +// CHECK: omp_loop.linear_finalization: +// CHECK: %[[LAST_ITER:.*]] = load i32, ptr %p.lastiter, align 4 +// CHECK: %[[CMP:.*]] = icmp ne i32 %[[LAST_ITER]], 0 +// CHECK: br i1 %[[CMP]], label %omp_loop.linear_lastiter_exit, label %omp_loop.linear_exit + +// CHECK: omp_loop.linear_lastiter_exit: +// CHECK: %[[LINEAR_RESULT_LOAD:.*]] = load i32, ptr %[[LINEAR_RESULT]], align 4 +// CHECK: store i32 %[[LINEAR_RESULT_LOAD]], ptr %[[X]], align 4 +// CHECK: br label %omp_loop.linear_exit + +// CHECK: omp_loop.linear_exit: +// CHECK: %omp_global_thread_num6 = call i32 @__kmpc_global_thread_num(ptr @2) +// CHECK: call void @__kmpc_barrier(ptr @1, i32 %omp_global_thread_num6) +// CHECK: br label %omp_loop.after + +llvm.func @wsloop_linear() { + %0 = llvm.mlir.constant(1 : i64) : i64 + %1 = llvm.alloca %0 x i32 {bindc_name = "i", pinned} : (i64) -> !llvm.ptr + %2 = llvm.mlir.constant(1 : i64) : i64 + %3 = llvm.alloca %2 x i32 {bindc_name = "y"} : (i64) -> !llvm.ptr + %4 = llvm.mlir.constant(1 : i64) : i64 + %5 = llvm.alloca %4 x i32 {bindc_name = "x"} : (i64) -> !llvm.ptr + %6 = llvm.mlir.constant(1 : i64) : i64 + %7 = llvm.alloca %6 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr + %8 = llvm.mlir.constant(2 : i32) : i32 + %9 = llvm.mlir.constant(10 : i32) : i32 + %10 = llvm.mlir.constant(1 : i32) : i32 + %11 = llvm.mlir.constant(1 : i64) : 
i64 + %12 = llvm.mlir.constant(1 : i64) : i64 + %13 = llvm.mlir.constant(1 : i64) : i64 + %14 = llvm.mlir.constant(1 : i64) : i64 + omp.wsloop linear(%5 = %10 : !llvm.ptr) { + omp.loop_nest (%arg0) : i32 = (%10) to (%9) inclusive step (%10) { + llvm.store %arg0, %1 : i32, !llvm.ptr + %15 = llvm.load %5 : !llvm.ptr -> i32 + %16 = llvm.add %15, %8 : i32 + llvm.store %16, %3 : i32, !llvm.ptr + %17 = llvm.add %arg0, %10 : i32 + %18 = llvm.icmp "sgt" %17, %9 : i32 + llvm.cond_br %18, ^bb1, ^bb2 + ^bb1: // pred: ^bb0 + llvm.store %17, %1 : i32, !llvm.ptr + llvm.br ^bb2 + ^bb2: // 2 preds: ^bb0, ^bb1 + omp.yield + } + } + llvm.return +} + +// ----- + // CHECK-LABEL: @wsloop_inclusive_1 llvm.func @wsloop_inclusive_1(%arg0: !llvm.ptr) { %0 = llvm.mlir.constant(42 : index) : i64 diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir index 9a83b46efddca..98fccb1a80f67 100644 --- a/mlir/test/Target/LLVMIR/openmp-todo.mlir +++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir @@ -511,19 +511,6 @@ llvm.func @wsloop_allocate(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) { // ----- -llvm.func @wsloop_linear(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) { - // expected-error at below {{not yet implemented: Unhandled clause linear in omp.wsloop operation}} - // expected-error at below {{LLVM Translation failed for operation: omp.wsloop}} - omp.wsloop linear(%x = %step : !llvm.ptr) { - omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) { - omp.yield - } - } - llvm.return -} - -// ----- - llvm.func @wsloop_order(%lb : i32, %ub : i32, %step : i32) { // expected-error at below {{not yet implemented: Unhandled clause order in omp.wsloop operation}} // expected-error at below {{LLVM Translation failed for operation: omp.wsloop}} >From 5b5bba68e8cc723a79ca686a9a8b317c2388c92a Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Sat, 10 May 2025 21:19:49 +0530 Subject: [PATCH 3/3] Drop unused InsertionPt restoration --- 
.../Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp | 1 - 1 file changed, 1 deletion(-) diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 571505ab9b9aa..d723eee10636f 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -2617,7 +2617,6 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, loopInfo->getLastIter()); if (failed(handleError(afterBarrierIP, *loopOp))) return failure(); - builder.restoreIP(*afterBarrierIP); for (size_t index = 0; index < wsloopOp.getLinearVars().size(); index++) linearClauseProcessor.rewriteInPlace(builder, "omp.loop_nest.region", index);

From flang-commits at lists.llvm.org Fri May 9 09:22:14 2025 From: flang-commits at lists.llvm.org (Mats Petersson via flang-commits) Date: Fri, 09 May 2025 09:22:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #131628) In-Reply-To: Message-ID: <681e2bb6.170a0220.2598dd.22b7@mx.google.com> https://github.com/Leporacanthicus updated https://github.com/llvm/llvm-project/pull/131628

>From a75db6e7529fa9c3f1770e1325e6cd1696a77052 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Thu, 6 Mar 2025 10:41:59 +0000 Subject: [PATCH 01/12] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION

This adds another puzzle piece for the support of OpenMP DECLARE REDUCTION functionality. It adds support for operators with derived types, as well as for declaring multiple different types with the same name or operator. A new details class, UserReductionDetails, is introduced to hold the list of types supported by a given reduction declaration. Tests for parsing and symbol generation are added. DECLARE REDUCTION is still not supported in lowering; it generates a "Not yet implemented" fatal error.
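The idea of a details record that holds the list of supported types can be sketched in isolation. The block below is only an illustration under stated assumptions, not the code from this patch: `TypeSpec` is a hypothetical stand-in for flang's `DeclTypeSpec`, `ReductionTypes` is a hypothetical name for the record, and `CheckExample` is a helper invented for the demonstration. It assumes, as the diff further down suggests, that a type counts as supported only when the same type object was registered earlier.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for flang's DeclTypeSpec; only object identity
// matters for this sketch.
struct TypeSpec {
  std::string name;
};

// Hypothetical record of the types one user-declared reduction accepts.
class ReductionTypes {
public:
  // Register one more type for this reduction identifier.
  void AddType(const TypeSpec *type) { types_.push_back(type); }

  const std::vector<const TypeSpec *> &GetTypeList() const { return types_; }

  // Pointer comparison: a type is supported only if this exact
  // type object was registered before.
  bool SupportsType(const TypeSpec *type) const {
    return std::find(types_.begin(), types_.end(), type) != types_.end();
  }

private:
  std::vector<const TypeSpec *> types_;
};

// Illustrative check: one "+" reduction declared for two derived types,
// queried with a third type that was never declared.
inline void CheckExample() {
  TypeSpec tt{"tt"}, tt2{"tt2"}, other{"other"};
  ReductionTypes plus;
  plus.AddType(&tt);
  plus.AddType(&tt2);
  assert(plus.SupportsType(&tt));
  assert(plus.SupportsType(&tt2));
  assert(!plus.SupportsType(&other));
}
```

Keeping one record per reduction identifier and appending types to it is what lets several declarations share the same name or operator; a use of the reduction with an unregistered type can then be rejected with a type error instead of being silently accepted.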
--- flang/include/flang/Semantics/symbol.h | 21 ++- flang/lib/Semantics/check-omp-structure.cpp | 63 ++++++-- flang/lib/Semantics/resolve-names-utils.h | 4 + flang/lib/Semantics/resolve-names.cpp | 77 +++++++++- flang/lib/Semantics/symbol.cpp | 12 +- .../Parser/OpenMP/declare-reduction-multi.f90 | 134 ++++++++++++++++++ .../OpenMP/declare-reduction-operator.f90 | 59 ++++++++ .../OpenMP/declare-reduction-functions.f90 | 126 ++++++++++++++++ .../OpenMP/declare-reduction-mangled.f90 | 51 +++++++ .../OpenMP/declare-reduction-operators.f90 | 55 +++++++ .../OpenMP/declare-reduction-typeerror.f90 | 30 ++++ .../Semantics/OpenMP/declare-reduction.f90 | 4 +- 12 files changed, 616 insertions(+), 20 deletions(-) create mode 100644 flang/test/Parser/OpenMP/declare-reduction-multi.f90 create mode 100644 flang/test/Parser/OpenMP/declare-reduction-operator.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-functions.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-operators.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 715811885c219..12867a5f8ec6f 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -701,6 +701,25 @@ class GenericDetails { }; llvm::raw_ostream &operator<<(llvm::raw_ostream &, const GenericDetails &); +class UserReductionDetails : public WithBindName { +public: + using TypeVector = std::vector; + UserReductionDetails() = default; + + void AddType(const DeclTypeSpec *type) { typeList_.push_back(type); } + const TypeVector &GetTypeList() const { return typeList_; } + + bool SupportsType(const DeclTypeSpec *type) const { + for (auto t : typeList_) + if (t == type) + return true; + return false; + } + +private: + TypeVector typeList_; +}; + class UnknownDetails {}; using 
Details = std::variant; + TypeParamDetails, MiscDetails, UserReductionDetails>; llvm::raw_ostream &operator<<(llvm::raw_ostream &, const Details &); std::string DetailsToString(const Details &); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 717982f66027c..aa8c830e8f2d2 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -8,6 +8,7 @@ #include "check-omp-structure.h" #include "definable.h" +#include "resolve-names-utils.h" #include "flang/Evaluate/check-expression.h" #include "flang/Evaluate/expression.h" #include "flang/Evaluate/type.h" @@ -3403,8 +3404,8 @@ bool OmpStructureChecker::CheckReductionOperator( valid = llvm::is_contained({"max", "min", "iand", "ior", "ieor"}, realName); if (!valid) { - auto *misc{name->symbol->detailsIf()}; - valid = misc && misc->kind() == MiscDetails::Kind::ConstructName; + auto *reductionDetails{name->symbol->detailsIf()}; + valid = reductionDetails != nullptr; } } if (!valid) { @@ -3486,7 +3487,8 @@ void OmpStructureChecker::CheckReductionObjects( } static bool IsReductionAllowedForType( - const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type) { + const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type, + const Scope &scope) { auto isLogical{[](const DeclTypeSpec &type) -> bool { return type.category() == DeclTypeSpec::Logical; }}; @@ -3506,9 +3508,11 @@ static bool IsReductionAllowedForType( case parser::DefinedOperator::IntrinsicOperator::Multiply: case parser::DefinedOperator::IntrinsicOperator::Add: case parser::DefinedOperator::IntrinsicOperator::Subtract: - return type.IsNumeric(TypeCategory::Integer) || + if (type.IsNumeric(TypeCategory::Integer) || type.IsNumeric(TypeCategory::Real) || - type.IsNumeric(TypeCategory::Complex); + type.IsNumeric(TypeCategory::Complex)) + return true; + break; case parser::DefinedOperator::IntrinsicOperator::AND: case 
parser::DefinedOperator::IntrinsicOperator::OR: @@ -3521,8 +3525,18 @@ static bool IsReductionAllowedForType( DIE("This should have been caught in CheckIntrinsicOperator"); return false; } + parser::CharBlock name{MakeNameFromOperator(*intrinsicOp)}; + Symbol *symbol{scope.FindSymbol(name)}; + if (symbol) { + const auto *reductionDetails{symbol->detailsIf()}; + assert(reductionDetails && "Expected to find reductiondetails"); + + return reductionDetails->SupportsType(&type); + } + return false; } - return true; + assert(0 && "Intrinsic Operator not found - parsing gone wrong?"); + return false; // Reject everything else. }}; auto checkDesignator{[&](const parser::ProcedureDesignator &procD) { @@ -3535,18 +3549,42 @@ static bool IsReductionAllowedForType( // IAND: arguments must be integers: F2023 16.9.100 // IEOR: arguments must be integers: F2023 16.9.106 // IOR: arguments must be integers: F2023 16.9.111 - return type.IsNumeric(TypeCategory::Integer); + if (type.IsNumeric(TypeCategory::Integer)) { + return true; + } } else if (realName == "max" || realName == "min") { // MAX: arguments must be integer, real, or character: // F2023 16.9.135 // MIN: arguments must be integer, real, or character: // F2023 16.9.141 - return type.IsNumeric(TypeCategory::Integer) || - type.IsNumeric(TypeCategory::Real) || isCharacter(type); + if (type.IsNumeric(TypeCategory::Integer) || + type.IsNumeric(TypeCategory::Real) || isCharacter(type)) { + return true; + } } + + // If we get here, it may be a user declared reduction, so check + // if the symbol has UserReductionDetails, and if so, the type is + // supported. + if (const auto *reductionDetails{ + name->symbol->detailsIf()}) { + return reductionDetails->SupportsType(&type); + } + + // We also need to check for mangled names (max, min, iand, ieor and ior) + // and then check if the type is there. 
+ parser::CharBlock mangledName = MangleSpecialFunctions(name->source); + if (const auto &symbol{scope.FindSymbol(mangledName)}) { + if (const auto *reductionDetails{ + symbol->detailsIf()}) { + return reductionDetails->SupportsType(&type); + } + } + // Everything else is "not matching type". + return false; } - // TODO: user defined reduction operators. Just allow everything for now. - return true; + assert(0 && "name and name->symbol should be set here..."); + return false; }}; return common::visit( @@ -3561,7 +3599,8 @@ void OmpStructureChecker::CheckReductionObjectTypes( for (auto &[symbol, source] : symbols) { if (auto *type{symbol->GetType()}) { - if (!IsReductionAllowedForType(ident, *type)) { + const auto &scope{context_.FindScope(symbol->name())}; + if (!IsReductionAllowedForType(ident, *type, scope)) { context_.Say(source, "The type of '%s' is incompatible with the reduction operator."_err_en_US, symbol->name()); diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index 64784722ff4f8..de0991d69b61b 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -146,5 +146,9 @@ struct SymbolAndTypeMappings; void MapSubprogramToNewSymbols(const Symbol &oldSymbol, Symbol &newSymbol, Scope &newScope, SymbolAndTypeMappings * = nullptr); +parser::CharBlock MakeNameFromOperator( + const parser::DefinedOperator::IntrinsicOperator &op); +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_RESOLVE_NAMES_H_ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 74367b5229548..f048b374588ca 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1752,15 +1752,75 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, PopScope(); } +parser::CharBlock MakeNameFromOperator( + const 
parser::DefinedOperator::IntrinsicOperator &op) { + switch (op) { + case parser::DefinedOperator::IntrinsicOperator::Multiply: + return parser::CharBlock{"op.*", 4}; + case parser::DefinedOperator::IntrinsicOperator::Add: + return parser::CharBlock{"op.+", 4}; + case parser::DefinedOperator::IntrinsicOperator::Subtract: + return parser::CharBlock{"op.-", 4}; + + case parser::DefinedOperator::IntrinsicOperator::AND: + return parser::CharBlock{"op.AND", 6}; + case parser::DefinedOperator::IntrinsicOperator::OR: + return parser::CharBlock{"op.OR", 6}; + case parser::DefinedOperator::IntrinsicOperator::EQV: + return parser::CharBlock{"op.EQV", 7}; + case parser::DefinedOperator::IntrinsicOperator::NEQV: + return parser::CharBlock{"op.NEQV", 8}; + + default: + assert(0 && "Unsupported operator..."); + return parser::CharBlock{"op.?", 4}; + } +} + +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name) { + if (name == "max") { + return parser::CharBlock{"op.max", 6}; + } + if (name == "min") { + return parser::CharBlock{"op.min", 6}; + } + if (name == "iand") { + return parser::CharBlock{"op.iand", 7}; + } + if (name == "ior") { + return parser::CharBlock{"op.ior", 6}; + } + if (name == "ieor") { + return parser::CharBlock{"op.ieor", 7}; + } + // All other names: return as is. 
+ return name; +} + void OmpVisitor::ProcessReductionSpecifier( const parser::OmpReductionSpecifier &spec, const std::optional &clauses) { + const parser::Name *name{nullptr}; + parser::Name mangledName{}; + UserReductionDetails reductionDetailsTemp{}; const auto &id{std::get(spec.t)}; if (auto procDes{std::get_if(&id.u)}) { - if (auto *name{std::get_if(&procDes->u)}) { - name->symbol = - &MakeSymbol(*name, MiscDetails{MiscDetails::Kind::ConstructName}); + name = std::get_if(&procDes->u); + if (name) { + mangledName.source = MangleSpecialFunctions(name->source); } + } else { + const auto &defOp{std::get(id.u)}; + mangledName.source = MakeNameFromOperator( + std::get(defOp.u)); + name = &mangledName; + } + + UserReductionDetails *reductionDetails{&reductionDetailsTemp}; + Symbol *symbol{name ? name->symbol : nullptr}; + symbol = FindSymbol(mangledName); + if (symbol) { + reductionDetails = symbol->detailsIf(); } auto &typeList{std::get(spec.t)}; @@ -1792,6 +1852,10 @@ void OmpVisitor::ProcessReductionSpecifier( const DeclTypeSpec *typeSpec{GetDeclTypeSpec()}; assert(typeSpec && "We should have a type here"); + if (reductionDetails) { + reductionDetails->AddType(typeSpec); + } + for (auto &nm : ompVarNames) { ObjectEntityDetails details{}; details.set_type(*typeSpec); @@ -1802,6 +1866,13 @@ void OmpVisitor::ProcessReductionSpecifier( Walk(clauses); PopScope(); } + + if (name) { + if (!symbol) { + symbol = &MakeSymbol(mangledName, Attrs{}, std::move(*reductionDetails)); + } + name->symbol = symbol; + } } bool OmpVisitor::Pre(const parser::OmpDirectiveSpecification &x) { diff --git a/flang/lib/Semantics/symbol.cpp b/flang/lib/Semantics/symbol.cpp index 32eb6c2c5a188..e627dd293ba7c 100644 --- a/flang/lib/Semantics/symbol.cpp +++ b/flang/lib/Semantics/symbol.cpp @@ -246,7 +246,7 @@ void GenericDetails::CopyFrom(const GenericDetails &from) { // This is primarily for debugging. 
std::string DetailsToString(const Details &details) { return common::visit( - common::visitors{ + common::visitors{// [](const UnknownDetails &) { return "Unknown"; }, [](const MainProgramDetails &) { return "MainProgram"; }, [](const ModuleDetails &) { return "Module"; }, @@ -266,7 +266,7 @@ std::string DetailsToString(const Details &details) { [](const TypeParamDetails &) { return "TypeParam"; }, [](const MiscDetails &) { return "Misc"; }, [](const AssocEntityDetails &) { return "AssocEntity"; }, - }, + [](const UserReductionDetails &) { return "UserReductionDetails"; }}, details); } @@ -300,6 +300,9 @@ bool Symbol::CanReplaceDetails(const Details &details) const { [&](const HostAssocDetails &) { return this->has(); }, + [&](const UserReductionDetails &) { + return this->has(); + }, [](const auto &) { return false; }, }, details); @@ -598,6 +601,11 @@ llvm::raw_ostream &operator<<(llvm::raw_ostream &os, const Details &details) { [&](const MiscDetails &x) { os << ' ' << MiscDetails::EnumToString(x.kind()); }, + [&](const UserReductionDetails &x) { + for (auto &type : x.GetTypeList()) { + DumpType(os, type); + } + }, [&](const auto &x) { os << x; }, }, details); diff --git a/flang/test/Parser/OpenMP/declare-reduction-multi.f90 b/flang/test/Parser/OpenMP/declare-reduction-multi.f90 new file mode 100644 index 0000000000000..0e1adcc9958d7 --- /dev/null +++ b/flang/test/Parser/OpenMP/declare-reduction-multi.f90 @@ -0,0 +1,134 @@ +! RUN: %flang_fc1 -fdebug-unparse -fopenmp %s | FileCheck --ignore-case %s +! RUN: %flang_fc1 -fdebug-dump-parse-tree -fopenmp %s | FileCheck --check-prefix="PARSE-TREE" %s + +!! Test multiple declarations for the same type, with different operations. 
+module mymod + type :: tt + real r + end type tt +contains + function mymax(a, b) + type(tt) :: a, b, mymax + if (a%r > b%r) then + mymax = a + else + mymax = b + end if + end function mymax +end module mymod + +program omp_examples +!CHECK-LABEL: PROGRAM omp_examples + use mymod + implicit none + integer, parameter :: n = 100 + integer :: i + type(tt) :: values(n), sum, prod, big, small + + !$omp declare reduction(+:tt:omp_out%r = omp_out%r + omp_in%r) initializer(omp_priv%r = 0) +!CHECK: !$OMP DECLARE REDUCTION (+:tt: omp_out%r=omp_out%r+omp_in%r +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=0_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE-NEXT: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE-NEXT: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out%r=omp_out%r+omp_in%r' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=0._4 + !$omp declare reduction(*:tt:omp_out%r = omp_out%r * omp_in%r) initializer(omp_priv%r = 1) +!CHECK-NEXT: !$OMP DECLARE REDUCTION (*:tt: omp_out%r=omp_out%r*omp_in%r +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=1_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Multiply +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE-NEXT: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out%r=omp_out%r*omp_in%r' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=1._4' + !$omp 
declare reduction(max:tt:omp_out = mymax(omp_out, omp_in)) initializer(omp_priv%r = 0) +!CHECK-NEXT: !$OMP DECLARE REDUCTION (max:tt: omp_out=mymax(omp_out,omp_in) +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=0_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> ProcedureDesignator -> Name = 'max' +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out=mymax(omp_out,omp_in)' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=0._4' + !$omp declare reduction(min:tt:omp_out%r = min(omp_out%r, omp_in%r)) initializer(omp_priv%r = 1) +!CHECK-NEXT: !$OMP DECLARE REDUCTION (min:tt: omp_out%r=min(omp_out%r,omp_in%r) +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=1_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> ProcedureDesignator -> Name = 'min' +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out%r=min(omp_out%r,omp_in%r)' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=1._4' + call random_number(values%r) + + sum%r = 0 + !$omp parallel do reduction(+:sum) +!CHECK: !$OMP PARALLEL DO REDUCTION(+: sum) +!PARSE-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!PARSE-TREE: OmpBeginLoopDirective +!PARSE-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!PARSE-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause 
+!PARSE-TREE: Modifier -> OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'sum +!PARSE-TREE: DoConstruct + do i = 1, n + sum%r = sum%r + values(i)%r + end do + + prod%r = 1 + !$omp parallel do reduction(*:prod) +!CHECK: !$OMP PARALLEL DO REDUCTION(*: prod) +!PARSE-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!PARSE-TREE: OmpBeginLoopDirective +!PARSE-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!PARSE-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause +!PARSE-TREE: Modifier -> OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Multiply +!PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'prod' +!PARSE-TREE: DoConstruct + do i = 1, n + prod%r = prod%r * (values(i)%r+0.6) + end do + + big%r = 0 + !$omp parallel do reduction(max:big) +!CHECK: $OMP PARALLEL DO REDUCTION(max: big) +!PARSE-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!PARSE-TREE: OmpBeginLoopDirective +!PARSE-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!PARSE-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause +!PARSE-TREE: Modifier -> OmpReductionIdentifier -> ProcedureDesignator -> Name = 'max' +!PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'big' +!PARSE-TREE: DoConstruct + do i = 1, n + big = mymax(values(i), big) + end do + + small%r = 1 + !$omp parallel do reduction(min:small) +!CHECK: !$OMP PARALLEL DO REDUCTION(min: small) +!CHECK-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!CHECK-TREE: OmpBeginLoopDirective +!CHECK-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!CHECK-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause +!CHECK-TREE: Modifier -> OmpReductionIdentifier -> ProcedureDesignator 
-> Name = 'min' +!CHECK-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'small' +!CHECK-TREE: DoConstruct + do i = 1, n + small%r = min(values(i)%r, small%r) + end do + + print *, values%r + print *, "sum=", sum%r + print *, "prod=", prod%r + print *, "small=", small%r, " big=", big%r +end program omp_examples diff --git a/flang/test/Parser/OpenMP/declare-reduction-operator.f90 b/flang/test/Parser/OpenMP/declare-reduction-operator.f90 new file mode 100644 index 0000000000000..7bfb78115b10d --- /dev/null +++ b/flang/test/Parser/OpenMP/declare-reduction-operator.f90 @@ -0,0 +1,59 @@ +! RUN: %flang_fc1 -fdebug-unparse -fopenmp %s | FileCheck --ignore-case %s +! RUN: %flang_fc1 -fdebug-dump-parse-tree -fopenmp %s | FileCheck --check-prefix="PARSE-TREE" %s + +!CHECK-LABEL: SUBROUTINE reduce_1 (n, tts) +subroutine reduce_1 ( n, tts ) + type :: tt + integer :: x + integer :: y + end type tt + type :: tt2 + real(8) :: x + real(8) :: y + end type + + integer :: n + type(tt) :: tts(n) + type(tt2) :: tts2(n) + +!CHECK: !$OMP DECLARE REDUCTION (+:tt: omp_out=tt(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y) +!CHECK: ) INITIALIZER(omp_priv=tt(x=0_4,y=0_4)) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out=tt(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y)' +!PARSE-TREE: OmpInitializerClause -> AssignmentStmt = 'omp_priv=tt(x=0_4,y=0_4)' + + !$omp declare reduction(+ : tt : omp_out = tt(omp_out%x - omp_in%x , omp_out%y - omp_in%y)) initializer(omp_priv = tt(0,0)) + + +!CHECK: !$OMP DECLARE REDUCTION (+:tt2: omp_out=tt2(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y) +!CHECK: ) INITIALIZER(omp_priv=tt2(x=0._8,y=0._8) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> 
OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out=tt2(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y)' +!PARSE-TREE: OmpInitializerClause -> AssignmentStmt = 'omp_priv=tt2(x=0._8,y=0._8)' + + !$omp declare reduction(+ :tt2 : omp_out = tt2(omp_out%x - omp_in%x , omp_out%y - omp_in%y)) initializer(omp_priv = tt2(0,0)) + + type(tt) :: diffp = tt( 0, 0 ) + type(tt2) :: diffp2 = tt2( 0, 0 ) + integer :: i + + !$omp parallel do reduction(+ : diffp) + do i = 1, n + diffp%x = diffp%x + tts(i)%x + diffp%y = diffp%y + tts(i)%y + end do + + !$omp parallel do reduction(+ : diffp2) + do i = 1, n + diffp2%x = diffp2%x + tts2(i)%x + diffp2%y = diffp2%y + tts2(i)%y + end do + +end subroutine reduce_1 +!CHECK: END SUBROUTINE reduce_1 diff --git a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 new file mode 100644 index 0000000000000..924ef0807ec80 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 @@ -0,0 +1,126 @@ +! 
RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module mm + implicit none + type two + integer(4) :: a, b + end type two + + type three + integer(8) :: a, b, c + end type three + + type twothree + type(two) t2 + type(three) t3 + end type twothree + +contains +!CHECK-LABEL: Subprogram scope: inittwo + subroutine inittwo(x,n) + integer :: n + type(two) :: x + x%a=n + x%b=n + end subroutine inittwo + + subroutine initthree(x,n) + integer :: n + type(three) :: x + x%a=n + x%b=n + end subroutine initthree + + function add_two(x, y) + type(two) add_two, x, y, res + res%a = x%a + y%a + res%b = x%b + y%b + add_two = res + end function add_two + + function add_three(x, y) + type(three) add_three, x, y, res + res%a = x%a + y%a + res%b = x%b + y%b + res%c = x%c + y%c + add_three = res + end function add_three + +!CHECK-LABEL: Subprogram scope: functwo + function functwo(x, n) + type(two) functwo + integer :: n + type(two) :: x(n) + type(two) :: res + integer :: i + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) +!CHECK: adder: UserReductionDetails TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) + + + !$omp simd reduction(adder:res) + do i=1,n + res=add_two(res,x(i)) + enddo + functwo=res + end function functwo + + function functhree(x, n) + implicit none + type(three) :: functhree + type(three) :: x(n) + type(three) :: res + integer :: i + integer :: n + !$omp declare reduction(adder:three:omp_out=add_three(omp_out,omp_in)) initializer(initthree(omp_priv,1)) + + !$omp simd reduction(adder:res) + do i=1,n + res=add_three(res,x(i)) + enddo + functhree=res + end function functhree + + function functtwothree(x, n) + type(twothree) :: functtwothree 
+ type(twothree) :: x(n) + type(twothree) :: res + type(two) :: res2 + type(three) :: res3 + integer :: n + integer :: i + + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) + + !$omp declare reduction(adder:three:omp_out=add_three(omp_out,omp_in)) initializer(initthree(omp_priv,1)) + +!CHECK: adder: UserReductionDetails TYPE(two) TYPE(three) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=24 offset=0: ObjectEntity type: TYPE(three) +!CHECK: omp_orig size=24 offset=24: ObjectEntity type: TYPE(three) +!CHECK: omp_out size=24 offset=48: ObjectEntity type: TYPE(three) +!CHECK: omp_priv size=24 offset=72: ObjectEntity type: TYPE(three) + + !$omp simd reduction(adder:res3) + do i=1,n + res3=add_three(res%t3,x(i)%t3) + enddo + + !$omp simd reduction(adder:res2) + do i=1,n + res2=add_two(res2,x(i)%t2) + enddo + res%t2 = res2 + res%t3 = res3 + end function functtwothree + +end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 b/flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 new file mode 100644 index 0000000000000..f1675b6f251e0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 @@ -0,0 +1,51 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +!! Test that the name mangling for min & max (also used for iand, ieor and ior). 
+module mymod + type :: tt + real r + end type tt +contains + function mymax(a, b) + type(tt) :: a, b, mymax + if (a%r > b%r) then + mymax = a + else + mymax = b + end if + end function mymax +end module mymod + +program omp_examples +!CHECK-LABEL: MainProgram scope: omp_examples + use mymod + implicit none + integer, parameter :: n = 100 + integer :: i + type(tt) :: values(n), big, small + + !$omp declare reduction(max:tt:omp_out = mymax(omp_out, omp_in)) initializer(omp_priv%r = 0) + !$omp declare reduction(min:tt:omp_out%r = min(omp_out%r, omp_in%r)) initializer(omp_priv%r = 1) + +!CHECK: min, ELEMENTAL, INTRINSIC, PURE (Function): ProcEntity +!CHECK: mymax (Function): Use from mymax in mymod +!CHECK: op.max: UserReductionDetails TYPE(tt) +!CHECK: op.min: UserReductionDetails TYPE(tt) + + big%r = 0 + !$omp parallel do reduction(max:big) +!CHECK: big (OmpReduction): HostAssoc +!CHECK: max, INTRINSIC: ProcEntity + do i = 1, n + big = mymax(values(i), big) + end do + + small%r = 1 + !$omp parallel do reduction(min:small) +!CHECK: small (OmpReduction): HostAssoc + do i = 1, n + small%r = min(values(i)%r, small%r) + end do + + print *, "small=", small%r, " big=", big%r +end program omp_examples diff --git a/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 new file mode 100644 index 0000000000000..e7513ab3f95b1 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 @@ -0,0 +1,55 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module vector_mod + implicit none + type :: Vector + real :: x, y, z + contains + procedure :: add_vectors + generic :: operator(+) => add_vectors + end type Vector +contains + ! 
Function implementing vector addition + function add_vectors(a, b) result(res) + class(Vector), intent(in) :: a, b + type(Vector) :: res + res%x = a%x + b%x + res%y = a%y + b%y + res%z = a%z + b%z + end function add_vectors +end module vector_mod + +program test_vector +!CHECK-LABEL: MainProgram scope: test_vector + use vector_mod +!CHECK: add_vectors (Function): Use from add_vectors in vector_mod + implicit none + integer :: i + type(Vector) :: v1(100), v2(100) + + !$OMP declare reduction(+:vector:omp_out=omp_out+omp_in) initializer(omp_priv=Vector(0,0,0)) +!CHECK: op.+: UserReductionDetails TYPE(vector) +!CHECK: v1 size=1200 offset=4: ObjectEntity type: TYPE(vector) shape: 1_8:100_8 +!CHECK: v2 size=1200 offset=1204: ObjectEntity type: TYPE(vector) shape: 1_8:100_8 +!CHECK: vector: Use from vector in vector_mod + +!CHECK: OtherConstruct scope: +!CHECK: omp_in size=12 offset=0: ObjectEntity type: TYPE(vector) +!CHECK: omp_orig size=12 offset=12: ObjectEntity type: TYPE(vector) +!CHECK: omp_out size=12 offset=24: ObjectEntity type: TYPE(vector) +!CHECK: omp_priv size=12 offset=36: ObjectEntity type: TYPE(vector) + + v2 = Vector(0.0, 0.0, 0.0) + v1 = Vector(1.0, 2.0, 3.0) + !$OMP parallel do reduction(+:v2) +!CHECK: OtherConstruct scope +!CHECK: i (OmpPrivate, OmpPreDetermined): HostAssoc +!CHECK: v1: HostAssoc +!CHECK: v2 (OmpReduction): HostAssoc + + do i = 1, 100 + v2(i) = v2(i) + v1(i) ! Invokes add_vectors + end do + + print *, 'v2 components:', v2%x, v2%y, v2%z +end program test_vector diff --git a/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 new file mode 100644 index 0000000000000..14695faf844b6 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 @@ -0,0 +1,30 @@ +! 
RUN: not %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +module mm + implicit none + type two + integer(4) :: a, b + end type two + + type three + integer(8) :: a, b, c + end type three +contains + function add_two(x, y) + type(two) add_two, x, y, res + add_two = res + end function add_two + + function func(n) + type(three) :: func + type(three) :: res3 + integer :: n + integer :: i + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) + !$omp simd reduction(adder:res3) +!CHECK: error: The type of 'res3' is incompatible with the reduction operator. + do i=1,n + enddo + func = res3 + end function func +end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction.f90 b/flang/test/Semantics/OpenMP/declare-reduction.f90 index 11612f01f0f2d..ddca38fd57812 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction.f90 @@ -17,7 +17,7 @@ subroutine initme(x,n) end subroutine initme end interface !$omp declare reduction(red_add:integer(4):omp_out=omp_out+omp_in) initializer(initme(omp_priv,0)) -!CHECK: red_add: Misc ConstructName +!CHECK: red_add: UserReductionDetails !CHECK: Subprogram scope: initme !CHECK: omp_in size=4 offset=0: ObjectEntity type: INTEGER(4) !CHECK: omp_orig size=4 offset=4: ObjectEntity type: INTEGER(4) @@ -35,7 +35,7 @@ program main !$omp declare reduction (my_add_red : integer : omp_out = omp_out + omp_in) initializer (omp_priv=0) -!CHECK: my_add_red: Misc ConstructName +!CHECK: my_add_red: UserReductionDetails !CHECK: omp_in size=4 offset=0: ObjectEntity type: INTEGER(4) !CHECK: omp_orig size=4 offset=4: ObjectEntity type: INTEGER(4) !CHECK: omp_out size=4 offset=8: ObjectEntity type: INTEGER(4) >From ee4f70556e81a47c3c42b3bf781425deac5d44a0 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Wed, 26 Mar 2025 13:42:43 +0000 Subject: [PATCH 02/12] Fix review comments * Add two more tests (multiple operator-based declarations and 
re-using symbol already declared. * Add a few comments. * Fix up logical results. --- flang/include/flang/Semantics/symbol.h | 10 +-- flang/lib/Semantics/check-omp-structure.cpp | 11 +-- flang/lib/Semantics/resolve-names.cpp | 38 +++++++---- .../OpenMP/declare-reduction-dupsym.f90 | 15 ++++ .../OpenMP/declare-reduction-functions.f90 | 68 ++++++++++++++++++- .../OpenMP/declare-reduction-logical.f90 | 32 +++++++++ .../OpenMP/declare-reduction-typeerror.f90 | 4 ++ 7 files changed, 152 insertions(+), 26 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-logical.f90 diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 12867a5f8ec6f..b944912290cf7 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -701,7 +701,10 @@ class GenericDetails { }; llvm::raw_ostream &operator<<(llvm::raw_ostream &, const GenericDetails &); -class UserReductionDetails : public WithBindName { +// Used for OpenMP DECLARE REDUCTION, it holds the information +// needed to resolve which declaration (there could be multiple +// with the same name) to use for a given type. 
+class UserReductionDetails { public: using TypeVector = std::vector; UserReductionDetails() = default; @@ -710,10 +713,7 @@ class UserReductionDetails : public WithBindName { const TypeVector &GetTypeList() const { return typeList_; } bool SupportsType(const DeclTypeSpec *type) const { - for (auto t : typeList_) - if (t == type) - return true; - return false; + return llvm::is_contained(typeList_, type); } private: diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa8c830e8f2d2..1eac0bdfc05bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3518,7 +3518,10 @@ static bool IsReductionAllowedForType( case parser::DefinedOperator::IntrinsicOperator::OR: case parser::DefinedOperator::IntrinsicOperator::EQV: case parser::DefinedOperator::IntrinsicOperator::NEQV: - return isLogical(type); + if (isLogical(type)) { + return true; + } + break; // Reduction identifier is not in OMP5.2 Table 5.2 default: @@ -3535,7 +3538,7 @@ static bool IsReductionAllowedForType( } return false; } - assert(0 && "Intrinsic Operator not found - parsing gone wrong?"); + DIE("Intrinsic Operator not found - parsing gone wrong?"); return false; // Reject everything else. }}; @@ -3573,7 +3576,7 @@ static bool IsReductionAllowedForType( // We also need to check for mangled names (max, min, iand, ieor and ior) // and then check if the type is there. - parser::CharBlock mangledName = MangleSpecialFunctions(name->source); + parser::CharBlock mangledName{MangleSpecialFunctions(name->source)}; if (const auto &symbol{scope.FindSymbol(mangledName)}) { if (const auto *reductionDetails{ symbol->detailsIf()}) { @@ -3583,7 +3586,7 @@ static bool IsReductionAllowedForType( // Everything else is "not matching type". 
return false; } - assert(0 && "name and name->symbol should be set here..."); + DIE("name and name->symbol should be set here..."); return false; }}; diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f048b374588ca..fe63ff31afd3d 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1772,7 +1772,7 @@ parser::CharBlock MakeNameFromOperator( return parser::CharBlock{"op.NEQV", 8}; default: - assert(0 && "Unsupported operator..."); + DIE("Unsupported operator..."); return parser::CharBlock{"op.?", 4}; } } @@ -1801,8 +1801,8 @@ void OmpVisitor::ProcessReductionSpecifier( const parser::OmpReductionSpecifier &spec, const std::optional &clauses) { const parser::Name *name{nullptr}; - parser::Name mangledName{}; - UserReductionDetails reductionDetailsTemp{}; + parser::Name mangledName; + UserReductionDetails reductionDetailsTemp; const auto &id{std::get(spec.t)}; if (auto procDes{std::get_if(&id.u)}) { name = std::get_if(&procDes->u); @@ -1816,11 +1816,22 @@ void OmpVisitor::ProcessReductionSpecifier( name = &mangledName; } + // Use reductionDetailsTemp if we can't find the symbol (this is + // the first, or only, instance with this name). The detaiols then + // gets stored in the symbol when it's created. UserReductionDetails *reductionDetails{&reductionDetailsTemp}; - Symbol *symbol{name ? name->symbol : nullptr}; - symbol = FindSymbol(mangledName); + Symbol *symbol{FindSymbol(mangledName)}; if (symbol) { + // If we found a symbol, we append the type info to the + // existing reductionDetails. reductionDetails = symbol->detailsIf(); + + if (!reductionDetails) { + context().Say(name->source, + "Duplicate defineition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + name->source); + return; + } } auto &typeList{std::get(spec.t)}; @@ -1849,17 +1860,16 @@ void OmpVisitor::ProcessReductionSpecifier( // We need to walk t.u because Walk(t) does it's own BeginDeclTypeSpec. 
Walk(t.u); - const DeclTypeSpec *typeSpec{GetDeclTypeSpec()}; - assert(typeSpec && "We should have a type here"); - - if (reductionDetails) { + // Only process types we can find. There will be an error later on when + // a type isn't found. + if (const DeclTypeSpec * typeSpec{GetDeclTypeSpec()}) { reductionDetails->AddType(typeSpec); - } - for (auto &nm : ompVarNames) { - ObjectEntityDetails details{}; - details.set_type(*typeSpec); - MakeSymbol(nm, Attrs{}, std::move(details)); + for (auto &nm : ompVarNames) { + ObjectEntityDetails details{}; + details.set_type(*typeSpec); + MakeSymbol(nm, Attrs{}, std::move(details)); + } } EndDeclTypeSpec(); Walk(std::get>(spec.t)); diff --git a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 new file mode 100644 index 0000000000000..17f70174e1854 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 @@ -0,0 +1,15 @@ +! RUN: not %flang_fc1 -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +!! Check for duplicate symbol use. 
+subroutine dup_symbol() + type :: loc + integer :: x + integer :: y + end type loc + + integer :: my_red + +!CHECK: error: Duplicate defineition of 'my_red' in !$OMP DECLARE REDUCTION + !$omp declare reduction(my_red : loc : omp_out%x = omp_out%x + omp_in%x) initializer(omp_priv%x = 0) + +end subroutine dup_symbol diff --git a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 index 924ef0807ec80..a2435fca415cd 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 @@ -85,8 +85,8 @@ function functhree(x, n) functhree=res end function functhree - function functtwothree(x, n) - type(twothree) :: functtwothree + function functwothree(x, n) + type(twothree) :: functwothree type(twothree) :: x(n) type(twothree) :: res type(two) :: res2 @@ -121,6 +121,68 @@ function functtwothree(x, n) enddo res%t2 = res2 res%t3 = res3 - end function functtwothree + functwothree=res + end function functwothree + +!CHECK-LABEL: Subprogram scope: funcbtwo + function funcBtwo(x, n) + type(two) funcBtwo + integer :: n + type(two) :: x(n) + type(two) :: res + integer :: i + !$omp declare reduction(+:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) +!CHECK: op.+: UserReductionDetails TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) + + + !$omp simd reduction(+:res) + do i=1,n + res=add_two(res,x(i)) + enddo + funcBtwo=res + end function funcBtwo + + function funcBtwothree(x, n) + type(twothree) :: funcBtwothree + type(twothree) :: x(n) + type(twothree) :: res + type(two) :: res2 + type(three) :: res3 + integer :: n + integer :: i + + !$omp declare 
reduction(+:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) + !$omp declare reduction(+:three:omp_out=add_three(omp_out,omp_in)) initializer(initthree(omp_priv,1)) + +!CHECK: op.+: UserReductionDetails TYPE(two) TYPE(three) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=24 offset=0: ObjectEntity type: TYPE(three) +!CHECK: omp_orig size=24 offset=24: ObjectEntity type: TYPE(three) +!CHECK: omp_out size=24 offset=48: ObjectEntity type: TYPE(three) +!CHECK: omp_priv size=24 offset=72: ObjectEntity type: TYPE(three) + + !$omp simd reduction(+:res3) + do i=1,n + res3=add_three(res%t3,x(i)%t3) + enddo + + !$omp simd reduction(+:res2) + do i=1,n + res2=add_two(res2,x(i)%t2) + enddo + res%t2 = res2 + res%t3 = res3 + end function funcBtwothree + end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-logical.f90 b/flang/test/Semantics/OpenMP/declare-reduction-logical.f90 new file mode 100644 index 0000000000000..7ab7cad473ac8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-logical.f90 @@ -0,0 +1,32 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module mm + implicit none + type logicalwrapper + logical b + end type logicalwrapper + +contains +!CHECK-LABEL: Subprogram scope: func + function func(x, n) + logical func + integer :: n + type(logicalwrapper) :: x(n) + type(logicalwrapper) :: res + integer :: i + !$omp declare reduction(.AND.:type(logicalwrapper):omp_out%b=omp_out%b .AND. omp_in%b) initializer(omp_priv%b=.true.) 
+!CHECK: op.AND: UserReductionDetails TYPE(logicalwrapper) +!CHECK OtherConstruct scope +!CHECK: omp_in size=4 offset=0: ObjectEntity type: TYPE(logicalwrapper) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: TYPE(logicalwrapper) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: TYPE(logicalwrapper) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: TYPE(logicalwrapper) + + !$omp simd reduction(.AND.:res) + do i=1,n + res%b=res%b .and. x(i)%b + enddo + + func=res%b + end function func +end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 index 14695faf844b6..b8ede55aa0ed7 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 @@ -20,6 +20,10 @@ function func(n) type(three) :: res3 integer :: n integer :: i + + !$omp declare reduction(dummy:kerflunk:omp_out=omp_out+omp_in) +!CHECK: error: Derived type 'kerflunk' not found + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) !$omp simd reduction(adder:res3) !CHECK: error: The type of 'res3' is incompatible with the reduction operator. 
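Note on PATCH 02 above: the sketch below models how a single reduction name can accumulate several types, and how a reduction clause is then checked against that list. It is a minimal illustration under stated assumptions, not the flang implementation. `TypeSpec`, `ReductionDetails`, `ReductionTable`, `Declare` and `IsAllowed` are hypothetical names; only `AddType` and `SupportsType` mirror members shown in the diff, and `std::find` stands in for `llvm::is_contained`.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for flang's DeclTypeSpec; only pointer identity
// matters for the lookup modelled here.
struct TypeSpec {
  std::string name;
};

// Simplified model of UserReductionDetails: one object per reduction name,
// holding every type that a DECLARE REDUCTION with that name covers.
class ReductionDetails {
public:
  void AddType(const TypeSpec *type) { typeList_.push_back(type); }

  bool SupportsType(const TypeSpec *type) const {
    // The patch uses llvm::is_contained; std::find is the standard-library
    // equivalent used here so the sketch has no LLVM dependency.
    return std::find(typeList_.begin(), typeList_.end(), type) !=
        typeList_.end();
  }

private:
  std::vector<const TypeSpec *> typeList_;
};

// Hypothetical symbol table keyed by the (possibly mangled) reduction name,
// for example "adder" or "op.+".
class ReductionTable {
public:
  // A second declaration with the same name appends its type to the
  // existing entry instead of creating a new one.
  void Declare(const std::string &name, const TypeSpec *type) {
    table_[name].AddType(type);
  }

  // Models the check behind "The type of 'res3' is incompatible with the
  // reduction operator": the name must exist and must list the type.
  bool IsAllowed(const std::string &name, const TypeSpec *type) const {
    auto it = table_.find(name);
    return it != table_.end() && it->second.SupportsType(type);
  }

private:
  std::map<std::string, ReductionDetails> table_;
};
```

In this model, declaring `adder` for `two` only and then reducing a variable of type `three` is rejected, which corresponds to the case covered by declare-reduction-typeerror.f90. Declaring `adder` for both types accepts both, as in `functwothree`.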
>From 3f926bda65b22b18e2524f95cd5041792054c9d2 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Wed, 26 Mar 2025 17:51:25 +0000 Subject: [PATCH 03/12] Use stringswitch and spell details correctly --- flang/lib/Semantics/resolve-names.cpp | 27 +++++++++------------------ 1 file changed, 9 insertions(+), 18 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index fe63ff31afd3d..4f5dde00223bc 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,6 +38,7 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" +#include "llvm/ADT/StringSwitch.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1778,23 +1779,13 @@ parser::CharBlock MakeNameFromOperator( } parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name) { - if (name == "max") { - return parser::CharBlock{"op.max", 6}; - } - if (name == "min") { - return parser::CharBlock{"op.min", 6}; - } - if (name == "iand") { - return parser::CharBlock{"op.iand", 7}; - } - if (name == "ior") { - return parser::CharBlock{"op.ior", 6}; - } - if (name == "ieor") { - return parser::CharBlock{"op.ieor", 7}; - } - // All other names: return as is. - return name; + return llvm::StringSwitch(name.ToString()) + .Case("max", {"op.max", 6}) + .Case("min", {"op.min", 6}) + .Case("iand", {"op.iand", 7}) + .Case("ior", {"op.ior", 6}) + .Case("ieor", {"op.ieor", 7}) + .Default(name); } void OmpVisitor::ProcessReductionSpecifier( @@ -1817,7 +1808,7 @@ void OmpVisitor::ProcessReductionSpecifier( } // Use reductionDetailsTemp if we can't find the symbol (this is - // the first, or only, instance with this name). The detaiols then + // the first, or only, instance with this name). The details then // gets stored in the symbol when it's created. 
UserReductionDetails *reductionDetails{&reductionDetailsTemp}; Symbol *symbol{FindSymbol(mangledName)}; >From d0b5e5cb04a86a7980c7238787a927f881e03205 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 4 Apr 2025 16:27:07 +0100 Subject: [PATCH 04/12] Add support for user defined operators in declare reduction Also print the reduction declaration in the module file. Fix trivial typo. Add/modify tests to cover all the new things, including fixing the duplicated typo in the test... --- flang/include/flang/Semantics/semantics.h | 9 +++ flang/include/flang/Semantics/symbol.h | 10 +++ flang/lib/Parser/unparse.cpp | 7 +++ flang/lib/Semantics/mod-file.cpp | 21 +++++++ flang/lib/Semantics/mod-file.h | 1 + flang/lib/Semantics/resolve-names.cpp | 41 +++++++++--- flang/lib/Semantics/semantics.cpp | 6 ++ .../OpenMP/declare-reduction-dupsym.f90 | 2 +- .../OpenMP/declare-reduction-modfile.f90 | 63 +++++++++++++++++++ .../OpenMP/declare-reduction-operators.f90 | 29 +++++++++ 10 files changed, 180 insertions(+), 9 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 diff --git a/flang/include/flang/Semantics/semantics.h b/flang/include/flang/Semantics/semantics.h index 730513dbe3232..460af89daa0cf 100644 --- a/flang/include/flang/Semantics/semantics.h +++ b/flang/include/flang/Semantics/semantics.h @@ -290,6 +290,10 @@ class SemanticsContext { // Top-level ProgramTrees are owned by the SemanticsContext for persistence. ProgramTree &SaveProgramTree(ProgramTree &&); + // Store (and get a reference to the stored string) for mangled names + // used for OpenMP DECLARE REDUCTION. + std::string &StoreUserReductionName(const std::string &name); + private: struct ScopeIndexComparator { bool operator()(parser::CharBlock, parser::CharBlock) const; @@ -343,6 +347,11 @@ class SemanticsContext { std::map moduleFileOutputRenamings_; UnorderedSymbolSet isDefined_; std::list programTrees_; + + // storage for mangled names used in OMP DECLARE REDUCTION. 
+ // use std::list to avoid re-allocating the string when adding + // more content to the container. + std::list userReductionNames_; }; class Semantics { diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index b944912290cf7..f28a1d6b929eb 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -29,6 +29,8 @@ class raw_ostream; } namespace Fortran::parser { struct Expr; +struct OpenMPDeclareReductionConstruct; +struct OmpDirectiveSpecification; } namespace Fortran::semantics { @@ -707,6 +709,10 @@ llvm::raw_ostream &operator<<(llvm::raw_ostream &, const GenericDetails &); class UserReductionDetails { public: using TypeVector = std::vector; + using DeclInfo = std::variant; + using DeclVector = std::vector; + UserReductionDetails() = default; void AddType(const DeclTypeSpec *type) { typeList_.push_back(type); } @@ -716,8 +722,12 @@ class UserReductionDetails { return llvm::is_contained(typeList_, type); } + void AddDecl(const DeclInfo &decl) { declList_.push_back(decl); } + const DeclVector &GetDeclList() const { return declList_; } + private: TypeVector typeList_; + DeclVector declList_; }; class UnknownDetails {}; diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 47dae0ae753d2..3f6815968b76f 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -3325,4 +3325,11 @@ template void Unparse(llvm::raw_ostream &, const Program &, Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); template void Unparse(llvm::raw_ostream &, const Expr &, Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); + +template void Unparse( + llvm::raw_ostream &, const parser::OpenMPDeclareReductionConstruct &, + Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); +template void Unparse(llvm::raw_ostream &, + const parser::OmpDirectiveSpecification &, Encoding, bool, bool, + preStatementType *, 
AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index ee356e56e4458..93226beb8b5ed 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -8,6 +8,7 @@ #include "mod-file.h" #include "resolve-names.h" +#include "flang/Common/indirection.h" #include "flang/Common/restorer.h" #include "flang/Evaluate/tools.h" #include "flang/Parser/message.h" @@ -887,6 +888,7 @@ void ModFileWriter::PutEntity(llvm::raw_ostream &os, const Symbol &symbol) { [&](const ObjectEntityDetails &) { PutObjectEntity(os, symbol); }, [&](const ProcEntityDetails &) { PutProcEntity(os, symbol); }, [&](const TypeParamDetails &) { PutTypeParam(os, symbol); }, + [&](const UserReductionDetails &) { PutUserReduction(os, symbol); }, [&](const auto &) { common::die("PutEntity: unexpected details: %s", DetailsToString(symbol.details()).c_str()); @@ -1035,6 +1037,25 @@ void ModFileWriter::PutTypeParam(llvm::raw_ostream &os, const Symbol &symbol) { os << '\n'; } +void ModFileWriter::PutUserReduction( + llvm::raw_ostream &os, const Symbol &symbol) { + auto &details{symbol.get()}; + // The module content for a OpenMP Declare Reduction is the OpenMP + // declaration. There may be multiple declarations. + // Decls are pointers, so do not use a referene. 
+ for (const auto decl : details.GetDeclList()) { + if (auto d = std::get_if( + &decl)) { + Unparse(os, **d); + } else if (auto s = std::get_if( + &decl)) { + Unparse(os, **s); + } else { + DIE("Unknown OpenMP DECLARE REDUCTION content"); + } + } +} + void PutInit(llvm::raw_ostream &os, const Symbol &symbol, const MaybeExpr &init, const parser::Expr *unanalyzed) { if (IsNamedConstant(symbol) || symbol.owner().IsDerivedType()) { diff --git a/flang/lib/Semantics/mod-file.h b/flang/lib/Semantics/mod-file.h index 82538fb510873..9e5724089b3c5 100644 --- a/flang/lib/Semantics/mod-file.h +++ b/flang/lib/Semantics/mod-file.h @@ -80,6 +80,7 @@ class ModFileWriter { void PutDerivedType(const Symbol &, const Scope * = nullptr); void PutDECStructure(const Symbol &, const Scope * = nullptr); void PutTypeParam(llvm::raw_ostream &, const Symbol &); + void PutUserReduction(llvm::raw_ostream &, const Symbol &); void PutSubprogram(const Symbol &); void PutGeneric(const Symbol &); void PutUse(const Symbol &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 4f5dde00223bc..3943bdaf0c2c7 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1502,7 +1502,7 @@ class OmpVisitor : public virtual DeclarationVisitor { AddOmpSourceRange(x.source); ProcessReductionSpecifier( std::get>(x.t).value(), - std::get>(x.t)); + std::get>(x.t), x); return false; } bool Pre(const parser::OmpMapClause &); @@ -1658,8 +1658,13 @@ class OmpVisitor : public virtual DeclarationVisitor { private: void ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, const parser::OmpClauseList &clauses); + template void ProcessReductionSpecifier(const parser::OmpReductionSpecifier &spec, - const std::optional &clauses); + const std::optional &clauses, + const T &wholeConstruct); + + parser::CharBlock MangleDefinedOperator(const parser::CharBlock &name); + int metaLevel_{0}; }; @@ -1788,9 +1793,21 @@ parser::CharBlock 
MangleSpecialFunctions(const parser::CharBlock name) { .Default(name); } +parser::CharBlock OmpVisitor::MangleDefinedOperator( + const parser::CharBlock &name) { + // This function should only be used with user defined operators, that have + // the pattern + // .. + CHECK(name[0] == '.' && name[name.size() - 1] == '.'); + return parser::CharBlock{ + context().StoreUserReductionName("op" + name.ToString())}; +} + +template void OmpVisitor::ProcessReductionSpecifier( const parser::OmpReductionSpecifier &spec, - const std::optional &clauses) { + const std::optional &clauses, + const T &wholeOmpConstruct) { const parser::Name *name{nullptr}; parser::Name mangledName; UserReductionDetails reductionDetailsTemp; @@ -1800,11 +1817,17 @@ void OmpVisitor::ProcessReductionSpecifier( if (name) { mangledName.source = MangleSpecialFunctions(name->source); } + } else { const auto &defOp{std::get(id.u)}; - mangledName.source = MakeNameFromOperator( - std::get(defOp.u)); - name = &mangledName; + if (const auto definedOp{std::get_if(&defOp.u)}) { + name = &definedOp->v; + mangledName.source = MangleDefinedOperator(definedOp->v.source); + } else { + mangledName.source = MakeNameFromOperator( + std::get(defOp.u)); + name = &mangledName; + } } // Use reductionDetailsTemp if we can't find the symbol (this is @@ -1819,7 +1842,7 @@ void OmpVisitor::ProcessReductionSpecifier( if (!reductionDetails) { context().Say(name->source, - "Duplicate defineition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + "Duplicate definition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, name->source); return; } @@ -1868,6 +1891,8 @@ void OmpVisitor::ProcessReductionSpecifier( PopScope(); } + reductionDetails->AddDecl(&wholeOmpConstruct); + if (name) { if (!symbol) { symbol = &MakeSymbol(mangledName, Attrs{}, std::move(*reductionDetails)); @@ -1903,7 +1928,7 @@ bool OmpVisitor::Pre(const parser::OmpDirectiveSpecification &x) { if (maybeArgs && maybeClauses) { const parser::OmpArgument 
&first{maybeArgs->v.front()}; if (auto *spec{std::get_if(&first.u)}) { - ProcessReductionSpecifier(*spec, maybeClauses); + ProcessReductionSpecifier(*spec, maybeClauses, x); } } break; diff --git a/flang/lib/Semantics/semantics.cpp b/flang/lib/Semantics/semantics.cpp index 10a01039ea0ae..4a74d9e1dc1bd 100644 --- a/flang/lib/Semantics/semantics.cpp +++ b/flang/lib/Semantics/semantics.cpp @@ -771,4 +771,10 @@ bool SemanticsContext::IsSymbolDefined(const Symbol &symbol) const { return isDefined_.find(symbol) != isDefined_.end(); } +std::string &SemanticsContext::StoreUserReductionName(const std::string &name) { + userReductionNames_.push_back(name); + CHECK(userReductionNames_.back() == name); + return userReductionNames_.back(); +} + } // namespace Fortran::semantics diff --git a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 index 17f70174e1854..2e82cd1a18332 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 @@ -9,7 +9,7 @@ subroutine dup_symbol() integer :: my_red -!CHECK: error: Duplicate defineition of 'my_red' in !$OMP DECLARE REDUCTION +!CHECK: error: Duplicate definition of 'my_red' in !$OMP DECLARE REDUCTION !$omp declare reduction(my_red : loc : omp_out%x = omp_out%x + omp_in%x) initializer(omp_priv%x = 0) end subroutine dup_symbol diff --git a/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 new file mode 100644 index 0000000000000..caed7fd335376 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 @@ -0,0 +1,63 @@ +! RUN: %python %S/../test_modfile.py %s %flang_fc1 -fopenmp +! Check correct modfile generation for OpenMP DECLARE REDUCTION construct. 
+ +!Expect: drm.mod +!module drm +!type::t1 +!integer(4)::val +!endtype +!!$OMP DECLARE REDUCTION (*:t1:omp_out = omp_out*omp_in) INITIALIZER(omp_priv=t& +!!$OMP&1(1)) +!!$OMP DECLARE REDUCTION (.fluffy.:t1:omp_out = omp_out.fluffy.omp_in) INITIALI& +!!$OMP&ZER(omp_priv=t1(0)) +!!$OMP DECLARE REDUCTION (.mul.:t1:omp_out = omp_out.mul.omp_in) INITIALIZER(om& +!!$OMP&p_priv=t1(1)) +!interface operator(.mul.) +!procedure::mul +!end interface +!interface operator(.fluffy.) +!procedure::add +!end interface +!interface operator(*) +!procedure::mul +!end interface +!contains +!function mul(v1,v2) +!type(t1),intent(in)::v1 +!type(t1),intent(in)::v2 +!type(t1)::mul +!end +!function add(v1,v2) +!type(t1),intent(in)::v1 +!type(t1),intent(in)::v2 +!type(t1)::add +!end +!end + +module drm + type t1 + integer :: val + end type t1 + interface operator(.mul.) + procedure mul + end interface + interface operator(.fluffy.) + procedure add + end interface + interface operator(*) + module procedure mul + end interface +!$omp declare reduction(*:t1:omp_out=omp_out*omp_in) initializer(omp_priv=t1(1)) +!$omp declare reduction(.mul.:t1:omp_out=omp_out.mul.omp_in) initializer(omp_priv=t1(1)) +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) initializer(omp_priv=t1(0)) +contains + type(t1) function mul(v1, v2) + type(t1), intent (in):: v1, v2 + mul%val = v1%val * v2%val + end function + type(t1) function add(v1, v2) + type(t1), intent (in):: v1, v2 + add%val = v1%val + v2%val + end function +end module drm + diff --git a/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 index e7513ab3f95b1..73fa1a1fea2c5 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 @@ -19,6 +19,35 @@ function add_vectors(a, b) result(res) end function add_vectors end module vector_mod +!! Test user-defined operators. 
Two different varieties, using conventional and +!! unconventional names. +module m1 + interface operator(.mul.) + procedure my_mul + end interface + interface operator(.fluffy.) + procedure my_add + end interface + type t1 + integer :: val = 1 + end type +!$omp declare reduction(.mul.:t1:omp_out=omp_out.mul.omp_in) +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) +!CHECK: op.fluffy., PUBLIC: UserReductionDetails TYPE(t1) +!CHECK: op.mul., PUBLIC: UserReductionDetails TYPE(t1) +contains + function my_mul(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_mul + my_mul%val = x%val * y%val + end function + function my_add(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_add + my_add%val = x%val + y%val + end function +end module m1 + program test_vector !CHECK-LABEL: MainProgram scope: test_vector use vector_mod >From 38bdc0f7b498150eaa6d7a291456477fc94aff73 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 4 Apr 2025 18:56:14 +0100 Subject: [PATCH 05/12] Fix nit comments and add simple bad operator test --- flang/lib/Semantics/check-omp-structure.cpp | 12 +++++------- flang/lib/Semantics/resolve-names-utils.h | 3 ++- flang/lib/Semantics/resolve-names.cpp | 8 +++++--- flang/lib/Semantics/symbol.cpp | 3 +-- .../OpenMP/declare-reduction-bad-operator.f90 | 6 ++++++ 5 files changed, 19 insertions(+), 13 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 1eac0bdfc05bd..099a58124638d 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3488,7 +3488,7 @@ void OmpStructureChecker::CheckReductionObjects( static bool IsReductionAllowedForType( const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type, - const Scope &scope) { + const Scope &scope, SemanticsContext &context) { auto isLogical{[](const DeclTypeSpec &type) 
-> bool { return type.category() == DeclTypeSpec::Logical; }}; @@ -3528,7 +3528,7 @@ static bool IsReductionAllowedForType( DIE("This should have been caught in CheckIntrinsicOperator"); return false; } - parser::CharBlock name{MakeNameFromOperator(*intrinsicOp)}; + parser::CharBlock name{MakeNameFromOperator(*intrinsicOp, context)}; Symbol *symbol{scope.FindSymbol(name)}; if (symbol) { const auto *reductionDetails{symbol->detailsIf()}; @@ -3539,11 +3539,11 @@ static bool IsReductionAllowedForType( return false; } DIE("Intrinsic Operator not found - parsing gone wrong?"); - return false; // Reject everything else. }}; auto checkDesignator{[&](const parser::ProcedureDesignator &procD) { const parser::Name *name{std::get_if(&procD.u)}; + CHECK(name && name->symbol); if (name && name->symbol) { const SourceName &realName{name->symbol->GetUltimate().name()}; // OMP5.2: The type [...] of a list item that appears in a @@ -3583,10 +3583,8 @@ static bool IsReductionAllowedForType( return reductionDetails->SupportsType(&type); } } - // Everything else is "not matching type". - return false; } - DIE("name and name->symbol should be set here..."); + // Everything else is "not matching type". 
return false; }}; @@ -3603,7 +3601,7 @@ void OmpStructureChecker::CheckReductionObjectTypes( for (auto &[symbol, source] : symbols) { if (auto *type{symbol->GetType()}) { const auto &scope{context_.FindScope(symbol->name())}; - if (!IsReductionAllowedForType(ident, *type, scope)) { + if (!IsReductionAllowedForType(ident, *type, scope, context_)) { context_.Say(source, "The type of '%s' is incompatible with the reduction operator."_err_en_US, symbol->name()); diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index de0991d69b61b..ed74c8203e29a 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -147,7 +147,8 @@ void MapSubprogramToNewSymbols(const Symbol &oldSymbol, Symbol &newSymbol, Scope &newScope, SymbolAndTypeMappings * = nullptr); parser::CharBlock MakeNameFromOperator( - const parser::DefinedOperator::IntrinsicOperator &op); + const parser::DefinedOperator::IntrinsicOperator &op, + SemanticsContext &context); parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name); } // namespace Fortran::semantics diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 3943bdaf0c2c7..552a0efc6aaa5 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1759,7 +1759,8 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, } parser::CharBlock MakeNameFromOperator( - const parser::DefinedOperator::IntrinsicOperator &op) { + const parser::DefinedOperator::IntrinsicOperator &op, + SemanticsContext &context) { switch (op) { case parser::DefinedOperator::IntrinsicOperator::Multiply: return parser::CharBlock{"op.*", 4}; @@ -1778,7 +1779,7 @@ parser::CharBlock MakeNameFromOperator( return parser::CharBlock{"op.NEQV", 8}; default: - DIE("Unsupported operator..."); + context.Say("Unsupported operator in OMP DECLARE REDUCTION"_err_en_US); return 
parser::CharBlock{"op.?", 4}; } } @@ -1825,7 +1826,8 @@ void OmpVisitor::ProcessReductionSpecifier( mangledName.source = MangleDefinedOperator(definedOp->v.source); } else { mangledName.source = MakeNameFromOperator( - std::get(defOp.u)); + std::get(defOp.u), + context()); name = &mangledName; } } diff --git a/flang/lib/Semantics/symbol.cpp b/flang/lib/Semantics/symbol.cpp index e627dd293ba7c..e1e9f1705e452 100644 --- a/flang/lib/Semantics/symbol.cpp +++ b/flang/lib/Semantics/symbol.cpp @@ -246,8 +246,7 @@ void GenericDetails::CopyFrom(const GenericDetails &from) { // This is primarily for debugging. std::string DetailsToString(const Details &details) { return common::visit( - common::visitors{// - [](const UnknownDetails &) { return "Unknown"; }, + common::visitors{[](const UnknownDetails &) { return "Unknown"; }, [](const MainProgramDetails &) { return "MainProgram"; }, [](const ModuleDetails &) { return "Module"; }, [](const SubprogramDetails &) { return "Subprogram"; }, diff --git a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 new file mode 100644 index 0000000000000..3b27c6aa20f13 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 @@ -0,0 +1,6 @@ +! 
RUN: not %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +function func(n) + !$omp declare reduction(/:integer:omp_out=omp_out+omp_in) +!CHECK: error: Unsupported operator in OMP DECLARE REDUCTION +end function func >From 0afc3591885e72d59912811e0e01b158f895b928 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 4 Apr 2025 19:47:42 +0100 Subject: [PATCH 06/12] Fix error messages to be more consistent --- flang/lib/Semantics/resolve-names.cpp | 6 +++--- .../Semantics/OpenMP/declare-reduction-bad-operator.f90 | 2 +- flang/test/Semantics/OpenMP/declare-reduction-error.f90 | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 552a0efc6aaa5..f1c2a0759e2ed 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1492,7 +1492,7 @@ class OmpVisitor : public virtual DeclarationVisitor { auto *symbol{FindSymbol(NonDerivedTypeScope(), name)}; if (!symbol) { context().Say(name.source, - "Implicit subroutine declaration '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + "Implicit subroutine declaration '%s' in DECLARE REDUCTION"_err_en_US, name.source); } return true; @@ -1779,7 +1779,7 @@ parser::CharBlock MakeNameFromOperator( return parser::CharBlock{"op.NEQV", 8}; default: - context.Say("Unsupported operator in OMP DECLARE REDUCTION"_err_en_US); + context.Say("Unsupported operator in DECLARE REDUCTION"_err_en_US); return parser::CharBlock{"op.?", 4}; } } @@ -1844,7 +1844,7 @@ void OmpVisitor::ProcessReductionSpecifier( if (!reductionDetails) { context().Say(name->source, - "Duplicate definition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + "Duplicate definition of '%s' in DECLARE REDUCTION"_err_en_US, name->source); return; } diff --git a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 index 
3b27c6aa20f13..1d1d2903a2780 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 @@ -2,5 +2,5 @@ function func(n) !$omp declare reduction(/:integer:omp_out=omp_out+omp_in) -!CHECK: error: Unsupported operator in OMP DECLARE REDUCTION +!CHECK: error: Unsupported operator in DECLARE REDUCTION end function func diff --git a/flang/test/Semantics/OpenMP/declare-reduction-error.f90 b/flang/test/Semantics/OpenMP/declare-reduction-error.f90 index c22cf106ea507..21f5cc186e037 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-error.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-error.f90 @@ -7,5 +7,5 @@ end subroutine initme subroutine subr !$omp declare reduction(red_add:integer(4):omp_out=omp_out+omp_in) initializer(initme(omp_priv,0)) - !CHECK: error: Implicit subroutine declaration 'initme' in !$OMP DECLARE REDUCTION + !CHECK: error: Implicit subroutine declaration 'initme' in DECLARE REDUCTION end subroutine subr >From 1b08d9cdb6b57640e0f85361c8b60f8b77084d57 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Mon, 7 Apr 2025 10:47:44 +0100 Subject: [PATCH 07/12] add missed test change --- flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 index 2e82cd1a18332..83f8f85299dca 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 @@ -9,7 +9,7 @@ subroutine dup_symbol() integer :: my_red -!CHECK: error: Duplicate definition of 'my_red' in !$OMP DECLARE REDUCTION +!CHECK: error: Duplicate definition of 'my_red' in DECLARE REDUCTION !$omp declare reduction(my_red : loc : omp_out%x = omp_out%x + omp_in%x) initializer(omp_priv%x = 0) end subroutine dup_symbol >From 17a400b89ce08f84c76ebbb4f1b2b7c92f3ae5ae 
Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Tue, 8 Apr 2025 14:41:37 +0100 Subject: [PATCH 08/12] Improve support for metadirective + declare reduction --- flang/include/flang/Semantics/symbol.h | 4 ++-- flang/lib/Parser/unparse.cpp | 4 ++-- flang/lib/Semantics/mod-file.cpp | 2 +- flang/lib/Semantics/resolve-names.cpp | 12 +++++++++++- .../Semantics/OpenMP/declare-reduction-modfile.f90 | 4 +++- 5 files changed, 19 insertions(+), 7 deletions(-) diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index f28a1d6b929eb..5cc47b36c234f 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -30,7 +30,7 @@ class raw_ostream; namespace Fortran::parser { struct Expr; struct OpenMPDeclareReductionConstruct; -struct OmpDirectiveSpecification; +struct OmpMetadirectiveDirective; } namespace Fortran::semantics { @@ -710,7 +710,7 @@ class UserReductionDetails { public: using TypeVector = std::vector; using DeclInfo = std::variant; + const parser::OmpMetadirectiveDirective *>; using DeclVector = std::vector; UserReductionDetails() = default; diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 3f6815968b76f..bcc50a72a84b4 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -3329,7 +3329,7 @@ template void Unparse(llvm::raw_ostream &, const Expr &, Encoding, bool, template void Unparse( llvm::raw_ostream &, const parser::OpenMPDeclareReductionConstruct &, Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); -template void Unparse(llvm::raw_ostream &, - const parser::OmpDirectiveSpecification &, Encoding, bool, bool, +template void Unparse(llvm::raw_ostream &, + const parser::OmpMetadirectiveDirective &, Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index 93226beb8b5ed..c24b4a63a2aeb 
100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1047,7 +1047,7 @@ void ModFileWriter::PutUserReduction( if (auto d = std::get_if( &decl)) { Unparse(os, **d); - } else if (auto s = std::get_if( + } else if (auto s = std::get_if( &decl)) { Unparse(os, **s); } else { diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1c2a0759e2ed..9c3bd00627ff7 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1655,6 +1655,14 @@ class OmpVisitor : public virtual DeclarationVisitor { EndDeclTypeSpec(); } + bool Pre(const parser::OmpMetadirectiveDirective &x) { // + metaDirective_ = &x; + return true; + } + void Post(const parser::OmpMetadirectiveDirective &) { // + metaDirective_ = nullptr; + } + private: void ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, const parser::OmpClauseList &clauses); @@ -1666,6 +1674,7 @@ class OmpVisitor : public virtual DeclarationVisitor { parser::CharBlock MangleDefinedOperator(const parser::CharBlock &name); int metaLevel_{0}; + const parser::OmpMetadirectiveDirective *metaDirective_{nullptr}; }; bool OmpVisitor::NeedsScope(const parser::OpenMPBlockConstruct &x) { @@ -1930,7 +1939,8 @@ bool OmpVisitor::Pre(const parser::OmpDirectiveSpecification &x) { if (maybeArgs && maybeClauses) { const parser::OmpArgument &first{maybeArgs->v.front()}; if (auto *spec{std::get_if(&first.u)}) { - ProcessReductionSpecifier(*spec, maybeClauses, x); + CHECK(metaDirective_); + ProcessReductionSpecifier(*spec, maybeClauses, *metaDirective_); } } break; diff --git a/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 index caed7fd335376..f80eb1097e18a 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 @@ -1,4 +1,4 @@ -! 
RUN: %python %S/../test_modfile.py %s %flang_fc1 -fopenmp +! RUN: %python %S/../test_modfile.py %s %flang_fc1 -fopenmp -fopenmp-version=52 ! Check correct modfile generation for OpenMP DECLARE REDUCTION construct. !Expect: drm.mod @@ -8,6 +8,7 @@ !endtype !!$OMP DECLARE REDUCTION (*:t1:omp_out = omp_out*omp_in) INITIALIZER(omp_priv=t& !!$OMP&1(1)) +!!$OMP METADIRECTIVE OTHERWISE(DECLARE REDUCTION(+:INTEGER)) !!$OMP DECLARE REDUCTION (.fluffy.:t1:omp_out = omp_out.fluffy.omp_in) INITIALI& !!$OMP&ZER(omp_priv=t1(0)) !!$OMP DECLARE REDUCTION (.mul.:t1:omp_out = omp_out.mul.omp_in) INITIALIZER(om& @@ -50,6 +51,7 @@ module drm !$omp declare reduction(*:t1:omp_out=omp_out*omp_in) initializer(omp_priv=t1(1)) !$omp declare reduction(.mul.:t1:omp_out=omp_out.mul.omp_in) initializer(omp_priv=t1(1)) !$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) initializer(omp_priv=t1(0)) +!$omp metadirective otherwise(declare reduction(+: integer)) contains type(t1) function mul(v1, v2) type(t1), intent (in):: v1, v2 >From 93f817902ed042f48c5171b23d95578e87eb5b0e Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Thu, 10 Apr 2025 19:11:10 +0100 Subject: [PATCH 09/12] Fix Klausler reported review comments Also rebase, as the branch was quite a way behind. Small conflict was resolved. 
--- flang/include/flang/Semantics/symbol.h | 6 +++--- flang/lib/Semantics/check-omp-structure.cpp | 9 ++++---- flang/lib/Semantics/mod-file.cpp | 23 ++++++++++++--------- flang/lib/Semantics/resolve-names-utils.h | 2 +- flang/lib/Semantics/resolve-names.cpp | 20 +++++++----------- 5 files changed, 29 insertions(+), 31 deletions(-) diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 5cc47b36c234f..b7b29afe1ceea 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -715,11 +715,11 @@ class UserReductionDetails { UserReductionDetails() = default; - void AddType(const DeclTypeSpec *type) { typeList_.push_back(type); } + void AddType(const DeclTypeSpec &type) { typeList_.push_back(&type); } const TypeVector &GetTypeList() const { return typeList_; } - bool SupportsType(const DeclTypeSpec *type) const { - return llvm::is_contained(typeList_, type); + bool SupportsType(const DeclTypeSpec &type) const { + return llvm::is_contained(typeList_, &type); } void AddDecl(const DeclInfo &decl) { declList_.push_back(decl); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 099a58124638d..91b8a8dd57d3b 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3404,8 +3404,7 @@ bool OmpStructureChecker::CheckReductionOperator( valid = llvm::is_contained({"max", "min", "iand", "ior", "ieor"}, realName); if (!valid) { - auto *reductionDetails{name->symbol->detailsIf()}; - valid = reductionDetails != nullptr; + valid = name->symbol->detailsIf(); } } if (!valid) { @@ -3534,7 +3533,7 @@ static bool IsReductionAllowedForType( const auto *reductionDetails{symbol->detailsIf()}; assert(reductionDetails && "Expected to find reductiondetails"); - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } return false; } @@ -3571,7 +3570,7 @@ static bool 
IsReductionAllowedForType( // supported. if (const auto *reductionDetails{ name->symbol->detailsIf()}) { - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } // We also need to check for mangled names (max, min, iand, ieor and ior) @@ -3580,7 +3579,7 @@ static bool IsReductionAllowedForType( if (const auto &symbol{scope.FindSymbol(mangledName)}) { if (const auto *reductionDetails{ symbol->detailsIf()}) { - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } } } diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index c24b4a63a2aeb..10abb4db159c1 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1039,20 +1039,23 @@ void ModFileWriter::PutTypeParam(llvm::raw_ostream &os, const Symbol &symbol) { void ModFileWriter::PutUserReduction( llvm::raw_ostream &os, const Symbol &symbol) { - auto &details{symbol.get()}; + const auto &details{symbol.get()}; // The module content for a OpenMP Declare Reduction is the OpenMP // declaration. There may be multiple declarations. // Decls are pointers, so do not use a referene. 
for (const auto decl : details.GetDeclList()) { - if (auto d = std::get_if( - &decl)) { - Unparse(os, **d); - } else if (auto s = std::get_if( - &decl)) { - Unparse(os, **s); - } else { - DIE("Unknown OpenMP DECLARE REDUCTION content"); - } + common::visit( // + common::visitors{// + [&](const parser::OpenMPDeclareReductionConstruct *d) { + Unparse(os, *d); + }, + [&](const parser::OmpMetadirectiveDirective *m) { + Unparse(os, *m); + }, + [&](const auto &) { + DIE("Unknown OpenMP DECLARE REDUCTION content"); + }}, + decl); } } diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index ed74c8203e29a..809074031e2cc 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -149,7 +149,7 @@ void MapSubprogramToNewSymbols(const Symbol &oldSymbol, Symbol &newSymbol, parser::CharBlock MakeNameFromOperator( const parser::DefinedOperator::IntrinsicOperator &op, SemanticsContext &context); -parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name); +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_RESOLVE_NAMES_H_ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 9c3bd00627ff7..0416b5d410fec 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1442,11 +1442,15 @@ class OmpVisitor : public virtual DeclarationVisitor { static bool NeedsScope(const parser::OpenMPBlockConstruct &); static bool NeedsScope(const parser::OmpClause &); - bool Pre(const parser::OmpMetadirectiveDirective &) { + bool Pre(const parser::OmpMetadirectiveDirective &x) { // + metaDirective_ = &x; ++metaLevel_; return true; } - void Post(const parser::OmpMetadirectiveDirective &) { --metaLevel_; } + void Post(const parser::OmpMetadirectiveDirective &) { // + metaDirective_ = nullptr; + --metaLevel_; + } bool Pre(const 
parser::OpenMPRequiresConstruct &x) { AddOmpSourceRange(x.source); @@ -1655,14 +1659,6 @@ class OmpVisitor : public virtual DeclarationVisitor { EndDeclTypeSpec(); } - bool Pre(const parser::OmpMetadirectiveDirective &x) { // - metaDirective_ = &x; - return true; - } - void Post(const parser::OmpMetadirectiveDirective &) { // - metaDirective_ = nullptr; - } - private: void ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, const parser::OmpClauseList &clauses); @@ -1793,7 +1789,7 @@ parser::CharBlock MakeNameFromOperator( } } -parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name) { +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name) { return llvm::StringSwitch(name.ToString()) .Case("max", {"op.max", 6}) .Case("min", {"op.min", 6}) @@ -1888,7 +1884,7 @@ void OmpVisitor::ProcessReductionSpecifier( // Only process types we can find. There will be an error later on when // a type isn't found. if (const DeclTypeSpec * typeSpec{GetDeclTypeSpec()}) { - reductionDetails->AddType(typeSpec); + reductionDetails->AddType(*typeSpec); for (auto &nm : ompVarNames) { ObjectEntityDetails details{}; >From ce6ca8fd9de36c0617a06df3f31a413fe317e08a Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Tue, 29 Apr 2025 13:25:19 +0100 Subject: [PATCH 10/12] Fix some semantics issues --- flang/lib/Semantics/assignment.cpp | 10 ++++++ flang/lib/Semantics/assignment.h | 3 ++ flang/lib/Semantics/check-omp-structure.cpp | 39 +++++++++++++-------- flang/lib/Semantics/resolve-names-utils.h | 1 + flang/lib/Semantics/resolve-names.cpp | 14 +++----- 5 files changed, 43 insertions(+), 24 deletions(-) diff --git a/flang/lib/Semantics/assignment.cpp b/flang/lib/Semantics/assignment.cpp index 935f5a03bdb6a..b6d66c1c92aa0 100644 --- a/flang/lib/Semantics/assignment.cpp +++ b/flang/lib/Semantics/assignment.cpp @@ -43,6 +43,7 @@ class AssignmentContext { void Analyze(const parser::PointerAssignmentStmt &); void Analyze(const parser::ConcurrentControl &); 
int deviceConstructDepth_{0}; + SemanticsContext &context() { return context_; } private: bool CheckForPureContext(const SomeExpr &rhs, parser::CharBlock rhsSource); @@ -213,8 +214,17 @@ void AssignmentContext::PopWhereContext() { AssignmentChecker::~AssignmentChecker() {} +SemanticsContext &AssignmentChecker::context() { + return context_.value().context(); +} + AssignmentChecker::AssignmentChecker(SemanticsContext &context) : context_{new AssignmentContext{context}} {} + +void AssignmentChecker::Enter( + const parser::OpenMPDeclareReductionConstruct &x) { + context().set_location(x.source); +} void AssignmentChecker::Enter(const parser::AssignmentStmt &x) { context_.value().Analyze(x); } diff --git a/flang/lib/Semantics/assignment.h b/flang/lib/Semantics/assignment.h index a67bee4a03dfc..4a1bb92037119 100644 --- a/flang/lib/Semantics/assignment.h +++ b/flang/lib/Semantics/assignment.h @@ -37,6 +37,7 @@ class AssignmentChecker : public virtual BaseChecker { public: explicit AssignmentChecker(SemanticsContext &); ~AssignmentChecker(); + void Enter(const parser::OpenMPDeclareReductionConstruct &x); void Enter(const parser::AssignmentStmt &); void Enter(const parser::PointerAssignmentStmt &); void Enter(const parser::WhereStmt &); @@ -54,6 +55,8 @@ class AssignmentChecker : public virtual BaseChecker { void Enter(const parser::OpenACCLoopConstruct &); void Leave(const parser::OpenACCLoopConstruct &); + SemanticsContext &context(); + private: common::Indirection context_; }; diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 91b8a8dd57d3b..0490d6aae8edf 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3391,6 +3391,14 @@ bool OmpStructureChecker::CheckReductionOperator( break; } } + // User-defined operators are OK if there has been a declared reduction + // for that. 
So check if it's a defined operator, and it has + // UserReductionDetails - then it's good. + if (const auto *definedOp{std::get_if(&dOpr.u)}) { + if (definedOp->v.symbol->detailsIf()) { + return true; + } + } context_.Say(source, "Invalid reduction operator in %s clause."_err_en_US, parser::ToUpperCaseLetters(getClauseName(clauseId).str())); return false; @@ -3485,6 +3493,17 @@ void OmpStructureChecker::CheckReductionObjects( } } +static bool CheckSymbolSupportsType(const Scope &scope, + const parser::CharBlock &name, const DeclTypeSpec &type) { + if (const auto &symbol{scope.FindSymbol(name)}) { + if (const auto *reductionDetails{ + symbol->detailsIf()}) { + return reductionDetails->SupportsType(&type); + } + } + return false; +} + static bool IsReductionAllowedForType( const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type, const Scope &scope, SemanticsContext &context) { @@ -3528,14 +3547,11 @@ static bool IsReductionAllowedForType( return false; } parser::CharBlock name{MakeNameFromOperator(*intrinsicOp, context)}; - Symbol *symbol{scope.FindSymbol(name)}; - if (symbol) { - const auto *reductionDetails{symbol->detailsIf()}; - assert(reductionDetails && "Expected to find reductiondetails"); - - return reductionDetails->SupportsType(type); - } - return false; + return CheckSymbolSupportsType(scope, name, type); + } else if (const auto *definedOp{ + std::get_if(&dOpr.u)}) { + // TODO: Figure out if it's valid. + return true; } DIE("Intrinsic Operator not found - parsing gone wrong?"); }}; @@ -3576,12 +3592,7 @@ static bool IsReductionAllowedForType( // We also need to check for mangled names (max, min, iand, ieor and ior) // and then check if the type is there. 
parser::CharBlock mangledName{MangleSpecialFunctions(name->source)}; - if (const auto &symbol{scope.FindSymbol(mangledName)}) { - if (const auto *reductionDetails{ - symbol->detailsIf()}) { - return reductionDetails->SupportsType(type); - } - } + return CheckSymbolSupportsType(scope, mangledName, type); } // Everything else is "not matching type". return false; diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index 809074031e2cc..ee8113a3fda5e 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -150,6 +150,7 @@ parser::CharBlock MakeNameFromOperator( const parser::DefinedOperator::IntrinsicOperator &op, SemanticsContext &context); parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name); +std::string MangleDefinedOperator(const parser::CharBlock &name); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_RESOLVE_NAMES_H_ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 0416b5d410fec..ebbbcc75e5b67 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1667,8 +1667,6 @@ class OmpVisitor : public virtual DeclarationVisitor { const std::optional &clauses, const T &wholeConstruct); - parser::CharBlock MangleDefinedOperator(const parser::CharBlock &name); - int metaLevel_{0}; const parser::OmpMetadirectiveDirective *metaDirective_{nullptr}; }; @@ -1799,14 +1797,9 @@ parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name) { .Default(name); } -parser::CharBlock OmpVisitor::MangleDefinedOperator( - const parser::CharBlock &name) { - // This function should only be used with user defined operators, that have - // the pattern - // .. +std::string MangleDefinedOperator(const parser::CharBlock &name) { CHECK(name[0] == '.' 
&& name[name.size() - 1] == '.'); - return parser::CharBlock{ - context().StoreUserReductionName("op" + name.ToString())}; + return "op" + name.ToString(); } template @@ -1828,7 +1821,8 @@ void OmpVisitor::ProcessReductionSpecifier( const auto &defOp{std::get(id.u)}; if (const auto definedOp{std::get_if(&defOp.u)}) { name = &definedOp->v; - mangledName.source = MangleDefinedOperator(definedOp->v.source); + mangledName.source = parser::CharBlock{context().StoreUserReductionName( + MangleDefinedOperator(definedOp->v.source))}; } else { mangledName.source = MakeNameFromOperator( std::get(defOp.u), >From 4739878424472cc74daf7f58e82916e045626338 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Thu, 1 May 2025 13:39:58 +0100 Subject: [PATCH 11/12] [Flang][OpenMP] Fix review comment failed examples Add code to better handle operators in parsing and semantics. Add a function to set the scope when processing assignments; the missing scope caused a crash in "check for pure functions". Add three new tests and amend existing tests to cover a pure function.
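The operator handling in this patch can be sketched outside of Flang roughly as follows. This is only an illustration under stated assumptions: `TypeSketch`, `UserReductionSketch`, `ScopeSketch`, `MangleDefinedOperatorSketch` and `CheckSymbolSupportsTypeSketch` are hypothetical stand-ins, not Flang's `DeclTypeSpec`, `UserReductionDetails`, `Scope`, `MangleDefinedOperator` or `CheckSymbolSupportsType`. It shows the two ideas the diffs rely on: a defined operator such as `.fluffy.` is looked up under the mangled name `op.fluffy.`, and the supported types are compared by value rather than by pointer.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a declared type; compared by value.
struct TypeSketch {
  std::string category;
  int kind;
  bool operator==(const TypeSketch &other) const {
    return category == other.category && kind == other.kind;
  }
};

// Hypothetical stand-in for the details stored for a declared reduction.
struct UserReductionSketch {
  std::vector<const TypeSketch *> typeList;

  // Compare the pointed-to types, not the pointers: two distinct objects
  // may describe the same type.
  bool SupportsType(const TypeSketch &type) const {
    for (const TypeSketch *t : typeList) {
      if (*t == type) {
        return true;
      }
    }
    return false;
  }
};

// Hypothetical stand-in for a scope: mangled name -> reduction details.
using ScopeSketch = std::map<std::string, UserReductionSketch>;

// A defined operator ".name." is stored under "op.name.".
std::string MangleDefinedOperatorSketch(const std::string &name) {
  assert(name.size() >= 2 && name.front() == '.' && name.back() == '.');
  return "op" + name;
}

// True only if the name resolves to a declared reduction supporting `type`.
bool CheckSymbolSupportsTypeSketch(const ScopeSketch &scope,
                                   const std::string &mangledName,
                                   const TypeSketch &type) {
  auto it = scope.find(mangledName);
  return it != scope.end() && it->second.SupportsType(type);
}
```

In this sketch, a reduction declared for `TYPE(t1)` under `.fluffy.` would accept a separately constructed but equal `TypeSketch`, and reject an `INTEGER` variable, which is the situation the new declare-reduction-bad-operator2.f90 test exercises.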
--- flang/include/flang/Semantics/symbol.h | 9 +++- flang/lib/Semantics/check-omp-structure.cpp | 17 ++++--- .../declare-reduction-bad-operator2.f90 | 28 +++++++++++ .../OpenMP/declare-reduction-functions.f90 | 17 ++++++- .../OpenMP/declare-reduction-operator.f90 | 36 ++++++++++++++ .../OpenMP/declare-reduction-renamedop.f90 | 47 +++++++++++++++++++ 6 files changed, 145 insertions(+), 9 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-operator.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index b7b29afe1ceea..7141f5bb3feb4 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -719,7 +719,14 @@ class UserReductionDetails { const TypeVector &GetTypeList() const { return typeList_; } bool SupportsType(const DeclTypeSpec &type) const { - return llvm::is_contained(typeList_, &type); + // We have to compare the actual type, not the pointer, as some + // types are not guaranteed to be the same object. + for (auto t : typeList_) { + if (*t == type) { + return true; + } + } + return false; } void AddDecl(const DeclInfo &decl) { declList_.push_back(decl); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 0490d6aae8edf..cefe80f442727 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3392,11 +3392,14 @@ bool OmpStructureChecker::CheckReductionOperator( } } // User-defined operators are OK if there has been a declared reduction - // for that. So check if it's a defined operator, and it has - // UserReductionDetails - then it's good. + // for that. We mangle those names to store the user details. 
if (const auto *definedOp{std::get_if(&dOpr.u)}) { - if (definedOp->v.symbol->detailsIf()) { - return true; + std::string mangled = MangleDefinedOperator(definedOp->v.symbol->name()); + const Scope &scope = definedOp->v.symbol->owner(); + if (const Symbol *symbol = scope.FindSymbol(mangled)) { + if (symbol->detailsIf()) { + return true; + } } } context_.Say(source, "Invalid reduction operator in %s clause."_err_en_US, @@ -3498,7 +3501,7 @@ static bool CheckSymbolSupportsType(const Scope &scope, if (const auto &symbol{scope.FindSymbol(name)}) { if (const auto *reductionDetails{ symbol->detailsIf()}) { - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } } return false; @@ -3550,8 +3553,8 @@ static bool IsReductionAllowedForType( return CheckSymbolSupportsType(scope, name, type); } else if (const auto *definedOp{ std::get_if(&dOpr.u)}) { - // TODO: Figure out if it's valid. - return true; + return CheckSymbolSupportsType( + scope, MangleDefinedOperator(definedOp->v.symbol->name()), type); } DIE("Intrinsic Operator not found - parsing gone wrong?"); }}; diff --git a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 new file mode 100644 index 0000000000000..9ee223c1c71fe --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 @@ -0,0 +1,28 @@ +! RUN: not %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +module m1 + interface operator(.fluffy.) 
+ procedure my_mul + end interface + type t1 + integer :: val = 1 + end type +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) +contains + function my_mul(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_mul + my_mul%val = x%val * y%val + end function my_mul + + subroutine subr(a, r) + implicit none + integer, intent(in), dimension(10) :: a + integer, intent(out) :: r + integer :: i + !$omp do parallel reduction(.fluffy.:r) +!CHECK: error: The type of 'r' is incompatible with the reduction operator. + do i=1,10 + end do + end subroutine subr +end module m1 diff --git a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 index a2435fca415cd..000d323f522cf 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 @@ -166,7 +166,7 @@ function funcBtwothree(x, n) !CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) !CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) !CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) -!CHECK OtherConstruct scope +!CHECK: OtherConstruct scope !CHECK: omp_in size=24 offset=0: ObjectEntity type: TYPE(three) !CHECK: omp_orig size=24 offset=24: ObjectEntity type: TYPE(three) !CHECK: omp_out size=24 offset=48: ObjectEntity type: TYPE(three) @@ -184,5 +184,20 @@ function funcBtwothree(x, n) res%t2 = res2 res%t3 = res3 end function funcBtwothree + + !! This is checking a special case, where a reduction is declared inside a + !! 
pure function + + pure logical function reduction() +!CHECK: reduction size=4 offset=0: ObjectEntity funcResult type: LOGICAL(4) +!CHECK: rr: UserReductionDetails INTEGER(4) +!CHECK: OtherConstruct scope: size=16 alignment=4 sourceRange=0 bytes +!CHECK: omp_in size=4 offset=0: ObjectEntity type: INTEGER(4) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: INTEGER(4) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: INTEGER(4) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: INTEGER(4) + !$omp declare reduction (rr : integer : omp_out = omp_out + omp_in) initializer (omp_priv = 0) + reduction = .false. + end function reduction end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 b/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 new file mode 100644 index 0000000000000..e4ac7023f4629 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 @@ -0,0 +1,36 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module m1 + interface operator(.fluffy.) 
+!CHECK: .fluffy., PUBLIC (Function): Generic DefinedOp procs: my_mul + procedure my_mul + end interface + type t1 + integer :: val = 1 + end type +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) +!CHECK: op.fluffy., PUBLIC: UserReductionDetails TYPE(t1) +!CHECK: t1, PUBLIC: DerivedType components: val +!CHECK: OtherConstruct scope: size=16 alignment=4 sourceRange=0 bytes +!CHECK: omp_in size=4 offset=0: ObjectEntity type: TYPE(t1) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: TYPE(t1) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: TYPE(t1) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: TYPE(t1) +contains + function my_mul(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_mul + my_mul%val = x%val * y%val + end function my_mul + + subroutine subr(a, r) + implicit none + type(t1), intent(in), dimension(10) :: a + type(t1), intent(out) :: r + integer :: i + !$omp do parallel reduction(.fluffy.:r) + do i=1,10 + r = r .fluffy. a(i) + end do + end subroutine subr +end module m1 diff --git a/flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 b/flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 new file mode 100644 index 0000000000000..12e80cbf7b327 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 @@ -0,0 +1,47 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +!! Test that we can "rename" an operator when using a module's operator. +module module1 +!CHECK: Module scope: module1 size=0 + implicit none + type :: t1 + real :: value + end type t1 + interface operator(.mul.) + module procedure my_mul + end interface operator(.mul.) 
+!CHECK: .mul., PUBLIC (Function): Generic DefinedOp procs: my_mul +!CHECK: my_mul, PUBLIC (Function): Subprogram result:TYPE(t1) r (TYPE(t1) x,TYPE(t1) +!CHECK: t1, PUBLIC: DerivedType components: value +contains + function my_mul(x, y) result(r) + type(t1), intent(in) :: x, y + type(t1) :: r + r%value = x%value * y%value + end function my_mul +end module module1 + +program test_omp_reduction +!CHECK: MainProgram scope: test_omp_reduction + use module1, only: t1, operator(.modmul.) => operator(.mul.) + +!CHECK: .modmul. (Function): Use from .mul. in module1 + implicit none + + type(t1) :: result + integer :: i + !$omp declare reduction (.modmul. : t1 : omp_out = omp_out .modmul. omp_in) initializer(omp_priv = t1(1.0)) +!CHECK: op.modmul.: UserReductionDetails TYPE(t1) +!CHECK: t1: Use from t1 in module1 +!CHECK: OtherConstruct scope: size=16 alignment=4 sourceRange=0 bytes +!CHECK: omp_in size=4 offset=0: ObjectEntity type: TYPE(t1) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: TYPE(t1) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: TYPE(t1) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: TYPE(t1) + result = t1(1.0) + !$omp parallel do reduction(.modmul.:result) + do i = 1, 10 + result = result .modmul. 
t1(real(i)) + end do + !$omp end parallel do +end program test_omp_reduction >From c4e01f68e3290660e0510e7191f37ec13acaefe6 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 9 May 2025 17:15:08 +0100 Subject: [PATCH 12/12] Fix review comments --- flang/include/flang/Semantics/semantics.h | 2 +- flang/lib/Semantics/check-omp-structure.cpp | 2 +- flang/lib/Semantics/mod-file.cpp | 1 - flang/lib/Semantics/resolve-names.cpp | 2 +- 4 files changed, 3 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Semantics/semantics.h b/flang/include/flang/Semantics/semantics.h index 460af89daa0cf..3924e6db81eb8 100644 --- a/flang/include/flang/Semantics/semantics.h +++ b/flang/include/flang/Semantics/semantics.h @@ -348,7 +348,7 @@ class SemanticsContext { UnorderedSymbolSet isDefined_; std::list programTrees_; - // storage for mangled names used in OMP DECLARE REDUCTION. + // Storage for mangled names used in OMP DECLARE REDUCTION. // use std::list to avoid re-allocating the string when adding // more content to the container. 
std::list userReductionNames_; diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index cefe80f442727..50f48f6a92f6f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3498,7 +3498,7 @@ void OmpStructureChecker::CheckReductionObjects( static bool CheckSymbolSupportsType(const Scope &scope, const parser::CharBlock &name, const DeclTypeSpec &type) { - if (const auto &symbol{scope.FindSymbol(name)}) { + if (const auto *symbol{scope.FindSymbol(name)}) { if (const auto *reductionDetails{ symbol->detailsIf()}) { return reductionDetails->SupportsType(type); diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index 10abb4db159c1..1928e964ef07f 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -8,7 +8,6 @@ #include "mod-file.h" #include "resolve-names.h" -#include "flang/Common/indirection.h" #include "flang/Common/restorer.h" #include "flang/Evaluate/tools.h" #include "flang/Parser/message.h" diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index ebbbcc75e5b67..404feacb6668d 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1842,7 +1842,7 @@ void OmpVisitor::ProcessReductionSpecifier( reductionDetails = symbol->detailsIf(); if (!reductionDetails) { - context().Say(name->source, + context().Say( "Duplicate definition of '%s' in DECLARE REDUCTION"_err_en_US, name->source); return; From flang-commits at lists.llvm.org Fri May 9 10:05:55 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Fri, 09 May 2025 10:05:55 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" and clause LoopRange (PR #139293) Message-ID: https://github.com/eZWALT created https://github.com/llvm/llvm-project/pull/139293 
This pull request introduces full support for the #pragma omp fuse directive, as specified in the OpenMP 6.0 specification, along with initial support for the looprange clause in Clang. To enable this functionality, infrastructure for the Loop Sequence construct, also new in OpenMP 6.0, has been implemented. Additionally, a minimal code skeleton has been added to Flang to ensure compatibility and avoid integration issues, although a full implementation in Flang is still pending. https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-6-0.pdf P.S. As a follow-up to this loop transformation work, I'm currently preparing a patch that implements the "#pragma omp split" directive, also introduced in OpenMP 6.0. >From 5e01792a04a20dfc76097081ac1cf3da71bc97b6 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:25:33 +0000 Subject: [PATCH 1/7] Add fuse directive patch --- clang/include/clang-c/Index.h | 4 + clang/include/clang/AST/RecursiveASTVisitor.h | 3 + clang/include/clang/AST/StmtOpenMP.h | 105 +- .../clang/Basic/DiagnosticSemaKinds.td | 8 + clang/include/clang/Basic/StmtNodes.td | 1 + clang/include/clang/Sema/SemaOpenMP.h | 27 + .../include/clang/Serialization/ASTBitCodes.h | 1 + clang/lib/AST/StmtOpenMP.cpp | 25 + clang/lib/AST/StmtPrinter.cpp | 5 + clang/lib/AST/StmtProfile.cpp | 4 + clang/lib/Basic/OpenMPKinds.cpp | 2 +- clang/lib/CodeGen/CGStmt.cpp | 3 + clang/lib/CodeGen/CGStmtOpenMP.cpp | 8 + clang/lib/CodeGen/CodeGenFunction.h | 1 + clang/lib/Sema/SemaExceptionSpec.cpp | 1 + clang/lib/Sema/SemaOpenMP.cpp | 600 +++++++ clang/lib/Sema/TreeTransform.h | 11 + clang/lib/Serialization/ASTReaderStmt.cpp | 11 + clang/lib/Serialization/ASTWriterStmt.cpp | 6 + clang/lib/StaticAnalyzer/Core/ExprEngine.cpp | 1 + clang/test/OpenMP/fuse_ast_print.cpp | 278 +++ clang/test/OpenMP/fuse_codegen.cpp | 1511 +++++++++++++++++ clang/test/OpenMP/fuse_messages.cpp | 76 + clang/tools/libclang/CIndex.cpp | 7 + clang/tools/libclang/CXCursor.cpp | 3 + 
llvm/include/llvm/Frontend/OpenMP/OMP.td | 4 + .../runtime/test/transform/fuse/foreach.cpp | 192 +++ openmp/runtime/test/transform/fuse/intfor.c | 50 + .../runtime/test/transform/fuse/iterfor.cpp | 194 +++ .../fuse/parallel-wsloop-collapse-foreach.cpp | 208 +++ .../fuse/parallel-wsloop-collapse-intfor.c | 45 + 31 files changed, 3391 insertions(+), 4 deletions(-) create mode 100644 clang/test/OpenMP/fuse_ast_print.cpp create mode 100644 clang/test/OpenMP/fuse_codegen.cpp create mode 100644 clang/test/OpenMP/fuse_messages.cpp create mode 100644 openmp/runtime/test/transform/fuse/foreach.cpp create mode 100644 openmp/runtime/test/transform/fuse/intfor.c create mode 100644 openmp/runtime/test/transform/fuse/iterfor.cpp create mode 100644 openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp create mode 100644 openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c diff --git a/clang/include/clang-c/Index.h b/clang/include/clang-c/Index.h index d30d15e53802a..00046de62a742 100644 --- a/clang/include/clang-c/Index.h +++ b/clang/include/clang-c/Index.h @@ -2162,6 +2162,10 @@ enum CXCursorKind { */ CXCursor_OMPStripeDirective = 310, + /** OpenMP fuse directive + */ + CXCursor_OMPFuseDirective = 318, + /** OpenACC Compute Construct. 
*/ CXCursor_OpenACCComputeConstruct = 320, diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h index 3edc8684d0a19..e712a47f1639c 100644 --- a/clang/include/clang/AST/RecursiveASTVisitor.h +++ b/clang/include/clang/AST/RecursiveASTVisitor.h @@ -3078,6 +3078,9 @@ DEF_TRAVERSE_STMT(OMPUnrollDirective, DEF_TRAVERSE_STMT(OMPReverseDirective, { TRY_TO(TraverseOMPExecutableDirective(S)); }) +DEF_TRAVERSE_STMT(OMPFuseDirective, + { TRY_TO(TraverseOMPExecutableDirective(S)); }) + DEF_TRAVERSE_STMT(OMPInterchangeDirective, { TRY_TO(TraverseOMPExecutableDirective(S)); }) diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index 736bcabbad1f7..dc6f797e24ab8 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -962,6 +962,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Number of loops generated by this loop transformation. unsigned NumGeneratedLoops = 0; + /// Number of top level canonical loop nests generated by this loop + /// transformation + unsigned NumGeneratedLoopNests = 0; protected: explicit OMPLoopTransformationDirective(StmtClass SC, @@ -973,6 +976,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Set the number of loops generated by this loop transformation. void setNumGeneratedLoops(unsigned Num) { NumGeneratedLoops = Num; } + /// Set the number of top level canonical loop nests generated by this loop + /// transformation + void setNumGeneratedLoopNests(unsigned Num) { NumGeneratedLoopNests = Num; } public: /// Return the number of associated (consumed) loops. @@ -981,6 +987,10 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Return the number of loops generated by this loop transformation. 
unsigned getNumGeneratedLoops() const { return NumGeneratedLoops; } + /// Return the number of top level canonical loop nests generated by this loop + /// transformation + unsigned getNumGeneratedLoopNests() const { return NumGeneratedLoopNests; } + /// Get the de-sugared statements after the loop transformation. /// /// Might be nullptr if either the directive generates no loops and is handled @@ -995,7 +1005,8 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { Stmt::StmtClass C = T->getStmtClass(); return C == OMPTileDirectiveClass || C == OMPUnrollDirectiveClass || C == OMPReverseDirectiveClass || C == OMPInterchangeDirectiveClass || - C == OMPStripeDirectiveClass; + C == OMPStripeDirectiveClass || + C == OMPFuseDirectiveClass; } }; @@ -5562,6 +5573,7 @@ class OMPTileDirective final : public OMPLoopTransformationDirective { llvm::omp::OMPD_tile, StartLoc, EndLoc, NumLoops) { setNumGeneratedLoops(2 * NumLoops); + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5790,7 +5802,11 @@ class OMPReverseDirective final : public OMPLoopTransformationDirective { explicit OMPReverseDirective(SourceLocation StartLoc, SourceLocation EndLoc) : OMPLoopTransformationDirective(OMPReverseDirectiveClass, llvm::omp::OMPD_reverse, StartLoc, - EndLoc, 1) {} + EndLoc, 1) { + + setNumGeneratedLoopNests(1); + setNumGeneratedLoops(1); + } void setPreInits(Stmt *PreInits) { Data->getChildren()[PreInitsOffset] = PreInits; @@ -5857,7 +5873,8 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPInterchangeDirectiveClass, llvm::omp::OMPD_interchange, StartLoc, EndLoc, NumLoops) { - setNumGeneratedLoops(3 * NumLoops); + setNumGeneratedLoops(NumLoops); + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5908,6 +5925,88 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { } }; +/// Represents the '#pragma omp fuse' loop transformation directive 
+/// +/// \code{c} +/// #pragma omp fuse +/// { +/// for(int i = 0; i < m1; ++i) {...} +/// for(int j = 0; j < m2; ++j) {...} +/// ... +/// } +/// \endcode + +class OMPFuseDirective final : public OMPLoopTransformationDirective { + friend class ASTStmtReader; + friend class OMPExecutableDirective; + + // Offsets of child members. + enum { + PreInitsOffset = 0, + TransformedStmtOffset, + }; + + explicit OMPFuseDirective(SourceLocation StartLoc, SourceLocation EndLoc, + unsigned NumLoops) + : OMPLoopTransformationDirective(OMPFuseDirectiveClass, + llvm::omp::OMPD_fuse, StartLoc, EndLoc, + NumLoops) { + setNumGeneratedLoops(1); + // TODO: After implementing the looprange clause, change this logic + setNumGeneratedLoopNests(1); + } + + void setPreInits(Stmt *PreInits) { + Data->getChildren()[PreInitsOffset] = PreInits; + } + + void setTransformedStmt(Stmt *S) { + Data->getChildren()[TransformedStmtOffset] = S; + } + +public: + /// Create a new AST node representation for #pragma omp fuse' + /// + /// \param C Context of the AST + /// \param StartLoc Location of the introducer (e.g the 'omp' token) + /// \param EndLoc Location of the directive's end (e.g the tok::eod) + /// \param Clauses The directive's clauses + /// \param NumLoops Number of total affected loops + /// \param NumLoopNests Number of affected top level canonical loops + /// (number of items in the 'looprange' clause if present) + /// \param AssociatedStmt The outermost associated loop + /// \param TransformedStmt The loop nest after fusion, or nullptr in + /// dependent + /// \param PreInits Helper preinits statements for the loop nest + static OMPFuseDirective *Create(const ASTContext &C, SourceLocation StartLoc, + SourceLocation EndLoc, + ArrayRef Clauses, + unsigned NumLoops, unsigned NumLoopNests, + Stmt *AssociatedStmt, Stmt *TransformedStmt, + Stmt *PreInits); + + /// Build an empty '#pragma omp fuse' AST node for deserialization + /// + /// \param C Context of the AST + /// \param NumClauses 
Number of clauses to allocate + /// \param NumLoops Number of associated loops to allocate + static OMPFuseDirective *CreateEmpty(const ASTContext &C, unsigned NumClauses, + unsigned NumLoops); + + /// Gets the associated loops after the transformation. This is the de-sugared + /// replacement or nulltpr in dependent contexts. + Stmt *getTransformedStmt() const { + return Data->getChildren()[TransformedStmtOffset]; + } + + /// Return preinits statement. + Stmt *getPreInits() const { return Data->getChildren()[PreInitsOffset]; } + + static bool classof(const Stmt *T) { + return T->getStmtClass() == OMPFuseDirectiveClass; + } +}; + /// This represents '#pragma omp scan' directive. /// /// \code diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index e1b9ed0647bb9..640db20f82e0b 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -11516,6 +11516,14 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; +def warn_omp_different_loop_ind_var_types : Warning < + "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">; +def err_omp_not_canonical_loop : Error < + "loop after '#pragma omp %0' is not in canonical form">; +def err_omp_not_a_loop_sequence : Error < + "statement after '#pragma omp %0' must be a loop sequence containing canonical loops or loop-generating constructs">; +def err_omp_empty_loop_sequence : Error < + "loop sequence after '#pragma omp %0' must contain at least 1 canonical loop or loop-generating construct">; def err_omp_not_for : Error< "%select{statement after '#pragma omp %1' must be a for loop|" "expected %2 for loops after '#pragma omp %1'%select{|, but found only %4}3}0">; diff --git 
a/clang/include/clang/Basic/StmtNodes.td b/clang/include/clang/Basic/StmtNodes.td index 9526fa5808aa5..739160342062c 100644 --- a/clang/include/clang/Basic/StmtNodes.td +++ b/clang/include/clang/Basic/StmtNodes.td @@ -234,6 +234,7 @@ def OMPStripeDirective : StmtNode; def OMPUnrollDirective : StmtNode; def OMPReverseDirective : StmtNode; def OMPInterchangeDirective : StmtNode; +def OMPFuseDirective : StmtNode; def OMPForDirective : StmtNode; def OMPForSimdDirective : StmtNode; def OMPSectionsDirective : StmtNode; diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index 6498390fe96f7..8d78c2197c89d 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -457,6 +457,13 @@ class SemaOpenMP : public SemaBase { Stmt *AStmt, SourceLocation StartLoc, SourceLocation EndLoc); + + /// Called on well-formed '#pragma omp fuse' after parsing of its + /// clauses and the associated statement. + StmtResult ActOnOpenMPFuseDirective(ArrayRef Clauses, + Stmt *AStmt, SourceLocation StartLoc, + SourceLocation EndLoc); + /// Called on well-formed '\#pragma omp for' after parsing /// of the associated statement. StmtResult @@ -1480,6 +1487,26 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, Stmt *&Body, SmallVectorImpl> &OriginalInits); + /// Analyzes and checks a loop sequence for use by a loop transformation + /// + /// \param Kind The loop transformation directive kind. + /// \param NumLoops [out] Number of total canonical loops + /// \param LoopSeqSize [out] Number of top level canonical loops + /// \param LoopHelpers [out] The multiple loop analyses results. + /// \param LoopStmts [out] The multiple Stmt of each For loop. + /// \param OriginalInits [out] The multiple collection of statements and + /// declarations that must have been executed/declared + /// before entering the loop. 
+ /// \param Context + /// \return Whether there was an absence of errors or not + bool checkTransformableLoopSequence( + OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, + unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + ASTContext &Context); + /// Helper to keep information about the current `omp begin/end declare /// variant` nesting. struct OMPDeclareVariantScope { diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h index 5cb9998126a85..8fe9d8248d66f 100644 --- a/clang/include/clang/Serialization/ASTBitCodes.h +++ b/clang/include/clang/Serialization/ASTBitCodes.h @@ -1948,6 +1948,7 @@ enum StmtCode { STMT_OMP_UNROLL_DIRECTIVE, STMT_OMP_REVERSE_DIRECTIVE, STMT_OMP_INTERCHANGE_DIRECTIVE, + STMT_OMP_FUSE_DIRECTIVE, STMT_OMP_FOR_DIRECTIVE, STMT_OMP_FOR_SIMD_DIRECTIVE, STMT_OMP_SECTIONS_DIRECTIVE, diff --git a/clang/lib/AST/StmtOpenMP.cpp b/clang/lib/AST/StmtOpenMP.cpp index 4f8b50e179e30..f050e9063f1fc 100644 --- a/clang/lib/AST/StmtOpenMP.cpp +++ b/clang/lib/AST/StmtOpenMP.cpp @@ -456,6 +456,8 @@ OMPUnrollDirective::Create(const ASTContext &C, SourceLocation StartLoc, auto *Dir = createDirective( C, Clauses, AssociatedStmt, TransformedStmtOffset + 1, StartLoc, EndLoc); Dir->setNumGeneratedLoops(NumGeneratedLoops); + // The number of generated loops and loop nests during unroll matches + Dir->setNumGeneratedLoopNests(NumGeneratedLoops); Dir->setTransformedStmt(TransformedStmt); Dir->setPreInits(PreInits); return Dir; @@ -505,6 +507,29 @@ OMPInterchangeDirective::CreateEmpty(const ASTContext &C, unsigned NumClauses, SourceLocation(), SourceLocation(), NumLoops); } +OMPFuseDirective *OMPFuseDirective::Create( + const ASTContext &C, SourceLocation StartLoc, SourceLocation EndLoc, + ArrayRef Clauses, unsigned NumLoops, unsigned NumLoopNests, + Stmt *AssociatedStmt, Stmt *TransformedStmt, Stmt *PreInits) { + + OMPFuseDirective *Dir 
= createDirective( + C, Clauses, AssociatedStmt, TransformedStmtOffset + 1, StartLoc, EndLoc, + NumLoops); + Dir->setTransformedStmt(TransformedStmt); + Dir->setPreInits(PreInits); + Dir->setNumGeneratedLoopNests(NumLoopNests); + Dir->setNumGeneratedLoops(NumLoops); + return Dir; +} + +OMPFuseDirective *OMPFuseDirective::CreateEmpty(const ASTContext &C, + unsigned NumClauses, + unsigned NumLoops) { + return createEmptyDirective( + C, NumClauses, /*HasAssociatedStmt=*/true, TransformedStmtOffset + 1, + SourceLocation(), SourceLocation(), NumLoops); +} + OMPForSimdDirective * OMPForSimdDirective::Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation EndLoc, unsigned CollapsedNum, diff --git a/clang/lib/AST/StmtPrinter.cpp b/clang/lib/AST/StmtPrinter.cpp index c6c49c6c1ba4d..ec0becea8f55c 100644 --- a/clang/lib/AST/StmtPrinter.cpp +++ b/clang/lib/AST/StmtPrinter.cpp @@ -789,6 +789,11 @@ void StmtPrinter::VisitOMPInterchangeDirective(OMPInterchangeDirective *Node) { PrintOMPExecutableDirective(Node); } +void StmtPrinter::VisitOMPFuseDirective(OMPFuseDirective *Node) { + Indent() << "#pragma omp fuse"; + PrintOMPExecutableDirective(Node); +} + void StmtPrinter::VisitOMPForDirective(OMPForDirective *Node) { Indent() << "#pragma omp for"; PrintOMPExecutableDirective(Node); diff --git a/clang/lib/AST/StmtProfile.cpp b/clang/lib/AST/StmtProfile.cpp index 83d54da9be7e5..933ad19b7a8ef 100644 --- a/clang/lib/AST/StmtProfile.cpp +++ b/clang/lib/AST/StmtProfile.cpp @@ -1026,6 +1026,10 @@ void StmtProfiler::VisitOMPInterchangeDirective( VisitOMPLoopTransformationDirective(S); } +void StmtProfiler::VisitOMPFuseDirective(const OMPFuseDirective *S) { + VisitOMPLoopTransformationDirective(S); +} + void StmtProfiler::VisitOMPForDirective(const OMPForDirective *S) { VisitOMPLoopDirective(S); } diff --git a/clang/lib/Basic/OpenMPKinds.cpp b/clang/lib/Basic/OpenMPKinds.cpp index 7b90861c78de0..e18867e3c0281 100644 --- a/clang/lib/Basic/OpenMPKinds.cpp +++ 
b/clang/lib/Basic/OpenMPKinds.cpp @@ -702,7 +702,7 @@ bool clang::isOpenMPLoopBoundSharingDirective(OpenMPDirectiveKind Kind) { bool clang::isOpenMPLoopTransformationDirective(OpenMPDirectiveKind DKind) { return DKind == OMPD_tile || DKind == OMPD_unroll || DKind == OMPD_reverse || - DKind == OMPD_interchange || DKind == OMPD_stripe; + DKind == OMPD_interchange || DKind == OMPD_stripe || DKind == OMPD_fuse; } bool clang::isOpenMPCombinedParallelADirective(OpenMPDirectiveKind DKind) { diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp index 3562b4ea22a24..4a2dc1a537d46 100644 --- a/clang/lib/CodeGen/CGStmt.cpp +++ b/clang/lib/CodeGen/CGStmt.cpp @@ -233,6 +233,9 @@ void CodeGenFunction::EmitStmt(const Stmt *S, ArrayRef Attrs) { case Stmt::OMPInterchangeDirectiveClass: EmitOMPInterchangeDirective(cast(*S)); break; + case Stmt::OMPFuseDirectiveClass: + EmitOMPFuseDirective(cast(*S)); + break; case Stmt::OMPForDirectiveClass: EmitOMPForDirective(cast(*S)); break; diff --git a/clang/lib/CodeGen/CGStmtOpenMP.cpp b/clang/lib/CodeGen/CGStmtOpenMP.cpp index 803c7ed37635e..0c664b0f89044 100644 --- a/clang/lib/CodeGen/CGStmtOpenMP.cpp +++ b/clang/lib/CodeGen/CGStmtOpenMP.cpp @@ -197,6 +197,8 @@ class OMPLoopScope : public CodeGenFunction::RunCleanupsScope { } else if (const auto *Interchange = dyn_cast(&S)) { PreInits = Interchange->getPreInits(); + } else if (const auto *Fuse = dyn_cast(&S)) { + PreInits = Fuse->getPreInits(); } else { llvm_unreachable("Unknown loop-based directive kind."); } @@ -2918,6 +2920,12 @@ void CodeGenFunction::EmitOMPInterchangeDirective( EmitStmt(S.getTransformedStmt()); } +void CodeGenFunction::EmitOMPFuseDirective(const OMPFuseDirective &S) { + // Emit the de-sugared statement + OMPTransformDirectiveScopeRAII FuseScope(*this, &S); + EmitStmt(S.getTransformedStmt()); +} + void CodeGenFunction::EmitOMPUnrollDirective(const OMPUnrollDirective &S) { bool UseOMPIRBuilder = CGM.getLangOpts().OpenMPIRBuilder; diff --git 
a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h index c0bc3825f0188..59cb4d9caa98d 100644 --- a/clang/lib/CodeGen/CodeGenFunction.h +++ b/clang/lib/CodeGen/CodeGenFunction.h @@ -3871,6 +3871,7 @@ class CodeGenFunction : public CodeGenTypeCache { void EmitOMPUnrollDirective(const OMPUnrollDirective &S); void EmitOMPReverseDirective(const OMPReverseDirective &S); void EmitOMPInterchangeDirective(const OMPInterchangeDirective &S); + void EmitOMPFuseDirective(const OMPFuseDirective &S); void EmitOMPForDirective(const OMPForDirective &S); void EmitOMPForSimdDirective(const OMPForSimdDirective &S); void EmitOMPScopeDirective(const OMPScopeDirective &S); diff --git a/clang/lib/Sema/SemaExceptionSpec.cpp b/clang/lib/Sema/SemaExceptionSpec.cpp index aaa2bb22565e4..f6ff77937f54b 100644 --- a/clang/lib/Sema/SemaExceptionSpec.cpp +++ b/clang/lib/Sema/SemaExceptionSpec.cpp @@ -1492,6 +1492,7 @@ CanThrowResult Sema::canThrow(const Stmt *S) { case Stmt::OMPUnrollDirectiveClass: case Stmt::OMPReverseDirectiveClass: case Stmt::OMPInterchangeDirectiveClass: + case Stmt::OMPFuseDirectiveClass: case Stmt::OMPSingleDirectiveClass: case Stmt::OMPTargetDataDirectiveClass: case Stmt::OMPTargetDirectiveClass: diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index 835dba22a858d..c9885518217f3 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -4398,6 +4398,7 @@ void SemaOpenMP::ActOnOpenMPRegionStart(OpenMPDirectiveKind DKind, case OMPD_unroll: case OMPD_reverse: case OMPD_interchange: + case OMPD_fuse: case OMPD_assume: break; default: @@ -6209,6 +6210,10 @@ StmtResult SemaOpenMP::ActOnOpenMPExecutableDirective( Res = ActOnOpenMPInterchangeDirective(ClausesWithImplicit, AStmt, StartLoc, EndLoc); break; + case OMPD_fuse: + Res = + ActOnOpenMPFuseDirective(ClausesWithImplicit, AStmt, StartLoc, EndLoc); + break; case OMPD_for: Res = ActOnOpenMPForDirective(ClausesWithImplicit, AStmt, StartLoc, EndLoc, 
VarsWithInheritedDSA); @@ -14161,6 +14166,8 @@ bool SemaOpenMP::checkTransformableLoopNest( DependentPreInits = Dir->getPreInits(); else if (auto *Dir = dyn_cast(Transform)) DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); else llvm_unreachable("Unhandled loop transformation"); @@ -14171,6 +14178,265 @@ bool SemaOpenMP::checkTransformableLoopNest( return Result; } +class NestedLoopCounterVisitor + : public clang::RecursiveASTVisitor { +public: + explicit NestedLoopCounterVisitor() : NestedLoopCount(0) {} + + bool VisitForStmt(clang::ForStmt *FS) { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(clang::CXXForRangeStmt *FRS) { + ++NestedLoopCount; + return true; + } + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + +private: + unsigned NestedLoopCount; +}; + +bool SemaOpenMP::checkTransformableLoopSequence( + OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, + unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + ASTContext &Context) { + + // Checks whether the given statement is a compound statement + VarsWithInheritedDSAType TmpDSA; + if (!isa(AStmt)) { + Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) + << getOpenMPDirectiveName(Kind); + return false; + } + // Callback for updating pre-inits in case there are even more + // loop-sequence-generating-constructs inside of the main compound stmt + auto OnTransformationCallback = + [&OriginalInits](OMPLoopBasedDirective *Transform) { + Stmt *DependentPreInits; + if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir 
= dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else + llvm_unreachable("Unhandled loop transformation"); + + appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + }; + + // Number of top level canonical loop nests observed (And acts as index) + LoopSeqSize = 0; + // Number of total observed loops + NumLoops = 0; + + // Following OpenMP 6.0 API Specification, a Canonical Loop Sequence follows + // the grammar: + // + // canonical-loop-sequence: + // { + // loop-sequence+ + // } + // where loop-sequence can be any of the following: + // 1. canonical-loop-sequence + // 2. loop-nest + // 3. loop-sequence-generating-construct (i.e OMPLoopTransformationDirective) + // + // To recognise and traverse this structure the following helper functions + // have been defined. handleLoopSequence serves as the recurisve entry point + // and tries to match the input AST to the canonical loop sequence grammar + // structure + + auto NLCV = NestedLoopCounterVisitor(); + // Helper functions to validate canonical loop sequence grammar is valid + auto isLoopSequenceDerivation = [](auto *Child) { + return isa(Child) || isa(Child) || + isa(Child); + }; + auto isLoopGeneratingStmt = [](auto *Child) { + return isa(Child); + }; + + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + QualType BaseInductionVarType; + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { + if (auto *For = dyn_cast(LoopStmt)) { + OriginalInits.back().push_back(For->getInit()); + ForStmts.push_back(For); + // Extract induction variable + if (auto *InitStmt = dyn_cast_or_null(For->getInit())) { + if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { + QualType InductionVarType = InitDecl->getType().getCanonicalType(); + + // Compare with first loop type + if 
(BaseInductionVarType.isNull()) { + BaseInductionVarType = InductionVarType; + } else if (!Context.hasSameType(BaseInductionVarType, + InductionVarType)) { + Diag(InitDecl->getBeginLoc(), + diag::warn_omp_different_loop_ind_var_types) + << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType + << InductionVarType; + } + } + } + + } else { + assert(isa(LoopStmt) && + "Expected canonical for or range-based for loops."); + auto *CXXFor = dyn_cast(LoopStmt); + OriginalInits.back().push_back(CXXFor->getBeginStmt()); + ForStmts.push_back(CXXFor); + } + }; + // Helper lambda functions to encapsulate the processing of different + // derivations of the canonical loop sequence grammar + // + // Modularized code for handling loop generation and transformations + auto handleLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, &OnTransformationCallback, + this](Stmt *Child) { + auto LoopTransform = dyn_cast(Child); + Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); + unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); + + // Handle the case where transformed statement is not available due to + // dependent contexts + if (!TransformedStmt) { + if (NumGeneratedLoopNests > 0) + return true; + // Unroll full + else { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + } + // Handle loop transformations with multiple loop nests + // Unroll full + if (NumGeneratedLoopNests <= 0) { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + // Future loop transformations that generate multiple canonical loops + } else if (NumGeneratedLoopNests > 1) { + llvm_unreachable("Multiple canonical loop generating transformations " + "like loop splitting are not yet supported"); + } + + // Process the transformed loop statement + Child = TransformedStmt; + 
OriginalInits.emplace_back(); + LoopHelpers.emplace_back(); + OnTransformationCallback(LoopTransform); + + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, + TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(Child->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + storeLoopStatements(TransformedStmt); + NumLoops += LoopTransform->getNumGeneratedLoops(); + return true; + }; + + // Modularized code for handling regular canonical loops + auto handleRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, + &LoopSeqSize, &NumLoops, Kind, &TmpDSA, &NLCV, + this](Stmt *Child) { + OriginalInits.emplace_back(); + LoopHelpers.emplace_back(); + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, + TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(Child->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + storeLoopStatements(Child); + NumLoops += NLCV.TraverseStmt(Child); + return true; + }; + + // Helper function to process a Loop Sequence Recursively + auto handleLoopSequence = [&](Stmt *LoopSeqStmt, + auto &handleLoopSequenceCallback) -> bool { + for (auto *Child : LoopSeqStmt->children()) { + if (!Child) + continue; + + // Skip over non-loop-sequence statements + if (!isLoopSequenceDerivation(Child)) { + Child = Child->IgnoreContainers(); + + // Ignore empty compound statement + if (!Child) + continue; + + // In the case of a nested loop sequence ignoring containers would not + // be enough, a recurisve transversal of the loop sequence is required + if (isa(Child)) { + if (!handleLoopSequenceCallback(Child, handleLoopSequenceCallback)) + return false; + // Already been treated, skip this children + continue; + } + } + // Regular loop sequence handling + if (isLoopSequenceDerivation(Child)) { + if (isLoopGeneratingStmt(Child)) { + if 
(!handleLoopGeneration(Child)) { + return false; + } + } else { + if (!handleRegularLoop(Child)) { + return false; + } + } + ++LoopSeqSize; + } else { + // Report error for invalid statement inside canonical loop sequence + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + } + return true; + }; + + // Recursive entry point to process the main loop sequence + if (!handleLoopSequence(AStmt, handleLoopSequence)) { + return false; + } + + if (LoopSeqSize <= 0) { + Diag(AStmt->getBeginLoc(), diag::err_omp_empty_loop_sequence) + << getOpenMPDirectiveName(Kind); + return false; + } + return true; +} + /// Add preinit statements that need to be propageted from the selected loop. static void addLoopPreInits(ASTContext &Context, OMPLoopBasedDirective::HelperExprs &LoopHelper, @@ -15416,6 +15682,340 @@ StmtResult SemaOpenMP::ActOnOpenMPInterchangeDirective( buildPreInits(Context, PreInits)); } +StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, + Stmt *AStmt, + SourceLocation StartLoc, + SourceLocation EndLoc) { + ASTContext &Context = getASTContext(); + DeclContext *CurrContext = SemaRef.CurContext; + Scope *CurScope = SemaRef.getCurScope(); + CaptureVars CopyTransformer(SemaRef); + + // Ensure the structured block is not empty + if (!AStmt) { + return StmtError(); + } + // Validate that the potential loop sequence is transformable for fusion + // Also collect the HelperExprs, Loop Stmts, Inits, and Number of loops + SmallVector LoopHelpers; + SmallVector LoopStmts; + SmallVector> OriginalInits; + + unsigned NumLoops; + // TODO: Support looprange clause using LoopSeqSize + unsigned LoopSeqSize; + if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, + LoopHelpers, LoopStmts, OriginalInits, + Context)) { + return StmtError(); + } + + // Defer transformation in dependent contexts + if (CurrContext->isDependentContext()) { + return OMPFuseDirective::Create(Context, StartLoc, 
EndLoc, Clauses, + NumLoops, 1, AStmt, nullptr, nullptr); + } + assert(LoopHelpers.size() == LoopSeqSize && + "Expecting loop iteration space dimensionality to match number of " + "affected loops"); + assert(OriginalInits.size() == LoopSeqSize && + "Expecting loop iteration space dimensionality to match number of " + "affected loops"); + + // PreInits hold a sequence of variable declarations that must be executed + // before the fused loop begins. These include bounds, strides, and other + // helper variables required for the transformation. + SmallVector PreInits; + + // Select the type with the largest bit width among all induction variables + QualType IVType = LoopHelpers[0].IterationVarRef->getType(); + for (unsigned int I = 1; I < LoopSeqSize; ++I) { + QualType CurrentIVType = LoopHelpers[I].IterationVarRef->getType(); + if (Context.getTypeSize(CurrentIVType) > Context.getTypeSize(IVType)) { + IVType = CurrentIVType; + } + } + uint64_t IVBitWidth = Context.getIntWidth(IVType); + + // Create pre-init declarations for all loops lower bounds, upper bounds, + // strides and num-iterations + SmallVector LBVarDecls; + SmallVector STVarDecls; + SmallVector NIVarDecls; + SmallVector UBVarDecls; + SmallVector IVVarDecls; + + // Helper lambda to create variables for bounds, strides, and other + // expressions. Generates both the variable declaration and the corresponding + // initialization statement. 
+ auto CreateHelperVarAndStmt = + [&SemaRef = this->SemaRef, &Context, &CopyTransformer, + &IVType](Expr *ExprToCopy, const std::string &BaseName, unsigned I, + bool NeedsNewVD = false) { + Expr *TransformedExpr = + AssertSuccess(CopyTransformer.TransformExpr(ExprToCopy)); + if (!TransformedExpr) + return std::pair(nullptr, StmtError()); + + auto Name = (Twine(".omp.") + BaseName + std::to_string(I)).str(); + + VarDecl *VD; + if (NeedsNewVD) { + VD = buildVarDecl(SemaRef, SourceLocation(), IVType, Name); + SemaRef.AddInitializerToDecl(VD, TransformedExpr, false); + + } else { + // Create a unique variable name + DeclRefExpr *DRE = cast(TransformedExpr); + VD = cast(DRE->getDecl()); + VD->setDeclName(&SemaRef.PP.getIdentifierTable().get(Name)); + } + // Create the corresponding declaration statement + StmtResult DeclStmt = new (Context) class DeclStmt( + DeclGroupRef(VD), SourceLocation(), SourceLocation()); + return std::make_pair(VD, DeclStmt); + }; + + // Process each single loop to generate and collect declarations + // and statements for all helper expressions + for (unsigned int I = 0; I < LoopSeqSize; ++I) { + addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], + PreInits); + + auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", I); + auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", I); + auto [STVD, STDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", I); + auto [NIVD, NIDStmt] = + CreateHelperVarAndStmt(LoopHelpers[I].NumIterations, "ni", I, true); + auto [IVVD, IVDStmt] = + CreateHelperVarAndStmt(LoopHelpers[I].IterationVarRef, "iv", I); + + if (!LBVD || !STVD || !NIVD || !IVVD) + return StmtError(); + + UBVarDecls.push_back(UBVD); + LBVarDecls.push_back(LBVD); + STVarDecls.push_back(STVD); + NIVarDecls.push_back(NIVD); + IVVarDecls.push_back(IVVD); + + PreInits.push_back(UBDStmt.get()); + PreInits.push_back(LBDStmt.get()); + PreInits.push_back(STDStmt.get()); + 
PreInits.push_back(NIDStmt.get()); + PreInits.push_back(IVDStmt.get()); + } + + auto MakeVarDeclRef = [&SemaRef = this->SemaRef](VarDecl *VD) { + return buildDeclRefExpr(SemaRef, VD, VD->getType(), VD->getLocation(), + false); + }; + + // Following up the creation of the final fused loop will be performed + // which has the following shape (considering the selected loops): + // + // for (fuse.index = 0; fuse.index < max(ni0, ni1..., nik); ++fuse.index) { + // if (fuse.index < ni0){ + // iv0 = lb0 + st0 * fuse.index; + // original.index0 = iv0 + // body(0); + // } + // if (fuse.index < ni1){ + // iv1 = lb1 + st1 * fuse.index; + // original.index1 = iv1 + // body(1); + // } + // + // ... + // + // if (fuse.index < nik){ + // ivk = lbk + stk * fuse.index; + // original.indexk = ivk + // body(k); Expr *InitVal = IntegerLiteral::Create(Context, + // llvm::APInt(IVWidth, 0), + + // } + + // 1. Create the initialized fuse index + const std::string IndexName = Twine(".omp.fuse.index").str(); + Expr *InitVal = IntegerLiteral::Create(Context, llvm::APInt(IVBitWidth, 0), + IVType, SourceLocation()); + VarDecl *IndexDecl = + buildVarDecl(SemaRef, {}, IVType, IndexName, nullptr, nullptr); + SemaRef.AddInitializerToDecl(IndexDecl, InitVal, false); + StmtResult InitStmt = new (Context) + DeclStmt(DeclGroupRef(IndexDecl), SourceLocation(), SourceLocation()); + + if (!InitStmt.isUsable()) + return StmtError(); + + auto MakeIVRef = [&SemaRef = this->SemaRef, IndexDecl, IVType, + Loc = InitVal->getExprLoc()]() { + return buildDeclRefExpr(SemaRef, IndexDecl, IVType, Loc, false); + }; + + // 2. Iteratively compute the max number of logical iterations Max(NI_1, NI_2, + // ..., NI_k) + // + // This loop accumulates the maximum value across multiple expressions, + // ensuring each step constructs a unique AST node for correctness. 
By using + // intermediate temporary variables and conditional operators, we maintain + // distinct nodes and avoid duplicating subtrees, For instance, max(a,b,c): + // omp.temp0 = max(a, b) + // omp.temp1 = max(omp.temp0, c) + // omp.fuse.max = max(omp.temp1, omp.temp0) + + ExprResult MaxExpr; + for (unsigned I = 0; I < LoopSeqSize; ++I) { + DeclRefExpr *NIRef = MakeVarDeclRef(NIVarDecls[I]); + QualType NITy = NIRef->getType(); + + if (MaxExpr.isUnset()) { + // Initialize MaxExpr with the first NI expression + MaxExpr = NIRef; + } else { + // Create a new acummulator variable t_i = MaxExpr + std::string TempName = (Twine(".omp.temp.") + Twine(I)).str(); + VarDecl *TempDecl = + buildVarDecl(SemaRef, {}, NITy, TempName, nullptr, nullptr); + TempDecl->setInit(MaxExpr.get()); + DeclRefExpr *TempRef = + buildDeclRefExpr(SemaRef, TempDecl, NITy, SourceLocation(), false); + DeclRefExpr *TempRef2 = + buildDeclRefExpr(SemaRef, TempDecl, NITy, SourceLocation(), false); + // Add a DeclStmt to PreInits to ensure the variable is declared. + StmtResult TempStmt = new (Context) + DeclStmt(DeclGroupRef(TempDecl), SourceLocation(), SourceLocation()); + + if (!TempStmt.isUsable()) + return StmtError(); + PreInits.push_back(TempStmt.get()); + + // Build MaxExpr <-(MaxExpr > NIRef ? MaxExpr : NIRef) + ExprResult Comparison = + SemaRef.BuildBinOp(nullptr, SourceLocation(), BO_GT, TempRef, NIRef); + // Handle any errors in Comparison creation + if (!Comparison.isUsable()) + return StmtError(); + + DeclRefExpr *NIRef2 = MakeVarDeclRef(NIVarDecls[I]); + // Update MaxExpr using a conditional expression to hold the max value + MaxExpr = new (Context) ConditionalOperator( + Comparison.get(), SourceLocation(), TempRef2, SourceLocation(), + NIRef2->getExprStmt(), NITy, VK_LValue, OK_Ordinary); + + if (!MaxExpr.isUsable()) + return StmtError(); + } + } + if (!MaxExpr.isUsable()) + return StmtError(); + + // 3. 
Declare the max variable + const std::string MaxName = Twine(".omp.fuse.max").str(); + VarDecl *MaxDecl = + buildVarDecl(SemaRef, {}, IVType, MaxName, nullptr, nullptr); + MaxDecl->setInit(MaxExpr.get()); + DeclRefExpr *MaxRef = buildDeclRefExpr(SemaRef, MaxDecl, IVType, {}, false); + StmtResult MaxStmt = new (Context) + DeclStmt(DeclGroupRef(MaxDecl), SourceLocation(), SourceLocation()); + + if (MaxStmt.isInvalid()) + return StmtError(); + PreInits.push_back(MaxStmt.get()); + + // 4. Create condition Expr: index < n_max + ExprResult CondExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_LT, + MakeIVRef(), MaxRef); + if (!CondExpr.isUsable()) + return StmtError(); + // 5. Increment Expr: ++index + ExprResult IncrExpr = + SemaRef.BuildUnaryOp(CurScope, SourceLocation(), UO_PreInc, MakeIVRef()); + if (!IncrExpr.isUsable()) + return StmtError(); + + // 6. Build the Fused Loop Body + // The final fused loop iterates over the maximum logical range. Inside the + // loop, each original loop's index is calculated dynamically, and its body + // is executed conditionally. 
+ // + // Each sub-loop's body is guarded by a conditional statement to ensure + // it executes only within its logical iteration range: + // + // if (fuse.index < ni_k){ + // iv_k = lb_k + st_k * fuse.index; + // original.index = iv_k + // body(k); + // } + + CompoundStmt *FusedBody = nullptr; + SmallVector FusedBodyStmts; + for (unsigned I = 0; I < LoopSeqSize; ++I) { + + // Assingment of the original sub-loop index to compute the logical index + // IV_k = LB_k + omp.fuse.index * ST_k + + ExprResult IdxExpr = + SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Mul, + MakeVarDeclRef(STVarDecls[I]), MakeIVRef()); + if (!IdxExpr.isUsable()) + return StmtError(); + IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Add, + MakeVarDeclRef(LBVarDecls[I]), IdxExpr.get()); + + if (!IdxExpr.isUsable()) + return StmtError(); + IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Assign, + MakeVarDeclRef(IVVarDecls[I]), IdxExpr.get()); + if (!IdxExpr.isUsable()) + return StmtError(); + + // Update the original i_k = IV_k + SmallVector BodyStmts; + BodyStmts.push_back(IdxExpr.get()); + llvm::append_range(BodyStmts, LoopHelpers[I].Updates); + + if (auto *SourceCXXFor = dyn_cast(LoopStmts[I])) + BodyStmts.push_back(SourceCXXFor->getLoopVarStmt()); + + Stmt *Body = (isa(LoopStmts[I])) + ? 
cast(LoopStmts[I])->getBody() + : cast(LoopStmts[I])->getBody(); + + BodyStmts.push_back(Body); + + CompoundStmt *CombinedBody = + CompoundStmt::Create(Context, BodyStmts, FPOptionsOverride(), + SourceLocation(), SourceLocation()); + ExprResult Condition = + SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_LT, MakeIVRef(), + MakeVarDeclRef(NIVarDecls[I])); + + if (!Condition.isUsable()) + return StmtError(); + + IfStmt *IfStatement = IfStmt::Create( + Context, SourceLocation(), IfStatementKind::Ordinary, nullptr, nullptr, + Condition.get(), SourceLocation(), SourceLocation(), CombinedBody, + SourceLocation(), nullptr); + + FusedBodyStmts.push_back(IfStatement); + } + FusedBody = CompoundStmt::Create(Context, FusedBodyStmts, FPOptionsOverride(), + SourceLocation(), SourceLocation()); + + // 7. Construct the final fused loop + ForStmt *FusedForStmt = new (Context) + ForStmt(Context, InitStmt.get(), CondExpr.get(), nullptr, IncrExpr.get(), + FusedBody, InitStmt.get()->getBeginLoc(), SourceLocation(), + IncrExpr.get()->getEndLoc()); + + return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, NumLoops, + 1, AStmt, FusedForStmt, + buildPreInits(Context, PreInits)); +} + OMPClause *SemaOpenMP::ActOnOpenMPSingleExprClause(OpenMPClauseKind Kind, Expr *Expr, SourceLocation StartLoc, diff --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h index 8b4b79c6ec039..39082e06a5a0b 100644 --- a/clang/lib/Sema/TreeTransform.h +++ b/clang/lib/Sema/TreeTransform.h @@ -9663,6 +9663,17 @@ StmtResult TreeTransform::TransformOMPInterchangeDirective( return Res; } +template +StmtResult +TreeTransform::TransformOMPFuseDirective(OMPFuseDirective *D) { + DeclarationNameInfo DirName; + getDerived().getSema().OpenMP().StartOpenMPDSABlock( + D->getDirectiveKind(), DirName, nullptr, D->getBeginLoc()); + StmtResult Res = getDerived().TransformOMPExecutableDirective(D); + getDerived().getSema().OpenMP().EndOpenMPDSABlock(Res.get()); + return Res; +} + template 
StmtResult TreeTransform::TransformOMPForDirective(OMPForDirective *D) { diff --git a/clang/lib/Serialization/ASTReaderStmt.cpp b/clang/lib/Serialization/ASTReaderStmt.cpp index f41cfcc53a35d..aee052404874c 100644 --- a/clang/lib/Serialization/ASTReaderStmt.cpp +++ b/clang/lib/Serialization/ASTReaderStmt.cpp @@ -2449,6 +2449,7 @@ void ASTStmtReader::VisitOMPLoopTransformationDirective( OMPLoopTransformationDirective *D) { VisitOMPLoopBasedDirective(D); D->setNumGeneratedLoops(Record.readUInt32()); + D->setNumGeneratedLoopNests(Record.readUInt32()); } void ASTStmtReader::VisitOMPTileDirective(OMPTileDirective *D) { @@ -2471,6 +2472,10 @@ void ASTStmtReader::VisitOMPInterchangeDirective(OMPInterchangeDirective *D) { VisitOMPLoopTransformationDirective(D); } +void ASTStmtReader::VisitOMPFuseDirective(OMPFuseDirective *D) { + VisitOMPLoopTransformationDirective(D); +} + void ASTStmtReader::VisitOMPForDirective(OMPForDirective *D) { VisitOMPLoopDirective(D); D->setHasCancel(Record.readBool()); @@ -3613,6 +3618,12 @@ Stmt *ASTReader::ReadStmtFromStream(ModuleFile &F) { S = OMPReverseDirective::CreateEmpty(Context); break; } + case STMT_OMP_FUSE_DIRECTIVE: { + unsigned NumLoops = Record[ASTStmtReader::NumStmtFields]; + unsigned NumClauses = Record[ASTStmtReader::NumStmtFields + 1]; + S = OMPFuseDirective::CreateEmpty(Context, NumClauses, NumLoops); + break; + } case STMT_OMP_INTERCHANGE_DIRECTIVE: { unsigned NumLoops = Record[ASTStmtReader::NumStmtFields]; diff --git a/clang/lib/Serialization/ASTWriterStmt.cpp b/clang/lib/Serialization/ASTWriterStmt.cpp index b9eabd5ddb64c..8b909d5c93686 100644 --- a/clang/lib/Serialization/ASTWriterStmt.cpp +++ b/clang/lib/Serialization/ASTWriterStmt.cpp @@ -2454,6 +2454,7 @@ void ASTStmtWriter::VisitOMPLoopTransformationDirective( OMPLoopTransformationDirective *D) { VisitOMPLoopBasedDirective(D); Record.writeUInt32(D->getNumGeneratedLoops()); + Record.writeUInt32(D->getNumGeneratedLoopNests()); } void 
ASTStmtWriter::VisitOMPTileDirective(OMPTileDirective *D) { @@ -2481,6 +2482,11 @@ void ASTStmtWriter::VisitOMPInterchangeDirective(OMPInterchangeDirective *D) { Code = serialization::STMT_OMP_INTERCHANGE_DIRECTIVE; } +void ASTStmtWriter::VisitOMPFuseDirective(OMPFuseDirective *D) { + VisitOMPLoopTransformationDirective(D); + Code = serialization::STMT_OMP_FUSE_DIRECTIVE; +} + void ASTStmtWriter::VisitOMPForDirective(OMPForDirective *D) { VisitOMPLoopDirective(D); Record.writeBool(D->hasCancel()); diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp index 86e2e8f634bfd..457a6daf061b0 100644 --- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp +++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp @@ -1818,6 +1818,7 @@ void ExprEngine::Visit(const Stmt *S, ExplodedNode *Pred, case Stmt::OMPStripeDirectiveClass: case Stmt::OMPTileDirectiveClass: case Stmt::OMPInterchangeDirectiveClass: + case Stmt::OMPFuseDirectiveClass: case Stmt::OMPInteropDirectiveClass: case Stmt::OMPDispatchDirectiveClass: case Stmt::OMPMaskedDirectiveClass: diff --git a/clang/test/OpenMP/fuse_ast_print.cpp b/clang/test/OpenMP/fuse_ast_print.cpp new file mode 100644 index 0000000000000..43ce815dab024 --- /dev/null +++ b/clang/test/OpenMP/fuse_ast_print.cpp @@ -0,0 +1,278 @@ +// Check no warnings/errors +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -fsyntax-only -verify %s +// expected-no-diagnostics + +// Check AST and unparsing +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -ast-dump %s | FileCheck %s --check-prefix=DUMP +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -ast-print %s | FileCheck %s --check-prefix=PRINT + +// Check same results after serialization round-trip +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -emit-pch -o %t %s +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu 
-fopenmp -std=c++20 -fopenmp-version=60 -include-pch %t -ast-dump-all %s | FileCheck %s --check-prefix=DUMP +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -include-pch %t -ast-print %s | FileCheck %s --check-prefix=PRINT + +#ifndef HEADER +#define HEADER + +// placeholder for loop body code +extern "C" void body(...); + +// PRINT-LABEL: void foo1( +// DUMP-LABEL: FunctionDecl {{.*}} foo1 +void foo1() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + + } + +} + +// PRINT-LABEL: void foo2( +// DUMP-LABEL: FunctionDecl {{.*}} foo2 +void foo2() { + // PRINT: #pragma omp unroll partial(4) + // DUMP: OMPUnrollDirective + // DUMP-NEXT: OMPPartialClause + // DUMP-NEXT: ConstantExpr + // DUMP-NEXT: value: Int 4 + // DUMP-NEXT: IntegerLiteral {{.*}} 4 + #pragma omp unroll partial(4) + // PRINT: #pragma omp fuse + // DUMP-NEXT: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + } + +} + +//PRINT-LABEL: void foo3( +//DUMP-LABEL: FunctionTemplateDecl {{.*}} foo3 +template +void foo3() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: #pragma omp 
unroll partial(Factor1) + // DUMP: OMPUnrollDirective + #pragma omp unroll partial(Factor1) + // PRINT: for (int i = 0; i < 12; i += 1) + // DUMP: ForStmt + for (int i = 0; i < 12; i += 1) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: #pragma omp unroll partial(Factor2) + // DUMP: OMPUnrollDirective + #pragma omp unroll partial(Factor2) + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + + } +} + +// Also test instantiating the template. +void tfoo3() { + foo3<4,2>(); +} + +//PRINT-LABEL: void foo4( +//DUMP-LABEL: FunctionTemplateDecl {{.*}} foo4 +template +void foo4(int start, int end) { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (T i = start; i < end; i += Step) + // DUMP: ForStmt + for (T i = start; i < end; i += Step) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + + // PRINT: for (T j = end; j > start; j -= Step) + // DUMP: ForStmt + for (T j = end; j > start; j -= Step) { + // PRINT: body(j) + // DUMP: CallExpr + body(j); + } + + } +} + +// Also test instantiating the template. 
+void tfoo4() { + foo4(0, 64); +} + + + +// PRINT-LABEL: void foo5( +// DUMP-LABEL: FunctionDecl {{.*}} foo5 +void foo5() { + double arr[128], arr2[128]; + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT-NEXT: for (auto &&a : arr) + // DUMP-NEXT: CXXForRangeStmt + for (auto &&a: arr) + // PRINT: body(a) + // DUMP: CallExpr + body(a); + // PRINT: for (double v = 42; auto &&b : arr) + // DUMP: CXXForRangeStmt + for (double v = 42; auto &&b: arr) + // PRINT: body(b, v); + // DUMP: CallExpr + body(b, v); + // PRINT: for (auto &&c : arr2) + // DUMP: CXXForRangeStmt + for (auto &&c: arr2) + // PRINT: body(c) + // DUMP: CallExpr + body(c); + + } + +} + +// PRINT-LABEL: void foo6( +// DUMP-LABEL: FunctionDecl {{.*}} foo6 +void foo6() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i <= 10; ++i) + // DUMP: ForStmt + for (int i = 0; i <= 10; ++i) + body(i); + // PRINT: for (int j = 0; j < 100; ++j) + // DUMP: ForStmt + for(int j = 0; j < 100; ++j) + body(j); + } + // PRINT: #pragma omp unroll partial(4) + // DUMP: OMPUnrollDirective + #pragma omp unroll partial(4) + // PRINT: for (int k = 0; k < 250; ++k) + // DUMP: ForStmt + for (int k = 0; k < 250; ++k) + body(k); + } +} + +// PRINT-LABEL: void foo7( +// DUMP-LABEL: FunctionDecl {{.*}} foo7 +void foo7() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 
0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + } + } + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + } + } + } + } + +} + + + + + +#endif \ No newline at end of file diff --git a/clang/test/OpenMP/fuse_codegen.cpp b/clang/test/OpenMP/fuse_codegen.cpp new file mode 100644 index 0000000000000..6c1e21092da43 --- /dev/null +++ b/clang/test/OpenMP/fuse_codegen.cpp @@ -0,0 +1,1511 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --include-generated-funcs --replace-value-regex "pl_cond[.].+[.|,]" --prefix-filecheck-ir-name _ --version 5 +// expected-no-diagnostics + +// Check code generation +// RUN: %clang_cc1 -verify -triple x86_64-pc-linux-gnu -std=c++20 -fclang-abi-compat=latest -fopenmp -fopenmp-version=60 -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK1 + +// Check same results after serialization round-trip +// RUN: %clang_cc1 -verify -triple x86_64-pc-linux-gnu -std=c++20 -fclang-abi-compat=latest -fopenmp -fopenmp-version=60 -emit-pch -o %t %s +// RUN: %clang_cc1 -verify -triple x86_64-pc-linux-gnu -std=c++20 -fclang-abi-compat=latest -fopenmp -fopenmp-version=60 -include-pch %t -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK2 + +#ifndef HEADER +#define HEADER + +//placeholder for loop body code. +extern "C" void body(...) 
{} + +extern "C" void foo1(int start1, int end1, int step1, int start2, int end2, int step2) { + int i,j; + #pragma omp fuse + { + for(i = start1; i < end1; i += step1) body(i); + for(j = start2; j < end2; j += step2) body(j); + } + +} + +template +void foo2(T start, T end, T step){ + T i,j,k; + #pragma omp fuse + { + for(i = start; i < end; i += step) body(i); + for(j = end; j > start; j -= step) body(j); + for(k = start+step; k < end+step; k += step) body(k); + } +} + +extern "C" void tfoo2() { + foo2(0, 64, 4); +} + +extern "C" void foo3() { + double arr[256]; + #pragma omp fuse + { + #pragma omp fuse + { + for(int i = 0; i < 128; ++i) body(i); + for(int j = 0; j < 256; j+=2) body(j); + } + for(int c = 42; auto &&v: arr) body(c,v); + for(int cc = 37; auto &&vv: arr) body(cc, vv); + } +} + + +#endif +// CHECK1-LABEL: define dso_local void @body( +// CHECK1-SAME: ...) #[[ATTR0:[0-9]+]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define dso_local void @foo1( +// CHECK1-SAME: i32 noundef [[START1:%.*]], i32 noundef [[END1:%.*]], i32 noundef [[STEP1:%.*]], i32 noundef [[START2:%.*]], i32 noundef [[END2:%.*]], i32 noundef [[STEP2:%.*]]) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[START1_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[END1_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[STEP1_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[START2_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[END2_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[STEP2_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// 
CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: store i32 [[START1]], ptr [[START1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[END1]], ptr [[END1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[STEP1]], ptr [[STEP1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[START2]], ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[END2]], ptr [[END2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[STEP2]], ptr [[STEP2_ADDR]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[END1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: 
[[TMP5:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK1-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add i32 [[SUB3]], [[TMP6]] +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK1-NEXT: [[TMP17:%.*]] = 
load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK1-NEXT: br i1 [[CMP16]], label 
%[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: br i1 [[CMP17]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP30]], [[TMP31]] +// CHECK1-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] +// CHECK1-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK1-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP35]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK1-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] +// CHECK1: [[IF_THEN22]]: +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] +// CHECK1-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK1-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK1-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK1-NEXT: br label %[[IF_END27]] +// CHECK1: [[IF_END27]]: +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define dso_local void @tfoo2( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: call void @_Z4foo2IiEvT_S0_S0_(i32 noundef 0, i32 noundef 64, i32 noundef 4) +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define linkonce_odr void @_Z4foo2IiEvT_S0_S0_( +// CHECK1-SAME: i32 noundef [[START:%.*]], i32 noundef [[END:%.*]], i32 noundef [[STEP:%.*]]) #[[ATTR0]] comdat { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[START_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[END_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[STEP_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// 
CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_17:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP21:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: store i32 [[START]], ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[END]], ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[STEP]], ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP5:%.*]] = load 
i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK1-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add i32 [[SUB3]], [[TMP6]] +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr 
[[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] +// CHECK1-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK1-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// 
CHECK1-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK1-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 +// CHECK1-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK1-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]] +// CHECK1-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], label %[[COND_FALSE31:.*]] +// 
CHECK1: [[COND_TRUE30]]: +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: br label %[[COND_END32:.*]] +// CHECK1: [[COND_FALSE31]]: +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: br label %[[COND_END32]] +// CHECK1: [[COND_END32]]: +// CHECK1-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ] +// CHECK1-NEXT: store i32 [[COND33]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]] +// CHECK1-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]] +// CHECK1-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]] +// CHECK1-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]] +// CHECK1-NEXT: [[ADD38:%.*]] = add 
i32 [[TMP49]], [[MUL37]] +// CHECK1-NEXT: store i32 [[ADD38]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP52]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]] +// CHECK1-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]] +// CHECK1: [[IF_THEN40]]: +// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]] +// CHECK1-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]] +// CHECK1-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP58:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]] +// CHECK1-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]] +// CHECK1-NEXT: store i32 [[SUB44]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP61]]) +// CHECK1-NEXT: br label %[[IF_END45]] +// CHECK1: [[IF_END45]]: +// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]] +// CHECK1-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK1: [[IF_THEN47]]: +// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 +// CHECK1-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 +// CHECK1-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]] +// CHECK1-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]] +// CHECK1-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4 +// CHECK1-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 +// CHECK1-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]] +// CHECK1-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]] +// CHECK1-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP70]]) +// CHECK1-NEXT: br label %[[IF_END52]] +// CHECK1: [[IF_END52]]: +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define dso_local void @foo3( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[C:%.*]] = alloca i32, 
align 4 +// CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END224:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: store i32 128, ptr 
[[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK1-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK1-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4 +// 
CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1 +// CHECK1-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8 +// CHECK1-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK1-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK1-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64 +// CHECK1-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK1-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK1-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK1-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1 +// CHECK1-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1 +// CHECK1-NEXT: 
[[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1 +// CHECK1-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK1-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK1-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8 +// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8 +// CHECK1-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK1-NEXT: [[ADD21:%.*]] = add nsw i64 [[TMP16]], 1 +// CHECK1-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8 +// CHECK1-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8 +// CHECK1-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8 +// CHECK1-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK1-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK1-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr 
[[TMP22]] to i64 +// CHECK1-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]] +// CHECK1-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8 +// CHECK1-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1 +// CHECK1-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1 +// CHECK1-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1 +// CHECK1-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1 +// CHECK1-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK1-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8 +// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8 +// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK1-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1 +// CHECK1-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK1-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK1-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]] +// CHECK1-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]] +// CHECK1: [[COND_TRUE44]]: +// CHECK1-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK1-NEXT: br label %[[COND_END46:.*]] +// CHECK1: [[COND_FALSE45]]: +// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: br label %[[COND_END46]] +// CHECK1: [[COND_END46]]: +// CHECK1-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ] +// CHECK1-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK1-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// 
CHECK1-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]] +// CHECK1-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]] +// CHECK1: [[COND_TRUE50]]: +// CHECK1-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK1-NEXT: br label %[[COND_END52:.*]] +// CHECK1: [[COND_FALSE51]]: +// CHECK1-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: br label %[[COND_END52]] +// CHECK1: [[COND_END52]]: +// CHECK1-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ] +// CHECK1-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK1-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK1-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]] +// CHECK1-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4 +// CHECK1-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4 +// CHECK1-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64 +// CHECK1-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]] +// CHECK1-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]] +// 
CHECK1-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32 +// CHECK1-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4 +// CHECK1-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1 +// CHECK1-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]] +// CHECK1-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN64]]: +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]] +// CHECK1-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]] +// CHECK1-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1 +// CHECK1-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]] +// CHECK1-NEXT: store i32 [[ADD68]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP48]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]] +// CHECK1-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]] +// CHECK1: [[IF_THEN70]]: +// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]] +// CHECK1-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]] +// CHECK1-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2 +// CHECK1-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]] +// CHECK1-NEXT: store i32 [[ADD74]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP55]]) +// CHECK1-NEXT: br label %[[IF_END75]] +// CHECK1: [[IF_END75]]: +// CHECK1-NEXT: br label %[[IF_END76]] +// CHECK1: [[IF_END76]]: +// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]] +// CHECK1-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]] +// CHECK1: [[IF_THEN78]]: +// CHECK1-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8 +// CHECK1-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8 +// CHECK1-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]] +// CHECK1-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]] +// CHECK1-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8 +// CHECK1-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK1-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8 +// CHECK1-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1 +// CHECK1-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]] +// CHECK1-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP63]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP64]], double noundef [[TMP66]]) +// CHECK1-NEXT: br label %[[IF_END83]] +// CHECK1: [[IF_END83]]: +// CHECK1-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]] +// CHECK1-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]] +// CHECK1: [[IF_THEN85]]: +// CHECK1-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK1-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK1-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]] +// CHECK1-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]] +// CHECK1-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1 +// CHECK1-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]] +// CHECK1-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8 +// CHECK1-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8 +// CHECK1-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK1-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP75]], double noundef [[TMP77]]) +// CHECK1-NEXT: br label %[[IF_END90]] +// CHECK1: [[IF_END90]]: +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1 +// CHECK1-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @body( +// CHECK2-SAME: ...) #[[ATTR0:[0-9]+]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @foo1( +// CHECK2-SAME: i32 noundef [[START1:%.*]], i32 noundef [[END1:%.*]], i32 noundef [[STEP1:%.*]], i32 noundef [[START2:%.*]], i32 noundef [[END2:%.*]], i32 noundef [[STEP2:%.*]]) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[START1_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[END1_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[STEP1_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[START2_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[END2_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[STEP2_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, 
align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: store i32 [[START1]], ptr [[START1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[END1]], ptr [[END1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[STEP1]], ptr [[STEP1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[START2]], ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[END2]], ptr [[END2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[STEP2]], ptr [[STEP2_ADDR]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[END1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK2-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add i32 
[[SUB3]], [[TMP6]] +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr 
[[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK2-NEXT: br i1 [[CMP16]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], 
[[TMP28]] +// CHECK2-NEXT: br i1 [[CMP17]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP30]], [[TMP31]] +// CHECK2-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] +// CHECK2-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK2-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP35]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK2-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] +// CHECK2: [[IF_THEN22]]: +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] +// CHECK2-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK2-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK2-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK2-NEXT: br label %[[IF_END27]] +// CHECK2: [[IF_END27]]: +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @foo3( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[C:%.*]] = alloca i32, 
align 4 +// CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END224:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: store i32 128, ptr 
[[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK2-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK2-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4 +// 
CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1 +// CHECK2-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK2-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK2-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK2-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1 +// CHECK2-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1 +// CHECK2-NEXT: 
[[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1 +// CHECK2-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK2-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK2-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8 +// CHECK2-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK2-NEXT: [[ADD21:%.*]] = add nsw i64 [[TMP16]], 1 +// CHECK2-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8 +// CHECK2-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8 +// CHECK2-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8 +// CHECK2-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK2-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK2-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr 
[[TMP22]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]] +// CHECK2-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8 +// CHECK2-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1 +// CHECK2-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1 +// CHECK2-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1 +// CHECK2-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1 +// CHECK2-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK2-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8 +// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK2-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1 +// CHECK2-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK2-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]] +// CHECK2-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]] +// CHECK2: [[COND_TRUE44]]: +// CHECK2-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK2-NEXT: br label %[[COND_END46:.*]] +// CHECK2: [[COND_FALSE45]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: br label %[[COND_END46]] +// CHECK2: [[COND_END46]]: +// CHECK2-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ] +// CHECK2-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// 
CHECK2-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]] +// CHECK2-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]] +// CHECK2: [[COND_TRUE50]]: +// CHECK2-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: br label %[[COND_END52:.*]] +// CHECK2: [[COND_FALSE51]]: +// CHECK2-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: br label %[[COND_END52]] +// CHECK2: [[COND_END52]]: +// CHECK2-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ] +// CHECK2-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK2-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]] +// CHECK2-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4 +// CHECK2-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4 +// CHECK2-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64 +// CHECK2-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]] +// CHECK2-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]] +// 
CHECK2-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32 +// CHECK2-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4 +// CHECK2-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1 +// CHECK2-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]] +// CHECK2-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN64]]: +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]] +// CHECK2-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]] +// CHECK2-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1 +// CHECK2-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]] +// CHECK2-NEXT: store i32 [[ADD68]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP48]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]] +// CHECK2-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]] +// CHECK2: [[IF_THEN70]]: +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]] +// CHECK2-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]] +// CHECK2-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2 +// CHECK2-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]] +// CHECK2-NEXT: store i32 [[ADD74]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP55]]) +// CHECK2-NEXT: br label %[[IF_END75]] +// CHECK2: [[IF_END75]]: +// CHECK2-NEXT: br label %[[IF_END76]] +// CHECK2: [[IF_END76]]: +// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]] +// CHECK2: [[IF_THEN78]]: +// CHECK2-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8 +// CHECK2-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8 +// CHECK2-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]] +// CHECK2-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]] +// CHECK2-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8 +// CHECK2-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK2-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8 +// CHECK2-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1 +// CHECK2-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]] +// CHECK2-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP63]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP64]], double noundef [[TMP66]]) +// CHECK2-NEXT: br label %[[IF_END83]] +// CHECK2: [[IF_END83]]: +// CHECK2-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]] +// CHECK2-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]] +// CHECK2: [[IF_THEN85]]: +// CHECK2-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK2-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK2-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]] +// CHECK2-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]] +// CHECK2-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1 +// CHECK2-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]] +// CHECK2-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8 +// CHECK2-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8 +// CHECK2-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK2-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP75]], double noundef [[TMP77]]) +// CHECK2-NEXT: br label %[[IF_END90]] +// CHECK2: [[IF_END90]]: +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1 +// CHECK2-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @tfoo2( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: call void @_Z4foo2IiEvT_S0_S0_(i32 noundef 0, i32 noundef 64, i32 noundef 4) +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define linkonce_odr void @_Z4foo2IiEvT_S0_S0_( +// CHECK2-SAME: i32 noundef [[START:%.*]], i32 noundef [[END:%.*]], i32 noundef [[STEP:%.*]]) #[[ATTR0]] comdat { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[START_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[END_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[STEP_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: 
[[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_17:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP21:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: store i32 [[START]], ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[END]], ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[STEP]], ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], 
align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK2-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add i32 [[SUB3]], [[TMP6]] +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK2-NEXT: 
[[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] +// CHECK2-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK2-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr 
[[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK2-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 +// CHECK2-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK2-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]] +// CHECK2-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], 
label %[[COND_FALSE31:.*]] +// CHECK2: [[COND_TRUE30]]: +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: br label %[[COND_END32:.*]] +// CHECK2: [[COND_FALSE31]]: +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: br label %[[COND_END32]] +// CHECK2: [[COND_END32]]: +// CHECK2-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ] +// CHECK2-NEXT: store i32 [[COND33]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]] +// CHECK2-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]] +// CHECK2-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]] +// CHECK2-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]] +// 
CHECK2-NEXT: [[ADD38:%.*]] = add i32 [[TMP49]], [[MUL37]] +// CHECK2-NEXT: store i32 [[ADD38]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP52]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]] +// CHECK2: [[IF_THEN40]]: +// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]] +// CHECK2-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP58:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]] +// CHECK2-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]] +// CHECK2-NEXT: store i32 [[SUB44]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP61]]) +// CHECK2-NEXT: br label %[[IF_END45]] +// CHECK2: [[IF_END45]]: +// CHECK2-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]] +// CHECK2-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK2: [[IF_THEN47]]: +// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 +// CHECK2-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 +// CHECK2-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]] +// CHECK2-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]] +// CHECK2-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4 +// CHECK2-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 +// CHECK2-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]] +// CHECK2-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]] +// CHECK2-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP70]]) +// CHECK2-NEXT: br label %[[IF_END52]] +// CHECK2: [[IF_END52]]: +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: ret void +// +//. 
+// CHECK1: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]} +// CHECK1: [[META4]] = !{!"llvm.loop.mustprogress"} +// CHECK1: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} +// CHECK1: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +//. +// CHECK2: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]} +// CHECK2: [[META4]] = !{!"llvm.loop.mustprogress"} +// CHECK2: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} +// CHECK2: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +//. diff --git a/clang/test/OpenMP/fuse_messages.cpp b/clang/test/OpenMP/fuse_messages.cpp new file mode 100644 index 0000000000000..50dedfd2c0dc6 --- /dev/null +++ b/clang/test/OpenMP/fuse_messages.cpp @@ -0,0 +1,76 @@ +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -std=c++20 -fopenmp -fopenmp-version=60 -fsyntax-only -Wuninitialized -verify %s + +void func() { + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a loop sequence containing canonical loops or loop-generating constructs}} + #pragma omp fuse + ; + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + {int bar = 0;} + + // expected-error@+4 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + { + for(int i = 0; i < 10; ++i); + int x = 2; + } + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a loop sequence containing canonical loops or loop-generating constructs}} + #pragma omp fuse + #pragma omp for + for (int i = 0; i < 7; ++i) + ; + + { + // expected-error@+2 {{expected statement}} + #pragma omp fuse + } + + // expected-warning@+1 {{extra tokens at the end of '#pragma omp fuse' are ignored}} + #pragma omp fuse foo + { + for (int i = 0; i < 7; ++i) + ; + } + + + // expected-error@+1 {{unexpected OpenMP clause 'final' in directive '#pragma omp fuse'}} + #pragma omp fuse final(0) + { + for (int i = 0; i < 7; ++i) + ; + } + + //expected-error@+4 {{loop after '#pragma omp fuse' is not in canonical form}} + 
//expected-error@+3 {{increment clause of OpenMP for loop must perform simple addition or subtraction on loop variable 'i'}} + #pragma omp fuse + { + for(int i = 0; i < 10; i*=2) { + ; + } + } + + //expected-error@+2 {{loop sequence after '#pragma omp fuse' must contain at least 1 canonical loop or loop-generating construct}} + #pragma omp fuse + {} + + //expected-error@+3 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + { + #pragma omp unroll full + for(int i = 0; i < 10; ++i); + + for(int j = 0; j < 10; ++j); + } + + //expected-warning@+5 {{loop sequence following '#pragma omp fuse' contains induction variables of differing types: 'int' and 'unsigned int'}} + //expected-warning@+5 {{loop sequence following '#pragma omp fuse' contains induction variables of differing types: 'int' and 'long long'}} + #pragma omp fuse + { + for(int i = 0; i < 10; ++i); + for(unsigned int j = 0; j < 10; ++j); + for(long long k = 0; k < 100; ++k); + } +} \ No newline at end of file diff --git a/clang/tools/libclang/CIndex.cpp b/clang/tools/libclang/CIndex.cpp index fa5df3b5a06e6..80020763961fc 100644 --- a/clang/tools/libclang/CIndex.cpp +++ b/clang/tools/libclang/CIndex.cpp @@ -2206,6 +2206,7 @@ class EnqueueVisitor : public ConstStmtVisitor, void VisitOMPUnrollDirective(const OMPUnrollDirective *D); void VisitOMPReverseDirective(const OMPReverseDirective *D); void VisitOMPInterchangeDirective(const OMPInterchangeDirective *D); + void VisitOMPFuseDirective(const OMPFuseDirective *D); void VisitOMPForDirective(const OMPForDirective *D); void VisitOMPForSimdDirective(const OMPForSimdDirective *D); void VisitOMPSectionsDirective(const OMPSectionsDirective *D); @@ -3364,6 +3365,10 @@ void EnqueueVisitor::VisitOMPInterchangeDirective( VisitOMPLoopTransformationDirective(D); } +void EnqueueVisitor::VisitOMPFuseDirective(const OMPFuseDirective *D) { + VisitOMPLoopTransformationDirective(D); +} + void EnqueueVisitor::VisitOMPForDirective(const 
OMPForDirective *D) { VisitOMPLoopDirective(D); } @@ -6318,6 +6323,8 @@ CXString clang_getCursorKindSpelling(enum CXCursorKind Kind) { return cxstring::createRef("OMPReverseDirective"); case CXCursor_OMPInterchangeDirective: return cxstring::createRef("OMPInterchangeDirective"); + case CXCursor_OMPFuseDirective: + return cxstring::createRef("OMPFuseDirective"); case CXCursor_OMPForDirective: return cxstring::createRef("OMPForDirective"); case CXCursor_OMPForSimdDirective: diff --git a/clang/tools/libclang/CXCursor.cpp b/clang/tools/libclang/CXCursor.cpp index 635d03a88d105..709fa60d28d8d 100644 --- a/clang/tools/libclang/CXCursor.cpp +++ b/clang/tools/libclang/CXCursor.cpp @@ -688,6 +688,9 @@ CXCursor cxcursor::MakeCXCursor(const Stmt *S, const Decl *Parent, case Stmt::OMPInterchangeDirectiveClass: K = CXCursor_OMPInterchangeDirective; break; + case Stmt::OMPFuseDirectiveClass: + K = CXCursor_OMPFuseDirective; + break; case Stmt::OMPForDirectiveClass: K = CXCursor_OMPForDirective; break; diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 194b1e657c493..f33b3b1532d3d 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -842,6 +842,10 @@ def OMP_For : Directive<"for"> { let association = AS_Loop; let category = CA_Executable; } +def OMP_Fuse : Directive<"fuse"> { + let association = AS_Loop; + let category = CA_Executable; +} def OMP_Interchange : Directive<"interchange"> { let allowedOnceClauses = [ VersionedClause, diff --git a/openmp/runtime/test/transform/fuse/foreach.cpp b/openmp/runtime/test/transform/fuse/foreach.cpp new file mode 100644 index 0000000000000..cabf4bf8a511d --- /dev/null +++ b/openmp/runtime/test/transform/fuse/foreach.cpp @@ -0,0 +1,192 @@ +// RUN: %libomp-cxx20-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include +#include +#include + +struct Reporter { + const char *name; + + 
Reporter(const char *name) : name(name) { print("ctor"); } + + Reporter() : name("") { print("ctor"); } + + Reporter(const Reporter &that) : name(that.name) { print("copy ctor"); } + + Reporter(Reporter &&that) : name(that.name) { print("move ctor"); } + + ~Reporter() { print("dtor"); } + + const Reporter &operator=(const Reporter &that) { + print("copy assign"); + this->name = that.name; + return *this; + } + + const Reporter &operator=(Reporter &&that) { + print("move assign"); + this->name = that.name; + return *this; + } + + struct Iterator { + const Reporter *owner; + int pos; + + Iterator(const Reporter *owner, int pos) : owner(owner), pos(pos) {} + + Iterator(const Iterator &that) : owner(that.owner), pos(that.pos) { + owner->print("iterator copy ctor"); + } + + Iterator(Iterator &&that) : owner(that.owner), pos(that.pos) { + owner->print("iterator move ctor"); + } + + ~Iterator() { owner->print("iterator dtor"); } + + const Iterator &operator=(const Iterator &that) { + owner->print("iterator copy assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + const Iterator &operator=(Iterator &&that) { + owner->print("iterator move assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + bool operator==(const Iterator &that) const { + owner->print("iterator %d == %d", 2 - this->pos, 2 - that.pos); + return this->pos == that.pos; + } + + Iterator &operator++() { + owner->print("iterator prefix ++"); + pos -= 1; + return *this; + } + + Iterator operator++(int) { + owner->print("iterator postfix ++"); + auto result = *this; + pos -= 1; + return result; + } + + int operator*() const { + int result = 2 - pos; + owner->print("iterator deref: %i", result); + return result; + } + + size_t operator-(const Iterator &that) const { + int result = (2 - this->pos) - (2 - that.pos); + owner->print("iterator distance: %d", result); + return result; + } + + Iterator operator+(int steps) const { + owner->print("iterator 
advance: %i += %i", 2 - this->pos, steps); + return Iterator(owner, pos - steps); + } + + void print(const char *msg) const { owner->print(msg); } + }; + + Iterator begin() const { + print("begin()"); + return Iterator(this, 2); + } + + Iterator end() const { + print("end()"); + return Iterator(this, -1); + } + + void print(const char *msg, ...) const { + va_list args; + va_start(args, msg); + printf("[%s] ", name); + vprintf(msg, args); + printf("\n"); + va_end(args); + } +}; + +int main() { + printf("do\n"); +#pragma omp fuse + { + for (Reporter a{"C"}; auto &&v : Reporter("A")) + printf("v=%d\n", v); + for (Reporter aa{"D"}; auto &&vv : Reporter("B")) + printf("vv=%d\n", vv); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +// CHECK: [C] ctor +// CHECK-NEXT: [A] ctor +// CHECK-NEXT: [A] end() +// CHECK-NEXT: [A] begin() +// CHECK-NEXT: [A] begin() +// CHECK-NEXT: [A] iterator distance: 3 +// CHECK-NEXT: [D] ctor +// CHECK-NEXT: [B] ctor +// CHECK-NEXT: [B] end() +// CHECK-NEXT: [B] begin() +// CHECK-NEXT: [B] begin() +// CHECK-NEXT: [B] iterator distance: 3 +// CHECK-NEXT: [A] iterator advance: 0 += 0 +// CHECK-NEXT: [A] iterator move assign +// CHECK-NEXT: [A] iterator deref: 0 +// CHECK-NEXT: v=0 +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [B] iterator advance: 0 += 0 +// CHECK-NEXT: [B] iterator move assign +// CHECK-NEXT: [B] iterator deref: 0 +// CHECK-NEXT: vv=0 +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [A] iterator advance: 0 += 1 +// CHECK-NEXT: [A] iterator move assign +// CHECK-NEXT: [A] iterator deref: 1 +// CHECK-NEXT: v=1 +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [B] iterator advance: 0 += 1 +// CHECK-NEXT: [B] iterator move assign +// CHECK-NEXT: [B] iterator deref: 1 +// CHECK-NEXT: vv=1 +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [A] iterator advance: 0 += 2 +// CHECK-NEXT: [A] iterator move assign +// CHECK-NEXT: [A] iterator deref: 2 +// CHECK-NEXT: v=2 +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [B] 
iterator advance: 0 += 2 +// CHECK-NEXT: [B] iterator move assign +// CHECK-NEXT: [B] iterator deref: 2 +// CHECK-NEXT: vv=2 +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] dtor +// CHECK-NEXT: [D] dtor +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [A] dtor +// CHECK-NEXT: [C] dtor +// CHECK-NEXT: done + + +#endif diff --git a/openmp/runtime/test/transform/fuse/intfor.c b/openmp/runtime/test/transform/fuse/intfor.c new file mode 100644 index 0000000000000..b8171b4df7042 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/intfor.c @@ -0,0 +1,50 @@ +// RUN: %libomp-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include + +int main() { + printf("do\n"); +#pragma omp fuse + { + for (int i = 5; i <= 25; i += 5) + printf("i=%d\n", i); + for (int j = 10; j < 100; j += 10) + printf("j=%d\n", j); + for (int k = 10; k > 0; --k) + printf("k=%d\n", k); + } + printf("done\n"); + return EXIT_SUCCESS; +} +#endif /* HEADER */ + +// CHECK: do +// CHECK-NEXT: i=5 +// CHECK-NEXT: j=10 +// CHECK-NEXT: k=10 +// CHECK-NEXT: i=10 +// CHECK-NEXT: j=20 +// CHECK-NEXT: k=9 +// CHECK-NEXT: i=15 +// CHECK-NEXT: j=30 +// CHECK-NEXT: k=8 +// CHECK-NEXT: i=20 +// CHECK-NEXT: j=40 +// CHECK-NEXT: k=7 +// CHECK-NEXT: i=25 +// CHECK-NEXT: j=50 +// CHECK-NEXT: k=6 +// CHECK-NEXT: j=60 +// CHECK-NEXT: k=5 +// CHECK-NEXT: j=70 +// CHECK-NEXT: k=4 +// CHECK-NEXT: j=80 +// CHECK-NEXT: k=3 +// CHECK-NEXT: j=90 +// CHECK-NEXT: k=2 +// CHECK-NEXT: k=1 +// CHECK-NEXT: done diff --git a/openmp/runtime/test/transform/fuse/iterfor.cpp b/openmp/runtime/test/transform/fuse/iterfor.cpp new file mode 100644 index 0000000000000..552484b2981c4 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/iterfor.cpp @@ -0,0 +1,194 @@ +// RUN: %libomp-cxx20-compile-and-run | FileCheck %s 
--match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include +#include +#include + +struct Reporter { + const char *name; + + Reporter(const char *name) : name(name) { print("ctor"); } + + Reporter() : name("") { print("ctor"); } + + Reporter(const Reporter &that) : name(that.name) { print("copy ctor"); } + + Reporter(Reporter &&that) : name(that.name) { print("move ctor"); } + + ~Reporter() { print("dtor"); } + + const Reporter &operator=(const Reporter &that) { + print("copy assign"); + this->name = that.name; + return *this; + } + + const Reporter &operator=(Reporter &&that) { + print("move assign"); + this->name = that.name; + return *this; + } + + struct Iterator { + const Reporter *owner; + int pos; + + Iterator(const Reporter *owner, int pos) : owner(owner), pos(pos) {} + + Iterator(const Iterator &that) : owner(that.owner), pos(that.pos) { + owner->print("iterator copy ctor"); + } + + Iterator(Iterator &&that) : owner(that.owner), pos(that.pos) { + owner->print("iterator move ctor"); + } + + ~Iterator() { owner->print("iterator dtor"); } + + const Iterator &operator=(const Iterator &that) { + owner->print("iterator copy assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + const Iterator &operator=(Iterator &&that) { + owner->print("iterator move assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + bool operator==(const Iterator &that) const { + owner->print("iterator %d == %d", 2 - this->pos, 2 - that.pos); + return this->pos == that.pos; + } + + bool operator!=(const Iterator &that) const { + owner->print("iterator %d != %d", 2 - this->pos, 2 - that.pos); + return this->pos != that.pos; + } + + Iterator &operator++() { + owner->print("iterator prefix ++"); + pos -= 1; + return *this; + } + + Iterator operator++(int) { + owner->print("iterator postfix ++"); + auto result = *this; + pos -= 1; + return result; + } + + int operator*() const { + int result = 2 - pos; + 
owner->print("iterator deref: %i", result); + return result; + } + + size_t operator-(const Iterator &that) const { + int result = (2 - this->pos) - (2 - that.pos); + owner->print("iterator distance: %d", result); + return result; + } + + Iterator operator+(int steps) const { + owner->print("iterator advance: %i += %i", 2 - this->pos, steps); + return Iterator(owner, pos - steps); + } + }; + + Iterator begin() const { + print("begin()"); + return Iterator(this, 2); + } + + Iterator end() const { + print("end()"); + return Iterator(this, -1); + } + + void print(const char *msg, ...) const { + va_list args; + va_start(args, msg); + printf("[%s] ", name); + vprintf(msg, args); + printf("\n"); + va_end(args); + } +}; + +int main() { + printf("do\n"); + Reporter C("C"); + Reporter D("D"); +#pragma omp fuse + { + for (auto it = C.begin(); it != C.end(); ++it) + printf("v=%d\n", *it); + + for (auto it = D.begin(); it != D.end(); ++it) + printf("vv=%d\n", *it); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +#endif /* HEADER */ + +// CHECK: do +// CHECK: [C] ctor +// CHECK-NEXT: [D] ctor +// CHECK-NEXT: [C] begin() +// CHECK-NEXT: [C] begin() +// CHECK-NEXT: [C] end() +// CHECK-NEXT: [C] iterator distance: 3 +// CHECK-NEXT: [D] begin() +// CHECK-NEXT: [D] begin() +// CHECK-NEXT: [D] end() +// CHECK-NEXT: [D] iterator distance: 3 +// CHECK-NEXT: [C] iterator advance: 0 += 0 +// CHECK-NEXT: [C] iterator move assign +// CHECK-NEXT: [C] iterator deref: 0 +// CHECK-NEXT: v=0 +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] iterator advance: 0 += 0 +// CHECK-NEXT: [D] iterator move assign +// CHECK-NEXT: [D] iterator deref: 0 +// CHECK-NEXT: vv=0 +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator advance: 0 += 1 +// CHECK-NEXT: [C] iterator move assign +// CHECK-NEXT: [C] iterator deref: 1 +// CHECK-NEXT: v=1 +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] iterator advance: 0 += 1 +// CHECK-NEXT: [D] iterator move assign +// CHECK-NEXT: [D] iterator 
deref: 1 +// CHECK-NEXT: vv=1 +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator advance: 0 += 2 +// CHECK-NEXT: [C] iterator move assign +// CHECK-NEXT: [C] iterator deref: 2 +// CHECK-NEXT: v=2 +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] iterator advance: 0 += 2 +// CHECK-NEXT: [D] iterator move assign +// CHECK-NEXT: [D] iterator deref: 2 +// CHECK-NEXT: vv=2 +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: done +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] dtor +// CHECK-NEXT: [C] dtor diff --git a/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp new file mode 100644 index 0000000000000..e9f76713fe3e0 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp @@ -0,0 +1,208 @@ +// RUN: %libomp-cxx20-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include +#include +#include + +struct Reporter { + const char *name; + + Reporter(const char *name) : name(name) { print("ctor"); } + + Reporter() : name("") { print("ctor"); } + + Reporter(const Reporter &that) : name(that.name) { print("copy ctor"); } + + Reporter(Reporter &&that) : name(that.name) { print("move ctor"); } + + ~Reporter() { print("dtor"); } + + const Reporter &operator=(const Reporter &that) { + print("copy assign"); + this->name = that.name; + return *this; + } + + const Reporter &operator=(Reporter &&that) { + print("move assign"); + this->name = that.name; + return *this; + } + + struct Iterator { + const Reporter *owner; + int pos; + + Iterator(const Reporter *owner, int pos) : owner(owner), pos(pos) {} + + Iterator(const Iterator &that) : owner(that.owner), pos(that.pos) { + owner->print("iterator copy ctor"); + } + + 
Iterator(Iterator &&that) : owner(that.owner), pos(that.pos) { + owner->print("iterator move ctor"); + } + + ~Iterator() { owner->print("iterator dtor"); } + + const Iterator &operator=(const Iterator &that) { + owner->print("iterator copy assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + const Iterator &operator=(Iterator &&that) { + owner->print("iterator move assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + bool operator==(const Iterator &that) const { + owner->print("iterator %d == %d", 2 - this->pos, 2 - that.pos); + return this->pos == that.pos; + } + + Iterator &operator++() { + owner->print("iterator prefix ++"); + pos -= 1; + return *this; + } + + Iterator operator++(int) { + owner->print("iterator postfix ++"); + auto result = *this; + pos -= 1; + return result; + } + + int operator*() const { + int result = 2 - pos; + owner->print("iterator deref: %i", result); + return result; + } + + size_t operator-(const Iterator &that) const { + int result = (2 - this->pos) - (2 - that.pos); + owner->print("iterator distance: %d", result); + return result; + } + + Iterator operator+(int steps) const { + owner->print("iterator advance: %i += %i", 2 - this->pos, steps); + return Iterator(owner, pos - steps); + } + + void print(const char *msg) const { owner->print(msg); } + }; + + Iterator begin() const { + print("begin()"); + return Iterator(this, 2); + } + + Iterator end() const { + print("end()"); + return Iterator(this, -1); + } + + void print(const char *msg, ...) 
const { + va_list args; + va_start(args, msg); + printf("[%s] ", name); + vprintf(msg, args); + printf("\n"); + va_end(args); + } +}; + +int main() { + printf("do\n"); +#pragma omp parallel for collapse(2) num_threads(1) + for (int i = 0; i < 3; ++i) +#pragma omp fuse + { + for (Reporter c{"init-stmt"}; auto &&v : Reporter("range")) + printf("i=%d v=%d\n", i, v); + for (int vv = 0; vv < 3; ++vv) + printf("i=%d vv=%d\n", i, vv); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +#endif /* HEADER */ + +// CHECK: do +// CHECK-NEXT: [init-stmt] ctor +// CHECK-NEXT: [range] ctor +// CHECK-NEXT: [range] end() +// CHECK-NEXT: [range] begin() +// CHECK-NEXT: [range] begin() +// CHECK-NEXT: [range] iterator distance: 3 +// CHECK-NEXT: [range] iterator advance: 0 += 0 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 0 +// CHECK-NEXT: i=0 v=0 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=0 vv=0 +// CHECK-NEXT: [range] iterator advance: 0 += 1 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 1 +// CHECK-NEXT: i=0 v=1 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=0 vv=1 +// CHECK-NEXT: [range] iterator advance: 0 += 2 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 2 +// CHECK-NEXT: i=0 v=2 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=0 vv=2 +// CHECK-NEXT: [range] iterator advance: 0 += 0 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 0 +// CHECK-NEXT: i=1 v=0 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=1 vv=0 +// CHECK-NEXT: [range] iterator advance: 0 += 1 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 1 +// CHECK-NEXT: i=1 v=1 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=1 vv=1 +// CHECK-NEXT: [range] iterator advance: 0 += 2 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 2 +// CHECK-NEXT: i=1 v=2 
+// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=1 vv=2 +// CHECK-NEXT: [range] iterator advance: 0 += 0 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 0 +// CHECK-NEXT: i=2 v=0 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=2 vv=0 +// CHECK-NEXT: [range] iterator advance: 0 += 1 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 1 +// CHECK-NEXT: i=2 v=1 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=2 vv=1 +// CHECK-NEXT: [range] iterator advance: 0 += 2 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 2 +// CHECK-NEXT: i=2 v=2 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=2 vv=2 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: [range] dtor +// CHECK-NEXT: [init-stmt] dtor +// CHECK-NEXT: done + diff --git a/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c new file mode 100644 index 0000000000000..272908e72c429 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c @@ -0,0 +1,45 @@ +// RUN: %libomp-cxx-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include + +int main() { + printf("do\n"); +#pragma omp parallel for collapse(2) num_threads(1) + for (int i = 0; i < 3; ++i) +#pragma omp fuse + { + for (int j = 0; j < 3; ++j) + printf("i=%d j=%d\n", i, j); + for (int k = 0; k < 3; ++k) + printf("i=%d k=%d\n", i, k); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +#endif /* HEADER */ + +// CHECK: do +// CHECK: i=0 j=0 +// CHECK-NEXT: i=0 k=0 +// CHECK-NEXT: i=0 j=1 +// CHECK-NEXT: i=0 k=1 +// CHECK-NEXT: i=0 j=2 +// CHECK-NEXT: i=0 k=2 +// CHECK-NEXT: i=1 j=0 +// CHECK-NEXT: i=1 k=0 +// CHECK-NEXT: i=1 j=1 +// CHECK-NEXT: i=1 k=1 +// CHECK-NEXT: i=1 j=2 +// CHECK-NEXT: 
i=1 k=2 +// CHECK-NEXT: i=2 j=0 +// CHECK-NEXT: i=2 k=0 +// CHECK-NEXT: i=2 j=1 +// CHECK-NEXT: i=2 k=1 +// CHECK-NEXT: i=2 j=2 +// CHECK-NEXT: i=2 k=2 +// CHECK-NEXT: done From 044ca734221825ed05fbf8372af3cbe264a1cc3c Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:28:04 +0000 Subject: [PATCH 2/7] Add looprange clause --- clang/include/clang/AST/OpenMPClause.h | 100 ++++++ clang/include/clang/AST/RecursiveASTVisitor.h | 8 + clang/include/clang/AST/StmtOpenMP.h | 18 +- .../clang/Basic/DiagnosticSemaKinds.td | 5 + clang/include/clang/Parse/Parser.h | 3 + clang/include/clang/Sema/SemaOpenMP.h | 6 + clang/lib/AST/OpenMPClause.cpp | 35 ++ clang/lib/AST/StmtOpenMP.cpp | 7 +- clang/lib/AST/StmtProfile.cpp | 7 + clang/lib/Basic/OpenMPKinds.cpp | 2 + clang/lib/Parse/ParseOpenMP.cpp | 36 ++ clang/lib/Sema/SemaOpenMP.cpp | 155 +++++++-- clang/lib/Sema/TreeTransform.h | 33 ++ clang/lib/Serialization/ASTReader.cpp | 11 + clang/lib/Serialization/ASTReaderStmt.cpp | 4 +- clang/lib/Serialization/ASTWriter.cpp | 8 + clang/test/OpenMP/fuse_ast_print.cpp | 67 ++++ clang/test/OpenMP/fuse_codegen.cpp | 320 +++++++++++++++++- clang/test/OpenMP/fuse_messages.cpp | 112 +++++- clang/tools/libclang/CIndex.cpp | 5 + llvm/include/llvm/Frontend/OpenMP/ClauseT.h | 16 +- llvm/include/llvm/Frontend/OpenMP/OMP.td | 6 + 22 files changed, 921 insertions(+), 43 deletions(-) diff --git a/clang/include/clang/AST/OpenMPClause.h b/clang/include/clang/AST/OpenMPClause.h index 572e62249b46f..b9c7b2771c95c 100644 --- a/clang/include/clang/AST/OpenMPClause.h +++ b/clang/include/clang/AST/OpenMPClause.h @@ -1151,6 +1151,106 @@ class OMPFullClause final : public OMPNoChildClause { static OMPFullClause *CreateEmpty(const ASTContext &C); }; +/// This class represents the 'looprange' clause in the +/// '#pragma omp fuse' directive +/// +/// \code {c} +/// #pragma omp fuse looprange(1,2) +/// { +/// for(int i = 0; i < 64; ++i) +/// for(int j = 0; j < 256; j+=2) +/// for(int k = 127; k >= 0; --k) 
+/// \endcode +class OMPLoopRangeClause final : public OMPClause { + friend class OMPClauseReader; + + explicit OMPLoopRangeClause() + : OMPClause(llvm::omp::OMPC_looprange, {}, {}) {} + + /// Location of '(' + SourceLocation LParenLoc; + + /// Location of 'first' + SourceLocation FirstLoc; + + /// Location of 'count' + SourceLocation CountLoc; + + /// Expr associated with 'first' argument + Expr *First = nullptr; + + /// Expr associated with 'count' argument + Expr *Count = nullptr; + + /// Set 'first' + void setFirst(Expr *First) { this->First = First; } + + /// Set 'count' + void setCount(Expr *Count) { this->Count = Count; } + + /// Set location of '('. + void setLParenLoc(SourceLocation Loc) { LParenLoc = Loc; } + + /// Set location of 'first' argument + void setFirstLoc(SourceLocation Loc) { FirstLoc = Loc; } + + /// Set location of 'count' argument + void setCountLoc(SourceLocation Loc) { CountLoc = Loc; } + +public: + /// Build an AST node for a 'looprange' clause + /// + /// \param StartLoc Starting location of the clause. + /// \param LParenLoc Location of '('. + /// \param FirstLoc Location of the 'first' argument. + /// \param CountLoc Location of the 'count' argument. + static OMPLoopRangeClause * + Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation LParenLoc, + SourceLocation FirstLoc, SourceLocation CountLoc, + SourceLocation EndLoc, Expr *First, Expr *Count); + + /// Build an empty 'looprange' node for deserialization + /// + /// \param C Context of the AST. 
+ static OMPLoopRangeClause *CreateEmpty(const ASTContext &C); + + /// Returns the location of '(' + SourceLocation getLParenLoc() const { return LParenLoc; } + + /// Returns the location of 'first' + SourceLocation getFirstLoc() const { return FirstLoc; } + + /// Returns the location of 'count' + SourceLocation getCountLoc() const { return CountLoc; } + + /// Returns the argument 'first' or nullptr if not set + Expr *getFirst() const { return cast_or_null(First); } + + /// Returns the argument 'count' or nullptr if not set + Expr *getCount() const { return cast_or_null(Count); } + + child_range children() { + return child_range(reinterpret_cast(&First), + reinterpret_cast(&Count) + 1); + } + + const_child_range children() const { + auto Children = const_cast(this)->children(); + return const_child_range(Children.begin(), Children.end()); + } + + child_range used_children() { + return child_range(child_iterator(), child_iterator()); + } + const_child_range used_children() const { + return const_child_range(const_child_iterator(), const_child_iterator()); + } + + static bool classof(const OMPClause *T) { + return T->getClauseKind() == llvm::omp::OMPC_looprange; + } +}; + /// Representation of the 'partial' clause of the '#pragma omp unroll' /// directive. 
/// diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h index e712a47f1639c..fbc93796ab46a 100644 --- a/clang/include/clang/AST/RecursiveASTVisitor.h +++ b/clang/include/clang/AST/RecursiveASTVisitor.h @@ -3398,6 +3398,14 @@ bool RecursiveASTVisitor::VisitOMPFullClause(OMPFullClause *C) { return true; } +template +bool RecursiveASTVisitor::VisitOMPLoopRangeClause( + OMPLoopRangeClause *C) { + TRY_TO(TraverseStmt(C->getFirst())); + TRY_TO(TraverseStmt(C->getCount())); + return true; +} + template bool RecursiveASTVisitor::VisitOMPPartialClause(OMPPartialClause *C) { TRY_TO(TraverseStmt(C->getFactor())); diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index dc6f797e24ab8..85bde292ca748 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -5572,7 +5572,9 @@ class OMPTileDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPTileDirectiveClass, llvm::omp::OMPD_tile, StartLoc, EndLoc, NumLoops) { + // Tiling doubles the original number of loops setNumGeneratedLoops(2 * NumLoops); + // Produces a single top-level canonical loop nest setNumGeneratedLoopNests(1); } @@ -5803,9 +5805,9 @@ class OMPReverseDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPReverseDirectiveClass, llvm::omp::OMPD_reverse, StartLoc, EndLoc, 1) { - - setNumGeneratedLoopNests(1); + // Reverse produces a single top-level canonical loop nest setNumGeneratedLoops(1); + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5873,6 +5875,8 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPInterchangeDirectiveClass, llvm::omp::OMPD_interchange, StartLoc, EndLoc, NumLoops) { + // Interchange produces a single top-level canonical loop + // nest, with the exact same amount of total loops 
setNumGeneratedLoops(NumLoops); setNumGeneratedLoopNests(1); } @@ -5950,11 +5954,7 @@ class OMPFuseDirective final : public OMPLoopTransformationDirective { unsigned NumLoops) : OMPLoopTransformationDirective(OMPFuseDirectiveClass, llvm::omp::OMPD_fuse, StartLoc, EndLoc, - NumLoops) { - setNumGeneratedLoops(1); - // TODO: After implementing the looprange clause, change this logic - setNumGeneratedLoopNests(1); - } + NumLoops) {} void setPreInits(Stmt *PreInits) { Data->getChildren()[PreInitsOffset] = PreInits; @@ -5990,8 +5990,10 @@ class OMPFuseDirective final : public OMPLoopTransformationDirective { /// \param C Context of the AST /// \param NumClauses Number of clauses to allocate /// \param NumLoops Number of associated loops to allocate + /// \param NumLoopNests Number of top level loops to allocate static OMPFuseDirective *CreateEmpty(const ASTContext &C, unsigned NumClauses, - unsigned NumLoops); + unsigned NumLoops, + unsigned NumLoopNests); /// Gets the associated loops after the transformation. This is the de-sugared /// replacement or nulltpr in dependent contexts. 
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 640db20f82e0b..ecfb0c83a3851 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -11524,6 +11524,11 @@ def err_omp_not_a_loop_sequence : Error < "statement after '#pragma omp %0' must be a loop sequence containing canonical loops or loop-generating constructs">; def err_omp_empty_loop_sequence : Error < "loop sequence after '#pragma omp %0' must contain at least 1 canonical loop or loop-generating construct">; +def err_omp_invalid_looprange : Error < + "loop range in '#pragma omp %0' exceeds the number of available loops: " + "range end '%1' is greater than the total number of loops '%2'">; +def warn_omp_redundant_fusion : Warning < + "loop range in '#pragma omp %0' contains only a single loop, resulting in redundant fusion">; def err_omp_not_for : Error< "%select{statement after '#pragma omp %1' must be a for loop|" "expected %2 for loops after '#pragma omp %1'%select{|, but found only %4}3}0">; diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h index e0b8850493b49..0c4c4fc4ba417 100644 --- a/clang/include/clang/Parse/Parser.h +++ b/clang/include/clang/Parse/Parser.h @@ -3622,6 +3622,9 @@ class Parser : public CodeCompletionHandler { OpenMPClauseKind Kind, bool ParseOnly); + /// Parses the 'looprange' clause of a '#pragma omp fuse' directive. + OMPClause *ParseOpenMPLoopRangeClause(); + /// Parses the 'sizes' clause of a '#pragma omp tile' directive. 
OMPClause *ParseOpenMPSizesClause(); diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index 8d78c2197c89d..f4a075e54cebe 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -921,6 +921,12 @@ class SemaOpenMP : public SemaBase { SourceLocation StartLoc, SourceLocation LParenLoc, SourceLocation EndLoc); + + /// Called on well-formed 'looprange' clause after parsing its arguments. + OMPClause * + ActOnOpenMPLoopRangeClause(Expr *First, Expr *Count, SourceLocation StartLoc, + SourceLocation LParenLoc, SourceLocation FirstLoc, + SourceLocation CountLoc, SourceLocation EndLoc); /// Called on well-formed 'ordered' clause. OMPClause * ActOnOpenMPOrderedClause(SourceLocation StartLoc, SourceLocation EndLoc, diff --git a/clang/lib/AST/OpenMPClause.cpp b/clang/lib/AST/OpenMPClause.cpp index 2226791a70b6e..e3dbc00ecf9e5 100644 --- a/clang/lib/AST/OpenMPClause.cpp +++ b/clang/lib/AST/OpenMPClause.cpp @@ -1024,6 +1024,26 @@ OMPPartialClause *OMPPartialClause::CreateEmpty(const ASTContext &C) { return new (C) OMPPartialClause(); } +OMPLoopRangeClause * +OMPLoopRangeClause::Create(const ASTContext &C, SourceLocation StartLoc, + SourceLocation LParenLoc, SourceLocation EndLoc, + SourceLocation FirstLoc, SourceLocation CountLoc, + Expr *First, Expr *Count) { + OMPLoopRangeClause *Clause = CreateEmpty(C); + Clause->setLocStart(StartLoc); + Clause->setLParenLoc(LParenLoc); + Clause->setLocEnd(EndLoc); + Clause->setFirstLoc(FirstLoc); + Clause->setCountLoc(CountLoc); + Clause->setFirst(First); + Clause->setCount(Count); + return Clause; +} + +OMPLoopRangeClause *OMPLoopRangeClause::CreateEmpty(const ASTContext &C) { + return new (C) OMPLoopRangeClause(); +} + OMPAllocateClause *OMPAllocateClause::Create( const ASTContext &C, SourceLocation StartLoc, SourceLocation LParenLoc, Expr *Allocator, Expr *Alignment, SourceLocation ColonLoc, @@ -1888,6 +1908,21 @@ void 
OMPClausePrinter::VisitOMPPartialClause(OMPPartialClause *Node) { } } +void OMPClausePrinter::VisitOMPLoopRangeClause(OMPLoopRangeClause *Node) { + OS << "looprange"; + + Expr *First = Node->getFirst(); + Expr *Count = Node->getCount(); + + if (First && Count) { + OS << "("; + First->printPretty(OS, nullptr, Policy, 0); + OS << ","; + Count->printPretty(OS, nullptr, Policy, 0); + OS << ")"; + } +} + void OMPClausePrinter::VisitOMPAllocatorClause(OMPAllocatorClause *Node) { OS << "allocator("; Node->getAllocator()->printPretty(OS, nullptr, Policy, 0); diff --git a/clang/lib/AST/StmtOpenMP.cpp b/clang/lib/AST/StmtOpenMP.cpp index f050e9063f1fc..6a2ac64f4e40b 100644 --- a/clang/lib/AST/StmtOpenMP.cpp +++ b/clang/lib/AST/StmtOpenMP.cpp @@ -524,10 +524,13 @@ OMPFuseDirective *OMPFuseDirective::Create( OMPFuseDirective *OMPFuseDirective::CreateEmpty(const ASTContext &C, unsigned NumClauses, - unsigned NumLoops) { - return createEmptyDirective( + unsigned NumLoops, + unsigned NumLoopNests) { + OMPFuseDirective *Dir = createEmptyDirective( C, NumClauses, /*HasAssociatedStmt=*/true, TransformedStmtOffset + 1, SourceLocation(), SourceLocation(), NumLoops); + Dir->setNumGeneratedLoopNests(NumLoopNests); + return Dir; } OMPForSimdDirective * diff --git a/clang/lib/AST/StmtProfile.cpp b/clang/lib/AST/StmtProfile.cpp index 933ad19b7a8ef..34f479b4b0b8a 100644 --- a/clang/lib/AST/StmtProfile.cpp +++ b/clang/lib/AST/StmtProfile.cpp @@ -511,6 +511,13 @@ void OMPClauseProfiler::VisitOMPPartialClause(const OMPPartialClause *C) { Profiler->VisitExpr(Factor); } +void OMPClauseProfiler::VisitOMPLoopRangeClause(const OMPLoopRangeClause *C) { + if (const Expr *First = C->getFirst()) + Profiler->VisitExpr(First); + if (const Expr *Count = C->getCount()) + Profiler->VisitExpr(Count); +} + void OMPClauseProfiler::VisitOMPAllocatorClause(const OMPAllocatorClause *C) { if (C->getAllocator()) Profiler->VisitStmt(C->getAllocator()); diff --git a/clang/lib/Basic/OpenMPKinds.cpp 
b/clang/lib/Basic/OpenMPKinds.cpp index e18867e3c0281..3c62b61f3a438 100644 --- a/clang/lib/Basic/OpenMPKinds.cpp +++ b/clang/lib/Basic/OpenMPKinds.cpp @@ -248,6 +248,7 @@ unsigned clang::getOpenMPSimpleClauseType(OpenMPClauseKind Kind, StringRef Str, case OMPC_affinity: case OMPC_when: case OMPC_append_args: + case OMPC_looprange: break; default: break; @@ -583,6 +584,7 @@ const char *clang::getOpenMPSimpleClauseTypeName(OpenMPClauseKind Kind, case OMPC_affinity: case OMPC_when: case OMPC_append_args: + case OMPC_looprange: break; default: break; diff --git a/clang/lib/Parse/ParseOpenMP.cpp b/clang/lib/Parse/ParseOpenMP.cpp index 8d8698e61216f..6643572b878f2 100644 --- a/clang/lib/Parse/ParseOpenMP.cpp +++ b/clang/lib/Parse/ParseOpenMP.cpp @@ -3116,6 +3116,39 @@ OMPClause *Parser::ParseOpenMPSizesClause() { OpenLoc, CloseLoc); } +OMPClause *Parser::ParseOpenMPLoopRangeClause() { + SourceLocation ClauseNameLoc = ConsumeToken(); + SourceLocation FirstLoc, CountLoc; + + BalancedDelimiterTracker T(*this, tok::l_paren, tok::annot_pragma_openmp_end); + if (T.consumeOpen()) { + Diag(Tok, diag::err_expected) << tok::l_paren; + return nullptr; + } + + FirstLoc = Tok.getLocation(); + ExprResult FirstVal = ParseConstantExpression(); + if (!FirstVal.isUsable()) { + T.skipToEnd(); + return nullptr; + } + + ExpectAndConsume(tok::comma); + + CountLoc = Tok.getLocation(); + ExprResult CountVal = ParseConstantExpression(); + if (!CountVal.isUsable()) { + T.skipToEnd(); + return nullptr; + } + + T.consumeClose(); + + return Actions.OpenMP().ActOnOpenMPLoopRangeClause( + FirstVal.get(), CountVal.get(), ClauseNameLoc, T.getOpenLocation(), + FirstLoc, CountLoc, T.getCloseLocation()); +} + OMPClause *Parser::ParseOpenMPPermutationClause() { SourceLocation ClauseNameLoc, OpenLoc, CloseLoc; SmallVector ArgExprs; @@ -3545,6 +3578,9 @@ OMPClause *Parser::ParseOpenMPClause(OpenMPDirectiveKind DKind, } Clause = ParseOpenMPClause(CKind, WrongDirective); break; + case OMPC_looprange: + Clause 
= ParseOpenMPLoopRangeClause(); + break; default: break; } diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index c9885518217f3..8cd56d1af6ac8 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -14257,7 +14257,6 @@ bool SemaOpenMP::checkTransformableLoopSequence( // and tries to match the input AST to the canonical loop sequence grammar // structure - auto NLCV = NestedLoopCounterVisitor(); // Helper functions to validate canonical loop sequence grammar is valid auto isLoopSequenceDerivation = [](auto *Child) { return isa(Child) || isa(Child) || @@ -14360,7 +14359,7 @@ bool SemaOpenMP::checkTransformableLoopSequence( // Modularized code for handling regular canonical loops auto handleRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, - &LoopSeqSize, &NumLoops, Kind, &TmpDSA, &NLCV, + &LoopSeqSize, &NumLoops, Kind, &TmpDSA, this](Stmt *Child) { OriginalInits.emplace_back(); LoopHelpers.emplace_back(); @@ -14373,8 +14372,11 @@ bool SemaOpenMP::checkTransformableLoopSequence( << getOpenMPDirectiveName(Kind); return false; } + storeLoopStatements(Child); - NumLoops += NLCV.TraverseStmt(Child); + auto NLCV = NestedLoopCounterVisitor(); + NLCV.TraverseStmt(Child); + NumLoops += NLCV.getNestedLoopCount(); return true; }; @@ -15686,6 +15688,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, Stmt *AStmt, SourceLocation StartLoc, SourceLocation EndLoc) { + ASTContext &Context = getASTContext(); DeclContext *CurrContext = SemaRef.CurContext; Scope *CurScope = SemaRef.getCurScope(); @@ -15702,7 +15705,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, SmallVector> OriginalInits; unsigned NumLoops; - // TODO: Support looprange clause using LoopSeqSize unsigned LoopSeqSize; if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, LoopHelpers, LoopStmts, OriginalInits, @@ -15711,10 +15713,67 @@ StmtResult 
SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, } // Defer transformation in dependent contexts + // The NumLoopNests argument is set to a placeholder (0) + // because a dependent context could prevent determining its true value if (CurrContext->isDependentContext()) { return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, - NumLoops, 1, AStmt, nullptr, nullptr); + NumLoops, 0, AStmt, nullptr, nullptr); } + + // Handle clauses, which can be any of the following: [looprange, apply] + const OMPLoopRangeClause *LRC = + OMPExecutableDirective::getSingleClause(Clauses); + + // The clause arguments are invalidated if any error arises + // such as non-constant or non-positive arguments + if (LRC && (!LRC->getFirst() || !LRC->getCount())) + return StmtError(); + + // Delayed semantic check of LoopRange constraint + // Evaluates the loop range arguments and returns the first and count values + auto EvaluateLoopRangeArguments = [&Context](Expr *First, Expr *Count, + uint64_t &FirstVal, + uint64_t &CountVal) { + llvm::APSInt FirstInt = First->EvaluateKnownConstInt(Context); + llvm::APSInt CountInt = Count->EvaluateKnownConstInt(Context); + FirstVal = FirstInt.getZExtValue(); + CountVal = CountInt.getZExtValue(); + }; + + // Checks if the loop range is valid + auto ValidLoopRange = [](uint64_t FirstVal, uint64_t CountVal, + unsigned NumLoops) -> bool { + return FirstVal + CountVal - 1 <= NumLoops; + }; + uint64_t FirstVal = 1, CountVal = 0, LastVal = LoopSeqSize; + + if (LRC) { + EvaluateLoopRangeArguments(LRC->getFirst(), LRC->getCount(), FirstVal, + CountVal); + if (CountVal == 1) + SemaRef.Diag(LRC->getCountLoc(), diag::warn_omp_redundant_fusion) + << getOpenMPDirectiveName(OMPD_fuse); + + if (!ValidLoopRange(FirstVal, CountVal, LoopSeqSize)) { + SemaRef.Diag(LRC->getFirstLoc(), diag::err_omp_invalid_looprange) + << getOpenMPDirectiveName(OMPD_fuse) << (FirstVal + CountVal - 1) + << LoopSeqSize; + return StmtError(); + } + + LastVal = FirstVal + CountVal 
- 1; + } + + // Complete fusion generates a single canonical loop nest + // However looprange clause generates several loop nests + unsigned NumLoopNests = LRC ? LoopSeqSize - CountVal + 1 : 1; + + // Emit a warning for redundant loop fusion when the sequence contains only + // one loop. + if (LoopSeqSize == 1) + SemaRef.Diag(AStmt->getBeginLoc(), diag::warn_omp_redundant_fusion) + << getOpenMPDirectiveName(OMPD_fuse); + assert(LoopHelpers.size() == LoopSeqSize && "Expecting loop iteration space dimensionality to match number of " "affected loops"); @@ -15728,8 +15787,8 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, SmallVector PreInits; // Select the type with the largest bit width among all induction variables - QualType IVType = LoopHelpers[0].IterationVarRef->getType(); - for (unsigned int I = 1; I < LoopSeqSize; ++I) { + QualType IVType = LoopHelpers[FirstVal - 1].IterationVarRef->getType(); + for (unsigned int I = FirstVal; I < LastVal; ++I) { QualType CurrentIVType = LoopHelpers[I].IterationVarRef->getType(); if (Context.getTypeSize(CurrentIVType) > Context.getTypeSize(IVType)) { IVType = CurrentIVType; @@ -15778,20 +15837,21 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // Process each single loop to generate and collect declarations // and statements for all helper expressions - for (unsigned int I = 0; I < LoopSeqSize; ++I) { + for (unsigned int I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], PreInits); - auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", I); - auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", I); - auto [STVD, STDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", I); + auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", J); + auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", J); + auto [STVD, STDStmt] = 
CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", J); auto [NIVD, NIDStmt] = - CreateHelperVarAndStmt(LoopHelpers[I].NumIterations, "ni", I, true); + CreateHelperVarAndStmt(LoopHelpers[I].NumIterations, "ni", J, true); auto [IVVD, IVDStmt] = - CreateHelperVarAndStmt(LoopHelpers[I].IterationVarRef, "iv", I); + CreateHelperVarAndStmt(LoopHelpers[I].IterationVarRef, "iv", J); if (!LBVD || !STVD || !NIVD || !IVVD) - return StmtError(); + assert(LBVD && STVD && NIVD && IVVD && + "OpenMP Fuse Helper variables creation failed"); UBVarDecls.push_back(UBVD); LBVarDecls.push_back(LBVD); @@ -15866,8 +15926,9 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // omp.fuse.max = max(omp.temp1, omp.temp0) ExprResult MaxExpr; - for (unsigned I = 0; I < LoopSeqSize; ++I) { - DeclRefExpr *NIRef = MakeVarDeclRef(NIVarDecls[I]); + // I is the true + for (unsigned I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { + DeclRefExpr *NIRef = MakeVarDeclRef(NIVarDecls[J]); QualType NITy = NIRef->getType(); if (MaxExpr.isUnset()) { @@ -15875,7 +15936,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, MaxExpr = NIRef; } else { // Create a new acummulator variable t_i = MaxExpr - std::string TempName = (Twine(".omp.temp.") + Twine(I)).str(); + std::string TempName = (Twine(".omp.temp.") + Twine(J)).str(); VarDecl *TempDecl = buildVarDecl(SemaRef, {}, NITy, TempName, nullptr, nullptr); TempDecl->setInit(MaxExpr.get()); @@ -15898,7 +15959,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, if (!Comparison.isUsable()) return StmtError(); - DeclRefExpr *NIRef2 = MakeVarDeclRef(NIVarDecls[I]); + DeclRefExpr *NIRef2 = MakeVarDeclRef(NIVarDecls[J]); // Update MaxExpr using a conditional expression to hold the max value MaxExpr = new (Context) ConditionalOperator( Comparison.get(), SourceLocation(), TempRef2, SourceLocation(), @@ -15951,23 +16012,21 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, CompoundStmt *FusedBody = 
nullptr; SmallVector FusedBodyStmts; - for (unsigned I = 0; I < LoopSeqSize; ++I) { - + for (unsigned I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { // Assingment of the original sub-loop index to compute the logical index // IV_k = LB_k + omp.fuse.index * ST_k - ExprResult IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Mul, - MakeVarDeclRef(STVarDecls[I]), MakeIVRef()); + MakeVarDeclRef(STVarDecls[J]), MakeIVRef()); if (!IdxExpr.isUsable()) return StmtError(); IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Add, - MakeVarDeclRef(LBVarDecls[I]), IdxExpr.get()); + MakeVarDeclRef(LBVarDecls[J]), IdxExpr.get()); if (!IdxExpr.isUsable()) return StmtError(); IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Assign, - MakeVarDeclRef(IVVarDecls[I]), IdxExpr.get()); + MakeVarDeclRef(IVVarDecls[J]), IdxExpr.get()); if (!IdxExpr.isUsable()) return StmtError(); @@ -15982,7 +16041,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, Stmt *Body = (isa(LoopStmts[I])) ? cast(LoopStmts[I])->getBody() : cast(LoopStmts[I])->getBody(); - BodyStmts.push_back(Body); CompoundStmt *CombinedBody = @@ -15990,7 +16048,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, SourceLocation(), SourceLocation()); ExprResult Condition = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_LT, MakeIVRef(), - MakeVarDeclRef(NIVarDecls[I])); + MakeVarDeclRef(NIVarDecls[J])); if (!Condition.isUsable()) return StmtError(); @@ -16011,8 +16069,26 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, FusedBody, InitStmt.get()->getBeginLoc(), SourceLocation(), IncrExpr.get()->getEndLoc()); + // In the case of looprange, the result of fuse won't simply + // be a single loop (ForStmt), but rather a loop sequence + // (CompoundStmt) of 3 parts: the pre-fusion loops, the fused loop + // and the post-fusion loops, preserving its original order. 
+ Stmt *FusionStmt = FusedForStmt; + if (LRC) { + SmallVector FinalLoops; + // Gather all the pre-fusion loops + for (unsigned I = 0; I < FirstVal - 1; ++I) + FinalLoops.push_back(LoopStmts[I]); + // Gather the fused loop + FinalLoops.push_back(FusedForStmt); + // Gather all the post-fusion loops + for (unsigned I = FirstVal + CountVal - 1; I < LoopSeqSize; ++I) + FinalLoops.push_back(LoopStmts[I]); + FusionStmt = CompoundStmt::Create(Context, FinalLoops, FPOptionsOverride(), + SourceLocation(), SourceLocation()); + } return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, NumLoops, - 1, AStmt, FusedForStmt, + NumLoopNests, AStmt, FusionStmt, buildPreInits(Context, PreInits)); } @@ -17128,6 +17204,31 @@ OMPClause *SemaOpenMP::ActOnOpenMPPartialClause(Expr *FactorExpr, FactorExpr); } +OMPClause *SemaOpenMP::ActOnOpenMPLoopRangeClause( + Expr *First, Expr *Count, SourceLocation StartLoc, SourceLocation LParenLoc, + SourceLocation FirstLoc, SourceLocation CountLoc, SourceLocation EndLoc) { + + // OpenMP [6.0, Restrictions] + // First and Count must be integer expressions with positive value + ExprResult FirstVal = + VerifyPositiveIntegerConstantInClause(First, OMPC_looprange); + if (FirstVal.isInvalid()) + First = nullptr; + + ExprResult CountVal = + VerifyPositiveIntegerConstantInClause(Count, OMPC_looprange); + if (CountVal.isInvalid()) + Count = nullptr; + + // OpenMP [6.0, Restrictions] + // first + count - 1 must not evaluate to a value greater than the + // loop sequence length of the associated canonical loop sequence. 
+ // This check must be performed afterwards due to the delayed + // parsing and computation of the associated loop sequence + return OMPLoopRangeClause::Create(getASTContext(), StartLoc, LParenLoc, + FirstLoc, CountLoc, EndLoc, First, Count); +} + OMPClause *SemaOpenMP::ActOnOpenMPAlignClause(Expr *A, SourceLocation StartLoc, SourceLocation LParenLoc, SourceLocation EndLoc) { diff --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h index 39082e06a5a0b..68b51bb3c19c5 100644 --- a/clang/lib/Sema/TreeTransform.h +++ b/clang/lib/Sema/TreeTransform.h @@ -1775,6 +1775,14 @@ class TreeTransform { LParenLoc, EndLoc); } + OMPClause * + RebuildOMPLoopRangeClause(Expr *First, Expr *Count, SourceLocation StartLoc, + SourceLocation LParenLoc, SourceLocation FirstLoc, + SourceLocation CountLoc, SourceLocation EndLoc) { + return getSema().OpenMP().ActOnOpenMPLoopRangeClause( + First, Count, StartLoc, LParenLoc, FirstLoc, CountLoc, EndLoc); + } + /// Build a new OpenMP 'allocator' clause. /// /// By default, performs semantic analysis to build the new OpenMP clause. 
@@ -10566,6 +10574,31 @@ TreeTransform<Derived>::TransformOMPPartialClause(OMPPartialClause *C) { C->getEndLoc()); } +template <typename Derived> +OMPClause * +TreeTransform<Derived>::TransformOMPLoopRangeClause(OMPLoopRangeClause *C) { + ExprResult F = getDerived().TransformExpr(C->getFirst()); + if (F.isInvalid()) + return nullptr; + + ExprResult Cn = getDerived().TransformExpr(C->getCount()); + if (Cn.isInvalid()) + return nullptr; + + Expr *First = F.get(); + Expr *Count = Cn.get(); + + bool Changed = (First != C->getFirst()) || (Count != C->getCount()); + + // If no changes and AlwaysRebuild() is false, return the original clause + if (!Changed && !getDerived().AlwaysRebuild()) + return C; + + return RebuildOMPLoopRangeClause(First, Count, C->getBeginLoc(), + C->getLParenLoc(), C->getFirstLoc(), + C->getCountLoc(), C->getEndLoc()); +} + template <typename Derived> OMPClause * TreeTransform<Derived>::TransformOMPCollapseClause(OMPCollapseClause *C) { diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp index a17d6229ee3a1..b5aa729bf717b 100644 --- a/clang/lib/Serialization/ASTReader.cpp +++ b/clang/lib/Serialization/ASTReader.cpp @@ -11086,6 +11086,9 @@ OMPClause *OMPClauseReader::readClause() { case llvm::omp::OMPC_partial: C = OMPPartialClause::CreateEmpty(Context); break; + case llvm::omp::OMPC_looprange: + C = OMPLoopRangeClause::CreateEmpty(Context); + break; case llvm::omp::OMPC_allocator: C = new (Context) OMPAllocatorClause(); break; @@ -11487,6 +11490,14 @@ void OMPClauseReader::VisitOMPPartialClause(OMPPartialClause *C) { C->setLParenLoc(Record.readSourceLocation()); } +void OMPClauseReader::VisitOMPLoopRangeClause(OMPLoopRangeClause *C) { + C->setFirst(Record.readSubExpr()); + C->setCount(Record.readSubExpr()); + C->setLParenLoc(Record.readSourceLocation()); + C->setFirstLoc(Record.readSourceLocation()); + C->setCountLoc(Record.readSourceLocation()); +} + void OMPClauseReader::VisitOMPAllocatorClause(OMPAllocatorClause *C) { C->setAllocator(Record.readExpr()); 
C->setLParenLoc(Record.readSourceLocation()); diff --git a/clang/lib/Serialization/ASTReaderStmt.cpp b/clang/lib/Serialization/ASTReaderStmt.cpp index aee052404874c..90a058629f19d 100644 --- a/clang/lib/Serialization/ASTReaderStmt.cpp +++ b/clang/lib/Serialization/ASTReaderStmt.cpp @@ -3621,7 +3621,9 @@ Stmt *ASTReader::ReadStmtFromStream(ModuleFile &F) { case STMT_OMP_FUSE_DIRECTIVE: { unsigned NumLoops = Record[ASTStmtReader::NumStmtFields]; unsigned NumClauses = Record[ASTStmtReader::NumStmtFields + 1]; - S = OMPFuseDirective::CreateEmpty(Context, NumClauses, NumLoops); + unsigned NumLoopNests = Record[ASTStmtReader::NumStmtFields + 2]; + S = OMPFuseDirective::CreateEmpty(Context, NumClauses, NumLoops, + NumLoopNests); break; } diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp index cccf53de25882..33e1918f8fd91 100644 --- a/clang/lib/Serialization/ASTWriter.cpp +++ b/clang/lib/Serialization/ASTWriter.cpp @@ -7785,6 +7785,14 @@ void OMPClauseWriter::VisitOMPPartialClause(OMPPartialClause *C) { Record.AddSourceLocation(C->getLParenLoc()); } +void OMPClauseWriter::VisitOMPLoopRangeClause(OMPLoopRangeClause *C) { + Record.AddStmt(C->getFirst()); + Record.AddStmt(C->getCount()); + Record.AddSourceLocation(C->getLParenLoc()); + Record.AddSourceLocation(C->getFirstLoc()); + Record.AddSourceLocation(C->getCountLoc()); +} + void OMPClauseWriter::VisitOMPAllocatorClause(OMPAllocatorClause *C) { Record.AddStmt(C->getAllocator()); Record.AddSourceLocation(C->getLParenLoc()); diff --git a/clang/test/OpenMP/fuse_ast_print.cpp b/clang/test/OpenMP/fuse_ast_print.cpp index 43ce815dab024..ac4f0d38a9c68 100644 --- a/clang/test/OpenMP/fuse_ast_print.cpp +++ b/clang/test/OpenMP/fuse_ast_print.cpp @@ -271,6 +271,73 @@ void foo7() { } +// PRINT-LABEL: void foo8( +// DUMP-LABEL: FunctionDecl {{.*}} foo8 +void foo8() { + // PRINT: #pragma omp fuse looprange(2,2) + // DUMP: OMPFuseDirective + // DUMP: OMPLooprangeClause + #pragma omp fuse 
looprange(2,2) + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + + } + +} + +//PRINT-LABEL: void foo9( +//DUMP-LABEL: FunctionTemplateDecl {{.*}} foo9 +//DUMP-LABEL: NonTypeTemplateParmDecl {{.*}} F +//DUMP-LABEL: NonTypeTemplateParmDecl {{.*}} C +template +void foo9() { + // PRINT: #pragma omp fuse looprange(F,C) + // DUMP: OMPFuseDirective + // DUMP: OMPLooprangeClause + #pragma omp fuse looprange(F,C) + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + + } +} + +// Also test instantiating the template. 
+void tfoo9() { + foo9<1, 2>(); +} + diff --git a/clang/test/OpenMP/fuse_codegen.cpp b/clang/test/OpenMP/fuse_codegen.cpp index 6c1e21092da43..d9500bed3ce31 100644 --- a/clang/test/OpenMP/fuse_codegen.cpp +++ b/clang/test/OpenMP/fuse_codegen.cpp @@ -53,6 +53,18 @@ extern "C" void foo3() { } } +extern "C" void foo4() { + double arr[256]; + + #pragma omp fuse looprange(2,2) + { + for(int i = 0; i < 128; ++i) body(i); + for(int j = 0; j < 256; j+=2) body(j); + for(int k = 0; k < 64; ++k) body(k); + for(int c = 42; auto &&v: arr) body(c,v); + } +} + #endif // CHECK1-LABEL: define dso_local void @body( @@ -777,6 +789,157 @@ extern "C" void foo3() { // CHECK1-NEXT: ret void // // +// CHECK1-LABEL: define dso_local void @foo4( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// 
CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK1-NEXT: store i32 63, ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP5:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP5]], 128 +// CHECK1-NEXT: br i1 [[CMP1]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP6]]) +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP7:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND2:.*]] +// CHECK1: [[FOR_COND2]]: +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP3:%.*]] = icmp slt i32 [[TMP8]], [[TMP9]] +// CHECK1-NEXT: br i1 [[CMP3]], label %[[FOR_BODY4:.*]], label %[[FOR_END17:.*]] +// CHECK1: [[FOR_BODY4]]: +// CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP5:%.*]] = icmp slt i32 [[TMP10]], [[TMP11]] +// CHECK1-NEXT: br i1 [[CMP5]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP13]], [[TMP14]] +// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP12]], [[MUL]] +// CHECK1-NEXT: store i32 [[ADD]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[MUL6:%.*]] = mul nsw i32 [[TMP15]], 2 +// CHECK1-NEXT: [[ADD7:%.*]] = add nsw i32 0, [[MUL6]] +// CHECK1-NEXT: store i32 [[ADD7]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP16]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP8:%.*]] = icmp slt i32 [[TMP17]], [[TMP18]] +// CHECK1-NEXT: br i1 [[CMP8]], label %[[IF_THEN9:.*]], label %[[IF_END14:.*]] +// CHECK1: [[IF_THEN9]]: +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL10:%.*]] = mul nsw i32 [[TMP20]], [[TMP21]] +// CHECK1-NEXT: [[ADD11:%.*]] = add nsw i32 [[TMP19]], [[MUL10]] +// CHECK1-NEXT: store i32 [[ADD11]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[MUL12:%.*]] = mul nsw i32 [[TMP22]], 1 +// CHECK1-NEXT: [[ADD13:%.*]] = add nsw i32 0, [[MUL12]] +// CHECK1-NEXT: store i32 [[ADD13]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[K]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP23]]) +// CHECK1-NEXT: br label %[[IF_END14]] +// CHECK1: [[IF_END14]]: +// CHECK1-NEXT: br label %[[FOR_INC15:.*]] +// CHECK1: [[FOR_INC15]]: +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC16:%.*]] = add nsw i32 [[TMP24]], 1 +// CHECK1-NEXT: store i32 [[INC16]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND2]], !llvm.loop [[LOOP8:![0-9]+]] +// CHECK1: [[FOR_END17]]: +// CHECK1-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[TMP25:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP25]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP26:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY18:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP26]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY18]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND19:.*]] +// CHECK1: [[FOR_COND19]]: +// CHECK1-NEXT: [[TMP27:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP28:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK1-NEXT: [[CMP20:%.*]] = icmp ne ptr [[TMP27]], [[TMP28]] +// CHECK1-NEXT: br i1 [[CMP20]], label %[[FOR_BODY21:.*]], label %[[FOR_END23:.*]] +// CHECK1: [[FOR_BODY21]]: +// CHECK1-NEXT: [[TMP29:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP29]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP32:%.*]] = load double, ptr [[TMP31]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP30]], double noundef [[TMP32]]) +// CHECK1-NEXT: br label %[[FOR_INC22:.*]] +// CHECK1: [[FOR_INC22]]: +// CHECK1-NEXT: [[TMP33:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP33]], i32 1 +// CHECK1-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND19]] +// CHECK1: [[FOR_END23]]: +// CHECK1-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @body( // CHECK2-SAME: ...) #[[ATTR0:[0-9]+]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -1259,6 +1422,157 @@ extern "C" void foo3() { // CHECK2-NEXT: ret void // // +// CHECK2-LABEL: define dso_local void @foo4( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[V:%.*]] = alloca 
ptr, align 8 +// CHECK2-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK2-NEXT: store i32 63, ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP5]], 128 +// CHECK2-NEXT: br i1 [[CMP1]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP6]]) +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND2:.*]] +// CHECK2: [[FOR_COND2]]: +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP3:%.*]] = icmp slt i32 [[TMP8]], [[TMP9]] +// CHECK2-NEXT: br i1 [[CMP3]], label %[[FOR_BODY4:.*]], label %[[FOR_END17:.*]] +// CHECK2: [[FOR_BODY4]]: +// CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP5:%.*]] = icmp slt i32 [[TMP10]], [[TMP11]] +// CHECK2-NEXT: br i1 [[CMP5]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP13]], [[TMP14]] +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP12]], [[MUL]] +// CHECK2-NEXT: store i32 [[ADD]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL6:%.*]] = mul nsw i32 [[TMP15]], 2 +// CHECK2-NEXT: [[ADD7:%.*]] = add nsw i32 0, [[MUL6]] +// CHECK2-NEXT: store i32 [[ADD7]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP16]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP8:%.*]] = icmp slt i32 [[TMP17]], [[TMP18]] +// CHECK2-NEXT: br i1 [[CMP8]], label %[[IF_THEN9:.*]], label %[[IF_END14:.*]] +// CHECK2: [[IF_THEN9]]: +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL10:%.*]] = mul nsw i32 [[TMP20]], [[TMP21]] +// CHECK2-NEXT: [[ADD11:%.*]] = add nsw i32 [[TMP19]], [[MUL10]] +// CHECK2-NEXT: store i32 [[ADD11]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL12:%.*]] = mul nsw i32 [[TMP22]], 1 +// CHECK2-NEXT: [[ADD13:%.*]] = add nsw i32 0, [[MUL12]] +// CHECK2-NEXT: store i32 [[ADD13]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP23]]) +// CHECK2-NEXT: br label %[[IF_END14]] +// CHECK2: [[IF_END14]]: +// CHECK2-NEXT: br label %[[FOR_INC15:.*]] +// CHECK2: [[FOR_INC15]]: +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC16:%.*]] = add nsw i32 [[TMP24]], 1 +// CHECK2-NEXT: store i32 [[INC16]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND2]], !llvm.loop [[LOOP7:![0-9]+]] +// CHECK2: [[FOR_END17]]: +// CHECK2-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[TMP25:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP25]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP26:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY18:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP26]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY18]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND19:.*]] +// CHECK2: [[FOR_COND19]]: +// CHECK2-NEXT: [[TMP27:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP28:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: [[CMP20:%.*]] = icmp ne ptr [[TMP27]], [[TMP28]] +// CHECK2-NEXT: br i1 [[CMP20]], label %[[FOR_BODY21:.*]], label %[[FOR_END23:.*]] +// CHECK2: [[FOR_BODY21]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP29]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP32:%.*]] = load double, ptr [[TMP31]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP30]], double noundef [[TMP32]]) +// CHECK2-NEXT: br label %[[FOR_INC22:.*]] +// CHECK2: [[FOR_INC22]]: +// CHECK2-NEXT: [[TMP33:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP33]], i32 1 +// CHECK2-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND19]] +// CHECK2: [[FOR_END23]]: +// CHECK2-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @tfoo2( // CHECK2-SAME: ) #[[ATTR0]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -1494,7 +1808,7 @@ extern "C" void foo3() { // CHECK2-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 // CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP8:![0-9]+]] // CHECK2: [[FOR_END]]: // CHECK2-NEXT: ret void // @@ -1503,9 +1817,13 @@ extern "C" void foo3() { // CHECK1: [[META4]] = !{!"llvm.loop.mustprogress"} // CHECK1: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} // CHECK1: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +// CHECK1: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]} +// CHECK1: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]} //. // CHECK2: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]} // CHECK2: [[META4]] = !{!"llvm.loop.mustprogress"} // CHECK2: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} // CHECK2: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +// CHECK2: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]} +// CHECK2: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]} //. 
diff --git a/clang/test/OpenMP/fuse_messages.cpp b/clang/test/OpenMP/fuse_messages.cpp index 50dedfd2c0dc6..2a2491d008a0b 100644 --- a/clang/test/OpenMP/fuse_messages.cpp +++ b/clang/test/OpenMP/fuse_messages.cpp @@ -33,6 +33,8 @@ void func() { { for (int i = 0; i < 7; ++i) ; + for(int j = 0; j < 100; ++j); + } @@ -41,6 +43,8 @@ void func() { { for (int i = 0; i < 7; ++i) ; + for(int j = 0; j < 100; ++j); + } //expected-error at +4 {{loop after '#pragma omp fuse' is not in canonical form}} @@ -50,6 +54,7 @@ void func() { for(int i = 0; i < 10; i*=2) { ; } + for(int j = 0; j < 100; ++j); } //expected-error at +2 {{loop sequence after '#pragma omp fuse' must contain at least 1 canonical loop or loop-generating construct}} @@ -73,4 +78,109 @@ void func() { for(unsigned int j = 0; j < 10; ++j); for(long long k = 0; k < 100; ++k); } -} \ No newline at end of file + + //expected-warning at +2 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}} + #pragma omp fuse + { + for(int i = 0; i < 10; ++i); + } + + //expected-warning at +1 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}} + #pragma omp fuse looprange(1, 1) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + } + + //expected-error at +1 {{argument to 'looprange' clause must be a strictly positive integer value}} + #pragma omp fuse looprange(1, -1) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + } + + //expected-error at +1 {{argument to 'looprange' clause must be a strictly positive integer value}} + #pragma omp fuse looprange(1, 0) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + } + + const int x = 1; + constexpr int y = 4; + //expected-error at +1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '4' is greater than the total number of loops '3'}} + #pragma omp fuse looprange(x,y) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j 
< 100; ++j); + for(int k = 0; k < 50; ++k); + } + + //expected-error at +1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '420' is greater than the total number of loops '3'}} + #pragma omp fuse looprange(1,420) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } +} + +// In a template context, but expression itself not instantiation-dependent +template +static void templated_func() { + + //expected-warning at +1 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}} + #pragma omp fuse looprange(2,1) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } + + //expected-error at +1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '5' is greater than the total number of loops '3'}} + #pragma omp fuse looprange(3,3) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } + +} + +template +static void templated_func_value_dependent() { + + //expected-warning at +1 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}} + #pragma omp fuse looprange(V,1) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } +} + +template +static void templated_func_type_dependent() { + constexpr T s = 1; + + //expected-error at +1 {{argument to 'looprange' clause must be a strictly positive integer value}} + #pragma omp fuse looprange(s,s-1) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } +} + + +void template_inst() { + // expected-note at +1 {{in instantiation of function template specialization 'templated_func' requested here}} + templated_func(); + // expected-note at +1 {{in instantiation of function template specialization 'templated_func_value_dependent<1>' requested here}} + 
templated_func_value_dependent<1>(); + // expected-note at +1 {{in instantiation of function template specialization 'templated_func_type_dependent' requested here}} + templated_func_type_dependent(); + +} + + diff --git a/clang/tools/libclang/CIndex.cpp b/clang/tools/libclang/CIndex.cpp index 80020763961fc..a8ca488a8dc0b 100644 --- a/clang/tools/libclang/CIndex.cpp +++ b/clang/tools/libclang/CIndex.cpp @@ -2412,6 +2412,11 @@ void OMPClauseEnqueue::VisitOMPPartialClause(const OMPPartialClause *C) { Visitor->AddStmt(C->getFactor()); } +void OMPClauseEnqueue::VisitOMPLoopRangeClause(const OMPLoopRangeClause *C) { + Visitor->AddStmt(C->getFirst()); + Visitor->AddStmt(C->getCount()); +} + void OMPClauseEnqueue::VisitOMPAllocatorClause(const OMPAllocatorClause *C) { Visitor->AddStmt(C->getAllocator()); } diff --git a/llvm/include/llvm/Frontend/OpenMP/ClauseT.h b/llvm/include/llvm/Frontend/OpenMP/ClauseT.h index e0714e812e5cd..dd51274c1aaf5 100644 --- a/llvm/include/llvm/Frontend/OpenMP/ClauseT.h +++ b/llvm/include/llvm/Frontend/OpenMP/ClauseT.h @@ -1233,6 +1233,15 @@ struct WriteT { using EmptyTrait = std::true_type; }; +// V6: [6.4.7] Looprange clause +template struct LoopRangeT { + using Begin = E; + using End = E; + + using TupleTrait = std::true_type; + std::tuple t; +}; + // --- template @@ -1263,9 +1272,10 @@ using TupleClausesT = DefaultmapT, DeviceT, DistScheduleT, DoacrossT, FromT, GrainsizeT, IfT, InitT, InReductionT, - LastprivateT, LinearT, MapT, - NumTasksT, OrderT, ReductionT, - ScheduleT, TaskReductionT, ToT>; + LastprivateT, LinearT, LoopRangeT, + MapT, NumTasksT, OrderT, + ReductionT, ScheduleT, + TaskReductionT, ToT>; template using UnionClausesT = std::variant>; diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index f33b3b1532d3d..366cc7ef853d3 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -271,6 +271,9 @@ def OMPC_Linear : Clause<"linear"> { def 
OMPC_Link : Clause<"link"> { let flangClass = "OmpObjectList"; } +def OMPC_LoopRange : Clause<"looprange"> { + let clangClass = "OMPLoopRangeClause"; +} def OMPC_Map : Clause<"map"> { let clangClass = "OMPMapClause"; let flangClass = "OmpMapClause"; @@ -843,6 +846,9 @@ def OMP_For : Directive<"for"> { let category = CA_Executable; } def OMP_Fuse : Directive<"fuse"> { + let allowedOnceClauses = [ + VersionedClause + ]; let association = AS_Loop; let category = CA_Executable; } >From dbc440633099af24621b185036473333641bcc28 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:30:39 +0000 Subject: [PATCH 3/7] Addef fuse to documentation --- clang/docs/OpenMPSupport.rst | 2 ++ clang/docs/ReleaseNotes.rst | 1 + 2 files changed, 3 insertions(+) diff --git a/clang/docs/OpenMPSupport.rst b/clang/docs/OpenMPSupport.rst index d6507071d4693..5f0e363792b32 100644 --- a/clang/docs/OpenMPSupport.rst +++ b/clang/docs/OpenMPSupport.rst @@ -376,6 +376,8 @@ implementation. +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | loop stripe transformation | :good:`done` | https://github.com/llvm/llvm-project/pull/119891 | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ +| loop fuse transformation | :good:`done` | :none:`unclaimed` | | ++-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | work distribute construct | :none:`unclaimed` | :none:`unclaimed` | | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | 
task_iteration | :none:`unclaimed` | :none:`unclaimed` | | diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 1f0dbe565db6b..70fa866b8e5c9 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -901,6 +901,7 @@ OpenMP Support - Added support 'no_openmp_constructs' assumption clause. - Added support for 'self_maps' in map and requirement clause. - Added support for 'omp stripe' directive. +- Added support for 'omp fuse' directive. Improvements ^^^^^^^^^^^^ >From 00095f49f8180a9da6690954850162dd81dd0e54 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:43:41 +0000 Subject: [PATCH 4/7] Refactored preinits handling and improved coverage --- clang/docs/OpenMPSupport.rst | 2 +- clang/docs/ReleaseNotes.rst | 1 - clang/include/clang/AST/StmtOpenMP.h | 5 +- clang/include/clang/Sema/SemaOpenMP.h | 96 +- clang/lib/AST/StmtOpenMP.cpp | 13 + clang/lib/Basic/OpenMPKinds.cpp | 3 +- clang/lib/CodeGen/CGExpr.cpp | 2 + clang/lib/CodeGen/CodeGenFunction.h | 4 + clang/lib/Sema/SemaOpenMP.cpp | 588 ++++--- clang/test/OpenMP/fuse_ast_print.cpp | 55 + clang/test/OpenMP/fuse_codegen.cpp | 2117 +++++++++++++++---------- 11 files changed, 1862 insertions(+), 1024 deletions(-) diff --git a/clang/docs/OpenMPSupport.rst b/clang/docs/OpenMPSupport.rst index 5f0e363792b32..b39f9d3634a63 100644 --- a/clang/docs/OpenMPSupport.rst +++ b/clang/docs/OpenMPSupport.rst @@ -376,7 +376,7 @@ implementation. 
+-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | loop stripe transformation | :good:`done` | https://github.com/llvm/llvm-project/pull/119891 | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ -| loop fuse transformation | :good:`done` | :none:`unclaimed` | | +| loop fuse transformation | :good:`prototyped` | :none:`unclaimed` | | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | work distribute construct | :none:`unclaimed` | :none:`unclaimed` | | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 70fa866b8e5c9..1f0dbe565db6b 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -901,7 +901,6 @@ OpenMP Support - Added support 'no_openmp_constructs' assumption clause. - Added support for 'self_maps' in map and requirement clause. - Added support for 'omp stripe' directive. -- Added support for 'omp fuse' directive. 
Improvements ^^^^^^^^^^^^ diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index 85bde292ca748..b6a948a8c6020 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -1005,8 +1005,7 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { Stmt::StmtClass C = T->getStmtClass(); return C == OMPTileDirectiveClass || C == OMPUnrollDirectiveClass || C == OMPReverseDirectiveClass || C == OMPInterchangeDirectiveClass || - C == OMPStripeDirectiveClass || - C == OMPFuseDirectiveClass; + C == OMPStripeDirectiveClass || C == OMPFuseDirectiveClass; } }; @@ -5653,6 +5652,8 @@ class OMPStripeDirective final : public OMPLoopTransformationDirective { llvm::omp::OMPD_stripe, StartLoc, EndLoc, NumLoops) { setNumGeneratedLoops(2 * NumLoops); + // Similar to Tile, it only generates a single top level loop nest + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index f4a075e54cebe..ac4cbe3709a0d 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -1493,16 +1493,96 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, Stmt *&Body, SmallVectorImpl> &OriginalInits); - /// Analyzes and checks a loop sequence for use by a loop transformation + /// @brief Categories of loops encountered during semantic OpenMP loop + /// analysis + /// + /// This enumeration identifies the structural category of a loop or sequence + /// of loops analyzed in the context of OpenMP transformations and directives. + /// This categorization helps differentiate between original source loops + /// and the structures resulting from applying OpenMP loop transformations. 
+ enum class OMPLoopCategory { + + /// @var OMPLoopCategory::RegularLoop + /// Represents a standard canonical loop nest found in the + /// original source code or an intact loop after transformations + /// (i.e Post/Pre loops of a loopranged fusion) + RegularLoop, + + /// @var OMPLoopCategory::TransformSingleLoop + /// Represents the resulting loop structure when an OpenMP loop + // transformation, generates a single, top-level loop + TransformSingleLoop, + + /// @var OMPLoopCategory::TransformLoopSequence + /// Represents the resulting loop structure when an OpenMP loop + /// transformation + /// generates a sequence of two or more canonical loop nests + TransformLoopSequence + }; + + /// The main recursive process of `checkTransformableLoopSequence` that + /// performs grammatical parsing of a canonical loop sequence. It extracts + /// key information, such as the number of top-level loops, loop statements, + /// helper expressions, and other relevant loop-related data, all in a single + /// execution to avoid redundant traversals. This analysis flattens inner + /// Loop Sequences + /// + /// \param LoopSeqStmt The AST of the original statement. + /// \param LoopSeqSize [out] Number of top level canonical loops. + /// \param NumLoops [out] Number of total canonical loops (nested too). + /// \param LoopHelpers [out] The multiple loop analyses results. + /// \param ForStmts [out] The multiple Stmt of each For loop. 
+ /// \param OriginalInits [out] The raw original initialization statements + /// of each belonging to a loop of the loop sequence + /// \param TransformPreInits [out] The multiple collection of statements and + /// declarations that must have been executed/declared + /// before entering the loop (each belonging to a + /// particular loop transformation, nullptr otherwise) + /// \param LoopSequencePreInits [out] Additional general collection of loop + /// transformation related statements and declarations + /// not bounded to a particular loop that must be + /// executed before entering the loop transformation + /// \param LoopCategories [out] A sequence of OMPLoopCategory values, + /// one for each loop or loop transformation node + /// successfully analyzed. + /// \param Context + /// \param Kind The loop transformation directive kind. + /// \return Whether the original statement is both syntactically and + /// semantically correct according to OpenMP 6.0 canonical loop + /// sequence definition. + bool analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind); + + /// Validates and checks whether a loop sequence can be transformed according + /// to the given directive, providing necessary setup and initialization + /// (Driver function) before recursion using `analyzeLoopSequence`. /// /// \param Kind The loop transformation directive kind. - /// \param NumLoops [out] Number of total canonical loops - /// \param LoopSeqSize [out] Number of top level canonical loops + /// \param AStmt The AST of the original statement + /// \param LoopSeqSize [out] Number of top level canonical loops. 
+ /// \param NumLoops [out] Number of total canonical loops (nested too) /// \param LoopHelpers [out] The multiple loop analyses results. - /// \param LoopStmts [out] The multiple Stmt of each For loop. - /// \param OriginalInits [out] The multiple collection of statements and + /// \param ForStmts [out] The multiple Stmt of each For loop. + /// \param OriginalInits [out] The raw original initialization statements + /// of each belonging to a loop of the loop sequence + /// \param TransformsPreInits [out] The multiple collection of statements and /// declarations that must have been executed/declared - /// before entering the loop. + /// before entering the loop (each belonging to a + /// particular loop transformation, nullptr otherwise) + /// \param LoopSequencePreInits [out] Additional general collection of loop + /// transformation related statements and declarations + /// not bounded to a particular loop that must be + /// executed before entering the loop transformation + /// \param LoopCategories [out] A sequence of OMPLoopCategory values, + /// one for each loop or loop transformation node + /// successfully analyzed. /// \param Context /// \return Whether there was an absence of errors or not bool checkTransformableLoopSequence( @@ -1511,7 +1591,9 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, SmallVectorImpl> &OriginalInits, - ASTContext &Context); + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context); /// Helper to keep information about the current `omp begin/end declare /// variant` nesting. 
diff --git a/clang/lib/AST/StmtOpenMP.cpp b/clang/lib/AST/StmtOpenMP.cpp index 6a2ac64f4e40b..da00b8eeeb2d4 100644 --- a/clang/lib/AST/StmtOpenMP.cpp +++ b/clang/lib/AST/StmtOpenMP.cpp @@ -457,6 +457,8 @@ OMPUnrollDirective::Create(const ASTContext &C, SourceLocation StartLoc, C, Clauses, AssociatedStmt, TransformedStmtOffset + 1, StartLoc, EndLoc); Dir->setNumGeneratedLoops(NumGeneratedLoops); // The number of generated loops and loop nests during unroll matches + // given that unroll only generates top level canonical loop nests + // so each generated loop is a top level canonical loop nest Dir->setNumGeneratedLoopNests(NumGeneratedLoops); Dir->setTransformedStmt(TransformedStmt); Dir->setPreInits(PreInits); @@ -517,6 +519,17 @@ OMPFuseDirective *OMPFuseDirective::Create( NumLoops); Dir->setTransformedStmt(TransformedStmt); Dir->setPreInits(PreInits); + // The number of top level canonical nests could + // not match the total number of generated loops + // Example: + // Before fusion: + // for (int i = 0; i < N; ++i) + // for (int j = 0; j < M; ++j) + // A[i][j] = i + j; + // + // for (int k = 0; k < P; ++k) + // B[k] = k * 2; + // Here, NumLoopNests = 2, but NumLoops = 3. 
Dir->setNumGeneratedLoopNests(NumLoopNests); Dir->setNumGeneratedLoops(NumLoops); return Dir; diff --git a/clang/lib/Basic/OpenMPKinds.cpp b/clang/lib/Basic/OpenMPKinds.cpp index 3c62b61f3a438..12ebead63d9ba 100644 --- a/clang/lib/Basic/OpenMPKinds.cpp +++ b/clang/lib/Basic/OpenMPKinds.cpp @@ -704,7 +704,8 @@ bool clang::isOpenMPLoopBoundSharingDirective(OpenMPDirectiveKind Kind) { bool clang::isOpenMPLoopTransformationDirective(OpenMPDirectiveKind DKind) { return DKind == OMPD_tile || DKind == OMPD_unroll || DKind == OMPD_reverse || - DKind == OMPD_interchange || DKind == OMPD_stripe || DKind == OMPD_fuse; + DKind == OMPD_interchange || DKind == OMPD_stripe || + DKind == OMPD_fuse; } bool clang::isOpenMPCombinedParallelADirective(OpenMPDirectiveKind DKind) { diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 1a835c97decef..047c60bb07378 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -3223,6 +3223,8 @@ LValue CodeGenFunction::EmitDeclRefLValue(const DeclRefExpr *E) { // No other cases for now. } else { + llvm::dbgs() << "THE DAMN DECLREFEXPR HASN'T BEEN ENTERED IN LOCALDECLMAP\n"; + VD->dumpColor(); llvm_unreachable("DeclRefExpr for Decl not entered in LocalDeclMap?"); } diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h index 59cb4d9caa98d..fc98ca0b30a7f 100644 --- a/clang/lib/CodeGen/CodeGenFunction.h +++ b/clang/lib/CodeGen/CodeGenFunction.h @@ -5379,6 +5379,10 @@ class CodeGenFunction : public CodeGenTypeCache { /// Set the address of a local variable. 
void setAddrOfLocalVar(const VarDecl *VD, Address Addr) { + if (LocalDeclMap.count(VD)) { + llvm::errs() << "Warning: VarDecl already exists in map: "; + VD->dumpColor(); + } assert(!LocalDeclMap.count(VD) && "Decl already exists in LocalDeclMap!"); LocalDeclMap.insert({VD, Addr}); } diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index 8cd56d1af6ac8..30f8cd3087268 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -22,6 +22,7 @@ #include "clang/AST/DeclOpenMP.h" #include "clang/AST/DynamicRecursiveASTVisitor.h" #include "clang/AST/OpenMPClause.h" +#include "clang/AST/RecursiveASTVisitor.h" #include "clang/AST/StmtCXX.h" #include "clang/AST/StmtOpenMP.h" #include "clang/AST/StmtVisitor.h" @@ -47,6 +48,7 @@ #include "llvm/Frontend/OpenMP/OMPConstants.h" #include "llvm/IR/Assumptions.h" #include +#include using namespace clang; using namespace llvm::omp; @@ -14125,6 +14127,45 @@ StmtResult SemaOpenMP::ActOnOpenMPTargetTeamsDistributeSimdDirective( getASTContext(), StartLoc, EndLoc, NestedLoopCount, Clauses, AStmt, B); } +// Overloaded base case function +template +static bool tryHandleAs(T *t, F &&) { + return false; +} + +/** + * Tries to recursively cast `t` to one of the given types and invokes `f` if successful. + * + * @tparam Class The first type to check. + * @tparam Rest The remaining types to check. + * @tparam T The base type of `t`. + * @tparam F The callable type for the function to invoke upon a successful cast. + * @param t The object to be checked. + * @param f The function to invoke if `t` matches `Class`. + * @return `true` if `t` matched any type and `f` was called, otherwise `false`. + */ +template +static bool tryHandleAs(T *t, F &&f) { + if (Class *c = dyn_cast(t)) { + f(c); + return true; + } else { + return tryHandleAs(t, std::forward(f)); + } +} + +// Updates OriginalInits by checking Transform against loop transformation +// directives and appending their pre-inits if a match is found. 
+static void updatePreInits(OMPLoopBasedDirective *Transform, + SmallVectorImpl> &PreInits) { + if (!tryHandleAs( + Transform, [&PreInits](auto *Dir) { + appendFlattenedStmtList(PreInits.back(), Dir->getPreInits()); + })) + llvm_unreachable("Unhandled loop transformation"); +} + bool SemaOpenMP::checkTransformableLoopNest( OpenMPDirectiveKind Kind, Stmt *AStmt, int NumLoops, SmallVectorImpl &LoopHelpers, @@ -14155,121 +14196,106 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } -class NestedLoopCounterVisitor - : public clang::RecursiveASTVisitor { +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + public: - explicit NestedLoopCounterVisitor() : NestedLoopCount(0) {} + explicit NestedLoopCounterVisitor() {} - bool VisitForStmt(clang::ForStmt *FS) { - ++NestedLoopCount; - return true; + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; } - bool VisitCXXForRangeStmt(clang::CXXForRangeStmt *FRS) { - ++NestedLoopCount; - return true; + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; } - unsigned getNestedLoopCount() const { return NestedLoopCount; } + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; -private: - unsigned NestedLoopCount; + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. 
+ // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || + isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) + return true; + } }; -bool SemaOpenMP::checkTransformableLoopSequence( - OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, - unsigned &NumLoops, +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, SmallVectorImpl> &OriginalInits, - ASTContext &Context) { + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { - // Checks whether the given statement is a compound statement VarsWithInheritedDSAType TmpDSA; - if (!isa(AStmt)) { - Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) - << getOpenMPDirectiveName(Kind); - return false; - } - // Callback for updating pre-inits in case there are even more - // loop-sequence-generating-constructs inside of the main compound stmt - auto OnTransformationCallback = - [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) 
- DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); - }; - - // Number of top level canonical loop nests observed (And acts as index) - LoopSeqSize = 0; - // Number of total observed loops - NumLoops = 0; - - // Following OpenMP 6.0 API Specification, a Canonical Loop Sequence follows - // the grammar: - // - // canonical-loop-sequence: - // { - // loop-sequence+ - // } - // where loop-sequence can be any of the following: - // 1. canonical-loop-sequence - // 2. loop-nest - // 3. loop-sequence-generating-construct (i.e OMPLoopTransformationDirective) - // - // To recognise and traverse this structure the following helper functions - // have been defined. handleLoopSequence serves as the recurisve entry point - // and tries to match the input AST to the canonical loop sequence grammar - // structure - - // Helper functions to validate canonical loop sequence grammar is valid - auto isLoopSequenceDerivation = [](auto *Child) { - return isa(Child) || isa(Child) || - isa(Child); - }; - auto isLoopGeneratingStmt = [](auto *Child) { - return isa(Child); - }; - + QualType BaseInductionVarType; // Helper Lambda to handle storing initialization and body statements for both // ForStmt and CXXForRangeStmt and checks for any possible mismatch between // induction variables types - QualType BaseInductionVarType; auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, this, &Context](Stmt *LoopStmt) { if (auto *For = dyn_cast(LoopStmt)) { @@ -14292,33 +14318,35 @@ bool SemaOpenMP::checkTransformableLoopSequence( } } } - } else { - assert(isa(LoopStmt) && - "Expected canonical for or range-based for loops."); - auto *CXXFor = dyn_cast(LoopStmt); + auto *CXXFor = cast(LoopStmt); OriginalInits.back().push_back(CXXFor->getBeginStmt()); ForStmts.push_back(CXXFor); } }; + // Helper lambda functions to encapsulate the processing of different // 
derivations of the canonical loop sequence grammar // // Modularized code for handling loop generation and transformations - auto handleLoopGeneration = [&storeLoopStatements, &LoopHelpers, - &OriginalInits, &LoopSeqSize, &NumLoops, Kind, - &TmpDSA, &OnTransformationCallback, - this](Stmt *Child) { + auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &TransformsPreInits, + &LoopCategories, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, &ForStmts, &Context, + &LoopSequencePreInits, this](Stmt *Child) { auto LoopTransform = dyn_cast(Child); Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); - + unsigned NumGeneratedLoops = LoopTransform->getNumGeneratedLoops(); // Handle the case where transformed statement is not available due to // dependent contexts if (!TransformedStmt) { - if (NumGeneratedLoopNests > 0) + if (NumGeneratedLoopNests > 0) { + LoopSeqSize += NumGeneratedLoopNests; + NumLoops += NumGeneratedLoops; return true; - // Unroll full + } + // Unroll full (0 loops produced) else { Diag(Child->getBeginLoc(), diag::err_omp_not_for) << 0 << getOpenMPDirectiveName(Kind); @@ -14331,38 +14359,56 @@ bool SemaOpenMP::checkTransformableLoopSequence( Diag(Child->getBeginLoc(), diag::err_omp_not_for) << 0 << getOpenMPDirectiveName(Kind); return false; - // Future loop transformations that generate multiple canonical loops - } else if (NumGeneratedLoopNests > 1) { - llvm_unreachable("Multiple canonical loop generating transformations " - "like loop splitting are not yet supported"); } + // Loop transformatons such as split or loopranged fuse + else if (NumGeneratedLoopNests > 1) { + // Get the preinits related to this loop sequence generating + // loop transformation (i.e loopranged fuse, split...) 
+ LoopSequencePreInits.emplace_back(); + // These preinits differ slightly from regular inits/pre-inits related + // to single loop generating loop transformations (interchange, unroll) + // given that they are not bounded to a particular loop nest + // so they need to be treated independently + updatePreInits(LoopTransform, LoopSequencePreInits); + return analyzeLoopSequence(TransformedStmt, LoopSeqSize, NumLoops, + LoopHelpers, ForStmts, OriginalInits, + TransformsPreInits, LoopSequencePreInits, + LoopCategories, Context, Kind); + } + // Vast majority: (Tile, Unroll, Stripe, Reverse, Interchange, Fuse all) + else { + // Process the transformed loop statement + OriginalInits.emplace_back(); + TransformsPreInits.emplace_back(); + LoopHelpers.emplace_back(); + LoopCategories.push_back(OMPLoopCategory::TransformSingleLoop); + + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, TransformedStmt, SemaRef, + *DSAStack, TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(TransformedStmt->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + storeLoopStatements(TransformedStmt); + updatePreInits(LoopTransform, TransformsPreInits); - // Process the transformed loop statement - Child = TransformedStmt; - OriginalInits.emplace_back(); - LoopHelpers.emplace_back(); - OnTransformationCallback(LoopTransform); - - unsigned IsCanonical = - checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, - TmpDSA, LoopHelpers[LoopSeqSize]); - - if (!IsCanonical) { - Diag(Child->getBeginLoc(), diag::err_omp_not_canonical_loop) - << getOpenMPDirectiveName(Kind); - return false; + NumLoops += NumGeneratedLoops; + ++LoopSeqSize; + return true; } - storeLoopStatements(TransformedStmt); - NumLoops += LoopTransform->getNumGeneratedLoops(); - return true; }; // Modularized code for handling regular canonical loops - auto handleRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, - &LoopSeqSize, 
&NumLoops, Kind, &TmpDSA, - this](Stmt *Child) { + auto analyzeRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, + &LoopSeqSize, &NumLoops, Kind, &TmpDSA, + &LoopCategories, this](Stmt *Child) { OriginalInits.emplace_back(); LoopHelpers.emplace_back(); + LoopCategories.push_back(OMPLoopCategory::RegularLoop); + unsigned IsCanonical = checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, TmpDSA, LoopHelpers[LoopSeqSize]); @@ -14380,57 +14426,114 @@ bool SemaOpenMP::checkTransformableLoopSequence( return true; }; - // Helper function to process a Loop Sequence Recursively - auto handleLoopSequence = [&](Stmt *LoopSeqStmt, - auto &handleLoopSequenceCallback) -> bool { - for (auto *Child : LoopSeqStmt->children()) { - if (!Child) - continue; + // Helper functions to validate canonical loop sequence grammar is valid + auto isLoopSequenceDerivation = [](auto *Child) { + return isa(Child) || isa(Child) || + isa(Child); + }; + auto isLoopGeneratingStmt = [](auto *Child) { + return isa(Child); + }; + - // Skip over non-loop-sequence statements - if (!isLoopSequenceDerivation(Child)) { - Child = Child->IgnoreContainers(); + // High level grammar validation + for (auto *Child : LoopSeqStmt->children()) { - // Ignore empty compound statement if (!Child) - continue; + continue; - // In the case of a nested loop sequence ignoring containers would not - // be enough, a recurisve transversal of the loop sequence is required - if (isa(Child)) { - if (!handleLoopSequenceCallback(Child, handleLoopSequenceCallback)) - return false; - // Already been treated, skip this children - continue; + // Skip over non-loop-sequence statements + if (!isLoopSequenceDerivation(Child)) { + Child = Child->IgnoreContainers(); + + // Ignore empty compound statement + if (!Child) + continue; + + // In the case of a nested loop sequence ignoring containers would not + // be enough, a recurisve transversal of the loop sequence is required + if (isa(Child)) { + if 
(!analyzeLoopSequence(Child, LoopSeqSize, NumLoops, LoopHelpers, + ForStmts, OriginalInits, TransformsPreInits, + LoopSequencePreInits, LoopCategories, Context, + Kind)) + return false; + // Already been treated, skip this children + continue; + } + } + // Regular loop sequence handling + if (isLoopSequenceDerivation(Child)) { + if (isLoopGeneratingStmt(Child)) { + if (!analyzeLoopGeneration(Child)) { + return false; } + // analyzeLoopGeneration updates Loop Sequence size accordingly + + } else { + if (!analyzeRegularLoop(Child)) { + return false; + } + // Update the Loop Sequence size by one + ++LoopSeqSize; } - // Regular loop sequence handling - if (isLoopSequenceDerivation(Child)) { - if (isLoopGeneratingStmt(Child)) { - if (!handleLoopGeneration(Child)) { - return false; - } } else { - if (!handleRegularLoop(Child)) { - return false; - } + // Report error for invalid statement inside canonical loop sequence + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; } - ++LoopSeqSize; - } else { - // Report error for invalid statement inside canonical loop sequence - Diag(Child->getBeginLoc(), diag::err_omp_not_for) - << 0 << getOpenMPDirectiveName(Kind); + } + return true; +} + +bool SemaOpenMP::checkTransformableLoopSequence( + OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, + unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context) { + + // Checks whether the given statement is a compound statement + if (!isa(AStmt)) { + Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) + << getOpenMPDirectiveName(Kind); return false; - } - } - return true; - }; + } + // Number of top level canonical loop nests observed (And acts as index) + LoopSeqSize = 0; + // Number of total observed loops + NumLoops = 
0; + + // Following OpenMP 6.0 API Specification, a Canonical Loop Sequence follows + // the grammar: + // + // canonical-loop-sequence: + // { + // loop-sequence+ + // } + // where loop-sequence can be any of the following: + // 1. canonical-loop-sequence + // 2. loop-nest + // 3. loop-sequence-generating-construct (i.e OMPLoopTransformationDirective) + // + // To recognise and traverse this structure the following helper functions + // have been defined. analyzeLoopSequence serves as the recursive entry point + // and tries to match the input AST to the canonical loop sequence grammar + // structure. This function will perform both a semantic and syntactical + // analysis of the given statement according to OpenMP 6.0 definition of + // the aforementioned canonical loop sequence // Recursive entry point to process the main loop sequence - if (!handleLoopSequence(AStmt, handleLoopSequence)) { - return false; + if (!analyzeLoopSequence(AStmt, LoopSeqSize, NumLoops, LoopHelpers, ForStmts, + OriginalInits, TransformsPreInits, + LoopSequencePreInits, LoopCategories, Context, + Kind)) { + return false; } - if (LoopSeqSize <= 0) { Diag(AStmt->getBeginLoc(), diag::err_omp_empty_loop_sequence) << getOpenMPDirectiveName(Kind); @@ -14462,9 +14565,7 @@ static void addLoopPreInits(ASTContext &Context, RangeEnd->getBeginLoc(), RangeEnd->getEndLoc())); } - llvm::append_range(PreInits, OriginalInit); - // List of OMPCapturedExprDecl, for __begin, __end, and NumIterations if (auto *PI = cast_or_null(LoopHelper.PreInits)) { PreInits.push_back(new (Context) DeclStmt( @@ -15132,7 +15233,7 @@ StmtResult SemaOpenMP::ActOnOpenMPUnrollDirective(ArrayRef Clauses, Stmt *LoopStmt = nullptr; collectLoopStmts(AStmt, {LoopStmt}); - // Determine the PreInit declarations. 
+ // Determine the PreInit declarations.e SmallVector PreInits; addLoopPreInits(Context, LoopHelper, LoopStmt, OriginalInits[0], PreInits); @@ -15698,28 +15799,35 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, if (!AStmt) { return StmtError(); } + + unsigned NumLoops = 1; + unsigned LoopSeqSize = 1; + + // Defer transformation in dependent contexts + // The NumLoopNests argument is set to a placeholder 1 (even though + // using looprange fuse could yield up to 3 top level loop nests) + // because a dependent context could prevent determining its true value + if (CurrContext->isDependentContext()) { + return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, + NumLoops, LoopSeqSize, AStmt, nullptr, + nullptr); + } + // Validate that the potential loop sequence is transformable for fusion // Also collect the HelperExprs, Loop Stmts, Inits, and Number of loops SmallVector LoopHelpers; SmallVector LoopStmts; SmallVector> OriginalInits; - - unsigned NumLoops; - unsigned LoopSeqSize; + SmallVector> TransformsPreInits; + SmallVector> LoopSequencePreInits; + SmallVector LoopCategories; if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, LoopHelpers, LoopStmts, OriginalInits, - Context)) { + TransformsPreInits, LoopSequencePreInits, + LoopCategories, Context)) { return StmtError(); } - // Defer transformation in dependent contexts - // The NumLoopNests argument is set to a placeholder (0) - // because a dependent context could prevent determining its true value - if (CurrContext->isDependentContext()) { - return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, - NumLoops, 0, AStmt, nullptr, nullptr); - } - // Handle clauses, which can be any of the following: [looprange, apply] const OMPLoopRangeClause *LRC = OMPExecutableDirective::getSingleClause(Clauses); @@ -15781,11 +15889,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, "Expecting loop iteration space dimensionality to match 
number of " "affected loops"); - // PreInits hold a sequence of variable declarations that must be executed - // before the fused loop begins. These include bounds, strides, and other - // helper variables required for the transformation. - SmallVector PreInits; - // Select the type with the largest bit width among all induction variables QualType IVType = LoopHelpers[FirstVal - 1].IterationVarRef->getType(); for (unsigned int I = FirstVal; I < LastVal; ++I) { @@ -15797,7 +15900,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, uint64_t IVBitWidth = Context.getIntWidth(IVType); // Create pre-init declarations for all loops lower bounds, upper bounds, - // strides and num-iterations + // strides and num-iterations for every top level loop in the fusion SmallVector LBVarDecls; SmallVector STVarDecls; SmallVector NIVarDecls; @@ -15835,12 +15938,62 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, return std::make_pair(VD, DeclStmt); }; + // PreInits hold a sequence of variable declarations that must be executed + // before the fused loop begins. These include bounds, strides, and other + // helper variables required for the transformation. Other loop transforms + // also contain their own preinits + SmallVector PreInits; + // Iterator to keep track of loop transformations + unsigned int TransformIndex = 0; + + // Update the general preinits using the preinits generated by loop sequence + // generating loop transformations. These preinits differ slightly from + // single-loop transformation preinits, as they can be detached from a + // specific loop inside the multiple generated loop nests. This happens + // because certain helper variables, like '.omp.fuse.max', are introduced to + // handle fused iteration spaces and may not be directly tied to a single + // original loop. the preinit structure must ensure that hidden variables + // like '.omp.fuse.max' are still properly handled. 
+ // Transformations that apply this concept: Loopranged Fuse, Split + if (!LoopSequencePreInits.empty()) { + for (const auto &LTPreInits : LoopSequencePreInits) { + if (!LTPreInits.empty()) { + llvm::append_range(PreInits, LTPreInits); + } + } + } + // Process each single loop to generate and collect declarations - // and statements for all helper expressions + // and statements for all helper expressions related to + // particular single loop nests + + // Also In the case of the fused loops, we keep track of their original + // inits by appending them to their preinits statement, and in the case of + // transformations, also append their preinits (which contain the original + // loop initialization statement or other statements) + + // Firstly we need to update TransformIndex to match the beginning of the + // looprange section + for (unsigned int I = 0; I < FirstVal - 1; ++I) { + if (LoopCategories[I] == OMPLoopCategory::TransformSingleLoop) + ++TransformIndex; + } for (unsigned int I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { - addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], - PreInits); + if (LoopCategories[I] == OMPLoopCategory::RegularLoop) { + addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], + PreInits); + } else if (LoopCategories[I] == OMPLoopCategory::TransformSingleLoop) { + // For transformed loops, insert both pre-inits and original inits. + // Order matters: pre-inits may define variables used in the original + // inits such as upper bounds... 
+ auto TransformPreInit = TransformsPreInits[TransformIndex++]; + if (!TransformPreInit.empty()) { + llvm::append_range(PreInits, TransformPreInit); + } + addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], + PreInits); + } auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", J); auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", J); auto [STVD, STDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", J); @@ -15859,7 +16012,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, NIVarDecls.push_back(NIVD); IVVarDecls.push_back(IVVD); - PreInits.push_back(UBDStmt.get()); PreInits.push_back(LBDStmt.get()); PreInits.push_back(STDStmt.get()); PreInits.push_back(NIDStmt.get()); @@ -16035,6 +16187,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, BodyStmts.push_back(IdxExpr.get()); llvm::append_range(BodyStmts, LoopHelpers[I].Updates); + // If the loop is a CXXForRangeStmt then the iterator variable is needed if (auto *SourceCXXFor = dyn_cast(LoopStmts[I])) BodyStmts.push_back(SourceCXXFor->getLoopVarStmt()); @@ -16069,21 +16222,50 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, FusedBody, InitStmt.get()->getBeginLoc(), SourceLocation(), IncrExpr.get()->getEndLoc()); - // In the case of looprange, the result of fuse won't simply - // be a single loop (ForStmt), but rather a loop sequence - // (CompoundStmt) of 3 parts: the pre-fusion loops, the fused loop - // and the post-fusion loops, preserving its original order. + // In the case of looprange, the result of fuse won't simply + // be a single loop (ForStmt), but rather a loop sequence + // (CompoundStmt) of 3 parts: the pre-fusion loops, the fused loop + // and the post-fusion loops, preserving its original order. 
+ // + // Note: If looprange clause produces a single fused loop nest then + // this compound statement wrapper is unnecessary (Therefore this + // treatment is skipped) + Stmt *FusionStmt = FusedForStmt; - if (LRC) { + if (LRC && CountVal != LoopSeqSize) { SmallVector FinalLoops; - // Gather all the pre-fusion loops - for (unsigned I = 0; I < FirstVal - 1; ++I) - FinalLoops.push_back(LoopStmts[I]); - // Gather the fused loop - FinalLoops.push_back(FusedForStmt); - // Gather all the post-fusion loops - for (unsigned I = FirstVal + CountVal - 1; I < LoopSeqSize; ++I) + // Reset the transform index + TransformIndex = 0; + + // Collect all non-fused loops before and after the fused region. + // Pre-fusion and post-fusion loops are inserted in order exploiting their + // symmetry, along with their corresponding transformation pre-inits if + // needed. The fused loop is added between the two regions. + for (unsigned I = 0; I < LoopSeqSize; ++I) { + if (I >= FirstVal - 1 && I < FirstVal + CountVal - 1) { + // Update the Transformation counter to skip already treated + // loop transformations + if (LoopCategories[I] != OMPLoopCategory::TransformSingleLoop) + ++TransformIndex; + continue; + } + + // No need to handle: + // Regular loops: they are kept intact as-is. + // Loop-sequence-generating transformations: already handled earlier. 
+ // Only TransformSingleLoop requires inserting pre-inits here + + if (LoopCategories[I] == OMPLoopCategory::TransformSingleLoop) { + auto TransformPreInit = TransformsPreInits[TransformIndex++]; + if (!TransformPreInit.empty()) { + llvm::append_range(PreInits, TransformPreInit); + } + } + FinalLoops.push_back(LoopStmts[I]); + } + + FinalLoops.insert(FinalLoops.begin() + (FirstVal - 1), FusedForStmt); FusionStmt = CompoundStmt::Create(Context, FinalLoops, FPOptionsOverride(), SourceLocation(), SourceLocation()); } diff --git a/clang/test/OpenMP/fuse_ast_print.cpp b/clang/test/OpenMP/fuse_ast_print.cpp index ac4f0d38a9c68..9d85bd1172948 100644 --- a/clang/test/OpenMP/fuse_ast_print.cpp +++ b/clang/test/OpenMP/fuse_ast_print.cpp @@ -338,6 +338,61 @@ void tfoo9() { foo9<1, 2>(); } +// PRINT-LABEL: void foo10( +// DUMP-LABEL: FunctionDecl {{.*}} foo10 +void foo10() { + // PRINT: #pragma omp fuse looprange(2,2) + // DUMP: OMPFuseDirective + // DUMP: OMPLooprangeClause + #pragma omp fuse looprange(2,2) + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int ii = 0; ii < 10; ii += 2) + // DUMP: ForStmt + for (int ii = 0; ii < 10; ii += 2) + // PRINT: body(ii) + // DUMP: CallExpr + body(ii); + // PRINT: #pragma omp fuse looprange(2,2) + // DUMP: OMPFuseDirective + // DUMP: OMPLooprangeClause + #pragma omp fuse looprange(2,2) + { + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + // PRINT: for (int jj = 10; jj > 0; --jj) + // DUMP: ForStmt + for (int jj = 10; jj > 0; --jj) + // PRINT: body(jj) + // DUMP: CallExpr + body(jj); + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + // PRINT: for (int kk = 0; kk <= 10; ++kk) + // DUMP: ForStmt 
+ for (int kk = 0; kk <= 10; ++kk) + // PRINT: body(kk) + // DUMP: CallExpr + body(kk); + } + } + +} diff --git a/clang/test/OpenMP/fuse_codegen.cpp b/clang/test/OpenMP/fuse_codegen.cpp index d9500bed3ce31..742c280ed0172 100644 --- a/clang/test/OpenMP/fuse_codegen.cpp +++ b/clang/test/OpenMP/fuse_codegen.cpp @@ -65,6 +65,23 @@ extern "C" void foo4() { } } +// This exemplifies the usage of loop transformations that generate +// more than top level canonical loop nests (e.g split, loopranged fuse...) +extern "C" void foo5() { + double arr[256]; + #pragma omp fuse looprange(2,2) + { + #pragma omp fuse looprange(2,2) + { + for(int i = 0; i < 128; ++i) body(i); + for(int j = 0; j < 256; j+=2) body(j); + for(int k = 0; k < 512; ++k) body(k); + } + for(int c = 42; auto &&v: arr) body(c,v); + for(int cc = 37; auto &&vv: arr) body(cc, vv); + } +} + #endif // CHECK1-LABEL: define dso_local void @body( @@ -88,7 +105,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -97,7 +113,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -129,107 +144,103 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK1-NEXT: 
store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = 
load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: 
[[CMP:%.*]] = icmp ugt i32 [[TMP19]], [[TMP20]] // CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] // CHECK1: [[COND_TRUE]]: -// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 // CHECK1-NEXT: br label %[[COND_END:.*]] // CHECK1: [[COND_FALSE]]: -// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 // CHECK1-NEXT: br label %[[COND_END]] // CHECK1: [[COND_END]]: -// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP21]], %[[COND_TRUE]] ], [ [[TMP22]], %[[COND_FALSE]] ] // CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK1-NEXT: br label %[[FOR_COND:.*]] // CHECK1: [[FOR_COND]]: -// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 -// CHECK1-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP23]], [[TMP24]] // CHECK1-NEXT: br i1 [[CMP16]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK1: [[FOR_BODY]]: -// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] // CHECK1-NEXT: br i1 [[CMP17]], label 
%[[IF_THEN:.*]], label %[[IF_END:.*]] // CHECK1: [[IF_THEN]]: -// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP30]], [[TMP31]] -// CHECK1-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP28]], [[TMP29]] +// CHECK1-NEXT: [[ADD18:%.*]] = add i32 [[TMP27]], [[MUL]] // CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 -// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 -// CHECK1-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] -// CHECK1-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[MUL19:%.*]] = mul i32 [[TMP31]], [[TMP32]] +// CHECK1-NEXT: [[ADD20:%.*]] = add i32 [[TMP30]], [[MUL19]] // CHECK1-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 -// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP35]]) +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP33]]) // CHECK1-NEXT: br label %[[IF_END]] // CHECK1: [[IF_END]]: -// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP34]], [[TMP35]] // CHECK1-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] // CHECK1: [[IF_THEN22]]: -// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] -// CHECK1-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL23:%.*]] = mul i32 [[TMP37]], [[TMP38]] +// CHECK1-NEXT: [[ADD24:%.*]] = add i32 [[TMP36]], [[MUL23]] // CHECK1-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] -// CHECK1-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[MUL25:%.*]] = mul i32 
[[TMP40]], [[TMP41]] +// CHECK1-NEXT: [[ADD26:%.*]] = add i32 [[TMP39]], [[MUL25]] // CHECK1-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP44]]) +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP42]]) // CHECK1-NEXT: br label %[[IF_END27]] // CHECK1: [[IF_END27]]: // CHECK1-NEXT: br label %[[FOR_INC:.*]] // CHECK1: [[FOR_INC]]: -// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP43]], 1 // CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] // CHECK1: [[FOR_END]]: @@ -256,7 +267,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -265,7 +275,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -274,7 +283,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 // CHECK1-NEXT: 
[[DOTNEW_STEP21:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 @@ -304,172 +312,166 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK1-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: 
[[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// 
CHECK1-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP18]], [[TMP19]] +// CHECK1-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 // CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 // CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] -// CHECK1-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 -// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK1-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] // CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] // CHECK1-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 -// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK1-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP24]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[SUB23:%.*]] = sub i32 [[TMP25]], [[TMP26]] // CHECK1-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 -// CHECK1-NEXT: 
[[TMP29:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK1-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] -// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK1-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP27]] +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP28]] // CHECK1-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 // CHECK1-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK1-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 -// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK1-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: [[ADD28:%.*]] = add i32 [[TMP29]], 1 // CHECK1-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 -// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP30]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP31]], [[TMP32]] // CHECK1-NEXT: br i1 [[CMP]], label 
%[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] // CHECK1: [[COND_TRUE]]: -// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 // CHECK1-NEXT: br label %[[COND_END:.*]] // CHECK1: [[COND_FALSE]]: -// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 // CHECK1-NEXT: br label %[[COND_END]] // CHECK1: [[COND_END]]: -// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ] +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP33]], %[[COND_TRUE]] ], [ [[TMP34]], %[[COND_FALSE]] ] // CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4 -// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 -// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 -// CHECK1-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]] +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP35]], [[TMP36]] // CHECK1-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], label %[[COND_FALSE31:.*]] // CHECK1: [[COND_TRUE30]]: -// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 // CHECK1-NEXT: br label %[[COND_END32:.*]] // CHECK1: [[COND_FALSE31]]: -// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 // CHECK1-NEXT: br label %[[COND_END32]] // CHECK1: [[COND_END32]]: -// CHECK1-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ] +// CHECK1-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP37]], %[[COND_TRUE30]] ], [ [[TMP38]], %[[COND_FALSE31]] ] // CHECK1-NEXT: store i32 [[COND33]], ptr 
[[DOTOMP_FUSE_MAX]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK1-NEXT: br label %[[FOR_COND:.*]] // CHECK1: [[FOR_COND]]: -// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 -// CHECK1-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP39]], [[TMP40]] // CHECK1-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK1: [[FOR_BODY]]: -// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]] +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP41]], [[TMP42]] // CHECK1-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] // CHECK1: [[IF_THEN]]: -// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]] -// CHECK1-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]] +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP44]], [[TMP45]] +// CHECK1-NEXT: [[ADD36:%.*]] = add i32 [[TMP43]], [[MUL]] // CHECK1-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4 -// 
CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 -// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 -// CHECK1-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]] -// CHECK1-NEXT: [[ADD38:%.*]] = add i32 [[TMP49]], [[MUL37]] +// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[MUL37:%.*]] = mul i32 [[TMP47]], [[TMP48]] +// CHECK1-NEXT: [[ADD38:%.*]] = add i32 [[TMP46]], [[MUL37]] // CHECK1-NEXT: store i32 [[ADD38]], ptr [[I]], align 4 -// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP52]]) +// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP49]]) // CHECK1-NEXT: br label %[[IF_END]] // CHECK1: [[IF_END]]: -// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]] +// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP50]], [[TMP51]] // CHECK1-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]] // CHECK1: [[IF_THEN40]]: -// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK1-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]] -// CHECK1-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]] +// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 
4 +// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL41:%.*]] = mul i32 [[TMP53]], [[TMP54]] +// CHECK1-NEXT: [[ADD42:%.*]] = add i32 [[TMP52]], [[MUL41]] // CHECK1-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP58:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]] -// CHECK1-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]] +// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[MUL43:%.*]] = mul i32 [[TMP56]], [[TMP57]] +// CHECK1-NEXT: [[SUB44:%.*]] = sub i32 [[TMP55]], [[MUL43]] // CHECK1-NEXT: store i32 [[SUB44]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP61]]) +// CHECK1-NEXT: [[TMP58:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP58]]) // CHECK1-NEXT: br label %[[IF_END45]] // CHECK1: [[IF_END45]]: -// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 -// CHECK1-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]] +// CHECK1-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP59]], [[TMP60]] // CHECK1-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] // CHECK1: [[IF_THEN47]]: -// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 -// CHECK1-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 -// CHECK1-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]] -// CHECK1-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]] +// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 +// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 +// CHECK1-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL48:%.*]] = mul i32 [[TMP62]], [[TMP63]] +// CHECK1-NEXT: [[ADD49:%.*]] = add i32 [[TMP61]], [[MUL48]] // CHECK1-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4 -// CHECK1-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK1-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 -// CHECK1-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK1-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]] -// CHECK1-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]] +// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 +// CHECK1-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[MUL50:%.*]] = mul i32 
[[TMP65]], [[TMP66]] +// CHECK1-NEXT: [[ADD51:%.*]] = add i32 [[TMP64]], [[MUL50]] // CHECK1-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 -// CHECK1-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP70]]) +// CHECK1-NEXT: [[TMP67:%.*]] = load i32, ptr [[K]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP67]]) // CHECK1-NEXT: br label %[[IF_END52]] // CHECK1: [[IF_END52]]: // CHECK1-NEXT: br label %[[FOR_INC:.*]] // CHECK1: [[FOR_INC]]: -// CHECK1-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 +// CHECK1-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP68]], 1 // CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]] // CHECK1: [[FOR_END]]: @@ -481,13 +483,11 @@ extern "C" void foo4() { // CHECK1-NEXT: [[ENTRY:.*:]] // CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 // CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -497,48 +497,43 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4 -// 
CHECK1-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[C:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_LB116:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[CC:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[__END224:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__RANGE221:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END222:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN225:%.*]] = alloca ptr, 
align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_27:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_30:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_TEMP_140:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX46:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX52:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: [[VV:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: store i32 0, ptr [[I]], align 4 -// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 // CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[J]], align 4 -// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 // CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4 @@ -565,225 +560,219 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 // CHECK1-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 // CHECK1-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: store i32 0, ptr 
[[DOTOMP_LB03]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4 // CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 -// CHECK1-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4 -// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4 -// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4 -// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 -// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1 +// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1 // CHECK1-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 -// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8 +// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8 // CHECK1-NEXT: store i32 42, ptr [[C]], align 4 // CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 -// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK1-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0 // CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 // CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8 // CHECK1-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 -// CHECK1-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8 -// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0 -// 
CHECK1-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8 -// CHECK1-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8 -// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8 -// CHECK1-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 -// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64 +// CHECK1-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK1-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 // CHECK1-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] // CHECK1-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 -// CHECK1-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 -// CHECK1-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1 -// CHECK1-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1 -// CHECK1-NEXT: [[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1 -// CHECK1-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK1-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK1-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8 -// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8 -// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8 -// CHECK1-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK1-NEXT: [[ADD21:%.*]] 
= add nsw i64 [[TMP16]], 1 -// CHECK1-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK1-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1 +// CHECK1-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1 +// CHECK1-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1 +// CHECK1-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8 +// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8 +// CHECK1-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1 +// CHECK1-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8 // CHECK1-NEXT: store i32 37, ptr [[CC]], align 4 -// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8 -// CHECK1-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 -// CHECK1-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256 -// CHECK1-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8 -// CHECK1-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0 -// CHECK1-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8 -// CHECK1-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0 -// CHECK1-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8 -// CHECK1-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8 -// CHECK1-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8 -// CHECK1-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8 -// CHECK1-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 
-// CHECK1-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64 -// CHECK1-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr [[TMP22]] to i64 -// CHECK1-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]] -// CHECK1-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8 -// CHECK1-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1 -// CHECK1-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1 -// CHECK1-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1 -// CHECK1-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1 -// CHECK1-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK1-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE221]], align 8 +// CHECK1-NEXT: [[TMP15:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY23:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP15]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR24:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY23]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR24]], ptr [[__END222]], align 8 +// CHECK1-NEXT: [[TMP16:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY26:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP16]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY26]], ptr [[__BEGIN225]], align 8 +// CHECK1-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY28]], ptr [[DOTCAPTURE_EXPR_27]], align 8 +// CHECK1-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__END222]], align 8 +// CHECK1-NEXT: store ptr [[TMP18]], ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[TMP19:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[TMP20:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8 
+// CHECK1-NEXT: [[SUB_PTR_LHS_CAST31:%.*]] = ptrtoint ptr [[TMP19]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST32:%.*]] = ptrtoint ptr [[TMP20]] to i64 +// CHECK1-NEXT: [[SUB_PTR_SUB33:%.*]] = sub i64 [[SUB_PTR_LHS_CAST31]], [[SUB_PTR_RHS_CAST32]] +// CHECK1-NEXT: [[SUB_PTR_DIV34:%.*]] = sdiv exact i64 [[SUB_PTR_SUB33]], 8 +// CHECK1-NEXT: [[SUB35:%.*]] = sub nsw i64 [[SUB_PTR_DIV34]], 1 +// CHECK1-NEXT: [[ADD36:%.*]] = add nsw i64 [[SUB35]], 1 +// CHECK1-NEXT: [[DIV37:%.*]] = sdiv i64 [[ADD36]], 1 +// CHECK1-NEXT: [[SUB38:%.*]] = sub nsw i64 [[DIV37]], 1 +// CHECK1-NEXT: store i64 [[SUB38]], ptr [[DOTCAPTURE_EXPR_30]], align 8 // CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8 // CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8 -// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK1-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1 -// CHECK1-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8 -// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 -// CHECK1-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK1-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK1-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]] -// CHECK1-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]] -// CHECK1: [[COND_TRUE44]]: -// CHECK1-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK1-NEXT: br label %[[COND_END46:.*]] -// CHECK1: [[COND_FALSE45]]: -// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK1-NEXT: br label %[[COND_END46]] -// CHECK1: [[COND_END46]]: -// CHECK1-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ] -// CHECK1-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK1-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK1-NEXT: 
[[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK1-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]] -// CHECK1-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]] -// CHECK1: [[COND_TRUE50]]: -// CHECK1-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK1-NEXT: br label %[[COND_END52:.*]] -// CHECK1: [[COND_FALSE51]]: -// CHECK1-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK1-NEXT: br label %[[COND_END52]] -// CHECK1: [[COND_END52]]: -// CHECK1-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ] -// CHECK1-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8 -// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP21:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_30]], align 8 +// CHECK1-NEXT: [[ADD39:%.*]] = add nsw i64 [[TMP21]], 1 +// CHECK1-NEXT: store i64 [[ADD39]], ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[TMP22:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: store i64 [[TMP22]], ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[CMP41:%.*]] = icmp sgt i64 [[TMP23]], [[TMP24]] +// CHECK1-NEXT: br i1 [[CMP41]], label %[[COND_TRUE42:.*]], label %[[COND_FALSE43:.*]] +// CHECK1: [[COND_TRUE42]]: +// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK1-NEXT: br label %[[COND_END44:.*]] +// CHECK1: [[COND_FALSE43]]: +// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: br label %[[COND_END44]] +// CHECK1: [[COND_END44]]: +// CHECK1-NEXT: [[COND45:%.*]] = phi i64 [ [[TMP25]], %[[COND_TRUE42]] ], [ [[TMP26]], %[[COND_FALSE43]] ] +// CHECK1-NEXT: store i64 [[COND45]], ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK1-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 
8 +// CHECK1-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[CMP47:%.*]] = icmp sgt i64 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: br i1 [[CMP47]], label %[[COND_TRUE48:.*]], label %[[COND_FALSE49:.*]] +// CHECK1: [[COND_TRUE48]]: +// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK1-NEXT: br label %[[COND_END50:.*]] +// CHECK1: [[COND_FALSE49]]: +// CHECK1-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: br label %[[COND_END50]] +// CHECK1: [[COND_END50]]: +// CHECK1-NEXT: [[COND51:%.*]] = phi i64 [ [[TMP29]], %[[COND_TRUE48]] ], [ [[TMP30]], %[[COND_FALSE49]] ] +// CHECK1-NEXT: store i64 [[COND51]], ptr [[DOTOMP_FUSE_MAX46]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX52]], align 8 // CHECK1-NEXT: br label %[[FOR_COND:.*]] // CHECK1: [[FOR_COND]]: -// CHECK1-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8 -// CHECK1-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]] -// CHECK1-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX46]], align 8 +// CHECK1-NEXT: [[CMP53:%.*]] = icmp slt i64 [[TMP31]], [[TMP32]] +// CHECK1-NEXT: br i1 [[CMP53]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK1: [[FOR_BODY]]: -// CHECK1-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 -// CHECK1-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]] -// CHECK1-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]] +// CHECK1-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: [[CMP54:%.*]] = icmp slt i64 
[[TMP33]], [[TMP34]] +// CHECK1-NEXT: br i1 [[CMP54]], label %[[IF_THEN:.*]], label %[[IF_END74:.*]] // CHECK1: [[IF_THEN]]: -// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4 -// CHECK1-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64 -// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4 -// CHECK1-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64 -// CHECK1-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]] -// CHECK1-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]] -// CHECK1-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32 -// CHECK1-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4 -// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4 -// CHECK1-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1 -// CHECK1-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]] -// CHECK1-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]] -// CHECK1-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]] -// CHECK1: [[IF_THEN64]]: -// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]] -// CHECK1-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]] -// CHECK1-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1 -// CHECK1-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]] -// CHECK1-NEXT: store i32 [[ADD68]], ptr [[I]], align 4 -// CHECK1-NEXT: 
[[TMP48:%.*]] = load i32, ptr [[I]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP48]]) +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4 +// CHECK1-NEXT: [[CONV55:%.*]] = sext i32 [[TMP35]] to i64 +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_ST04]], align 4 +// CHECK1-NEXT: [[CONV56:%.*]] = sext i32 [[TMP36]] to i64 +// CHECK1-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV56]], [[TMP37]] +// CHECK1-NEXT: [[ADD57:%.*]] = add nsw i64 [[CONV55]], [[MUL]] +// CHECK1-NEXT: [[CONV58:%.*]] = trunc i64 [[ADD57]] to i32 +// CHECK1-NEXT: store i32 [[CONV58]], ptr [[DOTOMP_IV06]], align 4 +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4 +// CHECK1-NEXT: [[MUL59:%.*]] = mul nsw i32 [[TMP38]], 1 +// CHECK1-NEXT: [[ADD60:%.*]] = add nsw i32 0, [[MUL59]] +// CHECK1-NEXT: store i32 [[ADD60]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP61:%.*]] = icmp slt i32 [[TMP39]], [[TMP40]] +// CHECK1-NEXT: br i1 [[CMP61]], label %[[IF_THEN62:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN62]]: +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL63:%.*]] = mul nsw i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: [[ADD64:%.*]] = add nsw i32 [[TMP41]], [[MUL63]] +// CHECK1-NEXT: store i32 [[ADD64]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP44]], 1 +// CHECK1-NEXT: [[ADD66:%.*]] = add nsw i32 0, [[MUL65]] +// CHECK1-NEXT: store i32 [[ADD66]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, 
ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP45]]) // CHECK1-NEXT: br label %[[IF_END]] // CHECK1: [[IF_END]]: -// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]] -// CHECK1-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]] -// CHECK1: [[IF_THEN70]]: -// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]] -// CHECK1-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]] -// CHECK1-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2 -// CHECK1-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]] -// CHECK1-NEXT: store i32 [[ADD74]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4 -// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP55]]) -// CHECK1-NEXT: br label %[[IF_END75]] -// CHECK1: [[IF_END75]]: -// CHECK1-NEXT: br label %[[IF_END76]] -// CHECK1: [[IF_END76]]: -// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK1-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]] -// CHECK1-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]] -// CHECK1: [[IF_THEN78]]: -// CHECK1-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8 -// CHECK1-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8 -// CHECK1-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]] -// CHECK1-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]] -// CHECK1-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8 -// CHECK1-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK1-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8 -// CHECK1-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1 -// CHECK1-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]] -// CHECK1-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8 -// CHECK1-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 -// CHECK1-NEXT: store ptr [[TMP63]], ptr [[V]], align 8 -// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4 -// CHECK1-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8 -// CHECK1-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8 -// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP64]], double noundef [[TMP66]]) -// CHECK1-NEXT: br label %[[IF_END83]] -// CHECK1: [[IF_END83]]: -// CHECK1-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK1-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]] -// CHECK1-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]] -// CHECK1: [[IF_THEN85]]: -// CHECK1-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 -// CHECK1-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 -// CHECK1-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]] -// CHECK1-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]] -// CHECK1-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8 -// CHECK1-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 -// CHECK1-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 -// CHECK1-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1 -// CHECK1-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]] -// CHECK1-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8 -// CHECK1-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8 -// CHECK1-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8 -// CHECK1-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4 -// CHECK1-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8 -// CHECK1-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8 -// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP75]], double noundef [[TMP77]]) -// CHECK1-NEXT: br label %[[IF_END90]] -// CHECK1: [[IF_END90]]: +// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP67:%.*]] = icmp slt i32 [[TMP46]], [[TMP47]] +// CHECK1-NEXT: br i1 [[CMP67]], label %[[IF_THEN68:.*]], label %[[IF_END73:.*]] +// CHECK1: [[IF_THEN68]]: +// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL69:%.*]] = mul nsw i32 [[TMP49]], [[TMP50]] +// CHECK1-NEXT: [[ADD70:%.*]] = add nsw i32 [[TMP48]], [[MUL69]] +// CHECK1-NEXT: store i32 [[ADD70]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP51]], 2 +// CHECK1-NEXT: [[ADD72:%.*]] = add nsw i32 0, [[MUL71]] +// CHECK1-NEXT: store i32 [[ADD72]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP52]]) +// CHECK1-NEXT: br label %[[IF_END73]] +// CHECK1: [[IF_END73]]: +// CHECK1-NEXT: br label %[[IF_END74]] +// CHECK1: [[IF_END74]]: +// CHECK1-NEXT: [[TMP53:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[TMP54:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[CMP75:%.*]] = icmp slt i64 [[TMP53]], [[TMP54]] +// CHECK1-NEXT: br i1 [[CMP75]], label %[[IF_THEN76:.*]], label %[[IF_END81:.*]] +// CHECK1: [[IF_THEN76]]: +// CHECK1-NEXT: [[TMP55:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK1-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[MUL77:%.*]] = mul nsw i64 [[TMP56]], [[TMP57]] +// CHECK1-NEXT: [[ADD78:%.*]] = add nsw i64 [[TMP55]], [[MUL77]] +// CHECK1-NEXT: store i64 [[ADD78]], ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[TMP58:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], 1 +// CHECK1-NEXT: [[ADD_PTR80:%.*]] = getelementptr inbounds double, ptr [[TMP58]], i64 [[MUL79]] +// CHECK1-NEXT: store ptr [[ADD_PTR80]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP60]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP62:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP63:%.*]] = load double, ptr [[TMP62]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP61]], double noundef [[TMP63]]) +// CHECK1-NEXT: br label %[[IF_END81]] +// CHECK1: [[IF_END81]]: +// CHECK1-NEXT: [[TMP64:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[TMP65:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[CMP82:%.*]] = icmp slt i64 [[TMP64]], [[TMP65]] +// CHECK1-NEXT: br i1 [[CMP82]], label %[[IF_THEN83:.*]], label %[[IF_END88:.*]] +// CHECK1: [[IF_THEN83]]: +// CHECK1-NEXT: [[TMP66:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK1-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK1-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[MUL84:%.*]] = mul nsw i64 [[TMP67]], [[TMP68]] +// CHECK1-NEXT: [[ADD85:%.*]] = add nsw i64 [[TMP66]], [[MUL84]] +// CHECK1-NEXT: store i64 [[ADD85]], ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[TMP69:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8 +// CHECK1-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], 1 +// CHECK1-NEXT: [[ADD_PTR87:%.*]] = getelementptr inbounds double, ptr [[TMP69]], i64 [[MUL86]] +// CHECK1-NEXT: store ptr [[ADD_PTR87]], ptr [[__BEGIN225]], align 8 +// CHECK1-NEXT: [[TMP71:%.*]] = load ptr, ptr [[__BEGIN225]], align 8 +// CHECK1-NEXT: store ptr [[TMP71]], ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP72:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK1-NEXT: [[TMP73:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP74:%.*]] = load double, ptr [[TMP73]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP72]], double noundef [[TMP74]])
+// CHECK1-NEXT: br label %[[IF_END88]]
+// CHECK1: [[IF_END88]]:
 // CHECK1-NEXT: br label %[[FOR_INC:.*]]
 // CHECK1: [[FOR_INC]]:
-// CHECK1-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8
-// CHECK1-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1
-// CHECK1-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8
+// CHECK1-NEXT: [[TMP75:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8
+// CHECK1-NEXT: [[INC:%.*]] = add nsw i64 [[TMP75]], 1
+// CHECK1-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX52]], align 8
 // CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]]
 // CHECK1: [[FOR_END]]:
 // CHECK1-NEXT: ret void
@@ -794,13 +783,11 @@ extern "C" void foo4() {
 // CHECK1-NEXT: [[ENTRY:.*:]]
 // CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16
 // CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4
-// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4
-// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4
@@ -815,12 +802,10 @@ extern "C" void foo4() {
 // CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8
 // CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8
 // CHECK1-NEXT: store i32 0, ptr [[J]], align 4
-// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4
 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4
 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4
 // CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4
 // CHECK1-NEXT: store i32 0, ptr [[K]], align 4
-// CHECK1-NEXT: store i32 63, ptr [[DOTOMP_UB1]],
align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 // CHECK1-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 @@ -940,6 +925,277 @@ extern "C" void foo4() { // CHECK1-NEXT: ret void // // +// CHECK1-LABEL: define dso_local void @foo5( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_LB116:%.*]] = 
alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_TEMP_121:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX22:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX29:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE264:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN265:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END267:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: store i32 512, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], 
%[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK1-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK1-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB03]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4 +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK1-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr 
[[__END2]], align 8 +// CHECK1-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK1-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK1-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK1-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK1-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1 +// CHECK1-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1 +// CHECK1-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1 +// CHECK1-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8 +// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8 +// CHECK1-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1 +// CHECK1-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK1-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK1-NEXT: [[TMP17:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[CMP23:%.*]] = icmp sgt i64 [[TMP16]], [[TMP17]] +// CHECK1-NEXT: br i1 [[CMP23]], label %[[COND_TRUE24:.*]], label %[[COND_FALSE25:.*]] +// CHECK1: [[COND_TRUE24]]: +// CHECK1-NEXT: [[TMP18:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK1-NEXT: br label %[[COND_END26:.*]] +// CHECK1: [[COND_FALSE25]]: +// CHECK1-NEXT: [[TMP19:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: br label %[[COND_END26]] +// CHECK1: [[COND_END26]]: +// CHECK1-NEXT: [[COND27:%.*]] = 
phi i64 [ [[TMP18]], %[[COND_TRUE24]] ], [ [[TMP19]], %[[COND_FALSE25]] ] +// CHECK1-NEXT: store i64 [[COND27]], ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK1-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[CMP28:%.*]] = icmp slt i32 [[TMP20]], 128 +// CHECK1-NEXT: br i1 [[CMP28]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP21]]) +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add nsw i32 [[TMP22]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP9:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND30:.*]] +// CHECK1: [[FOR_COND30]]: +// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK1-NEXT: [[CMP31:%.*]] = icmp slt i64 [[TMP23]], [[TMP24]] +// CHECK1-NEXT: br i1 [[CMP31]], label %[[FOR_BODY32:.*]], label %[[FOR_END63:.*]] +// CHECK1: [[FOR_BODY32]]: +// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: [[CMP33:%.*]] = icmp slt i64 [[TMP25]], [[TMP26]] +// CHECK1-NEXT: br i1 [[CMP33]], label %[[IF_THEN:.*]], label %[[IF_END53:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4 +// CHECK1-NEXT: [[CONV34:%.*]] = sext i32 [[TMP27]] to i64 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_ST04]], align 4 +// CHECK1-NEXT: [[CONV35:%.*]] = sext 
i32 [[TMP28]] to i64 +// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV35]], [[TMP29]] +// CHECK1-NEXT: [[ADD36:%.*]] = add nsw i64 [[CONV34]], [[MUL]] +// CHECK1-NEXT: [[CONV37:%.*]] = trunc i64 [[ADD36]] to i32 +// CHECK1-NEXT: store i32 [[CONV37]], ptr [[DOTOMP_IV06]], align 4 +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4 +// CHECK1-NEXT: [[MUL38:%.*]] = mul nsw i32 [[TMP30]], 1 +// CHECK1-NEXT: [[ADD39:%.*]] = add nsw i32 0, [[MUL38]] +// CHECK1-NEXT: store i32 [[ADD39]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP40:%.*]] = icmp slt i32 [[TMP31]], [[TMP32]] +// CHECK1-NEXT: br i1 [[CMP40]], label %[[IF_THEN41:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN41]]: +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL42:%.*]] = mul nsw i32 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: [[ADD43:%.*]] = add nsw i32 [[TMP33]], [[MUL42]] +// CHECK1-NEXT: store i32 [[ADD43]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[MUL44:%.*]] = mul nsw i32 [[TMP36]], 2 +// CHECK1-NEXT: [[ADD45:%.*]] = add nsw i32 0, [[MUL44]] +// CHECK1-NEXT: store i32 [[ADD45]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP37]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP46:%.*]] = icmp slt i32 [[TMP38]], [[TMP39]] +// CHECK1-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK1: [[IF_THEN47]]: +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL48:%.*]] = mul nsw i32 [[TMP41]], [[TMP42]] +// CHECK1-NEXT: [[ADD49:%.*]] = add nsw i32 [[TMP40]], [[MUL48]] +// CHECK1-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[MUL50:%.*]] = mul nsw i32 [[TMP43]], 1 +// CHECK1-NEXT: [[ADD51:%.*]] = add nsw i32 0, [[MUL50]] +// CHECK1-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[K]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK1-NEXT: br label %[[IF_END52]] +// CHECK1: [[IF_END52]]: +// CHECK1-NEXT: br label %[[IF_END53]] +// CHECK1: [[IF_END53]]: +// CHECK1-NEXT: [[TMP45:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[TMP46:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[CMP54:%.*]] = icmp slt i64 [[TMP45]], [[TMP46]] +// CHECK1-NEXT: br i1 [[CMP54]], label %[[IF_THEN55:.*]], label %[[IF_END60:.*]] +// CHECK1: [[IF_THEN55]]: +// CHECK1-NEXT: [[TMP47:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK1-NEXT: [[TMP48:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK1-NEXT: [[TMP49:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[MUL56:%.*]] = mul nsw i64 [[TMP48]], [[TMP49]] +// CHECK1-NEXT: [[ADD57:%.*]] = add nsw i64 [[TMP47]], [[MUL56]] +// CHECK1-NEXT: store i64 [[ADD57]], ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[TMP50:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[TMP51:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[MUL58:%.*]] = mul nsw i64 [[TMP51]], 1 +// CHECK1-NEXT: [[ADD_PTR59:%.*]] = getelementptr inbounds double, ptr [[TMP50]], i64 [[MUL58]] +// CHECK1-NEXT: store ptr [[ADD_PTR59]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP52:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP52]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP54:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP55:%.*]] = load double, ptr [[TMP54]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP53]], double noundef [[TMP55]]) +// CHECK1-NEXT: br label %[[IF_END60]] +// CHECK1: [[IF_END60]]: +// CHECK1-NEXT: br label %[[FOR_INC61:.*]] +// CHECK1: [[FOR_INC61]]: +// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[INC62:%.*]] = add nsw i64 [[TMP56]], 1 +// CHECK1-NEXT: store i64 [[INC62]], ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND30]], !llvm.loop [[LOOP10:![0-9]+]] +// CHECK1: [[FOR_END63]]: +// CHECK1-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE264]], align 8 +// CHECK1-NEXT: [[TMP57:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY66:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP57]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY66]], ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: [[TMP58:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY68:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP58]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR69:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY68]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR69]], ptr [[__END267]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND70:.*]] +// CHECK1: [[FOR_COND70]]: +// CHECK1-NEXT: [[TMP59:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__END267]], align 8 +// CHECK1-NEXT: [[CMP71:%.*]] = icmp ne ptr [[TMP59]], [[TMP60]] +// CHECK1-NEXT: br i1 [[CMP71]], label %[[FOR_BODY72:.*]], label %[[FOR_END74:.*]] +// CHECK1: [[FOR_BODY72]]: +// CHECK1-NEXT: [[TMP61:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: store ptr [[TMP61]], ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK1-NEXT: [[TMP63:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP64:%.*]] = load double, ptr [[TMP63]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP62]], double noundef [[TMP64]]) +// CHECK1-NEXT: br label %[[FOR_INC73:.*]] +// CHECK1: [[FOR_INC73]]: +// CHECK1-NEXT: [[TMP65:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP65]], i32 1 +// CHECK1-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND70]] +// CHECK1: [[FOR_END74]]: +// CHECK1-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @body( // CHECK2-SAME: ...) #[[ATTR0:[0-9]+]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -961,7 +1217,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -970,7 +1225,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1002,107 +1256,103 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// 
CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK2-NEXT: 
[[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP19]], [[TMP20]] // CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] // CHECK2: [[COND_TRUE]]: -// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 // CHECK2-NEXT: 
br label %[[COND_END:.*]] // CHECK2: [[COND_FALSE]]: -// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 // CHECK2-NEXT: br label %[[COND_END]] // CHECK2: [[COND_END]]: -// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP21]], %[[COND_TRUE]] ], [ [[TMP22]], %[[COND_FALSE]] ] // CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: br label %[[FOR_COND:.*]] // CHECK2: [[FOR_COND]]: -// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 -// CHECK2-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP23]], [[TMP24]] // CHECK2-NEXT: br i1 [[CMP16]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK2: [[FOR_BODY]]: -// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] // CHECK2-NEXT: br i1 [[CMP17]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] // CHECK2: [[IF_THEN]]: -// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL:%.*]] = mul 
i32 [[TMP30]], [[TMP31]] -// CHECK2-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP28]], [[TMP29]] +// CHECK2-NEXT: [[ADD18:%.*]] = add i32 [[TMP27]], [[MUL]] // CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 -// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 -// CHECK2-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] -// CHECK2-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL19:%.*]] = mul i32 [[TMP31]], [[TMP32]] +// CHECK2-NEXT: [[ADD20:%.*]] = add i32 [[TMP30]], [[MUL19]] // CHECK2-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 -// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP35]]) +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP33]]) // CHECK2-NEXT: br label %[[IF_END]] // CHECK2: [[IF_END]]: -// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP34]], [[TMP35]] // CHECK2-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] // CHECK2: [[IF_THEN22]]: -// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] -// CHECK2-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL23:%.*]] = mul i32 [[TMP37]], [[TMP38]] +// CHECK2-NEXT: [[ADD24:%.*]] = add i32 [[TMP36]], [[MUL23]] // CHECK2-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] -// CHECK2-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL25:%.*]] = mul i32 
[[TMP40]], [[TMP41]] +// CHECK2-NEXT: [[ADD26:%.*]] = add i32 [[TMP39]], [[MUL25]] // CHECK2-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP44]]) +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP42]]) // CHECK2-NEXT: br label %[[IF_END27]] // CHECK2: [[IF_END27]]: // CHECK2-NEXT: br label %[[FOR_INC:.*]] // CHECK2: [[FOR_INC]]: -// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP43]], 1 // CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] // CHECK2: [[FOR_END]]: @@ -1114,13 +1364,11 @@ extern "C" void foo4() { // CHECK2-NEXT: [[ENTRY:.*:]] // CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 // CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1130,48 +1378,43 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4 -// 
CHECK2-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[C:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_LB116:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[CC:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[__END224:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__RANGE221:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END222:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN225:%.*]] = alloca ptr, 
align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_27:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_30:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_140:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX46:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX52:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[VV:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: store i32 0, ptr [[I]], align 4 -// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 // CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[J]], align 4 -// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 // CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4 @@ -1198,225 +1441,219 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 // CHECK2-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 // CHECK2-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: store i32 0, ptr 
[[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4 // CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 -// CHECK2-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4 -// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4 -// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4 -// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 -// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1 +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1 // CHECK2-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 -// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8 // CHECK2-NEXT: store i32 42, ptr [[C]], align 4 // CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0 // CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 // CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8 // CHECK2-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 -// CHECK2-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8 -// CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0 -// 
CHECK2-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8 -// CHECK2-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8 -// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8 -// CHECK2-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 -// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64 +// CHECK2-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 // CHECK2-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] // CHECK2-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 -// CHECK2-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 -// CHECK2-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1 -// CHECK2-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1 -// CHECK2-NEXT: [[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1 -// CHECK2-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK2-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK2-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8 -// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8 -// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8 -// CHECK2-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK2-NEXT: [[ADD21:%.*]] 
= add nsw i64 [[TMP16]], 1 -// CHECK2-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK2-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1 +// CHECK2-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1 +// CHECK2-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1 +// CHECK2-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1 +// CHECK2-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8 // CHECK2-NEXT: store i32 37, ptr [[CC]], align 4 -// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 -// CHECK2-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256 -// CHECK2-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8 -// CHECK2-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0 -// CHECK2-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8 -// CHECK2-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0 -// CHECK2-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8 -// CHECK2-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8 -// CHECK2-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8 -// CHECK2-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8 -// CHECK2-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 
-// CHECK2-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64 -// CHECK2-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr [[TMP22]] to i64 -// CHECK2-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]] -// CHECK2-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8 -// CHECK2-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1 -// CHECK2-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1 -// CHECK2-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1 -// CHECK2-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1 -// CHECK2-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK2-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[TMP15:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY23:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP15]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR24:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY23]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR24]], ptr [[__END222]], align 8 +// CHECK2-NEXT: [[TMP16:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY26:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP16]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY26]], ptr [[__BEGIN225]], align 8 +// CHECK2-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY28]], ptr [[DOTCAPTURE_EXPR_27]], align 8 +// CHECK2-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__END222]], align 8 +// CHECK2-NEXT: store ptr [[TMP18]], ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP19:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP20:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8 
+// CHECK2-NEXT: [[SUB_PTR_LHS_CAST31:%.*]] = ptrtoint ptr [[TMP19]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST32:%.*]] = ptrtoint ptr [[TMP20]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB33:%.*]] = sub i64 [[SUB_PTR_LHS_CAST31]], [[SUB_PTR_RHS_CAST32]] +// CHECK2-NEXT: [[SUB_PTR_DIV34:%.*]] = sdiv exact i64 [[SUB_PTR_SUB33]], 8 +// CHECK2-NEXT: [[SUB35:%.*]] = sub nsw i64 [[SUB_PTR_DIV34]], 1 +// CHECK2-NEXT: [[ADD36:%.*]] = add nsw i64 [[SUB35]], 1 +// CHECK2-NEXT: [[DIV37:%.*]] = sdiv i64 [[ADD36]], 1 +// CHECK2-NEXT: [[SUB38:%.*]] = sub nsw i64 [[DIV37]], 1 +// CHECK2-NEXT: store i64 [[SUB38]], ptr [[DOTCAPTURE_EXPR_30]], align 8 // CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8 // CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8 -// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK2-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1 -// CHECK2-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 -// CHECK2-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK2-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK2-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]] -// CHECK2-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]] -// CHECK2: [[COND_TRUE44]]: -// CHECK2-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK2-NEXT: br label %[[COND_END46:.*]] -// CHECK2: [[COND_FALSE45]]: -// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK2-NEXT: br label %[[COND_END46]] -// CHECK2: [[COND_END46]]: -// CHECK2-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ] -// CHECK2-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK2-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK2-NEXT: 
[[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]] -// CHECK2-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]] -// CHECK2: [[COND_TRUE50]]: -// CHECK2-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK2-NEXT: br label %[[COND_END52:.*]] -// CHECK2: [[COND_FALSE51]]: -// CHECK2-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: br label %[[COND_END52]] -// CHECK2: [[COND_END52]]: -// CHECK2-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ] -// CHECK2-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8 -// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP21:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_30]], align 8 +// CHECK2-NEXT: [[ADD39:%.*]] = add nsw i64 [[TMP21]], 1 +// CHECK2-NEXT: store i64 [[ADD39]], ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[TMP22:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: store i64 [[TMP22]], ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP41:%.*]] = icmp sgt i64 [[TMP23]], [[TMP24]] +// CHECK2-NEXT: br i1 [[CMP41]], label %[[COND_TRUE42:.*]], label %[[COND_FALSE43:.*]] +// CHECK2: [[COND_TRUE42]]: +// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK2-NEXT: br label %[[COND_END44:.*]] +// CHECK2: [[COND_FALSE43]]: +// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: br label %[[COND_END44]] +// CHECK2: [[COND_END44]]: +// CHECK2-NEXT: [[COND45:%.*]] = phi i64 [ [[TMP25]], %[[COND_TRUE42]] ], [ [[TMP26]], %[[COND_FALSE43]] ] +// CHECK2-NEXT: store i64 [[COND45]], ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 
8 +// CHECK2-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP47:%.*]] = icmp sgt i64 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: br i1 [[CMP47]], label %[[COND_TRUE48:.*]], label %[[COND_FALSE49:.*]] +// CHECK2: [[COND_TRUE48]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: br label %[[COND_END50:.*]] +// CHECK2: [[COND_FALSE49]]: +// CHECK2-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: br label %[[COND_END50]] +// CHECK2: [[COND_END50]]: +// CHECK2-NEXT: [[COND51:%.*]] = phi i64 [ [[TMP29]], %[[COND_TRUE48]] ], [ [[TMP30]], %[[COND_FALSE49]] ] +// CHECK2-NEXT: store i64 [[COND51]], ptr [[DOTOMP_FUSE_MAX46]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX52]], align 8 // CHECK2-NEXT: br label %[[FOR_COND:.*]] // CHECK2: [[FOR_COND]]: -// CHECK2-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8 -// CHECK2-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]] -// CHECK2-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX46]], align 8 +// CHECK2-NEXT: [[CMP53:%.*]] = icmp slt i64 [[TMP31]], [[TMP32]] +// CHECK2-NEXT: br i1 [[CMP53]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK2: [[FOR_BODY]]: -// CHECK2-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 -// CHECK2-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]] -// CHECK2-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]] +// CHECK2-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: [[CMP54:%.*]] = icmp slt i64 
[[TMP33]], [[TMP34]] +// CHECK2-NEXT: br i1 [[CMP54]], label %[[IF_THEN:.*]], label %[[IF_END74:.*]] // CHECK2: [[IF_THEN]]: -// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4 -// CHECK2-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64 -// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4 -// CHECK2-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64 -// CHECK2-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]] -// CHECK2-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]] -// CHECK2-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32 -// CHECK2-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4 -// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4 -// CHECK2-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1 -// CHECK2-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]] -// CHECK2-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]] -// CHECK2-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]] -// CHECK2: [[IF_THEN64]]: -// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]] -// CHECK2-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]] -// CHECK2-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1 -// CHECK2-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]] -// CHECK2-NEXT: store i32 [[ADD68]], ptr [[I]], align 4 -// CHECK2-NEXT: 
[[TMP48:%.*]] = load i32, ptr [[I]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP48]]) +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: [[CONV55:%.*]] = sext i32 [[TMP35]] to i64 +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_ST04]], align 4 +// CHECK2-NEXT: [[CONV56:%.*]] = sext i32 [[TMP36]] to i64 +// CHECK2-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV56]], [[TMP37]] +// CHECK2-NEXT: [[ADD57:%.*]] = add nsw i64 [[CONV55]], [[MUL]] +// CHECK2-NEXT: [[CONV58:%.*]] = trunc i64 [[ADD57]] to i32 +// CHECK2-NEXT: store i32 [[CONV58]], ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[MUL59:%.*]] = mul nsw i32 [[TMP38]], 1 +// CHECK2-NEXT: [[ADD60:%.*]] = add nsw i32 0, [[MUL59]] +// CHECK2-NEXT: store i32 [[ADD60]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP61:%.*]] = icmp slt i32 [[TMP39]], [[TMP40]] +// CHECK2-NEXT: br i1 [[CMP61]], label %[[IF_THEN62:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN62]]: +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL63:%.*]] = mul nsw i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: [[ADD64:%.*]] = add nsw i32 [[TMP41]], [[MUL63]] +// CHECK2-NEXT: store i32 [[ADD64]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP44]], 1 +// CHECK2-NEXT: [[ADD66:%.*]] = add nsw i32 0, [[MUL65]] +// CHECK2-NEXT: store i32 [[ADD66]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, 
ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP45]]) // CHECK2-NEXT: br label %[[IF_END]] // CHECK2: [[IF_END]]: -// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]] -// CHECK2-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]] -// CHECK2: [[IF_THEN70]]: -// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]] -// CHECK2-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]] -// CHECK2-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2 -// CHECK2-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]] -// CHECK2-NEXT: store i32 [[ADD74]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4 -// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP55]]) -// CHECK2-NEXT: br label %[[IF_END75]] -// CHECK2: [[IF_END75]]: -// CHECK2-NEXT: br label %[[IF_END76]] -// CHECK2: [[IF_END76]]: -// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK2-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]] -// CHECK2-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]] -// CHECK2: [[IF_THEN78]]: -// CHECK2-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8 -// CHECK2-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8 -// CHECK2-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]] -// CHECK2-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]] -// CHECK2-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8 -// CHECK2-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK2-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8 -// CHECK2-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1 -// CHECK2-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]] -// CHECK2-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8 -// CHECK2-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 -// CHECK2-NEXT: store ptr [[TMP63]], ptr [[V]], align 8 -// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4 -// CHECK2-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8 -// CHECK2-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8 -// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP64]], double noundef [[TMP66]]) -// CHECK2-NEXT: br label %[[IF_END83]] -// CHECK2: [[IF_END83]]: -// CHECK2-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]] -// CHECK2-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]] -// CHECK2: [[IF_THEN85]]: -// CHECK2-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 -// CHECK2-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 -// CHECK2-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]] -// CHECK2-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]] -// CHECK2-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8 -// CHECK2-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 -// CHECK2-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 -// CHECK2-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1 -// CHECK2-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]] -// CHECK2-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8 -// CHECK2-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8 -// CHECK2-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8 -// CHECK2-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4 -// CHECK2-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8 -// CHECK2-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8 -// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP75]], double noundef [[TMP77]]) -// CHECK2-NEXT: br label %[[IF_END90]] -// CHECK2: [[IF_END90]]: +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP67:%.*]] = icmp slt i32 [[TMP46]], [[TMP47]] +// CHECK2-NEXT: br i1 [[CMP67]], label %[[IF_THEN68:.*]], label %[[IF_END73:.*]] +// CHECK2: [[IF_THEN68]]: +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL69:%.*]] = mul nsw i32 [[TMP49]], [[TMP50]] +// CHECK2-NEXT: [[ADD70:%.*]] = add nsw i32 [[TMP48]], [[MUL69]] +// CHECK2-NEXT: store i32 [[ADD70]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP51]], 2 +// CHECK2-NEXT: [[ADD72:%.*]] = add nsw i32 0, [[MUL71]] +// CHECK2-NEXT: store i32 [[ADD72]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP52]]) +// CHECK2-NEXT: br label %[[IF_END73]] +// CHECK2: [[IF_END73]]: +// CHECK2-NEXT: br label %[[IF_END74]] +// CHECK2: [[IF_END74]]: +// CHECK2-NEXT: [[TMP53:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP54:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP75:%.*]] = icmp slt i64 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: br i1 [[CMP75]], label %[[IF_THEN76:.*]], label %[[IF_END81:.*]] +// CHECK2: [[IF_THEN76]]: +// CHECK2-NEXT: [[TMP55:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[MUL77:%.*]] = mul nsw i64 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: [[ADD78:%.*]] = add nsw i64 [[TMP55]], [[MUL77]] +// CHECK2-NEXT: store i64 [[ADD78]], ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[TMP58:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], 1 +// CHECK2-NEXT: [[ADD_PTR80:%.*]] = getelementptr inbounds double, ptr [[TMP58]], i64 [[MUL79]] +// CHECK2-NEXT: store ptr [[ADD_PTR80]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP60]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP62:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP63:%.*]] = load double, ptr [[TMP62]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP61]], double noundef [[TMP63]]) +// CHECK2-NEXT: br label %[[IF_END81]] +// CHECK2: [[IF_END81]]: +// CHECK2-NEXT: [[TMP64:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP65:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP82:%.*]] = icmp slt i64 [[TMP64]], [[TMP65]] +// CHECK2-NEXT: br i1 [[CMP82]], label %[[IF_THEN83:.*]], label %[[IF_END88:.*]] +// CHECK2: [[IF_THEN83]]: +// CHECK2-NEXT: [[TMP66:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK2-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK2-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[MUL84:%.*]] = mul nsw i64 [[TMP67]], [[TMP68]] +// CHECK2-NEXT: [[ADD85:%.*]] = add nsw i64 [[TMP66]], [[MUL84]] +// CHECK2-NEXT: store i64 [[ADD85]], ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[TMP69:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8 +// CHECK2-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], 1 +// CHECK2-NEXT: [[ADD_PTR87:%.*]] = getelementptr inbounds double, ptr [[TMP69]], i64 [[MUL86]] +// CHECK2-NEXT: store ptr [[ADD_PTR87]], ptr [[__BEGIN225]], align 8 +// CHECK2-NEXT: [[TMP71:%.*]] = load ptr, ptr [[__BEGIN225]], align 8 +// CHECK2-NEXT: store ptr [[TMP71]], ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP72:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK2-NEXT: [[TMP73:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP74:%.*]] = load double, ptr [[TMP73]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP72]], double noundef [[TMP74]]) +// CHECK2-NEXT: br label %[[IF_END88]] +// CHECK2: [[IF_END88]]: // CHECK2-NEXT: br label %[[FOR_INC:.*]] // CHECK2: [[FOR_INC]]: -// CHECK2-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1 -// CHECK2-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP75:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i64 [[TMP75]], 1 +// CHECK2-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX52]], align 8 // CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]] // CHECK2: [[FOR_END]]: // CHECK2-NEXT: ret void @@ -1427,13 +1664,11 @@ extern "C" void foo4() { // CHECK2-NEXT: [[ENTRY:.*:]] // CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 // CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1448,12 +1683,10 @@ extern "C" void foo4() { // CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: store i32 0, ptr [[J]], align 4 -// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 // CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[K]], align 4 -// CHECK2-NEXT: store i32 63, ptr 
[[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 // CHECK2-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 @@ -1573,6 +1806,277 @@ extern "C" void foo4() { // CHECK2-NEXT: ret void // // +// CHECK2-LABEL: define dso_local void @foo5( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: 
[[DOTOMP_LB116:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_121:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX22:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX29:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE264:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN265:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END267:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: store i32 512, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: 
[[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK2-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK2-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4 +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK2-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8 +// 
CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK2-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK2-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK2-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1 +// CHECK2-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1 +// CHECK2-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1 +// CHECK2-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1 +// CHECK2-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK2-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK2-NEXT: [[TMP17:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP23:%.*]] = icmp sgt i64 [[TMP16]], [[TMP17]] +// CHECK2-NEXT: br i1 [[CMP23]], label %[[COND_TRUE24:.*]], label %[[COND_FALSE25:.*]] +// CHECK2: [[COND_TRUE24]]: +// CHECK2-NEXT: [[TMP18:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK2-NEXT: br label %[[COND_END26:.*]] +// CHECK2: [[COND_FALSE25]]: +// CHECK2-NEXT: [[TMP19:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: br label %[[COND_END26]] +// CHECK2: 
[[COND_END26]]: +// CHECK2-NEXT: [[COND27:%.*]] = phi i64 [ [[TMP18]], %[[COND_TRUE24]] ], [ [[TMP19]], %[[COND_FALSE25]] ] +// CHECK2-NEXT: store i64 [[COND27]], ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK2-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[CMP28:%.*]] = icmp slt i32 [[TMP20]], 128 +// CHECK2-NEXT: br i1 [[CMP28]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP21]]) +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i32 [[TMP22]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP8:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND30:.*]] +// CHECK2: [[FOR_COND30]]: +// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK2-NEXT: [[CMP31:%.*]] = icmp slt i64 [[TMP23]], [[TMP24]] +// CHECK2-NEXT: br i1 [[CMP31]], label %[[FOR_BODY32:.*]], label %[[FOR_END63:.*]] +// CHECK2: [[FOR_BODY32]]: +// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: [[CMP33:%.*]] = icmp slt i64 [[TMP25]], [[TMP26]] +// CHECK2-NEXT: br i1 [[CMP33]], label %[[IF_THEN:.*]], label %[[IF_END53:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: [[CONV34:%.*]] = sext i32 [[TMP27]] to i64 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr 
[[DOTOMP_ST04]], align 4 +// CHECK2-NEXT: [[CONV35:%.*]] = sext i32 [[TMP28]] to i64 +// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV35]], [[TMP29]] +// CHECK2-NEXT: [[ADD36:%.*]] = add nsw i64 [[CONV34]], [[MUL]] +// CHECK2-NEXT: [[CONV37:%.*]] = trunc i64 [[ADD36]] to i32 +// CHECK2-NEXT: store i32 [[CONV37]], ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[MUL38:%.*]] = mul nsw i32 [[TMP30]], 1 +// CHECK2-NEXT: [[ADD39:%.*]] = add nsw i32 0, [[MUL38]] +// CHECK2-NEXT: store i32 [[ADD39]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP40:%.*]] = icmp slt i32 [[TMP31]], [[TMP32]] +// CHECK2-NEXT: br i1 [[CMP40]], label %[[IF_THEN41:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN41]]: +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL42:%.*]] = mul nsw i32 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: [[ADD43:%.*]] = add nsw i32 [[TMP33]], [[MUL42]] +// CHECK2-NEXT: store i32 [[ADD43]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL44:%.*]] = mul nsw i32 [[TMP36]], 2 +// CHECK2-NEXT: [[ADD45:%.*]] = add nsw i32 0, [[MUL44]] +// CHECK2-NEXT: store i32 [[ADD45]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP37]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP46:%.*]] = icmp slt i32 [[TMP38]], [[TMP39]] +// CHECK2-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK2: [[IF_THEN47]]: +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL48:%.*]] = mul nsw i32 [[TMP41]], [[TMP42]] +// CHECK2-NEXT: [[ADD49:%.*]] = add nsw i32 [[TMP40]], [[MUL48]] +// CHECK2-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL50:%.*]] = mul nsw i32 [[TMP43]], 1 +// CHECK2-NEXT: [[ADD51:%.*]] = add nsw i32 0, [[MUL50]] +// CHECK2-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK2-NEXT: br label %[[IF_END52]] +// CHECK2: [[IF_END52]]: +// CHECK2-NEXT: br label %[[IF_END53]] +// CHECK2: [[IF_END53]]: +// CHECK2-NEXT: [[TMP45:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[TMP46:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP54:%.*]] = icmp slt i64 [[TMP45]], [[TMP46]] +// CHECK2-NEXT: br i1 [[CMP54]], label %[[IF_THEN55:.*]], label %[[IF_END60:.*]] +// CHECK2: [[IF_THEN55]]: +// CHECK2-NEXT: [[TMP47:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: [[TMP48:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP49:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[MUL56:%.*]] = mul nsw i64 [[TMP48]], [[TMP49]] +// CHECK2-NEXT: [[ADD57:%.*]] = add nsw i64 [[TMP47]], [[MUL56]] +// CHECK2-NEXT: store i64 [[ADD57]], ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[TMP50:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[TMP51:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[MUL58:%.*]] = mul nsw i64 [[TMP51]], 1 +// CHECK2-NEXT: [[ADD_PTR59:%.*]] = getelementptr inbounds double, ptr [[TMP50]], i64 [[MUL58]] +// CHECK2-NEXT: store ptr [[ADD_PTR59]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP52:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP52]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP55:%.*]] = load double, ptr [[TMP54]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP53]], double noundef [[TMP55]]) +// CHECK2-NEXT: br label %[[IF_END60]] +// CHECK2: [[IF_END60]]: +// CHECK2-NEXT: br label %[[FOR_INC61:.*]] +// CHECK2: [[FOR_INC61]]: +// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[INC62:%.*]] = add nsw i64 [[TMP56]], 1 +// CHECK2-NEXT: store i64 [[INC62]], ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND30]], !llvm.loop [[LOOP9:![0-9]+]] +// CHECK2: [[FOR_END63]]: +// CHECK2-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE264]], align 8 +// CHECK2-NEXT: [[TMP57:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY66:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP57]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY66]], ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: [[TMP58:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY68:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP58]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR69:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY68]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR69]], ptr [[__END267]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND70:.*]] +// CHECK2: [[FOR_COND70]]: +// CHECK2-NEXT: [[TMP59:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__END267]], align 8 +// CHECK2-NEXT: [[CMP71:%.*]] = icmp ne ptr [[TMP59]], [[TMP60]] +// CHECK2-NEXT: br i1 [[CMP71]], label %[[FOR_BODY72:.*]], label %[[FOR_END74:.*]] +// CHECK2: [[FOR_BODY72]]: +// CHECK2-NEXT: [[TMP61:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: store ptr [[TMP61]], ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP62:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK2-NEXT: [[TMP63:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP64:%.*]] = load double, ptr [[TMP63]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP62]], double noundef [[TMP64]]) +// CHECK2-NEXT: br label %[[FOR_INC73:.*]] +// CHECK2: [[FOR_INC73]]: +// CHECK2-NEXT: [[TMP65:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP65]], i32 1 +// CHECK2-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND70]] +// CHECK2: [[FOR_END74]]: +// CHECK2-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @tfoo2( // CHECK2-SAME: ) #[[ATTR0]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -1593,7 +2097,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -1602,7 +2105,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1611,7 +2113,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP21:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 @@ -1641,174 +2142,168 @@ extern 
"C" void foo4() { // CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr 
[[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP18]], [[TMP19]] +// CHECK2-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 // CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 // CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] -// CHECK2-NEXT: store i32 [[ADD16]], ptr [[K]], 
align 4 -// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK2-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] // CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] // CHECK2-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK2-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP24]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[SUB23:%.*]] = sub i32 [[TMP25]], [[TMP26]] // CHECK2-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 -// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] -// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[ADD25:%.*]] = add i32 
[[SUB24]], [[TMP27]] +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP28]] // CHECK2-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 // CHECK2-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK2-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 -// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK2-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: [[ADD28:%.*]] = add i32 [[TMP29]], 1 // CHECK2-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 -// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP30]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP31]], [[TMP32]] // CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] // CHECK2: [[COND_TRUE]]: -// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 // CHECK2-NEXT: br label %[[COND_END:.*]] // CHECK2: [[COND_FALSE]]: -// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] 
= load i32, ptr [[DOTOMP_NI1]], align 4 // CHECK2-NEXT: br label %[[COND_END]] // CHECK2: [[COND_END]]: -// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ] +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP33]], %[[COND_TRUE]] ], [ [[TMP34]], %[[COND_FALSE]] ] // CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4 -// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 -// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 -// CHECK2-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]] +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP35]], [[TMP36]] // CHECK2-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], label %[[COND_FALSE31:.*]] // CHECK2: [[COND_TRUE30]]: -// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 // CHECK2-NEXT: br label %[[COND_END32:.*]] // CHECK2: [[COND_FALSE31]]: -// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 // CHECK2-NEXT: br label %[[COND_END32]] // CHECK2: [[COND_END32]]: -// CHECK2-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ] +// CHECK2-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP37]], %[[COND_TRUE30]] ], [ [[TMP38]], %[[COND_FALSE31]] ] // CHECK2-NEXT: store i32 [[COND33]], ptr [[DOTOMP_FUSE_MAX]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: br label %[[FOR_COND:.*]] // CHECK2: [[FOR_COND]]: -// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 -// CHECK2-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]] +// 
CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP39]], [[TMP40]] // CHECK2-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK2: [[FOR_BODY]]: -// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]] +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP41]], [[TMP42]] // CHECK2-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] // CHECK2: [[IF_THEN]]: -// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]] -// CHECK2-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]] +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP44]], [[TMP45]] +// CHECK2-NEXT: [[ADD36:%.*]] = add i32 [[TMP43]], [[MUL]] // CHECK2-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 -// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 -// CHECK2-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]] -// CHECK2-NEXT: [[ADD38:%.*]] = add i32 [[TMP49]], [[MUL37]] +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr 
[[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL37:%.*]] = mul i32 [[TMP47]], [[TMP48]] +// CHECK2-NEXT: [[ADD38:%.*]] = add i32 [[TMP46]], [[MUL37]] // CHECK2-NEXT: store i32 [[ADD38]], ptr [[I]], align 4 -// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP52]]) +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP49]]) // CHECK2-NEXT: br label %[[IF_END]] // CHECK2: [[IF_END]]: -// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP50]], [[TMP51]] // CHECK2-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]] // CHECK2: [[IF_THEN40]]: -// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK2-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]] -// CHECK2-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]] +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL41:%.*]] = mul i32 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: [[ADD42:%.*]] = add i32 [[TMP52]], [[MUL41]] // CHECK2-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP58:%.*]] = load i32, ptr 
[[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]] -// CHECK2-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]] +// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL43:%.*]] = mul i32 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: [[SUB44:%.*]] = sub i32 [[TMP55]], [[MUL43]] // CHECK2-NEXT: store i32 [[SUB44]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP61]]) +// CHECK2-NEXT: [[TMP58:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP58]]) // CHECK2-NEXT: br label %[[IF_END45]] // CHECK2: [[IF_END45]]: -// CHECK2-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 -// CHECK2-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]] +// CHECK2-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP59]], [[TMP60]] // CHECK2-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] // CHECK2: [[IF_THEN47]]: -// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 -// CHECK2-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 -// CHECK2-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]] -// CHECK2-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]] +// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 +// CHECK2-NEXT: [[TMP62:%.*]] = 
load i32, ptr [[DOTOMP_ST2]], align 4 +// CHECK2-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL48:%.*]] = mul i32 [[TMP62]], [[TMP63]] +// CHECK2-NEXT: [[ADD49:%.*]] = add i32 [[TMP61]], [[MUL48]] // CHECK2-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4 -// CHECK2-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK2-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 -// CHECK2-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]] -// CHECK2-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]] +// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 +// CHECK2-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[MUL50:%.*]] = mul i32 [[TMP65]], [[TMP66]] +// CHECK2-NEXT: [[ADD51:%.*]] = add i32 [[TMP64]], [[MUL50]] // CHECK2-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 -// CHECK2-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP70]]) +// CHECK2-NEXT: [[TMP67:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP67]]) // CHECK2-NEXT: br label %[[IF_END52]] // CHECK2: [[IF_END52]]: // CHECK2-NEXT: br label %[[FOR_INC:.*]] // CHECK2: [[FOR_INC]]: -// CHECK2-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 +// CHECK2-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP68]], 1 // CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP8:![0-9]+]] +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP10:![0-9]+]] // CHECK2: [[FOR_END]]: // CHECK2-NEXT: ret void // @@ -1819,6 +2314,8 @@ extern "C" void foo4() { // CHECK1: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} // CHECK1: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]} // CHECK1: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]} +// CHECK1: [[LOOP9]] = distinct !{[[LOOP9]], [[META4]]} +// CHECK1: [[LOOP10]] = distinct !{[[LOOP10]], [[META4]]} //. // CHECK2: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]} // CHECK2: [[META4]] = !{!"llvm.loop.mustprogress"} @@ -1826,4 +2323,6 @@ extern "C" void foo4() { // CHECK2: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} // CHECK2: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]} // CHECK2: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]} +// CHECK2: [[LOOP9]] = distinct !{[[LOOP9]], [[META4]]} +// CHECK2: [[LOOP10]] = distinct !{[[LOOP10]], [[META4]]} //. 
>From 0d90fa9bbeb6ea0d35ceaa7ef27a42463f257320 Mon Sep 17 00:00:00 2001
From: eZWALT
Date: Fri, 9 May 2025 10:44:48 +0000
Subject: [PATCH 5/7] Fixed missing diagnostic groups in warnings

---
 clang/include/clang/Basic/DiagnosticSemaKinds.td | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index ecfb0c83a3851..94d1f3c3e6349 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -11517,7 +11517,8 @@ def note_omp_implicit_dsa : Note<
 def err_omp_loop_var_dsa : Error<
 "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">;
 def warn_omp_different_loop_ind_var_types : Warning <
- "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">;
+ "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">,
+ InGroup;
 def err_omp_not_canonical_loop : Error <
 "loop after '#pragma omp %0' is not in canonical form">;
 def err_omp_not_a_loop_sequence : Error <
@@ -11528,7 +11529,8 @@ def err_omp_invalid_looprange : Error <
 "loop range in '#pragma omp %0' exceeds the number of available loops: "
 "range end '%1' is greater than the total number of loops '%2'">;
 def warn_omp_redundant_fusion : Warning <
- "loop range in '#pragma omp %0' contains only a single loop, resulting in redundant fusion">;
+ "loop range in '#pragma omp %0' contains only a single loop, resulting in redundant fusion">,
+ InGroup;
 def err_omp_not_for : Error<
 "%select{statement after '#pragma omp %1' must be a for loop|"
 "expected %2 for loops after '#pragma omp %1'%select{|, but found only %4}3}0">;

>From 07e7dc817c0862d910599ccae7c5057f72cf7fef Mon Sep 17 00:00:00 2001
From: eZWALT
Date: Fri, 9 May 2025 10:49:50 +0000
Subject: [PATCH 6/7] Fixed formatting and comments

---
 clang/lib/Sema/SemaOpenMP.cpp | 112 ++++++++++++++++++----------------
 1 file changed, 58 insertions(+), 54 deletions(-)

diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp
index 30f8cd3087268..e6557fe9e2187 100644
--- a/clang/lib/Sema/SemaOpenMP.cpp
+++ b/clang/lib/Sema/SemaOpenMP.cpp
@@ -14128,42 +14128,43 @@ StmtResult SemaOpenMP::ActOnOpenMPTargetTeamsDistributeSimdDirective(
 }

 // Overloaded base case function
-template
-static bool tryHandleAs(T *t, F &&) {
- return false;
+template static bool tryHandleAs(T *t, F &&) {
+ return false;
 }

 /**
- * Tries to recursively cast `t` to one of the given types and invokes `f` if successful.
+ * Tries to recursively cast `t` to one of the given types and invokes `f` if
+ * successful.
 *
 * @tparam Class The first type to check.
 * @tparam Rest The remaining types to check.
 * @tparam T The base type of `t`.
- * @tparam F The callable type for the function to invoke upon a successful cast.
+ * @tparam F The callable type for the function to invoke upon a successful
+ * cast.
 * @param t The object to be checked.
 * @param f The function to invoke if `t` matches `Class`.
 * @return `true` if `t` matched any type and `f` was called, otherwise `false`.
 */
 template
 static bool tryHandleAs(T *t, F &&f) {
- if (Class *c = dyn_cast(t)) {
- f(c);
- return true;
- } else {
- return tryHandleAs(t, std::forward(f));
- }
+ if (Class *c = dyn_cast(t)) {
+ f(c);
+ return true;
+ } else {
+ return tryHandleAs(t, std::forward(f));
+ }
 }

 // Updates OriginalInits by checking Transform against loop transformation
 // directives and appending their pre-inits if a match is found.
 static void updatePreInits(OMPLoopBasedDirective *Transform,
 SmallVectorImpl> &PreInits) {
- if (!tryHandleAs(
- Transform, [&PreInits](auto *Dir) {
- appendFlattenedStmtList(PreInits.back(), Dir->getPreInits());
- }))
- llvm_unreachable("Unhandled loop transformation");
+ if (!tryHandleAs(
+ Transform, [&PreInits](auto *Dir) {
+ appendFlattenedStmtList(PreInits.back(), Dir->getPreInits());
+ }))
+ llvm_unreachable("Unhandled loop transformation");
 }

 bool SemaOpenMP::checkTransformableLoopNest(
@@ -14241,43 +14242,42 @@ class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor {
 unsigned getNestedLoopCount() const { return NestedLoopCount; }

 bool VisitForStmt(ForStmt *FS) override {
- ++NestedLoopCount;
- return true;
+ ++NestedLoopCount;
+ return true;
 }

 bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override {
- ++NestedLoopCount;
- return true;
+ ++NestedLoopCount;
+ return true;
 }

 bool TraverseStmt(Stmt *S) override {
- if (!S)
+ if (!S)
 return true;

- // Skip traversal of all expressions, including special cases like
- // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions
- // may contain inner statements (and even loops), but they are not part
- // of the syntactic body of the surrounding loop structure.
- // Therefore must not be counted
- if (isa(S))
+ // Skip traversal of all expressions, including special cases like
+ // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions
+ // may contain inner statements (and even loops), but they are not part
+ // of the syntactic body of the surrounding loop structure.
+ // Therefore must not be counted
+ if (isa(S))
 return true;

- // Only recurse into CompoundStmt (block {}) and loop bodies
- if (isa(S) || isa(S) ||
- isa(S)) {
+ // Only recurse into CompoundStmt (block {}) and loop bodies
+ if (isa(S) || isa(S) || isa(S)) {
 return DynamicRecursiveASTVisitor::TraverseStmt(S);
- }
+ }

- // Stop traversal of the rest of statements, that break perfect
- // loop nesting, such as control flow (IfStmt, SwitchStmt...)
- return true;
+ // Stop traversal of the rest of statements, that break perfect
+ // loop nesting, such as control flow (IfStmt, SwitchStmt...)
+ return true;
 }

 bool TraverseDecl(Decl *D) override {
- // Stop in the case of finding a declaration, it is not important
- // in order to find nested loops (Possible CXXRecordDecl, RecordDecl,
- // FunctionDecl...)
- return true;
+ // Stop in the case of finding a declaration, it is not important
+ // in order to find nested loops (Possible CXXRecordDecl, RecordDecl,
+ // FunctionDecl...)
+ return true;
 }
 };

@@ -14435,15 +14435,14 @@ bool SemaOpenMP::analyzeLoopSequence(
 return isa(Child);
 };

-
 // High level grammar validation
 for (auto *Child : LoopSeqStmt->children()) {
- if (!Child)
+ if (!Child)
 continue;

- // Skip over non-loop-sequence statements
- if (!isLoopSequenceDerivation(Child)) {
+ // Skip over non-loop-sequence statements
+ if (!isLoopSequenceDerivation(Child)) {
 Child = Child->IgnoreContainers();

 // Ignore empty compound statement
@@ -14461,9 +14460,9 @@ bool SemaOpenMP::analyzeLoopSequence(
 // Already been treated, skip this children
 continue;
 }
- }
- // Regular loop sequence handling
- if (isLoopSequenceDerivation(Child)) {
+ }
+ // Regular loop sequence handling
+ if (isLoopSequenceDerivation(Child)) {
 if (isLoopGeneratingStmt(Child)) {
 if (!analyzeLoopGeneration(Child)) {
 return false;
@@ -14477,12 +14476,12 @@ bool SemaOpenMP::analyzeLoopSequence(
 // Update the Loop Sequence size by one
 ++LoopSeqSize;
 }
- } else {
+ } else {
 // Report error for invalid statement inside canonical loop sequence
 Diag(Child->getBeginLoc(), diag::err_omp_not_for)
 << 0 << getOpenMPDirectiveName(Kind);
 return false;
- }
+ }
 }
 return true;
 }
@@ -14499,9 +14498,9 @@ bool SemaOpenMP::checkTransformableLoopSequence(
 // Checks whether the given statement is a compound statement
 if (!isa(AStmt)) {
- Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence)
- << getOpenMPDirectiveName(Kind);
- return false;
+ Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence)
+ << getOpenMPDirectiveName(Kind);
+ return false;
 }
 // Number of top level canonical loop nests observed (And acts as index)
 LoopSeqSize = 0;
@@ -14532,7 +14531,7 @@ bool SemaOpenMP::checkTransformableLoopSequence(
 OriginalInits, TransformsPreInits,
 LoopSequencePreInits, LoopCategories, Context,
 Kind)) {
- return false;
+ return false;
 }
 if (LoopSeqSize <= 0) {
 Diag(AStmt->getBeginLoc(), diag::err_omp_empty_loop_sequence)
@@ -15233,7 +15232,7 @@ StmtResult SemaOpenMP::ActOnOpenMPUnrollDirective(ArrayRef Clauses,
 Stmt *LoopStmt = nullptr;
 collectLoopStmts(AStmt, {LoopStmt});

- // Determine the PreInit declarations.e
+ // Determine the PreInit declarations.
 SmallVector PreInits;
 addLoopPreInits(Context, LoopHelper, LoopStmt, OriginalInits[0], PreInits);

@@ -15848,13 +15847,18 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses,
 CountVal = CountInt.getZExtValue();
 };

- // Checks if the loop range is valid
+ // OpenMP [6.0, Restrictions]
+ // first + count - 1 must not evaluate to a value greater than the
+ // loop sequence length of the associated canonical loop sequence.
 auto ValidLoopRange = [](uint64_t FirstVal, uint64_t CountVal,
 unsigned NumLoops) -> bool {
 return FirstVal + CountVal - 1 <= NumLoops;
 };
 uint64_t FirstVal = 1, CountVal = 0, LastVal = LoopSeqSize;

+ // Validates the loop range after evaluating the semantic information
+ // and ensures that the range is valid for the given loop sequence size.
+ // Expressions are evaluated at compile time to obtain constant values.
 if (LRC) {
 EvaluateLoopRangeArguments(LRC->getFirst(), LRC->getCount(), FirstVal,
 CountVal);

>From 78cec6d0600464d3336cdf7af19beffa12025474 Mon Sep 17 00:00:00 2001
From: eZWALT
Date: Fri, 9 May 2025 10:58:54 +0000
Subject: [PATCH 7/7] Added minimal changes to enable flang future implementation

---
 flang/include/flang/Parser/dump-parse-tree.h | 1 +
 flang/include/flang/Parser/parse-tree.h | 9 +++++++++
 flang/lib/Lower/OpenMP/Clauses.cpp | 5 +++++
 flang/lib/Lower/OpenMP/Clauses.h | 1 +
 flang/lib/Parser/openmp-parsers.cpp | 7 +++++++
 flang/lib/Parser/unparse.cpp | 7 +++++++
 flang/lib/Semantics/check-omp-structure.cpp | 9 +++++++++
 llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 +
 8 files changed, 40 insertions(+)

diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h
index a3721bc8410ba..4f2d715ba6a2e 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -608,6 +608,7 @@ class ParseTreeDumper {
 NODE(OmpLinearClause, Modifier)
 NODE(parser, OmpLinearModifier)
 NODE_ENUM(OmpLinearModifier, Value)
+ NODE(parser, OmpLoopRangeClause)
 NODE(parser, OmpStepComplexModifier)
 NODE(parser, OmpStepSimpleModifier)
 NODE(parser, OmpLoopDirective)
diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h
index a0d7a797e7203..ae120ca20f686 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -4361,6 +4361,15 @@ struct OmpLinearClause {
 std::tuple t;
 };

+// Ref: [6.0:207-208]
+//
+// loop-range-clause ->
+// LOOPRANGE(first, count) // since 6.0
+struct OmpLoopRangeClause {
+ TUPLE_CLASS_BOILERPLATE(OmpLoopRangeClause);
+ std::tuple t;
+};
+
 // Ref: [4.5:216-219], [5.0:315-324], [5.1:347-355], [5.2:150-158]
 //
 // map-clause ->
diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp
index c258bef2e4427..d26733138fa4f 100644
--- a/flang/lib/Lower/OpenMP/Clauses.cpp
+++ b/flang/lib/Lower/OpenMP/Clauses.cpp
@@ -998,6 +998,11 @@ Link make(const parser::OmpClause::Link &inp,
 return Link{/*List=*/makeObjects(inp.v, semaCtx)};
 }

+LoopRange make(const parser::OmpClause::Looprange &inp,
+ semantics::SemanticsContext &semaCtx) {
+ llvm_unreachable("Unimplemented: looprange");
+}
+
 Map make(const parser::OmpClause::Map &inp,
 semantics::SemanticsContext &semaCtx) {
 // inp.v -> parser::OmpMapClause
diff --git a/flang/lib/Lower/OpenMP/Clauses.h b/flang/lib/Lower/OpenMP/Clauses.h
index d7ab21d428e32..bda8571e65f23 100644
--- a/flang/lib/Lower/OpenMP/Clauses.h
+++ b/flang/lib/Lower/OpenMP/Clauses.h
@@ -239,6 +239,7 @@ using Initializer = tomp::clause::InitializerT;
 using InReduction = tomp::clause::InReductionT;
 using IsDevicePtr = tomp::clause::IsDevicePtrT;
 using Lastprivate = tomp::clause::LastprivateT;
+using LoopRange = tomp::clause::LoopRangeT;
 using Linear = tomp::clause::LinearT;
 using Link = tomp::clause::LinkT;
 using Map = tomp::clause::MapT;
diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp
index c4728e0fabe61..17fffa83d5af1 100644
--- a/flang/lib/Parser/openmp-parsers.cpp
+++ b/flang/lib/Parser/openmp-parsers.cpp
@@ -837,6 +837,11 @@ TYPE_PARSER(
 maybe(":"_tok >> nonemptyList(Parser{})),
 /*PostModified=*/pure(true)))

+TYPE_PARSER(
+ construct(scalarIntConstantExpr,
+ "," >> scalarIntConstantExpr)
+)
+
 // OpenMPv5.2 12.5.2 detach-clause -> DETACH (event-handle)
 TYPE_PARSER(construct(Parser{}))

@@ -1006,6 +1011,8 @@ TYPE_PARSER( //
 parenthesized(Parser{}))) ||
 "LINK" >> construct(construct(
 parenthesized(Parser{}))) ||
+ "LOOPRANGE" >> construct(construct(
+ parenthesized(Parser{}))) ||
 "MAP" >> construct(construct(
 parenthesized(Parser{}))) ||
 "MATCH" >> construct(construct(
diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp
index 1ee9096fcda56..bf7daa44c7bd6 100644
--- a/flang/lib/Parser/unparse.cpp
+++ b/flang/lib/Parser/unparse.cpp
@@ -2309,6 +2309,13 @@ class UnparseVisitor {
 }
 }
 }
+ void Unparse(const OmpLoopRangeClause &x) {
+ Word("LOOPRANGE(");
+ Walk(std::get<0>(x.t));
+ Put(", ");
+ Walk(std::get<1>(x.t));
+ Put(")");
+ }
 void Unparse(const OmpReductionClause &x) {
 using Modifier = OmpReductionClause::Modifier;
 Walk(std::get>>(x.t), ": ");
diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp
index dd8e511642976..fc9e2e32d6639 100644
--- a/flang/lib/Semantics/check-omp-structure.cpp
+++ b/flang/lib/Semantics/check-omp-structure.cpp
@@ -3368,6 +3368,15 @@ CHECK_REQ_CONSTANT_SCALAR_INT_CLAUSE(Collapse, OMPC_collapse)
 CHECK_REQ_CONSTANT_SCALAR_INT_CLAUSE(Safelen, OMPC_safelen)
 CHECK_REQ_CONSTANT_SCALAR_INT_CLAUSE(Simdlen, OMPC_simdlen)

+void OmpStructureChecker::Enter(const parser::OmpClause::Looprange &x) {
+ context_.Say(GetContext().clauseSource,
+ "LOOPRANGE clause is not implemented yet"_err_en_US,
+ ContextDirectiveAsFortran());
+}
+
+void OmpStructureChecker::Enter(const parser::OmpClause::FreeAgent &x) {
+ context_.Say(GetContext().clauseSource,
+ "FREE_AGENT clause is not implemented yet"_err_en_US,
 // Restrictions specific to each clause are implemented apart from the
 // generalized restrictions.
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index 366cc7ef853d3..491cd47dc2902 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -273,6 +273,7 @@ def OMPC_Link : Clause<"link"> {
 }
 def OMPC_LoopRange : Clause<"looprange"> {
   let clangClass = "OMPLoopRangeClause";
+  let flangClass = "OmpLoopRangeClause";
 }
 def OMPC_Map : Clause<"map"> {
   let clangClass = "OMPMapClause";

From flang-commits at lists.llvm.org Fri May 9 11:12:27 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Fri, 09 May 2025 11:12:27 -0700 (PDT)
Subject: [flang-commits] [flang] 4d9479f - [flang][openacc] Allow open acc routines from other modules. (#136012)
Message-ID: <681e458b.170a0220.c1a2e.9e17@mx.google.com>

Author: Andre Kuhlenschmidt
Date: 2025-05-09T11:12:24-07:00
New Revision: 4d9479fa8f4e949bc4c5768477cd36687c1c6b29

URL: https://github.com/llvm/llvm-project/commit/4d9479fa8f4e949bc4c5768477cd36687c1c6b29
DIFF: https://github.com/llvm/llvm-project/commit/4d9479fa8f4e949bc4c5768477cd36687c1c6b29.diff

LOG: [flang][openacc] Allow open acc routines from other modules. (#136012)

OpenACC routine annotations in separate compilation units are currently ignored, which leads to compilation errors. There are two reasons why OpenACC routine information is currently ignored, and this PR addresses both.
- The module file reader doesn't read OpenACC directives back in from module files.
  - Simple fix in `flang/lib/Semantics/mod-file.cpp`
- The lowering to HLFIR doesn't generate routine directives for symbols imported from other modules that are OpenACC routines.
  - This is the majority of this diff, and is addressed by the changes that start in `flang/lib/Lower/CallInterface.cpp`.
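For routine information to survive a round trip through a module file, the writer has to emit each clause in a form the parser can read back. The following is a minimal sketch of that idea for the bind clause only, not the Flang implementation. `SymbolStub`, `BindName`, and `formatBindClause` are hypothetical names, and the variant of a string and a symbol pointer is an assumption based on the patch's own comment that `bind("name")` is stored as a `std::string` while `bind(sym)` is stored as a `SymbolRef`.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-in for a semantics symbol; only its name is used here.
struct SymbolStub {
  std::string name;
};

// Assumed shape of a bind name: either a string literal or a symbol reference.
using BindName = std::variant<std::string, const SymbolStub *>;

// Renders the text of a bind clause, or an empty string when no name is set.
std::string formatBindClause(const std::optional<BindName> &bindName) {
  if (!bindName) {
    return "";
  }
  std::string out{" bind("};
  if (const auto *str = std::get_if<std::string>(&*bindName)) {
    // The string form keeps its quotes so it parses back as a string.
    out += "\"" + *str + "\"";
  } else {
    // The symbol form is written bare, as an identifier.
    const SymbolStub *sym = std::get<const SymbolStub *>(*bindName);
    assert(sym && "symbol bind name must not be null");
    out += sym->name;
  }
  out += ")";
  return out;
}
```

Keeping the two alternatives distinct in the written text matters because a quoted name is used as given, whereas a symbol name still has to be mangled during lowering.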
Added: flang/test/Lower/OpenACC/acc-module-definition.f90 flang/test/Lower/OpenACC/acc-routine-use-module.f90 Modified: flang/include/flang/Lower/OpenACC.h flang/include/flang/Semantics/symbol.h flang/lib/Lower/Bridge.cpp flang/lib/Lower/CallInterface.cpp flang/lib/Lower/OpenACC.cpp flang/lib/Semantics/mod-file.cpp flang/lib/Semantics/resolve-directives.cpp flang/lib/Semantics/symbol.cpp flang/test/Lower/OpenACC/acc-routine-named.f90 flang/test/Lower/OpenACC/acc-routine.f90 Removed: ################################################################################ diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index 0d7038a7fd856..bbe3b01fdb29d 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -22,6 +22,9 @@ class StringRef; } // namespace llvm namespace mlir { +namespace func { +class FuncOp; +} // namespace func class Location; class Type; class ModuleOp; @@ -31,9 +34,13 @@ class Value; namespace fir { class FirOpBuilder; -} +} // namespace fir namespace Fortran { +namespace evaluate { +struct ProcedureDesignator; +} // namespace evaluate + namespace parser { struct AccClauseList; struct OpenACCConstruct; @@ -42,6 +49,7 @@ struct OpenACCRoutineConstruct; } // namespace parser namespace semantics { +class OpenACCRoutineInfo; class SemanticsContext; class Symbol; } // namespace semantics @@ -55,9 +63,6 @@ namespace pft { struct Evaluation; } // namespace pft -using AccRoutineInfoMappingList = - llvm::SmallVector>; - static constexpr llvm::StringRef declarePostAllocSuffix = "_acc_declare_update_desc_post_alloc"; static constexpr llvm::StringRef declarePreDeallocSuffix = @@ -71,19 +76,12 @@ mlir::Value genOpenACCConstruct(AbstractConverter &, Fortran::semantics::SemanticsContext &, pft::Evaluation &, const parser::OpenACCConstruct &); -void genOpenACCDeclarativeConstruct(AbstractConverter &, - Fortran::semantics::SemanticsContext &, - StatementContext &, - const 
parser::OpenACCDeclarativeConstruct &, - AccRoutineInfoMappingList &); -void genOpenACCRoutineConstruct(AbstractConverter &, - Fortran::semantics::SemanticsContext &, - mlir::ModuleOp, - const parser::OpenACCRoutineConstruct &, - AccRoutineInfoMappingList &); - -void finalizeOpenACCRoutineAttachment(mlir::ModuleOp, - AccRoutineInfoMappingList &); +void genOpenACCDeclarativeConstruct( + AbstractConverter &, Fortran::semantics::SemanticsContext &, + StatementContext &, const parser::OpenACCDeclarativeConstruct &); +void genOpenACCRoutineConstruct( + AbstractConverter &, mlir::ModuleOp, mlir::func::FuncOp, + const std::vector &); /// Get a acc.private.recipe op for the given type or create it if it does not /// exist yet. diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 1d997abef6dee..97c1e30631840 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -22,6 +22,7 @@ #include #include #include +#include #include namespace llvm { @@ -127,6 +128,9 @@ class WithBindName { // Device type specific OpenACC routine information class OpenACCRoutineDeviceTypeInfo { public: + explicit OpenACCRoutineDeviceTypeInfo( + Fortran::common::OpenACCDeviceType dType) + : deviceType_{dType} {} bool isSeq() const { return isSeq_; } void set_isSeq(bool value = true) { isSeq_ = value; } bool isVector() const { return isVector_; } @@ -137,22 +141,30 @@ class OpenACCRoutineDeviceTypeInfo { void set_isGang(bool value = true) { isGang_ = value; } unsigned gangDim() const { return gangDim_; } void set_gangDim(unsigned value) { gangDim_ = value; } - const std::string *bindName() const { - return bindName_ ? &*bindName_ : nullptr; + const std::variant *bindName() const { + return bindName_.has_value() ? 
&*bindName_ : nullptr; } - void set_bindName(std::string &&name) { bindName_ = std::move(name); } - void set_dType(Fortran::common::OpenACCDeviceType dType) { - deviceType_ = dType; + const std::optional> & + bindNameOpt() const { + return bindName_; } + void set_bindName(std::string &&name) { bindName_.emplace(std::move(name)); } + void set_bindName(SymbolRef symbol) { bindName_.emplace(symbol); } + Fortran::common::OpenACCDeviceType dType() const { return deviceType_; } + friend llvm::raw_ostream &operator<<( + llvm::raw_ostream &, const OpenACCRoutineDeviceTypeInfo &); + private: bool isSeq_{false}; bool isVector_{false}; bool isWorker_{false}; bool isGang_{false}; unsigned gangDim_{0}; - std::optional bindName_; + // bind("name") -> std::string + // bind(sym) -> SymbolRef (requires namemangling in lowering) + std::optional> bindName_; Fortran::common::OpenACCDeviceType deviceType_{ Fortran::common::OpenACCDeviceType::None}; }; @@ -162,15 +174,29 @@ class OpenACCRoutineDeviceTypeInfo { // in as objects in the OpenACCRoutineDeviceTypeInfo list. 
class OpenACCRoutineInfo : public OpenACCRoutineDeviceTypeInfo { public: + OpenACCRoutineInfo() + : OpenACCRoutineDeviceTypeInfo(Fortran::common::OpenACCDeviceType::None) { + } bool isNohost() const { return isNohost_; } void set_isNohost(bool value = true) { isNohost_ = value; } - std::list &deviceTypeInfos() { + const std::list &deviceTypeInfos() const { return deviceTypeInfos_; } - void add_deviceTypeInfo(OpenACCRoutineDeviceTypeInfo &info) { - deviceTypeInfos_.push_back(info); + + OpenACCRoutineDeviceTypeInfo &add_deviceTypeInfo( + Fortran::common::OpenACCDeviceType type) { + return add_deviceTypeInfo(OpenACCRoutineDeviceTypeInfo(type)); + } + + OpenACCRoutineDeviceTypeInfo &add_deviceTypeInfo( + OpenACCRoutineDeviceTypeInfo &&info) { + deviceTypeInfos_.push_back(std::move(info)); + return deviceTypeInfos_.back(); } + friend llvm::raw_ostream &operator<<( + llvm::raw_ostream &, const OpenACCRoutineInfo &); + private: std::list deviceTypeInfos_; bool isNohost_{false}; diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 0a61f61ab8f75..43375e84f21fa 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -398,37 +398,39 @@ class FirConverter : public Fortran::lower::AbstractConverter { // they are available before lowering any function that may use them. 
bool hasMainProgram = false; const Fortran::semantics::Symbol *globalOmpRequiresSymbol = nullptr; - for (Fortran::lower::pft::Program::Units &u : pft.getUnits()) { - Fortran::common::visit( - Fortran::common::visitors{ - [&](Fortran::lower::pft::FunctionLikeUnit &f) { - if (f.isMainProgram()) - hasMainProgram = true; - declareFunction(f); - if (!globalOmpRequiresSymbol) - globalOmpRequiresSymbol = f.getScope().symbol(); - }, - [&](Fortran::lower::pft::ModuleLikeUnit &m) { - lowerModuleDeclScope(m); - for (Fortran::lower::pft::ContainedUnit &unit : - m.containedUnitList) - if (auto *f = - std::get_if( - &unit)) - declareFunction(*f); - }, - [&](Fortran::lower::pft::BlockDataUnit &b) { - if (!globalOmpRequiresSymbol) - globalOmpRequiresSymbol = b.symTab.symbol(); - }, - [&](Fortran::lower::pft::CompilerDirectiveUnit &d) {}, - [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {}, - }, - u); - } + createBuilderOutsideOfFuncOpAndDo([&]() { + for (Fortran::lower::pft::Program::Units &u : pft.getUnits()) { + Fortran::common::visit( + Fortran::common::visitors{ + [&](Fortran::lower::pft::FunctionLikeUnit &f) { + if (f.isMainProgram()) + hasMainProgram = true; + declareFunction(f); + if (!globalOmpRequiresSymbol) + globalOmpRequiresSymbol = f.getScope().symbol(); + }, + [&](Fortran::lower::pft::ModuleLikeUnit &m) { + lowerModuleDeclScope(m); + for (Fortran::lower::pft::ContainedUnit &unit : + m.containedUnitList) + if (auto *f = + std::get_if( + &unit)) + declareFunction(*f); + }, + [&](Fortran::lower::pft::BlockDataUnit &b) { + if (!globalOmpRequiresSymbol) + globalOmpRequiresSymbol = b.symTab.symbol(); + }, + [&](Fortran::lower::pft::CompilerDirectiveUnit &d) {}, + [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {}, + }, + u); + } + }); // Create definitions of intrinsic module constants. - createGlobalOutsideOfFunctionLowering( + createBuilderOutsideOfFuncOpAndDo( [&]() { createIntrinsicModuleDefinitions(pft); }); // Primary translation pass. 
@@ -439,14 +441,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { [&](Fortran::lower::pft::ModuleLikeUnit &m) { lowerMod(m); }, [&](Fortran::lower::pft::BlockDataUnit &b) {}, [&](Fortran::lower::pft::CompilerDirectiveUnit &d) {}, - [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) { - builder = new fir::FirOpBuilder( - bridge.getModule(), bridge.getKindMap(), &mlirSymbolTable); - Fortran::lower::genOpenACCRoutineConstruct( - *this, bridge.getSemanticsContext(), bridge.getModule(), - d.routine, accRoutineInfos); - builder = nullptr; - }, + [&](Fortran::lower::pft::OpenACCDirectiveUnit &d) {}, }, u); } @@ -454,24 +449,24 @@ class FirConverter : public Fortran::lower::AbstractConverter { // Once all the code has been translated, create global runtime type info // data structures for the derived types that have been processed, as well // as fir.type_info operations for the dispatch tables. - createGlobalOutsideOfFunctionLowering( + createBuilderOutsideOfFuncOpAndDo( [&]() { typeInfoConverter.createTypeInfo(*this); }); // Generate the `main` entry point if necessary if (hasMainProgram) - createGlobalOutsideOfFunctionLowering([&]() { + createBuilderOutsideOfFuncOpAndDo([&]() { fir::runtime::genMain(*builder, toLocation(), bridge.getEnvironmentDefaults(), getFoldingContext().languageFeatures().IsEnabled( Fortran::common::LanguageFeature::CUDA)); }); - finalizeOpenACCLowering(); finalizeOpenMPLowering(globalOmpRequiresSymbol); } /// Declare a function. 
void declareFunction(Fortran::lower::pft::FunctionLikeUnit &funit) { + CHECK(builder && "declareFunction called with uninitialized builder"); setCurrentPosition(funit.getStartingSourceLoc()); for (int entryIndex = 0, last = funit.entryPointList.size(); entryIndex < last; ++entryIndex) { @@ -1036,7 +1031,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { return bridge.getSemanticsContext().FindScope(currentPosition); } - fir::FirOpBuilder &getFirOpBuilder() override final { return *builder; } + fir::FirOpBuilder &getFirOpBuilder() override final { + CHECK(builder && "builder is not set before calling getFirOpBuilder"); + return *builder; + } mlir::ModuleOp getModuleOp() override final { return bridge.getModule(); } @@ -3063,8 +3061,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { void genFIR(const Fortran::parser::OpenACCDeclarativeConstruct &accDecl) { genOpenACCDeclarativeConstruct(*this, bridge.getSemanticsContext(), - bridge.openAccCtx(), accDecl, - accRoutineInfos); + bridge.openAccCtx(), accDecl); for (Fortran::lower::pft::Evaluation &e : getEval().getNestedEvaluations()) genFIR(e); } @@ -5661,6 +5658,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { LLVM_DEBUG(llvm::dbgs() << "\n[bridge - startNewFunction]"; if (auto *sym = scope.symbol()) llvm::dbgs() << " " << *sym; llvm::dbgs() << "\n"); + // Setting the builder is not necessary here, because callee + // always looks up the FuncOp from the module. If there was a function that + // was not declared yet, this call to callee will cause an assertion + // failure. Fortran::lower::CalleeInterface callee(funit, *this); mlir::func::FuncOp func = callee.addEntryBlockAndMapArguments(); builder = @@ -5930,8 +5931,9 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Helper to generate GlobalOps when the builder is not positioned in any /// region block. 
This is required because the FirOpBuilder assumes it is /// always positioned inside a region block when creating globals, the easiest - /// way comply is to create a dummy function and to throw it afterwards. - void createGlobalOutsideOfFunctionLowering( + /// way to comply is to create a dummy function and to throw it away + /// afterwards. + void createBuilderOutsideOfFuncOpAndDo( const std::function &createGlobals) { // FIXME: get rid of the bogus function context and instantiate the // globals directly into the module. @@ -5943,6 +5945,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { mlir::FunctionType::get(context, std::nullopt, std::nullopt), symbolTable); func.addEntryBlock(); + CHECK(!builder && "Expected builder to be uninitialized"); builder = new fir::FirOpBuilder(func, bridge.getKindMap(), symbolTable); assert(builder && "FirOpBuilder did not instantiate"); builder->setFastMathFlags(bridge.getLoweringOptions().getMathOptions()); @@ -5958,7 +5961,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Instantiate the data from a BLOCK DATA unit. void lowerBlockData(Fortran::lower::pft::BlockDataUnit &bdunit) { - createGlobalOutsideOfFunctionLowering([&]() { + createBuilderOutsideOfFuncOpAndDo([&]() { Fortran::lower::AggregateStoreMap fakeMap; for (const auto &[_, sym] : bdunit.symTab) { if (sym->has()) { @@ -5972,7 +5975,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// Create fir::Global for all the common blocks that appear in the program. void lowerCommonBlocks(const Fortran::semantics::CommonBlockList &commonBlocks) { - createGlobalOutsideOfFunctionLowering( + createBuilderOutsideOfFuncOpAndDo( [&]() { Fortran::lower::defineCommonBlocks(*this, commonBlocks); }); } @@ -6042,36 +6045,34 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// declarative construct. 
void lowerModuleDeclScope(Fortran::lower::pft::ModuleLikeUnit &mod) { setCurrentPosition(mod.getStartingSourceLoc()); - createGlobalOutsideOfFunctionLowering([&]() { - auto &scopeVariableListMap = - Fortran::lower::pft::getScopeVariableListMap(mod); - for (const auto &var : Fortran::lower::pft::getScopeVariableList( - mod.getScope(), scopeVariableListMap)) { - - // Only define the variables owned by this module. - const Fortran::semantics::Scope *owningScope = var.getOwningScope(); - if (owningScope && mod.getScope() != *owningScope) - continue; + auto &scopeVariableListMap = + Fortran::lower::pft::getScopeVariableListMap(mod); + for (const auto &var : Fortran::lower::pft::getScopeVariableList( + mod.getScope(), scopeVariableListMap)) { - // Very special case: The value of numeric_storage_size depends on - // compilation options and therefore its value is not yet known when - // building the builtins runtime. Instead, the parameter is folding a - // __numeric_storage_size() expression which is loaded into the user - // program. For the iso_fortran_env object file, omit the symbol as it - // is never used. - if (var.hasSymbol()) { - const Fortran::semantics::Symbol &sym = var.getSymbol(); - const Fortran::semantics::Scope &owner = sym.owner(); - if (sym.name() == "numeric_storage_size" && owner.IsModule() && - DEREF(owner.symbol()).name() == "iso_fortran_env") - continue; - } + // Only define the variables owned by this module. + const Fortran::semantics::Scope *owningScope = var.getOwningScope(); + if (owningScope && mod.getScope() != *owningScope) + continue; - Fortran::lower::defineModuleVariable(*this, var); + // Very special case: The value of numeric_storage_size depends on + // compilation options and therefore its value is not yet known when + // building the builtins runtime. Instead, the parameter is folding a + // __numeric_storage_size() expression which is loaded into the user + // program. 
For the iso_fortran_env object file, omit the symbol as it + // is never used. + if (var.hasSymbol()) { + const Fortran::semantics::Symbol &sym = var.getSymbol(); + const Fortran::semantics::Scope &owner = sym.owner(); + if (sym.name() == "numeric_storage_size" && owner.IsModule() && + DEREF(owner.symbol()).name() == "iso_fortran_env") + continue; } + + Fortran::lower::defineModuleVariable(*this, var); + } for (auto &eval : mod.evaluationList) genFIR(eval); - }); } /// Lower functions contained in a module. @@ -6372,13 +6373,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { expr.u); } - /// Performing OpenACC lowering action that were deferred to the end of - /// lowering. - void finalizeOpenACCLowering() { - Fortran::lower::finalizeOpenACCRoutineAttachment(getModuleOp(), - accRoutineInfos); - } - /// Performing OpenMP lowering actions that were deferred to the end of /// lowering. void finalizeOpenMPLowering( @@ -6470,9 +6464,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { /// A counter for uniquing names in `literalNamesMap`. std::uint64_t uniqueLitId = 0; - /// Deferred OpenACC routine attachment. 
- Fortran::lower::AccRoutineInfoMappingList accRoutineInfos; - /// Whether an OpenMP target region or declare target function/subroutine /// intended for device offloading has been detected bool ompDeviceCodeFound = false; diff --git a/flang/lib/Lower/CallInterface.cpp b/flang/lib/Lower/CallInterface.cpp index 73e0984f01635..676c26dbcdbec 100644 --- a/flang/lib/Lower/CallInterface.cpp +++ b/flang/lib/Lower/CallInterface.cpp @@ -10,6 +10,7 @@ #include "flang/Evaluate/fold.h" #include "flang/Lower/Bridge.h" #include "flang/Lower/Mangler.h" +#include "flang/Lower/OpenACC.h" #include "flang/Lower/PFTBuilder.h" #include "flang/Lower/StatementContext.h" #include "flang/Lower/Support/Utils.h" @@ -715,6 +716,17 @@ void Fortran::lower::CallInterface::declare() { func.setArgAttrs(placeHolder.index(), placeHolder.value().attributes); setCUDAAttributes(func, side().getProcedureSymbol(), characteristic); + + if (const Fortran::semantics::Symbol *sym = side().getProcedureSymbol()) { + if (const auto &info{ + sym->GetUltimate() + .detailsIf()}) { + if (!info->openACCRoutineInfos().empty()) { + genOpenACCRoutineConstruct(converter, module, func, + info->openACCRoutineInfos()); + } + } + } } } } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 82daa05c165cb..2f70041a04dde 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -33,11 +33,13 @@ #include "flang/Semantics/scope.h" #include "flang/Semantics/tools.h" #include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" +#include "mlir/IR/MLIRContext.h" #include "mlir/Support/LLVM.h" #include "llvm/ADT/STLExtras.h" #include "llvm/Frontend/OpenACC/ACC.h.inc" #include "llvm/Support/CommandLine.h" #include "llvm/Support/Debug.h" +#include "llvm/Support/ErrorHandling.h" #define DEBUG_TYPE "flang-lower-openacc" @@ -4493,125 +4495,27 @@ static void attachRoutineInfo(mlir::func::FuncOp func, mlir::acc::RoutineInfoAttr::get(func.getContext(), routines)); } -void 
Fortran::lower::genOpenACCRoutineConstruct( - Fortran::lower::AbstractConverter &converter, - Fortran::semantics::SemanticsContext &semanticsContext, mlir::ModuleOp mod, - const Fortran::parser::OpenACCRoutineConstruct &routineConstruct, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { - fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - mlir::Location loc = converter.genLocation(routineConstruct.source); - std::optional name = - std::get>(routineConstruct.t); - const auto &clauses = - std::get(routineConstruct.t); - mlir::func::FuncOp funcOp; - std::string funcName; - if (name) { - funcName = converter.mangleName(*name->symbol); - funcOp = - builder.getNamedFunction(mod, builder.getMLIRSymbolTable(), funcName); +static mlir::ArrayAttr +getArrayAttrOrNull(fir::FirOpBuilder &builder, + llvm::SmallVector &attributes) { + if (attributes.empty()) { + return nullptr; } else { - Fortran::semantics::Scope &scope = - semanticsContext.FindScope(routineConstruct.source); - const Fortran::semantics::Scope &progUnit{GetProgramUnitContaining(scope)}; - const auto *subpDetails{ - progUnit.symbol() - ? progUnit.symbol() - ->detailsIf() - : nullptr}; - if (subpDetails && subpDetails->isInterface()) { - funcName = converter.mangleName(*progUnit.symbol()); - funcOp = - builder.getNamedFunction(mod, builder.getMLIRSymbolTable(), funcName); - } else { - funcOp = builder.getFunction(); - funcName = funcOp.getName(); - } - } - bool hasNohost = false; - - llvm::SmallVector seqDeviceTypes, vectorDeviceTypes, - workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, - gangDimDeviceTypes, gangDimValues; - - // device_type attribute is set to `none` until a device_type clause is - // encountered. 
- llvm::SmallVector crtDeviceTypes; - crtDeviceTypes.push_back(mlir::acc::DeviceTypeAttr::get( - builder.getContext(), mlir::acc::DeviceType::None)); - - for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - seqDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (const auto *gangClause = - std::get_if(&clause.u)) { - if (gangClause->v) { - const Fortran::parser::AccGangArgList &x = *gangClause->v; - for (const Fortran::parser::AccGangArg &gangArg : x.v) { - if (const auto *dim = - std::get_if(&gangArg.u)) { - const std::optional dimValue = Fortran::evaluate::ToInt64( - *Fortran::semantics::GetExpr(dim->v)); - if (!dimValue) - mlir::emitError(loc, - "dim value must be a constant positive integer"); - mlir::Attribute gangDimAttr = - builder.getIntegerAttr(builder.getI64Type(), *dimValue); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - gangDimValues.push_back(gangDimAttr); - gangDimDeviceTypes.push_back(crtDeviceTypeAttr); - } - } - } - } else { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - gangDeviceTypes.push_back(crtDeviceTypeAttr); - } - } else if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - vectorDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (std::get_if(&clause.u)) { - for (auto crtDeviceTypeAttr : crtDeviceTypes) - workerDeviceTypes.push_back(crtDeviceTypeAttr); - } else if (std::get_if(&clause.u)) { - hasNohost = true; - } else if (const auto *bindClause = - std::get_if(&clause.u)) { - if (const auto *name = - std::get_if(&bindClause->v.u)) { - mlir::Attribute bindNameAttr = - builder.getStringAttr(converter.mangleName(*name->symbol)); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(crtDeviceTypeAttr); - } - } else if (const auto charExpr = - std::get_if( - &bindClause->v.u)) { - const std::optional name = - 
Fortran::semantics::GetConstExpr(semanticsContext, *charExpr); - if (!name) - mlir::emitError(loc, "Could not retrieve the bind name"); - - mlir::Attribute bindNameAttr = builder.getStringAttr(*name); - for (auto crtDeviceTypeAttr : crtDeviceTypes) { - bindNames.push_back(bindNameAttr); - bindNameDeviceTypes.push_back(crtDeviceTypeAttr); - } - } - } else if (const auto *deviceTypeClause = - std::get_if( - &clause.u)) { - crtDeviceTypes.clear(); - gatherDeviceTypeAttrs(builder, deviceTypeClause, crtDeviceTypes); - } + return builder.getArrayAttr(attributes); } +} - mlir::OpBuilder modBuilder(mod.getBodyRegion()); - std::stringstream routineOpName; - routineOpName << accRoutinePrefix.str() << routineCounter++; +void createOpenACCRoutineConstruct( + Fortran::lower::AbstractConverter &converter, mlir::Location loc, + mlir::ModuleOp mod, mlir::func::FuncOp funcOp, std::string funcName, + bool hasNohost, llvm::SmallVector &bindNames, + llvm::SmallVector &bindNameDeviceTypes, + llvm::SmallVector &gangDeviceTypes, + llvm::SmallVector &gangDimValues, + llvm::SmallVector &gangDimDeviceTypes, + llvm::SmallVector &seqDeviceTypes, + llvm::SmallVector &workerDeviceTypes, + llvm::SmallVector &vectorDeviceTypes) { for (auto routineOp : mod.getOps()) { if (routineOp.getFuncName().str().compare(funcName) == 0) { @@ -4626,47 +4530,117 @@ void Fortran::lower::genOpenACCRoutineConstruct( mlir::emitError(loc, "Routine already specified with different clauses"); } } - + std::stringstream routineOpName; + routineOpName << accRoutinePrefix.str() << routineCounter++; + std::string routineOpStr = routineOpName.str(); + mlir::OpBuilder modBuilder(mod.getBodyRegion()); + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); modBuilder.create( - loc, routineOpName.str(), funcName, - bindNames.empty() ? nullptr : builder.getArrayAttr(bindNames), - bindNameDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(bindNameDeviceTypes), - workerDeviceTypes.empty() ? 
nullptr - : builder.getArrayAttr(workerDeviceTypes), - vectorDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(vectorDeviceTypes), - seqDeviceTypes.empty() ? nullptr : builder.getArrayAttr(seqDeviceTypes), - hasNohost, /*implicit=*/false, - gangDeviceTypes.empty() ? nullptr : builder.getArrayAttr(gangDeviceTypes), - gangDimValues.empty() ? nullptr : builder.getArrayAttr(gangDimValues), - gangDimDeviceTypes.empty() ? nullptr - : builder.getArrayAttr(gangDimDeviceTypes)); - - if (funcOp) - attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpName.str())); - else - // FuncOp is not lowered yet. Keep the information so the routine info - // can be attached later to the funcOp. - accRoutineInfos.push_back(std::make_pair( - funcName, builder.getSymbolRefAttr(routineOpName.str()))); + loc, routineOpStr, funcName, getArrayAttrOrNull(builder, bindNames), + getArrayAttrOrNull(builder, bindNameDeviceTypes), + getArrayAttrOrNull(builder, workerDeviceTypes), + getArrayAttrOrNull(builder, vectorDeviceTypes), + getArrayAttrOrNull(builder, seqDeviceTypes), hasNohost, + /*implicit=*/false, getArrayAttrOrNull(builder, gangDeviceTypes), + getArrayAttrOrNull(builder, gangDimValues), + getArrayAttrOrNull(builder, gangDimDeviceTypes)); + + attachRoutineInfo(funcOp, builder.getSymbolRefAttr(routineOpStr)); } -void Fortran::lower::finalizeOpenACCRoutineAttachment( - mlir::ModuleOp mod, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { - for (auto &mapping : accRoutineInfos) { - mlir::func::FuncOp funcOp = - mod.lookupSymbol(mapping.first); - if (!funcOp) - mlir::emitWarning(mod.getLoc(), - llvm::Twine("function '") + llvm::Twine(mapping.first) + - llvm::Twine("' in acc routine directive is not " - "found in this translation unit.")); - else - attachRoutineInfo(funcOp, mapping.second); +static void interpretRoutineDeviceInfo( + Fortran::lower::AbstractConverter &converter, + const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo, + llvm::SmallVector 
&seqDeviceTypes, + llvm::SmallVector &vectorDeviceTypes, + llvm::SmallVector &workerDeviceTypes, + llvm::SmallVector &bindNameDeviceTypes, + llvm::SmallVector &bindNames, + llvm::SmallVector &gangDeviceTypes, + llvm::SmallVector &gangDimValues, + llvm::SmallVector &gangDimDeviceTypes) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto getDeviceTypeAttr = [&]() -> mlir::Attribute { + auto context = builder.getContext(); + auto value = getDeviceType(dinfo.dType()); + return mlir::acc::DeviceTypeAttr::get(context, value); + }; + if (dinfo.isSeq()) { + seqDeviceTypes.push_back(getDeviceTypeAttr()); + } + if (dinfo.isVector()) { + vectorDeviceTypes.push_back(getDeviceTypeAttr()); + } + if (dinfo.isWorker()) { + workerDeviceTypes.push_back(getDeviceTypeAttr()); + } + if (dinfo.isGang()) { + unsigned gangDim = dinfo.gangDim(); + auto deviceType = getDeviceTypeAttr(); + if (!gangDim) { + gangDeviceTypes.push_back(deviceType); + } else { + gangDimValues.push_back( + builder.getIntegerAttr(builder.getI64Type(), gangDim)); + gangDimDeviceTypes.push_back(deviceType); + } + } + if (dinfo.bindNameOpt().has_value()) { + const auto &bindName = dinfo.bindNameOpt().value(); + mlir::Attribute bindNameAttr; + if (const auto &bindStr{std::get_if(&bindName)}) { + bindNameAttr = builder.getStringAttr(*bindStr); + } else if (const auto &bindSym{ + std::get_if(&bindName)}) { + bindNameAttr = builder.getStringAttr(converter.mangleName(*bindSym)); + } else { + llvm_unreachable("Unsupported bind name type"); + } + bindNames.push_back(bindNameAttr); + bindNameDeviceTypes.push_back(getDeviceTypeAttr()); } - accRoutineInfos.clear(); +} + +void Fortran::lower::genOpenACCRoutineConstruct( + Fortran::lower::AbstractConverter &converter, mlir::ModuleOp mod, + mlir::func::FuncOp funcOp, + const std::vector &routineInfos) { + CHECK(funcOp && "Expected a valid function operation"); + mlir::Location loc{funcOp.getLoc()}; + std::string funcName{funcOp.getName()}; + + // Collect the 
routine clauses + bool hasNohost{false}; + + llvm::SmallVector seqDeviceTypes, vectorDeviceTypes, + workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimDeviceTypes, gangDimValues; + + for (const Fortran::semantics::OpenACCRoutineInfo &info : routineInfos) { + // Device Independent Attributes + if (info.isNohost()) { + hasNohost = true; + } + // Note: Device Independent Attributes are set to the + // none device type in `info`. + interpretRoutineDeviceInfo(converter, info, seqDeviceTypes, + vectorDeviceTypes, workerDeviceTypes, + bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimValues, gangDimDeviceTypes); + + // Device Dependent Attributes + for (const Fortran::semantics::OpenACCRoutineDeviceTypeInfo &dinfo : + info.deviceTypeInfos()) { + interpretRoutineDeviceInfo( + converter, dinfo, seqDeviceTypes, vectorDeviceTypes, + workerDeviceTypes, bindNameDeviceTypes, bindNames, gangDeviceTypes, + gangDimValues, gangDimDeviceTypes); + } + } + createOpenACCRoutineConstruct( + converter, loc, mod, funcOp, funcName, hasNohost, bindNames, + bindNameDeviceTypes, gangDeviceTypes, gangDimValues, gangDimDeviceTypes, + seqDeviceTypes, workerDeviceTypes, vectorDeviceTypes); } static void @@ -4774,8 +4748,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct( Fortran::lower::AbstractConverter &converter, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &openAccCtx, - const Fortran::parser::OpenACCDeclarativeConstruct &accDeclConstruct, - Fortran::lower::AccRoutineInfoMappingList &accRoutineInfos) { + const Fortran::parser::OpenACCDeclarativeConstruct &accDeclConstruct) { Fortran::common::visit( common::visitors{ @@ -4784,14 +4757,7 @@ void Fortran::lower::genOpenACCDeclarativeConstruct( genACC(converter, semanticsContext, openAccCtx, standaloneDeclarativeConstruct); }, - [&](const Fortran::parser::OpenACCRoutineConstruct - &routineConstruct) { - fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - 
mlir::ModuleOp mod = builder.getModule(); - Fortran::lower::genOpenACCRoutineConstruct( - converter, semanticsContext, mod, routineConstruct, - accRoutineInfos); - }, + [&](const Fortran::parser::OpenACCRoutineConstruct &x) {}, }, accDeclConstruct.u); } diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index 12fc553518cfd..3ea37ceddd056 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -24,6 +24,7 @@ #include #include #include +#include #include namespace Fortran::semantics { @@ -638,8 +639,14 @@ static void PutOpenACCDeviceTypeRoutineInfo( if (info.isWorker()) { os << " worker"; } - if (info.bindName()) { - os << " bind(" << *info.bindName() << ")"; + if (const std::variant *bindName{info.bindName()}) { + os << " bind("; + if (std::holds_alternative(*bindName)) { + os << "\"" << std::get(*bindName) << "\""; + } else { + os << std::get(*bindName)->name(); + } + os << ")"; } } @@ -1388,6 +1395,9 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, parser::Options options; options.isModuleFile = true; options.features.Enable(common::LanguageFeature::BackslashEscapes); + if (context_.languageFeatures().IsEnabled(common::LanguageFeature::OpenACC)) { + options.features.Enable(common::LanguageFeature::OpenACC); + } options.features.Enable(common::LanguageFeature::OpenMP); options.features.Enable(common::LanguageFeature::CUDA); if (!isIntrinsic.value_or(false) && !notAModule) { diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 60531538e6d59..138749a97eb72 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1047,88 +1047,78 @@ void AccAttributeVisitor::AddRoutineInfoToSymbol( Symbol &symbol, const parser::OpenACCRoutineConstruct &x) { if (symbol.has()) { Fortran::semantics::OpenACCRoutineInfo info; - const auto &clauses = std::get(x.t); + std::vector currentDevices; + 
currentDevices.push_back(&info); + const auto &clauses{std::get(x.t)}; for (const Fortran::parser::AccClause &clause : clauses.v) { - if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isSeq(); - } else { - info.deviceTypeInfos().back().set_isSeq(); + if (const auto *dTypeClause{ + std::get_if(&clause.u)}) { + currentDevices.clear(); + for (const auto &deviceTypeExpr : dTypeClause->v.v) { + currentDevices.push_back(&info.add_deviceTypeInfo(deviceTypeExpr.v)); } - } else if (const auto *gangClause = - std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isGang(); - } else { - info.deviceTypeInfos().back().set_isGang(); + } else if (std::get_if(&clause.u)) { + info.set_isNohost(); + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isSeq(); + } + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isVector(); + } + } else if (std::get_if(&clause.u)) { + for (auto &device : currentDevices) { + device->set_isWorker(); + } + } else if (const auto *gangClause{ + std::get_if( + &clause.u)}) { + for (auto &device : currentDevices) { + device->set_isGang(); } if (gangClause->v) { const Fortran::parser::AccGangArgList &x = *gangClause->v; + int numArgs{0}; for (const Fortran::parser::AccGangArg &gangArg : x.v) { - if (const auto *dim = - std::get_if(&gangArg.u)) { + CHECK(numArgs <= 1 && "expecting 0 or 1 gang dim args"); + if (const auto *dim{std::get_if( + &gangArg.u)}) { if (const auto v{EvaluateInt64(context_, dim->v)}) { - if (info.deviceTypeInfos().empty()) { - info.set_gangDim(*v); - } else { - info.deviceTypeInfos().back().set_gangDim(*v); + for (auto &device : currentDevices) { + device->set_gangDim(*v); } } } + numArgs++; } } - } else if (std::get_if(&clause.u)) { - if (info.deviceTypeInfos().empty()) { - info.set_isVector(); - } else { - info.deviceTypeInfos().back().set_isVector(); - } - } else if (std::get_if(&clause.u)) { - if 
(info.deviceTypeInfos().empty()) { - info.set_isWorker(); - } else { - info.deviceTypeInfos().back().set_isWorker(); - } - } else if (std::get_if(&clause.u)) { - info.set_isNohost(); - } else if (const auto *bindClause = - std::get_if(&clause.u)) { - if (const auto *name = - std::get_if(&bindClause->v.u)) { - if (Symbol *sym = ResolveFctName(*name)) { - if (info.deviceTypeInfos().empty()) { - info.set_bindName(sym->name().ToString()); - } else { - info.deviceTypeInfos().back().set_bindName( - sym->name().ToString()); + } else if (const auto *bindClause{ + std::get_if( + &clause.u)}) { + if (const auto *name{ + std::get_if(&bindClause->v.u)}) { + if (Symbol * sym{ResolveFctName(*name)}) { + Symbol &ultimate{sym->GetUltimate()}; + for (auto &device : currentDevices) { + device->set_bindName(SymbolRef{ultimate}); } } else { context_.Say((*name).source, "No function or subroutine declared for '%s'"_err_en_US, (*name).source); } - } else if (const auto charExpr = + } else if (const auto charExpr{ std::get_if( - &bindClause->v.u)) { - auto *charConst = + &bindClause->v.u)}) { + auto *charConst{ Fortran::parser::Unwrap( - *charExpr); + *charExpr)}; std::string str{std::get(charConst->t)}; - std::stringstream bindName; - bindName << "\"" << str << "\""; - if (info.deviceTypeInfos().empty()) { - info.set_bindName(bindName.str()); - } else { - info.deviceTypeInfos().back().set_bindName(bindName.str()); + for (auto &device : currentDevices) { + device->set_bindName(std::string(str)); } } - } else if (const auto *dType = - std::get_if( - &clause.u)) { - const parser::AccDeviceTypeExprList &deviceTypeExprList = dType->v; - OpenACCRoutineDeviceTypeInfo dtypeInfo; - dtypeInfo.set_dType(deviceTypeExprList.v.front().v); - info.add_deviceTypeInfo(dtypeInfo); } } symbol.get().add_openACCRoutineInfo(info); diff --git a/flang/lib/Semantics/symbol.cpp b/flang/lib/Semantics/symbol.cpp index 32eb6c2c5a188..2118970a7bf25 100644 --- a/flang/lib/Semantics/symbol.cpp +++ 
b/flang/lib/Semantics/symbol.cpp @@ -144,6 +144,52 @@ llvm::raw_ostream &operator<<( os << ' ' << x; } } + if (!x.openACCRoutineInfos_.empty()) { + os << " openACCRoutineInfos:"; + for (const auto &x : x.openACCRoutineInfos_) { + os << x; + } + } + return os; +} + +llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, const OpenACCRoutineDeviceTypeInfo &x) { + if (x.dType() != common::OpenACCDeviceType::None) { + os << " deviceType(" << common::EnumToString(x.dType()) << ')'; + } + if (x.isSeq()) { + os << " seq"; + } + if (x.isVector()) { + os << " vector"; + } + if (x.isWorker()) { + os << " worker"; + } + if (x.isGang()) { + os << " gang(" << x.gangDim() << ')'; + } + if (const auto *bindName{x.bindName()}) { + if (const auto &symbol{std::get_if(bindName)}) { + os << " bindName(\"" << *symbol << "\")"; + } else { + const SymbolRef s{std::get(*bindName)}; + os << " bindName(" << s->name() << ")"; + } + } + return os; +} + +llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, const OpenACCRoutineInfo &x) { + if (x.isNohost()) { + os << " nohost"; + } + os << static_cast(x); + for (const auto &d : x.deviceTypeInfos_) { + os << d; + } return os; } diff --git a/flang/test/Lower/OpenACC/acc-module-definition.f90 b/flang/test/Lower/OpenACC/acc-module-definition.f90 new file mode 100644 index 0000000000000..36e41fc631c77 --- /dev/null +++ b/flang/test/Lower/OpenACC/acc-module-definition.f90 @@ -0,0 +1,17 @@ +! RUN: rm -fr %t && mkdir -p %t && cd %t +! RUN: bbc -fopenacc -emit-fir %s +! 
RUN: cat mod1.mod | FileCheck %s + +!CHECK-LABEL: module mod1 +module mod1 + contains + !CHECK subroutine callee(aa) + subroutine callee(aa) + !CHECK: !$acc routine seq + !$acc routine seq + integer :: aa + aa = 1 + end subroutine + !CHECK: end + !CHECK: end +end module \ No newline at end of file diff --git a/flang/test/Lower/OpenACC/acc-routine-named.f90 b/flang/test/Lower/OpenACC/acc-routine-named.f90 index 2cf6bf8b2bc06..de9784a1146cc 100644 --- a/flang/test/Lower/OpenACC/acc-routine-named.f90 +++ b/flang/test/Lower/OpenACC/acc-routine-named.f90 @@ -4,8 +4,8 @@ module acc_routines -! CHECK: acc.routine @acc_routine_1 func(@_QMacc_routinesPacc2) -! CHECK: acc.routine @acc_routine_0 func(@_QMacc_routinesPacc1) seq +! CHECK: acc.routine @[[r0:.*]] func(@_QMacc_routinesPacc2) +! CHECK: acc.routine @[[r1:.*]] func(@_QMacc_routinesPacc1) seq !$acc routine(acc1) seq @@ -14,12 +14,14 @@ module acc_routines subroutine acc1() end subroutine -! CHECK-LABEL: func.func @_QMacc_routinesPacc1() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>} +! CHECK-LABEL: func.func @_QMacc_routinesPacc1() +! CHECK-SAME:attributes {acc.routine_info = #acc.routine_info<[@[[r1]]]>} subroutine acc2() !$acc routine(acc2) end subroutine -! CHECK-LABEL: func.func @_QMacc_routinesPacc2() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_1]>} +! CHECK-LABEL: func.func @_QMacc_routinesPacc2() +! CHECK-SAME:attributes {acc.routine_info = #acc.routine_info<[@[[r0]]]>} end module diff --git a/flang/test/Lower/OpenACC/acc-routine-use-module.f90 b/flang/test/Lower/OpenACC/acc-routine-use-module.f90 new file mode 100644 index 0000000000000..059324230a746 --- /dev/null +++ b/flang/test/Lower/OpenACC/acc-routine-use-module.f90 @@ -0,0 +1,23 @@ +! RUN: rm -fr %t && mkdir -p %t && cd %t +! RUN: bbc -fopenacc -emit-fir %S/acc-module-definition.f90 +! RUN: bbc -fopenacc -emit-fir %s -o - | FileCheck %s + +! This test module is based off of flang/test/Lower/use_module.f90 +! 
The first runs ensures the module file is generated. + +module use_mod1 + use mod1 + contains + !CHECK: acc.routine @acc_routine_0 func(@_QMmod1Pcallee) seq + !CHECK: func.func @_QMuse_mod1Pcaller + !CHECK-SAME { + subroutine caller(aa) + integer :: aa + !$acc serial + !CHECK: fir.call @_QMmod1Pcallee + call callee(aa) + !$acc end serial + end subroutine + !CHECK: } + !CHECK: func.func private @_QMmod1Pcallee(!fir.ref) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>} +end module \ No newline at end of file diff --git a/flang/test/Lower/OpenACC/acc-routine.f90 b/flang/test/Lower/OpenACC/acc-routine.f90 index 1170af18bc334..789f3a57e1f79 100644 --- a/flang/test/Lower/OpenACC/acc-routine.f90 +++ b/flang/test/Lower/OpenACC/acc-routine.f90 @@ -2,69 +2,77 @@ ! RUN: bbc -fopenacc -emit-hlfir %s -o - | FileCheck %s -! CHECK: acc.routine @acc_routine_17 func(@_QPacc_routine19) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type]) -! CHECK: acc.routine @acc_routine_16 func(@_QPacc_routine18) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type]) -! CHECK: acc.routine @acc_routine_15 func(@_QPacc_routine17) worker ([#acc.device_type]) vector ([#acc.device_type]) -! CHECK: acc.routine @acc_routine_14 func(@_QPacc_routine16) gang([#acc.device_type]) seq ([#acc.device_type]) -! CHECK: acc.routine @acc_routine_10 func(@_QPacc_routine11) seq -! CHECK: acc.routine @acc_routine_9 func(@_QPacc_routine10) seq -! CHECK: acc.routine @acc_routine_8 func(@_QPacc_routine9) bind("_QPacc_routine9a") -! CHECK: acc.routine @acc_routine_7 func(@_QPacc_routine8) bind("routine8_") -! CHECK: acc.routine @acc_routine_6 func(@_QPacc_routine7) gang(dim: 1 : i64) -! CHECK: acc.routine @acc_routine_5 func(@_QPacc_routine6) nohost -! CHECK: acc.routine @acc_routine_4 func(@_QPacc_routine5) worker -! CHECK: acc.routine @acc_routine_3 func(@_QPacc_routine4) vector -! 
CHECK: acc.routine @acc_routine_2 func(@_QPacc_routine3) gang -! CHECK: acc.routine @acc_routine_1 func(@_QPacc_routine2) seq -! CHECK: acc.routine @acc_routine_0 func(@_QPacc_routine1) +! CHECK: acc.routine @[[r14:.*]] func(@_QPacc_routine19) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type]) +! CHECK: acc.routine @[[r13:.*]] func(@_QPacc_routine18) bind("_QPacc_routine17" [#acc.device_type], "_QPacc_routine16" [#acc.device_type]) +! CHECK: acc.routine @[[r12:.*]] func(@_QPacc_routine17) worker ([#acc.device_type]) vector ([#acc.device_type]) +! CHECK: acc.routine @[[r11:.*]] func(@_QPacc_routine16) gang([#acc.device_type]) seq ([#acc.device_type]) +! CHECK: acc.routine @[[r10:.*]] func(@_QPacc_routine11) seq +! CHECK: acc.routine @[[r09:.*]] func(@_QPacc_routine10) seq +! CHECK: acc.routine @[[r08:.*]] func(@_QPacc_routine9) bind("_QPacc_routine9a") +! CHECK: acc.routine @[[r07:.*]] func(@_QPacc_routine8) bind("routine8_") +! CHECK: acc.routine @[[r06:.*]] func(@_QPacc_routine7) gang(dim: 1 : i64) +! CHECK: acc.routine @[[r05:.*]] func(@_QPacc_routine6) nohost +! CHECK: acc.routine @[[r04:.*]] func(@_QPacc_routine5) worker +! CHECK: acc.routine @[[r03:.*]] func(@_QPacc_routine4) vector +! CHECK: acc.routine @[[r02:.*]] func(@_QPacc_routine3) gang +! CHECK: acc.routine @[[r01:.*]] func(@_QPacc_routine2) seq +! CHECK: acc.routine @[[r00:.*]] func(@_QPacc_routine1) subroutine acc_routine1() !$acc routine end subroutine -! CHECK-LABEL: func.func @_QPacc_routine1() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_0]>} +! CHECK-LABEL: func.func @_QPacc_routine1() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r00]]]>} subroutine acc_routine2() !$acc routine seq end subroutine -! CHECK-LABEL: func.func @_QPacc_routine2() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_1]>} +! CHECK-LABEL: func.func @_QPacc_routine2() +! 
CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r01]]]>} subroutine acc_routine3() !$acc routine gang end subroutine -! CHECK-LABEL: func.func @_QPacc_routine3() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_2]>} +! CHECK-LABEL: func.func @_QPacc_routine3() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r02]]]>} subroutine acc_routine4() !$acc routine vector end subroutine -! CHECK-LABEL: func.func @_QPacc_routine4() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_3]>} +! CHECK-LABEL: func.func @_QPacc_routine4() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r03]]]>} subroutine acc_routine5() !$acc routine worker end subroutine -! CHECK-LABEL: func.func @_QPacc_routine5() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_4]>} +! CHECK-LABEL: func.func @_QPacc_routine5() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r04]]]>} subroutine acc_routine6() !$acc routine nohost end subroutine -! CHECK-LABEL: func.func @_QPacc_routine6() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_5]>} +! CHECK-LABEL: func.func @_QPacc_routine6() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r05]]]>} subroutine acc_routine7() !$acc routine gang(dim:1) end subroutine -! CHECK-LABEL: func.func @_QPacc_routine7() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_6]>} +! CHECK-LABEL: func.func @_QPacc_routine7() +! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r06]]]>} subroutine acc_routine8() !$acc routine bind("routine8_") end subroutine -! CHECK-LABEL: func.func @_QPacc_routine8() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_7]>} +! CHECK-LABEL: func.func @_QPacc_routine8() +! 
CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r07]]]>}
 subroutine acc_routine9a()
 end subroutine
@@ -73,20 +81,23 @@ subroutine acc_routine9()
 !$acc routine bind(acc_routine9a)
 end subroutine
-! CHECK-LABEL: func.func @_QPacc_routine9() attributes {acc.routine_info = #acc.routine_info<[@acc_routine_8]>}
+! CHECK-LABEL: func.func @_QPacc_routine9()
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r08]]]>}
 function acc_routine10()
 !$acc routine(acc_routine10) seq
 end function
-! CHECK-LABEL: func.func @_QPacc_routine10() -> f32 attributes {acc.routine_info = #acc.routine_info<[@acc_routine_9]>}
+! CHECK-LABEL: func.func @_QPacc_routine10() -> f32
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r09]]]>}
 subroutine acc_routine11(a)
 real :: a
 !$acc routine(acc_routine11) seq
 end subroutine
-! CHECK-LABEL: func.func @_QPacc_routine11(%arg0: !fir.ref {fir.bindc_name = "a"}) attributes {acc.routine_info = #acc.routine_info<[@acc_routine_10]>}
+! CHECK-LABEL: func.func @_QPacc_routine11(%arg0: !fir.ref {fir.bindc_name = "a"})
+! CHECK-SAME: attributes {acc.routine_info = #acc.routine_info<[@[[r10]]]>}
 subroutine acc_routine12()

From flang-commits at lists.llvm.org Sat May 10 13:18:47 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Sat, 10 May 2025 13:18:47 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Acknowledge non-enforcement of C7108 (PR #139169)
In-Reply-To:
Message-ID: <681fb4a7.170a0220.3293ca.2089@mx.google.com>

https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/139169

>From e761cf3e39d17ab1e8ff025534eac4a0119a9b8b Mon Sep 17 00:00:00 2001
From: Peter Klausler
Date: Thu, 8 May 2025 15:02:01 -0700
Subject: [PATCH] [flang] Acknowledge non-enforcement of C7108

Fortran 2023 constraint C7108 prohibits the use of a structure constructor in a way that is ambiguous with a generic function reference (intrinsic or user-defined).
Sadly, no Fortran compiler implements this constraint, and the common portable interpretation seems to be the generic resolution, not the structure constructor. Restructure the processing of structure constructors in expression analysis so that it can be driven both from the parse tree as well as from generic resolution, and then use it to detect ambiguous structure constructor / generic function cases, so that a portability warning can be issued. And document this as a new intentional violation of the standard in Extensions.md.

Fixes https://github.com/llvm/llvm-project/issues/138807.
---
 flang/docs/Extensions.md | 5 +
 flang/include/flang/Semantics/expression.h | 13 +
 .../include/flang/Support/Fortran-features.h | 2 +-
 flang/lib/Semantics/expression.cpp | 452 +++++++++++-------
 flang/lib/Support/Fortran-features.cpp | 1 +
 flang/test/Semantics/c7108.f90 | 41 ++
 flang/test/Semantics/generic09.f90 | 4 +
 flang/test/Semantics/resolve11.f90 | 3 +-
 flang/test/Semantics/resolve17.f90 | 2 +
 flang/test/Semantics/resolve18.f90 | 1 +
 10 files changed, 338 insertions(+), 186 deletions(-)
 create mode 100644 flang/test/Semantics/c7108.f90

diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md
index 5c7751763eab1..00a7e2bac84e6 100644
--- a/flang/docs/Extensions.md
+++ b/flang/docs/Extensions.md
@@ -159,6 +159,11 @@ end
 to be constant will generate a compilation error. `ieee_support_standard` depends in part on `ieee_support_halting`, so this also applies to `ieee_support_standard` calls.
+* F'2023 constraint C7108 prohibits the use of a structure constructor
+ that could also be interpreted as a generic function reference.
+ No other Fortran compiler enforces C7108 (to our knowledge);
+ they all resolve the ambiguity by interpreting the call as a function
+ reference. We do the same, with a portability warning.
## Extensions, deletions, and legacy features supported by default diff --git a/flang/include/flang/Semantics/expression.h b/flang/include/flang/Semantics/expression.h index eee23dba4831f..30f5dfd8a44cd 100644 --- a/flang/include/flang/Semantics/expression.h +++ b/flang/include/flang/Semantics/expression.h @@ -394,6 +394,19 @@ class ExpressionAnalyzer { MaybeExpr AnalyzeComplex(MaybeExpr &&re, MaybeExpr &&im, const char *what); std::optional AnalyzeChevrons(const parser::CallStmt &); + // CheckStructureConstructor() is used for parsed structure constructors + // as well as for generic function references. + struct ComponentSpec { + ComponentSpec() = default; + ComponentSpec(ComponentSpec &&) = default; + parser::CharBlock source, exprSource; + bool hasKeyword{false}; + const Symbol *keywordSymbol{nullptr}; + MaybeExpr expr; + }; + MaybeExpr CheckStructureConstructor(parser::CharBlock typeName, + const semantics::DerivedTypeSpec &, std::list &&); + MaybeExpr IterativelyAnalyzeSubexpressions(const parser::Expr &); semantics::SemanticsContext &context_; diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 6cb1bcdb0003f..aa3396c46963c 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -54,7 +54,7 @@ ENUM_CLASS(LanguageFeature, BackslashEscapes, OldDebugLines, PolymorphicActualAllocatableOrPointerToMonomorphicDummy, RelaxedPureDummy, UndefinableAsynchronousOrVolatileActual, AutomaticInMainProgram, PrintCptr, SavedLocalInSpecExpr, PrintNamelist, AssumedRankPassedToNonAssumedRank, - IgnoreIrrelevantAttributes, Unsigned) + IgnoreIrrelevantAttributes, Unsigned, AmbiguousStructureConstructor) // Portability and suspicious usage warnings ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index e139bda7e4950..8b80f907953c8 100644 --- 
a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -2063,23 +2063,9 @@ static MaybeExpr ImplicitConvertTo(const semantics::Symbol &sym, return std::nullopt; } -MaybeExpr ExpressionAnalyzer::Analyze( - const parser::StructureConstructor &structure) { - auto &parsedType{std::get(structure.t)}; - parser::Name structureType{std::get(parsedType.t)}; - parser::CharBlock &typeName{structureType.source}; - if (semantics::Symbol *typeSymbol{structureType.symbol}) { - if (typeSymbol->has()) { - semantics::DerivedTypeSpec dtSpec{typeName, typeSymbol->GetUltimate()}; - if (!CheckIsValidForwardReference(dtSpec)) { - return std::nullopt; - } - } - } - if (!parsedType.derivedTypeSpec) { - return std::nullopt; - } - const auto &spec{*parsedType.derivedTypeSpec}; +MaybeExpr ExpressionAnalyzer::CheckStructureConstructor( + parser::CharBlock typeName, const semantics::DerivedTypeSpec &spec, + std::list &&componentSpecs) { const Symbol &typeSymbol{spec.typeSymbol()}; if (!spec.scope() || !typeSymbol.has()) { return std::nullopt; // error recovery @@ -2090,10 +2076,10 @@ MaybeExpr ExpressionAnalyzer::Analyze( const Symbol *parentComponent{typeDetails.GetParentComponent(*spec.scope())}; if (typeSymbol.attrs().test(semantics::Attr::ABSTRACT)) { // C796 - AttachDeclaration(Say(typeName, - "ABSTRACT derived type '%s' may not be used in a " - "structure constructor"_err_en_US, - typeName), + AttachDeclaration( + Say(typeName, + "ABSTRACT derived type '%s' may not be used in a structure constructor"_err_en_US, + typeName), typeSymbol); // C7114 } @@ -2123,22 +2109,19 @@ MaybeExpr ExpressionAnalyzer::Analyze( bool checkConflicts{true}; // until we hit one auto &messages{GetContextualMessages()}; - // NULL() can be a valid component - auto restorer{AllowNullPointer()}; - - for (const auto &component : - std::get>(structure.t)) { - const parser::Expr &expr{ - std::get(component.t).v.value()}; - parser::CharBlock source{expr.source}; + for (ComponentSpec 
&componentSpec : componentSpecs) { + parser::CharBlock source{componentSpec.source}; + parser::CharBlock exprSource{componentSpec.exprSource}; auto restorer{messages.SetLocation(source)}; - const Symbol *symbol{nullptr}; - MaybeExpr value{Analyze(expr)}; + const Symbol *symbol{componentSpec.keywordSymbol}; + MaybeExpr &maybeValue{componentSpec.expr}; + if (!maybeValue.has_value()) { + return std::nullopt; + } + Expr &value{*maybeValue}; std::optional valueType{DynamicType::From(value)}; - if (const auto &kw{std::get>(component.t)}) { + if (componentSpec.hasKeyword) { anyKeyword = true; - source = kw->v.source; - symbol = kw->v.symbol; if (!symbol) { // Skip overridden inaccessible parent components in favor of // their later overrides. @@ -2190,9 +2173,9 @@ MaybeExpr ExpressionAnalyzer::Analyze( } } if (symbol) { - const semantics::Scope &innermost{context_.FindScope(expr.source)}; + const semantics::Scope &innermost{context_.FindScope(exprSource)}; if (auto msg{CheckAccessibleSymbol(innermost, *symbol)}) { - Say(expr.source, std::move(*msg)); + Say(exprSource, std::move(*msg)); } if (checkConflicts) { auto componentIter{ @@ -2200,8 +2183,7 @@ MaybeExpr ExpressionAnalyzer::Analyze( if (unavailable.find(symbol->name()) != unavailable.cend()) { // C797, C798 Say(source, - "Component '%s' conflicts with another component earlier in " - "this structure constructor"_err_en_US, + "Component '%s' conflicts with another component earlier in this structure constructor"_err_en_US, symbol->name()); } else if (symbol->test(Symbol::Flag::ParentComp)) { // Make earlier components unavailable once a whole parent appears. 
@@ -2219,143 +2201,136 @@ MaybeExpr ExpressionAnalyzer::Analyze( } } unavailable.insert(symbol->name()); - if (value) { - if (symbol->has()) { - Say(expr.source, - "Type parameter '%s' may not appear as a component of a structure constructor"_err_en_US, - symbol->name()); - } - if (!(symbol->has() || - symbol->has())) { - continue; // recovery - } - if (IsPointer(*symbol)) { // C7104, C7105, C1594(4) - semantics::CheckStructConstructorPointerComponent( - context_, *symbol, *value, innermost); - result.Add(*symbol, Fold(std::move(*value))); - continue; - } - if (IsNullPointer(&*value)) { - if (IsAllocatable(*symbol)) { - if (IsBareNullPointer(&*value)) { - // NULL() with no arguments allowed by 7.5.10 para 6 for - // ALLOCATABLE. - result.Add(*symbol, Expr{NullPointer{}}); - continue; - } - if (IsNullObjectPointer(&*value)) { - AttachDeclaration( - Warn(common::LanguageFeature:: - NullMoldAllocatableComponentValue, - expr.source, - "NULL() with arguments is not standard conforming as the value for allocatable component '%s'"_port_en_US, - symbol->name()), - *symbol); - // proceed to check type & shape - } else { - AttachDeclaration( - Say(expr.source, - "A NULL procedure pointer may not be used as the value for component '%s'"_err_en_US, - symbol->name()), - *symbol); - continue; - } + if (symbol->has()) { + Say(exprSource, + "Type parameter '%s' may not appear as a component of a structure constructor"_err_en_US, + symbol->name()); + } + if (!(symbol->has() || + symbol->has())) { + continue; // recovery + } + if (IsPointer(*symbol)) { // C7104, C7105, C1594(4) + semantics::CheckStructConstructorPointerComponent( + context_, *symbol, value, innermost); + result.Add(*symbol, Fold(std::move(value))); + continue; + } + if (IsNullPointer(&value)) { + if (IsAllocatable(*symbol)) { + if (IsBareNullPointer(&value)) { + // NULL() with no arguments allowed by 7.5.10 para 6 for + // ALLOCATABLE. 
+ result.Add(*symbol, Expr{NullPointer{}}); + continue; + } + if (IsNullObjectPointer(&value)) { + AttachDeclaration( + Warn(common::LanguageFeature::NullMoldAllocatableComponentValue, + exprSource, + "NULL() with arguments is not standard conforming as the value for allocatable component '%s'"_port_en_US, + symbol->name()), + *symbol); + // proceed to check type & shape } else { AttachDeclaration( - Say(expr.source, - "A NULL pointer may not be used as the value for component '%s'"_err_en_US, + Say(exprSource, + "A NULL procedure pointer may not be used as the value for component '%s'"_err_en_US, symbol->name()), *symbol); continue; } - } else if (IsNullAllocatable(&*value) && IsAllocatable(*symbol)) { - result.Add(*symbol, Expr{NullPointer{}}); + } else { + AttachDeclaration( + Say(exprSource, + "A NULL pointer may not be used as the value for component '%s'"_err_en_US, + symbol->name()), + *symbol); continue; - } else if (auto *derived{evaluate::GetDerivedTypeSpec( - evaluate::DynamicType::From(*symbol))}) { - if (auto iter{FindPointerPotentialComponent(*derived)}; - iter && pureContext) { // F'2023 C15104(4) - if (const Symbol * - visible{semantics::FindExternallyVisibleObject( - *value, *pureContext)}) { - Say(expr.source, - "The externally visible object '%s' may not be used in a pure procedure as the value for component '%s' which has the pointer component '%s'"_err_en_US, - visible->name(), symbol->name(), - iter.BuildResultDesignatorName()); - } else if (ExtractCoarrayRef(*value)) { - Say(expr.source, - "A coindexed object may not be used in a pure procedure as the value for component '%s' which has the pointer component '%s'"_err_en_US, - symbol->name(), iter.BuildResultDesignatorName()); - } + } + } else if (IsNullAllocatable(&value) && IsAllocatable(*symbol)) { + result.Add(*symbol, Expr{NullPointer{}}); + continue; + } else if (auto *derived{evaluate::GetDerivedTypeSpec( + evaluate::DynamicType::From(*symbol))}) { + if (auto 
iter{FindPointerPotentialComponent(*derived)}; + iter && pureContext) { // F'2023 C15104(4) + if (const Symbol * + visible{semantics::FindExternallyVisibleObject( + value, *pureContext)}) { + Say(exprSource, + "The externally visible object '%s' may not be used in a pure procedure as the value for component '%s' which has the pointer component '%s'"_err_en_US, + visible->name(), symbol->name(), + iter.BuildResultDesignatorName()); + } else if (ExtractCoarrayRef(value)) { + Say(exprSource, + "A coindexed object may not be used in a pure procedure as the value for component '%s' which has the pointer component '%s'"_err_en_US, + symbol->name(), iter.BuildResultDesignatorName()); } } - // Make implicit conversion explicit to allow folding of the structure - // constructors and help semantic checking, unless the component is - // allocatable, in which case the value could be an unallocated - // allocatable (see Fortran 2018 7.5.10 point 7). The explicit - // convert would cause a segfault. Lowering will deal with - // conditionally converting and preserving the lower bounds in this - // case. - if (MaybeExpr converted{ImplicitConvertTo( - *symbol, std::move(*value), IsAllocatable(*symbol))}) { - if (auto componentShape{GetShape(GetFoldingContext(), *symbol)}) { - if (auto valueShape{GetShape(GetFoldingContext(), *converted)}) { - if (GetRank(*componentShape) == 0 && GetRank(*valueShape) > 0) { + } + // Make implicit conversion explicit to allow folding of the structure + // constructors and help semantic checking, unless the component is + // allocatable, in which case the value could be an unallocated + // allocatable (see Fortran 2018 7.5.10 point 7). The explicit + // convert would cause a segfault. Lowering will deal with + // conditionally converting and preserving the lower bounds in this + // case. 
+ if (MaybeExpr converted{ImplicitConvertTo( + *symbol, std::move(value), IsAllocatable(*symbol))}) { + if (auto componentShape{GetShape(GetFoldingContext(), *symbol)}) { + if (auto valueShape{GetShape(GetFoldingContext(), *converted)}) { + if (GetRank(*componentShape) == 0 && GetRank(*valueShape) > 0) { + AttachDeclaration( + Say(exprSource, + "Rank-%d array value is not compatible with scalar component '%s'"_err_en_US, + GetRank(*valueShape), symbol->name()), + *symbol); + } else { + auto checked{CheckConformance(messages, *componentShape, + *valueShape, CheckConformanceFlags::RightIsExpandableDeferred, + "component", "value")}; + if (checked && *checked && GetRank(*componentShape) > 0 && + GetRank(*valueShape) == 0 && + (IsDeferredShape(*symbol) || + !IsExpandableScalar(*converted, GetFoldingContext(), + *componentShape, true /*admit PURE call*/))) { AttachDeclaration( - Say(expr.source, - "Rank-%d array value is not compatible with scalar component '%s'"_err_en_US, - GetRank(*valueShape), symbol->name()), + Say(exprSource, + "Scalar value cannot be expanded to shape of array component '%s'"_err_en_US, + symbol->name()), *symbol); - } else { - auto checked{ - CheckConformance(messages, *componentShape, *valueShape, - CheckConformanceFlags::RightIsExpandableDeferred, - "component", "value")}; - if (checked && *checked && GetRank(*componentShape) > 0 && - GetRank(*valueShape) == 0 && - (IsDeferredShape(*symbol) || - !IsExpandableScalar(*converted, GetFoldingContext(), - *componentShape, true /*admit PURE call*/))) { - AttachDeclaration( - Say(expr.source, - "Scalar value cannot be expanded to shape of array component '%s'"_err_en_US, - symbol->name()), - *symbol); - } - if (checked.value_or(true)) { - result.Add(*symbol, std::move(*converted)); - } } - } else { - Say(expr.source, "Shape of value cannot be determined"_err_en_US); + if (checked.value_or(true)) { + result.Add(*symbol, std::move(*converted)); + } } } else { - AttachDeclaration( - Say(expr.source, - 
"Shape of component '%s' cannot be determined"_err_en_US, - symbol->name()), - *symbol); - } - } else if (auto symType{DynamicType::From(symbol)}) { - if (IsAllocatable(*symbol) && symType->IsUnlimitedPolymorphic() && - valueType) { - // ok - } else if (valueType) { - AttachDeclaration( - Say(expr.source, - "Value in structure constructor of type '%s' is " - "incompatible with component '%s' of type '%s'"_err_en_US, - valueType->AsFortran(), symbol->name(), - symType->AsFortran()), - *symbol); - } else { - AttachDeclaration( - Say(expr.source, - "Value in structure constructor is incompatible with " - "component '%s' of type %s"_err_en_US, - symbol->name(), symType->AsFortran()), - *symbol); + Say(exprSource, "Shape of value cannot be determined"_err_en_US); } + } else { + AttachDeclaration( + Say(exprSource, + "Shape of component '%s' cannot be determined"_err_en_US, + symbol->name()), + *symbol); + } + } else if (auto symType{DynamicType::From(symbol)}) { + if (IsAllocatable(*symbol) && symType->IsUnlimitedPolymorphic() && + valueType) { + // ok + } else if (valueType) { + AttachDeclaration( + Say(exprSource, + "Value in structure constructor of type '%s' is incompatible with component '%s' of type '%s'"_err_en_US, + valueType->AsFortran(), symbol->name(), symType->AsFortran()), + *symbol); + } else { + AttachDeclaration( + Say(exprSource, + "Value in structure constructor is incompatible with component '%s' of type %s"_err_en_US, + symbol->name(), symType->AsFortran()), + *symbol); } } } @@ -2375,10 +2350,10 @@ MaybeExpr ExpressionAnalyzer::Analyze( } else if (IsPointer(symbol)) { result.Add(symbol, Expr{NullPointer{}}); } else if (object) { // C799 - AttachDeclaration(Say(typeName, - "Structure constructor lacks a value for " - "component '%s'"_err_en_US, - symbol.name()), + AttachDeclaration( + Say(typeName, + "Structure constructor lacks a value for component '%s'"_err_en_US, + symbol.name()), symbol); } } @@ -2388,6 +2363,45 @@ MaybeExpr 
ExpressionAnalyzer::Analyze( return AsMaybeExpr(Expr{std::move(result)}); } +MaybeExpr ExpressionAnalyzer::Analyze( + const parser::StructureConstructor &structure) { + const auto &parsedType{std::get(structure.t)}; + parser::Name structureType{std::get(parsedType.t)}; + parser::CharBlock &typeName{structureType.source}; + if (semantics::Symbol * typeSymbol{structureType.symbol}) { + if (typeSymbol->has()) { + semantics::DerivedTypeSpec dtSpec{typeName, typeSymbol->GetUltimate()}; + if (!CheckIsValidForwardReference(dtSpec)) { + return std::nullopt; + } + } + } + if (!parsedType.derivedTypeSpec) { + return std::nullopt; + } + auto restorer{AllowNullPointer()}; // NULL() can be a valid component + std::list componentSpecs; + for (const auto &component : + std::get>(structure.t)) { + const parser::Expr &expr{ + std::get(component.t).v.value()}; + auto restorer{GetContextualMessages().SetLocation(expr.source)}; + ComponentSpec compSpec; + compSpec.exprSource = expr.source; + compSpec.expr = Analyze(expr); + if (const auto &kw{std::get>(component.t)}) { + compSpec.source = kw->v.source; + compSpec.hasKeyword = true; + compSpec.keywordSymbol = kw->v.symbol; + } else { + compSpec.source = expr.source; + } + componentSpecs.emplace_back(std::move(compSpec)); + } + return CheckStructureConstructor( + typeName, DEREF(parsedType.derivedTypeSpec), std::move(componentSpecs)); +} + static std::optional GetPassName( const semantics::Symbol &proc) { return common::visit( @@ -2835,24 +2849,26 @@ std::pair ExpressionAnalyzer::ResolveGeneric( const Symbol &symbol, const ActualArguments &actuals, const AdjustActuals &adjustActuals, bool isSubroutine, bool mightBeStructureConstructor) { - const Symbol *elemental{nullptr}; // matching elemental specific proc - const Symbol *nonElemental{nullptr}; // matching non-elemental specific const Symbol &ultimate{symbol.GetUltimate()}; - int crtMatchingDistance{cudaInfMatchingValue}; // Check for a match with an explicit INTRINSIC + const Symbol 
*explicitIntrinsic{nullptr}; if (ultimate.attrs().test(semantics::Attr::INTRINSIC)) { parser::Messages buffer; - auto restorer{foldingContext_.messages().SetMessages(buffer)}; + auto restorer{GetContextualMessages().SetMessages(buffer)}; ActualArguments localActuals{actuals}; if (context_.intrinsics().Probe( CallCharacteristics{ultimate.name().ToString(), isSubroutine}, localActuals, foldingContext_) && !buffer.AnyFatalError()) { - return {&ultimate, false}; + explicitIntrinsic = &ultimate; } } - if (const auto *details{ultimate.detailsIf()}) { - for (const Symbol &specific0 : details->specificProcs()) { + const Symbol *elemental{nullptr}; // matching elemental specific proc + const Symbol *nonElemental{nullptr}; // matching non-elemental specific + const auto *genericDetails{ultimate.detailsIf()}; + if (genericDetails && !explicitIntrinsic) { + int crtMatchingDistance{cudaInfMatchingValue}; + for (const Symbol &specific0 : genericDetails->specificProcs()) { const Symbol &specific1{BypassGeneric(specific0)}; if (isSubroutine != !IsFunction(specific1)) { continue; @@ -2905,25 +2921,93 @@ std::pair ExpressionAnalyzer::ResolveGeneric( } } } - if (nonElemental) { - return {&AccessSpecific(symbol, *nonElemental), false}; - } else if (elemental) { - return {&AccessSpecific(symbol, *elemental), false}; + } + // Is there a derived type of the same name? + const Symbol *derivedType{nullptr}; + if (mightBeStructureConstructor && !isSubroutine && genericDetails) { + if (const Symbol * dt{genericDetails->derivedType()}) { + const Symbol &ultimate{dt->GetUltimate()}; + if (ultimate.has()) { + derivedType = &ultimate; + } } - // Check parent derived type - if (const auto *parentScope{symbol.owner().GetDerivedTypeParent()}) { - if (const Symbol *extended{parentScope->FindComponent(symbol.name())}) { - auto pair{ResolveGeneric( - *extended, actuals, adjustActuals, isSubroutine, false)}; - if (pair.first) { - return pair; + } + // F'2023 C7108 checking. 
No Fortran compiler actually enforces this + // constraint, so it's just a portability warning here. + if (derivedType && (explicitIntrinsic || nonElemental || elemental) && + context_.ShouldWarn( + common::LanguageFeature::AmbiguousStructureConstructor)) { + // See whethr there's ambiguity with a structure constructor. + bool possiblyAmbiguous{true}; + if (const semantics::Scope * dtScope{derivedType->scope()}) { + parser::Messages buffer; + auto restorer{GetContextualMessages().SetMessages(buffer)}; + std::list componentSpecs; + for (const auto &actual : actuals) { + if (actual) { + ComponentSpec compSpec; + if (const Expr *expr{actual->UnwrapExpr()}) { + compSpec.expr = *expr; + } else { + possiblyAmbiguous = false; + } + if (auto loc{actual->sourceLocation()}) { + compSpec.source = compSpec.exprSource = *loc; + } + if (auto kw{actual->keyword()}) { + compSpec.hasKeyword = true; + compSpec.keywordSymbol = dtScope->FindComponent(*kw); + } + componentSpecs.emplace_back(std::move(compSpec)); + } else { + possiblyAmbiguous = false; } } + semantics::DerivedTypeSpec dtSpec{derivedType->name(), *derivedType}; + dtSpec.set_scope(*dtScope); + possiblyAmbiguous = possiblyAmbiguous && + CheckStructureConstructor( + derivedType->name(), dtSpec, std::move(componentSpecs)) + .has_value() && + !buffer.AnyFatalError(); + } + if (possiblyAmbiguous) { + if (explicitIntrinsic) { + Warn(common::LanguageFeature::AmbiguousStructureConstructor, + "Reference to the intrinsic function '%s' is ambiguous with a structure constructor of the same name"_port_en_US, + symbol.name()); + } else { + Warn(common::LanguageFeature::AmbiguousStructureConstructor, + "Reference to generic function '%s' (resolving to specific '%s') is ambiguous with a structure constructor of the same name"_port_en_US, + symbol.name(), + nonElemental ? 
nonElemental->name() : elemental->name()); + } } - if (mightBeStructureConstructor && details->derivedType()) { - return {details->derivedType(), false}; + } + // Return the right resolution, if there is one. Explicit intrinsics + // are preferred, then non-elements specifics, then elementals, and + // lastly structure constructors. + if (explicitIntrinsic) { + return {explicitIntrinsic, false}; + } else if (nonElemental) { + return {&AccessSpecific(symbol, *nonElemental), false}; + } else if (elemental) { + return {&AccessSpecific(symbol, *elemental), false}; + } + // Check parent derived type + if (const auto *parentScope{symbol.owner().GetDerivedTypeParent()}) { + if (const Symbol * extended{parentScope->FindComponent(symbol.name())}) { + auto pair{ResolveGeneric( + *extended, actuals, adjustActuals, isSubroutine, false)}; + if (pair.first) { + return pair; + } } } + // Structure constructor? + if (derivedType) { + return {derivedType, false}; + } // Check for generic or explicit INTRINSIC of the same name in outer scopes. // See 15.5.5.2 for details. 
if (!symbol.owner().IsGlobal() && !symbol.owner().IsDerivedType()) { diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 49a5989849eaa..bee8984102b82 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -45,6 +45,7 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::HollerithPolymorphic); warnLanguage_.set(LanguageFeature::ListDirectedSize); warnLanguage_.set(LanguageFeature::IgnoreIrrelevantAttributes); + warnLanguage_.set(LanguageFeature::AmbiguousStructureConstructor); warnUsage_.set(UsageWarning::ShortArrayActual); warnUsage_.set(UsageWarning::FoldingException); warnUsage_.set(UsageWarning::FoldingAvoidsRuntimeCrash); diff --git a/flang/test/Semantics/c7108.f90 b/flang/test/Semantics/c7108.f90 new file mode 100644 index 0000000000000..c23a0abe3ee03 --- /dev/null +++ b/flang/test/Semantics/c7108.f90 @@ -0,0 +1,41 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 -pedantic -Werror +! F'2023 C7108 is portably unenforced. +module m + type foo + integer n + end type + interface foo + procedure bar0, bar1, bar2, bar3 + end interface + contains + type(foo) function bar0(n) + integer, intent(in) :: n + print *, 'bar0' + bar0%n = n + end + type(foo) function bar1() + print *, 'bar1' + bar1%n = 1 + end + type(foo) function bar2(a) + real, intent(in) :: a + print *, 'bar2' + bar2%n = a + end + type(foo) function bar3(L) + logical, intent(in) :: L + print *, 'bar3' + bar3%n = merge(4,5,L) + end +end + +program p + use m + type(foo) x + x = foo(); print *, x ! ok, not ambiguous + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'bar0') is ambiguous with a structure constructor of the same name + x = foo(2); print *, x ! ambigous + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'bar2') is ambiguous with a structure constructor of the same name + x = foo(3.); print *, x ! 
ambiguous due to data conversion + x = foo(.true.); print *, x ! ok, not ambigous +end diff --git a/flang/test/Semantics/generic09.f90 b/flang/test/Semantics/generic09.f90 index 6159dd4b701d7..d93d7453ed6dd 100644 --- a/flang/test/Semantics/generic09.f90 +++ b/flang/test/Semantics/generic09.f90 @@ -1,4 +1,5 @@ ! RUN: %flang_fc1 -fdebug-unparse %s 2>&1 | FileCheck %s + module m1 type foo integer n @@ -32,6 +33,9 @@ type(foo) function f2(a) end end +!CHECK: portability: Reference to generic function 'foo' (resolving to specific 'f1') is ambiguous with a structure constructor of the same name +!CHECK: portability: Reference to generic function 'foo' (resolving to specific 'f2') is ambiguous with a structure constructor of the same name + program main use m3 type(foo) x diff --git a/flang/test/Semantics/resolve11.f90 b/flang/test/Semantics/resolve11.f90 index 39a30b858ebb6..9ae4f52c4fd54 100644 --- a/flang/test/Semantics/resolve11.f90 +++ b/flang/test/Semantics/resolve11.f90 @@ -66,7 +66,8 @@ subroutine s4 !ERROR: 'fun' is PRIVATE in 'm4' use m4, only: foo, fun type(foo) x ! ok - print *, foo() ! 
ok + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'fun') is ambiguous with a structure constructor of the same name + print *, foo() end module m5 diff --git a/flang/test/Semantics/resolve17.f90 b/flang/test/Semantics/resolve17.f90 index 770af756d03bc..6a6e355abe0b8 100644 --- a/flang/test/Semantics/resolve17.f90 +++ b/flang/test/Semantics/resolve17.f90 @@ -290,6 +290,7 @@ module m14d contains subroutine test real :: y + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'bar') is ambiguous with a structure constructor of the same name y = foo(1.0) x = foo(2) end subroutine @@ -301,6 +302,7 @@ module m14e contains subroutine test real :: y + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'bar') is ambiguous with a structure constructor of the same name y = foo(1.0) x = foo(2) end subroutine diff --git a/flang/test/Semantics/resolve18.f90 b/flang/test/Semantics/resolve18.f90 index fef526908bbf9..547db5e85714c 100644 --- a/flang/test/Semantics/resolve18.f90 +++ b/flang/test/Semantics/resolve18.f90 @@ -348,6 +348,7 @@ subroutine s_21_23 use m21 use m23 type(foo) x ! Intel and NAG error + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'f1') is ambiguous with a structure constructor of the same name print *, foo(1.) ! Intel error print *, foo(1.,2.,3.) ! Intel error call ext(foo) ! GNU and Intel error From flang-commits at lists.llvm.org Sat May 10 15:47:55 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Sat, 10 May 2025 15:47:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414) Message-ID: https://github.com/wangzpgi created https://github.com/llvm/llvm-project/pull/139414 Fixed an issue in `genCUDAImplicitDataTransfer` where creating an `hlfir::Entity` from a symbol address could fail when the address comes from a `hlfir.declare` operation. 
The fix checks whether the address comes from an `hlfir.declare` operation. If so, it uses the base value from the declare op; otherwise it falls back to the original address.

>From 38d7efcebee251a71c7bbcfb9de3429755c32210 Mon Sep 17 00:00:00 2001
From: Zhen Wang
Date: Sat, 10 May 2025 15:44:35 -0700
Subject: [PATCH] Fix CUDA implicit data transfer entity creation

---
 flang/lib/Lower/Bridge.cpp | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 43375e84f21fa..bfe8898ebff3d 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4778,7 +4778,13 @@ class FirConverter : public Fortran::lower::AbstractConverter {
                nbDeviceResidentObject <= 1 &&
                "Only one reference to the device resident object is supported");
         auto addr = getSymbolAddress(sym);
-        hlfir::Entity entity{addr};
+        mlir::Value baseValue;
+        if (auto declareOp = llvm::dyn_cast<hlfir::DeclareOp>(addr.getDefiningOp()))
+          baseValue = declareOp.getBase();
+        else
+          baseValue = addr;
+
+        hlfir::Entity entity{baseValue};
         auto [temp, cleanup] =
             hlfir::createTempFromMold(loc, builder, entity);
         auto needCleanup = fir::getIntIfConstant(cleanup);

From flang-commits at lists.llvm.org Sat May 10 15:48:26 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Sat, 10 May 2025 15:48:26 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414)
In-Reply-To:
Message-ID: <681fd7ba.050a0220.18bd2.3253@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-fir-hlfir

Author: Zhen Wang (wangzpgi)
Changes

Fixed an issue in `genCUDAImplicitDataTransfer` where creating an `hlfir::Entity` from a symbol address could fail when the address comes from an `hlfir.declare` operation.

The fix checks whether the address comes from an `hlfir.declare` operation. If so, it uses the base value from the declare op; otherwise it falls back to the original address.

---
Full diff: https://github.com/llvm/llvm-project/pull/139414.diff

1 Files Affected:

- (modified) flang/lib/Lower/Bridge.cpp (+7-1)

``````````diff
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 43375e84f21fa..bfe8898ebff3d 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4778,7 +4778,13 @@ class FirConverter : public Fortran::lower::AbstractConverter {
                nbDeviceResidentObject <= 1 &&
                "Only one reference to the device resident object is supported");
         auto addr = getSymbolAddress(sym);
-        hlfir::Entity entity{addr};
+        mlir::Value baseValue;
+        if (auto declareOp = llvm::dyn_cast<hlfir::DeclareOp>(addr.getDefiningOp()))
+          baseValue = declareOp.getBase();
+        else
+          baseValue = addr;
+
+        hlfir::Entity entity{baseValue};
         auto [temp, cleanup] =
             hlfir::createTempFromMold(loc, builder, entity);
         auto needCleanup = fir::getIntIfConstant(cleanup);
``````````
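The selection logic described in this patch can be sketched outside of MLIR. The block below is only an illustration under stated assumptions: `Op`, `DeclareOp`, `OtherOp`, `Value`, and `pickBase` are hypothetical stand-ins invented for this sketch, not the real `mlir::Value` or `hlfir::DeclareOp` API. It shows the "use the declare op's base if there is one, otherwise keep the address" pattern, including the case where a value has no defining operation at all (modelled here with a null pointer). In real MLIR a null-tolerant form such as `addr.getDefiningOp<OpTy>()` may be worth considering for that case; the thread does not say whether such a value can reach this code path.

```cpp
#include <cassert> // handy for quick sanity checks when experimenting
#include <string>
#include <utility>

// Hypothetical stand-in for an operation; polymorphic so dynamic_cast works.
struct Op {
  virtual ~Op() = default;
};

// Hypothetical stand-in for a declare-like op that exposes a "base" value.
struct DeclareOp : Op {
  std::string base; // models what getBase() would return
  explicit DeclareOp(std::string b) : base(std::move(b)) {}
};

// Any other kind of defining operation.
struct OtherOp : Op {};

// Hypothetical stand-in for an SSA value: a name plus its defining op.
// A null definingOp models a value with no defining operation.
struct Value {
  std::string name;
  const Op *definingOp;
};

// Prefer the declare op's base when the value is defined by a declare op;
// otherwise fall back to the value itself. dynamic_cast on a null pointer
// yields null, so the fallback also covers values without a defining op.
std::string pickBase(const Value &addr) {
  if (const auto *declare = dynamic_cast<const DeclareOp *>(addr.definingOp))
    return declare->base;
  return addr.name;
}
```

With these stand-ins, a value defined by a `DeclareOp` resolves to that op's base, while a value defined by any other op, or by none, resolves to itself.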
https://github.com/llvm/llvm-project/pull/139414 From flang-commits at lists.llvm.org Sat May 10 15:51:40 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 10 May 2025 15:51:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414) In-Reply-To: Message-ID: <681fd87c.170a0220.1b63d2.26b8@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command:

``````````bash
git-clang-format --diff HEAD~1 HEAD --extensions cpp -- flang/lib/Lower/Bridge.cpp
``````````
View the diff from clang-format here.

``````````diff
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index bfe8898eb..cf9a32268 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4779,7 +4779,8 @@ private:
                "Only one reference to the device resident object is supported");
         auto addr = getSymbolAddress(sym);
         mlir::Value baseValue;
-        if (auto declareOp = llvm::dyn_cast<hlfir::DeclareOp>(addr.getDefiningOp()))
+        if (auto declareOp =
+                llvm::dyn_cast<hlfir::DeclareOp>(addr.getDefiningOp()))
           baseValue = declareOp.getBase();
         else
           baseValue = addr;
``````````
https://github.com/llvm/llvm-project/pull/139414

From flang-commits at lists.llvm.org Sat May 10 16:53:57 2025
From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits)
Date: Sat, 10 May 2025 16:53:57 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414)
In-Reply-To:
Message-ID: <681fe715.050a0220.36fd2.39a6@mx.google.com>

https://github.com/wangzpgi updated https://github.com/llvm/llvm-project/pull/139414

>From 38d7efcebee251a71c7bbcfb9de3429755c32210 Mon Sep 17 00:00:00 2001
From: Zhen Wang
Date: Sat, 10 May 2025 15:44:35 -0700
Subject: [PATCH 1/2] Fix CUDA implicit data transfer entity creation

---
 flang/lib/Lower/Bridge.cpp | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 43375e84f21fa..bfe8898ebff3d 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4778,7 +4778,13 @@ class FirConverter : public Fortran::lower::AbstractConverter {
                nbDeviceResidentObject <= 1 &&
                "Only one reference to the device resident object is supported");
         auto addr = getSymbolAddress(sym);
-        hlfir::Entity entity{addr};
+        mlir::Value baseValue;
+        if (auto declareOp = llvm::dyn_cast<hlfir::DeclareOp>(addr.getDefiningOp()))
+          baseValue = declareOp.getBase();
+        else
+          baseValue = addr;
+
+        hlfir::Entity entity{baseValue};
         auto [temp, cleanup] =
             hlfir::createTempFromMold(loc, builder, entity);
         auto needCleanup = fir::getIntIfConstant(cleanup);

>From 3347add1c1b2f56e7adc06d4261dc1f0735eb207 Mon Sep 17 00:00:00 2001
From: Zhen Wang
Date: Sat, 10 May 2025 16:53:44 -0700
Subject: [PATCH 2/2] fix format; add test

---
 flang/lib/Lower/Bridge.cpp             |  3 ++-
 flang/test/Lower/CUDA/cuda-managed.cuf | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+), 1 deletion(-)
 create mode 100644 flang/test/Lower/CUDA/cuda-managed.cuf

diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index bfe8898ebff3d..cf9a322680321 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -4779,7 +4779,8 @@ class FirConverter : public Fortran::lower::AbstractConverter {
                "Only one reference to the device resident object is supported");
         auto addr = getSymbolAddress(sym);
         mlir::Value baseValue;
-        if (auto declareOp = llvm::dyn_cast<hlfir::DeclareOp>(addr.getDefiningOp()))
+        if (auto declareOp =
+                llvm::dyn_cast<hlfir::DeclareOp>(addr.getDefiningOp()))
           baseValue = declareOp.getBase();
         else
           baseValue = addr;
diff --git a/flang/test/Lower/CUDA/cuda-managed.cuf b/flang/test/Lower/CUDA/cuda-managed.cuf
new file mode 100644
index 0000000000000..618a57da53a25
--- /dev/null
+++ b/flang/test/Lower/CUDA/cuda-managed.cuf
@@ -0,0 +1,24 @@
+! RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s
+
+subroutine testr2(N1,N2)
+  real(4), managed :: ai4(N1,N2)
+  real(4), allocatable :: bRefi4(:)
+
+  integer :: i1, i2
+
+  do i2 = 1, N2
+    do i1 = 1, N1
+      ai4(i1,i2) = i1 + N1*(i2-1)
+    enddo
+  enddo
+
+  allocate(bRefi4 (N1))
+  do i1 = 1, N1
+    bRefi4(i1) = (ai4(i1,1)+ai4(i1,N2))*N2/2
+  enddo
+  deallocate(bRefi4)
+
+end subroutine
+
+!CHECK-LABEL: func.func @_QPtestr2
+!CHECK: %{{.*}} = cuf.alloc !fir.array, %{{.*}}, %{{.*}} : index, index {bindc_name = "ai4", data_attr = #cuf.cuda, uniq_name = "_QFtestr2Eai4"} -> !fir.ref>

From flang-commits at lists.llvm.org Sun May 11 08:52:11 2025
From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits)
Date: Sun, 11 May 2025 08:52:11 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Extend assumed-size array checking in intrinsic functions (PR #139339)
In-Reply-To:
Message-ID: <6820c7ab.170a0220.cf2c4.5a0b@mx.google.com>

https://github.com/eugeneepshteyn approved this pull request.
https://github.com/llvm/llvm-project/pull/139339 From flang-commits at lists.llvm.org Sun May 11 09:15:03 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Sun, 11 May 2025 09:15:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Emit error when DEFERRED binding overrides non-DEFERRED (PR #139325) In-Reply-To: Message-ID: <6820cd07.620a0220.260e52.6c2e@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/139325 From flang-commits at lists.llvm.org Sun May 11 18:41:14 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sun, 11 May 2025 18:41:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add affine optimization pass pipeline. (PR #138627) In-Reply-To: Message-ID: <682151ba.170a0220.15b25e.8c11@mx.google.com> https://github.com/NexMing updated https://github.com/llvm/llvm-project/pull/138627 >From 4e6b42c7ed6679bec04816840288ae6045106d61 Mon Sep 17 00:00:00 2001 From: yanming Date: Wed, 30 Apr 2025 16:32:14 +0800 Subject: [PATCH] [flang][fir] Add affine optimization pass pipeline. 
--- .../flang/Optimizer/Passes/CommandLineOpts.h | 1 + .../flang/Optimizer/Passes/Pipelines.h | 3 +++ flang/lib/Optimizer/Passes/CMakeLists.txt | 1 + .../lib/Optimizer/Passes/CommandLineOpts.cpp | 1 + flang/lib/Optimizer/Passes/Pipelines.cpp | 20 +++++++++++++++++++ flang/test/Driver/mlir-pass-pipeline.f90 | 14 +++++++++++++ flang/test/Integration/OpenMP/auto-omp.f90 | 10 ++++++++++ 7 files changed, 50 insertions(+) create mode 100644 flang/test/Integration/OpenMP/auto-omp.f90 diff --git a/flang/include/flang/Optimizer/Passes/CommandLineOpts.h b/flang/include/flang/Optimizer/Passes/CommandLineOpts.h index 1cfaf285e75e6..320c561953213 100644 --- a/flang/include/flang/Optimizer/Passes/CommandLineOpts.h +++ b/flang/include/flang/Optimizer/Passes/CommandLineOpts.h @@ -42,6 +42,7 @@ extern llvm::cl::opt disableCfgConversion; extern llvm::cl::opt disableFirAvc; extern llvm::cl::opt disableFirMao; +extern llvm::cl::opt enableAffineOpt; extern llvm::cl::opt disableFirAliasTags; extern llvm::cl::opt useOldAliasTags; diff --git a/flang/include/flang/Optimizer/Passes/Pipelines.h b/flang/include/flang/Optimizer/Passes/Pipelines.h index a3f59ee8dd013..7680987367256 100644 --- a/flang/include/flang/Optimizer/Passes/Pipelines.h +++ b/flang/include/flang/Optimizer/Passes/Pipelines.h @@ -18,8 +18,11 @@ #include "flang/Optimizer/Passes/CommandLineOpts.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Tools/CrossToolHelpers.h" +#include "mlir/Conversion/AffineToStandard/AffineToStandard.h" #include "mlir/Conversion/ReconcileUnrealizedCasts/ReconcileUnrealizedCasts.h" #include "mlir/Conversion/SCFToControlFlow/SCFToControlFlow.h" +#include "mlir/Conversion/SCFToOpenMP/SCFToOpenMP.h" +#include "mlir/Dialect/Affine/Passes.h" #include "mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/Dialect/LLVMIR/LLVMAttrs.h" #include "mlir/Pass/PassManager.h" diff --git a/flang/lib/Optimizer/Passes/CMakeLists.txt b/flang/lib/Optimizer/Passes/CMakeLists.txt index 
1c19a5765aff1..ad6c714c28bec 100644 --- a/flang/lib/Optimizer/Passes/CMakeLists.txt +++ b/flang/lib/Optimizer/Passes/CMakeLists.txt @@ -21,6 +21,7 @@ add_flang_library(flangPasses MLIRPass MLIRReconcileUnrealizedCasts MLIRSCFToControlFlow + MLIRSCFToOpenMP MLIRSupport MLIRTransforms ) diff --git a/flang/lib/Optimizer/Passes/CommandLineOpts.cpp b/flang/lib/Optimizer/Passes/CommandLineOpts.cpp index f95a280883cba..b8ae6ede423e3 100644 --- a/flang/lib/Optimizer/Passes/CommandLineOpts.cpp +++ b/flang/lib/Optimizer/Passes/CommandLineOpts.cpp @@ -55,6 +55,7 @@ cl::opt useOldAliasTags( cl::desc("Use a single TBAA tree for all functions and do not use " "the FIR alias tags pass"), cl::init(false), cl::Hidden); +EnableOption(AffineOpt, "affine-opt", "affine optimization"); /// CodeGen Passes DisableOption(CodeGenRewrite, "codegen-rewrite", "rewrite FIR for codegen"); diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index a3ef473ea39b7..903766cb38236 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -209,6 +209,26 @@ void createDefaultFIROptimizerPassPipeline(mlir::PassManager &pm, if (pc.AliasAnalysis && !disableFirAliasTags && !useOldAliasTags) pm.addPass(fir::createAddAliasTags()); + // We can first convert the FIR dialect to the Affine dialect, perform + // optimizations on top of it, and then lower it to the FIR dialect. + // TODO: These optimization passes (e.g., PromoteToAffinePass) are currently + // experimental, so it's important to actively identify and address issues. 
+ if (enableAffineOpt && pc.OptLevel.isOptimizingForSpeed()) { + pm.addPass(fir::createPromoteToAffinePass()); + pm.addPass(mlir::createCSEPass()); + pm.addPass(mlir::affine::createAffineLoopInvariantCodeMotionPass()); + pm.addPass(mlir::affine::createAffineLoopNormalizePass()); + pm.addPass(mlir::affine::createSimplifyAffineStructuresPass()); + pm.addPass(mlir::affine::createAffineParallelize( + mlir::affine::AffineParallelizeOptions{1, false})); + pm.addPass(fir::createAffineDemotionPass()); + pm.addPass(mlir::createLowerAffinePass()); + if (pc.EnableOpenMP) { + pm.addPass(mlir::createConvertSCFToOpenMPPass()); + pm.addPass(mlir::createCanonicalizerPass()); + } + } + addNestedPassToAllTopLevelOperations( pm, fir::createStackReclaim); // convert control flow to CFG form diff --git a/flang/test/Driver/mlir-pass-pipeline.f90 b/flang/test/Driver/mlir-pass-pipeline.f90 index 45370895db397..188a42d231500 100644 --- a/flang/test/Driver/mlir-pass-pipeline.f90 +++ b/flang/test/Driver/mlir-pass-pipeline.f90 @@ -4,6 +4,7 @@ ! -O0 is the default: ! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline %s -O0 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL %s ! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline %s -O2 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL,O2 %s +! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline -mllvm --enable-affine-opt %s -O2 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL,O2,AFFINE %s ! REQUIRES: asserts @@ -105,6 +106,19 @@ ! ALL-NEXT: SimplifyFIROperations ! O2-NEXT: AddAliasTags +! AFFINE-NEXT: 'func.func' Pipeline +! AFFINE-NEXT: AffineDialectPromotion +! AFFINE-NEXT: CSE +! AFFINE-NEXT: (S) 0 num-cse'd - Number of operations CSE'd +! AFFINE-NEXT: (S) 0 num-dce'd - Number of operations DCE'd +! AFFINE-NEXT: 'func.func' Pipeline +! AFFINE-NEXT: AffineLoopInvariantCodeMotion +! 
AFFINE-NEXT: AffineLoopNormalize +! AFFINE-NEXT: SimplifyAffineStructures +! AFFINE-NEXT: AffineParallelize +! AFFINE-NEXT: AffineDialectDemotion +! AFFINE-NEXT: LowerAffinePass + ! ALL-NEXT: Pipeline Collection : ['fir.global', 'func.func', 'omp.declare_reduction', 'omp.private'] ! ALL-NEXT: 'fir.global' Pipeline ! ALL-NEXT: StackReclaim diff --git a/flang/test/Integration/OpenMP/auto-omp.f90 b/flang/test/Integration/OpenMP/auto-omp.f90 new file mode 100644 index 0000000000000..bf7da292552d8 --- /dev/null +++ b/flang/test/Integration/OpenMP/auto-omp.f90 @@ -0,0 +1,10 @@ +! RUN: %flang_fc1 -O1 -mllvm --enable-affine-opt -emit-llvm -fopenmp -o - %s \ +! RUN: | FileCheck %s + +!CHECK-LABEL: define void @foo_(ptr captures(none) %0) {{.*}} { +!CHECK: call void{{.*}}@__kmpc_fork_call{{.*}}@[[OMP_OUTLINED_FN_1:.*]]) + +subroutine foo(a) + integer, dimension(100, 100), intent(out) :: a + a = 1 +end subroutine foo From flang-commits at lists.llvm.org Mon May 12 01:39:16 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 01:39:16 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][MLIR] - Handle the mapping of subroutine arguments when they are subsequently used inside the region of an `omp.target` Op (PR #134967) In-Reply-To: Message-ID: <6821b3b4.050a0220.1e1fc3.9ad6@mx.google.com> ================ @@ -217,59 +217,36 @@ static void bindEntryBlockArgs(lower::AbstractConverter &converter, assert(args.isValid() && "invalid args"); fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - auto bindSingleMapLike = [&converter, - &firOpBuilder](const semantics::Symbol &sym, - const mlir::BlockArgument &arg) { - // Clones the `bounds` placing them inside the entry block and returns - // them. 
- auto cloneBound = [&](mlir::Value bound) { - if (mlir::isMemoryEffectFree(bound.getDefiningOp())) { - mlir::Operation *clonedOp = firOpBuilder.clone(*bound.getDefiningOp()); - return clonedOp->getResult(0); - } - TODO(converter.getCurrentLocation(), - "target map-like clause operand unsupported bound type"); - }; - - auto cloneBounds = [cloneBound](llvm::ArrayRef bounds) { - llvm::SmallVector clonedBounds; - llvm::transform(bounds, std::back_inserter(clonedBounds), - [&](mlir::Value bound) { return cloneBound(bound); }); - return clonedBounds; - }; - ---------------- jeanPerier wrote: It matters that they are cloned somehow since the region is isolated from above, but I think you may be handling it in the code below on line 1347 that is cloning all the values that do not belong to the region. So, explicit cloning is likely not needed anymore. ``` mlir::getUsedValuesDefinedAbove(region, valuesDefinedAbove); while (!valuesDefinedAbove.empty()) { ``` https://github.com/llvm/llvm-project/pull/134967 From flang-commits at lists.llvm.org Mon May 12 01:41:11 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 01:41:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fixed designator codegen for contiguous boxes. (PR #139003) In-Reply-To: Message-ID: <6821b427.170a0220.24fcbf.7b01@mx.google.com> https://github.com/jeanPerier approved this pull request. Thanks for the update and the fix, LGTM! https://github.com/llvm/llvm-project/pull/139003 From flang-commits at lists.llvm.org Mon May 12 02:31:14 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Mon, 12 May 2025 02:31:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <6821bfe2.170a0220.3241dc.9226@mx.google.com> skatrak wrote: Thank you Abid, can you check buildbot failures? They seem to be related to the patch. 
https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Mon May 12 03:19:08 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 03:19:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow flush of common block (PR #139528) In-Reply-To: Message-ID: <6821cb1c.170a0220.1b247.995d@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Tom Eccles (tblah)
Changes I think this was denied by accident in https://github.com/llvm/llvm-project/commit/68180d8d16f07db8200dfce7bae26a80c43ebc5e. Flush of a common block is allowed by the standard on my reading. It is not allowed by classic-flang but is supported by gfortran and ifx. This doesn't need any lowering changes. The LLVM translation ignores the flush argument list because the openmp runtime library doesn't support flushing specific data. Depends upon https://github.com/llvm/llvm-project/pull/139522. Ignore the first commit in this PR. --- Full diff: https://github.com/llvm/llvm-project/pull/139528.diff 3 Files Affected: - (modified) flang/lib/Semantics/check-omp-structure.cpp (-7) - (modified) flang/lib/Semantics/resolve-directives.cpp (+13) - (added) flang/test/Lower/OpenMP/flush-common.f90 (+13) ``````````diff diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index f17de42ca2466..591f30db36baa 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2304,13 +2304,6 @@ void OmpStructureChecker::Leave(const parser::OpenMPFlushConstruct &x) { auto &flushList{std::get>(x.v.t)}; if (flushList) { - for (const parser::OmpArgument &arg : flushList->v) { - if (auto *sym{GetArgumentSymbol(arg)}; sym && !IsVariableListItem(*sym)) { - context_.Say(arg.source, - "FLUSH argument must be a variable list item"_err_en_US); - } - } - if (FindClause(llvm::omp::Clause::OMPC_acquire) || FindClause(llvm::omp::Clause::OMPC_release) || FindClause(llvm::omp::Clause::OMPC_acq_rel)) { diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 8b1caca34a6a7..68f8cf9f17620 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -409,6 +409,19 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { } void Post(const parser::OpenMPDepobjConstruct &) { PopContext(); } + bool Pre(const 
parser::OpenMPFlushConstruct &x) { + PushContext(x.source, llvm::omp::Directive::OMPD_flush); + for (auto &arg : x.v.Arguments().v) { + if (auto *locator{std::get_if(&arg.u)}) { + if (auto *object{std::get_if(&locator->u)}) { + ResolveOmpObject(*object, Symbol::Flag::OmpDependObject); + } + } + } + return true; + } + void Post(const parser::OpenMPFlushConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPRequiresConstruct &x) { using Flags = WithOmpDeclarative::RequiresFlags; using Requires = WithOmpDeclarative::RequiresFlag; diff --git a/flang/test/Lower/OpenMP/flush-common.f90 b/flang/test/Lower/OpenMP/flush-common.f90 new file mode 100644 index 0000000000000..7656141dcb295 --- /dev/null +++ b/flang/test/Lower/OpenMP/flush-common.f90 @@ -0,0 +1,13 @@ +! RUN: %flang_fc1 -fopenmp -emit-hlfir -o - %s | FileCheck %s + +! Regression test to ensure that the name /c/ in the flush argument list is +! resolved to the common block symbol and common blocks are allowed in the +! flush argument list. + +! CHECK: %[[GLBL:.*]] = fir.address_of({{.*}}) : !fir.ref> + common /c/ x + real :: x +! CHECK: omp.flush(%[[GLBL]] : !fir.ref>) + !$omp flush(/c/) +end + ``````````
https://github.com/llvm/llvm-project/pull/139528 From flang-commits at lists.llvm.org Mon May 12 03:46:55 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Mon, 12 May 2025 03:46:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <6821d19f.050a0220.3201f7.b080@mx.google.com> abidh wrote: > Thank you Abid, can you check buildbot failures? They seem to be related to the patch. Hi Sergio, Thanks for having a look. The failures are expected as this PR needs the accompanying https://github.com/llvm/llvm-project/pull/138149 for build/tests to work correctly. I kept them separate for ease of reviewing. Once both of them are approved, I will bring them together before merge to make sure that build bots are green and they go in as one commit. https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Mon May 12 05:07:11 2025 From: flang-commits at lists.llvm.org (Michael Kruse via flang-commits) Date: Mon, 12 May 2025 05:07:11 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <6821e46f.170a0220.352c29.a8c1@mx.google.com> https://github.com/Meinersbur edited https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Mon May 12 06:00:02 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 06:00:02 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <6821f0d2.050a0220.1f30d2.a468@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Mon May 12 06:00:02 2025 From: flang-commits at 
lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 06:00:02 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <6821f0d2.170a0220.e5829.89e9@mx.google.com> ================ @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. + */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } ---------------- tblah wrote: Shouldn't there be some sort of error reporting to catch cases where the linear variable is not an alloca? What about if the variable was passed as a function argument? 
Nothing in the OpenMP MLIR dialect currently guarantees or documents that these variables are passed by-address into the clause, that's just how flang happens to work. I would be open to adding this requirement (as we have done this already for privatization and some reductions). But if you depend on this it does need to be documented. This is supposed to be generic code which supports any valid OpenMP MLIR code. https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Mon May 12 06:00:02 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 06:00:02 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <6821f0d2.170a0220.1b5a32.9df3@mx.google.com> https://github.com/tblah commented: Thank you for the progress so far https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Mon May 12 06:00:04 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 06:00:04 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <6821f0d4.170a0220.62e5e.802a@mx.google.com> ================ @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. 
+ */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + linearSteps.push_back(moduleTranslation.lookupValue(linearStep)); + } + + // Emit IR for initialization of linear variables + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + initLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::BasicBlock *loopPreHeader) { + builder.SetInsertPoint(loopPreHeader->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarLoad = builder.CreateLoad( + linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]); + builder.CreateStore(linearVarLoad, linearPreconditionVars[index]); + } + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + return afterBarrierIP; + } + + // Emit IR for 
updating Linear variables + void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody, + llvm::Value *loopInductionVar) { + builder.SetInsertPoint(loopBody->getTerminator()); + for (size_t index = 0; index < linearPreconditionVars.size(); index++) { + // Emit increments for linear vars + llvm::LoadInst *linearVarStart = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + + linearPreconditionVars[index]); + auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]); ---------------- tblah wrote: This only works for integer variables. For floating point you would need fmul. Same for add and fadd. https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Mon May 12 06:00:04 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 06:00:04 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <6821f0d4.170a0220.e438f.90ac@mx.google.com> ================ @@ -3580,6 +3580,9 @@ class CanonicalLoopInfo { BasicBlock *Latch = nullptr; BasicBlock *Exit = nullptr; + // Hold the MLIR value for the `lastiter` of the canonical loop. ---------------- tblah wrote: ```suggestion // Hold the LLVM value for the `lastiter` of the canonical loop. 
``` https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Mon May 12 06:00:03 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 06:00:03 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <6821f0d3.050a0220.1ed5ad.e94a@mx.google.com> ================ @@ -2423,15 +2562,40 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, llvm::omp::Directive::OMPD_for); llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder); + + // Initialize linear variables and linear step + LinearClauseProcessor linearClauseProcessor; + if (wsloopOp.getLinearVars().size()) { ---------------- tblah wrote: nit: This if is unnecessary. The for loops will not execute if there are no linear vars. https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Mon May 12 06:00:04 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 06:00:04 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <6821f0d4.050a0220.2a492b.a717@mx.google.com> ================ @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. 
+ */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + linearSteps.push_back(moduleTranslation.lookupValue(linearStep)); + } + + // Emit IR for initialization of linear variables + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + initLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::BasicBlock *loopPreHeader) { + builder.SetInsertPoint(loopPreHeader->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarLoad = builder.CreateLoad( + linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]); + builder.CreateStore(linearVarLoad, linearPreconditionVars[index]); + } + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + return afterBarrierIP; + } + + // Emit IR for 
updating Linear variables + void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody, + llvm::Value *loopInductionVar) { + builder.SetInsertPoint(loopBody->getTerminator()); + for (size_t index = 0; index < linearPreconditionVars.size(); index++) { + // Emit increments for linear vars + llvm::LoadInst *linearVarStart = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + + linearPreconditionVars[index]); + auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]); + auto addInst = builder.CreateAdd(linearVarStart, mulInst); + builder.CreateStore(addInst, linearLoopBodyTemps[index]); + } + } + + // Linear variable finalization is conditional on the last logical iteration. + // Create BB splits to manage the same. + void outlineLinearFinalizationBB(llvm::IRBuilderBase &builder, ---------------- tblah wrote: For me at least, this isn't what I expect from the term "outline". For me, "outline" would mean moving blocks into a different function. Perhaps a better name would be `splitLinearFiniBlock`. Feel free to ignore if others disagree. https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Mon May 12 06:00:06 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 06:00:06 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <6821f0d6.170a0220.2069a8.a31b@mx.google.com> ================ @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. 
+ */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + linearSteps.push_back(moduleTranslation.lookupValue(linearStep)); + } + + // Emit IR for initialization of linear variables + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + initLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::BasicBlock *loopPreHeader) { + builder.SetInsertPoint(loopPreHeader->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarLoad = builder.CreateLoad( + linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]); + builder.CreateStore(linearVarLoad, linearPreconditionVars[index]); + } + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + return afterBarrierIP; + } + + // Emit IR for 
updating Linear variables + void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody, + llvm::Value *loopInductionVar) { + builder.SetInsertPoint(loopBody->getTerminator()); + for (size_t index = 0; index < linearPreconditionVars.size(); index++) { + // Emit increments for linear vars + llvm::LoadInst *linearVarStart = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + + linearPreconditionVars[index]); + auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]); + auto addInst = builder.CreateAdd(linearVarStart, mulInst); + builder.CreateStore(addInst, linearLoopBodyTemps[index]); + } + } + + // Linear variable finalization is conditional on the last logical iteration. + // Create BB splits to manage the same. + void outlineLinearFinalizationBB(llvm::IRBuilderBase &builder, + llvm::BasicBlock *loopExit) { + linearFinalizationBB = loopExit->splitBasicBlock( + loopExit->getTerminator(), "omp_loop.linear_finalization"); + linearExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_exit"); + linearLastIterExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_lastiter_exit"); + } + + // Finalize the linear vars + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + finalizeLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::Value *lastIter) { + // Emit condition to check whether last logical iteration is being executed + builder.SetInsertPoint(linearFinalizationBB->getTerminator()); + llvm::Value *loopLastIterLoad = builder.CreateLoad( + llvm::Type::getInt32Ty(builder.getContext()), lastIter); ---------------- tblah wrote: Why is this always i32? 
https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Mon May 12 06:02:27 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 06:02:27 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [flang][OpenMP] Support MLIR lowering of linear clause for omp.wsloop (PR #139385) In-Reply-To: Message-ID: <6821f163.170a0220.3ce539.071f@mx.google.com> https://github.com/tblah approved this pull request. The flang lowering changes look good to me. If you like, these can be merged without the mlir->llvm translation because that should issue its own TODO message anyway. https://github.com/llvm/llvm-project/pull/139385 From flang-commits at lists.llvm.org Mon May 12 07:03:32 2025 From: flang-commits at lists.llvm.org (Jean-Didier PAILLEUX via flang-commits) Date: Mon, 12 May 2025 07:03:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ [NO]INLINE and FORCEINLINE directives (PR #134350) In-Reply-To: Message-ID: <6821ffb4.170a0220.9828a.5daf@mx.google.com> JDPailleux wrote: > @JDPailleux is this now blocked? Hi, sorry for the late reply (I wasn't available). Yes, normally it's now blocked, unless there are any other comments. 
https://github.com/llvm/llvm-project/pull/134350 From flang-commits at lists.llvm.org Mon May 12 07:07:35 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 07:07:35 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <682200a7.170a0220.1762a8.de9d@mx.google.com> ================ @@ -2231,6 +2232,146 @@ genSingleOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static clause::Defaultmap::ImplicitBehavior +getDefaultmapIfPresent(DefaultMapsTy &defaultMaps, mlir::Type varType) { ---------------- agozillon wrote: Couldn't add the const for either unfortunately, the index [] operator for map doesn't have a const qualified equivalent it seems (or one that utilises a const "this"). https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Mon May 12 07:23:10 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Mon, 12 May 2025 07:23:10 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <6822044e.170a0220.5c999.ee9e@mx.google.com> ================ @@ -2231,6 +2232,146 @@ genSingleOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static clause::Defaultmap::ImplicitBehavior +getDefaultmapIfPresent(DefaultMapsTy &defaultMaps, mlir::Type varType) { ---------------- kparzysz wrote: std::map has "at" that works for "const this". If the given key is not present in the map, it will abort. The operator [] will insert the key with a default-initialized value, so the map cannot be const for it to work. 
https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Mon May 12 07:28:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 07:28:50 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <682205a2.170a0220.3b67f8.3e32@mx.google.com> https://github.com/agozillon updated https://github.com/llvm/llvm-project/pull/135226
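For reference, the `std::map` point raised in the review of PR #135226 (that `operator[]` cannot be used on a const map, while `at` and `find` can) might be sketched roughly as follows. This is only an illustration in plain standard C++: `Category`, `Behavior`, `DefaultMaps` and the three helper functions are hypothetical stand-ins, not the types or functions used in the patch. Note also that in standard C++ `at` throws `std::out_of_range` for a missing key; in a build with exceptions disabled that would typically end in termination.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for the variable category and implicit behaviour
// enums; these are NOT the flang types from the patch.
enum class Category { All, Scalar, Aggregate };
enum class Behavior { Default, To, From, Tofrom };

using DefaultMaps = std::map<Category, Behavior>;

// Works on a const map: find() never inserts, so a missing key simply
// falls back to Behavior::Default.
Behavior lookupOrDefault(const DefaultMaps &maps, Category cat) {
  auto it = maps.find(cat);
  return it == maps.end() ? Behavior::Default : it->second;
}

// Also works on a const map, but the key must be present: at() throws
// std::out_of_range otherwise, so the precondition is asserted first.
Behavior lookupChecked(const DefaultMaps &maps, Category cat) {
  assert(maps.count(cat) == 1 && "key must be present before calling at()");
  return maps.at(cat);
}

// Requires a non-const map: operator[] inserts a value-initialized entry
// when the key is missing, which is why it has no const overload.
Behavior lookupWithSubscript(DefaultMaps &maps, Category cat) {
  return maps[cat];
}
```

Under these assumptions, a helper taking `const DefaultMaps &` could use the `find`-based lookup and keep the "not specified" fallback without ever modifying the map, whereas the subscript version silently grows the map on every miss.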
From flang-commits at lists.llvm.org Mon May 12 07:30:46 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 07:30:46 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <68220616.170a0220.843ab.3982@mx.google.com> agozillon wrote: Thank you very much for the reviews @kparzysz @skatrak I'm going to go ahead and land this now, I addressed the remaining nit comments! https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Mon May 12 07:30:48 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 07:30:48 -0700 (PDT) Subject: [flang-commits] [flang] f687ed9 - [Flang][OpenMP] Initial defaultmap implementation (#135226) Message-ID: <68220618.620a0220.390156.1627@mx.google.com> Author: agozillon Date: 2025-05-12T16:30:43+02:00 New Revision: f687ed9ff717372a7c751a3bf4ef7e33eb481fd6 URL: https://github.com/llvm/llvm-project/commit/f687ed9ff717372a7c751a3bf4ef7e33eb481fd6 DIFF: https://github.com/llvm/llvm-project/commit/f687ed9ff717372a7c751a3bf4ef7e33eb481fd6.diff LOG: [Flang][OpenMP] Initial defaultmap implementation (#135226) This aims to implement most of the initial arguments for defaultmap aside from firstprivate and none, and some of the more recent OpenMP 6 additions which will come in subsequent updates (with the OpenMP 6 variants needing parsing/semantic support first). 
Added: flang/test/Lower/OpenMP/Todo/defaultmap-clause-firstprivate.f90 flang/test/Lower/OpenMP/Todo/defaultmap-clause-none.f90 flang/test/Lower/OpenMP/defaultmap.f90 offload/test/offloading/fortran/target-defaultmap-present.f90 offload/test/offloading/fortran/target-defaultmap.f90 Modified: flang/include/flang/Parser/parse-tree.h flang/lib/Lower/OpenMP/ClauseProcessor.cpp flang/lib/Lower/OpenMP/ClauseProcessor.h flang/lib/Lower/OpenMP/Clauses.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Parser/openmp-parsers.cpp flang/test/Parser/OpenMP/defaultmap-clause.f90 Removed: flang/test/Lower/OpenMP/Todo/defaultmap-clause.f90 ################################################################################ diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index a0d7a797e7203..254236b510544 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4149,8 +4149,8 @@ struct OmpDefaultClause { // PRESENT // since 5.1 struct OmpDefaultmapClause { TUPLE_CLASS_BOILERPLATE(OmpDefaultmapClause); - ENUM_CLASS( - ImplicitBehavior, Alloc, To, From, Tofrom, Firstprivate, None, Default) + ENUM_CLASS(ImplicitBehavior, Alloc, To, From, Tofrom, Firstprivate, None, + Default, Present) MODIFIER_BOILERPLATE(OmpVariableCategory); std::tuple t; }; diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 79b5087e4da68..f4876256a378f 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -877,6 +877,26 @@ static bool isVectorSubscript(const evaluate::Expr &expr) { return false; } +bool ClauseProcessor::processDefaultMap(lower::StatementContext &stmtCtx, + DefaultMapsTy &result) const { + auto process = [&](const omp::clause::Defaultmap &clause, + const parser::CharBlock &) { + using Defmap = omp::clause::Defaultmap; + clause::Defaultmap::VariableCategory variableCategory = + Defmap::VariableCategory::All; 
+ // Variable Category is optional, if not specified defaults to all. + // Multiples of the same category are illegal as are any other + // defaultmaps being specified when a user specified all is in place, + // however, this should be handled earlier during semantics. + if (auto varCat = + std::get>(clause.t)) + variableCategory = varCat.value(); + auto behaviour = std::get(clause.t); + result[variableCategory] = behaviour; + }; + return findRepeatableClause(process); +} + bool ClauseProcessor::processDepend(lower::SymMap &symMap, lower::StatementContext &stmtCtx, mlir::omp::DependClauseOps &result) const { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 7857ba3fd0845..df398c78b0213 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -32,6 +32,10 @@ namespace Fortran { namespace lower { namespace omp { +// Container type for tracking user specified Defaultmaps for a target region +using DefaultMapsTy = std::map; + /// Class that handles the processing of OpenMP clauses. 
/// /// Its `process()` methods perform MLIR code generation for their @@ -110,6 +114,8 @@ class ClauseProcessor { bool processCopyin() const; bool processCopyprivate(mlir::Location currentLocation, mlir::omp::CopyprivateClauseOps &result) const; + bool processDefaultMap(lower::StatementContext &stmtCtx, + DefaultMapsTy &result) const; bool processDepend(lower::SymMap &symMap, lower::StatementContext &stmtCtx, mlir::omp::DependClauseOps &result) const; bool diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index c258bef2e4427..f3088b18b77ff 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -612,7 +612,7 @@ Defaultmap make(const parser::OmpClause::Defaultmap &inp, MS(Firstprivate, Firstprivate) MS(None, None) MS(Default, Default) - // MS(, Present) missing-in-parser + MS(Present, Present) // clang-format on ); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 54560729eb4af..43f2f35b2ba61 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -980,6 +980,145 @@ static void genLoopVars( firOpBuilder.setInsertionPointAfter(storeOp); } +static clause::Defaultmap::ImplicitBehavior +getDefaultmapIfPresent(DefaultMapsTy &defaultMaps, mlir::Type varType) { + using DefMap = clause::Defaultmap; + + if (defaultMaps.empty()) + return DefMap::ImplicitBehavior::Default; + + if (llvm::is_contained(defaultMaps, DefMap::VariableCategory::All)) + return defaultMaps[DefMap::VariableCategory::All]; + + // NOTE: Unsure if complex and/or vector falls into a scalar type + // or aggregate, but the current default implicit behaviour is to + // treat them as such (c_ptr has its own behaviour, so perhaps + // being lumped in as a scalar isn't the right thing). 
+ if ((fir::isa_trivial(varType) || fir::isa_char(varType) || + fir::isa_builtin_cptr_type(varType)) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Scalar)) + return defaultMaps[DefMap::VariableCategory::Scalar]; + + if (fir::isPointerType(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Pointer)) + return defaultMaps[DefMap::VariableCategory::Pointer]; + + if (fir::isAllocatableType(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Allocatable)) + return defaultMaps[DefMap::VariableCategory::Allocatable]; + + if (fir::isa_aggregate(varType) && + llvm::is_contained(defaultMaps, DefMap::VariableCategory::Aggregate)) + return defaultMaps[DefMap::VariableCategory::Aggregate]; + + return DefMap::ImplicitBehavior::Default; +} + +static std::pair +getImplicitMapTypeAndKind(fir::FirOpBuilder &firOpBuilder, + lower::AbstractConverter &converter, + DefaultMapsTy &defaultMaps, mlir::Type varType, + mlir::Location loc, const semantics::Symbol &sym) { + using DefMap = clause::Defaultmap; + // Check if a value of type `type` can be passed to the kernel by value. + // All kernel parameters are of pointer type, so if the value can be + // represented inside of a pointer, then it can be passed by value. 
+ auto isLiteralType = [&](mlir::Type type) { + const mlir::DataLayout &dl = firOpBuilder.getDataLayout(); + mlir::Type ptrTy = + mlir::LLVM::LLVMPointerType::get(&converter.getMLIRContext()); + uint64_t ptrSize = dl.getTypeSize(ptrTy); + uint64_t ptrAlign = dl.getTypePreferredAlignment(ptrTy); + + auto [size, align] = fir::getTypeSizeAndAlignmentOrCrash( + loc, type, dl, converter.getKindMap()); + return size <= ptrSize && align <= ptrAlign; + }; + + llvm::omp::OpenMPOffloadMappingFlags mapFlag = + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT; + + auto implicitBehaviour = getDefaultmapIfPresent(defaultMaps, varType); + if (implicitBehaviour == DefMap::ImplicitBehavior::Default) { + mlir::omp::VariableCaptureKind captureKind = + mlir::omp::VariableCaptureKind::ByRef; + + // If a variable is specified in declare target link and if device + // type is not specified as `nohost`, it needs to be mapped tofrom + mlir::ModuleOp mod = firOpBuilder.getModule(); + mlir::Operation *op = mod.lookupSymbol(converter.mangleName(sym)); + auto declareTargetOp = + llvm::dyn_cast_if_present(op); + if (declareTargetOp && declareTargetOp.isDeclareTarget()) { + if (declareTargetOp.getDeclareTargetCaptureClause() == + mlir::omp::DeclareTargetCaptureClause::link && + declareTargetOp.getDeclareTargetDeviceType() != + mlir::omp::DeclareTargetDeviceType::nohost) { + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; + } + } else if (fir::isa_trivial(varType) || fir::isa_char(varType)) { + // Scalars behave as if they were "firstprivate". + // TODO: Handle objects that are shared/lastprivate or were listed + // in an in_reduction clause. 
+ if (isLiteralType(varType)) { + captureKind = mlir::omp::VariableCaptureKind::ByCopy; + } else { + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; + } + } else if (!fir::isa_builtin_cptr_type(varType)) { + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; + mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; + } + return std::make_pair(mapFlag, captureKind); + } + + switch (implicitBehaviour) { + case DefMap::ImplicitBehavior::Alloc: + return std::make_pair(llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_NONE, + mlir::omp::VariableCaptureKind::ByRef); + break; + case DefMap::ImplicitBehavior::Firstprivate: + case DefMap::ImplicitBehavior::None: + TODO(loc, "Firstprivate and None are currently unsupported defaultmap " + "behaviour"); + break; + case DefMap::ImplicitBehavior::From: + return std::make_pair(mapFlag |= + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM, + mlir::omp::VariableCaptureKind::ByRef); + break; + case DefMap::ImplicitBehavior::Present: + return std::make_pair(mapFlag |= + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_PRESENT, + mlir::omp::VariableCaptureKind::ByRef); + break; + case DefMap::ImplicitBehavior::To: + return std::make_pair(mapFlag |= + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO, + (fir::isa_trivial(varType) || fir::isa_char(varType)) + ? 
mlir::omp::VariableCaptureKind::ByCopy + : mlir::omp::VariableCaptureKind::ByRef); + break; + case DefMap::ImplicitBehavior::Tofrom: + return std::make_pair(mapFlag |= + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM | + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO, + mlir::omp::VariableCaptureKind::ByRef); + break; + case DefMap::ImplicitBehavior::Default: + llvm_unreachable( + "Implicit None Behaviour Should Have Been Handled Earlier"); + break; + } + + return std::make_pair(mapFlag |= + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM | + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO, + mlir::omp::VariableCaptureKind::ByRef); +} + static void markDeclareTarget(mlir::Operation *op, lower::AbstractConverter &converter, mlir::omp::DeclareTargetCaptureClause captureClause, @@ -1677,11 +1816,13 @@ static void genTargetClauses( lower::SymMap &symTable, lower::StatementContext &stmtCtx, lower::pft::Evaluation &eval, const List &clauses, mlir::Location loc, mlir::omp::TargetOperands &clauseOps, + DefaultMapsTy &defaultMaps, llvm::SmallVectorImpl &hasDeviceAddrSyms, llvm::SmallVectorImpl &isDevicePtrSyms, llvm::SmallVectorImpl &mapSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processBare(clauseOps); + cp.processDefaultMap(stmtCtx, defaultMaps); cp.processDepend(symTable, stmtCtx, clauseOps); cp.processDevice(stmtCtx, clauseOps); cp.processHasDeviceAddr(stmtCtx, clauseOps, hasDeviceAddrSyms); @@ -1696,9 +1837,8 @@ static void genTargetClauses( cp.processNowait(clauseOps); cp.processThreadLimit(stmtCtx, clauseOps); - cp.processTODO(loc, - llvm::omp::Directive::OMPD_target); + cp.processTODO( + loc, llvm::omp::Directive::OMPD_target); // `target private(..)` is only supported in delayed privatization mode. 
if (!enableDelayedPrivatizationStaging) @@ -2242,10 +2382,12 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, hostEvalInfo.emplace_back(); mlir::omp::TargetOperands clauseOps; + DefaultMapsTy defaultMaps; llvm::SmallVector mapSyms, isDevicePtrSyms, hasDeviceAddrSyms; genTargetClauses(converter, semaCtx, symTable, stmtCtx, eval, item->clauses, - loc, clauseOps, hasDeviceAddrSyms, isDevicePtrSyms, mapSyms); + loc, clauseOps, defaultMaps, hasDeviceAddrSyms, + isDevicePtrSyms, mapSyms); DataSharingProcessor dsp(converter, semaCtx, item->clauses, eval, /*shouldCollectPreDeterminedSymbols=*/ @@ -2253,21 +2395,6 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, /*useDelayedPrivatization=*/true, symTable); dsp.processStep1(&clauseOps); - // Check if a value of type `type` can be passed to the kernel by value. - // All kernel parameters are of pointer type, so if the value can be - // represented inside of a pointer, then it can be passed by value. 
- auto isLiteralType = [&](mlir::Type type) { - const mlir::DataLayout &dl = firOpBuilder.getDataLayout(); - mlir::Type ptrTy = - mlir::LLVM::LLVMPointerType::get(&converter.getMLIRContext()); - uint64_t ptrSize = dl.getTypeSize(ptrTy); - uint64_t ptrAlign = dl.getTypePreferredAlignment(ptrTy); - - auto [size, align] = fir::getTypeSizeAndAlignmentOrCrash( - loc, type, dl, converter.getKindMap()); - return size <= ptrSize && align <= ptrAlign; - }; - // 5.8.1 Implicit Data-Mapping Attribute Rules // The following code follows the implicit data-mapping rules to map all the // symbols used inside the region that do not have explicit data-environment @@ -2330,56 +2457,25 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, firOpBuilder, info, dataExv, semantics::IsAssumedSizeArray(sym.GetUltimate()), converter.getCurrentLocation()); - - llvm::omp::OpenMPOffloadMappingFlags mapFlag = - llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT; - mlir::omp::VariableCaptureKind captureKind = - mlir::omp::VariableCaptureKind::ByRef; - mlir::Value baseOp = info.rawInput; mlir::Type eleType = baseOp.getType(); if (auto refType = mlir::dyn_cast(baseOp.getType())) eleType = refType.getElementType(); - // If a variable is specified in declare target link and if device - // type is not specified as `nohost`, it needs to be mapped tofrom - mlir::ModuleOp mod = firOpBuilder.getModule(); - mlir::Operation *op = mod.lookupSymbol(converter.mangleName(sym)); - auto declareTargetOp = - llvm::dyn_cast_if_present(op); - if (declareTargetOp && declareTargetOp.isDeclareTarget()) { - if (declareTargetOp.getDeclareTargetCaptureClause() == - mlir::omp::DeclareTargetCaptureClause::link && - declareTargetOp.getDeclareTargetDeviceType() != - mlir::omp::DeclareTargetDeviceType::nohost) { - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; - } - } else if (fir::isa_trivial(eleType) || 
fir::isa_char(eleType)) { - // Scalars behave as if they were "firstprivate". - // TODO: Handle objects that are shared/lastprivate or were listed - // in an in_reduction clause. - if (isLiteralType(eleType)) { - captureKind = mlir::omp::VariableCaptureKind::ByCopy; - } else { - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; - } - } else if (!fir::isa_builtin_cptr_type(eleType)) { - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; - mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; - } - auto location = - mlir::NameLoc::get(mlir::StringAttr::get(firOpBuilder.getContext(), - sym.name().ToString()), - baseOp.getLoc()); + std::pair + mapFlagAndKind = getImplicitMapTypeAndKind( + firOpBuilder, converter, defaultMaps, eleType, loc, sym); + mlir::Value mapOp = createMapInfoOp( - firOpBuilder, location, baseOp, /*varPtrPtr=*/mlir::Value{}, - name.str(), bounds, /*members=*/{}, + firOpBuilder, converter.getCurrentLocation(), baseOp, + /*varPtrPtr=*/mlir::Value{}, name.str(), bounds, /*members=*/{}, /*membersIndex=*/mlir::ArrayAttr{}, static_cast< std::underlying_type_t>( - mapFlag), - captureKind, baseOp.getType(), /*partialMap=*/false, mapperId); + std::get<0>(mapFlagAndKind)), + std::get<1>(mapFlagAndKind), baseOp.getType(), + /*partialMap=*/false, mapperId); clauseOps.mapVars.push_back(mapOp); mapSyms.push_back(&sym); @@ -4199,6 +4295,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, !std::holds_alternative(clause.u) && !std::holds_alternative(clause.u) && !std::holds_alternative(clause.u) && + !std::holds_alternative(clause.u) && !std::holds_alternative(clause.u) && !std::holds_alternative(clause.u) && !std::holds_alternative(clause.u) && diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index ffee57144f7fb..0254ac4309ee5 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -705,7 +705,7 @@ TYPE_PARSER(construct( // 
[OpenMP 5.0] // 2.19.7.2 defaultmap(implicit-behavior[:variable-category]) // implicit-behavior -> ALLOC | TO | FROM | TOFROM | FIRSRTPRIVATE | NONE | -// DEFAULT +// DEFAULT | PRESENT // variable-category -> ALL | SCALAR | AGGREGATE | ALLOCATABLE | POINTER TYPE_PARSER(construct( construct( @@ -716,7 +716,8 @@ TYPE_PARSER(construct( "FIRSTPRIVATE" >> pure(OmpDefaultmapClause::ImplicitBehavior::Firstprivate) || "NONE" >> pure(OmpDefaultmapClause::ImplicitBehavior::None) || - "DEFAULT" >> pure(OmpDefaultmapClause::ImplicitBehavior::Default)), + "DEFAULT" >> pure(OmpDefaultmapClause::ImplicitBehavior::Default) || + "PRESENT" >> pure(OmpDefaultmapClause::ImplicitBehavior::Present)), maybe(":" >> nonemptyList(Parser{})))) TYPE_PARSER(construct( diff --git a/flang/test/Lower/OpenMP/Todo/defaultmap-clause-firstprivate.f90 b/flang/test/Lower/OpenMP/Todo/defaultmap-clause-firstprivate.f90 new file mode 100644 index 0000000000000..0af2c7f5ea818 --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/defaultmap-clause-firstprivate.f90 @@ -0,0 +1,11 @@ +!RUN: %not_todo_cmd bbc -emit-hlfir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s +!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s + +subroutine f00 + implicit none + integer :: i + !CHECK: not yet implemented: Firstprivate and None are currently unsupported defaultmap behaviour + !$omp target defaultmap(firstprivate) + i = 10 + !$omp end target + end diff --git a/flang/test/Lower/OpenMP/Todo/defaultmap-clause-none.f90 b/flang/test/Lower/OpenMP/Todo/defaultmap-clause-none.f90 new file mode 100644 index 0000000000000..287eb4a9dfe8f --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/defaultmap-clause-none.f90 @@ -0,0 +1,11 @@ +!RUN: %not_todo_cmd bbc -emit-hlfir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s +!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s + +subroutine f00 + implicit none + integer :: i + !CHECK: 
not yet implemented: Firstprivate and None are currently unsupported defaultmap behaviour + !$omp target defaultmap(none) + i = 10 + !$omp end target +end diff --git a/flang/test/Lower/OpenMP/Todo/defaultmap-clause.f90 b/flang/test/Lower/OpenMP/Todo/defaultmap-clause.f90 deleted file mode 100644 index 062399d9a1944..0000000000000 --- a/flang/test/Lower/OpenMP/Todo/defaultmap-clause.f90 +++ /dev/null @@ -1,8 +0,0 @@ -!RUN: %not_todo_cmd bbc -emit-hlfir -fopenmp -fopenmp-version=45 -o - %s 2>&1 | FileCheck %s -!RUN: %not_todo_cmd %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=45 -o - %s 2>&1 | FileCheck %s - -!CHECK: not yet implemented: DEFAULTMAP clause is not implemented yet -subroutine f00 - !$omp target defaultmap(tofrom:scalar) - !$omp end target -end diff --git a/flang/test/Lower/OpenMP/defaultmap.f90 b/flang/test/Lower/OpenMP/defaultmap.f90 new file mode 100644 index 0000000000000..89d86ac1b8cc9 --- /dev/null +++ b/flang/test/Lower/OpenMP/defaultmap.f90 @@ -0,0 +1,105 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=52 %s -o - | FileCheck %s + +subroutine defaultmap_allocatable_present() + implicit none + integer, dimension(:), allocatable :: arr + +! CHECK: %[[MAP_1:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, i32) map_clauses(implicit, present, exit_release_or_enter_alloc) capture(ByRef) var_ptr_ptr({{.*}}) bounds({{.*}}) -> !fir.llvm_ptr>> {name = ""} +! CHECK: %[[MAP_2:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, !fir.box>>) map_clauses(implicit, to) capture(ByRef) members({{.*}}) -> !fir.ref>>> {name = "arr"} +!$omp target defaultmap(present: allocatable) + arr(1) = 10 +!$omp end target + + return +end subroutine + +subroutine defaultmap_scalar_tofrom() + implicit none + integer :: scalar_int + +! 
CHECK: %[[MAP:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref, i32) map_clauses(implicit, tofrom) capture(ByRef) -> !fir.ref {name = "scalar_int"} + !$omp target defaultmap(tofrom: scalar) + scalar_int = 20 + !$omp end target + + return +end subroutine + +subroutine defaultmap_all_default() + implicit none + integer, dimension(:), allocatable :: arr + integer :: aggregate(16) + integer :: scalar_int + +! CHECK: %[[MAP_1:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref {name = "scalar_int"} +! CHECK: %[[MAP_2:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, i32) map_clauses(implicit, tofrom) capture(ByRef) var_ptr_ptr({{.*}}) bounds({{.*}}) -> !fir.llvm_ptr>> {name = ""} +! CHECK: %[[MAP_3:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, !fir.box>>) map_clauses(implicit, to) capture(ByRef) members({{.*}}) -> !fir.ref>>> {name = "arr"} +! CHECK: %[[MAP_4:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>, !fir.array<16xi32>) map_clauses(implicit, tofrom) capture(ByRef) bounds({{.*}}) -> !fir.ref> {name = "aggregate"} + + !$omp target defaultmap(default: all) + scalar_int = 20 + arr(1) = scalar_int + aggregate(1) + !$omp end target + + return +end subroutine + +subroutine defaultmap_pointer_to() + implicit none + integer, dimension(:), pointer :: arr_ptr(:) + integer :: scalar_int + +! CHECK: %[[MAP_1:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, i32) map_clauses(implicit, to) capture(ByRef) var_ptr_ptr({{.*}}) bounds({{.*}}) -> !fir.llvm_ptr>> {name = ""} +! CHECK: %[[MAP_2:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>>, !fir.box>>) map_clauses(implicit, to) capture(ByRef) members({{.*}}) -> !fir.ref>>> {name = "arr_ptr"} +! 
CHECK: %[[MAP_3:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref {name = "scalar_int"} + !$omp target defaultmap(to: pointer) + arr_ptr(1) = scalar_int + 20 + !$omp end target + + return +end subroutine + +subroutine defaultmap_scalar_from() + implicit none + integer :: scalar_test + +! CHECK:%[[MAP:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref, i32) map_clauses(implicit, from) capture(ByRef) -> !fir.ref {name = "scalar_test"} + !$omp target defaultmap(from: scalar) + scalar_test = 20 + !$omp end target + + return +end subroutine + +subroutine defaultmap_aggregate_to() + implicit none + integer :: aggregate_arr(16) + integer :: scalar_test + +! CHECK: %[[MAP_1:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref {name = "scalar_test"} +! CHECK: %[[MAP_2:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>, !fir.array<16xi32>) map_clauses(implicit, to) capture(ByRef) bounds({{.*}}) -> !fir.ref> {name = "aggregate_arr"} + !$omp target map(tofrom: scalar_test) defaultmap(to: aggregate) + aggregate_arr(1) = 1 + scalar_test = 1 + !$omp end target + + return +end subroutine + +subroutine defaultmap_dtype_aggregate_to() + implicit none + type :: dtype + integer(4) :: array_i(10) + integer(4) :: k + end type dtype + + type(dtype) :: aggregate_type + +! 
CHECK: %[[MAP:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref,k:i32}>>, !fir.type<_QFdefaultmap_dtype_aggregate_toTdtype{array_i:!fir.array<10xi32>,k:i32}>) map_clauses(implicit, to) capture(ByRef) -> !fir.ref,k:i32}>> {name = "aggregate_type"} + !$omp target defaultmap(to: aggregate) + aggregate_type%k = 40 + aggregate_type%array_i(1) = 50 + !$omp end target + + return +end subroutine diff --git a/flang/test/Parser/OpenMP/defaultmap-clause.f90 b/flang/test/Parser/OpenMP/defaultmap-clause.f90 index dc036aedcd003..d908258fac763 100644 --- a/flang/test/Parser/OpenMP/defaultmap-clause.f90 +++ b/flang/test/Parser/OpenMP/defaultmap-clause.f90 @@ -82,3 +82,19 @@ subroutine f04 !PARSE-TREE: | OmpClauseList -> OmpClause -> Defaultmap -> OmpDefaultmapClause !PARSE-TREE: | | ImplicitBehavior = Tofrom !PARSE-TREE: | | Modifier -> OmpVariableCategory -> Value = Scalar + +subroutine f05 + !$omp target defaultmap(present: scalar) + !$omp end target +end + +!UNPARSE: SUBROUTINE f05 +!UNPARSE: !$OMP TARGET DEFAULTMAP(PRESENT:SCALAR) +!UNPARSE: !$OMP END TARGET +!UNPARSE: END SUBROUTINE + +!PARSE-TREE: OmpBeginBlockDirective +!PARSE-TREE: | OmpBlockDirective -> llvm::omp::Directive = target +!PARSE-TREE: | OmpClauseList -> OmpClause -> Defaultmap -> OmpDefaultmapClause +!PARSE-TREE: | | ImplicitBehavior = Present +!PARSE-TREE: | | Modifier -> OmpVariableCategory -> Value = Scalar diff --git a/offload/test/offloading/fortran/target-defaultmap-present.f90 b/offload/test/offloading/fortran/target-defaultmap-present.f90 new file mode 100644 index 0000000000000..3342db21f15c8 --- /dev/null +++ b/offload/test/offloading/fortran/target-defaultmap-present.f90 @@ -0,0 +1,34 @@ +! This checks that the basic functionality of setting the implicit mapping +! behaviour of a target region to present incurs the present behaviour for +! the implicit map capture. +! REQUIRES: flang, amdgpu +! RUN: %libomptarget-compile-fortran-generic +! RUN: %libomptarget-run-fail-generic 2>&1 \ +! 
RUN: | %fcheck-generic + +! NOTE: This should intentionally fatal error in omptarget as it's not +! present, as is intended. +subroutine target_data_not_present() + implicit none + double precision, dimension(:), allocatable :: arr + integer, parameter :: N = 16 + integer :: i + + allocate(arr(N)) + +!$omp target defaultmap(present: allocatable) + do i = 1,N + arr(i) = 42.0d0 + end do +!$omp end target + + deallocate(arr) + return +end subroutine + +program map_present + implicit none + call target_data_not_present() +end program + +!CHECK: omptarget message: device mapping required by 'present' map type modifier does not exist for host address{{.*}} diff --git a/offload/test/offloading/fortran/target-defaultmap.f90 b/offload/test/offloading/fortran/target-defaultmap.f90 new file mode 100644 index 0000000000000..d7184371129d2 --- /dev/null +++ b/offload/test/offloading/fortran/target-defaultmap.f90 @@ -0,0 +1,166 @@ +! Offloading test checking the use of the depend clause on the target construct +! REQUIRES: flang, amdgcn-amd-amdhsa +! UNSUPPORTED: nvptx64-nvidia-cuda +! UNSUPPORTED: nvptx64-nvidia-cuda-LTO +! UNSUPPORTED: aarch64-unknown-linux-gnu +! UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +! UNSUPPORTED: x86_64-unknown-linux-gnu +! UNSUPPORTED: x86_64-unknown-linux-gnu-LTO + +! 
RUN: %libomptarget-compile-fortran-run-and-check-generic +subroutine defaultmap_allocatable_present() + implicit none + integer, dimension(:), allocatable :: arr + integer :: N = 16 + integer :: i + + allocate(arr(N)) + +!$omp target enter data map(to: arr) + +!$omp target defaultmap(present: allocatable) + do i = 1,N + arr(i) = N + 40 + end do +!$omp end target + +!$omp target exit data map(from: arr) + + print *, arr + deallocate(arr) + + return +end subroutine + +subroutine defaultmap_scalar_tofrom() + implicit none + integer :: scalar_int + scalar_int = 10 + + !$omp target defaultmap(tofrom: scalar) + scalar_int = 20 + !$omp end target + + print *, scalar_int + return +end subroutine + +subroutine defaultmap_all_default() + implicit none + integer, dimension(:), allocatable :: arr + integer :: aggregate(16) + integer :: N = 16 + integer :: i, scalar_int + + allocate(arr(N)) + + scalar_int = 10 + aggregate = scalar_int + + !$omp target defaultmap(default: all) + scalar_int = 20 + do i = 1,N + arr(i) = scalar_int + aggregate(i) + end do + !$omp end target + + print *, scalar_int + print *, arr + + deallocate(arr) + return +end subroutine + +subroutine defaultmap_pointer_to() + implicit none + integer, dimension(:), pointer :: arr_ptr(:) + integer :: scalar_int, i + allocate(arr_ptr(10)) + arr_ptr = 10 + scalar_int = 20 + + !$omp target defaultmap(to: pointer) + do i = 1,10 + arr_ptr(i) = scalar_int + 20 + end do + !$omp end target + + print *, arr_ptr + deallocate(arr_ptr) + return +end subroutine + +subroutine defaultmap_scalar_from() + implicit none + integer :: scalar_test + scalar_test = 10 + !$omp target defaultmap(from: scalar) + scalar_test = 20 + !$omp end target + + print *, scalar_test + return +end subroutine + +subroutine defaultmap_aggregate_to() + implicit none + integer :: aggregate_arr(16) + integer :: i, scalar_test = 0 + aggregate_arr = 0 + !$omp target map(tofrom: scalar_test) defaultmap(to: aggregate) + do i = 1,16 + aggregate_arr(i) = i + 
scalar_test = scalar_test + aggregate_arr(i) + enddo + !$omp end target + + print *, scalar_test + print *, aggregate_arr + return +end subroutine + +subroutine defaultmap_dtype_aggregate_to() + implicit none + type :: dtype + real(4) :: i + real(4) :: j + integer(4) :: array_i(10) + integer(4) :: k + integer(4) :: array_j(10) + end type dtype + + type(dtype) :: aggregate_type + + aggregate_type%k = 20 + aggregate_type%array_i = 30 + + !$omp target defaultmap(to: aggregate) + aggregate_type%k = 40 + aggregate_type%array_i(1) = 50 + !$omp end target + + print *, aggregate_type%k + print *, aggregate_type%array_i(1) + return +end subroutine + +program map_present + implicit none +! CHECK: 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 + call defaultmap_allocatable_present() +! CHECK: 20 + call defaultmap_scalar_tofrom() +! CHECK: 10 +! CHECK: 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 + call defaultmap_all_default() +! CHECK: 10 10 10 10 10 10 10 10 10 10 + call defaultmap_pointer_to() +! CHECK: 20 + call defaultmap_scalar_from() +! CHECK: 136 +! CHECK: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 + call defaultmap_aggregate_to() +! CHECK: 20 +! 
CHECK: 30 + call defaultmap_dtype_aggregate_to() +end program From flang-commits at lists.llvm.org Mon May 12 07:30:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 07:30:50 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <6822061a.170a0220.38fc6.f375@mx.google.com> https://github.com/agozillon closed https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Mon May 12 07:34:52 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 07:34:52 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [Flang][OpenMP] Initial defaultmap implementation (PR #135226) In-Reply-To: Message-ID: <6822070c.a70a0220.1ade42.1c0d@mx.google.com> ================ @@ -2231,6 +2232,146 @@ genSingleOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static clause::Defaultmap::ImplicitBehavior +getDefaultmapIfPresent(DefaultMapsTy &defaultMaps, mlir::Type varType) { ---------------- agozillon wrote: Ah, thank you Krzysztof! I'll do a small follow up commit to amend it to const and utilise at then :-) https://github.com/llvm/llvm-project/pull/135226 From flang-commits at lists.llvm.org Mon May 12 07:49:04 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 07:49:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ [NO]INLINE and FORCEINLINE directives (PR #134350) In-Reply-To: Message-ID: <68220a60.050a0220.3201f7.4ed7@mx.google.com> tblah wrote: If I understand correctly the problem is that the llvm attributes have to be placed on the function operation not the call operation? I would recommend looking up the function definition from the symbol used in the call operation. If that can be found, add the attributes there. 
If there is no function definition in the current translation unit then it cannot be inlined anyway(*). (*) technically this could be possible with LTO, but with the way the compiler driver works currently, the symbol lookup would have to be done in the llvm backend (during linking). I think attempting this is a bit out of scope for this flang work. In the end whatever inlining attributes were supported in classic flang would have been constrained in the same way by the LLVM backend. Please let me know if you intend to continue to work on this. https://github.com/llvm/llvm-project/pull/134350 From flang-commits at lists.llvm.org Mon May 12 07:59:40 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Mon, 12 May 2025 07:59:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP][Semantics] resolve objects in the flush arg list (PR #139522) In-Reply-To: Message-ID: <68220cdc.170a0220.3a1a59.2b1a@mx.google.com> ================ @@ -409,6 +409,19 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { } void Post(const parser::OpenMPDepobjConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPFlushConstruct &x) { + PushContext(x.source, llvm::omp::Directive::OMPD_flush); + for (auto &arg : x.v.Arguments().v) { + if (auto *locator{std::get_if(&arg.u)}) { + if (auto *object{std::get_if(&locator->u)}) { + ResolveOmpObject(*object, Symbol::Flag::OmpDependObject); ---------------- kiranchandramohan wrote: Assuming this is a copy paste error. This is not a `DependObject`, so I think adding the `Symbol::Flag::OmpDependObject` is probably not right. What do we want to do here? Is it just resolving the names in the flush argument list to the appropriate Fortran declarations? If so, we should use `ResolveName`. 
https://github.com/llvm/llvm-project/pull/139522 From flang-commits at lists.llvm.org Mon May 12 08:09:58 2025 From: flang-commits at lists.llvm.org (Jean-Didier PAILLEUX via flang-commits) Date: Mon, 12 May 2025 08:09:58 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ IVDEP directive (PR #133728) In-Reply-To: Message-ID: <68220f46.a70a0220.285ae5.58b9@mx.google.com> JDPailleux wrote: > @JDPailleux do you have any update on this? Hi, sorry for the late reply (First I was waiting response from https://github.com/llvm/llvm-project/pull/101045 and after I was unavailable). I'll push an updated version soon. https://github.com/llvm/llvm-project/pull/133728 From flang-commits at lists.llvm.org Mon May 12 08:15:44 2025 From: flang-commits at lists.llvm.org (Michael Kruse via flang-commits) Date: Mon, 12 May 2025 08:15:44 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828) In-Reply-To: Message-ID: <682210a0.050a0220.240307.91da@mx.google.com> ================ @@ -88,6 +117,67 @@ set(host_sources unit-map.cpp ) +# Module sources that are required by other modules +set(intrinsics_sources + __fortran_builtins.f90 +) + + +#set_property(SOURCE "__fortran_type_info.f90" APPEND PROPERTY OBJECT_DEPENDS "/home/meinersbur/build/llvm-project-flangrt/release_bootstrap/llvm_flang_runtimes/./lib/../include/flang/__fortran_builtins.mod") +#set_property(SOURCE "__fortran_type_info.f90" APPEND PROPERTY OBJECT_DEPENDS "flang-rt/lib/runtime/CMakeFiles/flang_rt.runtime.static.dir/__fortran_builtins.f90.o") +#set_property(SOURCE "__fortran_type_info.f90" APPEND PROPERTY OBJECT_DEPENDS "/home/meinersbur/build/llvm-project-flangrt/release_bootstrap/llvm_flang_runtimes/./lib/../include/flang/__fortran_builtins.mod") + +message("CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}") +message("CMAKE_HOST_SYSTEM_PROCESSOR: ${CMAKE_HOST_SYSTEM_PROCESSOR}") +if (CMAKE_SYSTEM_PROCESSOR STREQUAL 
"powerpc") + list(APPEND host_source + __ppc_types.f90 + __ppc_intrinsics.f90 + mma.f90 + ) +endif () + +if (FLANG_RT_EXPERIMENTAL_OFFLOAD_SUPPORT STREQUAL "CUDA") + list(APPEND supported_sources + __cuda_builtins.f90 + __cuda_device.f90 + cudadevice.f90 + mma.f90 ---------------- Meinersbur wrote: Correct, I think was just from copy&pasting the previous list. https://github.com/llvm/llvm-project/pull/137828 From flang-commits at lists.llvm.org Mon May 12 08:16:46 2025 From: flang-commits at lists.llvm.org (Michael Kruse via flang-commits) Date: Mon, 12 May 2025 08:16:46 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828) In-Reply-To: Message-ID: <682210de.170a0220.1519d2.5415@mx.google.com> ================ @@ -299,6 +310,18 @@ elseif (FLANG_RT_GCC_RESOURCE_DIR) endif () endif () + + +if (CMAKE_C_BYTE_ORDER STREQUAL "BIG_ENDIAN") ---------------- Meinersbur wrote: I had the same problem as you: CMake does not determine endianness when cross-compiling. This is leftover from my attempt. Glad you fixed it for me already. https://github.com/llvm/llvm-project/pull/137828 From flang-commits at lists.llvm.org Mon May 12 08:19:03 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 08:19:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Further refinement of OpenMP !$ lines in -E mode (PR #138956) In-Reply-To: Message-ID: <68221167.170a0220.a153d.3fe3@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/138956 Rate limit · GitHub

From flang-commits at lists.llvm.org Mon May 12 08:20:10 2025 From: flang-commits at lists.llvm.org (Jean-Didier PAILLEUX via flang-commits) Date: Mon, 12 May 2025 08:20:10 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ [NO]INLINE and FORCEINLINE directives (PR #134350) In-Reply-To: Message-ID: <682211aa.170a0220.88abc.3834@mx.google.com> JDPailleux wrote: Sorry in advance if I misunderstand what you're saying. But today, attributes are only available for the function operation. However, directives attach inlining attributes to call operations. LLVM already offers the possibility of applying inlining attributes on call operations. I'm just making the bridge between Flang / MLIR and LLVM. https://github.com/llvm/llvm-project/pull/134350 From flang-commits at lists.llvm.org Mon May 12 08:22:15 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 08:22:15 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ [NO]INLINE and FORCEINLINE directives (PR #134350) In-Reply-To: Message-ID: <68221227.170a0220.1b5a32.507e@mx.google.com> tblah wrote: I think in flang you need to resolve the symbols for the function call and then add the attribute to that function operation. https://github.com/llvm/llvm-project/pull/134350 From flang-commits at lists.llvm.org Mon May 12 08:24:32 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 08:24:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP]Replace assert with if-condition (PR #139559) In-Reply-To: Message-ID: <682212b0.170a0220.8749f.3b51@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics @llvm/pr-subscribers-flang-openmp Author: Mats Petersson (Leporacanthicus)
Changes If a symbol is not declared, check-omp-structure hits an assert. It should be safe to treat undeclared symbols as "not from a block", as they would have to be declared to be in a block... Adding simple test to confirm it gives error messages, not crashing. This should fix issue #131655 (there is already a check for symbol being not null in the code identified in the ticket). --- Full diff: https://github.com/llvm/llvm-project/pull/139559.diff 2 Files Affected: - (modified) flang/lib/Semantics/check-omp-structure.cpp (+1-2) - (added) flang/test/Semantics/OpenMP/reduction-undefined.f90 (+18) ``````````diff diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 8f6a623508aa7..78736ee1929d1 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3545,8 +3545,7 @@ void OmpStructureChecker::CheckReductionObjects( // names into the lists of their members. for (const parser::OmpObject &object : objects.v) { auto *symbol{GetObjectSymbol(object)}; - assert(symbol && "Expecting a symbol for object"); - if (IsCommonBlock(*symbol)) { + if (symbol && IsCommonBlock(*symbol)) { auto source{GetObjectSource(object)}; context_.Say(source ? *source : GetContext().clauseSource, "Common block names are not allowed in %s clause"_err_en_US, diff --git a/flang/test/Semantics/OpenMP/reduction-undefined.f90 b/flang/test/Semantics/OpenMP/reduction-undefined.f90 new file mode 100644 index 0000000000000..bf1f03a878630 --- /dev/null +++ b/flang/test/Semantics/OpenMP/reduction-undefined.f90 @@ -0,0 +1,18 @@ +! 
RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +subroutine dont_crash(values) + implicit none + integer, parameter :: n = 100 + real :: values(n) + integer :: i + !ERROR: No explicit type declared for 'sum' + sum = 0 + !ERROR: No explicit type declared for 'sum' + !$omp parallel do reduction(+:sum) + do i = 1, n + !ERROR: No explicit type declared for 'sum' + !ERROR: No explicit type declared for 'sum' + sum = sum + values(i) + end do +end subroutine dont_crash + ``````````
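For readers skimming the archive, the change in the diff above (dropping the assert and folding the null check into the condition) can be sketched in isolation. This is only an illustrative sketch: `Symbol`, `StrictIsCommonBlock`, `GuardedIsCommonBlock` and `CountCommonBlockErrors` are hypothetical stand-ins invented for the example, not flang's actual classes or functions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a semantics symbol; this is NOT flang's Symbol.
struct Symbol {
  std::string name;
  bool isCommonBlock;
};

// Old shape: an unresolved (null) symbol trips the assert in debug builds
// and is dereferenced in release builds.
bool StrictIsCommonBlock(const Symbol *symbol) {
  assert(symbol && "Expecting a symbol for object");
  return symbol->isCommonBlock;
}

// New shape: a null symbol is treated as "not a common block", leaving the
// "no explicit type" style of diagnostic to whichever pass reports it.
bool GuardedIsCommonBlock(const Symbol *symbol) {
  return symbol && symbol->isCommonBlock;
}

// Counts how many objects would receive the "common block not allowed"
// diagnostic. A nullptr entry models a name that never resolved.
int CountCommonBlockErrors(const std::vector<const Symbol *> &objects) {
  int errors = 0;
  for (const Symbol *symbol : objects) {
    if (GuardedIsCommonBlock(symbol)) {
      ++errors;
    }
  }
  return errors;
}
```

With a list such as `{&var, nullptr, &blk}`, where only `blk` is a common block, the guarded version reports one error and skips the unresolved entry, whereas the strict version would abort on the null pointer.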
https://github.com/llvm/llvm-project/pull/139559 From flang-commits at lists.llvm.org Mon May 12 02:46:25 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 02:46:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP][Semantics] resolve objects in the flush arg list (PR #139522) Message-ID: https://github.com/tblah created https://github.com/llvm/llvm-project/pull/139522 Fixes #136583 Normally the flush argument list would contain a DataRef to some variable. All DataRefs are handled generically in resolve-names and so the problem wasn't observed. But when a common block name is specified, this is not parsed as a DataRef. There was already handling in resolve-directives for OmpObjectList but not for argument lists. I've added a visitor for FLUSH which ensures all of the arguments have been resolved. The test is there to make sure the compiler doesn't crashed encountering the unresolved symbol. It shows that we currently deny flushing a common block. I'm not sure that it is right to restrict common blocks from flush argument lists, but fixing that can come in a different patch. This one is fixing an ICE. >From a64751019f3eab8a7d6d678ad76fdb8c31b86e54 Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Mon, 12 May 2025 09:38:01 +0000 Subject: [PATCH] [flang][OpenMP][Semantics] resolve objects in the flush arg list Fixes #136583 Normally the flush argument list would contain a DataRef to some variable. All DataRefs are handled generically in resolve-names and so the problem wasn't observed. But when a common block name is specified, this is not parsed as a DataRef. There was already handling in resolve-directives for OmpObjectList but not for argument lists. I've added a visitor for FLUSH which ensures all of the arguments have been resolved. The test is there to make sure the compiler doesn't crashed encountering the unresolved symbol. It shows that we currently deny flushing a common block. 
I'm not sure that it is right to restrict common blocks from flush argument lists, but fixing that can come in a different patch. This one is fixing an ICE. --- flang/lib/Semantics/resolve-directives.cpp | 13 +++++++++++++ flang/test/Semantics/OpenMP/flush04.f90 | 11 +++++++++++ 2 files changed, 24 insertions(+) create mode 100644 flang/test/Semantics/OpenMP/flush04.f90 diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 8b1caca34a6a7..68f8cf9f17620 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -409,6 +409,19 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { } void Post(const parser::OpenMPDepobjConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPFlushConstruct &x) { + PushContext(x.source, llvm::omp::Directive::OMPD_flush); + for (auto &arg : x.v.Arguments().v) { + if (auto *locator{std::get_if(&arg.u)}) { + if (auto *object{std::get_if(&locator->u)}) { + ResolveOmpObject(*object, Symbol::Flag::OmpDependObject); + } + } + } + return true; + } + void Post(const parser::OpenMPFlushConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPRequiresConstruct &x) { using Flags = WithOmpDeclarative::RequiresFlags; using Requires = WithOmpDeclarative::RequiresFlag; diff --git a/flang/test/Semantics/OpenMP/flush04.f90 b/flang/test/Semantics/OpenMP/flush04.f90 new file mode 100644 index 0000000000000..ffc2273b692dc --- /dev/null +++ b/flang/test/Semantics/OpenMP/flush04.f90 @@ -0,0 +1,11 @@ +! RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +! Regression test to ensure that the name /c/ in the flush argument list is +! resolved to the common block symbol. 
+ + common /c/ x + real :: x +!ERROR: FLUSH argument must be a variable list item + !$omp flush(/c/) +end + From flang-commits at lists.llvm.org Mon May 12 02:46:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 02:46:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP][Semantics] resolve objects in the flush arg list (PR #139522) In-Reply-To: Message-ID: <6821c38f.050a0220.28799e.d9bd@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Tom Eccles (tblah)
Changes Fixes #136583 Normally the flush argument list would contain a DataRef to some variable. All DataRefs are handled generically in resolve-names and so the problem wasn't observed. But when a common block name is specified, this is not parsed as a DataRef. There was already handling in resolve-directives for OmpObjectList but not for argument lists. I've added a visitor for FLUSH which ensures all of the arguments have been resolved. The test is there to make sure the compiler doesn't crash when encountering the unresolved symbol. It shows that we currently deny flushing a common block. I'm not sure that it is right to restrict common blocks from flush argument lists, but fixing that can come in a different patch. This one is fixing an ICE. --- Full diff: https://github.com/llvm/llvm-project/pull/139522.diff 2 Files Affected: - (modified) flang/lib/Semantics/resolve-directives.cpp (+13) - (added) flang/test/Semantics/OpenMP/flush04.f90 (+11) ``````````diff diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 8b1caca34a6a7..68f8cf9f17620 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -409,6 +409,19 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { } void Post(const parser::OpenMPDepobjConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPFlushConstruct &x) { + PushContext(x.source, llvm::omp::Directive::OMPD_flush); + for (auto &arg : x.v.Arguments().v) { + if (auto *locator{std::get_if(&arg.u)}) { + if (auto *object{std::get_if(&locator->u)}) { + ResolveOmpObject(*object, Symbol::Flag::OmpDependObject); + } + } + } + return true; + } + void Post(const parser::OpenMPFlushConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPRequiresConstruct &x) { using Flags = WithOmpDeclarative::RequiresFlags; using Requires = WithOmpDeclarative::RequiresFlag; diff --git a/flang/test/Semantics/OpenMP/flush04.f90 
b/flang/test/Semantics/OpenMP/flush04.f90 new file mode 100644 index 0000000000000..ffc2273b692dc --- /dev/null +++ b/flang/test/Semantics/OpenMP/flush04.f90 @@ -0,0 +1,11 @@ +! RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +! Regression test to ensure that the name /c/ in the flush argument list is +! resolved to the common block symbol. + + common /c/ x + real :: x +!ERROR: FLUSH argument must be a variable list item + !$omp flush(/c/) +end + ``````````
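The visitor in the diff above walks each FLUSH argument and only acts on the ones that turn out to be objects, using two nested `std::get_if` checks. A minimal sketch of that pattern follows, assuming simplified node types: `Designator`, `CommonBlockName`, `Object`, `FunctionCall`, `Locator`, `Expr`, `Argument` and `CollectObjectNames` are all hypothetical names made up for the illustration and do not mirror the real flang parse tree.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical, simplified parse-tree nodes (not the real flang types).
struct Designator { std::string name; };
struct CommonBlockName { std::string name; };
struct Object { std::variant<Designator, CommonBlockName> u; };
struct FunctionCall { std::string callee; };
struct Locator { std::variant<Object, FunctionCall> u; };
struct Expr { int value; };
struct Argument { std::variant<Locator, Expr> u; };

// Collects the names of all arguments that are objects, whether they are
// plain designators or common block names. Other argument kinds are skipped.
std::vector<std::string> CollectObjectNames(const std::vector<Argument> &args) {
  std::vector<std::string> names;
  for (const Argument &arg : args) {
    // First level: is the argument a locator at all?
    if (const auto *locator{std::get_if<Locator>(&arg.u)}) {
      // Second level: is the locator an object (rather than a call)?
      if (const auto *object{std::get_if<Object>(&locator->u)}) {
        // Both object alternatives carry a name, so one generic lambda works.
        std::visit([&](const auto &x) { names.push_back(x.name); }, object->u);
      }
    }
  }
  return names;
}
```

`std::get_if` returns a null pointer when the variant holds a different alternative, so the nested `if` statements double as the type test and the binding. In the sketch a common block name such as `c` is collected alongside ordinary variables, which corresponds to the case the patch description says was previously left unresolved.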
https://github.com/llvm/llvm-project/pull/139522 From flang-commits at lists.llvm.org Mon May 12 03:18:39 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 03:18:39 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow flush of common block (PR #139528) Message-ID: https://github.com/tblah created https://github.com/llvm/llvm-project/pull/139528 I think this was denied by accident in https://github.com/llvm/llvm-project/commit/68180d8d16f07db8200dfce7bae26a80c43ebc5e. Flush of a common block is allowed by the standard on my reading. It is not allowed by classic-flang but is supported by gfortran and ifx. This doesn't need any lowering changes. The LLVM translation ignores the flush argument list because the openmp runtime library doesn't support flushing specific data. Depends upon https://github.com/llvm/llvm-project/pull/139522. Ignore the first commit in this PR. >From a64751019f3eab8a7d6d678ad76fdb8c31b86e54 Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Mon, 12 May 2025 09:38:01 +0000 Subject: [PATCH 1/2] [flang][OpenMP][Semantics] resolve objects in the flush arg list Fixes #136583 Normally the flush argument list would contain a DataRef to some variable. All DataRefs are handled generically in resolve-names and so the problem wasn't observed. But when a common block name is specified, this is not parsed as a DataRef. There was already handling in resolve-directives for OmpObjectList but not for argument lists. I've added a visitor for FLUSH which ensures all of the arguments have been resolved. The test is there to make sure the compiler doesn't crashed encountering the unresolved symbol. It shows that we currently deny flushing a common block. I'm not sure that it is right to restrict common blocks from flush argument lists, but fixing that can come in a different patch. This one is fixing an ICE. 
--- flang/lib/Semantics/resolve-directives.cpp | 13 +++++++++++++ flang/test/Semantics/OpenMP/flush04.f90 | 11 +++++++++++ 2 files changed, 24 insertions(+) create mode 100644 flang/test/Semantics/OpenMP/flush04.f90 diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 8b1caca34a6a7..68f8cf9f17620 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -409,6 +409,19 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { } void Post(const parser::OpenMPDepobjConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPFlushConstruct &x) { + PushContext(x.source, llvm::omp::Directive::OMPD_flush); + for (auto &arg : x.v.Arguments().v) { + if (auto *locator{std::get_if(&arg.u)}) { + if (auto *object{std::get_if(&locator->u)}) { + ResolveOmpObject(*object, Symbol::Flag::OmpDependObject); + } + } + } + return true; + } + void Post(const parser::OpenMPFlushConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPRequiresConstruct &x) { using Flags = WithOmpDeclarative::RequiresFlags; using Requires = WithOmpDeclarative::RequiresFlag; diff --git a/flang/test/Semantics/OpenMP/flush04.f90 b/flang/test/Semantics/OpenMP/flush04.f90 new file mode 100644 index 0000000000000..ffc2273b692dc --- /dev/null +++ b/flang/test/Semantics/OpenMP/flush04.f90 @@ -0,0 +1,11 @@ +! RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +! Regression test to ensure that the name /c/ in the flush argument list is +! resolved to the common block symbol. + + common /c/ x + real :: x +!ERROR: FLUSH argument must be a variable list item + !$omp flush(/c/) +end + >From 890a8a05addef06de0485f2d74f755f3c53098c1 Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Mon, 12 May 2025 10:13:08 +0000 Subject: [PATCH 2/2] [flang][OpenMP] Allow flush of common block I think this was denied by accident in 68180d8. Flush of a common block is allowed by the standard on my reading. 
It is not allowed by classic-flang but is supported by gfortran and ifx. This doesn't need any lowering changes. The LLVM translation ignores the flush argument list because the openmp runtime library doesn't support flushing specific data. Depends upon #139522. Ignore the first commit in this PR. --- flang/lib/Semantics/check-omp-structure.cpp | 7 ------- flang/test/Lower/OpenMP/flush-common.f90 | 13 +++++++++++++ flang/test/Semantics/OpenMP/flush04.f90 | 11 ----------- 3 files changed, 13 insertions(+), 18 deletions(-) create mode 100644 flang/test/Lower/OpenMP/flush-common.f90 delete mode 100644 flang/test/Semantics/OpenMP/flush04.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index f17de42ca2466..591f30db36baa 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2304,13 +2304,6 @@ void OmpStructureChecker::Leave(const parser::OpenMPFlushConstruct &x) { auto &flushList{std::get>(x.v.t)}; if (flushList) { - for (const parser::OmpArgument &arg : flushList->v) { - if (auto *sym{GetArgumentSymbol(arg)}; sym && !IsVariableListItem(*sym)) { - context_.Say(arg.source, - "FLUSH argument must be a variable list item"_err_en_US); - } - } - if (FindClause(llvm::omp::Clause::OMPC_acquire) || FindClause(llvm::omp::Clause::OMPC_release) || FindClause(llvm::omp::Clause::OMPC_acq_rel)) { diff --git a/flang/test/Lower/OpenMP/flush-common.f90 b/flang/test/Lower/OpenMP/flush-common.f90 new file mode 100644 index 0000000000000..7656141dcb295 --- /dev/null +++ b/flang/test/Lower/OpenMP/flush-common.f90 @@ -0,0 +1,13 @@ +! RUN: %flang_fc1 -fopenmp -emit-hlfir -o - %s | FileCheck %s + +! Regression test to ensure that the name /c/ in the flush argument list is +! resolved to the common block symbol and common blocks are allowed in the +! flush argument list. + +! CHECK: %[[GLBL:.*]] = fir.address_of({{.*}}) : !fir.ref> + common /c/ x + real :: x +! 
CHECK: omp.flush(%[[GLBL]] : !fir.ref>) + !$omp flush(/c/) +end + diff --git a/flang/test/Semantics/OpenMP/flush04.f90 b/flang/test/Semantics/OpenMP/flush04.f90 deleted file mode 100644 index ffc2273b692dc..0000000000000 --- a/flang/test/Semantics/OpenMP/flush04.f90 +++ /dev/null @@ -1,11 +0,0 @@ -! RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp - -! Regression test to ensure that the name /c/ in the flush argument list is -! resolved to the common block symbol. - - common /c/ x - real :: x -!ERROR: FLUSH argument must be a variable list item - !$omp flush(/c/) -end - From flang-commits at lists.llvm.org Mon May 12 03:19:09 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 03:19:09 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow flush of common block (PR #139528) In-Reply-To: Message-ID: <6821cb1d.050a0220.1e1fc3.9f96@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Tom Eccles (tblah)
Changes I think this was denied by accident in https://github.com/llvm/llvm-project/commit/68180d8d16f07db8200dfce7bae26a80c43ebc5e. Flush of a common block is allowed by the standard on my reading. It is not allowed by classic-flang but is supported by gfortran and ifx. This doesn't need any lowering changes. The LLVM translation ignores the flush argument list because the openmp runtime library doesn't support flushing specific data. Depends upon https://github.com/llvm/llvm-project/pull/139522. Ignore the first commit in this PR. --- Full diff: https://github.com/llvm/llvm-project/pull/139528.diff 3 Files Affected: - (modified) flang/lib/Semantics/check-omp-structure.cpp (-7) - (modified) flang/lib/Semantics/resolve-directives.cpp (+13) - (added) flang/test/Lower/OpenMP/flush-common.f90 (+13) ``````````diff diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index f17de42ca2466..591f30db36baa 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2304,13 +2304,6 @@ void OmpStructureChecker::Leave(const parser::OpenMPFlushConstruct &x) { auto &flushList{std::get>(x.v.t)}; if (flushList) { - for (const parser::OmpArgument &arg : flushList->v) { - if (auto *sym{GetArgumentSymbol(arg)}; sym && !IsVariableListItem(*sym)) { - context_.Say(arg.source, - "FLUSH argument must be a variable list item"_err_en_US); - } - } - if (FindClause(llvm::omp::Clause::OMPC_acquire) || FindClause(llvm::omp::Clause::OMPC_release) || FindClause(llvm::omp::Clause::OMPC_acq_rel)) { diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 8b1caca34a6a7..68f8cf9f17620 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -409,6 +409,19 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { } void Post(const parser::OpenMPDepobjConstruct &) { PopContext(); } + bool Pre(const 
parser::OpenMPFlushConstruct &x) { + PushContext(x.source, llvm::omp::Directive::OMPD_flush); + for (auto &arg : x.v.Arguments().v) { + if (auto *locator{std::get_if(&arg.u)}) { + if (auto *object{std::get_if(&locator->u)}) { + ResolveOmpObject(*object, Symbol::Flag::OmpDependObject); + } + } + } + return true; + } + void Post(const parser::OpenMPFlushConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPRequiresConstruct &x) { using Flags = WithOmpDeclarative::RequiresFlags; using Requires = WithOmpDeclarative::RequiresFlag; diff --git a/flang/test/Lower/OpenMP/flush-common.f90 b/flang/test/Lower/OpenMP/flush-common.f90 new file mode 100644 index 0000000000000..7656141dcb295 --- /dev/null +++ b/flang/test/Lower/OpenMP/flush-common.f90 @@ -0,0 +1,13 @@ +! RUN: %flang_fc1 -fopenmp -emit-hlfir -o - %s | FileCheck %s + +! Regression test to ensure that the name /c/ in the flush argument list is +! resolved to the common block symbol and common blocks are allowed in the +! flush argument list. + +! CHECK: %[[GLBL:.*]] = fir.address_of({{.*}}) : !fir.ref> + common /c/ x + real :: x +! CHECK: omp.flush(%[[GLBL]] : !fir.ref>) + !$omp flush(/c/) +end + ``````````
https://github.com/llvm/llvm-project/pull/139528 From flang-commits at lists.llvm.org Mon May 12 03:23:44 2025 From: flang-commits at lists.llvm.org (Kiran Kumar T P via flang-commits) Date: Mon, 12 May 2025 03:23:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow flush of common block (PR #139528) In-Reply-To: Message-ID: <6821cc30.170a0220.328d3.2419@mx.google.com> kiranktp wrote: Thanks for the patch Tom. LGTM https://github.com/llvm/llvm-project/pull/139528 From flang-commits at lists.llvm.org Mon May 12 07:11:04 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Mon, 12 May 2025 07:11:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <68220178.050a0220.3ae94.311e@mx.google.com> https://github.com/ashermancinelli updated https://github.com/llvm/llvm-project/pull/139183 >From d8a31d1c81bcebdb900c546a12b4c2291d467ed8 Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Thu, 8 May 2025 17:30:04 -0700 Subject: [PATCH 1/3] [flang] Fix volatile attr propagation on allocatables Ensure volatility is reflected not just on the reference of an allocatable, but on the box, too. When we designate a volatile allocatable, we now get a volatile reference to a volatile box. Some related cleanups: * SELECT TYPE constructs properly handle volatility by checking the selector's type for volatility when creating the target type. * Refine the verifier for fir.convert. In general, I think it should be ok to implicitly drop volatility in any ptr-to-int conversion because it means we are in codegen or we are calling an external function, and it's okay to drop volatility from the Fir type system in these cases. * An allocatable test that was XFAILed is now passing. * I noticed a runtime function was missing the fir.runtime attribute. Fix that. 
--- flang/lib/Lower/Bridge.cpp | 12 +++-- flang/lib/Optimizer/Dialect/FIROps.cpp | 47 +++++++++++++++---- flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp | 15 ++++-- .../Transforms/PolymorphicOpConversion.cpp | 11 +++-- flang/test/Fir/invalid.fir | 4 +- flang/test/Lower/volatile-allocatable.f90 | 21 +++++---- flang/test/Lower/volatile-allocatable1.f90 | 17 ++++++- 7 files changed, 91 insertions(+), 36 deletions(-) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 43375e84f21fa..d28c01ed16cbf 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -3842,6 +3842,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { bool hasLocalScope = false; llvm::SmallVector typeCaseScopes; + const auto selectorIsVolatile = [&selector]() { + return fir::isa_volatile_type(fir::getBase(selector).getType()); + }; + const auto &typeCaseList = std::get>( selectTypeConstruct.t); @@ -3995,7 +3999,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { addrTy = fir::HeapType::get(addrTy); if (std::holds_alternative( typeSpec->u)) { - mlir::Type refTy = fir::ReferenceType::get(addrTy); + mlir::Type refTy = fir::ReferenceType::get(addrTy, selectorIsVolatile()); if (isPointer || isAllocatable) refTy = addrTy; exactValue = builder->create( @@ -4004,7 +4008,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { typeSpec->declTypeSpec->AsIntrinsic(); if (isArray) { mlir::Value exact = builder->create( - loc, fir::BoxType::get(addrTy), fir::getBase(selector)); + loc, fir::BoxType::get(addrTy, selectorIsVolatile()), fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exact)); } else if (intrinsic->category() == Fortran::common::TypeCategory::Character) { @@ -4019,7 +4023,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { } else if (std::holds_alternative( typeSpec->u)) { exactValue = builder->create( - loc, fir::BoxType::get(addrTy), fir::getBase(selector)); + loc, 
fir::BoxType::get(addrTy, selectorIsVolatile()), fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exactValue)); } } else if (std::holds_alternative( @@ -4037,7 +4041,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { addrTy = fir::PointerType::get(addrTy); if (isAllocatable) addrTy = fir::HeapType::get(addrTy); - mlir::Type classTy = fir::ClassType::get(addrTy); + mlir::Type classTy = fir::ClassType::get(addrTy, selectorIsVolatile()); if (classTy == baseTy) { addAssocEntitySymbol(selector); } else { diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index 332cca1ab9f95..9b58578e55474 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -1536,20 +1536,47 @@ bool fir::ConvertOp::canBeConverted(mlir::Type inType, mlir::Type outType) { areRecordsCompatible(inType, outType); } +// In general, ptrtoint-like conversions are allowed to lose volatility information +// because they are either: +// +// 1. passing an entity to an external function and there's nothing we can do +// about volatility after that happens, or +// 2. for code generation, at which point we represent volatility with attributes +// on the LLVM instructions and intrinsics. +// +// For all other cases, volatility ought to match exactly. +static mlir::LogicalResult verifyVolatility(mlir::Type inType, mlir::Type outType) { + const bool toLLVMPointer = mlir::isa(outType); + const bool toInteger = fir::isa_integer(outType); + + // When converting references to classes or allocatables into boxes for runtime arguments, + // we cast away all the volatility information and pass a box. This is allowed. 
+ const bool isBoxNoneLike = [&]() { + if (fir::isBoxNone(outType)) + return true; + if (auto referenceType = mlir::dyn_cast(outType)) { + if (fir::isBoxNone(referenceType.getElementType())) { + return true; + } + } + return false; + }(); + + const bool isPtrToIntLike = toLLVMPointer || toInteger || isBoxNoneLike; + if (isPtrToIntLike) { + return mlir::success(); + } + + // In all other cases, we need to check for an exact volatility match. + return mlir::success(fir::isa_volatile_type(inType) == fir::isa_volatile_type(outType)); +} + llvm::LogicalResult fir::ConvertOp::verify() { mlir::Type inType = getValue().getType(); mlir::Type outType = getType(); - // If we're converting to an LLVM pointer type or an integer, we don't - // need to check for volatility mismatch - volatility will be handled by the - // memory operations themselves in llvm code generation and ptr-to-int can't - // represent volatility. - const bool toLLVMPointer = mlir::isa(outType); - const bool toInteger = fir::isa_integer(outType); if (fir::useStrictVolatileVerification()) { - if (fir::isa_volatile_type(inType) != fir::isa_volatile_type(outType) && - !toLLVMPointer && !toInteger) { - return emitOpError("cannot convert between volatile and non-volatile " - "types, use fir.volatile_cast instead ") + if (failed(verifyVolatility(inType, outType))) { + return emitOpError("this conversion does not preserve volatility: ") << inType << " / " << outType; } } diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp index 711d5d1461b08..52517eef2890d 100644 --- a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp +++ b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp @@ -207,20 +207,25 @@ static bool hasExplicitLowerBounds(mlir::Value shape) { mlir::isa(shape.getType()); } -static std::pair updateDeclareInputTypeWithVolatility( +static std::pair updateDeclaredInputTypeWithVolatility( mlir::Type inputType, mlir::Value memref, mlir::OpBuilder &builder, 
fir::FortranVariableFlagsAttr fortran_attrs) { if (fortran_attrs && bitEnumContainsAny(fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::fortran_volatile)) { + // A volatile pointer's pointee is volatile. const bool isPointer = bitEnumContainsAny( fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::pointer); + // An allocatable's inner type's volatility matches that of the reference. + const bool isAllocatable = bitEnumContainsAny( + fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::allocatable); auto updateType = [&](auto t) { using FIRT = decltype(t); - // A volatile pointer's pointee is volatile. auto elementType = t.getEleTy(); - const bool elementTypeIsVolatile = - isPointer || fir::isa_volatile_type(elementType); + const bool elementTypeIsBox = mlir::isa(elementType); + const bool elementTypeIsVolatile = isPointer || isAllocatable || + elementTypeIsBox || + fir::isa_volatile_type(elementType); auto newEleTy = fir::updateTypeWithVolatility(elementType, elementTypeIsVolatile); inputType = FIRT::get(newEleTy, true); @@ -243,7 +248,7 @@ void hlfir::DeclareOp::build(mlir::OpBuilder &builder, auto nameAttr = builder.getStringAttr(uniq_name); mlir::Type inputType = memref.getType(); bool hasExplicitLbs = hasExplicitLowerBounds(shape); - std::tie(inputType, memref) = updateDeclareInputTypeWithVolatility( + std::tie(inputType, memref) = updateDeclaredInputTypeWithVolatility( inputType, memref, builder, fortran_attrs); mlir::Type hlfirVariableType = getHLFIRVariableType(inputType, hasExplicitLbs); diff --git a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp index 0c78a878cdc53..309e557e409c0 100644 --- a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp @@ -401,10 +401,13 @@ llvm::LogicalResult SelectTypeConv::genTypeLadderStep( { // Since conversion is done in parallel for each fir.select_type // 
operation, the runtime function insertion must be threadsafe. - callee = - fir::createFuncOp(rewriter.getUnknownLoc(), mod, fctName, - rewriter.getFunctionType({descNoneTy, typeDescTy}, - rewriter.getI1Type())); + auto runtimeAttr = + mlir::NamedAttribute(fir::FIROpsDialect::getFirRuntimeAttrName(), + mlir::UnitAttr::get(rewriter.getContext())); + callee = fir::createFuncOp(rewriter.getUnknownLoc(), mod, fctName, + rewriter.getFunctionType({descNoneTy, typeDescTy}, + rewriter.getI1Type()), + {runtimeAttr}); } cmp = rewriter .create(loc, callee, diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index 1de48b87365b3..834eea7df8ebe 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -1260,7 +1260,7 @@ func.func @dc_invalid_reduction(%arg0: index, %arg1: index) { // Should fail when volatility changes from a fir.convert func.func @bad_convert_volatile(%arg0: !fir.ref) -> !fir.ref { - // expected-error at +1 {{'fir.convert' op cannot convert between volatile and non-volatile types, use fir.volatile_cast instead}} + // expected-error at +1 {{op this conversion does not preserve volatilit}} %0 = fir.convert %arg0 : (!fir.ref) -> !fir.ref return %0 : !fir.ref } @@ -1269,7 +1269,7 @@ func.func @bad_convert_volatile(%arg0: !fir.ref) -> !fir.ref // Should fail when volatility changes from a fir.convert func.func @bad_convert_volatile2(%arg0: !fir.ref) -> !fir.ref { - // expected-error at +1 {{'fir.convert' op cannot convert between volatile and non-volatile types, use fir.volatile_cast instead}} + // expected-error at +1 {{op this conversion does not preserve volatility}} %0 = fir.convert %arg0 : (!fir.ref) -> !fir.ref return %0 : !fir.ref } diff --git a/flang/test/Lower/volatile-allocatable.f90 b/flang/test/Lower/volatile-allocatable.f90 index 5f75a5425422a..e182fe8a4d9c9 100644 --- a/flang/test/Lower/volatile-allocatable.f90 +++ b/flang/test/Lower/volatile-allocatable.f90 @@ -119,10 +119,10 @@ subroutine test_unlimited_polymorphic() 
end subroutine ! CHECK-LABEL: func.func @_QPtest_scalar_volatile() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEc1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv2"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv3"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEc1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv2"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv3"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () @@ -140,8 +140,8 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_volatile_asynchronous() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEi1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEv1"} : (!fir.ref>>>, volatile>) -> (!fir.ref>>>, volatile>, !fir.ref>>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEi1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEv1"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 @@ -151,10 +151,11 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_select_base_type_volatile() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.ref>>>, volatile>) -> (!fir.ref>>>, volatile>, !fir.ref>>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAClassIs(%{{.+}}, %{{.+}}) : (!fir.box, !fir.ref) -> i1 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.class>>, volatile>, !fir.shift<1>) -> (!fir.class>>, volatile>, !fir.class>>, volatile>) ! 
CHECK: %{{.+}} = hlfir.designate %{{.+}}#0 (%{{.+}}) : (!fir.class>>, volatile>, index) -> !fir.class, volatile> ! CHECK: %{{.+}} = hlfir.designate %{{.+}}{"i"} : (!fir.class, volatile>) -> !fir.ref @@ -162,7 +163,7 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_mold_allocation() { ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {uniq_name = "_QFtest_mold_allocationEtemplate"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_mold_allocationEv"} : (!fir.ref>>>, volatile>) -> (!fir.ref>>>, volatile>, !fir.ref>>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_mold_allocationEv"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QQclX6D6F6C642074657374"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) ! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0{"str"} typeparams %{{.+}} : (!fir.ref>, index) -> !fir.ref> ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QQro.2xi4.2"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) @@ -173,8 +174,8 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_unlimited_polymorphic() { -! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.ref>, volatile>) -> (!fir.ref>, volatile>, !fir.ref>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEupa"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.ref, volatile>, volatile>) -> (!fir.ref, volatile>, volatile>, !fir.ref, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEupa"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitIntrinsicForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i32, i32, i32) -> () ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.heap) -> (!fir.heap, !fir.heap) diff --git a/flang/test/Lower/volatile-allocatable1.f90 b/flang/test/Lower/volatile-allocatable1.f90 index a21359c3b4225..d2a07c8763885 100644 --- a/flang/test/Lower/volatile-allocatable1.f90 +++ b/flang/test/Lower/volatile-allocatable1.f90 @@ -1,7 +1,6 @@ ! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s ! Requires correct propagation of volatility for allocatable nested types. -! XFAIL: * function allocatable_udt() type :: base_type @@ -15,3 +14,19 @@ function allocatable_udt() allocate(v2(2,3)) allocatable_udt = v2(1,1)%i end function +! CHECK-LABEL: func.func @_QPallocatable_udt() -> i32 { +! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.i"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.di.base_type.i"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.base_type"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.j"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.di.ext_type.j"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.ext_type"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {uniq_name = "_QFallocatable_udtEallocatable_udt"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtEv2"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.c.base_type"} : (!fir.ref>>, !fir.shapeshift<1>) -> (!fir.box>>, !fir.ref>>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.dt.base_type"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.dt.ext_type"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.c.ext_type"} : (!fir.ref>>, !fir.shapeshift<1>) -> (!fir.box>>, !fir.ref>>) +! CHECK: %{{.+}} = hlfir.designate %{{.+}} (%{{.+}}, %{{.+}}) : (!fir.box>>, volatile>, index, index) -> !fir.ref, volatile> +! CHECK: %{{.+}} = hlfir.designate %{{.+}}{"base_type"} : (!fir.ref, volatile>) -> !fir.ref, volatile> +! CHECK: %{{.+}} = hlfir.designate %{{.+}}{"i"} : (!fir.ref, volatile>) -> !fir.ref >From aeed3ec3318ea0b0fdfa813868ce3a888671ed34 Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Thu, 8 May 2025 17:44:10 -0700 Subject: [PATCH 2/3] format --- flang/lib/Lower/Bridge.cpp | 12 +++++++---- flang/lib/Optimizer/Dialect/FIROps.cpp | 20 +++++++++++-------- .../Transforms/PolymorphicOpConversion.cpp | 9 +++++---- 3 files changed, 25 insertions(+), 16 deletions(-) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index d28c01ed16cbf..169a2780ea14d 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -3999,7 +3999,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { addrTy = fir::HeapType::get(addrTy); if (std::holds_alternative( typeSpec->u)) { - mlir::Type refTy = fir::ReferenceType::get(addrTy, selectorIsVolatile()); + mlir::Type refTy = + fir::ReferenceType::get(addrTy, selectorIsVolatile()); if (isPointer || isAllocatable) refTy = addrTy; exactValue = builder->create( @@ -4008,7 +4009,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { typeSpec->declTypeSpec->AsIntrinsic(); if (isArray) { mlir::Value exact = builder->create( - loc, fir::BoxType::get(addrTy, selectorIsVolatile()), fir::getBase(selector)); + loc, fir::BoxType::get(addrTy, selectorIsVolatile()), + fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exact)); } else if (intrinsic->category() == Fortran::common::TypeCategory::Character) { @@ -4023,7 +4025,8 @@ class FirConverter : public 
Fortran::lower::AbstractConverter { } else if (std::holds_alternative( typeSpec->u)) { exactValue = builder->create( - loc, fir::BoxType::get(addrTy, selectorIsVolatile()), fir::getBase(selector)); + loc, fir::BoxType::get(addrTy, selectorIsVolatile()), + fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exactValue)); } } else if (std::holds_alternative( @@ -4041,7 +4044,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { addrTy = fir::PointerType::get(addrTy); if (isAllocatable) addrTy = fir::HeapType::get(addrTy); - mlir::Type classTy = fir::ClassType::get(addrTy, selectorIsVolatile()); + mlir::Type classTy = + fir::ClassType::get(addrTy, selectorIsVolatile()); if (classTy == baseTy) { addAssocEntitySymbol(selector); } else { diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index 9b58578e55474..75185e719393f 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -1536,21 +1536,24 @@ bool fir::ConvertOp::canBeConverted(mlir::Type inType, mlir::Type outType) { areRecordsCompatible(inType, outType); } -// In general, ptrtoint-like conversions are allowed to lose volatility information -// because they are either: +// In general, ptrtoint-like conversions are allowed to lose volatility +// information because they are either: // // 1. passing an entity to an external function and there's nothing we can do // about volatility after that happens, or -// 2. for code generation, at which point we represent volatility with attributes +// 2. for code generation, at which point we represent volatility with +// attributes // on the LLVM instructions and intrinsics. // // For all other cases, volatility ought to match exactly. 
-static mlir::LogicalResult verifyVolatility(mlir::Type inType, mlir::Type outType) { +static mlir::LogicalResult verifyVolatility(mlir::Type inType, + mlir::Type outType) { const bool toLLVMPointer = mlir::isa(outType); const bool toInteger = fir::isa_integer(outType); - // When converting references to classes or allocatables into boxes for runtime arguments, - // we cast away all the volatility information and pass a box. This is allowed. + // When converting references to classes or allocatables into boxes for + // runtime arguments, we cast away all the volatility information and pass a + // box. This is allowed. const bool isBoxNoneLike = [&]() { if (fir::isBoxNone(outType)) return true; @@ -1561,14 +1564,15 @@ static mlir::LogicalResult verifyVolatility(mlir::Type inType, mlir::Type outTyp } return false; }(); - + const bool isPtrToIntLike = toLLVMPointer || toInteger || isBoxNoneLike; if (isPtrToIntLike) { return mlir::success(); } // In all other cases, we need to check for an exact volatility match. 
- return mlir::success(fir::isa_volatile_type(inType) == fir::isa_volatile_type(outType)); + return mlir::success(fir::isa_volatile_type(inType) == + fir::isa_volatile_type(outType)); } llvm::LogicalResult fir::ConvertOp::verify() { diff --git a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp index 309e557e409c0..f9a4c4d0283c7 100644 --- a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp @@ -404,10 +404,11 @@ llvm::LogicalResult SelectTypeConv::genTypeLadderStep( auto runtimeAttr = mlir::NamedAttribute(fir::FIROpsDialect::getFirRuntimeAttrName(), mlir::UnitAttr::get(rewriter.getContext())); - callee = fir::createFuncOp(rewriter.getUnknownLoc(), mod, fctName, - rewriter.getFunctionType({descNoneTy, typeDescTy}, - rewriter.getI1Type()), - {runtimeAttr}); + callee = + fir::createFuncOp(rewriter.getUnknownLoc(), mod, fctName, + rewriter.getFunctionType({descNoneTy, typeDescTy}, + rewriter.getI1Type()), + {runtimeAttr}); } cmp = rewriter .create(loc, callee, >From 0345444696c27172179c711d6124b0a315975831 Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Fri, 9 May 2025 07:03:19 -0700 Subject: [PATCH 3/3] Refactor the hlfir declare utility for clarity --- flang/lib/Optimizer/Dialect/FIROps.cpp | 3 +- flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp | 64 +++++++++++++---------- flang/test/Fir/invalid.fir | 2 +- 3 files changed, 37 insertions(+), 32 deletions(-) diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index 75185e719393f..b10b5d998fa70 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -1542,8 +1542,7 @@ bool fir::ConvertOp::canBeConverted(mlir::Type inType, mlir::Type outType) { // 1. passing an entity to an external function and there's nothing we can do // about volatility after that happens, or // 2. 
for code generation, at which point we represent volatility with -// attributes -// on the LLVM instructions and intrinsics. +// attributes on the LLVM instructions and intrinsics. // // For all other cases, volatility ought to match exactly. static mlir::LogicalResult verifyVolatility(mlir::Type inType, diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp index 52517eef2890d..8cfca59ecdada 100644 --- a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp +++ b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp @@ -207,34 +207,37 @@ static bool hasExplicitLowerBounds(mlir::Value shape) { mlir::isa(shape.getType()); } -static std::pair updateDeclaredInputTypeWithVolatility( - mlir::Type inputType, mlir::Value memref, mlir::OpBuilder &builder, - fir::FortranVariableFlagsAttr fortran_attrs) { - if (fortran_attrs && - bitEnumContainsAny(fortran_attrs.getFlags(), - fir::FortranVariableFlagsEnum::fortran_volatile)) { - // A volatile pointer's pointee is volatile. - const bool isPointer = bitEnumContainsAny( - fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::pointer); - // An allocatable's inner type's volatility matches that of the reference. 
- const bool isAllocatable = bitEnumContainsAny( - fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::allocatable); - auto updateType = [&](auto t) { - using FIRT = decltype(t); - auto elementType = t.getEleTy(); - const bool elementTypeIsBox = mlir::isa(elementType); - const bool elementTypeIsVolatile = isPointer || isAllocatable || - elementTypeIsBox || - fir::isa_volatile_type(elementType); - auto newEleTy = - fir::updateTypeWithVolatility(elementType, elementTypeIsVolatile); - inputType = FIRT::get(newEleTy, true); - }; - llvm::TypeSwitch(inputType) - .Case(updateType); - memref = - builder.create(memref.getLoc(), inputType, memref); +static std::pair +updateDeclaredInputTypeWithVolatility(mlir::Type inputType, mlir::Value memref, + mlir::OpBuilder &builder, + fir::FortranVariableFlagsEnum flags) { + if (!bitEnumContainsAny(flags, + fir::FortranVariableFlagsEnum::fortran_volatile)) { + return std::make_pair(inputType, memref); } + + // A volatile pointer's pointee is volatile. + const bool isPointer = + bitEnumContainsAny(flags, fir::FortranVariableFlagsEnum::pointer); + // An allocatable's inner type's volatility matches that of the reference. 
+ const bool isAllocatable = + bitEnumContainsAny(flags, fir::FortranVariableFlagsEnum::allocatable); + + auto updateType = [&](auto t) { + using FIRT = decltype(t); + auto elementType = t.getEleTy(); + const bool elementTypeIsBox = mlir::isa(elementType); + const bool elementTypeIsVolatile = isPointer || isAllocatable || + elementTypeIsBox || + fir::isa_volatile_type(elementType); + auto newEleTy = + fir::updateTypeWithVolatility(elementType, elementTypeIsVolatile); + inputType = FIRT::get(newEleTy, true); + }; + llvm::TypeSwitch(inputType) + .Case(updateType); + memref = + builder.create(memref.getLoc(), inputType, memref); return std::make_pair(inputType, memref); } @@ -248,8 +251,11 @@ void hlfir::DeclareOp::build(mlir::OpBuilder &builder, auto nameAttr = builder.getStringAttr(uniq_name); mlir::Type inputType = memref.getType(); bool hasExplicitLbs = hasExplicitLowerBounds(shape); - std::tie(inputType, memref) = updateDeclaredInputTypeWithVolatility( - inputType, memref, builder, fortran_attrs); + if (fortran_attrs) { + const auto flags = fortran_attrs.getFlags(); + std::tie(inputType, memref) = updateDeclaredInputTypeWithVolatility( + inputType, memref, builder, flags); + } mlir::Type hlfirVariableType = getHLFIRVariableType(inputType, hasExplicitLbs); build(builder, result, {hlfirVariableType, inputType}, memref, shape, diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index 834eea7df8ebe..fd607fd9066f7 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -1260,7 +1260,7 @@ func.func @dc_invalid_reduction(%arg0: index, %arg1: index) { // Should fail when volatility changes from a fir.convert func.func @bad_convert_volatile(%arg0: !fir.ref) -> !fir.ref { - // expected-error at +1 {{op this conversion does not preserve volatilit}} + // expected-error at +1 {{op this conversion does not preserve volatility}} %0 = fir.convert %arg0 : (!fir.ref) -> !fir.ref return %0 : !fir.ref } From flang-commits at lists.llvm.org Mon 
May 12 08:04:08 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Mon, 12 May 2025 08:04:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP][Semantics] resolve objects in the flush arg list (PR #139522) In-Reply-To: Message-ID: <68220de8.a70a0220.1658d4.48ff@mx.google.com> ================ @@ -409,6 +409,19 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { } void Post(const parser::OpenMPDepobjConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPFlushConstruct &x) { + PushContext(x.source, llvm::omp::Directive::OMPD_flush); + for (auto &arg : x.v.Arguments().v) { + if (auto *locator{std::get_if(&arg.u)}) { + if (auto *object{std::get_if(&locator->u)}) { + ResolveOmpObject(*object, Symbol::Flag::OmpDependObject); ---------------- kiranchandramohan wrote: If common blocks are not handled in `ResolveName` then you can probably use `OmpFlushed` as the Symbol in `ResolveOmpObject`. https://github.com/llvm/llvm-project/pull/139522 From flang-commits at lists.llvm.org Mon May 12 08:23:59 2025 From: flang-commits at lists.llvm.org (Mats Petersson via flang-commits) Date: Mon, 12 May 2025 08:23:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP]Replace assert with if-condition (PR #139559) Message-ID: https://github.com/Leporacanthicus created https://github.com/llvm/llvm-project/pull/139559 If a symbol is not declared, check-omp-structure hits an assert. It should be safe to treat undeclared symbols as "not from a block", as they would have to be declared to be in a block... Adding simple test to confirm it gives error messages, not crashing. This should fix issue #131655 (there is already a check for symbol being not null in the code identified in the ticket). 
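For readers skimming the thread, the change described above can be illustrated with a minimal, self-contained sketch. This is not the flang code: `Symbol`, `countCommonBlocksAsserting` and `countCommonBlocksGuarded` are hypothetical names made up for illustration, and a null pointer stands in for "the name was never declared". It only shows the general pattern of folding the null check into the condition instead of asserting.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a resolved symbol; not flang's semantics::Symbol.
struct Symbol {
  std::string name;
  bool isCommonBlock;
};

// Old shape: a null entry (an undeclared name) trips the assert in
// assert-enabled builds; with NDEBUG it would dereference a null pointer.
inline int countCommonBlocksAsserting(
    const std::vector<const Symbol *> &objects) {
  int count = 0;
  for (const Symbol *symbol : objects) {
    assert(symbol && "Expecting a symbol for object");
    if (symbol->isCommonBlock)
      ++count;
  }
  return count;
}

// New shape: a null entry is treated as "not a common block" and skipped,
// leaving it to other checks to report the undeclared name.
inline int countCommonBlocksGuarded(
    const std::vector<const Symbol *> &objects) {
  int count = 0;
  for (const Symbol *symbol : objects) {
    if (symbol && symbol->isCommonBlock)
      ++count;
  }
  return count;
}
```

Under these assumptions, passing a list such as `{&blk, nullptr, &x}` to the guarded version simply ignores the missing entry, whereas the asserting version would stop the program on it.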
>From 440141e90ed1f97d25de8be8bc2e04b372027189 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Mon, 12 May 2025 15:17:19 +0100 Subject: [PATCH] [flang][OpenMP]Replace assert with if-condition If a symbol is not declared, check-omp-structure hits an assert. It should be safe to treat undeclared symbols as "not from a block", as they would have to be declared to be in a block... Adding simple test to confirm it gives error messages, not crashing. This should fix issue #131655 (there is already a check for symbol being not null in the code identified in the ticket). --- flang/lib/Semantics/check-omp-structure.cpp | 3 +-- .../Semantics/OpenMP/reduction-undefined.f90 | 18 ++++++++++++++++++ 2 files changed, 19 insertions(+), 2 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/reduction-undefined.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 8f6a623508aa7..78736ee1929d1 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3545,8 +3545,7 @@ void OmpStructureChecker::CheckReductionObjects( // names into the lists of their members. for (const parser::OmpObject &object : objects.v) { auto *symbol{GetObjectSymbol(object)}; - assert(symbol && "Expecting a symbol for object"); - if (IsCommonBlock(*symbol)) { + if (symbol && IsCommonBlock(*symbol)) { auto source{GetObjectSource(object)}; context_.Say(source ? *source : GetContext().clauseSource, "Common block names are not allowed in %s clause"_err_en_US, diff --git a/flang/test/Semantics/OpenMP/reduction-undefined.f90 b/flang/test/Semantics/OpenMP/reduction-undefined.f90 new file mode 100644 index 0000000000000..bf1f03a878630 --- /dev/null +++ b/flang/test/Semantics/OpenMP/reduction-undefined.f90 @@ -0,0 +1,18 @@ +! 
RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +subroutine dont_crash(values) + implicit none + integer, parameter :: n = 100 + real :: values(n) + integer :: i + !ERROR: No explicit type declared for 'sum' + sum = 0 + !ERROR: No explicit type declared for 'sum' + !$omp parallel do reduction(+:sum) + do i = 1, n + !ERROR: No explicit type declared for 'sum' + !ERROR: No explicit type declared for 'sum' + sum = sum + values(i) + end do +end subroutine dont_crash + From flang-commits at lists.llvm.org Mon May 12 08:27:10 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 08:27:10 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP]Replace assert with if-condition (PR #139559) In-Reply-To: Message-ID: <6822134e.170a0220.dfa74.45c9@mx.google.com> https://github.com/tblah approved this pull request. LGTM, thanks https://github.com/llvm/llvm-project/pull/139559 From flang-commits at lists.llvm.org Mon May 12 08:50:33 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 08:50:33 -0700 (PDT) Subject: [flang-commits] [flang] 939bb4e - [NFC] Add const to newly added helper functions from PR #135226 Message-ID: <682218c9.050a0220.8b715.9083@mx.google.com> Author: agozillon Date: 2025-05-12T10:49:49-05:00 New Revision: 939bb4e028499a3eda783567cda7d5331ba0c242 URL: https://github.com/llvm/llvm-project/commit/939bb4e028499a3eda783567cda7d5331ba0c242 DIFF: https://github.com/llvm/llvm-project/commit/939bb4e028499a3eda783567cda7d5331ba0c242.diff LOG: [NFC] Add const to newly added helper functions from PR #135226 Added: Modified: flang/lib/Lower/OpenMP/OpenMP.cpp Removed: ################################################################################ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 43f2f35b2ba61..446aa2deb3d05 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -981,14 +981,14 @@ static 
void genLoopVars( } static clause::Defaultmap::ImplicitBehavior -getDefaultmapIfPresent(DefaultMapsTy &defaultMaps, mlir::Type varType) { +getDefaultmapIfPresent(const DefaultMapsTy &defaultMaps, mlir::Type varType) { using DefMap = clause::Defaultmap; if (defaultMaps.empty()) return DefMap::ImplicitBehavior::Default; if (llvm::is_contained(defaultMaps, DefMap::VariableCategory::All)) - return defaultMaps[DefMap::VariableCategory::All]; + return defaultMaps.at(DefMap::VariableCategory::All); // NOTE: Unsure if complex and/or vector falls into a scalar type // or aggregate, but the current default implicit behaviour is to @@ -997,19 +997,19 @@ getDefaultmapIfPresent(DefaultMapsTy &defaultMaps, mlir::Type varType) { if ((fir::isa_trivial(varType) || fir::isa_char(varType) || fir::isa_builtin_cptr_type(varType)) && llvm::is_contained(defaultMaps, DefMap::VariableCategory::Scalar)) - return defaultMaps[DefMap::VariableCategory::Scalar]; + return defaultMaps.at(DefMap::VariableCategory::Scalar); if (fir::isPointerType(varType) && llvm::is_contained(defaultMaps, DefMap::VariableCategory::Pointer)) - return defaultMaps[DefMap::VariableCategory::Pointer]; + return defaultMaps.at(DefMap::VariableCategory::Pointer); if (fir::isAllocatableType(varType) && llvm::is_contained(defaultMaps, DefMap::VariableCategory::Allocatable)) - return defaultMaps[DefMap::VariableCategory::Allocatable]; + return defaultMaps.at(DefMap::VariableCategory::Allocatable); if (fir::isa_aggregate(varType) && llvm::is_contained(defaultMaps, DefMap::VariableCategory::Aggregate)) - return defaultMaps[DefMap::VariableCategory::Aggregate]; + return defaultMaps.at(DefMap::VariableCategory::Aggregate); return DefMap::ImplicitBehavior::Default; } @@ -1018,7 +1018,7 @@ static std::pair getImplicitMapTypeAndKind(fir::FirOpBuilder &firOpBuilder, lower::AbstractConverter &converter, - DefaultMapsTy &defaultMaps, mlir::Type varType, + const DefaultMapsTy &defaultMaps, mlir::Type varType, mlir::Location loc, const 
semantics::Symbol &sym) { using DefMap = clause::Defaultmap; // Check if a value of type `type` can be passed to the kernel by value. From flang-commits at lists.llvm.org Mon May 12 09:00:31 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Mon, 12 May 2025 09:00:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Extend assumed-size array checking in intrinsic functions (PR #139339) In-Reply-To: Message-ID: <68221b1f.170a0220.278e07.94c0@mx.google.com> https://github.com/akuhlens approved this pull request. https://github.com/llvm/llvm-project/pull/139339 From flang-commits at lists.llvm.org Mon May 12 09:03:39 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Mon, 12 May 2025 09:03:39 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414) In-Reply-To: Message-ID: <68221bdb.050a0220.158753.87b7@mx.google.com> ================ @@ -0,0 +1,24 @@ +! RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s + +subroutine testr2(N1,N2) + real(4), managed :: ai4(N1,N2) + real(4), allocatable :: bRefi4(:) + + integer :: i1, i2 + + do i2 = 1, N2 + do i1 = 1, N1 + ai4(i1,i2) = i1 + N1*(i2-1) + enddo + enddo + + allocate(bRefi4 (N1)) + do i1 = 1, N1 + bRefi4(i1) = (ai4(i1,1)+ai4(i1,N2))*N2/2 + enddo + deallocate(bRefi4) + +end subroutine + +!CHECK-LABEL: func.func @_QPtestr2 +!CHECK: %{{.*}} = cuf.alloc !fir.array, %{{.*}}, %{{.*}} : index, index {bindc_name = "ai4", data_attr = #cuf.cuda, uniq_name = "_QFtestr2Eai4"} -> !fir.ref> ---------------- clementval wrote: Can you check the actual change? 
https://github.com/llvm/llvm-project/pull/139414 From flang-commits at lists.llvm.org Mon May 12 09:03:39 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Mon, 12 May 2025 09:03:39 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414) In-Reply-To: Message-ID: <68221bdb.a70a0220.165908.9047@mx.google.com> https://github.com/clementval edited https://github.com/llvm/llvm-project/pull/139414 From flang-commits at lists.llvm.org Mon May 12 09:03:40 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Mon, 12 May 2025 09:03:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414) In-Reply-To: Message-ID: <68221bdc.050a0220.196370.7a93@mx.google.com> https://github.com/clementval requested changes to this pull request. Can you make sure your new test checks the actual change? 
https://github.com/llvm/llvm-project/pull/139414 From flang-commits at lists.llvm.org Mon May 12 09:08:46 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 09:08:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP][Semantics] resolve objects in the flush arg list (PR #139522) In-Reply-To: Message-ID: <68221d0e.170a0220.1f846f.5a1c@mx.google.com> ================ @@ -409,6 +409,19 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { } void Post(const parser::OpenMPDepobjConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPFlushConstruct &x) { + PushContext(x.source, llvm::omp::Directive::OMPD_flush); + for (auto &arg : x.v.Arguments().v) { + if (auto *locator{std::get_if(&arg.u)}) { + if (auto *object{std::get_if(&locator->u)}) { + ResolveOmpObject(*object, Symbol::Flag::OmpDependObject); ---------------- tblah wrote: Yes it was a copy and paste error. Thanks for catching this. I had assumed that given an object, the right way to resolve it was to call ResolveOmpObject. I have now changed to include only the relevant parts of the code inline. https://github.com/llvm/llvm-project/pull/139522 From flang-commits at lists.llvm.org Mon May 12 09:11:00 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 09:11:00 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Set the default schedule modifier (PR #139572) In-Reply-To: Message-ID: <68221d94.170a0220.15279a.593b@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Tom Eccles (tblah)
Changes This is fixing "Default loop schedule modifier for worksharing-loop constructs without static schedule and ordered clause is nonmonotonic since OpenMP 5.0" in the to-do list for removing the OpenMP experimental status warning. I quote the relevant part of OpenMP 6.0 in the patch. So far as I can tell, in OpenMP 4.5 the default schedule modifier was the `def-sched-var` ICV. The initial value of this ICV is implementation defined (table 2.1). So I believe we don't need to check the version for this. It wasn't obvious to me whether this should be done in Semantics, Lowering, or LLVMIR codegen. I didn't do it in semantics so that we didn't add modifiers which could be seen in the unparse. It was more convenient to implement in lowering than in codegen. --- Full diff: https://github.com/llvm/llvm-project/pull/139572.diff 6 Files Affected: - (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+15-1) - (modified) flang/test/Lower/OpenMP/distribute-parallel-do.f90 (+1-1) - (modified) flang/test/Lower/OpenMP/parallel-wsloop.f90 (+1-1) - (modified) flang/test/Lower/OpenMP/wsloop-chunks.f90 (+3-3) - (modified) flang/test/Lower/OpenMP/wsloop-schedule.f90 (+29-3) - (modified) flang/test/Lower/OpenMP/wsloop.f90 (+1-1) ``````````diff diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index f4876256a378f..960e9fe63ad88 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -547,9 +547,23 @@ bool ClauseProcessor::processSchedule( mlir::omp::ClauseScheduleKindAttr::get(context, scheduleKind); mlir::omp::ScheduleModifier scheduleMod = getScheduleModifier(*clause); - if (scheduleMod != mlir::omp::ScheduleModifier::none) + if (scheduleMod != mlir::omp::ScheduleModifier::none) { result.scheduleMod = mlir::omp::ScheduleModifierAttr::get(context, scheduleMod); + } else { + // OpenMP 6.0 13.6.3: + // If an ordering-modifier is not specified, the effect is as if the + // 
monotonic ordering modifier is specified if the kind argument is + // static or an ordered clause is specified on the construct; otherwise, + // the effect is as if the nonmonotonic ordering modifier is specified. + mlir::omp::ScheduleModifier defaultMod = + mlir::omp::ScheduleModifier::nonmonotonic; + if (scheduleKind == mlir::omp::ClauseScheduleKind::Static || + findUniqueClause()) + defaultMod = mlir::omp::ScheduleModifier::monotonic; + result.scheduleMod = + mlir::omp::ScheduleModifierAttr::get(context, defaultMod); + } if (getSimdModifier(*clause) != mlir::omp::ScheduleModifier::none) result.scheduleSimd = firOpBuilder.getUnitAttr(); diff --git a/flang/test/Lower/OpenMP/distribute-parallel-do.f90 b/flang/test/Lower/OpenMP/distribute-parallel-do.f90 index cddf61647ead3..ad69e3a163799 100644 --- a/flang/test/Lower/OpenMP/distribute-parallel-do.f90 +++ b/flang/test/Lower/OpenMP/distribute-parallel-do.f90 @@ -42,7 +42,7 @@ subroutine distribute_parallel_do_schedule() ! CHECK: omp.parallel private({{.*}}) { ! CHECK: omp.distribute { - ! CHECK-NEXT: omp.wsloop schedule(runtime) { + ! CHECK-NEXT: omp.wsloop schedule(runtime, nonmonotonic) { ! CHECK-NEXT: omp.loop_nest !$omp distribute parallel do schedule(runtime) do index_ = 1, 10 diff --git a/flang/test/Lower/OpenMP/parallel-wsloop.f90 b/flang/test/Lower/OpenMP/parallel-wsloop.f90 index 15a68e2c0e65b..f61b81f1d9b12 100644 --- a/flang/test/Lower/OpenMP/parallel-wsloop.f90 +++ b/flang/test/Lower/OpenMP/parallel-wsloop.f90 @@ -64,7 +64,7 @@ subroutine parallel_do_with_clauses(nt) ! CHECK: %[[WS_LB:.*]] = arith.constant 1 : i32 ! CHECK: %[[WS_UB:.*]] = arith.constant 9 : i32 ! CHECK: %[[WS_STEP:.*]] = arith.constant 1 : i32 - ! CHECK: omp.wsloop schedule(dynamic) private({{.*}}) { + ! CHECK: omp.wsloop schedule(dynamic, nonmonotonic) private({{.*}}) { ! 
CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB]]) to (%[[WS_UB]]) inclusive step (%[[WS_STEP]]) { !$OMP PARALLEL DO NUM_THREADS(nt) SCHEDULE(dynamic) do i=1, 9 diff --git a/flang/test/Lower/OpenMP/wsloop-chunks.f90 b/flang/test/Lower/OpenMP/wsloop-chunks.f90 index 29c02a3b3c8d5..f3df1e6249693 100644 --- a/flang/test/Lower/OpenMP/wsloop-chunks.f90 +++ b/flang/test/Lower/OpenMP/wsloop-chunks.f90 @@ -20,7 +20,7 @@ program wsloop ! CHECK: %[[VAL_3:.*]] = arith.constant 1 : i32 ! CHECK: %[[VAL_4:.*]] = arith.constant 9 : i32 ! CHECK: %[[VAL_5:.*]] = arith.constant 1 : i32 -! CHECK: omp.wsloop nowait schedule(static = %[[VAL_2]] : i32) private({{.*}}) { +! CHECK: omp.wsloop nowait schedule(static = %[[VAL_2]] : i32, monotonic) private({{.*}}) { ! CHECK-NEXT: omp.loop_nest (%[[ARG0:.*]]) : i32 = (%[[VAL_3]]) to (%[[VAL_4]]) inclusive step (%[[VAL_5]]) { ! CHECK: hlfir.assign %[[ARG0]] to %[[STORE_IV:.*]]#0 : i32, !fir.ref ! CHECK: %[[LOAD_IV:.*]] = fir.load %[[STORE_IV]]#0 : !fir.ref @@ -40,7 +40,7 @@ program wsloop ! CHECK: %[[VAL_15:.*]] = arith.constant 1 : i32 ! CHECK: %[[VAL_16:.*]] = arith.constant 9 : i32 ! CHECK: %[[VAL_17:.*]] = arith.constant 1 : i32 -! CHECK: omp.wsloop nowait schedule(static = %[[VAL_14]] : i32) private({{.*}}) { +! CHECK: omp.wsloop nowait schedule(static = %[[VAL_14]] : i32, monotonic) private({{.*}}) { ! CHECK-NEXT: omp.loop_nest (%[[ARG1:.*]]) : i32 = (%[[VAL_15]]) to (%[[VAL_16]]) inclusive step (%[[VAL_17]]) { ! CHECK: hlfir.assign %[[ARG1]] to %[[STORE_IV1:.*]]#0 : i32, !fir.ref ! CHECK: %[[VAL_24:.*]] = arith.constant 2 : i32 @@ -66,7 +66,7 @@ program wsloop ! CHECK: %[[VAL_30:.*]] = arith.constant 1 : i32 ! CHECK: %[[VAL_31:.*]] = arith.constant 9 : i32 ! CHECK: %[[VAL_32:.*]] = arith.constant 1 : i32 -! CHECK: omp.wsloop nowait schedule(static = %[[VAL_29]] : i32) private({{.*}}) { +! CHECK: omp.wsloop nowait schedule(static = %[[VAL_29]] : i32, monotonic) private({{.*}}) { ! 
CHECK-NEXT: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%[[VAL_30]]) to (%[[VAL_31]]) inclusive step (%[[VAL_32]]) { ! CHECK: hlfir.assign %[[ARG2]] to %[[STORE_IV2:.*]]#0 : i32, !fir.ref ! CHECK: %[[VAL_39:.*]] = arith.constant 3 : i32 diff --git a/flang/test/Lower/OpenMP/wsloop-schedule.f90 b/flang/test/Lower/OpenMP/wsloop-schedule.f90 index 5e672927c41ba..bbab01a912c3e 100644 --- a/flang/test/Lower/OpenMP/wsloop-schedule.f90 +++ b/flang/test/Lower/OpenMP/wsloop-schedule.f90 @@ -14,7 +14,7 @@ program wsloop_dynamic !CHECK: %[[WS_LB:.*]] = arith.constant 1 : i32 !CHECK: %[[WS_UB:.*]] = arith.constant 9 : i32 !CHECK: %[[WS_STEP:.*]] = arith.constant 1 : i32 -!CHECK: omp.wsloop nowait schedule(runtime, simd) private({{.*}}) { +!CHECK: omp.wsloop nowait schedule(runtime, nonmonotonic, simd) private({{.*}}) { !CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB]]) to (%[[WS_UB]]) inclusive step (%[[WS_STEP]]) { !CHECK: hlfir.assign %[[I]] to %[[STORE:.*]]#0 : i32, !fir.ref @@ -28,9 +28,35 @@ program wsloop_dynamic !CHECK: omp.yield !CHECK: } !CHECK: } -!CHECK: omp.terminator -!CHECK: } !$OMP END DO NOWAIT + +! Check that the schedule modifier is set correctly when the ordered clause is +! used +!$OMP DO SCHEDULE(runtime) ORDERED(1) +!CHECK: %[[WS_LB2:.*]] = arith.constant 1 : i32 +!CHECK: %[[WS_UB2:.*]] = arith.constant 9 : i32 +!CHECK: %[[WS_STEP2:.*]] = arith.constant 1 : i32 +!CHECK: omp.wsloop nowait ordered(1) schedule(runtime, monotonic) private({{.*}}) { +!CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB2]]) to (%[[WS_UB2]]) inclusive step (%[[WS_STEP2]]) { + do i=1, 9 + print*, i + end do +!$OMP END DO NOWAIT + +! 
Check that the schedule modifier is set correctly with a static schedule +!$OMP DO SCHEDULE(static) +!CHECK: %[[WS_LB3:.*]] = arith.constant 1 : i32 +!CHECK: %[[WS_UB3:.*]] = arith.constant 9 : i32 +!CHECK: %[[WS_STEP3:.*]] = arith.constant 1 : i32 +!CHECK: omp.wsloop nowait schedule(static, monotonic) private({{.*}}) { +!CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB3]]) to (%[[WS_UB3]]) inclusive step (%[[WS_STEP3]]) { + do i=1, 9 + print*, i + end do +!$OMP END DO NOWAIT + +!CHECK: omp.terminator +!CHECK: } !$OMP END PARALLEL end diff --git a/flang/test/Lower/OpenMP/wsloop.f90 b/flang/test/Lower/OpenMP/wsloop.f90 index b4e02ea3c73f8..75219c9a78e87 100644 --- a/flang/test/Lower/OpenMP/wsloop.f90 +++ b/flang/test/Lower/OpenMP/wsloop.f90 @@ -58,7 +58,7 @@ subroutine loop_with_schedule_nowait ! CHECK: %[[WS_LB:.*]] = arith.constant 1 : i32 ! CHECK: %[[WS_UB:.*]] = arith.constant 9 : i32 ! CHECK: %[[WS_STEP:.*]] = arith.constant 1 : i32 - ! CHECK: omp.wsloop nowait schedule(runtime) private(@{{.*}} %{{.*}}#0 -> %[[ALLOCA_IV:.*]] : !fir.ref) { + ! CHECK: omp.wsloop nowait schedule(runtime, nonmonotonic) private(@{{.*}} %{{.*}}#0 -> %[[ALLOCA_IV:.*]] : !fir.ref) { ! CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB]]) to (%[[WS_UB]]) inclusive step (%[[WS_STEP]]) { !$OMP DO SCHEDULE(runtime) do i=1, 9 ``````````
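For reference, the defaulting rule quoted from OpenMP 6.0 in the ClauseProcessor.cpp hunk above can be sketched in isolation. This is only an illustration under stated assumptions: `ScheduleKind`, `ScheduleModifier`, and `effectiveModifier` are hypothetical stand-ins, not the MLIR enums or the flang function, and the `hasOrderedClause` flag stands in for the lookup of an ordered clause on the construct.

```cpp
#include <cassert>

// Hypothetical stand-ins for the schedule kind and modifier enums; these are
// not the real mlir::omp types.
enum class ScheduleKind { Static, Dynamic, Guided, Auto, Runtime };
enum class ScheduleModifier { None, Monotonic, Nonmonotonic };

// Returns the modifier to record on the loop. An explicit modifier always
// wins. Otherwise a static schedule, or the presence of an ordered clause,
// defaults to monotonic; every other case defaults to nonmonotonic.
ScheduleModifier effectiveModifier(ScheduleKind kind,
                                   ScheduleModifier explicitMod,
                                   bool hasOrderedClause) {
  if (explicitMod != ScheduleModifier::None)
    return explicitMod;
  if (kind == ScheduleKind::Static || hasOrderedClause)
    return ScheduleModifier::Monotonic;
  return ScheduleModifier::Nonmonotonic;
}
```

Under these assumptions, `effectiveModifier(ScheduleKind::Runtime, ScheduleModifier::None, false)` yields `Nonmonotonic`, which matches the updated `schedule(runtime, nonmonotonic)` CHECK lines, while a static schedule or an ordered clause yields `Monotonic`.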
https://github.com/llvm/llvm-project/pull/139572 From flang-commits at lists.llvm.org Mon May 12 09:07:50 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 09:07:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP][Semantics] resolve objects in the flush arg list (PR #139522) In-Reply-To: Message-ID: <68221cd6.170a0220.37d80b.67fa@mx.google.com> https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/139522 >From a64751019f3eab8a7d6d678ad76fdb8c31b86e54 Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Mon, 12 May 2025 09:38:01 +0000 Subject: [PATCH 1/2] [flang][OpenMP][Semantics] resolve objects in the flush arg list Fixes #136583 Normally the flush argument list would contain a DataRef to some variable. All DataRefs are handled generically in resolve-names and so the problem wasn't observed. But when a common block name is specified, this is not parsed as a DataRef. There was already handling in resolve-directives for OmpObjectList but not for argument lists. I've added a visitor for FLUSH which ensures all of the arguments have been resolved. The test is there to make sure the compiler doesn't crash on encountering the unresolved symbol. It shows that we currently deny flushing a common block. I'm not sure that it is right to restrict common blocks from flush argument lists, but fixing that can come in a different patch. This one is fixing an ICE. 
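As a rough illustration of the visitor described above, the nested-variant walk can be sketched with plain `std::variant` and `std::get_if`. Everything here is an assumption made for the example: `Designator`, `CommonBlockName`, `Object`, `FunctionCall`, `Locator`, `Literal`, `Argument`, and `commonBlockNames` are hypothetical stand-ins and do not reflect the real flang parse-tree types or the resolution logic in resolve-directives.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for parse-tree nodes; the real flang types differ.
struct Designator { std::string name; };      // ordinary variable reference
struct CommonBlockName { std::string name; }; // a name written as /c/
struct Object { std::variant<Designator, CommonBlockName> u; };
struct FunctionCall { std::string callee; };
struct Locator { std::variant<Object, FunctionCall> u; };
struct Literal { int value; };
struct Argument { std::variant<Locator, Literal> u; };

// Collects the common block names in an argument list. Each std::get_if
// returns nullptr when the variant holds a different alternative, so any
// argument that is not a locator -> object -> common block name is skipped.
std::vector<std::string> commonBlockNames(const std::vector<Argument> &args) {
  std::vector<std::string> names;
  for (const Argument &arg : args) {
    if (const auto *locator = std::get_if<Locator>(&arg.u)) {
      if (const auto *object = std::get_if<Object>(&locator->u)) {
        if (const auto *name = std::get_if<CommonBlockName>(&object->u)) {
          names.push_back(name->name);
        }
      }
    }
  }
  return names;
}
```

In this sketch only arguments holding a common block name are collected, which mirrors the point of the commit message: ordinary variable references are resolved elsewhere, so only the common block case needs separate handling.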
--- flang/lib/Semantics/resolve-directives.cpp | 13 +++++++++++++ flang/test/Semantics/OpenMP/flush04.f90 | 11 +++++++++++ 2 files changed, 24 insertions(+) create mode 100644 flang/test/Semantics/OpenMP/flush04.f90 diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 8b1caca34a6a7..68f8cf9f17620 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -409,6 +409,19 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { } void Post(const parser::OpenMPDepobjConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPFlushConstruct &x) { + PushContext(x.source, llvm::omp::Directive::OMPD_flush); + for (auto &arg : x.v.Arguments().v) { + if (auto *locator{std::get_if(&arg.u)}) { + if (auto *object{std::get_if(&locator->u)}) { + ResolveOmpObject(*object, Symbol::Flag::OmpDependObject); + } + } + } + return true; + } + void Post(const parser::OpenMPFlushConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPRequiresConstruct &x) { using Flags = WithOmpDeclarative::RequiresFlags; using Requires = WithOmpDeclarative::RequiresFlag; diff --git a/flang/test/Semantics/OpenMP/flush04.f90 b/flang/test/Semantics/OpenMP/flush04.f90 new file mode 100644 index 0000000000000..ffc2273b692dc --- /dev/null +++ b/flang/test/Semantics/OpenMP/flush04.f90 @@ -0,0 +1,11 @@ +! RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +! Regression test to ensure that the name /c/ in the flush argument list is +! resolved to the common block symbol. 
+ + common /c/ x + real :: x +!ERROR: FLUSH argument must be a variable list item + !$omp flush(/c/) +end + >From d25c9fda4cd19bd9a27f4056c2baf5eaa9dd7d37 Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Mon, 12 May 2025 16:05:36 +0000 Subject: [PATCH 2/2] Respond to review feedback --- flang/lib/Semantics/resolve-directives.cpp | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 68f8cf9f17620..e34c48c1983ca 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -414,7 +414,14 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { for (auto &arg : x.v.Arguments().v) { if (auto *locator{std::get_if(&arg.u)}) { if (auto *object{std::get_if(&locator->u)}) { - ResolveOmpObject(*object, Symbol::Flag::OmpDependObject); + if (auto *name{std::get_if(&object->u)}) { + // ResolveOmpCommonBlockName resolves the symbol as a side effect + if (!ResolveOmpCommonBlockName(name)) { + context_.Say(name->source, // 2.15.3 + "COMMON block must be declared in the same scoping unit " + "in which the OpenMP directive or clause appears"_err_en_US); + } + } } } } From flang-commits at lists.llvm.org Mon May 12 09:10:29 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 12 May 2025 09:10:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Set the default schedule modifier (PR #139572) Message-ID: https://github.com/tblah created https://github.com/llvm/llvm-project/pull/139572 This is fixing "Default loop schedule modifier for worksharing-loop constructs without static schedule and ordered clause is nonmonotonic since OpenMP 5.0" in the to-do list for removing the OpenMP experimental status warning. I quote the relevant part of OpenMP 6.0 in the patch. So far as I can tell, in OpenMP 4.5 the default schedule modifier was the `def-sched-var` ICV. 
The initial value of this ICV is implementation defined (table 2.1). So I believe we don't need to check the version for this. It wasn't obvious to me whether this should be done in Semantics, Lowering, or LLVMIR codegen. I didn't do it in semantics so that we didn't add modifiers which could be seen in the unparse. It was more convenient to implement in lowering than in codegen. >From a96192fe402a47cb4a0911640f01781651c7dc4d Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Mon, 12 May 2025 15:34:15 +0000 Subject: [PATCH] [flang][OpenMP] Set the default schedule modifier This is fixing "Default loop schedule modifier for worksharing-loop constructs without static schedule and ordered clause is nonmonotonic since OpenMP 5.0" in the to-do list for removing the OpenMP experimental status warning. I quote the relevant part of OpenMP 6.0 in the patch. So far as I can tell, in OpenMP 4.5 the default schedule modifier was the `def-sched-var` ICV. The initial value of this ICV is implementation defined (table 2.1). So I believe we don't need to check the version for this. It wasn't obvious to me whether this should be done in Semantics, Lowering, or LLVMIR codegen. I didn't do it in semantics so that we didn't add modifiers which could be seen in the unparse. It was more convenient to implement in lowering than in codegen. 
--- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 16 +++++++++- .../Lower/OpenMP/distribute-parallel-do.f90 | 2 +- flang/test/Lower/OpenMP/parallel-wsloop.f90 | 2 +- flang/test/Lower/OpenMP/wsloop-chunks.f90 | 6 ++-- flang/test/Lower/OpenMP/wsloop-schedule.f90 | 32 +++++++++++++++++-- flang/test/Lower/OpenMP/wsloop.f90 | 2 +- 6 files changed, 50 insertions(+), 10 deletions(-) diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index f4876256a378f..960e9fe63ad88 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -547,9 +547,23 @@ bool ClauseProcessor::processSchedule( mlir::omp::ClauseScheduleKindAttr::get(context, scheduleKind); mlir::omp::ScheduleModifier scheduleMod = getScheduleModifier(*clause); - if (scheduleMod != mlir::omp::ScheduleModifier::none) + if (scheduleMod != mlir::omp::ScheduleModifier::none) { result.scheduleMod = mlir::omp::ScheduleModifierAttr::get(context, scheduleMod); + } else { + // OpenMP 6.0 13.6.3: + // If an ordering-modifier is not specified, the effect is as if the + // monotonic ordering modifier is specified if the kind argument is + // static or an ordered clause is specified on the construct; otherwise, + // the effect is as if the nonmonotonic ordering modifier is specified. 
+ mlir::omp::ScheduleModifier defaultMod = + mlir::omp::ScheduleModifier::nonmonotonic; + if (scheduleKind == mlir::omp::ClauseScheduleKind::Static || + findUniqueClause()) + defaultMod = mlir::omp::ScheduleModifier::monotonic; + result.scheduleMod = + mlir::omp::ScheduleModifierAttr::get(context, defaultMod); + } if (getSimdModifier(*clause) != mlir::omp::ScheduleModifier::none) result.scheduleSimd = firOpBuilder.getUnitAttr(); diff --git a/flang/test/Lower/OpenMP/distribute-parallel-do.f90 b/flang/test/Lower/OpenMP/distribute-parallel-do.f90 index cddf61647ead3..ad69e3a163799 100644 --- a/flang/test/Lower/OpenMP/distribute-parallel-do.f90 +++ b/flang/test/Lower/OpenMP/distribute-parallel-do.f90 @@ -42,7 +42,7 @@ subroutine distribute_parallel_do_schedule() ! CHECK: omp.parallel private({{.*}}) { ! CHECK: omp.distribute { - ! CHECK-NEXT: omp.wsloop schedule(runtime) { + ! CHECK-NEXT: omp.wsloop schedule(runtime, nonmonotonic) { ! CHECK-NEXT: omp.loop_nest !$omp distribute parallel do schedule(runtime) do index_ = 1, 10 diff --git a/flang/test/Lower/OpenMP/parallel-wsloop.f90 b/flang/test/Lower/OpenMP/parallel-wsloop.f90 index 15a68e2c0e65b..f61b81f1d9b12 100644 --- a/flang/test/Lower/OpenMP/parallel-wsloop.f90 +++ b/flang/test/Lower/OpenMP/parallel-wsloop.f90 @@ -64,7 +64,7 @@ subroutine parallel_do_with_clauses(nt) ! CHECK: %[[WS_LB:.*]] = arith.constant 1 : i32 ! CHECK: %[[WS_UB:.*]] = arith.constant 9 : i32 ! CHECK: %[[WS_STEP:.*]] = arith.constant 1 : i32 - ! CHECK: omp.wsloop schedule(dynamic) private({{.*}}) { + ! CHECK: omp.wsloop schedule(dynamic, nonmonotonic) private({{.*}}) { ! 
CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB]]) to (%[[WS_UB]]) inclusive step (%[[WS_STEP]]) { !$OMP PARALLEL DO NUM_THREADS(nt) SCHEDULE(dynamic) do i=1, 9 diff --git a/flang/test/Lower/OpenMP/wsloop-chunks.f90 b/flang/test/Lower/OpenMP/wsloop-chunks.f90 index 29c02a3b3c8d5..f3df1e6249693 100644 --- a/flang/test/Lower/OpenMP/wsloop-chunks.f90 +++ b/flang/test/Lower/OpenMP/wsloop-chunks.f90 @@ -20,7 +20,7 @@ program wsloop ! CHECK: %[[VAL_3:.*]] = arith.constant 1 : i32 ! CHECK: %[[VAL_4:.*]] = arith.constant 9 : i32 ! CHECK: %[[VAL_5:.*]] = arith.constant 1 : i32 -! CHECK: omp.wsloop nowait schedule(static = %[[VAL_2]] : i32) private({{.*}}) { +! CHECK: omp.wsloop nowait schedule(static = %[[VAL_2]] : i32, monotonic) private({{.*}}) { ! CHECK-NEXT: omp.loop_nest (%[[ARG0:.*]]) : i32 = (%[[VAL_3]]) to (%[[VAL_4]]) inclusive step (%[[VAL_5]]) { ! CHECK: hlfir.assign %[[ARG0]] to %[[STORE_IV:.*]]#0 : i32, !fir.ref ! CHECK: %[[LOAD_IV:.*]] = fir.load %[[STORE_IV]]#0 : !fir.ref @@ -40,7 +40,7 @@ program wsloop ! CHECK: %[[VAL_15:.*]] = arith.constant 1 : i32 ! CHECK: %[[VAL_16:.*]] = arith.constant 9 : i32 ! CHECK: %[[VAL_17:.*]] = arith.constant 1 : i32 -! CHECK: omp.wsloop nowait schedule(static = %[[VAL_14]] : i32) private({{.*}}) { +! CHECK: omp.wsloop nowait schedule(static = %[[VAL_14]] : i32, monotonic) private({{.*}}) { ! CHECK-NEXT: omp.loop_nest (%[[ARG1:.*]]) : i32 = (%[[VAL_15]]) to (%[[VAL_16]]) inclusive step (%[[VAL_17]]) { ! CHECK: hlfir.assign %[[ARG1]] to %[[STORE_IV1:.*]]#0 : i32, !fir.ref ! CHECK: %[[VAL_24:.*]] = arith.constant 2 : i32 @@ -66,7 +66,7 @@ program wsloop ! CHECK: %[[VAL_30:.*]] = arith.constant 1 : i32 ! CHECK: %[[VAL_31:.*]] = arith.constant 9 : i32 ! CHECK: %[[VAL_32:.*]] = arith.constant 1 : i32 -! CHECK: omp.wsloop nowait schedule(static = %[[VAL_29]] : i32) private({{.*}}) { +! CHECK: omp.wsloop nowait schedule(static = %[[VAL_29]] : i32, monotonic) private({{.*}}) { ! 
CHECK-NEXT: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%[[VAL_30]]) to (%[[VAL_31]]) inclusive step (%[[VAL_32]]) { ! CHECK: hlfir.assign %[[ARG2]] to %[[STORE_IV2:.*]]#0 : i32, !fir.ref ! CHECK: %[[VAL_39:.*]] = arith.constant 3 : i32 diff --git a/flang/test/Lower/OpenMP/wsloop-schedule.f90 b/flang/test/Lower/OpenMP/wsloop-schedule.f90 index 5e672927c41ba..bbab01a912c3e 100644 --- a/flang/test/Lower/OpenMP/wsloop-schedule.f90 +++ b/flang/test/Lower/OpenMP/wsloop-schedule.f90 @@ -14,7 +14,7 @@ program wsloop_dynamic !CHECK: %[[WS_LB:.*]] = arith.constant 1 : i32 !CHECK: %[[WS_UB:.*]] = arith.constant 9 : i32 !CHECK: %[[WS_STEP:.*]] = arith.constant 1 : i32 -!CHECK: omp.wsloop nowait schedule(runtime, simd) private({{.*}}) { +!CHECK: omp.wsloop nowait schedule(runtime, nonmonotonic, simd) private({{.*}}) { !CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB]]) to (%[[WS_UB]]) inclusive step (%[[WS_STEP]]) { !CHECK: hlfir.assign %[[I]] to %[[STORE:.*]]#0 : i32, !fir.ref @@ -28,9 +28,35 @@ program wsloop_dynamic !CHECK: omp.yield !CHECK: } !CHECK: } -!CHECK: omp.terminator -!CHECK: } !$OMP END DO NOWAIT + +! Check that the schedule modifier is set correctly when the ordered clause is +! used +!$OMP DO SCHEDULE(runtime) ORDERED(1) +!CHECK: %[[WS_LB2:.*]] = arith.constant 1 : i32 +!CHECK: %[[WS_UB2:.*]] = arith.constant 9 : i32 +!CHECK: %[[WS_STEP2:.*]] = arith.constant 1 : i32 +!CHECK: omp.wsloop nowait ordered(1) schedule(runtime, monotonic) private({{.*}}) { +!CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB2]]) to (%[[WS_UB2]]) inclusive step (%[[WS_STEP2]]) { + do i=1, 9 + print*, i + end do +!$OMP END DO NOWAIT + +! 
Check that the schedule modifier is set correctly with a static schedule +!$OMP DO SCHEDULE(static) +!CHECK: %[[WS_LB3:.*]] = arith.constant 1 : i32 +!CHECK: %[[WS_UB3:.*]] = arith.constant 9 : i32 +!CHECK: %[[WS_STEP3:.*]] = arith.constant 1 : i32 +!CHECK: omp.wsloop nowait schedule(static, monotonic) private({{.*}}) { +!CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB3]]) to (%[[WS_UB3]]) inclusive step (%[[WS_STEP3]]) { + do i=1, 9 + print*, i + end do +!$OMP END DO NOWAIT + +!CHECK: omp.terminator +!CHECK: } !$OMP END PARALLEL end diff --git a/flang/test/Lower/OpenMP/wsloop.f90 b/flang/test/Lower/OpenMP/wsloop.f90 index b4e02ea3c73f8..75219c9a78e87 100644 --- a/flang/test/Lower/OpenMP/wsloop.f90 +++ b/flang/test/Lower/OpenMP/wsloop.f90 @@ -58,7 +58,7 @@ subroutine loop_with_schedule_nowait ! CHECK: %[[WS_LB:.*]] = arith.constant 1 : i32 ! CHECK: %[[WS_UB:.*]] = arith.constant 9 : i32 ! CHECK: %[[WS_STEP:.*]] = arith.constant 1 : i32 - ! CHECK: omp.wsloop nowait schedule(runtime) private(@{{.*}} %{{.*}}#0 -> %[[ALLOCA_IV:.*]] : !fir.ref) { + ! CHECK: omp.wsloop nowait schedule(runtime, nonmonotonic) private(@{{.*}} %{{.*}}#0 -> %[[ALLOCA_IV:.*]] : !fir.ref) { ! CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB]]) to (%[[WS_UB]]) inclusive step (%[[WS_STEP]]) { !$OMP DO SCHEDULE(runtime) do i=1, 9 From flang-commits at lists.llvm.org Mon May 12 09:11:00 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 09:11:00 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Set the default schedule modifier (PR #139572) In-Reply-To: Message-ID: <68221d94.170a0220.7f76.73a3@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Tom Eccles (tblah)
Changes This is fixing "Default loop schedule modifier for worksharing-loop constructs without static schedule and ordered clause is nonmonotonic since OpenMP 5.0" in the to-do list for removing the OpenMP experimental status warning. I quote the relevant part of OpenMP 6.0 in the patch. So far as I can tell, in OpenMP 4.5 the default schedule modifier was the `def-sched-var` ICV. The initial value of this ICV is implementation defined (table 2.1). So I believe we don't need to check the version for this. It wasn't obvious to me whether this should be done in Semantics, Lowering, or LLVMIR codegen. I didn't do it in semantics so that we didn't add modifiers which could be seen in the unparse. It was more convenient to implement in lowering than in codegen. --- Full diff: https://github.com/llvm/llvm-project/pull/139572.diff 6 Files Affected: - (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+15-1) - (modified) flang/test/Lower/OpenMP/distribute-parallel-do.f90 (+1-1) - (modified) flang/test/Lower/OpenMP/parallel-wsloop.f90 (+1-1) - (modified) flang/test/Lower/OpenMP/wsloop-chunks.f90 (+3-3) - (modified) flang/test/Lower/OpenMP/wsloop-schedule.f90 (+29-3) - (modified) flang/test/Lower/OpenMP/wsloop.f90 (+1-1) ``````````diff diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index f4876256a378f..960e9fe63ad88 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -547,9 +547,23 @@ bool ClauseProcessor::processSchedule( mlir::omp::ClauseScheduleKindAttr::get(context, scheduleKind); mlir::omp::ScheduleModifier scheduleMod = getScheduleModifier(*clause); - if (scheduleMod != mlir::omp::ScheduleModifier::none) + if (scheduleMod != mlir::omp::ScheduleModifier::none) { result.scheduleMod = mlir::omp::ScheduleModifierAttr::get(context, scheduleMod); + } else { + // OpenMP 6.0 13.6.3: + // If an ordering-modifier is not specified, the effect is as if the + // 
monotonic ordering modifier is specified if the kind argument is + // static or an ordered clause is specified on the construct; otherwise, + // the effect is as if the nonmonotonic ordering modifier is specified. + mlir::omp::ScheduleModifier defaultMod = + mlir::omp::ScheduleModifier::nonmonotonic; + if (scheduleKind == mlir::omp::ClauseScheduleKind::Static || + findUniqueClause()) + defaultMod = mlir::omp::ScheduleModifier::monotonic; + result.scheduleMod = + mlir::omp::ScheduleModifierAttr::get(context, defaultMod); + } if (getSimdModifier(*clause) != mlir::omp::ScheduleModifier::none) result.scheduleSimd = firOpBuilder.getUnitAttr(); diff --git a/flang/test/Lower/OpenMP/distribute-parallel-do.f90 b/flang/test/Lower/OpenMP/distribute-parallel-do.f90 index cddf61647ead3..ad69e3a163799 100644 --- a/flang/test/Lower/OpenMP/distribute-parallel-do.f90 +++ b/flang/test/Lower/OpenMP/distribute-parallel-do.f90 @@ -42,7 +42,7 @@ subroutine distribute_parallel_do_schedule() ! CHECK: omp.parallel private({{.*}}) { ! CHECK: omp.distribute { - ! CHECK-NEXT: omp.wsloop schedule(runtime) { + ! CHECK-NEXT: omp.wsloop schedule(runtime, nonmonotonic) { ! CHECK-NEXT: omp.loop_nest !$omp distribute parallel do schedule(runtime) do index_ = 1, 10 diff --git a/flang/test/Lower/OpenMP/parallel-wsloop.f90 b/flang/test/Lower/OpenMP/parallel-wsloop.f90 index 15a68e2c0e65b..f61b81f1d9b12 100644 --- a/flang/test/Lower/OpenMP/parallel-wsloop.f90 +++ b/flang/test/Lower/OpenMP/parallel-wsloop.f90 @@ -64,7 +64,7 @@ subroutine parallel_do_with_clauses(nt) ! CHECK: %[[WS_LB:.*]] = arith.constant 1 : i32 ! CHECK: %[[WS_UB:.*]] = arith.constant 9 : i32 ! CHECK: %[[WS_STEP:.*]] = arith.constant 1 : i32 - ! CHECK: omp.wsloop schedule(dynamic) private({{.*}}) { + ! CHECK: omp.wsloop schedule(dynamic, nonmonotonic) private({{.*}}) { ! 
CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB]]) to (%[[WS_UB]]) inclusive step (%[[WS_STEP]]) { !$OMP PARALLEL DO NUM_THREADS(nt) SCHEDULE(dynamic) do i=1, 9 diff --git a/flang/test/Lower/OpenMP/wsloop-chunks.f90 b/flang/test/Lower/OpenMP/wsloop-chunks.f90 index 29c02a3b3c8d5..f3df1e6249693 100644 --- a/flang/test/Lower/OpenMP/wsloop-chunks.f90 +++ b/flang/test/Lower/OpenMP/wsloop-chunks.f90 @@ -20,7 +20,7 @@ program wsloop ! CHECK: %[[VAL_3:.*]] = arith.constant 1 : i32 ! CHECK: %[[VAL_4:.*]] = arith.constant 9 : i32 ! CHECK: %[[VAL_5:.*]] = arith.constant 1 : i32 -! CHECK: omp.wsloop nowait schedule(static = %[[VAL_2]] : i32) private({{.*}}) { +! CHECK: omp.wsloop nowait schedule(static = %[[VAL_2]] : i32, monotonic) private({{.*}}) { ! CHECK-NEXT: omp.loop_nest (%[[ARG0:.*]]) : i32 = (%[[VAL_3]]) to (%[[VAL_4]]) inclusive step (%[[VAL_5]]) { ! CHECK: hlfir.assign %[[ARG0]] to %[[STORE_IV:.*]]#0 : i32, !fir.ref ! CHECK: %[[LOAD_IV:.*]] = fir.load %[[STORE_IV]]#0 : !fir.ref @@ -40,7 +40,7 @@ program wsloop ! CHECK: %[[VAL_15:.*]] = arith.constant 1 : i32 ! CHECK: %[[VAL_16:.*]] = arith.constant 9 : i32 ! CHECK: %[[VAL_17:.*]] = arith.constant 1 : i32 -! CHECK: omp.wsloop nowait schedule(static = %[[VAL_14]] : i32) private({{.*}}) { +! CHECK: omp.wsloop nowait schedule(static = %[[VAL_14]] : i32, monotonic) private({{.*}}) { ! CHECK-NEXT: omp.loop_nest (%[[ARG1:.*]]) : i32 = (%[[VAL_15]]) to (%[[VAL_16]]) inclusive step (%[[VAL_17]]) { ! CHECK: hlfir.assign %[[ARG1]] to %[[STORE_IV1:.*]]#0 : i32, !fir.ref ! CHECK: %[[VAL_24:.*]] = arith.constant 2 : i32 @@ -66,7 +66,7 @@ program wsloop ! CHECK: %[[VAL_30:.*]] = arith.constant 1 : i32 ! CHECK: %[[VAL_31:.*]] = arith.constant 9 : i32 ! CHECK: %[[VAL_32:.*]] = arith.constant 1 : i32 -! CHECK: omp.wsloop nowait schedule(static = %[[VAL_29]] : i32) private({{.*}}) { +! CHECK: omp.wsloop nowait schedule(static = %[[VAL_29]] : i32, monotonic) private({{.*}}) { ! 
CHECK-NEXT: omp.loop_nest (%[[ARG2:.*]]) : i32 = (%[[VAL_30]]) to (%[[VAL_31]]) inclusive step (%[[VAL_32]]) { ! CHECK: hlfir.assign %[[ARG2]] to %[[STORE_IV2:.*]]#0 : i32, !fir.ref ! CHECK: %[[VAL_39:.*]] = arith.constant 3 : i32 diff --git a/flang/test/Lower/OpenMP/wsloop-schedule.f90 b/flang/test/Lower/OpenMP/wsloop-schedule.f90 index 5e672927c41ba..bbab01a912c3e 100644 --- a/flang/test/Lower/OpenMP/wsloop-schedule.f90 +++ b/flang/test/Lower/OpenMP/wsloop-schedule.f90 @@ -14,7 +14,7 @@ program wsloop_dynamic !CHECK: %[[WS_LB:.*]] = arith.constant 1 : i32 !CHECK: %[[WS_UB:.*]] = arith.constant 9 : i32 !CHECK: %[[WS_STEP:.*]] = arith.constant 1 : i32 -!CHECK: omp.wsloop nowait schedule(runtime, simd) private({{.*}}) { +!CHECK: omp.wsloop nowait schedule(runtime, nonmonotonic, simd) private({{.*}}) { !CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB]]) to (%[[WS_UB]]) inclusive step (%[[WS_STEP]]) { !CHECK: hlfir.assign %[[I]] to %[[STORE:.*]]#0 : i32, !fir.ref @@ -28,9 +28,35 @@ program wsloop_dynamic !CHECK: omp.yield !CHECK: } !CHECK: } -!CHECK: omp.terminator -!CHECK: } !$OMP END DO NOWAIT + +! Check that the schedule modifier is set correctly when the ordered clause is +! used +!$OMP DO SCHEDULE(runtime) ORDERED(1) +!CHECK: %[[WS_LB2:.*]] = arith.constant 1 : i32 +!CHECK: %[[WS_UB2:.*]] = arith.constant 9 : i32 +!CHECK: %[[WS_STEP2:.*]] = arith.constant 1 : i32 +!CHECK: omp.wsloop nowait ordered(1) schedule(runtime, monotonic) private({{.*}}) { +!CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB2]]) to (%[[WS_UB2]]) inclusive step (%[[WS_STEP2]]) { + do i=1, 9 + print*, i + end do +!$OMP END DO NOWAIT + +! 
Check that the schedule modifier is set correctly with a static schedule +!$OMP DO SCHEDULE(static) +!CHECK: %[[WS_LB3:.*]] = arith.constant 1 : i32 +!CHECK: %[[WS_UB3:.*]] = arith.constant 9 : i32 +!CHECK: %[[WS_STEP3:.*]] = arith.constant 1 : i32 +!CHECK: omp.wsloop nowait schedule(static, monotonic) private({{.*}}) { +!CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB3]]) to (%[[WS_UB3]]) inclusive step (%[[WS_STEP3]]) { + do i=1, 9 + print*, i + end do +!$OMP END DO NOWAIT + +!CHECK: omp.terminator +!CHECK: } !$OMP END PARALLEL end diff --git a/flang/test/Lower/OpenMP/wsloop.f90 b/flang/test/Lower/OpenMP/wsloop.f90 index b4e02ea3c73f8..75219c9a78e87 100644 --- a/flang/test/Lower/OpenMP/wsloop.f90 +++ b/flang/test/Lower/OpenMP/wsloop.f90 @@ -58,7 +58,7 @@ subroutine loop_with_schedule_nowait ! CHECK: %[[WS_LB:.*]] = arith.constant 1 : i32 ! CHECK: %[[WS_UB:.*]] = arith.constant 9 : i32 ! CHECK: %[[WS_STEP:.*]] = arith.constant 1 : i32 - ! CHECK: omp.wsloop nowait schedule(runtime) private(@{{.*}} %{{.*}}#0 -> %[[ALLOCA_IV:.*]] : !fir.ref) { + ! CHECK: omp.wsloop nowait schedule(runtime, nonmonotonic) private(@{{.*}} %{{.*}}#0 -> %[[ALLOCA_IV:.*]] : !fir.ref) { ! CHECK-NEXT: omp.loop_nest (%[[I:.*]]) : i32 = (%[[WS_LB]]) to (%[[WS_UB]]) inclusive step (%[[WS_STEP]]) { !$OMP DO SCHEDULE(runtime) do i=1, 9 ``````````
https://github.com/llvm/llvm-project/pull/139572 From flang-commits at lists.llvm.org Mon May 12 09:14:21 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 09:14:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix spurious error on defined assignment in PURE (PR #139186) In-Reply-To: Message-ID: <68221e5d.170a0220.310b6.6735@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/139186 >From 7f1fec0ce3a50b290d52d5cd53fe135357149c6d Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Thu, 8 May 2025 17:46:35 -0700 Subject: [PATCH] [flang] Fix spurious error on defined assignment in PURE An assignment to a whole polymorphic object in a PURE subprogram that is implemented by means of a defined assignment procedure shouldn't be subjected to the same definability checks as it would be for an intrinsic assignment (which would also require it to be allocatable). Fixes https://github.com/llvm/llvm-project/issues/139129. 
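The distinction the commit message draws can be illustrated with a small standalone sketch. It is not flang code: `Flag`, `Object`, `flagsForAssignment`, and `whyNotDefinable` are hypothetical stand-ins, and the toy `Object` is assumed to always be a whole object. It only models the idea that the polymorphism restriction in a PURE subprogram is tied to contexts that may deallocate the object, which a defined assignment is not.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical stand-ins for flang's DefinabilityFlag and DefinabilityFlags.
enum class Flag : std::size_t { PolymorphicOkInPure, PotentialDeallocation, Count };
using Flags = std::bitset<static_cast<std::size_t>(Flag::Count)>;

constexpr std::size_t bit(Flag f) { return static_cast<std::size_t>(f); }

// Toy model of a whole object on the left-hand side of an assignment.
struct Object {
  std::string name;
  bool isPolymorphic;
  bool isAllocatable;
};

// Caller side: only an intrinsic assignment to a whole allocatable is
// treated as a context that may deallocate its left-hand side.
Flags flagsForAssignment(const Object &lhs, bool isDefinedAssignment) {
  Flags flags;
  if (!isDefinedAssignment && lhs.isAllocatable) {
    flags.set(bit(Flag::PotentialDeallocation));
  }
  return flags;
}

// Checker side: the polymorphism restriction in a pure subprogram applies
// only when the context may deallocate the object.
std::optional<std::string> whyNotDefinable(
    const Object &obj, const Flags &flags, bool inPure) {
  if (inPure && !flags.test(bit(Flag::PolymorphicOkInPure)) &&
      flags.test(bit(Flag::PotentialDeallocation)) && obj.isPolymorphic) {
    return "'" + obj.name + "' is a whole polymorphic object in a pure subprogram";
  }
  return std::nullopt;
}

// Exercises the two cases the commit message contrasts.
void demoDefinedVersusIntrinsic() {
  const Object dummy{"x", /*isPolymorphic=*/true, /*isAllocatable=*/false};
  const Object alloc{"a", /*isPolymorphic=*/true, /*isAllocatable=*/true};
  // Defined assignment to a polymorphic dummy: no definability complaint.
  assert(!whyNotDefinable(dummy, flagsForAssignment(dummy, true), true));
  // Intrinsic assignment to a polymorphic allocatable: rejected in PURE.
  assert(whyNotDefinable(alloc, flagsForAssignment(alloc, false), true));
}
```

In this model the decision about whether deallocation is possible is made by the caller and passed in as a flag, so the checker itself does not need to know which kind of statement it is looking at.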
--- flang/include/flang/Evaluate/tools.h | 41 ++++++---------------- flang/lib/Evaluate/tools.cpp | 35 ++++++++++++++++++ flang/lib/Semantics/assignment.cpp | 5 +++ flang/lib/Semantics/check-deallocate.cpp | 6 ++-- flang/lib/Semantics/check-declarations.cpp | 4 +-- flang/lib/Semantics/definable.cpp | 40 ++++++++++----------- flang/lib/Semantics/definable.h | 2 +- flang/lib/Semantics/expression.cpp | 6 ++-- flang/test/Semantics/assign11.f90 | 6 ++-- flang/test/Semantics/bug139129.f90 | 17 +++++++++ flang/test/Semantics/call28.f90 | 4 +-- flang/test/Semantics/deallocate07.f90 | 8 ++--- flang/test/Semantics/declarations05.f90 | 2 +- 13 files changed, 107 insertions(+), 69 deletions(-) create mode 100644 flang/test/Semantics/bug139129.f90 diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h index 5cdabb3056d8f..22f98a81d9037 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -508,50 +508,31 @@ template std::optional ExtractSubstring(const A &x) { // If an expression is simply a whole symbol data designator, // extract and return that symbol, else null. +const Symbol *UnwrapWholeSymbolDataRef(const DataRef &); +const Symbol *UnwrapWholeSymbolDataRef(const std::optional &); template const Symbol *UnwrapWholeSymbolDataRef(const A &x) { - if (auto dataRef{ExtractDataRef(x)}) { - if (const SymbolRef * p{std::get_if(&dataRef->u)}) { - return &p->get(); - } - } - return nullptr; + return UnwrapWholeSymbolDataRef(ExtractDataRef(x)); } // If an expression is a whole symbol or a whole component desginator, // extract and return that symbol, else null. 
+const Symbol *UnwrapWholeSymbolOrComponentDataRef(const DataRef &); +const Symbol *UnwrapWholeSymbolOrComponentDataRef( + const std::optional &); template const Symbol *UnwrapWholeSymbolOrComponentDataRef(const A &x) { - if (auto dataRef{ExtractDataRef(x)}) { - if (const SymbolRef * p{std::get_if(&dataRef->u)}) { - return &p->get(); - } else if (const Component * c{std::get_if(&dataRef->u)}) { - if (c->base().Rank() == 0) { - return &c->GetLastSymbol(); - } - } - } - return nullptr; + return UnwrapWholeSymbolOrComponentDataRef(ExtractDataRef(x)); } // If an expression is a whole symbol or a whole component designator, // potentially followed by an image selector, extract and return that symbol, // else null. +const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const DataRef &); +const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef( + const std::optional &); template const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const A &x) { - if (auto dataRef{ExtractDataRef(x)}) { - if (const SymbolRef * p{std::get_if(&dataRef->u)}) { - return &p->get(); - } else if (const Component * c{std::get_if(&dataRef->u)}) { - if (c->base().Rank() == 0) { - return &c->GetLastSymbol(); - } - } else if (const CoarrayRef * c{std::get_if(&dataRef->u)}) { - if (c->subscript().empty()) { - return &c->GetLastSymbol(); - } - } - } - return nullptr; + return UnwrapWholeSymbolOrComponentOrCoarrayRef(ExtractDataRef(x)); } // GetFirstSymbol(A%B%C[I]%D) -> A diff --git a/flang/lib/Evaluate/tools.cpp b/flang/lib/Evaluate/tools.cpp index 702711e3cff53..79d9de91ce7c7 100644 --- a/flang/lib/Evaluate/tools.cpp +++ b/flang/lib/Evaluate/tools.cpp @@ -1320,6 +1320,41 @@ std::optional CheckProcCompatibility(bool isCall, return msg; } +const Symbol *UnwrapWholeSymbolDataRef(const DataRef &dataRef) { + const SymbolRef *p{std::get_if(&dataRef.u)}; + return p ? &p->get() : nullptr; +} + +const Symbol *UnwrapWholeSymbolDataRef(const std::optional &dataRef) { + return dataRef ? 
UnwrapWholeSymbolDataRef(*dataRef) : nullptr; +} + +const Symbol *UnwrapWholeSymbolOrComponentDataRef(const DataRef &dataRef) { + if (const Component * c{std::get_if(&dataRef.u)}) { + return c->base().Rank() == 0 ? &c->GetLastSymbol() : nullptr; + } else { + return UnwrapWholeSymbolDataRef(dataRef); + } +} + +const Symbol *UnwrapWholeSymbolOrComponentDataRef( + const std::optional &dataRef) { + return dataRef ? UnwrapWholeSymbolOrComponentDataRef(*dataRef) : nullptr; +} + +const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const DataRef &dataRef) { + if (const CoarrayRef * c{std::get_if(&dataRef.u)}) { + return UnwrapWholeSymbolOrComponentOrCoarrayRef(c->base()); + } else { + return UnwrapWholeSymbolOrComponentDataRef(dataRef); + } +} + +const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef( + const std::optional &dataRef) { + return dataRef ? UnwrapWholeSymbolOrComponentOrCoarrayRef(*dataRef) : nullptr; +} + // GetLastPointerSymbol() static const Symbol *GetLastPointerSymbol(const Symbol &symbol) { return IsPointer(GetAssociationRoot(symbol)) ? 
&symbol : nullptr; diff --git a/flang/lib/Semantics/assignment.cpp b/flang/lib/Semantics/assignment.cpp index 935f5a03bdb6a..6e55d0210ee0e 100644 --- a/flang/lib/Semantics/assignment.cpp +++ b/flang/lib/Semantics/assignment.cpp @@ -72,6 +72,11 @@ void AssignmentContext::Analyze(const parser::AssignmentStmt &stmt) { std::holds_alternative(assignment->u)}; if (isDefinedAssignment) { flags.set(DefinabilityFlag::AllowEventLockOrNotifyType); + } else if (const Symbol * + whole{evaluate::UnwrapWholeSymbolOrComponentDataRef(lhs)}) { + if (IsAllocatable(whole->GetUltimate())) { + flags.set(DefinabilityFlag::PotentialDeallocation); + } } if (auto whyNot{WhyNotDefinable(lhsLoc, scope, flags, lhs)}) { if (whyNot->IsFatal()) { diff --git a/flang/lib/Semantics/check-deallocate.cpp b/flang/lib/Semantics/check-deallocate.cpp index 3bcd4d87b0906..c45b58586853b 100644 --- a/flang/lib/Semantics/check-deallocate.cpp +++ b/flang/lib/Semantics/check-deallocate.cpp @@ -36,7 +36,8 @@ void DeallocateChecker::Leave(const parser::DeallocateStmt &deallocateStmt) { } else if (auto whyNot{WhyNotDefinable(name.source, context_.FindScope(name.source), {DefinabilityFlag::PointerDefinition, - DefinabilityFlag::AcceptAllocatable}, + DefinabilityFlag::AcceptAllocatable, + DefinabilityFlag::PotentialDeallocation}, *symbol)}) { // Catch problems with non-definability of the // pointer/allocatable @@ -74,7 +75,8 @@ void DeallocateChecker::Leave(const parser::DeallocateStmt &deallocateStmt) { } else if (auto whyNot{WhyNotDefinable(source, context_.FindScope(source), {DefinabilityFlag::PointerDefinition, - DefinabilityFlag::AcceptAllocatable}, + DefinabilityFlag::AcceptAllocatable, + DefinabilityFlag::PotentialDeallocation}, *expr)}) { context_ .Say(source, diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 318085518cc57..c3a228f3ab8a9 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -949,8 
+949,8 @@ void CheckHelper::CheckObjectEntity( !IsFunctionResult(symbol) /*ditto*/) { // Check automatically deallocated local variables for possible // problems with finalization in PURE. - if (auto whyNot{ - WhyNotDefinable(symbol.name(), symbol.owner(), {}, symbol)}) { + if (auto whyNot{WhyNotDefinable(symbol.name(), symbol.owner(), + {DefinabilityFlag::PotentialDeallocation}, symbol)}) { if (auto *msg{messages_.Say( "'%s' may not be a local variable in a pure subprogram"_err_en_US, symbol.name())}) { diff --git a/flang/lib/Semantics/definable.cpp b/flang/lib/Semantics/definable.cpp index 99a31553f2782..08cb268b318ae 100644 --- a/flang/lib/Semantics/definable.cpp +++ b/flang/lib/Semantics/definable.cpp @@ -193,6 +193,15 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at, return WhyNotDefinableLast(at, scope, flags, dataRef->GetLastSymbol()); } } + auto dyType{evaluate::DynamicType::From(ultimate)}; + const auto *inPure{FindPureProcedureContaining(scope)}; + if (inPure && !flags.test(DefinabilityFlag::PolymorphicOkInPure) && + flags.test(DefinabilityFlag::PotentialDeallocation) && dyType && + dyType->IsPolymorphic()) { + return BlameSymbol(at, + "'%s' is a whole polymorphic object in a pure subprogram"_en_US, + original); + } if (flags.test(DefinabilityFlag::PointerDefinition)) { if (flags.test(DefinabilityFlag::AcceptAllocatable)) { if (!IsAllocatableOrObjectPointer(&ultimate)) { @@ -210,26 +219,17 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at, "'%s' is an entity with either an EVENT_TYPE or LOCK_TYPE"_en_US, original); } - if (FindPureProcedureContaining(scope)) { - if (auto dyType{evaluate::DynamicType::From(ultimate)}) { - if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { - if (dyType->IsPolymorphic()) { // C1596 - return BlameSymbol( - at, "'%s' is polymorphic in a pure subprogram"_en_US, original); - } - } - if (const Symbol * impure{HasImpureFinal(ultimate)}) { - return BlameSymbol(at, "'%s' has an impure FINAL 
procedure '%s'"_en_US, - original, impure->name()); - } + if (dyType && inPure) { + if (const Symbol * impure{HasImpureFinal(ultimate)}) { + return BlameSymbol(at, "'%s' has an impure FINAL procedure '%s'"_en_US, + original, impure->name()); + } + if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { if (const DerivedTypeSpec * derived{GetDerivedTypeSpec(dyType)}) { - if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { - if (auto bad{ - FindPolymorphicAllocatablePotentialComponent(*derived)}) { - return BlameSymbol(at, - "'%s' has polymorphic component '%s' in a pure subprogram"_en_US, - original, bad.BuildResultDesignatorName()); - } + if (auto bad{FindPolymorphicAllocatablePotentialComponent(*derived)}) { + return BlameSymbol(at, + "'%s' has polymorphic component '%s' in a pure subprogram"_en_US, + original, bad.BuildResultDesignatorName()); } } } @@ -243,7 +243,7 @@ static std::optional WhyNotDefinable(parser::CharBlock at, const evaluate::DataRef &dataRef) { auto whyNotBase{ WhyNotDefinableBase(at, scope, flags, dataRef.GetFirstSymbol(), - std::holds_alternative(dataRef.u), + evaluate::UnwrapWholeSymbolDataRef(dataRef) != nullptr, DefinesComponentPointerTarget(dataRef, flags))}; if (!whyNotBase || !whyNotBase->IsFatal()) { if (auto whyNotLast{ diff --git a/flang/lib/Semantics/definable.h b/flang/lib/Semantics/definable.h index 902702dbccbf3..0d027961417be 100644 --- a/flang/lib/Semantics/definable.h +++ b/flang/lib/Semantics/definable.h @@ -33,7 +33,7 @@ ENUM_CLASS(DefinabilityFlag, SourcedAllocation, // ALLOCATE(a,SOURCE=) PolymorphicOkInPure, // don't check for polymorphic type in pure subprogram DoNotNoteDefinition, // context does not imply definition - AllowEventLockOrNotifyType) + AllowEventLockOrNotifyType, PotentialDeallocation) using DefinabilityFlags = common::EnumSet; diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index e139bda7e4950..96d039edf89d7 100644 --- a/flang/lib/Semantics/expression.cpp +++ 
b/flang/lib/Semantics/expression.cpp @@ -3385,15 +3385,15 @@ const Assignment *ExpressionAnalyzer::Analyze(const parser::AssignmentStmt &x) { const Symbol *lastWhole{ lastWhole0 ? &ResolveAssociations(*lastWhole0) : nullptr}; if (!lastWhole || !IsAllocatable(*lastWhole)) { - Say("Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable"_err_en_US); + Say("Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable"_err_en_US); } else if (evaluate::IsCoarray(*lastWhole)) { - Say("Left-hand side of assignment may not be polymorphic if it is a coarray"_err_en_US); + Say("Left-hand side of intrinsic assignment may not be polymorphic if it is a coarray"_err_en_US); } } if (auto *derived{GetDerivedTypeSpec(*dyType)}) { if (auto iter{FindAllocatableUltimateComponent(*derived)}) { if (ExtractCoarrayRef(lhs)) { - Say("Left-hand side of assignment must not be coindexed due to allocatable ultimate component '%s'"_err_en_US, + Say("Left-hand side of intrinsic assignment must not be coindexed due to allocatable ultimate component '%s'"_err_en_US, iter.BuildResultDesignatorName()); } } diff --git a/flang/test/Semantics/assign11.f90 b/flang/test/Semantics/assign11.f90 index 37216526b5f33..9d70d7109e75e 100644 --- a/flang/test/Semantics/assign11.f90 +++ b/flang/test/Semantics/assign11.f90 @@ -9,10 +9,10 @@ program test end type type(t) auc[*] pa = 1 ! 
ok - !ERROR: Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable pp = 1 - !ERROR: Left-hand side of assignment may not be polymorphic if it is a coarray + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic if it is a coarray pac = 1 - !ERROR: Left-hand side of assignment must not be coindexed due to allocatable ultimate component '%a' + !ERROR: Left-hand side of intrinsic assignment must not be coindexed due to allocatable ultimate component '%a' auc[1] = t() end diff --git a/flang/test/Semantics/bug139129.f90 b/flang/test/Semantics/bug139129.f90 new file mode 100644 index 0000000000000..2f0f865854706 --- /dev/null +++ b/flang/test/Semantics/bug139129.f90 @@ -0,0 +1,17 @@ +!RUN: %flang_fc1 -fsyntax-only %s +module m + type t + contains + procedure asst + generic :: assignment(=) => asst + end type + contains + pure subroutine asst(lhs, rhs) + class(t), intent(in out) :: lhs + class(t), intent(in) :: rhs + end + pure subroutine test(x, y) + class(t), intent(in out) :: x, y + x = y ! 
spurious definability error + end +end diff --git a/flang/test/Semantics/call28.f90 b/flang/test/Semantics/call28.f90 index 51430853d663f..f133276f7547e 100644 --- a/flang/test/Semantics/call28.f90 +++ b/flang/test/Semantics/call28.f90 @@ -11,9 +11,7 @@ pure subroutine s1(x) end subroutine pure subroutine s2(x) class(t), intent(in out) :: x - !ERROR: Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable - !ERROR: Left-hand side of assignment is not definable - !BECAUSE: 'x' is polymorphic in a pure subprogram + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable x = t() end subroutine pure subroutine s3(x) diff --git a/flang/test/Semantics/deallocate07.f90 b/flang/test/Semantics/deallocate07.f90 index 154c680f47c82..6dcf20e82cf0d 100644 --- a/flang/test/Semantics/deallocate07.f90 +++ b/flang/test/Semantics/deallocate07.f90 @@ -19,11 +19,11 @@ pure subroutine subr(pp1, pp2, mp2) !ERROR: Name in DEALLOCATE statement is not definable !BECAUSE: 'mv1' may not be defined in pure subprogram 'subr' because it is host-associated deallocate(mv1%pc) - !ERROR: Object in DEALLOCATE statement is not deallocatable - !BECAUSE: 'pp1' is polymorphic in a pure subprogram + !ERROR: Name in DEALLOCATE statement is not definable + !BECAUSE: 'pp1' is a whole polymorphic object in a pure subprogram deallocate(pp1) - !ERROR: Object in DEALLOCATE statement is not deallocatable - !BECAUSE: 'pc' is polymorphic in a pure subprogram + !ERROR: Name in DEALLOCATE statement is not definable + !BECAUSE: 'pc' is a whole polymorphic object in a pure subprogram deallocate(pp2%pc) !ERROR: Object in DEALLOCATE statement is not deallocatable !BECAUSE: 'mp2' has polymorphic component '%pc' in a pure subprogram diff --git a/flang/test/Semantics/declarations05.f90 b/flang/test/Semantics/declarations05.f90 index b6dab7aeea0bc..b1e3d3c773160 100644 --- a/flang/test/Semantics/declarations05.f90 +++ 
b/flang/test/Semantics/declarations05.f90 @@ -22,7 +22,7 @@ impure subroutine final(x) end pure subroutine test !ERROR: 'x0' may not be a local variable in a pure subprogram - !BECAUSE: 'x0' is polymorphic in a pure subprogram + !BECAUSE: 'x0' is a whole polymorphic object in a pure subprogram class(t0), allocatable :: x0 !ERROR: 'x1' may not be a local variable in a pure subprogram !BECAUSE: 'x1' has an impure FINAL procedure 'final' From flang-commits at lists.llvm.org Mon May 12 09:52:45 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Mon, 12 May 2025 09:52:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414) In-Reply-To: Message-ID: <6822275d.170a0220.38fc6.77cc@mx.google.com> https://github.com/wangzpgi updated https://github.com/llvm/llvm-project/pull/139414 >From 38d7efcebee251a71c7bbcfb9de3429755c32210 Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Sat, 10 May 2025 15:44:35 -0700 Subject: [PATCH 1/3] Fix CUDA implicit data transfer entity creation --- flang/lib/Lower/Bridge.cpp | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 43375e84f21fa..bfe8898ebff3d 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -4778,7 +4778,13 @@ class FirConverter : public Fortran::lower::AbstractConverter { nbDeviceResidentObject <= 1 && "Only one reference to the device resident object is supported"); auto addr = getSymbolAddress(sym); - hlfir::Entity entity{addr}; + mlir::Value baseValue; + if (auto declareOp = llvm::dyn_cast(addr.getDefiningOp())) + baseValue = declareOp.getBase(); + else + baseValue = addr; + + hlfir::Entity entity{baseValue}; auto [temp, cleanup] = hlfir::createTempFromMold(loc, builder, entity); auto needCleanup = fir::getIntIfConstant(cleanup); >From 3347add1c1b2f56e7adc06d4261dc1f0735eb207 Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Sat, 10 
May 2025 16:53:44 -0700 Subject: [PATCH 2/3] fix format; add test --- flang/lib/Lower/Bridge.cpp | 3 ++- flang/test/Lower/CUDA/cuda-managed.cuf | 24 ++++++++++++++++++++++++ 2 files changed, 26 insertions(+), 1 deletion(-) create mode 100644 flang/test/Lower/CUDA/cuda-managed.cuf diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index bfe8898ebff3d..cf9a322680321 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -4779,7 +4779,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { "Only one reference to the device resident object is supported"); auto addr = getSymbolAddress(sym); mlir::Value baseValue; - if (auto declareOp = llvm::dyn_cast(addr.getDefiningOp())) + if (auto declareOp = + llvm::dyn_cast(addr.getDefiningOp())) baseValue = declareOp.getBase(); else baseValue = addr; diff --git a/flang/test/Lower/CUDA/cuda-managed.cuf b/flang/test/Lower/CUDA/cuda-managed.cuf new file mode 100644 index 0000000000000..618a57da53a25 --- /dev/null +++ b/flang/test/Lower/CUDA/cuda-managed.cuf @@ -0,0 +1,24 @@ +! 
RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s + +subroutine testr2(N1,N2) + real(4), managed :: ai4(N1,N2) + real(4), allocatable :: bRefi4(:) + + integer :: i1, i2 + + do i2 = 1, N2 + do i1 = 1, N1 + ai4(i1,i2) = i1 + N1*(i2-1) + enddo + enddo + + allocate(bRefi4 (N1)) + do i1 = 1, N1 + bRefi4(i1) = (ai4(i1,1)+ai4(i1,N2))*N2/2 + enddo + deallocate(bRefi4) + +end subroutine + +!CHECK-LABEL: func.func @_QPtestr2 +!CHECK: %{{.*}} = cuf.alloc !fir.array, %{{.*}}, %{{.*}} : index, index {bindc_name = "ai4", data_attr = #cuf.cuda, uniq_name = "_QFtestr2Eai4"} -> !fir.ref> >From 470a6b707e10b6a9a3ed2205d27b2937f162de0c Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Mon, 12 May 2025 09:52:32 -0700 Subject: [PATCH 3/3] update test --- flang/test/Lower/CUDA/cuda-managed.cuf | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/flang/test/Lower/CUDA/cuda-managed.cuf b/flang/test/Lower/CUDA/cuda-managed.cuf index 618a57da53a25..e14bd849670b1 100644 --- a/flang/test/Lower/CUDA/cuda-managed.cuf +++ b/flang/test/Lower/CUDA/cuda-managed.cuf @@ -21,4 +21,7 @@ subroutine testr2(N1,N2) end subroutine !CHECK-LABEL: func.func @_QPtestr2 -!CHECK: %{{.*}} = cuf.alloc !fir.array, %{{.*}}, %{{.*}} : index, index {bindc_name = "ai4", data_attr = #cuf.cuda, uniq_name = "_QFtestr2Eai4"} -> !fir.ref> +!CHECK: %[[ALLOC:.*]] = cuf.alloc !fir.array, %{{.*}}, %{{.*}} : index, index {bindc_name = "ai4", data_attr = #cuf.cuda, uniq_name = "_QFtestr2Eai4"} -> !fir.ref> +!CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[ALLOC]](%{{.*}}) {data_attr = #cuf.cuda, uniq_name = "_QFtestr2Eai4"} : (!fir.ref>, !fir.shape<2>) -> (!fir.box>, !fir.ref>) +!CHECK: %[[DEST:.*]] = hlfir.designate %[[DECLARE]]#0 (%{{.*}}, %{{.*}}) : (!fir.box>, i64, i64) -> !fir.ref +!CHECK: cuf.data_transfer %{{.*}}#0 to %[[DEST]] {transfer_kind = #cuf.cuda_transfer} : !fir.ref, !fir.ref From flang-commits at lists.llvm.org Mon May 12 09:53:36 2025 From: flang-commits at lists.llvm.org (Valentin Clement 
バレンタイン クレメン via flang-commits) Date: Mon, 12 May 2025 09:53:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414) In-Reply-To: Message-ID: <68222790.170a0220.3b67f8.c992@mx.google.com> https://github.com/clementval approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/139414 From flang-commits at lists.llvm.org Mon May 12 09:54:45 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Mon, 12 May 2025 09:54:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414) In-Reply-To: Message-ID: <682227d5.170a0220.2e20fd.5ffe@mx.google.com> https://github.com/wangzpgi updated https://github.com/llvm/llvm-project/pull/139414


From flang-commits at lists.llvm.org Mon May 12 10:00:19 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Mon, 12 May 2025 10:00:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414) In-Reply-To: Message-ID: <68222923.170a0220.10096a.88fa@mx.google.com> https://github.com/wangzpgi updated https://github.com/llvm/llvm-project/pull/139414

From flang-commits at lists.llvm.org Mon May 12 10:01:31 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Mon, 12 May 2025 10:01:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Require contiguous actual pointer for contiguous dummy pointer (PR #139298) In-Reply-To: Message-ID: <6822296b.170a0220.274288.9f4a@mx.google.com> https://github.com/DanielCChen approved this pull request. LGTM. It fixed our full test case. Thanks. https://github.com/llvm/llvm-project/pull/139298 From flang-commits at lists.llvm.org Mon May 12 10:06:42 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 10:06:42 -0700 (PDT) Subject: [flang-commits] [flang] eef4b5a - [flang] [cuda] Fix CUDA implicit data transfer entity creation (#139414) Message-ID: <68222aa2.a70a0220.3cb45c.c9e3@mx.google.com> Author: Zhen Wang Date: 2025-05-12T10:06:39-07:00 New Revision: eef4b5a0cdf102e5035d6d4f1aa5f85b2b787e84 URL: https://github.com/llvm/llvm-project/commit/eef4b5a0cdf102e5035d6d4f1aa5f85b2b787e84 DIFF: https://github.com/llvm/llvm-project/commit/eef4b5a0cdf102e5035d6d4f1aa5f85b2b787e84.diff LOG: [flang] [cuda] Fix CUDA implicit data transfer entity creation (#139414) Fixed an issue in `genCUDAImplicitDataTransfer` where creating an `hlfir::Entity` from a symbol address could fail when the address comes from a `hlfir.declare` operation. Fix is to check if the address comes from a `hlfir.declare` operation. If so, use the base value from the declare op when available. Falling back to the original address otherwise. 
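The selection rule described in the log message above can be sketched outside of MLIR as follows. Everything here is a hypothetical stand-in (`Operation`, `DeclareOp`, `OtherOp`, `Value`, `selectBase`); none of it is the MLIR or flang API, and `dynamic_cast` merely plays the role that an LLVM-style checked cast plays in the real code. The sketch also treats a value with no defining operation as the fallback case; whether that situation can arise at this point in the lowering is not stated in the message.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins: these are not MLIR or flang classes.
struct Operation {
  virtual ~Operation() = default;
};

// Models an operation that, like hlfir.declare, exposes a "base" result.
struct DeclareOp : Operation {
  std::string base;
  explicit DeclareOp(std::string b) : base(std::move(b)) {}
};

// Models any other producer of an address.
struct OtherOp : Operation {};

// Models an SSA value: a name plus the operation that produced it, if any.
struct Value {
  std::string name;
  const Operation *definingOp = nullptr;
};

// Prefer the declare op's base; otherwise keep the original address.
// dynamic_cast on a null pointer yields null, so a value without a
// defining operation also takes the fallback path.
std::string selectBase(const Value &addr) {
  if (const auto *declare = dynamic_cast<const DeclareOp *>(addr.definingOp)) {
    return declare->base;
  }
  return addr.name;
}

// Exercises the declare case and both fallback cases.
void demoSelectBase() {
  const DeclareOp declare{"ai4.base"};
  const OtherOp other;
  assert(selectBase(Value{"ai4.addr", &declare}) == "ai4.base");
  assert(selectBase(Value{"tmp.addr", &other}) == "tmp.addr");
  assert(selectBase(Value{"arg0", nullptr}) == "arg0");
}
```

The point of the fallback is that the entity is always built from some usable value: the declared base when one exists, and the unchanged address in every other case.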
Added: flang/test/Lower/CUDA/cuda-managed.cuf Modified: flang/lib/Lower/Bridge.cpp Removed: ################################################################################ diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 43375e84f21fa..cf9a322680321 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -4778,7 +4778,14 @@ class FirConverter : public Fortran::lower::AbstractConverter { nbDeviceResidentObject <= 1 && "Only one reference to the device resident object is supported"); auto addr = getSymbolAddress(sym); - hlfir::Entity entity{addr}; + mlir::Value baseValue; + if (auto declareOp = + llvm::dyn_cast(addr.getDefiningOp())) + baseValue = declareOp.getBase(); + else + baseValue = addr; + + hlfir::Entity entity{baseValue}; auto [temp, cleanup] = hlfir::createTempFromMold(loc, builder, entity); auto needCleanup = fir::getIntIfConstant(cleanup); diff --git a/flang/test/Lower/CUDA/cuda-managed.cuf b/flang/test/Lower/CUDA/cuda-managed.cuf new file mode 100644 index 0000000000000..e14bd849670b1 --- /dev/null +++ b/flang/test/Lower/CUDA/cuda-managed.cuf @@ -0,0 +1,27 @@ +! 
RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s + +subroutine testr2(N1,N2) + real(4), managed :: ai4(N1,N2) + real(4), allocatable :: bRefi4(:) + + integer :: i1, i2 + + do i2 = 1, N2 + do i1 = 1, N1 + ai4(i1,i2) = i1 + N1*(i2-1) + enddo + enddo + + allocate(bRefi4 (N1)) + do i1 = 1, N1 + bRefi4(i1) = (ai4(i1,1)+ai4(i1,N2))*N2/2 + enddo + deallocate(bRefi4) + +end subroutine + +!CHECK-LABEL: func.func @_QPtestr2 +!CHECK: %[[ALLOC:.*]] = cuf.alloc !fir.array, %{{.*}}, %{{.*}} : index, index {bindc_name = "ai4", data_attr = #cuf.cuda, uniq_name = "_QFtestr2Eai4"} -> !fir.ref> +!CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[ALLOC]](%{{.*}}) {data_attr = #cuf.cuda, uniq_name = "_QFtestr2Eai4"} : (!fir.ref>, !fir.shape<2>) -> (!fir.box>, !fir.ref>) +!CHECK: %[[DEST:.*]] = hlfir.designate %[[DECLARE]]#0 (%{{.*}}, %{{.*}}) : (!fir.box>, i64, i64) -> !fir.ref +!CHECK: cuf.data_transfer %{{.*}}#0 to %[[DEST]] {transfer_kind = #cuf.cuda_transfer} : !fir.ref, !fir.ref From flang-commits at lists.llvm.org Mon May 12 10:06:46 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Mon, 12 May 2025 10:06:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] Fix CUDA implicit data transfer entity creation (PR #139414) In-Reply-To: Message-ID: <68222aa6.170a0220.17617b.90fc@mx.google.com> https://github.com/wangzpgi closed https://github.com/llvm/llvm-project/pull/139414 From flang-commits at lists.llvm.org Mon May 12 10:17:18 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 12 May 2025 10:17:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Postpone hlfir.end_associate generation for calls. 
(PR #138786) In-Reply-To: Message-ID: <68222d1e.170a0220.2e20fd.6e5c@mx.google.com> https://github.com/vzakhari updated https://github.com/llvm/llvm-project/pull/138786 >From 5c3fa17a453fea8fdfaffccbe3feac6fe1997fcf Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Tue, 6 May 2025 16:38:48 -0700 Subject: [PATCH 1/5] [flang] Postpone hlfir.end_associate generation for calls. If we generate hlfir.end_associate at the end of the statement, we get easier optimizable HLFIR, because there are no compiler generated operations with side-effects in between the call and the consumers. This allows more hlfir.eval_in_mem to reuse the LHS instead of allocating temporary buffer. I do not think the same can be done for hlfir.copy_out always, e.g.: ``` subroutine test2(x) interface function array_func2(x,y) real:: x(*), array_func2(10), y end function array_func2 end interface real :: x(:) x = array_func2(x, 1.0) end subroutine test2 ``` If we postpone the copy-out until after the assignment, then the result may be wrong. --- flang/lib/Lower/ConvertCall.cpp | 42 +++++++-- .../Lower/HLFIR/call-postponed-associate.f90 | 85 +++++++++++++++++++ 2 files changed, 121 insertions(+), 6 deletions(-) create mode 100644 flang/test/Lower/HLFIR/call-postponed-associate.f90 diff --git a/flang/lib/Lower/ConvertCall.cpp b/flang/lib/Lower/ConvertCall.cpp index a5b85e25b1af0..d37d51f6ec634 100644 --- a/flang/lib/Lower/ConvertCall.cpp +++ b/flang/lib/Lower/ConvertCall.cpp @@ -960,9 +960,26 @@ struct CallCleanUp { mlir::Value tempVar; mlir::Value mustFree; }; - void genCleanUp(mlir::Location loc, fir::FirOpBuilder &builder) { - Fortran::common::visit([&](auto &c) { c.genCleanUp(loc, builder); }, + + /// Generate clean-up code. + /// If \p postponeAssociates is true, the ExprAssociate clean-up + /// is not generated, and instead the corresponding CallCleanUp + /// object is returned as the result. 
+ std::optional genCleanUp(mlir::Location loc, + fir::FirOpBuilder &builder, + bool postponeAssociates) { + std::optional postponed; + Fortran::common::visit(Fortran::common::visitors{ + [&](CopyIn &c) { c.genCleanUp(loc, builder); }, + [&](ExprAssociate &c) { + if (postponeAssociates) + postponed = CallCleanUp{c}; + else + c.genCleanUp(loc, builder); + }, + }, cleanUp); + return postponed; } std::variant cleanUp; }; @@ -1729,10 +1746,23 @@ genUserCall(Fortran::lower::PreparedActualArguments &loweredActuals, caller, callSiteType, callContext.resultType, callContext.isElementalProcWithArrayArgs()); - /// Clean-up associations and copy-in. - for (auto cleanUp : callCleanUps) - cleanUp.genCleanUp(loc, builder); - + // Clean-up associations and copy-in. + // The association clean-ups are postponed to the end of the statement + // lowering. The copy-in clean-ups may be delayed as well, + // but they are done immediately after the call currently. + llvm::SmallVector associateCleanups; + for (auto cleanUp : callCleanUps) { + auto postponed = + cleanUp.genCleanUp(loc, builder, /*postponeAssociates=*/true); + if (postponed) + associateCleanups.push_back(*postponed); + } + + fir::FirOpBuilder *bldr = &builder; + callContext.stmtCtx.attachCleanup([=]() { + for (auto cleanUp : associateCleanups) + (void)cleanUp.genCleanUp(loc, *bldr, /*postponeAssociates=*/false); + }); if (auto *entity = std::get_if(&loweredResult)) return *entity; diff --git a/flang/test/Lower/HLFIR/call-postponed-associate.f90 b/flang/test/Lower/HLFIR/call-postponed-associate.f90 new file mode 100644 index 0000000000000..18df62b44324b --- /dev/null +++ b/flang/test/Lower/HLFIR/call-postponed-associate.f90 @@ -0,0 +1,85 @@ +! RUN: bbc -emit-hlfir -o - %s -I nowhere | FileCheck %s + +subroutine test1 + interface + function array_func1(x) + real:: x, array_func1(10) + end function array_func1 + end interface + real :: x(10) + x = array_func1(1.0) +end subroutine test1 +! CHECK-LABEL: func.func @_QPtest1() { +! 
CHECK: %[[VAL_5:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_17:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: fir.call @_QParray_func1 +! CHECK: fir.save_result +! CHECK: } +! CHECK: hlfir.assign %[[VAL_17]] to %{{.*}} : !hlfir.expr<10xf32>, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 + +subroutine test2(x) + interface + function array_func2(x,y) + real:: x(*), array_func2(10), y + end function array_func2 + end interface + real :: x(:) + x = array_func2(x, 1.0) +end subroutine test2 +! CHECK-LABEL: func.func @_QPtest2( +! CHECK: %[[VAL_3:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_4:.*]]:2 = hlfir.copy_in %{{.*}} to %{{.*}} : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +! CHECK: %[[VAL_5:.*]] = fir.box_addr %[[VAL_4]]#0 : (!fir.box>) -> !fir.ref> +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_3]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_17:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: ^bb0(%[[VAL_18:.*]]: !fir.ref>): +! CHECK: %[[VAL_19:.*]] = fir.call @_QParray_func2(%[[VAL_5]], %[[VAL_6]]#0) fastmath : (!fir.ref>, !fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_19]] to %[[VAL_18]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: hlfir.copy_out %{{.*}}, %[[VAL_4]]#1 to %{{.*}} : (!fir.ref>>>, i1, !fir.box>) -> () +! CHECK: hlfir.assign %[[VAL_17]] to %{{.*}} : !hlfir.expr<10xf32>, !fir.box> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_17]] : !hlfir.expr<10xf32> + +subroutine test3(x) + interface + function array_func3(x) + real :: x, array_func3(10) + end function array_func3 + end interface + logical :: x + if (any(array_func3(1.0).le.array_func3(2.0))) x = .true. 
+end subroutine test3 +! CHECK-LABEL: func.func @_QPtest3( +! CHECK: %[[VAL_2:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_3:.*]]:3 = hlfir.associate %[[VAL_2]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_14:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: ^bb0(%[[VAL_15:.*]]: !fir.ref>): +! CHECK: %[[VAL_16:.*]] = fir.call @_QParray_func3(%[[VAL_3]]#0) fastmath : (!fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_16]] to %[[VAL_15]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: %[[VAL_17:.*]] = arith.constant 2.000000e+00 : f32 +! CHECK: %[[VAL_18:.*]]:3 = hlfir.associate %[[VAL_17]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_29:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: ^bb0(%[[VAL_30:.*]]: !fir.ref>): +! CHECK: %[[VAL_31:.*]] = fir.call @_QParray_func3(%[[VAL_18]]#0) fastmath : (!fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_31]] to %[[VAL_30]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: %[[VAL_32:.*]] = hlfir.elemental %{{.*}} unordered : (!fir.shape<1>) -> !hlfir.expr> { +! CHECK: ^bb0(%[[VAL_33:.*]]: index): +! CHECK: %[[VAL_34:.*]] = hlfir.apply %[[VAL_14]], %[[VAL_33]] : (!hlfir.expr<10xf32>, index) -> f32 +! CHECK: %[[VAL_35:.*]] = hlfir.apply %[[VAL_29]], %[[VAL_33]] : (!hlfir.expr<10xf32>, index) -> f32 +! CHECK: %[[VAL_36:.*]] = arith.cmpf ole, %[[VAL_34]], %[[VAL_35]] fastmath : f32 +! CHECK: %[[VAL_37:.*]] = fir.convert %[[VAL_36]] : (i1) -> !fir.logical<4> +! CHECK: hlfir.yield_element %[[VAL_37]] : !fir.logical<4> +! CHECK: } +! CHECK: %[[VAL_38:.*]] = hlfir.any %[[VAL_32]] : (!hlfir.expr>) -> !fir.logical<4> +! CHECK: hlfir.destroy %[[VAL_32]] : !hlfir.expr> +! CHECK: hlfir.end_associate %[[VAL_18]]#1, %[[VAL_18]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_29]] : !hlfir.expr<10xf32> +! 
CHECK: hlfir.end_associate %[[VAL_3]]#1, %[[VAL_3]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_14]] : !hlfir.expr<10xf32> +! CHECK: %[[VAL_39:.*]] = fir.convert %[[VAL_38]] : (!fir.logical<4>) -> i1 +! CHECK: fir.if %[[VAL_39]] { >From 3c5c9a25a61786a0d002b4aec4e542c79624fdbd Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Tue, 6 May 2025 19:14:39 -0700 Subject: [PATCH 2/5] Added test changes missing from the original patch. --- flang/test/Lower/HLFIR/entry_return.f90 | 8 ++++---- flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/flang/test/Lower/HLFIR/entry_return.f90 b/flang/test/Lower/HLFIR/entry_return.f90 index 5d3e160af2df6..18fb2b571b950 100644 --- a/flang/test/Lower/HLFIR/entry_return.f90 +++ b/flang/test/Lower/HLFIR/entry_return.f90 @@ -51,13 +51,13 @@ logical function f2() ! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_4]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_8:.*]] = fir.call @_QPcomplex(%[[VAL_6]]#0, %[[VAL_7]]#0) fastmath : (!fir.ref, !fir.ref) -> f32 -! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 -! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_9:.*]] = arith.constant 0.000000e+00 : f32 ! CHECK: %[[VAL_10:.*]] = fir.undefined complex ! CHECK: %[[VAL_11:.*]] = fir.insert_value %[[VAL_10]], %[[VAL_8]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[VAL_12:.*]] = fir.insert_value %[[VAL_11]], %[[VAL_9]], [1 : index] : (complex, f32) -> complex ! CHECK: hlfir.assign %[[VAL_12]] to %[[VAL_1]]#0 : complex, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_3]]#0 : !fir.ref> ! CHECK: return %[[VAL_13]] : !fir.logical<4> ! 
CHECK: } @@ -74,13 +74,13 @@ logical function f2() ! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_4]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_8:.*]] = fir.call @_QPcomplex(%[[VAL_6]]#0, %[[VAL_7]]#0) fastmath : (!fir.ref, !fir.ref) -> f32 -! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 -! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_9:.*]] = arith.constant 0.000000e+00 : f32 ! CHECK: %[[VAL_10:.*]] = fir.undefined complex ! CHECK: %[[VAL_11:.*]] = fir.insert_value %[[VAL_10]], %[[VAL_8]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[VAL_12:.*]] = fir.insert_value %[[VAL_11]], %[[VAL_9]], [1 : index] : (complex, f32) -> complex ! CHECK: hlfir.assign %[[VAL_12]] to %[[VAL_1]]#0 : complex, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_1]]#0 : !fir.ref> ! CHECK: return %[[VAL_13]] : complex ! CHECK: } diff --git a/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 b/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 index 28659a33d0893..206b6e4e9b797 100644 --- a/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 +++ b/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 @@ -32,8 +32,8 @@ real function test1(x) ! CHECK: %[[VAL_7:.*]] = fir.load %[[VAL_6]] : !fir.ref) -> f32>> ! CHECK: %[[VAL_8:.*]] = fir.box_addr %[[VAL_7]] : (!fir.boxproc<(!fir.ref) -> f32>) -> ((!fir.ref) -> f32) ! CHECK: %[[VAL_9:.*]] = fir.call %[[VAL_8]](%[[VAL_5]]#0) fastmath : (!fir.ref) -> f32 -! CHECK: hlfir.end_associate %[[VAL_5]]#1, %[[VAL_5]]#2 : !fir.ref, i1 ! CHECK: hlfir.assign %[[VAL_9]] to %[[VAL_2]]#0 : f32, !fir.ref +! 
CHECK: hlfir.end_associate %[[VAL_5]]#1, %[[VAL_5]]#2 : !fir.ref, i1 subroutine test2(x) use proc_comp_defs, only : t, iface >From b895e18becd575c6a59baf02e7eb82ed82e44f82 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Thu, 8 May 2025 12:18:07 -0700 Subject: [PATCH 3/5] Fixed clean-ups insertion for atomic capture. --- flang/lib/Lower/OpenMP/OpenMP.cpp | 4 +++- flang/test/Lower/OpenMP/atomic-capture.f90 | 20 ++++++++++++++++++++ flang/test/Lower/OpenMP/atomic-update.f90 | 21 +++++++++++++++++++++ 3 files changed, 44 insertions(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 446aa2deb3d05..2af7bd25b0754 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3316,7 +3316,9 @@ static void genAtomicCapture(lower::AbstractConverter &converter, } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); + // The clean-ups associated with the statements inside the capture + // construct must be generated after the AtomicCaptureOp. + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b5c8edc8f31c1 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -97,3 +97,23 @@ subroutine pointers_in_atomic_capture() b = a !$omp end atomic end subroutine + +! Check that the clean-ups associated with the function call +! are generated after the omp.atomic.capture operation: +! CHECK-LABEL: func.func @_QPfunc_call_cleanup( +subroutine func_call_cleanup(x, v, vv) + integer :: x, v, vv + +! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! 
CHECK: %[[VAL_8:.*]] = fir.call @_QPfunc(%[[VAL_7]]#0) fastmath : (!fir.ref) -> f32 +! CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_8]] : (f32) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %{{.*}} = %[[VAL_3:.*]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[VAL_3]]#0 = %[[VAL_9]] : !fir.ref, i32 +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 + !$omp atomic capture + v = x + x = func(vv + 1) + !$omp end atomic +end subroutine func_call_cleanup diff --git a/flang/test/Lower/OpenMP/atomic-update.f90 b/flang/test/Lower/OpenMP/atomic-update.f90 index 257ae8fb497ff..3f840acefa6e8 100644 --- a/flang/test/Lower/OpenMP/atomic-update.f90 +++ b/flang/test/Lower/OpenMP/atomic-update.f90 @@ -219,3 +219,24 @@ program OmpAtomicUpdate !$omp atomic update w = w + g end program OmpAtomicUpdate + +! Check that the clean-ups associated with the function call +! are generated after the omp.atomic.update operation: +! CHECK-LABEL: func.func @_QPfunc_call_cleanup( +subroutine func_call_cleanup(v, vv) + integer v, vv + +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_7:.*]] = fir.call @_QPfunc(%[[VAL_6]]#0) fastmath : (!fir.ref) -> f32 +! CHECK: omp.atomic.update %{{.*}} : !fir.ref { +! CHECK: ^bb0(%[[VAL_8:.*]]: i32): +! CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_8]] : (i32) -> f32 +! CHECK: %[[VAL_10:.*]] = arith.addf %[[VAL_9]], %[[VAL_7]] fastmath : f32 +! CHECK: %[[VAL_11:.*]] = fir.convert %[[VAL_10]] : (f32) -> i32 +! CHECK: omp.yield(%[[VAL_11]] : i32) +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 + !$omp atomic update + v = v + func(vv + 1) + !$omp end atomic +end subroutine func_call_cleanup >From 6ca7f878a5fc0b5b2876b573a345a53569f5cfbe Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Thu, 8 May 2025 16:13:01 -0700 Subject: [PATCH 4/5] Fixed atomic capture cases with atomic update inside. 
--- flang/lib/Lower/OpenMP/OpenMP.cpp | 20 +++++++--- flang/test/Lower/OpenMP/atomic-capture.f90 | 44 ++++++++++++++++++++-- 2 files changed, 55 insertions(+), 9 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 2af7bd25b0754..4909c3e277a07 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2816,7 +2816,8 @@ static void genAtomicUpdateStatement( const parser::Expr &assignmentStmtExpr, const parser::OmpAtomicClauseList *leftHandClauseList, const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { + mlir::Operation *atomicCaptureOp = nullptr, + lower::StatementContext *atomicCaptureStmtCtx = nullptr) { // Generate `atomic.update` operation for atomic assignment statements fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); mlir::Location currentLocation = converter.getCurrentLocation(); @@ -2890,15 +2891,24 @@ static void genAtomicUpdateStatement( }, assignmentStmtExpr.u); lower::StatementContext nonAtomicStmtCtx; + lower::StatementContext *stmtCtxPtr = &nonAtomicStmtCtx; if (!nonAtomicSubExprs.empty()) { // Generate non atomic part before all the atomic operations. auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) + if (atomicCaptureOp) { + assert(atomicCaptureStmtCtx && "must specify statement context"); firOpBuilder.setInsertionPoint(atomicCaptureOp); + // Any clean-ups associated with the expression lowering + // must also be generated outside of the atomic update operation + // and after the atomic capture operation. + // The atomicCaptureStmtCtx will be finalized at the end + // of the atomic capture operation generation. 
+ stmtCtxPtr = atomicCaptureStmtCtx; + } mlir::Value nonAtomicVal; for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + currentLocation, *nonAtomicSubExpr, *stmtCtxPtr)); exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); } if (atomicCaptureOp) @@ -3238,7 +3248,7 @@ static void genAtomicCapture(lower::AbstractConverter &converter, genAtomicUpdateStatement( converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp, &stmtCtx); } else { // Atomic capture construct is of the form [capture-stmt, write-stmt] firOpBuilder.setInsertionPoint(atomicCaptureOp); @@ -3284,7 +3294,7 @@ static void genAtomicCapture(lower::AbstractConverter &converter, genAtomicUpdateStatement( converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp, &stmtCtx); if (stmt1VarType != stmt2VarType) { mlir::Value alloca; diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index b5c8edc8f31c1..2f800d534dc36 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -102,18 +102,54 @@ subroutine pointers_in_atomic_capture() ! are generated after the omp.atomic.capture operation: ! CHECK-LABEL: func.func @_QPfunc_call_cleanup( subroutine func_call_cleanup(x, v, vv) + interface + integer function func(x) + integer :: x + end function func + end interface integer :: x, v, vv ! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -! CHECK: %[[VAL_8:.*]] = fir.call @_QPfunc(%[[VAL_7]]#0) fastmath : (!fir.ref) -> f32 -! 
CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_8]] : (f32) -> i32 +! CHECK: %[[VAL_8:.*]] = fir.call @_QPfunc(%[[VAL_7]]#0) fastmath : (!fir.ref) -> i32 ! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.read %{{.*}} = %[[VAL_3:.*]]#0 : !fir.ref, !fir.ref, i32 -! CHECK: omp.atomic.write %[[VAL_3]]#0 = %[[VAL_9]] : !fir.ref, i32 +! CHECK: omp.atomic.read %[[VAL_1:.*]]#0 = %[[VAL_3:.*]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[VAL_3]]#0 = %[[VAL_8]] : !fir.ref, i32 ! CHECK: } ! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 !$omp atomic capture v = x x = func(vv + 1) !$omp end atomic + +! CHECK: %[[VAL_12:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_13:.*]] = fir.call @_QPfunc(%[[VAL_12]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[VAL_1]]#0 = %[[VAL_3]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.update %[[VAL_3]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_14:.*]]: i32): +! CHECK: %[[VAL_15:.*]] = arith.addi %[[VAL_13]], %[[VAL_14]] : i32 +! CHECK: omp.yield(%[[VAL_15]] : i32) +! CHECK: } +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_12]]#1, %[[VAL_12]]#2 : !fir.ref, i1 + !$omp atomic capture + v = x + x = func(vv + 1) + x + !$omp end atomic + +! CHECK: %[[VAL_19:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_20:.*]] = fir.call @_QPfunc(%[[VAL_19]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[VAL_3]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_21:.*]]: i32): +! CHECK: %[[VAL_22:.*]] = arith.addi %[[VAL_20]], %[[VAL_21]] : i32 +! CHECK: omp.yield(%[[VAL_22]] : i32) +! CHECK: } +! CHECK: omp.atomic.read %[[VAL_1]]#0 = %[[VAL_3]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: } +! 
CHECK: hlfir.end_associate %[[VAL_19]]#1, %[[VAL_19]]#2 : !fir.ref, i1 + !$omp atomic capture + x = func(vv + 1) + x + v = x + !$omp end atomic end subroutine func_call_cleanup >From 59dee19815b29bf7afde758e1ca33f04ff9e3685 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Fri, 9 May 2025 19:33:34 -0700 Subject: [PATCH 5/5] Fixed atomic handling for OpenACC. --- flang/lib/Lower/OpenACC.cpp | 24 ++++++-- .../test/Lower/OpenACC/acc-atomic-capture.f90 | 57 +++++++++++++++++++ .../test/Lower/OpenACC/acc-atomic-update.f90 | 18 +++++- 3 files changed, 92 insertions(+), 7 deletions(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 2f70041a04dde..e1918288d6de3 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -416,7 +416,8 @@ static inline void genAtomicUpdateStatement( Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { + mlir::Operation *atomicCaptureOp = nullptr, + Fortran::lower::StatementContext *atomicCaptureStmtCtx = nullptr) { // Generate `atomic.update` operation for atomic assignment statements fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); mlir::Location currentLocation = converter.getCurrentLocation(); @@ -496,15 +497,24 @@ static inline void genAtomicUpdateStatement( }, assignmentStmtExpr.u); Fortran::lower::StatementContext nonAtomicStmtCtx; + Fortran::lower::StatementContext *stmtCtxPtr = &nonAtomicStmtCtx; if (!nonAtomicSubExprs.empty()) { // Generate non atomic part before all the atomic operations. 
auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) + if (atomicCaptureOp) { + assert(atomicCaptureStmtCtx && "must specify statement context"); firOpBuilder.setInsertionPoint(atomicCaptureOp); + // Any clean-ups associated with the expression lowering + // must also be generated outside of the atomic update operation + // and after the atomic capture operation. + // The atomicCaptureStmtCtx will be finalized at the end + // of the atomic capture operation generation. + stmtCtxPtr = atomicCaptureStmtCtx; + } mlir::Value nonAtomicVal; for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + currentLocation, *nonAtomicSubExpr, *stmtCtxPtr)); exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); } if (atomicCaptureOp) @@ -652,7 +662,7 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, elementType, loc); genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, - stmt2Expr, loc, atomicCaptureOp); + stmt2Expr, loc, atomicCaptureOp, &stmtCtx); } else { // Atomic capture construct is of the form [capture-stmt, write-stmt] firOpBuilder.setInsertionPoint(atomicCaptureOp); @@ -672,13 +682,15 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, *Fortran::semantics::GetExpr(stmt2Expr); mlir::Type elementType = converter.genType(fromExpr); genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, - stmt1Expr, loc, atomicCaptureOp); + stmt1Expr, loc, atomicCaptureOp, &stmtCtx); genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, loc); } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); + // The clean-ups associated with the statements inside the capture + // construct must be generated after the AtomicCaptureOp. 
+ firOpBuilder.setInsertionPointAfter(atomicCaptureOp); } template diff --git a/flang/test/Lower/OpenACC/acc-atomic-capture.f90 b/flang/test/Lower/OpenACC/acc-atomic-capture.f90 index 82059908bcd0b..ee38ab6ce826a 100644 --- a/flang/test/Lower/OpenACC/acc-atomic-capture.f90 +++ b/flang/test/Lower/OpenACC/acc-atomic-capture.f90 @@ -306,3 +306,60 @@ end subroutine comp_ref_in_atomic_capture2 ! CHECK: } ! CHECK: acc.atomic.read %[[V_DECL]]#0 = %[[C]] : !fir.ref, !fir.ref, i32 ! CHECK: } + +! CHECK-LABEL: func.func @_QPatomic_capture_with_associate() { +subroutine atomic_capture_with_associate + interface + integer function func(x) + integer :: x + end function func + end interface +! CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "_QFatomic_capture_with_associateEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Y_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "_QFatomic_capture_with_associateEy"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Z_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "_QFatomic_capture_with_associateEz"} : (!fir.ref) -> (!fir.ref, !fir.ref) + integer :: x, y, z + +! CHECK: %[[VAL_10:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_11:.*]] = fir.call @_QPfunc(%[[VAL_10]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: acc.atomic.capture { +! CHECK: acc.atomic.read %[[X_DECL]]#0 = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: acc.atomic.write %[[Y_DECL]]#0 = %[[VAL_11]] : !fir.ref, i32 +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_10]]#1, %[[VAL_10]]#2 : !fir.ref, i1 + !$acc atomic capture + x = y + y = func(z + 1) + !$acc end atomic + +! CHECK: %[[VAL_15:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_16:.*]] = fir.call @_QPfunc(%[[VAL_15]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: acc.atomic.capture { +! CHECK: acc.atomic.update %[[Y_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_17:.*]]: i32): +! 
CHECK: %[[VAL_18:.*]] = arith.muli %[[VAL_16]], %[[VAL_17]] : i32 +! CHECK: acc.yield %[[VAL_18]] : i32 +! CHECK: } +! CHECK: acc.atomic.read %[[X_DECL]]#0 = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_15]]#1, %[[VAL_15]]#2 : !fir.ref, i1 + !$acc atomic capture + y = func(z + 1) * y + x = y + !$acc end atomic + +! CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_23:.*]] = fir.call @_QPfunc(%[[VAL_22]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: acc.atomic.capture { +! CHECK: acc.atomic.read %[[X_DECL]]#0 = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: acc.atomic.update %[[Y_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_24:.*]]: i32): +! CHECK: %[[VAL_25:.*]] = arith.addi %[[VAL_23]], %[[VAL_24]] : i32 +! CHECK: acc.yield %[[VAL_25]] : i32 +! CHECK: } +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_22]]#1, %[[VAL_22]]#2 : !fir.ref, i1 + !$acc atomic capture + x = y + y = func(z + 1) + y + !$acc end atomic +end subroutine atomic_capture_with_associate diff --git a/flang/test/Lower/OpenACC/acc-atomic-update.f90 b/flang/test/Lower/OpenACC/acc-atomic-update.f90 index da2972877244c..71aa69fd64eba 100644 --- a/flang/test/Lower/OpenACC/acc-atomic-update.f90 +++ b/flang/test/Lower/OpenACC/acc-atomic-update.f90 @@ -3,6 +3,11 @@ ! 
RUN: %flang_fc1 -fopenacc -emit-hlfir %s -o - | FileCheck %s program acc_atomic_update_test + interface + integer function func(x) + integer :: x + end function func + end interface integer :: x, y, z integer, pointer :: a, b integer, target :: c, d @@ -67,7 +72,18 @@ program acc_atomic_update_test !$acc atomic i1 = i1 + 1 !$acc end atomic + +!CHECK: %[[VAL_44:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +!CHECK: %[[VAL_45:.*]] = fir.call @_QPfunc(%[[VAL_44]]#0) fastmath : (!fir.ref) -> i32 +!CHECK: acc.atomic.update %[[X_DECL]]#0 : !fir.ref { +!CHECK: ^bb0(%[[VAL_46:.*]]: i32): +!CHECK: %[[VAL_47:.*]] = arith.addi %[[VAL_46]], %[[VAL_45]] : i32 +!CHECK: acc.yield %[[VAL_47]] : i32 +!CHECK: } +!CHECK: hlfir.end_associate %[[VAL_44]]#1, %[[VAL_44]]#2 : !fir.ref, i1 + !$acc atomic update + x = x + func(z + 1) + !$acc end atomic !CHECK: return !CHECK: } end program acc_atomic_update_test - From flang-commits at lists.llvm.org Mon May 12 10:31:51 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Mon, 12 May 2025 10:31:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Extend assumed-size array checking in intrinsic functions (PR #139339) In-Reply-To: Message-ID: <68223087.170a0220.11d0b4.91bb@mx.google.com> https://github.com/DanielCChen approved this pull request. LGTM. It fixed the reducer. Thanks for the quick fix. https://github.com/llvm/llvm-project/pull/139339 From flang-commits at lists.llvm.org Mon May 12 10:35:52 2025 From: flang-commits at lists.llvm.org (Razvan Lupusoru via flang-commits) Date: Mon, 12 May 2025 10:35:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Postpone hlfir.end_associate generation for calls. (PR #138786) In-Reply-To: Message-ID: <68223178.170a0220.352c29.be39@mx.google.com> https://github.com/razvanlupusoru approved this pull request. Thank you! 
https://github.com/llvm/llvm-project/pull/138786 From flang-commits at lists.llvm.org Mon May 12 10:45:38 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Mon, 12 May 2025 10:45:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang] PRIVATE statement in derived type applies to proc components (PR #139336) In-Reply-To: Message-ID: <682233c2.050a0220.3ae94.df56@mx.google.com> https://github.com/DanielCChen approved this pull request. LGTM. It fixed our test case. Thanks! https://github.com/llvm/llvm-project/pull/139336 From flang-commits at lists.llvm.org Mon May 12 10:47:10 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Mon, 12 May 2025 10:47:10 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Catch deferred type parameters in ALLOCATE(type-spec::) (PR #139334) In-Reply-To: Message-ID: <6822341e.170a0220.2b617e.998c@mx.google.com> https://github.com/DanielCChen approved this pull request. LGTM. It fixed our test case. Thanks! https://github.com/llvm/llvm-project/pull/139334 From flang-commits at lists.llvm.org Mon May 12 10:48:01 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Mon, 12 May 2025 10:48:01 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Stricter checking of v_list DIO arguments (PR #139329) In-Reply-To: Message-ID: <68223451.170a0220.301af4.9287@mx.google.com> https://github.com/DanielCChen approved this pull request. LGTM. It fixed our test case. Thanks. https://github.com/llvm/llvm-project/pull/139329 From flang-commits at lists.llvm.org Mon May 12 10:48:38 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Mon, 12 May 2025 10:48:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Emit error when DEFERRED binding overrides non-DEFERRED (PR #139325) In-Reply-To: Message-ID: <68223476.630a0220.d3a69.4c6a@mx.google.com> https://github.com/DanielCChen approved this pull request. LGTM. It fixed our test case. 
Thanks! https://github.com/llvm/llvm-project/pull/139325 From flang-commits at lists.llvm.org Mon May 12 10:48:55 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Mon, 12 May 2025 10:48:55 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) Message-ID: https://github.com/TIFitis created https://github.com/llvm/llvm-project/pull/139593 The current semantic check in place is incorrect, this patch fixes this. Up to 1 'default' named mapper is allowed for each derived type. The current semantic check only allows up to 1 'default' named mapper across all derived types. Co-authored-by: Raghu Maddhipatla >From a83bd68fdcb613d54c66f8503f522cc2b16a63a2 Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Mon, 12 May 2025 18:41:20 +0100 Subject: [PATCH] Fix semantic check for default declare mappers. --- flang/lib/Semantics/resolve-names.cpp | 21 ++++++++++++------- .../OpenMP/declare-mapper-symbols.f90 | 18 ++++++++-------- .../Semantics/OpenMP/declare-mapper03.f90 | 6 +----- 3 files changed, 23 insertions(+), 22 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index b2979690f78e7..1fd0ea007319d 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,6 +38,7 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" +#include "llvm/ADT/SmallVector.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1766,14 +1767,6 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses gets processed before // the type has been fully processed. 
BeginDeclTypeSpec(); - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const parser::CharBlock defaultName{"default", 7}; - MakeSymbol( - defaultName, Attrs{}, MiscDetails{MiscDetails::Kind::ConstructName}); - } PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); @@ -1783,6 +1776,18 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; + defaultNames.emplace_back( + type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); + MakeSymbol(defaultNames.back(), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); + } } void OmpVisitor::ProcessReductionSpecifier( diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index b4e03bd1632e5..0dda5b4456987 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -2,23 +2,23 @@ program main !CHECK-LABEL: MainProgram scope: main - implicit none + implicit none - type ty - integer :: x - end type ty - !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) - !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) + type ty + integer :: x + end type ty + !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) + !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) !! Note, symbols come out in their respective scope, but not in declaration order. 
-!CHECK: default: Misc ConstructName !CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x +!CHECK: ty.default: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) -!CHECK: OtherConstruct scope: +!CHECK: OtherConstruct scope: !CHECK: maptwo (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) - + end program main diff --git a/flang/test/Semantics/OpenMP/declare-mapper03.f90 b/flang/test/Semantics/OpenMP/declare-mapper03.f90 index b70b8a67f33e0..84fc3efafb3ad 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper03.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper03.f90 @@ -5,12 +5,8 @@ integer :: y end type t1 -type :: t2 - real :: y, z -end type t2 - !error: 'default' is already declared in this scoping unit !$omp declare mapper(t1::x) map(x, x%y) -!$omp declare mapper(t2::w) map(w, w%y, w%z) +!$omp declare mapper(t1::x) map(x) end From flang-commits at lists.llvm.org Mon May 12 10:49:28 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 10:49:28 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <682234a8.630a0220.8f157.3934@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Akash Banerjee (TIFitis)
Changes The current semantic check in place is incorrect, this patch fixes this. Up to 1 'default' named mapper is allowed for each derived type. The current semantic check only allows up to 1 'default' named mapper across all derived types. Co-authored-by: Raghu Maddhipatla <Raghu.Maddhipatla@amd.com> --- Full diff: https://github.com/llvm/llvm-project/pull/139593.diff 3 Files Affected: - (modified) flang/lib/Semantics/resolve-names.cpp (+13-8) - (modified) flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 (+9-9) - (modified) flang/test/Semantics/OpenMP/declare-mapper03.f90 (+1-5) ``````````diff diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index b2979690f78e7..1fd0ea007319d 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,6 +38,7 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" +#include "llvm/ADT/SmallVector.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1766,14 +1767,6 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses gets processed before // the type has been fully processed. 
BeginDeclTypeSpec(); - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const parser::CharBlock defaultName{"default", 7}; - MakeSymbol( - defaultName, Attrs{}, MiscDetails{MiscDetails::Kind::ConstructName}); - } PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); @@ -1783,6 +1776,18 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; + defaultNames.emplace_back( + type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); + MakeSymbol(defaultNames.back(), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); + } } void OmpVisitor::ProcessReductionSpecifier( diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index b4e03bd1632e5..0dda5b4456987 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -2,23 +2,23 @@ program main !CHECK-LABEL: MainProgram scope: main - implicit none + implicit none - type ty - integer :: x - end type ty - !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) - !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) + type ty + integer :: x + end type ty + !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) + !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) !! Note, symbols come out in their respective scope, but not in declaration order. 
-!CHECK: default: Misc ConstructName !CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x +!CHECK: ty.default: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) -!CHECK: OtherConstruct scope: +!CHECK: OtherConstruct scope: !CHECK: maptwo (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) - + end program main diff --git a/flang/test/Semantics/OpenMP/declare-mapper03.f90 b/flang/test/Semantics/OpenMP/declare-mapper03.f90 index b70b8a67f33e0..84fc3efafb3ad 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper03.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper03.f90 @@ -5,12 +5,8 @@ integer :: y end type t1 -type :: t2 - real :: y, z -end type t2 - !error: 'default' is already declared in this scoping unit !$omp declare mapper(t1::x) map(x, x%y) -!$omp declare mapper(t2::w) map(w, w%y, w%z) +!$omp declare mapper(t1::x) map(x) end ``````````
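The rule stated in the description above (at most one unnamed, i.e. 'default', mapper per derived type, instead of one across all types) can be modelled with a small standalone sketch. This is not flang's implementation: `MapperTable` and `Declare` are hypothetical names, and a plain `std::set` stands in for the scope's symbol table. Only the `<type>.default` key format is taken from the patch.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>

// Hypothetical stand-in for a scope's symbol table; not flang's Scope class.
class MapperTable {
public:
  // Returns true if the mapper name was newly declared, false if a mapper
  // with the same key already exists in this table.
  bool Declare(const std::string &typeName,
               const std::optional<std::string> &mapperName = std::nullopt) {
    // An unnamed mapper gets a per-type key such as "ty.default", so two
    // different types no longer collide on a single shared "default" name.
    std::string key = mapperName ? *mapperName : typeName + ".default";
    return names_.insert(key).second;
  }

private:
  std::set<std::string> names_;
};
```

Under this model, declaring unnamed mappers for `t1` and `t2` both succeed, while a second unnamed mapper for `t1` is rejected, which is the case the updated `declare-mapper03.f90` test exercises.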
https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Mon May 12 10:51:29 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Mon, 12 May 2025 10:51:29 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <68223521.170a0220.311f7d.aa66@mx.google.com> https://github.com/TIFitis edited https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Mon May 12 11:16:46 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Mon, 12 May 2025 11:16:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Emit error when DEFERRED binding overrides non-DEFERRED (PR #139325) In-Reply-To: Message-ID: <68223b0e.630a0220.3e98a.4d65@mx.google.com> https://github.com/akuhlens approved this pull request. https://github.com/llvm/llvm-project/pull/139325 From flang-commits at lists.llvm.org Mon May 12 11:21:11 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Mon, 12 May 2025 11:21:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Stricter checking of v_list DIO arguments (PR #139329) In-Reply-To: Message-ID: <68223c17.170a0220.1f846f.9126@mx.google.com> https://github.com/akuhlens approved this pull request. https://github.com/llvm/llvm-project/pull/139329 From flang-commits at lists.llvm.org Mon May 12 11:25:57 2025 From: flang-commits at lists.llvm.org (Pranav Bhandarkar via flang-commits) Date: Mon, 12 May 2025 11:25:57 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][MLIR] - Handle the mapping of subroutine arguments when they are subsequently used inside the region of an `omp.target` Op (PR #134967) In-Reply-To: Message-ID: <68223d35.050a0220.29e4d3.bbbb@mx.google.com> https://github.com/bhandarkar-pranav updated https://github.com/llvm/llvm-project/pull/134967

From flang-commits at lists.llvm.org Mon May 12 11:28:24 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 11:28:24 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][MLIR] - Handle the mapping of subroutine arguments when they are subsequently used inside the region of an `omp.target` Op (PR #134967) In-Reply-To: Message-ID: <68223dc8.630a0220.8f157.3bbe@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command: ``````````bash git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- flang/include/flang/Optimizer/Builder/DirectivesCommon.h flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp ``````````
View the diff from clang-format here. ``````````diff diff --git a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h index 6934090a7..3f30c761a 100644 --- a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h +++ b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h @@ -154,7 +154,7 @@ genBoundsOpFromBoxChar(fir::FirOpBuilder &builder, mlir::Location loc, return builder.create( loc, boundTy, /*lower_bound=*/zero, /*upper_bound=*/ub, /*extent=*/extent, /*stride=*/stride, - /*stride_in_bytes=*/ true, /*start_idx=*/zero); + /*stride_in_bytes=*/true, /*start_idx=*/zero); } return mlir::Value{}; } ``````````
https://github.com/llvm/llvm-project/pull/134967 From flang-commits at lists.llvm.org Mon May 12 11:29:08 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Mon, 12 May 2025 11:29:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Catch deferred type parameters in ALLOCATE(type-spec::) (PR #139334) In-Reply-To: Message-ID: <68223df4.050a0220.18a33d.efff@mx.google.com> https://github.com/akuhlens approved this pull request. https://github.com/llvm/llvm-project/pull/139334 From flang-commits at lists.llvm.org Mon May 12 11:37:13 2025 From: flang-commits at lists.llvm.org (Pranav Bhandarkar via flang-commits) Date: Mon, 12 May 2025 11:37:13 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][MLIR] - Handle the mapping of subroutine arguments when they are subsequently used inside the region of an `omp.target` Op (PR #134967) In-Reply-To: Message-ID: <68223fd9.050a0220.242963.f407@mx.google.com> https://github.com/bhandarkar-pranav updated https://github.com/llvm/llvm-project/pull/134967

From flang-commits at lists.llvm.org Mon May 12 12:02:21 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:02:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Revamp evaluate::CoarrayRef (PR #136628) In-Reply-To: Message-ID: <682245bd.170a0220.272ffa.b58e@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/136628 From flang-commits at lists.llvm.org Mon May 12 12:09:31 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:09:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix spurious error on defined assignment in PURE (PR #139186) In-Reply-To: Message-ID: <6822476b.170a0220.182c81.9f0f@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/139186 >From 4776a6a226254dc6d97428d986c7907edc672b5a Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Thu, 8 May 2025 17:46:35 -0700 Subject: [PATCH] [flang] Fix spurious error on defined assignment in PURE An assignment to a whole polymorphic object in a PURE subprogram that is implemented by means of a defined assignment procedure shouldn't be subjected to the same definability checks as it would be for an intrinsic assignment (which would also require it to be allocatable). Fixes https://github.com/llvm/llvm-project/issues/139129. 
--- flang/include/flang/Evaluate/tools.h | 31 ++++++----------- flang/lib/Evaluate/tools.cpp | 38 +++++++++++++++----- flang/lib/Semantics/assignment.cpp | 5 +++ flang/lib/Semantics/check-deallocate.cpp | 6 ++-- flang/lib/Semantics/check-declarations.cpp | 4 +-- flang/lib/Semantics/definable.cpp | 40 +++++++++++----------- flang/lib/Semantics/definable.h | 2 +- flang/lib/Semantics/expression.cpp | 6 ++-- flang/test/Semantics/assign11.f90 | 6 ++-- flang/test/Semantics/bug139129.f90 | 17 +++++++++ flang/test/Semantics/call28.f90 | 4 +-- flang/test/Semantics/deallocate07.f90 | 8 ++--- flang/test/Semantics/declarations05.f90 | 2 +- 13 files changed, 101 insertions(+), 68 deletions(-) create mode 100644 flang/test/Semantics/bug139129.f90 diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h index 922af4190822d..14baa0371231c 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -502,42 +502,31 @@ template std::optional ExtractSubstring(const A &x) { // If an expression is simply a whole symbol data designator, // extract and return that symbol, else null. +const Symbol *UnwrapWholeSymbolDataRef(const DataRef &); +const Symbol *UnwrapWholeSymbolDataRef(const std::optional &); template const Symbol *UnwrapWholeSymbolDataRef(const A &x) { - if (auto dataRef{ExtractDataRef(x)}) { - if (const SymbolRef * p{std::get_if(&dataRef->u)}) { - return &p->get(); - } - } - return nullptr; + return UnwrapWholeSymbolDataRef(ExtractDataRef(x)); } // If an expression is a whole symbol or a whole component desginator, // extract and return that symbol, else null. 
+const Symbol *UnwrapWholeSymbolOrComponentDataRef(const DataRef &); +const Symbol *UnwrapWholeSymbolOrComponentDataRef( + const std::optional &); template const Symbol *UnwrapWholeSymbolOrComponentDataRef(const A &x) { - if (auto dataRef{ExtractDataRef(x)}) { - if (const SymbolRef * p{std::get_if(&dataRef->u)}) { - return &p->get(); - } else if (const Component * c{std::get_if(&dataRef->u)}) { - if (c->base().Rank() == 0) { - return &c->GetLastSymbol(); - } - } - } - return nullptr; + return UnwrapWholeSymbolOrComponentDataRef(ExtractDataRef(x)); } // If an expression is a whole symbol or a whole component designator, // potentially followed by an image selector, extract and return that symbol, // else null. const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const DataRef &); +const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef( + const std::optional &); template const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const A &x) { - if (auto dataRef{ExtractDataRef(x)}) { - return UnwrapWholeSymbolOrComponentOrCoarrayRef(*dataRef); - } else { - return nullptr; - } + return UnwrapWholeSymbolOrComponentOrCoarrayRef(ExtractDataRef(x)); } // GetFirstSymbol(A%B%C[I]%D) -> A diff --git a/flang/lib/Evaluate/tools.cpp b/flang/lib/Evaluate/tools.cpp index d39e4c42928f3..641dead6c55d5 100644 --- a/flang/lib/Evaluate/tools.cpp +++ b/flang/lib/Evaluate/tools.cpp @@ -1320,17 +1320,39 @@ std::optional CheckProcCompatibility(bool isCall, return msg; } +const Symbol *UnwrapWholeSymbolDataRef(const DataRef &dataRef) { + const SymbolRef *p{std::get_if(&dataRef.u)}; + return p ? &p->get() : nullptr; +} + +const Symbol *UnwrapWholeSymbolDataRef(const std::optional &dataRef) { + return dataRef ? UnwrapWholeSymbolDataRef(*dataRef) : nullptr; +} + +const Symbol *UnwrapWholeSymbolOrComponentDataRef(const DataRef &dataRef) { + if (const Component * c{std::get_if(&dataRef.u)}) { + return c->base().Rank() == 0 ? 
&c->GetLastSymbol() : nullptr; + } else { + return UnwrapWholeSymbolDataRef(dataRef); + } +} + +const Symbol *UnwrapWholeSymbolOrComponentDataRef( + const std::optional &dataRef) { + return dataRef ? UnwrapWholeSymbolOrComponentDataRef(*dataRef) : nullptr; +} + const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const DataRef &dataRef) { - if (const SymbolRef * p{std::get_if(&dataRef.u)}) { - return &p->get(); - } else if (const Component * c{std::get_if(&dataRef.u)}) { - if (c->base().Rank() == 0) { - return &c->GetLastSymbol(); - } - } else if (const CoarrayRef * c{std::get_if(&dataRef.u)}) { + if (const CoarrayRef * c{std::get_if(&dataRef.u)}) { return UnwrapWholeSymbolOrComponentOrCoarrayRef(c->base()); + } else { + return UnwrapWholeSymbolOrComponentDataRef(dataRef); } - return nullptr; +} + +const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef( + const std::optional &dataRef) { + return dataRef ? UnwrapWholeSymbolOrComponentOrCoarrayRef(*dataRef) : nullptr; } // GetLastPointerSymbol() diff --git a/flang/lib/Semantics/assignment.cpp b/flang/lib/Semantics/assignment.cpp index 935f5a03bdb6a..6e55d0210ee0e 100644 --- a/flang/lib/Semantics/assignment.cpp +++ b/flang/lib/Semantics/assignment.cpp @@ -72,6 +72,11 @@ void AssignmentContext::Analyze(const parser::AssignmentStmt &stmt) { std::holds_alternative(assignment->u)}; if (isDefinedAssignment) { flags.set(DefinabilityFlag::AllowEventLockOrNotifyType); + } else if (const Symbol * + whole{evaluate::UnwrapWholeSymbolOrComponentDataRef(lhs)}) { + if (IsAllocatable(whole->GetUltimate())) { + flags.set(DefinabilityFlag::PotentialDeallocation); + } } if (auto whyNot{WhyNotDefinable(lhsLoc, scope, flags, lhs)}) { if (whyNot->IsFatal()) { diff --git a/flang/lib/Semantics/check-deallocate.cpp b/flang/lib/Semantics/check-deallocate.cpp index 3bcd4d87b0906..c45b58586853b 100644 --- a/flang/lib/Semantics/check-deallocate.cpp +++ b/flang/lib/Semantics/check-deallocate.cpp @@ -36,7 +36,8 @@ void 
DeallocateChecker::Leave(const parser::DeallocateStmt &deallocateStmt) { } else if (auto whyNot{WhyNotDefinable(name.source, context_.FindScope(name.source), {DefinabilityFlag::PointerDefinition, - DefinabilityFlag::AcceptAllocatable}, + DefinabilityFlag::AcceptAllocatable, + DefinabilityFlag::PotentialDeallocation}, *symbol)}) { // Catch problems with non-definability of the // pointer/allocatable @@ -74,7 +75,8 @@ void DeallocateChecker::Leave(const parser::DeallocateStmt &deallocateStmt) { } else if (auto whyNot{WhyNotDefinable(source, context_.FindScope(source), {DefinabilityFlag::PointerDefinition, - DefinabilityFlag::AcceptAllocatable}, + DefinabilityFlag::AcceptAllocatable, + DefinabilityFlag::PotentialDeallocation}, *expr)}) { context_ .Say(source, diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 318085518cc57..c3a228f3ab8a9 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -949,8 +949,8 @@ void CheckHelper::CheckObjectEntity( !IsFunctionResult(symbol) /*ditto*/) { // Check automatically deallocated local variables for possible // problems with finalization in PURE. 
- if (auto whyNot{ - WhyNotDefinable(symbol.name(), symbol.owner(), {}, symbol)}) { + if (auto whyNot{WhyNotDefinable(symbol.name(), symbol.owner(), + {DefinabilityFlag::PotentialDeallocation}, symbol)}) { if (auto *msg{messages_.Say( "'%s' may not be a local variable in a pure subprogram"_err_en_US, symbol.name())}) { diff --git a/flang/lib/Semantics/definable.cpp b/flang/lib/Semantics/definable.cpp index 99a31553f2782..08cb268b318ae 100644 --- a/flang/lib/Semantics/definable.cpp +++ b/flang/lib/Semantics/definable.cpp @@ -193,6 +193,15 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at, return WhyNotDefinableLast(at, scope, flags, dataRef->GetLastSymbol()); } } + auto dyType{evaluate::DynamicType::From(ultimate)}; + const auto *inPure{FindPureProcedureContaining(scope)}; + if (inPure && !flags.test(DefinabilityFlag::PolymorphicOkInPure) && + flags.test(DefinabilityFlag::PotentialDeallocation) && dyType && + dyType->IsPolymorphic()) { + return BlameSymbol(at, + "'%s' is a whole polymorphic object in a pure subprogram"_en_US, + original); + } if (flags.test(DefinabilityFlag::PointerDefinition)) { if (flags.test(DefinabilityFlag::AcceptAllocatable)) { if (!IsAllocatableOrObjectPointer(&ultimate)) { @@ -210,26 +219,17 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at, "'%s' is an entity with either an EVENT_TYPE or LOCK_TYPE"_en_US, original); } - if (FindPureProcedureContaining(scope)) { - if (auto dyType{evaluate::DynamicType::From(ultimate)}) { - if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { - if (dyType->IsPolymorphic()) { // C1596 - return BlameSymbol( - at, "'%s' is polymorphic in a pure subprogram"_en_US, original); - } - } - if (const Symbol * impure{HasImpureFinal(ultimate)}) { - return BlameSymbol(at, "'%s' has an impure FINAL procedure '%s'"_en_US, - original, impure->name()); - } + if (dyType && inPure) { + if (const Symbol * impure{HasImpureFinal(ultimate)}) { + return BlameSymbol(at, "'%s' has an impure FINAL 
procedure '%s'"_en_US, + original, impure->name()); + } + if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { if (const DerivedTypeSpec * derived{GetDerivedTypeSpec(dyType)}) { - if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { - if (auto bad{ - FindPolymorphicAllocatablePotentialComponent(*derived)}) { - return BlameSymbol(at, - "'%s' has polymorphic component '%s' in a pure subprogram"_en_US, - original, bad.BuildResultDesignatorName()); - } + if (auto bad{FindPolymorphicAllocatablePotentialComponent(*derived)}) { + return BlameSymbol(at, + "'%s' has polymorphic component '%s' in a pure subprogram"_en_US, + original, bad.BuildResultDesignatorName()); } } } @@ -243,7 +243,7 @@ static std::optional WhyNotDefinable(parser::CharBlock at, const evaluate::DataRef &dataRef) { auto whyNotBase{ WhyNotDefinableBase(at, scope, flags, dataRef.GetFirstSymbol(), - std::holds_alternative(dataRef.u), + evaluate::UnwrapWholeSymbolDataRef(dataRef) != nullptr, DefinesComponentPointerTarget(dataRef, flags))}; if (!whyNotBase || !whyNotBase->IsFatal()) { if (auto whyNotLast{ diff --git a/flang/lib/Semantics/definable.h b/flang/lib/Semantics/definable.h index 902702dbccbf3..0d027961417be 100644 --- a/flang/lib/Semantics/definable.h +++ b/flang/lib/Semantics/definable.h @@ -33,7 +33,7 @@ ENUM_CLASS(DefinabilityFlag, SourcedAllocation, // ALLOCATE(a,SOURCE=) PolymorphicOkInPure, // don't check for polymorphic type in pure subprogram DoNotNoteDefinition, // context does not imply definition - AllowEventLockOrNotifyType) + AllowEventLockOrNotifyType, PotentialDeallocation) using DefinabilityFlags = common::EnumSet; diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index 0659536aab98c..2c89bcd981f6d 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -3391,15 +3391,15 @@ const Assignment *ExpressionAnalyzer::Analyze(const parser::AssignmentStmt &x) { const Symbol *lastWhole{ lastWhole0 ? 
&ResolveAssociations(*lastWhole0) : nullptr}; if (!lastWhole || !IsAllocatable(*lastWhole)) { - Say("Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable"_err_en_US); + Say("Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable"_err_en_US); } else if (evaluate::IsCoarray(*lastWhole)) { - Say("Left-hand side of assignment may not be polymorphic if it is a coarray"_err_en_US); + Say("Left-hand side of intrinsic assignment may not be polymorphic if it is a coarray"_err_en_US); } } if (auto *derived{GetDerivedTypeSpec(*dyType)}) { if (auto iter{FindAllocatableUltimateComponent(*derived)}) { if (ExtractCoarrayRef(lhs)) { - Say("Left-hand side of assignment must not be coindexed due to allocatable ultimate component '%s'"_err_en_US, + Say("Left-hand side of intrinsic assignment must not be coindexed due to allocatable ultimate component '%s'"_err_en_US, iter.BuildResultDesignatorName()); } } diff --git a/flang/test/Semantics/assign11.f90 b/flang/test/Semantics/assign11.f90 index 37216526b5f33..9d70d7109e75e 100644 --- a/flang/test/Semantics/assign11.f90 +++ b/flang/test/Semantics/assign11.f90 @@ -9,10 +9,10 @@ program test end type type(t) auc[*] pa = 1 ! 
ok - !ERROR: Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable pp = 1 - !ERROR: Left-hand side of assignment may not be polymorphic if it is a coarray + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic if it is a coarray pac = 1 - !ERROR: Left-hand side of assignment must not be coindexed due to allocatable ultimate component '%a' + !ERROR: Left-hand side of intrinsic assignment must not be coindexed due to allocatable ultimate component '%a' auc[1] = t() end diff --git a/flang/test/Semantics/bug139129.f90 b/flang/test/Semantics/bug139129.f90 new file mode 100644 index 0000000000000..2f0f865854706 --- /dev/null +++ b/flang/test/Semantics/bug139129.f90 @@ -0,0 +1,17 @@ +!RUN: %flang_fc1 -fsyntax-only %s +module m + type t + contains + procedure asst + generic :: assignment(=) => asst + end type + contains + pure subroutine asst(lhs, rhs) + class(t), intent(in out) :: lhs + class(t), intent(in) :: rhs + end + pure subroutine test(x, y) + class(t), intent(in out) :: x, y + x = y ! 
spurious definability error + end +end diff --git a/flang/test/Semantics/call28.f90 b/flang/test/Semantics/call28.f90 index 51430853d663f..f133276f7547e 100644 --- a/flang/test/Semantics/call28.f90 +++ b/flang/test/Semantics/call28.f90 @@ -11,9 +11,7 @@ pure subroutine s1(x) end subroutine pure subroutine s2(x) class(t), intent(in out) :: x - !ERROR: Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable - !ERROR: Left-hand side of assignment is not definable - !BECAUSE: 'x' is polymorphic in a pure subprogram + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable x = t() end subroutine pure subroutine s3(x) diff --git a/flang/test/Semantics/deallocate07.f90 b/flang/test/Semantics/deallocate07.f90 index 154c680f47c82..6dcf20e82cf0d 100644 --- a/flang/test/Semantics/deallocate07.f90 +++ b/flang/test/Semantics/deallocate07.f90 @@ -19,11 +19,11 @@ pure subroutine subr(pp1, pp2, mp2) !ERROR: Name in DEALLOCATE statement is not definable !BECAUSE: 'mv1' may not be defined in pure subprogram 'subr' because it is host-associated deallocate(mv1%pc) - !ERROR: Object in DEALLOCATE statement is not deallocatable - !BECAUSE: 'pp1' is polymorphic in a pure subprogram + !ERROR: Name in DEALLOCATE statement is not definable + !BECAUSE: 'pp1' is a whole polymorphic object in a pure subprogram deallocate(pp1) - !ERROR: Object in DEALLOCATE statement is not deallocatable - !BECAUSE: 'pc' is polymorphic in a pure subprogram + !ERROR: Name in DEALLOCATE statement is not definable + !BECAUSE: 'pc' is a whole polymorphic object in a pure subprogram deallocate(pp2%pc) !ERROR: Object in DEALLOCATE statement is not deallocatable !BECAUSE: 'mp2' has polymorphic component '%pc' in a pure subprogram diff --git a/flang/test/Semantics/declarations05.f90 b/flang/test/Semantics/declarations05.f90 index b6dab7aeea0bc..b1e3d3c773160 100644 --- a/flang/test/Semantics/declarations05.f90 +++ 
b/flang/test/Semantics/declarations05.f90 @@ -22,7 +22,7 @@ impure subroutine final(x) end pure subroutine test !ERROR: 'x0' may not be a local variable in a pure subprogram - !BECAUSE: 'x0' is polymorphic in a pure subprogram + !BECAUSE: 'x0' is a whole polymorphic object in a pure subprogram class(t0), allocatable :: x0 !ERROR: 'x1' may not be a local variable in a pure subprogram !BECAUSE: 'x1' has an impure FINAL procedure 'final' From flang-commits at lists.llvm.org Mon May 12 12:11:35 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:11:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Tune warning about incompatible implicit interfaces (PR #136788) In-Reply-To: Message-ID: <682247e7.170a0220.381f6.95d4@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/136788

From flang-commits at lists.llvm.org Mon May 12 12:15:27 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 12:15:27 -0700 (PDT) Subject: [flang-commits] [flang] ea87d7c - [flang] Add control and a portability warning for an extension (#137995) Message-ID: <682248cf.170a0220.5e126.bb15@mx.google.com> Author: Peter Klausler Date: 2025-05-12T12:15:24-07:00 New Revision: ea87d7c0dbceaf21ddbd53d261600ca5e3aeddd7 URL: https://github.com/llvm/llvm-project/commit/ea87d7c0dbceaf21ddbd53d261600ca5e3aeddd7 DIFF: https://github.com/llvm/llvm-project/commit/ea87d7c0dbceaf21ddbd53d261600ca5e3aeddd7.diff LOG: [flang] Add control and a portability warning for an extension (#137995) This compiler allows an element of an assumed-shape array or POINTER to be used in sequence association as an actual argument, so long as the array is declared to have the CONTIGUOUS attribute. Make sure that this extension is under control of a LanguageFeature enum, so that a hypothetical compiler driver option could disable it, and add an optional portability warning for its use. 
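The decision described in the log message above can be summarised as a small standalone sketch. It is a simplified model under stated assumptions, not the code in check-call.cpp: `Outcome` and `CheckArrayElementActual` are hypothetical names, the three booleans stand in for the feature flag, the contiguity test, and the warning setting, and the `ignoreTKR` handling in the real check is left out.

```cpp
#include <cassert>

// Possible results for an element of an assumed-shape or POINTER array that
// is passed as an actual argument in sequence association.
enum class Outcome { Error, Accepted, AcceptedWithPortabilityWarning };

// Hypothetical model of the check: the extension applies only when the
// language feature is enabled and the array is known to be contiguous.
Outcome CheckArrayElementActual(
    bool featureEnabled, bool arrayIsContiguous, bool warningEnabled) {
  if (!(featureEnabled && arrayIsContiguous)) {
    return Outcome::Error; // extension off or not applicable: usual error
  }
  // Extension in use: accepted, optionally flagged as non-portable.
  return warningEnabled ? Outcome::AcceptedWithPortabilityWarning
                        : Outcome::Accepted;
}
```

Keeping the feature test separate from the warning test mirrors the intent in the message: a driver option could turn the extension off entirely, while the portability warning stays optional.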
Added: flang/test/Semantics/call44.f90 Modified: flang/include/flang/Support/Fortran-features.h flang/lib/Semantics/check-call.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 6cb1bcdb0003f..550a5c8f307d3 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -54,7 +54,7 @@ ENUM_CLASS(LanguageFeature, BackslashEscapes, OldDebugLines, PolymorphicActualAllocatableOrPointerToMonomorphicDummy, RelaxedPureDummy, UndefinableAsynchronousOrVolatileActual, AutomaticInMainProgram, PrintCptr, SavedLocalInSpecExpr, PrintNamelist, AssumedRankPassedToNonAssumedRank, - IgnoreIrrelevantAttributes, Unsigned) + IgnoreIrrelevantAttributes, Unsigned, ContiguousOkForSeqAssociation) // Portability and suspicious usage warnings ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, diff --git a/flang/lib/Semantics/check-call.cpp b/flang/lib/Semantics/check-call.cpp index 11928860fea5f..231f3a4222a2c 100644 --- a/flang/lib/Semantics/check-call.cpp +++ b/flang/lib/Semantics/check-call.cpp @@ -581,20 +581,38 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, "Polymorphic scalar may not be associated with a %s array"_err_en_US, dummyName); } + bool isOkBecauseContiguous{ + context.IsEnabled( + common::LanguageFeature::ContiguousOkForSeqAssociation) && + actualLastSymbol && + evaluate::IsContiguous(*actualLastSymbol, foldingContext)}; if (actualIsArrayElement && actualLastSymbol && - !evaluate::IsContiguous(*actualLastSymbol, foldingContext) && !dummy.ignoreTKR.test(common::IgnoreTKR::Contiguous)) { if (IsPointer(*actualLastSymbol)) { - basicError = true; - messages.Say( - "Element of pointer array may not be associated with a %s array"_err_en_US, - dummyName); + if (isOkBecauseContiguous) { + context.Warn( + 
common::LanguageFeature::ContiguousOkForSeqAssociation, + messages.at(), + "Element of contiguous pointer array is accepted for storage sequence association"_port_en_US); + } else { + basicError = true; + messages.Say( + "Element of pointer array may not be associated with a %s array"_err_en_US, + dummyName); + } } else if (IsAssumedShape(*actualLastSymbol) && !dummy.ignoreTKR.test(common::IgnoreTKR::Contiguous)) { - basicError = true; - messages.Say( - "Element of assumed-shape array may not be associated with a %s array"_err_en_US, - dummyName); + if (isOkBecauseContiguous) { + context.Warn( + common::LanguageFeature::ContiguousOkForSeqAssociation, + messages.at(), + "Element of contiguous assumed-shape array is accepted for storage sequence association"_port_en_US); + } else { + basicError = true; + messages.Say( + "Element of assumed-shape array may not be associated with a %s array"_err_en_US, + dummyName); + } } } } diff --git a/flang/test/Semantics/call44.f90 b/flang/test/Semantics/call44.f90 new file mode 100644 index 0000000000000..f7c4c9093b432 --- /dev/null +++ b/flang/test/Semantics/call44.f90 @@ -0,0 +1,13 @@ +! 
RUN: %python %S/test_errors.py %s %flang_fc1 -pedantic -Werror +subroutine assumedshape(normal, contig) + real normal(:) + real, contiguous :: contig(:) + !WARNING: If the procedure's interface were explicit, this reference would be in error + !BECAUSE: Element of assumed-shape array may not be associated with a dummy argument 'assumedsize=' array + call seqAssociate(normal(1)) + !PORTABILITY: Element of contiguous assumed-shape array is accepted for storage sequence association + call seqAssociate(contig(1)) +end +subroutine seqAssociate(assumedSize) + real assumedSize(*) +end From flang-commits at lists.llvm.org Mon May 12 12:15:30 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:15:30 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add control and a portability warning for an extension (PR #137995) In-Reply-To: Message-ID: <682248d2.170a0220.22ea86.a312@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/137995 From flang-commits at lists.llvm.org Mon May 12 12:15:49 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 12:15:49 -0700 (PDT) Subject: [flang-commits] [flang] 5b9bd88 - [flang] Fix crash with USE of hermetic module file (#138785) Message-ID: <682248e5.170a0220.32a642.b397@mx.google.com> Author: Peter Klausler Date: 2025-05-12T12:15:46-07:00 New Revision: 5b9bd8838842896b482fea20dce56906d42cc7b1 URL: https://github.com/llvm/llvm-project/commit/5b9bd8838842896b482fea20dce56906d42cc7b1 DIFF: https://github.com/llvm/llvm-project/commit/5b9bd8838842896b482fea20dce56906d42cc7b1.diff LOG: [flang] Fix crash with USE of hermetic module file (#138785) When one hermetic module file uses another, a later compilation may crash in semantics when it itself is used, since the module file reader sets the "current hermetic module file scope" to null after reading one rather than saving and restoring that pointer. 
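The failure mode described in the log message above (resetting a "current scope" pointer to null after a nested read, instead of restoring the value it had before) can be reproduced in a standalone sketch. `Context`, `ReadNested`, and `OuterScopeSurvives` are hypothetical names for illustration and do not correspond to flang's `SemanticsContext` or `ModFileReader`.

```cpp
#include <cassert>

struct Scope {}; // placeholder for a hermetic module file scope

struct Context {
  Scope *currentHermeticScope = nullptr;
};

// Models reading one module file while another is already being read.
// With restorePrevious == false this mimics the old behaviour (reset to
// null); with true it mimics the save-and-restore fix.
void ReadNested(Context &ctx, Scope &inner, bool restorePrevious) {
  Scope *previous = ctx.currentHermeticScope; // save the caller's scope
  ctx.currentHermeticScope = &inner;
  // ... names in the nested module file would be resolved here ...
  ctx.currentHermeticScope = restorePrevious ? previous : nullptr;
}

// Returns true if the outer reader still sees its own scope afterwards.
bool OuterScopeSurvives(bool restorePrevious) {
  Context ctx;
  Scope outer, inner;
  ctx.currentHermeticScope = &outer;
  ReadNested(ctx, inner, restorePrevious);
  return ctx.currentHermeticScope == &outer;
}
```

In this model `OuterScopeSurvives(false)` is false: the outer reader continues with a null scope, which corresponds to the situation the message says can crash a later compilation.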
Added: flang/test/Semantics/modfile75.F90 Modified: flang/lib/Semantics/mod-file.cpp Removed: ################################################################################ diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index 3ea37ceddd056..a1ec956562204 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1548,6 +1548,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, // created under -fhermetic-module-files? If so, process them first in // their own nested scope that will be visible only to USE statements // within the module file. + Scope *previousHermetic{context_.currentHermeticModuleFileScope()}; if (parseTree.v.size() > 1) { parser::Program hermeticModules{std::move(parseTree.v)}; parseTree.v.emplace_back(std::move(hermeticModules.v.front())); @@ -1563,7 +1564,7 @@ Scope *ModFileReader::Read(SourceName name, std::optional isIntrinsic, GetModuleDependences(context_.moduleDependences(), sourceFile->content()); ResolveNames(context_, parseTree, topScope); context_.foldingContext().set_moduleFileName(wasModuleFileName); - context_.set_currentHermeticModuleFileScope(nullptr); + context_.set_currentHermeticModuleFileScope(previousHermetic); if (!moduleSymbol) { // Submodule symbols' storage are owned by their parents' scopes, // but their names are not in their parents' dictionaries -- we diff --git a/flang/test/Semantics/modfile75.F90 b/flang/test/Semantics/modfile75.F90 new file mode 100644 index 0000000000000..aba00ffac848a --- /dev/null +++ b/flang/test/Semantics/modfile75.F90 @@ -0,0 +1,17 @@ +!RUN: %flang -c -fhermetic-module-files -DWHICH=1 %s && %flang -c -fhermetic-module-files -DWHICH=2 %s && %flang_fc1 -fdebug-unparse %s | FileCheck %s + +#if WHICH == 1 +module modfile75a + use iso_c_binding +end +#elif WHICH == 2 +module modfile75b + use modfile75a +end +#else +program test + use modfile75b +!CHECK: INTEGER(KIND=4_4) n + integer(c_int) n +end +#endif From 
flang-commits at lists.llvm.org Mon May 12 12:15:52 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:15:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix crash with USE of hermetic module file (PR #138785) In-Reply-To: Message-ID: <682248e8.170a0220.11c973.a9b5@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/138785 From flang-commits at lists.llvm.org Mon May 12 12:16:07 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 12:16:07 -0700 (PDT) Subject: [flang-commits] [flang] 58535e8 - [flang] Further refinement of OpenMP !$ lines in -E mode (#138956) Message-ID: <682248f7.170a0220.5b497.98eb@mx.google.com> Author: Peter Klausler Date: 2025-05-12T12:16:05-07:00 New Revision: 58535e81dd982f5e5b64df39d2ab264027d6e8ca URL: https://github.com/llvm/llvm-project/commit/58535e81dd982f5e5b64df39d2ab264027d6e8ca DIFF: https://github.com/llvm/llvm-project/commit/58535e81dd982f5e5b64df39d2ab264027d6e8ca.diff LOG: [flang] Further refinement of OpenMP !$ lines in -E mode (#138956) Address failing Fujitsu test suite cases that were broken by the patch to defer the handling of !$ lines in -fopenmp vs. normal compilation to actual compilation rather than processing them immediately in -E mode. Tested on the samples in the bug report as well as all of the Fujitsu tests that I could find that use !$ lines. Fixes https://github.com/llvm/llvm-project/issues/136845. 
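As background for the change described above: in fixed form, column 6 decides whether a line starts a statement (blank or '0') or continues one (any other character). The sketch below illustrates only that rule as it might apply to a `!$` line; `ClassifyColumn6` and `NormalizeColumn6` are hypothetical helpers, not functions in flang's prescanner, and tabs, labels, and the right margin are deliberately ignored.

```cpp
#include <string>

// Hypothetical helper (not part of flang): classify a fixed-form line by
// its column 6 character.
enum class Column6 { Initial, Continuation };

Column6 ClassifyColumn6(const std::string &line) {
  // Fortran columns are 1-based, so column 6 is index 5.
  if (line.size() < 6) {
    return Column6::Initial;
  }
  char ch{line[5]};
  return (ch == ' ' || ch == '0') ? Column6::Initial : Column6::Continuation;
}

// Assumed normalization for preprocessed output: a '0' in column 6 becomes
// a space, and any other continuation character becomes '&'.
std::string NormalizeColumn6(std::string line) {
  if (line.size() >= 6) {
    line[5] = ClassifyColumn6(line) == Column6::Initial ? ' ' : '&';
  }
  return line;
}
```

For example, a line with `!$` in columns 1-2 and `1` in column 6 would come out with `&` in column 6, while a `0` there would come out as a blank.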
Added: flang/test/Preprocessing/bug136845.F Modified: flang/include/flang/Parser/token-sequence.h flang/lib/Parser/parsing.cpp flang/lib/Parser/prescan.cpp flang/lib/Parser/prescan.h flang/lib/Parser/token-sequence.cpp flang/test/Parser/OpenMP/bug518.f flang/test/Parser/OpenMP/compiler-directive-continuation.f90 flang/test/Parser/OpenMP/sentinels.f flang/test/Parser/continuation-in-conditional-compilation.f Removed: ################################################################################ diff --git a/flang/include/flang/Parser/token-sequence.h b/flang/include/flang/Parser/token-sequence.h index 69291e69526e2..05aeacccde097 100644 --- a/flang/include/flang/Parser/token-sequence.h +++ b/flang/include/flang/Parser/token-sequence.h @@ -137,7 +137,7 @@ class TokenSequence { TokenSequence &RemoveRedundantBlanks(std::size_t firstChar = 0); TokenSequence &ClipComment(const Prescanner &, bool skipFirst = false); const TokenSequence &CheckBadFortranCharacters( - Messages &, const Prescanner &, bool allowAmpersand) const; + Messages &, const Prescanner &, bool preprocessingOnly) const; bool BadlyNestedParentheses() const; const TokenSequence &CheckBadParentheses(Messages &) const; void Emit(CookedSource &) const; diff --git a/flang/lib/Parser/parsing.cpp b/flang/lib/Parser/parsing.cpp index 17f544194de02..93737d99567dd 100644 --- a/flang/lib/Parser/parsing.cpp +++ b/flang/lib/Parser/parsing.cpp @@ -230,10 +230,11 @@ void Parsing::EmitPreprocessedSource( column = 7; // start of fixed form source field ++sourceLine; inContinuation = true; - } else if (!inDirective && ch != ' ' && (ch < '0' || ch > '9')) { + } else if (!inDirective && !ompConditionalLine && ch != ' ' && + (ch < '0' || ch > '9')) { // Put anything other than a label or directive into the // Fortran fixed form source field (columns [7:72]). - for (; column < 7; ++column) { + for (int toCol{ch == '&' ? 
6 : 7}; column < toCol; ++column) { out << ' '; } } @@ -241,7 +242,7 @@ void Parsing::EmitPreprocessedSource( if (ompConditionalLine) { // Only digits can stay in the label field if (!(ch >= '0' && ch <= '9')) { - for (; column < 7; ++column) { + for (int toCol{ch == '&' ? 6 : 7}; column < toCol; ++column) { out << ' '; } } diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp index 46e04c15ade01..3bc2ea0b37508 100644 --- a/flang/lib/Parser/prescan.cpp +++ b/flang/lib/Parser/prescan.cpp @@ -150,10 +150,7 @@ void Prescanner::Statement() { CHECK(*at_ == '!'); } std::optional condOffset; - bool isOpenMPCondCompilation{ - directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0'}; - if (isOpenMPCondCompilation) { - // OpenMP conditional compilation line. + if (InOpenMPConditionalLine()) { condOffset = 2; } else if (directiveSentinel_[0] == '@' && directiveSentinel_[1] == 'c' && directiveSentinel_[2] == 'u' && directiveSentinel_[3] == 'f' && @@ -167,19 +164,10 @@ void Prescanner::Statement() { FortranInclude(at_ + *payload); return; } - while (true) { - if (auto n{IsSpace(at_)}) { - at_ += n, ++column_; - } else if (*at_ == '\t') { - ++at_, ++column_; - tabInCurrentLine_ = true; - } else if (inFixedForm_ && column_ == 6 && !tabInCurrentLine_ && - *at_ == '0') { - ++at_, ++column_; - } else { - break; - } + if (inFixedForm_) { + LabelField(tokens); } + SkipSpaces(); } else { // Compiler directive. Emit normalized sentinel, squash following spaces. 
// Conditional compilation lines (!$) take this path in -E mode too @@ -190,35 +178,47 @@ void Prescanner::Statement() { ++sp, ++at_, ++column_) { EmitChar(tokens, *sp); } - if (IsSpaceOrTab(at_)) { - while (int n{IsSpaceOrTab(at_)}) { - if (isOpenMPCondCompilation && inFixedForm_) { + if (inFixedForm_) { + while (column_ < 6) { + if (*at_ == '\t') { + tabInCurrentLine_ = true; + ++at_; + for (; column_ < 7; ++column_) { + EmitChar(tokens, ' '); + } + } else if (int spaceBytes{IsSpace(at_)}) { EmitChar(tokens, ' '); - } - tabInCurrentLine_ |= *at_ == '\t'; - at_ += n, ++column_; - if (inFixedForm_ && column_ > fixedFormColumnLimit_) { + at_ += spaceBytes; + ++column_; + } else { + if (InOpenMPConditionalLine() && column_ == 3 && + IsDecimalDigit(*at_)) { + // subtle: !$ in -E mode can't be immediately followed by a digit + EmitChar(tokens, ' '); + } break; } } - if (isOpenMPCondCompilation && inFixedForm_ && column_ == 6) { - if (*at_ == '0') { - EmitChar(tokens, ' '); - } else { - tokens.CloseToken(); - EmitChar(tokens, '&'); - } - ++at_, ++column_; + } else if (int spaceBytes{IsSpaceOrTab(at_)}) { + EmitChar(tokens, ' '); + at_ += spaceBytes, ++column_; + } + tokens.CloseToken(); + SkipSpaces(); + if (InOpenMPConditionalLine() && inFixedForm_ && !tabInCurrentLine_ && + column_ == 6 && *at_ != '\n') { + // !$ 0 - turn '0' into a space + // !$ 1 - turn '1' into '&' + if (int n{IsSpace(at_)}; n || *at_ == '0') { + at_ += n ? n : 1; } else { - EmitChar(tokens, ' '); + ++at_; + EmitChar(tokens, '&'); + tokens.CloseToken(); } + ++column_; + SkipSpaces(); } - tokens.CloseToken(); - } - if (*at_ == '!' 
|| *at_ == '\n' || - (inFixedForm_ && column_ > fixedFormColumnLimit_ && - !tabInCurrentLine_)) { - return; // Directive without payload } break; } @@ -323,8 +323,8 @@ void Prescanner::Statement() { NormalizeCompilerDirectiveCommentMarker(*preprocessed); preprocessed->ToLowerCase(); SourceFormChange(preprocessed->ToString()); - CheckAndEmitLine(preprocessed->ToLowerCase().ClipComment( - *this, true /* skip first ! */), + CheckAndEmitLine( + preprocessed->ClipComment(*this, true /* skip first ! */), newlineProvenance); break; case LineClassification::Kind::Source: @@ -349,6 +349,24 @@ void Prescanner::Statement() { while (CompilerDirectiveContinuation(tokens, line.sentinel)) { newlineProvenance = GetCurrentProvenance(); } + if (preprocessingOnly_ && inFixedForm_ && InOpenMPConditionalLine() && + nextLine_ < limit_) { + // In -E mode, when the line after !$ conditional compilation is a + // regular fixed form continuation line, append a '&' to the line. + const char *p{nextLine_}; + int col{1}; + while (int n{IsSpace(p)}) { + if (*p == '\t') { + break; + } + p += n; + ++col; + } + if (col == 6 && *p != '0' && *p != '\t' && *p != '\n') { + EmitChar(tokens, '&'); + tokens.CloseToken(); + } + } tokens.ToLowerCase(); SourceFormChange(tokens.ToString()); } else { // Kind::Source @@ -544,7 +562,8 @@ void Prescanner::SkipToEndOfLine() { bool Prescanner::MustSkipToEndOfLine() const { if (inFixedForm_ && column_ > fixedFormColumnLimit_ && !tabInCurrentLine_) { return true; // skip over ignored columns in right margin (73:80) - } else if (*at_ == '!' && !inCharLiteral_) { + } else if (*at_ == '!' && !inCharLiteral_ && + (!inFixedForm_ || tabInCurrentLine_ || column_ != 6)) { return !IsCompilerDirectiveSentinel(at_); } else { return false; @@ -569,10 +588,11 @@ void Prescanner::NextChar() { // directives, Fortran ! comments, stuff after the right margin in // fixed form, and all forms of line continuation. 
bool Prescanner::SkipToNextSignificantCharacter() { - auto anyContinuationLine{false}; if (inPreprocessorDirective_) { SkipCComments(); + return false; } else { + auto anyContinuationLine{false}; bool mightNeedSpace{false}; if (MustSkipToEndOfLine()) { SkipToEndOfLine(); @@ -589,8 +609,8 @@ bool Prescanner::SkipToNextSignificantCharacter() { if (*at_ == '\t') { tabInCurrentLine_ = true; } + return anyContinuationLine; } - return anyContinuationLine; } void Prescanner::SkipCComments() { @@ -1119,12 +1139,10 @@ static bool IsAtProcess(const char *p) { bool Prescanner::IsFixedFormCommentLine(const char *start) const { const char *p{start}; - // The @process directive must start in column 1. if (*p == '@' && IsAtProcess(p)) { return true; } - if (IsFixedFormCommentChar(*p) || *p == '%' || // VAX %list, %eject, &c. ((*p == 'D' || *p == 'd') && !features_.IsEnabled(LanguageFeature::OldDebugLines))) { @@ -1324,24 +1342,11 @@ const char *Prescanner::FixedFormContinuationLine(bool mightNeedSpace) { features_.IsEnabled(LanguageFeature::OldDebugLines))) && nextLine_[1] == ' ' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && nextLine_[4] == ' '}; - if (InCompilerDirective()) { - if (directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0') { - if (IsFixedFormCommentChar(col1)) { - if (nextLine_[1] == '$' && - (nextLine_[2] == '&' || IsSpaceOrTab(&nextLine_[2]))) { - // Next line is also !$ conditional compilation, might be continuation - if (preprocessingOnly_) { - return nullptr; - } - } else { - return nullptr; // comment, or distinct directive - } - } else if (!canBeNonDirectiveContinuation) { - return nullptr; - } - } else if (!IsFixedFormCommentChar(col1)) { - return nullptr; // in directive other than !$, but next line is not - } else { // in directive other than !$, next line might be continuation + if (InCompilerDirective() && + !(InOpenMPConditionalLine() && !preprocessingOnly_)) { + // !$ under -E is not continued, but deferred to later compilation + if 
(IsFixedFormCommentChar(col1) && + !(InOpenMPConditionalLine() && preprocessingOnly_)) { int j{1}; for (; j < 5; ++j) { char ch{directiveSentinel_[j - 1]}; @@ -1356,31 +1361,27 @@ const char *Prescanner::FixedFormContinuationLine(bool mightNeedSpace) { return nullptr; } } - } - const char *col6{nextLine_ + 5}; - if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { - if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { - insertASpace_ = true; + const char *col6{nextLine_ + 5}; + if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { + if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { + insertASpace_ = true; + } + return nextLine_ + 6; } - return nextLine_ + 6; } - } else { - // Normal case: not in a compiler directive. - if (IsFixedFormCommentChar(col1)) { - if (nextLine_[1] == '$' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && - nextLine_[4] == ' ' && - IsCompilerDirectiveSentinel(&nextLine_[1], 1) && - !preprocessingOnly_) { - // !$ conditional compilation line as a continuation - const char *col6{nextLine_ + 5}; - if (*col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { - if (mightNeedSpace && !IsSpace(nextLine_ + 6)) { - insertASpace_ = true; - } - return nextLine_ + 6; - } + } else { // Normal case: not in a compiler directive. + // !$ conditional compilation lines may be continuations when not + // just preprocessing. 
+ if (!preprocessingOnly_ && IsFixedFormCommentChar(col1) && + nextLine_[1] == '$' && nextLine_[2] == ' ' && nextLine_[3] == ' ' && + nextLine_[4] == ' ' && IsCompilerDirectiveSentinel(&nextLine_[1], 1)) { + if (const char *col6{nextLine_ + 5}; + *col6 != '\n' && *col6 != '0' && !IsSpaceOrTab(col6)) { + insertASpace_ |= mightNeedSpace && !IsSpace(nextLine_ + 6); + return nextLine_ + 6; + } else { + return nullptr; } - return nullptr; } if (col1 == '&' && features_.IsEnabled( @@ -1422,13 +1423,13 @@ const char *Prescanner::FreeFormContinuationLine(bool ampersand) { } p = SkipWhiteSpaceIncludingEmptyMacros(p); if (InCompilerDirective()) { - if (directiveSentinel_[0] == '$' && directiveSentinel_[1] == '\0') { + if (InOpenMPConditionalLine()) { if (preprocessingOnly_) { // in -E mode, don't treat !$ as a continuation return nullptr; } else if (p[0] == '!' && p[1] == '$') { // accept but do not require a matching sentinel - if (!(p[2] == '&' || IsSpaceOrTab(&p[2]))) { + if (p[2] != '&' && !IsSpaceOrTab(&p[2])) { return nullptr; // not !$ } p += 2; @@ -1566,15 +1567,11 @@ Prescanner::IsFixedFormCompilerDirectiveLine(const char *start) const { } char sentinel[5], *sp{sentinel}; int column{2}; - for (; column < 6; ++column, ++p) { - if (*p == '\n' || IsSpaceOrTab(p)) { - break; - } - if (sp == sentinel + 1 && sentinel[0] == '$' && IsDecimalDigit(*p)) { - // OpenMP conditional compilation line: leave the label alone + for (; column < 6; ++column) { + if (*p == '\n' || IsSpaceOrTab(p) || IsDecimalDigit(*p)) { break; } - *sp++ = ToLowerCaseLetter(*p); + *sp++ = ToLowerCaseLetter(*p++); } if (sp == sentinel) { return std::nullopt; @@ -1600,7 +1597,8 @@ Prescanner::IsFixedFormCompilerDirectiveLine(const char *start) const { ++p; } else if (int n{IsSpaceOrTab(p)}) { p += n; - } else if (isOpenMPConditional && preprocessingOnly_ && !hadDigit) { + } else if (isOpenMPConditional && preprocessingOnly_ && !hadDigit && + *p != '\n') { // In -E mode, "!$ &" is treated as a directive } 
else { // This is a Continuation line, not an initial directive line. @@ -1671,14 +1669,14 @@ const char *Prescanner::IsCompilerDirectiveSentinel(CharBlock token) const { std::optional> Prescanner::IsCompilerDirectiveSentinel(const char *p) const { char sentinel[8]; - for (std::size_t j{0}; j + 1 < sizeof sentinel && *p != '\n'; ++p, ++j) { + for (std::size_t j{0}; j + 1 < sizeof sentinel; ++p, ++j) { if (int n{IsSpaceOrTab(p)}; n || !(IsLetter(*p) || *p == '$' || *p == '@')) { if (j > 0) { - if (j == 1 && sentinel[0] == '$' && n == 0 && *p != '&') { - // OpenMP conditional compilation line sentinels have to + if (j == 1 && sentinel[0] == '$' && n == 0 && *p != '&' && *p != '\n') { + // Free form OpenMP conditional compilation line sentinels have to // be immediately followed by a space or &, not a digit - // or anything else. + // or anything else. A newline also works for an initial line. break; } sentinel[j] = '\0'; diff --git a/flang/lib/Parser/prescan.h b/flang/lib/Parser/prescan.h index 53361ba14f378..ec4c53cf3e0f2 100644 --- a/flang/lib/Parser/prescan.h +++ b/flang/lib/Parser/prescan.h @@ -159,6 +159,11 @@ class Prescanner { } bool InCompilerDirective() const { return directiveSentinel_ != nullptr; } + bool InOpenMPConditionalLine() const { + return directiveSentinel_ && directiveSentinel_[0] == '$' && + !directiveSentinel_[1]; + ; + } bool InFixedFormSource() const { return inFixedForm_ && !inPreprocessorDirective_ && !InCompilerDirective(); } diff --git a/flang/lib/Parser/token-sequence.cpp b/flang/lib/Parser/token-sequence.cpp index aee76938550f5..40a074eaf0a47 100644 --- a/flang/lib/Parser/token-sequence.cpp +++ b/flang/lib/Parser/token-sequence.cpp @@ -357,7 +357,7 @@ ProvenanceRange TokenSequence::GetProvenanceRange() const { const TokenSequence &TokenSequence::CheckBadFortranCharacters( Messages &messages, const Prescanner &prescanner, - bool allowAmpersand) const { + bool preprocessingOnly) const { std::size_t tokens{SizeInTokens()}; for (std::size_t 
j{0}; j < tokens; ++j) { CharBlock token{TokenAt(j)}; @@ -371,8 +371,10 @@ const TokenSequence &TokenSequence::CheckBadFortranCharacters( TokenAt(j + 1))) { // !dir$, &c. ++j; continue; + } else if (preprocessingOnly) { + continue; } - } else if (ch == '&' && allowAmpersand) { + } else if (ch == '&' && preprocessingOnly) { continue; } if (ch < ' ' || ch >= '\x7f') { diff --git a/flang/test/Parser/OpenMP/bug518.f b/flang/test/Parser/OpenMP/bug518.f index 2dbacef59fa8a..2739de63f8b25 100644 --- a/flang/test/Parser/OpenMP/bug518.f +++ b/flang/test/Parser/OpenMP/bug518.f @@ -9,9 +9,9 @@ !$omp end parallel end -!CHECK-E:{{^}}!$ thread = OMP_GET_MAX_THREADS() +!CHECK-E:{{^}}!$ thread = OMP_GET_MAX_THREADS() !CHECK-E:{{^}}!$omp parallel private(ia) -!CHECK-E:{{^}}!$ continue +!CHECK-E:{{^}}!$ continue !CHECK-E:{{^}}!$omp end parallel !CHECK-OMP:thread=omp_get_max_threads() diff --git a/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 b/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 index 169976d74c0bf..644ab3f723aba 100644 --- a/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 +++ b/flang/test/Parser/OpenMP/compiler-directive-continuation.f90 @@ -7,10 +7,10 @@ ! CHECK-LABEL: subroutine mixed_form1() ! CHECK-E:{{^}} i = 1 & ! CHECK-E:{{^}}!$ +100& -! CHECK-E:{{^}}!$ &+ 1000& -! CHECK-E:{{^}} &+ 10 + 1& -! CHECK-E:{{^}}!$ & +100000& -! CHECK-E:{{^}} &0000 + 1000000 +! CHECK-E:{{^}}!$ &+ 1000& +! CHECK-E:{{^}} &+ 10 + 1& +! CHECK-E:{{^}}!$ & +100000& +! CHECK-E:{{^}} &0000 + 1000000 ! CHECK-OMP: i=1001001112_4 ! CHECK-NO-OMP: i=1010011_4 subroutine mixed_form1() @@ -39,8 +39,8 @@ subroutine mixed_form2() ! CHECK-LABEL: subroutine mixed_form3() ! CHECK-E:{{^}}!$ i=0 ! CHECK-E:{{^}}!$ i = 1 & -! CHECK-E:{{^}}!$ & +10 & -! CHECK-E:{{^}}!$ &+100& +! CHECK-E:{{^}}!$ & +10 & +! CHECK-E:{{^}}!$ &+100& ! CHECK-E:{{^}}!$ +1000 ! CHECK-OMP: i=0_4 ! 
CHECK-OMP: i=1111_4 diff --git a/flang/test/Parser/OpenMP/sentinels.f b/flang/test/Parser/OpenMP/sentinels.f index 299b83e2abba8..f5a2fd4f7f931 100644 --- a/flang/test/Parser/OpenMP/sentinels.f +++ b/flang/test/Parser/OpenMP/sentinels.f @@ -61,12 +61,12 @@ subroutine sub(a, b) ! Test valid chars in initial and continuation lines. ! CHECK: !$ 20 PRINT *, "msg2" -! CHECK: !$ & , "msg3" +! CHECK: !$ &, "msg3" c$ 20 PRINT *, "msg2" c$ & , "msg3" ! CHECK: !$ PRINT *, "msg4", -! CHECK: !$ & "msg5" +! CHECK: !$ &"msg5" c$ 0PRINT *, "msg4", c$ + "msg5" end diff --git a/flang/test/Parser/continuation-in-conditional-compilation.f b/flang/test/Parser/continuation-in-conditional-compilation.f index 57b69de657348..ebc6a3f875b9a 100644 --- a/flang/test/Parser/continuation-in-conditional-compilation.f +++ b/flang/test/Parser/continuation-in-conditional-compilation.f @@ -1,11 +1,12 @@ ! RUN: %flang_fc1 -E %s 2>&1 | FileCheck %s program main ! CHECK: k01=1+ -! CHECK: !$ & 1 +! CHECK: !$ &1 k01=1+ -!$ & 1 +!$ &1 -! CHECK: !$ k02=23 +! CHECK: !$ k02=2 +! CHECK: 3 ! CHECK: !$ &4 !$ k02=2 +3 diff --git a/flang/test/Preprocessing/bug136845.F b/flang/test/Preprocessing/bug136845.F new file mode 100644 index 0000000000000..ce52c2953bb57 --- /dev/null +++ b/flang/test/Preprocessing/bug136845.F @@ -0,0 +1,45 @@ +!RUN: %flang_fc1 -E %s | FileCheck --check-prefix=PREPRO %s +!RUN: %flang_fc1 -fdebug-unparse %s | FileCheck --check-prefix=NORMAL %s +!RUN: %flang_fc1 -fopenmp -fdebug-unparse %s | FileCheck --check-prefix=OMP %s + +c$ ! + +C$ + continue + + k=0 w + k=0 +c$ 0 x +c$ 1 y +c$ 2 k= z +c$ ! 
A +c$ !1 B + print *,k +*$1 continue + end + +!PREPRO:!$ & +!PREPRO: continue +!PREPRO: k=0 +!PREPRO: k=0 +!PREPRO:!$ +!PREPRO:!$ & +!PREPRO:!$ &k= +!PREPRO:!$ & +!PREPRO:!$ &1 +!PREPRO: print *,k +!PREPRO:!$ 1 continue +!PREPRO: end + +!NORMAL: k=0_4 +!NORMAL: k=0_4 +!NORMAL: PRINT *, k +!NORMAL:END PROGRAM + +!OMP: CONTINUE +!OMP: k=0_4 +!OMP: k=0_4 +!OMP: k=1_4 +!OMP: PRINT *, k +!OMP: 1 CONTINUE +!OMP:END PROGRAM From flang-commits at lists.llvm.org Mon May 12 12:16:11 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:16:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Further refinement of OpenMP !$ lines in -E mode (PR #138956) In-Reply-To: Message-ID: <682248fb.170a0220.13b1d.f05b@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/138956 From flang-commits at lists.llvm.org Mon May 12 12:18:38 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:18:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Acknowledge non-enforcement of C7108 (PR #139169) In-Reply-To: Message-ID: <6822498e.050a0220.18bd2.bec8@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/139169

From flang-commits at lists.llvm.org Mon May 12 12:27:24 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 12:27:24 -0700 (PDT) Subject: [flang-commits] [flang] 1d8ecbe - [flang] Require contiguous actual pointer for contiguous dummy pointer (#139298) Message-ID: <68224b9c.170a0220.102159.ddca@mx.google.com> Author: Peter Klausler Date: 2025-05-12T12:27:21-07:00 New Revision: 1d8ecbe9486b8a6b2839cb3001008338c3d9798d URL: https://github.com/llvm/llvm-project/commit/1d8ecbe9486b8a6b2839cb3001008338c3d9798d DIFF: https://github.com/llvm/llvm-project/commit/1d8ecbe9486b8a6b2839cb3001008338c3d9798d.diff LOG: [flang] Require contiguous actual pointer for contiguous dummy pointer (#139298) When the actual argument associated with an explicitly CONTIGUOUS pointer dummy argument is itself a pointer, it must also be contiguous. (A non-pointer actual argument can associate with a CONTIGUOUS pointer dummy argument if it's INTENT(IN), and in that case it's still just a warning if we can't prove at compilation time that the actual is contiguous.) Fixes https://github.com/llvm/llvm-project/issues/138899. Added: Modified: flang/lib/Semantics/check-call.cpp flang/lib/Semantics/pointer-assignment.cpp flang/lib/Semantics/pointer-assignment.h flang/test/Semantics/call07.f90 Removed: ################################################################################ diff --git a/flang/lib/Semantics/check-call.cpp b/flang/lib/Semantics/check-call.cpp index 231f3a4222a2c..3cf95fdab44f5 100644 --- a/flang/lib/Semantics/check-call.cpp +++ b/flang/lib/Semantics/check-call.cpp @@ -772,12 +772,13 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, } } - // Cases when temporaries might be needed but must not be permitted. 
+ bool dummyIsContiguous{ + dummy.attrs.test(characteristics::DummyDataObject::Attr::Contiguous)}; bool actualIsContiguous{IsSimplyContiguous(actual, foldingContext)}; + + // Cases when temporaries might be needed but must not be permitted. bool dummyIsAssumedShape{dummy.type.attrs().test( characteristics::TypeAndShape::Attr::AssumedShape)}; - bool dummyIsContiguous{ - dummy.attrs.test(characteristics::DummyDataObject::Attr::Contiguous)}; if ((actualIsAsynchronous || actualIsVolatile) && (dummyIsAsynchronous || dummyIsVolatile) && !dummyIsValue) { if (actualCoarrayRef) { // C1538 @@ -852,7 +853,7 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, if (scope) { semantics::CheckPointerAssignment(context, messages.at(), dummyName, dummy, actual, *scope, - /*isAssumedRank=*/dummyIsAssumedRank); + /*isAssumedRank=*/dummyIsAssumedRank, actualIsPointer); } } else if (!actualIsPointer) { messages.Say( diff --git a/flang/lib/Semantics/pointer-assignment.cpp b/flang/lib/Semantics/pointer-assignment.cpp index c17eb0aa941ec..090876912138a 100644 --- a/flang/lib/Semantics/pointer-assignment.cpp +++ b/flang/lib/Semantics/pointer-assignment.cpp @@ -59,6 +59,7 @@ class PointerAssignmentChecker { PointerAssignmentChecker &set_isBoundsRemapping(bool); PointerAssignmentChecker &set_isAssumedRank(bool); PointerAssignmentChecker &set_pointerComponentLHS(const Symbol *); + PointerAssignmentChecker &set_isRHSPointerActualArgument(bool); bool CheckLeftHandSide(const SomeExpr &); bool Check(const SomeExpr &); @@ -94,6 +95,7 @@ class PointerAssignmentChecker { bool isVolatile_{false}; bool isBoundsRemapping_{false}; bool isAssumedRank_{false}; + bool isRHSPointerActualArgument_{false}; const Symbol *pointerComponentLHS_{nullptr}; }; @@ -133,6 +135,12 @@ PointerAssignmentChecker &PointerAssignmentChecker::set_pointerComponentLHS( return *this; } +PointerAssignmentChecker & +PointerAssignmentChecker::set_isRHSPointerActualArgument(bool isPointerActual) { + 
isRHSPointerActualArgument_ = isPointerActual; + return *this; +} + bool PointerAssignmentChecker::CharacterizeProcedure() { if (!characterizedProcedure_) { characterizedProcedure_ = true; @@ -221,6 +229,9 @@ bool PointerAssignmentChecker::Check(const SomeExpr &rhs) { Say("CONTIGUOUS pointer may not be associated with a discontiguous target"_err_en_US); return false; } + } else if (isRHSPointerActualArgument_) { + Say("CONTIGUOUS pointer dummy argument may not be associated with non-CONTIGUOUS pointer actual argument"_err_en_US); + return false; } else { Warn(common::UsageWarning::PointerToPossibleNoncontiguous, "Target of CONTIGUOUS pointer association is not known to be contiguous"_warn_en_US); @@ -590,12 +601,14 @@ bool CheckStructConstructorPointerComponent(SemanticsContext &context, bool CheckPointerAssignment(SemanticsContext &context, parser::CharBlock source, const std::string &description, const DummyDataObject &lhs, - const SomeExpr &rhs, const Scope &scope, bool isAssumedRank) { + const SomeExpr &rhs, const Scope &scope, bool isAssumedRank, + bool isPointerActualArgument) { return PointerAssignmentChecker{context, scope, source, description} .set_lhsType(common::Clone(lhs.type)) .set_isContiguous(lhs.attrs.test(DummyDataObject::Attr::Contiguous)) .set_isVolatile(lhs.attrs.test(DummyDataObject::Attr::Volatile)) .set_isAssumedRank(isAssumedRank) + .set_isRHSPointerActualArgument(isPointerActualArgument) .Check(rhs); } diff --git a/flang/lib/Semantics/pointer-assignment.h b/flang/lib/Semantics/pointer-assignment.h index 269d64112fd29..ad7c6554d5a13 100644 --- a/flang/lib/Semantics/pointer-assignment.h +++ b/flang/lib/Semantics/pointer-assignment.h @@ -31,7 +31,7 @@ bool CheckPointerAssignment(SemanticsContext &, const SomeExpr &lhs, bool CheckPointerAssignment(SemanticsContext &, parser::CharBlock source, const std::string &description, const evaluate::characteristics::DummyDataObject &, const SomeExpr &rhs, - const Scope &, bool isAssumedRank); + const 
Scope &, bool isAssumedRank, bool IsPointerActualArgument); bool CheckStructConstructorPointerComponent( SemanticsContext &, const Symbol &lhs, const SomeExpr &rhs, const Scope &); diff --git a/flang/test/Semantics/call07.f90 b/flang/test/Semantics/call07.f90 index 3b5c2838fadf7..92f2bdba882d5 100644 --- a/flang/test/Semantics/call07.f90 +++ b/flang/test/Semantics/call07.f90 @@ -27,8 +27,10 @@ subroutine test !PORTABILITY: CONTIGUOUS entity 'scalar' should be an array pointer, assumed-shape, or assumed-rank real, contiguous :: scalar call s01(a03) ! ok - !WARNING: Target of CONTIGUOUS pointer association is not known to be contiguous + !ERROR: CONTIGUOUS pointer dummy argument may not be associated with non-CONTIGUOUS pointer actual argument call s01(a02) + !WARNING: Target of CONTIGUOUS pointer association is not known to be contiguous + call s01(a02(:)) !ERROR: CONTIGUOUS pointer may not be associated with a discontiguous target call s01(a03(::2)) call s02(a02) ! ok From flang-commits at lists.llvm.org Mon May 12 12:27:27 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:27:27 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Require contiguous actual pointer for contiguous dummy pointer (PR #139298) In-Reply-To: Message-ID: <68224b9f.a70a0220.22241a.bae1@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139298 From flang-commits at lists.llvm.org Mon May 12 12:27:42 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 12:27:42 -0700 (PDT) Subject: [flang-commits] [flang] 8fc1a64 - [flang] Emit error when DEFERRED binding overrides non-DEFERRED (#139325) Message-ID: <68224bae.630a0220.7f314.4949@mx.google.com> Author: Peter Klausler Date: 2025-05-12T12:27:39-07:00 New Revision: 8fc1a6496a219a2ac40e3ece8969dd99d90a8f19 URL: https://github.com/llvm/llvm-project/commit/8fc1a6496a219a2ac40e3ece8969dd99d90a8f19 DIFF: 
https://github.com/llvm/llvm-project/commit/8fc1a6496a219a2ac40e3ece8969dd99d90a8f19.diff LOG: [flang] Emit error when DEFERRED binding overrides non-DEFERRED (#139325) Fixes https://github.com/llvm/llvm-project/issues/138915. Added: flang/test/Semantics/bug138915.f90 Modified: flang/lib/Evaluate/tools.cpp flang/lib/Semantics/check-declarations.cpp Removed: ################################################################################ diff --git a/flang/lib/Evaluate/tools.cpp b/flang/lib/Evaluate/tools.cpp index d39e4c42928f3..7ce009c1d0b53 100644 --- a/flang/lib/Evaluate/tools.cpp +++ b/flang/lib/Evaluate/tools.cpp @@ -1196,16 +1196,6 @@ parser::Message *AttachDeclaration( const auto *assoc{unhosted->detailsIf()}) { unhosted = &assoc->symbol(); } - if (const auto *binding{ - unhosted->detailsIf()}) { - if (binding->symbol().name() != symbol.name()) { - message.Attach(binding->symbol().name(), - "Procedure '%s' of type '%s' is bound to '%s'"_en_US, symbol.name(), - symbol.owner().GetName().value(), binding->symbol().name()); - return &message; - } - unhosted = &binding->symbol(); - } if (const auto *use{symbol.detailsIf()}) { message.Attach(use->location(), "'%s' is USE-associated with '%s' in module '%s'"_en_US, symbol.name(), @@ -1214,6 +1204,14 @@ parser::Message *AttachDeclaration( message.Attach( unhosted->name(), "Declaration of '%s'"_en_US, unhosted->name()); } + if (const auto *binding{ + unhosted->detailsIf()}) { + if (binding->symbol().name() != symbol.name()) { + message.Attach(binding->symbol().name(), + "Procedure '%s' of type '%s' is bound to '%s'"_en_US, symbol.name(), + symbol.owner().GetName().value(), binding->symbol().name()); + } + } return &message; } diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 318085518cc57..94258444cf7ef 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -2555,6 +2555,9 @@ void 
CheckHelper::CheckProcBinding( const Symbol &symbol, const ProcBindingDetails &binding) { const Scope &dtScope{symbol.owner()}; CHECK(dtScope.kind() == Scope::Kind::DerivedType); + bool isInaccessibleDeferred{false}; + const Symbol *overridden{ + FindOverriddenBinding(symbol, isInaccessibleDeferred)}; if (symbol.attrs().test(Attr::DEFERRED)) { if (const Symbol *dtSymbol{dtScope.symbol()}) { if (!dtSymbol->attrs().test(Attr::ABSTRACT)) { // C733 @@ -2568,6 +2571,11 @@ void CheckHelper::CheckProcBinding( "Type-bound procedure '%s' may not be both DEFERRED and NON_OVERRIDABLE"_err_en_US, symbol.name()); } + if (overridden && !overridden->attrs().test(Attr::DEFERRED)) { + SayWithDeclaration(*overridden, + "Override of non-DEFERRED '%s' must not be DEFERRED"_err_en_US, + symbol.name()); + } } if (binding.symbol().attrs().test(Attr::INTRINSIC) && !context_.intrinsics().IsSpecificIntrinsicFunction( @@ -2576,9 +2584,7 @@ void CheckHelper::CheckProcBinding( "Intrinsic procedure '%s' is not a specific intrinsic permitted for use in the definition of binding '%s'"_err_en_US, binding.symbol().name(), symbol.name()); } - bool isInaccessibleDeferred{false}; - if (const Symbol * - overridden{FindOverriddenBinding(symbol, isInaccessibleDeferred)}) { + if (overridden) { if (isInaccessibleDeferred) { SayWithDeclaration(*overridden, "Override of PRIVATE DEFERRED '%s' must appear in its module"_err_en_US, diff --git a/flang/test/Semantics/bug138915.f90 b/flang/test/Semantics/bug138915.f90 new file mode 100644 index 0000000000000..786a4ac2d930b --- /dev/null +++ b/flang/test/Semantics/bug138915.f90 @@ -0,0 +1,15 @@ +! 
RUN: %python %S/test_errors.py %s %flang_fc1 +module m + type base + contains + procedure, nopass :: tbp + end type + type, extends(base), abstract :: child + contains + !ERROR: Override of non-DEFERRED 'tbp' must not be DEFERRED + procedure(tbp), deferred, nopass :: tbp + end type + contains + subroutine tbp + end +end From flang-commits at lists.llvm.org Mon May 12 12:27:45 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:27:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Emit error when DEFERRED binding overrides non-DEFERRED (PR #139325) In-Reply-To: Message-ID: <68224bb1.620a0220.3bdce.d6ea@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139325 From flang-commits at lists.llvm.org Mon May 12 12:27:59 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 12:27:59 -0700 (PDT) Subject: [flang-commits] [flang] d90bbf1 - [flang] Stricter checking of v_list DIO arguments (#139329) Message-ID: <68224bbf.630a0220.291a11.48f7@mx.google.com> Author: Peter Klausler Date: 2025-05-12T12:27:56-07:00 New Revision: d90bbf147b5024bcfef80a8a6602596cb31a9143 URL: https://github.com/llvm/llvm-project/commit/d90bbf147b5024bcfef80a8a6602596cb31a9143 DIFF: https://github.com/llvm/llvm-project/commit/d90bbf147b5024bcfef80a8a6602596cb31a9143.diff LOG: [flang] Stricter checking of v_list DIO arguments (#139329) Catch assumed-rank arguments to defined I/O subroutines, and ensure that v_list dummy arguments are vectors. Fixes https://github.com/llvm/llvm-project/issues/138933. 
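As a rough illustration of the rule in the log message above, the following standalone sketch models the two new v_list checks. It is not flang code: `DummyShape` and `CheckVlistDummy` are hypothetical names invented for this example, and only the two diagnostic strings are taken from the patch that follows.

```cpp
#include <optional>
#include <string>

// Hypothetical, simplified description of a dummy argument's shape.
// These fields are illustrative only; they are not flang's data structures.
struct DummyShape {
  bool isAssumedRank;   // declared as vlist(..)
  bool isAssumedShape;  // every dimension declared as ':'
  int rank;             // number of dimensions; ignored when assumed-rank
};

// Returns the diagnostic text if the v_list dummy is rejected, or
// std::nullopt if it is acceptable. The assumed-rank test comes first,
// then the "assumed shape vector" test.
std::optional<std::string> CheckVlistDummy(
    const std::string &name, const DummyShape &shape) {
  if (shape.isAssumedRank) {
    return "Dummy argument '" + name + "' may not be assumed-rank";
  }
  if (!shape.isAssumedShape || shape.rank != 1) {
    return "Dummy argument '" + name +
        "' of a defined input/output procedure must be assumed shape vector";
  }
  return std::nullopt;
}
```

Under these assumptions, `vlist(:)` is accepted, `vlist(5)` and `vlist(:,:)` produce the "assumed shape vector" message, and `vlist(..)` produces the assumed-rank message, which matches the expectations in the updated io11.f90 test.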
Added: Modified: flang/lib/Semantics/check-declarations.cpp flang/test/Semantics/io11.f90 Removed: ################################################################################ diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index 94258444cf7ef..a86f78154b859 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -1192,7 +1192,7 @@ void CheckHelper::CheckObjectEntity( typeName); } else if (evaluate::IsAssumedRank(symbol)) { SayWithDeclaration(symbol, - "Assumed Rank entity of %s type is not supported"_err_en_US, + "Assumed rank entity of %s type is not supported"_err_en_US, typeName); } } @@ -3420,7 +3420,13 @@ void CheckHelper::CheckBindC(const Symbol &symbol) { bool CheckHelper::CheckDioDummyIsData( const Symbol &subp, const Symbol *arg, std::size_t position) { if (arg && arg->detailsIf()) { - return true; + if (evaluate::IsAssumedRank(*arg)) { + messages_.Say(arg->name(), + "Dummy argument '%s' may not be assumed-rank"_err_en_US, arg->name()); + return false; + } else { + return true; + } } else { if (arg) { messages_.Say(arg->name(), @@ -3598,9 +3604,10 @@ void CheckHelper::CheckDioVlistArg( CheckDioDummyIsDefaultInteger(subp, *arg); CheckDioDummyAttrs(subp, *arg, Attr::INTENT_IN); const auto *objectDetails{arg->detailsIf()}; - if (!objectDetails || !objectDetails->shape().CanBeAssumedShape()) { + if (!objectDetails || !objectDetails->shape().CanBeAssumedShape() || + objectDetails->shape().Rank() != 1) { messages_.Say(arg->name(), - "Dummy argument '%s' of a defined input/output procedure must be assumed shape"_err_en_US, + "Dummy argument '%s' of a defined input/output procedure must be assumed shape vector"_err_en_US, arg->name()); } } diff --git a/flang/test/Semantics/io11.f90 b/flang/test/Semantics/io11.f90 index 3529929003b01..c00deede6b516 100644 --- a/flang/test/Semantics/io11.f90 +++ b/flang/test/Semantics/io11.f90 @@ -342,7 +342,7 @@ subroutine 
formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) end subroutine end module m15 -module m16 +module m16a type,public :: t integer c contains @@ -355,15 +355,58 @@ subroutine formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) class(t), intent(inout) :: dtv integer, intent(in) :: unit character(len=*), intent(in) :: iotype - !ERROR: Dummy argument 'vlist' of a defined input/output procedure must be assumed shape + !ERROR: Dummy argument 'vlist' of a defined input/output procedure must be assumed shape vector integer, intent(in) :: vlist(5) integer, intent(out) :: iostat character(len=*), intent(inout) :: iomsg + iostat = 343 + stop 'fail' + end subroutine +end module m16a +module m16b + type,public :: t + integer c + contains + procedure, pass :: tbp=>formattedReadProc + generic :: read(formatted) => tbp + end type + private +contains + subroutine formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) + class(t), intent(inout) :: dtv + integer, intent(in) :: unit + character(len=*), intent(in) :: iotype + !ERROR: Dummy argument 'vlist' of a defined input/output procedure must be assumed shape vector + integer, intent(in) :: vlist(:,:) + integer, intent(out) :: iostat + character(len=*), intent(inout) :: iomsg + iostat = 343 + stop 'fail' + end subroutine +end module m16b + +module m16c + type,public :: t + integer c + contains + procedure, pass :: tbp=>formattedReadProc + generic :: read(formatted) => tbp + end type + private +contains + subroutine formattedReadProc(dtv, unit, iotype, vlist, iostat, iomsg) + class(t), intent(inout) :: dtv + integer, intent(in) :: unit + character(len=*), intent(in) :: iotype + !ERROR: Dummy argument 'vlist' may not be assumed-rank + integer, intent(in) :: vlist(..) + integer, intent(out) :: iostat + character(len=*), intent(inout) :: iomsg iostat = 343 stop 'fail' end subroutine -end module m16 +end module m16c module m17 ! 
Test the same defined input/output procedure specified as a generic From flang-commits at lists.llvm.org Mon May 12 12:28:03 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:28:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Stricter checking of v_list DIO arguments (PR #139329) In-Reply-To: Message-ID: <68224bc3.630a0220.2e524d.5915@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139329 From flang-commits at lists.llvm.org Mon May 12 12:28:17 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 12:28:17 -0700 (PDT) Subject: [flang-commits] [flang] 0d55927 - [flang] Catch deferred type parameters in ALLOCATE(type-spec::) (#139334) Message-ID: <68224bd1.050a0220.2b5cfa.c293@mx.google.com> Author: Peter Klausler Date: 2025-05-12T12:28:13-07:00 New Revision: 0d5592713b93bf9dbf305f1d923e8a85b2ba3350 URL: https://github.com/llvm/llvm-project/commit/0d5592713b93bf9dbf305f1d923e8a85b2ba3350 DIFF: https://github.com/llvm/llvm-project/commit/0d5592713b93bf9dbf305f1d923e8a85b2ba3350.diff LOG: [flang] Catch deferred type parameters in ALLOCATE(type-spec::) (#139334) The type-spec in ALLOCATE may not have any deferred type parameters. Fixes https://github.com/llvm/llvm-project/issues/138979. 
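The constraint in the log message above can be pictured with a small standalone model. This is only a sketch, not flang's implementation: `ParamValue`, `Explicit`, `Assumed`, `Deferred`, and `IsValidAllocateTypeSpec` are hypothetical names, and only the name `HasDeferredTypeParameter` echoes the member function called in the patch.

```cpp
#include <variant>
#include <vector>

// Hypothetical model of one type parameter value in a type-spec.
struct Explicit { long value; };  // e.g. character(123)
struct Assumed {};                // e.g. character(*); not checked here
struct Deferred {};               // e.g. character(:)
using ParamValue = std::variant<Explicit, Assumed, Deferred>;

// True if any type parameter of the type-spec is deferred (':').
bool HasDeferredTypeParameter(const std::vector<ParamValue> &params) {
  for (const ParamValue &p : params) {
    if (std::holds_alternative<Deferred>(p)) {
      return true;
    }
  }
  return false;
}

// Models only the rule from this commit: the type-spec in ALLOCATE may not
// have a deferred type parameter. Other ALLOCATE constraints are ignored.
bool IsValidAllocateTypeSpec(const std::vector<ParamValue> &params) {
  return !HasDeferredTypeParameter(params);
}
```

In this model, `allocate(character(:) :: chvar)` corresponds to a parameter list holding `Deferred{}` and is rejected, while `allocate(character(123) :: chvar)` corresponds to `Explicit{123}` and is accepted.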
Added: Modified: flang/lib/Semantics/check-allocate.cpp flang/test/Semantics/allocate01.f90 Removed: ################################################################################ diff --git a/flang/lib/Semantics/check-allocate.cpp b/flang/lib/Semantics/check-allocate.cpp index b426dd81334bb..2c215f45bf516 100644 --- a/flang/lib/Semantics/check-allocate.cpp +++ b/flang/lib/Semantics/check-allocate.cpp @@ -116,13 +116,19 @@ static std::optional CheckAllocateOptions( // C937 if (auto it{FindCoarrayUltimateComponent(*derived)}) { context - .Say("Type-spec in ALLOCATE must not specify a type with a coarray" - " ultimate component"_err_en_US) + .Say( + "Type-spec in ALLOCATE must not specify a type with a coarray ultimate component"_err_en_US) .Attach(it->name(), "Type '%s' has coarray ultimate component '%s' declared here"_en_US, info.typeSpec->AsFortran(), it.BuildResultDesignatorName()); } } + if (auto dyType{evaluate::DynamicType::From(*info.typeSpec)}) { + if (dyType->HasDeferredTypeParameter()) { + context.Say( + "Type-spec in ALLOCATE must not have a deferred type parameter"_err_en_US); + } + } } const parser::Expr *parserSourceExpr{nullptr}; diff --git a/flang/test/Semantics/allocate01.f90 b/flang/test/Semantics/allocate01.f90 index a66e2467cbe4e..a10a7259ae94f 100644 --- a/flang/test/Semantics/allocate01.f90 +++ b/flang/test/Semantics/allocate01.f90 @@ -62,6 +62,7 @@ subroutine bar() real, pointer, save :: okp3 real, allocatable, save :: oka3, okac4[:,:] real, allocatable :: okacd5(:, :)[:] + character(:), allocatable :: chvar !ERROR: Name in ALLOCATE statement must be a variable name allocate(foo) @@ -102,6 +103,8 @@ subroutine bar() allocate(edc9%nok) !ERROR: Entity in ALLOCATE statement must have the ALLOCATABLE or POINTER attribute allocate(edc10) + !ERROR: Type-spec in ALLOCATE must not have a deferred type parameter + allocate(character(:) :: chvar) ! 
No errors expected below: allocate(a_var) @@ -117,4 +120,5 @@ subroutine bar() allocate(edc9%ok(4)) allocate(edc10%ok) allocate(rp) + allocate(character(123) :: chvar) end subroutine From flang-commits at lists.llvm.org Mon May 12 12:28:20 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:28:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Catch deferred type parameters in ALLOCATE(type-spec::) (PR #139334) In-Reply-To: Message-ID: <68224bd4.170a0220.34f449.c0f9@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139334 From flang-commits at lists.llvm.org Mon May 12 12:28:35 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 12:28:35 -0700 (PDT) Subject: [flang-commits] [flang] f600154 - [flang] PRIVATE statement in derived type applies to proc components (#139336) Message-ID: <68224be3.170a0220.291061.9072@mx.google.com> Author: Peter Klausler Date: 2025-05-12T12:28:31-07:00 New Revision: f600154ebf3b947e6ae1e5ab307dfaa4a9e2f78a URL: https://github.com/llvm/llvm-project/commit/f600154ebf3b947e6ae1e5ab307dfaa4a9e2f78a DIFF: https://github.com/llvm/llvm-project/commit/f600154ebf3b947e6ae1e5ab307dfaa4a9e2f78a.diff LOG: [flang] PRIVATE statement in derived type applies to proc components (#139336) A PRIVATE statement in a derived type definition is failing to set the default accessibility of procedure pointer components; fix. Fixes https://github.com/llvm/llvm-project/issues/138911. 
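To make the defaulting rule above concrete, here is a minimal standalone sketch. The `Attr` and `Attrs` types and the function `ApplyDefaultAccessibility` are hypothetical stand-ins written for this example and are not flang's classes. Only the condition itself (PRIVATE is applied when the derived type has a PRIVATE statement and the component has no explicit accessibility) mirrors the patch that follows.

```cpp
#include <bitset>
#include <cstddef>
#include <initializer_list>

// Hypothetical attribute set; flang's real Attr/Attrs types are richer.
enum class Attr : std::size_t { PUBLIC, PRIVATE, POINTER, NOPASS, Count };

class Attrs {
public:
  Attrs() = default;
  Attrs(std::initializer_list<Attr> list) {
    for (Attr a : list) {
      set(a);
    }
  }
  void set(Attr a) { bits_.set(static_cast<std::size_t>(a)); }
  bool test(Attr a) const { return bits_.test(static_cast<std::size_t>(a)); }
  // True if at least one of the listed attributes is present.
  bool HasAny(std::initializer_list<Attr> list) const {
    for (Attr a : list) {
      if (test(a)) {
        return true;
      }
    }
    return false;
  }

private:
  std::bitset<static_cast<std::size_t>(Attr::Count)> bits_;
};

// privateComps models "the derived type definition contains a PRIVATE
// statement". An explicit PUBLIC or PRIVATE on the component wins.
Attrs ApplyDefaultAccessibility(Attrs attrs, bool privateComps) {
  if (privateComps && !attrs.HasAny({Attr::PUBLIC, Attr::PRIVATE})) {
    attrs.set(Attr::PRIVATE);
  }
  return attrs;
}
```

With this model, a procedure pointer component declared with only `pointer, nopass` inside a type that has a PRIVATE statement ends up PRIVATE, whereas one declared with an explicit `public` keeps its PUBLIC accessibility.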
Added: Modified: flang/lib/Semantics/resolve-names.cpp flang/lib/Semantics/tools.cpp flang/test/Semantics/c_loc01.f90 flang/test/Semantics/resolve34.f90 Removed: ################################################################################ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index b2979690f78e7..bdafc03ad2c05 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6350,6 +6350,10 @@ void DeclarationVisitor::Post(const parser::ProcDecl &x) { if (!dtDetails) { attrs.set(Attr::EXTERNAL); } + if (derivedTypeInfo_.privateComps && + !attrs.HasAny({Attr::PUBLIC, Attr::PRIVATE})) { + attrs.set(Attr::PRIVATE); + } Symbol &symbol{DeclareProcEntity(name, attrs, procInterface)}; SetCUDADataAttr(name.source, symbol, cudaDataAttr()); // for error symbol.ReplaceName(name.source); diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 08d260555f37e..1d1e3ac044166 100644 --- a/flang/lib/Semantics/tools.cpp +++ b/flang/lib/Semantics/tools.cpp @@ -1076,7 +1076,7 @@ std::optional CheckAccessibleSymbol( return std::nullopt; } else { return parser::MessageFormattedText{ - "PRIVATE name '%s' is only accessible within module '%s'"_err_en_US, + "PRIVATE name '%s' is accessible only within module '%s'"_err_en_US, symbol.name(), DEREF(FindModuleContaining(symbol.owner())).GetName().value()}; } diff --git a/flang/test/Semantics/c_loc01.f90 b/flang/test/Semantics/c_loc01.f90 index abae1e263e2e2..a515a7a64f02a 100644 --- a/flang/test/Semantics/c_loc01.f90 +++ b/flang/test/Semantics/c_loc01.f90 @@ -48,9 +48,9 @@ subroutine test(assumedType, poly, nclen, n) cp = c_loc(ch(1:1)) ! ok cp = c_loc(deferred) ! ok cp = c_loc(p2ch) ! 
ok - !ERROR: PRIVATE name '__address' is only accessible within module '__fortran_builtins' + !ERROR: PRIVATE name '__address' is accessible only within module '__fortran_builtins' cp = c_ptr(0) - !ERROR: PRIVATE name '__address' is only accessible within module '__fortran_builtins' + !ERROR: PRIVATE name '__address' is accessible only within module '__fortran_builtins' cfp = c_funptr(0) !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches operand types TYPE(c_ptr) and TYPE(c_funptr) cp = cfp diff --git a/flang/test/Semantics/resolve34.f90 b/flang/test/Semantics/resolve34.f90 index 39709a362b363..da1b80b5a50b0 100644 --- a/flang/test/Semantics/resolve34.f90 +++ b/flang/test/Semantics/resolve34.f90 @@ -90,16 +90,37 @@ module m7 integer :: i2 integer, private :: i3 end type + type :: t3 + private + integer :: i4 = 0 + procedure(real), pointer, nopass :: pp1 => null() + end type + type, extends(t3) :: t4 + private + integer :: i5 + procedure(real), pointer, nopass :: pp2 + end type end subroutine s7 use m7 type(t2) :: x + type(t4) :: y integer :: j j = x%i2 - !ERROR: PRIVATE name 'i3' is only accessible within module 'm7' + !ERROR: PRIVATE name 'i3' is accessible only within module 'm7' j = x%i3 - !ERROR: PRIVATE name 't1' is only accessible within module 'm7' + !ERROR: PRIVATE name 't1' is accessible only within module 'm7' j = x%t1%i1 + !ok, parent component is not affected by PRIVATE in t4 + y%t3 = t3() + !ERROR: PRIVATE name 'i4' is accessible only within module 'm7' + y%i4 = 0 + !ERROR: PRIVATE name 'pp1' is accessible only within module 'm7' + y%pp1 => null() + !ERROR: PRIVATE name 'i5' is accessible only within module 'm7' + y%i5 = 0 + !ERROR: PRIVATE name 'pp2' is accessible only within module 'm7' + y%pp2 => null() end ! 
7.5.4.8(2) @@ -122,11 +143,11 @@ subroutine s1 subroutine s8 use m8 type(t) :: x - !ERROR: PRIVATE name 'i2' is only accessible within module 'm8' + !ERROR: PRIVATE name 'i2' is accessible only within module 'm8' x = t(2, 5) - !ERROR: PRIVATE name 'i2' is only accessible within module 'm8' + !ERROR: PRIVATE name 'i2' is accessible only within module 'm8' x = t(i1=2, i2=5) - !ERROR: PRIVATE name 'i2' is only accessible within module 'm8' + !ERROR: PRIVATE name 'i2' is accessible only within module 'm8' a = [y%i2] end @@ -166,6 +187,6 @@ subroutine s10 use m10 type(t) x x = t(1) - !ERROR: PRIVATE name 'operator(+)' is only accessible within module 'm10' + !ERROR: PRIVATE name 'operator(+)' is accessible only within module 'm10' x = x + x end subroutine From flang-commits at lists.llvm.org Mon May 12 12:28:37 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:28:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang] PRIVATE statement in derived type applies to proc components (PR #139336) In-Reply-To: Message-ID: <68224be5.170a0220.74286.acb4@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139336 From flang-commits at lists.llvm.org Mon May 12 12:28:52 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 12:28:52 -0700 (PDT) Subject: [flang-commits] [flang] 39b0433 - [flang] Extend assumed-size array checking in intrinsic functions (#139339) Message-ID: <68224bf4.050a0220.2897b4.e8ba@mx.google.com> Author: Peter Klausler Date: 2025-05-12T12:28:50-07:00 New Revision: 39b04335ef3021399f8c0dc43837a45537b62e54 URL: https://github.com/llvm/llvm-project/commit/39b04335ef3021399f8c0dc43837a45537b62e54 DIFF: https://github.com/llvm/llvm-project/commit/39b04335ef3021399f8c0dc43837a45537b62e54.diff LOG: [flang] Extend assumed-size array checking in intrinsic functions (#139339) The array argument of a reference to the intrinsic functions SHAPE can't 
be assumed-size; and for SIZE and UBOUND, it can be assumed-size only if DIM= is present. The checks for these restrictions don't allow for host association, or for associate entities (ASSOCIATE, SELECT TYPE) that are variables. Fixes https://github.com/llvm/llvm-project/issues/138926. Added: Modified: flang/lib/Evaluate/intrinsics.cpp flang/test/Semantics/misc-intrinsics.f90 Removed: ################################################################################ diff --git a/flang/lib/Evaluate/intrinsics.cpp b/flang/lib/Evaluate/intrinsics.cpp index d64a008e3db84..e802915945e26 100644 --- a/flang/lib/Evaluate/intrinsics.cpp +++ b/flang/lib/Evaluate/intrinsics.cpp @@ -2340,7 +2340,7 @@ std::optional IntrinsicInterface::Match( if (!knownArg) { knownArg = arg; } - if (!dimArg && rank > 0 && + if (rank > 0 && (std::strcmp(name, "shape") == 0 || std::strcmp(name, "size") == 0 || std::strcmp(name, "ubound") == 0)) { @@ -2351,16 +2351,18 @@ std::optional IntrinsicInterface::Match( // over this one, as this error is caught by the second entry // for UBOUND.) 
if (auto named{ExtractNamedEntity(*arg)}) { - if (semantics::IsAssumedSizeArray(named->GetLastSymbol())) { + if (semantics::IsAssumedSizeArray(ResolveAssociations( + named->GetLastSymbol().GetUltimate()))) { if (strcmp(name, "shape") == 0) { messages.Say(arg->sourceLocation(), "The 'source=' argument to the intrinsic function 'shape' may not be assumed-size"_err_en_US); - } else { + return std::nullopt; + } else if (!dimArg) { messages.Say(arg->sourceLocation(), "A dim= argument is required for '%s' when the array is assumed-size"_err_en_US, name); + return std::nullopt; } - return std::nullopt; } } } diff --git a/flang/test/Semantics/misc-intrinsics.f90 b/flang/test/Semantics/misc-intrinsics.f90 index 14dcdb05ac6c6..a7895f7b7f16f 100644 --- a/flang/test/Semantics/misc-intrinsics.f90 +++ b/flang/test/Semantics/misc-intrinsics.f90 @@ -3,17 +3,37 @@ program test_size real :: scalar real, dimension(5, 5) :: array - call test(array, array) + call test(array, array, array) contains - subroutine test(arg, assumedRank) + subroutine test(arg, assumedRank, poly) real, dimension(5, *) :: arg real, dimension(..) :: assumedRank + class(*) :: poly(5, *) !ERROR: A dim= argument is required for 'size' when the array is assumed-size print *, size(arg) + print *, size(arg, dim=1) ! ok + select type (poly) + type is (real) + !ERROR: A dim= argument is required for 'size' when the array is assumed-size + print *, size(poly) + print *, size(poly, dim=1) ! ok + end select !ERROR: A dim= argument is required for 'ubound' when the array is assumed-size print *, ubound(arg) + print *, ubound(arg, dim=1) ! ok + select type (poly) + type is (real) + !ERROR: A dim= argument is required for 'ubound' when the array is assumed-size + print *, ubound(poly) + print *, ubound(poly, dim=1) ! 
ok + end select !ERROR: The 'source=' argument to the intrinsic function 'shape' may not be assumed-size print *, shape(arg) + select type (poly) + type is (real) + !ERROR: The 'source=' argument to the intrinsic function 'shape' may not be assumed-size + print *, shape(poly) + end select !ERROR: The 'harvest=' argument to the intrinsic procedure 'random_number' may not be assumed-size call random_number(arg) !ERROR: 'array=' argument has unacceptable rank 0 @@ -85,5 +105,16 @@ subroutine test(arg, assumedRank) print *, lbound(assumedRank, dim=2) print *, ubound(assumedRank, dim=2) end select + contains + subroutine inner + !ERROR: A dim= argument is required for 'size' when the array is assumed-size + print *, size(arg) + print *, size(arg, dim=1) ! ok + !ERROR: A dim= argument is required for 'ubound' when the array is assumed-size + print *, ubound(arg) + print *, ubound(arg, dim=1) ! ok + !ERROR: The 'source=' argument to the intrinsic function 'shape' may not be assumed-size + print *, shape(arg) + end end subroutine end From flang-commits at lists.llvm.org Mon May 12 12:28:56 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 12 May 2025 12:28:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Extend assumed-size array checking in intrinsic functions (PR #139339) In-Reply-To: Message-ID: <68224bf8.170a0220.2ad15a.8fdf@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139339 From flang-commits at lists.llvm.org Mon May 12 12:02:18 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 12:02:18 -0700 (PDT) Subject: [flang-commits] [flang] 9f8ff4b - [flang] Revamp evaluate::CoarrayRef (#136628) Message-ID: <682245ba.170a0220.5b497.9821@mx.google.com> Author: Peter Klausler Date: 2025-05-12T12:02:15-07:00 New Revision: 9f8ff4b77d07570294a020c24bc347285c3affdc URL: https://github.com/llvm/llvm-project/commit/9f8ff4b77d07570294a020c24bc347285c3affdc 
DIFF: https://github.com/llvm/llvm-project/commit/9f8ff4b77d07570294a020c24bc347285c3affdc.diff LOG: [flang] Revamp evaluate::CoarrayRef (#136628) Bring the typed expression representation of a coindexed reference up to F'2023, which removed some restrictions that had allowed the current representation to suffice for older revisions of the language. This new representation is somewhat more simple -- it uses a DataRef as its base, so any subscripts in a part-ref can be represented as an ArrayRef there. Update the code that creates the CoarrayRef, and add more checking to it, as well as actually capturing any STAT=, TEAM=, & TEAM_NUMBER= specifiers that might appear. Enforce the constraint that the part-ref must have subscripts if it is an array. (And update a pile of copied-and-pasted test code that lacked such subscripts.) Added: Modified: flang/include/flang/Evaluate/tools.h flang/include/flang/Evaluate/traverse.h flang/include/flang/Evaluate/variable.h flang/lib/Evaluate/check-expression.cpp flang/lib/Evaluate/fold.cpp flang/lib/Evaluate/formatting.cpp flang/lib/Evaluate/shape.cpp flang/lib/Evaluate/tools.cpp flang/lib/Evaluate/variable.cpp flang/lib/Lower/Support/Utils.cpp flang/lib/Semantics/check-coarray.cpp flang/lib/Semantics/check-coarray.h flang/lib/Semantics/dump-expr.cpp flang/lib/Semantics/expression.cpp flang/test/Semantics/atomic02.f90 flang/test/Semantics/atomic03.f90 flang/test/Semantics/atomic04.f90 flang/test/Semantics/atomic05.f90 flang/test/Semantics/atomic06.f90 flang/test/Semantics/atomic07.f90 flang/test/Semantics/atomic08.f90 flang/test/Semantics/atomic09.f90 flang/test/Semantics/atomic10.f90 flang/test/Semantics/atomic11.f90 flang/test/Semantics/coarrays02.f90 flang/test/Semantics/coshape.f90 flang/test/Semantics/error_stop1b.f90 flang/test/Semantics/event01b.f90 flang/test/Semantics/resolve94.f90 Removed: ################################################################################ diff --git a/flang/include/flang/Evaluate/tools.h 
b/flang/include/flang/Evaluate/tools.h index 5cdabb3056d8f..922af4190822d 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -399,20 +399,17 @@ template bool IsArrayElement(const Expr &expr, bool intoSubstring = true, bool skipComponents = false) { if (auto dataRef{ExtractDataRef(expr, intoSubstring)}) { - const DataRef *ref{&*dataRef}; - if (skipComponents) { - while (const Component * component{std::get_if(&ref->u)}) { - ref = &component->base(); + for (const DataRef *ref{&*dataRef}; ref;) { + if (const Component * component{std::get_if(&ref->u)}) { + ref = skipComponents ? &component->base() : nullptr; + } else if (const auto *coarrayRef{std::get_if(&ref->u)}) { + ref = &coarrayRef->base(); + } else { + return std::holds_alternative(ref->u); } } - if (const auto *coarrayRef{std::get_if(&ref->u)}) { - return !coarrayRef->subscript().empty(); - } else { - return std::holds_alternative(ref->u); - } - } else { - return false; } + return false; } template @@ -426,9 +423,6 @@ std::optional ExtractNamedEntity(const A &x) { [](Component &&component) -> std::optional { return NamedEntity{std::move(component)}; }, - [](CoarrayRef &&co) -> std::optional { - return co.GetBase(); - }, [](auto &&) { return std::optional{}; }, }, std::move(dataRef->u)); @@ -536,22 +530,14 @@ const Symbol *UnwrapWholeSymbolOrComponentDataRef(const A &x) { // If an expression is a whole symbol or a whole component designator, // potentially followed by an image selector, extract and return that symbol, // else null. 
+const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const DataRef &); template const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const A &x) { if (auto dataRef{ExtractDataRef(x)}) { - if (const SymbolRef * p{std::get_if(&dataRef->u)}) { - return &p->get(); - } else if (const Component * c{std::get_if(&dataRef->u)}) { - if (c->base().Rank() == 0) { - return &c->GetLastSymbol(); - } - } else if (const CoarrayRef * c{std::get_if(&dataRef->u)}) { - if (c->subscript().empty()) { - return &c->GetLastSymbol(); - } - } + return UnwrapWholeSymbolOrComponentOrCoarrayRef(*dataRef); + } else { + return nullptr; } - return nullptr; } // GetFirstSymbol(A%B%C[I]%D) -> A diff --git a/flang/include/flang/Evaluate/traverse.h b/flang/include/flang/Evaluate/traverse.h index 45402143604f4..48aafa8982559 100644 --- a/flang/include/flang/Evaluate/traverse.h +++ b/flang/include/flang/Evaluate/traverse.h @@ -146,8 +146,7 @@ class Traverse { return Combine(x.base(), x.subscript()); } Result operator()(const CoarrayRef &x) const { - return Combine( - x.base(), x.subscript(), x.cosubscript(), x.stat(), x.team()); + return Combine(x.base(), x.cosubscript(), x.stat(), x.team()); } Result operator()(const DataRef &x) const { return visitor_(x.u); } Result operator()(const Substring &x) const { diff --git a/flang/include/flang/Evaluate/variable.h b/flang/include/flang/Evaluate/variable.h index 7f1518fd26e78..5c14421fd3a1b 100644 --- a/flang/include/flang/Evaluate/variable.h +++ b/flang/include/flang/Evaluate/variable.h @@ -98,8 +98,6 @@ class Component { // A NamedEntity is either a whole Symbol or a component in an instance // of a derived type. It may be a descriptor. -// TODO: this is basically a symbol with an optional DataRef base; -// could be used to replace Component. class NamedEntity { public: CLASS_BOILERPLATE(NamedEntity) @@ -239,28 +237,16 @@ class ArrayRef { std::vector subscript_; }; -// R914 coindexed-named-object -// R924 image-selector, R926 image-selector-spec. 
-// C825 severely limits the usage of derived types with coarray ultimate -// components: they can't be pointers, allocatables, arrays, coarrays, or -// function results. They can be components of other derived types. -// Although the F'2018 Standard never prohibits multiple image-selectors -// per se in the same data-ref or designator, nor the presence of an -// image-selector after a part-ref with rank, the constraints on the -// derived types that would have be involved make it impossible to declare -// an object that could be referenced in these ways (esp. C748 & C825). -// C930 precludes having both TEAM= and TEAM_NUMBER=. -// TODO C931 prohibits the use of a coindexed object as a stat-variable. +// A coindexed data-ref. The base is represented as a general +// DataRef, but the base may not contain a CoarrayRef and may +// have rank > 0 only in an uppermost ArrayRef. class CoarrayRef { public: CLASS_BOILERPLATE(CoarrayRef) - CoarrayRef(SymbolVector &&, std::vector &&, - std::vector> &&); + CoarrayRef(DataRef &&, std::vector> &&); - const SymbolVector &base() const { return base_; } - SymbolVector &base() { return base_; } - const std::vector &subscript() const { return subscript_; } - std::vector &subscript() { return subscript_; } + const DataRef &base() const { return base_.value(); } + DataRef &base() { return base_.value(); } const std::vector> &cosubscript() const { return cosubscript_; } @@ -270,25 +256,24 @@ class CoarrayRef { // (i.e., Designator or pointer-valued FunctionRef). std::optional> stat() const; CoarrayRef &set_stat(Expr &&); - std::optional> team() const; - bool teamIsTeamNumber() const { return teamIsTeamNumber_; } - CoarrayRef &set_team(Expr &&, bool isTeamNumber = false); + // When team() is Expr, it's TEAM_NUMBER=; otherwise, + // it's TEAM=. 
+ std::optional> team() const; + CoarrayRef &set_team(Expr &&); int Rank() const; int Corank() const { return 0; } const Symbol &GetFirstSymbol() const; const Symbol &GetLastSymbol() const; - NamedEntity GetBase() const; std::optional> LEN() const; bool operator==(const CoarrayRef &) const; llvm::raw_ostream &AsFortran(llvm::raw_ostream &) const; private: - SymbolVector base_; - std::vector subscript_; + common::CopyableIndirection base_; std::vector> cosubscript_; - std::optional>> stat_, team_; - bool teamIsTeamNumber_{false}; // false: TEAM=, true: TEAM_NUMBER= + std::optional>> stat_; + std::optional>> team_; }; // R911 data-ref is defined syntactically as a series of part-refs, which diff --git a/flang/lib/Evaluate/check-expression.cpp b/flang/lib/Evaluate/check-expression.cpp index d8baaf2e2a7ac..3d7f01d56c465 100644 --- a/flang/lib/Evaluate/check-expression.cpp +++ b/flang/lib/Evaluate/check-expression.cpp @@ -946,10 +946,7 @@ class IsContiguousHelper return std::nullopt; } } - Result operator()(const CoarrayRef &x) const { - int rank{0}; - return CheckSubscripts(x.subscript(), rank).has_value(); - } + Result operator()(const CoarrayRef &x) const { return (*this)(x.base()); } Result operator()(const Component &x) const { if (x.base().Rank() == 0) { return (*this)(x.GetLastSymbol()); diff --git a/flang/lib/Evaluate/fold.cpp b/flang/lib/Evaluate/fold.cpp index 5fc31728ce5d6..45e842abf589f 100644 --- a/flang/lib/Evaluate/fold.cpp +++ b/flang/lib/Evaluate/fold.cpp @@ -162,22 +162,17 @@ ArrayRef FoldOperation(FoldingContext &context, ArrayRef &&arrayRef) { } CoarrayRef FoldOperation(FoldingContext &context, CoarrayRef &&coarrayRef) { - std::vector subscript; - for (Subscript x : coarrayRef.subscript()) { - subscript.emplace_back(FoldOperation(context, std::move(x))); - } + DataRef base{FoldOperation(context, std::move(coarrayRef.base()))}; std::vector> cosubscript; for (Expr x : coarrayRef.cosubscript()) { cosubscript.emplace_back(Fold(context, std::move(x))); } 
- CoarrayRef folded{std::move(coarrayRef.base()), std::move(subscript), - std::move(cosubscript)}; + CoarrayRef folded{std::move(base), std::move(cosubscript)}; if (std::optional> stat{coarrayRef.stat()}) { folded.set_stat(Fold(context, std::move(*stat))); } - if (std::optional> team{coarrayRef.team()}) { - folded.set_team( - Fold(context, std::move(*team)), coarrayRef.teamIsTeamNumber()); + if (std::optional> team{coarrayRef.team()}) { + folded.set_team(Fold(context, std::move(*team))); } return folded; } diff --git a/flang/lib/Evaluate/formatting.cpp b/flang/lib/Evaluate/formatting.cpp index 6778fac9a44fd..121afc6f0f8bf 100644 --- a/flang/lib/Evaluate/formatting.cpp +++ b/flang/lib/Evaluate/formatting.cpp @@ -723,24 +723,8 @@ llvm::raw_ostream &ArrayRef::AsFortran(llvm::raw_ostream &o) const { } llvm::raw_ostream &CoarrayRef::AsFortran(llvm::raw_ostream &o) const { - bool first{true}; - for (const Symbol &part : base_) { - if (first) { - first = false; - } else { - o << '%'; - } - EmitVar(o, part); - } - char separator{'('}; - for (const auto &sscript : subscript_) { - EmitVar(o << separator, sscript); - separator = ','; - } - if (separator == ',') { - o << ')'; - } - separator = '['; + base().AsFortran(o); + char separator{'['}; for (const auto &css : cosubscript_) { EmitVar(o << separator, css); separator = ','; @@ -750,8 +734,10 @@ llvm::raw_ostream &CoarrayRef::AsFortran(llvm::raw_ostream &o) const { separator = ','; } if (team_) { - EmitVar( - o << separator, team_, teamIsTeamNumber_ ? "TEAM_NUMBER=" : "TEAM="); + EmitVar(o << separator, team_, + std::holds_alternative>(team_->value().u) + ? 
"TEAM_NUMBER=" + : "TEAM="); } return o << ']'; } diff --git a/flang/lib/Evaluate/shape.cpp b/flang/lib/Evaluate/shape.cpp index f620ecd4a24bb..ac4811e9978eb 100644 --- a/flang/lib/Evaluate/shape.cpp +++ b/flang/lib/Evaluate/shape.cpp @@ -891,20 +891,7 @@ auto GetShapeHelper::operator()(const ArrayRef &arrayRef) const -> Result { } auto GetShapeHelper::operator()(const CoarrayRef &coarrayRef) const -> Result { - NamedEntity base{coarrayRef.GetBase()}; - if (coarrayRef.subscript().empty()) { - return (*this)(base); - } else { - Shape shape; - int dimension{0}; - for (const Subscript &ss : coarrayRef.subscript()) { - if (ss.Rank() > 0) { - shape.emplace_back(GetExtent(ss, base, dimension)); - } - ++dimension; - } - return shape; - } + return (*this)(coarrayRef.base()); } auto GetShapeHelper::operator()(const Substring &substring) const -> Result { diff --git a/flang/lib/Evaluate/tools.cpp b/flang/lib/Evaluate/tools.cpp index 702711e3cff53..d39e4c42928f3 100644 --- a/flang/lib/Evaluate/tools.cpp +++ b/flang/lib/Evaluate/tools.cpp @@ -1090,7 +1090,7 @@ auto GetSymbolVectorHelper::operator()(const ArrayRef &x) const -> Result { return GetSymbolVector(x.base()); } auto GetSymbolVectorHelper::operator()(const CoarrayRef &x) const -> Result { - return x.base(); + return GetSymbolVector(x.base()); } const Symbol *GetLastTarget(const SymbolVector &symbols) { @@ -1320,6 +1320,19 @@ std::optional CheckProcCompatibility(bool isCall, return msg; } +const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const DataRef &dataRef) { + if (const SymbolRef * p{std::get_if(&dataRef.u)}) { + return &p->get(); + } else if (const Component * c{std::get_if(&dataRef.u)}) { + if (c->base().Rank() == 0) { + return &c->GetLastSymbol(); + } + } else if (const CoarrayRef * c{std::get_if(&dataRef.u)}) { + return UnwrapWholeSymbolOrComponentOrCoarrayRef(c->base()); + } + return nullptr; +} + // GetLastPointerSymbol() static const Symbol *GetLastPointerSymbol(const Symbol &symbol) { return 
IsPointer(GetAssociationRoot(symbol)) ? &symbol : nullptr; diff --git a/flang/lib/Evaluate/variable.cpp b/flang/lib/Evaluate/variable.cpp index 849194b492053..d1bff03a6ea5f 100644 --- a/flang/lib/Evaluate/variable.cpp +++ b/flang/lib/Evaluate/variable.cpp @@ -69,13 +69,9 @@ Triplet &Triplet::set_stride(Expr &&expr) { return *this; } -CoarrayRef::CoarrayRef(SymbolVector &&base, std::vector &&ss, - std::vector> &&css) - : base_{std::move(base)}, subscript_(std::move(ss)), - cosubscript_(std::move(css)) { - CHECK(!base_.empty()); - CHECK(!cosubscript_.empty()); -} +CoarrayRef::CoarrayRef( + DataRef &&base, std::vector> &&css) + : base_{std::move(base)}, cosubscript_(std::move(css)) {} std::optional> CoarrayRef::stat() const { if (stat_) { @@ -85,7 +81,7 @@ std::optional> CoarrayRef::stat() const { } } -std::optional> CoarrayRef::team() const { +std::optional> CoarrayRef::team() const { if (team_) { return team_.value().value(); } else { @@ -99,16 +95,18 @@ CoarrayRef &CoarrayRef::set_stat(Expr &&v) { return *this; } -CoarrayRef &CoarrayRef::set_team(Expr &&v, bool isTeamNumber) { - CHECK(IsVariable(v)); +CoarrayRef &CoarrayRef::set_team(Expr &&v) { team_.emplace(std::move(v)); - teamIsTeamNumber_ = isTeamNumber; return *this; } -const Symbol &CoarrayRef::GetFirstSymbol() const { return base_.front(); } +const Symbol &CoarrayRef::GetFirstSymbol() const { + return base().GetFirstSymbol(); +} -const Symbol &CoarrayRef::GetLastSymbol() const { return base_.back(); } +const Symbol &CoarrayRef::GetLastSymbol() const { + return base().GetLastSymbol(); +} void Substring::SetBounds(std::optional> &lower, std::optional> &upper) { @@ -426,17 +424,7 @@ int ArrayRef::Rank() const { } } -int CoarrayRef::Rank() const { - if (!subscript_.empty()) { - int rank{0}; - for (const auto &expr : subscript_) { - rank += expr.Rank(); - } - return rank; - } else { - return base_.back()->Rank(); - } -} +int CoarrayRef::Rank() const { return base().Rank(); } int DataRef::Rank() const { return 
common::visit(common::visitors{ @@ -671,22 +659,6 @@ std::optional Designator::GetType() const { return std::nullopt; } -static NamedEntity AsNamedEntity(const SymbolVector &x) { - CHECK(!x.empty()); - NamedEntity result{x.front()}; - int j{0}; - for (const Symbol &symbol : x) { - if (j++ != 0) { - DataRef base{result.IsSymbol() ? DataRef{result.GetLastSymbol()} - : DataRef{result.GetComponent()}}; - result = NamedEntity{Component{std::move(base), symbol}}; - } - } - return result; -} - -NamedEntity CoarrayRef::GetBase() const { return AsNamedEntity(base_); } - // Equality testing // For the purposes of comparing type parameter expressions while @@ -759,9 +731,8 @@ bool ArrayRef::operator==(const ArrayRef &that) const { return base_ == that.base_ && subscript_ == that.subscript_; } bool CoarrayRef::operator==(const CoarrayRef &that) const { - return base_ == that.base_ && subscript_ == that.subscript_ && - cosubscript_ == that.cosubscript_ && stat_ == that.stat_ && - team_ == that.team_ && teamIsTeamNumber_ == that.teamIsTeamNumber_; + return base_ == that.base_ && cosubscript_ == that.cosubscript_ && + stat_ == that.stat_ && team_ == that.team_; } bool DataRef::operator==(const DataRef &that) const { return TestVariableEquality(*this, that); diff --git a/flang/lib/Lower/Support/Utils.cpp b/flang/lib/Lower/Support/Utils.cpp index ed2700c42fc55..668ee31a36bc3 100644 --- a/flang/lib/Lower/Support/Utils.cpp +++ b/flang/lib/Lower/Support/Utils.cpp @@ -70,18 +70,12 @@ class HashEvaluateExpr { return getHashValue(x.base()) * 89u - subs; } static unsigned getHashValue(const Fortran::evaluate::CoarrayRef &x) { - unsigned subs = 1u; - for (const Fortran::evaluate::Subscript &v : x.subscript()) - subs -= getHashValue(v); unsigned cosubs = 3u; for (const Fortran::evaluate::Expr &v : x.cosubscript()) cosubs -= getHashValue(v); - unsigned syms = 7u; - for (const Fortran::evaluate::SymbolRef &v : x.base()) - syms += getHashValue(v); - return syms * 97u - subs - cosubs + 
getHashValue(x.stat()) + 257u + - getHashValue(x.team()); + return getHashValue(x.base()) * 97u - cosubs + getHashValue(x.stat()) + + 257u + getHashValue(x.team()); } static unsigned getHashValue(const Fortran::evaluate::NamedEntity &x) { if (x.IsSymbol()) @@ -339,7 +333,6 @@ class IsEqualEvaluateExpr { static bool isEqual(const Fortran::evaluate::CoarrayRef &x, const Fortran::evaluate::CoarrayRef &y) { return isEqual(x.base(), y.base()) && - isEqual(x.subscript(), y.subscript()) && isEqual(x.cosubscript(), y.cosubscript()) && isEqual(x.stat(), y.stat()) && isEqual(x.team(), y.team()); } diff --git a/flang/lib/Semantics/check-coarray.cpp b/flang/lib/Semantics/check-coarray.cpp index b21e3cd757d6b..0e444f155f116 100644 --- a/flang/lib/Semantics/check-coarray.cpp +++ b/flang/lib/Semantics/check-coarray.cpp @@ -373,41 +373,12 @@ void CoarrayChecker::Leave(const parser::CriticalStmt &x) { } void CoarrayChecker::Leave(const parser::ImageSelector &imageSelector) { - haveStat_ = false; - haveTeam_ = false; - haveTeamNumber_ = false; for (const auto &imageSelectorSpec : std::get>(imageSelector.t)) { - if (const auto *team{ - std::get_if(&imageSelectorSpec.u)}) { - if (haveTeam_) { - context_.Say(parser::FindSourceLocation(imageSelectorSpec), // C929 - "TEAM value can only be specified once"_err_en_US); - } - CheckTeamType(context_, *team); - haveTeam_ = true; - } if (const auto *stat{std::get_if( &imageSelectorSpec.u)}) { - if (haveStat_) { - context_.Say(parser::FindSourceLocation(imageSelectorSpec), // C929 - "STAT variable can only be specified once"_err_en_US); - } CheckTeamStat(context_, *stat); - haveStat_ = true; } - if (std::get_if( - &imageSelectorSpec.u)) { - if (haveTeamNumber_) { - context_.Say(parser::FindSourceLocation(imageSelectorSpec), // C929 - "TEAM_NUMBER value can only be specified once"_err_en_US); - } - haveTeamNumber_ = true; - } - } - if (haveTeam_ && haveTeamNumber_) { - context_.Say(parser::FindSourceLocation(imageSelector), // C930 - "Cannot 
specify both TEAM and TEAM_NUMBER"_err_en_US); } } diff --git a/flang/lib/Semantics/check-coarray.h b/flang/lib/Semantics/check-coarray.h index f156959019383..51de47f123558 100644 --- a/flang/lib/Semantics/check-coarray.h +++ b/flang/lib/Semantics/check-coarray.h @@ -37,9 +37,6 @@ class CoarrayChecker : public virtual BaseChecker { private: SemanticsContext &context_; - bool haveStat_; - bool haveTeam_; - bool haveTeamNumber_; void CheckNamesAreDistinct(const std::list &); void Say2(const parser::CharBlock &, parser::MessageFixedText &&, diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 850904bf897b9..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -22,7 +22,6 @@ inline const char *DumpEvaluateExpr::GetIndentString() const { void DumpEvaluateExpr::Show(const evaluate::CoarrayRef &x) { Indent("coarray ref"); Show(x.base()); - Show(x.subscript()); Show(x.cosubscript()); Show(x.stat()); Show(x.team()); diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index e139bda7e4950..0659536aab98c 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -419,13 +419,9 @@ static void CheckSubscripts( } } -static void CheckSubscripts( +static void CheckCosubscripts( semantics::SemanticsContext &context, CoarrayRef &ref) { - const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; - Shape lb, ub; - if (FoldSubscripts(context, coarraySymbol, ref.subscript(), lb, ub)) { - ValidateSubscripts(context, coarraySymbol, ref.subscript(), lb, ub); - } + const Symbol &coarraySymbol{ref.GetLastSymbol()}; FoldingContext &foldingContext{context.foldingContext()}; int dim{0}; for (auto &expr : ref.cosubscript()) { @@ -1534,29 +1530,10 @@ MaybeExpr ExpressionAnalyzer::Analyze(const parser::StructureComponent &sc) { } MaybeExpr ExpressionAnalyzer::Analyze(const parser::CoindexedNamedObject &x) { - if (auto 
maybeDataRef{ExtractDataRef(Analyze(x.base))}) { - DataRef *dataRef{&*maybeDataRef}; - std::vector subscripts; - SymbolVector reversed; - if (auto *aRef{std::get_if(&dataRef->u)}) { - subscripts = std::move(aRef->subscript()); - reversed.push_back(aRef->GetLastSymbol()); - if (Component *component{aRef->base().UnwrapComponent()}) { - dataRef = &component->base(); - } else { - dataRef = nullptr; - } - } - if (dataRef) { - while (auto *component{std::get_if(&dataRef->u)}) { - reversed.push_back(component->GetLastSymbol()); - dataRef = &component->base(); - } - if (auto *baseSym{std::get_if(&dataRef->u)}) { - reversed.push_back(*baseSym); - } else { - Say("Base of coindexed named object has subscripts or cosubscripts"_err_en_US); - } + if (auto dataRef{ExtractDataRef(Analyze(x.base))}) { + if (!std::holds_alternative(dataRef->u) && + dataRef->GetLastSymbol().Rank() > 0) { // F'2023 C916 + Say("Subscripts must appear in a coindexed reference when its base is an array"_err_en_US); } std::vector> cosubscripts; bool cosubsOk{true}; @@ -1570,30 +1547,59 @@ MaybeExpr ExpressionAnalyzer::Analyze(const parser::CoindexedNamedObject &x) { cosubsOk = false; } } - if (cosubsOk && !reversed.empty()) { + if (cosubsOk) { int numCosubscripts{static_cast(cosubscripts.size())}; - const Symbol &symbol{reversed.front()}; + const Symbol &symbol{dataRef->GetLastSymbol()}; if (numCosubscripts != GetCorank(symbol)) { Say("'%s' has corank %d, but coindexed reference has %d cosubscripts"_err_en_US, symbol.name(), GetCorank(symbol), numCosubscripts); } } + CoarrayRef coarrayRef{std::move(*dataRef), std::move(cosubscripts)}; for (const auto &imageSelSpec : std::get>(x.imageSelector.t)) { common::visit( common::visitors{ - [&](const auto &x) { Analyze(x.v); }, - }, + [&](const parser::ImageSelectorSpec::Stat &x) { + Analyze(x.v); + if (const auto *expr{GetExpr(context_, x.v)}) { + if (const auto *intExpr{ + std::get_if>(&expr->u)}) { + if (coarrayRef.stat()) { + Say("coindexed reference has 
multiple STAT= specifiers"_err_en_US); + } else { + coarrayRef.set_stat(Expr{*intExpr}); + } + } + } + }, + [&](const parser::TeamValue &x) { + Analyze(x.v); + if (const auto *expr{GetExpr(context_, x.v)}) { + if (coarrayRef.team()) { + Say("coindexed reference has multiple TEAM= or TEAM_NUMBER= specifiers"_err_en_US); + } else if (auto dyType{expr->GetType()}; + dyType && IsTeamType(GetDerivedTypeSpec(*dyType))) { + coarrayRef.set_team(Expr{*expr}); + } else { + Say("TEAM= specifier must have type TEAM_TYPE from ISO_FORTRAN_ENV"_err_en_US); + } + } + }, + [&](const parser::ImageSelectorSpec::Team_Number &x) { + Analyze(x.v); + if (const auto *expr{GetExpr(context_, x.v)}) { + if (coarrayRef.team()) { + Say("coindexed reference has multiple TEAM= or TEAM_NUMBER= specifiers"_err_en_US); + } else { + coarrayRef.set_team(Expr{*expr}); + } + } + }}, imageSelSpec.u); } - // Reverse the chain of symbols so that the base is first and coarray - // ultimate component is last. - if (cosubsOk) { - CoarrayRef coarrayRef{SymbolVector{reversed.crbegin(), reversed.crend()}, - std::move(subscripts), std::move(cosubscripts)}; - CheckSubscripts(context_, coarrayRef); - return Designate(DataRef{std::move(coarrayRef)}); - } + CheckCosubscripts(context_, coarrayRef); + return Designate(DataRef{std::move(coarrayRef)}); } return std::nullopt; } diff --git a/flang/test/Semantics/atomic02.f90 b/flang/test/Semantics/atomic02.f90 index 484239a23ede2..0d107152a8c14 100644 --- a/flang/test/Semantics/atomic02.f90 +++ b/flang/test/Semantics/atomic02.f90 @@ -31,7 +31,7 @@ program test_atomic_and call atomic_and(non_scalar_coarray, val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_and' - call atomic_and(non_scalar_coarray[1], val) + call atomic_and(non_scalar_coarray(:)[1], val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_and' call atomic_and(non_coarray, val) diff --git 
a/flang/test/Semantics/atomic03.f90 b/flang/test/Semantics/atomic03.f90 index 495df5eb97192..cef21d002dd68 100644 --- a/flang/test/Semantics/atomic03.f90 +++ b/flang/test/Semantics/atomic03.f90 @@ -51,13 +51,13 @@ program test_atomic_cas call atomic_cas(non_scalar_coarray, old_int, compare_int, new_int) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_cas' - call atomic_cas(non_scalar_coarray[1], old_int, compare_int, new_int) + call atomic_cas(non_scalar_coarray(:)[1], old_int, compare_int, new_int) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_cas' call atomic_cas(non_scalar_logical_coarray, old_logical, compare_logical, new_logical) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_cas' - call atomic_cas(non_scalar_logical_coarray[1], old_logical, compare_logical, new_logical) + call atomic_cas(non_scalar_logical_coarray(:)[1], old_logical, compare_logical, new_logical) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_cas' call atomic_cas(non_coarray, old_int, compare_int, new_int) diff --git a/flang/test/Semantics/atomic04.f90 b/flang/test/Semantics/atomic04.f90 index 9df0b56d192a8..453fdb10e7f49 100644 --- a/flang/test/Semantics/atomic04.f90 +++ b/flang/test/Semantics/atomic04.f90 @@ -47,13 +47,13 @@ program test_atomic_define call atomic_define(non_scalar_coarray, val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_define' - call atomic_define(non_scalar_coarray[1], val) + call atomic_define(non_scalar_coarray(:)[1], val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_define' call atomic_define(non_scalar_logical_coarray, val_logical) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_define' - call atomic_define(non_scalar_logical_coarray[1], val_logical) + 
call atomic_define(non_scalar_logical_coarray(:)[1], val_logical) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_define' call atomic_define(non_coarray, val) diff --git a/flang/test/Semantics/atomic05.f90 b/flang/test/Semantics/atomic05.f90 index 98d6b19b1f23d..c1e67b0d454fe 100644 --- a/flang/test/Semantics/atomic05.f90 +++ b/flang/test/Semantics/atomic05.f90 @@ -41,7 +41,7 @@ program test_atomic_fetch_add call atomic_fetch_add(array, val, old_val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_fetch_add' - call atomic_fetch_add(non_scalar_coarray[1], val, old_val) + call atomic_fetch_add(non_scalar_coarray(:)[1], val, old_val) !ERROR: Actual argument for 'atom=' must have kind=atomic_int_kind, but is 'INTEGER(4)' call atomic_fetch_add(default_kind_coarray, val, old_val) diff --git a/flang/test/Semantics/atomic06.f90 b/flang/test/Semantics/atomic06.f90 index c6a23dd0077ca..57cc81e9c4a97 100644 --- a/flang/test/Semantics/atomic06.f90 +++ b/flang/test/Semantics/atomic06.f90 @@ -41,7 +41,7 @@ program test_atomic_fetch_and call atomic_fetch_and(array, val, old_val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_fetch_and' - call atomic_fetch_and(non_scalar_coarray[1], val, old_val) + call atomic_fetch_and(non_scalar_coarray(:)[1], val, old_val) !ERROR: Actual argument for 'atom=' must have kind=atomic_int_kind, but is 'INTEGER(4)' call atomic_fetch_and(default_kind_coarray, val, old_val) diff --git a/flang/test/Semantics/atomic07.f90 b/flang/test/Semantics/atomic07.f90 index 2bc544b757864..e4d80956ed036 100644 --- a/flang/test/Semantics/atomic07.f90 +++ b/flang/test/Semantics/atomic07.f90 @@ -34,7 +34,7 @@ program test_atomic_fetch_or call atomic_fetch_or(array, val, old_val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_fetch_or' - call atomic_fetch_or(non_scalar_coarray[1], val, old_val) 
+ call atomic_fetch_or(non_scalar_coarray(:)[1], val, old_val) !ERROR: Actual argument for 'atom=' must have kind=atomic_int_kind, but is 'INTEGER(4)' call atomic_fetch_or(default_kind_coarray, val, old_val) diff --git a/flang/test/Semantics/atomic08.f90 b/flang/test/Semantics/atomic08.f90 index f519f9735e00e..234e6e3923620 100644 --- a/flang/test/Semantics/atomic08.f90 +++ b/flang/test/Semantics/atomic08.f90 @@ -41,7 +41,7 @@ program test_atomic_fetch_xor call atomic_fetch_xor(array, val, old_val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_fetch_xor' - call atomic_fetch_xor(non_scalar_coarray[1], val, old_val) + call atomic_fetch_xor(non_scalar_coarray(:)[1], val, old_val) !ERROR: Actual argument for 'atom=' must have kind=atomic_int_kind, but is 'INTEGER(4)' call atomic_fetch_xor(default_kind_coarray, val, old_val) diff --git a/flang/test/Semantics/atomic09.f90 b/flang/test/Semantics/atomic09.f90 index e4e062252659a..4f78ccb977186 100644 --- a/flang/test/Semantics/atomic09.f90 +++ b/flang/test/Semantics/atomic09.f90 @@ -31,7 +31,7 @@ program test_atomic_or call atomic_or(non_scalar_coarray, val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_or' - call atomic_or(non_scalar_coarray[1], val) + call atomic_or(non_scalar_coarray(:)[1], val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_or' call atomic_or(non_coarray, val) diff --git a/flang/test/Semantics/atomic10.f90 b/flang/test/Semantics/atomic10.f90 index 04efbd6e80fd2..e206326786042 100644 --- a/flang/test/Semantics/atomic10.f90 +++ b/flang/test/Semantics/atomic10.f90 @@ -47,13 +47,13 @@ program test_atomic_ref call atomic_ref(val, non_scalar_coarray) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_ref' - call atomic_ref(val, non_scalar_coarray[1]) + call atomic_ref(val, non_scalar_coarray(:)[1]) !ERROR: 'atom=' argument must be a 
scalar coarray or coindexed object for intrinsic 'atomic_ref' call atomic_ref(val_logical, non_scalar_logical_coarray) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_ref' - call atomic_ref(val_logical, non_scalar_logical_coarray[1]) + call atomic_ref(val_logical, non_scalar_logical_coarray(:)[1]) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_ref' call atomic_ref(val, non_coarray) diff --git a/flang/test/Semantics/atomic11.f90 b/flang/test/Semantics/atomic11.f90 index d4f951ea02c32..dba7dfdf5ae47 100644 --- a/flang/test/Semantics/atomic11.f90 +++ b/flang/test/Semantics/atomic11.f90 @@ -31,7 +31,7 @@ program test_atomic_xor call atomic_xor(non_scalar_coarray, val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_xor' - call atomic_xor(non_scalar_coarray[1], val) + call atomic_xor(non_scalar_coarray(:)[1], val) !ERROR: 'atom=' argument must be a scalar coarray or coindexed object for intrinsic 'atomic_xor' call atomic_xor(non_coarray, val) diff --git a/flang/test/Semantics/coarrays02.f90 b/flang/test/Semantics/coarrays02.f90 index dc907161250ab..b16e0ccb58797 100644 --- a/flang/test/Semantics/coarrays02.f90 +++ b/flang/test/Semantics/coarrays02.f90 @@ -96,3 +96,27 @@ subroutine test(cat) call sub(cat%p) end end + +subroutine s4 + type t + real, allocatable :: a(:)[:] + end type + type t2 + !ERROR: Allocatable or array component 'bad1' may not have a coarray ultimate component '%a' + type(t), allocatable :: bad1 + !ERROR: Pointer 'bad2' may not have a coarray potential component '%a' + type(t), pointer :: bad2 + !ERROR: Allocatable or array component 'bad3' may not have a coarray ultimate component '%a' + type(t) :: bad3(2) + !ERROR: Component 'bad4' is a coarray and must have the ALLOCATABLE attribute and have a deferred coshape + !ERROR: Coarray 'bad4' may not have a coarray potential component '%a' + type(t) :: bad4[*] + end type + 
type(t), save :: ta(2) + !ERROR: 'a' has corank 1, but coindexed reference has 2 cosubscripts + print *, ta(1)%a(1)[1,2] + !ERROR: An allocatable or pointer component reference must be applied to a scalar base + print *, ta(:)%a(1)[1] + !ERROR: Subscripts must appear in a coindexed reference when its base is an array + print *, ta(1)%a[1] +end diff --git a/flang/test/Semantics/coshape.f90 b/flang/test/Semantics/coshape.f90 index d4fb45df6600c..d4e3f2d25280d 100644 --- a/flang/test/Semantics/coshape.f90 +++ b/flang/test/Semantics/coshape.f90 @@ -40,9 +40,9 @@ program coshape_tests !ERROR: 'coarray=' argument must have corank > 0 for intrinsic 'coshape' codimensions = coshape(derived_scalar_coarray[1]%x) !ERROR: 'coarray=' argument must have corank > 0 for intrinsic 'coshape' - codimensions = coshape(derived_array_coarray[1]%x) + codimensions = coshape(derived_array_coarray(:)[1]%x) !ERROR: 'coarray=' argument must have corank > 0 for intrinsic 'coshape' - codimensions = coshape(array_coarray[1]) + codimensions = coshape(array_coarray(:)[1]) !ERROR: 'coarray=' argument must have corank > 0 for intrinsic 'coshape' codimensions = coshape(scalar_coarray[1]) diff --git a/flang/test/Semantics/error_stop1b.f90 b/flang/test/Semantics/error_stop1b.f90 index 355a049560102..3c9ace13693ac 100644 --- a/flang/test/Semantics/error_stop1b.f90 +++ b/flang/test/Semantics/error_stop1b.f90 @@ -32,7 +32,7 @@ program test_error_stop error stop char_array !ERROR: Must be a scalar value, but is a rank-1 array - error stop array_coarray[1] + error stop array_coarray(:)[1] !ERROR: Must have LOGICAL type, but is CHARACTER(KIND=1,LEN=128_8) error stop int_code, quiet=non_logical diff --git a/flang/test/Semantics/event01b.f90 b/flang/test/Semantics/event01b.f90 index 0cd8a5bcb1f1f..b11118783eaee 100644 --- a/flang/test/Semantics/event01b.f90 +++ b/flang/test/Semantics/event01b.f90 @@ -62,7 +62,7 @@ program test_event_post event post(occurrences) !ERROR: Must be a scalar value, but is a rank-1 
array - event post(occurrences[1]) + event post(occurrences(:)[1]) !______ invalid sync-stat-lists: invalid stat= ____________ diff --git a/flang/test/Semantics/resolve94.f90 b/flang/test/Semantics/resolve94.f90 index 75755fb2b2038..1d0b106bd1171 100644 --- a/flang/test/Semantics/resolve94.f90 +++ b/flang/test/Semantics/resolve94.f90 @@ -35,7 +35,7 @@ subroutine s1() rVar1 = rCoarray[1,intArray,3] ! OK rVar1 = rCoarray[1,2,3,STAT=iVar1, TEAM=team2] - !ERROR: Team value must be of type TEAM_TYPE from module ISO_FORTRAN_ENV + !ERROR: TEAM= specifier must have type TEAM_TYPE from ISO_FORTRAN_ENV rVar1 = rCoarray[1,2,3,STAT=iVar1, TEAM=2] ! OK rVar1 = rCoarray[1,2,3,STAT=iVar1, TEAM_NUMBER=38] @@ -48,12 +48,12 @@ subroutine s1() !ERROR: Must be a scalar value, but is a rank-1 array rVar1 = rCoarray[1,2,3,STAT=intArray] ! Error on C929, no specifier can appear more than once - !ERROR: STAT variable can only be specified once + !ERROR: coindexed reference has multiple STAT= specifiers rVar1 = rCoarray[1,2,3,STAT=iVar1, STAT=iVar2] ! OK rVar1 = rCoarray[1,2,3,TEAM=team1] ! Error on C929, no specifier can appear more than once - !ERROR: TEAM value can only be specified once + !ERROR: coindexed reference has multiple TEAM= or TEAM_NUMBER= specifiers rVar1 = rCoarray[1,2,3,TEAM=team1, TEAM=team2] ! OK rVar1 = rCoarray[1,2,3,TEAM_NUMBER=37] @@ -66,11 +66,11 @@ subroutine s1() !ERROR: Must have INTEGER type, but is REAL(4) rVar1 = rCoarray[1,2,3,TEAM_NUMBER=3.7] ! 
Error on C929, no specifier can appear more than once - !ERROR: TEAM_NUMBER value can only be specified once + !ERROR: coindexed reference has multiple TEAM= or TEAM_NUMBER= specifiers rVar1 = rCoarray[1,2,3,TEAM_NUMBER=37, TEAM_NUMBER=37] - !ERROR: Cannot specify both TEAM and TEAM_NUMBER + !ERROR: coindexed reference has multiple TEAM= or TEAM_NUMBER= specifiers rVar1 = rCoarray[1,2,3,TEAM=team1, TEAM_NUMBER=37] - !ERROR: Cannot specify both TEAM and TEAM_NUMBER + !ERROR: coindexed reference has multiple TEAM= or TEAM_NUMBER= specifiers rVar1 = rCoarray[1,2,3,TEAM_number=43, TEAM=team1] ! OK for a STAT variable to be a coarray integer rVar1 = rCoarray[1,2,3,stat=intScalarCoarray] From flang-commits at lists.llvm.org Mon May 12 13:03:07 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Mon, 12 May 2025 13:03:07 -0700 (PDT) Subject: [flang-commits] [flang] [flang] PRIVATE statement in derived type applies to proc components (PR #139336) In-Reply-To: Message-ID: <682253fb.050a0220.d1fe5.f2b4@mx.google.com> https://github.com/akuhlens commented: LGTM https://github.com/llvm/llvm-project/pull/139336 From flang-commits at lists.llvm.org Mon May 12 13:04:29 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Mon, 12 May 2025 13:04:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Require contiguous actual pointer for contiguous dummy pointer (PR #139298) In-Reply-To: Message-ID: <6822544d.630a0220.2c2a2.546d@mx.google.com> https://github.com/akuhlens commented: LGTM https://github.com/llvm/llvm-project/pull/139298 From flang-commits at lists.llvm.org Mon May 12 13:53:09 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 12 May 2025 13:53:09 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fixed designator codegen for contiguous boxes. 
(PR #139003) In-Reply-To: Message-ID: <68225fb5.170a0220.f9af4.32c5@mx.google.com> https://github.com/vzakhari updated https://github.com/llvm/llvm-project/pull/139003 >From 5d3fc1fbeb088d04c1dbe863dd0ce6657bc92949 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Wed, 7 May 2025 18:07:53 -0700 Subject: [PATCH 1/2] [flang] Fixed designator codegen for contiguous boxes. Contiguous variables represented with a box do not have explicit shape, but it looks like the base/shape computation was assuming that. This caused generation of raw address fir.array_coor without the shape. This patch is needed to fix failures happening with #138797. --- .../flang/Optimizer/Builder/HLFIRTools.h | 6 ++ flang/lib/Optimizer/Builder/HLFIRTools.cpp | 30 ++++++-- .../HLFIR/Transforms/ConvertToFIR.cpp | 36 +++++++++- flang/test/HLFIR/designate-codegen.fir | 72 +++++++++++++++++++ 4 files changed, 135 insertions(+), 9 deletions(-) diff --git a/flang/include/flang/Optimizer/Builder/HLFIRTools.h b/flang/include/flang/Optimizer/Builder/HLFIRTools.h index ac80873dc374f..bcba38ed8bd5d 100644 --- a/flang/include/flang/Optimizer/Builder/HLFIRTools.h +++ b/flang/include/flang/Optimizer/Builder/HLFIRTools.h @@ -533,6 +533,12 @@ Entity gen1DSection(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ArrayRef extents, mlir::ValueRange oneBasedIndices, mlir::ArrayRef typeParams); + +/// Return explicit lower bounds from a fir.shape result. +/// Only fir.shape, fir.shift and fir.shape_shift are currently +/// supported as \p shape. 
+llvm::SmallVector getExplicitLboundsFromShape(mlir::Value shape); + } // namespace hlfir #endif // FORTRAN_OPTIMIZER_BUILDER_HLFIRTOOLS_H diff --git a/flang/lib/Optimizer/Builder/HLFIRTools.cpp b/flang/lib/Optimizer/Builder/HLFIRTools.cpp index 51ea7305d3d26..752dc0cf86414 100644 --- a/flang/lib/Optimizer/Builder/HLFIRTools.cpp +++ b/flang/lib/Optimizer/Builder/HLFIRTools.cpp @@ -70,10 +70,8 @@ getExplicitExtents(fir::FortranVariableOpInterface var, return {}; } -// Return explicit lower bounds. For pointers and allocatables, this will not -// read the lower bounds and instead return an empty vector. -static llvm::SmallVector -getExplicitLboundsFromShape(mlir::Value shape) { +llvm::SmallVector +hlfir::getExplicitLboundsFromShape(mlir::Value shape) { llvm::SmallVector result; auto *shapeOp = shape.getDefiningOp(); if (auto s = mlir::dyn_cast_or_null(shapeOp)) { @@ -89,10 +87,13 @@ getExplicitLboundsFromShape(mlir::Value shape) { } return result; } + +// Return explicit lower bounds. For pointers and allocatables, this will not +// read the lower bounds and instead return an empty vector. static llvm::SmallVector getExplicitLbounds(fir::FortranVariableOpInterface var) { if (mlir::Value shape = var.getShape()) - return getExplicitLboundsFromShape(shape); + return hlfir::getExplicitLboundsFromShape(shape); return {}; } @@ -753,9 +754,24 @@ std::pair hlfir::genVariableFirBaseShapeAndParams( } if (entity.isScalar()) return {fir::getBase(exv), mlir::Value{}}; + + // Contiguous variables that are represented with a box + // may require the shape to be extracted from the box (i.e. exv), + // because they themselves may not have shape specified. + // This happens during late propagation of the contiguous + // attribute, e.g.: + // %9:2 = hlfir.declare %6 + // {fortran_attrs = #fir.var_attrs} : + // (!fir.box>) -> + // (!fir.box>, !fir.box>) + // The extended value is an ArrayBoxValue with base being + // the raw address of the array. 
if (auto variableInterface = entity.getIfVariableInterface()) - return {fir::getBase(exv), - asEmboxShape(loc, builder, exv, variableInterface.getShape())}; + if (mlir::isa(fir::getBase(exv).getType()) || + !mlir::isa(entity.getType()) || + variableInterface.getShape()) + return {fir::getBase(exv), + asEmboxShape(loc, builder, exv, variableInterface.getShape())}; return {fir::getBase(exv), builder.createShape(loc, exv)}; } diff --git a/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp b/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp index 8721a895b5e05..495f11a365185 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp @@ -412,12 +412,44 @@ class DesignateOpConversion auto indices = designate.getIndices(); int i = 0; auto attrs = designate.getIsTripletAttr(); + + // If the shape specifies a shift and the base is not a box, + // then we have to subtract the lower bounds, as long as + // fir.array_coor does not support non-default lower bounds + // for non-box accesses. + llvm::SmallVector lbounds; + if (shape && !mlir::isa(base.getType())) + lbounds = hlfir::getExplicitLboundsFromShape(shape); + std::size_t lboundIdx = 0; for (auto isTriplet : attrs.asArrayRef()) { // Coordinate of the first element are the index and triplets lower - // bounds - firstElementIndices.push_back(indices[i]); + // bounds. + mlir::Value index = indices[i]; + if (!lbounds.empty()) { + assert(lboundIdx < lbounds.size() && "missing lbound"); + mlir::Type indexType = builder.getIndexType(); + mlir::Value one = builder.createIntegerConstant(loc, indexType, 1); + mlir::Value orig = builder.createConvert(loc, indexType, index); + mlir::Value lb = + builder.createConvert(loc, indexType, lbounds[lboundIdx]); + index = builder.create(loc, orig, lb); + index = builder.create(loc, index, one); + ++lboundIdx; + } + firstElementIndices.push_back(index); i = i + (isTriplet ? 
3 : 1); } + + // Remove the shift from the shape, if needed. + if (!lbounds.empty()) { + mlir::Operation *op = shape.getDefiningOp(); + if (mlir::isa(op)) + shape = nullptr; + else if (auto shiftOp = mlir::dyn_cast(op)) + shape = builder.create(loc, shiftOp.getExtents()); + else + TODO(loc, "read fir.shape to get lower bounds"); + } mlir::Type originalDesignateType = designate.getResult().getType(); const bool isVolatile = fir::isa_volatile_type(originalDesignateType); mlir::Type arrayCoorType = fir::ReferenceType::get(baseEleTy, isVolatile); diff --git a/flang/test/HLFIR/designate-codegen.fir b/flang/test/HLFIR/designate-codegen.fir index da0a1f82b516e..d3e264941264f 100644 --- a/flang/test/HLFIR/designate-codegen.fir +++ b/flang/test/HLFIR/designate-codegen.fir @@ -213,3 +213,75 @@ func.func @test_polymorphic_array_elt(%arg0: !fir.class>, !fir.class>>) -> !fir.class> // CHECK: return // CHECK: } + +// Test proper generation of fir.array_coor for contiguous box with default lbounds. +func.func @_QPtest_contiguous_derived_default(%arg0: !fir.class>> {fir.bindc_name = "d1", fir.contiguous, fir.optional}) { + %c1 = arith.constant 1 : index + %c16_i32 = arith.constant 16 : i32 + %0 = fir.dummy_scope : !fir.dscope + %1:2 = hlfir.declare %arg0 dummy_scope %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.class>>, !fir.dscope) -> (!fir.class>>, !fir.class>>) + fir.select_type %1#1 : !fir.class>> [#fir.type_is,i:i32}>>, ^bb1, unit, ^bb2] +^bb1: // pred: ^bb0 + %2 = fir.convert %1#1 : (!fir.class>>) -> !fir.box,i:i32}>>> + %3:2 = hlfir.declare %2 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.box,i:i32}>>>) -> (!fir.box,i:i32}>>>, !fir.box,i:i32}>>>) + %4 = hlfir.designate %3#0 (%c1, %c1) : (!fir.box,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> + %5 = hlfir.designate %4{"i"} : (!fir.ref,i:i32}>>) -> !fir.ref + hlfir.assign %c16_i32 to %5 : i32, !fir.ref + cf.br ^bb3 +^bb2: // 
pred: ^bb0 + %6:2 = hlfir.declare %1#1 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.class>>) -> (!fir.class>>, !fir.class>>) + cf.br ^bb3 +^bb3: // 2 preds: ^bb1, ^bb2 + return +} +// CHECK-LABEL: func.func @_QPtest_contiguous_derived_default( +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_9:.*]] = fir.declare %{{.*}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_defaultEd1"} : (!fir.box,i:i32}>>>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_10:.*]] = fir.rebox %[[VAL_9]] : (!fir.box,i:i32}>>>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_11:.*]] = fir.box_addr %[[VAL_10]] : (!fir.box,i:i32}>>>) -> !fir.ref,i:i32}>>> +// CHECK: %[[VAL_12:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10]], %[[VAL_12]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_15:.*]]:3 = fir.box_dims %[[VAL_10]], %[[VAL_14]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_16:.*]] = fir.shape %[[VAL_13]]#1, %[[VAL_15]]#1 : (index, index) -> !fir.shape<2> +// CHECK: %[[VAL_17:.*]] = fir.array_coor %[[VAL_11]](%[[VAL_16]]) %[[VAL_0]], %[[VAL_0]] : (!fir.ref,i:i32}>>>, !fir.shape<2>, index, index) -> !fir.ref,i:i32}>> + +// Test proper generation of fir.array_coor for contiguous box with non-default lbounds. 
+func.func @_QPtest_contiguous_derived_lbounds(%arg0: !fir.class>> {fir.bindc_name = "d1", fir.contiguous}) { + %c3 = arith.constant 3 : index + %c1 = arith.constant 1 : index + %c16_i32 = arith.constant 16 : i32 + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.shift %c1, %c3 : (index, index) -> !fir.shift<2> + %2:2 = hlfir.declare %arg0(%1) dummy_scope %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.class>>, !fir.shift<2>, !fir.dscope) -> (!fir.class>>, !fir.class>>) + fir.select_type %2#1 : !fir.class>> [#fir.type_is,i:i32}>>, ^bb1, unit, ^bb2] +^bb1: // pred: ^bb0 + %3 = fir.convert %2#1 : (!fir.class>>) -> !fir.box,i:i32}>>> + %4:2 = hlfir.declare %3(%1) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.box,i:i32}>>>, !fir.shift<2>) -> (!fir.box,i:i32}>>>, !fir.box,i:i32}>>>) + %5 = hlfir.designate %4#0 (%c1, %c3) : (!fir.box,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> + %6 = hlfir.designate %5{"i"} : (!fir.ref,i:i32}>>) -> !fir.ref + hlfir.assign %c16_i32 to %6 : i32, !fir.ref + cf.br ^bb3 +^bb2: // pred: ^bb0 + %7:2 = hlfir.declare %2#1(%1) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.class>>, !fir.shift<2>) -> (!fir.class>>, !fir.class>>) + cf.br ^bb3 +^bb3: // 2 preds: ^bb1, ^bb2 + return +} +// CHECK-LABEL: func.func @_QPtest_contiguous_derived_lbounds( +// CHECK: %[[VAL_0:.*]] = arith.constant 3 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.declare %{{.*}}(%[[VAL_4:.*]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_contiguous_derived_lboundsEd1"} : (!fir.box,i:i32}>>>, !fir.shift<2>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_9:.*]] = fir.rebox %[[VAL_8]](%[[VAL_4]]) : (!fir.box,i:i32}>>>, !fir.shift<2>) -> !fir.box,i:i32}>>> +// CHECK: %[[VAL_10:.*]] = fir.box_addr %[[VAL_9]] : (!fir.box,i:i32}>>>) -> !fir.ref,i:i32}>>> +// CHECK: %[[VAL_11:.*]] = arith.constant 0 
: index +// CHECK: %[[VAL_12:.*]]:3 = fir.box_dims %[[VAL_9]], %[[VAL_11]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_13:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_14:.*]]:3 = fir.box_dims %[[VAL_9]], %[[VAL_13]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) +// CHECK: %[[VAL_15:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_16:.*]] = arith.subi %[[VAL_1]], %[[VAL_1]] : index +// CHECK: %[[VAL_17:.*]] = arith.addi %[[VAL_16]], %[[VAL_15]] : index +// CHECK: %[[VAL_18:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_19:.*]] = arith.subi %[[VAL_0]], %[[VAL_0]] : index +// CHECK: %[[VAL_20:.*]] = arith.addi %[[VAL_19]], %[[VAL_18]] : index +// CHECK: %[[VAL_21:.*]] = fir.array_coor %[[VAL_10]] %[[VAL_17]], %[[VAL_20]] : (!fir.ref,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> >From ac7905e6878385069b41eb39f20f1b760d01929b Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Fri, 9 May 2025 15:32:24 -0700 Subject: [PATCH 2/2] Fixed the fir.shift case, and got rid of the shift handling in ConvertToFIR. --- .../flang/Optimizer/Builder/HLFIRTools.h | 6 ---- flang/lib/Optimizer/Builder/HLFIRTools.cpp | 19 ++++++++--- .../HLFIR/Transforms/ConvertToFIR.cpp | 33 +------------------ flang/test/HLFIR/designate-codegen.fir | 9 ++--- 4 files changed, 17 insertions(+), 50 deletions(-) diff --git a/flang/include/flang/Optimizer/Builder/HLFIRTools.h b/flang/include/flang/Optimizer/Builder/HLFIRTools.h index bcba38ed8bd5d..ac80873dc374f 100644 --- a/flang/include/flang/Optimizer/Builder/HLFIRTools.h +++ b/flang/include/flang/Optimizer/Builder/HLFIRTools.h @@ -533,12 +533,6 @@ Entity gen1DSection(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ArrayRef extents, mlir::ValueRange oneBasedIndices, mlir::ArrayRef typeParams); - -/// Return explicit lower bounds from a fir.shape result. -/// Only fir.shape, fir.shift and fir.shape_shift are currently -/// supported as \p shape. 
-llvm::SmallVector getExplicitLboundsFromShape(mlir::Value shape); - } // namespace hlfir #endif // FORTRAN_OPTIMIZER_BUILDER_HLFIRTOOLS_H diff --git a/flang/lib/Optimizer/Builder/HLFIRTools.cpp b/flang/lib/Optimizer/Builder/HLFIRTools.cpp index 752dc0cf86414..f2b084cb760b9 100644 --- a/flang/lib/Optimizer/Builder/HLFIRTools.cpp +++ b/flang/lib/Optimizer/Builder/HLFIRTools.cpp @@ -70,8 +70,11 @@ getExplicitExtents(fir::FortranVariableOpInterface var, return {}; } -llvm::SmallVector -hlfir::getExplicitLboundsFromShape(mlir::Value shape) { +// Return explicit lower bounds from a shape result. +// Only fir.shape, fir.shift and fir.shape_shift are currently +// supported as shape. +static llvm::SmallVector +getExplicitLboundsFromShape(mlir::Value shape) { llvm::SmallVector result; auto *shapeOp = shape.getDefiningOp(); if (auto s = mlir::dyn_cast_or_null(shapeOp)) { @@ -93,7 +96,7 @@ hlfir::getExplicitLboundsFromShape(mlir::Value shape) { static llvm::SmallVector getExplicitLbounds(fir::FortranVariableOpInterface var) { if (mlir::Value shape = var.getShape()) - return hlfir::getExplicitLboundsFromShape(shape); + return getExplicitLboundsFromShape(shape); return {}; } @@ -766,12 +769,18 @@ std::pair hlfir::genVariableFirBaseShapeAndParams( // (!fir.box>, !fir.box>) // The extended value is an ArrayBoxValue with base being // the raw address of the array. - if (auto variableInterface = entity.getIfVariableInterface()) + if (auto variableInterface = entity.getIfVariableInterface()) { + mlir::Value shape = variableInterface.getShape(); if (mlir::isa(fir::getBase(exv).getType()) || !mlir::isa(entity.getType()) || - variableInterface.getShape()) + // Still use the variable's shape if it is present. + // If it only specifies a shift, then we have to create + // a shape from the exv. 
+ (shape && (shape.getDefiningOp() || + shape.getDefiningOp()))) return {fir::getBase(exv), asEmboxShape(loc, builder, exv, variableInterface.getShape())}; + } return {fir::getBase(exv), builder.createShape(loc, exv)}; } diff --git a/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp b/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp index 495f11a365185..8f206b5a1ade7 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/ConvertToFIR.cpp @@ -412,44 +412,13 @@ class DesignateOpConversion auto indices = designate.getIndices(); int i = 0; auto attrs = designate.getIsTripletAttr(); - - // If the shape specifies a shift and the base is not a box, - // then we have to subtract the lower bounds, as long as - // fir.array_coor does not support non-default lower bounds - // for non-box accesses. - llvm::SmallVector lbounds; - if (shape && !mlir::isa(base.getType())) - lbounds = hlfir::getExplicitLboundsFromShape(shape); - std::size_t lboundIdx = 0; for (auto isTriplet : attrs.asArrayRef()) { // Coordinate of the first element are the index and triplets lower // bounds. - mlir::Value index = indices[i]; - if (!lbounds.empty()) { - assert(lboundIdx < lbounds.size() && "missing lbound"); - mlir::Type indexType = builder.getIndexType(); - mlir::Value one = builder.createIntegerConstant(loc, indexType, 1); - mlir::Value orig = builder.createConvert(loc, indexType, index); - mlir::Value lb = - builder.createConvert(loc, indexType, lbounds[lboundIdx]); - index = builder.create(loc, orig, lb); - index = builder.create(loc, index, one); - ++lboundIdx; - } - firstElementIndices.push_back(index); + firstElementIndices.push_back(indices[i]); i = i + (isTriplet ? 3 : 1); } - // Remove the shift from the shape, if needed. 
- if (!lbounds.empty()) { - mlir::Operation *op = shape.getDefiningOp(); - if (mlir::isa(op)) - shape = nullptr; - else if (auto shiftOp = mlir::dyn_cast(op)) - shape = builder.create(loc, shiftOp.getExtents()); - else - TODO(loc, "read fir.shape to get lower bounds"); - } mlir::Type originalDesignateType = designate.getResult().getType(); const bool isVolatile = fir::isa_volatile_type(originalDesignateType); mlir::Type arrayCoorType = fir::ReferenceType::get(baseEleTy, isVolatile); diff --git a/flang/test/HLFIR/designate-codegen.fir b/flang/test/HLFIR/designate-codegen.fir index d3e264941264f..5c3ae202fd3b9 100644 --- a/flang/test/HLFIR/designate-codegen.fir +++ b/flang/test/HLFIR/designate-codegen.fir @@ -278,10 +278,5 @@ func.func @_QPtest_contiguous_derived_lbounds(%arg0: !fir.class,i:i32}>>>, index) -> (index, index, index) // CHECK: %[[VAL_13:.*]] = arith.constant 1 : index // CHECK: %[[VAL_14:.*]]:3 = fir.box_dims %[[VAL_9]], %[[VAL_13]] : (!fir.box,i:i32}>>>, index) -> (index, index, index) -// CHECK: %[[VAL_15:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_16:.*]] = arith.subi %[[VAL_1]], %[[VAL_1]] : index -// CHECK: %[[VAL_17:.*]] = arith.addi %[[VAL_16]], %[[VAL_15]] : index -// CHECK: %[[VAL_18:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_19:.*]] = arith.subi %[[VAL_0]], %[[VAL_0]] : index -// CHECK: %[[VAL_20:.*]] = arith.addi %[[VAL_19]], %[[VAL_18]] : index -// CHECK: %[[VAL_21:.*]] = fir.array_coor %[[VAL_10]] %[[VAL_17]], %[[VAL_20]] : (!fir.ref,i:i32}>>>, index, index) -> !fir.ref,i:i32}>> +// CHECK: %[[VAL_15:.*]] = fir.shape_shift %[[VAL_1]], %[[VAL_12]]#1, %[[VAL_0]], %[[VAL_14]]#1 : (index, index, index, index) -> !fir.shapeshift<2> +// CHECK: %[[VAL_16:.*]] = fir.array_coor %[[VAL_10]](%[[VAL_15]]) %[[VAL_1]], %[[VAL_0]] : (!fir.ref,i:i32}>>>, !fir.shapeshift<2>, index, index) -> !fir.ref,i:i32}>> From flang-commits at lists.llvm.org Mon May 12 13:53:21 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) 
Date: Mon, 12 May 2025 13:53:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Treat hlfir.associate as Allocate for FIR alias analysis. (PR #139004) In-Reply-To: Message-ID: <68225fc1.170a0220.85a02.ac29@mx.google.com> https://github.com/vzakhari updated https://github.com/llvm/llvm-project/pull/139004

From flang-commits at lists.llvm.org Mon May 12 14:02:52 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 12 May 2025 14:02:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Propagate contiguous attribute through HLFIR. (PR #138797) In-Reply-To: Message-ID: <682261fc.170a0220.228864.bcbc@mx.google.com> https://github.com/vzakhari edited https://github.com/llvm/llvm-project/pull/138797 From flang-commits at lists.llvm.org Mon May 12 14:03:19 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 12 May 2025 14:03:19 -0700 (PDT) Subject: [flang-commits] [flang] 09b772e - [flang] Postpone hlfir.end_associate generation for calls. (#138786) Message-ID: <68226217.170a0220.15b25e.e3d7@mx.google.com> Author: Slava Zakharin Date: 2025-05-12T14:03:15-07:00 New Revision: 09b772e2efad804fdda02e2bd9ee44a2aaaddeeb URL: https://github.com/llvm/llvm-project/commit/09b772e2efad804fdda02e2bd9ee44a2aaaddeeb DIFF: https://github.com/llvm/llvm-project/commit/09b772e2efad804fdda02e2bd9ee44a2aaaddeeb.diff LOG: [flang] Postpone hlfir.end_associate generation for calls. (#138786) If we generate hlfir.end_associate at the end of the statement, we get HLFIR that is easier to optimize, because there are no compiler-generated operations with side effects between the call and the consumers. This allows more hlfir.eval_in_mem operations to reuse the LHS instead of allocating a temporary buffer. I do not think the same can always be done for hlfir.copy_out, e.g.: ``` subroutine test2(x) interface function array_func2(x,y) real:: x(*), array_func2(10), y end function array_func2 end interface real :: x(:) x = array_func2(x, 1.0) end subroutine test2 ``` If we postpone the copy-out until after the assignment, the result may be wrong. 
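The ordering described in the log message can be modelled outside the compiler. The following is only a minimal sketch in plain C++17, not flang code: `Trace`, `Overloaded`, `lowerAssignmentFromCall` and the simplified `StatementContext` are hypothetical stand-ins invented for illustration, and strings in a vector take the place of generated HLFIR operations. It assumes one copy-in argument and one associated temporary, as in `test2` above.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the two kinds of call clean-up.
struct CopyIn { std::string var; };
struct ExprAssociate { std::string temp; };

// Records the order in which "operations" are emitted.
using Trace = std::vector<std::string>;

// Standard overload helper for std::visit.
template <class... Ts> struct Overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

struct CallCleanUp {
  std::variant<CopyIn, ExprAssociate> cleanUp;

  // Emit the clean-up now, or return it when it must be postponed.
  std::optional<CallCleanUp> genCleanUp(Trace &trace,
                                        bool postponeAssociates) const {
    std::optional<CallCleanUp> postponed;
    std::visit(Overloaded{
                   // Copy-out is always emitted right after the call.
                   [&](const CopyIn &c) { trace.push_back("copy_out " + c.var); },
                   [&](const ExprAssociate &a) {
                     if (postponeAssociates)
                       postponed = CallCleanUp{a};
                     else
                       trace.push_back("end_associate " + a.temp);
                   }},
               cleanUp);
    return postponed;
  }
};

// Simplified statement context: runs attached clean-ups on finalize().
struct StatementContext {
  std::vector<std::function<void()>> cleanups;
  bool finalized = false;

  void attachCleanup(std::function<void()> fn) {
    cleanups.push_back(std::move(fn));
  }
  void finalize() {
    assert(!finalized && "statement context finalized twice");
    finalized = true;
    for (auto &fn : cleanups)
      fn();
  }
};

// Models lowering of `x = f(x, 1.0)`.
Trace lowerAssignmentFromCall(bool postpone) {
  Trace trace;
  StatementContext stmtCtx;
  std::vector<CallCleanUp> callCleanUps{CallCleanUp{CopyIn{"x"}},
                                        CallCleanUp{ExprAssociate{"tmp"}}};

  trace.push_back("call");
  std::vector<CallCleanUp> associateCleanups;
  for (const auto &c : callCleanUps)
    if (auto p = c.genCleanUp(trace, postpone))
      associateCleanups.push_back(*p);

  // Postponed clean-ups are emitted when the statement is finalized.
  stmtCtx.attachCleanup([&trace, associateCleanups] {
    for (const auto &c : associateCleanups)
      (void)c.genCleanUp(trace, /*postponeAssociates=*/false);
  });

  trace.push_back("assign");
  stmtCtx.finalize();
  return trace;
}
```

Calling `lowerAssignmentFromCall(true)` records the end-associate step after the assignment, while the copy-out stays directly after the call; passing `false` reproduces the previous ordering, with both clean-ups emitted before the assignment.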
Added: flang/test/Lower/HLFIR/call-postponed-associate.f90 Modified: flang/lib/Lower/ConvertCall.cpp flang/lib/Lower/OpenACC.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/test/Lower/HLFIR/entry_return.f90 flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 flang/test/Lower/OpenACC/acc-atomic-capture.f90 flang/test/Lower/OpenACC/acc-atomic-update.f90 flang/test/Lower/OpenMP/atomic-capture.f90 flang/test/Lower/OpenMP/atomic-update.f90 Removed: ################################################################################ diff --git a/flang/lib/Lower/ConvertCall.cpp b/flang/lib/Lower/ConvertCall.cpp index a5b85e25b1af0..d37d51f6ec634 100644 --- a/flang/lib/Lower/ConvertCall.cpp +++ b/flang/lib/Lower/ConvertCall.cpp @@ -960,9 +960,26 @@ struct CallCleanUp { mlir::Value tempVar; mlir::Value mustFree; }; - void genCleanUp(mlir::Location loc, fir::FirOpBuilder &builder) { - Fortran::common::visit([&](auto &c) { c.genCleanUp(loc, builder); }, + + /// Generate clean-up code. + /// If \p postponeAssociates is true, the ExprAssociate clean-up + /// is not generated, and instead the corresponding CallCleanUp + /// object is returned as the result. + std::optional genCleanUp(mlir::Location loc, + fir::FirOpBuilder &builder, + bool postponeAssociates) { + std::optional postponed; + Fortran::common::visit(Fortran::common::visitors{ + [&](CopyIn &c) { c.genCleanUp(loc, builder); }, + [&](ExprAssociate &c) { + if (postponeAssociates) + postponed = CallCleanUp{c}; + else + c.genCleanUp(loc, builder); + }, + }, cleanUp); + return postponed; } std::variant cleanUp; }; @@ -1729,10 +1746,23 @@ genUserCall(Fortran::lower::PreparedActualArguments &loweredActuals, caller, callSiteType, callContext.resultType, callContext.isElementalProcWithArrayArgs()); - /// Clean-up associations and copy-in. - for (auto cleanUp : callCleanUps) - cleanUp.genCleanUp(loc, builder); - + // Clean-up associations and copy-in. 
+ // The association clean-ups are postponed to the end of the statement + // lowering. The copy-in clean-ups may be delayed as well, + // but they are done immediately after the call currently. + llvm::SmallVector associateCleanups; + for (auto cleanUp : callCleanUps) { + auto postponed = + cleanUp.genCleanUp(loc, builder, /*postponeAssociates=*/true); + if (postponed) + associateCleanups.push_back(*postponed); + } + + fir::FirOpBuilder *bldr = &builder; + callContext.stmtCtx.attachCleanup([=]() { + for (auto cleanUp : associateCleanups) + (void)cleanUp.genCleanUp(loc, *bldr, /*postponeAssociates=*/false); + }); if (auto *entity = std::get_if(&loweredResult)) return *entity; diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 2f70041a04dde..e1918288d6de3 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -416,7 +416,8 @@ static inline void genAtomicUpdateStatement( Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { + mlir::Operation *atomicCaptureOp = nullptr, + Fortran::lower::StatementContext *atomicCaptureStmtCtx = nullptr) { // Generate `atomic.update` operation for atomic assignment statements fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); mlir::Location currentLocation = converter.getCurrentLocation(); @@ -496,15 +497,24 @@ static inline void genAtomicUpdateStatement( }, assignmentStmtExpr.u); Fortran::lower::StatementContext nonAtomicStmtCtx; + Fortran::lower::StatementContext *stmtCtxPtr = &nonAtomicStmtCtx; if (!nonAtomicSubExprs.empty()) { // Generate non atomic part before all the atomic operations. 
auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) + if (atomicCaptureOp) { + assert(atomicCaptureStmtCtx && "must specify statement context"); firOpBuilder.setInsertionPoint(atomicCaptureOp); + // Any clean-ups associated with the expression lowering + // must also be generated outside of the atomic update operation + // and after the atomic capture operation. + // The atomicCaptureStmtCtx will be finalized at the end + // of the atomic capture operation generation. + stmtCtxPtr = atomicCaptureStmtCtx; + } mlir::Value nonAtomicVal; for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + currentLocation, *nonAtomicSubExpr, *stmtCtxPtr)); exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); } if (atomicCaptureOp) @@ -652,7 +662,7 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, elementType, loc); genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, - stmt2Expr, loc, atomicCaptureOp); + stmt2Expr, loc, atomicCaptureOp, &stmtCtx); } else { // Atomic capture construct is of the form [capture-stmt, write-stmt] firOpBuilder.setInsertionPoint(atomicCaptureOp); @@ -672,13 +682,15 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, *Fortran::semantics::GetExpr(stmt2Expr); mlir::Type elementType = converter.genType(fromExpr); genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, - stmt1Expr, loc, atomicCaptureOp); + stmt1Expr, loc, atomicCaptureOp, &stmtCtx); genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, loc); } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); + // The clean-ups associated with the statements inside the capture + // construct must be generated after the AtomicCaptureOp. 
+ firOpBuilder.setInsertionPointAfter(atomicCaptureOp); } template diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 446aa2deb3d05..4909c3e277a07 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2816,7 +2816,8 @@ static void genAtomicUpdateStatement( const parser::Expr &assignmentStmtExpr, const parser::OmpAtomicClauseList *leftHandClauseList, const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { + mlir::Operation *atomicCaptureOp = nullptr, + lower::StatementContext *atomicCaptureStmtCtx = nullptr) { // Generate `atomic.update` operation for atomic assignment statements fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); mlir::Location currentLocation = converter.getCurrentLocation(); @@ -2890,15 +2891,24 @@ static void genAtomicUpdateStatement( }, assignmentStmtExpr.u); lower::StatementContext nonAtomicStmtCtx; + lower::StatementContext *stmtCtxPtr = &nonAtomicStmtCtx; if (!nonAtomicSubExprs.empty()) { // Generate non atomic part before all the atomic operations. auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) + if (atomicCaptureOp) { + assert(atomicCaptureStmtCtx && "must specify statement context"); firOpBuilder.setInsertionPoint(atomicCaptureOp); + // Any clean-ups associated with the expression lowering + // must also be generated outside of the atomic update operation + // and after the atomic capture operation. + // The atomicCaptureStmtCtx will be finalized at the end + // of the atomic capture operation generation. 
+ stmtCtxPtr = atomicCaptureStmtCtx; + } mlir::Value nonAtomicVal; for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + currentLocation, *nonAtomicSubExpr, *stmtCtxPtr)); exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); } if (atomicCaptureOp) @@ -3238,7 +3248,7 @@ static void genAtomicCapture(lower::AbstractConverter &converter, genAtomicUpdateStatement( converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp, &stmtCtx); } else { // Atomic capture construct is of the form [capture-stmt, write-stmt] firOpBuilder.setInsertionPoint(atomicCaptureOp); @@ -3284,7 +3294,7 @@ static void genAtomicCapture(lower::AbstractConverter &converter, genAtomicUpdateStatement( converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp, &stmtCtx); if (stmt1VarType != stmt2VarType) { mlir::Value alloca; @@ -3316,7 +3326,9 @@ static void genAtomicCapture(lower::AbstractConverter &converter, } firOpBuilder.setInsertionPointToEnd(&block); firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); + // The clean-ups associated with the statements inside the capture + // construct must be generated after the AtomicCaptureOp. + firOpBuilder.setInsertionPointAfter(atomicCaptureOp); } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/HLFIR/call-postponed-associate.f90 b/flang/test/Lower/HLFIR/call-postponed-associate.f90 new file mode 100644 index 0000000000000..18df62b44324b --- /dev/null +++ b/flang/test/Lower/HLFIR/call-postponed-associate.f90 @@ -0,0 +1,85 @@ +! 
RUN: bbc -emit-hlfir -o - %s -I nowhere | FileCheck %s + +subroutine test1 + interface + function array_func1(x) + real:: x, array_func1(10) + end function array_func1 + end interface + real :: x(10) + x = array_func1(1.0) +end subroutine test1 +! CHECK-LABEL: func.func @_QPtest1() { +! CHECK: %[[VAL_5:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_17:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: fir.call @_QParray_func1 +! CHECK: fir.save_result +! CHECK: } +! CHECK: hlfir.assign %[[VAL_17]] to %{{.*}} : !hlfir.expr<10xf32>, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 + +subroutine test2(x) + interface + function array_func2(x,y) + real:: x(*), array_func2(10), y + end function array_func2 + end interface + real :: x(:) + x = array_func2(x, 1.0) +end subroutine test2 +! CHECK-LABEL: func.func @_QPtest2( +! CHECK: %[[VAL_3:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_4:.*]]:2 = hlfir.copy_in %{{.*}} to %{{.*}} : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +! CHECK: %[[VAL_5:.*]] = fir.box_addr %[[VAL_4]]#0 : (!fir.box>) -> !fir.ref> +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_3]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_17:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: ^bb0(%[[VAL_18:.*]]: !fir.ref>): +! CHECK: %[[VAL_19:.*]] = fir.call @_QParray_func2(%[[VAL_5]], %[[VAL_6]]#0) fastmath : (!fir.ref>, !fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_19]] to %[[VAL_18]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: hlfir.copy_out %{{.*}}, %[[VAL_4]]#1 to %{{.*}} : (!fir.ref>>>, i1, !fir.box>) -> () +! CHECK: hlfir.assign %[[VAL_17]] to %{{.*}} : !hlfir.expr<10xf32>, !fir.box> +! 
CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_17]] : !hlfir.expr<10xf32> + +subroutine test3(x) + interface + function array_func3(x) + real :: x, array_func3(10) + end function array_func3 + end interface + logical :: x + if (any(array_func3(1.0).le.array_func3(2.0))) x = .true. +end subroutine test3 +! CHECK-LABEL: func.func @_QPtest3( +! CHECK: %[[VAL_2:.*]] = arith.constant 1.000000e+00 : f32 +! CHECK: %[[VAL_3:.*]]:3 = hlfir.associate %[[VAL_2]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_14:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: ^bb0(%[[VAL_15:.*]]: !fir.ref>): +! CHECK: %[[VAL_16:.*]] = fir.call @_QParray_func3(%[[VAL_3]]#0) fastmath : (!fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_16]] to %[[VAL_15]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: %[[VAL_17:.*]] = arith.constant 2.000000e+00 : f32 +! CHECK: %[[VAL_18:.*]]:3 = hlfir.associate %[[VAL_17]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_29:.*]] = hlfir.eval_in_mem shape %{{.*}} : (!fir.shape<1>) -> !hlfir.expr<10xf32> { +! CHECK: ^bb0(%[[VAL_30:.*]]: !fir.ref>): +! CHECK: %[[VAL_31:.*]] = fir.call @_QParray_func3(%[[VAL_18]]#0) fastmath : (!fir.ref) -> !fir.array<10xf32> +! CHECK: fir.save_result %[[VAL_31]] to %[[VAL_30]](%{{.*}}) : !fir.array<10xf32>, !fir.ref>, !fir.shape<1> +! CHECK: } +! CHECK: %[[VAL_32:.*]] = hlfir.elemental %{{.*}} unordered : (!fir.shape<1>) -> !hlfir.expr> { +! CHECK: ^bb0(%[[VAL_33:.*]]: index): +! CHECK: %[[VAL_34:.*]] = hlfir.apply %[[VAL_14]], %[[VAL_33]] : (!hlfir.expr<10xf32>, index) -> f32 +! CHECK: %[[VAL_35:.*]] = hlfir.apply %[[VAL_29]], %[[VAL_33]] : (!hlfir.expr<10xf32>, index) -> f32 +! CHECK: %[[VAL_36:.*]] = arith.cmpf ole, %[[VAL_34]], %[[VAL_35]] fastmath : f32 +! CHECK: %[[VAL_37:.*]] = fir.convert %[[VAL_36]] : (i1) -> !fir.logical<4> +! 
CHECK: hlfir.yield_element %[[VAL_37]] : !fir.logical<4> +! CHECK: } +! CHECK: %[[VAL_38:.*]] = hlfir.any %[[VAL_32]] : (!hlfir.expr>) -> !fir.logical<4> +! CHECK: hlfir.destroy %[[VAL_32]] : !hlfir.expr> +! CHECK: hlfir.end_associate %[[VAL_18]]#1, %[[VAL_18]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_29]] : !hlfir.expr<10xf32> +! CHECK: hlfir.end_associate %[[VAL_3]]#1, %[[VAL_3]]#2 : !fir.ref, i1 +! CHECK: hlfir.destroy %[[VAL_14]] : !hlfir.expr<10xf32> +! CHECK: %[[VAL_39:.*]] = fir.convert %[[VAL_38]] : (!fir.logical<4>) -> i1 +! CHECK: fir.if %[[VAL_39]] { diff --git a/flang/test/Lower/HLFIR/entry_return.f90 b/flang/test/Lower/HLFIR/entry_return.f90 index 5d3e160af2df6..18fb2b571b950 100644 --- a/flang/test/Lower/HLFIR/entry_return.f90 +++ b/flang/test/Lower/HLFIR/entry_return.f90 @@ -51,13 +51,13 @@ logical function f2() ! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_4]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_8:.*]] = fir.call @_QPcomplex(%[[VAL_6]]#0, %[[VAL_7]]#0) fastmath : (!fir.ref, !fir.ref) -> f32 -! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 -! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_9:.*]] = arith.constant 0.000000e+00 : f32 ! CHECK: %[[VAL_10:.*]] = fir.undefined complex ! CHECK: %[[VAL_11:.*]] = fir.insert_value %[[VAL_10]], %[[VAL_8]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[VAL_12:.*]] = fir.insert_value %[[VAL_11]], %[[VAL_9]], [1 : index] : (complex, f32) -> complex ! CHECK: hlfir.assign %[[VAL_12]] to %[[VAL_1]]#0 : complex, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_3]]#0 : !fir.ref> ! CHECK: return %[[VAL_13]] : !fir.logical<4> ! 
CHECK: } @@ -74,13 +74,13 @@ logical function f2() ! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %[[VAL_4]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %[[VAL_5]] {adapt.valuebyref} : (f32) -> (!fir.ref, !fir.ref, i1) ! CHECK: %[[VAL_8:.*]] = fir.call @_QPcomplex(%[[VAL_6]]#0, %[[VAL_7]]#0) fastmath : (!fir.ref, !fir.ref) -> f32 -! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 -! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_9:.*]] = arith.constant 0.000000e+00 : f32 ! CHECK: %[[VAL_10:.*]] = fir.undefined complex ! CHECK: %[[VAL_11:.*]] = fir.insert_value %[[VAL_10]], %[[VAL_8]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[VAL_12:.*]] = fir.insert_value %[[VAL_11]], %[[VAL_9]], [1 : index] : (complex, f32) -> complex ! CHECK: hlfir.assign %[[VAL_12]] to %[[VAL_1]]#0 : complex, !fir.ref> +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 +! CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 ! CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_1]]#0 : !fir.ref> ! CHECK: return %[[VAL_13]] : complex ! CHECK: } diff --git a/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 b/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 index 28659a33d0893..206b6e4e9b797 100644 --- a/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 +++ b/flang/test/Lower/HLFIR/proc-pointer-comp-nopass.f90 @@ -32,8 +32,8 @@ real function test1(x) ! CHECK: %[[VAL_7:.*]] = fir.load %[[VAL_6]] : !fir.ref) -> f32>> ! CHECK: %[[VAL_8:.*]] = fir.box_addr %[[VAL_7]] : (!fir.boxproc<(!fir.ref) -> f32>) -> ((!fir.ref) -> f32) ! CHECK: %[[VAL_9:.*]] = fir.call %[[VAL_8]](%[[VAL_5]]#0) fastmath : (!fir.ref) -> f32 -! CHECK: hlfir.end_associate %[[VAL_5]]#1, %[[VAL_5]]#2 : !fir.ref, i1 ! CHECK: hlfir.assign %[[VAL_9]] to %[[VAL_2]]#0 : f32, !fir.ref +! 
CHECK: hlfir.end_associate %[[VAL_5]]#1, %[[VAL_5]]#2 : !fir.ref, i1 subroutine test2(x) use proc_comp_defs, only : t, iface diff --git a/flang/test/Lower/OpenACC/acc-atomic-capture.f90 b/flang/test/Lower/OpenACC/acc-atomic-capture.f90 index 82059908bcd0b..ee38ab6ce826a 100644 --- a/flang/test/Lower/OpenACC/acc-atomic-capture.f90 +++ b/flang/test/Lower/OpenACC/acc-atomic-capture.f90 @@ -306,3 +306,60 @@ end subroutine comp_ref_in_atomic_capture2 ! CHECK: } ! CHECK: acc.atomic.read %[[V_DECL]]#0 = %[[C]] : !fir.ref, !fir.ref, i32 ! CHECK: } + +! CHECK-LABEL: func.func @_QPatomic_capture_with_associate() { +subroutine atomic_capture_with_associate + interface + integer function func(x) + integer :: x + end function func + end interface +! CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "_QFatomic_capture_with_associateEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Y_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "_QFatomic_capture_with_associateEy"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[Z_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "_QFatomic_capture_with_associateEz"} : (!fir.ref) -> (!fir.ref, !fir.ref) + integer :: x, y, z + +! CHECK: %[[VAL_10:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_11:.*]] = fir.call @_QPfunc(%[[VAL_10]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: acc.atomic.capture { +! CHECK: acc.atomic.read %[[X_DECL]]#0 = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: acc.atomic.write %[[Y_DECL]]#0 = %[[VAL_11]] : !fir.ref, i32 +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_10]]#1, %[[VAL_10]]#2 : !fir.ref, i1 + !$acc atomic capture + x = y + y = func(z + 1) + !$acc end atomic + +! CHECK: %[[VAL_15:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_16:.*]] = fir.call @_QPfunc(%[[VAL_15]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: acc.atomic.capture { +! 
CHECK: acc.atomic.update %[[Y_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_17:.*]]: i32): +! CHECK: %[[VAL_18:.*]] = arith.muli %[[VAL_16]], %[[VAL_17]] : i32 +! CHECK: acc.yield %[[VAL_18]] : i32 +! CHECK: } +! CHECK: acc.atomic.read %[[X_DECL]]#0 = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_15]]#1, %[[VAL_15]]#2 : !fir.ref, i1 + !$acc atomic capture + y = func(z + 1) * y + x = y + !$acc end atomic + +! CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_23:.*]] = fir.call @_QPfunc(%[[VAL_22]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: acc.atomic.capture { +! CHECK: acc.atomic.read %[[X_DECL]]#0 = %[[Y_DECL]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: acc.atomic.update %[[Y_DECL]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_24:.*]]: i32): +! CHECK: %[[VAL_25:.*]] = arith.addi %[[VAL_23]], %[[VAL_24]] : i32 +! CHECK: acc.yield %[[VAL_25]] : i32 +! CHECK: } +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_22]]#1, %[[VAL_22]]#2 : !fir.ref, i1 + !$acc atomic capture + x = y + y = func(z + 1) + y + !$acc end atomic +end subroutine atomic_capture_with_associate diff --git a/flang/test/Lower/OpenACC/acc-atomic-update.f90 b/flang/test/Lower/OpenACC/acc-atomic-update.f90 index da2972877244c..71aa69fd64eba 100644 --- a/flang/test/Lower/OpenACC/acc-atomic-update.f90 +++ b/flang/test/Lower/OpenACC/acc-atomic-update.f90 @@ -3,6 +3,11 @@ ! 
RUN: %flang_fc1 -fopenacc -emit-hlfir %s -o - | FileCheck %s program acc_atomic_update_test + interface + integer function func(x) + integer :: x + end function func + end interface integer :: x, y, z integer, pointer :: a, b integer, target :: c, d @@ -67,7 +72,18 @@ program acc_atomic_update_test !$acc atomic i1 = i1 + 1 !$acc end atomic + +!CHECK: %[[VAL_44:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +!CHECK: %[[VAL_45:.*]] = fir.call @_QPfunc(%[[VAL_44]]#0) fastmath : (!fir.ref) -> i32 +!CHECK: acc.atomic.update %[[X_DECL]]#0 : !fir.ref { +!CHECK: ^bb0(%[[VAL_46:.*]]: i32): +!CHECK: %[[VAL_47:.*]] = arith.addi %[[VAL_46]], %[[VAL_45]] : i32 +!CHECK: acc.yield %[[VAL_47]] : i32 +!CHECK: } +!CHECK: hlfir.end_associate %[[VAL_44]]#1, %[[VAL_44]]#2 : !fir.ref, i1 + !$acc atomic update + x = x + func(z + 1) + !$acc end atomic !CHECK: return !CHECK: } end program acc_atomic_update_test - diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..2f800d534dc36 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -97,3 +97,59 @@ subroutine pointers_in_atomic_capture() b = a !$omp end atomic end subroutine + +! Check that the clean-ups associated with the function call +! are generated after the omp.atomic.capture operation: +! CHECK-LABEL: func.func @_QPfunc_call_cleanup( +subroutine func_call_cleanup(x, v, vv) + interface + integer function func(x) + integer :: x + end function func + end interface + integer :: x, v, vv + +! CHECK: %[[VAL_7:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_8:.*]] = fir.call @_QPfunc(%[[VAL_7]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[VAL_1:.*]]#0 = %[[VAL_3:.*]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.write %[[VAL_3]]#0 = %[[VAL_8]] : !fir.ref, i32 +! CHECK: } +! 
CHECK: hlfir.end_associate %[[VAL_7]]#1, %[[VAL_7]]#2 : !fir.ref, i1 + !$omp atomic capture + v = x + x = func(vv + 1) + !$omp end atomic + +! CHECK: %[[VAL_12:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_13:.*]] = fir.call @_QPfunc(%[[VAL_12]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.read %[[VAL_1]]#0 = %[[VAL_3]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: omp.atomic.update %[[VAL_3]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_14:.*]]: i32): +! CHECK: %[[VAL_15:.*]] = arith.addi %[[VAL_13]], %[[VAL_14]] : i32 +! CHECK: omp.yield(%[[VAL_15]] : i32) +! CHECK: } +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_12]]#1, %[[VAL_12]]#2 : !fir.ref, i1 + !$omp atomic capture + v = x + x = func(vv + 1) + x + !$omp end atomic + +! CHECK: %[[VAL_19:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_20:.*]] = fir.call @_QPfunc(%[[VAL_19]]#0) fastmath : (!fir.ref) -> i32 +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[VAL_3]]#0 : !fir.ref { +! CHECK: ^bb0(%[[VAL_21:.*]]: i32): +! CHECK: %[[VAL_22:.*]] = arith.addi %[[VAL_20]], %[[VAL_21]] : i32 +! CHECK: omp.yield(%[[VAL_22]] : i32) +! CHECK: } +! CHECK: omp.atomic.read %[[VAL_1]]#0 = %[[VAL_3]]#0 : !fir.ref, !fir.ref, i32 +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_19]]#1, %[[VAL_19]]#2 : !fir.ref, i1 + !$omp atomic capture + x = func(vv + 1) + x + v = x + !$omp end atomic +end subroutine func_call_cleanup diff --git a/flang/test/Lower/OpenMP/atomic-update.f90 b/flang/test/Lower/OpenMP/atomic-update.f90 index 257ae8fb497ff..3f840acefa6e8 100644 --- a/flang/test/Lower/OpenMP/atomic-update.f90 +++ b/flang/test/Lower/OpenMP/atomic-update.f90 @@ -219,3 +219,24 @@ program OmpAtomicUpdate !$omp atomic update w = w + g end program OmpAtomicUpdate + +! Check that the clean-ups associated with the function call +! are generated after the omp.atomic.update operation: +! 
CHECK-LABEL: func.func @_QPfunc_call_cleanup( +subroutine func_call_cleanup(v, vv) + integer v, vv + +! CHECK: %[[VAL_6:.*]]:3 = hlfir.associate %{{.*}} {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +! CHECK: %[[VAL_7:.*]] = fir.call @_QPfunc(%[[VAL_6]]#0) fastmath : (!fir.ref) -> f32 +! CHECK: omp.atomic.update %{{.*}} : !fir.ref { +! CHECK: ^bb0(%[[VAL_8:.*]]: i32): +! CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_8]] : (i32) -> f32 +! CHECK: %[[VAL_10:.*]] = arith.addf %[[VAL_9]], %[[VAL_7]] fastmath : f32 +! CHECK: %[[VAL_11:.*]] = fir.convert %[[VAL_10]] : (f32) -> i32 +! CHECK: omp.yield(%[[VAL_11]] : i32) +! CHECK: } +! CHECK: hlfir.end_associate %[[VAL_6]]#1, %[[VAL_6]]#2 : !fir.ref, i1 + !$omp atomic update + v = v + func(vv + 1) + !$omp end atomic +end subroutine func_call_cleanup

From flang-commits at lists.llvm.org Mon May 12 14:03:24 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 12 May 2025 14:03:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Postpone hlfir.end_associate generation for calls. (PR #138786) In-Reply-To: Message-ID: <6822621c.170a0220.57f24.0035@mx.google.com>

https://github.com/vzakhari closed https://github.com/llvm/llvm-project/pull/138786

From flang-commits at lists.llvm.org Tue May 13 00:36:55 2025 From: flang-commits at lists.llvm.org (Dominik Adamski via flang-commits) Date: Tue, 13 May 2025 00:36:55 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Turn on alias analysis for locally allocated objects (PR #139682) Message-ID:

https://github.com/DominikAdamski created https://github.com/llvm/llvm-project/pull/139682

Previously, a bug in the MemCpyOpt LLVM IR pass caused issues with adding alias tags for locally allocated objects for Fortran code. However, the bug has now been fixed ( https://github.com/llvm/llvm-project/pull/129537 ), and we can safely enable alias tags for these objects. This change should improve the accuracy of the alias analysis.


From flang-commits at lists.llvm.org Tue May 13 00:37:30 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 00:37:30 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Turn on alias analysis for locally allocated objects (PR #139682) In-Reply-To: Message-ID: <6822f6ba.170a0220.1762c6.d835@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Dominik Adamski (DominikAdamski)
Changes

Previously, a bug in the MemCpyOpt LLVM IR pass caused issues with adding alias tags for locally allocated objects for Fortran code. However, the bug has now been fixed ( https://github.com/llvm/llvm-project/pull/129537 ), and we can safely enable alias tags for these objects. This change should improve the accuracy of the alias analysis.

---
Patch is 24.67 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139682.diff

5 Files Affected:

- (modified) flang/lib/Optimizer/Transforms/AddAliasTags.cpp (+4-7)
- (modified) flang/test/Fir/tbaa-codegen2.fir (+5-2)
- (modified) flang/test/Transforms/tbaa-with-dummy-scope2.fir (+15-8)
- (modified) flang/test/Transforms/tbaa2.fir (+28-31)
- (modified) flang/test/Transforms/tbaa3.fir (+5-5)

``````````diff
diff --git a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp index 66b4b84998801..5cfbdc33285f9 100644 --- a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp +++ b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp @@ -43,13 +43,10 @@ static llvm::cl::opt static llvm::cl::opt enableDirect("direct-tbaa", llvm::cl::init(true), llvm::cl::Hidden, llvm::cl::desc("Add TBAA tags to direct variables")); -// This is **known unsafe** (misscompare in spec2017/wrf_r). It should -// not be enabled by default. -// The code is kept so that these may be tried with new benchmarks to see if -// this is worth fixing in the future. -static llvm::cl::opt enableLocalAllocs( - "local-alloc-tbaa", llvm::cl::init(false), llvm::cl::Hidden, - llvm::cl::desc("Add TBAA tags to local allocations. 
UNSAFE.")); +static llvm::cl::opt + enableLocalAllocs("local-alloc-tbaa", llvm::cl::init(true), + llvm::cl::Hidden, + llvm::cl::desc("Add TBAA tags to local allocations.")); namespace { diff --git a/flang/test/Fir/tbaa-codegen2.fir b/flang/test/Fir/tbaa-codegen2.fir index 8f8b6a29129e7..e4bfa9087ec75 100644 --- a/flang/test/Fir/tbaa-codegen2.fir +++ b/flang/test/Fir/tbaa-codegen2.fir @@ -100,7 +100,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ // [...] // CHECK: %[[VAL50:.*]] = getelementptr i32, ptr %{{.*}}, i64 %{{.*}} // store to the temporary: -// CHECK: store i32 %{{.*}}, ptr %[[VAL50]], align 4, !tbaa ![[DATA_ACCESS_TAG:.*]] +// CHECK: store i32 %{{.*}}, ptr %[[VAL50]], align 4, !tbaa ![[TMP_DATA_ACCESS_TAG:.*]] // [...] // CHECK: [[BOX_ACCESS_TAG]] = !{![[BOX_ACCESS_TYPE:.*]], ![[BOX_ACCESS_TYPE]], i64 0} @@ -111,4 +111,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ // CHECK: ![[A_ACCESS_TYPE]] = !{!"dummy arg data/_QFfuncEa", ![[ARG_ACCESS_TYPE:.*]], i64 0} // CHECK: ![[ARG_ACCESS_TYPE]] = !{!"dummy arg data", ![[DATA_ACCESS_TYPE:.*]], i64 0} // CHECK: ![[DATA_ACCESS_TYPE]] = !{!"any data access", ![[ANY_ACCESS_TYPE]], i64 0} -// CHECK: ![[DATA_ACCESS_TAG]] = !{![[DATA_ACCESS_TYPE]], ![[DATA_ACCESS_TYPE]], i64 0} +// CHECK: ![[TMP_DATA_ACCESS_TAG]] = !{![[TMP_DATA_ACCESS_TYPE:.*]], ![[TMP_DATA_ACCESS_TYPE]], i64 0} +// CHECK: ![[TMP_DATA_ACCESS_TYPE]] = !{!"allocated data/", ![[TMP_ACCESS_TYPE:.*]], i64 0} +// CHECK: ![[TMP_ACCESS_TYPE]] = !{!"allocated data", ![[TARGET_ACCESS_TAG:.*]], i64 0} +// CHECK: ![[TARGET_ACCESS_TAG]] = !{!"target data", ![[DATA_ACCESS_TYPE]], i64 0} diff --git a/flang/test/Transforms/tbaa-with-dummy-scope2.fir b/flang/test/Transforms/tbaa-with-dummy-scope2.fir index c8f419fbee652..249471de458d3 100644 --- a/flang/test/Transforms/tbaa-with-dummy-scope2.fir +++ b/flang/test/Transforms/tbaa-with-dummy-scope2.fir @@ -43,12 +43,15 @@ func.func @_QPtest1() 
attributes {noinline} { // CHECK: #[[$ATTR_0:.+]] = #llvm.tbaa_root // CHECK: #[[$ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_2:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_3:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$LOCAL_ATTR_0:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_5:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_tag +// CHECK: #[[$LOCAL_ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_6:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$LOCAL_ATTR_2:.+]] = #llvm.tbaa_tag // CHECK: #[[$ATTR_8:.+]] = #llvm.tbaa_tag // CHECK-LABEL: func.func @_QPtest1() attributes {noinline} { // CHECK: %[[VAL_2:.*]] = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFtest1FinnerEy"} @@ -57,8 +60,8 @@ func.func @_QPtest1() attributes {noinline} { // CHECK: %[[VAL_5:.*]] = fir.dummy_scope : !fir.dscope // CHECK: %[[VAL_6:.*]] = fir.declare %[[VAL_4]] dummy_scope %[[VAL_5]] {uniq_name = "_QFtest1FinnerEx"} : (!fir.ref, !fir.dscope) -> !fir.ref // CHECK: %[[VAL_7:.*]] = fir.declare %[[VAL_2]] {uniq_name = "_QFtest1FinnerEy"} : (!fir.ref) -> !fir.ref -// CHECK: fir.store %{{.*}} to %[[VAL_7]] : !fir.ref -// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] : !fir.ref +// CHECK: fir.store %{{.*}} to %[[VAL_7]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref +// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref // CHECK: fir.store %[[VAL_8]] to %[[VAL_6]] {tbaa = [#[[$ATTR_7]]]} : !fir.ref // CHECK: fir.store %{{.*}} to %[[VAL_4]] {tbaa = [#[[$ATTR_8]]]} : !fir.ref @@ -87,12 +90,16 @@ func.func @_QPtest2() attributes {noinline} { // CHECK: #[[$ATTR_3:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_5:.+]] = #llvm.tbaa_type_desc}> +// CHECK: 
#[[$TARGETDATA_0:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_6:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$TARGETDATA_1:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$LOCAL_ATTR_0:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_8:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_10:.+]] = #llvm.tbaa_tag +// CHECK: #[[$LOCAL_ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_9:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$LOCAL_ATTR_2:.+]] = #llvm.tbaa_tag // CHECK: #[[$ATTR_11:.+]] = #llvm.tbaa_tag // CHECK-LABEL: func.func @_QPtest2() attributes {noinline} { // CHECK: %[[VAL_2:.*]] = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFtest2FinnerEy"} @@ -102,7 +109,7 @@ func.func @_QPtest2() attributes {noinline} { // CHECK: %[[VAL_6:.*]] = fir.dummy_scope : !fir.dscope // CHECK: %[[VAL_7:.*]] = fir.declare %[[VAL_5]] dummy_scope %[[VAL_6]] {uniq_name = "_QFtest2FinnerEx"} : (!fir.ref, !fir.dscope) -> !fir.ref // CHECK: %[[VAL_8:.*]] = fir.declare %[[VAL_2]] {uniq_name = "_QFtest2FinnerEy"} : (!fir.ref) -> !fir.ref -// CHECK: fir.store %{{.*}} to %[[VAL_8]] : !fir.ref -// CHECK: %[[VAL_9:.*]] = fir.load %[[VAL_8]] : !fir.ref +// CHECK: fir.store %{{.*}} to %[[VAL_8]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref +// CHECK: %[[VAL_9:.*]] = fir.load %[[VAL_8]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref // CHECK: fir.store %[[VAL_9]] to %[[VAL_7]] {tbaa = [#[[$ATTR_10]]]} : !fir.ref // CHECK: fir.store %{{.*}} to %[[VAL_5]] {tbaa = [#[[$ATTR_11]]]} : !fir.ref diff --git a/flang/test/Transforms/tbaa2.fir b/flang/test/Transforms/tbaa2.fir index 4678a1cd4a686..1429d0b420766 100644 --- a/flang/test/Transforms/tbaa2.fir +++ b/flang/test/Transforms/tbaa2.fir @@ -50,6 +50,7 @@ // CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_ARG:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_GLBL:.+]] = 
#llvm.tbaa_type_desc}> +// CHECK: #[[ANY_LOCAL:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ARG_LOW:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_DIRECT:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ARG_Z:.+]] = #llvm.tbaa_type_desc}> @@ -61,21 +62,31 @@ // CHECK: #[[GLBL_ZSTART:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_ZSTOP:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[LOCAL1_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_YSTART:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_YSTOP:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[LOCAL2_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_XSTART:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[LOCAL3_ALLOC:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[LOCAL4_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[DIRECT_A:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[DIRECT_B:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_DYINV:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[LOCAL5_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_ZSTART_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_ZSTOP_TAG:.+]] = #llvm.tbaa_tag +// CHECK: #[[LOCAL1_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_YSTART_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_YSTOP_TAG:.+]] = #llvm.tbaa_tag +// CHECK: #[[LOCAL2_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_XSTART_TAG:.+]] = #llvm.tbaa_tag +// CHECK: #[[LOCAL3_ALLOC_TAG:.+]] = #llvm.tbaa_tag +// CHECK: #[[LOCAL4_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[DIRECT_A_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[DIRECT_B_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_DYINV_TAG:.+]] = #llvm.tbaa_tag +// CHECK: #[[LOCAL5_ALLOC_TAG:.+]] = #llvm.tbaa_tag func.func @_QMmodPcallee(%arg0: !fir.box> {fir.bindc_name = "z"}, %arg1: !fir.box> {fir.bindc_name = "y"}, %arg2: !fir.ref>>> {fir.bindc_name = "low"}) { %c2 = arith.constant 2 : index @@ -277,7 +288,7 @@ // CHECK: %[[VAL_44:.*]] = fir.convert %[[VAL_43]] : (i32) -> index // CHECK: %[[VAL_45:.*]] = fir.convert %[[VAL_42]] : (index) -> i32 // CHECK: %[[VAL_46:.*]]:2 
= fir.do_loop %[[VAL_47:.*]] = %[[VAL_42]] to %[[VAL_44]] step %[[VAL_5]] iter_args(%[[VAL_48:.*]] = %[[VAL_45]]) -> (index, i32) { -// CHECK: fir.store %[[VAL_48]] to %[[VAL_34]] : !fir.ref +// CHECK: fir.store %[[VAL_48]] to %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_49:.*]] = fir.load %[[VAL_18]] {tbaa = [#[[GLBL_YSTART_TAG]]]} : !fir.ref // CHECK: %[[VAL_50:.*]] = arith.addi %[[VAL_49]], %[[VAL_6]] : i32 // CHECK: %[[VAL_51:.*]] = fir.convert %[[VAL_50]] : (i32) -> index @@ -285,24 +296,20 @@ // CHECK: %[[VAL_53:.*]] = fir.convert %[[VAL_52]] : (i32) -> index // CHECK: %[[VAL_54:.*]] = fir.convert %[[VAL_51]] : (index) -> i32 // CHECK: %[[VAL_55:.*]]:2 = fir.do_loop %[[VAL_56:.*]] = %[[VAL_51]] to %[[VAL_53]] step %[[VAL_5]] iter_args(%[[VAL_57:.*]] = %[[VAL_54]]) -> (index, i32) { -// CHECK: fir.store %[[VAL_57]] to %[[VAL_32]] : !fir.ref +// CHECK: fir.store %[[VAL_57]] to %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_58:.*]] = fir.load %[[VAL_16]] {tbaa = [#[[GLBL_XSTART_TAG]]]} : !fir.ref // CHECK: %[[VAL_59:.*]] = arith.addi %[[VAL_58]], %[[VAL_6]] : i32 // CHECK: %[[VAL_60:.*]] = fir.convert %[[VAL_59]] : (i32) -> index // CHECK: %[[VAL_61:.*]] = fir.convert %[[VAL_60]] : (index) -> i32 // CHECK: %[[VAL_62:.*]]:2 = fir.do_loop %[[VAL_63:.*]] = %[[VAL_60]] to %[[VAL_4]] step %[[VAL_5]] iter_args(%[[VAL_64:.*]] = %[[VAL_61]]) -> (index, i32) { -// TODO: local allocation assumed to always alias -// CHECK: fir.store %[[VAL_64]] to %[[VAL_30]] : !fir.ref +// CHECK: fir.store %[[VAL_64]] to %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref // load from box tagged in CodeGen // CHECK: %[[VAL_65:.*]] = fir.load %[[VAL_35]] : !fir.ref>>> -// TODO: local allocation assumed to always alias -// CHECK: %[[VAL_66:.*]] = fir.load %[[VAL_30]] : !fir.ref +// CHECK: %[[VAL_66:.*]] = fir.load %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_67:.*]] = fir.convert %[[VAL_66]] : (i32) -> i64 
-// TODO: local allocation assumed to always alias -// CHECK: %[[VAL_68:.*]] = fir.load %[[VAL_32]] : !fir.ref +// CHECK: %[[VAL_68:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_69:.*]] = fir.convert %[[VAL_68]] : (i32) -> i64 -// TODO: local allocation assumed to always alias -// CHECK: %[[VAL_70:.*]] = fir.load %[[VAL_34]] : !fir.ref +// CHECK: %[[VAL_70:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_71:.*]] = fir.convert %[[VAL_70]] : (i32) -> i64 // CHECK: %[[VAL_72:.*]] = fir.box_addr %[[VAL_65]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_73:.*]]:3 = fir.box_dims %[[VAL_65]], %[[VAL_4]] : (!fir.box>>, index) -> (index, index, index) @@ -311,11 +318,10 @@ // CHECK: %[[VAL_76:.*]] = fir.shape_shift %[[VAL_73]]#0, %[[VAL_73]]#1, %[[VAL_74]]#0, %[[VAL_74]]#1, %[[VAL_75]]#0, %[[VAL_75]]#1 : (index, index, index, index, index, index) -> !fir.shapeshift<3> // CHECK: %[[VAL_77:.*]] = fir.array_coor %[[VAL_72]](%[[VAL_76]]) %[[VAL_67]], %[[VAL_69]], %[[VAL_71]] : (!fir.heap>, !fir.shapeshift<3>, i64, i64, i64) -> !fir.ref // CHECK: %[[VAL_78:.*]] = fir.load %[[VAL_77]] {tbaa = [#[[ARG_LOW_TAG]]]} : !fir.ref -// CHECK: fir.store %[[VAL_78]] to %[[VAL_26]] : !fir.ref +// CHECK: fir.store %[[VAL_78]] to %[[VAL_26]] {tbaa = [#[[LOCAL4_ALLOC_TAG]]]} : !fir.ref // load from box tagged in CodeGen // CHECK: %[[VAL_79:.*]] = fir.load %[[VAL_8]] : !fir.ref>>> -// TODO: local allocation assumed to always alias -// CHECK: %[[VAL_80:.*]] = fir.load %[[VAL_32]] : !fir.ref +// CHECK: %[[VAL_80:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_81:.*]] = fir.convert %[[VAL_80]] : (i32) -> i64 // CHECK: %[[VAL_82:.*]] = fir.box_addr %[[VAL_79]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_83:.*]]:3 = fir.box_dims %[[VAL_79]], %[[VAL_4]] : (!fir.box>>, index) -> (index, index, index) @@ -324,11 +330,9 @@ // CHECK: %[[VAL_86:.*]] = fir.load %[[VAL_85]] {tbaa = 
[#[[DIRECT_A_TAG]]]} : !fir.ref // load from box // CHECK: %[[VAL_87:.*]] = fir.load %[[VAL_35]] : !fir.ref>>> -// load from local allocation -// CHECK: %[[VAL_88:.*]] = fir.load %[[VAL_30]] : !fir.ref +// CHECK: %[[VAL_88:.*]] = fir.load %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_89:.*]] = fir.convert %[[VAL_88]] : (i32) -> i64 -// load from local allocation -// CHECK: %[[VAL_90:.*]] = fir.load %[[VAL_34]] : !fir.ref +// CHECK: %[[VAL_90:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_91:.*]] = fir.convert %[[VAL_90]] : (i32) -> i64 // CHECK: %[[VAL_92:.*]] = fir.box_addr %[[VAL_87]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_93:.*]]:3 = fir.box_dims %[[VAL_87]], %[[VAL_4]] : (!fir.box>>, index) -> (index, index, index) @@ -363,8 +367,7 @@ // CHECK: %[[VAL_121:.*]] = fir.load %[[VAL_120]] {tbaa = [#[[ARG_Y_TAG]]]} : !fir.ref // CHECK: %[[VAL_122:.*]] = arith.subf %[[VAL_119]], %[[VAL_121]] fastmath : f32 // CHECK: %[[VAL_123:.*]] = fir.no_reassoc %[[VAL_122]] : f32 -// load from local allocation -// CHECK: %[[VAL_124:.*]] = fir.load %[[VAL_28]] : !fir.ref +// CHECK: %[[VAL_124:.*]] = fir.load %[[VAL_28]] {tbaa = [#[[LOCAL5_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_125:.*]] = arith.mulf %[[VAL_123]], %[[VAL_124]] fastmath : f32 // CHECK: %[[VAL_126:.*]] = arith.addf %[[VAL_115]], %[[VAL_125]] fastmath : f32 // CHECK: %[[VAL_127:.*]] = fir.no_reassoc %[[VAL_126]] : f32 @@ -373,30 +376,24 @@ // CHECK: fir.store %[[VAL_129]] to %[[VAL_97]] {tbaa = [#[[ARG_LOW_TAG]]]} : !fir.ref... [truncated] ``````````
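The type-descriptor chain checked in `tbaa-codegen2.fir` in the diff above can be read as a small tree. The following is a minimal sketch, assuming the usual simplified TBAA rule for scalar accesses (two accesses may alias only if one access type equals the other or is an ancestor of it). The node names are taken from the CHECK lines; `TbaaTree`, `mayAlias` and `buildExampleTree` are hypothetical names for illustration and are not part of flang or LLVM. The parent of "any data access" is not shown in the excerpt, so it is treated as the root here.

```cpp
#include <map>
#include <string>

// Hypothetical illustration only: a TBAA type tree keyed by node name.
class TbaaTree {
public:
  // Record that `name` has parent `parent` (an empty string marks the root).
  void add(const std::string &name, const std::string &parent) {
    parent_[name] = parent;
  }

  // True if `ancestor` is `node` itself or lies on the path from `node`
  // up to the root.
  bool isAncestorOrSelf(const std::string &ancestor, std::string node) const {
    while (true) {
      if (node == ancestor)
        return true;
      auto it = parent_.find(node);
      if (it == parent_.end() || it->second.empty())
        return false;
      node = it->second;
    }
  }

  // Simplified TBAA query (assumption): accesses may alias only when one
  // access type is the other or an ancestor of the other.
  bool mayAlias(const std::string &a, const std::string &b) const {
    return isAncestorOrSelf(a, b) || isAncestorOrSelf(b, a);
  }

private:
  std::map<std::string, std::string> parent_;
};

// Build the chain visible in the tbaa-codegen2.fir CHECK lines.
TbaaTree buildExampleTree() {
  TbaaTree t;
  t.add("any data access", ""); // treated as root in this sketch
  t.add("dummy arg data", "any data access");
  t.add("dummy arg data/_QFfuncEa", "dummy arg data");
  t.add("target data", "any data access");
  t.add("allocated data", "target data");
  t.add("allocated data/", "allocated data");
  return t;
}
```

Under these assumptions, `buildExampleTree().mayAlias("allocated data/", "dummy arg data/_QFfuncEa")` is false because the two nodes sit in different subtrees, whereas a store tagged only with "any data access" may alias either of them. That matches the effect the test change describes: the store to the temporary gets its own "allocated data/" tag instead of the generic data tag.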
https://github.com/llvm/llvm-project/pull/139682

From flang-commits at lists.llvm.org Tue May 13 01:13:11 2025 From: flang-commits at lists.llvm.org (Yussur Mustafa Oraji via flang-commits) Date: Tue, 13 May 2025 01:13:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <6822ff17.170a0220.10e50c.ed9c@mx.google.com>

https://github.com/N00byKing updated https://github.com/llvm/llvm-project/pull/136827

>From 50b63c8dd0b69c0d00c4912f2c40f4e97d0fb384 Mon Sep 17 00:00:00 2001 From: Yussur Mustafa Oraji Date: Wed, 23 Apr 2025 10:33:04 +0200 Subject: [PATCH] [flang] Add __COUNTER__ preprocessor macro --- flang/docs/Extensions.md | 2 ++ flang/docs/Preprocessing.md | 12 ++++++++++++ flang/include/flang/Parser/preprocessor.h | 2 ++ flang/lib/Parser/preprocessor.cpp | 4 ++++ flang/test/Preprocessing/counter.F90 | 9 +++++++++ 5 files changed, 29 insertions(+) create mode 100644 flang/test/Preprocessing/counter.F90 diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 05e21ef2d33b5..c66688b42a3b6 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -509,6 +509,8 @@ end * We respect Fortran comments in macro actual arguments (like GNU, Intel, NAG; unlike PGI and XLF) on the principle that macro calls should be treated like function references. Fortran's line continuation methods also work. +* We implement the `__COUNTER__` preprocessing extension, + see [Non-standard Extensions](Preprocessing#non-standard-extensions) ## Standard features not silently accepted diff --git a/flang/docs/Preprocessing.md b/flang/docs/Preprocessing.md index 0b70d857833ce..db815b9244edf 100644 --- a/flang/docs/Preprocessing.md +++ b/flang/docs/Preprocessing.md @@ -138,6 +138,18 @@ text. OpenMP-style directives that look like comments are not addressed by this scheme but are obvious extensions. +## Currently implemented built-ins + +* `__DATE__`: Date, given as e.g. 
"Jun 16 1904" +* `__TIME__`: Time in 24-hour format including seconds, e.g. "09:24:13" +* `__TIMESTAMP__`: Date, time and year of last modification, given as e.g. "Fri May 9 09:16:17 2025" +* `__FILE__`: Current file +* `__LINE__`: Current line + +### Non-standard Extensions + +* `__COUNTER__`: Replaced by sequential integers on each expansion, starting from 0. + ## Appendix `N` in the table below means "not supported"; this doesn't mean a bug, it just means that a particular behavior was diff --git a/flang/include/flang/Parser/preprocessor.h b/flang/include/flang/Parser/preprocessor.h index 86528a7e68def..834c84a639a74 100644 --- a/flang/include/flang/Parser/preprocessor.h +++ b/flang/include/flang/Parser/preprocessor.h @@ -121,6 +121,8 @@ class Preprocessor { std::list names_; std::unordered_map definitions_; std::stack ifStack_; + + unsigned int counterVal_{0}; }; } // namespace Fortran::parser #endif // FORTRAN_PARSER_PREPROCESSOR_H_ diff --git a/flang/lib/Parser/preprocessor.cpp b/flang/lib/Parser/preprocessor.cpp index a47f9c32ad27c..4549b1c505569 100644 --- a/flang/lib/Parser/preprocessor.cpp +++ b/flang/lib/Parser/preprocessor.cpp @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -299,6 +300,7 @@ void Preprocessor::DefineStandardMacros() { Define("__FILE__"s, "__FILE__"s); Define("__LINE__"s, "__LINE__"s); Define("__TIMESTAMP__"s, "__TIMESTAMP__"s); + Define("__COUNTER__"s, "__COUNTER__"s); } void Preprocessor::Define(const std::string &macro, const std::string &value) { @@ -421,6 +423,8 @@ std::optional Preprocessor::MacroReplacement( repl = "\""s + time + '"'; } } + } else if (name == "__COUNTER__") { + repl = std::to_string(counterVal_++); } if (!repl.empty()) { ProvenanceRange insert{allSources_.AddCompilerInsertion(repl)}; diff --git a/flang/test/Preprocessing/counter.F90 b/flang/test/Preprocessing/counter.F90 new file mode 100644 index 0000000000000..9761c8fb7f355 --- /dev/null +++ b/flang/test/Preprocessing/counter.F90 @@ -0,0 
+1,9 @@ +! RUN: %flang -E %s | FileCheck %s +! CHECK: print *, 0 +! CHECK: print *, 1 +! CHECK: print *, 2 +! Check incremental counter macro +#define foo bar +print *, __COUNTER__ +print *, __COUNTER__ +print *, __COUNTER__ From flang-commits at lists.llvm.org Tue May 13 01:13:53 2025 From: flang-commits at lists.llvm.org (Yussur Mustafa Oraji via flang-commits) Date: Tue, 13 May 2025 01:13:53 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <6822ff41.170a0220.9c22d.f014@mx.google.com> N00byKing wrote: I've adjusted the extensions page with a link to the new preprocessing section https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Tue May 13 02:14:05 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 13 May 2025 02:14:05 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP][Semantics] resolve objects in the flush arg list (PR #139522) In-Reply-To: Message-ID: <68230d5d.170a0220.10b6a5.01a3@mx.google.com> tblah wrote: The CI failure looks unrelated. Thanks for the quick review. https://github.com/llvm/llvm-project/pull/139522 From flang-commits at lists.llvm.org Tue May 13 02:14:05 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 02:14:05 -0700 (PDT) Subject: [flang-commits] [flang] 8ecb958 - [flang][OpenMP][Semantics] resolve objects in the flush arg list (#139522) Message-ID: <68230d5d.170a0220.1166a9.d1ae@mx.google.com> Author: Tom Eccles Date: 2025-05-13T10:14:02+01:00 New Revision: 8ecb958b8f7bc8110fd2bd3e9b023095e7f14c94 URL: https://github.com/llvm/llvm-project/commit/8ecb958b8f7bc8110fd2bd3e9b023095e7f14c94 DIFF: https://github.com/llvm/llvm-project/commit/8ecb958b8f7bc8110fd2bd3e9b023095e7f14c94.diff LOG: [flang][OpenMP][Semantics] resolve objects in the flush arg list (#139522) Fixes #136583 Normally the flush argument list would contain a DataRef to some variable. 
All DataRefs are handled generically in resolve-names and so the problem wasn't observed. But when a common block name is specified, this is not parsed as a DataRef. There was already handling in resolve-directives for OmpObjectList but not for argument lists. I've added a visitor for FLUSH which ensures all of the arguments have been resolved. The test is there to make sure the compiler doesn't crash when encountering the unresolved symbol. It shows that we currently deny flushing a common block. I'm not sure that it is right to restrict common blocks from flush argument lists, but fixing that can come in a different patch. This one is fixing an ICE.

Added: flang/test/Semantics/OpenMP/flush04.f90

Modified: flang/lib/Semantics/resolve-directives.cpp

Removed:

################################################################################
diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 138749a97eb72..9fa7bc8964854 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -409,6 +409,26 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { } void Post(const parser::OpenMPDepobjConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPFlushConstruct &x) { + PushContext(x.source, llvm::omp::Directive::OMPD_flush); + for (auto &arg : x.v.Arguments().v) { + if (auto *locator{std::get_if(&arg.u)}) { + if (auto *object{std::get_if(&locator->u)}) { + if (auto *name{std::get_if(&object->u)}) { + // ResolveOmpCommonBlockName resolves the symbol as a side effect + if (!ResolveOmpCommonBlockName(name)) { + context_.Say(name->source, // 2.15.3 + "COMMON block must be declared in the same scoping unit " + "in which the OpenMP directive or clause appears"_err_en_US); + } + } + } + } + } + return true; + } + void Post(const parser::OpenMPFlushConstruct &) { PopContext(); } + bool Pre(const parser::OpenMPRequiresConstruct &x) { using Flags = 
WithOmpDeclarative::RequiresFlags; using Requires = WithOmpDeclarative::RequiresFlag; diff --git a/flang/test/Semantics/OpenMP/flush04.f90 b/flang/test/Semantics/OpenMP/flush04.f90 new file mode 100644 index 0000000000000..ffc2273b692dc --- /dev/null +++ b/flang/test/Semantics/OpenMP/flush04.f90 @@ -0,0 +1,11 @@ +! RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +! Regression test to ensure that the name /c/ in the flush argument list is +! resolved to the common block symbol. + + common /c/ x + real :: x +!ERROR: FLUSH argument must be a variable list item + !$omp flush(/c/) +end + From flang-commits at lists.llvm.org Tue May 13 02:14:09 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 13 May 2025 02:14:09 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP][Semantics] resolve objects in the flush arg list (PR #139522) In-Reply-To: Message-ID: <68230d61.050a0220.12f3c2.0a4e@mx.google.com> https://github.com/tblah closed https://github.com/llvm/llvm-project/pull/139522 From flang-commits at lists.llvm.org Tue May 13 02:39:00 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 02:39:00 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Set the default schedule modifier (PR #139572) In-Reply-To: Message-ID: <68231334.050a0220.10ef63.24de@mx.google.com> https://github.com/skatrak approved this pull request. Thank you, this LGTM. Perhaps leave some time for others to take a look, in case implementing this within another stage is preferred. 
https://github.com/llvm/llvm-project/pull/139572 From flang-commits at lists.llvm.org Tue May 13 03:23:20 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Tue, 13 May 2025 03:23:20 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) Message-ID: https://github.com/Thirumalai-Shaktivel created https://github.com/llvm/llvm-project/pull/139702 Implementation details: * Recognize prefetch directive in the parser as `!dir$ prefetch ...` * Unparse the prefetch directive * Add required tests Details on the prefetch directive: `!dir$ prefetch designator[, designator]...`, where the designator list can be a variable or an array reference. This directive is used to insert a hint to the code generator to prefetch instructions for memory references.
From flang-commits at lists.llvm.org Tue May 13 03:23:52 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 03:23:52 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <68231db8.170a0220.2cb820.f2e0@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-parser Author: Thirumalai Shaktivel (Thirumalai-Shaktivel)
Changes Implementation details: * Recognize prefetch directive in the parser as `!dir$ prefetch ...` * Unparse the prefetch directive * Add required tests Details on the prefetch directive: `!dir$ prefetch designator[, designator]...`, where the designator list can be a variable or an array reference. This directive is used to insert a hint to the code generator to prefetch instructions for memory references. --- Full diff: https://github.com/llvm/llvm-project/pull/139702.diff 6 Files Affected: - (modified) flang/docs/Directives.md (+3) - (modified) flang/include/flang/Parser/dump-parse-tree.h (+1) - (modified) flang/include/flang/Parser/parse-tree.h (+7-2) - (modified) flang/lib/Parser/Fortran-parsers.cpp (+4) - (modified) flang/lib/Parser/unparse.cpp (+4) - (added) flang/test/Parser/prefetch.f90 (+80) ``````````diff diff --git a/flang/docs/Directives.md b/flang/docs/Directives.md index 91c27cb510ea0..9216516494523 100644 --- a/flang/docs/Directives.md +++ b/flang/docs/Directives.md @@ -50,6 +50,9 @@ A list of non-standard directives supported by Flang integer that specifying the unrolling factor. When `N` is `0` or `1`, the loop should not be unrolled at all. If `N` is omitted the optimizer will selects the number of times to unroll the loop. +* `!dir$ prefetch designator[, designator]...`, where the designator list can be + a variable or an array reference. This directive is used to insert a hint to + the code generator to prefetch instructions for memory references. * `!dir$ novector` disabling vectorization on the following loop. * `!dir$ nounroll` disabling unrolling on the following loop. * `!dir$ nounroll_and_jam` disabling unrolling and jamming on the following loop. 
diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index df9278697346f..c62d9b695108d 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -214,6 +214,7 @@ class ParseTreeDumper { NODE(CompilerDirective, NoVector) NODE(CompilerDirective, NoUnroll) NODE(CompilerDirective, NoUnrollAndJam) + NODE(CompilerDirective, Prefetch) NODE(parser, ComplexLiteralConstant) NODE(parser, ComplexPart) NODE(parser, ComponentArraySpec) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 254236b510544..cba7653be83d3 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -3354,6 +3354,7 @@ struct StmtFunctionStmt { // !DIR$ NOVECTOR // !DIR$ NOUNROLL // !DIR$ NOUNROLL_AND_JAM +// !DIR$ PREFETCH designator[, designator]... // !DIR$ struct CompilerDirective { UNION_CLASS_BOILERPLATE(CompilerDirective); @@ -3379,14 +3380,18 @@ struct CompilerDirective { struct UnrollAndJam { WRAPPER_CLASS_BOILERPLATE(UnrollAndJam, std::optional); }; + struct Prefetch { + WRAPPER_CLASS_BOILERPLATE( + Prefetch, std::list>); + }; EMPTY_CLASS(NoVector); EMPTY_CLASS(NoUnroll); EMPTY_CLASS(NoUnrollAndJam); EMPTY_CLASS(Unrecognized); CharBlock source; std::variant, LoopCount, std::list, - VectorAlways, std::list, Unroll, UnrollAndJam, Unrecognized, - NoVector, NoUnroll, NoUnrollAndJam> + VectorAlways, std::list, Unroll, UnrollAndJam, Prefetch, + Unrecognized, NoVector, NoUnroll, NoUnrollAndJam> u; }; diff --git a/flang/lib/Parser/Fortran-parsers.cpp b/flang/lib/Parser/Fortran-parsers.cpp index fbe629ab52935..782dff8a967b6 100644 --- a/flang/lib/Parser/Fortran-parsers.cpp +++ b/flang/lib/Parser/Fortran-parsers.cpp @@ -1294,6 +1294,7 @@ TYPE_PARSER(construct("STAT =" >> statVariable) || // !DIR$ LOOP COUNT (n1[, n2]...) // !DIR$ name[=value] [, name[=value]]... 
// !DIR$ UNROLL [n] +// !DIR$ PREFETCH designator[, designator]... // !DIR$ constexpr auto ignore_tkr{ "IGNORE_TKR" >> optionalList(construct( @@ -1308,6 +1309,8 @@ constexpr auto vectorAlways{ "VECTOR ALWAYS" >> construct()}; constexpr auto unroll{ "UNROLL" >> construct(maybe(digitString64))}; +constexpr auto prefetch{"PREFETCH" >> + construct(nonemptyList(indirect(designator)))}; constexpr auto unrollAndJam{"UNROLL_AND_JAM" >> construct(maybe(digitString64))}; constexpr auto novector{"NOVECTOR" >> construct()}; @@ -1321,6 +1324,7 @@ TYPE_PARSER(beginDirective >> "DIR$ "_tok >> construct(vectorAlways) || construct(unrollAndJam) || construct(unroll) || + construct(prefetch) || construct(novector) || construct(nounrollAndJam) || construct(nounroll) || diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..e4dbb16a6346c 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -1854,6 +1854,10 @@ class UnparseVisitor { Word("!DIR$ UNROLL"); Walk(" ", unroll.v); }, + [&](const CompilerDirective::Prefetch &prefetch) { + Word("!DIR$ PREFETCH"); + Walk(" ", prefetch.v); + }, [&](const CompilerDirective::UnrollAndJam &unrollAndJam) { Word("!DIR$ UNROLL_AND_JAM"); Walk(" ", unrollAndJam.v); diff --git a/flang/test/Parser/prefetch.f90 b/flang/test/Parser/prefetch.f90 new file mode 100644 index 0000000000000..1013a09c92117 --- /dev/null +++ b/flang/test/Parser/prefetch.f90 @@ -0,0 +1,80 @@ +!RUN: %flang_fc1 -fdebug-unparse-no-sema %s 2>&1 | FileCheck %s -check-prefix=UNPARSE +!RUN: %flang_fc1 -fdebug-dump-parse-tree-no-sema %s 2>&1 | FileCheck %s -check-prefix=TREE + +subroutine test_prefetch_01(a, b) + integer, intent(in) :: a + integer, intent(inout) :: b(5) + integer :: i = 2 + integer :: res + +!TREE: | | DeclarationConstruct -> SpecificationConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> Name = 'a' + +!UNPARSE: !DIR$ PREFETCH a + !dir$ prefetch a + b(1) = a + +!TREE: | | 
ExecutionPartConstruct -> ExecutableConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> Name = 'b' + +!UNPARSE: !DIR$ PREFETCH b + !dir$ prefetch b + res = sum(b) + +!TREE: | | ExecutionPartConstruct -> ExecutableConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> Name = 'a' +!TREE: | | Designator -> DataRef -> ArrayElement +!TREE: | | | DataRef -> Name = 'b' +!TREE: | | | SectionSubscript -> SubscriptTriplet +!TREE: | | | | Scalar -> Integer -> Expr -> LiteralConstant -> IntLiteralConstant = '3' +!TREE: | | | | Scalar -> Integer -> Expr -> LiteralConstant -> IntLiteralConstant = '5' + +!UNPARSE: !DIR$ PREFETCH a, b(3:5) + !dir$ prefetch a, b(3:5) + res = a + b(4) + +!TREE: | | ExecutionPartConstruct -> ExecutableConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> Name = 'res' +!TREE: | | Designator -> DataRef -> ArrayElement +!TREE: | | | DataRef -> Name = 'b' +!TREE: | | | SectionSubscript -> Integer -> Expr -> Add +!TREE: | | | | Expr -> Designator -> DataRef -> Name = 'i' +!TREE: | | | | Expr -> LiteralConstant -> IntLiteralConstant = '2' + +!UNPARSE: !DIR$ PREFETCH res, b(i+2) + !dir$ prefetch res, b(i+2) + res = res + b(i+2) +end subroutine + +subroutine test_prefetch_02(n, a) + integer, intent(in) :: n + integer, intent(in) :: a(n) + type :: t + real, allocatable :: x(:, :) + end type t + type(t) :: p + + do i = 1, n +!TREE: | | | | ExecutionPartConstruct -> ExecutableConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> ArrayElement +!TREE: | | | | | DataRef -> StructureComponent +!TREE: | | | | | | DataRef -> Name = 'p' +!TREE: | | | | | | Name = 'x' +!TREE: | | | | | SectionSubscript -> Integer -> Expr -> Designator -> DataRef -> Name = 'i' +!TREE: | | | | | SectionSubscript -> SubscriptTriplet +!TREE: | | | | Designator -> DataRef -> Name = 'a' + +!UNPARSE: !DIR$ PREFETCH p%x(i,:), a + !dir$ prefetch p%x(i, :), a + do j = 1, n +!TREE: | | | | | | ExecutionPartConstruct -> 
ExecutableConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> ArrayElement +!TREE: | | | | | | | DataRef -> StructureComponent +!TREE: | | | | | | | | DataRef -> Name = 'p' +!TREE: | | | | | | | | Name = 'x' +!TREE: | | | | | | | SectionSubscript -> Integer -> Expr -> Designator -> DataRef -> Name = 'i' +!TREE: | | | | | | | SectionSubscript -> Integer -> Expr -> Designator -> DataRef -> Name = 'j' +!TREE: | | | | | | Designator -> DataRef -> ArrayElement +!TREE: | | | | | | | DataRef -> Name = 'a' +!TREE: | | | | | | | SectionSubscript -> Integer -> Expr -> Designator -> DataRef -> Name = 'i' + +!UNPARSE: !DIR$ PREFETCH p%x(i,j), a(i) + !dir$ prefetch p%x(i, j), a(i) + p%x(i, j) = p%x(i, j) ** a(j) + end do + end do +end subroutine ``````````
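The grammar added in this diff, `"PREFETCH" >> construct(nonemptyList(indirect(designator)))`, accepts the keyword followed by one or more comma-separated designators. The following is a minimal standalone sketch of that shape only, not Flang's parser: `parsePrefetch` and `trim` are hypothetical names, each designator is treated as opaque text, and the requirement of a blank after the keyword is an assumption of the sketch. It tracks parenthesis depth so that commas inside subscripts such as `p%x(i, j)` do not split the list.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not part of Flang): strips leading and trailing blanks.
static std::string trim(const std::string &s) {
  std::size_t b = 0, e = s.size();
  while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
  while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
  return s.substr(b, e - b);
}

// Hypothetical sketch: returns the designators of a `!dir$ prefetch` line, or
// std::nullopt when the line is not a prefetch directive with a non-empty list.
std::optional<std::vector<std::string>> parsePrefetch(const std::string &line) {
  const std::string text = trim(line);
  const std::string prefix = "!dir$ prefetch";
  if (text.size() <= prefix.size()) return std::nullopt; // no designator list
  // Match the keyword case-insensitively.
  for (std::size_t i = 0; i < prefix.size(); ++i) {
    if (std::tolower(static_cast<unsigned char>(text[i])) != prefix[i])
      return std::nullopt;
  }
  // Assumption of this sketch: a blank separates the keyword from the list.
  if (!std::isspace(static_cast<unsigned char>(text[prefix.size()])))
    return std::nullopt;

  std::vector<std::string> items;
  std::string current;
  int depth = 0; // parenthesis nesting; commas split only at depth 0
  for (std::size_t i = prefix.size(); i < text.size(); ++i) {
    const char c = text[i];
    if (c == '(') {
      ++depth;
    } else if (c == ')') {
      if (--depth < 0) return std::nullopt; // unbalanced ')'
    }
    if (c == ',' && depth == 0) {
      const std::string item = trim(current);
      if (item.empty()) return std::nullopt; // empty list element
      items.push_back(item);
      current.clear();
    } else {
      current += c;
    }
  }
  if (depth != 0) return std::nullopt; // unbalanced '('
  const std::string last = trim(current);
  if (last.empty()) return std::nullopt; // trailing comma
  items.push_back(last);
  return items;
}
```

With the test inputs from `prefetch.f90`, a line such as `!dir$ prefetch p%x(i, j), a(i)` would yield two entries in this sketch, while a bare `!dir$ prefetch` is rejected, mirroring the non-empty list requirement.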
https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Tue May 13 03:42:07 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 13 May 2025 03:42:07 -0700 (PDT) Subject: [flang-commits] [flang] [FLANG][OpenMP][Taskloop] - Add testcase for reduction and in_reduction clause in taskloop construct (PR #139704) Message-ID: https://github.com/kaviya2510 created https://github.com/llvm/llvm-project/pull/139704 Added a testcase for reduction and in_reduction clause in taskloop construct. Reduction and in_reduction clauses are not supported in taskloop so below error is issued: "not yet implemented: Unhandled clause REDUCTION/IN_REDUCTION in TASKLOOP construct" >From d8a2d2c14602f00b5113b3bf103b522bf22af1ac Mon Sep 17 00:00:00 2001 From: Kaviya Rajendiran Date: Tue, 13 May 2025 16:06:49 +0530 Subject: [PATCH] [FLANG][OpenMP][Taskloop] - Add testcase for reduction and in_reduction clause in taskloop construct --- .../test/Lower/OpenMP/Todo/taskloop-inreduction.f90 | 13 +++++++++++++ flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 | 13 +++++++++++++ 2 files changed, 26 insertions(+) create mode 100644 flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 create mode 100644 flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 diff --git a/flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 b/flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 new file mode 100644 index 0000000000000..8acc399a92abe --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 @@ -0,0 +1,13 @@ +! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! 
CHECK: not yet implemented: Unhandled clause IN_REDUCTION in TASKLOOP construct +subroutine omp_taskloop_inreduction() + integer x + x = 0 + !$omp taskloop in_reduction(+:x) + do i = 1, 100 + x = x + 1 + end do + !$omp end taskloop +end subroutine omp_taskloop_inreduction diff --git a/flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 b/flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 new file mode 100644 index 0000000000000..0c16bd227257f --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 @@ -0,0 +1,13 @@ +! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK: not yet implemented: Unhandled clause REDUCTION in TASKLOOP construct +subroutine omp_taskloop_reduction() + integer x + x = 0 + !$omp taskloop reduction(+:x) + do i = 1, 100 + x = x + 1 + end do + !$omp end taskloop +end subroutine omp_taskloop_reduction From flang-commits at lists.llvm.org Tue May 13 03:42:37 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 03:42:37 -0700 (PDT) Subject: [flang-commits] [flang] [FLANG][OpenMP][Taskloop] - Add testcase for reduction and in_reduction clause in taskloop construct (PR #139704) In-Reply-To: Message-ID: <6823221d.170a0220.3144c7.27fe@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Kaviya Rajendiran (kaviya2510)
Changes Added a testcase for reduction and in_reduction clause in taskloop construct. Reduction and in_reduction clauses are not supported in taskloop so below error is issued: "not yet implemented: Unhandled clause REDUCTION/IN_REDUCTION in TASKLOOP construct" --- Full diff: https://github.com/llvm/llvm-project/pull/139704.diff 2 Files Affected: - (added) flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 (+13) - (added) flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 (+13) ``````````diff diff --git a/flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 b/flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 new file mode 100644 index 0000000000000..8acc399a92abe --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 @@ -0,0 +1,13 @@ +! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK: not yet implemented: Unhandled clause IN_REDUCTION in TASKLOOP construct +subroutine omp_taskloop_inreduction() + integer x + x = 0 + !$omp taskloop in_reduction(+:x) + do i = 1, 100 + x = x + 1 + end do + !$omp end taskloop +end subroutine omp_taskloop_inreduction diff --git a/flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 b/flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 new file mode 100644 index 0000000000000..0c16bd227257f --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 @@ -0,0 +1,13 @@ +! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK: not yet implemented: Unhandled clause REDUCTION in TASKLOOP construct +subroutine omp_taskloop_reduction() + integer x + x = 0 + !$omp taskloop reduction(+:x) + do i = 1, 100 + x = x + 1 + end do + !$omp end taskloop +end subroutine omp_taskloop_reduction ``````````
https://github.com/llvm/llvm-project/pull/139704 From flang-commits at lists.llvm.org Tue May 13 03:48:44 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Tue, 13 May 2025 03:48:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <6823238c.a70a0220.1fe7d.f1a1@mx.google.com> https://github.com/eugeneepshteyn commented: LGTM https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Tue May 13 04:22:37 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Tue, 13 May 2025 04:22:37 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Turn on alias analysis for locally allocated objects (PR #139682) In-Reply-To: Message-ID: <68232b7d.170a0220.9828a.6867@mx.google.com> mrkajetanp wrote: FWIW I tried it out and this improves the performance of the atmosphere kernel from https://github.com/E3SM-Project/codesign-kernels by well over 30%. https://github.com/llvm/llvm-project/pull/139682 From flang-commits at lists.llvm.org Tue May 13 04:30:11 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 04:30:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <68232d43.170a0220.e1ca8.eb39@mx.google.com> https://github.com/skatrak edited https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Tue May 13 04:30:11 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 04:30:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <68232d43.170a0220.18e312.f3d3@mx.google.com> https://github.com/skatrak commented: This seems fine to me, I just have a couple of nits and potentially basic questions. Thanks again! 
https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Tue May 13 04:30:11 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 04:30:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <68232d43.620a0220.2964ee.1a81@mx.google.com> ================ @@ -510,8 +542,73 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, subTypeAttr, entities, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); + /* When we process the DeclareOp inside the OpenMP target region, all the ---------------- skatrak wrote: Nit: The general convention for comments is to use `//`: https://llvm.org/docs/CodingStandards.html#comment-formatting. https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Tue May 13 04:30:12 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 04:30:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <68232d44.a70a0220.13a601.edff@mx.google.com> ================ ---------------- skatrak wrote: There is an early return from `AddDebugInfoPass::handleFuncOp()` if `debugLevel == mlir::LLVM::DIEmissionKind::LineTablesOnly` that skips all the added target-specific handling. Should we be doing something about target regions there as well? https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Tue May 13 04:30:12 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 04:30:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. 
(PR #138039) In-Reply-To: Message-ID: <68232d44.050a0220.20e2b7.ef85@mx.google.com> ================ @@ -103,6 +104,37 @@ bool debugInfoIsAlreadySet(mlir::Location loc) { return false; } +// Generates the name for the artificial DISubprogram that we are going to +// generate for omp::TargetOp. Its logic is borrowed from +// getTargetEntryUniqueInfo and +// TargetRegionEntryInfo::getTargetRegionEntryFnName to generate the same name. +// But even if there was a slight mismatch, it is not a problem because this +// name is artifical and not important to debug experience. ---------------- skatrak wrote: ```suggestion // name is artificial and not important to debug experience. ``` https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Tue May 13 04:30:12 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 04:30:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <68232d44.630a0220.18e240.7f32@mx.google.com> ================ @@ -510,8 +542,73 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, subTypeAttr, entities, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); + /* When we process the DeclareOp inside the OpenMP target region, all the + variables get the DISubprogram of the parent function of the target op as + the scope. In the codegen (to llvm ir), OpenMP target op results in the + creation of a separate function. As the variables in the debug info have + the DISubprogram of the parent function as the scope, the variables + need to be updated at codegen time to avoid verification failures. + + This updating after the fact becomes more and more difficult when types + are dependent on local variables like in the case of variable size arrays + or string. We not only have to generate new variables but also new types. 
+ We can avoid this problem by generating a DISubprogramAttr here for the + target op and make sure that all the variables inside the target region + get the correct scope in the first place. */ + funcOp.walk([&](mlir::omp::TargetOp targetOp) { + unsigned line = getLineFromLoc(targetOp.getLoc()); + mlir::StringAttr Name = + getTargetFunctionName(context, targetOp.getLoc(), funcOp.getName()); + mlir::LLVM::DISubprogramFlags flags = + mlir::LLVM::DISubprogramFlags::Definition | + mlir::LLVM::DISubprogramFlags::LocalToUnit; + if (isOptimized) + flags = flags | mlir::LLVM::DISubprogramFlags::Optimized; + + mlir::DistinctAttr recId = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + mlir::DistinctAttr Id = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + llvm::SmallVector types; + types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); + mlir::LLVM::DISubroutineTypeAttr spTy = + mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types); + auto spAttr = mlir::LLVM::DISubprogramAttr::get( + context, recId, /*isRecSelf=*/true, Id, compilationUnit, Scope, Name, + Name, funcFileAttr, line, line, flags, spTy, /*retainedNodes=*/{}, + /*annotations=*/{}); + + // Make sure that information about the imported modules in copied from the ---------------- skatrak wrote: ```suggestion // Make sure that information about the imported modules is copied from the ``` https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Tue May 13 04:30:12 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 04:30:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. 
(PR #138039) In-Reply-To: Message-ID: <68232d44.050a0220.18a33d.1fb2@mx.google.com> ================ @@ -510,8 +542,73 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, subTypeAttr, entities, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); + /* When we process the DeclareOp inside the OpenMP target region, all the + variables get the DISubprogram of the parent function of the target op as + the scope. In the codegen (to llvm ir), OpenMP target op results in the + creation of a separate function. As the variables in the debug info have + the DISubprogram of the parent function as the scope, the variables + need to be updated at codegen time to avoid verification failures. + + This updating after the fact becomes more and more difficult when types + are dependent on local variables like in the case of variable size arrays + or string. We not only have to generate new variables but also new types. + We can avoid this problem by generating a DISubprogramAttr here for the + target op and make sure that all the variables inside the target region + get the correct scope in the first place. 
*/ + funcOp.walk([&](mlir::omp::TargetOp targetOp) { + unsigned line = getLineFromLoc(targetOp.getLoc()); + mlir::StringAttr Name = + getTargetFunctionName(context, targetOp.getLoc(), funcOp.getName()); + mlir::LLVM::DISubprogramFlags flags = + mlir::LLVM::DISubprogramFlags::Definition | + mlir::LLVM::DISubprogramFlags::LocalToUnit; + if (isOptimized) + flags = flags | mlir::LLVM::DISubprogramFlags::Optimized; + + mlir::DistinctAttr recId = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + mlir::DistinctAttr Id = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + llvm::SmallVector types; + types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); + mlir::LLVM::DISubroutineTypeAttr spTy = + mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types); + auto spAttr = mlir::LLVM::DISubprogramAttr::get( + context, recId, /*isRecSelf=*/true, Id, compilationUnit, Scope, Name, + Name, funcFileAttr, line, line, flags, spTy, /*retainedNodes=*/{}, + /*annotations=*/{}); + + // Make sure that information about the imported modules in copied from the + // parent function. + llvm::SmallVector OpEntities; + for (mlir::LLVM::DINodeAttr N : entities) { + if (auto entity = mlir::dyn_cast(N)) { + auto importedEntity = mlir::LLVM::DIImportedEntityAttr::get( + context, llvm::dwarf::DW_TAG_imported_module, spAttr, + entity.getEntity(), fileAttr, /*line=*/1, /*name=*/nullptr, + /*elements*/ {}); + OpEntities.push_back(importedEntity); + } + } + + Id = mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); ---------------- skatrak wrote: I wonder whether it's actually necessary to create new attributes to hold the same constant value. I might be wrong on that, just thinking that MLIR attributes are already uniqued at the context level, so my impression is that this would just end up pointing to the same thing in the end. But I may have misunderstood that, so not a request to change anything. 
https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Tue May 13 04:30:12 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 04:30:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <68232d44.170a0220.37f32a.fae4@mx.google.com> ================ @@ -510,8 +542,73 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, subTypeAttr, entities, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); + /* When we process the DeclareOp inside the OpenMP target region, all the + variables get the DISubprogram of the parent function of the target op as + the scope. In the codegen (to llvm ir), OpenMP target op results in the + creation of a separate function. As the variables in the debug info have + the DISubprogram of the parent function as the scope, the variables + need to be updated at codegen time to avoid verification failures. + + This updating after the fact becomes more and more difficult when types + are dependent on local variables like in the case of variable size arrays + or string. We not only have to generate new variables but also new types. + We can avoid this problem by generating a DISubprogramAttr here for the + target op and make sure that all the variables inside the target region + get the correct scope in the first place. */ + funcOp.walk([&](mlir::omp::TargetOp targetOp) { + unsigned line = getLineFromLoc(targetOp.getLoc()); + mlir::StringAttr Name = ---------------- skatrak wrote: Nit: Variable names here should start with lowercase. There are a few cases of this not being followed below. 
https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Tue May 13 04:30:12 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 04:30:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <68232d44.170a0220.2aa714.ea51@mx.google.com> ================ @@ -510,8 +542,73 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, subTypeAttr, entities, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); + /* When we process the DeclareOp inside the OpenMP target region, all the + variables get the DISubprogram of the parent function of the target op as + the scope. In the codegen (to llvm ir), OpenMP target op results in the + creation of a separate function. As the variables in the debug info have + the DISubprogram of the parent function as the scope, the variables + need to be updated at codegen time to avoid verification failures. + + This updating after the fact becomes more and more difficult when types + are dependent on local variables like in the case of variable size arrays + or string. We not only have to generate new variables but also new types. + We can avoid this problem by generating a DISubprogramAttr here for the + target op and make sure that all the variables inside the target region + get the correct scope in the first place. 
*/ + funcOp.walk([&](mlir::omp::TargetOp targetOp) { + unsigned line = getLineFromLoc(targetOp.getLoc()); + mlir::StringAttr Name = + getTargetFunctionName(context, targetOp.getLoc(), funcOp.getName()); + mlir::LLVM::DISubprogramFlags flags = + mlir::LLVM::DISubprogramFlags::Definition | + mlir::LLVM::DISubprogramFlags::LocalToUnit; + if (isOptimized) + flags = flags | mlir::LLVM::DISubprogramFlags::Optimized; + + mlir::DistinctAttr recId = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + mlir::DistinctAttr Id = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + llvm::SmallVector types; + types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); ---------------- skatrak wrote: I can see above that for a regular function, the list of types includes one element per result (or a `FINullTypeAttr`, if there are none, like here), but it also includes one element per function argument. Shouldn't we be adding them here as well, based on whichever map-like arguments the `omp.target` op has? https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Tue May 13 05:48:13 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 05:48:13 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723) In-Reply-To: Message-ID: <68233f8d.170a0220.27e633.f146@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command: ``````````bash git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- flang/lib/Lower/OpenACC.cpp mlir/include/mlir/Dialect/OpenACC/OpenACC.h mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp ``````````
View the diff from clang-format here. ``````````diff diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h index f667a6786..30271d059 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h @@ -119,8 +119,7 @@ mlir::SmallVector getBounds(mlir::Operation *accDataClauseOp); /// Used to obtain `async` operands from an acc operation. /// Returns an empty vector if there are no such operands. -mlir::SmallVector -getAsyncOperands(mlir::Operation *accOp); +mlir::SmallVector getAsyncOperands(mlir::Operation *accOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to /// an acc operation, that correspond to the device types associated with the ``````````
https://github.com/llvm/llvm-project/pull/139723 From flang-commits at lists.llvm.org Tue May 13 05:49:52 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 05:49:52 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723) In-Reply-To: Message-ID: <68233ff0.170a0220.1761f3.f0fa@mx.google.com> https://github.com/khaki3 updated https://github.com/llvm/llvm-project/pull/139723
From flang-commits at lists.llvm.org Tue May 13 06:30:16 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 13 May 2025 06:30:16 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Turn on alias analysis for locally allocated objects (PR #139682) In-Reply-To: Message-ID: <68234968.050a0220.2ad712.0bc2@mx.google.com> https://github.com/tblah approved this pull request. This LGTM. I tried SPEC 2017 ref and speed, plus some HPC applications, and didn't see any incorrect results. Thank you for fixing this longstanding bug, Dominik. Please wait for approval from Slava before merging. https://github.com/llvm/llvm-project/pull/139682 From flang-commits at lists.llvm.org Tue May 13 06:30:39 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 13 May 2025 06:30:39 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Turn on alias analysis for locally allocated objects (PR #139682) In-Reply-To: Message-ID: <6823497f.620a0220.3bdce.1ea0@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/139682 From flang-commits at lists.llvm.org Tue May 13 06:31:52 2025 From: flang-commits at lists.llvm.org (Jean-Didier PAILLEUX via flang-commits) Date: Tue, 13 May 2025 06:31:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ [NO]INLINE and FORCEINLINE directives (PR #134350) In-Reply-To: Message-ID: <682349c8.170a0220.1b64f.24e0@mx.google.com> JDPailleux wrote: I'm curious to know why it's better to resolve the symbol for the function call and attach the attribute to that function operation, rather than attach the attributes directly to the function call operation, when the FIR -> MLIR (LLVM Dialect) -> LLVM IR conversion is done correctly?
https://github.com/llvm/llvm-project/pull/134350 From flang-commits at lists.llvm.org Tue May 13 06:52:33 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Tue, 13 May 2025 06:52:33 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Turn on alias analysis for locally allocated objects (PR #139682) In-Reply-To: Message-ID: <68234ea1.050a0220.1cdd73.2a30@mx.google.com> https://github.com/mrkajetanp approved this pull request. LGTM, nice work. https://github.com/llvm/llvm-project/pull/139682 From flang-commits at lists.llvm.org Tue May 13 07:02:56 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Tue, 13 May 2025 07:02:56 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add missing dependent dialects to MLIR passes (PR #139260) In-Reply-To: Message-ID: <68235110.a70a0220.361314.1c32@mx.google.com> https://github.com/clementval approved this pull request. https://github.com/llvm/llvm-project/pull/139260 From flang-commits at lists.llvm.org Tue May 13 07:37:33 2025 From: flang-commits at lists.llvm.org (Mats Petersson via flang-commits) Date: Tue, 13 May 2025 07:37:33 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP]Replace assert with if-condition (PR #139559) In-Reply-To: Message-ID: <6823592d.170a0220.206a00.5255@mx.google.com> https://github.com/Leporacanthicus closed https://github.com/llvm/llvm-project/pull/139559 From flang-commits at lists.llvm.org Tue May 13 07:47:52 2025 From: flang-commits at lists.llvm.org (Michael Kruse via flang-commits) Date: Tue, 13 May 2025 07:47:52 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828) In-Reply-To: Message-ID: <68235b98.630a0220.c8baa.e7b8@mx.google.com> https://github.com/Meinersbur updated https://github.com/llvm/llvm-project/pull/137828 Rate limit · GitHub

From flang-commits at lists.llvm.org Tue May 13 07:48:09 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 07:48:09 -0700 (PDT) Subject: [flang-commits] [flang] 2ca2e1c - [flang] Tune warning about incompatible implicit interfaces (#136788) Message-ID: <68235ba9.050a0220.9368e.8796@mx.google.com> Author: Peter Klausler Date: 2025-05-13T07:48:05-07:00 New Revision: 2ca2e1c9d5e353064586ccc314377dc4ef1bf25d URL: https://github.com/llvm/llvm-project/commit/2ca2e1c9d5e353064586ccc314377dc4ef1bf25d DIFF: https://github.com/llvm/llvm-project/commit/2ca2e1c9d5e353064586ccc314377dc4ef1bf25d.diff LOG: [flang] Tune warning about incompatible implicit interfaces (#136788) The compiler was emitting a warning about incompatible shapes being used for two calls to the same procedure with an implicit interface when one passed a whole array and the other passed a scalar. When the scalar is a whole element of a contiguous array, however, we must allow for storage association and not flag it as being a problem. 
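The rule described in the log above can be sketched roughly as follows. This is a minimal C++ model under stated assumptions, not the flang implementation: `ToyActual`, `WholeArray`, `ArrayElement`, `Scalar`, and `CompatibleReferences` are hypothetical names introduced only for illustration, the shape comparison is reduced to a rank comparison, and the default-kind character case and `CanBeSequenceAssociated()` handled by the real patch are left out.

```cpp
#include <string>
#include <utility>

// Hypothetical, simplified model of one actual argument passed to a
// procedure that only has an implicit interface (not a flang class).
struct ToyActual {
  std::string text;                  // source form, for display only
  int rank;                          // 0 for a scalar
  bool possibleSequenceAssociation;  // may storage association apply?
};

// A whole array or array constructor: the flag is set when rank > 0.
ToyActual WholeArray(std::string text, int rank) {
  return {std::move(text), rank, rank > 0};
}

// An array element: the flag is set only when the base array is known
// to be contiguous, so the element can stand for the rest of the array.
ToyActual ArrayElement(std::string text, bool baseIsContiguous) {
  return {std::move(text), 0, baseIsContiguous};
}

// Any other scalar (a scalar variable or a literal): the flag is never set.
ToyActual Scalar(std::string text) { return {std::move(text), 0, false}; }

// Returns true when two references to the same implicit-interface
// procedure should not trigger the "incompatible shapes" warning.
bool CompatibleReferences(const ToyActual &x, const ToyActual &y) {
  if (x.possibleSequenceAssociation && y.possibleSequenceAssociation) {
    return true;  // storage association can reconcile the two shapes
  }
  return x.rank == y.rank;  // assumption: compare ranks only, not extents
}
```

Measured against a first call `to(a)` with `real a(10)`, this model accepts `a(1)` and `[1., 2.]` and rejects `b(1)` (element of a possibly discontiguous assumed-shape array) and the scalar `c`, which matches the expectations annotated in `call43.f90`.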
Added: flang/test/Semantics/call43.f90 Modified: flang/include/flang/Evaluate/characteristics.h flang/include/flang/Evaluate/tools.h flang/lib/Evaluate/characteristics.cpp flang/lib/Semantics/check-call.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Evaluate/characteristics.h b/flang/include/flang/Evaluate/characteristics.h index 6d29b57889681..d566c34ff71e8 100644 --- a/flang/include/flang/Evaluate/characteristics.h +++ b/flang/include/flang/Evaluate/characteristics.h @@ -174,6 +174,14 @@ class TypeAndShape { } const std::optional &shape() const { return shape_; } const Attrs &attrs() const { return attrs_; } + Attrs &attrs() { return attrs_; } + bool isPossibleSequenceAssociation() const { + return isPossibleSequenceAssociation_; + } + TypeAndShape &set_isPossibleSequenceAssociation(bool yes) { + isPossibleSequenceAssociation_ = yes; + return *this; + } int corank() const { return corank_; } void set_corank(int n) { corank_ = n; } @@ -209,11 +217,11 @@ class TypeAndShape { void AcquireLEN(); void AcquireLEN(const semantics::Symbol &); -protected: DynamicType type_; std::optional> LEN_; std::optional shape_; Attrs attrs_; + bool isPossibleSequenceAssociation_{false}; int corank_{0}; }; diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h index 922af4190822d..0318a468f3811 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -396,7 +396,7 @@ std::optional ExtractDataRef(const ActualArgument &, // Predicate: is an expression is an array element reference? 
template -bool IsArrayElement(const Expr &expr, bool intoSubstring = true, +const Symbol *IsArrayElement(const Expr &expr, bool intoSubstring = true, bool skipComponents = false) { if (auto dataRef{ExtractDataRef(expr, intoSubstring)}) { for (const DataRef *ref{&*dataRef}; ref;) { @@ -404,12 +404,14 @@ bool IsArrayElement(const Expr &expr, bool intoSubstring = true, ref = skipComponents ? &component->base() : nullptr; } else if (const auto *coarrayRef{std::get_if(&ref->u)}) { ref = &coarrayRef->base(); + } else if (const auto *arrayRef{std::get_if(&ref->u)}) { + return &arrayRef->GetLastSymbol(); } else { - return std::holds_alternative(ref->u); + break; } } } - return false; + return nullptr; } template diff --git a/flang/lib/Evaluate/characteristics.cpp b/flang/lib/Evaluate/characteristics.cpp index 63040feae43fc..89547733ea33c 100644 --- a/flang/lib/Evaluate/characteristics.cpp +++ b/flang/lib/Evaluate/characteristics.cpp @@ -274,6 +274,9 @@ llvm::raw_ostream &TypeAndShape::Dump(llvm::raw_ostream &o) const { } o << ')'; } + if (isPossibleSequenceAssociation_) { + o << " isPossibleSequenceAssociation"; + } return o; } @@ -282,17 +285,26 @@ bool DummyDataObject::operator==(const DummyDataObject &that) const { coshape == that.coshape && cudaDataAttr == that.cudaDataAttr; } +static bool IsOkWithSequenceAssociation( + const TypeAndShape &t1, const TypeAndShape &t2) { + return t1.isPossibleSequenceAssociation() && + (t2.isPossibleSequenceAssociation() || t2.CanBeSequenceAssociated()); +} + bool DummyDataObject::IsCompatibleWith(const DummyDataObject &actual, std::string *whyNot, std::optional *warning) const { - bool possibleWarning{false}; - if (!ShapesAreCompatible( - type.shape(), actual.type.shape(), &possibleWarning)) { - if (whyNot) { - *whyNot = "incompatible dummy data object shapes"; + if (!IsOkWithSequenceAssociation(type, actual.type) && + !IsOkWithSequenceAssociation(actual.type, type)) { + bool possibleWarning{false}; + if (!ShapesAreCompatible( + 
type.shape(), actual.type.shape(), &possibleWarning)) { + if (whyNot) { + *whyNot = "incompatible dummy data object shapes"; + } + return false; + } else if (warning && possibleWarning) { + *warning = "distinct dummy data object shapes"; } - return false; - } else if (warning && possibleWarning) { - *warning = "distinct dummy data object shapes"; } // Treat deduced dummy character type as if it were assumed-length character // to avoid useless "implicit interfaces have distinct type" warnings from @@ -343,10 +355,29 @@ bool DummyDataObject::IsCompatibleWith(const DummyDataObject &actual, } } } - if (!IdenticalSignificantAttrs(attrs, actual.attrs) || + if (!attrs.test(Attr::DeducedFromActual) && + !actual.attrs.test(Attr::DeducedFromActual) && type.attrs() != actual.type.attrs()) { + if (whyNot) { + *whyNot = "incompatible dummy data object shape attributes"; + auto differences{type.attrs() ^ actual.type.attrs()}; + auto sep{": "s}; + differences.IterateOverMembers([&](TypeAndShape::Attr x) { + *whyNot += sep + std::string{TypeAndShape::EnumToString(x)}; + sep = ", "; + }); + } + return false; + } + if (!IdenticalSignificantAttrs(attrs, actual.attrs)) { if (whyNot) { *whyNot = "incompatible dummy data object attributes"; + auto differences{attrs ^ actual.attrs}; + auto sep{": "s}; + differences.IterateOverMembers([&](DummyDataObject::Attr x) { + *whyNot += sep + std::string{EnumToString(x)}; + sep = ", "; + }); } return false; } @@ -900,6 +931,15 @@ std::optional DummyArgument::FromActual(std::string &&name, type->set_type(DynamicType{ type->type().GetDerivedTypeSpec(), /*poly=*/false}); } + if (type->type().category() == TypeCategory::Character && + type->type().kind() == 1) { + type->set_isPossibleSequenceAssociation(true); + } else if (const Symbol * array{IsArrayElement(expr)}) { + type->set_isPossibleSequenceAssociation( + IsContiguous(*array, context).value_or(false)); + } else { + type->set_isPossibleSequenceAssociation(expr.Rank() > 0); + } 
DummyDataObject obj{std::move(*type)}; obj.attrs.set(DummyDataObject::Attr::DeducedFromActual); return std::make_optional( diff --git a/flang/lib/Semantics/check-call.cpp b/flang/lib/Semantics/check-call.cpp index 3cf95fdab44f5..dfc2ddbacf071 100644 --- a/flang/lib/Semantics/check-call.cpp +++ b/flang/lib/Semantics/check-call.cpp @@ -561,7 +561,7 @@ static void CheckExplicitDataArg(const characteristics::DummyDataObject &dummy, "Coindexed scalar actual argument must be associated with a scalar %s"_err_en_US, dummyName); } - bool actualIsArrayElement{IsArrayElement(actual)}; + bool actualIsArrayElement{IsArrayElement(actual) != nullptr}; bool actualIsCKindCharacter{ actualType.type().category() == TypeCategory::Character && actualType.type().kind() == 1}; diff --git a/flang/test/Semantics/call43.f90 b/flang/test/Semantics/call43.f90 new file mode 100644 index 0000000000000..d8cc543a4838a --- /dev/null +++ b/flang/test/Semantics/call43.f90 @@ -0,0 +1,17 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 -pedantic -Werror +subroutine from(a, b, c, d) + real a(10), b(:), c + real, contiguous :: d(:) + call to(a) + call to(a(1)) ! ok + call to(b) ! ok, passed via temp + !WARNING: Reference to the procedure 'to' has an implicit interface that is distinct from another reference: incompatible dummy argument #1: incompatible dummy data object shapes + call to(b(1)) + !WARNING: Reference to the procedure 'to' has an implicit interface that is distinct from another reference: incompatible dummy argument #1: incompatible dummy data object shapes + call to(c) + !WARNING: Reference to the procedure 'to' has an implicit interface that is distinct from another reference: incompatible dummy argument #1: incompatible dummy data object shapes + call to(1.) + call to([1., 2.]) ! ok + call to(d) ! ok + call to(d(1)) ! 
ok +end From flang-commits at lists.llvm.org Tue May 13 07:48:12 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 13 May 2025 07:48:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Tune warning about incompatible implicit interfaces (PR #136788) In-Reply-To: Message-ID: <68235bac.170a0220.274288.6c53@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/136788 From flang-commits at lists.llvm.org Tue May 13 07:48:38 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 13 May 2025 07:48:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Acknowledge non-enforcement of C7108 (PR #139169) In-Reply-To: Message-ID: <68235bc6.050a0220.270b55.98d2@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139169 From flang-commits at lists.llvm.org Tue May 13 07:48:58 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 07:48:58 -0700 (PDT) Subject: [flang-commits] [flang] 53f0367 - [flang] Fix spurious error on defined assignment in PURE (#139186) Message-ID: <68235bda.170a0220.4ef56.5d7a@mx.google.com> Author: Peter Klausler Date: 2025-05-13T07:48:54-07:00 New Revision: 53f0367ab0fa7e958f42fc07ceb9c38b9b9c74f2 URL: https://github.com/llvm/llvm-project/commit/53f0367ab0fa7e958f42fc07ceb9c38b9b9c74f2 DIFF: https://github.com/llvm/llvm-project/commit/53f0367ab0fa7e958f42fc07ceb9c38b9b9c74f2.diff LOG: [flang] Fix spurious error on defined assignment in PURE (#139186) An assignment to a whole polymorphic object in a PURE subprogram that is implemented by means of a defined assignment procedure shouldn't be subjected to the same definability checks as it would be for an intrinsic assignment (which would also require it to be allocatable). Fixes https://github.com/llvm/llvm-project/issues/139129. 
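The distinction drawn in the log above (intrinsic versus defined assignment to a whole polymorphic object in a PURE subprogram) can be sketched roughly as follows. This is a simplified C++ model under stated assumptions, not the flang code: `ToyFlags`, `ToyObject`, `FlagsForAssignment`, and `WhyNotDefinableInPure` are hypothetical names, the left-hand side is assumed to be a whole object, and only the polymorphism check is modeled (FINAL procedures, components, and coarrays are ignored).

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for two of the DefinabilityFlag values.
struct ToyFlags {
  bool polymorphicOkInPure{false};
  bool potentialDeallocation{false};
};

// Hypothetical description of the object being defined.
struct ToyObject {
  std::string name;
  bool isPolymorphic;
  bool isAllocatable;
};

// Flags for "lhs = ...", with lhs assumed to be a whole object. Only an
// intrinsic assignment to an allocatable can deallocate the left-hand
// side; a defined assignment is just a call to a user procedure.
ToyFlags FlagsForAssignment(const ToyObject &lhs, bool isDefinedAssignment) {
  ToyFlags flags;
  if (!isDefinedAssignment && lhs.isAllocatable) {
    flags.potentialDeallocation = true;
  }
  return flags;
}

// Returns a reason when the definition must be rejected, else nullopt.
std::optional<std::string> WhyNotDefinableInPure(
    const ToyObject &obj, const ToyFlags &flags, bool inPure) {
  if (inPure && !flags.polymorphicOkInPure && flags.potentialDeallocation &&
      obj.isPolymorphic) {
    return "'" + obj.name +
        "' is a whole polymorphic object in a pure subprogram";
  }
  return std::nullopt;
}
```

With this model, the `x = y` statement from `bug139129.f90` (a defined assignment to a non-allocatable `class(t)` dummy) yields no complaint, while an intrinsic assignment to a polymorphic allocatable inside a pure subprogram is still reported.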
Added: flang/test/Semantics/bug139129.f90 Modified: flang/include/flang/Evaluate/tools.h flang/lib/Evaluate/tools.cpp flang/lib/Semantics/assignment.cpp flang/lib/Semantics/check-deallocate.cpp flang/lib/Semantics/check-declarations.cpp flang/lib/Semantics/definable.cpp flang/lib/Semantics/definable.h flang/lib/Semantics/expression.cpp flang/test/Semantics/assign11.f90 flang/test/Semantics/call28.f90 flang/test/Semantics/deallocate07.f90 flang/test/Semantics/declarations05.f90 Removed: ################################################################################ diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h index 0318a468f3811..7f2e91ae128bd 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -504,42 +504,31 @@ template std::optional ExtractSubstring(const A &x) { // If an expression is simply a whole symbol data designator, // extract and return that symbol, else null. +const Symbol *UnwrapWholeSymbolDataRef(const DataRef &); +const Symbol *UnwrapWholeSymbolDataRef(const std::optional &); template const Symbol *UnwrapWholeSymbolDataRef(const A &x) { - if (auto dataRef{ExtractDataRef(x)}) { - if (const SymbolRef * p{std::get_if(&dataRef->u)}) { - return &p->get(); - } - } - return nullptr; + return UnwrapWholeSymbolDataRef(ExtractDataRef(x)); } // If an expression is a whole symbol or a whole component desginator, // extract and return that symbol, else null. 
+const Symbol *UnwrapWholeSymbolOrComponentDataRef(const DataRef &); +const Symbol *UnwrapWholeSymbolOrComponentDataRef( + const std::optional &); template const Symbol *UnwrapWholeSymbolOrComponentDataRef(const A &x) { - if (auto dataRef{ExtractDataRef(x)}) { - if (const SymbolRef * p{std::get_if(&dataRef->u)}) { - return &p->get(); - } else if (const Component * c{std::get_if(&dataRef->u)}) { - if (c->base().Rank() == 0) { - return &c->GetLastSymbol(); - } - } - } - return nullptr; + return UnwrapWholeSymbolOrComponentDataRef(ExtractDataRef(x)); } // If an expression is a whole symbol or a whole component designator, // potentially followed by an image selector, extract and return that symbol, // else null. const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const DataRef &); +const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef( + const std::optional &); template const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const A &x) { - if (auto dataRef{ExtractDataRef(x)}) { - return UnwrapWholeSymbolOrComponentOrCoarrayRef(*dataRef); - } else { - return nullptr; - } + return UnwrapWholeSymbolOrComponentOrCoarrayRef(ExtractDataRef(x)); } // GetFirstSymbol(A%B%C[I]%D) -> A diff --git a/flang/lib/Evaluate/tools.cpp b/flang/lib/Evaluate/tools.cpp index 7ce009c1d0b53..c70915cfa6150 100644 --- a/flang/lib/Evaluate/tools.cpp +++ b/flang/lib/Evaluate/tools.cpp @@ -1318,17 +1318,39 @@ std::optional CheckProcCompatibility(bool isCall, return msg; } +const Symbol *UnwrapWholeSymbolDataRef(const DataRef &dataRef) { + const SymbolRef *p{std::get_if(&dataRef.u)}; + return p ? &p->get() : nullptr; +} + +const Symbol *UnwrapWholeSymbolDataRef(const std::optional &dataRef) { + return dataRef ? UnwrapWholeSymbolDataRef(*dataRef) : nullptr; +} + +const Symbol *UnwrapWholeSymbolOrComponentDataRef(const DataRef &dataRef) { + if (const Component * c{std::get_if(&dataRef.u)}) { + return c->base().Rank() == 0 ? 
&c->GetLastSymbol() : nullptr; + } else { + return UnwrapWholeSymbolDataRef(dataRef); + } +} + +const Symbol *UnwrapWholeSymbolOrComponentDataRef( + const std::optional &dataRef) { + return dataRef ? UnwrapWholeSymbolOrComponentDataRef(*dataRef) : nullptr; +} + const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef(const DataRef &dataRef) { - if (const SymbolRef * p{std::get_if(&dataRef.u)}) { - return &p->get(); - } else if (const Component * c{std::get_if(&dataRef.u)}) { - if (c->base().Rank() == 0) { - return &c->GetLastSymbol(); - } - } else if (const CoarrayRef * c{std::get_if(&dataRef.u)}) { + if (const CoarrayRef * c{std::get_if(&dataRef.u)}) { return UnwrapWholeSymbolOrComponentOrCoarrayRef(c->base()); + } else { + return UnwrapWholeSymbolOrComponentDataRef(dataRef); } - return nullptr; +} + +const Symbol *UnwrapWholeSymbolOrComponentOrCoarrayRef( + const std::optional &dataRef) { + return dataRef ? UnwrapWholeSymbolOrComponentOrCoarrayRef(*dataRef) : nullptr; } // GetLastPointerSymbol() diff --git a/flang/lib/Semantics/assignment.cpp b/flang/lib/Semantics/assignment.cpp index 935f5a03bdb6a..6e55d0210ee0e 100644 --- a/flang/lib/Semantics/assignment.cpp +++ b/flang/lib/Semantics/assignment.cpp @@ -72,6 +72,11 @@ void AssignmentContext::Analyze(const parser::AssignmentStmt &stmt) { std::holds_alternative(assignment->u)}; if (isDefinedAssignment) { flags.set(DefinabilityFlag::AllowEventLockOrNotifyType); + } else if (const Symbol * + whole{evaluate::UnwrapWholeSymbolOrComponentDataRef(lhs)}) { + if (IsAllocatable(whole->GetUltimate())) { + flags.set(DefinabilityFlag::PotentialDeallocation); + } } if (auto whyNot{WhyNotDefinable(lhsLoc, scope, flags, lhs)}) { if (whyNot->IsFatal()) { diff --git a/flang/lib/Semantics/check-deallocate.cpp b/flang/lib/Semantics/check-deallocate.cpp index 3bcd4d87b0906..c45b58586853b 100644 --- a/flang/lib/Semantics/check-deallocate.cpp +++ b/flang/lib/Semantics/check-deallocate.cpp @@ -36,7 +36,8 @@ void 
DeallocateChecker::Leave(const parser::DeallocateStmt &deallocateStmt) { } else if (auto whyNot{WhyNotDefinable(name.source, context_.FindScope(name.source), {DefinabilityFlag::PointerDefinition, - DefinabilityFlag::AcceptAllocatable}, + DefinabilityFlag::AcceptAllocatable, + DefinabilityFlag::PotentialDeallocation}, *symbol)}) { // Catch problems with non-definability of the // pointer/allocatable @@ -74,7 +75,8 @@ void DeallocateChecker::Leave(const parser::DeallocateStmt &deallocateStmt) { } else if (auto whyNot{WhyNotDefinable(source, context_.FindScope(source), {DefinabilityFlag::PointerDefinition, - DefinabilityFlag::AcceptAllocatable}, + DefinabilityFlag::AcceptAllocatable, + DefinabilityFlag::PotentialDeallocation}, *expr)}) { context_ .Say(source, diff --git a/flang/lib/Semantics/check-declarations.cpp b/flang/lib/Semantics/check-declarations.cpp index a86f78154b859..1d09dea06db54 100644 --- a/flang/lib/Semantics/check-declarations.cpp +++ b/flang/lib/Semantics/check-declarations.cpp @@ -949,8 +949,8 @@ void CheckHelper::CheckObjectEntity( !IsFunctionResult(symbol) /*ditto*/) { // Check automatically deallocated local variables for possible // problems with finalization in PURE. 
- if (auto whyNot{ - WhyNotDefinable(symbol.name(), symbol.owner(), {}, symbol)}) { + if (auto whyNot{WhyNotDefinable(symbol.name(), symbol.owner(), + {DefinabilityFlag::PotentialDeallocation}, symbol)}) { if (auto *msg{messages_.Say( "'%s' may not be a local variable in a pure subprogram"_err_en_US, symbol.name())}) { diff --git a/flang/lib/Semantics/definable.cpp b/flang/lib/Semantics/definable.cpp index 99a31553f2782..08cb268b318ae 100644 --- a/flang/lib/Semantics/definable.cpp +++ b/flang/lib/Semantics/definable.cpp @@ -193,6 +193,15 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at, return WhyNotDefinableLast(at, scope, flags, dataRef->GetLastSymbol()); } } + auto dyType{evaluate::DynamicType::From(ultimate)}; + const auto *inPure{FindPureProcedureContaining(scope)}; + if (inPure && !flags.test(DefinabilityFlag::PolymorphicOkInPure) && + flags.test(DefinabilityFlag::PotentialDeallocation) && dyType && + dyType->IsPolymorphic()) { + return BlameSymbol(at, + "'%s' is a whole polymorphic object in a pure subprogram"_en_US, + original); + } if (flags.test(DefinabilityFlag::PointerDefinition)) { if (flags.test(DefinabilityFlag::AcceptAllocatable)) { if (!IsAllocatableOrObjectPointer(&ultimate)) { @@ -210,26 +219,17 @@ static std::optional WhyNotDefinableLast(parser::CharBlock at, "'%s' is an entity with either an EVENT_TYPE or LOCK_TYPE"_en_US, original); } - if (FindPureProcedureContaining(scope)) { - if (auto dyType{evaluate::DynamicType::From(ultimate)}) { - if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { - if (dyType->IsPolymorphic()) { // C1596 - return BlameSymbol( - at, "'%s' is polymorphic in a pure subprogram"_en_US, original); - } - } - if (const Symbol * impure{HasImpureFinal(ultimate)}) { - return BlameSymbol(at, "'%s' has an impure FINAL procedure '%s'"_en_US, - original, impure->name()); - } + if (dyType && inPure) { + if (const Symbol * impure{HasImpureFinal(ultimate)}) { + return BlameSymbol(at, "'%s' has an impure FINAL 
procedure '%s'"_en_US, + original, impure->name()); + } + if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { if (const DerivedTypeSpec * derived{GetDerivedTypeSpec(dyType)}) { - if (!flags.test(DefinabilityFlag::PolymorphicOkInPure)) { - if (auto bad{ - FindPolymorphicAllocatablePotentialComponent(*derived)}) { - return BlameSymbol(at, - "'%s' has polymorphic component '%s' in a pure subprogram"_en_US, - original, bad.BuildResultDesignatorName()); - } + if (auto bad{FindPolymorphicAllocatablePotentialComponent(*derived)}) { + return BlameSymbol(at, + "'%s' has polymorphic component '%s' in a pure subprogram"_en_US, + original, bad.BuildResultDesignatorName()); } } } @@ -243,7 +243,7 @@ static std::optional WhyNotDefinable(parser::CharBlock at, const evaluate::DataRef &dataRef) { auto whyNotBase{ WhyNotDefinableBase(at, scope, flags, dataRef.GetFirstSymbol(), - std::holds_alternative(dataRef.u), + evaluate::UnwrapWholeSymbolDataRef(dataRef) != nullptr, DefinesComponentPointerTarget(dataRef, flags))}; if (!whyNotBase || !whyNotBase->IsFatal()) { if (auto whyNotLast{ diff --git a/flang/lib/Semantics/definable.h b/flang/lib/Semantics/definable.h index 902702dbccbf3..0d027961417be 100644 --- a/flang/lib/Semantics/definable.h +++ b/flang/lib/Semantics/definable.h @@ -33,7 +33,7 @@ ENUM_CLASS(DefinabilityFlag, SourcedAllocation, // ALLOCATE(a,SOURCE=) PolymorphicOkInPure, // don't check for polymorphic type in pure subprogram DoNotNoteDefinition, // context does not imply definition - AllowEventLockOrNotifyType) + AllowEventLockOrNotifyType, PotentialDeallocation) using DefinabilityFlags = common::EnumSet; diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index 64cb46f2a6f4f..acec7051efa98 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -3475,15 +3475,15 @@ const Assignment *ExpressionAnalyzer::Analyze(const parser::AssignmentStmt &x) { const Symbol *lastWhole{ lastWhole0 ? 
&ResolveAssociations(*lastWhole0) : nullptr}; if (!lastWhole || !IsAllocatable(*lastWhole)) { - Say("Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable"_err_en_US); + Say("Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable"_err_en_US); } else if (evaluate::IsCoarray(*lastWhole)) { - Say("Left-hand side of assignment may not be polymorphic if it is a coarray"_err_en_US); + Say("Left-hand side of intrinsic assignment may not be polymorphic if it is a coarray"_err_en_US); } } if (auto *derived{GetDerivedTypeSpec(*dyType)}) { if (auto iter{FindAllocatableUltimateComponent(*derived)}) { if (ExtractCoarrayRef(lhs)) { - Say("Left-hand side of assignment must not be coindexed due to allocatable ultimate component '%s'"_err_en_US, + Say("Left-hand side of intrinsic assignment must not be coindexed due to allocatable ultimate component '%s'"_err_en_US, iter.BuildResultDesignatorName()); } } diff --git a/flang/test/Semantics/assign11.f90 b/flang/test/Semantics/assign11.f90 index 37216526b5f33..9d70d7109e75e 100644 --- a/flang/test/Semantics/assign11.f90 +++ b/flang/test/Semantics/assign11.f90 @@ -9,10 +9,10 @@ program test end type type(t) auc[*] pa = 1 ! 
ok - !ERROR: Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable pp = 1 - !ERROR: Left-hand side of assignment may not be polymorphic if it is a coarray + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic if it is a coarray pac = 1 - !ERROR: Left-hand side of assignment must not be coindexed due to allocatable ultimate component '%a' + !ERROR: Left-hand side of intrinsic assignment must not be coindexed due to allocatable ultimate component '%a' auc[1] = t() end diff --git a/flang/test/Semantics/bug139129.f90 b/flang/test/Semantics/bug139129.f90 new file mode 100644 index 0000000000000..2f0f865854706 --- /dev/null +++ b/flang/test/Semantics/bug139129.f90 @@ -0,0 +1,17 @@ +!RUN: %flang_fc1 -fsyntax-only %s +module m + type t + contains + procedure asst + generic :: assignment(=) => asst + end type + contains + pure subroutine asst(lhs, rhs) + class(t), intent(in out) :: lhs + class(t), intent(in) :: rhs + end + pure subroutine test(x, y) + class(t), intent(in out) :: x, y + x = y ! 
spurious definability error + end +end diff --git a/flang/test/Semantics/call28.f90 b/flang/test/Semantics/call28.f90 index 51430853d663f..f133276f7547e 100644 --- a/flang/test/Semantics/call28.f90 +++ b/flang/test/Semantics/call28.f90 @@ -11,9 +11,7 @@ pure subroutine s1(x) end subroutine pure subroutine s2(x) class(t), intent(in out) :: x - !ERROR: Left-hand side of assignment may not be polymorphic unless assignment is to an entire allocatable - !ERROR: Left-hand side of assignment is not definable - !BECAUSE: 'x' is polymorphic in a pure subprogram + !ERROR: Left-hand side of intrinsic assignment may not be polymorphic unless assignment is to an entire allocatable x = t() end subroutine pure subroutine s3(x) diff --git a/flang/test/Semantics/deallocate07.f90 b/flang/test/Semantics/deallocate07.f90 index 154c680f47c82..6dcf20e82cf0d 100644 --- a/flang/test/Semantics/deallocate07.f90 +++ b/flang/test/Semantics/deallocate07.f90 @@ -19,11 +19,11 @@ pure subroutine subr(pp1, pp2, mp2) !ERROR: Name in DEALLOCATE statement is not definable !BECAUSE: 'mv1' may not be defined in pure subprogram 'subr' because it is host-associated deallocate(mv1%pc) - !ERROR: Object in DEALLOCATE statement is not deallocatable - !BECAUSE: 'pp1' is polymorphic in a pure subprogram + !ERROR: Name in DEALLOCATE statement is not definable + !BECAUSE: 'pp1' is a whole polymorphic object in a pure subprogram deallocate(pp1) - !ERROR: Object in DEALLOCATE statement is not deallocatable - !BECAUSE: 'pc' is polymorphic in a pure subprogram + !ERROR: Name in DEALLOCATE statement is not definable + !BECAUSE: 'pc' is a whole polymorphic object in a pure subprogram deallocate(pp2%pc) !ERROR: Object in DEALLOCATE statement is not deallocatable !BECAUSE: 'mp2' has polymorphic component '%pc' in a pure subprogram diff --git a/flang/test/Semantics/declarations05.f90 b/flang/test/Semantics/declarations05.f90 index b6dab7aeea0bc..b1e3d3c773160 100644 --- a/flang/test/Semantics/declarations05.f90 +++ 
b/flang/test/Semantics/declarations05.f90 @@ -22,7 +22,7 @@ impure subroutine final(x) end pure subroutine test !ERROR: 'x0' may not be a local variable in a pure subprogram - !BECAUSE: 'x0' is polymorphic in a pure subprogram + !BECAUSE: 'x0' is a whole polymorphic object in a pure subprogram class(t0), allocatable :: x0 !ERROR: 'x1' may not be a local variable in a pure subprogram !BECAUSE: 'x1' has an impure FINAL procedure 'final' From flang-commits at lists.llvm.org Tue May 13 07:49:01 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 13 May 2025 07:49:01 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix spurious error on defined assignment in PURE (PR #139186) In-Reply-To: Message-ID: <68235bdd.170a0220.37cf9a.4724@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139186 From flang-commits at lists.llvm.org Tue May 13 07:49:05 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 07:49:05 -0700 (PDT) Subject: [flang-commits] [flang] e75fda1 - [flang] Acknowledge non-enforcement of C7108 (#139169) Message-ID: <68235be1.630a0220.1452f9.c0af@mx.google.com> Author: Peter Klausler Date: 2025-05-13T07:48:30-07:00 New Revision: e75fda107da8bd6a3993bf1e3cb51dc03e952e23 URL: https://github.com/llvm/llvm-project/commit/e75fda107da8bd6a3993bf1e3cb51dc03e952e23 DIFF: https://github.com/llvm/llvm-project/commit/e75fda107da8bd6a3993bf1e3cb51dc03e952e23.diff LOG: [flang] Acknowledge non-enforcement of C7108 (#139169) Fortran 2023 constraint C7108 prohibits the use of a structure constructor in a way that is ambiguous with a generic function reference (intrinsic or user-defined). Sadly, no Fortran compiler implements this constraint, and the common portable interpretation seems to be the generic resolution, not the structure constructor. 
Restructure the processing of structure constructors in expression analysis so that it can be driven both from the parse tree and from generic resolution, and then use it to detect ambiguous structure constructor / generic function cases, so that a portability warning can be issued. And document this as a new intentional violation of the standard in Extensions.md. Fixes https://github.com/llvm/llvm-project/issues/138807. Added: flang/test/Semantics/c7108.f90 Modified: flang/docs/Extensions.md flang/include/flang/Semantics/expression.h flang/include/flang/Support/Fortran-features.h flang/lib/Semantics/expression.cpp flang/lib/Support/Fortran-features.cpp flang/test/Semantics/generic09.f90 flang/test/Semantics/resolve11.f90 flang/test/Semantics/resolve17.f90 flang/test/Semantics/resolve18.f90 Removed: ################################################################################ diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 5c7751763eab1..00a7e2bac84e6 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -159,6 +159,11 @@ end to be constant will generate a compilation error. `ieee_support_standard` depends in part on `ieee_support_halting`, so this also applies to `ieee_support_standard` calls. +* F'2023 constraint C7108 prohibits the use of a structure constructor + that could also be interpreted as a generic function reference. + No other Fortran compiler enforces C7108 (to our knowledge); + they all resolve the ambiguity by interpreting the call as a function + reference. We do the same, with a portability warning. 
## Extensions, deletions, and legacy features supported by default diff --git a/flang/include/flang/Semantics/expression.h b/flang/include/flang/Semantics/expression.h index eee23dba4831f..30f5dfd8a44cd 100644 --- a/flang/include/flang/Semantics/expression.h +++ b/flang/include/flang/Semantics/expression.h @@ -394,6 +394,19 @@ class ExpressionAnalyzer { MaybeExpr AnalyzeComplex(MaybeExpr &&re, MaybeExpr &&im, const char *what); std::optional AnalyzeChevrons(const parser::CallStmt &); + // CheckStructureConstructor() is used for parsed structure constructors + // as well as for generic function references. + struct ComponentSpec { + ComponentSpec() = default; + ComponentSpec(ComponentSpec &&) = default; + parser::CharBlock source, exprSource; + bool hasKeyword{false}; + const Symbol *keywordSymbol{nullptr}; + MaybeExpr expr; + }; + MaybeExpr CheckStructureConstructor(parser::CharBlock typeName, + const semantics::DerivedTypeSpec &, std::list &&); + MaybeExpr IterativelyAnalyzeSubexpressions(const parser::Expr &); semantics::SemanticsContext &context_; diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 550a5c8f307d3..0e18eaedf2139 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -54,7 +54,8 @@ ENUM_CLASS(LanguageFeature, BackslashEscapes, OldDebugLines, PolymorphicActualAllocatableOrPointerToMonomorphicDummy, RelaxedPureDummy, UndefinableAsynchronousOrVolatileActual, AutomaticInMainProgram, PrintCptr, SavedLocalInSpecExpr, PrintNamelist, AssumedRankPassedToNonAssumedRank, - IgnoreIrrelevantAttributes, Unsigned, ContiguousOkForSeqAssociation) + IgnoreIrrelevantAttributes, Unsigned, AmbiguousStructureConstructor, + ContiguousOkForSeqAssociation) // Portability and suspicious usage warnings ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index 
0659536aab98c..64cb46f2a6f4f 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -2069,23 +2069,9 @@ static MaybeExpr ImplicitConvertTo(const semantics::Symbol &sym, return std::nullopt; } -MaybeExpr ExpressionAnalyzer::Analyze( - const parser::StructureConstructor &structure) { - auto &parsedType{std::get(structure.t)}; - parser::Name structureType{std::get(parsedType.t)}; - parser::CharBlock &typeName{structureType.source}; - if (semantics::Symbol *typeSymbol{structureType.symbol}) { - if (typeSymbol->has()) { - semantics::DerivedTypeSpec dtSpec{typeName, typeSymbol->GetUltimate()}; - if (!CheckIsValidForwardReference(dtSpec)) { - return std::nullopt; - } - } - } - if (!parsedType.derivedTypeSpec) { - return std::nullopt; - } - const auto &spec{*parsedType.derivedTypeSpec}; +MaybeExpr ExpressionAnalyzer::CheckStructureConstructor( + parser::CharBlock typeName, const semantics::DerivedTypeSpec &spec, + std::list &&componentSpecs) { const Symbol &typeSymbol{spec.typeSymbol()}; if (!spec.scope() || !typeSymbol.has()) { return std::nullopt; // error recovery @@ -2096,10 +2082,10 @@ MaybeExpr ExpressionAnalyzer::Analyze( const Symbol *parentComponent{typeDetails.GetParentComponent(*spec.scope())}; if (typeSymbol.attrs().test(semantics::Attr::ABSTRACT)) { // C796 - AttachDeclaration(Say(typeName, - "ABSTRACT derived type '%s' may not be used in a " - "structure constructor"_err_en_US, - typeName), + AttachDeclaration( + Say(typeName, + "ABSTRACT derived type '%s' may not be used in a structure constructor"_err_en_US, + typeName), typeSymbol); // C7114 } @@ -2129,22 +2115,19 @@ MaybeExpr ExpressionAnalyzer::Analyze( bool checkConflicts{true}; // until we hit one auto &messages{GetContextualMessages()}; - // NULL() can be a valid component - auto restorer{AllowNullPointer()}; - - for (const auto &component : - std::get>(structure.t)) { - const parser::Expr &expr{ - std::get(component.t).v.value()}; - parser::CharBlock 
source{expr.source}; + for (ComponentSpec &componentSpec : componentSpecs) { + parser::CharBlock source{componentSpec.source}; + parser::CharBlock exprSource{componentSpec.exprSource}; auto restorer{messages.SetLocation(source)}; - const Symbol *symbol{nullptr}; - MaybeExpr value{Analyze(expr)}; + const Symbol *symbol{componentSpec.keywordSymbol}; + MaybeExpr &maybeValue{componentSpec.expr}; + if (!maybeValue.has_value()) { + return std::nullopt; + } + Expr &value{*maybeValue}; std::optional valueType{DynamicType::From(value)}; - if (const auto &kw{std::get>(component.t)}) { + if (componentSpec.hasKeyword) { anyKeyword = true; - source = kw->v.source; - symbol = kw->v.symbol; if (!symbol) { // Skip overridden inaccessible parent components in favor of // their later overrides. @@ -2196,9 +2179,9 @@ MaybeExpr ExpressionAnalyzer::Analyze( } } if (symbol) { - const semantics::Scope &innermost{context_.FindScope(expr.source)}; + const semantics::Scope &innermost{context_.FindScope(exprSource)}; if (auto msg{CheckAccessibleSymbol(innermost, *symbol)}) { - Say(expr.source, std::move(*msg)); + Say(exprSource, std::move(*msg)); } if (checkConflicts) { auto componentIter{ @@ -2206,8 +2189,7 @@ MaybeExpr ExpressionAnalyzer::Analyze( if (unavailable.find(symbol->name()) != unavailable.cend()) { // C797, C798 Say(source, - "Component '%s' conflicts with another component earlier in " - "this structure constructor"_err_en_US, + "Component '%s' conflicts with another component earlier in this structure constructor"_err_en_US, symbol->name()); } else if (symbol->test(Symbol::Flag::ParentComp)) { // Make earlier components unavailable once a whole parent appears. 
@@ -2225,143 +2207,136 @@ MaybeExpr ExpressionAnalyzer::Analyze( } } unavailable.insert(symbol->name()); - if (value) { - if (symbol->has()) { - Say(expr.source, - "Type parameter '%s' may not appear as a component of a structure constructor"_err_en_US, - symbol->name()); - } - if (!(symbol->has() || - symbol->has())) { - continue; // recovery - } - if (IsPointer(*symbol)) { // C7104, C7105, C1594(4) - semantics::CheckStructConstructorPointerComponent( - context_, *symbol, *value, innermost); - result.Add(*symbol, Fold(std::move(*value))); - continue; - } - if (IsNullPointer(&*value)) { - if (IsAllocatable(*symbol)) { - if (IsBareNullPointer(&*value)) { - // NULL() with no arguments allowed by 7.5.10 para 6 for - // ALLOCATABLE. - result.Add(*symbol, Expr{NullPointer{}}); - continue; - } - if (IsNullObjectPointer(&*value)) { - AttachDeclaration( - Warn(common::LanguageFeature:: - NullMoldAllocatableComponentValue, - expr.source, - "NULL() with arguments is not standard conforming as the value for allocatable component '%s'"_port_en_US, - symbol->name()), - *symbol); - // proceed to check type & shape - } else { - AttachDeclaration( - Say(expr.source, - "A NULL procedure pointer may not be used as the value for component '%s'"_err_en_US, - symbol->name()), - *symbol); - continue; - } + if (symbol->has()) { + Say(exprSource, + "Type parameter '%s' may not appear as a component of a structure constructor"_err_en_US, + symbol->name()); + } + if (!(symbol->has() || + symbol->has())) { + continue; // recovery + } + if (IsPointer(*symbol)) { // C7104, C7105, C1594(4) + semantics::CheckStructConstructorPointerComponent( + context_, *symbol, value, innermost); + result.Add(*symbol, Fold(std::move(value))); + continue; + } + if (IsNullPointer(&value)) { + if (IsAllocatable(*symbol)) { + if (IsBareNullPointer(&value)) { + // NULL() with no arguments allowed by 7.5.10 para 6 for + // ALLOCATABLE. 
+ result.Add(*symbol, Expr{NullPointer{}}); + continue; + } + if (IsNullObjectPointer(&value)) { + AttachDeclaration( + Warn(common::LanguageFeature::NullMoldAllocatableComponentValue, + exprSource, + "NULL() with arguments is not standard conforming as the value for allocatable component '%s'"_port_en_US, + symbol->name()), + *symbol); + // proceed to check type & shape } else { AttachDeclaration( - Say(expr.source, - "A NULL pointer may not be used as the value for component '%s'"_err_en_US, + Say(exprSource, + "A NULL procedure pointer may not be used as the value for component '%s'"_err_en_US, symbol->name()), *symbol); continue; } - } else if (IsNullAllocatable(&*value) && IsAllocatable(*symbol)) { - result.Add(*symbol, Expr{NullPointer{}}); + } else { + AttachDeclaration( + Say(exprSource, + "A NULL pointer may not be used as the value for component '%s'"_err_en_US, + symbol->name()), + *symbol); continue; - } else if (auto *derived{evaluate::GetDerivedTypeSpec( - evaluate::DynamicType::From(*symbol))}) { - if (auto iter{FindPointerPotentialComponent(*derived)}; - iter && pureContext) { // F'2023 C15104(4) - if (const Symbol * - visible{semantics::FindExternallyVisibleObject( - *value, *pureContext)}) { - Say(expr.source, - "The externally visible object '%s' may not be used in a pure procedure as the value for component '%s' which has the pointer component '%s'"_err_en_US, - visible->name(), symbol->name(), - iter.BuildResultDesignatorName()); - } else if (ExtractCoarrayRef(*value)) { - Say(expr.source, - "A coindexed object may not be used in a pure procedure as the value for component '%s' which has the pointer component '%s'"_err_en_US, - symbol->name(), iter.BuildResultDesignatorName()); - } + } + } else if (IsNullAllocatable(&value) && IsAllocatable(*symbol)) { + result.Add(*symbol, Expr{NullPointer{}}); + continue; + } else if (auto *derived{evaluate::GetDerivedTypeSpec( + evaluate::DynamicType::From(*symbol))}) { + if (auto 
iter{FindPointerPotentialComponent(*derived)}; + iter && pureContext) { // F'2023 C15104(4) + if (const Symbol * + visible{semantics::FindExternallyVisibleObject( + value, *pureContext)}) { + Say(exprSource, + "The externally visible object '%s' may not be used in a pure procedure as the value for component '%s' which has the pointer component '%s'"_err_en_US, + visible->name(), symbol->name(), + iter.BuildResultDesignatorName()); + } else if (ExtractCoarrayRef(value)) { + Say(exprSource, + "A coindexed object may not be used in a pure procedure as the value for component '%s' which has the pointer component '%s'"_err_en_US, + symbol->name(), iter.BuildResultDesignatorName()); } } - // Make implicit conversion explicit to allow folding of the structure - // constructors and help semantic checking, unless the component is - // allocatable, in which case the value could be an unallocated - // allocatable (see Fortran 2018 7.5.10 point 7). The explicit - // convert would cause a segfault. Lowering will deal with - // conditionally converting and preserving the lower bounds in this - // case. - if (MaybeExpr converted{ImplicitConvertTo( - *symbol, std::move(*value), IsAllocatable(*symbol))}) { - if (auto componentShape{GetShape(GetFoldingContext(), *symbol)}) { - if (auto valueShape{GetShape(GetFoldingContext(), *converted)}) { - if (GetRank(*componentShape) == 0 && GetRank(*valueShape) > 0) { + } + // Make implicit conversion explicit to allow folding of the structure + // constructors and help semantic checking, unless the component is + // allocatable, in which case the value could be an unallocated + // allocatable (see Fortran 2018 7.5.10 point 7). The explicit + // convert would cause a segfault. Lowering will deal with + // conditionally converting and preserving the lower bounds in this + // case. 
+ if (MaybeExpr converted{ImplicitConvertTo( + *symbol, std::move(value), IsAllocatable(*symbol))}) { + if (auto componentShape{GetShape(GetFoldingContext(), *symbol)}) { + if (auto valueShape{GetShape(GetFoldingContext(), *converted)}) { + if (GetRank(*componentShape) == 0 && GetRank(*valueShape) > 0) { + AttachDeclaration( + Say(exprSource, + "Rank-%d array value is not compatible with scalar component '%s'"_err_en_US, + GetRank(*valueShape), symbol->name()), + *symbol); + } else { + auto checked{CheckConformance(messages, *componentShape, + *valueShape, CheckConformanceFlags::RightIsExpandableDeferred, + "component", "value")}; + if (checked && *checked && GetRank(*componentShape) > 0 && + GetRank(*valueShape) == 0 && + (IsDeferredShape(*symbol) || + !IsExpandableScalar(*converted, GetFoldingContext(), + *componentShape, true /*admit PURE call*/))) { AttachDeclaration( - Say(expr.source, - "Rank-%d array value is not compatible with scalar component '%s'"_err_en_US, - GetRank(*valueShape), symbol->name()), + Say(exprSource, + "Scalar value cannot be expanded to shape of array component '%s'"_err_en_US, + symbol->name()), *symbol); - } else { - auto checked{ - CheckConformance(messages, *componentShape, *valueShape, - CheckConformanceFlags::RightIsExpandableDeferred, - "component", "value")}; - if (checked && *checked && GetRank(*componentShape) > 0 && - GetRank(*valueShape) == 0 && - (IsDeferredShape(*symbol) || - !IsExpandableScalar(*converted, GetFoldingContext(), - *componentShape, true /*admit PURE call*/))) { - AttachDeclaration( - Say(expr.source, - "Scalar value cannot be expanded to shape of array component '%s'"_err_en_US, - symbol->name()), - *symbol); - } - if (checked.value_or(true)) { - result.Add(*symbol, std::move(*converted)); - } } - } else { - Say(expr.source, "Shape of value cannot be determined"_err_en_US); + if (checked.value_or(true)) { + result.Add(*symbol, std::move(*converted)); + } } } else { - AttachDeclaration( - Say(expr.source, - 
"Shape of component '%s' cannot be determined"_err_en_US, - symbol->name()), - *symbol); - } - } else if (auto symType{DynamicType::From(symbol)}) { - if (IsAllocatable(*symbol) && symType->IsUnlimitedPolymorphic() && - valueType) { - // ok - } else if (valueType) { - AttachDeclaration( - Say(expr.source, - "Value in structure constructor of type '%s' is " - "incompatible with component '%s' of type '%s'"_err_en_US, - valueType->AsFortran(), symbol->name(), - symType->AsFortran()), - *symbol); - } else { - AttachDeclaration( - Say(expr.source, - "Value in structure constructor is incompatible with " - "component '%s' of type %s"_err_en_US, - symbol->name(), symType->AsFortran()), - *symbol); + Say(exprSource, "Shape of value cannot be determined"_err_en_US); } + } else { + AttachDeclaration( + Say(exprSource, + "Shape of component '%s' cannot be determined"_err_en_US, + symbol->name()), + *symbol); + } + } else if (auto symType{DynamicType::From(symbol)}) { + if (IsAllocatable(*symbol) && symType->IsUnlimitedPolymorphic() && + valueType) { + // ok + } else if (valueType) { + AttachDeclaration( + Say(exprSource, + "Value in structure constructor of type '%s' is incompatible with component '%s' of type '%s'"_err_en_US, + valueType->AsFortran(), symbol->name(), symType->AsFortran()), + *symbol); + } else { + AttachDeclaration( + Say(exprSource, + "Value in structure constructor is incompatible with component '%s' of type %s"_err_en_US, + symbol->name(), symType->AsFortran()), + *symbol); } } } @@ -2381,10 +2356,10 @@ MaybeExpr ExpressionAnalyzer::Analyze( } else if (IsPointer(symbol)) { result.Add(symbol, Expr{NullPointer{}}); } else if (object) { // C799 - AttachDeclaration(Say(typeName, - "Structure constructor lacks a value for " - "component '%s'"_err_en_US, - symbol.name()), + AttachDeclaration( + Say(typeName, + "Structure constructor lacks a value for component '%s'"_err_en_US, + symbol.name()), symbol); } } @@ -2394,6 +2369,45 @@ MaybeExpr 
ExpressionAnalyzer::Analyze( return AsMaybeExpr(Expr{std::move(result)}); } +MaybeExpr ExpressionAnalyzer::Analyze( + const parser::StructureConstructor &structure) { + const auto &parsedType{std::get(structure.t)}; + parser::Name structureType{std::get(parsedType.t)}; + parser::CharBlock &typeName{structureType.source}; + if (semantics::Symbol * typeSymbol{structureType.symbol}) { + if (typeSymbol->has()) { + semantics::DerivedTypeSpec dtSpec{typeName, typeSymbol->GetUltimate()}; + if (!CheckIsValidForwardReference(dtSpec)) { + return std::nullopt; + } + } + } + if (!parsedType.derivedTypeSpec) { + return std::nullopt; + } + auto restorer{AllowNullPointer()}; // NULL() can be a valid component + std::list componentSpecs; + for (const auto &component : + std::get>(structure.t)) { + const parser::Expr &expr{ + std::get(component.t).v.value()}; + auto restorer{GetContextualMessages().SetLocation(expr.source)}; + ComponentSpec compSpec; + compSpec.exprSource = expr.source; + compSpec.expr = Analyze(expr); + if (const auto &kw{std::get>(component.t)}) { + compSpec.source = kw->v.source; + compSpec.hasKeyword = true; + compSpec.keywordSymbol = kw->v.symbol; + } else { + compSpec.source = expr.source; + } + componentSpecs.emplace_back(std::move(compSpec)); + } + return CheckStructureConstructor( + typeName, DEREF(parsedType.derivedTypeSpec), std::move(componentSpecs)); +} + static std::optional GetPassName( const semantics::Symbol &proc) { return common::visit( @@ -2841,24 +2855,26 @@ std::pair ExpressionAnalyzer::ResolveGeneric( const Symbol &symbol, const ActualArguments &actuals, const AdjustActuals &adjustActuals, bool isSubroutine, bool mightBeStructureConstructor) { - const Symbol *elemental{nullptr}; // matching elemental specific proc - const Symbol *nonElemental{nullptr}; // matching non-elemental specific const Symbol &ultimate{symbol.GetUltimate()}; - int crtMatchingDistance{cudaInfMatchingValue}; // Check for a match with an explicit INTRINSIC + const Symbol 
*explicitIntrinsic{nullptr}; if (ultimate.attrs().test(semantics::Attr::INTRINSIC)) { parser::Messages buffer; - auto restorer{foldingContext_.messages().SetMessages(buffer)}; + auto restorer{GetContextualMessages().SetMessages(buffer)}; ActualArguments localActuals{actuals}; if (context_.intrinsics().Probe( CallCharacteristics{ultimate.name().ToString(), isSubroutine}, localActuals, foldingContext_) && !buffer.AnyFatalError()) { - return {&ultimate, false}; + explicitIntrinsic = &ultimate; } } - if (const auto *details{ultimate.detailsIf()}) { - for (const Symbol &specific0 : details->specificProcs()) { + const Symbol *elemental{nullptr}; // matching elemental specific proc + const Symbol *nonElemental{nullptr}; // matching non-elemental specific + const auto *genericDetails{ultimate.detailsIf()}; + if (genericDetails && !explicitIntrinsic) { + int crtMatchingDistance{cudaInfMatchingValue}; + for (const Symbol &specific0 : genericDetails->specificProcs()) { const Symbol &specific1{BypassGeneric(specific0)}; if (isSubroutine != !IsFunction(specific1)) { continue; @@ -2911,25 +2927,93 @@ std::pair ExpressionAnalyzer::ResolveGeneric( } } } - if (nonElemental) { - return {&AccessSpecific(symbol, *nonElemental), false}; - } else if (elemental) { - return {&AccessSpecific(symbol, *elemental), false}; - } - // Check parent derived type - if (const auto *parentScope{symbol.owner().GetDerivedTypeParent()}) { - if (const Symbol *extended{parentScope->FindComponent(symbol.name())}) { - auto pair{ResolveGeneric( - *extended, actuals, adjustActuals, isSubroutine, false)}; - if (pair.first) { - return pair; + } + // Is there a derived type of the same name? + const Symbol *derivedType{nullptr}; + if (mightBeStructureConstructor && !isSubroutine && genericDetails) { + if (const Symbol * dt{genericDetails->derivedType()}) { + const Symbol &ultimate{dt->GetUltimate()}; + if (ultimate.has()) { + derivedType = &ultimate; + } + } + } + // F'2023 C7108 checking. 
No Fortran compiler actually enforces this + // constraint, so it's just a portability warning here. + if (derivedType && (explicitIntrinsic || nonElemental || elemental) && + context_.ShouldWarn( + common::LanguageFeature::AmbiguousStructureConstructor)) { + // See whethr there's ambiguity with a structure constructor. + bool possiblyAmbiguous{true}; + if (const semantics::Scope * dtScope{derivedType->scope()}) { + parser::Messages buffer; + auto restorer{GetContextualMessages().SetMessages(buffer)}; + std::list componentSpecs; + for (const auto &actual : actuals) { + if (actual) { + ComponentSpec compSpec; + if (const Expr *expr{actual->UnwrapExpr()}) { + compSpec.expr = *expr; + } else { + possiblyAmbiguous = false; + } + if (auto loc{actual->sourceLocation()}) { + compSpec.source = compSpec.exprSource = *loc; + } + if (auto kw{actual->keyword()}) { + compSpec.hasKeyword = true; + compSpec.keywordSymbol = dtScope->FindComponent(*kw); + } + componentSpecs.emplace_back(std::move(compSpec)); + } else { + possiblyAmbiguous = false; } } + semantics::DerivedTypeSpec dtSpec{derivedType->name(), *derivedType}; + dtSpec.set_scope(*dtScope); + possiblyAmbiguous = possiblyAmbiguous && + CheckStructureConstructor( + derivedType->name(), dtSpec, std::move(componentSpecs)) + .has_value() && + !buffer.AnyFatalError(); + } + if (possiblyAmbiguous) { + if (explicitIntrinsic) { + Warn(common::LanguageFeature::AmbiguousStructureConstructor, + "Reference to the intrinsic function '%s' is ambiguous with a structure constructor of the same name"_port_en_US, + symbol.name()); + } else { + Warn(common::LanguageFeature::AmbiguousStructureConstructor, + "Reference to generic function '%s' (resolving to specific '%s') is ambiguous with a structure constructor of the same name"_port_en_US, + symbol.name(), + nonElemental ? 
nonElemental->name() : elemental->name()); + } } - if (mightBeStructureConstructor && details->derivedType()) { - return {details->derivedType(), false}; + } + // Return the right resolution, if there is one. Explicit intrinsics + // are preferred, then non-elements specifics, then elementals, and + // lastly structure constructors. + if (explicitIntrinsic) { + return {explicitIntrinsic, false}; + } else if (nonElemental) { + return {&AccessSpecific(symbol, *nonElemental), false}; + } else if (elemental) { + return {&AccessSpecific(symbol, *elemental), false}; + } + // Check parent derived type + if (const auto *parentScope{symbol.owner().GetDerivedTypeParent()}) { + if (const Symbol * extended{parentScope->FindComponent(symbol.name())}) { + auto pair{ResolveGeneric( + *extended, actuals, adjustActuals, isSubroutine, false)}; + if (pair.first) { + return pair; + } } } + // Structure constructor? + if (derivedType) { + return {derivedType, false}; + } // Check for generic or explicit INTRINSIC of the same name in outer scopes. // See 15.5.5.2 for details. 
if (!symbol.owner().IsGlobal() && !symbol.owner().IsDerivedType()) { diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 49a5989849eaa..bee8984102b82 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -45,6 +45,7 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::HollerithPolymorphic); warnLanguage_.set(LanguageFeature::ListDirectedSize); warnLanguage_.set(LanguageFeature::IgnoreIrrelevantAttributes); + warnLanguage_.set(LanguageFeature::AmbiguousStructureConstructor); warnUsage_.set(UsageWarning::ShortArrayActual); warnUsage_.set(UsageWarning::FoldingException); warnUsage_.set(UsageWarning::FoldingAvoidsRuntimeCrash); diff --git a/flang/test/Semantics/c7108.f90 b/flang/test/Semantics/c7108.f90 new file mode 100644 index 0000000000000..c23a0abe3ee03 --- /dev/null +++ b/flang/test/Semantics/c7108.f90 @@ -0,0 +1,41 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 -pedantic -Werror +! F'2023 C7108 is portably unenforced. +module m + type foo + integer n + end type + interface foo + procedure bar0, bar1, bar2, bar3 + end interface + contains + type(foo) function bar0(n) + integer, intent(in) :: n + print *, 'bar0' + bar0%n = n + end + type(foo) function bar1() + print *, 'bar1' + bar1%n = 1 + end + type(foo) function bar2(a) + real, intent(in) :: a + print *, 'bar2' + bar2%n = a + end + type(foo) function bar3(L) + logical, intent(in) :: L + print *, 'bar3' + bar3%n = merge(4,5,L) + end +end + +program p + use m + type(foo) x + x = foo(); print *, x ! ok, not ambiguous + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'bar0') is ambiguous with a structure constructor of the same name + x = foo(2); print *, x ! ambigous + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'bar2') is ambiguous with a structure constructor of the same name + x = foo(3.); print *, x ! 
ambiguous due to data conversion + x = foo(.true.); print *, x ! ok, not ambigous +end diff --git a/flang/test/Semantics/generic09.f90 b/flang/test/Semantics/generic09.f90 index 6159dd4b701d7..d93d7453ed6dd 100644 --- a/flang/test/Semantics/generic09.f90 +++ b/flang/test/Semantics/generic09.f90 @@ -1,4 +1,5 @@ ! RUN: %flang_fc1 -fdebug-unparse %s 2>&1 | FileCheck %s + module m1 type foo integer n @@ -32,6 +33,9 @@ type(foo) function f2(a) end end +!CHECK: portability: Reference to generic function 'foo' (resolving to specific 'f1') is ambiguous with a structure constructor of the same name +!CHECK: portability: Reference to generic function 'foo' (resolving to specific 'f2') is ambiguous with a structure constructor of the same name + program main use m3 type(foo) x diff --git a/flang/test/Semantics/resolve11.f90 b/flang/test/Semantics/resolve11.f90 index 39a30b858ebb6..9ae4f52c4fd54 100644 --- a/flang/test/Semantics/resolve11.f90 +++ b/flang/test/Semantics/resolve11.f90 @@ -66,7 +66,8 @@ subroutine s4 !ERROR: 'fun' is PRIVATE in 'm4' use m4, only: foo, fun type(foo) x ! ok - print *, foo() ! 
ok + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'fun') is ambiguous with a structure constructor of the same name + print *, foo() end module m5 diff --git a/flang/test/Semantics/resolve17.f90 b/flang/test/Semantics/resolve17.f90 index 770af756d03bc..6a6e355abe0b8 100644 --- a/flang/test/Semantics/resolve17.f90 +++ b/flang/test/Semantics/resolve17.f90 @@ -290,6 +290,7 @@ module m14d contains subroutine test real :: y + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'bar') is ambiguous with a structure constructor of the same name y = foo(1.0) x = foo(2) end subroutine @@ -301,6 +302,7 @@ module m14e contains subroutine test real :: y + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'bar') is ambiguous with a structure constructor of the same name y = foo(1.0) x = foo(2) end subroutine diff --git a/flang/test/Semantics/resolve18.f90 b/flang/test/Semantics/resolve18.f90 index fef526908bbf9..547db5e85714c 100644 --- a/flang/test/Semantics/resolve18.f90 +++ b/flang/test/Semantics/resolve18.f90 @@ -348,6 +348,7 @@ subroutine s_21_23 use m21 use m23 type(foo) x ! Intel and NAG error + !PORTABILITY: Reference to generic function 'foo' (resolving to specific 'f1') is ambiguous with a structure constructor of the same name print *, foo(1.) ! Intel error print *, foo(1.,2.,3.) ! Intel error call ext(foo) ! 
GNU and Intel error From flang-commits at lists.llvm.org Tue May 13 07:49:26 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 13 May 2025 07:49:26 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Use LHS type for RHS BOZ on assignment (PR #139626) In-Reply-To: Message-ID: <68235bf6.170a0220.43e58.4d3f@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139626 From flang-commits at lists.llvm.org Tue May 13 07:51:01 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 07:51:01 -0700 (PDT) Subject: [flang-commits] [flang] 936481f - [flang] Use LHS type for RHS BOZ on assignment (#139626) Message-ID: <68235c55.170a0220.1125e5.4af6@mx.google.com> Author: Peter Klausler Date: 2025-05-13T07:49:20-07:00 New Revision: 936481fdf5b0ab214e381aa96a151ec33348cfca URL: https://github.com/llvm/llvm-project/commit/936481fdf5b0ab214e381aa96a151ec33348cfca DIFF: https://github.com/llvm/llvm-project/commit/936481fdf5b0ab214e381aa96a151ec33348cfca.diff LOG: [flang] Use LHS type for RHS BOZ on assignment (#139626) F'2023 allows the right-hand side of an assignment to an integer or real scalar to be a BOZ literal constant; this has already been supported in some compilers. The type of the left-hand side variable is used to convert the value of the BOZ. 
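The expected values in the accompanying boz-rhs.f90 test (dp=1._8 for z'3ff0000000000000' and i64=-77129852189294865_8 for z'feedfacedeadbeef') are consistent with the BOZ bit pattern being taken over unchanged into the left-hand side type. The sketch below reproduces that arithmetic outside the compiler. The helper names BozToReal64 and BozToInt64 are hypothetical and not flang APIs, and the code assumes a 64-bit IEEE-754 double and two's-complement integers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: treat 64 BOZ bits as the representation of a
// double-precision real (assumes IEEE-754 binary64).
double BozToReal64(std::uint64_t bits) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes a 64-bit double");
  double result;
  std::memcpy(&result, &bits, sizeof result); // copy the bits, no conversion
  return result;
}

// Hypothetical helper: treat 64 BOZ bits as a signed 64-bit integer
// (two's complement, so a set top bit yields a negative value).
std::int64_t BozToInt64(std::uint64_t bits) {
  std::int64_t result;
  std::memcpy(&result, &bits, sizeof result);
  return result;
}
```

Calling BozToReal64(0x3ff0000000000000ULL) and BozToInt64(0xfeedfacedeadbeefULL) gives the two values that the test's CHECK lines expect.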
Added: flang/test/Semantics/boz-rhs.f90 Modified: flang/lib/Semantics/expression.cpp Removed: ################################################################################ diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index acec7051efa98..c35492097cfbc 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -150,8 +150,9 @@ class ArgumentAnalyzer { } void Analyze(const parser::Variable &); void Analyze(const parser::ActualArgSpec &, bool isSubroutine); - void ConvertBOZ(std::optional *thisType, std::size_t, + void ConvertBOZOperand(std::optional *thisType, std::size_t, std::optional otherType); + void ConvertBOZAssignmentRHS(const DynamicType &lhsType); bool IsIntrinsicRelational( RelationalOperator, const DynamicType &, const DynamicType &) const; @@ -3849,8 +3850,8 @@ MaybeExpr RelationHelper(ExpressionAnalyzer &context, RelationalOperator opr, if (!analyzer.fatalErrors()) { std::optional leftType{analyzer.GetType(0)}; std::optional rightType{analyzer.GetType(1)}; - analyzer.ConvertBOZ(&leftType, 0, rightType); - analyzer.ConvertBOZ(&rightType, 1, leftType); + analyzer.ConvertBOZOperand(&leftType, 0, rightType); + analyzer.ConvertBOZOperand(&rightType, 1, leftType); if (leftType && rightType && analyzer.IsIntrinsicRelational(opr, *leftType, *rightType)) { analyzer.CheckForNullPointer("as a relational operand"); @@ -4761,12 +4762,8 @@ std::optional ArgumentAnalyzer::TryDefinedAssignment() { if (!IsAllocatableDesignator(lhs) || context_.inWhereBody()) { AddAssignmentConversion(*lhsType, *rhsType); } - } else { - if (lhsType->category() == TypeCategory::Integer || - lhsType->category() == TypeCategory::Unsigned || - lhsType->category() == TypeCategory::Real) { - ConvertBOZ(nullptr, 1, lhsType); - } + } else if (IsBOZLiteral(1)) { + ConvertBOZAssignmentRHS(*lhsType); if (IsBOZLiteral(1)) { context_.Say( "Right-hand side of this assignment may not be BOZ"_err_en_US); @@ -5003,7 +5000,7 @@ 
int ArgumentAnalyzer::GetRank(std::size_t i) const { // UNSIGNED; otherwise, convert to INTEGER. // Note that IBM supports comparing BOZ literals to CHARACTER operands. That // is not currently supported. -void ArgumentAnalyzer::ConvertBOZ(std::optional *thisType, +void ArgumentAnalyzer::ConvertBOZOperand(std::optional *thisType, std::size_t i, std::optional otherType) { if (IsBOZLiteral(i)) { Expr &&argExpr{MoveExpr(i)}; @@ -5036,6 +5033,17 @@ void ArgumentAnalyzer::ConvertBOZ(std::optional *thisType, } } +void ArgumentAnalyzer::ConvertBOZAssignmentRHS(const DynamicType &lhsType) { + if (lhsType.category() == TypeCategory::Integer || + lhsType.category() == TypeCategory::Unsigned || + lhsType.category() == TypeCategory::Real) { + Expr rhs{MoveExpr(1)}; + if (MaybeExpr converted{ConvertToType(lhsType, std::move(rhs))}) { + actuals_[1] = std::move(*converted); + } + } +} + // Report error resolving opr when there is a user-defined one available void ArgumentAnalyzer::SayNoMatch(const std::string &opr, bool isAssignment) { std::string type0{TypeAsFortran(0)}; diff --git a/flang/test/Semantics/boz-rhs.f90 b/flang/test/Semantics/boz-rhs.f90 new file mode 100644 index 0000000000000..1f2991aa1781b --- /dev/null +++ b/flang/test/Semantics/boz-rhs.f90 @@ -0,0 +1,8 @@ +! RUN: %flang_fc1 -fdebug-unparse %s | FileCheck %s +double precision dp +integer(8) i64 +!CHECK: dp=1._8 +dp = z'3ff0000000000000' +!CHECK: i64=-77129852189294865_8 +i64 = z'feedfacedeadbeef' +end From flang-commits at lists.llvm.org Tue May 13 08:00:04 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Tue, 13 May 2025 08:00:04 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][openacc] Align async check for combined construct (PR #139744) In-Reply-To: Message-ID: <68235e74.a70a0220.97f08.df52@mx.google.com> clementval wrote: > Thanks for this! Should this have codegen tests? Else I'm fine with it. 
Just added some https://github.com/llvm/llvm-project/pull/139744 From flang-commits at lists.llvm.org Tue May 13 08:02:16 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Tue, 13 May 2025 08:02:16 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][openacc] Align async check for combined construct (PR #139744) In-Reply-To: Message-ID: <68235ef8.170a0220.799e7.62af@mx.google.com> https://github.com/clementval updated https://github.com/llvm/llvm-project/pull/139744 >From d88d587c68339301b78b55efc9439bf5bf8aed2f Mon Sep 17 00:00:00 2001 From: Valentin Clement Date: Tue, 13 May 2025 07:39:50 -0700 Subject: [PATCH 1/2] [flang][openacc] Align async check for combined construct --- .../Semantics/OpenACC/acc-kernels-loop.f90 | 9 + .../OpenACC/acc-parallel-loop-validity.f90 | 9 + .../Semantics/OpenACC/acc-serial-loop.f90 | 9 + llvm/include/llvm/Frontend/OpenACC/ACC.td | 160 ++++++++---------- 4 files changed, 101 insertions(+), 86 deletions(-) diff --git a/flang/test/Semantics/OpenACC/acc-kernels-loop.f90 b/flang/test/Semantics/OpenACC/acc-kernels-loop.f90 index 8653978fb6249..29985a02eb6ef 100644 --- a/flang/test/Semantics/OpenACC/acc-kernels-loop.f90 +++ b/flang/test/Semantics/OpenACC/acc-kernels-loop.f90 @@ -295,4 +295,13 @@ program openacc_kernels_loop_validity if(i == 10) cycle end do + !$acc kernels loop async(1) device_type(nvidia) async(3) + do i = 1, n + end do + +!ERROR: At most one ASYNC clause can appear on the KERNELS LOOP directive or in group separated by the DEVICE_TYPE clause + !$acc kernels loop async(1) device_type(nvidia) async async + do i = 1, n + end do + end program openacc_kernels_loop_validity diff --git a/flang/test/Semantics/OpenACC/acc-parallel-loop-validity.f90 b/flang/test/Semantics/OpenACC/acc-parallel-loop-validity.f90 index 7f33f9e145110..78e1a7ad7c452 100644 --- a/flang/test/Semantics/OpenACC/acc-parallel-loop-validity.f90 +++ 
b/flang/test/Semantics/OpenACC/acc-parallel-loop-validity.f90 @@ -141,4 +141,13 @@ program openacc_parallel_loop_validity if(i == 10) cycle end do + !$acc parallel loop async(1) device_type(nvidia) async(3) + do i = 1, n + end do + +!ERROR: At most one ASYNC clause can appear on the PARALLEL LOOP directive or in group separated by the DEVICE_TYPE clause + !$acc parallel loop async(1) device_type(nvidia) async async + do i = 1, n + end do + end program openacc_parallel_loop_validity diff --git a/flang/test/Semantics/OpenACC/acc-serial-loop.f90 b/flang/test/Semantics/OpenACC/acc-serial-loop.f90 index 2832274680eca..5d2be7f7c6474 100644 --- a/flang/test/Semantics/OpenACC/acc-serial-loop.f90 +++ b/flang/test/Semantics/OpenACC/acc-serial-loop.f90 @@ -111,4 +111,13 @@ program openacc_serial_loop_validity if(i == 10) cycle end do + !$acc serial loop async(1) device_type(nvidia) async(3) + do i = 1, n + end do + +!ERROR: At most one ASYNC clause can appear on the SERIAL LOOP directive or in group separated by the DEVICE_TYPE clause + !$acc serial loop async(1) device_type(nvidia) async async + do i = 1, n + end do + end program openacc_serial_loop_validity diff --git a/llvm/include/llvm/Frontend/OpenACC/ACC.td b/llvm/include/llvm/Frontend/OpenACC/ACC.td index d372fc221e4b4..46cba9f2400e1 100644 --- a/llvm/include/llvm/Frontend/OpenACC/ACC.td +++ b/llvm/include/llvm/Frontend/OpenACC/ACC.td @@ -556,35 +556,31 @@ def ACC_HostData : Directive<"host_data"> { // 2.11 def ACC_KernelsLoop : Directive<"kernels loop"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; - let allowedOnceClauses = [ - 
VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; + let allowedClauses = [VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause]; + let allowedOnceClauses = [VersionedClause, + VersionedClause, + VersionedClause]; let allowedExclusiveClauses = [ VersionedClause, VersionedClause, @@ -596,36 +592,32 @@ def ACC_KernelsLoop : Directive<"kernels loop"> { // 2.11 def ACC_ParallelLoop : Directive<"parallel loop"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; + let allowedClauses = [VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause]; + let allowedOnceClauses = [VersionedClause, + VersionedClause, + VersionedClause]; let allowedExclusiveClauses = [ VersionedClause, VersionedClause, @@ -637,33 +629,29 @@ def 
ACC_ParallelLoop : Directive<"parallel loop"> { // 2.11 def ACC_SerialLoop : Directive<"serial loop"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; + let allowedClauses = [VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause]; + let allowedOnceClauses = [VersionedClause, + VersionedClause, + VersionedClause]; let allowedExclusiveClauses = [ VersionedClause, VersionedClause, >From 99e82a914b852b66189330f5a43d9cfe8f72f668 Mon Sep 17 00:00:00 2001 From: Valentin Clement Date: Tue, 13 May 2025 07:59:23 -0700 Subject: [PATCH 2/2] Add lowering tests --- flang/test/Lower/OpenACC/acc-kernels-loop.f90 | 6 ++++++ flang/test/Lower/OpenACC/acc-parallel-loop.f90 | 6 ++++++ flang/test/Lower/OpenACC/acc-serial-loop.f90 | 6 ++++++ 3 files changed, 18 insertions(+) diff --git a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 index 0ded708cb1a3b..a330b7d491d06 100644 --- a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 @@ -102,6 +102,12 @@ subroutine acc_kernels_loop ! CHECK: acc.terminator ! CHECK-NEXT: }{{$}} + !$acc kernels loop async(async) device_type(nvidia) async(1) + DO i = 1, n + a(i) = b(i) + END DO +! 
CHECK: acc.kernels combined(loop) async(%{{.*}} : i32, %c1{{.*}} : i32 [#acc.device_type])
+
 !$acc kernels loop wait
 DO i = 1, n
 a(i) = b(i)
diff --git a/flang/test/Lower/OpenACC/acc-parallel-loop.f90 b/flang/test/Lower/OpenACC/acc-parallel-loop.f90
index ccd37d87262e3..1e1fc7448a513 100644
--- a/flang/test/Lower/OpenACC/acc-parallel-loop.f90
+++ b/flang/test/Lower/OpenACC/acc-parallel-loop.f90
@@ -104,6 +104,12 @@ subroutine acc_parallel_loop
 ! CHECK: acc.yield
 ! CHECK-NEXT: }{{$}}
 
+ !$acc parallel loop async(async) device_type(nvidia) async(1)
+ DO i = 1, n
+ a(i) = b(i)
+ END DO
+! CHECK: acc.parallel combined(loop) async(%{{.*}} : i32, %c1{{.*}} : i32 [#acc.device_type])
+
 !$acc parallel loop wait
 DO i = 1, n
 a(i) = b(i)
diff --git a/flang/test/Lower/OpenACC/acc-serial-loop.f90 b/flang/test/Lower/OpenACC/acc-serial-loop.f90
index 478dfa0d96c3b..98fc28990265a 100644
--- a/flang/test/Lower/OpenACC/acc-serial-loop.f90
+++ b/flang/test/Lower/OpenACC/acc-serial-loop.f90
@@ -123,6 +123,12 @@ subroutine acc_serial_loop
 ! CHECK: acc.yield
 ! CHECK-NEXT: }{{$}}
 
+ !$acc serial loop async(async) device_type(nvidia) async(1)
+ DO i = 1, n
+ a(i) = b(i)
+ END DO
+! CHECK: acc.serial combined(loop) async(%{{.*}} : i32, %c1{{.*}} : i32 [#acc.device_type])
+
 !$acc serial loop wait
 DO i = 1, n
 a(i) = b(i)

From flang-commits at lists.llvm.org Tue May 13 08:05:09 2025
From: flang-commits at lists.llvm.org (David Truby via flang-commits)
Date: Tue, 13 May 2025 08:05:09 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Add loop annotation attributes to the loop backedge instead of the loop header's conditional branch (PR #126082)
In-Reply-To:
Message-ID: <68235fa5.170a0220.59850.88fc@mx.google.com>

DavidTruby wrote:

Should this be merged, since it has a number of approvals and seemingly no issues?

https://github.com/llvm/llvm-project/pull/126082

From flang-commits at lists.llvm.org Tue May 13 08:08:10 2025
From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits)
Date: Tue, 13 May 2025 08:08:10 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Add loop annotation attributes to the loop backedge instead of the loop header's conditional branch (PR #126082)
In-Reply-To:
Message-ID: <6823605a.620a0220.260e52.9e07@mx.google.com>

ashermancinelli wrote:

Sorry folks, I lost track of this. I believe it needs to be updated because of other changes to loop annotations. Let me rebase and run some more testing. Thanks for the ping!

https://github.com/llvm/llvm-project/pull/126082

From flang-commits at lists.llvm.org Tue May 13 08:13:51 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 13 May 2025 08:13:51 -0700 (PDT)
Subject: [flang-commits] [flang] bbb7f01 - [flang] Fix volatile attribute propagation on allocatables (#139183)
Message-ID: <682361af.630a0220.57ab1.14cc@mx.google.com>

Author: Asher Mancinelli
Date: 2025-05-13T08:13:47-07:00
New Revision: bbb7f0148177d332df80b5cfdc7d161dca289056

URL: https://github.com/llvm/llvm-project/commit/bbb7f0148177d332df80b5cfdc7d161dca289056
DIFF: https://github.com/llvm/llvm-project/commit/bbb7f0148177d332df80b5cfdc7d161dca289056.diff

LOG: [flang] Fix volatile attribute propagation on allocatables (#139183)

Ensure volatility is reflected not just on the reference to an allocatable, but on the box, too. When we declare a volatile allocatable, we now get a volatile reference to a volatile box.

Some related cleanups:

* SELECT TYPE constructs check the selector's type for volatility when creating and designating the type used in the selecting block.
* Refine the verifier for fir.convert.
In general, I think it is ok to implicitly drop volatility in any ptr-to-int conversion because it means we are in codegen (and representing volatility on the LLVM ops and intrinsics) or we are calling an external function (are there any cases I'm not thinking of?) * An allocatable test that was XFAILed is now passing. Making allocatables' boxes volatile resulted in accesses of those boxes being volatile, which resolved some errors coming from the strict verifier. * I noticed a runtime function was missing the fir.runtime attribute. Added: Modified: flang/lib/Lower/Bridge.cpp flang/lib/Optimizer/Dialect/FIROps.cpp flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp flang/test/Fir/invalid.fir flang/test/Lower/volatile-allocatable.f90 flang/test/Lower/volatile-allocatable1.f90 Removed: ################################################################################ diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index cf9a322680321..c9e91cf3e8042 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -3842,6 +3842,10 @@ class FirConverter : public Fortran::lower::AbstractConverter { bool hasLocalScope = false; llvm::SmallVector typeCaseScopes; + const auto selectorIsVolatile = [&selector]() { + return fir::isa_volatile_type(fir::getBase(selector).getType()); + }; + const auto &typeCaseList = std::get>( selectTypeConstruct.t); @@ -3995,7 +3999,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { addrTy = fir::HeapType::get(addrTy); if (std::holds_alternative( typeSpec->u)) { - mlir::Type refTy = fir::ReferenceType::get(addrTy); + mlir::Type refTy = + fir::ReferenceType::get(addrTy, selectorIsVolatile()); if (isPointer || isAllocatable) refTy = addrTy; exactValue = builder->create( @@ -4004,7 +4009,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { typeSpec->declTypeSpec->AsIntrinsic(); if (isArray) { mlir::Value exact = builder->create( - loc, 
fir::BoxType::get(addrTy), fir::getBase(selector)); + loc, fir::BoxType::get(addrTy, selectorIsVolatile()), + fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exact)); } else if (intrinsic->category() == Fortran::common::TypeCategory::Character) { @@ -4019,7 +4025,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { } else if (std::holds_alternative( typeSpec->u)) { exactValue = builder->create( - loc, fir::BoxType::get(addrTy), fir::getBase(selector)); + loc, fir::BoxType::get(addrTy, selectorIsVolatile()), + fir::getBase(selector)); addAssocEntitySymbol(selectorBox->clone(exactValue)); } } else if (std::holds_alternative( @@ -4037,7 +4044,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { addrTy = fir::PointerType::get(addrTy); if (isAllocatable) addrTy = fir::HeapType::get(addrTy); - mlir::Type classTy = fir::ClassType::get(addrTy); + mlir::Type classTy = + fir::ClassType::get(addrTy, selectorIsVolatile()); if (classTy == baseTy) { addAssocEntitySymbol(selector); } else { diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index c1cdbddd45279..d85b38c467857 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -1536,20 +1536,50 @@ bool fir::ConvertOp::canBeConverted(mlir::Type inType, mlir::Type outType) { areRecordsCompatible(inType, outType); } +// In general, ptrtoint-like conversions are allowed to lose volatility +// information because they are either: +// +// 1. passing an entity to an external function and there's nothing we can do +// about volatility after that happens, or +// 2. for code generation, at which point we represent volatility with +// attributes on the LLVM instructions and intrinsics. +// +// For all other cases, volatility ought to match exactly. 
+static mlir::LogicalResult verifyVolatility(mlir::Type inType, + mlir::Type outType) { + const bool toLLVMPointer = mlir::isa(outType); + const bool toInteger = fir::isa_integer(outType); + + // When converting references to classes or allocatables into boxes for + // runtime arguments, we cast away all the volatility information and pass a + // box. This is allowed. + const bool isBoxNoneLike = [&]() { + if (fir::isBoxNone(outType)) + return true; + if (auto referenceType = mlir::dyn_cast(outType)) { + if (fir::isBoxNone(referenceType.getElementType())) { + return true; + } + } + return false; + }(); + + const bool isPtrToIntLike = toLLVMPointer || toInteger || isBoxNoneLike; + if (isPtrToIntLike) { + return mlir::success(); + } + + // In all other cases, we need to check for an exact volatility match. + return mlir::success(fir::isa_volatile_type(inType) == + fir::isa_volatile_type(outType)); +} + llvm::LogicalResult fir::ConvertOp::verify() { mlir::Type inType = getValue().getType(); mlir::Type outType = getType(); - // If we're converting to an LLVM pointer type or an integer, we don't - // need to check for volatility mismatch - volatility will be handled by the - // memory operations themselves in llvm code generation and ptr-to-int can't - // represent volatility. 
- const bool toLLVMPointer = mlir::isa(outType); - const bool toInteger = fir::isa_integer(outType); if (fir::useStrictVolatileVerification()) { - if (fir::isa_volatile_type(inType) != fir::isa_volatile_type(outType) && - !toLLVMPointer && !toInteger) { - return emitOpError("cannot convert between volatile and non-volatile " - "types, use fir.volatile_cast instead ") + if (failed(verifyVolatility(inType, outType))) { + return emitOpError("this conversion does not preserve volatility: ") << inType << " / " << outType; } } diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp index 711d5d1461b08..8cfca59ecdada 100644 --- a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp +++ b/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp @@ -207,29 +207,37 @@ static bool hasExplicitLowerBounds(mlir::Value shape) { mlir::isa(shape.getType()); } -static std::pair updateDeclareInputTypeWithVolatility( - mlir::Type inputType, mlir::Value memref, mlir::OpBuilder &builder, - fir::FortranVariableFlagsAttr fortran_attrs) { - if (fortran_attrs && - bitEnumContainsAny(fortran_attrs.getFlags(), - fir::FortranVariableFlagsEnum::fortran_volatile)) { - const bool isPointer = bitEnumContainsAny( - fortran_attrs.getFlags(), fir::FortranVariableFlagsEnum::pointer); - auto updateType = [&](auto t) { - using FIRT = decltype(t); - // A volatile pointer's pointee is volatile. 
- auto elementType = t.getEleTy(); - const bool elementTypeIsVolatile = - isPointer || fir::isa_volatile_type(elementType); - auto newEleTy = - fir::updateTypeWithVolatility(elementType, elementTypeIsVolatile); - inputType = FIRT::get(newEleTy, true); - }; - llvm::TypeSwitch(inputType) - .Case(updateType); - memref = - builder.create(memref.getLoc(), inputType, memref); +static std::pair +updateDeclaredInputTypeWithVolatility(mlir::Type inputType, mlir::Value memref, + mlir::OpBuilder &builder, + fir::FortranVariableFlagsEnum flags) { + if (!bitEnumContainsAny(flags, + fir::FortranVariableFlagsEnum::fortran_volatile)) { + return std::make_pair(inputType, memref); } + + // A volatile pointer's pointee is volatile. + const bool isPointer = + bitEnumContainsAny(flags, fir::FortranVariableFlagsEnum::pointer); + // An allocatable's inner type's volatility matches that of the reference. + const bool isAllocatable = + bitEnumContainsAny(flags, fir::FortranVariableFlagsEnum::allocatable); + + auto updateType = [&](auto t) { + using FIRT = decltype(t); + auto elementType = t.getEleTy(); + const bool elementTypeIsBox = mlir::isa(elementType); + const bool elementTypeIsVolatile = isPointer || isAllocatable || + elementTypeIsBox || + fir::isa_volatile_type(elementType); + auto newEleTy = + fir::updateTypeWithVolatility(elementType, elementTypeIsVolatile); + inputType = FIRT::get(newEleTy, true); + }; + llvm::TypeSwitch(inputType) + .Case(updateType); + memref = + builder.create(memref.getLoc(), inputType, memref); return std::make_pair(inputType, memref); } @@ -243,8 +251,11 @@ void hlfir::DeclareOp::build(mlir::OpBuilder &builder, auto nameAttr = builder.getStringAttr(uniq_name); mlir::Type inputType = memref.getType(); bool hasExplicitLbs = hasExplicitLowerBounds(shape); - std::tie(inputType, memref) = updateDeclareInputTypeWithVolatility( - inputType, memref, builder, fortran_attrs); + if (fortran_attrs) { + const auto flags = fortran_attrs.getFlags(); + std::tie(inputType, 
memref) = updateDeclaredInputTypeWithVolatility( + inputType, memref, builder, flags); + } mlir::Type hlfirVariableType = getHLFIRVariableType(inputType, hasExplicitLbs); build(builder, result, {hlfirVariableType, inputType}, memref, shape, diff --git a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp index 0c78a878cdc53..f9a4c4d0283c7 100644 --- a/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/PolymorphicOpConversion.cpp @@ -401,10 +401,14 @@ llvm::LogicalResult SelectTypeConv::genTypeLadderStep( { // Since conversion is done in parallel for each fir.select_type // operation, the runtime function insertion must be threadsafe. + auto runtimeAttr = + mlir::NamedAttribute(fir::FIROpsDialect::getFirRuntimeAttrName(), + mlir::UnitAttr::get(rewriter.getContext())); callee = fir::createFuncOp(rewriter.getUnknownLoc(), mod, fctName, rewriter.getFunctionType({descNoneTy, typeDescTy}, - rewriter.getI1Type())); + rewriter.getI1Type()), + {runtimeAttr}); } cmp = rewriter .create(loc, callee, diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index 1de48b87365b3..fd607fd9066f7 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -1260,7 +1260,7 @@ func.func @dc_invalid_reduction(%arg0: index, %arg1: index) { // Should fail when volatility changes from a fir.convert func.func @bad_convert_volatile(%arg0: !fir.ref) -> !fir.ref { - // expected-error at +1 {{'fir.convert' op cannot convert between volatile and non-volatile types, use fir.volatile_cast instead}} + // expected-error at +1 {{op this conversion does not preserve volatility}} %0 = fir.convert %arg0 : (!fir.ref) -> !fir.ref return %0 : !fir.ref } @@ -1269,7 +1269,7 @@ func.func @bad_convert_volatile(%arg0: !fir.ref) -> !fir.ref // Should fail when volatility changes from a fir.convert func.func @bad_convert_volatile2(%arg0: !fir.ref) -> !fir.ref { - // 
expected-error at +1 {{'fir.convert' op cannot convert between volatile and non-volatile types, use fir.volatile_cast instead}} + // expected-error at +1 {{op this conversion does not preserve volatility}} %0 = fir.convert %arg0 : (!fir.ref) -> !fir.ref return %0 : !fir.ref } diff --git a/flang/test/Lower/volatile-allocatable.f90 b/flang/test/Lower/volatile-allocatable.f90 index 5f75a5425422a..e182fe8a4d9c9 100644 --- a/flang/test/Lower/volatile-allocatable.f90 +++ b/flang/test/Lower/volatile-allocatable.f90 @@ -119,10 +119,10 @@ subroutine test_unlimited_polymorphic() end subroutine ! CHECK-LABEL: func.func @_QPtest_scalar_volatile() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEc1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv2"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv3"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEc1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv2"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv3"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () @@ -140,8 +140,8 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_volatile_asynchronous() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEi1"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEv1"} : (!fir.ref>>>, volatile>) -> (!fir.ref>>>, volatile>, !fir.ref>>>, volatile>) +! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEi1"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEv1"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 @@ -151,10 +151,11 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_select_base_type_volatile() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.ref>>>, volatile>) -> (!fir.ref>>>, volatile>, !fir.ref>>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! 
CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAClassIs(%{{.+}}, %{{.+}}) : (!fir.box, !fir.ref) -> i1 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.class>>, volatile>, !fir.shift<1>) -> (!fir.class>>, volatile>, !fir.class>>, volatile>) ! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0 (%{{.+}}) : (!fir.class>>, volatile>, index) -> !fir.class, volatile> ! CHECK: %{{.+}} = hlfir.designate %{{.+}}{"i"} : (!fir.class, volatile>) -> !fir.ref @@ -162,7 +163,7 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_mold_allocation() { ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {uniq_name = "_QFtest_mold_allocationEtemplate"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_mold_allocationEv"} : (!fir.ref>>>, volatile>) -> (!fir.ref>>>, volatile>, !fir.ref>>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_mold_allocationEv"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QQclX6D6F6C642074657374"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) ! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0{"str"} typeparams %{{.+}} : (!fir.ref>, index) -> !fir.ref> ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QQro.2xi4.2"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) @@ -173,8 +174,8 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_unlimited_polymorphic() { -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.ref>, volatile>) -> (!fir.ref>, volatile>, !fir.ref>, volatile>) -! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEupa"} : (!fir.ref>>, volatile>) -> (!fir.ref>>, volatile>, !fir.ref>>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.ref, volatile>, volatile>) -> (!fir.ref, volatile>, volatile>, !fir.ref, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEupa"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitIntrinsicForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i32, i32, i32) -> () ! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.heap) -> (!fir.heap, !fir.heap) diff --git a/flang/test/Lower/volatile-allocatable1.f90 b/flang/test/Lower/volatile-allocatable1.f90 index a21359c3b4225..d2a07c8763885 100644 --- a/flang/test/Lower/volatile-allocatable1.f90 +++ b/flang/test/Lower/volatile-allocatable1.f90 @@ -1,7 +1,6 @@ ! RUN: bbc --strict-fir-volatile-verifier %s -o - | FileCheck %s ! Requires correct propagation of volatility for allocatable nested types. -! XFAIL: * function allocatable_udt() type :: base_type @@ -15,3 +14,19 @@ function allocatable_udt() allocate(v2(2,3)) allocatable_udt = v2(1,1)%i end function +! CHECK-LABEL: func.func @_QPallocatable_udt() -> i32 { +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.i"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.di.base_type.i"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.base_type"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.j"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.di.ext_type.j"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.n.ext_type"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {uniq_name = "_QFallocatable_udtEallocatable_udt"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtEv2"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.c.base_type"} : (!fir.ref>>, !fir.shapeshift<1>) -> (!fir.box>>, !fir.ref>>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.dt.base_type"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.dt.ext_type"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) +! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFallocatable_udtE.c.ext_type"} : (!fir.ref>>, !fir.shapeshift<1>) -> (!fir.box>>, !fir.ref>>) +! CHECK: %{{.+}} = hlfir.designate %{{.+}} (%{{.+}}, %{{.+}}) : (!fir.box>>, volatile>, index, index) -> !fir.ref, volatile> +! CHECK: %{{.+}} = hlfir.designate %{{.+}}{"base_type"} : (!fir.ref, volatile>) -> !fir.ref, volatile> +! 
CHECK: %{{.+}} = hlfir.designate %{{.+}}{"i"} : (!fir.ref, volatile>) -> !fir.ref From flang-commits at lists.llvm.org Tue May 13 08:13:55 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Tue, 13 May 2025 08:13:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix volatile attribute propagation on allocatables (PR #139183) In-Reply-To: Message-ID: <682361b3.050a0220.13ccdb.9fab@mx.google.com> https://github.com/ashermancinelli closed https://github.com/llvm/llvm-project/pull/139183 From flang-commits at lists.llvm.org Tue May 13 08:17:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 08:17:55 -0700 (PDT) Subject: [flang-commits] [flang] 7e8b3fe - [Flang] Add missing dependent dialects to MLIR passes (#139260) Message-ID: <682362a3.a70a0220.36517a.c853@mx.google.com> Author: Sergio Afonso Date: 2025-05-13T16:17:49+01:00 New Revision: 7e8b3fea43f1dfa1d5611a70d887cba5d79b2da9 URL: https://github.com/llvm/llvm-project/commit/7e8b3fea43f1dfa1d5611a70d887cba5d79b2da9 DIFF: https://github.com/llvm/llvm-project/commit/7e8b3fea43f1dfa1d5611a70d887cba5d79b2da9.diff LOG: [Flang] Add missing dependent dialects to MLIR passes (#139260) This patch updates several passes to include the DLTI dialect, since their use of the `fir::support::getOrSetMLIRDataLayout()` utility function could, in some cases, require this dialect to be loaded in advance. Also, the `CUFComputeSharedMemoryOffsetsAndSize` pass has been updated with a dependency to the GPU dialect, as its invocation to `cuf::getOrCreateGPUModule()` would result in the same kind of error if no other operations or attributes from that dialect were present in the input MLIR module. 
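The commit message above says a pass may need a dialect to be loaded before it runs. The following is only a toy model of that idea in plain C++, not MLIR code: `ToyContext`, `ToyPass`, and `runPass` are hypothetical names invented for illustration, and the "frozen" flag stands in for the assumption that new dialects cannot be loaded while passes are executing.

```cpp
#include <iostream>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for a context that tracks which dialects are loaded.
// (Hypothetical type; it does not mirror the real MLIR classes.)
struct ToyContext {
  std::set<std::string> loaded;
  bool frozen = false; // true while a pass body is running

  // Loading an already-loaded dialect is always fine. Loading a new one
  // while frozen is treated as an error.
  void load(const std::string &dialect) {
    if (frozen && loaded.count(dialect) == 0)
      throw std::runtime_error("dialect '" + dialect +
                               "' requested while passes are running");
    loaded.insert(dialect);
  }
};

// A pass declares up front which dialects it may need (dependentDialects)
// and, separately, which ones its body actually touches (dialectsUsed).
struct ToyPass {
  std::string name;
  std::vector<std::string> dependentDialects;
  std::vector<std::string> dialectsUsed;
};

// Loads the declared dependencies, then runs the pass body with the context
// frozen. Returns false if the body needed a dialect that was not declared.
bool runPass(ToyContext &ctx, const ToyPass &pass) {
  for (const auto &dialect : pass.dependentDialects)
    ctx.load(dialect);

  ctx.frozen = true;
  bool ok = true;
  try {
    for (const auto &dialect : pass.dialectsUsed)
      ctx.load(dialect);
  } catch (const std::runtime_error &err) {
    std::cerr << pass.name << ": " << err.what() << "\n";
    ok = false;
  }
  ctx.frozen = false;
  return ok;
}
```

In this model, a pass that uses "dlti" but declares only "fir" fails, while the same pass with "dlti" added to its declared dependencies succeeds. That is the shape of the change made to `Passes.td` in this commit.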
Added: flang/test/Transforms/dlti-dependency.fir Modified: flang/include/flang/Optimizer/Transforms/Passes.td flang/lib/Optimizer/Transforms/CUFAddConstructor.cpp flang/lib/Optimizer/Transforms/CUFComputeSharedMemoryOffsetsAndSize.cpp flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp flang/lib/Optimizer/Transforms/CUFOpConversion.cpp flang/lib/Optimizer/Transforms/LoopVersioning.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index 3243b44df9c7a..c0d88a8e19f80 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -356,7 +356,7 @@ def LoopVersioning : Pass<"loop-versioning", "mlir::func::FuncOp"> { an array has element sized stride. The element sizes stride allows some loops to be vectorized as well as other loop optimizations. }]; - let dependentDialects = [ "fir::FIROpsDialect" ]; + let dependentDialects = [ "fir::FIROpsDialect", "mlir::DLTIDialect" ]; } def VScaleAttr : Pass<"vscale-attr", "mlir::func::FuncOp"> { @@ -436,7 +436,7 @@ def AssumedRankOpConversion : Pass<"fir-assumed-rank-op", "mlir::ModuleOp"> { def CUFOpConversion : Pass<"cuf-convert", "mlir::ModuleOp"> { let summary = "Convert some CUF operations to runtime calls"; let dependentDialects = [ - "fir::FIROpsDialect", "mlir::gpu::GPUDialect" + "fir::FIROpsDialect", "mlir::gpu::GPUDialect", "mlir::DLTIDialect" ]; } @@ -451,14 +451,14 @@ def CUFDeviceGlobal : def CUFAddConstructor : Pass<"cuf-add-constructor", "mlir::ModuleOp"> { let summary = "Add constructor to register CUDA Fortran allocators"; let dependentDialects = [ - "cuf::CUFDialect", "mlir::func::FuncDialect" + "cuf::CUFDialect", "mlir::func::FuncDialect", "mlir::DLTIDialect" ]; } def CUFGPUToLLVMConversion : Pass<"cuf-gpu-convert-to-llvm", "mlir::ModuleOp"> { let summary = "Convert some GPU operations 
lowered from CUF to runtime calls"; let dependentDialects = [ - "mlir::LLVM::LLVMDialect" + "mlir::LLVM::LLVMDialect", "mlir::DLTIDialect" ]; } @@ -472,7 +472,10 @@ def CUFComputeSharedMemoryOffsetsAndSize the global and set it. }]; - let dependentDialects = ["cuf::CUFDialect", "fir::FIROpsDialect"]; + let dependentDialects = [ + "cuf::CUFDialect", "fir::FIROpsDialect", "mlir::gpu::GPUDialect", + "mlir::DLTIDialect" + ]; } def SetRuntimeCallAttributes diff --git a/flang/lib/Optimizer/Transforms/CUFAddConstructor.cpp b/flang/lib/Optimizer/Transforms/CUFAddConstructor.cpp index 064f0f363f699..2dd6950b34897 100644 --- a/flang/lib/Optimizer/Transforms/CUFAddConstructor.cpp +++ b/flang/lib/Optimizer/Transforms/CUFAddConstructor.cpp @@ -22,6 +22,7 @@ #include "flang/Optimizer/Support/DataLayout.h" #include "flang/Runtime/CUDA/registration.h" #include "flang/Runtime/entry-names.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/Dialect/LLVMIR/LLVMAttrs.h" #include "mlir/Dialect/LLVMIR/LLVMDialect.h" diff --git a/flang/lib/Optimizer/Transforms/CUFComputeSharedMemoryOffsetsAndSize.cpp b/flang/lib/Optimizer/Transforms/CUFComputeSharedMemoryOffsetsAndSize.cpp index 8009522a82e27..f6381ef8a8a21 100644 --- a/flang/lib/Optimizer/Transforms/CUFComputeSharedMemoryOffsetsAndSize.cpp +++ b/flang/lib/Optimizer/Transforms/CUFComputeSharedMemoryOffsetsAndSize.cpp @@ -22,6 +22,7 @@ #include "flang/Optimizer/Support/DataLayout.h" #include "flang/Runtime/CUDA/registration.h" #include "flang/Runtime/entry-names.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/Dialect/LLVMIR/LLVMDialect.h" #include "mlir/IR/Value.h" diff --git a/flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp b/flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp index 2549fdcb8baee..fe69ffa8350af 100644 --- a/flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp +++ 
b/flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp @@ -14,6 +14,7 @@ #include "flang/Runtime/CUDA/common.h" #include "flang/Support/Fortran.h" #include "mlir/Conversion/LLVMCommon/Pattern.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/Dialect/LLVMIR/NVVMDialect.h" #include "mlir/Pass/Pass.h" diff --git a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp index e70ceb3a67d98..7477a3c53c3ef 100644 --- a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp @@ -24,6 +24,7 @@ #include "flang/Runtime/allocatable.h" #include "flang/Support/Fortran.h" #include "mlir/Conversion/LLVMCommon/Pattern.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/GPU/IR/GPUDialect.h" #include "mlir/IR/Matchers.h" #include "mlir/Pass/Pass.h" diff --git a/flang/lib/Optimizer/Transforms/LoopVersioning.cpp b/flang/lib/Optimizer/Transforms/LoopVersioning.cpp index 858e35ccb5f81..50e7ee5599ab1 100644 --- a/flang/lib/Optimizer/Transforms/LoopVersioning.cpp +++ b/flang/lib/Optimizer/Transforms/LoopVersioning.cpp @@ -51,6 +51,7 @@ #include "flang/Optimizer/Dialect/Support/KindMapping.h" #include "flang/Optimizer/Support/DataLayout.h" #include "flang/Optimizer/Transforms/Passes.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/LLVMIR/LLVMDialect.h" #include "mlir/IR/Dominance.h" #include "mlir/IR/Matchers.h" diff --git a/flang/test/Transforms/dlti-dependency.fir b/flang/test/Transforms/dlti-dependency.fir new file mode 100644 index 0000000000000..c1c3da19fb8d6 --- /dev/null +++ b/flang/test/Transforms/dlti-dependency.fir @@ -0,0 +1,21 @@ +// This test only makes sure that passes with a DLTI dialect dependency are able +// to obtain the dlti.dl_spec module attribute from an llvm.data_layout string. 
+// +// If dependencies for the pass are not properly set, this test causes a +// compiler error due to the DLTI dialect not being loaded. + +// RUN: fir-opt --add-debug-info %s +// RUN: fir-opt --cuf-add-constructor %s +// RUN: fir-opt --cuf-compute-shared-memory %s +// RUN: fir-opt --cuf-gpu-convert-to-llvm %s +// RUN: fir-opt --cuf-convert %s +// RUN: fir-opt --loop-versioning %s + +module attributes {llvm.data_layout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"} { + llvm.func @foo(%arg0 : i32) { + llvm.return + } +} + +// CHECK: module attributes { +// CHECK-SAME: dlti.dl_spec = #dlti.dl_spec< From flang-commits at lists.llvm.org Tue May 13 08:18:35 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 08:18:35 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add missing dependent dialects to MLIR passes (PR #139260) In-Reply-To: Message-ID: <682362cb.170a0220.37cff5.79f9@mx.google.com> https://github.com/skatrak closed https://github.com/llvm/llvm-project/pull/139260 From flang-commits at lists.llvm.org Tue May 13 02:55:17 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 13 May 2025 02:55:17 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow flush of common block (PR #139528) In-Reply-To: Message-ID: <68231705.170a0220.2d80e1.3486@mx.google.com> https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/139528 >From 3e6d5c3cd14a9b688ec6e35a4bd4436fe944860b Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Mon, 12 May 2025 10:13:08 +0000 Subject: [PATCH] [flang][OpenMP] Allow flush of common block I think this was denied by accident in 68180d8. Flush of a common block is allowed by the standard on my reading. It is not allowed by classic-flang but is supported by gfortran and ifx. This doesn't need any lowering changes. The LLVM translation ignores the flush argument list because the openmp runtime library doesn't support flushing specific data. 
Depends upon #139522. Ignore the first commit in this PR. --- flang/lib/Semantics/check-omp-structure.cpp | 7 ------- flang/test/Lower/OpenMP/flush-common.f90 | 13 +++++++++++++ flang/test/Semantics/OpenMP/flush04.f90 | 11 ----------- 3 files changed, 13 insertions(+), 18 deletions(-) create mode 100644 flang/test/Lower/OpenMP/flush-common.f90 delete mode 100644 flang/test/Semantics/OpenMP/flush04.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 8f6a623508aa7..3e44ef5329ce2 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2304,13 +2304,6 @@ void OmpStructureChecker::Leave(const parser::OpenMPFlushConstruct &x) { auto &flushList{std::get>(x.v.t)}; if (flushList) { - for (const parser::OmpArgument &arg : flushList->v) { - if (auto *sym{GetArgumentSymbol(arg)}; sym && !IsVariableListItem(*sym)) { - context_.Say(arg.source, - "FLUSH argument must be a variable list item"_err_en_US); - } - } - if (FindClause(llvm::omp::Clause::OMPC_acquire) || FindClause(llvm::omp::Clause::OMPC_release) || FindClause(llvm::omp::Clause::OMPC_acq_rel)) { diff --git a/flang/test/Lower/OpenMP/flush-common.f90 b/flang/test/Lower/OpenMP/flush-common.f90 new file mode 100644 index 0000000000000..7656141dcb295 --- /dev/null +++ b/flang/test/Lower/OpenMP/flush-common.f90 @@ -0,0 +1,13 @@ +! RUN: %flang_fc1 -fopenmp -emit-hlfir -o - %s | FileCheck %s + +! Regression test to ensure that the name /c/ in the flush argument list is +! resolved to the common block symbol and common blocks are allowed in the +! flush argument list. + +! CHECK: %[[GLBL:.*]] = fir.address_of({{.*}}) : !fir.ref> + common /c/ x + real :: x +! 
CHECK: omp.flush(%[[GLBL]] : !fir.ref>) + !$omp flush(/c/) +end + diff --git a/flang/test/Semantics/OpenMP/flush04.f90 b/flang/test/Semantics/OpenMP/flush04.f90 deleted file mode 100644 index ffc2273b692dc..0000000000000 --- a/flang/test/Semantics/OpenMP/flush04.f90 +++ /dev/null @@ -1,11 +0,0 @@ -! RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp - -! Regression test to ensure that the name /c/ in the flush argument list is -! resolved to the common block symbol. - - common /c/ x - real :: x -!ERROR: FLUSH argument must be a variable list item - !$omp flush(/c/) -end - From flang-commits at lists.llvm.org Tue May 13 05:46:00 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 05:46:00 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723) Message-ID: https://github.com/khaki3 created https://github.com/llvm/llvm-project/pull/139723

OpenACC data actions always accompany a parent construct and have no effect on their own, so users should handle them through that parent construct. In particular, if async operands are attached to the data actions themselves, some users may lower the data actions independently of their parent constructs, producing semantically incorrect code. This PR removes the async operands and the structured flag from data actions.

This PR also:
- Renames `UpdateOp`'s `async` to `asyncOnly`.
- Updates assemblyFormat to display `asyncOnly`.

TODO:
- Update lit tests.
- Update assemblyFormat to display the `async` and the `wait` operands of `EnterData`, `ExitData`, and `WaitOp`.
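As a source-level illustration of the relationship this PR description relies on, here is a minimal C++ sketch that uses OpenACC directives. It is not taken from the PR: the function name `scaleAsync` is hypothetical, and the example concerns only how the clauses are written in user code, not how flang lowers them. The `async` clause sits on the compute construct, and the `copyin`/`copyout` data clauses belong to that same construct rather than carrying async information of their own.

```cpp
#include <cstddef>
#include <vector>

// Returns a vector holding twice each element of `in`.
// (Hypothetical helper, written only to show clause placement.)
std::vector<double> scaleAsync(const std::vector<double> &in) {
  const std::size_t n = in.size();
  std::vector<double> out(n, 0.0);
  const double *a = in.data();
  double *b = out.data();

  // The async clause is attached to the parallel loop construct (queue 1).
  // copyin and copyout are data clauses of that same construct.
#pragma acc parallel loop copyin(a[0:n]) copyout(b[0:n]) async(1)
  for (std::size_t i = 0; i < n; ++i)
    b[i] = 2.0 * a[i];

  // Wait for queue 1 to finish before the host reads `out`.
#pragma acc wait(1)
  return out;
}
```

A compiler without OpenACC support ignores the pragmas and runs the loop sequentially on the host, so the function returns the same values either way.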
>From a1a3adaa3257c81d3936fdcce55be0bae14f0a9f Mon Sep 17 00:00:00 2001 From: Kazuaki Matsumura Date: Tue, 13 May 2025 05:08:27 -0700 Subject: [PATCH 1/2] [flang][acc] Remove async and structured flag from data actions; Rename UpdateOp's async to asyncOnly; Print asyncOnly --- flang/lib/Lower/OpenACC.cpp | 354 +++++++----------- mlir/include/mlir/Dialect/OpenACC/OpenACC.h | 10 +- .../mlir/Dialect/OpenACC/OpenACCOps.td | 195 ++-------- mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp | 34 +- 4 files changed, 188 insertions(+), 405 deletions(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index e1918288d6de3..c1a8dd0d5a478 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -104,15 +104,12 @@ static void addOperand(llvm::SmallVectorImpl &operands, } template -static Op -createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, - mlir::Value baseAddr, std::stringstream &name, - mlir::SmallVector bounds, bool structured, - bool implicit, mlir::acc::DataClause dataClause, - mlir::Type retTy, llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, - bool unwrapBoxAddr = false, mlir::Value isPresent = {}) { +static Op createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, + mlir::Value baseAddr, std::stringstream &name, + mlir::SmallVector bounds, + bool implicit, mlir::acc::DataClause dataClause, + mlir::Type retTy, bool unwrapBoxAddr = false, + mlir::Value isPresent = {}) { mlir::Value varPtrPtr; // The data clause may apply to either the box reference itself or the // pointer to the data it holds. So use `unwrapBoxAddr` to decide. 
@@ -157,11 +154,9 @@ createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, addOperand(operands, operandSegments, baseAddr); addOperand(operands, operandSegments, varPtrPtr); addOperands(operands, operandSegments, bounds); - addOperands(operands, operandSegments, async); Op op = builder.create(loc, retTy, operands); op.setNameAttr(builder.getStringAttr(name.str())); - op.setStructured(structured); op.setImplicit(implicit); op.setDataClause(dataClause); if (auto mappableTy = @@ -176,10 +171,6 @@ createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, op->setAttr(Op::getOperandSegmentSizeAttr(), builder.getDenseI32ArrayAttr(operandSegments)); - if (!asyncDeviceTypes.empty()) - op.setAsyncOperandsDeviceTypeAttr(builder.getArrayAttr(asyncDeviceTypes)); - if (!asyncOnlyDeviceTypes.empty()) - op.setAsyncOnlyAttr(builder.getArrayAttr(asyncOnlyDeviceTypes)); return op; } @@ -249,9 +240,7 @@ static void createDeclareAllocFuncWithArg(mlir::OpBuilder &modBuilder, mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, registerFuncOp.getArgument(0), asFortranDesc, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, descTy, - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, descTy); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -263,8 +252,7 @@ static void createDeclareAllocFuncWithArg(mlir::OpBuilder &modBuilder, addDeclareAttr(builder, boxAddrOp.getOperation(), clause); EntryOp entryOp = createDataEntryOp( builder, loc, boxAddrOp.getResult(), asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, boxAddrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, boxAddrOp.getType()); builder.create( loc, 
mlir::acc::DeclareTokenType::get(entryOp.getContext()), mlir::ValueRange(entryOp.getAccVar())); @@ -302,26 +290,20 @@ static void createDeclareDeallocFuncWithArg( mlir::acc::GetDevicePtrOp entryOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, var.getType()); builder.create( loc, mlir::Value{}, mlir::ValueRange(entryOp.getAccVar())); if constexpr (std::is_same_v || std::is_same_v) - builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getVar(), entryOp.getVarType(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, - builder.getStringAttr(*entryOp.getName())); + builder.create( + entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), + entryOp.getVarType(), entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); else builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); // Generate the post dealloc function. 
@@ -341,9 +323,8 @@ static void createDeclareDeallocFuncWithArg( mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, + var.getType()); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -700,10 +681,7 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - mlir::acc::DataClause dataClause, bool structured, - bool implicit, llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, + mlir::acc::DataClause dataClause, bool implicit, bool setDeclareAttr = false) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; @@ -732,9 +710,8 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, ? 
info.rawInput : info.addr; Op op = createDataEntryOp( - builder, operandLocation, baseAddr, asFortran, bounds, structured, - implicit, dataClause, baseAddr.getType(), async, asyncDeviceTypes, - asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true, info.isPresent); + builder, operandLocation, baseAddr, asFortran, bounds, implicit, + dataClause, baseAddr.getType(), /*unwrapBoxAddr=*/true, info.isPresent); dataOperands.push_back(op.getAccVar()); } } @@ -746,7 +723,7 @@ static void genDeclareDataOperandOperations( Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - mlir::acc::DataClause dataClause, bool structured, bool implicit) { + mlir::acc::DataClause dataClause, bool implicit) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -765,10 +742,9 @@ static void genDeclareDataOperandOperations( /*genDefaultBounds=*/generateDefaultBounds, /*strideIncludeLowerExtent=*/strideIncludeLowerExtent); LLVM_DEBUG(llvm::dbgs() << __func__ << "\n"; info.dump(llvm::dbgs())); - EntryOp op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, structured, - implicit, dataClause, info.addr.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + EntryOp op = createDataEntryOp(builder, operandLocation, info.addr, + asFortran, bounds, implicit, + dataClause, info.addr.getType()); dataOperands.push_back(op.getAccVar()); addDeclareAttr(builder, op.getVar().getDefiningOp(), dataClause); if (mlir::isa(fir::unwrapRefType(info.addr.getType()))) { @@ -805,14 +781,12 @@ static void genDeclareDataOperandOperationsWithModifier( (modifier && (*modifier).v == mod) ? 
clauseWithModifier : clause; genDeclareDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, - dataClause, - /*structured=*/true, /*implicit=*/false); + dataClause, /*implicit=*/false); } template static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { + llvm::SmallVector operands) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); @@ -820,16 +794,13 @@ static void genDataExitOperations(fir::FirOpBuilder &builder, std::is_same_v) builder.create( entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getVarType(), entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), - entryOp.getDataClause(), structured, entryOp.getImplicit(), - builder.getStringAttr(*entryOp.getName())); - else - builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), - entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, + entryOp.getVarType(), entryOp.getBounds(), entryOp.getDataClause(), entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); + else + builder.create(entryOp.getLoc(), entryOp.getAccVar(), + entryOp.getBounds(), entryOp.getDataClause(), + entryOp.getImplicit(), + builder.getStringAttr(*entryOp.getName())); } } @@ -1240,10 +1211,7 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - llvm::SmallVector &privatizations, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { + llvm::SmallVector &privatizations) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); 
Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -1272,9 +1240,9 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, recipe = Fortran::lower::createOrGetPrivateRecipe(builder, recipeName, operandLocation, retTy); auto op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, true, - /*implicit=*/false, mlir::acc::DataClause::acc_private, retTy, async, - asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); + builder, operandLocation, info.addr, asFortran, bounds, + /*implicit=*/false, mlir::acc::DataClause::acc_private, retTy, + /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } else { std::string suffix = @@ -1284,9 +1252,8 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, recipe = Fortran::lower::createOrGetFirstprivateRecipe( builder, recipeName, operandLocation, retTy, bounds); auto op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, true, + builder, operandLocation, info.addr, asFortran, bounds, /*implicit=*/false, mlir::acc::DataClause::acc_firstprivate, retTy, - async, asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } @@ -1869,10 +1836,7 @@ genReductions(const Fortran::parser::AccObjectListWithReduction &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &reductionOperands, - llvm::SmallVector &reductionRecipes, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { + llvm::SmallVector &reductionRecipes) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); const auto &objects = std::get(objectList.t); const auto &op = std::get(objectList.t); @@ -1904,9 +1868,8 @@ genReductions(const Fortran::parser::AccObjectListWithReduction &objectList, auto op = createDataEntryOp( builder, operandLocation, 
info.addr, asFortran, bounds, - /*structured=*/true, /*implicit=*/false, - mlir::acc::DataClause::acc_reduction, info.addr.getType(), async, - asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); + /*implicit=*/false, mlir::acc::DataClause::acc_reduction, + info.addr.getType(), /*unwrapBoxAddr=*/true); mlir::Type ty = op.getAccVar().getType(); if (!areAllBoundConstant(bounds) || fir::isAssumedShape(info.addr.getType()) || @@ -2169,9 +2132,8 @@ static void privatizeIv(Fortran::lower::AbstractConverter &converter, std::stringstream asFortran; asFortran << Fortran::lower::mangle::demangleName(toStringRef(sym.name())); auto op = createDataEntryOp( - builder, loc, ivValue, asFortran, {}, true, /*implicit=*/true, - mlir::acc::DataClause::acc_private, ivValue.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + builder, loc, ivValue, asFortran, {}, /*implicit=*/true, + mlir::acc::DataClause::acc_private, ivValue.getType()); privateOp = op.getOperation(); privateOperands.push_back(op.getAccVar()); @@ -2328,14 +2290,12 @@ static mlir::acc::LoopOp createLoopOp( &clause.u)) { genPrivatizations( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, /*async=*/{}, - /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + privateOperands, privatizations); } else if (const auto *reductionClause = std::get_if( &clause.u)) { genReductions(reductionClause->v, converter, semanticsContext, stmtCtx, - reductionOperands, reductionRecipes, /*async=*/{}, - /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + reductionOperands, reductionRecipes); } else if (std::get_if(&clause.u)) { for (auto crtDeviceTypeAttr : crtDeviceTypes) seqDeviceTypes.push_back(crtDeviceTypeAttr); @@ -2613,9 +2573,6 @@ static void genDataOperandOperationsWithModifier( llvm::SmallVectorImpl &dataClauseOperands, const mlir::acc::DataClause clause, const mlir::acc::DataClause clauseWithModifier, - llvm::ArrayRef async, - llvm::ArrayRef 
asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, bool setDeclareAttr = false) { const Fortran::parser::AccObjectListWithModifier &listWithModifier = x->v; const auto &accObjectList = @@ -2627,9 +2584,7 @@ static void genDataOperandOperationsWithModifier( (modifier && (*modifier).v == mod) ? clauseWithModifier : clause; genDataOperandOperations(accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, dataClause, - /*structured=*/true, /*implicit=*/false, async, - asyncDeviceTypes, asyncOnlyDeviceTypes, - setDeclareAttr); + /*implicit=*/false, setDeclareAttr); } template @@ -2779,8 +2734,7 @@ static Op createComputeOp( genDataOperandOperations( copyClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copy, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyinClause = @@ -2791,8 +2745,7 @@ static Op createComputeOp( copyinClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyin, - mlir::acc::DataClause::acc_copyin_readonly, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyin_readonly); copyinEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyoutClause = @@ -2804,8 +2757,7 @@ static Op createComputeOp( copyoutClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyout, - mlir::acc::DataClause::acc_copyout_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyout_zero); copyoutEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto 
*createClause = @@ -2816,8 +2768,7 @@ static Op createComputeOp( createClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::Zero, dataClauseOperands, mlir::acc::DataClause::acc_create, - mlir::acc::DataClause::acc_create_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_create_zero); createEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *noCreateClause = @@ -2827,8 +2778,7 @@ static Op createComputeOp( genDataOperandOperations( noCreateClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_no_create, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); nocreateEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *presentClause = @@ -2838,8 +2788,7 @@ static Op createComputeOp( genDataOperandOperations( presentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_present, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); presentEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *devicePtrClause = @@ -2848,16 +2797,14 @@ static Op createComputeOp( genDataOperandOperations( devicePtrClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_deviceptr, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); } else if (const auto *attachClause = std::get_if(&clause.u)) { auto crtDataStart = dataClauseOperands.size(); genDataOperandOperations( attachClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_attach, - /*structured=*/true, /*implicit=*/false, async, 
asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); attachEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *privateClause = @@ -2866,15 +2813,13 @@ static Op createComputeOp( if (!combinedConstructs) genPrivatizations( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + privateOperands, privatizations); } else if (const auto *firstprivateClause = std::get_if( &clause.u)) { genPrivatizations( firstprivateClause->v, converter, semanticsContext, stmtCtx, - firstprivateOperands, firstPrivatizations, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + firstprivateOperands, firstPrivatizations); } else if (const auto *reductionClause = std::get_if( &clause.u)) { @@ -2885,16 +2830,14 @@ static Op createComputeOp( // instead. if (!combinedConstructs) { genReductions(reductionClause->v, converter, semanticsContext, stmtCtx, - reductionOperands, reductionRecipes, async, - asyncDeviceTypes, asyncOnlyDeviceTypes); + reductionOperands, reductionRecipes); } else { auto crtDataStart = dataClauseOperands.size(); genDataOperandOperations( std::get(reductionClause->v.t), converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_reduction, - /*structured=*/true, /*implicit=*/true, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/true); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } @@ -2997,19 +2940,19 @@ static Op createComputeOp( // Create the exit operations after the region. 
genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands); genDataExitOperations( - builder, attachEntryOperands, /*structured=*/true); + builder, attachEntryOperands); genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands); genDataExitOperations( - builder, nocreateEntryOperands, /*structured=*/true); + builder, nocreateEntryOperands); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands); builder.restoreInsertionPoint(insPt); return computeOp; @@ -3078,8 +3021,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( copyClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copy, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyinClause = @@ -3090,8 +3032,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, copyinClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyin, - mlir::acc::DataClause::acc_copyin_readonly, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyin_readonly); copyinEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyoutClause = @@ -3103,8 +3044,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, copyoutClause, converter, semanticsContext, stmtCtx, 
Fortran::parser::AccDataModifier::Modifier::Zero, dataClauseOperands, mlir::acc::DataClause::acc_copyout, - mlir::acc::DataClause::acc_copyout_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyout_zero); copyoutEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *createClause = @@ -3115,8 +3055,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, createClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::Zero, dataClauseOperands, mlir::acc::DataClause::acc_create, - mlir::acc::DataClause::acc_create_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_create_zero); createEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *noCreateClause = @@ -3126,8 +3065,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( noCreateClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_no_create, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); nocreateEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *presentClause = @@ -3137,8 +3075,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( presentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_present, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); presentEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *deviceptrClause = @@ -3147,16 +3084,14 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( 
deviceptrClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_deviceptr, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); } else if (const auto *attachClause = std::get_if(&clause.u)) { auto crtDataStart = dataClauseOperands.size(); genDataOperandOperations( attachClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_attach, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); attachEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *defaultClause = @@ -3211,19 +3146,19 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, // Create the exit operations after the region. genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands); genDataExitOperations( - builder, attachEntryOperands, /*structured=*/true); + builder, attachEntryOperands); genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands); genDataExitOperations( - builder, nocreateEntryOperands, /*structured=*/true); + builder, nocreateEntryOperands); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands); builder.restoreInsertionPoint(insPt); } @@ -3252,8 +3187,7 @@ genACCHostDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( useDevice->v, converter, semanticsContext, stmtCtx, dataOperands, mlir::acc::DataClause::acc_use_device, - /*structured=*/true, /*implicit=*/false, /*async=*/{}, - /*asyncDeviceTypes=*/{}, 
/*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false); } else if (std::get_if(&clause.u)) { addIfPresentAttr = true; } @@ -3430,9 +3364,8 @@ genACCEnterDataOp(Fortran::lower::AbstractConverter &converter, std::get(listWithModifier.t); genDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, - dataClauseOperands, mlir::acc::DataClause::acc_copyin, false, - /*implicit=*/false, asyncValues, asyncDeviceTypes, - asyncOnlyDeviceTypes); + dataClauseOperands, mlir::acc::DataClause::acc_copyin, + /*implicit=*/false); } else if (const auto *createClause = std::get_if(&clause.u)) { const Fortran::parser::AccObjectListWithModifier &listWithModifier = @@ -3448,15 +3381,13 @@ genACCEnterDataOp(Fortran::lower::AbstractConverter &converter, clause = mlir::acc::DataClause::acc_create_zero; genDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, - dataClauseOperands, clause, false, /*implicit=*/false, asyncValues, - asyncDeviceTypes, asyncOnlyDeviceTypes); + dataClauseOperands, clause, /*implicit=*/false); } else if (const auto *attachClause = std::get_if(&clause.u)) { genDataOperandOperations( attachClause->v, converter, semanticsContext, stmtCtx, - dataClauseOperands, mlir::acc::DataClause::acc_attach, false, - /*implicit=*/false, asyncValues, asyncDeviceTypes, - asyncOnlyDeviceTypes); + dataClauseOperands, mlir::acc::DataClause::acc_attach, + /*implicit=*/false); } else if (!std::get_if(&clause.u)) { llvm::report_fatal_error( "Unknown clause in ENTER DATA directive lowering"); @@ -3544,20 +3475,17 @@ genACCExitDataOp(Fortran::lower::AbstractConverter &converter, std::get(listWithModifier.t); genDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, copyoutOperands, - mlir::acc::DataClause::acc_copyout, false, /*implicit=*/false, - asyncValues, asyncDeviceTypes, asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyout, /*implicit=*/false); } else if (const auto *deleteClause = std::get_if(&clause.u)) { 
genDataOperandOperations( deleteClause->v, converter, semanticsContext, stmtCtx, deleteOperands, - mlir::acc::DataClause::acc_delete, false, /*implicit=*/false, - asyncValues, asyncDeviceTypes, asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_delete, /*implicit=*/false); } else if (const auto *detachClause = std::get_if(&clause.u)) { genDataOperandOperations( detachClause->v, converter, semanticsContext, stmtCtx, detachOperands, - mlir::acc::DataClause::acc_detach, false, /*implicit=*/false, - asyncValues, asyncDeviceTypes, asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_detach, /*implicit=*/false); } else if (std::get_if(&clause.u)) { addFinalizeAttr = true; } @@ -3587,11 +3515,11 @@ genACCExitDataOp(Fortran::lower::AbstractConverter &converter, exitDataOp.setFinalizeAttr(builder.getUnitAttr()); genDataExitOperations( - builder, copyoutOperands, /*structured=*/false); + builder, copyoutOperands); genDataExitOperations( - builder, deleteOperands, /*structured=*/false); + builder, deleteOperands); genDataExitOperations( - builder, detachOperands, /*structured=*/false); + builder, detachOperands); } template @@ -3765,16 +3693,14 @@ genACCUpdateOp(Fortran::lower::AbstractConverter &converter, std::get_if(&clause.u)) { genDataOperandOperations( hostClause->v, converter, semanticsContext, stmtCtx, - updateHostOperands, mlir::acc::DataClause::acc_update_host, false, - /*implicit=*/false, asyncOperands, asyncOperandsDeviceTypes, - asyncOnlyDeviceTypes); + updateHostOperands, mlir::acc::DataClause::acc_update_host, + /*implicit=*/false); } else if (const auto *deviceClause = std::get_if(&clause.u)) { genDataOperandOperations( deviceClause->v, converter, semanticsContext, stmtCtx, - dataClauseOperands, mlir::acc::DataClause::acc_update_device, false, - /*implicit=*/false, asyncOperands, asyncOperandsDeviceTypes, - asyncOnlyDeviceTypes); + dataClauseOperands, mlir::acc::DataClause::acc_update_device, + /*implicit=*/false); } else if (std::get_if(&clause.u)) { 
ifPresent = true; } else if (const auto *selfClause = @@ -3786,9 +3712,8 @@ genACCUpdateOp(Fortran::lower::AbstractConverter &converter, assert(accObjectList && "expect AccObjectList"); genDataOperandOperations( *accObjectList, converter, semanticsContext, stmtCtx, - updateHostOperands, mlir::acc::DataClause::acc_update_self, false, - /*implicit=*/false, asyncOperands, asyncOperandsDeviceTypes, - asyncOnlyDeviceTypes); + updateHostOperands, mlir::acc::DataClause::acc_update_self, + /*implicit=*/false); } } @@ -3805,7 +3730,7 @@ genACCUpdateOp(Fortran::lower::AbstractConverter &converter, ifPresent); genDataExitOperations( - builder, updateHostOperands, /*structured=*/false); + builder, updateHostOperands); } static void @@ -3928,9 +3853,8 @@ static void createDeclareGlobalOp(mlir::OpBuilder &modBuilder, llvm::SmallVector bounds; EntryOp entryOp = createDataEntryOp( - builder, loc, addrOp.getResTy(), asFortran, bounds, - /*structured=*/false, implicit, clause, addrOp.getResTy().getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + builder, loc, addrOp.getResTy(), asFortran, bounds, implicit, clause, + addrOp.getResTy().getType()); if constexpr (std::is_same_v) builder.create( loc, mlir::acc::DeclareTokenType::get(entryOp.getContext()), @@ -3940,10 +3864,8 @@ static void createDeclareGlobalOp(mlir::OpBuilder &modBuilder, mlir::ValueRange(entryOp.getAccVar())); if constexpr (std::is_same_v) { builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); } builder.create(loc); @@ -3977,9 +3899,8 @@ static void createDeclareAllocFunc(mlir::OpBuilder &modBuilder, mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, addrOp, 
asFortranDesc, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, addrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, + addrOp.getType()); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -3990,8 +3911,7 @@ static void createDeclareAllocFunc(mlir::OpBuilder &modBuilder, addDeclareAttr(builder, boxAddrOp.getOperation(), clause); EntryOp entryOp = createDataEntryOp( builder, loc, boxAddrOp.getResult(), asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, boxAddrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, boxAddrOp.getType()); builder.create( loc, mlir::acc::DeclareTokenType::get(entryOp.getContext()), mlir::ValueRange(entryOp.getAccVar())); @@ -4035,8 +3955,7 @@ static void createDeclareDeallocFunc(mlir::OpBuilder &modBuilder, mlir::acc::GetDevicePtrOp entryOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, var.getType()); builder.create( loc, mlir::Value{}, mlir::ValueRange(entryOp.getAccVar())); @@ -4045,18 +3964,13 @@ static void createDeclareDeallocFunc(mlir::OpBuilder &modBuilder, std::is_same_v) builder.create( entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), - entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, - builder.getStringAttr(*entryOp.getName())); + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); else - builder.create( - 
entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), - entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, - builder.getStringAttr(*entryOp.getName())); + builder.create(entryOp.getLoc(), entryOp.getAccVar(), + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, + builder.getStringAttr(*entryOp.getName())); // Generate the post dealloc function. modBuilder.setInsertionPointAfter(preDeallocOp); @@ -4076,9 +3990,8 @@ static void createDeclareDeallocFunc(mlir::OpBuilder &modBuilder, mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, addrOp, asFortran, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, addrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, + addrOp.getType()); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -4216,7 +4129,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::CopyoutOp>( copyClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copy, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *createClause = @@ -4229,7 +4142,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, genDeclareDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_create, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); createEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *presentClause = 
@@ -4240,7 +4153,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::DeleteOp>( presentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_present, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); presentEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyinClause = @@ -4266,7 +4179,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::CopyoutOp>( accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copyout, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); copyoutEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *devicePtrClause = @@ -4276,14 +4189,14 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::DevicePtrOp>( devicePtrClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_deviceptr, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); } else if (const auto *linkClause = std::get_if(&clause.u)) { genDeclareDataOperandOperations( linkClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_declare_link, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); } else if (const auto *deviceResidentClause = std::get_if( &clause.u)) { @@ -4293,7 +4206,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, deviceResidentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_declare_device_resident, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); deviceResidentEntryOperands.append( dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else { @@ -4341,18 +4254,18 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, } 
genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands); genDataExitOperations( - builder, deviceResidentEntryOperands, /*structured=*/true); + mlir::acc::DeleteOp>(builder, + deviceResidentEntryOperands); genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands); }); } @@ -4702,12 +4615,11 @@ genACC(Fortran::lower::AbstractConverter &converter, if (modifier && (*modifier).v == Fortran::parser::AccDataModifier::Modifier::ReadOnly) dataClause = mlir::acc::DataClause::acc_cache_readonly; - genDataOperandOperations( - accObjectList, converter, semanticsContext, stmtCtx, cacheOperands, - dataClause, - /*structured=*/true, /*implicit=*/false, - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}, - /*setDeclareAttr*/ false); + genDataOperandOperations(accObjectList, converter, + semanticsContext, stmtCtx, + cacheOperands, dataClause, + /*implicit=*/false, + /*setDeclareAttr*/ false); loopOp.getCacheOperandsMutable().append(cacheOperands); } else { llvm::report_fatal_error( diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h index ff5845343313c..e053e3d2bbcfc 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h @@ -117,19 +117,19 @@ mlir::Value getVarPtrPtr(mlir::Operation *accDataClauseOp); /// Returns an empty vector if there are no bounds. mlir::SmallVector getBounds(mlir::Operation *accDataClauseOp); -/// Used to obtain `async` operands from an acc data clause operation. 
+/// Used to obtain `async` operands from an acc operation. /// Returns an empty vector if there are no such operands. mlir::SmallVector getAsyncOperands(mlir::Operation *accDataClauseOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to -/// an acc data clause operation, that correspond to the device types -/// associated with the async clauses with an async-value. +/// an acc operation, that correspond to the device types associated with the +/// async clauses with an async-value. mlir::ArrayAttr getAsyncOperandsDeviceType(mlir::Operation *accDataClauseOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to -/// an acc data clause operation, that correspond to the device types -/// associated with the async clauses without an async-value. +/// an acc operation, that correspond to the device types associated with the +/// async clauses without an async-value. mlir::ArrayAttr getAsyncOnly(mlir::Operation *accDataClauseOp); /// Used to obtain the `name` from an acc operation. diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td index 5d5add6318e06..59b9a50144a1e 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td @@ -470,11 +470,7 @@ class OpenACC_DataEntryOp:$varPtrPtr, Variadic:$bounds, /* rank-0 to rank-{n-1} */ - Variadic:$asyncOperands, - OptionalAttr:$asyncOperandsDeviceType, - OptionalAttr:$asyncOnly, DefaultValuedAttr:$dataClause, - DefaultValuedAttr:$structured, DefaultValuedAttr:$implicit, OptionalAttr:$name)); @@ -491,63 +487,16 @@ class OpenACC_DataEntryOp(attr); - if (deviceTypeAttr.getValue() == deviceType) - return true; - } - return false; - } - /// Return the value of the async clause if present. - mlir::Value getAsyncValue() { - return getAsyncValue(mlir::acc::DeviceType::None); - } - /// Return the value of the async clause for the given device_type if - /// present. 
- mlir::Value getAsyncValue(mlir::acc::DeviceType deviceType) { - mlir::ArrayAttr deviceTypes = getAsyncOperandsDeviceTypeAttr(); - if (!deviceTypes) - return nullptr; - for (auto [attr, asyncValue] : - llvm::zip(deviceTypes, getAsyncOperands())) { - auto deviceTypeAttr = mlir::dyn_cast(attr); - if (deviceTypeAttr.getValue() == deviceType) - return asyncValue; - } - return nullptr; - } mlir::TypedValue getVarPtr() { return mlir::dyn_cast>(getVar()); } @@ -561,16 +510,13 @@ class OpenACC_DataEntryOp($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` ) `->` type($accVar) attr-dict }]; let hasVerifier = 1; let builders = [ - OpBuilder<(ins "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$var, "bool":$implicit, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ auto ptrLikeTy = ::mlir::dyn_cast<::mlir::acc::PointerLikeType>( @@ -579,14 +525,10 @@ class OpenACC_DataEntryOp, - OpBuilder<(ins "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$var, "bool":$implicit, "const ::llvm::Twine &":$name, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ @@ -596,10 +538,7 @@ class OpenACC_DataEntryOp]; @@ -829,15 +768,10 @@ def OpenACC_CacheOp : OpenACC_DataEntryOp<"cache", class OpenACC_DataExitOp traits = [], dag additionalArgs = (ins)> : OpenACC_Op]>])> { + [MemoryEffects<[MemRead]>])> { let arguments = !con(additionalArgs, (ins Variadic:$bounds, - Variadic:$asyncOperands, - OptionalAttr:$asyncOperandsDeviceType, - OptionalAttr:$asyncOnly, DefaultValuedAttr:$dataClause, - DefaultValuedAttr:$structured, DefaultValuedAttr:$implicit, OptionalAttr:$name)); @@ -846,65 +780,15 @@ class OpenACC_DataExitOp(attr); - if (deviceTypeAttr.getValue() == deviceType) - return true; - } - return false; - } - /// Return the value of the async clause if present. 
- mlir::Value getAsyncValue() { - return getAsyncValue(mlir::acc::DeviceType::None); - } - /// Return the value of the async clause for the given device_type if - /// present. - mlir::Value getAsyncValue(mlir::acc::DeviceType deviceType) { - mlir::ArrayAttr deviceTypes = getAsyncOperandsDeviceTypeAttr(); - if (!deviceTypes) - return nullptr; - for (auto [attr, asyncValue] : - llvm::zip(deviceTypes, getAsyncOperands())) { - auto deviceTypeAttr = mlir::dyn_cast(attr); - if (deviceTypeAttr.getValue() == deviceType) - return asyncValue; - } - return nullptr; - } - }]; - let hasVerifier = 1; } @@ -922,16 +806,13 @@ class OpenACC_DataExitOpWithVarPtr let assemblyFormat = [{ custom($accVar, type($accVar)) (`bounds` `(` $bounds^ `)` )? - (`async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType)^ `)`)? `to` custom($var) `:` custom(type($var), $varType) attr-dict }]; let builders = [ OpBuilder<(ins "::mlir::Value":$accVar, - "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + "::mlir::Value":$var, "bool":$implicit, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ auto ptrLikeTy = ::mlir::dyn_cast<::mlir::acc::PointerLikeType>( @@ -940,14 +821,11 @@ class OpenACC_DataExitOpWithVarPtr /*varType=*/ptrLikeTy ? ::mlir::TypeAttr::get(ptrLikeTy.getElementType()) : ::mlir::TypeAttr::get(var.getType()), - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/nullptr); }]>, OpBuilder<(ins "::mlir::Value":$accVar, - "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + "::mlir::Value":$var, "bool":$implicit, "const ::llvm::Twine &":$name, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ @@ -957,9 +835,7 @@ class OpenACC_DataExitOpWithVarPtr /*varType=*/ptrLikeTy ? 
::mlir::TypeAttr::get(ptrLikeTy.getElementType()) : ::mlir::TypeAttr::get(var.getType()), - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/$_builder.getStringAttr(name)); }]>]; @@ -983,31 +859,23 @@ class OpenACC_DataExitOpNoVarPtr : let assemblyFormat = [{ custom($accVar, type($accVar)) (`bounds` `(` $bounds^ `)` )? - (`async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType)^ `)`)? attr-dict }]; let builders = [ - OpBuilder<(ins "::mlir::Value":$accVar, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$accVar, "bool":$implicit, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ build($_builder, $_state, accVar, - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/nullptr); }]>, - OpBuilder<(ins "::mlir::Value":$accVar, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$accVar, "bool":$implicit, "const ::llvm::Twine &":$name, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ build($_builder, $_state, accVar, - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/$_builder.getStringAttr(name)); }]> @@ -1027,7 +895,7 @@ def OpenACC_CopyoutOp : OpenACC_DataExitOpWithVarPtr<"copyout", "mlir::acc::DataClause::acc_copyout"> { let summary = "Represents acc copyout semantics - reverse of copyin."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit # [{ + let 
extraClassDeclaration = extraClassDeclarationDataExit # [{ /// Check if this is a copyout with zero modifier. bool isCopyoutZero(); }]; @@ -1039,7 +907,7 @@ def OpenACC_CopyoutOp : OpenACC_DataExitOpWithVarPtr<"copyout", def OpenACC_DeleteOp : OpenACC_DataExitOpNoVarPtr<"delete", "mlir::acc::DataClause::acc_delete"> { let summary = "Represents acc delete semantics - reverse of create."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit; + let extraClassDeclaration = extraClassDeclarationDataExit; } //===----------------------------------------------------------------------===// @@ -1048,7 +916,7 @@ def OpenACC_DeleteOp : OpenACC_DataExitOpNoVarPtr<"delete", def OpenACC_DetachOp : OpenACC_DataExitOpNoVarPtr<"detach", "mlir::acc::DataClause::acc_detach"> { let summary = "Represents acc detach semantics - reverse of attach."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit; + let extraClassDeclaration = extraClassDeclarationDataExit; } //===----------------------------------------------------------------------===// @@ -1057,7 +925,7 @@ def OpenACC_DetachOp : OpenACC_DataExitOpNoVarPtr<"detach", def OpenACC_UpdateHostOp : OpenACC_DataExitOpWithVarPtr<"update_host", "mlir::acc::DataClause::acc_update_host"> { let summary = "Represents acc update host semantics."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit # [{ + let extraClassDeclaration = extraClassDeclarationDataExit # [{ /// Check if this is an acc update self. bool isSelf() { return getDataClause() == acc::DataClause::acc_update_self; @@ -1439,8 +1307,8 @@ def OpenACC_ParallelOp : OpenACC_Op<"parallel", ( `combined` `(` `loop` `)` $combined^)? 
oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1581,8 +1449,8 @@ def OpenACC_SerialOp : OpenACC_Op<"serial", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1750,8 +1618,8 @@ def OpenACC_KernelsOp : OpenACC_Op<"kernels", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `num_gangs` `(` custom($numGangs, type($numGangs), $numGangsDeviceType, $numGangsSegments) `)` | `num_workers` `(` custom($numWorkers, @@ -1799,6 +1667,9 @@ def OpenACC_DataOp : OpenACC_Op<"data", `async` and `wait` operands are supported with `device_type` information. They should only be accessed by the extra provided getters. If modified, the corresponding `device_type` attributes must be modified as well. + + The `asyncOnly` operand is a list of device_type's for which async clause + does not specify a value (default is acc_async_noval - OpenACC 3.3 2.16.1). 
}]; @@ -1870,8 +1741,8 @@ def OpenACC_DataOp : OpenACC_Op<"data", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, @@ -1931,6 +1802,7 @@ def OpenACC_EnterDataOp : OpenACC_Op<"enter_data", Value getDataOperand(unsigned i); }]; + // TODO: Show $async and $wait. let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` @@ -1983,6 +1855,7 @@ def OpenACC_ExitDataOp : OpenACC_Op<"exit_data", Value getDataOperand(unsigned i); }]; + // TODO: Show $async and $wait. let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` @@ -2853,7 +2726,7 @@ def OpenACC_UpdateOp : OpenACC_Op<"update", let arguments = (ins Optional:$ifCond, Variadic:$asyncOperands, OptionalAttr:$asyncOperandsDeviceType, - OptionalAttr:$async, + OptionalAttr:$asyncOnly, Variadic:$waitOperands, OptionalAttr:$waitOperandsSegments, OptionalAttr:$waitOperandsDeviceType, @@ -2901,9 +2774,8 @@ def OpenACC_UpdateOp : OpenACC_Op<"update", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `` custom( - $asyncOperands, type($asyncOperands), - $asyncOperandsDeviceType, $async) + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, $waitOnly) @@ -2946,6 +2818,7 @@ def OpenACC_WaitOp : OpenACC_Op<"wait", [AttrSizedOperandSegments]> { UnitAttr:$async, Optional:$ifCond); + // TODO: Show $async. let assemblyFormat = [{ ( `(` $waitOperands^ `:` type($waitOperands) `)` )? 
oilist(`async` `(` $asyncOperand `:` type($asyncOperand) `)` diff --git a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp index 7eb72d433c972..ee00acecb17b9 100644 --- a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp +++ b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp @@ -3505,7 +3505,7 @@ bool UpdateOp::hasAsyncOnly() { } bool UpdateOp::hasAsyncOnly(mlir::acc::DeviceType deviceType) { - return hasDeviceType(getAsync(), deviceType); + return hasDeviceType(getAsyncOnly(), deviceType); } mlir::Value UpdateOp::getAsyncValue() { @@ -3659,32 +3659,30 @@ mlir::acc::getBounds(mlir::Operation *accDataClauseOp) { } mlir::SmallVector -mlir::acc::getAsyncOperands(mlir::Operation *accDataClauseOp) { +mlir::acc::getAsyncOperands(mlir::Operation *accOp) { return llvm::TypeSwitch>( - accDataClauseOp) - .Case([&](auto dataClause) { - return mlir::SmallVector( - dataClause.getAsyncOperands().begin(), - dataClause.getAsyncOperands().end()); - }) + accOp) + .Case( + [&](auto op) { + return mlir::SmallVector(op.getAsyncOperands().begin(), + op.getAsyncOperands().end()); + }) .Default([&](mlir::Operation *) { return mlir::SmallVector(); }); } -mlir::ArrayAttr -mlir::acc::getAsyncOperandsDeviceType(mlir::Operation *accDataClauseOp) { - return llvm::TypeSwitch(accDataClauseOp) - .Case([&](auto dataClause) { - return dataClause.getAsyncOperandsDeviceTypeAttr(); - }) +mlir::ArrayAttr mlir::acc::getAsyncOperandsDeviceType(mlir::Operation *accOp) { + return llvm::TypeSwitch(accOp) + .Case( + [&](auto op) { return op.getAsyncOperandsDeviceTypeAttr(); }) .Default([&](mlir::Operation *) { return mlir::ArrayAttr{}; }); } -mlir::ArrayAttr mlir::acc::getAsyncOnly(mlir::Operation *accDataClauseOp) { - return llvm::TypeSwitch(accDataClauseOp) - .Case( - [&](auto dataClause) { return dataClause.getAsyncOnlyAttr(); }) +mlir::ArrayAttr mlir::acc::getAsyncOnly(mlir::Operation *accOp) { + return llvm::TypeSwitch(accOp) + .Case( + [&](auto op) { return op.getAsyncOnlyAttr(); }) 
.Default([&](mlir::Operation *) { return mlir::ArrayAttr{}; }); } >From 882ffbd6d7f9f8ddead4ce1f4686e5d0d10bb7ca Mon Sep 17 00:00:00 2001 From: Kazuaki Matsumura Date: Tue, 13 May 2025 05:21:48 -0700 Subject: [PATCH 2/2] [acc] accDataClauseOp -> accOp --- mlir/include/mlir/Dialect/OpenACC/OpenACC.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h index e053e3d2bbcfc..f667a6786189b 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h @@ -120,17 +120,17 @@ mlir::SmallVector getBounds(mlir::Operation *accDataClauseOp); /// Used to obtain `async` operands from an acc operation. /// Returns an empty vector if there are no such operands. mlir::SmallVector -getAsyncOperands(mlir::Operation *accDataClauseOp); +getAsyncOperands(mlir::Operation *accOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to /// an acc operation, that correspond to the device types associated with the /// async clauses with an async-value. -mlir::ArrayAttr getAsyncOperandsDeviceType(mlir::Operation *accDataClauseOp); +mlir::ArrayAttr getAsyncOperandsDeviceType(mlir::Operation *accOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to /// an acc operation, that correspond to the device types associated with the /// async clauses without an async-value. -mlir::ArrayAttr getAsyncOnly(mlir::Operation *accDataClauseOp); +mlir::ArrayAttr getAsyncOnly(mlir::Operation *accOp); /// Used to obtain the `name` from an acc operation. 
std::optional getVarName(mlir::Operation *accOp); From flang-commits at lists.llvm.org Tue May 13 08:54:26 2025 From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits) Date: Tue, 13 May 2025 08:54:26 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <68236b32.a70a0220.16568f.d59a@mx.google.com> ================ @@ -523,6 +537,7 @@ def CC_AArch64_Preserve_None : CallingConv<[ // We can pass arguments in all general registers, except: // - X8, used for sret // - X16/X17, used by the linker as IP0/IP1 + // - X15, the nest register and used by Windows for stack allocation ---------------- vtjnash wrote: For the X9 thing, I just wanted to note (unrelated to this PR though) that in my reading of the code, it looks like findScratchNonCalleeSaveRegister would next pick X16 for CC_AArch64_Preserve_None. That register seems to actually be available only if there are no calls while it is active (it is reserved by the linker as IP0). I think the only suspect use of that call there is to preserve the value of X0 around the call to __arm_get_current_vg (CallingConv::AArch64_SME_ABI_Support_Routines_PreserveMost_From_X1), which means the argument value being passed in X0 with preserve_none will sometimes get smashed at runtime by the linker, if the LLVM optimizer decides to insert that function call to spill SVE state (https://llvm.org/docs/AArch64SME.html#compiler-inserted-streaming-mode-changes). 
https://github.com/llvm/llvm-project/pull/126743 From flang-commits at lists.llvm.org Tue May 13 08:59:53 2025 From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits) Date: Tue, 13 May 2025 08:59:53 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <68236c79.050a0220.212cdd.b4eb@mx.google.com> https://github.com/vtjnash updated https://github.com/llvm/llvm-project/pull/126743 >From 4053196cdb8d87de6d3c5a47f2ffca6459b45680 Mon Sep 17 00:00:00 2001 From: Jameson Nash Date: Mon, 10 Feb 2025 19:21:38 +0000 Subject: [PATCH] [AArch64] fix trampoline implementation: use X15 AAPCS64 reserves any of X9-X15 for this purpose, and says not to use any of X16-X18 (like GCC chose). Simply choosing a different register fixes the problem of this being broken on any platform that actually follows the platform ABI. As a side benefit, also generate slightly better code in the trampoline itself by following the XCore implementation instead of PPC (although following the RISCV might have been slightly more readable in hindsight). 
--- compiler-rt/lib/builtins/README.txt | 5 - compiler-rt/lib/builtins/trampoline_setup.c | 42 --- .../builtins/Unit/trampoline_setup_test.c | 2 +- .../lib/Optimizer/CodeGen/BoxedProcedure.cpp | 8 +- flang/test/Fir/boxproc.fir | 4 +- .../AArch64/AArch64CallingConvention.td | 25 +- .../Target/AArch64/AArch64FrameLowering.cpp | 28 ++ .../Target/AArch64/AArch64ISelLowering.cpp | 97 ++++--- llvm/lib/TargetParser/Triple.cpp | 2 - llvm/test/CodeGen/AArch64/nest-register.ll | 16 +- .../AArch64/statepoint-call-lowering.ll | 2 +- llvm/test/CodeGen/AArch64/trampoline.ll | 257 +++++++++++++++++- llvm/test/CodeGen/AArch64/win64cc-x18.ll | 27 +- .../CodeGen/AArch64/zero-call-used-regs.ll | 16 +- 14 files changed, 385 insertions(+), 146 deletions(-) diff --git a/compiler-rt/lib/builtins/README.txt b/compiler-rt/lib/builtins/README.txt index 19f26c92a0f94..2d213d95f333a 100644 --- a/compiler-rt/lib/builtins/README.txt +++ b/compiler-rt/lib/builtins/README.txt @@ -272,11 +272,6 @@ switch32 switch8 switchu8 -// This function generates a custom trampoline function with the specific -// realFunc and localsPtr values. -void __trampoline_setup(uint32_t* trampOnStack, int trampSizeAllocated, - const void* realFunc, void* localsPtr); - // There is no C interface to the *_vfp_d8_d15_regs functions. There are // called in the prolog and epilog of Thumb1 functions. 
When the C++ ABI use // SJLJ for exceptions, each function with a catch clause or destructors needs diff --git a/compiler-rt/lib/builtins/trampoline_setup.c b/compiler-rt/lib/builtins/trampoline_setup.c index 830e25e4c0303..844eb27944142 100644 --- a/compiler-rt/lib/builtins/trampoline_setup.c +++ b/compiler-rt/lib/builtins/trampoline_setup.c @@ -41,45 +41,3 @@ COMPILER_RT_ABI void __trampoline_setup(uint32_t *trampOnStack, __clear_cache(trampOnStack, &trampOnStack[10]); } #endif // __powerpc__ && !defined(__powerpc64__) - -// The AArch64 compiler generates calls to __trampoline_setup() when creating -// trampoline functions on the stack for use with nested functions. -// This function creates a custom 36-byte trampoline function on the stack -// which loads x18 with a pointer to the outer function's locals -// and then jumps to the target nested function. -// Note: x18 is a reserved platform register on Windows and macOS. - -#if defined(__aarch64__) && defined(__ELF__) -COMPILER_RT_ABI void __trampoline_setup(uint32_t *trampOnStack, - int trampSizeAllocated, - const void *realFunc, void *localsPtr) { - // This should never happen, but if compiler did not allocate - // enough space on stack for the trampoline, abort. - if (trampSizeAllocated < 36) - compilerrt_abort(); - - // create trampoline - // Load realFunc into x17. mov/movk 16 bits at a time. 
- trampOnStack[0] = - 0xd2800000u | ((((uint64_t)realFunc >> 0) & 0xffffu) << 5) | 0x11; - trampOnStack[1] = - 0xf2a00000u | ((((uint64_t)realFunc >> 16) & 0xffffu) << 5) | 0x11; - trampOnStack[2] = - 0xf2c00000u | ((((uint64_t)realFunc >> 32) & 0xffffu) << 5) | 0x11; - trampOnStack[3] = - 0xf2e00000u | ((((uint64_t)realFunc >> 48) & 0xffffu) << 5) | 0x11; - // Load localsPtr into x18 - trampOnStack[4] = - 0xd2800000u | ((((uint64_t)localsPtr >> 0) & 0xffffu) << 5) | 0x12; - trampOnStack[5] = - 0xf2a00000u | ((((uint64_t)localsPtr >> 16) & 0xffffu) << 5) | 0x12; - trampOnStack[6] = - 0xf2c00000u | ((((uint64_t)localsPtr >> 32) & 0xffffu) << 5) | 0x12; - trampOnStack[7] = - 0xf2e00000u | ((((uint64_t)localsPtr >> 48) & 0xffffu) << 5) | 0x12; - trampOnStack[8] = 0xd61f0220; // br x17 - - // Clear instruction cache. - __clear_cache(trampOnStack, &trampOnStack[9]); -} -#endif // defined(__aarch64__) && !defined(__APPLE__) && !defined(_WIN64) diff --git a/compiler-rt/test/builtins/Unit/trampoline_setup_test.c b/compiler-rt/test/builtins/Unit/trampoline_setup_test.c index d51d35acaa02f..da115fe764271 100644 --- a/compiler-rt/test/builtins/Unit/trampoline_setup_test.c +++ b/compiler-rt/test/builtins/Unit/trampoline_setup_test.c @@ -7,7 +7,7 @@ /* * Tests nested functions - * The ppc and aarch64 compilers generates a call to __trampoline_setup + * The ppc compiler generates a call to __trampoline_setup * The i386 and x86_64 compilers generate a call to ___enable_execute_stack */ diff --git a/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp b/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp index 82b11ad7db32a..69bdb48146a54 100644 --- a/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp +++ b/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp @@ -274,12 +274,12 @@ class BoxedProcedurePass auto loc = embox.getLoc(); mlir::Type i8Ty = builder.getI8Type(); mlir::Type i8Ptr = builder.getRefType(i8Ty); - // For AArch64, PPC32 and PPC64, the thunk is populated by a call to + // For PPC32 
and PPC64, the thunk is populated by a call to // __trampoline_setup, which is defined in // compiler-rt/lib/builtins/trampoline_setup.c and requires the - // thunk size greater than 32 bytes. For RISCV and x86_64, the - // thunk setup doesn't go through __trampoline_setup and fits in 32 - // bytes. + // thunk size greater than 32 bytes. For AArch64, RISCV and x86_64, + // the thunk setup doesn't go through __trampoline_setup and fits in + // 32 bytes. fir::SequenceType::Extent thunkSize = triple.getTrampolineSize(); mlir::Type buffTy = SequenceType::get({thunkSize}, i8Ty); auto buffer = builder.create(loc, buffTy); diff --git a/flang/test/Fir/boxproc.fir b/flang/test/Fir/boxproc.fir index e99dfd0b92afd..9e5e41a94069c 100644 --- a/flang/test/Fir/boxproc.fir +++ b/flang/test/Fir/boxproc.fir @@ -3,7 +3,7 @@ // RUN: %if powerpc-registered-target %{tco --target=powerpc64le-unknown-linux-gnu %s | FileCheck %s --check-prefixes=CHECK,CHECK-PPC %} // CHECK-LABEL: define void @_QPtest_proc_dummy() -// CHECK-AARCH64: %[[VAL_3:.*]] = alloca [36 x i8], i64 1, align 1 +// CHECK-AARCH64: %[[VAL_3:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-X86: %[[VAL_3:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-PPC: %[[VAL_3:.*]] = alloca [4{{[0-8]+}} x i8], i64 1, align 1 // CHECK: %[[VAL_1:.*]] = alloca { ptr }, i64 1, align 8 @@ -63,7 +63,7 @@ func.func @_QPtest_proc_dummy_other(%arg0: !fir.boxproc<() -> ()>) { } // CHECK-LABEL: define void @_QPtest_proc_dummy_char() -// CHECK-AARCH64: %[[VAL_20:.*]] = alloca [36 x i8], i64 1, align 1 +// CHECK-AARCH64: %[[VAL_20:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-X86: %[[VAL_20:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-PPC: %[[VAL_20:.*]] = alloca [4{{[0-8]+}} x i8], i64 1, align 1 // CHECK: %[[VAL_2:.*]] = alloca { { ptr, i64 } }, i64 1, align 8 diff --git a/llvm/lib/Target/AArch64/AArch64CallingConvention.td b/llvm/lib/Target/AArch64/AArch64CallingConvention.td index 7cca6d9bc6b9c..e973269545911 100644 --- 
a/llvm/lib/Target/AArch64/AArch64CallingConvention.td +++ b/llvm/lib/Target/AArch64/AArch64CallingConvention.td @@ -28,6 +28,12 @@ class CCIfSubtarget //===----------------------------------------------------------------------===// defvar AArch64_Common = [ + // The 'nest' parameter, if any, is passed in X15. + // The previous register used here (X18) is also defined to be unavailable + // for this purpose, while all of X9-X15 were defined to be free for LLVM to + // use for this, so use X15 (which LLVM often already clobbers anyways). + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32], CCBitConvertToType>, @@ -117,13 +123,7 @@ defvar AArch64_Common = [ ]; let Entry = 1 in -def CC_AArch64_AAPCS : CallingConv>], - AArch64_Common -)>; +def CC_AArch64_AAPCS : CallingConv; let Entry = 1 in def RetCC_AArch64_AAPCS : CallingConv<[ @@ -177,6 +177,8 @@ def CC_AArch64_Win64_VarArg : CallingConv<[ // a stack layout compatible with the x64 calling convention. let Entry = 1 in def CC_AArch64_Arm64EC_VarArg : CallingConv<[ + CCIfNest>, + // Convert small floating-point values to integer. CCIfType<[f16, bf16], CCBitConvertToType>, CCIfType<[f32], CCBitConvertToType>, @@ -353,6 +355,8 @@ def RetCC_AArch64_Arm64EC_CFGuard_Check : CallingConv<[ // + Stack slots are sized as needed rather than being at least 64-bit. let Entry = 1 in def CC_AArch64_DarwinPCS : CallingConv<[ + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32, f128], CCBitConvertToType>, @@ -427,6 +431,8 @@ def CC_AArch64_DarwinPCS : CallingConv<[ let Entry = 1 in def CC_AArch64_DarwinPCS_VarArg : CallingConv<[ + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32, f128], CCBitConvertToType>, @@ -450,6 +456,8 @@ def CC_AArch64_DarwinPCS_VarArg : CallingConv<[ // same as the normal Darwin VarArgs handling. 
let Entry = 1 in def CC_AArch64_DarwinPCS_ILP32_VarArg : CallingConv<[ + CCIfNest>, + CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32, f128], CCBitConvertToType>, @@ -494,6 +502,8 @@ def CC_AArch64_DarwinPCS_ILP32_VarArg : CallingConv<[ let Entry = 1 in def CC_AArch64_GHC : CallingConv<[ + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, // Handle all vector types as either f64 or v2f64. @@ -522,6 +532,7 @@ def CC_AArch64_Preserve_None : CallingConv<[ // We can pass arguments in all general registers, except: // - X8, used for sret + // - X15 (on Windows), used as a temporary register in the prologue when allocating call frames // - X16/X17, used by the linker as IP0/IP1 // - X18, the platform register // - X19, the base pointer diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp index 040662a5f11dd..96f4451182391 100644 --- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp @@ -1982,6 +1982,27 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF, : 0; if (windowsRequiresStackProbe(MF, NumBytes + RealignmentPadding)) { + // Find an available register to spill the value of X15 to, if X15 is being + // used already for nest. 
+ unsigned X15Scratch = AArch64::NoRegister; + const AArch64Subtarget &STI = MF.getSubtarget(); + if (llvm::any_of(MBB.liveins(), + [&STI](const MachineBasicBlock::RegisterMaskPair &LiveIn) { + return STI.getRegisterInfo()->isSuperOrSubRegisterEq( + AArch64::X15, LiveIn.PhysReg); + })) { + X15Scratch = findScratchNonCalleeSaveRegister(&MBB); + assert(X15Scratch != AArch64::NoRegister); +#ifndef NDEBUG + LiveRegs.removeReg(AArch64::X15); // ignore X15 since we restore it +#endif + BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXrr), X15Scratch) + .addReg(AArch64::XZR) + .addReg(AArch64::X15, RegState::Undef) + .addReg(AArch64::X15, RegState::Implicit) + .setMIFlag(MachineInstr::FrameSetup); + } + uint64_t NumWords = (NumBytes + RealignmentPadding) >> 4; if (NeedsWinCFI) { HasWinCFI = true; @@ -2104,6 +2125,13 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF, // we've set a frame pointer and already finished the SEH prologue. assert(!NeedsWinCFI); } + if (X15Scratch != AArch64::NoRegister) { + BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXrr), AArch64::X15) + .addReg(AArch64::XZR) + .addReg(X15Scratch, RegState::Undef) + .addReg(X15Scratch, RegState::Implicit) + .setMIFlag(MachineInstr::FrameSetup); + } } StackOffset SVECalleeSavesSize = {}, SVELocalsSize = SVEStackSize; diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp index ad48be4531d3b..f9451d81ae7ae 100644 --- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp @@ -7339,59 +7339,80 @@ static SDValue LowerFLDEXP(SDValue Op, SelectionDAG &DAG) { SDValue AArch64TargetLowering::LowerADJUST_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const { - // Note: x18 cannot be used for the Nest parameter on Windows and macOS. 
- if (Subtarget->isTargetDarwin() || Subtarget->isTargetWindows()) - report_fatal_error( - "ADJUST_TRAMPOLINE operation is only supported on Linux."); - return Op.getOperand(0); } SDValue AArch64TargetLowering::LowerINIT_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const { - - // Note: x18 cannot be used for the Nest parameter on Windows and macOS. - if (Subtarget->isTargetDarwin() || Subtarget->isTargetWindows()) - report_fatal_error("INIT_TRAMPOLINE operation is only supported on Linux."); - SDValue Chain = Op.getOperand(0); - SDValue Trmp = Op.getOperand(1); // trampoline + SDValue Trmp = Op.getOperand(1); // trampoline, >=32 bytes SDValue FPtr = Op.getOperand(2); // nested function SDValue Nest = Op.getOperand(3); // 'nest' parameter value - SDLoc dl(Op); - EVT PtrVT = getPointerTy(DAG.getDataLayout()); - Type *IntPtrTy = DAG.getDataLayout().getIntPtrType(*DAG.getContext()); + const Value *TrmpAddr = cast(Op.getOperand(4))->getValue(); - TargetLowering::ArgListTy Args; - TargetLowering::ArgListEntry Entry; + // ldr NestReg, .+16 + // ldr x17, .+20 + // br x17 + // .word 0 + // .nest: .qword nest + // .fptr: .qword fptr + SDValue OutChains[5]; - Entry.Ty = IntPtrTy; - Entry.Node = Trmp; - Args.push_back(Entry); + const Function *Func = + cast(cast(Op.getOperand(5))->getValue()); + CallingConv::ID CC = Func->getCallingConv(); + unsigned NestReg; - if (auto *FI = dyn_cast(Trmp.getNode())) { - MachineFunction &MF = DAG.getMachineFunction(); - MachineFrameInfo &MFI = MF.getFrameInfo(); - Entry.Node = - DAG.getConstant(MFI.getObjectSize(FI->getIndex()), dl, MVT::i64); - } else - Entry.Node = DAG.getConstant(36, dl, MVT::i64); + switch (CC) { + default: + NestReg = 0x0f; // X15 + case CallingConv::ARM64EC_Thunk_Native: + case CallingConv::ARM64EC_Thunk_X64: + // Must be kept in sync with AArch64CallingConv.td + NestReg = 0x04; // X4 + break; + } - Args.push_back(Entry); - Entry.Node = FPtr; - Args.push_back(Entry); - Entry.Node = Nest; - Args.push_back(Entry); + const 
char FptrReg = 0x11; // X17 - // Lower to a call to __trampoline_setup(Trmp, TrampSize, FPtr, ctx_reg) - TargetLowering::CallLoweringInfo CLI(DAG); - CLI.setDebugLoc(dl).setChain(Chain).setLibCallee( - CallingConv::C, Type::getVoidTy(*DAG.getContext()), - DAG.getExternalSymbol("__trampoline_setup", PtrVT), std::move(Args)); + SDValue Addr = Trmp; - std::pair CallResult = LowerCallTo(CLI); - return CallResult.second; + SDLoc dl(Op); + OutChains[0] = DAG.getStore( + Chain, dl, DAG.getConstant(0x58000080u | NestReg, dl, MVT::i32), Addr, + MachinePointerInfo(TrmpAddr)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(4, dl, MVT::i64)); + OutChains[1] = DAG.getStore( + Chain, dl, DAG.getConstant(0x580000b0u | FptrReg, dl, MVT::i32), Addr, + MachinePointerInfo(TrmpAddr, 4)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(8, dl, MVT::i64)); + OutChains[2] = + DAG.getStore(Chain, dl, DAG.getConstant(0xd61f0220u, dl, MVT::i32), Addr, + MachinePointerInfo(TrmpAddr, 8)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(16, dl, MVT::i64)); + OutChains[3] = + DAG.getStore(Chain, dl, Nest, Addr, MachinePointerInfo(TrmpAddr, 16)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(24, dl, MVT::i64)); + OutChains[4] = + DAG.getStore(Chain, dl, FPtr, Addr, MachinePointerInfo(TrmpAddr, 24)); + + SDValue StoreToken = DAG.getNode(ISD::TokenFactor, dl, MVT::Other, OutChains); + + SDValue EndOfTrmp = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(12, dl, MVT::i64)); + + // Call clear cache on the trampoline instructions. 
+ return DAG.getNode(ISD::CLEAR_CACHE, dl, MVT::Other, StoreToken, Trmp, + EndOfTrmp); } SDValue AArch64TargetLowering::LowerOperation(SDValue Op, diff --git a/llvm/lib/TargetParser/Triple.cpp b/llvm/lib/TargetParser/Triple.cpp index 6a559ff023caa..aa1251f3b9485 100644 --- a/llvm/lib/TargetParser/Triple.cpp +++ b/llvm/lib/TargetParser/Triple.cpp @@ -1732,8 +1732,6 @@ unsigned Triple::getTrampolineSize() const { if (isOSLinux()) return 48; break; - case Triple::aarch64: - return 36; } return 32; } diff --git a/llvm/test/CodeGen/AArch64/nest-register.ll b/llvm/test/CodeGen/AArch64/nest-register.ll index 1e1c1b044bab6..2e94dfba1fa52 100644 --- a/llvm/test/CodeGen/AArch64/nest-register.ll +++ b/llvm/test/CodeGen/AArch64/nest-register.ll @@ -1,3 +1,4 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 ; RUN: llc -disable-post-ra -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu | FileCheck %s ; Tests that the 'nest' parameter attribute causes the relevant parameter to be @@ -5,18 +6,21 @@ define ptr @nest_receiver(ptr nest %arg) nounwind { ; CHECK-LABEL: nest_receiver: -; CHECK-NEXT: // %bb.0: -; CHECK-NEXT: mov x0, x18 -; CHECK-NEXT: ret +; CHECK: // %bb.0: +; CHECK-NEXT: mov x0, x15 +; CHECK-NEXT: ret ret ptr %arg } define ptr @nest_caller(ptr %arg) nounwind { ; CHECK-LABEL: nest_caller: -; CHECK: mov x18, x0 -; CHECK-NEXT: bl nest_receiver -; CHECK: ret +; CHECK: // %bb.0: +; CHECK-NEXT: str x30, [sp, #-16]! 
// 8-byte Folded Spill +; CHECK-NEXT: mov x15, x0 +; CHECK-NEXT: bl nest_receiver +; CHECK-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload +; CHECK-NEXT: ret %result = call ptr @nest_receiver(ptr nest %arg) ret ptr %result diff --git a/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll b/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll index 9619895c450ca..32c3eaeb9c876 100644 --- a/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll +++ b/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll @@ -207,7 +207,7 @@ define void @test_attributes(ptr byval(%struct2) %s) gc "statepoint-example" { ; CHECK-NEXT: .cfi_offset w30, -16 ; CHECK-NEXT: ldr x8, [sp, #64] ; CHECK-NEXT: ldr q0, [sp, #48] -; CHECK-NEXT: mov x18, xzr +; CHECK-NEXT: mov x15, xzr ; CHECK-NEXT: mov w0, #42 // =0x2a ; CHECK-NEXT: mov w1, #17 // =0x11 ; CHECK-NEXT: str x8, [sp, #16] diff --git a/llvm/test/CodeGen/AArch64/trampoline.ll b/llvm/test/CodeGen/AArch64/trampoline.ll index 30ac2aa283b3e..d9016b02a0f80 100644 --- a/llvm/test/CodeGen/AArch64/trampoline.ll +++ b/llvm/test/CodeGen/AArch64/trampoline.ll @@ -1,32 +1,265 @@ -; RUN: llc -mtriple=aarch64-- < %s | FileCheck %s +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: llc -mtriple=aarch64-linux-gnu < %s | FileCheck %s --check-prefixes=CHECK-LINUX +; RUN: llc -mtriple=aarch64-none-eabi < %s | FileCheck %s --check-prefixes=CHECK-LINUX +; RUN: llc -mtriple=aarch64-pc-windows-msvc < %s | FileCheck %s --check-prefix=CHECK-PC +; RUN: llc -mtriple=aarch64-apple-darwin < %s | FileCheck %s --check-prefixes=CHECK-APPLE @trampg = internal global [36 x i8] zeroinitializer, align 8 declare void @llvm.init.trampoline(ptr, ptr, ptr); declare ptr @llvm.adjust.trampoline(ptr); -define i64 @f(ptr nest %c, i64 %x, i64 %y) { - %sum = add i64 %x, %y - ret i64 %sum +define ptr @f(ptr nest %x, i64 %y) { +; CHECK-LINUX-LABEL: f: +; CHECK-LINUX: // %bb.0: +; CHECK-LINUX-NEXT: str x29, [sp, #-16]! 
// 8-byte Folded Spill +; CHECK-LINUX-NEXT: sub sp, sp, #237, lsl #12 // =970752 +; CHECK-LINUX-NEXT: sub sp, sp, #3264 +; CHECK-LINUX-NEXT: .cfi_def_cfa_offset 974032 +; CHECK-LINUX-NEXT: .cfi_offset w29, -16 +; CHECK-LINUX-NEXT: add x0, x15, x0 +; CHECK-LINUX-NEXT: add sp, sp, #237, lsl #12 // =970752 +; CHECK-LINUX-NEXT: add sp, sp, #3264 +; CHECK-LINUX-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload +; CHECK-LINUX-NEXT: ret +; +; CHECK-PC-LABEL: f: +; CHECK-PC: .seh_proc f +; CHECK-PC-NEXT: // %bb.0: +; CHECK-PC-NEXT: stp x29, x30, [sp, #-16]! // 16-byte Folded Spill +; CHECK-PC-NEXT: .seh_save_fplr_x 16 +; CHECK-PC-NEXT: mov x9, x15 +; CHECK-PC-NEXT: mov x15, #60876 // =0xedcc +; CHECK-PC-NEXT: .seh_nop +; CHECK-PC-NEXT: bl __chkstk +; CHECK-PC-NEXT: .seh_nop +; CHECK-PC-NEXT: sub sp, sp, x15, lsl #4 +; CHECK-PC-NEXT: .seh_stackalloc 974016 +; CHECK-PC-NEXT: mov x15, x9 +; CHECK-PC-NEXT: .seh_endprologue +; CHECK-PC-NEXT: add x0, x15, x0 +; CHECK-PC-NEXT: .seh_startepilogue +; CHECK-PC-NEXT: add sp, sp, #237, lsl #12 // =970752 +; CHECK-PC-NEXT: .seh_stackalloc 970752 +; CHECK-PC-NEXT: add sp, sp, #3264 +; CHECK-PC-NEXT: .seh_stackalloc 3264 +; CHECK-PC-NEXT: ldp x29, x30, [sp], #16 // 16-byte Folded Reload +; CHECK-PC-NEXT: .seh_save_fplr_x 16 +; CHECK-PC-NEXT: .seh_endepilogue +; CHECK-PC-NEXT: ret +; CHECK-PC-NEXT: .seh_endfunclet +; CHECK-PC-NEXT: .seh_endproc +; +; CHECK-APPLE-LABEL: f: +; CHECK-APPLE: ; %bb.0: +; CHECK-APPLE-NEXT: stp x28, x27, [sp, #-16]! 
; 16-byte Folded Spill +; CHECK-APPLE-NEXT: sub sp, sp, #237, lsl #12 ; =970752 +; CHECK-APPLE-NEXT: sub sp, sp, #3264 +; CHECK-APPLE-NEXT: .cfi_def_cfa_offset 974032 +; CHECK-APPLE-NEXT: .cfi_offset w27, -8 +; CHECK-APPLE-NEXT: .cfi_offset w28, -16 +; CHECK-APPLE-NEXT: add x0, x15, x0 +; CHECK-APPLE-NEXT: add sp, sp, #237, lsl #12 ; =970752 +; CHECK-APPLE-NEXT: add sp, sp, #3264 +; CHECK-APPLE-NEXT: ldp x28, x27, [sp], #16 ; 16-byte Folded Reload +; CHECK-APPLE-NEXT: ret + %chkstack = alloca [u0xedcba x i8] + %sum = getelementptr i8, ptr %x, i64 %y + ret ptr %sum } define i64 @func1() { +; CHECK-LINUX-LABEL: func1: +; CHECK-LINUX: // %bb.0: +; CHECK-LINUX-NEXT: sub sp, sp, #64 +; CHECK-LINUX-NEXT: str x30, [sp, #48] // 8-byte Folded Spill +; CHECK-LINUX-NEXT: .cfi_def_cfa_offset 64 +; CHECK-LINUX-NEXT: .cfi_offset w30, -16 +; CHECK-LINUX-NEXT: adrp x8, :got:f +; CHECK-LINUX-NEXT: mov w9, #544 // =0x220 +; CHECK-LINUX-NEXT: add x0, sp, #8 +; CHECK-LINUX-NEXT: ldr x8, [x8, :got_lo12:f] +; CHECK-LINUX-NEXT: movk w9, #54815, lsl #16 +; CHECK-LINUX-NEXT: str w9, [sp, #16] +; CHECK-LINUX-NEXT: add x9, sp, #56 +; CHECK-LINUX-NEXT: stp x9, x8, [sp, #24] +; CHECK-LINUX-NEXT: mov x8, #132 // =0x84 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #16 +; CHECK-LINUX-NEXT: movk x8, #177, lsl #32 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #48 +; CHECK-LINUX-NEXT: str x8, [sp, #8] +; CHECK-LINUX-NEXT: add x8, sp, #8 +; CHECK-LINUX-NEXT: add x1, x8, #12 +; CHECK-LINUX-NEXT: bl __clear_cache +; CHECK-LINUX-NEXT: ldr x30, [sp, #48] // 8-byte Folded Reload +; CHECK-LINUX-NEXT: mov x0, xzr +; CHECK-LINUX-NEXT: add sp, sp, #64 +; CHECK-LINUX-NEXT: ret +; +; CHECK-PC-LABEL: func1: +; CHECK-PC: .seh_proc func1 +; CHECK-PC-NEXT: // %bb.0: +; CHECK-PC-NEXT: sub sp, sp, #64 +; CHECK-PC-NEXT: .seh_stackalloc 64 +; CHECK-PC-NEXT: str x30, [sp, #48] // 8-byte Folded Spill +; CHECK-PC-NEXT: .seh_save_reg x30, 48 +; CHECK-PC-NEXT: .seh_endprologue +; CHECK-PC-NEXT: adrp x8, f +; CHECK-PC-NEXT: add x8, 
x8, :lo12:f +; CHECK-PC-NEXT: add x9, sp, #56 +; CHECK-PC-NEXT: stp x9, x8, [sp, #24] +; CHECK-PC-NEXT: mov w8, #544 // =0x220 +; CHECK-PC-NEXT: add x0, sp, #8 +; CHECK-PC-NEXT: movk w8, #54815, lsl #16 +; CHECK-PC-NEXT: str w8, [sp, #16] +; CHECK-PC-NEXT: mov x8, #132 // =0x84 +; CHECK-PC-NEXT: movk x8, #22528, lsl #16 +; CHECK-PC-NEXT: movk x8, #177, lsl #32 +; CHECK-PC-NEXT: movk x8, #22528, lsl #48 +; CHECK-PC-NEXT: str x8, [sp, #8] +; CHECK-PC-NEXT: add x8, sp, #8 +; CHECK-PC-NEXT: add x1, x8, #12 +; CHECK-PC-NEXT: bl __clear_cache +; CHECK-PC-NEXT: mov x0, xzr +; CHECK-PC-NEXT: .seh_startepilogue +; CHECK-PC-NEXT: ldr x30, [sp, #48] // 8-byte Folded Reload +; CHECK-PC-NEXT: .seh_save_reg x30, 48 +; CHECK-PC-NEXT: add sp, sp, #64 +; CHECK-PC-NEXT: .seh_stackalloc 64 +; CHECK-PC-NEXT: .seh_endepilogue +; CHECK-PC-NEXT: ret +; CHECK-PC-NEXT: .seh_endfunclet +; CHECK-PC-NEXT: .seh_endproc +; +; CHECK-APPLE-LABEL: func1: +; CHECK-APPLE: ; %bb.0: +; CHECK-APPLE-NEXT: sub sp, sp, #64 +; CHECK-APPLE-NEXT: stp x29, x30, [sp, #48] ; 16-byte Folded Spill +; CHECK-APPLE-NEXT: .cfi_def_cfa_offset 64 +; CHECK-APPLE-NEXT: .cfi_offset w30, -8 +; CHECK-APPLE-NEXT: .cfi_offset w29, -16 +; CHECK-APPLE-NEXT: Lloh0: +; CHECK-APPLE-NEXT: adrp x8, _f@PAGE +; CHECK-APPLE-NEXT: Lloh1: +; CHECK-APPLE-NEXT: add x8, x8, _f@PAGEOFF +; CHECK-APPLE-NEXT: add x9, sp, #40 +; CHECK-APPLE-NEXT: stp x9, x8, [sp, #16] +; CHECK-APPLE-NEXT: mov w8, #544 ; =0x220 +; CHECK-APPLE-NEXT: mov x0, sp +; CHECK-APPLE-NEXT: movk w8, #54815, lsl #16 +; CHECK-APPLE-NEXT: str w8, [sp, #8] +; CHECK-APPLE-NEXT: mov x8, #132 ; =0x84 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl #16 +; CHECK-APPLE-NEXT: movk x8, #177, lsl #32 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl #48 +; CHECK-APPLE-NEXT: str x8, [sp] +; CHECK-APPLE-NEXT: mov x8, sp +; CHECK-APPLE-NEXT: add x1, x8, #12 +; CHECK-APPLE-NEXT: bl ___clear_cache +; CHECK-APPLE-NEXT: ldp x29, x30, [sp, #48] ; 16-byte Folded Reload +; CHECK-APPLE-NEXT: mov x0, xzr 
+; CHECK-APPLE-NEXT: add sp, sp, #64 +; CHECK-APPLE-NEXT: ret +; CHECK-APPLE-NEXT: .loh AdrpAdd Lloh0, Lloh1 %val = alloca i64 - %nval = bitcast ptr %val to ptr %tramp = alloca [36 x i8], align 8 - ; CHECK: mov w1, #36 - ; CHECK: bl __trampoline_setup - call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval) + call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %val) %fp = call ptr @llvm.adjust.trampoline(ptr %tramp) ret i64 0 } define i64 @func2() { +; CHECK-LINUX-LABEL: func2: +; CHECK-LINUX: // %bb.0: +; CHECK-LINUX-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill +; CHECK-LINUX-NEXT: .cfi_def_cfa_offset 16 +; CHECK-LINUX-NEXT: .cfi_offset w30, -16 +; CHECK-LINUX-NEXT: adrp x8, :got:f +; CHECK-LINUX-NEXT: mov w9, #544 // =0x220 +; CHECK-LINUX-NEXT: adrp x0, trampg +; CHECK-LINUX-NEXT: add x0, x0, :lo12:trampg +; CHECK-LINUX-NEXT: ldr x8, [x8, :got_lo12:f] +; CHECK-LINUX-NEXT: movk w9, #54815, lsl #16 +; CHECK-LINUX-NEXT: str w9, [x0, #8] +; CHECK-LINUX-NEXT: add x9, sp, #8 +; CHECK-LINUX-NEXT: add x1, x0, #12 +; CHECK-LINUX-NEXT: stp x9, x8, [x0, #16] +; CHECK-LINUX-NEXT: mov x8, #132 // =0x84 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #16 +; CHECK-LINUX-NEXT: movk x8, #177, lsl #32 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #48 +; CHECK-LINUX-NEXT: str x8, [x0] +; CHECK-LINUX-NEXT: bl __clear_cache +; CHECK-LINUX-NEXT: mov x0, xzr +; CHECK-LINUX-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload +; CHECK-LINUX-NEXT: ret +; +; CHECK-PC-LABEL: func2: +; CHECK-PC: .seh_proc func2 +; CHECK-PC-NEXT: // %bb.0: +; CHECK-PC-NEXT: str x30, [sp, #-16]! 
// 8-byte Folded Spill +; CHECK-PC-NEXT: .seh_save_reg_x x30, 16 +; CHECK-PC-NEXT: .seh_endprologue +; CHECK-PC-NEXT: adrp x0, trampg +; CHECK-PC-NEXT: add x0, x0, :lo12:trampg +; CHECK-PC-NEXT: adrp x8, f +; CHECK-PC-NEXT: add x8, x8, :lo12:f +; CHECK-PC-NEXT: add x9, sp, #8 +; CHECK-PC-NEXT: add x1, x0, #12 +; CHECK-PC-NEXT: stp x9, x8, [x0, #16] +; CHECK-PC-NEXT: mov w8, #544 // =0x220 +; CHECK-PC-NEXT: movk w8, #54815, lsl #16 +; CHECK-PC-NEXT: str w8, [x0, #8] +; CHECK-PC-NEXT: mov x8, #132 // =0x84 +; CHECK-PC-NEXT: movk x8, #22528, lsl #16 +; CHECK-PC-NEXT: movk x8, #177, lsl #32 +; CHECK-PC-NEXT: movk x8, #22528, lsl #48 +; CHECK-PC-NEXT: str x8, [x0] +; CHECK-PC-NEXT: bl __clear_cache +; CHECK-PC-NEXT: mov x0, xzr +; CHECK-PC-NEXT: .seh_startepilogue +; CHECK-PC-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload +; CHECK-PC-NEXT: .seh_save_reg_x x30, 16 +; CHECK-PC-NEXT: .seh_endepilogue +; CHECK-PC-NEXT: ret +; CHECK-PC-NEXT: .seh_endfunclet +; CHECK-PC-NEXT: .seh_endproc +; +; CHECK-APPLE-LABEL: func2: +; CHECK-APPLE: ; %bb.0: +; CHECK-APPLE-NEXT: sub sp, sp, #32 +; CHECK-APPLE-NEXT: stp x29, x30, [sp, #16] ; 16-byte Folded Spill +; CHECK-APPLE-NEXT: .cfi_def_cfa_offset 32 +; CHECK-APPLE-NEXT: .cfi_offset w30, -8 +; CHECK-APPLE-NEXT: .cfi_offset w29, -16 +; CHECK-APPLE-NEXT: Lloh2: +; CHECK-APPLE-NEXT: adrp x0, _trampg@PAGE +; CHECK-APPLE-NEXT: Lloh3: +; CHECK-APPLE-NEXT: add x0, x0, _trampg@PAGEOFF +; CHECK-APPLE-NEXT: Lloh4: +; CHECK-APPLE-NEXT: adrp x8, _f@PAGE +; CHECK-APPLE-NEXT: Lloh5: +; CHECK-APPLE-NEXT: add x8, x8, _f@PAGEOFF +; CHECK-APPLE-NEXT: add x9, sp, #8 +; CHECK-APPLE-NEXT: add x1, x0, #12 +; CHECK-APPLE-NEXT: stp x9, x8, [x0, #16] +; CHECK-APPLE-NEXT: mov w8, #544 ; =0x220 +; CHECK-APPLE-NEXT: movk w8, #54815, lsl #16 +; CHECK-APPLE-NEXT: str w8, [x0, #8] +; CHECK-APPLE-NEXT: mov x8, #132 ; =0x84 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl #16 +; CHECK-APPLE-NEXT: movk x8, #177, lsl #32 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl
#48 +; CHECK-APPLE-NEXT: str x8, [x0] +; CHECK-APPLE-NEXT: bl ___clear_cache +; CHECK-APPLE-NEXT: ldp x29, x30, [sp, #16] ; 16-byte Folded Reload +; CHECK-APPLE-NEXT: mov x0, xzr +; CHECK-APPLE-NEXT: add sp, sp, #32 +; CHECK-APPLE-NEXT: ret +; CHECK-APPLE-NEXT: .loh AdrpAdd Lloh4, Lloh5 +; CHECK-APPLE-NEXT: .loh AdrpAdd Lloh2, Lloh3 %val = alloca i64 - %nval = bitcast ptr %val to ptr - ; CHECK: mov w1, #36 - ; CHECK: bl __trampoline_setup - call void @llvm.init.trampoline(ptr @trampg, ptr @f, ptr %nval) + call void @llvm.init.trampoline(ptr @trampg, ptr @f, ptr %val) %fp = call ptr @llvm.adjust.trampoline(ptr @trampg) ret i64 0 } diff --git a/llvm/test/CodeGen/AArch64/win64cc-x18.ll b/llvm/test/CodeGen/AArch64/win64cc-x18.ll index b3e78cc9bbb81..4b45c300e9c1d 100644 --- a/llvm/test/CodeGen/AArch64/win64cc-x18.ll +++ b/llvm/test/CodeGen/AArch64/win64cc-x18.ll @@ -1,35 +1,26 @@ -; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +;; Testing that nest uses x15 on all calling conventions (except Arm64EC) -;; Testing that x18 is not clobbered when passing pointers with the nest -;; attribute on windows - -; RUN: llc < %s -mtriple=aarch64-pc-windows-msvc | FileCheck %s --check-prefixes=CHECK,CHECK-NO-X18 -; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s --check-prefixes=CHECK,CHECK-X18 +; RUN: llc < %s -mtriple=aarch64-pc-windows-msvc | FileCheck %s +; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s +; RUN: llc < %s -mtriple=aarch64-apple-darwin- | FileCheck %s define dso_local i64 @other(ptr nest %p) #0 { ; CHECK-LABEL: other: -; CHECK-X18: ldr x0, [x18] -; CHECK-NO-X18: ldr x0, [x0] +; CHECK: ldr x0, [x15] +; CHECK: ret %r = load i64, ptr %p -; CHECK: ret ret i64 %r } define dso_local void @func() #0 { ; CHECK-LABEL: func: - - +; CHECK: add x15, sp, #8 +; CHECK: bl {{_?other}} +; CHECK: ret entry: %p = alloca i64 -; CHECK: mov w8, #1 -; CHECK: stp x30, x8, [sp, #-16] -; CHECK-X18: add x18, sp, #8 store i64 1, ptr %p -; 
CHECK-NO-X18: add x0, sp, #8 -; CHECK: bl other call void @other(ptr nest %p) -; CHECK: ldr x30, [sp], #16 -; CHECK: ret ret void } diff --git a/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll b/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll index 4799ea3bcd19f..986666e015e9e 100644 --- a/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll +++ b/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll @@ -93,7 +93,7 @@ define dso_local i32 @all_gpr_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c ; CHECK-NEXT: mov x5, #0 // =0x0 ; CHECK-NEXT: mov x6, #0 // =0x0 ; CHECK-NEXT: mov x7, #0 // =0x0 -; CHECK-NEXT: mov x18, #0 // =0x0 +; CHECK-NEXT: mov x15, #0 // =0x0 ; CHECK-NEXT: orr w0, w8, w2 ; CHECK-NEXT: mov x2, #0 // =0x0 ; CHECK-NEXT: mov x8, #0 // =0x0 @@ -146,7 +146,7 @@ define dso_local i32 @all_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) lo ; DEFAULT-NEXT: mov x5, #0 // =0x0 ; DEFAULT-NEXT: mov x6, #0 // =0x0 ; DEFAULT-NEXT: mov x7, #0 // =0x0 -; DEFAULT-NEXT: mov x18, #0 // =0x0 +; DEFAULT-NEXT: mov x15, #0 // =0x0 ; DEFAULT-NEXT: movi v0.2d, #0000000000000000 ; DEFAULT-NEXT: orr w0, w8, w2 ; DEFAULT-NEXT: mov x2, #0 // =0x0 @@ -169,7 +169,7 @@ define dso_local i32 @all_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) lo ; SVE-OR-SME-NEXT: mov x5, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x6, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x7, #0 // =0x0 -; SVE-OR-SME-NEXT: mov x18, #0 // =0x0 +; SVE-OR-SME-NEXT: mov x15, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z0.d, #0 // =0x0 ; SVE-OR-SME-NEXT: orr w0, w8, w2 ; SVE-OR-SME-NEXT: mov x2, #0 // =0x0 @@ -196,7 +196,7 @@ define dso_local i32 @all_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) lo ; STREAMING-COMPAT-NEXT: mov x5, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x6, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x7, #0 // =0x0 -; STREAMING-COMPAT-NEXT: mov x18, #0 // =0x0 +; STREAMING-COMPAT-NEXT: mov x15, #0 // =0x0 ; STREAMING-COMPAT-NEXT: fmov d0, xzr ; STREAMING-COMPAT-NEXT: orr w0, w8, w2 ; STREAMING-COMPAT-NEXT: mov x2, #0 
// =0x0 @@ -492,7 +492,7 @@ define dso_local double @all_gpr_arg_float(double noundef %a, float noundef %b) ; CHECK-NEXT: mov x6, #0 // =0x0 ; CHECK-NEXT: mov x7, #0 // =0x0 ; CHECK-NEXT: mov x8, #0 // =0x0 -; CHECK-NEXT: mov x18, #0 // =0x0 +; CHECK-NEXT: mov x15, #0 // =0x0 ; CHECK-NEXT: ret entry: @@ -547,7 +547,7 @@ define dso_local double @all_arg_float(double noundef %a, float noundef %b) loca ; DEFAULT-NEXT: mov x6, #0 // =0x0 ; DEFAULT-NEXT: mov x7, #0 // =0x0 ; DEFAULT-NEXT: mov x8, #0 // =0x0 -; DEFAULT-NEXT: mov x18, #0 // =0x0 +; DEFAULT-NEXT: mov x15, #0 // =0x0 ; DEFAULT-NEXT: movi v1.2d, #0000000000000000 ; DEFAULT-NEXT: movi v2.2d, #0000000000000000 ; DEFAULT-NEXT: movi v3.2d, #0000000000000000 @@ -570,7 +570,7 @@ define dso_local double @all_arg_float(double noundef %a, float noundef %b) loca ; SVE-OR-SME-NEXT: mov x6, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x7, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x8, #0 // =0x0 -; SVE-OR-SME-NEXT: mov x18, #0 // =0x0 +; SVE-OR-SME-NEXT: mov x15, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z1.d, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z2.d, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z3.d, #0 // =0x0 @@ -597,7 +597,7 @@ define dso_local double @all_arg_float(double noundef %a, float noundef %b) loca ; STREAMING-COMPAT-NEXT: mov x6, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x7, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x8, #0 // =0x0 -; STREAMING-COMPAT-NEXT: mov x18, #0 // =0x0 +; STREAMING-COMPAT-NEXT: mov x15, #0 // =0x0 ; STREAMING-COMPAT-NEXT: fmov d1, xzr ; STREAMING-COMPAT-NEXT: fmov d2, xzr ; STREAMING-COMPAT-NEXT: fmov d3, xzr From flang-commits at lists.llvm.org Tue May 13 09:00:35 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 09:00:35 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <68236ca3.170a0220.3144c7.f439@mx.google.com> ================ @@ -1783,6 +1776,18 @@ void 
OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; + defaultNames.emplace_back( + type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); + MakeSymbol(defaultNames.back(), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); ---------------- skatrak wrote: Nit: There's already an attribute-less overload of that function. ```suggestion MakeSymbol(defaultNames.back(), MiscDetails{MiscDetails::Kind::ConstructName}); ``` https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Tue May 13 09:00:35 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 09:00:35 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <68236ca3.a70a0220.fa42a.f063@mx.google.com> https://github.com/skatrak commented: Thank you Akash! I have just one comment and minor nits. 
https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Tue May 13 09:00:35 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 09:00:35 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <68236ca3.170a0220.f69e3.99e2@mx.google.com> https://github.com/skatrak edited https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Tue May 13 09:00:36 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 09:00:36 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <68236ca4.050a0220.cba47.c820@mx.google.com> ================ @@ -1783,6 +1776,18 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; + defaultNames.emplace_back( + type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); ---------------- skatrak wrote: Nit: Maybe call it `.omp.default.mapper`, to avoid any confusion later on. 
https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Tue May 13 09:00:38 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Tue, 13 May 2025 09:00:38 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <68236ca6.a70a0220.1b7311.b43a@mx.google.com> ================ @@ -1783,6 +1776,18 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; ---------------- skatrak wrote: I can see this is intentionally `static` to make sure that these runtime-created strings don't get destructed while symbols are still in use, since `CharBlock` doesn't own the data, but it doesn't seem to me like a very clean approach. Not sure if perhaps adding some generic static storage for situations like this to the `semantics::Scope` class, something looking similar to the `allSymbols` field, would be a better idea or if there currently are any facilities to store strings not present in the original source to be used as symbols. CC: @kiranchandramohan, @tblah. 
https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Tue May 13 09:04:11 2025 From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits) Date: Tue, 13 May 2025 09:04:11 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <68236d7b.170a0220.5c999.adf2@mx.google.com> ================ @@ -1,35 +1,26 @@ -; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +;; Testing that nest uses x15 on all calling conventions (except Arm64EC) ---------------- vtjnash wrote: Arm64EC is fine and should be fully implemented here now (following the existing documentation in CC_AArch64_Arm64EC_Thunk); it is just the test comment itself that is not applicable to that platform. https://github.com/llvm/llvm-project/pull/126743 From flang-commits at lists.llvm.org Tue May 13 09:04:30 2025 From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits) Date: Tue, 13 May 2025 09:04:30 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <68236d8e.170a0220.6aab2.a818@mx.google.com> ================ @@ -20903,7 +20903,12 @@ sufficiently aligned block of memory; this memory is written to by the intrinsic. Note that the size and the alignment are target-specific - LLVM currently provides no portable way of determining them, so a front-end that generates this intrinsic needs to have some -target-specific knowledge. The ``func`` argument must hold a function. +target-specific knowledge.
+ +The ``func`` argument must be a constant (potentially bitcasted) pointer to a ---------------- vtjnash wrote: https://github.com/llvm/llvm-project/pull/139740 https://github.com/llvm/llvm-project/pull/126743 From flang-commits at lists.llvm.org Tue May 13 09:04:36 2025 From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits) Date: Tue, 13 May 2025 09:04:36 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <68236d94.170a0220.199d49.90fb@mx.google.com> https://github.com/vtjnash edited https://github.com/llvm/llvm-project/pull/126743 From flang-commits at lists.llvm.org Tue May 13 09:05:47 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 13 May 2025 09:05:47 -0700 (PDT) Subject: [flang-commits] [flang] [FLANG][OpenMP][Taskloop] - Add testcase for reduction and in_reduction clause in taskloop construct (PR #139704) In-Reply-To: Message-ID: <68236ddb.050a0220.1cee34.cec5@mx.google.com> https://github.com/tblah approved this pull request. 
LGTM, thanks https://github.com/llvm/llvm-project/pull/139704 From flang-commits at lists.llvm.org Tue May 13 09:06:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 09:06:21 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723) In-Reply-To: Message-ID: <68236dfd.050a0220.150b63.f63c@mx.google.com> https://github.com/khaki3 edited https://github.com/llvm/llvm-project/pull/139723 From flang-commits at lists.llvm.org Tue May 13 09:08:56 2025 From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits) Date: Tue, 13 May 2025 09:08:56 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <68236e98.170a0220.18e312.c9da@mx.google.com> ================ @@ -523,6 +537,7 @@ def CC_AArch64_Preserve_None : CallingConv<[ // We can pass arguments in all general registers, except: // - X8, used for sret // - X16/X17, used by the linker as IP0/IP1 + // - X15, the nest register and used by Windows for stack allocation ---------------- vtjnash wrote: Alright, I updated the LangRef to note that nest is undefined behavior in the compiler when used with these calling conventions, so that I could remove that part of the change. There doesn't really seem to be a good way to assert this and return it as a Verifier error, since I don't think LLVM usually has that sort of intrinsic function usage check in the Verifier. This PR should be ready to land now (along with the LangRef clarifications in #139740), since it should just be a bugfix for trampoline use, without any other ABI changes.
https://github.com/llvm/llvm-project/pull/126743 From flang-commits at lists.llvm.org Tue May 13 09:15:46 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 13 May 2025 09:15:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Verify uses of OmpCancellationConstructTypeClause (PR #139743) In-Reply-To: Message-ID: <68237032.170a0220.a5c18.c098@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/139743 From flang-commits at lists.llvm.org Tue May 13 09:15:46 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 13 May 2025 09:15:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Verify uses of OmpCancellationConstructTypeClause (PR #139743) In-Reply-To: Message-ID: <68237032.170a0220.174124.b108@mx.google.com> ================ @@ -0,0 +1,11 @@ +!RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags + +subroutine f(x) + integer :: x +!ERROR: Cancellation construct type is not allowed on SECTIONS ---------------- tblah wrote: I think the meaning of this error will not be obvious to the user. Perhaps "PARALLEL cannot follow SECTIONS"? https://github.com/llvm/llvm-project/pull/139743 From flang-commits at lists.llvm.org Tue May 13 09:15:46 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 13 May 2025 09:15:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Verify uses of OmpCancellationConstructTypeClause (PR #139743) In-Reply-To: Message-ID: <68237032.050a0220.1e1fc3.c98a@mx.google.com> https://github.com/tblah commented: Nice work diagnosing this, it is a surprising bug. 
https://github.com/llvm/llvm-project/pull/139743 From flang-commits at lists.llvm.org Tue May 13 09:05:44 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 09:05:44 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723) In-Reply-To: Message-ID: <68236dd8.170a0220.384bfe.b6bd@mx.google.com> https://github.com/khaki3 updated https://github.com/llvm/llvm-project/pull/139723 >From d15c2f009e87fe7dfba2969eb0b979f9ecd025ab Mon Sep 17 00:00:00 2001 From: Kazuaki Matsumura Date: Tue, 13 May 2025 05:08:27 -0700 Subject: [PATCH 1/4] [flang][acc] Remove async and structured flag from data actions; Rename UpdateOp's async to asyncOnly; Print asyncOnly --- flang/lib/Lower/OpenACC.cpp | 354 +++++++----------- mlir/include/mlir/Dialect/OpenACC/OpenACC.h | 10 +- .../mlir/Dialect/OpenACC/OpenACCOps.td | 195 ++-------- mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp | 34 +- 4 files changed, 188 insertions(+), 405 deletions(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index e1918288d6de3..c1a8dd0d5a478 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -104,15 +104,12 @@ static void addOperand(llvm::SmallVectorImpl &operands, } template -static Op -createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, - mlir::Value baseAddr, std::stringstream &name, - mlir::SmallVector bounds, bool structured, - bool implicit, mlir::acc::DataClause dataClause, - mlir::Type retTy, llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, - bool unwrapBoxAddr = false, mlir::Value isPresent = {}) { +static Op createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, + mlir::Value baseAddr, std::stringstream &name, + mlir::SmallVector bounds, + bool implicit, mlir::acc::DataClause dataClause, + mlir::Type retTy, bool unwrapBoxAddr = false, + mlir::Value isPresent = {}) { mlir::Value 
varPtrPtr; // The data clause may apply to either the box reference itself or the // pointer to the data it holds. So use `unwrapBoxAddr` to decide. @@ -157,11 +154,9 @@ createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, addOperand(operands, operandSegments, baseAddr); addOperand(operands, operandSegments, varPtrPtr); addOperands(operands, operandSegments, bounds); - addOperands(operands, operandSegments, async); Op op = builder.create(loc, retTy, operands); op.setNameAttr(builder.getStringAttr(name.str())); - op.setStructured(structured); op.setImplicit(implicit); op.setDataClause(dataClause); if (auto mappableTy = @@ -176,10 +171,6 @@ createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, op->setAttr(Op::getOperandSegmentSizeAttr(), builder.getDenseI32ArrayAttr(operandSegments)); - if (!asyncDeviceTypes.empty()) - op.setAsyncOperandsDeviceTypeAttr(builder.getArrayAttr(asyncDeviceTypes)); - if (!asyncOnlyDeviceTypes.empty()) - op.setAsyncOnlyAttr(builder.getArrayAttr(asyncOnlyDeviceTypes)); return op; } @@ -249,9 +240,7 @@ static void createDeclareAllocFuncWithArg(mlir::OpBuilder &modBuilder, mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, registerFuncOp.getArgument(0), asFortranDesc, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, descTy, - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, descTy); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -263,8 +252,7 @@ static void createDeclareAllocFuncWithArg(mlir::OpBuilder &modBuilder, addDeclareAttr(builder, boxAddrOp.getOperation(), clause); EntryOp entryOp = createDataEntryOp( builder, loc, boxAddrOp.getResult(), asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, boxAddrOp.getType(), - /*async=*/{}, 
/*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, boxAddrOp.getType()); builder.create( loc, mlir::acc::DeclareTokenType::get(entryOp.getContext()), mlir::ValueRange(entryOp.getAccVar())); @@ -302,26 +290,20 @@ static void createDeclareDeallocFuncWithArg( mlir::acc::GetDevicePtrOp entryOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, var.getType()); builder.create( loc, mlir::Value{}, mlir::ValueRange(entryOp.getAccVar())); if constexpr (std::is_same_v || std::is_same_v) - builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getVar(), entryOp.getVarType(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, - builder.getStringAttr(*entryOp.getName())); + builder.create( + entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), + entryOp.getVarType(), entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); else builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); // Generate the post dealloc function. 
@@ -341,9 +323,8 @@ static void createDeclareDeallocFuncWithArg( mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, + var.getType()); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -700,10 +681,7 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - mlir::acc::DataClause dataClause, bool structured, - bool implicit, llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, + mlir::acc::DataClause dataClause, bool implicit, bool setDeclareAttr = false) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; @@ -732,9 +710,8 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, ? 
info.rawInput : info.addr; Op op = createDataEntryOp( - builder, operandLocation, baseAddr, asFortran, bounds, structured, - implicit, dataClause, baseAddr.getType(), async, asyncDeviceTypes, - asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true, info.isPresent); + builder, operandLocation, baseAddr, asFortran, bounds, implicit, + dataClause, baseAddr.getType(), /*unwrapBoxAddr=*/true, info.isPresent); dataOperands.push_back(op.getAccVar()); } } @@ -746,7 +723,7 @@ static void genDeclareDataOperandOperations( Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - mlir::acc::DataClause dataClause, bool structured, bool implicit) { + mlir::acc::DataClause dataClause, bool implicit) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -765,10 +742,9 @@ static void genDeclareDataOperandOperations( /*genDefaultBounds=*/generateDefaultBounds, /*strideIncludeLowerExtent=*/strideIncludeLowerExtent); LLVM_DEBUG(llvm::dbgs() << __func__ << "\n"; info.dump(llvm::dbgs())); - EntryOp op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, structured, - implicit, dataClause, info.addr.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + EntryOp op = createDataEntryOp(builder, operandLocation, info.addr, + asFortran, bounds, implicit, + dataClause, info.addr.getType()); dataOperands.push_back(op.getAccVar()); addDeclareAttr(builder, op.getVar().getDefiningOp(), dataClause); if (mlir::isa(fir::unwrapRefType(info.addr.getType()))) { @@ -805,14 +781,12 @@ static void genDeclareDataOperandOperationsWithModifier( (modifier && (*modifier).v == mod) ? 
clauseWithModifier : clause; genDeclareDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, - dataClause, - /*structured=*/true, /*implicit=*/false); + dataClause, /*implicit=*/false); } template static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { + llvm::SmallVector operands) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); @@ -820,16 +794,13 @@ static void genDataExitOperations(fir::FirOpBuilder &builder, std::is_same_v) builder.create( entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getVarType(), entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), - entryOp.getDataClause(), structured, entryOp.getImplicit(), - builder.getStringAttr(*entryOp.getName())); - else - builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), - entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, + entryOp.getVarType(), entryOp.getBounds(), entryOp.getDataClause(), entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); + else + builder.create(entryOp.getLoc(), entryOp.getAccVar(), + entryOp.getBounds(), entryOp.getDataClause(), + entryOp.getImplicit(), + builder.getStringAttr(*entryOp.getName())); } } @@ -1240,10 +1211,7 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - llvm::SmallVector &privatizations, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { + llvm::SmallVector &privatizations) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); 
Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -1272,9 +1240,9 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, recipe = Fortran::lower::createOrGetPrivateRecipe(builder, recipeName, operandLocation, retTy); auto op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, true, - /*implicit=*/false, mlir::acc::DataClause::acc_private, retTy, async, - asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); + builder, operandLocation, info.addr, asFortran, bounds, + /*implicit=*/false, mlir::acc::DataClause::acc_private, retTy, + /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } else { std::string suffix = @@ -1284,9 +1252,8 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, recipe = Fortran::lower::createOrGetFirstprivateRecipe( builder, recipeName, operandLocation, retTy, bounds); auto op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, true, + builder, operandLocation, info.addr, asFortran, bounds, /*implicit=*/false, mlir::acc::DataClause::acc_firstprivate, retTy, - async, asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } @@ -1869,10 +1836,7 @@ genReductions(const Fortran::parser::AccObjectListWithReduction &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &reductionOperands, - llvm::SmallVector &reductionRecipes, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { + llvm::SmallVector &reductionRecipes) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); const auto &objects = std::get(objectList.t); const auto &op = std::get(objectList.t); @@ -1904,9 +1868,8 @@ genReductions(const Fortran::parser::AccObjectListWithReduction &objectList, auto op = createDataEntryOp( builder, operandLocation, 
info.addr, asFortran, bounds, - /*structured=*/true, /*implicit=*/false, - mlir::acc::DataClause::acc_reduction, info.addr.getType(), async, - asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); + /*implicit=*/false, mlir::acc::DataClause::acc_reduction, + info.addr.getType(), /*unwrapBoxAddr=*/true); mlir::Type ty = op.getAccVar().getType(); if (!areAllBoundConstant(bounds) || fir::isAssumedShape(info.addr.getType()) || @@ -2169,9 +2132,8 @@ static void privatizeIv(Fortran::lower::AbstractConverter &converter, std::stringstream asFortran; asFortran << Fortran::lower::mangle::demangleName(toStringRef(sym.name())); auto op = createDataEntryOp( - builder, loc, ivValue, asFortran, {}, true, /*implicit=*/true, - mlir::acc::DataClause::acc_private, ivValue.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + builder, loc, ivValue, asFortran, {}, /*implicit=*/true, + mlir::acc::DataClause::acc_private, ivValue.getType()); privateOp = op.getOperation(); privateOperands.push_back(op.getAccVar()); @@ -2328,14 +2290,12 @@ static mlir::acc::LoopOp createLoopOp( &clause.u)) { genPrivatizations( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, /*async=*/{}, - /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + privateOperands, privatizations); } else if (const auto *reductionClause = std::get_if( &clause.u)) { genReductions(reductionClause->v, converter, semanticsContext, stmtCtx, - reductionOperands, reductionRecipes, /*async=*/{}, - /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + reductionOperands, reductionRecipes); } else if (std::get_if(&clause.u)) { for (auto crtDeviceTypeAttr : crtDeviceTypes) seqDeviceTypes.push_back(crtDeviceTypeAttr); @@ -2613,9 +2573,6 @@ static void genDataOperandOperationsWithModifier( llvm::SmallVectorImpl &dataClauseOperands, const mlir::acc::DataClause clause, const mlir::acc::DataClause clauseWithModifier, - llvm::ArrayRef async, - llvm::ArrayRef 
asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, bool setDeclareAttr = false) { const Fortran::parser::AccObjectListWithModifier &listWithModifier = x->v; const auto &accObjectList = @@ -2627,9 +2584,7 @@ static void genDataOperandOperationsWithModifier( (modifier && (*modifier).v == mod) ? clauseWithModifier : clause; genDataOperandOperations(accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, dataClause, - /*structured=*/true, /*implicit=*/false, async, - asyncDeviceTypes, asyncOnlyDeviceTypes, - setDeclareAttr); + /*implicit=*/false, setDeclareAttr); } template @@ -2779,8 +2734,7 @@ static Op createComputeOp( genDataOperandOperations( copyClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copy, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyinClause = @@ -2791,8 +2745,7 @@ static Op createComputeOp( copyinClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyin, - mlir::acc::DataClause::acc_copyin_readonly, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyin_readonly); copyinEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyoutClause = @@ -2804,8 +2757,7 @@ static Op createComputeOp( copyoutClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyout, - mlir::acc::DataClause::acc_copyout_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyout_zero); copyoutEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto 
*createClause = @@ -2816,8 +2768,7 @@ static Op createComputeOp( createClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::Zero, dataClauseOperands, mlir::acc::DataClause::acc_create, - mlir::acc::DataClause::acc_create_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_create_zero); createEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *noCreateClause = @@ -2827,8 +2778,7 @@ static Op createComputeOp( genDataOperandOperations( noCreateClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_no_create, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); nocreateEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *presentClause = @@ -2838,8 +2788,7 @@ static Op createComputeOp( genDataOperandOperations( presentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_present, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); presentEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *devicePtrClause = @@ -2848,16 +2797,14 @@ static Op createComputeOp( genDataOperandOperations( devicePtrClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_deviceptr, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); } else if (const auto *attachClause = std::get_if(&clause.u)) { auto crtDataStart = dataClauseOperands.size(); genDataOperandOperations( attachClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_attach, - /*structured=*/true, /*implicit=*/false, async, 
asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); attachEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *privateClause = @@ -2866,15 +2813,13 @@ static Op createComputeOp( if (!combinedConstructs) genPrivatizations( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + privateOperands, privatizations); } else if (const auto *firstprivateClause = std::get_if( &clause.u)) { genPrivatizations( firstprivateClause->v, converter, semanticsContext, stmtCtx, - firstprivateOperands, firstPrivatizations, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + firstprivateOperands, firstPrivatizations); } else if (const auto *reductionClause = std::get_if( &clause.u)) { @@ -2885,16 +2830,14 @@ static Op createComputeOp( // instead. if (!combinedConstructs) { genReductions(reductionClause->v, converter, semanticsContext, stmtCtx, - reductionOperands, reductionRecipes, async, - asyncDeviceTypes, asyncOnlyDeviceTypes); + reductionOperands, reductionRecipes); } else { auto crtDataStart = dataClauseOperands.size(); genDataOperandOperations( std::get(reductionClause->v.t), converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_reduction, - /*structured=*/true, /*implicit=*/true, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/true); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } @@ -2997,19 +2940,19 @@ static Op createComputeOp( // Create the exit operations after the region. 
genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands); genDataExitOperations( - builder, attachEntryOperands, /*structured=*/true); + builder, attachEntryOperands); genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands); genDataExitOperations( - builder, nocreateEntryOperands, /*structured=*/true); + builder, nocreateEntryOperands); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands); builder.restoreInsertionPoint(insPt); return computeOp; @@ -3078,8 +3021,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( copyClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copy, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyinClause = @@ -3090,8 +3032,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, copyinClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyin, - mlir::acc::DataClause::acc_copyin_readonly, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyin_readonly); copyinEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyoutClause = @@ -3103,8 +3044,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, copyoutClause, converter, semanticsContext, stmtCtx, 
Fortran::parser::AccDataModifier::Modifier::Zero, dataClauseOperands, mlir::acc::DataClause::acc_copyout, - mlir::acc::DataClause::acc_copyout_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyout_zero); copyoutEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *createClause = @@ -3115,8 +3055,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, createClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::Zero, dataClauseOperands, mlir::acc::DataClause::acc_create, - mlir::acc::DataClause::acc_create_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_create_zero); createEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *noCreateClause = @@ -3126,8 +3065,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( noCreateClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_no_create, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); nocreateEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *presentClause = @@ -3137,8 +3075,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( presentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_present, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); presentEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *deviceptrClause = @@ -3147,16 +3084,14 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( 
deviceptrClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_deviceptr, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); } else if (const auto *attachClause = std::get_if(&clause.u)) { auto crtDataStart = dataClauseOperands.size(); genDataOperandOperations( attachClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_attach, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); attachEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *defaultClause = @@ -3211,19 +3146,19 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, // Create the exit operations after the region. genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands); genDataExitOperations( - builder, attachEntryOperands, /*structured=*/true); + builder, attachEntryOperands); genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands); genDataExitOperations( - builder, nocreateEntryOperands, /*structured=*/true); + builder, nocreateEntryOperands); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands); builder.restoreInsertionPoint(insPt); } @@ -3252,8 +3187,7 @@ genACCHostDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( useDevice->v, converter, semanticsContext, stmtCtx, dataOperands, mlir::acc::DataClause::acc_use_device, - /*structured=*/true, /*implicit=*/false, /*async=*/{}, - /*asyncDeviceTypes=*/{}, 
/*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false); } else if (std::get_if(&clause.u)) { addIfPresentAttr = true; } @@ -3430,9 +3364,8 @@ genACCEnterDataOp(Fortran::lower::AbstractConverter &converter, std::get(listWithModifier.t); genDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, - dataClauseOperands, mlir::acc::DataClause::acc_copyin, false, - /*implicit=*/false, asyncValues, asyncDeviceTypes, - asyncOnlyDeviceTypes); + dataClauseOperands, mlir::acc::DataClause::acc_copyin, + /*implicit=*/false); } else if (const auto *createClause = std::get_if(&clause.u)) { const Fortran::parser::AccObjectListWithModifier &listWithModifier = @@ -3448,15 +3381,13 @@ genACCEnterDataOp(Fortran::lower::AbstractConverter &converter, clause = mlir::acc::DataClause::acc_create_zero; genDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, - dataClauseOperands, clause, false, /*implicit=*/false, asyncValues, - asyncDeviceTypes, asyncOnlyDeviceTypes); + dataClauseOperands, clause, /*implicit=*/false); } else if (const auto *attachClause = std::get_if(&clause.u)) { genDataOperandOperations( attachClause->v, converter, semanticsContext, stmtCtx, - dataClauseOperands, mlir::acc::DataClause::acc_attach, false, - /*implicit=*/false, asyncValues, asyncDeviceTypes, - asyncOnlyDeviceTypes); + dataClauseOperands, mlir::acc::DataClause::acc_attach, + /*implicit=*/false); } else if (!std::get_if(&clause.u)) { llvm::report_fatal_error( "Unknown clause in ENTER DATA directive lowering"); @@ -3544,20 +3475,17 @@ genACCExitDataOp(Fortran::lower::AbstractConverter &converter, std::get(listWithModifier.t); genDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, copyoutOperands, - mlir::acc::DataClause::acc_copyout, false, /*implicit=*/false, - asyncValues, asyncDeviceTypes, asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyout, /*implicit=*/false); } else if (const auto *deleteClause = std::get_if(&clause.u)) { 
genDataOperandOperations( deleteClause->v, converter, semanticsContext, stmtCtx, deleteOperands, - mlir::acc::DataClause::acc_delete, false, /*implicit=*/false, - asyncValues, asyncDeviceTypes, asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_delete, /*implicit=*/false); } else if (const auto *detachClause = std::get_if(&clause.u)) { genDataOperandOperations( detachClause->v, converter, semanticsContext, stmtCtx, detachOperands, - mlir::acc::DataClause::acc_detach, false, /*implicit=*/false, - asyncValues, asyncDeviceTypes, asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_detach, /*implicit=*/false); } else if (std::get_if(&clause.u)) { addFinalizeAttr = true; } @@ -3587,11 +3515,11 @@ genACCExitDataOp(Fortran::lower::AbstractConverter &converter, exitDataOp.setFinalizeAttr(builder.getUnitAttr()); genDataExitOperations( - builder, copyoutOperands, /*structured=*/false); + builder, copyoutOperands); genDataExitOperations( - builder, deleteOperands, /*structured=*/false); + builder, deleteOperands); genDataExitOperations( - builder, detachOperands, /*structured=*/false); + builder, detachOperands); } template @@ -3765,16 +3693,14 @@ genACCUpdateOp(Fortran::lower::AbstractConverter &converter, std::get_if(&clause.u)) { genDataOperandOperations( hostClause->v, converter, semanticsContext, stmtCtx, - updateHostOperands, mlir::acc::DataClause::acc_update_host, false, - /*implicit=*/false, asyncOperands, asyncOperandsDeviceTypes, - asyncOnlyDeviceTypes); + updateHostOperands, mlir::acc::DataClause::acc_update_host, + /*implicit=*/false); } else if (const auto *deviceClause = std::get_if(&clause.u)) { genDataOperandOperations( deviceClause->v, converter, semanticsContext, stmtCtx, - dataClauseOperands, mlir::acc::DataClause::acc_update_device, false, - /*implicit=*/false, asyncOperands, asyncOperandsDeviceTypes, - asyncOnlyDeviceTypes); + dataClauseOperands, mlir::acc::DataClause::acc_update_device, + /*implicit=*/false); } else if (std::get_if(&clause.u)) { 
ifPresent = true; } else if (const auto *selfClause = @@ -3786,9 +3712,8 @@ genACCUpdateOp(Fortran::lower::AbstractConverter &converter, assert(accObjectList && "expect AccObjectList"); genDataOperandOperations( *accObjectList, converter, semanticsContext, stmtCtx, - updateHostOperands, mlir::acc::DataClause::acc_update_self, false, - /*implicit=*/false, asyncOperands, asyncOperandsDeviceTypes, - asyncOnlyDeviceTypes); + updateHostOperands, mlir::acc::DataClause::acc_update_self, + /*implicit=*/false); } } @@ -3805,7 +3730,7 @@ genACCUpdateOp(Fortran::lower::AbstractConverter &converter, ifPresent); genDataExitOperations( - builder, updateHostOperands, /*structured=*/false); + builder, updateHostOperands); } static void @@ -3928,9 +3853,8 @@ static void createDeclareGlobalOp(mlir::OpBuilder &modBuilder, llvm::SmallVector bounds; EntryOp entryOp = createDataEntryOp( - builder, loc, addrOp.getResTy(), asFortran, bounds, - /*structured=*/false, implicit, clause, addrOp.getResTy().getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + builder, loc, addrOp.getResTy(), asFortran, bounds, implicit, clause, + addrOp.getResTy().getType()); if constexpr (std::is_same_v) builder.create( loc, mlir::acc::DeclareTokenType::get(entryOp.getContext()), @@ -3940,10 +3864,8 @@ static void createDeclareGlobalOp(mlir::OpBuilder &modBuilder, mlir::ValueRange(entryOp.getAccVar())); if constexpr (std::is_same_v) { builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); } builder.create(loc); @@ -3977,9 +3899,8 @@ static void createDeclareAllocFunc(mlir::OpBuilder &modBuilder, mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, addrOp, 
asFortranDesc, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, addrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, + addrOp.getType()); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -3990,8 +3911,7 @@ static void createDeclareAllocFunc(mlir::OpBuilder &modBuilder, addDeclareAttr(builder, boxAddrOp.getOperation(), clause); EntryOp entryOp = createDataEntryOp( builder, loc, boxAddrOp.getResult(), asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, boxAddrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, boxAddrOp.getType()); builder.create( loc, mlir::acc::DeclareTokenType::get(entryOp.getContext()), mlir::ValueRange(entryOp.getAccVar())); @@ -4035,8 +3955,7 @@ static void createDeclareDeallocFunc(mlir::OpBuilder &modBuilder, mlir::acc::GetDevicePtrOp entryOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, var.getType()); builder.create( loc, mlir::Value{}, mlir::ValueRange(entryOp.getAccVar())); @@ -4045,18 +3964,13 @@ static void createDeclareDeallocFunc(mlir::OpBuilder &modBuilder, std::is_same_v) builder.create( entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), - entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, - builder.getStringAttr(*entryOp.getName())); + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); else - builder.create( - 
entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), - entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, - builder.getStringAttr(*entryOp.getName())); + builder.create(entryOp.getLoc(), entryOp.getAccVar(), + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, + builder.getStringAttr(*entryOp.getName())); // Generate the post dealloc function. modBuilder.setInsertionPointAfter(preDeallocOp); @@ -4076,9 +3990,8 @@ static void createDeclareDeallocFunc(mlir::OpBuilder &modBuilder, mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, addrOp, asFortran, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, addrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, + addrOp.getType()); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -4216,7 +4129,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::CopyoutOp>( copyClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copy, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *createClause = @@ -4229,7 +4142,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, genDeclareDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_create, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); createEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *presentClause = 
@@ -4240,7 +4153,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::DeleteOp>( presentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_present, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); presentEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyinClause = @@ -4266,7 +4179,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::CopyoutOp>( accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copyout, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); copyoutEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *devicePtrClause = @@ -4276,14 +4189,14 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::DevicePtrOp>( devicePtrClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_deviceptr, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); } else if (const auto *linkClause = std::get_if(&clause.u)) { genDeclareDataOperandOperations( linkClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_declare_link, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); } else if (const auto *deviceResidentClause = std::get_if( &clause.u)) { @@ -4293,7 +4206,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, deviceResidentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_declare_device_resident, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); deviceResidentEntryOperands.append( dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else { @@ -4341,18 +4254,18 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, } 
genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands); genDataExitOperations( - builder, deviceResidentEntryOperands, /*structured=*/true); + mlir::acc::DeleteOp>(builder, + deviceResidentEntryOperands); genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands); }); } @@ -4702,12 +4615,11 @@ genACC(Fortran::lower::AbstractConverter &converter, if (modifier && (*modifier).v == Fortran::parser::AccDataModifier::Modifier::ReadOnly) dataClause = mlir::acc::DataClause::acc_cache_readonly; - genDataOperandOperations( - accObjectList, converter, semanticsContext, stmtCtx, cacheOperands, - dataClause, - /*structured=*/true, /*implicit=*/false, - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}, - /*setDeclareAttr*/ false); + genDataOperandOperations(accObjectList, converter, + semanticsContext, stmtCtx, + cacheOperands, dataClause, + /*implicit=*/false, + /*setDeclareAttr*/ false); loopOp.getCacheOperandsMutable().append(cacheOperands); } else { llvm::report_fatal_error( diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h index ff5845343313c..e053e3d2bbcfc 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h @@ -117,19 +117,19 @@ mlir::Value getVarPtrPtr(mlir::Operation *accDataClauseOp); /// Returns an empty vector if there are no bounds. mlir::SmallVector getBounds(mlir::Operation *accDataClauseOp); -/// Used to obtain `async` operands from an acc data clause operation. 
+/// Used to obtain `async` operands from an acc operation. /// Returns an empty vector if there are no such operands. mlir::SmallVector getAsyncOperands(mlir::Operation *accDataClauseOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to -/// an acc data clause operation, that correspond to the device types -/// associated with the async clauses with an async-value. +/// an acc operation, that correspond to the device types associated with the +/// async clauses with an async-value. mlir::ArrayAttr getAsyncOperandsDeviceType(mlir::Operation *accDataClauseOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to -/// an acc data clause operation, that correspond to the device types -/// associated with the async clauses without an async-value. +/// an acc operation, that correspond to the device types associated with the +/// async clauses without an async-value. mlir::ArrayAttr getAsyncOnly(mlir::Operation *accDataClauseOp); /// Used to obtain the `name` from an acc operation. diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td index 5d5add6318e06..59b9a50144a1e 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td @@ -470,11 +470,7 @@ class OpenACC_DataEntryOp:$varPtrPtr, Variadic:$bounds, /* rank-0 to rank-{n-1} */ - Variadic:$asyncOperands, - OptionalAttr:$asyncOperandsDeviceType, - OptionalAttr:$asyncOnly, DefaultValuedAttr:$dataClause, - DefaultValuedAttr:$structured, DefaultValuedAttr:$implicit, OptionalAttr:$name)); @@ -491,63 +487,16 @@ class OpenACC_DataEntryOp(attr); - if (deviceTypeAttr.getValue() == deviceType) - return true; - } - return false; - } - /// Return the value of the async clause if present. - mlir::Value getAsyncValue() { - return getAsyncValue(mlir::acc::DeviceType::None); - } - /// Return the value of the async clause for the given device_type if - /// present. 
- mlir::Value getAsyncValue(mlir::acc::DeviceType deviceType) { - mlir::ArrayAttr deviceTypes = getAsyncOperandsDeviceTypeAttr(); - if (!deviceTypes) - return nullptr; - for (auto [attr, asyncValue] : - llvm::zip(deviceTypes, getAsyncOperands())) { - auto deviceTypeAttr = mlir::dyn_cast(attr); - if (deviceTypeAttr.getValue() == deviceType) - return asyncValue; - } - return nullptr; - } mlir::TypedValue getVarPtr() { return mlir::dyn_cast>(getVar()); } @@ -561,16 +510,13 @@ class OpenACC_DataEntryOp($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` ) `->` type($accVar) attr-dict }]; let hasVerifier = 1; let builders = [ - OpBuilder<(ins "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$var, "bool":$implicit, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ auto ptrLikeTy = ::mlir::dyn_cast<::mlir::acc::PointerLikeType>( @@ -579,14 +525,10 @@ class OpenACC_DataEntryOp, - OpBuilder<(ins "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$var, "bool":$implicit, "const ::llvm::Twine &":$name, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ @@ -596,10 +538,7 @@ class OpenACC_DataEntryOp]; @@ -829,15 +768,10 @@ def OpenACC_CacheOp : OpenACC_DataEntryOp<"cache", class OpenACC_DataExitOp traits = [], dag additionalArgs = (ins)> : OpenACC_Op]>])> { + [MemoryEffects<[MemRead]>])> { let arguments = !con(additionalArgs, (ins Variadic:$bounds, - Variadic:$asyncOperands, - OptionalAttr:$asyncOperandsDeviceType, - OptionalAttr:$asyncOnly, DefaultValuedAttr:$dataClause, - DefaultValuedAttr:$structured, DefaultValuedAttr:$implicit, OptionalAttr:$name)); @@ -846,65 +780,15 @@ class OpenACC_DataExitOp(attr); - if (deviceTypeAttr.getValue() == deviceType) - return true; - } - return false; - } - /// Return the value of the async clause if present. 
- mlir::Value getAsyncValue() { - return getAsyncValue(mlir::acc::DeviceType::None); - } - /// Return the value of the async clause for the given device_type if - /// present. - mlir::Value getAsyncValue(mlir::acc::DeviceType deviceType) { - mlir::ArrayAttr deviceTypes = getAsyncOperandsDeviceTypeAttr(); - if (!deviceTypes) - return nullptr; - for (auto [attr, asyncValue] : - llvm::zip(deviceTypes, getAsyncOperands())) { - auto deviceTypeAttr = mlir::dyn_cast(attr); - if (deviceTypeAttr.getValue() == deviceType) - return asyncValue; - } - return nullptr; - } - }]; - let hasVerifier = 1; } @@ -922,16 +806,13 @@ class OpenACC_DataExitOpWithVarPtr let assemblyFormat = [{ custom($accVar, type($accVar)) (`bounds` `(` $bounds^ `)` )? - (`async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType)^ `)`)? `to` custom($var) `:` custom(type($var), $varType) attr-dict }]; let builders = [ OpBuilder<(ins "::mlir::Value":$accVar, - "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + "::mlir::Value":$var, "bool":$implicit, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ auto ptrLikeTy = ::mlir::dyn_cast<::mlir::acc::PointerLikeType>( @@ -940,14 +821,11 @@ class OpenACC_DataExitOpWithVarPtr /*varType=*/ptrLikeTy ? ::mlir::TypeAttr::get(ptrLikeTy.getElementType()) : ::mlir::TypeAttr::get(var.getType()), - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/nullptr); }]>, OpBuilder<(ins "::mlir::Value":$accVar, - "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + "::mlir::Value":$var, "bool":$implicit, "const ::llvm::Twine &":$name, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ @@ -957,9 +835,7 @@ class OpenACC_DataExitOpWithVarPtr /*varType=*/ptrLikeTy ? 
::mlir::TypeAttr::get(ptrLikeTy.getElementType()) : ::mlir::TypeAttr::get(var.getType()), - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/$_builder.getStringAttr(name)); }]>]; @@ -983,31 +859,23 @@ class OpenACC_DataExitOpNoVarPtr : let assemblyFormat = [{ custom($accVar, type($accVar)) (`bounds` `(` $bounds^ `)` )? - (`async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType)^ `)`)? attr-dict }]; let builders = [ - OpBuilder<(ins "::mlir::Value":$accVar, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$accVar, "bool":$implicit, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ build($_builder, $_state, accVar, - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/nullptr); }]>, - OpBuilder<(ins "::mlir::Value":$accVar, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$accVar, "bool":$implicit, "const ::llvm::Twine &":$name, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ build($_builder, $_state, accVar, - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/$_builder.getStringAttr(name)); }]> @@ -1027,7 +895,7 @@ def OpenACC_CopyoutOp : OpenACC_DataExitOpWithVarPtr<"copyout", "mlir::acc::DataClause::acc_copyout"> { let summary = "Represents acc copyout semantics - reverse of copyin."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit # [{ + let 
extraClassDeclaration = extraClassDeclarationDataExit # [{ /// Check if this is a copyout with zero modifier. bool isCopyoutZero(); }]; @@ -1039,7 +907,7 @@ def OpenACC_CopyoutOp : OpenACC_DataExitOpWithVarPtr<"copyout", def OpenACC_DeleteOp : OpenACC_DataExitOpNoVarPtr<"delete", "mlir::acc::DataClause::acc_delete"> { let summary = "Represents acc delete semantics - reverse of create."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit; + let extraClassDeclaration = extraClassDeclarationDataExit; } //===----------------------------------------------------------------------===// @@ -1048,7 +916,7 @@ def OpenACC_DeleteOp : OpenACC_DataExitOpNoVarPtr<"delete", def OpenACC_DetachOp : OpenACC_DataExitOpNoVarPtr<"detach", "mlir::acc::DataClause::acc_detach"> { let summary = "Represents acc detach semantics - reverse of attach."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit; + let extraClassDeclaration = extraClassDeclarationDataExit; } //===----------------------------------------------------------------------===// @@ -1057,7 +925,7 @@ def OpenACC_DetachOp : OpenACC_DataExitOpNoVarPtr<"detach", def OpenACC_UpdateHostOp : OpenACC_DataExitOpWithVarPtr<"update_host", "mlir::acc::DataClause::acc_update_host"> { let summary = "Represents acc update host semantics."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit # [{ + let extraClassDeclaration = extraClassDeclarationDataExit # [{ /// Check if this is an acc update self. bool isSelf() { return getDataClause() == acc::DataClause::acc_update_self; @@ -1439,8 +1307,8 @@ def OpenACC_ParallelOp : OpenACC_Op<"parallel", ( `combined` `(` `loop` `)` $combined^)? 
oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1581,8 +1449,8 @@ def OpenACC_SerialOp : OpenACC_Op<"serial", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1750,8 +1618,8 @@ def OpenACC_KernelsOp : OpenACC_Op<"kernels", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `num_gangs` `(` custom($numGangs, type($numGangs), $numGangsDeviceType, $numGangsSegments) `)` | `num_workers` `(` custom($numWorkers, @@ -1799,6 +1667,9 @@ def OpenACC_DataOp : OpenACC_Op<"data", `async` and `wait` operands are supported with `device_type` information. They should only be accessed by the extra provided getters. If modified, the corresponding `device_type` attributes must be modified as well. + + The `asyncOnly` operand is a list of device_type's for which async clause + does not specify a value (default is acc_async_noval - OpenACC 3.3 2.16.1). 
}]; @@ -1870,8 +1741,8 @@ def OpenACC_DataOp : OpenACC_Op<"data", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, @@ -1931,6 +1802,7 @@ def OpenACC_EnterDataOp : OpenACC_Op<"enter_data", Value getDataOperand(unsigned i); }]; + // TODO: Show $async and $wait. let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` @@ -1983,6 +1855,7 @@ def OpenACC_ExitDataOp : OpenACC_Op<"exit_data", Value getDataOperand(unsigned i); }]; + // TODO: Show $async and $wait. let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` @@ -2853,7 +2726,7 @@ def OpenACC_UpdateOp : OpenACC_Op<"update", let arguments = (ins Optional:$ifCond, Variadic:$asyncOperands, OptionalAttr:$asyncOperandsDeviceType, - OptionalAttr:$async, + OptionalAttr:$asyncOnly, Variadic:$waitOperands, OptionalAttr:$waitOperandsSegments, OptionalAttr:$waitOperandsDeviceType, @@ -2901,9 +2774,8 @@ def OpenACC_UpdateOp : OpenACC_Op<"update", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `` custom( - $asyncOperands, type($asyncOperands), - $asyncOperandsDeviceType, $async) + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, $waitOnly) @@ -2946,6 +2818,7 @@ def OpenACC_WaitOp : OpenACC_Op<"wait", [AttrSizedOperandSegments]> { UnitAttr:$async, Optional:$ifCond); + // TODO: Show $async. let assemblyFormat = [{ ( `(` $waitOperands^ `:` type($waitOperands) `)` )? 
oilist(`async` `(` $asyncOperand `:` type($asyncOperand) `)` diff --git a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp index 7eb72d433c972..ee00acecb17b9 100644 --- a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp +++ b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp @@ -3505,7 +3505,7 @@ bool UpdateOp::hasAsyncOnly() { } bool UpdateOp::hasAsyncOnly(mlir::acc::DeviceType deviceType) { - return hasDeviceType(getAsync(), deviceType); + return hasDeviceType(getAsyncOnly(), deviceType); } mlir::Value UpdateOp::getAsyncValue() { @@ -3659,32 +3659,30 @@ mlir::acc::getBounds(mlir::Operation *accDataClauseOp) { } mlir::SmallVector -mlir::acc::getAsyncOperands(mlir::Operation *accDataClauseOp) { +mlir::acc::getAsyncOperands(mlir::Operation *accOp) { return llvm::TypeSwitch>( - accDataClauseOp) - .Case([&](auto dataClause) { - return mlir::SmallVector( - dataClause.getAsyncOperands().begin(), - dataClause.getAsyncOperands().end()); - }) + accOp) + .Case( + [&](auto op) { + return mlir::SmallVector(op.getAsyncOperands().begin(), + op.getAsyncOperands().end()); + }) .Default([&](mlir::Operation *) { return mlir::SmallVector(); }); } -mlir::ArrayAttr -mlir::acc::getAsyncOperandsDeviceType(mlir::Operation *accDataClauseOp) { - return llvm::TypeSwitch(accDataClauseOp) - .Case([&](auto dataClause) { - return dataClause.getAsyncOperandsDeviceTypeAttr(); - }) +mlir::ArrayAttr mlir::acc::getAsyncOperandsDeviceType(mlir::Operation *accOp) { + return llvm::TypeSwitch(accOp) + .Case( + [&](auto op) { return op.getAsyncOperandsDeviceTypeAttr(); }) .Default([&](mlir::Operation *) { return mlir::ArrayAttr{}; }); } -mlir::ArrayAttr mlir::acc::getAsyncOnly(mlir::Operation *accDataClauseOp) { - return llvm::TypeSwitch(accDataClauseOp) - .Case( - [&](auto dataClause) { return dataClause.getAsyncOnlyAttr(); }) +mlir::ArrayAttr mlir::acc::getAsyncOnly(mlir::Operation *accOp) { + return llvm::TypeSwitch(accOp) + .Case( + [&](auto op) { return op.getAsyncOnlyAttr(); }) 
.Default([&](mlir::Operation *) { return mlir::ArrayAttr{}; }); } >From 38e6b58dc6bc4fed2db23dd348c1a0cfe91d4fbe Mon Sep 17 00:00:00 2001 From: Kazuaki Matsumura Date: Tue, 13 May 2025 05:21:48 -0700 Subject: [PATCH 2/4] [acc] accDataClauseOp -> accOp --- mlir/include/mlir/Dialect/OpenACC/OpenACC.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h index e053e3d2bbcfc..f667a6786189b 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h @@ -120,17 +120,17 @@ mlir::SmallVector getBounds(mlir::Operation *accDataClauseOp); /// Used to obtain `async` operands from an acc operation. /// Returns an empty vector if there are no such operands. mlir::SmallVector -getAsyncOperands(mlir::Operation *accDataClauseOp); +getAsyncOperands(mlir::Operation *accOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to /// an acc operation, that correspond to the device types associated with the /// async clauses with an async-value. -mlir::ArrayAttr getAsyncOperandsDeviceType(mlir::Operation *accDataClauseOp); +mlir::ArrayAttr getAsyncOperandsDeviceType(mlir::Operation *accOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to /// an acc operation, that correspond to the device types associated with the /// async clauses without an async-value. -mlir::ArrayAttr getAsyncOnly(mlir::Operation *accDataClauseOp); +mlir::ArrayAttr getAsyncOnly(mlir::Operation *accOp); /// Used to obtain the `name` from an acc operation. 
std::optional getVarName(mlir::Operation *accOp); >From 1a0bac6b11d3c882b3ac7e6e0d28b97806048bec Mon Sep 17 00:00:00 2001 From: Kazuaki Matsumura Date: Tue, 13 May 2025 05:49:39 -0700 Subject: [PATCH 3/4] [acc] clang-format --- mlir/include/mlir/Dialect/OpenACC/OpenACC.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h index f667a6786189b..30271d0599236 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h @@ -119,8 +119,7 @@ mlir::SmallVector getBounds(mlir::Operation *accDataClauseOp); /// Used to obtain `async` operands from an acc operation. /// Returns an empty vector if there are no such operands. -mlir::SmallVector -getAsyncOperands(mlir::Operation *accOp); +mlir::SmallVector getAsyncOperands(mlir::Operation *accOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to /// an acc operation, that correspond to the device types associated with the >From 5ee42b2c7a2db13eeb587fc4586ee3dedb3a9614 Mon Sep 17 00:00:00 2001 From: Kazuaki Matsumura Date: Tue, 13 May 2025 09:05:08 -0700 Subject: [PATCH 4/4] [flang][acc] Revert assemblyFormat changes; asyncOnly is an attribute, no need to handle it --- .../mlir/Dialect/OpenACC/OpenACCOps.td | 23 ++++++++----------- 1 file changed, 10 insertions(+), 13 deletions(-) diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td index 59b9a50144a1e..ead7e95a694db 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td @@ -1307,8 +1307,8 @@ def OpenACC_ParallelOp : OpenACC_Op<"parallel", ( `combined` `(` `loop` `)` $combined^)? 
oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType) `)` | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1449,8 +1449,8 @@ def OpenACC_SerialOp : OpenACC_Op<"serial", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType) `)` | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1618,8 +1618,8 @@ def OpenACC_KernelsOp : OpenACC_Op<"kernels", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType) `)` | `num_gangs` `(` custom($numGangs, type($numGangs), $numGangsDeviceType, $numGangsSegments) `)` | `num_workers` `(` custom($numWorkers, @@ -1741,8 +1741,8 @@ def OpenACC_DataOp : OpenACC_Op<"data", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType) `)` | `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, @@ -1802,7 +1802,6 @@ def OpenACC_EnterDataOp : OpenACC_Op<"enter_data", Value 
getDataOperand(unsigned i); }]; - // TODO: Show $async and $wait. let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` @@ -1855,7 +1854,6 @@ def OpenACC_ExitDataOp : OpenACC_Op<"exit_data", Value getDataOperand(unsigned i); }]; - // TODO: Show $async and $wait. let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` @@ -2774,8 +2772,8 @@ def OpenACC_UpdateOp : OpenACC_Op<"update", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, $waitOnly) @@ -2818,7 +2816,6 @@ def OpenACC_WaitOp : OpenACC_Op<"wait", [AttrSizedOperandSegments]> { UnitAttr:$async, Optional:$ifCond); - // TODO: Show $async. let assemblyFormat = [{ ( `(` $waitOperands^ `:` type($waitOperands) `)` )? oilist(`async` `(` $asyncOperand `:` type($asyncOperand) `)` From flang-commits at lists.llvm.org Tue May 13 10:32:46 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 13 May 2025 10:32:46 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Turn on alias analysis for locally allocated objects (PR #139682) In-Reply-To: Message-ID: <6823823e.170a0220.c1a2e.3e1f@mx.google.com> https://github.com/vzakhari approved this pull request. Thank you for resolving the bug and enabling it by default! 
https://github.com/llvm/llvm-project/pull/139682 From flang-commits at lists.llvm.org Tue May 13 11:44:22 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Tue, 13 May 2025 11:44:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add loop annotation attributes to the loop backedge instead of the loop header's conditional branch (PR #126082) In-Reply-To: Message-ID: <68239306.a70a0220.25bb0e.1c93@mx.google.com> https://github.com/ashermancinelli updated https://github.com/llvm/llvm-project/pull/126082 >From 28a8f74342ac628ea50de746434735c028701ac8 Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Wed, 5 Feb 2025 16:41:50 -0800 Subject: [PATCH 1/3] Add loop attr info to the backedge, not condition --- .../Optimizer/Transforms/ControlFlowConverter.cpp | 12 ++++++------ flang/test/Fir/vector-always.fir | 4 +++- flang/test/Integration/unroll.f90 | 6 ++++-- flang/test/Integration/vector-always.f90 | 4 +++- 4 files changed, 16 insertions(+), 10 deletions(-) diff --git a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp index b09bbf6106dbb..0e03d574f4070 100644 --- a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp +++ b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp @@ -123,23 +123,23 @@ class CfgLoopConv : public mlir::OpRewritePattern { : terminator->operand_begin(); loopCarried.append(begin, terminator->operand_end()); loopCarried.push_back(itersMinusOne); - rewriter.create(loc, conditionalBlock, loopCarried); + auto backEdge = rewriter.create(loc, conditionalBlock, loopCarried); rewriter.eraseOp(terminator); + // Copy loop annotations from the do loop to the loop back edge. 
+ if (auto ann = loop.getLoopAnnotation()) + backEdge->setAttr("loop_annotation", *ann); + // Conditional block rewriter.setInsertionPointToEnd(conditionalBlock); auto zero = rewriter.create(loc, 0); auto comparison = rewriter.create( loc, arith::CmpIPredicate::sgt, itersLeft, zero); - auto cond = rewriter.create( + rewriter.create( loc, comparison, firstBlock, llvm::ArrayRef(), endBlock, llvm::ArrayRef()); - // Copy loop annotations from the do loop to the loop entry condition. - if (auto ann = loop.getLoopAnnotation()) - cond->setAttr("loop_annotation", *ann); - // The result of the loop operation is the values of the condition block // arguments except the induction variable on the last iteration. auto args = loop.getFinalValue() diff --git a/flang/test/Fir/vector-always.fir b/flang/test/Fir/vector-always.fir index 00eb0e7a756ee..ec06b94a3d0f8 100644 --- a/flang/test/Fir/vector-always.fir +++ b/flang/test/Fir/vector-always.fir @@ -13,7 +13,9 @@ func.func @_QPvector_always() -> i32 { %c10_i32 = arith.constant 10 : i32 %c1_i32 = arith.constant 1 : i32 %c10 = arith.constant 10 : index -// CHECK: cf.cond_br %{{.*}}, ^{{.*}}, ^{{.*}} {loop_annotation = #[[ANNOTATION]]} +// CHECK: cf.cond_br +// CHECK-NOT: loop_annotation +// CHECK: cf.br ^{{.*}} {loop_annotation = #[[ANNOTATION]]} %8:2 = fir.do_loop %arg0 = %c1 to %c10 step %c1 iter_args(%arg1 = %c1_i32) -> (index, i32) attributes {loopAnnotation = #loop_annotation} { fir.result %c1, %c1_i32 : index, i32 } diff --git a/flang/test/Integration/unroll.f90 b/flang/test/Integration/unroll.f90 index aa47e465b63fc..294c2a6807b93 100644 --- a/flang/test/Integration/unroll.f90 +++ b/flang/test/Integration/unroll.f90 @@ -3,8 +3,10 @@ ! CHECK-LABEL: unroll_dir subroutine unroll_dir integer :: a(10) - !dir$ unroll - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[UNROLL_ENABLE_FULL_ANNO:.*]] + !dir$ unroll + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! 
CHECK: br label {{.*}}, !llvm.loop ![[UNROLL_ENABLE_FULL_ANNO:.*]] do i=1,10 a(i)=i end do diff --git a/flang/test/Integration/vector-always.f90 b/flang/test/Integration/vector-always.f90 index ee2aa8ab485e0..b73b439ecad18 100644 --- a/flang/test/Integration/vector-always.f90 +++ b/flang/test/Integration/vector-always.f90 @@ -4,7 +4,9 @@ subroutine vector_always integer :: a(10) !dir$ vector always - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION:.*]] do i=1,10 a(i)=i end do >From cd02a9181d13e232b0cf947c6b996c7a8d337532 Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Thu, 6 Feb 2025 07:45:06 -0800 Subject: [PATCH 2/3] formatting --- flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp index 0e03d574f4070..8a9e9b80134b8 100644 --- a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp +++ b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp @@ -123,7 +123,8 @@ class CfgLoopConv : public mlir::OpRewritePattern { : terminator->operand_begin(); loopCarried.append(begin, terminator->operand_end()); loopCarried.push_back(itersMinusOne); - auto backEdge = rewriter.create(loc, conditionalBlock, loopCarried); + auto backEdge = + rewriter.create(loc, conditionalBlock, loopCarried); rewriter.eraseOp(terminator); // Copy loop annotations from the do loop to the loop back edge. 
>From 7f006579e09ae9fa62c0b9805fe1296b881d44e9 Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Tue, 13 May 2025 11:37:25 -0700 Subject: [PATCH 3/3] Update tests wrt recent loop metadata changes --- flang/test/Integration/unroll.f90 | 12 +++++++++--- flang/test/Integration/unroll_and_jam.f90 | 20 +++++++++++++++----- flang/test/Integration/vector-always.f90 | 4 +++- 3 files changed, 27 insertions(+), 9 deletions(-) diff --git a/flang/test/Integration/unroll.f90 b/flang/test/Integration/unroll.f90 index 294c2a6807b93..f2c2ecb5cffac 100644 --- a/flang/test/Integration/unroll.f90 +++ b/flang/test/Integration/unroll.f90 @@ -16,7 +16,9 @@ end subroutine unroll_dir subroutine unroll_dir_0 integer :: a(10) !dir$ unroll 0 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[UNROLL_DISABLE_ANNO:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[UNROLL_DISABLE_ANNO:.*]] do i=1,10 a(i)=i end do @@ -26,7 +28,9 @@ end subroutine unroll_dir_0 subroutine unroll_dir_1 integer :: a(10) !dir$ unroll 1 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[UNROLL_DISABLE_ANNO]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[UNROLL_DISABLE_ANNO]] do i=1,10 a(i)=i end do @@ -36,7 +40,9 @@ end subroutine unroll_dir_1 subroutine unroll_dir_2 integer :: a(10) !dir$ unroll 2 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[UNROLL_ENABLE_COUNT_2:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! 
CHECK: br label {{.*}}, !llvm.loop ![[UNROLL_ENABLE_COUNT_2:.*]] do i=1,10 a(i)=i end do diff --git a/flang/test/Integration/unroll_and_jam.f90 b/flang/test/Integration/unroll_and_jam.f90 index b9c16d34ac90a..05b3aaa04a1e0 100644 --- a/flang/test/Integration/unroll_and_jam.f90 +++ b/flang/test/Integration/unroll_and_jam.f90 @@ -4,7 +4,9 @@ subroutine unroll_and_jam_dir integer :: a(10) !dir$ unroll_and_jam 4 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION:.*]] do i=1,10 a(i)=i end do @@ -14,7 +16,9 @@ end subroutine unroll_and_jam_dir subroutine unroll_and_jam_dir_0 integer :: a(10) !dir$ unroll_and_jam 0 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE:.*]] do i=1,10 a(i)=i end do @@ -24,7 +28,9 @@ end subroutine unroll_and_jam_dir_0 subroutine unroll_and_jam_dir_1 integer :: a(10) !dir$ unroll_and_jam 1 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE]] do i=1,10 a(i)=i end do @@ -34,7 +40,9 @@ end subroutine unroll_and_jam_dir_1 subroutine nounroll_and_jam_dir integer :: a(10) !dir$ nounroll_and_jam - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE]] do i=1,10 a(i)=i end do @@ -44,7 +52,9 @@ end subroutine nounroll_and_jam_dir subroutine unroll_and_jam_dir_no_factor integer :: a(10) !dir$ unroll_and_jam - ! 
CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION_NO_FACTOR:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION_NO_FACTOR:.*]] do i=1,10 a(i)=i end do diff --git a/flang/test/Integration/vector-always.f90 b/flang/test/Integration/vector-always.f90 index b73b439ecad18..1d8aad97bde70 100644 --- a/flang/test/Integration/vector-always.f90 +++ b/flang/test/Integration/vector-always.f90 @@ -16,7 +16,9 @@ end subroutine vector_always subroutine no_vector integer :: a(10) !dir$ novector - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION2:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION2:.*]] do i=1,10 a(i)=i end do From flang-commits at lists.llvm.org Tue May 13 12:11:58 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 12:11:58 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723) In-Reply-To: Message-ID: <6823997e.170a0220.10096a.ece4@mx.google.com> https://github.com/khaki3 updated https://github.com/llvm/llvm-project/pull/139723 >From d15c2f009e87fe7dfba2969eb0b979f9ecd025ab Mon Sep 17 00:00:00 2001 From: Kazuaki Matsumura Date: Tue, 13 May 2025 05:08:27 -0700 Subject: [PATCH 1/5] [flang][acc] Remove async and structured flag from data actions; Rename UpdateOp's async to asyncOnly; Print asyncOnly --- flang/lib/Lower/OpenACC.cpp | 354 +++++++----------- mlir/include/mlir/Dialect/OpenACC/OpenACC.h | 10 +- .../mlir/Dialect/OpenACC/OpenACCOps.td | 195 ++-------- mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp | 34 +- 4 files changed, 188 insertions(+), 405 deletions(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index e1918288d6de3..c1a8dd0d5a478 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ 
-104,15 +104,12 @@ static void addOperand(llvm::SmallVectorImpl &operands, } template -static Op -createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, - mlir::Value baseAddr, std::stringstream &name, - mlir::SmallVector bounds, bool structured, - bool implicit, mlir::acc::DataClause dataClause, - mlir::Type retTy, llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, - bool unwrapBoxAddr = false, mlir::Value isPresent = {}) { +static Op createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, + mlir::Value baseAddr, std::stringstream &name, + mlir::SmallVector bounds, + bool implicit, mlir::acc::DataClause dataClause, + mlir::Type retTy, bool unwrapBoxAddr = false, + mlir::Value isPresent = {}) { mlir::Value varPtrPtr; // The data clause may apply to either the box reference itself or the // pointer to the data it holds. So use `unwrapBoxAddr` to decide. @@ -157,11 +154,9 @@ createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, addOperand(operands, operandSegments, baseAddr); addOperand(operands, operandSegments, varPtrPtr); addOperands(operands, operandSegments, bounds); - addOperands(operands, operandSegments, async); Op op = builder.create(loc, retTy, operands); op.setNameAttr(builder.getStringAttr(name.str())); - op.setStructured(structured); op.setImplicit(implicit); op.setDataClause(dataClause); if (auto mappableTy = @@ -176,10 +171,6 @@ createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, op->setAttr(Op::getOperandSegmentSizeAttr(), builder.getDenseI32ArrayAttr(operandSegments)); - if (!asyncDeviceTypes.empty()) - op.setAsyncOperandsDeviceTypeAttr(builder.getArrayAttr(asyncDeviceTypes)); - if (!asyncOnlyDeviceTypes.empty()) - op.setAsyncOnlyAttr(builder.getArrayAttr(asyncOnlyDeviceTypes)); return op; } @@ -249,9 +240,7 @@ static void createDeclareAllocFuncWithArg(mlir::OpBuilder &modBuilder, mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, 
registerFuncOp.getArgument(0), asFortranDesc, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, descTy, - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, descTy); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -263,8 +252,7 @@ static void createDeclareAllocFuncWithArg(mlir::OpBuilder &modBuilder, addDeclareAttr(builder, boxAddrOp.getOperation(), clause); EntryOp entryOp = createDataEntryOp( builder, loc, boxAddrOp.getResult(), asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, boxAddrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, boxAddrOp.getType()); builder.create( loc, mlir::acc::DeclareTokenType::get(entryOp.getContext()), mlir::ValueRange(entryOp.getAccVar())); @@ -302,26 +290,20 @@ static void createDeclareDeallocFuncWithArg( mlir::acc::GetDevicePtrOp entryOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, var.getType()); builder.create( loc, mlir::Value{}, mlir::ValueRange(entryOp.getAccVar())); if constexpr (std::is_same_v || std::is_same_v) - builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getVar(), entryOp.getVarType(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, - builder.getStringAttr(*entryOp.getName())); + builder.create( + entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), + entryOp.getVarType(), entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, 
builder.getStringAttr(*entryOp.getName())); else builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); // Generate the post dealloc function. @@ -341,9 +323,8 @@ static void createDeclareDeallocFuncWithArg( mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, + var.getType()); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -700,10 +681,7 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - mlir::acc::DataClause dataClause, bool structured, - bool implicit, llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, + mlir::acc::DataClause dataClause, bool implicit, bool setDeclareAttr = false) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; @@ -732,9 +710,8 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, ? 
info.rawInput : info.addr; Op op = createDataEntryOp( - builder, operandLocation, baseAddr, asFortran, bounds, structured, - implicit, dataClause, baseAddr.getType(), async, asyncDeviceTypes, - asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true, info.isPresent); + builder, operandLocation, baseAddr, asFortran, bounds, implicit, + dataClause, baseAddr.getType(), /*unwrapBoxAddr=*/true, info.isPresent); dataOperands.push_back(op.getAccVar()); } } @@ -746,7 +723,7 @@ static void genDeclareDataOperandOperations( Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - mlir::acc::DataClause dataClause, bool structured, bool implicit) { + mlir::acc::DataClause dataClause, bool implicit) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -765,10 +742,9 @@ static void genDeclareDataOperandOperations( /*genDefaultBounds=*/generateDefaultBounds, /*strideIncludeLowerExtent=*/strideIncludeLowerExtent); LLVM_DEBUG(llvm::dbgs() << __func__ << "\n"; info.dump(llvm::dbgs())); - EntryOp op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, structured, - implicit, dataClause, info.addr.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + EntryOp op = createDataEntryOp(builder, operandLocation, info.addr, + asFortran, bounds, implicit, + dataClause, info.addr.getType()); dataOperands.push_back(op.getAccVar()); addDeclareAttr(builder, op.getVar().getDefiningOp(), dataClause); if (mlir::isa(fir::unwrapRefType(info.addr.getType()))) { @@ -805,14 +781,12 @@ static void genDeclareDataOperandOperationsWithModifier( (modifier && (*modifier).v == mod) ? 
clauseWithModifier : clause; genDeclareDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, - dataClause, - /*structured=*/true, /*implicit=*/false); + dataClause, /*implicit=*/false); } template static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { + llvm::SmallVector operands) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); @@ -820,16 +794,13 @@ static void genDataExitOperations(fir::FirOpBuilder &builder, std::is_same_v) builder.create( entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getVarType(), entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), - entryOp.getDataClause(), structured, entryOp.getImplicit(), - builder.getStringAttr(*entryOp.getName())); - else - builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), - entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, + entryOp.getVarType(), entryOp.getBounds(), entryOp.getDataClause(), entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); + else + builder.create(entryOp.getLoc(), entryOp.getAccVar(), + entryOp.getBounds(), entryOp.getDataClause(), + entryOp.getImplicit(), + builder.getStringAttr(*entryOp.getName())); } } @@ -1240,10 +1211,7 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - llvm::SmallVector &privatizations, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { + llvm::SmallVector &privatizations) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); 
Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -1272,9 +1240,9 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, recipe = Fortran::lower::createOrGetPrivateRecipe(builder, recipeName, operandLocation, retTy); auto op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, true, - /*implicit=*/false, mlir::acc::DataClause::acc_private, retTy, async, - asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); + builder, operandLocation, info.addr, asFortran, bounds, + /*implicit=*/false, mlir::acc::DataClause::acc_private, retTy, + /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } else { std::string suffix = @@ -1284,9 +1252,8 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, recipe = Fortran::lower::createOrGetFirstprivateRecipe( builder, recipeName, operandLocation, retTy, bounds); auto op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, true, + builder, operandLocation, info.addr, asFortran, bounds, /*implicit=*/false, mlir::acc::DataClause::acc_firstprivate, retTy, - async, asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } @@ -1869,10 +1836,7 @@ genReductions(const Fortran::parser::AccObjectListWithReduction &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &reductionOperands, - llvm::SmallVector &reductionRecipes, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { + llvm::SmallVector &reductionRecipes) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); const auto &objects = std::get(objectList.t); const auto &op = std::get(objectList.t); @@ -1904,9 +1868,8 @@ genReductions(const Fortran::parser::AccObjectListWithReduction &objectList, auto op = createDataEntryOp( builder, operandLocation, 
info.addr, asFortran, bounds, - /*structured=*/true, /*implicit=*/false, - mlir::acc::DataClause::acc_reduction, info.addr.getType(), async, - asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); + /*implicit=*/false, mlir::acc::DataClause::acc_reduction, + info.addr.getType(), /*unwrapBoxAddr=*/true); mlir::Type ty = op.getAccVar().getType(); if (!areAllBoundConstant(bounds) || fir::isAssumedShape(info.addr.getType()) || @@ -2169,9 +2132,8 @@ static void privatizeIv(Fortran::lower::AbstractConverter &converter, std::stringstream asFortran; asFortran << Fortran::lower::mangle::demangleName(toStringRef(sym.name())); auto op = createDataEntryOp( - builder, loc, ivValue, asFortran, {}, true, /*implicit=*/true, - mlir::acc::DataClause::acc_private, ivValue.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + builder, loc, ivValue, asFortran, {}, /*implicit=*/true, + mlir::acc::DataClause::acc_private, ivValue.getType()); privateOp = op.getOperation(); privateOperands.push_back(op.getAccVar()); @@ -2328,14 +2290,12 @@ static mlir::acc::LoopOp createLoopOp( &clause.u)) { genPrivatizations( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, /*async=*/{}, - /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + privateOperands, privatizations); } else if (const auto *reductionClause = std::get_if( &clause.u)) { genReductions(reductionClause->v, converter, semanticsContext, stmtCtx, - reductionOperands, reductionRecipes, /*async=*/{}, - /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + reductionOperands, reductionRecipes); } else if (std::get_if(&clause.u)) { for (auto crtDeviceTypeAttr : crtDeviceTypes) seqDeviceTypes.push_back(crtDeviceTypeAttr); @@ -2613,9 +2573,6 @@ static void genDataOperandOperationsWithModifier( llvm::SmallVectorImpl &dataClauseOperands, const mlir::acc::DataClause clause, const mlir::acc::DataClause clauseWithModifier, - llvm::ArrayRef async, - llvm::ArrayRef 
asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, bool setDeclareAttr = false) { const Fortran::parser::AccObjectListWithModifier &listWithModifier = x->v; const auto &accObjectList = @@ -2627,9 +2584,7 @@ static void genDataOperandOperationsWithModifier( (modifier && (*modifier).v == mod) ? clauseWithModifier : clause; genDataOperandOperations(accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, dataClause, - /*structured=*/true, /*implicit=*/false, async, - asyncDeviceTypes, asyncOnlyDeviceTypes, - setDeclareAttr); + /*implicit=*/false, setDeclareAttr); } template @@ -2779,8 +2734,7 @@ static Op createComputeOp( genDataOperandOperations( copyClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copy, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyinClause = @@ -2791,8 +2745,7 @@ static Op createComputeOp( copyinClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyin, - mlir::acc::DataClause::acc_copyin_readonly, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyin_readonly); copyinEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyoutClause = @@ -2804,8 +2757,7 @@ static Op createComputeOp( copyoutClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyout, - mlir::acc::DataClause::acc_copyout_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyout_zero); copyoutEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto 
*createClause = @@ -2816,8 +2768,7 @@ static Op createComputeOp( createClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::Zero, dataClauseOperands, mlir::acc::DataClause::acc_create, - mlir::acc::DataClause::acc_create_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_create_zero); createEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *noCreateClause = @@ -2827,8 +2778,7 @@ static Op createComputeOp( genDataOperandOperations( noCreateClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_no_create, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); nocreateEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *presentClause = @@ -2838,8 +2788,7 @@ static Op createComputeOp( genDataOperandOperations( presentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_present, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); presentEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *devicePtrClause = @@ -2848,16 +2797,14 @@ static Op createComputeOp( genDataOperandOperations( devicePtrClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_deviceptr, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); } else if (const auto *attachClause = std::get_if(&clause.u)) { auto crtDataStart = dataClauseOperands.size(); genDataOperandOperations( attachClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_attach, - /*structured=*/true, /*implicit=*/false, async, 
asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); attachEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *privateClause = @@ -2866,15 +2813,13 @@ static Op createComputeOp( if (!combinedConstructs) genPrivatizations( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + privateOperands, privatizations); } else if (const auto *firstprivateClause = std::get_if( &clause.u)) { genPrivatizations( firstprivateClause->v, converter, semanticsContext, stmtCtx, - firstprivateOperands, firstPrivatizations, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + firstprivateOperands, firstPrivatizations); } else if (const auto *reductionClause = std::get_if( &clause.u)) { @@ -2885,16 +2830,14 @@ static Op createComputeOp( // instead. if (!combinedConstructs) { genReductions(reductionClause->v, converter, semanticsContext, stmtCtx, - reductionOperands, reductionRecipes, async, - asyncDeviceTypes, asyncOnlyDeviceTypes); + reductionOperands, reductionRecipes); } else { auto crtDataStart = dataClauseOperands.size(); genDataOperandOperations( std::get(reductionClause->v.t), converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_reduction, - /*structured=*/true, /*implicit=*/true, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/true); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } @@ -2997,19 +2940,19 @@ static Op createComputeOp( // Create the exit operations after the region. 
genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands); genDataExitOperations( - builder, attachEntryOperands, /*structured=*/true); + builder, attachEntryOperands); genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands); genDataExitOperations( - builder, nocreateEntryOperands, /*structured=*/true); + builder, nocreateEntryOperands); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands); builder.restoreInsertionPoint(insPt); return computeOp; @@ -3078,8 +3021,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( copyClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copy, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyinClause = @@ -3090,8 +3032,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, copyinClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyin, - mlir::acc::DataClause::acc_copyin_readonly, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyin_readonly); copyinEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyoutClause = @@ -3103,8 +3044,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, copyoutClause, converter, semanticsContext, stmtCtx, 
Fortran::parser::AccDataModifier::Modifier::Zero, dataClauseOperands, mlir::acc::DataClause::acc_copyout, - mlir::acc::DataClause::acc_copyout_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyout_zero); copyoutEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *createClause = @@ -3115,8 +3055,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, createClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::Zero, dataClauseOperands, mlir::acc::DataClause::acc_create, - mlir::acc::DataClause::acc_create_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_create_zero); createEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *noCreateClause = @@ -3126,8 +3065,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( noCreateClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_no_create, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); nocreateEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *presentClause = @@ -3137,8 +3075,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( presentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_present, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); presentEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *deviceptrClause = @@ -3147,16 +3084,14 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( 
deviceptrClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_deviceptr, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); } else if (const auto *attachClause = std::get_if(&clause.u)) { auto crtDataStart = dataClauseOperands.size(); genDataOperandOperations( attachClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_attach, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); attachEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *defaultClause = @@ -3211,19 +3146,19 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, // Create the exit operations after the region. genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands); genDataExitOperations( - builder, attachEntryOperands, /*structured=*/true); + builder, attachEntryOperands); genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands); genDataExitOperations( - builder, nocreateEntryOperands, /*structured=*/true); + builder, nocreateEntryOperands); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands); builder.restoreInsertionPoint(insPt); } @@ -3252,8 +3187,7 @@ genACCHostDataOp(Fortran::lower::AbstractConverter &converter, genDataOperandOperations( useDevice->v, converter, semanticsContext, stmtCtx, dataOperands, mlir::acc::DataClause::acc_use_device, - /*structured=*/true, /*implicit=*/false, /*async=*/{}, - /*asyncDeviceTypes=*/{}, 
/*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false); } else if (std::get_if(&clause.u)) { addIfPresentAttr = true; } @@ -3430,9 +3364,8 @@ genACCEnterDataOp(Fortran::lower::AbstractConverter &converter, std::get(listWithModifier.t); genDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, - dataClauseOperands, mlir::acc::DataClause::acc_copyin, false, - /*implicit=*/false, asyncValues, asyncDeviceTypes, - asyncOnlyDeviceTypes); + dataClauseOperands, mlir::acc::DataClause::acc_copyin, + /*implicit=*/false); } else if (const auto *createClause = std::get_if(&clause.u)) { const Fortran::parser::AccObjectListWithModifier &listWithModifier = @@ -3448,15 +3381,13 @@ genACCEnterDataOp(Fortran::lower::AbstractConverter &converter, clause = mlir::acc::DataClause::acc_create_zero; genDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, - dataClauseOperands, clause, false, /*implicit=*/false, asyncValues, - asyncDeviceTypes, asyncOnlyDeviceTypes); + dataClauseOperands, clause, /*implicit=*/false); } else if (const auto *attachClause = std::get_if(&clause.u)) { genDataOperandOperations( attachClause->v, converter, semanticsContext, stmtCtx, - dataClauseOperands, mlir::acc::DataClause::acc_attach, false, - /*implicit=*/false, asyncValues, asyncDeviceTypes, - asyncOnlyDeviceTypes); + dataClauseOperands, mlir::acc::DataClause::acc_attach, + /*implicit=*/false); } else if (!std::get_if(&clause.u)) { llvm::report_fatal_error( "Unknown clause in ENTER DATA directive lowering"); @@ -3544,20 +3475,17 @@ genACCExitDataOp(Fortran::lower::AbstractConverter &converter, std::get(listWithModifier.t); genDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, copyoutOperands, - mlir::acc::DataClause::acc_copyout, false, /*implicit=*/false, - asyncValues, asyncDeviceTypes, asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyout, /*implicit=*/false); } else if (const auto *deleteClause = std::get_if(&clause.u)) { 
genDataOperandOperations( deleteClause->v, converter, semanticsContext, stmtCtx, deleteOperands, - mlir::acc::DataClause::acc_delete, false, /*implicit=*/false, - asyncValues, asyncDeviceTypes, asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_delete, /*implicit=*/false); } else if (const auto *detachClause = std::get_if(&clause.u)) { genDataOperandOperations( detachClause->v, converter, semanticsContext, stmtCtx, detachOperands, - mlir::acc::DataClause::acc_detach, false, /*implicit=*/false, - asyncValues, asyncDeviceTypes, asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_detach, /*implicit=*/false); } else if (std::get_if(&clause.u)) { addFinalizeAttr = true; } @@ -3587,11 +3515,11 @@ genACCExitDataOp(Fortran::lower::AbstractConverter &converter, exitDataOp.setFinalizeAttr(builder.getUnitAttr()); genDataExitOperations( - builder, copyoutOperands, /*structured=*/false); + builder, copyoutOperands); genDataExitOperations( - builder, deleteOperands, /*structured=*/false); + builder, deleteOperands); genDataExitOperations( - builder, detachOperands, /*structured=*/false); + builder, detachOperands); } template @@ -3765,16 +3693,14 @@ genACCUpdateOp(Fortran::lower::AbstractConverter &converter, std::get_if(&clause.u)) { genDataOperandOperations( hostClause->v, converter, semanticsContext, stmtCtx, - updateHostOperands, mlir::acc::DataClause::acc_update_host, false, - /*implicit=*/false, asyncOperands, asyncOperandsDeviceTypes, - asyncOnlyDeviceTypes); + updateHostOperands, mlir::acc::DataClause::acc_update_host, + /*implicit=*/false); } else if (const auto *deviceClause = std::get_if(&clause.u)) { genDataOperandOperations( deviceClause->v, converter, semanticsContext, stmtCtx, - dataClauseOperands, mlir::acc::DataClause::acc_update_device, false, - /*implicit=*/false, asyncOperands, asyncOperandsDeviceTypes, - asyncOnlyDeviceTypes); + dataClauseOperands, mlir::acc::DataClause::acc_update_device, + /*implicit=*/false); } else if (std::get_if(&clause.u)) { 
ifPresent = true; } else if (const auto *selfClause = @@ -3786,9 +3712,8 @@ genACCUpdateOp(Fortran::lower::AbstractConverter &converter, assert(accObjectList && "expect AccObjectList"); genDataOperandOperations( *accObjectList, converter, semanticsContext, stmtCtx, - updateHostOperands, mlir::acc::DataClause::acc_update_self, false, - /*implicit=*/false, asyncOperands, asyncOperandsDeviceTypes, - asyncOnlyDeviceTypes); + updateHostOperands, mlir::acc::DataClause::acc_update_self, + /*implicit=*/false); } } @@ -3805,7 +3730,7 @@ genACCUpdateOp(Fortran::lower::AbstractConverter &converter, ifPresent); genDataExitOperations( - builder, updateHostOperands, /*structured=*/false); + builder, updateHostOperands); } static void @@ -3928,9 +3853,8 @@ static void createDeclareGlobalOp(mlir::OpBuilder &modBuilder, llvm::SmallVector bounds; EntryOp entryOp = createDataEntryOp( - builder, loc, addrOp.getResTy(), asFortran, bounds, - /*structured=*/false, implicit, clause, addrOp.getResTy().getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + builder, loc, addrOp.getResTy(), asFortran, bounds, implicit, clause, + addrOp.getResTy().getType()); if constexpr (std::is_same_v) builder.create( loc, mlir::acc::DeclareTokenType::get(entryOp.getContext()), @@ -3940,10 +3864,8 @@ static void createDeclareGlobalOp(mlir::OpBuilder &modBuilder, mlir::ValueRange(entryOp.getAccVar())); if constexpr (std::is_same_v) { builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); } builder.create(loc); @@ -3977,9 +3899,8 @@ static void createDeclareAllocFunc(mlir::OpBuilder &modBuilder, mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, addrOp, 
asFortranDesc, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, addrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, + addrOp.getType()); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -3990,8 +3911,7 @@ static void createDeclareAllocFunc(mlir::OpBuilder &modBuilder, addDeclareAttr(builder, boxAddrOp.getOperation(), clause); EntryOp entryOp = createDataEntryOp( builder, loc, boxAddrOp.getResult(), asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, boxAddrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, boxAddrOp.getType()); builder.create( loc, mlir::acc::DeclareTokenType::get(entryOp.getContext()), mlir::ValueRange(entryOp.getAccVar())); @@ -4035,8 +3955,7 @@ static void createDeclareDeallocFunc(mlir::OpBuilder &modBuilder, mlir::acc::GetDevicePtrOp entryOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, var.getType()); builder.create( loc, mlir::Value{}, mlir::ValueRange(entryOp.getAccVar())); @@ -4045,18 +3964,13 @@ static void createDeclareDeallocFunc(mlir::OpBuilder &modBuilder, std::is_same_v) builder.create( entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), - entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, - builder.getStringAttr(*entryOp.getName())); + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); else - builder.create( - 
entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), - entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, - builder.getStringAttr(*entryOp.getName())); + builder.create(entryOp.getLoc(), entryOp.getAccVar(), + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, + builder.getStringAttr(*entryOp.getName())); // Generate the post dealloc function. modBuilder.setInsertionPointAfter(preDeallocOp); @@ -4076,9 +3990,8 @@ static void createDeclareDeallocFunc(mlir::OpBuilder &modBuilder, mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, addrOp, asFortran, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, addrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, + addrOp.getType()); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -4216,7 +4129,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::CopyoutOp>( copyClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copy, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *createClause = @@ -4229,7 +4142,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, genDeclareDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_create, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); createEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *presentClause = 
@@ -4240,7 +4153,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::DeleteOp>( presentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_present, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); presentEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyinClause = @@ -4266,7 +4179,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::CopyoutOp>( accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copyout, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); copyoutEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *devicePtrClause = @@ -4276,14 +4189,14 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, mlir::acc::DevicePtrOp>( devicePtrClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_deviceptr, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); } else if (const auto *linkClause = std::get_if(&clause.u)) { genDeclareDataOperandOperations( linkClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_declare_link, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); } else if (const auto *deviceResidentClause = std::get_if( &clause.u)) { @@ -4293,7 +4206,7 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, deviceResidentClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_declare_device_resident, - /*structured=*/true, /*implicit=*/false); + /*implicit=*/false); deviceResidentEntryOperands.append( dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else { @@ -4341,18 +4254,18 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter, } 
genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands); genDataExitOperations( - builder, deviceResidentEntryOperands, /*structured=*/true); + mlir::acc::DeleteOp>(builder, + deviceResidentEntryOperands); genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands); }); } @@ -4702,12 +4615,11 @@ genACC(Fortran::lower::AbstractConverter &converter, if (modifier && (*modifier).v == Fortran::parser::AccDataModifier::Modifier::ReadOnly) dataClause = mlir::acc::DataClause::acc_cache_readonly; - genDataOperandOperations( - accObjectList, converter, semanticsContext, stmtCtx, cacheOperands, - dataClause, - /*structured=*/true, /*implicit=*/false, - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}, - /*setDeclareAttr*/ false); + genDataOperandOperations(accObjectList, converter, + semanticsContext, stmtCtx, + cacheOperands, dataClause, + /*implicit=*/false, + /*setDeclareAttr*/ false); loopOp.getCacheOperandsMutable().append(cacheOperands); } else { llvm::report_fatal_error( diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h index ff5845343313c..e053e3d2bbcfc 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h @@ -117,19 +117,19 @@ mlir::Value getVarPtrPtr(mlir::Operation *accDataClauseOp); /// Returns an empty vector if there are no bounds. mlir::SmallVector getBounds(mlir::Operation *accDataClauseOp); -/// Used to obtain `async` operands from an acc data clause operation. 
+/// Used to obtain `async` operands from an acc operation. /// Returns an empty vector if there are no such operands. mlir::SmallVector getAsyncOperands(mlir::Operation *accDataClauseOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to -/// an acc data clause operation, that correspond to the device types -/// associated with the async clauses with an async-value. +/// an acc operation, that correspond to the device types associated with the +/// async clauses with an async-value. mlir::ArrayAttr getAsyncOperandsDeviceType(mlir::Operation *accDataClauseOp); /// Returns an array of acc:DeviceTypeAttr attributes attached to -/// an acc data clause operation, that correspond to the device types -/// associated with the async clauses without an async-value. +/// an acc operation, that correspond to the device types associated with the +/// async clauses without an async-value. mlir::ArrayAttr getAsyncOnly(mlir::Operation *accDataClauseOp); /// Used to obtain the `name` from an acc operation. diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td index 5d5add6318e06..59b9a50144a1e 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td @@ -470,11 +470,7 @@ class OpenACC_DataEntryOp:$varPtrPtr, Variadic:$bounds, /* rank-0 to rank-{n-1} */ - Variadic:$asyncOperands, - OptionalAttr:$asyncOperandsDeviceType, - OptionalAttr:$asyncOnly, DefaultValuedAttr:$dataClause, - DefaultValuedAttr:$structured, DefaultValuedAttr:$implicit, OptionalAttr:$name)); @@ -491,63 +487,16 @@ class OpenACC_DataEntryOp(attr); - if (deviceTypeAttr.getValue() == deviceType) - return true; - } - return false; - } - /// Return the value of the async clause if present. - mlir::Value getAsyncValue() { - return getAsyncValue(mlir::acc::DeviceType::None); - } - /// Return the value of the async clause for the given device_type if - /// present. 
- mlir::Value getAsyncValue(mlir::acc::DeviceType deviceType) { - mlir::ArrayAttr deviceTypes = getAsyncOperandsDeviceTypeAttr(); - if (!deviceTypes) - return nullptr; - for (auto [attr, asyncValue] : - llvm::zip(deviceTypes, getAsyncOperands())) { - auto deviceTypeAttr = mlir::dyn_cast(attr); - if (deviceTypeAttr.getValue() == deviceType) - return asyncValue; - } - return nullptr; - } mlir::TypedValue getVarPtr() { return mlir::dyn_cast>(getVar()); } @@ -561,16 +510,13 @@ class OpenACC_DataEntryOp($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` ) `->` type($accVar) attr-dict }]; let hasVerifier = 1; let builders = [ - OpBuilder<(ins "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$var, "bool":$implicit, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ auto ptrLikeTy = ::mlir::dyn_cast<::mlir::acc::PointerLikeType>( @@ -579,14 +525,10 @@ class OpenACC_DataEntryOp, - OpBuilder<(ins "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$var, "bool":$implicit, "const ::llvm::Twine &":$name, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ @@ -596,10 +538,7 @@ class OpenACC_DataEntryOp]; @@ -829,15 +768,10 @@ def OpenACC_CacheOp : OpenACC_DataEntryOp<"cache", class OpenACC_DataExitOp traits = [], dag additionalArgs = (ins)> : OpenACC_Op]>])> { + [MemoryEffects<[MemRead]>])> { let arguments = !con(additionalArgs, (ins Variadic:$bounds, - Variadic:$asyncOperands, - OptionalAttr:$asyncOperandsDeviceType, - OptionalAttr:$asyncOnly, DefaultValuedAttr:$dataClause, - DefaultValuedAttr:$structured, DefaultValuedAttr:$implicit, OptionalAttr:$name)); @@ -846,65 +780,15 @@ class OpenACC_DataExitOp(attr); - if (deviceTypeAttr.getValue() == deviceType) - return true; - } - return false; - } - /// Return the value of the async clause if present. 
- mlir::Value getAsyncValue() { - return getAsyncValue(mlir::acc::DeviceType::None); - } - /// Return the value of the async clause for the given device_type if - /// present. - mlir::Value getAsyncValue(mlir::acc::DeviceType deviceType) { - mlir::ArrayAttr deviceTypes = getAsyncOperandsDeviceTypeAttr(); - if (!deviceTypes) - return nullptr; - for (auto [attr, asyncValue] : - llvm::zip(deviceTypes, getAsyncOperands())) { - auto deviceTypeAttr = mlir::dyn_cast(attr); - if (deviceTypeAttr.getValue() == deviceType) - return asyncValue; - } - return nullptr; - } - }]; - let hasVerifier = 1; } @@ -922,16 +806,13 @@ class OpenACC_DataExitOpWithVarPtr let assemblyFormat = [{ custom($accVar, type($accVar)) (`bounds` `(` $bounds^ `)` )? - (`async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType)^ `)`)? `to` custom($var) `:` custom(type($var), $varType) attr-dict }]; let builders = [ OpBuilder<(ins "::mlir::Value":$accVar, - "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + "::mlir::Value":$var, "bool":$implicit, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ auto ptrLikeTy = ::mlir::dyn_cast<::mlir::acc::PointerLikeType>( @@ -940,14 +821,11 @@ class OpenACC_DataExitOpWithVarPtr /*varType=*/ptrLikeTy ? ::mlir::TypeAttr::get(ptrLikeTy.getElementType()) : ::mlir::TypeAttr::get(var.getType()), - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/nullptr); }]>, OpBuilder<(ins "::mlir::Value":$accVar, - "::mlir::Value":$var, - "bool":$structured, "bool":$implicit, + "::mlir::Value":$var, "bool":$implicit, "const ::llvm::Twine &":$name, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ @@ -957,9 +835,7 @@ class OpenACC_DataExitOpWithVarPtr /*varType=*/ptrLikeTy ? 
::mlir::TypeAttr::get(ptrLikeTy.getElementType()) : ::mlir::TypeAttr::get(var.getType()), - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/$_builder.getStringAttr(name)); }]>]; @@ -983,31 +859,23 @@ class OpenACC_DataExitOpNoVarPtr : let assemblyFormat = [{ custom($accVar, type($accVar)) (`bounds` `(` $bounds^ `)` )? - (`async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType)^ `)`)? attr-dict }]; let builders = [ - OpBuilder<(ins "::mlir::Value":$accVar, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$accVar, "bool":$implicit, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ build($_builder, $_state, accVar, - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/nullptr); }]>, - OpBuilder<(ins "::mlir::Value":$accVar, - "bool":$structured, "bool":$implicit, + OpBuilder<(ins "::mlir::Value":$accVar, "bool":$implicit, "const ::llvm::Twine &":$name, CArg<"::mlir::ValueRange", "{}">:$bounds), [{ build($_builder, $_state, accVar, - bounds, /*asyncOperands=*/{}, /*asyncOperandsDeviceType=*/nullptr, - /*asyncOnly=*/nullptr, /*dataClause=*/nullptr, - /*structured=*/$_builder.getBoolAttr(structured), + bounds, /*dataClause=*/nullptr, /*implicit=*/$_builder.getBoolAttr(implicit), /*name=*/$_builder.getStringAttr(name)); }]> @@ -1027,7 +895,7 @@ def OpenACC_CopyoutOp : OpenACC_DataExitOpWithVarPtr<"copyout", "mlir::acc::DataClause::acc_copyout"> { let summary = "Represents acc copyout semantics - reverse of copyin."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit # [{ + let 
extraClassDeclaration = extraClassDeclarationDataExit # [{ /// Check if this is a copyout with zero modifier. bool isCopyoutZero(); }]; @@ -1039,7 +907,7 @@ def OpenACC_CopyoutOp : OpenACC_DataExitOpWithVarPtr<"copyout", def OpenACC_DeleteOp : OpenACC_DataExitOpNoVarPtr<"delete", "mlir::acc::DataClause::acc_delete"> { let summary = "Represents acc delete semantics - reverse of create."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit; + let extraClassDeclaration = extraClassDeclarationDataExit; } //===----------------------------------------------------------------------===// @@ -1048,7 +916,7 @@ def OpenACC_DeleteOp : OpenACC_DataExitOpNoVarPtr<"delete", def OpenACC_DetachOp : OpenACC_DataExitOpNoVarPtr<"detach", "mlir::acc::DataClause::acc_detach"> { let summary = "Represents acc detach semantics - reverse of attach."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit; + let extraClassDeclaration = extraClassDeclarationDataExit; } //===----------------------------------------------------------------------===// @@ -1057,7 +925,7 @@ def OpenACC_DetachOp : OpenACC_DataExitOpNoVarPtr<"detach", def OpenACC_UpdateHostOp : OpenACC_DataExitOpWithVarPtr<"update_host", "mlir::acc::DataClause::acc_update_host"> { let summary = "Represents acc update host semantics."; - let extraClassDeclaration = extraClassDeclarationBase # extraClassDeclarationDataExit # [{ + let extraClassDeclaration = extraClassDeclarationDataExit # [{ /// Check if this is an acc update self. bool isSelf() { return getDataClause() == acc::DataClause::acc_update_self; @@ -1439,8 +1307,8 @@ def OpenACC_ParallelOp : OpenACC_Op<"parallel", ( `combined` `(` `loop` `)` $combined^)? 
oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1581,8 +1449,8 @@ def OpenACC_SerialOp : OpenACC_Op<"serial", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1750,8 +1618,8 @@ def OpenACC_KernelsOp : OpenACC_Op<"kernels", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `num_gangs` `(` custom($numGangs, type($numGangs), $numGangsDeviceType, $numGangsSegments) `)` | `num_workers` `(` custom($numWorkers, @@ -1799,6 +1667,9 @@ def OpenACC_DataOp : OpenACC_Op<"data", `async` and `wait` operands are supported with `device_type` information. They should only be accessed by the extra provided getters. If modified, the corresponding `device_type` attributes must be modified as well. + + The `asyncOnly` operand is a list of device_type's for which async clause + does not specify a value (default is acc_async_noval - OpenACC 3.3 2.16.1). 
}]; @@ -1870,8 +1741,8 @@ def OpenACC_DataOp : OpenACC_Op<"data", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, @@ -1931,6 +1802,7 @@ def OpenACC_EnterDataOp : OpenACC_Op<"enter_data", Value getDataOperand(unsigned i); }]; + // TODO: Show $async and $wait. let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` @@ -1983,6 +1855,7 @@ def OpenACC_ExitDataOp : OpenACC_Op<"exit_data", Value getDataOperand(unsigned i); }]; + // TODO: Show $async and $wait. let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` @@ -2853,7 +2726,7 @@ def OpenACC_UpdateOp : OpenACC_Op<"update", let arguments = (ins Optional:$ifCond, Variadic:$asyncOperands, OptionalAttr:$asyncOperandsDeviceType, - OptionalAttr:$async, + OptionalAttr:$asyncOnly, Variadic:$waitOperands, OptionalAttr:$waitOperandsSegments, OptionalAttr:$waitOperandsDeviceType, @@ -2901,9 +2774,8 @@ def OpenACC_UpdateOp : OpenACC_Op<"update", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `` custom( - $asyncOperands, type($asyncOperands), - $asyncOperandsDeviceType, $async) + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, $waitOnly) @@ -2946,6 +2818,7 @@ def OpenACC_WaitOp : OpenACC_Op<"wait", [AttrSizedOperandSegments]> { UnitAttr:$async, Optional:$ifCond); + // TODO: Show $async. let assemblyFormat = [{ ( `(` $waitOperands^ `:` type($waitOperands) `)` )? 
oilist(`async` `(` $asyncOperand `:` type($asyncOperand) `)` diff --git a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp index 7eb72d433c972..ee00acecb17b9 100644 --- a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp +++ b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp @@ -3505,7 +3505,7 @@ bool UpdateOp::hasAsyncOnly() { } bool UpdateOp::hasAsyncOnly(mlir::acc::DeviceType deviceType) { - return hasDeviceType(getAsync(), deviceType); + return hasDeviceType(getAsyncOnly(), deviceType); } mlir::Value UpdateOp::getAsyncValue() { @@ -3659,32 +3659,30 @@ mlir::acc::getBounds(mlir::Operation *accDataClauseOp) { } mlir::SmallVector -mlir::acc::getAsyncOperands(mlir::Operation *accDataClauseOp) { +mlir::acc::getAsyncOperands(mlir::Operation *accOp) { return llvm::TypeSwitch>( - accDataClauseOp) - .Case([&](auto dataClause) { - return mlir::SmallVector( - dataClause.getAsyncOperands().begin(), - dataClause.getAsyncOperands().end()); - }) + accOp) + .Case( + [&](auto op) { + return mlir::SmallVector(op.getAsyncOperands().begin(), + op.getAsyncOperands().end()); + }) .Default([&](mlir::Operation *) { return mlir::SmallVector(); }); } -mlir::ArrayAttr -mlir::acc::getAsyncOperandsDeviceType(mlir::Operation *accDataClauseOp) { - return llvm::TypeSwitch(accDataClauseOp) - .Case([&](auto dataClause) { - return dataClause.getAsyncOperandsDeviceTypeAttr(); - }) +mlir::ArrayAttr mlir::acc::getAsyncOperandsDeviceType(mlir::Operation *accOp) { + return llvm::TypeSwitch(accOp) + .Case( + [&](auto op) { return op.getAsyncOperandsDeviceTypeAttr(); }) .Default([&](mlir::Operation *) { return mlir::ArrayAttr{}; }); } -mlir::ArrayAttr mlir::acc::getAsyncOnly(mlir::Operation *accDataClauseOp) { - return llvm::TypeSwitch(accDataClauseOp) - .Case( - [&](auto dataClause) { return dataClause.getAsyncOnlyAttr(); }) +mlir::ArrayAttr mlir::acc::getAsyncOnly(mlir::Operation *accOp) { + return llvm::TypeSwitch(accOp) + .Case( + [&](auto op) { return op.getAsyncOnlyAttr(); }) 
       .Default([&](mlir::Operation *) { return mlir::ArrayAttr{}; });
 }

>From 38e6b58dc6bc4fed2db23dd348c1a0cfe91d4fbe Mon Sep 17 00:00:00 2001
From: Kazuaki Matsumura
Date: Tue, 13 May 2025 05:21:48 -0700
Subject: [PATCH 2/5] [acc] accDataClauseOp -> accOp

---
 mlir/include/mlir/Dialect/OpenACC/OpenACC.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
index e053e3d2bbcfc..f667a6786189b 100644
--- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
+++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
@@ -120,17 +120,17 @@ mlir::SmallVector getBounds(mlir::Operation *accDataClauseOp);
 /// Used to obtain `async` operands from an acc operation.
 /// Returns an empty vector if there are no such operands.
 mlir::SmallVector
-getAsyncOperands(mlir::Operation *accDataClauseOp);
+getAsyncOperands(mlir::Operation *accOp);

 /// Returns an array of acc:DeviceTypeAttr attributes attached to
 /// an acc operation, that correspond to the device types associated with the
 /// async clauses with an async-value.
-mlir::ArrayAttr getAsyncOperandsDeviceType(mlir::Operation *accDataClauseOp);
+mlir::ArrayAttr getAsyncOperandsDeviceType(mlir::Operation *accOp);

 /// Returns an array of acc:DeviceTypeAttr attributes attached to
 /// an acc operation, that correspond to the device types associated with the
 /// async clauses without an async-value.
-mlir::ArrayAttr getAsyncOnly(mlir::Operation *accDataClauseOp);
+mlir::ArrayAttr getAsyncOnly(mlir::Operation *accOp);

 /// Used to obtain the `name` from an acc operation.
 std::optional getVarName(mlir::Operation *accOp);

>From 1a0bac6b11d3c882b3ac7e6e0d28b97806048bec Mon Sep 17 00:00:00 2001
From: Kazuaki Matsumura
Date: Tue, 13 May 2025 05:49:39 -0700
Subject: [PATCH 3/5] [acc] clang-format

---
 mlir/include/mlir/Dialect/OpenACC/OpenACC.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
index f667a6786189b..30271d0599236 100644
--- a/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
+++ b/mlir/include/mlir/Dialect/OpenACC/OpenACC.h
@@ -119,8 +119,7 @@ mlir::SmallVector getBounds(mlir::Operation *accDataClauseOp);

 /// Used to obtain `async` operands from an acc operation.
 /// Returns an empty vector if there are no such operands.
-mlir::SmallVector
-getAsyncOperands(mlir::Operation *accOp);
+mlir::SmallVector getAsyncOperands(mlir::Operation *accOp);

 /// Returns an array of acc:DeviceTypeAttr attributes attached to
 /// an acc operation, that correspond to the device types associated with the

>From 5ee42b2c7a2db13eeb587fc4586ee3dedb3a9614 Mon Sep 17 00:00:00 2001
From: Kazuaki Matsumura
Date: Tue, 13 May 2025 09:05:08 -0700
Subject: [PATCH 4/5] [flang][acc] Revert assemblyFormat changes; asyncOnly is an attribute, no need to handle it

---
 .../mlir/Dialect/OpenACC/OpenACCOps.td | 23 ++++++++-----------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
index 59b9a50144a1e..ead7e95a694db 100644
--- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
+++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
@@ -1307,8 +1307,8 @@ def OpenACC_ParallelOp : OpenACC_Op<"parallel",
     ( `combined` `(` `loop` `)` $combined^)?
oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType) `)` | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1449,8 +1449,8 @@ def OpenACC_SerialOp : OpenACC_Op<"serial", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType) `)` | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1618,8 +1618,8 @@ def OpenACC_KernelsOp : OpenACC_Op<"kernels", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType) `)` | `num_gangs` `(` custom($numGangs, type($numGangs), $numGangsDeviceType, $numGangsSegments) `)` | `num_workers` `(` custom($numWorkers, @@ -1741,8 +1741,8 @@ def OpenACC_DataOp : OpenACC_Op<"data", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType) `)` | `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, @@ -1802,7 +1802,6 @@ def OpenACC_EnterDataOp : OpenACC_Op<"enter_data", Value 
getDataOperand(unsigned i); }]; - // TODO: Show $async and $wait. let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` @@ -1855,7 +1854,6 @@ def OpenACC_ExitDataOp : OpenACC_Op<"exit_data", Value getDataOperand(unsigned i); }]; - // TODO: Show $async and $wait. let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` @@ -2774,8 +2772,8 @@ def OpenACC_UpdateOp : OpenACC_Op<"update", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) `)` + | `async` `(` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, $waitOnly) @@ -2818,7 +2816,6 @@ def OpenACC_WaitOp : OpenACC_Op<"wait", [AttrSizedOperandSegments]> { UnitAttr:$async, Optional:$ifCond); - // TODO: Show $async. let assemblyFormat = [{ ( `(` $waitOperands^ `:` type($waitOperands) `)` )? oilist(`async` `(` $asyncOperand `:` type($asyncOperand) `)` >From 18522fdb0c59b1d06d35cf6b9860e79795968e60 Mon Sep 17 00:00:00 2001 From: Kazuaki Matsumura Date: Tue, 13 May 2025 12:01:23 -0700 Subject: [PATCH 5/5] [flang][acc] Update lit tests --- .../Fir/OpenACC/openacc-type-categories.f90 | 20 +-- flang/test/Lower/OpenACC/acc-bounds.f90 | 6 +- .../OpenACC/acc-data-unwrap-defaultbounds.f90 | 7 +- flang/test/Lower/OpenACC/acc-data.f90 | 7 +- .../Lower/OpenACC/acc-declare-globals.f90 | 20 +-- .../acc-declare-unwrap-defaultbounds.f90 | 38 +++--- flang/test/Lower/OpenACC/acc-declare.f90 | 26 ++-- .../acc-enter-data-unwrap-defaultbounds.f90 | 124 +++++++++--------- flang/test/Lower/OpenACC/acc-enter-data.f90 | 124 +++++++++--------- .../acc-exit-data-unwrap-defaultbounds.f90 | 68 +++++----- flang/test/Lower/OpenACC/acc-exit-data.f90 | 68 +++++----- flang/test/Lower/OpenACC/acc-parallel.f90 | 9 +- flang/test/Lower/OpenACC/acc-update.f90 | 84 ++++++------ mlir/test/Dialect/OpenACC/invalid.mlir | 2 
+- mlir/test/Dialect/OpenACC/ops.mlir | 16 +-- 15 files changed, 311 insertions(+), 308 deletions(-) diff --git a/flang/test/Fir/OpenACC/openacc-type-categories.f90 b/flang/test/Fir/OpenACC/openacc-type-categories.f90 index c25c38422b755..64c5e897e960e 100644 --- a/flang/test/Fir/OpenACC/openacc-type-categories.f90 +++ b/flang/test/Fir/OpenACC/openacc-type-categories.f90 @@ -17,33 +17,33 @@ program main !$acc enter data copyin(complexvar, charvar, ttvar%field, ttvar%fieldarray, arrayconstsize(1)) end program -! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "scalar", structured = false} +! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "scalar"} ! CHECK: Pointer-like: !fir.ref ! CHECK: Type category: scalar -! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "scalaralloc", structured = false} +! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "scalaralloc"} ! CHECK: Pointer-like: !fir.ref>> ! CHECK: Type category: nonscalar -! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "ttvar", structured = false} +! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "ttvar"} ! CHECK: Pointer-like: !fir.ref}>> ! CHECK: Type category: composite -! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "arrayconstsize", structured = false} +! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "arrayconstsize"} ! CHECK: Pointer-like: !fir.ref> ! CHECK: Type category: array -! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "arrayalloc", structured = false} +! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "arrayalloc"} ! CHECK: Pointer-like: !fir.ref>>> ! CHECK: Type category: array -! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "complexvar", structured = false} +! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "complexvar"} ! CHECK: Pointer-like: !fir.ref> ! CHECK: Type category: scalar -! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "charvar", structured = false} +! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "charvar"} ! 
CHECK: Pointer-like: !fir.ref> ! CHECK: Type category: nonscalar -! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "ttvar%field", structured = false} +! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "ttvar%field"} ! CHECK: Pointer-like: !fir.ref ! CHECK: Type category: composite -! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "ttvar%fieldarray", structured = false} +! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "ttvar%fieldarray"} ! CHECK: Pointer-like: !fir.ref> ! CHECK: Type category: array -! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "arrayconstsize(1)", structured = false} +! CHECK: Visiting: {{.*}} acc.copyin {{.*}} {name = "arrayconstsize(1)"} ! CHECK: Pointer-like: !fir.ref> ! CHECK: Type category: array diff --git a/flang/test/Lower/OpenACC/acc-bounds.f90 b/flang/test/Lower/OpenACC/acc-bounds.f90 index cff53a2bfd122..5b3396b54ace7 100644 --- a/flang/test/Lower/OpenACC/acc-bounds.f90 +++ b/flang/test/Lower/OpenACC/acc-bounds.f90 @@ -33,7 +33,7 @@ subroutine acc_derived_type_component_pointer_array() ! CHECK: %[[UB:.*]] = arith.subi %[[BOX_DIMS1]]#1, %[[C1]] : index ! CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%c0{{.*}} : index) upperbound(%[[UB]] : index) extent(%[[BOX_DIMS1]]#1 : index) stride(%[[BOX_DIMS1]]#2 : index) startIdx(%[[BOX_DIMS0]]#0 : index) {strideInBytes = true} ! CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[LOAD]] : (!fir.box>>) -> !fir.ptr> -! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ptr>) bounds(%[[BOUND]]) -> !fir.ptr> {name = "d%array_comp", structured = false} +! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ptr>) bounds(%[[BOUND]]) -> !fir.ptr> {name = "d%array_comp"} ! CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ptr>) ! CHECK: return ! CHECK: } @@ -53,7 +53,7 @@ subroutine acc_derived_type_component_array() ! CHECK: %[[C0:.*]] = arith.constant 0 : index ! CHECK: %[[UB:.*]] = arith.subi %[[C10]], %[[C1]] : index ! 
CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[C0]] : index) upperbound(%[[UB]] : index) extent(%[[C10]] : index) stride(%[[C1]] : index) startIdx(%[[C1]] : index) -! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "d%array_comp", structured = false} +! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "d%array_comp"} ! CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) ! CHECK: return ! CHECK: } @@ -74,7 +74,7 @@ subroutine acc_derived_type_component_allocatable_array() ! CHECK: %[[UB:.*]] = arith.subi %[[BOX_DIMS1]]#1, %[[C1]] : index ! CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%c0{{.*}} : index) upperbound(%[[UB]] : index) extent(%[[BOX_DIMS1]]#1 : index) stride(%[[BOX_DIMS1]]#2 : index) startIdx(%[[BOX_DIMS0]]#0 : index) {strideInBytes = true} ! CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[LOAD]] : (!fir.box>>) -> !fir.heap> -! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "d%array_comp", structured = false} +! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "d%array_comp"} ! CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap>) ! CHECK: return ! CHECK: } diff --git a/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 index d010d39cef4eb..d0b2396500b3a 100644 --- a/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 @@ -161,10 +161,11 @@ subroutine acc_data !$acc data copy(a) async(1) !$acc end data -! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%{{.*}} : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async(%[[ASYNC:.*]] : i32) -> !fir.ref> {dataClause = #acc, name = "a"} -! CHECK: acc.data async(%[[ASYNC]] : i32) dataOperands(%[[COPYIN]] : !fir.ref>) { +! 
CHECK-DAG: %[[COPYIN:.*]] = acc.copyin varPtr(%{{.*}} : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} +! CHECK-DAG: acc.data async(%[[ASYNC:.*]] : i32) dataOperands(%[[COPYIN]] : !fir.ref>) { +! CHECK-DAG: %[[ASYNC]] = arith.constant 1 : i32 ! CHECK: }{{$}} -! CHECK: acc.copyout accPtr(%[[COPYIN]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async(%[[ASYNC]] : i32) to varPtr(%{{.*}} : !fir.ref>) {dataClause = #acc, name = "a"} +! CHECK: acc.copyout accPtr(%[[COPYIN]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) to varPtr(%{{.*}} : !fir.ref>) {dataClause = #acc, name = "a"} !$acc data present(a) wait !$acc end data diff --git a/flang/test/Lower/OpenACC/acc-data.f90 b/flang/test/Lower/OpenACC/acc-data.f90 index 7965fdc0ac707..46687876925e6 100644 --- a/flang/test/Lower/OpenACC/acc-data.f90 +++ b/flang/test/Lower/OpenACC/acc-data.f90 @@ -161,10 +161,11 @@ subroutine acc_data !$acc data copy(a) async(1) !$acc end data -! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%{{.*}} : !fir.ref>) async(%[[ASYNC:.*]] : i32) -> !fir.ref> {dataClause = #acc, name = "a"} -! CHECK: acc.data async(%[[ASYNC]] : i32) dataOperands(%[[COPYIN]] : !fir.ref>) { +! CHECK-DAG: %[[COPYIN:.*]] = acc.copyin varPtr(%{{.*}} : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} +! CHECK-DAG: acc.data async(%[[ASYNC:.*]] : i32) dataOperands(%[[COPYIN]] : !fir.ref>) { +! CHECK-DAG: %[[ASYNC]] = arith.constant 1 : i32 ! CHECK: }{{$}} -! CHECK: acc.copyout accPtr(%[[COPYIN]] : !fir.ref>) async(%[[ASYNC]] : i32) to varPtr(%{{.*}} : !fir.ref>) {dataClause = #acc, name = "a"} +! 
CHECK: acc.copyout accPtr(%[[COPYIN]] : !fir.ref>) to varPtr(%{{.*}} : !fir.ref>) {dataClause = #acc, name = "a"} !$acc data present(a) wait !$acc end data diff --git a/flang/test/Lower/OpenACC/acc-declare-globals.f90 b/flang/test/Lower/OpenACC/acc-declare-globals.f90 index 4556c5f4ddb1c..b79a92afea7bc 100644 --- a/flang/test/Lower/OpenACC/acc-declare-globals.f90 +++ b/flang/test/Lower/OpenACC/acc-declare-globals.f90 @@ -46,16 +46,16 @@ module acc_declare_test ! CHECK-LABEL: acc.global_ctor @_QMacc_declare_testEdata1_acc_ctor { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_testEdata1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {name = "data1", structured = false} +! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {name = "data1"} ! CHECK: acc.declare_enter dataOperands(%[[CREATE]] : !fir.ref>) ! CHECK: acc.terminator ! CHECK: } ! CHECK-LABEL: acc.global_dtor @_QMacc_declare_testEdata1_acc_dtor { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_testEdata1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "data1", structured = false} +! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "data1"} ! CHECK: acc.declare_exit dataOperands(%[[DEVICEPTR]] : !fir.ref>) -! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "data1", structured = false} +! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "data1"} ! CHECK: acc.terminator ! CHECK: } @@ -67,16 +67,16 @@ module acc_declare_copyin_test ! CHECK-LABEL: acc.global_ctor @_QMacc_declare_copyin_testEdata1_acc_ctor { ! 
CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_copyin_testEdata1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {name = "data1", structured = false} +! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {name = "data1"} ! CHECK: acc.declare_enter dataOperands(%[[COPYIN]] : !fir.ref>) ! CHECK: acc.terminator ! CHECK: } ! CHECK-LABEL: acc.global_dtor @_QMacc_declare_copyin_testEdata1_acc_dtor { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_copyin_testEdata1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "data1", structured = false} +! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "data1"} ! CHECK: acc.declare_exit dataOperands(%[[DEVICEPTR]] : !fir.ref>) -! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "data1", structured = false} +! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "data1"} ! CHECK: acc.terminator ! CHECK: } @@ -90,16 +90,16 @@ module acc_declare_device_resident_test ! CHECK-LABEL: acc.global_ctor @_QMacc_declare_device_resident_testEdata1_acc_ctor { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_device_resident_testEdata1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[DEVICERESIDENT:.*]] = acc.declare_device_resident varPtr(%0 : !fir.ref>) -> !fir.ref> {name = "data1", structured = false} +! CHECK: %[[DEVICERESIDENT:.*]] = acc.declare_device_resident varPtr(%0 : !fir.ref>) -> !fir.ref> {name = "data1"} ! CHECK: acc.declare_enter dataOperands(%[[DEVICERESIDENT]] : !fir.ref>) ! CHECK: acc.terminator ! CHECK: } ! CHECK-LABEL: acc.global_dtor @_QMacc_declare_device_resident_testEdata1_acc_dtor { ! 
CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_device_resident_testEdata1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "data1", structured = false} +! CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "data1"} ! CHECK: acc.declare_exit dataOperands(%[[DEVICEPTR]] : !fir.ref>) -! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "data1", structured = false} +! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "data1"} ! CHECK: acc.terminator ! CHECK: } @@ -113,7 +113,7 @@ module acc_declare_device_link_test ! CHECK-LABEL: acc.global_ctor @_QMacc_declare_device_link_testEdata1_acc_ctor { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_device_link_testEdata1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[LINK:.*]] = acc.declare_link varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {name = "data1", structured = false} +! CHECK: %[[LINK:.*]] = acc.declare_link varPtr(%[[GLOBAL_ADDR]] : !fir.ref>) -> !fir.ref> {name = "data1"} ! CHECK: acc.declare_enter dataOperands(%[[LINK]] : !fir.ref>) ! CHECK: acc.terminator ! CHECK: } diff --git a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 index 5bb1ae3797346..7dbfd2de90176 100644 --- a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 @@ -258,11 +258,11 @@ subroutine acc_declare_allocate() ! CHECK-LABEL: func.func private @_QMacc_declareFacc_declare_allocateEa_acc_declare_update_desc_post_alloc( ! CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>>) { -! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[ARG0]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "a_desc", structured = false} +! 
CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[ARG0]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "a_desc"} ! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref>>>) ! CHECK: %[[LOAD:.*]] = fir.load %[[ARG0]] : !fir.ref>>> ! CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[LOAD]] {acc.declare = #acc.declare} : (!fir.box>>) -> !fir.heap> -! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) -> !fir.heap> {name = "a", structured = false} +! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) -> !fir.heap> {name = "a"} ! CHECK: acc.declare_enter dataOperands(%[[CREATE]] : !fir.heap>) ! CHECK: return ! CHECK: } @@ -271,9 +271,9 @@ subroutine acc_declare_allocate() ! CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>>) { ! CHECK: %[[LOAD:.*]] = fir.load %[[ARG0]] : !fir.ref>>> ! CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[LOAD]] {acc.declare = #acc.declare} : (!fir.box>>) -> !fir.heap> -! CHECK: %[[GETDEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[BOX_ADDR]] : !fir.heap>) -> !fir.heap> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[GETDEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[BOX_ADDR]] : !fir.heap>) -> !fir.heap> {dataClause = #acc, name = "a"} ! CHECK: acc.declare_exit dataOperands(%[[GETDEVICEPTR]] : !fir.heap>) -! CHECK: acc.delete accPtr(%[[GETDEVICEPTR]] : !fir.heap>) {dataClause = #acc, name = "a", structured = false} +! CHECK: acc.delete accPtr(%[[GETDEVICEPTR]] : !fir.heap>) {dataClause = #acc, name = "a"} ! CHECK: return ! CHECK: } @@ -281,7 +281,7 @@ subroutine acc_declare_allocate() ! CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>>) { ! CHECK: %[[LOAD:.*]] = fir.load %[[ARG0]] : !fir.ref>>> ! CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[LOAD]] : (!fir.box>>) -> !fir.heap> -! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[BOX_ADDR]] : !fir.heap>) -> !fir.heap> {implicit = true, name = "a_desc", structured = false} +! 
CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[BOX_ADDR]] : !fir.heap>) -> !fir.heap> {implicit = true, name = "a_desc"} ! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.heap>) ! CHECK: return ! CHECK: } @@ -348,18 +348,18 @@ module acc_declare_allocatable_test ! CHECK-LABEL: acc.global_ctor @_QMacc_declare_allocatable_testEdata1_acc_ctor { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) {acc.declare = #acc.declare} : !fir.ref>>> -! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {dataClause = #acc, implicit = true, name = "data1", structured = false} +! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {dataClause = #acc, implicit = true, name = "data1"} ! CHECK: acc.declare_enter dataOperands(%[[COPYIN]] : !fir.ref>>>) ! CHECK: acc.terminator ! CHECK: } ! CHECK-LABEL: func.func private @_QMacc_declare_allocatable_testEdata1_acc_declare_update_desc_post_alloc() { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) : !fir.ref>>> -! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "data1_desc", structured = false} +! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "data1_desc"} ! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref>>>) ! CHECK: %[[LOAD:.*]] = fir.load %[[GLOBAL_ADDR]] : !fir.ref>>> ! CHECK: %[[BOXADDR:.*]] = fir.box_addr %[[LOAD]] {acc.declare = #acc.declare} : (!fir.box>>) -> !fir.heap> -! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOXADDR]] : !fir.heap>) -> !fir.heap> {name = "data1", structured = false} +! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOXADDR]] : !fir.heap>) -> !fir.heap> {name = "data1"} ! CHECK: acc.declare_enter dataOperands(%[[CREATE]] : !fir.heap>) ! CHECK: return ! 
CHECK: } @@ -368,24 +368,24 @@ module acc_declare_allocatable_test ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) : !fir.ref>>> ! CHECK: %[[LOAD:.*]] = fir.load %[[GLOBAL_ADDR]] : !fir.ref>>> ! CHECK: %[[BOXADDR:.*]] = fir.box_addr %[[LOAD]] {acc.declare = #acc.declare} : (!fir.box>>) -> !fir.heap> -! CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[BOXADDR]] : !fir.heap>) -> !fir.heap> {dataClause = #acc, name = "data1", structured = false} +! CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[BOXADDR]] : !fir.heap>) -> !fir.heap> {dataClause = #acc, name = "data1"} ! CHECK: acc.declare_exit dataOperands(%[[DEVPTR]] : !fir.heap>) -! CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.heap>) {dataClause = #acc, name = "data1", structured = false} +! CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.heap>) {dataClause = #acc, name = "data1"} ! CHECK: return ! CHECK: } ! CHECK-LABEL: func.func private @_QMacc_declare_allocatable_testEdata1_acc_declare_update_desc_post_dealloc() { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) : !fir.ref>>> -! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "data1_desc", structured = false} +! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "data1_desc"} ! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref>>>) ! CHECK: return ! CHECK: } ! CHECK-LABEL: acc.global_dtor @_QMacc_declare_allocatable_testEdata1_acc_dtor { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) {acc.declare = #acc.declare} : !fir.ref>>> -! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {dataClause = #acc, name = "data1", structured = false} +! 
CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {dataClause = #acc, name = "data1"} ! CHECK: acc.declare_exit dataOperands(%[[DEVICEPTR]] : !fir.ref>>>) -! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>>>) {dataClause = #acc, name = "data1", structured = false} +! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>>>) {dataClause = #acc, name = "data1"} ! CHECK: acc.terminator ! CHECK: } @@ -400,15 +400,15 @@ module acc_declare_equivalent ! CHECK-LABEL: acc.global_ctor @_QMacc_declare_equivalentEv2_acc_ctor { ! CHECK: %[[ADDR:.*]] = fir.address_of(@_QMacc_declare_equivalentEv1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {name = "v2", structured = false} +! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {name = "v2"} ! CHECK: acc.declare_enter dataOperands(%[[CREATE]] : !fir.ref>) ! CHECK: acc.terminator ! CHECK: } ! CHECK-LABEL: acc.global_dtor @_QMacc_declare_equivalentEv2_acc_dtor { ! CHECK: %[[ADDR:.*]] = fir.address_of(@_QMacc_declare_equivalentEv1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "v2", structured = false} +! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "v2"} ! CHECK: acc.declare_exit dataOperands(%[[DEVICEPTR]] : !fir.ref>) -! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "v2", structured = false} +! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "v2"} ! CHECK: acc.terminator ! CHECK: } @@ -421,15 +421,15 @@ module acc_declare_equivalent2 ! CHECK-LABEL: acc.global_ctor @_QMacc_declare_equivalent2Ev2_acc_ctor { ! CHECK: %[[ADDR:.*]] = fir.address_of(@_QMacc_declare_equivalent2Ev1) {acc.declare = #acc.declare} : !fir.ref> -! 
CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {name = "v2", structured = false} +! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {name = "v2"} ! CHECK: acc.declare_enter dataOperands(%[[CREATE]] : !fir.ref>) ! CHECK: acc.terminator ! CHECK: } ! CHECK-LABEL: acc.global_dtor @_QMacc_declare_equivalent2Ev2_acc_dtor { ! CHECK: %[[ADDR:.*]] = fir.address_of(@_QMacc_declare_equivalent2Ev1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "v2", structured = false} +! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "v2"} ! CHECK: acc.declare_exit dataOperands(%[[DEVICEPTR]] : !fir.ref>) -! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "v2", structured = false} +! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "v2"} ! CHECK: acc.terminator ! CHECK: } diff --git a/flang/test/Lower/OpenACC/acc-declare.f90 b/flang/test/Lower/OpenACC/acc-declare.f90 index 889cdef51f4ce..534df44782c0f 100644 --- a/flang/test/Lower/OpenACC/acc-declare.f90 +++ b/flang/test/Lower/OpenACC/acc-declare.f90 @@ -250,14 +250,14 @@ subroutine acc_declare_allocate() ! CHECK-LABEL: func.func private @_QMacc_declareFacc_declare_allocateEa_acc_declare_update_desc_post_alloc( ! CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>>) { -! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[ARG0]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "a", structured = false} +! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[ARG0]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "a"} ! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref>>>) ! CHECK: return ! CHECK: } ! CHECK-LABEL: func.func private @_QMacc_declareFacc_declare_allocateEa_acc_declare_update_desc_post_dealloc( ! 
CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>>) { -! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[ARG0]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "a", structured = false} +! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[ARG0]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "a"} ! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref>>>) ! CHECK: return ! CHECK: } @@ -323,30 +323,30 @@ module acc_declare_allocatable_test ! CHECK-LABEL: acc.global_ctor @_QMacc_declare_allocatable_testEdata1_acc_ctor { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) {acc.declare = #acc.declare} : !fir.ref>>> -! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {dataClause = #acc, implicit = true, name = "data1", structured = false} +! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {dataClause = #acc, implicit = true, name = "data1"} ! CHECK: acc.declare_enter dataOperands(%[[COPYIN]] : !fir.ref>>>) ! CHECK: acc.terminator ! CHECK: } ! CHECK-LABEL: func.func private @_QMacc_declare_allocatable_testEdata1_acc_declare_update_desc_post_alloc() { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) : !fir.ref>>> -! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "data1", structured = false} +! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "data1"} ! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref>>>) ! CHECK: return ! CHECK: } ! CHECK-LABEL: func.func private @_QMacc_declare_allocatable_testEdata1_acc_declare_update_desc_post_dealloc() { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) : !fir.ref>>> -! 
CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "data1", structured = false} +! CHECK: %[[UPDATE:.*]] = acc.update_device varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {implicit = true, name = "data1"} ! CHECK: acc.update dataOperands(%[[UPDATE]] : !fir.ref>>>) ! CHECK: return ! CHECK: } ! CHECK-LABEL: acc.global_dtor @_QMacc_declare_allocatable_testEdata1_acc_dtor { ! CHECK: %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_allocatable_testEdata1) {acc.declare = #acc.declare} : !fir.ref>>> -! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {dataClause = #acc, name = "data1", structured = false} +! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref>>>) -> !fir.ref>>> {dataClause = #acc, name = "data1"} ! CHECK: acc.declare_exit dataOperands(%[[DEVICEPTR]] : !fir.ref>>>) -! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>>>) {dataClause = #acc, name = "data1", structured = false} +! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>>>) {dataClause = #acc, name = "data1"} ! CHECK: acc.terminator ! CHECK: } @@ -361,15 +361,15 @@ module acc_declare_equivalent ! CHECK-LABEL: acc.global_ctor @_QMacc_declare_equivalentEv2_acc_ctor { ! CHECK: %[[ADDR:.*]] = fir.address_of(@_QMacc_declare_equivalentEv1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {name = "v2", structured = false} +! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {name = "v2"} ! CHECK: acc.declare_enter dataOperands(%[[CREATE]] : !fir.ref>) ! CHECK: acc.terminator ! CHECK: } ! CHECK-LABEL: acc.global_dtor @_QMacc_declare_equivalentEv2_acc_dtor { ! CHECK: %[[ADDR:.*]] = fir.address_of(@_QMacc_declare_equivalentEv1) {acc.declare = #acc.declare} : !fir.ref> -! 
CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "v2", structured = false} +! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "v2"} ! CHECK: acc.declare_exit dataOperands(%[[DEVICEPTR]] : !fir.ref>) -! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "v2", structured = false} +! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "v2"} ! CHECK: acc.terminator ! CHECK: } @@ -382,15 +382,15 @@ module acc_declare_equivalent2 ! CHECK-LABEL: acc.global_ctor @_QMacc_declare_equivalent2Ev2_acc_ctor { ! CHECK: %[[ADDR:.*]] = fir.address_of(@_QMacc_declare_equivalent2Ev1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {name = "v2", structured = false} +! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {name = "v2"} ! CHECK: acc.declare_enter dataOperands(%[[CREATE]] : !fir.ref>) ! CHECK: acc.terminator ! CHECK: } ! CHECK-LABEL: acc.global_dtor @_QMacc_declare_equivalent2Ev2_acc_dtor { ! CHECK: %[[ADDR:.*]] = fir.address_of(@_QMacc_declare_equivalent2Ev1) {acc.declare = #acc.declare} : !fir.ref> -! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "v2", structured = false} +! CHECK: %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[ADDR]] : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "v2"} ! CHECK: acc.declare_exit dataOperands(%[[DEVICEPTR]] : !fir.ref>) -! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "v2", structured = false} +! CHECK: acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref>) {dataClause = #acc, name = "v2"} ! CHECK: acc.terminator ! 
CHECK: } diff --git a/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 index c42350a07c498..925d4b93b754e 100644 --- a/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 @@ -27,7 +27,7 @@ subroutine acc_enter_data !CHECK: %[[LB:.*]] = arith.constant 0 : index !CHECK: %[[UB:.*]] = arith.subi %[[EXTENT_C10]], %[[ONE]] : index !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%[[ONE]] : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>){{$}} !$acc enter data create(a) if(.true.) 
@@ -38,7 +38,7 @@ subroutine acc_enter_data
 !CHECK: %[[LB:.*]] = arith.constant 0 : index
 !CHECK: %[[UB:.*]] = arith.subi %[[EXTENT_C10]], %[[ONE]] : index
 !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%[[ONE]] : index)
-!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false}
+!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"}
 !CHECK: [[IF1:%.*]] = arith.constant true
 !CHECK: acc.enter_data if([[IF1]]) dataOperands(%[[CREATE_A]] : !fir.ref>){{$}}
@@ -50,7 +50,7 @@ subroutine acc_enter_data
 !CHECK: %[[LB:.*]] = arith.constant 0 : index
 !CHECK: %[[UB:.*]] = arith.subi %[[EXTENT_C10]], %[[ONE]] : index
 !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%[[ONE]] : index)
-!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false}
+!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"}
 !CHECK: [[IFCOND:%.*]] = fir.load %{{.*}} : !fir.ref>
 !CHECK: [[IF2:%.*]] = fir.convert [[IFCOND]] : (!fir.logical<4>) -> i1
 !CHECK: acc.enter_data if([[IF2]]) dataOperands(%[[CREATE_A]] : !fir.ref>){{$}}
@@ -58,62 +58,62 @@ subroutine acc_enter_data
 !$acc enter data create(a) create(b) create(c)
 !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index)
 !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index)
-!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%c10_{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%c10_{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "b", structured = false} +!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "b"} !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%c10_{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%c10_{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_C:.*]] = acc.create varPtr(%[[DECLC]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "c", structured = false} +!CHECK: %[[CREATE_C:.*]] = acc.create varPtr(%[[DECLC]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "c"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]], %[[CREATE_B]], %[[CREATE_C]] : !fir.ref>, !fir.ref>, !fir.ref>){{$}} !$acc enter data create(a) create(b) create(zero: c) !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) 
stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%c10_{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%c10_{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "b", structured = false} +!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "b"} !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%c10_{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%c10_{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_C:.*]] = acc.create varPtr(%[[DECLC]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {dataClause = #acc, name = "c", structured = false} +!CHECK: %[[CREATE_C:.*]] = acc.create varPtr(%[[DECLC]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {dataClause = #acc, name = "c"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]], %[[CREATE_B]], %[[CREATE_C]] : !fir.ref>, !fir.ref>, !fir.ref>){{$}} !$acc enter data copyin(a) create(b) attach(d) !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds 
lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%c10_{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%c10_{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "b", structured = false} +!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "b"} !CHECK: %[[BOX_D:.*]] = fir.load %[[DECLD]]#0 : !fir.ref>> !CHECK: %[[BOX_ADDR_D:.*]] = fir.box_addr %[[BOX_D]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[ATTACH_D:.*]] = acc.attach varPtr(%[[BOX_ADDR_D]] : !fir.ptr) -> !fir.ptr {name = "d", structured = false} +!CHECK: %[[ATTACH_D:.*]] = acc.attach varPtr(%[[BOX_ADDR_D]] : !fir.ptr) -> !fir.ptr {name = "d"} !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]], %[[CREATE_B]], %[[ATTACH_D]] : !fir.ref>, !fir.ref>, !fir.ptr){{$}} !$acc enter data create(a) async !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], 
%[[BOUND1]]) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async} !$acc enter data create(a) wait !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {wait} !$acc enter data create(a) async wait !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async, wait} !$acc enter data create(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : 
index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) async(%[[ASYNC1]] : i32) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data async(%[[ASYNC1]] : i32) dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async(async) @@ -121,20 +121,20 @@ subroutine acc_enter_data !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) async(%[[ASYNC2]] : i32) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data async(%[[ASYNC2]] : i32) dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait(1) !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", 
structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: %[[WAIT1:.*]] = arith.constant 1 : i32 !CHECK: acc.enter_data wait(%[[WAIT1]] : i32) dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait(queues: 1, 2) !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: %[[WAIT2:.*]] = arith.constant 1 : i32 !CHECK: %[[WAIT3:.*]] = arith.constant 2 : i32 !CHECK: acc.enter_data wait(%[[WAIT2]], %[[WAIT3]] : i32, i32) dataOperands(%[[CREATE_A]] : !fir.ref>) @@ -142,7 +142,7 @@ subroutine acc_enter_data !$acc enter data create(a) wait(devnum: 1: queues: 1, 2) !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a"} !CHECK: %[[WAIT4:.*]] = arith.constant 1 : i32 !CHECK: %[[WAIT5:.*]] = arith.constant 2 : i32 !CHECK: %[[WAIT6:.*]] 
= arith.constant 1 : i32 @@ -151,7 +151,7 @@ subroutine acc_enter_data !$acc enter data copyin(a(1:10,1:5)) !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a(1:10,1:5)", structured = false} +!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a(1:10,1:5)"} !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]] : !fir.ref>) !$acc enter data copyin(a(1:,1:5)) @@ -162,7 +162,7 @@ subroutine acc_enter_data !CHECK: %[[LB2:.*]] = arith.constant 0 : index !CHECK: %[[UB2:.*]] = arith.constant 4 : index !CHECK: %[[BOUND2:.*]] = acc.bounds lowerbound(%[[LB2]] : index) upperbound(%[[UB2]] : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%c1{{.*}} : index) -!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> !fir.ref> {name = "a(1:,1:5)", structured = false} +!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> !fir.ref> {name = "a(1:,1:5)"} !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]] : !fir.ref>) !$acc enter data copyin(a(:10,1:5)) @@ -173,7 +173,7 @@ subroutine acc_enter_data !CHECK: %[[LB:.*]] = arith.constant 0 : index !CHECK: %[[UB2:.*]] = arith.constant 4 : index !CHECK: %[[BOUND2:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB2]] : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%[[ONE]] : index) -!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> 
!fir.ref> {name = "a(:10,1:5)", structured = false} +!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> !fir.ref> {name = "a(:10,1:5)"} !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]] : !fir.ref>) !$acc enter data copyin(a(:,:)) @@ -183,7 +183,7 @@ subroutine acc_enter_data !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%c10{{.*}} : index) stride(%[[ONE]] : index) startIdx(%[[ONE]] : index) !CHECK: %[[UB:.*]] = arith.subi %c10{{.*}}, %[[ONE]] : index !CHECK: %[[BOUND2:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%c10{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%[[ONE]] : index) -!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> !fir.ref> {name = "a(:,:)", structured = false} +!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> !fir.ref> {name = "a(:,:)"} end subroutine acc_enter_data subroutine acc_enter_data_dummy(a, b, n, m) @@ -213,14 +213,14 @@ subroutine acc_enter_data_dummy(a, b, n, m) !$acc enter data create(a) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%c10{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(b) !CHECK: %[[DIMS:.*]]:3 = fir.box_dims %[[DECLB]]#0, %c0{{.*}} : (!fir.box>, index) -> (index, index, index) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[DIMS]]#1 : index) stride(%[[DIMS]]#2 : index) startIdx(%{{.*}} : index) {strideInBytes 
= true} !CHECK: %[[ADDR:.*]] = fir.box_addr %[[DECLB]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "b", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "b"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a(5:10)) @@ -228,7 +228,7 @@ subroutine acc_enter_data_dummy(a, b, n, m) !CHECK: %[[LB1:.*]] = arith.constant 4 : index !CHECK: %[[UB1:.*]] = arith.constant 9 : index !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[LB1]] : index) upperbound(%[[UB1]] : index) extent(%c10{{.*}} : index) stride(%[[ONE]] : index) startIdx(%c1{{.*}} : index) -!CHECK: %[[CREATE1:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]]) -> !fir.ref> {name = "a(5:10)", structured = false} +!CHECK: %[[CREATE1:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]]) -> !fir.ref> {name = "a(5:10)"} !CHECK: acc.enter_data dataOperands(%[[CREATE1]] : !fir.ref>) !$acc enter data create(b(n:m)) @@ -243,7 +243,7 @@ subroutine acc_enter_data_dummy(a, b, n, m) !CHECK: %[[UB:.*]] = arith.subi %[[M_CONV2]], %[[N_IDX]] : index !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[EXT_B]] : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[N_IDX]] : index) {strideInBytes = true} !CHECK: %[[ADDR:.*]] = fir.box_addr %[[DECLB]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE1:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND1]]) -> !fir.ref> {name = "b(n:m)", structured = false} +!CHECK: %[[CREATE1:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND1]]) -> !fir.ref> {name = "b(n:m)"} !CHECK: acc.enter_data dataOperands(%[[CREATE1]] : !fir.ref>) !$acc enter data create(b(n:)) @@ -256,7 +256,7 @@ subroutine acc_enter_data_dummy(a, b, n, m) !CHECK: %[[UB:.*]] = arith.subi %[[EXT_B]], %c1{{.*}} : index 
!CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[EXT_B]] : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[N_IDX]] : index) {strideInBytes = true} !CHECK: %[[ADDR:.*]] = fir.box_addr %[[DECLB]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE1:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND1]]) -> !fir.ref> {name = "b(n:)", structured = false} +!CHECK: %[[CREATE1:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND1]]) -> !fir.ref> {name = "b(n:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE1]] : !fir.ref>) !$acc enter data create(b(:)) @@ -266,7 +266,7 @@ subroutine acc_enter_data_dummy(a, b, n, m) !CHECK: %[[UB:.*]] = arith.subi %[[EXT_B]], %[[ONE]] : index !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB]] : index) extent(%[[EXT_B]] : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[N_IDX]] : index) {strideInBytes = true} !CHECK: %[[ADDR:.*]] = fir.box_addr %[[DECLB]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE1:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND1]]) -> !fir.ref> {name = "b(:)", structured = false} +!CHECK: %[[CREATE1:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND1]]) -> !fir.ref> {name = "b(:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE1]] : !fir.ref>) end subroutine @@ -291,7 +291,7 @@ subroutine acc_enter_data_non_default_lb() !CHECK: %[[UB:.*]] = arith.subi %[[SECTIONUB]], %[[BASELB]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%c10{{.*}} : index) stride(%{{.*}} : index) startIdx(%[[BASELB]] : index) !CHECK: %[[ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(5:9)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(5:9)"} !CHECK: 
acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a(:)) @@ -300,7 +300,7 @@ subroutine acc_enter_data_non_default_lb() !CHECK: %[[UB:.*]] = arith.subi %[[EXTENT_C10]], %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB]] : index) extent(%[[EXTENT_C10]] : index) stride(%{{.*}} : index) startIdx(%[[BASELB]] : index) !CHECK: %[[ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a(:6)) @@ -309,7 +309,7 @@ subroutine acc_enter_data_non_default_lb() !CHECK: %[[UB:.*]] = arith.subi %[[SECTIONUB]], %[[BASELB]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB]] : index) extent(%c10{{.*}} : index) stride(%{{.*}} : index) startIdx(%[[BASELB]] : index) !CHECK: %[[ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(:6)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(:6)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a(4:)) @@ -319,7 +319,7 @@ subroutine acc_enter_data_non_default_lb() !CHECK: %[[UB:.*]] = arith.subi %[[EXTENT_C10]], %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[EXTENT_C10]] : index) stride(%{{.*}} : index) startIdx(%[[BASELB]] : index) !CHECK: %[[ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) 
bounds(%[[BOUND]]) -> !fir.ref> {name = "a(4:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(4:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(b) @@ -329,7 +329,7 @@ subroutine acc_enter_data_non_default_lb() !CHECK: %[[UB:.*]] = arith.subi %[[DIMS0]]#1, %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%c0{{.*}} : index) upperbound(%[[UB]] : index) extent(%[[DIMS0]]#1 : index) stride(%{{.*}} : index) startIdx(%c11{{.*}} : index) {strideInBytes = true} !CHECK: %[[ADDR:.*]] = fir.box_addr %[[DECLB]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "b", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "b"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) end subroutine @@ -357,7 +357,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[UB:.*]] = arith.subi %[[DIMS]]#1, %[[C1]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS]]#1 : index) stride(%[[DIMS]]#2 : index) startIdx(%[[C1]] : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a(:)) @@ -373,7 +373,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = 
true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a(2:)) @@ -389,7 +389,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(2:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(2:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a(:4)) @@ -403,7 +403,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(:4)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(:4)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a(6:10)) @@ -417,7 +417,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) 
extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(6:10)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(6:10)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a(n:)) @@ -437,7 +437,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(n:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(n:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a(:m)) @@ -455,7 +455,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[BASELB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(:m)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(:m)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a(n:m)) @@ -477,7 +477,7 @@ subroutine acc_enter_data_assumed(a, 
b, n, m) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[DECLA]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(n:m)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a(n:m)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(b(:m)) @@ -494,7 +494,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[LB_C10_IDX]] : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[DECLB]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "b(:m)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "b(:m)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(b) @@ -507,7 +507,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[C0]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS0]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[LB_C10_IDX]] : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[DECLB]]#0 : (!fir.box>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "b", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "b"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : 
!fir.ref>) end subroutine @@ -533,7 +533,7 @@ subroutine acc_enter_data_allocatable() !CHECK: %[[UB:.*]] = arith.subi %[[DIMS1]]#1, %c1{{.*}} : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%c0{{.*}} : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS1]]#2 : index) startIdx(%[[DIMS0]]#0 : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[BOX_A_0]] : (!fir.box>>) -> !fir.heap> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "a", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap>) !$acc enter data create(a(:)) @@ -554,7 +554,7 @@ subroutine acc_enter_data_allocatable() !CHECK: %[[UB:.*]] = arith.subi %[[DIMS2]]#1, %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB:.*]] : index) extent(%[[DIMS2]]#1 : index) stride(%[[DIMS1]]#2 : index) startIdx(%[[DIMS0]]#0 : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[BOX_A_0]] : (!fir.box>>) -> !fir.heap> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "a(:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "a(:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap>) !$acc enter data create(a(2:5)) @@ -575,7 +575,7 @@ subroutine acc_enter_data_allocatable() !CHECK: %[[DIMS2:.*]]:3 = fir.box_dims %[[BOX_A_2]], %[[C0]] : (!fir.box>>, index) -> (index, index, index) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS2]]#1 : index) stride(%[[DIMS1]]#2 : index) startIdx(%[[DIMS0]]#0 : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[BOX_A_0]] : (!fir.box>>) -> 
!fir.heap> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "a(2:5)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "a(2:5)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap>) !$acc enter data create(a(3:)) @@ -597,7 +597,7 @@ subroutine acc_enter_data_allocatable() !CHECK: %[[UB:.*]] = arith.subi %[[DIMS2]]#1, %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS2]]#1 : index) stride(%[[DIMS1]]#2 : index) startIdx(%[[DIMS0]]#0 : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[BOX_A_0]] : (!fir.box>>) -> !fir.heap> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "a(3:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "a(3:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap>) !$acc enter data create(a(:7)) @@ -617,14 +617,14 @@ subroutine acc_enter_data_allocatable() !CHECK: %[[DIMS2:.*]]:3 = fir.box_dims %[[BOX_A_2]], %[[C0]] : (!fir.box>>, index) -> (index, index, index) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS2]]#1 : index) stride(%[[DIMS1]]#2 : index) startIdx(%[[DIMS0]]#0 : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[BOX_A_0]] : (!fir.box>>) -> !fir.heap> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "a(:7)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "a(:7)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap>) !$acc enter data create(i) !CHECK: %[[BOX_I:.*]] = fir.load %[[DECLI]]#0 : 
!fir.ref>> !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[BOX_I]] : (!fir.box>) -> !fir.heap -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap) -> !fir.heap {name = "i", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap) -> !fir.heap {name = "i"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap) end subroutine @@ -669,7 +669,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[DATA_COORD:.*]] = hlfir.designate %[[DECLA]]#0{"data"} : (!fir.ref}>>) -> !fir.ref -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DATA_COORD]] : !fir.ref) -> !fir.ref {name = "a%data", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DATA_COORD]] : !fir.ref) -> !fir.ref {name = "a%data"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref) !$acc enter data create(b%d%data) @@ -678,7 +678,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[D_COORD:.*]] = hlfir.designate %[[DECLB]]#0{"d"} : (!fir.ref}>}>>) -> !fir.ref}>> !CHECK: %[[DATA_COORD:.*]] = hlfir.designate %[[D_COORD]]{"data"} : (!fir.ref}>>) -> !fir.ref -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DATA_COORD]] : !fir.ref) -> !fir.ref {name = "b%d%data", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DATA_COORD]] : !fir.ref) -> !fir.ref {name = "b%d%data"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref) !$acc enter data create(a%array) @@ -690,7 +690,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[LB:.*]] = arith.constant 0 : index !CHECK: %[[UB:.*]] = arith.subi %[[C10]], %[[C1]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[C10]] : index) stride(%[[C1]] : index) startIdx(%[[C1]] : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> 
!fir.ref> {name = "a%array"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a%array(:)) @@ -702,7 +702,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[C1:.*]] = arith.constant 1 : index !CHECK: %[[UB:.*]] = arith.subi %[[C10]], %[[C1]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[C10]] : index) stride(%[[C1]] : index) startIdx(%[[C1]] : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a%array(1:5)) @@ -713,7 +713,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[C0:.*]] = arith.constant 0 : index !CHECK: %[[C4:.*]] = arith.constant 4 : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[C0]] : index) upperbound(%[[C4]] : index) extent(%[[C10]] : index) stride(%[[C1]] : index) startIdx(%[[C1]] : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(1:5)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(1:5)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a%array(:5)) @@ -724,7 +724,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[C1:.*]] = arith.constant 1 : index !CHECK: %[[C4:.*]] = arith.constant 4 : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[C4]] : index) extent(%[[C10]] : index) stride(%[[C1]] : index) startIdx(%[[C1]] : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(:5)", structured = false} 
+!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(:5)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a%array(2:)) @@ -736,7 +736,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[LB:.*]] = arith.constant 1 : index !CHECK: %[[UB:.*]] = arith.subi %[[C10]], %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[C10]] : index) stride(%[[ONE]] : index) startIdx(%[[ONE]] : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(2:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(2:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(b%d%array) @@ -750,7 +750,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[LB:.*]] = arith.constant 0 : index !CHECK: %[[UB:.*]] = arith.subi %[[C10]], %[[C1]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[C10]] : index) stride(%[[C1]] : index) startIdx(%[[C1]] : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "b%d%array", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "b%d%array"} !$acc enter data create(c%data) @@ -765,7 +765,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[UB:.*]] = arith.subi %[[DIMS0_1]]#1, %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%c0{{.*}} : index) upperbound(%[[UB]] : index) extent(%[[DIMS0_1]]#1 : index) stride(%[[DIMS0_1]]#2 : index) startIdx(%[[DIMS0]]#0 : index) {strideInBytes = true} !CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[DATA_BOX]] : (!fir.box>>) -> !fir.heap> -!CHECK: 
%[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "c%data", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap>) bounds(%[[BOUND]]) -> !fir.heap> {name = "c%data"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap>) !$acc enter data create (d%d(1)%array) @@ -785,7 +785,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[LB:.*]] = arith.constant 0 : index !CHECK: %[[UB:.*]] = arith.subi %[[C10]], %[[C1]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[C10]] : index) stride(%[[C1]] : index) startIdx(%[[C1]] : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "d%d(1_8)%array", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "d%d(1_8)%array"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) end subroutine @@ -812,7 +812,7 @@ subroutine acc_enter_data_single_array_element() !CHECK: %[[VAL_46:.*]] = arith.constant 2 : index !CHECK: %[[VAL_47:.*]] = arith.subi %[[VAL_46]], %[[VAL_40]]#0 : index !CHECK: %[[VAL_48:.*]] = acc.bounds lowerbound(%[[VAL_47]] : index) upperbound(%[[VAL_47]] : index) extent(%[[VAL_42]] : index) stride(%[[VAL_42]] : index) startIdx(%[[VAL_40]]#0 : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[VAL_41]] : !fir.heap>) bounds(%[[VAL_45]], %[[VAL_48]]) -> !fir.heap> {name = "e(2_8)%a(1,2)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[VAL_41]] : !fir.heap>) bounds(%[[VAL_45]], %[[VAL_48]]) -> !fir.heap> {name = "e(2_8)%a(1,2)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap>) end subroutine diff --git a/flang/test/Lower/OpenACC/acc-enter-data.f90 b/flang/test/Lower/OpenACC/acc-enter-data.f90 index 3e49259c360eb..71d8d8403806e 100644 --- 
a/flang/test/Lower/OpenACC/acc-enter-data.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data.f90 @@ -20,73 +20,73 @@ subroutine acc_enter_data !CHECK: %[[DECLD:.*]]:2 = hlfir.declare %[[D]] !$acc enter data create(a) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>){{$}} !$acc enter data create(a) if(.true.) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: [[IF1:%.*]] = arith.constant true !CHECK: acc.enter_data if([[IF1]]) dataOperands(%[[CREATE_A]] : !fir.ref>){{$}} !$acc enter data create(a) if(ifCondition) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: [[IFCOND:%.*]] = fir.load %{{.*}} : !fir.ref> !CHECK: [[IF2:%.*]] = fir.convert [[IFCOND]] : (!fir.logical<4>) -> i1 !CHECK: acc.enter_data if([[IF2]]) dataOperands(%[[CREATE_A]] : !fir.ref>){{$}} !$acc enter data create(a) create(b) create(c) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} -!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {name = "b", structured = false} -!CHECK: %[[CREATE_C:.*]] = acc.create varPtr(%[[DECLC]]#0 : !fir.ref>) -> !fir.ref> {name = "c", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} +!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {name = "b"} +!CHECK: %[[CREATE_C:.*]] = acc.create varPtr(%[[DECLC]]#0 : 
!fir.ref>) -> !fir.ref> {name = "c"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]], %[[CREATE_B]], %[[CREATE_C]] : !fir.ref>, !fir.ref>, !fir.ref>){{$}} !$acc enter data create(a) create(b) create(zero: c) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} -!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {name = "b", structured = false} -!CHECK: %[[CREATE_C:.*]] = acc.create varPtr(%[[DECLC]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "c", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} +!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {name = "b"} +!CHECK: %[[CREATE_C:.*]] = acc.create varPtr(%[[DECLC]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "c"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]], %[[CREATE_B]], %[[CREATE_C]] : !fir.ref>, !fir.ref>, !fir.ref>){{$}} !$acc enter data copyin(a) create(b) attach(d) -!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} -!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {name = "b", structured = false} -!CHECK: %[[ATTACH_D:.*]] = acc.attach varPtr(%[[DECLD]]#0 : !fir.ref>>) -> !fir.ref>> {name = "d", structured = false} +!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} +!CHECK: %[[CREATE_B:.*]] = acc.create varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {name = "b"} +!CHECK: %[[ATTACH_D:.*]] = acc.attach varPtr(%[[DECLD]]#0 : !fir.ref>>) -> !fir.ref>> {name = "d"} !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]], %[[CREATE_B]], %[[ATTACH_D]] : !fir.ref>, !fir.ref>, !fir.ref>>){{$}} !$acc enter data create(a) async -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", 
structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async} !$acc enter data create(a) wait -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {wait} !$acc enter data create(a) async wait -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async, wait} !$acc enter data create(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) async(%[[ASYNC1]] : i32) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data async(%[[ASYNC1]] : i32) dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async(async) !CHECK: %[[ASYNC2:.*]] = fir.load %{{.*}} : !fir.ref -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) async(%[[ASYNC2]] : i32) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data async(%[[ASYNC2]] : i32) dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait(1) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: 
%[[WAIT1:.*]] = arith.constant 1 : i32 !CHECK: acc.enter_data wait(%[[WAIT1]] : i32) dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait(queues: 1, 2) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: %[[WAIT2:.*]] = arith.constant 1 : i32 !CHECK: %[[WAIT3:.*]] = arith.constant 2 : i32 !CHECK: acc.enter_data wait(%[[WAIT2]], %[[WAIT3]] : i32, i32) dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait(devnum: 1: queues: 1, 2) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: %[[WAIT4:.*]] = arith.constant 1 : i32 !CHECK: %[[WAIT5:.*]] = arith.constant 2 : i32 !CHECK: %[[WAIT6:.*]] = arith.constant 1 : i32 @@ -95,7 +95,7 @@ subroutine acc_enter_data !$acc enter data copyin(a(1:10,1:5)) !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a(1:10,1:5)", structured = false} +!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a(1:10,1:5)"} !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]] : !fir.ref>) !$acc enter data copyin(a(1:,1:5)) @@ -106,7 +106,7 @@ subroutine acc_enter_data !CHECK: %[[LB2:.*]] = arith.constant 0 : index !CHECK: %[[UB2:.*]] = arith.constant 4 : index !CHECK: %[[BOUND2:.*]] = acc.bounds 
lowerbound(%[[LB2]] : index) upperbound(%[[UB2]] : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%c1{{.*}} : index) -!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> !fir.ref> {name = "a(1:,1:5)", structured = false} +!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> !fir.ref> {name = "a(1:,1:5)"} !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]] : !fir.ref>) !$acc enter data copyin(a(:10,1:5)) @@ -117,7 +117,7 @@ subroutine acc_enter_data !CHECK: %[[LB:.*]] = arith.constant 0 : index !CHECK: %[[UB2:.*]] = arith.constant 4 : index !CHECK: %[[BOUND2:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB2]] : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%[[ONE]] : index) -!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> !fir.ref> {name = "a(:10,1:5)", structured = false} +!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> !fir.ref> {name = "a(:10,1:5)"} !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]] : !fir.ref>) !$acc enter data copyin(a(:,:)) @@ -127,7 +127,7 @@ subroutine acc_enter_data !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%c10{{.*}} : index) stride(%[[ONE]] : index) startIdx(%[[ONE]] : index) !CHECK: %[[UB:.*]] = arith.subi %c10{{.*}}, %[[ONE]] : index !CHECK: %[[BOUND2:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%c10{{.*}} : index) stride(%c1{{.*}} : index) startIdx(%c1{{.*}} : index) -!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> !fir.ref> {name = "a(:,:)", structured = false} +!CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]], %[[BOUND2]]) -> !fir.ref> {name = "a(:,:)"} end 
subroutine acc_enter_data subroutine acc_enter_data_dummy(a, b, n, m) @@ -156,11 +156,11 @@ subroutine acc_enter_data_dummy(a, b, n, m) !CHECK: %[[DECLB:.*]]:2 = hlfir.declare %[[B]] !$acc enter data create(a) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(b) -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) -> !fir.box> {name = "b", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) -> !fir.box> {name = "b"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(a(5:10)) @@ -168,7 +168,7 @@ subroutine acc_enter_data_dummy(a, b, n, m) !CHECK: %[[LB1:.*]] = arith.constant 4 : index !CHECK: %[[UB1:.*]] = arith.constant 9 : index !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[LB1]] : index) upperbound(%[[UB1]] : index) extent(%c10{{.*}} : index) stride(%[[ONE]] : index) startIdx(%c1{{.*}} : index) -!CHECK: %[[CREATE1:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]]) -> !fir.ref> {name = "a(5:10)", structured = false} +!CHECK: %[[CREATE1:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND1]]) -> !fir.ref> {name = "a(5:10)"} !CHECK: acc.enter_data dataOperands(%[[CREATE1]] : !fir.ref>) !$acc enter data create(b(n:m)) @@ -182,7 +182,7 @@ subroutine acc_enter_data_dummy(a, b, n, m) !CHECK: %[[M_CONV2:.*]] = fir.convert %[[M_CONV1]] : (i64) -> index !CHECK: %[[UB:.*]] = arith.subi %[[M_CONV2]], %[[N_IDX]] : index !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[EXT_B]] : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[N_IDX]] : index) {strideInBytes = true} -!CHECK: %[[CREATE1:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) bounds(%[[BOUND1]]) -> !fir.box> 
{name = "b(n:m)", structured = false} +!CHECK: %[[CREATE1:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) bounds(%[[BOUND1]]) -> !fir.box> {name = "b(n:m)"} !CHECK: acc.enter_data dataOperands(%[[CREATE1]] : !fir.box>) !$acc enter data create(b(n:)) @@ -194,7 +194,7 @@ subroutine acc_enter_data_dummy(a, b, n, m) !CHECK: %[[LB:.*]] = arith.subi %[[CONVERT2_N]], %[[N_IDX]] : index !CHECK: %[[UB:.*]] = arith.subi %[[EXT_B]], %c1{{.*}} : index !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[EXT_B]] : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[N_IDX]] : index) {strideInBytes = true} -!CHECK: %[[CREATE1:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) bounds(%[[BOUND1]]) -> !fir.box> {name = "b(n:)", structured = false} +!CHECK: %[[CREATE1:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) bounds(%[[BOUND1]]) -> !fir.box> {name = "b(n:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE1]] : !fir.box>) !$acc enter data create(b(:)) @@ -203,7 +203,7 @@ subroutine acc_enter_data_dummy(a, b, n, m) !CHECK: %[[DIMS0:.*]]:3 = fir.box_dims %[[DECLB]]#0, %c0{{.*}} : (!fir.box>, index) -> (index, index, index) !CHECK: %[[UB:.*]] = arith.subi %[[EXT_B]], %[[ONE]] : index !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB]] : index) extent(%[[EXT_B]] : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[N_IDX]] : index) {strideInBytes = true} -!CHECK: %[[CREATE1:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) bounds(%[[BOUND1]]) -> !fir.box> {name = "b(:)", structured = false} +!CHECK: %[[CREATE1:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) bounds(%[[BOUND1]]) -> !fir.box> {name = "b(:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE1]] : !fir.box>) end subroutine @@ -227,7 +227,7 @@ subroutine acc_enter_data_non_default_lb() !CHECK: %[[SECTIONUB:.*]] = arith.constant 9 : index !CHECK: %[[UB:.*]] = arith.subi %[[SECTIONUB]], %[[BASELB]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : 
index) upperbound(%[[UB]] : index) extent(%c10{{.*}} : index) stride(%{{.*}} : index) startIdx(%[[BASELB]] : index) -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(5:9)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(5:9)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(a(:)) @@ -235,7 +235,7 @@ subroutine acc_enter_data_non_default_lb() !CHECK: %[[ONE:.*]] = arith.constant 1 : index !CHECK: %[[UB:.*]] = arith.subi %[[EXTENT_C10]], %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB]] : index) extent(%[[EXTENT_C10]] : index) stride(%{{.*}} : index) startIdx(%[[BASELB]] : index) -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(a(:6)) @@ -243,7 +243,7 @@ subroutine acc_enter_data_non_default_lb() !CHECK: %[[SECTIONUB:.*]] = arith.constant 6 : index !CHECK: %[[UB:.*]] = arith.subi %[[SECTIONUB]], %[[BASELB]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB]] : index) extent(%c10{{.*}} : index) stride(%{{.*}} : index) startIdx(%[[BASELB]] : index) -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(:6)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(:6)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(a(4:)) @@ -252,11 +252,11 @@ subroutine acc_enter_data_non_default_lb() !CHECK: %[[LB:.*]] = arith.subi %[[SECTIONLB]], %[[BASELB]] : index 
!CHECK: %[[UB:.*]] = arith.subi %[[EXTENT_C10]], %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[EXTENT_C10]] : index) stride(%{{.*}} : index) startIdx(%[[BASELB]] : index) -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(4:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(4:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(b) -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) -> !fir.box> {name = "b", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) -> !fir.box> {name = "b"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) end subroutine @@ -277,7 +277,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[DECLN:.*]]:2 = hlfir.declare %[[N]] !$acc enter data create(a) -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) -> !fir.box> {name = "a", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) -> !fir.box> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(a(:)) @@ -292,7 +292,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[UB:.*]] = arith.subi %[[DIMS1]]#1, %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(a(2:)) @@ -307,7 +307,7 @@ 
subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[UB:.*]] = arith.subi %[[DIMS1]]#1, %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(2:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(2:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(a(:4)) @@ -319,7 +319,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[UB:.*]] = arith.constant 3 : index !CHECK: %[[DIMS1:.*]]:3 = fir.box_dims %[[DECLA]]#1, %{{.*}} : (!fir.box>, index) -> (index, index, index) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(:4)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(:4)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(a(6:10)) @@ -330,7 +330,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[UB:.*]] = arith.constant 9 : index !CHECK: %[[DIMS1:.*]]:3 = fir.box_dims %[[DECLA]]#1, %{{.*}} : (!fir.box>, index) -> (index, index, index) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(6:10)", structured = false} +!CHECK: %[[CREATE:.*]] = 
acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(6:10)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(a(n:)) @@ -345,7 +345,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[DIMS:.*]]:3 = fir.box_dims %[[DECLA]]#1, %[[C0]] : (!fir.box>, index) -> (index, index, index) !CHECK: %[[UB:.*]] = arith.subi %[[DIMS]]#1, %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(n:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(n:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(a(:m)) @@ -359,7 +359,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[UB:.*]] = arith.subi %[[CONVERT2_M]], %[[ONE]] : index !CHECK: %[[DIMS1:.*]]:3 = fir.box_dims %[[DECLA]]#1, %{{.*}} : (!fir.box>, index) -> (index, index, index) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[BASELB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(:m)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(:m)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(a(n:m)) @@ -376,7 +376,7 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[UB:.*]] = arith.subi %[[CONVERT2_M]], %[[ONE]] : index !CHECK: %[[DIMS1:.*]]:3 = fir.box_dims %[[DECLA]]#1, %{{.*}} : (!fir.box>, index) -> (index, index, index) !CHECK: 
%[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[ONE]] : index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(n:m)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLA]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "a(n:m)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(b(:m)) @@ -389,11 +389,11 @@ subroutine acc_enter_data_assumed(a, b, n, m) !CHECK: %[[UB:.*]] = arith.subi %[[CONVERT2_M]], %[[LB_C10_IDX]] : index !CHECK: %[[DIMS1:.*]]:3 = fir.box_dims %[[DECLB]]#1, %{{.*}} : (!fir.box>, index) -> (index, index, index) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS1]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%[[LB_C10_IDX]] : index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "b(:m)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) bounds(%[[BOUND]]) -> !fir.box> {name = "b(:m)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) !$acc enter data create(b) -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) -> !fir.box> {name = "b", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DECLB]]#0 : !fir.box>) -> !fir.box> {name = "b"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>) end subroutine @@ -410,7 +410,7 @@ subroutine acc_enter_data_allocatable() !$acc enter data create(a) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>>>) -> !fir.ref>>> {name = "a", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>>>) -> !fir.ref>>> {name = "a"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>>>) !$acc enter 
data create(a(:)) @@ -428,7 +428,7 @@ subroutine acc_enter_data_allocatable() !CHECK: %[[DIMS2:.*]]:3 = fir.box_dims %[[BOX_A_2]], %[[C0]] : (!fir.box>>, index) -> (index, index, index) !CHECK: %[[UB:.*]] = arith.subi %[[DIMS2]]#1, %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB:.*]] : index) extent(%[[DIMS2]]#1 : index) stride(%[[DIMS1]]#2 : index) startIdx(%[[DIMS0]]#0 : index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>>>) bounds(%[[BOUND]]) -> !fir.ref>>> {name = "a(:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>>>) bounds(%[[BOUND]]) -> !fir.ref>>> {name = "a(:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>>>) !$acc enter data create(a(2:5)) @@ -447,7 +447,7 @@ subroutine acc_enter_data_allocatable() !CHECK: %[[C0:.*]] = arith.constant 0 : index !CHECK: %[[DIMS2:.*]]:3 = fir.box_dims %[[BOX_A_2]], %[[C0]] : (!fir.box>>, index) -> (index, index, index) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS2]]#1 : index) stride(%[[DIMS1]]#2 : index) startIdx(%[[DIMS0]]#0 : index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>>>) bounds(%[[BOUND]]) -> !fir.ref>>> {name = "a(2:5)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>>>) bounds(%[[BOUND]]) -> !fir.ref>>> {name = "a(2:5)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>>>) !$acc enter data create(a(3:)) @@ -466,7 +466,7 @@ subroutine acc_enter_data_allocatable() !CHECK: %[[DIMS2:.*]]:3 = fir.box_dims %[[BOX_A_2]], %[[C0]] : (!fir.box>>, index) -> (index, index, index) !CHECK: %[[UB:.*]] = arith.subi %[[DIMS2]]#1, %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS2]]#1 : index) stride(%[[DIMS1]]#2 : index) startIdx(%[[DIMS0]]#0 : 
index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>>>) bounds(%[[BOUND]]) -> !fir.ref>>> {name = "a(3:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>>>) bounds(%[[BOUND]]) -> !fir.ref>>> {name = "a(3:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>>>) !$acc enter data create(a(:7)) @@ -484,12 +484,12 @@ subroutine acc_enter_data_allocatable() !CHECK: %[[C0:.*]] = arith.constant 0 : index !CHECK: %[[DIMS2:.*]]:3 = fir.box_dims %[[BOX_A_2]], %[[C0]] : (!fir.box>>, index) -> (index, index, index) !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[ZERO]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS2]]#1 : index) stride(%[[DIMS1]]#2 : index) startIdx(%[[DIMS0]]#0 : index) {strideInBytes = true} -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>>>) bounds(%[[BOUND]]) -> !fir.ref>>> {name = "a(:7)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>>>) bounds(%[[BOUND]]) -> !fir.ref>>> {name = "a(:7)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>>>) !$acc enter data create(i) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLI]]#0 : !fir.ref>>) -> !fir.ref>> {name = "i", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECLI]]#0 : !fir.ref>>) -> !fir.ref>> {name = "i"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>>) end subroutine @@ -534,7 +534,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[DATA_COORD:.*]] = hlfir.designate %[[DECLA]]#0{"data"} : (!fir.ref}>>) -> !fir.ref -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DATA_COORD]] : !fir.ref) -> !fir.ref {name = "a%data", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DATA_COORD]] : !fir.ref) -> !fir.ref {name = "a%data"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref) !$acc enter data create(b%d%data) @@ -543,14 +543,14 @@ subroutine acc_enter_data_derived_type() 
!CHECK: %[[D_COORD:.*]] = hlfir.designate %[[DECLB]]#0{"d"} : (!fir.ref}>}>>) -> !fir.ref}>> !CHECK: %[[DATA_COORD:.*]] = hlfir.designate %[[D_COORD]]{"data"} : (!fir.ref}>>) -> !fir.ref -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DATA_COORD]] : !fir.ref) -> !fir.ref {name = "b%d%data", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DATA_COORD]] : !fir.ref) -> !fir.ref {name = "b%d%data"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref) !$acc enter data create(a%array) !CHECK: %[[ARRAY_COORD:.*]] = hlfir.designate %[[DECLA]]#0{"array"} shape %{{.*}} : (!fir.ref}>>, !fir.shape<1>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) -> !fir.ref> {name = "a%array", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) -> !fir.ref> {name = "a%array"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a%array(:)) @@ -562,7 +562,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[C1:.*]] = arith.constant 1 : index !CHECK: %[[UB:.*]] = arith.subi %[[C10]], %[[C1]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[C10]] : index) stride(%[[C1]] : index) startIdx(%[[C1]] : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a%array(1:5)) @@ -573,7 +573,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[C0:.*]] = arith.constant 0 : index !CHECK: %[[C4:.*]] = arith.constant 4 : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[C0]] : index) upperbound(%[[C4]] : index) extent(%[[C10]] : index) stride(%[[C1]] : index) startIdx(%[[C1]] : index) -!CHECK: 
%[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(1:5)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(1:5)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a%array(:5)) @@ -584,7 +584,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[C1:.*]] = arith.constant 1 : index !CHECK: %[[C4:.*]] = arith.constant 4 : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[C4]] : index) extent(%[[C10]] : index) stride(%[[C1]] : index) startIdx(%[[C1]] : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(:5)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(:5)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(a%array(2:)) @@ -596,7 +596,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[LB:.*]] = arith.constant 1 : index !CHECK: %[[UB:.*]] = arith.subi %[[C10]], %[[ONE]] : index !CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[C10]] : index) stride(%[[ONE]] : index) startIdx(%[[ONE]] : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(2:)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) bounds(%[[BOUND]]) -> !fir.ref> {name = "a%array(2:)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) !$acc enter data create(b%d%array) @@ -605,14 +605,14 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[D_COORD:.*]] = hlfir.designate %[[DECLB]]#0{"d"} : (!fir.ref}>}>>) -> !fir.ref}>> !CHECK: %[[ARRAY_COORD:.*]] = hlfir.designate %[[D_COORD]]{"array"} 
shape %{{.*}} : (!fir.ref}>>, !fir.shape<1>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) -> !fir.ref> {name = "b%d%array", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) -> !fir.ref> {name = "b%d%array"} !$acc enter data create(c%data) !CHECK: %[[DATA_COORD:.*]] = hlfir.designate %[[DECLC]]#0{"data"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>}>>) -> !fir.ref>>> !CHECK: %[[DATA_BOX:.*]] = fir.load %[[DATA_COORD]] : !fir.ref>>> -!CHECK: %[[CREATE:.*]] = acc.create var(%[[DATA_BOX]] : !fir.box>>) -> !fir.box>> {name = "c%data", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create var(%[[DATA_BOX]] : !fir.box>>) -> !fir.box>> {name = "c%data"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.box>>) !$acc enter data create (d%d(1)%array) @@ -620,7 +620,7 @@ subroutine acc_enter_data_derived_type() !CHECK: %[[ONE:.*]] = arith.constant 1 : index !CHECK: %[[D1_COORD:.*]] = hlfir.designate %[[DECLD]]#0{"d"} <%{{.*}}> (%[[ONE]]) : (!fir.ref}>>}>>, !fir.shape<1>, index) -> !fir.ref}>> !CHECK: %[[ARRAY_COORD:.*]] = hlfir.designate %[[D1_COORD]]{"array"} shape %{{.*}} : (!fir.ref}>>, !fir.shape<1>) -> !fir.ref> -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) -> !fir.ref> {name = "d%d(1_8)%array", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARRAY_COORD]] : !fir.ref>) -> !fir.ref> {name = "d%d(1_8)%array"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.ref>) end subroutine @@ -647,7 +647,7 @@ subroutine acc_enter_data_single_array_element() !CHECK: %[[VAL_46:.*]] = arith.constant 2 : index !CHECK: %[[VAL_47:.*]] = arith.subi %[[VAL_46]], %[[VAL_40]]#0 : index !CHECK: %[[VAL_48:.*]] = acc.bounds lowerbound(%[[VAL_47]] : index) upperbound(%[[VAL_47]] : index) extent(%[[VAL_42]] : index) stride(%[[VAL_42]] : index) startIdx(%[[VAL_40]]#0 : index) -!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[VAL_41]] : !fir.heap>) 
bounds(%[[VAL_45]], %[[VAL_48]]) -> !fir.heap> {name = "e(2_8)%a(1,2)", structured = false} +!CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[VAL_41]] : !fir.heap>) bounds(%[[VAL_45]], %[[VAL_48]]) -> !fir.heap> {name = "e(2_8)%a(1,2)"} !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap>) end subroutine diff --git a/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 index 7999a7647f49b..b047f584d20d7 100644 --- a/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 @@ -18,90 +18,90 @@ subroutine acc_exit_data !CHECK: %[[DECLD:.*]]:2 = hlfir.declare %[[D]] !$acc exit data delete(a) -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} !$acc exit data delete(a) if(.true.) 
-!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: %[[IF1:.*]] = arith.constant true !CHECK: acc.exit_data if(%[[IF1]]) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} !$acc exit data delete(a) if(ifCondition) -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: %[[IFCOND:.*]] = fir.load %{{.*}} : !fir.ref> !CHECK: %[[IF2:.*]] = fir.convert %[[IFCOND]] : (!fir.logical<4>) -> i1 !CHECK: acc.exit_data if(%[[IF2]]) dataOperands(%[[DEVPTR]] : !fir.ref>){{$}} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} !$acc exit data delete(a) delete(b) delete(c) -!CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "b", structured = false} -!CHECK: %[[DEVPTR_C:.*]] = acc.getdeviceptr varPtr(%[[DECLC]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "c", structured = false} +!CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 
: !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} +!CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "b"} +!CHECK: %[[DEVPTR_C:.*]] = acc.getdeviceptr varPtr(%[[DECLC]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "c"} !CHECK: acc.exit_data dataOperands(%[[DEVPTR_A]], %[[DEVPTR_B]], %[[DEVPTR_C]] : !fir.ref>, !fir.ref>, !fir.ref>){{$}} -!CHECK: acc.delete accPtr(%[[DEVPTR_A]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} -!CHECK: acc.delete accPtr(%[[DEVPTR_B]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "b", structured = false} -!CHECK: acc.delete accPtr(%[[DEVPTR_C]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "c", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR_A]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} +!CHECK: acc.delete accPtr(%[[DEVPTR_B]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "b"} +!CHECK: acc.delete accPtr(%[[DEVPTR_C]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "c"} !$acc exit data copyout(a) delete(b) detach(d) -!CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "b", structured = false} +!CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} +!CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "b"} !CHECK: %[[BOX_D:.*]] = fir.load %[[DECLD]]#0 : !fir.ref>> !CHECK: %[[D_ADDR:.*]] = fir.box_addr %[[BOX_D]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[DEVPTR_D:.*]] = acc.getdeviceptr varPtr(%[[D_ADDR]] : !fir.ptr) 
-> !fir.ptr {dataClause = #acc, name = "d", structured = false} +!CHECK: %[[DEVPTR_D:.*]] = acc.getdeviceptr varPtr(%[[D_ADDR]] : !fir.ptr) -> !fir.ptr {dataClause = #acc, name = "d"} !CHECK: acc.exit_data dataOperands(%[[DEVPTR_A]], %[[DEVPTR_B]], %[[DEVPTR_D]] : !fir.ref>, !fir.ref>, !fir.ptr) -!CHECK: acc.copyout accPtr(%[[DEVPTR_A]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} -!CHECK: acc.delete accPtr(%[[DEVPTR_B]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "b", structured = false} -!CHECK: acc.detach accPtr(%[[DEVPTR_D]] : !fir.ptr) {name = "d", structured = false} +!CHECK: acc.copyout accPtr(%[[DEVPTR_A]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} +!CHECK: acc.delete accPtr(%[[DEVPTR_B]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "b"} +!CHECK: acc.detach accPtr(%[[DEVPTR_D]] : !fir.ptr) {name = "d"} !$acc exit data delete(a) async -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} !$acc exit data delete(a) wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data 
dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {wait} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} !$acc exit data delete(a) async wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async, wait} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} !$acc exit data delete(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async(%[[ASYNC1]] : i32) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data async(%[[ASYNC1]] : i32) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async(%[[ASYNC1]] : i32) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} !$acc exit data delete(a) async(async) !CHECK: %[[ASYNC2:.*]] = fir.load %{{.*}} : !fir.ref -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async(%[[ASYNC2]] : i32) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = 
acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data async(%[[ASYNC2]] : i32) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async(%[[ASYNC2]] : i32) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} !$acc exit data delete(a) wait(1) -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: %[[WAIT1:.*]] = arith.constant 1 : i32 !CHECK: acc.exit_data wait(%[[WAIT1]] : i32) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} !$acc exit data delete(a) wait(queues: 1, 2) -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: %[[WAIT2:.*]] = arith.constant 1 : i32 !CHECK: %[[WAIT3:.*]] = arith.constant 2 : i32 !CHECK: acc.exit_data wait(%[[WAIT2]], %[[WAIT3]] : i32, i32) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} !$acc exit data delete(a) wait(devnum: 1: queues: 1, 2) -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) 
bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: %[[WAIT4:.*]] = arith.constant 1 : i32 !CHECK: %[[WAIT5:.*]] = arith.constant 2 : i32 !CHECK: %[[WAIT6:.*]] = arith.constant 1 : i32 !CHECK: acc.exit_data wait_devnum(%[[WAIT6]] : i32) wait(%[[WAIT4]], %[[WAIT5]] : i32, i32) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a"} end subroutine acc_exit_data diff --git a/flang/test/Lower/OpenACC/acc-exit-data.f90 b/flang/test/Lower/OpenACC/acc-exit-data.f90 index bf5f7094913a1..20314a30d745f 100644 --- a/flang/test/Lower/OpenACC/acc-exit-data.f90 +++ b/flang/test/Lower/OpenACC/acc-exit-data.f90 @@ -18,89 +18,89 @@ subroutine acc_exit_data !CHECK: %[[DECLD:.*]]:2 = hlfir.declare %[[D]] !$acc exit data delete(a) -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a"} !$acc exit data delete(a) if(.true.) 
-!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: %[[IF1:.*]] = arith.constant true !CHECK: acc.exit_data if(%[[IF1]]) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a"} !$acc exit data delete(a) if(ifCondition) -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: %[[IFCOND:.*]] = fir.load %{{.*}} : !fir.ref> !CHECK: %[[IF2:.*]] = fir.convert %[[IFCOND]] : (!fir.logical<4>) -> i1 !CHECK: acc.exit_data if(%[[IF2]]) dataOperands(%[[DEVPTR]] : !fir.ref>){{$}} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a"} !$acc exit data delete(a) delete(b) delete(c) -!CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "b", structured = false} -!CHECK: %[[DEVPTR_C:.*]] = acc.getdeviceptr varPtr(%[[DECLC]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "c", structured = false} +!CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} +!CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "b"} +!CHECK: %[[DEVPTR_C:.*]] = acc.getdeviceptr varPtr(%[[DECLC]]#0 : !fir.ref>) -> !fir.ref> 
{dataClause = #acc, name = "c"} !CHECK: acc.exit_data dataOperands(%[[DEVPTR_A]], %[[DEVPTR_B]], %[[DEVPTR_C]] : !fir.ref>, !fir.ref>, !fir.ref>){{$}} -!CHECK: acc.delete accPtr(%[[DEVPTR_A]] : !fir.ref>) {name = "a", structured = false} -!CHECK: acc.delete accPtr(%[[DEVPTR_B]] : !fir.ref>) {name = "b", structured = false} -!CHECK: acc.delete accPtr(%[[DEVPTR_C]] : !fir.ref>) {name = "c", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR_A]] : !fir.ref>) {name = "a"} +!CHECK: acc.delete accPtr(%[[DEVPTR_B]] : !fir.ref>) {name = "b"} +!CHECK: acc.delete accPtr(%[[DEVPTR_C]] : !fir.ref>) {name = "c"} !$acc exit data copyout(a) delete(b) detach(d) -!CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "b", structured = false} -!CHECK: %[[DEVPTR_D:.*]] = acc.getdeviceptr varPtr(%[[DECLD]]#0 : !fir.ref>>) -> !fir.ref>> {dataClause = #acc, name = "d", structured = false} +!CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} +!CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "b"} +!CHECK: %[[DEVPTR_D:.*]] = acc.getdeviceptr varPtr(%[[DECLD]]#0 : !fir.ref>>) -> !fir.ref>> {dataClause = #acc, name = "d"} !CHECK: acc.exit_data dataOperands(%[[DEVPTR_A]], %[[DEVPTR_B]], %[[DEVPTR_D]] : !fir.ref>, !fir.ref>, !fir.ref>>) -!CHECK: acc.copyout accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} -!CHECK: acc.delete accPtr(%[[DEVPTR_B]] : !fir.ref>) {name = "b", structured = false} -!CHECK: acc.detach accPtr(%[[DEVPTR_D]] : !fir.ref>>) {name = "d", structured = false} +!CHECK: acc.copyout accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} +!CHECK: acc.delete 
accPtr(%[[DEVPTR_B]] : !fir.ref>) {name = "b"} +!CHECK: acc.detach accPtr(%[[DEVPTR_D]] : !fir.ref>>) {name = "d"} !$acc exit data delete(a) async -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a"} !$acc exit data delete(a) wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {wait} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a"} !$acc exit data delete(a) async wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async, wait} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a"} !$acc exit data delete(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async(%[[ASYNC1]] : 
i32) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data async(%[[ASYNC1]] : i32) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) async(%[[ASYNC1]] : i32) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a"} !$acc exit data delete(a) async(async) !CHECK: %[[ASYNC2:.*]] = fir.load %{{.*}} : !fir.ref -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async(%[[ASYNC2]] : i32) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: acc.exit_data async(%[[ASYNC2]] : i32) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) async(%[[ASYNC2]] : i32) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a"} !$acc exit data delete(a) wait(1) -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: %[[WAIT1:.*]] = arith.constant 1 : i32 !CHECK: acc.exit_data wait(%[[WAIT1]] : i32) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a"} !$acc exit data delete(a) wait(queues: 1, 2) -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: %[[WAIT2:.*]] = 
arith.constant 1 : i32 !CHECK: %[[WAIT3:.*]] = arith.constant 2 : i32 !CHECK: acc.exit_data wait(%[[WAIT2]], %[[WAIT3]] : i32, i32) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a"} !$acc exit data delete(a) wait(devnum: 1: queues: 1, 2) -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} !CHECK: %[[WAIT4:.*]] = arith.constant 1 : i32 !CHECK: %[[WAIT5:.*]] = arith.constant 2 : i32 !CHECK: %[[WAIT6:.*]] = arith.constant 1 : i32 !CHECK: acc.exit_data wait_devnum(%[[WAIT6]] : i32) wait(%[[WAIT4]], %[[WAIT5]] : i32, i32) dataOperands(%[[DEVPTR]] : !fir.ref>) -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a", structured = false} +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a"} !$acc exit data delete(a) finalize !CHECK: acc.exit_data dataOperands(%{{.*}} : !fir.ref>) attributes {finalize} diff --git a/flang/test/Lower/OpenACC/acc-parallel.f90 b/flang/test/Lower/OpenACC/acc-parallel.f90 index e00ea41210966..cba3ad538b4e2 100644 --- a/flang/test/Lower/OpenACC/acc-parallel.f90 +++ b/flang/test/Lower/OpenACC/acc-parallel.f90 @@ -330,10 +330,11 @@ subroutine acc_parallel !$acc parallel private(a) firstprivate(b) private(c) async(1) !$acc end parallel -! CHECK: %[[ACC_PRIVATE_A:.*]] = acc.private varPtr(%[[DECLA]]#0 : !fir.ref>) async([[ASYNC3:%.*]]) -> !fir.ref> {name = "a"} -! CHECK: %[[ACC_FPRIVATE_B:.*]] = acc.firstprivate varPtr(%[[DECLB]]#0 : !fir.ref>) async([[ASYNC3]]) -> !fir.ref> {name = "b"} -! CHECK: %[[ACC_PRIVATE_C:.*]] = acc.private varPtr(%[[DECLC]]#0 : !fir.ref>) async([[ASYNC3]]) -> !fir.ref> {name = "c"} -! 
CHECK: acc.parallel async([[ASYNC3]]) firstprivate(@firstprivatization_ref_10x10xf32 -> %[[ACC_FPRIVATE_B]] : !fir.ref>) private(@privatization_ref_10x10xf32 -> %[[ACC_PRIVATE_A]] : !fir.ref>, @privatization_ref_10x10xf32 -> %[[ACC_PRIVATE_C]] : !fir.ref>) { +! CHECK-DAG: %[[ACC_PRIVATE_A:.*]] = acc.private varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a"} +! CHECK-DAG: %[[ACC_FPRIVATE_B:.*]] = acc.firstprivate varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {name = "b"} +! CHECK-DAG: %[[ACC_PRIVATE_C:.*]] = acc.private varPtr(%[[DECLC]]#0 : !fir.ref>) -> !fir.ref> {name = "c"} +! CHECK-DAG: acc.parallel async(%[[ASYNC3:.*]] : i32) firstprivate(@firstprivatization_ref_10x10xf32 -> %[[ACC_FPRIVATE_B]] : !fir.ref>) private(@privatization_ref_10x10xf32 -> %[[ACC_PRIVATE_A]] : !fir.ref>, @privatization_ref_10x10xf32 -> %[[ACC_PRIVATE_C]] : !fir.ref>) { +! CHECK-DAG: %[[ASYNC3]] = arith.constant 1 : i32 ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} diff --git a/flang/test/Lower/OpenACC/acc-update.f90 b/flang/test/Lower/OpenACC/acc-update.f90 index f96b105ed93bd..4d74da38bb8e6 100644 --- a/flang/test/Lower/OpenACC/acc-update.f90 +++ b/flang/test/Lower/OpenACC/acc-update.f90 @@ -15,101 +15,101 @@ subroutine acc_update ! CHECK: %[[DECLC:.*]]:2 = hlfir.declare %[[C]] !$acc update host(a) -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} ! CHECK: acc.update dataOperands(%[[DEVPTR_A]] : !fir.ref>){{$}} -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) if_present -! 
CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} ! CHECK: acc.update dataOperands(%[[DEVPTR_A]] : !fir.ref>) attributes {ifPresent}{{$}} -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) if_present if_present ! CHECK: acc.update dataOperands(%{{.*}} : !fir.ref>) attributes {ifPresent}{{$}} !$acc update self(a) -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} ! CHECK: acc.update dataOperands(%[[DEVPTR_A]] : !fir.ref>){{$}} -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {dataClause = #acc, name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {dataClause = #acc, name = "a"} !$acc update host(a) if(.true.) -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} ! CHECK: %[[IF1:.*]] = arith.constant true ! CHECK: acc.update if(%[[IF1]]) dataOperands(%[[DEVPTR_A]] : !fir.ref>){{$}} -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} +! 
CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) if(ifCondition) -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} ! CHECK: %[[IFCOND:.*]] = fir.load %{{.*}} : !fir.ref> ! CHECK: %[[IF2:.*]] = fir.convert %[[IFCOND]] : (!fir.logical<4>) -> i1 ! CHECK: acc.update if(%[[IF2]]) dataOperands(%[[DEVPTR_A]] : !fir.ref>){{$}} -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) host(b) host(c) -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -! CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "b", structured = false} -! CHECK: %[[DEVPTR_C:.*]] = acc.getdeviceptr varPtr(%[[DECLC]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "c", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} +! CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "b"} +! CHECK: %[[DEVPTR_C:.*]] = acc.getdeviceptr varPtr(%[[DECLC]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "c"} ! CHECK: acc.update dataOperands(%[[DEVPTR_A]], %[[DEVPTR_B]], %[[DEVPTR_C]] : !fir.ref>, !fir.ref>, !fir.ref>){{$}} -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} -! 
CHECK: acc.update_host accPtr(%[[DEVPTR_B]] : !fir.ref>) to varPtr(%[[DECLB]]#0 : !fir.ref>) {name = "b", structured = false} -! CHECK: acc.update_host accPtr(%[[DEVPTR_C]] : !fir.ref>) to varPtr(%[[DECLC]]#0 : !fir.ref>) {name = "c", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} +! CHECK: acc.update_host accPtr(%[[DEVPTR_B]] : !fir.ref>) to varPtr(%[[DECLB]]#0 : !fir.ref>) {name = "b"} +! CHECK: acc.update_host accPtr(%[[DEVPTR_C]] : !fir.ref>) to varPtr(%[[DECLC]]#0 : !fir.ref>) {name = "c"} !$acc update host(a) host(b) device(c) -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -! CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "b", structured = false} -! CHECK: %[[DEVPTR_C:.*]] = acc.update_device varPtr(%[[DECLC]]#0 : !fir.ref>) -> !fir.ref> {name = "c", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} +! CHECK: %[[DEVPTR_B:.*]] = acc.getdeviceptr varPtr(%[[DECLB]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "b"} +! CHECK: %[[DEVPTR_C:.*]] = acc.update_device varPtr(%[[DECLC]]#0 : !fir.ref>) -> !fir.ref> {name = "c"} ! CHECK: acc.update dataOperands(%[[DEVPTR_C]], %[[DEVPTR_A]], %[[DEVPTR_B]] : !fir.ref>, !fir.ref>, !fir.ref>){{$}} -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} -! CHECK: acc.update_host accPtr(%[[DEVPTR_B]] : !fir.ref>) to varPtr(%[[DECLB]]#0 : !fir.ref>) {name = "b", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} +! 
CHECK: acc.update_host accPtr(%[[DEVPTR_B]] : !fir.ref>) to varPtr(%[[DECLB]]#0 : !fir.ref>) {name = "b"} !$acc update host(a) async -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -! CHECK: acc.update async dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} +! CHECK: acc.update dataOperands(%[[DEVPTR_A]] : !fir.ref>) attributes {asyncOnly = [#acc.device_type]} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) wait -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} ! CHECK: acc.update wait dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) async wait -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -! CHECK: acc.update async wait dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +! 
CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} +! CHECK: acc.update wait dataOperands(%[[DEVPTR_A]] : !fir.ref>) attributes {asyncOnly = [#acc.device_type]} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) async(1) ! CHECK: [[ASYNC1:%.*]] = arith.constant 1 : i32 -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async([[ASYNC1]] : i32) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} ! CHECK: acc.update async([[ASYNC1]] : i32) dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async([[ASYNC1]] : i32) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) async(async) ! CHECK: [[ASYNC2:%.*]] = fir.load %{{.*}} : !fir.ref -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async([[ASYNC2]] : i32) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} ! CHECK: acc.update async([[ASYNC2]] : i32) dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async([[ASYNC2]] : i32) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) wait(1) ! CHECK: [[WAIT1:%.*]] = arith.constant 1 : i32 -! 
CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} ! CHECK: acc.update wait({[[WAIT1]] : i32}) dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) wait(queues: 1, 2) ! CHECK: [[WAIT2:%.*]] = arith.constant 1 : i32 ! CHECK: [[WAIT3:%.*]] = arith.constant 2 : i32 -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} ! CHECK: acc.update wait({[[WAIT2]] : i32, [[WAIT3]] : i32}) dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) wait(devnum: 1: queues: 1, 2) -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} ! CHECK: acc.update wait({devnum: %c1{{.*}} : i32, %c1{{.*}} : i32, %c2{{.*}} : i32}) dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} +! 
CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} !$acc update host(a) device_type(host, nvidia) async -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type, #acc.device_type], dataClause = #acc, name = "a", structured = false} -! CHECK: acc.update async([#acc.device_type, #acc.device_type]) dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type, #acc.device_type], name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a"} +! CHECK: acc.update dataOperands(%[[DEVPTR_A]] : !fir.ref>) attributes {asyncOnly = [#acc.device_type, #acc.device_type]} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a"} end subroutine acc_update diff --git a/mlir/test/Dialect/OpenACC/invalid.mlir b/mlir/test/Dialect/OpenACC/invalid.mlir index c8d7a87112917..e2287009fe2fd 100644 --- a/mlir/test/Dialect/OpenACC/invalid.mlir +++ b/mlir/test/Dialect/OpenACC/invalid.mlir @@ -130,7 +130,7 @@ acc.update %value = memref.alloc() : memref %0 = acc.update_device varPtr(%value : memref) -> memref // expected-error at +1 {{async attribute cannot appear with asyncOperand}} -acc.update async(%cst: index) dataOperands(%0 : memref) attributes {async = [#acc.device_type]} +acc.update async(%cst: index) dataOperands(%0 : memref) attributes {asyncOnly = [#acc.device_type]} // ----- diff --git a/mlir/test/Dialect/OpenACC/ops.mlir b/mlir/test/Dialect/OpenACC/ops.mlir index 4c842a26f8dc4..8a05ee75ae9d3 100644 --- a/mlir/test/Dialect/OpenACC/ops.mlir +++ b/mlir/test/Dialect/OpenACC/ops.mlir @@ -938,7 +938,7 @@ func.func @testupdateop(%a: memref, %b: memref, %c: memref) -> () acc.update if(%ifCond) dataOperands(%0: memref) 
acc.update dataOperands(%0: memref) acc.update dataOperands(%0, %1, %2 : memref, memref, memref) - acc.update async dataOperands(%0, %1, %2 : memref, memref, memref) + acc.update dataOperands(%0, %1, %2 : memref, memref, memref) attributes {asyncOnly = [#acc.device_type]} acc.update wait dataOperands(%0, %1, %2 : memref, memref, memref) acc.update dataOperands(%0, %1, %2 : memref, memref, memref) attributes {ifPresent} return @@ -957,7 +957,7 @@ func.func @testupdateop(%a: memref, %b: memref, %c: memref) -> () // CHECK: acc.update if([[IFCOND]]) dataOperands(%{{.*}} : memref) // CHECK: acc.update dataOperands(%{{.*}} : memref) // CHECK: acc.update dataOperands(%{{.*}}, %{{.*}}, %{{.*}} : memref, memref, memref) -// CHECK: acc.update async dataOperands(%{{.*}}, %{{.*}}, %{{.*}} : memref, memref, memref) +// CHECK: acc.update dataOperands(%{{.*}}, %{{.*}}, %{{.*}} : memref, memref, memref) attributes {asyncOnly = [#acc.device_type]} // CHECK: acc.update wait dataOperands(%{{.*}}, %{{.*}}, %{{.*}} : memref, memref, memref) // CHECK: acc.update dataOperands(%{{.*}}, %{{.*}}, %{{.*}} : memref, memref, memref) attributes {ifPresent} @@ -1342,28 +1342,28 @@ func.func @teststructureddataclauseops(%a: memref<10xf32>, %b: memref) -> () { - %copyin = acc.copyin varPtr(%a : memref<10xf32>) varType(tensor<10xf32>) -> memref<10xf32> {structured = false} + %copyin = acc.copyin varPtr(%a : memref<10xf32>) varType(tensor<10xf32>) -> memref<10xf32> acc.enter_data dataOperands(%copyin : memref<10xf32>) %devptr = acc.getdeviceptr varPtr(%a : memref<10xf32>) varType(tensor<10xf32>) -> memref<10xf32> {dataClause = #acc} acc.exit_data dataOperands(%devptr : memref<10xf32>) - acc.copyout accPtr(%devptr : memref<10xf32>) to varPtr(%a : memref<10xf32>) varType(tensor<10xf32>) {structured = false} + acc.copyout accPtr(%devptr : memref<10xf32>) to varPtr(%a : memref<10xf32>) varType(tensor<10xf32>) return } // CHECK: func.func @testunstructuredclauseops([[ARGA:%.*]]: memref<10xf32>) { -// 
CHECK: [[COPYIN:%.*]] = acc.copyin varPtr([[ARGA]] : memref<10xf32>) varType(tensor<10xf32>) -> memref<10xf32> {structured = false} +// CHECK: [[COPYIN:%.*]] = acc.copyin varPtr([[ARGA]] : memref<10xf32>) varType(tensor<10xf32>) -> memref<10xf32> // CHECK-NEXT: acc.enter_data dataOperands([[COPYIN]] : memref<10xf32>) // CHECK: [[DEVPTR:%.*]] = acc.getdeviceptr varPtr([[ARGA]] : memref<10xf32>) varType(tensor<10xf32>) -> memref<10xf32> {dataClause = #acc} // CHECK-NEXT: acc.exit_data dataOperands([[DEVPTR]] : memref<10xf32>) -// CHECK-NEXT: acc.copyout accPtr([[DEVPTR]] : memref<10xf32>) to varPtr([[ARGA]] : memref<10xf32>) varType(tensor<10xf32>) {structured = false} +// CHECK-NEXT: acc.copyout accPtr([[DEVPTR]] : memref<10xf32>) to varPtr([[ARGA]] : memref<10xf32>) varType(tensor<10xf32>) // ----- func.func @host_device_ops(%a: memref) -> () { %devptr = acc.getdeviceptr varPtr(%a : memref) -> memref - acc.update_host accPtr(%devptr : memref) to varPtr(%a : memref) {structured = false} + acc.update_host accPtr(%devptr : memref) to varPtr(%a : memref) acc.update dataOperands(%devptr : memref) %accPtr = acc.update_device varPtr(%a : memref) -> memref @@ -1374,7 +1374,7 @@ func.func @host_device_ops(%a: memref) -> () { // CHECK-LABEL: func.func @host_device_ops( // CHECK-SAME: %[[A:.*]]: memref) // CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[A]] : memref) -> memref -// CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : memref) to varPtr(%[[A]] : memref) {structured = false} +// CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : memref) to varPtr(%[[A]] : memref) // CHECK: acc.update dataOperands(%[[DEVPTR_A]] : memref) // CHECK: %[[DEVPTR_A:.*]] = acc.update_device varPtr(%[[A]] : memref) -> memref // CHECK: acc.update dataOperands(%[[DEVPTR_A]] : memref) From flang-commits at lists.llvm.org Tue May 13 12:22:07 2025 From: flang-commits at lists.llvm.org (=?UTF-8?Q?Mateusz_Miku=C5=82a?= via flang-commits) Date: Tue, 13 May 2025 12:22:07 -0700 (PDT) Subject: 
[flang-commits] [flang] [flang][CMake] CYGWIN requires _GNU_SOURCE to be defined. (PR #66747)
In-Reply-To:
Message-ID: <68239bdf.050a0220.158753.ff89@mx.google.com>

mati865 wrote:

Fixed by https://github.com/llvm/llvm-project/pull/138329 and https://github.com/llvm/llvm-project/pull/138587

https://github.com/llvm/llvm-project/pull/66747

From flang-commits at lists.llvm.org Tue May 13 12:25:07 2025
From: flang-commits at lists.llvm.org (Mateusz Mikuła via flang-commits)
Date: Tue, 13 May 2025 12:25:07 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][CMake] CYGWIN: Fix undefined references at link time. (PR #67105)
In-Reply-To:
Message-ID: <68239c93.170a0220.2ae179.fdd4@mx.google.com>

mati865 wrote:

That's the right thing to do but needs a rebase.

https://github.com/llvm/llvm-project/pull/67105

From flang-commits at lists.llvm.org Tue May 13 12:29:48 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 13 May 2025 12:29:48 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723)
In-Reply-To:
Message-ID: <68239dac.170a0220.d71a0.ef79@mx.google.com>

https://github.com/khaki3 edited https://github.com/llvm/llvm-project/pull/139723

From flang-commits at lists.llvm.org Tue May 13 12:30:18 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 13 May 2025 12:30:18 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723)
In-Reply-To:
Message-ID: <68239dca.620a0220.97895.f503@mx.google.com>

https://github.com/khaki3 ready_for_review https://github.com/llvm/llvm-project/pull/139723

From flang-commits at lists.llvm.org Tue May 13 12:30:50 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 13 May 2025 12:30:50 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723)
In-Reply-To:
Message-ID: <68239dea.170a0220.1516df.f44c@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-openacc

Author: None (khaki3)

Changes

OpenACC data actions always accompany a parent construct and have no effect on their own, so users should handle them through that parent construct. In particular, if async operands are attached to data actions, some consumers may lower the data actions independently of their parent constructs, producing semantically incorrect code.

This PR removes the async operands and attributes and the structured flag from data actions. It also renames `UpdateOp`'s `async` to `asyncOnly`.

---
Patch is 257.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139723.diff

19 Files Affected:

- (modified) flang/lib/Lower/OpenACC.cpp (+133-221)
- (modified) flang/test/Fir/OpenACC/openacc-type-categories.f90 (+10-10)
- (modified) flang/test/Lower/OpenACC/acc-bounds.f90 (+3-3)
- (modified) flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 (+4-3)
- (modified) flang/test/Lower/OpenACC/acc-data.f90 (+4-3)
- (modified) flang/test/Lower/OpenACC/acc-declare-globals.f90 (+10-10)
- (modified) flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 (+19-19)
- (modified) flang/test/Lower/OpenACC/acc-declare.f90 (+13-13)
- (modified) flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 (+62-62)
- (modified) flang/test/Lower/OpenACC/acc-enter-data.f90 (+62-62)
- (modified) flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 (+34-34)
- (modified) flang/test/Lower/OpenACC/acc-exit-data.f90 (+34-34)
- (modified) flang/test/Lower/OpenACC/acc-parallel.f90 (+5-4)
- (modified) flang/test/Lower/OpenACC/acc-update.f90 (+42-42)
- (modified) mlir/include/mlir/Dialect/OpenACC/OpenACC.h (+8-9)
- (modified) mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td (+23-153)
- (modified) mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp (+16-18)
- (modified) mlir/test/Dialect/OpenACC/invalid.mlir (+1-1)
- (modified) mlir/test/Dialect/OpenACC/ops.mlir (+8-8)

``````````diff
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index
e1918288d6de3..c1a8dd0d5a478 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -104,15 +104,12 @@ static void addOperand(llvm::SmallVectorImpl &operands, } template -static Op -createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, - mlir::Value baseAddr, std::stringstream &name, - mlir::SmallVector bounds, bool structured, - bool implicit, mlir::acc::DataClause dataClause, - mlir::Type retTy, llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, - bool unwrapBoxAddr = false, mlir::Value isPresent = {}) { +static Op createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, + mlir::Value baseAddr, std::stringstream &name, + mlir::SmallVector bounds, + bool implicit, mlir::acc::DataClause dataClause, + mlir::Type retTy, bool unwrapBoxAddr = false, + mlir::Value isPresent = {}) { mlir::Value varPtrPtr; // The data clause may apply to either the box reference itself or the // pointer to the data it holds. So use `unwrapBoxAddr` to decide. 
@@ -157,11 +154,9 @@ createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, addOperand(operands, operandSegments, baseAddr); addOperand(operands, operandSegments, varPtrPtr); addOperands(operands, operandSegments, bounds); - addOperands(operands, operandSegments, async); Op op = builder.create(loc, retTy, operands); op.setNameAttr(builder.getStringAttr(name.str())); - op.setStructured(structured); op.setImplicit(implicit); op.setDataClause(dataClause); if (auto mappableTy = @@ -176,10 +171,6 @@ createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, op->setAttr(Op::getOperandSegmentSizeAttr(), builder.getDenseI32ArrayAttr(operandSegments)); - if (!asyncDeviceTypes.empty()) - op.setAsyncOperandsDeviceTypeAttr(builder.getArrayAttr(asyncDeviceTypes)); - if (!asyncOnlyDeviceTypes.empty()) - op.setAsyncOnlyAttr(builder.getArrayAttr(asyncOnlyDeviceTypes)); return op; } @@ -249,9 +240,7 @@ static void createDeclareAllocFuncWithArg(mlir::OpBuilder &modBuilder, mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, registerFuncOp.getArgument(0), asFortranDesc, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, descTy, - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, descTy); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -263,8 +252,7 @@ static void createDeclareAllocFuncWithArg(mlir::OpBuilder &modBuilder, addDeclareAttr(builder, boxAddrOp.getOperation(), clause); EntryOp entryOp = createDataEntryOp( builder, loc, boxAddrOp.getResult(), asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, boxAddrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, boxAddrOp.getType()); builder.create( loc, 
mlir::acc::DeclareTokenType::get(entryOp.getContext()), mlir::ValueRange(entryOp.getAccVar())); @@ -302,26 +290,20 @@ static void createDeclareDeallocFuncWithArg( mlir::acc::GetDevicePtrOp entryOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, var.getType()); builder.create( loc, mlir::Value{}, mlir::ValueRange(entryOp.getAccVar())); if constexpr (std::is_same_v || std::is_same_v) - builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getVar(), entryOp.getVarType(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, - builder.getStringAttr(*entryOp.getName())); + builder.create( + entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), + entryOp.getVarType(), entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); else builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); // Generate the post dealloc function. 
@@ -341,9 +323,8 @@ static void createDeclareDeallocFuncWithArg( mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, + var.getType()); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -700,10 +681,7 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - mlir::acc::DataClause dataClause, bool structured, - bool implicit, llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, + mlir::acc::DataClause dataClause, bool implicit, bool setDeclareAttr = false) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; @@ -732,9 +710,8 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, ? 
info.rawInput : info.addr; Op op = createDataEntryOp( - builder, operandLocation, baseAddr, asFortran, bounds, structured, - implicit, dataClause, baseAddr.getType(), async, asyncDeviceTypes, - asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true, info.isPresent); + builder, operandLocation, baseAddr, asFortran, bounds, implicit, + dataClause, baseAddr.getType(), /*unwrapBoxAddr=*/true, info.isPresent); dataOperands.push_back(op.getAccVar()); } } @@ -746,7 +723,7 @@ static void genDeclareDataOperandOperations( Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - mlir::acc::DataClause dataClause, bool structured, bool implicit) { + mlir::acc::DataClause dataClause, bool implicit) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -765,10 +742,9 @@ static void genDeclareDataOperandOperations( /*genDefaultBounds=*/generateDefaultBounds, /*strideIncludeLowerExtent=*/strideIncludeLowerExtent); LLVM_DEBUG(llvm::dbgs() << __func__ << "\n"; info.dump(llvm::dbgs())); - EntryOp op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, structured, - implicit, dataClause, info.addr.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + EntryOp op = createDataEntryOp(builder, operandLocation, info.addr, + asFortran, bounds, implicit, + dataClause, info.addr.getType()); dataOperands.push_back(op.getAccVar()); addDeclareAttr(builder, op.getVar().getDefiningOp(), dataClause); if (mlir::isa(fir::unwrapRefType(info.addr.getType()))) { @@ -805,14 +781,12 @@ static void genDeclareDataOperandOperationsWithModifier( (modifier && (*modifier).v == mod) ? 
clauseWithModifier : clause; genDeclareDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, - dataClause, - /*structured=*/true, /*implicit=*/false); + dataClause, /*implicit=*/false); } template static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { + llvm::SmallVector operands) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); @@ -820,16 +794,13 @@ static void genDataExitOperations(fir::FirOpBuilder &builder, std::is_same_v) builder.create( entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getVarType(), entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), - entryOp.getDataClause(), structured, entryOp.getImplicit(), - builder.getStringAttr(*entryOp.getName())); - else - builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), - entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, + entryOp.getVarType(), entryOp.getBounds(), entryOp.getDataClause(), entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); + else + builder.create(entryOp.getLoc(), entryOp.getAccVar(), + entryOp.getBounds(), entryOp.getDataClause(), + entryOp.getImplicit(), + builder.getStringAttr(*entryOp.getName())); } } @@ -1240,10 +1211,7 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - llvm::SmallVector &privatizations, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { + llvm::SmallVector &privatizations) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); 
Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -1272,9 +1240,9 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, recipe = Fortran::lower::createOrGetPrivateRecipe(builder, recipeName, operandLocation, retTy); auto op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, true, - /*implicit=*/false, mlir::acc::DataClause::acc_private, retTy, async, - asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); + builder, operandLocation, info.addr, asFortran, bounds, + /*implicit=*/false, mlir::acc::DataClause::acc_private, retTy, + /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } else { std::string suffix = @@ -1284,9 +1252,8 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, recipe = Fortran::lower::createOrGetFirstprivateRecipe( builder, recipeName, operandLocation, retTy, bounds); auto op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, true, + builder, operandLocation, info.addr, asFortran, bounds, /*implicit=*/false, mlir::acc::DataClause::acc_firstprivate, retTy, - async, asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } @@ -1869,10 +1836,7 @@ genReductions(const Fortran::parser::AccObjectListWithReduction &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &reductionOperands, - llvm::SmallVector &reductionRecipes, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { + llvm::SmallVector &reductionRecipes) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); const auto &objects = std::get(objectList.t); const auto &op = std::get(objectList.t); @@ -1904,9 +1868,8 @@ genReductions(const Fortran::parser::AccObjectListWithReduction &objectList, auto op = createDataEntryOp( builder, operandLocation, 
info.addr, asFortran, bounds, - /*structured=*/true, /*implicit=*/false, - mlir::acc::DataClause::acc_reduction, info.addr.getType(), async, - asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); + /*implicit=*/false, mlir::acc::DataClause::acc_reduction, + info.addr.getType(), /*unwrapBoxAddr=*/true); mlir::Type ty = op.getAccVar().getType(); if (!areAllBoundConstant(bounds) || fir::isAssumedShape(info.addr.getType()) || @@ -2169,9 +2132,8 @@ static void privatizeIv(Fortran::lower::AbstractConverter &converter, std::stringstream asFortran; asFortran << Fortran::lower::mangle::demangleName(toStringRef(sym.name())); auto op = createDataEntryOp( - builder, loc, ivValue, asFortran, {}, true, /*implicit=*/true, - mlir::acc::DataClause::acc_private, ivValue.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + builder, loc, ivValue, asFortran, {}, /*implicit=*/true, + mlir::acc::DataClause::acc_private, ivValue.getType()); privateOp = op.getOperation(); privateOperands.push_back(op.getAccVar()); @@ -2328,14 +2290,12 @@ static mlir::acc::LoopOp createLoopOp( &clause.u)) { genPrivatizations( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, /*async=*/{}, - /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + privateOperands, privatizations); } else if (const auto *reductionClause = std::get_if( &clause.u)) { genReductions(reductionClause->v, converter, semanticsContext, stmtCtx, - reductionOperands, reductionRecipes, /*async=*/{}, - /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + reductionOperands, reductionRecipes); } else if (std::get_if(&clause.u)) { for (auto crtDeviceTypeAttr : crtDeviceTypes) seqDeviceTypes.push_back(crtDeviceTypeAttr); @@ -2613,9 +2573,6 @@ static void genDataOperandOperationsWithModifier( llvm::SmallVectorImpl &dataClauseOperands, const mlir::acc::DataClause clause, const mlir::acc::DataClause clauseWithModifier, - llvm::ArrayRef async, - llvm::ArrayRef 
asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, bool setDeclareAttr = false) { const Fortran::parser::AccObjectListWithModifier &listWithModifier = x->v; const auto &accObjectList = @@ -2627,9 +2584,7 @@ static void genDataOperandOperationsWithModifier( (modifier && (*modifier).v == mod) ? clauseWithModifier : clause; genDataOperandOperations(accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, dataClause, - /*structured=*/true, /*implicit=*/false, async, - asyncDeviceTypes, asyncOnlyDeviceTypes, - setDeclareAttr); + /*implicit=*/false, setDeclareAttr); } template @@ -2779,8 +2734,7 @@ static Op createComputeOp( genDataOperandOperations( copyClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copy, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyinClause = @@ -2791,8 +2745,7 @@ static Op createComputeOp( copyinClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyin, - mlir::acc::DataClause::acc_copyin_readonly, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyin_readonly); copyinEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyoutClause = @@ -2804,8 +2757,7 @@ static Op createComputeOp( copyoutClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyout, - mlir::acc::DataClause::acc_copyout_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyout_zero); copyoutEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto 
*createClause = @@ -2816,8 +2768,7 @@ static Op createComputeOp( createClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::Zero, dataClauseOperands, mlir::acc::DataClause::acc_create, - mlir::acc::DataClause::acc_create_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTy... [truncated] ``````````
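
The design point in the PR description (async information belongs to the parent construct, and data actions carry none of their own) can be illustrated with a small standalone sketch. This is not code from the patch and does not use any MLIR or flang API: `ParentConstruct`, `DataAction`, `isAsync`, and `asyncQueueFor` are hypothetical names chosen only for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a parent construct (for example an update or
// enter-data operation). It is the only place async information is stored.
struct ParentConstruct {
  std::optional<int> asyncQueue; // queue id from `async(n)`, if present
  bool asyncOnly = false;        // bare `async` clause with no operand
};

// Hypothetical stand-in for a data action. It holds no async state, so it
// cannot be lowered with async behavior that disagrees with its parent.
struct DataAction {
  std::string name;
  const ParentConstruct *parent = nullptr;
};

// Whether the action runs asynchronously is decided by the parent alone.
bool isAsync(const DataAction &action) {
  assert(action.parent && "a data action must belong to a parent construct");
  return action.parent->asyncOnly || action.parent->asyncQueue.has_value();
}

// The queue, if any, is likewise looked up through the parent.
std::optional<int> asyncQueueFor(const DataAction &action) {
  assert(action.parent && "a data action must belong to a parent construct");
  return action.parent->asyncQueue;
}
```

Under this assumed model, a consumer that lowers a data action has to consult the parent construct to learn about async behavior, which is the constraint the description argues for.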
https://github.com/llvm/llvm-project/pull/139723

From flang-commits at lists.llvm.org Tue May 13 12:30:51 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 13 May 2025 12:30:51 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723)
In-Reply-To:
Message-ID: <68239deb.170a0220.3d8973.ee03@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-mlir-openacc @llvm/pr-subscribers-flang-fir-hlfir

Author: None (khaki3)

Changes

OpenACC data actions always accompany a parent construct and have no effect on their own, so users should handle them through that parent construct. In particular, if async operands are attached to data actions, some consumers may lower the data actions independently of their parent constructs, producing semantically incorrect code.

This PR removes the async operands and attributes and the structured flag from data actions. It also renames `UpdateOp`'s `async` to `asyncOnly`.

---
Patch is 257.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139723.diff

19 Files Affected:

- (modified) flang/lib/Lower/OpenACC.cpp (+133-221)
- (modified) flang/test/Fir/OpenACC/openacc-type-categories.f90 (+10-10)
- (modified) flang/test/Lower/OpenACC/acc-bounds.f90 (+3-3)
- (modified) flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 (+4-3)
- (modified) flang/test/Lower/OpenACC/acc-data.f90 (+4-3)
- (modified) flang/test/Lower/OpenACC/acc-declare-globals.f90 (+10-10)
- (modified) flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 (+19-19)
- (modified) flang/test/Lower/OpenACC/acc-declare.f90 (+13-13)
- (modified) flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 (+62-62)
- (modified) flang/test/Lower/OpenACC/acc-enter-data.f90 (+62-62)
- (modified) flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 (+34-34)
- (modified) flang/test/Lower/OpenACC/acc-exit-data.f90 (+34-34)
- (modified) flang/test/Lower/OpenACC/acc-parallel.f90 (+5-4)
- (modified) flang/test/Lower/OpenACC/acc-update.f90 (+42-42)
- (modified) mlir/include/mlir/Dialect/OpenACC/OpenACC.h (+8-9)
- (modified) mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td (+23-153)
- (modified) mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp (+16-18)
- (modified) mlir/test/Dialect/OpenACC/invalid.mlir (+1-1)
- (modified) mlir/test/Dialect/OpenACC/ops.mlir (+8-8)

``````````diff
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index
e1918288d6de3..c1a8dd0d5a478 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -104,15 +104,12 @@ static void addOperand(llvm::SmallVectorImpl &operands, } template -static Op -createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, - mlir::Value baseAddr, std::stringstream &name, - mlir::SmallVector bounds, bool structured, - bool implicit, mlir::acc::DataClause dataClause, - mlir::Type retTy, llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, - bool unwrapBoxAddr = false, mlir::Value isPresent = {}) { +static Op createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, + mlir::Value baseAddr, std::stringstream &name, + mlir::SmallVector bounds, + bool implicit, mlir::acc::DataClause dataClause, + mlir::Type retTy, bool unwrapBoxAddr = false, + mlir::Value isPresent = {}) { mlir::Value varPtrPtr; // The data clause may apply to either the box reference itself or the // pointer to the data it holds. So use `unwrapBoxAddr` to decide. 
@@ -157,11 +154,9 @@ createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, addOperand(operands, operandSegments, baseAddr); addOperand(operands, operandSegments, varPtrPtr); addOperands(operands, operandSegments, bounds); - addOperands(operands, operandSegments, async); Op op = builder.create(loc, retTy, operands); op.setNameAttr(builder.getStringAttr(name.str())); - op.setStructured(structured); op.setImplicit(implicit); op.setDataClause(dataClause); if (auto mappableTy = @@ -176,10 +171,6 @@ createDataEntryOp(fir::FirOpBuilder &builder, mlir::Location loc, op->setAttr(Op::getOperandSegmentSizeAttr(), builder.getDenseI32ArrayAttr(operandSegments)); - if (!asyncDeviceTypes.empty()) - op.setAsyncOperandsDeviceTypeAttr(builder.getArrayAttr(asyncDeviceTypes)); - if (!asyncOnlyDeviceTypes.empty()) - op.setAsyncOnlyAttr(builder.getArrayAttr(asyncOnlyDeviceTypes)); return op; } @@ -249,9 +240,7 @@ static void createDeclareAllocFuncWithArg(mlir::OpBuilder &modBuilder, mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, registerFuncOp.getArgument(0), asFortranDesc, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, descTy, - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, descTy); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -263,8 +252,7 @@ static void createDeclareAllocFuncWithArg(mlir::OpBuilder &modBuilder, addDeclareAttr(builder, boxAddrOp.getOperation(), clause); EntryOp entryOp = createDataEntryOp( builder, loc, boxAddrOp.getResult(), asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, boxAddrOp.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, boxAddrOp.getType()); builder.create( loc, 
mlir::acc::DeclareTokenType::get(entryOp.getContext()), mlir::ValueRange(entryOp.getAccVar())); @@ -302,26 +290,20 @@ static void createDeclareDeallocFuncWithArg( mlir::acc::GetDevicePtrOp entryOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/false, clause, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/false, clause, var.getType()); builder.create( loc, mlir::Value{}, mlir::ValueRange(entryOp.getAccVar())); if constexpr (std::is_same_v || std::is_same_v) - builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getVar(), entryOp.getVarType(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, - builder.getStringAttr(*entryOp.getName())); + builder.create( + entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), + entryOp.getVarType(), entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); else builder.create(entryOp.getLoc(), entryOp.getAccVar(), - entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), - /*structured=*/false, /*implicit=*/false, + entryOp.getBounds(), entryOp.getDataClause(), + /*implicit=*/false, builder.getStringAttr(*entryOp.getName())); // Generate the post dealloc function. 
@@ -341,9 +323,8 @@ static void createDeclareDeallocFuncWithArg( mlir::acc::UpdateDeviceOp updateDeviceOp = createDataEntryOp( builder, loc, var, asFortran, bounds, - /*structured=*/false, /*implicit=*/true, - mlir::acc::DataClause::acc_update_device, var.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + /*implicit=*/true, mlir::acc::DataClause::acc_update_device, + var.getType()); llvm::SmallVector operandSegments{0, 0, 0, 1}; llvm::SmallVector operands{updateDeviceOp.getResult()}; createSimpleOp(builder, loc, operands, operandSegments); @@ -700,10 +681,7 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - mlir::acc::DataClause dataClause, bool structured, - bool implicit, llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, + mlir::acc::DataClause dataClause, bool implicit, bool setDeclareAttr = false) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; @@ -732,9 +710,8 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, ? 
info.rawInput : info.addr; Op op = createDataEntryOp( - builder, operandLocation, baseAddr, asFortran, bounds, structured, - implicit, dataClause, baseAddr.getType(), async, asyncDeviceTypes, - asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true, info.isPresent); + builder, operandLocation, baseAddr, asFortran, bounds, implicit, + dataClause, baseAddr.getType(), /*unwrapBoxAddr=*/true, info.isPresent); dataOperands.push_back(op.getAccVar()); } } @@ -746,7 +723,7 @@ static void genDeclareDataOperandOperations( Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - mlir::acc::DataClause dataClause, bool structured, bool implicit) { + mlir::acc::DataClause dataClause, bool implicit) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -765,10 +742,9 @@ static void genDeclareDataOperandOperations( /*genDefaultBounds=*/generateDefaultBounds, /*strideIncludeLowerExtent=*/strideIncludeLowerExtent); LLVM_DEBUG(llvm::dbgs() << __func__ << "\n"; info.dump(llvm::dbgs())); - EntryOp op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, structured, - implicit, dataClause, info.addr.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + EntryOp op = createDataEntryOp(builder, operandLocation, info.addr, + asFortran, bounds, implicit, + dataClause, info.addr.getType()); dataOperands.push_back(op.getAccVar()); addDeclareAttr(builder, op.getVar().getDefiningOp(), dataClause); if (mlir::isa(fir::unwrapRefType(info.addr.getType()))) { @@ -805,14 +781,12 @@ static void genDeclareDataOperandOperationsWithModifier( (modifier && (*modifier).v == mod) ? 
clauseWithModifier : clause; genDeclareDataOperandOperations( accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, - dataClause, - /*structured=*/true, /*implicit=*/false); + dataClause, /*implicit=*/false); } template static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { + llvm::SmallVector operands) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); @@ -820,16 +794,13 @@ static void genDataExitOperations(fir::FirOpBuilder &builder, std::is_same_v) builder.create( entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getVarType(), entryOp.getBounds(), entryOp.getAsyncOperands(), - entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), - entryOp.getDataClause(), structured, entryOp.getImplicit(), - builder.getStringAttr(*entryOp.getName())); - else - builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), - entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), - entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, + entryOp.getVarType(), entryOp.getBounds(), entryOp.getDataClause(), entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); + else + builder.create(entryOp.getLoc(), entryOp.getAccVar(), + entryOp.getBounds(), entryOp.getDataClause(), + entryOp.getImplicit(), + builder.getStringAttr(*entryOp.getName())); } } @@ -1240,10 +1211,7 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &dataOperands, - llvm::SmallVector &privatizations, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { + llvm::SmallVector &privatizations) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); 
Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -1272,9 +1240,9 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, recipe = Fortran::lower::createOrGetPrivateRecipe(builder, recipeName, operandLocation, retTy); auto op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, true, - /*implicit=*/false, mlir::acc::DataClause::acc_private, retTy, async, - asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); + builder, operandLocation, info.addr, asFortran, bounds, + /*implicit=*/false, mlir::acc::DataClause::acc_private, retTy, + /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } else { std::string suffix = @@ -1284,9 +1252,8 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, recipe = Fortran::lower::createOrGetFirstprivateRecipe( builder, recipeName, operandLocation, retTy, bounds); auto op = createDataEntryOp( - builder, operandLocation, info.addr, asFortran, bounds, true, + builder, operandLocation, info.addr, asFortran, bounds, /*implicit=*/false, mlir::acc::DataClause::acc_firstprivate, retTy, - async, asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } @@ -1869,10 +1836,7 @@ genReductions(const Fortran::parser::AccObjectListWithReduction &objectList, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, llvm::SmallVectorImpl &reductionOperands, - llvm::SmallVector &reductionRecipes, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { + llvm::SmallVector &reductionRecipes) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); const auto &objects = std::get(objectList.t); const auto &op = std::get(objectList.t); @@ -1904,9 +1868,8 @@ genReductions(const Fortran::parser::AccObjectListWithReduction &objectList, auto op = createDataEntryOp( builder, operandLocation, 
info.addr, asFortran, bounds, - /*structured=*/true, /*implicit=*/false, - mlir::acc::DataClause::acc_reduction, info.addr.getType(), async, - asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true); + /*implicit=*/false, mlir::acc::DataClause::acc_reduction, + info.addr.getType(), /*unwrapBoxAddr=*/true); mlir::Type ty = op.getAccVar().getType(); if (!areAllBoundConstant(bounds) || fir::isAssumedShape(info.addr.getType()) || @@ -2169,9 +2132,8 @@ static void privatizeIv(Fortran::lower::AbstractConverter &converter, std::stringstream asFortran; asFortran << Fortran::lower::mangle::demangleName(toStringRef(sym.name())); auto op = createDataEntryOp( - builder, loc, ivValue, asFortran, {}, true, /*implicit=*/true, - mlir::acc::DataClause::acc_private, ivValue.getType(), - /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + builder, loc, ivValue, asFortran, {}, /*implicit=*/true, + mlir::acc::DataClause::acc_private, ivValue.getType()); privateOp = op.getOperation(); privateOperands.push_back(op.getAccVar()); @@ -2328,14 +2290,12 @@ static mlir::acc::LoopOp createLoopOp( &clause.u)) { genPrivatizations( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, /*async=*/{}, - /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + privateOperands, privatizations); } else if (const auto *reductionClause = std::get_if( &clause.u)) { genReductions(reductionClause->v, converter, semanticsContext, stmtCtx, - reductionOperands, reductionRecipes, /*async=*/{}, - /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); + reductionOperands, reductionRecipes); } else if (std::get_if(&clause.u)) { for (auto crtDeviceTypeAttr : crtDeviceTypes) seqDeviceTypes.push_back(crtDeviceTypeAttr); @@ -2613,9 +2573,6 @@ static void genDataOperandOperationsWithModifier( llvm::SmallVectorImpl &dataClauseOperands, const mlir::acc::DataClause clause, const mlir::acc::DataClause clauseWithModifier, - llvm::ArrayRef async, - llvm::ArrayRef 
asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes, bool setDeclareAttr = false) { const Fortran::parser::AccObjectListWithModifier &listWithModifier = x->v; const auto &accObjectList = @@ -2627,9 +2584,7 @@ static void genDataOperandOperationsWithModifier( (modifier && (*modifier).v == mod) ? clauseWithModifier : clause; genDataOperandOperations(accObjectList, converter, semanticsContext, stmtCtx, dataClauseOperands, dataClause, - /*structured=*/true, /*implicit=*/false, async, - asyncDeviceTypes, asyncOnlyDeviceTypes, - setDeclareAttr); + /*implicit=*/false, setDeclareAttr); } template @@ -2779,8 +2734,7 @@ static Op createComputeOp( genDataOperandOperations( copyClause->v, converter, semanticsContext, stmtCtx, dataClauseOperands, mlir::acc::DataClause::acc_copy, - /*structured=*/true, /*implicit=*/false, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + /*implicit=*/false); copyEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyinClause = @@ -2791,8 +2745,7 @@ static Op createComputeOp( copyinClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyin, - mlir::acc::DataClause::acc_copyin_readonly, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyin_readonly); copyinEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto *copyoutClause = @@ -2804,8 +2757,7 @@ static Op createComputeOp( copyoutClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::ReadOnly, dataClauseOperands, mlir::acc::DataClause::acc_copyout, - mlir::acc::DataClause::acc_copyout_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + mlir::acc::DataClause::acc_copyout_zero); copyoutEntryOperands.append(dataClauseOperands.begin() + crtDataStart, dataClauseOperands.end()); } else if (const auto 
*createClause = @@ -2816,8 +2768,7 @@ static Op createComputeOp( createClause, converter, semanticsContext, stmtCtx, Fortran::parser::AccDataModifier::Modifier::Zero, dataClauseOperands, mlir::acc::DataClause::acc_create, - mlir::acc::DataClause::acc_create_zero, async, asyncDeviceTypes, - asyncOnlyDeviceTy... [truncated] ``````````
https://github.com/llvm/llvm-project/pull/139723 From flang-commits at lists.llvm.org Tue May 13 12:41:08 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 13 May 2025 12:41:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Pad Hollerith actual arguments (PR #139782) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/139782 For more compatible legacy behavior on old tests, extend Hollerith actual arguments on the right with trailing blanks out to a multiple of 8 bytes. Fixes Fujitsu test 0343_0069. >From 8f3a00e948510a6e4b130dc40828e5bc0f5a5e8d Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Tue, 13 May 2025 12:37:34 -0700 Subject: [PATCH] [flang] Pad Hollerith actual arguments For more compatible legacy behavior on old tests, extend Hollerith actual arguments on the right with trailing blanks out to a multiple of 8 bytes. Fixes Fujitsu test 0343_0069. --- flang/lib/Semantics/expression.cpp | 13 +++++++++++++ flang/test/Semantics/pad-hollerith-arg.f | 5 +++++ 2 files changed, 18 insertions(+) create mode 100644 flang/test/Semantics/pad-hollerith-arg.f diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index c35492097cfbc..b3ad608ee6744 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -4904,6 +4904,19 @@ std::optional ArgumentAnalyzer::AnalyzeExpr( "TYPE(*) dummy argument may only be used as an actual argument"_err_en_US); } else if (MaybeExpr argExpr{AnalyzeExprOrWholeAssumedSizeArray(expr)}) { if (isProcedureCall_ || !IsProcedureDesignator(*argExpr)) { + // Pad Hollerith actual argument with spaces up to a multiple of 8 + // bytes, in case the data are interpreted as double precision + // (or a smaller numeric type) by legacy code. 
+ if (auto hollerith{UnwrapExpr>(*argExpr)}; + hollerith && hollerith->wasHollerith()) { + std::string bytes{hollerith->values()}; + while ((bytes.size() % 8) != 0) { + bytes += ' '; + } + Constant c{std::move(bytes)}; + c.set_wasHollerith(true); + argExpr = AsGenericExpr(std::move(c)); + } ActualArgument arg{std::move(*argExpr)}; SetArgSourceLocation(arg, expr.source); return std::move(arg); diff --git a/flang/test/Semantics/pad-hollerith-arg.f b/flang/test/Semantics/pad-hollerith-arg.f new file mode 100644 index 0000000000000..75678441ea45f --- /dev/null +++ b/flang/test/Semantics/pad-hollerith-arg.f @@ -0,0 +1,5 @@ +! RUN: %flang_fc1 -fdebug-unparse %s | FileCheck %s +! Ensure that Hollerith actual arguments are blank padded. +! CHECK: CALL foo("abc ") + call foo(3habc) + end From flang-commits at lists.llvm.org Tue May 13 12:41:41 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 12:41:41 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Pad Hollerith actual arguments (PR #139782) In-Reply-To: Message-ID: <6823a075.170a0220.30c4eb.f3aa@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Peter Klausler (klausler)
Changes For more compatible legacy behavior on old tests, extend Hollerith actual arguments on the right with trailing blanks out to a multiple of 8 bytes. Fixes Fujitsu test 0343_0069. --- Full diff: https://github.com/llvm/llvm-project/pull/139782.diff 2 Files Affected: - (modified) flang/lib/Semantics/expression.cpp (+13) - (added) flang/test/Semantics/pad-hollerith-arg.f (+5) ``````````diff diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index c35492097cfbc..b3ad608ee6744 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -4904,6 +4904,19 @@ std::optional ArgumentAnalyzer::AnalyzeExpr( "TYPE(*) dummy argument may only be used as an actual argument"_err_en_US); } else if (MaybeExpr argExpr{AnalyzeExprOrWholeAssumedSizeArray(expr)}) { if (isProcedureCall_ || !IsProcedureDesignator(*argExpr)) { + // Pad Hollerith actual argument with spaces up to a multiple of 8 + // bytes, in case the data are interpreted as double precision + // (or a smaller numeric type) by legacy code. + if (auto hollerith{UnwrapExpr>(*argExpr)}; + hollerith && hollerith->wasHollerith()) { + std::string bytes{hollerith->values()}; + while ((bytes.size() % 8) != 0) { + bytes += ' '; + } + Constant c{std::move(bytes)}; + c.set_wasHollerith(true); + argExpr = AsGenericExpr(std::move(c)); + } ActualArgument arg{std::move(*argExpr)}; SetArgSourceLocation(arg, expr.source); return std::move(arg); diff --git a/flang/test/Semantics/pad-hollerith-arg.f b/flang/test/Semantics/pad-hollerith-arg.f new file mode 100644 index 0000000000000..75678441ea45f --- /dev/null +++ b/flang/test/Semantics/pad-hollerith-arg.f @@ -0,0 +1,5 @@ +! RUN: %flang_fc1 -fdebug-unparse %s | FileCheck %s +! Ensure that Hollerith actual arguments are blank padded. +! CHECK: CALL foo("abc ") + call foo(3habc) + end ``````````
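The padding rule in the patch above (extend the Hollerith bytes on the right with blanks until the length is a multiple of 8) can be sketched in isolation. This is a minimal illustration only: `padHollerith` and its `unit` parameter are hypothetical names, not part of flang, and the sketch works on a plain `std::string` rather than on flang's `Constant` type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not flang code): pads `bytes` on the right with
// blanks until its length is a multiple of `unit` bytes.
std::string padHollerith(std::string bytes, std::size_t unit = 8) {
  assert(unit != 0 && "padding unit must be nonzero");
  const std::size_t remainder = bytes.size() % unit;
  if (remainder != 0) {
    // Append exactly the number of blanks needed to reach the next multiple.
    bytes.append(unit - remainder, ' ');
  }
  assert(bytes.size() % unit == 0);
  return bytes;
}
```

Under these assumptions, the three bytes of `3habc` would grow to eight bytes (`abc` followed by five blanks), while a value whose length is already a multiple of 8 is returned unchanged.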
https://github.com/llvm/llvm-project/pull/139782

From flang-commits at lists.llvm.org Tue May 13 12:42:18 2025
From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits)
Date: Tue, 13 May 2025 12:42:18 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Pad Hollerith actual arguments (PR #139782)
In-Reply-To:
Message-ID: <6823a09a.170a0220.3293ca.fa7e@mx.google.com>

https://github.com/ashermancinelli approved this pull request.

Thank you for taking this, Peter!

https://github.com/llvm/llvm-project/pull/139782

From flang-commits at lists.llvm.org Tue May 13 12:44:46 2025
From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits)
Date: Tue, 13 May 2025 12:44:46 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Pad Hollerith actual arguments (PR #139782)
In-Reply-To:
Message-ID: <6823a12e.170a0220.110510.535f@mx.google.com>

https://github.com/eugeneepshteyn approved this pull request.

https://github.com/llvm/llvm-project/pull/139782

From flang-commits at lists.llvm.org Tue May 13 12:57:38 2025
From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits)
Date: Tue, 13 May 2025 12:57:38 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012)
In-Reply-To:
Message-ID: <6823a432.050a0220.60fde.37e3@mx.google.com>

kkwli wrote:

@akuhlens We found it in testing https://github.com/llvm/llvm-test-suite/pull/241. There are two test cases that behave differently with and without the change. I am not sure if the current behavior is the intended behavior in flang. Could you please take a look? Thanks.
https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/goacc/routine-intrinsic-2.f https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/goacc/routine-8.f90 cc @tarunprabhu https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Tue May 13 14:00:05 2025 From: flang-commits at lists.llvm.org (Eli Friedman via flang-commits) Date: Tue, 13 May 2025 14:00:05 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <6823b2d5.170a0220.f1106.f31d@mx.google.com> ================ @@ -523,6 +537,7 @@ def CC_AArch64_Preserve_None : CallingConv<[ // We can pass arguments in all general registers, except: // - X8, used for sret // - X16/X17, used by the linker as IP0/IP1 + // - X15, the nest register and used by Windows for stack allocation ---------------- efriedma-quic wrote: We prefer intrinsics use standardized attributes/semantics to avoid scattering special cases all over, but we do have checks for specific intrinsics in certain cases. For constructs which could have well-defined semantics on some other backend, but can't be lowered by some particular backend (like weird calling conventions), we usually just have the backend report an error, though. https://github.com/llvm/llvm-project/pull/126743 From flang-commits at lists.llvm.org Tue May 13 14:03:57 2025 From: flang-commits at lists.llvm.org (=?UTF-8?B?Um9nZXIgRmVycmVyIEliw6HDsWV6?= via flang-commits) Date: Tue, 13 May 2025 14:03:57 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][Preprocessor] Avoid creating an empty token when a kind suffix is torn by a pasting operator (PR #139795) Message-ID: https://github.com/rofirrim created https://github.com/llvm/llvm-project/pull/139795 This input "tears" the expected tokens of an integer-literal due to a pasting operator `##`. 
When lexing `1_##` we generate the sequence of tokens `['1_', '']`, the second being an empty token of length zero. The second token is created at the end of `Prescanner::NextToken`. Creating an empty token by accident (due to two consecutive `CloseToken` without consuming anything) can cause `TokenSequence::pop_back` to assert. If zero-length tokens are acceptable, then instead of this patch we may have to fix the logic in `TokenPasting` found in `preprocessor.cpp`. >From 4e1f840899384d9977105791785806b2c43c1a37 Mon Sep 17 00:00:00 2001 From: Roger Ferrer Ibanez Date: Tue, 13 May 2025 20:54:14 +0000 Subject: [PATCH] Avoid creating an empty token when a kind suffix is torn by a pasting operator This confuses the logic that implements the pasting itself. --- flang/lib/Parser/prescan.cpp | 8 ++++++++ flang/test/Preprocessing/torn-token-pasting-1.F90 | 9 +++++++++ 2 files changed, 17 insertions(+) create mode 100644 flang/test/Preprocessing/torn-token-pasting-1.F90 diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp index 3bc2ea0b37508..004e4f013f90a 100644 --- a/flang/lib/Parser/prescan.cpp +++ b/flang/lib/Parser/prescan.cpp @@ -937,6 +937,7 @@ bool Prescanner::HandleKindSuffix(TokenSequence &tokens) { if (*at_ != '_') { return false; } + auto underscore = *at_; TokenSequence withUnderscore, separate; EmitChar(withUnderscore, '_'); EmitCharAndAdvance(separate, '_'); @@ -951,6 +952,13 @@ bool Prescanner::HandleKindSuffix(TokenSequence &tokens) { } withUnderscore.CloseToken(); separate.CloseToken(); + // If we only saw "_" and nothing else, we have handled enough but we do not + // want to close the token here, or we will generate an extra token of length + // zero. 
+ if (separate.SizeInTokens() == 1) { + EmitChar(tokens, underscore); + return true; + } tokens.CloseToken(); if (separate.SizeInTokens() == 2 && preprocessor_.IsNameDefined(separate.TokenAt(1)) && diff --git a/flang/test/Preprocessing/torn-token-pasting-1.F90 b/flang/test/Preprocessing/torn-token-pasting-1.F90 new file mode 100644 index 0000000000000..5e080129a94d1 --- /dev/null +++ b/flang/test/Preprocessing/torn-token-pasting-1.F90 @@ -0,0 +1,9 @@ +! RUN: %flang -E %s 2>&1 | FileCheck %s +! CHECK: IF(10>HUGE(1_4).OR.10<-HUGE(1_4)) CALL foo() +#define CHECKSAFEINT(x,k) IF(x>HUGE(1_ ## k).OR.x<-HUGE(1_##k)) CALL foo() + +program main + implicit none + + CHECKSAFEINT(10, 4) +end program main From flang-commits at lists.llvm.org Tue May 13 14:04:33 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 14:04:33 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][Preprocessor] Avoid creating an empty token when a kind suffix is torn by a pasting operator (PR #139795) In-Reply-To: Message-ID: <6823b3e1.170a0220.1dc49a.01de@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-parser Author: Roger Ferrer Ibáñez (rofirrim)
Changes This input "tears" the expected tokens of an integer-literal due to a pasting operator `##`. When lexing `1_##` we generate the sequence of tokens `['1_', '']`, the second being an empty token of length zero. The second token is created at the end of `Prescanner::NextToken`. Creating an empty token by accident (due to two consecutive `CloseToken` without consuming anything) can cause `TokenSequence::pop_back` to assert. If zero-length tokens are acceptable, then instead of this patch we may have to fix the logic in `TokenPasting` found in `preprocessor.cpp`. --- Full diff: https://github.com/llvm/llvm-project/pull/139795.diff 2 Files Affected: - (modified) flang/lib/Parser/prescan.cpp (+8) - (added) flang/test/Preprocessing/torn-token-pasting-1.F90 (+9) ``````````diff diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp index 3bc2ea0b37508..004e4f013f90a 100644 --- a/flang/lib/Parser/prescan.cpp +++ b/flang/lib/Parser/prescan.cpp @@ -937,6 +937,7 @@ bool Prescanner::HandleKindSuffix(TokenSequence &tokens) { if (*at_ != '_') { return false; } + auto underscore = *at_; TokenSequence withUnderscore, separate; EmitChar(withUnderscore, '_'); EmitCharAndAdvance(separate, '_'); @@ -951,6 +952,13 @@ bool Prescanner::HandleKindSuffix(TokenSequence &tokens) { } withUnderscore.CloseToken(); separate.CloseToken(); + // If we only saw "_" and nothing else, we have handled enough but we do not + // want to close the token here, or we will generate an extra token of length + // zero. + if (separate.SizeInTokens() == 1) { + EmitChar(tokens, underscore); + return true; + } tokens.CloseToken(); if (separate.SizeInTokens() == 2 && preprocessor_.IsNameDefined(separate.TokenAt(1)) && diff --git a/flang/test/Preprocessing/torn-token-pasting-1.F90 b/flang/test/Preprocessing/torn-token-pasting-1.F90 new file mode 100644 index 0000000000000..5e080129a94d1 --- /dev/null +++ b/flang/test/Preprocessing/torn-token-pasting-1.F90 @@ -0,0 +1,9 @@ +! 
RUN: %flang -E %s 2>&1 | FileCheck %s +! CHECK: IF(10>HUGE(1_4).OR.10<-HUGE(1_4)) CALL foo() +#define CHECKSAFEINT(x,k) IF(x>HUGE(1_ ## k).OR.x<-HUGE(1_##k)) CALL foo() + +program main + implicit none + + CHECKSAFEINT(10, 4) +end program main ``````````
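The failure mode described in this PR, two consecutive token closes with nothing emitted in between, can be shown with a toy model. The sketch below is an assumption-laden illustration: `TokenBuffer`, `lexWithDoubleClose`, and `lexWithSingleClose` are hypothetical names and do not reproduce flang's `TokenSequence` or `Prescanner`. It only shows why closing twice yields a zero-length token, and why leaving the close to the caller avoids it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified token buffer for illustration only; it is not
// flang's TokenSequence and shares no code with it.
class TokenBuffer {
public:
  // Appends one character to the token currently being built.
  void emitChar(char c) { chars_.push_back(c); }

  // Ends the current token, even if no character was emitted since the
  // previous close (which is how a zero-length token appears).
  void closeToken() {
    starts_.push_back(nextStart_);
    nextStart_ = chars_.size();
  }

  std::size_t sizeInTokens() const { return starts_.size(); }

  // Returns the text of closed token `i`.
  std::string tokenAt(std::size_t i) const {
    const std::size_t begin = starts_[i];
    const std::size_t end =
        (i + 1 < starts_.size()) ? starts_[i + 1] : nextStart_;
    return chars_.substr(begin, end - begin);
  }

private:
  std::string chars_;               // all characters, back to back
  std::vector<std::size_t> starts_; // start offset of each closed token
  std::size_t nextStart_ = 0;       // start offset of the open token
};

// Models the problem: the suffix handler closes "1_" and the caller then
// closes again without emitting anything.
TokenBuffer lexWithDoubleClose() {
  TokenBuffer tokens;
  tokens.emitChar('1');
  tokens.emitChar('_');
  tokens.closeToken(); // suffix handler closes the token
  tokens.closeToken(); // caller closes again: empty token
  return tokens;
}

// Models the idea of the fix: emit the underscore and leave the single
// close to the caller.
TokenBuffer lexWithSingleClose() {
  TokenBuffer tokens;
  tokens.emitChar('1');
  tokens.emitChar('_');
  tokens.closeToken(); // the only close
  return tokens;
}
```

In this model the double close produces two tokens, `1_` and an empty one, whereas the single close produces just `1_`, which is the shape token pasting expects before it appends the macro argument.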
https://github.com/llvm/llvm-project/pull/139795

From flang-commits at lists.llvm.org Tue May 13 14:04:38 2025
From: flang-commits at lists.llvm.org (Eli Friedman via flang-commits)
Date: Tue, 13 May 2025 14:04:38 -0700 (PDT)
Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743)
In-Reply-To:
Message-ID: <6823b3e6.170a0220.3241dc.fc72@mx.google.com>

================
@@ -1,35 +1,26 @@
-; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+;; Testing that nest uses x15 on all calling conventions (except Arm64EC)
----------------
efriedma-quic wrote:

So, nest works, but init.trampoline is broken because it's generating code for the wrong architecture? I guess that's okay.

https://github.com/llvm/llvm-project/pull/126743

From flang-commits at lists.llvm.org Tue May 13 14:20:29 2025
From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits)
Date: Tue, 13 May 2025 14:20:29 -0700 (PDT)
Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743)
In-Reply-To:
Message-ID: <6823b79d.170a0220.38c33d.01fe@mx.google.com>

================
@@ -1,35 +1,26 @@
-; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+;; Testing that nest uses x15 on all calling conventions (except Arm64EC)
----------------
vtjnash wrote:

I think `init.trampoline` should be okay, since it generates code on aarch64 to run on aarch64. I don't really know why the td claimed the nest pointer should be X4 (since external / x86-64 code cannot access the nest pointer as it is entirely hidden inside the ABI of the calling convention, so there is not really any compatibility to care about). I also don't have a machine to test it on, but it is implemented here for both the init and the prologue, so it should work as the comments described.
https://github.com/llvm/llvm-project/pull/126743

From flang-commits at lists.llvm.org Tue May 13 14:23:49 2025
From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits)
Date: Tue, 13 May 2025 14:23:49 -0700 (PDT)
Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743)
In-Reply-To:
Message-ID: <6823b865.170a0220.27a782.d872@mx.google.com>

================
@@ -523,6 +537,7 @@ def CC_AArch64_Preserve_None : CallingConv<[
   // We can pass arguments in all general registers, except:
   // - X8, used for sret
   // - X16/X17, used by the linker as IP0/IP1
+  // - X15, the nest register and used by Windows for stack allocation
----------------
vtjnash wrote:

Agreed, I'd have preferred to make X15 reserved for this on all platforms (instead of just on Windows), but I guess that could be ABI-breaking and slightly less efficient. In any case, the backend is still expected to cause errors here, unchanged from before this PR, since the nest register is left undefined in the calling convention tablegen file.

https://github.com/llvm/llvm-project/pull/126743

From flang-commits at lists.llvm.org Tue May 13 14:44:36 2025
From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits)
Date: Tue, 13 May 2025 14:44:36 -0700 (PDT)
Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743)
In-Reply-To:
Message-ID: <6823bd44.170a0220.d71f2.004a@mx.google.com>

================
@@ -1,35 +1,26 @@
-; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+;; Testing that nest uses x15 on all calling conventions (except Arm64EC)
----------------
vtjnash wrote:

Okay, I looked into this in more detail.
The `init.trampoline` cannot actually know which architecture is going to be correct, since it is up to the LLVM caller's runtime to correctly use VirtualAlloc2 with MEM_EXTENDED_PARAMETER_EC_CODE instead of VirtualAlloc (https://learn.microsoft.com/en-us/windows/arm/arm64ec-abi#dynamically-generating-jit-compiling-arm64ec-code). But we might have a pretty hard time fitting the necessary exit thunk into 36 bytes if we put x86-64 code there, and it would also carry a performance penalty. So the current version still seems more correct and preferable to me. And if someone decides to call a nest function directly, without a trampoline, directly from x86-64 assembly or directly using x86-64 LLVM IR, that should also work with this, since both platforms declare that nest is passed as R10/X4.

https://github.com/llvm/llvm-project/pull/126743

From flang-commits at lists.llvm.org Tue May 13 21:56:25 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 13 May 2025 21:56:25 -0700 (PDT)
Subject: [flang-commits] [flang] 0b490f1 - [FLANG][OpenMP][Taskloop] - Add testcase for reduction and in_reduction clause in taskloop construct (#139704)
Message-ID: <68242279.630a0220.367a2d.1ce8@mx.google.com>

Author: Kaviya Rajendiran
Date: 2025-05-14T10:26:21+05:30
New Revision: 0b490f11da245ad178bb4389cd8bfd858262aca6

URL: https://github.com/llvm/llvm-project/commit/0b490f11da245ad178bb4389cd8bfd858262aca6
DIFF: https://github.com/llvm/llvm-project/commit/0b490f11da245ad178bb4389cd8bfd858262aca6.diff

LOG: [FLANG][OpenMP][Taskloop] - Add testcase for reduction and in_reduction clause in taskloop construct (#139704)

Added a testcase for reduction and in_reduction clause in taskloop construct.
Reduction and in_reduction clauses are not supported in taskloop so below error is issued: "not yet implemented: Unhandled clause REDUCTION/IN_REDUCTION in TASKLOOP construct" Added: flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 Modified: Removed: ################################################################################ diff --git a/flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 b/flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 new file mode 100644 index 0000000000000..8acc399a92abe --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/taskloop-inreduction.f90 @@ -0,0 +1,13 @@ +! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! CHECK: not yet implemented: Unhandled clause IN_REDUCTION in TASKLOOP construct +subroutine omp_taskloop_inreduction() + integer x + x = 0 + !$omp taskloop in_reduction(+:x) + do i = 1, 100 + x = x + 1 + end do + !$omp end taskloop +end subroutine omp_taskloop_inreduction diff --git a/flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 b/flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 new file mode 100644 index 0000000000000..0c16bd227257f --- /dev/null +++ b/flang/test/Lower/OpenMP/Todo/taskloop-reduction.f90 @@ -0,0 +1,13 @@ +! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s +! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=50 -o - %s 2>&1 | FileCheck %s + +! 
CHECK: not yet implemented: Unhandled clause REDUCTION in TASKLOOP construct +subroutine omp_taskloop_reduction() + integer x + x = 0 + !$omp taskloop reduction(+:x) + do i = 1, 100 + x = x + 1 + end do + !$omp end taskloop +end subroutine omp_taskloop_reduction From flang-commits at lists.llvm.org Tue May 13 21:56:27 2025 From: flang-commits at lists.llvm.org (Kaviya Rajendiran via flang-commits) Date: Tue, 13 May 2025 21:56:27 -0700 (PDT) Subject: [flang-commits] [flang] [FLANG][OpenMP][Taskloop] - Add testcase for reduction and in_reduction clause in taskloop construct (PR #139704) In-Reply-To: Message-ID: <6824227b.050a0220.1fa9bd.44d4@mx.google.com> https://github.com/kaviya2510 closed https://github.com/llvm/llvm-project/pull/139704 From flang-commits at lists.llvm.org Tue May 13 23:24:11 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Tue, 13 May 2025 23:24:11 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723) In-Reply-To: Message-ID: <6824370b.170a0220.b5da7.0702@mx.google.com> https://github.com/clementval commented: I'm ok with the change but I'll let Razvan approve it since he is more involved in this recently. You need to update the unit tests that are currently failing. 
https://github.com/llvm/llvm-project/pull/139723 From flang-commits at lists.llvm.org Tue May 13 23:24:40 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 13 May 2025 23:24:40 -0700 (PDT) Subject: [flang-commits] [flang] f1c9128 - [flang][openacc] Align async check for combined construct (#139744) Message-ID: <68243728.170a0220.2f6b81.0478@mx.google.com> Author: Valentin Clement (バレンタイン クレメン) Date: 2025-05-14T08:24:35+02:00 New Revision: f1c9128115f1cf8b9638513f85093837fa593f01 URL: https://github.com/llvm/llvm-project/commit/f1c9128115f1cf8b9638513f85093837fa593f01 DIFF: https://github.com/llvm/llvm-project/commit/f1c9128115f1cf8b9638513f85093837fa593f01.diff LOG: [flang][openacc] Align async check for combined construct (#139744) Align async clause check for combined construct to behave the same as parallel, kernels and serial. Added: Modified: flang/test/Lower/OpenACC/acc-kernels-loop.f90 flang/test/Lower/OpenACC/acc-parallel-loop.f90 flang/test/Lower/OpenACC/acc-serial-loop.f90 flang/test/Semantics/OpenACC/acc-kernels-loop.f90 flang/test/Semantics/OpenACC/acc-parallel-loop-validity.f90 flang/test/Semantics/OpenACC/acc-serial-loop.f90 llvm/include/llvm/Frontend/OpenACC/ACC.td Removed: ################################################################################ diff --git a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 index 0ded708cb1a3b..a330b7d491d06 100644 --- a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 @@ -102,6 +102,12 @@ subroutine acc_kernels_loop ! CHECK: acc.terminator ! CHECK-NEXT: }{{$}} + !$acc kernels loop async(async) device_type(nvidia) async(1) + DO i = 1, n + a(i) = b(i) + END DO +! 
CHECK: acc.kernels combined(loop) async(%{{.*}} : i32, %c1{{.*}} : i32 [#acc.device_type]) + !$acc kernels loop wait DO i = 1, n a(i) = b(i) diff --git a/flang/test/Lower/OpenACC/acc-parallel-loop.f90 b/flang/test/Lower/OpenACC/acc-parallel-loop.f90 index ccd37d87262e3..1e1fc7448a513 100644 --- a/flang/test/Lower/OpenACC/acc-parallel-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-parallel-loop.f90 @@ -104,6 +104,12 @@ subroutine acc_parallel_loop ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} + !$acc parallel loop async(async) device_type(nvidia) async(1) + DO i = 1, n + a(i) = b(i) + END DO +! CHECK: acc.parallel combined(loop) async(%{{.*}} : i32, %c1{{.*}} : i32 [#acc.device_type]) + !$acc parallel loop wait DO i = 1, n a(i) = b(i) diff --git a/flang/test/Lower/OpenACC/acc-serial-loop.f90 b/flang/test/Lower/OpenACC/acc-serial-loop.f90 index 478dfa0d96c3b..98fc28990265a 100644 --- a/flang/test/Lower/OpenACC/acc-serial-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-serial-loop.f90 @@ -123,6 +123,12 @@ subroutine acc_serial_loop ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} + !$acc serial loop async(async) device_type(nvidia) async(1) + DO i = 1, n + a(i) = b(i) + END DO +! 
CHECK: acc.serial combined(loop) async(%{{.*}} : i32, %c1{{.*}} : i32 [#acc.device_type]) + !$acc serial loop wait DO i = 1, n a(i) = b(i) diff --git a/flang/test/Semantics/OpenACC/acc-kernels-loop.f90 b/flang/test/Semantics/OpenACC/acc-kernels-loop.f90 index 8653978fb6249..29985a02eb6ef 100644 --- a/flang/test/Semantics/OpenACC/acc-kernels-loop.f90 +++ b/flang/test/Semantics/OpenACC/acc-kernels-loop.f90 @@ -295,4 +295,13 @@ program openacc_kernels_loop_validity if(i == 10) cycle end do + !$acc kernels loop async(1) device_type(nvidia) async(3) + do i = 1, n + end do + +!ERROR: At most one ASYNC clause can appear on the KERNELS LOOP directive or in group separated by the DEVICE_TYPE clause + !$acc kernels loop async(1) device_type(nvidia) async async + do i = 1, n + end do + end program openacc_kernels_loop_validity diff --git a/flang/test/Semantics/OpenACC/acc-parallel-loop-validity.f90 b/flang/test/Semantics/OpenACC/acc-parallel-loop-validity.f90 index 7f33f9e145110..78e1a7ad7c452 100644 --- a/flang/test/Semantics/OpenACC/acc-parallel-loop-validity.f90 +++ b/flang/test/Semantics/OpenACC/acc-parallel-loop-validity.f90 @@ -141,4 +141,13 @@ program openacc_parallel_loop_validity if(i == 10) cycle end do + !$acc parallel loop async(1) device_type(nvidia) async(3) + do i = 1, n + end do + +!ERROR: At most one ASYNC clause can appear on the PARALLEL LOOP directive or in group separated by the DEVICE_TYPE clause + !$acc parallel loop async(1) device_type(nvidia) async async + do i = 1, n + end do + end program openacc_parallel_loop_validity diff --git a/flang/test/Semantics/OpenACC/acc-serial-loop.f90 b/flang/test/Semantics/OpenACC/acc-serial-loop.f90 index 2832274680eca..5d2be7f7c6474 100644 --- a/flang/test/Semantics/OpenACC/acc-serial-loop.f90 +++ b/flang/test/Semantics/OpenACC/acc-serial-loop.f90 @@ -111,4 +111,13 @@ program openacc_serial_loop_validity if(i == 10) cycle end do + !$acc serial loop async(1) device_type(nvidia) async(3) + do i = 1, n + end do + 
+!ERROR: At most one ASYNC clause can appear on the SERIAL LOOP directive or in group separated by the DEVICE_TYPE clause + !$acc serial loop async(1) device_type(nvidia) async async + do i = 1, n + end do + end program openacc_serial_loop_validity diff --git a/llvm/include/llvm/Frontend/OpenACC/ACC.td b/llvm/include/llvm/Frontend/OpenACC/ACC.td index d372fc221e4b4..46cba9f2400e1 100644 --- a/llvm/include/llvm/Frontend/OpenACC/ACC.td +++ b/llvm/include/llvm/Frontend/OpenACC/ACC.td @@ -556,35 +556,31 @@ def ACC_HostData : Directive<"host_data"> { // 2.11 def ACC_KernelsLoop : Directive<"kernels loop"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; + let allowedClauses = [VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause]; + let allowedOnceClauses = [VersionedClause, + VersionedClause, + VersionedClause]; let allowedExclusiveClauses = [ VersionedClause, VersionedClause, @@ -596,36 +592,32 @@ def ACC_KernelsLoop : Directive<"kernels loop"> { // 2.11 def ACC_ParallelLoop : Directive<"parallel loop"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - 
VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; + let allowedClauses = [VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause]; + let allowedOnceClauses = [VersionedClause, + VersionedClause, + VersionedClause]; let allowedExclusiveClauses = [ VersionedClause, VersionedClause, @@ -637,33 +629,29 @@ def ACC_ParallelLoop : Directive<"parallel loop"> { // 2.11 def ACC_SerialLoop : Directive<"serial loop"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause - ]; + let allowedClauses = [VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, 
+ VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause]; + let allowedOnceClauses = [VersionedClause, + VersionedClause, + VersionedClause]; let allowedExclusiveClauses = [ VersionedClause, VersionedClause, From flang-commits at lists.llvm.org Tue May 13 23:24:41 2025 From: flang-commits at lists.llvm.org (Valentin Clement (バレンタイン クレメン) via flang-commits) Date: Tue, 13 May 2025 23:24:41 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][openacc] Align async check for combined construct (PR #139744) In-Reply-To: Message-ID: <68243729.170a0220.2bc56a.1ad7@mx.google.com> https://github.com/clementval closed https://github.com/llvm/llvm-project/pull/139744 From flang-commits at lists.llvm.org Tue May 13 23:48:20 2025 From: flang-commits at lists.llvm.org (Dominik Adamski via flang-commits) Date: Tue, 13 May 2025 23:48:20 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Turn on alias analysis for locally allocated objects (PR #139682) In-Reply-To: Message-ID: <68243cb4.630a0220.6549e.b213@mx.google.com> https://github.com/DominikAdamski updated https://github.com/llvm/llvm-project/pull/139682

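For readers following PR #139682, the type tree that its updated tests check for can be modelled in a few lines. The sketch below is a toy and not Flang or LLVM code: the node names and parent links are copied from the CHECK lines of flang/test/Fir/tbaa-codegen2.fir in that change, while `TypeTree`, `mayAlias`, and `buildExampleTree` are hypothetical helpers introduced only for illustration. It assumes the usual scalar TBAA rule that two accesses may alias only when the type of one tag is an ancestor of, or equal to, the type of the other.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of a scalar TBAA type tree. Hypothetical helper, not a Flang/LLVM API.
class TypeTree {
public:
  // Record that `node` is a child of `parent`.
  void add(const std::string &node, const std::string &parent) {
    parentOf[node] = parent;
  }

  // True if `ancestor` is `node` itself or is reached by walking parent links.
  bool isAncestorOrSelf(const std::string &ancestor, std::string node) const {
    for (;;) {
      if (node == ancestor)
        return true;
      auto it = parentOf.find(node);
      if (it == parentOf.end())
        return false; // reached the top of the tree without meeting `ancestor`
      node = it->second;
    }
  }

  // Assumed scalar TBAA rule: accesses may alias only if one type is an
  // ancestor of (or the same as) the other.
  bool mayAlias(const std::string &a, const std::string &b) const {
    return isAncestorOrSelf(a, b) || isAncestorOrSelf(b, a);
  }

private:
  std::map<std::string, std::string> parentOf;
};

// Parent links copied from the CHECK lines in tbaa-codegen2.fir.
inline TypeTree buildExampleTree() {
  TypeTree t;
  t.add("dummy arg data", "any data access");
  t.add("dummy arg data/_QFfuncEa", "dummy arg data");
  t.add("target data", "any data access");
  t.add("allocated data", "target data");
  t.add("allocated data/", "allocated data");
  return t;
}
```

In this model, `mayAlias("allocated data/", "dummy arg data/_QFfuncEa")` is false because neither name lies on the other's path up to "any data access", whereas an access tagged "target data" may still alias the temporary. That is the kind of extra precision the local-allocation tags are meant to expose.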
From flang-commits at lists.llvm.org Wed May 14 00:21:22 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 00:21:22 -0700 (PDT) Subject: [flang-commits] [flang] cf16c97 - [Flang] Turn on alias analysis for locally allocated objects (#139682) Message-ID: <68244472.170a0220.1b2979.0a10@mx.google.com> Author: Dominik Adamski Date: 2025-05-14T09:21:18+02:00 New Revision: cf16c97bfa1416672d8990862369e86f360aa11e URL: https://github.com/llvm/llvm-project/commit/cf16c97bfa1416672d8990862369e86f360aa11e DIFF: https://github.com/llvm/llvm-project/commit/cf16c97bfa1416672d8990862369e86f360aa11e.diff LOG: [Flang] Turn on alias analysis for locally allocated objects (#139682) Previously, a bug in the MemCpyOpt LLVM IR pass caused issues with adding alias tags for locally allocated objects in Fortran code. However, the bug has now been fixed ( https://github.com/llvm/llvm-project/pull/129537 ), and we can safely enable alias tags for these objects. This change should improve the accuracy of the alias analysis. Added: Modified: flang/lib/Optimizer/Transforms/AddAliasTags.cpp flang/test/Fir/tbaa-codegen2.fir flang/test/Transforms/tbaa-with-dummy-scope2.fir flang/test/Transforms/tbaa2.fir flang/test/Transforms/tbaa3.fir Removed: ################################################################################ diff --git a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp index 66b4b84998801..5cfbdc33285f9 100644 --- a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp +++ b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp @@ -43,13 +43,10 @@ static llvm::cl::opt static llvm::cl::opt enableDirect("direct-tbaa", llvm::cl::init(true), llvm::cl::Hidden, llvm::cl::desc("Add TBAA tags to direct variables")); -// This is **known unsafe** (misscompare in spec2017/wrf_r). It should -// not be enabled by default. 
-// The code is kept so that these may be tried with new benchmarks to see if -// this is worth fixing in the future. -static llvm::cl::opt enableLocalAllocs( - "local-alloc-tbaa", llvm::cl::init(false), llvm::cl::Hidden, - llvm::cl::desc("Add TBAA tags to local allocations. UNSAFE.")); +static llvm::cl::opt + enableLocalAllocs("local-alloc-tbaa", llvm::cl::init(true), + llvm::cl::Hidden, + llvm::cl::desc("Add TBAA tags to local allocations.")); namespace { diff --git a/flang/test/Fir/tbaa-codegen2.fir b/flang/test/Fir/tbaa-codegen2.fir index 8f8b6a29129e7..e4bfa9087ec75 100644 --- a/flang/test/Fir/tbaa-codegen2.fir +++ b/flang/test/Fir/tbaa-codegen2.fir @@ -100,7 +100,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ // [...] // CHECK: %[[VAL50:.*]] = getelementptr i32, ptr %{{.*}}, i64 %{{.*}} // store to the temporary: -// CHECK: store i32 %{{.*}}, ptr %[[VAL50]], align 4, !tbaa ![[DATA_ACCESS_TAG:.*]] +// CHECK: store i32 %{{.*}}, ptr %[[VAL50]], align 4, !tbaa ![[TMP_DATA_ACCESS_TAG:.*]] // [...] 
// CHECK: [[BOX_ACCESS_TAG]] = !{![[BOX_ACCESS_TYPE:.*]], ![[BOX_ACCESS_TYPE]], i64 0} @@ -111,4 +111,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ // CHECK: ![[A_ACCESS_TYPE]] = !{!"dummy arg data/_QFfuncEa", ![[ARG_ACCESS_TYPE:.*]], i64 0} // CHECK: ![[ARG_ACCESS_TYPE]] = !{!"dummy arg data", ![[DATA_ACCESS_TYPE:.*]], i64 0} // CHECK: ![[DATA_ACCESS_TYPE]] = !{!"any data access", ![[ANY_ACCESS_TYPE]], i64 0} -// CHECK: ![[DATA_ACCESS_TAG]] = !{![[DATA_ACCESS_TYPE]], ![[DATA_ACCESS_TYPE]], i64 0} +// CHECK: ![[TMP_DATA_ACCESS_TAG]] = !{![[TMP_DATA_ACCESS_TYPE:.*]], ![[TMP_DATA_ACCESS_TYPE]], i64 0} +// CHECK: ![[TMP_DATA_ACCESS_TYPE]] = !{!"allocated data/", ![[TMP_ACCESS_TYPE:.*]], i64 0} +// CHECK: ![[TMP_ACCESS_TYPE]] = !{!"allocated data", ![[TARGET_ACCESS_TAG:.*]], i64 0} +// CHECK: ![[TARGET_ACCESS_TAG]] = !{!"target data", ![[DATA_ACCESS_TYPE]], i64 0} diff --git a/flang/test/Transforms/tbaa-with-dummy-scope2.fir b/flang/test/Transforms/tbaa-with-dummy-scope2.fir index c8f419fbee652..249471de458d3 100644 --- a/flang/test/Transforms/tbaa-with-dummy-scope2.fir +++ b/flang/test/Transforms/tbaa-with-dummy-scope2.fir @@ -43,12 +43,15 @@ func.func @_QPtest1() attributes {noinline} { // CHECK: #[[$ATTR_0:.+]] = #llvm.tbaa_root // CHECK: #[[$ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_2:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_3:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$LOCAL_ATTR_0:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_5:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_tag +// CHECK: #[[$LOCAL_ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_6:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$LOCAL_ATTR_2:.+]] = #llvm.tbaa_tag // CHECK: #[[$ATTR_8:.+]] = 
#llvm.tbaa_tag // CHECK-LABEL: func.func @_QPtest1() attributes {noinline} { // CHECK: %[[VAL_2:.*]] = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFtest1FinnerEy"} @@ -57,8 +60,8 @@ func.func @_QPtest1() attributes {noinline} { // CHECK: %[[VAL_5:.*]] = fir.dummy_scope : !fir.dscope // CHECK: %[[VAL_6:.*]] = fir.declare %[[VAL_4]] dummy_scope %[[VAL_5]] {uniq_name = "_QFtest1FinnerEx"} : (!fir.ref, !fir.dscope) -> !fir.ref // CHECK: %[[VAL_7:.*]] = fir.declare %[[VAL_2]] {uniq_name = "_QFtest1FinnerEy"} : (!fir.ref) -> !fir.ref -// CHECK: fir.store %{{.*}} to %[[VAL_7]] : !fir.ref -// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] : !fir.ref +// CHECK: fir.store %{{.*}} to %[[VAL_7]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref +// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref // CHECK: fir.store %[[VAL_8]] to %[[VAL_6]] {tbaa = [#[[$ATTR_7]]]} : !fir.ref // CHECK: fir.store %{{.*}} to %[[VAL_4]] {tbaa = [#[[$ATTR_8]]]} : !fir.ref @@ -87,12 +90,16 @@ func.func @_QPtest2() attributes {noinline} { // CHECK: #[[$ATTR_3:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_5:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$TARGETDATA_0:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_6:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$TARGETDATA_1:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$LOCAL_ATTR_0:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_8:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_10:.+]] = #llvm.tbaa_tag +// CHECK: #[[$LOCAL_ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_9:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$LOCAL_ATTR_2:.+]] = #llvm.tbaa_tag // CHECK: #[[$ATTR_11:.+]] = #llvm.tbaa_tag // CHECK-LABEL: func.func @_QPtest2() attributes {noinline} { // CHECK: %[[VAL_2:.*]] = fir.alloca i32 {bindc_name = 
"y", uniq_name = "_QFtest2FinnerEy"} @@ -102,7 +109,7 @@ func.func @_QPtest2() attributes {noinline} { // CHECK: %[[VAL_6:.*]] = fir.dummy_scope : !fir.dscope // CHECK: %[[VAL_7:.*]] = fir.declare %[[VAL_5]] dummy_scope %[[VAL_6]] {uniq_name = "_QFtest2FinnerEx"} : (!fir.ref, !fir.dscope) -> !fir.ref // CHECK: %[[VAL_8:.*]] = fir.declare %[[VAL_2]] {uniq_name = "_QFtest2FinnerEy"} : (!fir.ref) -> !fir.ref -// CHECK: fir.store %{{.*}} to %[[VAL_8]] : !fir.ref -// CHECK: %[[VAL_9:.*]] = fir.load %[[VAL_8]] : !fir.ref +// CHECK: fir.store %{{.*}} to %[[VAL_8]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref +// CHECK: %[[VAL_9:.*]] = fir.load %[[VAL_8]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref // CHECK: fir.store %[[VAL_9]] to %[[VAL_7]] {tbaa = [#[[$ATTR_10]]]} : !fir.ref // CHECK: fir.store %{{.*}} to %[[VAL_5]] {tbaa = [#[[$ATTR_11]]]} : !fir.ref diff --git a/flang/test/Transforms/tbaa2.fir b/flang/test/Transforms/tbaa2.fir index 4678a1cd4a686..1429d0b420766 100644 --- a/flang/test/Transforms/tbaa2.fir +++ b/flang/test/Transforms/tbaa2.fir @@ -50,6 +50,7 @@ // CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_ARG:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_GLBL:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[ANY_LOCAL:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ARG_LOW:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_DIRECT:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ARG_Z:.+]] = #llvm.tbaa_type_desc}> @@ -61,21 +62,31 @@ // CHECK: #[[GLBL_ZSTART:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_ZSTOP:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[LOCAL1_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_YSTART:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_YSTOP:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[LOCAL2_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_XSTART:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[LOCAL3_ALLOC:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[LOCAL4_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[DIRECT_A:.+]] = 
#llvm.tbaa_type_desc}> // CHECK: #[[DIRECT_B:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_DYINV:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[LOCAL5_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_ZSTART_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_ZSTOP_TAG:.+]] = #llvm.tbaa_tag +// CHECK: #[[LOCAL1_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_YSTART_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_YSTOP_TAG:.+]] = #llvm.tbaa_tag +// CHECK: #[[LOCAL2_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_XSTART_TAG:.+]] = #llvm.tbaa_tag +// CHECK: #[[LOCAL3_ALLOC_TAG:.+]] = #llvm.tbaa_tag +// CHECK: #[[LOCAL4_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[DIRECT_A_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[DIRECT_B_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_DYINV_TAG:.+]] = #llvm.tbaa_tag +// CHECK: #[[LOCAL5_ALLOC_TAG:.+]] = #llvm.tbaa_tag func.func @_QMmodPcallee(%arg0: !fir.box> {fir.bindc_name = "z"}, %arg1: !fir.box> {fir.bindc_name = "y"}, %arg2: !fir.ref>>> {fir.bindc_name = "low"}) { %c2 = arith.constant 2 : index @@ -277,7 +288,7 @@ // CHECK: %[[VAL_44:.*]] = fir.convert %[[VAL_43]] : (i32) -> index // CHECK: %[[VAL_45:.*]] = fir.convert %[[VAL_42]] : (index) -> i32 // CHECK: %[[VAL_46:.*]]:2 = fir.do_loop %[[VAL_47:.*]] = %[[VAL_42]] to %[[VAL_44]] step %[[VAL_5]] iter_args(%[[VAL_48:.*]] = %[[VAL_45]]) -> (index, i32) { -// CHECK: fir.store %[[VAL_48]] to %[[VAL_34]] : !fir.ref +// CHECK: fir.store %[[VAL_48]] to %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_49:.*]] = fir.load %[[VAL_18]] {tbaa = [#[[GLBL_YSTART_TAG]]]} : !fir.ref // CHECK: %[[VAL_50:.*]] = arith.addi %[[VAL_49]], %[[VAL_6]] : i32 // CHECK: %[[VAL_51:.*]] = fir.convert %[[VAL_50]] : (i32) -> index @@ -285,24 +296,20 @@ // CHECK: %[[VAL_53:.*]] = fir.convert %[[VAL_52]] : (i32) -> index // CHECK: %[[VAL_54:.*]] = fir.convert %[[VAL_51]] : (index) -> i32 // CHECK: %[[VAL_55:.*]]:2 = fir.do_loop %[[VAL_56:.*]] = %[[VAL_51]] to %[[VAL_53]] step %[[VAL_5]] 
iter_args(%[[VAL_57:.*]] = %[[VAL_54]]) -> (index, i32) { -// CHECK: fir.store %[[VAL_57]] to %[[VAL_32]] : !fir.ref +// CHECK: fir.store %[[VAL_57]] to %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_58:.*]] = fir.load %[[VAL_16]] {tbaa = [#[[GLBL_XSTART_TAG]]]} : !fir.ref // CHECK: %[[VAL_59:.*]] = arith.addi %[[VAL_58]], %[[VAL_6]] : i32 // CHECK: %[[VAL_60:.*]] = fir.convert %[[VAL_59]] : (i32) -> index // CHECK: %[[VAL_61:.*]] = fir.convert %[[VAL_60]] : (index) -> i32 // CHECK: %[[VAL_62:.*]]:2 = fir.do_loop %[[VAL_63:.*]] = %[[VAL_60]] to %[[VAL_4]] step %[[VAL_5]] iter_args(%[[VAL_64:.*]] = %[[VAL_61]]) -> (index, i32) { -// TODO: local allocation assumed to always alias -// CHECK: fir.store %[[VAL_64]] to %[[VAL_30]] : !fir.ref +// CHECK: fir.store %[[VAL_64]] to %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref // load from box tagged in CodeGen // CHECK: %[[VAL_65:.*]] = fir.load %[[VAL_35]] : !fir.ref>>> -// TODO: local allocation assumed to always alias -// CHECK: %[[VAL_66:.*]] = fir.load %[[VAL_30]] : !fir.ref +// CHECK: %[[VAL_66:.*]] = fir.load %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_67:.*]] = fir.convert %[[VAL_66]] : (i32) -> i64 -// TODO: local allocation assumed to always alias -// CHECK: %[[VAL_68:.*]] = fir.load %[[VAL_32]] : !fir.ref +// CHECK: %[[VAL_68:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_69:.*]] = fir.convert %[[VAL_68]] : (i32) -> i64 -// TODO: local allocation assumed to always alias -// CHECK: %[[VAL_70:.*]] = fir.load %[[VAL_34]] : !fir.ref +// CHECK: %[[VAL_70:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_71:.*]] = fir.convert %[[VAL_70]] : (i32) -> i64 // CHECK: %[[VAL_72:.*]] = fir.box_addr %[[VAL_65]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_73:.*]]:3 = fir.box_dims %[[VAL_65]], %[[VAL_4]] : (!fir.box>>, index) -> (index, index, index) @@ -311,11 +318,10 @@ // CHECK: 
%[[VAL_76:.*]] = fir.shape_shift %[[VAL_73]]#0, %[[VAL_73]]#1, %[[VAL_74]]#0, %[[VAL_74]]#1, %[[VAL_75]]#0, %[[VAL_75]]#1 : (index, index, index, index, index, index) -> !fir.shapeshift<3> // CHECK: %[[VAL_77:.*]] = fir.array_coor %[[VAL_72]](%[[VAL_76]]) %[[VAL_67]], %[[VAL_69]], %[[VAL_71]] : (!fir.heap>, !fir.shapeshift<3>, i64, i64, i64) -> !fir.ref // CHECK: %[[VAL_78:.*]] = fir.load %[[VAL_77]] {tbaa = [#[[ARG_LOW_TAG]]]} : !fir.ref -// CHECK: fir.store %[[VAL_78]] to %[[VAL_26]] : !fir.ref +// CHECK: fir.store %[[VAL_78]] to %[[VAL_26]] {tbaa = [#[[LOCAL4_ALLOC_TAG]]]} : !fir.ref // load from box tagged in CodeGen // CHECK: %[[VAL_79:.*]] = fir.load %[[VAL_8]] : !fir.ref>>> -// TODO: local allocation assumed to always alias -// CHECK: %[[VAL_80:.*]] = fir.load %[[VAL_32]] : !fir.ref +// CHECK: %[[VAL_80:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_81:.*]] = fir.convert %[[VAL_80]] : (i32) -> i64 // CHECK: %[[VAL_82:.*]] = fir.box_addr %[[VAL_79]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_83:.*]]:3 = fir.box_dims %[[VAL_79]], %[[VAL_4]] : (!fir.box>>, index) -> (index, index, index) @@ -324,11 +330,9 @@ // CHECK: %[[VAL_86:.*]] = fir.load %[[VAL_85]] {tbaa = [#[[DIRECT_A_TAG]]]} : !fir.ref // load from box // CHECK: %[[VAL_87:.*]] = fir.load %[[VAL_35]] : !fir.ref>>> -// load from local allocation -// CHECK: %[[VAL_88:.*]] = fir.load %[[VAL_30]] : !fir.ref +// CHECK: %[[VAL_88:.*]] = fir.load %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_89:.*]] = fir.convert %[[VAL_88]] : (i32) -> i64 -// load from local allocation -// CHECK: %[[VAL_90:.*]] = fir.load %[[VAL_34]] : !fir.ref +// CHECK: %[[VAL_90:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_91:.*]] = fir.convert %[[VAL_90]] : (i32) -> i64 // CHECK: %[[VAL_92:.*]] = fir.box_addr %[[VAL_87]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_93:.*]]:3 = fir.box_dims %[[VAL_87]], %[[VAL_4]] : 
(!fir.box>>, index) -> (index, index, index) @@ -363,8 +367,7 @@ // CHECK: %[[VAL_121:.*]] = fir.load %[[VAL_120]] {tbaa = [#[[ARG_Y_TAG]]]} : !fir.ref // CHECK: %[[VAL_122:.*]] = arith.subf %[[VAL_119]], %[[VAL_121]] fastmath : f32 // CHECK: %[[VAL_123:.*]] = fir.no_reassoc %[[VAL_122]] : f32 -// load from local allocation -// CHECK: %[[VAL_124:.*]] = fir.load %[[VAL_28]] : !fir.ref +// CHECK: %[[VAL_124:.*]] = fir.load %[[VAL_28]] {tbaa = [#[[LOCAL5_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_125:.*]] = arith.mulf %[[VAL_123]], %[[VAL_124]] fastmath : f32 // CHECK: %[[VAL_126:.*]] = arith.addf %[[VAL_115]], %[[VAL_125]] fastmath : f32 // CHECK: %[[VAL_127:.*]] = fir.no_reassoc %[[VAL_126]] : f32 @@ -373,30 +376,24 @@ // CHECK: fir.store %[[VAL_129]] to %[[VAL_97]] {tbaa = [#[[ARG_LOW_TAG]]]} : !fir.ref // CHECK: %[[VAL_130:.*]] = arith.addi %[[VAL_63]], %[[VAL_5]] : index // CHECK: %[[VAL_131:.*]] = fir.convert %[[VAL_5]] : (index) -> i32 -// load from local allocation -// CHECK: %[[VAL_132:.*]] = fir.load %[[VAL_30]] : !fir.ref +// CHECK: %[[VAL_132:.*]] = fir.load %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_133:.*]] = arith.addi %[[VAL_132]], %[[VAL_131]] : i32 // CHECK: fir.result %[[VAL_130]], %[[VAL_133]] : index, i32 // CHECK: } -// store to local allocation -// CHECK: fir.store %[[VAL_134:.*]]#1 to %[[VAL_30]] : !fir.ref +// CHECK: fir.store %[[VAL_134:.*]]#1 to %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_135:.*]] = arith.addi %[[VAL_56]], %[[VAL_5]] : index // CHECK: %[[VAL_136:.*]] = fir.convert %[[VAL_5]] : (index) -> i32 -// local allocation: -// CHECK: %[[VAL_137:.*]] = fir.load %[[VAL_32]] : !fir.ref +// CHECK: %[[VAL_137:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_138:.*]] = arith.addi %[[VAL_137]], %[[VAL_136]] : i32 // CHECK: fir.result %[[VAL_135]], %[[VAL_138]] : index, i32 // CHECK: } -// local allocation: -// CHECK: fir.store %[[VAL_139:.*]]#1 
to %[[VAL_32]] : !fir.ref +// CHECK: fir.store %[[VAL_139:.*]]#1 to %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_140:.*]] = arith.addi %[[VAL_47]], %[[VAL_5]] : index // CHECK: %[[VAL_141:.*]] = fir.convert %[[VAL_5]] : (index) -> i32 -// local allocation: -// CHECK: %[[VAL_142:.*]] = fir.load %[[VAL_34]] : !fir.ref +// CHECK: %[[VAL_142:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref // CHECK: %[[VAL_143:.*]] = arith.addi %[[VAL_142]], %[[VAL_141]] : i32 // CHECK: fir.result %[[VAL_140]], %[[VAL_143]] : index, i32 // CHECK: } -// local allocation: -// CHECK: fir.store %[[VAL_144:.*]]#1 to %[[VAL_34]] : !fir.ref +// CHECK: fir.store %[[VAL_144:.*]]#1 to %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref // CHECK: return // CHECK: } diff --git a/flang/test/Transforms/tbaa3.fir b/flang/test/Transforms/tbaa3.fir index 28ff8f7c5fa83..97bf69da1b99c 100644 --- a/flang/test/Transforms/tbaa3.fir +++ b/flang/test/Transforms/tbaa3.fir @@ -263,12 +263,12 @@ module { fir.store %cst to %67 : !fir.ref %68 = fir.array_coor %20(%5) %c1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref // real :: local(10) -// DEFAULT-NOT: fir.store{{.*}}tbaa +// DEFAULT: fir.store{{.*}}tbaa // LOCAL: fir.store{{.*}}{tbaa = [#[[LOCALTAG]]]} : !fir.ref fir.store %cst to %68 : !fir.ref %69 = fir.array_coor %33(%5) %c1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref // real, target :: localt(10) -// DEFAULT-NOT: fir.store{{.*}}tbaa +// DEFAULT: fir.store{{.*}}tbaa // LOCAL: fir.store{{.*}}{tbaa = [#[[LOCALTTAG]]]} : !fir.ref fir.store %cst to %69 : !fir.ref // ALL-NOT: fir.load{{.*}}tbaa @@ -278,7 +278,7 @@ module { %73 = fir.shape_shift %72#0, %72#1 : (index, index) -> !fir.shapeshift<1> %74 = fir.array_coor %71(%73) %c1 : (!fir.heap>, !fir.shapeshift<1>, index) -> !fir.ref // real, allocatable :: locala(:) -// DEFAULT-NOT: fir.store{{.*}}tbaa +// DEFAULT: fir.store{{.*}}tbaa // LOCAL: fir.store{{.*}}{tbaa = [#[[LOCALATAG]]]} : !fir.ref fir.store 
%cst to %74 : !fir.ref // ALL-NOT: fir.load{{.*}}tbaa @@ -288,7 +288,7 @@ module { %78 = fir.shape_shift %77#0, %77#1 : (index, index) -> !fir.shapeshift<1> %79 = fir.array_coor %76(%78) %c1 : (!fir.heap>, !fir.shapeshift<1>, index) -> !fir.ref // real, allocatable, target :: localat(:) -// DEFAULT-NOT: fir.store{{.*}}tbaa +// DEFAULT: fir.store{{.*}}tbaa // LOCAL: fir.store{{.*}}{tbaa = [#[[LOCALATTAG]]]} : !fir.ref fir.store %cst to %79 : !fir.ref // ALL-NOT: fir.load{{.*}}tbaa @@ -297,7 +297,7 @@ module { %82 = fir.shift %81#0 : (index) -> !fir.shift<1> %83 = fir.array_coor %80(%82) %c1 : (!fir.box>>, !fir.shift<1>, index) -> !fir.ref // real, pointer :: localp(:) -// DEFAULT-NOT: fir.store{{.*}}tbaa +// DEFAULT: fir.store{{.*}}tbaa // LOCAL: fir.store{{.*}}{tbaa = [#[[TARGETTAG]]]} : !fir.ref fir.store %cst to %83 : !fir.ref // ALL-NOT: fir.load{{.*}}tbaa From flang-commits at lists.llvm.org Wed May 14 00:21:24 2025 From: flang-commits at lists.llvm.org (Dominik Adamski via flang-commits) Date: Wed, 14 May 2025 00:21:24 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Turn on alias analysis for locally allocated objects (PR #139682) In-Reply-To: Message-ID: <68244474.050a0220.2b6a53.5aec@mx.google.com> https://github.com/DominikAdamski closed https://github.com/llvm/llvm-project/pull/139682 From flang-commits at lists.llvm.org Wed May 14 01:02:14 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 01:02:14 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) Message-ID: https://github.com/NexMing created https://github.com/llvm/llvm-project/pull/139857 This patch split `-emit-fir` and `-emit-mlir` option in Flang's frontend driver. A new parent class for code-gen frontend actions is introduced:`CodeGenAction`. For the `-emit-mlir` option, we aim to generate a file using the core MLIR dialects. 
Currently, the FIR dialect is lowered directly to the LLVM dialect, but in the future we hope to gradually separate the pipeline into FIR → MLIR (core dialects) → LLVM. For now, this option temporarily generates a file using the LLVM dialect.

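The staging described in the PR #139857 message can be sketched as a small lookup from emit option to the lowering steps that run before the module is printed. This is only an illustration under stated assumptions: `EmitKind` and `loweringStages` are hypothetical names, not part of Flang's driver, and the stage strings simply restate the PR description (HLFIR is printed before any passes, FIR after lowering from HLFIR, and `-emit-mlir` currently ends in the LLVM dialect as a temporary stand-in for the core MLIR dialects).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the three MLIR-level emit options discussed in
// the PR; this is not the enum used by Flang's frontend.
enum class EmitKind { HLFIR, FIR, MLIR };

// Lowering stages run before the module is printed, as described in the PR.
inline std::vector<std::string> loweringStages(EmitKind kind) {
  std::vector<std::string> stages;
  if (kind == EmitKind::HLFIR)
    return stages; // -emit-hlfir: module is printed before any passes run
  stages.push_back("HLFIR -> FIR"); // needed for both -emit-fir and -emit-mlir
  if (kind == EmitKind::MLIR)
    // -emit-mlir: per the PR, FIR is for now lowered straight to the LLVM
    // dialect until a FIR -> core MLIR dialects stage exists.
    stages.push_back("FIR -> LLVM dialect");
  return stages;
}
```

Keeping the stages as an ordered list makes the intended future change easy to picture: a separate "FIR -> core MLIR dialects" entry would replace the last step for `-emit-mlir` without touching the other two options.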
From flang-commits at lists.llvm.org Wed May 14 01:02:48 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 01:02:48 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <68244e28.050a0220.21daf1.28cb@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: MingYan (NexMing)
Changes This patch split `-emit-fir` and `-emit-mlir` option in Flang's frontend driver. A new parent class for code-gen frontend actions is introduced:`CodeGenAction`. For the `-emit-mlir` option, we aim to generate a file using the core MLIR dialects. Currently, FIR Dialect is directly lowered to LLVM Dialect, but in the future, we hope to gradually separate the pipeline into FIR → MLIR (core dialects) → LLVM. For now, this option temporarily generate a file using the LLVM dialect. --- Full diff: https://github.com/llvm/llvm-project/pull/139857.diff 10 Files Affected: - (modified) clang/include/clang/Driver/Options.td (+2-1) - (modified) flang/include/flang/Frontend/FrontendActions.h (+11) - (modified) flang/include/flang/Frontend/FrontendOptions.h (+3) - (modified) flang/lib/Frontend/CompilerInvocation.cpp (+3) - (modified) flang/lib/Frontend/FrontendActions.cpp (+49-1) - (modified) flang/lib/FrontendTool/ExecuteCompilerInvocation.cpp (+2) - (added) flang/test/Driver/emit-fir.f90 (+30) - (modified) flang/test/Driver/emit-mlir.f90 (+17-19) - (modified) flang/test/Fir/non-trivial-procedure-binding-description.f90 (+2-2) - (modified) flang/test/Lower/unsigned-ops.f90 (+1-1) ``````````diff diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index bd8df8f6a749a..00cc05a2bd1a6 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -7193,7 +7193,8 @@ defm analyzed_objects_for_unparse : OptOutFC1FFlag<"analyzed-objects-for-unparse def emit_fir : Flag<["-"], "emit-fir">, Group, HelpText<"Build the parse tree, then lower it to FIR">; -def emit_mlir : Flag<["-"], "emit-mlir">, Alias; +def emit_mlir : Flag<["-"], "emit-mlir">, Group, + HelpText<"Build the parse tree, then lower it to core MLIR">; def emit_hlfir : Flag<["-"], "emit-hlfir">, Group, HelpText<"Build the parse tree, then lower it to HLFIR">; diff --git a/flang/include/flang/Frontend/FrontendActions.h 
b/flang/include/flang/Frontend/FrontendActions.h index f9a45bd6c0a56..b651a234b5849 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -179,6 +179,7 @@ enum class BackendActionTy { Backend_EmitBC, ///< Emit LLVM bitcode files Backend_EmitLL, ///< Emit human-readable LLVM assembly Backend_EmitFIR, ///< Emit FIR files, possibly lowering via HLFIR + Backend_EmitMLIR, ///< Emit MLIR files, possibly lowering via FIR Backend_EmitHLFIR, ///< Emit HLFIR files before any passes run }; @@ -216,6 +217,11 @@ class CodeGenAction : public FrontendAction { /// Runs pass pipeline to lower HLFIR into FIR void lowerHLFIRToFIR(); + /// Runs pass pipeline to lower FIR into core MLIR + /// TODO: Some operations currently do not have corresponding representations + /// in the core MLIR dialects, so we lower them directly to the LLVM dialect. + void lowerFIRToMLIR(); + /// Generates an LLVM IR module from CodeGenAction::mlirModule and saves it /// in CodeGenAction::llvmModule. 
void generateLLVMIR(); @@ -232,6 +238,11 @@ class EmitFIRAction : public CodeGenAction { EmitFIRAction() : CodeGenAction(BackendActionTy::Backend_EmitFIR) {} }; +class EmitMLIRAction : public CodeGenAction { +public: + EmitMLIRAction() : CodeGenAction(BackendActionTy::Backend_EmitMLIR) {} +}; + class EmitHLFIRAction : public CodeGenAction { public: EmitHLFIRAction() : CodeGenAction(BackendActionTy::Backend_EmitHLFIR) {} diff --git a/flang/include/flang/Frontend/FrontendOptions.h b/flang/include/flang/Frontend/FrontendOptions.h index 0bd2e621813ca..69bc5691430a8 100644 --- a/flang/include/flang/Frontend/FrontendOptions.h +++ b/flang/include/flang/Frontend/FrontendOptions.h @@ -37,6 +37,9 @@ enum ActionKind { /// Emit FIR mlir file EmitFIR, + /// Emit core MLIR mlir file + EmitMLIR, + /// Emit HLFIR mlir file EmitHLFIR, diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 238079a09ef3a..2a4b6e9d884af 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -572,6 +572,9 @@ static bool parseFrontendArgs(FrontendOptions &opts, llvm::opt::ArgList &args, case clang::driver::options::OPT_emit_fir: opts.programAction = EmitFIR; break; + case clang::driver::options::OPT_emit_mlir: + opts.programAction = EmitMLIR; + break; case clang::driver::options::OPT_emit_hlfir: opts.programAction = EmitHLFIR; break; diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index e5a15c555fa5e..9ba98873042f8 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -635,6 +635,49 @@ void CodeGenAction::lowerHLFIRToFIR() { } } +void CodeGenAction::lowerFIRToMLIR() { + assert(mlirModule && "The MLIR module has not been generated yet."); + + CompilerInstance &ci = this->getInstance(); + CompilerInvocation &invoc = ci.getInvocation(); + const CodeGenOptions &opts = invoc.getCodeGenOpts(); + const auto &mathOpts = 
invoc.getLoweringOpts().getMathOptions(); + llvm::OptimizationLevel level = mapToLevel(opts); + mlir::DefaultTimingManager &timingMgr = ci.getTimingManager(); + mlir::TimingScope &timingScopeRoot = ci.getTimingScopeRoot(); + + fir::support::loadDialects(*mlirCtx); + mlir::DialectRegistry registry; + fir::support::registerNonCodegenDialects(registry); + fir::support::addFIRExtensions(registry); + mlirCtx->appendDialectRegistry(registry); + fir::support::registerLLVMTranslation(*mlirCtx); + + // Set-up the MLIR pass manager + mlir::PassManager pm((*mlirModule)->getName(), + mlir::OpPassManager::Nesting::Implicit); + + pm.addPass(std::make_unique()); + pm.enableVerifier(/*verifyPasses=*/true); + + MLIRToLLVMPassPipelineConfig config(level, opts, mathOpts); + fir::registerDefaultInlinerPass(config); + + // Create the pass pipeline + fir::createDefaultFIROptimizerPassPipeline(pm, config); + fir::createDefaultFIRCodeGenPassPipeline(pm, config); + (void)mlir::applyPassManagerCLOptions(pm); + + mlir::TimingScope timingScopeMLIRPasses = timingScopeRoot.nest( + mlir::TimingIdentifier::get(timingIdMLIRPasses, timingMgr)); + pm.enableTiming(timingScopeMLIRPasses); + if (!mlir::succeeded(pm.run(*mlirModule))) { + unsigned diagID = ci.getDiagnostics().getCustomDiagID( + clang::DiagnosticsEngine::Error, "Lowering to FIR failed"); + ci.getDiagnostics().Report(diagID); + } +} + static std::optional> getAArch64VScaleRange(CompilerInstance &ci) { const auto &langOpts = ci.getInvocation().getLangOpts(); @@ -836,6 +879,7 @@ getOutputStream(CompilerInstance &ci, llvm::StringRef inFile, return ci.createDefaultOutputFile( /*Binary=*/false, inFile, /*extension=*/"ll"); case BackendActionTy::Backend_EmitFIR: + case BackendActionTy::Backend_EmitMLIR: case BackendActionTy::Backend_EmitHLFIR: return ci.createDefaultOutputFile( /*Binary=*/false, inFile, /*extension=*/"mlir"); @@ -1242,10 +1286,14 @@ void CodeGenAction::executeAction() { } } - if (action == BackendActionTy::Backend_EmitFIR) { + 
if (action == BackendActionTy::Backend_EmitFIR || + action == BackendActionTy::Backend_EmitMLIR) { if (loweringOpts.getLowerToHighLevelFIR()) { lowerHLFIRToFIR(); } + if (action == BackendActionTy::Backend_EmitMLIR) { + lowerFIRToMLIR(); + } mlirModule->print(ci.isOutputStreamNull() ? *os : ci.getOutputStream()); return; } diff --git a/flang/lib/FrontendTool/ExecuteCompilerInvocation.cpp b/flang/lib/FrontendTool/ExecuteCompilerInvocation.cpp index 09ac129d3e689..0c4195ec2ac2e 100644 --- a/flang/lib/FrontendTool/ExecuteCompilerInvocation.cpp +++ b/flang/lib/FrontendTool/ExecuteCompilerInvocation.cpp @@ -43,6 +43,8 @@ createFrontendAction(CompilerInstance &ci) { return std::make_unique(); case EmitFIR: return std::make_unique(); + case EmitMLIR: + return std::make_unique(); case EmitHLFIR: return std::make_unique(); case EmitLLVM: diff --git a/flang/test/Driver/emit-fir.f90 b/flang/test/Driver/emit-fir.f90 new file mode 100644 index 0000000000000..4230c4b7ab434 --- /dev/null +++ b/flang/test/Driver/emit-fir.f90 @@ -0,0 +1,30 @@ +! Test the `-emit-fir` option + +! RUN: %flang_fc1 -emit-fir %s -o - | FileCheck %s + +! Verify that an `.mlir` file is created when `-emit-fir` is used. Do it in a temporary directory, which will be cleaned up by the +! LIT runner. +! RUN: rm -rf %t-dir && mkdir -p %t-dir && cd %t-dir +! RUN: cp %s . +! RUN: %flang_fc1 -emit-fir emit-fir.f90 && ls emit-fir.mlir + +! CHECK: module attributes { +! CHECK-SAME: dlti.dl_spec = +! CHECK-SAME: llvm.data_layout = +! CHECK-LABEL: func @_QQmain() { +! CHECK-NEXT: fir.dummy_scope +! CHECK-NEXT: return +! CHECK-NEXT: } +! CHECK-NEXT: func.func private @_FortranAProgramStart(i32, !llvm.ptr, !llvm.ptr, !llvm.ptr) +! CHECK-NEXT: func.func private @_FortranAProgramEndStatement() +! CHECK-NEXT: func.func @main(%arg0: i32, %arg1: !llvm.ptr, %arg2: !llvm.ptr) -> i32 { +! CHECK-NEXT: %c0_i32 = arith.constant 0 : i32 +! CHECK-NEXT: %0 = fir.zero_bits !fir.ref, !fir.ref>>>>> +! 
CHECK-NEXT: fir.call @_FortranAProgramStart(%arg0, %arg1, %arg2, %0) {{.*}} : (i32, !llvm.ptr, !llvm.ptr, !fir.ref, !fir.ref>>>>>) +! CHECK-NEXT: fir.call @_QQmain() fastmath : () -> () +! CHECK-NEXT: fir.call @_FortranAProgramEndStatement() {{.*}} : () -> () +! CHECK-NEXT: return %c0_i32 : i32 +! CHECK-NEXT: } +! CHECK-NEXT: } + +end program diff --git a/flang/test/Driver/emit-mlir.f90 b/flang/test/Driver/emit-mlir.f90 index de5a62d6bc7f3..cced4b0e37017 100644 --- a/flang/test/Driver/emit-mlir.f90 +++ b/flang/test/Driver/emit-mlir.f90 @@ -1,7 +1,6 @@ ! Test the `-emit-mlir` option ! RUN: %flang_fc1 -emit-mlir %s -o - | FileCheck %s -! RUN: %flang_fc1 -emit-fir %s -o - | FileCheck %s ! Verify that an `.mlir` file is created when `-emit-mlir` is used. Do it in a temporary directory, which will be cleaned up by the ! LIT runner. @@ -9,23 +8,22 @@ ! RUN: cp %s . ! RUN: %flang_fc1 -emit-mlir emit-mlir.f90 && ls emit-mlir.mlir -! CHECK: module attributes { -! CHECK-SAME: dlti.dl_spec = -! CHECK-SAME: llvm.data_layout = -! CHECK-LABEL: func @_QQmain() { -! CHECK-NEXT: fir.dummy_scope -! CHECK-NEXT: return -! CHECK-NEXT: } -! CHECK-NEXT: func.func private @_FortranAProgramStart(i32, !llvm.ptr, !llvm.ptr, !llvm.ptr) -! CHECK-NEXT: func.func private @_FortranAProgramEndStatement() -! CHECK-NEXT: func.func @main(%arg0: i32, %arg1: !llvm.ptr, %arg2: !llvm.ptr) -> i32 { -! CHECK-NEXT: %c0_i32 = arith.constant 0 : i32 -! CHECK-NEXT: %0 = fir.zero_bits !fir.ref, !fir.ref>>>>> -! CHECK-NEXT: fir.call @_FortranAProgramStart(%arg0, %arg1, %arg2, %0) {{.*}} : (i32, !llvm.ptr, !llvm.ptr, !fir.ref, !fir.ref>>>>>) -! CHECK-NEXT: fir.call @_QQmain() fastmath : () -> () -! CHECK-NEXT: fir.call @_FortranAProgramEndStatement() {{.*}} : () -> () -! CHECK-NEXT: return %c0_i32 : i32 -! CHECK-NEXT: } -! CHECK-NEXT: } +! CHECK-LABEL: llvm.func @_QQmain() { +! CHECK: llvm.return +! CHECK: } +! 
CHECK: llvm.func @_FortranAProgramStart(i32, !llvm.ptr, !llvm.ptr, !llvm.ptr) attributes {sym_visibility = "private"} +! CHECK: llvm.func @_FortranAProgramEndStatement() attributes {sym_visibility = "private"} + +! CHECK-LABEL: llvm.func @main( +! CHECK-SAME: %[[ARG0:.*]]: i32, +! CHECK-SAME: %[[ARG1:.*]]: !llvm.ptr, +! CHECK-SAME: %[[ARG2:.*]]: !llvm.ptr) -> i32 { +! CHECK: %[[VAL_0:.*]] = llvm.mlir.constant(0 : i32) : i32 +! CHECK: %[[VAL_1:.*]] = llvm.mlir.zero : !llvm.ptr +! CHECK: llvm.call @_FortranAProgramStart(%[[ARG0]], %[[ARG1]], %[[ARG2]], %[[VAL_1]]) {fastmathFlags = #llvm.fastmath} : (i32, !llvm.ptr, !llvm.ptr, !llvm.ptr) -> () +! CHECK: llvm.call @_QQmain() {fastmathFlags = #llvm.fastmath} : () -> () +! CHECK: llvm.call @_FortranAProgramEndStatement() {fastmathFlags = #llvm.fastmath} : () -> () +! CHECK: llvm.return %[[VAL_0]] : i32 +! CHECK: } end program diff --git a/flang/test/Fir/non-trivial-procedure-binding-description.f90 b/flang/test/Fir/non-trivial-procedure-binding-description.f90 index 668928600157b..79bb4dbb3521e 100644 --- a/flang/test/Fir/non-trivial-procedure-binding-description.f90 +++ b/flang/test/Fir/non-trivial-procedure-binding-description.f90 @@ -1,5 +1,5 @@ -! RUN: %flang_fc1 -emit-mlir %s -o - | FileCheck %s --check-prefix=BEFORE -! RUN: %flang_fc1 -emit-mlir %s -o - | fir-opt --abstract-result | FileCheck %s --check-prefix=AFTER +! RUN: %flang_fc1 -emit-fir %s -o - | FileCheck %s --check-prefix=BEFORE +! RUN: %flang_fc1 -emit-fir %s -o - | fir-opt --abstract-result | FileCheck %s --check-prefix=AFTER module a type f contains diff --git a/flang/test/Lower/unsigned-ops.f90 b/flang/test/Lower/unsigned-ops.f90 index f61f10656159a..fa1eb47b26b00 100644 --- a/flang/test/Lower/unsigned-ops.f90 +++ b/flang/test/Lower/unsigned-ops.f90 @@ -1,4 +1,4 @@ -! RUN: %flang_fc1 -funsigned -emit-mlir %s -o - | FileCheck %s +! 
RUN: %flang_fc1 -funsigned -emit-fir %s -o - | FileCheck %s unsigned function f01(u, v) unsigned, intent(in) :: u, v ``````````
https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Wed May 14 02:13:14 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 14 May 2025 02:13:14 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <68245eaa.170a0220.1159b6.0900@mx.google.com> https://github.com/tblah approved this pull request. LGTM. Please could you add a `TODO(loc, "!$dir prefetch")` in lowering so that this does not get silently ignored until the codegen lands. https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Wed May 14 02:15:19 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 14 May 2025 02:15:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ [NO]INLINE and FORCEINLINE directives (PR #134350) In-Reply-To: Message-ID: <68245f27.170a0220.311f7d.1692@mx.google.com> tblah wrote: Because the LLVM attribute can only be added to the function operation. https://github.com/llvm/llvm-project/pull/134350 From flang-commits at lists.llvm.org Wed May 14 02:16:14 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 14 May 2025 02:16:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Verify uses of OmpCancellationConstructTypeClause (PR #139743) In-Reply-To: Message-ID: <68245f5e.170a0220.1dc49a.23d4@mx.google.com> https://github.com/tblah approved this pull request. 
Thanks https://github.com/llvm/llvm-project/pull/139743 From flang-commits at lists.llvm.org Wed May 14 02:44:58 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 14 May 2025 02:44:58 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <6824661a.170a0220.7e312.13fb@mx.google.com> ================ @@ -1783,6 +1776,18 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; ---------------- tblah wrote: Surely some string storage for symbol names already exists? If not then I agree that it would be better to have a utility for other situations like this. Even just wrapping the static vector in a well documented and named utility function would make it clearer what this is for. https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Wed May 14 02:55:26 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 02:55:26 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <6824688e.170a0220.9ceba.2647@mx.google.com> https://github.com/NimishMishra approved this pull request. LGTM. Please add the TODO as Tom mentioned. 
https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Wed May 14 02:56:17 2025 From: flang-commits at lists.llvm.org (Thirumalai Shaktivel via flang-commits) Date: Wed, 14 May 2025 02:56:17 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <682468c1.170a0220.33e4bd.ff37@mx.google.com> Thirumalai-Shaktivel wrote: Thanks for the reviews! I will add the required changes soon. https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Wed May 14 02:59:50 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 14 May 2025 02:59:50 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <68246996.170a0220.9d76a.16f3@mx.google.com> https://github.com/tblah requested changes to this pull request. The code changes look okay here but I think the naming should reflect that the lowering to upstream MLIR dialects will be considered experimental (and may remain so forever, if it cannot be made to fully conform to Fortran semantics or show runtime performance improvements that justify the added complexity and compile time). Instead of re-purposing `-emit-mlir`, how about `-emit-experimental-mlir`? Thank you for upstreaming your work aligning flang with upstream mlir dialects. I am personally very interested in this. But I think making such a significant change to the lowering pipeline is going to need a lot broader consensus amongst flang contributors and strong technical justification. 
For now I am very happy to see any work in this direction under experimental options :smile: https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Wed May 14 03:32:36 2025 From: flang-commits at lists.llvm.org (Jay Foad via flang-commits) Date: Wed, 14 May 2025 03:32:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Simplify LeadingZeroBitCount. NFC. (PR #139873) Message-ID: https://github.com/jayfoad created https://github.com/llvm/llvm-project/pull/139873 Fold the subtraction into the mapping table.
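As an illustration of what "folding the subtraction into the mapping table" can mean for a table-driven leading-zero count, here is a hedged sketch. It is not the code from the PR. It assumes a de Bruijn style lookup in which the table would otherwise store the index of the highest set bit and the caller would compute `63 - table[...]`; here the table stores the final count directly. The multiplier constant is a commonly used 64-bit de Bruijn value, and all names (`lzsketch`, `kDeBruijn`, `makeTable`, `agreesWithNaive`) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

namespace lzsketch {

// Commonly used 64-bit de Bruijn multiplier (an assumption; flang may use a
// different constant or a different table layout).
constexpr std::uint64_t kDeBruijn = 0x03f79d71b4cb0a89ULL;

struct Table {
  std::uint8_t entry[64];
};

// Build the table so each slot already holds the leading-zero count
// (63 - highest set bit), so no subtraction is needed at lookup time.
constexpr Table makeTable() {
  Table t{};
  for (int bit = 0; bit < 64; ++bit) {
    // Value with every bit at or below `bit` set.
    const std::uint64_t filled =
        bit == 63 ? ~std::uint64_t{0} : (std::uint64_t{1} << (bit + 1)) - 1;
    t.entry[(filled * kDeBruijn) >> 58] = static_cast<std::uint8_t>(63 - bit);
  }
  return t;
}

constexpr Table kTable = makeTable();

constexpr int leadingZeroBitCount(std::uint64_t x) {
  if (x == 0)
    return 64;
  // Smear the highest set bit into all lower positions.
  x |= x >> 1;
  x |= x >> 2;
  x |= x >> 4;
  x |= x >> 8;
  x |= x >> 16;
  x |= x >> 32;
  return kTable.entry[(x * kDeBruijn) >> 58];
}

// Straightforward reference implementation used only for checking.
inline int naiveLeadingZeros(std::uint64_t x) {
  int n = 0;
  for (std::uint64_t m = std::uint64_t{1} << 63; m != 0 && (x & m) == 0; m >>= 1)
    ++n;
  return n;
}

// Compares the table-driven version with the reference on a set of inputs.
inline bool agreesWithNaive() {
  if (leadingZeroBitCount(0) != naiveLeadingZeros(0))
    return false;
  for (int bit = 0; bit < 64; ++bit) {
    const std::uint64_t one = std::uint64_t{1} << bit;
    const std::uint64_t mixed = one | (one >> 1) | 1;
    if (leadingZeroBitCount(one) != naiveLeadingZeros(one))
      return false;
    if (leadingZeroBitCount(mixed) != naiveLeadingZeros(mixed))
      return false;
  }
  return true;
}

} // namespace lzsketch
```

The design point is that the change is purely a refactoring (hence "NFC"): the result for every input is unchanged, and only the place where `63 - index` is computed moves from run time to table construction.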


From flang-commits at lists.llvm.org Wed May 14 04:09:16 2025 From: flang-commits at lists.llvm.org (Kiran Kumar T P via flang-commits) Date: Wed, 14 May 2025 04:09:16 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ IVDEP directive (PR #133728) In-Reply-To: Message-ID: <682479dc.170a0220.30c01b.288a@mx.google.com>

kiranktp wrote:

Hi @JDPailleux, thanks for the patch.

For the ivdep pragma, clang generates the metadata below:

    !8 = !{!"llvm.loop.vectorize.ivdep.enable", i1 true}
    !9 = !{!"llvm.loop.vectorize.enable", i1 true}

but flang will generate the metadata below:

    CHECK: ![[VECTORIZE]] = !{!"llvm.loop.vectorize.enable", i1 true}
    ! CHECK: ![[PARALLEL_ACCESSES]] = !{!"llvm.loop.parallel_accesses", [[DISTRINCT]]}

IMO, the behavior of clang and flang should match.

In the AMD downstream compiler we have the following support for different directives [Apologies, I had promised an RFC, but I couldn't get that done in time]:

1. !DIR$ VECTOR
   This directive must generate the metadata below:
       !18 = !{!"llvm.loop.vectorize.enable", i1 true}
   This works as a hint to the optimizer. But the optimizer will anyway check for profitability of vectorization and then decide if vectorization should be done or not. This behavior is similar to "#pragma clang loop vectorize(enable)" for clang.

2. !DIR$ NOVECTOR
   This directive must generate the metadata below:
       !28 = !{!"llvm.loop.vectorize.width", i1 true}
   This will disable vectorizing the loop across all optimization levels. This behavior is similar to "#pragma clang loop vectorize(disable)".

3. !DIR$ IVDEP
   This directive must generate the metadata below:
       !28 = !{!"llvm.loop.vectorize.enable", i1 true}
       !29 = !{ !"llvm.loop.vectorize.ivdep.enable", i1 1 }
       !30 = !{ !30, !29, !28 }
   This behavior is similar to "#pragma clang loop ivdep(enable)".

4. !DIR$ VECTOR ALWAYS
   This directive must generate the metadata below:
       !28 = !{!"llvm.loop.vectorize.enable", i1 true}
       !29 = !{!"llvm.loop.parallel_accesses", !26}
   This will vectorize the loop irrespective of the profitability of the vectorization. This directive should be used with caution.
   NOTE: As far as I am aware, this was the behavior for the "!DIR$ SIMD" directive. IMO "!DIR$ SIMD" will not be supported in llvm-flang, so we can have this behavior under "!DIR$ VECTOR ALWAYS".

Let me know your opinion.

https://github.com/llvm/llvm-project/pull/133728

From flang-commits at lists.llvm.org Wed May 14 04:31:40 2025 From: flang-commits at lists.llvm.org (David Spickett via flang-commits) Date: Wed, 14 May 2025 04:31:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <68247f1c.050a0220.35a46c.1ecd@mx.google.com>

DavidSpickett wrote:

We are also seeing this on Linaro's buildbots - https://lab.llvm.org/buildbot/#/builders/17/builds/7925

Let me know if you need any more information about the host machine or the failure.

https://github.com/llvm/llvm-project/pull/136012

From flang-commits at lists.llvm.org Wed May 14 05:06:21 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 14 May 2025 05:06:21 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <6824873d.170a0220.18e312.32c1@mx.google.com>

kiranchandramohan wrote:

Just a pass-through comment. IBM and HPE have prefetch directives that have more options. Might be good to check with @kkwli @DanielCChen @tmjbios to see whether they are OK with the syntax in this PR.
https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/2024-2/prefetch-and-noprefetch-general-directives.html https://support.hpe.com/hpesc/public/docDisplay?docId=a00115296en_us&page=PREFETCH.html&docLocale=en_US https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Wed May 14 05:07:18 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 05:07:18 -0700 (PDT) Subject: [flang-commits] [flang] e06363f - [flang][OpenMP] Verify uses of OmpCancellationConstructTypeClause (#139743) Message-ID: <68248776.170a0220.34f449.3752@mx.google.com> Author: Krzysztof Parzyszek Date: 2025-05-14T07:07:14-05:00 New Revision: e06363f80f95b53a433762d0561741277521241e URL: https://github.com/llvm/llvm-project/commit/e06363f80f95b53a433762d0561741277521241e DIFF: https://github.com/llvm/llvm-project/commit/e06363f80f95b53a433762d0561741277521241e.diff LOG: [flang][OpenMP] Verify uses of OmpCancellationConstructTypeClause (#139743) Some directive names can be used as clauses, for example in "cancel". In case where a directive name is misplaced, it could be interpreted as a clause. Verify that such uses are valid, and emit a diagnostic message if not. 
Fixes https://github.com/llvm/llvm-project/issues/138224 Added: flang/test/Semantics/OpenMP/cancellation-construct-type.f90 Modified: flang/lib/Parser/openmp-parsers.cpp flang/lib/Semantics/check-omp-structure.cpp Removed: ################################################################################ diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 0254ac4309ee5..52d3a5844c969 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -98,10 +98,12 @@ struct OmpDirectiveNameParser { using Token = TokenStringMatch; std::optional Parse(ParseState &state) const { + auto begin{state.GetLocation()}; for (const NameWithId &nid : directives()) { if (attempt(Token(nid.first.data())).Parse(state)) { OmpDirectiveName n; n.v = nid.second; + n.source = parser::CharBlock(begin, state.GetLocation()); return n; } } @@ -1104,18 +1106,8 @@ TYPE_PARSER( // "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs - "DO"_id >= - construct(construct( - Parser{})) || - "PARALLEL"_id >= - construct(construct( - Parser{})) || - "SECTIONS"_id >= - construct(construct( - Parser{})) || - "TASKGROUP"_id >= - construct(construct( - Parser{}))) + construct(construct( + Parser{}))) // [Clause, [Clause], ...] TYPE_PARSER(sourced(construct( diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 78736ee1929d1..5ae4bc29b72f7 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2422,20 +2422,30 @@ void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { void OmpStructureChecker::Enter( const parser::OmpClause::CancellationConstructType &x) { - // Do not call CheckAllowed/CheckAllowedClause, because in case of an error - // it will print "CANCELLATION_CONSTRUCT_TYPE" as the clause name instead of - // the contained construct name. 
+ llvm::omp::Directive dir{GetContext().directive}; auto &dirName{std::get(x.v.t)}; - switch (dirName.v) { - case llvm::omp::Directive::OMPD_do: - case llvm::omp::Directive::OMPD_parallel: - case llvm::omp::Directive::OMPD_sections: - case llvm::omp::Directive::OMPD_taskgroup: - break; - default: - context_.Say(dirName.source, "%s is not a cancellable construct"_err_en_US, - parser::ToUpperCaseLetters(getDirectiveName(dirName.v).str())); - break; + + if (dir != llvm::omp::Directive::OMPD_cancel && + dir != llvm::omp::Directive::OMPD_cancellation_point) { + // Do not call CheckAllowed/CheckAllowedClause, because in case of an error + // it will print "CANCELLATION_CONSTRUCT_TYPE" as the clause name instead + // of the contained construct name. + context_.Say(dirName.source, "%s cannot follow %s"_err_en_US, + parser::ToUpperCaseLetters(getDirectiveName(dirName.v)), + parser::ToUpperCaseLetters(getDirectiveName(dir))); + } else { + switch (dirName.v) { + case llvm::omp::Directive::OMPD_do: + case llvm::omp::Directive::OMPD_parallel: + case llvm::omp::Directive::OMPD_sections: + case llvm::omp::Directive::OMPD_taskgroup: + break; + default: + context_.Say(dirName.source, + "%s is not a cancellable construct"_err_en_US, + parser::ToUpperCaseLetters(getDirectiveName(dirName.v))); + break; + } } } diff --git a/flang/test/Semantics/OpenMP/cancellation-construct-type.f90 b/flang/test/Semantics/OpenMP/cancellation-construct-type.f90 new file mode 100644 index 0000000000000..c9d1408fd83ef --- /dev/null +++ b/flang/test/Semantics/OpenMP/cancellation-construct-type.f90 @@ -0,0 +1,11 @@ +!RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags + +subroutine f(x) + integer :: x +!ERROR: PARALLEL cannot follow SECTIONS +!$omp sections parallel +!$omp section + x = x + 1 +!$omp end sections +end +end From flang-commits at lists.llvm.org Wed May 14 05:07:21 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Wed, 14 May 2025 05:07:21 -0700 
(PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Verify uses of OmpCancellationConstructTypeClause (PR #139743) In-Reply-To: Message-ID: <68248779.a70a0220.2484f.48c4@mx.google.com> https://github.com/kparzysz closed https://github.com/llvm/llvm-project/pull/139743

From flang-commits at lists.llvm.org Wed May 14 06:39:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 06:39:21 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <68249d09.170a0220.e444e.2afb@mx.google.com>

tmjbios wrote:

Thanks for the poke, Kiran.

Cray Compiler Environment documentation link: https://cpe.ext.hpe.com/docs/latest/cce/index.html

Our syntax is slightly different:

```
!DIR$ PREFETCH [([lines(num)][, level(num)] [, write][, nt])] var[, var]...
```

With this provided example showing it in practice:

```
real*8 a(m,n), b(n,p), c(m,p), arow(n)
...
do j = 1, p
!dir$ prefetch (lines(3), nt) arow(1),b(1,j)
   do k = 1, n, 4
!dir$ prefetch (nt) arow(k+24),b(k+24,j)
      c(i,j) = c(i,j) + arow(k) * b(k,j)
      c(i,j) = c(i,j) + arow(k+1) * b(k+1,j)
      c(i,j) = c(i,j) + arow(k+2) * b(k+2,j)
      c(i,j) = c(i,j) + arow(k+3) * b(k+3,j)
   enddo
enddo
```

https://github.com/llvm/llvm-project/pull/139702

From flang-commits at lists.llvm.org Wed May 14 06:58:41 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 14 May 2025 06:58:41 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Set the default schedule modifier (PR #139572) In-Reply-To: Message-ID: <6824a191.050a0220.175682.724a@mx.google.com>

kiranchandramohan wrote:

My preference is for this to be handled in the translation from OpenMP dialect to LLVM IR or in the OpenMP IRBuilder. The best case is when the OpenMP dialect models what is in the standard.
So a user of the dialect might think that if they do not specify the static scheduling or ordered clause the `omp.wsloop` will be lowered to code with nonmonotonic scheduler. Does that sound OK? BTW, the call the `applyWorkshare` seems to be checking for monotonic, nonmonotonic etc. Will that be affected? https://github.com/llvm/llvm-project/blob/7a9fd62278a2eab8160fa476c3a64e66786f99ad/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp#L2436 Also, does the runtime assume a default behaviour? Or is it for the user to generate appropriate calls to the runtime with the right scheduling modifier value? https://github.com/llvm/llvm-project/pull/139572 From flang-commits at lists.llvm.org Wed May 14 07:08:00 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 07:08:00 -0700 (PDT) Subject: [flang-commits] [flang] f486cc4 - [flang] Add loop annotation attributes to the loop backedge (#126082) Message-ID: <6824a3c0.620a0220.39ef45.435d@mx.google.com> Author: Asher Mancinelli Date: 2025-05-14T07:07:57-07:00 New Revision: f486cc4417059e47e5b6e18294bbacd767c04030 URL: https://github.com/llvm/llvm-project/commit/f486cc4417059e47e5b6e18294bbacd767c04030 DIFF: https://github.com/llvm/llvm-project/commit/f486cc4417059e47e5b6e18294bbacd767c04030.diff LOG: [flang] Add loop annotation attributes to the loop backedge (#126082) Flang currently adds loop metadata to a conditional branch in the loop preheader, while clang adds it to the loop latch's branch instruction. Langref says: > Currently, loop metadata is implemented as metadata attached to the branch instruction in the loop latch block. > > https://llvm.org/docs/LangRef.html#llvm-loop I misread langref a couple times, but I think this is the appropriate branch op for the LoopAnnotationAttr. In a couple examples I found that the metadata was lost entirely during canonicalization. This patch makes the codegen look more like clang's and the annotations persist through codegen. 
* current clang: https://godbolt.org/z/8WhbcrnG3 * current flang: https://godbolt.org/z/TrPboqqcn Added: Modified: flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp flang/test/Fir/vector-always.fir flang/test/Integration/unroll.f90 flang/test/Integration/unroll_and_jam.f90 flang/test/Integration/vector-always.f90 Removed: ################################################################################ diff --git a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp index b09bbf6106dbb..8a9e9b80134b8 100644 --- a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp +++ b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp @@ -123,23 +123,24 @@ class CfgLoopConv : public mlir::OpRewritePattern { : terminator->operand_begin(); loopCarried.append(begin, terminator->operand_end()); loopCarried.push_back(itersMinusOne); - rewriter.create(loc, conditionalBlock, loopCarried); + auto backEdge = + rewriter.create(loc, conditionalBlock, loopCarried); rewriter.eraseOp(terminator); + // Copy loop annotations from the do loop to the loop back edge. + if (auto ann = loop.getLoopAnnotation()) + backEdge->setAttr("loop_annotation", *ann); + // Conditional block rewriter.setInsertionPointToEnd(conditionalBlock); auto zero = rewriter.create(loc, 0); auto comparison = rewriter.create( loc, arith::CmpIPredicate::sgt, itersLeft, zero); - auto cond = rewriter.create( + rewriter.create( loc, comparison, firstBlock, llvm::ArrayRef(), endBlock, llvm::ArrayRef()); - // Copy loop annotations from the do loop to the loop entry condition. - if (auto ann = loop.getLoopAnnotation()) - cond->setAttr("loop_annotation", *ann); - // The result of the loop operation is the values of the condition block // arguments except the induction variable on the last iteration. 
auto args = loop.getFinalValue() diff --git a/flang/test/Fir/vector-always.fir b/flang/test/Fir/vector-always.fir index 00eb0e7a756ee..ec06b94a3d0f8 100644 --- a/flang/test/Fir/vector-always.fir +++ b/flang/test/Fir/vector-always.fir @@ -13,7 +13,9 @@ func.func @_QPvector_always() -> i32 { %c10_i32 = arith.constant 10 : i32 %c1_i32 = arith.constant 1 : i32 %c10 = arith.constant 10 : index -// CHECK: cf.cond_br %{{.*}}, ^{{.*}}, ^{{.*}} {loop_annotation = #[[ANNOTATION]]} +// CHECK: cf.cond_br +// CHECK-NOT: loop_annotation +// CHECK: cf.br ^{{.*}} {loop_annotation = #[[ANNOTATION]]} %8:2 = fir.do_loop %arg0 = %c1 to %c10 step %c1 iter_args(%arg1 = %c1_i32) -> (index, i32) attributes {loopAnnotation = #loop_annotation} { fir.result %c1, %c1_i32 : index, i32 } diff --git a/flang/test/Integration/unroll.f90 b/flang/test/Integration/unroll.f90 index aa47e465b63fc..f2c2ecb5cffac 100644 --- a/flang/test/Integration/unroll.f90 +++ b/flang/test/Integration/unroll.f90 @@ -3,8 +3,10 @@ ! CHECK-LABEL: unroll_dir subroutine unroll_dir integer :: a(10) - !dir$ unroll - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[UNROLL_ENABLE_FULL_ANNO:.*]] + !dir$ unroll + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[UNROLL_ENABLE_FULL_ANNO:.*]] do i=1,10 a(i)=i end do @@ -14,7 +16,9 @@ end subroutine unroll_dir subroutine unroll_dir_0 integer :: a(10) !dir$ unroll 0 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[UNROLL_DISABLE_ANNO:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[UNROLL_DISABLE_ANNO:.*]] do i=1,10 a(i)=i end do @@ -24,7 +28,9 @@ end subroutine unroll_dir_0 subroutine unroll_dir_1 integer :: a(10) !dir$ unroll 1 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[UNROLL_DISABLE_ANNO]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! 
CHECK: br label {{.*}}, !llvm.loop ![[UNROLL_DISABLE_ANNO]] do i=1,10 a(i)=i end do @@ -34,7 +40,9 @@ end subroutine unroll_dir_1 subroutine unroll_dir_2 integer :: a(10) !dir$ unroll 2 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[UNROLL_ENABLE_COUNT_2:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[UNROLL_ENABLE_COUNT_2:.*]] do i=1,10 a(i)=i end do diff --git a/flang/test/Integration/unroll_and_jam.f90 b/flang/test/Integration/unroll_and_jam.f90 index b9c16d34ac90a..05b3aaa04a1e0 100644 --- a/flang/test/Integration/unroll_and_jam.f90 +++ b/flang/test/Integration/unroll_and_jam.f90 @@ -4,7 +4,9 @@ subroutine unroll_and_jam_dir integer :: a(10) !dir$ unroll_and_jam 4 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION:.*]] do i=1,10 a(i)=i end do @@ -14,7 +16,9 @@ end subroutine unroll_and_jam_dir subroutine unroll_and_jam_dir_0 integer :: a(10) !dir$ unroll_and_jam 0 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE:.*]] do i=1,10 a(i)=i end do @@ -24,7 +28,9 @@ end subroutine unroll_and_jam_dir_0 subroutine unroll_and_jam_dir_1 integer :: a(10) !dir$ unroll_and_jam 1 - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE]] do i=1,10 a(i)=i end do @@ -34,7 +40,9 @@ end subroutine unroll_and_jam_dir_1 subroutine nounroll_and_jam_dir integer :: a(10) !dir$ nounroll_and_jam - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE]] + ! 
CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION_DISABLE]] do i=1,10 a(i)=i end do @@ -44,7 +52,9 @@ end subroutine nounroll_and_jam_dir subroutine unroll_and_jam_dir_no_factor integer :: a(10) !dir$ unroll_and_jam - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION_NO_FACTOR:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION_NO_FACTOR:.*]] do i=1,10 a(i)=i end do diff --git a/flang/test/Integration/vector-always.f90 b/flang/test/Integration/vector-always.f90 index ee2aa8ab485e0..1d8aad97bde70 100644 --- a/flang/test/Integration/vector-always.f90 +++ b/flang/test/Integration/vector-always.f90 @@ -4,7 +4,9 @@ subroutine vector_always integer :: a(10) !dir$ vector always - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION:.*]] do i=1,10 a(i)=i end do @@ -14,7 +16,9 @@ end subroutine vector_always subroutine no_vector integer :: a(10) !dir$ novector - ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}}, !llvm.loop ![[ANNOTATION2:.*]] + ! CHECK: br i1 {{.*}}, label {{.*}}, label {{.*}} + ! CHECK-NOT: !llvm.loop + ! 
CHECK: br label {{.*}}, !llvm.loop ![[ANNOTATION2:.*]] do i=1,10 a(i)=i end do From flang-commits at lists.llvm.org Wed May 14 07:08:05 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Wed, 14 May 2025 07:08:05 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add loop annotation attributes to the loop backedge instead of the loop header's conditional branch (PR #126082) In-Reply-To: Message-ID: <6824a3c5.050a0220.10efa6.4dd7@mx.google.com> https://github.com/ashermancinelli closed https://github.com/llvm/llvm-project/pull/126082 From flang-commits at lists.llvm.org Wed May 14 07:38:16 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 07:38:16 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <6824aad8.630a0220.29c20d.e31f@mx.google.com> https://github.com/jeanPerier commented: While I understand the goal, I also have some issues with the genericity of the name while there is no single MLIR core dialect. How would you distinguish between lowering to the different levels of representation that the core dialects offer: affine, linalg, scf, cf, memref, tensor, llvm dialect? I have no problem having some `-emit-llvm-dialect` option though, on the contrary. 
https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Wed May 14 08:28:26 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Wed, 14 May 2025 08:28:26 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <6824b69a.050a0220.27796a.cce0@mx.google.com> https://github.com/tarunprabhu edited https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Wed May 14 08:28:26 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Wed, 14 May 2025 08:28:26 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <6824b69a.a70a0220.36517a.dab4@mx.google.com> https://github.com/tarunprabhu commented: I tend to prefer @jeanPerier's suggestion of having the option name reflect that only the LLVM dialect is being used. In the future, do you intend to provide a way to choose the set of dialects to use? In that case, we could consider something a bit more general like @tblah's suggestion. 
https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Wed May 14 08:28:27 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Wed, 14 May 2025 08:28:27 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <6824b69b.170a0220.1159b6.9209@mx.google.com> ================ @@ -635,6 +635,49 @@ void CodeGenAction::lowerHLFIRToFIR() { } } +void CodeGenAction::lowerFIRToMLIR() { + assert(mlirModule && "The MLIR module has not been generated yet."); + + CompilerInstance &ci = this->getInstance(); + CompilerInvocation &invoc = ci.getInvocation(); + const CodeGenOptions &opts = invoc.getCodeGenOpts(); + const auto &mathOpts = invoc.getLoweringOpts().getMathOptions(); ---------------- tarunprabhu wrote: Could we use a more concrete type instead of `auto` here? https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Wed May 14 08:30:28 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Wed, 14 May 2025 08:30:28 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <6824b714.170a0220.11d141.8cc8@mx.google.com> tarunprabhu wrote: > A new parent class for code-gen frontend actions is introduced:`CodeGenAction`. Should this instead be "A new code-gen action class is introduced: `EmitMLIRAction`"? That seems to be the only new class that has been introduced and it is not (yet, at least) a parent for any other class. 
https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Wed May 14 08:32:22 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 14 May 2025 08:32:22 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <6824b786.170a0220.176133.b805@mx.google.com> kiranchandramohan wrote: Thanks @tmjbios for the quick reply. The question is whether the syntax proposed in this PR `!dir$ prefetch designator[, designator]...` is OK with you. If what is proposed here is a subset of the functionality you have in CCE then I think it is OK, and if you need to, you can extend it later. https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Wed May 14 09:01:52 2025 From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits) Date: Wed, 14 May 2025 09:01:52 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <6824be70.050a0220.cb9fb.d29c@mx.google.com> kkwli wrote: The IBM Open XL Fortran compiler has slightly different syntax for the `prefetch_*` directives, e.g. `!ibm* prefetch_by_load (var, ...)`. I think the proposed syntax is consistent with other supported directives (without the parentheses). It looks fine to me. Thanks. 
https://www.ibm.com/docs/en/openxl-fortran-aix/17.1.3?topic=prefetch-by-load https://www.ibm.com/docs/en/openxl-fortran-aix/17.1.3?topic=prefetch-by-stream https://www.ibm.com/docs/en/openxl-fortran-aix/17.1.3?topic=prefetch-load https://www.ibm.com/docs/en/openxl-fortran-aix/17.1.3?topic=prefetch-store https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Wed May 14 09:02:39 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 09:02:39 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <6824be9f.170a0220.27aa65.afd3@mx.google.com> tmjbios wrote: Both Cray CCE `ftn` and Intel's `ifx` throw multiple errors with the Flang example (test) code in this pull request. Intel's `ifx` will warn, but not error, with the Cray example code. A current master branch `flang` will warn, but not error, with the Cray CCE example. So I would object to this new implementation's syntax being incompatible with the existing Fortran compilers. Is there a compelling reason for Flang to be different from the major vendors? This is especially concerning when there are existing codes in the wild which will break or see significantly degraded performance if they use `flang`. https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Wed May 14 09:03:11 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Wed, 14 May 2025 09:03:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <6824bebf.a70a0220.f979.f7f6@mx.google.com> akuhlens wrote: I will take a look at these today. 
https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Wed May 14 09:07:08 2025 From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits) Date: Wed, 14 May 2025 09:07:08 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <6824bfac.620a0220.2f5848.bf40@mx.google.com> kkwli wrote: One more thought. If we want to specialize the prefetch operation (e.g. for store or load) without introducing a new directive, the current syntax may be very limited (i.e. no way to distinguish a keyword and a variable name). https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Wed May 14 09:29:48 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 14 May 2025 09:29:48 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <6824c4fc.050a0220.2a9075.b485@mx.google.com> kiranchandramohan wrote: > Is there a compelling reason for Flang to be different from the major vendors? This is especially concerning when there are existing codes in the wild which will break or see significantly degraded performance if they use `flang`. The syntax proposed here is similar to the ones that are/were supported in pgfortran and classic-flang based compilers (AOCC, Huawei compilers, Arm compilers). They all had the syntax `!$mem prefetch [,[,...]]`. This was modified for use in Flang `!$dir prefetch [,[,...]]` to match other directives. 
https://docs.nvidia.com/hpc-sdk/pgi-compilers/19.1/x86/pgi-ref-guide/index.htm#prefetch https://developer.arm.com/documentation/101380/2404/Optimize/Directives/prefetch https://www.amd.com/content/dam/amd/en/documents/pdfs/developer/aocc/aocc-v4.0-ga-user-guide.pdf (Section 4.1.5) https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Wed May 14 09:39:11 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 14 May 2025 09:39:11 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <6824c72f.650a0220.745f4.d9f1@mx.google.com> kiranchandramohan wrote: > Both Cray CCE ftn and Intel's ifx throw multiple errors with the Flang example (test) code in this pull request. Is that because the prefetch directive is only applicable in limited contexts like loops? From the syntax in the links that you posted, it looks like the syntax accepted in this patch is a subset. https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Wed May 14 10:54:28 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 14 May 2025 10:54:28 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727) In-Reply-To: Message-ID: <6824d8d4.170a0220.352c29.0dda@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727
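The pull request in the notification above is titled "Replace recursion with iterative work queue". As a rough, self-contained illustration of that general technique (not the flang-rt code), the sketch below walks a tree without recursion. Each ticket keeps its own resumption state, `Begin()` runs once, `Continue()` runs until it reports completion, and the queue always resumes the most recently pushed ticket. All names here (`WorkStack`, `VisitTicket`, `kOkContinue`, `VisitIteratively`) are hypothetical and chosen only for the example; this sketch also assumes that a ticket which pushes nested work always suspends itself rather than finishing.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Status codes invented for this sketch: done, suspended, failed.
enum Status { kOk = 0, kOkContinue = 1, kError = 2 };

struct Node {
  int value;
  std::vector<Node> children;
};

class WorkStack;

// One resumable unit of work: visit a node, then each child in order.
struct VisitTicket {
  const Node *node;
  std::size_t nextChild{0}; // resumption state lives in the ticket
  bool begun{false};

  int Begin(WorkStack &);
  int Continue(WorkStack &);
};

class WorkStack {
public:
  void BeginVisit(const Node &n) { tickets_.push_back(VisitTicket{&n}); }
  int Run();
  std::vector<int> visited;

private:
  // std::deque keeps references to existing elements valid across push_back,
  // so a running ticket may safely push nested tickets.
  std::deque<VisitTicket> tickets_;
};

int VisitTicket::Begin(WorkStack &queue) {
  queue.visited.push_back(node->value);
  return kOkContinue; // ask to be resumed so the children get handled
}

int VisitTicket::Continue(WorkStack &queue) {
  if (nextChild < node->children.size()) {
    // Push one nested ticket and suspend; it runs to completion first.
    queue.BeginVisit(node->children[nextChild++]);
    return kOkContinue;
  }
  return kOk; // all children done
}

int WorkStack::Run() {
  while (!tickets_.empty()) {
    VisitTicket &ticket = tickets_.back(); // always the newest ticket
    int status;
    if (!ticket.begun) {
      ticket.begun = true;
      status = ticket.Begin(*this);
    } else {
      status = ticket.Continue(*this);
    }
    if (status == kOkContinue) {
      continue; // suspended; any nested ticket is now on top
    }
    assert(&ticket == &tickets_.back()); // a finishing ticket is on top
    tickets_.pop_back();
    if (status != kOk) {
      tickets_.clear(); // an error shuts the whole queue down
      return status;
    }
  }
  return kOk;
}

// Pre-order traversal with no recursion; stack depth stays constant.
std::vector<int> VisitIteratively(const Node &root) {
  WorkStack queue;
  queue.BeginVisit(root);
  int status = queue.Run();
  assert(status == kOk);
  (void)status;
  return queue.visited;
}
```

Because the newest ticket is always the one resumed, nested work completes in the order it was requested, which is the same order a recursive traversal would produce. The call depth no longer depends on the depth of the data.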


From flang-commits at lists.llvm.org Wed May 14 10:05:11 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 14 May 2025 10:05:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow flush of common block (PR #139528) In-Reply-To: Message-ID: <6824cd47.630a0220.163f65.7dac@mx.google.com> ================ @@ -2304,13 +2304,6 @@ void OmpStructureChecker::Leave(const parser::OpenMPFlushConstruct &x) { auto &flushList{std::get>(x.v.t)}; if (flushList) { - for (const parser::OmpArgument &arg : flushList->v) { - if (auto *sym{GetArgumentSymbol(arg)}; sym && !IsVariableListItem(*sym)) { ---------------- kiranchandramohan wrote: @kparzysz is this an issue with the function `IsVariableListItem`? ``` A variable list item is one of the following: • a variable that is not coindexed and that is not a substring; • an array section that is not coindexed and that does not contain an element that is a substring; • a named constant; • an associate name that may appear in a variable definition context; or • a common block name (enclosed in slashes). ``` https://github.com/llvm/llvm-project/pull/139528 From flang-commits at lists.llvm.org Wed May 14 11:18:10 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 14 May 2025 11:18:10 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727) In-Reply-To: Message-ID: <6824de62.170a0220.2118b2.e02a@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727


From flang-commits at lists.llvm.org Wed May 14 12:08:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 12:08:55 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add parser support for prefetch directive (PR #139702) In-Reply-To: Message-ID: <6824ea47.170a0220.24470.e005@mx.google.com> tmjbios wrote: Yes, sorta - this seems more of a superset of what we support in that we require more specificity from the user. CCE will tend to disallow prefetching a whole array in this manner. Instead we allow the user to specify a scalar or an array element along with a number of cache lines, whether it is for read or write, whether the data is temporal or non-temporal, and which level of cache to work with. I'm not suggesting anyone block or disapprove this PR - this is certainly a step in the right direction. I'm just reminded of xkcd 927. https://github.com/llvm/llvm-project/pull/139702 From flang-commits at lists.llvm.org Wed May 14 13:19:12 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 13:19:12 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723) In-Reply-To: Message-ID: <6824fac0.170a0220.1b5a32.07f7@mx.google.com> https://github.com/khaki3 closed https://github.com/llvm/llvm-project/pull/139723 From flang-commits at lists.llvm.org Wed May 14 13:19:12 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 13:19:12 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][acc] Remove async and structured flag from data actions (PR #139723) In-Reply-To: Message-ID: <6824fac0.050a0220.1417e0.0aaf@mx.google.com> khaki3 wrote: For now, we find the mentioned operands useful for optimization and debugging purposes, not going through parental constructs. 
https://github.com/llvm/llvm-project/pull/139723 From flang-commits at lists.llvm.org Wed May 14 13:40:07 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Wed, 14 May 2025 13:40:07 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Support -D for function-like macros (PR #139812) In-Reply-To: Message-ID: <6824ffa7.170a0220.bd8a4.f779@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/139812 From flang-commits at lists.llvm.org Wed May 14 13:54:47 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Wed, 14 May 2025 13:54:47 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <68250317.170a0220.2b617e.ec18@mx.google.com> https://github.com/TIFitis updated https://github.com/llvm/llvm-project/pull/139593


From flang-commits at lists.llvm.org Wed May 14 13:57:07 2025 From: flang-commits at lists.llvm.org (Razvan Lupusoru via flang-commits) Date: Wed, 14 May 2025 13:57:07 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Basic PFT to MLIR lowering for do concurrent locality specifiers (PR #138534) In-Reply-To: Message-ID: <682503a3.620a0220.390156.f69a@mx.google.com> ================ @@ -2029,9 +2031,33 @@ class FirConverter : public Fortran::lower::AbstractConverter { void handleLocalitySpecs(const IncrementLoopInfo &info) { Fortran::semantics::SemanticsContext &semanticsContext = bridge.getSemanticsContext(); - for (const Fortran::semantics::Symbol *sym : info.localSymList) + Fortran::lower::omp::DataSharingProcessor dsp( ---------------- razvanlupusoru wrote: > Would it be ok to add a todo and do this is a separate PR? Sure! Thank you! https://github.com/llvm/llvm-project/pull/138534 From flang-commits at lists.llvm.org Wed May 14 12:51:48 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 14 May 2025 12:51:48 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727) In-Reply-To: Message-ID: <6824f454.050a0220.164b8f.fbd5@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From 2a9ac1b9c76c045ffd099e9d01d226b27d9c7e91 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, and Destroy. Default derived type I/O is also recursive, but already disabled. 
It can be added to this new framework later if the overall approach succeeds. Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. --- .../include/flang-rt/runtime/work-queue.h | 299 ++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 541 ++++++++++-------- flang-rt/lib/runtime/derived.cpp | 487 ++++++++-------- flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 175 ++++++ flang/include/flang/Runtime/assign.h | 2 +- 7 files changed, 1036 insertions(+), 476 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..11e52a0b38bfb --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,299 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue is a list of tickets. Each ticket class has a Begin() +// member function that is called once, and a Continue() member function +// that can be called zero or more times. 
A ticket's execution terminates +// when either of these member functions returns a status other than +// StatOkContinue, and if that status is not StatOk, then the whole queue +// is shut down. +// +// By returning StatOkContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the reponsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentTicketBase, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. +// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. +// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. Run calls AssignTicket::Begin(), which pushes a tickets via +// BeginFinalize() and returns StatOkContinue. +// 3. 
FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatOkContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. + +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; +namespace typeInfo { +class DerivedType; +class Component; +class SpecialBinding; +} // namespace typeInfo + +// Ticket workers + +// Ticket workers return status codes. Returning StatOkContinue means +// that the ticket is incomplete and must be resumed; any other value +// means that the ticket is complete, and if not StatOk, the whole +// queue can be shut down due to an error. +static constexpr int StatOkContinue{1234}; + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +// Base class for ticket workers that operate elementwise over descriptors +// TODO: if ComponentTicketBase remains this class' only client, +// merge them for better comprehensibility. 
+class ElementalTicketBase { +protected: + RT_API_ATTRS ElementalTicketBase(const Descriptor &instance) + : instance_{instance} { + instance_.GetLowerBounds(subscripts_); + } + RT_API_ATTRS bool CueUpNextItem() const { return elementAt_ < elements_; } + RT_API_ATTRS void AdvanceToNextElement() { + phase_ = 0; + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + } + + const Descriptor &instance_; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + int phase_{0}; + SubscriptValue subscripts_[common::maxRank]; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentTicketBase : protected ElementalTicketBase { +protected: + RT_API_ATTRS ComponentTicketBase( + const Descriptor &instance, const typeInfo::DerivedType &derived); + RT_API_ATTRS bool CueUpNextItem(); + RT_API_ATTRS void AdvanceToNextComponent() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + ElementalTicketBase::Reset(); + component_ = nullptr; + componentAt_ = 0; + } + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Implements derived type instance initialization +class InitializeTicket : private ComponentTicketBase { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ComponentTicketBase{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket : private ComponentTicketBase { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + 
bool hasStat, const Descriptor *errMsg) + : ComponentTicketBase{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatOkContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : private ComponentTicketBase { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ComponentTicketBase{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : private ComponentTicketBase { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ComponentTicketBase{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : to_{to}, from_{&from}, flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + RT_API_ATTRS void GetDefinedAssignment(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + 
const typeInfo::SpecialBinding *scalarDefinedAssignment_{nullptr}; + const typeInfo::SpecialBinding *elementalDefinedAssignment_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type intrinsic assignment +class DerivedAssignTicket : private ComponentTicketBase { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ComponentTicketBase{to, derived}, from_{from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS void AdvanceToNextElement(); + RT_API_ATTRS void Reset(); + +private: + const Descriptor &from_; + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + SubscriptValue fromSubscripts_[common::maxRank]; + StaticDescriptor fromComponentDescriptor_; +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + RT_API_ATTRS void BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived); + RT_API_ATTRS void BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg); + RT_API_ATTRS void BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived); + RT_API_ATTRS void 
BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize); + RT_API_ATTRS void BeginAssign( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct); + RT_API_ATTRS void BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter); + + RT_API_ATTRS int Run(); + +private: + // Most uses of the work queue won't go very deep. + static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. 
@@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 4a813cd489022..02f1e49dbace1 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -99,11 +100,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncId)); } // least <= 0, most >= 0 @@ -228,6 +225,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic assignments and // those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -241,274 +240,358 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? 
toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + workQueue.BeginAssign(to, from, flags, memmoveFct); + workQueue.Run(); +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. A user-defined assignment TBP defines all of + // the semantics, including allocatable (re)allocation and any + // finalization. + // + // Note that the aliasing and LHS (re)allocation handling below + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. + GetDefinedAssignment(); + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. 
+ toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + } else if (!IsSimpleMemmove() || scalarDefinedAssignment_ || + elementalDefinedAssignment_) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. - // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. 
- RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); - } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncId))}; + stat != StatOk) { + return stat; } - return; - } - } - if (to.IsAllocatable()) { - if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. 
+ if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + workQueue.BeginInitialize(newFrom, *derived); + } + } } + workQueue.BeginAssign( + newFrom, *from_, MaybeReallocate | PolymorphicLHS, memmoveFct_); } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; - } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + if (toDeallocate_ && toDerived_ && (flags_ & NeedFinalization)) { + // Schedule finalization for the RHS temporary or old LHS. + workQueue.BeginFinalize(*toDeallocate_, *toDerived_); + flags_ &= ~NeedFinalization; } } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. 
- if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( - typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + if (to_.IsAllocatable()) { + if (mustDeallocateLHS) { + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (const auto *special{toDerived->FindSpecialBinding( - typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); - } + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. 
+ if (flags_ & NeedFinalization) { + workQueue.BeginFinalize(to_, *toDerived_); + } else if (!toDerived_->noDestructionNeeded()) { + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false); + } } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + return StatOkContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); + } + return StatOk; } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", - toElementBytes, fromElementBytes); + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; } - // Copy the data components (incl. the parent) first. 
- const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. - int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // 
deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. - // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } + if (const auto *addendum{to_.Addendum()}) { + if (const auto *derived{addendum->derivedType()}; derived != toDerived_) { + toDerived_ = derived; + scalarDefinedAssignment_ = nullptr; + elementalDefinedAssignment_ = nullptr; + GetDefinedAssignment(); } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); + } + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + workQueue.BeginInitialize(to_, *toDerived_); + if (scalarDefinedAssignment_ || elementalDefinedAssignment_) { + return StatOkContinue; // finish that now before defined assignment } } - } else { // intrinsic type, intrinsic assignment - if 
(isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + } + // Defined assignment? + if (scalarDefinedAssignment_) { + DoScalarDefinedAssignment(to_, *from_, *scalarDefinedAssignment_); + done_ = true; + return StatOkContinue; + } else if (elementalDefinedAssignment_) { + DoElementalDefinedAssignment( + to_, *from_, *toDerived_, *elementalDefinedAssignment_); + done_ = true; + return StatOkContinue; + } + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); + } + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); + } + std::size_t toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", + toElementBytes, fromElementBytes); + } + if (toDerived_) { + workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_); + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatOkContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS void AssignTicket::GetDefinedAssignment() { + if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + scalarDefinedAssignment_ = toDerived_->FindSpecialBinding( + typeInfo::SpecialBinding::Which::ScalarAssignment); + } + if (!scalarDefinedAssignment_) { + elementalDefinedAssignment_ = toDerived_->FindSpecialBinding( + typeInfo::SpecialBinding::Which::ElementalAssignment); + } + } +} + +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + from_.GetLowerBounds(fromSubscripts_); + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + std::size_t numProcPtrs{procPtrDesc.Elements()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + for (; ElementalTicketBase::CueUpNextItem(); AdvanceToNextElement()) { + memmoveFct_(instance_.Element(subscripts_) + procPtr.offset, + from_.Element(fromSubscripts_) + procPtr.offset, + sizeof(typeInfo::ProcedurePointer)); + } + ElementalTicketBase::Reset(); + } + return StatOkContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + for (; CueUpNextItem(); AdvanceToNextElement()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, from_, workQueue.terminator(), fromSubscripts_); + AdvanceToNextElement(); + workQueue.BeginAssign(toCompDesc, fromCompDesc, flags_, memmoveFct_); + return StatOkContinue; + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_.Element(fromSubscripts_) + component_->offset(), + componentByteSize); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_.Element(fromSubscripts_) + component_->offset(), + componentByteSize); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_.Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false); + return StatOkContinue; + } + } + toDesc->Deallocate(); + } + } else { + // Allocatable components of the LHS are unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" 
assignment to a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + workQueue.BeginAssign( + *toDesc, *fromDesc, flags_ | DeallocateLHS, memmoveFct_); + AdvanceToNextElement(); + return StatOkContinue; + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} + +RT_API_ATTRS void DerivedAssignTicket::AdvanceToNextElement() { + ComponentTicketBase::AdvanceToNextElement(); + from_.IncrementSubscripts(fromSubscripts_); +} + +RT_API_ATTRS void DerivedAssignTicket::Reset() { + ComponentTicketBase::Reset(); + from_.GetLowerBounds(fromSubscripts_); +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -578,7 +661,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -597,8 +679,9 @@ void RTDEF(CopyOutAssign)( // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
- if (var) + if (var) { Assign(*var, temp, terminator, NoAssignFlags); + } temp.Destroy(/*finalize=*/false, /*destroyPointers=*/false, &terminator); } diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..0f461f529fae6 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,174 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + workQueue.BeginInitialize(instance, derived); + return workQueue.Run(); +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + std::size_t myProcPtrs{procPtrDesc.Elements()}; + for (std::size_t k{0}; k < myProcPtrs; ++k) { const auto &comp{ - *componentDesc.ZeroBasedIndexedElement<typeInfo::Component>(k)}; + *procPtrDesc.ZeroBasedIndexedElement<typeInfo::ProcPtrComponent>(k)}; SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; 
instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent<Descriptor>(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + instance_.GetLowerBounds(at); + for (std::size_t j{0}; j++ < elements_; instance_.IncrementSubscripts(at)) { + auto &pptr{*instance_.ElementComponent<typeInfo::ProcedurePointer>( + at, comp.offset)}; + pptr = comp.procInitialization; + } + } + return StatOkContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (CueUpNextItem()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; elementAt_ < elements_; AdvanceToNextElement()) { + Descriptor &allocDesc{*instance_.ElementComponent<Descriptor>( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent<char>(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; elementAt_ < elements_; AdvanceToNextElement()) { + char *ptr{instance_.ElementComponent<char>( + subscripts_, component_->offset())}; std::memcpy(ptr, 
init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent<Descriptor>(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; elementAt_ < elements_; AdvanceToNextElement()) { + Descriptor &ptrDesc{*instance_.ElementComponent<Descriptor>( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } - } - } - } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + AdvanceToNextElement(); + workQueue.BeginInitialize(compDesc, compType); + return StatOkContinue; + } else { + AdvanceToNextComponent(); } } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + workQueue.BeginInitializeClone(clone, original, derived, hasStat, errMsg); + return workQueue.Run(); } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (CueUpNextItem()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. - stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. 
- if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); - } + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncId), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + workQueue.BeginInitialize(cloneDesc, *derived); + return StatOkContinue; } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_); + return StatOkContinue; + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + AdvanceToNextElement(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_); + AdvanceToNextElement(); + return StatOkContinue; // will resume at next element in this component + } else { + AdvanceToNextComponent(); } + } else { + AdvanceToNextComponent(); } } - return stat; + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + workQueue.BeginFinalize(descriptor, derived); + workQueue.Run(); + } } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +216,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +253,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,86 +277,84 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with the same // descriptor so that the rank is 
preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (!finalizableParentType_->noFinalizationNeeded()) { + componentAt_ = 1; + } else { + finalizableParentType_ = nullptr; + } + } + return StatOkContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (CueUpNextItem()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); - } + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + AdvanceToNextElement(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + workQueue.BeginFinalize(compDesc, *compDynamicType); + return StatOkContinue; } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + AdvanceToNextElement(); + if (compDesc.IsAllocated()) { + workQueue.BeginFinalize(compDesc, *compType); } + } else { + AdvanceToNextComponent(); } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && 
!comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + AdvanceToNextElement(); + workQueue.BeginFinalize(compDesc, compType); + return StatOkContinue; + } else { + AdvanceToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + workQueue.BeginFinalize(tmpDesc, *finalizableParentType_); + finalizableParentType_ = nullptr; + return StatOkContinue; + } else { + return StatOk; } } @@ -373,51 +364,61 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + workQueue.BeginDestroy(descriptor, derived, finalize); + workQueue.Run(); } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + workQueue.BeginFinalize(instance_, derived_); } + return StatOkContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. // Contrary to finalization, the order of deallocation does not matter. 
- const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (CueUpNextItem()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + workQueue.BeginDestroy(*d, *componentDerived, /*finalize=*/false); + return StatOkContinue; + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + AdvanceToNextElement(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || componentDerived->noDestructionNeeded()) { + AdvanceToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + 
GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + AdvanceToNextElement(); + workQueue.BeginDestroy(compDesc, *componentDerived, /*finalize=*/false); + return StatOkContinue; } + } else { + AdvanceToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..0ae50e72bb3a9 --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,175 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS ComponentTicketBase::ComponentTicketBase( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ElementalTicketBase{instance}, derived_{derived}, + components_{derived.component().Elements()} {} + +RT_API_ATTRS bool ComponentTicketBase::CueUpNextItem() { + bool elementsDone{!ElementalTicketBase::CueUpNextItem()}; + if (elementsDone) { + component_ = nullptr; + ++componentAt_; + } + if (!component_) { + if (componentAt_ >= components_) { + return false; // done! + } + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + if (elementsDone) { + ElementalTicketBase::Reset(); + } + } + return true; +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + } + firstFree_ = next; + } +} + +RT_API_ATTRS void WorkQueue::BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + StartTicket().u.emplace(descriptor, derived); +} + +RT_API_ATTRS void WorkQueue::BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const 
typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); +} + +RT_API_ATTRS void WorkQueue::BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + StartTicket().u.emplace(descriptor, derived); +} + +RT_API_ATTRS void WorkQueue::BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + StartTicket().u.emplace(descriptor, derived, finalize); +} + +RT_API_ATTRS void WorkQueue::BeginAssign( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) { + StartTicket().u.emplace(to, from, flags, memmoveFct); +} + +RT_API_ATTRS void WorkQueue::BeginDerivedAssign(Descriptor &to, + const Descriptor &from, const typeInfo::DerivedType &derived, int flags, + MemmoveFct memmoveFct, Descriptor *deallocateAfter) { + StartTicket().u.emplace( + to, from, derived, flags, memmoveFct, deallocateAfter); +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? 
insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; + int stat{at->ticket.Continue(*this)}; + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatOkContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime \ No newline at end of file diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION From flang-commits at lists.llvm.org Wed May 14 14:29:43 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Wed, 14 May 2025 14:29:43 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add missing copy constructor (PR #139966) Message-ID: https://github.com/ashermancinelli created https://github.com/llvm/llvm-project/pull/139966 On some compilers the implicit copy constructor was issuing 
a warning.

From flang-commits at lists.llvm.org Wed May 14 14:30:25 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 14:30:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add missing copy constructor (PR #139966) In-Reply-To: Message-ID: <68250b71.170a0220.12d9be.1358@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Asher Mancinelli (ashermancinelli)
Changes On some compilers the implicit copy constructor was issuing a warning. --- Full diff: https://github.com/llvm/llvm-project/pull/139966.diff 1 Files Affected: - (modified) flang/include/flang/Semantics/symbol.h (+1) ``````````diff diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 97c1e30631840..4cded64d170cd 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -600,6 +600,7 @@ class TypeParamDetails { public: TypeParamDetails() = default; TypeParamDetails(const TypeParamDetails &) = default; + TypeParamDetails &operator=(const TypeParamDetails &) = default; std::optional<common::TypeParamAttr> attr() const { return attr_; } TypeParamDetails &set_attr(common::TypeParamAttr); MaybeIntExpr &init() { return init_; } ``````````
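For readers following the naming question in this thread: the line added by the diff declares a copy assignment operator, not a copy constructor. The following is a minimal sketch of the distinction using a hypothetical `Details` class and a hypothetical `RunCopyDemo()` helper, neither of which is flang code. It assumes, without confirmation from the PR, that the warning is of the deprecated-implicit-copy kind that some compilers emit when a class declares its copy constructor but leaves the assignment operator implicit.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a class shaped like TypeParamDetails.
class Details {
public:
  Details() = default;
  // Copy constructor: builds a new object from an existing one.
  Details(const Details &) = default;
  // Copy assignment operator: overwrites an already-constructed object.
  // With a user-declared copy constructor, relying on the implicitly
  // generated assignment operator is deprecated in C++11 and later, so
  // declaring it explicitly avoids that class of warning (assumed here).
  Details &operator=(const Details &) = default;

  std::optional<int> attr() const { return attr_; }
  Details &set_attr(int attr) {
    attr_ = attr;
    return *this;
  }

private:
  std::optional<int> attr_;
};

// Exercises both special member functions and returns the assigned value.
inline int RunCopyDemo() {
  Details original;
  original.set_attr(42);
  Details constructed{original}; // invokes the copy constructor
  Details assigned;
  assigned = original; // invokes the copy assignment operator
  assert(constructed.attr() == original.attr());
  assert(assigned.attr().has_value());
  return *assigned.attr();
}
```

Calling `RunCopyDemo()` from a `main` function runs both paths; the two differ only in whether the destination object already exists.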
https://github.com/llvm/llvm-project/pull/139966 From flang-commits at lists.llvm.org Wed May 14 14:33:26 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 14 May 2025 14:33:26 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add missing copy constructor (PR #139966) In-Reply-To: Message-ID: <68250c26.a70a0220.25bb0e.3470@mx.google.com> klausler wrote: > On some compilers the implicit copy constructor was issuing a warning. It's a copy assignment operator. Which compilers complain? https://github.com/llvm/llvm-project/pull/139966 From flang-commits at lists.llvm.org Wed May 14 14:36:17 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Wed, 14 May 2025 14:36:17 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add missing copy constructor (PR #139966) In-Reply-To: Message-ID: <68250cd1.170a0220.2bc9cb.e943@mx.google.com> https://github.com/ashermancinelli edited https://github.com/llvm/llvm-project/pull/139966 From flang-commits at lists.llvm.org Wed May 14 15:49:35 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 14 May 2025 15:49:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang][docs] Document technique for regenerating a module hermetically (PR #139975) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/139975 A flang-new module file is Fortran source, so it can be recompiled with the `-fhermetic-module-files` option to convert it into a hermetic one. >From 3635e87fe94b3158ec094c329f1196e10cc54c92 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 14 May 2025 15:46:13 -0700 Subject: [PATCH] [flang][docs] Document technique for regenerating a module hermetically A flang-new module file is Fortran source, so it can be recompiled with the `-fhermetic-module-files` option to convert it into a hermetic one. 
--- flang/docs/ModFiles.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/flang/docs/ModFiles.md b/flang/docs/ModFiles.md index dd0ade5cebbfc..fc05c2677fc26 100644 --- a/flang/docs/ModFiles.md +++ b/flang/docs/ModFiles.md @@ -171,3 +171,14 @@ modules of dependent libraries need not also be packaged with the library. When the compiler reads a hermetic module file, the copies of the dependent modules are read into their own scope, and will not conflict with other modules of the same name that client code might `USE`. + +One can use the `-fhermetic-module-files` option when building the top-level +module files of a library for which not all of the implementation modules +will (or can) be shipped. + +It is also possible to convert a default module file to a hermetic one after +the fact. +Since module files are Fortran source, simply copy the module file to a new +temporary free form Fortran source file and recompile it (`-fsyntax-only`) +with the `-fhermetic-module-files` flag, and that will regenerate the module +file in place with all of its dependent modules included. From flang-commits at lists.llvm.org Wed May 14 15:55:18 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 14 May 2025 15:55:18 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727) In-Reply-To: Message-ID: <68251f56.050a0220.21daf1.1c59@mx.google.com> klausler wrote: I've debugged and fixed all of the failures of the work queue implementation exposed by the GNU Fortran test suite and a proprietary one. Please review when convenient. 
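The idea behind the work queue in PR #137727, replacing runtime recursion with resumable tickets that return `StatOk` when finished or `StatOkContinue` when they must be revisited, can be sketched in a simplified form. Everything below is a hypothetical illustration under stated assumptions: the `Node` tree, the `Record` method, and the `PreorderValues` helper are invented here, a `std::deque` stands in for the patch's linked `TicketList`, and only the status names and the `Continue`/`Run` shape echo the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Status codes named after those in the patch; the values are assumed.
enum Stat { StatOk = 0, StatOkContinue = 1 };

// Hypothetical nested data standing in for derived-type components.
struct Node {
  int value;
  std::vector<Node> children;
};

class WorkQueue;

// One unit of resumable work: handles a node, then each child in turn.
struct Ticket {
  explicit Ticket(const Node &n) : node{&n} {}
  int Continue(WorkQueue &queue);
  const Node *node;
  std::size_t childAt{0}; // saved position so that work can resume
  bool begun{false};
};

class WorkQueue {
public:
  void BeginVisit(const Node &node) { stack_.emplace_back(node); }
  void Record(int value) { visited_.push_back(value); }
  const std::vector<int> &visited() const { return visited_; }

  int Run() {
    while (!stack_.empty()) {
      // Service the most recently added ticket, giving depth-first order.
      // std::deque keeps this reference valid across push_back calls.
      Ticket &ticket = stack_.back();
      int stat{ticket.Continue(*this)};
      if (stat == StatOk) {
        // A finished ticket never leaves new work above itself.
        assert(&ticket == &stack_.back());
        stack_.pop_back();
      } else if (stat != StatOkContinue) {
        stack_.clear(); // error: abandon all pending work
        return stat;
      }
    }
    return StatOk;
  }

private:
  std::deque<Ticket> stack_;
  std::vector<int> visited_;
};

inline int Ticket::Continue(WorkQueue &queue) {
  if (!begun) {
    begun = true;
    queue.Record(node->value); // the "Begin" step for this node
    return StatOkContinue;
  }
  if (childAt < node->children.size()) {
    // Queue nested work instead of recursing, then ask to be resumed.
    queue.BeginVisit(node->children[childAt++]);
    return StatOkContinue;
  }
  return StatOk; // nothing left to do for this node
}

// Visits the whole tree without using the call stack for nesting depth.
inline std::vector<int> PreorderValues(const Node &root) {
  WorkQueue queue;
  queue.BeginVisit(root);
  queue.Run();
  return queue.visited();
}
```

The design point the sketch tries to capture is that each ticket advances its own position before queuing nested work, so that when it is resumed it continues with the next item rather than repeating the previous one.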
https://github.com/llvm/llvm-project/pull/137727 From flang-commits at lists.llvm.org Wed May 14 19:02:07 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 14 May 2025 19:02:07 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <68254b1f.050a0220.33dbc7.287e@mx.google.com> https://github.com/NexMing edited https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Wed May 14 23:49:29 2025 From: flang-commits at lists.llvm.org (Jean-Didier PAILLEUX via flang-commits) Date: Wed, 14 May 2025 23:49:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ [NO]INLINE and FORCEINLINE directives (PR #134350) In-Reply-To: Message-ID: <68258e79.050a0220.175682.5e69@mx.google.com> JDPailleux wrote: Hi, you can look at the PR (merged) concerning attributes for LLVM function operation in MLIR here : https://github.com/llvm/llvm-project/pull/133726 and https://github.com/llvm/llvm-project/pull/134582 https://github.com/llvm/llvm-project/pull/134350 From flang-commits at lists.llvm.org Thu May 15 01:03:53 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 01:03:53 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <68259fe9.050a0220.19a6cd.10a1@mx.google.com> NexMing wrote: > I tend to prefer @jeanPerier's suggestion of having the option name reflect that only the LLVM dialect is being used. > > In the future, do you intend to provide a way to choose the set of dialects to use? In that case, we could consider something a bit more general like @tblah's suggestion. Currently, using the LLVM dialect is only a temporary solution and doesn’t align with my long-term goals. 
In the future, I plan to implement conversions from FIR to scf, memref, and other core dialects. I personally prefer @tblah’s suggestion, but as @jeanPerier pointed out, I’m indeed not yet sure how to make a difference between the different levels of representation that the core dialects. I think the appropriate core MLIR dialects are those that most closely reflect the original FIR representation. Do you have any suggestions on how to approach this? https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Thu May 15 01:31:08 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Thu, 15 May 2025 01:31:08 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <6825a64c.170a0220.337679.38db@mx.google.com> clementval wrote: >Currently, using the LLVM dialect is only a temporary solution and doesn’t align with my long-term goals. Who said it is a temporary solution? Can you point to an RFC? https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Thu May 15 02:45:17 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 15 May 2025 02:45:17 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <6825b7ad.170a0220.31c209.4f29@mx.google.com> tblah wrote: > Who said it is a temporary solution? Can you point to an RFC? 
This specific patch wasn't discussed in the RFC but there is some discussion here https://discourse.llvm.org/t/rfc-add-fir-affine-optimization-fir-pass-pipeline/86190/5 https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Thu May 15 02:50:29 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 15 May 2025 02:50:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ [NO]INLINE and FORCEINLINE directives (PR #134350) In-Reply-To: Message-ID: <6825b8e5.170a0220.310b6.10e9@mx.google.com> tblah wrote: Apologies if I have misunderstood. I thought this patch was blocked because you were unsure how to handle this directive being applied to call operations while LLVM IR requires them to be applied to function operations. ([1](https://llvm.org/docs/LangRef.html#function-attributes)) ([2](https://llvm.org/docs/LangRef.html#call-site-attributes)) I am not a maintainer of the LLVM MLIR dialect, but I am surprised that those patches were accepted when they do not appear to model what is legal in LLVM IR. How are those new MLIR attributes translated into LLVM IR? If this works correctly, what is blocking this patch/how can I help? https://github.com/llvm/llvm-project/pull/134350 From flang-commits at lists.llvm.org Thu May 15 03:21:07 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Thu, 15 May 2025 03:21:07 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <6825c013.170a0220.c19ce.3f15@mx.google.com> clementval wrote: > > Who said it is a temporary solution? Can you point to an RFC? 
> > This specific patch wasn't discussed in the RFC but there is some discussion here https://discourse.llvm.org/t/rfc-add-fir-affine-optimization-fir-pass-pipeline/86190/5 So is that only for the affine pipeline? https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Thu May 15 03:23:40 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Thu, 15 May 2025 03:23:40 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [Flang][MLIR][OpenMP] Improve use_device_* handling (PR #137198) In-Reply-To: Message-ID: <6825c0ac.050a0220.460ce.43b0@mx.google.com> https://github.com/skatrak updated https://github.com/llvm/llvm-project/pull/137198 >From 59bc1c921519967d71837a0833023a6dbccf9045 Mon Sep 17 00:00:00 2001 From: Sergio Afonso Date: Fri, 11 Apr 2025 13:30:38 +0100 Subject: [PATCH] [Flang][MLIR][OpenMP] Improve use_device_* handling This patch updates MLIR op verifiers for operations taking arguments that must always be defined by an `omp.map.info` operation to check this requirement. It also modifies Flang lowering for `use_device_{addr, ptr}`, as well as the custom MLIR printer and parser for these clauses, to support initializing it to `OMP_MAP_RETURN_PARAM` and represent this in the MLIR representation as `return_param`. This internal mapping flag is what eventually is used for variables passed via these clauses into the target region when translating to LLVM IR, so making it explicit in Flang and MLIR removes an inconsistency in the current representation. 
--- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 6 +-- flang/lib/Lower/OpenMP/Utils.cpp | 8 ++-- .../Fir/convert-to-llvm-openmp-and-fir.fir | 5 +- flang/test/Lower/OpenMP/target.f90 | 2 +- mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | 47 +++++++++++++++---- mlir/test/Dialect/OpenMP/ops.mlir | 10 ++-- 6 files changed, 57 insertions(+), 21 deletions(-) diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index f4876256a378f..02454543d0a60 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1407,8 +1407,7 @@ bool ClauseProcessor::processUseDeviceAddr( const parser::CharBlock &source) { mlir::Location location = converter.genLocation(source); llvm::omp::OpenMPOffloadMappingFlags mapTypeBits = - llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO | - llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_RETURN_PARAM; processMapObjects(stmtCtx, location, clause.v, mapTypeBits, parentMemberIndices, result.useDeviceAddrVars, useDeviceSyms); @@ -1429,8 +1428,7 @@ bool ClauseProcessor::processUseDevicePtr( const parser::CharBlock &source) { mlir::Location location = converter.genLocation(source); llvm::omp::OpenMPOffloadMappingFlags mapTypeBits = - llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO | - llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_RETURN_PARAM; processMapObjects(stmtCtx, location, clause.v, mapTypeBits, parentMemberIndices, result.useDevicePtrVars, useDeviceSyms); diff --git a/flang/lib/Lower/OpenMP/Utils.cpp b/flang/lib/Lower/OpenMP/Utils.cpp index 3f4cfb8c11a9d..173dceb07b193 100644 --- a/flang/lib/Lower/OpenMP/Utils.cpp +++ b/flang/lib/Lower/OpenMP/Utils.cpp @@ -398,14 +398,16 @@ mlir::Value createParentSymAndGenIntermediateMaps( interimBounds, treatIndexAsSection); } - // Remove all map TO, FROM and TOFROM bits, from the intermediate - // allocatable maps, 
we simply wish to alloc or release them. It may be - // safer to just pass OMP_MAP_NONE as the map type, but we may still + // Remove all map-type bits (e.g. TO, FROM, etc.) from the intermediate + // allocatable maps, as we simply wish to alloc or release them. It may + // be safer to just pass OMP_MAP_NONE as the map type, but we may still // need some of the other map types the mapped member utilises, so for // now it's good to keep an eye on this. llvm::omp::OpenMPOffloadMappingFlags interimMapType = mapTypeBits; interimMapType &= ~llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; interimMapType &= ~llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; + interimMapType &= + ~llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_RETURN_PARAM; // Create a map for the intermediate member and insert it and it's // indices into the parentMemberIndices list to track it. diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir index 8019ecf7f6a05..b13921f822b4d 100644 --- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir +++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir @@ -423,14 +423,15 @@ func.func @_QPopenmp_target_data_region() { func.func @_QPomp_target_data_empty() { %0 = fir.alloca !fir.array<1024xi32> {bindc_name = "a", uniq_name = "_QFomp_target_data_emptyEa"} - omp.target_data use_device_addr(%0 -> %arg0 : !fir.ref>) { + %1 = omp.map.info var_ptr(%0 : !fir.ref>, !fir.ref>) map_clauses(return_param) capture(ByRef) -> !fir.ref> {name = ""} + omp.target_data use_device_addr(%1 -> %arg0 : !fir.ref>) { omp.terminator } return } // CHECK-LABEL: llvm.func @_QPomp_target_data_empty -// CHECK: omp.target_data use_device_addr(%1 -> %{{.*}} : !llvm.ptr) { +// CHECK: omp.target_data use_device_addr(%{{.*}} -> %{{.*}} : !llvm.ptr) { // CHECK: } // ----- diff --git a/flang/test/Lower/OpenMP/target.f90 b/flang/test/Lower/OpenMP/target.f90 index 4815e6564fc7e..f04aacc63fc2b 100644 --- 
a/flang/test/Lower/OpenMP/target.f90 +++ b/flang/test/Lower/OpenMP/target.f90 @@ -544,7 +544,7 @@ subroutine omp_target_device_addr !CHECK: %[[VAL_0_DECL:.*]]:2 = hlfir.declare %[[VAL_0]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFomp_target_device_addrEa"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) !CHECK: %[[MAP_MEMBERS:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>, i32) map_clauses(tofrom) capture(ByRef) var_ptr_ptr({{.*}} : !fir.llvm_ptr>) -> !fir.llvm_ptr> {name = ""} !CHECK: %[[MAP:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>, !fir.box>) map_clauses(to) capture(ByRef) members(%[[MAP_MEMBERS]] : [0] : !fir.llvm_ptr>) -> !fir.ref>> {name = "a"} - !CHECK: %[[DEV_ADDR_MEMBERS:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>, i32) map_clauses(tofrom) capture(ByRef) var_ptr_ptr({{.*}} : !fir.llvm_ptr>) -> !fir.llvm_ptr> {name = ""} + !CHECK: %[[DEV_ADDR_MEMBERS:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>, i32) map_clauses(return_param) capture(ByRef) var_ptr_ptr({{.*}} : !fir.llvm_ptr>) -> !fir.llvm_ptr> {name = ""} !CHECK: %[[DEV_ADDR:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>, !fir.box>) map_clauses(to) capture(ByRef) members(%[[DEV_ADDR_MEMBERS]] : [0] : !fir.llvm_ptr>) -> !fir.ref>> {name = "a"} !CHECK: omp.target_data map_entries(%[[MAP]], %[[MAP_MEMBERS]] : {{.*}}) use_device_addr(%[[DEV_ADDR]] -> %[[ARG_0:.*]], %[[DEV_ADDR_MEMBERS]] -> %[[ARG_1:.*]] : !fir.ref>>, !fir.llvm_ptr>) { !$omp target data map(tofrom: a) use_device_addr(a) diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp index 2bf7aaa46db11..deff86d5c5ecb 100644 --- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp +++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp @@ -1521,6 +1521,9 @@ static ParseResult parseMapClause(OpAsmParser &parser, IntegerAttr &mapType) { if (mapTypeMod == "delete") mapTypeBits |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_DELETE; + if (mapTypeMod == "return_param") + mapTypeBits |= 
llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_RETURN_PARAM; + return success(); }; @@ -1583,6 +1586,12 @@ static void printMapClause(OpAsmPrinter &p, Operation *op, emitAllocRelease = false; mapTypeStrs.push_back("delete"); } + if (mapTypeToBitFlag( + mapTypeBits, + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_RETURN_PARAM)) { + emitAllocRelease = false; + mapTypeStrs.push_back("return_param"); + } if (emitAllocRelease) mapTypeStrs.push_back("exit_release_or_enter_alloc"); @@ -1777,6 +1786,17 @@ static LogicalResult verifyPrivateVarsMapping(TargetOp targetOp) { // MapInfoOp //===----------------------------------------------------------------------===// +static LogicalResult verifyMapInfoDefinedArgs(Operation *op, + StringRef clauseName, + OperandRange vars) { + for (Value var : vars) + if (!llvm::isa_and_present(var.getDefiningOp())) + return op->emitOpError() + << "'" << clauseName + << "' arguments must be defined by 'omp.map.info' ops"; + return success(); +} + LogicalResult MapInfoOp::verify() { if (getMapperId() && !SymbolTable::lookupNearestSymbolFrom( @@ -1784,6 +1804,9 @@ LogicalResult MapInfoOp::verify() { return emitError("invalid mapper id"); } + if (failed(verifyMapInfoDefinedArgs(*this, "members", getMembers()))) + return failure(); + return success(); } @@ -1805,6 +1828,15 @@ LogicalResult TargetDataOp::verify() { "At least one of map, use_device_ptr_vars, or " "use_device_addr_vars operand must be present"); } + + if (failed(verifyMapInfoDefinedArgs(*this, "use_device_ptr", + getUseDevicePtrVars()))) + return failure(); + + if (failed(verifyMapInfoDefinedArgs(*this, "use_device_addr", + getUseDeviceAddrVars()))) + return failure(); + return verifyMapClause(*this, getMapVars()); } @@ -1889,16 +1921,15 @@ void TargetOp::build(OpBuilder &builder, OperationState &state, } LogicalResult TargetOp::verify() { - LogicalResult verifyDependVars = - verifyDependVarList(*this, getDependKinds(), getDependVars()); - - if (failed(verifyDependVars)) - return 
verifyDependVars; + if (failed(verifyDependVarList(*this, getDependKinds(), getDependVars()))) + return failure(); - LogicalResult verifyMapVars = verifyMapClause(*this, getMapVars()); + if (failed(verifyMapInfoDefinedArgs(*this, "has_device_addr", + getHasDeviceAddrVars()))) + return failure(); - if (failed(verifyMapVars)) - return verifyMapVars; + if (failed(verifyMapClause(*this, getMapVars()))) + return failure(); return verifyPrivateVarsMapping(*this); } diff --git a/mlir/test/Dialect/OpenMP/ops.mlir b/mlir/test/Dialect/OpenMP/ops.mlir index b7e16b7ec35e2..a9e4af035dbd7 100644 --- a/mlir/test/Dialect/OpenMP/ops.mlir +++ b/mlir/test/Dialect/OpenMP/ops.mlir @@ -802,10 +802,14 @@ func.func @omp_target_data (%if_cond : i1, %device : si32, %device_ptr: memref, tensor) map_clauses(always, from) capture(ByRef) -> memref {name = ""} omp.target_data if(%if_cond) device(%device : si32) map_entries(%mapv1 : memref){} - // CHECK: %[[MAP_A:.*]] = omp.map.info var_ptr(%[[VAL_2:.*]] : memref, tensor) map_clauses(close, present, to) capture(ByRef) -> memref {name = ""} - // CHECK: omp.target_data map_entries(%[[MAP_A]] : memref) use_device_addr(%[[VAL_3:.*]] -> %{{.*}} : memref) use_device_ptr(%[[VAL_4:.*]] -> %{{.*}} : memref) + // CHECK: %[[MAP_A:.*]] = omp.map.info var_ptr(%{{.*}} : memref, tensor) map_clauses(close, present, to) capture(ByRef) -> memref {name = ""} + // CHECK: %[[DEV_ADDR:.*]] = omp.map.info var_ptr(%{{.*}} : memref, tensor) map_clauses(return_param) capture(ByRef) -> memref {name = ""} + // CHECK: %[[DEV_PTR:.*]] = omp.map.info var_ptr(%{{.*}} : memref, tensor) map_clauses(return_param) capture(ByRef) -> memref {name = ""} + // CHECK: omp.target_data map_entries(%[[MAP_A]] : memref) use_device_addr(%[[DEV_ADDR]] -> %{{.*}} : memref) use_device_ptr(%[[DEV_PTR]] -> %{{.*}} : memref) %mapv2 = omp.map.info var_ptr(%map1 : memref, tensor) map_clauses(close, present, to) capture(ByRef) -> memref {name = ""} - omp.target_data map_entries(%mapv2 : memref) 
use_device_addr(%device_addr -> %arg0 : memref) use_device_ptr(%device_ptr -> %arg1 : memref) { + %device_addrv1 = omp.map.info var_ptr(%device_addr : memref, tensor) map_clauses(return_param) capture(ByRef) -> memref {name = ""} + %device_ptrv1 = omp.map.info var_ptr(%device_ptr : memref, tensor) map_clauses(return_param) capture(ByRef) -> memref {name = ""} + omp.target_data map_entries(%mapv2 : memref) use_device_addr(%device_addrv1 -> %arg0 : memref) use_device_ptr(%device_ptrv1 -> %arg1 : memref) { omp.terminator } From flang-commits at lists.llvm.org Thu May 15 03:35:01 2025 From: flang-commits at lists.llvm.org (=?UTF-8?Q?Martin_Storsj=C3=B6?= via flang-commits) Date: Thu, 15 May 2025 03:35:01 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [lld] [lldb] [llvm] [mlir] [polly] [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS in standalone builds (PR #138587) In-Reply-To: Message-ID: <6825c355.170a0220.a9650.1b23@mx.google.com> mstorsjo wrote: > I rebased this on top of #138783 and adjusted the title and description. Now it should be in a good state to push cmake changes for other projects. The changes look good, but it looks like the changes from #138783 still show up when viewing the changes; can you check that you've rebased past the merged #138783? (Also, I take it that no other subprojects than clang need the `cmake_push_check_state` change?) 
https://github.com/llvm/llvm-project/pull/138587 From flang-commits at lists.llvm.org Thu May 15 04:28:10 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 04:28:10 -0700 (PDT) Subject: [flang-commits] [flang] 30b0946 - [Flang][MLIR][OpenMP] Improve use_device_* handling (#137198) Message-ID: <6825cfca.a70a0220.25f73e.2401@mx.google.com> Author: Sergio Afonso Date: 2025-05-15T12:28:06+01:00 New Revision: 30b0946326354d247a92622f08be4722df58bb55 URL: https://github.com/llvm/llvm-project/commit/30b0946326354d247a92622f08be4722df58bb55 DIFF: https://github.com/llvm/llvm-project/commit/30b0946326354d247a92622f08be4722df58bb55.diff LOG: [Flang][MLIR][OpenMP] Improve use_device_* handling (#137198) This patch updates MLIR op verifiers for operations taking arguments that must always be defined by an `omp.map.info` operation to check this requirement. It also modifies Flang lowering for `use_device_{addr, ptr}`, as well as the custom MLIR printer and parser for these clauses, to support initializing it to `OMP_MAP_RETURN_PARAM` and represent this in the MLIR representation as `return_param`. This internal mapping flag is what eventually is used for variables passed via these clauses into the target region when translating to LLVM IR, so making it explicit in Flang and MLIR removes an inconsistency in the current representation. 
Added: Modified: flang/lib/Lower/OpenMP/ClauseProcessor.cpp flang/lib/Lower/OpenMP/Utils.cpp flang/test/Fir/convert-to-llvm-openmp-and-fir.fir flang/test/Lower/OpenMP/target.f90 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp mlir/test/Dialect/OpenMP/ops.mlir Removed: ################################################################################ diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index f4876256a378f..02454543d0a60 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1407,8 +1407,7 @@ bool ClauseProcessor::processUseDeviceAddr( const parser::CharBlock &source) { mlir::Location location = converter.genLocation(source); llvm::omp::OpenMPOffloadMappingFlags mapTypeBits = - llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO | - llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_RETURN_PARAM; processMapObjects(stmtCtx, location, clause.v, mapTypeBits, parentMemberIndices, result.useDeviceAddrVars, useDeviceSyms); @@ -1429,8 +1428,7 @@ bool ClauseProcessor::processUseDevicePtr( const parser::CharBlock &source) { mlir::Location location = converter.genLocation(source); llvm::omp::OpenMPOffloadMappingFlags mapTypeBits = - llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO | - llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_RETURN_PARAM; processMapObjects(stmtCtx, location, clause.v, mapTypeBits, parentMemberIndices, result.useDevicePtrVars, useDeviceSyms); diff --git a/flang/lib/Lower/OpenMP/Utils.cpp b/flang/lib/Lower/OpenMP/Utils.cpp index 3f4cfb8c11a9d..173dceb07b193 100644 --- a/flang/lib/Lower/OpenMP/Utils.cpp +++ b/flang/lib/Lower/OpenMP/Utils.cpp @@ -398,14 +398,16 @@ mlir::Value createParentSymAndGenIntermediateMaps( interimBounds, treatIndexAsSection); } - // Remove all map TO, FROM and TOFROM bits, from the intermediate - // allocatable maps, we 
simply wish to alloc or release them. It may be - // safer to just pass OMP_MAP_NONE as the map type, but we may still + // Remove all map-type bits (e.g. TO, FROM, etc.) from the intermediate + // allocatable maps, as we simply wish to alloc or release them. It may + // be safer to just pass OMP_MAP_NONE as the map type, but we may still // need some of the other map types the mapped member utilises, so for // now it's good to keep an eye on this. llvm::omp::OpenMPOffloadMappingFlags interimMapType = mapTypeBits; interimMapType &= ~llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO; interimMapType &= ~llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_FROM; + interimMapType &= + ~llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_RETURN_PARAM; // Create a map for the intermediate member and insert it and it's // indices into the parentMemberIndices list to track it. diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir index 8019ecf7f6a05..b13921f822b4d 100644 --- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir +++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir @@ -423,14 +423,15 @@ func.func @_QPopenmp_target_data_region() { func.func @_QPomp_target_data_empty() { %0 = fir.alloca !fir.array<1024xi32> {bindc_name = "a", uniq_name = "_QFomp_target_data_emptyEa"} - omp.target_data use_device_addr(%0 -> %arg0 : !fir.ref>) { + %1 = omp.map.info var_ptr(%0 : !fir.ref>, !fir.ref>) map_clauses(return_param) capture(ByRef) -> !fir.ref> {name = ""} + omp.target_data use_device_addr(%1 -> %arg0 : !fir.ref>) { omp.terminator } return } // CHECK-LABEL: llvm.func @_QPomp_target_data_empty -// CHECK: omp.target_data use_device_addr(%1 -> %{{.*}} : !llvm.ptr) { +// CHECK: omp.target_data use_device_addr(%{{.*}} -> %{{.*}} : !llvm.ptr) { // CHECK: } // ----- diff --git a/flang/test/Lower/OpenMP/target.f90 b/flang/test/Lower/OpenMP/target.f90 index 4815e6564fc7e..f04aacc63fc2b 100644 --- a/flang/test/Lower/OpenMP/target.f90 
+++ b/flang/test/Lower/OpenMP/target.f90 @@ -544,7 +544,7 @@ subroutine omp_target_device_addr !CHECK: %[[VAL_0_DECL:.*]]:2 = hlfir.declare %[[VAL_0]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFomp_target_device_addrEa"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) !CHECK: %[[MAP_MEMBERS:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>, i32) map_clauses(tofrom) capture(ByRef) var_ptr_ptr({{.*}} : !fir.llvm_ptr>) -> !fir.llvm_ptr> {name = ""} !CHECK: %[[MAP:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>, !fir.box>) map_clauses(to) capture(ByRef) members(%[[MAP_MEMBERS]] : [0] : !fir.llvm_ptr>) -> !fir.ref>> {name = "a"} - !CHECK: %[[DEV_ADDR_MEMBERS:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>, i32) map_clauses(tofrom) capture(ByRef) var_ptr_ptr({{.*}} : !fir.llvm_ptr>) -> !fir.llvm_ptr> {name = ""} + !CHECK: %[[DEV_ADDR_MEMBERS:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>, i32) map_clauses(return_param) capture(ByRef) var_ptr_ptr({{.*}} : !fir.llvm_ptr>) -> !fir.llvm_ptr> {name = ""} !CHECK: %[[DEV_ADDR:.*]] = omp.map.info var_ptr({{.*}} : !fir.ref>>, !fir.box>) map_clauses(to) capture(ByRef) members(%[[DEV_ADDR_MEMBERS]] : [0] : !fir.llvm_ptr>) -> !fir.ref>> {name = "a"} !CHECK: omp.target_data map_entries(%[[MAP]], %[[MAP_MEMBERS]] : {{.*}}) use_device_addr(%[[DEV_ADDR]] -> %[[ARG_0:.*]], %[[DEV_ADDR_MEMBERS]] -> %[[ARG_1:.*]] : !fir.ref>>, !fir.llvm_ptr>) { !$omp target data map(tofrom: a) use_device_addr(a) diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp index 2bf7aaa46db11..deff86d5c5ecb 100644 --- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp +++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp @@ -1521,6 +1521,9 @@ static ParseResult parseMapClause(OpAsmParser &parser, IntegerAttr &mapType) { if (mapTypeMod == "delete") mapTypeBits |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_DELETE; + if (mapTypeMod == "return_param") + mapTypeBits |= 
llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_RETURN_PARAM; + return success(); }; @@ -1583,6 +1586,12 @@ static void printMapClause(OpAsmPrinter &p, Operation *op, emitAllocRelease = false; mapTypeStrs.push_back("delete"); } + if (mapTypeToBitFlag( + mapTypeBits, + llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_RETURN_PARAM)) { + emitAllocRelease = false; + mapTypeStrs.push_back("return_param"); + } if (emitAllocRelease) mapTypeStrs.push_back("exit_release_or_enter_alloc"); @@ -1777,6 +1786,17 @@ static LogicalResult verifyPrivateVarsMapping(TargetOp targetOp) { // MapInfoOp //===----------------------------------------------------------------------===// +static LogicalResult verifyMapInfoDefinedArgs(Operation *op, + StringRef clauseName, + OperandRange vars) { + for (Value var : vars) + if (!llvm::isa_and_present(var.getDefiningOp())) + return op->emitOpError() + << "'" << clauseName + << "' arguments must be defined by 'omp.map.info' ops"; + return success(); +} + LogicalResult MapInfoOp::verify() { if (getMapperId() && !SymbolTable::lookupNearestSymbolFrom( @@ -1784,6 +1804,9 @@ LogicalResult MapInfoOp::verify() { return emitError("invalid mapper id"); } + if (failed(verifyMapInfoDefinedArgs(*this, "members", getMembers()))) + return failure(); + return success(); } @@ -1805,6 +1828,15 @@ LogicalResult TargetDataOp::verify() { "At least one of map, use_device_ptr_vars, or " "use_device_addr_vars operand must be present"); } + + if (failed(verifyMapInfoDefinedArgs(*this, "use_device_ptr", + getUseDevicePtrVars()))) + return failure(); + + if (failed(verifyMapInfoDefinedArgs(*this, "use_device_addr", + getUseDeviceAddrVars()))) + return failure(); + return verifyMapClause(*this, getMapVars()); } @@ -1889,16 +1921,15 @@ void TargetOp::build(OpBuilder &builder, OperationState &state, } LogicalResult TargetOp::verify() { - LogicalResult verifyDependVars = - verifyDependVarList(*this, getDependKinds(), getDependVars()); - - if (failed(verifyDependVars)) - return 
verifyDependVars; + if (failed(verifyDependVarList(*this, getDependKinds(), getDependVars()))) + return failure(); - LogicalResult verifyMapVars = verifyMapClause(*this, getMapVars()); + if (failed(verifyMapInfoDefinedArgs(*this, "has_device_addr", + getHasDeviceAddrVars()))) + return failure(); - if (failed(verifyMapVars)) - return verifyMapVars; + if (failed(verifyMapClause(*this, getMapVars()))) + return failure(); return verifyPrivateVarsMapping(*this); } diff --git a/mlir/test/Dialect/OpenMP/ops.mlir b/mlir/test/Dialect/OpenMP/ops.mlir index b7e16b7ec35e2..a9e4af035dbd7 100644 --- a/mlir/test/Dialect/OpenMP/ops.mlir +++ b/mlir/test/Dialect/OpenMP/ops.mlir @@ -802,10 +802,14 @@ func.func @omp_target_data (%if_cond : i1, %device : si32, %device_ptr: memref, tensor) map_clauses(always, from) capture(ByRef) -> memref {name = ""} omp.target_data if(%if_cond) device(%device : si32) map_entries(%mapv1 : memref){} - // CHECK: %[[MAP_A:.*]] = omp.map.info var_ptr(%[[VAL_2:.*]] : memref, tensor) map_clauses(close, present, to) capture(ByRef) -> memref {name = ""} - // CHECK: omp.target_data map_entries(%[[MAP_A]] : memref) use_device_addr(%[[VAL_3:.*]] -> %{{.*}} : memref) use_device_ptr(%[[VAL_4:.*]] -> %{{.*}} : memref) + // CHECK: %[[MAP_A:.*]] = omp.map.info var_ptr(%{{.*}} : memref, tensor) map_clauses(close, present, to) capture(ByRef) -> memref {name = ""} + // CHECK: %[[DEV_ADDR:.*]] = omp.map.info var_ptr(%{{.*}} : memref, tensor) map_clauses(return_param) capture(ByRef) -> memref {name = ""} + // CHECK: %[[DEV_PTR:.*]] = omp.map.info var_ptr(%{{.*}} : memref, tensor) map_clauses(return_param) capture(ByRef) -> memref {name = ""} + // CHECK: omp.target_data map_entries(%[[MAP_A]] : memref) use_device_addr(%[[DEV_ADDR]] -> %{{.*}} : memref) use_device_ptr(%[[DEV_PTR]] -> %{{.*}} : memref) %mapv2 = omp.map.info var_ptr(%map1 : memref, tensor) map_clauses(close, present, to) capture(ByRef) -> memref {name = ""} - omp.target_data map_entries(%mapv2 : memref) 
use_device_addr(%device_addr -> %arg0 : memref) use_device_ptr(%device_ptr -> %arg1 : memref) { + %device_addrv1 = omp.map.info var_ptr(%device_addr : memref, tensor) map_clauses(return_param) capture(ByRef) -> memref {name = ""} + %device_ptrv1 = omp.map.info var_ptr(%device_ptr : memref, tensor) map_clauses(return_param) capture(ByRef) -> memref {name = ""} + omp.target_data map_entries(%mapv2 : memref) use_device_addr(%device_addrv1 -> %arg0 : memref) use_device_ptr(%device_ptrv1 -> %arg1 : memref) { omp.terminator } From flang-commits at lists.llvm.org Thu May 15 04:28:13 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Thu, 15 May 2025 04:28:13 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [Flang][MLIR][OpenMP] Improve use_device_* handling (PR #137198) In-Reply-To: Message-ID: <6825cfcd.a70a0220.35cfe8.1ccf@mx.google.com> https://github.com/skatrak closed https://github.com/llvm/llvm-project/pull/137198 From flang-commits at lists.llvm.org Thu May 15 04:28:16 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Thu, 15 May 2025 04:28:16 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [MLIR][OpenMP] Assert on map translation functions, NFC (PR #137199) In-Reply-To: Message-ID: <6825cfd0.630a0220.119565.b4ee@mx.google.com> https://github.com/skatrak edited https://github.com/llvm/llvm-project/pull/137199 From flang-commits at lists.llvm.org Thu May 15 04:28:50 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Thu, 15 May 2025 04:28:50 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [MLIR][OpenMP] Assert on map translation functions, NFC (PR #137199) In-Reply-To: Message-ID: <6825cff2.a70a0220.fa42a.686f@mx.google.com> https://github.com/skatrak updated https://github.com/llvm/llvm-project/pull/137199

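For readers following the `[Flang][MLIR][OpenMP] Improve use_device_* handling` commit (#137198) earlier in this archive, the bit-flag handling it touches can be sketched roughly as below. This is a minimal, standalone illustration and not the code from the patch: `MapFlag`, `hasFlag`, `interimMapType` and `mapTypeStrings` are hypothetical names, the numeric flag values are illustrative stand-ins for `llvm::omp::OpenMPOffloadMappingFlags`, and the `to`/`from`/`tofrom` branch is an assumption about surrounding code that the diff does not show.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::omp::OpenMPOffloadMappingFlags.
// The numeric values are illustrative only, not copied from LLVM.
enum MapFlag : std::uint64_t {
  MapNone = 0x0,
  MapTo = 0x1,
  MapFrom = 0x2,
  MapDelete = 0x8,
  MapReturnParam = 0x40,
};

// True when every bit of `flag` is set in `bits`.
bool hasFlag(std::uint64_t bits, MapFlag flag) {
  return flag != MapNone && (bits & flag) == flag;
}

// Mirrors the masking step in the Utils.cpp hunk: intermediate maps
// drop the TO, FROM and RETURN_PARAM bits and keep everything else.
std::uint64_t interimMapType(std::uint64_t bits) {
  bits &= ~static_cast<std::uint64_t>(MapTo);
  bits &= ~static_cast<std::uint64_t>(MapFrom);
  bits &= ~static_cast<std::uint64_t>(MapReturnParam);
  return bits;
}

// Loosely follows the printer hunk: each recognised flag contributes a
// keyword, and the alloc/release keyword is emitted only when no other
// map type was printed.
std::vector<std::string> mapTypeStrings(std::uint64_t bits) {
  std::vector<std::string> strs;
  bool emitAllocRelease = true;

  // Assumed handling of to/from; not part of the quoted diff.
  const bool to = hasFlag(bits, MapTo);
  const bool from = hasFlag(bits, MapFrom);
  if (to || from) {
    emitAllocRelease = false;
    strs.push_back(to && from ? "tofrom" : (to ? "to" : "from"));
  }
  if (hasFlag(bits, MapDelete)) {
    emitAllocRelease = false;
    strs.push_back("delete");
  }
  if (hasFlag(bits, MapReturnParam)) {
    emitAllocRelease = false;
    strs.push_back("return_param");
  }
  if (emitAllocRelease)
    strs.push_back("exit_release_or_enter_alloc");
  return strs;
}
```

Under these assumptions, a `use_device_addr` operand carrying only the return-param bit would print as `map_clauses(return_param)`, which matches the updated CHECK lines in the tests of that commit.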

From flang-commits at lists.llvm.org Thu May 15 04:29:13 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Thu, 15 May 2025 04:29:13 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [MLIR][OpenMP] Assert on map translation functions, NFC (PR #137199) In-Reply-To: Message-ID: <6825d009.630a0220.76aa6.9ef5@mx.google.com> https://github.com/skatrak closed https://github.com/llvm/llvm-project/pull/137199 From flang-commits at lists.llvm.org Thu May 15 04:29:17 2025 From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Thu, 15 May 2025 04:29:17 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [Flang][OpenMP] Minimize host ops remaining in device compilation (PR #137200) In-Reply-To: Message-ID: <6825d00d.050a0220.21dacf.6d64@mx.google.com> https://github.com/skatrak edited https://github.com/llvm/llvm-project/pull/137200 From flang-commits at lists.llvm.org Thu May 15 05:15:28 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 05:15:28 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <6825dae0.170a0220.239eaa.110a@mx.google.com> NexMing wrote: > > > Who said it is a temporary solution? Can you point to an RFC? > > > > > > This specific patch wasn't discussed in the RFC but there is some discussion here https://discourse.llvm.org/t/rfc-add-fir-affine-optimization-fir-pass-pipeline/86190/5 > > So is that only for the affine pipeline? It's not just about Affine. In the future, I plan to implement conversions from FIR to scf, memref, and other core dialects. The FIR → Affine → FIR path is part of my experimental roadmap for exploring Fortran optimizations. My envisioned final pipeline is: FIR → core MLIR (do optimizations, like SCF → Affine) → LLVM. I will revise the RFC title or create a new RFC. 
https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Thu May 15 06:12:47 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 15 May 2025 06:12:47 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6825e84f.630a0220.2839e6.d61c@mx.google.com> kparzysz wrote: Primary changes: 1. Replace individual AST nodes with one that corresponds to "atomic construct". Treat "read", "capture", etc. as clauses in the same way as clauses are treated in other constructs (i.e. as "parameters"). 2. Parse as much as possible, including a variety of invalid sources. The goal is to delay diagnostics from parser to the semantic analysis. 3. Perform detailed analysis in semantic checks, and store the results in the AST via a mutable member (similarly to how "typedExpr" works for parser::Expr). 4. Avoid checks for diagnosable issues in lowering, simply read the recorded analysis results and generate code. The goal of this PR is to preserve the existing functionality, but with new implementation. https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Thu May 15 06:27:32 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Thu, 15 May 2025 06:27:32 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <6825ebc4.050a0220.43d7c.374a@mx.google.com> https://github.com/TIFitis updated https://github.com/llvm/llvm-project/pull/139593 >From a83bd68fdcb613d54c66f8503f522cc2b16a63a2 Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Mon, 12 May 2025 18:41:20 +0100 Subject: [PATCH 1/2] Fix semantic check for default declare mappers. 
--- flang/lib/Semantics/resolve-names.cpp | 21 ++++++++++++------- .../OpenMP/declare-mapper-symbols.f90 | 18 ++++++++-------- .../Semantics/OpenMP/declare-mapper03.f90 | 6 +----- 3 files changed, 23 insertions(+), 22 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index b2979690f78e7..1fd0ea007319d 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,6 +38,7 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" +#include "llvm/ADT/SmallVector.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1766,14 +1767,6 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses gets processed before // the type has been fully processed. BeginDeclTypeSpec(); - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const parser::CharBlock defaultName{"default", 7}; - MakeSymbol( - defaultName, Attrs{}, MiscDetails{MiscDetails::Kind::ConstructName}); - } PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); @@ -1783,6 +1776,18 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; + defaultNames.emplace_back( + type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); + MakeSymbol(defaultNames.back(), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); + } } void OmpVisitor::ProcessReductionSpecifier( diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 
b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index b4e03bd1632e5..0dda5b4456987 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -2,23 +2,23 @@ program main !CHECK-LABEL: MainProgram scope: main - implicit none + implicit none - type ty - integer :: x - end type ty - !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) - !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) + type ty + integer :: x + end type ty + !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) + !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) !! Note, symbols come out in their respective scope, but not in declaration order. -!CHECK: default: Misc ConstructName !CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x +!CHECK: ty.default: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) -!CHECK: OtherConstruct scope: +!CHECK: OtherConstruct scope: !CHECK: maptwo (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) - + end program main diff --git a/flang/test/Semantics/OpenMP/declare-mapper03.f90 b/flang/test/Semantics/OpenMP/declare-mapper03.f90 index b70b8a67f33e0..84fc3efafb3ad 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper03.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper03.f90 @@ -5,12 +5,8 @@ integer :: y end type t1 -type :: t2 - real :: y, z -end type t2 - !error: 'default' is already declared in this scoping unit !$omp declare mapper(t1::x) map(x, x%y) -!$omp declare mapper(t2::w) map(w, w%y, w%z) +!$omp declare mapper(t1::x) map(x) end >From 19c5f5635dbee97d8b6364b43f31196298a51062 Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Wed, 14 May 2025 20:38:01 +0100 Subject: [PATCH 2/2] Change mapper name field from parser::Name to std::string. 
--- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 6 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 22 ++--- flang/lib/Parser/openmp-parsers.cpp | 22 ++++- flang/lib/Parser/unparse.cpp | 11 ++- flang/lib/Semantics/resolve-names.cpp | 16 +--- flang/test/Lower/OpenMP/declare-mapper.f90 | 95 ++++++++++++++++++- flang/test/Lower/OpenMP/map-mapper.f90 | 4 +- .../Parser/OpenMP/declare-mapper-unparse.f90 | 15 +-- .../Parser/OpenMP/metadirective-dirspec.f90 | 2 +- .../OpenMP/declare-mapper-symbols.f90 | 2 +- 11 files changed, 149 insertions(+), 48 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 254236b510544..c99006f0c1c22 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -3540,7 +3540,7 @@ WRAPPER_CLASS(OmpLocatorList, std::list); struct OmpMapperSpecifier { // Absent mapper-identifier is equivalent to DEFAULT. TUPLE_CLASS_BOILERPLATE(OmpMapperSpecifier); - std::tuple, TypeSpec, Name> t; + std::tuple t; }; // Ref: [4.5:222:1-5], [5.0:305:20-27], [5.1:337:11-19], [5.2:139:18-23], diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index f4876256a378f..82061eed8913a 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1114,9 +1114,9 @@ void ClauseProcessor::processMapObjects( typeSpec = &object.sym()->GetType()->derivedTypeSpec(); if (typeSpec) { - mapperIdName = typeSpec->name().ToString() + ".default"; - mapperIdName = - converter.mangleName(mapperIdName, *typeSpec->GetScope()); + mapperIdName = typeSpec->name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); } } }; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 61bbc709872fd..cfcba0159db8d 100644 --- 
a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2422,8 +2422,10 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, mlir::FlatSymbolRefAttr mapperId; if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) { auto &typeSpec = sym.GetType()->derivedTypeSpec(); - std::string mapperIdName = typeSpec.name().ToString() + ".default"; - mapperIdName = converter.mangleName(mapperIdName, *typeSpec.GetScope()); + std::string mapperIdName = + typeSpec.name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); if (converter.getModuleOp().lookupSymbol(mapperIdName)) mapperId = mlir::FlatSymbolRefAttr::get(&converter.getMLIRContext(), mapperIdName); @@ -4005,24 +4007,16 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, lower::StatementContext stmtCtx; const auto &spec = std::get(declareMapperConstruct.t); - const auto &mapperName{std::get>(spec.t)}; + const auto &mapperName{std::get(spec.t)}; const auto &varType{std::get(spec.t)}; const auto &varName{std::get(spec.t)}; assert(varType.declTypeSpec->category() == semantics::DeclTypeSpec::Category::TypeDerived && "Expected derived type"); - std::string mapperNameStr; - if (mapperName.has_value()) { - mapperNameStr = mapperName->ToString(); - mapperNameStr = - converter.mangleName(mapperNameStr, mapperName->symbol->owner()); - } else { - mapperNameStr = - varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"; - mapperNameStr = converter.mangleName( - mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope()); - } + std::string mapperNameStr = mapperName; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperNameStr)) + mapperNameStr = converter.mangleName(mapperNameStr, sym->owner()); // Save current insertion point before moving to the module scope to create // the DeclareMapperOp diff 
--git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..a1ed584020677 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1389,8 +1389,28 @@ TYPE_PARSER( TYPE_PARSER(sourced(construct( verbatim("DECLARE TARGET"_tok), Parser{}))) +static OmpMapperSpecifier ConstructOmpMapperSpecifier( + std::optional &&mapperName, TypeSpec &&typeSpec, Name &&varName) { + // If a name is present, parse: name ":" typeSpec "::" name + // This matches the syntax: : :: + if (mapperName.has_value() && mapperName->ToString() != "default") { + return OmpMapperSpecifier{ + mapperName->ToString(), std::move(typeSpec), std::move(varName)}; + } + // If the name is missing, use the DerivedTypeSpec name to construct the + // default mapper name. + // This matches the syntax: :: + if (auto *derived = std::get_if(&typeSpec.u)) { + return OmpMapperSpecifier{ + std::get(derived->t).ToString() + ".omp.default.mapper", + std::move(typeSpec), std::move(varName)}; + } + return OmpMapperSpecifier{std::string("omp.default.mapper"), + std::move(typeSpec), std::move(varName)}; +} + // mapper-specifier -TYPE_PARSER(construct( +TYPE_PARSER(applyFunction(ConstructOmpMapperSpecifier, maybe(name / ":" / !":"_tok), typeSpec / "::", name)) // OpenMP 5.2: 5.8.8 Declare Mapper Construct diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..1d68e8d8850fa 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2093,7 +2093,11 @@ class UnparseVisitor { Walk(x.v, ","); } void Unparse(const OmpMapperSpecifier &x) { - Walk(std::get>(x.t), ":"); + const auto &mapperName = std::get(x.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); + Put(":"); + } Walk(std::get(x.t)); Put("::"); Walk(std::get(x.t)); @@ -2796,8 +2800,9 @@ class UnparseVisitor { BeginOpenMP(); Word("!$OMP DECLARE MAPPER ("); const auto &spec{std::get(z.t)}; - if 
(auto mapname{std::get>(spec.t)}) { - Walk(mapname); + const auto &mapperName = std::get(spec.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); Put(":"); } Walk(std::get(spec.t)); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 42297f069499b..322562b06b87f 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1767,7 +1767,9 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses gets processed before // the type has been fully processed. BeginDeclTypeSpec(); - + auto &mapperName{std::get(spec.t)}; + MakeSymbol(parser::CharBlock(mapperName), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); auto &varName{std::get(spec.t)}; @@ -1776,18 +1778,6 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); - - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const auto &type = std::get(spec.t); - static llvm::SmallVector defaultNames; - defaultNames.emplace_back( - type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); - MakeSymbol(defaultNames.back(), Attrs{}, - MiscDetails{MiscDetails::Kind::ConstructName}); - } } void OmpVisitor::ProcessReductionSpecifier( diff --git a/flang/test/Lower/OpenMP/declare-mapper.f90 b/flang/test/Lower/OpenMP/declare-mapper.f90 index 867b850317e66..8a98c68a8d582 100644 --- a/flang/test/Lower/OpenMP/declare-mapper.f90 +++ b/flang/test/Lower/OpenMP/declare-mapper.f90 @@ -5,6 +5,7 @@ ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-2.f90 -o - | FileCheck %t/omp-declare-mapper-2.f90 ! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-3.f90 -o - | FileCheck %t/omp-declare-mapper-3.f90 ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-4.f90 -o - | FileCheck %t/omp-declare-mapper-4.f90 +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-5.f90 -o - | FileCheck %t/omp-declare-mapper-5.f90 !--- omp-declare-mapper-1.f90 subroutine declare_mapper_1 @@ -22,7 +23,7 @@ subroutine declare_mapper_1 end type type(my_type2) :: t real :: x, y(nvals) - !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { + !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.omp\.default\.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { !CHECK: ^bb0(%[[VAL_0:.*]]: !fir.ref<[[MY_TYPE]]>): !CHECK: %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFdeclare_mapper_1Evar"} : (!fir.ref<[[MY_TYPE]]>) -> (!fir.ref<[[MY_TYPE]]>, !fir.ref<[[MY_TYPE]]>) !CHECK: %[[VAL_2:.*]] = hlfir.designate %[[VAL_1]]#0{"values"} {fortran_attrs = #fir.var_attrs} : (!fir.ref<[[MY_TYPE]]>) -> !fir.ref>>> @@ -149,7 +150,7 @@ subroutine declare_mapper_4 integer :: num end type - !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] + !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.omp.default.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] !$omp declare mapper (my_type :: var) map (var%num) type(my_type) :: a @@ -171,3 +172,93 @@ subroutine declare_mapper_4 a%num = 40 !$omp end target end subroutine declare_mapper_4 + +!--- omp-declare-mapper-5.f90 +program declare_mapper_5 + implicit none + + type :: mytype + integer :: x, y + end type + + !CHECK: omp.declare_mapper 
@[[INNER_MAPPER_NAMED:_QQFFuse_innermy_mapper]] : [[MY_TYPE:!fir\.type<_QFTmytype\{x:i32,y:i32\}>]] + !CHECK: omp.declare_mapper @[[INNER_MAPPER_DEFAULT:_QQFFuse_innermytype.omp.default.mapper]] : [[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_NAMED:_QQFmy_mapper]] : [[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_DEFAULT:_QQFmytype.omp.default.mapper]] : [[MY_TYPE]] + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + +contains + subroutine use_outer() + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) 
capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine + + subroutine use_inner() + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine +end program declare_mapper_5 diff --git 
a/flang/test/Lower/OpenMP/map-mapper.f90 b/flang/test/Lower/OpenMP/map-mapper.f90 index a511110cb5d18..91564bfc7bc46 100644 --- a/flang/test/Lower/OpenMP/map-mapper.f90 +++ b/flang/test/Lower/OpenMP/map-mapper.f90 @@ -8,7 +8,7 @@ program p !$omp declare mapper(xx : t1 :: nn) map(to: nn, nn%x) !$omp declare mapper(t1 :: nn) map(from: nn) - !CHECK-LABEL: omp.declare_mapper @_QQFt1.default : !fir.type<_QFTt1{x:!fir.array<256xi32>}> + !CHECK-LABEL: omp.declare_mapper @_QQFt1.omp.default.mapper : !fir.type<_QFTt1{x:!fir.array<256xi32>}> !CHECK-LABEL: omp.declare_mapper @_QQFxx : !fir.type<_QFTt1{x:!fir.array<256xi32>}> type(t1) :: a, b @@ -20,7 +20,7 @@ program p end do !$omp end target - !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.default) -> {{.*}} {name = "b"} + !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.omp.default.mapper) -> {{.*}} {name = "b"} !CHECK: omp.target map_entries(%[[MAP_B]] -> %{{.*}}, %{{.*}} -> %{{.*}} : {{.*}}, {{.*}}) { !$omp target map(mapper(default) : b) do i = 1, n diff --git a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 index 407bfd29153fa..30d75d02736f3 100644 --- a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 +++ b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 @@ -7,36 +7,37 @@ program main type ty integer :: x end type ty - + !CHECK: !$OMP DECLARE MAPPER (mymapper:ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) !PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier -!PARSE-TREE: Name = 'mymapper' +!PARSE-TREE: string = 'mymapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> 
Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' -!PARSE-TREE: Name = 'x' +!PARSE-TREE: Name = 'x' !CHECK: !$OMP DECLARE MAPPER (ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(ty :: mapped) map(mapped, mapped%x) - + !PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier +!PARSE-TREE: string = 'ty.omp.default.mapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' !PARSE-TREE: Name = 'x' - + end program main !CHECK-LABEL: end program main diff --git a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 index b6c9c58948fec..baa8b2e08c539 100644 --- a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 +++ b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 @@ -78,7 +78,7 @@ subroutine f02 !PARSE-TREE: | | OmpDirectiveSpecification !PARSE-TREE: | | | OmpDirectiveName -> llvm::omp::Directive = declare mapper !PARSE-TREE: | | | OmpArgumentList -> OmpArgument -> OmpMapperSpecifier -!PARSE-TREE: | | | | Name = 'mymapper' +!PARSE-TREE: | | | | string = 'mymapper' !PARSE-TREE: | | | | TypeSpec -> IntrinsicTypeSpec -> IntegerTypeSpec -> !PARSE-TREE: | | | | Name = 'v' !PARSE-TREE: | | | OmpClauseList -> OmpClause -> Map -> OmpMapClause diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index 0dda5b4456987..06f41ab8ce76f 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -13,7 +13,7 @@ program main !! Note, symbols come out in their respective scope, but not in declaration order. 
!CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x -!CHECK: ty.default: Misc ConstructName +!CHECK: ty.omp.default.mapper: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) From flang-commits at lists.llvm.org Thu May 15 06:35:33 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Thu, 15 May 2025 06:35:33 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <6825eda5.050a0220.b0a03.8282@mx.google.com> ================ @@ -1783,6 +1776,18 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; ---------------- TIFitis wrote: Hi, I found that even with this minor fix, some separate scoping-related semantic issues would still have remained. To fix these issues altogether, I have instead taken a different approach to solving the problem. I have changed the optional field for the mapper name to a compulsory std::string field. The parser has been updated to create a default name, the derived type name followed by `.omp.default.mapper`, whenever the mapper name is omitted or is `default`. I also added a new test to _flang/test/Lower/OpenMP/declare-mapper.f90_ to ensure that declare mapper scoping rules apply cleanly. 
https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Thu May 15 06:36:44 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Thu, 15 May 2025 06:36:44 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <6825edec.630a0220.163f65.e4d3@mx.google.com> https://github.com/TIFitis edited https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Thu May 15 06:40:48 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Thu, 15 May 2025 06:40:48 -0700 (PDT) Subject: [flang-commits] [flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <6825eee0.170a0220.3277ff.6f10@mx.google.com> ================ @@ -1783,6 +1776,18 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; + defaultNames.emplace_back( + type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); + MakeSymbol(defaultNames.back(), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); ---------------- TIFitis wrote: Unfortunately, the attribute-less overload uses a different type for the name parameter and thus can't be used directly. Passing a default Attrs{} seems like a cleaner solution than changing the name to the correct type, so I have gone with this approach for now. 
https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Thu May 15 07:06:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 07:06:55 -0700 (PDT) Subject: [flang-commits] [flang] ed572aa - [flang] Add missing copy assignment operator (#139966) Message-ID: <6825f4ff.170a0220.326dc.6001@mx.google.com> Author: Asher Mancinelli Date: 2025-05-15T07:06:50-07:00 New Revision: ed572aaac8b142a7bf09a235f5497bc7e201f762 URL: https://github.com/llvm/llvm-project/commit/ed572aaac8b142a7bf09a235f5497bc7e201f762 DIFF: https://github.com/llvm/llvm-project/commit/ed572aaac8b142a7bf09a235f5497bc7e201f762.diff LOG: [flang] Add missing copy assignment operator (#139966) On Clang 17 the implicit copy assignment operator was issuing a warning because of the user-declared copy constructor. Declare the copy assignment operator as default. Added: Modified: flang/include/flang/Semantics/symbol.h Removed: ################################################################################ diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 97c1e30631840..4cded64d170cd 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -600,6 +600,7 @@ class TypeParamDetails { public: TypeParamDetails() = default; TypeParamDetails(const TypeParamDetails &) = default; + TypeParamDetails &operator=(const TypeParamDetails &) = default; std::optional attr() const { return attr_; } TypeParamDetails &set_attr(common::TypeParamAttr); MaybeIntExpr &init() { return init_; } From flang-commits at lists.llvm.org Thu May 15 07:06:57 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Thu, 15 May 2025 07:06:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add missing copy constructor (PR #139966) In-Reply-To: Message-ID: <6825f501.050a0220.270b55.c340@mx.google.com> https://github.com/ashermancinelli closed 
https://github.com/llvm/llvm-project/pull/139966 From flang-commits at lists.llvm.org Thu May 15 07:24:10 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 15 May 2025 07:24:10 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <6825f90a.170a0220.23276c.9222@mx.google.com> akuhlens wrote: @DavidSpickett and @kkwli for both of these tests we were erroring out during compilation with internal compiler errors. These programs should be rejected by the compiler. I am preparing a PR that will reject these programs, in order to satisfy the test. If I wanted to change the expected behavior of the test, how would I go about doing that? I haven't dealt with modifying the test-suite yet. https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Thu May 15 07:28:31 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 15 May 2025 07:28:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #140066) Message-ID: https://github.com/tblah created https://github.com/llvm/llvm-project/pull/140066 This adds another puzzle piece for the support of OpenMP DECLARE REDUCTION functionality. This adds support for operators with derived types, as well as declaring multiple different types with the same name or operator. A new detail class, UserReductionDetails, is introduced to hold the list of types supported for a given reduction declaration. Tests for parsing and symbol generation are added. Declare reduction is still not supported in lowering; it will generate a "Not yet implemented" fatal error.

From flang-commits at lists.llvm.org Thu May 15 07:30:06 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 15 May 2025 07:30:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #140066) In-Reply-To: Message-ID: <6825fa6e.a70a0220.2484f.dc73@mx.google.com> tblah wrote: This is a continuation of @Leporacanthicus's work in https://github.com/llvm/llvm-project/pull/131628. https://github.com/llvm/llvm-project/pull/140066 From flang-commits at lists.llvm.org Wed May 14 15:31:46 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 14 May 2025 15:31:46 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727) In-Reply-To: Message-ID: <682519d2.170a0220.2d63e7.1b2a@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From ccd38ad49430f0b86aad46f51215034e777a7954 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, and Destroy. Default derived type I/O is also recursive, but already disabled. It can be added to this new framework later if the overall approach succeeds. Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. 
This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. --- .../include/flang-rt/runtime/work-queue.h | 296 ++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 516 ++++++++++-------- flang-rt/lib/runtime/derived.cpp | 487 ++++++++--------- flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 175 ++++++ flang/docs/Extensions.md | 10 + flang/include/flang/Runtime/assign.h | 2 +- 8 files changed, 1019 insertions(+), 475 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..f2ae8c44bc468 --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,296 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue is a list of tickets. Each ticket class has a Begin() +// member function that is called once, and a Continue() member function +// that can be called zero or more times. A ticket's execution terminates +// when either of these member functions returns a status other than +// StatOkContinue, and if that status is not StatOk, then the whole queue +// is shut down. 
+// +// By returning StatOkContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the responsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentTicketBase, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. +// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. +// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. Run calls AssignTicket::Begin(), which pushes a ticket via +// BeginFinalize() and returns StatOkContinue. +// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. 
AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatOkContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. + +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include <variant> + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; +namespace typeInfo { +class DerivedType; +class Component; +class SpecialBinding; +} // namespace typeInfo + +// Ticket workers + +// Ticket workers return status codes. Returning StatOkContinue means +// that the ticket is incomplete and must be resumed; any other value +// means that the ticket is complete, and if not StatOk, the whole +// queue can be shut down due to an error. +static constexpr int StatOkContinue{1234}; + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +// Base class for ticket workers that operate elementwise over descriptors +// TODO: if ComponentTicketBase remains this class' only client, +// merge them for better comprehensibility. 
+class ElementalTicketBase { +protected: + RT_API_ATTRS ElementalTicketBase(const Descriptor &instance) + : instance_{instance} { + instance_.GetLowerBounds(subscripts_); + } + RT_API_ATTRS bool CueUpNextItem() const { return elementAt_ < elements_; } + RT_API_ATTRS void AdvanceToNextElement() { + phase_ = 0; + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + } + + const Descriptor &instance_; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + int phase_{0}; + SubscriptValue subscripts_[common::maxRank]; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentTicketBase : protected ElementalTicketBase { +protected: + RT_API_ATTRS ComponentTicketBase( + const Descriptor &instance, const typeInfo::DerivedType &derived); + RT_API_ATTRS bool CueUpNextItem(); + RT_API_ATTRS void AdvanceToNextComponent() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + ElementalTicketBase::Reset(); + component_ = nullptr; + componentAt_ = 0; + } + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Implements derived type instance initialization +class InitializeTicket : private ComponentTicketBase { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ComponentTicketBase{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket : private ComponentTicketBase { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + 
bool hasStat, const Descriptor *errMsg) + : ComponentTicketBase{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatOkContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : private ComponentTicketBase { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ComponentTicketBase{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : private ComponentTicketBase { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ComponentTicketBase{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : to_{to}, from_{&from}, flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const 
typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type intrinsic assignment +class DerivedAssignTicket : private ComponentTicketBase { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ComponentTicketBase{to, derived}, from_{from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS void AdvanceToNextElement(); + RT_API_ATTRS void Reset(); + +private: + const Descriptor &from_; + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + SubscriptValue fromSubscripts_[common::maxRank]; + StaticDescriptor fromComponentDescriptor_; +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + RT_API_ATTRS void BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived); + RT_API_ATTRS void BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg); + RT_API_ATTRS void BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived); + RT_API_ATTRS void BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize); + RT_API_ATTRS void BeginAssign( + Descriptor &to, const Descriptor &from, int flags, 
MemmoveFct memmoveFct); + RT_API_ATTRS void BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter); + + RT_API_ATTRS int Run(); + +private: + // Most uses of the work queue won't go very deep. + static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. 
@@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 4a813cd489022..f16ee72d52e35 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -99,11 +100,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncId)); } // least <= 0, most >= 0 @@ -228,6 +225,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic assignments and // those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -241,274 +240,336 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? 
toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + workQueue.BeginAssign(to, from, flags, memmoveFct); + workQueue.Run(); +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. 
+ toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + workQueue.BeginFinalize(*toDeallocate_, *toDerived_); + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. - // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. 
- RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncId))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + workQueue.BeginInitialize(newFrom, *derived); + } + } } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + workQueue.BeginAssign( + newFrom, *from_, MaybeReallocate | PolymorphicLHS, memmoveFct_); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched 
ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; - } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + workQueue.BeginFinalize(to_, *toDerived_); + } else if (!toDerived_->noDestructionNeeded()) { + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false); + } + } + return StatOkContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); + } + return StatOk; + } + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? 
addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + workQueue.BeginInitialize(to_, *toDerived_); + return StatOkContinue; } } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. + if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatOkContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatOkContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". 
- SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); - } - // Copy the data components (incl. 
the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. - int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are 
unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. - // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } - } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + if (toDerived_) { + workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_); + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } 
else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". + SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatOkContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + from_.GetLowerBounds(fromSubscripts_); + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + std::size_t numProcPtrs{procPtrDesc.Elements()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + for (; ElementalTicketBase::CueUpNextItem(); AdvanceToNextElement()) { + memmoveFct_(instance_.Element(subscripts_) + procPtr.offset, + from_.Element(fromSubscripts_) + procPtr.offset, + sizeof(typeInfo::ProcedurePointer)); + } + ElementalTicketBase::Reset(); + } + return StatOkContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + for (; CueUpNextItem(); AdvanceToNextElement()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, from_, workQueue.terminator(), fromSubscripts_); + AdvanceToNextElement(); + workQueue.BeginAssign(toCompDesc, fromCompDesc, flags_, memmoveFct_); + return StatOkContinue; + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_.Element(fromSubscripts_) + component_->offset(), + componentByteSize); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_.Element(fromSubscripts_) + component_->offset(), + componentByteSize); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_.Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false); + return StatOkContinue; + } + } + toDesc->Deallocate(); + } + } else { + // Allocatable components of the LHS are unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" 
assignment to a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + workQueue.BeginAssign( + *toDesc, *fromDesc, flags_ | DeallocateLHS, memmoveFct_); + AdvanceToNextElement(); + return StatOkContinue; + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} + +RT_API_ATTRS void DerivedAssignTicket::AdvanceToNextElement() { + ComponentTicketBase::AdvanceToNextElement(); + from_.IncrementSubscripts(fromSubscripts_); +} + +RT_API_ATTRS void DerivedAssignTicket::Reset() { + ComponentTicketBase::Reset(); + from_.GetLowerBounds(fromSubscripts_); +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -578,7 +639,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -594,11 +654,11 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
- if (var) + if (var) { Assign(*var, temp, terminator, NoAssignFlags); + } temp.Destroy(/*finalize=*/false, /*destroyPointers=*/false, &terminator); } diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..0f461f529fae6 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,174 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + workQueue.BeginInitialize(instance, derived); + return workQueue.Run(); +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + std::size_t myProcPtrs{procPtrDesc.Elements()}; + for (std::size_t k{0}; k < myProcPtrs; ++k) { const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; + *procPtrDesc.ZeroBasedIndexedElement(k)}; SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; 
instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + instance_.GetLowerBounds(at); + for (std::size_t j{0}; j++ < elements_; instance_.IncrementSubscripts(at)) { + auto &pptr{*instance_.ElementComponent( + at, comp.offset)}; + pptr = comp.procInitialization; + } + } + return StatOkContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (CueUpNextItem()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; elementAt_ < elements_; AdvanceToNextElement()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; elementAt_ < elements_; AdvanceToNextElement()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, 
init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; elementAt_ < elements_; AdvanceToNextElement()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } - } - } - } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + AdvanceToNextElement(); + workQueue.BeginInitialize(compDesc, compType); + return StatOkContinue; + } else { + AdvanceToNextComponent(); } } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + workQueue.BeginInitializeClone(clone, original, derived, hasStat, errMsg); + return workQueue.Run(); } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (CueUpNextItem()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. - stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. 
- if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); - } + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncId), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + workQueue.BeginInitialize(cloneDesc, *derived); + return StatOkContinue; } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_); + return StatOkContinue; + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + AdvanceToNextElement(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_); + AdvanceToNextElement(); + return StatOkContinue; // will resume at next element in this component + } else { + AdvanceToNextComponent(); } + } else { + AdvanceToNextComponent(); } } - return stat; + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + workQueue.BeginFinalize(descriptor, derived); + workQueue.Run(); + } } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +216,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +253,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,86 +277,84 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with the same // descriptor so that the rank is
preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (!finalizableParentType_->noFinalizationNeeded()) { + componentAt_ = 1; + } else { + finalizableParentType_ = nullptr; + } + } + return StatOkContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (CueUpNextItem()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); - } + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + AdvanceToNextElement(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + workQueue.BeginFinalize(compDesc, *compDynamicType); + return StatOkContinue; } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + AdvanceToNextElement(); + if (compDesc.IsAllocated()) { + workQueue.BeginFinalize(compDesc, *compType); } + } else { + AdvanceToNextComponent(); } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && 
!comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + AdvanceToNextElement(); + workQueue.BeginFinalize(compDesc, compType); + return StatOkContinue; + } else { + AdvanceToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + workQueue.BeginFinalize(tmpDesc, *finalizableParentType_); + finalizableParentType_ = nullptr; + return StatOkContinue; + } else { + return StatOk; } } @@ -373,51 +364,61 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noDestructionNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + workQueue.BeginDestroy(descriptor, derived, finalize); + workQueue.Run(); } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + workQueue.BeginFinalize(instance_, derived_); } + return StatOkContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. // Contrary to finalization, the order of deallocation does not matter.
- const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (CueUpNextItem()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + workQueue.BeginDestroy(*d, *componentDerived, /*finalize=*/false); + return StatOkContinue; + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + AdvanceToNextElement(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || componentDerived->noDestructionNeeded()) { + AdvanceToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + 
GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + AdvanceToNextElement(); + workQueue.BeginDestroy(compDesc, *componentDerived, /*finalize=*/false); + return StatOkContinue; } + } else { + AdvanceToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..0ae50e72bb3a9 --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,175 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS ComponentTicketBase::ComponentTicketBase( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ElementalTicketBase{instance}, derived_{derived}, + components_{derived.component().Elements()} {} + +RT_API_ATTRS bool ComponentTicketBase::CueUpNextItem() { + bool elementsDone{!ElementalTicketBase::CueUpNextItem()}; + if (elementsDone) { + component_ = nullptr; + ++componentAt_; + } + if (!component_) { + if (componentAt_ >= components_) { + return false; // done! + } + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + if (elementsDone) { + ElementalTicketBase::Reset(); + } + } + return true; +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + } + firstFree_ = next; + } +} + +RT_API_ATTRS void WorkQueue::BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + StartTicket().u.emplace(descriptor, derived); +} + +RT_API_ATTRS void WorkQueue::BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const 
typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); +} + +RT_API_ATTRS void WorkQueue::BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + StartTicket().u.emplace(descriptor, derived); +} + +RT_API_ATTRS void WorkQueue::BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + StartTicket().u.emplace(descriptor, derived, finalize); +} + +RT_API_ATTRS void WorkQueue::BeginAssign( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) { + StartTicket().u.emplace(to, from, flags, memmoveFct); +} + +RT_API_ATTRS void WorkQueue::BeginDerivedAssign(Descriptor &to, + const Descriptor &from, const typeInfo::DerivedType &derived, int flags, + MemmoveFct memmoveFct, Descriptor *deallocateAfter) { + StartTicket().u.emplace( + to, from, derived, flags, memmoveFct, deallocateAfter); +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? 
insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; + int stat{at->ticket.Continue(*this)}; + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatOkContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime \ No newline at end of file diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..32bebc1d866a4 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -843,6 +843,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. + However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. + Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. 
+ A program using defined assignment might be able to detect the difference. + ## De Facto Standard Features * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION From flang-commits at lists.llvm.org Thu May 15 03:34:42 2025 From: flang-commits at lists.llvm.org (Amit Kumar Pandey via flang-commits) Date: Thu, 15 May 2025 03:34:42 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Flang][Sanitizer] Support sanitizer flag for Flang Driver. (PR #137759) In-Reply-To: Message-ID: <6825c342.170a0220.59850.30eb@mx.google.com> https://github.com/ampandey-1995 updated https://github.com/llvm/llvm-project/pull/137759 >From aa3caaeaa72ab2f0de8beac416875dc466ac1051 Mon Sep 17 00:00:00 2001 From: Amit Pandey Date: Thu, 1 May 2025 13:51:12 +0530 Subject: [PATCH 1/2] [Flang][Sanitizer] Support sanitizer flag for Flang Driver. Flang Driver currently doesn't support option sanitizer flags such as '-fsanitize='. This patch currently supports enabling sanitizer flags for the flang driver apart from clang independently.
--- clang/include/clang/Driver/Options.td | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index e69cd6b833c3a..32cd93f9a5e36 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1555,11 +1555,15 @@ defm xl_pragma_pack : BoolFOption<"xl-pragma-pack", "Enable IBM XL #pragma pack handling">, NegFlag>; def shared_libsan : Flag<["-"], "shared-libsan">, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Dynamically link the sanitizer runtime">; def static_libsan : Flag<["-"], "static-libsan">, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Statically link the sanitizer runtime (Not supported for ASan, TSan or UBSan on darwin)">; -def : Flag<["-"], "shared-libasan">, Alias; -def : Flag<["-"], "static-libasan">, Alias; +def : Flag<["-"], "shared-libasan">, Alias, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; +def : Flag<["-"], "static-libasan">, Alias, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; def fasm : Flag<["-"], "fasm">, Group; defm assume_unique_vtables : BoolFOption<"assume-unique-vtables", @@ -2309,7 +2313,7 @@ def fmemory_profile_use_EQ : Joined<["-"], "fmemory-profile-use=">, // Begin sanitizer flags. These should all be core options exposed in all driver // modes. -let Visibility = [ClangOption, CC1Option, CLOption] in { +let Visibility = [ClangOption, CC1Option, CLOption, FlangOption, FC1Option] in { def fsanitize_EQ : CommaJoined<["-"], "fsanitize=">, Group, MetaVarName<"">, >From f938acc7557bd37edac3b7a38ecb32e069a13a97 Mon Sep 17 00:00:00 2001 From: Amit Pandey Date: Wed, 14 May 2025 21:44:15 +0530 Subject: [PATCH 2/2] Support ASan in LLVM Flang. 
--- clang/include/clang/Driver/Options.td | 92 +++++++++---------- clang/lib/Driver/ToolChains/Flang.cpp | 13 +++ clang/lib/Driver/ToolChains/Flang.h | 9 ++ .../include/flang/Frontend/CodeGenOptions.def | 67 ++++++++++++++ flang/include/flang/Frontend/CodeGenOptions.h | 47 ++++++++++ flang/include/flang/Support/LangOptions.def | 1 + flang/include/flang/Support/LangOptions.h | 11 ++- flang/lib/Frontend/CodeGenOptions.cpp | 29 ++++++ flang/lib/Frontend/CompilerInvocation.cpp | 82 +++++++++++++++++ flang/lib/Frontend/FrontendActions.cpp | 37 ++++++++ 10 files changed, 341 insertions(+), 47 deletions(-) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index f4307f02d9175..3e89a1f7b28aa 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -2339,7 +2339,7 @@ def fsanitize_EQ : CommaJoined<["-"], "fsanitize=">, Group, HelpText<"Turn on runtime checks for various forms of undefined " "or suspicious behavior. See user manual for available checks">; def fno_sanitize_EQ : CommaJoined<["-"], "fno-sanitize=">, Group, - Visibility<[ClangOption, CLOption]>; + Visibility<[ClangOption, CLOption, FlangOption]>; def fsanitize_ignorelist_EQ : Joined<["-"], "fsanitize-ignorelist=">, Group, HelpText<"Path to ignorelist file for sanitizers">; @@ -2349,7 +2349,7 @@ def : Joined<["-"], "fsanitize-blacklist=">, def fsanitize_system_ignorelist_EQ : Joined<["-"], "fsanitize-system-ignorelist=">, HelpText<"Path to system ignorelist file for sanitizers">, - Visibility<[ClangOption, CC1Option]>; + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; def fno_sanitize_ignorelist : Flag<["-"], "fno-sanitize-ignorelist">, Group, HelpText<"Don't use ignorelist file for sanitizers">; @@ -2360,17 +2360,17 @@ def fsanitize_coverage : CommaJoined<["-"], "fsanitize-coverage=">, Group, HelpText<"Specify the type of coverage instrumentation for Sanitizers">; def fno_sanitize_coverage : CommaJoined<["-"], 
"fno-sanitize-coverage=">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Disable features of coverage instrumentation for Sanitizers">, Values<"func,bb,edge,indirect-calls,trace-bb,trace-cmp,trace-div,trace-gep," "8bit-counters,trace-pc,trace-pc-guard,no-prune,inline-8bit-counters," "inline-bool-flag">; def fsanitize_coverage_allowlist : Joined<["-"], "fsanitize-coverage-allowlist=">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Restrict sanitizer coverage instrumentation exclusively to modules and functions that match the provided special case list, except the blocked ones">, MarshallingInfoStringVector>; def fsanitize_coverage_ignorelist : Joined<["-"], "fsanitize-coverage-ignorelist=">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Disable sanitizer coverage instrumentation for modules and functions " "that match the provided special case list, even the allowed ones">, MarshallingInfoStringVector>; @@ -2385,10 +2385,10 @@ def fexperimental_sanitize_metadata_EQ : CommaJoined<["-"], "fexperimental-sanit Group, HelpText<"Specify the type of metadata to emit for binary analysis sanitizers">; def fno_experimental_sanitize_metadata_EQ : CommaJoined<["-"], "fno-experimental-sanitize-metadata=">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption,FlangOption]>, HelpText<"Disable emitting metadata for binary analysis sanitizers">; def fexperimental_sanitize_metadata_ignorelist_EQ : Joined<["-"], "fexperimental-sanitize-metadata-ignorelist=">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Disable sanitizer metadata for modules and functions that match the provided special case list">, MarshallingInfoStringVector>; def fsanitize_memory_track_origins_EQ : 
Joined<["-"], "fsanitize-memory-track-origins=">, @@ -2401,7 +2401,7 @@ def fsanitize_memory_track_origins : Flag<["-"], "fsanitize-memory-track-origins HelpText<"Enable origins tracking in MemorySanitizer">; def fno_sanitize_memory_track_origins : Flag<["-"], "fno-sanitize-memory-track-origins">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Disable origins tracking in MemorySanitizer">; def fsanitize_address_outline_instrumentation : Flag<["-"], "fsanitize-address-outline-instrumentation">, Group, @@ -2423,13 +2423,13 @@ def fsanitize_hwaddress_experimental_aliasing def fno_sanitize_hwaddress_experimental_aliasing : Flag<["-"], "fno-sanitize-hwaddress-experimental-aliasing">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Disable aliasing mode in HWAddressSanitizer">; defm sanitize_memory_use_after_dtor : BoolOption<"f", "sanitize-memory-use-after-dtor", CodeGenOpts<"SanitizeMemoryUseAfterDtor">, DefaultFalse, - PosFlag, - NegFlag, - BothFlags<[], [ClangOption], " use-after-destroy detection in MemorySanitizer">>, + PosFlag, + NegFlag, + BothFlags<[], [ClangOption, FlangOption], " use-after-destroy detection in MemorySanitizer">>, Group; def fsanitize_address_field_padding : Joined<["-"], "fsanitize-address-field-padding=">, Group, @@ -2437,15 +2437,15 @@ def fsanitize_address_field_padding : Joined<["-"], "fsanitize-address-field-pad MarshallingInfoInt>; defm sanitize_address_use_after_scope : BoolOption<"f", "sanitize-address-use-after-scope", CodeGenOpts<"SanitizeAddressUseAfterScope">, DefaultFalse, - PosFlag, - NegFlag, + NegFlag, - BothFlags<[], [ClangOption], " use-after-scope detection in AddressSanitizer">>, + BothFlags<[], [ClangOption, FlangOption], " use-after-scope detection in AddressSanitizer">>, Group; def sanitize_address_use_after_return_EQ : Joined<["-"], "fsanitize-address-use-after-return=">, MetaVarName<"">, - 
Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption]>, HelpText<"Select the mode of detecting stack use-after-return in AddressSanitizer">, Group, Values<"never,runtime,always">, @@ -2454,9 +2454,9 @@ def sanitize_address_use_after_return_EQ MarshallingInfoEnum, "Runtime">; defm sanitize_address_poison_custom_array_cookie : BoolOption<"f", "sanitize-address-poison-custom-array-cookie", CodeGenOpts<"SanitizeAddressPoisonCustomArrayCookie">, DefaultFalse, - PosFlag, - NegFlag, - BothFlags<[], [ClangOption], " poisoning array cookies when using custom operator new[] in AddressSanitizer">>, + PosFlag, + NegFlag, + BothFlags<[], [ClangOption, FlangOption], " poisoning array cookies when using custom operator new[] in AddressSanitizer">>, DocBrief<[{Enable "poisoning" array cookies when allocating arrays with a custom operator new\[\] in Address Sanitizer, preventing accesses to the cookies from user code. An array cookie is a small implementation-defined @@ -2471,18 +2471,18 @@ functions are always poisoned.}]>, Group; defm sanitize_address_globals_dead_stripping : BoolOption<"f", "sanitize-address-globals-dead-stripping", CodeGenOpts<"SanitizeAddressGlobalsDeadStripping">, DefaultFalse, - PosFlag, - NegFlag>, + PosFlag, + NegFlag>, Group; defm sanitize_address_use_odr_indicator : BoolOption<"f", "sanitize-address-use-odr-indicator", CodeGenOpts<"SanitizeAddressUseOdrIndicator">, DefaultTrue, - PosFlag, - NegFlag>, + NegFlag>, Group; def sanitize_address_destructor_EQ : Joined<["-"], "fsanitize-address-destructor=">, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Set the kind of module destructors emitted by " "AddressSanitizer instrumentation. 
These destructors are " "emitted to unregister instrumented global variables when " @@ -2496,9 +2496,9 @@ defm sanitize_memory_param_retval : BoolFOption<"sanitize-memory-param-retval", CodeGenOpts<"SanitizeMemoryParamRetval">, DefaultTrue, - PosFlag, - NegFlag, - BothFlags<[], [ClangOption], " detection of uninitialized parameters and return values">>; + PosFlag, + NegFlag, + BothFlags<[], [ClangOption, FlangOption], " detection of uninitialized parameters and return values">>; //// Note: This flag was introduced when it was necessary to distinguish between // ABI for correct codegen. This is no longer needed, but the flag is // not removed since targeting either ABI will behave the same. @@ -2514,25 +2514,25 @@ def fsanitize_recover_EQ : CommaJoined<["-"], "fsanitize-recover=">, HelpText<"Enable recovery for specified sanitizers">; def fno_sanitize_recover_EQ : CommaJoined<["-"], "fno-sanitize-recover=">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Disable recovery for specified sanitizers">; def fsanitize_recover : Flag<["-"], "fsanitize-recover">, Group, Alias, AliasArgs<["all"]>; def fno_sanitize_recover : Flag<["-"], "fno-sanitize-recover">, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption]>, Group, Alias, AliasArgs<["all"]>; def fsanitize_trap_EQ : CommaJoined<["-"], "fsanitize-trap=">, Group, HelpText<"Enable trapping for specified sanitizers">; def fno_sanitize_trap_EQ : CommaJoined<["-"], "fno-sanitize-trap=">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Disable trapping for specified sanitizers">; def fsanitize_trap : Flag<["-"], "fsanitize-trap">, Group, Alias, AliasArgs<["all"]>, HelpText<"Enable trapping for all sanitizers">; def fno_sanitize_trap : Flag<["-"], "fno-sanitize-trap">, Group, Alias, AliasArgs<["all"]>, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, 
CLOption, FlangOption]>, HelpText<"Disable trapping for all sanitizers">; def fsanitize_merge_handlers_EQ : CommaJoined<["-"], "fsanitize-merge=">, @@ -2547,7 +2547,7 @@ def fsanitize_merge_handlers : Flag<["-"], "fsanitize-merge">, Group; def fno_sanitize_merge_handlers : Flag<["-"], "fno-sanitize-merge">, Group, Alias, AliasArgs<["all"]>, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Do not allow compiler to merge handlers for any sanitizers">; def fsanitize_annotate_debug_info_EQ : CommaJoined<["-"], "fsanitize-annotate-debug-info=">, @@ -2595,10 +2595,10 @@ def fno_sanitize_link_cxx_runtime : Flag<["-"], "fno-sanitize-link-c++-runtime"> Group; defm sanitize_cfi_cross_dso : BoolOption<"f", "sanitize-cfi-cross-dso", CodeGenOpts<"SanitizeCfiCrossDso">, DefaultFalse, - PosFlag, - NegFlag, + NegFlag, - BothFlags<[], [ClangOption], " control flow integrity (CFI) checks for cross-DSO calls.">>, + BothFlags<[], [ClangOption, FlangOption], " control flow integrity (CFI) checks for cross-DSO calls.">>, Group; def fsanitize_cfi_icall_generalize_pointers : Flag<["-"], "fsanitize-cfi-icall-generalize-pointers">, Group, @@ -2610,10 +2610,10 @@ def fsanitize_cfi_icall_normalize_integers : Flag<["-"], "fsanitize-cfi-icall-ex MarshallingInfoFlag>; defm sanitize_cfi_canonical_jump_tables : BoolOption<"f", "sanitize-cfi-canonical-jump-tables", CodeGenOpts<"SanitizeCfiCanonicalJumpTables">, DefaultFalse, - PosFlag, - NegFlag, + NegFlag, - BothFlags<[], [ClangOption], " the jump table addresses canonical in the symbol table">>, + BothFlags<[], [ClangOption, FlangOption], " the jump table addresses canonical in the symbol table">>, Group; def fsanitize_kcfi_arity : Flag<["-"], "fsanitize-kcfi-arity">, Group, @@ -2621,14 +2621,14 @@ def fsanitize_kcfi_arity : Flag<["-"], "fsanitize-kcfi-arity">, MarshallingInfoFlag>; defm sanitize_stats : BoolOption<"f", "sanitize-stats", CodeGenOpts<"SanitizeStats">, DefaultFalse, - PosFlag, - 
NegFlag, + NegFlag, - BothFlags<[], [ClangOption], " sanitizer statistics gathering.">>, + BothFlags<[], [ClangOption, FlangOption], " sanitizer statistics gathering.">>, Group; def fsanitize_undefined_ignore_overflow_pattern_EQ : CommaJoined<["-"], "fsanitize-undefined-ignore-overflow-pattern=">, HelpText<"Specify the overflow patterns to exclude from arithmetic sanitizer instrumentation">, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, Values<"none,all,add-unsigned-overflow-test,add-signed-overflow-test,negated-unsigned-const,unsigned-post-decr-while">, MarshallingInfoStringVector>; def fsanitize_thread_memory_access : Flag<["-"], "fsanitize-thread-memory-access">, @@ -2636,21 +2636,21 @@ def fsanitize_thread_memory_access : Flag<["-"], "fsanitize-thread-memory-access HelpText<"Enable memory access instrumentation in ThreadSanitizer (default)">; def fno_sanitize_thread_memory_access : Flag<["-"], "fno-sanitize-thread-memory-access">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Disable memory access instrumentation in ThreadSanitizer">; def fsanitize_thread_func_entry_exit : Flag<["-"], "fsanitize-thread-func-entry-exit">, Group, HelpText<"Enable function entry/exit instrumentation in ThreadSanitizer (default)">; def fno_sanitize_thread_func_entry_exit : Flag<["-"], "fno-sanitize-thread-func-entry-exit">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Disable function entry/exit instrumentation in ThreadSanitizer">; def fsanitize_thread_atomics : Flag<["-"], "fsanitize-thread-atomics">, Group, HelpText<"Enable atomic operations instrumentation in ThreadSanitizer (default)">; def fno_sanitize_thread_atomics : Flag<["-"], "fno-sanitize-thread-atomics">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption]>, HelpText<"Disable atomic operations 
instrumentation in ThreadSanitizer">; def fsanitize_undefined_strip_path_components_EQ : Joined<["-"], "fsanitize-undefined-strip-path-components=">, Group, MetaVarName<"">, @@ -3448,7 +3448,7 @@ def fno_asm : Flag<["-"], "fno-asm">, Group; def fno_asynchronous_unwind_tables : Flag<["-"], "fno-asynchronous-unwind-tables">, Group; def fno_assume_sane_operator_new : Flag<["-"], "fno-assume-sane-operator-new">, Group, HelpText<"Don't assume that C++'s global operator new can't alias any pointer">, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, MarshallingInfoNegativeFlag>; def fno_builtin : Flag<["-"], "fno-builtin">, Group, Visibility<[ClangOption, CC1Option, CLOption, DXCOption]>, diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index b1ca747e68b89..52608715e7d4f 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -12,6 +12,7 @@ #include "clang/Basic/CodeGenOptions.h" #include "clang/Driver/Options.h" +#include "clang/Driver/SanitizerArgs.h" #include "llvm/Frontend/Debug/Options.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/Path.h" @@ -143,6 +144,15 @@ void Flang::addOtherOptions(const ArgList &Args, ArgStringList &CmdArgs) const { addDebugInfoKind(CmdArgs, DebugInfoKind); } +void Flang::addSanitizerOptions(const ToolChain &TC, const ArgList &Args, + ArgStringList &CmdArgs, + types::ID InputType) const { + SanitizerArgs SanArgs = TC.getSanitizerArgs(Args); + SanArgs.addArgs(TC, Args, CmdArgs, InputType); + // If Tc.getTriple() == amdgpu search for only allow -fsanitize=address for + // that target +} + void Flang::addCodegenOptions(const ArgList &Args, ArgStringList &CmdArgs) const { Arg *stackArrays = @@ -869,6 +879,9 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // Add Codegen options addCodegenOptions(Args, CmdArgs); + // Add Sanitizer Options + addSanitizerOptions(TC, Args, CmdArgs, 
InputType); + // Add R Group options Args.AddAllArgs(CmdArgs, options::OPT_R_Group); diff --git a/clang/lib/Driver/ToolChains/Flang.h b/clang/lib/Driver/ToolChains/Flang.h index 7c24a623af393..2b61955af7764 100644 --- a/clang/lib/Driver/ToolChains/Flang.h +++ b/clang/lib/Driver/ToolChains/Flang.h @@ -117,6 +117,15 @@ class LLVM_LIBRARY_VISIBILITY Flang : public Tool { void addCodegenOptions(const llvm::opt::ArgList &Args, llvm::opt::ArgStringList &CmdArgs) const; + /// Extract sanitizer options for code generation from the driver arguments + /// and add them to the command arguments. + /// + /// \param [in] Args The list of input driver arguments + /// \param [out] CmdArgs The list of output command arguments + void addSanitizerOptions(const ToolChain &TC, const llvm::opt::ArgList &Args, + llvm::opt::ArgStringList &CmdArgs, + types::ID InputType) const; + /// Extract other compilation options from the driver arguments and add them /// to the command arguments. /// diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index d9dbd274e83e5..fa707c4ddbfcc 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -47,5 +47,72 @@ ENUM_CODEGENOPT(FramePointer, llvm::FramePointerKind, 2, llvm::FramePointerKind: ENUM_CODEGENOPT(DoConcurrentMapping, DoConcurrentMappingKind, 2, DoConcurrentMappingKind::DCMK_None) ///< Map `do concurrent` to OpenMP +CODEGENOPT(SanitizeAddressUseAfterScope , 1, 0) ///< Enable use-after-scope detection + ///< in AddressSanitizer +ENUM_CODEGENOPT(SanitizeAddressUseAfterReturn, + llvm::AsanDetectStackUseAfterReturnMode, 2, + llvm::AsanDetectStackUseAfterReturnMode::Runtime + ) ///< Set detection mode for stack-use-after-return. 
+CODEGENOPT(SanitizeAddressPoisonCustomArrayCookie, 1, + 0) ///< Enable poisoning operator new[] which is not a replaceable + ///< global allocation function in AddressSanitizer +CODEGENOPT(SanitizeAddressGlobalsDeadStripping, 1, 0) ///< Enable linker dead stripping + ///< of globals in AddressSanitizer +CODEGENOPT(SanitizeAddressUseOdrIndicator, 1, 0) ///< Enable ODR indicator globals +CODEGENOPT(SanitizeMemoryTrackOrigins, 2, 0) ///< Enable tracking origins in + ///< MemorySanitizer +ENUM_CODEGENOPT(SanitizeAddressDtor, llvm::AsanDtorKind, 2, + llvm::AsanDtorKind::Global) ///< Set how ASan global + ///< destructors are emitted. +CODEGENOPT(SanitizeMemoryParamRetval, 1, 0) ///< Enable detection of uninitialized + ///< parameters and return values + ///< in MemorySanitizer +CODEGENOPT(SanitizeMemoryUseAfterDtor, 1, 0) ///< Enable use-after-delete detection + ///< in MemorySanitizer +CODEGENOPT(SanitizeCfiCrossDso, 1, 0) ///< Enable cross-dso support in CFI. +CODEGENOPT(SanitizeMinimalRuntime, 1, 0) ///< Use "_minimal" sanitizer runtime for + ///< diagnostics. +CODEGENOPT(SanitizeCfiICallGeneralizePointers, 1, 0) ///< Generalize pointer types in + ///< CFI icall function signatures +CODEGENOPT(SanitizeCfiICallNormalizeIntegers, 1, 0) ///< Normalize integer types in + ///< CFI icall function signatures +CODEGENOPT(SanitizeCfiCanonicalJumpTables, 1, 0) ///< Make jump table symbols canonical + ///< instead of creating a local jump table. +CODEGENOPT(UniqueSourceFileNames, 1, 0) ///< Allow the compiler to assume that TUs + ///< have unique source file names at link time +CODEGENOPT(SanitizeKcfiArity, 1, 0) ///< Embed arity in KCFI patchable function prefix +CODEGENOPT(SanitizeCoverageType, 2, 0) ///< Type of sanitizer coverage + ///< instrumentation. +CODEGENOPT(SanitizeCoverageIndirectCalls, 1, 0) ///< Enable sanitizer coverage + ///< for indirect calls. +CODEGENOPT(SanitizeCoverageTraceBB, 1, 0) ///< Enable basic block tracing + ///< in sanitizer coverage. 
+CODEGENOPT(SanitizeCoverageTraceCmp, 1, 0) ///< Enable cmp instruction tracing + ///< in sanitizer coverage. +CODEGENOPT(SanitizeCoverageTraceDiv, 1, 0) ///< Enable div instruction tracing + ///< in sanitizer coverage. +CODEGENOPT(SanitizeCoverageTraceGep, 1, 0) ///< Enable GEP instruction tracing + ///< in sanitizer coverage. +CODEGENOPT(SanitizeCoverage8bitCounters, 1, 0) ///< Use 8-bit frequency counters + ///< in sanitizer coverage. +CODEGENOPT(SanitizeCoverageTracePC, 1, 0) ///< Enable PC tracing + ///< in sanitizer coverage. +CODEGENOPT(SanitizeCoverageTracePCGuard, 1, 0) ///< Enable PC tracing with guard + ///< in sanitizer coverage. +CODEGENOPT(SanitizeCoverageInline8bitCounters, 1, 0) ///< Use inline 8bit counters. +CODEGENOPT(SanitizeCoverageInlineBoolFlag, 1, 0) ///< Use inline bool flag. +CODEGENOPT(SanitizeCoveragePCTable, 1, 0) ///< Create a PC Table. +CODEGENOPT(SanitizeCoverageControlFlow, 1, 0) ///< Collect control flow +CODEGENOPT(SanitizeCoverageNoPrune, 1, 0) ///< Disable coverage pruning. +CODEGENOPT(SanitizeCoverageStackDepth, 1, 0) ///< Enable max stack depth tracing +CODEGENOPT(SanitizeCoverageTraceLoads, 1, 0) ///< Enable tracing of loads. +CODEGENOPT(SanitizeCoverageTraceStores, 1, 0) ///< Enable tracing of stores. +CODEGENOPT(SanitizeBinaryMetadataCovered, 1, 0) ///< Emit PCs for covered functions. +CODEGENOPT(SanitizeBinaryMetadataAtomics, 1, 0) ///< Emit PCs for atomic operations. +CODEGENOPT(SanitizeBinaryMetadataUAR, 1, 0) ///< Emit PCs for start of functions + ///< that are subject to use-after-return checking. +CODEGENOPT(SanitizeStats , 1, 0) ///< Collect statistics for sanitizers. 
+CODEGENOPT(DisableIntegratedAS, 1, 0) ///< -no-integrated-as + #undef CODEGENOPT #undef ENUM_CODEGENOPT diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..b2a799682dfe1 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -16,6 +16,7 @@ #define FORTRAN_FRONTEND_CODEGENOPTIONS_H #include "flang/Optimizer/OpenMP/Utils.h" +#include "clang/Basic/Sanitizers.h" #include "llvm/Frontend/Debug/Options.h" #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Support/CodeGen.h" @@ -148,6 +149,50 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; + /// Set of sanitizer checks that are non-fatal (i.e. execution should be + /// continued when possible). + clang::SanitizerSet SanitizeRecover; + + /// Set of sanitizer checks that trap rather than diagnose. + clang::SanitizerSet SanitizeTrap; + + /// Set of sanitizer checks that can merge handlers (smaller code size at + /// the expense of debuggability). + clang::SanitizerSet SanitizeMergeHandlers; + + /// Set of thresholds in a range [0.0, 1.0]: the top hottest code responsible + /// for the given fraction of PGO counters will be excluded from sanitization + /// (0.0 [default] to skip none, 1.0 to skip all). + clang::SanitizerMaskCutoffs SanitizeSkipHotCutoffs; + + /// Path to allowlist file specifying which objects + /// (files, functions) should exclusively be instrumented + /// by sanitizer coverage pass. + std::vector SanitizeCoverageAllowlistFiles; + + /// Path to ignorelist file specifying which objects + /// (files, functions) listed for instrumentation by sanitizer + /// coverage pass should actually not be instrumented. 
+ std::vector SanitizeCoverageIgnorelistFiles; + + /// Path to ignorelist file specifying which objects + /// (files, functions) listed for instrumentation by sanitizer + /// binary metadata pass should not be instrumented. + std::vector SanitizeMetadataIgnorelistFiles; + + // Check if any one of SanitizeCoverage* is enabled. + bool hasSanitizeCoverage() const { + return SanitizeCoverageType || SanitizeCoverageIndirectCalls || + SanitizeCoverageTraceCmp || SanitizeCoverageTraceLoads || + SanitizeCoverageTraceStores || SanitizeCoverageControlFlow; + } + + // Check if any one of SanitizeBinaryMetadata* is enabled. + bool hasSanitizeBinaryMetadata() const { + return SanitizeBinaryMetadataCovered || SanitizeBinaryMetadataAtomics || + SanitizeBinaryMetadataUAR; + } + // Define accessors/mutators for code generation options of enumeration type. #define CODEGENOPT(Name, Bits, Default) #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \ @@ -158,6 +203,8 @@ class CodeGenOptions : public CodeGenOptionsBase { CodeGenOptions(); }; +bool asanUseGlobalsGC(const llvm::Triple &T, const CodeGenOptions &CGOpts); + std::optional getCodeModel(llvm::StringRef string); } // end namespace Fortran::frontend diff --git a/flang/include/flang/Support/LangOptions.def b/flang/include/flang/Support/LangOptions.def index d5bf7a2ecc036..9bc2c6ca5b10d 100644 --- a/flang/include/flang/Support/LangOptions.def +++ b/flang/include/flang/Support/LangOptions.def @@ -62,5 +62,6 @@ LANGOPT(OpenMPNoNestedParallelism, 1, 0) LANGOPT(VScaleMin, 32, 0) ///< Minimum vscale range value LANGOPT(VScaleMax, 32, 0) ///< Maximum vscale range value +LANGOPT(SanitizeAddressFieldPadding, 2, 0) #undef LANGOPT #undef ENUM_LANGOPT diff --git a/flang/include/flang/Support/LangOptions.h b/flang/include/flang/Support/LangOptions.h index 1dd676e62a9e5..00c6bf743ce60 100644 --- a/flang/include/flang/Support/LangOptions.h +++ b/flang/include/flang/Support/LangOptions.h @@ -18,10 +18,10 @@ #include #include +#include 
"clang/Basic/Sanitizers.h" #include "llvm/TargetParser/Triple.h" namespace Fortran::common { - /// Bitfields of LangOptions, split out from LangOptions to ensure /// that this large collection of bitfields is a trivial class type. class LangOptionsBase { @@ -72,6 +72,15 @@ class LangOptions : public LangOptionsBase { /// host code generation. std::string OMPHostIRFile; + /// Set of enabled sanitizers. + clang::SanitizerSet Sanitize; + /// Is at least one coverage instrumentation type enabled. + bool SanitizeCoverage = false; + + /// Paths to files specifying which objects + /// (files, functions, variables) should not be instrumented. + std::vector NoSanitizeFiles; + /// List of triples passed in using -fopenmp-targets. std::vector OMPTargetTriples; diff --git a/flang/lib/Frontend/CodeGenOptions.cpp b/flang/lib/Frontend/CodeGenOptions.cpp index 8a9d3c27c8bc3..18988ec5d3f78 100644 --- a/flang/lib/Frontend/CodeGenOptions.cpp +++ b/flang/lib/Frontend/CodeGenOptions.cpp @@ -11,17 +11,46 @@ //===----------------------------------------------------------------------===// #include "flang/Frontend/CodeGenOptions.h" +#include "llvm/TargetParser/Triple.h" #include #include namespace Fortran::frontend { +using namespace llvm; + CodeGenOptions::CodeGenOptions() { #define CODEGENOPT(Name, Bits, Default) Name = Default; #define ENUM_CODEGENOPT(Name, Type, Bits, Default) set##Name(Default); #include "flang/Frontend/CodeGenOptions.def" } +// Check if ASan should use GC-friendly instrumentation for globals. +// First of all, there is no point if -fdata-sections is off (expect for MachO, +// where this is not a factor). Also, on ELF this feature requires an assembler +// extension that only works with -integrated-as at the moment. 
+bool asanUseGlobalsGC(const Triple &T, const CodeGenOptions &CGOpts) { + if (!CGOpts.SanitizeAddressGlobalsDeadStripping) + return false; + switch (T.getObjectFormat()) { + case Triple::MachO: + case Triple::COFF: + return true; + case Triple::ELF: + return !CGOpts.DisableIntegratedAS; + case Triple::GOFF: + llvm::report_fatal_error("ASan not implemented for GOFF"); + case Triple::XCOFF: + llvm::report_fatal_error("ASan not implemented for XCOFF."); + case Triple::Wasm: + case Triple::DXContainer: + case Triple::SPIRV: + case Triple::UnknownObjectFormat: + break; + } + return false; +} + std::optional getCodeModel(llvm::StringRef string) { return llvm::StringSwitch>(string) .Case("tiny", llvm::CodeModel::Model::Tiny) diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 238079a09ef3a..7283cd90843d0 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -23,6 +23,7 @@ #include "clang/Basic/AllDiagnostics.h" #include "clang/Basic/DiagnosticDriver.h" #include "clang/Basic/DiagnosticOptions.h" +#include "clang/Basic/Sanitizers.h" #include "clang/Driver/Driver.h" #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" @@ -256,6 +257,82 @@ parseOptimizationRemark(clang::DiagnosticsEngine &diags, return result; } +static void parseSanitizerKinds(llvm::StringRef FlagName, + const std::vector &Sanitizers, + clang::DiagnosticsEngine &Diags, + clang::SanitizerSet &S) { + for (const auto &Sanitizer : Sanitizers) { + clang::SanitizerMask K = + clang::parseSanitizerValue(Sanitizer, /*AllowGroups=*/false); + if (K == clang::SanitizerMask()) + Diags.Report(clang::diag::err_drv_invalid_value) << FlagName << Sanitizer; + else + S.set(K, true); + } +} + +static llvm::SmallVector +serializeSanitizerKinds(clang::SanitizerSet S) { + llvm::SmallVector Values; + serializeSanitizerSet(S, Values); + return Values; +} + +static clang::SanitizerMaskCutoffs 
+parseSanitizerWeightedKinds(llvm::StringRef FlagName, + const std::vector &Sanitizers, + clang::DiagnosticsEngine &Diags) { + clang::SanitizerMaskCutoffs Cutoffs; + for (const auto &Sanitizer : Sanitizers) { + if (!parseSanitizerWeightedValue(Sanitizer, /*AllowGroups=*/false, Cutoffs)) + Diags.Report(clang::diag::err_drv_invalid_value) << FlagName << Sanitizer; + } + return Cutoffs; +} + +static bool parseSanitizerArgs(CompilerInvocation &res, + llvm::opt::ArgList &args, + clang::DiagnosticsEngine &diags) { + auto &LangOpts = res.getLangOpts(); + auto &CodeGenOpts = res.getCodeGenOpts(); + + parseSanitizerKinds( + "-fsanitize-recover=", + args.getAllArgValues(clang::driver::options::OPT_fsanitize_recover_EQ), + diags, CodeGenOpts.SanitizeRecover); + parseSanitizerKinds( + "-fsanitize-trap=", + args.getAllArgValues(clang::driver::options::OPT_fsanitize_trap_EQ), + diags, CodeGenOpts.SanitizeTrap); + parseSanitizerKinds( + "-fsanitize-merge=", + args.getAllArgValues( + clang::driver::options::OPT_fsanitize_merge_handlers_EQ), + diags, CodeGenOpts.SanitizeMergeHandlers); + + // Parse -fsanitize-skip-hot-cutoff= arguments. + CodeGenOpts.SanitizeSkipHotCutoffs = parseSanitizerWeightedKinds( + "-fsanitize-skip-hot-cutoff=", + args.getAllArgValues( + clang::driver::options::OPT_fsanitize_skip_hot_cutoff_EQ), + diags); + + // Parse -fsanitize= arguments. 
+ parseSanitizerKinds( + "-fsanitize=", + args.getAllArgValues(clang::driver::options::OPT_fsanitize_EQ), diags, + LangOpts.Sanitize); + + LangOpts.NoSanitizeFiles = + args.getAllArgValues(clang::driver::options::OPT_fsanitize_ignorelist_EQ); + std::vector systemIgnorelists = args.getAllArgValues( + clang::driver::options::OPT_fsanitize_system_ignorelist_EQ); + LangOpts.NoSanitizeFiles.insert(LangOpts.NoSanitizeFiles.end(), + systemIgnorelists.begin(), + systemIgnorelists.end()); + return true; +} + static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, llvm::opt::ArgList &args, clang::DiagnosticsEngine &diags) { @@ -395,6 +472,10 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, opts.RecordCommandLine = a->getValue(); } + // -mlink-bitcode-file + for (auto *a : args.filtered(clang::driver::options::OPT_mlink_bitcode_file)) + opts.BuiltinBCLibs.push_back(a->getValue()); + // -mlink-builtin-bitcode for (auto *a : args.filtered(clang::driver::options::OPT_mlink_builtin_bitcode)) @@ -1500,6 +1581,7 @@ bool CompilerInvocation::createFromArgs( success &= parseVectorLibArg(invoc.getCodeGenOpts(), args, diags); success &= parseSemaArgs(invoc, args, diags); success &= parseDialectArgs(invoc, args, diags); + success &= parseSanitizerArgs(invoc, args, diags); success &= parseOpenMPArgs(invoc, args, diags); success &= parseDiagArgs(invoc, args, diags); diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index e5a15c555fa5e..e6ab0402ce6b2 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -26,6 +26,7 @@ #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Semantics/runtime-type-info.h" #include "flang/Semantics/unparse-with-symbols.h" +#include "flang/Support/LangOptions.h" #include "flang/Support/default-kinds.h" #include "flang/Tools/CrossToolHelpers.h" @@ -67,7 +68,10 @@ #include "llvm/TargetParser/RISCVISAInfo.h" #include 
"llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" +#include "llvm/Transforms/Instrumentation/AddressSanitizer.h" +#include "llvm/Transforms/Instrumentation/AddressSanitizerOptions.h" #include "llvm/Transforms/Utils/ModuleUtils.h" +#include #include #include @@ -714,6 +718,7 @@ void CodeGenAction::generateLLVMIR() { CompilerInstance &ci = this->getInstance(); CompilerInvocation &invoc = ci.getInvocation(); const CodeGenOptions &opts = invoc.getCodeGenOpts(); + const common::LangOptions &langopts = invoc.getLangOpts(); const auto &mathOpts = invoc.getLoweringOpts().getMathOptions(); llvm::OptimizationLevel level = mapToLevel(opts); mlir::DefaultTimingManager &timingMgr = ci.getTimingManager(); @@ -787,6 +792,11 @@ void CodeGenAction::generateLLVMIR() { return; } + for (llvm::Function &F : llvmModule->getFunctionList()) { + if (langopts.Sanitize.has(clang::SanitizerKind::Address)) + F.addFnAttr(llvm::Attribute::SanitizeAddress); + } + // Set PIC/PIE level LLVM module flags. 
if (opts.PICLevel > 0) { llvmModule->setPICLevel(static_cast(opts.PICLevel)); @@ -899,9 +909,34 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } +void addSanitizers(const llvm::Triple &Triple, + const CodeGenOptions &flangCodeGenOpts, + const Fortran::common::LangOptions &flangLangOpts, + llvm::PassBuilder &PB, llvm::ModulePassManager &MPM) { + auto ASanPass = [&](clang::SanitizerMask Mask, bool CompileKernel) { + if (flangLangOpts.Sanitize.has(Mask)) { + bool UseGlobalGC = asanUseGlobalsGC(Triple, flangCodeGenOpts); + bool UseOdrIndicator = flangCodeGenOpts.SanitizeAddressUseOdrIndicator; + llvm::AsanDtorKind DestructorKind = + flangCodeGenOpts.getSanitizeAddressDtor(); + llvm::AddressSanitizerOptions Opts; + Opts.CompileKernel = CompileKernel; + Opts.Recover = flangCodeGenOpts.SanitizeRecover.has(Mask); + Opts.UseAfterScope = flangCodeGenOpts.SanitizeAddressUseAfterScope; + Opts.UseAfterReturn = flangCodeGenOpts.getSanitizeAddressUseAfterReturn(); + MPM.addPass(llvm::AddressSanitizerPass(Opts, UseGlobalGC, UseOdrIndicator, + DestructorKind)); + } + }; + ASanPass(clang::SanitizerKind::Address, false); + ASanPass(clang::SanitizerKind::KernelAddress, true); +} + void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts(); + const Fortran::common::LangOptions &langopts = + ci.getInvocation().getLangOpts(); clang::DiagnosticsEngine &diags = ci.getDiagnostics(); llvm::OptimizationLevel level = mapToLevel(opts); @@ -966,6 +1001,8 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { else mpm = pb.buildPerModuleDefaultPipeline(level); + addSanitizers(triple, opts, langopts, pb, mpm); + if (action == BackendActionTy::Backend_EmitBC) mpm.addPass(llvm::BitcodeWriterPass(os)); else if (action == BackendActionTy::Backend_EmitLL) From flang-commits at lists.llvm.org Thu May 15 04:36:46 2025 
From: flang-commits at lists.llvm.org (Sergio Afonso via flang-commits) Date: Thu, 15 May 2025 04:36:46 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] Minimize host ops remaining in device compilation (PR #137200) In-Reply-To: Message-ID: <6825d1ce.170a0220.8bd04.4151@mx.google.com> https://github.com/skatrak updated https://github.com/llvm/llvm-project/pull/137200 >From 035cac0842e5d0e3886be0442100b10d15bf1be7 Mon Sep 17 00:00:00 2001 From: Sergio Afonso Date: Tue, 15 Apr 2025 16:59:18 +0100 Subject: [PATCH] [Flang][OpenMP] Minimize host ops remaining in device compilation This patch updates the function filtering OpenMP pass intended to remove host functions from the MLIR module created by Flang lowering when targeting an OpenMP target device. Host functions holding target regions must be kept, so that the target regions within them can be translated for the device. The issue is that non-target operations inside these functions cannot be discarded because some of them hold information that is also relevant during target device codegen. Specifically, mapping information resides outside of `omp.target` regions. This patch updates the previous behavior where all host operations were preserved to then ignore all of those that are not actually needed by target device codegen. This, in practice, means only keeping target regions and mapping information needed by the device. Arguments for some of these remaining operations are replaced by placeholder allocations and `fir.undefined`, since they are only actually defined inside of the target regions themselves. As a result, this set of changes makes it possible to later simplify target device codegen, as it is no longer necessary to handle host operations differently to avoid issues. 
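For readers skimming the description above, here is a rough, standalone sketch of the filtering idea: keep only target operations plus the mapping information they use, with each map entry placed after the entries it depends on. It is only an illustration under stated assumptions and is not the code in this patch. `ToyOp`, `insertOnce`, `collectMapInfo` and `filterHostOps` are hypothetical names, and plain `std::vector` stands in for the MLIR and LLVM containers.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an MLIR operation; not a Flang or MLIR type.
struct ToyOp {
  std::string name;          // e.g. "omp.map.info", "omp.target", "func.call"
  std::vector<int> operands; // indices of the ops whose results this op uses
};

// Append idx to order unless it is already there (insertion-ordered set).
static void insertOnce(std::vector<int> &order, int idx) {
  if (std::find(order.begin(), order.end(), idx) == order.end())
    order.push_back(idx);
}

// Collect a map-info op, visiting the map-info ops it depends on first so
// that dependencies always precede their users in the output order.
static void collectMapInfo(const std::vector<ToyOp> &ops, int idx,
                           std::vector<int> &order) {
  assert(idx >= 0 && idx < static_cast<int>(ops.size()));
  for (int dep : ops[idx].operands)
    if (ops[dep].name == "omp.map.info")
      collectMapInfo(ops, dep, order);
  insertOnce(order, idx);
}

// Return the indices of the ops kept for the device: map infos first (in
// dependency order), then the target ops. Everything else is dropped.
std::vector<int> filterHostOps(const std::vector<ToyOp> &ops) {
  std::vector<int> mapInfos, targets;
  for (int i = 0; i < static_cast<int>(ops.size()); ++i) {
    if (ops[i].name != "omp.target")
      continue;
    targets.push_back(i);
    for (int operand : ops[i].operands)
      if (ops[operand].name == "omp.map.info")
        collectMapInfo(ops, operand, mapInfos);
  }
  mapInfos.insert(mapInfos.end(), targets.begin(), targets.end());
  return mapInfos;
}
```

In this toy model, a function holding an alloca, two dependent map entries, a host call and a target op would keep only the two map entries and the target op, in that order. The real pass additionally replaces values such as the alloca with placeholders instead of simply dropping them.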
--- .../include/flang/Optimizer/OpenMP/Passes.td | 3 +- .../Optimizer/OpenMP/FunctionFiltering.cpp | 448 ++++++++++++++++ .../OpenMP/declare-target-link-tarop-cap.f90 | 19 +- flang/test/Lower/OpenMP/host-eval.f90 | 55 +- flang/test/Lower/OpenMP/real10.f90 | 5 +- .../OpenMP/function-filtering-host-ops.mlir | 498 ++++++++++++++++++ .../function-filtering.mlir} | 0 7 files changed, 996 insertions(+), 32 deletions(-) create mode 100644 flang/test/Transforms/OpenMP/function-filtering-host-ops.mlir rename flang/test/Transforms/{omp-function-filtering.mlir => OpenMP/function-filtering.mlir} (100%) diff --git a/flang/include/flang/Optimizer/OpenMP/Passes.td b/flang/include/flang/Optimizer/OpenMP/Passes.td index 704faf0ccd856..635604ca33550 100644 --- a/flang/include/flang/Optimizer/OpenMP/Passes.td +++ b/flang/include/flang/Optimizer/OpenMP/Passes.td @@ -46,7 +46,8 @@ def FunctionFilteringPass : Pass<"omp-function-filtering"> { "for the target device."; let dependentDialects = [ "mlir::func::FuncDialect", - "fir::FIROpsDialect" + "fir::FIROpsDialect", + "mlir::omp::OpenMPDialect" ]; } diff --git a/flang/lib/Optimizer/OpenMP/FunctionFiltering.cpp b/flang/lib/Optimizer/OpenMP/FunctionFiltering.cpp index 9554808824ac3..b600de9702fd4 100644 --- a/flang/lib/Optimizer/OpenMP/FunctionFiltering.cpp +++ b/flang/lib/Optimizer/OpenMP/FunctionFiltering.cpp @@ -13,13 +13,16 @@ #include "flang/Optimizer/Dialect/FIRDialect.h" #include "flang/Optimizer/Dialect/FIROpsSupport.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/OpenMP/Passes.h" #include "mlir/Dialect/Func/IR/FuncOps.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/Dialect/OpenMP/OpenMPInterfaces.h" #include "mlir/IR/BuiltinOps.h" +#include "llvm/ADT/SetVector.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/TypeSwitch.h" namespace flangomp { #define GEN_PASS_DEF_FUNCTIONFILTERINGPASS @@ -28,6 +31,104 @@ namespace flangomp { using namespace mlir; +/// Add an operation to one of 
the output sets to be later rewritten, based on +/// whether it is located within the given region. +template +static void collectRewriteImpl(OpTy op, Region &region, + llvm::SetVector &rewrites, + llvm::SetVector *parentRewrites) { + if (rewrites.contains(op)) + return; + + if (!parentRewrites || region.isAncestor(op->getParentRegion())) + rewrites.insert(op); + else + parentRewrites->insert(op.getOperation()); +} + +template +static void collectRewrite(OpTy op, Region &region, + llvm::SetVector &rewrites, + llvm::SetVector *parentRewrites) { + collectRewriteImpl(op, region, rewrites, parentRewrites); +} + +/// Add an \c omp.map.info operation and all its members recursively to one of +/// the output sets to be later rewritten, based on whether they are located +/// within the given region. +/// +/// Dependencies across \c omp.map.info are maintained by ensuring dependencies +/// are added to the output sets before operations based on them. +template <> +void collectRewrite(omp::MapInfoOp mapOp, Region &region, + llvm::SetVector &rewrites, + llvm::SetVector *parentRewrites) { + for (Value member : mapOp.getMembers()) + collectRewrite(cast(member.getDefiningOp()), region, + rewrites, parentRewrites); + + collectRewriteImpl(mapOp, region, rewrites, parentRewrites); +} + +/// Add the given value to a sorted set if it should be replaced by a +/// placeholder when used as an operand that must remain for the device. The +/// output set used will depend on whether the value is defined within the +/// given region. +/// +/// Values that are block arguments of \c omp.target_data and \c func.func +/// operations are skipped, since they will still be available after all +/// rewrites are completed. 
+static void collectRewrite(Value value, Region &region, + llvm::SetVector &rewrites, + llvm::SetVector *parentRewrites) { + if ((isa(value) && + isa( + cast(value).getOwner()->getParentOp())) || + rewrites.contains(value)) + return; + + if (!parentRewrites) { + rewrites.insert(value); + return; + } + + Region *definingRegion; + if (auto blockArg = dyn_cast(value)) + definingRegion = blockArg.getOwner()->getParent(); + else + definingRegion = value.getDefiningOp()->getParentRegion(); + + assert(definingRegion && "defining op/block must exist in a region"); + + if (region.isAncestor(definingRegion)) + rewrites.insert(value); + else + parentRewrites->insert(value); +} + +/// Add operations in \c childRewrites to one of the output sets based on +/// whether they are located within the given region. +template +static void +applyChildRewrites(Region &region, + const llvm::SetVector &childRewrites, + llvm::SetVector &rewrites, + llvm::SetVector *parentRewrites) { + for (Operation *rewrite : childRewrites) + if (auto op = dyn_cast(*rewrite)) + collectRewrite(op, region, rewrites, parentRewrites); +} + +/// Add values in \c childRewrites to one of the output sets based on +/// whether they are defined within the given region. 
+static void applyChildRewrites(Region &region, + const llvm::SetVector &childRewrites, + llvm::SetVector &rewrites, + llvm::SetVector *parentRewrites) { + for (Value value : childRewrites) + collectRewrite(value, region, rewrites, parentRewrites); +} + namespace { class FunctionFilteringPass : public flangomp::impl::FunctionFilteringPassBase { @@ -94,6 +195,12 @@ class FunctionFilteringPass funcOp.erase(); return WalkResult::skip(); } + + if (failed(rewriteHostRegion(funcOp.getRegion()))) { + funcOp.emitOpError() << "could not be rewritten for target device"; + return WalkResult::interrupt(); + } + if (declareTargetOp) declareTargetOp.setDeclareTarget(declareType, omp::DeclareTargetCaptureClause::to); @@ -101,5 +208,346 @@ class FunctionFilteringPass return WalkResult::advance(); }); } + +private: + /// Rewrite the given host device region belonging to a function that contains + /// \c omp.target operations, to remove host-only operations that are not used + /// by device codegen. + /// + /// It is based on the expected form of the MLIR module as produced by Flang + /// lowering and it performs the following mutations: + /// - Replace all values returned by the function with \c fir.undefined. + /// - Operations taking map-like clauses (e.g. \c omp.target, + /// \c omp.target_data, etc) are moved to the end of the function. If they + /// are nested inside of any other operations, they are hoisted out of + /// them. If the region belongs to \c omp.target_data, these operations + /// are hoisted to its top level, rather than to the parent function. + /// - \c device, \c if and \c depend clauses are removed from these target + /// functions. Values initializing other clauses are either replaced by + /// placeholders as follows: + /// - Values defined by block arguments are replaced by placeholders only + /// if they are not attached to \c func.func or \c omp.target_data + /// operations. In that case, they are kept unmodified. 
+ /// - \c arith.constant and \c fir.address_of are maintained. + /// - Other values are replaced by a combination of an \c fir.alloca for a + /// single bit and an \c fir.convert to the original type of the value. + /// This can be done because the code eventually generated for these + /// operations will be discarded, as they aren't runnable by the target + /// device. + /// - \c omp.map.info operations associated to these target regions are + /// preserved. These are moved above all \c omp.target and sorted to + /// satisfy dependencies among them. + /// - \c bounds arguments are removed from \c omp.map.info operations. + /// - \c var_ptr and \c var_ptr_ptr arguments of \c omp.map.info are + /// handled as follows: + /// - \c var_ptr_ptr is expected to be defined by a \c fir.box_offset + /// operation which is preserved. Otherwise, the pass will fail. + /// - \c var_ptr can be defined by an \c hlfir.declare which is also + /// preserved. Its \c memref argument is replaced by a placeholder or + /// maintained similarly to non-map clauses of target operations + /// described above. If it has \c shape or \c typeparams arguments, they + /// are replaced by applicable constants. \c dummy_scope arguments + /// are discarded. + /// - Every other operation not located inside of an \c omp.target is + /// removed. + /// - Whenever a value or operation that would otherwise be replaced with a + /// placeholder is defined outside of the region being rewritten, it is + /// added to the \c parentOpRewrites or \c parentValRewrites output + /// argument, to be later handled by the caller. This is only intended to + /// properly support nested \c omp.target_data and \c omp.target placed + /// inside of \c omp.target_data. When called for the main function, these + /// output arguments must not be set. + LogicalResult + rewriteHostRegion(Region &region, + llvm::SetVector *parentOpRewrites = nullptr, + llvm::SetVector *parentValRewrites = nullptr) { + // Extract parent op information. 
+ auto [funcOp, targetDataOp] = [&region]() { + Operation *parent = region.getParentOp(); + return std::make_tuple(dyn_cast(parent), + dyn_cast(parent)); + }(); + assert((bool)funcOp != (bool)targetDataOp && + "region must be defined by either func.func or omp.target_data"); + assert((bool)parentOpRewrites == (bool)targetDataOp && + (bool)parentValRewrites == (bool)targetDataOp && + "parent rewrites must be passed iff rewriting omp.target_data"); + + // Collect operations that have mapping information associated to them. + llvm::SmallVector< + std::variant> + targetOps; + + // Sets to store pending rewrites marked by child omp.target_data ops. + llvm::SetVector childOpRewrites; + llvm::SetVector childValRewrites; + WalkResult result = region.walk([&](Operation *op) { + // Skip the inside of omp.target regions, since these contain device code. + if (auto targetOp = dyn_cast(op)) { + targetOps.push_back(targetOp); + return WalkResult::skip(); + } + + if (auto targetOp = dyn_cast(op)) { + // Recursively rewrite omp.target_data regions as well. + if (failed(rewriteHostRegion(targetOp.getRegion(), &childOpRewrites, + &childValRewrites))) { + targetOp.emitOpError() << "rewrite for target device failed"; + return WalkResult::interrupt(); + } + + targetOps.push_back(targetOp); + return WalkResult::skip(); + } + + if (auto targetOp = dyn_cast(op)) + targetOps.push_back(targetOp); + else if (auto targetOp = dyn_cast(op)) + targetOps.push_back(targetOp); + else if (auto targetOp = dyn_cast(op)) + targetOps.push_back(targetOp); + + return WalkResult::advance(); + }); + + if (result.wasInterrupted()) + return failure(); + + // Make a temporary clone of the parent operation with an empty region, + // and update all references to entry block arguments to those of the new + // region. Users will later either be moved to the new region or deleted + // when the original region is replaced by the new. 
+ OpBuilder builder(&getContext()); + builder.setInsertionPointAfter(region.getParentOp()); + Operation *newOp = builder.cloneWithoutRegions(*region.getParentOp()); + Block &block = newOp->getRegion(0).emplaceBlock(); + + llvm::SmallVector locs; + locs.reserve(region.getNumArguments()); + llvm::transform(region.getArguments(), std::back_inserter(locs), + [](const BlockArgument &arg) { return arg.getLoc(); }); + block.addArguments(region.getArgumentTypes(), locs); + + for (auto [oldArg, newArg] : + llvm::zip_equal(region.getArguments(), block.getArguments())) + oldArg.replaceAllUsesWith(newArg); + + // Collect omp.map.info ops while satisfying interdependencies. This must be + // updated whenever operands to operations contained in targetOps change. + llvm::SetVector rewriteValues; + llvm::SetVector mapInfos; + for (auto targetOp : targetOps) { + std::visit( + [&](auto op) { + // Variables unused by the device, present on all target ops. + op.getDeviceMutable().clear(); + op.getIfExprMutable().clear(); + + for (Value mapVar : op.getMapVars()) + collectRewrite(cast(mapVar.getDefiningOp()), + region, mapInfos, parentOpRewrites); + + if constexpr (!std::is_same_v) { + // Variables unused by the device, present on all target ops + // except for omp.target_data. + op.getDependVarsMutable().clear(); + op.setDependKindsAttr(nullptr); + } + + if constexpr (std::is_same_v) { + assert(op.getHostEvalVars().empty() && + "unexpected host_eval in target device module"); + // TODO: Clear some of these operands rather than rewriting them, + // depending on whether they are needed by device codegen once + // support for them is fully implemented. 
+ for (Value allocVar : op.getAllocateVars()) + collectRewrite(allocVar, region, rewriteValues, + parentValRewrites); + for (Value allocVar : op.getAllocatorVars()) + collectRewrite(allocVar, region, rewriteValues, + parentValRewrites); + for (Value inReduction : op.getInReductionVars()) + collectRewrite(inReduction, region, rewriteValues, + parentValRewrites); + for (Value isDevPtr : op.getIsDevicePtrVars()) + collectRewrite(isDevPtr, region, rewriteValues, + parentValRewrites); + for (Value mapVar : op.getHasDeviceAddrVars()) + collectRewrite(cast(mapVar.getDefiningOp()), + region, mapInfos, parentOpRewrites); + for (Value privateVar : op.getPrivateVars()) + collectRewrite(privateVar, region, rewriteValues, + parentValRewrites); + if (Value threadLimit = op.getThreadLimit()) + collectRewrite(threadLimit, region, rewriteValues, + parentValRewrites); + } else if constexpr (std::is_same_v) { + for (Value mapVar : op.getUseDeviceAddrVars()) + collectRewrite(cast(mapVar.getDefiningOp()), + region, mapInfos, parentOpRewrites); + for (Value mapVar : op.getUseDevicePtrVars()) + collectRewrite(cast(mapVar.getDefiningOp()), + region, mapInfos, parentOpRewrites); + } + }, + targetOp); + } + + applyChildRewrites(region, childOpRewrites, mapInfos, parentOpRewrites); + + // Move omp.map.info ops to the new block and collect dependencies. 
+ llvm::SetVector declareOps; + llvm::SetVector boxOffsets; + for (omp::MapInfoOp mapOp : mapInfos) { + if (auto declareOp = dyn_cast_if_present( + mapOp.getVarPtr().getDefiningOp())) + collectRewrite(declareOp, region, declareOps, parentOpRewrites); + else + collectRewrite(mapOp.getVarPtr(), region, rewriteValues, + parentValRewrites); + + if (Value varPtrPtr = mapOp.getVarPtrPtr()) { + if (auto boxOffset = llvm::dyn_cast_if_present( + varPtrPtr.getDefiningOp())) + collectRewrite(boxOffset, region, boxOffsets, parentOpRewrites); + else + return mapOp->emitOpError() << "var_ptr_ptr rewrite only supported " + "if defined by fir.box_offset"; + } + + // Bounds are not used during target device codegen. + mapOp.getBoundsMutable().clear(); + mapOp->moveBefore(&block, block.end()); + } + + applyChildRewrites(region, childOpRewrites, declareOps, parentOpRewrites); + applyChildRewrites(region, childOpRewrites, boxOffsets, parentOpRewrites); + + // Create a temporary marker to simplify the op moving process below. + builder.setInsertionPointToStart(&block); + auto marker = builder.create(builder.getUnknownLoc(), + builder.getNoneType()); + builder.setInsertionPoint(marker); + + // Handle dependencies of hlfir.declare ops. + for (hlfir::DeclareOp declareOp : declareOps) { + collectRewrite(declareOp.getMemref(), region, rewriteValues, + parentValRewrites); + + // Shape and typeparams aren't needed for target device codegen, but + // removing them would break verifiers. + Value zero; + if (declareOp.getShape() || !declareOp.getTypeparams().empty()) + zero = builder.create(declareOp.getLoc(), + builder.getI64IntegerAttr(0)); + + if (auto shape = declareOp.getShape()) { + // The pre-cg rewrite pass requires the shape to be defined by one of + // fir.shape, fir.shapeshift or fir.shift, so we need to make sure it's + // still defined by one of these after this pass. 
+ Operation *shapeOp = shape.getDefiningOp(); + llvm::SmallVector extents(shapeOp->getNumOperands(), zero); + Value newShape = + llvm::TypeSwitch(shapeOp) + .Case([&](fir::ShapeOp op) { + return builder.create(op.getLoc(), extents); + }) + .Case([&](fir::ShapeShiftOp op) { + auto type = fir::ShapeShiftType::get(op.getContext(), + extents.size() / 2); + return builder.create(op.getLoc(), type, + extents); + }) + .Case([&](fir::ShiftOp op) { + auto type = + fir::ShiftType::get(op.getContext(), extents.size()); + return builder.create(op.getLoc(), type, + extents); + }) + .Default([](Operation *op) { + op->emitOpError() + << "hlfir.declare shape expected to be one of: " + "fir.shape, fir.shapeshift or fir.shift"; + return nullptr; + }); + + if (!newShape) + return failure(); + + declareOp.getShapeMutable().assign(newShape); + } + + for (OpOperand &typeParam : declareOp.getTypeparamsMutable()) + typeParam.assign(zero); + + declareOp.getDummyScopeMutable().clear(); + } + + // We don't actually need the proper initialization, but rather just + // maintain the basic form of these operands. We create 1-bit placeholder + // allocas that we "typecast" to the expected type and replace all uses. + // Using fir.undefined here instead is not possible because these variables + // cannot be constants, as that would trigger different codegen for target + // regions. + applyChildRewrites(region, childValRewrites, rewriteValues, + parentValRewrites); + for (Value value : rewriteValues) { + Location loc = value.getLoc(); + Value rewriteValue; + // If it's defined by fir.address_of, then we need to keep that op as + // well because it might be pointing to a 'declare target' global. + // Constants can also trigger different codegen paths, so we keep them as + // well. 
+ if (isa_and_present( + value.getDefiningOp())) { + rewriteValue = builder.clone(*value.getDefiningOp())->getResult(0); + } else { + Value placeholder = + builder.create(loc, builder.getI1Type()); + rewriteValue = + builder.create(loc, value.getType(), placeholder); + } + value.replaceAllUsesWith(rewriteValue); + } + + // Move omp.map.info dependencies. + for (hlfir::DeclareOp declareOp : declareOps) + declareOp->moveBefore(marker); + + // The box_ref argument of fir.box_offset is expected to be the same value + // that was passed as var_ptr to the corresponding omp.map.info, so we don't + // need to handle its defining op here. + for (fir::BoxOffsetOp boxOffset : boxOffsets) + boxOffset->moveBefore(marker); + + marker->erase(); + + // Move target operations to the end of the new block. + for (auto targetOp : targetOps) + std::visit([&block](auto op) { op->moveBefore(&block, block.end()); }, + targetOp); + + // Add terminator to the new block. + builder.setInsertionPointToEnd(&block); + if (funcOp) { + llvm::SmallVector returnValues; + returnValues.reserve(funcOp.getNumResults()); + for (auto type : funcOp.getResultTypes()) + returnValues.push_back( + builder.create(funcOp.getLoc(), type)); + + builder.create(funcOp.getLoc(), returnValues); + } else { + builder.create(targetDataOp.getLoc()); + } + + // Replace old region (now missing ops) with the new one and remove the + // temporary operation clone. 
+ region.takeBody(newOp->getRegion(0)); + newOp->erase(); + return success(); + } }; } // namespace diff --git a/flang/test/Lower/OpenMP/declare-target-link-tarop-cap.f90 b/flang/test/Lower/OpenMP/declare-target-link-tarop-cap.f90 index cfdcd9eda82d1..8f4d1bdd600d7 100644 --- a/flang/test/Lower/OpenMP/declare-target-link-tarop-cap.f90 +++ b/flang/test/Lower/OpenMP/declare-target-link-tarop-cap.f90 @@ -1,7 +1,7 @@ -!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s -!RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-is-device %s -o - | FileCheck %s -!RUN: bbc -emit-hlfir -fopenmp %s -o - | FileCheck %s -!RUN: bbc -emit-hlfir -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s +!RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s --check-prefixes=BOTH,HOST +!RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-is-device %s -o - | FileCheck %s --check-prefixes=BOTH,DEVICE +!RUN: bbc -emit-hlfir -fopenmp %s -o - | FileCheck %s --check-prefixes=BOTH,HOST +!RUN: bbc -emit-hlfir -fopenmp -fopenmp-is-target-device %s -o - | FileCheck %s --check-prefixes=BOTH,DEVICE program test_link @@ -20,13 +20,14 @@ program test_link integer, pointer :: test_ptr2 !$omp declare target link(test_ptr2) - !CHECK-DAG: {{%.*}} = omp.map.info var_ptr({{%.*}} : !fir.ref, i32) map_clauses(implicit, tofrom) capture(ByRef) -> !fir.ref {name = "test_int"} + !BOTH-DAG: {{%.*}} = omp.map.info var_ptr({{%.*}} : !fir.ref, i32) map_clauses(implicit, tofrom) capture(ByRef) -> !fir.ref {name = "test_int"} !$omp target test_int = test_int + 1 !$omp end target - !CHECK-DAG: {{%.*}} = omp.map.info var_ptr({{%.*}} : !fir.ref>, !fir.array<3xi32>) map_clauses(implicit, tofrom) capture(ByRef) bounds({{%.*}}) -> !fir.ref> {name = "test_array_1d"} + !HOST-DAG: {{%.*}} = omp.map.info var_ptr({{%.*}} : !fir.ref>, !fir.array<3xi32>) map_clauses(implicit, tofrom) capture(ByRef) bounds({{%.*}}) -> !fir.ref> {name = "test_array_1d"} + !DEVICE-DAG: {{%.*}} = omp.map.info var_ptr({{%.*}} : !fir.ref>, 
!fir.array<3xi32>) map_clauses(implicit, tofrom) capture(ByRef) -> !fir.ref> {name = "test_array_1d"} !$omp target do i = 1,3 test_array_1d(i) = i * 2 @@ -35,18 +36,18 @@ program test_link allocate(test_ptr1) test_ptr1 = 1 - !CHECK-DAG: {{%.*}} = omp.map.info var_ptr({{%.*}} : !fir.ref>>, !fir.box>) map_clauses(implicit, to) capture(ByRef) members({{%.*}} : !fir.llvm_ptr>) -> !fir.ref>> {name = "test_ptr1"} + !BOTH-DAG: {{%.*}} = omp.map.info var_ptr({{%.*}} : !fir.ref>>, !fir.box>) map_clauses(implicit, to) capture(ByRef) members({{%.*}} : !fir.llvm_ptr>) -> !fir.ref>> {name = "test_ptr1"} !$omp target test_ptr1 = test_ptr1 + 1 !$omp end target - !CHECK-DAG: {{%.*}} = omp.map.info var_ptr({{%.*}} : !fir.ref, i32) map_clauses(implicit, tofrom) capture(ByRef) -> !fir.ref {name = "test_target"} + !BOTH-DAG: {{%.*}} = omp.map.info var_ptr({{%.*}} : !fir.ref, i32) map_clauses(implicit, tofrom) capture(ByRef) -> !fir.ref {name = "test_target"} !$omp target test_target = test_target + 1 !$omp end target - !CHECK-DAG: {{%.*}} = omp.map.info var_ptr({{%.*}} : !fir.ref>>, !fir.box>) map_clauses(implicit, to) capture(ByRef) members({{%.*}} : !fir.llvm_ptr>) -> !fir.ref>> {name = "test_ptr2"} + !BOTH-DAG: {{%.*}} = omp.map.info var_ptr({{%.*}} : !fir.ref>>, !fir.box>) map_clauses(implicit, to) capture(ByRef) members({{%.*}} : !fir.llvm_ptr>) -> !fir.ref>> {name = "test_ptr2"} test_ptr2 => test_target !$omp target test_ptr2 = test_ptr2 + 1 diff --git a/flang/test/Lower/OpenMP/host-eval.f90 b/flang/test/Lower/OpenMP/host-eval.f90 index fe5b9597f8620..c059f7338b26d 100644 --- a/flang/test/Lower/OpenMP/host-eval.f90 +++ b/flang/test/Lower/OpenMP/host-eval.f90 @@ -22,8 +22,10 @@ subroutine teams() !$omp end target - ! BOTH: omp.teams - ! BOTH-SAME: num_teams({{.*}}) thread_limit({{.*}}) { + ! HOST: omp.teams + ! HOST-SAME: num_teams({{.*}}) thread_limit({{.*}}) { + + ! 
DEVICE-NOT: omp.teams !$omp teams num_teams(1) thread_limit(2) call foo() !$omp end teams @@ -76,13 +78,18 @@ subroutine distribute_parallel_do() !$omp end distribute parallel do !$omp end target teams - ! BOTH: omp.teams + ! HOST: omp.teams + ! DEVICE-NOT: omp.teams !$omp teams - ! BOTH: omp.parallel - ! BOTH-SAME: num_threads({{.*}}) - ! BOTH: omp.distribute - ! BOTH-NEXT: omp.wsloop + ! HOST: omp.parallel + ! HOST-SAME: num_threads({{.*}}) + ! HOST: omp.distribute + ! HOST-NEXT: omp.wsloop + + ! DEVICE-NOT: omp.parallel + ! DEVICE-NOT: omp.distribute + ! DEVICE-NOT: omp.wsloop !$omp distribute parallel do num_threads(1) do i=1,10 call foo() @@ -140,14 +147,20 @@ subroutine distribute_parallel_do_simd() !$omp end distribute parallel do simd !$omp end target teams - ! BOTH: omp.teams + ! HOST: omp.teams + ! DEVICE-NOT: omp.teams !$omp teams - ! BOTH: omp.parallel - ! BOTH-SAME: num_threads({{.*}}) - ! BOTH: omp.distribute - ! BOTH-NEXT: omp.wsloop - ! BOTH-NEXT: omp.simd + ! HOST: omp.parallel + ! HOST-SAME: num_threads({{.*}}) + ! HOST: omp.distribute + ! HOST-NEXT: omp.wsloop + ! HOST-NEXT: omp.simd + + ! DEVICE-NOT: omp.parallel + ! DEVICE-NOT: omp.distribute + ! DEVICE-NOT: omp.wsloop + ! DEVICE-NOT: omp.simd !$omp distribute parallel do simd num_threads(1) do i=1,10 call foo() @@ -194,10 +207,12 @@ subroutine distribute() !$omp end distribute !$omp end target teams - ! BOTH: omp.teams + ! HOST: omp.teams + ! DEVICE-NOT: omp.teams !$omp teams - ! BOTH: omp.distribute + ! HOST: omp.distribute + ! DEVICE-NOT: omp.distribute !$omp distribute do i=1,10 call foo() @@ -246,11 +261,15 @@ subroutine distribute_simd() !$omp end distribute simd !$omp end target teams - ! BOTH: omp.teams + ! HOST: omp.teams + ! DEVICE-NOT: omp.teams !$omp teams - ! BOTH: omp.distribute - ! BOTH-NEXT: omp.simd + ! HOST: omp.distribute + ! HOST-NEXT: omp.simd + + ! DEVICE-NOT: omp.distribute + ! 
DEVICE-NOT: omp.simd !$omp distribute simd do i=1,10 call foo() diff --git a/flang/test/Lower/OpenMP/real10.f90 b/flang/test/Lower/OpenMP/real10.f90 index a31d2ace80044..c76c2bde0f6f6 100644 --- a/flang/test/Lower/OpenMP/real10.f90 +++ b/flang/test/Lower/OpenMP/real10.f90 @@ -5,9 +5,6 @@ !CHECK: hlfir.declare %{{.*}} {uniq_name = "_QFEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) program p + !$omp declare target real(10) :: x - !$omp target - continue - !$omp end target end - diff --git a/flang/test/Transforms/OpenMP/function-filtering-host-ops.mlir b/flang/test/Transforms/OpenMP/function-filtering-host-ops.mlir new file mode 100644 index 0000000000000..4d8975d58a50c --- /dev/null +++ b/flang/test/Transforms/OpenMP/function-filtering-host-ops.mlir @@ -0,0 +1,498 @@ +// RUN: fir-opt --omp-function-filtering %s | FileCheck %s + +module attributes {omp.is_target_device = true} { + // CHECK-LABEL: func.func @basic_checks + // CHECK-SAME: (%[[ARG:.*]]: !fir.ref) -> (i32, f32) + func.func @basic_checks(%arg: !fir.ref) -> (i32, f32) { + // CHECK-NEXT: %[[PLACEHOLDER:.*]] = fir.alloca i1 + // CHECK-NEXT: %[[ALLOC:.*]] = fir.convert %[[PLACEHOLDER]] : (!fir.ref) -> !fir.ref + // CHECK-NEXT: %[[GLOBAL:.*]] = fir.address_of(@global_scalar) : !fir.ref + %r0 = arith.constant 10 : i32 + %r1 = arith.constant 2.5 : f32 + + func.call @foo() : () -> () + + // CHECK-NEXT: %[[ARG_DECL:.*]]:2 = hlfir.declare %[[ARG]] {uniq_name = "arg"} + %0:2 = hlfir.declare %arg {uniq_name = "arg"} : (!fir.ref) -> (!fir.ref, !fir.ref) + + // CHECK-NEXT: %[[GLOBAL_DECL:.*]]:2 = hlfir.declare %[[GLOBAL]] {uniq_name = "global_scalar"} + %global = fir.address_of(@global_scalar) : !fir.ref + %1:2 = hlfir.declare %global {uniq_name = "global_scalar"} : (!fir.ref) -> (!fir.ref, !fir.ref) + + // CHECK-NEXT: %[[ALLOC_DECL:.*]]:2 = hlfir.declare %[[ALLOC]] {uniq_name = "alloc"} + %alloc = fir.alloca i32 + %2:2 = hlfir.declare %alloc {uniq_name = "alloc"} : (!fir.ref) -> (!fir.ref, !fir.ref) + + // CHECK-NEXT: 
%[[MAP0:.*]] = omp.map.info var_ptr(%[[ARG_DECL]]#1{{.*}}) + // CHECK-NEXT: %[[MAP1:.*]] = omp.map.info var_ptr(%[[GLOBAL_DECL]]#1{{.*}}) + // CHECK-NEXT: %[[MAP3:.*]] = omp.map.info var_ptr(%[[ALLOC]]{{.*}}) + // CHECK-NEXT: %[[MAP2:.*]] = omp.map.info var_ptr(%[[ALLOC_DECL]]#1{{.*}}) + // CHECK-NEXT: %[[MAP4:.*]] = omp.map.info var_ptr(%[[ARG_DECL]]#1{{.*}}) + // CHECK-NEXT: %[[MAP5:.*]] = omp.map.info var_ptr(%[[GLOBAL_DECL]]#1{{.*}}) + // CHECK-NEXT: %[[MAP6:.*]] = omp.map.info var_ptr(%[[ALLOC_DECL]]#1{{.*}}) + // CHECK-NEXT: %[[MAP7:.*]] = omp.map.info var_ptr(%[[ALLOC]]{{.*}}) + // CHECK-NEXT: %[[MAP8:.*]] = omp.map.info var_ptr(%[[ARG_DECL]]#1{{.*}}) + // CHECK-NEXT: %[[MAP9:.*]] = omp.map.info var_ptr(%[[GLOBAL_DECL]]#1{{.*}}) + // CHECK-NEXT: %[[MAP10:.*]] = omp.map.info var_ptr(%[[ALLOC_DECL]]#1{{.*}}) + %m0 = omp.map.info var_ptr(%0#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + %m1 = omp.map.info var_ptr(%1#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + %m2 = omp.map.info var_ptr(%2#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + %m3 = omp.map.info var_ptr(%alloc : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + + // CHECK-NEXT: omp.target has_device_addr(%[[MAP2]] -> {{.*}} : {{.*}}) map_entries(%[[MAP0]] -> {{.*}}, %[[MAP1]] -> {{.*}}, %[[MAP3]] -> {{.*}} : {{.*}}) + omp.target has_device_addr(%m2 -> %arg0 : !fir.ref) map_entries(%m0 -> %arg1, %m1 -> %arg2, %m3 -> %arg3 : !fir.ref, !fir.ref, !fir.ref) { + // CHECK-NEXT: func.call + func.call @foo() : () -> () + omp.terminator + } + + // CHECK-NOT: omp.parallel + // CHECK-NOT: func.call + // CHECK-NOT: omp.map.info + omp.parallel { + func.call @foo() : () -> () + omp.terminator + } + + %m4 = omp.map.info var_ptr(%0#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + %m5 = omp.map.info var_ptr(%1#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + %m6 = omp.map.info var_ptr(%2#1 : !fir.ref, 
i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + %m7 = omp.map.info var_ptr(%alloc : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + + // CHECK: omp.target_data map_entries(%[[MAP4]], %[[MAP5]], %[[MAP6]], %[[MAP7]] : {{.*}}) + omp.target_data map_entries(%m4, %m5, %m6, %m7 : !fir.ref, !fir.ref, !fir.ref, !fir.ref) { + // CHECK-NOT: func.call + func.call @foo() : () -> () + omp.terminator + } + + // CHECK: omp.target_enter_data map_entries(%[[MAP8]] : {{.*}}) + // CHECK-NEXT: omp.target_exit_data map_entries(%[[MAP9]] : {{.*}}) + // CHECK-NEXT: omp.target_update map_entries(%[[MAP10]] : {{.*}}) + %m8 = omp.map.info var_ptr(%0#1 : !fir.ref, i32) map_clauses(to) capture(ByRef) -> !fir.ref + omp.target_enter_data map_entries(%m8 : !fir.ref) + + %m9 = omp.map.info var_ptr(%1#1 : !fir.ref, i32) map_clauses(from) capture(ByRef) -> !fir.ref + omp.target_exit_data map_entries(%m9 : !fir.ref) + + %m10 = omp.map.info var_ptr(%2#1 : !fir.ref, !fir.ref) map_clauses(to) capture(ByRef) -> !fir.ref + omp.target_update map_entries(%m10 : !fir.ref) + + // CHECK-NOT: func.call + func.call @foo() : () -> () + + // CHECK: %[[RETURN0:.*]] = fir.undefined i32 + // CHECK-NEXT: %[[RETURN1:.*]] = fir.undefined f32 + // CHECK-NEXT: return %[[RETURN0]], %[[RETURN1]] + return %r0, %r1 : i32, f32 + } + + // CHECK-LABEL: func.func @allocatable_array + // CHECK-SAME: (%[[ALLOCATABLE:.*]]: [[ALLOCATABLE_TYPE:.*]], %[[ARRAY:.*]]: [[ARRAY_TYPE:[^)]*]]) + func.func @allocatable_array(%allocatable: !fir.ref>>>, %array: !fir.ref>) { + // CHECK-NEXT: %[[ZERO:.*]] = arith.constant 0 : i64 + // CHECK-NEXT: %[[SHAPE:.*]] = fir.shape %[[ZERO]] : (i64) -> !fir.shape<1> + // CHECK-NEXT: %[[ALLOCATABLE_DECL:.*]]:2 = hlfir.declare %[[ALLOCATABLE]] {fortran_attrs = #fir.var_attrs, uniq_name = "allocatable"} : ([[ALLOCATABLE_TYPE]]) -> ([[ALLOCATABLE_TYPE]], [[ALLOCATABLE_TYPE]]) + // CHECK-NEXT: %[[ARRAY_DECL:.*]]:2 = hlfir.declare %[[ARRAY]](%[[SHAPE]]) {uniq_name = "array"} : 
([[ARRAY_TYPE]], !fir.shape<1>) -> ([[ARRAY_TYPE]], [[ARRAY_TYPE]]) + // CHECK-NEXT: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[ALLOCATABLE_DECL]]#1 base_addr : ([[ALLOCATABLE_TYPE]]) -> [[VAR_PTR_PTR_TYPE:.*]] + // CHECK-NEXT: %[[MAP_ALLOCATABLE:.*]] = omp.map.info var_ptr(%[[ALLOCATABLE_DECL]]#1 : [[ALLOCATABLE_TYPE]], f32) map_clauses(tofrom) capture(ByRef) var_ptr_ptr(%[[VAR_PTR_PTR]] : [[VAR_PTR_PTR_TYPE]]) -> [[VAR_PTR_PTR_TYPE]] + // CHECK-NEXT: %[[MAP_ARRAY:.*]] = omp.map.info var_ptr(%[[ARRAY_DECL]]#1 : [[ARRAY_TYPE]], !fir.array<9xi32>) map_clauses(tofrom) capture(ByRef) -> [[ARRAY_TYPE]] + // CHECK-NEXT: omp.target map_entries(%[[MAP_ALLOCATABLE]] -> %{{.*}}, %[[MAP_ARRAY]] -> %{{.*}} : [[VAR_PTR_PTR_TYPE]], [[ARRAY_TYPE]]) + %c0 = arith.constant 0 : index + %c1 = arith.constant 1 : index + %c8 = arith.constant 8 : index + %c9 = arith.constant 9 : index + + %0:2 = hlfir.declare %allocatable {fortran_attrs = #fir.var_attrs, uniq_name = "allocatable"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) + %1 = omp.map.bounds lower_bound(%c0 : index) upper_bound(%c8 : index) extent(%c9 : index) stride(%c1 : index) start_idx(%c1 : index) + %2 = fir.box_offset %0#1 base_addr : (!fir.ref>>>) -> !fir.llvm_ptr>> + %m0 = omp.map.info var_ptr(%0#1 : !fir.ref>>>, f32) map_clauses(tofrom) capture(ByRef) var_ptr_ptr(%2 : !fir.llvm_ptr>>) bounds(%1) -> !fir.llvm_ptr>> + + %3 = fir.shape %c9 : (index) -> !fir.shape<1> + %4:2 = hlfir.declare %array(%3) {uniq_name = "array"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) + %5 = omp.map.bounds lower_bound(%c0 : index) upper_bound(%c8 : index) extent(%c9 : index) stride(%c1 : index) start_idx(%c1 : index) + %6 = omp.map.info var_ptr(%4#1 : !fir.ref>, !fir.array<9xi32>) map_clauses(tofrom) capture(ByRef) bounds(%5) -> !fir.ref> + + omp.target map_entries(%m0 -> %arg0, %6 -> %arg1 : !fir.llvm_ptr>>, !fir.ref>) { + omp.terminator + } + return + } + + // CHECK-LABEL: func.func @character + // CHECK-SAME: (%[[X:.*]]: 
[[X_TYPE:[^)]*]]) + func.func @character(%x: !fir.ref>) { + // CHECK-NEXT: %[[ZERO]] = arith.constant 0 : i64 + %0 = fir.dummy_scope : !fir.dscope + %c1 = arith.constant 1 : index + // CHECK-NEXT: %[[X_DECL:.*]]:2 = hlfir.declare %[[X]] typeparams %[[ZERO]] {uniq_name = "x"} : ([[X_TYPE]], i64) -> ([[X_TYPE]], [[X_TYPE]]) + %3:2 = hlfir.declare %x typeparams %c1 dummy_scope %0 {uniq_name = "x"} : (!fir.ref>, index, !fir.dscope) -> (!fir.ref>, !fir.ref>) + // CHECK-NEXT: %[[MAP:.*]] = omp.map.info var_ptr(%[[X_DECL]]#1 : [[X_TYPE]], !fir.char<1>) map_clauses(tofrom) capture(ByRef) -> [[X_TYPE]] + %map = omp.map.info var_ptr(%3#1 : !fir.ref>, !fir.char<1>) map_clauses(tofrom) capture(ByRef) -> !fir.ref> + // CHECK-NEXT: omp.target map_entries(%[[MAP]] -> %{{.*}}) + omp.target map_entries(%map -> %arg0 : !fir.ref>) { + omp.terminator + } + return + } + + // CHECK-LABEL: func.func @assumed_rank + // CHECK-SAME: (%[[X:.*]]: [[X_TYPE:[^)]*]]) + func.func @assumed_rank(%x: !fir.box>) { + // CHECK-NEXT: %[[PLACEHOLDER:.*]] = fir.alloca i1 + // CHECK-NEXT: %[[ALLOCA:.*]] = fir.convert %[[PLACEHOLDER]] : (!fir.ref) -> !fir.ref<[[X_TYPE]]> + %0 = fir.alloca !fir.box> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %x dummy_scope %1 {uniq_name = "x"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %3 = fir.box_addr %2#1 : (!fir.box>) -> !fir.ref> + fir.store %2#1 to %0 : !fir.ref>> + // CHECK-NEXT: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[ALLOCA]] base_addr : (!fir.ref<[[X_TYPE]]>) -> [[VAR_PTR_PTR_TYPE:.*]] + %4 = fir.box_offset %0 base_addr : (!fir.ref>>) -> !fir.llvm_ptr>> + // CHECK-NEXT: %[[MAP0:.*]] = omp.map.info var_ptr(%[[ALLOCA]] : !fir.ref<[[X_TYPE]]>, !fir.array<*:f32>) {{.*}} var_ptr_ptr(%[[VAR_PTR_PTR]] : [[VAR_PTR_PTR_TYPE]]) -> [[VAR_PTR_PTR_TYPE]] + %5 = omp.map.info var_ptr(%0 : !fir.ref>>, !fir.array<*:f32>) map_clauses(tofrom) capture(ByRef) var_ptr_ptr(%4 : !fir.llvm_ptr>>) -> !fir.llvm_ptr>> + // CHECK-NEXT: %[[MAP1:.*]] = omp.map.info 
var_ptr(%[[ALLOCA]] : !fir.ref<[[X_TYPE]]>, !fir.box>) {{.*}} members(%[[MAP0]] : [0] : [[VAR_PTR_PTR_TYPE]]) -> !fir.ref> + %6 = omp.map.info var_ptr(%0 : !fir.ref>>, !fir.box>) map_clauses(to) capture(ByRef) members(%5 : [0] : !fir.llvm_ptr>>) -> !fir.ref> + // CHECK-NEXT: omp.target map_entries(%[[MAP1]] -> %{{.*}}, %[[MAP0]] -> {{.*}}) + omp.target map_entries(%6 -> %arg1, %5 -> %arg2 : !fir.ref>, !fir.llvm_ptr>>) { + omp.terminator + } + return + } + + // CHECK-LABEL: func.func @box_ptr + // CHECK-SAME: (%[[X:.*]]: [[X_TYPE:[^)]*]]) + func.func @box_ptr(%x: !fir.ref>>>) { + // CHECK-NEXT: %[[ZERO:.*]] = arith.constant 0 : i64 + // CHECK-NEXT: %[[SHAPE:.*]] = fir.shape_shift %[[ZERO]], %[[ZERO]] : (i64, i64) -> !fir.shapeshift<1> + // CHECK-NEXT: %[[PLACEHOLDER_X:.*]] = fir.alloca i1 + // CHECK-NEXT: %[[ALLOCA_X:.*]] = fir.convert %[[PLACEHOLDER_X]] : (!fir.ref) -> [[X_TYPE]] + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %x dummy_scope %1 {fortran_attrs = #fir.var_attrs, uniq_name = "x"} : (!fir.ref>>>, !fir.dscope) -> (!fir.ref>>>, !fir.ref>>>) + %3 = fir.load %2#0 : !fir.ref>>> + fir.store %3 to %0 : !fir.ref>>> + + // CHECK-NEXT: %[[PLACEHOLDER_Y:.*]] = fir.alloca i1 + // CHECK-NEXT: %[[ALLOCA_Y:.*]] = fir.convert %[[PLACEHOLDER_Y]] : (!fir.ref) -> [[Y_TYPE:.*]] + %c0 = arith.constant 0 : index + %4:3 = fir.box_dims %3, %c0 : (!fir.box>>, index) -> (index, index, index) + %c1 = arith.constant 1 : index + %c0_0 = arith.constant 0 : index + %5:3 = fir.box_dims %3, %c0_0 : (!fir.box>>, index) -> (index, index, index) + %c0_1 = arith.constant 0 : index + %6 = arith.subi %5#1, %c1 : index + %7 = omp.map.bounds lower_bound(%c0_1 : index) upper_bound(%6 : index) extent(%5#1 : index) stride(%5#2 : index) start_idx(%4#0 : index) {stride_in_bytes = true} + %8 = fir.box_addr %3 : (!fir.box>>) -> !fir.ptr> + %c0_2 = arith.constant 0 : index + %9:3 = fir.box_dims %3, %c0_2 : (!fir.box>>, index) -> (index, index, index) + %10 = 
fir.shape_shift %9#0, %9#1 : (index, index) -> !fir.shapeshift<1> + + // CHECK-NEXT: %[[Y_DECL:.*]]:2 = hlfir.declare %[[ALLOCA_Y]](%[[SHAPE]]) {fortran_attrs = #fir.var_attrs, uniq_name = "y"} : ([[Y_TYPE]], !fir.shapeshift<1>) -> (!fir.box>, [[Y_TYPE]]) + %11:2 = hlfir.declare %8(%10) {fortran_attrs = #fir.var_attrs, uniq_name = "y"} : (!fir.ptr>, !fir.shapeshift<1>) -> (!fir.box>, !fir.ptr>) + %c1_3 = arith.constant 1 : index + %c0_4 = arith.constant 0 : index + %12:3 = fir.box_dims %11#0, %c0_4 : (!fir.box>, index) -> (index, index, index) + %c0_5 = arith.constant 0 : index + %13 = arith.subi %12#1, %c1_3 : index + %14 = omp.map.bounds lower_bound(%c0_5 : index) upper_bound(%13 : index) extent(%12#1 : index) stride(%12#2 : index) start_idx(%9#0 : index) {stride_in_bytes = true} + + // CHECK-NEXT: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[ALLOCA_X]] base_addr : ([[X_TYPE]]) -> [[VAR_PTR_PTR_TYPE:.*]] + // CHECK-NEXT: %[[MAP0:.*]] = omp.map.info var_ptr(%[[Y_DECL]]#1 : [[Y_TYPE]], i32) {{.*}} -> [[Y_TYPE]] + // CHECK-NEXT: %[[MAP1:.*]] = omp.map.info var_ptr(%[[ALLOCA_X]] : [[X_TYPE]], i32) {{.*}} var_ptr_ptr(%[[VAR_PTR_PTR]] : [[VAR_PTR_PTR_TYPE]]) -> [[VAR_PTR_PTR_TYPE]] + // CHECK-NEXT: %[[MAP2:.*]] = omp.map.info var_ptr(%[[ALLOCA_X]] : [[X_TYPE]], !fir.box>>) {{.*}} members(%[[MAP1]] : [0] : [[VAR_PTR_PTR_TYPE]]) -> [[X_TYPE]] + %15 = omp.map.info var_ptr(%11#1 : !fir.ptr>, i32) map_clauses(tofrom) capture(ByRef) bounds(%14) -> !fir.ptr> + %16 = fir.box_offset %0 base_addr : (!fir.ref>>>) -> !fir.llvm_ptr>> + %17 = omp.map.info var_ptr(%0 : !fir.ref>>>, i32) map_clauses(implicit, to) capture(ByRef) var_ptr_ptr(%16 : !fir.llvm_ptr>>) bounds(%7) -> !fir.llvm_ptr>> + %18 = omp.map.info var_ptr(%0 : !fir.ref>>>, !fir.box>>) map_clauses(implicit, to) capture(ByRef) members(%17 : [0] : !fir.llvm_ptr>>) -> !fir.ref>>> + + // CHECK-NEXT: omp.target map_entries(%[[MAP0]] -> %{{.*}}, %[[MAP2]] -> %{{.*}}, %[[MAP1]] -> {{.*}} : [[Y_TYPE]], [[X_TYPE]], 
[[VAR_PTR_PTR_TYPE]]) + omp.target map_entries(%15 -> %arg1, %18 -> %arg2, %17 -> %arg3 : !fir.ptr>, !fir.ref>>>, !fir.llvm_ptr>>) { + omp.terminator + } + return + } + + // CHECK-LABEL: func.func @target_data + // CHECK-SAME: (%[[MAPPED:.*]]: [[MAPPED_TYPE:[^)]*]], %[[USEDEVADDR:.*]]: [[USEDEVADDR_TYPE:[^)]*]], %[[USEDEVPTR:.*]]: [[USEDEVPTR_TYPE:[^)]*]]) + func.func @target_data(%mapped: !fir.ref, %usedevaddr: !fir.ref, %usedevptr: !fir.ref>) { + // CHECK-NEXT: %[[MAPPED_DECL:.*]]:2 = hlfir.declare %[[MAPPED]] {uniq_name = "mapped"} : ([[MAPPED_TYPE]]) -> ([[MAPPED_TYPE]], [[MAPPED_TYPE]]) + %0:2 = hlfir.declare %mapped {uniq_name = "mapped"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %1:2 = hlfir.declare %usedevaddr {uniq_name = "usedevaddr"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %2:2 = hlfir.declare %usedevptr {uniq_name = "usedevptr"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) + %m0 = omp.map.info var_ptr(%0#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + %m1 = omp.map.info var_ptr(%1#1 : !fir.ref, i32) map_clauses(return_param) capture(ByRef) -> !fir.ref + %m2 = omp.map.info var_ptr(%2#1 : !fir.ref>, !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>) map_clauses(return_param) capture(ByRef) -> !fir.ref> + // CHECK: omp.target_data map_entries(%{{.*}}) use_device_addr(%{{.*}} -> %[[USEDEVADDR_ARG:.*]] : [[USEDEVADDR_TYPE]]) use_device_ptr(%{{.*}} -> %[[USEDEVPTR_ARG:.*]] : [[USEDEVPTR_TYPE]]) + omp.target_data map_entries(%m0 : !fir.ref) use_device_addr(%m1 -> %arg0 : !fir.ref) use_device_ptr(%m2 -> %arg1 : !fir.ref>) { + // CHECK-NEXT: %[[USEDEVADDR_DECL:.*]]:2 = hlfir.declare %[[USEDEVADDR_ARG]] {uniq_name = "usedevaddr"} : ([[USEDEVADDR_TYPE]]) -> ([[USEDEVADDR_TYPE]], [[USEDEVADDR_TYPE]]) + %3:2 = hlfir.declare %arg0 {uniq_name = "usedevaddr"} : (!fir.ref) -> (!fir.ref, !fir.ref) + // CHECK-NEXT: %[[USEDEVPTR_DECL:.*]]:2 = hlfir.declare %[[USEDEVPTR_ARG]] {uniq_name = "usedevptr"} : ([[USEDEVPTR_TYPE]]) -> ([[USEDEVPTR_TYPE]], 
[[USEDEVPTR_TYPE]]) + %4:2 = hlfir.declare %arg1 {uniq_name = "usedevptr"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) + // CHECK-NEXT: %[[MAPPED_MAP:.*]] = omp.map.info var_ptr(%[[MAPPED_DECL]]#1 : [[MAPPED_TYPE]], i32) map_clauses(tofrom) capture(ByRef) -> [[MAPPED_TYPE]] + %m3 = omp.map.info var_ptr(%0#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK-NEXT: %[[USEDEVADDR_MAP:.*]] = omp.map.info var_ptr(%[[USEDEVADDR_DECL]]#1 : [[USEDEVADDR_TYPE]], i32) map_clauses(tofrom) capture(ByRef) -> [[USEDEVADDR_TYPE]] + %m4 = omp.map.info var_ptr(%3#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK-NEXT: %[[USEDEVPTR_MAP:.*]] = omp.map.info var_ptr(%[[USEDEVPTR_DECL]]#1 : [[USEDEVPTR_TYPE]], !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>) map_clauses(tofrom) capture(ByRef) -> [[USEDEVPTR_TYPE]] + %m5 = omp.map.info var_ptr(%4#1 : !fir.ref>, !fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>) map_clauses(tofrom) capture(ByRef) -> !fir.ref> + + // CHECK-NOT: func.call + func.call @foo() : () -> () + + // CHECK-NEXT: omp.target map_entries(%[[MAPPED_MAP]] -> %{{.*}}, %[[USEDEVADDR_MAP]] -> %{{.*}}, %[[USEDEVPTR_MAP]] -> %{{.*}} : {{.*}}) + omp.target map_entries(%m3 -> %arg2, %m4 -> %arg3, %m5 -> %arg4 : !fir.ref, !fir.ref, !fir.ref>) { + omp.terminator + } + + // CHECK-NOT: func.call + func.call @foo() : () -> () + + omp.terminator + } + + // CHECK: return + return + } + + // CHECK-LABEL: func.func @map_info_members + // CHECK-SAME: (%[[X:.*]]: [[X_TYPE:[^)]*]]) + func.func @map_info_members(%x: !fir.ref>>>) { + %c0 = arith.constant 0 : index + %c1 = arith.constant 1 : index + %c9 = arith.constant 9 : index + // CHECK-NEXT: %[[X_DECL:.*]]:2 = hlfir.declare %[[X]] {fortran_attrs = #fir.var_attrs, uniq_name = "x"} : ([[X_TYPE]]) -> ([[X_TYPE]], [[X_TYPE]]) + %23:2 = hlfir.declare %x {fortran_attrs = #fir.var_attrs, uniq_name = "x"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) + %63 = fir.load 
%23#0 : !fir.ref>>> + %64:3 = fir.box_dims %63, %c0 : (!fir.box>>, index) -> (index, index, index) + %65:3 = fir.box_dims %63, %c0 : (!fir.box>>, index) -> (index, index, index) + %66 = arith.subi %c1, %64#0 : index + %67 = arith.subi %c9, %64#0 : index + %68 = fir.load %23#0 : !fir.ref>>> + %69:3 = fir.box_dims %68, %c0 : (!fir.box>>, index) -> (index, index, index) + %70 = omp.map.bounds lower_bound(%66 : index) upper_bound(%67 : index) extent(%69#1 : index) stride(%65#2 : index) start_idx(%64#0 : index) {stride_in_bytes = true} + // CHECK-NEXT: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[X_DECL]]#1 base_addr : ([[X_TYPE]]) -> [[VAR_PTR_PTR_TYPE:.*]] + %71 = fir.box_offset %23#1 base_addr : (!fir.ref>>>) -> !fir.llvm_ptr>> + // CHECK-NEXT: %[[MAP0:.*]] = omp.map.info var_ptr(%[[X_DECL]]#1 : [[X_TYPE]], f32) {{.*}} var_ptr_ptr(%[[VAR_PTR_PTR]] : [[VAR_PTR_PTR_TYPE]]) -> [[VAR_PTR_PTR_TYPE]] + %72 = omp.map.info var_ptr(%23#1 : !fir.ref>>>, f32) map_clauses(tofrom) capture(ByRef) var_ptr_ptr(%71 : !fir.llvm_ptr>>) bounds(%70) -> !fir.llvm_ptr>> + // CHECK-NEXT: %[[MAP1:.*]] = omp.map.info var_ptr(%[[X_DECL]]#1 : [[X_TYPE]], !fir.box>>) {{.*}} members(%[[MAP0]] : [0] : [[VAR_PTR_PTR_TYPE]]) -> [[X_TYPE]] + %73 = omp.map.info var_ptr(%23#1 : !fir.ref>>>, !fir.box>>) map_clauses(to) capture(ByRef) members(%72 : [0] : !fir.llvm_ptr>>) -> !fir.ref>>> + // CHECK-NEXT: omp.target map_entries(%[[MAP1]] -> {{.*}}, %[[MAP0]] -> %{{.*}} : [[X_TYPE]], [[VAR_PTR_PTR_TYPE]]) + omp.target map_entries(%73 -> %arg0, %72 -> %arg1 : !fir.ref>>>, !fir.llvm_ptr>>) { + omp.terminator + } + return + } + + // CHECK-LABEL: func.func @control_flow + // CHECK-SAME: (%[[X:.*]]: [[X_TYPE:[^,]*]], %[[COND:.*]]: [[COND_TYPE:[^)]*]]) + func.func @control_flow(%x: !fir.ref, %cond: !fir.ref>) { + // CHECK-NEXT: %[[X_DECL:.*]]:2 = hlfir.declare %[[X]] {uniq_name = "x"} : ([[X_TYPE]]) -> ([[X_TYPE]], [[X_TYPE]]) + // CHECK-NEXT: %[[MAP0:.*]] = omp.map.info var_ptr(%[[X_DECL]]#1 : [[X_TYPE]], i32) {{.*}} 
-> [[X_TYPE]] + // CHECK-NEXT: %[[MAP1:.*]] = omp.map.info var_ptr(%[[X_DECL]]#1 : [[X_TYPE]], i32) {{.*}} -> [[X_TYPE]] + %x_decl:2 = hlfir.declare %x {uniq_name = "x"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %cond_decl:2 = hlfir.declare %cond {uniq_name = "cond"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) + %0 = fir.load %cond_decl#0 : !fir.ref> + %1 = fir.convert %0 : (!fir.logical<4>) -> i1 + cf.cond_br %1, ^bb1, ^bb2 + ^bb1: // pred: ^bb0 + fir.call @foo() : () -> () + %m0 = omp.map.info var_ptr(%x_decl#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK-NEXT: omp.target map_entries(%[[MAP0]] -> {{.*}} : [[X_TYPE]]) + omp.target map_entries(%m0 -> %arg2 : !fir.ref) { + omp.terminator + } + fir.call @foo() : () -> () + cf.br ^bb2 + ^bb2: // 2 preds: ^bb0, ^bb1 + fir.call @foo() : () -> () + %m1 = omp.map.info var_ptr(%x_decl#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK-NOT: fir.call + // CHECK-NOT: omp.map.info + // CHECK: omp.target_data map_entries(%[[MAP1]] : [[X_TYPE]]) + omp.target_data map_entries(%m1 : !fir.ref) { + fir.call @foo() : () -> () + %8 = fir.load %cond_decl#0 : !fir.ref> + %9 = fir.convert %8 : (!fir.logical<4>) -> i1 + cf.cond_br %9, ^bb1, ^bb2 + ^bb1: // pred: ^bb0 + fir.call @foo() : () -> () + // CHECK-NEXT: %[[MAP2:.*]] = omp.map.info var_ptr(%[[X_DECL]]#1 : [[X_TYPE]], i32) {{.*}} -> [[X_TYPE]] + %m2 = omp.map.info var_ptr(%x_decl#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK-NEXT: omp.target map_entries(%[[MAP2]] -> {{.*}} : [[X_TYPE]]) + omp.target map_entries(%m2 -> %arg2 : !fir.ref) { + omp.terminator + } + // CHECK-NOT: fir.call + // CHECK-NOT: cf.br + fir.call @foo() : () -> () + cf.br ^bb2 + ^bb2: // 2 preds: ^bb0, ^bb1 + fir.call @foo() : () -> () + omp.terminator + } + fir.call @foo() : () -> () + + // CHECK: return + return + } + + // CHECK-LABEL: func.func @block_args + // CHECK-SAME: (%[[X:.*]]: [[X_TYPE:[^)]*]]) + func.func @block_args(%x: 
!fir.ref) { + // CHECK-NEXT: %[[PLACEHOLDER0:.*]] = fir.alloca i1 + // CHECK-NEXT: %[[ALLOCA0:.*]] = fir.convert %[[PLACEHOLDER0]] : (!fir.ref) -> !fir.ref + // CHECK-NEXT: %[[PLACEHOLDER1:.*]] = fir.alloca i1 + // CHECK-NEXT: %[[ALLOCA1:.*]] = fir.convert %[[PLACEHOLDER1]] : (!fir.ref) -> !fir.ref + // CHECK-NEXT: %[[X_DECL0:.*]]:2 = hlfir.declare %[[ALLOCA0]] {uniq_name = "x"} : ([[X_TYPE]]) -> ([[X_TYPE]], [[X_TYPE]]) + // CHECK-NEXT: %[[X_DECL1:.*]]:2 = hlfir.declare %[[ALLOCA1]] {uniq_name = "x"} : ([[X_TYPE]]) -> ([[X_TYPE]], [[X_TYPE]]) + // CHECK-NEXT: %[[MAP0:.*]] = omp.map.info var_ptr(%[[X_DECL0]]#1 : [[X_TYPE]], i32) {{.*}} -> [[X_TYPE]] + // CHECK-NEXT: %[[MAP1:.*]] = omp.map.info var_ptr(%[[X_DECL1]]#1 : [[X_TYPE]], i32) {{.*}} -> [[X_TYPE]] + %x_decl:2 = hlfir.declare %x {uniq_name = "x"} : (!fir.ref) -> (!fir.ref, !fir.ref) + omp.parallel private(@privatizer %x_decl#0 -> %arg0 : !fir.ref) { + %0:2 = hlfir.declare %arg0 {uniq_name = "x"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %m0 = omp.map.info var_ptr(%0#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK-NEXT: omp.target map_entries(%[[MAP0]] -> {{.*}} : [[X_TYPE]]) + omp.target map_entries(%m0 -> %arg2 : !fir.ref) { + omp.terminator + } + omp.terminator + } + + omp.parallel private(@privatizer %x_decl#0 -> %arg0 : !fir.ref) { + %1:2 = hlfir.declare %arg0 {uniq_name = "x"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %m1 = omp.map.info var_ptr(%1#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK-NOT: omp.parallel + // CHECK-NOT: hlfir.declare + // CHECK-NOT: omp.map.info + // CHECK: omp.target_data map_entries(%[[MAP1]] : [[X_TYPE]]) + omp.target_data map_entries(%m1 : !fir.ref) { + omp.parallel private(@privatizer %1#0 -> %arg1 : !fir.ref) { + // CHECK-NEXT: %[[PLACEHOLDER2:.*]] = fir.alloca i1 + // CHECK-NEXT: %[[ALLOCA2:.*]] = fir.convert %[[PLACEHOLDER2]] : (!fir.ref) -> !fir.ref + // CHECK-NEXT: %[[X_DECL2:.*]]:2 = hlfir.declare %[[ALLOCA2]] 
{uniq_name = "x"} : ([[X_TYPE]]) -> ([[X_TYPE]], [[X_TYPE]]) + %2:2 = hlfir.declare %arg1 {uniq_name = "x"} : (!fir.ref) -> (!fir.ref, !fir.ref) + // CHECK-NEXT: %[[MAP2:.*]] = omp.map.info var_ptr(%[[X_DECL2]]#1 : [[X_TYPE]], i32) {{.*}} -> [[X_TYPE]] + %m2 = omp.map.info var_ptr(%2#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK-NEXT: omp.target map_entries(%[[MAP2]] -> {{.*}} : [[X_TYPE]]) + omp.target map_entries(%m2 -> %arg2 : !fir.ref) { + omp.terminator + } + omp.terminator + } + omp.terminator + } + omp.terminator + } + + return + } + + // CHECK-LABEL: func.func @reuse_tests() + func.func @reuse_tests() { + // CHECK-NEXT: %[[PLACEHOLDER:.*]] = fir.alloca i1 + // CHECK-NEXT: %[[THREAD_LIMIT:.*]] = fir.convert %[[PLACEHOLDER]] : (!fir.ref) -> i32 + // CHECK-NEXT: %[[CONST:.*]] = arith.constant 1 : i32 + // CHECK-NEXT: %[[GLOBAL:.*]] = fir.address_of(@global_scalar) : !fir.ref + %global = fir.address_of(@global_scalar) : !fir.ref + // CHECK-NEXT: %[[GLOBAL_DECL0:.*]]:2 = hlfir.declare %[[GLOBAL]] {uniq_name = "global_scalar"} + // CHECK-NEXT: %[[GLOBAL_DECL1:.*]]:2 = hlfir.declare %[[GLOBAL]] {uniq_name = "global_scalar"} + %0:2 = hlfir.declare %global {uniq_name = "global_scalar"} : (!fir.ref) -> (!fir.ref, !fir.ref) + // CHECK-NEXT: %[[MAP0:.*]] = omp.map.info var_ptr(%[[GLOBAL_DECL0]]#1 : !fir.ref, i32) + // CHECK-NEXT: %[[MAP3:.*]] = omp.map.info var_ptr(%[[GLOBAL_DECL1]]#1 : !fir.ref, i32) + %m0 = omp.map.info var_ptr(%0#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK-NEXT: omp.target_data map_entries(%[[MAP0]] : !fir.ref) + omp.target_data map_entries(%m0 : !fir.ref) { + // CHECK-NEXT: %[[GLOBAL_DECL2:.*]]:2 = hlfir.declare %[[GLOBAL]] {uniq_name = "global_scalar"} + %1:2 = hlfir.declare %global {uniq_name = "global_scalar"} : (!fir.ref) -> (!fir.ref, !fir.ref) + // CHECK-NEXT: %[[MAP1:.*]] = omp.map.info var_ptr(%[[GLOBAL_DECL0]]#1 : !fir.ref, i32) + %m1 = omp.map.info var_ptr(%0#1 : !fir.ref, 
i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK-NEXT: %[[MAP2:.*]] = omp.map.info var_ptr(%[[GLOBAL_DECL2]]#1 : !fir.ref, i32) + %m2 = omp.map.info var_ptr(%1#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK-NEXT: omp.target map_entries(%[[MAP1]] -> %{{.*}}, %[[MAP2]] -> {{.*}} : !fir.ref, !fir.ref) + omp.target map_entries(%m1 -> %arg0, %m2 -> %arg1 : !fir.ref, !fir.ref) { + omp.terminator + } + omp.terminator + } + // CHECK-NOT: fir.load + // CHECK-NOT: hlfir.declare + %2 = fir.load %global : !fir.ref + %3:2 = hlfir.declare %global {uniq_name = "global_scalar"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %m3 = omp.map.info var_ptr(%3#1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK: omp.target thread_limit(%[[THREAD_LIMIT]] : i32) map_entries(%[[MAP3]] -> %{{.*}} : !fir.ref) + omp.target thread_limit(%2 : i32) map_entries(%m3 -> %arg0 : !fir.ref) { + omp.terminator + } + // CHECK: omp.target thread_limit(%[[CONST]] : i32) + %c1 = arith.constant 1 : i32 + omp.target thread_limit(%c1 : i32) { + omp.terminator + } + // CHECK: omp.target thread_limit(%[[CONST]] : i32) + omp.target thread_limit(%c1 : i32) { + omp.terminator + } + return + } + + // CHECK-LABEL: func.func @all_non_map_clauses + // CHECK-SAME: (%[[REF:.*]]: !fir.ref, %[[INT:.*]]: i32, %[[BOOL:.*]]: i1) + func.func @all_non_map_clauses(%ref: !fir.ref, %int: i32, %bool: i1) { + %m0 = omp.map.info var_ptr(%ref : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref + // CHECK: omp.target_data map_entries({{[^)]*}}) { + omp.target_data device(%int : i32) if(%bool) map_entries(%m0 : !fir.ref) { + omp.terminator + } + // CHECK: omp.target allocate({{[^)]*}}) thread_limit({{[^)]*}}) in_reduction({{[^)]*}}) private({{[^)]*}}) { + omp.target allocate(%ref : !fir.ref -> %ref : !fir.ref) + depend(taskdependin -> %ref : !fir.ref) + device(%int : i32) if(%bool) thread_limit(%int : i32) + in_reduction(@reduction %ref -> %arg0 : !fir.ref) + 
private(@privatizer %ref -> %arg1 : !fir.ref) { + omp.terminator + } + // CHECK: omp.target_enter_data + // CHECK-NOT: depend + // CHECK-NOT: device + // CHECK-NOT: if + omp.target_enter_data depend(taskdependin -> %ref : !fir.ref) + device(%int : i32) if(%bool) + // CHECK-NEXT: omp.target_exit_data + // CHECK-NOT: depend + // CHECK-NOT: device + // CHECK-NOT: if + omp.target_exit_data depend(taskdependin -> %ref : !fir.ref) + device(%int : i32) if(%bool) + // CHECK-NEXT: omp.target_update + // CHECK-NOT: depend + // CHECK-NOT: device + // CHECK-NOT: if + omp.target_update depend(taskdependin -> %ref : !fir.ref) + device(%int : i32) if(%bool) + + // CHECK-NEXT: return + return + } + + func.func private @foo() -> () attributes {omp.declare_target = #omp.declaretarget} + fir.global internal @global_scalar constant : i32 { + %0 = arith.constant 10 : i32 + fir.has_value %0 : i32 + } + omp.private {type = firstprivate} @privatizer : i32 copy { + ^bb0(%arg0: !fir.ref, %arg1: !fir.ref): + %0 = fir.load %arg0 : !fir.ref + hlfir.assign %0 to %arg1 : i32, !fir.ref + omp.yield(%arg1 : !fir.ref) + } + omp.declare_reduction @reduction : i32 + init { + ^bb0(%arg: i32): + %0 = arith.constant 0 : i32 + omp.yield (%0 : i32) + } + combiner { + ^bb1(%arg0: i32, %arg1: i32): + %1 = arith.addi %arg0, %arg1 : i32 + omp.yield (%1 : i32) + } +} diff --git a/flang/test/Transforms/omp-function-filtering.mlir b/flang/test/Transforms/OpenMP/function-filtering.mlir similarity index 100% rename from flang/test/Transforms/omp-function-filtering.mlir rename to flang/test/Transforms/OpenMP/function-filtering.mlir From flang-commits at lists.llvm.org Thu May 15 07:29:19 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 15 May 2025 07:29:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #131628) In-Reply-To: Message-ID: <6825fa3f.a70a0220.25f73e.aa37@mx.google.com> https://github.com/tblah 
closed https://github.com/llvm/llvm-project/pull/131628 From flang-commits at lists.llvm.org Thu May 15 07:29:20 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 15 May 2025 07:29:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #131628) In-Reply-To: Message-ID: <6825fa40.a70a0220.145002.f645@mx.google.com> tblah wrote: Mats will no longer be working on this. Please comment on the new pull request at https://github.com/llvm/llvm-project/pull/140066 Apologies for the noise and loss of comments. https://github.com/llvm/llvm-project/pull/131628 From flang-commits at lists.llvm.org Thu May 15 07:57:10 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 15 May 2025 07:57:10 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Separate the actions of the `-emit-fir` and `-emit-mlir` options (PR #139857) In-Reply-To: Message-ID: <682600c6.630a0220.3c230f.15e5@mx.google.com> tarunprabhu wrote: > I think the appropriate core MLIR dialects are those that most closely reflect the original FIR representation. Do you have any suggestions on how to approach this? I am not exactly clear on what sort of suggestions you are looking for. I haven't thought a lot about this either, so here are just my initial thoughts: One could have some standard "sets" of dialects that could be chosen. This may be similar to the various optimization levels available, -O3 for maximum performance, -Os to optimize for size etc. Each "set" may have its own strengths - maybe some enable better optimizations for certain codes. Others may contain information that is useful for more sophisticated tooling, rather than standard compilation. This may allow us to defer development of a truly flexible approach until we have a better sense of what is useful. 
As someone who works in a research lab, having the ability to experiment with lowering using the various MLIR dialects sounds great. However, there is a tension between that and the requirements for a production-quality compiler and the ability to maintain this code. Apologies if this is obvious, but I am just trying to provide some broader context for the concerns/interests of folks in the community. Does that help at all? https://github.com/llvm/llvm-project/pull/139857 From flang-commits at lists.llvm.org Thu May 15 08:31:16 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 15 May 2025 08:31:16 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Flang][Sanitizer] Support sanitizer flag for Flang Driver. (PR #137759) In-Reply-To: Message-ID: <682608c4.630a0220.d6949.191a@mx.google.com> https://github.com/tarunprabhu edited https://github.com/llvm/llvm-project/pull/137759 From flang-commits at lists.llvm.org Thu May 15 08:31:17 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 15 May 2025 08:31:17 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Flang][Sanitizer] Support sanitizer flag for Flang Driver. (PR #137759) In-Reply-To: Message-ID: <682608c5.170a0220.57f24.bbbd@mx.google.com> https://github.com/tarunprabhu requested changes to this pull request. Thanks. We should try to share code between `clang` and `flang` where appropriate. https://github.com/llvm/llvm-project/pull/137759 From flang-commits at lists.llvm.org Thu May 15 08:31:17 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 15 May 2025 08:31:17 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Flang][Sanitizer] Support sanitizer flag for Flang Driver. 
(PR #137759) In-Reply-To: Message-ID: <682608c5.630a0220.73c9.170f@mx.google.com> ================ @@ -11,17 +11,46 @@ //===----------------------------------------------------------------------===// #include "flang/Frontend/CodeGenOptions.h" +#include "llvm/TargetParser/Triple.h" #include #include namespace Fortran::frontend { +using namespace llvm; ---------------- tarunprabhu wrote: We generally do not use `using namespace llvm` in flang. https://github.com/llvm/llvm-project/pull/137759 From flang-commits at lists.llvm.org Thu May 15 08:31:18 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 15 May 2025 08:31:18 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Flang][Sanitizer] Support sanitizer flag for Flang Driver. (PR #137759) In-Reply-To: Message-ID: <682608c6.170a0220.184ca4.105b@mx.google.com> ================ @@ -11,17 +11,46 @@ //===----------------------------------------------------------------------===// #include "flang/Frontend/CodeGenOptions.h" +#include "llvm/TargetParser/Triple.h" #include #include namespace Fortran::frontend { +using namespace llvm; + CodeGenOptions::CodeGenOptions() { #define CODEGENOPT(Name, Bits, Default) Name = Default; #define ENUM_CODEGENOPT(Name, Type, Bits, Default) set##Name(Default); #include "flang/Frontend/CodeGenOptions.def" } +// Check if ASan should use GC-friendly instrumentation for globals. ---------------- tarunprabhu wrote: It looks like this and much of the other code here has been copied from `clang`. If the code is identical to what is in `clang`, it should be shared rather than copied. Such code can be moved somewhere in `llvm/include/llvm/Frontend` and `llvm/lib/Frontend/`. See #136098 for some suggestions. That PR is still awaiting approval from the clang developers, but I don't anticipate any major objections. 
https://github.com/llvm/llvm-project/pull/137759 From flang-commits at lists.llvm.org Thu May 15 08:31:18 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 15 May 2025 08:31:18 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Flang][Sanitizer] Support sanitizer flag for Flang Driver. (PR #137759) In-Reply-To: Message-ID: <682608c6.170a0220.13fa4a.d3cc@mx.google.com> ================ @@ -787,6 +792,11 @@ void CodeGenAction::generateLLVMIR() { return; } + for (llvm::Function &F : llvmModule->getFunctionList()) { ---------------- tarunprabhu wrote: We use slightly different coding conventions in flang. In particular, we use camel-casing in most places, so only type names should start with uppercase. https://github.com/llvm/llvm-project/pull/137759 From flang-commits at lists.llvm.org Thu May 15 08:39:26 2025 From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits) Date: Thu, 15 May 2025 08:39:26 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <68260aae.170a0220.8f373.05d8@mx.google.com> kkwli wrote: > @DavidSpickett and @kkwli for both of these tests we were erroring out during compilation with internal compiler errors. We are now succeeding were we shouldn't be, as these programs should be rejected by the compiler. I am preparing a PR that will reject these programs, in order to satisfy the test. If I wanted to change the expected behavior of the test how would I go about doing that? I haven't dealt with modifying the test-suite yet. I don't think you can change the tests as these are from GCC test suite or something like that. I guess we can ignore the expected behavior in the test if flang's behavior is desirable. @tarunprabhu can provide more insight to it. 
https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Thu May 15 09:33:05 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 15 May 2025 09:33:05 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <68261741.170a0220.2f1206.248c@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852


From flang-commits at lists.llvm.org Thu May 15 09:58:00 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 09:58:00 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [lld] [lldb] [llvm] [mlir] [polly] [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS in standalone builds (PR #138587) In-Reply-To: Message-ID: <68261d18.050a0220.1fb92d.0f30@mx.google.com> https://github.com/jeremyd2019 updated https://github.com/llvm/llvm-project/pull/138587 >From 052580cd9ee141cd8c79e9588ad1c71e31f58cb3 Mon Sep 17 00:00:00 2001 From: Jeremy Drake Date: Mon, 5 May 2025 14:11:44 -0700 Subject: [PATCH 1/7] [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS In #138329, _GNU_SOURCE was added for Cygwin, but when building Clang standalone against an installed LLVM this definition was not picked up, resulting in undefined strnlen. Follow the documentation in https://llvm.org/docs/CMake.html#developing-llvm-passes-out-of-source and add the LLVM_DEFINITIONS in standalone projects' cmakes. --- clang/CMakeLists.txt | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/clang/CMakeLists.txt b/clang/CMakeLists.txt index f12712f55fb96..ab2ac9bc6b9ad 100644 --- a/clang/CMakeLists.txt +++ b/clang/CMakeLists.txt @@ -68,6 +68,10 @@ if(CLANG_BUILT_STANDALONE) option(CLANG_ENABLE_BOOTSTRAP "Generate the clang bootstrap target" OFF) option(LLVM_ENABLE_LIBXML2 "Use libxml2 if available." ON) + separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS}) + add_definitions(${LLVM_DEFINITIONS_LIST}) + list(APPEND CMAKE_REQUIRED_DEFINITIONS ${LLVM_DEFINITIONS_LIST}) + include(AddLLVM) include(TableGen) include(HandleLLVMOptions) >From a59bbb92c54d81d06754d49190d7a46ba269b1ef Mon Sep 17 00:00:00 2001 From: Jeremy Drake Date: Wed, 7 May 2025 11:24:55 -0700 Subject: [PATCH 2/7] fixup! 
[CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS bolt --- bolt/CMakeLists.txt | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/bolt/CMakeLists.txt b/bolt/CMakeLists.txt index 52c796518ac05..5c7d51e1e398c 100644 --- a/bolt/CMakeLists.txt +++ b/bolt/CMakeLists.txt @@ -46,6 +46,10 @@ if(BOLT_BUILT_STANDALONE) set(LLVM_RUNTIME_OUTPUT_INTDIR ${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/bin) set(LLVM_LIBRARY_OUTPUT_INTDIR ${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/lib${LLVM_LIBDIR_SUFFIX}) + separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS}) + add_definitions(${LLVM_DEFINITIONS_LIST}) + list(APPEND CMAKE_REQUIRED_DEFINITIONS ${LLVM_DEFINITIONS_LIST}) + include(AddLLVM) include(TableGen) include_directories(${LLVM_INCLUDE_DIRS}) >From 595f483ff1403f282217ff4999f7640465b5dada Mon Sep 17 00:00:00 2001 From: Jeremy Drake Date: Wed, 7 May 2025 11:27:53 -0700 Subject: [PATCH 3/7] fixup! [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS flang --- flang/CMakeLists.txt | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/flang/CMakeLists.txt b/flang/CMakeLists.txt index f358a93fdd792..56a96f590f0a3 100644 --- a/flang/CMakeLists.txt +++ b/flang/CMakeLists.txt @@ -140,6 +140,11 @@ if (FLANG_STANDALONE_BUILD) if (NOT DEFINED LLVM_MAIN_SRC_DIR) set(LLVM_MAIN_SRC_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../llvm") endif() + + separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS}) + add_definitions(${LLVM_DEFINITIONS_LIST}) + list(APPEND CMAKE_REQUIRED_DEFINITIONS ${LLVM_DEFINITIONS_LIST}) + include(AddLLVM) include(HandleLLVMOptions) include(VersionFromVCS) >From 8eeb00ff57a90ae7e4a775f7fa85a4d3529f143d Mon Sep 17 00:00:00 2001 From: Jeremy Drake Date: Wed, 7 May 2025 11:28:22 -0700 Subject: [PATCH 4/7] fixup! 
[CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS lld --- lld/CMakeLists.txt | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/lld/CMakeLists.txt b/lld/CMakeLists.txt index 9b202cc5d4899..80e25204a65ee 100644 --- a/lld/CMakeLists.txt +++ b/lld/CMakeLists.txt @@ -39,6 +39,10 @@ if(LLD_BUILT_STANDALONE) set(LLVM_RUNTIME_OUTPUT_INTDIR ${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/bin) set(LLVM_LIBRARY_OUTPUT_INTDIR ${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/lib${LLVM_LIBDIR_SUFFIX}) + separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS}) + add_definitions(${LLVM_DEFINITIONS_LIST}) + list(APPEND CMAKE_REQUIRED_DEFINITIONS ${LLVM_DEFINITIONS_LIST}) + include(AddLLVM) include(TableGen) include(HandleLLVMOptions) >From e2510293883ff6499890de1a4a5de4c1d53beac3 Mon Sep 17 00:00:00 2001 From: Jeremy Drake Date: Wed, 7 May 2025 11:31:01 -0700 Subject: [PATCH 5/7] fixup! [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS lldb --- lldb/cmake/modules/LLDBStandalone.cmake | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/lldb/cmake/modules/LLDBStandalone.cmake b/lldb/cmake/modules/LLDBStandalone.cmake index c9367214848fd..1a4cdbfbb1cc7 100644 --- a/lldb/cmake/modules/LLDBStandalone.cmake +++ b/lldb/cmake/modules/LLDBStandalone.cmake @@ -85,6 +85,10 @@ endif() # CMake modules to be in that directory as well. list(APPEND CMAKE_MODULE_PATH "${LLVM_DIR}") +separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS}) +add_definitions(${LLVM_DEFINITIONS_LIST}) +list(APPEND CMAKE_REQUIRED_DEFINITIONS ${LLVM_DEFINITIONS_LIST}) + include(AddLLVM) include(TableGen) include(HandleLLVMOptions) >From f9f30f2871fd3f5eedcf90b6f41c8e17b9d19ff3 Mon Sep 17 00:00:00 2001 From: Jeremy Drake Date: Wed, 7 May 2025 11:31:41 -0700 Subject: [PATCH 6/7] fixup! 
[CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS mlir --- mlir/CMakeLists.txt | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/mlir/CMakeLists.txt b/mlir/CMakeLists.txt index 9e786154a2b40..daedc2be22588 100644 --- a/mlir/CMakeLists.txt +++ b/mlir/CMakeLists.txt @@ -21,6 +21,11 @@ set(CMAKE_CXX_STANDARD 17 CACHE STRING "C++ standard to conform to") if(MLIR_STANDALONE_BUILD) find_package(LLVM CONFIG REQUIRED) set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${LLVM_CMAKE_DIR}) + + separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS}) + add_definitions(${LLVM_DEFINITIONS_LIST}) + list(APPEND CMAKE_REQUIRED_DEFINITIONS ${LLVM_DEFINITIONS_LIST}) + include(HandleLLVMOptions) include(AddLLVM) include(TableGen) >From be29dee4f077212da72ffa28815dafe9a0491f85 Mon Sep 17 00:00:00 2001 From: Jeremy Drake Date: Wed, 7 May 2025 11:32:11 -0700 Subject: [PATCH 7/7] fixup! [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS polly --- polly/CMakeLists.txt | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/polly/CMakeLists.txt b/polly/CMakeLists.txt index c3232752d307c..52d1be6fe295a 100644 --- a/polly/CMakeLists.txt +++ b/polly/CMakeLists.txt @@ -13,6 +13,11 @@ if(POLLY_STANDALONE_BUILD) # Where is LLVM installed? 
find_package(LLVM CONFIG REQUIRED) set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${LLVM_CMAKE_DIR}) + + separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS}) + add_definitions(${LLVM_DEFINITIONS_LIST}) + list(APPEND CMAKE_REQUIRED_DEFINITIONS ${LLVM_DEFINITIONS_LIST}) + include(HandleLLVMOptions) include(AddLLVM) From flang-commits at lists.llvm.org Thu May 15 10:01:15 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 10:01:15 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [lld] [lldb] [llvm] [mlir] [polly] [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS in standalone builds (PR #138587) In-Reply-To: Message-ID: <68261ddb.170a0220.3037c1.0651@mx.google.com> jeremyd2019 wrote: > > I rebased this on top of #138783 and adjusted the title and description. Now it should be in a good state to push cmake changes for other projects. > > The changes look good, but it looks like the changes from #138783 still show up when viewing the changes; can you check that you've rebased past the merged #138783? I had not - I have now though. > (Also, I take it that no other subprojects than clang need the `cmake_push_check_state` change?) No, the other projects were not messing with `_GNU_SOURCE` like clang was. https://github.com/llvm/llvm-project/pull/138587 From flang-commits at lists.llvm.org Thu May 15 10:14:00 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 15 May 2025 10:14:00 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow open acc routines from other modules. (PR #136012) In-Reply-To: Message-ID: <682620d8.050a0220.20ffdb.176e@mx.google.com> tarunprabhu wrote: If these tests are not being rejected by the compiler, that is an issue that needs to be fixed in flang. This is the case for many tests in the gfortran test suite for one reason or another. 
The [README](https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/README.md) in the gfortran test suite has some information on disabling or overriding such tests. In this case though, the two tests in question have already been disabled, so you don't need to do anything. I have been filing issues (e.g. #119420, #138950, #139776) in such cases. Feel free to file one for these as well so we can track them. I typically add "gfortran" somewhere in the title so we know where they came from. https://github.com/llvm/llvm-project/pull/136012 From flang-commits at lists.llvm.org Thu May 15 11:15:48 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 15 May 2025 11:15:48 -0700 (PDT) Subject: [flang-commits] [flang] [flang][docs] Document technique for regenerating a module hermetically (PR #139975) In-Reply-To: Message-ID: <68262f54.170a0220.198fb6.1d27@mx.google.com> https://github.com/akuhlens approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/139975 From flang-commits at lists.llvm.org Thu May 15 11:25:30 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 11:25:30 -0700 (PDT) Subject: [flang-commits] [flang] 9457616 - [flang] Pad Hollerith actual arguments (#139782) Message-ID: <6826319a.050a0220.12933a.2a74@mx.google.com> Author: Peter Klausler Date: 2025-05-15T11:25:26-07:00 New Revision: 9457616527b50590e9c9d5e91723b35b26e447cd URL: https://github.com/llvm/llvm-project/commit/9457616527b50590e9c9d5e91723b35b26e447cd DIFF: https://github.com/llvm/llvm-project/commit/9457616527b50590e9c9d5e91723b35b26e447cd.diff LOG: [flang] Pad Hollerith actual arguments (#139782) For more compatible legacy behavior on old tests, extend Hollerith actual arguments on the right with trailing blanks out to a multiple of 8 bytes. Fixes Fujitsu test 0343_0069. 
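As a rough illustration of the padding rule described in the commit message above, the following standalone C++ sketch pads a byte string on the right with blanks until its length is a multiple of 8. The helper name `padWithBlanks` and its `multiple` parameter are hypothetical and not part of flang; the actual change is made inside `ArgumentAnalyzer::AnalyzeExpr` in flang/lib/Semantics/expression.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not flang's API): returns `bytes` extended on the
// right with blanks until its length is a multiple of `multiple`.
// A string whose length is already a multiple is returned unchanged.
std::string padWithBlanks(std::string bytes, std::size_t multiple = 8) {
  assert(multiple > 0 && "padding granularity must be positive");
  while (bytes.size() % multiple != 0) {
    bytes += ' ';
  }
  return bytes;
}
```

Under this sketch, a 3-byte Hollerith value such as `3habc` would grow to 8 bytes (`abc` followed by five blanks), while a value that is already 8 bytes long is left as it is.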
Added: flang/test/Semantics/pad-hollerith-arg.f Modified: flang/lib/Semantics/expression.cpp Removed: ################################################################################ diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index c35492097cfbc..b3ad608ee6744 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -4904,6 +4904,19 @@ std::optional ArgumentAnalyzer::AnalyzeExpr( "TYPE(*) dummy argument may only be used as an actual argument"_err_en_US); } else if (MaybeExpr argExpr{AnalyzeExprOrWholeAssumedSizeArray(expr)}) { if (isProcedureCall_ || !IsProcedureDesignator(*argExpr)) { + // Pad Hollerith actual argument with spaces up to a multiple of 8 + // bytes, in case the data are interpreted as double precision + // (or a smaller numeric type) by legacy code. + if (auto hollerith{UnwrapExpr>(*argExpr)}; + hollerith && hollerith->wasHollerith()) { + std::string bytes{hollerith->values()}; + while ((bytes.size() % 8) != 0) { + bytes += ' '; + } + Constant c{std::move(bytes)}; + c.set_wasHollerith(true); + argExpr = AsGenericExpr(std::move(c)); + } ActualArgument arg{std::move(*argExpr)}; SetArgSourceLocation(arg, expr.source); return std::move(arg); diff --git a/flang/test/Semantics/pad-hollerith-arg.f b/flang/test/Semantics/pad-hollerith-arg.f new file mode 100644 index 0000000000000..75678441ea45f --- /dev/null +++ b/flang/test/Semantics/pad-hollerith-arg.f @@ -0,0 +1,5 @@ +! RUN: %flang_fc1 -fdebug-unparse %s | FileCheck %s +! Ensure that Hollerith actual arguments are blank padded. +! 
CHECK: CALL foo("abc ") + call foo(3habc) + end From flang-commits at lists.llvm.org Thu May 15 11:25:33 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 15 May 2025 11:25:33 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Pad Hollerith actual arguments (PR #139782) In-Reply-To: Message-ID: <6826319d.a70a0220.10033b.2dcd@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139782 From flang-commits at lists.llvm.org Thu May 15 11:26:07 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 11:26:07 -0700 (PDT) Subject: [flang-commits] [flang] c26e752 - [flang] Support -D for function-like macros (#139812) Message-ID: <682631bf.050a0220.36aacb.2a92@mx.google.com> Author: Peter Klausler Date: 2025-05-15T11:26:03-07:00 New Revision: c26e7520a939556bd23f7db3b7e0f4530b9d94a8 URL: https://github.com/llvm/llvm-project/commit/c26e7520a939556bd23f7db3b7e0f4530b9d94a8 DIFF: https://github.com/llvm/llvm-project/commit/c26e7520a939556bd23f7db3b7e0f4530b9d94a8.diff LOG: [flang] Support -D for function-like macros (#139812) Handle a command-line function-like macro definition like "-Dfoo(a)=...". TODO: error reporting for badly formed argument lists. 
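A simplified sketch of how a command-line definition such as `-Dfoo(a,b)=...` could be split into a macro name and its argument names, modeled loosely on the `TokenizeMacroNameAndArgs` helper in the patch. The function name `splitMacroNameAndArgs` is hypothetical, and the sketch leaves out flang's `TokenSequence` and provenance handling entirely.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical standalone sketch (not flang's API). Given the left-hand side
// of a -D definition, returns {name, arg1, arg2, ...} when it looks like a
// function-like macro, or std::nullopt otherwise.
std::optional<std::vector<std::string>> splitMacroNameAndArgs(
    const std::string &str) {
  static const std::string idChars{
      "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_0123456789"};
  const auto npos{std::string::npos};
  std::vector<std::string> names;
  for (std::string::size_type at{0};;) {
    // Skip blanks, then scan one identifier.
    auto nameStart{str.find_first_not_of(' ', at)};
    if (nameStart == npos) {
      return std::nullopt;
    }
    auto nameEnd{str.find_first_not_of(idChars, nameStart)};
    if (nameEnd == npos) {
      return std::nullopt; // no '(' at all: not function-like
    }
    auto punc{str.find_first_not_of(' ', nameEnd)};
    if (punc == npos) {
      return std::nullopt;
    }
    // The macro name must be followed by '(', each argument by ',' or ')'.
    if ((at == 0 && str[punc] != '(') ||
        (at > 0 && str[punc] != ',' && str[punc] != ')')) {
      return std::nullopt;
    }
    names.push_back(str.substr(nameStart, nameEnd - nameStart));
    at = punc + 1;
    if (str[punc] == ')') {
      // Nothing but blanks may follow the closing parenthesis.
      if (str.find_first_not_of(' ', at) != npos) {
        return std::nullopt;
      }
      return names;
    }
  }
}
```

With this sketch, `foo(a,b)` yields `foo`, `a`, `b`, while a plain `foo` yields no result and so would be handled as a keyword macro. Like the patch, it does not diagnose badly formed argument lists.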
Added: flang/test/Preprocessing/func-on-command-line.F90 Modified: flang/include/flang/Parser/preprocessor.h flang/lib/Parser/preprocessor.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Parser/preprocessor.h b/flang/include/flang/Parser/preprocessor.h index 86528a7e68def..15810a34ee6a5 100644 --- a/flang/include/flang/Parser/preprocessor.h +++ b/flang/include/flang/Parser/preprocessor.h @@ -116,6 +116,7 @@ class Preprocessor { bool IsIfPredicateTrue(const TokenSequence &expr, std::size_t first, std::size_t exprTokens, Prescanner &); void LineDirective(const TokenSequence &, std::size_t, Prescanner &); + TokenSequence TokenizeMacroBody(const std::string &); AllSources &allSources_; std::list names_; diff --git a/flang/lib/Parser/preprocessor.cpp b/flang/lib/Parser/preprocessor.cpp index 6e8e3aee19b09..a5de14d864762 100644 --- a/flang/lib/Parser/preprocessor.cpp +++ b/flang/lib/Parser/preprocessor.cpp @@ -301,8 +301,82 @@ void Preprocessor::DefineStandardMacros() { Define("__TIMESTAMP__"s, "__TIMESTAMP__"s); } +static const std::string idChars{ + "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_0123456789"s}; + +static std::optional> TokenizeMacroNameAndArgs( + const std::string &str) { + // TODO: variadic macros on the command line (?) 
+ std::vector names; + for (std::string::size_type at{0};;) { + auto nameStart{str.find_first_not_of(" "s, at)}; + if (nameStart == str.npos) { + return std::nullopt; + } + auto nameEnd{str.find_first_not_of(idChars, nameStart)}; + if (nameEnd == str.npos) { + return std::nullopt; + } + auto punc{str.find_first_not_of(" "s, nameEnd)}; + if (punc == str.npos) { + return std::nullopt; + } + if ((at == 0 && str[punc] != '(') || + (at > 0 && str[punc] != ',' && str[punc] != ')')) { + return std::nullopt; + } + names.push_back(str.substr(nameStart, nameEnd - nameStart)); + at = punc + 1; + if (str[punc] == ')') { + if (str.find_first_not_of(" "s, at) != str.npos) { + return std::nullopt; + } else { + return names; + } + } + } +} + +TokenSequence Preprocessor::TokenizeMacroBody(const std::string &str) { + TokenSequence tokens; + Provenance provenance{allSources_.AddCompilerInsertion(str).start()}; + auto end{str.size()}; + for (std::string::size_type at{0}; at < end;) { + // Alternate between tokens that are identifiers (and therefore subject + // to argument replacement) and those that are not. 
+ auto start{str.find_first_of(idChars, at)}; + if (start == str.npos) { + tokens.Put(str.substr(at), provenance + at); + break; + } else if (start > at) { + tokens.Put(str.substr(at, start - at), provenance + at); + } + at = str.find_first_not_of(idChars, start + 1); + if (at == str.npos) { + tokens.Put(str.substr(start), provenance + start); + break; + } else { + tokens.Put(str.substr(start, at - start), provenance + start); + } + } + return tokens; +} + void Preprocessor::Define(const std::string &macro, const std::string &value) { - definitions_.emplace(SaveTokenAsName(macro), Definition{value, allSources_}); + if (auto lhs{TokenizeMacroNameAndArgs(macro)}) { + // function-like macro + CharBlock macroName{SaveTokenAsName(lhs->front())}; + auto iter{lhs->begin()}; + ++iter; + std::vector argNames{iter, lhs->end()}; + auto rhs{TokenizeMacroBody(value)}; + definitions_.emplace(std::make_pair(macroName, + Definition{ + argNames, rhs, 0, rhs.SizeInTokens(), /*isVariadic=*/false})); + } else { // keyword macro + definitions_.emplace( + SaveTokenAsName(macro), Definition{value, allSources_}); + } } void Preprocessor::Undefine(std::string macro) { definitions_.erase(macro); } diff --git a/flang/test/Preprocessing/func-on-command-line.F90 b/flang/test/Preprocessing/func-on-command-line.F90 new file mode 100644 index 0000000000000..cf844e021b371 --- /dev/null +++ b/flang/test/Preprocessing/func-on-command-line.F90 @@ -0,0 +1,4 @@ +! RUN: %flang_fc1 -fdebug-unparse "-Dfoo(a,b)=bar(a+b)" %s | FileCheck %s +! 
CHECK: CALL bar(3_4) +call foo(1,2) +end From flang-commits at lists.llvm.org Thu May 15 11:26:10 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 15 May 2025 11:26:10 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Support -D for function-like macros (PR #139812) In-Reply-To: Message-ID: <682631c2.170a0220.2b200.1eb2@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139812 From flang-commits at lists.llvm.org Thu May 15 11:26:26 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 11:26:26 -0700 (PDT) Subject: [flang-commits] [flang] b7e13ab - [flang][docs] Document technique for regenerating a module hermetically (#139975) Message-ID: <682631d2.170a0220.3470c5.1f50@mx.google.com> Author: Peter Klausler Date: 2025-05-15T11:26:21-07:00 New Revision: b7e13ab42929562d0fa78b623562341ef78617b4 URL: https://github.com/llvm/llvm-project/commit/b7e13ab42929562d0fa78b623562341ef78617b4 DIFF: https://github.com/llvm/llvm-project/commit/b7e13ab42929562d0fa78b623562341ef78617b4.diff LOG: [flang][docs] Document technique for regenerating a module hermetically (#139975) A flang-new module file is Fortran source, so it can be recompiled with the `-fhermetic-module-files` option to convert it into a hermetic one. Added: Modified: flang/docs/ModFiles.md Removed: ################################################################################ diff --git a/flang/docs/ModFiles.md b/flang/docs/ModFiles.md index dd0ade5cebbfc..fc05c2677fc26 100644 --- a/flang/docs/ModFiles.md +++ b/flang/docs/ModFiles.md @@ -171,3 +171,14 @@ modules of dependent libraries need not also be packaged with the library. When the compiler reads a hermetic module file, the copies of the dependent modules are read into their own scope, and will not conflict with other modules of the same name that client code might `USE`. 
+ +One can use the `-fhermetic-module-files` option when building the top-level +module files of a library for which not all of the implementation modules +will (or can) be shipped. + +It is also possible to convert a default module file to a hermetic one after +the fact. +Since module files are Fortran source, simply copy the module file to a new +temporary free form Fortran source file and recompile it (`-fsyntax-only`) +with the `-fhermetic-module-files` flag, and that will regenerate the module +file in place with all of its dependent modules included. From flang-commits at lists.llvm.org Thu May 15 11:26:28 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 15 May 2025 11:26:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang][docs] Document technique for regenerating a module hermetically (PR #139975) In-Reply-To: Message-ID: <682631d4.630a0220.3243aa.5f00@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/139975 From flang-commits at lists.llvm.org Thu May 15 12:01:03 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 12:01:03 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][acc] Update assembly formats to include asyncOnly, async, and wait (PR #140122) In-Reply-To: Message-ID: <682639ef.050a0220.143989.2b79@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-mlir-openacc Author: None (khaki3)
Changes The async implementation is inconsistent in terms of the assembly format. While renaming `UpdateOp`'s `async` to `asyncOnly`, this PR handles `asyncOnly` along with async operands in every operation. Regarding `EnterDataOp` and `ExitDataOp`, they do not accept device types; thus, the async and the wait clauses without values lead to the `async` and the `wait` attributes (not `asyncOnly` nor `waitOnly`). This PR also processes them with async and wait operands all together. --- Patch is 46.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140122.diff 19 Files Affected: - (modified) flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 (+2-2) - (modified) flang/test/Lower/OpenACC/acc-data.f90 (+2-2) - (modified) flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 (+5-5) - (modified) flang/test/Lower/OpenACC/acc-enter-data.f90 (+5-5) - (modified) flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 (+7-7) - (modified) flang/test/Lower/OpenACC/acc-exit-data.f90 (+7-7) - (modified) flang/test/Lower/OpenACC/acc-kernels-loop.f90 (+2-2) - (modified) flang/test/Lower/OpenACC/acc-kernels.f90 (+2-2) - (modified) flang/test/Lower/OpenACC/acc-parallel-loop.f90 (+2-2) - (modified) flang/test/Lower/OpenACC/acc-parallel.f90 (+2-2) - (modified) flang/test/Lower/OpenACC/acc-serial-loop.f90 (+2-2) - (modified) flang/test/Lower/OpenACC/acc-serial.f90 (+2-2) - (modified) flang/test/Lower/OpenACC/acc-update.f90 (+6-6) - (modified) flang/test/Lower/OpenACC/acc-wait.f90 (+1-1) - (modified) mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td (+30-25) - (modified) mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp (+89-4) - (modified) mlir/test/Conversion/OpenACCToSCF/convert-openacc-to-scf.mlir (+6-6) - (modified) mlir/test/Dialect/OpenACC/invalid.mlir (+3-3) - (modified) mlir/test/Dialect/OpenACC/ops.mlir (+50-50) ``````````diff diff --git a/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 
b/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 index d010d39cef4eb..789db34adefee 100644 --- a/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 @@ -155,8 +155,8 @@ subroutine acc_data !$acc data present(a) async !$acc end data -! CHECK: acc.data dataOperands(%{{.*}}) { -! CHECK: } attributes {asyncOnly = [#acc.device_type]} +! CHECK: acc.data async dataOperands(%{{.*}}) { +! CHECK: } !$acc data copy(a) async(1) !$acc end data diff --git a/flang/test/Lower/OpenACC/acc-data.f90 b/flang/test/Lower/OpenACC/acc-data.f90 index 7965fdc0ac707..3032ce7109c1e 100644 --- a/flang/test/Lower/OpenACC/acc-data.f90 +++ b/flang/test/Lower/OpenACC/acc-data.f90 @@ -155,8 +155,8 @@ subroutine acc_data !$acc data present(a) async !$acc end data -! CHECK: acc.data dataOperands(%{{.*}}) { -! CHECK: } attributes {asyncOnly = [#acc.device_type]} +! CHECK: acc.data async dataOperands(%{{.*}}) { +! CHECK: } !$acc data copy(a) async(1) !$acc end data diff --git a/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 index c42350a07c498..3e08068bdec44 100644 --- a/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 @@ -94,20 +94,20 @@ subroutine acc_enter_data !$acc enter data create(a) async !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data 
dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {wait} +!CHECK: acc.enter_data wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async wait !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async, wait} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-enter-data.f90 
b/flang/test/Lower/OpenACC/acc-enter-data.f90 index 3e49259c360eb..f7396660a6d3c 100644 --- a/flang/test/Lower/OpenACC/acc-enter-data.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data.f90 @@ -53,16 +53,16 @@ subroutine acc_enter_data !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]], %[[CREATE_B]], %[[ATTACH_D]] : !fir.ref>, !fir.ref>, !fir.ref>>){{$}} !$acc enter data create(a) async -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait !CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {wait} +!CHECK: acc.enter_data wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async wait -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async, wait} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 index 7999a7647f49b..fd942173b637a 100644 --- a/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 
@@ -56,19 +56,19 @@ subroutine acc_exit_data !CHECK: acc.detach accPtr(%[[DEVPTR_D]] : !fir.ptr) {name = "d", structured = false} !$acc exit data delete(a) async -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async {name = "a", structured = false} !$acc exit data delete(a) wait !CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {wait} +!CHECK: acc.exit_data wait dataOperands(%[[DEVPTR]] : !fir.ref>) !CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} !$acc exit data delete(a) async wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async, wait} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async -> !fir.ref> {dataClause = #acc, name = 
"a", structured = false} +!CHECK: acc.exit_data async wait dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async {name = "a", structured = false} !$acc exit data delete(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-exit-data.f90 b/flang/test/Lower/OpenACC/acc-exit-data.f90 index bf5f7094913a1..cbc63ac81945c 100644 --- a/flang/test/Lower/OpenACC/acc-exit-data.f90 +++ b/flang/test/Lower/OpenACC/acc-exit-data.f90 @@ -54,19 +54,19 @@ subroutine acc_exit_data !CHECK: acc.detach accPtr(%[[DEVPTR_D]] : !fir.ref>>) {name = "d", structured = false} !$acc exit data delete(a) async -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) async {name = "a", structured = false} !$acc exit data delete(a) wait !CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {wait} +!CHECK: acc.exit_data wait dataOperands(%[[DEVPTR]] : !fir.ref>) !CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a", structured = false} !$acc exit data delete(a) async wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: 
acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async, wait} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async wait dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) async {name = "a", structured = false} !$acc exit data delete(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 index a330b7d491d06..8608b0ad98ce6 100644 --- a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 @@ -69,12 +69,12 @@ subroutine acc_kernels_loop END DO !$acc end kernels loop -! CHECK: acc.kernels {{.*}} { +! CHECK: acc.kernels {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.terminator -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc kernels loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-kernels.f90 b/flang/test/Lower/OpenACC/acc-kernels.f90 index 6b7a625b34f71..b90870db25095 100644 --- a/flang/test/Lower/OpenACC/acc-kernels.f90 +++ b/flang/test/Lower/OpenACC/acc-kernels.f90 @@ -38,9 +38,9 @@ subroutine acc_kernels !$acc kernels async !$acc end kernels -! CHECK: acc.kernels { +! CHECK: acc.kernels async { ! CHECK: acc.terminator -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! 
CHECK-NEXT: } !$acc kernels async(1) !$acc end kernels diff --git a/flang/test/Lower/OpenACC/acc-parallel-loop.f90 b/flang/test/Lower/OpenACC/acc-parallel-loop.f90 index 1e1fc7448a513..4cf268d2517f5 100644 --- a/flang/test/Lower/OpenACC/acc-parallel-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-parallel-loop.f90 @@ -71,12 +71,12 @@ subroutine acc_parallel_loop END DO !$acc end parallel loop -! CHECK: acc.parallel {{.*}} { +! CHECK: acc.parallel {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc parallel loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-parallel.f90 b/flang/test/Lower/OpenACC/acc-parallel.f90 index e00ea41210966..1eae106ba61b2 100644 --- a/flang/test/Lower/OpenACC/acc-parallel.f90 +++ b/flang/test/Lower/OpenACC/acc-parallel.f90 @@ -60,9 +60,9 @@ subroutine acc_parallel !$acc parallel async !$acc end parallel -! CHECK: acc.parallel { +! CHECK: acc.parallel async { ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc parallel async(1) !$acc end parallel diff --git a/flang/test/Lower/OpenACC/acc-serial-loop.f90 b/flang/test/Lower/OpenACC/acc-serial-loop.f90 index 98fc28990265a..34391f78ae707 100644 --- a/flang/test/Lower/OpenACC/acc-serial-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-serial-loop.f90 @@ -90,12 +90,12 @@ subroutine acc_serial_loop END DO !$acc end serial loop -! CHECK: acc.serial {{.*}} { +! CHECK: acc.serial {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! 
CHECK-NEXT: } !$acc serial loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-serial.f90 b/flang/test/Lower/OpenACC/acc-serial.f90 index 9ba44ce6b9197..1e4f32fd209ef 100644 --- a/flang/test/Lower/OpenACC/acc-serial.f90 +++ b/flang/test/Lower/OpenACC/acc-serial.f90 @@ -60,9 +60,9 @@ subroutine acc_serial !$acc serial async !$acc end serial -! CHECK: acc.serial { +! CHECK: acc.serial async { ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc serial async(1) !$acc end serial diff --git a/flang/test/Lower/OpenACC/acc-update.f90 b/flang/test/Lower/OpenACC/acc-update.f90 index f96b105ed93bd..f98af425de985 100644 --- a/flang/test/Lower/OpenACC/acc-update.f90 +++ b/flang/test/Lower/OpenACC/acc-update.f90 @@ -63,9 +63,9 @@ subroutine acc_update ! CHECK: acc.update_host accPtr(%[[DEVPTR_B]] : !fir.ref>) to varPtr(%[[DECLB]]#0 : !fir.ref>) {name = "b", structured = false} !$acc update host(a) async -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) wait ! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} @@ -73,9 +73,9 @@ subroutine acc_update ! 
CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) async wait -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async wait dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) async(1) ! CHECK: [[ASYNC1:%.*]] = arith.constant 1 : i32 @@ -108,8 +108,8 @@ subroutine acc_update ! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) device_type(host, nvidia) async -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type, #acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async([#acc.device_type, #acc.device_type]) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async([#acc.device_type, #acc.device_type]) dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir... [truncated] ``````````
https://github.com/llvm/llvm-project/pull/140122

From flang-commits at lists.llvm.org Thu May 15 12:01:05 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 15 May 2025 12:01:05 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [flang][acc] Update assembly formats to include asyncOnly, async, and wait (PR #140122)
In-Reply-To:
Message-ID: <682639f1.620a0220.35df9a.2c22@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-openacc

Author: None (khaki3)
Changes

The async implementation is inconsistent in terms of the assembly format. While renaming `UpdateOp`'s `async` to `asyncOnly`, this PR handles `asyncOnly` along with async operands in every operation.

Regarding `EnterDataOp` and `ExitDataOp`, they do not accept device types; thus, the async and the wait clauses without values lead to the `async` and the `wait` attributes (not `asyncOnly` nor `waitOnly`). This PR also processes them with async and wait operands all together.

---

Patch is 46.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140122.diff

19 Files Affected:

- (modified) flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 (+2-2)
- (modified) flang/test/Lower/OpenACC/acc-data.f90 (+2-2)
- (modified) flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 (+5-5)
- (modified) flang/test/Lower/OpenACC/acc-enter-data.f90 (+5-5)
- (modified) flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 (+7-7)
- (modified) flang/test/Lower/OpenACC/acc-exit-data.f90 (+7-7)
- (modified) flang/test/Lower/OpenACC/acc-kernels-loop.f90 (+2-2)
- (modified) flang/test/Lower/OpenACC/acc-kernels.f90 (+2-2)
- (modified) flang/test/Lower/OpenACC/acc-parallel-loop.f90 (+2-2)
- (modified) flang/test/Lower/OpenACC/acc-parallel.f90 (+2-2)
- (modified) flang/test/Lower/OpenACC/acc-serial-loop.f90 (+2-2)
- (modified) flang/test/Lower/OpenACC/acc-serial.f90 (+2-2)
- (modified) flang/test/Lower/OpenACC/acc-update.f90 (+6-6)
- (modified) flang/test/Lower/OpenACC/acc-wait.f90 (+1-1)
- (modified) mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td (+30-25)
- (modified) mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp (+89-4)
- (modified) mlir/test/Conversion/OpenACCToSCF/convert-openacc-to-scf.mlir (+6-6)
- (modified) mlir/test/Dialect/OpenACC/invalid.mlir (+3-3)
- (modified) mlir/test/Dialect/OpenACC/ops.mlir (+50-50)

``````````diff
diff --git a/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90
b/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 index d010d39cef4eb..789db34adefee 100644 --- a/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 @@ -155,8 +155,8 @@ subroutine acc_data !$acc data present(a) async !$acc end data -! CHECK: acc.data dataOperands(%{{.*}}) { -! CHECK: } attributes {asyncOnly = [#acc.device_type]} +! CHECK: acc.data async dataOperands(%{{.*}}) { +! CHECK: } !$acc data copy(a) async(1) !$acc end data diff --git a/flang/test/Lower/OpenACC/acc-data.f90 b/flang/test/Lower/OpenACC/acc-data.f90 index 7965fdc0ac707..3032ce7109c1e 100644 --- a/flang/test/Lower/OpenACC/acc-data.f90 +++ b/flang/test/Lower/OpenACC/acc-data.f90 @@ -155,8 +155,8 @@ subroutine acc_data !$acc data present(a) async !$acc end data -! CHECK: acc.data dataOperands(%{{.*}}) { -! CHECK: } attributes {asyncOnly = [#acc.device_type]} +! CHECK: acc.data async dataOperands(%{{.*}}) { +! CHECK: } !$acc data copy(a) async(1) !$acc end data diff --git a/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 index c42350a07c498..3e08068bdec44 100644 --- a/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 @@ -94,20 +94,20 @@ subroutine acc_enter_data !$acc enter data create(a) async !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data 
dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {wait} +!CHECK: acc.enter_data wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async wait !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async, wait} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-enter-data.f90 
b/flang/test/Lower/OpenACC/acc-enter-data.f90 index 3e49259c360eb..f7396660a6d3c 100644 --- a/flang/test/Lower/OpenACC/acc-enter-data.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data.f90 @@ -53,16 +53,16 @@ subroutine acc_enter_data !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]], %[[CREATE_B]], %[[ATTACH_D]] : !fir.ref>, !fir.ref>, !fir.ref>>){{$}} !$acc enter data create(a) async -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait !CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {wait} +!CHECK: acc.enter_data wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async wait -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async, wait} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 index 7999a7647f49b..fd942173b637a 100644 --- a/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 
@@ -56,19 +56,19 @@ subroutine acc_exit_data !CHECK: acc.detach accPtr(%[[DEVPTR_D]] : !fir.ptr) {name = "d", structured = false} !$acc exit data delete(a) async -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async {name = "a", structured = false} !$acc exit data delete(a) wait !CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {wait} +!CHECK: acc.exit_data wait dataOperands(%[[DEVPTR]] : !fir.ref>) !CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} !$acc exit data delete(a) async wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async, wait} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async -> !fir.ref> {dataClause = #acc, name = 
"a", structured = false} +!CHECK: acc.exit_data async wait dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async {name = "a", structured = false} !$acc exit data delete(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-exit-data.f90 b/flang/test/Lower/OpenACC/acc-exit-data.f90 index bf5f7094913a1..cbc63ac81945c 100644 --- a/flang/test/Lower/OpenACC/acc-exit-data.f90 +++ b/flang/test/Lower/OpenACC/acc-exit-data.f90 @@ -54,19 +54,19 @@ subroutine acc_exit_data !CHECK: acc.detach accPtr(%[[DEVPTR_D]] : !fir.ref>>) {name = "d", structured = false} !$acc exit data delete(a) async -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) async {name = "a", structured = false} !$acc exit data delete(a) wait !CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {wait} +!CHECK: acc.exit_data wait dataOperands(%[[DEVPTR]] : !fir.ref>) !CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a", structured = false} !$acc exit data delete(a) async wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: 
acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async, wait} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async wait dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) async {name = "a", structured = false} !$acc exit data delete(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 index a330b7d491d06..8608b0ad98ce6 100644 --- a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 @@ -69,12 +69,12 @@ subroutine acc_kernels_loop END DO !$acc end kernels loop -! CHECK: acc.kernels {{.*}} { +! CHECK: acc.kernels {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.terminator -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc kernels loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-kernels.f90 b/flang/test/Lower/OpenACC/acc-kernels.f90 index 6b7a625b34f71..b90870db25095 100644 --- a/flang/test/Lower/OpenACC/acc-kernels.f90 +++ b/flang/test/Lower/OpenACC/acc-kernels.f90 @@ -38,9 +38,9 @@ subroutine acc_kernels !$acc kernels async !$acc end kernels -! CHECK: acc.kernels { +! CHECK: acc.kernels async { ! CHECK: acc.terminator -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! 
CHECK-NEXT: } !$acc kernels async(1) !$acc end kernels diff --git a/flang/test/Lower/OpenACC/acc-parallel-loop.f90 b/flang/test/Lower/OpenACC/acc-parallel-loop.f90 index 1e1fc7448a513..4cf268d2517f5 100644 --- a/flang/test/Lower/OpenACC/acc-parallel-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-parallel-loop.f90 @@ -71,12 +71,12 @@ subroutine acc_parallel_loop END DO !$acc end parallel loop -! CHECK: acc.parallel {{.*}} { +! CHECK: acc.parallel {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc parallel loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-parallel.f90 b/flang/test/Lower/OpenACC/acc-parallel.f90 index e00ea41210966..1eae106ba61b2 100644 --- a/flang/test/Lower/OpenACC/acc-parallel.f90 +++ b/flang/test/Lower/OpenACC/acc-parallel.f90 @@ -60,9 +60,9 @@ subroutine acc_parallel !$acc parallel async !$acc end parallel -! CHECK: acc.parallel { +! CHECK: acc.parallel async { ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc parallel async(1) !$acc end parallel diff --git a/flang/test/Lower/OpenACC/acc-serial-loop.f90 b/flang/test/Lower/OpenACC/acc-serial-loop.f90 index 98fc28990265a..34391f78ae707 100644 --- a/flang/test/Lower/OpenACC/acc-serial-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-serial-loop.f90 @@ -90,12 +90,12 @@ subroutine acc_serial_loop END DO !$acc end serial loop -! CHECK: acc.serial {{.*}} { +! CHECK: acc.serial {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! 
CHECK-NEXT: } !$acc serial loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-serial.f90 b/flang/test/Lower/OpenACC/acc-serial.f90 index 9ba44ce6b9197..1e4f32fd209ef 100644 --- a/flang/test/Lower/OpenACC/acc-serial.f90 +++ b/flang/test/Lower/OpenACC/acc-serial.f90 @@ -60,9 +60,9 @@ subroutine acc_serial !$acc serial async !$acc end serial -! CHECK: acc.serial { +! CHECK: acc.serial async { ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc serial async(1) !$acc end serial diff --git a/flang/test/Lower/OpenACC/acc-update.f90 b/flang/test/Lower/OpenACC/acc-update.f90 index f96b105ed93bd..f98af425de985 100644 --- a/flang/test/Lower/OpenACC/acc-update.f90 +++ b/flang/test/Lower/OpenACC/acc-update.f90 @@ -63,9 +63,9 @@ subroutine acc_update ! CHECK: acc.update_host accPtr(%[[DEVPTR_B]] : !fir.ref>) to varPtr(%[[DECLB]]#0 : !fir.ref>) {name = "b", structured = false} !$acc update host(a) async -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) wait ! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} @@ -73,9 +73,9 @@ subroutine acc_update ! 
CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) async wait -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async wait dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) async(1) ! CHECK: [[ASYNC1:%.*]] = arith.constant 1 : i32 @@ -108,8 +108,8 @@ subroutine acc_update ! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) device_type(host, nvidia) async -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type, #acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async([#acc.device_type, #acc.device_type]) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async([#acc.device_type, #acc.device_type]) dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir... [truncated] ``````````
https://github.com/llvm/llvm-project/pull/140122

From flang-commits at lists.llvm.org Thu May 15 12:40:35 2025
From: flang-commits at lists.llvm.org (Razvan Lupusoru via flang-commits)
Date: Thu, 15 May 2025 12:40:35 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [flang][acc] Update assembly formats to include asyncOnly, async, and wait (PR #140122)
In-Reply-To:
Message-ID: <68264333.170a0220.3a3d66.257a@mx.google.com>

https://github.com/razvanlupusoru approved this pull request.

Thank you Matsu! Nice work! I like the improvement of printing the `async` keyword a lot.

https://github.com/llvm/llvm-project/pull/140122

From flang-commits at lists.llvm.org Thu May 15 12:56:23 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 15 May 2025 12:56:23 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [flang][acc] Update assembly formats to include asyncOnly, async, and wait (PR #140122)
In-Reply-To:
Message-ID: <682646e7.050a0220.ee3a5.331c@mx.google.com>

https://github.com/khaki3 closed https://github.com/llvm/llvm-project/pull/140122

From flang-commits at lists.llvm.org Thu May 15 12:00:30 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 15 May 2025 12:00:30 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [flang][acc] Update assembly formats to include asyncOnly, async, and wait (PR #140122)
Message-ID:

https://github.com/khaki3 created https://github.com/llvm/llvm-project/pull/140122

The async implementation is inconsistent in terms of the assembly format. While renaming `UpdateOp`'s `async` to `asyncOnly`, this PR handles `asyncOnly` along with async operands in every operation.

Regarding `EnterDataOp` and `ExitDataOp`, they do not accept device types; thus, the async and the wait clauses without values lead to the `async` and the `wait` attributes (not `asyncOnly` nor `waitOnly`). This PR also processes them with async and wait operands all together.
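
For readers skimming the patch, the description above can be illustrated with a small before/after fragment. This is only a sketch reconstructed from the CHECK lines in this patch, not text taken from the dialect documentation. The SSA name `%a` and the `memref<10xf32>` type are placeholders, the fragment is not a standalone module, and the `<none>` parameter in `#acc.device_type<none>` is an assumption, since the archived diff shows the attribute without a parameter.

```mlir
// Before: a valueless `async` clause on a compute construct was printed
// as `asyncOnly` in the trailing attribute dictionary.
acc.parallel {
  acc.yield
} attributes {asyncOnly = [#acc.device_type<none>]}

// After: the same information is printed as the `async` keyword.
acc.parallel async {
  acc.yield
}

// Before: enter_data/exit_data take no device types, so valueless clauses
// were printed as plain `async` and `wait` attributes.
acc.enter_data dataOperands(%a : memref<10xf32>) attributes {async, wait}

// After: both are printed as keywords ahead of the data operands.
acc.enter_data async wait dataOperands(%a : memref<10xf32>)
```

The updated test expectations in the patch follow this pattern: each removed `attributes {...}` suffix has a matching added `async` or `wait` keyword earlier on the same operation.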
>From 790843e2001346aa8dcbc848ea61a0b46ff28c2f Mon Sep 17 00:00:00 2001 From: Kazuaki Matsumura Date: Thu, 15 May 2025 11:40:45 -0700 Subject: [PATCH] [flang][acc] Rename UpdateOp's async to asyncOnly; Update assembly formats to include asyncOnly/async/wait --- .../OpenACC/acc-data-unwrap-defaultbounds.f90 | 4 +- flang/test/Lower/OpenACC/acc-data.f90 | 4 +- .../acc-enter-data-unwrap-defaultbounds.f90 | 10 +- flang/test/Lower/OpenACC/acc-enter-data.f90 | 10 +- .../acc-exit-data-unwrap-defaultbounds.f90 | 14 +-- flang/test/Lower/OpenACC/acc-exit-data.f90 | 14 +-- flang/test/Lower/OpenACC/acc-kernels-loop.f90 | 4 +- flang/test/Lower/OpenACC/acc-kernels.f90 | 4 +- .../test/Lower/OpenACC/acc-parallel-loop.f90 | 4 +- flang/test/Lower/OpenACC/acc-parallel.f90 | 4 +- flang/test/Lower/OpenACC/acc-serial-loop.f90 | 4 +- flang/test/Lower/OpenACC/acc-serial.f90 | 4 +- flang/test/Lower/OpenACC/acc-update.f90 | 12 +-- flang/test/Lower/OpenACC/acc-wait.f90 | 2 +- .../mlir/Dialect/OpenACC/OpenACCOps.td | 55 +++++----- mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp | 93 +++++++++++++++- .../OpenACCToSCF/convert-openacc-to-scf.mlir | 12 +-- mlir/test/Dialect/OpenACC/invalid.mlir | 6 +- mlir/test/Dialect/OpenACC/ops.mlir | 100 +++++++++--------- 19 files changed, 225 insertions(+), 135 deletions(-) diff --git a/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 index d010d39cef4eb..789db34adefee 100644 --- a/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 @@ -155,8 +155,8 @@ subroutine acc_data !$acc data present(a) async !$acc end data -! CHECK: acc.data dataOperands(%{{.*}}) { -! CHECK: } attributes {asyncOnly = [#acc.device_type]} +! CHECK: acc.data async dataOperands(%{{.*}}) { +! 
CHECK: } !$acc data copy(a) async(1) !$acc end data diff --git a/flang/test/Lower/OpenACC/acc-data.f90 b/flang/test/Lower/OpenACC/acc-data.f90 index 7965fdc0ac707..3032ce7109c1e 100644 --- a/flang/test/Lower/OpenACC/acc-data.f90 +++ b/flang/test/Lower/OpenACC/acc-data.f90 @@ -155,8 +155,8 @@ subroutine acc_data !$acc data present(a) async !$acc end data -! CHECK: acc.data dataOperands(%{{.*}}) { -! CHECK: } attributes {asyncOnly = [#acc.device_type]} +! CHECK: acc.data async dataOperands(%{{.*}}) { +! CHECK: } !$acc data copy(a) async(1) !$acc end data diff --git a/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 index c42350a07c498..3e08068bdec44 100644 --- a/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 @@ -94,20 +94,20 @@ subroutine acc_enter_data !$acc enter data create(a) async !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : 
index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {wait} +!CHECK: acc.enter_data wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async wait !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async, wait} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-enter-data.f90 b/flang/test/Lower/OpenACC/acc-enter-data.f90 index 3e49259c360eb..f7396660a6d3c 100644 --- a/flang/test/Lower/OpenACC/acc-enter-data.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data.f90 @@ -53,16 +53,16 @@ subroutine acc_enter_data !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]], %[[CREATE_B]], %[[ATTACH_D]] : !fir.ref>, !fir.ref>, !fir.ref>>){{$}} !$acc enter data create(a) async -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = 
[#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait !CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {wait} +!CHECK: acc.enter_data wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async wait -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async, wait} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 index 7999a7647f49b..fd942173b637a 100644 --- a/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 @@ -56,19 +56,19 @@ subroutine acc_exit_data !CHECK: acc.detach accPtr(%[[DEVPTR_D]] : !fir.ptr) {name = "d", structured = false} !$acc exit data delete(a) async -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : 
!fir.ref>) bounds(%{{.*}}, %{{.*}}) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async {name = "a", structured = false} !$acc exit data delete(a) wait !CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {wait} +!CHECK: acc.exit_data wait dataOperands(%[[DEVPTR]] : !fir.ref>) !CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} !$acc exit data delete(a) async wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async, wait} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async wait dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async {name = "a", structured = false} !$acc exit data delete(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-exit-data.f90 b/flang/test/Lower/OpenACC/acc-exit-data.f90 index bf5f7094913a1..cbc63ac81945c 100644 --- 
a/flang/test/Lower/OpenACC/acc-exit-data.f90 +++ b/flang/test/Lower/OpenACC/acc-exit-data.f90 @@ -54,19 +54,19 @@ subroutine acc_exit_data !CHECK: acc.detach accPtr(%[[DEVPTR_D]] : !fir.ref>>) {name = "d", structured = false} !$acc exit data delete(a) async -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) async {name = "a", structured = false} !$acc exit data delete(a) wait !CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {wait} +!CHECK: acc.exit_data wait dataOperands(%[[DEVPTR]] : !fir.ref>) !CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a", structured = false} !$acc exit data delete(a) async wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async, wait} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async wait dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete 
accPtr(%[[DEVPTR]] : !fir.ref>) async {name = "a", structured = false} !$acc exit data delete(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 index a330b7d491d06..8608b0ad98ce6 100644 --- a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 @@ -69,12 +69,12 @@ subroutine acc_kernels_loop END DO !$acc end kernels loop -! CHECK: acc.kernels {{.*}} { +! CHECK: acc.kernels {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.terminator -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc kernels loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-kernels.f90 b/flang/test/Lower/OpenACC/acc-kernels.f90 index 6b7a625b34f71..b90870db25095 100644 --- a/flang/test/Lower/OpenACC/acc-kernels.f90 +++ b/flang/test/Lower/OpenACC/acc-kernels.f90 @@ -38,9 +38,9 @@ subroutine acc_kernels !$acc kernels async !$acc end kernels -! CHECK: acc.kernels { +! CHECK: acc.kernels async { ! CHECK: acc.terminator -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc kernels async(1) !$acc end kernels diff --git a/flang/test/Lower/OpenACC/acc-parallel-loop.f90 b/flang/test/Lower/OpenACC/acc-parallel-loop.f90 index 1e1fc7448a513..4cf268d2517f5 100644 --- a/flang/test/Lower/OpenACC/acc-parallel-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-parallel-loop.f90 @@ -71,12 +71,12 @@ subroutine acc_parallel_loop END DO !$acc end parallel loop -! CHECK: acc.parallel {{.*}} { +! CHECK: acc.parallel {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! 
CHECK-NEXT: } !$acc parallel loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-parallel.f90 b/flang/test/Lower/OpenACC/acc-parallel.f90 index e00ea41210966..1eae106ba61b2 100644 --- a/flang/test/Lower/OpenACC/acc-parallel.f90 +++ b/flang/test/Lower/OpenACC/acc-parallel.f90 @@ -60,9 +60,9 @@ subroutine acc_parallel !$acc parallel async !$acc end parallel -! CHECK: acc.parallel { +! CHECK: acc.parallel async { ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc parallel async(1) !$acc end parallel diff --git a/flang/test/Lower/OpenACC/acc-serial-loop.f90 b/flang/test/Lower/OpenACC/acc-serial-loop.f90 index 98fc28990265a..34391f78ae707 100644 --- a/flang/test/Lower/OpenACC/acc-serial-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-serial-loop.f90 @@ -90,12 +90,12 @@ subroutine acc_serial_loop END DO !$acc end serial loop -! CHECK: acc.serial {{.*}} { +! CHECK: acc.serial {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc serial loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-serial.f90 b/flang/test/Lower/OpenACC/acc-serial.f90 index 9ba44ce6b9197..1e4f32fd209ef 100644 --- a/flang/test/Lower/OpenACC/acc-serial.f90 +++ b/flang/test/Lower/OpenACC/acc-serial.f90 @@ -60,9 +60,9 @@ subroutine acc_serial !$acc serial async !$acc end serial -! CHECK: acc.serial { +! CHECK: acc.serial async { ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc serial async(1) !$acc end serial diff --git a/flang/test/Lower/OpenACC/acc-update.f90 b/flang/test/Lower/OpenACC/acc-update.f90 index f96b105ed93bd..f98af425de985 100644 --- a/flang/test/Lower/OpenACC/acc-update.f90 +++ b/flang/test/Lower/OpenACC/acc-update.f90 @@ -63,9 +63,9 @@ subroutine acc_update ! 
CHECK: acc.update_host accPtr(%[[DEVPTR_B]] : !fir.ref>) to varPtr(%[[DECLB]]#0 : !fir.ref>) {name = "b", structured = false} !$acc update host(a) async -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) wait ! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} @@ -73,9 +73,9 @@ subroutine acc_update ! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) async wait -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async wait dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) async(1) ! 
CHECK: [[ASYNC1:%.*]] = arith.constant 1 : i32 @@ -108,8 +108,8 @@ subroutine acc_update ! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) device_type(host, nvidia) async -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type, #acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async([#acc.device_type, #acc.device_type]) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async([#acc.device_type, #acc.device_type]) dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type, #acc.device_type], name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async([#acc.device_type, #acc.device_type]) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} end subroutine acc_update diff --git a/flang/test/Lower/OpenACC/acc-wait.f90 b/flang/test/Lower/OpenACC/acc-wait.f90 index 8a42c97a12811..35db640a054c2 100644 --- a/flang/test/Lower/OpenACC/acc-wait.f90 +++ b/flang/test/Lower/OpenACC/acc-wait.f90 @@ -25,7 +25,7 @@ subroutine acc_update !$acc wait(1) async !CHECK: [[WAIT3:%.*]] = arith.constant 1 : i32 -!CHECK: acc.wait([[WAIT3]] : i32) attributes {async} +!CHECK: acc.wait([[WAIT3]] : i32) async !$acc wait(1) async(async) !CHECK: [[WAIT3:%.*]] = arith.constant 1 : i32 diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td index 5d5add6318e06..3c22aeb9a1ff7 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td @@ -561,8 +561,8 @@ class OpenACC_DataEntryOp($asyncOperands, - type($asyncOperands), 
$asyncOperandsDeviceType) `)` + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) ) `->` type($accVar) attr-dict }]; @@ -922,8 +922,8 @@ class OpenACC_DataExitOpWithVarPtr let assemblyFormat = [{ custom($accVar, type($accVar)) (`bounds` `(` $bounds^ `)` )? - (`async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType)^ `)`)? + (`async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly)^)? `to` custom($var) `:` custom(type($var), $varType) attr-dict }]; @@ -983,8 +983,8 @@ class OpenACC_DataExitOpNoVarPtr : let assemblyFormat = [{ custom($accVar, type($accVar)) (`bounds` `(` $bounds^ `)` )? - (`async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType)^ `)`)? + (`async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly)^)? attr-dict }]; @@ -1439,8 +1439,8 @@ def OpenACC_ParallelOp : OpenACC_Op<"parallel", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1581,8 +1581,8 @@ def OpenACC_SerialOp : OpenACC_Op<"serial", ( `combined` `(` `loop` `)` $combined^)? 
oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1750,8 +1750,8 @@ def OpenACC_KernelsOp : OpenACC_Op<"kernels", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `num_gangs` `(` custom($numGangs, type($numGangs), $numGangsDeviceType, $numGangsSegments) `)` | `num_workers` `(` custom($numWorkers, @@ -1870,8 +1870,8 @@ def OpenACC_DataOp : OpenACC_Op<"data", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, @@ -1934,9 +1934,11 @@ def OpenACC_EnterDataOp : OpenACC_Op<"enter_data", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` $asyncOperand `:` type($asyncOperand) `)` + | `async` `` custom($asyncOperand, + type($asyncOperand), $async) | `wait_devnum` `(` $waitDevnum `:` type($waitDevnum) `)` - | `wait` `(` $waitOperands `:` type($waitOperands) `)` + | `wait` `` custom($waitOperands, + type($waitOperands), $wait) | `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` ) attr-dict-with-keyword @@ -1986,9 +1988,11 @@ def OpenACC_ExitDataOp : 
OpenACC_Op<"exit_data", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` $asyncOperand `:` type($asyncOperand) `)` + | `async` `` custom($asyncOperand, + type($asyncOperand), $async) | `wait_devnum` `(` $waitDevnum `:` type($waitDevnum) `)` - | `wait` `(` $waitOperands `:` type($waitOperands) `)` + | `wait` `` custom($waitOperands, + type($waitOperands), $wait) | `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` ) attr-dict-with-keyword @@ -2853,7 +2857,7 @@ def OpenACC_UpdateOp : OpenACC_Op<"update", let arguments = (ins Optional:$ifCond, Variadic:$asyncOperands, OptionalAttr:$asyncOperandsDeviceType, - OptionalAttr:$async, + OptionalAttr:$asyncOnly, Variadic:$waitOperands, OptionalAttr:$waitOperandsSegments, OptionalAttr:$waitOperandsDeviceType, @@ -2901,9 +2905,8 @@ def OpenACC_UpdateOp : OpenACC_Op<"update", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `` custom( - $asyncOperands, type($asyncOperands), - $asyncOperandsDeviceType, $async) + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, $waitOnly) @@ -2948,9 +2951,11 @@ def OpenACC_WaitOp : OpenACC_Op<"wait", [AttrSizedOperandSegments]> { let assemblyFormat = [{ ( `(` $waitOperands^ `:` type($waitOperands) `)` )? 
- oilist(`async` `(` $asyncOperand `:` type($asyncOperand) `)` - |`wait_devnum` `(` $waitDevnum `:` type($waitDevnum) `)` - |`if` `(` $ifCond `)` + oilist( + `async` `` custom($asyncOperand, + type($asyncOperand), $async) + | `wait_devnum` `(` $waitDevnum `:` type($waitDevnum) `)` + | `if` `(` $ifCond `)` ) attr-dict-with-keyword }]; let hasVerifier = 1; diff --git a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp index 7eb72d433c972..b401d2ec7894a 100644 --- a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp +++ b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp @@ -272,11 +272,12 @@ static LogicalResult checkWaitAndAsyncConflict(Op op) { ++dtypeInt) { auto dtype = static_cast(dtypeInt); - // The async attribute represent the async clause without value. Therefore - // the attribute and operand cannot appear at the same time. + // The asyncOnly attribute represent the async clause without value. + // Therefore the attribute and operand cannot appear at the same time. if (hasDeviceType(op.getAsyncOperandsDeviceType(), dtype) && op.hasAsyncOnly(dtype)) - return op.emitError("async attribute cannot appear with asyncOperand"); + return op.emitError( + "asyncOnly attribute cannot appear with asyncOperand"); // The wait attribute represent the wait clause without values. Therefore // the attribute and operands cannot appear at the same time. 
@@ -1683,6 +1684,90 @@ static void printDeviceTypeOperandsWithKeywordOnly( p << ")"; } +static ParseResult parseOperandWithKeywordOnly( + mlir::OpAsmParser &parser, + std::optional &operand, + mlir::Type &operandType, mlir::UnitAttr &attr) { + // Keyword only + if (failed(parser.parseOptionalLParen())) { + attr = mlir::UnitAttr::get(parser.getContext()); + return success(); + } + + OpAsmParser::UnresolvedOperand op; + if (failed(parser.parseOperand(op))) + return failure(); + operand = op; + if (failed(parser.parseColon())) + return failure(); + if (failed(parser.parseType(operandType))) + return failure(); + if (failed(parser.parseRParen())) + return failure(); + + return success(); +} + +static void printOperandWithKeywordOnly(mlir::OpAsmPrinter &p, + mlir::Operation *op, + std::optional operand, + mlir::Type operandType, + mlir::UnitAttr attr) { + if (attr) + return; + + p << "("; + p.printOperand(*operand); + p << " : "; + p.printType(operandType); + p << ")"; +} + +static ParseResult parseOperandsWithKeywordOnly( + mlir::OpAsmParser &parser, + llvm::SmallVectorImpl &operands, + llvm::SmallVectorImpl &types, mlir::UnitAttr &attr) { + // Keyword only + if (failed(parser.parseOptionalLParen())) { + attr = mlir::UnitAttr::get(parser.getContext()); + return success(); + } + + if (failed(parser.parseCommaSeparatedList([&]() { + if (parser.parseOperand(operands.emplace_back())) + return failure(); + return success(); + }))) + return failure(); + if (failed(parser.parseColon())) + return failure(); + if (failed(parser.parseCommaSeparatedList([&]() { + if (parser.parseType(types.emplace_back())) + return failure(); + return success(); + }))) + return failure(); + if (failed(parser.parseRParen())) + return failure(); + + return success(); +} + +static void printOperandsWithKeywordOnly(mlir::OpAsmPrinter &p, + mlir::Operation *op, + mlir::OperandRange operands, + mlir::TypeRange types, + mlir::UnitAttr attr) { + if (attr) + return; + + p << "("; + 
llvm::interleaveComma(operands, p, [&](auto it) { p << it; }); + p << " : "; + llvm::interleaveComma(types, p, [&](auto it) { p << it; }); + p << ")"; +} + static ParseResult parseCombinedConstructsLoop(mlir::OpAsmParser &parser, mlir::acc::CombinedConstructsTypeAttr &attr) { @@ -3505,7 +3590,7 @@ bool UpdateOp::hasAsyncOnly() { } bool UpdateOp::hasAsyncOnly(mlir::acc::DeviceType deviceType) { - return hasDeviceType(getAsync(), deviceType); + return hasDeviceType(getAsyncOnly(), deviceType); } mlir::Value UpdateOp::getAsyncValue() { diff --git a/mlir/test/Conversion/OpenACCToSCF/convert-openacc-to-scf.mlir b/mlir/test/Conversion/OpenACCToSCF/convert-openacc-to-scf.mlir index d8e89f64f8bc0..c08fd860e738b 100644 --- a/mlir/test/Conversion/OpenACCToSCF/convert-openacc-to-scf.mlir +++ b/mlir/test/Conversion/OpenACCToSCF/convert-openacc-to-scf.mlir @@ -68,20 +68,20 @@ func.func @update_false(%arg0: memref) { func.func @enter_data_true(%d1 : memref) { %true = arith.constant true %0 = acc.create varPtr(%d1 : memref) -> memref - acc.enter_data if(%true) dataOperands(%0 : memref) attributes {async} + acc.enter_data async if(%true) dataOperands(%0 : memref) return } // CHECK-LABEL: func.func @enter_data_true // CHECK-NOT: if -// CHECK: acc.enter_data dataOperands +// CHECK: acc.enter_data async dataOperands // ----- func.func @enter_data_false(%d1 : memref) { %false = arith.constant false %0 = acc.create varPtr(%d1 : memref) -> memref - acc.enter_data if(%false) dataOperands(%0 : memref) attributes {async} + acc.enter_data async if(%false) dataOperands(%0 : memref) return } @@ -93,21 +93,21 @@ func.func @enter_data_false(%d1 : memref) { func.func @exit_data_true(%d1 : memref) { %true = arith.constant true %0 = acc.getdeviceptr varPtr(%d1 : memref) -> memref - acc.exit_data if(%true) dataOperands(%0 : memref) attributes {async} + acc.exit_data async if(%true) dataOperands(%0 : memref) acc.delete accPtr(%0 : memref) return } // CHECK-LABEL: func.func @exit_data_true // 
CHECK-NOT:if -// CHECK:acc.exit_data dataOperands +// CHECK:acc.exit_data async dataOperands // ----- func.func @exit_data_false(%d1 : memref) { %false = arith.constant false %0 = acc.getdeviceptr varPtr(%d1 : memref) -> memref - acc.exit_data if(%false) dataOperands(%0 : memref) attributes {async} + acc.exit_data async if(%false) dataOperands(%0 : memref) acc.delete accPtr(%0 : memref) return } diff --git a/mlir/test/Dialect/OpenACC/invalid.mlir b/mlir/test/Dialect/OpenACC/invalid.mlir index c8d7a87112917..aadf189273212 100644 --- a/mlir/test/Dialect/OpenACC/invalid.mlir +++ b/mlir/test/Dialect/OpenACC/invalid.mlir @@ -129,8 +129,8 @@ acc.update %cst = arith.constant 1 : index %value = memref.alloc() : memref %0 = acc.update_device varPtr(%value : memref) -> memref -// expected-error@+1 {{async attribute cannot appear with asyncOperand}} -acc.update async(%cst: index) dataOperands(%0 : memref) attributes {async = [#acc.device_type]} +// expected-error@+1 {{asyncOnly attribute cannot appear with asyncOperand}} +acc.update async(%cst: index) dataOperands(%0 : memref) attributes {asyncOnly = [#acc.device_type]} // ----- @@ -138,7 +138,7 @@ acc.update async(%cst: index) dataOperands(%0 : memref) attributes {async = %value = memref.alloc() : memref %0 = acc.update_device varPtr(%value : memref) -> memref // expected-error@+1 {{wait attribute cannot appear with waitOperands}} -acc.update wait({%cst: index}) dataOperands(%0: memref) attributes {waitOnly = [#acc.device_type]} +acc.update wait({%cst: index}) dataOperands(%0: memref) attributes {waitOnly = [#acc.device_type]} // ----- diff --git a/mlir/test/Dialect/OpenACC/ops.mlir b/mlir/test/Dialect/OpenACC/ops.mlir index 4c842a26f8dc4..550f295f074a2 100644 --- a/mlir/test/Dialect/OpenACC/ops.mlir +++ b/mlir/test/Dialect/OpenACC/ops.mlir @@ -435,10 +435,10 @@ func.func @testparallelop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x } attributes {defaultAttr = #acc} acc.parallel { } attributes {defaultAttr 
= #acc} - acc.parallel { - } attributes {asyncAttr} - acc.parallel { - } attributes {waitAttr} + acc.parallel async { + } + acc.parallel wait { + } acc.parallel { } attributes {selfAttr} return @@ -488,10 +488,10 @@ func.func @testparallelop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x // CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.parallel { // CHECK-NEXT: } attributes {defaultAttr = #acc} -// CHECK: acc.parallel { -// CHECK-NEXT: } attributes {asyncAttr} -// CHECK: acc.parallel { -// CHECK-NEXT: } attributes {waitAttr} +// CHECK: acc.parallel async { +// CHECK-NEXT: } +// CHECK: acc.parallel wait { +// CHECK-NEXT: } // CHECK: acc.parallel { // CHECK-NEXT: } attributes {selfAttr} @@ -567,10 +567,10 @@ func.func @testserialop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x10 } attributes {defaultAttr = #acc} acc.serial { } attributes {defaultAttr = #acc} - acc.serial { - } attributes {asyncAttr} - acc.serial { - } attributes {waitAttr} + acc.serial async { + } + acc.serial wait { + } acc.serial { } attributes {selfAttr} acc.serial { @@ -604,10 +604,10 @@ func.func @testserialop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x10 // CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.serial { // CHECK-NEXT: } attributes {defaultAttr = #acc} -// CHECK: acc.serial { -// CHECK-NEXT: } attributes {asyncAttr} -// CHECK: acc.serial { -// CHECK-NEXT: } attributes {waitAttr} +// CHECK: acc.serial async { +// CHECK-NEXT: } +// CHECK: acc.serial wait { +// CHECK-NEXT: } // CHECK: acc.serial { // CHECK-NEXT: } attributes {selfAttr} // CHECK: acc.serial { @@ -639,10 +639,10 @@ func.func @testserialop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x10 } attributes {defaultAttr = #acc} acc.kernels { } attributes {defaultAttr = #acc} - acc.kernels { - } attributes {asyncAttr} - acc.kernels { - } attributes {waitAttr} + acc.kernels async { + } + acc.kernels wait { + } acc.kernels { } attributes {selfAttr} acc.kernels { @@ -673,10 
+673,10 @@ func.func @testserialop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x10 // CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.kernels { // CHECK-NEXT: } attributes {defaultAttr = #acc} -// CHECK: acc.kernels { -// CHECK-NEXT: } attributes {asyncAttr} -// CHECK: acc.kernels { -// CHECK-NEXT: } attributes {waitAttr} +// CHECK: acc.kernels async { +// CHECK-NEXT: } +// CHECK: acc.kernels wait { +// CHECK-NEXT: } // CHECK: acc.kernels { // CHECK-NEXT: } attributes {selfAttr} // CHECK: acc.kernels { @@ -787,23 +787,23 @@ func.func @testdataop(%a: memref, %b: memref, %c: memref) -> () { acc.data { } attributes { defaultAttr = #acc } - acc.data { - } attributes { defaultAttr = #acc, async } + acc.data async { + } attributes { defaultAttr = #acc } %a1 = arith.constant 1 : i64 acc.data async(%a1 : i64) { - } attributes { defaultAttr = #acc, async } + } attributes { defaultAttr = #acc } - acc.data { - } attributes { defaultAttr = #acc, wait } + acc.data wait { + } attributes { defaultAttr = #acc } %w1 = arith.constant 1 : i64 acc.data wait({%w1 : i64}) { - } attributes { defaultAttr = #acc, wait } + } attributes { defaultAttr = #acc } %wd1 = arith.constant 1 : i64 acc.data wait({devnum: %wd1 : i64, %w1 : i64}) { - } attributes { defaultAttr = #acc, wait } + } attributes { defaultAttr = #acc } return } @@ -904,20 +904,20 @@ func.func @testdataop(%a: memref, %b: memref, %c: memref) -> () { // CHECK: acc.data { // CHECK-NEXT: } attributes {defaultAttr = #acc} -// CHECK: acc.data { -// CHECK-NEXT: } attributes {async, defaultAttr = #acc} +// CHECK: acc.data async { +// CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.data async(%{{.*}} : i64) { -// CHECK-NEXT: } attributes {async, defaultAttr = #acc} +// CHECK-NEXT: } attributes {defaultAttr = #acc} -// CHECK: acc.data { -// CHECK-NEXT: } attributes {defaultAttr = #acc, wait} +// CHECK: acc.data wait { +// CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.data wait({%{{.*}} : 
i64}) { -// CHECK-NEXT: } attributes {defaultAttr = #acc, wait} +// CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.data wait({devnum: %{{.*}} : i64, %{{.*}} : i64}) { -// CHECK-NEXT: } attributes {defaultAttr = #acc, wait} +// CHECK-NEXT: } attributes {defaultAttr = #acc} // ----- @@ -977,7 +977,7 @@ acc.wait async(%i32Value: i32) acc.wait async(%idxValue: index) acc.wait(%i32Value: i32) async(%idxValue: index) acc.wait(%i64Value: i64) wait_devnum(%i32Value: i32) -acc.wait attributes {async} +acc.wait async acc.wait(%i64Value: i64) async(%idxValue: index) wait_devnum(%i32Value: i32) acc.wait(%i64Value: i64) wait_devnum(%i32Value: i32) async(%idxValue: index) acc.wait if(%ifCond) @@ -996,7 +996,7 @@ acc.wait if(%ifCond) // CHECK: acc.wait async([[IDXVALUE]] : index) // CHECK: acc.wait([[I32VALUE]] : i32) async([[IDXVALUE]] : index) // CHECK: acc.wait([[I64VALUE]] : i64) wait_devnum([[I32VALUE]] : i32) -// CHECK: acc.wait attributes {async} +// CHECK: acc.wait async // CHECK: acc.wait([[I64VALUE]] : i64) async([[IDXVALUE]] : index) wait_devnum([[I32VALUE]] : i32) // CHECK: acc.wait([[I64VALUE]] : i64) async([[IDXVALUE]] : index) wait_devnum([[I32VALUE]] : i32) // CHECK: acc.wait if([[IFCOND]]) @@ -1078,7 +1078,7 @@ func.func @testexitdataop(%a: !llvm.ptr) -> () { acc.delete accPtr(%1 : !llvm.ptr) %2 = acc.getdeviceptr varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr - acc.exit_data dataOperands(%2 : !llvm.ptr) attributes {async,finalize} + acc.exit_data async dataOperands(%2 : !llvm.ptr) attributes {finalize} acc.delete accPtr(%2 : !llvm.ptr) %3 = acc.getdeviceptr varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr @@ -1086,11 +1086,11 @@ func.func @testexitdataop(%a: !llvm.ptr) -> () { acc.detach accPtr(%3 : !llvm.ptr) %4 = acc.getdeviceptr varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr - acc.exit_data dataOperands(%4 : !llvm.ptr) attributes {async} + acc.exit_data async dataOperands(%4 : !llvm.ptr) acc.copyout accPtr(%4 : !llvm.ptr) to varPtr(%a : 
!llvm.ptr) varType(f64) %5 = acc.getdeviceptr varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr - acc.exit_data dataOperands(%5 : !llvm.ptr) attributes {wait} + acc.exit_data wait dataOperands(%5 : !llvm.ptr) acc.delete accPtr(%5 : !llvm.ptr) %6 = acc.getdeviceptr varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr @@ -1127,7 +1127,7 @@ func.func @testexitdataop(%a: !llvm.ptr) -> () { // CHECK: acc.delete accPtr(%[[DEVPTR]] : !llvm.ptr) // CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr -// CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !llvm.ptr) attributes {async, finalize} +// CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !llvm.ptr) attributes {finalize} // CHECK: acc.delete accPtr(%[[DEVPTR]] : !llvm.ptr) // CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr @@ -1135,11 +1135,11 @@ func.func @testexitdataop(%a: !llvm.ptr) -> () { // CHECK: acc.detach accPtr(%[[DEVPTR]] : !llvm.ptr) // CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr -// CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !llvm.ptr) attributes {async} +// CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !llvm.ptr) // CHECK: acc.copyout accPtr(%[[DEVPTR]] : !llvm.ptr) to varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) // CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr -// CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !llvm.ptr) attributes {wait} +// CHECK: acc.exit_data wait dataOperands(%[[DEVPTR]] : !llvm.ptr) // CHECK: acc.delete accPtr(%[[DEVPTR]] : !llvm.ptr) // CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr @@ -1176,9 +1176,9 @@ func.func @testenterdataop(%a: !llvm.ptr, %b: !llvm.ptr, %c: !llvm.ptr) -> () { %4 = acc.attach varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr acc.enter_data dataOperands(%4 : !llvm.ptr) %5 = acc.copyin varPtr(%a : 
!llvm.ptr) varType(f64) -> !llvm.ptr - acc.enter_data dataOperands(%5 : !llvm.ptr) attributes {async} + acc.enter_data async dataOperands(%5 : !llvm.ptr) %6 = acc.create varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr - acc.enter_data dataOperands(%6 : !llvm.ptr) attributes {wait} + acc.enter_data wait dataOperands(%6 : !llvm.ptr) %7 = acc.copyin varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr acc.enter_data async(%i64Value : i64) dataOperands(%7 : !llvm.ptr) %8 = acc.copyin varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr @@ -1205,9 +1205,9 @@ func.func @testenterdataop(%a: !llvm.ptr, %b: !llvm.ptr, %c: !llvm.ptr) -> () { // CHECK: %[[ATTACH:.*]] = acc.attach varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr // CHECK: acc.enter_data dataOperands(%[[ATTACH]] : !llvm.ptr) // CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr -// CHECK: acc.enter_data dataOperands(%[[COPYIN]] : !llvm.ptr) attributes {async} +// CHECK: acc.enter_data async dataOperands(%[[COPYIN]] : !llvm.ptr) // CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr -// CHECK: acc.enter_data dataOperands(%[[CREATE]] : !llvm.ptr) attributes {wait} +// CHECK: acc.enter_data wait dataOperands(%[[CREATE]] : !llvm.ptr) // CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr // CHECK: acc.enter_data async([[I64VALUE]] : i64) dataOperands(%[[COPYIN]] : !llvm.ptr) // CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr From flang-commits at lists.llvm.org Thu May 15 12:56:20 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 12:56:20 -0700 (PDT) Subject: [flang-commits] [flang] f9dbfb1 - [flang][acc] Update assembly formats to include asyncOnly, async, and wait (#140122) Message-ID: <682646e4.170a0220.43513.35b1@mx.google.com> Author: khaki3 Date: 2025-05-15T12:56:15-07:00 New Revision: 
f9dbfb1566043d744d66ff8b5415269c6ec59743

URL: https://github.com/llvm/llvm-project/commit/f9dbfb1566043d744d66ff8b5415269c6ec59743
DIFF: https://github.com/llvm/llvm-project/commit/f9dbfb1566043d744d66ff8b5415269c6ec59743.diff

LOG: [flang][acc] Update assembly formats to include asyncOnly, async, and wait (#140122)

The async implementation is inconsistent in terms of the assembly format. While renaming `UpdateOp`'s `async` to `asyncOnly`, this PR handles `asyncOnly` along with async operands in every operation.

Regarding `EnterDataOp` and `ExitDataOp`, they do not accept device types; thus, the async and the wait clauses without values lead to the `async` and the `wait` attributes (not `asyncOnly` nor `waitOnly`). This PR also processes them with async and wait operands all together.

Added:

Modified:
    flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90
    flang/test/Lower/OpenACC/acc-data.f90
    flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90
    flang/test/Lower/OpenACC/acc-enter-data.f90
    flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90
    flang/test/Lower/OpenACC/acc-exit-data.f90
    flang/test/Lower/OpenACC/acc-kernels-loop.f90
    flang/test/Lower/OpenACC/acc-kernels.f90
    flang/test/Lower/OpenACC/acc-parallel-loop.f90
    flang/test/Lower/OpenACC/acc-parallel.f90
    flang/test/Lower/OpenACC/acc-serial-loop.f90
    flang/test/Lower/OpenACC/acc-serial.f90
    flang/test/Lower/OpenACC/acc-update.f90
    flang/test/Lower/OpenACC/acc-wait.f90
    mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
    mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp
    mlir/test/Conversion/OpenACCToSCF/convert-openacc-to-scf.mlir
    mlir/test/Dialect/OpenACC/invalid.mlir
    mlir/test/Dialect/OpenACC/ops.mlir

Removed:

################################################################################
diff --git a/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90
index d010d39cef4eb..789db34adefee 100644
--- 
a/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-data-unwrap-defaultbounds.f90 @@ -155,8 +155,8 @@ subroutine acc_data !$acc data present(a) async !$acc end data -! CHECK: acc.data dataOperands(%{{.*}}) { -! CHECK: } attributes {asyncOnly = [#acc.device_type]} +! CHECK: acc.data async dataOperands(%{{.*}}) { +! CHECK: } !$acc data copy(a) async(1) !$acc end data diff --git a/flang/test/Lower/OpenACC/acc-data.f90 b/flang/test/Lower/OpenACC/acc-data.f90 index 7965fdc0ac707..3032ce7109c1e 100644 --- a/flang/test/Lower/OpenACC/acc-data.f90 +++ b/flang/test/Lower/OpenACC/acc-data.f90 @@ -155,8 +155,8 @@ subroutine acc_data !$acc data present(a) async !$acc end data -! CHECK: acc.data dataOperands(%{{.*}}) { -! CHECK: } attributes {asyncOnly = [#acc.device_type]} +! CHECK: acc.data async dataOperands(%{{.*}}) { +! CHECK: } !$acc data copy(a) async(1) !$acc end data diff --git a/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 index c42350a07c498..3e08068bdec44 100644 --- a/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data-unwrap-defaultbounds.f90 @@ -94,20 +94,20 @@ subroutine acc_enter_data !$acc enter data create(a) async !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : 
!fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {wait} +!CHECK: acc.enter_data wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async wait !CHECK: %[[BOUND0:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) !CHECK: %[[BOUND1:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%[[EXTENT_C10]] : index) stride(%c1{{.*}} : index) startIdx(%{{.*}} : index) -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async, wait} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%[[BOUND0]], %[[BOUND1]]) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-enter-data.f90 b/flang/test/Lower/OpenACC/acc-enter-data.f90 index 3e49259c360eb..f7396660a6d3c 100644 --- 
a/flang/test/Lower/OpenACC/acc-enter-data.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data.f90 @@ -53,16 +53,16 @@ subroutine acc_enter_data !CHECK: acc.enter_data dataOperands(%[[COPYIN_A]], %[[CREATE_B]], %[[ATTACH_D]] : !fir.ref>, !fir.ref>, !fir.ref>>){{$}} !$acc enter data create(a) async -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) wait !CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {wait} +!CHECK: acc.enter_data wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async wait -!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], name = "a", structured = false} -!CHECK: acc.enter_data dataOperands(%[[CREATE_A]] : !fir.ref>) attributes {async, wait} +!CHECK: %[[CREATE_A:.*]] = acc.create varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {name = "a", structured = false} +!CHECK: acc.enter_data async wait dataOperands(%[[CREATE_A]] : !fir.ref>) !$acc enter data create(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 index 7999a7647f49b..fd942173b637a 100644 --- a/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-exit-data-unwrap-defaultbounds.f90 @@ -56,19 +56,19 @@ subroutine acc_exit_data !CHECK: acc.detach accPtr(%[[DEVPTR_D]] : 
!fir.ptr) {name = "d", structured = false} !$acc exit data delete(a) async -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async {name = "a", structured = false} !$acc exit data delete(a) wait !CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {wait} +!CHECK: acc.exit_data wait dataOperands(%[[DEVPTR]] : !fir.ref>) !CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {name = "a", structured = false} !$acc exit data delete(a) async wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async, wait} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async wait dataOperands(%[[DEVPTR]] : 
!fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) bounds(%{{.*}}, %{{.*}}) async {name = "a", structured = false} !$acc exit data delete(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-exit-data.f90 b/flang/test/Lower/OpenACC/acc-exit-data.f90 index bf5f7094913a1..cbc63ac81945c 100644 --- a/flang/test/Lower/OpenACC/acc-exit-data.f90 +++ b/flang/test/Lower/OpenACC/acc-exit-data.f90 @@ -54,19 +54,19 @@ subroutine acc_exit_data !CHECK: acc.detach accPtr(%[[DEVPTR_D]] : !fir.ref>>) {name = "d", structured = false} !$acc exit data delete(a) async -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async} -!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) async {name = "a", structured = false} !$acc exit data delete(a) wait !CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {wait} +!CHECK: acc.exit_data wait dataOperands(%[[DEVPTR]] : !fir.ref>) !CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {name = "a", structured = false} !$acc exit data delete(a) async wait -!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} -!CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !fir.ref>) attributes {async, wait} -!CHECK: 
acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +!CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} +!CHECK: acc.exit_data async wait dataOperands(%[[DEVPTR]] : !fir.ref>) +!CHECK: acc.delete accPtr(%[[DEVPTR]] : !fir.ref>) async {name = "a", structured = false} !$acc exit data delete(a) async(1) !CHECK: %[[ASYNC1:.*]] = arith.constant 1 : i32 diff --git a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 index a330b7d491d06..8608b0ad98ce6 100644 --- a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-kernels-loop.f90 @@ -69,12 +69,12 @@ subroutine acc_kernels_loop END DO !$acc end kernels loop -! CHECK: acc.kernels {{.*}} { +! CHECK: acc.kernels {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.terminator -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc kernels loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-kernels.f90 b/flang/test/Lower/OpenACC/acc-kernels.f90 index 6b7a625b34f71..b90870db25095 100644 --- a/flang/test/Lower/OpenACC/acc-kernels.f90 +++ b/flang/test/Lower/OpenACC/acc-kernels.f90 @@ -38,9 +38,9 @@ subroutine acc_kernels !$acc kernels async !$acc end kernels -! CHECK: acc.kernels { +! CHECK: acc.kernels async { ! CHECK: acc.terminator -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc kernels async(1) !$acc end kernels diff --git a/flang/test/Lower/OpenACC/acc-parallel-loop.f90 b/flang/test/Lower/OpenACC/acc-parallel-loop.f90 index 1e1fc7448a513..4cf268d2517f5 100644 --- a/flang/test/Lower/OpenACC/acc-parallel-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-parallel-loop.f90 @@ -71,12 +71,12 @@ subroutine acc_parallel_loop END DO !$acc end parallel loop -! CHECK: acc.parallel {{.*}} { +! 
CHECK: acc.parallel {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc parallel loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-parallel.f90 b/flang/test/Lower/OpenACC/acc-parallel.f90 index e00ea41210966..1eae106ba61b2 100644 --- a/flang/test/Lower/OpenACC/acc-parallel.f90 +++ b/flang/test/Lower/OpenACC/acc-parallel.f90 @@ -60,9 +60,9 @@ subroutine acc_parallel !$acc parallel async !$acc end parallel -! CHECK: acc.parallel { +! CHECK: acc.parallel async { ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc parallel async(1) !$acc end parallel diff --git a/flang/test/Lower/OpenACC/acc-serial-loop.f90 b/flang/test/Lower/OpenACC/acc-serial-loop.f90 index 98fc28990265a..34391f78ae707 100644 --- a/flang/test/Lower/OpenACC/acc-serial-loop.f90 +++ b/flang/test/Lower/OpenACC/acc-serial-loop.f90 @@ -90,12 +90,12 @@ subroutine acc_serial_loop END DO !$acc end serial loop -! CHECK: acc.serial {{.*}} { +! CHECK: acc.serial {{.*}} async { ! CHECK: acc.loop {{.*}} { ! CHECK: acc.yield ! CHECK-NEXT: }{{$}} ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! CHECK-NEXT: } !$acc serial loop async(1) DO i = 1, n diff --git a/flang/test/Lower/OpenACC/acc-serial.f90 b/flang/test/Lower/OpenACC/acc-serial.f90 index 9ba44ce6b9197..1e4f32fd209ef 100644 --- a/flang/test/Lower/OpenACC/acc-serial.f90 +++ b/flang/test/Lower/OpenACC/acc-serial.f90 @@ -60,9 +60,9 @@ subroutine acc_serial !$acc serial async !$acc end serial -! CHECK: acc.serial { +! CHECK: acc.serial async { ! CHECK: acc.yield -! CHECK-NEXT: } attributes {asyncOnly = [#acc.device_type]} +! 
CHECK-NEXT: } !$acc serial async(1) !$acc end serial diff --git a/flang/test/Lower/OpenACC/acc-update.f90 b/flang/test/Lower/OpenACC/acc-update.f90 index f96b105ed93bd..f98af425de985 100644 --- a/flang/test/Lower/OpenACC/acc-update.f90 +++ b/flang/test/Lower/OpenACC/acc-update.f90 @@ -63,9 +63,9 @@ subroutine acc_update ! CHECK: acc.update_host accPtr(%[[DEVPTR_B]] : !fir.ref>) to varPtr(%[[DECLB]]#0 : !fir.ref>) {name = "b", structured = false} !$acc update host(a) async -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) wait ! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} @@ -73,9 +73,9 @@ subroutine acc_update ! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) async wait -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async wait dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! 
CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type], name = "a", structured = false} +! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) async(1) ! CHECK: [[ASYNC1:%.*]] = arith.constant 1 : i32 @@ -108,8 +108,8 @@ subroutine acc_update ! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} !$acc update host(a) device_type(host, nvidia) async -! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) -> !fir.ref> {asyncOnly = [#acc.device_type, #acc.device_type], dataClause = #acc, name = "a", structured = false} +! CHECK: %[[DEVPTR_A:.*]] = acc.getdeviceptr varPtr(%[[DECLA]]#0 : !fir.ref>) async([#acc.device_type, #acc.device_type]) -> !fir.ref> {dataClause = #acc, name = "a", structured = false} ! CHECK: acc.update async([#acc.device_type, #acc.device_type]) dataOperands(%[[DEVPTR_A]] : !fir.ref>) -! CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) to varPtr(%[[DECLA]]#0 : !fir.ref>) {asyncOnly = [#acc.device_type, #acc.device_type], name = "a", structured = false} +! 
CHECK: acc.update_host accPtr(%[[DEVPTR_A]] : !fir.ref>) async([#acc.device_type, #acc.device_type]) to varPtr(%[[DECLA]]#0 : !fir.ref>) {name = "a", structured = false} end subroutine acc_update diff --git a/flang/test/Lower/OpenACC/acc-wait.f90 b/flang/test/Lower/OpenACC/acc-wait.f90 index 8a42c97a12811..35db640a054c2 100644 --- a/flang/test/Lower/OpenACC/acc-wait.f90 +++ b/flang/test/Lower/OpenACC/acc-wait.f90 @@ -25,7 +25,7 @@ subroutine acc_update !$acc wait(1) async !CHECK: [[WAIT3:%.*]] = arith.constant 1 : i32 -!CHECK: acc.wait([[WAIT3]] : i32) attributes {async} +!CHECK: acc.wait([[WAIT3]] : i32) async !$acc wait(1) async(async) !CHECK: [[WAIT3:%.*]] = arith.constant 1 : i32 diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td index 5d5add6318e06..3c22aeb9a1ff7 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td @@ -561,8 +561,8 @@ class OpenACC_DataEntryOp($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) ) `->` type($accVar) attr-dict }]; @@ -922,8 +922,8 @@ class OpenACC_DataExitOpWithVarPtr let assemblyFormat = [{ custom($accVar, type($accVar)) (`bounds` `(` $bounds^ `)` )? - (`async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType)^ `)`)? + (`async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly)^)? `to` custom($var) `:` custom(type($var), $varType) attr-dict }]; @@ -983,8 +983,8 @@ class OpenACC_DataExitOpNoVarPtr : let assemblyFormat = [{ custom($accVar, type($accVar)) (`bounds` `(` $bounds^ `)` )? - (`async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType)^ `)`)? + (`async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly)^)? 
attr-dict }]; @@ -1439,8 +1439,8 @@ def OpenACC_ParallelOp : OpenACC_Op<"parallel", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1581,8 +1581,8 @@ def OpenACC_SerialOp : OpenACC_Op<"serial", ( `combined` `(` `loop` `)` $combined^)? oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `firstprivate` `(` custom($firstprivateOperands, type($firstprivateOperands), $firstprivatizations) `)` @@ -1750,8 +1750,8 @@ def OpenACC_KernelsOp : OpenACC_Op<"kernels", ( `combined` `(` `loop` `)` $combined^)? 
oilist( `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `num_gangs` `(` custom($numGangs, type($numGangs), $numGangsDeviceType, $numGangsSegments) `)` | `num_workers` `(` custom($numWorkers, @@ -1870,8 +1870,8 @@ def OpenACC_DataOp : OpenACC_Op<"data", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` custom($asyncOperands, - type($asyncOperands), $asyncOperandsDeviceType) `)` + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, @@ -1934,9 +1934,11 @@ def OpenACC_EnterDataOp : OpenACC_Op<"enter_data", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` $asyncOperand `:` type($asyncOperand) `)` + | `async` `` custom($asyncOperand, + type($asyncOperand), $async) | `wait_devnum` `(` $waitDevnum `:` type($waitDevnum) `)` - | `wait` `(` $waitOperands `:` type($waitOperands) `)` + | `wait` `` custom($waitOperands, + type($waitOperands), $wait) | `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` ) attr-dict-with-keyword @@ -1986,9 +1988,11 @@ def OpenACC_ExitDataOp : OpenACC_Op<"exit_data", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `(` $asyncOperand `:` type($asyncOperand) `)` + | `async` `` custom($asyncOperand, + type($asyncOperand), $async) | `wait_devnum` `(` $waitDevnum `:` type($waitDevnum) `)` - | `wait` `(` $waitOperands `:` type($waitOperands) `)` + | `wait` `` custom($waitOperands, + type($waitOperands), $wait) | `dataOperands` `(` $dataClauseOperands `:` type($dataClauseOperands) `)` ) attr-dict-with-keyword @@ -2853,7 +2857,7 
@@ def OpenACC_UpdateOp : OpenACC_Op<"update", let arguments = (ins Optional:$ifCond, Variadic:$asyncOperands, OptionalAttr:$asyncOperandsDeviceType, - OptionalAttr:$async, + OptionalAttr:$asyncOnly, Variadic:$waitOperands, OptionalAttr:$waitOperandsSegments, OptionalAttr:$waitOperandsDeviceType, @@ -2901,9 +2905,8 @@ def OpenACC_UpdateOp : OpenACC_Op<"update", let assemblyFormat = [{ oilist( `if` `(` $ifCond `)` - | `async` `` custom( - $asyncOperands, type($asyncOperands), - $asyncOperandsDeviceType, $async) + | `async` `` custom($asyncOperands, + type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, $waitOnly) @@ -2948,9 +2951,11 @@ def OpenACC_WaitOp : OpenACC_Op<"wait", [AttrSizedOperandSegments]> { let assemblyFormat = [{ ( `(` $waitOperands^ `:` type($waitOperands) `)` )? - oilist(`async` `(` $asyncOperand `:` type($asyncOperand) `)` - |`wait_devnum` `(` $waitDevnum `:` type($waitDevnum) `)` - |`if` `(` $ifCond `)` + oilist( + `async` `` custom($asyncOperand, + type($asyncOperand), $async) + | `wait_devnum` `(` $waitDevnum `:` type($waitDevnum) `)` + | `if` `(` $ifCond `)` ) attr-dict-with-keyword }]; let hasVerifier = 1; diff --git a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp index 7eb72d433c972..b401d2ec7894a 100644 --- a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp +++ b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp @@ -272,11 +272,12 @@ static LogicalResult checkWaitAndAsyncConflict(Op op) { ++dtypeInt) { auto dtype = static_cast(dtypeInt); - // The async attribute represent the async clause without value. Therefore - // the attribute and operand cannot appear at the same time. + // The asyncOnly attribute represent the async clause without value. + // Therefore the attribute and operand cannot appear at the same time. 
if (hasDeviceType(op.getAsyncOperandsDeviceType(), dtype) && op.hasAsyncOnly(dtype)) - return op.emitError("async attribute cannot appear with asyncOperand"); + return op.emitError( + "asyncOnly attribute cannot appear with asyncOperand"); // The wait attribute represent the wait clause without values. Therefore // the attribute and operands cannot appear at the same time. @@ -1683,6 +1684,90 @@ static void printDeviceTypeOperandsWithKeywordOnly( p << ")"; } +static ParseResult parseOperandWithKeywordOnly( + mlir::OpAsmParser &parser, + std::optional &operand, + mlir::Type &operandType, mlir::UnitAttr &attr) { + // Keyword only + if (failed(parser.parseOptionalLParen())) { + attr = mlir::UnitAttr::get(parser.getContext()); + return success(); + } + + OpAsmParser::UnresolvedOperand op; + if (failed(parser.parseOperand(op))) + return failure(); + operand = op; + if (failed(parser.parseColon())) + return failure(); + if (failed(parser.parseType(operandType))) + return failure(); + if (failed(parser.parseRParen())) + return failure(); + + return success(); +} + +static void printOperandWithKeywordOnly(mlir::OpAsmPrinter &p, + mlir::Operation *op, + std::optional operand, + mlir::Type operandType, + mlir::UnitAttr attr) { + if (attr) + return; + + p << "("; + p.printOperand(*operand); + p << " : "; + p.printType(operandType); + p << ")"; +} + +static ParseResult parseOperandsWithKeywordOnly( + mlir::OpAsmParser &parser, + llvm::SmallVectorImpl &operands, + llvm::SmallVectorImpl &types, mlir::UnitAttr &attr) { + // Keyword only + if (failed(parser.parseOptionalLParen())) { + attr = mlir::UnitAttr::get(parser.getContext()); + return success(); + } + + if (failed(parser.parseCommaSeparatedList([&]() { + if (parser.parseOperand(operands.emplace_back())) + return failure(); + return success(); + }))) + return failure(); + if (failed(parser.parseColon())) + return failure(); + if (failed(parser.parseCommaSeparatedList([&]() { + if (parser.parseType(types.emplace_back())) + 
return failure(); + return success(); + }))) + return failure(); + if (failed(parser.parseRParen())) + return failure(); + + return success(); +} + +static void printOperandsWithKeywordOnly(mlir::OpAsmPrinter &p, + mlir::Operation *op, + mlir::OperandRange operands, + mlir::TypeRange types, + mlir::UnitAttr attr) { + if (attr) + return; + + p << "("; + llvm::interleaveComma(operands, p, [&](auto it) { p << it; }); + p << " : "; + llvm::interleaveComma(types, p, [&](auto it) { p << it; }); + p << ")"; +} + static ParseResult parseCombinedConstructsLoop(mlir::OpAsmParser &parser, mlir::acc::CombinedConstructsTypeAttr &attr) { @@ -3505,7 +3590,7 @@ bool UpdateOp::hasAsyncOnly() { } bool UpdateOp::hasAsyncOnly(mlir::acc::DeviceType deviceType) { - return hasDeviceType(getAsync(), deviceType); + return hasDeviceType(getAsyncOnly(), deviceType); } mlir::Value UpdateOp::getAsyncValue() { diff --git a/mlir/test/Conversion/OpenACCToSCF/convert-openacc-to-scf.mlir b/mlir/test/Conversion/OpenACCToSCF/convert-openacc-to-scf.mlir index d8e89f64f8bc0..c08fd860e738b 100644 --- a/mlir/test/Conversion/OpenACCToSCF/convert-openacc-to-scf.mlir +++ b/mlir/test/Conversion/OpenACCToSCF/convert-openacc-to-scf.mlir @@ -68,20 +68,20 @@ func.func @update_false(%arg0: memref) { func.func @enter_data_true(%d1 : memref) { %true = arith.constant true %0 = acc.create varPtr(%d1 : memref) -> memref - acc.enter_data if(%true) dataOperands(%0 : memref) attributes {async} + acc.enter_data async if(%true) dataOperands(%0 : memref) return } // CHECK-LABEL: func.func @enter_data_true // CHECK-NOT: if -// CHECK: acc.enter_data dataOperands +// CHECK: acc.enter_data async dataOperands // ----- func.func @enter_data_false(%d1 : memref) { %false = arith.constant false %0 = acc.create varPtr(%d1 : memref) -> memref - acc.enter_data if(%false) dataOperands(%0 : memref) attributes {async} + acc.enter_data async if(%false) dataOperands(%0 : memref) return } @@ -93,21 +93,21 @@ func.func @enter_data_false(%d1 : 
memref) { func.func @exit_data_true(%d1 : memref) { %true = arith.constant true %0 = acc.getdeviceptr varPtr(%d1 : memref) -> memref - acc.exit_data if(%true) dataOperands(%0 : memref) attributes {async} + acc.exit_data async if(%true) dataOperands(%0 : memref) acc.delete accPtr(%0 : memref) return } // CHECK-LABEL: func.func @exit_data_true // CHECK-NOT:if -// CHECK:acc.exit_data dataOperands +// CHECK:acc.exit_data async dataOperands // ----- func.func @exit_data_false(%d1 : memref) { %false = arith.constant false %0 = acc.getdeviceptr varPtr(%d1 : memref) -> memref - acc.exit_data if(%false) dataOperands(%0 : memref) attributes {async} + acc.exit_data async if(%false) dataOperands(%0 : memref) acc.delete accPtr(%0 : memref) return } diff --git a/mlir/test/Dialect/OpenACC/invalid.mlir b/mlir/test/Dialect/OpenACC/invalid.mlir index c8d7a87112917..aadf189273212 100644 --- a/mlir/test/Dialect/OpenACC/invalid.mlir +++ b/mlir/test/Dialect/OpenACC/invalid.mlir @@ -129,8 +129,8 @@ acc.update %cst = arith.constant 1 : index %value = memref.alloc() : memref %0 = acc.update_device varPtr(%value : memref) -> memref -// expected-error at +1 {{async attribute cannot appear with asyncOperand}} -acc.update async(%cst: index) dataOperands(%0 : memref) attributes {async = [#acc.device_type]} +// expected-error at +1 {{asyncOnly attribute cannot appear with asyncOperand}} +acc.update async(%cst: index) dataOperands(%0 : memref) attributes {asyncOnly = [#acc.device_type]} // ----- @@ -138,7 +138,7 @@ acc.update async(%cst: index) dataOperands(%0 : memref) attributes {async = %value = memref.alloc() : memref %0 = acc.update_device varPtr(%value : memref) -> memref // expected-error at +1 {{wait attribute cannot appear with waitOperands}} -acc.update wait({%cst: index}) dataOperands(%0: memref) attributes {waitOnly = [#acc.device_type]} +acc.update wait({%cst: index}) dataOperands(%0: memref) attributes {waitOnly = [#acc.device_type]} // ----- diff --git 
a/mlir/test/Dialect/OpenACC/ops.mlir b/mlir/test/Dialect/OpenACC/ops.mlir index 4c842a26f8dc4..550f295f074a2 100644 --- a/mlir/test/Dialect/OpenACC/ops.mlir +++ b/mlir/test/Dialect/OpenACC/ops.mlir @@ -435,10 +435,10 @@ func.func @testparallelop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x } attributes {defaultAttr = #acc} acc.parallel { } attributes {defaultAttr = #acc} - acc.parallel { - } attributes {asyncAttr} - acc.parallel { - } attributes {waitAttr} + acc.parallel async { + } + acc.parallel wait { + } acc.parallel { } attributes {selfAttr} return @@ -488,10 +488,10 @@ func.func @testparallelop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x // CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.parallel { // CHECK-NEXT: } attributes {defaultAttr = #acc} -// CHECK: acc.parallel { -// CHECK-NEXT: } attributes {asyncAttr} -// CHECK: acc.parallel { -// CHECK-NEXT: } attributes {waitAttr} +// CHECK: acc.parallel async { +// CHECK-NEXT: } +// CHECK: acc.parallel wait { +// CHECK-NEXT: } // CHECK: acc.parallel { // CHECK-NEXT: } attributes {selfAttr} @@ -567,10 +567,10 @@ func.func @testserialop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x10 } attributes {defaultAttr = #acc} acc.serial { } attributes {defaultAttr = #acc} - acc.serial { - } attributes {asyncAttr} - acc.serial { - } attributes {waitAttr} + acc.serial async { + } + acc.serial wait { + } acc.serial { } attributes {selfAttr} acc.serial { @@ -604,10 +604,10 @@ func.func @testserialop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x10 // CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.serial { // CHECK-NEXT: } attributes {defaultAttr = #acc} -// CHECK: acc.serial { -// CHECK-NEXT: } attributes {asyncAttr} -// CHECK: acc.serial { -// CHECK-NEXT: } attributes {waitAttr} +// CHECK: acc.serial async { +// CHECK-NEXT: } +// CHECK: acc.serial wait { +// CHECK-NEXT: } // CHECK: acc.serial { // CHECK-NEXT: } attributes {selfAttr} // CHECK: acc.serial { @@ 
-639,10 +639,10 @@ func.func @testserialop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x10 } attributes {defaultAttr = #acc} acc.kernels { } attributes {defaultAttr = #acc} - acc.kernels { - } attributes {asyncAttr} - acc.kernels { - } attributes {waitAttr} + acc.kernels async { + } + acc.kernels wait { + } acc.kernels { } attributes {selfAttr} acc.kernels { @@ -673,10 +673,10 @@ func.func @testserialop(%a: memref<10xf32>, %b: memref<10xf32>, %c: memref<10x10 // CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.kernels { // CHECK-NEXT: } attributes {defaultAttr = #acc} -// CHECK: acc.kernels { -// CHECK-NEXT: } attributes {asyncAttr} -// CHECK: acc.kernels { -// CHECK-NEXT: } attributes {waitAttr} +// CHECK: acc.kernels async { +// CHECK-NEXT: } +// CHECK: acc.kernels wait { +// CHECK-NEXT: } // CHECK: acc.kernels { // CHECK-NEXT: } attributes {selfAttr} // CHECK: acc.kernels { @@ -787,23 +787,23 @@ func.func @testdataop(%a: memref, %b: memref, %c: memref) -> () { acc.data { } attributes { defaultAttr = #acc } - acc.data { - } attributes { defaultAttr = #acc, async } + acc.data async { + } attributes { defaultAttr = #acc } %a1 = arith.constant 1 : i64 acc.data async(%a1 : i64) { - } attributes { defaultAttr = #acc, async } + } attributes { defaultAttr = #acc } - acc.data { - } attributes { defaultAttr = #acc, wait } + acc.data wait { + } attributes { defaultAttr = #acc } %w1 = arith.constant 1 : i64 acc.data wait({%w1 : i64}) { - } attributes { defaultAttr = #acc, wait } + } attributes { defaultAttr = #acc } %wd1 = arith.constant 1 : i64 acc.data wait({devnum: %wd1 : i64, %w1 : i64}) { - } attributes { defaultAttr = #acc, wait } + } attributes { defaultAttr = #acc } return } @@ -904,20 +904,20 @@ func.func @testdataop(%a: memref, %b: memref, %c: memref) -> () { // CHECK: acc.data { // CHECK-NEXT: } attributes {defaultAttr = #acc} -// CHECK: acc.data { -// CHECK-NEXT: } attributes {async, defaultAttr = #acc} +// CHECK: acc.data async { +// 
CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.data async(%{{.*}} : i64) { -// CHECK-NEXT: } attributes {async, defaultAttr = #acc} +// CHECK-NEXT: } attributes {defaultAttr = #acc} -// CHECK: acc.data { -// CHECK-NEXT: } attributes {defaultAttr = #acc, wait} +// CHECK: acc.data wait { +// CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.data wait({%{{.*}} : i64}) { -// CHECK-NEXT: } attributes {defaultAttr = #acc, wait} +// CHECK-NEXT: } attributes {defaultAttr = #acc} // CHECK: acc.data wait({devnum: %{{.*}} : i64, %{{.*}} : i64}) { -// CHECK-NEXT: } attributes {defaultAttr = #acc, wait} +// CHECK-NEXT: } attributes {defaultAttr = #acc} // ----- @@ -977,7 +977,7 @@ acc.wait async(%i32Value: i32) acc.wait async(%idxValue: index) acc.wait(%i32Value: i32) async(%idxValue: index) acc.wait(%i64Value: i64) wait_devnum(%i32Value: i32) -acc.wait attributes {async} +acc.wait async acc.wait(%i64Value: i64) async(%idxValue: index) wait_devnum(%i32Value: i32) acc.wait(%i64Value: i64) wait_devnum(%i32Value: i32) async(%idxValue: index) acc.wait if(%ifCond) @@ -996,7 +996,7 @@ acc.wait if(%ifCond) // CHECK: acc.wait async([[IDXVALUE]] : index) // CHECK: acc.wait([[I32VALUE]] : i32) async([[IDXVALUE]] : index) // CHECK: acc.wait([[I64VALUE]] : i64) wait_devnum([[I32VALUE]] : i32) -// CHECK: acc.wait attributes {async} +// CHECK: acc.wait async // CHECK: acc.wait([[I64VALUE]] : i64) async([[IDXVALUE]] : index) wait_devnum([[I32VALUE]] : i32) // CHECK: acc.wait([[I64VALUE]] : i64) async([[IDXVALUE]] : index) wait_devnum([[I32VALUE]] : i32) // CHECK: acc.wait if([[IFCOND]]) @@ -1078,7 +1078,7 @@ func.func @testexitdataop(%a: !llvm.ptr) -> () { acc.delete accPtr(%1 : !llvm.ptr) %2 = acc.getdeviceptr varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr - acc.exit_data dataOperands(%2 : !llvm.ptr) attributes {async,finalize} + acc.exit_data async dataOperands(%2 : !llvm.ptr) attributes {finalize} acc.delete accPtr(%2 : !llvm.ptr) %3 = acc.getdeviceptr varPtr(%a : 
!llvm.ptr) varType(f64) -> !llvm.ptr @@ -1086,11 +1086,11 @@ func.func @testexitdataop(%a: !llvm.ptr) -> () { acc.detach accPtr(%3 : !llvm.ptr) %4 = acc.getdeviceptr varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr - acc.exit_data dataOperands(%4 : !llvm.ptr) attributes {async} + acc.exit_data async dataOperands(%4 : !llvm.ptr) acc.copyout accPtr(%4 : !llvm.ptr) to varPtr(%a : !llvm.ptr) varType(f64) %5 = acc.getdeviceptr varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr - acc.exit_data dataOperands(%5 : !llvm.ptr) attributes {wait} + acc.exit_data wait dataOperands(%5 : !llvm.ptr) acc.delete accPtr(%5 : !llvm.ptr) %6 = acc.getdeviceptr varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr @@ -1127,7 +1127,7 @@ func.func @testexitdataop(%a: !llvm.ptr) -> () { // CHECK: acc.delete accPtr(%[[DEVPTR]] : !llvm.ptr) // CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr -// CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !llvm.ptr) attributes {async, finalize} +// CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !llvm.ptr) attributes {finalize} // CHECK: acc.delete accPtr(%[[DEVPTR]] : !llvm.ptr) // CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr @@ -1135,11 +1135,11 @@ func.func @testexitdataop(%a: !llvm.ptr) -> () { // CHECK: acc.detach accPtr(%[[DEVPTR]] : !llvm.ptr) // CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr -// CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !llvm.ptr) attributes {async} +// CHECK: acc.exit_data async dataOperands(%[[DEVPTR]] : !llvm.ptr) // CHECK: acc.copyout accPtr(%[[DEVPTR]] : !llvm.ptr) to varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) // CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr -// CHECK: acc.exit_data dataOperands(%[[DEVPTR]] : !llvm.ptr) attributes {wait} +// CHECK: acc.exit_data wait dataOperands(%[[DEVPTR]] : !llvm.ptr) // CHECK: acc.delete 
accPtr(%[[DEVPTR]] : !llvm.ptr) // CHECK: %[[DEVPTR:.*]] = acc.getdeviceptr varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr @@ -1176,9 +1176,9 @@ func.func @testenterdataop(%a: !llvm.ptr, %b: !llvm.ptr, %c: !llvm.ptr) -> () { %4 = acc.attach varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr acc.enter_data dataOperands(%4 : !llvm.ptr) %5 = acc.copyin varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr - acc.enter_data dataOperands(%5 : !llvm.ptr) attributes {async} + acc.enter_data async dataOperands(%5 : !llvm.ptr) %6 = acc.create varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr - acc.enter_data dataOperands(%6 : !llvm.ptr) attributes {wait} + acc.enter_data wait dataOperands(%6 : !llvm.ptr) %7 = acc.copyin varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr acc.enter_data async(%i64Value : i64) dataOperands(%7 : !llvm.ptr) %8 = acc.copyin varPtr(%a : !llvm.ptr) varType(f64) -> !llvm.ptr @@ -1205,9 +1205,9 @@ func.func @testenterdataop(%a: !llvm.ptr, %b: !llvm.ptr, %c: !llvm.ptr) -> () { // CHECK: %[[ATTACH:.*]] = acc.attach varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr // CHECK: acc.enter_data dataOperands(%[[ATTACH]] : !llvm.ptr) // CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr -// CHECK: acc.enter_data dataOperands(%[[COPYIN]] : !llvm.ptr) attributes {async} +// CHECK: acc.enter_data async dataOperands(%[[COPYIN]] : !llvm.ptr) // CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr -// CHECK: acc.enter_data dataOperands(%[[CREATE]] : !llvm.ptr) attributes {wait} +// CHECK: acc.enter_data wait dataOperands(%[[CREATE]] : !llvm.ptr) // CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr // CHECK: acc.enter_data async([[I64VALUE]] : i64) dataOperands(%[[COPYIN]] : !llvm.ptr) // CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[ARGA]] : !llvm.ptr) varType(f64) -> !llvm.ptr From flang-commits at lists.llvm.org Thu May 15 13:42:52 2025 From: 
flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 15 May 2025 13:42:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <682651cc.170a0220.7b588.28ef@mx.google.com> https://github.com/akuhlens approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Thu May 15 13:43:29 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 15 May 2025 13:43:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <682651f1.170a0220.4bbe4.36a7@mx.google.com> https://github.com/akuhlens edited https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Thu May 15 22:06:40 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Thu, 15 May 2025 22:06:40 -0700 (PDT) Subject: [flang-commits] [flang] [draft][flang] Undo the effects of CSE for hlfir.exactly_once. (PR #140190) Message-ID: https://github.com/vzakhari created https://github.com/llvm/llvm-project/pull/140190 CSE may delete operations from hlfir.exactly_once and reuse the equivalent results from the parent region(s), e.g. from the parent hlfir.region_assign. This makes it problematic to clone hlfir.exactly_once before the top-level hlfir.where. This patch adds a "canonicalizer" that pulls in such operations back into hlfir.exactly_once. >From 41f8ec79d42e29228525efee9611f7cb761c18a6 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Thu, 15 May 2025 21:53:46 -0700 Subject: [PATCH] [draft][flang] Undo the effects of CSE for hlfir.exactly_once. CSE may delete operations from hlfir.exactly_once and reuse the equivalent results from the parent region(s), e.g. from the parent hlfir.region_assign. This makes it problematic to clone hlfir.exactly_once before the top-level hlfir.where. 
This patch adds a "canonicalizer" that pulls in such operations back into hlfir.exactly_once. --- .../LowerHLFIROrderedAssignments.cpp | 119 ++++++++++++++++++ 1 file changed, 119 insertions(+) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp b/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp index 5cae7cf443c86..89b5ccb7d850e 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp @@ -24,12 +24,15 @@ #include "flang/Optimizer/Builder/Todo.h" #include "flang/Optimizer/Dialect/Support/FIRContext.h" #include "flang/Optimizer/HLFIR/Passes.h" +#include "mlir/Analysis/Liveness.h" #include "mlir/IR/Dominance.h" #include "mlir/IR/IRMapping.h" #include "mlir/Transforms/DialectConversion.h" +#include "mlir/Transforms/RegionUtils.h" #include "llvm/ADT/SmallSet.h" #include "llvm/ADT/TypeSwitch.h" #include "llvm/Support/Debug.h" +#include <unordered_set> namespace hlfir { #define GEN_PASS_DEF_LOWERHLFIRORDEREDASSIGNMENTS @@ -263,6 +266,19 @@ class OrderedAssignmentRewriter { return &inserted.first->second; } + /// Given a top-level hlfir.where, look for hlfir.exactly_once operations + /// inside it and see if any of the values live into hlfir.exactly_once + /// do not dominate hlfir.where. This may happen due to CSE reusing + /// results of operations from the region parent to hlfir.exactly_once. + /// Since we are going to clone the body of hlfir.exactly_once before + /// the top-level hlfir.where, such def-use will cause problems. + /// There are options how to resolve this in a different way, + /// e.g. making hlfir.exactly_once IsolatedFromAbove or making + /// it a region of hlfir.where and wiring the result(s) through + /// the block arguments. For the time being, this canonicalization + /// tries to undo the effects of CSE. 
+ void canonicalizeExactlyOnceInsideWhere(hlfir::WhereOp whereOp); + fir::FirOpBuilder &builder; /// Map containing the mapping between the original order assignment tree @@ -523,6 +539,10 @@ void OrderedAssignmentRewriter::generateMaskIfOp(mlir::Value cdt) { void OrderedAssignmentRewriter::pre(hlfir::WhereOp whereOp) { mlir::Location loc = whereOp.getLoc(); if (!whereLoopNest) { + // Make sure liveness information is valid for the inner hlfir.exactly_once + // operations, and their bodies can be cloned before the top-level + // hlfir.where. + canonicalizeExactlyOnceInsideWhere(whereOp); // This is the top-level WHERE. Start a loop nest iterating on the shape of // the where mask. if (auto maybeSaved = getIfSaved(whereOp.getMaskRegion())) { @@ -1350,6 +1370,105 @@ void OrderedAssignmentRewriter::saveLeftHandSide( } } +void OrderedAssignmentRewriter::canonicalizeExactlyOnceInsideWhere( + hlfir::WhereOp whereOp) { + auto getDefinition = [](mlir::Value v) { + mlir::Operation *op = v.getDefiningOp(); + bool isValid = true; + if (!op) { + LLVM_DEBUG( + llvm::dbgs() + << "Value live into hlfir.exactly_once has no defining operation: " + << v << "\n"); + isValid = false; + } + if (op->getNumRegions() != 0) { + LLVM_DEBUG( + llvm::dbgs() + << "Cannot pull an operation with regions into hlfir.exactly_once" + << *op << "\n"); + isValid = false; + } + auto effects = mlir::getEffectsRecursively(op); + if (!effects || !effects->empty()) { + LLVM_DEBUG(llvm::dbgs() << "Side effects on operation with result live " + "into hlfir.exactly_once" + << *op << "\n"); + isValid = false; + } + assert(isValid && "invalid live-in"); + return op; + }; + mlir::Liveness liveness(whereOp.getOperation()); + whereOp->walk([&](hlfir::ExactlyOnceOp op) { + std::unordered_set<mlir::Operation *> liveInSet; + LLVM_DEBUG(llvm::dbgs() << "Canonicalizing:\n" << op << "\n"); + auto &liveIns = liveness.getLiveIn(&op.getBody().front()); + if (liveIns.empty()) + return; + // Note that the liveIns set is not ordered. 
+ for (mlir::Value liveIn : liveIns) { + if (!dominanceInfo.properlyDominates(liveIn, whereOp)) { + LLVM_DEBUG(llvm::dbgs() + << "Does not dominate top-level where: " << liveIn << "\n"); + liveInSet.insert(getDefinition(liveIn)); + } + } + + // Populate the set of operations that we need to pull into + // hlfir.exactly_once, so that the only live-ins left are the ones + // that dominate whereOp. + std::unordered_set<mlir::Operation *> cloneSet(liveInSet); + llvm::SmallVector<mlir::Operation *> workList(cloneSet.begin(), + cloneSet.end()); + while (!workList.empty()) { + mlir::Operation *current = workList.pop_back_val(); + for (mlir::Value operand : current->getOperands()) { + if (dominanceInfo.properlyDominates(operand, whereOp)) + continue; + mlir::Operation *def = getDefinition(operand); + if (cloneSet.count(def)) + continue; + cloneSet.insert(def); + workList.push_back(def); + } + } + + // Sort the operations by dominance. This preserves their order + // after the cloning, and also guarantees stable IR generation. + llvm::SmallVector<mlir::Operation *> cloneList(cloneSet.begin(), + cloneSet.end()); + llvm::sort(cloneList, [&](mlir::Operation *L, mlir::Operation *R) { + return dominanceInfo.properlyDominates(L, R); + }); + + // Clone the operations. 
+ mlir::IRMapping mapper; + mlir::Operation::CloneOptions options; + options.cloneOperands(); + mlir::OpBuilder::InsertionGuard guard(builder); + builder.setInsertionPointToStart(&op.getBody().front()); + + for (auto *toClone : cloneList) { + LLVM_DEBUG(llvm::dbgs() << "Cloning: " << *toClone << "\n"); + builder.insert(toClone->clone(mapper, options)); + } + for (mlir::Operation *oldOps : liveInSet) + for (mlir::Value oldVal : oldOps->getResults()) { + mlir::Value newVal = mapper.lookup(oldVal); + if (!newVal) { + LLVM_DEBUG(llvm::dbgs() << "No clone found for: " << oldVal << "\n"); + assert(false && "missing clone"); + } + mlir::replaceAllUsesInRegionWith(oldVal, newVal, op.getBody()); + } + + LLVM_DEBUG(llvm::dbgs() << "Finished canonicalization\n"); + if (!liveInSet.empty()) + LLVM_DEBUG(llvm::dbgs() << op << "\n"); + }); +} + /// Lower an ordered assignment tree to fir.do_loop and hlfir.assign given /// a schedule. static void lower(hlfir::OrderedAssignmentTreeOpInterface root, From flang-commits at lists.llvm.org Thu May 15 22:08:18 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Thu, 15 May 2025 22:08:18 -0700 (PDT) Subject: [flang-commits] [flang] [draft][flang] Undo the effects of CSE for hlfir.exactly_once. (PR #140190) In-Reply-To: Message-ID: <6826c842.170a0220.7c280.3ba8@mx.google.com> vzakhari wrote: Hi Jean, I want to share what I have so far. Let me know if you see any problems with this approach. It works on a couple of tests. Even one from our LIT tests fails at `-O1` due to the same issue, and the patch fixes it. I need to do more testing... 
https://github.com/llvm/llvm-project/pull/140190 From flang-commits at lists.llvm.org Thu May 15 23:56:23 2025 From: flang-commits at lists.llvm.org (Dominik Adamski via flang-commits) Date: Thu, 15 May 2025 23:56:23 -0700 (PDT) Subject: [flang-commits] [flang] Revert "[Flang] Turn on alias analysis for locally allocated objects" (PR #140202) Message-ID: https://github.com/DominikAdamski created https://github.com/llvm/llvm-project/pull/140202 Reverts llvm/llvm-project#139682 because of reported regression in Fujitsu Fortran test suite: https://ci.linaro.org/job/tcwg_flang_test--main-aarch64-Ofast-sve_vla-build/2081/artifact/artifacts/notify/mail-body.txt/*view*/ >From cdc4b7c6467188d3a69a55cadd2fd318bf6d4f66 Mon Sep 17 00:00:00 2001 From: Dominik Adamski Date: Fri, 16 May 2025 08:51:36 +0200 Subject: [PATCH] Revert "[Flang] Turn on alias analysis for locally allocated objects (#139682)" This reverts commit cf16c97bfa1416672d8990862369e86f360aa11e. --- .../lib/Optimizer/Transforms/AddAliasTags.cpp | 11 ++-- flang/test/Fir/tbaa-codegen2.fir | 7 +-- .../Transforms/tbaa-with-dummy-scope2.fir | 23 +++----- flang/test/Transforms/tbaa2.fir | 59 ++++++++++--------- flang/test/Transforms/tbaa3.fir | 10 ++-- 5 files changed, 53 insertions(+), 57 deletions(-) diff --git a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp index 5cfbdc33285f9..66b4b84998801 100644 --- a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp +++ b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp @@ -43,10 +43,13 @@ static llvm::cl::opt static llvm::cl::opt enableDirect("direct-tbaa", llvm::cl::init(true), llvm::cl::Hidden, llvm::cl::desc("Add TBAA tags to direct variables")); -static llvm::cl::opt - enableLocalAllocs("local-alloc-tbaa", llvm::cl::init(true), - llvm::cl::Hidden, - llvm::cl::desc("Add TBAA tags to local allocations.")); +// This is **known unsafe** (misscompare in spec2017/wrf_r). It should +// not be enabled by default. 
+// The code is kept so that these may be tried with new benchmarks to see if +// this is worth fixing in the future. +static llvm::cl::opt enableLocalAllocs( + "local-alloc-tbaa", llvm::cl::init(false), llvm::cl::Hidden, + llvm::cl::desc("Add TBAA tags to local allocations. UNSAFE.")); namespace { diff --git a/flang/test/Fir/tbaa-codegen2.fir b/flang/test/Fir/tbaa-codegen2.fir index e4bfa9087ec75..8f8b6a29129e7 100644 --- a/flang/test/Fir/tbaa-codegen2.fir +++ b/flang/test/Fir/tbaa-codegen2.fir @@ -100,7 +100,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ // [...] // CHECK: %[[VAL50:.*]] = getelementptr i32, ptr %{{.*}}, i64 %{{.*}} // store to the temporary: -// CHECK: store i32 %{{.*}}, ptr %[[VAL50]], align 4, !tbaa ![[TMP_DATA_ACCESS_TAG:.*]] +// CHECK: store i32 %{{.*}}, ptr %[[VAL50]], align 4, !tbaa ![[DATA_ACCESS_TAG:.*]] // [...] // CHECK: [[BOX_ACCESS_TAG]] = !{![[BOX_ACCESS_TYPE:.*]], ![[BOX_ACCESS_TYPE]], i64 0} @@ -111,7 +111,4 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ // CHECK: ![[A_ACCESS_TYPE]] = !{!"dummy arg data/_QFfuncEa", ![[ARG_ACCESS_TYPE:.*]], i64 0} // CHECK: ![[ARG_ACCESS_TYPE]] = !{!"dummy arg data", ![[DATA_ACCESS_TYPE:.*]], i64 0} // CHECK: ![[DATA_ACCESS_TYPE]] = !{!"any data access", ![[ANY_ACCESS_TYPE]], i64 0} -// CHECK: ![[TMP_DATA_ACCESS_TAG]] = !{![[TMP_DATA_ACCESS_TYPE:.*]], ![[TMP_DATA_ACCESS_TYPE]], i64 0} -// CHECK: ![[TMP_DATA_ACCESS_TYPE]] = !{!"allocated data/", ![[TMP_ACCESS_TYPE:.*]], i64 0} -// CHECK: ![[TMP_ACCESS_TYPE]] = !{!"allocated data", ![[TARGET_ACCESS_TAG:.*]], i64 0} -// CHECK: ![[TARGET_ACCESS_TAG]] = !{!"target data", ![[DATA_ACCESS_TYPE]], i64 0} +// CHECK: ![[DATA_ACCESS_TAG]] = !{![[DATA_ACCESS_TYPE]], ![[DATA_ACCESS_TYPE]], i64 0} diff --git a/flang/test/Transforms/tbaa-with-dummy-scope2.fir b/flang/test/Transforms/tbaa-with-dummy-scope2.fir index 249471de458d3..c8f419fbee652 100644 --- 
a/flang/test/Transforms/tbaa-with-dummy-scope2.fir +++ b/flang/test/Transforms/tbaa-with-dummy-scope2.fir @@ -43,15 +43,12 @@ func.func @_QPtest1() attributes {noinline} { // CHECK: #[[$ATTR_0:.+]] = #llvm.tbaa_root // CHECK: #[[$ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_2:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_3:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$LOCAL_ATTR_0:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_5:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_tag -// CHECK: #[[$LOCAL_ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_6:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$LOCAL_ATTR_2:.+]] = #llvm.tbaa_tag // CHECK: #[[$ATTR_8:.+]] = #llvm.tbaa_tag // CHECK-LABEL: func.func @_QPtest1() attributes {noinline} { // CHECK: %[[VAL_2:.*]] = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFtest1FinnerEy"} @@ -60,8 +57,8 @@ func.func @_QPtest1() attributes {noinline} { // CHECK: %[[VAL_5:.*]] = fir.dummy_scope : !fir.dscope // CHECK: %[[VAL_6:.*]] = fir.declare %[[VAL_4]] dummy_scope %[[VAL_5]] {uniq_name = "_QFtest1FinnerEx"} : (!fir.ref, !fir.dscope) -> !fir.ref // CHECK: %[[VAL_7:.*]] = fir.declare %[[VAL_2]] {uniq_name = "_QFtest1FinnerEy"} : (!fir.ref) -> !fir.ref -// CHECK: fir.store %{{.*}} to %[[VAL_7]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref -// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref +// CHECK: fir.store %{{.*}} to %[[VAL_7]] : !fir.ref +// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] : !fir.ref // CHECK: fir.store %[[VAL_8]] to %[[VAL_6]] {tbaa = [#[[$ATTR_7]]]} : !fir.ref // CHECK: fir.store %{{.*}} to %[[VAL_4]] {tbaa = [#[[$ATTR_8]]]} : !fir.ref @@ -90,16 +87,12 @@ func.func @_QPtest2() attributes {noinline} { // CHECK: #[[$ATTR_3:.+]] = 
#llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_5:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$TARGETDATA_0:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_6:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$TARGETDATA_1:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$LOCAL_ATTR_0:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_8:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_10:.+]] = #llvm.tbaa_tag -// CHECK: #[[$LOCAL_ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_9:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$LOCAL_ATTR_2:.+]] = #llvm.tbaa_tag // CHECK: #[[$ATTR_11:.+]] = #llvm.tbaa_tag // CHECK-LABEL: func.func @_QPtest2() attributes {noinline} { // CHECK: %[[VAL_2:.*]] = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFtest2FinnerEy"} @@ -109,7 +102,7 @@ func.func @_QPtest2() attributes {noinline} { // CHECK: %[[VAL_6:.*]] = fir.dummy_scope : !fir.dscope // CHECK: %[[VAL_7:.*]] = fir.declare %[[VAL_5]] dummy_scope %[[VAL_6]] {uniq_name = "_QFtest2FinnerEx"} : (!fir.ref, !fir.dscope) -> !fir.ref // CHECK: %[[VAL_8:.*]] = fir.declare %[[VAL_2]] {uniq_name = "_QFtest2FinnerEy"} : (!fir.ref) -> !fir.ref -// CHECK: fir.store %{{.*}} to %[[VAL_8]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref -// CHECK: %[[VAL_9:.*]] = fir.load %[[VAL_8]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref +// CHECK: fir.store %{{.*}} to %[[VAL_8]] : !fir.ref +// CHECK: %[[VAL_9:.*]] = fir.load %[[VAL_8]] : !fir.ref // CHECK: fir.store %[[VAL_9]] to %[[VAL_7]] {tbaa = [#[[$ATTR_10]]]} : !fir.ref // CHECK: fir.store %{{.*}} to %[[VAL_5]] {tbaa = [#[[$ATTR_11]]]} : !fir.ref diff --git a/flang/test/Transforms/tbaa2.fir b/flang/test/Transforms/tbaa2.fir index 1429d0b420766..4678a1cd4a686 100644 --- a/flang/test/Transforms/tbaa2.fir +++ b/flang/test/Transforms/tbaa2.fir @@ -50,7 +50,6 @@ // 
CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_ARG:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_GLBL:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[ANY_LOCAL:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ARG_LOW:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_DIRECT:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ARG_Z:.+]] = #llvm.tbaa_type_desc}> @@ -62,31 +61,21 @@ // CHECK: #[[GLBL_ZSTART:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_ZSTOP:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[LOCAL1_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_YSTART:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_YSTOP:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[LOCAL2_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_XSTART:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[LOCAL3_ALLOC:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[LOCAL4_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[DIRECT_A:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[DIRECT_B:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_DYINV:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[LOCAL5_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_ZSTART_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_ZSTOP_TAG:.+]] = #llvm.tbaa_tag -// CHECK: #[[LOCAL1_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_YSTART_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_YSTOP_TAG:.+]] = #llvm.tbaa_tag -// CHECK: #[[LOCAL2_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_XSTART_TAG:.+]] = #llvm.tbaa_tag -// CHECK: #[[LOCAL3_ALLOC_TAG:.+]] = #llvm.tbaa_tag -// CHECK: #[[LOCAL4_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[DIRECT_A_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[DIRECT_B_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_DYINV_TAG:.+]] = #llvm.tbaa_tag -// CHECK: #[[LOCAL5_ALLOC_TAG:.+]] = #llvm.tbaa_tag func.func @_QMmodPcallee(%arg0: !fir.box> {fir.bindc_name = "z"}, %arg1: !fir.box> {fir.bindc_name = "y"}, %arg2: !fir.ref>>> {fir.bindc_name = "low"}) { %c2 = arith.constant 2 : index @@ -288,7 +277,7 @@ // CHECK: %[[VAL_44:.*]] = 
fir.convert %[[VAL_43]] : (i32) -> index // CHECK: %[[VAL_45:.*]] = fir.convert %[[VAL_42]] : (index) -> i32 // CHECK: %[[VAL_46:.*]]:2 = fir.do_loop %[[VAL_47:.*]] = %[[VAL_42]] to %[[VAL_44]] step %[[VAL_5]] iter_args(%[[VAL_48:.*]] = %[[VAL_45]]) -> (index, i32) { -// CHECK: fir.store %[[VAL_48]] to %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref +// CHECK: fir.store %[[VAL_48]] to %[[VAL_34]] : !fir.ref // CHECK: %[[VAL_49:.*]] = fir.load %[[VAL_18]] {tbaa = [#[[GLBL_YSTART_TAG]]]} : !fir.ref // CHECK: %[[VAL_50:.*]] = arith.addi %[[VAL_49]], %[[VAL_6]] : i32 // CHECK: %[[VAL_51:.*]] = fir.convert %[[VAL_50]] : (i32) -> index @@ -296,20 +285,24 @@ // CHECK: %[[VAL_53:.*]] = fir.convert %[[VAL_52]] : (i32) -> index // CHECK: %[[VAL_54:.*]] = fir.convert %[[VAL_51]] : (index) -> i32 // CHECK: %[[VAL_55:.*]]:2 = fir.do_loop %[[VAL_56:.*]] = %[[VAL_51]] to %[[VAL_53]] step %[[VAL_5]] iter_args(%[[VAL_57:.*]] = %[[VAL_54]]) -> (index, i32) { -// CHECK: fir.store %[[VAL_57]] to %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref +// CHECK: fir.store %[[VAL_57]] to %[[VAL_32]] : !fir.ref // CHECK: %[[VAL_58:.*]] = fir.load %[[VAL_16]] {tbaa = [#[[GLBL_XSTART_TAG]]]} : !fir.ref // CHECK: %[[VAL_59:.*]] = arith.addi %[[VAL_58]], %[[VAL_6]] : i32 // CHECK: %[[VAL_60:.*]] = fir.convert %[[VAL_59]] : (i32) -> index // CHECK: %[[VAL_61:.*]] = fir.convert %[[VAL_60]] : (index) -> i32 // CHECK: %[[VAL_62:.*]]:2 = fir.do_loop %[[VAL_63:.*]] = %[[VAL_60]] to %[[VAL_4]] step %[[VAL_5]] iter_args(%[[VAL_64:.*]] = %[[VAL_61]]) -> (index, i32) { -// CHECK: fir.store %[[VAL_64]] to %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref +// TODO: local allocation assumed to always alias +// CHECK: fir.store %[[VAL_64]] to %[[VAL_30]] : !fir.ref // load from box tagged in CodeGen // CHECK: %[[VAL_65:.*]] = fir.load %[[VAL_35]] : !fir.ref>>> -// CHECK: %[[VAL_66:.*]] = fir.load %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref +// TODO: local allocation assumed to 
always alias +// CHECK: %[[VAL_66:.*]] = fir.load %[[VAL_30]] : !fir.ref // CHECK: %[[VAL_67:.*]] = fir.convert %[[VAL_66]] : (i32) -> i64 -// CHECK: %[[VAL_68:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref +// TODO: local allocation assumed to always alias +// CHECK: %[[VAL_68:.*]] = fir.load %[[VAL_32]] : !fir.ref // CHECK: %[[VAL_69:.*]] = fir.convert %[[VAL_68]] : (i32) -> i64 -// CHECK: %[[VAL_70:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref +// TODO: local allocation assumed to always alias +// CHECK: %[[VAL_70:.*]] = fir.load %[[VAL_34]] : !fir.ref // CHECK: %[[VAL_71:.*]] = fir.convert %[[VAL_70]] : (i32) -> i64 // CHECK: %[[VAL_72:.*]] = fir.box_addr %[[VAL_65]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_73:.*]]:3 = fir.box_dims %[[VAL_65]], %[[VAL_4]] : (!fir.box>>, index) -> (index, index, index) @@ -318,10 +311,11 @@ // CHECK: %[[VAL_76:.*]] = fir.shape_shift %[[VAL_73]]#0, %[[VAL_73]]#1, %[[VAL_74]]#0, %[[VAL_74]]#1, %[[VAL_75]]#0, %[[VAL_75]]#1 : (index, index, index, index, index, index) -> !fir.shapeshift<3> // CHECK: %[[VAL_77:.*]] = fir.array_coor %[[VAL_72]](%[[VAL_76]]) %[[VAL_67]], %[[VAL_69]], %[[VAL_71]] : (!fir.heap>, !fir.shapeshift<3>, i64, i64, i64) -> !fir.ref // CHECK: %[[VAL_78:.*]] = fir.load %[[VAL_77]] {tbaa = [#[[ARG_LOW_TAG]]]} : !fir.ref -// CHECK: fir.store %[[VAL_78]] to %[[VAL_26]] {tbaa = [#[[LOCAL4_ALLOC_TAG]]]} : !fir.ref +// CHECK: fir.store %[[VAL_78]] to %[[VAL_26]] : !fir.ref // load from box tagged in CodeGen // CHECK: %[[VAL_79:.*]] = fir.load %[[VAL_8]] : !fir.ref>>> -// CHECK: %[[VAL_80:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref +// TODO: local allocation assumed to always alias +// CHECK: %[[VAL_80:.*]] = fir.load %[[VAL_32]] : !fir.ref // CHECK: %[[VAL_81:.*]] = fir.convert %[[VAL_80]] : (i32) -> i64 // CHECK: %[[VAL_82:.*]] = fir.box_addr %[[VAL_79]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_83:.*]]:3 = fir.box_dims 
%[[VAL_79]], %[[VAL_4]] : (!fir.box>>, index) -> (index, index, index) @@ -330,9 +324,11 @@ // CHECK: %[[VAL_86:.*]] = fir.load %[[VAL_85]] {tbaa = [#[[DIRECT_A_TAG]]]} : !fir.ref // load from box // CHECK: %[[VAL_87:.*]] = fir.load %[[VAL_35]] : !fir.ref>>> -// CHECK: %[[VAL_88:.*]] = fir.load %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref +// load from local allocation +// CHECK: %[[VAL_88:.*]] = fir.load %[[VAL_30]] : !fir.ref // CHECK: %[[VAL_89:.*]] = fir.convert %[[VAL_88]] : (i32) -> i64 -// CHECK: %[[VAL_90:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref +// load from local allocation +// CHECK: %[[VAL_90:.*]] = fir.load %[[VAL_34]] : !fir.ref // CHECK: %[[VAL_91:.*]] = fir.convert %[[VAL_90]] : (i32) -> i64 // CHECK: %[[VAL_92:.*]] = fir.box_addr %[[VAL_87]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_93:.*]]:3 = fir.box_dims %[[VAL_87]], %[[VAL_4]] : (!fir.box>>, index) -> (index, index, index) @@ -367,7 +363,8 @@ // CHECK: %[[VAL_121:.*]] = fir.load %[[VAL_120]] {tbaa = [#[[ARG_Y_TAG]]]} : !fir.ref // CHECK: %[[VAL_122:.*]] = arith.subf %[[VAL_119]], %[[VAL_121]] fastmath : f32 // CHECK: %[[VAL_123:.*]] = fir.no_reassoc %[[VAL_122]] : f32 -// CHECK: %[[VAL_124:.*]] = fir.load %[[VAL_28]] {tbaa = [#[[LOCAL5_ALLOC_TAG]]]} : !fir.ref +// load from local allocation +// CHECK: %[[VAL_124:.*]] = fir.load %[[VAL_28]] : !fir.ref // CHECK: %[[VAL_125:.*]] = arith.mulf %[[VAL_123]], %[[VAL_124]] fastmath : f32 // CHECK: %[[VAL_126:.*]] = arith.addf %[[VAL_115]], %[[VAL_125]] fastmath : f32 // CHECK: %[[VAL_127:.*]] = fir.no_reassoc %[[VAL_126]] : f32 @@ -376,24 +373,30 @@ // CHECK: fir.store %[[VAL_129]] to %[[VAL_97]] {tbaa = [#[[ARG_LOW_TAG]]]} : !fir.ref // CHECK: %[[VAL_130:.*]] = arith.addi %[[VAL_63]], %[[VAL_5]] : index // CHECK: %[[VAL_131:.*]] = fir.convert %[[VAL_5]] : (index) -> i32 -// CHECK: %[[VAL_132:.*]] = fir.load %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref +// load from local allocation +// CHECK: 
%[[VAL_132:.*]] = fir.load %[[VAL_30]] : !fir.ref // CHECK: %[[VAL_133:.*]] = arith.addi %[[VAL_132]], %[[VAL_131]] : i32 // CHECK: fir.result %[[VAL_130]], %[[VAL_133]] : index, i32 // CHECK: } -// CHECK: fir.store %[[VAL_134:.*]]#1 to %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref +// store to local allocation +// CHECK: fir.store %[[VAL_134:.*]]#1 to %[[VAL_30]] : !fir.ref // CHECK: %[[VAL_135:.*]] = arith.addi %[[VAL_56]], %[[VAL_5]] : index // CHECK: %[[VAL_136:.*]] = fir.convert %[[VAL_5]] : (index) -> i32 -// CHECK: %[[VAL_137:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref +// local allocation: +// CHECK: %[[VAL_137:.*]] = fir.load %[[VAL_32]] : !fir.ref // CHECK: %[[VAL_138:.*]] = arith.addi %[[VAL_137]], %[[VAL_136]] : i32 // CHECK: fir.result %[[VAL_135]], %[[VAL_138]] : index, i32 // CHECK: } -// CHECK: fir.store %[[VAL_139:.*]]#1 to %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref +// local allocation: +// CHECK: fir.store %[[VAL_139:.*]]#1 to %[[VAL_32]] : !fir.ref // CHECK: %[[VAL_140:.*]] = arith.addi %[[VAL_47]], %[[VAL_5]] : index // CHECK: %[[VAL_141:.*]] = fir.convert %[[VAL_5]] : (index) -> i32 -// CHECK: %[[VAL_142:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref +// local allocation: +// CHECK: %[[VAL_142:.*]] = fir.load %[[VAL_34]] : !fir.ref // CHECK: %[[VAL_143:.*]] = arith.addi %[[VAL_142]], %[[VAL_141]] : i32 // CHECK: fir.result %[[VAL_140]], %[[VAL_143]] : index, i32 // CHECK: } -// CHECK: fir.store %[[VAL_144:.*]]#1 to %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref +// local allocation: +// CHECK: fir.store %[[VAL_144:.*]]#1 to %[[VAL_34]] : !fir.ref // CHECK: return // CHECK: } diff --git a/flang/test/Transforms/tbaa3.fir b/flang/test/Transforms/tbaa3.fir index 97bf69da1b99c..28ff8f7c5fa83 100644 --- a/flang/test/Transforms/tbaa3.fir +++ b/flang/test/Transforms/tbaa3.fir @@ -263,12 +263,12 @@ module { fir.store %cst to %67 : !fir.ref %68 = fir.array_coor %20(%5) 
%c1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref // real :: local(10) -// DEFAULT: fir.store{{.*}}tbaa +// DEFAULT-NOT: fir.store{{.*}}tbaa // LOCAL: fir.store{{.*}}{tbaa = [#[[LOCALTAG]]]} : !fir.ref fir.store %cst to %68 : !fir.ref %69 = fir.array_coor %33(%5) %c1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref // real, target :: localt(10) -// DEFAULT: fir.store{{.*}}tbaa +// DEFAULT-NOT: fir.store{{.*}}tbaa // LOCAL: fir.store{{.*}}{tbaa = [#[[LOCALTTAG]]]} : !fir.ref fir.store %cst to %69 : !fir.ref // ALL-NOT: fir.load{{.*}}tbaa @@ -278,7 +278,7 @@ module { %73 = fir.shape_shift %72#0, %72#1 : (index, index) -> !fir.shapeshift<1> %74 = fir.array_coor %71(%73) %c1 : (!fir.heap>, !fir.shapeshift<1>, index) -> !fir.ref // real, allocatable :: locala(:) -// DEFAULT: fir.store{{.*}}tbaa +// DEFAULT-NOT: fir.store{{.*}}tbaa // LOCAL: fir.store{{.*}}{tbaa = [#[[LOCALATAG]]]} : !fir.ref fir.store %cst to %74 : !fir.ref // ALL-NOT: fir.load{{.*}}tbaa @@ -288,7 +288,7 @@ module { %78 = fir.shape_shift %77#0, %77#1 : (index, index) -> !fir.shapeshift<1> %79 = fir.array_coor %76(%78) %c1 : (!fir.heap>, !fir.shapeshift<1>, index) -> !fir.ref // real, allocatable, target :: localat(:) -// DEFAULT: fir.store{{.*}}tbaa +// DEFAULT-NOT: fir.store{{.*}}tbaa // LOCAL: fir.store{{.*}}{tbaa = [#[[LOCALATTAG]]]} : !fir.ref fir.store %cst to %79 : !fir.ref // ALL-NOT: fir.load{{.*}}tbaa @@ -297,7 +297,7 @@ module { %82 = fir.shift %81#0 : (index) -> !fir.shift<1> %83 = fir.array_coor %80(%82) %c1 : (!fir.box>>, !fir.shift<1>, index) -> !fir.ref // real, pointer :: localp(:) -// DEFAULT: fir.store{{.*}}tbaa +// DEFAULT-NOT: fir.store{{.*}}tbaa // LOCAL: fir.store{{.*}}{tbaa = [#[[TARGETTAG]]]} : !fir.ref fir.store %cst to %83 : !fir.ref // ALL-NOT: fir.load{{.*}}tbaa From flang-commits at lists.llvm.org Thu May 15 23:57:02 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 15 May 2025 23:57:02 -0700 (PDT) Subject: [flang-commits] [flang] Revert "[Flang] 
Turn on alias analysis for locally allocated objects" (PR #140202) In-Reply-To: Message-ID: <6826e1be.a70a0220.f7b31.4d90@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Dominik Adamski (DominikAdamski)
Changes Reverts llvm/llvm-project#139682 because of reported regression in Fujitsu Fortran test suite: https://ci.linaro.org/job/tcwg_flang_test--main-aarch64-Ofast-sve_vla-build/2081/artifact/artifacts/notify/mail-body.txt/*view*/ --- Patch is 24.67 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140202.diff 5 Files Affected: - (modified) flang/lib/Optimizer/Transforms/AddAliasTags.cpp (+7-4) - (modified) flang/test/Fir/tbaa-codegen2.fir (+2-5) - (modified) flang/test/Transforms/tbaa-with-dummy-scope2.fir (+8-15) - (modified) flang/test/Transforms/tbaa2.fir (+31-28) - (modified) flang/test/Transforms/tbaa3.fir (+5-5) ``````````diff diff --git a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp index 5cfbdc33285f9..66b4b84998801 100644 --- a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp +++ b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp @@ -43,10 +43,13 @@ static llvm::cl::opt static llvm::cl::opt enableDirect("direct-tbaa", llvm::cl::init(true), llvm::cl::Hidden, llvm::cl::desc("Add TBAA tags to direct variables")); -static llvm::cl::opt - enableLocalAllocs("local-alloc-tbaa", llvm::cl::init(true), - llvm::cl::Hidden, - llvm::cl::desc("Add TBAA tags to local allocations.")); +// This is **known unsafe** (misscompare in spec2017/wrf_r). It should +// not be enabled by default. +// The code is kept so that these may be tried with new benchmarks to see if +// this is worth fixing in the future. +static llvm::cl::opt enableLocalAllocs( + "local-alloc-tbaa", llvm::cl::init(false), llvm::cl::Hidden, + llvm::cl::desc("Add TBAA tags to local allocations. 
UNSAFE.")); namespace { diff --git a/flang/test/Fir/tbaa-codegen2.fir b/flang/test/Fir/tbaa-codegen2.fir index e4bfa9087ec75..8f8b6a29129e7 100644 --- a/flang/test/Fir/tbaa-codegen2.fir +++ b/flang/test/Fir/tbaa-codegen2.fir @@ -100,7 +100,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ // [...] // CHECK: %[[VAL50:.*]] = getelementptr i32, ptr %{{.*}}, i64 %{{.*}} // store to the temporary: -// CHECK: store i32 %{{.*}}, ptr %[[VAL50]], align 4, !tbaa ![[TMP_DATA_ACCESS_TAG:.*]] +// CHECK: store i32 %{{.*}}, ptr %[[VAL50]], align 4, !tbaa ![[DATA_ACCESS_TAG:.*]] // [...] // CHECK: [[BOX_ACCESS_TAG]] = !{![[BOX_ACCESS_TYPE:.*]], ![[BOX_ACCESS_TYPE]], i64 0} @@ -111,7 +111,4 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ // CHECK: ![[A_ACCESS_TYPE]] = !{!"dummy arg data/_QFfuncEa", ![[ARG_ACCESS_TYPE:.*]], i64 0} // CHECK: ![[ARG_ACCESS_TYPE]] = !{!"dummy arg data", ![[DATA_ACCESS_TYPE:.*]], i64 0} // CHECK: ![[DATA_ACCESS_TYPE]] = !{!"any data access", ![[ANY_ACCESS_TYPE]], i64 0} -// CHECK: ![[TMP_DATA_ACCESS_TAG]] = !{![[TMP_DATA_ACCESS_TYPE:.*]], ![[TMP_DATA_ACCESS_TYPE]], i64 0} -// CHECK: ![[TMP_DATA_ACCESS_TYPE]] = !{!"allocated data/", ![[TMP_ACCESS_TYPE:.*]], i64 0} -// CHECK: ![[TMP_ACCESS_TYPE]] = !{!"allocated data", ![[TARGET_ACCESS_TAG:.*]], i64 0} -// CHECK: ![[TARGET_ACCESS_TAG]] = !{!"target data", ![[DATA_ACCESS_TYPE]], i64 0} +// CHECK: ![[DATA_ACCESS_TAG]] = !{![[DATA_ACCESS_TYPE]], ![[DATA_ACCESS_TYPE]], i64 0} diff --git a/flang/test/Transforms/tbaa-with-dummy-scope2.fir b/flang/test/Transforms/tbaa-with-dummy-scope2.fir index 249471de458d3..c8f419fbee652 100644 --- a/flang/test/Transforms/tbaa-with-dummy-scope2.fir +++ b/flang/test/Transforms/tbaa-with-dummy-scope2.fir @@ -43,15 +43,12 @@ func.func @_QPtest1() attributes {noinline} { // CHECK: #[[$ATTR_0:.+]] = #llvm.tbaa_root // CHECK: #[[$ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_2:.+]] = 
#llvm.tbaa_type_desc}> -// CHECK: #[[$TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_3:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$LOCAL_ATTR_0:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_5:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_tag -// CHECK: #[[$LOCAL_ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_6:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$LOCAL_ATTR_2:.+]] = #llvm.tbaa_tag // CHECK: #[[$ATTR_8:.+]] = #llvm.tbaa_tag // CHECK-LABEL: func.func @_QPtest1() attributes {noinline} { // CHECK: %[[VAL_2:.*]] = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFtest1FinnerEy"} @@ -60,8 +57,8 @@ func.func @_QPtest1() attributes {noinline} { // CHECK: %[[VAL_5:.*]] = fir.dummy_scope : !fir.dscope // CHECK: %[[VAL_6:.*]] = fir.declare %[[VAL_4]] dummy_scope %[[VAL_5]] {uniq_name = "_QFtest1FinnerEx"} : (!fir.ref, !fir.dscope) -> !fir.ref // CHECK: %[[VAL_7:.*]] = fir.declare %[[VAL_2]] {uniq_name = "_QFtest1FinnerEy"} : (!fir.ref) -> !fir.ref -// CHECK: fir.store %{{.*}} to %[[VAL_7]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref -// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref +// CHECK: fir.store %{{.*}} to %[[VAL_7]] : !fir.ref +// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] : !fir.ref // CHECK: fir.store %[[VAL_8]] to %[[VAL_6]] {tbaa = [#[[$ATTR_7]]]} : !fir.ref // CHECK: fir.store %{{.*}} to %[[VAL_4]] {tbaa = [#[[$ATTR_8]]]} : !fir.ref @@ -90,16 +87,12 @@ func.func @_QPtest2() attributes {noinline} { // CHECK: #[[$ATTR_3:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_4:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_5:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$TARGETDATA_0:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_6:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$TARGETDATA_1:.+]] = #llvm.tbaa_type_desc}> -// 
CHECK: #[[$LOCAL_ATTR_0:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_8:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_type_desc}> +// CHECK: #[[$ATTR_7:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_10:.+]] = #llvm.tbaa_tag -// CHECK: #[[$LOCAL_ATTR_1:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[$ATTR_9:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[$LOCAL_ATTR_2:.+]] = #llvm.tbaa_tag // CHECK: #[[$ATTR_11:.+]] = #llvm.tbaa_tag // CHECK-LABEL: func.func @_QPtest2() attributes {noinline} { // CHECK: %[[VAL_2:.*]] = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFtest2FinnerEy"} @@ -109,7 +102,7 @@ func.func @_QPtest2() attributes {noinline} { // CHECK: %[[VAL_6:.*]] = fir.dummy_scope : !fir.dscope // CHECK: %[[VAL_7:.*]] = fir.declare %[[VAL_5]] dummy_scope %[[VAL_6]] {uniq_name = "_QFtest2FinnerEx"} : (!fir.ref, !fir.dscope) -> !fir.ref // CHECK: %[[VAL_8:.*]] = fir.declare %[[VAL_2]] {uniq_name = "_QFtest2FinnerEy"} : (!fir.ref) -> !fir.ref -// CHECK: fir.store %{{.*}} to %[[VAL_8]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref -// CHECK: %[[VAL_9:.*]] = fir.load %[[VAL_8]] {tbaa = [#[[$LOCAL_ATTR_2]]]} : !fir.ref +// CHECK: fir.store %{{.*}} to %[[VAL_8]] : !fir.ref +// CHECK: %[[VAL_9:.*]] = fir.load %[[VAL_8]] : !fir.ref // CHECK: fir.store %[[VAL_9]] to %[[VAL_7]] {tbaa = [#[[$ATTR_10]]]} : !fir.ref // CHECK: fir.store %{{.*}} to %[[VAL_5]] {tbaa = [#[[$ATTR_11]]]} : !fir.ref diff --git a/flang/test/Transforms/tbaa2.fir b/flang/test/Transforms/tbaa2.fir index 1429d0b420766..4678a1cd4a686 100644 --- a/flang/test/Transforms/tbaa2.fir +++ b/flang/test/Transforms/tbaa2.fir @@ -50,7 +50,6 @@ // CHECK: #[[TARGETDATA:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_ARG:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_GLBL:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[ANY_LOCAL:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ARG_LOW:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[ANY_DIRECT:.+]] = 
#llvm.tbaa_type_desc}> // CHECK: #[[ARG_Z:.+]] = #llvm.tbaa_type_desc}> @@ -62,31 +61,21 @@ // CHECK: #[[GLBL_ZSTART:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_ZSTOP:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[LOCAL1_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_YSTART:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_YSTOP:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[LOCAL2_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_XSTART:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[LOCAL3_ALLOC:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[LOCAL4_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[DIRECT_A:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[DIRECT_B:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_DYINV:.+]] = #llvm.tbaa_type_desc}> -// CHECK: #[[LOCAL5_ALLOC:.+]] = #llvm.tbaa_type_desc}> // CHECK: #[[GLBL_ZSTART_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_ZSTOP_TAG:.+]] = #llvm.tbaa_tag -// CHECK: #[[LOCAL1_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_YSTART_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_YSTOP_TAG:.+]] = #llvm.tbaa_tag -// CHECK: #[[LOCAL2_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_XSTART_TAG:.+]] = #llvm.tbaa_tag -// CHECK: #[[LOCAL3_ALLOC_TAG:.+]] = #llvm.tbaa_tag -// CHECK: #[[LOCAL4_ALLOC_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[DIRECT_A_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[DIRECT_B_TAG:.+]] = #llvm.tbaa_tag // CHECK: #[[GLBL_DYINV_TAG:.+]] = #llvm.tbaa_tag -// CHECK: #[[LOCAL5_ALLOC_TAG:.+]] = #llvm.tbaa_tag func.func @_QMmodPcallee(%arg0: !fir.box> {fir.bindc_name = "z"}, %arg1: !fir.box> {fir.bindc_name = "y"}, %arg2: !fir.ref>>> {fir.bindc_name = "low"}) { %c2 = arith.constant 2 : index @@ -288,7 +277,7 @@ // CHECK: %[[VAL_44:.*]] = fir.convert %[[VAL_43]] : (i32) -> index // CHECK: %[[VAL_45:.*]] = fir.convert %[[VAL_42]] : (index) -> i32 // CHECK: %[[VAL_46:.*]]:2 = fir.do_loop %[[VAL_47:.*]] = %[[VAL_42]] to %[[VAL_44]] step %[[VAL_5]] iter_args(%[[VAL_48:.*]] = %[[VAL_45]]) -> (index, i32) { -// CHECK: fir.store 
%[[VAL_48]] to %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref +// CHECK: fir.store %[[VAL_48]] to %[[VAL_34]] : !fir.ref // CHECK: %[[VAL_49:.*]] = fir.load %[[VAL_18]] {tbaa = [#[[GLBL_YSTART_TAG]]]} : !fir.ref // CHECK: %[[VAL_50:.*]] = arith.addi %[[VAL_49]], %[[VAL_6]] : i32 // CHECK: %[[VAL_51:.*]] = fir.convert %[[VAL_50]] : (i32) -> index @@ -296,20 +285,24 @@ // CHECK: %[[VAL_53:.*]] = fir.convert %[[VAL_52]] : (i32) -> index // CHECK: %[[VAL_54:.*]] = fir.convert %[[VAL_51]] : (index) -> i32 // CHECK: %[[VAL_55:.*]]:2 = fir.do_loop %[[VAL_56:.*]] = %[[VAL_51]] to %[[VAL_53]] step %[[VAL_5]] iter_args(%[[VAL_57:.*]] = %[[VAL_54]]) -> (index, i32) { -// CHECK: fir.store %[[VAL_57]] to %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref +// CHECK: fir.store %[[VAL_57]] to %[[VAL_32]] : !fir.ref // CHECK: %[[VAL_58:.*]] = fir.load %[[VAL_16]] {tbaa = [#[[GLBL_XSTART_TAG]]]} : !fir.ref // CHECK: %[[VAL_59:.*]] = arith.addi %[[VAL_58]], %[[VAL_6]] : i32 // CHECK: %[[VAL_60:.*]] = fir.convert %[[VAL_59]] : (i32) -> index // CHECK: %[[VAL_61:.*]] = fir.convert %[[VAL_60]] : (index) -> i32 // CHECK: %[[VAL_62:.*]]:2 = fir.do_loop %[[VAL_63:.*]] = %[[VAL_60]] to %[[VAL_4]] step %[[VAL_5]] iter_args(%[[VAL_64:.*]] = %[[VAL_61]]) -> (index, i32) { -// CHECK: fir.store %[[VAL_64]] to %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref +// TODO: local allocation assumed to always alias +// CHECK: fir.store %[[VAL_64]] to %[[VAL_30]] : !fir.ref // load from box tagged in CodeGen // CHECK: %[[VAL_65:.*]] = fir.load %[[VAL_35]] : !fir.ref>>> -// CHECK: %[[VAL_66:.*]] = fir.load %[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref +// TODO: local allocation assumed to always alias +// CHECK: %[[VAL_66:.*]] = fir.load %[[VAL_30]] : !fir.ref // CHECK: %[[VAL_67:.*]] = fir.convert %[[VAL_66]] : (i32) -> i64 -// CHECK: %[[VAL_68:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref +// TODO: local allocation assumed to always alias +// 
CHECK: %[[VAL_68:.*]] = fir.load %[[VAL_32]] : !fir.ref // CHECK: %[[VAL_69:.*]] = fir.convert %[[VAL_68]] : (i32) -> i64 -// CHECK: %[[VAL_70:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref +// TODO: local allocation assumed to always alias +// CHECK: %[[VAL_70:.*]] = fir.load %[[VAL_34]] : !fir.ref // CHECK: %[[VAL_71:.*]] = fir.convert %[[VAL_70]] : (i32) -> i64 // CHECK: %[[VAL_72:.*]] = fir.box_addr %[[VAL_65]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_73:.*]]:3 = fir.box_dims %[[VAL_65]], %[[VAL_4]] : (!fir.box>>, index) -> (index, index, index) @@ -318,10 +311,11 @@ // CHECK: %[[VAL_76:.*]] = fir.shape_shift %[[VAL_73]]#0, %[[VAL_73]]#1, %[[VAL_74]]#0, %[[VAL_74]]#1, %[[VAL_75]]#0, %[[VAL_75]]#1 : (index, index, index, index, index, index) -> !fir.shapeshift<3> // CHECK: %[[VAL_77:.*]] = fir.array_coor %[[VAL_72]](%[[VAL_76]]) %[[VAL_67]], %[[VAL_69]], %[[VAL_71]] : (!fir.heap>, !fir.shapeshift<3>, i64, i64, i64) -> !fir.ref // CHECK: %[[VAL_78:.*]] = fir.load %[[VAL_77]] {tbaa = [#[[ARG_LOW_TAG]]]} : !fir.ref -// CHECK: fir.store %[[VAL_78]] to %[[VAL_26]] {tbaa = [#[[LOCAL4_ALLOC_TAG]]]} : !fir.ref +// CHECK: fir.store %[[VAL_78]] to %[[VAL_26]] : !fir.ref // load from box tagged in CodeGen // CHECK: %[[VAL_79:.*]] = fir.load %[[VAL_8]] : !fir.ref>>> -// CHECK: %[[VAL_80:.*]] = fir.load %[[VAL_32]] {tbaa = [#[[LOCAL2_ALLOC_TAG]]]} : !fir.ref +// TODO: local allocation assumed to always alias +// CHECK: %[[VAL_80:.*]] = fir.load %[[VAL_32]] : !fir.ref // CHECK: %[[VAL_81:.*]] = fir.convert %[[VAL_80]] : (i32) -> i64 // CHECK: %[[VAL_82:.*]] = fir.box_addr %[[VAL_79]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_83:.*]]:3 = fir.box_dims %[[VAL_79]], %[[VAL_4]] : (!fir.box>>, index) -> (index, index, index) @@ -330,9 +324,11 @@ // CHECK: %[[VAL_86:.*]] = fir.load %[[VAL_85]] {tbaa = [#[[DIRECT_A_TAG]]]} : !fir.ref // load from box // CHECK: %[[VAL_87:.*]] = fir.load %[[VAL_35]] : !fir.ref>>> -// CHECK: %[[VAL_88:.*]] = fir.load 
%[[VAL_30]] {tbaa = [#[[LOCAL3_ALLOC_TAG]]]} : !fir.ref +// load from local allocation +// CHECK: %[[VAL_88:.*]] = fir.load %[[VAL_30]] : !fir.ref // CHECK: %[[VAL_89:.*]] = fir.convert %[[VAL_88]] : (i32) -> i64 -// CHECK: %[[VAL_90:.*]] = fir.load %[[VAL_34]] {tbaa = [#[[LOCAL1_ALLOC_TAG]]]} : !fir.ref +// load from local allocation +// CHECK: %[[VAL_90:.*]] = fir.load %[[VAL_34]] : !fir.ref // CHECK: %[[VAL_91:.*]] = fir.convert %[[VAL_90]] : (i32) -> i64 // CHECK: %[[VAL_92:.*]] = fir.box_addr %[[VAL_87]] : (!fir.box>>) -> !fir.heap> // CHECK: %[[VAL_93:.*]]:3 = fir.box_dims %[[VAL_87]], %[[VAL_4]] : (!fir.box>>, index) -> (index, index, index) @@ -367,7 +363,8 @@ // CHECK: %[[VAL_121:.*]] = fir.load %[[VAL_120]] {tbaa = [#[[ARG_Y_TAG]]]} : !fir.ref // CHECK: %[[VAL_122:.*]] = arith.subf %[[VAL_119]], %[[VAL_121]] fastmath : f32 // CHECK: %[[VAL_123:.*]] = fir.no_reassoc %[[VAL_122]] : f32 -// CHECK: %[[VAL_124:.*]] = fir.load %[[VAL_28]] {tbaa = [#[[LOCAL5_ALLOC_TAG]]]} : !fir.ref +// load from local allocation +// CHECK: %[[VAL_124:.*]] = fir.load %[[VAL_28]] : !fir.ref // CHECK: %[[VAL_125:.*]] = arith.mulf %[[VAL_123]], %[[VAL_124]] fastmath : f32 // CHECK: %[[VAL_126:.*]] = arith.addf %[[VAL_115]], %[[VAL_125]] fastmath : f32 // CHECK: %[[VAL_127:.*]] = fir.no_reassoc %[[VAL_126]] : f32 @@ -376,24 +373,30 @@ // CHECK: fir.store %[[VAL_129]] to %[[VAL_97]] {tbaa = [#[[ARG_LOW_TAG]]]} : !fir.ref... [truncated] ``````````
https://github.com/llvm/llvm-project/pull/140202 From flang-commits at lists.llvm.org Fri May 16 01:08:36 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 16 May 2025 01:08:36 -0700 (PDT) Subject: [flang-commits] [flang] [draft][flang] Undo the effects of CSE for hlfir.exactly_once. (PR #140190) In-Reply-To: Message-ID: <6826f284.170a0220.1340e5.4283@mx.google.com> https://github.com/jeanPerier commented: Thanks Slava, this looks great! I also thought about what you said about having exactly_once be a side region, and I think that may be the cleanest solution if what you have here were to hit issues with operations with memory effects being moved. I do not think that is possible, though. The current CSE works because the dominance is clear in the IR syntax, and that is all it needs for operations that behave the same when re-evaluated with the same inputs. Any pass moving memory operations, by contrast, would need to understand the control path between the two operations to check for modifications, and I do not think MLIR can do such an analysis without more information on the control-flow behavior of the where/forall/region_assign/exactly_once operations. https://github.com/llvm/llvm-project/pull/140190 From flang-commits at lists.llvm.org Fri May 16 02:35:28 2025 From: flang-commits at lists.llvm.org (Sjoerd Meijer via flang-commits) Date: Fri, 16 May 2025 02:35:28 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <682706e0.a70a0220.1fec6a.5129@mx.google.com> sjoerdmeijer wrote: For more context, this is part of our loop-interchange enablement story, see our RFC here: https://discourse.llvm.org/t/enabling-loop-interchange/82589. We have fixed all the compile-time issues and loop-interchange issues that we are aware of, and would like to enable this in the C/C++ flow, see here: https://github.com/llvm/llvm-project/pull/124911. 
As part of this work, we also promised to fix DependenceAnalysis. The last DA correctness corner-case that is being worked on is: https://github.com/llvm/llvm-project/pull/123436. This is a corner-case for C/C++ related to type punning, different offset sizes that won't be a problem in Fortran. Therefore, we think that enabling interchange and dependence analysis for Fortran makes sense. https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Fri May 16 03:03:01 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 16 May 2025 03:03:01 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <68270d55.050a0220.26299b.566c@mx.google.com> https://github.com/kiranchandramohan edited https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Fri May 16 03:03:01 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 16 May 2025 03:03:01 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <68270d55.630a0220.103036.8c4d@mx.google.com> https://github.com/kiranchandramohan commented: Thanks for this PR. Do you have any compilation time and performance data? 
https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Fri May 16 03:03:03 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 16 May 2025 03:03:03 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <68270d57.170a0220.211404.4a35@mx.google.com> ================ @@ -421,7 +421,8 @@ static void CheckSubscripts( static void CheckSubscripts( semantics::SemanticsContext &context, CoarrayRef &ref) { - const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; + const auto &base = ref.GetBase(); + const Symbol &coarraySymbol{base.GetLastSymbol()}; ---------------- kiranchandramohan wrote: Nit: Is this an unrelated change? https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Fri May 16 03:41:05 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 16 May 2025 03:41:05 -0700 (PDT) Subject: [flang-commits] [flang] [flang] use DataLayout instead of GEP to compute element size (PR #140235) Message-ID: https://github.com/jeanPerier created https://github.com/llvm/llvm-project/pull/140235 Now that the data layout is part of codegen, use it to generate type size constants in codegen instead of generating a GEP. This is needed to fold initializers of derived type arrays with descriptor components into ArrayAttr and speed up compilation, which I will do in a different patch. 
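For readers skimming the description, the element-distance computation it refers to can be sketched without any LLVM or MLIR dependency. This is only an illustration under stated assumptions, not the code from the PR: `TypeLayout` and `elementDistance` are hypothetical names, `alignTo` is a hand-written stand-in for `llvm::alignTo`, and the size/alignment pairs are example values (4/4 for `f32`, 400/4 for a 100-element `f32` array, and 10/16 for an 80-bit float on a target where it occupies 16 bytes).

```cpp
#include <cassert>
#include <cstdint>

// Round `value` up to the next multiple of `align` (`align` must be non-zero).
// Hand-written stand-in for llvm::alignTo so the sketch needs no LLVM headers.
constexpr std::uint64_t alignTo(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) / align * align;
}

// Hypothetical record of what a data-layout query would report for one type.
struct TypeLayout {
  std::uint64_t sizeInBytes;   // storage size of a single object
  std::uint64_t abiAlignment;  // required ABI alignment in bytes
};

// Distance in bytes between two adjacent array elements of the type:
// the object size padded up to its ABI alignment. Because it is a plain
// integer, it can be emitted as a constant instead of a null-pointer GEP.
constexpr std::uint64_t elementDistance(TypeLayout layout) {
  return alignTo(layout.sizeInBytes, layout.abiAlignment);
}

// Compile-time sanity checks on the assumed example layouts.
static_assert(elementDistance(TypeLayout{4, 4}) == 4, "f32-like element");
static_assert(elementDistance(TypeLayout{10, 16}) == 16, "padded 80-bit float");
```

The second check is the case the old code comment worried about: a size of 10 bytes alone would understate the stride, so the padding to the ABI alignment matters.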
>From 2a8d78189a6c20a482dd44dfe43f8c919d0e53d6 Mon Sep 17 00:00:00 2001 From: Jean Perier Date: Fri, 16 May 2025 01:44:31 -0700 Subject: [PATCH] [flang] use DataLayout instead of GEP to compute element size --- .../flang/Optimizer/CodeGen/FIROpPatterns.h | 4 ++ flang/lib/Optimizer/CodeGen/CodeGen.cpp | 50 ++++++++--------- flang/test/Fir/convert-to-llvm.fir | 54 +++++-------------- flang/test/Fir/copy-codegen.fir | 12 ++--- flang/test/Fir/embox-char.fir | 8 +-- flang/test/Fir/embox-substring.fir | 7 ++- 6 files changed, 48 insertions(+), 87 deletions(-) diff --git a/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h b/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h index 53d16323beddf..7b1c14e4dfdc9 100644 --- a/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h +++ b/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h @@ -173,6 +173,10 @@ class ConvertFIRToLLVMPattern : public mlir::ConvertToLLVMPattern { this->getTypeConverter()); } + const mlir::DataLayout &getDataLayout() const { + return lowerTy().getDataLayout(); + } + void attachTBAATag(mlir::LLVM::AliasAnalysisOpInterface op, mlir::Type baseFIRType, mlir::Type accessFIRType, mlir::LLVM::GEPOp gep) const { diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index e534cfa5591c6..ad9119ba4a031 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -1043,22 +1043,12 @@ static mlir::SymbolRefAttr getMalloc(fir::AllocMemOp op, static mlir::Value computeElementDistance(mlir::Location loc, mlir::Type llvmObjectType, mlir::Type idxTy, - mlir::ConversionPatternRewriter &rewriter) { - // Note that we cannot use something like - // mlir::LLVM::getPrimitiveTypeSizeInBits() for the element type here. For - // example, it returns 10 bytes for mlir::Float80Type for targets where it - // occupies 16 bytes. 
Proper solution is probably to use - // mlir::DataLayout::getTypeABIAlignment(), but DataLayout is not being set - // yet (see llvm-project#57230). For the time being use the '(intptr_t)((type - // *)0 + 1)' trick for all types. The generated instructions are optimized - // into constant by the first pass of InstCombine, so it should not be a - // performance issue. - auto llvmPtrTy = ::getLlvmPtrType(llvmObjectType.getContext()); - auto nullPtr = rewriter.create(loc, llvmPtrTy); - auto gep = rewriter.create( - loc, llvmPtrTy, llvmObjectType, nullPtr, - llvm::ArrayRef{1}); - return rewriter.create(loc, idxTy, gep); + mlir::ConversionPatternRewriter &rewriter, + const mlir::DataLayout &dataLayout) { + llvm::TypeSize size = dataLayout.getTypeSize(llvmObjectType); + unsigned short alignment = dataLayout.getTypeABIAlignment(llvmObjectType); + std::int64_t distance = llvm::alignTo(size, alignment); + return genConstantIndex(loc, idxTy, rewriter, distance); } /// Return value of the stride in bytes between adjacent elements @@ -1066,10 +1056,10 @@ computeElementDistance(mlir::Location loc, mlir::Type llvmObjectType, /// \p idxTy integer type. static mlir::Value genTypeStrideInBytes(mlir::Location loc, mlir::Type idxTy, - mlir::ConversionPatternRewriter &rewriter, - mlir::Type llTy) { + mlir::ConversionPatternRewriter &rewriter, mlir::Type llTy, + const mlir::DataLayout &dataLayout) { // Create a pointer type and use computeElementDistance(). 
- return computeElementDistance(loc, llTy, idxTy, rewriter); + return computeElementDistance(loc, llTy, idxTy, rewriter, dataLayout); } namespace { @@ -1111,7 +1101,7 @@ struct AllocMemOpConversion : public fir::FIROpConversion { mlir::Value genTypeSizeInBytes(mlir::Location loc, mlir::Type idxTy, mlir::ConversionPatternRewriter &rewriter, mlir::Type llTy) const { - return computeElementDistance(loc, llTy, idxTy, rewriter); + return computeElementDistance(loc, llTy, idxTy, rewriter, getDataLayout()); } }; } // namespace @@ -1323,8 +1313,8 @@ struct EmboxCommonConversion : public fir::FIROpConversion { fir::CharacterType charTy, mlir::ValueRange lenParams) const { auto i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); - mlir::Value size = - genTypeStrideInBytes(loc, i64Ty, rewriter, this->convertType(charTy)); + mlir::Value size = genTypeStrideInBytes( + loc, i64Ty, rewriter, this->convertType(charTy), this->getDataLayout()); if (charTy.hasConstantLen()) return size; // Length accounted for in the genTypeStrideInBytes GEP. // Otherwise, multiply the single character size by the length. 
@@ -1338,6 +1328,7 @@ struct EmboxCommonConversion : public fir::FIROpConversion { std::tuple getSizeAndTypeCode( mlir::Location loc, mlir::ConversionPatternRewriter &rewriter, mlir::Type boxEleTy, mlir::ValueRange lenParams = {}) const { + const mlir::DataLayout &dataLayout = this->getDataLayout(); auto i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); if (auto eleTy = fir::dyn_cast_ptrEleTy(boxEleTy)) boxEleTy = eleTy; @@ -1354,18 +1345,19 @@ struct EmboxCommonConversion : public fir::FIROpConversion { mlir::dyn_cast(boxEleTy) || fir::isa_real(boxEleTy) || fir::isa_complex(boxEleTy)) return {genTypeStrideInBytes(loc, i64Ty, rewriter, - this->convertType(boxEleTy)), + this->convertType(boxEleTy), dataLayout), typeCodeVal}; if (auto charTy = mlir::dyn_cast(boxEleTy)) return {getCharacterByteSize(loc, rewriter, charTy, lenParams), typeCodeVal}; if (fir::isa_ref_type(boxEleTy)) { auto ptrTy = ::getLlvmPtrType(rewriter.getContext()); - return {genTypeStrideInBytes(loc, i64Ty, rewriter, ptrTy), typeCodeVal}; + return {genTypeStrideInBytes(loc, i64Ty, rewriter, ptrTy, dataLayout), + typeCodeVal}; } if (mlir::isa(boxEleTy)) return {genTypeStrideInBytes(loc, i64Ty, rewriter, - this->convertType(boxEleTy)), + this->convertType(boxEleTy), dataLayout), typeCodeVal}; fir::emitFatalError(loc, "unhandled type in fir.box code generation"); } @@ -1909,8 +1901,8 @@ struct XEmboxOpConversion : public EmboxCommonConversion { if (hasSubcomp) { // We have a subcomponent. The step value needs to be the number of // bytes per element (which is a derived type). - prevDimByteStride = - genTypeStrideInBytes(loc, i64Ty, rewriter, convertType(seqEleTy)); + prevDimByteStride = genTypeStrideInBytes( + loc, i64Ty, rewriter, convertType(seqEleTy), getDataLayout()); } else if (hasSubstr) { // We have a substring. The step value needs to be the number of bytes // per CHARACTER element. 
@@ -3604,8 +3596,8 @@ struct CopyOpConversion : public fir::FIROpConversion { mlir::Value llvmDestination = adaptor.getDestination(); mlir::Type i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); mlir::Type copyTy = fir::unwrapRefType(copy.getSource().getType()); - mlir::Value copySize = - genTypeStrideInBytes(loc, i64Ty, rewriter, convertType(copyTy)); + mlir::Value copySize = genTypeStrideInBytes( + loc, i64Ty, rewriter, convertType(copyTy), getDataLayout()); mlir::LLVM::AliasAnalysisOpInterface newOp; if (copy.getNoOverlap()) diff --git a/flang/test/Fir/convert-to-llvm.fir b/flang/test/Fir/convert-to-llvm.fir index 2960528fb6c24..6d8a8bb606b90 100644 --- a/flang/test/Fir/convert-to-llvm.fir +++ b/flang/test/Fir/convert-to-llvm.fir @@ -216,9 +216,7 @@ func.func @test_alloc_and_freemem_one() { } // CHECK-LABEL: llvm.func @test_alloc_and_freemem_one() { -// CHECK-NEXT: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK-NEXT: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK-NEXT: %[[N:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[N:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK-NEXT: llvm.call @malloc(%[[N]]) // CHECK: llvm.call @free(%{{.*}}) // CHECK-NEXT: llvm.return @@ -235,10 +233,8 @@ func.func @test_alloc_and_freemem_several() { } // CHECK-LABEL: llvm.func @test_alloc_and_freemem_several() { -// CHECK: [[NULL:%.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: [[PTR:%.*]] = llvm.getelementptr [[NULL]][{{.*}}] : (!llvm.ptr) -> !llvm.ptr, !llvm.array<100 x f32> -// CHECK: [[N:%.*]] = llvm.ptrtoint [[PTR]] : !llvm.ptr to i64 -// CHECK: [[MALLOC:%.*]] = llvm.call @malloc([[N]]) +// CHECK: %[[N:.*]] = llvm.mlir.constant(400 : i64) : i64 +// CHECK: [[MALLOC:%.*]] = llvm.call @malloc(%[[N]]) // CHECK: llvm.call @free([[MALLOC]]) // CHECK: llvm.return @@ -251,9 +247,7 @@ func.func @test_with_shape(%ncols: index, %nrows: index) { // CHECK-LABEL: llvm.func @test_with_shape // CHECK-SAME: %[[NCOLS:.*]]: i64, %[[NROWS:.*]]: i64 -// CHECK: 
%[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[FOUR:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[FOUR:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[DIM1_SIZE:.*]] = llvm.mul %[[FOUR]], %[[NCOLS]] : i64 // CHECK: %[[TOTAL_SIZE:.*]] = llvm.mul %[[DIM1_SIZE]], %[[NROWS]] : i64 // CHECK: %[[MEM:.*]] = llvm.call @malloc(%[[TOTAL_SIZE]]) @@ -269,9 +263,7 @@ func.func @test_string_with_shape(%len: index, %nelems: index) { // CHECK-LABEL: llvm.func @test_string_with_shape // CHECK-SAME: %[[LEN:.*]]: i64, %[[NELEMS:.*]]: i64) -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ONE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ONE:.*]] = llvm.mlir.constant(1 : i64) : i64 // CHECK: %[[LEN_SIZE:.*]] = llvm.mul %[[ONE]], %[[LEN]] : i64 // CHECK: %[[TOTAL_SIZE:.*]] = llvm.mul %[[LEN_SIZE]], %[[NELEMS]] : i64 // CHECK: %[[MEM:.*]] = llvm.call @malloc(%[[TOTAL_SIZE]]) @@ -1654,9 +1646,7 @@ func.func @embox0(%arg0: !fir.ref>) { // AMDGPU: %[[AA:.*]] = llvm.alloca %[[C1]] x !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> {alignment = 8 : i64} : (i32) -> !llvm.ptr<5> // AMDGPU: %[[ALLOCA:.*]] = llvm.addrspacecast %[[AA]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[I64_ELEM_SIZE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[I64_ELEM_SIZE:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[DESC:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> // CHECK: %[[DESC0:.*]] = llvm.insertvalue %[[I64_ELEM_SIZE]], %[[DESC]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> // CHECK: %[[CFI_VERSION:.*]] = 
llvm.mlir.constant(20240719 : i32) : i32 @@ -1879,9 +1869,7 @@ func.func @xembox0(%arg0: !fir.ref>) { // AMDGPU: %[[ALLOCA:.*]] = llvm.addrspacecast %[[AA]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[TYPE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -1933,9 +1921,7 @@ func.func @xembox0_i32(%arg0: !fir.ref>) { // CHECK: %[[C0_I32:.*]] = llvm.mlir.constant(0 : i32) : i32 // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[TYPE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -1988,9 +1974,7 @@ func.func @xembox1(%arg0: !fir.ref>>) { // CHECK-LABEL: llvm.func @xembox1(%{{.*}}: !llvm.ptr) { // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : 
i64) : i64 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(10 : i64) : i64 // CHECK: %{{.*}} = llvm.insertvalue %[[ELEM_LEN_I64]], %{{.*}}[1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[PREV_PTROFF:.*]] = llvm.mul %[[ELEM_LEN_I64]], %[[C0]] : i64 @@ -2042,9 +2026,7 @@ func.func private @_QPxb(!fir.box>) // AMDGPU: %[[AR:.*]] = llvm.alloca %[[ARR_SIZE]] x f64 {bindc_name = "arr"} : (i64) -> !llvm.ptr<5> // AMDGPU: %[[ARR:.*]] = llvm.addrspacecast %[[AR]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(28 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(8 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<2 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<2 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -2126,9 +2108,7 @@ func.func private @_QPtest_dt_callee(%arg0: !fir.box>) // CHECK: %[[C10:.*]] = llvm.mlir.constant(10 : i64) : i64 // CHECK: %[[C2:.*]] = llvm.mlir.constant(2 : i64) : i64 // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = 
llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -2146,9 +2126,7 @@ func.func private @_QPtest_dt_callee(%arg0: !fir.box>) // CHECK: %[[BOX6:.*]] = llvm.insertvalue %[[F18ADDENDUM_I8]], %[[BOX5]][6] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[ZERO:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[ONE:.*]] = llvm.mlir.constant(1 : i64) : i64 -// CHECK: %[[ELE_TYPE:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP_DTYPE_SIZE:.*]] = llvm.getelementptr %[[ELE_TYPE]][1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"_QFtest_dt_sliceTt", (i32, i32)> -// CHECK: %[[PTRTOINT_DTYPE_SIZE:.*]] = llvm.ptrtoint %[[GEP_DTYPE_SIZE]] : !llvm.ptr to i64 +// CHECK: %[[PTRTOINT_DTYPE_SIZE:.*]] = llvm.mlir.constant(8 : i64) : i64 // CHECK: %[[ADJUSTED_OFFSET:.*]] = llvm.sub %[[C1]], %[[ONE]] : i64 // CHECK: %[[EXT_SUB:.*]] = llvm.sub %[[C10]], %[[C1]] : i64 // CHECK: %[[EXT_ADD:.*]] = llvm.add %[[EXT_SUB]], %[[C2]] : i64 @@ -2429,9 +2407,7 @@ func.func @test_rebox_1(%arg0: !fir.box>) { //CHECK: %[[SIX:.*]] = llvm.mlir.constant(6 : index) : i64 //CHECK: %[[EIGHTY:.*]] = llvm.mlir.constant(80 : index) : i64 //CHECK: %[[FLOAT_TYPE:.*]] = llvm.mlir.constant(27 : i32) : i32 -//CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -//CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -//CHECK: %[[ELEM_SIZE_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +//CHECK: %[[ELEM_SIZE_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 //CHECK: %[[EXTRA_GEP:.*]] = llvm.getelementptr %[[ARG0]][0, 6] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> //CHECK: %[[EXTRA:.*]] 
= llvm.load %[[EXTRA_GEP]] : !llvm.ptr -> i8 //CHECK: %[[RBOX:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> @@ -2504,9 +2480,7 @@ func.func @foo(%arg0: !fir.box} //CHECK: %[[COMPONENT_OFFSET_1:.*]] = llvm.mlir.constant(1 : i64) : i64 //CHECK: %[[ELEM_COUNT:.*]] = llvm.mlir.constant(7 : i64) : i64 //CHECK: %[[TYPE_CHAR:.*]] = llvm.mlir.constant(40 : i32) : i32 -//CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -//CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -//CHECK: %[[CHAR_SIZE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +//CHECK: %[[CHAR_SIZE:.*]] = llvm.mlir.constant(1 : i64) : i64 //CHECK: %[[ELEM_SIZE:.*]] = llvm.mul %[[CHAR_SIZE]], %[[ELEM_COUNT]] //CHECK: %[[EXTRA_GEP:.*]] = llvm.getelementptr %[[ARG0]][0, 6] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>, ptr, array<1 x i64>)> //CHECK: %[[EXTRA:.*]] = llvm.load %[[EXTRA_GEP]] : !llvm.ptr -> i8 diff --git a/flang/test/Fir/copy-codegen.fir b/flang/test/Fir/copy-codegen.fir index eef1885c6a49c..7b0620ca2d312 100644 --- a/flang/test/Fir/copy-codegen.fir +++ b/flang/test/Fir/copy-codegen.fir @@ -12,10 +12,8 @@ func.func @test_copy_1(%arg0: !fir.ref, %arg1: !fir.ref) { // CHECK-LABEL: llvm.func @test_copy_1( // CHECK-SAME: %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr, // CHECK-SAME: %[[VAL_1:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr) { -// CHECK: %[[VAL_2:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_3:.*]] = llvm.getelementptr %[[VAL_2]][1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"sometype", (array<9 x i32>)> -// CHECK: %[[VAL_4:.*]] = llvm.ptrtoint %[[VAL_3]] : !llvm.ptr to i64 -// CHECK: "llvm.intr.memcpy"(%[[VAL_1]], %[[VAL_0]], %[[VAL_4]]) <{isVolatile = false}> : (!llvm.ptr, !llvm.ptr, i64) -> () +// CHECK: %[[VAL_2:.*]] = llvm.mlir.constant(36 : i64) : i64 +// CHECK: "llvm.intr.memcpy"(%[[VAL_1]], %[[VAL_0]], %[[VAL_2]]) <{isVolatile = false}> : (!llvm.ptr, 
!llvm.ptr, i64) -> () // CHECK: llvm.return // CHECK: } @@ -26,10 +24,8 @@ func.func @test_copy_2(%arg0: !fir.ref, %arg1: !fir.ref) { // CHECK-LABEL: llvm.func @test_copy_2( // CHECK-SAME: %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr, // CHECK-SAME: %[[VAL_1:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr) { -// CHECK: %[[VAL_2:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_3:.*]] = llvm.getelementptr %[[VAL_2]][1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"sometype", (array<9 x i32>)> -// CHECK: %[[VAL_4:.*]] = llvm.ptrtoint %[[VAL_3]] : !llvm.ptr to i64 -// CHECK: "llvm.intr.memmove"(%[[VAL_1]], %[[VAL_0]], %[[VAL_4]]) <{isVolatile = false}> : (!llvm.ptr, !llvm.ptr, i64) -> () +// CHECK: %[[VAL_2:.*]] = llvm.mlir.constant(36 : i64) : i64 +// CHECK: "llvm.intr.memmove"(%[[VAL_1]], %[[VAL_0]], %[[VAL_2]]) <{isVolatile = false}> : (!llvm.ptr, !llvm.ptr, i64) -> () // CHECK: llvm.return // CHECK: } } diff --git a/flang/test/Fir/embox-char.fir b/flang/test/Fir/embox-char.fir index efb069f96520d..8e40acfdf289f 100644 --- a/flang/test/Fir/embox-char.fir +++ b/flang/test/Fir/embox-char.fir @@ -45,9 +45,7 @@ // CHECK: %[[VAL_30:.*]] = llvm.load %[[VAL_29]] : !llvm.ptr -> i64 // CHECK: %[[VAL_31:.*]] = llvm.sdiv %[[VAL_16]], %[[VAL_13]] : i64 // CHECK: %[[VAL_32:.*]] = llvm.mlir.constant(44 : i32) : i32 -// CHECK: %[[VAL_33:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_34:.*]] = llvm.getelementptr %[[VAL_33]][1] : (!llvm.ptr) -> !llvm.ptr, i32 -// CHECK: %[[VAL_35:.*]] = llvm.ptrtoint %[[VAL_34]] : !llvm.ptr to i64 +// CHECK: %[[VAL_35:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[VAL_36:.*]] = llvm.mul %[[VAL_35]], %[[VAL_31]] : i64 // CHECK: %[[VAL_37:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> // CHECK: %[[VAL_38:.*]] = llvm.insertvalue %[[VAL_36]], %[[VAL_37]][1] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> @@ -139,9 +137,7 @@ func.func @test_char4(%arg0: 
!fir.ref !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> // CHECK: %[[VAL_29:.*]] = llvm.load %[[VAL_28]] : !llvm.ptr -> i64 // CHECK: %[[VAL_30:.*]] = llvm.mlir.constant(40 : i32) : i32 -// CHECK: %[[VAL_31:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_32:.*]] = llvm.getelementptr %[[VAL_31]][1] : (!llvm.ptr) -> !llvm.ptr, i8 -// CHECK: %[[VAL_33:.*]] = llvm.ptrtoint %[[VAL_32]] : !llvm.ptr to i64 +// CHECK: %[[VAL_33:.*]] = llvm.mlir.constant(1 : i64) : i64 // CHECK: %[[VAL_34:.*]] = llvm.mul %[[VAL_33]], %[[VAL_15]] : i64 // CHECK: %[[VAL_35:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> // CHECK: %[[VAL_36:.*]] = llvm.insertvalue %[[VAL_34]], %[[VAL_35]][1] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> diff --git a/flang/test/Fir/embox-substring.fir b/flang/test/Fir/embox-substring.fir index f2042f9bda7fc..6ce6346f89b1d 100644 --- a/flang/test/Fir/embox-substring.fir +++ b/flang/test/Fir/embox-substring.fir @@ -29,10 +29,9 @@ func.func private @dump(!fir.box>>) // CHECK-SAME: %[[VAL_0:.*]]: !llvm.ptr, // CHECK-SAME: %[[VAL_1:.*]]: i64) { // CHECK: %[[VAL_5:.*]] = llvm.mlir.constant(1 : index) : i64 -// CHECK: llvm.getelementptr -// CHECK: %[[VAL_28:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_29:.*]] = llvm.getelementptr %[[VAL_28]][1] : (!llvm.ptr) -> !llvm.ptr, i8 -// CHECK: %[[VAL_30:.*]] = llvm.ptrtoint %[[VAL_29]] : !llvm.ptr to i64 +// CHECK: llvm.mlir.constant(1 : i64) : i64 +// CHECK: llvm.mlir.constant(1 : i64) : i64 +// CHECK: %[[VAL_30:.*]] = llvm.mlir.constant(1 : i64) : i64 // CHECK: %[[VAL_31:.*]] = llvm.mul %[[VAL_30]], %[[VAL_1]] : i64 // CHECK: %[[VAL_42:.*]] = llvm.mul %[[VAL_31]], %[[VAL_5]] : i64 // CHECK: %[[VAL_43:.*]] = llvm.insertvalue %[[VAL_42]], %{{.*}}[7, 0, 2] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> From flang-commits at lists.llvm.org Fri May 16 03:41:37 2025 From: 
flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 16 May 2025 03:41:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang] use DataLayout instead of GEP to compute element size (PR #140235) In-Reply-To: Message-ID: <68271661.170a0220.3a7a11.4e40@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-codegen @llvm/pr-subscribers-flang-fir-hlfir Author: None (jeanPerier)
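The patch in this message computes the distance between adjacent elements as the type size rounded up to the ABI alignment, via `llvm::alignTo(size, alignment)` on values queried from `mlir::DataLayout`. A minimal standalone sketch of that arithmetic follows. The helper name `elementDistance` is hypothetical and not part of the patch, and the f80 size and alignment are assumed figures for an x86-64-like target; the other two cases mirror constants expected by the updated tests.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::alignTo(size, alignment): round
// sizeInBytes up to the next multiple of abiAlignment (must be non-zero).
constexpr std::uint64_t elementDistance(std::uint64_t sizeInBytes,
                                        std::uint64_t abiAlignment) {
  return (sizeInBytes + abiAlignment - 1) / abiAlignment * abiAlignment;
}

// !llvm.struct<(i32, i32)>: 8 bytes, 4-byte aligned, so the stride is 8.
static_assert(elementDistance(8, 4) == 8, "struct of two i32");

// !llvm.struct<(array<9 x i32>)>: 36 bytes, 4-byte aligned, so the copy
// size is 36.
static_assert(elementDistance(36, 4) == 36, "array of nine i32");

// Assumed x86-64-like f80: 10 bytes of payload with 16-byte ABI alignment,
// so adjacent elements are 16 bytes apart rather than 10.
static_assert(elementDistance(10, 16) == 16, "f80 with 16-byte alignment");
```

The f80 case is the one the removed comment warned about: using the raw size alone would give 10 bytes where an array element actually occupies 16.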
Changes Now that the DataLayout is available in codegen, use it to generate type size constants instead of generating a GEP. This is needed to fold initializers of derived type arrays with descriptor components into ArrayAttr, which speeds up compilation; I will do that in a different patch. --- Patch is 24.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140235.diff 6 Files Affected: - (modified) flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h (+4) - (modified) flang/lib/Optimizer/CodeGen/CodeGen.cpp (+21-29) - (modified) flang/test/Fir/convert-to-llvm.fir (+14-40) - (modified) flang/test/Fir/copy-codegen.fir (+4-8) - (modified) flang/test/Fir/embox-char.fir (+2-6) - (modified) flang/test/Fir/embox-substring.fir (+3-4) ``````````diff diff --git a/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h b/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h index 53d16323beddf..7b1c14e4dfdc9 100644 --- a/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h +++ b/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h @@ -173,6 +173,10 @@ class ConvertFIRToLLVMPattern : public mlir::ConvertToLLVMPattern { this->getTypeConverter()); } + const mlir::DataLayout &getDataLayout() const { + return lowerTy().getDataLayout(); + } + void attachTBAATag(mlir::LLVM::AliasAnalysisOpInterface op, mlir::Type baseFIRType, mlir::Type accessFIRType, mlir::LLVM::GEPOp gep) const { diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index e534cfa5591c6..ad9119ba4a031 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -1043,22 +1043,12 @@ static mlir::SymbolRefAttr getMalloc(fir::AllocMemOp op, static mlir::Value computeElementDistance(mlir::Location loc, mlir::Type llvmObjectType, mlir::Type idxTy, - mlir::ConversionPatternRewriter &rewriter) { - // Note that we cannot use something like - // 
mlir::LLVM::getPrimitiveTypeSizeInBits() for the element type here. For - // example, it returns 10 bytes for mlir::Float80Type for targets where it - // occupies 16 bytes. Proper solution is probably to use - // mlir::DataLayout::getTypeABIAlignment(), but DataLayout is not being set - // yet (see llvm-project#57230). For the time being use the '(intptr_t)((type - // *)0 + 1)' trick for all types. The generated instructions are optimized - // into constant by the first pass of InstCombine, so it should not be a - // performance issue. - auto llvmPtrTy = ::getLlvmPtrType(llvmObjectType.getContext()); - auto nullPtr = rewriter.create(loc, llvmPtrTy); - auto gep = rewriter.create( - loc, llvmPtrTy, llvmObjectType, nullPtr, - llvm::ArrayRef{1}); - return rewriter.create(loc, idxTy, gep); + mlir::ConversionPatternRewriter &rewriter, + const mlir::DataLayout &dataLayout) { + llvm::TypeSize size = dataLayout.getTypeSize(llvmObjectType); + unsigned short alignment = dataLayout.getTypeABIAlignment(llvmObjectType); + std::int64_t distance = llvm::alignTo(size, alignment); + return genConstantIndex(loc, idxTy, rewriter, distance); } /// Return value of the stride in bytes between adjacent elements @@ -1066,10 +1056,10 @@ computeElementDistance(mlir::Location loc, mlir::Type llvmObjectType, /// \p idxTy integer type. static mlir::Value genTypeStrideInBytes(mlir::Location loc, mlir::Type idxTy, - mlir::ConversionPatternRewriter &rewriter, - mlir::Type llTy) { + mlir::ConversionPatternRewriter &rewriter, mlir::Type llTy, + const mlir::DataLayout &dataLayout) { // Create a pointer type and use computeElementDistance(). 
- return computeElementDistance(loc, llTy, idxTy, rewriter); + return computeElementDistance(loc, llTy, idxTy, rewriter, dataLayout); } namespace { @@ -1111,7 +1101,7 @@ struct AllocMemOpConversion : public fir::FIROpConversion { mlir::Value genTypeSizeInBytes(mlir::Location loc, mlir::Type idxTy, mlir::ConversionPatternRewriter &rewriter, mlir::Type llTy) const { - return computeElementDistance(loc, llTy, idxTy, rewriter); + return computeElementDistance(loc, llTy, idxTy, rewriter, getDataLayout()); } }; } // namespace @@ -1323,8 +1313,8 @@ struct EmboxCommonConversion : public fir::FIROpConversion { fir::CharacterType charTy, mlir::ValueRange lenParams) const { auto i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); - mlir::Value size = - genTypeStrideInBytes(loc, i64Ty, rewriter, this->convertType(charTy)); + mlir::Value size = genTypeStrideInBytes( + loc, i64Ty, rewriter, this->convertType(charTy), this->getDataLayout()); if (charTy.hasConstantLen()) return size; // Length accounted for in the genTypeStrideInBytes GEP. // Otherwise, multiply the single character size by the length. 
@@ -1338,6 +1328,7 @@ struct EmboxCommonConversion : public fir::FIROpConversion { std::tuple getSizeAndTypeCode( mlir::Location loc, mlir::ConversionPatternRewriter &rewriter, mlir::Type boxEleTy, mlir::ValueRange lenParams = {}) const { + const mlir::DataLayout &dataLayout = this->getDataLayout(); auto i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); if (auto eleTy = fir::dyn_cast_ptrEleTy(boxEleTy)) boxEleTy = eleTy; @@ -1354,18 +1345,19 @@ struct EmboxCommonConversion : public fir::FIROpConversion { mlir::dyn_cast(boxEleTy) || fir::isa_real(boxEleTy) || fir::isa_complex(boxEleTy)) return {genTypeStrideInBytes(loc, i64Ty, rewriter, - this->convertType(boxEleTy)), + this->convertType(boxEleTy), dataLayout), typeCodeVal}; if (auto charTy = mlir::dyn_cast(boxEleTy)) return {getCharacterByteSize(loc, rewriter, charTy, lenParams), typeCodeVal}; if (fir::isa_ref_type(boxEleTy)) { auto ptrTy = ::getLlvmPtrType(rewriter.getContext()); - return {genTypeStrideInBytes(loc, i64Ty, rewriter, ptrTy), typeCodeVal}; + return {genTypeStrideInBytes(loc, i64Ty, rewriter, ptrTy, dataLayout), + typeCodeVal}; } if (mlir::isa(boxEleTy)) return {genTypeStrideInBytes(loc, i64Ty, rewriter, - this->convertType(boxEleTy)), + this->convertType(boxEleTy), dataLayout), typeCodeVal}; fir::emitFatalError(loc, "unhandled type in fir.box code generation"); } @@ -1909,8 +1901,8 @@ struct XEmboxOpConversion : public EmboxCommonConversion { if (hasSubcomp) { // We have a subcomponent. The step value needs to be the number of // bytes per element (which is a derived type). - prevDimByteStride = - genTypeStrideInBytes(loc, i64Ty, rewriter, convertType(seqEleTy)); + prevDimByteStride = genTypeStrideInBytes( + loc, i64Ty, rewriter, convertType(seqEleTy), getDataLayout()); } else if (hasSubstr) { // We have a substring. The step value needs to be the number of bytes // per CHARACTER element. 
@@ -3604,8 +3596,8 @@ struct CopyOpConversion : public fir::FIROpConversion { mlir::Value llvmDestination = adaptor.getDestination(); mlir::Type i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); mlir::Type copyTy = fir::unwrapRefType(copy.getSource().getType()); - mlir::Value copySize = - genTypeStrideInBytes(loc, i64Ty, rewriter, convertType(copyTy)); + mlir::Value copySize = genTypeStrideInBytes( + loc, i64Ty, rewriter, convertType(copyTy), getDataLayout()); mlir::LLVM::AliasAnalysisOpInterface newOp; if (copy.getNoOverlap()) diff --git a/flang/test/Fir/convert-to-llvm.fir b/flang/test/Fir/convert-to-llvm.fir index 2960528fb6c24..6d8a8bb606b90 100644 --- a/flang/test/Fir/convert-to-llvm.fir +++ b/flang/test/Fir/convert-to-llvm.fir @@ -216,9 +216,7 @@ func.func @test_alloc_and_freemem_one() { } // CHECK-LABEL: llvm.func @test_alloc_and_freemem_one() { -// CHECK-NEXT: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK-NEXT: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK-NEXT: %[[N:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[N:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK-NEXT: llvm.call @malloc(%[[N]]) // CHECK: llvm.call @free(%{{.*}}) // CHECK-NEXT: llvm.return @@ -235,10 +233,8 @@ func.func @test_alloc_and_freemem_several() { } // CHECK-LABEL: llvm.func @test_alloc_and_freemem_several() { -// CHECK: [[NULL:%.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: [[PTR:%.*]] = llvm.getelementptr [[NULL]][{{.*}}] : (!llvm.ptr) -> !llvm.ptr, !llvm.array<100 x f32> -// CHECK: [[N:%.*]] = llvm.ptrtoint [[PTR]] : !llvm.ptr to i64 -// CHECK: [[MALLOC:%.*]] = llvm.call @malloc([[N]]) +// CHECK: %[[N:.*]] = llvm.mlir.constant(400 : i64) : i64 +// CHECK: [[MALLOC:%.*]] = llvm.call @malloc(%[[N]]) // CHECK: llvm.call @free([[MALLOC]]) // CHECK: llvm.return @@ -251,9 +247,7 @@ func.func @test_with_shape(%ncols: index, %nrows: index) { // CHECK-LABEL: llvm.func @test_with_shape // CHECK-SAME: %[[NCOLS:.*]]: i64, %[[NROWS:.*]]: i64 -// CHECK: 
%[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[FOUR:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[FOUR:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[DIM1_SIZE:.*]] = llvm.mul %[[FOUR]], %[[NCOLS]] : i64 // CHECK: %[[TOTAL_SIZE:.*]] = llvm.mul %[[DIM1_SIZE]], %[[NROWS]] : i64 // CHECK: %[[MEM:.*]] = llvm.call @malloc(%[[TOTAL_SIZE]]) @@ -269,9 +263,7 @@ func.func @test_string_with_shape(%len: index, %nelems: index) { // CHECK-LABEL: llvm.func @test_string_with_shape // CHECK-SAME: %[[LEN:.*]]: i64, %[[NELEMS:.*]]: i64) -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ONE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ONE:.*]] = llvm.mlir.constant(1 : i64) : i64 // CHECK: %[[LEN_SIZE:.*]] = llvm.mul %[[ONE]], %[[LEN]] : i64 // CHECK: %[[TOTAL_SIZE:.*]] = llvm.mul %[[LEN_SIZE]], %[[NELEMS]] : i64 // CHECK: %[[MEM:.*]] = llvm.call @malloc(%[[TOTAL_SIZE]]) @@ -1654,9 +1646,7 @@ func.func @embox0(%arg0: !fir.ref>) { // AMDGPU: %[[AA:.*]] = llvm.alloca %[[C1]] x !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> {alignment = 8 : i64} : (i32) -> !llvm.ptr<5> // AMDGPU: %[[ALLOCA:.*]] = llvm.addrspacecast %[[AA]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[I64_ELEM_SIZE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[I64_ELEM_SIZE:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[DESC:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> // CHECK: %[[DESC0:.*]] = llvm.insertvalue %[[I64_ELEM_SIZE]], %[[DESC]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> // CHECK: %[[CFI_VERSION:.*]] = 
llvm.mlir.constant(20240719 : i32) : i32 @@ -1879,9 +1869,7 @@ func.func @xembox0(%arg0: !fir.ref>) { // AMDGPU: %[[ALLOCA:.*]] = llvm.addrspacecast %[[AA]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[TYPE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -1933,9 +1921,7 @@ func.func @xembox0_i32(%arg0: !fir.ref>) { // CHECK: %[[C0_I32:.*]] = llvm.mlir.constant(0 : i32) : i32 // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[TYPE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -1988,9 +1974,7 @@ func.func @xembox1(%arg0: !fir.ref>>) { // CHECK-LABEL: llvm.func @xembox1(%{{.*}}: !llvm.ptr) { // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : 
i64) : i64 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(10 : i64) : i64 // CHECK: %{{.*}} = llvm.insertvalue %[[ELEM_LEN_I64]], %{{.*}}[1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[PREV_PTROFF:.*]] = llvm.mul %[[ELEM_LEN_I64]], %[[C0]] : i64 @@ -2042,9 +2026,7 @@ func.func private @_QPxb(!fir.box>) // AMDGPU: %[[AR:.*]] = llvm.alloca %[[ARR_SIZE]] x f64 {bindc_name = "arr"} : (i64) -> !llvm.ptr<5> // AMDGPU: %[[ARR:.*]] = llvm.addrspacecast %[[AR]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(28 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(8 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<2 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<2 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -2126,9 +2108,7 @@ func.func private @_QPtest_dt_callee(%arg0: !fir.box>) // CHECK: %[[C10:.*]] = llvm.mlir.constant(10 : i64) : i64 // CHECK: %[[C2:.*]] = llvm.mlir.constant(2 : i64) : i64 // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = 
llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -2146,9 +2126,7 @@ func.func private @_QPtest_dt_callee(%arg0: !fir.box>) // CHECK: %[[BOX6:.*]] = llvm.insertvalue %[[F18ADDENDUM_I8]], %[[BOX5]][6] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[ZERO:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[ONE:.*]] = llvm.mlir.constant(1 : i64) : i64 -// CHECK: %[[ELE_TYPE:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP_DTYPE_SIZE:.*]] = llvm.getelementptr %[[ELE_TYPE]][1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"_QFtest_dt_sliceTt", (i32, i32)> -// CHECK: %[[PTRTOINT_DTYPE_SIZE:.*]] = llvm.ptrtoint %[[GEP_DTYPE_SIZE]] : !llvm.ptr to i64 +// CHECK: %[[PTRTOINT_DTYPE_SIZE:.*]] = llvm.mlir.constant(8 : i64) : i64 // CHECK: %[[ADJUSTED_OFFSET:.*]] = llvm.sub %[[C1]], %[[ONE]] : i64 // CHECK: %[[EXT_SUB:.*]] = llvm.sub %[[C10]], %[[C1]] : i64 // CHECK: %[[EXT_ADD:.*]] = llvm.add %[[EXT_SUB]], %[[C2]] : i64 @@ -2429,9 +2407,7 @@ func.func @test_rebox_1(%arg0: !fir.box>) { //CHECK: %[[SIX:.*]] = llvm.mlir.constant(6 : index) : i64 //CHECK: %[[EIGHTY:.*]] = llvm.mlir.constant(80 : index) : i64 //CHECK: %[[FLOAT_TYPE:.*]] = llvm.mlir.constant(27 : i32) : i32 -//CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -//CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -//CHECK: %[[ELEM_SIZE_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +//CHECK: %[[ELEM_SIZE_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 //CHECK: %[[EXTRA_GEP:.*]] = llvm.getelementptr %[[ARG0]][0, 6] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> //CHECK: %[[EXTRA:.*]] 
= llvm.load %[[EXTRA_GEP]] : !llvm.ptr -> i8 //CHECK: %[[RBOX:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> @@ -2504,9 +2480,7 @@ func.func @foo(%arg0: !fir.box} //CHECK: %[[COMPONENT_OFFSET_1:.*]] = llvm.mlir.constant(1 : i64) : i64 //CHECK: %[[ELEM_COUNT:.*]] = llvm.mlir.constant(7 : i64) : i64 //CHECK: %[[TYPE_CHAR:.*]] = llvm.mlir.constant(40 : i32) : i32 -//CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -//CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -//CHECK: %[[CHAR_SIZE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +//CHECK: %[[CHAR_SIZE:.*]] = llvm.mlir.constant(1 : i64) : i64 //CHECK: %[[ELEM_SIZE:.*]] = llvm.mul %[[CHAR_SIZE]], %[[ELEM_COUNT]] //CHECK: %[[EXTRA_GEP:.*]] = llvm.getelementptr %[[ARG0]][0, 6] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>, ptr, array<1 x i64>)> //CHECK: %[[EXTRA:.*]] = llvm.load %[[EXTRA_GEP]] : !llvm.ptr -> i8 diff --git a/flang/test/Fir/copy-codegen.fir b/flang/test/Fir/copy-codegen.fir index eef1885c6a49c..7b0620ca2d312 100644 --- a/flang/test/Fir/copy-codegen.fir +++ b/flang/test/Fir/copy-codegen.fir @@ -12,10 +12,8 @@ func.func @test_copy_1(%arg0: !fir.ref, %arg1: !fir.ref) { // CHECK-LABEL: llvm.func @test_copy_1( // CHECK-SAME: %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr, // CHECK-SAME: %[[VAL_1:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr) { -// CHECK: %[[VAL_2:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_3:.*]] = llvm.getelementptr %[[VAL_2]][1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"sometype", (array<9 x i32>)> -// CHECK: %[[VAL_4:.*]] = llvm.ptrtoint %[[VAL_3]] : !llvm.ptr to i64 -// CHECK: "llvm.intr.memcpy"(%[[VAL_1]], %[[VAL_0]], %[[VAL_4]]) <{isVolatile = false}> : (!llvm.ptr, !llvm.ptr, i64) -> () +// CHECK: %[[VAL_2:.*]] = llvm.mlir.constant(36 : i64) : i64 +// CHECK: "llvm.intr.memcpy"(%[[VAL_1]], %[[VAL_0]], %[[VAL_2]]) <{isVolatile = false}> : (!llvm.ptr, 
!llvm.ptr, i64) -> () // CHECK: llvm.return // CHECK: } @@ -26,10 +24,8 @@ func.func @test_copy_2(%arg0: !fir.ref, %arg1: !fir.ref) { // CHECK-LABEL: llvm.func @test_copy_2( // CHECK-SAME: %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr, // CHECK-SAME: %[[VAL_1:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr) { -// CHECK: %[[VAL_2:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_3:.... [truncated] ``````````
https://github.com/llvm/llvm-project/pull/140235 From flang-commits at lists.llvm.org Fri May 16 04:51:23 2025 From: flang-commits at lists.llvm.org (Carlo Bramini via flang-commits) Date: Fri, 16 May 2025 04:51:23 -0700 (PDT) Subject: [flang-commits] [flang] [flang][CMake] CYGWIN: Fix undefined references at link time. (PR #67105) In-Reply-To: Message-ID: <682726bb.170a0220.243f9d.680a@mx.google.com> https://github.com/carlo-bramini updated https://github.com/llvm/llvm-project/pull/67105

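The test updates in PR #140235 replace the null-pointer GEP plus ptrtoint sequence with plain constants (4 for an i32 element, 8 for f64 and for a struct of two i32, 36 for a struct wrapping array<9 x i32>). The sketch below is a toy model of how such sizes follow from a layout description. The names `TypeInfo`, `arrayOf` and `structOf` are hypothetical and are not the LLVM or MLIR DataLayout API; the alignments are assumptions typical of common 64-bit targets.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a data-layout query (hypothetical, not the LLVM API):
// each type carries its size and ABI alignment in bytes.
struct TypeInfo {
  std::uint64_t size;
  std::uint64_t align;
};

// Assumed sizes and alignments for a typical 64-bit target.
constexpr TypeInfo kI8{1, 1};
constexpr TypeInfo kI32{4, 4};
constexpr TypeInfo kF64{8, 8};

// Round value up to the next multiple of align.
constexpr std::uint64_t alignTo(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) / align * align;
}

// Fixed-size array: elements are laid out back to back.
constexpr TypeInfo arrayOf(TypeInfo elem, std::uint64_t count) {
  return {elem.size * count, elem.align};
}

// Non-packed struct: each member starts at the next offset aligned for it,
// and the total size is rounded up to the largest member alignment.
inline TypeInfo structOf(const std::vector<TypeInfo> &members) {
  std::uint64_t offset = 0;
  std::uint64_t align = 1;
  for (const TypeInfo &member : members) {
    offset = alignTo(offset, member.align) + member.size;
    if (member.align > align)
      align = member.align;
  }
  return {alignTo(offset, align), align};
}
```

With this model, `structOf({kI32, kI32}).size` gives 8 and `structOf({arrayOf(kI32, 9)}).size` gives 36, matching the constants in the updated CHECK lines. A real compiler would take these numbers from the target's data layout rather than from hard-coded tables.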
From flang-commits at lists.llvm.org Fri May 16 05:58:51 2025 From: flang-commits at lists.llvm.org (Sebastian Pop via flang-commits) Date: Fri, 16 May 2025 05:58:51 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <6827368b.630a0220.3a3474.1816@mx.google.com> ================ @@ -421,7 +421,8 @@ static void CheckSubscripts( static void CheckSubscripts( semantics::SemanticsContext &context, CoarrayRef &ref) { - const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; + const auto &base = ref.GetBase(); + const Symbol &coarraySymbol{base.GetLastSymbol()}; ---------------- sebpop wrote: This has been submitted separately https://github.com/llvm/llvm-project/pull/138793 Without this change I cannot build flang on arm64-linux ubuntu 24.04 machine. https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Fri May 16 06:31:58 2025 From: flang-commits at lists.llvm.org (Sebastian Pop via flang-commits) Date: Fri, 16 May 2025 06:31:58 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <68273e4e.170a0220.145166.695a@mx.google.com> sebpop wrote: > Do you have any compilation time and performance data? @madhur13490 did several changes to loop interchange to optimize the overall compilation time with the pass. I believe Madhur has only looked at c/c++ benchmarks and not at how loop interchange would impact flang. I think that if compilation time is good for c/c++, it should also be good for fortran. On the perf side, I was looking if we can already catch swim from cpu2000, and that fails with not enough data to infer number of iterations. I will be working on adding assume (N < 1335) based on analyzing array decls and infer loop bounds. 
https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Fri May 16 06:56:55 2025 From: flang-commits at lists.llvm.org (Jean-Didier PAILLEUX via flang-commits) Date: Fri, 16 May 2025 06:56:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ [NO]INLINE and FORCEINLINE directives (PR #134350) In-Reply-To: Message-ID: <68274427.630a0220.212f7.c58e@mx.google.com> JDPailleux wrote: No worries, I probably misspoke too and didn't point out those PRs. I don't see what's blocking this patch; nothing on my side. There was a little debate about the naming of the directives (the renaming followed the review comments asking to match Intel's names; maybe that is where the confusion came from). > How are those new MLIR attributes translated into LLVM IR? When operations are converted from the MLIR dialect to LLVM IR, we check for the presence of these attributes on the `mlir::LLVM::CallOp` operation. If so, we add the corresponding LLVM attributes (e.g. `llvm::Attribute::AlwaysInline`, `llvm::Attribute::NoInline`) to the new operation (`llvm::CallInst`). https://github.com/llvm/llvm-project/pull/134350 From flang-commits at lists.llvm.org Fri May 16 06:57:28 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 16 May 2025 06:57:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow flush of common block (PR #139528) In-Reply-To: Message-ID: <68274448.630a0220.221a3c.cdc2@mx.google.com> ================ @@ -2304,13 +2304,6 @@ void OmpStructureChecker::Leave(const parser::OpenMPFlushConstruct &x) { auto &flushList{std::get>(x.v.t)}; if (flushList) { - for (const parser::OmpArgument &arg : flushList->v) { - if (auto *sym{GetArgumentSymbol(arg)}; sym && !IsVariableListItem(*sym)) { ---------------- kiranchandramohan wrote: I wonder whether the following is the correct fix here. 
``` bool OmpStructureChecker::IsVariableListItem(const Symbol &sym) { - return evaluate::IsVariable(sym) || sym.attrs().test(Attr::POINTER); + return evaluate::IsVariable(sym) || sym.attrs().test(Attr::POINTER) || + sym.detailsIf(); } ``` https://github.com/llvm/llvm-project/pull/139528 From flang-commits at lists.llvm.org Fri May 16 07:15:49 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 16 May 2025 07:15:49 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <68274895.a70a0220.34f0fc.ae6e@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852

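The `CheckSubscripts` change quoted in the -floop-interchange thread (submitted separately as PR #138793) names the result of `ref.GetBase()` before calling `GetLastSymbol()` on it. The sketch below reproduces the pattern with hypothetical stand-in classes (`Symbol`, `Entity`, `Ref`), assuming that `GetBase()` returns by value, as the GCC note about a destroyed temporary suggests. It is not the flang code.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; the real flang classes are more involved.
struct Symbol {
  std::string name;
};

class Entity {
public:
  explicit Entity(const Symbol &symbol) : symbol_{&symbol} {}
  // Returns a reference to a symbol that is owned elsewhere.
  const Symbol &GetLastSymbol() const { return *symbol_; }

private:
  const Symbol *symbol_;
};

class Ref {
public:
  explicit Ref(const Symbol &symbol) : symbol_{&symbol} {}
  // Assumption: returns by value, so every call yields a temporary.
  Entity GetBase() const { return Entity{*symbol_}; }

private:
  const Symbol *symbol_;
};

inline std::string lastSymbolName(const Ref &ref) {
  // Writing `const Symbol &s{ref.GetBase().GetLastSymbol()};` would bind a
  // reference obtained through a temporary Entity, which GCC 13 reports as
  // "possibly dangling". Binding the temporary to a named const reference
  // extends its lifetime to the end of this scope.
  const auto &base{ref.GetBase()};
  const Symbol &symbol{base.GetLastSymbol()};
  return symbol.name;
}
```

In this toy version the one-line form would still work at run time, because the `Symbol` lives outside the temporary; the warning is conservative. The named `base` makes the lifetime explicit and lets the code build with `-Werror=dangling-reference`.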
From flang-commits at lists.llvm.org Fri May 16 07:18:33 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 16 May 2025 07:18:33 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <68274939.170a0220.1609a2.ab6a@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command: ``````````bash git-clang-format --diff HEAD~1 HEAD --extensions cpp,h -- flang/examples/FeatureList/FeatureList.cpp flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp flang/include/flang/Parser/dump-parse-tree.h flang/include/flang/Parser/parse-tree.h flang/include/flang/Semantics/dump-expr.h flang/include/flang/Semantics/tools.h flang/lib/Lower/OpenMP/DataSharingProcessor.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Parser/openmp-parsers.cpp flang/lib/Parser/parse-tree.cpp flang/lib/Parser/unparse.cpp flang/lib/Semantics/check-omp-structure.cpp flang/lib/Semantics/check-omp-structure.h flang/lib/Semantics/dump-expr.cpp flang/lib/Semantics/resolve-names.cpp flang/lib/Semantics/rewrite-directives.cpp ``````````
View the diff from clang-format here. ``````````diff diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b..c828a3440 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -39,13 +39,12 @@ public: } private: - template - struct TypeOf { + template struct TypeOf { static constexpr std::string_view name{TypeOf::get()}; static constexpr std::string_view get() { std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" + v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" return v; } }; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e..0f553541c 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } ``````````
https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Fri May 16 07:57:54 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 16 May 2025 07:57:54 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow flush of common block (PR #139528) In-Reply-To: Message-ID: <68275272.170a0220.21a721.01bc@mx.google.com> https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/139528

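The `TypeOf` helper in the clang-format diff for `dump-expr.h` extracts a type name from `__PRETTY_FUNCTION__` by removing a fixed-length prefix and suffix. The sketch below shows the same idea with a search for the `T = ` marker instead of hard-coded offsets. It is a variation for illustration, not the code in the PR; the function name `typeName` is hypothetical, and it assumes C++17 with GCC or Clang, since `__PRETTY_FUNCTION__` is a compiler extension.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: returns the spelling of T as the compiler prints it.
template <typename T> constexpr std::string_view typeName() {
  // GCC prints "... [with T = int; ...]", Clang prints "... [T = int]".
  std::string_view v{__PRETTY_FUNCTION__};
  constexpr std::string_view marker{"T = "};
  const auto start = v.find(marker);
  if (start == std::string_view::npos)
    return v; // Unknown format: return the full string unchanged.
  v.remove_prefix(start + marker.size());
  // The name ends at ';' (GCC) or ']' (Clang). This simple scan would cut
  // short a type whose own spelling contains one of those characters.
  const auto end = v.find_first_of(";]");
  if (end != std::string_view::npos)
    v = v.substr(0, end);
  return v;
}
```

Searching for the marker avoids depending on the exact length of the enclosing namespace and function names, at the cost of a small scan per instantiation. The exact spelling of a given type can still differ between compilers.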
From flang-commits at lists.llvm.org Fri May 16 08:11:22 2025 From: flang-commits at lists.llvm.org (Sebastian Pop via flang-commits) Date: Fri, 16 May 2025 08:11:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix Werror=dangling-reference (PR #138793) In-Reply-To: Message-ID: <6827559a.a70a0220.dc8e7.1a57@mx.google.com> https://github.com/sebpop edited https://github.com/llvm/llvm-project/pull/138793 From flang-commits at lists.llvm.org Fri May 16 08:34:31 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 16 May 2025 08:34:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix Werror=dangling-reference (PR #138793) In-Reply-To: Message-ID: <68275b07.050a0220.196fe4.17b1@mx.google.com> ================ @@ -421,7 +421,8 @@ static void CheckSubscripts( static void CheckSubscripts( semantics::SemanticsContext &context, CoarrayRef &ref) { - const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; + const auto &base = ref.GetBase(); ---------------- klausler wrote: Please use braced initialization here. https://github.com/llvm/llvm-project/pull/138793 From flang-commits at lists.llvm.org Fri May 16 08:34:31 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 16 May 2025 08:34:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix Werror=dangling-reference (PR #138793) In-Reply-To: Message-ID: <68275b07.170a0220.4eadb.1829@mx.google.com> https://github.com/klausler edited https://github.com/llvm/llvm-project/pull/138793 From flang-commits at lists.llvm.org Fri May 16 08:34:31 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 16 May 2025 08:34:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix Werror=dangling-reference (PR #138793) In-Reply-To: Message-ID: <68275b07.630a0220.ac96c.5aca@mx.google.com> https://github.com/klausler approved this pull request. Please use braced initialization; otherwise LGTM, and thanks. 
https://github.com/llvm/llvm-project/pull/138793 From flang-commits at lists.llvm.org Fri May 16 08:39:29 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 16 May 2025 08:39:29 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <68275c31.170a0220.ad5b9.c6e0@mx.google.com> https://github.com/tarunprabhu edited https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Fri May 16 08:39:29 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 16 May 2025 08:39:29 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <68275c31.a70a0220.25fcb6.282f@mx.google.com> ================ @@ -421,7 +421,8 @@ static void CheckSubscripts( static void CheckSubscripts( semantics::SemanticsContext &context, CoarrayRef &ref) { - const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; + const auto &base = ref.GetBase(); + const Symbol &coarraySymbol{base.GetLastSymbol()}; ---------------- tarunprabhu wrote: I requested a review for #138793. It's probably best to proceed with this after that has been merged. https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Fri May 16 08:39:29 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 16 May 2025 08:39:29 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <68275c31.050a0220.f520c.1c1d@mx.google.com> https://github.com/tarunprabhu commented: Could you add a test that ensures that the loop-interchange pass is added to the pipeline. 
Perhaps something like [flang/test/Driver/slp-vectorize.f90](https://github.com/llvm/llvm-project/blob/04fde85057cb4da2e560da629df7a52702eac489/flang/test/Driver/slp-vectorize.f90#L9) https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Fri May 16 10:22:45 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 16 May 2025 10:22:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang] use DataLayout instead of GEP to compute element size (PR #140235) In-Reply-To: Message-ID: <68277465.170a0220.1480f9.eacb@mx.google.com> https://github.com/vzakhari approved this pull request. Looks great! https://github.com/llvm/llvm-project/pull/140235 From flang-commits at lists.llvm.org Fri May 16 19:50:29 2025 From: flang-commits at lists.llvm.org (Sebastian Pop via flang-commits) Date: Fri, 16 May 2025 19:50:29 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <6827f975.a70a0220.210127.7f3b@mx.google.com> https://github.com/sebpop updated https://github.com/llvm/llvm-project/pull/140182 >From b0a6935e8439bc5b4f742f55eb3bb090790a8f95 Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Wed, 7 May 2025 01:14:49 +0000 Subject: [PATCH 1/4] [flang] fix Werror=dangling-reference MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit when compiling with g++ 13.3.0 flang build fails with: llvm-project/flang/lib/Semantics/expression.cpp:424:17: error: possibly dangling reference to a temporary [-Werror=dangling-reference] 424 | const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; | ^~~~~~~~~~~~~ llvm-project/flang/lib/Semantics/expression.cpp:424:58: note: the temporary was destroyed at the end of the full expression ‘Fortran::evaluate::CoarrayRef::GetBase() const().Fortran::evaluate::NamedEntity::GetLastSymbol()’ 424 | const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ Keep the base in a temporary variable to make sure it is not deleted. --- flang/lib/Semantics/expression.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index e139bda7e4950..35eb7b61429fb 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -421,7 +421,8 @@ static void CheckSubscripts( static void CheckSubscripts( semantics::SemanticsContext &context, CoarrayRef &ref) { - const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; + const auto &base = ref.GetBase(); + const Symbol &coarraySymbol{base.GetLastSymbol()}; Shape lb, ub; if (FoldSubscripts(context, coarraySymbol, ref.subscript(), lb, ub)) { ValidateSubscripts(context, coarraySymbol, ref.subscript(), lb, ub); >From c6d051a2b4239e1fe78e1d4483b500b129956867 Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Mon, 12 May 2025 21:56:03 +0000 Subject: [PATCH 2/4] [flang] add -floop-interchange to flang driver This patch allows flang to recognize the flags -floop-interchange and -fno-loop-interchange. -floop-interchange adds the loop interchange pass to the pass pipeline. 
--- clang/include/clang/Driver/Options.td | 4 ++-- clang/lib/Driver/ToolChains/Flang.cpp | 3 +++ flang/include/flang/Frontend/CodeGenOptions.def | 1 + flang/lib/Frontend/CompilerInvocation.cpp | 3 +++ flang/lib/Frontend/FrontendActions.cpp | 1 + flang/test/Driver/loop-interchange.f90 | 7 +++++++ 6 files changed, 17 insertions(+), 2 deletions(-) create mode 100644 flang/test/Driver/loop-interchange.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 11677626dbf1f..287a00863bb35 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -4141,9 +4141,9 @@ def ftrap_function_EQ : Joined<["-"], "ftrap-function=">, Group, HelpText<"Issue call to specified function rather than a trap instruction">, MarshallingInfoString>; def floop_interchange : Flag<["-"], "floop-interchange">, Group, - HelpText<"Enable the loop interchange pass">, Visibility<[ClangOption, CC1Option]>; + HelpText<"Enable the loop interchange pass">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; def fno_loop_interchange: Flag<["-"], "fno-loop-interchange">, Group, - HelpText<"Disable the loop interchange pass">, Visibility<[ClangOption, CC1Option]>; + HelpText<"Disable the loop interchange pass">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; def funroll_loops : Flag<["-"], "funroll-loops">, Group, HelpText<"Turn on loop unroller">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; def fno_unroll_loops : Flag<["-"], "fno-unroll-loops">, Group, diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index b1ca747e68b89..c6c7a0b75a987 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -152,6 +152,9 @@ void Flang::addCodegenOptions(const ArgList &Args, !stackArrays->getOption().matches(options::OPT_fno_stack_arrays)) CmdArgs.push_back("-fstack-arrays"); + Args.AddLastArg(CmdArgs, 
options::OPT_floop_interchange, + options::OPT_fno_loop_interchange); + handleVectorizeLoopsArgs(Args, CmdArgs); handleVectorizeSLPArgs(Args, CmdArgs); diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index d9dbd274e83e5..7ced60f512219 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -35,6 +35,7 @@ CODEGENOPT(PrepareForThinLTO , 1, 0) ///< Set when -flto=thin is enabled on the CODEGENOPT(StackArrays, 1, 0) ///< -fstack-arrays (enable the stack-arrays pass) CODEGENOPT(VectorizeLoop, 1, 0) ///< Enable loop vectorization. CODEGENOPT(VectorizeSLP, 1, 0) ///< Enable SLP vectorization. +CODEGENOPT(InterchangeLoops, 1, 0) ///< Enable loop interchange. CODEGENOPT(LoopVersioning, 1, 0) ///< Enable loop versioning. CODEGENOPT(UnrollLoops, 1, 0) ///< Enable loop unrolling CODEGENOPT(AliasAnalysis, 1, 0) ///< Enable alias analysis pass diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 28f2f69f23baf..0bdbb616136f1 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -269,6 +269,9 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, clang::driver::options::OPT_fno_stack_arrays, false)) opts.StackArrays = 1; + if (args.getLastArg(clang::driver::options::OPT_floop_interchange)) + opts.InterchangeLoops = 1; + if (args.getLastArg(clang::driver::options::OPT_vectorize_loops)) opts.VectorizeLoop = 1; diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c1f47b12abee2..7c936ee23009d 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -915,6 +915,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (ci.isTimingEnabled()) si.getTimePasses().setOutStream(ci.getTimingStreamLLVM()); pto.LoopUnrolling = 
opts.UnrollLoops; + pto.LoopInterchange = opts.InterchangeLoops; pto.LoopInterleaving = opts.UnrollLoops; pto.LoopVectorization = opts.VectorizeLoop; pto.SLPVectorization = opts.VectorizeSLP; diff --git a/flang/test/Driver/loop-interchange.f90 b/flang/test/Driver/loop-interchange.f90 new file mode 100644 index 0000000000000..30ce2734d0466 --- /dev/null +++ b/flang/test/Driver/loop-interchange.f90 @@ -0,0 +1,7 @@ +! RUN: %flang -### -S -floop-interchange %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -fno-loop-interchange %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! CHECK-LOOP-INTERCHANGE: "-floop-interchange" +! CHECK-NO-LOOP-INTERCHANGE: "-fno-loop-interchange" + +program test +end program >From ad86b774f305df88c643ec85e470fcc44511d405 Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Fri, 16 May 2025 03:02:54 +0000 Subject: [PATCH 3/4] [flang] enable loop-interchange at O3, O2, and Os --- clang/lib/Driver/ToolChains/CommonArgs.cpp | 13 +++++++++++++ clang/lib/Driver/ToolChains/CommonArgs.h | 4 ++++ clang/lib/Driver/ToolChains/Flang.cpp | 4 +--- flang/test/Driver/loop-interchange.f90 | 8 +++++++- 4 files changed, 25 insertions(+), 4 deletions(-) diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp index e4bad39f8332a..89f4ebd519ebf 100644 --- a/clang/lib/Driver/ToolChains/CommonArgs.cpp +++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp @@ -3152,3 +3152,16 @@ void tools::handleVectorizeSLPArgs(const ArgList &Args, options::OPT_fno_slp_vectorize, EnableSLPVec)) CmdArgs.push_back("-vectorize-slp"); } + +void tools::handleInterchangeLoopsArgs(const ArgList &Args, + ArgStringList &CmdArgs) { + // FIXME: instead of relying on shouldEnableVectorizerAtOLevel, we may want to + // implement a separate function to infer loop interchange from opt level. + // For now, enable loop-interchange at the same opt levels as loop-vectorize. 
+ bool EnableInterch = shouldEnableVectorizerAtOLevel(Args, false); + OptSpecifier interchangeAliasOption = + EnableInterch ? options::OPT_O_Group : options::OPT_floop_interchange; + if (Args.hasFlag(options::OPT_floop_interchange, interchangeAliasOption, + options::OPT_fno_loop_interchange, EnableInterch)) + CmdArgs.push_back("-floop-interchange"); +} diff --git a/clang/lib/Driver/ToolChains/CommonArgs.h b/clang/lib/Driver/ToolChains/CommonArgs.h index 96bc0619dcbc0..6d36a0e8bf493 100644 --- a/clang/lib/Driver/ToolChains/CommonArgs.h +++ b/clang/lib/Driver/ToolChains/CommonArgs.h @@ -259,6 +259,10 @@ void renderCommonIntegerOverflowOptions(const llvm::opt::ArgList &Args, bool shouldEnableVectorizerAtOLevel(const llvm::opt::ArgList &Args, bool isSlpVec); +/// Enable -floop-interchange based on the optimization level selected. +void handleInterchangeLoopsArgs(const llvm::opt::ArgList &Args, + llvm::opt::ArgStringList &CmdArgs); + /// Enable -fvectorize based on the optimization level selected. void handleVectorizeLoopsArgs(const llvm::opt::ArgList &Args, llvm::opt::ArgStringList &CmdArgs); diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index c6c7a0b75a987..54176381b6e5b 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -152,9 +152,7 @@ void Flang::addCodegenOptions(const ArgList &Args, !stackArrays->getOption().matches(options::OPT_fno_stack_arrays)) CmdArgs.push_back("-fstack-arrays"); - Args.AddLastArg(CmdArgs, options::OPT_floop_interchange, - options::OPT_fno_loop_interchange); - + handleInterchangeLoopsArgs(Args, CmdArgs); handleVectorizeLoopsArgs(Args, CmdArgs); handleVectorizeSLPArgs(Args, CmdArgs); diff --git a/flang/test/Driver/loop-interchange.f90 b/flang/test/Driver/loop-interchange.f90 index 30ce2734d0466..d5d62e9a777d2 100644 --- a/flang/test/Driver/loop-interchange.f90 +++ b/flang/test/Driver/loop-interchange.f90 @@ -1,7 +1,13 @@ ! 
RUN: %flang -### -S -floop-interchange %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s ! RUN: %flang -### -S -fno-loop-interchange %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -O0 %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -O1 %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -O2 %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -O3 %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -Os %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -Oz %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s ! CHECK-LOOP-INTERCHANGE: "-floop-interchange" -! CHECK-NO-LOOP-INTERCHANGE: "-fno-loop-interchange" +! CHECK-NO-LOOP-INTERCHANGE-NOT: "-floop-interchange" program test end program >From dd3f7b2703f5502ede2f2b9de07d68a064beb110 Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Fri, 16 May 2025 21:46:04 +0000 Subject: [PATCH 4/4] test loop-interchange in pass pipeline --- flang/test/Driver/loop-interchange.f90 | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/flang/test/Driver/loop-interchange.f90 b/flang/test/Driver/loop-interchange.f90 index d5d62e9a777d2..5d3ec71c59874 100644 --- a/flang/test/Driver/loop-interchange.f90 +++ b/flang/test/Driver/loop-interchange.f90 @@ -8,6 +8,10 @@ ! RUN: %flang -### -S -Oz %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s ! CHECK-LOOP-INTERCHANGE: "-floop-interchange" ! CHECK-NO-LOOP-INTERCHANGE-NOT: "-floop-interchange" +! RUN: %flang_fc1 -emit-llvm -O2 -floop-interchange -mllvm -print-pipeline-passes -o /dev/null %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE-PASS %s +! RUN: %flang_fc1 -emit-llvm -O2 -fno-loop-interchange -mllvm -print-pipeline-passes -o /dev/null %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE-PASS %s +! 
CHECK-LOOP-INTERCHANGE-PASS: loop-interchange +! CHECK-NO-LOOP-INTERCHANGE-PASS-NOT: loop-interchange program test end program From flang-commits at lists.llvm.org Fri May 16 20:06:19 2025 From: flang-commits at lists.llvm.org (Sebastian Pop via flang-commits) Date: Fri, 16 May 2025 20:06:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix Werror=dangling-reference (PR #138793) In-Reply-To: Message-ID: <6827fd2b.050a0220.37e5f6.79f6@mx.google.com> https://github.com/sebpop updated https://github.com/llvm/llvm-project/pull/138793 >From b0a6935e8439bc5b4f742f55eb3bb090790a8f95 Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Wed, 7 May 2025 01:14:49 +0000 Subject: [PATCH 1/2] [flang] fix Werror=dangling-reference MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit when compiling with g++ 13.3.0 flang build fails with: llvm-project/flang/lib/Semantics/expression.cpp:424:17: error: possibly dangling reference to a temporary [-Werror=dangling-reference] 424 | const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; | ^~~~~~~~~~~~~ llvm-project/flang/lib/Semantics/expression.cpp:424:58: note: the temporary was destroyed at the end of the full expression ‘Fortran::evaluate::CoarrayRef::GetBase() const().Fortran::evaluate::NamedEntity::GetLastSymbol()’ 424 | const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; | ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ Keep the base in a temporary variable to make sure it is not deleted. 
--- flang/lib/Semantics/expression.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index e139bda7e4950..35eb7b61429fb 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -421,7 +421,8 @@ static void CheckSubscripts( static void CheckSubscripts( semantics::SemanticsContext &context, CoarrayRef &ref) { - const Symbol &coarraySymbol{ref.GetBase().GetLastSymbol()}; + const auto &base = ref.GetBase(); + const Symbol &coarraySymbol{base.GetLastSymbol()}; Shape lb, ub; if (FoldSubscripts(context, coarraySymbol, ref.subscript(), lb, ub)) { ValidateSubscripts(context, coarraySymbol, ref.subscript(), lb, ub); >From 9bbdf2168adde2293e8289b522b389755f57de14 Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Sat, 17 May 2025 03:05:56 +0000 Subject: [PATCH 2/2] use braced initialization --- flang/lib/Semantics/expression.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index 35eb7b61429fb..e5a9aef96c1c7 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -421,7 +421,7 @@ static void CheckSubscripts( static void CheckSubscripts( semantics::SemanticsContext &context, CoarrayRef &ref) { - const auto &base = ref.GetBase(); + const auto &base{ref.GetBase()}; const Symbol &coarraySymbol{base.GetLastSymbol()}; Shape lb, ub; if (FoldSubscripts(context, coarraySymbol, ref.subscript(), lb, ub)) { From flang-commits at lists.llvm.org Fri May 16 07:15:42 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 16 May 2025 07:15:42 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6827488e.170a0220.fc482.a5ed@mx.google.com> https://github.com/kparzysz updated 
https://github.com/llvm/llvm-project/pull/137852 >From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 07:58:29 -0500 Subject: [PATCH 01/16] [flang][OpenMP] Mark atomic clauses as unique The current implementation of the ATOMIC construct handles these clauses individually, and this change does not have an observable effect. At the same time these clauses are unique as per the OpenMP spec, and this patch reflects that in the OMP.td file. --- llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index eff6d57995d2b..cdfd3e3223fa8 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> { ]; } def OMP_Atomic : Directive<"atomic"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; let allowedOnceClauses = [ VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, + VersionedClause, ]; let association = AS_Block; let category = CA_Executable; @@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let category = CA_Executable; } def OMP_Critical : Directive<"critical"> { - let allowedClauses = [ + let allowedOnceClauses = [ VersionedClause, ]; let association = AS_Block; >From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 10:08:58 -0500 Subject: [PATCH 02/16] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs The OpenMP implementation of the ATOMIC construct will change in the near future to accommodate OpenMP 6.0. 
This patch separates the shared implementations to avoid interfering with OpenACC. --- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const 
parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. - auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite, - loc); + genAtomicWrite(converter, atomicWrite, loc); }, [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) { - Fortran::lower::genOmpAccAtomicUpdate< - Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate, - loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const Fortran::parser::AccAtomicCapture &atomicCapture) { - Fortran::lower::genOmpAccAtomicCapture< - Fortran::parser::AccAtomicCapture, void>(converter, - atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, }, atomicConstruct.u); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 312557d5da07e..fdd85e94829f3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +//===----------------------------------------------------------------------===// +// Code generation for atomic operations +//===----------------------------------------------------------------------===// + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. 
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/16] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
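For illustration only, the optional-argument handling described above can be sketched in standalone C++. This is not the flang code: `DependenceType`, `TaskDependenceType`, `UpdateArg`, `makeUpdate` and `describe` are hypothetical stand-ins, and the sketch only assumes that the clause argument becomes an optional value that is empty for ATOMIC and populated for DEPOBJ.

```cpp
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the two parser-side alternatives of the
// UPDATE argument; these are not the real flang types.
struct DependenceType { std::string kind; };
struct TaskDependenceType { std::string kind; };
using UpdateArg = std::variant<DependenceType, TaskDependenceType>;

// Clause-side result: the dependence type is absent for ATOMIC UPDATE.
struct Update { std::optional<std::string> depType; };

// Convert an optional parser-side argument into the clause-side form.
Update makeUpdate(const std::optional<UpdateArg> &arg) {
  if (arg) {
    // DEPOBJ form: visit whichever alternative is present.
    return std::visit([](const auto &s) { return Update{s.kind}; }, *arg);
  }
  // ATOMIC form: UPDATE without an argument.
  return Update{std::nullopt};
}

// Render the clause back to text, with or without its argument.
std::string describe(const Update &u) {
  if (u.depType)
    return "update(" + *u.depType + ")";
  return std::string("update");
}
```

With this shape, a consumer tests `depType` once instead of assuming an argument is always present, which is the property the commit message asks of the UPDATE definition.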
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/16] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be a parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause, have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having a `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed.
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 | 
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause.
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expecting two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture. 
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expecting single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point. 
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (begin != end && *begin == ' ') { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ...)" is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+ auto &dirSpec{std::get(x.t)}; + auto &clauseList{std::get>(dirSpec.t)}; if (clauseList) { - atomicDirectiveDefaultOrderFound_ = true; - switch (*defaultMemOrder) { - case common::OmpMemoryOrderType::Acq_Rel: - clauseList->emplace_back(common::visit( - common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause { - return parser::OmpClause::Acquire{}; - }, - [](parser::OmpAtomicCapture &) -> parser::OmpClause { - return parser::OmpClause::AcqRel{}; - }, - [](auto &) -> parser::OmpClause { - // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite} - return parser::OmpClause::Release{}; - }}, - x.u)); - break; - case common::OmpMemoryOrderType::Relaxed: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::Relaxed{}}); - break; - case common::OmpMemoryOrderType::Seq_Cst: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::SeqCst{}}); - break; - default: - // FIXME: Don't process other values at the moment since their validity - // depends on the OpenMP version (which is unavailable here). - break; + if (findMemOrderClause(*clauseList)) { + return false; } + } else { + clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){}); + } + + // Add a memory order clause to the atomic directive. + atomicDirectiveDefaultOrderFound_ = true; + switch (*defaultMemOrder) { + case common::OmpMemoryOrderType::Acq_Rel: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}}); + break; + case common::OmpMemoryOrderType::Relaxed: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}}); + break; + case common::OmpMemoryOrderType::Seq_Cst: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}}); + break; + default: + // FIXME: Don't process other values at the moment since their validity + // depends on the OpenMP version (which is unavailable here). 
+ break; } return false; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 index b82bd13622764..6f58e0939a787 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 index 88ec6fe910b9e..6729be6e5cf8b 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b8422c8f3b769 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture() !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr !CHECK: omp.atomic.capture { !CHECK: omp.atomic.update 
%[[VAL_A_BOX_ADDR]] : !fir.ptr { !CHECK: ^bb0(%[[ARG:.*]]: i32): !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32 !CHECK: omp.yield(%[[TEMP]] : i32) !CHECK: } -!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 +!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 !CHECK: } !CHECK: return !CHECK: } diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90 index 13392ad76471f..6eded49b0b15d 100644 --- a/flang/test/Lower/OpenMP/atomic-write.f90 +++ b/flang/test/Lower/OpenMP/atomic-write.f90 @@ -44,9 +44,9 @@ end program OmpAtomicWrite !CHECK-LABEL: func.func @_QPatomic_write_pointer() { !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"} !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) -!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32 !CHECK: %[[C2:.*]] = arith.constant 2 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90 deleted file mode 100644 index 5cd02698ff482..0000000000000 --- a/flang/test/Parser/OpenMP/atomic-compare.f90 +++ /dev/null @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s -! OpenMP version for documentation purposes only - it isn't used until Sema. -! This is testing for Parser errors that bail out before Sema. 
-program main - implicit none - integer :: i, j = 10 - logical :: r - - !CHECK: error: expected OpenMP construct - !$omp atomic compare write - r = i .eq. j + 1 - - !CHECK: error: expected end of line - !$omp atomic compare num_threads(4) - r = i .eq. j -end program main diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90 index 54492bf6a22a6..11e23e062bce7 100644 --- a/flang/test/Semantics/OpenMP/atomic-compare.f90 +++ b/flang/test/Semantics/OpenMP/atomic-compare.f90 @@ -44,46 +44,37 @@ !$omp end atomic ! Check for error conditions: - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic compare seq_cst seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst compare seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire compare if (b .eq. 
c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic compare acquire acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire compare acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic compare relaxed relaxed if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed compare relaxed if (b .eq. c) b = a - !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one FAIL clause can appear on the ATOMIC directive !$omp atomic fail(release) compare fail(release) if (c .eq. 
a) a = b !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index c13a11a8dd5dc..deb67e7614659 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -16,20 +16,21 @@ program sample !$omp atomic read hint(2) y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(3) y = y + 10 !$omp atomic update hint(5) y = x + y - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement y = x x = y !$omp end atomic - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic update hint(x) y = y * 1 @@ -46,7 +47,7 @@ program sample !$omp atomic hint(omp_lock_hint_speculative) x = y + x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read y = x @@ -69,36 +70,36 @@ program sample !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative) x = y + x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = y * 9 - !ERROR: Hint clause must have non-negative 
constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp atomic hint(1.0) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp atomic hint(z + omp_sync_hint_nonspeculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(k + omp_sync_hint_speculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write x = 10 * y !$omp atomic write hint(a) - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and y+x access the same storage x = y + x !$omp atomic hint(abs(-1)) write diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 new file mode 100644 index 0000000000000..2619d235380f8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -0,0 +1,89 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC READ. Expect no diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v = x + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v => x +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! 
Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic read + v = p(i) +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic read + !ERROR: Atomic variable x should be a scalar + v = x + + !$omp atomic read + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + v = y(2:4) +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic read + !ERROR: Within atomic operation x and x access the same storage + x = x + + ! Accessing same array, but not the same storage. Expect no diagnostics. + !$omp atomic read + y(1) = y(2) +end + +subroutine f05 + integer :: x, v + + !$omp atomic read + !ERROR: Atomic expression x+1_4 should be a variable + v = x + 1 +end + +subroutine f06 + character :: x, v + + !$omp atomic read + !ERROR: Atomic variable x cannot have CHARACTER type + v = x +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic read + !ERROR: Atomic variable x cannot be ALLOCATABLE + v = x +end + diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 new file mode 100644 index 0000000000000..64e31a7974b55 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -0,0 +1,77 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !$omp atomic update capture + x = v + x = x + 1 + y = x + !$omp end atomic +end + +subroutine f01 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments + !$omp atomic update capture + x = v + block + x = x + 1 + y = x + end block + !$omp end atomic +end + +subroutine f02 + integer :: x, y + + ! The update and capture statements can be inside of a single BLOCK. + ! The end-directive is then optional. Expect no diagnostics. 
+ !$omp atomic update capture + block + x = x + 1 + y = x + end block +end + +subroutine f03 + integer :: x + + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !$omp atomic update capture + x = x + 1 + x = x + 2 + !$omp end atomic +end + +subroutine f04 + integer :: x, v + + !$omp atomic update capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + v = x + x = v + !$omp end atomic +end + +subroutine f05 + integer :: x, v, z + + !$omp atomic update capture + v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + !$omp end atomic +end + +subroutine f06 + integer :: x, v, z + + !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + v = x + !$omp end atomic +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 new file mode 100644 index 0000000000000..0aee02d429dc0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -0,0 +1,83 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y + + ! The x is a direct argument of the + operator. Expect no diagnostics. + !$omp atomic update + x = x + (y - 1) +end + +subroutine f01 + integer :: x + + ! x + 0 is unusual, but legal. Expect no diagnostics. + !$omp atomic update + x = x + 0 +end + +subroutine f02 + integer :: x + + ! This is formally not allowed by the syntax restrictions of the spec, + ! but it's equivalent to either x+0 or x*1, both of which are legal. + ! Allow this case. Expect no diagnostics. 
+ !$omp atomic update + x = x +end + +subroutine f03 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator + x = (x + y) + 1 +end + +subroutine f04 + integer :: x + real :: y + + !$omp atomic update + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation + x = floor(x + y) +end + +subroutine f05 + integer :: x + real :: y + + !$omp atomic update + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + x = int(x + y) +end + +subroutine f06 + integer :: x, y + interface + function f(i, j) + integer :: f, i, j + end + end interface + + !$omp atomic update + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation + x = f(x, y) +end + +subroutine f07 + real :: x + integer :: y + + !$omp atomic update + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation + x = x ** y +end + +subroutine f08 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should appear as an argument in the update operation + x = y +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 index 21a9b87d26345..3084376b4275d 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 @@ -22,10 +22,10 @@ program sample x = x / y !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. y !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. 
y end program diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 new file mode 100644 index 0000000000000..979568ebf7140 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -0,0 +1,81 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics. + !$omp atomic write + x = v + 1 + + !$omp atomic write + x = v + 3 + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic write + x = 2 * v + 3 + + !$omp atomic write + x => v +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic write + p(i) = v +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic write + !ERROR: Atomic variable x should be a scalar + x = v + + !$omp atomic write + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + y(2:4) = v +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic write + !ERROR: Within atomic operation x and x+1_4 access the same storage + x = x + 1 + + ! Accessing same array, but not the same storage. Expect no diagnostics. 
+ !$omp atomic write + y(1) = y(2) +end + +subroutine f06 + character :: x, v + + !$omp atomic write + !ERROR: Atomic variable x cannot have CHARACTER type + x = v +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic write + !ERROR: Atomic variable x cannot be ALLOCATABLE + x = v +end + diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0e100871ea9b4..0dc8251ae356e 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! Check OpenMP 2.13.6 atomic Construct @@ -11,9 +11,13 @@ a = b !$omp end atomic + !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED) a = b + !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write a = b @@ -22,39 +26,32 @@ a = a + 1 !$omp end atomic + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic hint(1) acq_rel capture b = a a = a + 1 !$omp end atomic - !ERROR: expected end of line + !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct !$omp atomic read write + !ERROR: Atomic expression a+1._4 should be a variable a = a + 1 !$omp atomic a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 
'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic num_threads(4) a = a + 1 - !ERROR: expected end of line + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic capture num_threads(4) a = a + 1 + !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic relaxed a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' - !$omp atomic num_threads write - a = a + 1 - !$omp end parallel end diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90 index 173effe86b69c..f700c381cadd0 100644 --- a/flang/test/Semantics/OpenMP/atomic01.f90 +++ b/flang/test/Semantics/OpenMP/atomic01.f90 @@ -14,322 +14,277 @@ ! At most one memory-order-clause may appear on the construct. 
!READ - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic read seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst read seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic read acquire acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire read acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the 
READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic read relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed read relaxed i = j !UPDATE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic update seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst update seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The 
atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic update release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release update release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic update relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic 
relaxed update relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !CAPTURE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic capture seq_cst seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst capture seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic capture release release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release capture release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not 
allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic capture relaxed relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed capture relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel acq_rel capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic capture acq_rel acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel capture acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire capture i = j j = k !$omp 
end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic capture acquire acquire i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire capture acquire i = j j = k !$omp end atomic !WRITE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic write seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst write seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic write release release i = j - !ERROR: More than one memory order clause not allowed on 
OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release write release i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic write relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed write relaxed i = j !No atomic-clause - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as 
an argument in the update operation i = j ! 2.17.7.3 ! At most one hint clause may appear on the construct. - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At 
most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint 
(omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative) i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j j = k @@ -337,34 +292,26 @@ ! 2.17.7.4 ! If atomic-clause is read then memory-order-clause must not be acq_rel or release. - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic acq_rel read i = j - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read acq_rel i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic release read i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read release i = j ! 2.17.7.5 ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire. 
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acq_rel write i = j - !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acq_rel i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acquire write i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acquire i = j @@ -372,33 +319,27 @@ ! 2.17.7.6 ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire. - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acq_rel update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acquire update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive !$omp atomic acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed on the 
ATOMIC directive !$omp atomic acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j end program diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90 index c66085d00f157..45e41f2552965 100644 --- a/flang/test/Semantics/OpenMP/atomic02.f90 +++ b/flang/test/Semantics/OpenMP/atomic02.f90 @@ -28,36 +28,29 @@ program OmpAtomic !$omp atomic a = a/(b + 1) !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: The atomic variable c should appear as an argument in the update operation c = d !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. 
b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The /= operator is not a valid ATOMIC UPDATE operation l = a .NE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic m = m .AND. n @@ -76,32 +69,26 @@ program OmpAtomic !$omp atomic update a = a/(b + 1) !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: This is not a valid ATOMIC UPDATE operation c = c//d !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. 
b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic update m = m .AND. n diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index 76367495b9861..f5c189fd05318 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -25,28 +25,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur 
exactly once among the arguments of the top-level MAX operator z = MAX(y, 7, b, c) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8, a, d) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -60,26 +58,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX 
operator z = MAX(y, 7) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = MOD(y, 9) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation x = ABS(x) end program OmpAtomic @@ -92,7 +90,7 @@ subroutine conflicting_types() type(simple) ::s z = 1 !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(s%z, 4) end subroutine @@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) :: s !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator a = min(a, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a) !$omp atomic - !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)` a = min(b, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a, b) 
!$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator y = min(z, x) !$omp atomic z = max(z, y) !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k' + !ERROR: Atomic variable k should be a scalar + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation k = max(x, y) - + !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement x = min(x, k) !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - z =z + s%m + z = z + s%m end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index a9644ad95aa30..5c91ab5dc37e4 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -20,12 +20,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x 
= 1 + y !$omp atomic @@ -33,12 +31,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic @@ -46,12 +42,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic @@ -59,12 +53,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or 
explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic @@ -72,8 +64,7 @@ program OmpAtomic !$omp atomic m = n .AND. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic @@ -81,8 +72,7 @@ program OmpAtomic !$omp atomic m = n .OR. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic @@ -90,8 +80,7 @@ program OmpAtomic !$omp atomic m = n .EQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic @@ -99,8 +88,7 @@ program OmpAtomic !$omp atomic m = n .NEQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. 
l !$omp atomic update @@ -108,12 +96,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 + y !$omp atomic update @@ -121,12 +107,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic update @@ -134,12 +118,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' 
expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic update @@ -147,12 +129,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic update @@ -160,8 +140,7 @@ program OmpAtomic !$omp atomic update m = n .AND. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic update @@ -169,8 +148,7 @@ program OmpAtomic !$omp atomic update m = n .OR. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic update @@ -178,8 +156,7 @@ program OmpAtomic !$omp atomic update m = n .EQV. 
m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic update @@ -187,8 +164,7 @@ program OmpAtomic !$omp atomic update m = n .NEQV. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. l end program OmpAtomic @@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) p !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement x = x !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 1 !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and a*b access the same storage a = a * b + a !$omp atomic - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator a = b * (a + 9) !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (a+b) access the same storage a = a * (a + b) !$omp atomic - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (b+a) access the same storage a = (b + a) * a !$omp atomic - !ERROR: Atomic update statement should 
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(7) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(x) y = 2 @@ -54,7 +54,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint) y = 2 @@ -84,35 +84,35 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended) y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp critical (name) hint(1.0) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp critical (name) hint(z + omp_sync_hint_nonspeculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(k + omp_sync_hint_speculative) y = 2 !$omp end critical 
(name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended) y = 2 diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 505cbc48fef90..677b933932b44 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -20,70 +20,64 @@ program sample !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar v = y(1:3) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression x*(10_4+x) should be a variable v = x * (10 + x) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression 4_4 should be a variable v = 4 !$omp atomic read - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE v = k !$omp atomic write - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = x !$omp atomic update - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = k + x * (v * x) !$omp atomic - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = v * k !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'z%y' + !ERROR: Within atomic operation z%y and x+z%y access the same storage z%y = x + z%y !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + 
!ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage m = min(m, x, z%m) + k !$omp atomic read - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Atomic expression min(m,x,z%m)+k should be a variable m = min(m, x, z%m) + k !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar x = a - !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - a = x - !$omp atomic write !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement x = a !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar a = x !$omp atomic capture @@ -93,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = b + (x*1) !$omp end atomic @@ -103,60 +97,58 @@ program sample !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with 
CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct x = x + 10 v = b !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt] v = 1 x = 4 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct x = z%y + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 x = z%y !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct x = y(2) + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 x = y(2) !$omp end atomic !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable r cannot have CHARACTER type l = r !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable l cannot have CHARACTER type l = r end program diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90 index ae9fd086015dd..e8817c3f5ef61 100644 --- 
a/flang/test/Semantics/OpenMP/requires-atomic01.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> SeqCst !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> SeqCst !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> SeqCst !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> SeqCst !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> SeqCst !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90 index 4976a9667eb78..a3724a83456fd 100644 --- a/flang/test/Semantics/OpenMP/requires-atomic02.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> AcqRel !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! 
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> AcqRel !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> AcqRel !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! 
CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> AcqRel !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> AcqRel !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Capture + ! 
CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 >From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:41:16 -0500 Subject: [PATCH 05/16] Restart build >From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:45:28 -0500 Subject: [PATCH 06/16] Restart build >From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 07:54:54 -0500 Subject: [PATCH 07/16] Restart build >From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 08:18:01 -0500 Subject: [PATCH 08/16] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed --- flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +- flang/test/Semantics/OpenMP/atomic.f90 | 2 ++ 5 files changed, 6 insertions(+), 4 deletions(-) diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 index 2619d235380f8..73899a9ff37f2 100644 --- a/flang/test/Semantics/OpenMP/atomic-read.f90 +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index 64e31a7974b55..c427ba07d43d8 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python 
%S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 0aee02d429dc0..4595e02d01456 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 index 979568ebf7140..7965ad2dc7dbf 100644 --- a/flang/test/Semantics/OpenMP/atomic-write.f90 +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0dc8251ae356e..2caa161507d49 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,3 +1,5 @@ +! REQUIRES: openmp_runtime + ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! 
Check OpenMP 2.13.6 atomic Construct >From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 09:06:20 -0500 Subject: [PATCH 09/16] Fix examples --- flang/examples/FeatureList/FeatureList.cpp | 10 ------- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++-------------- 2 files changed, 7 insertions(+), 29 deletions(-) diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp index d1407cf0ef239..a36b8719e365d 100644 --- a/flang/examples/FeatureList/FeatureList.cpp +++ b/flang/examples/FeatureList/FeatureList.cpp @@ -445,13 +445,6 @@ struct NodeVisitor { READ_FEATURE(ObjectDecl) READ_FEATURE(OldParameterStmt) READ_FEATURE(OmpAlignedClause) - READ_FEATURE(OmpAtomic) - READ_FEATURE(OmpAtomicCapture) - READ_FEATURE(OmpAtomicCapture::Stmt1) - READ_FEATURE(OmpAtomicCapture::Stmt2) - READ_FEATURE(OmpAtomicRead) - READ_FEATURE(OmpAtomicUpdate) - READ_FEATURE(OmpAtomicWrite) READ_FEATURE(OmpBeginBlockDirective) READ_FEATURE(OmpBeginLoopDirective) READ_FEATURE(OmpBeginSectionsDirective) @@ -480,7 +473,6 @@ struct NodeVisitor { READ_FEATURE(OmpIterationOffset) READ_FEATURE(OmpIterationVector) READ_FEATURE(OmpEndAllocators) - READ_FEATURE(OmpEndAtomic) READ_FEATURE(OmpEndBlockDirective) READ_FEATURE(OmpEndCriticalDirective) READ_FEATURE(OmpEndLoopDirective) @@ -566,8 +558,6 @@ struct NodeVisitor { READ_FEATURE(OpenMPDeclareTargetConstruct) READ_FEATURE(OmpMemoryOrderType) READ_FEATURE(OmpMemoryOrderClause) - READ_FEATURE(OmpAtomicClause) - READ_FEATURE(OmpAtomicClauseList) READ_FEATURE(OmpAtomicDefaultMemOrderClause) READ_FEATURE(OpenMPFlushConstruct) READ_FEATURE(OpenMPLoopConstruct) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index dbbf86a6c6151..18a4107f9a9d1 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ 
b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) { // the directive field. [&](const auto &c) -> SourcePosition { const CharBlock &source{std::get<0>(c.t).source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPAtomicConstruct &c) -> SourcePosition { - return std::visit( - [&](const auto &o) -> SourcePosition { - const CharBlock &source{std::get(o.t).source}; - return parsing->allCooked() - .GetSourcePositionRange(source) - ->first; - }, - c.u); + const CharBlock &source{c.source}; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPSectionConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPUtilityConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, }, c.u); @@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - return std::visit( - [&](const auto &c) { - // Get source from the verbatim fields - const CharBlock &source{std::get(c.t).source}; - return "atomic-" + - normalize_construct_name(source.ToString()); - }, - c.u); + const CharBlock &source{c.source}; + return normalize_construct_name(source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; >From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:19 -0500 Subject: 
[PATCH 10/16] Fix example --- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++-- flang/test/Examples/omp-atomic.f90 | 16 +++++++++++----- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index 18a4107f9a9d1..b0964845d3b99 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - const CharBlock &source{c.source}; - return normalize_construct_name(source.ToString()); + auto &dirSpec = std::get(c.t); + auto &dirName = std::get(dirSpec.t); + return normalize_construct_name(dirName.source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90 index dcca34b633a3e..934f84f132484 100644 --- a/flang/test/Examples/omp-atomic.f90 +++ b/flang/test/Examples/omp-atomic.f90 @@ -26,25 +26,31 @@ ! CHECK:--- ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 9 -! CHECK-NEXT: construct: atomic-read +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: -! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: - clause: read ! CHECK-NEXT: details: '' +! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 12 -! CHECK-NEXT: construct: atomic-write +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: ! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' +! CHECK-NEXT: - clause: write ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 16 -! CHECK-NEXT: construct: atomic-capture +! 
CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: +! CHECK-NEXT: - clause: capture +! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;' ! CHECK-NEXT: - clause: seq_cst ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 21 -! CHECK-NEXT: construct: atomic-atomic +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: [] ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 8 >From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:47 -0500 Subject: [PATCH 11/16] Remove reference to *nullptr --- flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ab868df76d298..4d9b375553c1e 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); - mlir::Operation *secondOp = genAtomicOperation( - converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, - *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + mlir::Operation *secondOp = nullptr; + if (analysis.op1.what != analysis.None) { + secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, + atomAddr, atom, *get(analysis.op1.expr), + hint, memOrder, atomicAt, prepareAt); + } if (secondOp) { builder.setInsertionPointAfter(secondOp); >From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 2 May 2025 18:49:05 -0500 Subject: [PATCH 12/16] Updates and improvements --- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++-- flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++---- 
flang/lib/Semantics/check-omp-structure.h | 1 + .../Todo/atomic-capture-implicit-cast.f90 | 48 --- .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 - .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +- .../OpenMP/atomic-update-capture.f90 | 8 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +- 9 files changed, 381 insertions(+), 180 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 77f57b1cb85c7..8213fe33edbd0 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct { struct Op { int what; - TypedExpr expr; + AssignmentStmt::TypedAssignment assign; }; TypedExpr atom, cond; Op op0, op1; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6177b59199481..7b6c22095d723 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter, static mlir::Operation * // genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Value storeAddr = + 
fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Type storeType = fir::unwrapRefType(storeAddr.getType()); + + mlir::Value toAddr = [&]() { + if (atomType == storeType) + return storeAddr; + return builder.createTemporary(loc, atomType, ".tmp.atomval"); + }(); builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + + if (atomType != storeType) { + lower::ExprToValueMap overrides; + // The READ operation could be a part of UPDATE CAPTURE, so make sure + // we don't emit extra code into the body of the atomic op. + builder.restoreInsertionPoint(postAt); + mlir::Value load = builder.create(loc, toAddr); + overrides.try_emplace(&atom, load); + + converter.overrideExprValues(&overrides); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); + converter.resetExprOverrides(); + + builder.create(loc, value, storeAddr); + } builder.restoreInsertionPoint(saved); return op; } @@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, static mlir::Operation * // genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value value = 
fir::getBase(converter.genExprValue(expr, stmtCtx, &loc));
+  mlir::Value value =
+      fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc));
   mlir::Type atomType = fir::unwrapRefType(atomAddr.getType());
   mlir::Value converted = builder.createConvert(loc, atomType, value);

@@ -2719,19 +2746,20 @@ static mlir::Operation *
 genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc,
                 lower::StatementContext &stmtCtx, mlir::Value atomAddr,
                 const semantics::SomeExpr &atom,
-                const semantics::SomeExpr &expr, mlir::IntegerAttr hint,
+                const evaluate::Assignment &assign, mlir::IntegerAttr hint,
                 mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+                fir::FirOpBuilder::InsertPoint preAt,
                 fir::FirOpBuilder::InsertPoint atomicAt,
-                fir::FirOpBuilder::InsertPoint prepareAt) {
+                fir::FirOpBuilder::InsertPoint postAt) {
   lower::ExprToValueMap overrides;
   lower::StatementContext naCtx;
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint();
-  builder.restoreInsertionPoint(prepareAt);
+  builder.restoreInsertionPoint(preAt);

   mlir::Type atomType = fir::unwrapRefType(atomAddr.getType());

-  std::vector args{semantics::GetOpenMPTopLevelArguments(expr)};
+  std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)};
   assert(!args.empty() && "Update operation without arguments");
   for (auto &arg : args) {
     if (!semantics::IsSameOrResizeOf(arg, atom)) {
@@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc,

   converter.overrideExprValues(&overrides);
   mlir::Value updated =
-      fir::getBase(converter.genExprValue(expr, stmtCtx, &loc));
+      fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc));
   mlir::Value converted = builder.createConvert(loc, atomType, updated);
   builder.create(loc, converted);
   converter.resetExprOverrides();
@@ -2764,20 +2792,21 @@ static mlir::Operation *
 genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc,
                    lower::StatementContext &stmtCtx, int action,
                    mlir::Value atomAddr, const semantics::SomeExpr &atom,
-                   const semantics::SomeExpr &expr, mlir::IntegerAttr hint,
+                   const evaluate::Assignment &assign, mlir::IntegerAttr hint,
                    mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+                   fir::FirOpBuilder::InsertPoint preAt,
                    fir::FirOpBuilder::InsertPoint atomicAt,
-                   fir::FirOpBuilder::InsertPoint prepareAt) {
+                   fir::FirOpBuilder::InsertPoint postAt) {
   switch (action) {
   case parser::OpenMPAtomicConstruct::Analysis::Read:
-    return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint,
-                         memOrder, atomicAt, prepareAt);
+    return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint,
+                         memOrder, preAt, atomicAt, postAt);
   case parser::OpenMPAtomicConstruct::Analysis::Write:
-    return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint,
-                          memOrder, atomicAt, prepareAt);
+    return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint,
+                          memOrder, preAt, atomicAt, postAt);
   case parser::OpenMPAtomicConstruct::Analysis::Update:
-    return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint,
-                           memOrder, atomicAt, prepareAt);
+    return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign,
+                           hint, memOrder, preAt, atomicAt, postAt);
   default:
     return nullptr;
   }
@@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) {
     }
     return ""s;
   };
+  auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) {
+    if (auto *maybe = assign.get(); maybe && maybe->v) {
+      std::string str;
+      llvm::raw_string_ostream os(str);
+      maybe->v->AsFortran(os);
+      return str;
+    }
+    return ""s;
+  };

   const SomeExpr &atom = *analysis.atom.get()->v;

@@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) {
   llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n";
   llvm::errs() << " op0 {\n";
   llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n";
-
llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n";
+  llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n";
   llvm::errs() << " }\n";
   llvm::errs() << " op1 {\n";
   llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n";
-  llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n";
+  llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n";
   llvm::errs() << " }\n";
   llvm::errs() << "}\n";
 }
@@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
                    semantics::SemanticsContext &semaCtx,
                    lower::pft::Evaluation &eval,
                    const parser::OpenMPAtomicConstruct &construct) {
-  auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * {
-    if (auto *maybe = expr.get(); maybe && maybe->v) {
+  auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) {
+    if (auto *maybe = typedWrapper.get(); maybe && maybe->v) {
       return &*maybe->v;
     } else {
       return nullptr;
@@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
   int action0 = analysis.op0.what & analysis.Action;
   int action1 = analysis.op1.what & analysis.Action;
   mlir::Operation *captureOp = nullptr;
-  fir::FirOpBuilder::InsertPoint atomicAt;
-  fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint();
+  fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint();
+  fir::FirOpBuilder::InsertPoint atomicAt, postAt;

   if (construct.IsCapture()) {
     // Capturing operation.
@@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
     captureOp = builder.create(loc, hint, memOrder);

     // Set the non-atomic insertion point to before the atomic.capture.
-    prepareAt = getInsertionPointBefore(captureOp);
+    preAt = getInsertionPointBefore(captureOp);

     mlir::Block *block = builder.createBlock(&captureOp->getRegion(0));
     builder.setInsertionPointToEnd(block);
@@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
     // atomic.capture.
     mlir::Operation *term = builder.create(loc);
     atomicAt = getInsertionPointBefore(term);
+    postAt = getInsertionPointAfter(captureOp);
     hint = nullptr;
     memOrder = nullptr;
   } else {
@@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
     assert(action0 != analysis.None && action1 == analysis.None &&
            "Expexcing single action");
     assert(!(analysis.op0.what & analysis.Condition));
-    atomicAt = prepareAt;
+    postAt = atomicAt = preAt;
   }

   mlir::Operation *firstOp = genAtomicOperation(
       converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom,
-      *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt);
+      *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt);
   assert(firstOp && "Should have created an atomic operation");
   atomicAt = getInsertionPointAfter(firstOp);

   mlir::Operation *secondOp = nullptr;
   if (analysis.op1.what != analysis.None) {
     secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what,
-                                  atomAddr, atom, *get(analysis.op1.expr),
-                                  hint, memOrder, atomicAt, prepareAt);
+                                  atomAddr, atom, *get(analysis.op1.assign),
+                                  hint, memOrder, preAt, atomicAt, postAt);
   }

   if (secondOp) {
diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp
index 201b38bd05ff3..f7753a5e5cc59 100644
--- a/flang/lib/Semantics/check-omp-structure.cpp
+++ b/flang/lib/Semantics/check-omp-structure.cpp
@@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj(
   return nullptr;
 }

-static bool IsVarOrFunctionRef(const SomeExpr &expr) {
-  return evaluate::UnwrapProcedureRef(expr) != nullptr ||
-      evaluate::IsVariable(expr);
+static bool
IsVarOrFunctionRef(const MaybeExpr &expr) {
+  if (expr) {
+    return evaluate::UnwrapProcedureRef(*expr) != nullptr ||
+        evaluate::IsVariable(*expr);
+  } else {
+    return false;
+  }
 }

 static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) {
@@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource(

 namespace atomic {

+template static void MoveAppend(V &accum, V &&other) {
+  for (auto &&s : other) {
+    accum.push_back(std::move(s));
+  }
+}
+
 enum class Operator {
   Unk,
   // Operators that are officially allowed in the update operation
@@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse //
   Result Combine(Result &&result, Rs &&...results) const {
     Result v(std::move(result));
-    (Append(v, std::move(results)), ...);
+    (MoveAppend(v, std::move(results)), ...);
     return v;
   }
+};

-private:
-  static void Append(Result &acc, Result &&data) {
-    for (auto &&s : data) {
-      acc.push_back(std::move(s));
+struct ConvertCollector
+    : public evaluate::Traverse>, false> {
+  using Result = std::pair>;
+  using Base = evaluate::Traverse;
+  ConvertCollector() : Base(*this) {}
+
+  Result Default() const { return {}; }
+
+  using Base::operator();
+
+  template //
+  Result operator()(const evaluate::Designator &x) const {
+    auto copy{x};
+    return {AsGenericExpr(std::move(copy)), {}};
+  }
+
+  template //
+  Result operator()(const evaluate::FunctionRef &x) const {
+    auto copy{x};
+    return {AsGenericExpr(std::move(copy)), {}};
+  }
+
+  template //
+  Result operator()(const evaluate::Constant &x) const {
+    auto copy{x};
+    return {AsGenericExpr(std::move(copy)), {}};
+  }
+
+  template
+  Result operator()(const evaluate::Operation &x) const {
+    if constexpr (std::is_same_v>) {
+      // Ignore top-level parentheses.
+      return (*this)(x.template operand<0>());
+    } else if constexpr (is_convert_v) {
+      // Convert should always have a typed result, so it should be safe to
+      // dereference x.GetType().
+      return Combine(
+          {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>()));
+    } else {
+      auto copy{x.derived()};
+      return {evaluate::AsGenericExpr(std::move(copy)), {}};
     }
   }
+
+  template //
+  Result Combine(Result &&result, Rs &&...results) const {
+    Result v(std::move(result));
+    auto setValue{[](MaybeExpr &x, MaybeExpr &&y) {
+      assert((!x.has_value() || !y.has_value()) && "Multiple designators");
+      if (!x.has_value()) {
+        x = std::move(y);
+      }
+    }};
+    (setValue(v.first, std::move(results).first), ...);
+    (MoveAppend(v.second, std::move(results).second), ...);
+    return v;
+  }
+
+private:
+  template //
+  struct is_convert {
+    static constexpr bool value{false};
+  };
+  template //
+  struct is_convert> {
+    static constexpr bool value{true};
+  };
+  template //
+  struct is_convert> {
+    // Conversion from complex to real.
+    static constexpr bool value{true};
+  };
+  template //
+  static constexpr bool is_convert_v = is_convert::value;
+};
+
+struct VariableFinder : public evaluate::AnyTraverse {
+  using Base = evaluate::AnyTraverse;
+  VariableFinder(const SomeExpr &v) : Base(*this), var(v) {}
+
+  using Base::operator();
+
+  template
+  bool operator()(const evaluate::Designator &x) const {
+    auto copy{x};
+    return evaluate::AsGenericExpr(std::move(copy)) == var;
+  }
+
+  template
+  bool operator()(const evaluate::FunctionRef &x) const {
+    auto copy{x};
+    return evaluate::AsGenericExpr(std::move(copy)) == var;
+  }
+
+private:
+  const SomeExpr &var;
 };

 } // namespace atomic
@@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) {
   return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr;
 }

+static MaybeExpr GetConvertInput(const SomeExpr &x) {
+  // This returns SomeExpr(x) when x is a designator/functionref/constant.
+  return atomic::ConvertCollector{}(x).first;
+}
+
+static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) {
+  // Check if expr is same as x, or a sequence of Convert operations on x.
+  if (expr == x) {
+    return true;
+  } else if (auto maybe{GetConvertInput(expr)}) {
+    return *maybe == x;
+  } else {
+    return false;
+  }
+}
+
 bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) {
   // Both expr and x have the form of SomeType(SomeKind(...)[1]).
   // Check if expr is
@@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) {
   }
 }

+bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) {
+  return atomic::VariableFinder{sub}(super);
+}
+
 static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) {
   if (value) {
     expr.Reset(new evaluate::GenericExprWrapper(std::move(value)),
@@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) {
   }
 }

+static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign,
+    std::optional value) {
+  if (value) {
+    assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)),
+        evaluate::GenericAssignmentWrapper::Deleter);
+  }
+}
+
 static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp(
-    int what, const MaybeExpr &maybeExpr = std::nullopt) {
+    int what,
+    const std::optional &maybeAssign = std::nullopt) {
   parser::OpenMPAtomicConstruct::Analysis::Op operation;
   operation.what = what;
-  SetExpr(operation.expr, maybeExpr);
+  SetAssignment(operation.assign, maybeAssign);
   return operation;
 }

@@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis(
   //   };
   //   struct Op {
   //     int what;
-  //     TypedExpr expr;
+  //     TypedAssignment assign;
   //   };
   //   TypedExpr atom, cond;
   //   Op op0, op1;
@@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base,
   }
 }

+void OmpStructureChecker::ErrorShouldBeVariable(
+    const MaybeExpr &expr, parser::CharBlock source) {
+  if (expr) {
+    context_.Say(source, "Atomic expression %s should be a variable"_err_en_US,
+        expr->AsFortran());
+  } else {
+    context_.Say(source, "Atomic expression should be a variable"_err_en_US);
+  }
+}
+
 /// Check
if `expr` satisfies the following conditions for x and v:
 ///
 /// [6.0:189:10-12]
@@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture(
   //
   // The two allowed cases are:
   //   x = ...            atomic-var = ...
-  //   ... = x            capture-var = atomic-var
+  //   ... = x            capture-var = atomic-var (with optional converts)
   // or
-  //   ... = x            capture-var = atomic-var
+  //   ... = x            capture-var = atomic-var (with optional converts)
   //   x = ...            atomic-var = ...
   //
   // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture
@@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture(
   //
   // If the two statements don't fit these criteria, return a pair of default-
   // constructed values.
+  using ReturnTy = std::pair;

   SourcedActionStmt act1{GetActionStmt(ec1)};
   SourcedActionStmt act2{GetActionStmt(ec2)};
@@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture(

   auto isUpdateCapture{
       [](const evaluate::Assignment &u, const evaluate::Assignment &c) {
-        return u.lhs == c.rhs;
+        return IsSameOrConvertOf(c.rhs, u.lhs);
       }};

   // Do some checks that narrow down the possible choices for the update
   // and the capture statements. This will help to emit better diagnostics.
-  bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs);
-  bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs);
+  // 1. An assignment could be an update (cbu) if the left-hand side is a
+  //    subexpression of the right-hand side.
+  // 2. An assignment could be a capture (cbc) if the right-hand side is
+  //    a variable (or a function ref), with potential type conversions.
+  bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)};
+  bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)};
+  bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))};
+  bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))};
+
+  //     |cbu1 cbu2|
+  // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1
+  int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)};
+
+  auto errorCaptureShouldRead{[&](const parser::CharBlock &source,
+                                  const std::string &expr) {
+    context_.Say(source,
+        "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US,
+        expr);
+  }};

-  if (couldBeCapture1) {
-    if (couldBeCapture2) {
-      if (isUpdateCapture(as2, as1)) {
-        if (isUpdateCapture(as1, as2)) {
-          // If both statements could be captures and both could be updates,
-          // emit a warning about the ambiguity.
-          context_.Say(act1.source,
-              "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US);
-        }
-        return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1);
-      } else if (isUpdateCapture(as1, as2)) {
+  auto errorNeitherWorks{[&]() {
+    context_.Say(source,
+        "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US);
+  }};
+
+  auto makeSelectionFromDet{[&](int det) -> ReturnTy {
+    // If det != 0, then the checks unambiguously suggest a specific
+    // categorization.
+    // If det == 0, then this function should be called only if the
+    // checks haven't ruled out any possibility, i.e. when both assigments
+    // could still be either updates or captures.
+    if (det > 0) {
+      // as1 is update, as2 is capture
+      if (isUpdateCapture(as1, as2)) {
         return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2);
       } else {
-        context_.Say(source,
-            "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US,
-            as1.rhs.AsFortran(), as2.rhs.AsFortran());
+        errorCaptureShouldRead(act2.source, as1.lhs.AsFortran());
+        return std::make_pair(nullptr, nullptr);
       }
-    } else { // !couldBeCapture2
+    } else if (det < 0) {
+      // as2 is update, as1 is capture
       if (isUpdateCapture(as2, as1)) {
         return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1);
       } else {
-        context_.Say(act2.source,
-            "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US,
-            as1.rhs.AsFortran());
+        errorCaptureShouldRead(act1.source, as2.lhs.AsFortran());
+        return std::make_pair(nullptr, nullptr);
       }
-    }
-  } else { // !couldBeCapture1
-    if (couldBeCapture2) {
-      if (isUpdateCapture(as1, as2)) {
-        return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2);
-      } else {
+    } else {
+      bool updateFirst{isUpdateCapture(as1, as2)};
+      bool captureFirst{isUpdateCapture(as2, as1)};
+      if (updateFirst && captureFirst) {
+        // If both assignment could be the update and both could be the
+        // capture, emit a warning about the ambiguity.
         context_.Say(act1.source,
-            "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US,
-            as2.rhs.AsFortran());
+            "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US);
+        return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1);
       }
-    } else {
-      context_.Say(source,
-          "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US);
+      if (updateFirst != captureFirst) {
+        const parser::ExecutionPartConstruct *upd{updateFirst ?
ec1 : ec2};
+        const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2};
+        return std::make_pair(upd, cap);
+      }
+      assert(!updateFirst && !captureFirst);
+      errorNeitherWorks();
+      return std::make_pair(nullptr, nullptr);
     }
+  }};
+
+  if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) {
+    return makeSelectionFromDet(det);
   }
+  assert(det == 0 && "Prior checks should have covered det != 0");

-  return std::make_pair(nullptr, nullptr);
+  // If neither of the statements is an RMW update, it could still be a
+  // "write" update. Pretty much any assignment can be a write update, so
+  // recompute det with cbu1 = cbu2 = true.
+  if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) {
+    return makeSelectionFromDet(writeDet);
+  }
+
+  // It's only errors from here on.
+
+  if (!cbu1 && !cbu2 && !cbc1 && !cbc2) {
+    errorNeitherWorks();
+    return std::make_pair(nullptr, nullptr);
+  }
+
+  // The remaining cases are that
+  // - no candidate for update, or for capture,
+  // - one of the assigments cannot be anything.
+
+  if (!cbu1 && !cbu2) {
+    context_.Say(source,
+        "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US);
+    return std::make_pair(nullptr, nullptr);
+  } else if (!cbc1 && !cbc2) {
+    context_.Say(source,
+        "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US);
+    return std::make_pair(nullptr, nullptr);
+  }
+
+  if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) {
+    auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source;
+    context_.Say(src,
+        "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US);
+    return std::make_pair(nullptr, nullptr);
+  }
+
+  // All cases should have been covered.
+  llvm_unreachable("Unchecked condition");
 }

 void OmpStructureChecker::CheckAtomicCaptureAssignment(
     const evaluate::Assignment &capture, const SomeExpr &atom,
     parser::CharBlock source) {
-  const SomeExpr &cap{capture.lhs};
   auto [lsrc, rsrc]{SplitAssignmentSource(source)};
+  const SomeExpr &cap{capture.lhs};

   if (!IsVarOrFunctionRef(atom)) {
-    context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US,
-        atom.AsFortran());
+    ErrorShouldBeVariable(atom, rsrc);
   } else {
     CheckAtomicVariable(atom, rsrc);
-    // This part should have been checked prior to callig this function.
-    assert(capture.rhs == atom && "This canont be a capture assignment");
+    // This part should have been checked prior to calling this function.
+    assert(*GetConvertInput(capture.rhs) == atom &&
+        "This canont be a capture assignment");
     CheckStorageOverlap(atom, {cap}, source);
   }
 }

 void OmpStructureChecker::CheckAtomicReadAssignment(
     const evaluate::Assignment &read, parser::CharBlock source) {
-  const SomeExpr &atom{read.rhs};
   auto [lsrc, rsrc]{SplitAssignmentSource(source)};

-  if (!IsVarOrFunctionRef(atom)) {
-    context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US,
-        atom.AsFortran());
+  if (auto maybe{GetConvertInput(read.rhs)}) {
+    const SomeExpr &atom{*maybe};
+
+    if (!IsVarOrFunctionRef(atom)) {
+      ErrorShouldBeVariable(atom, rsrc);
+    } else {
+      CheckAtomicVariable(atom, rsrc);
+      CheckStorageOverlap(atom, {read.lhs}, source);
+    }
   } else {
-    CheckAtomicVariable(atom, rsrc);
-    CheckStorageOverlap(atom, {read.lhs}, source);
+    ErrorShouldBeVariable(read.rhs, rsrc);
   }
 }

@@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment(
   // one of the following forms:
   //   x = expr
   //   x => expr
-  const SomeExpr &atom{write.lhs};
   auto [lsrc, rsrc]{SplitAssignmentSource(source)};
+  const SomeExpr &atom{write.lhs};

   if (!IsVarOrFunctionRef(atom)) {
-    context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US,
-        atom.AsFortran());
+
ErrorShouldBeVariable(atom, rsrc);
   } else {
     CheckAtomicVariable(atom, lsrc);
     CheckStorageOverlap(atom, {write.rhs}, source);
@@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment(
   //   x = intrinsic-procedure-name (x)
   //   x = intrinsic-procedure-name (x, expr-list)
   //   x = intrinsic-procedure-name (expr-list, x)
-  const SomeExpr &atom{update.lhs};
   auto [lsrc, rsrc]{SplitAssignmentSource(source)};
+  const SomeExpr &atom{update.lhs};

   if (!IsVarOrFunctionRef(atom)) {
-    context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US,
-        atom.AsFortran());
+    ErrorShouldBeVariable(atom, rsrc);
     // Skip other checks.
     return;
   }
@@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment(
 void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment(
     const SomeExpr &cond, parser::CharBlock condSource,
     const evaluate::Assignment &assign, parser::CharBlock assignSource) {
-  const SomeExpr &atom{assign.lhs};
   auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)};
+  const SomeExpr &atom{assign.lhs};

   if (!IsVarOrFunctionRef(atom)) {
-    context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US,
-        atom.AsFortran());
+    ErrorShouldBeVariable(atom, arsrc);
     // Skip other checks.
     return;
   }
@@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly(

       using Analysis = parser::OpenMPAtomicConstruct::Analysis;
       x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
-          MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs),
+          MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate),
           MakeAtomicAnalysisOp(Analysis::None));
     } else {
       context_.Say(
@@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate(

   using Analysis = parser::OpenMPAtomicConstruct::Analysis;
   x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond,
-      MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs),
+      MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign),
       MakeAtomicAnalysisOp(Analysis::None));
 }

@@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture(

   if (GetActionStmt(&body.front()).stmt == uact.stmt) {
     x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
-        MakeAtomicAnalysisOp(action, update.rhs),
-        MakeAtomicAnalysisOp(Analysis::Read, capture.lhs));
+        MakeAtomicAnalysisOp(action, update),
+        MakeAtomicAnalysisOp(Analysis::Read, capture));
   } else {
     x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
-        MakeAtomicAnalysisOp(Analysis::Read, capture.lhs),
-        MakeAtomicAnalysisOp(action, update.rhs));
+        MakeAtomicAnalysisOp(Analysis::Read, capture),
+        MakeAtomicAnalysisOp(action, update));
   }
 }

@@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture(

   if (captureFirst) {
     x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond,
-        MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs),
-        MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs));
+        MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign),
+        MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign));
   } else {
     x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond,
-        MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs),
-        MakeAtomicAnalysisOp(Analysis::Read | captureWhen,
capAssign.lhs));
+        MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign),
+        MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign));
   }
 }

@@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead(
   if (body.size() == 1) {
     SourcedActionStmt action{GetActionStmt(&body.front())};
     if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) {
-      const SomeExpr &atom{maybeRead->rhs};
       CheckAtomicReadAssignment(*maybeRead, action.source);

-      using Analysis = parser::OpenMPAtomicConstruct::Analysis;
-      x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
-          MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs),
-          MakeAtomicAnalysisOp(Analysis::None));
+      if (auto maybe{GetConvertInput(maybeRead->rhs)}) {
+        const SomeExpr &atom{*maybe};
+        using Analysis = parser::OpenMPAtomicConstruct::Analysis;
+        x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
+            MakeAtomicAnalysisOp(Analysis::Read, maybeRead),
+            MakeAtomicAnalysisOp(Analysis::None));
+      }
     } else {
       context_.Say(
           x.source, "ATOMIC READ operation should be an assignment"_err_en_US);
@@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite(

       using Analysis = parser::OpenMPAtomicConstruct::Analysis;
       x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
-          MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs),
+          MakeAtomicAnalysisOp(Analysis::Write, maybeWrite),
           MakeAtomicAnalysisOp(Analysis::None));
     } else {
       context_.Say(
diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h
index bf6fbf16d0646..835fbe45e1c0e 100644
--- a/flang/lib/Semantics/check-omp-structure.h
+++ b/flang/lib/Semantics/check-omp-structure.h
@@ -253,6 +253,7 @@ class OmpStructureChecker
   void CheckStorageOverlap(const evaluate::Expr &,
       llvm::ArrayRef>, parser::CharBlock);
+  void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source);
   void CheckAtomicVariable(
       const evaluate::Expr &, parser::CharBlock);
   std::pair&1 | FileCheck %s
-
-!CHECK: not yet implemented: atomic capture
requiring implicit type casts
-subroutine capture_with_convert_f32_to_i32()
-  implicit none
-  integer :: k, v, i
-
-  k = 1
-  v = 0
-
-  !$omp atomic capture
-  v = k
-  k = (i + 1) * 3.14
-  !$omp end atomic
-end subroutine
-
-subroutine capture_with_convert_i32_to_f64()
-  real(8) :: x
-  integer :: v
-  x = 1.0
-  v = 0
-  !$omp atomic capture
-  v = x
-  x = v
-  !$omp end atomic
-end subroutine capture_with_convert_i32_to_f64
-
-subroutine capture_with_convert_f64_to_i32()
-  integer :: x
-  real(8) :: v
-  x = 1
-  v = 0
-  !$omp atomic capture
-  x = v
-  v = x
-  !$omp end atomic
-end subroutine capture_with_convert_f64_to_i32
-
-subroutine capture_with_convert_i32_to_f32()
-  real(4) :: x
-  integer :: v
-  x = 1.0
-  v = 0
-  !$omp atomic capture
-  v = x
-  x = x + v
-  !$omp end atomic
-end subroutine capture_with_convert_i32_to_f32
diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
index 75f1cbfc979b9..aa9d2e0ac3ff7 100644
--- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
+++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
@@ -1,5 +1,3 @@
-! REQUIRES : openmp_runtime
-
 ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s

 !
CHECK: func.func @_QPatomic_implicit_cast_read() {
diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
index deb67e7614659..8adb0f1a67409 100644
--- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
+++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
@@ -25,7 +25,7 @@ program sample

     !ERROR: The synchronization hint is not valid
     !$omp atomic hint(7) capture
-    !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+    !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement
     y = x
     x = y
     !$omp end atomic
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
index c427ba07d43d8..f808ed916fb7e 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -39,7 +39,7 @@ subroutine f02
 subroutine f03
   integer :: x

-  !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+  !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture
   !$omp atomic update capture
   x = x + 1
   x = x + 2
@@ -50,7 +50,7 @@ subroutine f04
   integer :: x, v

   !$omp atomic update capture
-  !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+  !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement
   v = x
   x = v
   !$omp end atomic
@@ -60,8 +60,8 @@ subroutine f05
   integer :: x, v, z

   !$omp atomic update capture
+  !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand
side of the capture assignment should read z
   v = x
-  !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
   z = x + 1
   !$omp end atomic
 end
@@ -70,8 +70,8 @@ subroutine f06
   integer :: x, v, z

   !$omp atomic update capture
-  !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
   z = x + 1
+  !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z
   v = x
   !$omp end atomic
 end
diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
index 677b933932b44..5e180aa0bbe5b 100644
--- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
+++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
@@ -97,50 +97,50 @@ program sample
     !$omp end atomic

     !$omp atomic capture
+    !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b
     v = x
-    !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
     b = b + 1
     !$omp end atomic

     !$omp atomic capture
+    !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b
     v = x
-    !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
     b = 10
     !$omp end atomic

-    !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
     !$omp atomic capture
     x = x + 10
+    !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x
     v = b
     !$omp end atomic

-    !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+    !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture
     !$omp
atomic capture v = 1 x = 4 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) !$omp end atomic >From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 24 Mar 2025 15:38:58 -0500 Subject: [PATCH 13/16] DumpEvExpr: show type --- flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 2f445429a10b5..1553dac3b6687 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,6 +16,7 @@ #include #include +#include #include #include @@ -38,6 +39,17 @@ class DumpEvaluateExpr { } private: + template + struct TypeOf { + static constexpr std::string_view name{TypeOf::get()}; + static constexpr std::string_view get() { + std::string_view v(__PRETTY_FUNCTION__); 
+ v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" + return v; + } + }; + template void Show(const common::Indirection &x) { Show(x.value()); } @@ -76,7 +88,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant"); + Indent("derived constant "s + std::string(TypeOf::name)); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -84,7 +96,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant"); + Print("constant "s + std::string(TypeOf::name)); } } void Show(const Symbol &symbol); @@ -102,7 +114,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator"); + Indent("designator "s + std::string(TypeOf::name)); Show(x.u); Outdent(); } @@ -117,7 +129,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref"); + Indent("function ref "s + std::string(TypeOf::name)); Show(x.proc()); Show(x.arguments()); Outdent(); @@ -127,14 +139,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value"); + Indent("array constructor value "s + std::string(TypeOf::name)); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do"); + Indent("implied do "s + std::string(TypeOf::name)); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -148,20 +160,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op"); + Indent("unary op "s + std::string(TypeOf::name)); Show(op.left()); Outdent(); } 
template void Show(const evaluate::Operation &op) { - Indent("binary op"); + Indent("binary op "s + std::string(TypeOf::name)); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr T"); + Indent("expr <" + std::string(TypeOf::name) + ">"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index aa0b4e0f03398..66cedab94bfb4 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("expr some type"); + Indent("relational some type"); Show(x.u); Outdent(); } >From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 15:14:45 -0500 Subject: [PATCH 14/16] Handle conversion from real to complex via complex constructor --- flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++--- 1 file changed, 47 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dada9c6c2bd6f..ae81dcb5ea150 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,36 +3183,46 @@ struct ConvertCollector using Base::operator(); template // - Result operator()(const evaluate::Designator &x) const { + Result asSomeExpr(const T &x) const { auto copy{x}; return {AsGenericExpr(std::move(copy)), {}}; } + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + template // Result operator()(const evaluate::FunctionRef &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template // Result operator()(const evaluate::Constant &x) const { - auto copy{x}; - return 
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/16] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: } >From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:14:47 -0500 Subject: [PATCH 16/16] Allow conversion in update operations --- flang/include/flang/Semantics/tools.h | 17 ++++----- flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++-- flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++----------- .../Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++-- flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++---------- .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +- 7 files changed, 44 insertions(+), 57 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7f1ec59b087a2..9be2feb8ae064 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -789,14 +789,15 @@ inline bool checkForSymbolMatch( /// return the "expr" but with top-level parentheses stripped. std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); -/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). -/// Check if "expr" is -/// SomeType(SomeKind(Type( -/// Convert -/// SomeKind(...)[2]))) -/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves -/// TypeCategory. -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// it returns std::nullopt. 
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic From flang-commits at lists.llvm.org Fri May 16 07:23:12 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 16 May 2025 07:23:12 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <68274a50.a70a0220.3be227.c008@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852 >From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 07:58:29 -0500 Subject: [PATCH 01/18] [flang][OpenMP] Mark atomic clauses as unique The current implementation of the ATOMIC construct handles these clauses individually, and this change does not have an observable effect. At the same time these clauses are unique as per the OpenMP spec, and this patch reflects that in the OMP.td file. 
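Illustration only (not part of the patch): a minimal sketch of what "unique" means for the clauses on a directive, namely that each listed clause may appear at most once. The helper name `findRepeatedClauses` and the use of plain strings for clauses are assumptions made for this example; the actual check is derived from the `allowedOnceClauses` lists in OMP.td and is not written this way.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, not part of LLVM or flang: return the clauses that
// appear more than once on a directive, each reported a single time, in the
// order in which the first repetition is encountered.
std::vector<std::string>
findRepeatedClauses(const std::vector<std::string> &clauses) {
  std::set<std::string> seen;     // clauses encountered so far
  std::set<std::string> reported; // repeated clauses already recorded
  std::vector<std::string> repeated;
  for (const std::string &clause : clauses) {
    // insert().second is false when the element was already in the set.
    if (!seen.insert(clause).second && reported.insert(clause).second)
      repeated.push_back(clause);
  }
  return repeated;
}
```

Under this sketch, a clause list such as {"seq_cst", "hint", "seq_cst"} would yield "seq_cst" as the repeated clause, while {"capture", "acquire"} would yield nothing.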
--- llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index eff6d57995d2b..cdfd3e3223fa8 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> { ]; } def OMP_Atomic : Directive<"atomic"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; let allowedOnceClauses = [ VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, + VersionedClause, ]; let association = AS_Block; let category = CA_Executable; @@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let category = CA_Executable; } def OMP_Critical : Directive<"critical"> { - let allowedClauses = [ + let allowedOnceClauses = [ VersionedClause, ]; let association = AS_Block; >From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 10:08:58 -0500 Subject: [PATCH 02/18] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs The OpenMP implementation of the ATOMIC construct will change in the near future to accommodate OpenMP 6.0. This patch separates the shared implementations to avoid interfering with OpenACC. 
--- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t 
hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. - auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite, - loc); + genAtomicWrite(converter, atomicWrite, loc); }, [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) { - Fortran::lower::genOmpAccAtomicUpdate< - Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate, - loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const Fortran::parser::AccAtomicCapture &atomicCapture) { - Fortran::lower::genOmpAccAtomicCapture< - Fortran::parser::AccAtomicCapture, void>(converter, - atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, }, atomicConstruct.u); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 312557d5da07e..fdd85e94829f3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +//===----------------------------------------------------------------------===// +// Code generation for atomic operations +//===----------------------------------------------------------------------===// + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. 
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/18] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
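A minimal sketch of the idea in the message above, for illustration only: the UPDATE clause either carries no argument (ATOMIC) or carries one of two dependence kinds (DEPOBJ). The names `DependenceType`, `TaskDependenceType`, `UpdateClause` and `describeUpdate` are hypothetical stand-ins, not the flang parser or lowering types, and the sketch assumes only the C++17 standard library.

```cpp
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the two argument kinds that UPDATE may carry
// on DEPOBJ. These are NOT the flang parse-tree types.
enum class DependenceType { Sink, Source };
enum class TaskDependenceType { In, Out, Inout };

// The argument is optional: absent for the ATOMIC form, present for the
// DEPOBJ form.
struct UpdateClause {
  std::optional<std::variant<DependenceType, TaskDependenceType>> v;
};

// Classify the clause by first testing for the "no argument" case and only
// then looking at which alternative the variant holds.
std::string describeUpdate(const UpdateClause &clause) {
  if (!clause.v)
    return "update (no argument)";
  if (std::holds_alternative<DependenceType>(*clause.v))
    return "update(dependence-type)";
  return "update(task-dependence-type)";
}
```

Any consumer of such a clause has to check for the empty optional before visiting the variant, which is the same shape of change the patch below makes wherever `inp.v.u` or `x.v.u` was previously accessed unconditionally.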
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/18] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be a parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having a `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed.
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 |
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x);
+
 } // namespace Fortran::semantics
 #endif // FORTRAN_SEMANTICS_TOOLS_H_
diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
index b88454c45da85..0d941bf39afc3 100644
--- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
@@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx,
   const parser::CharBlock *source = nullptr;
   auto ompConsVisit = [&](const parser::OpenMPConstruct &x) {
-    std::visit(common::visitors{
-                   [&](const parser::OpenMPSectionsConstruct &x) {
-                     source = &std::get<0>(x.t).source;
-                   },
-                   [&](const parser::OpenMPLoopConstruct &x) {
-                     source = &std::get<0>(x.t).source;
-                   },
-                   [&](const parser::OpenMPBlockConstruct &x) {
-                     source = &std::get<0>(x.t).source;
-                   },
-                   [&](const parser::OpenMPCriticalConstruct &x) {
-                     source = &std::get<0>(x.t).source;
-                   },
-                   [&](const parser::OpenMPAtomicConstruct &x) {
-                     std::visit([&](const auto &x) { source = &x.source; },
-                                x.u);
-                   },
-                   [&](const auto &x) { source = &x.source; },
-               },
-               x.u);
+    std::visit(
+        common::visitors{
+            [&](const parser::OpenMPSectionsConstruct &x) {
+              source = &std::get<0>(x.t).source;
+            },
+            [&](const parser::OpenMPLoopConstruct &x) {
+              source = &std::get<0>(x.t).source;
+            },
+            [&](const parser::OpenMPBlockConstruct &x) {
+              source = &std::get<0>(x.t).source;
+            },
+            [&](const parser::OpenMPCriticalConstruct &x) {
+              source = &std::get<0>(x.t).source;
+            },
+            [&](const parser::OpenMPAtomicConstruct &x) {
+              source = &std::get(x.t).source;
+            },
+            [&](const auto &x) { source = &x.source; },
+        },
+        x.u);
   };
   eval.visit(common::visitors{
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index fdd85e94829f3..ab868df76d298 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
 //===----------------------------------------------------------------------===//
 // Code generation for atomic operations
 //===----------------------------------------------------------------------===//
+static fir::FirOpBuilder::InsertPoint
+getInsertionPointBefore(mlir::Operation *op) {
+  return fir::FirOpBuilder::InsertPoint(op->getBlock(),
+                                        mlir::Block::iterator(op));
+}
-/// Populates \p hint and \p memoryOrder with appropriate clause information
-/// if present on atomic construct.
-static void genOmpAtomicHintAndMemoryOrderClauses(
-    lower::AbstractConverter &converter,
-    const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint,
-    mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) {
-  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
-  for (const parser::OmpAtomicClause &clause : clauseList.v) {
-    common::visit(
-        common::visitors{
-            [&](const parser::OmpMemoryOrderClause &s) {
-              auto kind = common::visit(
-                  common::visitors{
-                      [&](const parser::OmpClause::AcqRel &) {
-                        return mlir::omp::ClauseMemoryOrderKind::Acq_rel;
-                      },
-                      [&](const parser::OmpClause::Acquire &) {
-                        return mlir::omp::ClauseMemoryOrderKind::Acquire;
-                      },
-                      [&](const parser::OmpClause::Relaxed &) {
-                        return mlir::omp::ClauseMemoryOrderKind::Relaxed;
-                      },
-                      [&](const parser::OmpClause::Release &) {
-                        return mlir::omp::ClauseMemoryOrderKind::Release;
-                      },
-                      [&](const parser::OmpClause::SeqCst &) {
-                        return mlir::omp::ClauseMemoryOrderKind::Seq_cst;
-                      },
-                      [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind {
-                        llvm_unreachable("Unexpected clause");
-                      },
-                  },
-                  s.v.u);
-              memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get(
-                  firOpBuilder.getContext(), kind);
-            },
-            [&](const parser::OmpHintClause &s) {
-              const auto *expr = semantics::GetExpr(s.v);
-              uint64_t hintExprValue = *evaluate::ToInt64(*expr);
-              hint = firOpBuilder.getI64IntegerAttr(hintExprValue);
-            },
-            [&](const parser::OmpFailClause &) {},
-        },
-        clause.u);
+static fir::FirOpBuilder::InsertPoint
+getInsertionPointAfter(mlir::Operation *op) {
+  return fir::FirOpBuilder::InsertPoint(op->getBlock(),
+                                        ++mlir::Block::iterator(op));
+}
+
+static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter,
+                                       const List &clauses) {
+  fir::FirOpBuilder &builder = converter.getFirOpBuilder();
+  for (const Clause &clause : clauses) {
+    if (clause.id != llvm::omp::Clause::OMPC_hint)
+      continue;
+    auto &hint = std::get(clause.u);
+    auto maybeVal = evaluate::ToInt64(hint.v);
+    CHECK(maybeVal);
+    return builder.getI64IntegerAttr(*maybeVal);
   }
+  return nullptr;
 }
-static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) {
-  if (!elementType)
-    return;
-  assert(fir::isa_trivial(fir::unwrapRefType(elementType)) &&
-         "is supported type for omp atomic");
-}
-
-/// Used to generate atomic.read operation which is created in existing
-/// location set by builder.
-static void genAtomicCaptureStatement(
-    lower::AbstractConverter &converter, mlir::Value fromAddress,
-    mlir::Value toAddress,
-    const parser::OmpAtomicClauseList *leftHandClauseList,
-    const parser::OmpAtomicClauseList *rightHandClauseList,
-    mlir::Type elementType, mlir::Location loc) {
-  // Generate `atomic.read` operation for atomic assigment statements
-  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+static mlir::omp::ClauseMemoryOrderKindAttr
+getAtomicMemoryOrder(lower::AbstractConverter &converter,
+                     semantics::SemanticsContext &semaCtx,
+                     const List &clauses) {
+  std::optional kind;
+  unsigned version = semaCtx.langOptions().OpenMPVersion;
-  processOmpAtomicTODO(elementType, loc);
-
-  // If no hint clause is specified, the effect is as if
-  // hint(omp_sync_hint_none) had been specified.
-  mlir::IntegerAttr hint = nullptr;
-
-  mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr;
-  if (leftHandClauseList)
-    genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint,
-                                          memoryOrder);
-  if (rightHandClauseList)
-    genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint,
-                                          memoryOrder);
-  firOpBuilder.create(loc, fromAddress, toAddress,
-                      mlir::TypeAttr::get(elementType),
-                      hint, memoryOrder);
-}
-
-/// Used to generate atomic.write operation which is created in existing
-/// location set by builder.
-static void genAtomicWriteStatement(
-    lower::AbstractConverter &converter, mlir::Value lhsAddr,
-    mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList,
-    const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc,
-    mlir::Value *evaluatedExprValue = nullptr) {
-  // Generate `atomic.write` operation for atomic assignment statements
-  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+  for (const Clause &clause : clauses) {
+    switch (clause.id) {
+    case llvm::omp::Clause::OMPC_acq_rel:
+      kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel;
+      break;
+    case llvm::omp::Clause::OMPC_acquire:
+      kind = mlir::omp::ClauseMemoryOrderKind::Acquire;
+      break;
+    case llvm::omp::Clause::OMPC_relaxed:
+      kind = mlir::omp::ClauseMemoryOrderKind::Relaxed;
+      break;
+    case llvm::omp::Clause::OMPC_release:
+      kind = mlir::omp::ClauseMemoryOrderKind::Release;
+      break;
+    case llvm::omp::Clause::OMPC_seq_cst:
+      kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst;
+      break;
+    default:
+      break;
+    }
+  }
-  mlir::Type varType = fir::unwrapRefType(lhsAddr.getType());
-  // Create a conversion outside the capture block.
-  auto insertionPoint = firOpBuilder.saveInsertionPoint();
-  firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp());
-  rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr);
-  firOpBuilder.restoreInsertionPoint(insertionPoint);
-
-  processOmpAtomicTODO(varType, loc);
-
-  // If no hint clause is specified, the effect is as if
-  // hint(omp_sync_hint_none) had been specified.
-  mlir::IntegerAttr hint = nullptr;
-  mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr;
-  if (leftHandClauseList)
-    genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint,
-                                          memoryOrder);
-  if (rightHandClauseList)
-    genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint,
-                                          memoryOrder);
-  firOpBuilder.create(loc, lhsAddr, rhsExpr, hint,
-                      memoryOrder);
-}
-
-/// Used to generate atomic.update operation which is created in existing
-/// location set by builder.
-static void genAtomicUpdateStatement(
-    lower::AbstractConverter &converter, mlir::Value lhsAddr,
-    mlir::Type varType, const parser::Variable &assignmentStmtVariable,
-    const parser::Expr &assignmentStmtExpr,
-    const parser::OmpAtomicClauseList *leftHandClauseList,
-    const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc,
-    mlir::Operation *atomicCaptureOp = nullptr) {
-  // Generate `atomic.update` operation for atomic assignment statements
-  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
-  mlir::Location currentLocation = converter.getCurrentLocation();
+  // Starting with 5.1, if no memory-order clause is present, the effect
+  // is as if "relaxed" was present.
+  if (!kind) {
+    if (version <= 50)
+      return nullptr;
+    kind = mlir::omp::ClauseMemoryOrderKind::Relaxed;
+  }
+  fir::FirOpBuilder &builder = converter.getFirOpBuilder();
+  return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind);
+}
+
+static mlir::Operation * //
+genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc,
+              lower::StatementContext &stmtCtx, mlir::Value atomAddr,
+              const semantics::SomeExpr &atom, const semantics::SomeExpr &expr,
+              mlir::IntegerAttr hint,
+              mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+              fir::FirOpBuilder::InsertPoint atomicAt,
+              fir::FirOpBuilder::InsertPoint prepareAt) {
+  fir::FirOpBuilder &builder = converter.getFirOpBuilder();
+  fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint();
+  builder.restoreInsertionPoint(prepareAt);
+
+  mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc));
+  mlir::Type atomType = fir::unwrapRefType(atomAddr.getType());
+
+  builder.restoreInsertionPoint(atomicAt);
+  mlir::Operation *op = builder.create(
+      loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder);
+  builder.restoreInsertionPoint(saved);
+  return op;
+}
-  // Create the omp.atomic.update or acc.atomic.update operation
-  //
-  // func.func @_QPsb() {
-  // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"}
-  // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"}
-  // %2 = fir.load %1 : !fir.ref
-  // omp.atomic.update %0 : !fir.ref {
-  // ^bb0(%arg0: i32):
-  // %3 = arith.addi %arg0, %2 : i32
-  // omp.yield(%3 : i32)
-  // }
-  // return
-  // }
-
-  auto getArgExpression =
-      [](std::list::const_iterator it) {
-        const auto &arg{std::get((*it).t)};
-        const auto *parserExpr{
-            std::get_if>(&arg.u)};
-        return parserExpr;
-      };
+static mlir::Operation * //
+genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc,
+               lower::StatementContext &stmtCtx, mlir::Value atomAddr,
+               const semantics::SomeExpr &atom, const semantics::SomeExpr &expr,
+               mlir::IntegerAttr hint,
+               mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+               fir::FirOpBuilder::InsertPoint atomicAt,
+               fir::FirOpBuilder::InsertPoint prepareAt) {
+  fir::FirOpBuilder &builder = converter.getFirOpBuilder();
+  fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint();
+  builder.restoreInsertionPoint(prepareAt);
+
+  mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc));
+  mlir::Type atomType = fir::unwrapRefType(atomAddr.getType());
+  mlir::Value converted = builder.createConvert(loc, atomType, value);
+
+  builder.restoreInsertionPoint(atomicAt);
+  mlir::Operation *op = builder.create(
+      loc, atomAddr, converted, hint, memOrder);
+  builder.restoreInsertionPoint(saved);
+  return op;
+}
-  // Lower any non atomic sub-expression before the atomic operation, and
-  // map its lowered value to the semantic representation.
-  lower::ExprToValueMap exprValueOverrides;
-  // Max and min intrinsics can have a list of Args. Hence we need a list
-  // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted.
-  llvm::SmallVector nonAtomicSubExprs;
-  common::visit(
-      common::visitors{
-          [&](const common::Indirection &funcRef)
-              -> void {
-            const auto &args{std::get>(
-                funcRef.value().v.t)};
-            std::list::const_iterator beginIt =
-                args.begin();
-            std::list::const_iterator endIt = args.end();
-            const auto *exprFirst{getArgExpression(beginIt)};
-            if (exprFirst && exprFirst->value().source ==
-                    assignmentStmtVariable.GetSource()) {
-              // Add everything except the first
-              beginIt++;
-            } else {
-              // Add everything except the last
-              endIt--;
-            }
-            std::list::const_iterator it;
-            for (it = beginIt; it != endIt; it++) {
-              const common::Indirection *expr =
-                  getArgExpression(it);
-              if (expr)
-                nonAtomicSubExprs.push_back(semantics::GetExpr(*expr));
-            }
-          },
-          [&](const auto &op) -> void {
-            using T = std::decay_t;
-            if constexpr (std::is_base_of::value) {
-              const auto &exprLeft{std::get<0>(op.t)};
-              const auto &exprRight{std::get<1>(op.t)};
-              if (exprLeft.value().source == assignmentStmtVariable.GetSource())
-                nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight));
-              else
-                nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft));
-            }
-          },
-      },
-      assignmentStmtExpr.u);
-  lower::StatementContext nonAtomicStmtCtx;
-  if (!nonAtomicSubExprs.empty()) {
-    // Generate non atomic part before all the atomic operations.
-    auto insertionPoint = firOpBuilder.saveInsertionPoint();
-    if (atomicCaptureOp)
-      firOpBuilder.setInsertionPoint(atomicCaptureOp);
-    mlir::Value nonAtomicVal;
-    for (auto *nonAtomicSubExpr : nonAtomicSubExprs) {
-      nonAtomicVal = fir::getBase(converter.genExprValue(
-          currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx));
-      exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal);
+static mlir::Operation *
+genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc,
+                lower::StatementContext &stmtCtx, mlir::Value atomAddr,
+                const semantics::SomeExpr &atom,
+                const semantics::SomeExpr &expr, mlir::IntegerAttr hint,
+                mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+                fir::FirOpBuilder::InsertPoint atomicAt,
+                fir::FirOpBuilder::InsertPoint prepareAt) {
+  lower::ExprToValueMap overrides;
+  lower::StatementContext naCtx;
+  fir::FirOpBuilder &builder = converter.getFirOpBuilder();
+  fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint();
+  builder.restoreInsertionPoint(prepareAt);
+
+  mlir::Type atomType = fir::unwrapRefType(atomAddr.getType());
+
+  std::vector args{semantics::GetOpenMPTopLevelArguments(expr)};
+  assert(!args.empty() && "Update operation without arguments");
+  for (auto &arg : args) {
+    if (!semantics::IsSameOrResizeOf(arg, atom)) {
+      mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc));
+      overrides.try_emplace(&arg, val);
     }
-    if (atomicCaptureOp)
-      firOpBuilder.restoreInsertionPoint(insertionPoint);
   }
-  mlir::Operation *atomicUpdateOp = nullptr;
-  // If no hint clause is specified, the effect is as if
-  // hint(omp_sync_hint_none) had been specified.
-  mlir::IntegerAttr hint = nullptr;
-  mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr;
-  if (leftHandClauseList)
-    genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint,
-                                          memoryOrder);
-  if (rightHandClauseList)
-    genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint,
-                                          memoryOrder);
-  atomicUpdateOp = firOpBuilder.create(
-      currentLocation, lhsAddr, hint, memoryOrder);
-
-  processOmpAtomicTODO(varType, loc);
-
-  llvm::SmallVector varTys = {varType};
-  llvm::SmallVector locs = {currentLocation};
-  firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs);
-  mlir::Value val =
-      fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0));
-
-  exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable),
-                                 val);
-  {
-    // statement context inside the atomic block.
-    converter.overrideExprValues(&exprValueOverrides);
-    lower::StatementContext atomicStmtCtx;
-    mlir::Value rhsExpr = fir::getBase(converter.genExprValue(
-        *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx));
-    mlir::Value convertResult =
-        firOpBuilder.createConvert(currentLocation, varType, rhsExpr);
-    firOpBuilder.create(currentLocation, convertResult);
-    converter.resetExprOverrides();
-  }
-  firOpBuilder.setInsertionPointAfter(atomicUpdateOp);
-}
-
-/// Processes an atomic construct with write clause.
-static void genAtomicWrite(lower::AbstractConverter &converter,
-                           const parser::OmpAtomicWrite &atomicWrite,
-                           mlir::Location loc) {
-  const parser::OmpAtomicClauseList *rightHandClauseList = nullptr;
-  const parser::OmpAtomicClauseList *leftHandClauseList = nullptr;
-  // Get the address of atomic read operands.
-  rightHandClauseList = &std::get<2>(atomicWrite.t);
-  leftHandClauseList = &std::get<0>(atomicWrite.t);
-
-  const parser::AssignmentStmt &stmt =
-      std::get>(atomicWrite.t)
-          .statement;
-  const evaluate::Assignment &assign = *stmt.typedAssignment->v;
-  lower::StatementContext stmtCtx;
-  // Get the value and address of atomic write operands.
-  mlir::Value rhsExpr =
-      fir::getBase(converter.genExprValue(assign.rhs, stmtCtx));
-  mlir::Value lhsAddr =
-      fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx));
-  genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList,
-                          rightHandClauseList, loc);
-}
-
-/// Processes an atomic construct with read clause.
-static void genAtomicRead(lower::AbstractConverter &converter,
-                          const parser::OmpAtomicRead &atomicRead,
-                          mlir::Location loc) {
-  const parser::OmpAtomicClauseList *rightHandClauseList = nullptr;
-  const parser::OmpAtomicClauseList *leftHandClauseList = nullptr;
-  // Get the address of atomic read operands.
-  rightHandClauseList = &std::get<2>(atomicRead.t);
-  leftHandClauseList = &std::get<0>(atomicRead.t);
-
-  const auto &assignmentStmtExpr = std::get(
-      std::get>(atomicRead.t)
-          .statement.t);
-  const auto &assignmentStmtVariable = std::get(
-      std::get>(atomicRead.t)
-          .statement.t);
+  builder.restoreInsertionPoint(atomicAt);
+  auto updateOp =
+      builder.create(loc, atomAddr, hint, memOrder);
-  lower::StatementContext stmtCtx;
-  const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr);
-  mlir::Type elementType = converter.genType(fromExpr);
-  mlir::Value fromAddress =
-      fir::getBase(converter.genExprAddr(fromExpr, stmtCtx));
-  mlir::Value toAddress = fir::getBase(converter.genExprAddr(
-      *semantics::GetExpr(assignmentStmtVariable), stmtCtx));
-  genAtomicCaptureStatement(converter, fromAddress, toAddress,
-                            leftHandClauseList, rightHandClauseList,
-                            elementType, loc);
-}
-
-/// Processes an atomic construct with update clause.
-static void genAtomicUpdate(lower::AbstractConverter &converter,
-                            const parser::OmpAtomicUpdate &atomicUpdate,
-                            mlir::Location loc) {
-  const parser::OmpAtomicClauseList *rightHandClauseList = nullptr;
-  const parser::OmpAtomicClauseList *leftHandClauseList = nullptr;
-  // Get the address of atomic read operands.
-  rightHandClauseList = &std::get<2>(atomicUpdate.t);
-  leftHandClauseList = &std::get<0>(atomicUpdate.t);
-
-  const auto &assignmentStmtExpr = std::get(
-      std::get>(atomicUpdate.t)
-          .statement.t);
-  const auto &assignmentStmtVariable = std::get(
-      std::get>(atomicUpdate.t)
-          .statement.t);
+  mlir::Region &region = updateOp->getRegion(0);
+  mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc});
+  mlir::Value localAtom = fir::getBase(block->getArgument(0));
+  overrides.try_emplace(&atom, localAtom);
-  lower::StatementContext stmtCtx;
-  mlir::Value lhsAddr = fir::getBase(converter.genExprAddr(
-      *semantics::GetExpr(assignmentStmtVariable), stmtCtx));
-  mlir::Type varType = fir::unwrapRefType(lhsAddr.getType());
-  genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable,
-                           assignmentStmtExpr, leftHandClauseList,
-                           rightHandClauseList, loc);
-}
-
-/// Processes an atomic construct with no clause - which implies update clause.
-static void genOmpAtomic(lower::AbstractConverter &converter,
-                         const parser::OmpAtomic &atomicConstruct,
-                         mlir::Location loc) {
-  const parser::OmpAtomicClauseList &atomicClauseList =
-      std::get(atomicConstruct.t);
-  const auto &assignmentStmtExpr = std::get(
-      std::get>(atomicConstruct.t)
-          .statement.t);
-  const auto &assignmentStmtVariable = std::get(
-      std::get>(atomicConstruct.t)
-          .statement.t);
-  lower::StatementContext stmtCtx;
-  mlir::Value lhsAddr = fir::getBase(converter.genExprAddr(
-      *semantics::GetExpr(assignmentStmtVariable), stmtCtx));
-  mlir::Type varType = fir::unwrapRefType(lhsAddr.getType());
-  // If atomic-clause is not present on the construct, the behaviour is as if
-  // the update clause is specified (for both OpenMP and OpenACC).
-  genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable,
-                           assignmentStmtExpr, &atomicClauseList, nullptr, loc);
-}
-
-/// Processes an atomic construct with capture clause.
-static void genAtomicCapture(lower::AbstractConverter &converter,
-                             const parser::OmpAtomicCapture &atomicCapture,
-                             mlir::Location loc) {
-  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+  converter.overrideExprValues(&overrides);
+  mlir::Value updated =
+      fir::getBase(converter.genExprValue(expr, stmtCtx, &loc));
+  mlir::Value converted = builder.createConvert(loc, atomType, updated);
+  builder.create(loc, converted);
+  converter.resetExprOverrides();
-  const parser::AssignmentStmt &stmt1 =
-      std::get(atomicCapture.t).v.statement;
-  const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v;
-  const auto &stmt1Var{std::get(stmt1.t)};
-  const auto &stmt1Expr{std::get(stmt1.t)};
-  const parser::AssignmentStmt &stmt2 =
-      std::get(atomicCapture.t).v.statement;
-  const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v;
-  const auto &stmt2Var{std::get(stmt2.t)};
-  const auto &stmt2Expr{std::get(stmt2.t)};
-
-  // Pre-evaluate expressions to be used in the various operations inside
-  // `atomic.capture` since it is not desirable to have anything other than
-  // a `atomic.read`, `atomic.write`, or `atomic.update` operation
-  // inside `atomic.capture`
-  lower::StatementContext stmtCtx;
-  // LHS evaluations are common to all combinations of `atomic.capture`
-  mlir::Value stmt1LHSArg =
-      fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx));
-  mlir::Value stmt2LHSArg =
-      fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx));
-
-  // Type information used in generation of `atomic.update` operation
-  mlir::Type stmt1VarType =
-      fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType();
-  mlir::Type stmt2VarType =
-      fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType();
-
-  mlir::Operation *atomicCaptureOp = nullptr;
-  mlir::IntegerAttr hint = nullptr;
-  mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr;
-  const parser::OmpAtomicClauseList &rightHandClauseList =
-      std::get<2>(atomicCapture.t);
-  const parser::OmpAtomicClauseList &leftHandClauseList =
-      std::get<0>(atomicCapture.t);
-  genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint,
-                                        memoryOrder);
-  genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint,
-                                        memoryOrder);
-  atomicCaptureOp =
-      firOpBuilder.create(loc, hint, memoryOrder);
-
-  firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0)));
-  mlir::Block &block = atomicCaptureOp->getRegion(0).back();
-  firOpBuilder.setInsertionPointToStart(&block);
-  if (semantics::checkForSingleVariableOnRHS(stmt1)) {
-    if (semantics::checkForSymbolMatch(stmt2)) {
-      // Atomic capture construct is of the form [capture-stmt, update-stmt]
-      const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr);
-      mlir::Type elementType = converter.genType(fromExpr);
-      genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg,
-                                /*leftHandClauseList=*/nullptr,
-                                /*rightHandClauseList=*/nullptr, elementType,
-                                loc);
-      genAtomicUpdateStatement(
-          converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr,
-          /*leftHandClauseList=*/nullptr,
-          /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp);
-    } else {
-      // Atomic capture construct is of the form [capture-stmt, write-stmt]
-      firOpBuilder.setInsertionPoint(atomicCaptureOp);
-      mlir::Value stmt2RHSArg =
-          fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx));
-      firOpBuilder.setInsertionPointToStart(&block);
-      const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr);
-      mlir::Type elementType = converter.genType(fromExpr);
-      genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg,
-                                /*leftHandClauseList=*/nullptr,
-                                /*rightHandClauseList=*/nullptr, elementType,
-                                loc);
-      genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg,
-                              /*leftHandClauseList=*/nullptr,
-                              /*rightHandClauseList=*/nullptr, loc);
-    }
-  } else {
-    // Atomic capture construct is of the form [update-stmt, capture-stmt]
-    const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr);
-    mlir::Type elementType = converter.genType(fromExpr);
-    genAtomicUpdateStatement(
-        converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr,
-        /*leftHandClauseList=*/nullptr,
-        /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp);
-    genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg,
-                              /*leftHandClauseList=*/nullptr,
-                              /*rightHandClauseList=*/nullptr, elementType,
-                              loc);
+  builder.restoreInsertionPoint(saved);
+  return updateOp;
+}
+
+static mlir::Operation *
+genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc,
+                   lower::StatementContext &stmtCtx, int action,
+                   mlir::Value atomAddr, const semantics::SomeExpr &atom,
+                   const semantics::SomeExpr &expr, mlir::IntegerAttr hint,
+                   mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+                   fir::FirOpBuilder::InsertPoint atomicAt,
+                   fir::FirOpBuilder::InsertPoint prepareAt) {
+  switch (action) {
+  case parser::OpenMPAtomicConstruct::Analysis::Read:
+    return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint,
+                         memOrder, atomicAt, prepareAt);
+  case parser::OpenMPAtomicConstruct::Analysis::Write:
+    return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint,
+                          memOrder, atomicAt, prepareAt);
+  case parser::OpenMPAtomicConstruct::Analysis::Update:
+    return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint,
+                           memOrder, atomicAt, prepareAt);
+  default:
+    return nullptr;
   }
-  firOpBuilder.setInsertionPointToEnd(&block);
-  firOpBuilder.create(loc);
-  firOpBuilder.setInsertionPointToStart(&block);
 }
 //===----------------------------------------------------------------------===//
@@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
       standaloneConstruct.u);
 }
-//===----------------------------------------------------------------------===//
-// OpenMPConstruct visitors
-//===----------------------------------------------------------------------===//
-
 static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
                    semantics::SemanticsContext &semaCtx,
                    lower::pft::Evaluation &eval,
@@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
   TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct");
 }
+//===----------------------------------------------------------------------===//
+// OpenMPConstruct visitors
+//===----------------------------------------------------------------------===//
+
+[[maybe_unused]] static void
+dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) {
+  auto whatStr = [](int k) {
+    std::string txt = "?";
+    switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) {
+    case parser::OpenMPAtomicConstruct::Analysis::None:
+      txt = "None";
+      break;
+    case parser::OpenMPAtomicConstruct::Analysis::Read:
+      txt = "Read";
+      break;
+    case parser::OpenMPAtomicConstruct::Analysis::Write:
+      txt = "Write";
+      break;
+    case parser::OpenMPAtomicConstruct::Analysis::Update:
+      txt = "Update";
+      break;
+    }
+    switch (k & parser::OpenMPAtomicConstruct::Analysis::Condition) {
+    case parser::OpenMPAtomicConstruct::Analysis::IfTrue:
+      txt += " | IfTrue";
+      break;
+    case parser::OpenMPAtomicConstruct::Analysis::IfFalse:
+      txt += " | IfFalse";
+      break;
+    }
+    return txt;
+  };
+
+  auto exprStr = [&](const parser::TypedExpr &expr) {
+    if (auto *maybe = expr.get()) {
+      if (maybe->v)
+        return maybe->v->AsFortran();
+    }
+    return ""s;
+  };
+
+  const SomeExpr &atom = *analysis.atom.get()->v;
+
+  llvm::errs() << "Analysis {\n";
+  llvm::errs() << " atom: " << atom.AsFortran() << "\n";
+  llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n";
+  llvm::errs() << " op0 {\n";
+  llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n";
+  llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n";
+  llvm::errs() << " }\n";
+  llvm::errs() << " op1 {\n";
+  llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n";
+  llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n";
+  llvm::errs() << " }\n";
+  llvm::errs() << "}\n";
+}
+
 static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
                    semantics::SemanticsContext &semaCtx,
                    lower::pft::Evaluation &eval,
-                   const parser::OpenMPAtomicConstruct &atomicConstruct) {
-  Fortran::common::visit(
-      common::visitors{
-          [&](const parser::OmpAtomicRead &atomicRead) {
-            mlir::Location loc = converter.genLocation(atomicRead.source);
-            genAtomicRead(converter, atomicRead, loc);
-          },
-          [&](const parser::OmpAtomicWrite &atomicWrite) {
-            mlir::Location loc = converter.genLocation(atomicWrite.source);
-            genAtomicWrite(converter, atomicWrite, loc);
-          },
-          [&](const parser::OmpAtomic &atomicConstruct) {
-            mlir::Location loc = converter.genLocation(atomicConstruct.source);
-            genOmpAtomic(converter, atomicConstruct, loc);
-          },
-          [&](const parser::OmpAtomicUpdate &atomicUpdate) {
-            mlir::Location loc = converter.genLocation(atomicUpdate.source);
-            genAtomicUpdate(converter, atomicUpdate, loc);
-          },
-          [&](const parser::OmpAtomicCapture &atomicCapture) {
-            mlir::Location loc = converter.genLocation(atomicCapture.source);
-            genAtomicCapture(converter, atomicCapture, loc);
-          },
-          [&](const parser::OmpAtomicCompare &atomicCompare) {
-            mlir::Location loc = converter.genLocation(atomicCompare.source);
-            TODO(loc, "OpenMP atomic compare");
-          },
-      },
-      atomicConstruct.u);
+                   const parser::OpenMPAtomicConstruct &construct) {
+  auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * {
+    if (auto *maybe = expr.get(); maybe && maybe->v) {
+      return &*maybe->v;
+    } else {
+      return nullptr;
+    }
+  };
+
+  fir::FirOpBuilder &builder = converter.getFirOpBuilder();
+  auto &dirSpec = std::get(construct.t);
+  List clauses = makeClauses(dirSpec.Clauses(), semaCtx);
+  lower::StatementContext stmtCtx;
+
+  const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis;
+  const semantics::SomeExpr &atom = *get(analysis.atom);
+  mlir::Location loc = converter.genLocation(construct.source);
+  mlir::Value atomAddr =
+      fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc));
+  mlir::IntegerAttr hint = getAtomicHint(converter, clauses);
+  mlir::omp::ClauseMemoryOrderKindAttr memOrder =
+      getAtomicMemoryOrder(converter, semaCtx, clauses);
+
+  if (auto *cond = get(analysis.cond)) {
+    (void)cond;
+    TODO(loc, "OpenMP ATOMIC COMPARE");
+  } else {
+    int action0 = analysis.op0.what & analysis.Action;
+    int action1 = analysis.op1.what & analysis.Action;
+    mlir::Operation *captureOp = nullptr;
+    fir::FirOpBuilder::InsertPoint atomicAt;
+    fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint();
+
+    if (construct.IsCapture()) {
+      // Capturing operation.
+      assert(action0 != analysis.None && action1 != analysis.None &&
+             "Expexcing two actions");
+      captureOp =
+          builder.create(loc, hint, memOrder);
+      // Set the non-atomic insertion point to before the atomic.capture.
+      prepareAt = getInsertionPointBefore(captureOp);
+
+      mlir::Block *block = builder.createBlock(&captureOp->getRegion(0));
+      builder.setInsertionPointToEnd(block);
+      // Set the atomic insertion point to before the terminator inside
+      // atomic.capture.
+      mlir::Operation *term = builder.create(loc);
+      atomicAt = getInsertionPointBefore(term);
+      hint = nullptr;
+      memOrder = nullptr;
+    } else {
+      // Non-capturing operation.
+      assert(action0 != analysis.None && action1 == analysis.None &&
+             "Expexcing single action");
+      assert(!(analysis.op0.what & analysis.Condition));
+      atomicAt = prepareAt;
+    }
+
+    mlir::Operation *firstOp = genAtomicOperation(
+        converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom,
+        *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt);
+    assert(firstOp && "Should have created an atomic operation");
+    atomicAt = getInsertionPointAfter(firstOp);
+
+    mlir::Operation *secondOp = genAtomicOperation(
+        converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom,
+        *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt);
+
+    if (secondOp) {
+      builder.setInsertionPointAfter(secondOp);
+    } else {
+      builder.setInsertionPointAfter(firstOp);
+    }
+  }
 }
 static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp
index bfca4e3f1730a..b30263ad54fa4 100644
--- a/flang/lib/Parser/openmp-parsers.cpp
+++ b/flang/lib/Parser/openmp-parsers.cpp
@@ -24,6 +24,12 @@
 // OpenMP Directives and Clauses
 namespace Fortran::parser {
+// Helper function to print the buffer contents starting at the current point.
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (begin != end && *begin == ' ') { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ...)" is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+ auto &dirSpec{std::get(x.t)}; + auto &clauseList{std::get>(dirSpec.t)}; if (clauseList) { - atomicDirectiveDefaultOrderFound_ = true; - switch (*defaultMemOrder) { - case common::OmpMemoryOrderType::Acq_Rel: - clauseList->emplace_back(common::visit( - common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause { - return parser::OmpClause::Acquire{}; - }, - [](parser::OmpAtomicCapture &) -> parser::OmpClause { - return parser::OmpClause::AcqRel{}; - }, - [](auto &) -> parser::OmpClause { - // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite} - return parser::OmpClause::Release{}; - }}, - x.u)); - break; - case common::OmpMemoryOrderType::Relaxed: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::Relaxed{}}); - break; - case common::OmpMemoryOrderType::Seq_Cst: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::SeqCst{}}); - break; - default: - // FIXME: Don't process other values at the moment since their validity - // depends on the OpenMP version (which is unavailable here). - break; + if (findMemOrderClause(*clauseList)) { + return false; } + } else { + clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){}); + } + + // Add a memory order clause to the atomic directive. + atomicDirectiveDefaultOrderFound_ = true; + switch (*defaultMemOrder) { + case common::OmpMemoryOrderType::Acq_Rel: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}}); + break; + case common::OmpMemoryOrderType::Relaxed: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}}); + break; + case common::OmpMemoryOrderType::Seq_Cst: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}}); + break; + default: + // FIXME: Don't process other values at the moment since their validity + // depends on the OpenMP version (which is unavailable here). 
+ break; } return false; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 index b82bd13622764..6f58e0939a787 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 index 88ec6fe910b9e..6729be6e5cf8b 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b8422c8f3b769 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture() !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr !CHECK: omp.atomic.capture { !CHECK: omp.atomic.update 
%[[VAL_A_BOX_ADDR]] : !fir.ptr { !CHECK: ^bb0(%[[ARG:.*]]: i32): !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32 !CHECK: omp.yield(%[[TEMP]] : i32) !CHECK: } -!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 +!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 !CHECK: } !CHECK: return !CHECK: } diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90 index 13392ad76471f..6eded49b0b15d 100644 --- a/flang/test/Lower/OpenMP/atomic-write.f90 +++ b/flang/test/Lower/OpenMP/atomic-write.f90 @@ -44,9 +44,9 @@ end program OmpAtomicWrite !CHECK-LABEL: func.func @_QPatomic_write_pointer() { !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"} !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) -!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32 !CHECK: %[[C2:.*]] = arith.constant 2 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90 deleted file mode 100644 index 5cd02698ff482..0000000000000 --- a/flang/test/Parser/OpenMP/atomic-compare.f90 +++ /dev/null @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s -! OpenMP version for documentation purposes only - it isn't used until Sema. -! This is testing for Parser errors that bail out before Sema. 
-program main
- implicit none
- integer :: i, j = 10
- logical :: r
-
- !CHECK: error: expected OpenMP construct
- !$omp atomic compare write
- r = i .eq. j + 1
-
- !CHECK: error: expected end of line
- !$omp atomic compare num_threads(4)
- r = i .eq. j
-end program main
diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90
index 54492bf6a22a6..11e23e062bce7 100644
--- a/flang/test/Semantics/OpenMP/atomic-compare.f90
+++ b/flang/test/Semantics/OpenMP/atomic-compare.f90
@@ -44,46 +44,37 @@
 !$omp end atomic
 ! Check for error conditions:
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic seq_cst seq_cst compare
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic compare seq_cst seq_cst
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic seq_cst compare seq_cst
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
 !$omp atomic acquire acquire compare
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
 !$omp atomic compare acquire acquire
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
 !$omp atomic acquire compare acquire
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic relaxed relaxed compare
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic compare relaxed relaxed
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic relaxed compare relaxed
 if (b .eq. c) b = a
- !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct
+ !ERROR: At most one FAIL clause can appear on the ATOMIC directive
 !$omp atomic fail(release) compare fail(release)
 if (c .eq. a) a = b
 !$omp end atomic
diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
index c13a11a8dd5dc..deb67e7614659 100644
--- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
+++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
@@ -16,20 +16,21 @@ program sample
 !$omp atomic read hint(2)
 y = x
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
 !$omp atomic hint(3)
 y = y + 10
 !$omp atomic update hint(5)
 y = x + y
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
 !$omp atomic hint(7) capture
+ !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
 y = x
 x = y
 !$omp end atomic
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Must be a constant value
 !$omp atomic update hint(x)
 y = y * 1
@@ -46,7 +47,7 @@ program sample
 !$omp atomic hint(omp_lock_hint_speculative)
 x = y + x
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Must be a constant value
 !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read
 y = x
@@ -69,36 +70,36 @@ program sample
 !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative)
 x = y + x
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
 !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read
 y = x
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
 !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative)
 y = y * 9
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Must have INTEGER type, but is REAL(4)
 !$omp atomic hint(1.0) read
 y = x
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4)
 !$omp atomic hint(z + omp_sync_hint_nonspeculative) read
 y = x
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Must be a constant value
 !$omp atomic hint(k + omp_sync_hint_speculative) read
 y = x
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Must be a constant value
 !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write
 x = 10 * y
 !$omp atomic write hint(a)
- !ERROR: RHS expression on atomic assignment statement cannot access 'x'
+ !ERROR: Within atomic operation x and y+x access the same storage
 x = y + x
 !$omp atomic hint(abs(-1)) write
diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90
new file mode 100644
index 0000000000000..2619d235380f8
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-read.f90
@@ -0,0 +1,89 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+ integer :: x, v
+ ! The end-directive is optional in ATOMIC READ. Expect no diagnostics.
+ !$omp atomic read
+ v = x
+
+ !$omp atomic read
+ v = x
+ !$omp end atomic
+end
+
+subroutine f01
+ integer, pointer :: x, v
+ ! Intrinsic assignment and pointer assignment are both ok. Expect no
+ ! diagnostics.
+ !$omp atomic read
+ v = x
+
+ !$omp atomic read
+ v => x
+end
+
+subroutine f02(i)
+ integer :: i, v
+ interface
+ function p(i)
+ integer, pointer :: p
+ integer :: i
+ end
+ end interface
+
+ !
Atomic variable can be a function reference. Expect no diagostics.
+ !$omp atomic read
+ v = p(i)
+end
+
+subroutine f03
+ integer :: x(3), y(5), v(3)
+
+ !$omp atomic read
+ !ERROR: Atomic variable x should be a scalar
+ v = x
+
+ !$omp atomic read
+ !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar
+ v = y(2:4)
+end
+
+subroutine f04
+ integer :: x, y(3), v
+
+ !$omp atomic read
+ !ERROR: Within atomic operation x and x access the same storage
+ x = x
+
+ ! Accessing same array, but not the same storage. Expect no diagnostics.
+ !$omp atomic read
+ y(1) = y(2)
+end
+
+subroutine f05
+ integer :: x, v
+
+ !$omp atomic read
+ !ERROR: Atomic expression x+1_4 should be a variable
+ v = x + 1
+end
+
+subroutine f06
+ character :: x, v
+
+ !$omp atomic read
+ !ERROR: Atomic variable x cannot have CHARACTER type
+ v = x
+end
+
+subroutine f07
+ integer, allocatable :: x
+ integer :: v
+
+ allocate(x)
+
+ !$omp atomic read
+ !ERROR: Atomic variable x cannot be ALLOCATABLE
+ v = x
+end
+
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
new file mode 100644
index 0000000000000..64e31a7974b55
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -0,0 +1,77 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+ integer :: x, y, v
+
+ !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements
+ !$omp atomic update capture
+ x = v
+ x = x + 1
+ y = x
+ !$omp end atomic
+end
+
+subroutine f01
+ integer :: x, y, v
+
+ !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments
+ !$omp atomic update capture
+ x = v
+ block
+ x = x + 1
+ y = x
+ end block
+ !$omp end atomic
+end
+
+subroutine f02
+ integer :: x, y
+
+ ! The update and capture statements can be inside of a single BLOCK.
+ ! The end-directive is then optional. Expect no diagnostics.
+ !$omp atomic update capture
+ block
+ x = x + 1
+ y = x
+ end block
+end
+
+subroutine f03
+ integer :: x
+
+ !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+ !$omp atomic update capture
+ x = x + 1
+ x = x + 2
+ !$omp end atomic
+end
+
+subroutine f04
+ integer :: x, v
+
+ !$omp atomic update capture
+ !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+ v = x
+ x = v
+ !$omp end atomic
+end
+
+subroutine f05
+ integer :: x, v, z
+
+ !$omp atomic update capture
+ v = x
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
+ z = x + 1
+ !$omp end atomic
+end
+
+subroutine f06
+ integer :: x, v, z
+
+ !$omp atomic update capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
+ z = x + 1
+ v = x
+ !$omp end atomic
+end
diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90
new file mode 100644
index 0000000000000..0aee02d429dc0
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90
@@ -0,0 +1,83 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+ integer :: x, y
+
+ ! The x is a direct argument of the + operator. Expect no diagnostics.
+ !$omp atomic update
+ x = x + (y - 1)
+end
+
+subroutine f01
+ integer :: x
+
+ ! x + 0 is unusual, but legal. Expect no diagnostics.
+ !$omp atomic update
+ x = x + 0
+end
+
+subroutine f02
+ integer :: x
+
+ ! This is formally not allowed by the syntax restrictions of the spec,
+ ! but it's equivalent to either x+0 or x*1, both of which are legal.
+ ! Allow this case. Expect no diagnostics.
+ !$omp atomic update
+ x = x
+end
+
+subroutine f03
+ integer :: x, y
+
+ !$omp atomic update
+ !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator
+ x = (x + y) + 1
+end
+
+subroutine f04
+ integer :: x
+ real :: y
+
+ !$omp atomic update
+ !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation
+ x = floor(x + y)
+end
+
+subroutine f05
+ integer :: x
+ real :: y
+
+ !$omp atomic update
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
+ x = int(x + y)
+end
+
+subroutine f06
+ integer :: x, y
+ interface
+ function f(i, j)
+ integer :: f, i, j
+ end
+ end interface
+
+ !$omp atomic update
+ !ERROR: A call to this function is not a valid ATOMIC UPDATE operation
+ x = f(x, y)
+end
+
+subroutine f07
+ real :: x
+ integer :: y
+
+ !$omp atomic update
+ !ERROR: The ** operator is not a valid ATOMIC UPDATE operation
+ x = x ** y
+end
+
+subroutine f08
+ integer :: x, y
+
+ !$omp atomic update
+ !ERROR: The atomic variable x should appear as an argument in the update operation
+ x = y
+end
diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90
index 21a9b87d26345..3084376b4275d 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90
@@ -22,10 +22,10 @@ program sample
 x = x / y
 !$omp atomic update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: A call to this function is not a valid ATOMIC UPDATE operation
 x = x .MYOPERATOR. y
 !$omp atomic
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: A call to this function is not a valid ATOMIC UPDATE operation
 x = x .MYOPERATOR.
y
 end program
diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90
new file mode 100644
index 0000000000000..979568ebf7140
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-write.f90
@@ -0,0 +1,81 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+ integer :: x, v
+ ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics.
+ !$omp atomic write
+ x = v + 1
+
+ !$omp atomic write
+ x = v + 3
+ !$omp end atomic
+end
+
+subroutine f01
+ integer, pointer :: x, v
+ ! Intrinsic assignment and pointer assignment are both ok. Expect no
+ ! diagnostics.
+ !$omp atomic write
+ x = 2 * v + 3
+
+ !$omp atomic write
+ x => v
+end
+
+subroutine f02(i)
+ integer :: i, v
+ interface
+ function p(i)
+ integer, pointer :: p
+ integer :: i
+ end
+ end interface
+
+ ! Atomic variable can be a function reference. Expect no diagostics.
+ !$omp atomic write
+ p(i) = v
+end
+
+subroutine f03
+ integer :: x(3), y(5), v(3)
+
+ !$omp atomic write
+ !ERROR: Atomic variable x should be a scalar
+ x = v
+
+ !$omp atomic write
+ !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar
+ y(2:4) = v
+end
+
+subroutine f04
+ integer :: x, y(3), v
+
+ !$omp atomic write
+ !ERROR: Within atomic operation x and x+1_4 access the same storage
+ x = x + 1
+
+ ! Accessing same array, but not the same storage. Expect no diagnostics.
+ !$omp atomic write + y(1) = y(2) +end + +subroutine f06 + character :: x, v + + !$omp atomic write + !ERROR: Atomic variable x cannot have CHARACTER type + x = v +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic write + !ERROR: Atomic variable x cannot be ALLOCATABLE + x = v +end + diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0e100871ea9b4..0dc8251ae356e 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! Check OpenMP 2.13.6 atomic Construct @@ -11,9 +11,13 @@ a = b !$omp end atomic + !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED) a = b + !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write a = b @@ -22,39 +26,32 @@ a = a + 1 !$omp end atomic + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic hint(1) acq_rel capture b = a a = a + 1 !$omp end atomic - !ERROR: expected end of line + !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct !$omp atomic read write + !ERROR: Atomic expression a+1._4 should be a variable a = a + 1 !$omp atomic a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 
'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic num_threads(4) a = a + 1 - !ERROR: expected end of line + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic capture num_threads(4) a = a + 1 + !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic relaxed a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' - !$omp atomic num_threads write - a = a + 1 - !$omp end parallel end diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90 index 173effe86b69c..f700c381cadd0 100644 --- a/flang/test/Semantics/OpenMP/atomic01.f90 +++ b/flang/test/Semantics/OpenMP/atomic01.f90 @@ -14,322 +14,277 @@ ! At most one memory-order-clause may appear on the construct. 
!READ - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic read seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst read seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic read acquire acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire read acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the 
READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic read relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed read relaxed i = j !UPDATE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic update seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst update seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The 
atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic update release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release update release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic update relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic 
relaxed update relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !CAPTURE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic capture seq_cst seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst capture seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic capture release release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release capture release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not 
allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic capture relaxed relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed capture relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel acq_rel capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic capture acq_rel acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel capture acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire capture i = j j = k !$omp 
end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic capture acquire acquire i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire capture acquire i = j j = k !$omp end atomic !WRITE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic write seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst write seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic write release release i = j - !ERROR: More than one memory order clause not allowed on 
OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release write release i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic write relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed write relaxed i = j !No atomic-clause - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as 
an argument in the update operation i = j ! 2.17.7.3 ! At most one hint clause may appear on the construct. - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At 
most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint 
(omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative) i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j j = k @@ -337,34 +292,26 @@ ! 2.17.7.4 ! If atomic-clause is read then memory-order-clause must not be acq_rel or release. - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic acq_rel read i = j - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read acq_rel i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic release read i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read release i = j ! 2.17.7.5 ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire. 
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acq_rel write i = j - !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acq_rel i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acquire write i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acquire i = j @@ -372,33 +319,27 @@ ! 2.17.7.6 ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire. - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acq_rel update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acquire update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive !$omp atomic acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed on the 
ATOMIC directive !$omp atomic acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j end program diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90 index c66085d00f157..45e41f2552965 100644 --- a/flang/test/Semantics/OpenMP/atomic02.f90 +++ b/flang/test/Semantics/OpenMP/atomic02.f90 @@ -28,36 +28,29 @@ program OmpAtomic !$omp atomic a = a/(b + 1) !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: The atomic variable c should appear as an argument in the update operation c = d !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. 
b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The /= operator is not a valid ATOMIC UPDATE operation l = a .NE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic m = m .AND. n @@ -76,32 +69,26 @@ program OmpAtomic !$omp atomic update a = a/(b + 1) !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: This is not a valid ATOMIC UPDATE operation c = c//d !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. 
b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic update m = m .AND. n diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index 76367495b9861..f5c189fd05318 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -25,28 +25,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur 
exactly once among the arguments of the top-level MAX operator z = MAX(y, 7, b, c) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8, a, d) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -60,26 +58,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX 
operator z = MAX(y, 7) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = MOD(y, 9) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation x = ABS(x) end program OmpAtomic @@ -92,7 +90,7 @@ subroutine conflicting_types() type(simple) ::s z = 1 !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(s%z, 4) end subroutine @@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) :: s !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator a = min(a, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a) !$omp atomic - !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)` a = min(b, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a, b) 
!$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator y = min(z, x) !$omp atomic z = max(z, y) !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k' + !ERROR: Atomic variable k should be a scalar + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation k = max(x, y) - + !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement x = min(x, k) !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - z =z + s%m + z = z + s%m end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index a9644ad95aa30..5c91ab5dc37e4 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -20,12 +20,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x 
= 1 + y !$omp atomic @@ -33,12 +31,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic @@ -46,12 +42,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic @@ -59,12 +53,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or 
explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic @@ -72,8 +64,7 @@ program OmpAtomic !$omp atomic m = n .AND. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic @@ -81,8 +72,7 @@ program OmpAtomic !$omp atomic m = n .OR. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic @@ -90,8 +80,7 @@ program OmpAtomic !$omp atomic m = n .EQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic @@ -99,8 +88,7 @@ program OmpAtomic !$omp atomic m = n .NEQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. 
l !$omp atomic update @@ -108,12 +96,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 + y !$omp atomic update @@ -121,12 +107,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic update @@ -134,12 +118,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' 
expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic update @@ -147,12 +129,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic update @@ -160,8 +140,7 @@ program OmpAtomic !$omp atomic update m = n .AND. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic update @@ -169,8 +148,7 @@ program OmpAtomic !$omp atomic update m = n .OR. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic update @@ -178,8 +156,7 @@ program OmpAtomic !$omp atomic update m = n .EQV. 
m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic update @@ -187,8 +164,7 @@ program OmpAtomic !$omp atomic update m = n .NEQV. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. l end program OmpAtomic @@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) p !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement x = x !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 1 !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and a*b access the same storage a = a * b + a !$omp atomic - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator a = b * (a + 9) !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (a+b) access the same storage a = a * (a + b) !$omp atomic - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (b+a) access the same storage a = (b + a) * a !$omp atomic - !ERROR: Atomic update statement should 
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(7) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(x) y = 2 @@ -54,7 +54,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint) y = 2 @@ -84,35 +84,35 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended) y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp critical (name) hint(1.0) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp critical (name) hint(z + omp_sync_hint_nonspeculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(k + omp_sync_hint_speculative) y = 2 !$omp end critical 
(name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended) y = 2 diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 505cbc48fef90..677b933932b44 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -20,70 +20,64 @@ program sample !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar v = y(1:3) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression x*(10_4+x) should be a variable v = x * (10 + x) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression 4_4 should be a variable v = 4 !$omp atomic read - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE v = k !$omp atomic write - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = x !$omp atomic update - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = k + x * (v * x) !$omp atomic - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = v * k !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'z%y' + !ERROR: Within atomic operation z%y and x+z%y access the same storage z%y = x + z%y !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + 
!ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage m = min(m, x, z%m) + k !$omp atomic read - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Atomic expression min(m,x,z%m)+k should be a variable m = min(m, x, z%m) + k !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar x = a - !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - a = x - !$omp atomic write !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement x = a !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar a = x !$omp atomic capture @@ -93,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = b + (x*1) !$omp end atomic @@ -103,60 +97,58 @@ program sample !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with 
CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct x = x + 10 v = b !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt] v = 1 x = 4 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct x = z%y + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 x = z%y !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct x = y(2) + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 x = y(2) !$omp end atomic !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable r cannot have CHARACTER type l = r !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable l cannot have CHARACTER type l = r end program diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90 index ae9fd086015dd..e8817c3f5ef61 100644 --- 
a/flang/test/Semantics/OpenMP/requires-atomic01.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> SeqCst !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> SeqCst !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> SeqCst !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> SeqCst !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> SeqCst !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90 index 4976a9667eb78..a3724a83456fd 100644 --- a/flang/test/Semantics/OpenMP/requires-atomic02.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> AcqRel !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! 
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> AcqRel !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> AcqRel !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! 
CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> AcqRel !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> AcqRel !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Capture + ! 
CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 >From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:41:16 -0500 Subject: [PATCH 05/18] Restart build >From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:45:28 -0500 Subject: [PATCH 06/18] Restart build >From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 07:54:54 -0500 Subject: [PATCH 07/18] Restart build >From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 08:18:01 -0500 Subject: [PATCH 08/18] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed --- flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +- flang/test/Semantics/OpenMP/atomic.f90 | 2 ++ 5 files changed, 6 insertions(+), 4 deletions(-) diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 index 2619d235380f8..73899a9ff37f2 100644 --- a/flang/test/Semantics/OpenMP/atomic-read.f90 +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index 64e31a7974b55..c427ba07d43d8 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python 
%S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 0aee02d429dc0..4595e02d01456 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 index 979568ebf7140..7965ad2dc7dbf 100644 --- a/flang/test/Semantics/OpenMP/atomic-write.f90 +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0dc8251ae356e..2caa161507d49 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,3 +1,5 @@ +! REQUIRES: openmp_runtime + ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! 
Check OpenMP 2.13.6 atomic Construct >From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 09:06:20 -0500 Subject: [PATCH 09/18] Fix examples --- flang/examples/FeatureList/FeatureList.cpp | 10 ------- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++-------------- 2 files changed, 7 insertions(+), 29 deletions(-) diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp index d1407cf0ef239..a36b8719e365d 100644 --- a/flang/examples/FeatureList/FeatureList.cpp +++ b/flang/examples/FeatureList/FeatureList.cpp @@ -445,13 +445,6 @@ struct NodeVisitor { READ_FEATURE(ObjectDecl) READ_FEATURE(OldParameterStmt) READ_FEATURE(OmpAlignedClause) - READ_FEATURE(OmpAtomic) - READ_FEATURE(OmpAtomicCapture) - READ_FEATURE(OmpAtomicCapture::Stmt1) - READ_FEATURE(OmpAtomicCapture::Stmt2) - READ_FEATURE(OmpAtomicRead) - READ_FEATURE(OmpAtomicUpdate) - READ_FEATURE(OmpAtomicWrite) READ_FEATURE(OmpBeginBlockDirective) READ_FEATURE(OmpBeginLoopDirective) READ_FEATURE(OmpBeginSectionsDirective) @@ -480,7 +473,6 @@ struct NodeVisitor { READ_FEATURE(OmpIterationOffset) READ_FEATURE(OmpIterationVector) READ_FEATURE(OmpEndAllocators) - READ_FEATURE(OmpEndAtomic) READ_FEATURE(OmpEndBlockDirective) READ_FEATURE(OmpEndCriticalDirective) READ_FEATURE(OmpEndLoopDirective) @@ -566,8 +558,6 @@ struct NodeVisitor { READ_FEATURE(OpenMPDeclareTargetConstruct) READ_FEATURE(OmpMemoryOrderType) READ_FEATURE(OmpMemoryOrderClause) - READ_FEATURE(OmpAtomicClause) - READ_FEATURE(OmpAtomicClauseList) READ_FEATURE(OmpAtomicDefaultMemOrderClause) READ_FEATURE(OpenMPFlushConstruct) READ_FEATURE(OpenMPLoopConstruct) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index dbbf86a6c6151..18a4107f9a9d1 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ 
b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) { // the directive field. [&](const auto &c) -> SourcePosition { const CharBlock &source{std::get<0>(c.t).source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPAtomicConstruct &c) -> SourcePosition { - return std::visit( - [&](const auto &o) -> SourcePosition { - const CharBlock &source{std::get(o.t).source}; - return parsing->allCooked() - .GetSourcePositionRange(source) - ->first; - }, - c.u); + const CharBlock &source{c.source}; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPSectionConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPUtilityConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, }, c.u); @@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - return std::visit( - [&](const auto &c) { - // Get source from the verbatim fields - const CharBlock &source{std::get(c.t).source}; - return "atomic-" + - normalize_construct_name(source.ToString()); - }, - c.u); + const CharBlock &source{c.source}; + return normalize_construct_name(source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; >From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:19 -0500 Subject: 
[PATCH 10/18] Fix example --- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++-- flang/test/Examples/omp-atomic.f90 | 16 +++++++++++----- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index 18a4107f9a9d1..b0964845d3b99 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - const CharBlock &source{c.source}; - return normalize_construct_name(source.ToString()); + auto &dirSpec = std::get(c.t); + auto &dirName = std::get(dirSpec.t); + return normalize_construct_name(dirName.source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90 index dcca34b633a3e..934f84f132484 100644 --- a/flang/test/Examples/omp-atomic.f90 +++ b/flang/test/Examples/omp-atomic.f90 @@ -26,25 +26,31 @@ ! CHECK:--- ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 9 -! CHECK-NEXT: construct: atomic-read +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: -! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: - clause: read ! CHECK-NEXT: details: '' +! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 12 -! CHECK-NEXT: construct: atomic-write +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: ! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' +! CHECK-NEXT: - clause: write ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 16 -! CHECK-NEXT: construct: atomic-capture +! 
CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: +! CHECK-NEXT: - clause: capture +! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;' ! CHECK-NEXT: - clause: seq_cst ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 21 -! CHECK-NEXT: construct: atomic-atomic +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: [] ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 8 >From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:47 -0500 Subject: [PATCH 11/18] Remove reference to *nullptr --- flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ab868df76d298..4d9b375553c1e 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); - mlir::Operation *secondOp = genAtomicOperation( - converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, - *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + mlir::Operation *secondOp = nullptr; + if (analysis.op1.what != analysis.None) { + secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, + atomAddr, atom, *get(analysis.op1.expr), + hint, memOrder, atomicAt, prepareAt); + } if (secondOp) { builder.setInsertionPointAfter(secondOp); >From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 2 May 2025 18:49:05 -0500 Subject: [PATCH 12/18] Updates and improvements --- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++-- flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++---- 
flang/lib/Semantics/check-omp-structure.h | 1 + .../Todo/atomic-capture-implicit-cast.f90 | 48 --- .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 - .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +- .../OpenMP/atomic-update-capture.f90 | 8 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +- 9 files changed, 381 insertions(+), 180 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 77f57b1cb85c7..8213fe33edbd0 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct { struct Op { int what; - TypedExpr expr; + AssignmentStmt::TypedAssignment assign; }; TypedExpr atom, cond; Op op0, op1; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6177b59199481..7b6c22095d723 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter, static mlir::Operation * // genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Value storeAddr = + 
fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Type storeType = fir::unwrapRefType(storeAddr.getType()); + + mlir::Value toAddr = [&]() { + if (atomType == storeType) + return storeAddr; + return builder.createTemporary(loc, atomType, ".tmp.atomval"); + }(); builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + + if (atomType != storeType) { + lower::ExprToValueMap overrides; + // The READ operation could be a part of UPDATE CAPTURE, so make sure + // we don't emit extra code into the body of the atomic op. + builder.restoreInsertionPoint(postAt); + mlir::Value load = builder.create(loc, toAddr); + overrides.try_emplace(&atom, load); + + converter.overrideExprValues(&overrides); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); + converter.resetExprOverrides(); + + builder.create(loc, value, storeAddr); + } builder.restoreInsertionPoint(saved); return op; } @@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, static mlir::Operation * // genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value value = 
fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); mlir::Value converted = builder.createConvert(loc, atomType, value); @@ -2719,19 +2746,20 @@ static mlir::Operation * genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrResizeOf(arg, atom)) { @@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, converter.overrideExprValues(&overrides); mlir::Value updated = - fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Value converted = builder.createConvert(loc, atomType, updated); builder.create(loc, converted); converter.resetExprOverrides(); @@ -2764,20 +2792,21 @@ static mlir::Operation * genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, 
lower::StatementContext &stmtCtx, int action, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: - return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Write: - return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Update: - return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign, + hint, memOrder, preAt, atomicAt, postAt); default: return nullptr; } @@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { } return ""s; }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; const SomeExpr &atom = *analysis.atom.get()->v; @@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; llvm::errs() << " op0 {\n"; llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; - 
llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << " op1 {\n"; llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; - llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << "}\n"; } @@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAtomicConstruct &construct) { - auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { - if (auto *maybe = expr.get(); maybe && maybe->v) { + auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) { + if (auto *maybe = typedWrapper.get(); maybe && maybe->v) { return &*maybe->v; } else { return nullptr; @@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, int action0 = analysis.op0.what & analysis.Action; int action1 = analysis.op1.what & analysis.Action; mlir::Operation *captureOp = nullptr; - fir::FirOpBuilder::InsertPoint atomicAt; - fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint atomicAt, postAt; if (construct.IsCapture()) { // Capturing operation. @@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, captureOp = builder.create(loc, hint, memOrder); // Set the non-atomic insertion point to before the atomic.capture. 
- prepareAt = getInsertionPointBefore(captureOp); + preAt = getInsertionPointBefore(captureOp); mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); builder.setInsertionPointToEnd(block); @@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, // atomic.capture. mlir::Operation *term = builder.create(loc); atomicAt = getInsertionPointBefore(term); + postAt = getInsertionPointAfter(captureOp); hint = nullptr; memOrder = nullptr; } else { @@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(action0 != analysis.None && action1 == analysis.None && "Expexcing single action"); assert(!(analysis.op0.what & analysis.Condition)); - atomicAt = prepareAt; + postAt = atomicAt = preAt; } mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, - *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); mlir::Operation *secondOp = nullptr; if (analysis.op1.what != analysis.None) { secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, - atomAddr, atom, *get(analysis.op1.expr), - hint, memOrder, atomicAt, prepareAt); + atomAddr, atom, *get(analysis.op1.assign), + hint, memOrder, preAt, atomicAt, postAt); } if (secondOp) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 201b38bd05ff3..f7753a5e5cc59 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } -static bool IsVarOrFunctionRef(const SomeExpr &expr) { - return evaluate::UnwrapProcedureRef(expr) != nullptr || - evaluate::IsVariable(expr); +static bool 
IsVarOrFunctionRef(const MaybeExpr &expr) { + if (expr) { + return evaluate::UnwrapProcedureRef(*expr) != nullptr || + evaluate::IsVariable(*expr); + } else { + return false; + } } static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { @@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource( namespace atomic { +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + enum class Operator { Unk, // Operators that are officially allowed in the update operation @@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (Append(v, std::move(results)), ...); + (MoveAppend(v, std::move(results)), ...); return v; } +}; -private: - static void Append(Result &acc, Result &&data) { - for (auto &&s : data) { - acc.push_back(std::move(s)); +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). 
+ return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + auto copy{x.derived()}; + return {evaluate::AsGenericExpr(std::move(copy)), {}}; } } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; }; } // namespace atomic @@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } +static MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { // Both expr and x have the form of SomeType(SomeKind(...)[1]). // Check if expr is @@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { } } +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { if (value) { expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), @@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { } } +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( - int what, const MaybeExpr &maybeExpr = std::nullopt) { + int what, + const std::optional &maybeAssign = std::nullopt) { parser::OpenMPAtomicConstruct::Analysis::Op operation; operation.what = what; - SetExpr(operation.expr, maybeExpr); + SetAssignment(operation.assign, maybeAssign); return operation; } @@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( // }; // struct Op { // int what; - // TypedExpr expr; + // TypedAssignment assign; // }; // TypedExpr atom, cond; // Op op0, op1; @@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, } } +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + /// Check 
if `expr` satisfies the following conditions for x and v: /// /// [6.0:189:10-12] @@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture( // // The two allowed cases are: // x = ... atomic-var = ... - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // or - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // x = ... atomic-var = ... // // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture @@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture( // // If the two statements don't fit these criteria, return a pair of default- // constructed values. + using ReturnTy = std::pair; SourcedActionStmt act1{GetActionStmt(ec1)}; SourcedActionStmt act2{GetActionStmt(ec2)}; @@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture( auto isUpdateCapture{ [](const evaluate::Assignment &u, const evaluate::Assignment &c) { - return u.lhs == c.rhs; + return IsSameOrConvertOf(c.rhs, u.lhs); }}; // Do some checks that narrow down the possible choices for the update // and the capture statements. This will help to emit better diagnostics. - bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); - bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; + + auto errorCaptureShouldRead{[&](const parser::CharBlock &source, + const std::string &expr) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US, + expr); + }}; - if (couldBeCapture1) { - if (couldBeCapture2) { - if (isUpdateCapture(as2, as1)) { - if (isUpdateCapture(as1, as2)) { - // If both statements could be captures and both could be updates, - // emit a warning about the ambiguity. - context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); - } - return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); - } else if (isUpdateCapture(as1, as2)) { + auto errorNeitherWorks{[&]() { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US); + }}; + + auto makeSelectionFromDet{[&](int det) -> ReturnTy { + // If det != 0, then the checks unambiguously suggest a specific + // categorization. + // If det == 0, then this function should be called only if the + // checks haven't ruled out any possibility, i.e. when both assigments + // could still be either updates or captures. 
+ if (det > 0) { + // as1 is update, as2 is capture + if (isUpdateCapture(as1, as2)) { return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - context_.Say(source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, - as1.rhs.AsFortran(), as2.rhs.AsFortran()); + errorCaptureShouldRead(act2.source, as1.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } else { // !couldBeCapture2 + } else if (det < 0) { + // as2 is update, as1 is capture if (isUpdateCapture(as2, as1)) { return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } else { - context_.Say(act2.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as1.rhs.AsFortran()); + errorCaptureShouldRead(act1.source, as2.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } - } else { // !couldBeCapture1 - if (couldBeCapture2) { - if (isUpdateCapture(as1, as2)) { - return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); - } else { + } else { + bool updateFirst{isUpdateCapture(as1, as2)}; + bool captureFirst{isUpdateCapture(as2, as1)}; + if (updateFirst && captureFirst) { + // If both assignment could be the update and both could be the + // capture, emit a warning about the ambiguity. context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as2.rhs.AsFortran()); + "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US); + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } - } else { - context_.Say(source, - "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + if (updateFirst != captureFirst) { + const parser::ExecutionPartConstruct *upd{updateFirst ? 
ec1 : ec2}; + const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2}; + return std::make_pair(upd, cap); + } + assert(!updateFirst && !captureFirst); + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); } + }}; + + if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) { + return makeSelectionFromDet(det); } + assert(det == 0 && "Prior checks should have covered det != 0"); - return std::make_pair(nullptr, nullptr); + // If neither of the statements is an RMW update, it could still be a + // "write" update. Pretty much any assignment can be a write update, so + // recompute det with cbu1 = cbu2 = true. + if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) { + return makeSelectionFromDet(writeDet); + } + + // It's only errors from here on. + + if (!cbu1 && !cbu2 && !cbc1 && !cbc2) { + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); + } + + // The remaining cases are that + // - no candidate for update, or for capture, + // - one of the assigments cannot be anything. + + if (!cbu1 && !cbu2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US); + return std::make_pair(nullptr, nullptr); + } else if (!cbc1 && !cbc2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) { + auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source; + context_.Say(src, + "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + // All cases should have been covered. 
+ llvm_unreachable("Unchecked condition"); } void OmpStructureChecker::CheckAtomicCaptureAssignment( const evaluate::Assignment &capture, const SomeExpr &atom, parser::CharBlock source) { - const SomeExpr &cap{capture.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &cap{capture.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, rsrc); - // This part should have been checked prior to callig this function. - assert(capture.rhs == atom && "This canont be a capture assignment"); + // This part should have been checked prior to calling this function. + assert(*GetConvertInput(capture.rhs) == atom && + "This canont be a capture assignment"); CheckStorageOverlap(atom, {cap}, source); } } void OmpStructureChecker::CheckAtomicReadAssignment( const evaluate::Assignment &read, parser::CharBlock source) { - const SomeExpr &atom{read.rhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; - if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + if (auto maybe{GetConvertInput(read.rhs)}) { + const SomeExpr &atom{*maybe}; + + if (!IsVarOrFunctionRef(atom)) { + ErrorShouldBeVariable(atom, rsrc); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } } else { - CheckAtomicVariable(atom, rsrc); - CheckStorageOverlap(atom, {read.lhs}, source); + ErrorShouldBeVariable(read.rhs, rsrc); } } @@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment( // one of the following forms: // x = expr // x => expr - const SomeExpr &atom{write.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{write.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + 
ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, lsrc); CheckStorageOverlap(atom, {write.rhs}, source); @@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( // x = intrinsic-procedure-name (x) // x = intrinsic-procedure-name (x, expr-list) // x = intrinsic-procedure-name (expr-list, x) - const SomeExpr &atom{update.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{update.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); // Skip other checks. return; } @@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( const SomeExpr &cond, parser::CharBlock condSource, const evaluate::Assignment &assign, parser::CharBlock assignSource) { - const SomeExpr &atom{assign.lhs}; auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + const SomeExpr &atom{assign.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, arsrc); // Skip other checks. 
return; } @@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( @@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign), MakeAtomicAnalysisOp(Analysis::None)); } @@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( if (GetActionStmt(&body.front()).stmt == uact.stmt) { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(action, update.rhs), - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + MakeAtomicAnalysisOp(action, update), + MakeAtomicAnalysisOp(Analysis::Read, capture)); } else { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), - MakeAtomicAnalysisOp(action, update.rhs)); + MakeAtomicAnalysisOp(Analysis::Read, capture), + MakeAtomicAnalysisOp(action, update)); } } @@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( if (captureFirst) { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign)); } else { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, 
capAssign.lhs)); + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign)); } } @@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead( if (body.size() == 1) { SourcedActionStmt action{GetActionStmt(&body.front())}; if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { - const SomeExpr &atom{maybeRead->rhs}; CheckAtomicReadAssignment(*maybeRead, action.source); - using Analysis = parser::OpenMPAtomicConstruct::Analysis; - x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), - MakeAtomicAnalysisOp(Analysis::None)); + if (auto maybe{GetConvertInput(maybeRead->rhs)}) { + const SomeExpr &atom{*maybe}; + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead), + MakeAtomicAnalysisOp(Analysis::None)); + } } else { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); @@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index bf6fbf16d0646..835fbe45e1c0e 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -253,6 +253,7 @@ class OmpStructureChecker void CheckStorageOverlap(const evaluate::Expr &, llvm::ArrayRef>, parser::CharBlock); + void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source); void CheckAtomicVariable( const evaluate::Expr &, parser::CharBlock); std::pair&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture 
requiring implicit type casts
-subroutine capture_with_convert_f32_to_i32()
- implicit none
- integer :: k, v, i
-
- k = 1
- v = 0
-
- !$omp atomic capture
- v = k
- k = (i + 1) * 3.14
- !$omp end atomic
-end subroutine
-
-subroutine capture_with_convert_i32_to_f64()
- real(8) :: x
- integer :: v
- x = 1.0
- v = 0
- !$omp atomic capture
- v = x
- x = v
- !$omp end atomic
-end subroutine capture_with_convert_i32_to_f64
-
-subroutine capture_with_convert_f64_to_i32()
- integer :: x
- real(8) :: v
- x = 1
- v = 0
- !$omp atomic capture
- x = v
- v = x
- !$omp end atomic
-end subroutine capture_with_convert_f64_to_i32
-
-subroutine capture_with_convert_i32_to_f32()
- real(4) :: x
- integer :: v
- x = 1.0
- v = 0
- !$omp atomic capture
- v = x
- x = x + v
- !$omp end atomic
-end subroutine capture_with_convert_i32_to_f32
diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
index 75f1cbfc979b9..aa9d2e0ac3ff7 100644
--- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
+++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
@@ -1,5 +1,3 @@
-! REQUIRES : openmp_runtime
-
 ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s
 ! CHECK: func.func @_QPatomic_implicit_cast_read() {
diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
index deb67e7614659..8adb0f1a67409 100644
--- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
+++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
@@ -25,7 +25,7 @@ program sample
 !ERROR: The synchronization hint is not valid
 !$omp atomic hint(7) capture
- !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+ !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement
 y = x
 x = y
 !$omp end atomic
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
index c427ba07d43d8..f808ed916fb7e 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -39,7 +39,7 @@ subroutine f02
 subroutine f03
 integer :: x
- !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture
 !$omp atomic update capture
 x = x + 1
 x = x + 2
@@ -50,7 +50,7 @@ subroutine f04
 integer :: x, v
 !$omp atomic update capture
- !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+ !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement
 v = x
 x = v
 !$omp end atomic
@@ -60,8 +60,8 @@ subroutine f05
 integer :: x, v, z
 !$omp atomic update capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z
 v = x
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 z = x + 1
 !$omp end atomic
 end
@@ -70,8 +70,8 @@ subroutine f06
 integer :: x, v, z
 !$omp atomic update capture
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 z = x + 1
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z
 v = x
 !$omp end atomic
 end
diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
index 677b933932b44..5e180aa0bbe5b 100644
--- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
+++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
@@ -97,50 +97,50 @@ program sample
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b
 v = x
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 b = b + 1
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b
 v = x
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 b = 10
 !$omp end atomic
- !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
 !$omp atomic capture
 x = x + 10
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x
 v = b
 !$omp end atomic
- !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture
 !$omp atomic capture
 v = 1
 x = 4
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m
 x = z%y
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y
 z%m = z%m + 1.0
 !$omp end atomic
 !$omp atomic capture
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y
 z%m = z%m + 1.0
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m
 x = z%y
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8)
 x = y(2)
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8)
 y(1) = y(1) + 1
 !$omp end atomic
 !$omp atomic capture
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8)
 y(1) = y(1) + 1
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8)
 x = y(2)
 !$omp end atomic

>From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Mon, 24 Mar 2025 15:38:58 -0500
Subject: [PATCH 13/18] DumpEvExpr: show type

---
 flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++-------
 flang/lib/Semantics/dump-expr.cpp | 2 +-
 2 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h
index 2f445429a10b5..1553dac3b6687 100644
--- a/flang/include/flang/Semantics/dump-expr.h
+++ b/flang/include/flang/Semantics/dump-expr.h
@@ -16,6 +16,7 @@
 #include
 #include
+#include
 #include
 #include
@@ -38,6 +39,17 @@ class DumpEvaluateExpr {
 }
 private:
+ template
+ struct TypeOf {
+ static constexpr std::string_view name{TypeOf::get()};
+ static constexpr std::string_view get() {
+ std::string_view v(__PRETTY_FUNCTION__);
+ v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" + return v; + } + }; + template void Show(const common::Indirection &x) { Show(x.value()); } @@ -76,7 +88,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant"); + Indent("derived constant "s + std::string(TypeOf::name)); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -84,7 +96,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant"); + Print("constant "s + std::string(TypeOf::name)); } } void Show(const Symbol &symbol); @@ -102,7 +114,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator"); + Indent("designator "s + std::string(TypeOf::name)); Show(x.u); Outdent(); } @@ -117,7 +129,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref"); + Indent("function ref "s + std::string(TypeOf::name)); Show(x.proc()); Show(x.arguments()); Outdent(); @@ -127,14 +139,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value"); + Indent("array constructor value "s + std::string(TypeOf::name)); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do"); + Indent("implied do "s + std::string(TypeOf::name)); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -148,20 +160,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op"); + Indent("unary op "s + std::string(TypeOf::name)); Show(op.left()); Outdent(); } 
template void Show(const evaluate::Operation &op) { - Indent("binary op"); + Indent("binary op "s + std::string(TypeOf::name)); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr T"); + Indent("expr <" + std::string(TypeOf::name) + ">"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index aa0b4e0f03398..66cedab94bfb4 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("expr some type"); + Indent("relational some type"); Show(x.u); Outdent(); } >From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 15:14:45 -0500 Subject: [PATCH 14/18] Handle conversion from real to complex via complex constructor --- flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++--- 1 file changed, 47 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dada9c6c2bd6f..ae81dcb5ea150 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,36 +3183,46 @@ struct ConvertCollector using Base::operator(); template // - Result operator()(const evaluate::Designator &x) const { + Result asSomeExpr(const T &x) const { auto copy{x}; return {AsGenericExpr(std::move(copy)), {}}; } + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + template // Result operator()(const evaluate::FunctionRef &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template // Result operator()(const evaluate::Constant &x) const { - auto copy{x}; - return 
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/18] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: } >From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:14:47 -0500 Subject: [PATCH 16/18] Allow conversion in update operations --- flang/include/flang/Semantics/tools.h | 17 ++++----- flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++-- flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++----------- .../Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++-- flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++---------- .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +- 7 files changed, 44 insertions(+), 57 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7f1ec59b087a2..9be2feb8ae064 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -789,14 +789,15 @@ inline bool checkForSymbolMatch( /// return the "expr" but with top-level parentheses stripped. std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); -/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). -/// Check if "expr" is -/// SomeType(SomeKind(Type( -/// Convert -/// SomeKind(...)[2]))) -/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves -/// TypeCategory. -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// is returns std::nullopt. 
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic >From 341723713929507c59d528540d32bc2e4213e920 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:21:56 -0500 Subject: [PATCH 17/18] format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e8b1b..0f553541c5ef0 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } >From 2686207342bad511f6d51b20ed923c0d2cc9047b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:22:26 -0500 Subject: [PATCH 18/18] Revert "DumpEvExpr: show type" This reverts commit 40510a3068498d15257cc7d198bce9c8cd71a902. Debug changes accidentally pushed upstream. 
--- flang/include/flang/Semantics/dump-expr.h | 30 +++++++---------------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 10 insertions(+), 22 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b6687..2f445429a10b5 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,7 +16,6 @@ #include #include -#include #include #include @@ -39,17 +38,6 @@ class DumpEvaluateExpr { } private: - template - struct TypeOf { - static constexpr std::string_view name{TypeOf::get()}; - static constexpr std::string_view get() { - std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" - return v; - } - }; - template void Show(const common::Indirection &x) { Show(x.value()); } @@ -88,7 +76,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant "s + std::string(TypeOf::name)); + Indent("derived constant"); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -96,7 +84,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant "s + std::string(TypeOf::name)); + Print("constant"); } } void Show(const Symbol &symbol); @@ -114,7 +102,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator "s + std::string(TypeOf::name)); + Indent("designator"); Show(x.u); Outdent(); } @@ -129,7 +117,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref "s + std::string(TypeOf::name)); + Indent("function ref"); Show(x.proc()); Show(x.arguments()); Outdent(); @@ 
-139,14 +127,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value "s + std::string(TypeOf::name)); + Indent("array constructor value"); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do "s + std::string(TypeOf::name)); + Indent("implied do"); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -160,20 +148,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op "s + std::string(TypeOf::name)); + Indent("unary op"); Show(op.left()); Outdent(); } template void Show(const evaluate::Operation &op) { - Indent("binary op "s + std::string(TypeOf::name)); + Indent("binary op"); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr <" + std::string(TypeOf::name) + ">"); + Indent("expr T"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 66cedab94bfb4..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("relational some type"); + Indent("expr some type"); Show(x.u); Outdent(); } From flang-commits at lists.llvm.org Fri May 16 07:40:53 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 16 May 2025 07:40:53 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow flush of common block (PR #139528) In-Reply-To: Message-ID: <68274e75.170a0220.3563db.5b6e@mx.google.com> ================ @@ -2304,13 +2304,6 @@ void OmpStructureChecker::Leave(const parser::OpenMPFlushConstruct &x) { auto &flushList{std::get>(x.v.t)}; if (flushList) 
{ - for (const parser::OmpArgument &arg : flushList->v) { - if (auto *sym{GetArgumentSymbol(arg)}; sym && !IsVariableListItem(*sym)) {
----------------
tblah wrote:

My worry was whether that would affect many other constructs. Especially through this use:

```c++
void OmpStructureChecker::Enter(const parser::OmpClause &x)
```

Even if it is wrong that those constructs currently do not accept common blocks, it is unlikely that they will all generate correct code in this case.

https://github.com/llvm/llvm-project/pull/139528

From flang-commits at lists.llvm.org Fri May 16 08:06:57 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 16 May 2025 08:06:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow flush of common block (PR #139528) In-Reply-To: Message-ID: <68275491.170a0220.279916.0860@mx.google.com>

https://github.com/kiranchandramohan approved this pull request.

LG.

https://github.com/llvm/llvm-project/pull/139528

From flang-commits at lists.llvm.org Sat May 17 01:46:45 2025 From: flang-commits at lists.llvm.org (Sjoerd Meijer via flang-commits) Date: Sat, 17 May 2025 01:46:45 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <68284cf5.170a0220.184911.77b6@mx.google.com>

sjoerdmeijer wrote:

> Thanks for this PR. Do you have any compilation time and performance data?

This information is a bit spread out in the other tickets that I linked earlier, so to summarise that, compile times look really good and the increases are very minimal after the work that Madhur did.
In https://github.com/llvm/llvm-project/pull/124911, I wrote:

> The compile-time increase with a geomean increase of 0.19% looks good (after committing https://github.com/llvm/llvm-project/pull/124247), I think:

stage1-O3:

Benchmark
kimwitu++          +0.10%
sqlite3            +0.14%
consumer-typeset   +0.07%
Bullet             +0.06%
tramp3d-v4         +0.21%
mafft              +0.39%
ClamAVi            +0.06%
lencod             +0.61%
SPASS              +0.17%
7zip               +0.08%
geomean            +0.19%

Regarding performance, as I also wrote in that ticket, loop-interchange has a lot of potential. It triggers a lot of times e.g. in the LLVM test-suite, see this https://github.com/llvm/llvm-project/pull/124911#issuecomment-2624704156. It is now triggering slightly less than what I wrote in that comment because we made interchange more pessimistic to fix correctness issues, but we think that's okay because we consider getting interchange and DependenceAnalysis running by default as a first enablement step. Once we have achieved this, we are going to focus on performance and lift some of the restrictions (while maintaining correctness of course). With this first patch, interchange won't trigger on SPEC for example, but we plan to do that as follow up.

https://github.com/llvm/llvm-project/pull/140182

From flang-commits at lists.llvm.org Sat May 17 04:46:34 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 17 May 2025 04:46:34 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) Message-ID:

https://github.com/NexMing created https://github.com/llvm/llvm-project/pull/140374

Convert FIR structured control flow ops to SCF dialect.

>From 6a246b41f16d766ad4beba29318954bfbbbbc131 Mon Sep 17 00:00:00 2001 From: yanming Date: Fri, 16 May 2025 17:56:21 +0800 Subject: [PATCH] [flang][fir] Add FIR structured control flow ops to SCF dialect pass.
---
 .../include/flang/Optimizer/Support/InitFIR.h |   2 +
 .../flang/Optimizer/Transforms/Passes.h       |   1 +
 .../flang/Optimizer/Transforms/Passes.td      |  11 ++
 flang/lib/Optimizer/Transforms/CMakeLists.txt |   1 +
 flang/lib/Optimizer/Transforms/FIRToSCF.cpp   | 103 ++++++++++++
 flang/test/Fir/FirToSCF/do-loop.fir           | 147 ++++++++++++++++++
 6 files changed, 265 insertions(+)
 create mode 100644 flang/lib/Optimizer/Transforms/FIRToSCF.cpp
 create mode 100644 flang/test/Fir/FirToSCF/do-loop.fir

diff --git a/flang/include/flang/Optimizer/Support/InitFIR.h b/flang/include/flang/Optimizer/Support/InitFIR.h
index 1868fbb201970..fa7c430ed631c 100644
--- a/flang/include/flang/Optimizer/Support/InitFIR.h
+++ b/flang/include/flang/Optimizer/Support/InitFIR.h
@@ -30,6 +30,7 @@
 #include "mlir/Pass/PassRegistry.h"
 #include "mlir/Transforms/LocationSnapshot.h"
 #include "mlir/Transforms/Passes.h"
+#include <mlir/Dialect/SCF/Transforms/Passes.h>
 
 namespace fir::support {
 
@@ -103,6 +104,7 @@ inline void registerMLIRPassesForFortranTools() {
   mlir::registerPrintOpStatsPass();
   mlir::registerInlinerPass();
   mlir::registerSCCPPass();
+  mlir::registerSCFPasses();
   mlir::affine::registerAffineScalarReplacementPass();
   mlir::registerSymbolDCEPass();
   mlir::registerLocationSnapshotPass();
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h
index 6dbabd523f88a..dc8a5b9141ad2 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.h
+++ b/flang/include/flang/Optimizer/Transforms/Passes.h
@@ -72,6 +72,7 @@ std::unique_ptr<mlir::Pass>
 createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {});
 std::unique_ptr<mlir::Pass> createMemDataFlowOptPass();
 std::unique_ptr<mlir::Pass> createPromoteToAffinePass();
+std::unique_ptr<mlir::Pass> createFIRToSCFPass();
 std::unique_ptr<mlir::Pass>
 createAddDebugInfoPass(fir::AddDebugInfoOptions options = {});
 
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index c0d88a8e19f80..da3d9bc751927 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -76,6 +76,17 @@ def AffineDialectDemotion : Pass<"demote-affine", "::mlir::func::FuncOp"> {
   ];
 }
 
+def FIRToSCFPass : Pass<"fir-to-scf"> {
+  let summary = "Convert FIR structured control flow ops to SCF dialect.";
+  let description = [{
+    Convert FIR structured control flow ops to SCF dialect.
+  }];
+  let constructor = "::fir::createFIRToSCFPass()";
+  let dependentDialects = [
+    "fir::FIROpsDialect", "mlir::scf::SCFDialect"
+  ];
+}
+
 def AnnotateConstantOperands : Pass<"annotate-constant"> {
   let summary = "Annotate constant operands to all FIR operations";
   let description = [{
diff --git a/flang/lib/Optimizer/Transforms/CMakeLists.txt b/flang/lib/Optimizer/Transforms/CMakeLists.txt
index 170b6e2cca225..846d6c64dbd04 100644
--- a/flang/lib/Optimizer/Transforms/CMakeLists.txt
+++ b/flang/lib/Optimizer/Transforms/CMakeLists.txt
@@ -16,6 +16,7 @@ add_flang_library(FIRTransforms
   CUFComputeSharedMemoryOffsetsAndSize.cpp
   ArrayValueCopy.cpp
   ExternalNameConversion.cpp
+  FIRToSCF.cpp
   MemoryUtils.cpp
   MemoryAllocation.cpp
   StackArrays.cpp
diff --git a/flang/lib/Optimizer/Transforms/FIRToSCF.cpp b/flang/lib/Optimizer/Transforms/FIRToSCF.cpp
new file mode 100644
index 0000000000000..02810f1bdba4e
--- /dev/null
+++ b/flang/lib/Optimizer/Transforms/FIRToSCF.cpp
@@ -0,0 +1,103 @@
+//===-- FIRToSCF.cpp ------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "flang/Optimizer/Dialect/FIRDialect.h"
+#include "flang/Optimizer/Transforms/Passes.h"
+#include "mlir/Dialect/SCF/IR/SCF.h"
+#include "mlir/Transforms/DialectConversion.h"
+
+namespace fir {
+#define GEN_PASS_DEF_FIRTOSCFPASS
+#include "flang/Optimizer/Transforms/Passes.h.inc"
+} // namespace fir
+
+using namespace fir;
+using namespace mlir;
+
+namespace {
+class FIRToSCFPass : public fir::impl::FIRToSCFPassBase<FIRToSCFPass> {
+public:
+  void runOnOperation() override;
+};
+} // namespace
+
+struct DoLoopConversion : public OpRewritePattern<fir::DoLoopOp> {
+  using OpRewritePattern<fir::DoLoopOp>::OpRewritePattern;
+
+  LogicalResult matchAndRewrite(fir::DoLoopOp doLoopOp,
+                                PatternRewriter &rewriter) const override {
+    auto loc = doLoopOp.getLoc();
+    bool hasFinalValue = doLoopOp.getFinalValue().has_value();
+
+    // Get loop values from the DoLoopOp
+    auto low = doLoopOp.getLowerBound();
+    auto high = doLoopOp.getUpperBound();
+    assert(low && high && "must be a Value");
+    auto step = doLoopOp.getStep();
+    llvm::SmallVector<Value> iterArgs;
+    if (hasFinalValue)
+      iterArgs.push_back(low);
+    iterArgs.append(doLoopOp.getIterOperands().begin(),
+                    doLoopOp.getIterOperands().end());
+
+    // Caculate the trip count.
+    auto diff = rewriter.create<arith::SubIOp>(loc, high, low);
+    auto distance = rewriter.create<arith::AddIOp>(loc, diff, step);
+    auto tripCount = rewriter.create<arith::DivSIOp>(loc, distance, step);
+    auto zero = rewriter.create<arith::ConstantIndexOp>(loc, 0);
+    auto one = rewriter.create<arith::ConstantIndexOp>(loc, 1);
+    auto scfForOp =
+        rewriter.create<scf::ForOp>(loc, zero, tripCount, one, iterArgs);
+
+    auto &loopOps = doLoopOp.getBody()->getOperations();
+    auto resultOp = cast<fir::ResultOp>(doLoopOp.getBody()->getTerminator());
+    auto results = resultOp.getOperands();
+    Block *loweredBody = scfForOp.getBody();
+
+    loweredBody->getOperations().splice(loweredBody->begin(), loopOps,
+                                        loopOps.begin(),
+                                        std::prev(loopOps.end()));
+
+    rewriter.setInsertionPointToStart(loweredBody);
+    Value iv =
+        rewriter.create<arith::MulIOp>(loc, scfForOp.getInductionVar(), step);
+    iv = rewriter.create<arith::AddIOp>(loc, low, iv);
+
+    if (!results.empty()) {
+      rewriter.setInsertionPointToEnd(loweredBody);
+      rewriter.create<scf::YieldOp>(resultOp->getLoc(), results);
+    }
+    doLoopOp.getInductionVar().replaceAllUsesWith(iv);
+    rewriter.replaceAllUsesWith(doLoopOp.getRegionIterArgs(),
+                                hasFinalValue
+                                    ? scfForOp.getRegionIterArgs().drop_front()
+                                    : scfForOp.getRegionIterArgs());
+
+    // Copy loop annotations from the do loop to the loop entry condition.
+    if (auto ann = doLoopOp.getLoopAnnotation())
+      scfForOp->setAttr("loop_annotation", *ann);
+
+    rewriter.replaceOp(doLoopOp, scfForOp);
+    return success();
+  }
+};
+
+void FIRToSCFPass::runOnOperation() {
+  RewritePatternSet patterns(&getContext());
+  patterns.add<DoLoopConversion>(patterns.getContext());
+  ConversionTarget target(getContext());
+  target.addIllegalOp<fir::DoLoopOp>();
+  target.markUnknownOpDynamicallyLegal([](Operation *) { return true; });
+  if (failed(
+          applyPartialConversion(getOperation(), target, std::move(patterns))))
+    signalPassFailure();
+}
+
+std::unique_ptr<mlir::Pass> fir::createFIRToSCFPass() {
+  return std::make_unique<FIRToSCFPass>();
+}
diff --git a/flang/test/Fir/FirToSCF/do-loop.fir b/flang/test/Fir/FirToSCF/do-loop.fir
new file mode 100644
index 0000000000000..c3c24ccc1db71
--- /dev/null
+++ b/flang/test/Fir/FirToSCF/do-loop.fir
@@ -0,0 +1,147 @@
+// RUN: fir-opt %s --fir-to-scf | FileCheck %s
+
+// CHECK-LABEL:   func.func @simple_loop(
+// CHECK-SAME:      %[[ARG0:.*]]: !fir.ref<!fir.array<100xi32>>) {
+// CHECK:           %[[VAL_0:.*]] = arith.constant 1 : index
+// CHECK:           %[[VAL_1:.*]] = arith.constant 100 : index
+// CHECK:           %[[VAL_2:.*]] = fir.shape %[[VAL_1]] : (index) -> !fir.shape<1>
+// CHECK:           %[[VAL_3:.*]] = arith.constant 1 : i32
+// CHECK:           %[[VAL_4:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
+// CHECK:           %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index
+// CHECK:           %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index
+// CHECK:           %[[VAL_7:.*]] = arith.constant 0 : index
+// CHECK:           %[[VAL_8:.*]] = arith.constant 1 : index
+// CHECK:           scf.for %[[VAL_9:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] {
+// CHECK:             %[[VAL_10:.*]] = arith.muli %[[VAL_9]], %[[VAL_0]] : index
+// CHECK:             %[[VAL_11:.*]] = arith.addi %[[VAL_0]], %[[VAL_10]] : index
+// CHECK:             %[[VAL_12:.*]] = fir.array_coor %[[ARG0]](%[[VAL_2]]) %[[VAL_11]] : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, index) -> !fir.ref<i32>
+// CHECK:             fir.store %[[VAL_3]] to %[[VAL_12]] : !fir.ref<i32>
+// CHECK:           }
+// CHECK:           return
+// CHECK:         }
+func.func @simple_loop(%arg0: !fir.ref<!fir.array<100xi32>>) {
+  %c1 = arith.constant 1 : index
+  %c100 = arith.constant 100 : index
+  %0 = fir.shape %c100 : (index) -> !fir.shape<1>
+  %c1_i32 = arith.constant 1 : i32
+  fir.do_loop %arg1 = %c1 to %c100 step %c1 {
+    %1 = fir.array_coor %arg0(%0) %arg1 : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, index) -> !fir.ref<i32>
+    fir.store %c1_i32 to %1 : !fir.ref<i32>
+  }
+  return
+}
+
+// CHECK-LABEL:   func.func @loop_with_negtive_step(
+// CHECK-SAME:      %[[ARG0:.*]]: !fir.ref<!fir.array<100xi32>>) {
+// CHECK:           %[[VAL_0:.*]] = arith.constant 100 : index
+// CHECK:           %[[VAL_1:.*]] = arith.constant 1 : index
+// CHECK:           %[[VAL_2:.*]] = arith.constant -1 : index
+// CHECK:           %[[VAL_3:.*]] = fir.shape %[[VAL_0]] : (index) -> !fir.shape<1>
+// CHECK:           %[[VAL_4:.*]] = arith.constant 1 : i32
+// CHECK:           %[[VAL_5:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
+// CHECK:           %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index
+// CHECK:           %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_2]] : index
+// CHECK:           %[[VAL_8:.*]] = arith.constant 0 : index
+// CHECK:           %[[VAL_9:.*]] = arith.constant 1 : index
+// CHECK:           scf.for %[[VAL_10:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] {
+// CHECK:             %[[VAL_11:.*]] = arith.muli %[[VAL_10]], %[[VAL_2]] : index
+// CHECK:             %[[VAL_12:.*]] = arith.addi %[[VAL_0]], %[[VAL_11]] : index
+// CHECK:             %[[VAL_13:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_12]] : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, index) -> !fir.ref<i32>
+// CHECK:             fir.store %[[VAL_4]] to %[[VAL_13]] : !fir.ref<i32>
+// CHECK:           }
+// CHECK:           return
+// CHECK:         }
+func.func @loop_with_negtive_step(%arg0: !fir.ref<!fir.array<100xi32>>) {
+  %c100 = arith.constant 100 : index
+  %c1 = arith.constant 1 : index
+  %c-1 = arith.constant -1 : index
+  %0 = fir.shape %c100 : (index) -> !fir.shape<1>
+  %c1_i32 = arith.constant 1 : i32
+  fir.do_loop %arg1 = %c100 to %c1 step %c-1 {
+    %1 = fir.array_coor %arg0(%0) %arg1 : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, index) -> !fir.ref<i32>
+    fir.store %c1_i32 to %1 : !fir.ref<i32>
+  }
+  return
+}
+
+// CHECK-LABEL:   func.func @loop_with_results(
+// CHECK-SAME:      %[[ARG0:.*]]: !fir.ref<!fir.array<100xi32>>,
+// CHECK-SAME:      %[[ARG1:.*]]: !fir.ref<i32>) {
+// CHECK:           %[[VAL_0:.*]] = arith.constant 1 : index
+// CHECK:           %[[VAL_1:.*]] = arith.constant 0 : i32
+// CHECK:           %[[VAL_2:.*]] = arith.constant 100 : index
+// CHECK:           %[[VAL_3:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1>
+// CHECK:           %[[VAL_4:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index
+// CHECK:           %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index
+// CHECK:           %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index
+// CHECK:           %[[VAL_7:.*]] = arith.constant 0 : index
+// CHECK:           %[[VAL_8:.*]] = arith.constant 1 : index
+// CHECK:           %[[VAL_9:.*]] = scf.for %[[VAL_10:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] iter_args(%[[VAL_11:.*]] = %[[VAL_1]]) -> (i32) {
+// CHECK:             %[[VAL_12:.*]] = arith.muli %[[VAL_10]], %[[VAL_0]] : index
+// CHECK:             %[[VAL_13:.*]] = arith.addi %[[VAL_0]], %[[VAL_12]] : index
+// CHECK:             %[[VAL_14:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_13]] : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, index) -> !fir.ref<i32>
+// CHECK:             %[[VAL_15:.*]] = fir.load %[[VAL_14]] : !fir.ref<i32>
+// CHECK:             %[[VAL_16:.*]] = arith.addi %[[VAL_11]], %[[VAL_15]] : i32
+// CHECK:             scf.yield %[[VAL_16]] : i32
+// CHECK:           }
+// CHECK:           fir.store %[[VAL_9]] to %[[ARG1]] : !fir.ref<i32>
+// CHECK:           return
+// CHECK:         }
+func.func @loop_with_results(%arg0: !fir.ref<!fir.array<100xi32>>, %arg1: !fir.ref<i32>) {
+  %c1 = arith.constant 1 : index
+  %c0_i32 = arith.constant 0 : i32
+  %c100 = arith.constant 100 : index
+  %0 = fir.shape %c100 : (index) -> !fir.shape<1>
+  %1 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %c0_i32) -> (i32) {
+    %2 = fir.array_coor %arg0(%0) %arg2 : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, index) -> !fir.ref<i32>
+    %3 = fir.load %2 : !fir.ref<i32>
+    %4 = arith.addi %arg3, %3 : i32
+    fir.result %4 : i32
+  }
+  fir.store %1 to %arg1 : !fir.ref<i32>
+  return
+}
+
+// CHECK-LABEL:   func.func @loop_with_final_value(
+// CHECK-SAME:      %[[ARG0:.*]]: !fir.ref<!fir.array<100xi32>>,
+// CHECK-SAME:      %[[ARG1:.*]]: !fir.ref<i32>) {
+// CHECK:           %[[VAL_0:.*]] = arith.constant 1 : index
+// CHECK:           %[[VAL_1:.*]] = arith.constant 0 : i32
+// CHECK:           %[[VAL_2:.*]] = arith.constant 100 : index
+// CHECK:           %[[VAL_3:.*]] = fir.alloca index
+// CHECK:           %[[VAL_4:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1>
+// CHECK:           %[[VAL_5:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index
+// CHECK:           %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_0]] : index
+// CHECK:           %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_0]] : index
+// CHECK:           %[[VAL_8:.*]] = arith.constant 0 : index
+// CHECK:           %[[VAL_9:.*]] = arith.constant 1 : index
+// CHECK:           %[[VAL_10:.*]]:2 = scf.for %[[VAL_11:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] iter_args(%[[VAL_12:.*]] = %[[VAL_0]], %[[VAL_13:.*]] = %[[VAL_1]]) -> (index, i32) {
+// CHECK:             %[[VAL_14:.*]] = arith.muli %[[VAL_11]], %[[VAL_0]] : index
+// CHECK:             %[[VAL_15:.*]] = arith.addi %[[VAL_0]], %[[VAL_14]] : index
+// CHECK:             %[[VAL_16:.*]] = fir.array_coor %[[ARG0]](%[[VAL_4]]) %[[VAL_15]] : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, index) -> !fir.ref<i32>
+// CHECK:             %[[VAL_17:.*]] = fir.load %[[VAL_16]] : !fir.ref<i32>
+// CHECK:             %[[VAL_18:.*]] = arith.addi %[[VAL_15]], %[[VAL_0]] overflow<nsw> : index
+// CHECK:             %[[VAL_19:.*]] = arith.addi %[[VAL_13]], %[[VAL_17]] overflow<nsw> : i32
+// CHECK:             scf.yield %[[VAL_18]], %[[VAL_19]] : index, i32
+// CHECK:           }
+// CHECK:           fir.store %[[VAL_20:.*]]#0 to %[[VAL_3]] : !fir.ref<index>
+// CHECK:           fir.store %[[VAL_20]]#1 to %[[ARG1]] : !fir.ref<i32>
+// CHECK:           return
+// CHECK:         }
+func.func @loop_with_final_value(%arg0: !fir.ref<!fir.array<100xi32>>, %arg1: !fir.ref<i32>) {
+  %c1 = arith.constant 1 : index
+  %c0_i32 = arith.constant 0 : i32
+  %c100 = arith.constant 100 : index
+  %0 = fir.alloca index
+  %1 = fir.shape %c100 : (index) -> !fir.shape<1>
+  %2:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %c0_i32) -> (index, i32) {
+    %3 = fir.array_coor %arg0(%1) %arg2 : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, index) -> !fir.ref<i32>
+    %4 = fir.load %3 : !fir.ref<i32>
+    %5 = arith.addi %arg2, %c1 overflow<nsw> : index
+    %6 = arith.addi %arg3, %4 overflow<nsw> : i32
+    fir.result %5, %6 : index, i32
+  }
+  fir.store %2#0 to %0 : !fir.ref<index>
+  fir.store %2#1 to %arg1 : !fir.ref<i32>
+  return
+}

From flang-commits at lists.llvm.org Sat May 17 04:47:06 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 17 May 2025 04:47:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <6828773a.a70a0220.210127.94b6@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: MingYan (NexMing)
Changes Convert FIR structured control flow ops to SCF dialect. --- Full diff: https://github.com/llvm/llvm-project/pull/140374.diff 6 Files Affected: - (modified) flang/include/flang/Optimizer/Support/InitFIR.h (+2) - (modified) flang/include/flang/Optimizer/Transforms/Passes.h (+1) - (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+11) - (modified) flang/lib/Optimizer/Transforms/CMakeLists.txt (+1) - (added) flang/lib/Optimizer/Transforms/FIRToSCF.cpp (+103) - (added) flang/test/Fir/FirToSCF/do-loop.fir (+147)
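For readers following the arithmetic in `DoLoopConversion`: the pattern normalizes the loop to run from 0 up to a trip count in steps of 1, then recomputes the original induction value inside the body. The sketch below mirrors the `arith.subi`/`arith.addi`/`arith.divsi` and `arith.muli`/`arith.addi` sequences in the CHECK lines using plain integers. The function names `tripCount` and `inductionValue` are illustrative and not part of the patch, and the sketch assumes a nonzero step.

```cpp
#include <cstdint>

// Trip count for a loop from `low` to `high` (inclusive) with stride `step`,
// using the same formula as the pattern: (high - low + step) / step.
// Signed division truncates toward zero, like arith.divsi.
// Assumption: step != 0.
std::int64_t tripCount(std::int64_t low, std::int64_t high, std::int64_t step) {
  return (high - low + step) / step;
}

// Original induction value for iteration `i` of the normalized loop,
// i.e. low + i * step, as recomputed at the top of the scf.for body.
std::int64_t inductionValue(std::int64_t low, std::int64_t step,
                            std::int64_t i) {
  return low + i * step;
}
```

With the bounds used in the tests, `tripCount(1, 100, 1)` and `tripCount(100, 1, -1)` both evaluate to 100, and `inductionValue(100, -1, 99)` evaluates to 1, the last value visited by the negative-step loop.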
https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Sat May 17 04:52:14 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 17 May 2025 04:52:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <6828786e.170a0220.18cb2c.7c62@mx.google.com> https://github.com/NexMing edited https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Sat May 17 04:56:32 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 17 May 2025 04:56:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <68287970.050a0220.16c125.9b71@mx.google.com> https://github.com/NexMing edited https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Sun May 18 17:14:51 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Sun, 18 May 2025 17:14:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682a77fb.170a0220.af901.b1a7@mx.google.com> ================ @@ -30,6 +30,7 @@ #include "mlir/Pass/PassRegistry.h" #include "mlir/Transforms/LocationSnapshot.h" #include "mlir/Transforms/Passes.h" +#include <mlir/Dialect/SCF/Transforms/Passes.h> ---------------- clementval wrote: The include style doesn't follow the others. https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Sun May 18 19:01:14 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sun, 18 May 2025 19:01:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass.
(PR #140374) In-Reply-To: Message-ID: <682a90ea.170a0220.2798bf.bdfa@mx.google.com> https://github.com/NexMing updated https://github.com/llvm/llvm-project/pull/140374

>From a7dec68d490ec09950a5aa15b82ea8029c23fd22 Mon Sep 17 00:00:00 2001
From: yanming
Date: Fri, 16 May 2025 17:56:21 +0800
Subject: [PATCH] [flang][fir] Add FIR structured control flow ops to SCF dialect pass.

---
 .../include/flang/Optimizer/Support/InitFIR.h |   2 +
 .../flang/Optimizer/Transforms/Passes.h       |   1 +
 .../flang/Optimizer/Transforms/Passes.td      |  11 ++
 flang/lib/Optimizer/Transforms/CMakeLists.txt |   1 +
 flang/lib/Optimizer/Transforms/FIRToSCF.cpp   | 103 ++++++++++++
 flang/test/Fir/FirToSCF/do-loop.fir           | 147 ++++++++++++++++++
 6 files changed, 265 insertions(+)
 create mode 100644 flang/lib/Optimizer/Transforms/FIRToSCF.cpp
 create mode 100644 flang/test/Fir/FirToSCF/do-loop.fir

diff --git a/flang/include/flang/Optimizer/Support/InitFIR.h b/flang/include/flang/Optimizer/Support/InitFIR.h
index 1868fbb201970..fa08f41f84adf 100644
--- a/flang/include/flang/Optimizer/Support/InitFIR.h
+++ b/flang/include/flang/Optimizer/Support/InitFIR.h
@@ -25,6 +25,7 @@
 #include "mlir/Dialect/Func/Extensions/InlinerExtension.h"
 #include "mlir/Dialect/LLVMIR/NVVMDialect.h"
 #include "mlir/Dialect/OpenACC/Transforms/Passes.h"
+#include "mlir/Dialect/SCF/Transforms/Passes.h"
 #include "mlir/InitAllDialects.h"
 #include "mlir/Pass/Pass.h"
 #include "mlir/Pass/PassRegistry.h"
@@ -103,6 +104,7 @@ inline void registerMLIRPassesForFortranTools() {
   mlir::registerPrintOpStatsPass();
   mlir::registerInlinerPass();
   mlir::registerSCCPPass();
+  mlir::registerSCFPasses();
   mlir::affine::registerAffineScalarReplacementPass();
   mlir::registerSymbolDCEPass();
   mlir::registerLocationSnapshotPass();
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h
index 6dbabd523f88a..dc8a5b9141ad2 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.h
+++ b/flang/include/flang/Optimizer/Transforms/Passes.h
@@ -72,6 +72,7 @@ std::unique_ptr<mlir::Pass>
 createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {});
 std::unique_ptr<mlir::Pass> createMemDataFlowOptPass();
 std::unique_ptr<mlir::Pass> createPromoteToAffinePass();
+std::unique_ptr<mlir::Pass> createFIRToSCFPass();
 std::unique_ptr<mlir::Pass>
 createAddDebugInfoPass(fir::AddDebugInfoOptions options = {});
 
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index c0d88a8e19f80..da3d9bc751927 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -76,6 +76,17 @@ def AffineDialectDemotion : Pass<"demote-affine", "::mlir::func::FuncOp"> {
   ];
 }
 
+def FIRToSCFPass : Pass<"fir-to-scf"> {
+  let summary = "Convert FIR structured control flow ops to SCF dialect.";
+  let description = [{
+    Convert FIR structured control flow ops to SCF dialect.
+  }];
+  let constructor = "::fir::createFIRToSCFPass()";
+  let dependentDialects = [
+    "fir::FIROpsDialect", "mlir::scf::SCFDialect"
+  ];
+}
+
 def AnnotateConstantOperands : Pass<"annotate-constant"> {
   let summary = "Annotate constant operands to all FIR operations";
   let description = [{
diff --git a/flang/lib/Optimizer/Transforms/CMakeLists.txt b/flang/lib/Optimizer/Transforms/CMakeLists.txt
index 170b6e2cca225..846d6c64dbd04 100644
--- a/flang/lib/Optimizer/Transforms/CMakeLists.txt
+++ b/flang/lib/Optimizer/Transforms/CMakeLists.txt
@@ -16,6 +16,7 @@ add_flang_library(FIRTransforms
   CUFComputeSharedMemoryOffsetsAndSize.cpp
   ArrayValueCopy.cpp
   ExternalNameConversion.cpp
+  FIRToSCF.cpp
   MemoryUtils.cpp
   MemoryAllocation.cpp
   StackArrays.cpp
diff --git a/flang/lib/Optimizer/Transforms/FIRToSCF.cpp b/flang/lib/Optimizer/Transforms/FIRToSCF.cpp
new file mode 100644
index 0000000000000..02810f1bdba4e
--- /dev/null
+++ b/flang/lib/Optimizer/Transforms/FIRToSCF.cpp
@@ -0,0 +1,103 @@
+//===-- FIRToSCF.cpp ------------------------------------------------------===//
+// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "mlir/Dialect/SCF/IR/SCF.h" +#include "mlir/Transforms/DialectConversion.h" + +namespace fir { +#define GEN_PASS_DEF_FIRTOSCFPASS +#include "flang/Optimizer/Transforms/Passes.h.inc" +} // namespace fir + +using namespace fir; +using namespace mlir; + +namespace { +class FIRToSCFPass : public fir::impl::FIRToSCFPassBase { +public: + void runOnOperation() override; +}; +} // namespace + +struct DoLoopConversion : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + + LogicalResult matchAndRewrite(fir::DoLoopOp doLoopOp, + PatternRewriter &rewriter) const override { + auto loc = doLoopOp.getLoc(); + bool hasFinalValue = doLoopOp.getFinalValue().has_value(); + + // Get loop values from the DoLoopOp + auto low = doLoopOp.getLowerBound(); + auto high = doLoopOp.getUpperBound(); + assert(low && high && "must be a Value"); + auto step = doLoopOp.getStep(); + llvm::SmallVector iterArgs; + if (hasFinalValue) + iterArgs.push_back(low); + iterArgs.append(doLoopOp.getIterOperands().begin(), + doLoopOp.getIterOperands().end()); + + // Calculate the trip count. 
+ auto diff = rewriter.create(loc, high, low); + auto distance = rewriter.create(loc, diff, step); + auto tripCount = rewriter.create(loc, distance, step); + auto zero = rewriter.create(loc, 0); + auto one = rewriter.create(loc, 1); + auto scfForOp = + rewriter.create(loc, zero, tripCount, one, iterArgs); + + auto &loopOps = doLoopOp.getBody()->getOperations(); + auto resultOp = cast(doLoopOp.getBody()->getTerminator()); + auto results = resultOp.getOperands(); + Block *loweredBody = scfForOp.getBody(); + + loweredBody->getOperations().splice(loweredBody->begin(), loopOps, + loopOps.begin(), + std::prev(loopOps.end())); + + rewriter.setInsertionPointToStart(loweredBody); + Value iv = + rewriter.create(loc, scfForOp.getInductionVar(), step); + iv = rewriter.create(loc, low, iv); + + if (!results.empty()) { + rewriter.setInsertionPointToEnd(loweredBody); + rewriter.create(resultOp->getLoc(), results); + } + doLoopOp.getInductionVar().replaceAllUsesWith(iv); + rewriter.replaceAllUsesWith(doLoopOp.getRegionIterArgs(), + hasFinalValue + ? scfForOp.getRegionIterArgs().drop_front() + : scfForOp.getRegionIterArgs()); + + // Copy loop annotations from the do loop to the loop entry condition. 
+ if (auto ann = doLoopOp.getLoopAnnotation()) + scfForOp->setAttr("loop_annotation", *ann); + + rewriter.replaceOp(doLoopOp, scfForOp); + return success(); + } +}; + +void FIRToSCFPass::runOnOperation() { + RewritePatternSet patterns(&getContext()); + patterns.add(patterns.getContext()); + ConversionTarget target(getContext()); + target.addIllegalOp(); + target.markUnknownOpDynamicallyLegal([](Operation *) { return true; }); + if (failed( + applyPartialConversion(getOperation(), target, std::move(patterns)))) + signalPassFailure(); +} + +std::unique_ptr fir::createFIRToSCFPass() { + return std::make_unique(); +} diff --git a/flang/test/Fir/FirToSCF/do-loop.fir b/flang/test/Fir/FirToSCF/do-loop.fir new file mode 100644 index 0000000000000..c3c24ccc1db71 --- /dev/null +++ b/flang/test/Fir/FirToSCF/do-loop.fir @@ -0,0 +1,147 @@ +// RUN: fir-opt %s --fir-to-scf | FileCheck %s + +// CHECK-LABEL: func.func @simple_loop( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_2:.*]] = fir.shape %[[VAL_1]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_3:.*]] = arith.constant 1 : i32 +// CHECK: %[[VAL_4:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index +// CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_8:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_9:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] { +// CHECK: %[[VAL_10:.*]] = arith.muli %[[VAL_9]], %[[VAL_0]] : index +// CHECK: %[[VAL_11:.*]] = arith.addi %[[VAL_0]], %[[VAL_10]] : index +// CHECK: %[[VAL_12:.*]] = fir.array_coor %[[ARG0]](%[[VAL_2]]) %[[VAL_11]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_3]] to %[[VAL_12]] : !fir.ref +// CHECK: } +// CHECK: return +// CHECK: } +func.func @simple_loop(%arg0: 
!fir.ref>) { + %c1 = arith.constant 1 : index + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %c1_i32 = arith.constant 1 : i32 + fir.do_loop %arg1 = %c1 to %c100 step %c1 { + %1 = fir.array_coor %arg0(%0) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + fir.store %c1_i32 to %1 : !fir.ref + } + return +} + +// CHECK-LABEL: func.func @loop_with_negtive_step( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>) { +// CHECK: %[[VAL_0:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_2:.*]] = arith.constant -1 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_0]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = arith.constant 1 : i32 +// CHECK: %[[VAL_5:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index +// CHECK: %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_2]] : index +// CHECK: %[[VAL_8:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_9:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_10:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] { +// CHECK: %[[VAL_11:.*]] = arith.muli %[[VAL_10]], %[[VAL_2]] : index +// CHECK: %[[VAL_12:.*]] = arith.addi %[[VAL_0]], %[[VAL_11]] : index +// CHECK: %[[VAL_13:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_12]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_4]] to %[[VAL_13]] : !fir.ref +// CHECK: } +// CHECK: return +// CHECK: } +func.func @loop_with_negtive_step(%arg0: !fir.ref>) { + %c100 = arith.constant 100 : index + %c1 = arith.constant 1 : index + %c-1 = arith.constant -1 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %c1_i32 = arith.constant 1 : i32 + fir.do_loop %arg1 = %c100 to %c1 step %c-1 { + %1 = fir.array_coor %arg0(%0) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + fir.store %c1_i32 to %1 : !fir.ref + } + return +} + +// CHECK-LABEL: func.func @loop_with_results( +// CHECK-SAME: 
%[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 0 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_8:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_9:.*]] = scf.for %[[VAL_10:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] iter_args(%[[VAL_11:.*]] = %[[VAL_1]]) -> (i32) { +// CHECK: %[[VAL_12:.*]] = arith.muli %[[VAL_10]], %[[VAL_0]] : index +// CHECK: %[[VAL_13:.*]] = arith.addi %[[VAL_0]], %[[VAL_12]] : index +// CHECK: %[[VAL_14:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_13]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: %[[VAL_15:.*]] = fir.load %[[VAL_14]] : !fir.ref +// CHECK: %[[VAL_16:.*]] = arith.addi %[[VAL_11]], %[[VAL_15]] : i32 +// CHECK: scf.yield %[[VAL_16]] : i32 +// CHECK: } +// CHECK: fir.store %[[VAL_9]] to %[[ARG1]] : !fir.ref +// CHECK: return +// CHECK: } +func.func @loop_with_results(%arg0: !fir.ref>, %arg1: !fir.ref) { + %c1 = arith.constant 1 : index + %c0_i32 = arith.constant 0 : i32 + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %1 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %c0_i32) -> (i32) { + %2 = fir.array_coor %arg0(%0) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %3 = fir.load %2 : !fir.ref + %4 = arith.addi %arg3, %3 : i32 + fir.result %4 : i32 + } + fir.store %1 to %arg1 : !fir.ref + return +} + +// CHECK-LABEL: func.func @loop_with_final_value( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref) { +// CHECK: %[[VAL_0:.*]] = arith.constant 
1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 0 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.alloca index +// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_5:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_0]] : index +// CHECK: %[[VAL_8:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_9:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_10:.*]]:2 = scf.for %[[VAL_11:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] iter_args(%[[VAL_12:.*]] = %[[VAL_0]], %[[VAL_13:.*]] = %[[VAL_1]]) -> (index, i32) { +// CHECK: %[[VAL_14:.*]] = arith.muli %[[VAL_11]], %[[VAL_0]] : index +// CHECK: %[[VAL_15:.*]] = arith.addi %[[VAL_0]], %[[VAL_14]] : index +// CHECK: %[[VAL_16:.*]] = fir.array_coor %[[ARG0]](%[[VAL_4]]) %[[VAL_15]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.load %[[VAL_16]] : !fir.ref +// CHECK: %[[VAL_18:.*]] = arith.addi %[[VAL_15]], %[[VAL_0]] overflow : index +// CHECK: %[[VAL_19:.*]] = arith.addi %[[VAL_13]], %[[VAL_17]] overflow : i32 +// CHECK: scf.yield %[[VAL_18]], %[[VAL_19]] : index, i32 +// CHECK: } +// CHECK: fir.store %[[VAL_20:.*]]#0 to %[[VAL_3]] : !fir.ref +// CHECK: fir.store %[[VAL_20]]#1 to %[[ARG1]] : !fir.ref +// CHECK: return +// CHECK: } +func.func @loop_with_final_value(%arg0: !fir.ref>, %arg1: !fir.ref) { + %c1 = arith.constant 1 : index + %c0_i32 = arith.constant 0 : i32 + %c100 = arith.constant 100 : index + %0 = fir.alloca index + %1 = fir.shape %c100 : (index) -> !fir.shape<1> + %2:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %c0_i32) -> (index, i32) { + %3 = fir.array_coor %arg0(%1) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %4 = fir.load %3 : !fir.ref + %5 = arith.addi %arg2, %c1 overflow : index + %6 = arith.addi %arg3, %4 overflow : i32 + 
fir.result %5, %6 : index, i32 + } + fir.store %2#0 to %0 : !fir.ref + fir.store %2#1 to %arg1 : !fir.ref + return +} From flang-commits at lists.llvm.org Sun May 18 20:02:45 2025 From: flang-commits at lists.llvm.org (Brad Smith via flang-commits) Date: Sun, 18 May 2025 20:02:45 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][Driver] Support -nodefaultlibs, -nostartfiles and -nostdlib (PR #72601) In-Reply-To: Message-ID: <682a9f55.170a0220.1b54b6.daf8@mx.google.com> https://github.com/brad0 closed https://github.com/llvm/llvm-project/pull/72601 From flang-commits at lists.llvm.org Mon May 19 01:26:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 01:26:55 -0700 (PDT) Subject: [flang-commits] [flang] dc0dcab - [flang][OpenMP] Allow flush of common block (#139528) Message-ID: <682aeb4f.170a0220.3470c5.cdcc@mx.google.com> Author: Tom Eccles Date: 2025-05-19T09:26:52+01:00 New Revision: dc0dcab397ae3de38141e1995e4b4e5e3bb98660 URL: https://github.com/llvm/llvm-project/commit/dc0dcab397ae3de38141e1995e4b4e5e3bb98660 DIFF: https://github.com/llvm/llvm-project/commit/dc0dcab397ae3de38141e1995e4b4e5e3bb98660.diff LOG: [flang][OpenMP] Allow flush of common block (#139528) I think this was denied by accident in https://github.com/llvm/llvm-project/commit/68180d8d16f07db8200dfce7bae26a80c43ebc5e. Flush of a common block is allowed by the standard on my reading. It is not allowed by classic-flang but is supported by gfortran and ifx. This doesn't need any lowering changes. The LLVM translation ignores the flush argument list because the openmp runtime library doesn't support flushing specific data. Depends upon https://github.com/llvm/llvm-project/pull/139522. Ignore the first commit in this PR. 
Added: flang/test/Lower/OpenMP/flush-common.f90 Modified: flang/lib/Semantics/check-omp-structure.cpp Removed: flang/test/Semantics/OpenMP/flush04.f90 ################################################################################ diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 5ae4bc29b72f7..c6c4fdf8a8198 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2303,9 +2303,15 @@ void OmpStructureChecker::Enter(const parser::OpenMPFlushConstruct &x) { void OmpStructureChecker::Leave(const parser::OpenMPFlushConstruct &x) { auto &flushList{std::get>(x.v.t)}; + auto isVariableListItemOrCommonBlock{[this](const Symbol &sym) { + return IsVariableListItem(sym) || + sym.detailsIf(); + }}; + if (flushList) { for (const parser::OmpArgument &arg : flushList->v) { - if (auto *sym{GetArgumentSymbol(arg)}; sym && !IsVariableListItem(*sym)) { + if (auto *sym{GetArgumentSymbol(arg)}; + sym && !isVariableListItemOrCommonBlock(*sym)) { context_.Say(arg.source, "FLUSH argument must be a variable list item"_err_en_US); } diff --git a/flang/test/Lower/OpenMP/flush-common.f90 b/flang/test/Lower/OpenMP/flush-common.f90 new file mode 100644 index 0000000000000..7656141dcb295 --- /dev/null +++ b/flang/test/Lower/OpenMP/flush-common.f90 @@ -0,0 +1,13 @@ +! RUN: %flang_fc1 -fopenmp -emit-hlfir -o - %s | FileCheck %s + +! Regression test to ensure that the name /c/ in the flush argument list is +! resolved to the common block symbol and common blocks are allowed in the +! flush argument list. + +! CHECK: %[[GLBL:.*]] = fir.address_of({{.*}}) : !fir.ref> + common /c/ x + real :: x +! 
CHECK: omp.flush(%[[GLBL]] : !fir.ref>) + !$omp flush(/c/) +end + diff --git a/flang/test/Semantics/OpenMP/flush04.f90 b/flang/test/Semantics/OpenMP/flush04.f90 deleted file mode 100644 index ffc2273b692dc..0000000000000 --- a/flang/test/Semantics/OpenMP/flush04.f90 +++ /dev/null @@ -1,11 +0,0 @@ -! RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp - -! Regression test to ensure that the name /c/ in the flush argument list is -! resolved to the common block symbol. - - common /c/ x - real :: x -!ERROR: FLUSH argument must be a variable list item - !$omp flush(/c/) -end - From flang-commits at lists.llvm.org Mon May 19 01:54:29 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 19 May 2025 01:54:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] use DataLayout instead of GEP to compute element size (PR #140235) In-Reply-To: Message-ID: <682af1c5.170a0220.34373.d06a@mx.google.com> https://github.com/tblah approved this pull request. LGTM, thanks https://github.com/llvm/llvm-project/pull/140235 From flang-commits at lists.llvm.org Mon May 19 02:33:52 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Mon, 19 May 2025 02:33:52 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682afb00.050a0220.12b65a.f93a@mx.google.com> ================ @@ -962,6 +962,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Number of loops generated by this loop transformation. 
unsigned NumGeneratedLoops = 0; + /// Number of top level canonical loop nests generated by this loop + /// transformation + unsigned NumGeneratedLoopNests = 0; ---------------- eZWALT wrote: This distinction is indeed important and actively used in `SemaOpenMP.cpp` file, particularly within the `AnalyzeLoopSequence` function (starting at line 14284). For example, it's referenced in lines 14344 and 14364 to differentiate between specific loop transformations. https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Mon May 19 02:54:40 2025 From: flang-commits at lists.llvm.org (Michael Klemm via flang-commits) Date: Mon, 19 May 2025 02:54:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Retrieve shape from selector when generating assoc sym type (PR #137117) In-Reply-To: Message-ID: <682affe0.170a0220.22d21c.d574@mx.google.com> https://github.com/mjklemm approved this pull request. LGTM, but I'm not the ultimate expert here. https://github.com/llvm/llvm-project/pull/137117 From flang-commits at lists.llvm.org Mon May 19 03:29:59 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Mon, 19 May 2025 03:29:59 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682b0827.170a0220.26f119.b671@mx.google.com> ================ @@ -11516,6 +11516,21 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; +def warn_omp_different_loop_ind_var_types : Warning < + "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">, + InGroup; +def err_omp_not_canonical_loop : Error < + "loop after '#pragma omp %0' is not in canonical form">; +def 
err_omp_not_a_loop_sequence : Error < + "statement after '#pragma omp %0' must be a loop sequence containing canonical loops or loop-generating constructs">; +def err_omp_empty_loop_sequence : Error < + "loop sequence after '#pragma omp %0' must contain at least 1 canonical loop or loop-generating construct">; +def err_omp_invalid_looprange : Error < + "loop range in '#pragma omp %0' exceeds the number of available loops: " + "range end '%1' is greater than the total number of loops '%2'">; ---------------- eZWALT wrote: The two errors serve different purposes: 1. The first is triggered when the loop sequence inside a fusion construct (full or ranged) contains no loops. 2. The second is specific to loopranged fusion and reports when the specified range exceeds the number of available loops. While the first case is technically a subset of the second, they occur in different contexts. Keeping both improves clarity and helps users better understand the issue. That said, I’m open to refactoring if you'd prefer a single, more general diagnostic, though it may reduce the precision of the error messages. https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Mon May 19 04:59:12 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 04:59:12 -0700 (PDT) Subject: [flang-commits] [flang] 416b7df - [flang] use DataLayout instead of GEP to compute element size (#140235) Message-ID: <682b1d10.a70a0220.34f0fc.e300@mx.google.com> Author: jeanPerier Date: 2025-05-19T13:59:09+02:00 New Revision: 416b7dfaa0d114b552c596d320f0aaac5651e61e URL: https://github.com/llvm/llvm-project/commit/416b7dfaa0d114b552c596d320f0aaac5651e61e DIFF: https://github.com/llvm/llvm-project/commit/416b7dfaa0d114b552c596d320f0aaac5651e61e.diff LOG: [flang] use DataLayout instead of GEP to compute element size (#140235) Now that the datalayout is part of codegen, use that to generate type size constants in codegen instead of generating GEP. 
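The commit message above describes replacing the null-pointer GEP trick with a size taken from the data layout. As a rough illustration only (not the flang or LLVM code itself), the arithmetic amounts to rounding the type size up to its ABI alignment. The names `alignToSketch` and `elementDistance` are hypothetical stand-ins, and the size/alignment pairs are assumed example values, such as an 80-bit float that stores 10 bytes but occupies 16 on some targets.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an align-up helper: round `size` up to the next
// multiple of `alignment`. `alignment` is assumed to be non-zero.
constexpr std::uint64_t alignToSketch(std::uint64_t size,
                                      std::uint64_t alignment) {
  return (size + alignment - 1) / alignment * alignment;
}

// Byte stride between adjacent array elements of a type, given its size and
// ABI alignment. Both inputs are assumed to come from a data layout query.
constexpr std::uint64_t elementDistance(std::uint64_t typeSize,
                                        std::uint64_t abiAlignment) {
  return alignToSketch(typeSize, abiAlignment);
}

// Assumed example values: a 4-byte float with 4-byte alignment keeps a
// 4-byte stride, while a 10-byte type with 16-byte alignment gets 16.
static_assert(elementDistance(4, 4) == 4, "f32-like element");
static_assert(elementDistance(10, 16) == 16, "padded 80-bit-float-like element");
```

Because the result is a compile-time constant for a fixed layout, the lowering can emit a plain integer constant rather than a pointer computation, which appears to be what the updated test expectations check for.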
Added: Modified: flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h flang/lib/Optimizer/CodeGen/CodeGen.cpp flang/test/Fir/convert-to-llvm.fir flang/test/Fir/copy-codegen.fir flang/test/Fir/embox-char.fir flang/test/Fir/embox-substring.fir Removed: ################################################################################ diff --git a/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h b/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h index 53d16323beddf..7b1c14e4dfdc9 100644 --- a/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h +++ b/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h @@ -173,6 +173,10 @@ class ConvertFIRToLLVMPattern : public mlir::ConvertToLLVMPattern { this->getTypeConverter()); } + const mlir::DataLayout &getDataLayout() const { + return lowerTy().getDataLayout(); + } + void attachTBAATag(mlir::LLVM::AliasAnalysisOpInterface op, mlir::Type baseFIRType, mlir::Type accessFIRType, mlir::LLVM::GEPOp gep) const { diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index e534cfa5591c6..ad9119ba4a031 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -1043,22 +1043,12 @@ static mlir::SymbolRefAttr getMalloc(fir::AllocMemOp op, static mlir::Value computeElementDistance(mlir::Location loc, mlir::Type llvmObjectType, mlir::Type idxTy, - mlir::ConversionPatternRewriter &rewriter) { - // Note that we cannot use something like - // mlir::LLVM::getPrimitiveTypeSizeInBits() for the element type here. For - // example, it returns 10 bytes for mlir::Float80Type for targets where it - // occupies 16 bytes. Proper solution is probably to use - // mlir::DataLayout::getTypeABIAlignment(), but DataLayout is not being set - // yet (see llvm-project#57230). For the time being use the '(intptr_t)((type - // *)0 + 1)' trick for all types. 
The generated instructions are optimized - // into constant by the first pass of InstCombine, so it should not be a - // performance issue. - auto llvmPtrTy = ::getLlvmPtrType(llvmObjectType.getContext()); - auto nullPtr = rewriter.create(loc, llvmPtrTy); - auto gep = rewriter.create( - loc, llvmPtrTy, llvmObjectType, nullPtr, - llvm::ArrayRef{1}); - return rewriter.create(loc, idxTy, gep); + mlir::ConversionPatternRewriter &rewriter, + const mlir::DataLayout &dataLayout) { + llvm::TypeSize size = dataLayout.getTypeSize(llvmObjectType); + unsigned short alignment = dataLayout.getTypeABIAlignment(llvmObjectType); + std::int64_t distance = llvm::alignTo(size, alignment); + return genConstantIndex(loc, idxTy, rewriter, distance); } /// Return value of the stride in bytes between adjacent elements @@ -1066,10 +1056,10 @@ computeElementDistance(mlir::Location loc, mlir::Type llvmObjectType, /// \p idxTy integer type. static mlir::Value genTypeStrideInBytes(mlir::Location loc, mlir::Type idxTy, - mlir::ConversionPatternRewriter &rewriter, - mlir::Type llTy) { + mlir::ConversionPatternRewriter &rewriter, mlir::Type llTy, + const mlir::DataLayout &dataLayout) { // Create a pointer type and use computeElementDistance(). 
- return computeElementDistance(loc, llTy, idxTy, rewriter); + return computeElementDistance(loc, llTy, idxTy, rewriter, dataLayout); } namespace { @@ -1111,7 +1101,7 @@ struct AllocMemOpConversion : public fir::FIROpConversion { mlir::Value genTypeSizeInBytes(mlir::Location loc, mlir::Type idxTy, mlir::ConversionPatternRewriter &rewriter, mlir::Type llTy) const { - return computeElementDistance(loc, llTy, idxTy, rewriter); + return computeElementDistance(loc, llTy, idxTy, rewriter, getDataLayout()); } }; } // namespace @@ -1323,8 +1313,8 @@ struct EmboxCommonConversion : public fir::FIROpConversion { fir::CharacterType charTy, mlir::ValueRange lenParams) const { auto i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); - mlir::Value size = - genTypeStrideInBytes(loc, i64Ty, rewriter, this->convertType(charTy)); + mlir::Value size = genTypeStrideInBytes( + loc, i64Ty, rewriter, this->convertType(charTy), this->getDataLayout()); if (charTy.hasConstantLen()) return size; // Length accounted for in the genTypeStrideInBytes GEP. // Otherwise, multiply the single character size by the length. 
@@ -1338,6 +1328,7 @@ struct EmboxCommonConversion : public fir::FIROpConversion { std::tuple getSizeAndTypeCode( mlir::Location loc, mlir::ConversionPatternRewriter &rewriter, mlir::Type boxEleTy, mlir::ValueRange lenParams = {}) const { + const mlir::DataLayout &dataLayout = this->getDataLayout(); auto i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); if (auto eleTy = fir::dyn_cast_ptrEleTy(boxEleTy)) boxEleTy = eleTy; @@ -1354,18 +1345,19 @@ struct EmboxCommonConversion : public fir::FIROpConversion { mlir::dyn_cast(boxEleTy) || fir::isa_real(boxEleTy) || fir::isa_complex(boxEleTy)) return {genTypeStrideInBytes(loc, i64Ty, rewriter, - this->convertType(boxEleTy)), + this->convertType(boxEleTy), dataLayout), typeCodeVal}; if (auto charTy = mlir::dyn_cast(boxEleTy)) return {getCharacterByteSize(loc, rewriter, charTy, lenParams), typeCodeVal}; if (fir::isa_ref_type(boxEleTy)) { auto ptrTy = ::getLlvmPtrType(rewriter.getContext()); - return {genTypeStrideInBytes(loc, i64Ty, rewriter, ptrTy), typeCodeVal}; + return {genTypeStrideInBytes(loc, i64Ty, rewriter, ptrTy, dataLayout), + typeCodeVal}; } if (mlir::isa(boxEleTy)) return {genTypeStrideInBytes(loc, i64Ty, rewriter, - this->convertType(boxEleTy)), + this->convertType(boxEleTy), dataLayout), typeCodeVal}; fir::emitFatalError(loc, "unhandled type in fir.box code generation"); } @@ -1909,8 +1901,8 @@ struct XEmboxOpConversion : public EmboxCommonConversion { if (hasSubcomp) { // We have a subcomponent. The step value needs to be the number of // bytes per element (which is a derived type). - prevDimByteStride = - genTypeStrideInBytes(loc, i64Ty, rewriter, convertType(seqEleTy)); + prevDimByteStride = genTypeStrideInBytes( + loc, i64Ty, rewriter, convertType(seqEleTy), getDataLayout()); } else if (hasSubstr) { // We have a substring. The step value needs to be the number of bytes // per CHARACTER element. 
@@ -3604,8 +3596,8 @@ struct CopyOpConversion : public fir::FIROpConversion { mlir::Value llvmDestination = adaptor.getDestination(); mlir::Type i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); mlir::Type copyTy = fir::unwrapRefType(copy.getSource().getType()); - mlir::Value copySize = - genTypeStrideInBytes(loc, i64Ty, rewriter, convertType(copyTy)); + mlir::Value copySize = genTypeStrideInBytes( + loc, i64Ty, rewriter, convertType(copyTy), getDataLayout()); mlir::LLVM::AliasAnalysisOpInterface newOp; if (copy.getNoOverlap()) diff --git a/flang/test/Fir/convert-to-llvm.fir b/flang/test/Fir/convert-to-llvm.fir index 2960528fb6c24..6d8a8bb606b90 100644 --- a/flang/test/Fir/convert-to-llvm.fir +++ b/flang/test/Fir/convert-to-llvm.fir @@ -216,9 +216,7 @@ func.func @test_alloc_and_freemem_one() { } // CHECK-LABEL: llvm.func @test_alloc_and_freemem_one() { -// CHECK-NEXT: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK-NEXT: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK-NEXT: %[[N:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[N:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK-NEXT: llvm.call @malloc(%[[N]]) // CHECK: llvm.call @free(%{{.*}}) // CHECK-NEXT: llvm.return @@ -235,10 +233,8 @@ func.func @test_alloc_and_freemem_several() { } // CHECK-LABEL: llvm.func @test_alloc_and_freemem_several() { -// CHECK: [[NULL:%.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: [[PTR:%.*]] = llvm.getelementptr [[NULL]][{{.*}}] : (!llvm.ptr) -> !llvm.ptr, !llvm.array<100 x f32> -// CHECK: [[N:%.*]] = llvm.ptrtoint [[PTR]] : !llvm.ptr to i64 -// CHECK: [[MALLOC:%.*]] = llvm.call @malloc([[N]]) +// CHECK: %[[N:.*]] = llvm.mlir.constant(400 : i64) : i64 +// CHECK: [[MALLOC:%.*]] = llvm.call @malloc(%[[N]]) // CHECK: llvm.call @free([[MALLOC]]) // CHECK: llvm.return @@ -251,9 +247,7 @@ func.func @test_with_shape(%ncols: index, %nrows: index) { // CHECK-LABEL: llvm.func @test_with_shape // CHECK-SAME: %[[NCOLS:.*]]: i64, %[[NROWS:.*]]: i64 -// CHECK: 
%[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[FOUR:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[FOUR:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[DIM1_SIZE:.*]] = llvm.mul %[[FOUR]], %[[NCOLS]] : i64 // CHECK: %[[TOTAL_SIZE:.*]] = llvm.mul %[[DIM1_SIZE]], %[[NROWS]] : i64 // CHECK: %[[MEM:.*]] = llvm.call @malloc(%[[TOTAL_SIZE]]) @@ -269,9 +263,7 @@ func.func @test_string_with_shape(%len: index, %nelems: index) { // CHECK-LABEL: llvm.func @test_string_with_shape // CHECK-SAME: %[[LEN:.*]]: i64, %[[NELEMS:.*]]: i64) -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ONE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ONE:.*]] = llvm.mlir.constant(1 : i64) : i64 // CHECK: %[[LEN_SIZE:.*]] = llvm.mul %[[ONE]], %[[LEN]] : i64 // CHECK: %[[TOTAL_SIZE:.*]] = llvm.mul %[[LEN_SIZE]], %[[NELEMS]] : i64 // CHECK: %[[MEM:.*]] = llvm.call @malloc(%[[TOTAL_SIZE]]) @@ -1654,9 +1646,7 @@ func.func @embox0(%arg0: !fir.ref>) { // AMDGPU: %[[AA:.*]] = llvm.alloca %[[C1]] x !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> {alignment = 8 : i64} : (i32) -> !llvm.ptr<5> // AMDGPU: %[[ALLOCA:.*]] = llvm.addrspacecast %[[AA]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[I64_ELEM_SIZE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[I64_ELEM_SIZE:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[DESC:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> // CHECK: %[[DESC0:.*]] = llvm.insertvalue %[[I64_ELEM_SIZE]], %[[DESC]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> // CHECK: %[[CFI_VERSION:.*]] = 
llvm.mlir.constant(20240719 : i32) : i32 @@ -1879,9 +1869,7 @@ func.func @xembox0(%arg0: !fir.ref>) { // AMDGPU: %[[ALLOCA:.*]] = llvm.addrspacecast %[[AA]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[TYPE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -1933,9 +1921,7 @@ func.func @xembox0_i32(%arg0: !fir.ref>) { // CHECK: %[[C0_I32:.*]] = llvm.mlir.constant(0 : i32) : i32 // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[TYPE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -1988,9 +1974,7 @@ func.func @xembox1(%arg0: !fir.ref>>) { // CHECK-LABEL: llvm.func @xembox1(%{{.*}}: !llvm.ptr) { // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : 
i64) : i64 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(10 : i64) : i64 // CHECK: %{{.*}} = llvm.insertvalue %[[ELEM_LEN_I64]], %{{.*}}[1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[PREV_PTROFF:.*]] = llvm.mul %[[ELEM_LEN_I64]], %[[C0]] : i64 @@ -2042,9 +2026,7 @@ func.func private @_QPxb(!fir.box>) // AMDGPU: %[[AR:.*]] = llvm.alloca %[[ARR_SIZE]] x f64 {bindc_name = "arr"} : (i64) -> !llvm.ptr<5> // AMDGPU: %[[ARR:.*]] = llvm.addrspacecast %[[AR]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(28 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(8 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<2 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<2 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -2126,9 +2108,7 @@ func.func private @_QPtest_dt_callee(%arg0: !fir.box>) // CHECK: %[[C10:.*]] = llvm.mlir.constant(10 : i64) : i64 // CHECK: %[[C2:.*]] = llvm.mlir.constant(2 : i64) : i64 // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = 
llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -2146,9 +2126,7 @@ func.func private @_QPtest_dt_callee(%arg0: !fir.box>) // CHECK: %[[BOX6:.*]] = llvm.insertvalue %[[F18ADDENDUM_I8]], %[[BOX5]][6] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[ZERO:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[ONE:.*]] = llvm.mlir.constant(1 : i64) : i64 -// CHECK: %[[ELE_TYPE:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP_DTYPE_SIZE:.*]] = llvm.getelementptr %[[ELE_TYPE]][1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"_QFtest_dt_sliceTt", (i32, i32)> -// CHECK: %[[PTRTOINT_DTYPE_SIZE:.*]] = llvm.ptrtoint %[[GEP_DTYPE_SIZE]] : !llvm.ptr to i64 +// CHECK: %[[PTRTOINT_DTYPE_SIZE:.*]] = llvm.mlir.constant(8 : i64) : i64 // CHECK: %[[ADJUSTED_OFFSET:.*]] = llvm.sub %[[C1]], %[[ONE]] : i64 // CHECK: %[[EXT_SUB:.*]] = llvm.sub %[[C10]], %[[C1]] : i64 // CHECK: %[[EXT_ADD:.*]] = llvm.add %[[EXT_SUB]], %[[C2]] : i64 @@ -2429,9 +2407,7 @@ func.func @test_rebox_1(%arg0: !fir.box>) { //CHECK: %[[SIX:.*]] = llvm.mlir.constant(6 : index) : i64 //CHECK: %[[EIGHTY:.*]] = llvm.mlir.constant(80 : index) : i64 //CHECK: %[[FLOAT_TYPE:.*]] = llvm.mlir.constant(27 : i32) : i32 -//CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -//CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -//CHECK: %[[ELEM_SIZE_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +//CHECK: %[[ELEM_SIZE_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 //CHECK: %[[EXTRA_GEP:.*]] = llvm.getelementptr %[[ARG0]][0, 6] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> //CHECK: %[[EXTRA:.*]] 
= llvm.load %[[EXTRA_GEP]] : !llvm.ptr -> i8 //CHECK: %[[RBOX:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> @@ -2504,9 +2480,7 @@ func.func @foo(%arg0: !fir.box} //CHECK: %[[COMPONENT_OFFSET_1:.*]] = llvm.mlir.constant(1 : i64) : i64 //CHECK: %[[ELEM_COUNT:.*]] = llvm.mlir.constant(7 : i64) : i64 //CHECK: %[[TYPE_CHAR:.*]] = llvm.mlir.constant(40 : i32) : i32 -//CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -//CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -//CHECK: %[[CHAR_SIZE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +//CHECK: %[[CHAR_SIZE:.*]] = llvm.mlir.constant(1 : i64) : i64 //CHECK: %[[ELEM_SIZE:.*]] = llvm.mul %[[CHAR_SIZE]], %[[ELEM_COUNT]] //CHECK: %[[EXTRA_GEP:.*]] = llvm.getelementptr %[[ARG0]][0, 6] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>, ptr, array<1 x i64>)> //CHECK: %[[EXTRA:.*]] = llvm.load %[[EXTRA_GEP]] : !llvm.ptr -> i8 diff --git a/flang/test/Fir/copy-codegen.fir b/flang/test/Fir/copy-codegen.fir index eef1885c6a49c..7b0620ca2d312 100644 --- a/flang/test/Fir/copy-codegen.fir +++ b/flang/test/Fir/copy-codegen.fir @@ -12,10 +12,8 @@ func.func @test_copy_1(%arg0: !fir.ref, %arg1: !fir.ref) { // CHECK-LABEL: llvm.func @test_copy_1( // CHECK-SAME: %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr, // CHECK-SAME: %[[VAL_1:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr) { -// CHECK: %[[VAL_2:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_3:.*]] = llvm.getelementptr %[[VAL_2]][1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"sometype", (array<9 x i32>)> -// CHECK: %[[VAL_4:.*]] = llvm.ptrtoint %[[VAL_3]] : !llvm.ptr to i64 -// CHECK: "llvm.intr.memcpy"(%[[VAL_1]], %[[VAL_0]], %[[VAL_4]]) <{isVolatile = false}> : (!llvm.ptr, !llvm.ptr, i64) -> () +// CHECK: %[[VAL_2:.*]] = llvm.mlir.constant(36 : i64) : i64 +// CHECK: "llvm.intr.memcpy"(%[[VAL_1]], %[[VAL_0]], %[[VAL_2]]) <{isVolatile = false}> : (!llvm.ptr, 
!llvm.ptr, i64) -> () // CHECK: llvm.return // CHECK: } @@ -26,10 +24,8 @@ func.func @test_copy_2(%arg0: !fir.ref, %arg1: !fir.ref) { // CHECK-LABEL: llvm.func @test_copy_2( // CHECK-SAME: %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr, // CHECK-SAME: %[[VAL_1:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr) { -// CHECK: %[[VAL_2:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_3:.*]] = llvm.getelementptr %[[VAL_2]][1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"sometype", (array<9 x i32>)> -// CHECK: %[[VAL_4:.*]] = llvm.ptrtoint %[[VAL_3]] : !llvm.ptr to i64 -// CHECK: "llvm.intr.memmove"(%[[VAL_1]], %[[VAL_0]], %[[VAL_4]]) <{isVolatile = false}> : (!llvm.ptr, !llvm.ptr, i64) -> () +// CHECK: %[[VAL_2:.*]] = llvm.mlir.constant(36 : i64) : i64 +// CHECK: "llvm.intr.memmove"(%[[VAL_1]], %[[VAL_0]], %[[VAL_2]]) <{isVolatile = false}> : (!llvm.ptr, !llvm.ptr, i64) -> () // CHECK: llvm.return // CHECK: } } diff --git a/flang/test/Fir/embox-char.fir b/flang/test/Fir/embox-char.fir index efb069f96520d..8e40acfdf289f 100644 --- a/flang/test/Fir/embox-char.fir +++ b/flang/test/Fir/embox-char.fir @@ -45,9 +45,7 @@ // CHECK: %[[VAL_30:.*]] = llvm.load %[[VAL_29]] : !llvm.ptr -> i64 // CHECK: %[[VAL_31:.*]] = llvm.sdiv %[[VAL_16]], %[[VAL_13]] : i64 // CHECK: %[[VAL_32:.*]] = llvm.mlir.constant(44 : i32) : i32 -// CHECK: %[[VAL_33:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_34:.*]] = llvm.getelementptr %[[VAL_33]][1] : (!llvm.ptr) -> !llvm.ptr, i32 -// CHECK: %[[VAL_35:.*]] = llvm.ptrtoint %[[VAL_34]] : !llvm.ptr to i64 +// CHECK: %[[VAL_35:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[VAL_36:.*]] = llvm.mul %[[VAL_35]], %[[VAL_31]] : i64 // CHECK: %[[VAL_37:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> // CHECK: %[[VAL_38:.*]] = llvm.insertvalue %[[VAL_36]], %[[VAL_37]][1] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> @@ -139,9 +137,7 @@ func.func @test_char4(%arg0: 
!fir.ref !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> // CHECK: %[[VAL_29:.*]] = llvm.load %[[VAL_28]] : !llvm.ptr -> i64 // CHECK: %[[VAL_30:.*]] = llvm.mlir.constant(40 : i32) : i32 -// CHECK: %[[VAL_31:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_32:.*]] = llvm.getelementptr %[[VAL_31]][1] : (!llvm.ptr) -> !llvm.ptr, i8 -// CHECK: %[[VAL_33:.*]] = llvm.ptrtoint %[[VAL_32]] : !llvm.ptr to i64 +// CHECK: %[[VAL_33:.*]] = llvm.mlir.constant(1 : i64) : i64 // CHECK: %[[VAL_34:.*]] = llvm.mul %[[VAL_33]], %[[VAL_15]] : i64 // CHECK: %[[VAL_35:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> // CHECK: %[[VAL_36:.*]] = llvm.insertvalue %[[VAL_34]], %[[VAL_35]][1] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> diff --git a/flang/test/Fir/embox-substring.fir b/flang/test/Fir/embox-substring.fir index f2042f9bda7fc..6ce6346f89b1d 100644 --- a/flang/test/Fir/embox-substring.fir +++ b/flang/test/Fir/embox-substring.fir @@ -29,10 +29,9 @@ func.func private @dump(!fir.box>>) // CHECK-SAME: %[[VAL_0:.*]]: !llvm.ptr, // CHECK-SAME: %[[VAL_1:.*]]: i64) { // CHECK: %[[VAL_5:.*]] = llvm.mlir.constant(1 : index) : i64 -// CHECK: llvm.getelementptr -// CHECK: %[[VAL_28:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_29:.*]] = llvm.getelementptr %[[VAL_28]][1] : (!llvm.ptr) -> !llvm.ptr, i8 -// CHECK: %[[VAL_30:.*]] = llvm.ptrtoint %[[VAL_29]] : !llvm.ptr to i64 +// CHECK: llvm.mlir.constant(1 : i64) : i64 +// CHECK: llvm.mlir.constant(1 : i64) : i64 +// CHECK: %[[VAL_30:.*]] = llvm.mlir.constant(1 : i64) : i64 // CHECK: %[[VAL_31:.*]] = llvm.mul %[[VAL_30]], %[[VAL_1]] : i64 // CHECK: %[[VAL_42:.*]] = llvm.mul %[[VAL_31]], %[[VAL_5]] : i64 // CHECK: %[[VAL_43:.*]] = llvm.insertvalue %[[VAL_42]], %{{.*}}[7, 0, 2] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> From flang-commits at lists.llvm.org Mon May 19 04:59:15 2025 From: 
flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 04:59:15 -0700 (PDT) Subject: [flang-commits] [flang] [flang] use DataLayout instead of GEP to compute element size (PR #140235) In-Reply-To: Message-ID: <682b1d13.630a0220.3d0cee.2c3b@mx.google.com> https://github.com/jeanPerier closed https://github.com/llvm/llvm-project/pull/140235 From flang-commits at lists.llvm.org Mon May 19 04:59:18 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 04:59:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang] translate derived type array init to attribute if possible (PR #140268) In-Reply-To: Message-ID: <682b1d16.170a0220.103927.d6ce@mx.google.com> https://github.com/jeanPerier edited https://github.com/llvm/llvm-project/pull/140268 From flang-commits at lists.llvm.org Mon May 19 05:00:24 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 05:00:24 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][veclib] Adding AMDLIBM target to fveclib (PR #140533) Message-ID: https://github.com/shivaramaarao created https://github.com/llvm/llvm-project/pull/140533 This commit adds AMDLIBM support to fveclib targets. The support is already present in clang and this patch extends it to flang. >From 071cbdb22e6491ef5e5f7261cfded236e8f6582b Mon Sep 17 00:00:00 2001 From: Shivarama Rao Date: Mon, 19 May 2025 11:34:57 +0000 Subject: [PATCH] [flang][veclib] Add AMDLIBM target to fveclib This commit adds AMDLIBM support to fveclib targets. The support is already present in clang and this patch extends it to flang. 
--- clang/lib/Driver/ToolChains/Flang.cpp | 2 +- flang/include/flang/Frontend/CodeGenOptions.def | 2 +- flang/lib/Frontend/CompilerInvocation.cpp | 1 + flang/test/Driver/fveclib-codegen.f90 | 2 ++ flang/test/Driver/fveclib.f90 | 3 +++ 5 files changed, 8 insertions(+), 2 deletions(-) diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index b1ca747e68b89..0bd8d0c85e50a 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -484,7 +484,7 @@ void Flang::addTargetOptions(const ArgList &Args, Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) << Name << Triple.getArchName(); - } else if (Name == "libmvec") { + } else if (Name == "libmvec" || Name == "AMDLIBM") { if (Triple.getArch() != llvm::Triple::x86 && Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index d9dbd274e83e5..b50dd4fb3abda 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -42,7 +42,7 @@ CODEGENOPT(AliasAnalysis, 1, 0) ///< Enable alias analysis pass CODEGENOPT(Underscoring, 1, 1) ENUM_CODEGENOPT(RelocationModel, llvm::Reloc::Model, 3, llvm::Reloc::PIC_) ///< Name of the relocation model to use. 
ENUM_CODEGENOPT(DebugInfo, llvm::codegenoptions::DebugInfoKind, 4, llvm::codegenoptions::NoDebugInfo) ///< Level of debug info to generate -ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, llvm::driver::VectorLibrary::NoLibrary) ///< Vector functions library to use +ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 4, llvm::driver::VectorLibrary::NoLibrary) ///< Vector functions library to use ENUM_CODEGENOPT(FramePointer, llvm::FramePointerKind, 2, llvm::FramePointerKind::None) ///< Enable the usage of frame pointers ENUM_CODEGENOPT(DoConcurrentMapping, DoConcurrentMappingKind, 2, DoConcurrentMappingKind::DCMK_None) ///< Map `do concurrent` to OpenMP diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 238079a09ef3a..b6c37712d0f79 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -201,6 +201,7 @@ static bool parseVectorLibArg(Fortran::frontend::CodeGenOptions &opts, .Case("SLEEF", VectorLibrary::SLEEF) .Case("Darwin_libsystem_m", VectorLibrary::Darwin_libsystem_m) .Case("ArmPL", VectorLibrary::ArmPL) + .Case("AMDLIBM", VectorLibrary::AMDLIBM) .Case("NoLibrary", VectorLibrary::NoLibrary) .Default(std::nullopt); if (!val.has_value()) { diff --git a/flang/test/Driver/fveclib-codegen.f90 b/flang/test/Driver/fveclib-codegen.f90 index 802fff9772bb3..4cbb1e284f18e 100644 --- a/flang/test/Driver/fveclib-codegen.f90 +++ b/flang/test/Driver/fveclib-codegen.f90 @@ -1,6 +1,7 @@ ! test that -fveclib= is passed to the backend ! RUN: %if aarch64-registered-target %{ %flang -S -Ofast -target aarch64-unknown-linux-gnu -fveclib=SLEEF -o - %s | FileCheck %s --check-prefix=SLEEF %} ! RUN: %if x86-registered-target %{ %flang -S -Ofast -target x86_64-unknown-linux-gnu -fveclib=libmvec -o - %s | FileCheck %s %} +! 
RUN: %if x86-registered-target %{ %flang -S -O3 -ffast-math -target x86_64-unknown-linux-gnu -fveclib=AMDLIBM -o - %s | FileCheck %s --check-prefix=AMDLIBM %} ! RUN: %flang -S -Ofast -fveclib=NoLibrary -o - %s | FileCheck %s --check-prefix=NOLIB subroutine sb(a, b) @@ -10,6 +11,7 @@ subroutine sb(a, b) ! check that we used a vectorized call to powf() ! CHECK: _ZGVbN4vv_powf ! SLEEF: _ZGVnN4vv_powf +! AMDLIBM: amd_vrs4_powf ! NOLIB: powf a(i) = a(i) ** b(i) end do diff --git a/flang/test/Driver/fveclib.f90 b/flang/test/Driver/fveclib.f90 index 1b536b8ad0f18..431a4bfc02522 100644 --- a/flang/test/Driver/fveclib.f90 +++ b/flang/test/Driver/fveclib.f90 @@ -5,6 +5,7 @@ ! RUN: %flang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck -check-prefix CHECK-DARWIN_LIBSYSTEM_M %s ! RUN: %flang -### -c --target=aarch64-none-none -fveclib=SLEEF %s 2>&1 | FileCheck -check-prefix CHECK-SLEEF %s ! RUN: %flang -### -c --target=aarch64-none-none -fveclib=ArmPL %s 2>&1 | FileCheck -check-prefix CHECK-ARMPL %s +! RUN: %flang -### -c --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck -check-prefix CHECK-AMDLIBM %s ! RUN: %flang -### -c --target=aarch64-apple-darwin -fveclib=none %s 2>&1 | FileCheck -check-prefix CHECK-NOLIB-DARWIN %s ! RUN: not %flang -c -fveclib=something %s 2>&1 | FileCheck -check-prefix CHECK-INVALID %s @@ -15,6 +16,7 @@ ! CHECK-DARWIN_LIBSYSTEM_M: "-fveclib=Darwin_libsystem_m" ! CHECK-SLEEF: "-fveclib=SLEEF" ! CHECK-ARMPL: "-fveclib=ArmPL" +! CHECK-AMDLIBM: "-fveclib=AMDLIBM" ! CHECK-NOLIB-DARWIN: "-fveclib=none" ! CHECK-INVALID: error: invalid value 'something' in '-fveclib=something' @@ -23,6 +25,7 @@ ! RUN: not %flang --target=x86-none-none -c -fveclib=ArmPL %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=aarch64-none-none -c -fveclib=libmvec %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=aarch64-none-none -c -fveclib=SVML %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s +! 
RUN: not %flang --target=aarch64-none-none -c -fveclib=AMDLIBM %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! CHECK-ERROR: unsupported option {{.*}} for target ! RUN: %flang -fveclib=Accelerate %s -target arm64-apple-ios8.0.0 -### 2>&1 | FileCheck --check-prefix=CHECK-LINK %s From flang-commits at lists.llvm.org Mon May 19 05:00:40 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 05:00:40 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][veclib] Adding AMDLIBM target to fveclib (PR #140533) In-Reply-To: Message-ID: <682b1d68.630a0220.322380.2bf4@mx.google.com> github-actions[bot] wrote: Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using `@` followed by their GitHub username. If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the [LLVM GitHub User Guide](https://llvm.org/docs/GitHub.html). You can also ask questions in a comment on this PR, on the [LLVM Discord](https://discord.com/invite/xS7Z362) or on the [forums](https://discourse.llvm.org/). 
https://github.com/llvm/llvm-project/pull/140533 From flang-commits at lists.llvm.org Mon May 19 05:01:12 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 05:01:12 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][veclib] Adding AMDLIBM target to fveclib (PR #140533) In-Reply-To: Message-ID: <682b1d88.a70a0220.37cc85.f780@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-driver @llvm/pr-subscribers-clang Author: None (shivaramaarao)
Changes This commit adds AMDLIBM support to fveclib targets. The support is already present in clang and this patch extends it to flang. --- Full diff: https://github.com/llvm/llvm-project/pull/140533.diff 5 Files Affected:
- (modified) clang/lib/Driver/ToolChains/Flang.cpp (+1-1)
- (modified) flang/include/flang/Frontend/CodeGenOptions.def (+1-1)
- (modified) flang/lib/Frontend/CompilerInvocation.cpp (+1)
- (modified) flang/test/Driver/fveclib-codegen.f90 (+2)
- (modified) flang/test/Driver/fveclib.f90 (+3)
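The three code changes in this patch (the x86-only driver check, the new `.Case("AMDLIBM", ...)` entry, and the `VecLib` bit-field width going from 3 to 4) can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: the enum below is a hypothetical stand-in for `llvm::driver::VectorLibrary` (its enumerator order and count are assumed, not taken from the patch), `parseVecLib` lists only the names visible in the quoted hunk, and `passesX86OnlyCheck` models just the one driver branch the patch touches. The helper names are invented for this sketch.

```cpp
#include <optional>
#include <string_view>

// Hypothetical stand-in for llvm::driver::VectorLibrary. The order and
// number of enumerators are assumptions made for this illustration.
enum class VectorLibrary : unsigned {
  NoLibrary, Accelerate, LIBMVEC, MASSV, SVML, SLEEF,
  Darwin_libsystem_m, ArmPL, AMDLIBM
};

// Smallest bit-field width that can hold every value in [0, maxValue].
constexpr unsigned bitsNeeded(unsigned maxValue) {
  unsigned bits = 1;
  while ((maxValue >> bits) != 0)
    ++bits;
  return bits;
}

// With nine enumerators the largest value is 8, which no longer fits in
// 3 bits. Under the assumed enum this is why the width would become 4.
static_assert(bitsNeeded(7) == 3, "values 0..7 fit in 3 bits");
static_assert(bitsNeeded(static_cast<unsigned>(VectorLibrary::AMDLIBM)) == 4,
              "value 8 needs 4 bits");

// Name-to-enum mapping in the spirit of the StringSwitch in
// parseVectorLibArg; only the names visible in the quoted hunk are listed.
std::optional<VectorLibrary> parseVecLib(std::string_view name) {
  if (name == "SLEEF") return VectorLibrary::SLEEF;
  if (name == "Darwin_libsystem_m") return VectorLibrary::Darwin_libsystem_m;
  if (name == "ArmPL") return VectorLibrary::ArmPL;
  if (name == "AMDLIBM") return VectorLibrary::AMDLIBM;
  if (name == "NoLibrary") return VectorLibrary::NoLibrary;
  return std::nullopt; // unknown name: the caller reports a diagnostic
}

// Models only the driver branch changed by the patch: libmvec and AMDLIBM
// are accepted on x86 and x86_64 and rejected elsewhere. Other library
// names are handled by other branches that this sketch does not model.
bool passesX86OnlyCheck(std::string_view name, std::string_view arch) {
  if (name == "libmvec" || name == "AMDLIBM")
    return arch == "x86" || arch == "x86_64";
  return true;
}
```

Under these assumptions, `parseVecLib("AMDLIBM")` yields a value while `parseVecLib("something")` yields `std::nullopt`, and `passesX86OnlyCheck("AMDLIBM", "aarch64")` is false, which matches the `CHECK-INVALID` and `CHECK-ERROR` expectations added to `fveclib.f90`.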
https://github.com/llvm/llvm-project/pull/140533 From flang-commits at lists.llvm.org Mon May 19 05:36:30 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Mon, 19 May 2025 05:36:30 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][veclib] Adding AMDLIBM target to fveclib (PR #140533) In-Reply-To: Message-ID: <682b25ce.050a0220.8b6db.e79d@mx.google.com> https://github.com/kiranchandramohan approved this pull request. LG. https://github.com/llvm/llvm-project/pull/140533 From flang-commits at lists.llvm.org Mon May 19 05:42:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 05:42:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Retrieve shape from selector when generating assoc sym type (PR #137117) In-Reply-To: Message-ID: <682b274f.170a0220.5ba0c.dd19@mx.google.com> ================ @@ -279,6 +279,23 @@ struct TypeBuilderImpl { bool isPolymorphic = (Fortran::semantics::IsPolymorphic(symbol) || Fortran::semantics::IsUnlimitedPolymorphic(symbol)) && !Fortran::semantics::IsAssumedType(symbol); + if (const auto *assocDetails = ---------------- jeanPerier wrote: Sorry for the very late reply; I missed the mails. Do not hesitate to call me back for review when I am late to reply. Thanks for the reply, I can understand why Symbol::GetShape returns nothing given there is no textual shape-spec in the source for selectors. However, `evaluate::GetShape` that is called here in lowering is supposed to work directly with AssocEntityDetails (I see some handling for it [here](https://github.com/llvm/llvm-project/blob/e8a307498d678970e0be8cd7718ba0779d5ee307/flang/lib/Evaluate/shape.cpp#L821C16-L828C20)). So to avoid duplicating code, maybe the best approach is just to simplify the code to:
```
auto shapeExpr = Fortran::evaluate::GetShape(....);
if (shapeExpr && !shapeExpr->empty()) {
  // Statically ranked array.
  fir::SequenceType::Shape shape;
  translateShape(shape, std::move(*shapeExpr));
  ty = ...
} else if (!shapeExpr) {
  // Assumed-rank.
  ty = fir::SequenceType::get(fir::SequenceType::Shape{}, ty);
}
```
https://github.com/llvm/llvm-project/pull/137117 From flang-commits at lists.llvm.org Mon May 19 05:43:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 05:43:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Retrieve shape from selector when generating assoc sym type (PR #137117) In-Reply-To: Message-ID: <682b2769.170a0220.7cdba.052c@mx.google.com> https://github.com/jeanPerier edited https://github.com/llvm/llvm-project/pull/137117 From flang-commits at lists.llvm.org Mon May 19 05:59:42 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 05:59:42 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][veclib] Adding AMDLIBM target to fveclib (PR #140533) In-Reply-To: Message-ID: <682b2b3e.170a0220.139975.00b7@mx.google.com> https://github.com/NimishMishra approved this pull request. LGTM. Thanks https://github.com/llvm/llvm-project/pull/140533 From flang-commits at lists.llvm.org Mon May 19 06:53:39 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Mon, 19 May 2025 06:53:39 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682b37e3.170a0220.20d2c.2bda@mx.google.com> ================ @@ -0,0 +1,103 @@ +//===-- FIRToSCF.cpp ------------------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "mlir/Dialect/SCF/IR/SCF.h" +#include "mlir/Transforms/DialectConversion.h" + +namespace fir { +#define GEN_PASS_DEF_FIRTOSCFPASS +#include "flang/Optimizer/Transforms/Passes.h.inc" +} // namespace fir + +using namespace fir; +using namespace mlir; + +namespace { +class FIRToSCFPass : public fir::impl::FIRToSCFPassBase { +public: + void runOnOperation() override; +}; +} // namespace + +struct DoLoopConversion : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + + LogicalResult matchAndRewrite(fir::DoLoopOp doLoopOp, + PatternRewriter &rewriter) const override { + auto loc = doLoopOp.getLoc(); + bool hasFinalValue = doLoopOp.getFinalValue().has_value(); + + // Get loop values from the DoLoopOp + auto low = doLoopOp.getLowerBound(); + auto high = doLoopOp.getUpperBound(); + assert(low && high && "must be a Value"); + auto step = doLoopOp.getStep(); + llvm::SmallVector iterArgs; + if (hasFinalValue) + iterArgs.push_back(low); + iterArgs.append(doLoopOp.getIterOperands().begin(), + doLoopOp.getIterOperands().end()); + + // Caculate the trip count. ---------------- kiranchandramohan wrote: Could you add a brief comment on the semantic difference between `scf.for` and `fir.do_loop`? Particularly, whether the upperbound is included, the step value needing to be positive etc. Generally what are the differences that you should account for while converting. https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Mon May 19 06:53:40 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Mon, 19 May 2025 06:53:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. 
(PR #140374) In-Reply-To: Message-ID: <682b37e4.620a0220.1c5e24.4bae@mx.google.com> ================ @@ -0,0 +1,103 @@ +//===-- FIRToSCF.cpp ------------------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "mlir/Dialect/SCF/IR/SCF.h" +#include "mlir/Transforms/DialectConversion.h" + +namespace fir { +#define GEN_PASS_DEF_FIRTOSCFPASS +#include "flang/Optimizer/Transforms/Passes.h.inc" +} // namespace fir + +using namespace fir; +using namespace mlir; + +namespace { +class FIRToSCFPass : public fir::impl::FIRToSCFPassBase { +public: + void runOnOperation() override; +}; +} // namespace + +struct DoLoopConversion : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + + LogicalResult matchAndRewrite(fir::DoLoopOp doLoopOp, + PatternRewriter &rewriter) const override { + auto loc = doLoopOp.getLoc(); + bool hasFinalValue = doLoopOp.getFinalValue().has_value(); + + // Get loop values from the DoLoopOp + auto low = doLoopOp.getLowerBound(); + auto high = doLoopOp.getUpperBound(); + assert(low && high && "must be a Value"); + auto step = doLoopOp.getStep(); + llvm::SmallVector iterArgs; + if (hasFinalValue) + iterArgs.push_back(low); + iterArgs.append(doLoopOp.getIterOperands().begin(), + doLoopOp.getIterOperands().end()); + + // Caculate the trip count. 
+ auto diff = rewriter.create(loc, high, low); + auto distance = rewriter.create(loc, diff, step); + auto tripCount = rewriter.create(loc, distance, step); + auto zero = rewriter.create(loc, 0); + auto one = rewriter.create(loc, 1); + auto scfForOp = + rewriter.create(loc, zero, tripCount, one, iterArgs); + + auto &loopOps = doLoopOp.getBody()->getOperations(); + auto resultOp = cast(doLoopOp.getBody()->getTerminator()); + auto results = resultOp.getOperands(); + Block *loweredBody = scfForOp.getBody(); + + loweredBody->getOperations().splice(loweredBody->begin(), loopOps, + loopOps.begin(), + std::prev(loopOps.end())); + + rewriter.setInsertionPointToStart(loweredBody); + Value iv = + rewriter.create(loc, scfForOp.getInductionVar(), step); + iv = rewriter.create(loc, low, iv); + + if (!results.empty()) { + rewriter.setInsertionPointToEnd(loweredBody); + rewriter.create(resultOp->getLoc(), results); + } + doLoopOp.getInductionVar().replaceAllUsesWith(iv); + rewriter.replaceAllUsesWith(doLoopOp.getRegionIterArgs(), + hasFinalValue + ? scfForOp.getRegionIterArgs().drop_front() + : scfForOp.getRegionIterArgs()); + + // Copy loop annotations from the do loop to the loop entry condition. + if (auto ann = doLoopOp.getLoopAnnotation()) + scfForOp->setAttr("loop_annotation", *ann); ---------------- kiranchandramohan wrote: `fir.do_loop` also has the following attributes, do we need to do anything about these? unordered finalValue reduceAttrs https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Mon May 19 06:53:41 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Mon, 19 May 2025 06:53:41 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. 
(PR #140374) In-Reply-To: Message-ID: <682b37e5.170a0220.2706e5.70aa@mx.google.com> ================ @@ -0,0 +1,103 @@ +//===-- FIRToSCF.cpp ------------------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "mlir/Dialect/SCF/IR/SCF.h" +#include "mlir/Transforms/DialectConversion.h" + +namespace fir { +#define GEN_PASS_DEF_FIRTOSCFPASS +#include "flang/Optimizer/Transforms/Passes.h.inc" +} // namespace fir + +using namespace fir; +using namespace mlir; + +namespace { +class FIRToSCFPass : public fir::impl::FIRToSCFPassBase { +public: + void runOnOperation() override; +}; +} // namespace + +struct DoLoopConversion : public OpRewritePattern { ---------------- kiranchandramohan wrote: Nit: Could you audit all the `auto` usage and add specific types if the type is not present on the Right Hand Side of the assignment? 
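For reference, the trip-count arithmetic in the quoted patch can be sketched in plain C++. This is a hedged illustration, not the pass itself: the operation names are elided in the quoted diff, so the sketch assumes they are an integer subtract, add and signed (truncating) divide, followed by a multiply and add to rebuild the induction value. `tripCount` and `inductionValues` are hypothetical helper names. It treats the `fir.do_loop` upper bound as inclusive and models the `scf.for` range as half-open, which is my understanding of the two ops.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Trip count in the shape used by the quoted patch:
//   diff = high - low; distance = diff + step; tripCount = distance / step
// Assumption: the division is signed and truncates toward zero.
std::int64_t tripCount(std::int64_t low, std::int64_t high, std::int64_t step) {
  assert(step != 0 && "a zero step is not meaningful");
  return (high - low + step) / step;
}

// Values the loop body would observe. A zero-based counter runs over the
// half-open range [0, tripCount) with step 1, and the original induction
// value is rebuilt as low + counter * step.
std::vector<std::int64_t> inductionValues(std::int64_t low, std::int64_t high,
                                          std::int64_t step) {
  std::vector<std::int64_t> values;
  const std::int64_t count = tripCount(low, high, step);
  for (std::int64_t counter = 0; counter < count; ++counter)
    values.push_back(low + counter * step);
  return values;
}
```

Under these assumptions `do i = 1, 10, 3` gives a trip count of 4 and visits 1, 4, 7, 10, so the inclusive upper bound is reached. A zero or negative count yields no iterations in the half-open range.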
https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Mon May 19 06:55:39 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Mon, 19 May 2025 06:55:39 -0700 (PDT) Subject: [flang-commits] [flang] Reduce memory consumption when compiling flang (PR #124919) In-Reply-To: Message-ID: <682b385b.a70a0220.e9cae.491c@mx.google.com> https://github.com/kparzysz closed https://github.com/llvm/llvm-project/pull/124919 From flang-commits at lists.llvm.org Mon May 19 07:07:38 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Mon, 19 May 2025 07:07:38 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][veclib] Adding AMDLIBM target to fveclib (PR #140533) In-Reply-To: Message-ID: <682b3b2a.170a0220.ba589.9daa@mx.google.com> https://github.com/tarunprabhu approved this pull request. LGTM. Thanks. https://github.com/llvm/llvm-project/pull/140533 From flang-commits at lists.llvm.org Mon May 19 07:19:43 2025 From: flang-commits at lists.llvm.org (Scott Manley via flang-commits) Date: Mon, 19 May 2025 07:19:43 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682b3dff.170a0220.271210.ec81@mx.google.com> ================ @@ -0,0 +1,103 @@ +//===-- FIRToSCF.cpp ------------------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "mlir/Dialect/SCF/IR/SCF.h" +#include "mlir/Transforms/DialectConversion.h" + +namespace fir { +#define GEN_PASS_DEF_FIRTOSCFPASS +#include "flang/Optimizer/Transforms/Passes.h.inc" +} // namespace fir + +using namespace fir; +using namespace mlir; + +namespace { +class FIRToSCFPass : public fir::impl::FIRToSCFPassBase { +public: + void runOnOperation() override; +}; +} // namespace + +struct DoLoopConversion : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + + LogicalResult matchAndRewrite(fir::DoLoopOp doLoopOp, + PatternRewriter &rewriter) const override { + auto loc = doLoopOp.getLoc(); + bool hasFinalValue = doLoopOp.getFinalValue().has_value(); + + // Get loop values from the DoLoopOp + auto low = doLoopOp.getLowerBound(); + auto high = doLoopOp.getUpperBound(); + assert(low && high && "must be a Value"); + auto step = doLoopOp.getStep(); + llvm::SmallVector iterArgs; + if (hasFinalValue) + iterArgs.push_back(low); + iterArgs.append(doLoopOp.getIterOperands().begin(), + doLoopOp.getIterOperands().end()); + + // Caculate the trip count. 
+ auto diff = rewriter.create(loc, high, low); + auto distance = rewriter.create(loc, diff, step); + auto tripCount = rewriter.create(loc, distance, step); + auto zero = rewriter.create(loc, 0); + auto one = rewriter.create(loc, 1); + auto scfForOp = + rewriter.create(loc, zero, tripCount, one, iterArgs); + + auto &loopOps = doLoopOp.getBody()->getOperations(); + auto resultOp = cast(doLoopOp.getBody()->getTerminator()); + auto results = resultOp.getOperands(); + Block *loweredBody = scfForOp.getBody(); + + loweredBody->getOperations().splice(loweredBody->begin(), loopOps, + loopOps.begin(), + std::prev(loopOps.end())); + + rewriter.setInsertionPointToStart(loweredBody); + Value iv = + rewriter.create(loc, scfForOp.getInductionVar(), step); + iv = rewriter.create(loc, low, iv); + + if (!results.empty()) { + rewriter.setInsertionPointToEnd(loweredBody); + rewriter.create(resultOp->getLoc(), results); + } + doLoopOp.getInductionVar().replaceAllUsesWith(iv); + rewriter.replaceAllUsesWith(doLoopOp.getRegionIterArgs(), + hasFinalValue + ? scfForOp.getRegionIterArgs().drop_front() + : scfForOp.getRegionIterArgs()); + + // Copy loop annotations from the do loop to the loop entry condition. + if (auto ann = doLoopOp.getLoopAnnotation()) + scfForOp->setAttr("loop_annotation", *ann); ---------------- rscottmanley wrote: It's worth pointing out that preserving attributes on SCF ops when using other upstream passes is not guaranteed. If there are any attributes that are required for correctness or further optimization - this may be a concern. https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Mon May 19 07:19:51 2025 From: flang-commits at lists.llvm.org (Scott Manley via flang-commits) Date: Mon, 19 May 2025 07:19:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. 
(PR #140374) In-Reply-To: Message-ID: <682b3e07.630a0220.30548b.8fc4@mx.google.com> https://github.com/rscottmanley edited https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Mon May 19 07:25:05 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 07:25:05 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][veclib] Adding AMDLIBM target to fveclib (PR #140533) In-Reply-To: Message-ID: <682b3f41.170a0220.f1751.5411@mx.google.com> NimishMishra wrote: Have approved the workflow runs https://github.com/llvm/llvm-project/pull/140533 From flang-commits at lists.llvm.org Mon May 19 07:27:10 2025 From: flang-commits at lists.llvm.org (Scott Manley via flang-commits) Date: Mon, 19 May 2025 07:27:10 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682b3fbe.a70a0220.25fcb6.76ce@mx.google.com> rscottmanley wrote: Can you elaborate on "future work will focus on gradually improving this conversion pass"? What ops will you be converting and where/when will it live in the pipeline? What's the intended use for this conversion upstream? https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Mon May 19 07:48:59 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Mon, 19 May 2025 07:48:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682b44db.170a0220.422fa.8235@mx.google.com> ================ @@ -0,0 +1,103 @@ +//===-- FIRToSCF.cpp ------------------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "mlir/Dialect/SCF/IR/SCF.h" +#include "mlir/Transforms/DialectConversion.h" + +namespace fir { +#define GEN_PASS_DEF_FIRTOSCFPASS +#include "flang/Optimizer/Transforms/Passes.h.inc" +} // namespace fir + +using namespace fir; +using namespace mlir; + +namespace { +class FIRToSCFPass : public fir::impl::FIRToSCFPassBase { +public: + void runOnOperation() override; +}; +} // namespace + +struct DoLoopConversion : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + + LogicalResult matchAndRewrite(fir::DoLoopOp doLoopOp, + PatternRewriter &rewriter) const override { + auto loc = doLoopOp.getLoc(); + bool hasFinalValue = doLoopOp.getFinalValue().has_value(); + + // Get loop values from the DoLoopOp + auto low = doLoopOp.getLowerBound(); + auto high = doLoopOp.getUpperBound(); + assert(low && high && "must be a Value"); + auto step = doLoopOp.getStep(); + llvm::SmallVector iterArgs; + if (hasFinalValue) + iterArgs.push_back(low); + iterArgs.append(doLoopOp.getIterOperands().begin(), + doLoopOp.getIterOperands().end()); + + // Caculate the trip count. ---------------- tarunprabhu wrote: nit: spelling ```suggestion // Calculate the trip count. ``` https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Mon May 19 08:01:28 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Mon, 19 May 2025 08:01:28 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) Message-ID: https://github.com/abidh created https://github.com/llvm/llvm-project/pull/140556 This PR adds functionality to change the `flang` command line using the environment variable `FCC_OVERRIDE_OPTIONS`. 
It is quite similar to what `CCC_OVERRIDE_OPTIONS` does for clang. The `FCC_OVERRIDE_OPTIONS` seemed like the most obvious name to me but I am open to other ideas. The `applyOverrideOptions` now takes an extra argument that is the name of the environment variable. Previously `CCC_OVERRIDE_OPTIONS` was hardcoded. >From 5d20af48673adebc2ab3e1a6c8442f67d84f1847 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Mon, 19 May 2025 15:21:25 +0100 Subject: [PATCH] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. This PR add functionality to change flang command line using environment variable `FCC_OVERRIDE_OPTIONS`. It is quite similar to what `CCC_OVERRIDE_OPTIONS` does for clang. The `FCC_OVERRIDE_OPTIONS` seemed like the most obvious name to me but I am open to other ideas. The `applyOverrideOptions` now takes an extra argument that is the name of the environment variable. Previously `CCC_OVERRIDE_OPTIONS` was hardcoded. --- clang/include/clang/Driver/Driver.h | 2 +- clang/lib/Driver/Driver.cpp | 4 ++-- clang/tools/driver/driver.cpp | 2 +- flang/test/Driver/fcc_override.f90 | 12 ++++++++++++ flang/tools/flang-driver/driver.cpp | 7 +++++++ 5 files changed, 23 insertions(+), 4 deletions(-) create mode 100644 flang/test/Driver/fcc_override.f90 diff --git a/clang/include/clang/Driver/Driver.h b/clang/include/clang/Driver/Driver.h index b463dc2a93550..7ca848f11b561 100644 --- a/clang/include/clang/Driver/Driver.h +++ b/clang/include/clang/Driver/Driver.h @@ -879,7 +879,7 @@ llvm::Error expandResponseFiles(SmallVectorImpl &Args, /// See applyOneOverrideOption. 
void applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideOpts, - llvm::StringSet<> &SavedStrings, + llvm::StringSet<> &SavedStrings, StringRef EnvVar, raw_ostream *OS = nullptr); } // end namespace driver diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp index a648cc928afdc..a8fea35926a0d 100644 --- a/clang/lib/Driver/Driver.cpp +++ b/clang/lib/Driver/Driver.cpp @@ -7289,7 +7289,7 @@ static void applyOneOverrideOption(raw_ostream &OS, void driver::applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideStr, llvm::StringSet<> &SavedStrings, - raw_ostream *OS) { + StringRef EnvVar, raw_ostream *OS) { if (!OS) OS = &llvm::nulls(); @@ -7298,7 +7298,7 @@ void driver::applyOverrideOptions(SmallVectorImpl &Args, OS = &llvm::nulls(); } - *OS << "### CCC_OVERRIDE_OPTIONS: " << OverrideStr << "\n"; + *OS << "### " << EnvVar << ": " << OverrideStr << "\n"; // This does not need to be efficient. diff --git a/clang/tools/driver/driver.cpp b/clang/tools/driver/driver.cpp index 82f47ab973064..81964c65c2892 100644 --- a/clang/tools/driver/driver.cpp +++ b/clang/tools/driver/driver.cpp @@ -305,7 +305,7 @@ int clang_main(int Argc, char **Argv, const llvm::ToolContext &ToolContext) { if (const char *OverrideStr = ::getenv("CCC_OVERRIDE_OPTIONS")) { // FIXME: Driver shouldn't take extra initial argument. driver::applyOverrideOptions(Args, OverrideStr, SavedStrings, - &llvm::errs()); + "CCC_OVERRIDE_OPTIONS", &llvm::errs()); } std::string Path = GetExecutablePath(ToolContext.Path, CanonicalPrefixes); diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 new file mode 100644 index 0000000000000..55a07803fdde5 --- /dev/null +++ b/flang/test/Driver/fcc_override.f90 @@ -0,0 +1,12 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s +! 
RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR + +! CHECK: "-fc1" +! CHECK-NOT: "-Oignore" +! CHECK: "-Omagic" +! CHECK-NOT: "-Oignore" + +! RM-WERROR: ### FCC_OVERRIDE_OPTIONS: x-Werror +-g +! RM-WERROR-NEXT: ### Deleting argument -Werror +! RM-WERROR-NEXT: ### Adding argument -g at end +! RM-WERROR-NOT: "-Werror" diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..ad0efa3279cef 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -111,6 +111,13 @@ int main(int argc, const char **argv) { } } + llvm::StringSet<> SavedStrings; + // Handle FCC_OVERRIDE_OPTIONS, used for editing a command line behind the + // scenes. + if (const char *OverrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) + clang::driver::applyOverrideOptions(args, OverrideStr, SavedStrings, + "FCC_OVERRIDE_OPTIONS", &llvm::errs()); + // Not in the frontend mode - continue in the compiler driver mode. // Create DiagnosticsEngine for the compiler driver From flang-commits at lists.llvm.org Mon May 19 08:02:01 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 08:02:01 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <682b47e9.050a0220.375542.7a4c@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-clang-driver Author: Abid Qadeer (abidh)
Changes This PR adds functionality to change the `flang` command line using the environment variable `FCC_OVERRIDE_OPTIONS`. It is quite similar to what `CCC_OVERRIDE_OPTIONS` does for clang. The `FCC_OVERRIDE_OPTIONS` seemed like the most obvious name to me but I am open to other ideas. The `applyOverrideOptions` now takes an extra argument that is the name of the environment variable. Previously `CCC_OVERRIDE_OPTIONS` was hardcoded. --- Full diff: https://github.com/llvm/llvm-project/pull/140556.diff 5 Files Affected: - (modified) clang/include/clang/Driver/Driver.h (+1-1) - (modified) clang/lib/Driver/Driver.cpp (+2-2) - (modified) clang/tools/driver/driver.cpp (+1-1) - (added) flang/test/Driver/fcc_override.f90 (+12) - (modified) flang/tools/flang-driver/driver.cpp (+7) ``````````diff diff --git a/clang/include/clang/Driver/Driver.h b/clang/include/clang/Driver/Driver.h index b463dc2a93550..7ca848f11b561 100644 --- a/clang/include/clang/Driver/Driver.h +++ b/clang/include/clang/Driver/Driver.h @@ -879,7 +879,7 @@ llvm::Error expandResponseFiles(SmallVectorImpl &Args, /// See applyOneOverrideOption. 
void applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideOpts, - llvm::StringSet<> &SavedStrings, + llvm::StringSet<> &SavedStrings, StringRef EnvVar, raw_ostream *OS = nullptr); } // end namespace driver diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp index a648cc928afdc..a8fea35926a0d 100644 --- a/clang/lib/Driver/Driver.cpp +++ b/clang/lib/Driver/Driver.cpp @@ -7289,7 +7289,7 @@ static void applyOneOverrideOption(raw_ostream &OS, void driver::applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideStr, llvm::StringSet<> &SavedStrings, - raw_ostream *OS) { + StringRef EnvVar, raw_ostream *OS) { if (!OS) OS = &llvm::nulls(); @@ -7298,7 +7298,7 @@ void driver::applyOverrideOptions(SmallVectorImpl &Args, OS = &llvm::nulls(); } - *OS << "### CCC_OVERRIDE_OPTIONS: " << OverrideStr << "\n"; + *OS << "### " << EnvVar << ": " << OverrideStr << "\n"; // This does not need to be efficient. diff --git a/clang/tools/driver/driver.cpp b/clang/tools/driver/driver.cpp index 82f47ab973064..81964c65c2892 100644 --- a/clang/tools/driver/driver.cpp +++ b/clang/tools/driver/driver.cpp @@ -305,7 +305,7 @@ int clang_main(int Argc, char **Argv, const llvm::ToolContext &ToolContext) { if (const char *OverrideStr = ::getenv("CCC_OVERRIDE_OPTIONS")) { // FIXME: Driver shouldn't take extra initial argument. driver::applyOverrideOptions(Args, OverrideStr, SavedStrings, - &llvm::errs()); + "CCC_OVERRIDE_OPTIONS", &llvm::errs()); } std::string Path = GetExecutablePath(ToolContext.Path, CanonicalPrefixes); diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 new file mode 100644 index 0000000000000..55a07803fdde5 --- /dev/null +++ b/flang/test/Driver/fcc_override.f90 @@ -0,0 +1,12 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s +! 
RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR + +! CHECK: "-fc1" +! CHECK-NOT: "-Oignore" +! CHECK: "-Omagic" +! CHECK-NOT: "-Oignore" + +! RM-WERROR: ### FCC_OVERRIDE_OPTIONS: x-Werror +-g +! RM-WERROR-NEXT: ### Deleting argument -Werror +! RM-WERROR-NEXT: ### Adding argument -g at end +! RM-WERROR-NOT: "-Werror" diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..ad0efa3279cef 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -111,6 +111,13 @@ int main(int argc, const char **argv) { } } + llvm::StringSet<> SavedStrings; + // Handle FCC_OVERRIDE_OPTIONS, used for editing a command line behind the + // scenes. + if (const char *OverrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) + clang::driver::applyOverrideOptions(args, OverrideStr, SavedStrings, + "FCC_OVERRIDE_OPTIONS", &llvm::errs()); + // Not in the frontend mode - continue in the compiler driver mode. // Create DiagnosticsEngine for the compiler driver ``````````
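The two edit forms exercised by the `RM-WERROR` test above (`x-Werror` deletes an argument, `+-g` appends one) can be modelled with a small standalone sketch. This is a simplified illustration, not the code in `clang/lib/Driver/Driver.cpp`: `applyEdits` is a hypothetical name, only the `x` and `+` forms are handled, and it assumes `x` removes every literal occurrence of the argument. The other forms that appear in the first RUN line are deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified model of an override string such as
// "x-Werror +-g": edits are separated by whitespace and applied in order.
//   "+ARG"  appends ARG at the end of the argument list
//   "xARG"  deletes every literal occurrence of ARG (assumed semantics)
// Any other edit form is ignored by this sketch.
std::vector<std::string> applyEdits(std::vector<std::string> args,
                                    const std::string &edits) {
  std::istringstream stream(edits);
  std::string edit;
  while (stream >> edit) {
    if (edit.size() < 2)
      continue; // nothing follows the edit character
    const std::string arg = edit.substr(1);
    if (edit[0] == '+') {
      args.push_back(arg);
    } else if (edit[0] == 'x') {
      args.erase(std::remove(args.begin(), args.end(), arg), args.end());
    }
  }
  return args;
}
```

With the arguments `-Werror -c -###` and the override string `x-Werror +-g`, this model drops `-Werror` and appends `-g`, which matches the "Deleting argument" and "Adding argument ... at end" messages the test checks for.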
https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Mon May 19 01:26:59 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 19 May 2025 01:26:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow flush of common block (PR #139528) In-Reply-To: Message-ID: <682aeb53.050a0220.349151.eb5f@mx.google.com> https://github.com/tblah closed https://github.com/llvm/llvm-project/pull/139528 From flang-commits at lists.llvm.org Mon May 19 03:14:59 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 03:14:59 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [WIP] Implement workdistribute construct (PR #140523) Message-ID: https://github.com/skc7 created https://github.com/llvm/llvm-project/pull/140523 Note: This is very early work in progress PR implementing workdistribute in flang. More changes/commits incoming. >From 759ca74bd60a4380c880e527033320ca24b997a5 Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 12:57:36 -0800 Subject: [PATCH 01/11] Add coexecute directives --- llvm/include/llvm/Frontend/OpenMP/OMP.td | 45 ++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 194b1e657c493..0389ea722197e 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -667,6 +667,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let association = AS_None; let category = CA_Executable; } +def OMP_Coexecute : Directive<"coexecute"> {} def OMP_Critical : Directive<"critical"> { let allowedOnceClauses = [ VersionedClause, @@ -717,6 +718,7 @@ def OMP_DeclareTarget : Directive<"declare target"> { let association = AS_None; let category = CA_Declarative; } +def OMP_EndCoexecute : Directive<"end coexecute"> {} def OMP_EndDeclareTarget : Directive<"end declare target"> { let 
association = AS_Delimited; let category = OMP_DeclareTarget.category; @@ -2168,6 +2170,33 @@ def OMP_TargetTeams : Directive<"target teams"> { let leafConstructs = [OMP_Target, OMP_Teams]; let category = CA_Executable; } +def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; +} def OMP_TargetTeamsDistribute : Directive<"target teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2446,6 +2475,22 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { let leafConstructs = [OMP_TaskLoop, OMP_Simd]; let category = CA_Executable; } +def OMP_TeamsCoexecute : Directive<"teams coexecute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause + ]; +} def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [ VersionedClause, >From 13d539887222351c3dfd4c2e6c55906c469bb628 Mon Sep 17 00:00:00 2001 From: skc7 Date: Tue, 13 May 2025 11:01:45 +0530 Subject: [PATCH 02/11] [OpenMP] Fix Coexecute definitions --- llvm/include/llvm/Frontend/OpenMP/OMP.td | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 0389ea722197e..17edb90a2a618 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -667,7 +667,10 @@ def OMP_CancellationPoint : 
Directive<"cancellation point"> { let association = AS_None; let category = CA_Executable; } -def OMP_Coexecute : Directive<"coexecute"> {} +def OMP_Coexecute : Directive<"coexecute"> { + let association = AS_Block; + let category = CA_Executable; +} def OMP_Critical : Directive<"critical"> { let allowedOnceClauses = [ VersionedClause, @@ -718,7 +721,11 @@ def OMP_DeclareTarget : Directive<"declare target"> { let association = AS_None; let category = CA_Declarative; } -def OMP_EndCoexecute : Directive<"end coexecute"> {} +def OMP_EndCoexecute : Directive<"end coexecute"> { + let leafConstructs = OMP_Coexecute.leafConstructs; + let association = OMP_Coexecute.association; + let category = OMP_Coexecute.category; +} def OMP_EndDeclareTarget : Directive<"end declare target"> { let association = AS_Delimited; let category = OMP_DeclareTarget.category; @@ -2194,8 +2201,10 @@ def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { VersionedClause, VersionedClause, VersionedClause, - VersionedClause, + VersionedClause, ]; + let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; + let category = CA_Executable; } def OMP_TargetTeamsDistribute : Directive<"target teams distribute"> { let allowedClauses = [ @@ -2490,6 +2499,8 @@ def OMP_TeamsCoexecute : Directive<"teams coexecute"> { VersionedClause, VersionedClause ]; + let leafConstructs = [OMP_Target, OMP_Teams]; + let category = CA_Executable; } def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [ >From 524115604497af50f92317832ec0ec0152d00623 Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 12:58:10 -0800 Subject: [PATCH 03/11] Add omp.coexecute op --- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 35 +++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 5a79fbf77a268..8061aa0209cc9 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td 
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -325,6 +325,41 @@ def SectionsOp : OpenMP_Op<"sections", traits = [ let hasRegionVerifier = 1; } +//===----------------------------------------------------------------------===// +// Coexecute Construct +//===----------------------------------------------------------------------===// + +def CoexecuteOp : OpenMP_Op<"coexecute"> { + let summary = "coexecute directive"; + let description = [{ + The coexecute construct specifies that the teams from the teams directive + this is nested in shall cooperate to execute the computation in this region. + There is no implicit barrier at the end as specified in the standard. + + TODO + We should probably change the defaut behaviour to have a barrier unless + nowait is specified, see below snippet. + + ``` + !$omp target teams + !$omp coexecute + tmp = matmul(x, y) + !$omp end coexecute + a = tmp(0, 0) ! there is no implicit barrier! the matmul hasnt completed! + !$omp end target teams coexecute + ``` + + }]; + + let arguments = (ins UnitAttr:$nowait); + + let regions = (region AnyRegion:$region); + + let assemblyFormat = [{ + oilist(`nowait` $nowait) $region attr-dict + }]; +} + //===----------------------------------------------------------------------===// // 2.8.2 Single Construct //===----------------------------------------------------------------------===// >From d0ae1a2cb13c7d026a36859d8c8f24109897c733 Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 17:50:41 -0800 Subject: [PATCH 04/11] Initial frontend support for coexecute --- .../include/flang/Semantics/openmp-directive-sets.h | 13 +++++++++++++ flang/lib/Lower/OpenMP/OpenMP.cpp | 12 ++++++++++++ flang/lib/Parser/openmp-parsers.cpp | 5 ++++- flang/lib/Semantics/resolve-directives.cpp | 6 ++++++ 4 files changed, 35 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index 
dd610c9702c28..5c316e030c63f 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -143,6 +143,7 @@ static const OmpDirectiveSet topTargetSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, + Directive::OMPD_target_teams_coexecute, }; static const OmpDirectiveSet allTargetSet{topTargetSet}; @@ -187,9 +188,16 @@ static const OmpDirectiveSet allTeamsSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, + Directive::OMPD_target_teams_coexecute, } | topTeamsSet, }; +static const OmpDirectiveSet allCoexecuteSet{ + Directive::OMPD_coexecute, + Directive::OMPD_teams_coexecute, + Directive::OMPD_target_teams_coexecute, +}; + //===----------------------------------------------------------------------===// // Directive sets for groups of multiple directives //===----------------------------------------------------------------------===// @@ -230,6 +238,9 @@ static const OmpDirectiveSet blockConstructSet{ Directive::OMPD_taskgroup, Directive::OMPD_teams, Directive::OMPD_workshare, + Directive::OMPD_target_teams_coexecute, + Directive::OMPD_teams_coexecute, + Directive::OMPD_coexecute, }; static const OmpDirectiveSet loopConstructSet{ @@ -294,6 +305,7 @@ static const OmpDirectiveSet workShareSet{ Directive::OMPD_scope, Directive::OMPD_sections, Directive::OMPD_single, + Directive::OMPD_coexecute, } | allDoSet, }; @@ -376,6 +388,7 @@ static const OmpDirectiveSet nestedReduceWorkshareAllowedSet{ }; static const OmpDirectiveSet nestedTeamsAllowedSet{ + Directive::OMPD_coexecute, Directive::OMPD_distribute, Directive::OMPD_distribute_parallel_do, Directive::OMPD_distribute_parallel_do_simd, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 544f31bb5054f..226da50db0497 100644 --- 
a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2592,6 +2592,15 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static mlir::omp::CoexecuteOp +genCoexecuteOp(Fortran::lower::AbstractConverter &converter, + Fortran::lower::pft::Evaluation &eval, + mlir::Location currentLocation, + const Fortran::parser::OmpClauseList &clauseList) { + return genOpWithBody( + converter, eval, currentLocation, /*outerCombined=*/false, &clauseList); +} + //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// @@ -3753,6 +3762,9 @@ static void genOMPDispatch(lower::AbstractConverter &converter, newOp = genTeamsOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); break; + case llvm::omp::Directive::OMPD_coexecute: + newOp = genCoexecuteOp(converter, eval, currentLocation, beginClauseList); + break; case llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: TODO(loc, "Unhandled loop directive (" + diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index c4728e0fabe61..cd8771a5a5ba4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1350,12 +1350,15 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_coexecute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_teams_coexecute), "TEAMS" >> 
pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), + "COEXECUTE" >> pure(llvm::omp::Directive::OMPD_coexecute)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 8b1caca34a6a7..bb6ba0de47b23 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1607,6 +1607,9 @@ bool OmpAttributeVisitor::Pre(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_taskgroup: case llvm::omp::Directive::OMPD_teams: + case llvm::omp::Directive::OMPD_coexecute: + case llvm::omp::Directive::OMPD_teams_coexecute: + case llvm::omp::Directive::OMPD_target_teams_coexecute: case llvm::omp::Directive::OMPD_workshare: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: @@ -1640,6 +1643,9 @@ void OmpAttributeVisitor::Post(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_target: case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_teams: + case llvm::omp::Directive::OMPD_coexecute: + case llvm::omp::Directive::OMPD_teams_coexecute: + case llvm::omp::Directive::OMPD_target_teams_coexecute: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: case llvm::omp::Directive::OMPD_target_parallel: { >From 483045d69d00405fad81e831dc6f2dd9fe5348d3 Mon Sep 17 00:00:00 2001 From: skc7 Date: Tue, 13 May 2025 15:09:45 +0530 Subject: [PATCH 05/11] [OpenMP] Fixes for coexecute definitions --- .../flang/Semantics/openmp-directive-sets.h | 1 + flang/lib/Lower/OpenMP/OpenMP.cpp | 13 ++-- flang/test/Lower/OpenMP/coexecute.f90 | 59 +++++++++++++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 33 +++++------ 4 files changed, 83 insertions(+), 23 
deletions(-) create mode 100644 flang/test/Lower/OpenMP/coexecute.f90 diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index 5c316e030c63f..43f4e642b3d86 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -173,6 +173,7 @@ static const OmpDirectiveSet topTeamsSet{ Directive::OMPD_teams_distribute_parallel_do_simd, Directive::OMPD_teams_distribute_simd, Directive::OMPD_teams_loop, + Directive::OMPD_teams_coexecute, }; static const OmpDirectiveSet bottomTeamsSet{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 226da50db0497..fa816379d7713 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2593,12 +2593,13 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, } static mlir::omp::CoexecuteOp -genCoexecuteOp(Fortran::lower::AbstractConverter &converter, - Fortran::lower::pft::Evaluation &eval, - mlir::Location currentLocation, - const Fortran::parser::OmpClauseList &clauseList) { +genCoexecuteOp(lower::AbstractConverter &converter, lower::SymMap &symTable, + semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, + mlir::Location loc, const ConstructQueue &queue, + ConstructQueue::const_iterator item) { return genOpWithBody( - converter, eval, currentLocation, /*outerCombined=*/false, &clauseList); + OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, + llvm::omp::Directive::OMPD_coexecute), queue, item); } //===----------------------------------------------------------------------===// @@ -3763,7 +3764,7 @@ static void genOMPDispatch(lower::AbstractConverter &converter, item); break; case llvm::omp::Directive::OMPD_coexecute: - newOp = genCoexecuteOp(converter, eval, currentLocation, beginClauseList); + newOp = genCoexecuteOp(converter, symTable, semaCtx, eval, loc, queue, item); break; case 
llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: diff --git a/flang/test/Lower/OpenMP/coexecute.f90 b/flang/test/Lower/OpenMP/coexecute.f90 new file mode 100644 index 0000000000000..b14f71f9bbbfa --- /dev/null +++ b/flang/test/Lower/OpenMP/coexecute.f90 @@ -0,0 +1,59 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! CHECK-LABEL: func @_QPtarget_teams_coexecute +subroutine target_teams_coexecute() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp target teams coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end target teams coexecute +end subroutine target_teams_coexecute + +! CHECK-LABEL: func @_QPteams_coexecute +subroutine teams_coexecute() + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp teams coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end teams coexecute +end subroutine teams_coexecute + +! CHECK-LABEL: func @_QPtarget_teams_coexecute_m +subroutine target_teams_coexecute_m() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp target + !$omp teams + !$omp coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end coexecute + !$omp end teams + !$omp end target +end subroutine target_teams_coexecute_m + +! CHECK-LABEL: func @_QPteams_coexecute_m +subroutine teams_coexecute_m() + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp teams + !$omp coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! 
CHECK: omp.terminator + !$omp end coexecute + !$omp end teams +end subroutine teams_coexecute_m diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 17edb90a2a618..bfa317c08aae1 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -2179,29 +2179,28 @@ def OMP_TargetTeams : Directive<"target teams"> { } def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, + VersionedClause, VersionedClause, VersionedClause, - VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, - VersionedClause, + VersionedClause, ]; - let allowedOnceClauses = [ + VersionedClause, + VersionedClause, VersionedClause, VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, - VersionedClause, VersionedClause, VersionedClause, + VersionedClause, ]; let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; let category = CA_Executable; @@ -2486,20 +2485,20 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { } def OMP_TeamsCoexecute : Directive<"teams coexecute"> { let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, + VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, ]; let allowedOnceClauses = [ VersionedClause, VersionedClause, VersionedClause, - VersionedClause + VersionedClause, ]; - let leafConstructs = [OMP_Target, OMP_Teams]; + let leafConstructs = [OMP_Teams, OMP_Coexecute]; let category = CA_Executable; } def OMP_TeamsDistribute : Directive<"teams distribute"> { >From b354fe021895312a40d9a447e0932e763660128d Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 14 May 2025 14:48:52 +0530 Subject: [PATCH 06/11] [OpenMP] Use workdistribute 
instead of coexecute --- .../flang/Semantics/openmp-directive-sets.h | 24 ++-- flang/lib/Lower/OpenMP/OpenMP.cpp | 15 ++- flang/lib/Parser/openmp-parsers.cpp | 6 +- flang/lib/Semantics/resolve-directives.cpp | 12 +- flang/test/Lower/OpenMP/coexecute.f90 | 59 ---------- flang/test/Lower/OpenMP/workdistribute.f90 | 59 ++++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 110 +++++++++--------- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 28 ++--- 8 files changed, 152 insertions(+), 161 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/coexecute.f90 create mode 100644 flang/test/Lower/OpenMP/workdistribute.f90 diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index 43f4e642b3d86..7ced6ed9b44d6 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -143,7 +143,7 @@ static const OmpDirectiveSet topTargetSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, - Directive::OMPD_target_teams_coexecute, + Directive::OMPD_target_teams_workdistribute, }; static const OmpDirectiveSet allTargetSet{topTargetSet}; @@ -173,7 +173,7 @@ static const OmpDirectiveSet topTeamsSet{ Directive::OMPD_teams_distribute_parallel_do_simd, Directive::OMPD_teams_distribute_simd, Directive::OMPD_teams_loop, - Directive::OMPD_teams_coexecute, + Directive::OMPD_teams_workdistribute, }; static const OmpDirectiveSet bottomTeamsSet{ @@ -189,14 +189,14 @@ static const OmpDirectiveSet allTeamsSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, - Directive::OMPD_target_teams_coexecute, + Directive::OMPD_target_teams_workdistribute, } | topTeamsSet, }; -static const OmpDirectiveSet allCoexecuteSet{ - Directive::OMPD_coexecute, - Directive::OMPD_teams_coexecute, - 
Directive::OMPD_target_teams_coexecute, +static const OmpDirectiveSet allWorkdistributeSet{ + Directive::OMPD_workdistribute, + Directive::OMPD_teams_workdistribute, + Directive::OMPD_target_teams_workdistribute, }; //===----------------------------------------------------------------------===// @@ -239,9 +239,9 @@ static const OmpDirectiveSet blockConstructSet{ Directive::OMPD_taskgroup, Directive::OMPD_teams, Directive::OMPD_workshare, - Directive::OMPD_target_teams_coexecute, - Directive::OMPD_teams_coexecute, - Directive::OMPD_coexecute, + Directive::OMPD_target_teams_workdistribute, + Directive::OMPD_teams_workdistribute, + Directive::OMPD_workdistribute, }; static const OmpDirectiveSet loopConstructSet{ @@ -306,7 +306,7 @@ static const OmpDirectiveSet workShareSet{ Directive::OMPD_scope, Directive::OMPD_sections, Directive::OMPD_single, - Directive::OMPD_coexecute, + Directive::OMPD_workdistribute, } | allDoSet, }; @@ -389,7 +389,7 @@ static const OmpDirectiveSet nestedReduceWorkshareAllowedSet{ }; static const OmpDirectiveSet nestedTeamsAllowedSet{ - Directive::OMPD_coexecute, + Directive::OMPD_workdistribute, Directive::OMPD_distribute, Directive::OMPD_distribute_parallel_do, Directive::OMPD_distribute_parallel_do_simd, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fa816379d7713..afb0c6ee74fcf 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2592,14 +2592,14 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } -static mlir::omp::CoexecuteOp -genCoexecuteOp(lower::AbstractConverter &converter, lower::SymMap &symTable, +static mlir::omp::WorkdistributeOp +genWorkdistributeOp(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, mlir::Location loc, const ConstructQueue &queue, ConstructQueue::const_iterator item) { - return genOpWithBody( + return 
genOpWithBody( OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, - llvm::omp::Directive::OMPD_coexecute), queue, item); + llvm::omp::Directive::OMPD_workdistribute), queue, item); } //===----------------------------------------------------------------------===// @@ -3763,14 +3763,13 @@ static void genOMPDispatch(lower::AbstractConverter &converter, newOp = genTeamsOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); break; - case llvm::omp::Directive::OMPD_coexecute: - newOp = genCoexecuteOp(converter, symTable, semaCtx, eval, loc, queue, item); - break; case llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: TODO(loc, "Unhandled loop directive (" + llvm::omp::getOpenMPDirectiveName(dir) + ")"); - // case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_workdistribute: + newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, item); + break; case llvm::omp::Directive::OMPD_workshare: newOp = genWorkshareOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index cd8771a5a5ba4..d3ae8653b35c7 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1350,15 +1350,15 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_coexecute), + "TARGET TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_teams_coexecute), 
+ "TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_teams_workdistribute), "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), - "COEXECUTE" >> pure(llvm::omp::Directive::OMPD_coexecute)))) + "WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_workdistribute)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index bb6ba0de47b23..b7e014c07736a 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1607,9 +1607,9 @@ bool OmpAttributeVisitor::Pre(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_taskgroup: case llvm::omp::Directive::OMPD_teams: - case llvm::omp::Directive::OMPD_coexecute: - case llvm::omp::Directive::OMPD_teams_coexecute: - case llvm::omp::Directive::OMPD_target_teams_coexecute: + case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_teams_workdistribute: + case llvm::omp::Directive::OMPD_target_teams_workdistribute: case llvm::omp::Directive::OMPD_workshare: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: @@ -1643,9 +1643,9 @@ void OmpAttributeVisitor::Post(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_target: case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_teams: - case llvm::omp::Directive::OMPD_coexecute: - case llvm::omp::Directive::OMPD_teams_coexecute: - case llvm::omp::Directive::OMPD_target_teams_coexecute: + case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_teams_workdistribute: + case llvm::omp::Directive::OMPD_target_teams_workdistribute: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: case llvm::omp::Directive::OMPD_target_parallel: { diff 
--git a/flang/test/Lower/OpenMP/coexecute.f90 b/flang/test/Lower/OpenMP/coexecute.f90 deleted file mode 100644 index b14f71f9bbbfa..0000000000000 --- a/flang/test/Lower/OpenMP/coexecute.f90 +++ /dev/null @@ -1,59 +0,0 @@ -! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s - -! CHECK-LABEL: func @_QPtarget_teams_coexecute -subroutine target_teams_coexecute() - ! CHECK: omp.target - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp target teams coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end target teams coexecute -end subroutine target_teams_coexecute - -! CHECK-LABEL: func @_QPteams_coexecute -subroutine teams_coexecute() - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp teams coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end teams coexecute -end subroutine teams_coexecute - -! CHECK-LABEL: func @_QPtarget_teams_coexecute_m -subroutine target_teams_coexecute_m() - ! CHECK: omp.target - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp target - !$omp teams - !$omp coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end coexecute - !$omp end teams - !$omp end target -end subroutine target_teams_coexecute_m - -! CHECK-LABEL: func @_QPteams_coexecute_m -subroutine teams_coexecute_m() - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp teams - !$omp coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end coexecute - !$omp end teams -end subroutine teams_coexecute_m diff --git a/flang/test/Lower/OpenMP/workdistribute.f90 b/flang/test/Lower/OpenMP/workdistribute.f90 new file mode 100644 index 0000000000000..924205bb72e5e --- /dev/null +++ b/flang/test/Lower/OpenMP/workdistribute.f90 @@ -0,0 +1,59 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! 
CHECK-LABEL: func @_QPtarget_teams_workdistribute +subroutine target_teams_workdistribute() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp target teams workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end target teams workdistribute +end subroutine target_teams_workdistribute + +! CHECK-LABEL: func @_QPteams_workdistribute +subroutine teams_workdistribute() + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp teams workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end teams workdistribute +end subroutine teams_workdistribute + +! CHECK-LABEL: func @_QPtarget_teams_workdistribute_m +subroutine target_teams_workdistribute_m() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp target + !$omp teams + !$omp workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end workdistribute + !$omp end teams + !$omp end target +end subroutine target_teams_workdistribute_m + +! CHECK-LABEL: func @_QPteams_workdistribute_m +subroutine teams_workdistribute_m() + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp teams + !$omp workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! 
CHECK: omp.terminator + !$omp end workdistribute + !$omp end teams +end subroutine teams_workdistribute_m diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index bfa317c08aae1..372ae02099a7d 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -667,10 +667,6 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let association = AS_None; let category = CA_Executable; } -def OMP_Coexecute : Directive<"coexecute"> { - let association = AS_Block; - let category = CA_Executable; -} def OMP_Critical : Directive<"critical"> { let allowedOnceClauses = [ VersionedClause, @@ -721,11 +717,6 @@ def OMP_DeclareTarget : Directive<"declare target"> { let association = AS_None; let category = CA_Declarative; } -def OMP_EndCoexecute : Directive<"end coexecute"> { - let leafConstructs = OMP_Coexecute.leafConstructs; - let association = OMP_Coexecute.association; - let category = OMP_Coexecute.category; -} def OMP_EndDeclareTarget : Directive<"end declare target"> { let association = AS_Delimited; let category = OMP_DeclareTarget.category; @@ -1277,6 +1268,15 @@ def OMP_EndWorkshare : Directive<"end workshare"> { let association = OMP_Workshare.association; let category = OMP_Workshare.category; } +def OMP_Workdistribute : Directive<"workdistribute"> { + let association = AS_Block; + let category = CA_Executable; +} +def OMP_EndWorkdistribute : Directive<"end workdistribute"> { + let leafConstructs = OMP_Workdistribute.leafConstructs; + let association = OMP_Workdistribute.association; + let category = OMP_Workdistribute.category; +} //===----------------------------------------------------------------------===// // Definitions of OpenMP compound directives @@ -2177,34 +2177,6 @@ def OMP_TargetTeams : Directive<"target teams"> { let leafConstructs = [OMP_Target, OMP_Teams]; let category = CA_Executable; } -def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { - 
let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; - let category = CA_Executable; -} def OMP_TargetTeamsDistribute : Directive<"target teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2419,6 +2391,34 @@ def OMP_TargetTeamsDistributeSimd : let leafConstructs = [OMP_Target, OMP_Teams, OMP_Distribute, OMP_Simd]; let category = CA_Executable; } +def OMP_TargetTeamsWorkdistribute : Directive<"target teams workdistribute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let leafConstructs = [OMP_Target, OMP_Teams, OMP_Workdistribute]; + let category = CA_Executable; +} def OMP_target_teams_loop : Directive<"target teams loop"> { let allowedClauses = [ VersionedClause, @@ -2483,24 +2483,6 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { let leafConstructs = [OMP_TaskLoop, OMP_Simd]; let category = CA_Executable; } -def OMP_TeamsCoexecute : Directive<"teams coexecute"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - 
VersionedClause, - ]; - let leafConstructs = [OMP_Teams, OMP_Coexecute]; - let category = CA_Executable; -} def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2682,3 +2664,21 @@ def OMP_teams_loop : Directive<"teams loop"> { let leafConstructs = [OMP_Teams, OMP_loop]; let category = CA_Executable; } +def OMP_TeamsWorkdistribute : Directive<"teams workdistribute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let leafConstructs = [OMP_Teams, OMP_Workdistribute]; + let category = CA_Executable; +} diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 8061aa0209cc9..5e3ab0e908d21 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -326,38 +326,30 @@ def SectionsOp : OpenMP_Op<"sections", traits = [ } //===----------------------------------------------------------------------===// -// Coexecute Construct +// workdistribute Construct //===----------------------------------------------------------------------===// -def CoexecuteOp : OpenMP_Op<"coexecute"> { - let summary = "coexecute directive"; +def WorkdistributeOp : OpenMP_Op<"workdistribute"> { + let summary = "workdistribute directive"; let description = [{ - The coexecute construct specifies that the teams from the teams directive - this is nested in shall cooperate to execute the computation in this region. - There is no implicit barrier at the end as specified in the standard. - - TODO - We should probably change the defaut behaviour to have a barrier unless - nowait is specified, see below snippet. 
+ workdistribute divides execution of the enclosed structured block into + separate units of work, each executed only once by each + initial thread in the league. ``` !$omp target teams - !$omp coexecute + !$omp workdistribute tmp = matmul(x, y) - !$omp end coexecute + !$omp end workdistribute a = tmp(0, 0) ! there is no implicit barrier! the matmul hasnt completed! - !$omp end target teams coexecute + !$omp end target teams workdistribute ``` }]; - let arguments = (ins UnitAttr:$nowait); - let regions = (region AnyRegion:$region); - let assemblyFormat = [{ - oilist(`nowait` $nowait) $region attr-dict - }]; + let assemblyFormat = "$region attr-dict"; } //===----------------------------------------------------------------------===// >From 3375aee6231521d4b54481cd6a5537ed48968b39 Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 14 May 2025 16:17:14 +0530 Subject: [PATCH 07/11] [OpenMP] workdistribute trivial lowering Lowering logic inspired by ivanradanov's coexecute lowering f56da1a207df4a40776a8570122a33f047074a3c --- .../include/flang/Optimizer/OpenMP/Passes.td | 4 + flang/lib/Optimizer/OpenMP/CMakeLists.txt | 1 + .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 101 ++++++++++++++++++ .../OpenMP/lower-workdistribute.mlir | 52 +++++++++ 4 files changed, 158 insertions(+) create mode 100644 flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute.mlir diff --git a/flang/include/flang/Optimizer/OpenMP/Passes.td b/flang/include/flang/Optimizer/OpenMP/Passes.td index 704faf0ccd856..743b6d381ed42 100644 --- a/flang/include/flang/Optimizer/OpenMP/Passes.td +++ b/flang/include/flang/Optimizer/OpenMP/Passes.td @@ -93,6 +93,10 @@ def LowerWorkshare : Pass<"lower-workshare", "::mlir::ModuleOp"> { let summary = "Lower workshare construct"; } +def LowerWorkdistribute : Pass<"lower-workdistribute", "::mlir::ModuleOp"> { + let summary = "Lower workdistribute construct"; +} + def GenericLoopConversionPass : 
Pass<"omp-generic-loop-conversion", "mlir::func::FuncOp"> { let summary = "Converts OpenMP generic `omp.loop` to semantically " diff --git a/flang/lib/Optimizer/OpenMP/CMakeLists.txt b/flang/lib/Optimizer/OpenMP/CMakeLists.txt index e31543328a9f9..cd746834741f9 100644 --- a/flang/lib/Optimizer/OpenMP/CMakeLists.txt +++ b/flang/lib/Optimizer/OpenMP/CMakeLists.txt @@ -7,6 +7,7 @@ add_flang_library(FlangOpenMPTransforms MapsForPrivatizedSymbols.cpp MapInfoFinalization.cpp MarkDeclareTarget.cpp + LowerWorkdistribute.cpp LowerWorkshare.cpp LowerNontemporal.cpp diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp new file mode 100644 index 0000000000000..75c9d2b0d494e --- /dev/null +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -0,0 +1,101 @@ +//===- LowerWorkshare.cpp - special cases for bufferization -------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the lowering of omp.workdistribute. 
+// +//===----------------------------------------------------------------------===// + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include + +namespace flangomp { +#define GEN_PASS_DEF_LOWERWORKDISTRIBUTE +#include "flang/Optimizer/OpenMP/Passes.h.inc" +} // namespace flangomp + +#define DEBUG_TYPE "lower-workdistribute" + +using namespace mlir; + +namespace { + +struct WorkdistributeToSingle : public mlir::OpRewritePattern { +using OpRewritePattern::OpRewritePattern; +mlir::LogicalResult + matchAndRewrite(mlir::omp::WorkdistributeOp workdistribute, + mlir::PatternRewriter &rewriter) const override { + auto loc = workdistribute->getLoc(); + auto teams = llvm::dyn_cast(workdistribute->getParentOp()); + if (!teams) { + mlir::emitError(loc, "workdistribute not nested in teams\n"); + return mlir::failure(); + } + if (workdistribute.getRegion().getBlocks().size() != 1) { + mlir::emitError(loc, "workdistribute with multiple blocks\n"); + return mlir::failure(); + } + if (teams.getRegion().getBlocks().size() != 1) { + mlir::emitError(loc, "teams with multiple blocks\n"); + return mlir::failure(); + } + if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { + mlir::emitError(loc, "teams with multiple nested ops\n"); + return mlir::failure(); + } + mlir::Block *workdistributeBlock = &workdistribute.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teams); + rewriter.eraseOp(teams); + return mlir::success(); + } +}; + +class LowerWorkdistributePass + : public flangomp::impl::LowerWorkdistributeBase { +public: + void runOnOperation() override { + mlir::MLIRContext &context = getContext(); + mlir::RewritePatternSet patterns(&context); + 
mlir::GreedyRewriteConfig config; + // prevent the pattern driver from merging blocks + config.setRegionSimplificationLevel( + mlir::GreedySimplifyRegionLevel::Disabled); + + patterns.insert(&context); + mlir::Operation *op = getOperation(); + if (mlir::failed(mlir::applyPatternsGreedily(op, std::move(patterns), config))) { + mlir::emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } + } +}; +} diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute.mlir new file mode 100644 index 0000000000000..34c8c3f01976d --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute.mlir @@ -0,0 +1,52 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @_QPtarget_simple() { +// CHECK: %[[VAL_0:.*]] = arith.constant 2 : i32 +// CHECK: %[[VAL_1:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFtarget_simpleEa"} +// CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_1]] {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_3:.*]] = fir.alloca !fir.box> {bindc_name = "simple_var", uniq_name = "_QFtarget_simpleEsimple_var"} +// CHECK: %[[VAL_4:.*]] = fir.zero_bits !fir.heap +// CHECK: %[[VAL_5:.*]] = fir.embox %[[VAL_4]] : (!fir.heap) -> !fir.box> +// CHECK: fir.store %[[VAL_5]] to %[[VAL_3]] : !fir.ref>> +// CHECK: %[[VAL_6:.*]]:2 = hlfir.declare %[[VAL_3]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +// CHECK: hlfir.assign %[[VAL_0]] to %[[VAL_2]]#0 : i32, !fir.ref +// CHECK: %[[VAL_7:.*]] = omp.map.info var_ptr(%[[VAL_2]]#1 : !fir.ref, i32) map_clauses(to) capture(ByRef) -> !fir.ref {name = "a"} +// CHECK: omp.target map_entries(%[[VAL_7]] -> %[[VAL_8:.*]] : !fir.ref) private(@_QFtarget_simpleEsimple_var_private_ref_box_heap_i32 %[[VAL_6]]#0 -> %[[VAL_9:.*]] : !fir.ref>>) { +// CHECK: %[[VAL_10:.*]] = arith.constant 10 : i32 +// 
CHECK: %[[VAL_11:.*]]:2 = hlfir.declare %[[VAL_8]] {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_12:.*]]:2 = hlfir.declare %[[VAL_9]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +// CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_11]]#0 : !fir.ref +// CHECK: %[[VAL_14:.*]] = arith.addi %[[VAL_13]], %[[VAL_10]] : i32 +// CHECK: hlfir.assign %[[VAL_14]] to %[[VAL_12]]#0 realloc : i32, !fir.ref>> +// CHECK: omp.terminator +// CHECK: } +// CHECK: return +// CHECK: } +func.func @_QPtarget_simple() { + %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFtarget_simpleEa"} + %1:2 = hlfir.declare %0 {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %2 = fir.alloca !fir.box> {bindc_name = "simple_var", uniq_name = "_QFtarget_simpleEsimple_var"} + %3 = fir.zero_bits !fir.heap + %4 = fir.embox %3 : (!fir.heap) -> !fir.box> + fir.store %4 to %2 : !fir.ref>> + %5:2 = hlfir.declare %2 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) + %c2_i32 = arith.constant 2 : i32 + hlfir.assign %c2_i32 to %1#0 : i32, !fir.ref + %6 = omp.map.info var_ptr(%1#1 : !fir.ref, i32) map_clauses(to) capture(ByRef) -> !fir.ref {name = "a"} + omp.target map_entries(%6 -> %arg0 : !fir.ref) private(@_QFtarget_simpleEsimple_var_private_ref_box_heap_i32 %5#0 -> %arg1 : !fir.ref>>){ + omp.teams { + omp.workdistribute { + %11:2 = hlfir.declare %arg0 {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %12:2 = hlfir.declare %arg1 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) + %c10_i32 = arith.constant 10 : i32 + %13 = fir.load %11#0 : !fir.ref + %14 = arith.addi %c10_i32, %13 : i32 + hlfir.assign %14 to %12#0 realloc : i32, !fir.ref>> + omp.terminator + } + omp.terminator + } + omp.terminator + } + return +} \ No 
newline at end of file

From 92c448026ce3d97c2c5c90b8acf6c216e0ebe8d9 Mon Sep 17 00:00:00 2001
From: skc7
Date: Wed, 14 May 2025 19:29:33 +0530
Subject: [PATCH 08/11] [Flang][OpenMP] Add workdistribute lower pass to pipeline

---
 flang/lib/Optimizer/Passes/Pipelines.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index a3ef473ea39b7..fb7eaef00c8fd 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -276,8 +276,10 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP,
   addNestedPassToAllTopLevelOperations(
       pm, hlfir::createInlineHLFIRAssign);
   pm.addPass(hlfir::createConvertHLFIRtoFIR());
-  if (enableOpenMP)
+  if (enableOpenMP) {
     pm.addPass(flangomp::createLowerWorkshare());
+    pm.addPass(flangomp::createLowerWorkdistribute());
+  }
 }
 
 /// Create a pass pipeline for handling certain OpenMP transformations needed

From 2d138e2db2cbd89d87c2e89b0a398c9ca424c3f1 Mon Sep 17 00:00:00 2001
From: skc7
Date: Thu, 15 May 2025 16:39:21 +0530
Subject: [PATCH 09/11] [Flang][OpenMP] Add FissionWorkdistribute lowering.

Fission logic inspired from ivanradanov implementation : c97eca4010e460aac5a3d795614ca0980bce4565 --- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 233 ++++++++++++++---- .../OpenMP/lower-workdistribute-fission.mlir | 60 +++++ ...ir => lower-workdistribute-to-single.mlir} | 2 +- 3 files changed, 243 insertions(+), 52 deletions(-) create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir rename flang/test/Transforms/OpenMP/{lower-workdistribute.mlir => lower-workdistribute-to-single.mlir} (99%) diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index 75c9d2b0d494e..f799202be2645 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -10,31 +10,26 @@ // //===----------------------------------------------------------------------===// -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Dialect/FIROps.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "flang/Optimizer/HLFIR/Passes.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/Builders.h" +#include "mlir/IR/Value.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" #include #include -#include -#include -#include +#include +#include #include +#include #include -#include #include -#include -#include #include #include -#include "mlir/Transforms/GreedyPatternRewriteDriver.h" - +#include #include namespace flangomp { @@ -48,52 +43,188 @@ using namespace mlir; namespace { -struct WorkdistributeToSingle : public mlir::OpRewritePattern { -using OpRewritePattern::OpRewritePattern; -mlir::LogicalResult - matchAndRewrite(mlir::omp::WorkdistributeOp workdistribute, - mlir::PatternRewriter &rewriter) const override { - auto loc = workdistribute->getLoc(); - auto teams = 
llvm::dyn_cast(workdistribute->getParentOp()); - if (!teams) { - mlir::emitError(loc, "workdistribute not nested in teams\n"); - return mlir::failure(); - } - if (workdistribute.getRegion().getBlocks().size() != 1) { - mlir::emitError(loc, "workdistribute with multiple blocks\n"); - return mlir::failure(); +template +static T getPerfectlyNested(Operation *op) { + if (op->getNumRegions() != 1) + return nullptr; + auto ®ion = op->getRegion(0); + if (region.getBlocks().size() != 1) + return nullptr; + auto *block = ®ion.front(); + auto *firstOp = &block->front(); + if (auto nested = dyn_cast(firstOp)) + if (firstOp->getNextNode() == block->getTerminator()) + return nested; + return nullptr; +} + +/// This is the single source of truth about whether we should parallelize an +/// operation nested in an omp.workdistribute region. +static bool shouldParallelize(Operation *op) { + // Currently we cannot parallelize operations with results that have uses + if (llvm::any_of(op->getResults(), + [](OpResult v) -> bool { return !v.use_empty(); })) + return false; + // We will parallelize unordered loops - these come from array syntax + if (auto loop = dyn_cast(op)) { + auto unordered = loop.getUnordered(); + if (!unordered) + return false; + return *unordered; + } + if (auto callOp = dyn_cast(op)) { + auto callee = callOp.getCallee(); + if (!callee) + return false; + auto *func = op->getParentOfType().lookupSymbol(*callee); + // TODO need to insert a check here whether it is a call we can actually + // parallelize currently + if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) + return true; + return false; + } + // We cannot parallise anything else + return false; +} + +struct WorkdistributeToSingle : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + 
LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); } - if (teams.getRegion().getBlocks().size() != 1) { - mlir::emitError(loc, "teams with multiple blocks\n"); - return mlir::failure(); + + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + workdistributeOp.emitWarning("unable to parallelize coexecute"); + return success(); + } +}; + +/// If B() and D() are parallelizable, +/// +/// omp.teams { +/// omp.workdistribute { +/// A() +/// B() +/// C() +/// D() +/// E() +/// } +/// } +/// +/// becomes +/// +/// A() +/// omp.teams { +/// omp.workdistribute { +/// B() +/// } +/// } +/// C() +/// omp.teams { +/// omp.workdistribute { +/// D() +/// } +/// } +/// E() + +struct FissionWorkdistribute + : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult + matchAndRewrite(omp::WorkdistributeOp workdistribute, + PatternRewriter &rewriter) const override { + auto loc = workdistribute->getLoc(); + auto teams = dyn_cast(workdistribute->getParentOp()); + if (!teams) { + emitError(loc, "workdistribute not nested in teams\n"); + return failure(); + } + if (workdistribute.getRegion().getBlocks().size() != 1) { + emitError(loc, "workdistribute with multiple blocks\n"); + return failure(); + } + if (teams.getRegion().getBlocks().size() != 1) { + emitError(loc, "teams with multiple blocks\n"); + return failure(); + } + if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { + emitError(loc, "teams with multiple nested ops\n"); + return failure(); + } + + auto *teamsBlock = &teams.getRegion().front(); + + // While we have unhandled operations in the original workdistribute + auto *workdistributeBlock = &workdistribute.getRegion().front(); + auto *terminator = workdistributeBlock->getTerminator(); + bool changed = false; + while 
(&workdistributeBlock->front() != terminator) { + rewriter.setInsertionPoint(teams); + IRMapping mapping; + llvm::SmallVector hoisted; + Operation *parallelize = nullptr; + for (auto &op : workdistribute.getOps()) { + if (&op == terminator) { + break; } - if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { - mlir::emitError(loc, "teams with multiple nested ops\n"); - return mlir::failure(); + if (shouldParallelize(&op)) { + parallelize = &op; + break; + } else { + rewriter.clone(op, mapping); + hoisted.push_back(&op); + changed = true; } - mlir::Block *workdistributeBlock = &workdistribute.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teams); - rewriter.eraseOp(teams); - return mlir::success(); + } + + for (auto *op : hoisted) + rewriter.replaceOp(op, mapping.lookup(op)); + + if (parallelize && hoisted.empty() && + parallelize->getNextNode() == terminator) + break; + if (parallelize) { + auto newTeams = rewriter.cloneWithoutRegions(teams); + auto *newTeamsBlock = rewriter.createBlock( + &newTeams.getRegion(), newTeams.getRegion().begin(), {}, {}); + for (auto arg : teamsBlock->getArguments()) + newTeamsBlock->addArgument(arg.getType(), arg.getLoc()); + auto newWorkdistribute = rewriter.create(loc); + rewriter.create(loc); + rewriter.createBlock(&newWorkdistribute.getRegion(), + newWorkdistribute.getRegion().begin(), {}, {}); + auto *cloned = rewriter.clone(*parallelize); + rewriter.replaceOp(parallelize, cloned); + rewriter.create(loc); + changed = true; + } } + return success(changed); + } }; class LowerWorkdistributePass : public flangomp::impl::LowerWorkdistributeBase { public: void runOnOperation() override { - mlir::MLIRContext &context = getContext(); - mlir::RewritePatternSet patterns(&context); - mlir::GreedyRewriteConfig config; + MLIRContext &context = getContext(); + RewritePatternSet patterns(&context); + GreedyRewriteConfig config; // prevent the 
pattern driver form merging blocks config.setRegionSimplificationLevel( - mlir::GreedySimplifyRegionLevel::Disabled); + GreedySimplifyRegionLevel::Disabled); - patterns.insert(&context); - mlir::Operation *op = getOperation(); - if (mlir::failed(mlir::applyPatternsGreedily(op, std::move(patterns), config))) { - mlir::emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + patterns.insert(&context); + Operation *op = getOperation(); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); } } diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir new file mode 100644 index 0000000000000..ea03a10dd3d44 --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir @@ -0,0 +1,60 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @test_fission_workdistribute({{.*}}) { +// CHECK: %[[VAL_0:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_2:.*]] = arith.constant 9 : index +// CHECK: %[[VAL_3:.*]] = arith.constant 5.000000e+00 : f32 +// CHECK: fir.store %[[VAL_3]] to %[[ARG2:.*]] : !fir.ref +// CHECK: fir.do_loop %[[VAL_4:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] unordered { +// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref +// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: } +// CHECK: fir.call @regular_side_effect_func(%[[ARG2:.*]]) : (!fir.ref) -> () +// CHECK: fir.call @my_fir_parallel_runtime_func(%[[ARG3:.*]]) : (!fir.ref) -> () +// CHECK: fir.do_loop %[[VAL_8:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] { +// CHECK: %[[VAL_9:.*]] = 
fir.coordinate_of %[[ARG0:.*]], %[[VAL_8]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_3]] to %[[VAL_9]] : !fir.ref +// CHECK: } +// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2:.*]] : !fir.ref +// CHECK: fir.store %[[VAL_10]] to %[[ARG3:.*]] : !fir.ref +// CHECK: return +// CHECK: } +module { +func.func @regular_side_effect_func(%arg0: !fir.ref) { + return +} +func.func @my_fir_parallel_runtime_func(%arg0: !fir.ref) attributes {fir.runtime} { + return +} +func.func @test_fission_workdistribute(%arr1: !fir.ref>, %arr2: !fir.ref>, %scalar_ref1: !fir.ref, %scalar_ref2: !fir.ref) { + %c0_idx = arith.constant 0 : index + %c1_idx = arith.constant 1 : index + %c9_idx = arith.constant 9 : index + %float_val = arith.constant 5.0 : f32 + omp.teams { + omp.workdistribute { + fir.store %float_val to %scalar_ref1 : !fir.ref + fir.do_loop %iv = %c0_idx to %c9_idx step %c1_idx unordered { + %elem_ptr_arr1 = fir.coordinate_of %arr1, %iv : (!fir.ref>, index) -> !fir.ref + %loaded_val_loop1 = fir.load %elem_ptr_arr1 : !fir.ref + %elem_ptr_arr2 = fir.coordinate_of %arr2, %iv : (!fir.ref>, index) -> !fir.ref + fir.store %loaded_val_loop1 to %elem_ptr_arr2 : !fir.ref + } + fir.call @regular_side_effect_func(%scalar_ref1) : (!fir.ref) -> () + fir.call @my_fir_parallel_runtime_func(%scalar_ref2) : (!fir.ref) -> () + fir.do_loop %jv = %c0_idx to %c9_idx step %c1_idx { + %elem_ptr_ordered_loop = fir.coordinate_of %arr1, %jv : (!fir.ref>, index) -> !fir.ref + fir.store %float_val to %elem_ptr_ordered_loop : !fir.ref + } + %loaded_for_hoist = fir.load %scalar_ref1 : !fir.ref + fir.store %loaded_for_hoist to %scalar_ref2 : !fir.ref + omp.terminator + } + omp.terminator + } + return +} +} diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir similarity index 99% rename from flang/test/Transforms/OpenMP/lower-workdistribute.mlir rename to 
flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir
index 34c8c3f01976d..0cc2aeded2532 100644
--- a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir
+++ b/flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir
@@ -49,4 +49,4 @@ func.func @_QPtarget_simple() {
     omp.terminator
   }
   return
-}
\ No newline at end of file
+}

From 9077ad5ae1524a37890a6ebd7d21bcdabf0a066a Mon Sep 17 00:00:00 2001
From: skc7
Date: Sun, 18 May 2025 12:37:53 +0530
Subject: [PATCH 10/11] [OpenMP][Flang] Lower teams workdistribute do_loop to wsloop.

Logic inspired from ivanradanov commit 5682e9ea7fcba64693f7cfdc0f1970fab2d7d4ae
---
 .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 177 +++++++++++++++---
 .../OpenMP/lower-workdistribute-doloop.mlir | 28 +++
 .../OpenMP/lower-workdistribute-fission.mlir | 22 ++-
 3 files changed, 193 insertions(+), 34 deletions(-)
 create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir

diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
index f799202be2645..de208a8190650 100644
--- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
+++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
@@ -6,18 +6,22 @@
 //
 //===----------------------------------------------------------------------===//
 //
-// This file implements the lowering of omp.workdistribute.
+// This file implements the lowering and optimisations of omp.workdistribute.
// //===----------------------------------------------------------------------===// +#include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Dialect/FIRDialect.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Optimizer/HLFIR/Passes.h" +#include "flang/Optimizer/OpenMP/Utils.h" +#include "mlir/Analysis/SliceAnalysis.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Value.h" +#include "mlir/Transforms/DialectConversion.h" #include "mlir/Transforms/GreedyPatternRewriteDriver.h" #include #include @@ -29,6 +33,7 @@ #include #include #include +#include "mlir/Transforms/RegionUtils.h" #include #include @@ -87,25 +92,6 @@ static bool shouldParallelize(Operation *op) { return false; } -struct WorkdistributeToSingle : public OpRewritePattern { - using OpRewritePattern::OpRewritePattern; - LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, - PatternRewriter &rewriter) const override { - auto workdistributeOp = getPerfectlyNested(teamsOp); - if (!workdistributeOp) { - LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); - return failure(); - } - - Block *workdistributeBlock = &workdistributeOp.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); - rewriter.eraseOp(teamsOp); - workdistributeOp.emitWarning("unable to parallelize coexecute"); - return success(); - } -}; - /// If B() and D() are parallelizable, /// /// omp.teams { @@ -210,22 +196,161 @@ struct FissionWorkdistribute } }; +static void +genLoopNestClauseOps(mlir::Location loc, + mlir::PatternRewriter &rewriter, + fir::DoLoopOp loop, + mlir::omp::LoopNestOperands &loopNestClauseOps) { + assert(loopNestClauseOps.loopLowerBounds.empty() && + "Loop nest bounds were already emitted!"); + 
loopNestClauseOps.loopLowerBounds.push_back(loop.getLowerBound()); + loopNestClauseOps.loopUpperBounds.push_back(loop.getUpperBound()); + loopNestClauseOps.loopSteps.push_back(loop.getStep()); + loopNestClauseOps.loopInclusive = rewriter.getUnitAttr(); +} + +static void +genWsLoopOp(mlir::PatternRewriter &rewriter, + fir::DoLoopOp doLoop, + const mlir::omp::LoopNestOperands &clauseOps) { + + auto wsloopOp = rewriter.create(doLoop.getLoc()); + rewriter.createBlock(&wsloopOp.getRegion()); + + auto loopNestOp = + rewriter.create(doLoop.getLoc(), clauseOps); + + // Clone the loop's body inside the loop nest construct using the + // mapped values. + rewriter.cloneRegionBefore(doLoop.getRegion(), loopNestOp.getRegion(), + loopNestOp.getRegion().begin()); + Block *clonedBlock = &loopNestOp.getRegion().back(); + mlir::Operation *terminatorOp = clonedBlock->getTerminator(); + + // Erase fir.result op of do loop and create yield op. + if (auto resultOp = dyn_cast(terminatorOp)) { + rewriter.setInsertionPoint(terminatorOp); + rewriter.create(doLoop->getLoc()); + rewriter.eraseOp(terminatorOp); + } + return; +} + +/// If fir.do_loop id present inside teams workdistribute +/// +/// omp.teams { +/// omp.workdistribute { +/// fir.do_loop unoredered { +/// ... +/// } +/// } +/// } +/// +/// Then, its lowered to +/// +/// omp.teams { +/// omp.workdistribute { +/// omp.parallel { +/// omp.wsloop { +/// omp.loop_nest +/// ... 
+/// } +/// } +/// } +/// } +/// } + +struct TeamsWorkdistributeLowering : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const override { + auto teamsLoc = teamsOp->getLoc(); + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + assert(teamsOp.getReductionVars().empty()); + + auto doLoop = getPerfectlyNested(workdistributeOp); + if (doLoop && shouldParallelize(doLoop)) { + + auto parallelOp = rewriter.create(teamsLoc); + rewriter.createBlock(¶llelOp.getRegion()); + rewriter.setInsertionPoint(rewriter.create(doLoop.getLoc())); + + mlir::omp::LoopNestOperands loopNestClauseOps; + genLoopNestClauseOps(doLoop.getLoc(), rewriter, doLoop, + loopNestClauseOps); + + genWsLoopOp(rewriter, doLoop, loopNestClauseOps); + rewriter.setInsertionPoint(doLoop); + rewriter.eraseOp(doLoop); + return success(); + } + return failure(); + } +}; + + +/// If A() and B () are present inside teams workdistribute +/// +/// omp.teams { +/// omp.workdistribute { +/// A() +/// B() +/// } +/// } +/// +/// Then, its lowered to +/// +/// A() +/// B() +/// + +struct TeamsWorkdistributeToSingle : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + return success(); + } +}; + class LowerWorkdistributePass : public flangomp::impl::LowerWorkdistributeBase { 
public: void runOnOperation() override { MLIRContext &context = getContext(); - RewritePatternSet patterns(&context); GreedyRewriteConfig config; // prevent the pattern driver form merging blocks config.setRegionSimplificationLevel( GreedySimplifyRegionLevel::Disabled); - - patterns.insert(&context); + Operation *op = getOperation(); - if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { - emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); - signalPassFailure(); + { + RewritePatternSet patterns(&context); + patterns.insert(&context); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } + } + { + RewritePatternSet patterns(&context); + patterns.insert(&context); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } } } }; diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir new file mode 100644 index 0000000000000..666bdb3ced647 --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir @@ -0,0 +1,28 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @x({{.*}}) +// CHECK: %[[VAL_0:.*]] = arith.constant 0 : index +// CHECK: omp.parallel { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_1:.*]]) : index = (%[[ARG0:.*]]) to (%[[ARG1:.*]]) inclusive step (%[[ARG2:.*]]) { +// CHECK: fir.store %[[VAL_0]] to %[[ARG4:.*]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } +// CHECK: omp.terminator +// CHECK: } +// CHECK: return +// CHECK: } +func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref) { + omp.teams { + omp.workdistribute { + fir.do_loop %iv = %lb to %ub step %step unordered { + %zero = arith.constant 0 : index + fir.store %zero to %addr : !fir.ref + } + 
omp.terminator + } + omp.terminator + } + return +} \ No newline at end of file diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir index ea03a10dd3d44..cf50d135d01ec 100644 --- a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir @@ -6,20 +6,26 @@ // CHECK: %[[VAL_2:.*]] = arith.constant 9 : index // CHECK: %[[VAL_3:.*]] = arith.constant 5.000000e+00 : f32 // CHECK: fir.store %[[VAL_3]] to %[[ARG2:.*]] : !fir.ref -// CHECK: fir.do_loop %[[VAL_4:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] unordered { -// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref -// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: omp.parallel { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_4:.*]]) : index = (%[[VAL_0]]) to (%[[VAL_2]]) inclusive step (%[[VAL_1]]) { +// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref +// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } +// CHECK: omp.terminator // CHECK: } // CHECK: fir.call @regular_side_effect_func(%[[ARG2:.*]]) : (!fir.ref) -> () // CHECK: fir.call @my_fir_parallel_runtime_func(%[[ARG3:.*]]) : (!fir.ref) -> () // CHECK: fir.do_loop %[[VAL_8:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] { -// CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_8]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0]], %[[VAL_8]] : (!fir.ref>, index) 
-> !fir.ref // CHECK: fir.store %[[VAL_3]] to %[[VAL_9]] : !fir.ref // CHECK: } -// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2:.*]] : !fir.ref -// CHECK: fir.store %[[VAL_10]] to %[[ARG3:.*]] : !fir.ref +// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2]] : !fir.ref +// CHECK: fir.store %[[VAL_10]] to %[[ARG3]] : !fir.ref // CHECK: return // CHECK: } module { >From 6e8010db820ff508538e019dc2ce4c4425abc952 Mon Sep 17 00:00:00 2001 From: skc7 Date: Mon, 19 May 2025 15:33:53 +0530 Subject: [PATCH 11/11] clang format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 18 +-- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 108 +++++++++--------- flang/lib/Parser/openmp-parsers.cpp | 6 +- .../OpenMP/lower-workdistribute-doloop.mlir | 2 +- 4 files changed, 67 insertions(+), 67 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index afb0c6ee74fcf..d401462034d77 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2592,14 +2592,15 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } -static mlir::omp::WorkdistributeOp -genWorkdistributeOp(lower::AbstractConverter &converter, lower::SymMap &symTable, - semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - mlir::Location loc, const ConstructQueue &queue, - ConstructQueue::const_iterator item) { +static mlir::omp::WorkdistributeOp genWorkdistributeOp( + lower::AbstractConverter &converter, lower::SymMap &symTable, + semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, + mlir::Location loc, const ConstructQueue &queue, + ConstructQueue::const_iterator item) { return genOpWithBody( - OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, - llvm::omp::Directive::OMPD_workdistribute), queue, item); + OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, + llvm::omp::Directive::OMPD_workdistribute), + queue, item); } 
//===----------------------------------------------------------------------===// @@ -3768,7 +3769,8 @@ static void genOMPDispatch(lower::AbstractConverter &converter, TODO(loc, "Unhandled loop directive (" + llvm::omp::getOpenMPDirectiveName(dir) + ")"); case llvm::omp::Directive::OMPD_workdistribute: - newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, item); + newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, + item); break; case llvm::omp::Directive::OMPD_workshare: newOp = genWorkshareOp(converter, symTable, stmtCtx, semaCtx, eval, loc, diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index de208a8190650..f75d4d1988fd2 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -14,15 +14,16 @@ #include "flang/Optimizer/Dialect/FIRDialect.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" -#include "flang/Optimizer/Transforms/Passes.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Utils.h" +#include "flang/Optimizer/Transforms/Passes.h" #include "mlir/Analysis/SliceAnalysis.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Value.h" #include "mlir/Transforms/DialectConversion.h" #include "mlir/Transforms/GreedyPatternRewriteDriver.h" +#include "mlir/Transforms/RegionUtils.h" #include #include #include @@ -33,7 +34,6 @@ #include #include #include -#include "mlir/Transforms/RegionUtils.h" #include #include @@ -66,30 +66,30 @@ static T getPerfectlyNested(Operation *op) { /// This is the single source of truth about whether we should parallelize an /// operation nested in an omp.workdistribute region. 
static bool shouldParallelize(Operation *op) { - // Currently we cannot parallelize operations with results that have uses - if (llvm::any_of(op->getResults(), - [](OpResult v) -> bool { return !v.use_empty(); })) + // Currently we cannot parallelize operations with results that have uses + if (llvm::any_of(op->getResults(), + [](OpResult v) -> bool { return !v.use_empty(); })) + return false; + // We will parallelize unordered loops - these come from array syntax + if (auto loop = dyn_cast(op)) { + auto unordered = loop.getUnordered(); + if (!unordered) return false; - // We will parallelize unordered loops - these come from array syntax - if (auto loop = dyn_cast(op)) { - auto unordered = loop.getUnordered(); - if (!unordered) - return false; - return *unordered; - } - if (auto callOp = dyn_cast(op)) { - auto callee = callOp.getCallee(); - if (!callee) - return false; - auto *func = op->getParentOfType().lookupSymbol(*callee); - // TODO need to insert a check here whether it is a call we can actually - // parallelize currently - if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) - return true; + return *unordered; + } + if (auto callOp = dyn_cast(op)) { + auto callee = callOp.getCallee(); + if (!callee) return false; - } - // We cannot parallise anything else + auto *func = op->getParentOfType().lookupSymbol(*callee); + // TODO need to insert a check here whether it is a call we can actually + // parallelize currently + if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) + return true; return false; + } + // We cannot parallise anything else + return false; } /// If B() and D() are parallelizable, @@ -120,12 +120,10 @@ static bool shouldParallelize(Operation *op) { /// } /// E() -struct FissionWorkdistribute - : public OpRewritePattern { +struct FissionWorkdistribute : public OpRewritePattern { using OpRewritePattern::OpRewritePattern; - LogicalResult - matchAndRewrite(omp::WorkdistributeOp workdistribute, - PatternRewriter &rewriter) 
const override { + LogicalResult matchAndRewrite(omp::WorkdistributeOp workdistribute, + PatternRewriter &rewriter) const override { auto loc = workdistribute->getLoc(); auto teams = dyn_cast(workdistribute->getParentOp()); if (!teams) { @@ -185,7 +183,7 @@ struct FissionWorkdistribute auto newWorkdistribute = rewriter.create(loc); rewriter.create(loc); rewriter.createBlock(&newWorkdistribute.getRegion(), - newWorkdistribute.getRegion().begin(), {}, {}); + newWorkdistribute.getRegion().begin(), {}, {}); auto *cloned = rewriter.clone(*parallelize); rewriter.replaceOp(parallelize, cloned); rewriter.create(loc); @@ -197,8 +195,7 @@ struct FissionWorkdistribute }; static void -genLoopNestClauseOps(mlir::Location loc, - mlir::PatternRewriter &rewriter, +genLoopNestClauseOps(mlir::Location loc, mlir::PatternRewriter &rewriter, fir::DoLoopOp loop, mlir::omp::LoopNestOperands &loopNestClauseOps) { assert(loopNestClauseOps.loopLowerBounds.empty() && @@ -209,10 +206,8 @@ genLoopNestClauseOps(mlir::Location loc, loopNestClauseOps.loopInclusive = rewriter.getUnitAttr(); } -static void -genWsLoopOp(mlir::PatternRewriter &rewriter, - fir::DoLoopOp doLoop, - const mlir::omp::LoopNestOperands &clauseOps) { +static void genWsLoopOp(mlir::PatternRewriter &rewriter, fir::DoLoopOp doLoop, + const mlir::omp::LoopNestOperands &clauseOps) { auto wsloopOp = rewriter.create(doLoop.getLoc()); rewriter.createBlock(&wsloopOp.getRegion()); @@ -236,7 +231,7 @@ genWsLoopOp(mlir::PatternRewriter &rewriter, return; } -/// If fir.do_loop id present inside teams workdistribute +/// If fir.do_loop is present inside teams workdistribute /// /// omp.teams { /// omp.workdistribute { @@ -246,7 +241,7 @@ genWsLoopOp(mlir::PatternRewriter &rewriter, /// } /// } /// -/// Then, its lowered to +/// Then, its lowered to /// /// omp.teams { /// omp.workdistribute { @@ -277,7 +272,8 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { auto parallelOp = rewriter.create(teamsLoc); 
rewriter.createBlock(&parallelOp.getRegion()); - rewriter.setInsertionPoint(rewriter.create(doLoop.getLoc())); + rewriter.setInsertionPoint( + rewriter.create(doLoop.getLoc())); mlir::omp::LoopNestOperands loopNestClauseOps; genLoopNestClauseOps(doLoop.getLoc(), rewriter, doLoop, @@ -292,7 +288,6 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { } }; - /// If A() and B () are present inside teams workdistribute /// /// omp.teams { @@ -311,17 +306,17 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { struct TeamsWorkdistributeToSingle : public OpRewritePattern { using OpRewritePattern::OpRewritePattern; LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, - PatternRewriter &rewriter) const override { - auto workdistributeOp = getPerfectlyNested(teamsOp); - if (!workdistributeOp) { - LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); - return failure(); - } - Block *workdistributeBlock = &workdistributeOp.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); - rewriter.eraseOp(teamsOp); - return success(); + PatternRewriter &rewriter) const override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + return success(); } }; @@ -332,13 +327,13 @@ class LowerWorkdistributePass MLIRContext &context = getContext(); GreedyRewriteConfig config; // prevent the pattern driver form merging blocks - config.setRegionSimplificationLevel( - GreedySimplifyRegionLevel::Disabled); - + config.setRegionSimplificationLevel(GreedySimplifyRegionLevel::Disabled); + Operation *op = getOperation(); { RewritePatternSet 
patterns(&context); - patterns.insert(&context); + patterns.insert( + &context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); @@ -346,7 +341,8 @@ class LowerWorkdistributePass } { RewritePatternSet patterns(&context); - patterns.insert(&context); + patterns.insert( + &context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); @@ -354,4 +350,4 @@ class LowerWorkdistributePass } } }; -} +} // namespace diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index d3ae8653b35c7..f9d7fc1492b06 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1350,12 +1350,14 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), + "TARGET TEAMS WORKDISTRIBUTE" >> + pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_teams_workdistribute), + "TEAMS WORKDISTRIBUTE" >> + pure(llvm::omp::Directive::OMPD_teams_workdistribute), "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), "WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_workdistribute)))) diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir index 666bdb3ced647..9fb970246b90c 100644 --- 
a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir @@ -25,4 +25,4 @@ func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref< omp.terminator } return -} \ No newline at end of file +} From flang-commits at lists.llvm.org Mon May 19 03:26:42 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 19 May 2025 03:26:42 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Set the default schedule modifier (PR #139572) In-Reply-To: Message-ID: <682b0762.170a0220.3c6516.26c2@mx.google.com> tblah wrote: It turns out this was already handled in OMPIRBuilder: https://github.com/llvm/llvm-project/blob/cc51cbe27877aa7cc297f7e41afa5515edabcbdc/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp#L220 https://github.com/llvm/llvm-project/pull/139572 From flang-commits at lists.llvm.org Mon May 19 03:26:42 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 19 May 2025 03:26:42 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Set the default schedule modifier (PR #139572) In-Reply-To: Message-ID: <682b0762.170a0220.f14dc.d803@mx.google.com> https://github.com/tblah closed https://github.com/llvm/llvm-project/pull/139572 From flang-commits at lists.llvm.org Mon May 19 07:19:07 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 07:19:07 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [WIP] Implement workdistribute construct (PR #140523) In-Reply-To: Message-ID: <682b3ddb.170a0220.6b162.48a4@mx.google.com> https://github.com/skc7 updated https://github.com/llvm/llvm-project/pull/140523 >From e0dff6afb7aa31330aa0516effb7a0f65df5315f Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 12:57:36 -0800 Subject: [PATCH 01/11] Add coexecute directives --- llvm/include/llvm/Frontend/OpenMP/OMP.td | 45 ++++++++++++++++++++++++ 1 file changed, 45 
insertions(+) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 0af4b436649a3..752486a8105b6 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -682,6 +682,8 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let association = AS_None; let category = CA_Executable; } +def OMP_Coexecute : Directive<"coexecute"> {} +def OMP_EndCoexecute : Directive<"end coexecute"> {} def OMP_Critical : Directive<"critical"> { let allowedOnceClauses = [ VersionedClause, @@ -2198,6 +2200,33 @@ def OMP_TargetTeams : Directive<"target teams"> { let leafConstructs = [OMP_Target, OMP_Teams]; let category = CA_Executable; } +def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; +} def OMP_TargetTeamsDistribute : Directive<"target teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2484,6 +2513,22 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { let leafConstructs = [OMP_TaskLoop, OMP_Simd]; let category = CA_Executable; } +def OMP_TeamsCoexecute : Directive<"teams coexecute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause + ]; +} def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [ VersionedClause, >From 8b1b36f5e716b8186d98b0d5c47c0fdf649ae67b Mon Sep 17 00:00:00 2001 From: 
skc7 Date: Tue, 13 May 2025 11:01:45 +0530 Subject: [PATCH 02/11] [OpenMP] Fix Coexecute definitions --- llvm/include/llvm/Frontend/OpenMP/OMP.td | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 752486a8105b6..7f450b43c2e36 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -682,8 +682,15 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let association = AS_None; let category = CA_Executable; } -def OMP_Coexecute : Directive<"coexecute"> {} -def OMP_EndCoexecute : Directive<"end coexecute"> {} +def OMP_Coexecute : Directive<"coexecute"> { + let association = AS_Block; + let category = CA_Executable; +} +def OMP_EndCoexecute : Directive<"end coexecute"> { + let leafConstructs = OMP_Coexecute.leafConstructs; + let association = OMP_Coexecute.association; + let category = OMP_Coexecute.category; +} def OMP_Critical : Directive<"critical"> { let allowedOnceClauses = [ VersionedClause, @@ -2224,8 +2231,10 @@ def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { VersionedClause, VersionedClause, VersionedClause, - VersionedClause, + VersionedClause, ]; + let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; + let category = CA_Executable; } def OMP_TargetTeamsDistribute : Directive<"target teams distribute"> { let allowedClauses = [ @@ -2528,6 +2537,8 @@ def OMP_TeamsCoexecute : Directive<"teams coexecute"> { VersionedClause, VersionedClause ]; + let leafConstructs = [OMP_Target, OMP_Teams]; + let category = CA_Executable; } def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [ >From 9b8d66a45e602375ec779e6c5bdd43232644f9a2 Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 12:58:10 -0800 Subject: [PATCH 03/11] Add omp.coexecute op --- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 35 +++++++++++++++++++ 1 
file changed, 35 insertions(+) diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 5a79fbf77a268..8061aa0209cc9 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -325,6 +325,41 @@ def SectionsOp : OpenMP_Op<"sections", traits = [ let hasRegionVerifier = 1; } +//===----------------------------------------------------------------------===// +// Coexecute Construct +//===----------------------------------------------------------------------===// + +def CoexecuteOp : OpenMP_Op<"coexecute"> { + let summary = "coexecute directive"; + let description = [{ + The coexecute construct specifies that the teams from the teams directive + this is nested in shall cooperate to execute the computation in this region. + There is no implicit barrier at the end as specified in the standard. + + TODO + We should probably change the defaut behaviour to have a barrier unless + nowait is specified, see below snippet. + + ``` + !$omp target teams + !$omp coexecute + tmp = matmul(x, y) + !$omp end coexecute + a = tmp(0, 0) ! there is no implicit barrier! the matmul hasnt completed! 
+ !$omp end target teams coexecute + ``` + + }]; + + let arguments = (ins UnitAttr:$nowait); + + let regions = (region AnyRegion:$region); + + let assemblyFormat = [{ + oilist(`nowait` $nowait) $region attr-dict + }]; +} + //===----------------------------------------------------------------------===// // 2.8.2 Single Construct //===----------------------------------------------------------------------===// >From 7ecec06e00230649446c77c970160d4814a90e07 Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 17:50:41 -0800 Subject: [PATCH 04/11] Initial frontend support for coexecute --- .../include/flang/Semantics/openmp-directive-sets.h | 13 +++++++++++++ flang/lib/Lower/OpenMP/OpenMP.cpp | 12 ++++++++++++ flang/lib/Parser/openmp-parsers.cpp | 5 ++++- flang/lib/Semantics/resolve-directives.cpp | 6 ++++++ 4 files changed, 35 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index dd610c9702c28..5c316e030c63f 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -143,6 +143,7 @@ static const OmpDirectiveSet topTargetSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, + Directive::OMPD_target_teams_coexecute, }; static const OmpDirectiveSet allTargetSet{topTargetSet}; @@ -187,9 +188,16 @@ static const OmpDirectiveSet allTeamsSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, + Directive::OMPD_target_teams_coexecute, } | topTeamsSet, }; +static const OmpDirectiveSet allCoexecuteSet{ + Directive::OMPD_coexecute, + Directive::OMPD_teams_coexecute, + Directive::OMPD_target_teams_coexecute, +}; + //===----------------------------------------------------------------------===// // Directive sets for 
groups of multiple directives //===----------------------------------------------------------------------===// @@ -230,6 +238,9 @@ static const OmpDirectiveSet blockConstructSet{ Directive::OMPD_taskgroup, Directive::OMPD_teams, Directive::OMPD_workshare, + Directive::OMPD_target_teams_coexecute, + Directive::OMPD_teams_coexecute, + Directive::OMPD_coexecute, }; static const OmpDirectiveSet loopConstructSet{ @@ -294,6 +305,7 @@ static const OmpDirectiveSet workShareSet{ Directive::OMPD_scope, Directive::OMPD_sections, Directive::OMPD_single, + Directive::OMPD_coexecute, } | allDoSet, }; @@ -376,6 +388,7 @@ static const OmpDirectiveSet nestedReduceWorkshareAllowedSet{ }; static const OmpDirectiveSet nestedTeamsAllowedSet{ + Directive::OMPD_coexecute, Directive::OMPD_distribute, Directive::OMPD_distribute_parallel_do, Directive::OMPD_distribute_parallel_do_simd, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 61bbc709872fd..b0c65c8e37988 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2670,6 +2670,15 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static mlir::omp::CoexecuteOp +genCoexecuteOp(Fortran::lower::AbstractConverter &converter, + Fortran::lower::pft::Evaluation &eval, + mlir::Location currentLocation, + const Fortran::parser::OmpClauseList &clauseList) { + return genOpWithBody( + converter, eval, currentLocation, /*outerCombined=*/false, &clauseList); +} + //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// @@ -3929,6 +3938,9 @@ static void genOMPDispatch(lower::AbstractConverter &converter, newOp = genTeamsOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); break; + case llvm::omp::Directive::OMPD_coexecute: + newOp = genCoexecuteOp(converter, eval, 
currentLocation, beginClauseList); + break; case llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: { unsigned version = semaCtx.langOptions().OpenMPVersion; diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..591b1642baed3 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1344,12 +1344,15 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_coexecute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_teams_coexecute), "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), + "COEXECUTE" >> pure(llvm::omp::Directive::OMPD_coexecute)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 9fa7bc8964854..ae297f204356a 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1617,6 +1617,9 @@ bool OmpAttributeVisitor::Pre(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_taskgroup: case llvm::omp::Directive::OMPD_teams: + case llvm::omp::Directive::OMPD_coexecute: + case llvm::omp::Directive::OMPD_teams_coexecute: + case llvm::omp::Directive::OMPD_target_teams_coexecute: case llvm::omp::Directive::OMPD_workshare: case 
llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: @@ -1650,6 +1653,9 @@ void OmpAttributeVisitor::Post(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_target: case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_teams: + case llvm::omp::Directive::OMPD_coexecute: + case llvm::omp::Directive::OMPD_teams_coexecute: + case llvm::omp::Directive::OMPD_target_teams_coexecute: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: case llvm::omp::Directive::OMPD_target_parallel: { >From ca0cc44c621fde89f1889fb328e66755ca3f5e3a Mon Sep 17 00:00:00 2001 From: skc7 Date: Tue, 13 May 2025 15:09:45 +0530 Subject: [PATCH 05/11] [OpenMP] Fixes for coexecute definitions --- .../flang/Semantics/openmp-directive-sets.h | 1 + flang/lib/Lower/OpenMP/OpenMP.cpp | 13 ++-- flang/test/Lower/OpenMP/coexecute.f90 | 59 +++++++++++++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 33 +++++------ 4 files changed, 83 insertions(+), 23 deletions(-) create mode 100644 flang/test/Lower/OpenMP/coexecute.f90 diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index 5c316e030c63f..43f4e642b3d86 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -173,6 +173,7 @@ static const OmpDirectiveSet topTeamsSet{ Directive::OMPD_teams_distribute_parallel_do_simd, Directive::OMPD_teams_distribute_simd, Directive::OMPD_teams_loop, + Directive::OMPD_teams_coexecute, }; static const OmpDirectiveSet bottomTeamsSet{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index b0c65c8e37988..80612bd05ad97 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2671,12 +2671,13 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, } static 
mlir::omp::CoexecuteOp -genCoexecuteOp(Fortran::lower::AbstractConverter &converter, - Fortran::lower::pft::Evaluation &eval, - mlir::Location currentLocation, - const Fortran::parser::OmpClauseList &clauseList) { +genCoexecuteOp(lower::AbstractConverter &converter, lower::SymMap &symTable, + semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, + mlir::Location loc, const ConstructQueue &queue, + ConstructQueue::const_iterator item) { return genOpWithBody( - converter, eval, currentLocation, /*outerCombined=*/false, &clauseList); + OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, + llvm::omp::Directive::OMPD_coexecute), queue, item); } //===----------------------------------------------------------------------===// @@ -3939,7 +3940,7 @@ static void genOMPDispatch(lower::AbstractConverter &converter, item); break; case llvm::omp::Directive::OMPD_coexecute: - newOp = genCoexecuteOp(converter, eval, currentLocation, beginClauseList); + newOp = genCoexecuteOp(converter, symTable, semaCtx, eval, loc, queue, item); break; case llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: { diff --git a/flang/test/Lower/OpenMP/coexecute.f90 b/flang/test/Lower/OpenMP/coexecute.f90 new file mode 100644 index 0000000000000..b14f71f9bbbfa --- /dev/null +++ b/flang/test/Lower/OpenMP/coexecute.f90 @@ -0,0 +1,59 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! CHECK-LABEL: func @_QPtarget_teams_coexecute +subroutine target_teams_coexecute() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp target teams coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end target teams coexecute +end subroutine target_teams_coexecute + +! CHECK-LABEL: func @_QPteams_coexecute +subroutine teams_coexecute() + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp teams coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! 
CHECK: omp.terminator + !$omp end teams coexecute +end subroutine teams_coexecute + +! CHECK-LABEL: func @_QPtarget_teams_coexecute_m +subroutine target_teams_coexecute_m() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp target + !$omp teams + !$omp coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end coexecute + !$omp end teams + !$omp end target +end subroutine target_teams_coexecute_m + +! CHECK-LABEL: func @_QPteams_coexecute_m +subroutine teams_coexecute_m() + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp teams + !$omp coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end coexecute + !$omp end teams +end subroutine teams_coexecute_m diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 7f450b43c2e36..3f02b6534816f 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -2209,29 +2209,28 @@ def OMP_TargetTeams : Directive<"target teams"> { } def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, + VersionedClause, VersionedClause, VersionedClause, - VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, - VersionedClause, + VersionedClause, ]; - let allowedOnceClauses = [ + VersionedClause, + VersionedClause, VersionedClause, VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, - VersionedClause, VersionedClause, VersionedClause, + VersionedClause, ]; let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; let category = CA_Executable; @@ -2524,20 +2523,20 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { } def OMP_TeamsCoexecute : 
Directive<"teams coexecute"> { let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, + VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, ]; let allowedOnceClauses = [ VersionedClause, VersionedClause, VersionedClause, - VersionedClause + VersionedClause, ]; - let leafConstructs = [OMP_Target, OMP_Teams]; + let leafConstructs = [OMP_Teams, OMP_Coexecute]; let category = CA_Executable; } def OMP_TeamsDistribute : Directive<"teams distribute"> { >From 8077858a88a2ffac2b7d726c1ae5d1f1edb64b67 Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 14 May 2025 14:48:52 +0530 Subject: [PATCH 06/11] [OpenMP] Use workdistribute instead of coexecute --- .../flang/Semantics/openmp-directive-sets.h | 24 ++--- flang/lib/Lower/OpenMP/OpenMP.cpp | 15 ++- flang/lib/Parser/openmp-parsers.cpp | 6 +- flang/lib/Semantics/resolve-directives.cpp | 12 +-- flang/test/Lower/OpenMP/coexecute.f90 | 59 ---------- flang/test/Lower/OpenMP/workdistribute.f90 | 59 ++++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 101 ++++++++++-------- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 28 ++--- 8 files changed, 152 insertions(+), 152 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/coexecute.f90 create mode 100644 flang/test/Lower/OpenMP/workdistribute.f90 diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index 43f4e642b3d86..7ced6ed9b44d6 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -143,7 +143,7 @@ static const OmpDirectiveSet topTargetSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, - Directive::OMPD_target_teams_coexecute, + Directive::OMPD_target_teams_workdistribute, }; static const OmpDirectiveSet allTargetSet{topTargetSet}; @@ -173,7 
+173,7 @@ static const OmpDirectiveSet topTeamsSet{ Directive::OMPD_teams_distribute_parallel_do_simd, Directive::OMPD_teams_distribute_simd, Directive::OMPD_teams_loop, - Directive::OMPD_teams_coexecute, + Directive::OMPD_teams_workdistribute, }; static const OmpDirectiveSet bottomTeamsSet{ @@ -189,14 +189,14 @@ static const OmpDirectiveSet allTeamsSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, - Directive::OMPD_target_teams_coexecute, + Directive::OMPD_target_teams_workdistribute, } | topTeamsSet, }; -static const OmpDirectiveSet allCoexecuteSet{ - Directive::OMPD_coexecute, - Directive::OMPD_teams_coexecute, - Directive::OMPD_target_teams_coexecute, +static const OmpDirectiveSet allWorkdistributeSet{ + Directive::OMPD_workdistribute, + Directive::OMPD_teams_workdistribute, + Directive::OMPD_target_teams_workdistribute, }; //===----------------------------------------------------------------------===// @@ -239,9 +239,9 @@ static const OmpDirectiveSet blockConstructSet{ Directive::OMPD_taskgroup, Directive::OMPD_teams, Directive::OMPD_workshare, - Directive::OMPD_target_teams_coexecute, - Directive::OMPD_teams_coexecute, - Directive::OMPD_coexecute, + Directive::OMPD_target_teams_workdistribute, + Directive::OMPD_teams_workdistribute, + Directive::OMPD_workdistribute, }; static const OmpDirectiveSet loopConstructSet{ @@ -306,7 +306,7 @@ static const OmpDirectiveSet workShareSet{ Directive::OMPD_scope, Directive::OMPD_sections, Directive::OMPD_single, - Directive::OMPD_coexecute, + Directive::OMPD_workdistribute, } | allDoSet, }; @@ -389,7 +389,7 @@ static const OmpDirectiveSet nestedReduceWorkshareAllowedSet{ }; static const OmpDirectiveSet nestedTeamsAllowedSet{ - Directive::OMPD_coexecute, + Directive::OMPD_workdistribute, Directive::OMPD_distribute, Directive::OMPD_distribute_parallel_do, Directive::OMPD_distribute_parallel_do_simd, diff --git 
a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 80612bd05ad97..42d04bceddb12 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2670,14 +2670,14 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } -static mlir::omp::CoexecuteOp -genCoexecuteOp(lower::AbstractConverter &converter, lower::SymMap &symTable, +static mlir::omp::WorkdistributeOp +genWorkdistributeOp(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, mlir::Location loc, const ConstructQueue &queue, ConstructQueue::const_iterator item) { - return genOpWithBody( + return genOpWithBody( OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, - llvm::omp::Directive::OMPD_coexecute), queue, item); + llvm::omp::Directive::OMPD_workdistribute), queue, item); } //===----------------------------------------------------------------------===// @@ -3939,16 +3939,15 @@ static void genOMPDispatch(lower::AbstractConverter &converter, newOp = genTeamsOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); break; - case llvm::omp::Directive::OMPD_coexecute: - newOp = genCoexecuteOp(converter, symTable, semaCtx, eval, loc, queue, item); - break; case llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: { unsigned version = semaCtx.langOptions().OpenMPVersion; TODO(loc, "Unhandled loop directive (" + llvm::omp::getOpenMPDirectiveName(dir, version) + ")"); } - // case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_workdistribute: + newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, item); + break; case llvm::omp::Directive::OMPD_workshare: newOp = genWorkshareOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 
591b1642baed3..5b5ee257edd1f 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1344,15 +1344,15 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_coexecute), + "TARGET TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_teams_coexecute), + "TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_teams_workdistribute), "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), - "COEXECUTE" >> pure(llvm::omp::Directive::OMPD_coexecute)))) + "WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_workdistribute)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index ae297f204356a..4636508ac144d 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1617,9 +1617,9 @@ bool OmpAttributeVisitor::Pre(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_taskgroup: case llvm::omp::Directive::OMPD_teams: - case llvm::omp::Directive::OMPD_coexecute: - case llvm::omp::Directive::OMPD_teams_coexecute: - case llvm::omp::Directive::OMPD_target_teams_coexecute: + case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_teams_workdistribute: + case llvm::omp::Directive::OMPD_target_teams_workdistribute: 
case llvm::omp::Directive::OMPD_workshare: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: @@ -1653,9 +1653,9 @@ void OmpAttributeVisitor::Post(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_target: case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_teams: - case llvm::omp::Directive::OMPD_coexecute: - case llvm::omp::Directive::OMPD_teams_coexecute: - case llvm::omp::Directive::OMPD_target_teams_coexecute: + case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_teams_workdistribute: + case llvm::omp::Directive::OMPD_target_teams_workdistribute: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: case llvm::omp::Directive::OMPD_target_parallel: { diff --git a/flang/test/Lower/OpenMP/coexecute.f90 b/flang/test/Lower/OpenMP/coexecute.f90 deleted file mode 100644 index b14f71f9bbbfa..0000000000000 --- a/flang/test/Lower/OpenMP/coexecute.f90 +++ /dev/null @@ -1,59 +0,0 @@ -! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s - -! CHECK-LABEL: func @_QPtarget_teams_coexecute -subroutine target_teams_coexecute() - ! CHECK: omp.target - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp target teams coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end target teams coexecute -end subroutine target_teams_coexecute - -! CHECK-LABEL: func @_QPteams_coexecute -subroutine teams_coexecute() - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp teams coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end teams coexecute -end subroutine teams_coexecute - -! CHECK-LABEL: func @_QPtarget_teams_coexecute_m -subroutine target_teams_coexecute_m() - ! CHECK: omp.target - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp target - !$omp teams - !$omp coexecute - ! 
CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end coexecute - !$omp end teams - !$omp end target -end subroutine target_teams_coexecute_m - -! CHECK-LABEL: func @_QPteams_coexecute_m -subroutine teams_coexecute_m() - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp teams - !$omp coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end coexecute - !$omp end teams -end subroutine teams_coexecute_m diff --git a/flang/test/Lower/OpenMP/workdistribute.f90 b/flang/test/Lower/OpenMP/workdistribute.f90 new file mode 100644 index 0000000000000..924205bb72e5e --- /dev/null +++ b/flang/test/Lower/OpenMP/workdistribute.f90 @@ -0,0 +1,59 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! CHECK-LABEL: func @_QPtarget_teams_workdistribute +subroutine target_teams_workdistribute() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp target teams workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end target teams workdistribute +end subroutine target_teams_workdistribute + +! CHECK-LABEL: func @_QPteams_workdistribute +subroutine teams_workdistribute() + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp teams workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end teams workdistribute +end subroutine teams_workdistribute + +! CHECK-LABEL: func @_QPtarget_teams_workdistribute_m +subroutine target_teams_workdistribute_m() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp target + !$omp teams + !$omp workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end workdistribute + !$omp end teams + !$omp end target +end subroutine target_teams_workdistribute_m + +! 
CHECK-LABEL: func @_QPteams_workdistribute_m +subroutine teams_workdistribute_m() + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp teams + !$omp workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end workdistribute + !$omp end teams +end subroutine teams_workdistribute_m diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 3f02b6534816f..c88a3049450de 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1292,6 +1292,15 @@ def OMP_EndWorkshare : Directive<"end workshare"> { let category = OMP_Workshare.category; let languages = [L_Fortran]; } +def OMP_Workdistribute : Directive<"workdistribute"> { + let association = AS_Block; + let category = CA_Executable; +} +def OMP_EndWorkdistribute : Directive<"end workdistribute"> { + let leafConstructs = OMP_Workdistribute.leafConstructs; + let association = OMP_Workdistribute.association; + let category = OMP_Workdistribute.category; +} //===----------------------------------------------------------------------===// // Definitions of OpenMP compound directives @@ -2207,34 +2216,6 @@ def OMP_TargetTeams : Directive<"target teams"> { let leafConstructs = [OMP_Target, OMP_Teams]; let category = CA_Executable; } -def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; - let category = CA_Executable; -} def OMP_TargetTeamsDistribute : Directive<"target 
teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2457,6 +2438,34 @@ def OMP_TargetTeamsDistributeSimd : let leafConstructs = [OMP_Target, OMP_Teams, OMP_Distribute, OMP_Simd]; let category = CA_Executable; } +def OMP_TargetTeamsWorkdistribute : Directive<"target teams workdistribute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let leafConstructs = [OMP_Target, OMP_Teams, OMP_Workdistribute]; + let category = CA_Executable; +} def OMP_target_teams_loop : Directive<"target teams loop"> { let allowedClauses = [ VersionedClause, @@ -2521,24 +2530,6 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { let leafConstructs = [OMP_TaskLoop, OMP_Simd]; let category = CA_Executable; } -def OMP_TeamsCoexecute : Directive<"teams coexecute"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let leafConstructs = [OMP_Teams, OMP_Coexecute]; - let category = CA_Executable; -} def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2726,3 +2717,21 @@ def OMP_teams_loop : Directive<"teams loop"> { let leafConstructs = [OMP_Teams, OMP_loop]; let category = CA_Executable; } +def OMP_TeamsWorkdistribute : Directive<"teams workdistribute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + 
VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let leafConstructs = [OMP_Teams, OMP_Workdistribute]; + let category = CA_Executable; +} diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 8061aa0209cc9..5e3ab0e908d21 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -326,38 +326,30 @@ def SectionsOp : OpenMP_Op<"sections", traits = [ } //===----------------------------------------------------------------------===// -// Coexecute Construct +// workdistribute Construct //===----------------------------------------------------------------------===// -def CoexecuteOp : OpenMP_Op<"coexecute"> { - let summary = "coexecute directive"; +def WorkdistributeOp : OpenMP_Op<"workdistribute"> { + let summary = "workdistribute directive"; let description = [{ - The coexecute construct specifies that the teams from the teams directive - this is nested in shall cooperate to execute the computation in this region. - There is no implicit barrier at the end as specified in the standard. - - TODO - We should probably change the defaut behaviour to have a barrier unless - nowait is specified, see below snippet. + workdistribute divides execution of the enclosed structured block into + separate units of work, each executed only once by each + initial thread in the league. ``` !$omp target teams - !$omp coexecute + !$omp workdistribute tmp = matmul(x, y) - !$omp end coexecute + !$omp end workdistribute a = tmp(0, 0) ! there is no implicit barrier! the matmul hasnt completed! 
- !$omp end target teams coexecute + !$omp end target teams workdistribute ``` }]; - let arguments = (ins UnitAttr:$nowait); - let regions = (region AnyRegion:$region); - let assemblyFormat = [{ - oilist(`nowait` $nowait) $region attr-dict - }]; + let assemblyFormat = "$region attr-dict"; } //===----------------------------------------------------------------------===// >From 085062f9ebac1079a720f614498c0b124eda8a51 Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 14 May 2025 16:17:14 +0530 Subject: [PATCH 07/11] [OpenMP] workdistribute trivial lowering Lowering logic inspired by ivanradanov coexecute lowering f56da1a207df4a40776a8570122a33f047074a3c --- .../include/flang/Optimizer/OpenMP/Passes.td | 4 + flang/lib/Optimizer/OpenMP/CMakeLists.txt | 1 + .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 101 ++++++++++++++++++ .../OpenMP/lower-workdistribute.mlir | 52 +++++++++ 4 files changed, 158 insertions(+) create mode 100644 flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute.mlir diff --git a/flang/include/flang/Optimizer/OpenMP/Passes.td b/flang/include/flang/Optimizer/OpenMP/Passes.td index 704faf0ccd856..743b6d381ed42 100644 --- a/flang/include/flang/Optimizer/OpenMP/Passes.td +++ b/flang/include/flang/Optimizer/OpenMP/Passes.td @@ -93,6 +93,10 @@ def LowerWorkshare : Pass<"lower-workshare", "::mlir::ModuleOp"> { let summary = "Lower workshare construct"; } +def LowerWorkdistribute : Pass<"lower-workdistribute", "::mlir::ModuleOp"> { + let summary = "Lower workdistribute construct"; +} + def GenericLoopConversionPass : Pass<"omp-generic-loop-conversion", "mlir::func::FuncOp"> { let summary = "Converts OpenMP generic `omp.loop` to semantically " diff --git a/flang/lib/Optimizer/OpenMP/CMakeLists.txt b/flang/lib/Optimizer/OpenMP/CMakeLists.txt index e31543328a9f9..cd746834741f9 100644 --- a/flang/lib/Optimizer/OpenMP/CMakeLists.txt +++ b/flang/lib/Optimizer/OpenMP/CMakeLists.txt @@ -7,6 +7,7 @@ 
add_flang_library(FlangOpenMPTransforms MapsForPrivatizedSymbols.cpp MapInfoFinalization.cpp MarkDeclareTarget.cpp + LowerWorkdistribute.cpp LowerWorkshare.cpp LowerNontemporal.cpp diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp new file mode 100644 index 0000000000000..75c9d2b0d494e --- /dev/null +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -0,0 +1,101 @@ +//===- LowerWorkshare.cpp - special cases for bufferization -------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the lowering of omp.workdistribute. +// +//===----------------------------------------------------------------------===// + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include + +namespace flangomp { +#define GEN_PASS_DEF_LOWERWORKDISTRIBUTE +#include "flang/Optimizer/OpenMP/Passes.h.inc" +} // namespace flangomp + +#define DEBUG_TYPE "lower-workdistribute" + +using namespace mlir; + +namespace { + +struct WorkdistributeToSingle : public mlir::OpRewritePattern { +using OpRewritePattern::OpRewritePattern; +mlir::LogicalResult + matchAndRewrite(mlir::omp::WorkdistributeOp workdistribute, + mlir::PatternRewriter &rewriter) const override { + auto loc = workdistribute->getLoc(); + auto teams = llvm::dyn_cast(workdistribute->getParentOp()); + if (!teams) { + mlir::emitError(loc, "workdistribute not nested in teams\n"); + return mlir::failure(); + } + if (workdistribute.getRegion().getBlocks().size() != 
1) { + mlir::emitError(loc, "workdistribute with multiple blocks\n"); + return mlir::failure(); + } + if (teams.getRegion().getBlocks().size() != 1) { + mlir::emitError(loc, "teams with multiple blocks\n"); + return mlir::failure(); + } + if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { + mlir::emitError(loc, "teams with multiple nested ops\n"); + return mlir::failure(); + } + mlir::Block *workdistributeBlock = &workdistribute.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teams); + rewriter.eraseOp(teams); + return mlir::success(); + } +}; + +class LowerWorkdistributePass + : public flangomp::impl::LowerWorkdistributeBase { +public: + void runOnOperation() override { + mlir::MLIRContext &context = getContext(); + mlir::RewritePatternSet patterns(&context); + mlir::GreedyRewriteConfig config; + // prevent the pattern driver form merging blocks + config.setRegionSimplificationLevel( + mlir::GreedySimplifyRegionLevel::Disabled); + + patterns.insert(&context); + mlir::Operation *op = getOperation(); + if (mlir::failed(mlir::applyPatternsGreedily(op, std::move(patterns), config))) { + mlir::emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } + } +}; +} diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute.mlir new file mode 100644 index 0000000000000..34c8c3f01976d --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute.mlir @@ -0,0 +1,52 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @_QPtarget_simple() { +// CHECK: %[[VAL_0:.*]] = arith.constant 2 : i32 +// CHECK: %[[VAL_1:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFtarget_simpleEa"} +// CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_1]] {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_3:.*]] = fir.alloca 
!fir.box> {bindc_name = "simple_var", uniq_name = "_QFtarget_simpleEsimple_var"} +// CHECK: %[[VAL_4:.*]] = fir.zero_bits !fir.heap +// CHECK: %[[VAL_5:.*]] = fir.embox %[[VAL_4]] : (!fir.heap) -> !fir.box> +// CHECK: fir.store %[[VAL_5]] to %[[VAL_3]] : !fir.ref>> +// CHECK: %[[VAL_6:.*]]:2 = hlfir.declare %[[VAL_3]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +// CHECK: hlfir.assign %[[VAL_0]] to %[[VAL_2]]#0 : i32, !fir.ref +// CHECK: %[[VAL_7:.*]] = omp.map.info var_ptr(%[[VAL_2]]#1 : !fir.ref, i32) map_clauses(to) capture(ByRef) -> !fir.ref {name = "a"} +// CHECK: omp.target map_entries(%[[VAL_7]] -> %[[VAL_8:.*]] : !fir.ref) private(@_QFtarget_simpleEsimple_var_private_ref_box_heap_i32 %[[VAL_6]]#0 -> %[[VAL_9:.*]] : !fir.ref>>) { +// CHECK: %[[VAL_10:.*]] = arith.constant 10 : i32 +// CHECK: %[[VAL_11:.*]]:2 = hlfir.declare %[[VAL_8]] {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_12:.*]]:2 = hlfir.declare %[[VAL_9]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +// CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_11]]#0 : !fir.ref +// CHECK: %[[VAL_14:.*]] = arith.addi %[[VAL_13]], %[[VAL_10]] : i32 +// CHECK: hlfir.assign %[[VAL_14]] to %[[VAL_12]]#0 realloc : i32, !fir.ref>> +// CHECK: omp.terminator +// CHECK: } +// CHECK: return +// CHECK: } +func.func @_QPtarget_simple() { + %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFtarget_simpleEa"} + %1:2 = hlfir.declare %0 {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %2 = fir.alloca !fir.box> {bindc_name = "simple_var", uniq_name = "_QFtarget_simpleEsimple_var"} + %3 = fir.zero_bits !fir.heap + %4 = fir.embox %3 : (!fir.heap) -> !fir.box> + fir.store %4 to %2 : !fir.ref>> + %5:2 = hlfir.declare %2 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> 
(!fir.ref>>, !fir.ref>>) + %c2_i32 = arith.constant 2 : i32 + hlfir.assign %c2_i32 to %1#0 : i32, !fir.ref + %6 = omp.map.info var_ptr(%1#1 : !fir.ref, i32) map_clauses(to) capture(ByRef) -> !fir.ref {name = "a"} + omp.target map_entries(%6 -> %arg0 : !fir.ref) private(@_QFtarget_simpleEsimple_var_private_ref_box_heap_i32 %5#0 -> %arg1 : !fir.ref>>){ + omp.teams { + omp.workdistribute { + %11:2 = hlfir.declare %arg0 {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %12:2 = hlfir.declare %arg1 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) + %c10_i32 = arith.constant 10 : i32 + %13 = fir.load %11#0 : !fir.ref + %14 = arith.addi %c10_i32, %13 : i32 + hlfir.assign %14 to %12#0 realloc : i32, !fir.ref>> + omp.terminator + } + omp.terminator + } + omp.terminator + } + return +} \ No newline at end of file >From c9b63efe85f7aed781a4a0fd7d0888b595f2a520 Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 14 May 2025 19:29:33 +0530 Subject: [PATCH 08/11] [Flang][OpenMP] Add workdistribute lower pass to pipeline --- flang/lib/Optimizer/Passes/Pipelines.cpp | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..15983f80c1e4b 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -278,8 +278,10 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP, addNestedPassToAllTopLevelOperations( pm, hlfir::createInlineHLFIRAssign); pm.addPass(hlfir::createConvertHLFIRtoFIR()); - if (enableOpenMP) + if (enableOpenMP) { pm.addPass(flangomp::createLowerWorkshare()); + pm.addPass(flangomp::createLowerWorkdistribute()); + } } /// Create a pass pipeline for handling certain OpenMP transformations needed >From 048c3f22d55248a21e53ee3f4be2c0b07b500039 Mon Sep 17 00:00:00 2001 From: skc7 Date: Thu, 15 May 2025 
16:39:21 +0530 Subject: [PATCH 09/11] [Flang][OpenMP] Add FissionWorkdistribute lowering. Fission logic inspired from ivanradanov implementation : c97eca4010e460aac5a3d795614ca0980bce4565 --- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 233 ++++++++++++++---- .../OpenMP/lower-workdistribute-fission.mlir | 60 +++++ ...ir => lower-workdistribute-to-single.mlir} | 2 +- 3 files changed, 243 insertions(+), 52 deletions(-) create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir rename flang/test/Transforms/OpenMP/{lower-workdistribute.mlir => lower-workdistribute-to-single.mlir} (99%) diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index 75c9d2b0d494e..f799202be2645 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -10,31 +10,26 @@ // //===----------------------------------------------------------------------===// -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Dialect/FIROps.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "flang/Optimizer/HLFIR/Passes.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/Builders.h" +#include "mlir/IR/Value.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" #include #include -#include -#include -#include +#include +#include #include +#include #include -#include #include -#include -#include #include #include -#include "mlir/Transforms/GreedyPatternRewriteDriver.h" - +#include #include namespace flangomp { @@ -48,52 +43,188 @@ using namespace mlir; namespace { -struct WorkdistributeToSingle : public mlir::OpRewritePattern { -using OpRewritePattern::OpRewritePattern; -mlir::LogicalResult - matchAndRewrite(mlir::omp::WorkdistributeOp workdistribute, - 
mlir::PatternRewriter &rewriter) const override { - auto loc = workdistribute->getLoc(); - auto teams = llvm::dyn_cast(workdistribute->getParentOp()); - if (!teams) { - mlir::emitError(loc, "workdistribute not nested in teams\n"); - return mlir::failure(); - } - if (workdistribute.getRegion().getBlocks().size() != 1) { - mlir::emitError(loc, "workdistribute with multiple blocks\n"); - return mlir::failure(); +template +static T getPerfectlyNested(Operation *op) { + if (op->getNumRegions() != 1) + return nullptr; + auto &region = op->getRegion(0); + if (region.getBlocks().size() != 1) + return nullptr; + auto *block = &region.front(); + auto *firstOp = &block->front(); + if (auto nested = dyn_cast(firstOp)) + if (firstOp->getNextNode() == block->getTerminator()) + return nested; + return nullptr; +} + +/// This is the single source of truth about whether we should parallelize an +/// operation nested in an omp.workdistribute region. +static bool shouldParallelize(Operation *op) { + // Currently we cannot parallelize operations with results that have uses + if (llvm::any_of(op->getResults(), + [](OpResult v) -> bool { return !v.use_empty(); })) + return false; + // We will parallelize unordered loops - these come from array syntax + if (auto loop = dyn_cast(op)) { + auto unordered = loop.getUnordered(); + if (!unordered) + return false; + return *unordered; + } + if (auto callOp = dyn_cast(op)) { + auto callee = callOp.getCallee(); + if (!callee) + return false; + auto *func = op->getParentOfType().lookupSymbol(*callee); + // TODO need to insert a check here whether it is a call we can actually + // parallelize currently + if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) + return true; + return false; + } + // We cannot parallelize anything else + return false; +} + +struct WorkdistributeToSingle : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const 
override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); } - if (teams.getRegion().getBlocks().size() != 1) { - mlir::emitError(loc, "teams with multiple blocks\n"); - return mlir::failure(); + + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + workdistributeOp.emitWarning("unable to parallelize coexecute"); + return success(); + } +}; + +/// If B() and D() are parallelizable, +/// +/// omp.teams { +/// omp.workdistribute { +/// A() +/// B() +/// C() +/// D() +/// E() +/// } +/// } +/// +/// becomes +/// +/// A() +/// omp.teams { +/// omp.workdistribute { +/// B() +/// } +/// } +/// C() +/// omp.teams { +/// omp.workdistribute { +/// D() +/// } +/// } +/// E() + +struct FissionWorkdistribute + : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult + matchAndRewrite(omp::WorkdistributeOp workdistribute, + PatternRewriter &rewriter) const override { + auto loc = workdistribute->getLoc(); + auto teams = dyn_cast(workdistribute->getParentOp()); + if (!teams) { + emitError(loc, "workdistribute not nested in teams\n"); + return failure(); + } + if (workdistribute.getRegion().getBlocks().size() != 1) { + emitError(loc, "workdistribute with multiple blocks\n"); + return failure(); + } + if (teams.getRegion().getBlocks().size() != 1) { + emitError(loc, "teams with multiple blocks\n"); + return failure(); + } + if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { + emitError(loc, "teams with multiple nested ops\n"); + return failure(); + } + + auto *teamsBlock = &teams.getRegion().front(); + + // While we have unhandled operations in the original workdistribute + auto *workdistributeBlock = &workdistribute.getRegion().front(); + 
auto *terminator = workdistributeBlock->getTerminator(); + bool changed = false; + while (&workdistributeBlock->front() != terminator) { + rewriter.setInsertionPoint(teams); + IRMapping mapping; + llvm::SmallVector hoisted; + Operation *parallelize = nullptr; + for (auto &op : workdistribute.getOps()) { + if (&op == terminator) { + break; } - if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { - mlir::emitError(loc, "teams with multiple nested ops\n"); - return mlir::failure(); + if (shouldParallelize(&op)) { + parallelize = &op; + break; + } else { + rewriter.clone(op, mapping); + hoisted.push_back(&op); + changed = true; } - mlir::Block *workdistributeBlock = &workdistribute.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teams); - rewriter.eraseOp(teams); - return mlir::success(); + } + + for (auto *op : hoisted) + rewriter.replaceOp(op, mapping.lookup(op)); + + if (parallelize && hoisted.empty() && + parallelize->getNextNode() == terminator) + break; + if (parallelize) { + auto newTeams = rewriter.cloneWithoutRegions(teams); + auto *newTeamsBlock = rewriter.createBlock( + &newTeams.getRegion(), newTeams.getRegion().begin(), {}, {}); + for (auto arg : teamsBlock->getArguments()) + newTeamsBlock->addArgument(arg.getType(), arg.getLoc()); + auto newWorkdistribute = rewriter.create(loc); + rewriter.create(loc); + rewriter.createBlock(&newWorkdistribute.getRegion(), + newWorkdistribute.getRegion().begin(), {}, {}); + auto *cloned = rewriter.clone(*parallelize); + rewriter.replaceOp(parallelize, cloned); + rewriter.create(loc); + changed = true; + } } + return success(changed); + } }; class LowerWorkdistributePass : public flangomp::impl::LowerWorkdistributeBase { public: void runOnOperation() override { - mlir::MLIRContext &context = getContext(); - mlir::RewritePatternSet patterns(&context); - mlir::GreedyRewriteConfig config; + MLIRContext &context = getContext(); 
+ RewritePatternSet patterns(&context); + GreedyRewriteConfig config; // prevent the pattern driver form merging blocks config.setRegionSimplificationLevel( - mlir::GreedySimplifyRegionLevel::Disabled); + GreedySimplifyRegionLevel::Disabled); - patterns.insert(&context); - mlir::Operation *op = getOperation(); - if (mlir::failed(mlir::applyPatternsGreedily(op, std::move(patterns), config))) { - mlir::emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + patterns.insert(&context); + Operation *op = getOperation(); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); } } diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir new file mode 100644 index 0000000000000..ea03a10dd3d44 --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir @@ -0,0 +1,60 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @test_fission_workdistribute({{.*}}) { +// CHECK: %[[VAL_0:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_2:.*]] = arith.constant 9 : index +// CHECK: %[[VAL_3:.*]] = arith.constant 5.000000e+00 : f32 +// CHECK: fir.store %[[VAL_3]] to %[[ARG2:.*]] : !fir.ref +// CHECK: fir.do_loop %[[VAL_4:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] unordered { +// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref +// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: } +// CHECK: fir.call @regular_side_effect_func(%[[ARG2:.*]]) : (!fir.ref) -> () +// CHECK: fir.call @my_fir_parallel_runtime_func(%[[ARG3:.*]]) : (!fir.ref) -> () +// CHECK: fir.do_loop %[[VAL_8:.*]] = 
%[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] { +// CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_8]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_3]] to %[[VAL_9]] : !fir.ref +// CHECK: } +// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2:.*]] : !fir.ref +// CHECK: fir.store %[[VAL_10]] to %[[ARG3:.*]] : !fir.ref +// CHECK: return +// CHECK: } +module { +func.func @regular_side_effect_func(%arg0: !fir.ref) { + return +} +func.func @my_fir_parallel_runtime_func(%arg0: !fir.ref) attributes {fir.runtime} { + return +} +func.func @test_fission_workdistribute(%arr1: !fir.ref>, %arr2: !fir.ref>, %scalar_ref1: !fir.ref, %scalar_ref2: !fir.ref) { + %c0_idx = arith.constant 0 : index + %c1_idx = arith.constant 1 : index + %c9_idx = arith.constant 9 : index + %float_val = arith.constant 5.0 : f32 + omp.teams { + omp.workdistribute { + fir.store %float_val to %scalar_ref1 : !fir.ref + fir.do_loop %iv = %c0_idx to %c9_idx step %c1_idx unordered { + %elem_ptr_arr1 = fir.coordinate_of %arr1, %iv : (!fir.ref>, index) -> !fir.ref + %loaded_val_loop1 = fir.load %elem_ptr_arr1 : !fir.ref + %elem_ptr_arr2 = fir.coordinate_of %arr2, %iv : (!fir.ref>, index) -> !fir.ref + fir.store %loaded_val_loop1 to %elem_ptr_arr2 : !fir.ref + } + fir.call @regular_side_effect_func(%scalar_ref1) : (!fir.ref) -> () + fir.call @my_fir_parallel_runtime_func(%scalar_ref2) : (!fir.ref) -> () + fir.do_loop %jv = %c0_idx to %c9_idx step %c1_idx { + %elem_ptr_ordered_loop = fir.coordinate_of %arr1, %jv : (!fir.ref>, index) -> !fir.ref + fir.store %float_val to %elem_ptr_ordered_loop : !fir.ref + } + %loaded_for_hoist = fir.load %scalar_ref1 : !fir.ref + fir.store %loaded_for_hoist to %scalar_ref2 : !fir.ref + omp.terminator + } + omp.terminator + } + return +} +} diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir similarity index 99% rename from flang/test/Transforms/OpenMP/lower-workdistribute.mlir 
rename to flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir index 34c8c3f01976d..0cc2aeded2532 100644 --- a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir @@ -49,4 +49,4 @@ func.func @_QPtarget_simple() { omp.terminator } return -} \ No newline at end of file +} >From 5b30d3dcb80cb4cef546f5bfdf3aa389f527d07d Mon Sep 17 00:00:00 2001 From: skc7 Date: Sun, 18 May 2025 12:37:53 +0530 Subject: [PATCH 10/11] [OpenMP][Flang] Lower teams workdistribute do_loop to wsloop. Logic inspired from ivanradanov commit 5682e9ea7fcba64693f7cfdc0f1970fab2d7d4ae --- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 177 +++++++++++++++--- .../OpenMP/lower-workdistribute-doloop.mlir | 28 +++ .../OpenMP/lower-workdistribute-fission.mlir | 22 ++- 3 files changed, 193 insertions(+), 34 deletions(-) create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index f799202be2645..de208a8190650 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -6,18 +6,22 @@ // //===----------------------------------------------------------------------===// // -// This file implements the lowering of omp.workdistribute. +// This file implements the lowering and optimisations of omp.workdistribute. 
// //===----------------------------------------------------------------------===// +#include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Dialect/FIRDialect.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Optimizer/HLFIR/Passes.h" +#include "flang/Optimizer/OpenMP/Utils.h" +#include "mlir/Analysis/SliceAnalysis.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Value.h" +#include "mlir/Transforms/DialectConversion.h" #include "mlir/Transforms/GreedyPatternRewriteDriver.h" #include #include @@ -29,6 +33,7 @@ #include #include #include +#include "mlir/Transforms/RegionUtils.h" #include #include @@ -87,25 +92,6 @@ static bool shouldParallelize(Operation *op) { return false; } -struct WorkdistributeToSingle : public OpRewritePattern { - using OpRewritePattern::OpRewritePattern; - LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, - PatternRewriter &rewriter) const override { - auto workdistributeOp = getPerfectlyNested(teamsOp); - if (!workdistributeOp) { - LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); - return failure(); - } - - Block *workdistributeBlock = &workdistributeOp.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); - rewriter.eraseOp(teamsOp); - workdistributeOp.emitWarning("unable to parallelize coexecute"); - return success(); - } -}; - /// If B() and D() are parallelizable, /// /// omp.teams { @@ -210,22 +196,161 @@ struct FissionWorkdistribute } }; +static void +genLoopNestClauseOps(mlir::Location loc, + mlir::PatternRewriter &rewriter, + fir::DoLoopOp loop, + mlir::omp::LoopNestOperands &loopNestClauseOps) { + assert(loopNestClauseOps.loopLowerBounds.empty() && + "Loop nest bounds were already emitted!"); + 
loopNestClauseOps.loopLowerBounds.push_back(loop.getLowerBound()); + loopNestClauseOps.loopUpperBounds.push_back(loop.getUpperBound()); + loopNestClauseOps.loopSteps.push_back(loop.getStep()); + loopNestClauseOps.loopInclusive = rewriter.getUnitAttr(); +} + +static void +genWsLoopOp(mlir::PatternRewriter &rewriter, + fir::DoLoopOp doLoop, + const mlir::omp::LoopNestOperands &clauseOps) { + + auto wsloopOp = rewriter.create(doLoop.getLoc()); + rewriter.createBlock(&wsloopOp.getRegion()); + + auto loopNestOp = + rewriter.create(doLoop.getLoc(), clauseOps); + + // Clone the loop's body inside the loop nest construct using the + // mapped values. + rewriter.cloneRegionBefore(doLoop.getRegion(), loopNestOp.getRegion(), + loopNestOp.getRegion().begin()); + Block *clonedBlock = &loopNestOp.getRegion().back(); + mlir::Operation *terminatorOp = clonedBlock->getTerminator(); + + // Erase fir.result op of do loop and create yield op. + if (auto resultOp = dyn_cast(terminatorOp)) { + rewriter.setInsertionPoint(terminatorOp); + rewriter.create(doLoop->getLoc()); + rewriter.eraseOp(terminatorOp); + } + return; +} + +/// If fir.do_loop id present inside teams workdistribute +/// +/// omp.teams { +/// omp.workdistribute { +/// fir.do_loop unoredered { +/// ... +/// } +/// } +/// } +/// +/// Then, its lowered to +/// +/// omp.teams { +/// omp.workdistribute { +/// omp.parallel { +/// omp.wsloop { +/// omp.loop_nest +/// ... 
+/// } +/// } +/// } +/// } +/// } + +struct TeamsWorkdistributeLowering : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const override { + auto teamsLoc = teamsOp->getLoc(); + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + assert(teamsOp.getReductionVars().empty()); + + auto doLoop = getPerfectlyNested(workdistributeOp); + if (doLoop && shouldParallelize(doLoop)) { + + auto parallelOp = rewriter.create(teamsLoc); + rewriter.createBlock(¶llelOp.getRegion()); + rewriter.setInsertionPoint(rewriter.create(doLoop.getLoc())); + + mlir::omp::LoopNestOperands loopNestClauseOps; + genLoopNestClauseOps(doLoop.getLoc(), rewriter, doLoop, + loopNestClauseOps); + + genWsLoopOp(rewriter, doLoop, loopNestClauseOps); + rewriter.setInsertionPoint(doLoop); + rewriter.eraseOp(doLoop); + return success(); + } + return failure(); + } +}; + + +/// If A() and B () are present inside teams workdistribute +/// +/// omp.teams { +/// omp.workdistribute { +/// A() +/// B() +/// } +/// } +/// +/// Then, its lowered to +/// +/// A() +/// B() +/// + +struct TeamsWorkdistributeToSingle : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + return success(); + } +}; + class LowerWorkdistributePass : public flangomp::impl::LowerWorkdistributeBase { 
public: void runOnOperation() override { MLIRContext &context = getContext(); - RewritePatternSet patterns(&context); GreedyRewriteConfig config; // prevent the pattern driver form merging blocks config.setRegionSimplificationLevel( GreedySimplifyRegionLevel::Disabled); - - patterns.insert(&context); + Operation *op = getOperation(); - if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { - emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); - signalPassFailure(); + { + RewritePatternSet patterns(&context); + patterns.insert(&context); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } + } + { + RewritePatternSet patterns(&context); + patterns.insert(&context); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } } } }; diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir new file mode 100644 index 0000000000000..666bdb3ced647 --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir @@ -0,0 +1,28 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @x({{.*}}) +// CHECK: %[[VAL_0:.*]] = arith.constant 0 : index +// CHECK: omp.parallel { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_1:.*]]) : index = (%[[ARG0:.*]]) to (%[[ARG1:.*]]) inclusive step (%[[ARG2:.*]]) { +// CHECK: fir.store %[[VAL_0]] to %[[ARG4:.*]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } +// CHECK: omp.terminator +// CHECK: } +// CHECK: return +// CHECK: } +func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref) { + omp.teams { + omp.workdistribute { + fir.do_loop %iv = %lb to %ub step %step unordered { + %zero = arith.constant 0 : index + fir.store %zero to %addr : !fir.ref + } + 
omp.terminator + } + omp.terminator + } + return +} \ No newline at end of file diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir index ea03a10dd3d44..cf50d135d01ec 100644 --- a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir @@ -6,20 +6,26 @@ // CHECK: %[[VAL_2:.*]] = arith.constant 9 : index // CHECK: %[[VAL_3:.*]] = arith.constant 5.000000e+00 : f32 // CHECK: fir.store %[[VAL_3]] to %[[ARG2:.*]] : !fir.ref -// CHECK: fir.do_loop %[[VAL_4:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] unordered { -// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref -// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: omp.parallel { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_4:.*]]) : index = (%[[VAL_0]]) to (%[[VAL_2]]) inclusive step (%[[VAL_1]]) { +// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref +// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } +// CHECK: omp.terminator // CHECK: } // CHECK: fir.call @regular_side_effect_func(%[[ARG2:.*]]) : (!fir.ref) -> () // CHECK: fir.call @my_fir_parallel_runtime_func(%[[ARG3:.*]]) : (!fir.ref) -> () // CHECK: fir.do_loop %[[VAL_8:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] { -// CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_8]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0]], %[[VAL_8]] : (!fir.ref>, index) 
-> !fir.ref // CHECK: fir.store %[[VAL_3]] to %[[VAL_9]] : !fir.ref // CHECK: } -// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2:.*]] : !fir.ref -// CHECK: fir.store %[[VAL_10]] to %[[ARG3:.*]] : !fir.ref +// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2]] : !fir.ref +// CHECK: fir.store %[[VAL_10]] to %[[ARG3]] : !fir.ref // CHECK: return // CHECK: } module { >From df65bd53111948abf6f9c2e1e0b8e27aa5e01946 Mon Sep 17 00:00:00 2001 From: skc7 Date: Mon, 19 May 2025 15:33:53 +0530 Subject: [PATCH 11/11] clang format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 18 +-- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 108 +++++++++--------- flang/lib/Parser/openmp-parsers.cpp | 6 +- .../OpenMP/lower-workdistribute-doloop.mlir | 2 +- 4 files changed, 67 insertions(+), 67 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 42d04bceddb12..ebf0710ab4feb 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2670,14 +2670,15 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } -static mlir::omp::WorkdistributeOp -genWorkdistributeOp(lower::AbstractConverter &converter, lower::SymMap &symTable, - semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - mlir::Location loc, const ConstructQueue &queue, - ConstructQueue::const_iterator item) { +static mlir::omp::WorkdistributeOp genWorkdistributeOp( + lower::AbstractConverter &converter, lower::SymMap &symTable, + semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, + mlir::Location loc, const ConstructQueue &queue, + ConstructQueue::const_iterator item) { return genOpWithBody( - OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, - llvm::omp::Directive::OMPD_workdistribute), queue, item); + OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, + llvm::omp::Directive::OMPD_workdistribute), + queue, item); } 
//===----------------------------------------------------------------------===// @@ -3946,7 +3947,8 @@ static void genOMPDispatch(lower::AbstractConverter &converter, llvm::omp::getOpenMPDirectiveName(dir, version) + ")"); } case llvm::omp::Directive::OMPD_workdistribute: - newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, item); + newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, + item); break; case llvm::omp::Directive::OMPD_workshare: newOp = genWorkshareOp(converter, symTable, stmtCtx, semaCtx, eval, loc, diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index de208a8190650..f75d4d1988fd2 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -14,15 +14,16 @@ #include "flang/Optimizer/Dialect/FIRDialect.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" -#include "flang/Optimizer/Transforms/Passes.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Utils.h" +#include "flang/Optimizer/Transforms/Passes.h" #include "mlir/Analysis/SliceAnalysis.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Value.h" #include "mlir/Transforms/DialectConversion.h" #include "mlir/Transforms/GreedyPatternRewriteDriver.h" +#include "mlir/Transforms/RegionUtils.h" #include #include #include @@ -33,7 +34,6 @@ #include #include #include -#include "mlir/Transforms/RegionUtils.h" #include #include @@ -66,30 +66,30 @@ static T getPerfectlyNested(Operation *op) { /// This is the single source of truth about whether we should parallelize an /// operation nested in an omp.workdistribute region. 
static bool shouldParallelize(Operation *op) { - // Currently we cannot parallelize operations with results that have uses - if (llvm::any_of(op->getResults(), - [](OpResult v) -> bool { return !v.use_empty(); })) + // Currently we cannot parallelize operations with results that have uses + if (llvm::any_of(op->getResults(), + [](OpResult v) -> bool { return !v.use_empty(); })) + return false; + // We will parallelize unordered loops - these come from array syntax + if (auto loop = dyn_cast(op)) { + auto unordered = loop.getUnordered(); + if (!unordered) return false; - // We will parallelize unordered loops - these come from array syntax - if (auto loop = dyn_cast(op)) { - auto unordered = loop.getUnordered(); - if (!unordered) - return false; - return *unordered; - } - if (auto callOp = dyn_cast(op)) { - auto callee = callOp.getCallee(); - if (!callee) - return false; - auto *func = op->getParentOfType().lookupSymbol(*callee); - // TODO need to insert a check here whether it is a call we can actually - // parallelize currently - if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) - return true; + return *unordered; + } + if (auto callOp = dyn_cast(op)) { + auto callee = callOp.getCallee(); + if (!callee) return false; - } - // We cannot parallise anything else + auto *func = op->getParentOfType().lookupSymbol(*callee); + // TODO need to insert a check here whether it is a call we can actually + // parallelize currently + if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) + return true; return false; + } + // We cannot parallise anything else + return false; } /// If B() and D() are parallelizable, @@ -120,12 +120,10 @@ static bool shouldParallelize(Operation *op) { /// } /// E() -struct FissionWorkdistribute - : public OpRewritePattern { +struct FissionWorkdistribute : public OpRewritePattern { using OpRewritePattern::OpRewritePattern; - LogicalResult - matchAndRewrite(omp::WorkdistributeOp workdistribute, - PatternRewriter &rewriter) 
const override { + LogicalResult matchAndRewrite(omp::WorkdistributeOp workdistribute, + PatternRewriter &rewriter) const override { auto loc = workdistribute->getLoc(); auto teams = dyn_cast(workdistribute->getParentOp()); if (!teams) { @@ -185,7 +183,7 @@ struct FissionWorkdistribute auto newWorkdistribute = rewriter.create(loc); rewriter.create(loc); rewriter.createBlock(&newWorkdistribute.getRegion(), - newWorkdistribute.getRegion().begin(), {}, {}); + newWorkdistribute.getRegion().begin(), {}, {}); auto *cloned = rewriter.clone(*parallelize); rewriter.replaceOp(parallelize, cloned); rewriter.create(loc); @@ -197,8 +195,7 @@ struct FissionWorkdistribute }; static void -genLoopNestClauseOps(mlir::Location loc, - mlir::PatternRewriter &rewriter, +genLoopNestClauseOps(mlir::Location loc, mlir::PatternRewriter &rewriter, fir::DoLoopOp loop, mlir::omp::LoopNestOperands &loopNestClauseOps) { assert(loopNestClauseOps.loopLowerBounds.empty() && @@ -209,10 +206,8 @@ genLoopNestClauseOps(mlir::Location loc, loopNestClauseOps.loopInclusive = rewriter.getUnitAttr(); } -static void -genWsLoopOp(mlir::PatternRewriter &rewriter, - fir::DoLoopOp doLoop, - const mlir::omp::LoopNestOperands &clauseOps) { +static void genWsLoopOp(mlir::PatternRewriter &rewriter, fir::DoLoopOp doLoop, + const mlir::omp::LoopNestOperands &clauseOps) { auto wsloopOp = rewriter.create(doLoop.getLoc()); rewriter.createBlock(&wsloopOp.getRegion()); @@ -236,7 +231,7 @@ genWsLoopOp(mlir::PatternRewriter &rewriter, return; } -/// If fir.do_loop id present inside teams workdistribute +/// If fir.do_loop is present inside teams workdistribute /// /// omp.teams { /// omp.workdistribute { @@ -246,7 +241,7 @@ genWsLoopOp(mlir::PatternRewriter &rewriter, /// } /// } /// -/// Then, its lowered to +/// Then, its lowered to /// /// omp.teams { /// omp.workdistribute { @@ -277,7 +272,8 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { auto parallelOp = rewriter.create(teamsLoc); 
rewriter.createBlock(¶llelOp.getRegion()); - rewriter.setInsertionPoint(rewriter.create(doLoop.getLoc())); + rewriter.setInsertionPoint( + rewriter.create(doLoop.getLoc())); mlir::omp::LoopNestOperands loopNestClauseOps; genLoopNestClauseOps(doLoop.getLoc(), rewriter, doLoop, @@ -292,7 +288,6 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { } }; - /// If A() and B () are present inside teams workdistribute /// /// omp.teams { @@ -311,17 +306,17 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { struct TeamsWorkdistributeToSingle : public OpRewritePattern { using OpRewritePattern::OpRewritePattern; LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, - PatternRewriter &rewriter) const override { - auto workdistributeOp = getPerfectlyNested(teamsOp); - if (!workdistributeOp) { - LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); - return failure(); - } - Block *workdistributeBlock = &workdistributeOp.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); - rewriter.eraseOp(teamsOp); - return success(); + PatternRewriter &rewriter) const override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + return success(); } }; @@ -332,13 +327,13 @@ class LowerWorkdistributePass MLIRContext &context = getContext(); GreedyRewriteConfig config; // prevent the pattern driver form merging blocks - config.setRegionSimplificationLevel( - GreedySimplifyRegionLevel::Disabled); - + config.setRegionSimplificationLevel(GreedySimplifyRegionLevel::Disabled); + Operation *op = getOperation(); { RewritePatternSet 
patterns(&context); - patterns.insert(&context); + patterns.insert( + &context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); @@ -346,7 +341,8 @@ class LowerWorkdistributePass } { RewritePatternSet patterns(&context); - patterns.insert(&context); + patterns.insert( + &context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); @@ -354,4 +350,4 @@ class LowerWorkdistributePass } } }; -} +} // namespace diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 5b5ee257edd1f..dc25adfe28c1d 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1344,12 +1344,14 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), + "TARGET TEAMS WORKDISTRIBUTE" >> + pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_teams_workdistribute), + "TEAMS WORKDISTRIBUTE" >> + pure(llvm::omp::Directive::OMPD_teams_workdistribute), "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), "WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_workdistribute)))) diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir index 666bdb3ced647..9fb970246b90c 100644 --- 
a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir @@ -25,4 +25,4 @@ func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref< omp.terminator } return -} \ No newline at end of file +} From flang-commits at lists.llvm.org Mon May 19 08:18:22 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Mon, 19 May 2025 08:18:22 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check for default declare mappers (PR #139593) In-Reply-To: Message-ID: <682b4bbe.050a0220.12b65a.99ca@mx.google.com> https://github.com/TIFitis edited https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Mon May 19 08:19:08 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Mon, 19 May 2025 08:19:08 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #139593) In-Reply-To: Message-ID: <682b4bec.170a0220.26f119.5836@mx.google.com> https://github.com/TIFitis edited https://github.com/llvm/llvm-project/pull/139593 From flang-commits at lists.llvm.org Mon May 19 08:21:53 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Mon, 19 May 2025 08:21:53 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <682b4c91.170a0220.2a9344.9772@mx.google.com> tarunprabhu wrote: > The `FCC_OVERRIDE_OPTIONS` seemed like the most obvious name to me but I am open to other ideas. Thanks Abid. Perhaps `FFC_OVERRIDE_OPTIONS`? It has a similar correspondence to `CCC_OVERRIDE_OPTIONS` as `FCFLAGS` does to `CCFLAGS`. But I don't have a strong opinion on this. 
https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Mon May 19 08:29:12 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Mon, 19 May 2025 08:29:12 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) Message-ID: https://github.com/TIFitis created https://github.com/llvm/llvm-project/pull/140560 The current semantic check is incorrect; this patch fixes it. At most one **'default'** named mapper should be allowed for each derived type, but the current check allows at most one **'default'** named mapper across all derived types. This patch also makes sure that declare mappers follow proper scoping rules for both default and named mappers. Co-authored-by: Raghu Maddhipatla >From a83bd68fdcb613d54c66f8503f522cc2b16a63a2 Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Mon, 12 May 2025 18:41:20 +0100 Subject: [PATCH 1/2] Fix semantic check for default declare mappers. --- flang/lib/Semantics/resolve-names.cpp | 21 ++++++++++++------- .../OpenMP/declare-mapper-symbols.f90 | 18 ++++++++-------- .../Semantics/OpenMP/declare-mapper03.f90 | 6 +----- 3 files changed, 23 insertions(+), 22 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index b2979690f78e7..1fd0ea007319d 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,6 +38,7 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" +#include "llvm/ADT/SmallVector.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1766,14 +1767,6 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses gets processed before // the type has been fully processed. 
BeginDeclTypeSpec(); - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const parser::CharBlock defaultName{"default", 7}; - MakeSymbol( - defaultName, Attrs{}, MiscDetails{MiscDetails::Kind::ConstructName}); - } PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); @@ -1783,6 +1776,18 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; + defaultNames.emplace_back( + type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); + MakeSymbol(defaultNames.back(), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); + } } void OmpVisitor::ProcessReductionSpecifier( diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index b4e03bd1632e5..0dda5b4456987 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -2,23 +2,23 @@ program main !CHECK-LABEL: MainProgram scope: main - implicit none + implicit none - type ty - integer :: x - end type ty - !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) - !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) + type ty + integer :: x + end type ty + !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) + !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) !! Note, symbols come out in their respective scope, but not in declaration order. 
-!CHECK: default: Misc ConstructName !CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x +!CHECK: ty.default: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) -!CHECK: OtherConstruct scope: +!CHECK: OtherConstruct scope: !CHECK: maptwo (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) - + end program main diff --git a/flang/test/Semantics/OpenMP/declare-mapper03.f90 b/flang/test/Semantics/OpenMP/declare-mapper03.f90 index b70b8a67f33e0..84fc3efafb3ad 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper03.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper03.f90 @@ -5,12 +5,8 @@ integer :: y end type t1 -type :: t2 - real :: y, z -end type t2 - !error: 'default' is already declared in this scoping unit !$omp declare mapper(t1::x) map(x, x%y) -!$omp declare mapper(t2::w) map(w, w%y, w%z) +!$omp declare mapper(t1::x) map(x) end >From 19c5f5635dbee97d8b6364b43f31196298a51062 Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Wed, 14 May 2025 20:38:01 +0100 Subject: [PATCH 2/2] Change mapper name field from parser::Name to std::string. 
--- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 6 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 22 ++--- flang/lib/Parser/openmp-parsers.cpp | 22 ++++- flang/lib/Parser/unparse.cpp | 11 ++- flang/lib/Semantics/resolve-names.cpp | 16 +--- flang/test/Lower/OpenMP/declare-mapper.f90 | 95 ++++++++++++++++++- flang/test/Lower/OpenMP/map-mapper.f90 | 4 +- .../Parser/OpenMP/declare-mapper-unparse.f90 | 15 +-- .../Parser/OpenMP/metadirective-dirspec.f90 | 2 +- .../OpenMP/declare-mapper-symbols.f90 | 2 +- 11 files changed, 149 insertions(+), 48 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 254236b510544..c99006f0c1c22 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -3540,7 +3540,7 @@ WRAPPER_CLASS(OmpLocatorList, std::list); struct OmpMapperSpecifier { // Absent mapper-identifier is equivalent to DEFAULT. TUPLE_CLASS_BOILERPLATE(OmpMapperSpecifier); - std::tuple, TypeSpec, Name> t; + std::tuple t; }; // Ref: [4.5:222:1-5], [5.0:305:20-27], [5.1:337:11-19], [5.2:139:18-23], diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index f4876256a378f..82061eed8913a 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1114,9 +1114,9 @@ void ClauseProcessor::processMapObjects( typeSpec = &object.sym()->GetType()->derivedTypeSpec(); if (typeSpec) { - mapperIdName = typeSpec->name().ToString() + ".default"; - mapperIdName = - converter.mangleName(mapperIdName, *typeSpec->GetScope()); + mapperIdName = typeSpec->name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); } } }; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 61bbc709872fd..cfcba0159db8d 100644 --- 
a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2422,8 +2422,10 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, mlir::FlatSymbolRefAttr mapperId; if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) { auto &typeSpec = sym.GetType()->derivedTypeSpec(); - std::string mapperIdName = typeSpec.name().ToString() + ".default"; - mapperIdName = converter.mangleName(mapperIdName, *typeSpec.GetScope()); + std::string mapperIdName = + typeSpec.name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); if (converter.getModuleOp().lookupSymbol(mapperIdName)) mapperId = mlir::FlatSymbolRefAttr::get(&converter.getMLIRContext(), mapperIdName); @@ -4005,24 +4007,16 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, lower::StatementContext stmtCtx; const auto &spec = std::get(declareMapperConstruct.t); - const auto &mapperName{std::get>(spec.t)}; + const auto &mapperName{std::get(spec.t)}; const auto &varType{std::get(spec.t)}; const auto &varName{std::get(spec.t)}; assert(varType.declTypeSpec->category() == semantics::DeclTypeSpec::Category::TypeDerived && "Expected derived type"); - std::string mapperNameStr; - if (mapperName.has_value()) { - mapperNameStr = mapperName->ToString(); - mapperNameStr = - converter.mangleName(mapperNameStr, mapperName->symbol->owner()); - } else { - mapperNameStr = - varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"; - mapperNameStr = converter.mangleName( - mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope()); - } + std::string mapperNameStr = mapperName; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperNameStr)) + mapperNameStr = converter.mangleName(mapperNameStr, sym->owner()); // Save current insertion point before moving to the module scope to create // the DeclareMapperOp diff 
--git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..a1ed584020677 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1389,8 +1389,28 @@ TYPE_PARSER( TYPE_PARSER(sourced(construct( verbatim("DECLARE TARGET"_tok), Parser{}))) +static OmpMapperSpecifier ConstructOmpMapperSpecifier( + std::optional &&mapperName, TypeSpec &&typeSpec, Name &&varName) { + // If a name is present, parse: name ":" typeSpec "::" name + // This matches the syntax: : :: + if (mapperName.has_value() && mapperName->ToString() != "default") { + return OmpMapperSpecifier{ + mapperName->ToString(), std::move(typeSpec), std::move(varName)}; + } + // If the name is missing, use the DerivedTypeSpec name to construct the + // default mapper name. + // This matches the syntax: :: + if (auto *derived = std::get_if(&typeSpec.u)) { + return OmpMapperSpecifier{ + std::get(derived->t).ToString() + ".omp.default.mapper", + std::move(typeSpec), std::move(varName)}; + } + return OmpMapperSpecifier{std::string("omp.default.mapper"), + std::move(typeSpec), std::move(varName)}; +} + // mapper-specifier -TYPE_PARSER(construct( +TYPE_PARSER(applyFunction(ConstructOmpMapperSpecifier, maybe(name / ":" / !":"_tok), typeSpec / "::", name)) // OpenMP 5.2: 5.8.8 Declare Mapper Construct diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..1d68e8d8850fa 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2093,7 +2093,11 @@ class UnparseVisitor { Walk(x.v, ","); } void Unparse(const OmpMapperSpecifier &x) { - Walk(std::get>(x.t), ":"); + const auto &mapperName = std::get(x.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); + Put(":"); + } Walk(std::get(x.t)); Put("::"); Walk(std::get(x.t)); @@ -2796,8 +2800,9 @@ class UnparseVisitor { BeginOpenMP(); Word("!$OMP DECLARE MAPPER ("); const auto &spec{std::get(z.t)}; - if 
(auto mapname{std::get>(spec.t)}) { - Walk(mapname); + const auto &mapperName = std::get(spec.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); Put(":"); } Walk(std::get(spec.t)); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 42297f069499b..322562b06b87f 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1767,7 +1767,9 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses gets processed before // the type has been fully processed. BeginDeclTypeSpec(); - + auto &mapperName{std::get(spec.t)}; + MakeSymbol(parser::CharBlock(mapperName), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); auto &varName{std::get(spec.t)}; @@ -1776,18 +1778,6 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); - - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const auto &type = std::get(spec.t); - static llvm::SmallVector defaultNames; - defaultNames.emplace_back( - type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); - MakeSymbol(defaultNames.back(), Attrs{}, - MiscDetails{MiscDetails::Kind::ConstructName}); - } } void OmpVisitor::ProcessReductionSpecifier( diff --git a/flang/test/Lower/OpenMP/declare-mapper.f90 b/flang/test/Lower/OpenMP/declare-mapper.f90 index 867b850317e66..8a98c68a8d582 100644 --- a/flang/test/Lower/OpenMP/declare-mapper.f90 +++ b/flang/test/Lower/OpenMP/declare-mapper.f90 @@ -5,6 +5,7 @@ ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-2.f90 -o - | FileCheck %t/omp-declare-mapper-2.f90 ! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-3.f90 -o - | FileCheck %t/omp-declare-mapper-3.f90 ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-4.f90 -o - | FileCheck %t/omp-declare-mapper-4.f90 +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-5.f90 -o - | FileCheck %t/omp-declare-mapper-5.f90 !--- omp-declare-mapper-1.f90 subroutine declare_mapper_1 @@ -22,7 +23,7 @@ subroutine declare_mapper_1 end type type(my_type2) :: t real :: x, y(nvals) - !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { + !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.omp\.default\.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { !CHECK: ^bb0(%[[VAL_0:.*]]: !fir.ref<[[MY_TYPE]]>): !CHECK: %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFdeclare_mapper_1Evar"} : (!fir.ref<[[MY_TYPE]]>) -> (!fir.ref<[[MY_TYPE]]>, !fir.ref<[[MY_TYPE]]>) !CHECK: %[[VAL_2:.*]] = hlfir.designate %[[VAL_1]]#0{"values"} {fortran_attrs = #fir.var_attrs} : (!fir.ref<[[MY_TYPE]]>) -> !fir.ref>>> @@ -149,7 +150,7 @@ subroutine declare_mapper_4 integer :: num end type - !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] + !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.omp.default.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] !$omp declare mapper (my_type :: var) map (var%num) type(my_type) :: a @@ -171,3 +172,93 @@ subroutine declare_mapper_4 a%num = 40 !$omp end target end subroutine declare_mapper_4 + +!--- omp-declare-mapper-5.f90 +program declare_mapper_5 + implicit none + + type :: mytype + integer :: x, y + end type + + !CHECK: omp.declare_mapper 
@[[INNER_MAPPER_NAMED:_QQFFuse_innermy_mapper]] : [[MY_TYPE:!fir\.type<_QFTmytype\{x:i32,y:i32\}>]] + !CHECK: omp.declare_mapper @[[INNER_MAPPER_DEFAULT:_QQFFuse_innermytype.omp.default.mapper]] : [[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_NAMED:_QQFmy_mapper]] : [[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_DEFAULT:_QQFmytype.omp.default.mapper]] : [[MY_TYPE]] + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + +contains + subroutine use_outer() + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) 
capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine + + subroutine use_inner() + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine +end program declare_mapper_5 diff --git 
a/flang/test/Lower/OpenMP/map-mapper.f90 b/flang/test/Lower/OpenMP/map-mapper.f90 index a511110cb5d18..91564bfc7bc46 100644 --- a/flang/test/Lower/OpenMP/map-mapper.f90 +++ b/flang/test/Lower/OpenMP/map-mapper.f90 @@ -8,7 +8,7 @@ program p !$omp declare mapper(xx : t1 :: nn) map(to: nn, nn%x) !$omp declare mapper(t1 :: nn) map(from: nn) - !CHECK-LABEL: omp.declare_mapper @_QQFt1.default : !fir.type<_QFTt1{x:!fir.array<256xi32>}> + !CHECK-LABEL: omp.declare_mapper @_QQFt1.omp.default.mapper : !fir.type<_QFTt1{x:!fir.array<256xi32>}> !CHECK-LABEL: omp.declare_mapper @_QQFxx : !fir.type<_QFTt1{x:!fir.array<256xi32>}> type(t1) :: a, b @@ -20,7 +20,7 @@ program p end do !$omp end target - !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.default) -> {{.*}} {name = "b"} + !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.omp.default.mapper) -> {{.*}} {name = "b"} !CHECK: omp.target map_entries(%[[MAP_B]] -> %{{.*}}, %{{.*}} -> %{{.*}} : {{.*}}, {{.*}}) { !$omp target map(mapper(default) : b) do i = 1, n diff --git a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 index 407bfd29153fa..30d75d02736f3 100644 --- a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 +++ b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 @@ -7,36 +7,37 @@ program main type ty integer :: x end type ty - + !CHECK: !$OMP DECLARE MAPPER (mymapper:ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) !PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier -!PARSE-TREE: Name = 'mymapper' +!PARSE-TREE: string = 'mymapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> 
Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' -!PARSE-TREE: Name = 'x' +!PARSE-TREE: Name = 'x' !CHECK: !$OMP DECLARE MAPPER (ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(ty :: mapped) map(mapped, mapped%x) - + !PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier +!PARSE-TREE: string = 'ty.omp.default.mapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' !PARSE-TREE: Name = 'x' - + end program main !CHECK-LABEL: end program main diff --git a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 index b6c9c58948fec..baa8b2e08c539 100644 --- a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 +++ b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 @@ -78,7 +78,7 @@ subroutine f02 !PARSE-TREE: | | OmpDirectiveSpecification !PARSE-TREE: | | | OmpDirectiveName -> llvm::omp::Directive = declare mapper !PARSE-TREE: | | | OmpArgumentList -> OmpArgument -> OmpMapperSpecifier -!PARSE-TREE: | | | | Name = 'mymapper' +!PARSE-TREE: | | | | string = 'mymapper' !PARSE-TREE: | | | | TypeSpec -> IntrinsicTypeSpec -> IntegerTypeSpec -> !PARSE-TREE: | | | | Name = 'v' !PARSE-TREE: | | | OmpClauseList -> OmpClause -> Map -> OmpMapClause diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index 0dda5b4456987..06f41ab8ce76f 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -13,7 +13,7 @@ program main !! Note, symbols come out in their respective scope, but not in declaration order. 
 !CHECK: mymapper: Misc ConstructName
 !CHECK: ty: DerivedType components: x
-!CHECK: ty.default: Misc ConstructName
+!CHECK: ty.omp.default.mapper: Misc ConstructName
 !CHECK: DerivedType scope: ty
 !CHECK: OtherConstruct scope:
 !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty)

From flang-commits at lists.llvm.org Mon May 19 08:29:47 2025
From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits)
Date: Mon, 19 May 2025 08:29:47 -0700 (PDT)
Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #139593)
In-Reply-To:
Message-ID: <682b4e6b.170a0220.ca434.6b9f@mx.google.com>

TIFitis wrote:

Moved PR to https://github.com/llvm/llvm-project/pull/140560 to create a PR stack.

https://github.com/llvm/llvm-project/pull/139593

From flang-commits at lists.llvm.org Mon May 19 08:29:47 2025
From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits)
Date: Mon, 19 May 2025 08:29:47 -0700 (PDT)
Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #139593)
In-Reply-To:
Message-ID: <682b4e6b.170a0220.17ac84.779e@mx.google.com>

https://github.com/TIFitis closed https://github.com/llvm/llvm-project/pull/139593

From flang-commits at lists.llvm.org Mon May 19 08:29:47 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 19 May 2025 08:29:47 -0700 (PDT)
Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560)
In-Reply-To:
Message-ID: <682b4e6b.170a0220.73329.6f41@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-semantics

Author: Akash Banerjee (TIFitis)
Changes

The current semantic check is incorrect; this patch fixes it.

Up to one **'default'** named mapper should be allowed for each derived type. The current semantic check allows only one **'default'** named mapper across all derived types.

This patch also makes sure that declare mappers follow proper scoping rules for both default and named mappers.

Co-authored-by: Raghu Maddhipatla <Raghu.Maddhipatla@amd.com>

---

Patch is 20.07 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140560.diff

12 Files Affected:

- (modified) flang/include/flang/Parser/parse-tree.h (+1-1)
- (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+3-3)
- (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+8-14)
- (modified) flang/lib/Parser/openmp-parsers.cpp (+21-1)
- (modified) flang/lib/Parser/unparse.cpp (+8-3)
- (modified) flang/lib/Semantics/resolve-names.cpp (+4-9)
- (modified) flang/test/Lower/OpenMP/declare-mapper.f90 (+93-2)
- (modified) flang/test/Lower/OpenMP/map-mapper.f90 (+2-2)
- (modified) flang/test/Parser/OpenMP/declare-mapper-unparse.f90 (+8-7)
- (modified) flang/test/Parser/OpenMP/metadirective-dirspec.f90 (+1-1)
- (modified) flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 (+9-9)
- (modified) flang/test/Semantics/OpenMP/declare-mapper03.f90 (+1-5)

``````````diff
diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h
index 254236b510544..c99006f0c1c22 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3540,7 +3540,7 @@ WRAPPER_CLASS(OmpLocatorList, std::list);
 struct OmpMapperSpecifier {
   // Absent mapper-identifier is equivalent to DEFAULT.
   TUPLE_CLASS_BOILERPLATE(OmpMapperSpecifier);
-  std::tuple<std::optional<Name>, TypeSpec, Name> t;
+  std::tuple<std::string, TypeSpec, Name> t;
 };
 
 // Ref: [4.5:222:1-5], [5.0:305:20-27], [5.1:337:11-19], [5.2:139:18-23],
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index f4876256a378f..82061eed8913a 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -1114,9 +1114,9 @@ void ClauseProcessor::processMapObjects(
           typeSpec = &object.sym()->GetType()->derivedTypeSpec();
 
         if (typeSpec) {
-          mapperIdName = typeSpec->name().ToString() + ".default";
-          mapperIdName =
-              converter.mangleName(mapperIdName, *typeSpec->GetScope());
+          mapperIdName = typeSpec->name().ToString() + ".omp.default.mapper";
+          if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName))
+            mapperIdName = converter.mangleName(mapperIdName, sym->owner());
         }
       }
     };
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 61bbc709872fd..cfcba0159db8d 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2422,8 +2422,10 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
       mlir::FlatSymbolRefAttr mapperId;
       if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) {
         auto &typeSpec = sym.GetType()->derivedTypeSpec();
-        std::string mapperIdName = typeSpec.name().ToString() + ".default";
-        mapperIdName = converter.mangleName(mapperIdName, *typeSpec.GetScope());
+        std::string mapperIdName =
+            typeSpec.name().ToString() + ".omp.default.mapper";
+        if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName))
+          mapperIdName = converter.mangleName(mapperIdName, sym->owner());
         if (converter.getModuleOp().lookupSymbol(mapperIdName))
           mapperId = mlir::FlatSymbolRefAttr::get(&converter.getMLIRContext(),
                                                   mapperIdName);
@@ -4005,24 +4007,16 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
   lower::StatementContext stmtCtx;
   const
auto &spec = std::get(declareMapperConstruct.t); - const auto &mapperName{std::get>(spec.t)}; + const auto &mapperName{std::get(spec.t)}; const auto &varType{std::get(spec.t)}; const auto &varName{std::get(spec.t)}; assert(varType.declTypeSpec->category() == semantics::DeclTypeSpec::Category::TypeDerived && "Expected derived type"); - std::string mapperNameStr; - if (mapperName.has_value()) { - mapperNameStr = mapperName->ToString(); - mapperNameStr = - converter.mangleName(mapperNameStr, mapperName->symbol->owner()); - } else { - mapperNameStr = - varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"; - mapperNameStr = converter.mangleName( - mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope()); - } + std::string mapperNameStr = mapperName; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperNameStr)) + mapperNameStr = converter.mangleName(mapperNameStr, sym->owner()); // Save current insertion point before moving to the module scope to create // the DeclareMapperOp diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..a1ed584020677 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1389,8 +1389,28 @@ TYPE_PARSER( TYPE_PARSER(sourced(construct( verbatim("DECLARE TARGET"_tok), Parser{}))) +static OmpMapperSpecifier ConstructOmpMapperSpecifier( + std::optional &&mapperName, TypeSpec &&typeSpec, Name &&varName) { + // If a name is present, parse: name ":" typeSpec "::" name + // This matches the syntax: : :: + if (mapperName.has_value() && mapperName->ToString() != "default") { + return OmpMapperSpecifier{ + mapperName->ToString(), std::move(typeSpec), std::move(varName)}; + } + // If the name is missing, use the DerivedTypeSpec name to construct the + // default mapper name. 
+ // This matches the syntax: :: + if (auto *derived = std::get_if(&typeSpec.u)) { + return OmpMapperSpecifier{ + std::get(derived->t).ToString() + ".omp.default.mapper", + std::move(typeSpec), std::move(varName)}; + } + return OmpMapperSpecifier{std::string("omp.default.mapper"), + std::move(typeSpec), std::move(varName)}; +} + // mapper-specifier -TYPE_PARSER(construct( +TYPE_PARSER(applyFunction(ConstructOmpMapperSpecifier, maybe(name / ":" / !":"_tok), typeSpec / "::", name)) // OpenMP 5.2: 5.8.8 Declare Mapper Construct diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..1d68e8d8850fa 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2093,7 +2093,11 @@ class UnparseVisitor { Walk(x.v, ","); } void Unparse(const OmpMapperSpecifier &x) { - Walk(std::get>(x.t), ":"); + const auto &mapperName = std::get(x.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); + Put(":"); + } Walk(std::get(x.t)); Put("::"); Walk(std::get(x.t)); @@ -2796,8 +2800,9 @@ class UnparseVisitor { BeginOpenMP(); Word("!$OMP DECLARE MAPPER ("); const auto &spec{std::get(z.t)}; - if (auto mapname{std::get>(spec.t)}) { - Walk(mapname); + const auto &mapperName = std::get(spec.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); Put(":"); } Walk(std::get(spec.t)); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index bdafc03ad2c05..322562b06b87f 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,6 +38,7 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" +#include "llvm/ADT/SmallVector.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1766,15 +1767,9 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses 
gets processed before // the type has been fully processed. BeginDeclTypeSpec(); - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const parser::CharBlock defaultName{"default", 7}; - MakeSymbol( - defaultName, Attrs{}, MiscDetails{MiscDetails::Kind::ConstructName}); - } - + auto &mapperName{std::get(spec.t)}; + MakeSymbol(parser::CharBlock(mapperName), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); auto &varName{std::get(spec.t)}; diff --git a/flang/test/Lower/OpenMP/declare-mapper.f90 b/flang/test/Lower/OpenMP/declare-mapper.f90 index 867b850317e66..8a98c68a8d582 100644 --- a/flang/test/Lower/OpenMP/declare-mapper.f90 +++ b/flang/test/Lower/OpenMP/declare-mapper.f90 @@ -5,6 +5,7 @@ ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-2.f90 -o - | FileCheck %t/omp-declare-mapper-2.f90 ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-3.f90 -o - | FileCheck %t/omp-declare-mapper-3.f90 ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-4.f90 -o - | FileCheck %t/omp-declare-mapper-4.f90 +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-5.f90 -o - | FileCheck %t/omp-declare-mapper-5.f90 !--- omp-declare-mapper-1.f90 subroutine declare_mapper_1 @@ -22,7 +23,7 @@ subroutine declare_mapper_1 end type type(my_type2) :: t real :: x, y(nvals) - !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { + !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.omp\.default\.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { !CHECK: ^bb0(%[[VAL_0:.*]]: !fir.ref<[[MY_TYPE]]>): !CHECK: %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFdeclare_mapper_1Evar"} : (!fir.ref<[[MY_TYPE]]>) -> (!fir.ref<[[MY_TYPE]]>, !fir.ref<[[MY_TYPE]]>) !CHECK: %[[VAL_2:.*]] = hlfir.designate %[[VAL_1]]#0{"values"} {fortran_attrs = #fir.var_attrs} : (!fir.ref<[[MY_TYPE]]>) -> !fir.ref>>> @@ -149,7 +150,7 @@ subroutine declare_mapper_4 integer :: num end type - !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] + !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.omp.default.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] !$omp declare mapper (my_type :: var) map (var%num) type(my_type) :: a @@ -171,3 +172,93 @@ subroutine declare_mapper_4 a%num = 40 !$omp end target end subroutine declare_mapper_4 + +!--- omp-declare-mapper-5.f90 +program declare_mapper_5 + implicit none + + type :: mytype + integer :: x, y + end type + + !CHECK: omp.declare_mapper @[[INNER_MAPPER_NAMED:_QQFFuse_innermy_mapper]] : [[MY_TYPE:!fir\.type<_QFTmytype\{x:i32,y:i32\}>]] + !CHECK: omp.declare_mapper @[[INNER_MAPPER_DEFAULT:_QQFFuse_innermytype.omp.default.mapper]] : [[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_NAMED:_QQFmy_mapper]] : 
[[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_DEFAULT:_QQFmytype.omp.default.mapper]] : [[MY_TYPE]] + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + +contains + subroutine use_outer() + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) 
mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine + + subroutine use_inner() + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine +end program declare_mapper_5 diff --git a/flang/test/Lower/OpenMP/map-mapper.f90 b/flang/test/Lower/OpenMP/map-mapper.f90 index a511110cb5d18..91564bfc7bc46 100644 --- a/flang/test/Lower/OpenMP/map-mapper.f90 +++ b/flang/test/Lower/OpenMP/map-mapper.f90 @@ -8,7 +8,7 @@ program p !$omp declare mapper(xx : t1 :: nn) map(to: nn, nn%x) !$omp 
declare mapper(t1 :: nn) map(from: nn) - !CHECK-LABEL: omp.declare_mapper @_QQFt1.default : !fir.type<_QFTt1{x:!fir.array<256xi32>}> + !CHECK-LABEL: omp.declare_mapper @_QQFt1.omp.default.mapper : !fir.type<_QFTt1{x:!fir.array<256xi32>}> !CHECK-LABEL: omp.declare_mapper @_QQFxx : !fir.type<_QFTt1{x:!fir.array<256xi32>}> type(t1) :: a, b @@ -20,7 +20,7 @@ program p end do !$omp end target - !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.default) -> {{.*}} {name = "b"} + !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.omp.default.mapper) -> {{.*}} {name = "b"} !CHECK: omp.target map_entries(%[[MAP_B]] -> %{{.*}}, %{{.*}} -> %{{.*}} : {{.*}}, {{.*}}) { !$omp target map(mapper(default) : b) do i = 1, n diff --git a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 index 407bfd29153fa..30d75d02736f3 100644 --- a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 +++ b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 @@ -7,36 +7,37 @@ program main type ty integer :: x end type ty - + !CHECK: !$OMP DECLARE MAPPER (mymapper:ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) !PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier -!PARSE-TREE: Name = 'mymapper' +!PARSE-TREE: string = 'mymapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' -!PARSE-TREE: Name = 'x' +!PARSE-TREE: Name = 'x' !CHECK: !$OMP DECLARE MAPPER (ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(ty :: mapped) map(mapped, mapped%x) - + 
!PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier +!PARSE-TREE: string = 'ty.omp.default.mapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' !PARSE-TREE: Name = 'x' - + end program main !CHECK-LABEL: end program main diff --git a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 index b6c9c58948fec..baa8b2e08c539 100644 --- a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 +++ b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 @@ -78,7 +78,7 @@ subroutine f02 !PARSE-TREE: | | OmpDirectiveSpecification !PARSE-TREE: | | | OmpDirectiveName -> llvm::omp::Directive = declare mapper !PARSE-TREE: | | | OmpArgumentList -> OmpArgument -> OmpMapperSpecifier -!PARSE-TREE: | | | | Name = 'mymapper' +!PARSE-TREE: | | | | string = 'mymapper' !PARSE-TREE: | | | | TypeSpec -> IntrinsicTypeSpec -> IntegerTypeSpec -> !PARSE-TREE: | | | | Name = 'v' !PARSE-TREE: | | | OmpClauseList -> OmpClause -> Map -> OmpMapClause diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index b4e03bd1632e5..06f41ab8ce76f 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -2,23 +2,23 @@ program main !CHECK-LABEL: MainProgram scope: main - implicit none + implicit none - type ty - integer :: x - end type ty - !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) - !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) + type ty + integer :: x + end type ty + !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) + !$omp declare mapper(ty :: 
maptwo) map(maptwo, maptwo%x) !! Note, symbols come out in their respective scope, but not in declaration order. -!CHECK: default: Misc ConstructName !CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x +!CHECK: ty.omp.default.mapper: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) -!CHECK: OtherConstruct scope: +!CHECK: OtherConstruct scope: !CHECK: maptwo (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) - + end program main diff --git a/flang/test/Semantics/OpenMP/declare-mapper03.f90 b/flang/test/Semantics/OpenMP/declare-mapper03.f90 index b70b8a67f33e0..84fc3efafb3ad 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper03.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper03.f90 @@ -5,12 +5,8 @@ integer :: y end type t1 -type :: t2 - real :: y, z -end type t2 - !error: 'default' is already declared in this scoping unit !$omp declare mapper(t1::x) map(x, x%y) -!$omp declare mapp... [truncated] ``````````
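
The naming rule this patch introduces can be sketched in standalone C++. This is only an illustration under stated assumptions: `mapperSymbolName` and `isDefaultMapperName` are hypothetical helpers, not flang functions, and plain `std::string` stands in for flang's `Name` and `TypeSpec`. The first helper follows the branches of `ConstructOmpMapperSpecifier` in the diff above; the second mirrors the substring test the unparser uses to decide whether to print a mapper name.

```cpp
#include <optional>
#include <string>

// Hypothetical helper (not part of flang): mirrors the branches of
// ConstructOmpMapperSpecifier shown in the patch, with std::string standing
// in for the parser's Name and DerivedTypeSpec types.
std::string mapperSymbolName(const std::optional<std::string> &mapperName,
                             const std::optional<std::string> &derivedTypeName) {
  // An explicit name other than "default" is kept unchanged.
  if (mapperName && *mapperName != "default")
    return *mapperName;
  // A missing name (or "default") is keyed on the derived type, so each
  // derived type gets its own default mapper symbol.
  if (derivedTypeName)
    return *derivedTypeName + ".omp.default.mapper";
  // Fallback when the type is not a derived type.
  return "omp.default.mapper";
}

// Hypothetical helper: the same substring test the patched unparser applies
// before deciding to omit the "name :" prefix.
bool isDefaultMapperName(const std::string &name) {
  return name.find("omp.default.mapper") != std::string::npos;
}
```

Because the derived type's name is part of the symbol, default mappers for two different types in the same scoping unit no longer collide on a single `default` symbol, which appears to be what the old check rejected.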
https://github.com/llvm/llvm-project/pull/140560

From flang-commits at lists.llvm.org Mon May 19 08:29:48 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 19 May 2025 08:29:48 -0700 (PDT)
Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560)
In-Reply-To:
Message-ID: <682b4e6c.630a0220.e5cdf.c042@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-parser

Author: Akash Banerjee (TIFitis)
Changes

The current semantic check is incorrect; this patch fixes it.

Up to one **'default'** named mapper should be allowed for each derived type. The current semantic check allows only one **'default'** named mapper across all derived types.

This patch also makes sure that declare mappers follow proper scoping rules for both default and named mappers.

Co-authored-by: Raghu Maddhipatla <Raghu.Maddhipatla@amd.com>

---

Patch is 20.07 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140560.diff

12 Files Affected:

- (modified) flang/include/flang/Parser/parse-tree.h (+1-1)
- (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+3-3)
- (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+8-14)
- (modified) flang/lib/Parser/openmp-parsers.cpp (+21-1)
- (modified) flang/lib/Parser/unparse.cpp (+8-3)
- (modified) flang/lib/Semantics/resolve-names.cpp (+4-9)
- (modified) flang/test/Lower/OpenMP/declare-mapper.f90 (+93-2)
- (modified) flang/test/Lower/OpenMP/map-mapper.f90 (+2-2)
- (modified) flang/test/Parser/OpenMP/declare-mapper-unparse.f90 (+8-7)
- (modified) flang/test/Parser/OpenMP/metadirective-dirspec.f90 (+1-1)
- (modified) flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 (+9-9)
- (modified) flang/test/Semantics/OpenMP/declare-mapper03.f90 (+1-5)

``````````diff
diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h
index 254236b510544..c99006f0c1c22 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3540,7 +3540,7 @@ WRAPPER_CLASS(OmpLocatorList, std::list);
 struct OmpMapperSpecifier {
   // Absent mapper-identifier is equivalent to DEFAULT.
TUPLE_CLASS_BOILERPLATE(OmpMapperSpecifier); - std::tuple, TypeSpec, Name> t; + std::tuple t; }; // Ref: [4.5:222:1-5], [5.0:305:20-27], [5.1:337:11-19], [5.2:139:18-23], diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index f4876256a378f..82061eed8913a 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1114,9 +1114,9 @@ void ClauseProcessor::processMapObjects( typeSpec = &object.sym()->GetType()->derivedTypeSpec(); if (typeSpec) { - mapperIdName = typeSpec->name().ToString() + ".default"; - mapperIdName = - converter.mangleName(mapperIdName, *typeSpec->GetScope()); + mapperIdName = typeSpec->name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); } } }; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 61bbc709872fd..cfcba0159db8d 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2422,8 +2422,10 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, mlir::FlatSymbolRefAttr mapperId; if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) { auto &typeSpec = sym.GetType()->derivedTypeSpec(); - std::string mapperIdName = typeSpec.name().ToString() + ".default"; - mapperIdName = converter.mangleName(mapperIdName, *typeSpec.GetScope()); + std::string mapperIdName = + typeSpec.name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); if (converter.getModuleOp().lookupSymbol(mapperIdName)) mapperId = mlir::FlatSymbolRefAttr::get(&converter.getMLIRContext(), mapperIdName); @@ -4005,24 +4007,16 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, lower::StatementContext stmtCtx; const 
auto &spec = std::get(declareMapperConstruct.t); - const auto &mapperName{std::get>(spec.t)}; + const auto &mapperName{std::get(spec.t)}; const auto &varType{std::get(spec.t)}; const auto &varName{std::get(spec.t)}; assert(varType.declTypeSpec->category() == semantics::DeclTypeSpec::Category::TypeDerived && "Expected derived type"); - std::string mapperNameStr; - if (mapperName.has_value()) { - mapperNameStr = mapperName->ToString(); - mapperNameStr = - converter.mangleName(mapperNameStr, mapperName->symbol->owner()); - } else { - mapperNameStr = - varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"; - mapperNameStr = converter.mangleName( - mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope()); - } + std::string mapperNameStr = mapperName; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperNameStr)) + mapperNameStr = converter.mangleName(mapperNameStr, sym->owner()); // Save current insertion point before moving to the module scope to create // the DeclareMapperOp diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..a1ed584020677 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1389,8 +1389,28 @@ TYPE_PARSER( TYPE_PARSER(sourced(construct( verbatim("DECLARE TARGET"_tok), Parser{}))) +static OmpMapperSpecifier ConstructOmpMapperSpecifier( + std::optional &&mapperName, TypeSpec &&typeSpec, Name &&varName) { + // If a name is present, parse: name ":" typeSpec "::" name + // This matches the syntax: : :: + if (mapperName.has_value() && mapperName->ToString() != "default") { + return OmpMapperSpecifier{ + mapperName->ToString(), std::move(typeSpec), std::move(varName)}; + } + // If the name is missing, use the DerivedTypeSpec name to construct the + // default mapper name. 
+ // This matches the syntax: :: + if (auto *derived = std::get_if(&typeSpec.u)) { + return OmpMapperSpecifier{ + std::get(derived->t).ToString() + ".omp.default.mapper", + std::move(typeSpec), std::move(varName)}; + } + return OmpMapperSpecifier{std::string("omp.default.mapper"), + std::move(typeSpec), std::move(varName)}; +} + // mapper-specifier -TYPE_PARSER(construct( +TYPE_PARSER(applyFunction(ConstructOmpMapperSpecifier, maybe(name / ":" / !":"_tok), typeSpec / "::", name)) // OpenMP 5.2: 5.8.8 Declare Mapper Construct diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..1d68e8d8850fa 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2093,7 +2093,11 @@ class UnparseVisitor { Walk(x.v, ","); } void Unparse(const OmpMapperSpecifier &x) { - Walk(std::get>(x.t), ":"); + const auto &mapperName = std::get(x.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); + Put(":"); + } Walk(std::get(x.t)); Put("::"); Walk(std::get(x.t)); @@ -2796,8 +2800,9 @@ class UnparseVisitor { BeginOpenMP(); Word("!$OMP DECLARE MAPPER ("); const auto &spec{std::get(z.t)}; - if (auto mapname{std::get>(spec.t)}) { - Walk(mapname); + const auto &mapperName = std::get(spec.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); Put(":"); } Walk(std::get(spec.t)); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index bdafc03ad2c05..322562b06b87f 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,6 +38,7 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" +#include "llvm/ADT/SmallVector.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1766,15 +1767,9 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses 
gets processed before // the type has been fully processed. BeginDeclTypeSpec(); - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const parser::CharBlock defaultName{"default", 7}; - MakeSymbol( - defaultName, Attrs{}, MiscDetails{MiscDetails::Kind::ConstructName}); - } - + auto &mapperName{std::get(spec.t)}; + MakeSymbol(parser::CharBlock(mapperName), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); auto &varName{std::get(spec.t)}; diff --git a/flang/test/Lower/OpenMP/declare-mapper.f90 b/flang/test/Lower/OpenMP/declare-mapper.f90 index 867b850317e66..8a98c68a8d582 100644 --- a/flang/test/Lower/OpenMP/declare-mapper.f90 +++ b/flang/test/Lower/OpenMP/declare-mapper.f90 @@ -5,6 +5,7 @@ ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-2.f90 -o - | FileCheck %t/omp-declare-mapper-2.f90 ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-3.f90 -o - | FileCheck %t/omp-declare-mapper-3.f90 ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-4.f90 -o - | FileCheck %t/omp-declare-mapper-4.f90 +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-5.f90 -o - | FileCheck %t/omp-declare-mapper-5.f90 !--- omp-declare-mapper-1.f90 subroutine declare_mapper_1 @@ -22,7 +23,7 @@ subroutine declare_mapper_1 end type type(my_type2) :: t real :: x, y(nvals) - !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { + !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.omp\.default\.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { !CHECK: ^bb0(%[[VAL_0:.*]]: !fir.ref<[[MY_TYPE]]>): !CHECK: %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFdeclare_mapper_1Evar"} : (!fir.ref<[[MY_TYPE]]>) -> (!fir.ref<[[MY_TYPE]]>, !fir.ref<[[MY_TYPE]]>) !CHECK: %[[VAL_2:.*]] = hlfir.designate %[[VAL_1]]#0{"values"} {fortran_attrs = #fir.var_attrs} : (!fir.ref<[[MY_TYPE]]>) -> !fir.ref>>> @@ -149,7 +150,7 @@ subroutine declare_mapper_4 integer :: num end type - !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] + !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.omp.default.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] !$omp declare mapper (my_type :: var) map (var%num) type(my_type) :: a @@ -171,3 +172,93 @@ subroutine declare_mapper_4 a%num = 40 !$omp end target end subroutine declare_mapper_4 + +!--- omp-declare-mapper-5.f90 +program declare_mapper_5 + implicit none + + type :: mytype + integer :: x, y + end type + + !CHECK: omp.declare_mapper @[[INNER_MAPPER_NAMED:_QQFFuse_innermy_mapper]] : [[MY_TYPE:!fir\.type<_QFTmytype\{x:i32,y:i32\}>]] + !CHECK: omp.declare_mapper @[[INNER_MAPPER_DEFAULT:_QQFFuse_innermytype.omp.default.mapper]] : [[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_NAMED:_QQFmy_mapper]] : 
[[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_DEFAULT:_QQFmytype.omp.default.mapper]] : [[MY_TYPE]] + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + +contains + subroutine use_outer() + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) 
mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine + + subroutine use_inner() + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine +end program declare_mapper_5 diff --git a/flang/test/Lower/OpenMP/map-mapper.f90 b/flang/test/Lower/OpenMP/map-mapper.f90 index a511110cb5d18..91564bfc7bc46 100644 --- a/flang/test/Lower/OpenMP/map-mapper.f90 +++ b/flang/test/Lower/OpenMP/map-mapper.f90 @@ -8,7 +8,7 @@ program p !$omp declare mapper(xx : t1 :: nn) map(to: nn, nn%x) !$omp 
declare mapper(t1 :: nn) map(from: nn) - !CHECK-LABEL: omp.declare_mapper @_QQFt1.default : !fir.type<_QFTt1{x:!fir.array<256xi32>}> + !CHECK-LABEL: omp.declare_mapper @_QQFt1.omp.default.mapper : !fir.type<_QFTt1{x:!fir.array<256xi32>}> !CHECK-LABEL: omp.declare_mapper @_QQFxx : !fir.type<_QFTt1{x:!fir.array<256xi32>}> type(t1) :: a, b @@ -20,7 +20,7 @@ program p end do !$omp end target - !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.default) -> {{.*}} {name = "b"} + !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.omp.default.mapper) -> {{.*}} {name = "b"} !CHECK: omp.target map_entries(%[[MAP_B]] -> %{{.*}}, %{{.*}} -> %{{.*}} : {{.*}}, {{.*}}) { !$omp target map(mapper(default) : b) do i = 1, n diff --git a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 index 407bfd29153fa..30d75d02736f3 100644 --- a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 +++ b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 @@ -7,36 +7,37 @@ program main type ty integer :: x end type ty - + !CHECK: !$OMP DECLARE MAPPER (mymapper:ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) !PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier -!PARSE-TREE: Name = 'mymapper' +!PARSE-TREE: string = 'mymapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' -!PARSE-TREE: Name = 'x' +!PARSE-TREE: Name = 'x' !CHECK: !$OMP DECLARE MAPPER (ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(ty :: mapped) map(mapped, mapped%x) - + 
!PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier +!PARSE-TREE: string = 'ty.omp.default.mapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' !PARSE-TREE: Name = 'x' - + end program main !CHECK-LABEL: end program main diff --git a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 index b6c9c58948fec..baa8b2e08c539 100644 --- a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 +++ b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 @@ -78,7 +78,7 @@ subroutine f02 !PARSE-TREE: | | OmpDirectiveSpecification !PARSE-TREE: | | | OmpDirectiveName -> llvm::omp::Directive = declare mapper !PARSE-TREE: | | | OmpArgumentList -> OmpArgument -> OmpMapperSpecifier -!PARSE-TREE: | | | | Name = 'mymapper' +!PARSE-TREE: | | | | string = 'mymapper' !PARSE-TREE: | | | | TypeSpec -> IntrinsicTypeSpec -> IntegerTypeSpec -> !PARSE-TREE: | | | | Name = 'v' !PARSE-TREE: | | | OmpClauseList -> OmpClause -> Map -> OmpMapClause diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index b4e03bd1632e5..06f41ab8ce76f 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -2,23 +2,23 @@ program main !CHECK-LABEL: MainProgram scope: main - implicit none + implicit none - type ty - integer :: x - end type ty - !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) - !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) + type ty + integer :: x + end type ty + !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) + !$omp declare mapper(ty :: 
maptwo) map(maptwo, maptwo%x) !! Note, symbols come out in their respective scope, but not in declaration order. -!CHECK: default: Misc ConstructName !CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x +!CHECK: ty.omp.default.mapper: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) -!CHECK: OtherConstruct scope: +!CHECK: OtherConstruct scope: !CHECK: maptwo (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) - + end program main diff --git a/flang/test/Semantics/OpenMP/declare-mapper03.f90 b/flang/test/Semantics/OpenMP/declare-mapper03.f90 index b70b8a67f33e0..84fc3efafb3ad 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper03.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper03.f90 @@ -5,12 +5,8 @@ integer :: y end type t1 -type :: t2 - real :: y, z -end type t2 - !error: 'default' is already declared in this scoping unit !$omp declare mapper(t1::x) map(x, x%y) -!$omp declare mapp... [truncated] ``````````
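The naming and scoping rule described in the patch summary above can be illustrated with a small standalone sketch. This is not code from the PR and does not use any flang API: `mapperSymbolName` and `ToyScope` are hypothetical names invented for illustration. The only details taken from the diff are the `.omp.default.mapper` suffix, the treatment of an explicit `default` identifier as equivalent to an absent one, and the idea of resolving a mapper name through the enclosing scopes.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>

// Hypothetical helper (not a flang function). An explicit mapper identifier
// other than "default" is kept as-is; otherwise the symbol name is derived
// from the type name, so every derived type gets its own default mapper.
std::string mapperSymbolName(const std::optional<std::string> &mapperName,
                             const std::string &typeName) {
  if (mapperName && *mapperName != "default")
    return *mapperName;
  return typeName + ".omp.default.mapper";
}

// Toy stand-in for a scoping unit: a set of declared names plus a link to
// the enclosing scope. It only models name lookup, nothing else.
struct ToyScope {
  std::string label;
  const ToyScope *parent;
  std::set<std::string> names;

  explicit ToyScope(std::string l, const ToyScope *p = nullptr)
      : label(std::move(l)), parent(p) {}

  // Returns false if the name is already declared in this scoping unit.
  bool declare(const std::string &name) { return names.insert(name).second; }

  // Walks outward and returns the innermost scope that declares the name,
  // or nullptr if no enclosing scope declares it.
  const ToyScope *findOwner(const std::string &name) const {
    for (const ToyScope *s = this; s != nullptr; s = s->parent)
      if (s->names.count(name) != 0)
        return s;
    return nullptr;
  }
};
```

Under these assumptions, default mappers for two different types yield distinct names and can coexist in one scope, a second default mapper for the same type is rejected as a redeclaration, and an inner scope that declares its own mapper shadows the outer one, which matches the behaviour the new `omp-declare-mapper-5.f90` test checks for.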
https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Mon May 19 08:35:14 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Mon, 19 May 2025 08:35:14 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <682b4fb2.170a0220.73a89.6f1c@mx.google.com> https://github.com/TIFitis updated https://github.com/llvm/llvm-project/pull/140560 >From f852e36071da0b78431cbf3de808e5cca804449a Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Mon, 12 May 2025 18:41:20 +0100 Subject: [PATCH 1/2] Fix semantic check for default declare mappers. --- flang/lib/Semantics/resolve-names.cpp | 21 ++++++++++++------- .../OpenMP/declare-mapper-symbols.f90 | 18 ++++++++-------- .../Semantics/OpenMP/declare-mapper03.f90 | 6 +----- 3 files changed, 23 insertions(+), 22 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index bdafc03ad2c05..42297f069499b 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,6 +38,7 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" +#include "llvm/ADT/SmallVector.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1766,14 +1767,6 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses gets processed before // the type has been fully processed. 
BeginDeclTypeSpec(); - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const parser::CharBlock defaultName{"default", 7}; - MakeSymbol( - defaultName, Attrs{}, MiscDetails{MiscDetails::Kind::ConstructName}); - } PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); @@ -1783,6 +1776,18 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; + defaultNames.emplace_back( + type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); + MakeSymbol(defaultNames.back(), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); + } } void OmpVisitor::ProcessReductionSpecifier( diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index b4e03bd1632e5..0dda5b4456987 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -2,23 +2,23 @@ program main !CHECK-LABEL: MainProgram scope: main - implicit none + implicit none - type ty - integer :: x - end type ty - !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) - !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) + type ty + integer :: x + end type ty + !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) + !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) !! Note, symbols come out in their respective scope, but not in declaration order. 
-!CHECK: default: Misc ConstructName !CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x +!CHECK: ty.default: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) -!CHECK: OtherConstruct scope: +!CHECK: OtherConstruct scope: !CHECK: maptwo (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) - + end program main diff --git a/flang/test/Semantics/OpenMP/declare-mapper03.f90 b/flang/test/Semantics/OpenMP/declare-mapper03.f90 index b70b8a67f33e0..84fc3efafb3ad 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper03.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper03.f90 @@ -5,12 +5,8 @@ integer :: y end type t1 -type :: t2 - real :: y, z -end type t2 - !error: 'default' is already declared in this scoping unit !$omp declare mapper(t1::x) map(x, x%y) -!$omp declare mapper(t2::w) map(w, w%y, w%z) +!$omp declare mapper(t1::x) map(x) end >From c2d8c55407494b3d702f927c9a68032ecc56b629 Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Wed, 14 May 2025 20:38:01 +0100 Subject: [PATCH 2/2] Change mapper name field from parser::Name to std::string. 
--- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 6 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 22 ++--- flang/lib/Parser/openmp-parsers.cpp | 22 ++++- flang/lib/Parser/unparse.cpp | 11 ++- flang/lib/Semantics/resolve-names.cpp | 16 +--- flang/test/Lower/OpenMP/declare-mapper.f90 | 95 ++++++++++++++++++- flang/test/Lower/OpenMP/map-mapper.f90 | 4 +- .../Parser/OpenMP/declare-mapper-unparse.f90 | 15 +-- .../Parser/OpenMP/metadirective-dirspec.f90 | 2 +- .../OpenMP/declare-mapper-symbols.f90 | 2 +- 11 files changed, 149 insertions(+), 48 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 254236b510544..c99006f0c1c22 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -3540,7 +3540,7 @@ WRAPPER_CLASS(OmpLocatorList, std::list); struct OmpMapperSpecifier { // Absent mapper-identifier is equivalent to DEFAULT. TUPLE_CLASS_BOILERPLATE(OmpMapperSpecifier); - std::tuple, TypeSpec, Name> t; + std::tuple t; }; // Ref: [4.5:222:1-5], [5.0:305:20-27], [5.1:337:11-19], [5.2:139:18-23], diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 02454543d0a60..8dcc8be9be5bf 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1114,9 +1114,9 @@ void ClauseProcessor::processMapObjects( typeSpec = &object.sym()->GetType()->derivedTypeSpec(); if (typeSpec) { - mapperIdName = typeSpec->name().ToString() + ".default"; - mapperIdName = - converter.mangleName(mapperIdName, *typeSpec->GetScope()); + mapperIdName = typeSpec->name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); } } }; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 61bbc709872fd..cfcba0159db8d 100644 --- 
a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2422,8 +2422,10 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, mlir::FlatSymbolRefAttr mapperId; if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) { auto &typeSpec = sym.GetType()->derivedTypeSpec(); - std::string mapperIdName = typeSpec.name().ToString() + ".default"; - mapperIdName = converter.mangleName(mapperIdName, *typeSpec.GetScope()); + std::string mapperIdName = + typeSpec.name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); if (converter.getModuleOp().lookupSymbol(mapperIdName)) mapperId = mlir::FlatSymbolRefAttr::get(&converter.getMLIRContext(), mapperIdName); @@ -4005,24 +4007,16 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, lower::StatementContext stmtCtx; const auto &spec = std::get(declareMapperConstruct.t); - const auto &mapperName{std::get>(spec.t)}; + const auto &mapperName{std::get(spec.t)}; const auto &varType{std::get(spec.t)}; const auto &varName{std::get(spec.t)}; assert(varType.declTypeSpec->category() == semantics::DeclTypeSpec::Category::TypeDerived && "Expected derived type"); - std::string mapperNameStr; - if (mapperName.has_value()) { - mapperNameStr = mapperName->ToString(); - mapperNameStr = - converter.mangleName(mapperNameStr, mapperName->symbol->owner()); - } else { - mapperNameStr = - varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"; - mapperNameStr = converter.mangleName( - mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope()); - } + std::string mapperNameStr = mapperName; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperNameStr)) + mapperNameStr = converter.mangleName(mapperNameStr, sym->owner()); // Save current insertion point before moving to the module scope to create // the DeclareMapperOp diff 
--git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..a1ed584020677 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1389,8 +1389,28 @@ TYPE_PARSER( TYPE_PARSER(sourced(construct( verbatim("DECLARE TARGET"_tok), Parser{}))) +static OmpMapperSpecifier ConstructOmpMapperSpecifier( + std::optional &&mapperName, TypeSpec &&typeSpec, Name &&varName) { + // If a name is present, parse: name ":" typeSpec "::" name + // This matches the syntax: : :: + if (mapperName.has_value() && mapperName->ToString() != "default") { + return OmpMapperSpecifier{ + mapperName->ToString(), std::move(typeSpec), std::move(varName)}; + } + // If the name is missing, use the DerivedTypeSpec name to construct the + // default mapper name. + // This matches the syntax: :: + if (auto *derived = std::get_if(&typeSpec.u)) { + return OmpMapperSpecifier{ + std::get(derived->t).ToString() + ".omp.default.mapper", + std::move(typeSpec), std::move(varName)}; + } + return OmpMapperSpecifier{std::string("omp.default.mapper"), + std::move(typeSpec), std::move(varName)}; +} + // mapper-specifier -TYPE_PARSER(construct( +TYPE_PARSER(applyFunction(ConstructOmpMapperSpecifier, maybe(name / ":" / !":"_tok), typeSpec / "::", name)) // OpenMP 5.2: 5.8.8 Declare Mapper Construct diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..1d68e8d8850fa 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2093,7 +2093,11 @@ class UnparseVisitor { Walk(x.v, ","); } void Unparse(const OmpMapperSpecifier &x) { - Walk(std::get>(x.t), ":"); + const auto &mapperName = std::get(x.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); + Put(":"); + } Walk(std::get(x.t)); Put("::"); Walk(std::get(x.t)); @@ -2796,8 +2800,9 @@ class UnparseVisitor { BeginOpenMP(); Word("!$OMP DECLARE MAPPER ("); const auto &spec{std::get(z.t)}; - if 
(auto mapname{std::get>(spec.t)}) { - Walk(mapname); + const auto &mapperName = std::get(spec.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); Put(":"); } Walk(std::get(spec.t)); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 42297f069499b..322562b06b87f 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1767,7 +1767,9 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses gets processed before // the type has been fully processed. BeginDeclTypeSpec(); - + auto &mapperName{std::get(spec.t)}; + MakeSymbol(parser::CharBlock(mapperName), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); auto &varName{std::get(spec.t)}; @@ -1776,18 +1778,6 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); - - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const auto &type = std::get(spec.t); - static llvm::SmallVector defaultNames; - defaultNames.emplace_back( - type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); - MakeSymbol(defaultNames.back(), Attrs{}, - MiscDetails{MiscDetails::Kind::ConstructName}); - } } void OmpVisitor::ProcessReductionSpecifier( diff --git a/flang/test/Lower/OpenMP/declare-mapper.f90 b/flang/test/Lower/OpenMP/declare-mapper.f90 index 867b850317e66..8a98c68a8d582 100644 --- a/flang/test/Lower/OpenMP/declare-mapper.f90 +++ b/flang/test/Lower/OpenMP/declare-mapper.f90 @@ -5,6 +5,7 @@ ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-2.f90 -o - | FileCheck %t/omp-declare-mapper-2.f90 ! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-3.f90 -o - | FileCheck %t/omp-declare-mapper-3.f90 ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-4.f90 -o - | FileCheck %t/omp-declare-mapper-4.f90 +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-5.f90 -o - | FileCheck %t/omp-declare-mapper-5.f90 !--- omp-declare-mapper-1.f90 subroutine declare_mapper_1 @@ -22,7 +23,7 @@ subroutine declare_mapper_1 end type type(my_type2) :: t real :: x, y(nvals) - !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { + !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.omp\.default\.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { !CHECK: ^bb0(%[[VAL_0:.*]]: !fir.ref<[[MY_TYPE]]>): !CHECK: %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFdeclare_mapper_1Evar"} : (!fir.ref<[[MY_TYPE]]>) -> (!fir.ref<[[MY_TYPE]]>, !fir.ref<[[MY_TYPE]]>) !CHECK: %[[VAL_2:.*]] = hlfir.designate %[[VAL_1]]#0{"values"} {fortran_attrs = #fir.var_attrs} : (!fir.ref<[[MY_TYPE]]>) -> !fir.ref>>> @@ -149,7 +150,7 @@ subroutine declare_mapper_4 integer :: num end type - !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] + !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.omp.default.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] !$omp declare mapper (my_type :: var) map (var%num) type(my_type) :: a @@ -171,3 +172,93 @@ subroutine declare_mapper_4 a%num = 40 !$omp end target end subroutine declare_mapper_4 + +!--- omp-declare-mapper-5.f90 +program declare_mapper_5 + implicit none + + type :: mytype + integer :: x, y + end type + + !CHECK: omp.declare_mapper 
@[[INNER_MAPPER_NAMED:_QQFFuse_innermy_mapper]] : [[MY_TYPE:!fir\.type<_QFTmytype\{x:i32,y:i32\}>]] + !CHECK: omp.declare_mapper @[[INNER_MAPPER_DEFAULT:_QQFFuse_innermytype.omp.default.mapper]] : [[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_NAMED:_QQFmy_mapper]] : [[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_DEFAULT:_QQFmytype.omp.default.mapper]] : [[MY_TYPE]] + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + +contains + subroutine use_outer() + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) 
capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine + + subroutine use_inner() + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine +end program declare_mapper_5 diff --git 
a/flang/test/Lower/OpenMP/map-mapper.f90 b/flang/test/Lower/OpenMP/map-mapper.f90 index a511110cb5d18..91564bfc7bc46 100644 --- a/flang/test/Lower/OpenMP/map-mapper.f90 +++ b/flang/test/Lower/OpenMP/map-mapper.f90 @@ -8,7 +8,7 @@ program p !$omp declare mapper(xx : t1 :: nn) map(to: nn, nn%x) !$omp declare mapper(t1 :: nn) map(from: nn) - !CHECK-LABEL: omp.declare_mapper @_QQFt1.default : !fir.type<_QFTt1{x:!fir.array<256xi32>}> + !CHECK-LABEL: omp.declare_mapper @_QQFt1.omp.default.mapper : !fir.type<_QFTt1{x:!fir.array<256xi32>}> !CHECK-LABEL: omp.declare_mapper @_QQFxx : !fir.type<_QFTt1{x:!fir.array<256xi32>}> type(t1) :: a, b @@ -20,7 +20,7 @@ program p end do !$omp end target - !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.default) -> {{.*}} {name = "b"} + !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.omp.default.mapper) -> {{.*}} {name = "b"} !CHECK: omp.target map_entries(%[[MAP_B]] -> %{{.*}}, %{{.*}} -> %{{.*}} : {{.*}}, {{.*}}) { !$omp target map(mapper(default) : b) do i = 1, n diff --git a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 index 407bfd29153fa..30d75d02736f3 100644 --- a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 +++ b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 @@ -7,36 +7,37 @@ program main type ty integer :: x end type ty - + !CHECK: !$OMP DECLARE MAPPER (mymapper:ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) !PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier -!PARSE-TREE: Name = 'mymapper' +!PARSE-TREE: string = 'mymapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> 
Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' -!PARSE-TREE: Name = 'x' +!PARSE-TREE: Name = 'x' !CHECK: !$OMP DECLARE MAPPER (ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(ty :: mapped) map(mapped, mapped%x) - + !PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier +!PARSE-TREE: string = 'ty.omp.default.mapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' !PARSE-TREE: Name = 'x' - + end program main !CHECK-LABEL: end program main diff --git a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 index b6c9c58948fec..baa8b2e08c539 100644 --- a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 +++ b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 @@ -78,7 +78,7 @@ subroutine f02 !PARSE-TREE: | | OmpDirectiveSpecification !PARSE-TREE: | | | OmpDirectiveName -> llvm::omp::Directive = declare mapper !PARSE-TREE: | | | OmpArgumentList -> OmpArgument -> OmpMapperSpecifier -!PARSE-TREE: | | | | Name = 'mymapper' +!PARSE-TREE: | | | | string = 'mymapper' !PARSE-TREE: | | | | TypeSpec -> IntrinsicTypeSpec -> IntegerTypeSpec -> !PARSE-TREE: | | | | Name = 'v' !PARSE-TREE: | | | OmpClauseList -> OmpClause -> Map -> OmpMapClause diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index 0dda5b4456987..06f41ab8ce76f 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -13,7 +13,7 @@ program main !! Note, symbols come out in their respective scope, but not in declaration order. 
!CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x -!CHECK: ty.default: Misc ConstructName +!CHECK: ty.omp.default.mapper: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) From flang-commits at lists.llvm.org Mon May 19 09:55:49 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 19 May 2025 09:55:49 -0700 (PDT) Subject: [flang-commits] [flang] [flang] translate derived type array init to attribute if possible (PR #140268) In-Reply-To: Message-ID: <682b6295.170a0220.19c4fc.a5ce@mx.google.com> https://github.com/vzakhari edited https://github.com/llvm/llvm-project/pull/140268 From flang-commits at lists.llvm.org Mon May 19 09:55:49 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 19 May 2025 09:55:49 -0700 (PDT) Subject: [flang-commits] [flang] [flang] translate derived type array init to attribute if possible (PR #140268) In-Reply-To: Message-ID: <682b6295.050a0220.34f28d.bb0d@mx.google.com> ================ @@ -0,0 +1,34 @@ +//===-- LLVMInsertChainFolder.h -- insertvalue chain folder ----*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Helper to fold LLVM dialect llvm.insertvalue chain representing constants +// into an Attribute representation. +// This sits in Flang because it is incomplete and tailored for flang needs. 
+// +//===----------------------------------------------------------------------===// + +#include "llvm/Support/LogicalResult.h" + +namespace mlir { +class Attribute; +class OpBuilder; +class Value; +} // namespace mlir + +namespace fir { + +/// Attempt to fold an llvm.insertvalue chain into an attribute representation +/// suitable as llvm.constant operand. The returned value will be a null pointer +/// if this is not an llvm.insertvalue result pr if the chain is not a constant, ---------------- vzakhari wrote: ```suggestion /// if this is not an llvm.insertvalue result or if the chain is not a constant, ``` https://github.com/llvm/llvm-project/pull/140268 From flang-commits at lists.llvm.org Mon May 19 09:55:50 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 19 May 2025 09:55:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang] translate derived type array init to attribute if possible (PR #140268) In-Reply-To: Message-ID: <682b6296.050a0220.312130.b44c@mx.google.com> ================ @@ -0,0 +1,34 @@ +//===-- LLVMInsertChainFolder.h -- insertvalue chain folder ----*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Helper to fold LLVM dialect llvm.insertvalue chain representing constants +// into an Attribute representation. +// This sits in Flang because it is incomplete and tailored for flang needs. +// +//===----------------------------------------------------------------------===// + +#include "llvm/Support/LogicalResult.h" + +namespace mlir { +class Attribute; +class OpBuilder; +class Value; +} // namespace mlir + +namespace fir { + +/// Attempt to fold an llvm.insertvalue chain into an attribute representation +/// suitable as llvm.constant operand. 
The returned value will be a null pointer ---------------- vzakhari wrote: ```suggestion /// suitable as llvm.constant operand. The returned value will be llvm::Failure ``` https://github.com/llvm/llvm-project/pull/140268 From flang-commits at lists.llvm.org Mon May 19 09:55:50 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 19 May 2025 09:55:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang] translate derived type array init to attribute if possible (PR #140268) In-Reply-To: Message-ID: <682b6296.050a0220.142de5.cf66@mx.google.com> https://github.com/vzakhari approved this pull request. Looks great! https://github.com/llvm/llvm-project/pull/140268 From flang-commits at lists.llvm.org Mon May 19 10:11:52 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Mon, 19 May 2025 10:11:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #140066) In-Reply-To: Message-ID: <682b6658.170a0220.2d63e7.0da6@mx.google.com> https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/140066 >From adae9a9da1ea2d4d259b3e2acb93767313387adc Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Thu, 6 Mar 2025 10:41:59 +0000 Subject: [PATCH 01/13] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION This adds another puzzle piece for the support of OpenMP DECLARE REDUCTION functionality. This adds support for operators with derived types, as well as declaring multiple different types with the same name or operator. A new details class, UserReductionDetails, is introduced to hold the list of types supported for a given reduction declaration. Tests for parsing and symbol generation are added. DECLARE REDUCTION is still not supported in lowering; it will generate a "Not yet implemented" fatal error. 
--- flang/include/flang/Semantics/symbol.h | 21 ++- flang/lib/Semantics/check-omp-structure.cpp | 63 ++++++-- flang/lib/Semantics/resolve-names-utils.h | 4 + flang/lib/Semantics/resolve-names.cpp | 77 +++++++++- flang/lib/Semantics/symbol.cpp | 12 +- .../Parser/OpenMP/declare-reduction-multi.f90 | 134 ++++++++++++++++++ .../OpenMP/declare-reduction-operator.f90 | 59 ++++++++ .../OpenMP/declare-reduction-functions.f90 | 126 ++++++++++++++++ .../OpenMP/declare-reduction-mangled.f90 | 51 +++++++ .../OpenMP/declare-reduction-operators.f90 | 55 +++++++ .../OpenMP/declare-reduction-typeerror.f90 | 30 ++++ .../Semantics/OpenMP/declare-reduction.f90 | 4 +- 12 files changed, 616 insertions(+), 20 deletions(-) create mode 100644 flang/test/Parser/OpenMP/declare-reduction-multi.f90 create mode 100644 flang/test/Parser/OpenMP/declare-reduction-operator.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-functions.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-operators.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 4cded64d170cd..ce35fc7aff3df 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -728,6 +728,25 @@ class GenericDetails { }; llvm::raw_ostream &operator<<(llvm::raw_ostream &, const GenericDetails &); +class UserReductionDetails : public WithBindName { +public: + using TypeVector = std::vector; + UserReductionDetails() = default; + + void AddType(const DeclTypeSpec *type) { typeList_.push_back(type); } + const TypeVector &GetTypeList() const { return typeList_; } + + bool SupportsType(const DeclTypeSpec *type) const { + for (auto t : typeList_) + if (t == type) + return true; + return false; + } + +private: + TypeVector typeList_; +}; + class UnknownDetails {}; using 
Details = std::variant; + TypeParamDetails, MiscDetails, UserReductionDetails>; llvm::raw_ostream &operator<<(llvm::raw_ostream &, const Details &); std::string DetailsToString(const Details &); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c6c4fdf8a8198..837ca3377fe1e 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -8,6 +8,7 @@ #include "check-omp-structure.h" #include "definable.h" +#include "resolve-names-utils.h" #include "flang/Evaluate/check-expression.h" #include "flang/Evaluate/expression.h" #include "flang/Evaluate/type.h" @@ -3521,8 +3522,8 @@ bool OmpStructureChecker::CheckReductionOperator( valid = llvm::is_contained({"max", "min", "iand", "ior", "ieor"}, realName); if (!valid) { - auto *misc{name->symbol->detailsIf()}; - valid = misc && misc->kind() == MiscDetails::Kind::ConstructName; + auto *reductionDetails{name->symbol->detailsIf()}; + valid = reductionDetails != nullptr; } } if (!valid) { @@ -3603,7 +3604,8 @@ void OmpStructureChecker::CheckReductionObjects( } static bool IsReductionAllowedForType( - const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type) { + const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type, + const Scope &scope) { auto isLogical{[](const DeclTypeSpec &type) -> bool { return type.category() == DeclTypeSpec::Logical; }}; @@ -3623,9 +3625,11 @@ static bool IsReductionAllowedForType( case parser::DefinedOperator::IntrinsicOperator::Multiply: case parser::DefinedOperator::IntrinsicOperator::Add: case parser::DefinedOperator::IntrinsicOperator::Subtract: - return type.IsNumeric(TypeCategory::Integer) || + if (type.IsNumeric(TypeCategory::Integer) || type.IsNumeric(TypeCategory::Real) || - type.IsNumeric(TypeCategory::Complex); + type.IsNumeric(TypeCategory::Complex)) + return true; + break; case parser::DefinedOperator::IntrinsicOperator::AND: case 
parser::DefinedOperator::IntrinsicOperator::OR: @@ -3638,8 +3642,18 @@ static bool IsReductionAllowedForType( DIE("This should have been caught in CheckIntrinsicOperator"); return false; } + parser::CharBlock name{MakeNameFromOperator(*intrinsicOp)}; + Symbol *symbol{scope.FindSymbol(name)}; + if (symbol) { + const auto *reductionDetails{symbol->detailsIf()}; + assert(reductionDetails && "Expected to find reductiondetails"); + + return reductionDetails->SupportsType(&type); + } + return false; } - return true; + assert(0 && "Intrinsic Operator not found - parsing gone wrong?"); + return false; // Reject everything else. }}; auto checkDesignator{[&](const parser::ProcedureDesignator &procD) { @@ -3652,18 +3666,42 @@ static bool IsReductionAllowedForType( // IAND: arguments must be integers: F2023 16.9.100 // IEOR: arguments must be integers: F2023 16.9.106 // IOR: arguments must be integers: F2023 16.9.111 - return type.IsNumeric(TypeCategory::Integer); + if (type.IsNumeric(TypeCategory::Integer)) { + return true; + } } else if (realName == "max" || realName == "min") { // MAX: arguments must be integer, real, or character: // F2023 16.9.135 // MIN: arguments must be integer, real, or character: // F2023 16.9.141 - return type.IsNumeric(TypeCategory::Integer) || - type.IsNumeric(TypeCategory::Real) || isCharacter(type); + if (type.IsNumeric(TypeCategory::Integer) || + type.IsNumeric(TypeCategory::Real) || isCharacter(type)) { + return true; + } } + + // If we get here, it may be a user declared reduction, so check + // if the symbol has UserReductionDetails, and if so, the type is + // supported. + if (const auto *reductionDetails{ + name->symbol->detailsIf()}) { + return reductionDetails->SupportsType(&type); + } + + // We also need to check for mangled names (max, min, iand, ieor and ior) + // and then check if the type is there. 
+ parser::CharBlock mangledName = MangleSpecialFunctions(name->source); + if (const auto &symbol{scope.FindSymbol(mangledName)}) { + if (const auto *reductionDetails{ + symbol->detailsIf()}) { + return reductionDetails->SupportsType(&type); + } + } + // Everything else is "not matching type". + return false; } - // TODO: user defined reduction operators. Just allow everything for now. - return true; + assert(0 && "name and name->symbol should be set here..."); + return false; }}; return common::visit( @@ -3678,7 +3716,8 @@ void OmpStructureChecker::CheckReductionObjectTypes( for (auto &[symbol, source] : symbols) { if (auto *type{symbol->GetType()}) { - if (!IsReductionAllowedForType(ident, *type)) { + const auto &scope{context_.FindScope(symbol->name())}; + if (!IsReductionAllowedForType(ident, *type, scope)) { context_.Say(source, "The type of '%s' is incompatible with the reduction operator."_err_en_US, symbol->name()); diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index 64784722ff4f8..de0991d69b61b 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -146,5 +146,9 @@ struct SymbolAndTypeMappings; void MapSubprogramToNewSymbols(const Symbol &oldSymbol, Symbol &newSymbol, Scope &newScope, SymbolAndTypeMappings * = nullptr); +parser::CharBlock MakeNameFromOperator( + const parser::DefinedOperator::IntrinsicOperator &op); +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_RESOLVE_NAMES_H_ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index bdafc03ad2c05..4e60a1f1b5a49 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1785,15 +1785,75 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, PopScope(); } +parser::CharBlock MakeNameFromOperator( + const 
parser::DefinedOperator::IntrinsicOperator &op) { + switch (op) { + case parser::DefinedOperator::IntrinsicOperator::Multiply: + return parser::CharBlock{"op.*", 4}; + case parser::DefinedOperator::IntrinsicOperator::Add: + return parser::CharBlock{"op.+", 4}; + case parser::DefinedOperator::IntrinsicOperator::Subtract: + return parser::CharBlock{"op.-", 4}; + + case parser::DefinedOperator::IntrinsicOperator::AND: + return parser::CharBlock{"op.AND", 6}; + case parser::DefinedOperator::IntrinsicOperator::OR: + return parser::CharBlock{"op.OR", 6}; + case parser::DefinedOperator::IntrinsicOperator::EQV: + return parser::CharBlock{"op.EQV", 7}; + case parser::DefinedOperator::IntrinsicOperator::NEQV: + return parser::CharBlock{"op.NEQV", 8}; + + default: + assert(0 && "Unsupported operator..."); + return parser::CharBlock{"op.?", 4}; + } +} + +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name) { + if (name == "max") { + return parser::CharBlock{"op.max", 6}; + } + if (name == "min") { + return parser::CharBlock{"op.min", 6}; + } + if (name == "iand") { + return parser::CharBlock{"op.iand", 7}; + } + if (name == "ior") { + return parser::CharBlock{"op.ior", 6}; + } + if (name == "ieor") { + return parser::CharBlock{"op.ieor", 7}; + } + // All other names: return as is. 
+ return name; +} + void OmpVisitor::ProcessReductionSpecifier( const parser::OmpReductionSpecifier &spec, const std::optional &clauses) { + const parser::Name *name{nullptr}; + parser::Name mangledName{}; + UserReductionDetails reductionDetailsTemp{}; const auto &id{std::get(spec.t)}; if (auto procDes{std::get_if(&id.u)}) { - if (auto *name{std::get_if(&procDes->u)}) { - name->symbol = - &MakeSymbol(*name, MiscDetails{MiscDetails::Kind::ConstructName}); + name = std::get_if(&procDes->u); + if (name) { + mangledName.source = MangleSpecialFunctions(name->source); } + } else { + const auto &defOp{std::get(id.u)}; + mangledName.source = MakeNameFromOperator( + std::get(defOp.u)); + name = &mangledName; + } + + UserReductionDetails *reductionDetails{&reductionDetailsTemp}; + Symbol *symbol{name ? name->symbol : nullptr}; + symbol = FindSymbol(mangledName); + if (symbol) { + reductionDetails = symbol->detailsIf(); } auto &typeList{std::get(spec.t)}; @@ -1825,6 +1885,10 @@ void OmpVisitor::ProcessReductionSpecifier( const DeclTypeSpec *typeSpec{GetDeclTypeSpec()}; assert(typeSpec && "We should have a type here"); + if (reductionDetails) { + reductionDetails->AddType(typeSpec); + } + for (auto &nm : ompVarNames) { ObjectEntityDetails details{}; details.set_type(*typeSpec); @@ -1835,6 +1899,13 @@ void OmpVisitor::ProcessReductionSpecifier( Walk(clauses); PopScope(); } + + if (name) { + if (!symbol) { + symbol = &MakeSymbol(mangledName, Attrs{}, std::move(*reductionDetails)); + } + name->symbol = symbol; + } } bool OmpVisitor::Pre(const parser::OmpDirectiveSpecification &x) { diff --git a/flang/lib/Semantics/symbol.cpp b/flang/lib/Semantics/symbol.cpp index 2118970a7bf25..d03b888318b30 100644 --- a/flang/lib/Semantics/symbol.cpp +++ b/flang/lib/Semantics/symbol.cpp @@ -292,7 +292,7 @@ void GenericDetails::CopyFrom(const GenericDetails &from) { // This is primarily for debugging. 
std::string DetailsToString(const Details &details) { return common::visit( - common::visitors{ + common::visitors{// [](const UnknownDetails &) { return "Unknown"; }, [](const MainProgramDetails &) { return "MainProgram"; }, [](const ModuleDetails &) { return "Module"; }, @@ -312,7 +312,7 @@ std::string DetailsToString(const Details &details) { [](const TypeParamDetails &) { return "TypeParam"; }, [](const MiscDetails &) { return "Misc"; }, [](const AssocEntityDetails &) { return "AssocEntity"; }, - }, + [](const UserReductionDetails &) { return "UserReductionDetails"; }}, details); } @@ -346,6 +346,9 @@ bool Symbol::CanReplaceDetails(const Details &details) const { [&](const HostAssocDetails &) { return this->has(); }, + [&](const UserReductionDetails &) { + return this->has(); + }, [](const auto &) { return false; }, }, details); @@ -644,6 +647,11 @@ llvm::raw_ostream &operator<<(llvm::raw_ostream &os, const Details &details) { [&](const MiscDetails &x) { os << ' ' << MiscDetails::EnumToString(x.kind()); }, + [&](const UserReductionDetails &x) { + for (auto &type : x.GetTypeList()) { + DumpType(os, type); + } + }, [&](const auto &x) { os << x; }, }, details); diff --git a/flang/test/Parser/OpenMP/declare-reduction-multi.f90 b/flang/test/Parser/OpenMP/declare-reduction-multi.f90 new file mode 100644 index 0000000000000..0e1adcc9958d7 --- /dev/null +++ b/flang/test/Parser/OpenMP/declare-reduction-multi.f90 @@ -0,0 +1,134 @@ +! RUN: %flang_fc1 -fdebug-unparse -fopenmp %s | FileCheck --ignore-case %s +! RUN: %flang_fc1 -fdebug-dump-parse-tree -fopenmp %s | FileCheck --check-prefix="PARSE-TREE" %s + +!! Test multiple declarations for the same type, with different operations. 
+module mymod + type :: tt + real r + end type tt +contains + function mymax(a, b) + type(tt) :: a, b, mymax + if (a%r > b%r) then + mymax = a + else + mymax = b + end if + end function mymax +end module mymod + +program omp_examples +!CHECK-LABEL: PROGRAM omp_examples + use mymod + implicit none + integer, parameter :: n = 100 + integer :: i + type(tt) :: values(n), sum, prod, big, small + + !$omp declare reduction(+:tt:omp_out%r = omp_out%r + omp_in%r) initializer(omp_priv%r = 0) +!CHECK: !$OMP DECLARE REDUCTION (+:tt: omp_out%r=omp_out%r+omp_in%r +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=0_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE-NEXT: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE-NEXT: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out%r=omp_out%r+omp_in%r' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=0._4 + !$omp declare reduction(*:tt:omp_out%r = omp_out%r * omp_in%r) initializer(omp_priv%r = 1) +!CHECK-NEXT: !$OMP DECLARE REDUCTION (*:tt: omp_out%r=omp_out%r*omp_in%r +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=1_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Multiply +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE-NEXT: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out%r=omp_out%r*omp_in%r' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=1._4' + !$omp 
declare reduction(max:tt:omp_out = mymax(omp_out, omp_in)) initializer(omp_priv%r = 0) +!CHECK-NEXT: !$OMP DECLARE REDUCTION (max:tt: omp_out=mymax(omp_out,omp_in) +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=0_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> ProcedureDesignator -> Name = 'max' +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out=mymax(omp_out,omp_in)' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=0._4' + !$omp declare reduction(min:tt:omp_out%r = min(omp_out%r, omp_in%r)) initializer(omp_priv%r = 1) +!CHECK-NEXT: !$OMP DECLARE REDUCTION (min:tt: omp_out%r=min(omp_out%r,omp_in%r) +!CHECK-NEXT: ) INITIALIZER(omp_priv%r=1_4) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> ProcedureDesignator -> Name = 'min' +!PARSE-TREE: OmpTypeNameList -> OmpTypeSpecifier -> TypeSpec -> DerivedTypeSpec +!PARSE-TREE: Name = 'tt' +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out%r=min(omp_out%r,omp_in%r)' +!PARSE-TREE: OmpClauseList -> OmpClause -> Initializer -> OmpInitializerClause -> AssignmentStmt = 'omp_priv%r=1._4' + call random_number(values%r) + + sum%r = 0 + !$omp parallel do reduction(+:sum) +!CHECK: !$OMP PARALLEL DO REDUCTION(+: sum) +!PARSE-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!PARSE-TREE: OmpBeginLoopDirective +!PARSE-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!PARSE-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause 
+!PARSE-TREE: Modifier -> OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'sum +!PARSE-TREE: DoConstruct + do i = 1, n + sum%r = sum%r + values(i)%r + end do + + prod%r = 1 + !$omp parallel do reduction(*:prod) +!CHECK: !$OMP PARALLEL DO REDUCTION(*: prod) +!PARSE-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!PARSE-TREE: OmpBeginLoopDirective +!PARSE-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!PARSE-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause +!PARSE-TREE: Modifier -> OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Multiply +!PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'prod' +!PARSE-TREE: DoConstruct + do i = 1, n + prod%r = prod%r * (values(i)%r+0.6) + end do + + big%r = 0 + !$omp parallel do reduction(max:big) +!CHECK: $OMP PARALLEL DO REDUCTION(max: big) +!PARSE-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!PARSE-TREE: OmpBeginLoopDirective +!PARSE-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!PARSE-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause +!PARSE-TREE: Modifier -> OmpReductionIdentifier -> ProcedureDesignator -> Name = 'max' +!PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'big' +!PARSE-TREE: DoConstruct + do i = 1, n + big = mymax(values(i), big) + end do + + small%r = 1 + !$omp parallel do reduction(min:small) +!CHECK: !$OMP PARALLEL DO REDUCTION(min: small) +!CHECK-TREE: ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPLoopConstruct +!CHECK-TREE: OmpBeginLoopDirective +!CHECK-TREE: OmpLoopDirective -> llvm::omp::Directive = parallel do +!CHECK-TREE: OmpClauseList -> OmpClause -> Reduction -> OmpReductionClause +!CHECK-TREE: Modifier -> OmpReductionIdentifier -> ProcedureDesignator 
-> Name = 'min' +!CHECK-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'small' +!CHECK-TREE: DoConstruct + do i = 1, n + small%r = min(values(i)%r, small%r) + end do + + print *, values%r + print *, "sum=", sum%r + print *, "prod=", prod%r + print *, "small=", small%r, " big=", big%r +end program omp_examples diff --git a/flang/test/Parser/OpenMP/declare-reduction-operator.f90 b/flang/test/Parser/OpenMP/declare-reduction-operator.f90 new file mode 100644 index 0000000000000..7bfb78115b10d --- /dev/null +++ b/flang/test/Parser/OpenMP/declare-reduction-operator.f90 @@ -0,0 +1,59 @@ +! RUN: %flang_fc1 -fdebug-unparse -fopenmp %s | FileCheck --ignore-case %s +! RUN: %flang_fc1 -fdebug-dump-parse-tree -fopenmp %s | FileCheck --check-prefix="PARSE-TREE" %s + +!CHECK-LABEL: SUBROUTINE reduce_1 (n, tts) +subroutine reduce_1 ( n, tts ) + type :: tt + integer :: x + integer :: y + end type tt + type :: tt2 + real(8) :: x + real(8) :: y + end type + + integer :: n + type(tt) :: tts(n) + type(tt2) :: tts2(n) + +!CHECK: !$OMP DECLARE REDUCTION (+:tt: omp_out=tt(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y) +!CHECK: ) INITIALIZER(omp_priv=tt(x=0_4,y=0_4)) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out=tt(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y)' +!PARSE-TREE: OmpInitializerClause -> AssignmentStmt = 'omp_priv=tt(x=0_4,y=0_4)' + + !$omp declare reduction(+ : tt : omp_out = tt(omp_out%x - omp_in%x , omp_out%y - omp_in%y)) initializer(omp_priv = tt(0,0)) + + +!CHECK: !$OMP DECLARE REDUCTION (+:tt2: omp_out=tt2(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y) +!CHECK: ) INITIALIZER(omp_priv=tt2(x=0._8,y=0._8) +!PARSE-TREE: DeclarationConstruct -> SpecificationConstruct -> 
OpenMPDeclarativeConstruct -> OpenMPDeclareReductionConstruct +!PARSE-TREE: Verbatim +!PARSE-TREE: OmpReductionSpecifier +!PARSE-TREE: OmpReductionIdentifier -> DefinedOperator -> IntrinsicOperator = Add +!PARSE-TREE: OmpReductionCombiner -> AssignmentStmt = 'omp_out=tt2(x=omp_out%x-omp_in%x,y=omp_out%y-omp_in%y)' +!PARSE-TREE: OmpInitializerClause -> AssignmentStmt = 'omp_priv=tt2(x=0._8,y=0._8)' + + !$omp declare reduction(+ :tt2 : omp_out = tt2(omp_out%x - omp_in%x , omp_out%y - omp_in%y)) initializer(omp_priv = tt2(0,0)) + + type(tt) :: diffp = tt( 0, 0 ) + type(tt2) :: diffp2 = tt2( 0, 0 ) + integer :: i + + !$omp parallel do reduction(+ : diffp) + do i = 1, n + diffp%x = diffp%x + tts(i)%x + diffp%y = diffp%y + tts(i)%y + end do + + !$omp parallel do reduction(+ : diffp2) + do i = 1, n + diffp2%x = diffp2%x + tts2(i)%x + diffp2%y = diffp2%y + tts2(i)%y + end do + +end subroutine reduce_1 +!CHECK: END SUBROUTINE reduce_1 diff --git a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 new file mode 100644 index 0000000000000..924ef0807ec80 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 @@ -0,0 +1,126 @@ +! 
RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module mm + implicit none + type two + integer(4) :: a, b + end type two + + type three + integer(8) :: a, b, c + end type three + + type twothree + type(two) t2 + type(three) t3 + end type twothree + +contains +!CHECK-LABEL: Subprogram scope: inittwo + subroutine inittwo(x,n) + integer :: n + type(two) :: x + x%a=n + x%b=n + end subroutine inittwo + + subroutine initthree(x,n) + integer :: n + type(three) :: x + x%a=n + x%b=n + end subroutine initthree + + function add_two(x, y) + type(two) add_two, x, y, res + res%a = x%a + y%a + res%b = x%b + y%b + add_two = res + end function add_two + + function add_three(x, y) + type(three) add_three, x, y, res + res%a = x%a + y%a + res%b = x%b + y%b + res%c = x%c + y%c + add_three = res + end function add_three + +!CHECK-LABEL: Subprogram scope: functwo + function functwo(x, n) + type(two) functwo + integer :: n + type(two) :: x(n) + type(two) :: res + integer :: i + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) +!CHECK: adder: UserReductionDetails TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) + + + !$omp simd reduction(adder:res) + do i=1,n + res=add_two(res,x(i)) + enddo + functwo=res + end function functwo + + function functhree(x, n) + implicit none + type(three) :: functhree + type(three) :: x(n) + type(three) :: res + integer :: i + integer :: n + !$omp declare reduction(adder:three:omp_out=add_three(omp_out,omp_in)) initializer(initthree(omp_priv,1)) + + !$omp simd reduction(adder:res) + do i=1,n + res=add_three(res,x(i)) + enddo + functhree=res + end function functhree + + function functtwothree(x, n) + type(twothree) :: functtwothree 
+ type(twothree) :: x(n) + type(twothree) :: res + type(two) :: res2 + type(three) :: res3 + integer :: n + integer :: i + + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) + + !$omp declare reduction(adder:three:omp_out=add_three(omp_out,omp_in)) initializer(initthree(omp_priv,1)) + +!CHECK: adder: UserReductionDetails TYPE(two) TYPE(three) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=24 offset=0: ObjectEntity type: TYPE(three) +!CHECK: omp_orig size=24 offset=24: ObjectEntity type: TYPE(three) +!CHECK: omp_out size=24 offset=48: ObjectEntity type: TYPE(three) +!CHECK: omp_priv size=24 offset=72: ObjectEntity type: TYPE(three) + + !$omp simd reduction(adder:res3) + do i=1,n + res3=add_three(res%t3,x(i)%t3) + enddo + + !$omp simd reduction(adder:res2) + do i=1,n + res2=add_two(res2,x(i)%t2) + enddo + res%t2 = res2 + res%t3 = res3 + end function functtwothree + +end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 b/flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 new file mode 100644 index 0000000000000..f1675b6f251e0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-mangled.f90 @@ -0,0 +1,51 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +!! Test that the name mangling for min & max (also used for iand, ieor and ior). 
+module mymod + type :: tt + real r + end type tt +contains + function mymax(a, b) + type(tt) :: a, b, mymax + if (a%r > b%r) then + mymax = a + else + mymax = b + end if + end function mymax +end module mymod + +program omp_examples +!CHECK-LABEL: MainProgram scope: omp_examples + use mymod + implicit none + integer, parameter :: n = 100 + integer :: i + type(tt) :: values(n), big, small + + !$omp declare reduction(max:tt:omp_out = mymax(omp_out, omp_in)) initializer(omp_priv%r = 0) + !$omp declare reduction(min:tt:omp_out%r = min(omp_out%r, omp_in%r)) initializer(omp_priv%r = 1) + +!CHECK: min, ELEMENTAL, INTRINSIC, PURE (Function): ProcEntity +!CHECK: mymax (Function): Use from mymax in mymod +!CHECK: op.max: UserReductionDetails TYPE(tt) +!CHECK: op.min: UserReductionDetails TYPE(tt) + + big%r = 0 + !$omp parallel do reduction(max:big) +!CHECK: big (OmpReduction): HostAssoc +!CHECK: max, INTRINSIC: ProcEntity + do i = 1, n + big = mymax(values(i), big) + end do + + small%r = 1 + !$omp parallel do reduction(min:small) +!CHECK: small (OmpReduction): HostAssoc + do i = 1, n + small%r = min(values(i)%r, small%r) + end do + + print *, "small=", small%r, " big=", big%r +end program omp_examples diff --git a/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 new file mode 100644 index 0000000000000..e7513ab3f95b1 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 @@ -0,0 +1,55 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module vector_mod + implicit none + type :: Vector + real :: x, y, z + contains + procedure :: add_vectors + generic :: operator(+) => add_vectors + end type Vector +contains + ! 
Function implementing vector addition + function add_vectors(a, b) result(res) + class(Vector), intent(in) :: a, b + type(Vector) :: res + res%x = a%x + b%x + res%y = a%y + b%y + res%z = a%z + b%z + end function add_vectors +end module vector_mod + +program test_vector +!CHECK-LABEL: MainProgram scope: test_vector + use vector_mod +!CHECK: add_vectors (Function): Use from add_vectors in vector_mod + implicit none + integer :: i + type(Vector) :: v1(100), v2(100) + + !$OMP declare reduction(+:vector:omp_out=omp_out+omp_in) initializer(omp_priv=Vector(0,0,0)) +!CHECK: op.+: UserReductionDetails TYPE(vector) +!CHECK: v1 size=1200 offset=4: ObjectEntity type: TYPE(vector) shape: 1_8:100_8 +!CHECK: v2 size=1200 offset=1204: ObjectEntity type: TYPE(vector) shape: 1_8:100_8 +!CHECK: vector: Use from vector in vector_mod + +!CHECK: OtherConstruct scope: +!CHECK: omp_in size=12 offset=0: ObjectEntity type: TYPE(vector) +!CHECK: omp_orig size=12 offset=12: ObjectEntity type: TYPE(vector) +!CHECK: omp_out size=12 offset=24: ObjectEntity type: TYPE(vector) +!CHECK: omp_priv size=12 offset=36: ObjectEntity type: TYPE(vector) + + v2 = Vector(0.0, 0.0, 0.0) + v1 = Vector(1.0, 2.0, 3.0) + !$OMP parallel do reduction(+:v2) +!CHECK: OtherConstruct scope +!CHECK: i (OmpPrivate, OmpPreDetermined): HostAssoc +!CHECK: v1: HostAssoc +!CHECK: v2 (OmpReduction): HostAssoc + + do i = 1, 100 + v2(i) = v2(i) + v1(i) ! Invokes add_vectors + end do + + print *, 'v2 components:', v2%x, v2%y, v2%z +end program test_vector diff --git a/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 new file mode 100644 index 0000000000000..14695faf844b6 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 @@ -0,0 +1,30 @@ +! 
RUN: not %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +module mm + implicit none + type two + integer(4) :: a, b + end type two + + type three + integer(8) :: a, b, c + end type three +contains + function add_two(x, y) + type(two) add_two, x, y, res + add_two = res + end function add_two + + function func(n) + type(three) :: func + type(three) :: res3 + integer :: n + integer :: i + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) + !$omp simd reduction(adder:res3) +!CHECK: error: The type of 'res3' is incompatible with the reduction operator. + do i=1,n + enddo + func = res3 + end function func +end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction.f90 b/flang/test/Semantics/OpenMP/declare-reduction.f90 index 11612f01f0f2d..ddca38fd57812 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction.f90 @@ -17,7 +17,7 @@ subroutine initme(x,n) end subroutine initme end interface !$omp declare reduction(red_add:integer(4):omp_out=omp_out+omp_in) initializer(initme(omp_priv,0)) -!CHECK: red_add: Misc ConstructName +!CHECK: red_add: UserReductionDetails !CHECK: Subprogram scope: initme !CHECK: omp_in size=4 offset=0: ObjectEntity type: INTEGER(4) !CHECK: omp_orig size=4 offset=4: ObjectEntity type: INTEGER(4) @@ -35,7 +35,7 @@ program main !$omp declare reduction (my_add_red : integer : omp_out = omp_out + omp_in) initializer (omp_priv=0) -!CHECK: my_add_red: Misc ConstructName +!CHECK: my_add_red: UserReductionDetails !CHECK: omp_in size=4 offset=0: ObjectEntity type: INTEGER(4) !CHECK: omp_orig size=4 offset=4: ObjectEntity type: INTEGER(4) !CHECK: omp_out size=4 offset=8: ObjectEntity type: INTEGER(4) >From 69ef46a4f55d69b292522d3fd3116ccbfdd2771d Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Wed, 26 Mar 2025 13:42:43 +0000 Subject: [PATCH 02/13] Fix review comments * Add two more tests (multiple operator-based declarations and 
re-using symbol already declared). * Add a few comments. * Fix up logical results. --- flang/include/flang/Semantics/symbol.h | 10 +-- flang/lib/Semantics/check-omp-structure.cpp | 11 +-- flang/lib/Semantics/resolve-names.cpp | 38 +++++++---- .../OpenMP/declare-reduction-dupsym.f90 | 15 ++++ .../OpenMP/declare-reduction-functions.f90 | 68 ++++++++++++++++++- .../OpenMP/declare-reduction-logical.f90 | 32 +++++++++ .../OpenMP/declare-reduction-typeerror.f90 | 4 ++ 7 files changed, 152 insertions(+), 26 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-logical.f90 diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index ce35fc7aff3df..1a0ca0d8b4183 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -728,7 +728,10 @@ class GenericDetails { }; llvm::raw_ostream &operator<<(llvm::raw_ostream &, const GenericDetails &); -class UserReductionDetails : public WithBindName { +// Used for OpenMP DECLARE REDUCTION, it holds the information +// needed to resolve which declaration (there could be multiple +// with the same name) to use for a given type. 
+class UserReductionDetails { public: using TypeVector = std::vector; UserReductionDetails() = default; @@ -737,10 +740,7 @@ class UserReductionDetails : public WithBindName { const TypeVector &GetTypeList() const { return typeList_; } bool SupportsType(const DeclTypeSpec *type) const { - for (auto t : typeList_) - if (t == type) - return true; - return false; + return llvm::is_contained(typeList_, type); } private: diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 837ca3377fe1e..650f69c6f8bf8 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3635,7 +3635,10 @@ static bool IsReductionAllowedForType( case parser::DefinedOperator::IntrinsicOperator::OR: case parser::DefinedOperator::IntrinsicOperator::EQV: case parser::DefinedOperator::IntrinsicOperator::NEQV: - return isLogical(type); + if (isLogical(type)) { + return true; + } + break; // Reduction identifier is not in OMP5.2 Table 5.2 default: @@ -3652,7 +3655,7 @@ static bool IsReductionAllowedForType( } return false; } - assert(0 && "Intrinsic Operator not found - parsing gone wrong?"); + DIE("Intrinsic Operator not found - parsing gone wrong?"); return false; // Reject everything else. }}; @@ -3690,7 +3693,7 @@ static bool IsReductionAllowedForType( // We also need to check for mangled names (max, min, iand, ieor and ior) // and then check if the type is there. - parser::CharBlock mangledName = MangleSpecialFunctions(name->source); + parser::CharBlock mangledName{MangleSpecialFunctions(name->source)}; if (const auto &symbol{scope.FindSymbol(mangledName)}) { if (const auto *reductionDetails{ symbol->detailsIf()}) { @@ -3700,7 +3703,7 @@ static bool IsReductionAllowedForType( // Everything else is "not matching type". 
return false; } - assert(0 && "name and name->symbol should be set here..."); + DIE("name and name->symbol should be set here..."); return false; }}; diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 4e60a1f1b5a49..5088d832e6382 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1805,7 +1805,7 @@ parser::CharBlock MakeNameFromOperator( return parser::CharBlock{"op.NEQV", 8}; default: - assert(0 && "Unsupported operator..."); + DIE("Unsupported operator..."); return parser::CharBlock{"op.?", 4}; } } @@ -1834,8 +1834,8 @@ void OmpVisitor::ProcessReductionSpecifier( const parser::OmpReductionSpecifier &spec, const std::optional &clauses) { const parser::Name *name{nullptr}; - parser::Name mangledName{}; - UserReductionDetails reductionDetailsTemp{}; + parser::Name mangledName; + UserReductionDetails reductionDetailsTemp; const auto &id{std::get(spec.t)}; if (auto procDes{std::get_if(&id.u)}) { name = std::get_if(&procDes->u); @@ -1849,11 +1849,22 @@ void OmpVisitor::ProcessReductionSpecifier( name = &mangledName; } + // Use reductionDetailsTemp if we can't find the symbol (this is + // the first, or only, instance with this name). The detaiols then + // gets stored in the symbol when it's created. UserReductionDetails *reductionDetails{&reductionDetailsTemp}; - Symbol *symbol{name ? name->symbol : nullptr}; - symbol = FindSymbol(mangledName); + Symbol *symbol{FindSymbol(mangledName)}; if (symbol) { + // If we found a symbol, we append the type info to the + // existing reductionDetails. reductionDetails = symbol->detailsIf(); + + if (!reductionDetails) { + context().Say(name->source, + "Duplicate defineition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + name->source); + return; + } } auto &typeList{std::get(spec.t)}; @@ -1882,17 +1893,16 @@ void OmpVisitor::ProcessReductionSpecifier( // We need to walk t.u because Walk(t) does it's own BeginDeclTypeSpec. 
Walk(t.u); - const DeclTypeSpec *typeSpec{GetDeclTypeSpec()}; - assert(typeSpec && "We should have a type here"); - - if (reductionDetails) { + // Only process types we can find. There will be an error later on when + // a type isn't found. + if (const DeclTypeSpec * typeSpec{GetDeclTypeSpec()}) { reductionDetails->AddType(typeSpec); - } - for (auto &nm : ompVarNames) { - ObjectEntityDetails details{}; - details.set_type(*typeSpec); - MakeSymbol(nm, Attrs{}, std::move(details)); + for (auto &nm : ompVarNames) { + ObjectEntityDetails details{}; + details.set_type(*typeSpec); + MakeSymbol(nm, Attrs{}, std::move(details)); + } } EndDeclTypeSpec(); Walk(std::get>(spec.t)); diff --git a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 new file mode 100644 index 0000000000000..17f70174e1854 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 @@ -0,0 +1,15 @@ +! RUN: not %flang_fc1 -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +!! Check for duplicate symbol use. 
+subroutine dup_symbol() + type :: loc + integer :: x + integer :: y + end type loc + + integer :: my_red + +!CHECK: error: Duplicate defineition of 'my_red' in !$OMP DECLARE REDUCTION + !$omp declare reduction(my_red : loc : omp_out%x = omp_out%x + omp_in%x) initializer(omp_priv%x = 0) + +end subroutine dup_symbol diff --git a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 index 924ef0807ec80..a2435fca415cd 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 @@ -85,8 +85,8 @@ function functhree(x, n) functhree=res end function functhree - function functtwothree(x, n) - type(twothree) :: functtwothree + function functwothree(x, n) + type(twothree) :: functwothree type(twothree) :: x(n) type(twothree) :: res type(two) :: res2 @@ -121,6 +121,68 @@ function functtwothree(x, n) enddo res%t2 = res2 res%t3 = res3 - end function functtwothree + functwothree=res + end function functwothree + +!CHECK-LABEL: Subprogram scope: funcbtwo + function funcBtwo(x, n) + type(two) funcBtwo + integer :: n + type(two) :: x(n) + type(two) :: res + integer :: i + !$omp declare reduction(+:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) +!CHECK: op.+: UserReductionDetails TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) + + + !$omp simd reduction(+:res) + do i=1,n + res=add_two(res,x(i)) + enddo + funcBtwo=res + end function funcBtwo + + function funcBtwothree(x, n) + type(twothree) :: funcBtwothree + type(twothree) :: x(n) + type(twothree) :: res + type(two) :: res2 + type(three) :: res3 + integer :: n + integer :: i + + !$omp declare 
reduction(+:two:omp_out=add_two(omp_out,omp_in)) initializer(inittwo(omp_priv,0)) + !$omp declare reduction(+:three:omp_out=add_three(omp_out,omp_in)) initializer(initthree(omp_priv,1)) + +!CHECK: op.+: UserReductionDetails TYPE(two) TYPE(three) +!CHECK OtherConstruct scope +!CHECK: omp_in size=8 offset=0: ObjectEntity type: TYPE(two) +!CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) +!CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) +!CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) +!CHECK OtherConstruct scope +!CHECK: omp_in size=24 offset=0: ObjectEntity type: TYPE(three) +!CHECK: omp_orig size=24 offset=24: ObjectEntity type: TYPE(three) +!CHECK: omp_out size=24 offset=48: ObjectEntity type: TYPE(three) +!CHECK: omp_priv size=24 offset=72: ObjectEntity type: TYPE(three) + + !$omp simd reduction(+:res3) + do i=1,n + res3=add_three(res%t3,x(i)%t3) + enddo + + !$omp simd reduction(+:res2) + do i=1,n + res2=add_two(res2,x(i)%t2) + enddo + res%t2 = res2 + res%t3 = res3 + end function funcBtwothree + end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-logical.f90 b/flang/test/Semantics/OpenMP/declare-reduction-logical.f90 new file mode 100644 index 0000000000000..7ab7cad473ac8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-logical.f90 @@ -0,0 +1,32 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module mm + implicit none + type logicalwrapper + logical b + end type logicalwrapper + +contains +!CHECK-LABEL: Subprogram scope: func + function func(x, n) + logical func + integer :: n + type(logicalwrapper) :: x(n) + type(logicalwrapper) :: res + integer :: i + !$omp declare reduction(.AND.:type(logicalwrapper):omp_out%b=omp_out%b .AND. omp_in%b) initializer(omp_priv%b=.true.) 
+!CHECK: op.AND: UserReductionDetails TYPE(logicalwrapper) +!CHECK OtherConstruct scope +!CHECK: omp_in size=4 offset=0: ObjectEntity type: TYPE(logicalwrapper) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: TYPE(logicalwrapper) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: TYPE(logicalwrapper) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: TYPE(logicalwrapper) + + !$omp simd reduction(.AND.:res) + do i=1,n + res%b=res%b .and. x(i)%b + enddo + + func=res%b + end function func +end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 index 14695faf844b6..b8ede55aa0ed7 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-typeerror.f90 @@ -20,6 +20,10 @@ function func(n) type(three) :: res3 integer :: n integer :: i + + !$omp declare reduction(dummy:kerflunk:omp_out=omp_out+omp_in) +!CHECK: error: Derived type 'kerflunk' not found + !$omp declare reduction(adder:two:omp_out=add_two(omp_out,omp_in)) !$omp simd reduction(adder:res3) !CHECK: error: The type of 'res3' is incompatible with the reduction operator. 
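A minimal standalone sketch of the lookup that this patch relies on, for readers who do not have the flang sources at hand. It is not the flang implementation: TypeSpec, ReductionDetails and MangleSpecialFunction are hypothetical stand-ins for DeclTypeSpec, UserReductionDetails and MangleSpecialFunctions, and std::find is used in place of llvm::is_contained so that the example builds without LLVM. The idea it illustrates is that one symbol per (mangled) reduction name accumulates every type declared for it, and a reduction clause is accepted only if the variable's type is in that list.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for flang's DeclTypeSpec; only object identity matters.
struct TypeSpec {
  std::string name;
};

// Simplified analogue of UserReductionDetails: one instance per reduction
// name, collecting each type named by a DECLARE REDUCTION with that name.
class ReductionDetails {
public:
  void AddType(const TypeSpec *type) {
    assert(type && "expected a resolved type");
    typeList_.push_back(type);
  }

  // Types are compared by pointer, as in the patch's SupportsType.
  bool SupportsType(const TypeSpec *type) const {
    return std::find(typeList_.begin(), typeList_.end(), type) !=
        typeList_.end();
  }

private:
  std::vector<const TypeSpec *> typeList_;
};

// Analogue of the mangling step: the special function names get an "op."
// prefix, every other name is returned unchanged.
std::string MangleSpecialFunction(const std::string &name) {
  for (const char *special : {"max", "min", "iand", "ior", "ieor"}) {
    if (name == special) {
      return "op." + name;
    }
  }
  return name;
}
```

With this shape, declaring "adder" for type two and then using it on a variable of type three fails the SupportsType check, which corresponds to the "incompatible with the reduction operator" diagnostic exercised in declare-reduction-typeerror.f90.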
>From 28ab6d291ed08e68cfc6cfa37d2dfd3a872ce817 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Wed, 26 Mar 2025 17:51:25 +0000 Subject: [PATCH 03/13] Use stringswitch and spell details correctly --- flang/lib/Semantics/resolve-names.cpp | 27 +++++++++------------------ 1 file changed, 9 insertions(+), 18 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 5088d832e6382..b4030166803d8 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,6 +38,7 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" +#include "llvm/ADT/StringSwitch.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1811,23 +1812,13 @@ parser::CharBlock MakeNameFromOperator( } parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name) { - if (name == "max") { - return parser::CharBlock{"op.max", 6}; - } - if (name == "min") { - return parser::CharBlock{"op.min", 6}; - } - if (name == "iand") { - return parser::CharBlock{"op.iand", 7}; - } - if (name == "ior") { - return parser::CharBlock{"op.ior", 6}; - } - if (name == "ieor") { - return parser::CharBlock{"op.ieor", 7}; - } - // All other names: return as is. - return name; + return llvm::StringSwitch(name.ToString()) + .Case("max", {"op.max", 6}) + .Case("min", {"op.min", 6}) + .Case("iand", {"op.iand", 7}) + .Case("ior", {"op.ior", 6}) + .Case("ieor", {"op.ieor", 7}) + .Default(name); } void OmpVisitor::ProcessReductionSpecifier( @@ -1850,7 +1841,7 @@ void OmpVisitor::ProcessReductionSpecifier( } // Use reductionDetailsTemp if we can't find the symbol (this is - // the first, or only, instance with this name). The detaiols then + // the first, or only, instance with this name). The details then // gets stored in the symbol when it's created. 
UserReductionDetails *reductionDetails{&reductionDetailsTemp}; Symbol *symbol{FindSymbol(mangledName)}; >From e2a1d923277125e27d3d34ab398ce75c8d18efb8 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 4 Apr 2025 16:27:07 +0100 Subject: [PATCH 04/13] Add support for user defined operators in declare reduction Also print the reduction declaration in the module file. Fix trivial typo. Add/modify tests to cover all the new things, including fixing the duplicated typo in the test... --- flang/include/flang/Semantics/semantics.h | 9 +++ flang/include/flang/Semantics/symbol.h | 10 +++ flang/lib/Parser/unparse.cpp | 8 +++ flang/lib/Semantics/mod-file.cpp | 21 +++++++ flang/lib/Semantics/mod-file.h | 1 + flang/lib/Semantics/resolve-names.cpp | 41 +++++++++--- flang/lib/Semantics/semantics.cpp | 6 ++ .../OpenMP/declare-reduction-dupsym.f90 | 2 +- .../OpenMP/declare-reduction-modfile.f90 | 63 +++++++++++++++++++ .../OpenMP/declare-reduction-operators.f90 | 29 +++++++++ 10 files changed, 181 insertions(+), 9 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 diff --git a/flang/include/flang/Semantics/semantics.h b/flang/include/flang/Semantics/semantics.h index 730513dbe3232..460af89daa0cf 100644 --- a/flang/include/flang/Semantics/semantics.h +++ b/flang/include/flang/Semantics/semantics.h @@ -290,6 +290,10 @@ class SemanticsContext { // Top-level ProgramTrees are owned by the SemanticsContext for persistence. ProgramTree &SaveProgramTree(ProgramTree &&); + // Store (and get a reference to the stored string) for mangled names + // used for OpenMP DECLARE REDUCTION. + std::string &StoreUserReductionName(const std::string &name); + private: struct ScopeIndexComparator { bool operator()(parser::CharBlock, parser::CharBlock) const; @@ -343,6 +347,11 @@ class SemanticsContext { std::map moduleFileOutputRenamings_; UnorderedSymbolSet isDefined_; std::list programTrees_; + + // storage for mangled names used in OMP DECLARE REDUCTION. 
+ // use std::list to avoid re-allocating the string when adding + // more content to the container. + std::list userReductionNames_; }; class Semantics { diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 1a0ca0d8b4183..3fd4d455e0f0b 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -30,6 +30,8 @@ class raw_ostream; } namespace Fortran::parser { struct Expr; +struct OpenMPDeclareReductionConstruct; +struct OmpDirectiveSpecification; } namespace Fortran::semantics { @@ -734,6 +736,10 @@ llvm::raw_ostream &operator<<(llvm::raw_ostream &, const GenericDetails &); class UserReductionDetails { public: using TypeVector = std::vector; + using DeclInfo = std::variant; + using DeclVector = std::vector; + UserReductionDetails() = default; void AddType(const DeclTypeSpec *type) { typeList_.push_back(type); } @@ -743,8 +749,12 @@ class UserReductionDetails { return llvm::is_contained(typeList_, type); } + void AddDecl(const DeclInfo &decl) { declList_.push_back(decl); } + const DeclVector &GetDeclList() const { return declList_; } + private: TypeVector typeList_; + DeclVector declList_; }; class UnknownDetails {}; diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..2f48e40a81dd9 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -3363,4 +3363,12 @@ template void Unparse(llvm::raw_ostream &, const Program &, template void Unparse(llvm::raw_ostream &, const Expr &, const common::LangOptions &, Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); + +template void Unparse( + llvm::raw_ostream &, const parser::OpenMPDeclareReductionConstruct &, + const common::LangOptions &, Encoding, bool, bool, preStatementType *, + AnalyzedObjectsAsFortran *); +template void Unparse(llvm::raw_ostream &, + const parser::OmpDirectiveSpecification &, const common::LangOptions &, + Encoding, bool, bool, 
preStatementType *, AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index a1ec956562204..a17884c3016e3 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -8,6 +8,7 @@ #include "mod-file.h" #include "resolve-names.h" +#include "flang/Common/indirection.h" #include "flang/Common/restorer.h" #include "flang/Evaluate/tools.h" #include "flang/Parser/message.h" @@ -894,6 +895,7 @@ void ModFileWriter::PutEntity(llvm::raw_ostream &os, const Symbol &symbol) { [&](const ObjectEntityDetails &) { PutObjectEntity(os, symbol); }, [&](const ProcEntityDetails &) { PutProcEntity(os, symbol); }, [&](const TypeParamDetails &) { PutTypeParam(os, symbol); }, + [&](const UserReductionDetails &) { PutUserReduction(os, symbol); }, [&](const auto &) { common::die("PutEntity: unexpected details: %s", DetailsToString(symbol.details()).c_str()); @@ -1043,6 +1045,25 @@ void ModFileWriter::PutTypeParam(llvm::raw_ostream &os, const Symbol &symbol) { os << '\n'; } +void ModFileWriter::PutUserReduction( + llvm::raw_ostream &os, const Symbol &symbol) { + auto &details{symbol.get()}; + // The module content for an OpenMP Declare Reduction is the OpenMP + // declaration. There may be multiple declarations. + // Decls are pointers, so do not use a reference. 
+ for (const auto decl : details.GetDeclList()) { + if (auto d = std::get_if( + &decl)) { + Unparse(os, **d, context_.langOptions()); + } else if (auto s = std::get_if( + &decl)) { + Unparse(os, **s, context_.langOptions()); + } else { + DIE("Unknown OpenMP DECLARE REDUCTION content"); + } + } +} + void PutInit(llvm::raw_ostream &os, const Symbol &symbol, const MaybeExpr &init, const parser::Expr *unanalyzed, SemanticsContext &context) { if (IsNamedConstant(symbol) || symbol.owner().IsDerivedType()) { diff --git a/flang/lib/Semantics/mod-file.h b/flang/lib/Semantics/mod-file.h index 82538fb510873..9e5724089b3c5 100644 --- a/flang/lib/Semantics/mod-file.h +++ b/flang/lib/Semantics/mod-file.h @@ -80,6 +80,7 @@ class ModFileWriter { void PutDerivedType(const Symbol &, const Scope * = nullptr); void PutDECStructure(const Symbol &, const Scope * = nullptr); void PutTypeParam(llvm::raw_ostream &, const Symbol &); + void PutUserReduction(llvm::raw_ostream &, const Symbol &); void PutSubprogram(const Symbol &); void PutGeneric(const Symbol &); void PutUse(const Symbol &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index b4030166803d8..ce4fcb5c63ba3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1535,7 +1535,7 @@ class OmpVisitor : public virtual DeclarationVisitor { AddOmpSourceRange(x.source); ProcessReductionSpecifier( std::get>(x.t).value(), - std::get>(x.t)); + std::get>(x.t), x); return false; } bool Pre(const parser::OmpMapClause &); @@ -1691,8 +1691,13 @@ class OmpVisitor : public virtual DeclarationVisitor { private: void ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, const parser::OmpClauseList &clauses); + template void ProcessReductionSpecifier(const parser::OmpReductionSpecifier &spec, - const std::optional &clauses); + const std::optional &clauses, + const T &wholeConstruct); + + parser::CharBlock MangleDefinedOperator(const parser::CharBlock 
&name); + int metaLevel_{0}; }; @@ -1821,9 +1826,21 @@ parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name) { .Default(name); } +parser::CharBlock OmpVisitor::MangleDefinedOperator( + const parser::CharBlock &name) { + // This function should only be used with user defined operators, that have + // the pattern + // .. + CHECK(name[0] == '.' && name[name.size() - 1] == '.'); + return parser::CharBlock{ + context().StoreUserReductionName("op" + name.ToString())}; +} + +template void OmpVisitor::ProcessReductionSpecifier( const parser::OmpReductionSpecifier &spec, - const std::optional &clauses) { + const std::optional &clauses, + const T &wholeOmpConstruct) { const parser::Name *name{nullptr}; parser::Name mangledName; UserReductionDetails reductionDetailsTemp; @@ -1833,11 +1850,17 @@ void OmpVisitor::ProcessReductionSpecifier( if (name) { mangledName.source = MangleSpecialFunctions(name->source); } + } else { const auto &defOp{std::get(id.u)}; - mangledName.source = MakeNameFromOperator( - std::get(defOp.u)); - name = &mangledName; + if (const auto definedOp{std::get_if(&defOp.u)}) { + name = &definedOp->v; + mangledName.source = MangleDefinedOperator(definedOp->v.source); + } else { + mangledName.source = MakeNameFromOperator( + std::get(defOp.u)); + name = &mangledName; + } } // Use reductionDetailsTemp if we can't find the symbol (this is @@ -1852,7 +1875,7 @@ void OmpVisitor::ProcessReductionSpecifier( if (!reductionDetails) { context().Say(name->source, - "Duplicate defineition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + "Duplicate definition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, name->source); return; } @@ -1901,6 +1924,8 @@ void OmpVisitor::ProcessReductionSpecifier( PopScope(); } + reductionDetails->AddDecl(&wholeOmpConstruct); + if (name) { if (!symbol) { symbol = &MakeSymbol(mangledName, Attrs{}, std::move(*reductionDetails)); @@ -1936,7 +1961,7 @@ bool OmpVisitor::Pre(const parser::OmpDirectiveSpecification &x) { if 
(maybeArgs && maybeClauses) { const parser::OmpArgument &first{maybeArgs->v.front()}; if (auto *spec{std::get_if(&first.u)}) { - ProcessReductionSpecifier(*spec, maybeClauses); + ProcessReductionSpecifier(*spec, maybeClauses, x); } } break; diff --git a/flang/lib/Semantics/semantics.cpp b/flang/lib/Semantics/semantics.cpp index e07054f8ec564..bccccba86346d 100644 --- a/flang/lib/Semantics/semantics.cpp +++ b/flang/lib/Semantics/semantics.cpp @@ -772,4 +772,10 @@ bool SemanticsContext::IsSymbolDefined(const Symbol &symbol) const { return isDefined_.find(symbol) != isDefined_.end(); } +std::string &SemanticsContext::StoreUserReductionName(const std::string &name) { + userReductionNames_.push_back(name); + CHECK(userReductionNames_.back() == name); + return userReductionNames_.back(); +} + } // namespace Fortran::semantics diff --git a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 index 17f70174e1854..2e82cd1a18332 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 @@ -9,7 +9,7 @@ subroutine dup_symbol() integer :: my_red -!CHECK: error: Duplicate defineition of 'my_red' in !$OMP DECLARE REDUCTION +!CHECK: error: Duplicate definition of 'my_red' in !$OMP DECLARE REDUCTION !$omp declare reduction(my_red : loc : omp_out%x = omp_out%x + omp_in%x) initializer(omp_priv%x = 0) end subroutine dup_symbol diff --git a/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 new file mode 100644 index 0000000000000..caed7fd335376 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 @@ -0,0 +1,63 @@ +! RUN: %python %S/../test_modfile.py %s %flang_fc1 -fopenmp +! Check correct modfile generation for OpenMP DECLARE REDUCTION construct. 
+ +!Expect: drm.mod +!module drm +!type::t1 +!integer(4)::val +!endtype +!!$OMP DECLARE REDUCTION (*:t1:omp_out = omp_out*omp_in) INITIALIZER(omp_priv=t& +!!$OMP&1(1)) +!!$OMP DECLARE REDUCTION (.fluffy.:t1:omp_out = omp_out.fluffy.omp_in) INITIALI& +!!$OMP&ZER(omp_priv=t1(0)) +!!$OMP DECLARE REDUCTION (.mul.:t1:omp_out = omp_out.mul.omp_in) INITIALIZER(om& +!!$OMP&p_priv=t1(1)) +!interface operator(.mul.) +!procedure::mul +!end interface +!interface operator(.fluffy.) +!procedure::add +!end interface +!interface operator(*) +!procedure::mul +!end interface +!contains +!function mul(v1,v2) +!type(t1),intent(in)::v1 +!type(t1),intent(in)::v2 +!type(t1)::mul +!end +!function add(v1,v2) +!type(t1),intent(in)::v1 +!type(t1),intent(in)::v2 +!type(t1)::add +!end +!end + +module drm + type t1 + integer :: val + end type t1 + interface operator(.mul.) + procedure mul + end interface + interface operator(.fluffy.) + procedure add + end interface + interface operator(*) + module procedure mul + end interface +!$omp declare reduction(*:t1:omp_out=omp_out*omp_in) initializer(omp_priv=t1(1)) +!$omp declare reduction(.mul.:t1:omp_out=omp_out.mul.omp_in) initializer(omp_priv=t1(1)) +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) initializer(omp_priv=t1(0)) +contains + type(t1) function mul(v1, v2) + type(t1), intent (in):: v1, v2 + mul%val = v1%val * v2%val + end function + type(t1) function add(v1, v2) + type(t1), intent (in):: v1, v2 + add%val = v1%val + v2%val + end function +end module drm + diff --git a/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 index e7513ab3f95b1..73fa1a1fea2c5 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-operators.f90 @@ -19,6 +19,35 @@ function add_vectors(a, b) result(res) end function add_vectors end module vector_mod +!! Test user-defined operators. 
Two different varieties, using conventional and +!! unconventional names. +module m1 + interface operator(.mul.) + procedure my_mul + end interface + interface operator(.fluffy.) + procedure my_add + end interface + type t1 + integer :: val = 1 + end type +!$omp declare reduction(.mul.:t1:omp_out=omp_out.mul.omp_in) +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) +!CHECK: op.fluffy., PUBLIC: UserReductionDetails TYPE(t1) +!CHECK: op.mul., PUBLIC: UserReductionDetails TYPE(t1) +contains + function my_mul(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_mul + my_mul%val = x%val * y%val + end function + function my_add(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_add + my_add%val = x%val + y%val + end function +end module m1 + program test_vector !CHECK-LABEL: MainProgram scope: test_vector use vector_mod >From e40a99612a846f4863b5a113e3e2df3588390545 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 4 Apr 2025 18:56:14 +0100 Subject: [PATCH 05/13] Fix nit comments and add simple bad operator test --- flang/lib/Semantics/check-omp-structure.cpp | 12 +++++------- flang/lib/Semantics/resolve-names-utils.h | 3 ++- flang/lib/Semantics/resolve-names.cpp | 8 +++++--- flang/lib/Semantics/symbol.cpp | 3 +-- .../OpenMP/declare-reduction-bad-operator.f90 | 6 ++++++ 5 files changed, 19 insertions(+), 13 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 650f69c6f8bf8..4302d7d2181a6 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3605,7 +3605,7 @@ void OmpStructureChecker::CheckReductionObjects( static bool IsReductionAllowedForType( const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type, - const Scope &scope) { + const Scope &scope, SemanticsContext &context) { auto isLogical{[](const DeclTypeSpec &type) 
-> bool { return type.category() == DeclTypeSpec::Logical; }}; @@ -3645,7 +3645,7 @@ static bool IsReductionAllowedForType( DIE("This should have been caught in CheckIntrinsicOperator"); return false; } - parser::CharBlock name{MakeNameFromOperator(*intrinsicOp)}; + parser::CharBlock name{MakeNameFromOperator(*intrinsicOp, context)}; Symbol *symbol{scope.FindSymbol(name)}; if (symbol) { const auto *reductionDetails{symbol->detailsIf()}; @@ -3656,11 +3656,11 @@ static bool IsReductionAllowedForType( return false; } DIE("Intrinsic Operator not found - parsing gone wrong?"); - return false; // Reject everything else. }}; auto checkDesignator{[&](const parser::ProcedureDesignator &procD) { const parser::Name *name{std::get_if(&procD.u)}; + CHECK(name && name->symbol); if (name && name->symbol) { const SourceName &realName{name->symbol->GetUltimate().name()}; // OMP5.2: The type [...] of a list item that appears in a @@ -3700,10 +3700,8 @@ static bool IsReductionAllowedForType( return reductionDetails->SupportsType(&type); } } - // Everything else is "not matching type". - return false; } - DIE("name and name->symbol should be set here..."); + // Everything else is "not matching type". 
return false; }}; @@ -3720,7 +3718,7 @@ void OmpStructureChecker::CheckReductionObjectTypes( for (auto &[symbol, source] : symbols) { if (auto *type{symbol->GetType()}) { const auto &scope{context_.FindScope(symbol->name())}; - if (!IsReductionAllowedForType(ident, *type, scope)) { + if (!IsReductionAllowedForType(ident, *type, scope, context_)) { context_.Say(source, "The type of '%s' is incompatible with the reduction operator."_err_en_US, symbol->name()); diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index de0991d69b61b..ed74c8203e29a 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -147,7 +147,8 @@ void MapSubprogramToNewSymbols(const Symbol &oldSymbol, Symbol &newSymbol, Scope &newScope, SymbolAndTypeMappings * = nullptr); parser::CharBlock MakeNameFromOperator( - const parser::DefinedOperator::IntrinsicOperator &op); + const parser::DefinedOperator::IntrinsicOperator &op, + SemanticsContext &context); parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name); } // namespace Fortran::semantics diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index ce4fcb5c63ba3..0270b74e0185d 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1792,7 +1792,8 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, } parser::CharBlock MakeNameFromOperator( - const parser::DefinedOperator::IntrinsicOperator &op) { + const parser::DefinedOperator::IntrinsicOperator &op, + SemanticsContext &context) { switch (op) { case parser::DefinedOperator::IntrinsicOperator::Multiply: return parser::CharBlock{"op.*", 4}; @@ -1811,7 +1812,7 @@ parser::CharBlock MakeNameFromOperator( return parser::CharBlock{"op.NEQV", 8}; default: - DIE("Unsupported operator..."); + context.Say("Unsupported operator in OMP DECLARE REDUCTION"_err_en_US); return 
parser::CharBlock{"op.?", 4}; } } @@ -1858,7 +1859,8 @@ void OmpVisitor::ProcessReductionSpecifier( mangledName.source = MangleDefinedOperator(definedOp->v.source); } else { mangledName.source = MakeNameFromOperator( - std::get(defOp.u)); + std::get(defOp.u), + context()); name = &mangledName; } } diff --git a/flang/lib/Semantics/symbol.cpp b/flang/lib/Semantics/symbol.cpp index d03b888318b30..d9809d6da8208 100644 --- a/flang/lib/Semantics/symbol.cpp +++ b/flang/lib/Semantics/symbol.cpp @@ -292,8 +292,7 @@ void GenericDetails::CopyFrom(const GenericDetails &from) { // This is primarily for debugging. std::string DetailsToString(const Details &details) { return common::visit( - common::visitors{// - [](const UnknownDetails &) { return "Unknown"; }, + common::visitors{[](const UnknownDetails &) { return "Unknown"; }, [](const MainProgramDetails &) { return "MainProgram"; }, [](const ModuleDetails &) { return "Module"; }, [](const SubprogramDetails &) { return "Subprogram"; }, diff --git a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 new file mode 100644 index 0000000000000..3b27c6aa20f13 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 @@ -0,0 +1,6 @@ +! 
RUN: not %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +function func(n) + !$omp declare reduction(/:integer:omp_out=omp_out+omp_in) +!CHECK: error: Unsupported operator in OMP DECLARE REDUCTION +end function func >From dde78ab0bb5af3259b24d2b881f226488a343b76 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 4 Apr 2025 19:47:42 +0100 Subject: [PATCH 06/13] Fix error messages to be more consistent --- flang/lib/Semantics/resolve-names.cpp | 6 +++--- .../Semantics/OpenMP/declare-reduction-bad-operator.f90 | 2 +- flang/test/Semantics/OpenMP/declare-reduction-error.f90 | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 0270b74e0185d..eeaa800f37a0b 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1506,7 +1506,7 @@ class OmpVisitor : public virtual DeclarationVisitor { auto *symbol{FindSymbol(NonDerivedTypeScope(), name)}; if (!symbol) { context().Say(name.source, - "Implicit subroutine declaration '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + "Implicit subroutine declaration '%s' in DECLARE REDUCTION"_err_en_US, name.source); } return true; @@ -1812,7 +1812,7 @@ parser::CharBlock MakeNameFromOperator( return parser::CharBlock{"op.NEQV", 8}; default: - context.Say("Unsupported operator in OMP DECLARE REDUCTION"_err_en_US); + context.Say("Unsupported operator in DECLARE REDUCTION"_err_en_US); return parser::CharBlock{"op.?", 4}; } } @@ -1877,7 +1877,7 @@ void OmpVisitor::ProcessReductionSpecifier( if (!reductionDetails) { context().Say(name->source, - "Duplicate definition of '%s' in !$OMP DECLARE REDUCTION"_err_en_US, + "Duplicate definition of '%s' in DECLARE REDUCTION"_err_en_US, name->source); return; } diff --git a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 index 
3b27c6aa20f13..1d1d2903a2780 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator.f90 @@ -2,5 +2,5 @@ function func(n) !$omp declare reduction(/:integer:omp_out=omp_out+omp_in) -!CHECK: error: Unsupported operator in OMP DECLARE REDUCTION +!CHECK: error: Unsupported operator in DECLARE REDUCTION end function func diff --git a/flang/test/Semantics/OpenMP/declare-reduction-error.f90 b/flang/test/Semantics/OpenMP/declare-reduction-error.f90 index c22cf106ea507..21f5cc186e037 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-error.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-error.f90 @@ -7,5 +7,5 @@ end subroutine initme subroutine subr !$omp declare reduction(red_add:integer(4):omp_out=omp_out+omp_in) initializer(initme(omp_priv,0)) - !CHECK: error: Implicit subroutine declaration 'initme' in !$OMP DECLARE REDUCTION + !CHECK: error: Implicit subroutine declaration 'initme' in DECLARE REDUCTION end subroutine subr >From b9c10a507f2c5ff06aa00d8a57b392d4515eaaf9 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Mon, 7 Apr 2025 10:47:44 +0100 Subject: [PATCH 07/13] add missed test change --- flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 index 2e82cd1a18332..83f8f85299dca 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-dupsym.f90 @@ -9,7 +9,7 @@ subroutine dup_symbol() integer :: my_red -!CHECK: error: Duplicate definition of 'my_red' in !$OMP DECLARE REDUCTION +!CHECK: error: Duplicate definition of 'my_red' in DECLARE REDUCTION !$omp declare reduction(my_red : loc : omp_out%x = omp_out%x + omp_in%x) initializer(omp_priv%x = 0) end subroutine dup_symbol >From c0a2cc3c3d9750dd6f72a6b435bfd8c5cac890b0 
Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Tue, 8 Apr 2025 14:41:37 +0100 Subject: [PATCH 08/13] Improve support for metadirective + declare reduction --- flang/include/flang/Semantics/symbol.h | 4 ++-- flang/lib/Parser/unparse.cpp | 4 ++-- flang/lib/Semantics/mod-file.cpp | 2 +- flang/lib/Semantics/resolve-names.cpp | 12 +++++++++--- .../Semantics/OpenMP/declare-reduction-modfile.f90 | 4 +++- 5 files changed, 17 insertions(+), 9 deletions(-) diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 3fd4d455e0f0b..851581bdd9b08 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -31,7 +31,7 @@ class raw_ostream; namespace Fortran::parser { struct Expr; struct OpenMPDeclareReductionConstruct; -struct OmpDirectiveSpecification; +struct OmpMetadirectiveDirective; } namespace Fortran::semantics { @@ -737,7 +737,7 @@ class UserReductionDetails { public: using TypeVector = std::vector; using DeclInfo = std::variant; + const parser::OmpMetadirectiveDirective *>; using DeclVector = std::vector; UserReductionDetails() = default; diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 2f48e40a81dd9..9f3c33fd0222e 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -3368,7 +3368,7 @@ template void Unparse( llvm::raw_ostream &, const parser::OpenMPDeclareReductionConstruct &, const common::LangOptions &, Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); -template void Unparse(llvm::raw_ostream &, - const parser::OmpDirectiveSpecification &, const common::LangOptions &, +template void Unparse(llvm::raw_ostream &, + const parser::OmpMetadirectiveDirective &, const common::LangOptions &, Encoding, bool, bool, preStatementType *, AnalyzedObjectsAsFortran *); } // namespace Fortran::parser diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index a17884c3016e3..d15b71a4b75be 100644 --- 
a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1055,7 +1055,7 @@ void ModFileWriter::PutUserReduction( if (auto d = std::get_if( &decl)) { Unparse(os, **d, context_.langOptions()); - } else if (auto s = std::get_if( + } else if (auto s = std::get_if( &decl)) { Unparse(os, **s, context_.langOptions()); } else { diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index eeaa800f37a0b..da0aedd35676a 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1456,11 +1456,15 @@ class OmpVisitor : public virtual DeclarationVisitor { static bool NeedsScope(const parser::OpenMPBlockConstruct &); static bool NeedsScope(const parser::OmpClause &); - bool Pre(const parser::OmpMetadirectiveDirective &) { + bool Pre(const parser::OmpMetadirectiveDirective &x) { + metaDirective_ = &x; ++metaLevel_; return true; } - void Post(const parser::OmpMetadirectiveDirective &) { --metaLevel_; } + void Post(const parser::OmpMetadirectiveDirective &) { + metaDirective_ = nullptr; + --metaLevel_; + } bool Pre(const parser::OpenMPRequiresConstruct &x) { AddOmpSourceRange(x.source); @@ -1699,6 +1703,7 @@ class OmpVisitor : public virtual DeclarationVisitor { parser::CharBlock MangleDefinedOperator(const parser::CharBlock &name); int metaLevel_{0}; + const parser::OmpMetadirectiveDirective *metaDirective_{nullptr}; }; bool OmpVisitor::NeedsScope(const parser::OpenMPBlockConstruct &x) { @@ -1963,7 +1968,8 @@ bool OmpVisitor::Pre(const parser::OmpDirectiveSpecification &x) { if (maybeArgs && maybeClauses) { const parser::OmpArgument &first{maybeArgs->v.front()}; if (auto *spec{std::get_if(&first.u)}) { - ProcessReductionSpecifier(*spec, maybeClauses, x); + CHECK(metaDirective_); + ProcessReductionSpecifier(*spec, maybeClauses, *metaDirective_); } } break; diff --git a/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 
index caed7fd335376..f80eb1097e18a 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-modfile.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_modfile.py %s %flang_fc1 -fopenmp +! RUN: %python %S/../test_modfile.py %s %flang_fc1 -fopenmp -fopenmp-version=52 ! Check correct modfile generation for OpenMP DECLARE REDUCTION construct. !Expect: drm.mod @@ -8,6 +8,7 @@ !endtype !!$OMP DECLARE REDUCTION (*:t1:omp_out = omp_out*omp_in) INITIALIZER(omp_priv=t& !!$OMP&1(1)) +!!$OMP METADIRECTIVE OTHERWISE(DECLARE REDUCTION(+:INTEGER)) !!$OMP DECLARE REDUCTION (.fluffy.:t1:omp_out = omp_out.fluffy.omp_in) INITIALI& !!$OMP&ZER(omp_priv=t1(0)) !!$OMP DECLARE REDUCTION (.mul.:t1:omp_out = omp_out.mul.omp_in) INITIALIZER(om& @@ -50,6 +51,7 @@ module drm !$omp declare reduction(*:t1:omp_out=omp_out*omp_in) initializer(omp_priv=t1(1)) !$omp declare reduction(.mul.:t1:omp_out=omp_out.mul.omp_in) initializer(omp_priv=t1(1)) !$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) initializer(omp_priv=t1(0)) +!$omp metadirective otherwise(declare reduction(+: integer)) contains type(t1) function mul(v1, v2) type(t1), intent (in):: v1, v2 >From 31f864fb38c60b21220eaa8938348d5c1d08a8ef Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Thu, 10 Apr 2025 19:11:10 +0100 Subject: [PATCH 09/13] Fix Klausler reported review comments Also rebase, as the branch was quite a way behind. Small conflict was resolved. 
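One of the review fixes in PATCH 09/13 replaces the chain of `std::get_if` tests in `ModFileWriter::PutUserReduction` with a visitor over the declaration variant. A minimal standalone sketch of that pattern follows. It is not the flang code: it assumes `std::visit` from the standard library in place of flang's `common::visit`, and the `DeclareReduction` and `Metadirective` structs, the `Overloaded` helper and `UnparseAll` are hypothetical stand-ins introduced only for illustration.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the parser node types; not flang's real classes.
struct DeclareReduction {
  std::string text;
};
struct Metadirective {
  std::string text;
};

// Modelled on UserReductionDetails::DeclInfo: a variant of non-owning pointers.
using DeclInfo = std::variant<const DeclareReduction *, const Metadirective *>;

// Helper that merges several lambdas into a single overload set (C++17).
template <typename... Ts> struct Overloaded : Ts... {
  using Ts::operator()...;
};
template <typename... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

// Emit one line per stored declaration, dispatching on the alternative held.
std::string UnparseAll(const std::vector<DeclInfo> &decls) {
  std::string out;
  for (const auto &decl : decls) {
    std::visit(
        Overloaded{
            [&](const DeclareReduction *d) {
              assert(d != nullptr);
              out += "!$OMP DECLARE REDUCTION " + d->text + "\n";
            },
            [&](const Metadirective *m) {
              assert(m != nullptr);
              out += "!$OMP METADIRECTIVE " + m->text + "\n";
            },
        },
        decl);
  }
  return out;
}
```

Because every alternative has its own lambda, adding a new alternative to `DeclInfo` without a matching handler becomes a compile error in this sketch, whereas a `get_if` chain would only fail at run time in its final `else`.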
--- flang/include/flang/Semantics/symbol.h | 6 +++--- flang/lib/Semantics/check-omp-structure.cpp | 9 ++++---- flang/lib/Semantics/mod-file.cpp | 23 ++++++++++++--------- flang/lib/Semantics/resolve-names-utils.h | 2 +- flang/lib/Semantics/resolve-names.cpp | 8 +++---- 5 files changed, 25 insertions(+), 23 deletions(-) diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 851581bdd9b08..1de92d410fca5 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -742,11 +742,11 @@ class UserReductionDetails { UserReductionDetails() = default; - void AddType(const DeclTypeSpec *type) { typeList_.push_back(type); } + void AddType(const DeclTypeSpec &type) { typeList_.push_back(&type); } const TypeVector &GetTypeList() const { return typeList_; } - bool SupportsType(const DeclTypeSpec *type) const { - return llvm::is_contained(typeList_, type); + bool SupportsType(const DeclTypeSpec &type) const { + return llvm::is_contained(typeList_, &type); } void AddDecl(const DeclInfo &decl) { declList_.push_back(decl); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 4302d7d2181a6..218cebf9bcec2 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3522,8 +3522,7 @@ bool OmpStructureChecker::CheckReductionOperator( valid = llvm::is_contained({"max", "min", "iand", "ior", "ieor"}, realName); if (!valid) { - auto *reductionDetails{name->symbol->detailsIf()}; - valid = reductionDetails != nullptr; + valid = name->symbol->detailsIf(); } } if (!valid) { @@ -3651,7 +3650,7 @@ static bool IsReductionAllowedForType( const auto *reductionDetails{symbol->detailsIf()}; assert(reductionDetails && "Expected to find reductiondetails"); - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } return false; } @@ -3688,7 +3687,7 @@ static bool 
IsReductionAllowedForType( // supported. if (const auto *reductionDetails{ name->symbol->detailsIf()}) { - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } // We also need to check for mangled names (max, min, iand, ieor and ior) @@ -3697,7 +3696,7 @@ static bool IsReductionAllowedForType( if (const auto &symbol{scope.FindSymbol(mangledName)}) { if (const auto *reductionDetails{ symbol->detailsIf()}) { - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } } } diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index d15b71a4b75be..dbd55a61f510f 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -1047,20 +1047,23 @@ void ModFileWriter::PutTypeParam(llvm::raw_ostream &os, const Symbol &symbol) { void ModFileWriter::PutUserReduction( llvm::raw_ostream &os, const Symbol &symbol) { - auto &details{symbol.get()}; + const auto &details{symbol.get()}; // The module content for a OpenMP Declare Reduction is the OpenMP // declaration. There may be multiple declarations. // Decls are pointers, so do not use a referene. 
for (const auto decl : details.GetDeclList()) { - if (auto d = std::get_if( - &decl)) { - Unparse(os, **d, context_.langOptions()); - } else if (auto s = std::get_if( - &decl)) { - Unparse(os, **s, context_.langOptions()); - } else { - DIE("Unknown OpenMP DECLARE REDUCTION content"); - } + common::visit( // + common::visitors{// + [&](const parser::OpenMPDeclareReductionConstruct *d) { + Unparse(os, *d, context_.langOptions()); + }, + [&](const parser::OmpMetadirectiveDirective *m) { + Unparse(os, *m, context_.langOptions()); + }, + [&](const auto &) { + DIE("Unknown OpenMP DECLARE REDUCTION content"); + }}, + decl); } } diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index ed74c8203e29a..809074031e2cc 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -149,7 +149,7 @@ void MapSubprogramToNewSymbols(const Symbol &oldSymbol, Symbol &newSymbol, parser::CharBlock MakeNameFromOperator( const parser::DefinedOperator::IntrinsicOperator &op, SemanticsContext &context); -parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name); +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_RESOLVE_NAMES_H_ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index da0aedd35676a..b2fbb501491e1 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1456,12 +1456,12 @@ class OmpVisitor : public virtual DeclarationVisitor { static bool NeedsScope(const parser::OpenMPBlockConstruct &); static bool NeedsScope(const parser::OmpClause &); - bool Pre(const parser::OmpMetadirectiveDirective &x) { + bool Pre(const parser::OmpMetadirectiveDirective &x) { // metaDirective_ = &x; ++metaLevel_; return true; } - void Post(const parser::OmpMetadirectiveDirective &) { + void Post(const parser::OmpMetadirectiveDirective 
&) { // metaDirective_ = nullptr; --metaLevel_; } @@ -1822,7 +1822,7 @@ parser::CharBlock MakeNameFromOperator( } } -parser::CharBlock MangleSpecialFunctions(const parser::CharBlock name) { +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name) { return llvm::StringSwitch(name.ToString()) .Case("max", {"op.max", 6}) .Case("min", {"op.min", 6}) @@ -1917,7 +1917,7 @@ void OmpVisitor::ProcessReductionSpecifier( // Only process types we can find. There will be an error later on when // a type isn't found. if (const DeclTypeSpec * typeSpec{GetDeclTypeSpec()}) { - reductionDetails->AddType(typeSpec); + reductionDetails->AddType(*typeSpec); for (auto &nm : ompVarNames) { ObjectEntityDetails details{}; >From db33cb11ee5aa27405ce07f35a207c2486076563 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Tue, 29 Apr 2025 13:25:19 +0100 Subject: [PATCH 10/13] Fix some semantics issues --- flang/lib/Semantics/assignment.cpp | 10 ++++++ flang/lib/Semantics/assignment.h | 3 ++ flang/lib/Semantics/check-omp-structure.cpp | 39 +++++++++++++-------- flang/lib/Semantics/resolve-names-utils.h | 1 + flang/lib/Semantics/resolve-names.cpp | 14 +++----- 5 files changed, 43 insertions(+), 24 deletions(-) diff --git a/flang/lib/Semantics/assignment.cpp b/flang/lib/Semantics/assignment.cpp index 6e55d0210ee0e..43e23a9d8f60b 100644 --- a/flang/lib/Semantics/assignment.cpp +++ b/flang/lib/Semantics/assignment.cpp @@ -43,6 +43,7 @@ class AssignmentContext { void Analyze(const parser::PointerAssignmentStmt &); void Analyze(const parser::ConcurrentControl &); int deviceConstructDepth_{0}; + SemanticsContext &context() { return context_; } private: bool CheckForPureContext(const SomeExpr &rhs, parser::CharBlock rhsSource); @@ -218,8 +219,17 @@ void AssignmentContext::PopWhereContext() { AssignmentChecker::~AssignmentChecker() {} +SemanticsContext &AssignmentChecker::context() { + return context_.value().context(); +} + AssignmentChecker::AssignmentChecker(SemanticsContext &context) 
: context_{new AssignmentContext{context}} {} + +void AssignmentChecker::Enter( + const parser::OpenMPDeclareReductionConstruct &x) { + context().set_location(x.source); +} void AssignmentChecker::Enter(const parser::AssignmentStmt &x) { context_.value().Analyze(x); } diff --git a/flang/lib/Semantics/assignment.h b/flang/lib/Semantics/assignment.h index a67bee4a03dfc..4a1bb92037119 100644 --- a/flang/lib/Semantics/assignment.h +++ b/flang/lib/Semantics/assignment.h @@ -37,6 +37,7 @@ class AssignmentChecker : public virtual BaseChecker { public: explicit AssignmentChecker(SemanticsContext &); ~AssignmentChecker(); + void Enter(const parser::OpenMPDeclareReductionConstruct &x); void Enter(const parser::AssignmentStmt &); void Enter(const parser::PointerAssignmentStmt &); void Enter(const parser::WhereStmt &); @@ -54,6 +55,8 @@ class AssignmentChecker : public virtual BaseChecker { void Enter(const parser::OpenACCLoopConstruct &); void Leave(const parser::OpenACCLoopConstruct &); + SemanticsContext &context(); + private: common::Indirection context_; }; diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 218cebf9bcec2..3dbf0c9b23da2 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3509,6 +3509,14 @@ bool OmpStructureChecker::CheckReductionOperator( break; } } + // User-defined operators are OK if there has been a declared reduction + // for that. So check if it's a defined operator, and it has + // UserReductionDetails - then it's good. 
+ if (const auto *definedOp{std::get_if(&dOpr.u)}) { + if (definedOp->v.symbol->detailsIf()) { + return true; + } + } context_.Say(source, "Invalid reduction operator in %s clause."_err_en_US, parser::ToUpperCaseLetters(getClauseName(clauseId).str())); return false; @@ -3602,6 +3610,17 @@ void OmpStructureChecker::CheckReductionObjects( } } +static bool CheckSymbolSupportsType(const Scope &scope, + const parser::CharBlock &name, const DeclTypeSpec &type) { + if (const auto &symbol{scope.FindSymbol(name)}) { + if (const auto *reductionDetails{ + symbol->detailsIf()}) { + return reductionDetails->SupportsType(&type); + } + } + return false; +} + static bool IsReductionAllowedForType( const parser::OmpReductionIdentifier &ident, const DeclTypeSpec &type, const Scope &scope, SemanticsContext &context) { @@ -3645,14 +3664,11 @@ static bool IsReductionAllowedForType( return false; } parser::CharBlock name{MakeNameFromOperator(*intrinsicOp, context)}; - Symbol *symbol{scope.FindSymbol(name)}; - if (symbol) { - const auto *reductionDetails{symbol->detailsIf()}; - assert(reductionDetails && "Expected to find reductiondetails"); - - return reductionDetails->SupportsType(type); - } - return false; + return CheckSymbolSupportsType(scope, name, type); + } else if (const auto *definedOp{ + std::get_if(&dOpr.u)}) { + // TODO: Figure out if it's valid. + return true; } DIE("Intrinsic Operator not found - parsing gone wrong?"); }}; @@ -3693,12 +3709,7 @@ static bool IsReductionAllowedForType( // We also need to check for mangled names (max, min, iand, ieor and ior) // and then check if the type is there. parser::CharBlock mangledName{MangleSpecialFunctions(name->source)}; - if (const auto &symbol{scope.FindSymbol(mangledName)}) { - if (const auto *reductionDetails{ - symbol->detailsIf()}) { - return reductionDetails->SupportsType(type); - } - } + return CheckSymbolSupportsType(scope, mangledName, type); } // Everything else is "not matching type". 
return false; diff --git a/flang/lib/Semantics/resolve-names-utils.h b/flang/lib/Semantics/resolve-names-utils.h index 809074031e2cc..ee8113a3fda5e 100644 --- a/flang/lib/Semantics/resolve-names-utils.h +++ b/flang/lib/Semantics/resolve-names-utils.h @@ -150,6 +150,7 @@ parser::CharBlock MakeNameFromOperator( const parser::DefinedOperator::IntrinsicOperator &op, SemanticsContext &context); parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name); +std::string MangleDefinedOperator(const parser::CharBlock &name); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_RESOLVE_NAMES_H_ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index b2fbb501491e1..8d91f76ff0a83 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1700,8 +1700,6 @@ class OmpVisitor : public virtual DeclarationVisitor { const std::optional &clauses, const T &wholeConstruct); - parser::CharBlock MangleDefinedOperator(const parser::CharBlock &name); - int metaLevel_{0}; const parser::OmpMetadirectiveDirective *metaDirective_{nullptr}; }; @@ -1832,14 +1830,9 @@ parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name) { .Default(name); } -parser::CharBlock OmpVisitor::MangleDefinedOperator( - const parser::CharBlock &name) { - // This function should only be used with user defined operators, that have - // the pattern - // .. +std::string MangleDefinedOperator(const parser::CharBlock &name) { CHECK(name[0] == '.' 
&& name[name.size() - 1] == '.'); - return parser::CharBlock{ - context().StoreUserReductionName("op" + name.ToString())}; + return "op" + name.ToString(); } template @@ -1861,7 +1854,8 @@ void OmpVisitor::ProcessReductionSpecifier( const auto &defOp{std::get(id.u)}; if (const auto definedOp{std::get_if(&defOp.u)}) { name = &definedOp->v; - mangledName.source = MangleDefinedOperator(definedOp->v.source); + mangledName.source = parser::CharBlock{context().StoreUserReductionName( + MangleDefinedOperator(definedOp->v.source))}; } else { mangledName.source = MakeNameFromOperator( std::get(defOp.u), >From 1f3c97c76e1c36787efd4d3edb9dfca438927945 Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Thu, 1 May 2025 13:39:58 +0100 Subject: [PATCH 11/13] [Flang][OpenMP] Fix review comment failed examples Add code to better handle operators in parsing and semantics. Add a function to set the scope when processing assignments, which caused a crash in "check for pure functions". Add three new tests and amend existing tests to cover a pure function. 
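Note on the `SupportsType` change in the diff that follows: it replaces a lookup by pointer with a comparison of the pointed-to types. The sketch below only illustrates why the two can disagree. `TypeSpec`, `containsByAddress` and `containsByValue` are hypothetical stand-ins invented for this note, not flang's classes or API.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a type descriptor (NOT flang's DeclTypeSpec).
struct TypeSpec {
  std::string category;
  int kind;
  bool operator==(const TypeSpec &other) const {
    return category == other.category && kind == other.kind;
  }
};

// Pointer comparison: true only if `type` is the very object stored in the
// list, so an equal descriptor living at another address is missed.
bool containsByAddress(const std::vector<const TypeSpec *> &types,
                       const TypeSpec &type) {
  return std::find(types.begin(), types.end(), &type) != types.end();
}

// Value comparison: true if any stored descriptor compares equal to `type`,
// regardless of which object it is.
bool containsByValue(const std::vector<const TypeSpec *> &types,
                     const TypeSpec &type) {
  return std::any_of(types.begin(), types.end(),
                     [&](const TypeSpec *t) { return *t == type; });
}
```

With two distinct but equal descriptors, where only the first is stored in the list, `containsByAddress` rejects the second while `containsByValue` accepts it. That appears to be the situation the new comment in `SupportsType` describes.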
--- flang/include/flang/Semantics/symbol.h | 9 +++- flang/lib/Semantics/check-omp-structure.cpp | 17 ++++--- .../declare-reduction-bad-operator2.f90 | 28 +++++++++++ .../OpenMP/declare-reduction-functions.f90 | 17 ++++++- .../OpenMP/declare-reduction-operator.f90 | 36 ++++++++++++++ .../OpenMP/declare-reduction-renamedop.f90 | 47 +++++++++++++++++++ 6 files changed, 145 insertions(+), 9 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-operator.f90 create mode 100644 flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 1de92d410fca5..10abd6dfe2d96 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -746,7 +746,14 @@ class UserReductionDetails { const TypeVector &GetTypeList() const { return typeList_; } bool SupportsType(const DeclTypeSpec &type) const { - return llvm::is_contained(typeList_, &type); + // We have to compare the actual type, not the pointer, as some + // types are not guaranteed to be the same object. + for (auto t : typeList_) { + if (*t == type) { + return true; + } + } + return false; } void AddDecl(const DeclInfo &decl) { declList_.push_back(decl); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 3dbf0c9b23da2..be0d4d81dc07a 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3510,11 +3510,14 @@ bool OmpStructureChecker::CheckReductionOperator( } } // User-defined operators are OK if there has been a declared reduction - // for that. So check if it's a defined operator, and it has - // UserReductionDetails - then it's good. + // for that. We mangle those names to store the user details. 
if (const auto *definedOp{std::get_if(&dOpr.u)}) { - if (definedOp->v.symbol->detailsIf()) { - return true; + std::string mangled = MangleDefinedOperator(definedOp->v.symbol->name()); + const Scope &scope = definedOp->v.symbol->owner(); + if (const Symbol *symbol = scope.FindSymbol(mangled)) { + if (symbol->detailsIf()) { + return true; + } } } context_.Say(source, "Invalid reduction operator in %s clause."_err_en_US, @@ -3615,7 +3618,7 @@ static bool CheckSymbolSupportsType(const Scope &scope, if (const auto &symbol{scope.FindSymbol(name)}) { if (const auto *reductionDetails{ symbol->detailsIf()}) { - return reductionDetails->SupportsType(&type); + return reductionDetails->SupportsType(type); } } return false; @@ -3667,8 +3670,8 @@ static bool IsReductionAllowedForType( return CheckSymbolSupportsType(scope, name, type); } else if (const auto *definedOp{ std::get_if(&dOpr.u)}) { - // TODO: Figure out if it's valid. - return true; + return CheckSymbolSupportsType( + scope, MangleDefinedOperator(definedOp->v.symbol->name()), type); } DIE("Intrinsic Operator not found - parsing gone wrong?"); }}; diff --git a/flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 new file mode 100644 index 0000000000000..9ee223c1c71fe --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-bad-operator2.f90 @@ -0,0 +1,28 @@ +! RUN: not %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s 2>&1 | FileCheck %s + +module m1 + interface operator(.fluffy.) 
+ procedure my_mul + end interface + type t1 + integer :: val = 1 + end type +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) +contains + function my_mul(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_mul + my_mul%val = x%val * y%val + end function my_mul + + subroutine subr(a, r) + implicit none + integer, intent(in), dimension(10) :: a + integer, intent(out) :: r + integer :: i + !$omp do parallel reduction(.fluffy.:r) +!CHECK: error: The type of 'r' is incompatible with the reduction operator. + do i=1,10 + end do + end subroutine subr +end module m1 diff --git a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 index a2435fca415cd..000d323f522cf 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-functions.f90 @@ -166,7 +166,7 @@ function funcBtwothree(x, n) !CHECK: omp_orig size=8 offset=8: ObjectEntity type: TYPE(two) !CHECK: omp_out size=8 offset=16: ObjectEntity type: TYPE(two) !CHECK: omp_priv size=8 offset=24: ObjectEntity type: TYPE(two) -!CHECK OtherConstruct scope +!CHECK: OtherConstruct scope !CHECK: omp_in size=24 offset=0: ObjectEntity type: TYPE(three) !CHECK: omp_orig size=24 offset=24: ObjectEntity type: TYPE(three) !CHECK: omp_out size=24 offset=48: ObjectEntity type: TYPE(three) @@ -184,5 +184,20 @@ function funcBtwothree(x, n) res%t2 = res2 res%t3 = res3 end function funcBtwothree + + !! This is checking a special case, where a reduction is declared inside a + !! 
pure function + + pure logical function reduction() +!CHECK: reduction size=4 offset=0: ObjectEntity funcResult type: LOGICAL(4) +!CHECK: rr: UserReductionDetails INTEGER(4) +!CHECK: OtherConstruct scope: size=16 alignment=4 sourceRange=0 bytes +!CHECK: omp_in size=4 offset=0: ObjectEntity type: INTEGER(4) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: INTEGER(4) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: INTEGER(4) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: INTEGER(4) + !$omp declare reduction (rr : integer : omp_out = omp_out + omp_in) initializer (omp_priv = 0) + reduction = .false. + end function reduction end module mm diff --git a/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 b/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 new file mode 100644 index 0000000000000..e4ac7023f4629 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 @@ -0,0 +1,36 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +module m1 + interface operator(.fluffy.) 
+!CHECK: .fluffy., PUBLIC (Function): Generic DefinedOp procs: my_mul + procedure my_mul + end interface + type t1 + integer :: val = 1 + end type +!$omp declare reduction(.fluffy.:t1:omp_out=omp_out.fluffy.omp_in) +!CHECK: op.fluffy., PUBLIC: UserReductionDetails TYPE(t1) +!CHECK: t1, PUBLIC: DerivedType components: val +!CHECK: OtherConstruct scope: size=16 alignment=4 sourceRange=0 bytes +!CHECK: omp_in size=4 offset=0: ObjectEntity type: TYPE(t1) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: TYPE(t1) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: TYPE(t1) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: TYPE(t1) +contains + function my_mul(x, y) + type (t1), intent (in) :: x, y + type (t1) :: my_mul + my_mul%val = x%val * y%val + end function my_mul + + subroutine subr(a, r) + implicit none + type(t1), intent(in), dimension(10) :: a + type(t1), intent(out) :: r + integer :: i + !$omp do parallel reduction(.fluffy.:r) + do i=1,10 + r = r .fluffy. a(i) + end do + end subroutine subr +end module m1 diff --git a/flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 b/flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 new file mode 100644 index 0000000000000..12e80cbf7b327 --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-reduction-renamedop.f90 @@ -0,0 +1,47 @@ +! RUN: %flang_fc1 -fdebug-dump-symbols -fopenmp -fopenmp-version=50 %s | FileCheck %s + +!! Test that we can "rename" an operator when using a module's operator. +module module1 +!CHECK: Module scope: module1 size=0 + implicit none + type :: t1 + real :: value + end type t1 + interface operator(.mul.) + module procedure my_mul + end interface operator(.mul.) 
+!CHECK: .mul., PUBLIC (Function): Generic DefinedOp procs: my_mul +!CHECK: my_mul, PUBLIC (Function): Subprogram result:TYPE(t1) r (TYPE(t1) x,TYPE(t1) +!CHECK: t1, PUBLIC: DerivedType components: value +contains + function my_mul(x, y) result(r) + type(t1), intent(in) :: x, y + type(t1) :: r + r%value = x%value * y%value + end function my_mul +end module module1 + +program test_omp_reduction +!CHECK: MainProgram scope: test_omp_reduction + use module1, only: t1, operator(.modmul.) => operator(.mul.) + +!CHECK: .modmul. (Function): Use from .mul. in module1 + implicit none + + type(t1) :: result + integer :: i + !$omp declare reduction (.modmul. : t1 : omp_out = omp_out .modmul. omp_in) initializer(omp_priv = t1(1.0)) +!CHECK: op.modmul.: UserReductionDetails TYPE(t1) +!CHECK: t1: Use from t1 in module1 +!CHECK: OtherConstruct scope: size=16 alignment=4 sourceRange=0 bytes +!CHECK: omp_in size=4 offset=0: ObjectEntity type: TYPE(t1) +!CHECK: omp_orig size=4 offset=4: ObjectEntity type: TYPE(t1) +!CHECK: omp_out size=4 offset=8: ObjectEntity type: TYPE(t1) +!CHECK: omp_priv size=4 offset=12: ObjectEntity type: TYPE(t1) + result = t1(1.0) + !$omp parallel do reduction(.modmul.:result) + do i = 1, 10 + result = result .modmul. 
t1(real(i)) + end do + !$omp end parallel do +end program test_omp_reduction >From 750457563d8a6a3c5c32d0d9d7e33389064c0cfe Mon Sep 17 00:00:00 2001 From: Mats Petersson Date: Fri, 9 May 2025 17:15:08 +0100 Subject: [PATCH 12/13] Fix review comments --- flang/include/flang/Semantics/semantics.h | 2 +- flang/lib/Semantics/check-omp-structure.cpp | 2 +- flang/lib/Semantics/mod-file.cpp | 1 - flang/lib/Semantics/resolve-names.cpp | 2 +- 4 files changed, 3 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Semantics/semantics.h b/flang/include/flang/Semantics/semantics.h index 460af89daa0cf..3924e6db81eb8 100644 --- a/flang/include/flang/Semantics/semantics.h +++ b/flang/include/flang/Semantics/semantics.h @@ -348,7 +348,7 @@ class SemanticsContext { UnorderedSymbolSet isDefined_; std::list programTrees_; - // storage for mangled names used in OMP DECLARE REDUCTION. + // Storage for mangled names used in OMP DECLARE REDUCTION. // use std::list to avoid re-allocating the string when adding // more content to the container. 
std::list userReductionNames_; diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index be0d4d81dc07a..37b1365e6acc4 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3615,7 +3615,7 @@ void OmpStructureChecker::CheckReductionObjects( static bool CheckSymbolSupportsType(const Scope &scope, const parser::CharBlock &name, const DeclTypeSpec &type) { - if (const auto &symbol{scope.FindSymbol(name)}) { + if (const auto *symbol{scope.FindSymbol(name)}) { if (const auto *reductionDetails{ symbol->detailsIf()}) { return reductionDetails->SupportsType(type); diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp index dbd55a61f510f..2e27716db5c62 100644 --- a/flang/lib/Semantics/mod-file.cpp +++ b/flang/lib/Semantics/mod-file.cpp @@ -8,7 +8,6 @@ #include "mod-file.h" #include "resolve-names.h" -#include "flang/Common/indirection.h" #include "flang/Common/restorer.h" #include "flang/Evaluate/tools.h" #include "flang/Parser/message.h" diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 8d91f76ff0a83..1d3b2b1505e5e 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1875,7 +1875,7 @@ void OmpVisitor::ProcessReductionSpecifier( reductionDetails = symbol->detailsIf(); if (!reductionDetails) { - context().Say(name->source, + context().Say( "Duplicate definition of '%s' in DECLARE REDUCTION"_err_en_US, name->source); return; >From c5f9768fb2fce2137551798e8ca66f78c7600c06 Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Mon, 19 May 2025 17:11:05 +0000 Subject: [PATCH 13/13] Fix typo in test --- flang/test/Semantics/OpenMP/declare-reduction-operator.f90 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 b/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 index 
e4ac7023f4629..dc12332b80baf 100644 --- a/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 +++ b/flang/test/Semantics/OpenMP/declare-reduction-operator.f90 @@ -28,7 +28,7 @@ subroutine subr(a, r) type(t1), intent(in), dimension(10) :: a type(t1), intent(out) :: r integer :: i - !$omp do parallel reduction(.fluffy.:r) + !$omp parallel do reduction(.fluffy.:r) do i=1,10 r = r .fluffy. a(i) end do From flang-commits at lists.llvm.org Mon May 19 10:52:02 2025 From: flang-commits at lists.llvm.org (Scott Manley via flang-commits) Date: Mon, 19 May 2025 10:52:02 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682b6fc2.050a0220.2af7ae.ca1d@mx.google.com> ================ ---------------- rscottmanley wrote: Can you add a nested `fir.do_loop` test case? https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Mon May 19 11:01:11 2025 From: flang-commits at lists.llvm.org (Rohit Aggarwal via flang-commits) Date: Mon, 19 May 2025 11:01:11 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Clang][Flang][Driver] Fix target parsing for -fveclib=AMDLIBM option (PR #140544) In-Reply-To: Message-ID: <682b71e7.620a0220.10aea4.d9db@mx.google.com> https://github.com/rohitaggarwal007 updated https://github.com/llvm/llvm-project/pull/140544 >From 4769d05876f3d7f4a335c10e51fb20e3c923e270 Mon Sep 17 00:00:00 2001 From: Rohit Aggarwal Date: Mon, 19 May 2025 19:25:52 +0530 Subject: [PATCH 1/2] [Clang][Flang][Driver] Fix target parsing for -fveclib=AMDLIBM option --- clang/lib/Driver/ToolChains/Clang.cpp | 2 +- clang/lib/Driver/ToolChains/CommonArgs.cpp | 1 + clang/test/Driver/fveclib.c | 10 ++++++++++ 3 files changed, 12 insertions(+), 1 deletion(-) diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index a08bdba99bfe0..4aefdb24af17b 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ 
b/clang/lib/Driver/ToolChains/Clang.cpp @@ -5844,7 +5844,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) << Name << Triple.getArchName(); - } else if (Name == "libmvec") { + } else if (Name == "libmvec" || Name == "AMDLIBM") { if (Triple.getArch() != llvm::Triple::x86 && Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp index 5c1bc090810a2..c499b7266a553 100644 --- a/clang/lib/Driver/ToolChains/CommonArgs.cpp +++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp @@ -935,6 +935,7 @@ void tools::addLTOOptions(const ToolChain &ToolChain, const ArgList &Args, llvm::StringSwitch>(ArgVecLib->getValue()) .Case("Accelerate", "Accelerate") .Case("libmvec", "LIBMVEC") + .Case("AMDLIBM", "AMDLIBM") .Case("MASSV", "MASSV") .Case("SVML", "SVML") .Case("SLEEF", "sleefgnuabi") diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c index 1235d08a3e139..5420555c36a2a 100644 --- a/clang/test/Driver/fveclib.c +++ b/clang/test/Driver/fveclib.c @@ -1,6 +1,7 @@ // RUN: %clang -### -c -fveclib=none %s 2>&1 | FileCheck --check-prefix=CHECK-NOLIB %s // RUN: %clang -### -c -fveclib=Accelerate %s 2>&1 | FileCheck --check-prefix=CHECK-ACCELERATE %s // RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-libmvec %s +// RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-AMDLIBM %s // RUN: %clang -### -c -fveclib=MASSV %s 2>&1 | FileCheck --check-prefix=CHECK-MASSV %s // RUN: %clang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck --check-prefix=CHECK-DARWIN_LIBSYSTEM_M %s // RUN: %clang -### -c --target=aarch64 -fveclib=SLEEF %s 2>&1 | FileCheck --check-prefix=CHECK-SLEEF %s @@ -11,6 +12,7 @@ // CHECK-NOLIB: 
"-fveclib=none" // CHECK-ACCELERATE: "-fveclib=Accelerate" // CHECK-libmvec: "-fveclib=libmvec" +// CHECK-AMDLIBM: "-fveclib=AMDLIBM" // CHECK-MASSV: "-fveclib=MASSV" // CHECK-DARWIN_LIBSYSTEM_M: "-fveclib=Darwin_libsystem_m" // CHECK-SLEEF: "-fveclib=SLEEF" @@ -23,6 +25,7 @@ // RUN: not %clang --target=x86 -c -fveclib=ArmPL %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s // RUN: not %clang --target=aarch64 -c -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s // RUN: not %clang --target=aarch64 -c -fveclib=SVML %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s +// RUN: not %clang --target=aarch64 -c -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s // CHECK-ERROR: unsupported option {{.*}} for target // RUN: %clang -fveclib=Accelerate %s -target arm64-apple-ios8.0.0 -### 2>&1 | FileCheck --check-prefix=CHECK-LINK %s @@ -40,6 +43,9 @@ // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=libmvec -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC" +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-AMDLIBM %s +// CHECK-LTO-AMDLIBM: "-plugin-opt=-vector-library=AMDLIBM" + // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV" @@ -62,6 +68,10 @@ // CHECK-ERRNO-LIBMVEC: "-fveclib=libmvec" // CHECK-ERRNO-LIBMVEC-SAME: "-fmath-errno" +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-AMDLIBM %s +// CHECK-ERRNO-AMDLIBM: "-fveclib=AMDLIBM" +// CHECK-ERRNO-AMDLIBM-SAME: "-fmath-errno" + // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-MASSV %s // CHECK-ERRNO-MASSV: "-fveclib=MASSV" // CHECK-ERRNO-MASSV-SAME: "-fmath-errno" 
>From 4505686899f24686aca32dfd53bd79e2908e185a Mon Sep 17 00:00:00 2001 From: Rohit Aggarwal Date: Mon, 19 May 2025 23:30:52 +0530 Subject: [PATCH 2/2] Update -fveclib=AMDLIBM in flang --- clang/lib/Driver/ToolChains/Flang.cpp | 2 +- flang/lib/Frontend/CompilerInvocation.cpp | 1 + flang/test/Driver/fveclib.f90 | 3 +++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index b1ca747e68b89..0bd8d0c85e50a 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -484,7 +484,7 @@ void Flang::addTargetOptions(const ArgList &Args, Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) << Name << Triple.getArchName(); - } else if (Name == "libmvec") { + } else if (Name == "libmvec" || Name == "AMDLIBM") { if (Triple.getArch() != llvm::Triple::x86 && Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 238079a09ef3a..a5df5b02fcf9f 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -196,6 +196,7 @@ static bool parseVectorLibArg(Fortran::frontend::CodeGenOptions &opts, llvm::StringSwitch>(arg->getValue()) .Case("Accelerate", VectorLibrary::Accelerate) .Case("libmvec", VectorLibrary::LIBMVEC) + .Case("AMDLIBM", VectorLibrary::AMDLIBM) .Case("MASSV", VectorLibrary::MASSV) .Case("SVML", VectorLibrary::SVML) .Case("SLEEF", VectorLibrary::SLEEF) diff --git a/flang/test/Driver/fveclib.f90 b/flang/test/Driver/fveclib.f90 index 1b536b8ad0f18..6cb9361f7b778 100644 --- a/flang/test/Driver/fveclib.f90 +++ b/flang/test/Driver/fveclib.f90 @@ -1,6 +1,7 @@ ! RUN: %flang -### -c -fveclib=none %s 2>&1 | FileCheck -check-prefix CHECK-NOLIB %s ! 
RUN: %flang -### -c -fveclib=Accelerate %s 2>&1 | FileCheck -check-prefix CHECK-ACCELERATE %s ! RUN: %flang -### -c --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck -check-prefix CHECK-libmvec %s +! RUN: %flang -### -c --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck -check-prefix CHECK-AMDLIBM %s ! RUN: %flang -### -c -fveclib=MASSV %s 2>&1 | FileCheck -check-prefix CHECK-MASSV %s ! RUN: %flang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck -check-prefix CHECK-DARWIN_LIBSYSTEM_M %s ! RUN: %flang -### -c --target=aarch64-none-none -fveclib=SLEEF %s 2>&1 | FileCheck -check-prefix CHECK-SLEEF %s @@ -11,6 +12,7 @@ ! CHECK-NOLIB: "-fveclib=none" ! CHECK-ACCELERATE: "-fveclib=Accelerate" ! CHECK-libmvec: "-fveclib=libmvec" +! CHECK-AMDLIBM: "-fveclib=AMDLIBM" ! CHECK-MASSV: "-fveclib=MASSV" ! CHECK-DARWIN_LIBSYSTEM_M: "-fveclib=Darwin_libsystem_m" ! CHECK-SLEEF: "-fveclib=SLEEF" @@ -22,6 +24,7 @@ ! RUN: not %flang --target=x86-none-none -c -fveclib=SLEEF %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=x86-none-none -c -fveclib=ArmPL %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=aarch64-none-none -c -fveclib=libmvec %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s +! RUN: not %flang --target=aarch64-none-none -c -fveclib=AMDLIBM %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=aarch64-none-none -c -fveclib=SVML %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! CHECK-ERROR: unsupported option {{.*}} for target From flang-commits at lists.llvm.org Mon May 19 12:00:19 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 19 May 2025 12:00:19 -0700 (PDT) Subject: [flang-commits] [flang] [draft][flang] Undo the effects of CSE for hlfir.exactly_once. 
(PR #140190) In-Reply-To: Message-ID: <682b7fc3.170a0220.72c48.b59d@mx.google.com> https://github.com/vzakhari updated https://github.com/llvm/llvm-project/pull/140190 >From 41f8ec79d42e29228525efee9611f7cb761c18a6 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Thu, 15 May 2025 21:53:46 -0700 Subject: [PATCH 1/2] [draft][flang] Undo the effects of CSE for hlfir.exactly_once. CSE may delete operations from hlfir.exactly_once and reuse the equivalent results from the parent region(s), e.g. from the parent hlfir.region_assign. This makes it problematic to clone hlfir.exactly_once before the top-level hlfir.where. This patch adds a "canonicalizer" that pulls in such operations back into hlfir.exactly_once. --- .../LowerHLFIROrderedAssignments.cpp | 119 ++++++++++++++++++ 1 file changed, 119 insertions(+) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp b/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp index 5cae7cf443c86..89b5ccb7d850e 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp @@ -24,12 +24,15 @@ #include "flang/Optimizer/Builder/Todo.h" #include "flang/Optimizer/Dialect/Support/FIRContext.h" #include "flang/Optimizer/HLFIR/Passes.h" +#include "mlir/Analysis/Liveness.h" #include "mlir/IR/Dominance.h" #include "mlir/IR/IRMapping.h" #include "mlir/Transforms/DialectConversion.h" +#include "mlir/Transforms/RegionUtils.h" #include "llvm/ADT/SmallSet.h" #include "llvm/ADT/TypeSwitch.h" #include "llvm/Support/Debug.h" +#include namespace hlfir { #define GEN_PASS_DEF_LOWERHLFIRORDEREDASSIGNMENTS @@ -263,6 +266,19 @@ class OrderedAssignmentRewriter { return &inserted.first->second; } + /// Given a top-level hlfir.where, look for hlfir.exactly_once operations + /// inside it and see if any of the values live into hlfir.exactly_once + /// do not dominate hlfir.where. 
This may happen due to CSE reusing + /// results of operations from the region parent to hlfir.exactly_once. + /// Since we are going to clone the body of hlfir.exactly_once before + /// the top-level hlfir.where, such def-use will cause problems. + /// There are options how to resolve this in a different way, + /// e.g. making hlfir.exactly_once IsolatedFromAbove or making + /// it a region of hlfir.where and wiring the result(s) through + /// the block arguments. For the time being, this canonicalization + /// tries to undo the effects of CSE. + void canonicalizeExactlyOnceInsideWhere(hlfir::WhereOp whereOp); + fir::FirOpBuilder &builder; /// Map containing the mapping between the original order assignment tree @@ -523,6 +539,10 @@ void OrderedAssignmentRewriter::generateMaskIfOp(mlir::Value cdt) { void OrderedAssignmentRewriter::pre(hlfir::WhereOp whereOp) { mlir::Location loc = whereOp.getLoc(); if (!whereLoopNest) { + // Make sure liveness information is valid for the inner hlfir.exactly_once + // operations, and their bodies can be cloned before the top-level + // hlfir.where. + canonicalizeExactlyOnceInsideWhere(whereOp); // This is the top-level WHERE. Start a loop nest iterating on the shape of // the where mask. 
if (auto maybeSaved = getIfSaved(whereOp.getMaskRegion())) { @@ -1350,6 +1370,105 @@ void OrderedAssignmentRewriter::saveLeftHandSide( } } +void OrderedAssignmentRewriter::canonicalizeExactlyOnceInsideWhere( + hlfir::WhereOp whereOp) { + auto getDefinition = [](mlir::Value v) { + mlir::Operation *op = v.getDefiningOp(); + bool isValid = true; + if (!op) { + LLVM_DEBUG( + llvm::dbgs() + << "Value live into hlfir.exactly_once has no defining operation: " + << v << "\n"); + isValid = false; + } + if (op->getNumRegions() != 0) { + LLVM_DEBUG( + llvm::dbgs() + << "Cannot pull an operation with regions into hlfir.exactly_once" + << *op << "\n"); + isValid = false; + } + auto effects = mlir::getEffectsRecursively(op); + if (!effects || !effects->empty()) { + LLVM_DEBUG(llvm::dbgs() << "Side effects on operation with result live " + "into hlfir.exactly_once" + << *op << "\n"); + isValid = false; + } + assert(isValid && "invalid live-in"); + return op; + }; + mlir::Liveness liveness(whereOp.getOperation()); + whereOp->walk([&](hlfir::ExactlyOnceOp op) { + std::unordered_set liveInSet; + LLVM_DEBUG(llvm::dbgs() << "Canonicalizing:\n" << op << "\n"); + auto &liveIns = liveness.getLiveIn(&op.getBody().front()); + if (liveIns.empty()) + return; + // Note that the liveIns set is not ordered. + for (mlir::Value liveIn : liveIns) { + if (!dominanceInfo.properlyDominates(liveIn, whereOp)) { + LLVM_DEBUG(llvm::dbgs() + << "Does not dominate top-level where: " << liveIn << "\n"); + liveInSet.insert(getDefinition(liveIn)); + } + } + + // Populate the set of operations that we need to pull into + // hlfir.exactly_once, so that the only live-ins left are the ones + // that dominate whereOp. 
+ std::unordered_set cloneSet(liveInSet); + llvm::SmallVector workList(cloneSet.begin(), + cloneSet.end()); + while (!workList.empty()) { + mlir::Operation *current = workList.pop_back_val(); + for (mlir::Value operand : current->getOperands()) { + if (dominanceInfo.properlyDominates(operand, whereOp)) + continue; + mlir::Operation *def = getDefinition(operand); + if (cloneSet.count(def)) + continue; + cloneSet.insert(def); + workList.push_back(def); + } + } + + // Sort the operations by dominance. This preserves their order + // after the cloning, and also guarantees stable IR generation. + llvm::SmallVector cloneList(cloneSet.begin(), + cloneSet.end()); + llvm::sort(cloneList, [&](mlir::Operation *L, mlir::Operation *R) { + return dominanceInfo.properlyDominates(L, R); + }); + + // Clone the operations. + mlir::IRMapping mapper; + mlir::Operation::CloneOptions options; + options.cloneOperands(); + mlir::OpBuilder::InsertionGuard guard(builder); + builder.setInsertionPointToStart(&op.getBody().front()); + + for (auto *toClone : cloneList) { + LLVM_DEBUG(llvm::dbgs() << "Cloning: " << *toClone << "\n"); + builder.insert(toClone->clone(mapper, options)); + } + for (mlir::Operation *oldOps : liveInSet) + for (mlir::Value oldVal : oldOps->getResults()) { + mlir::Value newVal = mapper.lookup(oldVal); + if (!newVal) { + LLVM_DEBUG(llvm::dbgs() << "No clone found for: " << oldVal << "\n"); + assert(false && "missing clone"); + } + mlir::replaceAllUsesInRegionWith(oldVal, newVal, op.getBody()); + } + + LLVM_DEBUG(llvm::dbgs() << "Finished canonicalization\n"); + if (!liveInSet.empty()) + LLVM_DEBUG(llvm::dbgs() << op << "\n"); + }); +} + /// Lower an ordered assignment tree to fir.do_loop and hlfir.assign given /// a schedule. static void lower(hlfir::OrderedAssignmentTreeOpInterface root, >From 0d4c96cac4018cac6d56bea69b60edf80e7c8c1c Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Mon, 19 May 2025 11:38:30 -0700 Subject: [PATCH 2/2] Added LIT test. 
--- .../order_assignments/where-after-cse.fir | 254 ++++++++++++++++++ 1 file changed, 254 insertions(+) create mode 100644 flang/test/HLFIR/order_assignments/where-after-cse.fir diff --git a/flang/test/HLFIR/order_assignments/where-after-cse.fir b/flang/test/HLFIR/order_assignments/where-after-cse.fir new file mode 100644 index 0000000000000..4505c879c7b0f --- /dev/null +++ b/flang/test/HLFIR/order_assignments/where-after-cse.fir @@ -0,0 +1,254 @@ +// Test canonicalization of hlfir.exactly_once operations +// after CSE. The live-in values that are not dominating +// the top-level hlfir.where must be cloned inside hlfir.exactly_once, +// otherwise, the cloning of the hlfir.exactly_once before hlfir.where +// would cause def-use issues: +// RUN: fir-opt %s --lower-hlfir-ordered-assignments | FileCheck %s + +// Simple case, where CSE makes only hlfir.designate live-in: +// CHECK-LABEL: func.func @_QPtest1( +func.func @_QPtest1(%arg0: !fir.ref>>,p2:!fir.box>>}>> {fir.bindc_name = "x"}) { + %true = arith.constant true + %cst = arith.constant 0.000000e+00 : f32 + %c1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %0 = fir.dummy_scope : !fir.dscope + %1:2 = hlfir.declare %arg0 dummy_scope %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest1Ex"} : (!fir.ref>>,p2:!fir.box>>}>>, !fir.dscope) -> (!fir.ref>>,p2:!fir.box>>}>>, !fir.ref>>,p2:!fir.box>>}>>) + hlfir.where { + %2 = hlfir.designate %1#0{"p2"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> + %3 = fir.load %2 : !fir.ref>>> + %4:3 = fir.box_dims %3, %c0 : (!fir.box>>, index) -> (index, index, index) + %5 = arith.addi %4#0, %4#1 : index + %6 = arith.subi %5, %c1 : index + %7 = arith.subi %6, %4#0 : index + %8 = arith.addi %7, %c1 : index + %9 = arith.cmpi sgt, %8, %c0 : index + %10 = arith.select %9, %8, %c0 : index + %11 = fir.shape %10 : (index) -> !fir.shape<1> + %12 = hlfir.designate %3 (%4#0:%6:%c1) shape %11 : (!fir.box>>, index, index, index, !fir.shape<1>) -> 
!fir.box> + %13 = hlfir.elemental %11 unordered : (!fir.shape<1>) -> !hlfir.expr> { + ^bb0(%arg1: index): + %14 = hlfir.designate %12 (%arg1) : (!fir.box>, index) -> !fir.ref + %15 = fir.load %14 : !fir.ref + %16 = arith.cmpf ogt, %15, %cst fastmath : f32 + %17 = fir.convert %16 : (i1) -> !fir.logical<4> + hlfir.yield_element %17 : !fir.logical<4> + } + hlfir.yield %13 : !hlfir.expr> cleanup { + hlfir.destroy %13 : !hlfir.expr> + } + } do { + hlfir.region_assign { + %2 = hlfir.designate %1#0{"p1"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> + %3 = fir.load %2 : !fir.ref>>> + %4:3 = fir.box_dims %3, %c0 : (!fir.box>>, index) -> (index, index, index) + %5 = arith.addi %4#0, %4#1 : index + %6 = arith.subi %5, %c1 : index + %7 = arith.subi %6, %4#0 : index + %8 = arith.addi %7, %c1 : index + %9 = arith.cmpi sgt, %8, %c0 : index + %10 = arith.select %9, %8, %c0 : index + %11 = fir.shape %10 : (index) -> !fir.shape<1> + %12 = hlfir.designate %3 (%4#0:%6:%c1, %c1) shape %11 : (!fir.box>>, index, index, index, index, !fir.shape<1>) -> !fir.box> + %13 = hlfir.exactly_once : !hlfir.expr { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %{{.*}}#0{"p1"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> +// CHECK: fir.load %[[VAL_26]] : !fir.ref>>> +// CHECK: %[[VAL_47:.*]] = fir.call @_QPcallee(%{{.*}}) fastmath : (!fir.box>) -> !fir.array +// CHECK: fir.do_loop + %15 = fir.load %2 : !fir.ref>>> + %16:3 = fir.box_dims %15, %c0 : (!fir.box>>, index) -> (index, index, index) + %17 = arith.addi %16#0, %16#1 : index + %18 = arith.subi %17, %c1 : index + %19 = arith.subi %18, %16#0 : index + %20 = arith.addi %19, %c1 : index + %21 = arith.cmpi sgt, %20, %c0 : index + %22 = arith.select %21, %20, %c0 : index + %23 = fir.shape %22 : (index) -> !fir.shape<1> + %24 = hlfir.designate %15 (%16#0:%18:%c1, %c1) shape %23 : (!fir.box>>, index, index, index, index, !fir.shape<1>) -> !fir.box> + %25:2 = hlfir.declare %24 
{fortran_attrs = #fir.var_attrs, uniq_name = "_QMmy_moduleFcalleeEx"} : (!fir.box>) -> (!fir.box>, !fir.box>) + %26:3 = fir.box_dims %25#0, %c0 : (!fir.box>, index) -> (index, index, index) + %27 = fir.convert %26#1 : (index) -> i64 + %28 = fir.convert %27 : (i64) -> index + %29 = arith.cmpi sgt, %28, %c0 : index + %30 = arith.select %29, %28, %c0 : index + %31 = fir.shape %30 : (index) -> !fir.shape<1> + %32 = fir.allocmem !fir.array, %30 {bindc_name = ".tmp.expr_result", uniq_name = ""} + %33 = fir.convert %32 : (!fir.heap>) -> !fir.ref> + %34:2 = hlfir.declare %33(%31) {uniq_name = ".tmp.expr_result"} : (!fir.ref>, !fir.shape<1>) -> (!fir.box>, !fir.ref>) + %35 = fir.call @_QPcallee(%24) fastmath : (!fir.box>) -> !fir.array + fir.save_result %35 to %34#1(%31) : !fir.array, !fir.ref>, !fir.shape<1> + %36 = hlfir.as_expr %34#0 move %true : (!fir.box>, i1) -> !hlfir.expr + hlfir.yield %36 : !hlfir.expr cleanup { + hlfir.destroy %36 : !hlfir.expr + } + } + %14 = hlfir.elemental %11 unordered : (!fir.shape<1>) -> !hlfir.expr { + ^bb0(%arg1: index): + %15 = hlfir.designate %12 (%arg1) : (!fir.box>, index) -> !fir.ref + %16 = hlfir.apply %13, %arg1 : (!hlfir.expr, index) -> f32 + %17 = fir.load %15 : !fir.ref + %18 = arith.divf %17, %16 fastmath : f32 + hlfir.yield_element %18 : f32 + } + hlfir.yield %14 : !hlfir.expr cleanup { + hlfir.destroy %14 : !hlfir.expr + } + } to { + %2 = hlfir.designate %1#0{"p2"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> + %3 = fir.load %2 : !fir.ref>>> + %4:3 = fir.box_dims %3, %c0 : (!fir.box>>, index) -> (index, index, index) + %5 = arith.addi %4#0, %4#1 : index + %6 = arith.subi %5, %c1 : index + %7 = arith.subi %6, %4#0 : index + %8 = arith.addi %7, %c1 : index + %9 = arith.cmpi sgt, %8, %c0 : index + %10 = arith.select %9, %8, %c0 : index + %11 = fir.shape %10 : (index) -> !fir.shape<1> + %12 = hlfir.designate %3 (%4#0:%6:%c1) shape %11 : (!fir.box>>, index, index, index, !fir.shape<1>) -> 
!fir.box> + hlfir.yield %12 : !fir.box> + } + } + return +} + +// CSE makes a chain of operations live-in: +// CHECK-LABEL: func.func @_QPtest_where_in_forall( +func.func @_QPtest_where_in_forall(%arg0: !fir.box> {fir.bindc_name = "a"}, %arg1: !fir.box> {fir.bindc_name = "b"}, %arg2: !fir.box> {fir.bindc_name = "c"}) { + %false = arith.constant false + %c1_i32 = arith.constant 1 : i32 + %c10_i32 = arith.constant 10 : i32 + %c0 = arith.constant 0 : index + %c1 = arith.constant 1 : index + %c2_i32 = arith.constant 2 : i32 + %c100 = arith.constant 100 : index + %0 = fir.alloca !fir.array<100x!fir.logical<4>> {bindc_name = ".tmp.expr_result"} + %1 = fir.alloca !fir.array<100x!fir.logical<4>> {bindc_name = ".tmp.expr_result"} + %2 = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_21:.*]]:2 = hlfir.declare %{{.*}} dummy_scope %{{.*}} {uniq_name = "_QFtest_where_in_forallEb"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %3:2 = hlfir.declare %arg0 dummy_scope %2 {uniq_name = "_QFtest_where_in_forallEa"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %4:2 = hlfir.declare %arg1 dummy_scope %2 {uniq_name = "_QFtest_where_in_forallEb"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5:2 = hlfir.declare %arg2 dummy_scope %2 {uniq_name = "_QFtest_where_in_forallEc"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + hlfir.forall lb { + hlfir.yield %c1_i32 : i32 + } ub { + hlfir.yield %c10_i32 : i32 + } (%arg3: i32) { + hlfir.where { + %6 = fir.shape %c100 : (index) -> !fir.shape<1> + %7:2 = hlfir.declare %0(%6) {uniq_name = ".tmp.expr_result"} : (!fir.ref>>, !fir.shape<1>) -> (!fir.ref>>, !fir.ref>>) + %8 = fir.call @_QPpure_logical_func1() proc_attrs fastmath : () -> !fir.array<100x!fir.logical<4>> + fir.save_result %8 to %7#1(%6) : !fir.array<100x!fir.logical<4>>, !fir.ref>>, !fir.shape<1> + %9 = hlfir.as_expr %7#0 move %false : (!fir.ref>>, i1) -> !hlfir.expr<100x!fir.logical<4>> + hlfir.yield %9 : !hlfir.expr<100x!fir.logical<4>> cleanup { + 
hlfir.destroy %9 : !hlfir.expr<100x!fir.logical<4>> + } + } do { + hlfir.region_assign { + %6 = fir.convert %arg3 : (i32) -> i64 +// CHECK: %[[VAL_58:.*]]:3 = fir.box_dims %[[VAL_21]]#1, %{{.*}} : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_59:.*]] = arith.cmpi sgt, %[[VAL_58]]#1, %{{.*}} : index +// CHECK: %[[VAL_60:.*]] = arith.select %[[VAL_59]], %[[VAL_58]]#1, %{{.*}} : index +// CHECK: %[[VAL_61:.*]] = fir.shape %[[VAL_60]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_62:.*]] = hlfir.designate %[[VAL_21]]#0 (%{{.*}}, %{{.*}}:%[[VAL_58]]#1:%{{.*}}) shape %[[VAL_61]] : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + %7:3 = fir.box_dims %4#1, %c1 : (!fir.box>, index) -> (index, index, index) + %8 = arith.cmpi sgt, %7#1, %c0 : index + %9 = arith.select %8, %7#1, %c0 : index + %10 = fir.shape %9 : (index) -> !fir.shape<1> + %11 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1) shape %10 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + %12 = hlfir.exactly_once : f32 { + %19:3 = fir.box_dims %3#1, %c1 : (!fir.box>, index) -> (index, index, index) + %20 = arith.cmpi sgt, %19#1, %c0 : index + %21 = arith.select %20, %19#1, %c0 : index + %22 = fir.shape %21 : (index) -> !fir.shape<1> + %23 = hlfir.designate %3#0 (%6, %c1:%19#1:%c1) shape %22 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_68:.*]] = fir.call @_QPpure_real_func2() fastmath : () -> f32 +// CHECK: %[[VAL_69:.*]] = hlfir.elemental %{{.*}} unordered : (!fir.shape<1>) -> !hlfir.expr { +// CHECK: ^bb0(%[[VAL_70:.*]]: index): +// CHECK: %[[VAL_72:.*]] = hlfir.designate %[[VAL_62]] (%[[VAL_70]]) : (!fir.box>, index) -> !fir.ref + %24 = fir.call @_QPpure_real_func2() fastmath : () -> f32 + %25 = hlfir.elemental %22 unordered : (!fir.shape<1>) -> !hlfir.expr { + ^bb0(%arg4: index): + %28 = hlfir.designate %23 (%arg4) : (!fir.box>, index) -> !fir.ref + %29 = hlfir.designate %11 (%arg4) : (!fir.box>, index) -> !fir.ref + %30 = 
fir.load %28 : !fir.ref + %31 = fir.load %29 : !fir.ref + %32 = arith.addf %30, %31 fastmath : f32 + %33 = arith.addf %32, %24 fastmath : f32 + hlfir.yield_element %33 : f32 + } + %26:3 = hlfir.associate %25(%22) {adapt.valuebyref} : (!hlfir.expr, !fir.shape<1>) -> (!fir.box>, !fir.ref>, i1) + %27 = fir.call @_QPpure_real_func(%26#1) fastmath : (!fir.ref>) -> f32 + hlfir.yield %27 : f32 cleanup { + hlfir.end_associate %26#1, %26#2 : !fir.ref>, i1 + hlfir.destroy %25 : !hlfir.expr + } + } + %13:3 = fir.box_dims %3#1, %c1 : (!fir.box>, index) -> (index, index, index) + %14 = arith.cmpi sgt, %13#1, %c0 : index + %15 = arith.select %14, %13#1, %c0 : index + %16 = fir.shape %15 : (index) -> !fir.shape<1> + %17 = hlfir.designate %3#0 (%6, %c1:%13#1:%c1) shape %16 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + %18 = hlfir.elemental %10 unordered : (!fir.shape<1>) -> !hlfir.expr { + ^bb0(%arg4: index): + %19 = hlfir.designate %11 (%arg4) : (!fir.box>, index) -> !fir.ref + %20 = fir.load %19 : !fir.ref + %21 = arith.addf %20, %12 fastmath : f32 + %22 = hlfir.designate %17 (%arg4) : (!fir.box>, index) -> !fir.ref + %23 = fir.call @_QPpure_elem_func(%22) proc_attrs fastmath : (!fir.ref) -> f32 + %24 = arith.addf %21, %23 fastmath : f32 + hlfir.yield_element %24 : f32 + } + hlfir.yield %18 : !hlfir.expr cleanup { + hlfir.destroy %18 : !hlfir.expr + } + } to { + %6 = arith.muli %arg3, %c2_i32 overflow : i32 + %7 = fir.convert %6 : (i32) -> i64 + %8:3 = fir.box_dims %3#1, %c1 : (!fir.box>, index) -> (index, index, index) + %9 = arith.cmpi sgt, %8#1, %c0 : index + %10 = arith.select %9, %8#1, %c0 : index + %11 = fir.shape %10 : (index) -> !fir.shape<1> + %12 = hlfir.designate %3#0 (%7, %c1:%8#1:%c1) shape %11 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + hlfir.yield %12 : !fir.box> + } + hlfir.elsewhere mask { + %6 = hlfir.exactly_once : !hlfir.expr<100x!fir.logical<4>> { + %7 = fir.shape %c100 : (index) -> !fir.shape<1> + %8:2 = 
hlfir.declare %1(%7) {uniq_name = ".tmp.expr_result"} : (!fir.ref>>, !fir.shape<1>) -> (!fir.ref>>, !fir.ref>>) + %9 = fir.call @_QPpure_logical_func2() proc_attrs fastmath : () -> !fir.array<100x!fir.logical<4>> + fir.save_result %9 to %8#1(%7) : !fir.array<100x!fir.logical<4>>, !fir.ref>>, !fir.shape<1> + %10 = hlfir.as_expr %8#0 move %false : (!fir.ref>>, i1) -> !hlfir.expr<100x!fir.logical<4>> + hlfir.yield %10 : !hlfir.expr<100x!fir.logical<4>> cleanup { + hlfir.destroy %10 : !hlfir.expr<100x!fir.logical<4>> + } + } + hlfir.yield %6 : !hlfir.expr<100x!fir.logical<4>> + } do { + hlfir.region_assign { + %6 = fir.convert %arg3 : (i32) -> i64 + %7:3 = fir.box_dims %5#1, %c1 : (!fir.box>, index) -> (index, index, index) + %8 = arith.cmpi sgt, %7#1, %c0 : index + %9 = arith.select %8, %7#1, %c0 : index + %10 = fir.shape %9 : (index) -> !fir.shape<1> + %11 = hlfir.designate %5#0 (%6, %c1:%7#1:%c1) shape %10 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + hlfir.yield %11 : !fir.box> + } to { + %6 = arith.muli %arg3, %c2_i32 overflow : i32 + %7 = fir.convert %6 : (i32) -> i64 + %8 = hlfir.exactly_once : i32 { + %14 = fir.call @_QPpure_ifoo() proc_attrs fastmath : () -> i32 + hlfir.yield %14 : i32 cleanup { + } + } + %9 = fir.convert %8 : (i32) -> index + %10 = arith.cmpi sgt, %9, %c0 : index + %11 = arith.select %10, %9, %c0 : index + %12 = fir.shape %11 : (index) -> !fir.shape<1> + %13 = hlfir.designate %3#0 (%7, %c1:%9:%c1) shape %12 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + hlfir.yield %13 : !fir.box> + } + } + } + } + return +} From flang-commits at lists.llvm.org Mon May 19 12:00:40 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 19 May 2025 12:00:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Undo the effects of CSE for hlfir.exactly_once. 
(PR #140190) In-Reply-To: Message-ID: <682b7fd8.170a0220.220399.c59e@mx.google.com> https://github.com/vzakhari edited https://github.com/llvm/llvm-project/pull/140190 From flang-commits at lists.llvm.org Mon May 19 12:00:46 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Mon, 19 May 2025 12:00:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Undo the effects of CSE for hlfir.exactly_once. (PR #140190) In-Reply-To: Message-ID: <682b7fde.050a0220.1fc0a7.e01b@mx.google.com> https://github.com/vzakhari ready_for_review https://github.com/llvm/llvm-project/pull/140190 From flang-commits at lists.llvm.org Mon May 19 12:01:20 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 12:01:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Undo the effects of CSE for hlfir.exactly_once. (PR #140190) In-Reply-To: Message-ID: <682b8000.630a0220.8ee9c.0aa6@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Slava Zakharin (vzakhari)
Changes CSE may delete operations from hlfir.exactly_once and reuse the equivalent results from the parent region(s), e.g. from the parent hlfir.region_assign. This makes it problematic to clone hlfir.exactly_once before the top-level hlfir.where. This patch adds a "canonicalizer" that pulls such operations back into hlfir.exactly_once. --- Patch is 24.39 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140190.diff 2 Files Affected: - (modified) flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp (+119) - (added) flang/test/HLFIR/order_assignments/where-after-cse.fir (+254) ``````````diff diff --git a/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp b/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp index 5cae7cf443c86..89b5ccb7d850e 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp @@ -24,12 +24,15 @@ #include "flang/Optimizer/Builder/Todo.h" #include "flang/Optimizer/Dialect/Support/FIRContext.h" #include "flang/Optimizer/HLFIR/Passes.h" +#include "mlir/Analysis/Liveness.h" #include "mlir/IR/Dominance.h" #include "mlir/IR/IRMapping.h" #include "mlir/Transforms/DialectConversion.h" +#include "mlir/Transforms/RegionUtils.h" #include "llvm/ADT/SmallSet.h" #include "llvm/ADT/TypeSwitch.h" #include "llvm/Support/Debug.h" +#include namespace hlfir { #define GEN_PASS_DEF_LOWERHLFIRORDEREDASSIGNMENTS @@ -263,6 +266,19 @@ class OrderedAssignmentRewriter { return &inserted.first->second; } + /// Given a top-level hlfir.where, look for hlfir.exactly_once operations + /// inside it and see if any of the values live into hlfir.exactly_once + /// do not dominate hlfir.where. This may happen due to CSE reusing + /// results of operations from the region parent to hlfir.exactly_once. 
+ /// Since we are going to clone the body of hlfir.exactly_once before + /// the top-level hlfir.where, such def-use will cause problems. + /// There are options how to resolve this in a different way, + /// e.g. making hlfir.exactly_once IsolatedFromAbove or making + /// it a region of hlfir.where and wiring the result(s) through + /// the block arguments. For the time being, this canonicalization + /// tries to undo the effects of CSE. + void canonicalizeExactlyOnceInsideWhere(hlfir::WhereOp whereOp); + fir::FirOpBuilder &builder; /// Map containing the mapping between the original order assignment tree @@ -523,6 +539,10 @@ void OrderedAssignmentRewriter::generateMaskIfOp(mlir::Value cdt) { void OrderedAssignmentRewriter::pre(hlfir::WhereOp whereOp) { mlir::Location loc = whereOp.getLoc(); if (!whereLoopNest) { + // Make sure liveness information is valid for the inner hlfir.exactly_once + // operations, and their bodies can be cloned before the top-level + // hlfir.where. + canonicalizeExactlyOnceInsideWhere(whereOp); // This is the top-level WHERE. Start a loop nest iterating on the shape of // the where mask. 
if (auto maybeSaved = getIfSaved(whereOp.getMaskRegion())) { @@ -1350,6 +1370,105 @@ void OrderedAssignmentRewriter::saveLeftHandSide( } } +void OrderedAssignmentRewriter::canonicalizeExactlyOnceInsideWhere( + hlfir::WhereOp whereOp) { + auto getDefinition = [](mlir::Value v) { + mlir::Operation *op = v.getDefiningOp(); + bool isValid = true; + if (!op) { + LLVM_DEBUG( + llvm::dbgs() + << "Value live into hlfir.exactly_once has no defining operation: " + << v << "\n"); + isValid = false; + } + if (op->getNumRegions() != 0) { + LLVM_DEBUG( + llvm::dbgs() + << "Cannot pull an operation with regions into hlfir.exactly_once" + << *op << "\n"); + isValid = false; + } + auto effects = mlir::getEffectsRecursively(op); + if (!effects || !effects->empty()) { + LLVM_DEBUG(llvm::dbgs() << "Side effects on operation with result live " + "into hlfir.exactly_once" + << *op << "\n"); + isValid = false; + } + assert(isValid && "invalid live-in"); + return op; + }; + mlir::Liveness liveness(whereOp.getOperation()); + whereOp->walk([&](hlfir::ExactlyOnceOp op) { + std::unordered_set liveInSet; + LLVM_DEBUG(llvm::dbgs() << "Canonicalizing:\n" << op << "\n"); + auto &liveIns = liveness.getLiveIn(&op.getBody().front()); + if (liveIns.empty()) + return; + // Note that the liveIns set is not ordered. + for (mlir::Value liveIn : liveIns) { + if (!dominanceInfo.properlyDominates(liveIn, whereOp)) { + LLVM_DEBUG(llvm::dbgs() + << "Does not dominate top-level where: " << liveIn << "\n"); + liveInSet.insert(getDefinition(liveIn)); + } + } + + // Populate the set of operations that we need to pull into + // hlfir.exactly_once, so that the only live-ins left are the ones + // that dominate whereOp. 
+ std::unordered_set cloneSet(liveInSet); + llvm::SmallVector workList(cloneSet.begin(), + cloneSet.end()); + while (!workList.empty()) { + mlir::Operation *current = workList.pop_back_val(); + for (mlir::Value operand : current->getOperands()) { + if (dominanceInfo.properlyDominates(operand, whereOp)) + continue; + mlir::Operation *def = getDefinition(operand); + if (cloneSet.count(def)) + continue; + cloneSet.insert(def); + workList.push_back(def); + } + } + + // Sort the operations by dominance. This preserves their order + // after the cloning, and also guarantees stable IR generation. + llvm::SmallVector cloneList(cloneSet.begin(), + cloneSet.end()); + llvm::sort(cloneList, [&](mlir::Operation *L, mlir::Operation *R) { + return dominanceInfo.properlyDominates(L, R); + }); + + // Clone the operations. + mlir::IRMapping mapper; + mlir::Operation::CloneOptions options; + options.cloneOperands(); + mlir::OpBuilder::InsertionGuard guard(builder); + builder.setInsertionPointToStart(&op.getBody().front()); + + for (auto *toClone : cloneList) { + LLVM_DEBUG(llvm::dbgs() << "Cloning: " << *toClone << "\n"); + builder.insert(toClone->clone(mapper, options)); + } + for (mlir::Operation *oldOps : liveInSet) + for (mlir::Value oldVal : oldOps->getResults()) { + mlir::Value newVal = mapper.lookup(oldVal); + if (!newVal) { + LLVM_DEBUG(llvm::dbgs() << "No clone found for: " << oldVal << "\n"); + assert(false && "missing clone"); + } + mlir::replaceAllUsesInRegionWith(oldVal, newVal, op.getBody()); + } + + LLVM_DEBUG(llvm::dbgs() << "Finished canonicalization\n"); + if (!liveInSet.empty()) + LLVM_DEBUG(llvm::dbgs() << op << "\n"); + }); +} + /// Lower an ordered assignment tree to fir.do_loop and hlfir.assign given /// a schedule. 
static void lower(hlfir::OrderedAssignmentTreeOpInterface root, diff --git a/flang/test/HLFIR/order_assignments/where-after-cse.fir b/flang/test/HLFIR/order_assignments/where-after-cse.fir new file mode 100644 index 0000000000000..4505c879c7b0f --- /dev/null +++ b/flang/test/HLFIR/order_assignments/where-after-cse.fir @@ -0,0 +1,254 @@ +// Test canonicalization of hlfir.exactly_once operations +// after CSE. The live-in values that are not dominating +// the top-level hlfir.where must be cloned inside hlfir.exactly_once, +// otherwise, the cloning of the hlfir.exactly_once before hlfir.where +// would cause def-use issues: +// RUN: fir-opt %s --lower-hlfir-ordered-assignments | FileCheck %s + +// Simple case, where CSE makes only hlfir.designate live-in: +// CHECK-LABEL: func.func @_QPtest1( +func.func @_QPtest1(%arg0: !fir.ref>>,p2:!fir.box>>}>> {fir.bindc_name = "x"}) { + %true = arith.constant true + %cst = arith.constant 0.000000e+00 : f32 + %c1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %0 = fir.dummy_scope : !fir.dscope + %1:2 = hlfir.declare %arg0 dummy_scope %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest1Ex"} : (!fir.ref>>,p2:!fir.box>>}>>, !fir.dscope) -> (!fir.ref>>,p2:!fir.box>>}>>, !fir.ref>>,p2:!fir.box>>}>>) + hlfir.where { + %2 = hlfir.designate %1#0{"p2"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> + %3 = fir.load %2 : !fir.ref>>> + %4:3 = fir.box_dims %3, %c0 : (!fir.box>>, index) -> (index, index, index) + %5 = arith.addi %4#0, %4#1 : index + %6 = arith.subi %5, %c1 : index + %7 = arith.subi %6, %4#0 : index + %8 = arith.addi %7, %c1 : index + %9 = arith.cmpi sgt, %8, %c0 : index + %10 = arith.select %9, %8, %c0 : index + %11 = fir.shape %10 : (index) -> !fir.shape<1> + %12 = hlfir.designate %3 (%4#0:%6:%c1) shape %11 : (!fir.box>>, index, index, index, !fir.shape<1>) -> !fir.box> + %13 = hlfir.elemental %11 unordered : (!fir.shape<1>) -> !hlfir.expr> { + ^bb0(%arg1: index): + %14 = 
hlfir.designate %12 (%arg1) : (!fir.box>, index) -> !fir.ref + %15 = fir.load %14 : !fir.ref + %16 = arith.cmpf ogt, %15, %cst fastmath : f32 + %17 = fir.convert %16 : (i1) -> !fir.logical<4> + hlfir.yield_element %17 : !fir.logical<4> + } + hlfir.yield %13 : !hlfir.expr> cleanup { + hlfir.destroy %13 : !hlfir.expr> + } + } do { + hlfir.region_assign { + %2 = hlfir.designate %1#0{"p1"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> + %3 = fir.load %2 : !fir.ref>>> + %4:3 = fir.box_dims %3, %c0 : (!fir.box>>, index) -> (index, index, index) + %5 = arith.addi %4#0, %4#1 : index + %6 = arith.subi %5, %c1 : index + %7 = arith.subi %6, %4#0 : index + %8 = arith.addi %7, %c1 : index + %9 = arith.cmpi sgt, %8, %c0 : index + %10 = arith.select %9, %8, %c0 : index + %11 = fir.shape %10 : (index) -> !fir.shape<1> + %12 = hlfir.designate %3 (%4#0:%6:%c1, %c1) shape %11 : (!fir.box>>, index, index, index, index, !fir.shape<1>) -> !fir.box> + %13 = hlfir.exactly_once : !hlfir.expr { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %{{.*}}#0{"p1"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> +// CHECK: fir.load %[[VAL_26]] : !fir.ref>>> +// CHECK: %[[VAL_47:.*]] = fir.call @_QPcallee(%{{.*}}) fastmath : (!fir.box>) -> !fir.array +// CHECK: fir.do_loop + %15 = fir.load %2 : !fir.ref>>> + %16:3 = fir.box_dims %15, %c0 : (!fir.box>>, index) -> (index, index, index) + %17 = arith.addi %16#0, %16#1 : index + %18 = arith.subi %17, %c1 : index + %19 = arith.subi %18, %16#0 : index + %20 = arith.addi %19, %c1 : index + %21 = arith.cmpi sgt, %20, %c0 : index + %22 = arith.select %21, %20, %c0 : index + %23 = fir.shape %22 : (index) -> !fir.shape<1> + %24 = hlfir.designate %15 (%16#0:%18:%c1, %c1) shape %23 : (!fir.box>>, index, index, index, index, !fir.shape<1>) -> !fir.box> + %25:2 = hlfir.declare %24 {fortran_attrs = #fir.var_attrs, uniq_name = "_QMmy_moduleFcalleeEx"} : (!fir.box>) -> (!fir.box>, !fir.box>) + %26:3 = 
fir.box_dims %25#0, %c0 : (!fir.box>, index) -> (index, index, index) + %27 = fir.convert %26#1 : (index) -> i64 + %28 = fir.convert %27 : (i64) -> index + %29 = arith.cmpi sgt, %28, %c0 : index + %30 = arith.select %29, %28, %c0 : index + %31 = fir.shape %30 : (index) -> !fir.shape<1> + %32 = fir.allocmem !fir.array, %30 {bindc_name = ".tmp.expr_result", uniq_name = ""} + %33 = fir.convert %32 : (!fir.heap>) -> !fir.ref> + %34:2 = hlfir.declare %33(%31) {uniq_name = ".tmp.expr_result"} : (!fir.ref>, !fir.shape<1>) -> (!fir.box>, !fir.ref>) + %35 = fir.call @_QPcallee(%24) fastmath : (!fir.box>) -> !fir.array + fir.save_result %35 to %34#1(%31) : !fir.array, !fir.ref>, !fir.shape<1> + %36 = hlfir.as_expr %34#0 move %true : (!fir.box>, i1) -> !hlfir.expr + hlfir.yield %36 : !hlfir.expr cleanup { + hlfir.destroy %36 : !hlfir.expr + } + } + %14 = hlfir.elemental %11 unordered : (!fir.shape<1>) -> !hlfir.expr { + ^bb0(%arg1: index): + %15 = hlfir.designate %12 (%arg1) : (!fir.box>, index) -> !fir.ref + %16 = hlfir.apply %13, %arg1 : (!hlfir.expr, index) -> f32 + %17 = fir.load %15 : !fir.ref + %18 = arith.divf %17, %16 fastmath : f32 + hlfir.yield_element %18 : f32 + } + hlfir.yield %14 : !hlfir.expr cleanup { + hlfir.destroy %14 : !hlfir.expr + } + } to { + %2 = hlfir.designate %1#0{"p2"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> + %3 = fir.load %2 : !fir.ref>>> + %4:3 = fir.box_dims %3, %c0 : (!fir.box>>, index) -> (index, index, index) + %5 = arith.addi %4#0, %4#1 : index + %6 = arith.subi %5, %c1 : index + %7 = arith.subi %6, %4#0 : index + %8 = arith.addi %7, %c1 : index + %9 = arith.cmpi sgt, %8, %c0 : index + %10 = arith.select %9, %8, %c0 : index + %11 = fir.shape %10 : (index) -> !fir.shape<1> + %12 = hlfir.designate %3 (%4#0:%6:%c1) shape %11 : (!fir.box>>, index, index, index, !fir.shape<1>) -> !fir.box> + hlfir.yield %12 : !fir.box> + } + } + return +} + +// CSE makes a chain of operations live-in: +// CHECK-LABEL: 
func.func @_QPtest_where_in_forall( +func.func @_QPtest_where_in_forall(%arg0: !fir.box> {fir.bindc_name = "a"}, %arg1: !fir.box> {fir.bindc_name = "b"}, %arg2: !fir.box> {fir.bindc_name = "c"}) { + %false = arith.constant false + %c1_i32 = arith.constant 1 : i32 + %c10_i32 = arith.constant 10 : i32 + %c0 = arith.constant 0 : index + %c1 = arith.constant 1 : index + %c2_i32 = arith.constant 2 : i32 + %c100 = arith.constant 100 : index + %0 = fir.alloca !fir.array<100x!fir.logical<4>> {bindc_name = ".tmp.expr_result"} + %1 = fir.alloca !fir.array<100x!fir.logical<4>> {bindc_name = ".tmp.expr_result"} + %2 = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_21:.*]]:2 = hlfir.declare %{{.*}} dummy_scope %{{.*}} {uniq_name = "_QFtest_where_in_forallEb"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %3:2 = hlfir.declare %arg0 dummy_scope %2 {uniq_name = "_QFtest_where_in_forallEa"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %4:2 = hlfir.declare %arg1 dummy_scope %2 {uniq_name = "_QFtest_where_in_forallEb"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5:2 = hlfir.declare %arg2 dummy_scope %2 {uniq_name = "_QFtest_where_in_forallEc"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + hlfir.forall lb { + hlfir.yield %c1_i32 : i32 + } ub { + hlfir.yield %c10_i32 : i32 + } (%arg3: i32) { + hlfir.where { + %6 = fir.shape %c100 : (index) -> !fir.shape<1> + %7:2 = hlfir.declare %0(%6) {uniq_name = ".tmp.expr_result"} : (!fir.ref>>, !fir.shape<1>) -> (!fir.ref>>, !fir.ref>>) + %8 = fir.call @_QPpure_logical_func1() proc_attrs fastmath : () -> !fir.array<100x!fir.logical<4>> + fir.save_result %8 to %7#1(%6) : !fir.array<100x!fir.logical<4>>, !fir.ref>>, !fir.shape<1> + %9 = hlfir.as_expr %7#0 move %false : (!fir.ref>>, i1) -> !hlfir.expr<100x!fir.logical<4>> + hlfir.yield %9 : !hlfir.expr<100x!fir.logical<4>> cleanup { + hlfir.destroy %9 : !hlfir.expr<100x!fir.logical<4>> + } + } do { + hlfir.region_assign { + %6 = fir.convert %arg3 : (i32) -> 
i64 +// CHECK: %[[VAL_58:.*]]:3 = fir.box_dims %[[VAL_21]]#1, %{{.*}} : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_59:.*]] = arith.cmpi sgt, %[[VAL_58]]#1, %{{.*}} : index +// CHECK: %[[VAL_60:.*]] = arith.select %[[VAL_59]], %[[VAL_58]]#1, %{{.*}} : index +// CHECK: %[[VAL_61:.*]] = fir.shape %[[VAL_60]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_62:.*]] = hlfir.designate %[[VAL_21]]#0 (%{{.*}}, %{{.*}}:%[[VAL_58]]#1:%{{.*}}) shape %[[VAL_61]] : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + %7:3 = fir.box_dims %4#1, %c1 : (!fir.box>, index) -> (index, index, index) + %8 = arith.cmpi sgt, %7#1, %c0 : index + %9 = arith.select %8, %7#1, %c0 : index + %10 = fir.shape %9 : (index) -> !fir.shape<1> + %11 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1) shape %10 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + %12 = hlfir.exactly_once : f32 { + %19:3 = fir.box_dims %3#1, %c1 : (!fir.box>, index) -> (index, index, index) + %20 = arith.cmpi sgt, %19#1, %c0 : index + %21 = arith.select %20, %19#1, %c0 : index + %22 = fir.shape %21 : (index) -> !fir.shape<1> + %23 = hlfir.designate %3#0 (%6, %c1:%19#1:%c1) shape %22 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_68:.*]] = fir.call @_QPpure_real_func2() fastmath : () -> f32 +// CHECK: %[[VAL_69:.*]] = hlfir.elemental %{{.*}} unordered : (!fir.shape<1>) -> !hlfir.expr { +// CHECK: ^bb0(%[[VAL_70:.*]]: index): +// CHECK: %[[VAL_72:.*]] = hlfir.designate %[[VAL_62]] (%[[VAL_70]]) : (!fir.box>, index) -> !fir.ref + %24 = fir.call @_QPpure_real_func2() fastmath : () -> f32 + %25 = hlfir.elemental %22 unordered : (!fir.shape<1>) -> !hlfir.expr { + ^bb0(%arg4: index): + %28 = hlfir.designate %23 (%arg4) : (!fir.box>, index) -> !fir.ref + %29 = hlfir.designate %11 (%arg4) : (!fir.box>, index) -> !fir.ref + %30 = fir.load %28 : !fir.ref + %31 = fir.load %29 : !fir.ref + %32 = arith.addf %30, %31 fastmath : f32 + %33 = arith.addf %32, %24 
fastmath : f32 + hlfir.yield_element %33 : f32 + } + %26:3 = hlfir.associate %25(%22) {adapt.valuebyref} : (!hlfir.expr, !fir.shape<1>) -> (!fir.box>, !fir.ref>, i1) + %27 = fir.call @_QPpure_real_func(%26... [truncated] ``````````
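For readers skimming the truncated diff above, the worklist step in canonicalizeExactlyOnceInsideWhere can be illustrated on a toy graph. This is only a sketch under stated assumptions, not the code from the patch: `ToyOp`, `computeCloneList`, and the `dominatesWhere` set are hypothetical stand-ins for `mlir::Operation`, the clone-set computation, and the `dominanceInfo.properlyDominates(..., whereOp)` query, and integer ids double as program order, so a plain sort replaces the dominance-based sort.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an operation: the id doubles as program order,
// and operands lists the ids of the operations defining its operands.
struct ToyOp {
  int id;
  std::vector<int> operands;
};

// Returns, in program order, the ids of the operations that would have to be
// cloned so that every remaining live-in is in dominatesWhere.
std::vector<int> computeCloneList(const std::unordered_map<int, ToyOp> &ops,
                                  const std::unordered_set<int> &dominatesWhere,
                                  const std::vector<int> &liveIns) {
  std::unordered_set<int> cloneSet;
  std::vector<int> workList;

  // Seed the worklist with the live-ins that do not dominate the where.
  for (int liveIn : liveIns)
    if (!dominatesWhere.count(liveIn) && cloneSet.insert(liveIn).second)
      workList.push_back(liveIn);

  // Transitively add the definitions of operands that do not dominate it.
  while (!workList.empty()) {
    int current = workList.back();
    workList.pop_back();
    for (int operand : ops.at(current).operands) {
      if (dominatesWhere.count(operand))
        continue;
      if (cloneSet.insert(operand).second)
        workList.push_back(operand);
    }
  }

  // The set is unordered; sort so that definitions precede their uses and
  // the result is deterministic.
  std::vector<int> cloneList(cloneSet.begin(), cloneSet.end());
  std::sort(cloneList.begin(), cloneList.end());
  return cloneList;
}
```

With a chain in which op 3 uses ops 2 and 0, op 2 uses op 1, op 1 uses op 0, and only op 0 dominates the where, a live-in of op 3 pulls in ops 1, 2 and 3, which mirrors the "CSE makes a chain of operations live-in" test case in spirit.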
https://github.com/llvm/llvm-project/pull/140190 From flang-commits at lists.llvm.org Mon May 19 12:20:19 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Mon, 19 May 2025 12:20:19 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [Clang][Flang][Driver] Fix target parsing for -fveclib=AMDLIBM option (PR #140544) In-Reply-To: Message-ID: <682b8473.630a0220.3c230f.146a@mx.google.com> tarunprabhu wrote: Is this doing a lot that is similar to #140533? https://github.com/llvm/llvm-project/pull/140544 From flang-commits at lists.llvm.org Mon May 19 13:17:37 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 19 May 2025 13:17:37 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727) In-Reply-To: Message-ID: <682b91e1.630a0220.ac96c.0fd2@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From a8831eace686c74c9e81541a0ff7221fe6a28386 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, and Destroy. Default derived type I/O is also recursive, but already disabled. It can be added to this new framework later if the overall approach succeeds. Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. 
This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. --- .../include/flang-rt/runtime/environment.h | 1 + .../include/flang-rt/runtime/work-queue.h | 374 +++++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 518 ++++++++++-------- flang-rt/lib/runtime/derived.cpp | 485 ++++++++-------- flang-rt/lib/runtime/environment.cpp | 4 + flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 178 ++++++ flang/docs/Extensions.md | 10 + flang/include/flang/Runtime/assign.h | 2 +- 10 files changed, 1105 insertions(+), 475 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h index 16258b3bbba9b..87fe1f92ba545 100644 --- a/flang-rt/include/flang-rt/runtime/environment.h +++ b/flang-rt/include/flang-rt/runtime/environment.h @@ -63,6 +63,7 @@ struct ExecutionEnvironment { bool noStopMessage{false}; // NO_STOP_MESSAGE=1 inhibits "Fortran STOP" bool defaultUTF8{false}; // DEFAULT_UTF8 bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION + int internalDebugging{0}; // FLANG_RT_DEBUG // CUDA related variables std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..c2fb5a9ffd980 --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,374 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue is a list of tickets. Each ticket class has a Begin() +// member function that is called once, and a Continue() member function +// that can be called zero or more times. A ticket's execution terminates +// when either of these member functions returns a status other than +// StatOkContinue, and if that status is not StatOk, then the whole queue +// is shut down. +// +// By returning StatOkContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the responsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentsOverElements, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. +// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. 
+// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. Run calls AssignTicket::Begin(), which pushes a ticket via +// BeginFinalize() and returns StatOkContinue. +// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatOkContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. + +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; +namespace typeInfo { +class DerivedType; +class Component; +class SpecialBinding; +} // namespace typeInfo + +// Ticket workers + +// Ticket workers return status codes. Returning StatOkContinue means +// that the ticket is incomplete and must be resumed; any other value +// means that the ticket is complete, and if not StatOk, the whole +// queue can be shut down due to an error. 
+static constexpr int StatOkContinue{1234}; + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +// Base class for ticket workers that operate elementwise over descriptors +class Elementwise { +protected: + RT_API_ATTRS Elementwise( + const Descriptor &instance, const Descriptor *from = nullptr) + : instance_{instance}, from_{from} { + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + RT_API_ATTRS bool IsComplete() const { return elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that operate over derived type components. 
+class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentsOverElements : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Implements derived type instance initialization +class InitializeTicket : private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket : private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatOkContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : private ComponentsOverElements { +public: + RT_API_ATTRS 
FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : to_{to}, from_{&from}, flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type intrinsic assignment. 
+class DerivedAssignTicket : private ElementsOverComponents { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ElementsOverComponents{to, derived, &from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + RT_API_ATTRS void BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived); + RT_API_ATTRS void BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg); + RT_API_ATTRS void BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived); + RT_API_ATTRS void BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize); + RT_API_ATTRS void BeginAssign( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct); + RT_API_ATTRS void BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter); + + RT_API_ATTRS int Run(); + +private: + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 9be75da9520e3..345ec8ef31162 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncId)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic assignments and 
// those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,338 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + workQueue.BeginAssign(to, from, flags, memmoveFct); + workQueue.Run(); +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + workQueue.BeginFinalize(*toDeallocate_, *toDerived_); + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncId))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + workQueue.BeginInitialize(newFrom, *derived); + } + } } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + workQueue.BeginAssign( + newFrom, *from_, MaybeReallocate | PolymorphicLHS, memmoveFct_); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= 
~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; - } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + workQueue.BeginFinalize(to_, *toDerived_); + } else if (!toDerived_->noDestructionNeeded()) { + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false); } } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + return StatOkContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. 
+ if (toDeallocate_) { + toDeallocate_->Deallocate(); + } + return StatOk; + } + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + workQueue.BeginInitialize(to_, *toDerived_); + return StatOkContinue; + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. + if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatOkContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatOkContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". 
- SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); - } - // Copy the data components (incl. 
the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. - int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are 
unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. - // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } - } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + if (toDerived_) { + workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_); + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } 
else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". + SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatOkContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_(instance_.Element(subscripts_) + procPtr.offset, + from_->Element(fromSubscripts_) + procPtr.offset, + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } else { + Elementwise::Reset(); + } + } + return StatOkContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementsOverComponents so that it + // loops over elements at the outer level and over components at the inner. + // This yields less surprising behavior for codes that care due to the use + // of defined assignments when the ordering of their calls matters. + for (; !IsComplete(); Advance()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, *from_, workQueue.terminator(), fromSubscripts_); + Advance(); + workQueue.BeginAssign(toCompDesc, fromCompDesc, flags_, memmoveFct_); + return StatOkContinue; + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_->Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false); + return StatOkContinue; + } + } + toDesc->Deallocate(); + } + } else { + // Allocatable components of the LHS are unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to 
a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + workQueue.BeginAssign( + *toDesc, *fromDesc, flags_ | DeallocateLHS, memmoveFct_); + Advance(); + return StatOkContinue; + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +644,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -597,11 +659,11 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
- if (var) + if (var) { Assign(*var, temp, terminator, NoAssignFlags); + } temp.Destroy(/*finalize=*/false, /*destroyPointers=*/false, &terminator); } diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..9fdc016d37d0b 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,172 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + workQueue.BeginInitialize(instance, derived); + return workQueue.Run(); +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + std::size_t myProcPtrs{procPtrDesc.Elements()}; + for (std::size_t k{0}; k < myProcPtrs; ++k) { const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; + *procPtrDesc.ZeroBasedIndexedElement(k)}; SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; 
instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + instance_.GetLowerBounds(at); + for (std::size_t j{0}; j++ < elements_; instance_.IncrementSubscripts(at)) { + auto &pptr{*instance_.ElementComponent( + at, comp.offset)}; + pptr = comp.procInitialization; + } + } + return StatOkContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + for (; !IsComplete(); SkipToNextComponent()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); SkipToNextElement()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); SkipToNextElement()) { + char *ptr{instance_.ElementComponent( + subscripts_, 
component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); SkipToNextElement()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + workQueue.BeginInitialize(compDesc, compType); + return StatOkContinue; } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + workQueue.BeginInitializeClone(clone, original, derived, hasStat, errMsg); + return workQueue.Run(); } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. - stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. 
- if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); - } + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncId), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + workQueue.BeginInitialize(cloneDesc, *derived); + return StatOkContinue; } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_); + return StatOkContinue; + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_); + Advance(); + return StatOkContinue; // will resume at next element in this component + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); } } - return stat; + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + workQueue.BeginFinalize(descriptor, derived); + workQueue.Run(); + } } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +214,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +251,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,86 +275,84 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with the same // descriptor so that the rank is 
preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatOkContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); - } + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + workQueue.BeginFinalize(compDesc, *compDynamicType); + return StatOkContinue; } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + workQueue.BeginFinalize(compDesc, *compType); } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) 
{ + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + workQueue.BeginFinalize(compDesc, compType); + return StatOkContinue; + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. + if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + workQueue.BeginFinalize(tmpDesc, *finalizableParentType_); + finalizableParentType_ = nullptr; + return StatOkContinue; + } else { + return StatOk; } } @@ -373,51 +362,61 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. 
RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + workQueue.BeginDestroy(descriptor, derived, finalize); + workQueue.Run(); } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + workQueue.BeginFinalize(instance_, derived_); } + return StatOkContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. // Contrary to finalization, the order of deallocation does not matter. 
- const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + workQueue.BeginDestroy(*d, *componentDerived, /*finalize=*/false); + return StatOkContinue; + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + 
GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + workQueue.BeginDestroy(compDesc, *componentDerived, /*finalize=*/false); + return StatOkContinue; } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); 
descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..34af5b4fa6283 --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,178 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/environment.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +static constexpr bool enableDebugOutput{false}; + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (componentAt_ >= components_) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + 
} + firstFree_ = next; + } +} + +RT_API_ATTRS void WorkQueue::BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + StartTicket().u.emplace<InitializeTicket>(descriptor, derived); +} + +RT_API_ATTRS void WorkQueue::BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + StartTicket().u.emplace<InitializeCloneTicket>( + clone, original, derived, hasStat, errMsg); +} + +RT_API_ATTRS void WorkQueue::BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + StartTicket().u.emplace<FinalizeTicket>(descriptor, derived); +} + +RT_API_ATTRS void WorkQueue::BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + StartTicket().u.emplace<DestroyTicket>(descriptor, derived, finalize); +} + +RT_API_ATTRS void WorkQueue::BeginAssign( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) { + StartTicket().u.emplace<AssignTicket>(to, from, flags, memmoveFct); +} + +RT_API_ATTRS void WorkQueue::BeginDerivedAssign(Descriptor &to, + const Descriptor &from, const typeInfo::DerivedType &derived, int flags, + MemmoveFct memmoveFct, Descriptor *deallocateAfter) { + StartTicket().u.emplace<DerivedAssignTicket>( + to, from, derived, flags, memmoveFct, deallocateAfter); +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? 
insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: new ticket\n"); + } + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), + at->ticket.begun ? "Continue" : "Begin"); + } + int stat{at->ticket.Continue(*this)}; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: ... stat %d\n", stat); + } + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatOkContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..32bebc1d866a4 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -843,6 +843,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. 
+ However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. + Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. + A program using defined assignment might be able to detect the difference. + ## De Facto Standard Features * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION From flang-commits at lists.llvm.org Mon May 19 13:40:43 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 13:40:43 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][cuda] Use a reference for asyncObject (PR #140614) In-Reply-To: Message-ID: <682b974b.170a0220.7c2e2.c66a@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-openacc Author: Valentin Clement (バレンタイン クレメン) (clementval)
Changes Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation. This is a new attempt with some fixes; the previous one was reverted some time ago. --- Patch is 104.83 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140614.diff 55 Files Affected: - (modified) flang-rt/include/flang-rt/runtime/allocator-registry.h (+2-2) - (modified) flang-rt/include/flang-rt/runtime/descriptor.h (+3-3) - (modified) flang-rt/include/flang-rt/runtime/reduction-templates.h (+1-1) - (modified) flang-rt/lib/cuda/CMakeLists.txt (+1) - (modified) flang-rt/lib/cuda/allocatable.cpp (+4-4) - (modified) flang-rt/lib/cuda/allocator.cpp (+10-10) - (modified) flang-rt/lib/cuda/descriptor.cpp (+1-1) - (modified) flang-rt/lib/cuda/pointer.cpp (+4-4) - (modified) flang-rt/lib/runtime/allocatable.cpp (+6-6) - (modified) flang-rt/lib/runtime/array-constructor.cpp (+2-2) - (modified) flang-rt/lib/runtime/assign.cpp (+2-2) - (modified) flang-rt/lib/runtime/character.cpp (+11-9) - (modified) flang-rt/lib/runtime/copy.cpp (+2-2) - (modified) flang-rt/lib/runtime/derived.cpp (+3-3) - (modified) flang-rt/lib/runtime/descriptor.cpp (+2-2) - (modified) flang-rt/lib/runtime/extrema.cpp (+2-2) - (modified) flang-rt/lib/runtime/findloc.cpp (+1-1) - (modified) flang-rt/lib/runtime/matmul-transpose.cpp (+1-1) - (modified) flang-rt/lib/runtime/matmul.cpp (+1-1) - (modified) flang-rt/lib/runtime/misc-intrinsic.cpp (+1-1) - (modified) flang-rt/lib/runtime/pointer.cpp (+1-1) - (modified) flang-rt/lib/runtime/temporary-stack.cpp (+1-1) - (modified) flang-rt/lib/runtime/tools.cpp (+1-1) - (modified) flang-rt/lib/runtime/transformational.cpp (+2-2) - (modified) flang-rt/unittests/Evaluate/reshape.cpp (+1-1) - (modified) flang-rt/unittests/Runtime/Allocatable.cpp (+2-2) - (modified) flang-rt/unittests/Runtime/CUDA/Allocatable.cpp (+8-4) - (modified) flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp (+2-2) - (modified) flang-rt/unittests/Runtime/CUDA/Memory.cpp (+2-2) - 
(modified) flang-rt/unittests/Runtime/CharacterTest.cpp (+1-1) - (modified) flang-rt/unittests/Runtime/CommandTest.cpp (+4-4) - (modified) flang-rt/unittests/Runtime/TemporaryStack.cpp (+2-2) - (modified) flang-rt/unittests/Runtime/tools.h (+1-1) - (modified) flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td (+5-6) - (modified) flang/include/flang/Runtime/CUDA/allocatable.h (+4-4) - (modified) flang/include/flang/Runtime/CUDA/allocator.h (+4-4) - (modified) flang/include/flang/Runtime/CUDA/pointer.h (+4-4) - (modified) flang/include/flang/Runtime/allocatable.h (+4-3) - (modified) flang/lib/Lower/Allocatable.cpp (+1-1) - (modified) flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp (+3-4) - (modified) flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp (+11-11) - (modified) flang/lib/Optimizer/Transforms/CUFOpConversion.cpp (+4-6) - (modified) flang/test/Fir/CUDA/cuda-allocate.fir (+8-10) - (modified) flang/test/Fir/cuf-invalid.fir (+2-3) - (modified) flang/test/Fir/cuf.mlir (+3-4) - (modified) flang/test/HLFIR/elemental-codegen.fir (+3-3) - (modified) flang/test/Lower/CUDA/cuda-allocatable.cuf (+4-5) - (modified) flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 (+2-2) - (modified) flang/test/Lower/OpenACC/acc-declare.f90 (+2-2) - (modified) flang/test/Lower/allocatable-polymorphic.f90 (+13-13) - (modified) flang/test/Lower/allocatable-runtime.f90 (+2-2) - (modified) flang/test/Lower/allocate-mold.f90 (+2-2) - (modified) flang/test/Lower/polymorphic.f90 (+1-1) - (modified) flang/test/Lower/volatile-allocatable.f90 (+9-9) - (modified) flang/test/Transforms/lower-repack-arrays.fir (+4-4) ``````````diff diff --git a/flang-rt/include/flang-rt/runtime/allocator-registry.h b/flang-rt/include/flang-rt/runtime/allocator-registry.h index 33e8e2c7d7850..f0ba77a360736 100644 --- a/flang-rt/include/flang-rt/runtime/allocator-registry.h +++ b/flang-rt/include/flang-rt/runtime/allocator-registry.h @@ -19,7 +19,7 @@ namespace Fortran::runtime { -using AllocFct = void 
*(*)(std::size_t, std::int64_t); +using AllocFct = void *(*)(std::size_t, std::int64_t *); using FreeFct = void (*)(void *); typedef struct Allocator_t { @@ -28,7 +28,7 @@ typedef struct Allocator_t { } Allocator_t; static RT_API_ATTRS void *MallocWrapper( - std::size_t size, [[maybe_unused]] std::int64_t) { + std::size_t size, [[maybe_unused]] std::int64_t *) { return std::malloc(size); } #ifdef RT_DEVICE_COMPILATION diff --git a/flang-rt/include/flang-rt/runtime/descriptor.h b/flang-rt/include/flang-rt/runtime/descriptor.h index 9907e7866e7bf..c98e6b14850cb 100644 --- a/flang-rt/include/flang-rt/runtime/descriptor.h +++ b/flang-rt/include/flang-rt/runtime/descriptor.h @@ -29,8 +29,8 @@ #include #include -/// Value used for asyncId when no specific stream is specified. -static constexpr std::int64_t kNoAsyncId = -1; +/// Value used for asyncObject when no specific stream is specified. +static constexpr std::int64_t *kNoAsyncObject = nullptr; namespace Fortran::runtime { @@ -372,7 +372,7 @@ class Descriptor { // before calling. It (re)computes the byte strides after // allocation. Does not allocate automatic components or // perform default component initialization. - RT_API_ATTRS int Allocate(std::int64_t asyncId); + RT_API_ATTRS int Allocate(std::int64_t *asyncObject); RT_API_ATTRS void SetByteStrides(); // Deallocates storage; does not call FINAL subroutines or diff --git a/flang-rt/include/flang-rt/runtime/reduction-templates.h b/flang-rt/include/flang-rt/runtime/reduction-templates.h index 77f77a592a476..18412708b02c5 100644 --- a/flang-rt/include/flang-rt/runtime/reduction-templates.h +++ b/flang-rt/include/flang-rt/runtime/reduction-templates.h @@ -347,7 +347,7 @@ inline RT_API_ATTRS void DoMaxMinNorm2(Descriptor &result, const Descriptor &x, // as the element size of the source. 
result.Establish(x.type(), x.ElementBytes(), nullptr, 0, nullptr, CFI_attribute_allocatable); - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/cuda/CMakeLists.txt b/flang-rt/lib/cuda/CMakeLists.txt index 95e8e855e46b7..14576676a1f0d 100644 --- a/flang-rt/lib/cuda/CMakeLists.txt +++ b/flang-rt/lib/cuda/CMakeLists.txt @@ -14,6 +14,7 @@ add_flangrt_library(flang_rt.cuda STATIC SHARED kernel.cpp memmove-function.cpp memory.cpp + pointer.cpp registration.cpp TARGET_PROPERTIES diff --git a/flang-rt/lib/cuda/allocatable.cpp b/flang-rt/lib/cuda/allocatable.cpp index 432974d18a3e3..c77819e9440d7 100644 --- a/flang-rt/lib/cuda/allocatable.cpp +++ b/flang-rt/lib/cuda/allocatable.cpp @@ -23,7 +23,7 @@ namespace Fortran::runtime::cuda { extern "C" { RT_EXT_API_GROUP_BEGIN -int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( @@ -41,7 +41,7 @@ int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, return stat; } -int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { if (desc.HasAddendum()) { @@ -63,7 +63,7 @@ int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, } int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( alloc, 
stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; @@ -76,7 +76,7 @@ int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, } int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocateSync)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; diff --git a/flang-rt/lib/cuda/allocator.cpp b/flang-rt/lib/cuda/allocator.cpp index 51119ab251168..f4289c55bd8de 100644 --- a/flang-rt/lib/cuda/allocator.cpp +++ b/flang-rt/lib/cuda/allocator.cpp @@ -98,7 +98,7 @@ static unsigned findAllocation(void *ptr) { return allocNotFound; } -static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { +static void insertAllocation(void *ptr, std::size_t size, cudaStream_t stream) { CriticalSection critical{lock}; initAllocations(); if (numDeviceAllocations >= maxDeviceAllocations) { @@ -106,7 +106,7 @@ static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { } deviceAllocations[numDeviceAllocations].ptr = ptr; deviceAllocations[numDeviceAllocations].size = size; - deviceAllocations[numDeviceAllocations].stream = (cudaStream_t)stream; + deviceAllocations[numDeviceAllocations].stream = stream; ++numDeviceAllocations; qsort(deviceAllocations, numDeviceAllocations, sizeof(DeviceAllocation), compareDeviceAlloc); @@ -136,7 +136,7 @@ void RTDEF(CUFRegisterAllocator)() { } void *CUFAllocPinned( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR(cudaMallocHost((void **)&p, sizeInBytes)); return p; @@ -144,18 +144,18 @@ void *CUFAllocPinned( void CUFFreePinned(void *p) { CUDA_REPORT_IF_ERROR(cudaFreeHost(p)); } -void *CUFAllocDevice(std::size_t sizeInBytes, 
std::int64_t asyncId) { +void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t *asyncObject) { void *p; if (Fortran::runtime::executionEnvironment.cudaDeviceIsManaged) { CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); } else { - if (asyncId == kNoAsyncId) { + if (asyncObject == kNoAsyncObject) { CUDA_REPORT_IF_ERROR(cudaMalloc(&p, sizeInBytes)); } else { CUDA_REPORT_IF_ERROR( - cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)asyncId)); - insertAllocation(p, sizeInBytes, asyncId); + cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)*asyncObject)); + insertAllocation(p, sizeInBytes, (cudaStream_t)*asyncObject); } } return p; @@ -174,7 +174,7 @@ void CUFFreeDevice(void *p) { } void *CUFAllocManaged( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); @@ -184,9 +184,9 @@ void *CUFAllocManaged( void CUFFreeManaged(void *p) { CUDA_REPORT_IF_ERROR(cudaFree(p)); } void *CUFAllocUnified( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { // Call alloc managed for the time being. 
- return CUFAllocManaged(sizeInBytes, asyncId); + return CUFAllocManaged(sizeInBytes, asyncObject); } void CUFFreeUnified(void *p) { diff --git a/flang-rt/lib/cuda/descriptor.cpp b/flang-rt/lib/cuda/descriptor.cpp index 175e8c0ef8438..7b768f91af29d 100644 --- a/flang-rt/lib/cuda/descriptor.cpp +++ b/flang-rt/lib/cuda/descriptor.cpp @@ -21,7 +21,7 @@ RT_EXT_API_GROUP_BEGIN Descriptor *RTDEF(CUFAllocDescriptor)( std::size_t sizeInBytes, const char *sourceFile, int sourceLine) { return reinterpret_cast( - CUFAllocManaged(sizeInBytes, /*asyncId*/ -1)); + CUFAllocManaged(sizeInBytes, /*asyncObject=*/nullptr)); } void RTDEF(CUFFreeDescriptor)( diff --git a/flang-rt/lib/cuda/pointer.cpp b/flang-rt/lib/cuda/pointer.cpp index c2559ecb9a6f2..0ed2b0a2b751f 100644 --- a/flang-rt/lib/cuda/pointer.cpp +++ b/flang-rt/lib/cuda/pointer.cpp @@ -22,7 +22,7 @@ namespace Fortran::runtime::cuda { extern "C" { RT_EXT_API_GROUP_BEGIN -int RTDEF(CUFPointerAllocate)(Descriptor &desc, int64_t stream, bool *pinned, +int RTDEF(CUFPointerAllocate)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { if (desc.HasAddendum()) { @@ -43,7 +43,7 @@ int RTDEF(CUFPointerAllocate)(Descriptor &desc, int64_t stream, bool *pinned, return stat; } -int RTDEF(CUFPointerAllocateSync)(Descriptor &desc, int64_t stream, +int RTDEF(CUFPointerAllocateSync)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFPointerAllocate)( @@ -62,7 +62,7 @@ int RTDEF(CUFPointerAllocateSync)(Descriptor &desc, int64_t stream, } int RTDEF(CUFPointerAllocateSource)(Descriptor &pointer, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFPointerAllocate)( pointer, stream, pinned, 
hasStat, errMsg, sourceFile, sourceLine)}; @@ -75,7 +75,7 @@ int RTDEF(CUFPointerAllocateSource)(Descriptor &pointer, } int RTDEF(CUFPointerAllocateSourceSync)(Descriptor &pointer, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFPointerAllocateSync)( pointer, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; diff --git a/flang-rt/lib/runtime/allocatable.cpp b/flang-rt/lib/runtime/allocatable.cpp index 6acce34eb9a9e..ef18da6ea0786 100644 --- a/flang-rt/lib/runtime/allocatable.cpp +++ b/flang-rt/lib/runtime/allocatable.cpp @@ -133,17 +133,17 @@ void RTDEF(AllocatableApplyMold)( } } -int RTDEF(AllocatableAllocate)(Descriptor &descriptor, std::int64_t asyncId, - bool hasStat, const Descriptor *errMsg, const char *sourceFile, - int sourceLine) { +int RTDEF(AllocatableAllocate)(Descriptor &descriptor, + std::int64_t *asyncObject, bool hasStat, const Descriptor *errMsg, + const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; if (!descriptor.IsAllocatable()) { return ReturnError(terminator, StatInvalidDescriptor, errMsg, hasStat); } else if (descriptor.IsAllocated()) { return ReturnError(terminator, StatBaseNotNull, errMsg, hasStat); } else { - int stat{ - ReturnError(terminator, descriptor.Allocate(asyncId), errMsg, hasStat)}; + int stat{ReturnError( + terminator, descriptor.Allocate(asyncObject), errMsg, hasStat)}; if (stat == StatOk) { if (const DescriptorAddendum * addendum{descriptor.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -162,7 +162,7 @@ int RTDEF(AllocatableAllocateSource)(Descriptor &alloc, const Descriptor &source, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(AllocatableAllocate)( - alloc, /*asyncId=*/-1, hasStat, errMsg, sourceFile, sourceLine)}; + alloc, 
/*asyncObject=*/nullptr, hasStat, errMsg, sourceFile, sourceLine)}; if (stat == StatOk) { Terminator terminator{sourceFile, sourceLine}; DoFromSourceAssign(alloc, source, terminator); diff --git a/flang-rt/lib/runtime/array-constructor.cpp b/flang-rt/lib/runtime/array-constructor.cpp index 67b3b5e1e0f50..858fac7bf2b39 100644 --- a/flang-rt/lib/runtime/array-constructor.cpp +++ b/flang-rt/lib/runtime/array-constructor.cpp @@ -50,7 +50,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( initialAllocationSize(fromElements, to.ElementBytes())}; to.GetDimension(0).SetBounds(1, allocationSize); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); to.GetDimension(0).SetBounds(1, fromElements); vector.actualAllocationSize = allocationSize; @@ -59,7 +59,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( // first value: there should be no reallocation. 
RUNTIME_CHECK(terminator, previousToElements >= fromElements); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); vector.actualAllocationSize = previousToElements; } diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 9be75da9520e3..912beee909f4a 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -102,7 +102,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; + int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; if (result == StatOk && derived && !derived->noInitializationNeeded()) { result = ReturnError(terminator, Initialize(to, *derived, terminator)); } @@ -280,7 +280,7 @@ RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, // entity, otherwise, the Deallocate() below will not // free the descriptor memory. 
newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; + auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; if (stat == StatOk) { if (HasDynamicComponent(from)) { // If 'from' has allocatable/automatic component, we cannot diff --git a/flang-rt/lib/runtime/character.cpp b/flang-rt/lib/runtime/character.cpp index d1152ee1caefb..f140d202e118e 100644 --- a/flang-rt/lib/runtime/character.cpp +++ b/flang-rt/lib/runtime/character.cpp @@ -118,7 +118,7 @@ static RT_API_ATTRS void Compare(Descriptor &result, const Descriptor &x, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("Compare: could not allocate storage for result"); } std::size_t xChars{x.ElementBytes() >> shift}; @@ -173,7 +173,7 @@ static RT_API_ATTRS void AdjustLRHelper(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("ADJUSTL/R: could not allocate storage for result"); } for (SubscriptValue resultAt{0}; elements-- > 0; @@ -227,7 +227,7 @@ static RT_API_ATTRS void LenTrim(Descriptor &result, const Descriptor &string, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("LEN_TRIM: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -427,7 +427,7 @@ static RT_API_ATTRS void GeneralCharFunc(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { 
terminator.Crash("SCAN/VERIFY: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -530,7 +530,8 @@ static RT_API_ATTRS void MaxMinHelper(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); } for (CHAR *result{accumulator.OffsetElement()}; elements-- > 0; accumData += accumChars, result += chars, x.IncrementSubscripts(xAt)) { @@ -606,7 +607,7 @@ void RTDEF(CharacterConcatenate)(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - if (accumulator.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (accumulator.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash( "CharacterConcatenate: could not allocate storage for result"); } @@ -629,7 +630,8 @@ void RTDEF(CharacterConcatenateScalar1)( accumulator.set_base_addr(nullptr); std::size_t oldLen{accumulator.ElementBytes()}; accumulator.raw().elem_len += chars; - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(accumulator.OffsetElement(oldLen), from, chars); FreeMemory(old); } @@ -831,7 +833,7 @@ void RTDEF(Repeat)(Descriptor &result, const Descriptor &string, std::size_t origBytes{string.ElementBytes()}; result.Establish(string.type(), origBytes * ncopies, nullptr, 0, nullptr, CFI_attribute_allocatable); - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("REPEAT could not allocate storage for result"); } const char *from{string.OffsetElement()}; @@ -865,7 +867,7 @@ void RTDEF(Trim)(Descriptor &result, const Descriptor &string, } result.Establish(string.type(), resultBytes, nullptr, 0, nullptr, 
CFI_attribute_allocatable); - RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(result.OffsetElement(), string.OffsetElement(), resultBytes); } diff --git a/flang-rt/lib/runtime/copy.cpp b/flang-rt/lib/runtime/copy.cpp index 3a0f98cf8d376..f990f46e0be66 100644 --- a/flang-rt/lib/runtime/copy.cpp +++ b/flang-rt/lib/runtime/copy.cpp @@ -171,8 +171,8 @@ RT_API_ATTRS void CopyElement(const Descriptor &to, const SubscriptValue toAt[], *reinterpret_cast(toPtr + component->offset())}; if (toDesc.raw().base_addr != nullptr) { toDesc.set_base_addr(nullptr); - RUNTIME_CHECK( - terminator, toDesc.Allocate(/*asyncId=*/-1) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, + toDesc.Allocate(/*asyncObject=*/nullptr) == CFI_SUCCESS); const Descriptor &fromDesc{*reinterpret_cast( fromPtr + component->offset())}; copyStack.emplace(toDesc, fromDesc); diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..35037036f63e7 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -52,7 +52,7 @@ RT_API_ATTRS int Initialize(const Descriptor &instance, allocDesc.raw().attribute = CFI_attribute_allocatable; if (comp.genre() == typeInfo::Component::Genre::Automatic) { stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); + terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); if (stat == StatOk) { if (const... [truncated] ``````````
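The core of the change above is replacing an integer sentinel (`kNoAsyncId = -1`) with a null pointer (`kNoAsyncObject = nullptr`) and dereferencing the pointer only on the asynchronous path. A minimal sketch of that pattern follows. It is not the flang-rt code: `AllocDeviceSketch`, `AllocResult`, `AllocPath` and `kNoAsyncObjectSketch` are hypothetical names, and `std::malloc` stands in for the CUDA allocation calls so the example runs without a GPU.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for the kNoAsyncObject constant in the patch:
// "no stream" is represented by a null pointer instead of the value -1.
static constexpr std::int64_t *kNoAsyncObjectSketch = nullptr;

enum class AllocPath { Synchronous, OnStream };

// Records which path was taken so the behaviour can be observed.
struct AllocResult {
  void *ptr;
  AllocPath path;
  std::int64_t stream; // only meaningful when path == OnStream
};

// Assumption: std::malloc replaces both cudaMalloc and cudaMallocAsync here.
AllocResult AllocDeviceSketch(std::size_t bytes, std::int64_t *asyncObject) {
  if (asyncObject == kNoAsyncObjectSketch) {
    // No stream given: take the synchronous path, never touch the pointer.
    return {std::malloc(bytes), AllocPath::Synchronous, 0};
  }
  // A stream was given: read the handle through the pointer at call time.
  return {std::malloc(bytes), AllocPath::OnStream, *asyncObject};
}
```

Callers that have no stream pass `nullptr`, in the same way the runtime call sites in the diff pass `/*asyncObject=*/nullptr`; callers that do have one pass the address of the variable holding the handle.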
https://github.com/llvm/llvm-project/pull/140614 From flang-commits at lists.llvm.org Mon May 19 13:42:33 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Mon, 19 May 2025 13:42:33 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][cuda] Use a reference for asyncObject (PR #140614) In-Reply-To: Message-ID: <682b97b9.170a0220.18d9da.c11c@mx.google.com> https://github.com/clementval edited https://github.com/llvm/llvm-project/pull/140614 From flang-commits at lists.llvm.org Mon May 19 14:20:09 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 19 May 2025 14:20:09 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang] Extension: allow char string edit descriptors in input formats (PR #140624) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/140624 FORMAT("J=",I3) is accepted by a few other Fortran compilers as a valid format for input as well as for output. The character string edit descriptor "J=" is interpreted as if it had been 2X on input, causing two characters to be skipped over. The skipped characters don't have to match the characters in the literal string. An optional warning is emitted under control of the -pedantic option. >From f431ed3288ad883ba19f82e06bba81d32678eef4 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Mon, 19 May 2025 14:13:09 -0700 Subject: [PATCH] [flang] Extension: allow char string edit descriptors in input formats FORMAT("J=",I3) is accepted by a few other Fortran compilers as a valid format for input as well as for output. The character string edit descriptor "J=" is interpreted as if it had been 2X on input, causing two characters to be skipped over. The skipped characters don't have to match the characters in the literal string. An optional warning is emitted under control of the -pedantic option. 
--- .../flang-rt/runtime/format-implementation.h | 13 +++++++++++-- flang/docs/Extensions.md | 4 ++++ flang/include/flang/Common/format.h | 8 ++++---- flang/test/Semantics/io09.f90 | 6 +++--- 4 files changed, 22 insertions(+), 9 deletions(-) diff --git a/flang-rt/include/flang-rt/runtime/format-implementation.h b/flang-rt/include/flang-rt/runtime/format-implementation.h index 8f4eb1161dd14..85dc922bc31bc 100644 --- a/flang-rt/include/flang-rt/runtime/format-implementation.h +++ b/flang-rt/include/flang-rt/runtime/format-implementation.h @@ -427,7 +427,11 @@ RT_API_ATTRS int FormatControl::CueUpNextDataEdit( } else { --chars; } - EmitAscii(context, format_ + start, chars); + if constexpr (std::is_base_of_v) { + context.HandleRelativePosition(chars); + } else { + EmitAscii(context, format_ + start, chars); + } } else if (ch == 'H') { // 9HHOLLERITH if (!repeat || *repeat < 1 || offset_ + *repeat > formatLength_) { @@ -435,7 +439,12 @@ RT_API_ATTRS int FormatControl::CueUpNextDataEdit( maybeReversionPoint); return 0; } - EmitAscii(context, format_ + offset_, static_cast(*repeat)); + if constexpr (std::is_base_of_v) { + context.HandleRelativePosition(static_cast(*repeat)); + } else { + EmitAscii( + context, format_ + offset_, static_cast(*repeat)); + } offset_ += *repeat; } else if (ch >= 'A' && ch <= 'Z') { int start{offset_ - 1}; diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..1cc4881438cc1 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -424,6 +424,10 @@ end * A zero field width is allowed for logical formatted output (`L0`). * `OPEN(..., FORM='BINARY')` is accepted as a legacy synonym for the standard `OPEN(..., FORM='UNFORMATTED', ACCESS='STREAM')`. +* A character string edit descriptor is allowed in an input format + with an optional compilation-time warning. When executed, it + is treated as an 'nX' positioning control descriptor that skips + over the same number of characters, without comparison. 
### Extensions supported when enabled by options diff --git a/flang/include/flang/Common/format.h b/flang/include/flang/Common/format.h index da416506ffb5d..1650f56140b4d 100644 --- a/flang/include/flang/Common/format.h +++ b/flang/include/flang/Common/format.h @@ -430,11 +430,11 @@ template void FormatValidator::NextToken() { } } SetLength(); - if (stmt_ == IoStmtKind::Read && - previousToken_.kind() != TokenKind::DT) { // 13.3.2p6 - ReportError("String edit descriptor in READ format expression"); - } else if (token_.kind() != TokenKind::String) { + if (token_.kind() != TokenKind::String) { ReportError("Unterminated string"); + } else if (stmt_ == IoStmtKind::Read && + previousToken_.kind() != TokenKind::DT) { // 13.3.2p6 + ReportWarning("String edit descriptor in READ format expression"); } break; default: diff --git a/flang/test/Semantics/io09.f90 b/flang/test/Semantics/io09.f90 index 495cbf059005c..7fc9d8ffe7b4b 100644 --- a/flang/test/Semantics/io09.f90 +++ b/flang/test/Semantics/io09.f90 @@ -1,8 +1,8 @@ -! RUN: %python %S/test_errors.py %s %flang_fc1 - !ERROR: String edit descriptor in READ format expression +! RUN: %python %S/test_errors.py %s %flang_fc1 -pedantic + !WARNING: String edit descriptor in READ format expression read(*,'("abc")') - !ERROR: String edit descriptor in READ format expression + !ERROR: Unterminated string !ERROR: Unterminated format expression read(*,'("abc)') From flang-commits at lists.llvm.org Mon May 19 14:20:46 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 14:20:46 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang] Extension: allow char string edit descriptors in input formats (PR #140624) In-Reply-To: Message-ID: <682ba0ae.170a0220.149559.c2bc@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Peter Klausler (klausler)
Changes FORMAT("J=",I3) is accepted by a few other Fortran compilers as a valid format for input as well as for output. The character string edit descriptor "J=" is interpreted as if it had been 2X on input, causing two characters to be skipped over. The skipped characters don't have to match the characters in the literal string. An optional warning is emitted under control of the -pedantic option. --- Full diff: https://github.com/llvm/llvm-project/pull/140624.diff 4 Files Affected: - (modified) flang-rt/include/flang-rt/runtime/format-implementation.h (+11-2) - (modified) flang/docs/Extensions.md (+4) - (modified) flang/include/flang/Common/format.h (+4-4) - (modified) flang/test/Semantics/io09.f90 (+3-3) ``````````diff diff --git a/flang-rt/include/flang-rt/runtime/format-implementation.h b/flang-rt/include/flang-rt/runtime/format-implementation.h index 8f4eb1161dd14..85dc922bc31bc 100644 --- a/flang-rt/include/flang-rt/runtime/format-implementation.h +++ b/flang-rt/include/flang-rt/runtime/format-implementation.h @@ -427,7 +427,11 @@ RT_API_ATTRS int FormatControl::CueUpNextDataEdit( } else { --chars; } - EmitAscii(context, format_ + start, chars); + if constexpr (std::is_base_of_v) { + context.HandleRelativePosition(chars); + } else { + EmitAscii(context, format_ + start, chars); + } } else if (ch == 'H') { // 9HHOLLERITH if (!repeat || *repeat < 1 || offset_ + *repeat > formatLength_) { @@ -435,7 +439,12 @@ RT_API_ATTRS int FormatControl::CueUpNextDataEdit( maybeReversionPoint); return 0; } - EmitAscii(context, format_ + offset_, static_cast(*repeat)); + if constexpr (std::is_base_of_v) { + context.HandleRelativePosition(static_cast(*repeat)); + } else { + EmitAscii( + context, format_ + offset_, static_cast(*repeat)); + } offset_ += *repeat; } else if (ch >= 'A' && ch <= 'Z') { int start{offset_ - 1}; diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..1cc4881438cc1 100644 --- a/flang/docs/Extensions.md +++ 
b/flang/docs/Extensions.md @@ -424,6 +424,10 @@ end * A zero field width is allowed for logical formatted output (`L0`). * `OPEN(..., FORM='BINARY')` is accepted as a legacy synonym for the standard `OPEN(..., FORM='UNFORMATTED', ACCESS='STREAM')`. +* A character string edit descriptor is allowed in an input format + with an optional compilation-time warning. When executed, it + is treated as an 'nX' positioning control descriptor that skips + over the same number of characters, without comparison. ### Extensions supported when enabled by options diff --git a/flang/include/flang/Common/format.h b/flang/include/flang/Common/format.h index da416506ffb5d..1650f56140b4d 100644 --- a/flang/include/flang/Common/format.h +++ b/flang/include/flang/Common/format.h @@ -430,11 +430,11 @@ template void FormatValidator::NextToken() { } } SetLength(); - if (stmt_ == IoStmtKind::Read && - previousToken_.kind() != TokenKind::DT) { // 13.3.2p6 - ReportError("String edit descriptor in READ format expression"); - } else if (token_.kind() != TokenKind::String) { + if (token_.kind() != TokenKind::String) { ReportError("Unterminated string"); + } else if (stmt_ == IoStmtKind::Read && + previousToken_.kind() != TokenKind::DT) { // 13.3.2p6 + ReportWarning("String edit descriptor in READ format expression"); } break; default: diff --git a/flang/test/Semantics/io09.f90 b/flang/test/Semantics/io09.f90 index 495cbf059005c..7fc9d8ffe7b4b 100644 --- a/flang/test/Semantics/io09.f90 +++ b/flang/test/Semantics/io09.f90 @@ -1,8 +1,8 @@ -! RUN: %python %S/test_errors.py %s %flang_fc1 - !ERROR: String edit descriptor in READ format expression +! RUN: %python %S/test_errors.py %s %flang_fc1 -pedantic + !WARNING: String edit descriptor in READ format expression read(*,'("abc")') - !ERROR: String edit descriptor in READ format expression + !ERROR: Unterminated string !ERROR: Unterminated format expression read(*,'("abc)') ``````````
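The runtime part of the patch picks the behavior at compile time: for an input statement the string edit descriptor only advances the position, and otherwise the literal is emitted. A minimal sketch of that dispatch pattern follows. The types `ToyInput` and `ToyOutput`, the marker bases, the `Emit` member and `HandleStringEdit` itself are hypothetical stand-ins for illustration, not the flang-rt classes; only the name `HandleRelativePosition` is taken from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical marker bases standing in for the runtime's statement states.
struct InputStatementState {};
struct OutputStatementState {};

// Toy input context: a fixed record plus a current position in it.
struct ToyInput : InputStatementState {
  std::string record;
  std::size_t pos{0};
  void HandleRelativePosition(std::size_t n) { pos += n; }
};

// Toy output context: collects emitted characters in a buffer.
struct ToyOutput : OutputStatementState {
  std::string buffer;
  void Emit(const char *data, std::size_t n) { buffer.append(data, n); }
};

// Handle one character string edit descriptor of `chars` characters.
// The branch is chosen at compile time from the context type.
template <typename CONTEXT>
void HandleStringEdit(CONTEXT &context, const char *literal, std::size_t chars) {
  if constexpr (std::is_base_of_v<InputStatementState, CONTEXT>) {
    // Input: behave like nX. Skip `chars` characters without comparing
    // them against the literal.
    context.HandleRelativePosition(chars);
  } else {
    // Output: write the literal text.
    context.Emit(literal, chars);
  }
}
```

With a record of `xx125`, handling the descriptor `"J="` on a `ToyInput` moves the position to 2 even though the record does not contain `J=`, so a following `I3` would see `125`. On a `ToyOutput` the same call appends `J=` to the buffer.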
https://github.com/llvm/llvm-project/pull/140624 From flang-commits at lists.llvm.org Mon May 19 14:31:45 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 19 May 2025 14:31:45 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang] Extension: allow char string edit descriptors in input formats (PR #140624) In-Reply-To: Message-ID: <682ba341.170a0220.1df4f4.ca5c@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/140624 >From 4c21f059753e0184078cbffda23e961d97953a17 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Mon, 19 May 2025 14:13:09 -0700 Subject: [PATCH] [flang] Extension: allow char string edit descriptors in input formats FORMAT("J=",I3) is accepted by a few other Fortran compilers as a valid format for input as well as for output. The character string edit descriptor "J=" is interpreted as if it had been 2X on input, causing two characters to be skipped over. The skipped characters don't have to match the characters in the literal string. An optional warning is emitted under control of the -pedantic option. 
--- .../flang-rt/runtime/format-implementation.h | 13 +++++++++++-- flang-rt/unittests/Runtime/NumericalFormatTest.cpp | 1 + flang/docs/Extensions.md | 4 ++++ flang/include/flang/Common/format.h | 8 ++++---- flang/test/Semantics/io09.f90 | 6 +++--- 5 files changed, 23 insertions(+), 9 deletions(-) diff --git a/flang-rt/include/flang-rt/runtime/format-implementation.h b/flang-rt/include/flang-rt/runtime/format-implementation.h index 8f4eb1161dd14..85dc922bc31bc 100644 --- a/flang-rt/include/flang-rt/runtime/format-implementation.h +++ b/flang-rt/include/flang-rt/runtime/format-implementation.h @@ -427,7 +427,11 @@ RT_API_ATTRS int FormatControl::CueUpNextDataEdit( } else { --chars; } - EmitAscii(context, format_ + start, chars); + if constexpr (std::is_base_of_v) { + context.HandleRelativePosition(chars); + } else { + EmitAscii(context, format_ + start, chars); + } } else if (ch == 'H') { // 9HHOLLERITH if (!repeat || *repeat < 1 || offset_ + *repeat > formatLength_) { @@ -435,7 +439,12 @@ RT_API_ATTRS int FormatControl::CueUpNextDataEdit( maybeReversionPoint); return 0; } - EmitAscii(context, format_ + offset_, static_cast(*repeat)); + if constexpr (std::is_base_of_v) { + context.HandleRelativePosition(static_cast(*repeat)); + } else { + EmitAscii( + context, format_ + offset_, static_cast(*repeat)); + } offset_ += *repeat; } else if (ch >= 'A' && ch <= 'Z') { int start{offset_ - 1}; diff --git a/flang-rt/unittests/Runtime/NumericalFormatTest.cpp b/flang-rt/unittests/Runtime/NumericalFormatTest.cpp index a752f9d6c723b..58852d3c3dead 100644 --- a/flang-rt/unittests/Runtime/NumericalFormatTest.cpp +++ b/flang-rt/unittests/Runtime/NumericalFormatTest.cpp @@ -882,6 +882,7 @@ TEST(IOApiTests, EditDoubleInputValues) { {"(F18.1)", " 125", 0x4029000000000000, 0}, {"(F18.2)", " 125", 0x3ff4000000000000, 0}, {"(F18.3)", " 125", 0x3fc0000000000000, 0}, + {"('str',F3.0)", "xxx125", 0x405f400000000000, 0}, {"(-1P,F18.0)", " 125", 0x4093880000000000, 0}, // 1250 {"(1P,F18.0)", " 
125", 0x4029000000000000, 0}, // 12.5 {"(BZ,F18.0)", " 125 ", 0x4093880000000000, 0}, // 1250 diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..1cc4881438cc1 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -424,6 +424,10 @@ end * A zero field width is allowed for logical formatted output (`L0`). * `OPEN(..., FORM='BINARY')` is accepted as a legacy synonym for the standard `OPEN(..., FORM='UNFORMATTED', ACCESS='STREAM')`. +* A character string edit descriptor is allowed in an input format + with an optional compilation-time warning. When executed, it + is treated as an 'nX' positioning control descriptor that skips + over the same number of characters, without comparison. ### Extensions supported when enabled by options diff --git a/flang/include/flang/Common/format.h b/flang/include/flang/Common/format.h index da416506ffb5d..1650f56140b4d 100644 --- a/flang/include/flang/Common/format.h +++ b/flang/include/flang/Common/format.h @@ -430,11 +430,11 @@ template void FormatValidator::NextToken() { } } SetLength(); - if (stmt_ == IoStmtKind::Read && - previousToken_.kind() != TokenKind::DT) { // 13.3.2p6 - ReportError("String edit descriptor in READ format expression"); - } else if (token_.kind() != TokenKind::String) { + if (token_.kind() != TokenKind::String) { ReportError("Unterminated string"); + } else if (stmt_ == IoStmtKind::Read && + previousToken_.kind() != TokenKind::DT) { // 13.3.2p6 + ReportWarning("String edit descriptor in READ format expression"); } break; default: diff --git a/flang/test/Semantics/io09.f90 b/flang/test/Semantics/io09.f90 index 495cbf059005c..7fc9d8ffe7b4b 100644 --- a/flang/test/Semantics/io09.f90 +++ b/flang/test/Semantics/io09.f90 @@ -1,8 +1,8 @@ -! RUN: %python %S/test_errors.py %s %flang_fc1 - !ERROR: String edit descriptor in READ format expression +! 
RUN: %python %S/test_errors.py %s %flang_fc1 -pedantic + !WARNING: String edit descriptor in READ format expression read(*,'("abc")') - !ERROR: String edit descriptor in READ format expression + !ERROR: Unterminated string !ERROR: Unterminated format expression read(*,'("abc)') From flang-commits at lists.llvm.org Mon May 19 14:34:00 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 14:34:00 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang] Extension: allow char string edit descriptors in input formats (PR #140624) In-Reply-To: Message-ID: <682ba3c8.630a0220.1fa401.2706@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command:

``````````bash
git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- flang-rt/include/flang-rt/runtime/format-implementation.h flang-rt/unittests/Runtime/NumericalFormatTest.cpp flang/include/flang/Common/format.h
``````````
View the diff from clang-format here.

``````````diff
diff --git a/flang-rt/unittests/Runtime/NumericalFormatTest.cpp b/flang-rt/unittests/Runtime/NumericalFormatTest.cpp
index 58852d3c3..f1492d0e3 100644
--- a/flang-rt/unittests/Runtime/NumericalFormatTest.cpp
+++ b/flang-rt/unittests/Runtime/NumericalFormatTest.cpp
@@ -882,7 +882,7 @@ TEST(IOApiTests, EditDoubleInputValues) {
 {"(F18.1)", " 125", 0x4029000000000000, 0},
 {"(F18.2)", " 125", 0x3ff4000000000000, 0},
 {"(F18.3)", " 125", 0x3fc0000000000000, 0},
- {"('str',F3.0)", "xxx125", 0x405f400000000000, 0},
+ {"('str',F3.0)", "xxx125", 0x405f400000000000, 0},
 {"(-1P,F18.0)", " 125", 0x4093880000000000, 0}, // 1250
 {"(1P,F18.0)", " 125", 0x4029000000000000, 0}, // 12.5
 {"(BZ,F18.0)", " 125 ", 0x4093880000000000, 0}, // 1250
``````````
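The third field of each test entry is the expected result written as an IEEE-754 binary64 bit pattern. For the new entry, `('str',F3.0)` applied to `xxx125` should skip three characters and read `125`, so the expected pattern `0x405f400000000000` should be 125.0. A small sketch for checking such patterns by hand, assuming a platform where `double` is binary64; the helper name `BitsToDouble` is made up for this example and is not part of the test file.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a 64-bit pattern as a double. std::memcpy avoids the
// undefined behavior of casting between unrelated pointer types.
inline double BitsToDouble(std::uint64_t bits) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
      "this sketch assumes a 64-bit double");
  double value;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}
```

`BitsToDouble(0x405f400000000000)` is 125.0: the exponent field 0x405 is 1023 + 6, and the fraction 0xf4 gives 1.953125, and 1.953125 * 64 = 125. The neighbouring entries decode the same way: `0x4029000000000000` is 12.5, `0x3ff4000000000000` is 1.25, `0x3fc0000000000000` is 0.125 and `0x4093880000000000` is 1250.0.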
https://github.com/llvm/llvm-project/pull/140624 From flang-commits at lists.llvm.org Mon May 19 14:35:34 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 19 May 2025 14:35:34 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang] Extension: allow char string edit descriptors in input formats (PR #140624) In-Reply-To: Message-ID: <682ba426.170a0220.57c3b.d2cf@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/140624 >From 46cfbd8f14330fab69ba3f4236ca9e7a53e7a35a Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Mon, 19 May 2025 14:13:09 -0700 Subject: [PATCH] [flang] Extension: allow char string edit descriptors in input formats FORMAT("J=",I3) is accepted by a few other Fortran compilers as a valid format for input as well as for output. The character string edit descriptor "J=" is interpreted as if it had been 2X on input, causing two characters to be skipped over. The skipped characters don't have to match the characters in the literal string. An optional warning is emitted under control of the -pedantic option. 
--- .../flang-rt/runtime/format-implementation.h | 13 +++++++++++-- flang-rt/unittests/Runtime/NumericalFormatTest.cpp | 1 + flang/docs/Extensions.md | 4 ++++ flang/include/flang/Common/format.h | 8 ++++---- flang/test/Semantics/io09.f90 | 6 +++--- 5 files changed, 23 insertions(+), 9 deletions(-) diff --git a/flang-rt/include/flang-rt/runtime/format-implementation.h b/flang-rt/include/flang-rt/runtime/format-implementation.h index 8f4eb1161dd14..85dc922bc31bc 100644 --- a/flang-rt/include/flang-rt/runtime/format-implementation.h +++ b/flang-rt/include/flang-rt/runtime/format-implementation.h @@ -427,7 +427,11 @@ RT_API_ATTRS int FormatControl::CueUpNextDataEdit( } else { --chars; } - EmitAscii(context, format_ + start, chars); + if constexpr (std::is_base_of_v) { + context.HandleRelativePosition(chars); + } else { + EmitAscii(context, format_ + start, chars); + } } else if (ch == 'H') { // 9HHOLLERITH if (!repeat || *repeat < 1 || offset_ + *repeat > formatLength_) { @@ -435,7 +439,12 @@ RT_API_ATTRS int FormatControl::CueUpNextDataEdit( maybeReversionPoint); return 0; } - EmitAscii(context, format_ + offset_, static_cast(*repeat)); + if constexpr (std::is_base_of_v) { + context.HandleRelativePosition(static_cast(*repeat)); + } else { + EmitAscii( + context, format_ + offset_, static_cast(*repeat)); + } offset_ += *repeat; } else if (ch >= 'A' && ch <= 'Z') { int start{offset_ - 1}; diff --git a/flang-rt/unittests/Runtime/NumericalFormatTest.cpp b/flang-rt/unittests/Runtime/NumericalFormatTest.cpp index a752f9d6c723b..f1492d0e39fec 100644 --- a/flang-rt/unittests/Runtime/NumericalFormatTest.cpp +++ b/flang-rt/unittests/Runtime/NumericalFormatTest.cpp @@ -882,6 +882,7 @@ TEST(IOApiTests, EditDoubleInputValues) { {"(F18.1)", " 125", 0x4029000000000000, 0}, {"(F18.2)", " 125", 0x3ff4000000000000, 0}, {"(F18.3)", " 125", 0x3fc0000000000000, 0}, + {"('str',F3.0)", "xxx125", 0x405f400000000000, 0}, {"(-1P,F18.0)", " 125", 0x4093880000000000, 0}, // 1250 {"(1P,F18.0)", " 
125", 0x4029000000000000, 0}, // 12.5 {"(BZ,F18.0)", " 125 ", 0x4093880000000000, 0}, // 1250 diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..1cc4881438cc1 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -424,6 +424,10 @@ end * A zero field width is allowed for logical formatted output (`L0`). * `OPEN(..., FORM='BINARY')` is accepted as a legacy synonym for the standard `OPEN(..., FORM='UNFORMATTED', ACCESS='STREAM')`. +* A character string edit descriptor is allowed in an input format + with an optional compilation-time warning. When executed, it + is treated as an 'nX' positioning control descriptor that skips + over the same number of characters, without comparison. ### Extensions supported when enabled by options diff --git a/flang/include/flang/Common/format.h b/flang/include/flang/Common/format.h index da416506ffb5d..1650f56140b4d 100644 --- a/flang/include/flang/Common/format.h +++ b/flang/include/flang/Common/format.h @@ -430,11 +430,11 @@ template void FormatValidator::NextToken() { } } SetLength(); - if (stmt_ == IoStmtKind::Read && - previousToken_.kind() != TokenKind::DT) { // 13.3.2p6 - ReportError("String edit descriptor in READ format expression"); - } else if (token_.kind() != TokenKind::String) { + if (token_.kind() != TokenKind::String) { ReportError("Unterminated string"); + } else if (stmt_ == IoStmtKind::Read && + previousToken_.kind() != TokenKind::DT) { // 13.3.2p6 + ReportWarning("String edit descriptor in READ format expression"); } break; default: diff --git a/flang/test/Semantics/io09.f90 b/flang/test/Semantics/io09.f90 index 495cbf059005c..7fc9d8ffe7b4b 100644 --- a/flang/test/Semantics/io09.f90 +++ b/flang/test/Semantics/io09.f90 @@ -1,8 +1,8 @@ -! RUN: %python %S/test_errors.py %s %flang_fc1 - !ERROR: String edit descriptor in READ format expression +! 
RUN: %python %S/test_errors.py %s %flang_fc1 -pedantic + !WARNING: String edit descriptor in READ format expression read(*,'("abc")') - !ERROR: String edit descriptor in READ format expression + !ERROR: Unterminated string !ERROR: Unterminated format expression read(*,'("abc)') From flang-commits at lists.llvm.org Mon May 19 13:40:05 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Mon, 19 May 2025 13:40:05 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][cuda] Use a reference for asyncObject (PR #140614) Message-ID: https://github.com/clementval created https://github.com/llvm/llvm-project/pull/140614 Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation. This is a new attempt with some fixes; the previous one was reverted some time ago. >From ff4c98d4faa11ca4eacd4775d2305faaee9526ae Mon Sep 17 00:00:00 2001 From: Valentin Clement (バレンタイン クレメン) Date: Thu, 1 May 2025 17:04:12 -0700 Subject: [PATCH] [flang][cuda] Use a reference for asyncObject (#138186) Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation. This is a new attempt with some fixes; the previous one was reverted yesterday. 
--- .../flang-rt/runtime/allocator-registry.h | 4 +-- .../include/flang-rt/runtime/descriptor.h | 6 ++--- .../flang-rt/runtime/reduction-templates.h | 2 +- flang-rt/lib/cuda/CMakeLists.txt | 1 + flang-rt/lib/cuda/allocatable.cpp | 8 +++--- flang-rt/lib/cuda/allocator.cpp | 20 +++++++------- flang-rt/lib/cuda/descriptor.cpp | 2 +- flang-rt/lib/cuda/pointer.cpp | 8 +++--- flang-rt/lib/runtime/allocatable.cpp | 12 ++++----- flang-rt/lib/runtime/array-constructor.cpp | 4 +-- flang-rt/lib/runtime/assign.cpp | 4 +-- flang-rt/lib/runtime/character.cpp | 20 +++++++------- flang-rt/lib/runtime/copy.cpp | 4 +-- flang-rt/lib/runtime/derived.cpp | 6 ++--- flang-rt/lib/runtime/descriptor.cpp | 4 +-- flang-rt/lib/runtime/extrema.cpp | 4 +-- flang-rt/lib/runtime/findloc.cpp | 2 +- flang-rt/lib/runtime/matmul-transpose.cpp | 2 +- flang-rt/lib/runtime/matmul.cpp | 2 +- flang-rt/lib/runtime/misc-intrinsic.cpp | 2 +- flang-rt/lib/runtime/pointer.cpp | 2 +- flang-rt/lib/runtime/temporary-stack.cpp | 2 +- flang-rt/lib/runtime/tools.cpp | 2 +- flang-rt/lib/runtime/transformational.cpp | 4 +-- flang-rt/unittests/Evaluate/reshape.cpp | 2 +- flang-rt/unittests/Runtime/Allocatable.cpp | 4 +-- .../unittests/Runtime/CUDA/Allocatable.cpp | 12 ++++++--- .../unittests/Runtime/CUDA/AllocatorCUF.cpp | 4 +-- flang-rt/unittests/Runtime/CUDA/Memory.cpp | 4 +-- flang-rt/unittests/Runtime/CharacterTest.cpp | 2 +- flang-rt/unittests/Runtime/CommandTest.cpp | 8 +++--- flang-rt/unittests/Runtime/TemporaryStack.cpp | 4 +-- flang-rt/unittests/Runtime/tools.h | 2 +- .../flang/Optimizer/Dialect/CUF/CUFOps.td | 11 ++++---- .../include/flang/Runtime/CUDA/allocatable.h | 8 +++--- flang/include/flang/Runtime/CUDA/allocator.h | 8 +++--- flang/include/flang/Runtime/CUDA/pointer.h | 8 +++--- flang/include/flang/Runtime/allocatable.h | 7 ++--- flang/lib/Lower/Allocatable.cpp | 2 +- .../Optimizer/Builder/Runtime/Allocatable.cpp | 7 +++-- flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp | 22 ++++++++-------- 
.../Optimizer/Transforms/CUFOpConversion.cpp | 10 +++---- flang/test/Fir/CUDA/cuda-allocate.fir | 18 ++++++------- flang/test/Fir/cuf-invalid.fir | 5 ++-- flang/test/Fir/cuf.mlir | 7 +++-- flang/test/HLFIR/elemental-codegen.fir | 6 ++--- flang/test/Lower/CUDA/cuda-allocatable.cuf | 9 +++---- .../acc-declare-unwrap-defaultbounds.f90 | 4 +-- flang/test/Lower/OpenACC/acc-declare.f90 | 4 +-- flang/test/Lower/allocatable-polymorphic.f90 | 26 +++++++++---------- flang/test/Lower/allocatable-runtime.f90 | 4 +-- flang/test/Lower/allocate-mold.f90 | 4 +-- flang/test/Lower/polymorphic.f90 | 2 +- flang/test/Lower/volatile-allocatable.f90 | 18 ++++++------- flang/test/Transforms/lower-repack-arrays.fir | 8 +++--- 55 files changed, 183 insertions(+), 184 deletions(-) diff --git a/flang-rt/include/flang-rt/runtime/allocator-registry.h b/flang-rt/include/flang-rt/runtime/allocator-registry.h index 33e8e2c7d7850..f0ba77a360736 100644 --- a/flang-rt/include/flang-rt/runtime/allocator-registry.h +++ b/flang-rt/include/flang-rt/runtime/allocator-registry.h @@ -19,7 +19,7 @@ namespace Fortran::runtime { -using AllocFct = void *(*)(std::size_t, std::int64_t); +using AllocFct = void *(*)(std::size_t, std::int64_t *); using FreeFct = void (*)(void *); typedef struct Allocator_t { @@ -28,7 +28,7 @@ typedef struct Allocator_t { } Allocator_t; static RT_API_ATTRS void *MallocWrapper( - std::size_t size, [[maybe_unused]] std::int64_t) { + std::size_t size, [[maybe_unused]] std::int64_t *) { return std::malloc(size); } #ifdef RT_DEVICE_COMPILATION diff --git a/flang-rt/include/flang-rt/runtime/descriptor.h b/flang-rt/include/flang-rt/runtime/descriptor.h index 9907e7866e7bf..c98e6b14850cb 100644 --- a/flang-rt/include/flang-rt/runtime/descriptor.h +++ b/flang-rt/include/flang-rt/runtime/descriptor.h @@ -29,8 +29,8 @@ #include #include -/// Value used for asyncId when no specific stream is specified. 
-static constexpr std::int64_t kNoAsyncId = -1; +/// Value used for asyncObject when no specific stream is specified. +static constexpr std::int64_t *kNoAsyncObject = nullptr; namespace Fortran::runtime { @@ -372,7 +372,7 @@ class Descriptor { // before calling. It (re)computes the byte strides after // allocation. Does not allocate automatic components or // perform default component initialization. - RT_API_ATTRS int Allocate(std::int64_t asyncId); + RT_API_ATTRS int Allocate(std::int64_t *asyncObject); RT_API_ATTRS void SetByteStrides(); // Deallocates storage; does not call FINAL subroutines or diff --git a/flang-rt/include/flang-rt/runtime/reduction-templates.h b/flang-rt/include/flang-rt/runtime/reduction-templates.h index 77f77a592a476..18412708b02c5 100644 --- a/flang-rt/include/flang-rt/runtime/reduction-templates.h +++ b/flang-rt/include/flang-rt/runtime/reduction-templates.h @@ -347,7 +347,7 @@ inline RT_API_ATTRS void DoMaxMinNorm2(Descriptor &result, const Descriptor &x, // as the element size of the source. 
result.Establish(x.type(), x.ElementBytes(), nullptr, 0, nullptr, CFI_attribute_allocatable); - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/cuda/CMakeLists.txt b/flang-rt/lib/cuda/CMakeLists.txt index 95e8e855e46b7..14576676a1f0d 100644 --- a/flang-rt/lib/cuda/CMakeLists.txt +++ b/flang-rt/lib/cuda/CMakeLists.txt @@ -14,6 +14,7 @@ add_flangrt_library(flang_rt.cuda STATIC SHARED kernel.cpp memmove-function.cpp memory.cpp + pointer.cpp registration.cpp TARGET_PROPERTIES diff --git a/flang-rt/lib/cuda/allocatable.cpp b/flang-rt/lib/cuda/allocatable.cpp index 432974d18a3e3..c77819e9440d7 100644 --- a/flang-rt/lib/cuda/allocatable.cpp +++ b/flang-rt/lib/cuda/allocatable.cpp @@ -23,7 +23,7 @@ namespace Fortran::runtime::cuda { extern "C" { RT_EXT_API_GROUP_BEGIN -int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( @@ -41,7 +41,7 @@ int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, return stat; } -int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { if (desc.HasAddendum()) { @@ -63,7 +63,7 @@ int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, } int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( alloc, 
stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; @@ -76,7 +76,7 @@ int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, } int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocateSync)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; diff --git a/flang-rt/lib/cuda/allocator.cpp b/flang-rt/lib/cuda/allocator.cpp index 51119ab251168..f4289c55bd8de 100644 --- a/flang-rt/lib/cuda/allocator.cpp +++ b/flang-rt/lib/cuda/allocator.cpp @@ -98,7 +98,7 @@ static unsigned findAllocation(void *ptr) { return allocNotFound; } -static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { +static void insertAllocation(void *ptr, std::size_t size, cudaStream_t stream) { CriticalSection critical{lock}; initAllocations(); if (numDeviceAllocations >= maxDeviceAllocations) { @@ -106,7 +106,7 @@ static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { } deviceAllocations[numDeviceAllocations].ptr = ptr; deviceAllocations[numDeviceAllocations].size = size; - deviceAllocations[numDeviceAllocations].stream = (cudaStream_t)stream; + deviceAllocations[numDeviceAllocations].stream = stream; ++numDeviceAllocations; qsort(deviceAllocations, numDeviceAllocations, sizeof(DeviceAllocation), compareDeviceAlloc); @@ -136,7 +136,7 @@ void RTDEF(CUFRegisterAllocator)() { } void *CUFAllocPinned( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR(cudaMallocHost((void **)&p, sizeInBytes)); return p; @@ -144,18 +144,18 @@ void *CUFAllocPinned( void CUFFreePinned(void *p) { CUDA_REPORT_IF_ERROR(cudaFreeHost(p)); } -void *CUFAllocDevice(std::size_t sizeInBytes, 
std::int64_t asyncId) { +void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t *asyncObject) { void *p; if (Fortran::runtime::executionEnvironment.cudaDeviceIsManaged) { CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); } else { - if (asyncId == kNoAsyncId) { + if (asyncObject == kNoAsyncObject) { CUDA_REPORT_IF_ERROR(cudaMalloc(&p, sizeInBytes)); } else { CUDA_REPORT_IF_ERROR( - cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)asyncId)); - insertAllocation(p, sizeInBytes, asyncId); + cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)*asyncObject)); + insertAllocation(p, sizeInBytes, (cudaStream_t)*asyncObject); } } return p; @@ -174,7 +174,7 @@ void CUFFreeDevice(void *p) { } void *CUFAllocManaged( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); @@ -184,9 +184,9 @@ void *CUFAllocManaged( void CUFFreeManaged(void *p) { CUDA_REPORT_IF_ERROR(cudaFree(p)); } void *CUFAllocUnified( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { // Call alloc managed for the time being. 
- return CUFAllocManaged(sizeInBytes, asyncId); + return CUFAllocManaged(sizeInBytes, asyncObject); } void CUFFreeUnified(void *p) { diff --git a/flang-rt/lib/cuda/descriptor.cpp b/flang-rt/lib/cuda/descriptor.cpp index 175e8c0ef8438..7b768f91af29d 100644 --- a/flang-rt/lib/cuda/descriptor.cpp +++ b/flang-rt/lib/cuda/descriptor.cpp @@ -21,7 +21,7 @@ RT_EXT_API_GROUP_BEGIN Descriptor *RTDEF(CUFAllocDescriptor)( std::size_t sizeInBytes, const char *sourceFile, int sourceLine) { return reinterpret_cast( - CUFAllocManaged(sizeInBytes, /*asyncId*/ -1)); + CUFAllocManaged(sizeInBytes, /*asyncObject=*/nullptr)); } void RTDEF(CUFFreeDescriptor)( diff --git a/flang-rt/lib/cuda/pointer.cpp b/flang-rt/lib/cuda/pointer.cpp index c2559ecb9a6f2..0ed2b0a2b751f 100644 --- a/flang-rt/lib/cuda/pointer.cpp +++ b/flang-rt/lib/cuda/pointer.cpp @@ -22,7 +22,7 @@ namespace Fortran::runtime::cuda { extern "C" { RT_EXT_API_GROUP_BEGIN -int RTDEF(CUFPointerAllocate)(Descriptor &desc, int64_t stream, bool *pinned, +int RTDEF(CUFPointerAllocate)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { if (desc.HasAddendum()) { @@ -43,7 +43,7 @@ int RTDEF(CUFPointerAllocate)(Descriptor &desc, int64_t stream, bool *pinned, return stat; } -int RTDEF(CUFPointerAllocateSync)(Descriptor &desc, int64_t stream, +int RTDEF(CUFPointerAllocateSync)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFPointerAllocate)( @@ -62,7 +62,7 @@ int RTDEF(CUFPointerAllocateSync)(Descriptor &desc, int64_t stream, } int RTDEF(CUFPointerAllocateSource)(Descriptor &pointer, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFPointerAllocate)( pointer, stream, pinned, 
hasStat, errMsg, sourceFile, sourceLine)}; @@ -75,7 +75,7 @@ int RTDEF(CUFPointerAllocateSource)(Descriptor &pointer, } int RTDEF(CUFPointerAllocateSourceSync)(Descriptor &pointer, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFPointerAllocateSync)( pointer, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; diff --git a/flang-rt/lib/runtime/allocatable.cpp b/flang-rt/lib/runtime/allocatable.cpp index 6acce34eb9a9e..ef18da6ea0786 100644 --- a/flang-rt/lib/runtime/allocatable.cpp +++ b/flang-rt/lib/runtime/allocatable.cpp @@ -133,17 +133,17 @@ void RTDEF(AllocatableApplyMold)( } } -int RTDEF(AllocatableAllocate)(Descriptor &descriptor, std::int64_t asyncId, - bool hasStat, const Descriptor *errMsg, const char *sourceFile, - int sourceLine) { +int RTDEF(AllocatableAllocate)(Descriptor &descriptor, + std::int64_t *asyncObject, bool hasStat, const Descriptor *errMsg, + const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; if (!descriptor.IsAllocatable()) { return ReturnError(terminator, StatInvalidDescriptor, errMsg, hasStat); } else if (descriptor.IsAllocated()) { return ReturnError(terminator, StatBaseNotNull, errMsg, hasStat); } else { - int stat{ - ReturnError(terminator, descriptor.Allocate(asyncId), errMsg, hasStat)}; + int stat{ReturnError( + terminator, descriptor.Allocate(asyncObject), errMsg, hasStat)}; if (stat == StatOk) { if (const DescriptorAddendum * addendum{descriptor.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -162,7 +162,7 @@ int RTDEF(AllocatableAllocateSource)(Descriptor &alloc, const Descriptor &source, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(AllocatableAllocate)( - alloc, /*asyncId=*/-1, hasStat, errMsg, sourceFile, sourceLine)}; + alloc, 
/*asyncObject=*/nullptr, hasStat, errMsg, sourceFile, sourceLine)}; if (stat == StatOk) { Terminator terminator{sourceFile, sourceLine}; DoFromSourceAssign(alloc, source, terminator); diff --git a/flang-rt/lib/runtime/array-constructor.cpp b/flang-rt/lib/runtime/array-constructor.cpp index 67b3b5e1e0f50..858fac7bf2b39 100644 --- a/flang-rt/lib/runtime/array-constructor.cpp +++ b/flang-rt/lib/runtime/array-constructor.cpp @@ -50,7 +50,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( initialAllocationSize(fromElements, to.ElementBytes())}; to.GetDimension(0).SetBounds(1, allocationSize); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); to.GetDimension(0).SetBounds(1, fromElements); vector.actualAllocationSize = allocationSize; @@ -59,7 +59,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( // first value: there should be no reallocation. 
RUNTIME_CHECK(terminator, previousToElements >= fromElements); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); vector.actualAllocationSize = previousToElements; } diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 9be75da9520e3..912beee909f4a 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -102,7 +102,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; + int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; if (result == StatOk && derived && !derived->noInitializationNeeded()) { result = ReturnError(terminator, Initialize(to, *derived, terminator)); } @@ -280,7 +280,7 @@ RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, // entity, otherwise, the Deallocate() below will not // free the descriptor memory. 
newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; + auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; if (stat == StatOk) { if (HasDynamicComponent(from)) { // If 'from' has allocatable/automatic component, we cannot diff --git a/flang-rt/lib/runtime/character.cpp b/flang-rt/lib/runtime/character.cpp index d1152ee1caefb..f140d202e118e 100644 --- a/flang-rt/lib/runtime/character.cpp +++ b/flang-rt/lib/runtime/character.cpp @@ -118,7 +118,7 @@ static RT_API_ATTRS void Compare(Descriptor &result, const Descriptor &x, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("Compare: could not allocate storage for result"); } std::size_t xChars{x.ElementBytes() >> shift}; @@ -173,7 +173,7 @@ static RT_API_ATTRS void AdjustLRHelper(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("ADJUSTL/R: could not allocate storage for result"); } for (SubscriptValue resultAt{0}; elements-- > 0; @@ -227,7 +227,7 @@ static RT_API_ATTRS void LenTrim(Descriptor &result, const Descriptor &string, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("LEN_TRIM: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -427,7 +427,7 @@ static RT_API_ATTRS void GeneralCharFunc(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { 
terminator.Crash("SCAN/VERIFY: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -530,7 +530,8 @@ static RT_API_ATTRS void MaxMinHelper(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); } for (CHAR *result{accumulator.OffsetElement()}; elements-- > 0; accumData += accumChars, result += chars, x.IncrementSubscripts(xAt)) { @@ -606,7 +607,7 @@ void RTDEF(CharacterConcatenate)(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - if (accumulator.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (accumulator.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash( "CharacterConcatenate: could not allocate storage for result"); } @@ -629,7 +630,8 @@ void RTDEF(CharacterConcatenateScalar1)( accumulator.set_base_addr(nullptr); std::size_t oldLen{accumulator.ElementBytes()}; accumulator.raw().elem_len += chars; - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(accumulator.OffsetElement(oldLen), from, chars); FreeMemory(old); } @@ -831,7 +833,7 @@ void RTDEF(Repeat)(Descriptor &result, const Descriptor &string, std::size_t origBytes{string.ElementBytes()}; result.Establish(string.type(), origBytes * ncopies, nullptr, 0, nullptr, CFI_attribute_allocatable); - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("REPEAT could not allocate storage for result"); } const char *from{string.OffsetElement()}; @@ -865,7 +867,7 @@ void RTDEF(Trim)(Descriptor &result, const Descriptor &string, } result.Establish(string.type(), resultBytes, nullptr, 0, nullptr, 
CFI_attribute_allocatable); - RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(result.OffsetElement(), string.OffsetElement(), resultBytes); } diff --git a/flang-rt/lib/runtime/copy.cpp b/flang-rt/lib/runtime/copy.cpp index 3a0f98cf8d376..f990f46e0be66 100644 --- a/flang-rt/lib/runtime/copy.cpp +++ b/flang-rt/lib/runtime/copy.cpp @@ -171,8 +171,8 @@ RT_API_ATTRS void CopyElement(const Descriptor &to, const SubscriptValue toAt[], *reinterpret_cast(toPtr + component->offset())}; if (toDesc.raw().base_addr != nullptr) { toDesc.set_base_addr(nullptr); - RUNTIME_CHECK( - terminator, toDesc.Allocate(/*asyncId=*/-1) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, + toDesc.Allocate(/*asyncObject=*/nullptr) == CFI_SUCCESS); const Descriptor &fromDesc{*reinterpret_cast( fromPtr + component->offset())}; copyStack.emplace(toDesc, fromDesc); diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..35037036f63e7 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -52,7 +52,7 @@ RT_API_ATTRS int Initialize(const Descriptor &instance, allocDesc.raw().attribute = CFI_attribute_allocatable; if (comp.genre() == typeInfo::Component::Genre::Automatic) { stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); + terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -153,7 +153,7 @@ RT_API_ATTRS int InitializeClone(const Descriptor &clone, if (origDesc.IsAllocated()) { cloneDesc.ApplyMold(origDesc, origDesc.rank()); stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); + terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * 
addendum{cloneDesc.Addendum()}) { if (const typeInfo::DerivedType * @@ -260,7 +260,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy.raw().attribute = CFI_attribute_allocatable; Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } diff --git a/flang-rt/lib/runtime/descriptor.cpp b/flang-rt/lib/runtime/descriptor.cpp index 3debf53bb5290..67336d01380e0 100644 --- a/flang-rt/lib/runtime/descriptor.cpp +++ b/flang-rt/lib/runtime/descriptor.cpp @@ -158,7 +158,7 @@ RT_API_ATTRS static inline int MapAllocIdx(const Descriptor &desc) { #endif } -RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { +RT_API_ATTRS int Descriptor::Allocate(std::int64_t *asyncObject) { std::size_t elementBytes{ElementBytes()}; if (static_cast(elementBytes) < 0) { // F'2023 7.4.4.2 p5: "If the character length parameter value evaluates @@ -170,7 +170,7 @@ RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { // Zero size allocation is possible in Fortran and the resulting // descriptor must be allocated/associated. Since std::malloc(0) // result is implementation defined, always allocate at least one byte. - void *p{alloc(byteSize ? byteSize : 1, asyncId)}; + void *p{alloc(byteSize ? 
byteSize : 1, asyncObject)}; if (!p) { return CFI_ERROR_MEM_ALLOCATION; } diff --git a/flang-rt/lib/runtime/extrema.cpp b/flang-rt/lib/runtime/extrema.cpp index 4c7f8e8b99e8f..03e574a8fbff1 100644 --- a/flang-rt/lib/runtime/extrema.cpp +++ b/flang-rt/lib/runtime/extrema.cpp @@ -152,7 +152,7 @@ inline RT_API_ATTRS void CharacterMaxOrMinLoc(const char *intrinsic, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } @@ -181,7 +181,7 @@ inline RT_API_ATTRS void TotalNumericMaxOrMinLoc(const char *intrinsic, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/runtime/findloc.cpp b/flang-rt/lib/runtime/findloc.cpp index e3e98953b0cfc..5485f4b97bd2f 100644 --- a/flang-rt/lib/runtime/findloc.cpp +++ b/flang-rt/lib/runtime/findloc.cpp @@ -220,7 +220,7 @@ void RTDEF(Findloc)(Descriptor &result, const Descriptor &x, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "FINDLOC: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/matmul-transpose.cpp b/flang-rt/lib/runtime/matmul-transpose.cpp index 17987fb73d943..c9e21502b629e 100644 --- a/flang-rt/lib/runtime/matmul-transpose.cpp +++ b/flang-rt/lib/runtime/matmul-transpose.cpp @@ -183,7 +183,7 @@ inline static RT_API_ATTRS void DoMatmulTranspose( for (int j{0}; j < resRank; ++j) { 
result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "MATMUL-TRANSPOSE: could not allocate memory for result; STAT=%d", stat); diff --git a/flang-rt/lib/runtime/matmul.cpp b/flang-rt/lib/runtime/matmul.cpp index 0ff92cecbbcb8..5acb345725212 100644 --- a/flang-rt/lib/runtime/matmul.cpp +++ b/flang-rt/lib/runtime/matmul.cpp @@ -255,7 +255,7 @@ static inline RT_API_ATTRS void DoMatmul( for (int j{0}; j < resRank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "MATMUL: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/misc-intrinsic.cpp b/flang-rt/lib/runtime/misc-intrinsic.cpp index 2fde859869ef0..a8797f48fa667 100644 --- a/flang-rt/lib/runtime/misc-intrinsic.cpp +++ b/flang-rt/lib/runtime/misc-intrinsic.cpp @@ -30,7 +30,7 @@ static RT_API_ATTRS void TransferImpl(Descriptor &result, if (const DescriptorAddendum * addendum{mold.Addendum()}) { *result.Addendum() = *addendum; } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { Terminator{sourceFile, line}.Crash( "TRANSFER: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/pointer.cpp b/flang-rt/lib/runtime/pointer.cpp index fd2427f4124b5..7331f7bbc3a75 100644 --- a/flang-rt/lib/runtime/pointer.cpp +++ b/flang-rt/lib/runtime/pointer.cpp @@ -129,7 +129,7 @@ RT_API_ATTRS void *AllocateValidatedPointerPayload( byteSize = ((byteSize + align - 1) / align) * align; std::size_t total{byteSize + sizeof(std::uintptr_t)}; AllocFct alloc{allocatorRegistry.GetAllocator(allocatorIdx)}; - void *p{alloc(total, /*asyncId=*/-1)}; + void *p{alloc(total, /*asyncObject=*/nullptr)}; if (p && allocatorIdx == 0) { // Fill the footer word with the XOR of the ones' complement of // 
the base address, which is a value that would be highly unlikely diff --git a/flang-rt/lib/runtime/temporary-stack.cpp b/flang-rt/lib/runtime/temporary-stack.cpp index 3a952b1fdbcca..3f6fd8ee15a80 100644 --- a/flang-rt/lib/runtime/temporary-stack.cpp +++ b/flang-rt/lib/runtime/temporary-stack.cpp @@ -148,7 +148,7 @@ void DescriptorStorage::push(const Descriptor &source) { if constexpr (COPY_VALUES) { // copy the data pointed to by the box box.set_base_addr(nullptr); - box.Allocate(kNoAsyncId); + box.Allocate(kNoAsyncObject); RTNAME(AssignTemporary) (box, source, terminator_.sourceFileName(), terminator_.sourceLine()); } diff --git a/flang-rt/lib/runtime/tools.cpp b/flang-rt/lib/runtime/tools.cpp index 5d6e35faca70a..1f965b0b151ce 100644 --- a/flang-rt/lib/runtime/tools.cpp +++ b/flang-rt/lib/runtime/tools.cpp @@ -261,7 +261,7 @@ RT_API_ATTRS void CreatePartialReductionResult(Descriptor &result, for (int j{0}; j + 1 < xRank; ++j) { result.GetDimension(j).SetBounds(1, resultExtent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/runtime/transformational.cpp b/flang-rt/lib/runtime/transformational.cpp index a7d5a48530ee9..3df314a4e966b 100644 --- a/flang-rt/lib/runtime/transformational.cpp +++ b/flang-rt/lib/runtime/transformational.cpp @@ -132,7 +132,7 @@ static inline RT_API_ATTRS std::size_t AllocateResult(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: Could not allocate memory for result (stat=%d)", function, stat); } @@ -157,7 +157,7 @@ static inline RT_API_ATTRS std::size_t AllocateBesselResult(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int 
stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: Could not allocate memory for result (stat=%d)", function, stat); } diff --git a/flang-rt/unittests/Evaluate/reshape.cpp b/flang-rt/unittests/Evaluate/reshape.cpp index 67a0be124e8e0..f84de443965d1 100644 --- a/flang-rt/unittests/Evaluate/reshape.cpp +++ b/flang-rt/unittests/Evaluate/reshape.cpp @@ -26,7 +26,7 @@ int main() { for (int j{0}; j < 3; ++j) { source->GetDimension(j).SetBounds(1, sourceExtent[j]); } - TEST(source->Allocate(kNoAsyncId) == CFI_SUCCESS); + TEST(source->Allocate(kNoAsyncObject) == CFI_SUCCESS); TEST(source->IsAllocated()); MATCH(2, source->GetDimension(0).Extent()); MATCH(3, source->GetDimension(1).Extent()); diff --git a/flang-rt/unittests/Runtime/Allocatable.cpp b/flang-rt/unittests/Runtime/Allocatable.cpp index a6fcdd0d1423c..b394312e5bc5a 100644 --- a/flang-rt/unittests/Runtime/Allocatable.cpp +++ b/flang-rt/unittests/Runtime/Allocatable.cpp @@ -26,7 +26,7 @@ TEST(AllocatableTest, MoveAlloc) { auto b{createAllocatable(TypeCategory::Integer, 4)}; // ALLOCATE(a(20)) a->GetDimension(0).SetBounds(1, 20); - a->Allocate(kNoAsyncId); + a->Allocate(kNoAsyncObject); EXPECT_TRUE(a->IsAllocated()); EXPECT_FALSE(b->IsAllocated()); @@ -46,7 +46,7 @@ TEST(AllocatableTest, MoveAlloc) { // move_alloc with errMsg auto errMsg{Descriptor::Create( sizeof(char), 64, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - errMsg->Allocate(kNoAsyncId); + errMsg->Allocate(kNoAsyncObject); RTNAME(MoveAlloc)(*b, *a, nullptr, false, errMsg.get(), __FILE__, __LINE__); EXPECT_FALSE(a->IsAllocated()); EXPECT_TRUE(b->IsAllocated()); diff --git a/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp b/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp index 89649aa95ad93..9935ae0eaac2f 100644 --- a/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp +++ b/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp @@ -42,7 +42,8 @@ TEST(AllocatableCUFTest, SimpleDeviceAllocatable) { 
CUDA_REPORT_IF_ERROR(cudaMalloc(&device_desc, a->SizeInBytes())); RTNAME(AllocatableAllocate) - (*a, kNoAsyncId, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*a, kNoAsyncObject, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(a->IsAllocated()); RTNAME(CUFDescriptorSync)(device_desc, a.get(), __FILE__, __LINE__); cudaDeviceSynchronize(); @@ -82,19 +83,22 @@ TEST(AllocatableCUFTest, StreamDeviceAllocatable) { RTNAME(AllocatableSetBounds)(*c, 0, 1, 100); RTNAME(AllocatableAllocate) - (*a, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(a->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); RTNAME(AllocatableAllocate) - (*b, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*b, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(b->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); RTNAME(AllocatableAllocate) - (*c, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*c, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(c->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); diff --git a/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp b/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp index 2f1dc64dc8c5a..f1f931e87a86e 100644 --- a/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp +++ b/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp @@ -35,7 +35,7 @@ TEST(AllocatableCUFTest, SimpleDeviceAllocate) { EXPECT_FALSE(a->HasAddendum()); RTNAME(AllocatableSetBounds)(*a, 0, 1, 10); RTNAME(AllocatableAllocate) - (*a, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); 
EXPECT_TRUE(a->IsAllocated()); RTNAME(AllocatableDeallocate) @@ -54,7 +54,7 @@ TEST(AllocatableCUFTest, SimplePinnedAllocate) { EXPECT_FALSE(a->HasAddendum()); RTNAME(AllocatableSetBounds)(*a, 0, 1, 10); RTNAME(AllocatableAllocate) - (*a, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); EXPECT_TRUE(a->IsAllocated()); RTNAME(AllocatableDeallocate) diff --git a/flang-rt/unittests/Runtime/CUDA/Memory.cpp b/flang-rt/unittests/Runtime/CUDA/Memory.cpp index b3612073657ab..7915baca6c203 100644 --- a/flang-rt/unittests/Runtime/CUDA/Memory.cpp +++ b/flang-rt/unittests/Runtime/CUDA/Memory.cpp @@ -50,8 +50,8 @@ TEST(MemoryCUFTest, CUFDataTransferDescDesc) { EXPECT_EQ((int)kDeviceAllocatorPos, dev->GetAllocIdx()); RTNAME(AllocatableSetBounds)(*dev, 0, 1, 10); RTNAME(AllocatableAllocate) - (*dev, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, - __LINE__); + (*dev, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, + __FILE__, __LINE__); EXPECT_TRUE(dev->IsAllocated()); // Create temp array to transfer to device. 
diff --git a/flang-rt/unittests/Runtime/CharacterTest.cpp b/flang-rt/unittests/Runtime/CharacterTest.cpp index 0f28e883671bc..2c7af27b9da77 100644 --- a/flang-rt/unittests/Runtime/CharacterTest.cpp +++ b/flang-rt/unittests/Runtime/CharacterTest.cpp @@ -35,7 +35,7 @@ OwningPtr CreateDescriptor(const std::vector &shape, for (int j{0}; j < rank; ++j) { descriptor->GetDimension(j).SetBounds(2, shape[j] + 1); } - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } diff --git a/flang-rt/unittests/Runtime/CommandTest.cpp b/flang-rt/unittests/Runtime/CommandTest.cpp index 9d0da4ce8dd4e..6919a98105b8a 100644 --- a/flang-rt/unittests/Runtime/CommandTest.cpp +++ b/flang-rt/unittests/Runtime/CommandTest.cpp @@ -26,7 +26,7 @@ template static OwningPtr CreateEmptyCharDescriptor() { OwningPtr descriptor{Descriptor::Create( sizeof(char), n, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } return descriptor; @@ -36,7 +36,7 @@ static OwningPtr CharDescriptor(const char *value) { std::size_t n{std::strlen(value)}; OwningPtr descriptor{Descriptor::Create( sizeof(char), n, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } std::memcpy(descriptor->OffsetElement(), value, n); @@ -47,7 +47,7 @@ template static OwningPtr EmptyIntDescriptor() { OwningPtr descriptor{Descriptor::Create(TypeCategory::Integer, kind, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } return descriptor; @@ -57,7 +57,7 @@ template static OwningPtr IntDescriptor(const int &value) { OwningPtr descriptor{Descriptor::Create(TypeCategory::Integer, kind, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if 
(descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } std::memcpy(descriptor->OffsetElement(), &value, sizeof(int)); diff --git a/flang-rt/unittests/Runtime/TemporaryStack.cpp b/flang-rt/unittests/Runtime/TemporaryStack.cpp index 3291794f22fc1..65725840459ab 100644 --- a/flang-rt/unittests/Runtime/TemporaryStack.cpp +++ b/flang-rt/unittests/Runtime/TemporaryStack.cpp @@ -59,7 +59,7 @@ TEST(TemporaryStack, ValueStackBasic) { Descriptor &outputDesc2{testDescriptorStorage[2].descriptor()}; inputDesc.Establish(code, elementBytes, descriptorPtr, rank, extent); - inputDesc.Allocate(kNoAsyncId); + inputDesc.Allocate(kNoAsyncObject); ASSERT_EQ(inputDesc.IsAllocated(), true); uint32_t *inputData = static_cast(inputDesc.raw().base_addr); for (std::size_t i = 0; i < inputDesc.Elements(); ++i) { @@ -123,7 +123,7 @@ TEST(TemporaryStack, ValueStackMultiSize) { boxDims.extent = extent[dim]; boxDims.sm = elementBytes; } - desc->Allocate(kNoAsyncId); + desc->Allocate(kNoAsyncObject); // fill the array with some data to test for (uint32_t i = 0; i < desc->Elements(); ++i) { diff --git a/flang-rt/unittests/Runtime/tools.h b/flang-rt/unittests/Runtime/tools.h index a1eba45647a80..4ada862df110b 100644 --- a/flang-rt/unittests/Runtime/tools.h +++ b/flang-rt/unittests/Runtime/tools.h @@ -42,7 +42,7 @@ static OwningPtr MakeArray(const std::vector &shape, for (int j{0}; j < rank; ++j) { result->GetDimension(j).SetBounds(1, shape[j]); } - int stat{result->Allocate(kNoAsyncId)}; + int stat{result->Allocate(kNoAsyncObject)}; EXPECT_EQ(stat, 0) << stat; EXPECT_LE(data.size(), result->Elements()); char *p{result->OffsetElement()}; diff --git a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td index 46cc59cda1612..e38738230ffbc 100644 --- a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td +++ b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td @@ -95,12 +95,11 @@ def 
cuf_AllocateOp : cuf_Op<"allocate", [AttrSizedOperandSegments, }]; let arguments = (ins Arg:$box, - Arg, "", [MemWrite]>:$errmsg, - Optional:$stream, - Arg, "", [MemWrite]>:$pinned, - Arg, "", [MemRead]>:$source, - cuf_DataAttributeAttr:$data_attr, - UnitAttr:$hasStat); + Arg, "", [MemWrite]>:$errmsg, + Optional:$stream, + Arg, "", [MemWrite]>:$pinned, + Arg, "", [MemRead]>:$source, + cuf_DataAttributeAttr:$data_attr, UnitAttr:$hasStat); let results = (outs AnyIntegerType:$stat); diff --git a/flang/include/flang/Runtime/CUDA/allocatable.h b/flang/include/flang/Runtime/CUDA/allocatable.h index 822f2d4a2b297..6c97afa9e10e8 100644 --- a/flang/include/flang/Runtime/CUDA/allocatable.h +++ b/flang/include/flang/Runtime/CUDA/allocatable.h @@ -17,14 +17,14 @@ namespace Fortran::runtime::cuda { extern "C" { /// Perform allocation of the descriptor. -int RTDECL(CUFAllocatableAllocate)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFAllocatableAllocate)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. -int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); @@ -32,14 +32,14 @@ int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t stream = -1, /// Perform allocation of the descriptor without synchronization. Assign data /// from source. 
int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. Assign data from source. int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/include/flang/Runtime/CUDA/allocator.h b/flang/include/flang/Runtime/CUDA/allocator.h index 18ddf75ac3852..59fdb22b6e663 100644 --- a/flang/include/flang/Runtime/CUDA/allocator.h +++ b/flang/include/flang/Runtime/CUDA/allocator.h @@ -20,16 +20,16 @@ extern "C" { void RTDECL(CUFRegisterAllocator)(); } -void *CUFAllocPinned(std::size_t, std::int64_t); +void *CUFAllocPinned(std::size_t, std::int64_t *); void CUFFreePinned(void *); -void *CUFAllocDevice(std::size_t, std::int64_t); +void *CUFAllocDevice(std::size_t, std::int64_t *); void CUFFreeDevice(void *); -void *CUFAllocManaged(std::size_t, std::int64_t); +void *CUFAllocManaged(std::size_t, std::int64_t *); void CUFFreeManaged(void *); -void *CUFAllocUnified(std::size_t, std::int64_t); +void *CUFAllocUnified(std::size_t, std::int64_t *); void CUFFreeUnified(void *); } // namespace Fortran::runtime::cuda diff --git a/flang/include/flang/Runtime/CUDA/pointer.h b/flang/include/flang/Runtime/CUDA/pointer.h index 7fbd8f8e061f2..bdfc3268e0814 100644 --- a/flang/include/flang/Runtime/CUDA/pointer.h +++ b/flang/include/flang/Runtime/CUDA/pointer.h @@ -17,14 +17,14 @@ namespace Fortran::runtime::cuda { extern "C" { /// Perform allocation of the 
descriptor. -int RTDECL(CUFPointerAllocate)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFPointerAllocate)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. -int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); @@ -32,14 +32,14 @@ int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t stream = -1, /// Perform allocation of the descriptor without synchronization. Assign data /// from source. int RTDEF(CUFPointerAllocateSource)(Descriptor &pointer, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. Assign data from source. 
int RTDEF(CUFPointerAllocateSourceSync)(Descriptor &pointer, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/include/flang/Runtime/allocatable.h b/flang/include/flang/Runtime/allocatable.h index 6895f8af5e2a8..863c07494e7c3 100644 --- a/flang/include/flang/Runtime/allocatable.h +++ b/flang/include/flang/Runtime/allocatable.h @@ -94,9 +94,10 @@ int RTDECL(AllocatableCheckLengthParameter)(Descriptor &, // Successfully allocated memory is initialized if the allocatable has a // derived type, and is always initialized by AllocatableAllocateSource(). // Performs all necessary coarray synchronization and validation actions. -int RTDECL(AllocatableAllocate)(Descriptor &, std::int64_t asyncId = -1, - bool hasStat = false, const Descriptor *errMsg = nullptr, - const char *sourceFile = nullptr, int sourceLine = 0); +int RTDECL(AllocatableAllocate)(Descriptor &, + std::int64_t *asyncObject = nullptr, bool hasStat = false, + const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, + int sourceLine = 0); int RTDECL(AllocatableAllocateSource)(Descriptor &, const Descriptor &source, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/lib/Lower/Allocatable.cpp b/flang/lib/Lower/Allocatable.cpp index 7e32575caad9b..dd90e8900704b 100644 --- a/flang/lib/Lower/Allocatable.cpp +++ b/flang/lib/Lower/Allocatable.cpp @@ -760,7 +760,7 @@ class AllocateStmtHelper { mlir::Value errmsg = errMsgExpr ? errorManager.errMsgAddr : nullptr; mlir::Value stream = streamExpr - ? fir::getBase(converter.genExprValue(loc, *streamExpr, stmtCtx)) + ? 
fir::getBase(converter.genExprAddr(loc, *streamExpr, stmtCtx)) : nullptr; mlir::Value pinned = pinnedExpr diff --git a/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp b/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp index 28452d3b486da..cd5f1f6d098c3 100644 --- a/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp +++ b/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp @@ -76,8 +76,7 @@ void fir::runtime::genAllocatableAllocate(fir::FirOpBuilder &builder, mlir::func::FuncOp func{ fir::runtime::getRuntimeFunc(loc, builder)}; mlir::FunctionType fTy{func.getFunctionType()}; - mlir::Value asyncId = - builder.createIntegerConstant(loc, builder.getI64Type(), -1); + mlir::Value asyncObject = builder.createNullConstant(loc); mlir::Value sourceFile{fir::factory::locationToFilename(builder, loc)}; mlir::Value sourceLine{ fir::factory::locationToLineNo(builder, loc, fTy.getInput(5))}; @@ -88,7 +87,7 @@ void fir::runtime::genAllocatableAllocate(fir::FirOpBuilder &builder, errMsg = builder.create(loc, boxNoneTy).getResult(); } llvm::SmallVector args{ - fir::runtime::createArguments(builder, loc, fTy, desc, asyncId, hasStat, - errMsg, sourceFile, sourceLine)}; + fir::runtime::createArguments(builder, loc, fTy, desc, asyncObject, + hasStat, errMsg, sourceFile, sourceLine)}; builder.create(loc, func, args); } diff --git a/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp b/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp index 24033bc15b8eb..687007d957225 100644 --- a/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp +++ b/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp @@ -76,6 +76,16 @@ llvm::LogicalResult cuf::FreeOp::verify() { return checkCudaAttr(*this); } // AllocateOp //===----------------------------------------------------------------------===// +template +static llvm::LogicalResult checkStreamType(OpTy op) { + if (!op.getStream()) + return mlir::success(); + if (auto refTy = mlir::dyn_cast(op.getStream().getType())) + if (!refTy.getEleTy().isInteger(64)) + return 
op.emitOpError("stream is expected to be an i64 reference"); + return mlir::success(); +} + llvm::LogicalResult cuf::AllocateOp::verify() { if (getPinned() && getStream()) return emitOpError("pinned and stream cannot appears at the same time"); @@ -92,7 +102,7 @@ llvm::LogicalResult cuf::AllocateOp::verify() { "expect errmsg to be a reference to/or a box type value"); if (getErrmsg() && !getHasStat()) return emitOpError("expect stat attribute when errmsg is provided"); - return mlir::success(); + return checkStreamType(*this); } //===----------------------------------------------------------------------===// @@ -143,16 +153,6 @@ llvm::LogicalResult cuf::DeallocateOp::verify() { // KernelLaunchOp //===----------------------------------------------------------------------===// -template -static llvm::LogicalResult checkStreamType(OpTy op) { - if (!op.getStream()) - return mlir::success(); - if (auto refTy = mlir::dyn_cast(op.getStream().getType())) - if (!refTy.getEleTy().isInteger(64)) - return op.emitOpError("stream is expected to be an i64 reference"); - return mlir::success(); -} - llvm::LogicalResult cuf::KernelLaunchOp::verify() { return checkStreamType(*this); } diff --git a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp index 7477a3c53c3ef..0fff06033b73d 100644 --- a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp @@ -129,17 +129,15 @@ static mlir::LogicalResult convertOpToCall(OpTy op, mlir::IntegerType::get(op.getContext(), 1))); if (op.getSource()) { mlir::Value stream = - op.getStream() - ? op.getStream() - : builder.createIntegerConstant(loc, fTy.getInput(2), -1); + op.getStream() ? 
op.getStream() + : builder.createNullConstant(loc, fTy.getInput(2)); args = fir::runtime::createArguments( builder, loc, fTy, op.getBox(), op.getSource(), stream, pinned, hasStat, errmsg, sourceFile, sourceLine); } else { mlir::Value stream = - op.getStream() - ? op.getStream() - : builder.createIntegerConstant(loc, fTy.getInput(1), -1); + op.getStream() ? op.getStream() + : builder.createNullConstant(loc, fTy.getInput(1)); args = fir::runtime::createArguments(builder, loc, fTy, op.getBox(), stream, pinned, hasStat, errmsg, sourceFile, sourceLine); diff --git a/flang/test/Fir/CUDA/cuda-allocate.fir b/flang/test/Fir/CUDA/cuda-allocate.fir index 095ad92d5deb5..ea7890c9aac52 100644 --- a/flang/test/Fir/CUDA/cuda-allocate.fir +++ b/flang/test/Fir/CUDA/cuda-allocate.fir @@ -19,7 +19,7 @@ func.func @_QPsub1() { // CHECK: %[[DESC:.*]] = fir.convert %[[DESC_RT_CALL]] : (!fir.ref>) -> !fir.ref>>> // CHECK: %[[DECL_DESC:.*]]:2 = hlfir.declare %[[DESC]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: %[[BOX_NONE:.*]] = fir.convert %[[DECL_DESC]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[BOX_NONE:.*]] = fir.convert %[[DECL_DESC]]#1 : (!fir.ref>>>) -> !fir.ref> // CHECK: %{{.*}} = fir.call @_FortranAAllocatableDeallocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -47,7 +47,7 @@ func.func @_QPsub3() { // CHECK: %[[A:.*]]:2 = hlfir.declare %[[A_ADDR]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QMmod1Ea"} : 
(!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: %[[A_BOX:.*]] = fir.convert %[[A]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[A_BOX:.*]] = fir.convert %[[A]]#1 : (!fir.ref>>>) -> !fir.ref> // CHECK: fir.call @_FortranACUFAllocatableDeallocate(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -87,7 +87,7 @@ func.func @_QPsub5() { } // CHECK-LABEL: func.func @_QPsub5() -// CHECK: fir.call @_FortranACUFAllocatableAllocate({{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocate({{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: fir.call @_FortranAAllocatableDeallocate({{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -118,7 +118,7 @@ func.func @_QQsub6() attributes {fir.bindc_name = "test"} { // CHECK: %[[B:.*]]:2 = hlfir.declare %[[B_ADDR]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QMdataEb"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: _FortranAAllocatableSetBounds // CHECK: %[[B_BOX:.*]] = fir.convert %[[B]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[B_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[B_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 func.func @_QPallocate_source() { @@ -142,7 +142,7 @@ func.func 
@_QPallocate_source() { // CHECK: %[[SOURCE:.*]] = fir.load %[[DECL_HOST]] : !fir.ref>>> // CHECK: %[[DEV_CONV:.*]] = fir.convert %[[DECL_DEV]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[SOURCE_CONV:.*]] = fir.convert %[[SOURCE]] : (!fir.box>>) -> !fir.box -// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocateSource(%[[DEV_CONV]], %[[SOURCE_CONV]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.box, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocateSource(%[[DEV_CONV]], %[[SOURCE_CONV]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.box, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 fir.global @_QMmod1Ea_d {data_attr = #cuf.cuda} : !fir.box>> { @@ -170,16 +170,14 @@ func.func @_QQallocate_stream() { %1 = fir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFEa"} : (!fir.ref>>>) -> !fir.ref>>> %2 = fir.alloca i64 {bindc_name = "stream1", uniq_name = "_QFEstream1"} %3 = fir.declare %2 {uniq_name = "_QFEstream1"} : (!fir.ref) -> !fir.ref - %4 = fir.load %3 : !fir.ref - %5 = cuf.allocate %1 : !fir.ref>>> stream(%4 : i64) {data_attr = #cuf.cuda} -> i32 + %5 = cuf.allocate %1 : !fir.ref>>> stream(%3 : !fir.ref) {data_attr = #cuf.cuda} -> i32 return } // CHECK-LABEL: func.func @_QQallocate_stream() // CHECK: %[[STREAM_ALLOCA:.*]] = fir.alloca i64 {bindc_name = "stream1", uniq_name = "_QFEstream1"} // CHECK: %[[STREAM:.*]] = fir.declare %[[STREAM_ALLOCA]] {uniq_name = "_QFEstream1"} : (!fir.ref) -> !fir.ref -// CHECK: %[[STREAM_LOAD:.*]] = fir.load %[[STREAM]] : !fir.ref -// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %[[STREAM_LOAD]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %[[STREAM]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, 
i1, !fir.box, !fir.ref, i32) -> i32 func.func @_QPp_alloc() { @@ -268,6 +266,6 @@ func.func @_QQpinned() attributes {fir.bindc_name = "testasync"} { // CHECK: %[[PINNED:.*]] = fir.alloca !fir.logical<4> {bindc_name = "pinnedflag", uniq_name = "_QFEpinnedflag"} // CHECK: %[[DECL_PINNED:.*]] = fir.declare %[[PINNED]] {uniq_name = "_QFEpinnedflag"} : (!fir.ref>) -> !fir.ref> // CHECK: %[[CONV_PINNED:.*]] = fir.convert %[[DECL_PINNED]] : (!fir.ref>) -> !fir.ref -// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %{{.*}}, %[[CONV_PINNED]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %{{.*}}, %[[CONV_PINNED]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 } // end of module diff --git a/flang/test/Fir/cuf-invalid.fir b/flang/test/Fir/cuf-invalid.fir index a3b9be3ee8223..dceb8f6fde236 100644 --- a/flang/test/Fir/cuf-invalid.fir +++ b/flang/test/Fir/cuf-invalid.fir @@ -2,13 +2,12 @@ func.func @_QPsub1() { %0 = fir.alloca !fir.box>> {bindc_name = "a", uniq_name = "_QFsub1Ea"} - %1 = fir.alloca i32 + %s = fir.alloca i64 %pinned = fir.alloca i1 %4:2 = hlfir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) %11 = fir.convert %4#1 : (!fir.ref>>>) -> !fir.ref> - %s = fir.load %1 : !fir.ref // expected-error at +1{{'cuf.allocate' op pinned and stream cannot appears at the same time}} - %13 = cuf.allocate %11 : !fir.ref> stream(%s : i32) pinned(%pinned : !fir.ref) {data_attr = #cuf.cuda} -> i32 + %13 = cuf.allocate %11 : !fir.ref> stream(%s : !fir.ref) pinned(%pinned : !fir.ref) {data_attr = #cuf.cuda} -> i32 return } diff --git a/flang/test/Fir/cuf.mlir b/flang/test/Fir/cuf.mlir index d38b26a4548ed..f80a70eca34a3 100644 --- a/flang/test/Fir/cuf.mlir +++ b/flang/test/Fir/cuf.mlir @@ -18,15 +18,14 @@ func.func 
@_QPsub1() { func.func @_QPsub1() { %0 = fir.alloca !fir.box>> {bindc_name = "a", uniq_name = "_QFsub1Ea"} - %1 = fir.alloca i32 + %1 = fir.alloca i64 %4:2 = hlfir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) %11 = fir.convert %4#1 : (!fir.ref>>>) -> !fir.ref> - %s = fir.load %1 : !fir.ref - %13 = cuf.allocate %11 : !fir.ref> stream(%s : i32) {data_attr = #cuf.cuda} -> i32 + %13 = cuf.allocate %11 : !fir.ref> stream(%1 : !fir.ref) {data_attr = #cuf.cuda} -> i32 return } -// CHECK: cuf.allocate %{{.*}} : !fir.ref> stream(%{{.*}} : i32) {data_attr = #cuf.cuda} -> i32 +// CHECK: cuf.allocate %{{.*}} : !fir.ref> stream(%{{.*}} : !fir.ref) {data_attr = #cuf.cuda} -> i32 // ----- diff --git a/flang/test/HLFIR/elemental-codegen.fir b/flang/test/HLFIR/elemental-codegen.fir index a715479f16115..67af4261470f7 100644 --- a/flang/test/HLFIR/elemental-codegen.fir +++ b/flang/test/HLFIR/elemental-codegen.fir @@ -191,7 +191,7 @@ func.func @test_polymorphic(%arg0: !fir.class> {fir.bindc_ // CHECK: %[[VAL_35:.*]] = fir.absent !fir.box // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_4]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_37:.*]] = fir.convert %[[VAL_31]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_38:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_36]], %{{.*}}, %[[VAL_34]], %[[VAL_35]], %[[VAL_37]], %[[VAL_33]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_38:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_36]], %{{.*}}, %[[VAL_34]], %[[VAL_35]], %[[VAL_37]], %[[VAL_33]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_12:.*]] = arith.constant true // CHECK: %[[VAL_39:.*]] = fir.load %[[VAL_13]]#0 : !fir.ref>>>> // CHECK: %[[VAL_40:.*]] = arith.constant 1 : index @@ -275,7 +275,7 @@ func.func @test_polymorphic_expr(%arg0: !fir.class> {fir.b // CHECK: %[[VAL_36:.*]] = fir.absent !fir.box // CHECK: %[[VAL_37:.*]] = 
fir.convert %[[VAL_5]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_38:.*]] = fir.convert %[[VAL_32]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_39:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_37]], %{{.*}}, %[[VAL_35]], %[[VAL_36]], %[[VAL_38]], %[[VAL_34]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_39:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_37]], %{{.*}}, %[[VAL_35]], %[[VAL_36]], %[[VAL_38]], %[[VAL_34]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_13:.*]] = arith.constant true // CHECK: %[[VAL_40:.*]] = fir.load %[[VAL_14]]#0 : !fir.ref>>>> // CHECK: %[[VAL_41:.*]] = arith.constant 1 : index @@ -328,7 +328,7 @@ func.func @test_polymorphic_expr(%arg0: !fir.class> {fir.b // CHECK: %[[VAL_85:.*]] = fir.absent !fir.box // CHECK: %[[VAL_86:.*]] = fir.convert %[[VAL_4]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_87:.*]] = fir.convert %[[VAL_81]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_88:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_86]], %{{.*}}, %[[VAL_84]], %[[VAL_85]], %[[VAL_87]], %[[VAL_83]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_88:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_86]], %{{.*}}, %[[VAL_84]], %[[VAL_85]], %[[VAL_87]], %[[VAL_83]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_62:.*]] = arith.constant true // CHECK: %[[VAL_89:.*]] = fir.load %[[VAL_63]]#0 : !fir.ref>>>> // CHECK: %[[VAL_90:.*]] = arith.constant 1 : index diff --git a/flang/test/Lower/CUDA/cuda-allocatable.cuf b/flang/test/Lower/CUDA/cuda-allocatable.cuf index a570f636b8db1..cec10dda839e9 100644 --- a/flang/test/Lower/CUDA/cuda-allocatable.cuf +++ b/flang/test/Lower/CUDA/cuda-allocatable.cuf @@ -90,7 +90,7 @@ end subroutine subroutine sub4() real, allocatable, device :: a(:) - integer :: istream + integer(8) :: istream allocate(a(10), stream=istream) end subroutine @@ -98,11 +98,10 @@ end subroutine ! 
CHECK: %[[BOX:.*]] = cuf.alloc !fir.box>> {bindc_name = "a", data_attr = #cuf.cuda, uniq_name = "_QFsub4Ea"} -> !fir.ref>>> ! CHECK: fir.embox {{.*}} {allocator_idx = 2 : i32} ! CHECK: %[[BOX_DECL:.*]]:2 = hlfir.declare %{{.*}} {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub4Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) -! CHECK: %[[ISTREAM:.*]] = fir.alloca i32 {bindc_name = "istream", uniq_name = "_QFsub4Eistream"} -! CHECK: %[[ISTREAM_DECL:.*]]:2 = hlfir.declare %[[ISTREAM]] {uniq_name = "_QFsub4Eistream"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ISTREAM:.*]] = fir.alloca i64 {bindc_name = "istream", uniq_name = "_QFsub4Eistream"} +! CHECK: %[[ISTREAM_DECL:.*]]:2 = hlfir.declare %[[ISTREAM]] {uniq_name = "_QFsub4Eistream"} : (!fir.ref) -> (!fir.ref, !fir.ref) ! CHECK: fir.call @_FortranAAllocatableSetBounds -! CHECK: %[[STREAM:.*]] = fir.load %[[ISTREAM_DECL]]#0 : !fir.ref -! CHECK: %{{.*}} = cuf.allocate %[[BOX_DECL]]#0 : !fir.ref>>> stream(%[[STREAM]] : i32) {data_attr = #cuf.cuda} -> i32 +! CHECK: %{{.*}} = cuf.allocate %[[BOX_DECL]]#0 : !fir.ref>>> stream(%[[ISTREAM_DECL]]#0 : !fir.ref) {data_attr = #cuf.cuda} -> i32 ! CHECK: fir.if %{{.*}} { ! CHECK: %{{.*}} = cuf.deallocate %[[BOX_DECL]]#0 : !fir.ref>>> {data_attr = #cuf.cuda} -> i32 ! CHECK: } diff --git a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 index 5bb1ae3797346..6869af863644d 100644 --- a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 @@ -473,6 +473,6 @@ subroutine init() end module ! CHECK-LABEL: func.func @_QMacc_declare_post_action_statPinit() -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! 
CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.if -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/OpenACC/acc-declare.f90 b/flang/test/Lower/OpenACC/acc-declare.f90 index 889cdef51f4ce..4d95ffa10edaf 100644 --- a/flang/test/Lower/OpenACC/acc-declare.f90 +++ b/flang/test/Lower/OpenACC/acc-declare.f90 @@ -434,6 +434,6 @@ subroutine init() end module ! CHECK-LABEL: func.func @_QMacc_declare_post_action_statPinit() -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.if -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/allocatable-polymorphic.f90 b/flang/test/Lower/allocatable-polymorphic.f90 index 10e703210ea61..e6a8c5e025123 100644 --- a/flang/test/Lower/allocatable-polymorphic.f90 +++ b/flang/test/Lower/allocatable-polymorphic.f90 @@ -267,7 +267,7 @@ subroutine test_allocatable() ! CHECK-DAG: %[[C0:.*]] = arith.constant 0 : i32 ! 
CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[P_CAST]], %[[TYPE_DESC_P1_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[P_CAST:.*]] = fir.convert %[[P_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[P_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[P_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P1:.*]] = fir.type_desc !fir.type<_QMpolyTp1{a:i32,b:i32}> ! CHECK-DAG: %[[C1_CAST:.*]] = fir.convert %[[C1_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> @@ -276,7 +276,7 @@ subroutine test_allocatable() ! CHECK-DAG: %[[C0:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[C1_CAST]], %[[TYPE_DESC_P1_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[C1_CAST:.*]] = fir.convert %[[C1_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C1_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C1_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P2:.*]] = fir.type_desc !fir.type<_QMpolyTp2{p1:!fir.type<_QMpolyTp1{a:i32,b:i32}>,c:i32}> ! CHECK-DAG: %[[C2_CAST:.*]] = fir.convert %[[C2_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> @@ -285,7 +285,7 @@ subroutine test_allocatable() ! CHECK-DAG: %[[C0:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[C2_CAST]], %[[TYPE_DESC_P2_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! 
CHECK: %[[C2_CAST:.*]] = fir.convert %[[C2_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C2_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C2_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P1:.*]] = fir.type_desc !fir.type<_QMpolyTp1{a:i32,b:i32}> ! CHECK-DAG: %[[C3_CAST:.*]] = fir.convert %[[C3_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> @@ -300,7 +300,7 @@ subroutine test_allocatable() ! CHECK-DAG: %[[C10_I64:.*]] = fir.convert %[[C10]] : (i32) -> i64 ! CHECK: fir.call @_FortranAAllocatableSetBounds(%[[C3_CAST]], %[[C0]], %[[C1_I64]], %[[C10_I64]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[C3_CAST:.*]] = fir.convert %[[C3_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C3_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C3_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P2:.*]] = fir.type_desc !fir.type<_QMpolyTp2{p1:!fir.type<_QMpolyTp1{a:i32,b:i32}>,c:i32}> ! CHECK-DAG: %[[C4_CAST:.*]] = fir.convert %[[C4_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> @@ -316,7 +316,7 @@ subroutine test_allocatable() ! CHECK-DAG: %[[C20_I64:.*]] = fir.convert %[[C20]] : (i32) -> i64 ! CHECK: fir.call @_FortranAAllocatableSetBounds(%[[C4_CAST]], %[[C0]], %[[C1_I64]], %[[C20_I64]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[C4_CAST:.*]] = fir.convert %[[C4_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> -! 
CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C4_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C4_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[C1_LOAD1:.*]] = fir.load %[[C1_DECL]]#0 : !fir.ref>>> ! CHECK: fir.dispatch "proc1"(%[[C1_LOAD1]] : !fir.class>>) @@ -390,7 +390,7 @@ subroutine test_unlimited_polymorphic_with_intrinsic_type_spec() ! CHECK-DAG: %[[CORANK:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitIntrinsicForAllocate(%[[BOX_NONE]], %[[CAT]], %[[KIND]], %[[RANK]], %[[CORANK]]) {{.*}} : (!fir.ref>, i32, i32, i32, i32) -> () ! CHECK: %[[BOX_NONE:.*]] = fir.convert %[[P_DECL]]#0 : (!fir.ref>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-DAG: %[[BOX_NONE:.*]] = fir.convert %[[PTR_DECL]]#0 : (!fir.ref>>) -> !fir.ref> ! CHECK-DAG: %[[CAT:.*]] = arith.constant 2 : i32 @@ -573,7 +573,7 @@ subroutine test_allocatable_up_character() ! CHECK-DAG: %[[CORANK:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitCharacterForAllocate(%[[A_NONE]], %[[LEN]], %[[KIND]], %[[RANK]], %[[CORANK]]) {{.*}} : (!fir.ref>, i64, i32, i32, i32) -> () ! CHECK: %[[A_NONE:.*]] = fir.convert %[[A_DECL]]#0 : (!fir.ref>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! 
CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 end module @@ -592,17 +592,17 @@ program test_alloc ! LLVM-LABEL: define void @_QMpolyPtest_allocatable() ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp2, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 1, i32 0) ! LLVM: call void @_FortranAAllocatableSetBounds(ptr %{{.*}}, i32 0, i64 1, i64 10) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp2, i32 1, i32 0) ! 
LLVM: call void @_FortranAAllocatableSetBounds(ptr %{{.*}}, i32 0, i64 1, i64 20) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM-COUNT-2: call void %{{[0-9]*}}() ! LLVM: call void @llvm.memcpy.p0.p0.i32 @@ -683,5 +683,5 @@ program test_alloc ! LLVM: store { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] } { ptr null, i64 8, i32 20240719, i8 0, i8 42, i8 2, i8 1, ptr @_QMpolyEXdtXp1, [1 x i64] zeroinitializer }, ptr %[[ALLOCA1:[0-9]*]] ! LLVM: call void @llvm.memcpy.p0.p0.i32(ptr %[[ALLOCA2:[0-9]+]], ptr %[[ALLOCA1]], i32 40, i1 false) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %[[ALLOCA2]], ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %[[ALLOCA2]], i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %[[ALLOCA2]], ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: %{{.*}} = call i32 @_FortranAAllocatableDeallocatePolymorphic(ptr %[[ALLOCA2]], ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) diff --git a/flang/test/Lower/allocatable-runtime.f90 b/flang/test/Lower/allocatable-runtime.f90 index 37272c90656cc..c63252c68974e 100644 --- a/flang/test/Lower/allocatable-runtime.f90 +++ b/flang/test/Lower/allocatable-runtime.f90 @@ -31,7 +31,7 @@ subroutine foo() ! CHECK: fir.call @{{.*}}AllocatableSetBounds(%[[xBoxCast2]], %c0{{.*}}, %[[xlbCast]], %[[xubCast]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK-DAG: %[[xBoxCast3:.*]] = fir.convert %[[xBoxAddr]] : (!fir.ref>>>) -> !fir.ref> ! CHECK-DAG: %[[sourceFile:.*]] = fir.convert %{{.*}} -> !fir.ref - ! 
CHECK: fir.call @{{.*}}AllocatableAllocate(%[[xBoxCast3]], %{{.*}}, %false{{.*}}, %[[errMsg]], %[[sourceFile]], %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 + ! CHECK: fir.call @{{.*}}AllocatableAllocate(%[[xBoxCast3]], %{{.*}}, %false{{.*}}, %[[errMsg]], %[[sourceFile]], %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! Simply check that we are emitting the right numebr of set bound for y and z. Otherwise, this is just like x. ! CHECK: fir.convert %[[yBoxAddr]] : (!fir.ref>>>) -> !fir.ref> @@ -180,4 +180,4 @@ subroutine mold_allocation() ! CHECK: %[[M_BOX_NONE:.*]] = fir.convert %[[EMBOX_M]] : (!fir.box>) -> !fir.box ! CHECK: fir.call @_FortranAAllocatableApplyMold(%[[A_BOX_NONE]], %[[M_BOX_NONE]], %[[RANK]]) {{.*}} : (!fir.ref>, !fir.box, i32) -> () ! CHECK: %[[A_BOX_NONE:.*]] = fir.convert %[[A]] : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/allocate-mold.f90 b/flang/test/Lower/allocate-mold.f90 index c7985b11397ce..9427c8b08786f 100644 --- a/flang/test/Lower/allocate-mold.f90 +++ b/flang/test/Lower/allocate-mold.f90 @@ -16,7 +16,7 @@ subroutine scalar_mold_allocation() ! CHECK: %[[A_REF_BOX_NONE1:.*]] = fir.convert %[[A]] : (!fir.ref>>) -> !fir.ref> ! CHECK: fir.call @_FortranAAllocatableApplyMold(%[[A_REF_BOX_NONE1]], %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.box, i32) -> () ! CHECK: %[[A_REF_BOX_NONE2:.*]] = fir.convert %[[A]] : (!fir.ref>>) -> !fir.ref> -! 
CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_REF_BOX_NONE2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_REF_BOX_NONE2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 subroutine array_scalar_mold_allocation() real, allocatable :: a(:) @@ -40,4 +40,4 @@ end subroutine array_scalar_mold_allocation ! CHECK: %[[REF_BOX_A1:.*]] = fir.convert %1 : (!fir.ref>>>) -> !fir.ref> ! CHECK: fir.call @_FortranAAllocatableSetBounds(%[[REF_BOX_A1]], {{.*}},{{.*}}, {{.*}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[REF_BOX_A2:.*]] = fir.convert %[[A]] : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[REF_BOX_A2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[REF_BOX_A2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/polymorphic.f90 b/flang/test/Lower/polymorphic.f90 index 485861a838ff6..b7be5f685d9e3 100644 --- a/flang/test/Lower/polymorphic.f90 +++ b/flang/test/Lower/polymorphic.f90 @@ -1149,7 +1149,7 @@ program test ! CHECK-LABEL: func.func @_QQmain() attributes {fir.bindc_name = "test"} { ! CHECK: %[[ADDR_O:.*]] = fir.address_of(@_QFEo) : !fir.ref}>>>> ! CHECK: %[[BOX_NONE:.*]] = fir.convert %[[ADDR_O]] : (!fir.ref}>>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! 
CHECK: %[[O:.*]] = fir.load %[[ADDR_O]] : !fir.ref}>>>> ! CHECK: %[[COORD_INNER:.*]] = fir.coordinate_of %[[O]], inner : (!fir.box}>>>) -> !fir.ref> ! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered iter_args(%arg1 = %{{.*}}) -> (!fir.array<5x!fir.logical<4>>) { diff --git a/flang/test/Lower/volatile-allocatable.f90 b/flang/test/Lower/volatile-allocatable.f90 index e182fe8a4d9c9..59e724bce8464 100644 --- a/flang/test/Lower/volatile-allocatable.f90 +++ b/flang/test/Lower/volatile-allocatable.f90 @@ -124,15 +124,15 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv2"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv3"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv1"} : (!fir.box>, volatile>) -> (!fir.box>, volatile>, !fir.box>, volatile>) ! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0{"j"} : (!fir.box>, volatile>) -> !fir.ref ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QQro._QMderived_typesText_type.0"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocateSource(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.box, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QQclX766F6C6174696C6520636861726163746572"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -144,7 +144,7 @@ subroutine test_unlimited_polymorphic() ! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEv1"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QQro.4xi4.1"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocateSource(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.box, i1, !fir.box, !fir.ref, i32) -> i32 @@ -154,7 +154,7 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () -! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAClassIs(%{{.+}}, %{{.+}}) : (!fir.box, !fir.ref) -> i1 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.class>>, volatile>, !fir.shift<1>) -> (!fir.class>>, volatile>, !fir.class>>, volatile>) ! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0 (%{{.+}}) : (!fir.class>>, volatile>, index) -> !fir.class, volatile> @@ -170,22 +170,22 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0{"arr"} shape %{{.+}} : (!fir.ref>, !fir.shape<1>) -> !fir.ref> ! CHECK: fir.call @_FortranAAllocatableApplyMold(%{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.box, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_unlimited_polymorphic() { ! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.ref, volatile>, volatile>) -> (!fir.ref, volatile>, volatile>, !fir.ref, volatile>, volatile>) ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEupa"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitIntrinsicForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i32, i32, i32) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.heap) -> (!fir.heap, !fir.heap) ! CHECK: fir.call @_FortranAAllocatableInitCharacterForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i32, i32, i32) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.heap>, index) -> (!fir.boxchar<1>, !fir.heap>) ! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QQclX636C617373282A29"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) ! CHECK: fir.call @_FortranAAllocatableInitIntrinsicForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i32, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEupa"} : (!fir.box>, volatile>, !fir.shift<1>) -> (!fir.box>, volatile>, !fir.box>, volatile>) ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QQro.3xr4.3"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) ! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Transforms/lower-repack-arrays.fir b/flang/test/Transforms/lower-repack-arrays.fir index bbae7ba5b0e0b..0b323b1bb0697 100644 --- a/flang/test/Transforms/lower-repack-arrays.fir +++ b/flang/test/Transforms/lower-repack-arrays.fir @@ -840,7 +840,7 @@ func.func @_QPtest6(%arg0: !fir.class>> {fir.bi // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>>) -> !fir.box @@ -928,7 +928,7 @@ func.func @_QPtest6_stack(%arg0: !fir.class>> { // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] 
= fir.load %[[VAL_22]] : !fir.ref>>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>>) -> !fir.box @@ -1015,7 +1015,7 @@ func.func @_QPtest7(%arg0: !fir.class> {fir.bindc_name = "x // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>) -> !fir.box @@ -1103,7 +1103,7 @@ func.func @_QPtest7_stack(%arg0: !fir.class> {fir.bindc_nam // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>) -> !fir.box From flang-commits at lists.llvm.org Mon May 19 15:02:59 2025 From: 
flang-commits at lists.llvm.org (Valentin Clement (バレンタイン クレメン) via flang-commits) Date: Mon, 19 May 2025 15:02:59 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][cuda] Use a reference for asyncObject (PR #140614) In-Reply-To: Message-ID: <682baa93.170a0220.19cb56.d2d6@mx.google.com> https://github.com/clementval closed https://github.com/llvm/llvm-project/pull/140614 From flang-commits at lists.llvm.org Mon May 19 15:28:17 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Mon, 19 May 2025 15:28:17 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang] Extension: allow char string edit descriptors in input formats (PR #140624) In-Reply-To: Message-ID: <682bb081.170a0220.35d4da.1841@mx.google.com> https://github.com/akuhlens approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/140624 From flang-commits at lists.llvm.org Mon May 19 15:28:41 2025 From: flang-commits at lists.llvm.org (Valentin Clement (バレンタイン クレメン) via flang-commits) Date: Mon, 19 May 2025 15:28:41 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Set implicit CUDA device attribute in block construct (PR #140637) Message-ID: https://github.com/clementval created https://github.com/llvm/llvm-project/pull/140637 Arrays declared in the specification part of a device procedure are implicitly given the device attribute when they have no CUDA data attribute. This was not done for arrays declared in a block construct, which led to a false semantic error about host arrays being used in a device context.
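The rule described above can be sketched in isolation. This is a toy model, not flang's implementation: `ToySymbol`, `CudaDataAttr`, and `setImplicitDevice` are hypothetical stand-ins for the symbol classes that the patch touches, and the predicate only mirrors the condition that the patch factors out into a shared helper.

```cpp
// Toy model only: hypothetical stand-ins, not flang's Symbol or
// ObjectEntityDetails classes.
#include <optional>

enum class CudaDataAttr { Device, Managed, Pinned };

struct ToySymbol {
  bool isObject{true};       // the symbol names a data object
  bool isArray{false};       // declared with array bounds, e.g. s(7)
  bool isDummy{false};       // dummy argument of the procedure
  bool hasValueAttr{false};  // carries the VALUE attribute
  std::optional<CudaDataAttr> cudaAttr;  // explicit CUDA data attribute, if any
};

// Applies the implicit rule: inside a device procedure, a data object with no
// explicit CUDA data attribute and no VALUE attribute becomes a device object
// when it is a dummy argument or an array. Everything else is left untouched.
inline void setImplicitDevice(bool inDeviceSubprogram, ToySymbol &symbol) {
  if (inDeviceSubprogram && symbol.isObject && !symbol.cudaAttr &&
      !symbol.hasValueAttr && (symbol.isDummy || symbol.isArray)) {
    symbol.cudaAttr = CudaDataAttr::Device;
  }
}
```

Under this model, the array `s(7)` from the new `blockTest` case would be marked as a device object, while the scalars `xloc` and `i` would keep no attribute. The point of the fix is that the same predicate is now also applied to symbols whose scope is a block construct nested in a device procedure.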
>From a16ad92e740557bf4fd5e403213ecea9eb04e7b6 Mon Sep 17 00:00:00 2001 From: Valentin Clement Date: Mon, 19 May 2025 15:26:14 -0700 Subject: [PATCH] [flang][cuda] Set implicit CUDA device attribute in block construct --- flang/lib/Semantics/resolve-names.cpp | 43 ++++++++++++++++++++++----- flang/test/Semantics/cuf09.cuf | 11 +++++++ 2 files changed, 46 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index bdafc03ad2c05..92a3277191ae0 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -9372,11 +9372,40 @@ void ResolveNamesVisitor::CreateGeneric(const parser::GenericSpec &x) { info.Resolve(&MakeSymbol(symbolName, Attrs{}, std::move(genericDetails))); } +static void SetImplicitCUDADevice(bool inDeviceSubprogram, Symbol &symbol) { + if (inDeviceSubprogram && symbol.has()) { + auto *object{symbol.detailsIf()}; + if (!object->cudaDataAttr() && !IsValue(symbol) && + (IsDummy(symbol) || object->IsArray())) { + // Implicitly set device attribute if none is set in device context. + object->set_cudaDataAttr(common::CUDADataAttr::Device); + } + } +} + void ResolveNamesVisitor::FinishSpecificationPart( const std::list &decls) { misparsedStmtFuncFound_ = false; funcResultStack().CompleteFunctionResultType(); CheckImports(); + bool inDeviceSubprogram{false}; + Symbol *scopeSym{currScope().symbol()}; + if (currScope().kind() == Scope::Kind::BlockConstruct) { + scopeSym = currScope().parent().symbol(); + } + if (scopeSym) { + if (auto *details{scopeSym->detailsIf()}) { + // Check the current procedure is a device procedure to apply implicit + // attribute at the end. 
+ if (auto attrs{details->cudaSubprogramAttrs()}) { + if (*attrs == common::CUDASubprogramAttrs::Device || + *attrs == common::CUDASubprogramAttrs::Global || + *attrs == common::CUDASubprogramAttrs::Grid_Global) { + inDeviceSubprogram = true; + } + } + } + } for (auto &pair : currScope()) { auto &symbol{*pair.second}; if (inInterfaceBlock()) { @@ -9411,6 +9440,11 @@ void ResolveNamesVisitor::FinishSpecificationPart( SetBindNameOn(symbol); } } + if (currScope().kind() == Scope::Kind::BlockConstruct) { + // Only look for specification in BlockConstruct. Other cases are done in + // ResolveSpecificationParts. + SetImplicitCUDADevice(inDeviceSubprogram, symbol); + } } currScope().InstantiateDerivedTypes(); for (const auto &decl : decls) { @@ -9970,14 +10004,7 @@ void ResolveNamesVisitor::ResolveSpecificationParts(ProgramTree &node) { } ApplyImplicitRules(symbol); // Apply CUDA implicit attributes if needed. - if (inDeviceSubprogram && symbol.has()) { - auto *object{symbol.detailsIf()}; - if (!object->cudaDataAttr() && !IsValue(symbol) && - (IsDummy(symbol) || object->IsArray())) { - // Implicitly set device attribute if none is set in device context. - object->set_cudaDataAttr(common::CUDADataAttr::Device); - } - } + SetImplicitCUDADevice(inDeviceSubprogram, symbol); // Main program local objects usually don't have an implied SAVE attribute, // as one might think, but in the exceptional case of a derived type // local object that contains a coarray, we have to mark it as an diff --git a/flang/test/Semantics/cuf09.cuf b/flang/test/Semantics/cuf09.cuf index 193b22213da61..4a6d9ab09387d 100644 --- a/flang/test/Semantics/cuf09.cuf +++ b/flang/test/Semantics/cuf09.cuf @@ -228,3 +228,14 @@ attributes(host,device) subroutine do2(a,b,c,i) c(i) = a(i) - b(i) ! ok. Should not error with Host array ! cannot be present in device context end + +attributes(global) subroutine blockTest +block + integer(8) :: xloc + integer(8) :: s(7) + integer(4) :: i + do i = 1, 7 + s = xloc ! ok. 
+ end do +end block +end subroutine From flang-commits at lists.llvm.org Mon May 19 15:29:16 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 15:29:16 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Set implicit CUDA device attribute in block construct (PR #140637) In-Reply-To: Message-ID: <682bb0bc.630a0220.d6949.1d42@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Valentin Clement (バレンタイン クレメン) (clementval)
Changes Arrays in specification part inside a device procedure are implicitly flagged as device if they have no attribute. This was not done for arrays in block construct and leads to false semantic error about usage of host arrays in device context. --- Full diff: https://github.com/llvm/llvm-project/pull/140637.diff 2 Files Affected: - (modified) flang/lib/Semantics/resolve-names.cpp (+35-8) - (modified) flang/test/Semantics/cuf09.cuf (+11) ``````````diff diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index bdafc03ad2c05..92a3277191ae0 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -9372,11 +9372,40 @@ void ResolveNamesVisitor::CreateGeneric(const parser::GenericSpec &x) { info.Resolve(&MakeSymbol(symbolName, Attrs{}, std::move(genericDetails))); } +static void SetImplicitCUDADevice(bool inDeviceSubprogram, Symbol &symbol) { + if (inDeviceSubprogram && symbol.has()) { + auto *object{symbol.detailsIf()}; + if (!object->cudaDataAttr() && !IsValue(symbol) && + (IsDummy(symbol) || object->IsArray())) { + // Implicitly set device attribute if none is set in device context. + object->set_cudaDataAttr(common::CUDADataAttr::Device); + } + } +} + void ResolveNamesVisitor::FinishSpecificationPart( const std::list &decls) { misparsedStmtFuncFound_ = false; funcResultStack().CompleteFunctionResultType(); CheckImports(); + bool inDeviceSubprogram{false}; + Symbol *scopeSym{currScope().symbol()}; + if (currScope().kind() == Scope::Kind::BlockConstruct) { + scopeSym = currScope().parent().symbol(); + } + if (scopeSym) { + if (auto *details{scopeSym->detailsIf()}) { + // Check the current procedure is a device procedure to apply implicit + // attribute at the end. 
+ if (auto attrs{details->cudaSubprogramAttrs()}) { + if (*attrs == common::CUDASubprogramAttrs::Device || + *attrs == common::CUDASubprogramAttrs::Global || + *attrs == common::CUDASubprogramAttrs::Grid_Global) { + inDeviceSubprogram = true; + } + } + } + } for (auto &pair : currScope()) { auto &symbol{*pair.second}; if (inInterfaceBlock()) { @@ -9411,6 +9440,11 @@ void ResolveNamesVisitor::FinishSpecificationPart( SetBindNameOn(symbol); } } + if (currScope().kind() == Scope::Kind::BlockConstruct) { + // Only look for specification in BlockConstruct. Other cases are done in + // ResolveSpecificationParts. + SetImplicitCUDADevice(inDeviceSubprogram, symbol); + } } currScope().InstantiateDerivedTypes(); for (const auto &decl : decls) { @@ -9970,14 +10004,7 @@ void ResolveNamesVisitor::ResolveSpecificationParts(ProgramTree &node) { } ApplyImplicitRules(symbol); // Apply CUDA implicit attributes if needed. - if (inDeviceSubprogram && symbol.has()) { - auto *object{symbol.detailsIf()}; - if (!object->cudaDataAttr() && !IsValue(symbol) && - (IsDummy(symbol) || object->IsArray())) { - // Implicitly set device attribute if none is set in device context. - object->set_cudaDataAttr(common::CUDADataAttr::Device); - } - } + SetImplicitCUDADevice(inDeviceSubprogram, symbol); // Main program local objects usually don't have an implied SAVE attribute, // as one might think, but in the exceptional case of a derived type // local object that contains a coarray, we have to mark it as an diff --git a/flang/test/Semantics/cuf09.cuf b/flang/test/Semantics/cuf09.cuf index 193b22213da61..4a6d9ab09387d 100644 --- a/flang/test/Semantics/cuf09.cuf +++ b/flang/test/Semantics/cuf09.cuf @@ -228,3 +228,14 @@ attributes(host,device) subroutine do2(a,b,c,i) c(i) = a(i) - b(i) ! ok. Should not error with Host array ! cannot be present in device context end + +attributes(global) subroutine blockTest +block + integer(8) :: xloc + integer(8) :: s(7) + integer(4) :: i + do i = 1, 7 + s = xloc ! ok. 
+ end do +end block +end subroutine ``````````
https://github.com/llvm/llvm-project/pull/140637 From flang-commits at lists.llvm.org Mon May 19 15:30:58 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Mon, 19 May 2025 15:30:58 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Set implicit CUDA device attribute in block construct (PR #140637) In-Reply-To: Message-ID: <682bb122.170a0220.f5784.cb0c@mx.google.com> https://github.com/klausler approved this pull request. https://github.com/llvm/llvm-project/pull/140637 From flang-commits at lists.llvm.org Mon May 19 15:02:56 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 15:02:56 -0700 (PDT) Subject: [flang-commits] [flang] f5609aa - [flang][cuda] Use a reference for asyncObject (#140614) Message-ID: <682baa90.170a0220.18d5a9.c8d9@mx.google.com> Author: Valentin Clement (バレンタイン クレメン) Date: 2025-05-19T15:02:53-07:00 New Revision: f5609aa1b014bea1eb72a992665c6afa41015794 URL: https://github.com/llvm/llvm-project/commit/f5609aa1b014bea1eb72a992665c6afa41015794 DIFF: https://github.com/llvm/llvm-project/commit/f5609aa1b014bea1eb72a992665c6afa41015794.diff LOG: [flang][cuda] Use a reference for asyncObject (#140614) Switch from `int64_t` to `int64_t*` to fit with the rest of the implementation. This is a new attempt with some fixes; the previous attempt was reverted some time ago.
Reviewed in #138010 Added: Modified: flang-rt/include/flang-rt/runtime/allocator-registry.h flang-rt/include/flang-rt/runtime/descriptor.h flang-rt/include/flang-rt/runtime/reduction-templates.h flang-rt/lib/cuda/CMakeLists.txt flang-rt/lib/cuda/allocatable.cpp flang-rt/lib/cuda/allocator.cpp flang-rt/lib/cuda/descriptor.cpp flang-rt/lib/cuda/pointer.cpp flang-rt/lib/runtime/allocatable.cpp flang-rt/lib/runtime/array-constructor.cpp flang-rt/lib/runtime/assign.cpp flang-rt/lib/runtime/character.cpp flang-rt/lib/runtime/copy.cpp flang-rt/lib/runtime/derived.cpp flang-rt/lib/runtime/descriptor.cpp flang-rt/lib/runtime/extrema.cpp flang-rt/lib/runtime/findloc.cpp flang-rt/lib/runtime/matmul-transpose.cpp flang-rt/lib/runtime/matmul.cpp flang-rt/lib/runtime/misc-intrinsic.cpp flang-rt/lib/runtime/pointer.cpp flang-rt/lib/runtime/temporary-stack.cpp flang-rt/lib/runtime/tools.cpp flang-rt/lib/runtime/transformational.cpp flang-rt/unittests/Evaluate/reshape.cpp flang-rt/unittests/Runtime/Allocatable.cpp flang-rt/unittests/Runtime/CUDA/Allocatable.cpp flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp flang-rt/unittests/Runtime/CUDA/Memory.cpp flang-rt/unittests/Runtime/CharacterTest.cpp flang-rt/unittests/Runtime/CommandTest.cpp flang-rt/unittests/Runtime/TemporaryStack.cpp flang-rt/unittests/Runtime/tools.h flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td flang/include/flang/Runtime/CUDA/allocatable.h flang/include/flang/Runtime/CUDA/allocator.h flang/include/flang/Runtime/CUDA/pointer.h flang/include/flang/Runtime/allocatable.h flang/lib/Lower/Allocatable.cpp flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp flang/lib/Optimizer/Transforms/CUFOpConversion.cpp flang/test/Fir/CUDA/cuda-allocate.fir flang/test/Fir/cuf-invalid.fir flang/test/Fir/cuf.mlir flang/test/HLFIR/elemental-codegen.fir flang/test/Lower/CUDA/cuda-allocatable.cuf flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 
flang/test/Lower/OpenACC/acc-declare.f90 flang/test/Lower/allocatable-polymorphic.f90 flang/test/Lower/allocatable-runtime.f90 flang/test/Lower/allocate-mold.f90 flang/test/Lower/polymorphic.f90 flang/test/Lower/volatile-allocatable.f90 flang/test/Transforms/lower-repack-arrays.fir Removed: ################################################################################ diff --git a/flang-rt/include/flang-rt/runtime/allocator-registry.h b/flang-rt/include/flang-rt/runtime/allocator-registry.h index 33e8e2c7d7850..f0ba77a360736 100644 --- a/flang-rt/include/flang-rt/runtime/allocator-registry.h +++ b/flang-rt/include/flang-rt/runtime/allocator-registry.h @@ -19,7 +19,7 @@ namespace Fortran::runtime { -using AllocFct = void *(*)(std::size_t, std::int64_t); +using AllocFct = void *(*)(std::size_t, std::int64_t *); using FreeFct = void (*)(void *); typedef struct Allocator_t { @@ -28,7 +28,7 @@ typedef struct Allocator_t { } Allocator_t; static RT_API_ATTRS void *MallocWrapper( - std::size_t size, [[maybe_unused]] std::int64_t) { + std::size_t size, [[maybe_unused]] std::int64_t *) { return std::malloc(size); } #ifdef RT_DEVICE_COMPILATION diff --git a/flang-rt/include/flang-rt/runtime/descriptor.h b/flang-rt/include/flang-rt/runtime/descriptor.h index 9907e7866e7bf..c98e6b14850cb 100644 --- a/flang-rt/include/flang-rt/runtime/descriptor.h +++ b/flang-rt/include/flang-rt/runtime/descriptor.h @@ -29,8 +29,8 @@ #include #include -/// Value used for asyncId when no specific stream is specified. -static constexpr std::int64_t kNoAsyncId = -1; +/// Value used for asyncObject when no specific stream is specified. +static constexpr std::int64_t *kNoAsyncObject = nullptr; namespace Fortran::runtime { @@ -372,7 +372,7 @@ class Descriptor { // before calling. It (re)computes the byte strides after // allocation. Does not allocate automatic components or // perform default component initialization. 
- RT_API_ATTRS int Allocate(std::int64_t asyncId); + RT_API_ATTRS int Allocate(std::int64_t *asyncObject); RT_API_ATTRS void SetByteStrides(); // Deallocates storage; does not call FINAL subroutines or diff --git a/flang-rt/include/flang-rt/runtime/reduction-templates.h b/flang-rt/include/flang-rt/runtime/reduction-templates.h index 77f77a592a476..18412708b02c5 100644 --- a/flang-rt/include/flang-rt/runtime/reduction-templates.h +++ b/flang-rt/include/flang-rt/runtime/reduction-templates.h @@ -347,7 +347,7 @@ inline RT_API_ATTRS void DoMaxMinNorm2(Descriptor &result, const Descriptor &x, // as the element size of the source. result.Establish(x.type(), x.ElementBytes(), nullptr, 0, nullptr, CFI_attribute_allocatable); - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/cuda/CMakeLists.txt b/flang-rt/lib/cuda/CMakeLists.txt index 95e8e855e46b7..14576676a1f0d 100644 --- a/flang-rt/lib/cuda/CMakeLists.txt +++ b/flang-rt/lib/cuda/CMakeLists.txt @@ -14,6 +14,7 @@ add_flangrt_library(flang_rt.cuda STATIC SHARED kernel.cpp memmove-function.cpp memory.cpp + pointer.cpp registration.cpp TARGET_PROPERTIES diff --git a/flang-rt/lib/cuda/allocatable.cpp b/flang-rt/lib/cuda/allocatable.cpp index 432974d18a3e3..c77819e9440d7 100644 --- a/flang-rt/lib/cuda/allocatable.cpp +++ b/flang-rt/lib/cuda/allocatable.cpp @@ -23,7 +23,7 @@ namespace Fortran::runtime::cuda { extern "C" { RT_EXT_API_GROUP_BEGIN -int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( @@ -41,7 +41,7 @@ int RTDEF(CUFAllocatableAllocateSync)(Descriptor &desc, int64_t stream, return stat; } -int 
RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, +int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { if (desc.HasAddendum()) { @@ -63,7 +63,7 @@ int RTDEF(CUFAllocatableAllocate)(Descriptor &desc, int64_t stream, } int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocate)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; @@ -76,7 +76,7 @@ int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, } int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFAllocatableAllocateSync)( alloc, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; diff --git a/flang-rt/lib/cuda/allocator.cpp b/flang-rt/lib/cuda/allocator.cpp index 51119ab251168..f4289c55bd8de 100644 --- a/flang-rt/lib/cuda/allocator.cpp +++ b/flang-rt/lib/cuda/allocator.cpp @@ -98,7 +98,7 @@ static unsigned findAllocation(void *ptr) { return allocNotFound; } -static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { +static void insertAllocation(void *ptr, std::size_t size, cudaStream_t stream) { CriticalSection critical{lock}; initAllocations(); if (numDeviceAllocations >= maxDeviceAllocations) { @@ -106,7 +106,7 @@ static void insertAllocation(void *ptr, std::size_t size, std::int64_t stream) { } deviceAllocations[numDeviceAllocations].ptr = ptr; deviceAllocations[numDeviceAllocations].size = size; - deviceAllocations[numDeviceAllocations].stream = 
(cudaStream_t)stream; + deviceAllocations[numDeviceAllocations].stream = stream; ++numDeviceAllocations; qsort(deviceAllocations, numDeviceAllocations, sizeof(DeviceAllocation), compareDeviceAlloc); @@ -136,7 +136,7 @@ void RTDEF(CUFRegisterAllocator)() { } void *CUFAllocPinned( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR(cudaMallocHost((void **)&p, sizeInBytes)); return p; @@ -144,18 +144,18 @@ void *CUFAllocPinned( void CUFFreePinned(void *p) { CUDA_REPORT_IF_ERROR(cudaFreeHost(p)); } -void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t asyncId) { +void *CUFAllocDevice(std::size_t sizeInBytes, std::int64_t *asyncObject) { void *p; if (Fortran::runtime::executionEnvironment.cudaDeviceIsManaged) { CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); } else { - if (asyncId == kNoAsyncId) { + if (asyncObject == kNoAsyncObject) { CUDA_REPORT_IF_ERROR(cudaMalloc(&p, sizeInBytes)); } else { CUDA_REPORT_IF_ERROR( - cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)asyncId)); - insertAllocation(p, sizeInBytes, asyncId); + cudaMallocAsync(&p, sizeInBytes, (cudaStream_t)*asyncObject)); + insertAllocation(p, sizeInBytes, (cudaStream_t)*asyncObject); } } return p; @@ -174,7 +174,7 @@ void CUFFreeDevice(void *p) { } void *CUFAllocManaged( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { void *p; CUDA_REPORT_IF_ERROR( cudaMallocManaged((void **)&p, sizeInBytes, cudaMemAttachGlobal)); @@ -184,9 +184,9 @@ void *CUFAllocManaged( void CUFFreeManaged(void *p) { CUDA_REPORT_IF_ERROR(cudaFree(p)); } void *CUFAllocUnified( - std::size_t sizeInBytes, [[maybe_unused]] std::int64_t asyncId) { + std::size_t sizeInBytes, [[maybe_unused]] std::int64_t *asyncObject) { // Call alloc managed for the time being. 
- return CUFAllocManaged(sizeInBytes, asyncId); + return CUFAllocManaged(sizeInBytes, asyncObject); } void CUFFreeUnified(void *p) { diff --git a/flang-rt/lib/cuda/descriptor.cpp b/flang-rt/lib/cuda/descriptor.cpp index 175e8c0ef8438..7b768f91af29d 100644 --- a/flang-rt/lib/cuda/descriptor.cpp +++ b/flang-rt/lib/cuda/descriptor.cpp @@ -21,7 +21,7 @@ RT_EXT_API_GROUP_BEGIN Descriptor *RTDEF(CUFAllocDescriptor)( std::size_t sizeInBytes, const char *sourceFile, int sourceLine) { return reinterpret_cast( - CUFAllocManaged(sizeInBytes, /*asyncId*/ -1)); + CUFAllocManaged(sizeInBytes, /*asyncObject=*/nullptr)); } void RTDEF(CUFFreeDescriptor)( diff --git a/flang-rt/lib/cuda/pointer.cpp b/flang-rt/lib/cuda/pointer.cpp index c2559ecb9a6f2..0ed2b0a2b751f 100644 --- a/flang-rt/lib/cuda/pointer.cpp +++ b/flang-rt/lib/cuda/pointer.cpp @@ -22,7 +22,7 @@ namespace Fortran::runtime::cuda { extern "C" { RT_EXT_API_GROUP_BEGIN -int RTDEF(CUFPointerAllocate)(Descriptor &desc, int64_t stream, bool *pinned, +int RTDEF(CUFPointerAllocate)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { if (desc.HasAddendum()) { @@ -43,7 +43,7 @@ int RTDEF(CUFPointerAllocate)(Descriptor &desc, int64_t stream, bool *pinned, return stat; } -int RTDEF(CUFPointerAllocateSync)(Descriptor &desc, int64_t stream, +int RTDEF(CUFPointerAllocateSync)(Descriptor &desc, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFPointerAllocate)( @@ -62,7 +62,7 @@ int RTDEF(CUFPointerAllocateSync)(Descriptor &desc, int64_t stream, } int RTDEF(CUFPointerAllocateSource)(Descriptor &pointer, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFPointerAllocate)( pointer, stream, pinned, 
hasStat, errMsg, sourceFile, sourceLine)}; @@ -75,7 +75,7 @@ int RTDEF(CUFPointerAllocateSource)(Descriptor &pointer, } int RTDEF(CUFPointerAllocateSourceSync)(Descriptor &pointer, - const Descriptor &source, int64_t stream, bool *pinned, bool hasStat, + const Descriptor &source, int64_t *stream, bool *pinned, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(CUFPointerAllocateSync)( pointer, stream, pinned, hasStat, errMsg, sourceFile, sourceLine)}; diff --git a/flang-rt/lib/runtime/allocatable.cpp b/flang-rt/lib/runtime/allocatable.cpp index 6acce34eb9a9e..ef18da6ea0786 100644 --- a/flang-rt/lib/runtime/allocatable.cpp +++ b/flang-rt/lib/runtime/allocatable.cpp @@ -133,17 +133,17 @@ void RTDEF(AllocatableApplyMold)( } } -int RTDEF(AllocatableAllocate)(Descriptor &descriptor, std::int64_t asyncId, - bool hasStat, const Descriptor *errMsg, const char *sourceFile, - int sourceLine) { +int RTDEF(AllocatableAllocate)(Descriptor &descriptor, + std::int64_t *asyncObject, bool hasStat, const Descriptor *errMsg, + const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; if (!descriptor.IsAllocatable()) { return ReturnError(terminator, StatInvalidDescriptor, errMsg, hasStat); } else if (descriptor.IsAllocated()) { return ReturnError(terminator, StatBaseNotNull, errMsg, hasStat); } else { - int stat{ - ReturnError(terminator, descriptor.Allocate(asyncId), errMsg, hasStat)}; + int stat{ReturnError( + terminator, descriptor.Allocate(asyncObject), errMsg, hasStat)}; if (stat == StatOk) { if (const DescriptorAddendum * addendum{descriptor.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -162,7 +162,7 @@ int RTDEF(AllocatableAllocateSource)(Descriptor &alloc, const Descriptor &source, bool hasStat, const Descriptor *errMsg, const char *sourceFile, int sourceLine) { int stat{RTNAME(AllocatableAllocate)( - alloc, /*asyncId=*/-1, hasStat, errMsg, sourceFile, sourceLine)}; + alloc, 
/*asyncObject=*/nullptr, hasStat, errMsg, sourceFile, sourceLine)}; if (stat == StatOk) { Terminator terminator{sourceFile, sourceLine}; DoFromSourceAssign(alloc, source, terminator); diff --git a/flang-rt/lib/runtime/array-constructor.cpp b/flang-rt/lib/runtime/array-constructor.cpp index 67b3b5e1e0f50..858fac7bf2b39 100644 --- a/flang-rt/lib/runtime/array-constructor.cpp +++ b/flang-rt/lib/runtime/array-constructor.cpp @@ -50,7 +50,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( initialAllocationSize(fromElements, to.ElementBytes())}; to.GetDimension(0).SetBounds(1, allocationSize); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); to.GetDimension(0).SetBounds(1, fromElements); vector.actualAllocationSize = allocationSize; @@ -59,7 +59,7 @@ static RT_API_ATTRS void AllocateOrReallocateVectorIfNeeded( // first value: there should be no reallocation. 
RUNTIME_CHECK(terminator, previousToElements >= fromElements); RTNAME(AllocatableAllocate) - (to, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, + (to, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, vector.sourceFile, vector.sourceLine); vector.actualAllocationSize = previousToElements; } diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 9be75da9520e3..912beee909f4a 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -102,7 +102,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; + int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; if (result == StatOk && derived && !derived->noInitializationNeeded()) { result = ReturnError(terminator, Initialize(to, *derived, terminator)); } @@ -280,7 +280,7 @@ RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, // entity, otherwise, the Deallocate() below will not // free the descriptor memory. 
newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; + auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; if (stat == StatOk) { if (HasDynamicComponent(from)) { // If 'from' has allocatable/automatic component, we cannot diff --git a/flang-rt/lib/runtime/character.cpp b/flang-rt/lib/runtime/character.cpp index d1152ee1caefb..f140d202e118e 100644 --- a/flang-rt/lib/runtime/character.cpp +++ b/flang-rt/lib/runtime/character.cpp @@ -118,7 +118,7 @@ static RT_API_ATTRS void Compare(Descriptor &result, const Descriptor &x, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("Compare: could not allocate storage for result"); } std::size_t xChars{x.ElementBytes() >> shift}; @@ -173,7 +173,7 @@ static RT_API_ATTRS void AdjustLRHelper(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("ADJUSTL/R: could not allocate storage for result"); } for (SubscriptValue resultAt{0}; elements-- > 0; @@ -227,7 +227,7 @@ static RT_API_ATTRS void LenTrim(Descriptor &result, const Descriptor &string, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("LEN_TRIM: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -427,7 +427,7 @@ static RT_API_ATTRS void GeneralCharFunc(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, ub[j]); } - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { 
terminator.Crash("SCAN/VERIFY: could not allocate storage for result"); } std::size_t stringElementChars{string.ElementBytes() >> shift}; @@ -530,7 +530,8 @@ static RT_API_ATTRS void MaxMinHelper(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); } for (CHAR *result{accumulator.OffsetElement()}; elements-- > 0; accumData += accumChars, result += chars, x.IncrementSubscripts(xAt)) { @@ -606,7 +607,7 @@ void RTDEF(CharacterConcatenate)(Descriptor &accumulator, for (int j{0}; j < rank; ++j) { accumulator.GetDimension(j).SetBounds(1, ub[j]); } - if (accumulator.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (accumulator.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash( "CharacterConcatenate: could not allocate storage for result"); } @@ -629,7 +630,8 @@ void RTDEF(CharacterConcatenateScalar1)( accumulator.set_base_addr(nullptr); std::size_t oldLen{accumulator.ElementBytes()}; accumulator.raw().elem_len += chars; - RUNTIME_CHECK(terminator, accumulator.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK( + terminator, accumulator.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(accumulator.OffsetElement(oldLen), from, chars); FreeMemory(old); } @@ -831,7 +833,7 @@ void RTDEF(Repeat)(Descriptor &result, const Descriptor &string, std::size_t origBytes{string.ElementBytes()}; result.Establish(string.type(), origBytes * ncopies, nullptr, 0, nullptr, CFI_attribute_allocatable); - if (result.Allocate(kNoAsyncId) != CFI_SUCCESS) { + if (result.Allocate(kNoAsyncObject) != CFI_SUCCESS) { terminator.Crash("REPEAT could not allocate storage for result"); } const char *from{string.OffsetElement()}; @@ -865,7 +867,7 @@ void RTDEF(Trim)(Descriptor &result, const Descriptor &string, } result.Establish(string.type(), resultBytes, nullptr, 0, nullptr, 
CFI_attribute_allocatable); - RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, result.Allocate(kNoAsyncObject) == CFI_SUCCESS); std::memcpy(result.OffsetElement(), string.OffsetElement(), resultBytes); } diff --git a/flang-rt/lib/runtime/copy.cpp b/flang-rt/lib/runtime/copy.cpp index 3a0f98cf8d376..f990f46e0be66 100644 --- a/flang-rt/lib/runtime/copy.cpp +++ b/flang-rt/lib/runtime/copy.cpp @@ -171,8 +171,8 @@ RT_API_ATTRS void CopyElement(const Descriptor &to, const SubscriptValue toAt[], *reinterpret_cast(toPtr + component->offset())}; if (toDesc.raw().base_addr != nullptr) { toDesc.set_base_addr(nullptr); - RUNTIME_CHECK( - terminator, toDesc.Allocate(/*asyncId=*/-1) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, + toDesc.Allocate(/*asyncObject=*/nullptr) == CFI_SUCCESS); const Descriptor &fromDesc{*reinterpret_cast( fromPtr + component->offset())}; copyStack.emplace(toDesc, fromDesc); diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..35037036f63e7 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -52,7 +52,7 @@ RT_API_ATTRS int Initialize(const Descriptor &instance, allocDesc.raw().attribute = CFI_attribute_allocatable; if (comp.genre() == typeInfo::Component::Genre::Automatic) { stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); + terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { if (const auto *derived{addendum->derivedType()}) { @@ -153,7 +153,7 @@ RT_API_ATTRS int InitializeClone(const Descriptor &clone, if (origDesc.IsAllocated()) { cloneDesc.ApplyMold(origDesc, origDesc.rank()); stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); + terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); if (stat == StatOk) { if (const DescriptorAddendum * 
addendum{cloneDesc.Addendum()}) { if (const typeInfo::DerivedType * @@ -260,7 +260,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy.raw().attribute = CFI_attribute_allocatable; Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } diff --git a/flang-rt/lib/runtime/descriptor.cpp b/flang-rt/lib/runtime/descriptor.cpp index 3debf53bb5290..67336d01380e0 100644 --- a/flang-rt/lib/runtime/descriptor.cpp +++ b/flang-rt/lib/runtime/descriptor.cpp @@ -158,7 +158,7 @@ RT_API_ATTRS static inline int MapAllocIdx(const Descriptor &desc) { #endif } -RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { +RT_API_ATTRS int Descriptor::Allocate(std::int64_t *asyncObject) { std::size_t elementBytes{ElementBytes()}; if (static_cast(elementBytes) < 0) { // F'2023 7.4.4.2 p5: "If the character length parameter value evaluates @@ -170,7 +170,7 @@ RT_API_ATTRS int Descriptor::Allocate(std::int64_t asyncId) { // Zero size allocation is possible in Fortran and the resulting // descriptor must be allocated/associated. Since std::malloc(0) // result is implementation defined, always allocate at least one byte. - void *p{alloc(byteSize ? byteSize : 1, asyncId)}; + void *p{alloc(byteSize ?
byteSize : 1, asyncObject)}; if (!p) { return CFI_ERROR_MEM_ALLOCATION; } diff --git a/flang-rt/lib/runtime/extrema.cpp b/flang-rt/lib/runtime/extrema.cpp index 4c7f8e8b99e8f..03e574a8fbff1 100644 --- a/flang-rt/lib/runtime/extrema.cpp +++ b/flang-rt/lib/runtime/extrema.cpp @@ -152,7 +152,7 @@ inline RT_API_ATTRS void CharacterMaxOrMinLoc(const char *intrinsic, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } @@ -181,7 +181,7 @@ inline RT_API_ATTRS void TotalNumericMaxOrMinLoc(const char *intrinsic, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/runtime/findloc.cpp b/flang-rt/lib/runtime/findloc.cpp index e3e98953b0cfc..5485f4b97bd2f 100644 --- a/flang-rt/lib/runtime/findloc.cpp +++ b/flang-rt/lib/runtime/findloc.cpp @@ -220,7 +220,7 @@ void RTDEF(Findloc)(Descriptor &result, const Descriptor &x, CFI_attribute_allocatable); result.GetDimension(0).SetBounds(1, extent[0]); Terminator terminator{source, line}; - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "FINDLOC: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/matmul-transpose.cpp b/flang-rt/lib/runtime/matmul-transpose.cpp index 17987fb73d943..c9e21502b629e 100644 --- a/flang-rt/lib/runtime/matmul-transpose.cpp +++ b/flang-rt/lib/runtime/matmul-transpose.cpp @@ -183,7 +183,7 @@ inline static RT_API_ATTRS void DoMatmulTranspose( for (int j{0}; j < resRank; ++j) { 
result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "MATMUL-TRANSPOSE: could not allocate memory for result; STAT=%d", stat); diff --git a/flang-rt/lib/runtime/matmul.cpp b/flang-rt/lib/runtime/matmul.cpp index 0ff92cecbbcb8..5acb345725212 100644 --- a/flang-rt/lib/runtime/matmul.cpp +++ b/flang-rt/lib/runtime/matmul.cpp @@ -255,7 +255,7 @@ static inline RT_API_ATTRS void DoMatmul( for (int j{0}; j < resRank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "MATMUL: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/misc-intrinsic.cpp b/flang-rt/lib/runtime/misc-intrinsic.cpp index 2fde859869ef0..a8797f48fa667 100644 --- a/flang-rt/lib/runtime/misc-intrinsic.cpp +++ b/flang-rt/lib/runtime/misc-intrinsic.cpp @@ -30,7 +30,7 @@ static RT_API_ATTRS void TransferImpl(Descriptor &result, if (const DescriptorAddendum * addendum{mold.Addendum()}) { *result.Addendum() = *addendum; } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { Terminator{sourceFile, line}.Crash( "TRANSFER: could not allocate memory for result; STAT=%d", stat); } diff --git a/flang-rt/lib/runtime/pointer.cpp b/flang-rt/lib/runtime/pointer.cpp index fd2427f4124b5..7331f7bbc3a75 100644 --- a/flang-rt/lib/runtime/pointer.cpp +++ b/flang-rt/lib/runtime/pointer.cpp @@ -129,7 +129,7 @@ RT_API_ATTRS void *AllocateValidatedPointerPayload( byteSize = ((byteSize + align - 1) / align) * align; std::size_t total{byteSize + sizeof(std::uintptr_t)}; AllocFct alloc{allocatorRegistry.GetAllocator(allocatorIdx)}; - void *p{alloc(total, /*asyncId=*/-1)}; + void *p{alloc(total, /*asyncObject=*/nullptr)}; if (p && allocatorIdx == 0) { // Fill the footer word with the XOR of the ones' complement of // 
the base address, which is a value that would be highly unlikely diff --git a/flang-rt/lib/runtime/temporary-stack.cpp b/flang-rt/lib/runtime/temporary-stack.cpp index 3a952b1fdbcca..3f6fd8ee15a80 100644 --- a/flang-rt/lib/runtime/temporary-stack.cpp +++ b/flang-rt/lib/runtime/temporary-stack.cpp @@ -148,7 +148,7 @@ void DescriptorStorage::push(const Descriptor &source) { if constexpr (COPY_VALUES) { // copy the data pointed to by the box box.set_base_addr(nullptr); - box.Allocate(kNoAsyncId); + box.Allocate(kNoAsyncObject); RTNAME(AssignTemporary) (box, source, terminator_.sourceFileName(), terminator_.sourceLine()); } diff --git a/flang-rt/lib/runtime/tools.cpp b/flang-rt/lib/runtime/tools.cpp index 5d6e35faca70a..1f965b0b151ce 100644 --- a/flang-rt/lib/runtime/tools.cpp +++ b/flang-rt/lib/runtime/tools.cpp @@ -261,7 +261,7 @@ RT_API_ATTRS void CreatePartialReductionResult(Descriptor &result, for (int j{0}; j + 1 < xRank; ++j) { result.GetDimension(j).SetBounds(1, resultExtent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: could not allocate memory for result; STAT=%d", intrinsic, stat); } diff --git a/flang-rt/lib/runtime/transformational.cpp b/flang-rt/lib/runtime/transformational.cpp index a7d5a48530ee9..3df314a4e966b 100644 --- a/flang-rt/lib/runtime/transformational.cpp +++ b/flang-rt/lib/runtime/transformational.cpp @@ -132,7 +132,7 @@ static inline RT_API_ATTRS std::size_t AllocateResult(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: Could not allocate memory for result (stat=%d)", function, stat); } @@ -157,7 +157,7 @@ static inline RT_API_ATTRS std::size_t AllocateBesselResult(Descriptor &result, for (int j{0}; j < rank; ++j) { result.GetDimension(j).SetBounds(1, extent[j]); } - if (int 
stat{result.Allocate(kNoAsyncId)}) { + if (int stat{result.Allocate(kNoAsyncObject)}) { terminator.Crash( "%s: Could not allocate memory for result (stat=%d)", function, stat); } diff --git a/flang-rt/unittests/Evaluate/reshape.cpp b/flang-rt/unittests/Evaluate/reshape.cpp index 67a0be124e8e0..f84de443965d1 100644 --- a/flang-rt/unittests/Evaluate/reshape.cpp +++ b/flang-rt/unittests/Evaluate/reshape.cpp @@ -26,7 +26,7 @@ int main() { for (int j{0}; j < 3; ++j) { source->GetDimension(j).SetBounds(1, sourceExtent[j]); } - TEST(source->Allocate(kNoAsyncId) == CFI_SUCCESS); + TEST(source->Allocate(kNoAsyncObject) == CFI_SUCCESS); TEST(source->IsAllocated()); MATCH(2, source->GetDimension(0).Extent()); MATCH(3, source->GetDimension(1).Extent()); diff --git a/flang-rt/unittests/Runtime/Allocatable.cpp b/flang-rt/unittests/Runtime/Allocatable.cpp index a6fcdd0d1423c..b394312e5bc5a 100644 --- a/flang-rt/unittests/Runtime/Allocatable.cpp +++ b/flang-rt/unittests/Runtime/Allocatable.cpp @@ -26,7 +26,7 @@ TEST(AllocatableTest, MoveAlloc) { auto b{createAllocatable(TypeCategory::Integer, 4)}; // ALLOCATE(a(20)) a->GetDimension(0).SetBounds(1, 20); - a->Allocate(kNoAsyncId); + a->Allocate(kNoAsyncObject); EXPECT_TRUE(a->IsAllocated()); EXPECT_FALSE(b->IsAllocated()); @@ -46,7 +46,7 @@ TEST(AllocatableTest, MoveAlloc) { // move_alloc with errMsg auto errMsg{Descriptor::Create( sizeof(char), 64, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - errMsg->Allocate(kNoAsyncId); + errMsg->Allocate(kNoAsyncObject); RTNAME(MoveAlloc)(*b, *a, nullptr, false, errMsg.get(), __FILE__, __LINE__); EXPECT_FALSE(a->IsAllocated()); EXPECT_TRUE(b->IsAllocated()); diff --git a/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp b/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp index 89649aa95ad93..9935ae0eaac2f 100644 --- a/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp +++ b/flang-rt/unittests/Runtime/CUDA/Allocatable.cpp @@ -42,7 +42,8 @@ TEST(AllocatableCUFTest, SimpleDeviceAllocatable) { 
CUDA_REPORT_IF_ERROR(cudaMalloc(&device_desc, a->SizeInBytes())); RTNAME(AllocatableAllocate) - (*a, kNoAsyncId, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*a, kNoAsyncObject, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(a->IsAllocated()); RTNAME(CUFDescriptorSync)(device_desc, a.get(), __FILE__, __LINE__); cudaDeviceSynchronize(); @@ -82,19 +83,22 @@ TEST(AllocatableCUFTest, StreamDeviceAllocatable) { RTNAME(AllocatableSetBounds)(*c, 0, 1, 100); RTNAME(AllocatableAllocate) - (*a, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(a->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); RTNAME(AllocatableAllocate) - (*b, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*b, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(b->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); RTNAME(AllocatableAllocate) - (*c, 1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + (*c, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + __LINE__); EXPECT_TRUE(c->IsAllocated()); cudaDeviceSynchronize(); EXPECT_EQ(cudaSuccess, cudaGetLastError()); diff --git a/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp b/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp index 2f1dc64dc8c5a..f1f931e87a86e 100644 --- a/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp +++ b/flang-rt/unittests/Runtime/CUDA/AllocatorCUF.cpp @@ -35,7 +35,7 @@ TEST(AllocatableCUFTest, SimpleDeviceAllocate) { EXPECT_FALSE(a->HasAddendum()); RTNAME(AllocatableSetBounds)(*a, 0, 1, 10); RTNAME(AllocatableAllocate) - (*a, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); 
EXPECT_TRUE(a->IsAllocated()); RTNAME(AllocatableDeallocate) @@ -54,7 +54,7 @@ TEST(AllocatableCUFTest, SimplePinnedAllocate) { EXPECT_FALSE(a->HasAddendum()); RTNAME(AllocatableSetBounds)(*a, 0, 1, 10); RTNAME(AllocatableAllocate) - (*a, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, + (*a, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); EXPECT_TRUE(a->IsAllocated()); RTNAME(AllocatableDeallocate) diff --git a/flang-rt/unittests/Runtime/CUDA/Memory.cpp b/flang-rt/unittests/Runtime/CUDA/Memory.cpp index b3612073657ab..7915baca6c203 100644 --- a/flang-rt/unittests/Runtime/CUDA/Memory.cpp +++ b/flang-rt/unittests/Runtime/CUDA/Memory.cpp @@ -50,8 +50,8 @@ TEST(MemoryCUFTest, CUFDataTransferDescDesc) { EXPECT_EQ((int)kDeviceAllocatorPos, dev->GetAllocIdx()); RTNAME(AllocatableSetBounds)(*dev, 0, 1, 10); RTNAME(AllocatableAllocate) - (*dev, /*asyncId=*/-1, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, - __LINE__); + (*dev, /*asyncObject=*/nullptr, /*hasStat=*/false, /*errMsg=*/nullptr, + __FILE__, __LINE__); EXPECT_TRUE(dev->IsAllocated()); // Create temp array to transfer to device. 
diff --git a/flang-rt/unittests/Runtime/CharacterTest.cpp b/flang-rt/unittests/Runtime/CharacterTest.cpp index 0f28e883671bc..2c7af27b9da77 100644 --- a/flang-rt/unittests/Runtime/CharacterTest.cpp +++ b/flang-rt/unittests/Runtime/CharacterTest.cpp @@ -35,7 +35,7 @@ OwningPtr CreateDescriptor(const std::vector &shape, for (int j{0}; j < rank; ++j) { descriptor->GetDimension(j).SetBounds(2, shape[j] + 1); } - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } diff --git a/flang-rt/unittests/Runtime/CommandTest.cpp b/flang-rt/unittests/Runtime/CommandTest.cpp index 9d0da4ce8dd4e..6919a98105b8a 100644 --- a/flang-rt/unittests/Runtime/CommandTest.cpp +++ b/flang-rt/unittests/Runtime/CommandTest.cpp @@ -26,7 +26,7 @@ template static OwningPtr CreateEmptyCharDescriptor() { OwningPtr descriptor{Descriptor::Create( sizeof(char), n, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } return descriptor; @@ -36,7 +36,7 @@ static OwningPtr CharDescriptor(const char *value) { std::size_t n{std::strlen(value)}; OwningPtr descriptor{Descriptor::Create( sizeof(char), n, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } std::memcpy(descriptor->OffsetElement(), value, n); @@ -47,7 +47,7 @@ template static OwningPtr EmptyIntDescriptor() { OwningPtr descriptor{Descriptor::Create(TypeCategory::Integer, kind, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if (descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } return descriptor; @@ -57,7 +57,7 @@ template static OwningPtr IntDescriptor(const int &value) { OwningPtr descriptor{Descriptor::Create(TypeCategory::Integer, kind, nullptr, 0, nullptr, CFI_attribute_allocatable)}; - if 
(descriptor->Allocate(kNoAsyncId) != 0) { + if (descriptor->Allocate(kNoAsyncObject) != 0) { return nullptr; } std::memcpy(descriptor->OffsetElement(), &value, sizeof(int)); diff --git a/flang-rt/unittests/Runtime/TemporaryStack.cpp b/flang-rt/unittests/Runtime/TemporaryStack.cpp index 3291794f22fc1..65725840459ab 100644 --- a/flang-rt/unittests/Runtime/TemporaryStack.cpp +++ b/flang-rt/unittests/Runtime/TemporaryStack.cpp @@ -59,7 +59,7 @@ TEST(TemporaryStack, ValueStackBasic) { Descriptor &outputDesc2{testDescriptorStorage[2].descriptor()}; inputDesc.Establish(code, elementBytes, descriptorPtr, rank, extent); - inputDesc.Allocate(kNoAsyncId); + inputDesc.Allocate(kNoAsyncObject); ASSERT_EQ(inputDesc.IsAllocated(), true); uint32_t *inputData = static_cast(inputDesc.raw().base_addr); for (std::size_t i = 0; i < inputDesc.Elements(); ++i) { @@ -123,7 +123,7 @@ TEST(TemporaryStack, ValueStackMultiSize) { boxDims.extent = extent[dim]; boxDims.sm = elementBytes; } - desc->Allocate(kNoAsyncId); + desc->Allocate(kNoAsyncObject); // fill the array with some data to test for (uint32_t i = 0; i < desc->Elements(); ++i) { diff --git a/flang-rt/unittests/Runtime/tools.h b/flang-rt/unittests/Runtime/tools.h index a1eba45647a80..4ada862df110b 100644 --- a/flang-rt/unittests/Runtime/tools.h +++ b/flang-rt/unittests/Runtime/tools.h @@ -42,7 +42,7 @@ static OwningPtr MakeArray(const std::vector &shape, for (int j{0}; j < rank; ++j) { result->GetDimension(j).SetBounds(1, shape[j]); } - int stat{result->Allocate(kNoAsyncId)}; + int stat{result->Allocate(kNoAsyncObject)}; EXPECT_EQ(stat, 0) << stat; EXPECT_LE(data.size(), result->Elements()); char *p{result->OffsetElement()}; diff --git a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td index 46cc59cda1612..e38738230ffbc 100644 --- a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td +++ b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td @@ -95,12 +95,11 @@ def 
cuf_AllocateOp : cuf_Op<"allocate", [AttrSizedOperandSegments, }]; let arguments = (ins Arg:$box, - Arg, "", [MemWrite]>:$errmsg, - Optional:$stream, - Arg, "", [MemWrite]>:$pinned, - Arg, "", [MemRead]>:$source, - cuf_DataAttributeAttr:$data_attr, - UnitAttr:$hasStat); + Arg, "", [MemWrite]>:$errmsg, + Optional:$stream, + Arg, "", [MemWrite]>:$pinned, + Arg, "", [MemRead]>:$source, + cuf_DataAttributeAttr:$data_attr, UnitAttr:$hasStat); let results = (outs AnyIntegerType:$stat); diff --git a/flang/include/flang/Runtime/CUDA/allocatable.h b/flang/include/flang/Runtime/CUDA/allocatable.h index 822f2d4a2b297..6c97afa9e10e8 100644 --- a/flang/include/flang/Runtime/CUDA/allocatable.h +++ b/flang/include/flang/Runtime/CUDA/allocatable.h @@ -17,14 +17,14 @@ namespace Fortran::runtime::cuda { extern "C" { /// Perform allocation of the descriptor. -int RTDECL(CUFAllocatableAllocate)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFAllocatableAllocate)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. -int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); @@ -32,14 +32,14 @@ int RTDECL(CUFAllocatableAllocateSync)(Descriptor &, int64_t stream = -1, /// Perform allocation of the descriptor without synchronization. Assign data /// from source. 
int RTDEF(CUFAllocatableAllocateSource)(Descriptor &alloc, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. Assign data from source. int RTDEF(CUFAllocatableAllocateSourceSync)(Descriptor &alloc, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/include/flang/Runtime/CUDA/allocator.h b/flang/include/flang/Runtime/CUDA/allocator.h index 18ddf75ac3852..59fdb22b6e663 100644 --- a/flang/include/flang/Runtime/CUDA/allocator.h +++ b/flang/include/flang/Runtime/CUDA/allocator.h @@ -20,16 +20,16 @@ extern "C" { void RTDECL(CUFRegisterAllocator)(); } -void *CUFAllocPinned(std::size_t, std::int64_t); +void *CUFAllocPinned(std::size_t, std::int64_t *); void CUFFreePinned(void *); -void *CUFAllocDevice(std::size_t, std::int64_t); +void *CUFAllocDevice(std::size_t, std::int64_t *); void CUFFreeDevice(void *); -void *CUFAllocManaged(std::size_t, std::int64_t); +void *CUFAllocManaged(std::size_t, std::int64_t *); void CUFFreeManaged(void *); -void *CUFAllocUnified(std::size_t, std::int64_t); +void *CUFAllocUnified(std::size_t, std::int64_t *); void CUFFreeUnified(void *); } // namespace Fortran::runtime::cuda diff --git a/flang/include/flang/Runtime/CUDA/pointer.h b/flang/include/flang/Runtime/CUDA/pointer.h index 7fbd8f8e061f2..bdfc3268e0814 100644 --- a/flang/include/flang/Runtime/CUDA/pointer.h +++ b/flang/include/flang/Runtime/CUDA/pointer.h @@ -17,14 +17,14 @@ namespace Fortran::runtime::cuda { extern "C" { /// Perform allocation of the 
descriptor. -int RTDECL(CUFPointerAllocate)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFPointerAllocate)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. -int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t stream = -1, +int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); @@ -32,14 +32,14 @@ int RTDECL(CUFPointerAllocateSync)(Descriptor &, int64_t stream = -1, /// Perform allocation of the descriptor without synchronization. Assign data /// from source. int RTDEF(CUFPointerAllocateSource)(Descriptor &pointer, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); /// Perform allocation of the descriptor with synchronization of it when /// necessary. Assign data from source. 
int RTDEF(CUFPointerAllocateSourceSync)(Descriptor &pointer, - const Descriptor &source, int64_t stream = -1, bool *pinned = nullptr, + const Descriptor &source, int64_t *stream = nullptr, bool *pinned = nullptr, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/include/flang/Runtime/allocatable.h b/flang/include/flang/Runtime/allocatable.h index 6895f8af5e2a8..863c07494e7c3 100644 --- a/flang/include/flang/Runtime/allocatable.h +++ b/flang/include/flang/Runtime/allocatable.h @@ -94,9 +94,10 @@ int RTDECL(AllocatableCheckLengthParameter)(Descriptor &, // Successfully allocated memory is initialized if the allocatable has a // derived type, and is always initialized by AllocatableAllocateSource(). // Performs all necessary coarray synchronization and validation actions. -int RTDECL(AllocatableAllocate)(Descriptor &, std::int64_t asyncId = -1, - bool hasStat = false, const Descriptor *errMsg = nullptr, - const char *sourceFile = nullptr, int sourceLine = 0); +int RTDECL(AllocatableAllocate)(Descriptor &, + std::int64_t *asyncObject = nullptr, bool hasStat = false, + const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, + int sourceLine = 0); int RTDECL(AllocatableAllocateSource)(Descriptor &, const Descriptor &source, bool hasStat = false, const Descriptor *errMsg = nullptr, const char *sourceFile = nullptr, int sourceLine = 0); diff --git a/flang/lib/Lower/Allocatable.cpp b/flang/lib/Lower/Allocatable.cpp index 7e32575caad9b..dd90e8900704b 100644 --- a/flang/lib/Lower/Allocatable.cpp +++ b/flang/lib/Lower/Allocatable.cpp @@ -760,7 +760,7 @@ class AllocateStmtHelper { mlir::Value errmsg = errMsgExpr ? errorManager.errMsgAddr : nullptr; mlir::Value stream = streamExpr - ? fir::getBase(converter.genExprValue(loc, *streamExpr, stmtCtx)) + ? 
fir::getBase(converter.genExprAddr(loc, *streamExpr, stmtCtx)) : nullptr; mlir::Value pinned = pinnedExpr diff --git a/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp b/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp index 28452d3b486da..cd5f1f6d098c3 100644 --- a/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp +++ b/flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp @@ -76,8 +76,7 @@ void fir::runtime::genAllocatableAllocate(fir::FirOpBuilder &builder, mlir::func::FuncOp func{ fir::runtime::getRuntimeFunc(loc, builder)}; mlir::FunctionType fTy{func.getFunctionType()}; - mlir::Value asyncId = - builder.createIntegerConstant(loc, builder.getI64Type(), -1); + mlir::Value asyncObject = builder.createNullConstant(loc); mlir::Value sourceFile{fir::factory::locationToFilename(builder, loc)}; mlir::Value sourceLine{ fir::factory::locationToLineNo(builder, loc, fTy.getInput(5))}; @@ -88,7 +87,7 @@ void fir::runtime::genAllocatableAllocate(fir::FirOpBuilder &builder, errMsg = builder.create(loc, boxNoneTy).getResult(); } llvm::SmallVector args{ - fir::runtime::createArguments(builder, loc, fTy, desc, asyncId, hasStat, - errMsg, sourceFile, sourceLine)}; + fir::runtime::createArguments(builder, loc, fTy, desc, asyncObject, + hasStat, errMsg, sourceFile, sourceLine)}; builder.create(loc, func, args); } diff --git a/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp b/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp index 24033bc15b8eb..687007d957225 100644 --- a/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp +++ b/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp @@ -76,6 +76,16 @@ llvm::LogicalResult cuf::FreeOp::verify() { return checkCudaAttr(*this); } // AllocateOp //===----------------------------------------------------------------------===// +template +static llvm::LogicalResult checkStreamType(OpTy op) { + if (!op.getStream()) + return mlir::success(); + if (auto refTy = mlir::dyn_cast(op.getStream().getType())) + if (!refTy.getEleTy().isInteger(64)) + return 
op.emitOpError("stream is expected to be an i64 reference"); + return mlir::success(); +} + llvm::LogicalResult cuf::AllocateOp::verify() { if (getPinned() && getStream()) return emitOpError("pinned and stream cannot appears at the same time"); @@ -92,7 +102,7 @@ llvm::LogicalResult cuf::AllocateOp::verify() { "expect errmsg to be a reference to/or a box type value"); if (getErrmsg() && !getHasStat()) return emitOpError("expect stat attribute when errmsg is provided"); - return mlir::success(); + return checkStreamType(*this); } //===----------------------------------------------------------------------===// @@ -143,16 +153,6 @@ llvm::LogicalResult cuf::DeallocateOp::verify() { // KernelLaunchOp //===----------------------------------------------------------------------===// -template -static llvm::LogicalResult checkStreamType(OpTy op) { - if (!op.getStream()) - return mlir::success(); - if (auto refTy = mlir::dyn_cast(op.getStream().getType())) - if (!refTy.getEleTy().isInteger(64)) - return op.emitOpError("stream is expected to be an i64 reference"); - return mlir::success(); -} - llvm::LogicalResult cuf::KernelLaunchOp::verify() { return checkStreamType(*this); } diff --git a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp index 7477a3c53c3ef..0fff06033b73d 100644 --- a/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp +++ b/flang/lib/Optimizer/Transforms/CUFOpConversion.cpp @@ -129,17 +129,15 @@ static mlir::LogicalResult convertOpToCall(OpTy op, mlir::IntegerType::get(op.getContext(), 1))); if (op.getSource()) { mlir::Value stream = - op.getStream() - ? op.getStream() - : builder.createIntegerConstant(loc, fTy.getInput(2), -1); + op.getStream() ? 
op.getStream() + : builder.createNullConstant(loc, fTy.getInput(2)); args = fir::runtime::createArguments( builder, loc, fTy, op.getBox(), op.getSource(), stream, pinned, hasStat, errmsg, sourceFile, sourceLine); } else { mlir::Value stream = - op.getStream() - ? op.getStream() - : builder.createIntegerConstant(loc, fTy.getInput(1), -1); + op.getStream() ? op.getStream() + : builder.createNullConstant(loc, fTy.getInput(1)); args = fir::runtime::createArguments(builder, loc, fTy, op.getBox(), stream, pinned, hasStat, errmsg, sourceFile, sourceLine); diff --git a/flang/test/Fir/CUDA/cuda-allocate.fir b/flang/test/Fir/CUDA/cuda-allocate.fir index 095ad92d5deb5..ea7890c9aac52 100644 --- a/flang/test/Fir/CUDA/cuda-allocate.fir +++ b/flang/test/Fir/CUDA/cuda-allocate.fir @@ -19,7 +19,7 @@ func.func @_QPsub1() { // CHECK: %[[DESC:.*]] = fir.convert %[[DESC_RT_CALL]] : (!fir.ref>) -> !fir.ref>>> // CHECK: %[[DECL_DESC:.*]]:2 = hlfir.declare %[[DESC]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: %[[BOX_NONE:.*]] = fir.convert %[[DECL_DESC]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[BOX_NONE:.*]] = fir.convert %[[DECL_DESC]]#1 : (!fir.ref>>>) -> !fir.ref> // CHECK: %{{.*}} = fir.call @_FortranAAllocatableDeallocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -47,7 +47,7 @@ func.func @_QPsub3() { // CHECK: %[[A:.*]]:2 = hlfir.declare %[[A_ADDR]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QMmod1Ea"} : 
(!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: %[[A_BOX:.*]] = fir.convert %[[A]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[A_BOX:.*]] = fir.convert %[[A]]#1 : (!fir.ref>>>) -> !fir.ref> // CHECK: fir.call @_FortranACUFAllocatableDeallocate(%[[A_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -87,7 +87,7 @@ func.func @_QPsub5() { } // CHECK-LABEL: func.func @_QPsub5() -// CHECK: fir.call @_FortranACUFAllocatableAllocate({{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocate({{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: fir.call @_FortranAAllocatableDeallocate({{.*}}) : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -118,7 +118,7 @@ func.func @_QQsub6() attributes {fir.bindc_name = "test"} { // CHECK: %[[B:.*]]:2 = hlfir.declare %[[B_ADDR]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QMdataEb"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) // CHECK: _FortranAAllocatableSetBounds // CHECK: %[[B_BOX:.*]] = fir.convert %[[B]]#1 : (!fir.ref>>>) -> !fir.ref> -// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[B_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocateSync(%[[B_BOX]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 func.func @_QPallocate_source() { @@ -142,7 +142,7 @@ func.func 
@_QPallocate_source() { // CHECK: %[[SOURCE:.*]] = fir.load %[[DECL_HOST]] : !fir.ref>>> // CHECK: %[[DEV_CONV:.*]] = fir.convert %[[DECL_DEV]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[SOURCE_CONV:.*]] = fir.convert %[[SOURCE]] : (!fir.box>>) -> !fir.box -// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocateSource(%[[DEV_CONV]], %[[SOURCE_CONV]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.box, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %{{.*}} = fir.call @_FortranACUFAllocatableAllocateSource(%[[DEV_CONV]], %[[SOURCE_CONV]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.box, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 fir.global @_QMmod1Ea_d {data_attr = #cuf.cuda} : !fir.box>> { @@ -170,16 +170,14 @@ func.func @_QQallocate_stream() { %1 = fir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFEa"} : (!fir.ref>>>) -> !fir.ref>>> %2 = fir.alloca i64 {bindc_name = "stream1", uniq_name = "_QFEstream1"} %3 = fir.declare %2 {uniq_name = "_QFEstream1"} : (!fir.ref) -> !fir.ref - %4 = fir.load %3 : !fir.ref - %5 = cuf.allocate %1 : !fir.ref>>> stream(%4 : i64) {data_attr = #cuf.cuda} -> i32 + %5 = cuf.allocate %1 : !fir.ref>>> stream(%3 : !fir.ref) {data_attr = #cuf.cuda} -> i32 return } // CHECK-LABEL: func.func @_QQallocate_stream() // CHECK: %[[STREAM_ALLOCA:.*]] = fir.alloca i64 {bindc_name = "stream1", uniq_name = "_QFEstream1"} // CHECK: %[[STREAM:.*]] = fir.declare %[[STREAM_ALLOCA]] {uniq_name = "_QFEstream1"} : (!fir.ref) -> !fir.ref -// CHECK: %[[STREAM_LOAD:.*]] = fir.load %[[STREAM]] : !fir.ref -// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %[[STREAM_LOAD]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %[[STREAM]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, 
i1, !fir.box, !fir.ref, i32) -> i32 func.func @_QPp_alloc() { @@ -268,6 +266,6 @@ func.func @_QQpinned() attributes {fir.bindc_name = "testasync"} { // CHECK: %[[PINNED:.*]] = fir.alloca !fir.logical<4> {bindc_name = "pinnedflag", uniq_name = "_QFEpinnedflag"} // CHECK: %[[DECL_PINNED:.*]] = fir.declare %[[PINNED]] {uniq_name = "_QFEpinnedflag"} : (!fir.ref>) -> !fir.ref> // CHECK: %[[CONV_PINNED:.*]] = fir.convert %[[DECL_PINNED]] : (!fir.ref>) -> !fir.ref -// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %{{.*}}, %[[CONV_PINNED]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, i64, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: fir.call @_FortranACUFAllocatableAllocate(%{{.*}}, %{{.*}}, %[[CONV_PINNED]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) : (!fir.ref>, !fir.ref, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 } // end of module diff --git a/flang/test/Fir/cuf-invalid.fir b/flang/test/Fir/cuf-invalid.fir index a3b9be3ee8223..dceb8f6fde236 100644 --- a/flang/test/Fir/cuf-invalid.fir +++ b/flang/test/Fir/cuf-invalid.fir @@ -2,13 +2,12 @@ func.func @_QPsub1() { %0 = fir.alloca !fir.box>> {bindc_name = "a", uniq_name = "_QFsub1Ea"} - %1 = fir.alloca i32 + %s = fir.alloca i64 %pinned = fir.alloca i1 %4:2 = hlfir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) %11 = fir.convert %4#1 : (!fir.ref>>>) -> !fir.ref> - %s = fir.load %1 : !fir.ref // expected-error at +1{{'cuf.allocate' op pinned and stream cannot appears at the same time}} - %13 = cuf.allocate %11 : !fir.ref> stream(%s : i32) pinned(%pinned : !fir.ref) {data_attr = #cuf.cuda} -> i32 + %13 = cuf.allocate %11 : !fir.ref> stream(%s : !fir.ref) pinned(%pinned : !fir.ref) {data_attr = #cuf.cuda} -> i32 return } diff --git a/flang/test/Fir/cuf.mlir b/flang/test/Fir/cuf.mlir index d38b26a4548ed..f80a70eca34a3 100644 --- a/flang/test/Fir/cuf.mlir +++ b/flang/test/Fir/cuf.mlir @@ -18,15 +18,14 @@ func.func 
@_QPsub1() { func.func @_QPsub1() { %0 = fir.alloca !fir.box>> {bindc_name = "a", uniq_name = "_QFsub1Ea"} - %1 = fir.alloca i32 + %1 = fir.alloca i64 %4:2 = hlfir.declare %0 {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub1Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) %11 = fir.convert %4#1 : (!fir.ref>>>) -> !fir.ref> - %s = fir.load %1 : !fir.ref - %13 = cuf.allocate %11 : !fir.ref> stream(%s : i32) {data_attr = #cuf.cuda} -> i32 + %13 = cuf.allocate %11 : !fir.ref> stream(%1 : !fir.ref) {data_attr = #cuf.cuda} -> i32 return } -// CHECK: cuf.allocate %{{.*}} : !fir.ref> stream(%{{.*}} : i32) {data_attr = #cuf.cuda} -> i32 +// CHECK: cuf.allocate %{{.*}} : !fir.ref> stream(%{{.*}} : !fir.ref) {data_attr = #cuf.cuda} -> i32 // ----- diff --git a/flang/test/HLFIR/elemental-codegen.fir b/flang/test/HLFIR/elemental-codegen.fir index a715479f16115..67af4261470f7 100644 --- a/flang/test/HLFIR/elemental-codegen.fir +++ b/flang/test/HLFIR/elemental-codegen.fir @@ -191,7 +191,7 @@ func.func @test_polymorphic(%arg0: !fir.class> {fir.bindc_ // CHECK: %[[VAL_35:.*]] = fir.absent !fir.box // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_4]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_37:.*]] = fir.convert %[[VAL_31]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_38:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_36]], %{{.*}}, %[[VAL_34]], %[[VAL_35]], %[[VAL_37]], %[[VAL_33]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_38:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_36]], %{{.*}}, %[[VAL_34]], %[[VAL_35]], %[[VAL_37]], %[[VAL_33]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_12:.*]] = arith.constant true // CHECK: %[[VAL_39:.*]] = fir.load %[[VAL_13]]#0 : !fir.ref>>>> // CHECK: %[[VAL_40:.*]] = arith.constant 1 : index @@ -275,7 +275,7 @@ func.func @test_polymorphic_expr(%arg0: !fir.class> {fir.b // CHECK: %[[VAL_36:.*]] = fir.absent !fir.box // CHECK: %[[VAL_37:.*]] = 
fir.convert %[[VAL_5]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_38:.*]] = fir.convert %[[VAL_32]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_39:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_37]], %{{.*}}, %[[VAL_35]], %[[VAL_36]], %[[VAL_38]], %[[VAL_34]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_39:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_37]], %{{.*}}, %[[VAL_35]], %[[VAL_36]], %[[VAL_38]], %[[VAL_34]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_13:.*]] = arith.constant true // CHECK: %[[VAL_40:.*]] = fir.load %[[VAL_14]]#0 : !fir.ref>>>> // CHECK: %[[VAL_41:.*]] = arith.constant 1 : index @@ -328,7 +328,7 @@ func.func @test_polymorphic_expr(%arg0: !fir.class> {fir.b // CHECK: %[[VAL_85:.*]] = fir.absent !fir.box // CHECK: %[[VAL_86:.*]] = fir.convert %[[VAL_4]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_87:.*]] = fir.convert %[[VAL_81]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_88:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_86]], %{{.*}}, %[[VAL_84]], %[[VAL_85]], %[[VAL_87]], %[[VAL_83]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_88:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_86]], %{{.*}}, %[[VAL_84]], %[[VAL_85]], %[[VAL_87]], %[[VAL_83]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_62:.*]] = arith.constant true // CHECK: %[[VAL_89:.*]] = fir.load %[[VAL_63]]#0 : !fir.ref>>>> // CHECK: %[[VAL_90:.*]] = arith.constant 1 : index diff --git a/flang/test/Lower/CUDA/cuda-allocatable.cuf b/flang/test/Lower/CUDA/cuda-allocatable.cuf index a570f636b8db1..cec10dda839e9 100644 --- a/flang/test/Lower/CUDA/cuda-allocatable.cuf +++ b/flang/test/Lower/CUDA/cuda-allocatable.cuf @@ -90,7 +90,7 @@ end subroutine subroutine sub4() real, allocatable, device :: a(:) - integer :: istream + integer(8) :: istream allocate(a(10), stream=istream) end subroutine @@ -98,11 +98,10 @@ end subroutine ! 
CHECK: %[[BOX:.*]] = cuf.alloc !fir.box>> {bindc_name = "a", data_attr = #cuf.cuda, uniq_name = "_QFsub4Ea"} -> !fir.ref>>> ! CHECK: fir.embox {{.*}} {allocator_idx = 2 : i32} ! CHECK: %[[BOX_DECL:.*]]:2 = hlfir.declare %{{.*}} {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub4Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) -! CHECK: %[[ISTREAM:.*]] = fir.alloca i32 {bindc_name = "istream", uniq_name = "_QFsub4Eistream"} -! CHECK: %[[ISTREAM_DECL:.*]]:2 = hlfir.declare %[[ISTREAM]] {uniq_name = "_QFsub4Eistream"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[ISTREAM:.*]] = fir.alloca i64 {bindc_name = "istream", uniq_name = "_QFsub4Eistream"} +! CHECK: %[[ISTREAM_DECL:.*]]:2 = hlfir.declare %[[ISTREAM]] {uniq_name = "_QFsub4Eistream"} : (!fir.ref) -> (!fir.ref, !fir.ref) ! CHECK: fir.call @_FortranAAllocatableSetBounds -! CHECK: %[[STREAM:.*]] = fir.load %[[ISTREAM_DECL]]#0 : !fir.ref -! CHECK: %{{.*}} = cuf.allocate %[[BOX_DECL]]#0 : !fir.ref>>> stream(%[[STREAM]] : i32) {data_attr = #cuf.cuda} -> i32 +! CHECK: %{{.*}} = cuf.allocate %[[BOX_DECL]]#0 : !fir.ref>>> stream(%[[ISTREAM_DECL]]#0 : !fir.ref) {data_attr = #cuf.cuda} -> i32 ! CHECK: fir.if %{{.*}} { ! CHECK: %{{.*}} = cuf.deallocate %[[BOX_DECL]]#0 : !fir.ref>>> {data_attr = #cuf.cuda} -> i32 ! CHECK: } diff --git a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 index 5bb1ae3797346..6869af863644d 100644 --- a/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 +++ b/flang/test/Lower/OpenACC/acc-declare-unwrap-defaultbounds.f90 @@ -473,6 +473,6 @@ subroutine init() end module ! CHECK-LABEL: func.func @_QMacc_declare_post_action_statPinit() -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! 
CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.if -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/OpenACC/acc-declare.f90 b/flang/test/Lower/OpenACC/acc-declare.f90 index 889cdef51f4ce..4d95ffa10edaf 100644 --- a/flang/test/Lower/OpenACC/acc-declare.f90 +++ b/flang/test/Lower/OpenACC/acc-declare.f90 @@ -434,6 +434,6 @@ subroutine init() end module ! CHECK-LABEL: func.func @_QMacc_declare_post_action_statPinit() -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.if -! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: fir.call @_FortranAAllocatableAllocate({{.*}}) fastmath {acc.declare_action = #acc.declare_action} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/allocatable-polymorphic.f90 b/flang/test/Lower/allocatable-polymorphic.f90 index 10e703210ea61..e6a8c5e025123 100644 --- a/flang/test/Lower/allocatable-polymorphic.f90 +++ b/flang/test/Lower/allocatable-polymorphic.f90 @@ -267,7 +267,7 @@ subroutine test_allocatable() ! CHECK-DAG: %[[C0:.*]] = arith.constant 0 : i32 ! 
CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[P_CAST]], %[[TYPE_DESC_P1_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[P_CAST:.*]] = fir.convert %[[P_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[P_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[P_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P1:.*]] = fir.type_desc !fir.type<_QMpolyTp1{a:i32,b:i32}> ! CHECK-DAG: %[[C1_CAST:.*]] = fir.convert %[[C1_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> @@ -276,7 +276,7 @@ subroutine test_allocatable() ! CHECK-DAG: %[[C0:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[C1_CAST]], %[[TYPE_DESC_P1_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: %[[C1_CAST:.*]] = fir.convert %[[C1_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C1_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C1_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P2:.*]] = fir.type_desc !fir.type<_QMpolyTp2{p1:!fir.type<_QMpolyTp1{a:i32,b:i32}>,c:i32}> ! CHECK-DAG: %[[C2_CAST:.*]] = fir.convert %[[C2_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> @@ -285,7 +285,7 @@ subroutine test_allocatable() ! CHECK-DAG: %[[C0:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%[[C2_CAST]], %[[TYPE_DESC_P2_CAST]], %[[RANK]], %[[C0]]) {{.*}}: (!fir.ref>, !fir.ref, i32, i32) -> () ! 
CHECK: %[[C2_CAST:.*]] = fir.convert %[[C2_DECL]]#0 : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C2_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C2_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P1:.*]] = fir.type_desc !fir.type<_QMpolyTp1{a:i32,b:i32}> ! CHECK-DAG: %[[C3_CAST:.*]] = fir.convert %[[C3_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> @@ -300,7 +300,7 @@ subroutine test_allocatable() ! CHECK-DAG: %[[C10_I64:.*]] = fir.convert %[[C10]] : (i32) -> i64 ! CHECK: fir.call @_FortranAAllocatableSetBounds(%[[C3_CAST]], %[[C0]], %[[C1_I64]], %[[C10_I64]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[C3_CAST:.*]] = fir.convert %[[C3_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C3_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C3_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[TYPE_DESC_P2:.*]] = fir.type_desc !fir.type<_QMpolyTp2{p1:!fir.type<_QMpolyTp1{a:i32,b:i32}>,c:i32}> ! CHECK-DAG: %[[C4_CAST:.*]] = fir.convert %[[C4_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> @@ -316,7 +316,7 @@ subroutine test_allocatable() ! CHECK-DAG: %[[C20_I64:.*]] = fir.convert %[[C20]] : (i32) -> i64 ! CHECK: fir.call @_FortranAAllocatableSetBounds(%[[C4_CAST]], %[[C0]], %[[C1_I64]], %[[C20_I64]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[C4_CAST:.*]] = fir.convert %[[C4_DECL]]#0 : (!fir.ref>>>>) -> !fir.ref> -! 
CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C4_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[C4_CAST]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %[[C1_LOAD1:.*]] = fir.load %[[C1_DECL]]#0 : !fir.ref>>> ! CHECK: fir.dispatch "proc1"(%[[C1_LOAD1]] : !fir.class>>) @@ -390,7 +390,7 @@ subroutine test_unlimited_polymorphic_with_intrinsic_type_spec() ! CHECK-DAG: %[[CORANK:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitIntrinsicForAllocate(%[[BOX_NONE]], %[[CAT]], %[[KIND]], %[[RANK]], %[[CORANK]]) {{.*}} : (!fir.ref>, i32, i32, i32, i32) -> () ! CHECK: %[[BOX_NONE:.*]] = fir.convert %[[P_DECL]]#0 : (!fir.ref>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-DAG: %[[BOX_NONE:.*]] = fir.convert %[[PTR_DECL]]#0 : (!fir.ref>>) -> !fir.ref> ! CHECK-DAG: %[[CAT:.*]] = arith.constant 2 : i32 @@ -573,7 +573,7 @@ subroutine test_allocatable_up_character() ! CHECK-DAG: %[[CORANK:.*]] = arith.constant 0 : i32 ! CHECK: fir.call @_FortranAAllocatableInitCharacterForAllocate(%[[A_NONE]], %[[LEN]], %[[KIND]], %[[RANK]], %[[CORANK]]) {{.*}} : (!fir.ref>, i64, i32, i32, i32) -> () ! CHECK: %[[A_NONE:.*]] = fir.convert %[[A_DECL]]#0 : (!fir.ref>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! 
CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 end module @@ -592,17 +592,17 @@ program test_alloc ! LLVM-LABEL: define void @_QMpolyPtest_allocatable() ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp2, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp1, i32 1, i32 0) ! LLVM: call void @_FortranAAllocatableSetBounds(ptr %{{.*}}, i32 0, i64 1, i64 10) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %{{.*}}, ptr @_QMpolyEXdtXp2, i32 1, i32 0) ! 
LLVM: call void @_FortranAAllocatableSetBounds(ptr %{{.*}}, i32 0, i64 1, i64 20) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %{{.*}}, ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM-COUNT-2: call void %{{[0-9]*}}() ! LLVM: call void @llvm.memcpy.p0.p0.i32 @@ -683,5 +683,5 @@ program test_alloc ! LLVM: store { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] } { ptr null, i64 8, i32 20240719, i8 0, i8 42, i8 2, i8 1, ptr @_QMpolyEXdtXp1, [1 x i64] zeroinitializer }, ptr %[[ALLOCA1:[0-9]*]] ! LLVM: call void @llvm.memcpy.p0.p0.i32(ptr %[[ALLOCA2:[0-9]+]], ptr %[[ALLOCA1]], i32 40, i1 false) ! LLVM: call void @_FortranAAllocatableInitDerivedForAllocate(ptr %[[ALLOCA2]], ptr @_QMpolyEXdtXp1, i32 0, i32 0) -! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %[[ALLOCA2]], i64 {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) +! LLVM: %{{.*}} = call i32 @_FortranAAllocatableAllocate(ptr %[[ALLOCA2]], ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) ! LLVM: %{{.*}} = call i32 @_FortranAAllocatableDeallocatePolymorphic(ptr %[[ALLOCA2]], ptr {{.*}}, i1 false, ptr null, ptr @_QQclX{{.*}}, i32 {{.*}}) diff --git a/flang/test/Lower/allocatable-runtime.f90 b/flang/test/Lower/allocatable-runtime.f90 index 37272c90656cc..c63252c68974e 100644 --- a/flang/test/Lower/allocatable-runtime.f90 +++ b/flang/test/Lower/allocatable-runtime.f90 @@ -31,7 +31,7 @@ subroutine foo() ! CHECK: fir.call @{{.*}}AllocatableSetBounds(%[[xBoxCast2]], %c0{{.*}}, %[[xlbCast]], %[[xubCast]]) {{.*}}: (!fir.ref>, i32, i64, i64) -> () ! CHECK-DAG: %[[xBoxCast3:.*]] = fir.convert %[[xBoxAddr]] : (!fir.ref>>>) -> !fir.ref> ! CHECK-DAG: %[[sourceFile:.*]] = fir.convert %{{.*}} -> !fir.ref - ! 
CHECK: fir.call @{{.*}}AllocatableAllocate(%[[xBoxCast3]], %{{.*}}, %false{{.*}}, %[[errMsg]], %[[sourceFile]], %{{.*}}) {{.*}}: (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 + ! CHECK: fir.call @{{.*}}AllocatableAllocate(%[[xBoxCast3]], %{{.*}}, %false{{.*}}, %[[errMsg]], %[[sourceFile]], %{{.*}}) {{.*}}: (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! Simply check that we are emitting the right numebr of set bound for y and z. Otherwise, this is just like x. ! CHECK: fir.convert %[[yBoxAddr]] : (!fir.ref>>>) -> !fir.ref> @@ -180,4 +180,4 @@ subroutine mold_allocation() ! CHECK: %[[M_BOX_NONE:.*]] = fir.convert %[[EMBOX_M]] : (!fir.box>) -> !fir.box ! CHECK: fir.call @_FortranAAllocatableApplyMold(%[[A_BOX_NONE]], %[[M_BOX_NONE]], %[[RANK]]) {{.*}} : (!fir.ref>, !fir.box, i32) -> () ! CHECK: %[[A_BOX_NONE:.*]] = fir.convert %[[A]] : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/allocate-mold.f90 b/flang/test/Lower/allocate-mold.f90 index c7985b11397ce..9427c8b08786f 100644 --- a/flang/test/Lower/allocate-mold.f90 +++ b/flang/test/Lower/allocate-mold.f90 @@ -16,7 +16,7 @@ subroutine scalar_mold_allocation() ! CHECK: %[[A_REF_BOX_NONE1:.*]] = fir.convert %[[A]] : (!fir.ref>>) -> !fir.ref> ! CHECK: fir.call @_FortranAAllocatableApplyMold(%[[A_REF_BOX_NONE1]], %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.box, i32) -> () ! CHECK: %[[A_REF_BOX_NONE2:.*]] = fir.convert %[[A]] : (!fir.ref>>) -> !fir.ref> -! 
CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_REF_BOX_NONE2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[A_REF_BOX_NONE2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 subroutine array_scalar_mold_allocation() real, allocatable :: a(:) @@ -40,4 +40,4 @@ end subroutine array_scalar_mold_allocation ! CHECK: %[[REF_BOX_A1:.*]] = fir.convert %1 : (!fir.ref>>>) -> !fir.ref> ! CHECK: fir.call @_FortranAAllocatableSetBounds(%[[REF_BOX_A1]], {{.*}},{{.*}}, {{.*}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! CHECK: %[[REF_BOX_A2:.*]] = fir.convert %[[A]] : (!fir.ref>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[REF_BOX_A2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[REF_BOX_A2]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Lower/polymorphic.f90 b/flang/test/Lower/polymorphic.f90 index 485861a838ff6..b7be5f685d9e3 100644 --- a/flang/test/Lower/polymorphic.f90 +++ b/flang/test/Lower/polymorphic.f90 @@ -1149,7 +1149,7 @@ program test ! CHECK-LABEL: func.func @_QQmain() attributes {fir.bindc_name = "test"} { ! CHECK: %[[ADDR_O:.*]] = fir.address_of(@_QFEo) : !fir.ref}>>>> ! CHECK: %[[BOX_NONE:.*]] = fir.convert %[[ADDR_O]] : (!fir.ref}>>>>) -> !fir.ref> -! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.*}} = fir.call @_FortranAAllocatableAllocate(%[[BOX_NONE]], %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}, %{{.*}}) {{.*}} : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! 
CHECK: %[[O:.*]] = fir.load %[[ADDR_O]] : !fir.ref}>>>> ! CHECK: %[[COORD_INNER:.*]] = fir.coordinate_of %[[O]], inner : (!fir.box}>>>) -> !fir.ref> ! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} unordered iter_args(%arg1 = %{{.*}}) -> (!fir.array<5x!fir.logical<4>>) { diff --git a/flang/test/Lower/volatile-allocatable.f90 b/flang/test/Lower/volatile-allocatable.f90 index e182fe8a4d9c9..59e724bce8464 100644 --- a/flang/test/Lower/volatile-allocatable.f90 +++ b/flang/test/Lower/volatile-allocatable.f90 @@ -124,15 +124,15 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv2"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv3"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_scalar_volatileEv1"} : (!fir.box>, volatile>) -> (!fir.box>, volatile>, !fir.box>, volatile>) ! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0{"j"} : (!fir.box>, volatile>) -> !fir.ref ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QQro._QMderived_typesText_type.0"} : (!fir.ref>) -> (!fir.ref>, !fir.ref>) ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocateSource(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.box, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QQclX766F6C6174696C6520636861726163746572"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 @@ -144,7 +144,7 @@ subroutine test_unlimited_polymorphic() ! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_volatile_asynchronousEv1"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QQro.4xi4.1"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocateSource(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.box, i1, !fir.box, !fir.ref, i32) -> i32 @@ -154,7 +154,7 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.ref>>, volatile>, volatile>) -> (!fir.ref>>, volatile>, volatile>, !fir.ref>>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitDerivedForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () -! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAClassIs(%{{.+}}, %{{.+}}) : (!fir.box, !fir.ref) -> i1 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_select_base_type_volatileEv"} : (!fir.class>>, volatile>, !fir.shift<1>) -> (!fir.class>>, volatile>, !fir.class>>, volatile>) ! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0 (%{{.+}}) : (!fir.class>>, volatile>, index) -> !fir.class, volatile> @@ -170,22 +170,22 @@ subroutine test_unlimited_polymorphic() ! CHECK: %{{.+}} = hlfir.designate %{{.+}}#0{"arr"} shape %{{.+}} : (!fir.ref>, !fir.shape<1>) -> !fir.ref> ! CHECK: fir.call @_FortranAAllocatableApplyMold(%{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.box, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK-LABEL: func.func @_QPtest_unlimited_polymorphic() { ! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.ref, volatile>, volatile>) -> (!fir.ref, volatile>, volatile>, !fir.ref, volatile>, volatile>) ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEupa"} : (!fir.ref>, volatile>, volatile>) -> (!fir.ref>, volatile>, volatile>, !fir.ref>, volatile>, volatile>) ! CHECK: fir.call @_FortranAAllocatableInitIntrinsicForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i32, i32, i32) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.heap) -> (!fir.heap, !fir.heap) ! CHECK: fir.call @_FortranAAllocatableInitCharacterForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i32, i32, i32) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEup"} : (!fir.heap>, index) -> (!fir.boxchar<1>, !fir.heap>) ! 
CHECK: %{{.+}}:2 = hlfir.declare %{{.+}} typeparams %{{.+}} {fortran_attrs = #fir.var_attrs, uniq_name = "_QQclX636C617373282A29"} : (!fir.ref>, index) -> (!fir.ref>, !fir.ref>) ! CHECK: fir.call @_FortranAAllocatableInitIntrinsicForAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i32, i32, i32) -> () ! CHECK: fir.call @_FortranAAllocatableSetBounds(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i32, i64, i64) -> () -! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +! CHECK: %{{.+}} = fir.call @_FortranAAllocatableAllocate(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest_unlimited_polymorphicEupa"} : (!fir.box>, volatile>, !fir.shift<1>) -> (!fir.box>, volatile>, !fir.box>, volatile>) ! CHECK: %{{.+}}:2 = hlfir.declare %{{.+}}(%{{.+}}) {fortran_attrs = #fir.var_attrs, uniq_name = "_QQro.3xr4.3"} : (!fir.ref>, !fir.shape<1>) -> (!fir.ref>, !fir.ref>) ! 
CHECK: %{{.+}} = fir.call @_FortranAAllocatableDeallocatePolymorphic(%{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}, %{{.+}}) fastmath : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 diff --git a/flang/test/Transforms/lower-repack-arrays.fir b/flang/test/Transforms/lower-repack-arrays.fir index bbae7ba5b0e0b..0b323b1bb0697 100644 --- a/flang/test/Transforms/lower-repack-arrays.fir +++ b/flang/test/Transforms/lower-repack-arrays.fir @@ -840,7 +840,7 @@ func.func @_QPtest6(%arg0: !fir.class>> {fir.bi // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>>) -> !fir.box @@ -928,7 +928,7 @@ func.func @_QPtest6_stack(%arg0: !fir.class>> { // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] 
= fir.load %[[VAL_22]] : !fir.ref>>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>>) -> !fir.box @@ -1015,7 +1015,7 @@ func.func @_QPtest7(%arg0: !fir.class> {fir.bindc_name = "x // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>) -> !fir.box @@ -1103,7 +1103,7 @@ func.func @_QPtest7_stack(%arg0: !fir.class> {fir.bindc_nam // CHECK: %[[VAL_34:.*]] = fir.absent !fir.box // CHECK: %[[VAL_35:.*]] = fir.convert %[[VAL_7]] : (!fir.ref>>>) -> !fir.ref> // CHECK: %[[VAL_36:.*]] = fir.convert %[[VAL_33]] : (!fir.ref>) -> !fir.ref -// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, i64, i1, !fir.box, !fir.ref, i32) -> i32 +// CHECK: %[[VAL_37:.*]] = fir.call @_FortranAAllocatableAllocate(%[[VAL_35]], %{{.*}}, %[[VAL_6]], %[[VAL_34]], %[[VAL_36]], %[[VAL_2]]) : (!fir.ref>, !fir.ref, i1, !fir.box, !fir.ref, i32) -> i32 // CHECK: %[[VAL_38:.*]] = fir.load %[[VAL_22]] : !fir.ref>>> // CHECK: %[[VAL_39:.*]] = fir.address_of(@{{_QQcl.*}} // CHECK: %[[VAL_40:.*]] = fir.convert %[[VAL_38]] : (!fir.class>>) -> !fir.box From flang-commits at lists.llvm.org Mon May 19 17:05:05 2025 From: 
flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 17:05:05 -0700 (PDT) Subject: [flang-commits] [flang] 73c638f - [flang][cuda] Set implicit CUDA device attribute in block construct (#140637) Message-ID: <682bc731.170a0220.2c70a3.d24d@mx.google.com> Author: Valentin Clement (バレンタイン クレメン) Date: 2025-05-19T17:05:01-07:00 New Revision: 73c638f897327b7869435a588bde7909709ca795 URL: https://github.com/llvm/llvm-project/commit/73c638f897327b7869435a588bde7909709ca795 DIFF: https://github.com/llvm/llvm-project/commit/73c638f897327b7869435a588bde7909709ca795.diff LOG: [flang][cuda] Set implicit CUDA device attribute in block construct (#140637) Added: Modified: flang/lib/Semantics/resolve-names.cpp flang/test/Semantics/cuf09.cuf Removed: ################################################################################ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index bdafc03ad2c05..92a3277191ae0 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -9372,11 +9372,40 @@ void ResolveNamesVisitor::CreateGeneric(const parser::GenericSpec &x) { info.Resolve(&MakeSymbol(symbolName, Attrs{}, std::move(genericDetails))); } +static void SetImplicitCUDADevice(bool inDeviceSubprogram, Symbol &symbol) { + if (inDeviceSubprogram && symbol.has()) { + auto *object{symbol.detailsIf()}; + if (!object->cudaDataAttr() && !IsValue(symbol) && + (IsDummy(symbol) || object->IsArray())) { + // Implicitly set device attribute if none is set in device context. 
+ object->set_cudaDataAttr(common::CUDADataAttr::Device); + } + } +} + void ResolveNamesVisitor::FinishSpecificationPart( const std::list &decls) { misparsedStmtFuncFound_ = false; funcResultStack().CompleteFunctionResultType(); CheckImports(); + bool inDeviceSubprogram{false}; + Symbol *scopeSym{currScope().symbol()}; + if (currScope().kind() == Scope::Kind::BlockConstruct) { + scopeSym = currScope().parent().symbol(); + } + if (scopeSym) { + if (auto *details{scopeSym->detailsIf()}) { + // Check the current procedure is a device procedure to apply implicit + // attribute at the end. + if (auto attrs{details->cudaSubprogramAttrs()}) { + if (*attrs == common::CUDASubprogramAttrs::Device || + *attrs == common::CUDASubprogramAttrs::Global || + *attrs == common::CUDASubprogramAttrs::Grid_Global) { + inDeviceSubprogram = true; + } + } + } + } for (auto &pair : currScope()) { auto &symbol{*pair.second}; if (inInterfaceBlock()) { @@ -9411,6 +9440,11 @@ void ResolveNamesVisitor::FinishSpecificationPart( SetBindNameOn(symbol); } } + if (currScope().kind() == Scope::Kind::BlockConstruct) { + // Only look for specification in BlockConstruct. Other cases are done in + // ResolveSpecificationParts. + SetImplicitCUDADevice(inDeviceSubprogram, symbol); + } } currScope().InstantiateDerivedTypes(); for (const auto &decl : decls) { @@ -9970,14 +10004,7 @@ void ResolveNamesVisitor::ResolveSpecificationParts(ProgramTree &node) { } ApplyImplicitRules(symbol); // Apply CUDA implicit attributes if needed. - if (inDeviceSubprogram && symbol.has()) { - auto *object{symbol.detailsIf()}; - if (!object->cudaDataAttr() && !IsValue(symbol) && - (IsDummy(symbol) || object->IsArray())) { - // Implicitly set device attribute if none is set in device context. 
- object->set_cudaDataAttr(common::CUDADataAttr::Device); - } - } + SetImplicitCUDADevice(inDeviceSubprogram, symbol); // Main program local objects usually don't have an implied SAVE attribute, // as one might think, but in the exceptional case of a derived type // local object that contains a coarray, we have to mark it as an diff --git a/flang/test/Semantics/cuf09.cuf b/flang/test/Semantics/cuf09.cuf index 193b22213da61..4a6d9ab09387d 100644 --- a/flang/test/Semantics/cuf09.cuf +++ b/flang/test/Semantics/cuf09.cuf @@ -228,3 +228,14 @@ attributes(host,device) subroutine do2(a,b,c,i) c(i) = a(i) - b(i) ! ok. Should not error with Host array ! cannot be present in device context end + +attributes(global) subroutine blockTest +block + integer(8) :: xloc + integer(8) :: s(7) + integer(4) :: i + do i = 1, 7 + s = xloc ! ok. + end do +end block +end subroutine From flang-commits at lists.llvm.org Mon May 19 17:05:07 2025 From: flang-commits at lists.llvm.org (Valentin Clement (バレンタイン クレメン) via flang-commits) Date: Mon, 19 May 2025 17:05:07 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Set implicit CUDA device attribute in block construct (PR #140637) In-Reply-To: Message-ID: <682bc733.170a0220.1848a3.d03b@mx.google.com> https://github.com/clementval closed https://github.com/llvm/llvm-project/pull/140637 From flang-commits at lists.llvm.org Mon May 19 17:33:47 2025 From: flang-commits at lists.llvm.org (Scott Manley via flang-commits) Date: Mon, 19 May 2025 17:33:47 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [OpenACC] unify reduction and private-like init region recipes (PR #140652) Message-ID: https://github.com/rscottmanley created https://github.com/llvm/llvm-project/pull/140652 Between firstprivate, private and reduction init regions, the difference is largely whether the temp that is created is initialized.
Some recent fixes were made to privatization (#135698, #137869) but did not get propagated to reductions, even though they need to yield the same things from their init regions. To mitigate this discrepancy in the future, refactor the init region recipes so they can be shared between the three recipe ops. Also add "none" to the OpenACC_ReductionOperator enum for better error checking. From eed6fd1295c8613f8d3ab76262e8f758f6fac462 Mon Sep 17 00:00:00 2001 From: Scott Manley Date: Mon, 19 May 2025 16:40:10 -0700 Subject: [PATCH] [OpenACC] unify reduction and private-like init region recipes Between firstprivate, private and reduction init regions, the difference is largely whether the temp that is created is initialized. Some recent fixes were made to privatization (#135698, #137869) but did not get propagated to reductions, even though they essentially do the same thing. To mitigate this discrepancy in the future, refactor the init region recipes so they can be shared between the three recipe ops. Also add "none" to the OpenACC_ReductionOperator enum for better error checking. --- flang/include/flang/Lower/OpenACC.h | 4 +- flang/lib/Lower/OpenACC.cpp | 458 ++++++++---------- flang/test/Lower/OpenACC/acc-reduction.f90 | 16 +- .../mlir/Dialect/OpenACC/OpenACCOps.td | 27 +- 4 files changed, 233 insertions(+), 272 deletions(-) diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index bbe3b01fdb29d..dad841863ac00 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -85,7 +85,7 @@ void genOpenACCRoutineConstruct( /// Get a acc.private.recipe op for the given type or create it if it does not /// exist yet. 
-mlir::acc::PrivateRecipeOp createOrGetPrivateRecipe(mlir::OpBuilder &, +mlir::acc::PrivateRecipeOp createOrGetPrivateRecipe(fir::FirOpBuilder &, llvm::StringRef, mlir::Location, mlir::Type); @@ -99,7 +99,7 @@ createOrGetReductionRecipe(fir::FirOpBuilder &, llvm::StringRef, mlir::Location, /// Get a acc.firstprivate.recipe op for the given type or create it if it does /// not exist yet. mlir::acc::FirstprivateRecipeOp -createOrGetFirstprivateRecipe(mlir::OpBuilder &, llvm::StringRef, +createOrGetFirstprivateRecipe(fir::FirOpBuilder &, llvm::StringRef, mlir::Location, mlir::Type, llvm::SmallVector &); diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index e1918288d6de3..98f6adf8f17c6 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -843,22 +843,147 @@ fir::ShapeOp genShapeOp(mlir::OpBuilder &builder, fir::SequenceType seqTy, return builder.create(loc, extents); } +/// Get the initial value for reduction operator. +template +static R getReductionInitValue(mlir::acc::ReductionOperator op, mlir::Type ty) { + if (op == mlir::acc::ReductionOperator::AccMin) { + // min init value -> largest + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer or index type"); + return llvm::APInt::getSignedMaxValue(ty.getIntOrFloatBitWidth()); + } + if constexpr (std::is_same_v) { + auto floatTy = mlir::dyn_cast_or_null(ty); + assert(floatTy && "expect float type"); + return llvm::APFloat::getLargest(floatTy.getFloatSemantics(), + /*negative=*/false); + } + } else if (op == mlir::acc::ReductionOperator::AccMax) { + // max init value -> smallest + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer or index type"); + return llvm::APInt::getSignedMinValue(ty.getIntOrFloatBitWidth()); + } + if constexpr (std::is_same_v) { + auto floatTy = mlir::dyn_cast_or_null(ty); + assert(floatTy && "expect float type"); + return llvm::APFloat::getSmallest(floatTy.getFloatSemantics(), + 
/*negative=*/true); + } + } else if (op == mlir::acc::ReductionOperator::AccIand) { + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer type"); + unsigned bits = ty.getIntOrFloatBitWidth(); + return llvm::APInt::getAllOnes(bits); + } + } else { + assert(op != mlir::acc::ReductionOperator::AccNone); + // +, ior, ieor init value -> 0 + // * init value -> 1 + int64_t value = (op == mlir::acc::ReductionOperator::AccMul) ? 1 : 0; + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer or index type"); + return llvm::APInt(ty.getIntOrFloatBitWidth(), value, true); + } + + if constexpr (std::is_same_v) { + assert(mlir::isa(ty) && "expect float type"); + auto floatTy = mlir::dyn_cast(ty); + return llvm::APFloat(floatTy.getFloatSemantics(), value); + } + + if constexpr (std::is_same_v) + return value; + } + llvm_unreachable("OpenACC reduction unsupported type"); +} + +/// Return a constant with the initial value for the reduction operator and +/// type combination. +static mlir::Value getReductionInitValue(fir::FirOpBuilder &builder, + mlir::Location loc, mlir::Type ty, + mlir::acc::ReductionOperator op) { + if (op == mlir::acc::ReductionOperator::AccLand || + op == mlir::acc::ReductionOperator::AccLor || + op == mlir::acc::ReductionOperator::AccEqv || + op == mlir::acc::ReductionOperator::AccNeqv) { + assert(mlir::isa(ty) && "expect fir.logical type"); + bool value = true; // .true. for .and. and .eqv. + if (op == mlir::acc::ReductionOperator::AccLor || + op == mlir::acc::ReductionOperator::AccNeqv) + value = false; // .false. for .or. and .neqv. 
+ return builder.createBool(loc, value); + } + if (ty.isIntOrIndex()) + return builder.create( + loc, ty, + builder.getIntegerAttr(ty, getReductionInitValue(op, ty))); + if (op == mlir::acc::ReductionOperator::AccMin || + op == mlir::acc::ReductionOperator::AccMax) { + if (mlir::isa(ty)) + llvm::report_fatal_error( + "min/max reduction not supported for complex type"); + if (auto floatTy = mlir::dyn_cast_or_null(ty)) + return builder.create( + loc, ty, + builder.getFloatAttr(ty, + getReductionInitValue(op, ty))); + } else if (auto floatTy = mlir::dyn_cast_or_null(ty)) { + return builder.create( + loc, ty, + builder.getFloatAttr(ty, getReductionInitValue(op, ty))); + } else if (auto cmplxTy = mlir::dyn_cast_or_null(ty)) { + mlir::Type floatTy = cmplxTy.getElementType(); + mlir::Value realInit = builder.createRealConstant( + loc, floatTy, getReductionInitValue(op, cmplxTy)); + mlir::Value imagInit = builder.createRealConstant(loc, floatTy, 0.0); + return fir::factory::Complex{builder, loc}.createComplex(cmplxTy, realInit, + imagInit); + } + + if (auto seqTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, seqTy.getEleTy(), op); + + if (auto boxTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, boxTy.getEleTy(), op); + + if (auto heapTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, heapTy.getEleTy(), op); + + if (auto ptrTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, ptrTy.getEleTy(), op); + + llvm::report_fatal_error("Unsupported OpenACC reduction type"); +} + template -static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe, - mlir::Type argTy, mlir::Location loc) { +static void genPrivateLikeInitRegion(fir::FirOpBuilder &builder, + RecipeOp recipe, mlir::Type argTy, + mlir::Location loc, + mlir::Value initValue) { mlir::Value retVal = recipe.getInitRegion().front().getArgument(0); mlir::Type unwrappedTy = fir::unwrapRefType(argTy); + llvm::StringRef initName; + 
if constexpr (std::is_same_v) + initName = accReductionInitName; + else + initName = accPrivateInitName; + auto getDeclareOpForType = [&](mlir::Type ty) -> hlfir::DeclareOp { auto alloca = builder.create(loc, ty); return builder.create( - loc, alloca, accPrivateInitName, /*shape=*/nullptr, - llvm::ArrayRef{}, /*dummy_scope=*/nullptr, - fir::FortranVariableFlagsAttr{}); + loc, alloca, initName, /*shape=*/nullptr, llvm::ArrayRef{}, + /*dummy_scope=*/nullptr, fir::FortranVariableFlagsAttr{}); }; if (fir::isa_trivial(unwrappedTy)) { - retVal = getDeclareOpForType(unwrappedTy).getBase(); + auto declareOp = getDeclareOpForType(unwrappedTy); + if (initValue) { + auto convert = builder.createConvert(loc, unwrappedTy, initValue); + builder.create(loc, convert, declareOp.getBase()); + } + retVal = declareOp.getBase(); } else if (auto seqTy = mlir::dyn_cast_or_null(unwrappedTy)) { if (fir::isa_trivial(seqTy.getEleTy())) { @@ -877,8 +1002,34 @@ static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe, auto alloca = builder.create( loc, seqTy, /*typeparams=*/mlir::ValueRange{}, extents); auto declareOp = builder.create( - loc, alloca, accPrivateInitName, shape, llvm::ArrayRef{}, + loc, alloca, initName, shape, llvm::ArrayRef{}, /*dummy_scope=*/nullptr, fir::FortranVariableFlagsAttr{}); + + if (initValue) { + mlir::Type idxTy = builder.getIndexType(); + mlir::Type refTy = fir::ReferenceType::get(seqTy.getEleTy()); + llvm::SmallVector loops; + llvm::SmallVector ivs; + + if (seqTy.hasDynamicExtents()) { + builder.create(loc, initValue, declareOp.getBase()); + } else { + for (auto ext : seqTy.getShape()) { + auto lb = builder.createIntegerConstant(loc, idxTy, 0); + auto ub = builder.createIntegerConstant(loc, idxTy, ext - 1); + auto step = builder.createIntegerConstant(loc, idxTy, 1); + auto loop = builder.create(loc, lb, ub, step, + /*unordered=*/false); + builder.setInsertionPointToStart(loop.getBody()); + loops.push_back(loop); + 
ivs.push_back(loop.getInductionVar()); + } + auto coord = builder.create( + loc, refTy, declareOp.getBase(), ivs); + builder.create(loc, initValue, coord); + builder.setInsertionPointAfter(loops[0]); + } + } retVal = declareOp.getBase(); } } else if (auto boxTy = @@ -909,25 +1060,29 @@ static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe, retVal = temp; } } else { - TODO(loc, "Unsupported boxed type in OpenACC privatization"); + TODO(loc, "Unsupported boxed type for OpenACC private-like recipe"); + } + if (initValue) { + builder.create(loc, initValue, retVal); } } builder.create(loc, retVal); } -mlir::acc::PrivateRecipeOp -Fortran::lower::createOrGetPrivateRecipe(mlir::OpBuilder &builder, - llvm::StringRef recipeName, - mlir::Location loc, mlir::Type ty) { - mlir::ModuleOp mod = - builder.getBlock()->getParent()->getParentOfType(); - if (auto recipe = mod.lookupSymbol(recipeName)) - return recipe; - - auto crtPos = builder.saveInsertionPoint(); +template +static RecipeOp genRecipeOp( + fir::FirOpBuilder &builder, mlir::ModuleOp mod, llvm::StringRef recipeName, + mlir::Location loc, mlir::Type ty, + mlir::acc::ReductionOperator op = mlir::acc::ReductionOperator::AccNone) { mlir::OpBuilder modBuilder(mod.getBodyRegion()); - auto recipe = - modBuilder.create(loc, recipeName, ty); + RecipeOp recipe; + if constexpr (std::is_same_v) { + recipe = modBuilder.create(loc, recipeName, + ty, op); + } else { + recipe = modBuilder.create(loc, recipeName, ty); + } + llvm::SmallVector argsTy{ty}; llvm::SmallVector argsLoc{loc}; if (auto refTy = mlir::dyn_cast_or_null(ty)) { @@ -945,9 +1100,28 @@ Fortran::lower::createOrGetPrivateRecipe(mlir::OpBuilder &builder, builder.createBlock(&recipe.getInitRegion(), recipe.getInitRegion().end(), argsTy, argsLoc); builder.setInsertionPointToEnd(&recipe.getInitRegion().back()); - genPrivateLikeInitRegion(builder, recipe, ty, - loc); - builder.restoreInsertionPoint(crtPos); + mlir::Value initValue; + if constexpr 
(std::is_same_v) { + assert(op != mlir::acc::ReductionOperator::AccNone); + initValue = getReductionInitValue(builder, loc, fir::unwrapRefType(ty), op); + } + genPrivateLikeInitRegion(builder, recipe, ty, loc, initValue); + return recipe; +} + +mlir::acc::PrivateRecipeOp +Fortran::lower::createOrGetPrivateRecipe(fir::FirOpBuilder &builder, + llvm::StringRef recipeName, + mlir::Location loc, mlir::Type ty) { + mlir::ModuleOp mod = + builder.getBlock()->getParent()->getParentOfType(); + if (auto recipe = mod.lookupSymbol(recipeName)) + return recipe; + + auto ip = builder.saveInsertionPoint(); + auto recipe = genRecipeOp(builder, mod, + recipeName, loc, ty); + builder.restoreInsertionPoint(ip); return recipe; } @@ -1064,7 +1238,7 @@ static hlfir::Entity genDesignateWithTriplets( } mlir::acc::FirstprivateRecipeOp Fortran::lower::createOrGetFirstprivateRecipe( - mlir::OpBuilder &builder, llvm::StringRef recipeName, mlir::Location loc, + fir::FirOpBuilder &builder, llvm::StringRef recipeName, mlir::Location loc, mlir::Type ty, llvm::SmallVector &bounds) { mlir::ModuleOp mod = builder.getBlock()->getParent()->getParentOfType(); @@ -1072,28 +1246,9 @@ mlir::acc::FirstprivateRecipeOp Fortran::lower::createOrGetFirstprivateRecipe( mod.lookupSymbol(recipeName)) return recipe; - auto crtPos = builder.saveInsertionPoint(); - mlir::OpBuilder modBuilder(mod.getBodyRegion()); - auto recipe = - modBuilder.create(loc, recipeName, ty); - llvm::SmallVector initArgsTy{ty}; - llvm::SmallVector initArgsLoc{loc}; - auto refTy = fir::unwrapRefType(ty); - if (auto seqTy = mlir::dyn_cast_or_null(refTy)) { - if (seqTy.hasDynamicExtents()) { - mlir::Type idxTy = builder.getIndexType(); - for (unsigned i = 0; i < seqTy.getDimension(); ++i) { - initArgsTy.push_back(idxTy); - initArgsLoc.push_back(loc); - } - } - } - builder.createBlock(&recipe.getInitRegion(), recipe.getInitRegion().end(), - initArgsTy, initArgsLoc); - builder.setInsertionPointToEnd(&recipe.getInitRegion().back()); - 
genPrivateLikeInitRegion(builder, recipe, ty, - loc); - + auto ip = builder.saveInsertionPoint(); + auto recipe = genRecipeOp(builder, mod, + recipeName, loc, ty); bool allConstantBound = areAllBoundConstant(bounds); llvm::SmallVector argsTy{ty, ty}; llvm::SmallVector argsLoc{loc, loc}; @@ -1167,7 +1322,7 @@ mlir::acc::FirstprivateRecipeOp Fortran::lower::createOrGetFirstprivateRecipe( } builder.create(loc); - builder.restoreInsertionPoint(crtPos); + builder.restoreInsertionPoint(ip); return recipe; } @@ -1326,188 +1481,6 @@ getReductionOperator(const Fortran::parser::ReductionOperator &op) { llvm_unreachable("unexpected reduction operator"); } -/// Get the initial value for reduction operator. -template -static R getReductionInitValue(mlir::acc::ReductionOperator op, mlir::Type ty) { - if (op == mlir::acc::ReductionOperator::AccMin) { - // min init value -> largest - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer or index type"); - return llvm::APInt::getSignedMaxValue(ty.getIntOrFloatBitWidth()); - } - if constexpr (std::is_same_v) { - auto floatTy = mlir::dyn_cast_or_null(ty); - assert(floatTy && "expect float type"); - return llvm::APFloat::getLargest(floatTy.getFloatSemantics(), - /*negative=*/false); - } - } else if (op == mlir::acc::ReductionOperator::AccMax) { - // max init value -> smallest - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer or index type"); - return llvm::APInt::getSignedMinValue(ty.getIntOrFloatBitWidth()); - } - if constexpr (std::is_same_v) { - auto floatTy = mlir::dyn_cast_or_null(ty); - assert(floatTy && "expect float type"); - return llvm::APFloat::getSmallest(floatTy.getFloatSemantics(), - /*negative=*/true); - } - } else if (op == mlir::acc::ReductionOperator::AccIand) { - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer type"); - unsigned bits = ty.getIntOrFloatBitWidth(); - return llvm::APInt::getAllOnes(bits); - } - } else { - // +, ior, 
ieor init value -> 0 - // * init value -> 1 - int64_t value = (op == mlir::acc::ReductionOperator::AccMul) ? 1 : 0; - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer or index type"); - return llvm::APInt(ty.getIntOrFloatBitWidth(), value, true); - } - - if constexpr (std::is_same_v) { - assert(mlir::isa(ty) && "expect float type"); - auto floatTy = mlir::dyn_cast(ty); - return llvm::APFloat(floatTy.getFloatSemantics(), value); - } - - if constexpr (std::is_same_v) - return value; - } - llvm_unreachable("OpenACC reduction unsupported type"); -} - -/// Return a constant with the initial value for the reduction operator and -/// type combination. -static mlir::Value getReductionInitValue(fir::FirOpBuilder &builder, - mlir::Location loc, mlir::Type ty, - mlir::acc::ReductionOperator op) { - if (op == mlir::acc::ReductionOperator::AccLand || - op == mlir::acc::ReductionOperator::AccLor || - op == mlir::acc::ReductionOperator::AccEqv || - op == mlir::acc::ReductionOperator::AccNeqv) { - assert(mlir::isa(ty) && "expect fir.logical type"); - bool value = true; // .true. for .and. and .eqv. - if (op == mlir::acc::ReductionOperator::AccLor || - op == mlir::acc::ReductionOperator::AccNeqv) - value = false; // .false. for .or. and .neqv. 
- return builder.createBool(loc, value); - } - if (ty.isIntOrIndex()) - return builder.create( - loc, ty, - builder.getIntegerAttr(ty, getReductionInitValue(op, ty))); - if (op == mlir::acc::ReductionOperator::AccMin || - op == mlir::acc::ReductionOperator::AccMax) { - if (mlir::isa(ty)) - llvm::report_fatal_error( - "min/max reduction not supported for complex type"); - if (auto floatTy = mlir::dyn_cast_or_null(ty)) - return builder.create( - loc, ty, - builder.getFloatAttr(ty, - getReductionInitValue(op, ty))); - } else if (auto floatTy = mlir::dyn_cast_or_null(ty)) { - return builder.create( - loc, ty, - builder.getFloatAttr(ty, getReductionInitValue(op, ty))); - } else if (auto cmplxTy = mlir::dyn_cast_or_null(ty)) { - mlir::Type floatTy = cmplxTy.getElementType(); - mlir::Value realInit = builder.createRealConstant( - loc, floatTy, getReductionInitValue(op, cmplxTy)); - mlir::Value imagInit = builder.createRealConstant(loc, floatTy, 0.0); - return fir::factory::Complex{builder, loc}.createComplex(cmplxTy, realInit, - imagInit); - } - - if (auto seqTy = mlir::dyn_cast(ty)) - return getReductionInitValue(builder, loc, seqTy.getEleTy(), op); - - if (auto boxTy = mlir::dyn_cast(ty)) - return getReductionInitValue(builder, loc, boxTy.getEleTy(), op); - - if (auto heapTy = mlir::dyn_cast(ty)) - return getReductionInitValue(builder, loc, heapTy.getEleTy(), op); - - if (auto ptrTy = mlir::dyn_cast(ty)) - return getReductionInitValue(builder, loc, ptrTy.getEleTy(), op); - - llvm::report_fatal_error("Unsupported OpenACC reduction type"); -} - -static mlir::Value genReductionInitRegion(fir::FirOpBuilder &builder, - mlir::Location loc, mlir::Type ty, - mlir::acc::ReductionOperator op) { - ty = fir::unwrapRefType(ty); - mlir::Value initValue = getReductionInitValue(builder, loc, ty, op); - if (fir::isa_trivial(ty)) { - mlir::Value alloca = builder.create(loc, ty); - auto declareOp = builder.create( - loc, alloca, accReductionInitName, /*shape=*/nullptr, - llvm::ArrayRef{}, 
/*dummy_scope=*/nullptr, - fir::FortranVariableFlagsAttr{}); - builder.create(loc, builder.createConvert(loc, ty, initValue), - declareOp.getBase()); - return declareOp.getBase(); - } else if (auto seqTy = mlir::dyn_cast_or_null(ty)) { - if (fir::isa_trivial(seqTy.getEleTy())) { - mlir::Value shape; - auto extents = builder.getBlock()->getArguments().drop_front(1); - if (seqTy.hasDynamicExtents()) - shape = builder.create(loc, extents); - else - shape = genShapeOp(builder, seqTy, loc); - mlir::Value alloca = builder.create( - loc, seqTy, /*typeparams=*/mlir::ValueRange{}, extents); - auto declareOp = builder.create( - loc, alloca, accReductionInitName, shape, - llvm::ArrayRef{}, /*dummy_scope=*/nullptr, - fir::FortranVariableFlagsAttr{}); - mlir::Type idxTy = builder.getIndexType(); - mlir::Type refTy = fir::ReferenceType::get(seqTy.getEleTy()); - llvm::SmallVector loops; - llvm::SmallVector ivs; - - if (seqTy.hasDynamicExtents()) { - builder.create(loc, initValue, declareOp.getBase()); - return declareOp.getBase(); - } - for (auto ext : seqTy.getShape()) { - auto lb = builder.createIntegerConstant(loc, idxTy, 0); - auto ub = builder.createIntegerConstant(loc, idxTy, ext - 1); - auto step = builder.createIntegerConstant(loc, idxTy, 1); - auto loop = builder.create(loc, lb, ub, step, - /*unordered=*/false); - builder.setInsertionPointToStart(loop.getBody()); - loops.push_back(loop); - ivs.push_back(loop.getInductionVar()); - } - auto coord = builder.create(loc, refTy, - declareOp.getBase(), ivs); - builder.create(loc, initValue, coord); - builder.setInsertionPointAfter(loops[0]); - return declareOp.getBase(); - } - } else if (auto boxTy = mlir::dyn_cast_or_null(ty)) { - mlir::Type innerTy = fir::unwrapRefType(boxTy.getEleTy()); - if (!fir::isa_trivial(innerTy) && !mlir::isa(innerTy)) - TODO(loc, "Unsupported boxed type for reduction"); - // Create the private copy from the initial fir.box. 
- hlfir::Entity source = hlfir::Entity{builder.getBlock()->getArgument(0)}; - auto [temp, cleanup] = hlfir::createTempFromMold(loc, builder, source); - mlir::Value newBox = temp; - if (!mlir::isa(temp.getType())) { - newBox = builder.create(loc, boxTy, temp); - } - builder.create(loc, initValue, newBox); - return newBox; - } - llvm::report_fatal_error("Unsupported OpenACC reduction type"); -} - template static mlir::Value genLogicalCombiner(fir::FirOpBuilder &builder, mlir::Location loc, mlir::Value value1, @@ -1799,27 +1772,10 @@ mlir::acc::ReductionRecipeOp Fortran::lower::createOrGetReductionRecipe( if (auto recipe = mod.lookupSymbol(recipeName)) return recipe; - auto crtPos = builder.saveInsertionPoint(); - mlir::OpBuilder modBuilder(mod.getBodyRegion()); - auto recipe = - modBuilder.create(loc, recipeName, ty, op); - llvm::SmallVector initArgsTy{ty}; - llvm::SmallVector initArgsLoc{loc}; - mlir::Type refTy = fir::unwrapRefType(ty); - if (auto seqTy = mlir::dyn_cast_or_null(refTy)) { - if (seqTy.hasDynamicExtents()) { - mlir::Type idxTy = builder.getIndexType(); - for (unsigned i = 0; i < seqTy.getDimension(); ++i) { - initArgsTy.push_back(idxTy); - initArgsLoc.push_back(loc); - } - } - } - builder.createBlock(&recipe.getInitRegion(), recipe.getInitRegion().end(), - initArgsTy, initArgsLoc); - builder.setInsertionPointToEnd(&recipe.getInitRegion().back()); - mlir::Value initValue = genReductionInitRegion(builder, loc, ty, op); - builder.create(loc, initValue); + auto ip = builder.saveInsertionPoint(); + + auto recipe = genRecipeOp( + builder, mod, recipeName, loc, ty, op); // The two first block arguments are the two values to be combined. 
// The next arguments are the iteration ranges (lb, ub, step) to be used @@ -1846,7 +1802,7 @@ mlir::acc::ReductionRecipeOp Fortran::lower::createOrGetReductionRecipe( mlir::Value v2 = recipe.getCombinerRegion().front().getArgument(1); genCombiner(builder, loc, op, ty, v1, v2, recipe, bounds, allConstantBound); builder.create(loc, v1); - builder.restoreInsertionPoint(crtPos); + builder.restoreInsertionPoint(ip); return recipe; } diff --git a/flang/test/Lower/OpenACC/acc-reduction.f90 b/flang/test/Lower/OpenACC/acc-reduction.f90 index 0d97c298f8d24..20b5ad28f78a1 100644 --- a/flang/test/Lower/OpenACC/acc-reduction.f90 +++ b/flang/test/Lower/OpenACC/acc-reduction.f90 @@ -39,9 +39,11 @@ ! CHECK: %[[BOX_DIMS:.*]]:3 = fir.box_dims %[[BOX]], %[[C0]] : (!fir.box>>, index) -> (index, index, index) ! CHECK: %[[SHAPE:.*]] = fir.shape %[[BOX_DIMS]]#1 : (index) -> !fir.shape<1> ! CHECK: %[[TEMP:.*]] = fir.allocmem !fir.array, %[[BOX_DIMS]]#1 {bindc_name = ".tmp", uniq_name = ""} -! CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[TEMP]](%[[SHAPE]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -! CHECK: hlfir.assign %[[CST]] to %[[DECLARE]]#0 : f32, !fir.box> -! CHECK: acc.yield %[[DECLARE]]#0 : !fir.box> +! CHECK: %[[STORAGE:.*]]:2 = hlfir.declare %[[TEMP]](%[[SHAPE]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +! CHECK: %[[BOXTEMP:.*]] = fir.alloca !fir.box>> +! CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[BOXTEMP]] {uniq_name = "acc.reduction.init"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) +! CHECK: hlfir.assign %[[CST]] to %[[DECLARE]]#0 : f32, !fir.ref>>> +! CHECK: acc.yield %[[DECLARE]]#0 : !fir.ref>>> ! CHECK: } combiner { ! CHECK: ^bb0(%[[ARG0:.*]]: !fir.ref>>>, %[[ARG1:.*]]: !fir.ref>>>): ! CHECK: %[[BOX0:.*]] = fir.load %[[ARG0]] : !fir.ref>>> @@ -74,9 +76,11 @@ ! CHECK: %[[BOX_DIMS:.*]]:3 = fir.box_dims %[[BOX]], %[[C0]] : (!fir.box>>, index) -> (index, index, index) ! 
CHECK: %[[SHAPE:.*]] = fir.shape %[[BOX_DIMS]]#1 : (index) -> !fir.shape<1> ! CHECK: %[[TEMP:.*]] = fir.allocmem !fir.array, %[[BOX_DIMS]]#1 {bindc_name = ".tmp", uniq_name = ""} -! CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[TEMP]](%[[SHAPE]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -! CHECK: hlfir.assign %[[CST]] to %[[DECLARE]]#0 : f32, !fir.box> -! CHECK: acc.yield %[[DECLARE]]#0 : !fir.box> +! CHECK: %[[STORAGE:.*]]:2 = hlfir.declare %[[TEMP]](%[[SHAPE]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +! CHECK: %[[BOXTEMP:.*]] = fir.alloca !fir.box>> +! CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[BOXTEMP]] {uniq_name = "acc.reduction.init"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) +! CHECK: hlfir.assign %[[CST]] to %[[DECLARE]]#0 : f32, !fir.ref>>> +! CHECK: acc.yield %[[DECLARE]]#0 : !fir.ref>>> ! CHECK: } combiner { ! CHECK: ^bb0(%[[ARG0:.*]]: !fir.ref>>>, %[[ARG1:.*]]: !fir.ref>>>): ! CHECK: %[[BOX0:.*]] = fir.load %[[ARG0]] : !fir.ref>>> diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td index 3c22aeb9a1ff7..b9148dc088a6a 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td @@ -35,22 +35,23 @@ class OpenACC_Op traits = []> : Op; // Reduction operation enumeration. 
-def OpenACC_ReductionOperatorAdd : I32EnumAttrCase<"AccAdd", 0, "add">; -def OpenACC_ReductionOperatorMul : I32EnumAttrCase<"AccMul", 1, "mul">; -def OpenACC_ReductionOperatorMax : I32EnumAttrCase<"AccMax", 2, "max">; -def OpenACC_ReductionOperatorMin : I32EnumAttrCase<"AccMin", 3, "min">; -def OpenACC_ReductionOperatorAnd : I32EnumAttrCase<"AccIand", 4, "iand">; -def OpenACC_ReductionOperatorOr : I32EnumAttrCase<"AccIor", 5, "ior">; -def OpenACC_ReductionOperatorXor : I32EnumAttrCase<"AccXor", 6, "xor">; -def OpenACC_ReductionOperatorLogEqv : I32EnumAttrCase<"AccEqv", 7, "eqv">; -def OpenACC_ReductionOperatorLogNeqv : I32EnumAttrCase<"AccNeqv", 8, "neqv">; -def OpenACC_ReductionOperatorLogAnd : I32EnumAttrCase<"AccLand", 9, "land">; -def OpenACC_ReductionOperatorLogOr : I32EnumAttrCase<"AccLor", 10, "lor">; +def OpenACC_ReductionOperatorNone : I32EnumAttrCase<"AccNone", 0, "none">; +def OpenACC_ReductionOperatorAdd : I32EnumAttrCase<"AccAdd", 1, "add">; +def OpenACC_ReductionOperatorMul : I32EnumAttrCase<"AccMul", 2, "mul">; +def OpenACC_ReductionOperatorMax : I32EnumAttrCase<"AccMax", 3, "max">; +def OpenACC_ReductionOperatorMin : I32EnumAttrCase<"AccMin", 4, "min">; +def OpenACC_ReductionOperatorAnd : I32EnumAttrCase<"AccIand", 5, "iand">; +def OpenACC_ReductionOperatorOr : I32EnumAttrCase<"AccIor", 6, "ior">; +def OpenACC_ReductionOperatorXor : I32EnumAttrCase<"AccXor", 7, "xor">; +def OpenACC_ReductionOperatorLogEqv : I32EnumAttrCase<"AccEqv", 8, "eqv">; +def OpenACC_ReductionOperatorLogNeqv : I32EnumAttrCase<"AccNeqv", 9, "neqv">; +def OpenACC_ReductionOperatorLogAnd : I32EnumAttrCase<"AccLand", 10, "land">; +def OpenACC_ReductionOperatorLogOr : I32EnumAttrCase<"AccLor", 11, "lor">; def OpenACC_ReductionOperator : I32EnumAttr<"ReductionOperator", "built-in reduction operations supported by OpenACC", - [OpenACC_ReductionOperatorAdd, OpenACC_ReductionOperatorMul, - OpenACC_ReductionOperatorMax, OpenACC_ReductionOperatorMin, + [OpenACC_ReductionOperatorNone, 
OpenACC_ReductionOperatorAdd, + OpenACC_ReductionOperatorMul, OpenACC_ReductionOperatorMax, OpenACC_ReductionOperatorMin, OpenACC_ReductionOperatorAnd, OpenACC_ReductionOperatorOr, OpenACC_ReductionOperatorXor, OpenACC_ReductionOperatorLogEqv, OpenACC_ReductionOperatorLogNeqv, OpenACC_ReductionOperatorLogAnd,

From flang-commits at lists.llvm.org Mon May 19 17:34:39 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 19 May 2025 17:34:39 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [OpenACC] unify reduction and private-like init region recipes (PR #140652)
In-Reply-To:
Message-ID: <682bce1f.170a0220.18d5a9.d136@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-openacc
@llvm/pr-subscribers-flang-fir-hlfir

Author: Scott Manley (rscottmanley)
Changes

Between firstprivate, private and reduction init regions, the difference is largely whether the temp that is created is initialized. Some recent fixes were made to privatization (#135698, #137869) but did not get propagated to reductions, even though they need to yield the same things from their init regions. To mitigate this discrepancy in the future, refactor the init region recipes so they can be shared between the three recipe ops. Also add "none" to the OpenACC_ReductionOperator enum for better error checking.

---

Patch is 32.72 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140652.diff

4 Files Affected:

- (modified) flang/include/flang/Lower/OpenACC.h (+2-2)
- (modified) flang/lib/Lower/OpenACC.cpp (+207-251)
- (modified) flang/test/Lower/OpenACC/acc-reduction.f90 (+10-6)
- (modified) mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td (+14-13)

``````````diff
diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index bbe3b01fdb29d..dad841863ac00 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -85,7 +85,7 @@ void genOpenACCRoutineConstruct( /// Get a acc.private.recipe op for the given type or create it if it does not /// exist yet. -mlir::acc::PrivateRecipeOp createOrGetPrivateRecipe(mlir::OpBuilder &, +mlir::acc::PrivateRecipeOp createOrGetPrivateRecipe(fir::FirOpBuilder &, llvm::StringRef, mlir::Location, mlir::Type); @@ -99,7 +99,7 @@ createOrGetReductionRecipe(fir::FirOpBuilder &, llvm::StringRef, mlir::Location, /// Get a acc.firstprivate.recipe op for the given type or create it if it does /// not exist yet. 
mlir::acc::FirstprivateRecipeOp -createOrGetFirstprivateRecipe(mlir::OpBuilder &, llvm::StringRef, +createOrGetFirstprivateRecipe(fir::FirOpBuilder &, llvm::StringRef, mlir::Location, mlir::Type, llvm::SmallVector &); diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index e1918288d6de3..98f6adf8f17c6 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -843,22 +843,147 @@ fir::ShapeOp genShapeOp(mlir::OpBuilder &builder, fir::SequenceType seqTy, return builder.create(loc, extents); } +/// Get the initial value for reduction operator. +template +static R getReductionInitValue(mlir::acc::ReductionOperator op, mlir::Type ty) { + if (op == mlir::acc::ReductionOperator::AccMin) { + // min init value -> largest + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer or index type"); + return llvm::APInt::getSignedMaxValue(ty.getIntOrFloatBitWidth()); + } + if constexpr (std::is_same_v) { + auto floatTy = mlir::dyn_cast_or_null(ty); + assert(floatTy && "expect float type"); + return llvm::APFloat::getLargest(floatTy.getFloatSemantics(), + /*negative=*/false); + } + } else if (op == mlir::acc::ReductionOperator::AccMax) { + // max init value -> smallest + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer or index type"); + return llvm::APInt::getSignedMinValue(ty.getIntOrFloatBitWidth()); + } + if constexpr (std::is_same_v) { + auto floatTy = mlir::dyn_cast_or_null(ty); + assert(floatTy && "expect float type"); + return llvm::APFloat::getSmallest(floatTy.getFloatSemantics(), + /*negative=*/true); + } + } else if (op == mlir::acc::ReductionOperator::AccIand) { + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer type"); + unsigned bits = ty.getIntOrFloatBitWidth(); + return llvm::APInt::getAllOnes(bits); + } + } else { + assert(op != mlir::acc::ReductionOperator::AccNone); + // +, ior, ieor init value -> 0 + // * init value -> 1 + int64_t value = (op 
== mlir::acc::ReductionOperator::AccMul) ? 1 : 0; + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer or index type"); + return llvm::APInt(ty.getIntOrFloatBitWidth(), value, true); + } + + if constexpr (std::is_same_v) { + assert(mlir::isa(ty) && "expect float type"); + auto floatTy = mlir::dyn_cast(ty); + return llvm::APFloat(floatTy.getFloatSemantics(), value); + } + + if constexpr (std::is_same_v) + return value; + } + llvm_unreachable("OpenACC reduction unsupported type"); +} + +/// Return a constant with the initial value for the reduction operator and +/// type combination. +static mlir::Value getReductionInitValue(fir::FirOpBuilder &builder, + mlir::Location loc, mlir::Type ty, + mlir::acc::ReductionOperator op) { + if (op == mlir::acc::ReductionOperator::AccLand || + op == mlir::acc::ReductionOperator::AccLor || + op == mlir::acc::ReductionOperator::AccEqv || + op == mlir::acc::ReductionOperator::AccNeqv) { + assert(mlir::isa(ty) && "expect fir.logical type"); + bool value = true; // .true. for .and. and .eqv. + if (op == mlir::acc::ReductionOperator::AccLor || + op == mlir::acc::ReductionOperator::AccNeqv) + value = false; // .false. for .or. and .neqv. 
+ return builder.createBool(loc, value); + } + if (ty.isIntOrIndex()) + return builder.create( + loc, ty, + builder.getIntegerAttr(ty, getReductionInitValue(op, ty))); + if (op == mlir::acc::ReductionOperator::AccMin || + op == mlir::acc::ReductionOperator::AccMax) { + if (mlir::isa(ty)) + llvm::report_fatal_error( + "min/max reduction not supported for complex type"); + if (auto floatTy = mlir::dyn_cast_or_null(ty)) + return builder.create( + loc, ty, + builder.getFloatAttr(ty, + getReductionInitValue(op, ty))); + } else if (auto floatTy = mlir::dyn_cast_or_null(ty)) { + return builder.create( + loc, ty, + builder.getFloatAttr(ty, getReductionInitValue(op, ty))); + } else if (auto cmplxTy = mlir::dyn_cast_or_null(ty)) { + mlir::Type floatTy = cmplxTy.getElementType(); + mlir::Value realInit = builder.createRealConstant( + loc, floatTy, getReductionInitValue(op, cmplxTy)); + mlir::Value imagInit = builder.createRealConstant(loc, floatTy, 0.0); + return fir::factory::Complex{builder, loc}.createComplex(cmplxTy, realInit, + imagInit); + } + + if (auto seqTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, seqTy.getEleTy(), op); + + if (auto boxTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, boxTy.getEleTy(), op); + + if (auto heapTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, heapTy.getEleTy(), op); + + if (auto ptrTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, ptrTy.getEleTy(), op); + + llvm::report_fatal_error("Unsupported OpenACC reduction type"); +} + template -static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe, - mlir::Type argTy, mlir::Location loc) { +static void genPrivateLikeInitRegion(fir::FirOpBuilder &builder, + RecipeOp recipe, mlir::Type argTy, + mlir::Location loc, + mlir::Value initValue) { mlir::Value retVal = recipe.getInitRegion().front().getArgument(0); mlir::Type unwrappedTy = fir::unwrapRefType(argTy); + llvm::StringRef initName; + 
if constexpr (std::is_same_v) + initName = accReductionInitName; + else + initName = accPrivateInitName; + auto getDeclareOpForType = [&](mlir::Type ty) -> hlfir::DeclareOp { auto alloca = builder.create(loc, ty); return builder.create( - loc, alloca, accPrivateInitName, /*shape=*/nullptr, - llvm::ArrayRef{}, /*dummy_scope=*/nullptr, - fir::FortranVariableFlagsAttr{}); + loc, alloca, initName, /*shape=*/nullptr, llvm::ArrayRef{}, + /*dummy_scope=*/nullptr, fir::FortranVariableFlagsAttr{}); }; if (fir::isa_trivial(unwrappedTy)) { - retVal = getDeclareOpForType(unwrappedTy).getBase(); + auto declareOp = getDeclareOpForType(unwrappedTy); + if (initValue) { + auto convert = builder.createConvert(loc, unwrappedTy, initValue); + builder.create(loc, convert, declareOp.getBase()); + } + retVal = declareOp.getBase(); } else if (auto seqTy = mlir::dyn_cast_or_null(unwrappedTy)) { if (fir::isa_trivial(seqTy.getEleTy())) { @@ -877,8 +1002,34 @@ static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe, auto alloca = builder.create( loc, seqTy, /*typeparams=*/mlir::ValueRange{}, extents); auto declareOp = builder.create( - loc, alloca, accPrivateInitName, shape, llvm::ArrayRef{}, + loc, alloca, initName, shape, llvm::ArrayRef{}, /*dummy_scope=*/nullptr, fir::FortranVariableFlagsAttr{}); + + if (initValue) { + mlir::Type idxTy = builder.getIndexType(); + mlir::Type refTy = fir::ReferenceType::get(seqTy.getEleTy()); + llvm::SmallVector loops; + llvm::SmallVector ivs; + + if (seqTy.hasDynamicExtents()) { + builder.create(loc, initValue, declareOp.getBase()); + } else { + for (auto ext : seqTy.getShape()) { + auto lb = builder.createIntegerConstant(loc, idxTy, 0); + auto ub = builder.createIntegerConstant(loc, idxTy, ext - 1); + auto step = builder.createIntegerConstant(loc, idxTy, 1); + auto loop = builder.create(loc, lb, ub, step, + /*unordered=*/false); + builder.setInsertionPointToStart(loop.getBody()); + loops.push_back(loop); + 
ivs.push_back(loop.getInductionVar()); + } + auto coord = builder.create( + loc, refTy, declareOp.getBase(), ivs); + builder.create(loc, initValue, coord); + builder.setInsertionPointAfter(loops[0]); + } + } retVal = declareOp.getBase(); } } else if (auto boxTy = @@ -909,25 +1060,29 @@ static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe, retVal = temp; } } else { - TODO(loc, "Unsupported boxed type in OpenACC privatization"); + TODO(loc, "Unsupported boxed type for OpenACC private-like recipe"); + } + if (initValue) { + builder.create(loc, initValue, retVal); } } builder.create(loc, retVal); } -mlir::acc::PrivateRecipeOp -Fortran::lower::createOrGetPrivateRecipe(mlir::OpBuilder &builder, - llvm::StringRef recipeName, - mlir::Location loc, mlir::Type ty) { - mlir::ModuleOp mod = - builder.getBlock()->getParent()->getParentOfType(); - if (auto recipe = mod.lookupSymbol(recipeName)) - return recipe; - - auto crtPos = builder.saveInsertionPoint(); +template +static RecipeOp genRecipeOp( + fir::FirOpBuilder &builder, mlir::ModuleOp mod, llvm::StringRef recipeName, + mlir::Location loc, mlir::Type ty, + mlir::acc::ReductionOperator op = mlir::acc::ReductionOperator::AccNone) { mlir::OpBuilder modBuilder(mod.getBodyRegion()); - auto recipe = - modBuilder.create(loc, recipeName, ty); + RecipeOp recipe; + if constexpr (std::is_same_v) { + recipe = modBuilder.create(loc, recipeName, + ty, op); + } else { + recipe = modBuilder.create(loc, recipeName, ty); + } + llvm::SmallVector argsTy{ty}; llvm::SmallVector argsLoc{loc}; if (auto refTy = mlir::dyn_cast_or_null(ty)) { @@ -945,9 +1100,28 @@ Fortran::lower::createOrGetPrivateRecipe(mlir::OpBuilder &builder, builder.createBlock(&recipe.getInitRegion(), recipe.getInitRegion().end(), argsTy, argsLoc); builder.setInsertionPointToEnd(&recipe.getInitRegion().back()); - genPrivateLikeInitRegion(builder, recipe, ty, - loc); - builder.restoreInsertionPoint(crtPos); + mlir::Value initValue; + if constexpr 
(std::is_same_v) { + assert(op != mlir::acc::ReductionOperator::AccNone); + initValue = getReductionInitValue(builder, loc, fir::unwrapRefType(ty), op); + } + genPrivateLikeInitRegion(builder, recipe, ty, loc, initValue); + return recipe; +} + +mlir::acc::PrivateRecipeOp +Fortran::lower::createOrGetPrivateRecipe(fir::FirOpBuilder &builder, + llvm::StringRef recipeName, + mlir::Location loc, mlir::Type ty) { + mlir::ModuleOp mod = + builder.getBlock()->getParent()->getParentOfType(); + if (auto recipe = mod.lookupSymbol(recipeName)) + return recipe; + + auto ip = builder.saveInsertionPoint(); + auto recipe = genRecipeOp(builder, mod, + recipeName, loc, ty); + builder.restoreInsertionPoint(ip); return recipe; } @@ -1064,7 +1238,7 @@ static hlfir::Entity genDesignateWithTriplets( } mlir::acc::FirstprivateRecipeOp Fortran::lower::createOrGetFirstprivateRecipe( - mlir::OpBuilder &builder, llvm::StringRef recipeName, mlir::Location loc, + fir::FirOpBuilder &builder, llvm::StringRef recipeName, mlir::Location loc, mlir::Type ty, llvm::SmallVector &bounds) { mlir::ModuleOp mod = builder.getBlock()->getParent()->getParentOfType(); @@ -1072,28 +1246,9 @@ mlir::acc::FirstprivateRecipeOp Fortran::lower::createOrGetFirstprivateRecipe( mod.lookupSymbol(recipeName)) return recipe; - auto crtPos = builder.saveInsertionPoint(); - mlir::OpBuilder modBuilder(mod.getBodyRegion()); - auto recipe = - modBuilder.create(loc, recipeName, ty); - llvm::SmallVector initArgsTy{ty}; - llvm::SmallVector initArgsLoc{loc}; - auto refTy = fir::unwrapRefType(ty); - if (auto seqTy = mlir::dyn_cast_or_null(refTy)) { - if (seqTy.hasDynamicExtents()) { - mlir::Type idxTy = builder.getIndexType(); - for (unsigned i = 0; i < seqTy.getDimension(); ++i) { - initArgsTy.push_back(idxTy); - initArgsLoc.push_back(loc); - } - } - } - builder.createBlock(&recipe.getInitRegion(), recipe.getInitRegion().end(), - initArgsTy, initArgsLoc); - builder.setInsertionPointToEnd(&recipe.getInitRegion().back()); - 
genPrivateLikeInitRegion(builder, recipe, ty, - loc); - + auto ip = builder.saveInsertionPoint(); + auto recipe = genRecipeOp(builder, mod, + recipeName, loc, ty); bool allConstantBound = areAllBoundConstant(bounds); llvm::SmallVector argsTy{ty, ty}; llvm::SmallVector argsLoc{loc, loc}; @@ -1167,7 +1322,7 @@ mlir::acc::FirstprivateRecipeOp Fortran::lower::createOrGetFirstprivateRecipe( } builder.create(loc); - builder.restoreInsertionPoint(crtPos); + builder.restoreInsertionPoint(ip); return recipe; } @@ -1326,188 +1481,6 @@ getReductionOperator(const Fortran::parser::ReductionOperator &op) { llvm_unreachable("unexpected reduction operator"); } -/// Get the initial value for reduction operator. -template -static R getReductionInitValue(mlir::acc::ReductionOperator op, mlir::Type ty) { - if (op == mlir::acc::ReductionOperator::AccMin) { - // min init value -> largest - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer or index type"); - return llvm::APInt::getSignedMaxValue(ty.getIntOrFloatBitWidth()); - } - if constexpr (std::is_same_v) { - auto floatTy = mlir::dyn_cast_or_null(ty); - assert(floatTy && "expect float type"); - return llvm::APFloat::getLargest(floatTy.getFloatSemantics(), - /*negative=*/false); - } - } else if (op == mlir::acc::ReductionOperator::AccMax) { - // max init value -> smallest - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer or index type"); - return llvm::APInt::getSignedMinValue(ty.getIntOrFloatBitWidth()); - } - if constexpr (std::is_same_v) { - auto floatTy = mlir::dyn_cast_or_null(ty); - assert(floatTy && "expect float type"); - return llvm::APFloat::getSmallest(floatTy.getFloatSemantics(), - /*negative=*/true); - } - } else if (op == mlir::acc::ReductionOperator::AccIand) { - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer type"); - unsigned bits = ty.getIntOrFloatBitWidth(); - return llvm::APInt::getAllOnes(bits); - } - } else { - // +, ior, 
ieor init value -> 0 - // * init value -> 1 - int64_t value = (op == mlir::acc::ReductionOperator::AccMul) ? 1 : 0; - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer or index type"); - return llvm::APInt(ty.getIntOrFloatBitWidth(), value, true); - } - - if constexpr (std::is_same_v) { - assert(mlir::isa(ty) && "expect float type"); - auto floatTy = mlir::dyn_cast(ty); - return llvm::APFloat(floatTy.getFloatSemantics(), value); - } - - if constexpr (std::is_same_v) - return value; - } - llvm_unreachable("OpenACC reduction unsupported type"); -} - -/// Return a constant with the initial value for the reduction operator and -/// type combination. -static mlir::Value getReductionInitValue(fir::FirOpBuilder &builder, - mlir::Location loc, mlir::Type ty, - mlir::acc::ReductionOperator op) { - if (op == mlir::acc::ReductionOperator::AccLand || - op == mlir::acc::ReductionOperator::AccLor || - op == mlir::acc::ReductionOperator::AccEqv || - op == mlir::acc::ReductionOperator::AccNeqv) { - assert(mlir::isa(ty) && "expect fir.logical type"); - bool value = true; // .true. for .and. and .eqv. - if (op == mlir::acc::ReductionOperator::AccLor || - op == mlir::acc::ReductionOperator::AccNeqv) - value = false; // .false. for .or. and .neqv. 
- return builder.createBool(loc, value); - } - if (ty.isIntOrIndex()) - return builder.create( - loc, ty, - builder.getIntegerAttr(ty, getReductionInitValue(op, ty))); - if (op == mlir::acc::ReductionOperator::AccMin || - op == mlir::acc::ReductionOperator::AccMax) { - if (mlir::isa(ty)) - llvm::report_fatal_error( - "min/max reduction not supported for complex type"); - if (auto floatTy = mlir::dyn_cast_or_null(ty)) - return builder.create( - loc, ty, - builder.getFloatAttr(ty, - getReductionInitValue(op, ty))); - } else if (auto floatTy = mlir::dyn_cast_or_null(ty)) { - return builder.create( - loc, ty, - builder.getFloatAttr(ty, getReductionInitValue(op, ty))); - } else if (auto cmplxTy = mlir::dyn_cast_or_null(ty)) { - mlir::Type floatTy = cmplxTy.getElementType(); - mlir::Value realInit... [truncated] ``````````
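The refactoring described in the patch summary can be illustrated with a small standalone sketch. It does not use MLIR or Flang: `PrivateRecipe`, `FirstprivateRecipe`, `ReductionRecipe`, `InitRegion`, `genInitRegion` and `reductionInitValue` are hypothetical stand-ins invented for this example, and only the integer case is modelled. The sketch assumes the idea is one templated helper that builds the init region for all three recipe kinds and adds an initial value only for reductions, with `None` as the default operator so that a missing operator can be caught by an assertion.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <string>
#include <type_traits>

// Hypothetical tag types standing in for the three recipe ops.
// They are NOT the mlir::acc classes used in the patch.
struct PrivateRecipe {};
struct FirstprivateRecipe {};
struct ReductionRecipe {};

// Simplified operator enum; None comes first, as in the patched enum.
enum class ReductionOperator { None, Add, Mul, Max, Min, Iand, Ior, Xor };

// Identity element of an integer reduction (illustrative, 64-bit only).
inline std::int64_t reductionInitValue(ReductionOperator op) {
  switch (op) {
  case ReductionOperator::Min: // min starts from the largest value
    return std::numeric_limits<std::int64_t>::max();
  case ReductionOperator::Max: // max starts from the smallest value
    return std::numeric_limits<std::int64_t>::min();
  case ReductionOperator::Iand: // bitwise and starts from all ones
    return -1;
  case ReductionOperator::Mul:
    return 1;
  case ReductionOperator::Add:
  case ReductionOperator::Ior:
  case ReductionOperator::Xor:
    return 0;
  case ReductionOperator::None:
    break;
  }
  assert(false && "reduction operator must not be None");
  return 0;
}

// Hypothetical description of what an init region produces.
struct InitRegion {
  std::string name;                      // name given to the temporary
  std::optional<std::int64_t> initValue; // set only for reductions
};

// One helper shared by all three recipe kinds. Only the reduction case
// computes an initial value; private and firstprivate leave the temporary
// uninitialized. The private name string is an assumption of this sketch.
template <typename RecipeOp>
InitRegion genInitRegion(ReductionOperator op = ReductionOperator::None) {
  InitRegion region;
  if constexpr (std::is_same_v<RecipeOp, ReductionRecipe>) {
    region.name = "acc.reduction.init";
    region.initValue = reductionInitValue(op);
  } else {
    region.name = "acc.private.init";
  }
  return region;
}
```

Calling `genInitRegion<PrivateRecipe>()` gives a region without an initial value, while `genInitRegion<ReductionRecipe>(ReductionOperator::Mul)` gives one initialized to 1. Because the branch is selected with `if constexpr`, a fix made to the shared part of the helper applies to all three recipe kinds at once, which is the stated motivation of the patch.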
https://github.com/llvm/llvm-project/pull/140652

From flang-commits at lists.llvm.org Mon May 19 17:35:14 2025
From: flang-commits at lists.llvm.org (Scott Manley via flang-commits)
Date: Mon, 19 May 2025 17:35:14 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [OpenACC] unify reduction and private-like init region recipes (PR #140652)
In-Reply-To:
Message-ID: <682bce42.170a0220.3c4fd6.d001@mx.google.com>

https://github.com/rscottmanley updated https://github.com/llvm/llvm-project/pull/140652

>From ea1ffd7b9789e285707c8d795004619796999943 Mon Sep 17 00:00:00 2001
From: Scott Manley
Date: Mon, 19 May 2025 16:40:10 -0700
Subject: [PATCH] [OpenACC] unify reduction and private-like init region recipes

Between firstprivate, private and reduction init regions, the difference is largely whether the temp that is created is initialized. Some recent fixes were made to privatization (#135698, #137869) but did not get propagated to reductions, even though they essentially do the same thing. To mitigate this discrepancy in the future, refactor the init region recipes so they can be shared between the three recipe ops. Also add "none" to the OpenACC_ReductionOperator enum for better error checking.

---
 flang/include/flang/Lower/OpenACC.h | 4 +-
 flang/lib/Lower/OpenACC.cpp | 458 ++++++++----------
 flang/test/Lower/OpenACC/acc-reduction.f90 | 16 +-
 .../mlir/Dialect/OpenACC/OpenACCOps.td | 27 +-
 4 files changed, 233 insertions(+), 272 deletions(-)

diff --git a/flang/include/flang/Lower/OpenACC.h b/flang/include/flang/Lower/OpenACC.h index bbe3b01fdb29d..dad841863ac00 100644 --- a/flang/include/flang/Lower/OpenACC.h +++ b/flang/include/flang/Lower/OpenACC.h @@ -85,7 +85,7 @@ void genOpenACCRoutineConstruct( /// Get a acc.private.recipe op for the given type or create it if it does not /// exist yet. 
-mlir::acc::PrivateRecipeOp createOrGetPrivateRecipe(mlir::OpBuilder &, +mlir::acc::PrivateRecipeOp createOrGetPrivateRecipe(fir::FirOpBuilder &, llvm::StringRef, mlir::Location, mlir::Type); @@ -99,7 +99,7 @@ createOrGetReductionRecipe(fir::FirOpBuilder &, llvm::StringRef, mlir::Location, /// Get a acc.firstprivate.recipe op for the given type or create it if it does /// not exist yet. mlir::acc::FirstprivateRecipeOp -createOrGetFirstprivateRecipe(mlir::OpBuilder &, llvm::StringRef, +createOrGetFirstprivateRecipe(fir::FirOpBuilder &, llvm::StringRef, mlir::Location, mlir::Type, llvm::SmallVector &); diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index e1918288d6de3..bc94e860ff10b 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -843,22 +843,147 @@ fir::ShapeOp genShapeOp(mlir::OpBuilder &builder, fir::SequenceType seqTy, return builder.create(loc, extents); } +/// Get the initial value for reduction operator. +template +static R getReductionInitValue(mlir::acc::ReductionOperator op, mlir::Type ty) { + if (op == mlir::acc::ReductionOperator::AccMin) { + // min init value -> largest + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer or index type"); + return llvm::APInt::getSignedMaxValue(ty.getIntOrFloatBitWidth()); + } + if constexpr (std::is_same_v) { + auto floatTy = mlir::dyn_cast_or_null(ty); + assert(floatTy && "expect float type"); + return llvm::APFloat::getLargest(floatTy.getFloatSemantics(), + /*negative=*/false); + } + } else if (op == mlir::acc::ReductionOperator::AccMax) { + // max init value -> smallest + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer or index type"); + return llvm::APInt::getSignedMinValue(ty.getIntOrFloatBitWidth()); + } + if constexpr (std::is_same_v) { + auto floatTy = mlir::dyn_cast_or_null(ty); + assert(floatTy && "expect float type"); + return llvm::APFloat::getSmallest(floatTy.getFloatSemantics(), + 
/*negative=*/true); + } + } else if (op == mlir::acc::ReductionOperator::AccIand) { + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer type"); + unsigned bits = ty.getIntOrFloatBitWidth(); + return llvm::APInt::getAllOnes(bits); + } + } else { + assert(op != mlir::acc::ReductionOperator::AccNone); + // +, ior, ieor init value -> 0 + // * init value -> 1 + int64_t value = (op == mlir::acc::ReductionOperator::AccMul) ? 1 : 0; + if constexpr (std::is_same_v) { + assert(ty.isIntOrIndex() && "expect integer or index type"); + return llvm::APInt(ty.getIntOrFloatBitWidth(), value, true); + } + + if constexpr (std::is_same_v) { + assert(mlir::isa(ty) && "expect float type"); + auto floatTy = mlir::dyn_cast(ty); + return llvm::APFloat(floatTy.getFloatSemantics(), value); + } + + if constexpr (std::is_same_v) + return value; + } + llvm_unreachable("OpenACC reduction unsupported type"); +} + +/// Return a constant with the initial value for the reduction operator and +/// type combination. +static mlir::Value getReductionInitValue(fir::FirOpBuilder &builder, + mlir::Location loc, mlir::Type ty, + mlir::acc::ReductionOperator op) { + if (op == mlir::acc::ReductionOperator::AccLand || + op == mlir::acc::ReductionOperator::AccLor || + op == mlir::acc::ReductionOperator::AccEqv || + op == mlir::acc::ReductionOperator::AccNeqv) { + assert(mlir::isa(ty) && "expect fir.logical type"); + bool value = true; // .true. for .and. and .eqv. + if (op == mlir::acc::ReductionOperator::AccLor || + op == mlir::acc::ReductionOperator::AccNeqv) + value = false; // .false. for .or. and .neqv. 
+ return builder.createBool(loc, value); + } + if (ty.isIntOrIndex()) + return builder.create( + loc, ty, + builder.getIntegerAttr(ty, getReductionInitValue(op, ty))); + if (op == mlir::acc::ReductionOperator::AccMin || + op == mlir::acc::ReductionOperator::AccMax) { + if (mlir::isa(ty)) + llvm::report_fatal_error( + "min/max reduction not supported for complex type"); + if (auto floatTy = mlir::dyn_cast_or_null(ty)) + return builder.create( + loc, ty, + builder.getFloatAttr(ty, + getReductionInitValue(op, ty))); + } else if (auto floatTy = mlir::dyn_cast_or_null(ty)) { + return builder.create( + loc, ty, + builder.getFloatAttr(ty, getReductionInitValue(op, ty))); + } else if (auto cmplxTy = mlir::dyn_cast_or_null(ty)) { + mlir::Type floatTy = cmplxTy.getElementType(); + mlir::Value realInit = builder.createRealConstant( + loc, floatTy, getReductionInitValue(op, cmplxTy)); + mlir::Value imagInit = builder.createRealConstant(loc, floatTy, 0.0); + return fir::factory::Complex{builder, loc}.createComplex(cmplxTy, realInit, + imagInit); + } + + if (auto seqTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, seqTy.getEleTy(), op); + + if (auto boxTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, boxTy.getEleTy(), op); + + if (auto heapTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, heapTy.getEleTy(), op); + + if (auto ptrTy = mlir::dyn_cast(ty)) + return getReductionInitValue(builder, loc, ptrTy.getEleTy(), op); + + llvm::report_fatal_error("Unsupported OpenACC reduction type"); +} + template -static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe, - mlir::Type argTy, mlir::Location loc) { +static void genPrivateLikeInitRegion(fir::FirOpBuilder &builder, + RecipeOp recipe, mlir::Type argTy, + mlir::Location loc, + mlir::Value initValue) { mlir::Value retVal = recipe.getInitRegion().front().getArgument(0); mlir::Type unwrappedTy = fir::unwrapRefType(argTy); + llvm::StringRef initName; + 
if constexpr (std::is_same_v) + initName = accReductionInitName; + else + initName = accPrivateInitName; + auto getDeclareOpForType = [&](mlir::Type ty) -> hlfir::DeclareOp { auto alloca = builder.create(loc, ty); return builder.create( - loc, alloca, accPrivateInitName, /*shape=*/nullptr, - llvm::ArrayRef{}, /*dummy_scope=*/nullptr, - fir::FortranVariableFlagsAttr{}); + loc, alloca, initName, /*shape=*/nullptr, llvm::ArrayRef{}, + /*dummy_scope=*/nullptr, fir::FortranVariableFlagsAttr{}); }; if (fir::isa_trivial(unwrappedTy)) { - retVal = getDeclareOpForType(unwrappedTy).getBase(); + auto declareOp = getDeclareOpForType(unwrappedTy); + if (initValue) { + auto convert = builder.createConvert(loc, unwrappedTy, initValue); + builder.create(loc, convert, declareOp.getBase()); + } + retVal = declareOp.getBase(); } else if (auto seqTy = mlir::dyn_cast_or_null(unwrappedTy)) { if (fir::isa_trivial(seqTy.getEleTy())) { @@ -877,8 +1002,34 @@ static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe, auto alloca = builder.create( loc, seqTy, /*typeparams=*/mlir::ValueRange{}, extents); auto declareOp = builder.create( - loc, alloca, accPrivateInitName, shape, llvm::ArrayRef{}, + loc, alloca, initName, shape, llvm::ArrayRef{}, /*dummy_scope=*/nullptr, fir::FortranVariableFlagsAttr{}); + + if (initValue) { + mlir::Type idxTy = builder.getIndexType(); + mlir::Type refTy = fir::ReferenceType::get(seqTy.getEleTy()); + llvm::SmallVector loops; + llvm::SmallVector ivs; + + if (seqTy.hasDynamicExtents()) { + builder.create(loc, initValue, declareOp.getBase()); + } else { + for (auto ext : seqTy.getShape()) { + auto lb = builder.createIntegerConstant(loc, idxTy, 0); + auto ub = builder.createIntegerConstant(loc, idxTy, ext - 1); + auto step = builder.createIntegerConstant(loc, idxTy, 1); + auto loop = builder.create(loc, lb, ub, step, + /*unordered=*/false); + builder.setInsertionPointToStart(loop.getBody()); + loops.push_back(loop); + 
ivs.push_back(loop.getInductionVar()); + } + auto coord = builder.create( + loc, refTy, declareOp.getBase(), ivs); + builder.create(loc, initValue, coord); + builder.setInsertionPointAfter(loops[0]); + } + } retVal = declareOp.getBase(); } } else if (auto boxTy = @@ -909,25 +1060,29 @@ static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe, retVal = temp; } } else { - TODO(loc, "Unsupported boxed type in OpenACC privatization"); + TODO(loc, "Unsupported boxed type for OpenACC private-like recipe"); + } + if (initValue) { + builder.create(loc, initValue, retVal); } } builder.create(loc, retVal); } -mlir::acc::PrivateRecipeOp -Fortran::lower::createOrGetPrivateRecipe(mlir::OpBuilder &builder, - llvm::StringRef recipeName, - mlir::Location loc, mlir::Type ty) { - mlir::ModuleOp mod = - builder.getBlock()->getParent()->getParentOfType(); - if (auto recipe = mod.lookupSymbol(recipeName)) - return recipe; - - auto crtPos = builder.saveInsertionPoint(); +template +static RecipeOp genRecipeOp( + fir::FirOpBuilder &builder, mlir::ModuleOp mod, llvm::StringRef recipeName, + mlir::Location loc, mlir::Type ty, + mlir::acc::ReductionOperator op = mlir::acc::ReductionOperator::AccNone) { mlir::OpBuilder modBuilder(mod.getBodyRegion()); - auto recipe = - modBuilder.create(loc, recipeName, ty); + RecipeOp recipe; + if constexpr (std::is_same_v) { + recipe = modBuilder.create(loc, recipeName, + ty, op); + } else { + recipe = modBuilder.create(loc, recipeName, ty); + } + llvm::SmallVector argsTy{ty}; llvm::SmallVector argsLoc{loc}; if (auto refTy = mlir::dyn_cast_or_null(ty)) { @@ -945,9 +1100,28 @@ Fortran::lower::createOrGetPrivateRecipe(mlir::OpBuilder &builder, builder.createBlock(&recipe.getInitRegion(), recipe.getInitRegion().end(), argsTy, argsLoc); builder.setInsertionPointToEnd(&recipe.getInitRegion().back()); - genPrivateLikeInitRegion(builder, recipe, ty, - loc); - builder.restoreInsertionPoint(crtPos); + mlir::Value initValue; + if constexpr 
(std::is_same_v) { + assert(op != mlir::acc::ReductionOperator::AccNone); + initValue = getReductionInitValue(builder, loc, fir::unwrapRefType(ty), op); + } + genPrivateLikeInitRegion(builder, recipe, ty, loc, initValue); + return recipe; +} + +mlir::acc::PrivateRecipeOp +Fortran::lower::createOrGetPrivateRecipe(fir::FirOpBuilder &builder, + llvm::StringRef recipeName, + mlir::Location loc, mlir::Type ty) { + mlir::ModuleOp mod = + builder.getBlock()->getParent()->getParentOfType(); + if (auto recipe = mod.lookupSymbol(recipeName)) + return recipe; + + auto ip = builder.saveInsertionPoint(); + auto recipe = genRecipeOp(builder, mod, + recipeName, loc, ty); + builder.restoreInsertionPoint(ip); return recipe; } @@ -1064,7 +1238,7 @@ static hlfir::Entity genDesignateWithTriplets( } mlir::acc::FirstprivateRecipeOp Fortran::lower::createOrGetFirstprivateRecipe( - mlir::OpBuilder &builder, llvm::StringRef recipeName, mlir::Location loc, + fir::FirOpBuilder &builder, llvm::StringRef recipeName, mlir::Location loc, mlir::Type ty, llvm::SmallVector &bounds) { mlir::ModuleOp mod = builder.getBlock()->getParent()->getParentOfType(); @@ -1072,28 +1246,9 @@ mlir::acc::FirstprivateRecipeOp Fortran::lower::createOrGetFirstprivateRecipe( mod.lookupSymbol(recipeName)) return recipe; - auto crtPos = builder.saveInsertionPoint(); - mlir::OpBuilder modBuilder(mod.getBodyRegion()); - auto recipe = - modBuilder.create(loc, recipeName, ty); - llvm::SmallVector initArgsTy{ty}; - llvm::SmallVector initArgsLoc{loc}; - auto refTy = fir::unwrapRefType(ty); - if (auto seqTy = mlir::dyn_cast_or_null(refTy)) { - if (seqTy.hasDynamicExtents()) { - mlir::Type idxTy = builder.getIndexType(); - for (unsigned i = 0; i < seqTy.getDimension(); ++i) { - initArgsTy.push_back(idxTy); - initArgsLoc.push_back(loc); - } - } - } - builder.createBlock(&recipe.getInitRegion(), recipe.getInitRegion().end(), - initArgsTy, initArgsLoc); - builder.setInsertionPointToEnd(&recipe.getInitRegion().back()); - 
genPrivateLikeInitRegion(builder, recipe, ty, - loc); - + auto ip = builder.saveInsertionPoint(); + auto recipe = genRecipeOp( + builder, mod, recipeName, loc, ty); bool allConstantBound = areAllBoundConstant(bounds); llvm::SmallVector argsTy{ty, ty}; llvm::SmallVector argsLoc{loc, loc}; @@ -1167,7 +1322,7 @@ mlir::acc::FirstprivateRecipeOp Fortran::lower::createOrGetFirstprivateRecipe( } builder.create(loc); - builder.restoreInsertionPoint(crtPos); + builder.restoreInsertionPoint(ip); return recipe; } @@ -1326,188 +1481,6 @@ getReductionOperator(const Fortran::parser::ReductionOperator &op) { llvm_unreachable("unexpected reduction operator"); } -/// Get the initial value for reduction operator. -template -static R getReductionInitValue(mlir::acc::ReductionOperator op, mlir::Type ty) { - if (op == mlir::acc::ReductionOperator::AccMin) { - // min init value -> largest - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer or index type"); - return llvm::APInt::getSignedMaxValue(ty.getIntOrFloatBitWidth()); - } - if constexpr (std::is_same_v) { - auto floatTy = mlir::dyn_cast_or_null(ty); - assert(floatTy && "expect float type"); - return llvm::APFloat::getLargest(floatTy.getFloatSemantics(), - /*negative=*/false); - } - } else if (op == mlir::acc::ReductionOperator::AccMax) { - // max init value -> smallest - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer or index type"); - return llvm::APInt::getSignedMinValue(ty.getIntOrFloatBitWidth()); - } - if constexpr (std::is_same_v) { - auto floatTy = mlir::dyn_cast_or_null(ty); - assert(floatTy && "expect float type"); - return llvm::APFloat::getSmallest(floatTy.getFloatSemantics(), - /*negative=*/true); - } - } else if (op == mlir::acc::ReductionOperator::AccIand) { - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer type"); - unsigned bits = ty.getIntOrFloatBitWidth(); - return llvm::APInt::getAllOnes(bits); - } - } else { - // +, ior, 
ieor init value -> 0 - // * init value -> 1 - int64_t value = (op == mlir::acc::ReductionOperator::AccMul) ? 1 : 0; - if constexpr (std::is_same_v) { - assert(ty.isIntOrIndex() && "expect integer or index type"); - return llvm::APInt(ty.getIntOrFloatBitWidth(), value, true); - } - - if constexpr (std::is_same_v) { - assert(mlir::isa(ty) && "expect float type"); - auto floatTy = mlir::dyn_cast(ty); - return llvm::APFloat(floatTy.getFloatSemantics(), value); - } - - if constexpr (std::is_same_v) - return value; - } - llvm_unreachable("OpenACC reduction unsupported type"); -} - -/// Return a constant with the initial value for the reduction operator and -/// type combination. -static mlir::Value getReductionInitValue(fir::FirOpBuilder &builder, - mlir::Location loc, mlir::Type ty, - mlir::acc::ReductionOperator op) { - if (op == mlir::acc::ReductionOperator::AccLand || - op == mlir::acc::ReductionOperator::AccLor || - op == mlir::acc::ReductionOperator::AccEqv || - op == mlir::acc::ReductionOperator::AccNeqv) { - assert(mlir::isa(ty) && "expect fir.logical type"); - bool value = true; // .true. for .and. and .eqv. - if (op == mlir::acc::ReductionOperator::AccLor || - op == mlir::acc::ReductionOperator::AccNeqv) - value = false; // .false. for .or. and .neqv. 
- return builder.createBool(loc, value); - } - if (ty.isIntOrIndex()) - return builder.create( - loc, ty, - builder.getIntegerAttr(ty, getReductionInitValue(op, ty))); - if (op == mlir::acc::ReductionOperator::AccMin || - op == mlir::acc::ReductionOperator::AccMax) { - if (mlir::isa(ty)) - llvm::report_fatal_error( - "min/max reduction not supported for complex type"); - if (auto floatTy = mlir::dyn_cast_or_null(ty)) - return builder.create( - loc, ty, - builder.getFloatAttr(ty, - getReductionInitValue(op, ty))); - } else if (auto floatTy = mlir::dyn_cast_or_null(ty)) { - return builder.create( - loc, ty, - builder.getFloatAttr(ty, getReductionInitValue(op, ty))); - } else if (auto cmplxTy = mlir::dyn_cast_or_null(ty)) { - mlir::Type floatTy = cmplxTy.getElementType(); - mlir::Value realInit = builder.createRealConstant( - loc, floatTy, getReductionInitValue(op, cmplxTy)); - mlir::Value imagInit = builder.createRealConstant(loc, floatTy, 0.0); - return fir::factory::Complex{builder, loc}.createComplex(cmplxTy, realInit, - imagInit); - } - - if (auto seqTy = mlir::dyn_cast(ty)) - return getReductionInitValue(builder, loc, seqTy.getEleTy(), op); - - if (auto boxTy = mlir::dyn_cast(ty)) - return getReductionInitValue(builder, loc, boxTy.getEleTy(), op); - - if (auto heapTy = mlir::dyn_cast(ty)) - return getReductionInitValue(builder, loc, heapTy.getEleTy(), op); - - if (auto ptrTy = mlir::dyn_cast(ty)) - return getReductionInitValue(builder, loc, ptrTy.getEleTy(), op); - - llvm::report_fatal_error("Unsupported OpenACC reduction type"); -} - -static mlir::Value genReductionInitRegion(fir::FirOpBuilder &builder, - mlir::Location loc, mlir::Type ty, - mlir::acc::ReductionOperator op) { - ty = fir::unwrapRefType(ty); - mlir::Value initValue = getReductionInitValue(builder, loc, ty, op); - if (fir::isa_trivial(ty)) { - mlir::Value alloca = builder.create(loc, ty); - auto declareOp = builder.create( - loc, alloca, accReductionInitName, /*shape=*/nullptr, - llvm::ArrayRef{}, 
/*dummy_scope=*/nullptr, - fir::FortranVariableFlagsAttr{}); - builder.create(loc, builder.createConvert(loc, ty, initValue), - declareOp.getBase()); - return declareOp.getBase(); - } else if (auto seqTy = mlir::dyn_cast_or_null(ty)) { - if (fir::isa_trivial(seqTy.getEleTy())) { - mlir::Value shape; - auto extents = builder.getBlock()->getArguments().drop_front(1); - if (seqTy.hasDynamicExtents()) - shape = builder.create(loc, extents); - else - shape = genShapeOp(builder, seqTy, loc); - mlir::Value alloca = builder.create( - loc, seqTy, /*typeparams=*/mlir::ValueRange{}, extents); - auto declareOp = builder.create( - loc, alloca, accReductionInitName, shape, - llvm::ArrayRef{}, /*dummy_scope=*/nullptr, - fir::FortranVariableFlagsAttr{}); - mlir::Type idxTy = builder.getIndexType(); - mlir::Type refTy = fir::ReferenceType::get(seqTy.getEleTy()); - llvm::SmallVector loops; - llvm::SmallVector ivs; - - if (seqTy.hasDynamicExtents()) { - builder.create(loc, initValue, declareOp.getBase()); - return declareOp.getBase(); - } - for (auto ext : seqTy.getShape()) { - auto lb = builder.createIntegerConstant(loc, idxTy, 0); - auto ub = builder.createIntegerConstant(loc, idxTy, ext - 1); - auto step = builder.createIntegerConstant(loc, idxTy, 1); - auto loop = builder.create(loc, lb, ub, step, - /*unordered=*/false); - builder.setInsertionPointToStart(loop.getBody()); - loops.push_back(loop); - ivs.push_back(loop.getInductionVar()); - } - auto coord = builder.create(loc, refTy, - declareOp.getBase(), ivs); - builder.create(loc, initValue, coord); - builder.setInsertionPointAfter(loops[0]); - return declareOp.getBase(); - } - } else if (auto boxTy = mlir::dyn_cast_or_null(ty)) { - mlir::Type innerTy = fir::unwrapRefType(boxTy.getEleTy()); - if (!fir::isa_trivial(innerTy) && !mlir::isa(innerTy)) - TODO(loc, "Unsupported boxed type for reduction"); - // Create the private copy from the initial fir.box. 
- hlfir::Entity source = hlfir::Entity{builder.getBlock()->getArgument(0)}; - auto [temp, cleanup] = hlfir::createTempFromMold(loc, builder, source); - mlir::Value newBox = temp; - if (!mlir::isa(temp.getType())) { - newBox = builder.create(loc, boxTy, temp); - } - builder.create(loc, initValue, newBox); - return newBox; - } - llvm::report_fatal_error("Unsupported OpenACC reduction type"); -} - template static mlir::Value genLogicalCombiner(fir::FirOpBuilder &builder, mlir::Location loc, mlir::Value value1, @@ -1799,27 +1772,10 @@ mlir::acc::ReductionRecipeOp Fortran::lower::createOrGetReductionRecipe( if (auto recipe = mod.lookupSymbol(recipeName)) return recipe; - auto crtPos = builder.saveInsertionPoint(); - mlir::OpBuilder modBuilder(mod.getBodyRegion()); - auto recipe = - modBuilder.create(loc, recipeName, ty, op); - llvm::SmallVector initArgsTy{ty}; - llvm::SmallVector initArgsLoc{loc}; - mlir::Type refTy = fir::unwrapRefType(ty); - if (auto seqTy = mlir::dyn_cast_or_null(refTy)) { - if (seqTy.hasDynamicExtents()) { - mlir::Type idxTy = builder.getIndexType(); - for (unsigned i = 0; i < seqTy.getDimension(); ++i) { - initArgsTy.push_back(idxTy); - initArgsLoc.push_back(loc); - } - } - } - builder.createBlock(&recipe.getInitRegion(), recipe.getInitRegion().end(), - initArgsTy, initArgsLoc); - builder.setInsertionPointToEnd(&recipe.getInitRegion().back()); - mlir::Value initValue = genReductionInitRegion(builder, loc, ty, op); - builder.create(loc, initValue); + auto ip = builder.saveInsertionPoint(); + + auto recipe = genRecipeOp( + builder, mod, recipeName, loc, ty, op); // The two first block arguments are the two values to be combined. 
// The next arguments are the iteration ranges (lb, ub, step) to be used @@ -1846,7 +1802,7 @@ mlir::acc::ReductionRecipeOp Fortran::lower::createOrGetReductionRecipe( mlir::Value v2 = recipe.getCombinerRegion().front().getArgument(1); genCombiner(builder, loc, op, ty, v1, v2, recipe, bounds, allConstantBound); builder.create(loc, v1); - builder.restoreInsertionPoint(crtPos); + builder.restoreInsertionPoint(ip); return recipe; } diff --git a/flang/test/Lower/OpenACC/acc-reduction.f90 b/flang/test/Lower/OpenACC/acc-reduction.f90 index 0d97c298f8d24..20b5ad28f78a1 100644 --- a/flang/test/Lower/OpenACC/acc-reduction.f90 +++ b/flang/test/Lower/OpenACC/acc-reduction.f90 @@ -39,9 +39,11 @@ ! CHECK: %[[BOX_DIMS:.*]]:3 = fir.box_dims %[[BOX]], %[[C0]] : (!fir.box>>, index) -> (index, index, index) ! CHECK: %[[SHAPE:.*]] = fir.shape %[[BOX_DIMS]]#1 : (index) -> !fir.shape<1> ! CHECK: %[[TEMP:.*]] = fir.allocmem !fir.array, %[[BOX_DIMS]]#1 {bindc_name = ".tmp", uniq_name = ""} -! CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[TEMP]](%[[SHAPE]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -! CHECK: hlfir.assign %[[CST]] to %[[DECLARE]]#0 : f32, !fir.box> -! CHECK: acc.yield %[[DECLARE]]#0 : !fir.box> +! CHECK: %[[STORAGE:.*]]:2 = hlfir.declare %[[TEMP]](%[[SHAPE]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +! CHECK: %[[BOXTEMP:.*]] = fir.alloca !fir.box>> +! CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[BOXTEMP]] {uniq_name = "acc.reduction.init"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) +! CHECK: hlfir.assign %[[CST]] to %[[DECLARE]]#0 : f32, !fir.ref>>> +! CHECK: acc.yield %[[DECLARE]]#0 : !fir.ref>>> ! CHECK: } combiner { ! CHECK: ^bb0(%[[ARG0:.*]]: !fir.ref>>>, %[[ARG1:.*]]: !fir.ref>>>): ! CHECK: %[[BOX0:.*]] = fir.load %[[ARG0]] : !fir.ref>>> @@ -74,9 +76,11 @@ ! CHECK: %[[BOX_DIMS:.*]]:3 = fir.box_dims %[[BOX]], %[[C0]] : (!fir.box>>, index) -> (index, index, index) ! 
CHECK: %[[SHAPE:.*]] = fir.shape %[[BOX_DIMS]]#1 : (index) -> !fir.shape<1> ! CHECK: %[[TEMP:.*]] = fir.allocmem !fir.array, %[[BOX_DIMS]]#1 {bindc_name = ".tmp", uniq_name = ""} -! CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[TEMP]](%[[SHAPE]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -! CHECK: hlfir.assign %[[CST]] to %[[DECLARE]]#0 : f32, !fir.box> -! CHECK: acc.yield %[[DECLARE]]#0 : !fir.box> +! CHECK: %[[STORAGE:.*]]:2 = hlfir.declare %[[TEMP]](%[[SHAPE]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +! CHECK: %[[BOXTEMP:.*]] = fir.alloca !fir.box>> +! CHECK: %[[DECLARE:.*]]:2 = hlfir.declare %[[BOXTEMP]] {uniq_name = "acc.reduction.init"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) +! CHECK: hlfir.assign %[[CST]] to %[[DECLARE]]#0 : f32, !fir.ref>>> +! CHECK: acc.yield %[[DECLARE]]#0 : !fir.ref>>> ! CHECK: } combiner { ! CHECK: ^bb0(%[[ARG0:.*]]: !fir.ref>>>, %[[ARG1:.*]]: !fir.ref>>>): ! CHECK: %[[BOX0:.*]] = fir.load %[[ARG0]] : !fir.ref>>> diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td index 3c22aeb9a1ff7..b9148dc088a6a 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td +++ b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td @@ -35,22 +35,23 @@ class OpenACC_Op traits = []> : Op; // Reduction operation enumeration. 
-def OpenACC_ReductionOperatorAdd : I32EnumAttrCase<"AccAdd", 0, "add">; -def OpenACC_ReductionOperatorMul : I32EnumAttrCase<"AccMul", 1, "mul">; -def OpenACC_ReductionOperatorMax : I32EnumAttrCase<"AccMax", 2, "max">; -def OpenACC_ReductionOperatorMin : I32EnumAttrCase<"AccMin", 3, "min">; -def OpenACC_ReductionOperatorAnd : I32EnumAttrCase<"AccIand", 4, "iand">; -def OpenACC_ReductionOperatorOr : I32EnumAttrCase<"AccIor", 5, "ior">; -def OpenACC_ReductionOperatorXor : I32EnumAttrCase<"AccXor", 6, "xor">; -def OpenACC_ReductionOperatorLogEqv : I32EnumAttrCase<"AccEqv", 7, "eqv">; -def OpenACC_ReductionOperatorLogNeqv : I32EnumAttrCase<"AccNeqv", 8, "neqv">; -def OpenACC_ReductionOperatorLogAnd : I32EnumAttrCase<"AccLand", 9, "land">; -def OpenACC_ReductionOperatorLogOr : I32EnumAttrCase<"AccLor", 10, "lor">; +def OpenACC_ReductionOperatorNone : I32EnumAttrCase<"AccNone", 0, "none">; +def OpenACC_ReductionOperatorAdd : I32EnumAttrCase<"AccAdd", 1, "add">; +def OpenACC_ReductionOperatorMul : I32EnumAttrCase<"AccMul", 2, "mul">; +def OpenACC_ReductionOperatorMax : I32EnumAttrCase<"AccMax", 3, "max">; +def OpenACC_ReductionOperatorMin : I32EnumAttrCase<"AccMin", 4, "min">; +def OpenACC_ReductionOperatorAnd : I32EnumAttrCase<"AccIand", 5, "iand">; +def OpenACC_ReductionOperatorOr : I32EnumAttrCase<"AccIor", 6, "ior">; +def OpenACC_ReductionOperatorXor : I32EnumAttrCase<"AccXor", 7, "xor">; +def OpenACC_ReductionOperatorLogEqv : I32EnumAttrCase<"AccEqv", 8, "eqv">; +def OpenACC_ReductionOperatorLogNeqv : I32EnumAttrCase<"AccNeqv", 9, "neqv">; +def OpenACC_ReductionOperatorLogAnd : I32EnumAttrCase<"AccLand", 10, "land">; +def OpenACC_ReductionOperatorLogOr : I32EnumAttrCase<"AccLor", 11, "lor">; def OpenACC_ReductionOperator : I32EnumAttr<"ReductionOperator", "built-in reduction operations supported by OpenACC", - [OpenACC_ReductionOperatorAdd, OpenACC_ReductionOperatorMul, - OpenACC_ReductionOperatorMax, OpenACC_ReductionOperatorMin, + [OpenACC_ReductionOperatorNone, 
OpenACC_ReductionOperatorAdd, + OpenACC_ReductionOperatorMul, OpenACC_ReductionOperatorMax, OpenACC_ReductionOperatorMin, OpenACC_ReductionOperatorAnd, OpenACC_ReductionOperatorOr, OpenACC_ReductionOperatorXor, OpenACC_ReductionOperatorLogEqv, OpenACC_ReductionOperatorLogNeqv, OpenACC_ReductionOperatorLogAnd, From flang-commits at lists.llvm.org Mon May 19 17:39:11 2025 From: flang-commits at lists.llvm.org (Valentin Clement (バレンタイン クレメン) via flang-commits) Date: Mon, 19 May 2025 17:39:11 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [OpenACC] unify reduction and private-like init region recipes (PR #140652) In-Reply-To: Message-ID: <682bcf2f.170a0220.fa567.d558@mx.google.com> https://github.com/clementval approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/140652 From flang-commits at lists.llvm.org Mon May 19 18:00:27 2025 From: flang-commits at lists.llvm.org (Razvan Lupusoru via flang-commits) Date: Mon, 19 May 2025 18:00:27 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [OpenACC] unify reduction and private-like init region recipes (PR #140652) In-Reply-To: Message-ID: <682bd42b.170a0220.95b01.da95@mx.google.com> https://github.com/razvanlupusoru approved this pull request. Looks great to me! Thank you for doing this! https://github.com/llvm/llvm-project/pull/140652 From flang-commits at lists.llvm.org Mon May 19 20:12:10 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 20:12:10 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <682bf30a.170a0220.f14dc.dea9@mx.google.com> fanju110 wrote: > @MaskRay @aeubanks when you have a moment, could you please take a look at this PR? Let me know if there’s anything I should revise or clarify. Thanks! 
@MaskRay @aeubanks https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Mon May 19 20:24:25 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 19 May 2025 20:24:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682bf5e9.050a0220.40824.0027@mx.google.com> https://github.com/NexMing updated https://github.com/llvm/llvm-project/pull/140374 >From 12603fa8910e7730a3168cc0a629f0039e1675f0 Mon Sep 17 00:00:00 2001 From: yanming Date: Fri, 16 May 2025 17:56:21 +0800 Subject: [PATCH] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. --- .../include/flang/Optimizer/Support/InitFIR.h | 2 + .../flang/Optimizer/Transforms/Passes.h | 1 + .../flang/Optimizer/Transforms/Passes.td | 11 + flang/lib/Optimizer/Transforms/CMakeLists.txt | 1 + flang/lib/Optimizer/Transforms/FIRToSCF.cpp | 105 +++++++++ flang/test/Fir/FirToSCF/do-loop.fir | 206 ++++++++++++++++++ 6 files changed, 326 insertions(+) create mode 100644 flang/lib/Optimizer/Transforms/FIRToSCF.cpp create mode 100644 flang/test/Fir/FirToSCF/do-loop.fir diff --git a/flang/include/flang/Optimizer/Support/InitFIR.h b/flang/include/flang/Optimizer/Support/InitFIR.h index 1868fbb201970..fa08f41f84adf 100644 --- a/flang/include/flang/Optimizer/Support/InitFIR.h +++ b/flang/include/flang/Optimizer/Support/InitFIR.h @@ -25,6 +25,7 @@ #include "mlir/Dialect/Func/Extensions/InlinerExtension.h" #include "mlir/Dialect/LLVMIR/NVVMDialect.h" #include "mlir/Dialect/OpenACC/Transforms/Passes.h" +#include "mlir/Dialect/SCF/Transforms/Passes.h" #include "mlir/InitAllDialects.h" #include "mlir/Pass/Pass.h" #include "mlir/Pass/PassRegistry.h" @@ -103,6 +104,7 @@ inline void registerMLIRPassesForFortranTools() { mlir::registerPrintOpStatsPass(); mlir::registerInlinerPass(); mlir::registerSCCPPass(); + mlir::registerSCFPasses(); 
mlir::affine::registerAffineScalarReplacementPass(); mlir::registerSymbolDCEPass(); mlir::registerLocationSnapshotPass(); diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h index 6dbabd523f88a..dc8a5b9141ad2 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.h +++ b/flang/include/flang/Optimizer/Transforms/Passes.h @@ -72,6 +72,7 @@ std::unique_ptr createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {}); std::unique_ptr createMemDataFlowOptPass(); std::unique_ptr createPromoteToAffinePass(); +std::unique_ptr createFIRToSCFPass(); std::unique_ptr createAddDebugInfoPass(fir::AddDebugInfoOptions options = {}); diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index c0d88a8e19f80..da3d9bc751927 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -76,6 +76,17 @@ def AffineDialectDemotion : Pass<"demote-affine", "::mlir::func::FuncOp"> { ]; } +def FIRToSCFPass : Pass<"fir-to-scf"> { + let summary = "Convert FIR structured control flow ops to SCF dialect."; + let description = [{ + Convert FIR structured control flow ops to SCF dialect. 
+ }]; + let constructor = "::fir::createFIRToSCFPass()"; + let dependentDialects = [ + "fir::FIROpsDialect", "mlir::scf::SCFDialect" + ]; +} + def AnnotateConstantOperands : Pass<"annotate-constant"> { let summary = "Annotate constant operands to all FIR operations"; let description = [{ diff --git a/flang/lib/Optimizer/Transforms/CMakeLists.txt b/flang/lib/Optimizer/Transforms/CMakeLists.txt index 170b6e2cca225..846d6c64dbd04 100644 --- a/flang/lib/Optimizer/Transforms/CMakeLists.txt +++ b/flang/lib/Optimizer/Transforms/CMakeLists.txt @@ -16,6 +16,7 @@ add_flang_library(FIRTransforms CUFComputeSharedMemoryOffsetsAndSize.cpp ArrayValueCopy.cpp ExternalNameConversion.cpp + FIRToSCF.cpp MemoryUtils.cpp MemoryAllocation.cpp StackArrays.cpp diff --git a/flang/lib/Optimizer/Transforms/FIRToSCF.cpp b/flang/lib/Optimizer/Transforms/FIRToSCF.cpp new file mode 100644 index 0000000000000..f06ad2db90d55 --- /dev/null +++ b/flang/lib/Optimizer/Transforms/FIRToSCF.cpp @@ -0,0 +1,105 @@ +//===-- FIRToSCF.cpp ------------------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "flang/Optimizer/Dialect/FIRDialect.h"
+#include "flang/Optimizer/Transforms/Passes.h"
+#include "mlir/Dialect/SCF/IR/SCF.h"
+#include "mlir/Transforms/DialectConversion.h"
+
+namespace fir {
+#define GEN_PASS_DEF_FIRTOSCFPASS
+#include "flang/Optimizer/Transforms/Passes.h.inc"
+} // namespace fir
+
+using namespace fir;
+using namespace mlir;
+
+namespace {
+class FIRToSCFPass : public fir::impl::FIRToSCFPassBase<FIRToSCFPass> {
+public:
+  void runOnOperation() override;
+};
+
+struct DoLoopConversion : public OpRewritePattern<fir::DoLoopOp> {
+  using OpRewritePattern<fir::DoLoopOp>::OpRewritePattern;
+
+  LogicalResult matchAndRewrite(fir::DoLoopOp doLoopOp,
+                                PatternRewriter &rewriter) const override {
+    auto loc = doLoopOp.getLoc();
+    bool hasFinalValue = doLoopOp.getFinalValue().has_value();
+
+    // Get loop values from the DoLoopOp
+    auto low = doLoopOp.getLowerBound();
+    auto high = doLoopOp.getUpperBound();
+    assert(low && high && "must be a Value");
+    auto step = doLoopOp.getStep();
+    llvm::SmallVector<Value> iterArgs;
+    if (hasFinalValue)
+      iterArgs.push_back(low);
+    iterArgs.append(doLoopOp.getIterOperands().begin(),
+                    doLoopOp.getIterOperands().end());
+
+    // fir.do_loop iterates over the interval [%l, %u], and the step may be
+    // negative. But scf.for iterates over the interval [%l, %u), and the step
+    // must be a positive value.
+    // For easier conversion, we calculate the trip count and use a canonical
+    // induction variable.
+    auto diff = rewriter.create<arith::SubIOp>(loc, high, low);
+    auto distance = rewriter.create<arith::AddIOp>(loc, diff, step);
+    auto tripCount = rewriter.create<arith::DivSIOp>(loc, distance, step);
+    auto zero = rewriter.create<arith::ConstantIndexOp>(loc, 0);
+    auto one = rewriter.create<arith::ConstantIndexOp>(loc, 1);
+    auto scfForOp =
+        rewriter.create<scf::ForOp>(loc, zero, tripCount, one, iterArgs);
+
+    auto &loopOps = doLoopOp.getBody()->getOperations();
+    auto resultOp = cast<fir::ResultOp>(doLoopOp.getBody()->getTerminator());
+    auto results = resultOp.getOperands();
+    Block *loweredBody = scfForOp.getBody();
+
+    loweredBody->getOperations().splice(loweredBody->begin(), loopOps,
+                                        loopOps.begin(),
+                                        std::prev(loopOps.end()));
+
+    rewriter.setInsertionPointToStart(loweredBody);
+    Value iv =
+        rewriter.create<arith::MulIOp>(loc, scfForOp.getInductionVar(), step);
+    iv = rewriter.create<arith::AddIOp>(loc, low, iv);
+
+    if (!results.empty()) {
+      rewriter.setInsertionPointToEnd(loweredBody);
+      rewriter.create<scf::YieldOp>(resultOp->getLoc(), results);
+    }
+    doLoopOp.getInductionVar().replaceAllUsesWith(iv);
+    rewriter.replaceAllUsesWith(doLoopOp.getRegionIterArgs(),
+                                hasFinalValue
+                                    ? scfForOp.getRegionIterArgs().drop_front()
+                                    : scfForOp.getRegionIterArgs());
+
+    // Copy all the attributes from the old to new op.
+    scfForOp->setAttrs(doLoopOp->getAttrs());
+    rewriter.replaceOp(doLoopOp, scfForOp);
+    return success();
+  }
+};
+} // namespace
+
+void FIRToSCFPass::runOnOperation() {
+  RewritePatternSet patterns(&getContext());
+  patterns.add<DoLoopConversion>(patterns.getContext());
+  ConversionTarget target(getContext());
+  target.addIllegalOp<fir::DoLoopOp>();
+  target.markUnknownOpDynamicallyLegal([](Operation *) { return true; });
+  if (failed(
+          applyPartialConversion(getOperation(), target, std::move(patterns))))
+    signalPassFailure();
+}
+
+std::unique_ptr<Pass> fir::createFIRToSCFPass() {
+  return std::make_unique<FIRToSCFPass>();
+}
diff --git a/flang/test/Fir/FirToSCF/do-loop.fir b/flang/test/Fir/FirToSCF/do-loop.fir
new file mode 100644
index 0000000000000..0feb3339fa9ed
--- /dev/null
+++ b/flang/test/Fir/FirToSCF/do-loop.fir
@@ -0,0 +1,206 @@
+// RUN: fir-opt %s --fir-to-scf | FileCheck %s
+
+// CHECK-LABEL: func.func @simple_loop(
+// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>) {
+// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index
+// CHECK: %[[VAL_1:.*]] = arith.constant 100 : index
+// CHECK: %[[VAL_2:.*]] = fir.shape %[[VAL_1]] : (index) -> !fir.shape<1>
+// CHECK: %[[VAL_3:.*]] = arith.constant 1 : i32
+// CHECK: %[[VAL_4:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
+// CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index
+// CHECK: %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index
+// CHECK: %[[VAL_7:.*]] = arith.constant 0 : index
+// CHECK: %[[VAL_8:.*]] = arith.constant 1 : index
+// CHECK: scf.for %[[VAL_9:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] {
+// CHECK: %[[VAL_10:.*]] = arith.muli %[[VAL_9]], %[[VAL_0]] : index
+// CHECK: %[[VAL_11:.*]] = arith.addi %[[VAL_0]], %[[VAL_10]] : index
+// CHECK: %[[VAL_12:.*]] = fir.array_coor %[[ARG0]](%[[VAL_2]]) %[[VAL_11]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref
+// CHECK: fir.store %[[VAL_3]] to %[[VAL_12]] : !fir.ref
+// CHECK: }
+// CHECK: return
+// CHECK: }
+func.func @simple_loop(%arg0: !fir.ref>) {
+  %c1 = arith.constant 1 :
index + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %c1_i32 = arith.constant 1 : i32 + fir.do_loop %arg1 = %c1 to %c100 step %c1 { + %1 = fir.array_coor %arg0(%0) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + fir.store %c1_i32 to %1 : !fir.ref + } + return +} + +// CHECK-LABEL: func.func @loop_with_negtive_step( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>) { +// CHECK: %[[VAL_0:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_2:.*]] = arith.constant -1 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_0]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = arith.constant 1 : i32 +// CHECK: %[[VAL_5:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index +// CHECK: %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_2]] : index +// CHECK: %[[VAL_8:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_9:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_10:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] { +// CHECK: %[[VAL_11:.*]] = arith.muli %[[VAL_10]], %[[VAL_2]] : index +// CHECK: %[[VAL_12:.*]] = arith.addi %[[VAL_0]], %[[VAL_11]] : index +// CHECK: %[[VAL_13:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_12]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_4]] to %[[VAL_13]] : !fir.ref +// CHECK: } +// CHECK: return +// CHECK: } +func.func @loop_with_negtive_step(%arg0: !fir.ref>) { + %c100 = arith.constant 100 : index + %c1 = arith.constant 1 : index + %c-1 = arith.constant -1 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %c1_i32 = arith.constant 1 : i32 + fir.do_loop %arg1 = %c100 to %c1 step %c-1 { + %1 = fir.array_coor %arg0(%0) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + fir.store %c1_i32 to %1 : !fir.ref + } + return +} + +// CHECK-LABEL: func.func @loop_with_results( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: 
%[[ARG1:.*]]: !fir.ref) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 0 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_8:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_9:.*]] = scf.for %[[VAL_10:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] iter_args(%[[VAL_11:.*]] = %[[VAL_1]]) -> (i32) { +// CHECK: %[[VAL_12:.*]] = arith.muli %[[VAL_10]], %[[VAL_0]] : index +// CHECK: %[[VAL_13:.*]] = arith.addi %[[VAL_0]], %[[VAL_12]] : index +// CHECK: %[[VAL_14:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_13]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: %[[VAL_15:.*]] = fir.load %[[VAL_14]] : !fir.ref +// CHECK: %[[VAL_16:.*]] = arith.addi %[[VAL_11]], %[[VAL_15]] : i32 +// CHECK: scf.yield %[[VAL_16]] : i32 +// CHECK: } +// CHECK: fir.store %[[VAL_9]] to %[[ARG1]] : !fir.ref +// CHECK: return +// CHECK: } +func.func @loop_with_results(%arg0: !fir.ref>, %arg1: !fir.ref) { + %c1 = arith.constant 1 : index + %c0_i32 = arith.constant 0 : i32 + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %1 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %c0_i32) -> (i32) { + %2 = fir.array_coor %arg0(%0) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %3 = fir.load %2 : !fir.ref + %4 = arith.addi %arg3, %3 : i32 + fir.result %4 : i32 + } + fir.store %1 to %arg1 : !fir.ref + return +} + +// CHECK-LABEL: func.func @loop_with_final_value( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = 
arith.constant 0 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.alloca index +// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_5:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_0]] : index +// CHECK: %[[VAL_8:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_9:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_10:.*]]:2 = scf.for %[[VAL_11:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] iter_args(%[[VAL_12:.*]] = %[[VAL_0]], %[[VAL_13:.*]] = %[[VAL_1]]) -> (index, i32) { +// CHECK: %[[VAL_14:.*]] = arith.muli %[[VAL_11]], %[[VAL_0]] : index +// CHECK: %[[VAL_15:.*]] = arith.addi %[[VAL_0]], %[[VAL_14]] : index +// CHECK: %[[VAL_16:.*]] = fir.array_coor %[[ARG0]](%[[VAL_4]]) %[[VAL_15]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.load %[[VAL_16]] : !fir.ref +// CHECK: %[[VAL_18:.*]] = arith.addi %[[VAL_15]], %[[VAL_0]] overflow : index +// CHECK: %[[VAL_19:.*]] = arith.addi %[[VAL_13]], %[[VAL_17]] overflow : i32 +// CHECK: scf.yield %[[VAL_18]], %[[VAL_19]] : index, i32 +// CHECK: } +// CHECK: fir.store %[[VAL_20:.*]]#0 to %[[VAL_3]] : !fir.ref +// CHECK: fir.store %[[VAL_20]]#1 to %[[ARG1]] : !fir.ref +// CHECK: return +// CHECK: } +func.func @loop_with_final_value(%arg0: !fir.ref>, %arg1: !fir.ref) { + %c1 = arith.constant 1 : index + %c0_i32 = arith.constant 0 : i32 + %c100 = arith.constant 100 : index + %0 = fir.alloca index + %1 = fir.shape %c100 : (index) -> !fir.shape<1> + %2:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %c0_i32) -> (index, i32) { + %3 = fir.array_coor %arg0(%1) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %4 = fir.load %3 : !fir.ref + %5 = arith.addi %arg2, %c1 overflow : index + %6 = arith.addi %arg3, %4 overflow : i32 + fir.result %5, %6 : index, i32 + } + 
fir.store %2#0 to %0 : !fir.ref + fir.store %2#1 to %arg1 : !fir.ref + return +} + +func.func @loop_with_attribute(%arg0: !fir.ref>, %arg1: !fir.ref) { + %c1 = arith.constant 1 : index + %c0_i32 = arith.constant 0 : i32 + %c100 = arith.constant 100 : index + %0 = fir.alloca i32 + %1 = fir.shape %c100 : (index) -> !fir.shape<1> + fir.do_loop %arg2 = %c1 to %c100 step %c1 reduce(#fir.reduce_attr -> %0 : !fir.ref) { + %2 = fir.array_coor %arg0(%1) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %3 = fir.load %2 : !fir.ref + %4 = fir.load %0 : !fir.ref + %5 = arith.addi %4, %3 : i32 + fir.store %5 to %0 : !fir.ref + fir.result + } + return +} + +// CHECK-LABEL: func.func @nested_loop( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_2]], %[[VAL_2]] : (index, index) -> !fir.shape<2> +// CHECK: %[[VAL_4:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_8:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_9:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] { +// CHECK: %[[VAL_10:.*]] = arith.muli %[[VAL_9]], %[[VAL_0]] : index +// CHECK: %[[VAL_11:.*]] = arith.addi %[[VAL_0]], %[[VAL_10]] : index +// CHECK: %[[VAL_12:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_13:.*]] = arith.addi %[[VAL_12]], %[[VAL_0]] : index +// CHECK: %[[VAL_14:.*]] = arith.divsi %[[VAL_13]], %[[VAL_0]] : index +// CHECK: %[[VAL_15:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_16:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_17:.*]] = %[[VAL_15]] to %[[VAL_14]] step %[[VAL_16]] { +// CHECK: %[[VAL_18:.*]] = arith.muli %[[VAL_17]], %[[VAL_0]] : index +// 
CHECK: %[[VAL_19:.*]] = arith.addi %[[VAL_0]], %[[VAL_18]] : index
+// CHECK: %[[VAL_20:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_19]], %[[VAL_11]] : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref
+// CHECK: fir.store %[[VAL_1]] to %[[VAL_20]] : !fir.ref
+// CHECK: }
+// CHECK: }
+// CHECK: return
+// CHECK: }
+func.func @nested_loop(%arg0: !fir.ref>) {
+  %c1 = arith.constant 1 : index
+  %c1_i32 = arith.constant 1 : i32
+  %c100 = arith.constant 100 : index
+  %0 = fir.shape %c100, %c100 : (index, index) -> !fir.shape<2>
+  fir.do_loop %arg1 = %c1 to %c100 step %c1 {
+    fir.do_loop %arg2 = %c1 to %c100 step %c1 {
+      %1 = fir.array_coor %arg0(%0) %arg2, %arg1 : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref
+      fir.store %c1_i32 to %1 : !fir.ref
+    }
+  }
+  return
+}

From flang-commits at lists.llvm.org Mon May 19 20:39:13 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 19 May 2025 20:39:13 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374)
In-Reply-To:
Message-ID: <682bf961.a70a0220.303af7.eda7@mx.google.com>

NexMing wrote:

> Can you elaborate on "future work will focus on gradually improving this conversion pass"? What ops will you be converting and where/when will it live in the pipeline? What's the intended use for this conversion upstream?

There is some discussion here: https://discourse.llvm.org/t/rfc-add-fir-affine-optimization-fir-pass-pipeline/86190/5

My envisioned final pipeline is: FIR → standard MLIR dialects (where optimizations such as SCF → Affine are applied) → LLVM. I am working on implementing it.

https://github.com/llvm/llvm-project/pull/140374

From flang-commits at lists.llvm.org Mon May 19 20:45:30 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Mon, 19 May 2025 20:45:30 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass.
(PR #140374) In-Reply-To: Message-ID: <682bfada.620a0220.16d5bd.0cdd@mx.google.com> https://github.com/NexMing updated https://github.com/llvm/llvm-project/pull/140374 >From 7ac07790f6de61fc9377bac3c10f5f985d79d5cc Mon Sep 17 00:00:00 2001 From: yanming Date: Fri, 16 May 2025 17:56:21 +0800 Subject: [PATCH] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. --- .../include/flang/Optimizer/Support/InitFIR.h | 2 + .../flang/Optimizer/Transforms/Passes.h | 1 + .../flang/Optimizer/Transforms/Passes.td | 11 + flang/lib/Optimizer/Transforms/CMakeLists.txt | 1 + flang/lib/Optimizer/Transforms/FIRToSCF.cpp | 105 ++++++++ flang/test/Fir/FirToSCF/do-loop.fir | 230 ++++++++++++++++++ 6 files changed, 350 insertions(+) create mode 100644 flang/lib/Optimizer/Transforms/FIRToSCF.cpp create mode 100644 flang/test/Fir/FirToSCF/do-loop.fir diff --git a/flang/include/flang/Optimizer/Support/InitFIR.h b/flang/include/flang/Optimizer/Support/InitFIR.h index 1868fbb201970..fa08f41f84adf 100644 --- a/flang/include/flang/Optimizer/Support/InitFIR.h +++ b/flang/include/flang/Optimizer/Support/InitFIR.h @@ -25,6 +25,7 @@ #include "mlir/Dialect/Func/Extensions/InlinerExtension.h" #include "mlir/Dialect/LLVMIR/NVVMDialect.h" #include "mlir/Dialect/OpenACC/Transforms/Passes.h" +#include "mlir/Dialect/SCF/Transforms/Passes.h" #include "mlir/InitAllDialects.h" #include "mlir/Pass/Pass.h" #include "mlir/Pass/PassRegistry.h" @@ -103,6 +104,7 @@ inline void registerMLIRPassesForFortranTools() { mlir::registerPrintOpStatsPass(); mlir::registerInlinerPass(); mlir::registerSCCPPass(); + mlir::registerSCFPasses(); mlir::affine::registerAffineScalarReplacementPass(); mlir::registerSymbolDCEPass(); mlir::registerLocationSnapshotPass(); diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h index 6dbabd523f88a..dc8a5b9141ad2 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.h +++ 
b/flang/include/flang/Optimizer/Transforms/Passes.h @@ -72,6 +72,7 @@ std::unique_ptr createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {}); std::unique_ptr createMemDataFlowOptPass(); std::unique_ptr createPromoteToAffinePass(); +std::unique_ptr createFIRToSCFPass(); std::unique_ptr createAddDebugInfoPass(fir::AddDebugInfoOptions options = {}); diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index c0d88a8e19f80..da3d9bc751927 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -76,6 +76,17 @@ def AffineDialectDemotion : Pass<"demote-affine", "::mlir::func::FuncOp"> { ]; } +def FIRToSCFPass : Pass<"fir-to-scf"> { + let summary = "Convert FIR structured control flow ops to SCF dialect."; + let description = [{ + Convert FIR structured control flow ops to SCF dialect. + }]; + let constructor = "::fir::createFIRToSCFPass()"; + let dependentDialects = [ + "fir::FIROpsDialect", "mlir::scf::SCFDialect" + ]; +} + def AnnotateConstantOperands : Pass<"annotate-constant"> { let summary = "Annotate constant operands to all FIR operations"; let description = [{ diff --git a/flang/lib/Optimizer/Transforms/CMakeLists.txt b/flang/lib/Optimizer/Transforms/CMakeLists.txt index 170b6e2cca225..846d6c64dbd04 100644 --- a/flang/lib/Optimizer/Transforms/CMakeLists.txt +++ b/flang/lib/Optimizer/Transforms/CMakeLists.txt @@ -16,6 +16,7 @@ add_flang_library(FIRTransforms CUFComputeSharedMemoryOffsetsAndSize.cpp ArrayValueCopy.cpp ExternalNameConversion.cpp + FIRToSCF.cpp MemoryUtils.cpp MemoryAllocation.cpp StackArrays.cpp diff --git a/flang/lib/Optimizer/Transforms/FIRToSCF.cpp b/flang/lib/Optimizer/Transforms/FIRToSCF.cpp new file mode 100644 index 0000000000000..f06ad2db90d55 --- /dev/null +++ b/flang/lib/Optimizer/Transforms/FIRToSCF.cpp @@ -0,0 +1,105 @@ +//===-- FIRToSCF.cpp ------------------------------------------------------===// 
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "flang/Optimizer/Dialect/FIRDialect.h"
+#include "flang/Optimizer/Transforms/Passes.h"
+#include "mlir/Dialect/SCF/IR/SCF.h"
+#include "mlir/Transforms/DialectConversion.h"
+
+namespace fir {
+#define GEN_PASS_DEF_FIRTOSCFPASS
+#include "flang/Optimizer/Transforms/Passes.h.inc"
+} // namespace fir
+
+using namespace fir;
+using namespace mlir;
+
+namespace {
+class FIRToSCFPass : public fir::impl::FIRToSCFPassBase<FIRToSCFPass> {
+public:
+  void runOnOperation() override;
+};
+
+struct DoLoopConversion : public OpRewritePattern<fir::DoLoopOp> {
+  using OpRewritePattern<fir::DoLoopOp>::OpRewritePattern;
+
+  LogicalResult matchAndRewrite(fir::DoLoopOp doLoopOp,
+                                PatternRewriter &rewriter) const override {
+    auto loc = doLoopOp.getLoc();
+    bool hasFinalValue = doLoopOp.getFinalValue().has_value();
+
+    // Get loop values from the DoLoopOp
+    auto low = doLoopOp.getLowerBound();
+    auto high = doLoopOp.getUpperBound();
+    assert(low && high && "must be a Value");
+    auto step = doLoopOp.getStep();
+    llvm::SmallVector<Value> iterArgs;
+    if (hasFinalValue)
+      iterArgs.push_back(low);
+    iterArgs.append(doLoopOp.getIterOperands().begin(),
+                    doLoopOp.getIterOperands().end());
+
+    // fir.do_loop iterates over the interval [%l, %u], and the step may be
+    // negative. But scf.for iterates over the interval [%l, %u), and the step
+    // must be a positive value.
+    // For easier conversion, we calculate the trip count and use a canonical
+    // induction variable.
+    auto diff = rewriter.create<arith::SubIOp>(loc, high, low);
+    auto distance = rewriter.create<arith::AddIOp>(loc, diff, step);
+    auto tripCount = rewriter.create<arith::DivSIOp>(loc, distance, step);
+    auto zero = rewriter.create<arith::ConstantIndexOp>(loc, 0);
+    auto one = rewriter.create<arith::ConstantIndexOp>(loc, 1);
+    auto scfForOp =
+        rewriter.create<scf::ForOp>(loc, zero, tripCount, one, iterArgs);
+
+    auto &loopOps = doLoopOp.getBody()->getOperations();
+    auto resultOp = cast<fir::ResultOp>(doLoopOp.getBody()->getTerminator());
+    auto results = resultOp.getOperands();
+    Block *loweredBody = scfForOp.getBody();
+
+    loweredBody->getOperations().splice(loweredBody->begin(), loopOps,
+                                        loopOps.begin(),
+                                        std::prev(loopOps.end()));
+
+    rewriter.setInsertionPointToStart(loweredBody);
+    Value iv =
+        rewriter.create<arith::MulIOp>(loc, scfForOp.getInductionVar(), step);
+    iv = rewriter.create<arith::AddIOp>(loc, low, iv);
+
+    if (!results.empty()) {
+      rewriter.setInsertionPointToEnd(loweredBody);
+      rewriter.create<scf::YieldOp>(resultOp->getLoc(), results);
+    }
+    doLoopOp.getInductionVar().replaceAllUsesWith(iv);
+    rewriter.replaceAllUsesWith(doLoopOp.getRegionIterArgs(),
+                                hasFinalValue
+                                    ? scfForOp.getRegionIterArgs().drop_front()
+                                    : scfForOp.getRegionIterArgs());
+
+    // Copy all the attributes from the old to new op.
+    scfForOp->setAttrs(doLoopOp->getAttrs());
+    rewriter.replaceOp(doLoopOp, scfForOp);
+    return success();
+  }
+};
+} // namespace
+
+void FIRToSCFPass::runOnOperation() {
+  RewritePatternSet patterns(&getContext());
+  patterns.add<DoLoopConversion>(patterns.getContext());
+  ConversionTarget target(getContext());
+  target.addIllegalOp<fir::DoLoopOp>();
+  target.markUnknownOpDynamicallyLegal([](Operation *) { return true; });
+  if (failed(
+          applyPartialConversion(getOperation(), target, std::move(patterns))))
+    signalPassFailure();
+}
+
+std::unique_ptr<Pass> fir::createFIRToSCFPass() {
+  return std::make_unique<FIRToSCFPass>();
+}
diff --git a/flang/test/Fir/FirToSCF/do-loop.fir b/flang/test/Fir/FirToSCF/do-loop.fir
new file mode 100644
index 0000000000000..812497c8d0c74
--- /dev/null
+++ b/flang/test/Fir/FirToSCF/do-loop.fir
@@ -0,0 +1,230 @@
+// RUN: fir-opt %s --fir-to-scf | FileCheck %s
+
+// CHECK-LABEL: func.func @simple_loop(
+// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>) {
+// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index
+// CHECK: %[[VAL_1:.*]] = arith.constant 100 : index
+// CHECK: %[[VAL_2:.*]] = fir.shape %[[VAL_1]] : (index) -> !fir.shape<1>
+// CHECK: %[[VAL_3:.*]] = arith.constant 1 : i32
+// CHECK: %[[VAL_4:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index
+// CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index
+// CHECK: %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index
+// CHECK: %[[VAL_7:.*]] = arith.constant 0 : index
+// CHECK: %[[VAL_8:.*]] = arith.constant 1 : index
+// CHECK: scf.for %[[VAL_9:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] {
+// CHECK: %[[VAL_10:.*]] = arith.muli %[[VAL_9]], %[[VAL_0]] : index
+// CHECK: %[[VAL_11:.*]] = arith.addi %[[VAL_0]], %[[VAL_10]] : index
+// CHECK: %[[VAL_12:.*]] = fir.array_coor %[[ARG0]](%[[VAL_2]]) %[[VAL_11]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref
+// CHECK: fir.store %[[VAL_3]] to %[[VAL_12]] : !fir.ref
+// CHECK: }
+// CHECK: return
+// CHECK: }
+func.func @simple_loop(%arg0: !fir.ref>) {
+  %c1 = arith.constant 1 :
index + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %c1_i32 = arith.constant 1 : i32 + fir.do_loop %arg1 = %c1 to %c100 step %c1 { + %1 = fir.array_coor %arg0(%0) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + fir.store %c1_i32 to %1 : !fir.ref + } + return +} + +// CHECK-LABEL: func.func @loop_with_negtive_step( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>) { +// CHECK: %[[VAL_0:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_2:.*]] = arith.constant -1 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_0]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = arith.constant 1 : i32 +// CHECK: %[[VAL_5:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index +// CHECK: %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_2]] : index +// CHECK: %[[VAL_8:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_9:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_10:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] { +// CHECK: %[[VAL_11:.*]] = arith.muli %[[VAL_10]], %[[VAL_2]] : index +// CHECK: %[[VAL_12:.*]] = arith.addi %[[VAL_0]], %[[VAL_11]] : index +// CHECK: %[[VAL_13:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_12]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_4]] to %[[VAL_13]] : !fir.ref +// CHECK: } +// CHECK: return +// CHECK: } +func.func @loop_with_negtive_step(%arg0: !fir.ref>) { + %c100 = arith.constant 100 : index + %c1 = arith.constant 1 : index + %c-1 = arith.constant -1 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %c1_i32 = arith.constant 1 : i32 + fir.do_loop %arg1 = %c100 to %c1 step %c-1 { + %1 = fir.array_coor %arg0(%0) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + fir.store %c1_i32 to %1 : !fir.ref + } + return +} + +// CHECK-LABEL: func.func @loop_with_results( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: 
%[[ARG1:.*]]: !fir.ref) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 0 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_8:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_9:.*]] = scf.for %[[VAL_10:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] iter_args(%[[VAL_11:.*]] = %[[VAL_1]]) -> (i32) { +// CHECK: %[[VAL_12:.*]] = arith.muli %[[VAL_10]], %[[VAL_0]] : index +// CHECK: %[[VAL_13:.*]] = arith.addi %[[VAL_0]], %[[VAL_12]] : index +// CHECK: %[[VAL_14:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_13]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: %[[VAL_15:.*]] = fir.load %[[VAL_14]] : !fir.ref +// CHECK: %[[VAL_16:.*]] = arith.addi %[[VAL_11]], %[[VAL_15]] : i32 +// CHECK: scf.yield %[[VAL_16]] : i32 +// CHECK: } +// CHECK: fir.store %[[VAL_9]] to %[[ARG1]] : !fir.ref +// CHECK: return +// CHECK: } +func.func @loop_with_results(%arg0: !fir.ref>, %arg1: !fir.ref) { + %c1 = arith.constant 1 : index + %c0_i32 = arith.constant 0 : i32 + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %1 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %c0_i32) -> (i32) { + %2 = fir.array_coor %arg0(%0) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %3 = fir.load %2 : !fir.ref + %4 = arith.addi %arg3, %3 : i32 + fir.result %4 : i32 + } + fir.store %1 to %arg1 : !fir.ref + return +} + +// CHECK-LABEL: func.func @loop_with_final_value( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = 
arith.constant 0 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.alloca index +// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_5:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_0]] : index +// CHECK: %[[VAL_8:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_9:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_10:.*]]:2 = scf.for %[[VAL_11:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] iter_args(%[[VAL_12:.*]] = %[[VAL_0]], %[[VAL_13:.*]] = %[[VAL_1]]) -> (index, i32) { +// CHECK: %[[VAL_14:.*]] = arith.muli %[[VAL_11]], %[[VAL_0]] : index +// CHECK: %[[VAL_15:.*]] = arith.addi %[[VAL_0]], %[[VAL_14]] : index +// CHECK: %[[VAL_16:.*]] = fir.array_coor %[[ARG0]](%[[VAL_4]]) %[[VAL_15]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.load %[[VAL_16]] : !fir.ref +// CHECK: %[[VAL_18:.*]] = arith.addi %[[VAL_15]], %[[VAL_0]] overflow : index +// CHECK: %[[VAL_19:.*]] = arith.addi %[[VAL_13]], %[[VAL_17]] overflow : i32 +// CHECK: scf.yield %[[VAL_18]], %[[VAL_19]] : index, i32 +// CHECK: } +// CHECK: fir.store %[[VAL_20:.*]]#0 to %[[VAL_3]] : !fir.ref +// CHECK: fir.store %[[VAL_20]]#1 to %[[ARG1]] : !fir.ref +// CHECK: return +// CHECK: } +func.func @loop_with_final_value(%arg0: !fir.ref>, %arg1: !fir.ref) { + %c1 = arith.constant 1 : index + %c0_i32 = arith.constant 0 : i32 + %c100 = arith.constant 100 : index + %0 = fir.alloca index + %1 = fir.shape %c100 : (index) -> !fir.shape<1> + %2:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %c0_i32) -> (index, i32) { + %3 = fir.array_coor %arg0(%1) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %4 = fir.load %3 : !fir.ref + %5 = arith.addi %arg2, %c1 overflow : index + %6 = arith.addi %arg3, %4 overflow : i32 + fir.result %5, %6 : index, i32 + } + 
fir.store %2#0 to %0 : !fir.ref + fir.store %2#1 to %arg1 : !fir.ref + return +} + +// CHECK-LABEL: func.func @loop_with_attribute( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 0 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.alloca i32 +// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_5:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_0]] : index +// CHECK: %[[VAL_8:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_9:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_10:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] { +// CHECK: %[[VAL_11:.*]] = arith.muli %[[VAL_10]], %[[VAL_0]] : index +// CHECK: %[[VAL_12:.*]] = arith.addi %[[VAL_0]], %[[VAL_11]] : index +// CHECK: %[[VAL_13:.*]] = fir.array_coor %[[ARG0]](%[[VAL_4]]) %[[VAL_12]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: %[[VAL_14:.*]] = fir.load %[[VAL_13]] : !fir.ref +// CHECK: %[[VAL_15:.*]] = fir.load %[[VAL_3]] : !fir.ref +// CHECK: %[[VAL_16:.*]] = arith.addi %[[VAL_15]], %[[VAL_14]] : i32 +// CHECK: fir.store %[[VAL_16]] to %[[VAL_3]] : !fir.ref +// CHECK: } {operandSegmentSizes = array, reduceAttrs = [#fir.reduce_attr]} +// CHECK: return +// CHECK: } +func.func @loop_with_attribute(%arg0: !fir.ref>, %arg1: !fir.ref) { + %c1 = arith.constant 1 : index + %c0_i32 = arith.constant 0 : i32 + %c100 = arith.constant 100 : index + %0 = fir.alloca i32 + %1 = fir.shape %c100 : (index) -> !fir.shape<1> + fir.do_loop %arg2 = %c1 to %c100 step %c1 reduce(#fir.reduce_attr -> %0 : !fir.ref) { + %2 = fir.array_coor %arg0(%1) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %3 = fir.load %2 : !fir.ref + %4 = fir.load %0 : !fir.ref + %5 = arith.addi 
%4, %3 : i32 + fir.store %5 to %0 : !fir.ref + fir.result + } + return +} + +// CHECK-LABEL: func.func @nested_loop( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_2]], %[[VAL_2]] : (index, index) -> !fir.shape<2> +// CHECK: %[[VAL_4:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_8:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_9:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] { +// CHECK: %[[VAL_10:.*]] = arith.muli %[[VAL_9]], %[[VAL_0]] : index +// CHECK: %[[VAL_11:.*]] = arith.addi %[[VAL_0]], %[[VAL_10]] : index +// CHECK: %[[VAL_12:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_13:.*]] = arith.addi %[[VAL_12]], %[[VAL_0]] : index +// CHECK: %[[VAL_14:.*]] = arith.divsi %[[VAL_13]], %[[VAL_0]] : index +// CHECK: %[[VAL_15:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_16:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_17:.*]] = %[[VAL_15]] to %[[VAL_14]] step %[[VAL_16]] { +// CHECK: %[[VAL_18:.*]] = arith.muli %[[VAL_17]], %[[VAL_0]] : index +// CHECK: %[[VAL_19:.*]] = arith.addi %[[VAL_0]], %[[VAL_18]] : index +// CHECK: %[[VAL_20:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_19]], %[[VAL_11]] : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref +// CHECK: fir.store %[[VAL_1]] to %[[VAL_20]] : !fir.ref +// CHECK: } +// CHECK: } +// CHECK: return +// CHECK: } +func.func @nested_loop(%arg0: !fir.ref>) { + %c1 = arith.constant 1 : index + %c1_i32 = arith.constant 1 : i32 + %c100 = arith.constant 100 : index + %0 = fir.shape %c100, %c100 : (index, index) -> !fir.shape<2> + fir.do_loop %arg1 = %c1 to %c100 step 
%c1 {
+    fir.do_loop %arg2 = %c1 to %c100 step %c1 {
+      %1 = fir.array_coor %arg0(%0) %arg2, %arg1 : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref
+      fir.store %c1_i32 to %1 : !fir.ref
+    }
+  }
+  return
+}

From flang-commits at lists.llvm.org Tue May 20 00:00:21 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 20 May 2025 00:00:21 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] [Preprocessor] Fix ignoring OpenMP directive when preceded by a MACRO (PR #140686)
Message-ID:

https://github.com/shivaramaarao created https://github.com/llvm/llvm-project/pull/140686

When an OpenMP directive follows a macro on the same line, the directive is treated as a comment and ignored. The function IsCompilerDirectiveSentinel expects a pointer to the directive text without the leading comment character, but the caller passed a pointer that still included it. This commit skips the comment character before making the call.

Fixes #117693

>From 74430527c4b73fa67327386a8476703d06edb3db Mon Sep 17 00:00:00 2001
From: Shivarama Rao
Date: Tue, 20 May 2025 06:48:52 +0000
Subject: [PATCH] When calling IsCompilerDirectiveSentinel,the prefix comment character need to be skipped.

Fixes #117693
---
 flang/lib/Parser/prescan.cpp | 2 +-
 flang/test/Preprocessing/bug117693.F90 | 14 ++++++++++++++
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 flang/test/Preprocessing/bug117693.F90

diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp
index 3bc2ea0b37508..2aeb2a81308cd 100644
--- a/flang/lib/Parser/prescan.cpp
+++ b/flang/lib/Parser/prescan.cpp
@@ -564,7 +564,7 @@ bool Prescanner::MustSkipToEndOfLine() const {
     return true; // skip over ignored columns in right margin (73:80)
   } else if (*at_ == '!'
&& !inCharLiteral_ && (!inFixedForm_ || tabInCurrentLine_ || column_ != 6)) { - return !IsCompilerDirectiveSentinel(at_); + return !IsCompilerDirectiveSentinel(at_ + 1); } else { return false; } diff --git a/flang/test/Preprocessing/bug117693.F90 b/flang/test/Preprocessing/bug117693.F90 new file mode 100644 index 0000000000000..531c07417d0f1 --- /dev/null +++ b/flang/test/Preprocessing/bug117693.F90 @@ -0,0 +1,14 @@ +! RUN: %flang -fopenmp -E %s 2>&1 | FileCheck %s +! CHECK: !$OMP PARALLEL DO +! CHECK: !$OMP END PARALLEL DO +program main +IMPLICIT NONE +INTEGER:: I +#define OMPSUPPORT +INTEGER :: omp_id +OMPSUPPORT !$OMP PARALLEL DO +DO I=1,100 +print *, omp_id +ENDDO +OMPSUPPORT !$OMP END PARALLEL DO +end program From flang-commits at lists.llvm.org Tue May 20 00:00:42 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 00:00:42 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [Preprocessor] Fix ignoring OpenMP directive when preceded by a MACRO (PR #140686) In-Reply-To: Message-ID: <682c289a.170a0220.20cd50.edc5@mx.google.com> github-actions[bot] wrote: Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using `@` followed by their GitHub username. If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the [LLVM GitHub User Guide](https://llvm.org/docs/GitHub.html). 
You can also ask questions in a comment on this PR, on the [LLVM Discord](https://discord.com/invite/xS7Z362) or on the [forums](https://discourse.llvm.org/).

https://github.com/llvm/llvm-project/pull/140686

From flang-commits at lists.llvm.org Tue May 20 00:01:17 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 20 May 2025 00:01:17 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] [Preprocessor] Fix ignoring OpenMP directive when preceded by a MACRO (PR #140686)
In-Reply-To:
Message-ID: <682c28bd.170a0220.3470c5.e380@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-parser

Author: None (shivaramaarao)
Changes

When a macro is followed by OpenMP pragma it is considered as comment and ignored. The function IsCompilerDirectiveSentinel expects the compiler directive argument without the prefix comment character. This is fixed in this commit.

Fixes #117693

---
Full diff: https://github.com/llvm/llvm-project/pull/140686.diff

2 Files Affected:

- (modified) flang/lib/Parser/prescan.cpp (+1-1)
- (added) flang/test/Preprocessing/bug117693.F90 (+14)

``````````diff
diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp
index 3bc2ea0b37508..2aeb2a81308cd 100644
--- a/flang/lib/Parser/prescan.cpp
+++ b/flang/lib/Parser/prescan.cpp
@@ -564,7 +564,7 @@ bool Prescanner::MustSkipToEndOfLine() const {
     return true; // skip over ignored columns in right margin (73:80)
   } else if (*at_ == '!' && !inCharLiteral_ &&
       (!inFixedForm_ || tabInCurrentLine_ || column_ != 6)) {
-    return !IsCompilerDirectiveSentinel(at_);
+    return !IsCompilerDirectiveSentinel(at_ + 1);
   } else {
     return false;
   }
diff --git a/flang/test/Preprocessing/bug117693.F90 b/flang/test/Preprocessing/bug117693.F90
new file mode 100644
index 0000000000000..531c07417d0f1
--- /dev/null
+++ b/flang/test/Preprocessing/bug117693.F90
@@ -0,0 +1,14 @@
+! RUN: %flang -fopenmp -E %s 2>&1 | FileCheck %s
+! CHECK: !$OMP PARALLEL DO
+! CHECK: !$OMP END PARALLEL DO
+program main
+IMPLICIT NONE
+INTEGER:: I
+#define OMPSUPPORT
+INTEGER :: omp_id
+OMPSUPPORT !$OMP PARALLEL DO
+DO I=1,100
+print *, omp_id
+ENDDO
+OMPSUPPORT !$OMP END PARALLEL DO
+end program
``````````
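
The one-character change in this patch can be illustrated with a minimal, self-contained sketch. The names below (`isDirectiveSentinel`, `mustSkipToEndOfLine`) are hypothetical stand-ins and are not flang's actual Prescanner API; the sketch only assumes, as the PR description states, that the sentinel check expects a pointer to the character after the `!`, and it recognizes only a `$omp` sentinel for simplicity.

```cpp
#include <cctype>
#include <cstddef>

// Hypothetical stand-in for a sentinel check (not flang's real function).
// `p` must point at the first character AFTER the '!' comment character.
// Only the "$omp" sentinel is recognized here, compared case-insensitively.
bool isDirectiveSentinel(const char *p) {
  static const char sentinel[] = "$omp";
  for (std::size_t i = 0; i < sizeof(sentinel) - 1; ++i) {
    // A terminating '\0' never matches, so the loop stops before reading
    // past the end of a short string.
    if (std::tolower(static_cast<unsigned char>(p[i])) != sentinel[i]) {
      return false;
    }
  }
  return true;
}

// Returns true when the text at `at` is an ordinary comment that may be
// skipped, and false when it is a directive or not a comment at all.
bool mustSkipToEndOfLine(const char *at) {
  if (*at != '!') {
    return false;
  }
  // Passing `at` here instead of `at + 1` would hand the '!' itself to the
  // sentinel check, so every directive would be treated as a comment.
  return !isDirectiveSentinel(at + 1);
}
```

With this sketch, `mustSkipToEndOfLine("!$OMP PARALLEL DO")` is false (the directive is kept), while `mustSkipToEndOfLine("! plain comment")` is true.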
https://github.com/llvm/llvm-project/pull/140686

From flang-commits at lists.llvm.org Tue May 20 00:16:59 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 20 May 2025 00:16:59 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [mlir] [WIP] Implement workdistribute construct (PR #140523)
In-Reply-To:
Message-ID: <682c2c6b.170a0220.12132b.edf8@mx.google.com>

https://github.com/skc7 converted_to_draft

https://github.com/llvm/llvm-project/pull/140523

From flang-commits at lists.llvm.org Tue May 20 01:06:45 2025
From: flang-commits at lists.llvm.org (Rohit Aggarwal via flang-commits)
Date: Tue, 20 May 2025 01:06:45 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544)
In-Reply-To:
Message-ID: <682c3815.050a0220.1a259c.1a2f@mx.google.com>

rohitaggarwal007 wrote:

@tarunprabhu @florianhumblot @alexey-bataev @RKSimon @phoebewang
Please review the pull request.
Thanks

https://github.com/llvm/llvm-project/pull/140544

From flang-commits at lists.llvm.org Tue May 20 01:17:20 2025
From: flang-commits at lists.llvm.org (Kiran Kumar T P via flang-commits)
Date: Tue, 20 May 2025 01:17:20 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544)
In-Reply-To:
Message-ID: <682c3a90.170a0220.c0d92.3a28@mx.google.com>

kiranktp wrote:

LGTM. Please wait for others to approve.

https://github.com/llvm/llvm-project/pull/140544

From flang-commits at lists.llvm.org Tue May 20 01:18:28 2025
From: flang-commits at lists.llvm.org (Kiran Kumar T P via flang-commits)
Date: Tue, 20 May 2025 01:18:28 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544)
In-Reply-To:
Message-ID: <682c3ad4.170a0220.297121.b23c@mx.google.com>

https://github.com/kiranktp approved this pull request.

https://github.com/llvm/llvm-project/pull/140544 From flang-commits at lists.llvm.org Tue May 20 01:19:09 2025 From: flang-commits at lists.llvm.org (Kiran Kumar T P via flang-commits) Date: Tue, 20 May 2025 01:19:09 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][veclib] Adding AMDLIBM target to fveclib (PR #140533) In-Reply-To: Message-ID: <682c3afd.170a0220.3a7a11.e623@mx.google.com> https://github.com/kiranktp approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/140533 From flang-commits at lists.llvm.org Tue May 20 01:38:41 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 01:38:41 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][veclib] Adding AMDLIBM target to fveclib (PR #140533) In-Reply-To: Message-ID: <682c3f91.170a0220.14f235.ea46@mx.google.com> https://github.com/NimishMishra closed https://github.com/llvm/llvm-project/pull/140533 From flang-commits at lists.llvm.org Tue May 20 01:38:39 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 01:38:39 -0700 (PDT) Subject: [flang-commits] [flang] 32a1b6a - [flang][veclib] Adding AMDLIBM target to fveclib (#140533) Message-ID: <682c3f8f.050a0220.3a4be9.197b@mx.google.com> Author: shivaramaarao Date: 2025-05-20T01:38:35-07:00 New Revision: 32a1b6a70b3ec9066dd70ccf538f735a5c58e031 URL: https://github.com/llvm/llvm-project/commit/32a1b6a70b3ec9066dd70ccf538f735a5c58e031 DIFF: https://github.com/llvm/llvm-project/commit/32a1b6a70b3ec9066dd70ccf538f735a5c58e031.diff LOG: [flang][veclib] Adding AMDLIBM target to fveclib (#140533) This commit adds AMDLIBM support to fveclib targets. The support is already present in clang and this patch extends it to flang. 
Added: Modified: clang/lib/Driver/ToolChains/Flang.cpp flang/include/flang/Frontend/CodeGenOptions.def flang/lib/Frontend/CompilerInvocation.cpp flang/test/Driver/fveclib-codegen.f90 flang/test/Driver/fveclib.f90 Removed: ################################################################################ diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index b1ca747e68b89..0bd8d0c85e50a 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -484,7 +484,7 @@ void Flang::addTargetOptions(const ArgList &Args, Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) << Name << Triple.getArchName(); - } else if (Name == "libmvec") { + } else if (Name == "libmvec" || Name == "AMDLIBM") { if (Triple.getArch() != llvm::Triple::x86 && Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index d9dbd274e83e5..b50dd4fb3abda 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -42,7 +42,7 @@ CODEGENOPT(AliasAnalysis, 1, 0) ///< Enable alias analysis pass CODEGENOPT(Underscoring, 1, 1) ENUM_CODEGENOPT(RelocationModel, llvm::Reloc::Model, 3, llvm::Reloc::PIC_) ///< Name of the relocation model to use. 
ENUM_CODEGENOPT(DebugInfo, llvm::codegenoptions::DebugInfoKind, 4, llvm::codegenoptions::NoDebugInfo) ///< Level of debug info to generate -ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, llvm::driver::VectorLibrary::NoLibrary) ///< Vector functions library to use +ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 4, llvm::driver::VectorLibrary::NoLibrary) ///< Vector functions library to use ENUM_CODEGENOPT(FramePointer, llvm::FramePointerKind, 2, llvm::FramePointerKind::None) ///< Enable the usage of frame pointers ENUM_CODEGENOPT(DoConcurrentMapping, DoConcurrentMappingKind, 2, DoConcurrentMappingKind::DCMK_None) ///< Map `do concurrent` to OpenMP diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 238079a09ef3a..b6c37712d0f79 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -201,6 +201,7 @@ static bool parseVectorLibArg(Fortran::frontend::CodeGenOptions &opts, .Case("SLEEF", VectorLibrary::SLEEF) .Case("Darwin_libsystem_m", VectorLibrary::Darwin_libsystem_m) .Case("ArmPL", VectorLibrary::ArmPL) + .Case("AMDLIBM", VectorLibrary::AMDLIBM) .Case("NoLibrary", VectorLibrary::NoLibrary) .Default(std::nullopt); if (!val.has_value()) { diff --git a/flang/test/Driver/fveclib-codegen.f90 b/flang/test/Driver/fveclib-codegen.f90 index 802fff9772bb3..4cbb1e284f18e 100644 --- a/flang/test/Driver/fveclib-codegen.f90 +++ b/flang/test/Driver/fveclib-codegen.f90 @@ -1,6 +1,7 @@ ! test that -fveclib= is passed to the backend ! RUN: %if aarch64-registered-target %{ %flang -S -Ofast -target aarch64-unknown-linux-gnu -fveclib=SLEEF -o - %s | FileCheck %s --check-prefix=SLEEF %} ! RUN: %if x86-registered-target %{ %flang -S -Ofast -target x86_64-unknown-linux-gnu -fveclib=libmvec -o - %s | FileCheck %s %} +! 
RUN: %if x86-registered-target %{ %flang -S -O3 -ffast-math -target x86_64-unknown-linux-gnu -fveclib=AMDLIBM -o - %s | FileCheck %s --check-prefix=AMDLIBM %} ! RUN: %flang -S -Ofast -fveclib=NoLibrary -o - %s | FileCheck %s --check-prefix=NOLIB subroutine sb(a, b) @@ -10,6 +11,7 @@ subroutine sb(a, b) ! check that we used a vectorized call to powf() ! CHECK: _ZGVbN4vv_powf ! SLEEF: _ZGVnN4vv_powf +! AMDLIBM: amd_vrs4_powf ! NOLIB: powf a(i) = a(i) ** b(i) end do diff --git a/flang/test/Driver/fveclib.f90 b/flang/test/Driver/fveclib.f90 index 1b536b8ad0f18..431a4bfc02522 100644 --- a/flang/test/Driver/fveclib.f90 +++ b/flang/test/Driver/fveclib.f90 @@ -5,6 +5,7 @@ ! RUN: %flang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck -check-prefix CHECK-DARWIN_LIBSYSTEM_M %s ! RUN: %flang -### -c --target=aarch64-none-none -fveclib=SLEEF %s 2>&1 | FileCheck -check-prefix CHECK-SLEEF %s ! RUN: %flang -### -c --target=aarch64-none-none -fveclib=ArmPL %s 2>&1 | FileCheck -check-prefix CHECK-ARMPL %s +! RUN: %flang -### -c --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck -check-prefix CHECK-AMDLIBM %s ! RUN: %flang -### -c --target=aarch64-apple-darwin -fveclib=none %s 2>&1 | FileCheck -check-prefix CHECK-NOLIB-DARWIN %s ! RUN: not %flang -c -fveclib=something %s 2>&1 | FileCheck -check-prefix CHECK-INVALID %s @@ -15,6 +16,7 @@ ! CHECK-DARWIN_LIBSYSTEM_M: "-fveclib=Darwin_libsystem_m" ! CHECK-SLEEF: "-fveclib=SLEEF" ! CHECK-ARMPL: "-fveclib=ArmPL" +! CHECK-AMDLIBM: "-fveclib=AMDLIBM" ! CHECK-NOLIB-DARWIN: "-fveclib=none" ! CHECK-INVALID: error: invalid value 'something' in '-fveclib=something' @@ -23,6 +25,7 @@ ! RUN: not %flang --target=x86-none-none -c -fveclib=ArmPL %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=aarch64-none-none -c -fveclib=libmvec %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=aarch64-none-none -c -fveclib=SVML %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s +! 
RUN: not %flang --target=aarch64-none-none -c -fveclib=AMDLIBM %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! CHECK-ERROR: unsupported option {{.*}} for target ! RUN: %flang -fveclib=Accelerate %s -target arm64-apple-ios8.0.0 -### 2>&1 | FileCheck --check-prefix=CHECK-LINK %s From flang-commits at lists.llvm.org Tue May 20 01:39:00 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 01:39:00 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][veclib] Adding AMDLIBM target to fveclib (PR #140533) In-Reply-To: Message-ID: <682c3fa4.170a0220.d9581.3c1d@mx.google.com> github-actions[bot] wrote: @shivaramaarao Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our [build bots](https://lab.llvm.org/buildbot/). If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail [here](https://llvm.org/docs/MyFirstTypoFix.html#myfirsttypofix-issues-after-landing-your-pr). If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of [LLVM development](https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy). You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done! 
https://github.com/llvm/llvm-project/pull/140533

From flang-commits at lists.llvm.org Tue May 20 02:08:21 2025
From: flang-commits at lists.llvm.org (Simon Pilgrim via flang-commits)
Date: Tue, 20 May 2025 02:08:21 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544)
In-Reply-To:
Message-ID: <682c4685.170a0220.1848a3.e744@mx.google.com>

RKSimon wrote:

@rohitaggarwal007 please can you edit the summary to briefly describe the fix

https://github.com/llvm/llvm-project/pull/140544

From flang-commits at lists.llvm.org Tue May 20 02:58:12 2025
From: flang-commits at lists.llvm.org (Rohit Aggarwal via flang-commits)
Date: Tue, 20 May 2025 02:58:12 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544)
In-Reply-To:
Message-ID: <682c5234.170a0220.3c2062.bf16@mx.google.com>

https://github.com/rohitaggarwal007 edited https://github.com/llvm/llvm-project/pull/140544

From flang-commits at lists.llvm.org Tue May 20 02:58:50 2025
From: flang-commits at lists.llvm.org (Rohit Aggarwal via flang-commits)
Date: Tue, 20 May 2025 02:58:50 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544)
In-Reply-To:
Message-ID: <682c525a.170a0220.fdb56.e76c@mx.google.com>

rohitaggarwal007 wrote:

> @rohitaggarwal007 please can you edit the summary to briefly describe the fix

Done, Update the summary.

https://github.com/llvm/llvm-project/pull/140544

From flang-commits at lists.llvm.org Tue May 20 02:58:51 2025
From: flang-commits at lists.llvm.org (Paul Walker via flang-commits)
Date: Tue, 20 May 2025 02:58:51 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544)
In-Reply-To:
Message-ID: <682c525b.050a0220.2af1d7.023e@mx.google.com>

================
@@ -389,7 +389,7 @@ ENUM_CODEGENOPT(Inlining, InliningMethod, 2, NormalInlining)
 VALUE_CODEGENOPT(InlineMaxStackSize, 32, UINT_MAX)

 // Vector functions library to use.
-ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, llvm::driver::VectorLibrary::NoLibrary)
+ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 4, llvm::driver::VectorLibrary::NoLibrary)
----------------
paulwalker-arm wrote:

Not sure if there's a compile time way to protect against this in the future, if not then it's worth adding a comment to the matching enum in CodeGenOptions.h to highlight the necessary action of ensuring this value big enough when adding a new library.

https://github.com/llvm/llvm-project/pull/140544

From flang-commits at lists.llvm.org Tue May 20 02:58:52 2025
From: flang-commits at lists.llvm.org (Paul Walker via flang-commits)
Date: Tue, 20 May 2025 02:58:52 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544)
In-Reply-To:
Message-ID: <682c525c.170a0220.1609a2.ec39@mx.google.com>

================
@@ -5844,7 +5844,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
             Triple.getArch() != llvm::Triple::x86_64)
           D.Diag(diag::err_drv_unsupported_opt_for_target)
               << Name << Triple.getArchName();
-      } else if (Name == "libmvec") {
+      } else if (Name == "libmvec" || Name == "AMDLIBM") {
----------------
paulwalker-arm wrote:

There's equivalent code in `Flang.cpp` that's also worth updating.

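
On the bit-width question raised in this review thread (whether a compile-time guard is possible), one possible approach is a `static_assert` tying the bitfield width to the largest enumerator. The following is only a sketch under stated assumptions: `VecLibKind`, its enumerators, the `Last` alias, `kVecLibBits`, `bitsNeeded`, and `Options` are all hypothetical names invented for illustration and do not reflect the real `llvm::driver::VectorLibrary` enum or the `ENUM_CODEGENOPT` macro machinery.

```cpp
#include <cstdint>

// Illustrative enum with nine enumerators (values 0..8), mirroring the
// "9 including NoLibrary" count mentioned in the thread. Hypothetical names.
enum class VecLibKind : std::uint8_t {
  NoLibrary, LibA, LibB, LibC, LibD, LibE, LibF, LibG, LibH,
  Last = LibH // assumed convention: keep this alias on the final enumerator
};

// Width of the bitfield that stores the enum (the "bitcount" in the thread).
constexpr unsigned kVecLibBits = 4;

// Number of bits needed to represent `maxValue` as an unsigned integer.
constexpr unsigned bitsNeeded(unsigned maxValue) {
  unsigned bits = 0;
  while (maxValue != 0) {
    ++bits;
    maxValue >>= 1;
  }
  return bits == 0 ? 1 : bits;
}

// Fails to compile if a new enumerator no longer fits in the bitfield.
static_assert(bitsNeeded(static_cast<unsigned>(VecLibKind::Last)) <= kVecLibBits,
              "VecLibKind no longer fits; increase kVecLibBits");

// Hypothetical options struct storing the enum in a narrow bitfield.
struct Options {
  unsigned VecLib : kVecLibBits;
};
```

In this sketch, dropping `kVecLibBits` back to 3 makes the `static_assert` fire, because the largest value (8) needs 4 bits. The guard relies on the assumed `Last` alias being kept up to date, which is itself a manual step.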
https://github.com/llvm/llvm-project/pull/140544

From flang-commits at lists.llvm.org Tue May 20 03:00:41 2025
From: flang-commits at lists.llvm.org (Rohit Aggarwal via flang-commits)
Date: Tue, 20 May 2025 03:00:41 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544)
In-Reply-To:
Message-ID: <682c52c9.050a0220.34f28d.ff2e@mx.google.com>

================
@@ -5844,7 +5844,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
             Triple.getArch() != llvm::Triple::x86_64)
           D.Diag(diag::err_drv_unsupported_opt_for_target)
               << Name << Triple.getArchName();
-      } else if (Name == "libmvec") {
+      } else if (Name == "libmvec" || Name == "AMDLIBM") {
----------------
rohitaggarwal007 wrote:

#140533 is taking care of flang

https://github.com/llvm/llvm-project/pull/140544

From flang-commits at lists.llvm.org Tue May 20 03:05:48 2025
From: flang-commits at lists.llvm.org (Rohit Aggarwal via flang-commits)
Date: Tue, 20 May 2025 03:05:48 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544)
In-Reply-To:
Message-ID: <682c53fc.170a0220.171f50.e6ea@mx.google.com>

================
@@ -389,7 +389,7 @@ ENUM_CODEGENOPT(Inlining, InliningMethod, 2, NormalInlining)
 VALUE_CODEGENOPT(InlineMaxStackSize, 32, UINT_MAX)

 // Vector functions library to use.
-ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, llvm::driver::VectorLibrary::NoLibrary) +ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 4, llvm::driver::VectorLibrary::NoLibrary) ---------------- rohitaggarwal007 wrote: Sure, I will add a comment https://github.com/llvm/llvm-project/pull/140544 From flang-commits at lists.llvm.org Tue May 20 03:11:17 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Tue, 20 May 2025 03:11:17 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Make -fsave-main-program default (PR #137090) In-Reply-To: Message-ID: <682c5545.170a0220.20061a.415a@mx.google.com> https://github.com/eugeneepshteyn closed https://github.com/llvm/llvm-project/pull/137090 From flang-commits at lists.llvm.org Tue May 20 03:11:18 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Tue, 20 May 2025 03:11:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Make -fsave-main-program default (PR #137090) In-Reply-To: Message-ID: <682c5546.050a0220.148a70.fff2@mx.google.com> eugeneepshteyn wrote: Decided not to pursue this https://github.com/llvm/llvm-project/pull/137090 From flang-commits at lists.llvm.org Tue May 20 03:59:09 2025 From: flang-commits at lists.llvm.org (Rohit Aggarwal via flang-commits) Date: Tue, 20 May 2025 03:59:09 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544) In-Reply-To: Message-ID: <682c607d.170a0220.269d00.44ce@mx.google.com> https://github.com/rohitaggarwal007 updated https://github.com/llvm/llvm-project/pull/140544 >From 4769d05876f3d7f4a335c10e51fb20e3c923e270 Mon Sep 17 00:00:00 2001 From: Rohit Aggarwal Date: Mon, 19 May 2025 19:25:52 +0530 Subject: [PATCH 1/5] [Clang][Flang][Driver] Fix target parsing for -fveclib=AMDLIBM option --- clang/lib/Driver/ToolChains/Clang.cpp | 2 +- clang/lib/Driver/ToolChains/CommonArgs.cpp | 1 + clang/test/Driver/fveclib.c 
| 10 ++++++++++ 3 files changed, 12 insertions(+), 1 deletion(-) diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index a08bdba99bfe0..4aefdb24af17b 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -5844,7 +5844,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) << Name << Triple.getArchName(); - } else if (Name == "libmvec") { + } else if (Name == "libmvec" || Name == "AMDLIBM") { if (Triple.getArch() != llvm::Triple::x86 && Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp index 5c1bc090810a2..c499b7266a553 100644 --- a/clang/lib/Driver/ToolChains/CommonArgs.cpp +++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp @@ -935,6 +935,7 @@ void tools::addLTOOptions(const ToolChain &ToolChain, const ArgList &Args, llvm::StringSwitch>(ArgVecLib->getValue()) .Case("Accelerate", "Accelerate") .Case("libmvec", "LIBMVEC") + .Case("AMDLIBM", "AMDLIBM") .Case("MASSV", "MASSV") .Case("SVML", "SVML") .Case("SLEEF", "sleefgnuabi") diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c index 1235d08a3e139..5420555c36a2a 100644 --- a/clang/test/Driver/fveclib.c +++ b/clang/test/Driver/fveclib.c @@ -1,6 +1,7 @@ // RUN: %clang -### -c -fveclib=none %s 2>&1 | FileCheck --check-prefix=CHECK-NOLIB %s // RUN: %clang -### -c -fveclib=Accelerate %s 2>&1 | FileCheck --check-prefix=CHECK-ACCELERATE %s // RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-libmvec %s +// RUN: %clang -### -c --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-AMDLIBM %s // RUN: %clang -### -c -fveclib=MASSV %s 2>&1 | FileCheck --check-prefix=CHECK-MASSV %s // RUN: 
%clang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck --check-prefix=CHECK-DARWIN_LIBSYSTEM_M %s // RUN: %clang -### -c --target=aarch64 -fveclib=SLEEF %s 2>&1 | FileCheck --check-prefix=CHECK-SLEEF %s @@ -11,6 +12,7 @@ // CHECK-NOLIB: "-fveclib=none" // CHECK-ACCELERATE: "-fveclib=Accelerate" // CHECK-libmvec: "-fveclib=libmvec" +// CHECK-AMDLIBM: "-fveclib=AMDLIBM" // CHECK-MASSV: "-fveclib=MASSV" // CHECK-DARWIN_LIBSYSTEM_M: "-fveclib=Darwin_libsystem_m" // CHECK-SLEEF: "-fveclib=SLEEF" @@ -23,6 +25,7 @@ // RUN: not %clang --target=x86 -c -fveclib=ArmPL %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s // RUN: not %clang --target=aarch64 -c -fveclib=libmvec %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s // RUN: not %clang --target=aarch64 -c -fveclib=SVML %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s +// RUN: not %clang --target=aarch64 -c -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERROR %s // CHECK-ERROR: unsupported option {{.*}} for target // RUN: %clang -fveclib=Accelerate %s -target arm64-apple-ios8.0.0 -### 2>&1 | FileCheck --check-prefix=CHECK-LINK %s @@ -40,6 +43,9 @@ // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=libmvec -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC" +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-AMDLIBM %s +// CHECK-LTO-AMDLIBM: "-plugin-opt=-vector-library=AMDLIBM" + // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV" @@ -62,6 +68,10 @@ // CHECK-ERRNO-LIBMVEC: "-fveclib=libmvec" // CHECK-ERRNO-LIBMVEC-SAME: "-fmath-errno" +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-AMDLIBM %s +// CHECK-ERRNO-AMDLIBM: "-fveclib=AMDLIBM" +// 
CHECK-ERRNO-AMDLIBM-SAME: "-fmath-errno" + // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV %s 2>&1 | FileCheck --check-prefix=CHECK-ERRNO-MASSV %s // CHECK-ERRNO-MASSV: "-fveclib=MASSV" // CHECK-ERRNO-MASSV-SAME: "-fmath-errno" >From 4505686899f24686aca32dfd53bd79e2908e185a Mon Sep 17 00:00:00 2001 From: Rohit Aggarwal Date: Mon, 19 May 2025 23:30:52 +0530 Subject: [PATCH 2/5] Update -fveclib=AMDLIBM in flang --- clang/lib/Driver/ToolChains/Flang.cpp | 2 +- flang/lib/Frontend/CompilerInvocation.cpp | 1 + flang/test/Driver/fveclib.f90 | 3 +++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index b1ca747e68b89..0bd8d0c85e50a 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -484,7 +484,7 @@ void Flang::addTargetOptions(const ArgList &Args, Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) << Name << Triple.getArchName(); - } else if (Name == "libmvec") { + } else if (Name == "libmvec" || Name == "AMDLIBM") { if (Triple.getArch() != llvm::Triple::x86 && Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 238079a09ef3a..a5df5b02fcf9f 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -196,6 +196,7 @@ static bool parseVectorLibArg(Fortran::frontend::CodeGenOptions &opts, llvm::StringSwitch>(arg->getValue()) .Case("Accelerate", VectorLibrary::Accelerate) .Case("libmvec", VectorLibrary::LIBMVEC) + .Case("AMDLIBM", VectorLibrary::AMDLIBM) .Case("MASSV", VectorLibrary::MASSV) .Case("SVML", VectorLibrary::SVML) .Case("SLEEF", VectorLibrary::SLEEF) diff --git a/flang/test/Driver/fveclib.f90 b/flang/test/Driver/fveclib.f90 index 1b536b8ad0f18..6cb9361f7b778 100644 --- 
a/flang/test/Driver/fveclib.f90 +++ b/flang/test/Driver/fveclib.f90 @@ -1,6 +1,7 @@ ! RUN: %flang -### -c -fveclib=none %s 2>&1 | FileCheck -check-prefix CHECK-NOLIB %s ! RUN: %flang -### -c -fveclib=Accelerate %s 2>&1 | FileCheck -check-prefix CHECK-ACCELERATE %s ! RUN: %flang -### -c --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck -check-prefix CHECK-libmvec %s +! RUN: %flang -### -c --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck -check-prefix CHECK-AMDLIBM %s ! RUN: %flang -### -c -fveclib=MASSV %s 2>&1 | FileCheck -check-prefix CHECK-MASSV %s ! RUN: %flang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck -check-prefix CHECK-DARWIN_LIBSYSTEM_M %s ! RUN: %flang -### -c --target=aarch64-none-none -fveclib=SLEEF %s 2>&1 | FileCheck -check-prefix CHECK-SLEEF %s @@ -11,6 +12,7 @@ ! CHECK-NOLIB: "-fveclib=none" ! CHECK-ACCELERATE: "-fveclib=Accelerate" ! CHECK-libmvec: "-fveclib=libmvec" +! CHECK-AMDLIBM: "-fveclib=AMDLIBM" ! CHECK-MASSV: "-fveclib=MASSV" ! CHECK-DARWIN_LIBSYSTEM_M: "-fveclib=Darwin_libsystem_m" ! CHECK-SLEEF: "-fveclib=SLEEF" @@ -22,6 +24,7 @@ ! RUN: not %flang --target=x86-none-none -c -fveclib=SLEEF %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=x86-none-none -c -fveclib=ArmPL %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=aarch64-none-none -c -fveclib=libmvec %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s +! RUN: not %flang --target=aarch64-none-none -c -fveclib=AMDLIBM %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=aarch64-none-none -c -fveclib=SVML %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! 
CHECK-ERROR: unsupported option {{.*}} for target >From a135380263af7a4c484f0e326027b2e3ce3f1b63 Mon Sep 17 00:00:00 2001 From: Rohit Aggarwal Date: Tue, 20 May 2025 11:01:05 +0530 Subject: [PATCH 3/5] Remove flang related changes as it is already taken care --- clang/lib/Driver/ToolChains/Flang.cpp | 2 +- flang/lib/Frontend/CompilerInvocation.cpp | 1 - flang/test/Driver/fveclib.f90 | 3 --- 3 files changed, 1 insertion(+), 5 deletions(-) diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index 0bd8d0c85e50a..b1ca747e68b89 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -484,7 +484,7 @@ void Flang::addTargetOptions(const ArgList &Args, Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) << Name << Triple.getArchName(); - } else if (Name == "libmvec" || Name == "AMDLIBM") { + } else if (Name == "libmvec") { if (Triple.getArch() != llvm::Triple::x86 && Triple.getArch() != llvm::Triple::x86_64) D.Diag(diag::err_drv_unsupported_opt_for_target) diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index a5df5b02fcf9f..238079a09ef3a 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -196,7 +196,6 @@ static bool parseVectorLibArg(Fortran::frontend::CodeGenOptions &opts, llvm::StringSwitch>(arg->getValue()) .Case("Accelerate", VectorLibrary::Accelerate) .Case("libmvec", VectorLibrary::LIBMVEC) - .Case("AMDLIBM", VectorLibrary::AMDLIBM) .Case("MASSV", VectorLibrary::MASSV) .Case("SVML", VectorLibrary::SVML) .Case("SLEEF", VectorLibrary::SLEEF) diff --git a/flang/test/Driver/fveclib.f90 b/flang/test/Driver/fveclib.f90 index 6cb9361f7b778..1b536b8ad0f18 100644 --- a/flang/test/Driver/fveclib.f90 +++ b/flang/test/Driver/fveclib.f90 @@ -1,7 +1,6 @@ ! RUN: %flang -### -c -fveclib=none %s 2>&1 | FileCheck -check-prefix CHECK-NOLIB %s ! 
RUN: %flang -### -c -fveclib=Accelerate %s 2>&1 | FileCheck -check-prefix CHECK-ACCELERATE %s ! RUN: %flang -### -c --target=x86_64-unknown-linux-gnu -fveclib=libmvec %s 2>&1 | FileCheck -check-prefix CHECK-libmvec %s -! RUN: %flang -### -c --target=x86_64-unknown-linux-gnu -fveclib=AMDLIBM %s 2>&1 | FileCheck -check-prefix CHECK-AMDLIBM %s ! RUN: %flang -### -c -fveclib=MASSV %s 2>&1 | FileCheck -check-prefix CHECK-MASSV %s ! RUN: %flang -### -c -fveclib=Darwin_libsystem_m %s 2>&1 | FileCheck -check-prefix CHECK-DARWIN_LIBSYSTEM_M %s ! RUN: %flang -### -c --target=aarch64-none-none -fveclib=SLEEF %s 2>&1 | FileCheck -check-prefix CHECK-SLEEF %s @@ -12,7 +11,6 @@ ! CHECK-NOLIB: "-fveclib=none" ! CHECK-ACCELERATE: "-fveclib=Accelerate" ! CHECK-libmvec: "-fveclib=libmvec" -! CHECK-AMDLIBM: "-fveclib=AMDLIBM" ! CHECK-MASSV: "-fveclib=MASSV" ! CHECK-DARWIN_LIBSYSTEM_M: "-fveclib=Darwin_libsystem_m" ! CHECK-SLEEF: "-fveclib=SLEEF" @@ -24,7 +22,6 @@ ! RUN: not %flang --target=x86-none-none -c -fveclib=SLEEF %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=x86-none-none -c -fveclib=ArmPL %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=aarch64-none-none -c -fveclib=libmvec %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s -! RUN: not %flang --target=aarch64-none-none -c -fveclib=AMDLIBM %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! RUN: not %flang --target=aarch64-none-none -c -fveclib=SVML %s 2>&1 | FileCheck -check-prefix CHECK-ERROR %s ! 
CHECK-ERROR: unsupported option {{.*}} for target >From ed8a1f91f66407891ac6e37e56c16000f923bbc0 Mon Sep 17 00:00:00 2001 From: Rohit Aggarwal Date: Tue, 20 May 2025 12:08:31 +0530 Subject: [PATCH 4/5] Fix and update the bitcounts for veclib enum --- clang/include/clang/Basic/CodeGenOptions.def | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index 3f31d058f95a8..aad4e107cbeb3 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -389,7 +389,7 @@ ENUM_CODEGENOPT(Inlining, InliningMethod, 2, NormalInlining) VALUE_CODEGENOPT(InlineMaxStackSize, 32, UINT_MAX) // Vector functions library to use. -ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, llvm::driver::VectorLibrary::NoLibrary) +ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 4, llvm::driver::VectorLibrary::NoLibrary) /// The default TLS model to use. ENUM_CODEGENOPT(DefaultTLSModel, TLSModel, 2, GeneralDynamicTLSModel) >From 57fecd2678701f92735550bff2e7c3adbbb1c3f7 Mon Sep 17 00:00:00 2001 From: Rohit Aggarwal Date: Tue, 20 May 2025 16:28:40 +0530 Subject: [PATCH 5/5] Updated the comment regarding the vector library support --- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index c51476e9ad3fe..ee52645f2e51b 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -19,6 +19,19 @@ class TargetLibraryInfoImpl; } // namespace llvm namespace llvm::driver { +// The current supported vector libraries in enum \VectorLibrary are 9(including +// the NoLibrary). Changing the bitcount from 3 to 4 so that more than 8 values +// can be supported. 
Now the maximum number of vector libraries supported +// increase from 8(2^3) to 16(2^4). +// +// ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, +// 4, llvm::driver::VectorLibrary::NoLibrary) is the +// currently defined in clang/include/clang/Basic/CodeGenOptions.def +// bitcount is the number of bits used to represent the enum value. +// +// IMPORTANT NOTE: When adding a new vector library support, and if count of +// supported vector libraries crosses the current max limit. Please increment +// the bitcount value. /// Vector library option used with -fveclib= enum class VectorLibrary { From flang-commits at lists.llvm.org Tue May 20 04:00:19 2025 From: flang-commits at lists.llvm.org (Rohit Aggarwal via flang-commits) Date: Tue, 20 May 2025 04:00:19 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544) In-Reply-To: Message-ID: <682c60c3.a70a0220.3101b8.043b@mx.google.com> ================ @@ -389,7 +389,7 @@ ENUM_CODEGENOPT(Inlining, InliningMethod, 2, NormalInlining) VALUE_CODEGENOPT(InlineMaxStackSize, 32, UINT_MAX) // Vector functions library to use. -ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, llvm::driver::VectorLibrary::NoLibrary) +ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 4, llvm::driver::VectorLibrary::NoLibrary) ---------------- rohitaggarwal007 wrote: @paulwalker-arm Done https://github.com/llvm/llvm-project/pull/140544 From flang-commits at lists.llvm.org Tue May 20 04:03:16 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 20 May 2025 04:03:16 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [flang][OpenMP] Support MLIR lowering of linear clause for omp.wsloop (PR #139385) In-Reply-To: Message-ID: <682c6174.170a0220.1c336d.43de@mx.google.com> tblah wrote: > Not sure how the `TEST 'lldb-api :: tools/lldb-dap/launch/TestDAP_launch.py' FAILED` failure is related to this merge. 
I did confirm the linux and windows checks passed before merging. The CI runners can be a bit unreliable. Keep an eye out for any other reports but it sounds very likely that this failure was unrelated to your patch. https://github.com/llvm/llvm-project/pull/139385 From flang-commits at lists.llvm.org Tue May 20 04:06:30 2025 From: flang-commits at lists.llvm.org (Yang Zaizhou via flang-commits) Date: Tue, 20 May 2025 04:06:30 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] fix crash on sematic error in atomic capture clause (PR #140710) Message-ID: https://github.com/Mxfg-incense created https://github.com/llvm/llvm-project/pull/140710 Fix a crash caused by an invalid expression in the atomic capture clause, due to the `checkForSymbolMatch` function not accounting for `GetExpr` potentially returning null. Fix https://github.com/llvm/llvm-project/issues/139884 >From b91933df49b2012f0bd5781061109d7e4d71f65c Mon Sep 17 00:00:00 2001 From: Zaizhou Yang Date: Tue, 20 May 2025 18:36:25 +0800 Subject: [PATCH] [Flang][OpenMP] fix crash on sematic error in atomic capture clause --- flang/include/flang/Semantics/tools.h | 19 +++--- flang/lib/Lower/OpenACC.cpp | 4 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 3 +- flang/lib/Semantics/check-omp-structure.cpp | 62 ++++++++++--------- .../OpenMP/atomic-capture-invalid.f90 | 22 +++++++ 5 files changed, 66 insertions(+), 44 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..3839bc1d2a215 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -764,19 +764,14 @@ inline bool checkForSingleVariableOnRHS( return designator != nullptr; } -/// Checks if the symbol on the LHS of the assignment statement is present in -/// the RHS expression. 
-inline bool checkForSymbolMatch( - const Fortran::parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - const auto *e{Fortran::semantics::GetExpr(expr)}; - const auto *v{Fortran::semantics::GetExpr(var)}; - auto varSyms{Fortran::evaluate::GetSymbolVector(*v)}; - const Fortran::semantics::Symbol &varSymbol{*varSyms.front()}; +/// Checks if the symbol on the LHS is present in the RHS expression. +inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, + const Fortran::semantics::SomeExpr *rhs) { + auto lhsSyms{Fortran::evaluate::GetSymbolVector(*lhs)}; + const Fortran::semantics::Symbol &lhsSymbol{*lhsSyms.front()}; for (const Fortran::semantics::Symbol &symbol : - Fortran::evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { + Fortran::evaluate::GetSymbolVector(*rhs)) { + if (lhsSymbol == symbol) { return true; } } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index bc94e860ff10b..052d29e875444 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -654,7 +654,9 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + if (Fortran::semantics::checkForSymbolMatch( + Fortran::semantics::GetExpr(stmt2Var), + Fortran::semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] const Fortran::semantics::SomeExpr &fromExpr = *Fortran::semantics::GetExpr(stmt1Expr); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 02c09d4eea041..5a975384bd371 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3198,7 +3198,8 @@ static void 
genAtomicCapture(lower::AbstractConverter &converter, mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { + if (semantics::checkForSymbolMatch(semantics::GetExpr(stmt2Var), + semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type elementType = converter.genType(fromExpr); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c6c4fdf8a8198..3f8980b226174 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2910,45 +2910,47 @@ void OmpStructureChecker::CheckAtomicCaptureConstruct( .v.statement; const auto &stmt1Var{std::get(stmt1.t)}; const auto &stmt1Expr{std::get(stmt1.t)}; + const auto *v1 = GetExpr(context_, stmt1Var); + const auto *e1 = GetExpr(context_, stmt1Expr); const parser::AssignmentStmt &stmt2 = std::get(atomicCaptureConstruct.t) .v.statement; const auto &stmt2Var{std::get(stmt2.t)}; const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); + const auto *v2 = GetExpr(context_, stmt2Var); + const auto *e2 = GetExpr(context_, stmt2Expr); + + if (e1 && v1 && e2 && v2) { + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + CheckAtomicCaptureStmt(stmt1); + if (semantics::checkForSymbolMatch(v2, e2)) { + // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] + CheckAtomicUpdateStmt(stmt2); + } else { + // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] + CheckAtomicWriteStmt(stmt2); + } + if (!(*e1 == *v2)) { 
+ context_.Say(stmt1Expr.source, + "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, + stmt1Expr.source); + } + } else if (semantics::checkForSymbolMatch(v1, e1) && + semantics::checkForSingleVariableOnRHS(stmt2)) { + // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] + CheckAtomicUpdateStmt(stmt1); + CheckAtomicCaptureStmt(stmt2); + // Variable updated in stmt1 should be captured in stmt2 + if (!(*v1 == *e2)) { + context_.Say(stmt1Var.GetSource(), + "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, + stmt1Var.GetSource()); + } } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); + "Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } - } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } } diff --git a/flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 b/flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 new file mode 100644 index 0000000000000..cb9c73cc940db --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 @@ -0,0 +1,22 @@ +! REQUIRES: openmp_runtime + +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags +! Semantic checks on invalid atomic capture clause + +use omp_lib + logical x + complex y + !$omp atomic capture + !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches operand types LOGICAL(4) and COMPLEX(4) + x = y + !ERROR: Operands of + must be numeric; have COMPLEX(4) and LOGICAL(4) + y = y + x + !$omp end atomic + + !$omp atomic capture + !ERROR: Operands of + must be numeric; have COMPLEX(4) and LOGICAL(4) + y = y + x + !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches operand types LOGICAL(4) and COMPLEX(4) + x = y + !$omp end atomic +end From flang-commits at lists.llvm.org Tue May 20 04:07:05 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 04:07:05 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] fix crash on sematic error in atomic capture clause (PR #140710) In-Reply-To: Message-ID: <682c6259.170a0220.35a7bb.4c9e@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Yang Zaizhou (Mxfg-incense)
Changes Fix a crash caused by an invalid expression in the atomic capture clause, due to the `checkForSymbolMatch` function not accounting for `GetExpr` potentially returning null. Fix https://github.com/llvm/llvm-project/issues/139884 --- Full diff: https://github.com/llvm/llvm-project/pull/140710.diff 5 Files Affected: - (modified) flang/include/flang/Semantics/tools.h (+7-12) - (modified) flang/lib/Lower/OpenACC.cpp (+3-1) - (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+2-1) - (modified) flang/lib/Semantics/check-omp-structure.cpp (+32-30) - (added) flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 (+22) ``````````diff diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..3839bc1d2a215 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -764,19 +764,14 @@ inline bool checkForSingleVariableOnRHS( return designator != nullptr; } -/// Checks if the symbol on the LHS of the assignment statement is present in -/// the RHS expression. -inline bool checkForSymbolMatch( - const Fortran::parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - const auto *e{Fortran::semantics::GetExpr(expr)}; - const auto *v{Fortran::semantics::GetExpr(var)}; - auto varSyms{Fortran::evaluate::GetSymbolVector(*v)}; - const Fortran::semantics::Symbol &varSymbol{*varSyms.front()}; +/// Checks if the symbol on the LHS is present in the RHS expression. 
+inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, + const Fortran::semantics::SomeExpr *rhs) { + auto lhsSyms{Fortran::evaluate::GetSymbolVector(*lhs)}; + const Fortran::semantics::Symbol &lhsSymbol{*lhsSyms.front()}; for (const Fortran::semantics::Symbol &symbol : - Fortran::evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { + Fortran::evaluate::GetSymbolVector(*rhs)) { + if (lhsSymbol == symbol) { return true; } } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index bc94e860ff10b..052d29e875444 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -654,7 +654,9 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + if (Fortran::semantics::checkForSymbolMatch( + Fortran::semantics::GetExpr(stmt2Var), + Fortran::semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] const Fortran::semantics::SomeExpr &fromExpr = *Fortran::semantics::GetExpr(stmt1Expr); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 02c09d4eea041..5a975384bd371 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3198,7 +3198,8 @@ static void genAtomicCapture(lower::AbstractConverter &converter, mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { + if (semantics::checkForSymbolMatch(semantics::GetExpr(stmt2Var), + semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type 
elementType = converter.genType(fromExpr); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c6c4fdf8a8198..3f8980b226174 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2910,45 +2910,47 @@ void OmpStructureChecker::CheckAtomicCaptureConstruct( .v.statement; const auto &stmt1Var{std::get(stmt1.t)}; const auto &stmt1Expr{std::get(stmt1.t)}; + const auto *v1 = GetExpr(context_, stmt1Var); + const auto *e1 = GetExpr(context_, stmt1Expr); const parser::AssignmentStmt &stmt2 = std::get(atomicCaptureConstruct.t) .v.statement; const auto &stmt2Var{std::get(stmt2.t)}; const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); + const auto *v2 = GetExpr(context_, stmt2Var); + const auto *e2 = GetExpr(context_, stmt2Expr); + + if (e1 && v1 && e2 && v2) { + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + CheckAtomicCaptureStmt(stmt1); + if (semantics::checkForSymbolMatch(v2, e2)) { + // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] + CheckAtomicUpdateStmt(stmt2); + } else { + // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] + CheckAtomicWriteStmt(stmt2); + } + if (!(*e1 == *v2)) { + context_.Say(stmt1Expr.source, + "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, + stmt1Expr.source); + } + } else if (semantics::checkForSymbolMatch(v1, e1) && + semantics::checkForSingleVariableOnRHS(stmt2)) { + // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] + CheckAtomicUpdateStmt(stmt1); + CheckAtomicCaptureStmt(stmt2); + // Variable updated in stmt1 should be captured in stmt2 + if 
(!(*v1 == *e2)) { + context_.Say(stmt1Var.GetSource(), + "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, + stmt1Var.GetSource()); + } } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); + "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } - } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } } diff --git a/flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 b/flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 new file mode 100644 index 0000000000000..cb9c73cc940db --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 @@ -0,0 +1,22 @@ +! REQUIRES: openmp_runtime + +! 
RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags +! Semantic checks on invalid atomic capture clause + +use omp_lib + logical x + complex y + !$omp atomic capture + !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches operand types LOGICAL(4) and COMPLEX(4) + x = y + !ERROR: Operands of + must be numeric; have COMPLEX(4) and LOGICAL(4) + y = y + x + !$omp end atomic + + !$omp atomic capture + !ERROR: Operands of + must be numeric; have COMPLEX(4) and LOGICAL(4) + y = y + x + !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches operand types LOGICAL(4) and COMPLEX(4) + x = y + !$omp end atomic +end ``````````
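The crash fixed here comes down to dereferencing a pointer that can be null when the expression failed semantic analysis. The patch guards at the call site (`if (e1 && v1 && e2 && v2)`) and leaves `checkForSymbolMatch` itself unguarded. As a minimal sketch only, the same idea can be folded into the helper. Everything below is hypothetical: `Expr` and `lhsSymbolAppearsInRhs` are illustrative stand-ins, not flang types or APIs, and the empty-symbol-list check is an extra assumption that the patch does not make.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for an analyzed expression: only the names of the
// symbols it references, in order of appearance. Not a flang type.
struct Expr {
  std::vector<std::string> symbols;
};

// Returns true when the first symbol of `lhs` also appears in `rhs`.
// A null pointer models an expression that failed semantic analysis; in that
// case (or when `lhs` references no symbols) there is nothing to match, so
// the function returns false instead of dereferencing.
bool lhsSymbolAppearsInRhs(const Expr *lhs, const Expr *rhs) {
  if (!lhs || !rhs || lhs->symbols.empty()) {
    return false;
  }
  const std::string &lhsSymbol = lhs->symbols.front();
  return std::find(rhs->symbols.begin(), rhs->symbols.end(), lhsSymbol) !=
      rhs->symbols.end();
}
```

Guarding inside the helper would protect every caller, including the lowering code, at the cost of treating "analysis failed" and "no match" the same way; guarding at the call site, as the patch does, keeps the two cases distinct.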
https://github.com/llvm/llvm-project/pull/140710 From flang-commits at lists.llvm.org Tue May 20 04:07:06 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 04:07:06 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] fix crash on sematic error in atomic capture clause (PR #140710) In-Reply-To: Message-ID: <682c625a.170a0220.16df54.f072@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Yang Zaizhou (Mxfg-incense)
https://github.com/llvm/llvm-project/pull/140710 From flang-commits at lists.llvm.org Tue May 20 04:12:21 2025 From: flang-commits at lists.llvm.org (Yang Zaizhou via flang-commits) Date: Tue, 20 May 2025 04:12:21 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] fix crash on sematic error in atomic capture clause (PR #140710) In-Reply-To: Message-ID: <682c6395.170a0220.2a2229.d06f@mx.google.com> Mxfg-incense wrote: @harishch4 @kiranchandramohan A patch on semantic checks for the atomic capture clause. https://github.com/llvm/llvm-project/pull/140710 From flang-commits at lists.llvm.org Tue May 20 04:18:11 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 20 May 2025 04:18:11 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] fix crash on sematic error in atomic capture clause (PR #140710) In-Reply-To: Message-ID: <682c64f3.050a0220.196fe4.048d@mx.google.com> https://github.com/tblah approved this pull request. LGTM, thanks for the fix! https://github.com/llvm/llvm-project/pull/140710 From flang-commits at lists.llvm.org Tue May 20 04:20:58 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Tue, 20 May 2025 04:20:58 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <682c659a.050a0220.304e32.fafe@mx.google.com> https://github.com/abidh updated https://github.com/llvm/llvm-project/pull/138039 >From e03838394cdbe41959b41977b8b083db5a4c3764 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Tue, 29 Apr 2025 12:30:42 +0100 Subject: [PATCH 1/3] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. DeclareOps are present for the variables mapped into the target region, which allows us to generate debug information for them. But the TargetOp is still part of the parent function, and those variables get the parent function's DISubprogram as a scope.
In OMPIRBuilder, a new function is created for the TargetOp. We also create a new DISubprogram for it. All the variables that were in the target region now have to be updated to have the correct scope. This after-the-fact updating of debug information becomes very difficult in certain cases. Take the example of variable-size arrays. The type of those arrays depends on the artificial DILocalVariable(s) which hold the size(s) of the array. This new function will now require that we generate new variables and new types. A similar issue exists for character type variables too. To avoid this after-the-fact updating, this PR generates a DISubprogramAttr for the TargetOp while generating the debug info in flang. This helps us avoid updating later. This PR is the flang side of the change. I will open another PR which will make the required changes in OMPIRBuilder.
---
 .../lib/Optimizer/Transforms/AddDebugInfo.cpp | 99 ++++++++++++++++++-
 .../test/Transforms/debug-omp-target-op-1.fir | 35 +++++++
 .../test/Transforms/debug-omp-target-op-2.fir | 53 ++++++++++
 3 files changed, 186 insertions(+), 1 deletion(-)
 create mode 100644 flang/test/Transforms/debug-omp-target-op-1.fir
 create mode 100644 flang/test/Transforms/debug-omp-target-op-2.fir

diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
index c479c1a0892b5..8e7ae4383bfdc 100644
--- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
+++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
@@ -34,6 +34,7 @@
 #include "llvm/BinaryFormat/Dwarf.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/FileSystem.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/raw_ostream.h"
 
@@ -103,6 +104,37 @@ bool debugInfoIsAlreadySet(mlir::Location loc) {
   return false;
 }
 
+// Generates the name for the artificial DISubprogram that we are going to
+// generate for omp::TargetOp. 
Its logic is borrowed from +// getTargetEntryUniqueInfo and +// TargetRegionEntryInfo::getTargetRegionEntryFnName to generate the same name. +// But even if there was a slight mismatch, it is not a problem because this +// name is artifical and not important to debug experience. +mlir::StringAttr getTargetFunctionName(mlir::MLIRContext *context, + mlir::Location Loc, + llvm::StringRef parentName) { + auto fileLoc = Loc->findInstanceOf(); + + assert(fileLoc && "No file found from location"); + llvm::StringRef fileName = fileLoc.getFilename().getValue(); + + llvm::sys::fs::UniqueID id; + uint64_t line = fileLoc.getLine(); + size_t fileId; + size_t deviceId; + if (auto ec = llvm::sys::fs::getUniqueID(fileName, id)) { + fileId = llvm::hash_value(fileName.str()); + deviceId = 0xdeadf17e; + } else { + fileId = id.getFile(); + deviceId = id.getDevice(); + } + return mlir::StringAttr::get( + context, + std::string(llvm::formatv("__omp_offloading_{0:x-}_{1:x-}_{2}_l{3}", + deviceId, fileId, parentName, line))); +} + } // namespace bool AddDebugInfoPass::createCommonBlockGlobal( @@ -510,8 +542,73 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, subTypeAttr, entities, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); + /* When we process the DeclareOp inside the OpenMP target region, all the + variables get the DISubprogram of the parent function of the target op as + the scope. In the codegen (to llvm ir), OpenMP target op results in the + creation of a separate function. As the variables in the debug info have + the DISubprogram of the parent function as the scope, the variables + need to be updated at codegen time to avoid verification failures. + + This updating after the fact becomes more and more difficult when types + are dependent on local variables like in the case of variable size arrays + or string. We not only have to generate new variables but also new types. 
+ We can avoid this problem by generating a DISubprogramAttr here for the + target op and make sure that all the variables inside the target region + get the correct scope in the first place. */ + funcOp.walk([&](mlir::omp::TargetOp targetOp) { + unsigned line = getLineFromLoc(targetOp.getLoc()); + mlir::StringAttr Name = + getTargetFunctionName(context, targetOp.getLoc(), funcOp.getName()); + mlir::LLVM::DISubprogramFlags flags = + mlir::LLVM::DISubprogramFlags::Definition | + mlir::LLVM::DISubprogramFlags::LocalToUnit; + if (isOptimized) + flags = flags | mlir::LLVM::DISubprogramFlags::Optimized; + + mlir::DistinctAttr recId = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + mlir::DistinctAttr Id = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + llvm::SmallVector types; + types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); + mlir::LLVM::DISubroutineTypeAttr spTy = + mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types); + auto spAttr = mlir::LLVM::DISubprogramAttr::get( + context, recId, /*isRecSelf=*/true, Id, compilationUnit, Scope, Name, + Name, funcFileAttr, line, line, flags, spTy, /*retainedNodes=*/{}, + /*annotations=*/{}); + + // Make sure that information about the imported modules in copied from the + // parent function. 
+ llvm::SmallVector OpEntities; + for (mlir::LLVM::DINodeAttr N : entities) { + if (auto entity = mlir::dyn_cast(N)) { + auto importedEntity = mlir::LLVM::DIImportedEntityAttr::get( + context, llvm::dwarf::DW_TAG_imported_module, spAttr, + entity.getEntity(), fileAttr, /*line=*/1, /*name=*/nullptr, + /*elements*/ {}); + OpEntities.push_back(importedEntity); + } + } + + Id = mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + spAttr = mlir::LLVM::DISubprogramAttr::get( + context, recId, /*isRecSelf=*/false, Id, compilationUnit, Scope, Name, + Name, funcFileAttr, line, line, flags, spTy, OpEntities, + /*annotations=*/{}); + targetOp->setLoc(builder.getFusedLoc({targetOp.getLoc()}, spAttr)); + }); + funcOp.walk([&](fir::cg::XDeclareOp declOp) { - handleDeclareOp(declOp, fileAttr, spAttr, typeGen, symbolTable); + mlir::LLVM::DISubprogramAttr spTy = spAttr; + if (auto tOp = declOp->getParentOfType()) { + if (auto fusedLoc = llvm::dyn_cast(tOp.getLoc())) { + if (auto sp = llvm::dyn_cast( + fusedLoc.getMetadata())) + spTy = sp; + } + } + handleDeclareOp(declOp, fileAttr, spTy, typeGen, symbolTable); }); // commonBlockMap ensures that we don't create multiple DICommonBlockAttr of // the same name in one function. 
But it is ok (rather required) to create diff --git a/flang/test/Transforms/debug-omp-target-op-1.fir b/flang/test/Transforms/debug-omp-target-op-1.fir new file mode 100644 index 0000000000000..bb586cdf6e9ab --- /dev/null +++ b/flang/test/Transforms/debug-omp-target-op-1.fir @@ -0,0 +1,35 @@ +// RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s + +module attributes {dlti.dl_spec = #dlti.dl_spec<>} { + func.func @_QQmain() attributes {fir.bindc_name = "test"} { + %c13_i32 = arith.constant 13 : i32 + %c12_i32 = arith.constant 12 : i32 + %c6_i32 = arith.constant 6 : i32 + %c1_i32 = arith.constant 1 : i32 + %c5_i32 = arith.constant 5 : i32 + %0 = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFEx"} loc(#loc1) + %1 = fircg.ext_declare %0 {uniq_name = "_QFEx"} : (!fir.ref) -> !fir.ref loc(#loc1) + %2 = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFEy"} loc(#loc2) + %3 = fircg.ext_declare %2 {uniq_name = "_QFEy"} : (!fir.ref) -> !fir.ref loc(#loc2) + %4 = omp.map.info var_ptr(%1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref {name = "x"} + %5 = omp.map.info var_ptr(%3 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref {name = "y"} + omp.target map_entries(%4 -> %arg0, %5 -> %arg1 : !fir.ref, !fir.ref) { + %16 = fircg.ext_declare %arg0 {uniq_name = "_QFEx"} : (!fir.ref) -> !fir.ref loc(#loc3) + %17 = fircg.ext_declare %arg1 {uniq_name = "_QFEy"} : (!fir.ref) -> !fir.ref loc(#loc4) + omp.terminator + } loc(#loc5) + return + } +} +#loc1 = loc("test.f90":1:1) +#loc2 = loc("test.f90":3:1) +#loc3 = loc("test.f90":7:1) +#loc4 = loc("test.f90":8:1) +#loc5 = loc("test.f90":6:1) + +// CHECK: #[[SP:.*]] = #llvm.di_subprogram<{{.*}}name = "test"{{.*}}> +// CHECK: #[[SP1:.*]] = #llvm.di_subprogram<{{.*}}name = "__omp_offloading_{{.*}}_QQmain_l6"{{.*}}line = 6{{.*}}subprogramFlags = "LocalToUnit|Definition"{{.*}}> +// CHECK: #llvm.di_local_variable +// CHECK: #llvm.di_local_variable +// CHECK: #llvm.di_local_variable +// CHECK: 
#llvm.di_local_variable diff --git a/flang/test/Transforms/debug-omp-target-op-2.fir b/flang/test/Transforms/debug-omp-target-op-2.fir new file mode 100644 index 0000000000000..15dcf2389b21d --- /dev/null +++ b/flang/test/Transforms/debug-omp-target-op-2.fir @@ -0,0 +1,53 @@ +// RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s + +module attributes {dlti.dl_spec = #dlti.dl_spec<>} { + func.func @fn_(%arg0: !fir.ref> {fir.bindc_name = "b"}, %arg1: !fir.ref {fir.bindc_name = "c"}, %arg2: !fir.ref {fir.bindc_name = "d"}) { + %c1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %0 = fir.alloca i32 + %1 = fir.alloca i32 + %2 = fir.undefined !fir.dscope + %3 = fircg.ext_declare %arg1 dummy_scope %2 {uniq_name = "_QFfnEc"} : (!fir.ref, !fir.dscope) -> !fir.ref loc(#loc2) + %4 = fircg.ext_declare %arg2 dummy_scope %2 {uniq_name = "_QFfnEd"} : (!fir.ref, !fir.dscope) -> !fir.ref loc(#loc3) + %5 = fir.load %3 : !fir.ref + %6 = fir.convert %5 : (i32) -> index + %9 = fir.load %4 : !fir.ref + %10 = fir.convert %9 : (i32) -> index + %15 = fircg.ext_declare %arg0(%6, %10) dummy_scope %2 {uniq_name = "_QFfnEb"} : (!fir.ref>, index, index, !fir.dscope) -> !fir.ref> loc(#loc4) + %16 = fircg.ext_embox %15(%6, %10) : (!fir.ref>, index, index) -> !fir.box> + %17:3 = fir.box_dims %16, %c0 : (!fir.box>, index) -> (index, index, index) + %18 = arith.subi %17#1, %c1 : index + %19 = omp.map.bounds lower_bound(%c0 : index) upper_bound(%18 : index) extent(%17#1 : index) stride(%17#2 : index) start_idx(%c1 : index) {stride_in_bytes = true} + %20 = arith.muli %17#2, %17#1 : index + %21:3 = fir.box_dims %16, %c1 : (!fir.box>, index) -> (index, index, index) + %22 = arith.subi %21#1, %c1 : index + %23 = omp.map.bounds lower_bound(%c0 : index) upper_bound(%22 : index) extent(%21#1 : index) stride(%20 : index) start_idx(%c1 : index) {stride_in_bytes = true} + %24 = omp.map.info var_ptr(%15 : !fir.ref>, i32) map_clauses(tofrom) capture(ByRef) bounds(%19, %23) -> 
!fir.ref> {name = "b"} + %25 = omp.map.info var_ptr(%1 : !fir.ref, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref {name = ""} + %26 = omp.map.info var_ptr(%0 : !fir.ref, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref {name = ""} + omp.target map_entries(%24 -> %arg3, %25 -> %arg4, %26 -> %arg5 : !fir.ref>, !fir.ref, !fir.ref) { + %27 = fir.load %arg5 : !fir.ref + %28 = fir.load %arg4 : !fir.ref + %29 = fir.convert %27 : (i32) -> index + %31 = fir.convert %28 : (i32) -> index + %37 = fircg.ext_declare %arg3(%29, %31) {uniq_name = "_QFfnEb"} : (!fir.ref>, index, index) -> !fir.ref> loc(#loc5) + omp.terminator + } loc(#loc6) + return + } loc(#loc7) +} +#loc1 = loc("test.f90":1:1) +#loc2 = loc("test.f90":3:1) +#loc3 = loc("test.f90":7:1) +#loc4 = loc("test.f90":8:1) +#loc5 = loc("test.f90":6:1) +#loc6 = loc("test.f90":16:1) +#loc7 = loc("test.f90":26:1) + + +// Test that variable size arrays inside target regions get their own +// compiler generated variables for size. + +// CHECK: #[[SP:.*]] = #llvm.di_subprogram<{{.*}}name = "__omp_offloading_{{.*}}_fn__l16"{{.*}}> +// CHECK: #llvm.di_local_variable +// CHECK: #llvm.di_local_variable >From 73fbd7b9ff7f94ce32d3c35cdb503652ebf7d0c5 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Sun, 18 May 2025 11:23:52 +0100 Subject: [PATCH 2/3] Handle review comments. Main change is that we also take care of the case when only line table information is requested. 
--- .../lib/Optimizer/Transforms/AddDebugInfo.cpp | 128 ++++++++++-------- .../test/Transforms/debug-omp-target-op-1.fir | 5 + 2 files changed, 75 insertions(+), 58 deletions(-) diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp index 8e7ae4383bfdc..f0971557d7256 100644 --- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp +++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp @@ -109,7 +109,7 @@ bool debugInfoIsAlreadySet(mlir::Location loc) { // getTargetEntryUniqueInfo and // TargetRegionEntryInfo::getTargetRegionEntryFnName to generate the same name. // But even if there was a slight mismatch, it is not a problem because this -// name is artifical and not important to debug experience. +// name is artificial and not important to debug experience. mlir::StringAttr getTargetFunctionName(mlir::MLIRContext *context, mlir::Location Loc, llvm::StringRef parentName) { @@ -477,6 +477,73 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, line - 1, false); } + auto addTargetOpDISP = [&](bool lineTableOnly, + const llvm::SmallVector &entities) { + // When we process the DeclareOp inside the OpenMP target region, all the + // variables get the DISubprogram of the parent function of the target op as + // the scope. In the codegen (to llvm ir), OpenMP target op results in the + // creation of a separate function. As the variables in the debug info have + // the DISubprogram of the parent function as the scope, the variables + // need to be updated at codegen time to avoid verification failures. + + // This updating after the fact becomes more and more difficult when types + // are dependent on local variables like in the case of variable size arrays + // or string. We not only have to generate new variables but also new types. 
+ // We can avoid this problem by generating a DISubprogramAttr here for the + // target op and make sure that all the variables inside the target region + // get the correct scope in the first place. + funcOp.walk([&](mlir::omp::TargetOp targetOp) { + unsigned line = getLineFromLoc(targetOp.getLoc()); + mlir::StringAttr name = + getTargetFunctionName(context, targetOp.getLoc(), funcOp.getName()); + mlir::LLVM::DISubprogramFlags flags = + mlir::LLVM::DISubprogramFlags::Definition | + mlir::LLVM::DISubprogramFlags::LocalToUnit; + if (isOptimized) + flags = flags | mlir::LLVM::DISubprogramFlags::Optimized; + + mlir::DistinctAttr id = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + llvm::SmallVector types; + types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); + mlir::LLVM::DISubroutineTypeAttr spTy = + mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types); + if (lineTableOnly) { + auto spAttr = mlir::LLVM::DISubprogramAttr::get( + context, id, compilationUnit, Scope, name, name, funcFileAttr, line, + line, flags, spTy, /*retainedNodes=*/{}, /*annotations=*/{}); + targetOp->setLoc(builder.getFusedLoc({targetOp.getLoc()}, spAttr)); + return; + } + mlir::DistinctAttr recId = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + auto spAttr = mlir::LLVM::DISubprogramAttr::get( + context, recId, /*isRecSelf=*/true, id, compilationUnit, Scope, name, + name, funcFileAttr, line, line, flags, spTy, /*retainedNodes=*/{}, + /*annotations=*/{}); + + // Make sure that information about the imported modules is copied in the + // new function. 
+ llvm::SmallVector opEntities; + for (mlir::LLVM::DINodeAttr N : entities) { + if (auto entity = mlir::dyn_cast(N)) { + auto importedEntity = mlir::LLVM::DIImportedEntityAttr::get( + context, llvm::dwarf::DW_TAG_imported_module, spAttr, + entity.getEntity(), fileAttr, /*line=*/1, /*name=*/nullptr, + /*elements*/ {}); + opEntities.push_back(importedEntity); + } + } + + id = mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + spAttr = mlir::LLVM::DISubprogramAttr::get( + context, recId, /*isRecSelf=*/false, id, compilationUnit, Scope, name, + name, funcFileAttr, line, line, flags, spTy, opEntities, + /*annotations=*/{}); + targetOp->setLoc(builder.getFusedLoc({targetOp.getLoc()}, spAttr)); + }); + }; + // Don't process variables if user asked for line tables only. if (debugLevel == mlir::LLVM::DIEmissionKind::LineTablesOnly) { auto spAttr = mlir::LLVM::DISubprogramAttr::get( @@ -484,6 +551,7 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, line, line, subprogramFlags, subTypeAttr, /*retainedNodes=*/{}, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); + addTargetOpDISP(true, {}); return; } @@ -541,63 +609,7 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, funcName, fullName, funcFileAttr, line, line, subprogramFlags, subTypeAttr, entities, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); - - /* When we process the DeclareOp inside the OpenMP target region, all the - variables get the DISubprogram of the parent function of the target op as - the scope. In the codegen (to llvm ir), OpenMP target op results in the - creation of a separate function. As the variables in the debug info have - the DISubprogram of the parent function as the scope, the variables - need to be updated at codegen time to avoid verification failures. - - This updating after the fact becomes more and more difficult when types - are dependent on local variables like in the case of variable size arrays - or string. 
We not only have to generate new variables but also new types. - We can avoid this problem by generating a DISubprogramAttr here for the - target op and make sure that all the variables inside the target region - get the correct scope in the first place. */ - funcOp.walk([&](mlir::omp::TargetOp targetOp) { - unsigned line = getLineFromLoc(targetOp.getLoc()); - mlir::StringAttr Name = - getTargetFunctionName(context, targetOp.getLoc(), funcOp.getName()); - mlir::LLVM::DISubprogramFlags flags = - mlir::LLVM::DISubprogramFlags::Definition | - mlir::LLVM::DISubprogramFlags::LocalToUnit; - if (isOptimized) - flags = flags | mlir::LLVM::DISubprogramFlags::Optimized; - - mlir::DistinctAttr recId = - mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); - mlir::DistinctAttr Id = - mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); - llvm::SmallVector types; - types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); - mlir::LLVM::DISubroutineTypeAttr spTy = - mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types); - auto spAttr = mlir::LLVM::DISubprogramAttr::get( - context, recId, /*isRecSelf=*/true, Id, compilationUnit, Scope, Name, - Name, funcFileAttr, line, line, flags, spTy, /*retainedNodes=*/{}, - /*annotations=*/{}); - - // Make sure that information about the imported modules in copied from the - // parent function. 
- llvm::SmallVector OpEntities; - for (mlir::LLVM::DINodeAttr N : entities) { - if (auto entity = mlir::dyn_cast(N)) { - auto importedEntity = mlir::LLVM::DIImportedEntityAttr::get( - context, llvm::dwarf::DW_TAG_imported_module, spAttr, - entity.getEntity(), fileAttr, /*line=*/1, /*name=*/nullptr, - /*elements*/ {}); - OpEntities.push_back(importedEntity); - } - } - - Id = mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); - spAttr = mlir::LLVM::DISubprogramAttr::get( - context, recId, /*isRecSelf=*/false, Id, compilationUnit, Scope, Name, - Name, funcFileAttr, line, line, flags, spTy, OpEntities, - /*annotations=*/{}); - targetOp->setLoc(builder.getFusedLoc({targetOp.getLoc()}, spAttr)); - }); + addTargetOpDISP(false, entities); funcOp.walk([&](fir::cg::XDeclareOp declOp) { mlir::LLVM::DISubprogramAttr spTy = spAttr; diff --git a/flang/test/Transforms/debug-omp-target-op-1.fir b/flang/test/Transforms/debug-omp-target-op-1.fir index bb586cdf6e9ab..6b895b732c42b 100644 --- a/flang/test/Transforms/debug-omp-target-op-1.fir +++ b/flang/test/Transforms/debug-omp-target-op-1.fir @@ -1,4 +1,5 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s +// RUN: fir-opt --add-debug-info="debug-level=LineTablesOnly" --mlir-print-debuginfo %s | FileCheck %s --check-prefix=LINETABLE module attributes {dlti.dl_spec = #dlti.dl_spec<>} { func.func @_QQmain() attributes {fir.bindc_name = "test"} { @@ -33,3 +34,7 @@ module attributes {dlti.dl_spec = #dlti.dl_spec<>} { // CHECK: #llvm.di_local_variable // CHECK: #llvm.di_local_variable // CHECK: #llvm.di_local_variable + +// LINETABLE: #[[SP:.*]] = #llvm.di_subprogram<{{.*}}name = "test"{{.*}}> +// LINETABLE: #[[SP1:.*]] = #llvm.di_subprogram<{{.*}}name = "__omp_offloading_{{.*}}_QQmain_l6"{{.*}}line = 6{{.*}}subprogramFlags = "LocalToUnit|Definition"{{.*}}> +// LINETABLE-NOT: #llvm.di_local_variable >From fb063cca9c4e1414090ed79818a685237226e41f Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Tue, 20 May 
2025 12:12:38 +0100 Subject: [PATCH 3/3] Handle review comments. Use the arguments to the TargetOp in the subroutine type created for it. --- flang/lib/Optimizer/Transforms/AddDebugInfo.cpp | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp index f0971557d7256..b41d4f2285091 100644 --- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp +++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp @@ -506,6 +506,12 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); llvm::SmallVector types; types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); + for (auto arg : targetOp.getRegion().getArguments()) { + auto tyAttr = typeGen.convertType(fir::unwrapRefType(arg.getType()), + fileAttr, cuAttr, /*declOp=*/nullptr); + types.push_back(tyAttr); + } + CC = llvm::dwarf::getCallingConvention("DW_CC_normal"); mlir::LLVM::DISubroutineTypeAttr spTy = mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types); if (lineTableOnly) { From flang-commits at lists.llvm.org Tue May 20 04:23:20 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Tue, 20 May 2025 04:23:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <682c6628.170a0220.160907.f6b2@mx.google.com> ================ @@ -510,8 +542,73 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, subTypeAttr, entities, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); + /* When we process the DeclareOp inside the OpenMP target region, all the + variables get the DISubprogram of the parent function of the target op as + the scope. In the codegen (to llvm ir), OpenMP target op results in the + creation of a separate function. 
As the variables in the debug info have + the DISubprogram of the parent function as the scope, the variables + need to be updated at codegen time to avoid verification failures. + + This updating after the fact becomes more and more difficult when types + are dependent on local variables like in the case of variable size arrays + or string. We not only have to generate new variables but also new types. + We can avoid this problem by generating a DISubprogramAttr here for the + target op and make sure that all the variables inside the target region + get the correct scope in the first place. */ + funcOp.walk([&](mlir::omp::TargetOp targetOp) { + unsigned line = getLineFromLoc(targetOp.getLoc()); + mlir::StringAttr Name = + getTargetFunctionName(context, targetOp.getLoc(), funcOp.getName()); + mlir::LLVM::DISubprogramFlags flags = + mlir::LLVM::DISubprogramFlags::Definition | + mlir::LLVM::DISubprogramFlags::LocalToUnit; + if (isOptimized) + flags = flags | mlir::LLVM::DISubprogramFlags::Optimized; + + mlir::DistinctAttr recId = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + mlir::DistinctAttr Id = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + llvm::SmallVector types; + types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); ---------------- abidh wrote: Thanks for pointing that out. The type of the function did not matter that much for the final DWARF that came out but I have now added the types from the arguments to the TargetOp for completeness. https://github.com/llvm/llvm-project/pull/138039 From flang-commits at lists.llvm.org Tue May 20 04:23:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 04:23:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039) In-Reply-To: Message-ID: <682c6629.170a0220.15d084.f5c7@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:
You can test this locally with the following command:

``````````bash
git-clang-format --diff HEAD~1 HEAD --extensions cpp -- flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
``````````
View the diff from clang-format here.

``````````diff
diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
index 20e890713..0f319ff55 100644
--- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
+++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
@@ -479,7 +479,8 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp,
   }
 
   auto addTargetOpDISP = [&](bool lineTableOnly,
-                             const llvm::SmallVector &entities) {
+                             const llvm::SmallVector
+                                 &entities) {
     // When we process the DeclareOp inside the OpenMP target region, all the
     // variables get the DISubprogram of the parent function of the target op as
     // the scope. In the codegen (to llvm ir), OpenMP target op results in the
``````````
https://github.com/llvm/llvm-project/pull/138039

From flang-commits at lists.llvm.org Tue May 20 04:24:26 2025
From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits)
Date: Tue, 20 May 2025 04:24:26 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039)
In-Reply-To:
Message-ID: <682c666a.a70a0220.22946c.2a96@mx.google.com>

================
----------------
abidh wrote:

That is a good catch. I have rearranged the code so that we handle both cases (lineTableOnly and FullDebug). I added a test case for it too.

https://github.com/llvm/llvm-project/pull/138039

From flang-commits at lists.llvm.org Tue May 20 04:26:30 2025
From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits)
Date: Tue, 20 May 2025 04:26:30 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039)
In-Reply-To:
Message-ID: <682c66e6.630a0220.287981.3094@mx.google.com>

================
@@ -510,8 +542,73 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, subTypeAttr, entities, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); + /* When we process the DeclareOp inside the OpenMP target region, all the + variables get the DISubprogram of the parent function of the target op as + the scope. In the codegen (to llvm ir), OpenMP target op results in the + creation of a separate function. As the variables in the debug info have + the DISubprogram of the parent function as the scope, the variables + need to be updated at codegen time to avoid verification failures. + + This updating after the fact becomes more and more difficult when types + are dependent on local variables like in the case of variable size arrays + or string. We not only have to generate new variables but also new types. 
+ We can avoid this problem by generating a DISubprogramAttr here for the + target op and make sure that all the variables inside the target region + get the correct scope in the first place. */ + funcOp.walk([&](mlir::omp::TargetOp targetOp) { + unsigned line = getLineFromLoc(targetOp.getLoc()); + mlir::StringAttr Name = + getTargetFunctionName(context, targetOp.getLoc(), funcOp.getName()); + mlir::LLVM::DISubprogramFlags flags = + mlir::LLVM::DISubprogramFlags::Definition | + mlir::LLVM::DISubprogramFlags::LocalToUnit; + if (isOptimized) + flags = flags | mlir::LLVM::DISubprogramFlags::Optimized; + + mlir::DistinctAttr recId = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + mlir::DistinctAttr Id = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + llvm::SmallVector types; + types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); + mlir::LLVM::DISubroutineTypeAttr spTy = + mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types); + auto spAttr = mlir::LLVM::DISubprogramAttr::get( + context, recId, /*isRecSelf=*/true, Id, compilationUnit, Scope, Name, + Name, funcFileAttr, line, line, flags, spTy, /*retainedNodes=*/{}, + /*annotations=*/{}); + + // Make sure that information about the imported modules in copied from the + // parent function. + llvm::SmallVector OpEntities; + for (mlir::LLVM::DINodeAttr N : entities) { + if (auto entity = mlir::dyn_cast(N)) { + auto importedEntity = mlir::LLVM::DIImportedEntityAttr::get( + context, llvm::dwarf::DW_TAG_imported_module, spAttr, + entity.getEntity(), fileAttr, /*line=*/1, /*name=*/nullptr, + /*elements*/ {}); + OpEntities.push_back(importedEntity); + } + } + + Id = mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); ---------------- abidh wrote: I will do further investigation on it because the similar code is used in other places too. 
https://github.com/llvm/llvm-project/pull/138039

From flang-commits at lists.llvm.org Tue May 20 04:29:37 2025
From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits)
Date: Tue, 20 May 2025 04:29:37 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][debug] Generate DISubprogramAttr for omp::TargetOp. (PR #138039)
In-Reply-To:
Message-ID: <682c67a1.170a0220.1446b4.f5bd@mx.google.com>

https://github.com/abidh updated https://github.com/llvm/llvm-project/pull/138039

>From e03838394cdbe41959b41977b8b083db5a4c3764 Mon Sep 17 00:00:00 2001
From: Abid Qadeer
Date: Tue, 29 Apr 2025 12:30:42 +0100
Subject: [PATCH 1/4] [flang][debug] Generate DISubprogramAttr for omp::TargetOp.

There are DeclareOp present for the variables mapped into the target region. That allows us to generate debug information for them. But the TargetOp is still part of the parent function and those variables get the parent function's DISubprogram as a scope. In OMPIRBuilder, a new function is created for the TargetOp. We also create a new DISubprogram for it. All the variables that were in the target region now have to be updated to have the correct scope. This after-the-fact updating of debug information becomes very difficult in certain cases. Take the example of variable-size arrays. The type of those arrays depends on the artificial DILocalVariable(s) which hold the size(s) of the array. This new function will now require that we generate new variables and new types. A similar issue exists for character type variables too. To avoid this after-the-fact updating, this PR generates a DISubprogramAttr for the TargetOp while generating the debug info in flang. This helps us avoid updating later. This PR is the flang side of the change. I will open another PR which will make the required changes in OMPIRBuilder.
--- .../lib/Optimizer/Transforms/AddDebugInfo.cpp | 99 ++++++++++++++++++- .../test/Transforms/debug-omp-target-op-1.fir | 35 +++++++ .../test/Transforms/debug-omp-target-op-2.fir | 53 ++++++++++ 3 files changed, 186 insertions(+), 1 deletion(-) create mode 100644 flang/test/Transforms/debug-omp-target-op-1.fir create mode 100644 flang/test/Transforms/debug-omp-target-op-2.fir diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp index c479c1a0892b5..8e7ae4383bfdc 100644 --- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp +++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp @@ -34,6 +34,7 @@ #include "llvm/BinaryFormat/Dwarf.h" #include "llvm/Support/Debug.h" #include "llvm/Support/FileSystem.h" +#include "llvm/Support/FormatVariadic.h" #include "llvm/Support/Path.h" #include "llvm/Support/raw_ostream.h" @@ -103,6 +104,37 @@ bool debugInfoIsAlreadySet(mlir::Location loc) { return false; } +// Generates the name for the artificial DISubprogram that we are going to +// generate for omp::TargetOp. Its logic is borrowed from +// getTargetEntryUniqueInfo and +// TargetRegionEntryInfo::getTargetRegionEntryFnName to generate the same name. +// But even if there was a slight mismatch, it is not a problem because this +// name is artifical and not important to debug experience. 
+mlir::StringAttr getTargetFunctionName(mlir::MLIRContext *context, + mlir::Location Loc, + llvm::StringRef parentName) { + auto fileLoc = Loc->findInstanceOf(); + + assert(fileLoc && "No file found from location"); + llvm::StringRef fileName = fileLoc.getFilename().getValue(); + + llvm::sys::fs::UniqueID id; + uint64_t line = fileLoc.getLine(); + size_t fileId; + size_t deviceId; + if (auto ec = llvm::sys::fs::getUniqueID(fileName, id)) { + fileId = llvm::hash_value(fileName.str()); + deviceId = 0xdeadf17e; + } else { + fileId = id.getFile(); + deviceId = id.getDevice(); + } + return mlir::StringAttr::get( + context, + std::string(llvm::formatv("__omp_offloading_{0:x-}_{1:x-}_{2}_l{3}", + deviceId, fileId, parentName, line))); +} + } // namespace bool AddDebugInfoPass::createCommonBlockGlobal( @@ -510,8 +542,73 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, subTypeAttr, entities, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); + /* When we process the DeclareOp inside the OpenMP target region, all the + variables get the DISubprogram of the parent function of the target op as + the scope. In the codegen (to llvm ir), OpenMP target op results in the + creation of a separate function. As the variables in the debug info have + the DISubprogram of the parent function as the scope, the variables + need to be updated at codegen time to avoid verification failures. + + This updating after the fact becomes more and more difficult when types + are dependent on local variables like in the case of variable size arrays + or string. We not only have to generate new variables but also new types. + We can avoid this problem by generating a DISubprogramAttr here for the + target op and make sure that all the variables inside the target region + get the correct scope in the first place. 
*/ + funcOp.walk([&](mlir::omp::TargetOp targetOp) { + unsigned line = getLineFromLoc(targetOp.getLoc()); + mlir::StringAttr Name = + getTargetFunctionName(context, targetOp.getLoc(), funcOp.getName()); + mlir::LLVM::DISubprogramFlags flags = + mlir::LLVM::DISubprogramFlags::Definition | + mlir::LLVM::DISubprogramFlags::LocalToUnit; + if (isOptimized) + flags = flags | mlir::LLVM::DISubprogramFlags::Optimized; + + mlir::DistinctAttr recId = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + mlir::DistinctAttr Id = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + llvm::SmallVector types; + types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); + mlir::LLVM::DISubroutineTypeAttr spTy = + mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types); + auto spAttr = mlir::LLVM::DISubprogramAttr::get( + context, recId, /*isRecSelf=*/true, Id, compilationUnit, Scope, Name, + Name, funcFileAttr, line, line, flags, spTy, /*retainedNodes=*/{}, + /*annotations=*/{}); + + // Make sure that information about the imported modules in copied from the + // parent function. 
+ llvm::SmallVector OpEntities; + for (mlir::LLVM::DINodeAttr N : entities) { + if (auto entity = mlir::dyn_cast(N)) { + auto importedEntity = mlir::LLVM::DIImportedEntityAttr::get( + context, llvm::dwarf::DW_TAG_imported_module, spAttr, + entity.getEntity(), fileAttr, /*line=*/1, /*name=*/nullptr, + /*elements*/ {}); + OpEntities.push_back(importedEntity); + } + } + + Id = mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + spAttr = mlir::LLVM::DISubprogramAttr::get( + context, recId, /*isRecSelf=*/false, Id, compilationUnit, Scope, Name, + Name, funcFileAttr, line, line, flags, spTy, OpEntities, + /*annotations=*/{}); + targetOp->setLoc(builder.getFusedLoc({targetOp.getLoc()}, spAttr)); + }); + funcOp.walk([&](fir::cg::XDeclareOp declOp) { - handleDeclareOp(declOp, fileAttr, spAttr, typeGen, symbolTable); + mlir::LLVM::DISubprogramAttr spTy = spAttr; + if (auto tOp = declOp->getParentOfType()) { + if (auto fusedLoc = llvm::dyn_cast(tOp.getLoc())) { + if (auto sp = llvm::dyn_cast( + fusedLoc.getMetadata())) + spTy = sp; + } + } + handleDeclareOp(declOp, fileAttr, spTy, typeGen, symbolTable); }); // commonBlockMap ensures that we don't create multiple DICommonBlockAttr of // the same name in one function. 
But it is ok (rather required) to create diff --git a/flang/test/Transforms/debug-omp-target-op-1.fir b/flang/test/Transforms/debug-omp-target-op-1.fir new file mode 100644 index 0000000000000..bb586cdf6e9ab --- /dev/null +++ b/flang/test/Transforms/debug-omp-target-op-1.fir @@ -0,0 +1,35 @@ +// RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s + +module attributes {dlti.dl_spec = #dlti.dl_spec<>} { + func.func @_QQmain() attributes {fir.bindc_name = "test"} { + %c13_i32 = arith.constant 13 : i32 + %c12_i32 = arith.constant 12 : i32 + %c6_i32 = arith.constant 6 : i32 + %c1_i32 = arith.constant 1 : i32 + %c5_i32 = arith.constant 5 : i32 + %0 = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFEx"} loc(#loc1) + %1 = fircg.ext_declare %0 {uniq_name = "_QFEx"} : (!fir.ref) -> !fir.ref loc(#loc1) + %2 = fir.alloca i32 {bindc_name = "y", uniq_name = "_QFEy"} loc(#loc2) + %3 = fircg.ext_declare %2 {uniq_name = "_QFEy"} : (!fir.ref) -> !fir.ref loc(#loc2) + %4 = omp.map.info var_ptr(%1 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref {name = "x"} + %5 = omp.map.info var_ptr(%3 : !fir.ref, i32) map_clauses(tofrom) capture(ByRef) -> !fir.ref {name = "y"} + omp.target map_entries(%4 -> %arg0, %5 -> %arg1 : !fir.ref, !fir.ref) { + %16 = fircg.ext_declare %arg0 {uniq_name = "_QFEx"} : (!fir.ref) -> !fir.ref loc(#loc3) + %17 = fircg.ext_declare %arg1 {uniq_name = "_QFEy"} : (!fir.ref) -> !fir.ref loc(#loc4) + omp.terminator + } loc(#loc5) + return + } +} +#loc1 = loc("test.f90":1:1) +#loc2 = loc("test.f90":3:1) +#loc3 = loc("test.f90":7:1) +#loc4 = loc("test.f90":8:1) +#loc5 = loc("test.f90":6:1) + +// CHECK: #[[SP:.*]] = #llvm.di_subprogram<{{.*}}name = "test"{{.*}}> +// CHECK: #[[SP1:.*]] = #llvm.di_subprogram<{{.*}}name = "__omp_offloading_{{.*}}_QQmain_l6"{{.*}}line = 6{{.*}}subprogramFlags = "LocalToUnit|Definition"{{.*}}> +// CHECK: #llvm.di_local_variable +// CHECK: #llvm.di_local_variable +// CHECK: #llvm.di_local_variable +// CHECK: 
#llvm.di_local_variable diff --git a/flang/test/Transforms/debug-omp-target-op-2.fir b/flang/test/Transforms/debug-omp-target-op-2.fir new file mode 100644 index 0000000000000..15dcf2389b21d --- /dev/null +++ b/flang/test/Transforms/debug-omp-target-op-2.fir @@ -0,0 +1,53 @@ +// RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s + +module attributes {dlti.dl_spec = #dlti.dl_spec<>} { + func.func @fn_(%arg0: !fir.ref> {fir.bindc_name = "b"}, %arg1: !fir.ref {fir.bindc_name = "c"}, %arg2: !fir.ref {fir.bindc_name = "d"}) { + %c1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %0 = fir.alloca i32 + %1 = fir.alloca i32 + %2 = fir.undefined !fir.dscope + %3 = fircg.ext_declare %arg1 dummy_scope %2 {uniq_name = "_QFfnEc"} : (!fir.ref, !fir.dscope) -> !fir.ref loc(#loc2) + %4 = fircg.ext_declare %arg2 dummy_scope %2 {uniq_name = "_QFfnEd"} : (!fir.ref, !fir.dscope) -> !fir.ref loc(#loc3) + %5 = fir.load %3 : !fir.ref + %6 = fir.convert %5 : (i32) -> index + %9 = fir.load %4 : !fir.ref + %10 = fir.convert %9 : (i32) -> index + %15 = fircg.ext_declare %arg0(%6, %10) dummy_scope %2 {uniq_name = "_QFfnEb"} : (!fir.ref>, index, index, !fir.dscope) -> !fir.ref> loc(#loc4) + %16 = fircg.ext_embox %15(%6, %10) : (!fir.ref>, index, index) -> !fir.box> + %17:3 = fir.box_dims %16, %c0 : (!fir.box>, index) -> (index, index, index) + %18 = arith.subi %17#1, %c1 : index + %19 = omp.map.bounds lower_bound(%c0 : index) upper_bound(%18 : index) extent(%17#1 : index) stride(%17#2 : index) start_idx(%c1 : index) {stride_in_bytes = true} + %20 = arith.muli %17#2, %17#1 : index + %21:3 = fir.box_dims %16, %c1 : (!fir.box>, index) -> (index, index, index) + %22 = arith.subi %21#1, %c1 : index + %23 = omp.map.bounds lower_bound(%c0 : index) upper_bound(%22 : index) extent(%21#1 : index) stride(%20 : index) start_idx(%c1 : index) {stride_in_bytes = true} + %24 = omp.map.info var_ptr(%15 : !fir.ref>, i32) map_clauses(tofrom) capture(ByRef) bounds(%19, %23) -> 
!fir.ref> {name = "b"} + %25 = omp.map.info var_ptr(%1 : !fir.ref, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref {name = ""} + %26 = omp.map.info var_ptr(%0 : !fir.ref, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref {name = ""} + omp.target map_entries(%24 -> %arg3, %25 -> %arg4, %26 -> %arg5 : !fir.ref>, !fir.ref, !fir.ref) { + %27 = fir.load %arg5 : !fir.ref + %28 = fir.load %arg4 : !fir.ref + %29 = fir.convert %27 : (i32) -> index + %31 = fir.convert %28 : (i32) -> index + %37 = fircg.ext_declare %arg3(%29, %31) {uniq_name = "_QFfnEb"} : (!fir.ref>, index, index) -> !fir.ref> loc(#loc5) + omp.terminator + } loc(#loc6) + return + } loc(#loc7) +} +#loc1 = loc("test.f90":1:1) +#loc2 = loc("test.f90":3:1) +#loc3 = loc("test.f90":7:1) +#loc4 = loc("test.f90":8:1) +#loc5 = loc("test.f90":6:1) +#loc6 = loc("test.f90":16:1) +#loc7 = loc("test.f90":26:1) + + +// Test that variable size arrays inside target regions get their own +// compiler generated variables for size. + +// CHECK: #[[SP:.*]] = #llvm.di_subprogram<{{.*}}name = "__omp_offloading_{{.*}}_fn__l16"{{.*}}> +// CHECK: #llvm.di_local_variable +// CHECK: #llvm.di_local_variable >From 73fbd7b9ff7f94ce32d3c35cdb503652ebf7d0c5 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Sun, 18 May 2025 11:23:52 +0100 Subject: [PATCH 2/4] Handle review comments. Main change is that we also take care of the case when only line table information is requested. 
--- .../lib/Optimizer/Transforms/AddDebugInfo.cpp | 128 ++++++++++-------- .../test/Transforms/debug-omp-target-op-1.fir | 5 + 2 files changed, 75 insertions(+), 58 deletions(-) diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp index 8e7ae4383bfdc..f0971557d7256 100644 --- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp +++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp @@ -109,7 +109,7 @@ bool debugInfoIsAlreadySet(mlir::Location loc) { // getTargetEntryUniqueInfo and // TargetRegionEntryInfo::getTargetRegionEntryFnName to generate the same name. // But even if there was a slight mismatch, it is not a problem because this -// name is artifical and not important to debug experience. +// name is artificial and not important to debug experience. mlir::StringAttr getTargetFunctionName(mlir::MLIRContext *context, mlir::Location Loc, llvm::StringRef parentName) { @@ -477,6 +477,73 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, line - 1, false); } + auto addTargetOpDISP = [&](bool lineTableOnly, + const llvm::SmallVector &entities) { + // When we process the DeclareOp inside the OpenMP target region, all the + // variables get the DISubprogram of the parent function of the target op as + // the scope. In the codegen (to llvm ir), OpenMP target op results in the + // creation of a separate function. As the variables in the debug info have + // the DISubprogram of the parent function as the scope, the variables + // need to be updated at codegen time to avoid verification failures. + + // This updating after the fact becomes more and more difficult when types + // are dependent on local variables like in the case of variable size arrays + // or string. We not only have to generate new variables but also new types. 
+ // We can avoid this problem by generating a DISubprogramAttr here for the + // target op and make sure that all the variables inside the target region + // get the correct scope in the first place. + funcOp.walk([&](mlir::omp::TargetOp targetOp) { + unsigned line = getLineFromLoc(targetOp.getLoc()); + mlir::StringAttr name = + getTargetFunctionName(context, targetOp.getLoc(), funcOp.getName()); + mlir::LLVM::DISubprogramFlags flags = + mlir::LLVM::DISubprogramFlags::Definition | + mlir::LLVM::DISubprogramFlags::LocalToUnit; + if (isOptimized) + flags = flags | mlir::LLVM::DISubprogramFlags::Optimized; + + mlir::DistinctAttr id = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + llvm::SmallVector types; + types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); + mlir::LLVM::DISubroutineTypeAttr spTy = + mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types); + if (lineTableOnly) { + auto spAttr = mlir::LLVM::DISubprogramAttr::get( + context, id, compilationUnit, Scope, name, name, funcFileAttr, line, + line, flags, spTy, /*retainedNodes=*/{}, /*annotations=*/{}); + targetOp->setLoc(builder.getFusedLoc({targetOp.getLoc()}, spAttr)); + return; + } + mlir::DistinctAttr recId = + mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + auto spAttr = mlir::LLVM::DISubprogramAttr::get( + context, recId, /*isRecSelf=*/true, id, compilationUnit, Scope, name, + name, funcFileAttr, line, line, flags, spTy, /*retainedNodes=*/{}, + /*annotations=*/{}); + + // Make sure that information about the imported modules is copied in the + // new function. 
+ llvm::SmallVector opEntities; + for (mlir::LLVM::DINodeAttr N : entities) { + if (auto entity = mlir::dyn_cast(N)) { + auto importedEntity = mlir::LLVM::DIImportedEntityAttr::get( + context, llvm::dwarf::DW_TAG_imported_module, spAttr, + entity.getEntity(), fileAttr, /*line=*/1, /*name=*/nullptr, + /*elements*/ {}); + opEntities.push_back(importedEntity); + } + } + + id = mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); + spAttr = mlir::LLVM::DISubprogramAttr::get( + context, recId, /*isRecSelf=*/false, id, compilationUnit, Scope, name, + name, funcFileAttr, line, line, flags, spTy, opEntities, + /*annotations=*/{}); + targetOp->setLoc(builder.getFusedLoc({targetOp.getLoc()}, spAttr)); + }); + }; + // Don't process variables if user asked for line tables only. if (debugLevel == mlir::LLVM::DIEmissionKind::LineTablesOnly) { auto spAttr = mlir::LLVM::DISubprogramAttr::get( @@ -484,6 +551,7 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, line, line, subprogramFlags, subTypeAttr, /*retainedNodes=*/{}, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); + addTargetOpDISP(true, {}); return; } @@ -541,63 +609,7 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, funcName, fullName, funcFileAttr, line, line, subprogramFlags, subTypeAttr, entities, /*annotations=*/{}); funcOp->setLoc(builder.getFusedLoc({l}, spAttr)); - - /* When we process the DeclareOp inside the OpenMP target region, all the - variables get the DISubprogram of the parent function of the target op as - the scope. In the codegen (to llvm ir), OpenMP target op results in the - creation of a separate function. As the variables in the debug info have - the DISubprogram of the parent function as the scope, the variables - need to be updated at codegen time to avoid verification failures. - - This updating after the fact becomes more and more difficult when types - are dependent on local variables like in the case of variable size arrays - or string. 
We not only have to generate new variables but also new types. - We can avoid this problem by generating a DISubprogramAttr here for the - target op and make sure that all the variables inside the target region - get the correct scope in the first place. */ - funcOp.walk([&](mlir::omp::TargetOp targetOp) { - unsigned line = getLineFromLoc(targetOp.getLoc()); - mlir::StringAttr Name = - getTargetFunctionName(context, targetOp.getLoc(), funcOp.getName()); - mlir::LLVM::DISubprogramFlags flags = - mlir::LLVM::DISubprogramFlags::Definition | - mlir::LLVM::DISubprogramFlags::LocalToUnit; - if (isOptimized) - flags = flags | mlir::LLVM::DISubprogramFlags::Optimized; - - mlir::DistinctAttr recId = - mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); - mlir::DistinctAttr Id = - mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); - llvm::SmallVector types; - types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); - mlir::LLVM::DISubroutineTypeAttr spTy = - mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types); - auto spAttr = mlir::LLVM::DISubprogramAttr::get( - context, recId, /*isRecSelf=*/true, Id, compilationUnit, Scope, Name, - Name, funcFileAttr, line, line, flags, spTy, /*retainedNodes=*/{}, - /*annotations=*/{}); - - // Make sure that information about the imported modules in copied from the - // parent function. 
- llvm::SmallVector OpEntities; - for (mlir::LLVM::DINodeAttr N : entities) { - if (auto entity = mlir::dyn_cast(N)) { - auto importedEntity = mlir::LLVM::DIImportedEntityAttr::get( - context, llvm::dwarf::DW_TAG_imported_module, spAttr, - entity.getEntity(), fileAttr, /*line=*/1, /*name=*/nullptr, - /*elements*/ {}); - OpEntities.push_back(importedEntity); - } - } - - Id = mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); - spAttr = mlir::LLVM::DISubprogramAttr::get( - context, recId, /*isRecSelf=*/false, Id, compilationUnit, Scope, Name, - Name, funcFileAttr, line, line, flags, spTy, OpEntities, - /*annotations=*/{}); - targetOp->setLoc(builder.getFusedLoc({targetOp.getLoc()}, spAttr)); - }); + addTargetOpDISP(false, entities); funcOp.walk([&](fir::cg::XDeclareOp declOp) { mlir::LLVM::DISubprogramAttr spTy = spAttr; diff --git a/flang/test/Transforms/debug-omp-target-op-1.fir b/flang/test/Transforms/debug-omp-target-op-1.fir index bb586cdf6e9ab..6b895b732c42b 100644 --- a/flang/test/Transforms/debug-omp-target-op-1.fir +++ b/flang/test/Transforms/debug-omp-target-op-1.fir @@ -1,4 +1,5 @@ // RUN: fir-opt --add-debug-info --mlir-print-debuginfo %s | FileCheck %s +// RUN: fir-opt --add-debug-info="debug-level=LineTablesOnly" --mlir-print-debuginfo %s | FileCheck %s --check-prefix=LINETABLE module attributes {dlti.dl_spec = #dlti.dl_spec<>} { func.func @_QQmain() attributes {fir.bindc_name = "test"} { @@ -33,3 +34,7 @@ module attributes {dlti.dl_spec = #dlti.dl_spec<>} { // CHECK: #llvm.di_local_variable // CHECK: #llvm.di_local_variable // CHECK: #llvm.di_local_variable + +// LINETABLE: #[[SP:.*]] = #llvm.di_subprogram<{{.*}}name = "test"{{.*}}> +// LINETABLE: #[[SP1:.*]] = #llvm.di_subprogram<{{.*}}name = "__omp_offloading_{{.*}}_QQmain_l6"{{.*}}line = 6{{.*}}subprogramFlags = "LocalToUnit|Definition"{{.*}}> +// LINETABLE-NOT: #llvm.di_local_variable >From fb063cca9c4e1414090ed79818a685237226e41f Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Tue, 20 May 
2025 12:12:38 +0100 Subject: [PATCH 3/4] Handle review comments. Use the arguments to the TargetOp in the subroutine type created for it. --- flang/lib/Optimizer/Transforms/AddDebugInfo.cpp | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp index f0971557d7256..b41d4f2285091 100644 --- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp +++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp @@ -506,6 +506,12 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, mlir::DistinctAttr::create(mlir::UnitAttr::get(context)); llvm::SmallVector types; types.push_back(mlir::LLVM::DINullTypeAttr::get(context)); + for (auto arg : targetOp.getRegion().getArguments()) { + auto tyAttr = typeGen.convertType(fir::unwrapRefType(arg.getType()), + fileAttr, cuAttr, /*declOp=*/nullptr); + types.push_back(tyAttr); + } + CC = llvm::dwarf::getCallingConvention("DW_CC_normal"); mlir::LLVM::DISubroutineTypeAttr spTy = mlir::LLVM::DISubroutineTypeAttr::get(context, CC, types); if (lineTableOnly) { >From ad3b532e3ea6a91558ea009733df8ed1f37d112e Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Tue, 20 May 2025 12:29:10 +0100 Subject: [PATCH 4/4] Fix formatting issues. 
--- flang/lib/Optimizer/Transforms/AddDebugInfo.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp index b41d4f2285091..06b9c8ede5faa 100644 --- a/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp +++ b/flang/lib/Optimizer/Transforms/AddDebugInfo.cpp @@ -478,7 +478,8 @@ void AddDebugInfoPass::handleFuncOp(mlir::func::FuncOp funcOp, } auto addTargetOpDISP = [&](bool lineTableOnly, - const llvm::SmallVector &entities) { + const llvm::SmallVector + &entities) { // When we process the DeclareOp inside the OpenMP target region, all the // variables get the DISubprogram of the parent function of the target op as // the scope. In the codegen (to llvm ir), OpenMP target op results in the From flang-commits at lists.llvm.org Tue May 20 04:59:44 2025 From: flang-commits at lists.llvm.org (Scott Manley via flang-commits) Date: Tue, 20 May 2025 04:59:44 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [OpenACC] rename private/firstprivate recipe attributes (PR #140719) Message-ID: https://github.com/rscottmanley created https://github.com/llvm/llvm-project/pull/140719 Make private and firstprivate recipe attribute names consistent with reductionRecipes attribute >From f6e4ef49829e68650f4c3e1be005e658ebe92f63 Mon Sep 17 00:00:00 2001 From: Scott Manley Date: Tue, 20 May 2025 04:48:57 -0700 Subject: [PATCH] [OpenACC] rename attributes to privatizationRecipes and firstprivatizationRecipes Make consistent with reductionRecipes attribute --- flang/lib/Lower/OpenACC.cpp | 85 ++++++++++--------- .../mlir/Dialect/OpenACC/OpenACCOps.td | 20 ++--- mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp | 10 +-- 3 files changed, 58 insertions(+), 57 deletions(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index bc94e860ff10b..0405510baeb9b 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -1389,16 +1389,16 @@ mlir::Type 
getTypeFromBounds(llvm::SmallVector &bounds, } template -static void -genPrivatizations(const Fortran::parser::AccObjectList &objectList, - Fortran::lower::AbstractConverter &converter, - Fortran::semantics::SemanticsContext &semanticsContext, - Fortran::lower::StatementContext &stmtCtx, - llvm::SmallVectorImpl &dataOperands, - llvm::SmallVector &privatizations, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { +static void genPrivatizationRecipes( + const Fortran::parser::AccObjectList &objectList, + Fortran::lower::AbstractConverter &converter, + Fortran::semantics::SemanticsContext &semanticsContext, + Fortran::lower::StatementContext &stmtCtx, + llvm::SmallVectorImpl &dataOperands, + llvm::SmallVector &privatizationRecipes, + llvm::ArrayRef async, + llvm::ArrayRef asyncDeviceTypes, + llvm::ArrayRef asyncOnlyDeviceTypes) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -1445,7 +1445,7 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } - privatizations.push_back(mlir::SymbolRefAttr::get( + privatizationRecipes.push_back(mlir::SymbolRefAttr::get( builder.getContext(), recipe.getSymName().str())); } } @@ -2083,15 +2083,15 @@ mlir::Type getTypeFromIvTypeSize(fir::FirOpBuilder &builder, return builder.getIntegerType(ivTypeSize * 8); } -static void privatizeIv(Fortran::lower::AbstractConverter &converter, - const Fortran::semantics::Symbol &sym, - mlir::Location loc, - llvm::SmallVector &ivTypes, - llvm::SmallVector &ivLocs, - llvm::SmallVector &privateOperands, - llvm::SmallVector &ivPrivate, - llvm::SmallVector &privatizations, - bool isDoConcurrent = false) { +static void +privatizeIv(Fortran::lower::AbstractConverter &converter, + const Fortran::semantics::Symbol &sym, mlir::Location loc, + llvm::SmallVector 
&ivTypes, + llvm::SmallVector &ivLocs, + llvm::SmallVector &privateOperands, + llvm::SmallVector &ivPrivate, + llvm::SmallVector &privatizationRecipes, + bool isDoConcurrent = false) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); mlir::Type ivTy = getTypeFromIvTypeSize(builder, sym); @@ -2131,7 +2131,7 @@ static void privatizeIv(Fortran::lower::AbstractConverter &converter, privateOp = op.getOperation(); privateOperands.push_back(op.getAccVar()); - privatizations.push_back(mlir::SymbolRefAttr::get( + privatizationRecipes.push_back(mlir::SymbolRefAttr::get( builder.getContext(), recipe.getSymName().str())); } @@ -2161,7 +2161,7 @@ static mlir::acc::LoopOp createLoopOp( llvm::SmallVector tileOperands, privateOperands, ivPrivate, reductionOperands, cacheOperands, vectorOperands, workerNumOperands, gangOperands, lowerbounds, upperbounds, steps; - llvm::SmallVector privatizations, reductionRecipes; + llvm::SmallVector privatizationRecipes, reductionRecipes; llvm::SmallVector tileOperandsSegments, gangOperandsSegments; llvm::SmallVector collapseValues; @@ -2282,9 +2282,9 @@ static mlir::acc::LoopOp createLoopOp( } else if (const auto *privateClause = std::get_if( &clause.u)) { - genPrivatizations( + genPrivatizationRecipes( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, /*async=*/{}, + privateOperands, privatizationRecipes, /*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); } else if (const auto *reductionClause = std::get_if( @@ -2368,7 +2368,8 @@ static mlir::acc::LoopOp createLoopOp( const auto &name = std::get(control.t); privatizeIv(converter, *name.symbol, currentLocation, ivTypes, ivLocs, - privateOperands, ivPrivate, privatizations, isDoConcurrent); + privateOperands, ivPrivate, privatizationRecipes, + isDoConcurrent); inclusiveBounds.push_back(true); } @@ -2405,7 +2406,7 @@ static mlir::acc::LoopOp createLoopOp( Fortran::semantics::Symbol &ivSym = bounds->name.thing.symbol->GetUltimate(); 
privatizeIv(converter, ivSym, currentLocation, ivTypes, ivLocs, - privateOperands, ivPrivate, privatizations); + privateOperands, ivPrivate, privatizationRecipes); inclusiveBounds.push_back(true); @@ -2484,9 +2485,9 @@ static mlir::acc::LoopOp createLoopOp( if (!autoDeviceTypes.empty()) loopOp.setAuto_Attr(builder.getArrayAttr(autoDeviceTypes)); - if (!privatizations.empty()) - loopOp.setPrivatizationsAttr( - mlir::ArrayAttr::get(builder.getContext(), privatizations)); + if (!privatizationRecipes.empty()) + loopOp.setPrivatizationRecipesAttr( + mlir::ArrayAttr::get(builder.getContext(), privatizationRecipes)); if (!reductionRecipes.empty()) loopOp.setReductionRecipesAttr( @@ -2613,8 +2614,8 @@ static Op createComputeOp( llvm::SmallVector reductionOperands, privateOperands, firstprivateOperands; - llvm::SmallVector privatizations, firstPrivatizations, - reductionRecipes; + llvm::SmallVector privatizationRecipes, + firstPrivatizationRecipes, reductionRecipes; // Self clause has optional values but can be present with // no value as well. 
When there is no value, the op has an attribute to @@ -2820,17 +2821,17 @@ static Op createComputeOp( std::get_if( &clause.u)) { if (!combinedConstructs) - genPrivatizations( + genPrivatizationRecipes( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, async, asyncDeviceTypes, + privateOperands, privatizationRecipes, async, asyncDeviceTypes, asyncOnlyDeviceTypes); } else if (const auto *firstprivateClause = std::get_if( &clause.u)) { - genPrivatizations( + genPrivatizationRecipes( firstprivateClause->v, converter, semanticsContext, stmtCtx, - firstprivateOperands, firstPrivatizations, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + firstprivateOperands, firstPrivatizationRecipes, async, + asyncDeviceTypes, asyncOnlyDeviceTypes); } else if (const auto *reductionClause = std::get_if( &clause.u)) { @@ -2934,15 +2935,15 @@ static Op createComputeOp( computeOp.setWaitOnlyAttr(builder.getArrayAttr(waitOnlyDeviceTypes)); if constexpr (!std::is_same_v) { - if (!privatizations.empty()) - computeOp.setPrivatizationsAttr( - mlir::ArrayAttr::get(builder.getContext(), privatizations)); + if (!privatizationRecipes.empty()) + computeOp.setPrivatizationRecipesAttr( + mlir::ArrayAttr::get(builder.getContext(), privatizationRecipes)); if (!reductionRecipes.empty()) computeOp.setReductionRecipesAttr( mlir::ArrayAttr::get(builder.getContext(), reductionRecipes)); - if (!firstPrivatizations.empty()) - computeOp.setFirstprivatizationsAttr( - mlir::ArrayAttr::get(builder.getContext(), firstPrivatizations)); + if (!firstPrivatizationRecipes.empty()) + computeOp.setFirstprivatizationRecipesAttr(mlir::ArrayAttr::get( + builder.getContext(), firstPrivatizationRecipes)); } if (combinedConstructs) diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td index b9148dc088a6a..083a18d80704e 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td +++ 
b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td @@ -1327,9 +1327,9 @@ def OpenACC_ParallelOp : OpenACC_Op<"parallel", Variadic:$reductionOperands, OptionalAttr:$reductionRecipes, Variadic:$privateOperands, - OptionalAttr:$privatizations, + OptionalAttr:$privatizationRecipes, Variadic:$firstprivateOperands, - OptionalAttr:$firstprivatizations, + OptionalAttr:$firstprivatizationRecipes, Variadic:$dataClauseOperands, OptionalAttr:$defaultAttr, UnitAttr:$combined); @@ -1443,14 +1443,14 @@ def OpenACC_ParallelOp : OpenACC_Op<"parallel", | `async` `` custom($asyncOperands, type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `firstprivate` `(` custom($firstprivateOperands, - type($firstprivateOperands), $firstprivatizations) + type($firstprivateOperands), $firstprivatizationRecipes) `)` | `num_gangs` `(` custom($numGangs, type($numGangs), $numGangsDeviceType, $numGangsSegments) `)` | `num_workers` `(` custom($numWorkers, type($numWorkers), $numWorkersDeviceType) `)` | `private` `(` custom( - $privateOperands, type($privateOperands), $privatizations) + $privateOperands, type($privateOperands), $privatizationRecipes) `)` | `vector_length` `(` custom($vectorLength, type($vectorLength), $vectorLengthDeviceType) `)` @@ -1512,9 +1512,9 @@ def OpenACC_SerialOp : OpenACC_Op<"serial", Variadic:$reductionOperands, OptionalAttr:$reductionRecipes, Variadic:$privateOperands, - OptionalAttr:$privatizations, + OptionalAttr:$privatizationRecipes, Variadic:$firstprivateOperands, - OptionalAttr:$firstprivatizations, + OptionalAttr:$firstprivatizationRecipes, Variadic:$dataClauseOperands, OptionalAttr:$defaultAttr, UnitAttr:$combined); @@ -1585,10 +1585,10 @@ def OpenACC_SerialOp : OpenACC_Op<"serial", | `async` `` custom($asyncOperands, type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `firstprivate` `(` custom($firstprivateOperands, - type($firstprivateOperands), $firstprivatizations) + type($firstprivateOperands), $firstprivatizationRecipes) `)` | `private` `(` 
custom( - $privateOperands, type($privateOperands), $privatizations) + $privateOperands, type($privateOperands), $privatizationRecipes) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, @@ -2106,7 +2106,7 @@ def OpenACC_LoopOp : OpenACC_Op<"loop", OptionalAttr:$tileOperandsDeviceType, Variadic:$cacheOperands, Variadic:$privateOperands, - OptionalAttr:$privatizations, + OptionalAttr:$privatizationRecipes, Variadic:$reductionOperands, OptionalAttr:$reductionRecipes, OptionalAttr:$combined @@ -2261,7 +2261,7 @@ def OpenACC_LoopOp : OpenACC_Op<"loop", | `vector` `` custom($vectorOperands, type($vectorOperands), $vectorOperandsDeviceType, $vector) | `private` `(` custom( - $privateOperands, type($privateOperands), $privatizations) `)` + $privateOperands, type($privateOperands), $privatizationRecipes) `)` | `tile` `(` custom($tileOperands, type($tileOperands), $tileOperandsDeviceType, $tileOperandsSegments) `)` diff --git a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp index b401d2ec7894a..658ad28477ace 100644 --- a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp +++ b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp @@ -1079,11 +1079,11 @@ static LogicalResult verifyDeviceTypeAndSegmentCountMatch( LogicalResult acc::ParallelOp::verify() { if (failed(checkSymOperandList( - *this, getPrivatizations(), getPrivateOperands(), "private", + *this, getPrivatizationRecipes(), getPrivateOperands(), "private", "privatizations", /*checkOperandType=*/false))) return failure(); if (failed(checkSymOperandList( - *this, getFirstprivatizations(), getFirstprivateOperands(), + *this, getFirstprivatizationRecipes(), getFirstprivateOperands(), "firstprivate", "firstprivatizations", /*checkOperandType=*/false))) return failure(); if (failed(checkSymOperandList( @@ -1870,11 +1870,11 @@ mlir::Value SerialOp::getWaitDevnum(mlir::acc::DeviceType deviceType) { LogicalResult acc::SerialOp::verify() { if 
(failed(checkSymOperandList( - *this, getPrivatizations(), getPrivateOperands(), "private", + *this, getPrivatizationRecipes(), getPrivateOperands(), "private", "privatizations", /*checkOperandType=*/false))) return failure(); if (failed(checkSymOperandList( - *this, getFirstprivatizations(), getFirstprivateOperands(), + *this, getFirstprivatizationRecipes(), getFirstprivateOperands(), "firstprivate", "firstprivatizations", /*checkOperandType=*/false))) return failure(); if (failed(checkSymOperandList( @@ -2488,7 +2488,7 @@ LogicalResult acc::LoopOp::verify() { } if (failed(checkSymOperandList( - *this, getPrivatizations(), getPrivateOperands(), "private", + *this, getPrivatizationRecipes(), getPrivateOperands(), "private", "privatizations", false))) return failure(); From flang-commits at lists.llvm.org Tue May 20 05:00:19 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 05:00:19 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [OpenACC] rename private/firstprivate recipe attributes (PR #140719) In-Reply-To: Message-ID: <682c6ed3.630a0220.561b8.552b@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-mlir-openacc Author: Scott Manley (rscottmanley)
Changes Make private and firstprivate recipe attribute names consistent with reductionRecipes attribute --- Full diff: https://github.com/llvm/llvm-project/pull/140719.diff 3 Files Affected: - (modified) flang/lib/Lower/OpenACC.cpp (+43-42) - (modified) mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td (+10-10) - (modified) mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp (+5-5) ``````````diff diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index bc94e860ff10b..0405510baeb9b 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -1389,16 +1389,16 @@ mlir::Type getTypeFromBounds(llvm::SmallVector &bounds, } template -static void -genPrivatizations(const Fortran::parser::AccObjectList &objectList, - Fortran::lower::AbstractConverter &converter, - Fortran::semantics::SemanticsContext &semanticsContext, - Fortran::lower::StatementContext &stmtCtx, - llvm::SmallVectorImpl &dataOperands, - llvm::SmallVector &privatizations, - llvm::ArrayRef async, - llvm::ArrayRef asyncDeviceTypes, - llvm::ArrayRef asyncOnlyDeviceTypes) { +static void genPrivatizationRecipes( + const Fortran::parser::AccObjectList &objectList, + Fortran::lower::AbstractConverter &converter, + Fortran::semantics::SemanticsContext &semanticsContext, + Fortran::lower::StatementContext &stmtCtx, + llvm::SmallVectorImpl &dataOperands, + llvm::SmallVector &privatizationRecipes, + llvm::ArrayRef async, + llvm::ArrayRef asyncDeviceTypes, + llvm::ArrayRef asyncOnlyDeviceTypes) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext}; for (const auto &accObject : objectList.v) { @@ -1445,7 +1445,7 @@ genPrivatizations(const Fortran::parser::AccObjectList &objectList, /*unwrapBoxAddr=*/true); dataOperands.push_back(op.getAccVar()); } - privatizations.push_back(mlir::SymbolRefAttr::get( + privatizationRecipes.push_back(mlir::SymbolRefAttr::get( builder.getContext(), recipe.getSymName().str())); } } @@ -2083,15 +2083,15 
@@ mlir::Type getTypeFromIvTypeSize(fir::FirOpBuilder &builder, return builder.getIntegerType(ivTypeSize * 8); } -static void privatizeIv(Fortran::lower::AbstractConverter &converter, - const Fortran::semantics::Symbol &sym, - mlir::Location loc, - llvm::SmallVector &ivTypes, - llvm::SmallVector &ivLocs, - llvm::SmallVector &privateOperands, - llvm::SmallVector &ivPrivate, - llvm::SmallVector &privatizations, - bool isDoConcurrent = false) { +static void +privatizeIv(Fortran::lower::AbstractConverter &converter, + const Fortran::semantics::Symbol &sym, mlir::Location loc, + llvm::SmallVector &ivTypes, + llvm::SmallVector &ivLocs, + llvm::SmallVector &privateOperands, + llvm::SmallVector &ivPrivate, + llvm::SmallVector &privatizationRecipes, + bool isDoConcurrent = false) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); mlir::Type ivTy = getTypeFromIvTypeSize(builder, sym); @@ -2131,7 +2131,7 @@ static void privatizeIv(Fortran::lower::AbstractConverter &converter, privateOp = op.getOperation(); privateOperands.push_back(op.getAccVar()); - privatizations.push_back(mlir::SymbolRefAttr::get( + privatizationRecipes.push_back(mlir::SymbolRefAttr::get( builder.getContext(), recipe.getSymName().str())); } @@ -2161,7 +2161,7 @@ static mlir::acc::LoopOp createLoopOp( llvm::SmallVector tileOperands, privateOperands, ivPrivate, reductionOperands, cacheOperands, vectorOperands, workerNumOperands, gangOperands, lowerbounds, upperbounds, steps; - llvm::SmallVector privatizations, reductionRecipes; + llvm::SmallVector privatizationRecipes, reductionRecipes; llvm::SmallVector tileOperandsSegments, gangOperandsSegments; llvm::SmallVector collapseValues; @@ -2282,9 +2282,9 @@ static mlir::acc::LoopOp createLoopOp( } else if (const auto *privateClause = std::get_if( &clause.u)) { - genPrivatizations( + genPrivatizationRecipes( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, /*async=*/{}, + privateOperands, privatizationRecipes, 
/*async=*/{}, /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes=*/{}); } else if (const auto *reductionClause = std::get_if( @@ -2368,7 +2368,8 @@ static mlir::acc::LoopOp createLoopOp( const auto &name = std::get(control.t); privatizeIv(converter, *name.symbol, currentLocation, ivTypes, ivLocs, - privateOperands, ivPrivate, privatizations, isDoConcurrent); + privateOperands, ivPrivate, privatizationRecipes, + isDoConcurrent); inclusiveBounds.push_back(true); } @@ -2405,7 +2406,7 @@ static mlir::acc::LoopOp createLoopOp( Fortran::semantics::Symbol &ivSym = bounds->name.thing.symbol->GetUltimate(); privatizeIv(converter, ivSym, currentLocation, ivTypes, ivLocs, - privateOperands, ivPrivate, privatizations); + privateOperands, ivPrivate, privatizationRecipes); inclusiveBounds.push_back(true); @@ -2484,9 +2485,9 @@ static mlir::acc::LoopOp createLoopOp( if (!autoDeviceTypes.empty()) loopOp.setAuto_Attr(builder.getArrayAttr(autoDeviceTypes)); - if (!privatizations.empty()) - loopOp.setPrivatizationsAttr( - mlir::ArrayAttr::get(builder.getContext(), privatizations)); + if (!privatizationRecipes.empty()) + loopOp.setPrivatizationRecipesAttr( + mlir::ArrayAttr::get(builder.getContext(), privatizationRecipes)); if (!reductionRecipes.empty()) loopOp.setReductionRecipesAttr( @@ -2613,8 +2614,8 @@ static Op createComputeOp( llvm::SmallVector reductionOperands, privateOperands, firstprivateOperands; - llvm::SmallVector privatizations, firstPrivatizations, - reductionRecipes; + llvm::SmallVector privatizationRecipes, + firstPrivatizationRecipes, reductionRecipes; // Self clause has optional values but can be present with // no value as well. 
When there is no value, the op has an attribute to @@ -2820,17 +2821,17 @@ static Op createComputeOp( std::get_if( &clause.u)) { if (!combinedConstructs) - genPrivatizations( + genPrivatizationRecipes( privateClause->v, converter, semanticsContext, stmtCtx, - privateOperands, privatizations, async, asyncDeviceTypes, + privateOperands, privatizationRecipes, async, asyncDeviceTypes, asyncOnlyDeviceTypes); } else if (const auto *firstprivateClause = std::get_if( &clause.u)) { - genPrivatizations( + genPrivatizationRecipes( firstprivateClause->v, converter, semanticsContext, stmtCtx, - firstprivateOperands, firstPrivatizations, async, asyncDeviceTypes, - asyncOnlyDeviceTypes); + firstprivateOperands, firstPrivatizationRecipes, async, + asyncDeviceTypes, asyncOnlyDeviceTypes); } else if (const auto *reductionClause = std::get_if( &clause.u)) { @@ -2934,15 +2935,15 @@ static Op createComputeOp( computeOp.setWaitOnlyAttr(builder.getArrayAttr(waitOnlyDeviceTypes)); if constexpr (!std::is_same_v) { - if (!privatizations.empty()) - computeOp.setPrivatizationsAttr( - mlir::ArrayAttr::get(builder.getContext(), privatizations)); + if (!privatizationRecipes.empty()) + computeOp.setPrivatizationRecipesAttr( + mlir::ArrayAttr::get(builder.getContext(), privatizationRecipes)); if (!reductionRecipes.empty()) computeOp.setReductionRecipesAttr( mlir::ArrayAttr::get(builder.getContext(), reductionRecipes)); - if (!firstPrivatizations.empty()) - computeOp.setFirstprivatizationsAttr( - mlir::ArrayAttr::get(builder.getContext(), firstPrivatizations)); + if (!firstPrivatizationRecipes.empty()) + computeOp.setFirstprivatizationRecipesAttr(mlir::ArrayAttr::get( + builder.getContext(), firstPrivatizationRecipes)); } if (combinedConstructs) diff --git a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td index b9148dc088a6a..083a18d80704e 100644 --- a/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td +++ 
b/mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td @@ -1327,9 +1327,9 @@ def OpenACC_ParallelOp : OpenACC_Op<"parallel", Variadic:$reductionOperands, OptionalAttr:$reductionRecipes, Variadic:$privateOperands, - OptionalAttr:$privatizations, + OptionalAttr:$privatizationRecipes, Variadic:$firstprivateOperands, - OptionalAttr:$firstprivatizations, + OptionalAttr:$firstprivatizationRecipes, Variadic:$dataClauseOperands, OptionalAttr:$defaultAttr, UnitAttr:$combined); @@ -1443,14 +1443,14 @@ def OpenACC_ParallelOp : OpenACC_Op<"parallel", | `async` `` custom($asyncOperands, type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `firstprivate` `(` custom($firstprivateOperands, - type($firstprivateOperands), $firstprivatizations) + type($firstprivateOperands), $firstprivatizationRecipes) `)` | `num_gangs` `(` custom($numGangs, type($numGangs), $numGangsDeviceType, $numGangsSegments) `)` | `num_workers` `(` custom($numWorkers, type($numWorkers), $numWorkersDeviceType) `)` | `private` `(` custom( - $privateOperands, type($privateOperands), $privatizations) + $privateOperands, type($privateOperands), $privatizationRecipes) `)` | `vector_length` `(` custom($vectorLength, type($vectorLength), $vectorLengthDeviceType) `)` @@ -1512,9 +1512,9 @@ def OpenACC_SerialOp : OpenACC_Op<"serial", Variadic:$reductionOperands, OptionalAttr:$reductionRecipes, Variadic:$privateOperands, - OptionalAttr:$privatizations, + OptionalAttr:$privatizationRecipes, Variadic:$firstprivateOperands, - OptionalAttr:$firstprivatizations, + OptionalAttr:$firstprivatizationRecipes, Variadic:$dataClauseOperands, OptionalAttr:$defaultAttr, UnitAttr:$combined); @@ -1585,10 +1585,10 @@ def OpenACC_SerialOp : OpenACC_Op<"serial", | `async` `` custom($asyncOperands, type($asyncOperands), $asyncOperandsDeviceType, $asyncOnly) | `firstprivate` `(` custom($firstprivateOperands, - type($firstprivateOperands), $firstprivatizations) + type($firstprivateOperands), $firstprivatizationRecipes) `)` | `private` `(` 
custom( - $privateOperands, type($privateOperands), $privatizations) + $privateOperands, type($privateOperands), $privatizationRecipes) `)` | `wait` `` custom($waitOperands, type($waitOperands), $waitOperandsDeviceType, $waitOperandsSegments, $hasWaitDevnum, @@ -2106,7 +2106,7 @@ def OpenACC_LoopOp : OpenACC_Op<"loop", OptionalAttr:$tileOperandsDeviceType, Variadic:$cacheOperands, Variadic:$privateOperands, - OptionalAttr:$privatizations, + OptionalAttr:$privatizationRecipes, Variadic:$reductionOperands, OptionalAttr:$reductionRecipes, OptionalAttr:$combined @@ -2261,7 +2261,7 @@ def OpenACC_LoopOp : OpenACC_Op<"loop", | `vector` `` custom($vectorOperands, type($vectorOperands), $vectorOperandsDeviceType, $vector) | `private` `(` custom( - $privateOperands, type($privateOperands), $privatizations) `)` + $privateOperands, type($privateOperands), $privatizationRecipes) `)` | `tile` `(` custom($tileOperands, type($tileOperands), $tileOperandsDeviceType, $tileOperandsSegments) `)` diff --git a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp index b401d2ec7894a..658ad28477ace 100644 --- a/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp +++ b/mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp @@ -1079,11 +1079,11 @@ static LogicalResult verifyDeviceTypeAndSegmentCountMatch( LogicalResult acc::ParallelOp::verify() { if (failed(checkSymOperandList( - *this, getPrivatizations(), getPrivateOperands(), "private", + *this, getPrivatizationRecipes(), getPrivateOperands(), "private", "privatizations", /*checkOperandType=*/false))) return failure(); if (failed(checkSymOperandList( - *this, getFirstprivatizations(), getFirstprivateOperands(), + *this, getFirstprivatizationRecipes(), getFirstprivateOperands(), "firstprivate", "firstprivatizations", /*checkOperandType=*/false))) return failure(); if (failed(checkSymOperandList( @@ -1870,11 +1870,11 @@ mlir::Value SerialOp::getWaitDevnum(mlir::acc::DeviceType deviceType) { LogicalResult acc::SerialOp::verify() { if 
(failed(checkSymOperandList( - *this, getPrivatizations(), getPrivateOperands(), "private", + *this, getPrivatizationRecipes(), getPrivateOperands(), "private", "privatizations", /*checkOperandType=*/false))) return failure(); if (failed(checkSymOperandList( - *this, getFirstprivatizations(), getFirstprivateOperands(), + *this, getFirstprivatizationRecipes(), getFirstprivateOperands(), "firstprivate", "firstprivatizations", /*checkOperandType=*/false))) return failure(); if (failed(checkSymOperandList( @@ -2488,7 +2488,7 @@ LogicalResult acc::LoopOp::verify() { } if (failed(checkSymOperandList( - *this, getPrivatizations(), getPrivateOperands(), "private", + *this, getPrivatizationRecipes(), getPrivateOperands(), "private", "privatizations", false))) return failure(); ``````````
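
The C++ call sites in the diff above change because the accessor names follow the ODS argument names in OpenACCOps.td: `$privatizationRecipes` is reached through `getPrivatizationRecipes()` and `setPrivatizationRecipesAttr(...)`. The following is a minimal sketch of that mapping, assuming the simple "capitalize the first letter" rule, which matches every name touched by this patch. `capitalizeFirst`, `getterName` and `attrSetterName` are hypothetical helpers written for illustration, not MLIR APIs, and the real mlir-tblgen naming logic covers more cases (for example snake_case names).

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of MLIR): upper-case the first character of
// an ODS argument name, e.g. "privatizationRecipes" -> "PrivatizationRecipes".
// Assumption: the name is already camelCase; snake_case handling is omitted.
std::string capitalizeFirst(const std::string &name) {
  if (name.empty())
    return name;
  std::string out = name;
  out[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(out[0])));
  return out;
}

// Getter name for an ODS argument `$name`: get<Name>.
std::string getterName(const std::string &odsName) {
  return "get" + capitalizeFirst(odsName);
}

// Attribute setter name for an ODS attribute `$name`: set<Name>Attr.
std::string attrSetterName(const std::string &odsName) {
  return "set" + capitalizeFirst(odsName) + "Attr";
}
```

Under this rule, renaming `$firstprivatizations` to `$firstprivatizationRecipes` turns `setFirstprivatizationsAttr` into `setFirstprivatizationRecipesAttr`, which is why flang/lib/Lower/OpenACC.cpp has to be updated in the same patch. The diagnostic strings passed to `checkSymOperandList` (`"privatizations"`, `"firstprivatizations"`) are left unchanged by the diff.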
https://github.com/llvm/llvm-project/pull/140719

From flang-commits at lists.llvm.org Tue May 20 05:01:11 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Tue, 20 May 2025 05:01:11 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [OpenACC] rename private/firstprivate recipe attributes (PR #140719) In-Reply-To: Message-ID: <682c6f07.050a0220.3010cc.1a20@mx.google.com>
https://github.com/clementval approved this pull request.
https://github.com/llvm/llvm-project/pull/140719

From flang-commits at lists.llvm.org Tue May 20 05:23:10 2025 From: flang-commits at lists.llvm.org (Paul Walker via flang-commits) Date: Tue, 20 May 2025 05:23:10 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544) In-Reply-To: Message-ID: <682c742e.170a0220.3753ca.f8c4@mx.google.com>
https://github.com/paulwalker-arm approved this pull request.
https://github.com/llvm/llvm-project/pull/140544

From flang-commits at lists.llvm.org Tue May 20 05:31:00 2025 From: flang-commits at lists.llvm.org (Rohit Aggarwal via flang-commits) Date: Tue, 20 May 2025 05:31:00 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [Clang][Driver][fveclib] Fix target parsing for -fveclib=AMDLIBM option (PR #140544) In-Reply-To: Message-ID: <682c7604.a70a0220.1e9c0a.2fbf@mx.google.com>
rohitaggarwal007 wrote: @kiranktp @paulwalker-arm Can you please merge this PR in main on my behalf? I don't have the write permission.
Thank you
https://github.com/llvm/llvm-project/pull/140544

From flang-commits at lists.llvm.org Tue May 20 07:07:52 2025 From: flang-commits at lists.llvm.org (Sebastian Pop via flang-commits) Date: Tue, 20 May 2025 07:07:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix Werror=dangling-reference (PR #138793) In-Reply-To: Message-ID: <682c8cb8.170a0220.225948.0533@mx.google.com>
https://github.com/sebpop closed https://github.com/llvm/llvm-project/pull/138793

From flang-commits at lists.llvm.org Tue May 20 07:07:53 2025 From: flang-commits at lists.llvm.org (Sebastian Pop via flang-commits) Date: Tue, 20 May 2025 07:07:53 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix Werror=dangling-reference (PR #138793) In-Reply-To: Message-ID: <682c8cb9.050a0220.3d501.27e3@mx.google.com>
sebpop wrote: Abandoning this patch as the code has changed and flang finishes build without errors on my system.
https://github.com/llvm/llvm-project/pull/138793

From flang-commits at lists.llvm.org Tue May 20 07:11:31 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 07:11:31 -0700 (PDT) Subject: [flang-commits] [flang] ed07412 - [flang] translate derived type array init to attribute if possible (#140268) Message-ID: <682c8d93.170a0220.13308d.fd74@mx.google.com>
Author: jeanPerier
Date: 2025-05-20T16:11:27+02:00
New Revision: ed07412888e2ef5a1f36a48eb5a280050e223fad
URL: https://github.com/llvm/llvm-project/commit/ed07412888e2ef5a1f36a48eb5a280050e223fad
DIFF: https://github.com/llvm/llvm-project/commit/ed07412888e2ef5a1f36a48eb5a280050e223fad.diff
LOG: [flang] translate derived type array init to attribute if possible (#140268)
This patch relies on #140235 and #139724 to speed-up compilations of files with derived type array global with initial value. Currently, such derived type global init was lowered to an llvm.mlir.insertvalue chain in the LLVM IR dialect because there was no way to represent such value via attributes.
This chain was later folded in LLVM dialect to LLVM IR using LLVM IR (not dialect) folding. This insert chain generation and folding is very expensive for big arrays. For instance, this patch brings down the compilation of FM_lib fmsave.f95 from 50s to 0.5s. Added: flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp flang/test/Fir/convert-and-fold-insert-on-range.fir Modified: flang/include/flang/Optimizer/Dialect/FIROps.td flang/lib/Optimizer/CodeGen/CMakeLists.txt flang/lib/Optimizer/CodeGen/CodeGen.cpp flang/lib/Optimizer/Dialect/FIROps.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h b/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h new file mode 100644 index 0000000000000..36360dc77d588 --- /dev/null +++ b/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h @@ -0,0 +1,34 @@ +//===-- LLVMInsertChainFolder.h -- insertvalue chain folder ----*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Helper to fold LLVM dialect llvm.insertvalue chain representing constants +// into an Attribute representation. +// This sits in Flang because it is incomplete and tailored for flang needs. +// +//===----------------------------------------------------------------------===// + +#include "llvm/Support/LogicalResult.h" + +namespace mlir { +class Attribute; +class OpBuilder; +class Value; +} // namespace mlir + +namespace fir { + +/// Attempt to fold an llvm.insertvalue chain into an attribute representation +/// suitable as llvm.constant operand. 
The returned value will be llvm::Failure +/// if this is not an llvm.insertvalue result or if the chain is not a constant, +/// or cannot be represented as an Attribute. The operations are not deleted, +/// but some llvm.insertvalue value operands may be folded with the builder on +/// the way. +llvm::FailureOr +tryFoldingLLVMInsertChain(mlir::Value insertChainResult, + mlir::OpBuilder &builder); +} // namespace fir diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td index 458b780806144..dc66885f776f0 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.td +++ b/flang/include/flang/Optimizer/Dialect/FIROps.td @@ -2129,6 +2129,11 @@ def fir_InsertOnRangeOp : fir_OneResultOp<"insert_on_range", [NoMemoryEffect]> { $seq `,` $val custom($coor) attr-dict `:` functional-type(operands, results) }]; + let extraClassDeclaration = [{ + /// Is this insert_on_range inserting on all the values of the result type? + bool isFullRange(); + }]; + let hasVerifier = 1; } diff --git a/flang/lib/Optimizer/CodeGen/CMakeLists.txt b/flang/lib/Optimizer/CodeGen/CMakeLists.txt index 04480bac552b7..980307db315d9 100644 --- a/flang/lib/Optimizer/CodeGen/CMakeLists.txt +++ b/flang/lib/Optimizer/CodeGen/CMakeLists.txt @@ -3,6 +3,7 @@ add_flang_library(FIRCodeGen CodeGen.cpp CodeGenOpenMP.cpp FIROpPatterns.cpp + LLVMInsertChainFolder.cpp LowerRepackArrays.cpp PreCGRewrite.cpp TBAABuilder.cpp diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index ad9119ba4a031..70c90fae34086 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -14,6 +14,7 @@ #include "flang/Optimizer/CodeGen/CodeGenOpenMP.h" #include "flang/Optimizer/CodeGen/FIROpPatterns.h" +#include "flang/Optimizer/CodeGen/LLVMInsertChainFolder.h" #include "flang/Optimizer/CodeGen/TypeConverter.h" #include "flang/Optimizer/Dialect/FIRAttr.h" #include 
"flang/Optimizer/Dialect/FIRCG/CGOps.h" @@ -2412,15 +2413,39 @@ struct InsertOnRangeOpConversion doRewrite(fir::InsertOnRangeOp range, mlir::Type ty, OpAdaptor adaptor, mlir::ConversionPatternRewriter &rewriter) const override { - llvm::SmallVector dims; - auto type = adaptor.getOperands()[0].getType(); + auto arrayType = adaptor.getSeq().getType(); // Iteratively extract the array dimensions from the type. + llvm::SmallVector dims; + mlir::Type type = arrayType; while (auto t = mlir::dyn_cast(type)) { dims.push_back(t.getNumElements()); type = t.getElementType(); } + // Avoid generating long insert chain that are very slow to fold back + // (which is required in globals when later generating LLVM IR). Attempt to + // fold the inserted element value to an attribute and build an ArrayAttr + // for the resulting array. + if (range.isFullRange()) { + llvm::FailureOr cst = + fir::tryFoldingLLVMInsertChain(adaptor.getVal(), rewriter); + if (llvm::succeeded(cst)) { + mlir::Attribute dimVal = *cst; + for (auto dim : llvm::reverse(dims)) { + // Use std::vector in case the number of elements is big. + std::vector elements(dim, dimVal); + dimVal = mlir::ArrayAttr::get(range.getContext(), elements); + } + // Replace insert chain with constant. + rewriter.replaceOpWithNewOp(range, arrayType, + dimVal); + return mlir::success(); + } + } + + // The inserted value cannot be folded to an attribute, turn the + // insert_range into an llvm.insertvalue chain. llvm::SmallVector lBounds; llvm::SmallVector uBounds; @@ -2434,8 +2459,8 @@ struct InsertOnRangeOpConversion auto &subscripts = lBounds; auto loc = range.getLoc(); - mlir::Value lastOp = adaptor.getOperands()[0]; - mlir::Value insertVal = adaptor.getOperands()[1]; + mlir::Value lastOp = adaptor.getSeq(); + mlir::Value insertVal = adaptor.getVal(); while (subscripts != uBounds) { lastOp = rewriter.create( @@ -3131,7 +3156,7 @@ struct GlobalOpConversion : public fir::FIROpConversion { // initialization is on the full range. 
auto insertOnRangeOps = gr.front().getOps(); for (auto insertOp : insertOnRangeOps) { - if (isFullRange(insertOp.getCoor(), insertOp.getType())) { + if (insertOp.isFullRange()) { auto seqTyAttr = convertType(insertOp.getType()); auto *op = insertOp.getVal().getDefiningOp(); auto constant = mlir::dyn_cast(op); @@ -3161,22 +3186,7 @@ struct GlobalOpConversion : public fir::FIROpConversion { return mlir::success(); } - bool isFullRange(mlir::DenseIntElementsAttr indexes, - fir::SequenceType seqTy) const { - auto extents = seqTy.getShape(); - if (indexes.size() / 2 != static_cast(extents.size())) - return false; - auto cur_index = indexes.value_begin(); - for (unsigned i = 0; i < indexes.size(); i += 2) { - if (*(cur_index++) != 0) - return false; - if (*(cur_index++) != extents[i / 2] - 1) - return false; - } - return true; - } - - // TODO: String comparaison should be avoided. Replace linkName with an + // TODO: String comparisons should be avoided. Replace linkName with an // enumeration. mlir::LLVM::Linkage convertLinkage(std::optional optLinkage) const { diff --git a/flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp b/flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp new file mode 100644 index 0000000000000..5b522f2647916 --- /dev/null +++ b/flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp @@ -0,0 +1,210 @@ +//===-- LLVMInsertChainFolder.cpp -----------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/CodeGen/LLVMInsertChainFolder.h" +#include "mlir/Dialect/LLVMIR/LLVMAttrs.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/IR/Builders.h" +#include "llvm/Support/Debug.h" + +#define DEBUG_TYPE "flang-insert-folder" + +#include + +namespace { +// Helper class to construct the attribute elements of an aggregate value being +// folded without creating a full mlir::Attribute representation for each step +// of the insert value chain, which would both be expensive in terms of +// compilation time and memory (since the intermediate Attribute would survive, +// unused, inside the mlir context). +class InsertChainBackwardFolder { + // Type for the current value of an element of the aggregate value being + // constructed by the insert chain. + // At any point of the insert chain, the value of an element is either: + // - nullptr: not yet known, the insert has not yet been seen. + // - an mlir::Attribute: the element is fully defined. + // - a nested InsertChainBackwardFolder: the element is itself an aggregate + // and its sub-elements have been partially defined (insert with mutliple + // indices have been seen). + + // The insertion folder assumes backward walk of the insert chain. Once an + // element or sub-element has been defined, it is not overriden by new + // insertions (last insert wins). 
+ using InFlightValue = + llvm::PointerUnion; + +public: + InsertChainBackwardFolder( + mlir::Type type, std::deque *folderStorage) + : values(getNumElements(type), mlir::Attribute{}), + folderStorage{folderStorage}, type{type} {} + + /// Push + bool pushValue(mlir::Attribute val, llvm::ArrayRef at); + + mlir::Attribute finalize(mlir::Attribute defaultFieldValue); + +private: + static int64_t getNumElements(mlir::Type type) { + if (auto structTy = + llvm::dyn_cast_if_present(type)) + return structTy.getBody().size(); + if (auto arrayTy = + llvm::dyn_cast_if_present(type)) + return arrayTy.getNumElements(); + return 0; + } + + static mlir::Type getSubElementType(mlir::Type type, int64_t field) { + if (auto arrayTy = + llvm::dyn_cast_if_present(type)) + return arrayTy.getElementType(); + if (auto structTy = + llvm::dyn_cast_if_present(type)) + return structTy.getBody()[field]; + return nullptr; + } + + // Current element value of the aggregate value being built. + llvm::SmallVector values; + // std::deque is used to allocate storage for nested list and guarantee the + // stability of the InsertChainBackwardFolder* used as element value. + std::deque *folderStorage; + // Type of the aggregate value being built. + mlir::Type type; +}; +} // namespace + +// Helper to fold the value being inserted by an llvm.insert_value. +// This may call tryFoldingLLVMInsertChain if the value is an aggregate and +// was itself constructed by a different insert chain. +// Returns a nullptr Attribute if the value could not be folded. 
+static mlir::Attribute getAttrIfConstant(mlir::Value val, + mlir::OpBuilder &rewriter) { + if (auto cst = val.getDefiningOp()) + return cst.getValue(); + if (auto insert = val.getDefiningOp()) { + llvm::FailureOr attr = + fir::tryFoldingLLVMInsertChain(val, rewriter); + if (succeeded(attr)) + return *attr; + return nullptr; + } + if (val.getDefiningOp()) + return mlir::LLVM::ZeroAttr::get(val.getContext()); + if (val.getDefiningOp()) + return mlir::LLVM::UndefAttr::get(val.getContext()); + if (mlir::Operation *op = val.getDefiningOp()) { + unsigned resNum = llvm::cast(val).getResultNumber(); + llvm::SmallVector results; + if (mlir::succeeded(rewriter.tryFold(op, results)) && + results.size() > resNum) { + if (auto cst = results[resNum].getDefiningOp()) + return cst.getValue(); + } + } + if (auto trunc = val.getDefiningOp()) + if (auto attr = getAttrIfConstant(trunc.getArg(), rewriter)) + if (auto intAttr = llvm::dyn_cast(attr)) + return mlir::IntegerAttr::get(trunc.getType(), intAttr.getInt()); + LLVM_DEBUG(llvm::dbgs() << "cannot fold insert value operand: " << val + << "\n"); + return nullptr; +} + +mlir::Attribute +InsertChainBackwardFolder::finalize(mlir::Attribute defaultFieldValue) { + llvm::SmallVector attrs = llvm::map_to_vector( + values, [&](InFlightValue inFlight) -> mlir::Attribute { + if (!inFlight) + return defaultFieldValue; + if (auto attr = llvm::dyn_cast(inFlight)) + return attr; + return llvm::cast(inFlight)->finalize( + defaultFieldValue); + }); + return mlir::ArrayAttr::get(type.getContext(), attrs); +} + +bool InsertChainBackwardFolder::pushValue(mlir::Attribute val, + llvm::ArrayRef at) { + if (at.size() == 0 || at[0] >= static_cast(values.size())) + return false; + InFlightValue &inFlight = values[at[0]]; + if (!inFlight) { + if (at.size() == 1) { + inFlight = val; + return true; + } + // This is the first insert to a nested field. Create a + // InsertChainBackwardFolder for the current element value. 
+ mlir::Type subType = getSubElementType(type, at[0]); + if (!subType) + return false; + InsertChainBackwardFolder &inFlightList = + folderStorage->emplace_back(subType, folderStorage); + inFlight = &inFlightList; + return inFlightList.pushValue(val, at.drop_front()); + } + // Keep last inserted value if already set. + if (llvm::isa(inFlight)) + return true; + auto *inFlightList = llvm::cast(inFlight); + if (at.size() == 1) { + if (!llvm::isa(val)) { + LLVM_DEBUG(llvm::dbgs() + << "insert chain sub-element partially overwritten initial " + "value is not zero or undef\n"); + return false; + } + inFlight = inFlightList->finalize(val); + return true; + } + return inFlightList->pushValue(val, at.drop_front()); +} + +llvm::FailureOr +fir::tryFoldingLLVMInsertChain(mlir::Value val, mlir::OpBuilder &rewriter) { + if (auto cst = val.getDefiningOp()) + return cst.getValue(); + if (auto insert = val.getDefiningOp()) { + LLVM_DEBUG(llvm::dbgs() << "trying to fold insert chain:" << val << "\n"); + if (auto structTy = + llvm::dyn_cast(insert.getType())) { + mlir::LLVM::InsertValueOp currentInsert = insert; + mlir::LLVM::InsertValueOp lastInsert; + std::deque folderStorage; + InsertChainBackwardFolder inFlightList(structTy, &folderStorage); + while (currentInsert) { + mlir::Attribute attr = + getAttrIfConstant(currentInsert.getValue(), rewriter); + if (!attr) + return llvm::failure(); + if (!inFlightList.pushValue(attr, currentInsert.getPosition())) + return llvm::failure(); + lastInsert = currentInsert; + currentInsert = currentInsert.getContainer() + .getDefiningOp(); + } + mlir::Attribute defaultVal; + if (lastInsert) { + if (lastInsert.getContainer().getDefiningOp()) + defaultVal = mlir::LLVM::ZeroAttr::get(val.getContext()); + else if (lastInsert.getContainer().getDefiningOp()) + defaultVal = mlir::LLVM::UndefAttr::get(val.getContext()); + } + if (!defaultVal) { + LLVM_DEBUG(llvm::dbgs() + << "insert chain initial value is not Zero or Undef\n"); + return llvm::failure(); + 
} + return inFlightList.finalize(defaultVal); + } + } + return llvm::failure(); +} diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index d85b38c467857..e12af7782a578 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -2365,6 +2365,21 @@ llvm::LogicalResult fir::InsertOnRangeOp::verify() { return mlir::success(); } +bool fir::InsertOnRangeOp::isFullRange() { + auto extents = getType().getShape(); + mlir::DenseIntElementsAttr indexes = getCoor(); + if (indexes.size() / 2 != static_cast(extents.size())) + return false; + auto cur_index = indexes.value_begin(); + for (unsigned i = 0; i < indexes.size(); i += 2) { + if (*(cur_index++) != 0) + return false; + if (*(cur_index++) != extents[i / 2] - 1) + return false; + } + return true; +} + //===----------------------------------------------------------------------===// // InsertValueOp //===----------------------------------------------------------------------===// diff --git a/flang/test/Fir/convert-and-fold-insert-on-range.fir b/flang/test/Fir/convert-and-fold-insert-on-range.fir new file mode 100644 index 0000000000000..df18614d80b63 --- /dev/null +++ b/flang/test/Fir/convert-and-fold-insert-on-range.fir @@ -0,0 +1,33 @@ +// Test codegen of constant insert_on_range without symbol reference into mlir.constant. 
+// RUN: fir-opt --cg-rewrite --split-input-file --fir-to-llvm-ir %s | FileCheck %s + +module attributes {dlti.dl_spec = #dlti.dl_spec = dense<32> : vector<4xi64>, !llvm.ptr<271> = dense<32> : vector<4xi64>, !llvm.ptr<272> = dense<64> : vector<4xi64>, i64 = dense<64> : vector<2xi64>, i128 = dense<128> : vector<2xi64>, f80 = dense<128> : vector<2xi64>, !llvm.ptr = dense<64> : vector<4xi64>, i1 = dense<8> : vector<2xi64>, i8 = dense<8> : vector<2xi64>, i16 = dense<16> : vector<2xi64>, i32 = dense<32> : vector<2xi64>, f16 = dense<16> : vector<2xi64>, f64 = dense<64> : vector<2xi64>, f128 = dense<128> : vector<2xi64>, "dlti.endianness" = "little", "dlti.mangling_mode" = "e", "dlti.stack_alignment" = 128 : i64>, fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} { + fir.global @derived_array : !fir.array<2x!fir.type>>}>> { + %c0 = arith.constant 0 : index + %0 = fir.undefined !fir.type>>}> + %1 = fir.zero_bits !fir.heap> + %2 = fir.shape %c0 : (index) -> !fir.shape<1> + %3 = fir.embox %1(%2) : (!fir.heap>, !fir.shape<1>) -> !fir.box>> + %4 = fir.insert_value %0, %3, ["comp", !fir.type>>}>] : (!fir.type>>}>, !fir.box>>) -> !fir.type>>}> + %5 = fir.undefined !fir.array<2x!fir.type>>}>> + %6 = fir.insert_on_range %5, %4 from (0) to (1) : (!fir.array<2x!fir.type>>}>>, !fir.type>>}>) -> !fir.array<2x!fir.type>>}>> + fir.has_value %6 : !fir.array<2x!fir.type>>}>> + } +} + +//CHECK-LABEL: llvm.mlir.global external @derived_array() +//CHECK: %[[CST:.*]] = llvm.mlir.constant([ +//CHECK-SAME: [ +//CHECK-SAME: [#llvm.zero, 8, 20240719 : i32, 1 : i8, 28 : i8, 2 : i8, 0 : i8, +//CHECK-SAME: [ +//CHECK-SAME: [1, 0 : index, 8] +//CHECK-SAME: ] +//CHECK-SAME: ], +//CHECK-SAME: [ +//CHECK-SAME: [#llvm.zero, 8, 20240719 : i32, 1 : i8, 28 : i8, 2 : i8, 0 : i8, +//CHECK-SAME: [ +//CHECK-SAME: [1, 0 : index, 8] +//CHECK-SAME: ] 
+//CHECK-SAME: ]) : +//CHECK-SAME: !llvm.array<2 x struct<"sometype", (struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)>)>> +//CHECK: llvm.return %[[CST]] : !llvm.array<2 x struct<"sometype", (struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)>)>> From flang-commits at lists.llvm.org Tue May 20 07:11:36 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 07:11:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang] translate derived type array init to attribute if possible (PR #140268) In-Reply-To: Message-ID: <682c8d98.170a0220.22036d.0998@mx.google.com> https://github.com/jeanPerier closed https://github.com/llvm/llvm-project/pull/140268 From flang-commits at lists.llvm.org Tue May 20 01:47:44 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 01:47:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang] translate derived type array init to attribute if possible (PR #140268) In-Reply-To: Message-ID: <682c41b0.170a0220.1bea2b.e497@mx.google.com> https://github.com/jeanPerier updated https://github.com/llvm/llvm-project/pull/140268 >From 2a8d78189a6c20a482dd44dfe43f8c919d0e53d6 Mon Sep 17 00:00:00 2001 From: Jean Perier Date: Fri, 16 May 2025 01:44:31 -0700 Subject: [PATCH 1/4] [flang] use DataLayout instead of GEP to compute element size --- .../flang/Optimizer/CodeGen/FIROpPatterns.h | 4 ++ flang/lib/Optimizer/CodeGen/CodeGen.cpp | 50 ++++++++--------- flang/test/Fir/convert-to-llvm.fir | 54 +++++-------------- flang/test/Fir/copy-codegen.fir | 12 ++--- flang/test/Fir/embox-char.fir | 8 +-- flang/test/Fir/embox-substring.fir | 7 ++- 6 files changed, 48 insertions(+), 87 deletions(-) diff --git a/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h b/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h index 53d16323beddf..7b1c14e4dfdc9 100644 --- a/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h +++ b/flang/include/flang/Optimizer/CodeGen/FIROpPatterns.h 
@@ -173,6 +173,10 @@ class ConvertFIRToLLVMPattern : public mlir::ConvertToLLVMPattern { this->getTypeConverter()); } + const mlir::DataLayout &getDataLayout() const { + return lowerTy().getDataLayout(); + } + void attachTBAATag(mlir::LLVM::AliasAnalysisOpInterface op, mlir::Type baseFIRType, mlir::Type accessFIRType, mlir::LLVM::GEPOp gep) const { diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index e534cfa5591c6..ad9119ba4a031 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -1043,22 +1043,12 @@ static mlir::SymbolRefAttr getMalloc(fir::AllocMemOp op, static mlir::Value computeElementDistance(mlir::Location loc, mlir::Type llvmObjectType, mlir::Type idxTy, - mlir::ConversionPatternRewriter &rewriter) { - // Note that we cannot use something like - // mlir::LLVM::getPrimitiveTypeSizeInBits() for the element type here. For - // example, it returns 10 bytes for mlir::Float80Type for targets where it - // occupies 16 bytes. Proper solution is probably to use - // mlir::DataLayout::getTypeABIAlignment(), but DataLayout is not being set - // yet (see llvm-project#57230). For the time being use the '(intptr_t)((type - // *)0 + 1)' trick for all types. The generated instructions are optimized - // into constant by the first pass of InstCombine, so it should not be a - // performance issue. 
- auto llvmPtrTy = ::getLlvmPtrType(llvmObjectType.getContext()); - auto nullPtr = rewriter.create(loc, llvmPtrTy); - auto gep = rewriter.create( - loc, llvmPtrTy, llvmObjectType, nullPtr, - llvm::ArrayRef{1}); - return rewriter.create(loc, idxTy, gep); + mlir::ConversionPatternRewriter &rewriter, + const mlir::DataLayout &dataLayout) { + llvm::TypeSize size = dataLayout.getTypeSize(llvmObjectType); + unsigned short alignment = dataLayout.getTypeABIAlignment(llvmObjectType); + std::int64_t distance = llvm::alignTo(size, alignment); + return genConstantIndex(loc, idxTy, rewriter, distance); } /// Return value of the stride in bytes between adjacent elements @@ -1066,10 +1056,10 @@ computeElementDistance(mlir::Location loc, mlir::Type llvmObjectType, /// \p idxTy integer type. static mlir::Value genTypeStrideInBytes(mlir::Location loc, mlir::Type idxTy, - mlir::ConversionPatternRewriter &rewriter, - mlir::Type llTy) { + mlir::ConversionPatternRewriter &rewriter, mlir::Type llTy, + const mlir::DataLayout &dataLayout) { // Create a pointer type and use computeElementDistance(). 
- return computeElementDistance(loc, llTy, idxTy, rewriter); + return computeElementDistance(loc, llTy, idxTy, rewriter, dataLayout); } namespace { @@ -1111,7 +1101,7 @@ struct AllocMemOpConversion : public fir::FIROpConversion { mlir::Value genTypeSizeInBytes(mlir::Location loc, mlir::Type idxTy, mlir::ConversionPatternRewriter &rewriter, mlir::Type llTy) const { - return computeElementDistance(loc, llTy, idxTy, rewriter); + return computeElementDistance(loc, llTy, idxTy, rewriter, getDataLayout()); } }; } // namespace @@ -1323,8 +1313,8 @@ struct EmboxCommonConversion : public fir::FIROpConversion { fir::CharacterType charTy, mlir::ValueRange lenParams) const { auto i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); - mlir::Value size = - genTypeStrideInBytes(loc, i64Ty, rewriter, this->convertType(charTy)); + mlir::Value size = genTypeStrideInBytes( + loc, i64Ty, rewriter, this->convertType(charTy), this->getDataLayout()); if (charTy.hasConstantLen()) return size; // Length accounted for in the genTypeStrideInBytes GEP. // Otherwise, multiply the single character size by the length. 
@@ -1338,6 +1328,7 @@ struct EmboxCommonConversion : public fir::FIROpConversion { std::tuple getSizeAndTypeCode( mlir::Location loc, mlir::ConversionPatternRewriter &rewriter, mlir::Type boxEleTy, mlir::ValueRange lenParams = {}) const { + const mlir::DataLayout &dataLayout = this->getDataLayout(); auto i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); if (auto eleTy = fir::dyn_cast_ptrEleTy(boxEleTy)) boxEleTy = eleTy; @@ -1354,18 +1345,19 @@ struct EmboxCommonConversion : public fir::FIROpConversion { mlir::dyn_cast(boxEleTy) || fir::isa_real(boxEleTy) || fir::isa_complex(boxEleTy)) return {genTypeStrideInBytes(loc, i64Ty, rewriter, - this->convertType(boxEleTy)), + this->convertType(boxEleTy), dataLayout), typeCodeVal}; if (auto charTy = mlir::dyn_cast(boxEleTy)) return {getCharacterByteSize(loc, rewriter, charTy, lenParams), typeCodeVal}; if (fir::isa_ref_type(boxEleTy)) { auto ptrTy = ::getLlvmPtrType(rewriter.getContext()); - return {genTypeStrideInBytes(loc, i64Ty, rewriter, ptrTy), typeCodeVal}; + return {genTypeStrideInBytes(loc, i64Ty, rewriter, ptrTy, dataLayout), + typeCodeVal}; } if (mlir::isa(boxEleTy)) return {genTypeStrideInBytes(loc, i64Ty, rewriter, - this->convertType(boxEleTy)), + this->convertType(boxEleTy), dataLayout), typeCodeVal}; fir::emitFatalError(loc, "unhandled type in fir.box code generation"); } @@ -1909,8 +1901,8 @@ struct XEmboxOpConversion : public EmboxCommonConversion { if (hasSubcomp) { // We have a subcomponent. The step value needs to be the number of // bytes per element (which is a derived type). - prevDimByteStride = - genTypeStrideInBytes(loc, i64Ty, rewriter, convertType(seqEleTy)); + prevDimByteStride = genTypeStrideInBytes( + loc, i64Ty, rewriter, convertType(seqEleTy), getDataLayout()); } else if (hasSubstr) { // We have a substring. The step value needs to be the number of bytes // per CHARACTER element. 
@@ -3604,8 +3596,8 @@ struct CopyOpConversion : public fir::FIROpConversion { mlir::Value llvmDestination = adaptor.getDestination(); mlir::Type i64Ty = mlir::IntegerType::get(rewriter.getContext(), 64); mlir::Type copyTy = fir::unwrapRefType(copy.getSource().getType()); - mlir::Value copySize = - genTypeStrideInBytes(loc, i64Ty, rewriter, convertType(copyTy)); + mlir::Value copySize = genTypeStrideInBytes( + loc, i64Ty, rewriter, convertType(copyTy), getDataLayout()); mlir::LLVM::AliasAnalysisOpInterface newOp; if (copy.getNoOverlap()) diff --git a/flang/test/Fir/convert-to-llvm.fir b/flang/test/Fir/convert-to-llvm.fir index 2960528fb6c24..6d8a8bb606b90 100644 --- a/flang/test/Fir/convert-to-llvm.fir +++ b/flang/test/Fir/convert-to-llvm.fir @@ -216,9 +216,7 @@ func.func @test_alloc_and_freemem_one() { } // CHECK-LABEL: llvm.func @test_alloc_and_freemem_one() { -// CHECK-NEXT: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK-NEXT: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK-NEXT: %[[N:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[N:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK-NEXT: llvm.call @malloc(%[[N]]) // CHECK: llvm.call @free(%{{.*}}) // CHECK-NEXT: llvm.return @@ -235,10 +233,8 @@ func.func @test_alloc_and_freemem_several() { } // CHECK-LABEL: llvm.func @test_alloc_and_freemem_several() { -// CHECK: [[NULL:%.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: [[PTR:%.*]] = llvm.getelementptr [[NULL]][{{.*}}] : (!llvm.ptr) -> !llvm.ptr, !llvm.array<100 x f32> -// CHECK: [[N:%.*]] = llvm.ptrtoint [[PTR]] : !llvm.ptr to i64 -// CHECK: [[MALLOC:%.*]] = llvm.call @malloc([[N]]) +// CHECK: %[[N:.*]] = llvm.mlir.constant(400 : i64) : i64 +// CHECK: [[MALLOC:%.*]] = llvm.call @malloc(%[[N]]) // CHECK: llvm.call @free([[MALLOC]]) // CHECK: llvm.return @@ -251,9 +247,7 @@ func.func @test_with_shape(%ncols: index, %nrows: index) { // CHECK-LABEL: llvm.func @test_with_shape // CHECK-SAME: %[[NCOLS:.*]]: i64, %[[NROWS:.*]]: i64 -// CHECK: 
%[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[FOUR:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[FOUR:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[DIM1_SIZE:.*]] = llvm.mul %[[FOUR]], %[[NCOLS]] : i64 // CHECK: %[[TOTAL_SIZE:.*]] = llvm.mul %[[DIM1_SIZE]], %[[NROWS]] : i64 // CHECK: %[[MEM:.*]] = llvm.call @malloc(%[[TOTAL_SIZE]]) @@ -269,9 +263,7 @@ func.func @test_string_with_shape(%len: index, %nelems: index) { // CHECK-LABEL: llvm.func @test_string_with_shape // CHECK-SAME: %[[LEN:.*]]: i64, %[[NELEMS:.*]]: i64) -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ONE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ONE:.*]] = llvm.mlir.constant(1 : i64) : i64 // CHECK: %[[LEN_SIZE:.*]] = llvm.mul %[[ONE]], %[[LEN]] : i64 // CHECK: %[[TOTAL_SIZE:.*]] = llvm.mul %[[LEN_SIZE]], %[[NELEMS]] : i64 // CHECK: %[[MEM:.*]] = llvm.call @malloc(%[[TOTAL_SIZE]]) @@ -1654,9 +1646,7 @@ func.func @embox0(%arg0: !fir.ref>) { // AMDGPU: %[[AA:.*]] = llvm.alloca %[[C1]] x !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> {alignment = 8 : i64} : (i32) -> !llvm.ptr<5> // AMDGPU: %[[ALLOCA:.*]] = llvm.addrspacecast %[[AA]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[I64_ELEM_SIZE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[I64_ELEM_SIZE:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[DESC:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> // CHECK: %[[DESC0:.*]] = llvm.insertvalue %[[I64_ELEM_SIZE]], %[[DESC]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}})> // CHECK: %[[CFI_VERSION:.*]] = 
llvm.mlir.constant(20240719 : i32) : i32 @@ -1879,9 +1869,7 @@ func.func @xembox0(%arg0: !fir.ref>) { // AMDGPU: %[[ALLOCA:.*]] = llvm.addrspacecast %[[AA]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[TYPE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -1933,9 +1921,7 @@ func.func @xembox0_i32(%arg0: !fir.ref>) { // CHECK: %[[C0_I32:.*]] = llvm.mlir.constant(0 : i32) : i32 // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[TYPE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -1988,9 +1974,7 @@ func.func @xembox1(%arg0: !fir.ref>>) { // CHECK-LABEL: llvm.func @xembox1(%{{.*}}: !llvm.ptr) { // CHECK: %[[C0:.*]] = llvm.mlir.constant(0 : 
i64) : i64 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(10 : i64) : i64 // CHECK: %{{.*}} = llvm.insertvalue %[[ELEM_LEN_I64]], %{{.*}}[1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[PREV_PTROFF:.*]] = llvm.mul %[[ELEM_LEN_I64]], %[[C0]] : i64 @@ -2042,9 +2026,7 @@ func.func private @_QPxb(!fir.box>) // AMDGPU: %[[AR:.*]] = llvm.alloca %[[ARR_SIZE]] x f64 {bindc_name = "arr"} : (i64) -> !llvm.ptr<5> // AMDGPU: %[[ARR:.*]] = llvm.addrspacecast %[[AR]] : !llvm.ptr<5> to !llvm.ptr // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(28 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(8 : i64) : i64 // CHECK: %[[BOX0:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<2 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<2 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -2126,9 +2108,7 @@ func.func private @_QPtest_dt_callee(%arg0: !fir.box>) // CHECK: %[[C10:.*]] = llvm.mlir.constant(10 : i64) : i64 // CHECK: %[[C2:.*]] = llvm.mlir.constant(2 : i64) : i64 // CHECK: %[[TYPE_CODE:.*]] = llvm.mlir.constant(9 : i32) : i32 -// CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +// CHECK: %[[ELEM_LEN_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[BOX0:.*]] = 
llvm.mlir.undef : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[BOX1:.*]] = llvm.insertvalue %[[ELEM_LEN_I64]], %[[BOX0]][1] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[VERSION:.*]] = llvm.mlir.constant(20240719 : i32) : i32 @@ -2146,9 +2126,7 @@ func.func private @_QPtest_dt_callee(%arg0: !fir.box>) // CHECK: %[[BOX6:.*]] = llvm.insertvalue %[[F18ADDENDUM_I8]], %[[BOX5]][6] : !llvm.struct<(ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, array<1 x array<3 x i64>>)> // CHECK: %[[ZERO:.*]] = llvm.mlir.constant(0 : i64) : i64 // CHECK: %[[ONE:.*]] = llvm.mlir.constant(1 : i64) : i64 -// CHECK: %[[ELE_TYPE:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[GEP_DTYPE_SIZE:.*]] = llvm.getelementptr %[[ELE_TYPE]][1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"_QFtest_dt_sliceTt", (i32, i32)> -// CHECK: %[[PTRTOINT_DTYPE_SIZE:.*]] = llvm.ptrtoint %[[GEP_DTYPE_SIZE]] : !llvm.ptr to i64 +// CHECK: %[[PTRTOINT_DTYPE_SIZE:.*]] = llvm.mlir.constant(8 : i64) : i64 // CHECK: %[[ADJUSTED_OFFSET:.*]] = llvm.sub %[[C1]], %[[ONE]] : i64 // CHECK: %[[EXT_SUB:.*]] = llvm.sub %[[C10]], %[[C1]] : i64 // CHECK: %[[EXT_ADD:.*]] = llvm.add %[[EXT_SUB]], %[[C2]] : i64 @@ -2429,9 +2407,7 @@ func.func @test_rebox_1(%arg0: !fir.box>) { //CHECK: %[[SIX:.*]] = llvm.mlir.constant(6 : index) : i64 //CHECK: %[[EIGHTY:.*]] = llvm.mlir.constant(80 : index) : i64 //CHECK: %[[FLOAT_TYPE:.*]] = llvm.mlir.constant(27 : i32) : i32 -//CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -//CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -//CHECK: %[[ELEM_SIZE_I64:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +//CHECK: %[[ELEM_SIZE_I64:.*]] = llvm.mlir.constant(4 : i64) : i64 //CHECK: %[[EXTRA_GEP:.*]] = llvm.getelementptr %[[ARG0]][0, 6] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> //CHECK: %[[EXTRA:.*]] 
= llvm.load %[[EXTRA_GEP]] : !llvm.ptr -> i8 //CHECK: %[[RBOX:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> @@ -2504,9 +2480,7 @@ func.func @foo(%arg0: !fir.box} //CHECK: %[[COMPONENT_OFFSET_1:.*]] = llvm.mlir.constant(1 : i64) : i64 //CHECK: %[[ELEM_COUNT:.*]] = llvm.mlir.constant(7 : i64) : i64 //CHECK: %[[TYPE_CHAR:.*]] = llvm.mlir.constant(40 : i32) : i32 -//CHECK: %[[NULL:.*]] = llvm.mlir.zero : !llvm.ptr -//CHECK: %[[GEP:.*]] = llvm.getelementptr %[[NULL]][1] -//CHECK: %[[CHAR_SIZE:.*]] = llvm.ptrtoint %[[GEP]] : !llvm.ptr to i64 +//CHECK: %[[CHAR_SIZE:.*]] = llvm.mlir.constant(1 : i64) : i64 //CHECK: %[[ELEM_SIZE:.*]] = llvm.mul %[[CHAR_SIZE]], %[[ELEM_COUNT]] //CHECK: %[[EXTRA_GEP:.*]] = llvm.getelementptr %[[ARG0]][0, 6] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>, ptr, array<1 x i64>)> //CHECK: %[[EXTRA:.*]] = llvm.load %[[EXTRA_GEP]] : !llvm.ptr -> i8 diff --git a/flang/test/Fir/copy-codegen.fir b/flang/test/Fir/copy-codegen.fir index eef1885c6a49c..7b0620ca2d312 100644 --- a/flang/test/Fir/copy-codegen.fir +++ b/flang/test/Fir/copy-codegen.fir @@ -12,10 +12,8 @@ func.func @test_copy_1(%arg0: !fir.ref, %arg1: !fir.ref) { // CHECK-LABEL: llvm.func @test_copy_1( // CHECK-SAME: %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr, // CHECK-SAME: %[[VAL_1:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr) { -// CHECK: %[[VAL_2:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_3:.*]] = llvm.getelementptr %[[VAL_2]][1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"sometype", (array<9 x i32>)> -// CHECK: %[[VAL_4:.*]] = llvm.ptrtoint %[[VAL_3]] : !llvm.ptr to i64 -// CHECK: "llvm.intr.memcpy"(%[[VAL_1]], %[[VAL_0]], %[[VAL_4]]) <{isVolatile = false}> : (!llvm.ptr, !llvm.ptr, i64) -> () +// CHECK: %[[VAL_2:.*]] = llvm.mlir.constant(36 : i64) : i64 +// CHECK: "llvm.intr.memcpy"(%[[VAL_1]], %[[VAL_0]], %[[VAL_2]]) <{isVolatile = false}> : (!llvm.ptr, 
!llvm.ptr, i64) -> () // CHECK: llvm.return // CHECK: } @@ -26,10 +24,8 @@ func.func @test_copy_2(%arg0: !fir.ref, %arg1: !fir.ref) { // CHECK-LABEL: llvm.func @test_copy_2( // CHECK-SAME: %[[VAL_0:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr, // CHECK-SAME: %[[VAL_1:[0-9]+|[a-zA-Z$._-][a-zA-Z0-9$._-]*]]: !llvm.ptr) { -// CHECK: %[[VAL_2:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_3:.*]] = llvm.getelementptr %[[VAL_2]][1] : (!llvm.ptr) -> !llvm.ptr, !llvm.struct<"sometype", (array<9 x i32>)> -// CHECK: %[[VAL_4:.*]] = llvm.ptrtoint %[[VAL_3]] : !llvm.ptr to i64 -// CHECK: "llvm.intr.memmove"(%[[VAL_1]], %[[VAL_0]], %[[VAL_4]]) <{isVolatile = false}> : (!llvm.ptr, !llvm.ptr, i64) -> () +// CHECK: %[[VAL_2:.*]] = llvm.mlir.constant(36 : i64) : i64 +// CHECK: "llvm.intr.memmove"(%[[VAL_1]], %[[VAL_0]], %[[VAL_2]]) <{isVolatile = false}> : (!llvm.ptr, !llvm.ptr, i64) -> () // CHECK: llvm.return // CHECK: } } diff --git a/flang/test/Fir/embox-char.fir b/flang/test/Fir/embox-char.fir index efb069f96520d..8e40acfdf289f 100644 --- a/flang/test/Fir/embox-char.fir +++ b/flang/test/Fir/embox-char.fir @@ -45,9 +45,7 @@ // CHECK: %[[VAL_30:.*]] = llvm.load %[[VAL_29]] : !llvm.ptr -> i64 // CHECK: %[[VAL_31:.*]] = llvm.sdiv %[[VAL_16]], %[[VAL_13]] : i64 // CHECK: %[[VAL_32:.*]] = llvm.mlir.constant(44 : i32) : i32 -// CHECK: %[[VAL_33:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_34:.*]] = llvm.getelementptr %[[VAL_33]][1] : (!llvm.ptr) -> !llvm.ptr, i32 -// CHECK: %[[VAL_35:.*]] = llvm.ptrtoint %[[VAL_34]] : !llvm.ptr to i64 +// CHECK: %[[VAL_35:.*]] = llvm.mlir.constant(4 : i64) : i64 // CHECK: %[[VAL_36:.*]] = llvm.mul %[[VAL_35]], %[[VAL_31]] : i64 // CHECK: %[[VAL_37:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> // CHECK: %[[VAL_38:.*]] = llvm.insertvalue %[[VAL_36]], %[[VAL_37]][1] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> @@ -139,9 +137,7 @@ func.func @test_char4(%arg0: 
!fir.ref !llvm.ptr, !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> // CHECK: %[[VAL_29:.*]] = llvm.load %[[VAL_28]] : !llvm.ptr -> i64 // CHECK: %[[VAL_30:.*]] = llvm.mlir.constant(40 : i32) : i32 -// CHECK: %[[VAL_31:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_32:.*]] = llvm.getelementptr %[[VAL_31]][1] : (!llvm.ptr) -> !llvm.ptr, i8 -// CHECK: %[[VAL_33:.*]] = llvm.ptrtoint %[[VAL_32]] : !llvm.ptr to i64 +// CHECK: %[[VAL_33:.*]] = llvm.mlir.constant(1 : i64) : i64 // CHECK: %[[VAL_34:.*]] = llvm.mul %[[VAL_33]], %[[VAL_15]] : i64 // CHECK: %[[VAL_35:.*]] = llvm.mlir.undef : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> // CHECK: %[[VAL_36:.*]] = llvm.insertvalue %[[VAL_34]], %[[VAL_35]][1] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<2 x array<3 x i64>>)> diff --git a/flang/test/Fir/embox-substring.fir b/flang/test/Fir/embox-substring.fir index f2042f9bda7fc..6ce6346f89b1d 100644 --- a/flang/test/Fir/embox-substring.fir +++ b/flang/test/Fir/embox-substring.fir @@ -29,10 +29,9 @@ func.func private @dump(!fir.box>>) // CHECK-SAME: %[[VAL_0:.*]]: !llvm.ptr, // CHECK-SAME: %[[VAL_1:.*]]: i64) { // CHECK: %[[VAL_5:.*]] = llvm.mlir.constant(1 : index) : i64 -// CHECK: llvm.getelementptr -// CHECK: %[[VAL_28:.*]] = llvm.mlir.zero : !llvm.ptr -// CHECK: %[[VAL_29:.*]] = llvm.getelementptr %[[VAL_28]][1] : (!llvm.ptr) -> !llvm.ptr, i8 -// CHECK: %[[VAL_30:.*]] = llvm.ptrtoint %[[VAL_29]] : !llvm.ptr to i64 +// CHECK: llvm.mlir.constant(1 : i64) : i64 +// CHECK: llvm.mlir.constant(1 : i64) : i64 +// CHECK: %[[VAL_30:.*]] = llvm.mlir.constant(1 : i64) : i64 // CHECK: %[[VAL_31:.*]] = llvm.mul %[[VAL_30]], %[[VAL_1]] : i64 // CHECK: %[[VAL_42:.*]] = llvm.mul %[[VAL_31]], %[[VAL_5]] : i64 // CHECK: %[[VAL_43:.*]] = llvm.insertvalue %[[VAL_42]], %{{.*}}[7, 0, 2] : !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> >From d71c0b7f45582ece43016eb98367251e54e75280 Mon Sep 17 00:00:00 2001 
From: Jean Perier Date: Fri, 16 May 2025 08:09:37 -0700 Subject: [PATCH 2/4] [flang] translate derived type array init to attribute if possible --- .../Optimizer/CodeGen/LLVMInsertChainFolder.h | 31 +++ .../include/flang/Optimizer/Dialect/FIROps.td | 5 + flang/lib/Optimizer/CodeGen/CMakeLists.txt | 1 + flang/lib/Optimizer/CodeGen/CodeGen.cpp | 51 +++-- .../CodeGen/LLVMInsertChainFolder.cpp | 204 ++++++++++++++++++ flang/lib/Optimizer/Dialect/FIROps.cpp | 15 ++ .../Fir/convert-and-fold-insert-on-range.fir | 33 +++ 7 files changed, 319 insertions(+), 21 deletions(-) create mode 100644 flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h create mode 100644 flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp create mode 100644 flang/test/Fir/convert-and-fold-insert-on-range.fir diff --git a/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h b/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h new file mode 100644 index 0000000000000..d577c4c0fa70b --- /dev/null +++ b/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h @@ -0,0 +1,31 @@ +//===-- LLVMInsertChainFolder.h -- insertvalue chain folder ----*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// Helper to fold LLVM dialect llvm.insertvalue chain representing constants +// into an Attribute representation. +// This sits in Flang because it is incomplete and tailored for flang needs. +// +//===----------------------------------------------------------------------===// + +namespace mlir { +class Attribute; +class OpBuilder; +class Value; +} // namespace mlir + +namespace fir { + +/// Attempt to fold an llvm.insertvalue chain into an attribute representation +/// suitable as llvm.constant operand. 
The returned value will be a null pointer +/// if this is not an llvm.insertvalue result or if the chain is not a constant, +/// or cannot be represented as an Attribute. The operations are not deleted, +/// but some llvm.insertvalue value operands may be folded with the builder on +/// the way. +mlir::Attribute tryFoldingLLVMInsertChain(mlir::Value insertChainResult, + mlir::OpBuilder &builder); +} // namespace fir diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td index 458b780806144..dc66885f776f0 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.td +++ b/flang/include/flang/Optimizer/Dialect/FIROps.td @@ -2129,6 +2129,11 @@ def fir_InsertOnRangeOp : fir_OneResultOp<"insert_on_range", [NoMemoryEffect]> { $seq `,` $val custom($coor) attr-dict `:` functional-type(operands, results) }]; + let extraClassDeclaration = [{ + /// Is this insert_on_range inserting on all the values of the result type? + bool isFullRange(); + }]; + let hasVerifier = 1; } diff --git a/flang/lib/Optimizer/CodeGen/CMakeLists.txt b/flang/lib/Optimizer/CodeGen/CMakeLists.txt index 04480bac552b7..980307db315d9 100644 --- a/flang/lib/Optimizer/CodeGen/CMakeLists.txt +++ b/flang/lib/Optimizer/CodeGen/CMakeLists.txt @@ -3,6 +3,7 @@ add_flang_library(FIRCodeGen CodeGen.cpp CodeGenOpenMP.cpp FIROpPatterns.cpp + LLVMInsertChainFolder.cpp LowerRepackArrays.cpp PreCGRewrite.cpp TBAABuilder.cpp diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index ad9119ba4a031..ed76a77ced047 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -14,6 +14,7 @@ #include "flang/Optimizer/CodeGen/CodeGenOpenMP.h" #include "flang/Optimizer/CodeGen/FIROpPatterns.h" +#include "flang/Optimizer/CodeGen/LLVMInsertChainFolder.h" #include "flang/Optimizer/CodeGen/TypeConverter.h" #include "flang/Optimizer/Dialect/FIRAttr.h" #include 
"flang/Optimizer/Dialect/FIRCG/CGOps.h" @@ -2412,15 +2413,38 @@ struct InsertOnRangeOpConversion doRewrite(fir::InsertOnRangeOp range, mlir::Type ty, OpAdaptor adaptor, mlir::ConversionPatternRewriter &rewriter) const override { - llvm::SmallVector dims; - auto type = adaptor.getOperands()[0].getType(); + auto arrayType = adaptor.getSeq().getType(); // Iteratively extract the array dimensions from the type. + llvm::SmallVector dims; + mlir::Type type = arrayType; while (auto t = mlir::dyn_cast(type)) { dims.push_back(t.getNumElements()); type = t.getElementType(); } + // Avoid generating long insert chain that are very slow to fold back + // (which is required in globals when later generating LLVM IR). Attempt to + // fold the inserted element value to an attribute and build an ArrayAttr + // for the resulting array. + if (range.isFullRange()) { + if (mlir::Attribute cst = + fir::tryFoldingLLVMInsertChain(adaptor.getVal(), rewriter)) { + mlir::Attribute dimVal = cst; + for (auto dim : llvm::reverse(dims)) { + // Use std::vector in case the number of elements is big. + std::vector elements(dim, dimVal); + dimVal = mlir::ArrayAttr::get(range.getContext(), elements); + } + // Replace insert chain with constant. + rewriter.replaceOpWithNewOp(range, arrayType, + dimVal); + return mlir::success(); + } + } + + // The inserted value cannot be folded to an attribute, turn the + // insert_range into an llvm.insertvalue chain. llvm::SmallVector lBounds; llvm::SmallVector uBounds; @@ -2434,8 +2458,8 @@ struct InsertOnRangeOpConversion auto &subscripts = lBounds; auto loc = range.getLoc(); - mlir::Value lastOp = adaptor.getOperands()[0]; - mlir::Value insertVal = adaptor.getOperands()[1]; + mlir::Value lastOp = adaptor.getSeq(); + mlir::Value insertVal = adaptor.getVal(); while (subscripts != uBounds) { lastOp = rewriter.create( @@ -3131,7 +3155,7 @@ struct GlobalOpConversion : public fir::FIROpConversion { // initialization is on the full range. 
auto insertOnRangeOps = gr.front().getOps(); for (auto insertOp : insertOnRangeOps) { - if (isFullRange(insertOp.getCoor(), insertOp.getType())) { + if (insertOp.isFullRange()) { auto seqTyAttr = convertType(insertOp.getType()); auto *op = insertOp.getVal().getDefiningOp(); auto constant = mlir::dyn_cast(op); @@ -3161,22 +3185,7 @@ struct GlobalOpConversion : public fir::FIROpConversion { return mlir::success(); } - bool isFullRange(mlir::DenseIntElementsAttr indexes, - fir::SequenceType seqTy) const { - auto extents = seqTy.getShape(); - if (indexes.size() / 2 != static_cast(extents.size())) - return false; - auto cur_index = indexes.value_begin(); - for (unsigned i = 0; i < indexes.size(); i += 2) { - if (*(cur_index++) != 0) - return false; - if (*(cur_index++) != extents[i / 2] - 1) - return false; - } - return true; - } - - // TODO: String comparaison should be avoided. Replace linkName with an + // TODO: String comparisons should be avoided. Replace linkName with an // enumeration. mlir::LLVM::Linkage convertLinkage(std::optional optLinkage) const { diff --git a/flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp b/flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp new file mode 100644 index 0000000000000..0fc8697b735cf --- /dev/null +++ b/flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp @@ -0,0 +1,204 @@ +//===-- LLVMInsertChainFolder.cpp -----------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/CodeGen/LLVMInsertChainFolder.h" +#include "mlir/Dialect/LLVMIR/LLVMAttrs.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" +#include "mlir/IR/Builders.h" +#include "llvm/Support/Debug.h" + +#define DEBUG_TYPE "flang-insert-folder" + +#include + +namespace { +// Helper class to construct the attribute elements of an aggregate value being +// folded without creating a full mlir::Attribute representation for each step +// of the insert value chain, which would both be expensive in terms of +// compilation time and memory (since the intermediate Attribute would survive, +// unused, inside the mlir context). +class InsertChainBackwardFolder { + // Type for the current value of an element of the aggregate value being + // constructed by the insert chain. + // At any point of the insert chain, the value of an element is either: + // - nullptr: not yet known, the insert has not yet been seen. + // - an mlir::Attribute: the element is fully defined. + // - a nested InsertChainBackwardFolder: the element is itself an aggregate + // and its sub-elements have been partially defined (insert with mutliple + // indices have been seen). + + // The insertion folder assumes backward walk of the insert chain. Once an + // element or sub-element has been defined, it is not overriden by new + // insertions (last insert wins). 
+ using InFlightValue = + llvm::PointerUnion; + +public: + InsertChainBackwardFolder( + mlir::Type type, std::deque *folderStorage) + : values(getNumElements(type), mlir::Attribute{}), + folderStorage{folderStorage}, type{type} {} + + /// Push + bool pushValue(mlir::Attribute val, llvm::ArrayRef at); + + mlir::Attribute finalize(mlir::Attribute defaultFieldValue); + +private: + static int64_t getNumElements(mlir::Type type) { + if (auto structTy = + llvm::dyn_cast_if_present(type)) + return structTy.getBody().size(); + if (auto arrayTy = + llvm::dyn_cast_if_present(type)) + return arrayTy.getNumElements(); + return 0; + } + + static mlir::Type getSubElementType(mlir::Type type, int64_t field) { + if (auto arrayTy = + llvm::dyn_cast_if_present(type)) + return arrayTy.getElementType(); + if (auto structTy = + llvm::dyn_cast_if_present(type)) + return structTy.getBody()[field]; + return {}; + } + + // Current element value of the aggregate value being built. + llvm::SmallVector values; + // std::deque is used to allocate storage for nested list and guarantee the + // stability of the InsertChainBackwardFolder* used as element value. + std::deque *folderStorage; + // Type of the aggregate value being built. + mlir::Type type; +}; +} // namespace + +// Helper to fold the value being inserted by an llvm.insert_value. +// This may call tryFoldingLLVMInsertChain if the value is an aggregate and +// was itself constructed by a different insert chain. 
+static mlir::Attribute getAttrIfConstant(mlir::Value val, + mlir::OpBuilder &rewriter) { + if (auto cst = val.getDefiningOp()) + return cst.getValue(); + if (auto insert = val.getDefiningOp()) + return fir::tryFoldingLLVMInsertChain(val, rewriter); + if (val.getDefiningOp()) + return mlir::LLVM::ZeroAttr::get(val.getContext()); + if (val.getDefiningOp()) + return mlir::LLVM::UndefAttr::get(val.getContext()); + if (mlir::Operation *op = val.getDefiningOp()) { + unsigned resNum = llvm::cast(val).getResultNumber(); + llvm::SmallVector results; + if (mlir::succeeded(rewriter.tryFold(op, results)) && + results.size() > resNum) { + if (auto cst = results[resNum].getDefiningOp()) + return cst.getValue(); + } + } + if (auto trunc = val.getDefiningOp()) + if (auto attr = getAttrIfConstant(trunc.getArg(), rewriter)) + if (auto intAttr = llvm::dyn_cast(attr)) + return mlir::IntegerAttr::get(trunc.getType(), intAttr.getInt()); + LLVM_DEBUG(llvm::dbgs() << "cannot fold insert value operand: " << val + << "\n"); + return {}; +} + +mlir::Attribute +InsertChainBackwardFolder::finalize(mlir::Attribute defaultFieldValue) { + std::vector attrs; + attrs.reserve(values.size()); + for (InFlightValue &inFlight : values) { + if (!inFlight) { + attrs.push_back(defaultFieldValue); + } else if (auto attr = llvm::dyn_cast(inFlight)) { + attrs.push_back(attr); + } else { + auto *inFlightList = llvm::cast(inFlight); + attrs.push_back(inFlightList->finalize(defaultFieldValue)); + } + } + return mlir::ArrayAttr::get(type.getContext(), attrs); +} + +bool InsertChainBackwardFolder::pushValue(mlir::Attribute val, + llvm::ArrayRef at) { + if (at.size() == 0 || at[0] >= static_cast(values.size())) + return false; + InFlightValue &inFlight = values[at[0]]; + if (!inFlight) { + if (at.size() == 1) { + inFlight = val; + return true; + } + // This is the first insert to a nested field. Create a + // InsertChainBackwardFolder for the current element value. 
+ InsertChainBackwardFolder &inFlightList = folderStorage->emplace_back( + getSubElementType(type, at[0]), folderStorage); + inFlight = &inFlightList; + return inFlightList.pushValue(val, at.drop_front()); + } + // Keep last inserted value if already set. + if (llvm::isa(inFlight)) + return true; + auto *inFlightList = llvm::cast(inFlight); + if (at.size() == 1) { + if (!llvm::isa(val)) { + LLVM_DEBUG(llvm::dbgs() + << "insert chain sub-element partially overwritten initial " + "value is not zero or undef\n"); + return false; + } + inFlight = inFlightList->finalize(val); + return true; + } + return inFlightList->pushValue(val, at.drop_front()); +} + +mlir::Attribute fir::tryFoldingLLVMInsertChain(mlir::Value val, + mlir::OpBuilder &rewriter) { + if (auto cst = val.getDefiningOp()) + return cst.getValue(); + if (auto insert = val.getDefiningOp()) { + LLVM_DEBUG(llvm::dbgs() << "trying to fold insert chain:" << val << "\n"); + if (auto structTy = + llvm::dyn_cast(insert.getType())) { + mlir::LLVM::InsertValueOp currentInsert = insert; + mlir::LLVM::InsertValueOp lastInsert; + std::deque folderStorage; + InsertChainBackwardFolder inFlightList(structTy, &folderStorage); + while (currentInsert) { + mlir::Attribute attr = + getAttrIfConstant(currentInsert.getValue(), rewriter); + if (!attr) + return {}; + if (!inFlightList.pushValue(attr, currentInsert.getPosition())) + return {}; + lastInsert = currentInsert; + currentInsert = currentInsert.getContainer() + .getDefiningOp(); + } + mlir::Attribute defaultVal; + if (lastInsert) { + if (lastInsert.getContainer().getDefiningOp()) + defaultVal = mlir::LLVM::ZeroAttr::get(val.getContext()); + else if (lastInsert.getContainer().getDefiningOp()) + defaultVal = mlir::LLVM::UndefAttr::get(val.getContext()); + } + if (!defaultVal) { + LLVM_DEBUG(llvm::dbgs() + << "insert chain initial value is not Zero or Undef\n"); + return {}; + } + return inFlightList.finalize(defaultVal); + } + } + return {}; +} diff --git 
a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index d85b38c467857..e12af7782a578 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -2365,6 +2365,21 @@ llvm::LogicalResult fir::InsertOnRangeOp::verify() { return mlir::success(); } +bool fir::InsertOnRangeOp::isFullRange() { + auto extents = getType().getShape(); + mlir::DenseIntElementsAttr indexes = getCoor(); + if (indexes.size() / 2 != static_cast(extents.size())) + return false; + auto cur_index = indexes.value_begin(); + for (unsigned i = 0; i < indexes.size(); i += 2) { + if (*(cur_index++) != 0) + return false; + if (*(cur_index++) != extents[i / 2] - 1) + return false; + } + return true; +} + //===----------------------------------------------------------------------===// // InsertValueOp //===----------------------------------------------------------------------===// diff --git a/flang/test/Fir/convert-and-fold-insert-on-range.fir b/flang/test/Fir/convert-and-fold-insert-on-range.fir new file mode 100644 index 0000000000000..df18614d80b63 --- /dev/null +++ b/flang/test/Fir/convert-and-fold-insert-on-range.fir @@ -0,0 +1,33 @@ +// Test codegen of constant insert_on_range without symbol reference into mlir.constant. 
+// RUN: fir-opt --cg-rewrite --split-input-file --fir-to-llvm-ir %s | FileCheck %s + +module attributes {dlti.dl_spec = #dlti.dl_spec = dense<32> : vector<4xi64>, !llvm.ptr<271> = dense<32> : vector<4xi64>, !llvm.ptr<272> = dense<64> : vector<4xi64>, i64 = dense<64> : vector<2xi64>, i128 = dense<128> : vector<2xi64>, f80 = dense<128> : vector<2xi64>, !llvm.ptr = dense<64> : vector<4xi64>, i1 = dense<8> : vector<2xi64>, i8 = dense<8> : vector<2xi64>, i16 = dense<16> : vector<2xi64>, i32 = dense<32> : vector<2xi64>, f16 = dense<16> : vector<2xi64>, f64 = dense<64> : vector<2xi64>, f128 = dense<128> : vector<2xi64>, "dlti.endianness" = "little", "dlti.mangling_mode" = "e", "dlti.stack_alignment" = 128 : i64>, fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} { + fir.global @derived_array : !fir.array<2x!fir.type>>}>> { + %c0 = arith.constant 0 : index + %0 = fir.undefined !fir.type>>}> + %1 = fir.zero_bits !fir.heap> + %2 = fir.shape %c0 : (index) -> !fir.shape<1> + %3 = fir.embox %1(%2) : (!fir.heap>, !fir.shape<1>) -> !fir.box>> + %4 = fir.insert_value %0, %3, ["comp", !fir.type>>}>] : (!fir.type>>}>, !fir.box>>) -> !fir.type>>}> + %5 = fir.undefined !fir.array<2x!fir.type>>}>> + %6 = fir.insert_on_range %5, %4 from (0) to (1) : (!fir.array<2x!fir.type>>}>>, !fir.type>>}>) -> !fir.array<2x!fir.type>>}>> + fir.has_value %6 : !fir.array<2x!fir.type>>}>> + } +} + +//CHECK-LABEL: llvm.mlir.global external @derived_array() +//CHECK: %[[CST:.*]] = llvm.mlir.constant([ +//CHECK-SAME: [ +//CHECK-SAME: [#llvm.zero, 8, 20240719 : i32, 1 : i8, 28 : i8, 2 : i8, 0 : i8, +//CHECK-SAME: [ +//CHECK-SAME: [1, 0 : index, 8] +//CHECK-SAME: ] +//CHECK-SAME: ], +//CHECK-SAME: [ +//CHECK-SAME: [#llvm.zero, 8, 20240719 : i32, 1 : i8, 28 : i8, 2 : i8, 0 : i8, +//CHECK-SAME: [ +//CHECK-SAME: [1, 0 : index, 8] +//CHECK-SAME: ] 
+//CHECK-SAME: ]) : +//CHECK-SAME: !llvm.array<2 x struct<"sometype", (struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)>)>> +//CHECK: llvm.return %[[CST]] : !llvm.array<2 x struct<"sometype", (struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)>)>> >From 796a1e0269baf1c77ffabf47a8fa155356bc9096 Mon Sep 17 00:00:00 2001 From: Jean Perier Date: Mon, 19 May 2025 01:37:14 -0700 Subject: [PATCH 3/4] use map_to_vector and FailureOr --- .../Optimizer/CodeGen/LLVMInsertChainFolder.h | 7 ++- flang/lib/Optimizer/CodeGen/CodeGen.cpp | 7 +-- .../CodeGen/LLVMInsertChainFolder.cpp | 54 ++++++++++--------- 3 files changed, 39 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h b/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h index d577c4c0fa70b..321bda91aa6fe 100644 --- a/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h +++ b/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h @@ -12,6 +12,8 @@ // //===----------------------------------------------------------------------===// +#include "llvm/Support/LogicalResult.h" + namespace mlir { class Attribute; class OpBuilder; @@ -26,6 +28,7 @@ namespace fir { /// or cannot be represented as an Attribute. The operations are not deleted, /// but some llvm.insertvalue value operands may be folded with the builder on /// the way. 
-mlir::Attribute tryFoldingLLVMInsertChain(mlir::Value insertChainResult, - mlir::OpBuilder &builder); +llvm::FailureOr +tryFoldingLLVMInsertChain(mlir::Value insertChainResult, + mlir::OpBuilder &builder); } // namespace fir diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index ed76a77ced047..70c90fae34086 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -2428,9 +2428,10 @@ struct InsertOnRangeOpConversion // fold the inserted element value to an attribute and build an ArrayAttr // for the resulting array. if (range.isFullRange()) { - if (mlir::Attribute cst = - fir::tryFoldingLLVMInsertChain(adaptor.getVal(), rewriter)) { - mlir::Attribute dimVal = cst; + llvm::FailureOr cst = + fir::tryFoldingLLVMInsertChain(adaptor.getVal(), rewriter); + if (llvm::succeeded(cst)) { + mlir::Attribute dimVal = *cst; for (auto dim : llvm::reverse(dims)) { // Use std::vector in case the number of elements is big. std::vector elements(dim, dimVal); diff --git a/flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp b/flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp index 0fc8697b735cf..5b522f2647916 100644 --- a/flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp +++ b/flang/lib/Optimizer/CodeGen/LLVMInsertChainFolder.cpp @@ -67,7 +67,7 @@ class InsertChainBackwardFolder { if (auto structTy = llvm::dyn_cast_if_present(type)) return structTy.getBody()[field]; - return {}; + return nullptr; } // Current element value of the aggregate value being built. @@ -83,12 +83,18 @@ class InsertChainBackwardFolder { // Helper to fold the value being inserted by an llvm.insert_value. // This may call tryFoldingLLVMInsertChain if the value is an aggregate and // was itself constructed by a different insert chain. +// Returns a nullptr Attribute if the value could not be folded. 
static mlir::Attribute getAttrIfConstant(mlir::Value val, mlir::OpBuilder &rewriter) { if (auto cst = val.getDefiningOp()) return cst.getValue(); - if (auto insert = val.getDefiningOp()) - return fir::tryFoldingLLVMInsertChain(val, rewriter); + if (auto insert = val.getDefiningOp()) { + llvm::FailureOr attr = + fir::tryFoldingLLVMInsertChain(val, rewriter); + if (succeeded(attr)) + return *attr; + return nullptr; + } if (val.getDefiningOp()) return mlir::LLVM::ZeroAttr::get(val.getContext()); if (val.getDefiningOp()) @@ -108,23 +114,20 @@ static mlir::Attribute getAttrIfConstant(mlir::Value val, return mlir::IntegerAttr::get(trunc.getType(), intAttr.getInt()); LLVM_DEBUG(llvm::dbgs() << "cannot fold insert value operand: " << val << "\n"); - return {}; + return nullptr; } mlir::Attribute InsertChainBackwardFolder::finalize(mlir::Attribute defaultFieldValue) { - std::vector attrs; - attrs.reserve(values.size()); - for (InFlightValue &inFlight : values) { - if (!inFlight) { - attrs.push_back(defaultFieldValue); - } else if (auto attr = llvm::dyn_cast(inFlight)) { - attrs.push_back(attr); - } else { - auto *inFlightList = llvm::cast(inFlight); - attrs.push_back(inFlightList->finalize(defaultFieldValue)); - } - } + llvm::SmallVector attrs = llvm::map_to_vector( + values, [&](InFlightValue inFlight) -> mlir::Attribute { + if (!inFlight) + return defaultFieldValue; + if (auto attr = llvm::dyn_cast(inFlight)) + return attr; + return llvm::cast(inFlight)->finalize( + defaultFieldValue); + }); return mlir::ArrayAttr::get(type.getContext(), attrs); } @@ -140,8 +143,11 @@ bool InsertChainBackwardFolder::pushValue(mlir::Attribute val, } // This is the first insert to a nested field. Create a // InsertChainBackwardFolder for the current element value. 
- InsertChainBackwardFolder &inFlightList = folderStorage->emplace_back( - getSubElementType(type, at[0]), folderStorage); + mlir::Type subType = getSubElementType(type, at[0]); + if (!subType) + return false; + InsertChainBackwardFolder &inFlightList = + folderStorage->emplace_back(subType, folderStorage); inFlight = &inFlightList; return inFlightList.pushValue(val, at.drop_front()); } @@ -162,8 +168,8 @@ bool InsertChainBackwardFolder::pushValue(mlir::Attribute val, return inFlightList->pushValue(val, at.drop_front()); } -mlir::Attribute fir::tryFoldingLLVMInsertChain(mlir::Value val, - mlir::OpBuilder &rewriter) { +llvm::FailureOr +fir::tryFoldingLLVMInsertChain(mlir::Value val, mlir::OpBuilder &rewriter) { if (auto cst = val.getDefiningOp()) return cst.getValue(); if (auto insert = val.getDefiningOp()) { @@ -178,9 +184,9 @@ mlir::Attribute fir::tryFoldingLLVMInsertChain(mlir::Value val, mlir::Attribute attr = getAttrIfConstant(currentInsert.getValue(), rewriter); if (!attr) - return {}; + return llvm::failure(); if (!inFlightList.pushValue(attr, currentInsert.getPosition())) - return {}; + return llvm::failure(); lastInsert = currentInsert; currentInsert = currentInsert.getContainer() .getDefiningOp(); @@ -195,10 +201,10 @@ mlir::Attribute fir::tryFoldingLLVMInsertChain(mlir::Value val, if (!defaultVal) { LLVM_DEBUG(llvm::dbgs() << "insert chain initial value is not Zero or Undef\n"); - return {}; + return llvm::failure(); } return inFlightList.finalize(defaultVal); } } - return {}; + return llvm::failure(); } >From 181985e213289b1f0caa4d81599e07fbec067fb5 Mon Sep 17 00:00:00 2001 From: jeanPerier Date: Tue, 20 May 2025 10:47:35 +0200 Subject: [PATCH 4/4] Apply suggestions from code review Thanks Slava! 
Co-authored-by: Slava Zakharin --- flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h b/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h index 321bda91aa6fe..36360dc77d588 100644 --- a/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h +++ b/flang/include/flang/Optimizer/CodeGen/LLVMInsertChainFolder.h @@ -23,8 +23,8 @@ class Value; namespace fir { /// Attempt to fold an llvm.insertvalue chain into an attribute representation -/// suitable as llvm.constant operand. The returned value will be a null pointer -/// if this is not an llvm.insertvalue result pr if the chain is not a constant, +/// suitable as llvm.constant operand. The returned value will be llvm::Failure +/// if this is not an llvm.insertvalue result or if the chain is not a constant, /// or cannot be represented as an Attribute. The operations are not deleted, /// but some llvm.insertvalue value operands may be folded with the builder on /// the way. From flang-commits at lists.llvm.org Tue May 20 02:15:12 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 20 May 2025 02:15:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang][test] fix false positive match in namelist.f90 test (PR #129075) In-Reply-To: Message-ID: <682c4820.170a0220.2c1193.e47b@mx.google.com> https://github.com/tblah closed https://github.com/llvm/llvm-project/pull/129075 From flang-commits at lists.llvm.org Tue May 20 02:15:13 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 20 May 2025 02:15:13 -0700 (PDT) Subject: [flang-commits] [flang] [flang][test] fix false positive match in namelist.f90 test (PR #129075) In-Reply-To: Message-ID: <682c4821.050a0220.1d3a6d.dbcd@mx.google.com> tblah wrote: Closing because this has been open for ages without confirmation from the user who opened the issue. 
https://github.com/llvm/llvm-project/pull/129075 From flang-commits at lists.llvm.org Tue May 20 08:00:15 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Tue, 20 May 2025 08:00:15 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang] Extension: allow char string edit descriptors in input formats (PR #140624) In-Reply-To: Message-ID: <682c98ff.a70a0220.f7b31.4544@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/140624 From flang-commits at lists.llvm.org Tue May 20 08:27:23 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 08:27:23 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Undo the effects of CSE for hlfir.exactly_once. (PR #140190) In-Reply-To: Message-ID: <682c9f5b.170a0220.385eea.7db0@mx.google.com> https://github.com/jeanPerier approved this pull request. Looks great, thanks for fixing this! https://github.com/llvm/llvm-project/pull/140190 From flang-commits at lists.llvm.org Tue May 20 08:30:48 2025 From: flang-commits at lists.llvm.org (Razvan Lupusoru via flang-commits) Date: Tue, 20 May 2025 08:30:48 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [OpenACC] rename private/firstprivate recipe attributes (PR #140719) In-Reply-To: Message-ID: <682ca028.a70a0220.3be227.31c2@mx.google.com> razvanlupusoru wrote: @erichkeane I don't think you have gotten to privatizations yet - but just fyi the operand name is being renamed to have the "recipes" suffix. Please approve if it will not impact your testing. Thank you :) https://github.com/llvm/llvm-project/pull/140719 From flang-commits at lists.llvm.org Tue May 20 08:36:02 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 08:36:02 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. 
(PR #140374) In-Reply-To: Message-ID: <682ca162.050a0220.154265.3894@mx.google.com> NexMing wrote: > > > Can you elaborate on "future work will focus on gradually improving this conversion pass"? What ops will you be converting and where/when will it live in the pipeline? What's the intended use for this conversion upstream? > > > > > > There is some discussion here: https://discourse.llvm.org/t/rfc-add-fir-affine-optimization-fir-pass-pipeline/86190/5 My envisioned final pipeline is: FIR → standard MLIR (perform optimizations, such as SCF->Affine) → LLVM, and I am working to implement it. > > Does flang need both FIRToAffine and FIRToSCF if there's a plan for SCFToAffine? I am in favour of a FIRToSCF but I am trying to understand the vision. I don't know if it makes sense to maintain FIRToAffine if you proceed with FIRToSCF, for example. The FIRToAffine pass was an internship prototype created 4-5 years ago, and it was only part of my experimental attempt to explore optimization paths. I now prefer the pipeline FIRToSCF followed by SCFToAffine, and plan to deprecate FIRToAffine. https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Tue May 20 08:37:07 2025 From: flang-commits at lists.llvm.org (Erich Keane via flang-commits) Date: Tue, 20 May 2025 08:37:07 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [OpenACC] rename private/firstprivate recipe attributes (PR #140719) In-Reply-To: Message-ID: <682ca1a3.630a0220.9af87.7ade@mx.google.com> erichkeane wrote: > @erichkeane I don't think you have gotten to privatizations yet - but just fyi the operand name is being renamed to have the "recipes" suffix. Please approve if it will not impact your testing. Thank you :) I haven't, but thank you very much for the heads up! I'll know to look for the different names from my printout when I get to private. 
https://github.com/llvm/llvm-project/pull/140719 From flang-commits at lists.llvm.org Tue May 20 09:22:09 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 09:22:09 -0700 (PDT) Subject: [flang-commits] [flang] 54aa928 - [flang] Undo the effects of CSE for hlfir.exactly_once. (#140190) Message-ID: <682cac31.050a0220.174249.23b4@mx.google.com> Author: Slava Zakharin Date: 2025-05-20T09:22:05-07:00 New Revision: 54aa9282edb5a3abe625893a63018bb75dc5c541 URL: https://github.com/llvm/llvm-project/commit/54aa9282edb5a3abe625893a63018bb75dc5c541 DIFF: https://github.com/llvm/llvm-project/commit/54aa9282edb5a3abe625893a63018bb75dc5c541.diff LOG: [flang] Undo the effects of CSE for hlfir.exactly_once. (#140190) CSE may delete operations from hlfir.exactly_once and reuse the equivalent results from the parent region(s), e.g. from the parent hlfir.region_assign. This makes it problematic to clone hlfir.exactly_once before the top-level hlfir.where. This patch adds a "canonicalizer" that pulls in such operations back into hlfir.exactly_once. 
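Editorial sketch (not part of the commit): the "pull operations back in" step described in the log above amounts to a worklist closure over operand definitions. The toy model below is an assumption-laden illustration in plain C++ with no MLIR dependency. `ToyOp`, `collectCloneList`, and the integer operation ids are hypothetical names, and sorting by id stands in for the dominance-based ordering a real pass would need.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an operation. `operands` holds the ids of the
// operations defining its operands. Assumption: ids follow program order.
struct ToyOp {
  std::vector<int> operands;
};

// Collect every operation that would have to be cloned into the region so
// that the only values still live into it come from `dominating`.
std::vector<int> collectCloneList(const std::vector<ToyOp> &ops,
                                  const std::unordered_set<int> &dominating,
                                  const std::vector<int> &liveIns) {
  std::unordered_set<int> cloneSet;
  std::vector<int> workList;
  // Seed the worklist with the live-ins that do not dominate the region.
  for (int id : liveIns)
    if (!dominating.count(id) && cloneSet.insert(id).second)
      workList.push_back(id);
  // Transitively add the non-dominating definitions of their operands.
  while (!workList.empty()) {
    int current = workList.back();
    workList.pop_back();
    for (int operand : ops[current].operands) {
      if (dominating.count(operand))
        continue;
      if (cloneSet.insert(operand).second)
        workList.push_back(operand);
    }
  }
  // The set is unordered; sort so that definitions precede their uses and
  // the result is deterministic (valid here only because ids are ordered).
  std::vector<int> cloneList(cloneSet.begin(), cloneSet.end());
  std::sort(cloneList.begin(), cloneList.end());
  return cloneList;
}
```

In this toy model, cloning the returned operations in order at the start of the region and remapping uses inside it would leave only dominating values as live-ins, which is the property the commit message says is needed before the region body can be cloned ahead of the top-level hlfir.where.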
Added: flang/test/HLFIR/order_assignments/where-after-cse.fir Modified: flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp Removed: ################################################################################ diff --git a/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp b/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp index 5cae7cf443c86..89b5ccb7d850e 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp @@ -24,12 +24,15 @@ #include "flang/Optimizer/Builder/Todo.h" #include "flang/Optimizer/Dialect/Support/FIRContext.h" #include "flang/Optimizer/HLFIR/Passes.h" +#include "mlir/Analysis/Liveness.h" #include "mlir/IR/Dominance.h" #include "mlir/IR/IRMapping.h" #include "mlir/Transforms/DialectConversion.h" +#include "mlir/Transforms/RegionUtils.h" #include "llvm/ADT/SmallSet.h" #include "llvm/ADT/TypeSwitch.h" #include "llvm/Support/Debug.h" +#include namespace hlfir { #define GEN_PASS_DEF_LOWERHLFIRORDEREDASSIGNMENTS @@ -263,6 +266,19 @@ class OrderedAssignmentRewriter { return &inserted.first->second; } + /// Given a top-level hlfir.where, look for hlfir.exactly_once operations + /// inside it and see if any of the values live into hlfir.exactly_once + /// do not dominate hlfir.where. This may happen due to CSE reusing + /// results of operations from the region parent to hlfir.exactly_once. + /// Since we are going to clone the body of hlfir.exactly_once before + /// the top-level hlfir.where, such def-use will cause problems. + /// There are options how to resolve this in a different way, + /// e.g. making hlfir.exactly_once IsolatedFromAbove or making + /// it a region of hlfir.where and wiring the result(s) through + /// the block arguments. For the time being, this canonicalization + /// tries to undo the effects of CSE. 
+ void canonicalizeExactlyOnceInsideWhere(hlfir::WhereOp whereOp); + fir::FirOpBuilder &builder; /// Map containing the mapping between the original order assignment tree @@ -523,6 +539,10 @@ void OrderedAssignmentRewriter::generateMaskIfOp(mlir::Value cdt) { void OrderedAssignmentRewriter::pre(hlfir::WhereOp whereOp) { mlir::Location loc = whereOp.getLoc(); if (!whereLoopNest) { + // Make sure liveness information is valid for the inner hlfir.exactly_once + // operations, and their bodies can be cloned before the top-level + // hlfir.where. + canonicalizeExactlyOnceInsideWhere(whereOp); // This is the top-level WHERE. Start a loop nest iterating on the shape of // the where mask. if (auto maybeSaved = getIfSaved(whereOp.getMaskRegion())) { @@ -1350,6 +1370,105 @@ void OrderedAssignmentRewriter::saveLeftHandSide( } } +void OrderedAssignmentRewriter::canonicalizeExactlyOnceInsideWhere( + hlfir::WhereOp whereOp) { + auto getDefinition = [](mlir::Value v) { + mlir::Operation *op = v.getDefiningOp(); + bool isValid = true; + if (!op) { + LLVM_DEBUG( + llvm::dbgs() + << "Value live into hlfir.exactly_once has no defining operation: " + << v << "\n"); + isValid = false; + } + if (op->getNumRegions() != 0) { + LLVM_DEBUG( + llvm::dbgs() + << "Cannot pull an operation with regions into hlfir.exactly_once" + << *op << "\n"); + isValid = false; + } + auto effects = mlir::getEffectsRecursively(op); + if (!effects || !effects->empty()) { + LLVM_DEBUG(llvm::dbgs() << "Side effects on operation with result live " + "into hlfir.exactly_once" + << *op << "\n"); + isValid = false; + } + assert(isValid && "invalid live-in"); + return op; + }; + mlir::Liveness liveness(whereOp.getOperation()); + whereOp->walk([&](hlfir::ExactlyOnceOp op) { + std::unordered_set liveInSet; + LLVM_DEBUG(llvm::dbgs() << "Canonicalizing:\n" << op << "\n"); + auto &liveIns = liveness.getLiveIn(&op.getBody().front()); + if (liveIns.empty()) + return; + // Note that the liveIns set is not ordered. 
+ for (mlir::Value liveIn : liveIns) { + if (!dominanceInfo.properlyDominates(liveIn, whereOp)) { + LLVM_DEBUG(llvm::dbgs() + << "Does not dominate top-level where: " << liveIn << "\n"); + liveInSet.insert(getDefinition(liveIn)); + } + } + + // Populate the set of operations that we need to pull into + // hlfir.exactly_once, so that the only live-ins left are the ones + // that dominate whereOp. + std::unordered_set cloneSet(liveInSet); + llvm::SmallVector workList(cloneSet.begin(), + cloneSet.end()); + while (!workList.empty()) { + mlir::Operation *current = workList.pop_back_val(); + for (mlir::Value operand : current->getOperands()) { + if (dominanceInfo.properlyDominates(operand, whereOp)) + continue; + mlir::Operation *def = getDefinition(operand); + if (cloneSet.count(def)) + continue; + cloneSet.insert(def); + workList.push_back(def); + } + } + + // Sort the operations by dominance. This preserves their order + // after the cloning, and also guarantees stable IR generation. + llvm::SmallVector cloneList(cloneSet.begin(), + cloneSet.end()); + llvm::sort(cloneList, [&](mlir::Operation *L, mlir::Operation *R) { + return dominanceInfo.properlyDominates(L, R); + }); + + // Clone the operations. 
+ mlir::IRMapping mapper; + mlir::Operation::CloneOptions options; + options.cloneOperands(); + mlir::OpBuilder::InsertionGuard guard(builder); + builder.setInsertionPointToStart(&op.getBody().front()); + + for (auto *toClone : cloneList) { + LLVM_DEBUG(llvm::dbgs() << "Cloning: " << *toClone << "\n"); + builder.insert(toClone->clone(mapper, options)); + } + for (mlir::Operation *oldOps : liveInSet) + for (mlir::Value oldVal : oldOps->getResults()) { + mlir::Value newVal = mapper.lookup(oldVal); + if (!newVal) { + LLVM_DEBUG(llvm::dbgs() << "No clone found for: " << oldVal << "\n"); + assert(false && "missing clone"); + } + mlir::replaceAllUsesInRegionWith(oldVal, newVal, op.getBody()); + } + + LLVM_DEBUG(llvm::dbgs() << "Finished canonicalization\n"); + if (!liveInSet.empty()) + LLVM_DEBUG(llvm::dbgs() << op << "\n"); + }); +} + /// Lower an ordered assignment tree to fir.do_loop and hlfir.assign given /// a schedule. static void lower(hlfir::OrderedAssignmentTreeOpInterface root, diff --git a/flang/test/HLFIR/order_assignments/where-after-cse.fir b/flang/test/HLFIR/order_assignments/where-after-cse.fir new file mode 100644 index 0000000000000..4505c879c7b0f --- /dev/null +++ b/flang/test/HLFIR/order_assignments/where-after-cse.fir @@ -0,0 +1,254 @@ +// Test canonicalization of hlfir.exactly_once operations +// after CSE. 
The live-in values that are not dominating +// the top-level hlfir.where must be cloned inside hlfir.exactly_once, +// otherwise, the cloning of the hlfir.exactly_once before hlfir.where +// would cause def-use issues: +// RUN: fir-opt %s --lower-hlfir-ordered-assignments | FileCheck %s + +// Simple case, where CSE makes only hlfir.designate live-in: +// CHECK-LABEL: func.func @_QPtest1( +func.func @_QPtest1(%arg0: !fir.ref>>,p2:!fir.box>>}>> {fir.bindc_name = "x"}) { + %true = arith.constant true + %cst = arith.constant 0.000000e+00 : f32 + %c1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %0 = fir.dummy_scope : !fir.dscope + %1:2 = hlfir.declare %arg0 dummy_scope %0 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtest1Ex"} : (!fir.ref>>,p2:!fir.box>>}>>, !fir.dscope) -> (!fir.ref>>,p2:!fir.box>>}>>, !fir.ref>>,p2:!fir.box>>}>>) + hlfir.where { + %2 = hlfir.designate %1#0{"p2"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> + %3 = fir.load %2 : !fir.ref>>> + %4:3 = fir.box_dims %3, %c0 : (!fir.box>>, index) -> (index, index, index) + %5 = arith.addi %4#0, %4#1 : index + %6 = arith.subi %5, %c1 : index + %7 = arith.subi %6, %4#0 : index + %8 = arith.addi %7, %c1 : index + %9 = arith.cmpi sgt, %8, %c0 : index + %10 = arith.select %9, %8, %c0 : index + %11 = fir.shape %10 : (index) -> !fir.shape<1> + %12 = hlfir.designate %3 (%4#0:%6:%c1) shape %11 : (!fir.box>>, index, index, index, !fir.shape<1>) -> !fir.box> + %13 = hlfir.elemental %11 unordered : (!fir.shape<1>) -> !hlfir.expr> { + ^bb0(%arg1: index): + %14 = hlfir.designate %12 (%arg1) : (!fir.box>, index) -> !fir.ref + %15 = fir.load %14 : !fir.ref + %16 = arith.cmpf ogt, %15, %cst fastmath : f32 + %17 = fir.convert %16 : (i1) -> !fir.logical<4> + hlfir.yield_element %17 : !fir.logical<4> + } + hlfir.yield %13 : !hlfir.expr> cleanup { + hlfir.destroy %13 : !hlfir.expr> + } + } do { + hlfir.region_assign { + %2 = hlfir.designate %1#0{"p1"} {fortran_attrs = 
#fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> + %3 = fir.load %2 : !fir.ref>>> + %4:3 = fir.box_dims %3, %c0 : (!fir.box>>, index) -> (index, index, index) + %5 = arith.addi %4#0, %4#1 : index + %6 = arith.subi %5, %c1 : index + %7 = arith.subi %6, %4#0 : index + %8 = arith.addi %7, %c1 : index + %9 = arith.cmpi sgt, %8, %c0 : index + %10 = arith.select %9, %8, %c0 : index + %11 = fir.shape %10 : (index) -> !fir.shape<1> + %12 = hlfir.designate %3 (%4#0:%6:%c1, %c1) shape %11 : (!fir.box>>, index, index, index, index, !fir.shape<1>) -> !fir.box> + %13 = hlfir.exactly_once : !hlfir.expr { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %{{.*}}#0{"p1"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> +// CHECK: fir.load %[[VAL_26]] : !fir.ref>>> +// CHECK: %[[VAL_47:.*]] = fir.call @_QPcallee(%{{.*}}) fastmath : (!fir.box>) -> !fir.array +// CHECK: fir.do_loop + %15 = fir.load %2 : !fir.ref>>> + %16:3 = fir.box_dims %15, %c0 : (!fir.box>>, index) -> (index, index, index) + %17 = arith.addi %16#0, %16#1 : index + %18 = arith.subi %17, %c1 : index + %19 = arith.subi %18, %16#0 : index + %20 = arith.addi %19, %c1 : index + %21 = arith.cmpi sgt, %20, %c0 : index + %22 = arith.select %21, %20, %c0 : index + %23 = fir.shape %22 : (index) -> !fir.shape<1> + %24 = hlfir.designate %15 (%16#0:%18:%c1, %c1) shape %23 : (!fir.box>>, index, index, index, index, !fir.shape<1>) -> !fir.box> + %25:2 = hlfir.declare %24 {fortran_attrs = #fir.var_attrs, uniq_name = "_QMmy_moduleFcalleeEx"} : (!fir.box>) -> (!fir.box>, !fir.box>) + %26:3 = fir.box_dims %25#0, %c0 : (!fir.box>, index) -> (index, index, index) + %27 = fir.convert %26#1 : (index) -> i64 + %28 = fir.convert %27 : (i64) -> index + %29 = arith.cmpi sgt, %28, %c0 : index + %30 = arith.select %29, %28, %c0 : index + %31 = fir.shape %30 : (index) -> !fir.shape<1> + %32 = fir.allocmem !fir.array, %30 {bindc_name = ".tmp.expr_result", uniq_name = ""} + %33 = fir.convert %32 : 
(!fir.heap>) -> !fir.ref> + %34:2 = hlfir.declare %33(%31) {uniq_name = ".tmp.expr_result"} : (!fir.ref>, !fir.shape<1>) -> (!fir.box>, !fir.ref>) + %35 = fir.call @_QPcallee(%24) fastmath : (!fir.box>) -> !fir.array + fir.save_result %35 to %34#1(%31) : !fir.array, !fir.ref>, !fir.shape<1> + %36 = hlfir.as_expr %34#0 move %true : (!fir.box>, i1) -> !hlfir.expr + hlfir.yield %36 : !hlfir.expr cleanup { + hlfir.destroy %36 : !hlfir.expr + } + } + %14 = hlfir.elemental %11 unordered : (!fir.shape<1>) -> !hlfir.expr { + ^bb0(%arg1: index): + %15 = hlfir.designate %12 (%arg1) : (!fir.box>, index) -> !fir.ref + %16 = hlfir.apply %13, %arg1 : (!hlfir.expr, index) -> f32 + %17 = fir.load %15 : !fir.ref + %18 = arith.divf %17, %16 fastmath : f32 + hlfir.yield_element %18 : f32 + } + hlfir.yield %14 : !hlfir.expr cleanup { + hlfir.destroy %14 : !hlfir.expr + } + } to { + %2 = hlfir.designate %1#0{"p2"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>,p2:!fir.box>>}>>) -> !fir.ref>>> + %3 = fir.load %2 : !fir.ref>>> + %4:3 = fir.box_dims %3, %c0 : (!fir.box>>, index) -> (index, index, index) + %5 = arith.addi %4#0, %4#1 : index + %6 = arith.subi %5, %c1 : index + %7 = arith.subi %6, %4#0 : index + %8 = arith.addi %7, %c1 : index + %9 = arith.cmpi sgt, %8, %c0 : index + %10 = arith.select %9, %8, %c0 : index + %11 = fir.shape %10 : (index) -> !fir.shape<1> + %12 = hlfir.designate %3 (%4#0:%6:%c1) shape %11 : (!fir.box>>, index, index, index, !fir.shape<1>) -> !fir.box> + hlfir.yield %12 : !fir.box> + } + } + return +} + +// CSE makes a chain of operations live-in: +// CHECK-LABEL: func.func @_QPtest_where_in_forall( +func.func @_QPtest_where_in_forall(%arg0: !fir.box> {fir.bindc_name = "a"}, %arg1: !fir.box> {fir.bindc_name = "b"}, %arg2: !fir.box> {fir.bindc_name = "c"}) { + %false = arith.constant false + %c1_i32 = arith.constant 1 : i32 + %c10_i32 = arith.constant 10 : i32 + %c0 = arith.constant 0 : index + %c1 = arith.constant 1 : index + %c2_i32 = arith.constant 2 : i32 + 
%c100 = arith.constant 100 : index + %0 = fir.alloca !fir.array<100x!fir.logical<4>> {bindc_name = ".tmp.expr_result"} + %1 = fir.alloca !fir.array<100x!fir.logical<4>> {bindc_name = ".tmp.expr_result"} + %2 = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_21:.*]]:2 = hlfir.declare %{{.*}} dummy_scope %{{.*}} {uniq_name = "_QFtest_where_in_forallEb"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %3:2 = hlfir.declare %arg0 dummy_scope %2 {uniq_name = "_QFtest_where_in_forallEa"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %4:2 = hlfir.declare %arg1 dummy_scope %2 {uniq_name = "_QFtest_where_in_forallEb"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5:2 = hlfir.declare %arg2 dummy_scope %2 {uniq_name = "_QFtest_where_in_forallEc"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + hlfir.forall lb { + hlfir.yield %c1_i32 : i32 + } ub { + hlfir.yield %c10_i32 : i32 + } (%arg3: i32) { + hlfir.where { + %6 = fir.shape %c100 : (index) -> !fir.shape<1> + %7:2 = hlfir.declare %0(%6) {uniq_name = ".tmp.expr_result"} : (!fir.ref>>, !fir.shape<1>) -> (!fir.ref>>, !fir.ref>>) + %8 = fir.call @_QPpure_logical_func1() proc_attrs fastmath : () -> !fir.array<100x!fir.logical<4>> + fir.save_result %8 to %7#1(%6) : !fir.array<100x!fir.logical<4>>, !fir.ref>>, !fir.shape<1> + %9 = hlfir.as_expr %7#0 move %false : (!fir.ref>>, i1) -> !hlfir.expr<100x!fir.logical<4>> + hlfir.yield %9 : !hlfir.expr<100x!fir.logical<4>> cleanup { + hlfir.destroy %9 : !hlfir.expr<100x!fir.logical<4>> + } + } do { + hlfir.region_assign { + %6 = fir.convert %arg3 : (i32) -> i64 +// CHECK: %[[VAL_58:.*]]:3 = fir.box_dims %[[VAL_21]]#1, %{{.*}} : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_59:.*]] = arith.cmpi sgt, %[[VAL_58]]#1, %{{.*}} : index +// CHECK: %[[VAL_60:.*]] = arith.select %[[VAL_59]], %[[VAL_58]]#1, %{{.*}} : index +// CHECK: %[[VAL_61:.*]] = fir.shape %[[VAL_60]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_62:.*]] = hlfir.designate 
%[[VAL_21]]#0 (%{{.*}}, %{{.*}}:%[[VAL_58]]#1:%{{.*}}) shape %[[VAL_61]] : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + %7:3 = fir.box_dims %4#1, %c1 : (!fir.box>, index) -> (index, index, index) + %8 = arith.cmpi sgt, %7#1, %c0 : index + %9 = arith.select %8, %7#1, %c0 : index + %10 = fir.shape %9 : (index) -> !fir.shape<1> + %11 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1) shape %10 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + %12 = hlfir.exactly_once : f32 { + %19:3 = fir.box_dims %3#1, %c1 : (!fir.box>, index) -> (index, index, index) + %20 = arith.cmpi sgt, %19#1, %c0 : index + %21 = arith.select %20, %19#1, %c0 : index + %22 = fir.shape %21 : (index) -> !fir.shape<1> + %23 = hlfir.designate %3#0 (%6, %c1:%19#1:%c1) shape %22 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_68:.*]] = fir.call @_QPpure_real_func2() fastmath : () -> f32 +// CHECK: %[[VAL_69:.*]] = hlfir.elemental %{{.*}} unordered : (!fir.shape<1>) -> !hlfir.expr { +// CHECK: ^bb0(%[[VAL_70:.*]]: index): +// CHECK: %[[VAL_72:.*]] = hlfir.designate %[[VAL_62]] (%[[VAL_70]]) : (!fir.box>, index) -> !fir.ref + %24 = fir.call @_QPpure_real_func2() fastmath : () -> f32 + %25 = hlfir.elemental %22 unordered : (!fir.shape<1>) -> !hlfir.expr { + ^bb0(%arg4: index): + %28 = hlfir.designate %23 (%arg4) : (!fir.box>, index) -> !fir.ref + %29 = hlfir.designate %11 (%arg4) : (!fir.box>, index) -> !fir.ref + %30 = fir.load %28 : !fir.ref + %31 = fir.load %29 : !fir.ref + %32 = arith.addf %30, %31 fastmath : f32 + %33 = arith.addf %32, %24 fastmath : f32 + hlfir.yield_element %33 : f32 + } + %26:3 = hlfir.associate %25(%22) {adapt.valuebyref} : (!hlfir.expr, !fir.shape<1>) -> (!fir.box>, !fir.ref>, i1) + %27 = fir.call @_QPpure_real_func(%26#1) fastmath : (!fir.ref>) -> f32 + hlfir.yield %27 : f32 cleanup { + hlfir.end_associate %26#1, %26#2 : !fir.ref>, i1 + hlfir.destroy %25 : !hlfir.expr + } + } + %13:3 = fir.box_dims %3#1, 
%c1 : (!fir.box>, index) -> (index, index, index) + %14 = arith.cmpi sgt, %13#1, %c0 : index + %15 = arith.select %14, %13#1, %c0 : index + %16 = fir.shape %15 : (index) -> !fir.shape<1> + %17 = hlfir.designate %3#0 (%6, %c1:%13#1:%c1) shape %16 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + %18 = hlfir.elemental %10 unordered : (!fir.shape<1>) -> !hlfir.expr { + ^bb0(%arg4: index): + %19 = hlfir.designate %11 (%arg4) : (!fir.box>, index) -> !fir.ref + %20 = fir.load %19 : !fir.ref + %21 = arith.addf %20, %12 fastmath : f32 + %22 = hlfir.designate %17 (%arg4) : (!fir.box>, index) -> !fir.ref + %23 = fir.call @_QPpure_elem_func(%22) proc_attrs fastmath : (!fir.ref) -> f32 + %24 = arith.addf %21, %23 fastmath : f32 + hlfir.yield_element %24 : f32 + } + hlfir.yield %18 : !hlfir.expr cleanup { + hlfir.destroy %18 : !hlfir.expr + } + } to { + %6 = arith.muli %arg3, %c2_i32 overflow : i32 + %7 = fir.convert %6 : (i32) -> i64 + %8:3 = fir.box_dims %3#1, %c1 : (!fir.box>, index) -> (index, index, index) + %9 = arith.cmpi sgt, %8#1, %c0 : index + %10 = arith.select %9, %8#1, %c0 : index + %11 = fir.shape %10 : (index) -> !fir.shape<1> + %12 = hlfir.designate %3#0 (%7, %c1:%8#1:%c1) shape %11 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + hlfir.yield %12 : !fir.box> + } + hlfir.elsewhere mask { + %6 = hlfir.exactly_once : !hlfir.expr<100x!fir.logical<4>> { + %7 = fir.shape %c100 : (index) -> !fir.shape<1> + %8:2 = hlfir.declare %1(%7) {uniq_name = ".tmp.expr_result"} : (!fir.ref>>, !fir.shape<1>) -> (!fir.ref>>, !fir.ref>>) + %9 = fir.call @_QPpure_logical_func2() proc_attrs fastmath : () -> !fir.array<100x!fir.logical<4>> + fir.save_result %9 to %8#1(%7) : !fir.array<100x!fir.logical<4>>, !fir.ref>>, !fir.shape<1> + %10 = hlfir.as_expr %8#0 move %false : (!fir.ref>>, i1) -> !hlfir.expr<100x!fir.logical<4>> + hlfir.yield %10 : !hlfir.expr<100x!fir.logical<4>> cleanup { + hlfir.destroy %10 : !hlfir.expr<100x!fir.logical<4>> + } + 
} + hlfir.yield %6 : !hlfir.expr<100x!fir.logical<4>> + } do { + hlfir.region_assign { + %6 = fir.convert %arg3 : (i32) -> i64 + %7:3 = fir.box_dims %5#1, %c1 : (!fir.box>, index) -> (index, index, index) + %8 = arith.cmpi sgt, %7#1, %c0 : index + %9 = arith.select %8, %7#1, %c0 : index + %10 = fir.shape %9 : (index) -> !fir.shape<1> + %11 = hlfir.designate %5#0 (%6, %c1:%7#1:%c1) shape %10 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + hlfir.yield %11 : !fir.box> + } to { + %6 = arith.muli %arg3, %c2_i32 overflow : i32 + %7 = fir.convert %6 : (i32) -> i64 + %8 = hlfir.exactly_once : i32 { + %14 = fir.call @_QPpure_ifoo() proc_attrs fastmath : () -> i32 + hlfir.yield %14 : i32 cleanup { + } + } + %9 = fir.convert %8 : (i32) -> index + %10 = arith.cmpi sgt, %9, %c0 : index + %11 = arith.select %10, %9, %c0 : index + %12 = fir.shape %11 : (index) -> !fir.shape<1> + %13 = hlfir.designate %3#0 (%7, %c1:%9:%c1) shape %12 : (!fir.box>, i64, index, index, index, !fir.shape<1>) -> !fir.box> + hlfir.yield %13 : !fir.box> + } + } + } + } + return +} From flang-commits at lists.llvm.org Tue May 20 09:22:12 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 20 May 2025 09:22:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Undo the effects of CSE for hlfir.exactly_once. (PR #140190) In-Reply-To: Message-ID: <682cac34.170a0220.13fffc.a237@mx.google.com> https://github.com/vzakhari closed https://github.com/llvm/llvm-project/pull/140190 From flang-commits at lists.llvm.org Tue May 20 09:44:45 2025 From: flang-commits at lists.llvm.org (Razvan Lupusoru via flang-commits) Date: Tue, 20 May 2025 09:44:45 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [OpenACC] rename private/firstprivate recipe attributes (PR #140719) In-Reply-To: Message-ID: <682cb17d.050a0220.1fb92d.4707@mx.google.com> https://github.com/razvanlupusoru approved this pull request. Thank you! 
https://github.com/llvm/llvm-project/pull/140719 From flang-commits at lists.llvm.org Tue May 20 10:00:20 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 20 May 2025 10:00:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) Message-ID: https://github.com/akuhlens created https://github.com/llvm/llvm-project/pull/140763 Make sure to preserve the location of the end statement on data declarations for use in debugging OpenACC runtime. >From a06104840a9639eea138f17ec83ee3bfb75ab315 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Tue, 20 May 2025 09:54:26 -0700 Subject: [PATCH] initial commit --- flang/lib/Lower/OpenACC.cpp | 37 +++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index e1918288d6de3..3ff11dfaab311 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -810,23 +810,25 @@ static void genDeclareDataOperandOperationsWithModifier( } template -static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { +static void +genDataExitOperations(fir::FirOpBuilder &builder, + llvm::SmallVector operands, bool structured, + std::optional exitLoc = std::nullopt) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); + auto opLoc = exitLoc ? 
*exitLoc : entryOp.getLoc(); if constexpr (std::is_same_v || std::is_same_v) builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getVarType(), entryOp.getBounds(), entryOp.getAsyncOperands(), + opLoc, entryOp.getAccVar(), entryOp.getVar(), entryOp.getVarType(), + entryOp.getBounds(), entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); else builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), + opLoc, entryOp.getAccVar(), entryOp.getBounds(), entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); @@ -3017,6 +3019,7 @@ static Op createComputeOp( static void genACCDataOp(Fortran::lower::AbstractConverter &converter, mlir::Location currentLocation, + mlir::Location endLocation, Fortran::lower::pft::Evaluation &eval, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, @@ -3211,19 +3214,19 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, // Create the exit operations after the region. 
genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, attachEntryOperands, /*structured=*/true); + builder, attachEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, nocreateEntryOperands, /*structured=*/true); + builder, nocreateEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands, /*structured=*/true, endLocation); builder.restoreInsertionPoint(insPt); } @@ -3300,7 +3303,9 @@ genACC(Fortran::lower::AbstractConverter &converter, std::get(beginBlockDirective.t); const auto &accClauseList = std::get(beginBlockDirective.t); - + const auto &endBlockDirective = + std::get(blockConstruct.t); + mlir::Location endLocation = converter.genLocation(endBlockDirective.source); mlir::Location currentLocation = converter.genLocation(blockDirective.source); Fortran::lower::StatementContext stmtCtx; @@ -3309,8 +3314,8 @@ genACC(Fortran::lower::AbstractConverter &converter, semanticsContext, stmtCtx, accClauseList); } else if (blockDirective.v == llvm::acc::ACCD_data) { - genACCDataOp(converter, currentLocation, eval, semanticsContext, stmtCtx, - accClauseList); + genACCDataOp(converter, currentLocation, endLocation, eval, + semanticsContext, stmtCtx, accClauseList); } else if (blockDirective.v == llvm::acc::ACCD_serial) { createComputeOp(converter, currentLocation, eval, semanticsContext, stmtCtx, 
From flang-commits at lists.llvm.org Tue May 20 10:00:53 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 10:00:53 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <682cb545.050a0220.21025e.6a23@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Andre Kuhlenschmidt (akuhlens)
Changes Make sure to preserve the location of the end statement on data declarations for use in debugging OpenACC runtime. --- Full diff: https://github.com/llvm/llvm-project/pull/140763.diff 1 Files Affected: - (modified) flang/lib/Lower/OpenACC.cpp (+21-16)
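The pattern this patch introduces, an optional exit location that falls back to the location of the entry operation, can be sketched outside of MLIR roughly as follows. This is only an illustration under stated assumptions: `Loc`, `EntryOp`, `ExitOp` and `genExitOps` are hypothetical stand-ins invented for the sketch, not flang or MLIR types, and the line numbers in the usage note are borrowed from the `locations.f90` test in this PR.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for mlir::Location: just a source line number.
struct Loc {
  int line;
};

// Hypothetical stand-in for a data entry operation such as acc.copyin.
struct EntryOp {
  std::string name;
  Loc loc;
};

// Hypothetical stand-in for the exit operation created for an entry op.
struct ExitOp {
  std::string name;
  Loc loc;
};

// Mirrors the shape of genDataExitOperations in the patch: when the caller
// passes exitLoc, every exit op uses it; otherwise each exit op inherits the
// location of its entry op, which was the behavior before the patch.
std::vector<ExitOp> genExitOps(const std::vector<EntryOp> &entries,
                               std::optional<Loc> exitLoc = std::nullopt) {
  std::vector<ExitOp> exits;
  for (const EntryOp &entry : entries) {
    Loc opLoc = exitLoc ? *exitLoc : entry.loc;
    exits.push_back(ExitOp{entry.name, opLoc});
  }
  return exits;
}
```

Called without the second argument, `genExitOps` keeps the entry location (for example line 177 of the `!$acc data` directive); called with `Loc{181}`, every exit op reports the `!$acc end data` line instead. Defaulting the parameter to `std::nullopt` leaves existing call sites unchanged, which appears to be why only `genACCDataOp` needed updating.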
https://github.com/llvm/llvm-project/pull/140763 From flang-commits at lists.llvm.org Tue May 20 10:05:08 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 20 May 2025 10:05:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <682cb644.170a0220.6f14d.17af@mx.google.com> akuhlens wrote: I didn't see a good way of testing this change. I am open to suggestions if you think it needs a test. I have checked manually that the correct location information ends up at the exit operation calls. https://github.com/llvm/llvm-project/pull/140763 From flang-commits at lists.llvm.org Tue May 20 10:11:04 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Tue, 20 May 2025 10:11:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <682cb7a8.170a0220.19bbf2.360d@mx.google.com> ================ @@ -810,23 +810,25 @@ static void genDeclareDataOperandOperationsWithModifier( } template -static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { +static void +genDataExitOperations(fir::FirOpBuilder &builder, + llvm::SmallVector operands, bool structured, + std::optional exitLoc = std::nullopt) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); + auto opLoc = exitLoc ? *exitLoc : entryOp.getLoc(); ---------------- clementval wrote: Spell out auto here. 
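For readers following the review, here is a minimal sketch of what spelling out `auto` means for this declaration. It is an illustration only: `Loc`, `EntryOp` and `pickLoc` are hypothetical stand-ins rather than MLIR types, and the rationale in the comments reflects the general LLVM guideline of using `auto` only where the type is evident from context, not a statement by the reviewer.

```cpp
#include <optional>
#include <type_traits>

// Hypothetical stand-in for mlir::Location.
struct Loc {
  int line;
};

// Hypothetical stand-in for an op whose getLoc() returns a location.
struct EntryOp {
  Loc loc;
  Loc getLoc() const { return loc; }
};

Loc pickLoc(std::optional<Loc> exitLoc, const EntryOp &entryOp) {
  // With auto, the reader has to work out that dereferencing the optional
  // and calling getLoc() both yield a Loc.
  auto deduced = exitLoc ? *exitLoc : entryOp.getLoc();
  static_assert(std::is_same_v<decltype(deduced), Loc>,
                "auto deduces the same type that is spelled out below");
  (void)deduced; // only declared to compare the two spellings

  // Spelled out, the type is visible at the declaration itself.
  Loc opLoc = exitLoc ? *exitLoc : entryOp.getLoc();
  return opLoc;
}
```

Both declarations compile to the same thing; the difference is purely in what a reader can see without looking up `getLoc()` or the optional's value type.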
https://github.com/llvm/llvm-project/pull/140763 From flang-commits at lists.llvm.org Tue May 20 10:11:46 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Tue, 20 May 2025 10:11:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <682cb7d2.050a0220.d7bd1.4362@mx.google.com> clementval wrote: Can you add a test in `flang/test/Lower/OpenACC/locations.f90`? https://github.com/llvm/llvm-project/pull/140763 From flang-commits at lists.llvm.org Tue May 20 11:02:20 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 20 May 2025 11:02:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <682cc3ac.170a0220.f1ff4.3d0c@mx.google.com> ================ @@ -810,23 +810,25 @@ static void genDeclareDataOperandOperationsWithModifier( } template -static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { +static void +genDataExitOperations(fir::FirOpBuilder &builder, + llvm::SmallVector operands, bool structured, + std::optional exitLoc = std::nullopt) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); + auto opLoc = exitLoc ? *exitLoc : entryOp.getLoc(); ---------------- akuhlens wrote: Just so that I have a better understanding of when you want a type annotation and when you don't, could you say why you want `Location` here? 
https://github.com/llvm/llvm-project/pull/140763 From flang-commits at lists.llvm.org Tue May 20 11:05:45 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Tue, 20 May 2025 11:05:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <682cc479.050a0220.30f7ac.55ba@mx.google.com> ================ @@ -810,23 +810,25 @@ static void genDeclareDataOperandOperationsWithModifier( } template -static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { +static void +genDataExitOperations(fir::FirOpBuilder &builder, + llvm::SmallVector operands, bool structured, + std::optional exitLoc = std::nullopt) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); + auto opLoc = exitLoc ? 
*exitLoc : entryOp.getLoc(); ---------------- clementval wrote: The type is not obvious like if you had something like ``` https://github.com/llvm/llvm-project/pull/140763 From flang-commits at lists.llvm.org Tue May 20 11:06:51 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Tue, 20 May 2025 11:06:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <682cc4bb.a70a0220.22946c.761f@mx.google.com> https://github.com/clementval edited https://github.com/llvm/llvm-project/pull/140763 From flang-commits at lists.llvm.org Tue May 20 11:07:30 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Tue, 20 May 2025 11:07:30 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <682cc4e2.630a0220.212f7.89f1@mx.google.com> https://github.com/clementval edited https://github.com/llvm/llvm-project/pull/140763 From flang-commits at lists.llvm.org Tue May 20 11:09:48 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 20 May 2025 11:09:48 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <682cc56c.170a0220.7b588.3f09@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/140763 >From a06104840a9639eea138f17ec83ee3bfb75ab315 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Tue, 20 May 2025 09:54:26 -0700 Subject: [PATCH 1/2] initial commit --- flang/lib/Lower/OpenACC.cpp | 37 +++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/flang/lib/Lower/OpenACC.cpp 
b/flang/lib/Lower/OpenACC.cpp index e1918288d6de3..3ff11dfaab311 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -810,23 +810,25 @@ static void genDeclareDataOperandOperationsWithModifier( } template -static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { +static void +genDataExitOperations(fir::FirOpBuilder &builder, + llvm::SmallVector operands, bool structured, + std::optional exitLoc = std::nullopt) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); + auto opLoc = exitLoc ? *exitLoc : entryOp.getLoc(); if constexpr (std::is_same_v || std::is_same_v) builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getVarType(), entryOp.getBounds(), entryOp.getAsyncOperands(), + opLoc, entryOp.getAccVar(), entryOp.getVar(), entryOp.getVarType(), + entryOp.getBounds(), entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); else builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), + opLoc, entryOp.getAccVar(), entryOp.getBounds(), entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); @@ -3017,6 +3019,7 @@ static Op createComputeOp( static void genACCDataOp(Fortran::lower::AbstractConverter &converter, mlir::Location currentLocation, + mlir::Location endLocation, Fortran::lower::pft::Evaluation &eval, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, @@ -3211,19 +3214,19 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, // Create the exit operations after the region. 
genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, attachEntryOperands, /*structured=*/true); + builder, attachEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, nocreateEntryOperands, /*structured=*/true); + builder, nocreateEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands, /*structured=*/true, endLocation); builder.restoreInsertionPoint(insPt); } @@ -3300,7 +3303,9 @@ genACC(Fortran::lower::AbstractConverter &converter, std::get(beginBlockDirective.t); const auto &accClauseList = std::get(beginBlockDirective.t); - + const auto &endBlockDirective = + std::get(blockConstruct.t); + mlir::Location endLocation = converter.genLocation(endBlockDirective.source); mlir::Location currentLocation = converter.genLocation(blockDirective.source); Fortran::lower::StatementContext stmtCtx; @@ -3309,8 +3314,8 @@ genACC(Fortran::lower::AbstractConverter &converter, semanticsContext, stmtCtx, accClauseList); } else if (blockDirective.v == llvm::acc::ACCD_data) { - genACCDataOp(converter, currentLocation, eval, semanticsContext, stmtCtx, - accClauseList); + genACCDataOp(converter, currentLocation, endLocation, eval, + semanticsContext, stmtCtx, accClauseList); } else if (blockDirective.v == llvm::acc::ACCD_serial) { createComputeOp(converter, currentLocation, eval, semanticsContext, stmtCtx, 
>From 2b415845e7850fd85001102f40cabaa86320d212 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Tue, 20 May 2025 11:08:54 -0700 Subject: [PATCH 2/2] address feedback --- flang/lib/Lower/OpenACC.cpp | 2 +- flang/test/Lower/OpenACC/locations.f90 | 11 +++++++++++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 3ff11dfaab311..7974c45264dde 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -817,7 +817,7 @@ genDataExitOperations(fir::FirOpBuilder &builder, for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); - auto opLoc = exitLoc ? *exitLoc : entryOp.getLoc(); + mlir::Location opLoc = exitLoc ? *exitLoc : entryOp.getLoc(); if constexpr (std::is_same_v || std::is_same_v) builder.create( diff --git a/flang/test/Lower/OpenACC/locations.f90 b/flang/test/Lower/OpenACC/locations.f90 index 84dd512a5d43f..69873b3fbca4f 100644 --- a/flang/test/Lower/OpenACC/locations.f90 +++ b/flang/test/Lower/OpenACC/locations.f90 @@ -171,6 +171,17 @@ subroutine acc_loop_fused_locations(arr) ! CHECK: acc.loop ! 
CHECK: } attributes {collapse = [3]{{.*}}} loc(fused["{{.*}}locations.f90":160:11, "{{.*}}locations.f90":161:5, "{{.*}}locations.f90":162:7, "{{.*}}locations.f90":163:9]) + subroutine data_end_locations(arr) + real, dimension(10) :: arr + + !$acc data copy(arr) + !CHECK-LABEL: acc.copyin + !CHECK-SAME: loc("{{.*}}locations.f90":177:21) + + !$acc end data + !CHECK-LABEL: acc.copyout + !CHECK-SAME: loc("{{.*}}locations.f90":181:11) + end subroutine end module From flang-commits at lists.llvm.org Tue May 20 11:11:07 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Tue, 20 May 2025 11:11:07 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix crash in error recovery (PR #140768) In-Reply-To: Message-ID: <682cc5bb.050a0220.4973.4bb9@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/140768 From flang-commits at lists.llvm.org Tue May 20 11:15:04 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Tue, 20 May 2025 11:15:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <682cc6a8.170a0220.1907cd.427d@mx.google.com> https://github.com/clementval approved this pull request. LGTM! Thanks for addressing my comments. 
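The location fallback in the patch above can be sketched in isolation. This is a minimal standalone model under stated assumptions, not the flang code: `Loc`, `EntryOp`, `ExitOp`, and `genExitOps` are hypothetical stand-ins for `mlir::Location`, the data entry operations, and `genDataExitOperations`.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for mlir::Location: only a source line number.
struct Loc {
  int line;
};

// Hypothetical stand-in for a data entry operation such as acc.copyin.
struct EntryOp {
  std::string name;
  Loc loc;
};

// Hypothetical stand-in for the generated exit operation.
struct ExitOp {
  std::string name;
  Loc loc;
};

// Create one exit op per entry op. When exitLoc is provided, every exit op
// is tagged with it; otherwise each exit op reuses its entry op's location.
std::vector<ExitOp> genExitOps(const std::vector<EntryOp> &entries,
                               std::optional<Loc> exitLoc = std::nullopt) {
  std::vector<ExitOp> exits;
  for (const EntryOp &entry : entries) {
    Loc opLoc = exitLoc ? *exitLoc : entry.loc;
    exits.push_back({entry.name + ".exit", opLoc});
  }
  return exits;
}
```

Called with an end location, every exit op carries that location. Called without one, each exit op keeps the location of its entry op, which is the behavior callers had before the optional parameter was added.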
https://github.com/llvm/llvm-project/pull/140763 From flang-commits at lists.llvm.org Tue May 20 12:07:17 2025 From: flang-commits at lists.llvm.org (Eli Friedman via flang-commits) Date: Tue, 20 May 2025 12:07:17 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <682cd2e5.620a0220.dc029.67f9@mx.google.com> ================ @@ -1982,6 +1982,27 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF, : 0; if (windowsRequiresStackProbe(MF, NumBytes + RealignmentPadding)) { + // Find an available register to spill the value of X15 to, if X15 is being + // used already for nest. + unsigned X15Scratch = AArch64::NoRegister; + const AArch64Subtarget &STI = MF.getSubtarget(); + if (llvm::any_of(MBB.liveins(), + [&STI](const MachineBasicBlock::RegisterMaskPair &LiveIn) { + return STI.getRegisterInfo()->isSuperOrSubRegisterEq( + AArch64::X15, LiveIn.PhysReg); + })) { + X15Scratch = findScratchNonCalleeSaveRegister(&MBB); ---------------- efriedma-quic wrote: We need to make sure the scratch is not x16 or x17. (See CSR_AArch64_StackProbe_Windows.) https://github.com/llvm/llvm-project/pull/126743 From flang-commits at lists.llvm.org Tue May 20 12:07:17 2025 From: flang-commits at lists.llvm.org (Eli Friedman via flang-commits) Date: Tue, 20 May 2025 12:07:17 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <682cd2e5.050a0220.148a70.5114@mx.google.com> ================ @@ -1,35 +1,26 @@ -; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +;; Testing that nest uses x15 on all calling conventions (except Arm64EC) ---------------- efriedma-quic wrote: > llvm caller's runtime to correctly use VirtualAlloc2 The trampoline is allocated on the stack. We can't mess with the properties of the stack, or else x86 code breaks. 
> we might have a pretty hard time fitting the necessary exit thunk into 36 bytes There's no reason the size of a trampoline in arm64ec needs to be the same size as a regular AArch64 trampoline. But I'm not sure what "exit thunk" you're referring to. In any case, let's leave this for when someone actually tries arm64ec fortran. https://github.com/llvm/llvm-project/pull/126743 From flang-commits at lists.llvm.org Tue May 20 12:18:19 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 20 May 2025 12:18:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Skip optimized bufferization on volatile refs (PR #140781) In-Reply-To: Message-ID: <682cd57b.170a0220.19bc14.3b3d@mx.google.com> vzakhari wrote: Should not we instead check for `getValue` being `nullptr` at the places where it is used? It seems to be more generic than checking for volatile explicitly. https://github.com/llvm/llvm-project/pull/140781 From flang-commits at lists.llvm.org Tue May 20 12:24:21 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Tue, 20 May 2025 12:24:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Skip optimized bufferization on volatile refs (PR #140781) In-Reply-To: Message-ID: <682cd6e5.170a0220.35d4da.a5cc@mx.google.com> ashermancinelli wrote: > Should not we instead check for `getValue` being `nullptr` at the places where it is used? It seems to be more generic than checking for volatile explicitly. I think you're right. 
https://github.com/llvm/llvm-project/pull/140781 From flang-commits at lists.llvm.org Tue May 20 12:34:26 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Tue, 20 May 2025 12:34:26 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Skip optimized bufferization on volatile refs (PR #140781) In-Reply-To: Message-ID: <682cd942.a70a0220.1a74e0.501f@mx.google.com> https://github.com/ashermancinelli updated https://github.com/llvm/llvm-project/pull/140781 >From f84c4cc89bfbdb233b17d2b5f85ec73e90e46529 Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Mon, 19 May 2025 19:43:15 -0700 Subject: [PATCH 1/2] [flang] Skip optimized bufferization on volatile refs Memory effects on the volatile memory resource may not be attached to a particular source, in which case the value of the effect is null. This caused the test case to crash in the optimized bufferization pass's safety analysis, which assumes it can get the SSA value modified by the memory effect. Memory effects on the volatile resource indicate that the operation must not be reordered with respect to other volatile operations, but there is no material SSA value that can be pointed to. This patch changes the safety checks to indicate that memory effects on volatile resources are not safe for optimized bufferization. 
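The null effect value described in the commit message above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `Value`, `Effect`, and `collectEffectValues` are hypothetical names, not MLIR or flang APIs. It shows the conservative pattern of giving up as soon as an effect has no value to analyze.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for an SSA value.
struct Value {
  int id;
};

// Hypothetical stand-in for a memory effect. `value` is null when the
// effect is not attached to any particular SSA value.
struct Effect {
  const Value *value;
};

// Collect the ids of the values touched by the effects. Returns
// std::nullopt as soon as one effect has no value, because the analysis
// has nothing to inspect and must be conservative.
std::optional<std::vector<int>>
collectEffectValues(const std::vector<Effect> &effects) {
  std::vector<int> ids;
  for (const Effect &effect : effects) {
    if (effect.value == nullptr)
      return std::nullopt;
    ids.push_back(effect.value->id);
  }
  return ids;
}
```

Checking for the missing value covers any effect that lacks one, whatever resource it belongs to, instead of singling out one resource kind.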
--- .../Transforms/OptimizedBufferization.cpp | 7 +++ .../HLFIR/opt-bufferization-skip-volatile.fir | 49 +++++++++++++++++++ 2 files changed, 56 insertions(+) create mode 100644 flang/test/HLFIR/opt-bufferization-skip-volatile.fir diff --git a/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp b/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp index 2f6ee2592a84f..120dc4c51f202 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp @@ -608,6 +608,13 @@ ElementalAssignBufferization::findMatch(hlfir::ElementalOp elemental) { return std::nullopt; } + // Don't allow any reads to or writes from volatile memory + if (mlir::isa( + effect.getEffect()) && + mlir::isa(effect.getResource())) { + return std::nullopt; + } + // allow if and only if the reads are from the elemental indices, in order // => each iteration doesn't read values written by other iterations // don't allow reads from a different value which may alias: fir alias diff --git a/flang/test/HLFIR/opt-bufferization-skip-volatile.fir b/flang/test/HLFIR/opt-bufferization-skip-volatile.fir new file mode 100644 index 0000000000000..158f92bf207d2 --- /dev/null +++ b/flang/test/HLFIR/opt-bufferization-skip-volatile.fir @@ -0,0 +1,49 @@ +// RUN: fir-opt --pass-pipeline="builtin.module(func.func(opt-bufferization))" %s | FileCheck %s + +// Ensure optimized bufferization preserves the semantics of volatile arrays +func.func @minimal_volatile_test() { + %c1 = arith.constant 1 : index + %c200 = arith.constant 200 : index + + // Create a volatile array + %1 = fir.address_of(@_QMtestEarray) : !fir.ref> + %2 = fir.shape %c200 : (index) -> !fir.shape<1> + %3 = fir.volatile_cast %1 : (!fir.ref>) -> !fir.ref, volatile> + %4:2 = hlfir.declare %3(%2) {fortran_attrs = #fir.var_attrs, uniq_name = "_QMtestEarray"} : (!fir.ref, volatile>, !fir.shape<1>) -> (!fir.ref, volatile>, !fir.ref, volatile>) + + // Create an 
elemental operation that negates each element + %5 = hlfir.elemental %2 unordered : (!fir.shape<1>) -> !hlfir.expr<200xf32> { + ^bb0(%arg1: index): + %6 = hlfir.designate %4#0 (%arg1) : (!fir.ref, volatile>, index) -> !fir.ref + %7 = fir.load %6 : !fir.ref + %8 = arith.negf %7 : f32 + hlfir.yield_element %8 : f32 + } + + // Assign the result back to the volatile array + hlfir.assign %5 to %4#0 : !hlfir.expr<200xf32>, !fir.ref, volatile> + hlfir.destroy %5 : !hlfir.expr<200xf32> + + return +} + +fir.global @_QMtestEarray : !fir.array<200xf32> + +// CHECK-LABEL: func.func @minimal_volatile_test() { +// CHECK: %[[VAL_0:.*]] = arith.constant 200 : index +// CHECK: %[[VAL_1:.*]] = fir.address_of(@_QMtestEarray) : !fir.ref> +// CHECK: %[[VAL_2:.*]] = fir.shape %[[VAL_0]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_3:.*]] = fir.volatile_cast %[[VAL_1]] : (!fir.ref>) -> !fir.ref, volatile> +// CHECK: %[[VAL_4:.*]]:2 = hlfir.declare %[[VAL_3]](%[[VAL_2]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QMtestEarray"} : (!fir.ref, volatile>, !fir.shape<1>) -> (!fir.ref, volatile>, !fir.ref, volatile>) +// CHECK: %[[VAL_5:.*]] = hlfir.elemental %[[VAL_2]] unordered : (!fir.shape<1>) -> !hlfir.expr<200xf32> { +// CHECK: ^bb0(%[[VAL_6:.*]]: index): +// CHECK: %[[VAL_7:.*]] = hlfir.designate %[[VAL_4]]#0 (%[[VAL_6]]) : (!fir.ref, volatile>, index) -> !fir.ref +// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] : !fir.ref +// CHECK: %[[VAL_9:.*]] = arith.negf %[[VAL_8]] : f32 +// CHECK: hlfir.yield_element %[[VAL_9]] : f32 +// CHECK: } +// CHECK: hlfir.assign %[[VAL_5]] to %[[VAL_4]]#0 : !hlfir.expr<200xf32>, !fir.ref, volatile> +// CHECK: hlfir.destroy %[[VAL_5]] : !hlfir.expr<200xf32> +// CHECK: return +// CHECK: } +// CHECK: fir.global @_QMtestEarray : !fir.array<200xf32> >From 8b8ff482c48b157b55e95d2a6ae6a88ef0f05929 Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Tue, 20 May 2025 12:32:53 -0700 Subject: [PATCH 2/2] Check for null effect value instead of volatile resource 
--- .../Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp b/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp index 120dc4c51f202..e2ca754a1817a 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp @@ -608,10 +608,9 @@ ElementalAssignBufferization::findMatch(hlfir::ElementalOp elemental) { return std::nullopt; } - // Don't allow any reads to or writes from volatile memory - if (mlir::isa( - effect.getEffect()) && - mlir::isa(effect.getResource())) { + if (effect.getValue() == nullptr) { + LLVM_DEBUG(llvm::dbgs() + << "side-effect with no value, cannot analyze further\n"); return std::nullopt; } From flang-commits at lists.llvm.org Tue May 20 12:39:37 2025 From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits) Date: Tue, 20 May 2025 12:39:37 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <682cda79.170a0220.27ecaa.3641@mx.google.com> ================ @@ -1,35 +1,26 @@ -; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +;; Testing that nest uses x15 on all calling conventions (except Arm64EC) ---------------- vtjnash wrote: > The trampoline is allocated on the stack. There is nothing about the spec or implementation of trampoline that requires that to be true. In fact, it is much more useful if users such as flang doesn't implement it that way, since macOS prohibits that implementation and I would assume that Windows does as well since XP (https://en.wikipedia.org/wiki/Executable-space_protection#Windows). I therefore don't think it is relevant that arm64ec trampolines cannot be allocated on the stack, since most OS now prohibit that as the implementation anyways. 
An exit thunk is the arm64ec implementation term for when the runtime changes back from arm64ec to x86_64. https://github.com/llvm/llvm-project/pull/126743 From flang-commits at lists.llvm.org Tue May 20 12:55:33 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Tue, 20 May 2025 12:55:33 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Skip optimized bufferization on volatile refs (PR #140781) In-Reply-To: Message-ID: <682cde35.630a0220.37cc5.a192@mx.google.com> https://github.com/ashermancinelli edited https://github.com/llvm/llvm-project/pull/140781 From flang-commits at lists.llvm.org Tue May 20 13:01:40 2025 From: flang-commits at lists.llvm.org (Scott Manley via flang-commits) Date: Tue, 20 May 2025 13:01:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682cdfa4.050a0220.2a8cf0.06e8@mx.google.com> https://github.com/rscottmanley approved this pull request. LGTM, but it may be best to get approval from a Flang code owner. https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Tue May 20 13:02:15 2025 From: flang-commits at lists.llvm.org (Sebastian Pop via flang-commits) Date: Tue, 20 May 2025 13:02:15 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <682cdfc7.050a0220.304e32.4e8e@mx.google.com> https://github.com/sebpop updated https://github.com/llvm/llvm-project/pull/140182 >From 46efee7d48a11794fc103cf67b21796d8e5f3408 Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Mon, 12 May 2025 21:56:03 +0000 Subject: [PATCH 1/3] [flang] add -floop-interchange to flang driver This patch allows flang to recognize the flags -floop-interchange and -fno-loop-interchange. -floop-interchange adds the loop interchange pass to the pass pipeline. 
--- clang/include/clang/Driver/Options.td | 4 ++-- clang/lib/Driver/ToolChains/Flang.cpp | 3 +++ flang/docs/ReleaseNotes.md | 2 ++ flang/include/flang/Frontend/CodeGenOptions.def | 1 + flang/lib/Frontend/CompilerInvocation.cpp | 3 +++ flang/lib/Frontend/FrontendActions.cpp | 1 + flang/test/Driver/loop-interchange.f90 | 7 +++++++ 7 files changed, 19 insertions(+), 2 deletions(-) create mode 100644 flang/test/Driver/loop-interchange.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index bd8df8f6a749a..c8c675bc17e7d 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -4186,9 +4186,9 @@ def ftrap_function_EQ : Joined<["-"], "ftrap-function=">, Group, HelpText<"Issue call to specified function rather than a trap instruction">, MarshallingInfoString>; def floop_interchange : Flag<["-"], "floop-interchange">, Group, - HelpText<"Enable the loop interchange pass">, Visibility<[ClangOption, CC1Option]>; + HelpText<"Enable the loop interchange pass">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; def fno_loop_interchange: Flag<["-"], "fno-loop-interchange">, Group, - HelpText<"Disable the loop interchange pass">, Visibility<[ClangOption, CC1Option]>; + HelpText<"Disable the loop interchange pass">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; def funroll_loops : Flag<["-"], "funroll-loops">, Group, HelpText<"Turn on loop unroller">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; def fno_unroll_loops : Flag<["-"], "fno-unroll-loops">, Group, diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index b1ca747e68b89..c6c7a0b75a987 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -152,6 +152,9 @@ void Flang::addCodegenOptions(const ArgList &Args, !stackArrays->getOption().matches(options::OPT_fno_stack_arrays)) CmdArgs.push_back("-fstack-arrays"); + 
Args.AddLastArg(CmdArgs, options::OPT_floop_interchange, + options::OPT_fno_loop_interchange); + handleVectorizeLoopsArgs(Args, CmdArgs); handleVectorizeSLPArgs(Args, CmdArgs); diff --git a/flang/docs/ReleaseNotes.md b/flang/docs/ReleaseNotes.md index b356f64553d7e..c76635d121d58 100644 --- a/flang/docs/ReleaseNotes.md +++ b/flang/docs/ReleaseNotes.md @@ -32,6 +32,8 @@ page](https://llvm.org/releases/). ## New Compiler Flags +* -floop-interchange is now recognized by flang. + ## Windows Support ## Fortran Language Changes in Flang diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index d9dbd274e83e5..7ced60f512219 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -35,6 +35,7 @@ CODEGENOPT(PrepareForThinLTO , 1, 0) ///< Set when -flto=thin is enabled on the CODEGENOPT(StackArrays, 1, 0) ///< -fstack-arrays (enable the stack-arrays pass) CODEGENOPT(VectorizeLoop, 1, 0) ///< Enable loop vectorization. CODEGENOPT(VectorizeSLP, 1, 0) ///< Enable SLP vectorization. +CODEGENOPT(InterchangeLoops, 1, 0) ///< Enable loop interchange. CODEGENOPT(LoopVersioning, 1, 0) ///< Enable loop versioning. 
CODEGENOPT(UnrollLoops, 1, 0) ///< Enable loop unrolling CODEGENOPT(AliasAnalysis, 1, 0) ///< Enable alias analysis pass diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 238079a09ef3a..67fb0924def71 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -269,6 +269,9 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, clang::driver::options::OPT_fno_stack_arrays, false)) opts.StackArrays = 1; + if (args.getLastArg(clang::driver::options::OPT_floop_interchange)) + opts.InterchangeLoops = 1; + if (args.getLastArg(clang::driver::options::OPT_vectorize_loops)) opts.VectorizeLoop = 1; diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index e5a15c555fa5e..38dfaadf1dff9 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -922,6 +922,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (ci.isTimingEnabled()) si.getTimePasses().setOutStream(ci.getTimingStreamLLVM()); pto.LoopUnrolling = opts.UnrollLoops; + pto.LoopInterchange = opts.InterchangeLoops; pto.LoopInterleaving = opts.UnrollLoops; pto.LoopVectorization = opts.VectorizeLoop; pto.SLPVectorization = opts.VectorizeSLP; diff --git a/flang/test/Driver/loop-interchange.f90 b/flang/test/Driver/loop-interchange.f90 new file mode 100644 index 0000000000000..30ce2734d0466 --- /dev/null +++ b/flang/test/Driver/loop-interchange.f90 @@ -0,0 +1,7 @@ +! RUN: %flang -### -S -floop-interchange %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -fno-loop-interchange %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! CHECK-LOOP-INTERCHANGE: "-floop-interchange" +! 
CHECK-NO-LOOP-INTERCHANGE: "-fno-loop-interchange" + +program test +end program >From 9dc3774db84e908516a184fa7b7fd242b68a22d1 Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Fri, 16 May 2025 03:02:54 +0000 Subject: [PATCH 2/3] [flang] enable loop-interchange at O3, O2, and Os --- clang/lib/Driver/ToolChains/CommonArgs.cpp | 13 +++++++++++++ clang/lib/Driver/ToolChains/CommonArgs.h | 4 ++++ clang/lib/Driver/ToolChains/Flang.cpp | 4 +--- flang/docs/ReleaseNotes.md | 1 + flang/test/Driver/loop-interchange.f90 | 8 +++++++- 5 files changed, 26 insertions(+), 4 deletions(-) diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp index 632027c4a944c..83be3a8e27302 100644 --- a/clang/lib/Driver/ToolChains/CommonArgs.cpp +++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp @@ -3149,3 +3149,16 @@ void tools::handleVectorizeSLPArgs(const ArgList &Args, options::OPT_fno_slp_vectorize, EnableSLPVec)) CmdArgs.push_back("-vectorize-slp"); } + +void tools::handleInterchangeLoopsArgs(const ArgList &Args, + ArgStringList &CmdArgs) { + // FIXME: instead of relying on shouldEnableVectorizerAtOLevel, we may want to + // implement a separate function to infer loop interchange from opt level. + // For now, enable loop-interchange at the same opt levels as loop-vectorize. + bool EnableInterchange = shouldEnableVectorizerAtOLevel(Args, false); + OptSpecifier InterchangeAliasOption = + EnableInterchange ? 
options::OPT_O_Group : options::OPT_floop_interchange; + if (Args.hasFlag(options::OPT_floop_interchange, InterchangeAliasOption, + options::OPT_fno_loop_interchange, EnableInterchange)) + CmdArgs.push_back("-floop-interchange"); +} diff --git a/clang/lib/Driver/ToolChains/CommonArgs.h b/clang/lib/Driver/ToolChains/CommonArgs.h index 96bc0619dcbc0..6d36a0e8bf493 100644 --- a/clang/lib/Driver/ToolChains/CommonArgs.h +++ b/clang/lib/Driver/ToolChains/CommonArgs.h @@ -259,6 +259,10 @@ void renderCommonIntegerOverflowOptions(const llvm::opt::ArgList &Args, bool shouldEnableVectorizerAtOLevel(const llvm::opt::ArgList &Args, bool isSlpVec); +/// Enable -floop-interchange based on the optimization level selected. +void handleInterchangeLoopsArgs(const llvm::opt::ArgList &Args, + llvm::opt::ArgStringList &CmdArgs); + /// Enable -fvectorize based on the optimization level selected. void handleVectorizeLoopsArgs(const llvm::opt::ArgList &Args, llvm::opt::ArgStringList &CmdArgs); diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index c6c7a0b75a987..54176381b6e5b 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -152,9 +152,7 @@ void Flang::addCodegenOptions(const ArgList &Args, !stackArrays->getOption().matches(options::OPT_fno_stack_arrays)) CmdArgs.push_back("-fstack-arrays"); - Args.AddLastArg(CmdArgs, options::OPT_floop_interchange, - options::OPT_fno_loop_interchange); - + handleInterchangeLoopsArgs(Args, CmdArgs); handleVectorizeLoopsArgs(Args, CmdArgs); handleVectorizeSLPArgs(Args, CmdArgs); diff --git a/flang/docs/ReleaseNotes.md b/flang/docs/ReleaseNotes.md index c76635d121d58..36be369595ffd 100644 --- a/flang/docs/ReleaseNotes.md +++ b/flang/docs/ReleaseNotes.md @@ -33,6 +33,7 @@ page](https://llvm.org/releases/). ## New Compiler Flags * -floop-interchange is now recognized by flang. +* -floop-interchange is enabled by default at -O2 and above. 
## Windows Support diff --git a/flang/test/Driver/loop-interchange.f90 b/flang/test/Driver/loop-interchange.f90 index 30ce2734d0466..d5d62e9a777d2 100644 --- a/flang/test/Driver/loop-interchange.f90 +++ b/flang/test/Driver/loop-interchange.f90 @@ -1,7 +1,13 @@ ! RUN: %flang -### -S -floop-interchange %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s ! RUN: %flang -### -S -fno-loop-interchange %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -O0 %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -O1 %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -O2 %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -O3 %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -Os %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -Oz %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s ! CHECK-LOOP-INTERCHANGE: "-floop-interchange" -! CHECK-NO-LOOP-INTERCHANGE: "-fno-loop-interchange" +! CHECK-NO-LOOP-INTERCHANGE-NOT: "-floop-interchange" program test end program >From 0b81d78ae5bcd78e1e5bfb7609f38c1ad16c079c Mon Sep 17 00:00:00 2001 From: Sebastian Pop Date: Fri, 16 May 2025 21:46:04 +0000 Subject: [PATCH 3/3] test loop-interchange in pass pipeline --- flang/test/Driver/loop-interchange.f90 | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/flang/test/Driver/loop-interchange.f90 b/flang/test/Driver/loop-interchange.f90 index d5d62e9a777d2..5d3ec71c59874 100644 --- a/flang/test/Driver/loop-interchange.f90 +++ b/flang/test/Driver/loop-interchange.f90 @@ -8,6 +8,10 @@ ! RUN: %flang -### -S -Oz %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s ! CHECK-LOOP-INTERCHANGE: "-floop-interchange" ! CHECK-NO-LOOP-INTERCHANGE-NOT: "-floop-interchange" +! 
RUN: %flang_fc1 -emit-llvm -O2 -floop-interchange -mllvm -print-pipeline-passes -o /dev/null %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE-PASS %s +! RUN: %flang_fc1 -emit-llvm -O2 -fno-loop-interchange -mllvm -print-pipeline-passes -o /dev/null %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE-PASS %s +! CHECK-LOOP-INTERCHANGE-PASS: loop-interchange +! CHECK-NO-LOOP-INTERCHANGE-PASS-NOT: loop-interchange program test end program From flang-commits at lists.llvm.org Tue May 20 13:18:29 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Tue, 20 May 2025 13:18:29 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <682ce395.170a0220.2d0568.428f@mx.google.com> https://github.com/tarunprabhu approved this pull request. LGTM. Thanks! https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Tue May 20 13:45:23 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Tue, 20 May 2025 13:45:23 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] fix diagnostic for bad cancel type (PR #140798) Message-ID: https://github.com/tblah created https://github.com/llvm/llvm-project/pull/140798 Fixes #133685 >From cdb3f7e02603da24e537af83af4b83e8b30e46dc Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Tue, 20 May 2025 20:43:29 +0000 Subject: [PATCH] [flang][OpenMP] fix diagnostic for bad cancel type Fixes #133685 --- flang/lib/Semantics/check-omp-structure.cpp | 8 ++++---- flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 | 6 ++++++ 2 files changed, 10 insertions(+), 4 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c6c4fdf8a8198..606014276e7ca 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -2575,8 +2575,8 @@ void OmpStructureChecker::CheckCancellationNest( } break; default: - // This should have been diagnosed by this point. - llvm_unreachable("Unexpected directive"); + // This is diagnosed later. + return; } if (!eligibleCancellation) { context_.Say(source, @@ -2614,8 +2614,8 @@ void OmpStructureChecker::CheckCancellationNest( parser::ToUpperCaseLetters(typeName.str())); break; default: - // This should have been diagnosed by this point. - llvm_unreachable("Unexpected directive"); + // This is diagnosed later. + return; } } } diff --git a/flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 b/flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 new file mode 100644 index 0000000000000..ea5e7be23e2f9 --- /dev/null +++ b/flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 @@ -0,0 +1,6 @@ +!RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags + +program test +!ERROR: PARALLEL DO is not a cancellable construct +!$omp cancel parallel do +end From flang-commits at lists.llvm.org Tue May 20 13:45:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 13:45:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] fix diagnostic for bad cancel type (PR #140798) In-Reply-To: Message-ID: <682cea03.630a0220.96146.9a2f@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Tom Eccles (tblah)
Changes Fixes #133685 --- Full diff: https://github.com/llvm/llvm-project/pull/140798.diff 2 Files Affected: - (modified) flang/lib/Semantics/check-omp-structure.cpp (+4-4) - (added) flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 (+6) ``````````diff diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c6c4fdf8a8198..606014276e7ca 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2575,8 +2575,8 @@ void OmpStructureChecker::CheckCancellationNest( } break; default: - // This should have been diagnosed by this point. - llvm_unreachable("Unexpected directive"); + // This is diagnosed later. + return; } if (!eligibleCancellation) { context_.Say(source, @@ -2614,8 +2614,8 @@ void OmpStructureChecker::CheckCancellationNest( parser::ToUpperCaseLetters(typeName.str())); break; default: - // This should have been diagnosed by this point. - llvm_unreachable("Unexpected directive"); + // This is diagnosed later. + return; } } } diff --git a/flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 b/flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 new file mode 100644 index 0000000000000..ea5e7be23e2f9 --- /dev/null +++ b/flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 @@ -0,0 +1,6 @@ +!RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags + +program test +!ERROR: PARALLEL DO is not a cancellable construct +!$omp cancel parallel do +end ``````````
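The change in the diff above, replacing an unreachable marker with an early return so that invalid input is left to a later diagnostic, can be sketched as follows. This is a hypothetical standalone model: the `Directive` enum, `checkCancellationNest` signature, and message text are assumptions for illustration and do not match the flang sources.

```cpp
#include <string>
#include <vector>

// Hypothetical subset of directive kinds. Other stands for anything that is
// not a valid cancel construct type.
enum class Directive { Parallel, Do, Sections, Taskgroup, Other };

// Nesting check for a cancel construct. Records a message when the
// enclosing region does not match the cancel type. For an unexpected type
// it returns early and records nothing, leaving that error to a separate
// check instead of aborting.
void checkCancellationNest(Directive type, Directive enclosing,
                           std::vector<std::string> &messages) {
  switch (type) {
  case Directive::Parallel:
  case Directive::Do:
  case Directive::Sections:
  case Directive::Taskgroup:
    break;
  default:
    // Invalid user input can reach this point, so it must not be treated
    // as unreachable.
    return;
  }
  if (type != enclosing)
    messages.push_back("cancel is not nested in a matching region");
}
```

The early return keeps this check silent for a bad cancel type, so the user sees a single error from the check that owns that diagnostic.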
https://github.com/llvm/llvm-project/pull/140798 From flang-commits at lists.llvm.org Tue May 20 13:56:21 2025 From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits) Date: Tue, 20 May 2025 13:56:21 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <682cec75.170a0220.35d4da.af9f@mx.google.com> https://github.com/vtjnash updated https://github.com/llvm/llvm-project/pull/126743 >From 4053196cdb8d87de6d3c5a47f2ffca6459b45680 Mon Sep 17 00:00:00 2001 From: Jameson Nash Date: Mon, 10 Feb 2025 19:21:38 +0000 Subject: [PATCH 1/2] [AArch64] fix trampoline implementation: use X15 AAPCS64 reserves any of X9-X15 for this purpose, and says not to use any of X16-X18 (like GCC chose). Simply choosing a different register fixes the problem of this being broken on any platform that actually follows the platform ABI. As a side benefit, also generate slightly better code in the trampoline itself by following the XCore implementation instead of PPC (although following the RISCV might have been slightly more readable in hindsight). 
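The 32-byte trampoline that `LowerINIT_TRAMPOLINE` stores in the diff below (two PC-relative `ldr` instructions, a `br x17`, one padding word, then the nest and function pointers as 8-byte literals) can be sketched as plain data. This is an illustrative sketch only: `encodeLdrLiteral`, `encodeBr`, and `buildTrampoline` are hypothetical helper names, the bit layouts are the standard A64 encodings of `LDR (literal, 64-bit)` and `BR`, a little-endian host is assumed, and nothing here executes the buffer or flushes the instruction cache.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// A64 "LDR Xt, <label>" (literal, 64-bit): 0x58000000 | imm19 << 5 | Rt,
// where imm19 is the byte offset from this instruction divided by 4.
constexpr std::uint32_t encodeLdrLiteral(unsigned rt, std::int32_t byteOffset) {
  return 0x58000000u |
         ((static_cast<std::uint32_t>(byteOffset / 4) & 0x7ffffu) << 5) |
         (rt & 0x1fu);
}

// A64 "BR Xn": 0xd61f0000 | Rn << 5.
constexpr std::uint32_t encodeBr(unsigned rn) {
  return 0xd61f0000u | ((rn & 0x1fu) << 5);
}

// Hypothetical helper: lays out the 32 bytes in the order the patch stores them.
// Assumes a little-endian host so the bytes match what AArch64 would fetch.
std::array<std::uint8_t, 32> buildTrampoline(unsigned nestReg,
                                             std::uint64_t nest,
                                             std::uint64_t fptr) {
  const std::uint32_t insns[4] = {
      encodeLdrLiteral(nestReg, 16), // offset 0:  ldr NestReg, .+16 (literal at 16)
      encodeLdrLiteral(17, 20),      // offset 4:  ldr x17, .+20     (literal at 24)
      encodeBr(17),                  // offset 8:  br x17
      0,                             // offset 12: padding word
  };
  std::array<std::uint8_t, 32> buf{};
  std::memcpy(buf.data(), insns, sizeof(insns));
  std::memcpy(buf.data() + 16, &nest, sizeof(nest)); // 'nest' value
  std::memcpy(buf.data() + 24, &fptr, sizeof(fptr)); // nested function address
  return buf;
}
```

With register 15 the first word comes out as `0x5800008f`, and with register 4 (the Arm64EC value in the patch) as `0x58000084`. The second and third words, `0x580000b1` and `0xd61f0220`, correspond to the `#177`/`#22528` and `#544`/`#54815` immediates in the autogenerated `trampoline.ll` checks. The first word in those checks, `#132 // =0x84`, decodes to `ldr x4, .+16`, which is what one would expect if the `default:` arm of the posted `switch` reaches the `0x04` assignment without a `break`.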
--- compiler-rt/lib/builtins/README.txt | 5 - compiler-rt/lib/builtins/trampoline_setup.c | 42 --- .../builtins/Unit/trampoline_setup_test.c | 2 +- .../lib/Optimizer/CodeGen/BoxedProcedure.cpp | 8 +- flang/test/Fir/boxproc.fir | 4 +- .../AArch64/AArch64CallingConvention.td | 25 +- .../Target/AArch64/AArch64FrameLowering.cpp | 28 ++ .../Target/AArch64/AArch64ISelLowering.cpp | 97 ++++--- llvm/lib/TargetParser/Triple.cpp | 2 - llvm/test/CodeGen/AArch64/nest-register.ll | 16 +- .../AArch64/statepoint-call-lowering.ll | 2 +- llvm/test/CodeGen/AArch64/trampoline.ll | 257 +++++++++++++++++- llvm/test/CodeGen/AArch64/win64cc-x18.ll | 27 +- .../CodeGen/AArch64/zero-call-used-regs.ll | 16 +- 14 files changed, 385 insertions(+), 146 deletions(-) diff --git a/compiler-rt/lib/builtins/README.txt b/compiler-rt/lib/builtins/README.txt index 19f26c92a0f94..2d213d95f333a 100644 --- a/compiler-rt/lib/builtins/README.txt +++ b/compiler-rt/lib/builtins/README.txt @@ -272,11 +272,6 @@ switch32 switch8 switchu8 -// This function generates a custom trampoline function with the specific -// realFunc and localsPtr values. -void __trampoline_setup(uint32_t* trampOnStack, int trampSizeAllocated, - const void* realFunc, void* localsPtr); - // There is no C interface to the *_vfp_d8_d15_regs functions. There are // called in the prolog and epilog of Thumb1 functions. 
When the C++ ABI use // SJLJ for exceptions, each function with a catch clause or destructors needs diff --git a/compiler-rt/lib/builtins/trampoline_setup.c b/compiler-rt/lib/builtins/trampoline_setup.c index 830e25e4c0303..844eb27944142 100644 --- a/compiler-rt/lib/builtins/trampoline_setup.c +++ b/compiler-rt/lib/builtins/trampoline_setup.c @@ -41,45 +41,3 @@ COMPILER_RT_ABI void __trampoline_setup(uint32_t *trampOnStack, __clear_cache(trampOnStack, &trampOnStack[10]); } #endif // __powerpc__ && !defined(__powerpc64__) - -// The AArch64 compiler generates calls to __trampoline_setup() when creating -// trampoline functions on the stack for use with nested functions. -// This function creates a custom 36-byte trampoline function on the stack -// which loads x18 with a pointer to the outer function's locals -// and then jumps to the target nested function. -// Note: x18 is a reserved platform register on Windows and macOS. - -#if defined(__aarch64__) && defined(__ELF__) -COMPILER_RT_ABI void __trampoline_setup(uint32_t *trampOnStack, - int trampSizeAllocated, - const void *realFunc, void *localsPtr) { - // This should never happen, but if compiler did not allocate - // enough space on stack for the trampoline, abort. - if (trampSizeAllocated < 36) - compilerrt_abort(); - - // create trampoline - // Load realFunc into x17. mov/movk 16 bits at a time. 
- trampOnStack[0] = - 0xd2800000u | ((((uint64_t)realFunc >> 0) & 0xffffu) << 5) | 0x11; - trampOnStack[1] = - 0xf2a00000u | ((((uint64_t)realFunc >> 16) & 0xffffu) << 5) | 0x11; - trampOnStack[2] = - 0xf2c00000u | ((((uint64_t)realFunc >> 32) & 0xffffu) << 5) | 0x11; - trampOnStack[3] = - 0xf2e00000u | ((((uint64_t)realFunc >> 48) & 0xffffu) << 5) | 0x11; - // Load localsPtr into x18 - trampOnStack[4] = - 0xd2800000u | ((((uint64_t)localsPtr >> 0) & 0xffffu) << 5) | 0x12; - trampOnStack[5] = - 0xf2a00000u | ((((uint64_t)localsPtr >> 16) & 0xffffu) << 5) | 0x12; - trampOnStack[6] = - 0xf2c00000u | ((((uint64_t)localsPtr >> 32) & 0xffffu) << 5) | 0x12; - trampOnStack[7] = - 0xf2e00000u | ((((uint64_t)localsPtr >> 48) & 0xffffu) << 5) | 0x12; - trampOnStack[8] = 0xd61f0220; // br x17 - - // Clear instruction cache. - __clear_cache(trampOnStack, &trampOnStack[9]); -} -#endif // defined(__aarch64__) && !defined(__APPLE__) && !defined(_WIN64) diff --git a/compiler-rt/test/builtins/Unit/trampoline_setup_test.c b/compiler-rt/test/builtins/Unit/trampoline_setup_test.c index d51d35acaa02f..da115fe764271 100644 --- a/compiler-rt/test/builtins/Unit/trampoline_setup_test.c +++ b/compiler-rt/test/builtins/Unit/trampoline_setup_test.c @@ -7,7 +7,7 @@ /* * Tests nested functions - * The ppc and aarch64 compilers generates a call to __trampoline_setup + * The ppc compiler generates a call to __trampoline_setup * The i386 and x86_64 compilers generate a call to ___enable_execute_stack */ diff --git a/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp b/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp index 82b11ad7db32a..69bdb48146a54 100644 --- a/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp +++ b/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp @@ -274,12 +274,12 @@ class BoxedProcedurePass auto loc = embox.getLoc(); mlir::Type i8Ty = builder.getI8Type(); mlir::Type i8Ptr = builder.getRefType(i8Ty); - // For AArch64, PPC32 and PPC64, the thunk is populated by a call to + // For PPC32 
and PPC64, the thunk is populated by a call to // __trampoline_setup, which is defined in // compiler-rt/lib/builtins/trampoline_setup.c and requires the - // thunk size greater than 32 bytes. For RISCV and x86_64, the - // thunk setup doesn't go through __trampoline_setup and fits in 32 - // bytes. + // thunk size greater than 32 bytes. For AArch64, RISCV and x86_64, + // the thunk setup doesn't go through __trampoline_setup and fits in + // 32 bytes. fir::SequenceType::Extent thunkSize = triple.getTrampolineSize(); mlir::Type buffTy = SequenceType::get({thunkSize}, i8Ty); auto buffer = builder.create(loc, buffTy); diff --git a/flang/test/Fir/boxproc.fir b/flang/test/Fir/boxproc.fir index e99dfd0b92afd..9e5e41a94069c 100644 --- a/flang/test/Fir/boxproc.fir +++ b/flang/test/Fir/boxproc.fir @@ -3,7 +3,7 @@ // RUN: %if powerpc-registered-target %{tco --target=powerpc64le-unknown-linux-gnu %s | FileCheck %s --check-prefixes=CHECK,CHECK-PPC %} // CHECK-LABEL: define void @_QPtest_proc_dummy() -// CHECK-AARCH64: %[[VAL_3:.*]] = alloca [36 x i8], i64 1, align 1 +// CHECK-AARCH64: %[[VAL_3:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-X86: %[[VAL_3:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-PPC: %[[VAL_3:.*]] = alloca [4{{[0-8]+}} x i8], i64 1, align 1 // CHECK: %[[VAL_1:.*]] = alloca { ptr }, i64 1, align 8 @@ -63,7 +63,7 @@ func.func @_QPtest_proc_dummy_other(%arg0: !fir.boxproc<() -> ()>) { } // CHECK-LABEL: define void @_QPtest_proc_dummy_char() -// CHECK-AARCH64: %[[VAL_20:.*]] = alloca [36 x i8], i64 1, align 1 +// CHECK-AARCH64: %[[VAL_20:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-X86: %[[VAL_20:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-PPC: %[[VAL_20:.*]] = alloca [4{{[0-8]+}} x i8], i64 1, align 1 // CHECK: %[[VAL_2:.*]] = alloca { { ptr, i64 } }, i64 1, align 8 diff --git a/llvm/lib/Target/AArch64/AArch64CallingConvention.td b/llvm/lib/Target/AArch64/AArch64CallingConvention.td index 7cca6d9bc6b9c..e973269545911 100644 --- 
a/llvm/lib/Target/AArch64/AArch64CallingConvention.td +++ b/llvm/lib/Target/AArch64/AArch64CallingConvention.td @@ -28,6 +28,12 @@ class CCIfSubtarget //===----------------------------------------------------------------------===// defvar AArch64_Common = [ + // The 'nest' parameter, if any, is passed in X15. + // The previous register used here (X18) is also defined to be unavailable + // for this purpose, while all of X9-X15 were defined to be free for LLVM to + // use for this, so use X15 (which LLVM often already clobbers anyways). + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32], CCBitConvertToType>, @@ -117,13 +123,7 @@ defvar AArch64_Common = [ ]; let Entry = 1 in -def CC_AArch64_AAPCS : CallingConv>], - AArch64_Common -)>; +def CC_AArch64_AAPCS : CallingConv; let Entry = 1 in def RetCC_AArch64_AAPCS : CallingConv<[ @@ -177,6 +177,8 @@ def CC_AArch64_Win64_VarArg : CallingConv<[ // a stack layout compatible with the x64 calling convention. let Entry = 1 in def CC_AArch64_Arm64EC_VarArg : CallingConv<[ + CCIfNest>, + // Convert small floating-point values to integer. CCIfType<[f16, bf16], CCBitConvertToType>, CCIfType<[f32], CCBitConvertToType>, @@ -353,6 +355,8 @@ def RetCC_AArch64_Arm64EC_CFGuard_Check : CallingConv<[ // + Stack slots are sized as needed rather than being at least 64-bit. let Entry = 1 in def CC_AArch64_DarwinPCS : CallingConv<[ + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32, f128], CCBitConvertToType>, @@ -427,6 +431,8 @@ def CC_AArch64_DarwinPCS : CallingConv<[ let Entry = 1 in def CC_AArch64_DarwinPCS_VarArg : CallingConv<[ + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32, f128], CCBitConvertToType>, @@ -450,6 +456,8 @@ def CC_AArch64_DarwinPCS_VarArg : CallingConv<[ // same as the normal Darwin VarArgs handling. 
let Entry = 1 in def CC_AArch64_DarwinPCS_ILP32_VarArg : CallingConv<[ + CCIfNest>, + CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32, f128], CCBitConvertToType>, @@ -494,6 +502,8 @@ def CC_AArch64_DarwinPCS_ILP32_VarArg : CallingConv<[ let Entry = 1 in def CC_AArch64_GHC : CallingConv<[ + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, // Handle all vector types as either f64 or v2f64. @@ -522,6 +532,7 @@ def CC_AArch64_Preserve_None : CallingConv<[ // We can pass arguments in all general registers, except: // - X8, used for sret + // - X15 (on Windows), used as a temporary register in the prologue when allocating call frames // - X16/X17, used by the linker as IP0/IP1 // - X18, the platform register // - X19, the base pointer diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp index 040662a5f11dd..96f4451182391 100644 --- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp @@ -1982,6 +1982,27 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF, : 0; if (windowsRequiresStackProbe(MF, NumBytes + RealignmentPadding)) { + // Find an available register to spill the value of X15 to, if X15 is being + // used already for nest. 
+ unsigned X15Scratch = AArch64::NoRegister; + const AArch64Subtarget &STI = MF.getSubtarget(); + if (llvm::any_of(MBB.liveins(), + [&STI](const MachineBasicBlock::RegisterMaskPair &LiveIn) { + return STI.getRegisterInfo()->isSuperOrSubRegisterEq( + AArch64::X15, LiveIn.PhysReg); + })) { + X15Scratch = findScratchNonCalleeSaveRegister(&MBB); + assert(X15Scratch != AArch64::NoRegister); +#ifndef NDEBUG + LiveRegs.removeReg(AArch64::X15); // ignore X15 since we restore it +#endif + BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXrr), X15Scratch) + .addReg(AArch64::XZR) + .addReg(AArch64::X15, RegState::Undef) + .addReg(AArch64::X15, RegState::Implicit) + .setMIFlag(MachineInstr::FrameSetup); + } + uint64_t NumWords = (NumBytes + RealignmentPadding) >> 4; if (NeedsWinCFI) { HasWinCFI = true; @@ -2104,6 +2125,13 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF, // we've set a frame pointer and already finished the SEH prologue. assert(!NeedsWinCFI); } + if (X15Scratch != AArch64::NoRegister) { + BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXrr), AArch64::X15) + .addReg(AArch64::XZR) + .addReg(X15Scratch, RegState::Undef) + .addReg(X15Scratch, RegState::Implicit) + .setMIFlag(MachineInstr::FrameSetup); + } } StackOffset SVECalleeSavesSize = {}, SVELocalsSize = SVEStackSize; diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp index ad48be4531d3b..f9451d81ae7ae 100644 --- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp @@ -7339,59 +7339,80 @@ static SDValue LowerFLDEXP(SDValue Op, SelectionDAG &DAG) { SDValue AArch64TargetLowering::LowerADJUST_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const { - // Note: x18 cannot be used for the Nest parameter on Windows and macOS. 
- if (Subtarget->isTargetDarwin() || Subtarget->isTargetWindows()) - report_fatal_error( - "ADJUST_TRAMPOLINE operation is only supported on Linux."); - return Op.getOperand(0); } SDValue AArch64TargetLowering::LowerINIT_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const { - - // Note: x18 cannot be used for the Nest parameter on Windows and macOS. - if (Subtarget->isTargetDarwin() || Subtarget->isTargetWindows()) - report_fatal_error("INIT_TRAMPOLINE operation is only supported on Linux."); - SDValue Chain = Op.getOperand(0); - SDValue Trmp = Op.getOperand(1); // trampoline + SDValue Trmp = Op.getOperand(1); // trampoline, >=32 bytes SDValue FPtr = Op.getOperand(2); // nested function SDValue Nest = Op.getOperand(3); // 'nest' parameter value - SDLoc dl(Op); - EVT PtrVT = getPointerTy(DAG.getDataLayout()); - Type *IntPtrTy = DAG.getDataLayout().getIntPtrType(*DAG.getContext()); + const Value *TrmpAddr = cast(Op.getOperand(4))->getValue(); - TargetLowering::ArgListTy Args; - TargetLowering::ArgListEntry Entry; + // ldr NestReg, .+16 + // ldr x17, .+20 + // br x17 + // .word 0 + // .nest: .qword nest + // .fptr: .qword fptr + SDValue OutChains[5]; - Entry.Ty = IntPtrTy; - Entry.Node = Trmp; - Args.push_back(Entry); + const Function *Func = + cast(cast(Op.getOperand(5))->getValue()); + CallingConv::ID CC = Func->getCallingConv(); + unsigned NestReg; - if (auto *FI = dyn_cast(Trmp.getNode())) { - MachineFunction &MF = DAG.getMachineFunction(); - MachineFrameInfo &MFI = MF.getFrameInfo(); - Entry.Node = - DAG.getConstant(MFI.getObjectSize(FI->getIndex()), dl, MVT::i64); - } else - Entry.Node = DAG.getConstant(36, dl, MVT::i64); + switch (CC) { + default: + NestReg = 0x0f; // X15 + case CallingConv::ARM64EC_Thunk_Native: + case CallingConv::ARM64EC_Thunk_X64: + // Must be kept in sync with AArch64CallingConv.td + NestReg = 0x04; // X4 + break; + } - Args.push_back(Entry); - Entry.Node = FPtr; - Args.push_back(Entry); - Entry.Node = Nest; - Args.push_back(Entry); + const 
char FptrReg = 0x11; // X17 - // Lower to a call to __trampoline_setup(Trmp, TrampSize, FPtr, ctx_reg) - TargetLowering::CallLoweringInfo CLI(DAG); - CLI.setDebugLoc(dl).setChain(Chain).setLibCallee( - CallingConv::C, Type::getVoidTy(*DAG.getContext()), - DAG.getExternalSymbol("__trampoline_setup", PtrVT), std::move(Args)); + SDValue Addr = Trmp; - std::pair CallResult = LowerCallTo(CLI); - return CallResult.second; + SDLoc dl(Op); + OutChains[0] = DAG.getStore( + Chain, dl, DAG.getConstant(0x58000080u | NestReg, dl, MVT::i32), Addr, + MachinePointerInfo(TrmpAddr)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(4, dl, MVT::i64)); + OutChains[1] = DAG.getStore( + Chain, dl, DAG.getConstant(0x580000b0u | FptrReg, dl, MVT::i32), Addr, + MachinePointerInfo(TrmpAddr, 4)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(8, dl, MVT::i64)); + OutChains[2] = + DAG.getStore(Chain, dl, DAG.getConstant(0xd61f0220u, dl, MVT::i32), Addr, + MachinePointerInfo(TrmpAddr, 8)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(16, dl, MVT::i64)); + OutChains[3] = + DAG.getStore(Chain, dl, Nest, Addr, MachinePointerInfo(TrmpAddr, 16)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(24, dl, MVT::i64)); + OutChains[4] = + DAG.getStore(Chain, dl, FPtr, Addr, MachinePointerInfo(TrmpAddr, 24)); + + SDValue StoreToken = DAG.getNode(ISD::TokenFactor, dl, MVT::Other, OutChains); + + SDValue EndOfTrmp = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(12, dl, MVT::i64)); + + // Call clear cache on the trampoline instructions. 
+ return DAG.getNode(ISD::CLEAR_CACHE, dl, MVT::Other, StoreToken, Trmp, + EndOfTrmp); } SDValue AArch64TargetLowering::LowerOperation(SDValue Op, diff --git a/llvm/lib/TargetParser/Triple.cpp b/llvm/lib/TargetParser/Triple.cpp index 6a559ff023caa..aa1251f3b9485 100644 --- a/llvm/lib/TargetParser/Triple.cpp +++ b/llvm/lib/TargetParser/Triple.cpp @@ -1732,8 +1732,6 @@ unsigned Triple::getTrampolineSize() const { if (isOSLinux()) return 48; break; - case Triple::aarch64: - return 36; } return 32; } diff --git a/llvm/test/CodeGen/AArch64/nest-register.ll b/llvm/test/CodeGen/AArch64/nest-register.ll index 1e1c1b044bab6..2e94dfba1fa52 100644 --- a/llvm/test/CodeGen/AArch64/nest-register.ll +++ b/llvm/test/CodeGen/AArch64/nest-register.ll @@ -1,3 +1,4 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 ; RUN: llc -disable-post-ra -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu | FileCheck %s ; Tests that the 'nest' parameter attribute causes the relevant parameter to be @@ -5,18 +6,21 @@ define ptr @nest_receiver(ptr nest %arg) nounwind { ; CHECK-LABEL: nest_receiver: -; CHECK-NEXT: // %bb.0: -; CHECK-NEXT: mov x0, x18 -; CHECK-NEXT: ret +; CHECK: // %bb.0: +; CHECK-NEXT: mov x0, x15 +; CHECK-NEXT: ret ret ptr %arg } define ptr @nest_caller(ptr %arg) nounwind { ; CHECK-LABEL: nest_caller: -; CHECK: mov x18, x0 -; CHECK-NEXT: bl nest_receiver -; CHECK: ret +; CHECK: // %bb.0: +; CHECK-NEXT: str x30, [sp, #-16]! 
// 8-byte Folded Spill +; CHECK-NEXT: mov x15, x0 +; CHECK-NEXT: bl nest_receiver +; CHECK-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload +; CHECK-NEXT: ret %result = call ptr @nest_receiver(ptr nest %arg) ret ptr %result diff --git a/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll b/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll index 9619895c450ca..32c3eaeb9c876 100644 --- a/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll +++ b/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll @@ -207,7 +207,7 @@ define void @test_attributes(ptr byval(%struct2) %s) gc "statepoint-example" { ; CHECK-NEXT: .cfi_offset w30, -16 ; CHECK-NEXT: ldr x8, [sp, #64] ; CHECK-NEXT: ldr q0, [sp, #48] -; CHECK-NEXT: mov x18, xzr +; CHECK-NEXT: mov x15, xzr ; CHECK-NEXT: mov w0, #42 // =0x2a ; CHECK-NEXT: mov w1, #17 // =0x11 ; CHECK-NEXT: str x8, [sp, #16] diff --git a/llvm/test/CodeGen/AArch64/trampoline.ll b/llvm/test/CodeGen/AArch64/trampoline.ll index 30ac2aa283b3e..d9016b02a0f80 100644 --- a/llvm/test/CodeGen/AArch64/trampoline.ll +++ b/llvm/test/CodeGen/AArch64/trampoline.ll @@ -1,32 +1,265 @@ -; RUN: llc -mtriple=aarch64-- < %s | FileCheck %s +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: llc -mtriple=aarch64-linux-gnu < %s | FileCheck %s --check-prefixes=CHECK-LINUX +; RUN: llc -mtriple=aarch64-none-eabi < %s | FileCheck %s --check-prefixes=CHECK-LINUX +; RUN: llc -mtriple=aarch64-pc-windows-msvc < %s | FileCheck %s --check-prefix=CHECK-PC +; RUN: llc -mtriple=aarch64-apple-darwin < %s | FileCheck %s --check-prefixes=CHECK-APPLE @trampg = internal global [36 x i8] zeroinitializer, align 8 declare void @llvm.init.trampoline(ptr, ptr, ptr); declare ptr @llvm.adjust.trampoline(ptr); -define i64 @f(ptr nest %c, i64 %x, i64 %y) { - %sum = add i64 %x, %y - ret i64 %sum +define ptr @f(ptr nest %x, i64 %y) { +; CHECK-LINUX-LABEL: f: +; CHECK-LINUX: // %bb.0: +; CHECK-LINUX-NEXT: str x29, [sp, #-16]! 
// 8-byte Folded Spill +; CHECK-LINUX-NEXT: sub sp, sp, #237, lsl #12 // =970752 +; CHECK-LINUX-NEXT: sub sp, sp, #3264 +; CHECK-LINUX-NEXT: .cfi_def_cfa_offset 974032 +; CHECK-LINUX-NEXT: .cfi_offset w29, -16 +; CHECK-LINUX-NEXT: add x0, x15, x0 +; CHECK-LINUX-NEXT: add sp, sp, #237, lsl #12 // =970752 +; CHECK-LINUX-NEXT: add sp, sp, #3264 +; CHECK-LINUX-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload +; CHECK-LINUX-NEXT: ret +; +; CHECK-PC-LABEL: f: +; CHECK-PC: .seh_proc f +; CHECK-PC-NEXT: // %bb.0: +; CHECK-PC-NEXT: stp x29, x30, [sp, #-16]! // 16-byte Folded Spill +; CHECK-PC-NEXT: .seh_save_fplr_x 16 +; CHECK-PC-NEXT: mov x9, x15 +; CHECK-PC-NEXT: mov x15, #60876 // =0xedcc +; CHECK-PC-NEXT: .seh_nop +; CHECK-PC-NEXT: bl __chkstk +; CHECK-PC-NEXT: .seh_nop +; CHECK-PC-NEXT: sub sp, sp, x15, lsl #4 +; CHECK-PC-NEXT: .seh_stackalloc 974016 +; CHECK-PC-NEXT: mov x15, x9 +; CHECK-PC-NEXT: .seh_endprologue +; CHECK-PC-NEXT: add x0, x15, x0 +; CHECK-PC-NEXT: .seh_startepilogue +; CHECK-PC-NEXT: add sp, sp, #237, lsl #12 // =970752 +; CHECK-PC-NEXT: .seh_stackalloc 970752 +; CHECK-PC-NEXT: add sp, sp, #3264 +; CHECK-PC-NEXT: .seh_stackalloc 3264 +; CHECK-PC-NEXT: ldp x29, x30, [sp], #16 // 16-byte Folded Reload +; CHECK-PC-NEXT: .seh_save_fplr_x 16 +; CHECK-PC-NEXT: .seh_endepilogue +; CHECK-PC-NEXT: ret +; CHECK-PC-NEXT: .seh_endfunclet +; CHECK-PC-NEXT: .seh_endproc +; +; CHECK-APPLE-LABEL: f: +; CHECK-APPLE: ; %bb.0: +; CHECK-APPLE-NEXT: stp x28, x27, [sp, #-16]! 
; 16-byte Folded Spill +; CHECK-APPLE-NEXT: sub sp, sp, #237, lsl #12 ; =970752 +; CHECK-APPLE-NEXT: sub sp, sp, #3264 +; CHECK-APPLE-NEXT: .cfi_def_cfa_offset 974032 +; CHECK-APPLE-NEXT: .cfi_offset w27, -8 +; CHECK-APPLE-NEXT: .cfi_offset w28, -16 +; CHECK-APPLE-NEXT: add x0, x15, x0 +; CHECK-APPLE-NEXT: add sp, sp, #237, lsl #12 ; =970752 +; CHECK-APPLE-NEXT: add sp, sp, #3264 +; CHECK-APPLE-NEXT: ldp x28, x27, [sp], #16 ; 16-byte Folded Reload +; CHECK-APPLE-NEXT: ret + %chkstack = alloca [u0xedcba x i8] + %sum = getelementptr i8, ptr %x, i64 %y + ret ptr %sum } define i64 @func1() { +; CHECK-LINUX-LABEL: func1: +; CHECK-LINUX: // %bb.0: +; CHECK-LINUX-NEXT: sub sp, sp, #64 +; CHECK-LINUX-NEXT: str x30, [sp, #48] // 8-byte Folded Spill +; CHECK-LINUX-NEXT: .cfi_def_cfa_offset 64 +; CHECK-LINUX-NEXT: .cfi_offset w30, -16 +; CHECK-LINUX-NEXT: adrp x8, :got:f +; CHECK-LINUX-NEXT: mov w9, #544 // =0x220 +; CHECK-LINUX-NEXT: add x0, sp, #8 +; CHECK-LINUX-NEXT: ldr x8, [x8, :got_lo12:f] +; CHECK-LINUX-NEXT: movk w9, #54815, lsl #16 +; CHECK-LINUX-NEXT: str w9, [sp, #16] +; CHECK-LINUX-NEXT: add x9, sp, #56 +; CHECK-LINUX-NEXT: stp x9, x8, [sp, #24] +; CHECK-LINUX-NEXT: mov x8, #132 // =0x84 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #16 +; CHECK-LINUX-NEXT: movk x8, #177, lsl #32 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #48 +; CHECK-LINUX-NEXT: str x8, [sp, #8] +; CHECK-LINUX-NEXT: add x8, sp, #8 +; CHECK-LINUX-NEXT: add x1, x8, #12 +; CHECK-LINUX-NEXT: bl __clear_cache +; CHECK-LINUX-NEXT: ldr x30, [sp, #48] // 8-byte Folded Reload +; CHECK-LINUX-NEXT: mov x0, xzr +; CHECK-LINUX-NEXT: add sp, sp, #64 +; CHECK-LINUX-NEXT: ret +; +; CHECK-PC-LABEL: func1: +; CHECK-PC: .seh_proc func1 +; CHECK-PC-NEXT: // %bb.0: +; CHECK-PC-NEXT: sub sp, sp, #64 +; CHECK-PC-NEXT: .seh_stackalloc 64 +; CHECK-PC-NEXT: str x30, [sp, #48] // 8-byte Folded Spill +; CHECK-PC-NEXT: .seh_save_reg x30, 48 +; CHECK-PC-NEXT: .seh_endprologue +; CHECK-PC-NEXT: adrp x8, f +; CHECK-PC-NEXT: add x8, 
x8, :lo12:f +; CHECK-PC-NEXT: add x9, sp, #56 +; CHECK-PC-NEXT: stp x9, x8, [sp, #24] +; CHECK-PC-NEXT: mov w8, #544 // =0x220 +; CHECK-PC-NEXT: add x0, sp, #8 +; CHECK-PC-NEXT: movk w8, #54815, lsl #16 +; CHECK-PC-NEXT: str w8, [sp, #16] +; CHECK-PC-NEXT: mov x8, #132 // =0x84 +; CHECK-PC-NEXT: movk x8, #22528, lsl #16 +; CHECK-PC-NEXT: movk x8, #177, lsl #32 +; CHECK-PC-NEXT: movk x8, #22528, lsl #48 +; CHECK-PC-NEXT: str x8, [sp, #8] +; CHECK-PC-NEXT: add x8, sp, #8 +; CHECK-PC-NEXT: add x1, x8, #12 +; CHECK-PC-NEXT: bl __clear_cache +; CHECK-PC-NEXT: mov x0, xzr +; CHECK-PC-NEXT: .seh_startepilogue +; CHECK-PC-NEXT: ldr x30, [sp, #48] // 8-byte Folded Reload +; CHECK-PC-NEXT: .seh_save_reg x30, 48 +; CHECK-PC-NEXT: add sp, sp, #64 +; CHECK-PC-NEXT: .seh_stackalloc 64 +; CHECK-PC-NEXT: .seh_endepilogue +; CHECK-PC-NEXT: ret +; CHECK-PC-NEXT: .seh_endfunclet +; CHECK-PC-NEXT: .seh_endproc +; +; CHECK-APPLE-LABEL: func1: +; CHECK-APPLE: ; %bb.0: +; CHECK-APPLE-NEXT: sub sp, sp, #64 +; CHECK-APPLE-NEXT: stp x29, x30, [sp, #48] ; 16-byte Folded Spill +; CHECK-APPLE-NEXT: .cfi_def_cfa_offset 64 +; CHECK-APPLE-NEXT: .cfi_offset w30, -8 +; CHECK-APPLE-NEXT: .cfi_offset w29, -16 +; CHECK-APPLE-NEXT: Lloh0: +; CHECK-APPLE-NEXT: adrp x8, _f@PAGE +; CHECK-APPLE-NEXT: Lloh1: +; CHECK-APPLE-NEXT: add x8, x8, _f@PAGEOFF +; CHECK-APPLE-NEXT: add x9, sp, #40 +; CHECK-APPLE-NEXT: stp x9, x8, [sp, #16] +; CHECK-APPLE-NEXT: mov w8, #544 ; =0x220 +; CHECK-APPLE-NEXT: mov x0, sp +; CHECK-APPLE-NEXT: movk w8, #54815, lsl #16 +; CHECK-APPLE-NEXT: str w8, [sp, #8] +; CHECK-APPLE-NEXT: mov x8, #132 ; =0x84 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl #16 +; CHECK-APPLE-NEXT: movk x8, #177, lsl #32 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl #48 +; CHECK-APPLE-NEXT: str x8, [sp] +; CHECK-APPLE-NEXT: mov x8, sp +; CHECK-APPLE-NEXT: add x1, x8, #12 +; CHECK-APPLE-NEXT: bl ___clear_cache +; CHECK-APPLE-NEXT: ldp x29, x30, [sp, #48] ; 16-byte Folded Reload +; CHECK-APPLE-NEXT: mov x0, xzr 
+; CHECK-APPLE-NEXT: add sp, sp, #64 +; CHECK-APPLE-NEXT: ret +; CHECK-APPLE-NEXT: .loh AdrpAdd Lloh0, Lloh1 %val = alloca i64 - %nval = bitcast ptr %val to ptr %tramp = alloca [36 x i8], align 8 - ; CHECK: mov w1, #36 - ; CHECK: bl __trampoline_setup - call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval) + call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %val) %fp = call ptr @llvm.adjust.trampoline(ptr %tramp) ret i64 0 } define i64 @func2() { +; CHECK-LINUX-LABEL: func2: +; CHECK-LINUX: // %bb.0: +; CHECK-LINUX-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill +; CHECK-LINUX-NEXT: .cfi_def_cfa_offset 16 +; CHECK-LINUX-NEXT: .cfi_offset w30, -16 +; CHECK-LINUX-NEXT: adrp x8, :got:f +; CHECK-LINUX-NEXT: mov w9, #544 // =0x220 +; CHECK-LINUX-NEXT: adrp x0, trampg +; CHECK-LINUX-NEXT: add x0, x0, :lo12:trampg +; CHECK-LINUX-NEXT: ldr x8, [x8, :got_lo12:f] +; CHECK-LINUX-NEXT: movk w9, #54815, lsl #16 +; CHECK-LINUX-NEXT: str w9, [x0, #8] +; CHECK-LINUX-NEXT: add x9, sp, #8 +; CHECK-LINUX-NEXT: add x1, x0, #12 +; CHECK-LINUX-NEXT: stp x9, x8, [x0, #16] +; CHECK-LINUX-NEXT: mov x8, #132 // =0x84 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #16 +; CHECK-LINUX-NEXT: movk x8, #177, lsl #32 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #48 +; CHECK-LINUX-NEXT: str x8, [x0] +; CHECK-LINUX-NEXT: bl __clear_cache +; CHECK-LINUX-NEXT: mov x0, xzr +; CHECK-LINUX-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload +; CHECK-LINUX-NEXT: ret +; +; CHECK-PC-LABEL: func2: +; CHECK-PC: .seh_proc func2 +; CHECK-PC-NEXT: // %bb.0: +; CHECK-PC-NEXT: str x30, [sp, #-16]! 
// 8-byte Folded Spill +; CHECK-PC-NEXT: .seh_save_reg_x x30, 16 +; CHECK-PC-NEXT: .seh_endprologue +; CHECK-PC-NEXT: adrp x0, trampg +; CHECK-PC-NEXT: add x0, x0, :lo12:trampg +; CHECK-PC-NEXT: adrp x8, f +; CHECK-PC-NEXT: add x8, x8, :lo12:f +; CHECK-PC-NEXT: add x9, sp, #8 +; CHECK-PC-NEXT: add x1, x0, #12 +; CHECK-PC-NEXT: stp x9, x8, [x0, #16] +; CHECK-PC-NEXT: mov w8, #544 // =0x220 +; CHECK-PC-NEXT: movk w8, #54815, lsl #16 +; CHECK-PC-NEXT: str w8, [x0, #8] +; CHECK-PC-NEXT: mov x8, #132 // =0x84 +; CHECK-PC-NEXT: movk x8, #22528, lsl #16 +; CHECK-PC-NEXT: movk x8, #177, lsl #32 +; CHECK-PC-NEXT: movk x8, #22528, lsl #48 +; CHECK-PC-NEXT: str x8, [x0] +; CHECK-PC-NEXT: bl __clear_cache +; CHECK-PC-NEXT: mov x0, xzr +; CHECK-PC-NEXT: .seh_startepilogue +; CHECK-PC-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload +; CHECK-PC-NEXT: .seh_save_reg_x x30, 16 +; CHECK-PC-NEXT: .seh_endepilogue +; CHECK-PC-NEXT: ret +; CHECK-PC-NEXT: .seh_endfunclet +; CHECK-PC-NEXT: .seh_endproc +; +; CHECK-APPLE-LABEL: func2: +; CHECK-APPLE: ; %bb.0: +; CHECK-APPLE-NEXT: sub sp, sp, #32 +; CHECK-APPLE-NEXT: stp x29, x30, [sp, #16] ; 16-byte Folded Spill +; CHECK-APPLE-NEXT: .cfi_def_cfa_offset 32 +; CHECK-APPLE-NEXT: .cfi_offset w30, -8 +; CHECK-APPLE-NEXT: .cfi_offset w29, -16 +; CHECK-APPLE-NEXT: Lloh2: +; CHECK-APPLE-NEXT: adrp x0, _trampg@PAGE +; CHECK-APPLE-NEXT: Lloh3: +; CHECK-APPLE-NEXT: add x0, x0, _trampg@PAGEOFF +; CHECK-APPLE-NEXT: Lloh4: +; CHECK-APPLE-NEXT: adrp x8, _f@PAGE +; CHECK-APPLE-NEXT: Lloh5: +; CHECK-APPLE-NEXT: add x8, x8, _f@PAGEOFF +; CHECK-APPLE-NEXT: add x9, sp, #8 +; CHECK-APPLE-NEXT: add x1, x0, #12 +; CHECK-APPLE-NEXT: stp x9, x8, [x0, #16] +; CHECK-APPLE-NEXT: mov w8, #544 ; =0x220 +; CHECK-APPLE-NEXT: movk w8, #54815, lsl #16 +; CHECK-APPLE-NEXT: str w8, [x0, #8] +; CHECK-APPLE-NEXT: mov x8, #132 ; =0x84 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl #16 +; CHECK-APPLE-NEXT: movk x8, #177, lsl #32 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl 
#48 +; CHECK-APPLE-NEXT: str x8, [x0] +; CHECK-APPLE-NEXT: bl ___clear_cache +; CHECK-APPLE-NEXT: ldp x29, x30, [sp, #16] ; 16-byte Folded Reload +; CHECK-APPLE-NEXT: mov x0, xzr +; CHECK-APPLE-NEXT: add sp, sp, #32 +; CHECK-APPLE-NEXT: ret +; CHECK-APPLE-NEXT: .loh AdrpAdd Lloh4, Lloh5 +; CHECK-APPLE-NEXT: .loh AdrpAdd Lloh2, Lloh3 %val = alloca i64 - %nval = bitcast ptr %val to ptr - ; CHECK: mov w1, #36 - ; CHECK: bl __trampoline_setup - call void @llvm.init.trampoline(ptr @trampg, ptr @f, ptr %nval) + call void @llvm.init.trampoline(ptr @trampg, ptr @f, ptr %val) %fp = call ptr @llvm.adjust.trampoline(ptr @trampg) ret i64 0 } diff --git a/llvm/test/CodeGen/AArch64/win64cc-x18.ll b/llvm/test/CodeGen/AArch64/win64cc-x18.ll index b3e78cc9bbb81..4b45c300e9c1d 100644 --- a/llvm/test/CodeGen/AArch64/win64cc-x18.ll +++ b/llvm/test/CodeGen/AArch64/win64cc-x18.ll @@ -1,35 +1,26 @@ -; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +;; Testing that nest uses x15 on all calling conventions (except Arm64EC) -;; Testing that x18 is not clobbered when passing pointers with the nest -;; attribute on windows - -; RUN: llc < %s -mtriple=aarch64-pc-windows-msvc | FileCheck %s --check-prefixes=CHECK,CHECK-NO-X18 -; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s --check-prefixes=CHECK,CHECK-X18 +; RUN: llc < %s -mtriple=aarch64-pc-windows-msvc | FileCheck %s +; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s +; RUN: llc < %s -mtriple=aarch64-apple-darwin- | FileCheck %s define dso_local i64 @other(ptr nest %p) #0 { ; CHECK-LABEL: other: -; CHECK-X18: ldr x0, [x18] -; CHECK-NO-X18: ldr x0, [x0] +; CHECK: ldr x0, [x15] +; CHECK: ret %r = load i64, ptr %p -; CHECK: ret ret i64 %r } define dso_local void @func() #0 { ; CHECK-LABEL: func: - - +; CHECK: add x15, sp, #8 +; CHECK: bl {{_?other}} +; CHECK: ret entry: %p = alloca i64 -; CHECK: mov w8, #1 -; CHECK: stp x30, x8, [sp, #-16] -; CHECK-X18: add x18, sp, #8 store i64 1, ptr %p -; 
CHECK-NO-X18: add x0, sp, #8 -; CHECK: bl other call void @other(ptr nest %p) -; CHECK: ldr x30, [sp], #16 -; CHECK: ret ret void } diff --git a/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll b/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll index 4799ea3bcd19f..986666e015e9e 100644 --- a/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll +++ b/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll @@ -93,7 +93,7 @@ define dso_local i32 @all_gpr_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c ; CHECK-NEXT: mov x5, #0 // =0x0 ; CHECK-NEXT: mov x6, #0 // =0x0 ; CHECK-NEXT: mov x7, #0 // =0x0 -; CHECK-NEXT: mov x18, #0 // =0x0 +; CHECK-NEXT: mov x15, #0 // =0x0 ; CHECK-NEXT: orr w0, w8, w2 ; CHECK-NEXT: mov x2, #0 // =0x0 ; CHECK-NEXT: mov x8, #0 // =0x0 @@ -146,7 +146,7 @@ define dso_local i32 @all_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) lo ; DEFAULT-NEXT: mov x5, #0 // =0x0 ; DEFAULT-NEXT: mov x6, #0 // =0x0 ; DEFAULT-NEXT: mov x7, #0 // =0x0 -; DEFAULT-NEXT: mov x18, #0 // =0x0 +; DEFAULT-NEXT: mov x15, #0 // =0x0 ; DEFAULT-NEXT: movi v0.2d, #0000000000000000 ; DEFAULT-NEXT: orr w0, w8, w2 ; DEFAULT-NEXT: mov x2, #0 // =0x0 @@ -169,7 +169,7 @@ define dso_local i32 @all_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) lo ; SVE-OR-SME-NEXT: mov x5, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x6, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x7, #0 // =0x0 -; SVE-OR-SME-NEXT: mov x18, #0 // =0x0 +; SVE-OR-SME-NEXT: mov x15, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z0.d, #0 // =0x0 ; SVE-OR-SME-NEXT: orr w0, w8, w2 ; SVE-OR-SME-NEXT: mov x2, #0 // =0x0 @@ -196,7 +196,7 @@ define dso_local i32 @all_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) lo ; STREAMING-COMPAT-NEXT: mov x5, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x6, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x7, #0 // =0x0 -; STREAMING-COMPAT-NEXT: mov x18, #0 // =0x0 +; STREAMING-COMPAT-NEXT: mov x15, #0 // =0x0 ; STREAMING-COMPAT-NEXT: fmov d0, xzr ; STREAMING-COMPAT-NEXT: orr w0, w8, w2 ; STREAMING-COMPAT-NEXT: mov x2, #0 
// =0x0 @@ -492,7 +492,7 @@ define dso_local double @all_gpr_arg_float(double noundef %a, float noundef %b) ; CHECK-NEXT: mov x6, #0 // =0x0 ; CHECK-NEXT: mov x7, #0 // =0x0 ; CHECK-NEXT: mov x8, #0 // =0x0 -; CHECK-NEXT: mov x18, #0 // =0x0 +; CHECK-NEXT: mov x15, #0 // =0x0 ; CHECK-NEXT: ret entry: @@ -547,7 +547,7 @@ define dso_local double @all_arg_float(double noundef %a, float noundef %b) loca ; DEFAULT-NEXT: mov x6, #0 // =0x0 ; DEFAULT-NEXT: mov x7, #0 // =0x0 ; DEFAULT-NEXT: mov x8, #0 // =0x0 -; DEFAULT-NEXT: mov x18, #0 // =0x0 +; DEFAULT-NEXT: mov x15, #0 // =0x0 ; DEFAULT-NEXT: movi v1.2d, #0000000000000000 ; DEFAULT-NEXT: movi v2.2d, #0000000000000000 ; DEFAULT-NEXT: movi v3.2d, #0000000000000000 @@ -570,7 +570,7 @@ define dso_local double @all_arg_float(double noundef %a, float noundef %b) loca ; SVE-OR-SME-NEXT: mov x6, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x7, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x8, #0 // =0x0 -; SVE-OR-SME-NEXT: mov x18, #0 // =0x0 +; SVE-OR-SME-NEXT: mov x15, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z1.d, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z2.d, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z3.d, #0 // =0x0 @@ -597,7 +597,7 @@ define dso_local double @all_arg_float(double noundef %a, float noundef %b) loca ; STREAMING-COMPAT-NEXT: mov x6, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x7, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x8, #0 // =0x0 -; STREAMING-COMPAT-NEXT: mov x18, #0 // =0x0 +; STREAMING-COMPAT-NEXT: mov x15, #0 // =0x0 ; STREAMING-COMPAT-NEXT: fmov d1, xzr ; STREAMING-COMPAT-NEXT: fmov d2, xzr ; STREAMING-COMPAT-NEXT: fmov d3, xzr >From d0aa2c4524a854153d85e172c913b5923d418efe Mon Sep 17 00:00:00 2001 From: Jameson Nash Date: Tue, 20 May 2025 20:29:49 +0000 Subject: [PATCH 2/2] choose scratch register more carefully --- .../Target/AArch64/AArch64FrameLowering.cpp | 55 +++++++++++-------- 1 file changed, 32 insertions(+), 23 deletions(-) diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp 
index 96f4451182391..b31e012fcf77f 100644 --- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp @@ -327,7 +327,8 @@ static int64_t getArgumentStackToRestore(MachineFunction &MF, static bool produceCompactUnwindFrame(MachineFunction &MF); static bool needsWinCFI(const MachineFunction &MF); static StackOffset getSVEStackSize(const MachineFunction &MF); -static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB); +static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB, bool HasCall=false); +static bool requiresSaveVG(const MachineFunction &MF); /// Returns true if a homogeneous prolog or epilog code can be emitted /// for the size optimization. If possible, a frame helper call is injected. @@ -1002,6 +1003,16 @@ void AArch64FrameLowering::emitZeroCallUsedRegs(BitVector RegsToZero, } } +static bool windowsRequiresStackProbe(const MachineFunction &MF, + uint64_t StackSizeInBytes) { + const AArch64Subtarget &Subtarget = MF.getSubtarget(); + const AArch64FunctionInfo &MFI = *MF.getInfo(); + // TODO: When implementing stack protectors, take that into account + // for the probe threshold. + return Subtarget.isTargetWindows() && MFI.hasStackProbing() && + StackSizeInBytes >= uint64_t(MFI.getStackProbeSize()); +} + static void getLiveRegsForEntryMBB(LivePhysRegs &LiveRegs, const MachineBasicBlock &MBB) { const MachineFunction *MF = MBB.getParent(); @@ -1023,7 +1034,7 @@ static void getLiveRegsForEntryMBB(LivePhysRegs &LiveRegs, // but we would then have to make sure that we were in fact saving at least one // callee-save register in the prologue, which is additional complexity that // doesn't seem worth the benefit. 
-static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB) { +static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB, bool HasCall) { MachineFunction *MF = MBB->getParent(); // If MBB is an entry block, use X9 as the scratch register @@ -1037,6 +1048,11 @@ static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB) { const AArch64RegisterInfo &TRI = *Subtarget.getRegisterInfo(); LivePhysRegs LiveRegs(TRI); getLiveRegsForEntryMBB(LiveRegs, *MBB); + if (HasCall) { + LiveRegs.addReg(AArch64::X16); + LiveRegs.addReg(AArch64::X17); + LiveRegs.addReg(AArch64::X18); + } // Prefer X9 since it was historically used for the prologue scratch reg. const MachineRegisterInfo &MRI = MF->getRegInfo(); @@ -1077,23 +1093,16 @@ bool AArch64FrameLowering::canUseAsPrologue( MBB.isLiveIn(AArch64::NZCV)) return false; - // Don't need a scratch register if we're not going to re-align the stack or - // emit stack probes. - if (!RegInfo->hasStackRealignment(*MF) && !TLI->hasInlineStackProbe(*MF)) - return true; - // Otherwise, we can use any block as long as it has a scratch register - // available. - return findScratchNonCalleeSaveRegister(TmpMBB) != AArch64::NoRegister; -} + if (RegInfo->hasStackRealignment(*MF) || TLI->hasInlineStackProbe(*MF)) + if (findScratchNonCalleeSaveRegister(TmpMBB) == AArch64::NoRegister) + return false; -static bool windowsRequiresStackProbe(MachineFunction &MF, - uint64_t StackSizeInBytes) { - const AArch64Subtarget &Subtarget = MF.getSubtarget(); - const AArch64FunctionInfo &MFI = *MF.getInfo(); - // TODO: When implementing stack protectors, take that into account - // for the probe threshold. 
- return Subtarget.isTargetWindows() && MFI.hasStackProbing() && - StackSizeInBytes >= uint64_t(MFI.getStackProbeSize()); + // May need a scratch register (for return value) if require making a special call + if (requiresSaveVG(*MF) || windowsRequiresStackProbe(*MF, std::numeric_limits::max())) + if (findScratchNonCalleeSaveRegister(TmpMBB, true) == AArch64::NoRegister) + return false; + + return true; } static bool needsWinCFI(const MachineFunction &MF) { @@ -1356,8 +1365,8 @@ bool requiresGetVGCall(MachineFunction &MF) { !MF.getSubtarget().hasSVE(); } -static bool requiresSaveVG(MachineFunction &MF) { - AArch64FunctionInfo *AFI = MF.getInfo(); +static bool requiresSaveVG(const MachineFunction &MF) { + const AArch64FunctionInfo *AFI = MF.getInfo(); // For Darwin platforms we don't save VG for non-SVE functions, even if SME // is enabled with streaming mode changes. if (!AFI->hasStreamingModeChanges()) @@ -1991,8 +2000,8 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF, return STI.getRegisterInfo()->isSuperOrSubRegisterEq( AArch64::X15, LiveIn.PhysReg); })) { - X15Scratch = findScratchNonCalleeSaveRegister(&MBB); - assert(X15Scratch != AArch64::NoRegister); + X15Scratch = findScratchNonCalleeSaveRegister(&MBB, true); + assert(X15Scratch != AArch64::NoRegister && (X15Scratch < AArch64::X15 || X15Scratch > AArch64::X17)); #ifndef NDEBUG LiveRegs.removeReg(AArch64::X15); // ignore X15 since we restore it #endif @@ -3236,7 +3245,7 @@ bool AArch64FrameLowering::spillCalleeSavedRegisters( unsigned X0Scratch = AArch64::NoRegister; if (Reg1 == AArch64::VG) { // Find an available register to store value of VG to. 
- Reg1 = findScratchNonCalleeSaveRegister(&MBB); + Reg1 = findScratchNonCalleeSaveRegister(&MBB, true); assert(Reg1 != AArch64::NoRegister); SMEAttrs Attrs(MF.getFunction()); From flang-commits at lists.llvm.org Tue May 20 14:13:51 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 20 May 2025 14:13:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Added noalias attribute to function arguments. (PR #140803) In-Reply-To: Message-ID: <682cf08f.170a0220.143cb0.488a@mx.google.com> https://github.com/vzakhari ready_for_review https://github.com/llvm/llvm-project/pull/140803 From flang-commits at lists.llvm.org Tue May 20 14:14:25 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 14:14:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Added noalias attribute to function arguments. (PR #140803) In-Reply-To: Message-ID: <682cf0b1.050a0220.1fc0a7.7299@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Slava Zakharin (vzakhari)
Changes

This helps to disambiguate accesses in the caller and the callee after LLVM inlining in some apps. I did not see any performance changes, but this is one step towards enabling other optimizations in the apps that I am looking at.

The definition of llvm.noalias says:
```
... indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function.
```

I believe this exactly matches Fortran rules for the dummy arguments that are modified during their subprogram execution.

I also set llvm.noalias and llvm.nocapture on the !fir.box<> arguments, because the corresponding descriptors cannot be captured and cannot alias anything (not based on them) during the execution of the subprogram.

---
Patch is 52.48 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140803.diff

30 Files Affected:

- (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+6)
- (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+5-1)
- (modified) flang/lib/Optimizer/Transforms/FunctionAttr.cpp (+17-12)
- (modified) flang/test/Fir/array-coor.fir (+1-1)
- (modified) flang/test/Fir/arrayset.fir (+1-1)
- (modified) flang/test/Fir/arrexp.fir (+9-9)
- (modified) flang/test/Fir/box-offset-codegen.fir (+4-4)
- (modified) flang/test/Fir/box-typecode.fir (+1-1)
- (modified) flang/test/Fir/box.fir (+9-9)
- (modified) flang/test/Fir/boxproc.fir (+2-2)
- (modified) flang/test/Fir/commute.fir (+1-1)
- (modified) flang/test/Fir/coordinateof.fir (+1-1)
- (modified) flang/test/Fir/embox.fir (+4-4)
- (modified) flang/test/Fir/field-index.fir (+2-2)
- (modified) flang/test/Fir/ignore-missing-type-descriptor.fir (+1-1)
- (modified) flang/test/Fir/polymorphic.fir (+1-1)
- (modified) flang/test/Fir/rebox.fir (+6-6)
- (modified) flang/test/Fir/struct-passing-x86-64-byval.fir (+24-24)
- (modified) flang/test/Fir/target-rewrite-complex-10-x86.fir (+1-1)
- (modified) flang/test/Fir/target.fir (+4-4)
- (modified) flang/test/Fir/tbaa-codegen.fir (+1-1)
- (modified) flang/test/Fir/tbaa-codegen2.fir (+1-1)
- (modified) flang/test/Integration/OpenMP/copyprivate.f90 (+17-17)
- (modified) flang/test/Integration/debug-local-var-2.f90 (+2-2)
- (modified) flang/test/Integration/unroll-loops.f90 (+1-1)
- (modified) flang/test/Lower/HLFIR/unroll-loops.fir (+1-1)
- (modified) flang/test/Lower/forall/character-1.f90 (+1-1)
- (modified) flang/test/Transforms/constant-argument-globalisation.fir (+2-2)
- (added) flang/test/Transforms/function-attrs-noalias.fir (+113)
- (modified) flang/test/Transforms/function-attrs.fir (+26-1)

``````````diff
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index c0d88a8e19f80..e1497aeb3aa36 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -418,6 +418,12 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> {
 "Set the unsafe-fp-math attribute on functions in the module.">,
 Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"",
 "Set the tune-cpu attribute on functions in the module.">,
+ Option<"setNoCapture", "set-nocapture", "bool", /*default=*/"false",
+ "Set LLVM nocapture attribute on function arguments, "
+ "if possible">,
+ Option<"setNoAlias", "set-noalias", "bool", /*default=*/"false",
+ "Set LLVM noalias attribute on function arguments, "
+ "if possible">,
 ];
 }
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 77751908e35be..378913fcb1329 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -350,11 +350,15 @@ void
createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, else framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None; + bool setNoCapture = false, setNoAlias = false; + if (config.OptLevel.isOptimizingForSpeed()) + setNoCapture = setNoAlias = true; + pm.addPass(fir::createFunctionAttr( {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - ""})); + /*tuneCPU=*/"", setNoCapture, setNoAlias})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 43e4c1a7af3cd..c8cdba0d6f9c4 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -27,17 +27,8 @@ namespace { class FunctionAttrPass : public fir::impl::FunctionAttrBase { public: - FunctionAttrPass(const fir::FunctionAttrOptions &options) { - instrumentFunctionEntry = options.instrumentFunctionEntry; - instrumentFunctionExit = options.instrumentFunctionExit; - framePointerKind = options.framePointerKind; - noInfsFPMath = options.noInfsFPMath; - noNaNsFPMath = options.noNaNsFPMath; - approxFuncFPMath = options.approxFuncFPMath; - noSignedZerosFPMath = options.noSignedZerosFPMath; - unsafeFPMath = options.unsafeFPMath; - } - FunctionAttrPass() {} + FunctionAttrPass(const fir::FunctionAttrOptions &options) : Base{options} {} + FunctionAttrPass() = default; void runOnOperation() override; }; @@ -56,14 +47,28 @@ void FunctionAttrPass::runOnOperation() { if ((isFromModule || !func.isDeclaration()) && !fir::hasBindcAttr(func.getOperation())) { llvm::StringRef nocapture = mlir::LLVM::LLVMDialect::getNoCaptureAttrName(); + llvm::StringRef noalias = mlir::LLVM::LLVMDialect::getNoAliasAttrName(); mlir::UnitAttr unitAttr = mlir::UnitAttr::get(func.getContext()); for (auto [index, argType] : 
llvm::enumerate(func.getArgumentTypes())) { + bool isNoCapture = false; + bool isNoAlias = false; if (mlir::isa(argType) && !func.getArgAttr(index, fir::getTargetAttrName()) && !func.getArgAttr(index, fir::getAsynchronousAttrName()) && - !func.getArgAttr(index, fir::getVolatileAttrName())) + !func.getArgAttr(index, fir::getVolatileAttrName())) { + isNoCapture = true; + isNoAlias = !fir::isPointerType(argType); + } else if (mlir::isa(argType)) { + // !fir.box arguments will be passed as descriptor pointers + // at LLVM IR dialect level - they cannot be captured, + // and cannot alias with anything within the function. + isNoCapture = isNoAlias = true; + } + if (isNoCapture && setNoCapture) func.setArgAttr(index, nocapture, unitAttr); + if (isNoAlias && setNoAlias) + func.setArgAttr(index, noalias, unitAttr); } } diff --git a/flang/test/Fir/array-coor.fir b/flang/test/Fir/array-coor.fir index a765670d20b28..2caa727a10c50 100644 --- a/flang/test/Fir/array-coor.fir +++ b/flang/test/Fir/array-coor.fir @@ -33,7 +33,7 @@ func.func @test_array_coor_box_component_slice(%arg0: !fir.box) -> () // CHECK-LABEL: define void @test_array_coor_box_component_slice( -// CHECK-SAME: ptr %[[VAL_0:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[VAL_0:.*]]) // CHECK: %[[VAL_1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[VAL_0]], i32 0, i32 7, i32 0, i32 2 // CHECK: %[[VAL_2:.*]] = load i64, ptr %[[VAL_1]] // CHECK: %[[VAL_3:.*]] = mul nsw i64 1, %[[VAL_2]] diff --git a/flang/test/Fir/arrayset.fir b/flang/test/Fir/arrayset.fir index dab939aba1702..cb26971cb962d 100644 --- a/flang/test/Fir/arrayset.fir +++ b/flang/test/Fir/arrayset.fir @@ -1,7 +1,7 @@ // RUN: tco %s | FileCheck %s // RUN: %flang_fc1 -emit-llvm %s -o - | FileCheck %s -// CHECK-LABEL: define void @x(ptr captures(none) %0) +// CHECK-LABEL: define void @x( func.func @x(%arr : !fir.ref>) { %1 = arith.constant 0 : index %2 = arith.constant 9 : index diff --git a/flang/test/Fir/arrexp.fir 
b/flang/test/Fir/arrexp.fir index 924c1fab8d84b..6c7f71f6f1f9c 100644 --- a/flang/test/Fir/arrexp.fir +++ b/flang/test/Fir/arrexp.fir @@ -1,7 +1,7 @@ // RUN: tco %s | FileCheck %s // CHECK-LABEL: define void @f1 -// CHECK: (ptr captures(none) %[[A:[^,]*]], {{.*}}, float %[[F:.*]]) +// CHECK: (ptr {{[^%]*}}%[[A:[^,]*]], {{.*}}, float %[[F:.*]]) func.func @f1(%a : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -23,7 +23,7 @@ func.func @f1(%a : !fir.ref>, %n : index, %m : index, %o : i } // CHECK-LABEL: define void @f2 -// CHECK: (ptr captures(none) %[[A:[^,]*]], {{.*}}, float %[[F:.*]]) +// CHECK: (ptr {{[^%]*}}%[[A:[^,]*]], {{.*}}, float %[[F:.*]]) func.func @f2(%a : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -47,7 +47,7 @@ func.func @f2(%a : !fir.ref>, %b : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -72,7 +72,7 @@ func.func @f3(%a : !fir.ref>, %b : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -102,7 +102,7 @@ func.func @f4(%a : !fir.ref>, %b : !fir.ref>, %arg1: !fir.box>, %arg2: f32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -135,7 +135,7 @@ func.func @f5(%arg0: !fir.box>, %arg1: !fir.box>, %arg1: f32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -165,7 +165,7 @@ func.func @f6(%arg0: !fir.box>, %arg1: f32) { // Non contiguous array with lower bounds (x = y(100), with y(4:)) // Test 
array_coor offset computation. // CHECK-LABEL: define void @f7( -// CHECK: ptr captures(none) %[[X:[^,]*]], ptr %[[Y:.*]]) +// CHECK: ptr {{[^%]*}}%[[X:[^,]*]], ptr {{[^%]*}}%[[Y:.*]]) func.func @f7(%arg0: !fir.ref, %arg1: !fir.box>) { %c4 = arith.constant 4 : index %c100 = arith.constant 100 : index @@ -181,7 +181,7 @@ func.func @f7(%arg0: !fir.ref, %arg1: !fir.box>) { // Test A(:, :)%x reference codegen with A constant shape. // CHECK-LABEL: define void @f8( -// CHECK-SAME: ptr captures(none) %[[A:.*]], i32 %[[I:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[A:.*]], i32 %[[I:.*]]) func.func @f8(%a : !fir.ref>>, %i : i32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -198,7 +198,7 @@ func.func @f8(%a : !fir.ref>>, %i : i32) { // Test casts in in array_coor offset computation when type parameters are not i64 // CHECK-LABEL: define ptr @f9( -// CHECK-SAME: i32 %[[I:.*]], i64 %{{.*}}, i64 %{{.*}}, ptr captures(none) %[[C:.*]]) +// CHECK-SAME: i32 %[[I:.*]], i64 %{{.*}}, i64 %{{.*}}, ptr {{[^%]*}}%[[C:.*]]) func.func @f9(%i: i32, %e : i64, %j: i64, %c: !fir.ref>>) -> !fir.ref> { %s = fir.shape %e, %e : (i64, i64) -> !fir.shape<2> // CHECK: %[[CAST:.*]] = sext i32 %[[I]] to i64 diff --git a/flang/test/Fir/box-offset-codegen.fir b/flang/test/Fir/box-offset-codegen.fir index 15c9a11e5aefe..11d5750ffc385 100644 --- a/flang/test/Fir/box-offset-codegen.fir +++ b/flang/test/Fir/box-offset-codegen.fir @@ -7,7 +7,7 @@ func.func @scalar_addr(%scalar : !fir.ref>>) -> !fir.llvm_ return %addr : !fir.llvm_ptr>> } // CHECK-LABEL: define ptr @scalar_addr( -// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 0 // CHECK: ret ptr %[[VAL_0]] @@ -16,7 +16,7 @@ func.func @scalar_tdesc(%scalar : !fir.ref>>) -> !fir.llvm return %tdesc : !fir.llvm_ptr>> } // CHECK-LABEL: define ptr @scalar_tdesc( -// CHECK-SAME: ptr 
captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 7 // CHECK: ret ptr %[[VAL_0]] @@ -25,7 +25,7 @@ func.func @array_addr(%array : !fir.ref>>> } // CHECK-LABEL: define ptr @array_addr( -// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 0 // CHECK: ret ptr %[[VAL_0]] @@ -34,6 +34,6 @@ func.func @array_tdesc(%array : !fir.ref>> } // CHECK-LABEL: define ptr @array_tdesc( -// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 8 // CHECK: ret ptr %[[VAL_0]] diff --git a/flang/test/Fir/box-typecode.fir b/flang/test/Fir/box-typecode.fir index 766c5165b947c..a8d43eba39889 100644 --- a/flang/test/Fir/box-typecode.fir +++ b/flang/test/Fir/box-typecode.fir @@ -6,7 +6,7 @@ func.func @test_box_typecode(%a: !fir.class) -> i32 { } // CHECK-LABEL: @test_box_typecode( -// CHECK-SAME: ptr %[[BOX:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]) // CHECK: %[[GEP:.*]] = getelementptr { ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}} }, ptr %[[BOX]], i32 0, i32 4 // CHECK: %[[TYPE_CODE:.*]] = load i8, ptr %[[GEP]] // CHECK: %[[TYPE_CODE_CONV:.*]] = sext i8 %[[TYPE_CODE]] to i32 diff --git a/flang/test/Fir/box.fir b/flang/test/Fir/box.fir index 5e931a2e0d9aa..c0cf3d8375983 100644 --- a/flang/test/Fir/box.fir +++ b/flang/test/Fir/box.fir @@ -24,7 +24,7 @@ func.func private @g(%b : !fir.box) func.func private @ga(%b : !fir.box>) // CHECK-LABEL: define void @f -// CHECK: (ptr captures(none) %[[ARG:.*]]) +// CHECK: (ptr {{[^%]*}}%[[ARG:.*]]) func.func @f(%a : !fir.ref) { // 
CHECK: %[[DESC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 } // CHECK: %[[INS0:.*]] = insertvalue {{.*}} { ptr undef, i64 4, i32 20240719, i8 0, i8 27, i8 0, i8 0 }, ptr %[[ARG]], 0 @@ -38,7 +38,7 @@ func.func @f(%a : !fir.ref) { } // CHECK-LABEL: define void @fa -// CHECK: (ptr captures(none) %[[ARG:.*]]) +// CHECK: (ptr {{[^%]*}}%[[ARG:.*]]) func.func @fa(%a : !fir.ref>) { %c = fir.convert %a : (!fir.ref>) -> !fir.ref> %c1 = arith.constant 1 : index @@ -54,7 +54,7 @@ func.func @fa(%a : !fir.ref>) { // Boxing of a scalar character of dynamic length // CHECK-LABEL: define void @b1( -// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]]) func.func @b1(%arg0 : !fir.ref>, %arg1 : index) -> !fir.box> { // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 } // CHECK: %[[size:.*]] = mul i64 1, %[[arg1]] @@ -69,8 +69,8 @@ func.func @b1(%arg0 : !fir.ref>, %arg1 : index) -> !fir.box>>, %arg1 : index) -> !fir.box>> { %1 = fir.shape %arg1 : (index) -> !fir.shape<1> // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } @@ -85,7 +85,7 @@ func.func @b2(%arg0 : !fir.ref>>, %arg1 : index) -> // Boxing of a dynamic array of character of dynamic length // CHECK-LABEL: define void @b3( -// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]], i64 %[[arg2:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]], i64 %[[arg2:.*]]) func.func @b3(%arg0 : !fir.ref>>, %arg1 : index, %arg2 : index) -> !fir.box>> { %1 = fir.shape %arg2 : (index) -> !fir.shape<1> // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } @@ -103,7 +103,7 @@ func.func @b3(%arg0 : !fir.ref>>, %arg1 : index, %ar // Boxing of a static array of character of dynamic length // CHECK-LABEL: define void @b4( -// CHECK-SAME: ptr captures(none) 
%[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]]) func.func @b4(%arg0 : !fir.ref>>, %arg1 : index) -> !fir.box>> { %c_7 = arith.constant 7 : index %1 = fir.shape %c_7 : (index) -> !fir.shape<1> @@ -122,7 +122,7 @@ func.func @b4(%arg0 : !fir.ref>>, %arg1 : index) -> // Storing a fir.box into a fir.ref (modifying descriptors). // CHECK-LABEL: define void @b5( -// CHECK-SAME: ptr captures(none) %[[arg0:.*]], ptr %[[arg1:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[arg0:.*]], ptr {{[^%]*}}%[[arg1:.*]]) func.func @b5(%arg0 : !fir.ref>>>, %arg1 : !fir.box>>) { fir.store %arg1 to %arg0 : !fir.ref>>> // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr %0, ptr %1, i32 72, i1 false) @@ -132,7 +132,7 @@ func.func @b5(%arg0 : !fir.ref>>>, %arg1 func.func private @callee6(!fir.box) -> i32 // CHECK-LABEL: define i32 @box6( -// CHECK-SAME: ptr captures(none) %[[ARG0:.*]], i64 %[[ARG1:.*]], i64 %[[ARG2:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[ARG0:.*]], i64 %[[ARG1:.*]], i64 %[[ARG2:.*]]) func.func @box6(%0 : !fir.ref>, %1 : index, %2 : index) -> i32 { %c100 = arith.constant 100 : index %c50 = arith.constant 50 : index diff --git a/flang/test/Fir/boxproc.fir b/flang/test/Fir/boxproc.fir index e99dfd0b92afd..5d82522055adc 100644 --- a/flang/test/Fir/boxproc.fir +++ b/flang/test/Fir/boxproc.fir @@ -16,7 +16,7 @@ // CHECK: call void @_QPtest_proc_dummy_other(ptr %[[VAL_6]]) // CHECK-LABEL: define void @_QFtest_proc_dummyPtest_proc_dummy_a(ptr -// CHECK-SAME: captures(none) %[[VAL_0:.*]], ptr nest captures(none) %[[VAL_1:.*]]) +// CHECK-SAME: {{[^%]*}}%[[VAL_0:.*]], ptr nest {{[^%]*}}%[[VAL_1:.*]]) // CHECK-LABEL: define void @_QPtest_proc_dummy_other(ptr // CHECK-SAME: %[[VAL_0:.*]]) @@ -92,7 +92,7 @@ func.func @_QPtest_proc_dummy_other(%arg0: !fir.boxproc<() -> ()>) { // CHECK: call void @llvm.stackrestore.p0(ptr %[[VAL_27]]) // CHECK-LABEL: define { ptr, i64 } 
@_QFtest_proc_dummy_charPgen_message(ptr -// CHECK-SAME: captures(none) %[[VAL_0:.*]], i64 %[[VAL_1:.*]], ptr nest captures(none) %[[VAL_2:.*]]) +// CHECK-SAME: {{[^%]*}}%[[VAL_0:.*]], i64 %[[VAL_1:.*]], ptr nest {{[^%]*}}%[[VAL_2:.*]]) // CHECK: %[[VAL_3:.*]] = getelementptr { { ptr, i64 } }, ptr %[[VAL_2]], i32 0, i32 0 // CHECK: %[[VAL_4:.*]] = load { ptr, i64 }, ptr %[[VAL_3]], align 8 // CHECK: %[[VAL_5:.*]] = extractvalue { ptr, i64 } %[[VAL_4]], 0 diff --git a/flang/test/Fir/commute.fir b/flang/test/Fir/commute.fir index a857ba55b00c5..8713c8ff24e7f 100644 --- a/flang/test/Fir/commute.fir +++ b/flang/test/Fir/commute.fir @@ -11,7 +11,7 @@ func.func @f1(%a : i32, %b : i32) -> i32 { return %3 : i32 } -// CHECK-LABEL: define i32 @f2(ptr captures(none) %0) +// CHECK-LABEL: define i32 @f2(ptr {{[^%]*}}%0) func.func @f2(%a : !fir.ref) -> i32 { %1 = fir.load %a : !fir.ref // CHECK: %[[r2:.*]] = load diff --git a/flang/test/Fir/coordinateof.fir b/flang/test/Fir/coordinateof.fir index 693bdf716ba1d..a01e9e9d1fc40 100644 --- a/flang/test/Fir/coordinateof.fir +++ b/flang/test/Fir/coordinateof.fir @@ -62,7 +62,7 @@ func.func @foo5(%box : !fir.box>>, %i : index) -> i32 } // CHECK-LABEL: @foo6 -// CHECK-SAME: (ptr %[[box:.*]], i64 %{{.*}}, ptr captures(none) %{{.*}}) +// CHECK-SAME: (ptr {{[^%]*}}%[[box:.*]], i64 %{{.*}}, ptr {{[^%]*}}%{{.*}}) func.func @foo6(%box : !fir.box>>>, %i : i64 , %res : !fir.ref>) { // CHECK: %[[addr_gep:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr %[[box]], i32 0, i32 0 // CHECK: %[[addr:.*]] = load ptr, ptr %[[addr_gep]] diff --git a/flang/test/Fir/embox.fir b/flang/test/Fir... [truncated] ``````````
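The attribute selection in the `FunctionAttr.cpp` hunk of this patch can be restated as a small standalone decision table. The following is only a sketch under stated assumptions: `ArgKind`, `ArgInfo`, `PassOptions`, `ArgAttrs` and `decideAttrs` are hypothetical names introduced here for illustration, standing in for the MLIR type checks and `func.getArgAttr` queries in the patch. It does not use any MLIR or FIR API, and which concrete FIR types count as "reference" or "box" is assumed from the comments in the diff rather than taken from the source.

```cpp
// Hypothetical classification of a dummy argument's type. The real pass
// inspects MLIR types; this enum only models the two branches in the patch.
enum class ArgKind { Reference, Box, Other };

// Hypothetical per-argument facts the pass queries.
struct ArgInfo {
  ArgKind kind = ArgKind::Other;
  bool hasTarget = false;       // stands in for fir::getTargetAttrName()
  bool hasAsynchronous = false; // stands in for fir::getAsynchronousAttrName()
  bool hasVolatile = false;     // stands in for fir::getVolatileAttrName()
  bool isPointerType = false;   // stands in for fir::isPointerType(argType)
};

// Models the two new pass options; both default to false as in Passes.td.
struct PassOptions {
  bool setNoCapture = false;
  bool setNoAlias = false;
};

// Result: which attributes would be attached to the argument.
struct ArgAttrs {
  bool noCapture = false;
  bool noAlias = false;
};

ArgAttrs decideAttrs(const ArgInfo &arg, const PassOptions &opts) {
  bool isNoCapture = false;
  bool isNoAlias = false;
  if (arg.kind == ArgKind::Reference && !arg.hasTarget &&
      !arg.hasAsynchronous && !arg.hasVolatile) {
    // Plain by-reference dummy: never captured; noalias unless pointer-typed.
    isNoCapture = true;
    isNoAlias = !arg.isPointerType;
  } else if (arg.kind == ArgKind::Box) {
    // Descriptor arguments: treated as neither captured nor aliased.
    isNoCapture = isNoAlias = true;
  }
  // Eligibility is computed first; the pass options gate what is emitted.
  ArgAttrs result;
  result.noCapture = isNoCapture && opts.setNoCapture;
  result.noAlias = isNoAlias && opts.setNoAlias;
  return result;
}
```

With both options left at their defaults, `decideAttrs` returns no attributes for any argument. This mirrors the `Pipelines.cpp` hunk, where `setNoCapture` and `setNoAlias` are only switched on when `config.OptLevel.isOptimizingForSpeed()` holds.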
https://github.com/llvm/llvm-project/pull/140803 From flang-commits at lists.llvm.org Tue May 20 14:14:25 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 14:14:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Added noalias attribute to function arguments. (PR #140803) In-Reply-To: Message-ID: <682cf0b1.170a0220.1253e4.48a2@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Slava Zakharin (vzakhari)
Changes

This helps to disambiguate accesses in the caller and the callee after LLVM inlining in some apps. I did not see any performance changes, but this is one step towards enabling other optimizations in the apps that I am looking at.

The definition of llvm.noalias says:
```
... indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function.
```

I believe this exactly matches Fortran rules for the dummy arguments that are modified during their subprogram execution.

I also set llvm.noalias and llvm.nocapture on the !fir.box<> arguments, because the corresponding descriptors cannot be captured and cannot alias anything (not based on them) during the execution of the subprogram.

---
Patch is 52.48 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140803.diff

30 Files Affected:

- (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+6)
- (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+5-1)
- (modified) flang/lib/Optimizer/Transforms/FunctionAttr.cpp (+17-12)
- (modified) flang/test/Fir/array-coor.fir (+1-1)
- (modified) flang/test/Fir/arrayset.fir (+1-1)
- (modified) flang/test/Fir/arrexp.fir (+9-9)
- (modified) flang/test/Fir/box-offset-codegen.fir (+4-4)
- (modified) flang/test/Fir/box-typecode.fir (+1-1)
- (modified) flang/test/Fir/box.fir (+9-9)
- (modified) flang/test/Fir/boxproc.fir (+2-2)
- (modified) flang/test/Fir/commute.fir (+1-1)
- (modified) flang/test/Fir/coordinateof.fir (+1-1)
- (modified) flang/test/Fir/embox.fir (+4-4)
- (modified) flang/test/Fir/field-index.fir (+2-2)
- (modified) flang/test/Fir/ignore-missing-type-descriptor.fir (+1-1)
- (modified) flang/test/Fir/polymorphic.fir (+1-1)
- (modified) flang/test/Fir/rebox.fir (+6-6)
- (modified) flang/test/Fir/struct-passing-x86-64-byval.fir (+24-24)
- (modified) flang/test/Fir/target-rewrite-complex-10-x86.fir (+1-1)
- (modified) flang/test/Fir/target.fir (+4-4)
- (modified) flang/test/Fir/tbaa-codegen.fir (+1-1)
- (modified) flang/test/Fir/tbaa-codegen2.fir (+1-1)
- (modified) flang/test/Integration/OpenMP/copyprivate.f90 (+17-17)
- (modified) flang/test/Integration/debug-local-var-2.f90 (+2-2)
- (modified) flang/test/Integration/unroll-loops.f90 (+1-1)
- (modified) flang/test/Lower/HLFIR/unroll-loops.fir (+1-1)
- (modified) flang/test/Lower/forall/character-1.f90 (+1-1)
- (modified) flang/test/Transforms/constant-argument-globalisation.fir (+2-2)
- (added) flang/test/Transforms/function-attrs-noalias.fir (+113)
- (modified) flang/test/Transforms/function-attrs.fir (+26-1)

``````````diff
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index c0d88a8e19f80..e1497aeb3aa36 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -418,6 +418,12 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> {
 "Set the unsafe-fp-math attribute on functions in the module.">,
 Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"",
 "Set the tune-cpu attribute on functions in the module.">,
+ Option<"setNoCapture", "set-nocapture", "bool", /*default=*/"false",
+ "Set LLVM nocapture attribute on function arguments, "
+ "if possible">,
+ Option<"setNoAlias", "set-noalias", "bool", /*default=*/"false",
+ "Set LLVM noalias attribute on function arguments, "
+ "if possible">,
 ];
 }
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 77751908e35be..378913fcb1329 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -350,11 +350,15 @@ void
createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, else framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None; + bool setNoCapture = false, setNoAlias = false; + if (config.OptLevel.isOptimizingForSpeed()) + setNoCapture = setNoAlias = true; + pm.addPass(fir::createFunctionAttr( {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - ""})); + /*tuneCPU=*/"", setNoCapture, setNoAlias})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 43e4c1a7af3cd..c8cdba0d6f9c4 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -27,17 +27,8 @@ namespace { class FunctionAttrPass : public fir::impl::FunctionAttrBase { public: - FunctionAttrPass(const fir::FunctionAttrOptions &options) { - instrumentFunctionEntry = options.instrumentFunctionEntry; - instrumentFunctionExit = options.instrumentFunctionExit; - framePointerKind = options.framePointerKind; - noInfsFPMath = options.noInfsFPMath; - noNaNsFPMath = options.noNaNsFPMath; - approxFuncFPMath = options.approxFuncFPMath; - noSignedZerosFPMath = options.noSignedZerosFPMath; - unsafeFPMath = options.unsafeFPMath; - } - FunctionAttrPass() {} + FunctionAttrPass(const fir::FunctionAttrOptions &options) : Base{options} {} + FunctionAttrPass() = default; void runOnOperation() override; }; @@ -56,14 +47,28 @@ void FunctionAttrPass::runOnOperation() { if ((isFromModule || !func.isDeclaration()) && !fir::hasBindcAttr(func.getOperation())) { llvm::StringRef nocapture = mlir::LLVM::LLVMDialect::getNoCaptureAttrName(); + llvm::StringRef noalias = mlir::LLVM::LLVMDialect::getNoAliasAttrName(); mlir::UnitAttr unitAttr = mlir::UnitAttr::get(func.getContext()); for (auto [index, argType] : 
llvm::enumerate(func.getArgumentTypes())) { + bool isNoCapture = false; + bool isNoAlias = false; if (mlir::isa(argType) && !func.getArgAttr(index, fir::getTargetAttrName()) && !func.getArgAttr(index, fir::getAsynchronousAttrName()) && - !func.getArgAttr(index, fir::getVolatileAttrName())) + !func.getArgAttr(index, fir::getVolatileAttrName())) { + isNoCapture = true; + isNoAlias = !fir::isPointerType(argType); + } else if (mlir::isa(argType)) { + // !fir.box arguments will be passed as descriptor pointers + // at LLVM IR dialect level - they cannot be captured, + // and cannot alias with anything within the function. + isNoCapture = isNoAlias = true; + } + if (isNoCapture && setNoCapture) func.setArgAttr(index, nocapture, unitAttr); + if (isNoAlias && setNoAlias) + func.setArgAttr(index, noalias, unitAttr); } } diff --git a/flang/test/Fir/array-coor.fir b/flang/test/Fir/array-coor.fir index a765670d20b28..2caa727a10c50 100644 --- a/flang/test/Fir/array-coor.fir +++ b/flang/test/Fir/array-coor.fir @@ -33,7 +33,7 @@ func.func @test_array_coor_box_component_slice(%arg0: !fir.box) -> () // CHECK-LABEL: define void @test_array_coor_box_component_slice( -// CHECK-SAME: ptr %[[VAL_0:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[VAL_0:.*]]) // CHECK: %[[VAL_1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[VAL_0]], i32 0, i32 7, i32 0, i32 2 // CHECK: %[[VAL_2:.*]] = load i64, ptr %[[VAL_1]] // CHECK: %[[VAL_3:.*]] = mul nsw i64 1, %[[VAL_2]] diff --git a/flang/test/Fir/arrayset.fir b/flang/test/Fir/arrayset.fir index dab939aba1702..cb26971cb962d 100644 --- a/flang/test/Fir/arrayset.fir +++ b/flang/test/Fir/arrayset.fir @@ -1,7 +1,7 @@ // RUN: tco %s | FileCheck %s // RUN: %flang_fc1 -emit-llvm %s -o - | FileCheck %s -// CHECK-LABEL: define void @x(ptr captures(none) %0) +// CHECK-LABEL: define void @x( func.func @x(%arr : !fir.ref>) { %1 = arith.constant 0 : index %2 = arith.constant 9 : index diff --git a/flang/test/Fir/arrexp.fir 
b/flang/test/Fir/arrexp.fir index 924c1fab8d84b..6c7f71f6f1f9c 100644 --- a/flang/test/Fir/arrexp.fir +++ b/flang/test/Fir/arrexp.fir @@ -1,7 +1,7 @@ // RUN: tco %s | FileCheck %s // CHECK-LABEL: define void @f1 -// CHECK: (ptr captures(none) %[[A:[^,]*]], {{.*}}, float %[[F:.*]]) +// CHECK: (ptr {{[^%]*}}%[[A:[^,]*]], {{.*}}, float %[[F:.*]]) func.func @f1(%a : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -23,7 +23,7 @@ func.func @f1(%a : !fir.ref>, %n : index, %m : index, %o : i } // CHECK-LABEL: define void @f2 -// CHECK: (ptr captures(none) %[[A:[^,]*]], {{.*}}, float %[[F:.*]]) +// CHECK: (ptr {{[^%]*}}%[[A:[^,]*]], {{.*}}, float %[[F:.*]]) func.func @f2(%a : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -47,7 +47,7 @@ func.func @f2(%a : !fir.ref>, %b : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -72,7 +72,7 @@ func.func @f3(%a : !fir.ref>, %b : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -102,7 +102,7 @@ func.func @f4(%a : !fir.ref>, %b : !fir.ref>, %arg1: !fir.box>, %arg2: f32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -135,7 +135,7 @@ func.func @f5(%arg0: !fir.box>, %arg1: !fir.box>, %arg1: f32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -165,7 +165,7 @@ func.func @f6(%arg0: !fir.box>, %arg1: f32) { // Non contiguous array with lower bounds (x = y(100), with y(4:)) // Test 
array_coor offset computation. // CHECK-LABEL: define void @f7( -// CHECK: ptr captures(none) %[[X:[^,]*]], ptr %[[Y:.*]]) +// CHECK: ptr {{[^%]*}}%[[X:[^,]*]], ptr {{[^%]*}}%[[Y:.*]]) func.func @f7(%arg0: !fir.ref, %arg1: !fir.box>) { %c4 = arith.constant 4 : index %c100 = arith.constant 100 : index @@ -181,7 +181,7 @@ func.func @f7(%arg0: !fir.ref, %arg1: !fir.box>) { // Test A(:, :)%x reference codegen with A constant shape. // CHECK-LABEL: define void @f8( -// CHECK-SAME: ptr captures(none) %[[A:.*]], i32 %[[I:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[A:.*]], i32 %[[I:.*]]) func.func @f8(%a : !fir.ref>>, %i : i32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -198,7 +198,7 @@ func.func @f8(%a : !fir.ref>>, %i : i32) { // Test casts in in array_coor offset computation when type parameters are not i64 // CHECK-LABEL: define ptr @f9( -// CHECK-SAME: i32 %[[I:.*]], i64 %{{.*}}, i64 %{{.*}}, ptr captures(none) %[[C:.*]]) +// CHECK-SAME: i32 %[[I:.*]], i64 %{{.*}}, i64 %{{.*}}, ptr {{[^%]*}}%[[C:.*]]) func.func @f9(%i: i32, %e : i64, %j: i64, %c: !fir.ref>>) -> !fir.ref> { %s = fir.shape %e, %e : (i64, i64) -> !fir.shape<2> // CHECK: %[[CAST:.*]] = sext i32 %[[I]] to i64 diff --git a/flang/test/Fir/box-offset-codegen.fir b/flang/test/Fir/box-offset-codegen.fir index 15c9a11e5aefe..11d5750ffc385 100644 --- a/flang/test/Fir/box-offset-codegen.fir +++ b/flang/test/Fir/box-offset-codegen.fir @@ -7,7 +7,7 @@ func.func @scalar_addr(%scalar : !fir.ref>>) -> !fir.llvm_ return %addr : !fir.llvm_ptr>> } // CHECK-LABEL: define ptr @scalar_addr( -// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 0 // CHECK: ret ptr %[[VAL_0]] @@ -16,7 +16,7 @@ func.func @scalar_tdesc(%scalar : !fir.ref>>) -> !fir.llvm return %tdesc : !fir.llvm_ptr>> } // CHECK-LABEL: define ptr @scalar_tdesc( -// CHECK-SAME: ptr 
captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 7 // CHECK: ret ptr %[[VAL_0]] @@ -25,7 +25,7 @@ func.func @array_addr(%array : !fir.ref>>> } // CHECK-LABEL: define ptr @array_addr( -// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 0 // CHECK: ret ptr %[[VAL_0]] @@ -34,6 +34,6 @@ func.func @array_tdesc(%array : !fir.ref>> } // CHECK-LABEL: define ptr @array_tdesc( -// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 8 // CHECK: ret ptr %[[VAL_0]] diff --git a/flang/test/Fir/box-typecode.fir b/flang/test/Fir/box-typecode.fir index 766c5165b947c..a8d43eba39889 100644 --- a/flang/test/Fir/box-typecode.fir +++ b/flang/test/Fir/box-typecode.fir @@ -6,7 +6,7 @@ func.func @test_box_typecode(%a: !fir.class) -> i32 { } // CHECK-LABEL: @test_box_typecode( -// CHECK-SAME: ptr %[[BOX:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]) // CHECK: %[[GEP:.*]] = getelementptr { ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}} }, ptr %[[BOX]], i32 0, i32 4 // CHECK: %[[TYPE_CODE:.*]] = load i8, ptr %[[GEP]] // CHECK: %[[TYPE_CODE_CONV:.*]] = sext i8 %[[TYPE_CODE]] to i32 diff --git a/flang/test/Fir/box.fir b/flang/test/Fir/box.fir index 5e931a2e0d9aa..c0cf3d8375983 100644 --- a/flang/test/Fir/box.fir +++ b/flang/test/Fir/box.fir @@ -24,7 +24,7 @@ func.func private @g(%b : !fir.box) func.func private @ga(%b : !fir.box>) // CHECK-LABEL: define void @f -// CHECK: (ptr captures(none) %[[ARG:.*]]) +// CHECK: (ptr {{[^%]*}}%[[ARG:.*]]) func.func @f(%a : !fir.ref) { // 
CHECK: %[[DESC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 } // CHECK: %[[INS0:.*]] = insertvalue {{.*}} { ptr undef, i64 4, i32 20240719, i8 0, i8 27, i8 0, i8 0 }, ptr %[[ARG]], 0 @@ -38,7 +38,7 @@ func.func @f(%a : !fir.ref) { } // CHECK-LABEL: define void @fa -// CHECK: (ptr captures(none) %[[ARG:.*]]) +// CHECK: (ptr {{[^%]*}}%[[ARG:.*]]) func.func @fa(%a : !fir.ref>) { %c = fir.convert %a : (!fir.ref>) -> !fir.ref> %c1 = arith.constant 1 : index @@ -54,7 +54,7 @@ func.func @fa(%a : !fir.ref>) { // Boxing of a scalar character of dynamic length // CHECK-LABEL: define void @b1( -// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]]) func.func @b1(%arg0 : !fir.ref>, %arg1 : index) -> !fir.box> { // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 } // CHECK: %[[size:.*]] = mul i64 1, %[[arg1]] @@ -69,8 +69,8 @@ func.func @b1(%arg0 : !fir.ref>, %arg1 : index) -> !fir.box>>, %arg1 : index) -> !fir.box>> { %1 = fir.shape %arg1 : (index) -> !fir.shape<1> // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } @@ -85,7 +85,7 @@ func.func @b2(%arg0 : !fir.ref>>, %arg1 : index) -> // Boxing of a dynamic array of character of dynamic length // CHECK-LABEL: define void @b3( -// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]], i64 %[[arg2:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]], i64 %[[arg2:.*]]) func.func @b3(%arg0 : !fir.ref>>, %arg1 : index, %arg2 : index) -> !fir.box>> { %1 = fir.shape %arg2 : (index) -> !fir.shape<1> // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } @@ -103,7 +103,7 @@ func.func @b3(%arg0 : !fir.ref>>, %arg1 : index, %ar // Boxing of a static array of character of dynamic length // CHECK-LABEL: define void @b4( -// CHECK-SAME: ptr captures(none) 
%[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]]) func.func @b4(%arg0 : !fir.ref>>, %arg1 : index) -> !fir.box>> { %c_7 = arith.constant 7 : index %1 = fir.shape %c_7 : (index) -> !fir.shape<1> @@ -122,7 +122,7 @@ func.func @b4(%arg0 : !fir.ref>>, %arg1 : index) -> // Storing a fir.box into a fir.ref (modifying descriptors). // CHECK-LABEL: define void @b5( -// CHECK-SAME: ptr captures(none) %[[arg0:.*]], ptr %[[arg1:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[arg0:.*]], ptr {{[^%]*}}%[[arg1:.*]]) func.func @b5(%arg0 : !fir.ref>>>, %arg1 : !fir.box>>) { fir.store %arg1 to %arg0 : !fir.ref>>> // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr %0, ptr %1, i32 72, i1 false) @@ -132,7 +132,7 @@ func.func @b5(%arg0 : !fir.ref>>>, %arg1 func.func private @callee6(!fir.box) -> i32 // CHECK-LABEL: define i32 @box6( -// CHECK-SAME: ptr captures(none) %[[ARG0:.*]], i64 %[[ARG1:.*]], i64 %[[ARG2:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[ARG0:.*]], i64 %[[ARG1:.*]], i64 %[[ARG2:.*]]) func.func @box6(%0 : !fir.ref>, %1 : index, %2 : index) -> i32 { %c100 = arith.constant 100 : index %c50 = arith.constant 50 : index diff --git a/flang/test/Fir/boxproc.fir b/flang/test/Fir/boxproc.fir index e99dfd0b92afd..5d82522055adc 100644 --- a/flang/test/Fir/boxproc.fir +++ b/flang/test/Fir/boxproc.fir @@ -16,7 +16,7 @@ // CHECK: call void @_QPtest_proc_dummy_other(ptr %[[VAL_6]]) // CHECK-LABEL: define void @_QFtest_proc_dummyPtest_proc_dummy_a(ptr -// CHECK-SAME: captures(none) %[[VAL_0:.*]], ptr nest captures(none) %[[VAL_1:.*]]) +// CHECK-SAME: {{[^%]*}}%[[VAL_0:.*]], ptr nest {{[^%]*}}%[[VAL_1:.*]]) // CHECK-LABEL: define void @_QPtest_proc_dummy_other(ptr // CHECK-SAME: %[[VAL_0:.*]]) @@ -92,7 +92,7 @@ func.func @_QPtest_proc_dummy_other(%arg0: !fir.boxproc<() -> ()>) { // CHECK: call void @llvm.stackrestore.p0(ptr %[[VAL_27]]) // CHECK-LABEL: define { ptr, i64 } 
@_QFtest_proc_dummy_charPgen_message(ptr -// CHECK-SAME: captures(none) %[[VAL_0:.*]], i64 %[[VAL_1:.*]], ptr nest captures(none) %[[VAL_2:.*]]) +// CHECK-SAME: {{[^%]*}}%[[VAL_0:.*]], i64 %[[VAL_1:.*]], ptr nest {{[^%]*}}%[[VAL_2:.*]]) // CHECK: %[[VAL_3:.*]] = getelementptr { { ptr, i64 } }, ptr %[[VAL_2]], i32 0, i32 0 // CHECK: %[[VAL_4:.*]] = load { ptr, i64 }, ptr %[[VAL_3]], align 8 // CHECK: %[[VAL_5:.*]] = extractvalue { ptr, i64 } %[[VAL_4]], 0 diff --git a/flang/test/Fir/commute.fir b/flang/test/Fir/commute.fir index a857ba55b00c5..8713c8ff24e7f 100644 --- a/flang/test/Fir/commute.fir +++ b/flang/test/Fir/commute.fir @@ -11,7 +11,7 @@ func.func @f1(%a : i32, %b : i32) -> i32 { return %3 : i32 } -// CHECK-LABEL: define i32 @f2(ptr captures(none) %0) +// CHECK-LABEL: define i32 @f2(ptr {{[^%]*}}%0) func.func @f2(%a : !fir.ref) -> i32 { %1 = fir.load %a : !fir.ref // CHECK: %[[r2:.*]] = load diff --git a/flang/test/Fir/coordinateof.fir b/flang/test/Fir/coordinateof.fir index 693bdf716ba1d..a01e9e9d1fc40 100644 --- a/flang/test/Fir/coordinateof.fir +++ b/flang/test/Fir/coordinateof.fir @@ -62,7 +62,7 @@ func.func @foo5(%box : !fir.box>>, %i : index) -> i32 } // CHECK-LABEL: @foo6 -// CHECK-SAME: (ptr %[[box:.*]], i64 %{{.*}}, ptr captures(none) %{{.*}}) +// CHECK-SAME: (ptr {{[^%]*}}%[[box:.*]], i64 %{{.*}}, ptr {{[^%]*}}%{{.*}}) func.func @foo6(%box : !fir.box>>>, %i : i64 , %res : !fir.ref>) { // CHECK: %[[addr_gep:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr %[[box]], i32 0, i32 0 // CHECK: %[[addr:.*]] = load ptr, ptr %[[addr_gep]] diff --git a/flang/test/Fir/embox.fir b/flang/test/Fir... [truncated] ``````````
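For readers skimming the FunctionAttr.cpp hunk in the diff above, the per-argument decision can be sketched as a small standalone C++ model. This is only an illustration under stated assumptions: `ArgKind`, `ArgInfo`, `ArgAttrs`, and `classify` are hypothetical names that stand in for the MLIR argument type and its attributes; they are not flang or MLIR APIs. The control flow mirrors the patch as shown: reference arguments without TARGET, ASYNCHRONOUS, or VOLATILE are treated as nocapture and, unless they are pointer types, noalias; box arguments are treated as both; the `setNoCapture` / `setNoAlias` options gate whether anything is actually emitted.

```cpp
#include <cassert>

// Hypothetical stand-in for the kind of FIR argument type (not a flang API).
enum class ArgKind { Reference, Box, Other };

// Hypothetical stand-in for what the pass queries about one argument.
struct ArgInfo {
  ArgKind kind;
  bool isPointerType;   // models the fir::isPointerType(argType) query
  bool hasTarget;       // argument carries the TARGET attribute
  bool hasAsynchronous; // argument carries the ASYNCHRONOUS attribute
  bool hasVolatile;     // argument carries the VOLATILE attribute
};

// Which LLVM argument attributes would be set.
struct ArgAttrs {
  bool noCapture;
  bool noAlias;
};

// Mirrors the branch structure of the patched loop body.
ArgAttrs classify(const ArgInfo &arg, bool setNoCapture, bool setNoAlias) {
  bool isNoCapture = false;
  bool isNoAlias = false;
  if (arg.kind == ArgKind::Reference && !arg.hasTarget &&
      !arg.hasAsynchronous && !arg.hasVolatile) {
    isNoCapture = true;
    isNoAlias = !arg.isPointerType;
  } else if (arg.kind == ArgKind::Box) {
    // Descriptors are passed by pointer at the LLVM level; per the patch
    // comment they cannot be captured and cannot alias within the function.
    isNoCapture = isNoAlias = true;
  }
  // The pass options decide whether the attributes are actually attached.
  return ArgAttrs{isNoCapture && setNoCapture, isNoAlias && setNoAlias};
}
```

In this model, calling `classify` with both options set to `false` (the pass defaults in Passes.td) yields no attributes for any argument, which matches the Pipelines.cpp hunk enabling them only when optimizing for speed.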
https://github.com/llvm/llvm-project/pull/140803 From flang-commits at lists.llvm.org Tue May 20 14:34:40 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 20 May 2025 14:34:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix crash in error recovery (PR #140768) In-Reply-To: Message-ID: <682cf570.170a0220.ea3ea.3e5a@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/140768 >From bd8ed64f30be0f463af090c0b0634c2401b79c7e Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Tue, 20 May 2025 10:35:27 -0700 Subject: [PATCH] [flang] Fix crash in error recovery When a TYPE(*) dummy argument is erroneously used as a component value in a structure constructor, semantics crashes if the structure constructor had been initially parsed as a potential function reference. Clean out stale typed expressions when reanalyzing the reconstructed parse subtree to ensure that errors are caught the next time around. Fixes https://github.com/llvm/llvm-project/issues/140794. 
--- flang/lib/Semantics/expression.cpp | 6 +++++- flang/test/Semantics/bug869.f90 | 10 ++++++++++ 2 files changed, 15 insertions(+), 1 deletion(-) create mode 100644 flang/test/Semantics/bug869.f90 diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index b3ad608ee6744..d68e71f57f141 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -3376,6 +3376,10 @@ MaybeExpr ExpressionAnalyzer::Analyze(const parser::FunctionReference &funcRef, auto &mutableRef{const_cast(funcRef)}; *structureConstructor = mutableRef.ConvertToStructureConstructor(type.derivedTypeSpec()); + // Don't use saved typed expressions left over from argument + // analysis; they might not be valid structure components + // (e.g., a TYPE(*) argument) + auto restorer{DoNotUseSavedTypedExprs()}; return Analyze(structureConstructor->value()); } } @@ -4058,7 +4062,7 @@ MaybeExpr ExpressionAnalyzer::ExprOrVariable( // first to be sure. std::optional ctor; result = Analyze(funcRef->value(), &ctor); - if (result && ctor) { + if (ctor) { // A misparsed function reference is really a structure // constructor. Repair the parse tree in situ. const_cast(x).u = std::move(*ctor); diff --git a/flang/test/Semantics/bug869.f90 b/flang/test/Semantics/bug869.f90 new file mode 100644 index 0000000000000..ddc7dffcc2fa4 --- /dev/null +++ b/flang/test/Semantics/bug869.f90 @@ -0,0 +1,10 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 +! 
Regression test for crash +subroutine sub(xx) + type(*) :: xx + type ty + end type + type(ty) obj + !ERROR: TYPE(*) dummy argument may only be used as an actual argument + obj = ty(xx) +end From flang-commits at lists.llvm.org Tue May 20 16:10:41 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 20 May 2025 16:10:41 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727) In-Reply-To: Message-ID: <682d0bf1.a70a0220.367659.5fa4@mx.google.com> klausler wrote: Descriptor-based I/O is now supported in the patch. https://github.com/llvm/llvm-project/pull/137727 From flang-commits at lists.llvm.org Tue May 20 16:16:19 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Tue, 20 May 2025 16:16:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Allocate extra descriptor in managed memory when it is coming from device (PR #140818) Message-ID: https://github.com/clementval created https://github.com/llvm/llvm-project/pull/140818 None >From c71ea039124f68f3e9a30eaa4b7c2fd76ec9d189 Mon Sep 17 00:00:00 2001 From: Valentin Clement Date: Tue, 20 May 2025 16:14:27 -0700 Subject: [PATCH] [flang][cuda] Allocate extra descriptor in managed memory when it is coming from device --- flang/lib/Optimizer/CodeGen/CodeGen.cpp | 9 ++++++--- flang/test/Fir/CUDA/cuda-code-gen.mlir | 17 +++++++++++++++++ 2 files changed, 23 insertions(+), 3 deletions(-) diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index 70c90fae34086..205807eab403a 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -1830,7 +1830,9 @@ static bool isDeviceAllocation(mlir::Value val, mlir::Value adaptorVal) { (callOp.getCallee().value().getRootReference().getValue().starts_with( RTNAME_STRING(CUFMemAlloc)) 
|| callOp.getCallee().value().getRootReference().getValue().starts_with( - RTNAME_STRING(CUFAllocDescriptor)))) + RTNAME_STRING(CUFAllocDescriptor)) || + callOp.getCallee().value().getRootReference().getValue() == + "__tgt_acc_get_deviceptr")) return true; return false; } @@ -3253,8 +3255,9 @@ struct LoadOpConversion : public fir::FIROpConversion { if (auto callOp = mlir::dyn_cast_or_null( inputBoxStorage.getDefiningOp())) { if (callOp.getCallee() && - (*callOp.getCallee()) - .starts_with(RTNAME_STRING(CUFAllocDescriptor))) { + ((*callOp.getCallee()) + .starts_with(RTNAME_STRING(CUFAllocDescriptor)) || + (*callOp.getCallee()).starts_with("__tgt_acc_get_deviceptr"))) { // CUDA Fortran local descriptor are allocated in managed memory. So // new storage must be allocated the same way. auto mod = load->getParentOfType(); diff --git a/flang/test/Fir/CUDA/cuda-code-gen.mlir b/flang/test/Fir/CUDA/cuda-code-gen.mlir index fdd9f1ac12b1f..672be13beae24 100644 --- a/flang/test/Fir/CUDA/cuda-code-gen.mlir +++ b/flang/test/Fir/CUDA/cuda-code-gen.mlir @@ -204,3 +204,20 @@ func.func @_QMm1Psub1(%arg0: !fir.box> {cuf.data_attr = #cuf.c fir.global common @_QPshared_static__shared_mem(dense<0> : vector<28xi8>) {alignment = 8 : i64, data_attr = #cuf.cuda} : !fir.array<28xi8> // CHECK: llvm.mlir.global common @_QPshared_static__shared_mem(dense<0> : vector<28xi8>) {addr_space = 3 : i32, alignment = 8 : i64} : !llvm.array<28 x i8> + +// ----- + +module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry, dense<64> : vector<4xi64>>, #dlti.dl_entry, dense<32> : vector<4xi64>>, #dlti.dl_entry, dense<32> : vector<4xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<4xi64>>, 
#dlti.dl_entry<"dlti.endianness", "little">, #dlti.dl_entry<"dlti.stack_alignment", 128 : i64>>} { + func.func @_QQmain() attributes {fir.bindc_name = "cufkernel_global"} { + %c0 = arith.constant 0 : index + %3 = fir.call @__tgt_acc_get_deviceptr() : () -> !fir.ref> + %4 = fir.convert %3 : (!fir.ref>) -> !fir.ref>>> + %5 = fir.load %4 : !fir.ref>>> + return + } + + // CHECK-LABEL: llvm.func @_QQmain() + // CHECK: llvm.call @_FortranACUFAllocDescriptor + + func.func private @__tgt_acc_get_deviceptr() -> !fir.ref> +} From flang-commits at lists.llvm.org Tue May 20 16:16:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 16:16:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Allocate extra descriptor in managed memory when it is coming from device (PR #140818) In-Reply-To: Message-ID: <682d0d62.170a0220.ebf9.c78f@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Valentin Clement (バレンタイン クレメン) (clementval)
Changes --- Full diff: https://github.com/llvm/llvm-project/pull/140818.diff 2 Files Affected: - (modified) flang/lib/Optimizer/CodeGen/CodeGen.cpp (+6-3) - (modified) flang/test/Fir/CUDA/cuda-code-gen.mlir (+17) ``````````diff diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index 70c90fae34086..205807eab403a 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -1830,7 +1830,9 @@ static bool isDeviceAllocation(mlir::Value val, mlir::Value adaptorVal) { (callOp.getCallee().value().getRootReference().getValue().starts_with( RTNAME_STRING(CUFMemAlloc)) || callOp.getCallee().value().getRootReference().getValue().starts_with( - RTNAME_STRING(CUFAllocDescriptor)))) + RTNAME_STRING(CUFAllocDescriptor)) || + callOp.getCallee().value().getRootReference().getValue() == + "__tgt_acc_get_deviceptr")) return true; return false; } @@ -3253,8 +3255,9 @@ struct LoadOpConversion : public fir::FIROpConversion { if (auto callOp = mlir::dyn_cast_or_null( inputBoxStorage.getDefiningOp())) { if (callOp.getCallee() && - (*callOp.getCallee()) - .starts_with(RTNAME_STRING(CUFAllocDescriptor))) { + ((*callOp.getCallee()) + .starts_with(RTNAME_STRING(CUFAllocDescriptor)) || + (*callOp.getCallee()).starts_with("__tgt_acc_get_deviceptr"))) { // CUDA Fortran local descriptor are allocated in managed memory. So // new storage must be allocated the same way. 
auto mod = load->getParentOfType(); diff --git a/flang/test/Fir/CUDA/cuda-code-gen.mlir b/flang/test/Fir/CUDA/cuda-code-gen.mlir index fdd9f1ac12b1f..672be13beae24 100644 --- a/flang/test/Fir/CUDA/cuda-code-gen.mlir +++ b/flang/test/Fir/CUDA/cuda-code-gen.mlir @@ -204,3 +204,20 @@ func.func @_QMm1Psub1(%arg0: !fir.box> {cuf.data_attr = #cuf.c fir.global common @_QPshared_static__shared_mem(dense<0> : vector<28xi8>) {alignment = 8 : i64, data_attr = #cuf.cuda} : !fir.array<28xi8> // CHECK: llvm.mlir.global common @_QPshared_static__shared_mem(dense<0> : vector<28xi8>) {addr_space = 3 : i32, alignment = 8 : i64} : !llvm.array<28 x i8> + +// ----- + +module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry, dense<64> : vector<4xi64>>, #dlti.dl_entry, dense<32> : vector<4xi64>>, #dlti.dl_entry, dense<32> : vector<4xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<4xi64>>, #dlti.dl_entry<"dlti.endianness", "little">, #dlti.dl_entry<"dlti.stack_alignment", 128 : i64>>} { + func.func @_QQmain() attributes {fir.bindc_name = "cufkernel_global"} { + %c0 = arith.constant 0 : index + %3 = fir.call @__tgt_acc_get_deviceptr() : () -> !fir.ref> + %4 = fir.convert %3 : (!fir.ref>) -> !fir.ref>>> + %5 = fir.load %4 : !fir.ref>>> + return + } + + // CHECK-LABEL: llvm.func @_QQmain() + // CHECK: llvm.call @_FortranACUFAllocDescriptor + + func.func private @__tgt_acc_get_deviceptr() -> !fir.ref> +} ``````````
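The two CodeGen.cpp hunks in the diff above test the callee name slightly differently, which a standalone sketch may make easier to see. The code below is an illustrative model only, not the flang implementation: `startsWith`, `loadNeedsManagedDescriptor`, and `isDeviceAllocationCallee` are hypothetical helpers operating on plain strings. `_FortranACUFAllocDescriptor` is the name that appears in the test's CHECK line; `_FortranACUFMemAlloc` is an assumed expansion of `RTNAME_STRING(CUFMemAlloc)` by analogy. Only the callee-name part of each condition is modeled; the surrounding checks that the diff elides are left out.

```cpp
#include <cassert>
#include <string>

// Plain-string prefix test, standing in for StringRef::starts_with.
static bool startsWith(const std::string &s, const std::string &prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Models the LoadOpConversion condition: both names are matched by prefix.
bool loadNeedsManagedDescriptor(const std::string &callee) {
  return startsWith(callee, "_FortranACUFAllocDescriptor") ||
         startsWith(callee, "__tgt_acc_get_deviceptr");
}

// Models the callee-name part of isDeviceAllocation: the runtime entry
// points are matched by prefix, while __tgt_acc_get_deviceptr is compared
// for exact equality. "_FortranACUFMemAlloc" is an assumed symbol name.
bool isDeviceAllocationCallee(const std::string &callee) {
  return startsWith(callee, "_FortranACUFMemAlloc") ||
         startsWith(callee, "_FortranACUFAllocDescriptor") ||
         callee == "__tgt_acc_get_deviceptr";
}
```

Under this model a hypothetical callee such as `__tgt_acc_get_deviceptr_v2` would satisfy the load-conversion check but not the device-allocation check, since one hunk uses a prefix match and the other an exact comparison.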
https://github.com/llvm/llvm-project/pull/140818 From flang-commits at lists.llvm.org Tue May 20 14:13:47 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 20 May 2025 14:13:47 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Added noalias attribute to function arguments. (PR #140803) Message-ID: https://github.com/vzakhari created https://github.com/llvm/llvm-project/pull/140803 This helps to disambiguate accesses in the caller and the callee after LLVM inlining in some apps. I did not see any performance changes, but this is one step towards enabling other optimizations in the apps that I am looking at. The definition of llvm.noalias says: ``` ... indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function. ``` I believe this exactly matches Fortran rules for the dummy arguments that are modified during their subprogram execution. I also set llvm.noalias and llvm.nocapture on the !fir.box<> arguments, because the corresponding descriptors cannot be captured and cannot alias anything (not based on them) during the execution of the subprogram. >From 8ecc5a67289cf29d78634ded24a450eb301cf237 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Mon, 19 May 2025 17:14:19 -0700 Subject: [PATCH 1/2] [flang] Added noalias attribute to function arguments. This helps to disambiguate accesses in the caller and the callee after LLVM inlining in some apps. I did not see any performance changes, but this is one step towards enabling other optimizations in the apps that I am looking at. The definition of llvm.noalias says: ``` ... 
indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function. ``` I believe this exactly matches Fortran rules for the dummy arguments that are modified during their subprogram execution. I also set llvm.noalias and llvm.nocapture on the !fir.box<> arguments, because the corresponding descriptors cannot be captured and cannot alias anything (not based on them) during the execution of the subprogram. --- .../flang/Optimizer/Transforms/Passes.td | 6 + flang/lib/Optimizer/Passes/Pipelines.cpp | 6 +- .../lib/Optimizer/Transforms/FunctionAttr.cpp | 29 +++-- flang/test/Fir/array-coor.fir | 2 +- flang/test/Fir/arrayset.fir | 2 +- flang/test/Fir/arrexp.fir | 18 +-- flang/test/Fir/box-offset-codegen.fir | 8 +- flang/test/Fir/box-typecode.fir | 2 +- flang/test/Fir/box.fir | 18 +-- flang/test/Fir/boxproc.fir | 4 +- flang/test/Fir/commute.fir | 2 +- flang/test/Fir/coordinateof.fir | 2 +- flang/test/Fir/embox.fir | 8 +- flang/test/Fir/field-index.fir | 4 +- .../Fir/ignore-missing-type-descriptor.fir | 2 +- flang/test/Fir/polymorphic.fir | 2 +- flang/test/Fir/rebox.fir | 12 +- .../test/Fir/struct-passing-x86-64-byval.fir | 48 ++++---- .../Fir/target-rewrite-complex-10-x86.fir | 2 +- flang/test/Fir/target.fir | 8 +- flang/test/Fir/tbaa-codegen.fir | 2 +- flang/test/Fir/tbaa-codegen2.fir | 2 +- flang/test/Integration/OpenMP/copyprivate.f90 | 34 +++--- flang/test/Integration/debug-local-var-2.f90 | 4 +- flang/test/Integration/unroll-loops.f90 | 2 +- flang/test/Lower/HLFIR/unroll-loops.fir | 2 +- flang/test/Lower/forall/character-1.f90 | 2 +- .../constant-argument-globalisation.fir | 4 +- .../Transforms/function-attrs-noalias.fir | 113 ++++++++++++++++++ flang/test/Transforms/function-attrs.fir | 2 +- 30 files 
changed, 240 insertions(+), 112 deletions(-)
 create mode 100644 flang/test/Transforms/function-attrs-noalias.fir

diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index c0d88a8e19f80..e1497aeb3aa36 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -418,6 +418,12 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> {
              "Set the unsafe-fp-math attribute on functions in the module.">,
       Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"",
              "Set the tune-cpu attribute on functions in the module.">,
+      Option<"setNoCapture", "set-nocapture", "bool", /*default=*/"false",
+             "Set LLVM nocapture attribute on function arguments, "
+             "if possible">,
+      Option<"setNoAlias", "set-noalias", "bool", /*default=*/"false",
+             "Set LLVM noalias attribute on function arguments, "
+             "if possible">,
   ];
 }

diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 77751908e35be..378913fcb1329 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -350,11 +350,15 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm,
   else
     framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None;

+  bool setNoCapture = false, setNoAlias = false;
+  if (config.OptLevel.isOptimizingForSpeed())
+    setNoCapture = setNoAlias = true;
+
   pm.addPass(fir::createFunctionAttr(
       {framePointerKind, config.InstrumentFunctionEntry,
        config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath,
        config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath,
-       ""}));
+       /*tuneCPU=*/"", setNoCapture, setNoAlias}));

   if (config.EnableOpenMP) {
     pm.addNestedPass(
diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp
index 43e4c1a7af3cd..c8cdba0d6f9c4 100644
--- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp
+++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp
@@ -27,17 +27,8 @@ namespace {

 class FunctionAttrPass : public fir::impl::FunctionAttrBase {
 public:
-  FunctionAttrPass(const fir::FunctionAttrOptions &options) {
-    instrumentFunctionEntry = options.instrumentFunctionEntry;
-    instrumentFunctionExit = options.instrumentFunctionExit;
-    framePointerKind = options.framePointerKind;
-    noInfsFPMath = options.noInfsFPMath;
-    noNaNsFPMath = options.noNaNsFPMath;
-    approxFuncFPMath = options.approxFuncFPMath;
-    noSignedZerosFPMath = options.noSignedZerosFPMath;
-    unsafeFPMath = options.unsafeFPMath;
-  }
-  FunctionAttrPass() {}
+  FunctionAttrPass(const fir::FunctionAttrOptions &options) : Base{options} {}
+  FunctionAttrPass() = default;
   void runOnOperation() override;
 };

@@ -56,14 +47,28 @@ void FunctionAttrPass::runOnOperation() {
   if ((isFromModule || !func.isDeclaration()) &&
       !fir::hasBindcAttr(func.getOperation())) {
     llvm::StringRef nocapture = mlir::LLVM::LLVMDialect::getNoCaptureAttrName();
+    llvm::StringRef noalias = mlir::LLVM::LLVMDialect::getNoAliasAttrName();
     mlir::UnitAttr unitAttr = mlir::UnitAttr::get(func.getContext());
     for (auto [index, argType] : llvm::enumerate(func.getArgumentTypes())) {
+      bool isNoCapture = false;
+      bool isNoAlias = false;
       if (mlir::isa(argType) &&
           !func.getArgAttr(index, fir::getTargetAttrName()) &&
           !func.getArgAttr(index, fir::getAsynchronousAttrName()) &&
-          !func.getArgAttr(index, fir::getVolatileAttrName()))
+          !func.getArgAttr(index, fir::getVolatileAttrName())) {
+        isNoCapture = true;
+        isNoAlias = !fir::isPointerType(argType);
+      } else if (mlir::isa(argType)) {
+        // !fir.box arguments will be passed as descriptor pointers
+        // at LLVM IR dialect level - they cannot be captured,
+        // and cannot alias with anything within the function.
+        isNoCapture = isNoAlias = true;
+      }
+      if (isNoCapture && setNoCapture)
         func.setArgAttr(index, nocapture, unitAttr);
+      if (isNoAlias && setNoAlias)
+        func.setArgAttr(index, noalias, unitAttr);
     }
   }

diff --git a/flang/test/Fir/array-coor.fir b/flang/test/Fir/array-coor.fir
index a765670d20b28..2caa727a10c50 100644
--- a/flang/test/Fir/array-coor.fir
+++ b/flang/test/Fir/array-coor.fir
@@ -33,7 +33,7 @@ func.func @test_array_coor_box_component_slice(%arg0: !fir.box) -> ()
 // CHECK-LABEL: define void @test_array_coor_box_component_slice(
-// CHECK-SAME: ptr %[[VAL_0:.*]])
+// CHECK-SAME: ptr {{[^%]*}}%[[VAL_0:.*]])
 // CHECK: %[[VAL_1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[VAL_0]], i32 0, i32 7, i32 0, i32 2
 // CHECK: %[[VAL_2:.*]] = load i64, ptr %[[VAL_1]]
 // CHECK: %[[VAL_3:.*]] = mul nsw i64 1, %[[VAL_2]]
diff --git a/flang/test/Fir/arrayset.fir b/flang/test/Fir/arrayset.fir
index dab939aba1702..cb26971cb962d 100644
--- a/flang/test/Fir/arrayset.fir
+++ b/flang/test/Fir/arrayset.fir
@@ -1,7 +1,7 @@
 // RUN: tco %s | FileCheck %s
 // RUN: %flang_fc1 -emit-llvm %s -o - | FileCheck %s

-// CHECK-LABEL: define void @x(ptr captures(none) %0)
+// CHECK-LABEL: define void @x(
 func.func @x(%arr : !fir.ref>) {
   %1 = arith.constant 0 : index
   %2 = arith.constant 9 : index
diff --git a/flang/test/Fir/arrexp.fir b/flang/test/Fir/arrexp.fir
index 924c1fab8d84b..6c7f71f6f1f9c 100644
--- a/flang/test/Fir/arrexp.fir
+++ b/flang/test/Fir/arrexp.fir
@@ -1,7 +1,7 @@
 // RUN: tco %s | FileCheck %s

 // CHECK-LABEL: define void @f1
-// CHECK: (ptr captures(none) %[[A:[^,]*]], {{.*}}, float %[[F:.*]])
+// CHECK: (ptr {{[^%]*}}%[[A:[^,]*]], {{.*}}, float %[[F:.*]])
 func.func @f1(%a : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) {
   %c1 = arith.constant 1 : index
   %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2>
@@ -23,7 +23,7 @@ func.func @f1(%a : !fir.ref>, %n : index, %m :
index, %o : i } // CHECK-LABEL: define void @f2 -// CHECK: (ptr captures(none) %[[A:[^,]*]], {{.*}}, float %[[F:.*]]) +// CHECK: (ptr {{[^%]*}}%[[A:[^,]*]], {{.*}}, float %[[F:.*]]) func.func @f2(%a : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -47,7 +47,7 @@ func.func @f2(%a : !fir.ref>, %b : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -72,7 +72,7 @@ func.func @f3(%a : !fir.ref>, %b : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -102,7 +102,7 @@ func.func @f4(%a : !fir.ref>, %b : !fir.ref>, %arg1: !fir.box>, %arg2: f32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -135,7 +135,7 @@ func.func @f5(%arg0: !fir.box>, %arg1: !fir.box>, %arg1: f32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -165,7 +165,7 @@ func.func @f6(%arg0: !fir.box>, %arg1: f32) { // Non contiguous array with lower bounds (x = y(100), with y(4:)) // Test array_coor offset computation. // CHECK-LABEL: define void @f7( -// CHECK: ptr captures(none) %[[X:[^,]*]], ptr %[[Y:.*]]) +// CHECK: ptr {{[^%]*}}%[[X:[^,]*]], ptr {{[^%]*}}%[[Y:.*]]) func.func @f7(%arg0: !fir.ref, %arg1: !fir.box>) { %c4 = arith.constant 4 : index %c100 = arith.constant 100 : index @@ -181,7 +181,7 @@ func.func @f7(%arg0: !fir.ref, %arg1: !fir.box>) { // Test A(:, :)%x reference codegen with A constant shape. 
// CHECK-LABEL: define void @f8( -// CHECK-SAME: ptr captures(none) %[[A:.*]], i32 %[[I:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[A:.*]], i32 %[[I:.*]]) func.func @f8(%a : !fir.ref>>, %i : i32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -198,7 +198,7 @@ func.func @f8(%a : !fir.ref>>, %i : i32) { // Test casts in in array_coor offset computation when type parameters are not i64 // CHECK-LABEL: define ptr @f9( -// CHECK-SAME: i32 %[[I:.*]], i64 %{{.*}}, i64 %{{.*}}, ptr captures(none) %[[C:.*]]) +// CHECK-SAME: i32 %[[I:.*]], i64 %{{.*}}, i64 %{{.*}}, ptr {{[^%]*}}%[[C:.*]]) func.func @f9(%i: i32, %e : i64, %j: i64, %c: !fir.ref>>) -> !fir.ref> { %s = fir.shape %e, %e : (i64, i64) -> !fir.shape<2> // CHECK: %[[CAST:.*]] = sext i32 %[[I]] to i64 diff --git a/flang/test/Fir/box-offset-codegen.fir b/flang/test/Fir/box-offset-codegen.fir index 15c9a11e5aefe..11d5750ffc385 100644 --- a/flang/test/Fir/box-offset-codegen.fir +++ b/flang/test/Fir/box-offset-codegen.fir @@ -7,7 +7,7 @@ func.func @scalar_addr(%scalar : !fir.ref>>) -> !fir.llvm_ return %addr : !fir.llvm_ptr>> } // CHECK-LABEL: define ptr @scalar_addr( -// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 0 // CHECK: ret ptr %[[VAL_0]] @@ -16,7 +16,7 @@ func.func @scalar_tdesc(%scalar : !fir.ref>>) -> !fir.llvm return %tdesc : !fir.llvm_ptr>> } // CHECK-LABEL: define ptr @scalar_tdesc( -// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 7 // CHECK: ret ptr %[[VAL_0]] @@ -25,7 +25,7 @@ func.func @array_addr(%array : !fir.ref>>> } // CHECK-LABEL: define ptr @array_addr( -// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr 
{{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 0 // CHECK: ret ptr %[[VAL_0]] @@ -34,6 +34,6 @@ func.func @array_tdesc(%array : !fir.ref>> } // CHECK-LABEL: define ptr @array_tdesc( -// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 8 // CHECK: ret ptr %[[VAL_0]] diff --git a/flang/test/Fir/box-typecode.fir b/flang/test/Fir/box-typecode.fir index 766c5165b947c..a8d43eba39889 100644 --- a/flang/test/Fir/box-typecode.fir +++ b/flang/test/Fir/box-typecode.fir @@ -6,7 +6,7 @@ func.func @test_box_typecode(%a: !fir.class) -> i32 { } // CHECK-LABEL: @test_box_typecode( -// CHECK-SAME: ptr %[[BOX:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]) // CHECK: %[[GEP:.*]] = getelementptr { ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}} }, ptr %[[BOX]], i32 0, i32 4 // CHECK: %[[TYPE_CODE:.*]] = load i8, ptr %[[GEP]] // CHECK: %[[TYPE_CODE_CONV:.*]] = sext i8 %[[TYPE_CODE]] to i32 diff --git a/flang/test/Fir/box.fir b/flang/test/Fir/box.fir index 5e931a2e0d9aa..c0cf3d8375983 100644 --- a/flang/test/Fir/box.fir +++ b/flang/test/Fir/box.fir @@ -24,7 +24,7 @@ func.func private @g(%b : !fir.box) func.func private @ga(%b : !fir.box>) // CHECK-LABEL: define void @f -// CHECK: (ptr captures(none) %[[ARG:.*]]) +// CHECK: (ptr {{[^%]*}}%[[ARG:.*]]) func.func @f(%a : !fir.ref) { // CHECK: %[[DESC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 } // CHECK: %[[INS0:.*]] = insertvalue {{.*}} { ptr undef, i64 4, i32 20240719, i8 0, i8 27, i8 0, i8 0 }, ptr %[[ARG]], 0 @@ -38,7 +38,7 @@ func.func @f(%a : !fir.ref) { } // CHECK-LABEL: define void @fa -// CHECK: (ptr captures(none) %[[ARG:.*]]) +// CHECK: (ptr {{[^%]*}}%[[ARG:.*]]) func.func @fa(%a : !fir.ref>) { %c = fir.convert %a : 
(!fir.ref>) -> !fir.ref> %c1 = arith.constant 1 : index @@ -54,7 +54,7 @@ func.func @fa(%a : !fir.ref>) { // Boxing of a scalar character of dynamic length // CHECK-LABEL: define void @b1( -// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]]) func.func @b1(%arg0 : !fir.ref>, %arg1 : index) -> !fir.box> { // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 } // CHECK: %[[size:.*]] = mul i64 1, %[[arg1]] @@ -69,8 +69,8 @@ func.func @b1(%arg0 : !fir.ref>, %arg1 : index) -> !fir.box>>, %arg1 : index) -> !fir.box>> { %1 = fir.shape %arg1 : (index) -> !fir.shape<1> // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } @@ -85,7 +85,7 @@ func.func @b2(%arg0 : !fir.ref>>, %arg1 : index) -> // Boxing of a dynamic array of character of dynamic length // CHECK-LABEL: define void @b3( -// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]], i64 %[[arg2:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]], i64 %[[arg2:.*]]) func.func @b3(%arg0 : !fir.ref>>, %arg1 : index, %arg2 : index) -> !fir.box>> { %1 = fir.shape %arg2 : (index) -> !fir.shape<1> // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } @@ -103,7 +103,7 @@ func.func @b3(%arg0 : !fir.ref>>, %arg1 : index, %ar // Boxing of a static array of character of dynamic length // CHECK-LABEL: define void @b4( -// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]]) func.func @b4(%arg0 : !fir.ref>>, %arg1 : index) -> !fir.box>> { %c_7 = arith.constant 7 : index %1 = fir.shape %c_7 : (index) -> !fir.shape<1> @@ -122,7 +122,7 @@ func.func @b4(%arg0 : !fir.ref>>, %arg1 : index) -> // Storing a fir.box into a fir.ref 
(modifying descriptors). // CHECK-LABEL: define void @b5( -// CHECK-SAME: ptr captures(none) %[[arg0:.*]], ptr %[[arg1:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[arg0:.*]], ptr {{[^%]*}}%[[arg1:.*]]) func.func @b5(%arg0 : !fir.ref>>>, %arg1 : !fir.box>>) { fir.store %arg1 to %arg0 : !fir.ref>>> // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr %0, ptr %1, i32 72, i1 false) @@ -132,7 +132,7 @@ func.func @b5(%arg0 : !fir.ref>>>, %arg1 func.func private @callee6(!fir.box) -> i32 // CHECK-LABEL: define i32 @box6( -// CHECK-SAME: ptr captures(none) %[[ARG0:.*]], i64 %[[ARG1:.*]], i64 %[[ARG2:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[ARG0:.*]], i64 %[[ARG1:.*]], i64 %[[ARG2:.*]]) func.func @box6(%0 : !fir.ref>, %1 : index, %2 : index) -> i32 { %c100 = arith.constant 100 : index %c50 = arith.constant 50 : index diff --git a/flang/test/Fir/boxproc.fir b/flang/test/Fir/boxproc.fir index e99dfd0b92afd..5d82522055adc 100644 --- a/flang/test/Fir/boxproc.fir +++ b/flang/test/Fir/boxproc.fir @@ -16,7 +16,7 @@ // CHECK: call void @_QPtest_proc_dummy_other(ptr %[[VAL_6]]) // CHECK-LABEL: define void @_QFtest_proc_dummyPtest_proc_dummy_a(ptr -// CHECK-SAME: captures(none) %[[VAL_0:.*]], ptr nest captures(none) %[[VAL_1:.*]]) +// CHECK-SAME: {{[^%]*}}%[[VAL_0:.*]], ptr nest {{[^%]*}}%[[VAL_1:.*]]) // CHECK-LABEL: define void @_QPtest_proc_dummy_other(ptr // CHECK-SAME: %[[VAL_0:.*]]) @@ -92,7 +92,7 @@ func.func @_QPtest_proc_dummy_other(%arg0: !fir.boxproc<() -> ()>) { // CHECK: call void @llvm.stackrestore.p0(ptr %[[VAL_27]]) // CHECK-LABEL: define { ptr, i64 } @_QFtest_proc_dummy_charPgen_message(ptr -// CHECK-SAME: captures(none) %[[VAL_0:.*]], i64 %[[VAL_1:.*]], ptr nest captures(none) %[[VAL_2:.*]]) +// CHECK-SAME: {{[^%]*}}%[[VAL_0:.*]], i64 %[[VAL_1:.*]], ptr nest {{[^%]*}}%[[VAL_2:.*]]) // CHECK: %[[VAL_3:.*]] = getelementptr { { ptr, i64 } }, ptr %[[VAL_2]], i32 0, i32 0 // CHECK: %[[VAL_4:.*]] = load { ptr, i64 }, ptr %[[VAL_3]], align 8 // CHECK: %[[VAL_5:.*]] = extractvalue { ptr, 
i64 } %[[VAL_4]], 0 diff --git a/flang/test/Fir/commute.fir b/flang/test/Fir/commute.fir index a857ba55b00c5..8713c8ff24e7f 100644 --- a/flang/test/Fir/commute.fir +++ b/flang/test/Fir/commute.fir @@ -11,7 +11,7 @@ func.func @f1(%a : i32, %b : i32) -> i32 { return %3 : i32 } -// CHECK-LABEL: define i32 @f2(ptr captures(none) %0) +// CHECK-LABEL: define i32 @f2(ptr {{[^%]*}}%0) func.func @f2(%a : !fir.ref) -> i32 { %1 = fir.load %a : !fir.ref // CHECK: %[[r2:.*]] = load diff --git a/flang/test/Fir/coordinateof.fir b/flang/test/Fir/coordinateof.fir index 693bdf716ba1d..a01e9e9d1fc40 100644 --- a/flang/test/Fir/coordinateof.fir +++ b/flang/test/Fir/coordinateof.fir @@ -62,7 +62,7 @@ func.func @foo5(%box : !fir.box>>, %i : index) -> i32 } // CHECK-LABEL: @foo6 -// CHECK-SAME: (ptr %[[box:.*]], i64 %{{.*}}, ptr captures(none) %{{.*}}) +// CHECK-SAME: (ptr {{[^%]*}}%[[box:.*]], i64 %{{.*}}, ptr {{[^%]*}}%{{.*}}) func.func @foo6(%box : !fir.box>>>, %i : i64 , %res : !fir.ref>) { // CHECK: %[[addr_gep:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr %[[box]], i32 0, i32 0 // CHECK: %[[addr:.*]] = load ptr, ptr %[[addr_gep]] diff --git a/flang/test/Fir/embox.fir b/flang/test/Fir/embox.fir index 18b5efbc6a0e4..0f304cff2c79e 100644 --- a/flang/test/Fir/embox.fir +++ b/flang/test/Fir/embox.fir @@ -2,7 +2,7 @@ // RUN: %flang_fc1 -mmlir -disable-external-name-interop -emit-llvm %s -o -| FileCheck %s -// CHECK-LABEL: define void @_QPtest_callee(ptr %0) +// CHECK-LABEL: define void @_QPtest_callee( func.func @_QPtest_callee(%arg0: !fir.box>) { return } @@ -29,7 +29,7 @@ func.func @_QPtest_slice() { return } -// CHECK-LABEL: define void @_QPtest_dt_callee(ptr %0) +// CHECK-LABEL: define void @_QPtest_dt_callee( func.func @_QPtest_dt_callee(%arg0: !fir.box>) { return } @@ -63,7 +63,7 @@ func.func @_QPtest_dt_slice() { func.func private @takesRank2CharBox(!fir.box>>) // CHECK-LABEL: define void @emboxSubstring( -// CHECK-SAME: ptr captures(none) 
%[[arg0:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[arg0:.*]]) func.func @emboxSubstring(%arg0: !fir.ref>>) { %c2 = arith.constant 2 : index %c3 = arith.constant 3 : index @@ -84,7 +84,7 @@ func.func @emboxSubstring(%arg0: !fir.ref>>) { func.func private @do_something(!fir.box>) -> () // CHECK: define void @fir_dev_issue_1416 -// CHECK-SAME: ptr captures(none) %[[base_addr:.*]], i64 %[[low:.*]], i64 %[[up:.*]], i64 %[[at:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[base_addr:.*]], i64 %[[low:.*]], i64 %[[up:.*]], i64 %[[at:.*]]) func.func @fir_dev_issue_1416(%arg0: !fir.ref>, %low: index, %up: index, %at : index) { // Test fir.embox with a constant interior array shape. %c1 = arith.constant 1 : index diff --git a/flang/test/Fir/field-index.fir b/flang/test/Fir/field-index.fir index 55d173201f29a..19cfd2c04ad99 100644 --- a/flang/test/Fir/field-index.fir +++ b/flang/test/Fir/field-index.fir @@ -7,7 +7,7 @@ // CHECK-DAG: %[[c:.*]] = type { float, %[[b]] } // CHECK-LABEL: @simple_field -// CHECK-SAME: (ptr captures(none) %[[arg0:.*]]) +// CHECK-SAME: (ptr {{[^%]*}}%[[arg0:.*]]) func.func @simple_field(%arg0: !fir.ref>) -> i32 { // CHECK: %[[GEP:.*]] = getelementptr %a, ptr %[[arg0]], i32 0, i32 1 %2 = fir.coordinate_of %arg0, i : (!fir.ref>) -> !fir.ref @@ -17,7 +17,7 @@ func.func @simple_field(%arg0: !fir.ref>) -> i32 { } // CHECK-LABEL: @derived_field -// CHECK-SAME: (ptr captures(none) %[[arg0:.*]]) +// CHECK-SAME: (ptr {{[^%]*}}%[[arg0:.*]]) func.func @derived_field(%arg0: !fir.ref}>>) -> i32 { // CHECK: %[[GEP:.*]] = getelementptr %c, ptr %[[arg0]], i32 0, i32 1, i32 1 %3 = fir.coordinate_of %arg0, some_b, i : (!fir.ref}>>) -> !fir.ref diff --git a/flang/test/Fir/ignore-missing-type-descriptor.fir b/flang/test/Fir/ignore-missing-type-descriptor.fir index f9dcb7db77afe..d3e6dd166ca45 100644 --- a/flang/test/Fir/ignore-missing-type-descriptor.fir +++ b/flang/test/Fir/ignore-missing-type-descriptor.fir @@ -15,7 +15,7 @@ func.func @test_embox(%addr: !fir.ref) { return } // 
CHECK-LABEL: define void @test_embox( -// CHECK-SAME: ptr captures(none) %[[ADDR:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[ADDR:.*]]) // CHECK: insertvalue { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] } // CHECK-SAME: { ptr undef, i64 4, // CHECK-SAME: i32 20240719, i8 0, i8 42, i8 0, i8 1, ptr null, [1 x i64] zeroinitializer }, diff --git a/flang/test/Fir/polymorphic.fir b/flang/test/Fir/polymorphic.fir index ea1099af6b988..84fa2e950633f 100644 --- a/flang/test/Fir/polymorphic.fir +++ b/flang/test/Fir/polymorphic.fir @@ -86,7 +86,7 @@ func.func @_QMunlimitedPsub1(%arg0: !fir.class> {fir.bindc_na } // CHECK-LABEL: define void @_QMunlimitedPsub1( -// CHECK-SAME: ptr %[[ARRAY:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[ARRAY:.*]]){{.*}}{ // CHECK: %[[BOX:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] } // CHECK: %{{.}} = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[ARRAY]], i32 0, i32 7, i32 0, i32 2 // CHECK: %[[TYPE_DESC_GEP:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[ARRAY]], i32 0, i32 8 diff --git a/flang/test/Fir/rebox.fir b/flang/test/Fir/rebox.fir index 140308be6a814..0c9f6d9bb94ad 100644 --- a/flang/test/Fir/rebox.fir +++ b/flang/test/Fir/rebox.fir @@ -9,7 +9,7 @@ func.func private @bar1(!fir.box>) // CHECK-LABEL: define void @test_rebox_1( -// CHECK-SAME: ptr %[[INBOX:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[INBOX:.*]]) func.func @test_rebox_1(%arg0: !fir.box>) { // CHECK: %[[OUTBOX_ALLOC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } %c2 = arith.constant 2 : index @@ -54,7 +54,7 @@ func.func @test_rebox_1(%arg0: !fir.box>) { func.func private @bar_rebox_test2(!fir.box>>) // CHECK-LABEL: define void @test_rebox_2( -// CHECK-SAME: ptr %[[INBOX:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[INBOX:.*]]) func.func @test_rebox_2(%arg0: !fir.box>>) { %c1 = arith.constant 1 : index %c4 = arith.constant 4 : index @@ -82,7 +82,7 @@ func.func 
@test_rebox_2(%arg0: !fir.box>>) { func.func private @bar_rebox_test3(!fir.box>) // CHECK-LABEL: define void @test_rebox_3( -// CHECK-SAME: ptr %[[INBOX:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[INBOX:.*]]) func.func @test_rebox_3(%arg0: !fir.box>) { // CHECK: %[[OUTBOX_ALLOC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [3 x [3 x i64]] } %c2 = arith.constant 2 : index @@ -116,7 +116,7 @@ func.func @test_rebox_3(%arg0: !fir.box>) { // time constant length. // CHECK-LABEL: define void @test_rebox_4( -// CHECK-SAME: ptr %[[INPUT:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[INPUT:.*]]) func.func @test_rebox_4(%arg0: !fir.box>>) { // CHECK: %[[NEWBOX_STORAGE:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } // CHECK: %[[EXTENT_GEP:.*]] = getelementptr {{{.*}}}, ptr %[[INPUT]], i32 0, i32 7, i32 0, i32 1 @@ -144,7 +144,7 @@ func.func private @bar_test_rebox_4(!fir.box>>) { // CHECK: %[[OUTBOX_ALLOC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } %c1 = arith.constant 1 : index @@ -184,7 +184,7 @@ func.func @test_cmplx_1(%arg0: !fir.box>>) { // end subroutine // CHECK-LABEL: define void @test_cmplx_2( -// CHECK-SAME: ptr %[[INBOX:.*]]) +// CHECK-SAME: ptr {{[^%]*}}%[[INBOX:.*]]) func.func @test_cmplx_2(%arg0: !fir.box>>) { // CHECK: %[[OUTBOX_ALLOC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } %c7 = arith.constant 7 : index diff --git a/flang/test/Fir/struct-passing-x86-64-byval.fir b/flang/test/Fir/struct-passing-x86-64-byval.fir index 8451c26095226..997d2930f836c 100644 --- a/flang/test/Fir/struct-passing-x86-64-byval.fir +++ b/flang/test/Fir/struct-passing-x86-64-byval.fir @@ -80,27 +80,27 @@ func.func @not_enough_int_reg_3(%arg0: i32, %arg1: i32, %arg2: i32, %arg3: i32, } } -// CHECK: define void @takes_toobig(ptr byval(%toobig) align 8 captures(none) %{{.*}}) { -// CHECK: define void @takes_toobig_align16(ptr byval(%toobig_align16) align 16 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_int_reg_1(i32 %{{.*}}, i32 
%{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_int_reg_1b(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}, ptr byval(%fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_int_reg_2(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%fits_in_2_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @ftakes_toobig(ptr byval(%ftoobig) align 8 captures(none) %{{.*}}) { -// CHECK: define void @ftakes_toobig_align16(ptr byval(%ftoobig_align16) align 16 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_sse_reg_1(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_sse_reg_1b(<2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, ptr byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_sse_reg_1c(double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, ptr byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_sse_reg_2(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr byval(%fits_in_2_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @test_contains_x87(ptr byval(%contains_x87) align 16 captures(none) %{{.*}}) { -// CHECK: define void @test_contains_complex_x87(ptr byval(%contains_complex_x87) align 16 captures(none) %{{.*}}) { -// CHECK: define void 
@test_nested_toobig(ptr byval(%nested_toobig) align 8 captures(none) %{{.*}}) { -// CHECK: define void @test_badly_aligned(ptr byval(%badly_aligned) align 8 captures(none) %{{.*}}) { -// CHECK: define void @test_logical_toobig(ptr byval(%logical_too_big) align 8 captures(none) %{{.*}}) { -// CHECK: define void @l_not_enough_int_reg(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%l_fits_in_2_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @test_complex_toobig(ptr byval(%complex_too_big) align 8 captures(none) %{{.*}}) { -// CHECK: define void @cplx_not_enough_sse_reg_1(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr byval(%cplx_fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @test_char_to_big(ptr byval(%char_too_big) align 8 captures(none) %{{.*}}) { -// CHECK: define void @char_not_enough_int_reg_1(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%char_fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @mix_not_enough_int_reg_1(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%mix_in_1_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @mix_not_enough_sse_reg_2(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr byval(%mix_in_1_int_reg_1_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_int_reg_3(ptr sret({ fp128, fp128 }) align 16 captures(none) %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%fits_in_1_int_reg) align 8 captures(none) %{{.*}}) +// CHECK: define void @takes_toobig(ptr noalias byval(%toobig) align 8 captures(none) %{{.*}}) { +// CHECK: define void @takes_toobig_align16(ptr noalias byval(%toobig_align16) align 16 captures(none) %{{.*}}) { +// CHECK: 
define void @not_enough_int_reg_1(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias byval(%fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_int_reg_1b(ptr noalias captures(none) %{{.*}}, ptr noalias captures(none) %{{.*}}, ptr noalias captures(none) %{{.*}}, ptr noalias captures(none) %{{.*}}, ptr noalias captures(none) %{{.*}}, ptr noalias captures(none) %{{.*}}, ptr noalias byval(%fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_int_reg_2(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias byval(%fits_in_2_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @ftakes_toobig(ptr noalias byval(%ftoobig) align 8 captures(none) %{{.*}}) { +// CHECK: define void @ftakes_toobig_align16(ptr noalias byval(%ftoobig_align16) align 16 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_sse_reg_1(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr noalias byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_sse_reg_1b(<2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, ptr noalias byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_sse_reg_1c(double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, ptr noalias byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_sse_reg_2(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr noalias byval(%fits_in_2_sse_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @test_contains_x87(ptr noalias byval(%contains_x87) align 16 
captures(none) %{{.*}}) { +// CHECK: define void @test_contains_complex_x87(ptr noalias byval(%contains_complex_x87) align 16 captures(none) %{{.*}}) { +// CHECK: define void @test_nested_toobig(ptr noalias byval(%nested_toobig) align 8 captures(none) %{{.*}}) { +// CHECK: define void @test_badly_aligned(ptr noalias byval(%badly_aligned) align 8 captures(none) %{{.*}}) { +// CHECK: define void @test_logical_toobig(ptr noalias byval(%logical_too_big) align 8 captures(none) %{{.*}}) { +// CHECK: define void @l_not_enough_int_reg(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias byval(%l_fits_in_2_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @test_complex_toobig(ptr noalias byval(%complex_too_big) align 8 captures(none) %{{.*}}) { +// CHECK: define void @cplx_not_enough_sse_reg_1(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr noalias byval(%cplx_fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @test_char_to_big(ptr noalias byval(%char_too_big) align 8 captures(none) %{{.*}}) { +// CHECK: define void @char_not_enough_int_reg_1(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias byval(%char_fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @mix_not_enough_int_reg_1(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias byval(%mix_in_1_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @mix_not_enough_sse_reg_2(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr noalias byval(%mix_in_1_int_reg_1_sse_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_int_reg_3(ptr noalias sret({ fp128, fp128 }) align 16 captures(none) %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias 
byval(%fits_in_1_int_reg) align 8 captures(none) %{{.*}}) diff --git a/flang/test/Fir/target-rewrite-complex-10-x86.fir b/flang/test/Fir/target-rewrite-complex-10-x86.fir index 6404b4f766d39..5f917ee42d598 100644 --- a/flang/test/Fir/target-rewrite-complex-10-x86.fir +++ b/flang/test/Fir/target-rewrite-complex-10-x86.fir @@ -30,5 +30,5 @@ func.func @takecomplex10(%z: complex) { // AMD64: %[[VAL_3:.*]] = fir.alloca complex // AMD64: fir.store %[[VAL_2]] to %[[VAL_3]] : !fir.ref> -// AMD64_LLVM: define void @takecomplex10(ptr byval({ x86_fp80, x86_fp80 }) align 16 captures(none) %0) +// AMD64_LLVM: define void @takecomplex10(ptr noalias byval({ x86_fp80, x86_fp80 }) align 16 captures(none) %0) } diff --git a/flang/test/Fir/target.fir b/flang/test/Fir/target.fir index 781d153f525ff..e1190649e0803 100644 --- a/flang/test/Fir/target.fir +++ b/flang/test/Fir/target.fir @@ -26,7 +26,7 @@ func.func @gen4() -> complex { return %6 : complex } -// I32-LABEL: define void @gen8(ptr sret({ double, double }) align 4 captures(none) % +// I32-LABEL: define void @gen8(ptr noalias sret({ double, double }) align 4 captures(none) % // X64-LABEL: define { double, double } @gen8() // AARCH64-LABEL: define { double, double } @gen8() // PPC-LABEL: define { double, double } @gen8() @@ -93,9 +93,9 @@ func.func @call8() { return } -// I32-LABEL: define i64 @char1lensum(ptr captures(none) %0, ptr captures(none) %1, i32 %2, i32 %3) -// X64-LABEL: define i64 @char1lensum(ptr captures(none) %0, ptr captures(none) %1, i64 %2, i64 %3) -// PPC-LABEL: define i64 @char1lensum(ptr captures(none) %0, ptr captures(none) %1, i64 %2, i64 %3) +// I32-LABEL: define i64 @char1lensum(ptr {{[^%]*}}%0, ptr {{[^%]*}}%1, i32 %2, i32 %3) +// X64-LABEL: define i64 @char1lensum(ptr {{[^%]*}}%0, ptr {{[^%]*}}%1, i64 %2, i64 %3) +// PPC-LABEL: define i64 @char1lensum(ptr {{[^%]*}}%0, ptr {{[^%]*}}%1, i64 %2, i64 %3) func.func @char1lensum(%arg0 : !fir.boxchar<1>, %arg1 : !fir.boxchar<1>) -> i64 { // X64-DAG: %[[p0:.*]] 
= insertvalue { ptr, i64 } undef, ptr %1, 0 // X64-DAG: = insertvalue { ptr, i64 } %[[p0]], i64 %3, 1 diff --git a/flang/test/Fir/tbaa-codegen.fir b/flang/test/Fir/tbaa-codegen.fir index 87bb15c0fea6c..b6b0982b3934e 100644 --- a/flang/test/Fir/tbaa-codegen.fir +++ b/flang/test/Fir/tbaa-codegen.fir @@ -28,7 +28,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ } // CHECK-LABEL: define void @_QPsimple( -// CHECK-SAME: ptr %[[ARG0:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[ARG0:.*]]){{.*}}{ // [...] // load a(2): // CHECK: %[[VAL20:.*]] = getelementptr i8, ptr %{{.*}}, i64 %{{.*}} diff --git a/flang/test/Fir/tbaa-codegen2.fir b/flang/test/Fir/tbaa-codegen2.fir index 8f8b6a29129e7..69b36c2611505 100644 --- a/flang/test/Fir/tbaa-codegen2.fir +++ b/flang/test/Fir/tbaa-codegen2.fir @@ -60,7 +60,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ } } // CHECK-LABEL: define void @_QPfunc( -// CHECK-SAME: ptr %[[ARG0:.*]]){{.*}}{ +// CHECK-SAME: ptr {{[^%]*}}%[[ARG0:.*]]){{.*}}{ // [...] 
// CHECK: %[[VAL5:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr %[[ARG0]], i32 0, i32 7, i32 0, i32 0 // box access: diff --git a/flang/test/Integration/OpenMP/copyprivate.f90 b/flang/test/Integration/OpenMP/copyprivate.f90 index 3bae003ea8d83..e0e4abe015438 100644 --- a/flang/test/Integration/OpenMP/copyprivate.f90 +++ b/flang/test/Integration/OpenMP/copyprivate.f90 @@ -8,25 +8,25 @@ !RUN: %flang_fc1 -emit-llvm -fopenmp %s -o - | FileCheck %s -!CHECK-DAG: define internal void @_copy_box_Uxi32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_box_10xi32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_i64(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_box_Uxi64(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_f32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_box_2x3xf32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_z32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_box_10xz32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_l32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_box_5xl32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_c8x8(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_box_10xc8x8(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_c16x5(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_rec__QFtest_typesTdt(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void 
@_copy_box_heap_Uxi32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) -!CHECK-DAG: define internal void @_copy_box_ptr_Uxc8x9(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_box_Uxi32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_box_10xi32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_i64(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_box_Uxi64(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_f32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_box_2x3xf32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_z32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_box_10xz32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_l32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_box_5xl32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_c8x8(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_box_10xc8x8(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_c16x5(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_rec__QFtest_typesTdt(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_box_heap_Uxi32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) +!CHECK-DAG: define internal void @_copy_box_ptr_Uxc8x9(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) !CHECK-LABEL: define internal void @_copy_i32( -!CHECK-SAME: ptr captures(none) %[[DST:.*]], ptr captures(none) %[[SRC:.*]]){{.*}} { +!CHECK-SAME: ptr {{[^%]*}}%[[DST:.*]], ptr {{[^%]*}}%[[SRC:.*]]){{.*}} { !CHECK-NEXT: %[[SRC_VAL:.*]] = load i32, ptr %[[SRC]] !CHECK-NEXT: store i32 
%[[SRC_VAL]], ptr %[[DST]] !CHECK-NEXT: ret void diff --git a/flang/test/Integration/debug-local-var-2.f90 b/flang/test/Integration/debug-local-var-2.f90 index 08aeb0999b01b..468bb0c5a1289 100644 --- a/flang/test/Integration/debug-local-var-2.f90 +++ b/flang/test/Integration/debug-local-var-2.f90 @@ -20,7 +20,7 @@ ! BOTH-LABEL: } ! BOTH-LABEL: define {{.*}}i64 @_QFPfn1 -! BOTH-SAME: (ptr captures(none) %[[ARG1:.*]], ptr captures(none) %[[ARG2:.*]], ptr captures(none) %[[ARG3:.*]]) +! BOTH-SAME: (ptr {{[^%]*}}%[[ARG1:.*]], ptr {{[^%]*}}%[[ARG2:.*]], ptr {{[^%]*}}%[[ARG3:.*]]) ! RECORDS-DAG: #dbg_declare(ptr %[[ARG1]], ![[A1:.*]], !DIExpression(), !{{.*}}) ! RECORDS-DAG: #dbg_declare(ptr %[[ARG2]], ![[B1:.*]], !DIExpression(), !{{.*}}) ! RECORDS-DAG: #dbg_declare(ptr %[[ARG3]], ![[C1:.*]], !DIExpression(), !{{.*}}) @@ -29,7 +29,7 @@ ! BOTH-LABEL: } ! BOTH-LABEL: define {{.*}}i32 @_QFPfn2 -! BOTH-SAME: (ptr captures(none) %[[FN2ARG1:.*]], ptr captures(none) %[[FN2ARG2:.*]], ptr captures(none) %[[FN2ARG3:.*]]) +! BOTH-SAME: (ptr {{[^%]*}}%[[FN2ARG1:.*]], ptr {{[^%]*}}%[[FN2ARG2:.*]], ptr {{[^%]*}}%[[FN2ARG3:.*]]) ! RECORDS-DAG: #dbg_declare(ptr %[[FN2ARG1]], ![[A2:.*]], !DIExpression(), !{{.*}}) ! RECORDS-DAG: #dbg_declare(ptr %[[FN2ARG2]], ![[B2:.*]], !DIExpression(), !{{.*}}) ! RECORDS-DAG: #dbg_declare(ptr %[[FN2ARG3]], ![[C2:.*]], !DIExpression(), !{{.*}}) diff --git a/flang/test/Integration/unroll-loops.f90 b/flang/test/Integration/unroll-loops.f90 index debe45e0ec359..87ab9efeb703b 100644 --- a/flang/test/Integration/unroll-loops.f90 +++ b/flang/test/Integration/unroll-loops.f90 @@ -13,7 +13,7 @@ ! RUN: %if x86-registered-target %{ %{check-nounroll} %} ! ! CHECK-LABEL: @unroll -! CHECK-SAME: (ptr writeonly captures(none) %[[ARG0:.*]]) +! 
CHECK-SAME: (ptr {{[^%]*}}%[[ARG0:.*]]) subroutine unroll(a) integer(kind=8), intent(out) :: a(1000) integer(kind=8) :: i diff --git a/flang/test/Lower/HLFIR/unroll-loops.fir b/flang/test/Lower/HLFIR/unroll-loops.fir index 1321f39677405..89e8ce82d6f3f 100644 --- a/flang/test/Lower/HLFIR/unroll-loops.fir +++ b/flang/test/Lower/HLFIR/unroll-loops.fir @@ -11,7 +11,7 @@ // RUN: %if x86-registered-target %{ %{check-nounroll} %} // CHECK-LABEL: @unroll -// CHECK-SAME: (ptr writeonly captures(none) %[[ARG0:.*]]) +// CHECK-SAME: (ptr {{[^%]*}}%[[ARG0:.*]]) func.func @unroll(%arg0: !fir.ref> {fir.bindc_name = "a"}) { %scope = fir.dummy_scope : !fir.dscope %c1000 = arith.constant 1000 : index diff --git a/flang/test/Lower/forall/character-1.f90 b/flang/test/Lower/forall/character-1.f90 index 69064ddfcf0be..1e4bb73350871 100644 --- a/flang/test/Lower/forall/character-1.f90 +++ b/flang/test/Lower/forall/character-1.f90 @@ -22,7 +22,7 @@ end subroutine sub end program test ! CHECK-LABEL: define internal void @_QFPsub( -! CHECK-SAME: ptr %[[arg:.*]]) +! CHECK-SAME: ptr {{[^%]*}}%[[arg:.*]]) ! CHECK: %[[extent:.*]] = getelementptr { {{.*}}, [1 x [3 x i64]] }, ptr %[[arg]], i32 0, i32 7, i64 0, i32 1 ! CHECK: %[[extval:.*]] = load i64, ptr %[[extent]] ! 
CHECK: %[[elesize:.*]] = getelementptr { {{.*}}, [1 x [3 x i64]] }, ptr %[[arg]], i32 0, i32 1 diff --git a/flang/test/Transforms/constant-argument-globalisation.fir b/flang/test/Transforms/constant-argument-globalisation.fir index 02349de40bc0b..4e5995bdf2207 100644 --- a/flang/test/Transforms/constant-argument-globalisation.fir +++ b/flang/test/Transforms/constant-argument-globalisation.fir @@ -49,8 +49,8 @@ module { // DISABLE-LABEL: ; ModuleID = // DISABLE-NOT: @_extruded // DISABLE: define void @sub1( -// DISABLE-SAME: ptr captures(none) [[ARG0:%.*]], -// DISABLE-SAME: ptr captures(none) [[ARG1:%.*]]) +// DISABLE-SAME: ptr {{[^%]*}}[[ARG0:%.*]], +// DISABLE-SAME: ptr {{[^%]*}}[[ARG1:%.*]]) // DISABLE-SAME: { // DISABLE: [[CONST_R0:%.*]] = alloca double // DISABLE: [[CONST_R1:%.*]] = alloca double diff --git a/flang/test/Transforms/function-attrs-noalias.fir b/flang/test/Transforms/function-attrs-noalias.fir new file mode 100644 index 0000000000000..6733fa96457bc --- /dev/null +++ b/flang/test/Transforms/function-attrs-noalias.fir @@ -0,0 +1,113 @@ +// RUN: fir-opt --function-attr="set-noalias=true" %s | FileCheck %s + +// Test the annotation of function arguments with llvm.noalias. + +// Test !fir.ref arguments. 
+// CHECK-LABEL: func.func private @test_ref( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref {llvm.noalias}) { +func.func private @test_ref(%arg0: !fir.ref) { + return +} + +// CHECK-LABEL: func.func private @test_ref_target( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref {fir.target}) { +func.func private @test_ref_target(%arg0: !fir.ref {fir.target}) { + return +} + +// CHECK-LABEL: func.func private @test_ref_volatile( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref {fir.volatile}) { +func.func private @test_ref_volatile(%arg0: !fir.ref {fir.volatile}) { + return +} + +// CHECK-LABEL: func.func private @test_ref_asynchronous( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref {fir.asynchronous}) { +func.func private @test_ref_asynchronous(%arg0: !fir.ref {fir.asynchronous}) { + return +} + +// CHECK-LABEL: func.func private @test_ref_box( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref> {llvm.noalias}) { +// Test !fir.ref> arguments: +func.func private @test_ref_box(%arg0: !fir.ref>) { + return +} + +// CHECK-LABEL: func.func private @test_ref_box_target( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref> {fir.target}) { +func.func private @test_ref_box_target(%arg0: !fir.ref> {fir.target}) { + return +} + +// CHECK-LABEL: func.func private @test_ref_box_volatile( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref> {fir.volatile}) { +func.func private @test_ref_box_volatile(%arg0: !fir.ref> {fir.volatile}) { + return +} + +// CHECK-LABEL: func.func private @test_ref_box_asynchronous( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref> {fir.asynchronous}) { +func.func private @test_ref_box_asynchronous(%arg0: !fir.ref> {fir.asynchronous}) { + return +} + +// Test POINTER arguments. +// CHECK-LABEL: func.func private @test_ref_box_ptr( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>) { +func.func private @test_ref_box_ptr(%arg0: !fir.ref>>) { + return +} + +// Test ALLOCATABLE arguments. 
+// CHECK-LABEL: func.func private @test_ref_box_heap( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>> {llvm.noalias}) { +func.func private @test_ref_box_heap(%arg0: !fir.ref>>) { + return +} + +// BIND(C) functions are not annotated. +// CHECK-LABEL: func.func private @test_ref_bindc( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref) +func.func private @test_ref_bindc(%arg0: !fir.ref) attributes {fir.bindc_name = "test_ref_bindc", fir.proc_attrs = #fir.proc_attrs} { + return +} + +// Test function declaration from a module. +// CHECK-LABEL: func.func private @_QMtest_modPcheck_module( +// CHECK-SAME: !fir.ref {llvm.noalias}) +func.func private @_QMtest_modPcheck_module(!fir.ref) + +// Test !fir.box arguments: +// CHECK-LABEL: func.func private @test_box( +// CHECK-SAME: %[[ARG0:.*]]: !fir.box {llvm.noalias}) { +func.func private @test_box(%arg0: !fir.box) { + return +} + +// CHECK-LABEL: func.func private @test_box_target( +// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.target, llvm.noalias}) { +func.func private @test_box_target(%arg0: !fir.box {fir.target}) { + return +} + +// CHECK-LABEL: func.func private @test_box_volatile( +// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.volatile, llvm.noalias}) { +func.func private @test_box_volatile(%arg0: !fir.box {fir.volatile}) { + return +} + +// CHECK-LABEL: func.func private @test_box_asynchronous( +// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.asynchronous, llvm.noalias}) { +func.func private @test_box_asynchronous(%arg0: !fir.box {fir.asynchronous}) { + return +} + +// !fir.boxchar<> is lowered before FunctionAttrPass, but let's +// make sure we do not annotate it. 
+// CHECK-LABEL: func.func private @test_boxchar( +// CHECK-SAME: %[[ARG0:.*]]: !fir.boxchar<1>) { +func.func private @test_boxchar(%arg0: !fir.boxchar<1>) { + return +} + diff --git a/flang/test/Transforms/function-attrs.fir b/flang/test/Transforms/function-attrs.fir index 5f871a1a7b6c5..5ebf316586cd0 100644 --- a/flang/test/Transforms/function-attrs.fir +++ b/flang/test/Transforms/function-attrs.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt --function-attr %s | FileCheck %s +// RUN: fir-opt --function-attr="set-nocapture=true" %s | FileCheck %s // If a function has a body and is not bind(c), and if the dummy argument doesn't have the target, // asynchronous, volatile, or pointer attribute, then add llvm.nocapture to the dummy argument. >From 99ea15e1b0ac888e00522067873bfa35bece8e19 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Tue, 20 May 2025 14:11:53 -0700 Subject: [PATCH 2/2] Updated nocapture test. --- flang/test/Transforms/function-attrs.fir | 25 ++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/flang/test/Transforms/function-attrs.fir b/flang/test/Transforms/function-attrs.fir index 5ebf316586cd0..8e3a896fd58bf 100644 --- a/flang/test/Transforms/function-attrs.fir +++ b/flang/test/Transforms/function-attrs.fir @@ -43,3 +43,28 @@ func.func private @_QMarg_modPcheck_args(!fir.ref {fir.target}, !fir.ref {llvm.nocapture}, // CHECK-SAME: !fir.boxchar<1>, // CHECK-SAME: !fir.ref> {llvm.nocapture}) + +// Test !fir.box arguments: +// CHECK-LABEL: func.func private @test_box( +// CHECK-SAME: %[[ARG0:.*]]: !fir.box {llvm.nocapture}) { +func.func private @test_box(%arg0: !fir.box) { + return +} + +// CHECK-LABEL: func.func private @test_box_target( +// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.target, llvm.nocapture}) { +func.func private @test_box_target(%arg0: !fir.box {fir.target}) { + return +} + +// CHECK-LABEL: func.func private @test_box_volatile( +// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.volatile, llvm.nocapture}) { +func.func private 
@test_box_volatile(%arg0: !fir.box {fir.volatile}) { + return +} + +// CHECK-LABEL: func.func private @test_box_asynchronous( +// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.asynchronous, llvm.nocapture}) { +func.func private @test_box_asynchronous(%arg0: !fir.box {fir.asynchronous}) { + return +} From flang-commits at lists.llvm.org Tue May 20 16:09:55 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 20 May 2025 16:09:55 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727) In-Reply-To: Message-ID: <682d0bc3.170a0220.3832b6.b50f@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From a174e845f70aac1cdea8485f2c6563b171b26a47 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, and Destroy. Descriptor-based I/O is now also supported. Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. 
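For readers skimming the patch, the idea in the commit message above (replace recursion with a queue of suspendable, resumable tickets) can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not code from the patch: `Node`, `SumTicket`, `SumQueue`, and `SumIteratively` are hypothetical names, and the two-value `Status` enum only loosely mirrors the `StatOk`/`StatOkContinue` convention described in work-queue.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a derived-type value: a payload plus nested
// components. Not a type from the Fortran runtime.
struct Node {
  int value;
  std::vector<Node> children;
};

// Loosely modeled on StatOk / StatOkContinue: Done ends a ticket,
// Suspend means "run my nested tickets first, then resume me".
enum Status { Done = 0, Suspend = 1 };

// One suspendable unit of work. The resumption state (nextChild, begun)
// lives in the ticket rather than on the call stack.
struct SumTicket {
  const Node *node;
  std::size_t nextChild{0};
  bool begun{false};
};

class SumQueue {
public:
  void Begin(const Node &node) { tickets_.push_back(SumTicket{&node}); }

  // Always steps the most recently pushed ticket, so nested work runs to
  // completion before its parent is resumed.
  int Run() {
    int total{0};
    while (!tickets_.empty()) {
      std::size_t at{tickets_.size() - 1};
      if (Step(at, total) == Done) {
        // A ticket that returns Done pushed nothing, so it is still last.
        assert(at + 1 == tickets_.size());
        tickets_.pop_back();
      }
    }
    return total;
  }

private:
  Status Step(std::size_t at, int &total) {
    SumTicket &ticket{tickets_[at]};
    if (!ticket.begun) { // plays the role of Begin(): runs exactly once
      ticket.begun = true;
      total += ticket.node->value;
    }
    if (ticket.nextChild < ticket.node->children.size()) {
      const Node &child{ticket.node->children[ticket.nextChild++]};
      Begin(child); // may reallocate tickets_; 'ticket' is not used after this
      return Suspend;
    }
    return Done;
  }

  std::vector<SumTicket> tickets_;
};

// Iterative replacement for a recursive "sum the whole tree" routine.
int SumIteratively(const Node &root) {
  SumQueue queue;
  queue.Begin(root);
  return queue.Run();
}
```

The toy uses a heap-backed `std::vector` for brevity. The patch itself keeps a small static array of tickets and a linked list, presumably so that shallow uses avoid dynamic allocation; that detail is not reproduced here.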
--- .../include/flang-rt/runtime/environment.h | 1 + .../include/flang-rt/runtime/work-queue.h | 425 ++++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 518 ++++++++------- flang-rt/lib/runtime/derived.cpp | 485 +++++++------- flang-rt/lib/runtime/descriptor-io.cpp | 629 +++++++++++++++++- flang-rt/lib/runtime/descriptor-io.h | 620 +---------------- flang-rt/lib/runtime/environment.cpp | 4 + flang-rt/lib/runtime/namelist.cpp | 1 + flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 205 ++++++ flang-rt/unittests/Runtime/ExternalIOTest.cpp | 2 +- flang/docs/Extensions.md | 10 + flang/include/flang/Runtime/assign.h | 2 +- 14 files changed, 1825 insertions(+), 1085 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h index 16258b3bbba9b..87fe1f92ba545 100644 --- a/flang-rt/include/flang-rt/runtime/environment.h +++ b/flang-rt/include/flang-rt/runtime/environment.h @@ -63,6 +63,7 @@ struct ExecutionEnvironment { bool noStopMessage{false}; // NO_STOP_MESSAGE=1 inhibits "Fortran STOP" bool defaultUTF8{false}; // DEFAULT_UTF8 bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION + int internalDebugging{0}; // FLANG_RT_DEBUG // CUDA related variables std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..224de3f63bc74 --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,425 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue is a list of tickets. Each ticket class has a Begin() +// member function that is called once, and a Continue() member function +// that can be called zero or more times. A ticket's execution terminates +// when either of these member functions returns a status other than +// StatOkContinue, and if that status is not StatOk, then the whole queue +// is shut down. +// +// By returning StatOkContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the responsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentsOverElements, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. +// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. 
+// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. Run calls AssignTicket::Begin(), which pushes a ticket via +// BeginFinalize() and returns StatOkContinue. +// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatOkContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. + +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/connection.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include + +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; + +// Ticket workers + +// Ticket workers return status codes. Returning StatOkContinue means +// that the ticket is incomplete and must be resumed; any other value +// means that the ticket is complete, and if not StatOk, the whole +// queue can be shut down due to an error. 
+static constexpr int StatOkContinue{1234}; + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +// Base class for ticket workers that operate elementwise over descriptors +class Elementwise { +protected: + RT_API_ATTRS Elementwise( + const Descriptor &instance, const Descriptor *from = nullptr) + : instance_{instance}, from_{from} { + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + RT_API_ATTRS bool IsComplete() const { return elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that operate over derived type components. 
+class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentsOverElements : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Implements derived type instance initialization +class InitializeTicket : private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket : private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatOkContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : private ComponentsOverElements { +public: + RT_API_ATTRS 
FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : to_{to}, from_{&from}, flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type intrinsic assignment. 
+class DerivedAssignTicket : private ElementsOverComponents { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ElementsOverComponents{to, derived, &from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { +template +class DerivedIoTicket : private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatOkContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; // refers to the DescriptorIoTicket::anyIoTookPlace_ +}; + +template class DescriptorIoTicket : private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; +} 
// namespace io::descr + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + io::descr::DerivedIoTicket, + io::descr::DescriptorIoTicket, + io::descr::DescriptorIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + RT_API_ATTRS void BeginInitialize( + const Descriptor &, const typeInfo::DerivedType &); + RT_API_ATTRS void BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &, bool hasStat, + const Descriptor *errMsg); + RT_API_ATTRS void BeginFinalize( + const Descriptor &, const typeInfo::DerivedType &); + RT_API_ATTRS void BeginDestroy( + const Descriptor &, const typeInfo::DerivedType &, bool finalize); + RT_API_ATTRS void BeginAssign( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct); + RT_API_ATTRS void BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct, + Descriptor *deallocateAfter); + RT_API_ATTRS void BeginDerivedIo(io::Direction, io::IoStatementState &, + const Descriptor &, const typeInfo::DerivedType &, + const io::NonTbpDefinedIoTable *, bool &anyIoTookPlace); + RT_API_ATTRS void BeginDescriptorIo(io::Direction, io::IoStatementState &, + const Descriptor &, const io::NonTbpDefinedIoTable *, + bool &anyIoTookPlace); + + RT_API_ATTRS int Run(); + +private: + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 9be75da9520e3..345ec8ef31162 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncId)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic assignments and 
// those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,338 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + workQueue.BeginAssign(to, from, flags, memmoveFct); + workQueue.Run(); +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + workQueue.BeginFinalize(*toDeallocate_, *toDerived_); + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncId))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + workQueue.BeginInitialize(newFrom, *derived); + } + } } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + workQueue.BeginAssign( + newFrom, *from_, MaybeReallocate | PolymorphicLHS, memmoveFct_); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= 
~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; - } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + workQueue.BeginFinalize(to_, *toDerived_); + } else if (!toDerived_->noDestructionNeeded()) { + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false); } } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + return StatOkContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. 
+ if (toDeallocate_) { + toDeallocate_->Deallocate(); + } + return StatOk; + } + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + workQueue.BeginInitialize(to_, *toDerived_); + return StatOkContinue; + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. + if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatOkContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatOkContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". 
- SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); - } - // Copy the data components (incl. 
the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. - int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are 
unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. - // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } - } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + if (toDerived_) { + workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_); + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } 
else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". + SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatOkContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_(instance_.Element(subscripts_) + procPtr.offset, + from_->Element(fromSubscripts_) + procPtr.offset, + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } else { + Elementwise::Reset(); + } + } + return StatOkContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementComponentBase so that it + // loops over elements at the outer level and over components at the inner. + // This yield less surprising behavior for codes that care due to the use + // of defined assignments when the ordering of their calls matters. + for (; !IsComplete(); Advance()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, *from_, workQueue.terminator(), fromSubscripts_); + Advance(); + workQueue.BeginAssign(toCompDesc, fromCompDesc, flags_, memmoveFct_); + return StatOkContinue; + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_->Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false); + return StatOkContinue; + } + } + toDesc->Deallocate(); + } + } else { + // Allocatable components of the LHS are unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to 
a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + workQueue.BeginAssign( + *toDesc, *fromDesc, flags_ | DeallocateLHS, memmoveFct_); + Advance(); + return StatOkContinue; + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +644,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -597,11 +659,11 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
- if (var) + if (var) { Assign(*var, temp, terminator, NoAssignFlags); + } temp.Destroy(/*finalize=*/false, /*destroyPointers=*/false, &terminator); } diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..9fdc016d37d0b 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,172 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + workQueue.BeginInitialize(instance, derived); + return workQueue.Run(); +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + std::size_t myProcPtrs{procPtrDesc.Elements()}; + for (std::size_t k{0}; k < myProcPtrs; ++k) { const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; + *procPtrDesc.ZeroBasedIndexedElement(k)}; SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; 
instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + instance_.GetLowerBounds(at); + for (std::size_t j{0}; j++ < elements_; instance_.IncrementSubscripts(at)) { + auto &pptr{*instance_.ElementComponent( + at, comp.offset)}; + pptr = comp.procInitialization; + } + } + return StatOkContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + for (; !IsComplete(); SkipToNextComponent()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); SkipToNextElement()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); SkipToNextElement()) { + char *ptr{instance_.ElementComponent( + subscripts_, 
component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); SkipToNextElement()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + workQueue.BeginInitialize(compDesc, compType); + return StatOkContinue; } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + workQueue.BeginInitializeClone(clone, original, derived, hasStat, errMsg); + return workQueue.Run(); } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. - stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. 
- if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); - } + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncId), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + workQueue.BeginInitialize(cloneDesc, *derived); + return StatOkContinue; } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_); + return StatOkContinue; + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_); + Advance(); + return StatOkContinue; // will resume at next element in this component + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); } } - return stat; + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + workQueue.BeginFinalize(descriptor, derived); + workQueue.Run(); + } } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +214,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +251,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,86 +275,84 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with the same // descriptor so that the rank is
preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatOkContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); - } + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + workQueue.BeginFinalize(compDesc, *compDynamicType); + return StatOkContinue; } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + workQueue.BeginFinalize(compDesc, *compType); } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) 
{ + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + workQueue.BeginFinalize(compDesc, compType); + return StatOkContinue; + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. + if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + workQueue.BeginFinalize(tmpDesc, *finalizableParentType_); + finalizableParentType_ = nullptr; + return StatOkContinue; + } else { + return StatOk; } } @@ -373,51 +362,61 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. 
RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noDestructionNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + workQueue.BeginDestroy(descriptor, derived, finalize); + workQueue.Run(); } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + workQueue.BeginFinalize(instance_, derived_); } + return StatOkContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. // Contrary to finalization, the order of deallocation does not matter.
- const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + workQueue.BeginDestroy(*d, *componentDerived, /*finalize=*/false); + return StatOkContinue; + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + 
GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + workQueue.BeginDestroy(compDesc, *componentDerived, /*finalize=*/false); + return StatOkContinue; } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..4b8bc8c3b4271 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,12 +7,40 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) Fortran::common::optional DefinedFormattedIo(IoStatementState &io, const Descriptor &descriptor, const typeInfo::DerivedType &derived, @@ -104,8 +132,8 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -149,8 +177,603 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, } handler.Forward(ioStat, ioMsg, sizeof ioMsg); external->PopChildIo(child); - return handler.GetIoStat() == IostatOk; + return handler.GetIoStat() == IostatOk; +} + +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output.
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + workQueue.BeginDescriptorIo(DIR, io_, compDesc, table_, anyIoTookPlace_); + return StatOkContinue; + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + workQueue.BeginDescriptorIo( + DIR, io_, compDesc, table_, anyIoTookPlace_); + return StatOkContinue; + } + } + } + return StatOk; } +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType * + type{addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + workQueue.BeginDerivedIo( + DIR, io_, instance_, *type, table_, anyIoTookPlace_); + return StatOk; + } + } + } + if (const typeInfo::SpecialBinding * + special{type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + anyIoTookPlace_ |= + DefinedUnformattedIo(io_, instance_, *type, *special); + return handler.GetIoStat(); + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + workQueue.BeginDerivedIo( + DIR, io_, instance_, *type, table_, anyIoTookPlace_); + return StatOk; + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. 
+ if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + bool ok{false}; + if constexpr (DIR == Direction::Output) { + ok = externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + ok = externalUnf ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + anyIoTookPlace_ |= ok; + return ok; + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (!Transfer(x, elements_ * elementBytes)) { + return handler.GetIoStat(); + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (!Transfer(x, elementBytes)) { + return handler.GetIoStat(); + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + anyIoTookPlace_ |= FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + anyIoTookPlace_ |= FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + anyIoTookPlace_ |= FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + anyIoTookPlace_ |= FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + anyIoTookPlace_ |= FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return handler.GetIoStat(); + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + anyIoTookPlace_ |= FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + anyIoTookPlace_ |= FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + anyIoTookPlace_ |= FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + anyIoTookPlace_ |= FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + anyIoTookPlace_ |= FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return handler.GetIoStat(); + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + anyIoTookPlace_ |= FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + anyIoTookPlace_ |= FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + anyIoTookPlace_ |= FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + anyIoTookPlace_ |= FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + anyIoTookPlace_ |= FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + anyIoTookPlace_ |= 
FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return handler.GetIoStat(); + } + break; + case TypeCategory::Complex: + switch (kind) { + case 2: + anyIoTookPlace_ |= FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + anyIoTookPlace_ |= FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + anyIoTookPlace_ |= FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + anyIoTookPlace_ |= FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + anyIoTookPlace_ |= FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + anyIoTookPlace_ |= FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return handler.GetIoStat(); + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + anyIoTookPlace_ |= FormattedCharacterIO(io_, instance_); + break; + ; + case 2: + anyIoTookPlace_ |= FormattedCharacterIO(io_, instance_); + break; + case 4: + anyIoTookPlace_ |= FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return handler.GetIoStat(); + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + anyIoTookPlace_ |= FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + anyIoTookPlace_ |= FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + anyIoTookPlace_ |= FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + anyIoTookPlace_ |= FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return handler.GetIoStat(); + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. 
+ IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding * + binding{derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatOkContinue; + } + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ for (; !IsComplete(); Advance()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element(subscripts_)); + if (getenv("FLANG_RT_DEBUG")) + std::fprintf(stderr, "DescIO::C at %d\n", __LINE__); + Advance(); + workQueue.BeginDerivedIo( + DIR, io_, elementDesc, *derived_, table_, anyIoTookPlace_); + return StatOkContinue; + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + workQueue.BeginDescriptorIo(DIR, io, descriptor, table, anyIoTookPlace); + return workQueue.Run() == StatOk && anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. 
-// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) +#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { 
RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..9cb165834d0a7 --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,205 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/environment.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +static constexpr bool enableDebugOutput{false}; + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (componentAt_ >= components_) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + 
return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + } + firstFree_ = next; + } +} + +RT_API_ATTRS void WorkQueue::BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + StartTicket().u.emplace(descriptor, derived); +} + +RT_API_ATTRS void WorkQueue::BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); +} + +RT_API_ATTRS void WorkQueue::BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + StartTicket().u.emplace(descriptor, derived); +} + +RT_API_ATTRS void WorkQueue::BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + StartTicket().u.emplace(descriptor, derived, finalize); +} + +RT_API_ATTRS void WorkQueue::BeginAssign( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) { + StartTicket().u.emplace(to, from, flags, memmoveFct); +} + +RT_API_ATTRS void WorkQueue::BeginDerivedAssign(Descriptor &to, + const Descriptor &from, const typeInfo::DerivedType &derived, int flags, + MemmoveFct memmoveFct, Descriptor *deallocateAfter) { + StartTicket().u.emplace( + to, from, derived, flags, memmoveFct, deallocateAfter); +} + +RT_API_ATTRS void WorkQueue::BeginDerivedIo(io::Direction direction, + io::IoStatementState &io, const Descriptor &descriptor, + const typeInfo::DerivedType &derived, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (direction == io::Direction::Output) 
{ + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + } +} + +RT_API_ATTRS void WorkQueue::BeginDescriptorIo(io::Direction direction, + io::IoStatementState &io, const Descriptor &descriptor, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (direction == io::Direction::Output) { + StartTicket() + .u.emplace>( + io, descriptor, table, anyIoTookPlace); + } else { + StartTicket() + .u.emplace>( + io, descriptor, table, anyIoTookPlace); + } +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: new ticket\n"); + } + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), + at->ticket.begun ? "Continue" : "Begin"); + } + int stat{at->ticket.Continue(*this)}; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: ... 
stat %d\n", stat); + } + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatOkContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp index 3833e48be3dd6..6c148b1de6f82 100644 --- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp +++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp @@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) { io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__); for (int j{1}; j <= 3; ++j) { ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc)) - << "OutputDescriptor() for InquireIoLength"; + << "OutputDescriptor() for InquireIoLength " << j; } ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength"; ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk) diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..32bebc1d866a4 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -843,6 +843,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. + However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. 
+ Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. + A program using defined assignment might be able to detect the difference. + ## De Facto Standard Features * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION From flang-commits at lists.llvm.org Tue May 20 16:37:16 2025 From: flang-commits at lists.llvm.org (Razvan Lupusoru via flang-commits) Date: Tue, 20 May 2025 16:37:16 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <682d122c.170a0220.a177a.5517@mx.google.com> https://github.com/razvanlupusoru approved this pull request. Thank you! Nice work! https://github.com/llvm/llvm-project/pull/140763 From flang-commits at lists.llvm.org Tue May 20 16:53:18 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Tue, 20 May 2025 16:53:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) Message-ID: https://github.com/eugeneepshteyn created https://github.com/llvm/llvm-project/pull/140822 The integer used for Cray pointers should have the size equivalent to platform's pointer size. 
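As a rough illustration of the sizing rule described above, here is a standalone sketch rather than code from the PR. The helper names `pointerSizeInBytes` and `crayPointerIntegerKind` are hypothetical, and the sketch assumes that an INTEGER kind value equals its storage size in bytes, so a 64-bit target would use INTEGER(8) for a Cray pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: convert a pointer width in bits (for example, the
// value a target-triple query might report) to a size in bytes.
constexpr std::size_t pointerSizeInBytes(unsigned pointerBitWidth) {
  return pointerBitWidth / 8;
}

// Hypothetical helper: choose the INTEGER kind used to hold a Cray pointer.
// Assumption: the kind number equals the integer's size in bytes.
inline int crayPointerIntegerKind(unsigned pointerBitWidth) {
  assert(pointerBitWidth % 8 == 0 && "pointer width must be whole bytes");
  return static_cast<int>(pointerSizeInBytes(pointerBitWidth));
}
```

Under these assumptions, `crayPointerIntegerKind(64)` yields kind 8 and `crayPointerIntegerKind(32)` yields kind 4, so the integer follows the target's pointer width instead of a fixed default kind.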
>From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001
From: Eugene Epshteyn
Date: Tue, 20 May 2025 19:48:48 -0400
Subject: [PATCH] [flang] Ensure that the integer for Cray pointer is sized correctly

The integer used for Cray pointers should have the size equivalent to platform's pointer size.

modified: flang/include/flang/Evaluate/target.h
modified: flang/include/flang/Tools/TargetSetup.h
modified: flang/lib/Semantics/resolve-names.cpp
---
 flang/include/flang/Evaluate/target.h   | 6 ++++++
 flang/include/flang/Tools/TargetSetup.h | 3 +++
 flang/lib/Semantics/resolve-names.cpp   | 2 +-
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h
index cc6172b492b3c..281e42d5285fb 100644
--- a/flang/include/flang/Evaluate/target.h
+++ b/flang/include/flang/Evaluate/target.h
@@ -131,6 +131,11 @@ class TargetCharacteristics {
   IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; }
   const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; }

+  std::size_t pointerSize() { return pointerSize_; }
+  void set_pointerSize(std::size_t pointerSize) {
+    pointerSize_ = pointerSize;
+  }
+
 private:
   static constexpr int maxKind{common::maxKind};
   std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{};
@@ -156,6 +161,7 @@ class TargetCharacteristics {
       IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding,
       IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal,
       IeeeFeature::UnderflowControl};
+  std::size_t pointerSize_{8 /* bytes */};
 };

 } // namespace Fortran::evaluate
diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h
index de3fb0519cf6b..a30c725e21fb4 100644
--- a/flang/include/flang/Tools/TargetSetup.h
+++ b/flang/include/flang/Tools/TargetSetup.h
@@ -94,6 +94,9 @@ namespace Fortran::tools {
   if (targetTriple.isOSWindows())
     targetCharacteristics.set_isOSWindows(true);

+  targetCharacteristics.set_pointerSize(
+      targetTriple.getArchPointerBitWidth() / 8);
+
   // TODO: use target machine data layout to set-up the target characteristics
   // type size and alignment info.
 }
diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp
index 92a3277191ae0..2f6fe4fea4da2 100644
--- a/flang/lib/Semantics/resolve-names.cpp
+++ b/flang/lib/Semantics/resolve-names.cpp
@@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) {
   }
   pointer->set(Symbol::Flag::CrayPointer);
   const DeclTypeSpec &pointerType{MakeNumericType(
-      TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())};
+      TypeCategory::Integer, context().targetCharacteristics().pointerSize())};
   const auto *type{pointer->GetType()};
   if (!type) {
     pointer->SetType(pointerType);

From flang-commits at lists.llvm.org Tue May 20 16:53:50 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Tue, 20 May 2025 16:53:50 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822)
In-Reply-To:
Message-ID: <682d160e.a70a0220.2647b3.7e2e@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-semantics

Author: Eugene Epshteyn (eugeneepshteyn)
Changes The integer used for Cray pointers should have the size equivalent to platform's pointer size. --- Full diff: https://github.com/llvm/llvm-project/pull/140822.diff 3 Files Affected: - (modified) flang/include/flang/Evaluate/target.h (+6) - (modified) flang/include/flang/Tools/TargetSetup.h (+3) - (modified) flang/lib/Semantics/resolve-names.cpp (+1-1) ``````````diff diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..281e42d5285fb 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t pointerSize() { return pointerSize_; } + void set_pointerSize(std::size_t pointerSize) { + pointerSize_ = pointerSize; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. 
} diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); ``````````
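The sizing rule in this patch can be sketched roughly as follows; this is an illustration, not code from the PR. The target triple's pointer width in bits is divided by 8 to get bytes, and that byte count is then used as the INTEGER kind. The sketch assumes, as the patch's call to `MakeNumericType` does, that an INTEGER kind value equals its size in bytes. `pointerSizeInBytes` and `crayPointerIntegerKind` are hypothetical helper names that do not exist in flang.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: convert a pointer width in bits (the kind of value
// the patch obtains from targetTriple.getArchPointerBitWidth()) to bytes.
std::size_t pointerSizeInBytes(unsigned archPointerBitWidth) {
  return archPointerBitWidth / 8;
}

// Hypothetical helper: pick the INTEGER kind for a Cray pointer.
// Assumption: an INTEGER kind number equals its storage size in bytes.
int crayPointerIntegerKind(unsigned archPointerBitWidth) {
  return static_cast<int>(pointerSizeInBytes(archPointerBitWidth));
}
```

Under these assumptions a 64-bit target would give INTEGER(8) and a 32-bit target INTEGER(4), whereas the replaced code always used the default subscript integer kind.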
https://github.com/llvm/llvm-project/pull/140822 From flang-commits at lists.llvm.org Tue May 20 16:56:20 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 16:56:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <682d16a4.050a0220.21025e.8576@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command: ``````````bash git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- flang/include/flang/Evaluate/target.h flang/include/flang/Tools/TargetSetup.h flang/lib/Semantics/resolve-names.cpp ``````````
View the diff from clang-format here. ``````````diff diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d52..e8b9fedc3 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ public: const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e2..24ab65f74 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. ``````````
https://github.com/llvm/llvm-project/pull/140822 From flang-commits at lists.llvm.org Tue May 20 16:59:29 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Tue, 20 May 2025 16:59:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <682d1761.050a0220.34f28d.61cf@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 >From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:48:48 -0400 Subject: [PATCH 1/2] [flang] Ensure that the integer for Cray pointer is sized correctly The integer used for Cray pointers should have the size equivalent to platform's pointer size. modified: flang/include/flang/Evaluate/target.h modified: flang/include/flang/Tools/TargetSetup.h modified: flang/lib/Semantics/resolve-names.cpp --- flang/include/flang/Evaluate/target.h | 6 ++++++ flang/include/flang/Tools/TargetSetup.h | 3 +++ flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..281e42d5285fb 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t pointerSize() { return pointerSize_; } + void set_pointerSize(std::size_t pointerSize) { + pointerSize_ = pointerSize; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, 
IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. } diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 30477b842ae64fb1c94f520136956cc4685648a7 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:59:20 -0400 Subject: [PATCH 2/2] clang-format --- flang/include/flang/Evaluate/target.h | 4 +--- flang/include/flang/Tools/TargetSetup.h | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d5285fb..e8b9fedc38f48 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ class TargetCharacteristics { const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } std::size_t pointerSize() { return pointerSize_; } - 
void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e21fb4..24ab65f740ec6 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. From flang-commits at lists.llvm.org Tue May 20 17:16:24 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Tue, 20 May 2025 17:16:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <682d1b58.170a0220.17abf9.4e56@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 >From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:48:48 -0400 Subject: [PATCH 1/2] [flang] Ensure that the integer for Cray pointer is sized correctly The integer used for Cray pointers should have the size equivalent to platform's pointer size. 
modified: flang/include/flang/Evaluate/target.h modified: flang/include/flang/Tools/TargetSetup.h modified: flang/lib/Semantics/resolve-names.cpp --- flang/include/flang/Evaluate/target.h | 6 ++++++ flang/include/flang/Tools/TargetSetup.h | 3 +++ flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..281e42d5285fb 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t pointerSize() { return pointerSize_; } + void set_pointerSize(std::size_t pointerSize) { + pointerSize_ = pointerSize; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. 
} diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 30477b842ae64fb1c94f520136956cc4685648a7 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:59:20 -0400 Subject: [PATCH 2/2] clang-format --- flang/include/flang/Evaluate/target.h | 4 +--- flang/include/flang/Tools/TargetSetup.h | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d5285fb..e8b9fedc38f48 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ class TargetCharacteristics { const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e21fb4..24ab65f740ec6 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + 
targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. From flang-commits at lists.llvm.org Tue May 20 17:20:06 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Tue, 20 May 2025 17:20:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Allocate extra descriptor in managed memory when it is coming from device (PR #140818) In-Reply-To: Message-ID: <682d1c36.630a0220.2ccb43.a306@mx.google.com> https://github.com/wangzpgi approved this pull request. https://github.com/llvm/llvm-project/pull/140818 From flang-commits at lists.llvm.org Tue May 20 18:55:16 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 18:55:16 -0700 (PDT) Subject: [flang-commits] [flang] 6811a3b - [flang][cuda] Allocate extra descriptor in managed memory when it is coming from device (#140818) Message-ID: <682d3284.170a0220.2061bc.b6dd@mx.google.com> Author: Valentin Clement (バレンタイン クレメン) Date: 2025-05-20T18:55:13-07:00 New Revision: 6811a3bedfd33ee64e884467791d2c299504b0e8 URL: https://github.com/llvm/llvm-project/commit/6811a3bedfd33ee64e884467791d2c299504b0e8 DIFF: https://github.com/llvm/llvm-project/commit/6811a3bedfd33ee64e884467791d2c299504b0e8.diff LOG: [flang][cuda] Allocate extra descriptor in managed memory when it is coming from device (#140818) Added: Modified: flang/lib/Optimizer/CodeGen/CodeGen.cpp flang/test/Fir/CUDA/cuda-code-gen.mlir Removed: ################################################################################ diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index 70c90fae34086..205807eab403a 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -1830,7 +1830,9 @@ static bool isDeviceAllocation(mlir::Value val, mlir::Value adaptorVal) { (callOp.getCallee().value().getRootReference().getValue().starts_with( 
RTNAME_STRING(CUFMemAlloc)) || callOp.getCallee().value().getRootReference().getValue().starts_with( - RTNAME_STRING(CUFAllocDescriptor)))) + RTNAME_STRING(CUFAllocDescriptor)) || + callOp.getCallee().value().getRootReference().getValue() == + "__tgt_acc_get_deviceptr")) return true; return false; } @@ -3253,8 +3255,9 @@ struct LoadOpConversion : public fir::FIROpConversion { if (auto callOp = mlir::dyn_cast_or_null( inputBoxStorage.getDefiningOp())) { if (callOp.getCallee() && - (*callOp.getCallee()) - .starts_with(RTNAME_STRING(CUFAllocDescriptor))) { + ((*callOp.getCallee()) + .starts_with(RTNAME_STRING(CUFAllocDescriptor)) || + (*callOp.getCallee()).starts_with("__tgt_acc_get_deviceptr"))) { // CUDA Fortran local descriptor are allocated in managed memory. So // new storage must be allocated the same way. auto mod = load->getParentOfType(); diff --git a/flang/test/Fir/CUDA/cuda-code-gen.mlir b/flang/test/Fir/CUDA/cuda-code-gen.mlir index fdd9f1ac12b1f..672be13beae24 100644 --- a/flang/test/Fir/CUDA/cuda-code-gen.mlir +++ b/flang/test/Fir/CUDA/cuda-code-gen.mlir @@ -204,3 +204,20 @@ func.func @_QMm1Psub1(%arg0: !fir.box> {cuf.data_attr = #cuf.c fir.global common @_QPshared_static__shared_mem(dense<0> : vector<28xi8>) {alignment = 8 : i64, data_attr = #cuf.cuda} : !fir.array<28xi8> // CHECK: llvm.mlir.global common @_QPshared_static__shared_mem(dense<0> : vector<28xi8>) {addr_space = 3 : i32, alignment = 8 : i64} : !llvm.array<28 x i8> + +// ----- + +module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry, dense<64> : vector<4xi64>>, #dlti.dl_entry, dense<32> : vector<4xi64>>, #dlti.dl_entry, dense<32> : vector<4xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, #dlti.dl_entry : vector<2xi64>>, 
#dlti.dl_entry : vector<4xi64>>, #dlti.dl_entry<"dlti.endianness", "little">, #dlti.dl_entry<"dlti.stack_alignment", 128 : i64>>} { + func.func @_QQmain() attributes {fir.bindc_name = "cufkernel_global"} { + %c0 = arith.constant 0 : index + %3 = fir.call @__tgt_acc_get_deviceptr() : () -> !fir.ref> + %4 = fir.convert %3 : (!fir.ref>) -> !fir.ref>>> + %5 = fir.load %4 : !fir.ref>>> + return + } + + // CHECK-LABEL: llvm.func @_QQmain() + // CHECK: llvm.call @_FortranACUFAllocDescriptor + + func.func private @__tgt_acc_get_deviceptr() -> !fir.ref> +} From flang-commits at lists.llvm.org Tue May 20 18:55:20 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Tue, 20 May 2025 18:55:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Allocate extra descriptor in managed memory when it is coming from device (PR #140818) In-Reply-To: Message-ID: <682d3288.050a0220.8b6db.612d@mx.google.com> https://github.com/clementval closed https://github.com/llvm/llvm-project/pull/140818 From flang-commits at lists.llvm.org Tue May 20 19:49:18 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Tue, 20 May 2025 19:49:18 -0700 (PDT) Subject: [flang-commits] [flang] implicitly set DEVICE attribute to scalars in device routines (PR #140834) Message-ID: https://github.com/wangzpgi created https://github.com/llvm/llvm-project/pull/140834 Scalars inside device routines also need to implicitly set the DEVICE attribute, except for function results. 
>From 564ff8f169f7807dedd95fe2d3eb995c0472f277 Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Tue, 20 May 2025 19:47:59 -0700 Subject: [PATCH] implicitly set DEVICE attribute to scalars in device routines --- flang/lib/Semantics/resolve-names.cpp | 2 +- flang/test/Lower/CUDA/cuda-shared.cuf | 1 + flang/test/Semantics/cuf21.cuf | 38 +++++++++++++++++++++++++++ 3 files changed, 40 insertions(+), 1 deletion(-) create mode 100644 flang/test/Semantics/cuf21.cuf diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..3f4a06444c4f3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -9376,7 +9376,7 @@ static void SetImplicitCUDADevice(bool inDeviceSubprogram, Symbol &symbol) { if (inDeviceSubprogram && symbol.has()) { auto *object{symbol.detailsIf()}; if (!object->cudaDataAttr() && !IsValue(symbol) && - (IsDummy(symbol) || object->IsArray())) { + !IsFunctionResult(symbol)) { // Implicitly set device attribute if none is set in device context. object->set_cudaDataAttr(common::CUDADataAttr::Device); } diff --git a/flang/test/Lower/CUDA/cuda-shared.cuf b/flang/test/Lower/CUDA/cuda-shared.cuf index f41011df06ae7..565857f01bdb8 100644 --- a/flang/test/Lower/CUDA/cuda-shared.cuf +++ b/flang/test/Lower/CUDA/cuda-shared.cuf @@ -9,4 +9,5 @@ end subroutine ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc} ! CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref> +! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda} ! CHECK-NOT: cuf.free diff --git a/flang/test/Semantics/cuf21.cuf b/flang/test/Semantics/cuf21.cuf new file mode 100644 index 0000000000000..52343daaf66f1 --- /dev/null +++ b/flang/test/Semantics/cuf21.cuf @@ -0,0 +1,38 @@ +! 
RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s + +module mlocModule + interface maxlocUpdate + module procedure :: & + maxlocUpdateR_32F, & + maxlocUpdateR_64F, & + maxlocUpdateR_32I, & + maxlocUpdateR_64I + end interface maxlocUpdate +contains + + attributes(global) subroutine maxlocPartialMaskR_32F1D() + implicit none + real(4) :: mval + + call maxlocUpdate(mval) + + end subroutine maxlocPartialMaskR_32F1D + + attributes(device) subroutine maxlocUpdateR_32F(mval) + real(4) :: mval + end subroutine maxlocUpdateR_32F + + attributes(device) subroutine maxlocUpdateR_64F(mval) + real(8) :: mval + end subroutine maxlocUpdateR_64F + + attributes(device) subroutine maxlocUpdateR_32I(mval) + integer(4) :: mval + end subroutine maxlocUpdateR_32I + + attributes(device) subroutine maxlocUpdateR_64I(mval) + integer(8) :: mval + end subroutine maxlocUpdateR_64I +end module + +! CHECK-LABEL: func.func @_QMmlocmodulePmaxlocpartialmaskr_32f1d() From flang-commits at lists.llvm.org Tue May 20 19:49:32 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Tue, 20 May 2025 19:49:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682d3f3c.170a0220.17ac84.5adb@mx.google.com> https://github.com/wangzpgi edited https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Tue May 20 19:49:54 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 20 May 2025 19:49:54 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682d3f52.630a0220.3cfd74.9810@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-flang-semantics Author: Zhen Wang (wangzpgi)
Changes Scalars inside device routines also need to implicitly set the DEVICE attribute, except for function results. --- Full diff: https://github.com/llvm/llvm-project/pull/140834.diff 3 Files Affected: - (modified) flang/lib/Semantics/resolve-names.cpp (+1-1) - (modified) flang/test/Lower/CUDA/cuda-shared.cuf (+1) - (added) flang/test/Semantics/cuf21.cuf (+38) ``````````diff diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..3f4a06444c4f3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -9376,7 +9376,7 @@ static void SetImplicitCUDADevice(bool inDeviceSubprogram, Symbol &symbol) { if (inDeviceSubprogram && symbol.has()) { auto *object{symbol.detailsIf()}; if (!object->cudaDataAttr() && !IsValue(symbol) && - (IsDummy(symbol) || object->IsArray())) { + !IsFunctionResult(symbol)) { // Implicitly set device attribute if none is set in device context. object->set_cudaDataAttr(common::CUDADataAttr::Device); } diff --git a/flang/test/Lower/CUDA/cuda-shared.cuf b/flang/test/Lower/CUDA/cuda-shared.cuf index f41011df06ae7..565857f01bdb8 100644 --- a/flang/test/Lower/CUDA/cuda-shared.cuf +++ b/flang/test/Lower/CUDA/cuda-shared.cuf @@ -9,4 +9,5 @@ end subroutine ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc} ! CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref> +! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda} ! CHECK-NOT: cuf.free diff --git a/flang/test/Semantics/cuf21.cuf b/flang/test/Semantics/cuf21.cuf new file mode 100644 index 0000000000000..52343daaf66f1 --- /dev/null +++ b/flang/test/Semantics/cuf21.cuf @@ -0,0 +1,38 @@ +! 
RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s + +module mlocModule + interface maxlocUpdate + module procedure :: & + maxlocUpdateR_32F, & + maxlocUpdateR_64F, & + maxlocUpdateR_32I, & + maxlocUpdateR_64I + end interface maxlocUpdate +contains + + attributes(global) subroutine maxlocPartialMaskR_32F1D() + implicit none + real(4) :: mval + + call maxlocUpdate(mval) + + end subroutine maxlocPartialMaskR_32F1D + + attributes(device) subroutine maxlocUpdateR_32F(mval) + real(4) :: mval + end subroutine maxlocUpdateR_32F + + attributes(device) subroutine maxlocUpdateR_64F(mval) + real(8) :: mval + end subroutine maxlocUpdateR_64F + + attributes(device) subroutine maxlocUpdateR_32I(mval) + integer(4) :: mval + end subroutine maxlocUpdateR_32I + + attributes(device) subroutine maxlocUpdateR_64I(mval) + integer(8) :: mval + end subroutine maxlocUpdateR_64I +end module + +! CHECK-LABEL: func.func @_QMmlocmodulePmaxlocpartialmaskr_32f1d() ``````````
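The effect of the one-line change in `SetImplicitCUDADevice` can be sketched with a simplified model. This is only an illustration under stated assumptions: `ObjectInfo`, `oldImplicitDevice`, and `newImplicitDevice` are hypothetical names, and the boolean fields stand in for the queries the real code makes on a `Symbol` (`cudaDataAttr()`, `IsValue`, `IsDummy`, `IsArray`, `IsFunctionResult`).

```cpp
#include <cassert>

// Simplified, hypothetical stand-in for the properties checked on a symbol
// that appears inside a device subprogram.
struct ObjectInfo {
  bool hasCudaDataAttr;   // an explicit CUDA data attribute is already set
  bool isValue;           // dummy argument with the VALUE attribute
  bool isDummy;           // dummy argument
  bool isArray;           // array object
  bool isFunctionResult;  // function result variable
};

// Condition before the patch: only dummies and arrays became DEVICE.
bool oldImplicitDevice(const ObjectInfo &o) {
  return !o.hasCudaDataAttr && !o.isValue && (o.isDummy || o.isArray);
}

// Condition after the patch: everything except function results.
bool newImplicitDevice(const ObjectInfo &o) {
  return !o.hasCudaDataAttr && !o.isValue && !o.isFunctionResult;
}
```

In this model a local scalar such as `mval` in the new test (no explicit attribute, not a dummy, not an array) is skipped by the old condition and picked up by the new one, while a function result is left alone by the new condition.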
https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 01:21:38 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 01:21:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Skip optimized bufferization on volatile refs (PR #140781) In-Reply-To: Message-ID: <682d8d12.170a0220.7ec23.6a1d@mx.google.com> https://github.com/jeanPerier approved this pull request. Thanks for the fix! Your fix makes sense to me as a general fix (we are probably lucky that no operation with a read effect on an unknown value hit this before; they are probably not very common, or they come with write effects that made the code exit earlier). I think there is still a case for doing optimized bufferization on volatile refs, but it is probably fine and safer to bail without more thinking. Fortran does not really have a strong memory model like C++ that would allow precisely defining VOLATILE. The only thing the F2023 standard says in 8.5.20 is: _"The Fortran processor should use the most recent definition of a volatile object each time its value is required. When a volatile object is defined by means of Fortran, it should make that definition available to the non-Fortran parts of the program as soon as possible."_ It is not clear to me that this should prevent us from evaluating the RHS of `x = x + y` directly into `x` if `x` or `y` is VOLATILE and there is no aliasing. VOLATILE is not mentioned in 15.5.2.14 as something allowing aliasing between Fortran entities, so I think it is still safe to assume that `x` and `y` do not overlap even when both are VOLATILE. At least gfortran and ifx still optimize assignments when they are VOLATILE read/writes: https://godbolt.org/z/b9x317ezK To clarify, I am not asking you to update the patch to ignore VOLATILE effects here (we would still need a bit more thinking to make sure things are OK for the cases more complex than `x = x + y` from my example). 
I just do not want future us to believe it would be incorrect to do so if we need to. https://github.com/llvm/llvm-project/pull/140781 From flang-commits at lists.llvm.org Wed May 21 01:41:51 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 01:41:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Added noalias attribute to function arguments. (PR #140803) In-Reply-To: Message-ID: <682d91cf.170a0220.20cd0c.5d74@mx.google.com> https://github.com/jeanPerier approved this pull request. Makes sense to me, thank you! https://github.com/llvm/llvm-project/pull/140803 From flang-commits at lists.llvm.org Wed May 21 02:48:16 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 02:48:16 -0700 (PDT) Subject: [flang-commits] [flang] f054aa2 - [flang][OpenMP] fix diagnostic for bad cancel type (#140798) Message-ID: <682da160.170a0220.3c4fd6.6190@mx.google.com> Author: Tom Eccles Date: 2025-05-21T10:48:13+01:00 New Revision: f054aa240f4205873a1d2bb6da3e453007be8ba6 URL: https://github.com/llvm/llvm-project/commit/f054aa240f4205873a1d2bb6da3e453007be8ba6 DIFF: https://github.com/llvm/llvm-project/commit/f054aa240f4205873a1d2bb6da3e453007be8ba6.diff LOG: [flang][OpenMP] fix diagnostic for bad cancel type (#140798) Fixes #133685 Added: flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 Modified: flang/lib/Semantics/check-omp-structure.cpp Removed: ################################################################################ diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c6c4fdf8a8198..606014276e7ca 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2575,8 +2575,8 @@ void OmpStructureChecker::CheckCancellationNest( } break; default: - // This should have been diagnosed by this point. - llvm_unreachable("Unexpected directive"); + // This is diagnosed later. 
+ return; } if (!eligibleCancellation) { context_.Say(source, @@ -2614,8 +2614,8 @@ void OmpStructureChecker::CheckCancellationNest( parser::ToUpperCaseLetters(typeName.str())); break; default: - // This should have been diagnosed by this point. - llvm_unreachable("Unexpected directive"); + // This is diagnosed later. + return; } } } diff --git a/flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 b/flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 new file mode 100644 index 0000000000000..ea5e7be23e2f9 --- /dev/null +++ b/flang/test/Semantics/OpenMP/cancel-bad-cancel-type.f90 @@ -0,0 +1,6 @@ +!RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags + +program test +!ERROR: PARALLEL DO is not a cancellable construct +!$omp cancel parallel do +end From flang-commits at lists.llvm.org Wed May 21 03:27:51 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Wed, 21 May 2025 03:27:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <682daaa7.170a0220.375e3d.6aeb@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 >From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:48:48 -0400 Subject: [PATCH 1/2] [flang] Ensure that the integer for Cray pointer is sized correctly The integer used for Cray pointers should have the size equivalent to platform's pointer size. 
modified: flang/include/flang/Evaluate/target.h modified: flang/include/flang/Tools/TargetSetup.h modified: flang/lib/Semantics/resolve-names.cpp --- flang/include/flang/Evaluate/target.h | 6 ++++++ flang/include/flang/Tools/TargetSetup.h | 3 +++ flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..281e42d5285fb 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t pointerSize() { return pointerSize_; } + void set_pointerSize(std::size_t pointerSize) { + pointerSize_ = pointerSize; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. 
} diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 30477b842ae64fb1c94f520136956cc4685648a7 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:59:20 -0400 Subject: [PATCH 2/2] clang-format --- flang/include/flang/Evaluate/target.h | 4 +--- flang/include/flang/Tools/TargetSetup.h | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d5285fb..e8b9fedc38f48 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ class TargetCharacteristics { const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e21fb4..24ab65f740ec6 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + 
targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. From flang-commits at lists.llvm.org Wed May 21 03:30:14 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 21 May 2025 03:30:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Added noalias attribute to function arguments. (PR #140803) In-Reply-To: Message-ID: <682dab36.050a0220.fd865.7d35@mx.google.com> ================ @@ -56,14 +47,28 @@ void FunctionAttrPass::runOnOperation() { if ((isFromModule || !func.isDeclaration()) && !fir::hasBindcAttr(func.getOperation())) { llvm::StringRef nocapture = mlir::LLVM::LLVMDialect::getNoCaptureAttrName(); + llvm::StringRef noalias = mlir::LLVM::LLVMDialect::getNoAliasAttrName(); mlir::UnitAttr unitAttr = mlir::UnitAttr::get(func.getContext()); for (auto [index, argType] : llvm::enumerate(func.getArgumentTypes())) { + bool isNoCapture = false; + bool isNoAlias = false; if (mlir::isa(argType) && !func.getArgAttr(index, fir::getTargetAttrName()) && !func.getArgAttr(index, fir::getAsynchronousAttrName()) && - !func.getArgAttr(index, fir::getVolatileAttrName())) + !func.getArgAttr(index, fir::getVolatileAttrName())) { + isNoCapture = true; + isNoAlias = !fir::isPointerType(argType); ---------------- tblah wrote: What about TARGET dummy arguments e.g. 
``` subroutine test(a) real, target :: a end subroutine ``` Here `a` is lowered to `!fir.ref` https://github.com/llvm/llvm-project/pull/140803 From flang-commits at lists.llvm.org Wed May 21 03:33:27 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Wed, 21 May 2025 03:33:27 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682dabf7.170a0220.12518c.6cf3@mx.google.com> ================ @@ -1480,6 +1493,108 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, Stmt *&Body, SmallVectorImpl> &OriginalInits); + /// @brief Categories of loops encountered during semantic OpenMP loop + /// analysis + /// + /// This enumeration identifies the structural category of a loop or sequence + /// of loops analyzed in the context of OpenMP transformations and directives. + /// This categorization helps differentiate between original source loops + /// and the structures resulting from applying OpenMP loop transformations. + enum class OMPLoopCategory { + + /// @var OMPLoopCategory::RegularLoop + /// Represents a standard canonical loop nest found in the + /// original source code or an intact loop after transformations + /// (i.e Post/Pre loops of a loopranged fusion) + RegularLoop, + + /// @var OMPLoopCategory::TransformSingleLoop + /// Represents the resulting loop structure when an OpenMP loop + // transformation, generates a single, top-level loop + TransformSingleLoop, + + /// @var OMPLoopCategory::TransformLoopSequence + /// Represents the resulting loop structure when an OpenMP loop + /// transformation + /// generates a sequence of two or more canonical loop nests + TransformLoopSequence + }; + + /// The main recursive process of `checkTransformableLoopSequence` that + /// performs grammatical parsing of a canonical loop sequence. 
It extracts + /// key information, such as the number of top-level loops, loop statements, + /// helper expressions, and other relevant loop-related data, all in a single + /// execution to avoid redundant traversals. This analysis flattens inner + /// Loop Sequences + /// + /// \param LoopSeqStmt The AST of the original statement. + /// \param LoopSeqSize [out] Number of top level canonical loops. + /// \param NumLoops [out] Number of total canonical loops (nested too). + /// \param LoopHelpers [out] The multiple loop analyses results. + /// \param ForStmts [out] The multiple Stmt of each For loop. + /// \param OriginalInits [out] The raw original initialization statements + /// of each belonging to a loop of the loop sequence + /// \param TransformPreInits [out] The multiple collection of statements and + /// declarations that must have been executed/declared + /// before entering the loop (each belonging to a + /// particular loop transformation, nullptr otherwise) + /// \param LoopSequencePreInits [out] Additional general collection of loop + /// transformation related statements and declarations + /// not bounded to a particular loop that must be + /// executed before entering the loop transformation + /// \param LoopCategories [out] A sequence of OMPLoopCategory values, + /// one for each loop or loop transformation node + /// successfully analyzed. + /// \param Context + /// \param Kind The loop transformation directive kind. + /// \return Whether the original statement is both syntactically and + /// semantically correct according to OpenMP 6.0 canonical loop + /// sequence definition. 
+ bool analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, ---------------- eZWALT wrote: It is correct, though. This SmallVector is propagated in several places in the codebase (checkTransformableLoopNest, ActOnOpenMPUnroll, ActOnOpenMPStripe...), so to stay consistent I either modify all of those instances or keep it as it is. https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Wed May 21 03:33:28 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Wed, 21 May 2025 03:33:28 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682dabf8.170a0220.3b2db8.d398@mx.google.com> ================ @@ -11516,6 +11516,21 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; +def warn_omp_different_loop_ind_var_types : Warning < + "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">, ---------------- eZWALT wrote: How could it be? The iteration variable type depends on the original iteration variable type, so fusion can end up combining multiple loops with different induction variable types, given that I emit a warning rather than an error. But I don't really see how NumIterations could be the limiting factor here; could you please explain? 
https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Wed May 21 03:33:28 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Wed, 21 May 2025 03:33:28 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682dabf8.170a0220.bf09.66bc@mx.google.com> ================ @@ -1151,6 +1151,106 @@ class OMPFullClause final : public OMPNoChildClause { static OMPFullClause *CreateEmpty(const ASTContext &C); }; +/// This class represents the 'looprange' clause in the +/// '#pragma omp fuse' directive +/// +/// \code {c} +/// #pragma omp fuse looprange(1,2) +/// { +/// for(int i = 0; i < 64; ++i) +/// for(int j = 0; j < 256; j+=2) +/// for(int k = 127; k >= 0; --k) +/// \endcode +class OMPLoopRangeClause final : public OMPClause { + friend class OMPClauseReader; + + explicit OMPLoopRangeClause() + : OMPClause(llvm::omp::OMPC_looprange, {}, {}) {} + + /// Location of '(' + SourceLocation LParenLoc; + + /// Location of 'first' + SourceLocation FirstLoc; + + /// Location of 'count' + SourceLocation CountLoc; + + /// Expr associated with 'first' argument + Expr *First = nullptr; + + /// Expr associated with 'count' argument + Expr *Count = nullptr; + + /// Set 'first' + void setFirst(Expr *First) { this->First = First; } + + /// Set 'count' + void setCount(Expr *Count) { this->Count = Count; } + + /// Set location of '('. + void setLParenLoc(SourceLocation Loc) { LParenLoc = Loc; } + + /// Set location of 'first' argument + void setFirstLoc(SourceLocation Loc) { FirstLoc = Loc; } + + /// Set location of 'count' argument + void setCountLoc(SourceLocation Loc) { CountLoc = Loc; } + +public: + /// Build an AST node for a 'looprange' clause + /// + /// \param StartLoc Starting location of the clause. + /// \param LParenLoc Location of '('. 
+ /// \param ModifierLoc Modifier location. + /// \param + static OMPLoopRangeClause * + Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation LParenLoc, + SourceLocation FirstLoc, SourceLocation CountLoc, + SourceLocation EndLoc, Expr *First, Expr *Count); + + /// Build an empty 'looprange' node for deserialization + /// + /// \param C Context of the AST. + static OMPLoopRangeClause *CreateEmpty(const ASTContext &C); + + /// Returns the location of '(' + SourceLocation getLParenLoc() const { return LParenLoc; } + + /// Returns the location of 'first' + SourceLocation getFirstLoc() const { return FirstLoc; } + + /// Returns the location of 'count' + SourceLocation getCountLoc() const { return CountLoc; } + + /// Returns the argument 'first' or nullptr if not set + Expr *getFirst() const { return cast_or_null(First); } + + /// Returns the argument 'count' or nullptr if not set + Expr *getCount() const { return cast_or_null(Count); } + + child_range children() { + return child_range(reinterpret_cast(&First), + reinterpret_cast(&Count) + 1); ---------------- eZWALT wrote: Well spotted, thank you! 
https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Wed May 21 03:33:28 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Wed, 21 May 2025 03:33:28 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682dabf8.170a0220.380c8d.ca39@mx.google.com> ================ @@ -14175,27 +14222,350 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + +public: + explicit NestedLoopCounterVisitor() {} + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; + } + + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; + + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. + // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) 
+ return true; + } +}; + +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { + + VarsWithInheritedDSAType TmpDSA; + QualType BaseInductionVarType; + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { + if (auto *For = dyn_cast(LoopStmt)) { + OriginalInits.back().push_back(For->getInit()); + ForStmts.push_back(For); + // Extract induction variable + if (auto *InitStmt = dyn_cast_or_null(For->getInit())) { + if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { + QualType InductionVarType = InitDecl->getType().getCanonicalType(); + + // Compare with first loop type + if (BaseInductionVarType.isNull()) { + BaseInductionVarType = InductionVarType; + } else if (!Context.hasSameType(BaseInductionVarType, + InductionVarType)) { + Diag(InitDecl->getBeginLoc(), + diag::warn_omp_different_loop_ind_var_types) + << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType + << InductionVarType; + } + } + } + } else { + auto *CXXFor = cast(LoopStmt); + OriginalInits.back().push_back(CXXFor->getBeginStmt()); + ForStmts.push_back(CXXFor); + } + }; + + // Helper lambda functions to encapsulate the processing of different + // derivations of the canonical loop sequence grammar + // + // Modularized code for handling loop generation and transformations + auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &TransformsPreInits, + &LoopCategories, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, 
&ForStmts, &Context, + &LoopSequencePreInits, this](Stmt *Child) { + auto LoopTransform = dyn_cast(Child); + Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); + unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); + unsigned NumGeneratedLoops = LoopTransform->getNumGeneratedLoops(); + // Handle the case where transformed statement is not available due to + // dependent contexts + if (!TransformedStmt) { + if (NumGeneratedLoopNests > 0) { + LoopSeqSize += NumGeneratedLoopNests; + NumLoops += NumGeneratedLoops; + return true; + } + // Unroll full (0 loops produced) + else { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + } + // Handle loop transformations with multiple loop nests + // Unroll full + if (NumGeneratedLoopNests <= 0) { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + // Loop transformatons such as split or loopranged fuse + else if (NumGeneratedLoopNests > 1) { + // Get the preinits related to this loop sequence generating + // loop transformation (i.e loopranged fuse, split...) + LoopSequencePreInits.emplace_back(); + // These preinits differ slightly from regular inits/pre-inits related + // to single loop generating loop transformations (interchange, unroll) + // given that they are not bounded to a particular loop nest + // so they need to be treated independently + updatePreInits(LoopTransform, LoopSequencePreInits); + return analyzeLoopSequence(TransformedStmt, LoopSeqSize, NumLoops, + LoopHelpers, ForStmts, OriginalInits, + TransformsPreInits, LoopSequencePreInits, + LoopCategories, Context, Kind); + } + // Vast majority: (Tile, Unroll, Stripe, Reverse, Interchange, Fuse all) + else { ---------------- eZWALT wrote: I think this worsens readability, could you please tell me how this would improve the code quality other than conciseness? 
https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Wed May 21 03:33:29 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Wed, 21 May 2025 03:33:29 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682dabf9.170a0220.1be3e0.d010@mx.google.com> ================ @@ -14175,27 +14222,350 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + +public: + explicit NestedLoopCounterVisitor() {} + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; + } + + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; + + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. + // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) 
+ return true; + } +}; + +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { + + VarsWithInheritedDSAType TmpDSA; + QualType BaseInductionVarType; + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { + if (auto *For = dyn_cast(LoopStmt)) { + OriginalInits.back().push_back(For->getInit()); + ForStmts.push_back(For); + // Extract induction variable + if (auto *InitStmt = dyn_cast_or_null(For->getInit())) { + if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { + QualType InductionVarType = InitDecl->getType().getCanonicalType(); + + // Compare with first loop type + if (BaseInductionVarType.isNull()) { + BaseInductionVarType = InductionVarType; + } else if (!Context.hasSameType(BaseInductionVarType, + InductionVarType)) { + Diag(InitDecl->getBeginLoc(), + diag::warn_omp_different_loop_ind_var_types) + << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType + << InductionVarType; + } + } + } + } else { + auto *CXXFor = cast(LoopStmt); + OriginalInits.back().push_back(CXXFor->getBeginStmt()); + ForStmts.push_back(CXXFor); + } + }; + + // Helper lambda functions to encapsulate the processing of different + // derivations of the canonical loop sequence grammar + // + // Modularized code for handling loop generation and transformations + auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &TransformsPreInits, + &LoopCategories, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, 
&ForStmts, &Context, + &LoopSequencePreInits, this](Stmt *Child) { + auto LoopTransform = dyn_cast(Child); + Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); + unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); + unsigned NumGeneratedLoops = LoopTransform->getNumGeneratedLoops(); + // Handle the case where transformed statement is not available due to + // dependent contexts + if (!TransformedStmt) { + if (NumGeneratedLoopNests > 0) { + LoopSeqSize += NumGeneratedLoopNests; + NumLoops += NumGeneratedLoops; + return true; + } + // Unroll full (0 loops produced) + else { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + } + // Handle loop transformations with multiple loop nests + // Unroll full + if (NumGeneratedLoopNests <= 0) { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + // Loop transformatons such as split or loopranged fuse + else if (NumGeneratedLoopNests > 1) { ---------------- eZWALT wrote: For the rest of the conditionals I can understand removing the braces for conciseness, but this change worsens the readability of the code and could easily introduce errors in future modifications, because its control flow is not identical to the one I proposed. Could you please elaborate on why this change is needed? https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Wed May 21 03:35:25 2025 From: flang-commits at lists.llvm.org (Martin Storsjö via flang-commits) Date: Wed, 21 May 2025 03:35:25 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [lld] [lldb] [llvm] [mlir] [polly] [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS in standalone builds (PR #138587) In-Reply-To: Message-ID: <682dac6d.a70a0220.22946c.a4d0@mx.google.com> https://github.com/mstorsjo approved this pull request. 
LGTM IIRC @petrhosek had commented on this before, and was generally in favour of it, but I'd still leave it open for a couple days if he wants to comment further on it. https://github.com/llvm/llvm-project/pull/138587 From flang-commits at lists.llvm.org Wed May 21 03:36:41 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Wed, 21 May 2025 03:36:41 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682dacb9.170a0220.2595ce.d679@mx.google.com> eZWALT wrote: Gentle-ping, I'm not sure if GitHub has notified you of the comments :) @alexey-bataev https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Wed May 21 03:39:14 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Wed, 21 May 2025 03:39:14 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682dad52.170a0220.25330f.072b@mx.google.com> ================ @@ -11516,6 +11516,21 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; +def warn_omp_different_loop_ind_var_types : Warning < + "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">, ---------------- alexey-bataev wrote: What stops you from promoting to a larger type? 
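A minimal sketch of what promoting to a larger type could look like, written as plain C++ rather than Clang Sema code. It assumes the fused loop runs up to the largest trip count and guards each body with its own bound; `FusedInductionType` and `fusedLoop` are hypothetical names, and `std::common_type` merely stands in for whatever promotion rule the implementation would choose.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical illustration: the single induction type two fused loops
// could share, chosen by the usual arithmetic conversions.
template <typename A, typename B>
using FusedInductionType = typename std::common_type<A, B>::type;

static_assert(std::is_same<FusedInductionType<int, long long>, long long>::value,
              "int and long long promote to long long");
static_assert(std::is_same<FusedInductionType<short, int>, int>::value,
              "short and int promote to int");

// Runs two loop bodies inside one fused loop. The fused induction variable
// uses the promoted type; each body only executes while the fused index is
// below that loop's own trip count and receives the index in its original type.
template <typename A, typename B, typename F, typename G>
void fusedLoop(A n1, B n2, F body1, G body2) {
  using IV = FusedInductionType<A, B>;
  const IV bound1 = n1;
  const IV bound2 = n2;
  const IV limit = bound1 > bound2 ? bound1 : bound2;
  for (IV i = 0; i < limit; ++i) {
    if (i < bound1)
      body1(static_cast<A>(i));
    if (i < bound2)
      body2(static_cast<B>(i));
  }
}
```

With a shared promoted index, loops whose induction variables differ in type can still be fused without a diagnostic, provided the promoted type can represent every original trip count.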
https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Wed May 21 04:11:33 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 21 May 2025 04:11:33 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Skip optimized bufferization on volatile refs (PR #140781) In-Reply-To: Message-ID: <682db4e5.a70a0220.1241ce.9013@mx.google.com> https://github.com/tblah approved this pull request. Thanks for the fix! https://github.com/llvm/llvm-project/pull/140781 From flang-commits at lists.llvm.org Wed May 21 04:13:32 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 21 May 2025 04:13:32 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <682db55c.170a0220.20cd50.7435@mx.google.com> https://github.com/tblah approved this pull request. LGTM, thanks https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Wed May 21 04:15:27 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 21 May 2025 04:15:27 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682db5cf.170a0220.22a28a.7282@mx.google.com> https://github.com/tblah commented: Please could you add a test showing a loop containing multiple blocks? This feels like a likely source of bugs now or in the future. 
https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Wed May 21 05:08:45 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Wed, 21 May 2025 05:08:45 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682dc24d.170a0220.19368d.6fbc@mx.google.com> ================ @@ -11516,6 +11516,21 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; +def warn_omp_different_loop_ind_var_types : Warning < + "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">, ---------------- eZWALT wrote: Nothing, actually. It was one of the three ideas I had to tackle this problem, and if you prefer I will follow this approach (using a 64-bit unsigned integer, for example). However, your comment left me wondering whether NumIterations could cause any trouble; I think it can't, but I'm not 100% sure, to be honest. https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Wed May 21 05:15:25 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 05:15:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix ICE with ignore_tkr(tk) character in explicit interface (PR #140885) Message-ID: https://github.com/jeanPerier created https://github.com/llvm/llvm-project/pull/140885 Some MPI libraries use character dummies + ignore(TKR) to allow passing any kind of buffer.
This was meant to already be handled by https://github.com/llvm/llvm-project/pull/108168 However, when the library interface also had an argument requiring an explicit interface, `builder.convertWithSemantics` was not allowed to properly deal with the actual/dummy type mismatch and generated bad IR causing errors like: `'fir.convert' op invalid type conversion'!fir.ref' / '!fir.boxchar\<1\>'`. This restriction was artificial, lowering should just handle any cases allowed by semantics. Just remove it. >From a7ff1819401b27f3b1c5e7e99eb6cc72bc020d49 Mon Sep 17 00:00:00 2001 From: Jean Perier Date: Wed, 21 May 2025 04:57:24 -0700 Subject: [PATCH] [flang] fix ignore_tkr(tk) character in explicit interface --- flang/lib/Lower/ConvertCall.cpp | 12 ++++++++++-- .../Lower/HLFIR/ignore-type-f77-character.f90 | 17 +++++++++++++++++ 2 files changed, 27 insertions(+), 2 deletions(-) diff --git a/flang/lib/Lower/ConvertCall.cpp b/flang/lib/Lower/ConvertCall.cpp index d37d51f6ec634..7378118cfef7f 100644 --- a/flang/lib/Lower/ConvertCall.cpp +++ b/flang/lib/Lower/ConvertCall.cpp @@ -486,7 +486,6 @@ Fortran::lower::genCallOpAndResult( // Deal with potential mismatches in arguments types. Passing an array to a // scalar argument should for instance be tolerated here. - bool callingImplicitInterface = caller.canBeCalledViaImplicitInterface(); for (auto [fst, snd] : llvm::zip(caller.getInputs(), funcType.getInputs())) { // When passing arguments to a procedure that can be called by implicit // interface, allow any character actual arguments to be passed to dummy @@ -518,10 +517,17 @@ Fortran::lower::genCallOpAndResult( // Do not attempt any reboxing here that could break this. bool legacyLowering = !converter.getLoweringOptions().getLowerToHighLevelFIR(); + // When dealing with a dummy character argument (fir.boxchar), the + // effective argument might be a non-character raw pointer. 
This may + // happen when calling an implicit interface that was previously called + // with a character argument, or when calling an explicit interface with + // an IgnoreTKR dummy character arguments. Allow creating a fir.boxchar + // from the raw pointer, which requires a non-trivial type conversion. + const bool allowCharacterConversions = true; bool isVolatile = fir::isa_volatile_type(snd); cast = builder.createVolatileCast(loc, isVolatile, fst); cast = builder.convertWithSemantics(loc, snd, cast, - callingImplicitInterface, + allowCharacterConversions, /*allowRebox=*/legacyLowering); } } @@ -1446,6 +1452,8 @@ static PreparedDummyArgument preparePresentUserCallActualArgument( // cause the fir.if results to be assumed-rank in case of OPTIONAL dummy, // causing extra runtime costs due to the unknown runtime size of assumed-rank // descriptors. + // For TKR dummy characters, the boxchar creation also happens later when + // creating the fir.call . preparedDummy.dummy = builder.createConvert(loc, dummyTypeWithActualRank, addr); return preparedDummy; diff --git a/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 b/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 index 41dbf82d5789d..6b2041a889e8d 100644 --- a/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 +++ b/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 @@ -8,6 +8,13 @@ subroutine foo(c) !dir$ ignore_tkr(tkrdm) c end subroutine end interface + interface + subroutine foo_requires_explicit_interface(c, i) + character(1)::c(*) + !dir$ ignore_tkr(tkrdm) c + integer, optional :: i + end subroutine + end interface contains subroutine test_normal() character(1) :: c(10) @@ -32,4 +39,14 @@ subroutine test_weird() !CHECK: %[[VAL_5:.*]] = fir.convert %{{.*}} : (!fir.ref>) -> !fir.ref> !CHECK: %[[VAL_6:.*]] = fir.emboxchar %[[VAL_5]], %c0{{.*}}: (!fir.ref>, index) -> !fir.boxchar<1> !CHECK: fir.call @_QPfoo(%[[VAL_6]]) fastmath : (!fir.boxchar<1>) -> () + + subroutine test_requires_explicit_interface(x, 
i) + real :: x(10) + integer :: i + call foo_requires_explicit_interface(x, i) + end subroutine +!CHECK-LABEL: func.func @_QMtest_char_tkPtest_requires_explicit_interface( +!CHECK: %[[VAL_5:.*]] = fir.convert %{{.*}} : (!fir.ref>) -> !fir.ref> +!CHECK: %[[VAL_6:.*]] = fir.emboxchar %[[VAL_5]], %c0{{.*}}: (!fir.ref>, index) -> !fir.boxchar<1> +!CHECK: fir.call @_QPfoo_requires_explicit_interface(%[[VAL_6]], %{{.*}}) end module From flang-commits at lists.llvm.org Wed May 21 05:15:57 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 05:15:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix ICE with ignore_tkr(tk) character in explicit interface (PR #140885) In-Reply-To: Message-ID: <682dc3fd.170a0220.153c9c.d6a3@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: None (jeanPerier)
Changes Some MPI libraries use character dummies + ignore(TKR) to allow passing any kind of buffer. This was meant to already be handled by https://github.com/llvm/llvm-project/pull/108168 However, when the library interface also had an argument requiring an explicit interface, `builder.convertWithSemantics` was not allowed to properly deal with the actual/dummy type mismatch and generated bad IR causing errors like: `'fir.convert' op invalid type conversion'!fir.ref' / '!fir.boxchar\<1\>'`. This restriction was artificial, lowering should just handle any cases allowed by semantics. Just remove it. --- Full diff: https://github.com/llvm/llvm-project/pull/140885.diff 2 Files Affected: - (modified) flang/lib/Lower/ConvertCall.cpp (+10-2) - (modified) flang/test/Lower/HLFIR/ignore-type-f77-character.f90 (+17) ``````````diff diff --git a/flang/lib/Lower/ConvertCall.cpp b/flang/lib/Lower/ConvertCall.cpp index d37d51f6ec634..7378118cfef7f 100644 --- a/flang/lib/Lower/ConvertCall.cpp +++ b/flang/lib/Lower/ConvertCall.cpp @@ -486,7 +486,6 @@ Fortran::lower::genCallOpAndResult( // Deal with potential mismatches in arguments types. Passing an array to a // scalar argument should for instance be tolerated here. - bool callingImplicitInterface = caller.canBeCalledViaImplicitInterface(); for (auto [fst, snd] : llvm::zip(caller.getInputs(), funcType.getInputs())) { // When passing arguments to a procedure that can be called by implicit // interface, allow any character actual arguments to be passed to dummy @@ -518,10 +517,17 @@ Fortran::lower::genCallOpAndResult( // Do not attempt any reboxing here that could break this. bool legacyLowering = !converter.getLoweringOptions().getLowerToHighLevelFIR(); + // When dealing with a dummy character argument (fir.boxchar), the + // effective argument might be a non-character raw pointer. 
This may + // happen when calling an implicit interface that was previously called + // with a character argument, or when calling an explicit interface with + // an IgnoreTKR dummy character arguments. Allow creating a fir.boxchar + // from the raw pointer, which requires a non-trivial type conversion. + const bool allowCharacterConversions = true; bool isVolatile = fir::isa_volatile_type(snd); cast = builder.createVolatileCast(loc, isVolatile, fst); cast = builder.convertWithSemantics(loc, snd, cast, - callingImplicitInterface, + allowCharacterConversions, /*allowRebox=*/legacyLowering); } } @@ -1446,6 +1452,8 @@ static PreparedDummyArgument preparePresentUserCallActualArgument( // cause the fir.if results to be assumed-rank in case of OPTIONAL dummy, // causing extra runtime costs due to the unknown runtime size of assumed-rank // descriptors. + // For TKR dummy characters, the boxchar creation also happens later when + // creating the fir.call . preparedDummy.dummy = builder.createConvert(loc, dummyTypeWithActualRank, addr); return preparedDummy; diff --git a/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 b/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 index 41dbf82d5789d..6b2041a889e8d 100644 --- a/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 +++ b/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 @@ -8,6 +8,13 @@ subroutine foo(c) !dir$ ignore_tkr(tkrdm) c end subroutine end interface + interface + subroutine foo_requires_explicit_interface(c, i) + character(1)::c(*) + !dir$ ignore_tkr(tkrdm) c + integer, optional :: i + end subroutine + end interface contains subroutine test_normal() character(1) :: c(10) @@ -32,4 +39,14 @@ subroutine test_weird() !CHECK: %[[VAL_5:.*]] = fir.convert %{{.*}} : (!fir.ref>) -> !fir.ref> !CHECK: %[[VAL_6:.*]] = fir.emboxchar %[[VAL_5]], %c0{{.*}}: (!fir.ref>, index) -> !fir.boxchar<1> !CHECK: fir.call @_QPfoo(%[[VAL_6]]) fastmath : (!fir.boxchar<1>) -> () + + subroutine test_requires_explicit_interface(x, 
i) + real :: x(10) + integer :: i + call foo_requires_explicit_interface(x, i) + end subroutine +!CHECK-LABEL: func.func @_QMtest_char_tkPtest_requires_explicit_interface( +!CHECK: %[[VAL_5:.*]] = fir.convert %{{.*}} : (!fir.ref>) -> !fir.ref> +!CHECK: %[[VAL_6:.*]] = fir.emboxchar %[[VAL_5]], %c0{{.*}}: (!fir.ref>, index) -> !fir.boxchar<1> +!CHECK: fir.call @_QPfoo_requires_explicit_interface(%[[VAL_6]], %{{.*}}) end module ``````````
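As a rough analogy for what the CHECK lines above expect (a `fir.convert` of the raw reference followed by `fir.emboxchar` with a zero length), the sketch below pairs a non-character buffer address with a length of 0. `BoxChar` and `emboxRaw` are hypothetical names used only for illustration; they are not flang or FIR APIs, and the struct is only a stand-in for the address-plus-length pair that a `!fir.boxchar<1>` value carries.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a boxchar value: a character address plus a length.
struct BoxChar {
  const char *addr;
  std::size_t len;
};

// Analogy for fir.convert + fir.emboxchar on an ignore_tkr actual argument:
// reinterpret the raw buffer address as a character address and pair it with
// length 0, since the real buffer is not a character entity.
inline BoxChar emboxRaw(const void *raw) {
  assert(raw != nullptr && "sketch expects a valid buffer address");
  return BoxChar{static_cast<const char *>(raw), 0};
}
```

For example, passing a `float x[10]` buffer yields a `BoxChar` whose `addr` aliases `x` and whose `len` is 0, mirroring the `%c0` operand in the test.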
https://github.com/llvm/llvm-project/pull/140885 From flang-commits at lists.llvm.org Wed May 21 05:48:41 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 05:48:41 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682dcba9.170a0220.153c9c.daa2@mx.google.com> NexMing wrote: > Please could you add a test showing a loop containing multiple blocks? This feels like a likely source of bugs now or in the future. It seems that `fir.do_loop` only allows a single block. Are you referring to cases where some operations within that block contain nested blocks (such as `fir.if`)? This will be gradually implemented in the future. https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Wed May 21 06:20:14 2025 From: flang-commits at lists.llvm.org (Alexey Bataev via flang-commits) Date: Wed, 21 May 2025 06:20:14 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682dd30e.170a0220.27ecaa.6637@mx.google.com> ================ @@ -11516,6 +11516,21 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; +def warn_omp_different_loop_ind_var_types : Warning < + "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">, ---------------- alexey-bataev wrote: There is no such restriction in the standard, so if we can promote to larger type, we should do it https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Wed May 21 06:27:42 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 
06:27:42 -0700 (PDT) Subject: [flang-commits] [flang] d360281 - [flang] add -floop-interchange and enable it with opt levels (#140182) Message-ID: <682dd4ce.170a0220.160907.76d0@mx.google.com> Author: Sebastian Pop Date: 2025-05-21T08:27:39-05:00 New Revision: d36028120a6ef6346bfaafe82d4d1a2887cf5e33 URL: https://github.com/llvm/llvm-project/commit/d36028120a6ef6346bfaafe82d4d1a2887cf5e33 DIFF: https://github.com/llvm/llvm-project/commit/d36028120a6ef6346bfaafe82d4d1a2887cf5e33.diff LOG: [flang] add -floop-interchange and enable it with opt levels (#140182) Enable the use of -floop-interchange from the flang driver. Enable in flang LLVM's loop interchange at levels -O2, -O3, -Ofast, and -Os. Added: flang/test/Driver/loop-interchange.f90 Modified: clang/include/clang/Driver/Options.td clang/lib/Driver/ToolChains/CommonArgs.cpp clang/lib/Driver/ToolChains/CommonArgs.h clang/lib/Driver/ToolChains/Flang.cpp flang/docs/ReleaseNotes.md flang/include/flang/Frontend/CodeGenOptions.def flang/lib/Frontend/CompilerInvocation.cpp flang/lib/Frontend/FrontendActions.cpp Removed: ################################################################################ diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 9a4253113488d..22261621df092 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -4186,9 +4186,9 @@ def ftrap_function_EQ : Joined<["-"], "ftrap-function=">, Group, HelpText<"Issue call to specified function rather than a trap instruction">, MarshallingInfoString>; def floop_interchange : Flag<["-"], "floop-interchange">, Group, - HelpText<"Enable the loop interchange pass">, Visibility<[ClangOption, CC1Option]>; + HelpText<"Enable the loop interchange pass">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; def fno_loop_interchange: Flag<["-"], "fno-loop-interchange">, Group, - HelpText<"Disable the loop interchange pass">, Visibility<[ClangOption, CC1Option]>; + 
HelpText<"Disable the loop interchange pass">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; def funroll_loops : Flag<["-"], "funroll-loops">, Group, HelpText<"Turn on loop unroller">, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>; def fno_unroll_loops : Flag<["-"], "fno-unroll-loops">, Group, diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp index 722431c999b95..d2535d1a2624a 100644 --- a/clang/lib/Driver/ToolChains/CommonArgs.cpp +++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp @@ -3150,3 +3150,16 @@ void tools::handleVectorizeSLPArgs(const ArgList &Args, options::OPT_fno_slp_vectorize, EnableSLPVec)) CmdArgs.push_back("-vectorize-slp"); } + +void tools::handleInterchangeLoopsArgs(const ArgList &Args, + ArgStringList &CmdArgs) { + // FIXME: instead of relying on shouldEnableVectorizerAtOLevel, we may want to + // implement a separate function to infer loop interchange from opt level. + // For now, enable loop-interchange at the same opt levels as loop-vectorize. + bool EnableInterchange = shouldEnableVectorizerAtOLevel(Args, false); + OptSpecifier InterchangeAliasOption = + EnableInterchange ? options::OPT_O_Group : options::OPT_floop_interchange; + if (Args.hasFlag(options::OPT_floop_interchange, InterchangeAliasOption, + options::OPT_fno_loop_interchange, EnableInterchange)) + CmdArgs.push_back("-floop-interchange"); +} diff --git a/clang/lib/Driver/ToolChains/CommonArgs.h b/clang/lib/Driver/ToolChains/CommonArgs.h index 96bc0619dcbc0..6d36a0e8bf493 100644 --- a/clang/lib/Driver/ToolChains/CommonArgs.h +++ b/clang/lib/Driver/ToolChains/CommonArgs.h @@ -259,6 +259,10 @@ void renderCommonIntegerOverflowOptions(const llvm::opt::ArgList &Args, bool shouldEnableVectorizerAtOLevel(const llvm::opt::ArgList &Args, bool isSlpVec); +/// Enable -floop-interchange based on the optimization level selected. 
+void handleInterchangeLoopsArgs(const llvm::opt::ArgList &Args, + llvm::opt::ArgStringList &CmdArgs); + /// Enable -fvectorize based on the optimization level selected. void handleVectorizeLoopsArgs(const llvm::opt::ArgList &Args, llvm::opt::ArgStringList &CmdArgs); diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index 0bd8d0c85e50a..e1bdfaff83288 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -152,6 +152,7 @@ void Flang::addCodegenOptions(const ArgList &Args, !stackArrays->getOption().matches(options::OPT_fno_stack_arrays)) CmdArgs.push_back("-fstack-arrays"); + handleInterchangeLoopsArgs(Args, CmdArgs); handleVectorizeLoopsArgs(Args, CmdArgs); handleVectorizeSLPArgs(Args, CmdArgs); diff --git a/flang/docs/ReleaseNotes.md b/flang/docs/ReleaseNotes.md index b356f64553d7e..36be369595ffd 100644 --- a/flang/docs/ReleaseNotes.md +++ b/flang/docs/ReleaseNotes.md @@ -32,6 +32,9 @@ page](https://llvm.org/releases/). ## New Compiler Flags +* -floop-interchange is now recognized by flang. +* -floop-interchange is enabled by default at -O2 and above. + ## Windows Support ## Fortran Language Changes in Flang diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index b50dd4fb3abda..a697872836569 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -35,6 +35,7 @@ CODEGENOPT(PrepareForThinLTO , 1, 0) ///< Set when -flto=thin is enabled on the CODEGENOPT(StackArrays, 1, 0) ///< -fstack-arrays (enable the stack-arrays pass) CODEGENOPT(VectorizeLoop, 1, 0) ///< Enable loop vectorization. CODEGENOPT(VectorizeSLP, 1, 0) ///< Enable SLP vectorization. +CODEGENOPT(InterchangeLoops, 1, 0) ///< Enable loop interchange. CODEGENOPT(LoopVersioning, 1, 0) ///< Enable loop versioning. 
CODEGENOPT(UnrollLoops, 1, 0) ///< Enable loop unrolling CODEGENOPT(AliasAnalysis, 1, 0) ///< Enable alias analysis pass diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b6c37712d0f79..ba2531819ee5e 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -270,6 +270,9 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, clang::driver::options::OPT_fno_stack_arrays, false)) opts.StackArrays = 1; + if (args.getLastArg(clang::driver::options::OPT_floop_interchange)) + opts.InterchangeLoops = 1; + if (args.getLastArg(clang::driver::options::OPT_vectorize_loops)) opts.VectorizeLoop = 1; diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index e5a15c555fa5e..38dfaadf1dff9 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -922,6 +922,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (ci.isTimingEnabled()) si.getTimePasses().setOutStream(ci.getTimingStreamLLVM()); pto.LoopUnrolling = opts.UnrollLoops; + pto.LoopInterchange = opts.InterchangeLoops; pto.LoopInterleaving = opts.UnrollLoops; pto.LoopVectorization = opts.VectorizeLoop; pto.SLPVectorization = opts.VectorizeSLP; diff --git a/flang/test/Driver/loop-interchange.f90 b/flang/test/Driver/loop-interchange.f90 new file mode 100644 index 0000000000000..5d3ec71c59874 --- /dev/null +++ b/flang/test/Driver/loop-interchange.f90 @@ -0,0 +1,17 @@ +! RUN: %flang -### -S -floop-interchange %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -fno-loop-interchange %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -O0 %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -O1 %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! 
RUN: %flang -### -S -O2 %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -O3 %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -Os %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE %s +! RUN: %flang -### -S -Oz %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE %s +! CHECK-LOOP-INTERCHANGE: "-floop-interchange" +! CHECK-NO-LOOP-INTERCHANGE-NOT: "-floop-interchange" +! RUN: %flang_fc1 -emit-llvm -O2 -floop-interchange -mllvm -print-pipeline-passes -o /dev/null %s 2>&1 | FileCheck -check-prefix=CHECK-LOOP-INTERCHANGE-PASS %s +! RUN: %flang_fc1 -emit-llvm -O2 -fno-loop-interchange -mllvm -print-pipeline-passes -o /dev/null %s 2>&1 | FileCheck -check-prefix=CHECK-NO-LOOP-INTERCHANGE-PASS %s +! CHECK-LOOP-INTERCHANGE-PASS: loop-interchange +! CHECK-NO-LOOP-INTERCHANGE-PASS-NOT: loop-interchange + +program test +end program From flang-commits at lists.llvm.org Wed May 21 06:27:46 2025 From: flang-commits at lists.llvm.org (Sebastian Pop via flang-commits) Date: Wed, 21 May 2025 06:27:46 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <682dd4d2.630a0220.29c3e1.db22@mx.google.com> https://github.com/sebpop closed https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Wed May 21 06:31:22 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 06:31:22 -0700 (PDT) Subject: [flang-commits] [flang] 2d956d2 - [flang] fix ICE with ignore_tkr(tk) character in explicit interface (#140885) Message-ID: <682dd5aa.170a0220.1526e8.7391@mx.google.com> Author: jeanPerier Date: 2025-05-21T15:31:18+02:00 New Revision: 2d956d2d4ecd6191cd0eab76ec705f5ea2916d59 URL: https://github.com/llvm/llvm-project/commit/2d956d2d4ecd6191cd0eab76ec705f5ea2916d59 DIFF: https://github.com/llvm/llvm-project/commit/2d956d2d4ecd6191cd0eab76ec705f5ea2916d59.diff LOG: 
[flang] fix ICE with ignore_tkr(tk) character in explicit interface (#140885) Some MPI libraries use character dummies + ignore(TKR) to allow passing any kind of buffer. This was meant to already be handled by #108168 However, when the library interface also had an argument requiring an explicit interface, `builder.convertWithSemantics` was not allowed to properly deal with the actual/dummy type mismatch and generated bad IR causing errors like: `'fir.convert' op invalid type conversion'!fir.ref' / '!fir.boxchar\<1\>'`. This restriction was artificial, lowering should just handle any cases allowed by semantics. Just remove it. Added: Modified: flang/lib/Lower/ConvertCall.cpp flang/test/Lower/HLFIR/ignore-type-f77-character.f90 Removed: ################################################################################ diff --git a/flang/lib/Lower/ConvertCall.cpp b/flang/lib/Lower/ConvertCall.cpp index d37d51f6ec634..7378118cfef7f 100644 --- a/flang/lib/Lower/ConvertCall.cpp +++ b/flang/lib/Lower/ConvertCall.cpp @@ -486,7 +486,6 @@ Fortran::lower::genCallOpAndResult( // Deal with potential mismatches in arguments types. Passing an array to a // scalar argument should for instance be tolerated here. - bool callingImplicitInterface = caller.canBeCalledViaImplicitInterface(); for (auto [fst, snd] : llvm::zip(caller.getInputs(), funcType.getInputs())) { // When passing arguments to a procedure that can be called by implicit // interface, allow any character actual arguments to be passed to dummy @@ -518,10 +517,17 @@ Fortran::lower::genCallOpAndResult( // Do not attempt any reboxing here that could break this. bool legacyLowering = !converter.getLoweringOptions().getLowerToHighLevelFIR(); + // When dealing with a dummy character argument (fir.boxchar), the + // effective argument might be a non-character raw pointer. 
This may + // happen when calling an implicit interface that was previously called + // with a character argument, or when calling an explicit interface with + // an IgnoreTKR dummy character arguments. Allow creating a fir.boxchar + // from the raw pointer, which requires a non-trivial type conversion. + const bool allowCharacterConversions = true; bool isVolatile = fir::isa_volatile_type(snd); cast = builder.createVolatileCast(loc, isVolatile, fst); cast = builder.convertWithSemantics(loc, snd, cast, - callingImplicitInterface, + allowCharacterConversions, /*allowRebox=*/legacyLowering); } } @@ -1446,6 +1452,8 @@ static PreparedDummyArgument preparePresentUserCallActualArgument( // cause the fir.if results to be assumed-rank in case of OPTIONAL dummy, // causing extra runtime costs due to the unknown runtime size of assumed-rank // descriptors. + // For TKR dummy characters, the boxchar creation also happens later when + // creating the fir.call . preparedDummy.dummy = builder.createConvert(loc, dummyTypeWithActualRank, addr); return preparedDummy; diff --git a/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 b/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 index 41dbf82d5789d..6b2041a889e8d 100644 --- a/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 +++ b/flang/test/Lower/HLFIR/ignore-type-f77-character.f90 @@ -8,6 +8,13 @@ subroutine foo(c) !dir$ ignore_tkr(tkrdm) c end subroutine end interface + interface + subroutine foo_requires_explicit_interface(c, i) + character(1)::c(*) + !dir$ ignore_tkr(tkrdm) c + integer, optional :: i + end subroutine + end interface contains subroutine test_normal() character(1) :: c(10) @@ -32,4 +39,14 @@ subroutine test_weird() !CHECK: %[[VAL_5:.*]] = fir.convert %{{.*}} : (!fir.ref>) -> !fir.ref> !CHECK: %[[VAL_6:.*]] = fir.emboxchar %[[VAL_5]], %c0{{.*}}: (!fir.ref>, index) -> !fir.boxchar<1> !CHECK: fir.call @_QPfoo(%[[VAL_6]]) fastmath : (!fir.boxchar<1>) -> () + + subroutine test_requires_explicit_interface(x, 
i) + real :: x(10) + integer :: i + call foo_requires_explicit_interface(x, i) + end subroutine +!CHECK-LABEL: func.func @_QMtest_char_tkPtest_requires_explicit_interface( +!CHECK: %[[VAL_5:.*]] = fir.convert %{{.*}} : (!fir.ref>) -> !fir.ref> +!CHECK: %[[VAL_6:.*]] = fir.emboxchar %[[VAL_5]], %c0{{.*}}: (!fir.ref>, index) -> !fir.boxchar<1> +!CHECK: fir.call @_QPfoo_requires_explicit_interface(%[[VAL_6]], %{{.*}}) end module From flang-commits at lists.llvm.org Wed May 21 06:31:24 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 06:31:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang] fix ICE with ignore_tkr(tk) character in explicit interface (PR #140885) In-Reply-To: Message-ID: <682dd5ac.170a0220.1bdff9.71e8@mx.google.com> https://github.com/jeanPerier closed https://github.com/llvm/llvm-project/pull/140885 From flang-commits at lists.llvm.org Wed May 21 06:38:26 2025 From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits) Date: Wed, 21 May 2025 06:38:26 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <682dd752.170a0220.14510a.7340@mx.google.com> ================ @@ -3149,3 +3149,16 @@ void tools::handleVectorizeSLPArgs(const ArgList &Args, options::OPT_fno_slp_vectorize, EnableSLPVec)) CmdArgs.push_back("-vectorize-slp"); } + +void tools::handleInterchangeLoopsArgs(const ArgList &Args, + ArgStringList &CmdArgs) { + // FIXME: instead of relying on shouldEnableVectorizerAtOLevel, we may want to ---------------- kkwli wrote: Nit: instead → Instead https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Wed May 21 06:47:40 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Wed, 21 May 2025 06:47:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute 
to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682dd97c.050a0220.8b6db.88a3@mx.google.com> ================ @@ -9,4 +9,5 @@ end subroutine ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc} ! CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref> +! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda} ! CHECK-NOT: cuf.free ---------------- clementval wrote: Is t getting a cuf.alloc now? https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 06:48:59 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Wed, 21 May 2025 06:48:59 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682dd9cb.050a0220.24a15.8fa7@mx.google.com> https://github.com/eZWALT edited https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Wed May 21 06:49:04 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Wed, 21 May 2025 06:49:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682dd9d0.050a0220.d35f1.8f02@mx.google.com> ================ @@ -0,0 +1,38 @@ +! 
RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s + +module mlocModule + interface maxlocUpdate + module procedure :: & + maxlocUpdateR_32F, & + maxlocUpdateR_64F, & + maxlocUpdateR_32I, & + maxlocUpdateR_64I + end interface maxlocUpdate +contains + + attributes(global) subroutine maxlocPartialMaskR_32F1D() + implicit none + real(4) :: mval + + call maxlocUpdate(mval) + + end subroutine maxlocPartialMaskR_32F1D + + attributes(device) subroutine maxlocUpdateR_32F(mval) + real(4) :: mval + end subroutine maxlocUpdateR_32F + + attributes(device) subroutine maxlocUpdateR_64F(mval) + real(8) :: mval + end subroutine maxlocUpdateR_64F + + attributes(device) subroutine maxlocUpdateR_32I(mval) + integer(4) :: mval + end subroutine maxlocUpdateR_32I + + attributes(device) subroutine maxlocUpdateR_64I(mval) + integer(8) :: mval + end subroutine maxlocUpdateR_64I +end module + +! CHECK-LABEL: func.func @_QMmlocmodulePmaxlocpartialmaskr_32f1d() ---------------- clementval wrote: Can we have a simpler test without the interface? It's probably gonna be hard to understand what this test does in the future. You can also add some comment. https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 06:49:43 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Wed, 21 May 2025 06:49:43 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Skip optimized bufferization on volatile refs (PR #140781) In-Reply-To: Message-ID: <682dd9f7.170a0220.2f156b.e8b2@mx.google.com> ashermancinelli wrote: > @jeanPerier > > To clarify, I am not asking you to update the patch to ignore VOLATILE effects here (we would still need a bit more thinking to make sure things are OK for the cases more complex than x = x +y from my example). I just do not want future us to believe it would be incorrect to do so if we need do. Thank you for the thoughtful review! 
I wonder if there's somewhere I should collect those ideas along with the design of volatile so we can iterate on it (or at least keep it documented somewhere). Maybe something in `docs/`? https://github.com/llvm/llvm-project/pull/140781 From flang-commits at lists.llvm.org Wed May 21 06:51:17 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Wed, 21 May 2025 06:51:17 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682dda55.170a0220.99be3.8034@mx.google.com> ================ @@ -0,0 +1,186 @@ +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -std=c++20 -fopenmp -fopenmp-version=60 -fsyntax-only -Wuninitialized -verify %s + +void func() { + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a loop sequence containing canonical loops or loop-generating constructs}} + #pragma omp fuse + ; + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + {int bar = 0;} + + // expected-error@+4 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + { + for(int i = 0; i < 10; ++i); + int x = 2; + } + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a loop sequence containing canonical loops or loop-generating constructs}} + #pragma omp fuse + #pragma omp for + for (int i = 0; i < 7; ++i) + ; + + { + // expected-error@+2 {{expected statement}} + #pragma omp fuse + } + + // expected-warning@+1 {{extra tokens at the end of '#pragma omp fuse' are ignored}} + #pragma omp fuse foo + { + for (int i = 0; i < 7; ++i) + ; + for(int j = 0; j < 100; ++j); + + } + + + // expected-error@+1 {{unexpected OpenMP clause 'final' in directive '#pragma omp fuse'}} + #pragma omp fuse final(0) + { + for (int i = 0; i < 7; ++i) + ; + for(int j = 0; j < 100; ++j); + + 
} + + //expected-error at +4 {{loop after '#pragma omp fuse' is not in canonical form}} ---------------- eZWALT wrote:

It's not that I need it; it is emitted independently of the first one. I only added the first one; the second is probably emitted from some CheckOpenMPLoop or other constraint-checking function. Nevertheless, it is nice to have such detailed information, although I understand that, as a tradeoff, it can be a bit verbose.

https://github.com/llvm/llvm-project/pull/139293

From flang-commits at lists.llvm.org Wed May 21 07:01:17 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 07:01:17 -0700 (PDT) Subject: [flang-commits] [flang] [flang] optionally add lifetime markers to alloca created in stack-arrays (PR #140901) Message-ID:

https://github.com/jeanPerier created https://github.com/llvm/llvm-project/pull/140901

Flang at -Ofast usually produces executables that consume more stack than those of other Fortran compilers. This is in part because the allocas that the StackArrays pass creates from temporary heap allocations are placed at function scope without lifetimes, so LLVM cannot tell that their live ranges do not overlap and does not merge them.

This patch adds an option to generate LLVM lifetime markers in the StackArrays pass at the positions of the previous heap allocation and free, using the LLVM dialect operations for them.

For instance, take:

```
subroutine test(x)
  real :: x(100000)
  call bar(x+x)
  call bar(x*x)
end subroutine
```

Currently, at -Ofast, the stack usage is 100000*4*2 bytes = 800 kB. With this patch, LLVM is able to reuse the storage of `x+x` for `x*x` and the stack usage is only 400 kB, as with gfortran -Ofast.

This can only be done for constant-size allocas because the LLVM lifetime markers require a constant size argument. For dynamic-size allocas, we already generate stacksave/stackrestore when inside a loop.
We could extend that generation to "linear" code, but this would disable some loop optimizations, such as loop fusion between two Fortran statements with dynamic-size temporaries. It seems better not to do that at this point.

The lifetime markers are not yet added by default because I want to make sure they do not cause performance regressions on a few benchmarks before making that the default.

I refrained from making a FIR operation for it because I feel it would not add much over the LLVM operation (except for the LLVM ptr type casts), and it may be better to explore some region-based stack allocation management at the FIR level.

From 44b22f6d0fa2f93165ce64e1a34e63601d7a8bb1 Mon Sep 17 00:00:00 2001 From: Jean Perier Date: Tue, 20 May 2025 05:40:06 -0700 Subject: [PATCH] [flang] add lifetime markers to alloca created in stack-arrays --- .../flang/Optimizer/Builder/FIRBuilder.h | 14 ++- .../flang/Optimizer/Dialect/FIROpsSupport.h | 12 +++ .../flang/Optimizer/Transforms/Passes.td | 4 +- flang/lib/Optimizer/Builder/FIRBuilder.cpp | 20 +++- flang/lib/Optimizer/Dialect/FIROps.cpp | 13 +++ .../lib/Optimizer/Transforms/StackArrays.cpp | 100 ++++++++++++++---- .../test/Transforms/stack-arrays-lifetime.fir | 96 +++++++++++++++++ 7 files changed, 234 insertions(+), 25 deletions(-) create mode 100644 flang/test/Transforms/stack-arrays-lifetime.fir diff --git a/flang/include/flang/Optimizer/Builder/FIRBuilder.h b/flang/include/flang/Optimizer/Builder/FIRBuilder.h index 5309ea2c0fc09..9382d77a8d67b 100644 --- a/flang/include/flang/Optimizer/Builder/FIRBuilder.h +++ b/flang/include/flang/Optimizer/Builder/FIRBuilder.h @@ -879,7 +879,7 @@ llvm::SmallVector elideLengthsAlreadyInType(mlir::Type type, mlir::ValueRange lenParams); /// Get the address space which should be used for allocas -uint64_t getAllocaAddressSpace(mlir::DataLayout *dataLayout); +uint64_t getAllocaAddressSpace(const mlir::DataLayout *dataLayout); /// The two vectors of MLIR values have the following 
property: /// \p extents1[i] must have the same value as \p extents2[i] @@ -913,6 +913,18 @@ void genDimInfoFromBox(fir::FirOpBuilder &builder, mlir::Location loc, llvm::SmallVectorImpl *extents, llvm::SmallVectorImpl *strides); +/// Generate an LLVM dialect lifetime start marker at the current insertion +/// point given an fir.alloca and its constant size in bytes. Returns the value +/// to be passed to the lifetime end marker. +mlir::Value genLifetimeStart(mlir::OpBuilder &builder, mlir::Location loc, + fir::AllocaOp alloc, int64_t size, + const mlir::DataLayout *dl); + +/// Generate an LLVM dialect lifetime end marker at the current insertion point +/// given an llvm.ptr value and the constant size in bytes of its storage. +void genLifetimeEnd(mlir::OpBuilder &builder, mlir::Location loc, + mlir::Value mem, int64_t size); + } // namespace fir::factory #endif // FORTRAN_OPTIMIZER_BUILDER_FIRBUILDER_H diff --git a/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h b/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h index e71a622725bf4..0a2337be7455e 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h +++ b/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h @@ -125,6 +125,12 @@ static constexpr llvm::StringRef getInternalFuncNameAttrName() { return "fir.internal_name"; } +/// Attribute to mark alloca that have been given a lifetime marker so that +/// later pass do not try adding new ones. +static constexpr llvm::StringRef getHasLifetimeMarkerAttrName() { + return "fir.has_lifetime"; +} + /// Does the function, \p func, have a host-associations tuple argument? /// Some internal procedures may have access to host procedure variables. bool hasHostAssociationArgument(mlir::func::FuncOp func); @@ -221,6 +227,12 @@ inline bool hasBindcAttr(mlir::Operation *op) { return hasProcedureAttr(op); } +/// Get the allocation size of a given alloca if it has compile time constant +/// size. 
+std::optional getAllocaByteSize(fir::AllocaOp alloca, + const mlir::DataLayout &dl, + const fir::KindMapping &kindMap); + /// Return true, if \p rebox operation keeps the input array /// continuous if it is initially continuous. /// When \p checkWhole is false, then the checking is only done diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index c0d88a8e19f80..b251534e1a8f6 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -285,7 +285,9 @@ def StackArrays : Pass<"stack-arrays", "mlir::func::FuncOp"> { Convert heap allocations for arrays, even those of unknown size, into stack allocations. }]; - let dependentDialects = [ "fir::FIROpsDialect" ]; + let dependentDialects = [ + "fir::FIROpsDialect", "mlir::DLTIDialect", "mlir::LLVM::LLVMDialect" + ]; } def StackReclaim : Pass<"stack-reclaim"> { diff --git a/flang/lib/Optimizer/Builder/FIRBuilder.cpp b/flang/lib/Optimizer/Builder/FIRBuilder.cpp index 86166db355f72..68a1cc7a3aee6 100644 --- a/flang/lib/Optimizer/Builder/FIRBuilder.cpp +++ b/flang/lib/Optimizer/Builder/FIRBuilder.cpp @@ -1868,7 +1868,8 @@ void fir::factory::setInternalLinkage(mlir::func::FuncOp func) { func->setAttr("llvm.linkage", linkage); } -uint64_t fir::factory::getAllocaAddressSpace(mlir::DataLayout *dataLayout) { +uint64_t +fir::factory::getAllocaAddressSpace(const mlir::DataLayout *dataLayout) { if (dataLayout) if (mlir::Attribute addrSpace = dataLayout->getAllocaMemorySpace()) return mlir::cast(addrSpace).getUInt(); @@ -1940,3 +1941,20 @@ void fir::factory::genDimInfoFromBox( strides->push_back(dimInfo.getByteStride()); } } + +mlir::Value fir::factory::genLifetimeStart(mlir::OpBuilder &builder, + mlir::Location loc, + fir::AllocaOp alloc, int64_t size, + const mlir::DataLayout *dl) { + mlir::Type ptrTy = mlir::LLVM::LLVMPointerType::get( + alloc.getContext(), getAllocaAddressSpace(dl)); + mlir::Value cast = 
+ builder.create(loc, ptrTy, alloc.getResult()); + builder.create(loc, size, cast); + return cast; +} + +void fir::factory::genLifetimeEnd(mlir::OpBuilder &builder, mlir::Location loc, + mlir::Value cast, int64_t size) { + builder.create(loc, size, cast); +} diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index e12af7782a578..cbe93907265f6 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -4804,6 +4804,19 @@ bool fir::reboxPreservesContinuity(fir::ReboxOp rebox, bool checkWhole) { return false; } +std::optional fir::getAllocaByteSize(fir::AllocaOp alloca, + const mlir::DataLayout &dl, + const fir::KindMapping &kindMap) { + mlir::Type type = alloca.getInType(); + // TODO: should use the constant operands when all info is not available in + // the type. + if (!alloca.isDynamic()) + if (auto sizeAndAlignment = + getTypeSizeAndAlignment(alloca.getLoc(), type, dl, kindMap)) + return sizeAndAlignment->first; + return std::nullopt; +} + //===----------------------------------------------------------------------===// // DeclareOp //===----------------------------------------------------------------------===// diff --git a/flang/lib/Optimizer/Transforms/StackArrays.cpp b/flang/lib/Optimizer/Transforms/StackArrays.cpp index f9b9b4f4ff385..b5671261c9a2b 100644 --- a/flang/lib/Optimizer/Transforms/StackArrays.cpp +++ b/flang/lib/Optimizer/Transforms/StackArrays.cpp @@ -13,12 +13,15 @@ #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/Dialect/Support/FIRContext.h" +#include "flang/Optimizer/Support/DataLayout.h" #include "flang/Optimizer/Transforms/Passes.h" #include "mlir/Analysis/DataFlow/ConstantPropagationAnalysis.h" #include "mlir/Analysis/DataFlow/DeadCodeAnalysis.h" #include "mlir/Analysis/DataFlow/DenseAnalysis.h" #include "mlir/Analysis/DataFlowFramework.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include 
"mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Diagnostics.h" @@ -48,6 +51,11 @@ static llvm::cl::opt maxAllocsPerFunc( "to 0 for no limit."), llvm::cl::init(1000), llvm::cl::Hidden); +static llvm::cl::opt emitLifetimeMarkers( + "stack-arrays-lifetime", + llvm::cl::desc("Add lifetime markers to generated constant size allocas"), + llvm::cl::init(false), llvm::cl::Hidden); + namespace { /// The state of an SSA value at each program point @@ -189,8 +197,11 @@ class AllocMemConversion : public mlir::OpRewritePattern { public: explicit AllocMemConversion( mlir::MLIRContext *ctx, - const StackArraysAnalysisWrapper::AllocMemMap &candidateOps) - : OpRewritePattern(ctx), candidateOps{candidateOps} {} + const StackArraysAnalysisWrapper::AllocMemMap &candidateOps, + std::optional &dl, + std::optional &kindMap) + : OpRewritePattern(ctx), candidateOps{candidateOps}, dl{dl}, + kindMap{kindMap} {} llvm::LogicalResult matchAndRewrite(fir::AllocMemOp allocmem, @@ -206,6 +217,9 @@ class AllocMemConversion : public mlir::OpRewritePattern { /// Handle to the DFA (already run) const StackArraysAnalysisWrapper::AllocMemMap &candidateOps; + const std::optional &dl; + const std::optional &kindMap; + /// If we failed to find an insertion point not inside a loop, see if it would /// be safe to use an llvm.stacksave/llvm.stackrestore inside the loop static InsertionPoint findAllocaLoopInsertionPoint( @@ -218,8 +232,12 @@ class AllocMemConversion : public mlir::OpRewritePattern { mlir::PatternRewriter &rewriter) const; /// Inserts a stacksave before oldAlloc and a stackrestore after each freemem - void insertStackSaveRestore(fir::AllocMemOp &oldAlloc, + void insertStackSaveRestore(fir::AllocMemOp oldAlloc, mlir::PatternRewriter &rewriter) const; + /// Emit lifetime markers for newAlloc between oldAlloc and each freemem. 
+ /// If the allocation is dynamic, no life markers are emitted. + void insertLifetimeMarkers(fir::AllocMemOp oldAlloc, fir::AllocaOp newAlloc, + mlir::PatternRewriter &rewriter) const; }; class StackArraysPass : public fir::impl::StackArraysBase { @@ -740,14 +758,34 @@ AllocMemConversion::insertAlloca(fir::AllocMemOp &oldAlloc, llvm::StringRef uniqName = unpackName(oldAlloc.getUniqName()); llvm::StringRef bindcName = unpackName(oldAlloc.getBindcName()); - return rewriter.create(loc, varTy, uniqName, bindcName, - oldAlloc.getTypeparams(), - oldAlloc.getShape()); + auto alloca = rewriter.create(loc, varTy, uniqName, bindcName, + oldAlloc.getTypeparams(), + oldAlloc.getShape()); + if (emitLifetimeMarkers) + insertLifetimeMarkers(oldAlloc, alloca, rewriter); + + return alloca; +} + +static void +visitFreeMemOp(fir::AllocMemOp oldAlloc, + const std::function &callBack) { + for (mlir::Operation *user : oldAlloc->getUsers()) { + if (auto declareOp = mlir::dyn_cast_if_present(user)) { + for (mlir::Operation *user : declareOp->getUsers()) { + if (mlir::isa(user)) + callBack(user); + } + } + + if (mlir::isa(user)) + callBack(user); + } } void AllocMemConversion::insertStackSaveRestore( - fir::AllocMemOp &oldAlloc, mlir::PatternRewriter &rewriter) const { - auto oldPoint = rewriter.saveInsertionPoint(); + fir::AllocMemOp oldAlloc, mlir::PatternRewriter &rewriter) const { + mlir::OpBuilder::InsertionGuard insertGuard(rewriter); auto mod = oldAlloc->getParentOfType(); fir::FirOpBuilder builder{rewriter, mod}; @@ -758,21 +796,30 @@ void AllocMemConversion::insertStackSaveRestore( builder.setInsertionPoint(user); builder.genStackRestore(user->getLoc(), sp); }; + visitFreeMemOp(oldAlloc, createStackRestoreCall); +} - for (mlir::Operation *user : oldAlloc->getUsers()) { - if (auto declareOp = mlir::dyn_cast_if_present(user)) { - for (mlir::Operation *user : declareOp->getUsers()) { - if (mlir::isa(user)) - createStackRestoreCall(user); - } - } - - if (mlir::isa(user)) { - 
createStackRestoreCall(user); - } +void AllocMemConversion::insertLifetimeMarkers( + fir::AllocMemOp oldAlloc, fir::AllocaOp newAlloc, + mlir::PatternRewriter &rewriter) const { + if (!dl || !kindMap) + return; + llvm::StringRef attrName = fir::getHasLifetimeMarkerAttrName(); + // Do not add lifetime markers, of the alloca already has any. + if (newAlloc->hasAttr(attrName)) + return; + if (std::optional size = + fir::getAllocaByteSize(newAlloc, *dl, *kindMap)) { + mlir::OpBuilder::InsertionGuard insertGuard(rewriter); + rewriter.setInsertionPoint(oldAlloc); + mlir::Value ptr = fir::factory::genLifetimeStart( + rewriter, newAlloc.getLoc(), newAlloc, *size, &*dl); + visitFreeMemOp(oldAlloc, [&](mlir::Operation *op) { + rewriter.setInsertionPoint(op); + fir::factory::genLifetimeEnd(rewriter, op->getLoc(), ptr, *size); + }); + newAlloc->setAttr(attrName, rewriter.getUnitAttr()); } - - rewriter.restoreInsertionPoint(oldPoint); } StackArraysPass::StackArraysPass(const StackArraysPass &pass) @@ -809,7 +856,16 @@ void StackArraysPass::runOnOperation() { config.setRegionSimplificationLevel( mlir::GreedySimplifyRegionLevel::Disabled); - patterns.insert(&context, *candidateOps); + auto module = func->getParentOfType(); + std::optional dl = + module ? 
fir::support::getOrSetMLIRDataLayout( + module, /*allowDefaultLayout=*/false) + : std::nullopt; + std::optional kindMap; + if (module) + kindMap = fir::getKindMapping(module); + + patterns.insert(&context, *candidateOps, dl, kindMap); if (mlir::failed(mlir::applyOpPatternsGreedily( opsToConvert, std::move(patterns), config))) { mlir::emitError(func->getLoc(), "error in stack arrays optimization\n"); diff --git a/flang/test/Transforms/stack-arrays-lifetime.fir b/flang/test/Transforms/stack-arrays-lifetime.fir new file mode 100644 index 0000000000000..5b2faeba132c3 --- /dev/null +++ b/flang/test/Transforms/stack-arrays-lifetime.fir @@ -0,0 +1,96 @@ +// Test insertion of llvm.lifetime for allocmem turn into alloca with constant size. +// RUN: fir-opt --stack-arrays -stack-arrays-lifetime %s | FileCheck %s + +module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"} { + +func.func @_QPcst_alloca(%arg0: !fir.ref> {fir.bindc_name = "x"}) { + %c1 = arith.constant 1 : index + %c100000 = arith.constant 100000 : index + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.shape %c100000 : (index) -> !fir.shape<1> + %2 = fir.declare %arg0(%1) dummy_scope %0 {uniq_name = "_QFcst_allocaEx"} : (!fir.ref>, !fir.shape<1>, !fir.dscope) -> !fir.ref> + %3 = fir.allocmem !fir.array<100000xf32> {bindc_name = ".tmp.array", uniq_name = ""} + %4 = fir.declare %3(%1) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg1 = %c1 to %c100000 step %c1 unordered { + %9 = fir.array_coor %2(%1) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %10 = fir.load %9 : !fir.ref + %11 = arith.addf %10, %10 fastmath : f32 + %12 = fir.array_coor %4(%1) %arg1 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %11 to %12 : !fir.ref + } + %5 = fir.convert %4 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%5) fastmath : (!fir.ref>) -> () + fir.freemem 
%4 : !fir.heap> + %6 = fir.allocmem !fir.array<100000xi32> {bindc_name = ".tmp.array", uniq_name = ""} + %7 = fir.declare %6(%1) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg1 = %c1 to %c100000 step %c1 unordered { + %9 = fir.array_coor %2(%1) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %10 = fir.load %9 : !fir.ref + %11 = fir.convert %10 : (f32) -> i32 + %12 = fir.array_coor %7(%1) %arg1 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %11 to %12 : !fir.ref + } + %8 = fir.convert %7 : (!fir.heap>) -> !fir.ref> + fir.call @_QPibar(%8) fastmath : (!fir.ref>) -> () + fir.freemem %7 : !fir.heap> + return +} +// CHECK-LABEL: func.func @_QPcst_alloca( +// CHECK-DAG: %[[VAL_0:.*]] = fir.alloca !fir.array<100000xf32> {bindc_name = ".tmp.array", fir.has_lifetime} +// CHECK-DAG: %[[VAL_2:.*]] = fir.alloca !fir.array<100000xi32> {bindc_name = ".tmp.array", fir.has_lifetime} +// CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_0]] : (!fir.ref>) -> !llvm.ptr +// CHECK: llvm.intr.lifetime.start 400000, %[[VAL_9]] : !llvm.ptr +// CHECK: fir.do_loop +// CHECK: fir.call @_QPbar( +// CHECK: llvm.intr.lifetime.end 400000, %[[VAL_9]] : !llvm.ptr +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_2]] : (!fir.ref>) -> !llvm.ptr +// CHECK: llvm.intr.lifetime.start 400000, %[[VAL_17]] : !llvm.ptr +// CHECK: fir.do_loop +// CHECK: fir.call @_QPibar( +// CHECK: llvm.intr.lifetime.end 400000, %[[VAL_17]] : !llvm.ptr + + +func.func @_QPdyn_alloca(%arg0: !fir.ref> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "n"}) { + %c1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.declare %arg1 dummy_scope %0 {uniq_name = "_QFdyn_allocaEn"} : (!fir.ref, !fir.dscope) -> !fir.ref + %2 = fir.load %1 : !fir.ref + %3 = fir.convert %2 : (i64) -> index + %4 = arith.cmpi sgt, %3, %c0 : index + %5 = arith.select %4, %3, %c0 : index + %6 = fir.shape %5 : (index) -> !fir.shape<1> + %7 = 
fir.declare %arg0(%6) dummy_scope %0 {uniq_name = "_QFdyn_allocaEx"} : (!fir.ref>, !fir.shape<1>, !fir.dscope) -> !fir.ref> + %8 = fir.allocmem !fir.array, %5 {bindc_name = ".tmp.array", uniq_name = ""} + %9 = fir.declare %8(%6) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg2 = %c1 to %5 step %c1 unordered { + %14 = fir.array_coor %7(%6) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %15 = fir.load %14 : !fir.ref + %16 = arith.addf %15, %15 fastmath : f32 + %17 = fir.array_coor %9(%6) %arg2 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %16 to %17 : !fir.ref + } + %10 = fir.convert %9 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%10) fastmath : (!fir.ref>) -> () + fir.freemem %9 : !fir.heap> + %11 = fir.allocmem !fir.array, %5 {bindc_name = ".tmp.array", uniq_name = ""} + %12 = fir.declare %11(%6) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg2 = %c1 to %5 step %c1 unordered { + %14 = fir.array_coor %7(%6) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %15 = fir.load %14 : !fir.ref + %16 = arith.mulf %15, %15 fastmath : f32 + %17 = fir.array_coor %12(%6) %arg2 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %16 to %17 : !fir.ref + } + %13 = fir.convert %12 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%13) fastmath : (!fir.ref>) -> () + fir.freemem %12 : !fir.heap> + return +} +// CHECK-LABEL: func.func @_QPdyn_alloca( +// CHECK-NOT: llvm.intr.lifetime.start +// CHECK: return + +func.func private @_QPbar(!fir.ref>) +func.func private @_QPibar(!fir.ref>) +} From flang-commits at lists.llvm.org Wed May 21 07:01:45 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 07:01:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang] optionally add lifetime markers to alloca created in stack-arrays (PR #140901) In-Reply-To: Message-ID: <682ddcc9.170a0220.175326.e5b8@mx.google.com> https://github.com/jeanPerier 
edited https://github.com/llvm/llvm-project/pull/140901 From flang-commits at lists.llvm.org Wed May 21 07:01:52 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 07:01:52 -0700 (PDT) Subject: [flang-commits] [flang] [flang] optionally add lifetime markers to alloca created in stack-arrays (PR #140901) In-Reply-To: Message-ID: <682ddcd0.170a0220.22ecea.80a4@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: None (jeanPerier)
Changes

Flang at -Ofast usually produces executables that consume more stack than those of other Fortran compilers. This is in part because the allocas that the StackArrays pass creates from temporary heap allocations are placed at function scope without lifetimes, so LLVM cannot tell that their live ranges do not overlap and does not merge them.

This patch adds an option to generate LLVM lifetime markers in the StackArrays pass at the positions of the previous heap allocation and free, using the LLVM dialect operations for them.

For instance, take:

```
subroutine test(x)
  real :: x(100000)
  call bar(x+x)
  call bar(x*x)
end subroutine
```

Currently, at -Ofast, the stack usage is `100000*4*2` bytes = 800 kB. With this patch, LLVM is able to reuse the storage of `x+x` for `x*x` and the stack usage is only 400 kB, as with gfortran -Ofast.

This can only be done for constant-size allocas because the LLVM lifetime markers require a constant size argument. For dynamic-size allocas, we already generate stacksave/stackrestore when inside a loop. We could extend that generation to "linear" code, but this would disable some loop optimizations, such as loop fusion between two Fortran statements with dynamic-size temporaries. It seems better not to do that at this point.

The lifetime markers are not yet added by default because I want to make sure they do not cause performance regressions on a few benchmarks before making that the default.

I refrained from making a FIR operation for it because I feel it would not add much over the LLVM operation (except for the LLVM ptr type casts), and it may be better to explore some region-based stack allocation management at the FIR level.
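To make the 800 kB versus 400 kB arithmetic above concrete, here is a small C++ sketch. It is only a simplified model and not LLVM's actual stack slot merging: the `Temp` struct, the integer "program points", and both function names are hypothetical and appear nowhere in the patch. Without lifetime information every temporary keeps its own slot, so the frame needs the sum of all sizes; with lifetimes, the frame only has to hold the largest set of temporaries that are live at the same time.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model of one stack temporary: its size in bytes and the
// half-open interval [start, end) of program points where it is live.
struct Temp {
  std::int64_t bytes;
  int start;
  int end;
};

// No lifetime information: every temporary owns a distinct slot for the
// whole function, so the frame size is the sum of all sizes.
std::int64_t frameWithoutLifetimes(const std::vector<Temp> &temps) {
  std::int64_t total = 0;
  for (const Temp &t : temps)
    total += t.bytes;
  return total;
}

// With lifetimes: storage can be shared between temporaries whose intervals
// do not overlap, so the frame needs the peak number of live bytes.
std::int64_t frameWithLifetimes(const std::vector<Temp> &temps) {
  std::vector<std::pair<int, std::int64_t>> events;
  for (const Temp &t : temps) {
    events.push_back({t.start, t.bytes});  // lifetime start: bytes become live
    events.push_back({t.end, -t.bytes});   // lifetime end: bytes are released
  }
  // Sort by program point; at the same point a release (negative delta)
  // sorts before an acquisition, so back-to-back temporaries can share.
  std::sort(events.begin(), events.end());
  std::int64_t live = 0;
  std::int64_t peak = 0;
  for (const auto &event : events) {
    live += event.second;
    peak = std::max(peak, live);
  }
  return peak;
}
```

For the `test` subroutine, the two temporaries would be modelled as `{100000 * 4, 0, 1}` for `x+x` and `{100000 * 4, 1, 2}` for `x*x`: the first function returns 800000 bytes and the second 400000 bytes, matching the figures quoted in the description.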
--- Patch is 20.07 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/140901.diff 7 Files Affected: - (modified) flang/include/flang/Optimizer/Builder/FIRBuilder.h (+13-1) - (modified) flang/include/flang/Optimizer/Dialect/FIROpsSupport.h (+12) - (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+3-1) - (modified) flang/lib/Optimizer/Builder/FIRBuilder.cpp (+19-1) - (modified) flang/lib/Optimizer/Dialect/FIROps.cpp (+13) - (modified) flang/lib/Optimizer/Transforms/StackArrays.cpp (+78-22) - (added) flang/test/Transforms/stack-arrays-lifetime.fir (+96) ``````````diff diff --git a/flang/include/flang/Optimizer/Builder/FIRBuilder.h b/flang/include/flang/Optimizer/Builder/FIRBuilder.h index 5309ea2c0fc09..9382d77a8d67b 100644 --- a/flang/include/flang/Optimizer/Builder/FIRBuilder.h +++ b/flang/include/flang/Optimizer/Builder/FIRBuilder.h @@ -879,7 +879,7 @@ llvm::SmallVector elideLengthsAlreadyInType(mlir::Type type, mlir::ValueRange lenParams); /// Get the address space which should be used for allocas -uint64_t getAllocaAddressSpace(mlir::DataLayout *dataLayout); +uint64_t getAllocaAddressSpace(const mlir::DataLayout *dataLayout); /// The two vectors of MLIR values have the following property: /// \p extents1[i] must have the same value as \p extents2[i] @@ -913,6 +913,18 @@ void genDimInfoFromBox(fir::FirOpBuilder &builder, mlir::Location loc, llvm::SmallVectorImpl *extents, llvm::SmallVectorImpl *strides); +/// Generate an LLVM dialect lifetime start marker at the current insertion +/// point given an fir.alloca and its constant size in bytes. Returns the value +/// to be passed to the lifetime end marker. +mlir::Value genLifetimeStart(mlir::OpBuilder &builder, mlir::Location loc, + fir::AllocaOp alloc, int64_t size, + const mlir::DataLayout *dl); + +/// Generate an LLVM dialect lifetime end marker at the current insertion point +/// given an llvm.ptr value and the constant size in bytes of its storage. 
+void genLifetimeEnd(mlir::OpBuilder &builder, mlir::Location loc, + mlir::Value mem, int64_t size); + } // namespace fir::factory #endif // FORTRAN_OPTIMIZER_BUILDER_FIRBUILDER_H diff --git a/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h b/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h index e71a622725bf4..0a2337be7455e 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h +++ b/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h @@ -125,6 +125,12 @@ static constexpr llvm::StringRef getInternalFuncNameAttrName() { return "fir.internal_name"; } +/// Attribute to mark alloca that have been given a lifetime marker so that +/// later pass do not try adding new ones. +static constexpr llvm::StringRef getHasLifetimeMarkerAttrName() { + return "fir.has_lifetime"; +} + /// Does the function, \p func, have a host-associations tuple argument? /// Some internal procedures may have access to host procedure variables. bool hasHostAssociationArgument(mlir::func::FuncOp func); @@ -221,6 +227,12 @@ inline bool hasBindcAttr(mlir::Operation *op) { return hasProcedureAttr(op); } +/// Get the allocation size of a given alloca if it has compile time constant +/// size. +std::optional getAllocaByteSize(fir::AllocaOp alloca, + const mlir::DataLayout &dl, + const fir::KindMapping &kindMap); + /// Return true, if \p rebox operation keeps the input array /// continuous if it is initially continuous. /// When \p checkWhole is false, then the checking is only done diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index c0d88a8e19f80..b251534e1a8f6 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -285,7 +285,9 @@ def StackArrays : Pass<"stack-arrays", "mlir::func::FuncOp"> { Convert heap allocations for arrays, even those of unknown size, into stack allocations. 
}]; - let dependentDialects = [ "fir::FIROpsDialect" ]; + let dependentDialects = [ + "fir::FIROpsDialect", "mlir::DLTIDialect", "mlir::LLVM::LLVMDialect" + ]; } def StackReclaim : Pass<"stack-reclaim"> { diff --git a/flang/lib/Optimizer/Builder/FIRBuilder.cpp b/flang/lib/Optimizer/Builder/FIRBuilder.cpp index 86166db355f72..68a1cc7a3aee6 100644 --- a/flang/lib/Optimizer/Builder/FIRBuilder.cpp +++ b/flang/lib/Optimizer/Builder/FIRBuilder.cpp @@ -1868,7 +1868,8 @@ void fir::factory::setInternalLinkage(mlir::func::FuncOp func) { func->setAttr("llvm.linkage", linkage); } -uint64_t fir::factory::getAllocaAddressSpace(mlir::DataLayout *dataLayout) { +uint64_t +fir::factory::getAllocaAddressSpace(const mlir::DataLayout *dataLayout) { if (dataLayout) if (mlir::Attribute addrSpace = dataLayout->getAllocaMemorySpace()) return mlir::cast(addrSpace).getUInt(); @@ -1940,3 +1941,20 @@ void fir::factory::genDimInfoFromBox( strides->push_back(dimInfo.getByteStride()); } } + +mlir::Value fir::factory::genLifetimeStart(mlir::OpBuilder &builder, + mlir::Location loc, + fir::AllocaOp alloc, int64_t size, + const mlir::DataLayout *dl) { + mlir::Type ptrTy = mlir::LLVM::LLVMPointerType::get( + alloc.getContext(), getAllocaAddressSpace(dl)); + mlir::Value cast = + builder.create(loc, ptrTy, alloc.getResult()); + builder.create(loc, size, cast); + return cast; +} + +void fir::factory::genLifetimeEnd(mlir::OpBuilder &builder, mlir::Location loc, + mlir::Value cast, int64_t size) { + builder.create(loc, size, cast); +} diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index e12af7782a578..cbe93907265f6 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -4804,6 +4804,19 @@ bool fir::reboxPreservesContinuity(fir::ReboxOp rebox, bool checkWhole) { return false; } +std::optional fir::getAllocaByteSize(fir::AllocaOp alloca, + const mlir::DataLayout &dl, + const fir::KindMapping &kindMap) { + mlir::Type 
type = alloca.getInType(); + // TODO: should use the constant operands when all info is not available in + // the type. + if (!alloca.isDynamic()) + if (auto sizeAndAlignment = + getTypeSizeAndAlignment(alloca.getLoc(), type, dl, kindMap)) + return sizeAndAlignment->first; + return std::nullopt; +} + //===----------------------------------------------------------------------===// // DeclareOp //===----------------------------------------------------------------------===// diff --git a/flang/lib/Optimizer/Transforms/StackArrays.cpp b/flang/lib/Optimizer/Transforms/StackArrays.cpp index f9b9b4f4ff385..b5671261c9a2b 100644 --- a/flang/lib/Optimizer/Transforms/StackArrays.cpp +++ b/flang/lib/Optimizer/Transforms/StackArrays.cpp @@ -13,12 +13,15 @@ #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/Dialect/Support/FIRContext.h" +#include "flang/Optimizer/Support/DataLayout.h" #include "flang/Optimizer/Transforms/Passes.h" #include "mlir/Analysis/DataFlow/ConstantPropagationAnalysis.h" #include "mlir/Analysis/DataFlow/DeadCodeAnalysis.h" #include "mlir/Analysis/DataFlow/DenseAnalysis.h" #include "mlir/Analysis/DataFlowFramework.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Diagnostics.h" @@ -48,6 +51,11 @@ static llvm::cl::opt maxAllocsPerFunc( "to 0 for no limit."), llvm::cl::init(1000), llvm::cl::Hidden); +static llvm::cl::opt emitLifetimeMarkers( + "stack-arrays-lifetime", + llvm::cl::desc("Add lifetime markers to generated constant size allocas"), + llvm::cl::init(false), llvm::cl::Hidden); + namespace { /// The state of an SSA value at each program point @@ -189,8 +197,11 @@ class AllocMemConversion : public mlir::OpRewritePattern { public: explicit AllocMemConversion( mlir::MLIRContext *ctx, - const 
StackArraysAnalysisWrapper::AllocMemMap &candidateOps) - : OpRewritePattern(ctx), candidateOps{candidateOps} {} + const StackArraysAnalysisWrapper::AllocMemMap &candidateOps, + std::optional &dl, + std::optional &kindMap) + : OpRewritePattern(ctx), candidateOps{candidateOps}, dl{dl}, + kindMap{kindMap} {} llvm::LogicalResult matchAndRewrite(fir::AllocMemOp allocmem, @@ -206,6 +217,9 @@ class AllocMemConversion : public mlir::OpRewritePattern { /// Handle to the DFA (already run) const StackArraysAnalysisWrapper::AllocMemMap &candidateOps; + const std::optional &dl; + const std::optional &kindMap; + /// If we failed to find an insertion point not inside a loop, see if it would /// be safe to use an llvm.stacksave/llvm.stackrestore inside the loop static InsertionPoint findAllocaLoopInsertionPoint( @@ -218,8 +232,12 @@ class AllocMemConversion : public mlir::OpRewritePattern { mlir::PatternRewriter &rewriter) const; /// Inserts a stacksave before oldAlloc and a stackrestore after each freemem - void insertStackSaveRestore(fir::AllocMemOp &oldAlloc, + void insertStackSaveRestore(fir::AllocMemOp oldAlloc, mlir::PatternRewriter &rewriter) const; + /// Emit lifetime markers for newAlloc between oldAlloc and each freemem. + /// If the allocation is dynamic, no life markers are emitted. 
+ void insertLifetimeMarkers(fir::AllocMemOp oldAlloc, fir::AllocaOp newAlloc, + mlir::PatternRewriter &rewriter) const; }; class StackArraysPass : public fir::impl::StackArraysBase { @@ -740,14 +758,34 @@ AllocMemConversion::insertAlloca(fir::AllocMemOp &oldAlloc, llvm::StringRef uniqName = unpackName(oldAlloc.getUniqName()); llvm::StringRef bindcName = unpackName(oldAlloc.getBindcName()); - return rewriter.create(loc, varTy, uniqName, bindcName, - oldAlloc.getTypeparams(), - oldAlloc.getShape()); + auto alloca = rewriter.create(loc, varTy, uniqName, bindcName, + oldAlloc.getTypeparams(), + oldAlloc.getShape()); + if (emitLifetimeMarkers) + insertLifetimeMarkers(oldAlloc, alloca, rewriter); + + return alloca; +} + +static void +visitFreeMemOp(fir::AllocMemOp oldAlloc, + const std::function &callBack) { + for (mlir::Operation *user : oldAlloc->getUsers()) { + if (auto declareOp = mlir::dyn_cast_if_present(user)) { + for (mlir::Operation *user : declareOp->getUsers()) { + if (mlir::isa(user)) + callBack(user); + } + } + + if (mlir::isa(user)) + callBack(user); + } } void AllocMemConversion::insertStackSaveRestore( - fir::AllocMemOp &oldAlloc, mlir::PatternRewriter &rewriter) const { - auto oldPoint = rewriter.saveInsertionPoint(); + fir::AllocMemOp oldAlloc, mlir::PatternRewriter &rewriter) const { + mlir::OpBuilder::InsertionGuard insertGuard(rewriter); auto mod = oldAlloc->getParentOfType(); fir::FirOpBuilder builder{rewriter, mod}; @@ -758,21 +796,30 @@ void AllocMemConversion::insertStackSaveRestore( builder.setInsertionPoint(user); builder.genStackRestore(user->getLoc(), sp); }; + visitFreeMemOp(oldAlloc, createStackRestoreCall); +} - for (mlir::Operation *user : oldAlloc->getUsers()) { - if (auto declareOp = mlir::dyn_cast_if_present(user)) { - for (mlir::Operation *user : declareOp->getUsers()) { - if (mlir::isa(user)) - createStackRestoreCall(user); - } - } - - if (mlir::isa(user)) { - createStackRestoreCall(user); - } +void 
AllocMemConversion::insertLifetimeMarkers( + fir::AllocMemOp oldAlloc, fir::AllocaOp newAlloc, + mlir::PatternRewriter &rewriter) const { + if (!dl || !kindMap) + return; + llvm::StringRef attrName = fir::getHasLifetimeMarkerAttrName(); + // Do not add lifetime markers, of the alloca already has any. + if (newAlloc->hasAttr(attrName)) + return; + if (std::optional size = + fir::getAllocaByteSize(newAlloc, *dl, *kindMap)) { + mlir::OpBuilder::InsertionGuard insertGuard(rewriter); + rewriter.setInsertionPoint(oldAlloc); + mlir::Value ptr = fir::factory::genLifetimeStart( + rewriter, newAlloc.getLoc(), newAlloc, *size, &*dl); + visitFreeMemOp(oldAlloc, [&](mlir::Operation *op) { + rewriter.setInsertionPoint(op); + fir::factory::genLifetimeEnd(rewriter, op->getLoc(), ptr, *size); + }); + newAlloc->setAttr(attrName, rewriter.getUnitAttr()); } - - rewriter.restoreInsertionPoint(oldPoint); } StackArraysPass::StackArraysPass(const StackArraysPass &pass) @@ -809,7 +856,16 @@ void StackArraysPass::runOnOperation() { config.setRegionSimplificationLevel( mlir::GreedySimplifyRegionLevel::Disabled); - patterns.insert(&context, *candidateOps); + auto module = func->getParentOfType(); + std::optional dl = + module ? fir::support::getOrSetMLIRDataLayout( + module, /*allowDefaultLayout=*/false) + : std::nullopt; + std::optional kindMap; + if (module) + kindMap = fir::getKindMapping(module); + + patterns.insert(&context, *candidateOps, dl, kindMap); if (mlir::failed(mlir::applyOpPatternsGreedily( opsToConvert, std::move(patterns), config))) { mlir::emitError(func->getLoc(), "error in stack arrays optimization\n"); diff --git a/flang/test/Transforms/stack-arrays-lifetime.fir b/flang/test/Transforms/stack-arrays-lifetime.fir new file mode 100644 index 0000000000000..5b2faeba132c3 --- /dev/null +++ b/flang/test/Transforms/stack-arrays-lifetime.fir @@ -0,0 +1,96 @@ +// Test insertion of llvm.lifetime for allocmem turn into alloca with constant size. 
+// RUN: fir-opt --stack-arrays -stack-arrays-lifetime %s | FileCheck %s + +module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"} { + +func.func @_QPcst_alloca(%arg0: !fir.ref> {fir.bindc_name = "x"}) { + %c1 = arith.constant 1 : index + %c100000 = arith.constant 100000 : index + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.shape %c100000 : (index) -> !fir.shape<1> + %2 = fir.declare %arg0(%1) dummy_scope %0 {uniq_name = "_QFcst_allocaEx"} : (!fir.ref>, !fir.shape<1>, !fir.dscope) -> !fir.ref> + %3 = fir.allocmem !fir.array<100000xf32> {bindc_name = ".tmp.array", uniq_name = ""} + %4 = fir.declare %3(%1) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg1 = %c1 to %c100000 step %c1 unordered { + %9 = fir.array_coor %2(%1) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %10 = fir.load %9 : !fir.ref + %11 = arith.addf %10, %10 fastmath : f32 + %12 = fir.array_coor %4(%1) %arg1 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %11 to %12 : !fir.ref + } + %5 = fir.convert %4 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%5) fastmath : (!fir.ref>) -> () + fir.freemem %4 : !fir.heap> + %6 = fir.allocmem !fir.array<100000xi32> {bindc_name = ".tmp.array", uniq_name = ""} + %7 = fir.declare %6(%1) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg1 = %c1 to %c100000 step %c1 unordered { + %9 = fir.array_coor %2(%1) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %10 = fir.load %9 : !fir.ref + %11 = fir.convert %10 : (f32) -> i32 + %12 = fir.array_coor %7(%1) %arg1 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %11 to %12 : !fir.ref + } + %8 = fir.convert %7 : (!fir.heap>) -> !fir.ref> + fir.call @_QPibar(%8) fastmath : (!fir.ref>) -> () + fir.freemem %7 : !fir.heap> + return +} +// CHECK-LABEL: func.func @_QPcst_alloca( +// CHECK-DAG: 
%[[VAL_0:.*]] = fir.alloca !fir.array<100000xf32> {bindc_name = ".tmp.array", fir.has_lifetime} +// CHECK-DAG: %[[VAL_2:.*]] = fir.alloca !fir.array<100000xi32> {bindc_name = ".tmp.array", fir.has_lifetime} +// CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_0]] : (!fir.ref>) -> !llvm.ptr +// CHECK: llvm.intr.lifetime.start 400000, %[[VAL_9]] : !llvm.ptr +// CHECK: fir.do_loop +// CHECK: fir.call @_QPbar( +// CHECK: llvm.intr.lifetime.end 400000, %[[VAL_9]] : !llvm.ptr +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_2]] : (!fir.ref>) -> !llvm.ptr +// CHECK: llvm.intr.lifetime.start 400000, %[[VAL_17]] : !llvm.ptr +// CHECK: fir.do_loop +// CHECK: fir.call @_QPibar( +// CHECK: llvm.intr.lifetime.end 400000, %[[VAL_17]] : !llvm.ptr + + +func.func @_QPdyn_alloca(%arg0: !fir.ref> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "n"}) { + %c1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.declare %arg1 dummy_scope %0 {uniq_name = "_QFdyn_allocaEn"} : (!fir.ref, !fir.dscope) -> !fir.ref + %2 = fir.load %1 : !fir.ref + %3 = fir.convert %2 : (i64) -> index + %4 = arith.cmpi sgt, %3, %c0 : index + %5 = arith.select %4, %3, %c0 : index + %6 = fir.shape %5 : (index) -> !fir.shape<1> + %7 = fir.declare %arg0(%6) dummy_scope %0 {uniq_name = "_QFdyn_allocaEx"} : (!fir.ref>, !fir.shape<1>, !fir.dscope) -> !fir.ref> + %8 = fir.allocmem !fir.array, %5 {bindc_name = ".tmp.array", uniq_name = ""} + %9 = fir.declare %8(%6) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg2 = %c1 to %5 step %c1 unordered { + %14 = fir.array_coor %7(%6) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %15 = fir.load %14 : !fir.ref + %16 = arith.addf %15, %15 fastmath : f32 + %17 = fir.array_coor %9(%6) %arg2 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %16 to %17 : !fir.ref + } + %10 = fir.convert %9 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%10) fastmath : (!fir.ref>) -> () 
+ fir.freemem %9 : !fir.heap> + %11 = fir.allocmem !fir.array, %5 {bindc_name = ".tmp.array", uniq_name = ""} + %12 = fir.declare %11(%6) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg2 = %c1 to %5 step %c1 unordered { + %14 = fir.array_coor %7(%6) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %15 = fir.load %14 : !fir.ref + %16 = arith.mulf %15, %15 fastmath : f32 + %17 = fir.array_coor %12(%6) %arg2 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %16 to %17 : !fir.ref + } + %13 = fir.convert %12 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%13) fastmath : (!fir.ref>) -> () + fir.freemem %12 : !fir.heap> + return +} +// CHECK-LABEL: func.func @_QPdyn_alloca( +// CHECK-NOT: llvm.intr.lifetime.start +// CHECK: return + +func.func private @_QPbar(!fir.ref https://github.com/llvm/llvm-project/pull/140901 From flang-commits at lists.llvm.org Wed May 21 07:02:29 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 07:02:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang] optionally add lifetime markers to alloca created in stack-arrays (PR #140901) In-Reply-To: Message-ID: <682ddcf5.170a0220.95b01.7b89@mx.google.com> https://github.com/jeanPerier edited https://github.com/llvm/llvm-project/pull/140901 From flang-commits at lists.llvm.org Wed May 21 07:16:55 2025 From: flang-commits at lists.llvm.org (Jean-Didier PAILLEUX via flang-commits) Date: Wed, 21 May 2025 07:16:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Implement !DIR$ IVDEP directive (PR #133728) In-Reply-To: Message-ID: <682de057.170a0220.95b01.7d45@mx.google.com> JDPailleux wrote: Hi @kiranktp, thank you for your reply. I was not aware of the existence of a directive in clang for ivdep? I couldn't find this directive, nor the associated metadata. Do you have the related PR or commit ? 
https://github.com/llvm/llvm-project/pull/133728 From flang-commits at lists.llvm.org Wed May 21 07:46:37 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 21 May 2025 07:46:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682de74d.630a0220.20f56b.dc4a@mx.google.com> tblah wrote: > > Please could you add a test showing a loop containing multiple blocks? This feels like a likely source of bugs now or in the future. > > It seems that `fir.do_loop` only allows a single block. Are you referring to cases where some operations within that block contain nested blocks (such as `fir.if`)? This will be gradually implemented in the future. Ahh my mistake. I thought it might be possible to have a do_loop containing multiple blocks. https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Wed May 21 08:11:08 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Wed, 21 May 2025 08:11:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682ded0c.a70a0220.33a4ea.a3f6@mx.google.com> https://github.com/clementval edited https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 02:06:06 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 21 May 2025 02:06:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #140066) In-Reply-To: Message-ID: <682d977e.170a0220.73329.63f4@mx.google.com> tblah wrote: ping for review https://github.com/llvm/llvm-project/pull/140066 From flang-commits at lists.llvm.org Wed May 21 02:27:48 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan
via flang-commits) Date: Wed, 21 May 2025 02:27:48 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] fix diagnostic for bad cancel type (PR #140798) In-Reply-To: Message-ID: <682d9c94.630a0220.2177dd.8976@mx.google.com> https://github.com/kiranchandramohan approved this pull request. LG. https://github.com/llvm/llvm-project/pull/140798 From flang-commits at lists.llvm.org Wed May 21 02:48:19 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 21 May 2025 02:48:19 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] fix diagnostic for bad cancel type (PR #140798) In-Reply-To: Message-ID: <682da163.170a0220.2b543f.d233@mx.google.com> https://github.com/tblah closed https://github.com/llvm/llvm-project/pull/140798 From flang-commits at lists.llvm.org Wed May 21 08:16:02 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 21 May 2025 08:16:02 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <682dee32.170a0220.3021e0.ff59@mx.google.com> https://github.com/tblah commented: LGTM, but as Kiran pointed out, be careful about the attributes. I think it is safe to drop `unordered` because that is just an optimization hint (basically promoting a loop that doesn't care what order it is executed in to one that does). I am not sure about the reduction operands though. https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Wed May 21 08:19:14 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 08:19:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Skip optimized bufferization on volatile refs (PR #140781) In-Reply-To: Message-ID: <682deef2.170a0220.1f7165.8837@mx.google.com> jeanPerier wrote: > Thank you for the thoughtful review! 
I wonder if there's somewhere I should collect those ideas along with the design of volatile so we can iterate on it (or at least keep it documented somewhere). Maybe something in `docs/`? Yes, that is a good idea. https://github.com/llvm/llvm-project/pull/140781 From flang-commits at lists.llvm.org Wed May 21 08:57:37 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 21 May 2025 08:57:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682df7f1.170a0220.4e540.09f6@mx.google.com> ================ @@ -9,4 +9,5 @@ end subroutine ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc} ! CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref> +! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda} ! CHECK-NOT: cuf.free ---------------- wangzpgi wrote: Yes. 
https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 09:12:38 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 21 May 2025 09:12:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682dfb76.630a0220.183fc2.f23b@mx.google.com> https://github.com/wangzpgi updated https://github.com/llvm/llvm-project/pull/140834 >From 564ff8f169f7807dedd95fe2d3eb995c0472f277 Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Tue, 20 May 2025 19:47:59 -0700 Subject: [PATCH 1/2] implicitly set DEVICE attribute to scalars in device routines --- flang/lib/Semantics/resolve-names.cpp | 2 +- flang/test/Lower/CUDA/cuda-shared.cuf | 1 + flang/test/Semantics/cuf21.cuf | 38 +++++++++++++++++++++++++++ 3 files changed, 40 insertions(+), 1 deletion(-) create mode 100644 flang/test/Semantics/cuf21.cuf diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..3f4a06444c4f3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -9376,7 +9376,7 @@ static void SetImplicitCUDADevice(bool inDeviceSubprogram, Symbol &symbol) { if (inDeviceSubprogram && symbol.has()) { auto *object{symbol.detailsIf()}; if (!object->cudaDataAttr() && !IsValue(symbol) && - (IsDummy(symbol) || object->IsArray())) { + !IsFunctionResult(symbol)) { // Implicitly set device attribute if none is set in device context. object->set_cudaDataAttr(common::CUDADataAttr::Device); } diff --git a/flang/test/Lower/CUDA/cuda-shared.cuf b/flang/test/Lower/CUDA/cuda-shared.cuf index f41011df06ae7..565857f01bdb8 100644 --- a/flang/test/Lower/CUDA/cuda-shared.cuf +++ b/flang/test/Lower/CUDA/cuda-shared.cuf @@ -9,4 +9,5 @@ end subroutine ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc} ! 
CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref> +! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda} ! CHECK-NOT: cuf.free diff --git a/flang/test/Semantics/cuf21.cuf b/flang/test/Semantics/cuf21.cuf new file mode 100644 index 0000000000000..52343daaf66f1 --- /dev/null +++ b/flang/test/Semantics/cuf21.cuf @@ -0,0 +1,38 @@ +! RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s + +module mlocModule + interface maxlocUpdate + module procedure :: & + maxlocUpdateR_32F, & + maxlocUpdateR_64F, & + maxlocUpdateR_32I, & + maxlocUpdateR_64I + end interface maxlocUpdate +contains + + attributes(global) subroutine maxlocPartialMaskR_32F1D() + implicit none + real(4) :: mval + + call maxlocUpdate(mval) + + end subroutine maxlocPartialMaskR_32F1D + + attributes(device) subroutine maxlocUpdateR_32F(mval) + real(4) :: mval + end subroutine maxlocUpdateR_32F + + attributes(device) subroutine maxlocUpdateR_64F(mval) + real(8) :: mval + end subroutine maxlocUpdateR_64F + + attributes(device) subroutine maxlocUpdateR_32I(mval) + integer(4) :: mval + end subroutine maxlocUpdateR_32I + + attributes(device) subroutine maxlocUpdateR_64I(mval) + integer(8) :: mval + end subroutine maxlocUpdateR_64I +end module + +! 
CHECK-LABEL: func.func @_QMmlocmodulePmaxlocpartialmaskr_32f1d() >From ab3d3013ff32778652add6c1fced10af28b772b5 Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Wed, 21 May 2025 09:12:24 -0700 Subject: [PATCH 2/2] shorten test case --- flang/test/Semantics/cuf21.cuf | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/flang/test/Semantics/cuf21.cuf b/flang/test/Semantics/cuf21.cuf index 52343daaf66f1..8f9c4f7064bb2 100644 --- a/flang/test/Semantics/cuf21.cuf +++ b/flang/test/Semantics/cuf21.cuf @@ -4,9 +4,7 @@ module mlocModule interface maxlocUpdate module procedure :: & maxlocUpdateR_32F, & - maxlocUpdateR_64F, & - maxlocUpdateR_32I, & - maxlocUpdateR_64I + maxlocUpdateR_64F end interface maxlocUpdate contains @@ -25,14 +23,6 @@ contains attributes(device) subroutine maxlocUpdateR_64F(mval) real(8) :: mval end subroutine maxlocUpdateR_64F - - attributes(device) subroutine maxlocUpdateR_32I(mval) - integer(4) :: mval - end subroutine maxlocUpdateR_32I - - attributes(device) subroutine maxlocUpdateR_64I(mval) - integer(8) :: mval - end subroutine maxlocUpdateR_64I end module ! CHECK-LABEL: func.func @_QMmlocmodulePmaxlocpartialmaskr_32f1d() From flang-commits at lists.llvm.org Wed May 21 09:47:25 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 21 May 2025 09:47:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682e039d.170a0220.f1751.9e60@mx.google.com> ================ @@ -9,4 +9,5 @@ end subroutine ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc} ! CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref> +! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda} ! CHECK-NOT: cuf.free ---------------- wangzpgi wrote: Do we call cuf.free for scalars if they have `device` attribute? 
https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 09:48:12 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Wed, 21 May 2025 09:48:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682e03cc.630a0220.1372f1.f6c2@mx.google.com> ================ @@ -0,0 +1,28 @@ +! RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s ---------------- clementval wrote: We should use the semantic check run line here: ``` ! RUN: %python %S/test_errors.py %s %flang_fc1 ``` We just want to see if semantics passes without error. We are not really interested in the IR. https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 09:48:54 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Wed, 21 May 2025 09:48:54 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682e03f6.170a0220.5fc46.6e8e@mx.google.com> ================ @@ -0,0 +1,28 @@ +! RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s + ---------------- clementval wrote: Can you add a comment saying what this test is checking? It can be very short.
https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 09:49:51 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Wed, 21 May 2025 09:49:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682e042f.170a0220.152f49.9f94@mx.google.com> ================ @@ -9,4 +9,5 @@ end subroutine ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc} ! CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref> +! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda} ! CHECK-NOT: cuf.free ---------------- clementval wrote: We clean it later but I think it would make sense to not generate them if we can. This is not very important with this patch. https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 10:27:28 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 21 May 2025 10:27:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682e0d00.a70a0220.17d434.dc17@mx.google.com> https://github.com/wangzpgi updated https://github.com/llvm/llvm-project/pull/140834 >From 564ff8f169f7807dedd95fe2d3eb995c0472f277 Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Tue, 20 May 2025 19:47:59 -0700 Subject: [PATCH 1/3] implicitly set DEVICE attribute to scalars in device routines --- flang/lib/Semantics/resolve-names.cpp | 2 +- flang/test/Lower/CUDA/cuda-shared.cuf | 1 + flang/test/Semantics/cuf21.cuf | 38 +++++++++++++++++++++++++++ 3 files changed, 40 insertions(+), 1 deletion(-) create mode 100644 flang/test/Semantics/cuf21.cuf diff --git a/flang/lib/Semantics/resolve-names.cpp 
b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..3f4a06444c4f3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -9376,7 +9376,7 @@ static void SetImplicitCUDADevice(bool inDeviceSubprogram, Symbol &symbol) { if (inDeviceSubprogram && symbol.has()) { auto *object{symbol.detailsIf()}; if (!object->cudaDataAttr() && !IsValue(symbol) && - (IsDummy(symbol) || object->IsArray())) { + !IsFunctionResult(symbol)) { // Implicitly set device attribute if none is set in device context. object->set_cudaDataAttr(common::CUDADataAttr::Device); } diff --git a/flang/test/Lower/CUDA/cuda-shared.cuf b/flang/test/Lower/CUDA/cuda-shared.cuf index f41011df06ae7..565857f01bdb8 100644 --- a/flang/test/Lower/CUDA/cuda-shared.cuf +++ b/flang/test/Lower/CUDA/cuda-shared.cuf @@ -9,4 +9,5 @@ end subroutine ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc} ! CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref> +! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda} ! CHECK-NOT: cuf.free diff --git a/flang/test/Semantics/cuf21.cuf b/flang/test/Semantics/cuf21.cuf new file mode 100644 index 0000000000000..52343daaf66f1 --- /dev/null +++ b/flang/test/Semantics/cuf21.cuf @@ -0,0 +1,38 @@ +! 
RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s + +module mlocModule + interface maxlocUpdate + module procedure :: & + maxlocUpdateR_32F, & + maxlocUpdateR_64F, & + maxlocUpdateR_32I, & + maxlocUpdateR_64I + end interface maxlocUpdate +contains + + attributes(global) subroutine maxlocPartialMaskR_32F1D() + implicit none + real(4) :: mval + + call maxlocUpdate(mval) + + end subroutine maxlocPartialMaskR_32F1D + + attributes(device) subroutine maxlocUpdateR_32F(mval) + real(4) :: mval + end subroutine maxlocUpdateR_32F + + attributes(device) subroutine maxlocUpdateR_64F(mval) + real(8) :: mval + end subroutine maxlocUpdateR_64F + + attributes(device) subroutine maxlocUpdateR_32I(mval) + integer(4) :: mval + end subroutine maxlocUpdateR_32I + + attributes(device) subroutine maxlocUpdateR_64I(mval) + integer(8) :: mval + end subroutine maxlocUpdateR_64I +end module + +! CHECK-LABEL: func.func @_QMmlocmodulePmaxlocpartialmaskr_32f1d() >From ab3d3013ff32778652add6c1fced10af28b772b5 Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Wed, 21 May 2025 09:12:24 -0700 Subject: [PATCH 2/3] shorten test case --- flang/test/Semantics/cuf21.cuf | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/flang/test/Semantics/cuf21.cuf b/flang/test/Semantics/cuf21.cuf index 52343daaf66f1..8f9c4f7064bb2 100644 --- a/flang/test/Semantics/cuf21.cuf +++ b/flang/test/Semantics/cuf21.cuf @@ -4,9 +4,7 @@ module mlocModule interface maxlocUpdate module procedure :: & maxlocUpdateR_32F, & - maxlocUpdateR_64F, & - maxlocUpdateR_32I, & - maxlocUpdateR_64I + maxlocUpdateR_64F end interface maxlocUpdate contains @@ -25,14 +23,6 @@ contains attributes(device) subroutine maxlocUpdateR_64F(mval) real(8) :: mval end subroutine maxlocUpdateR_64F - - attributes(device) subroutine maxlocUpdateR_32I(mval) - integer(4) :: mval - end subroutine maxlocUpdateR_32I - - attributes(device) subroutine maxlocUpdateR_64I(mval) - integer(8) :: mval - end subroutine maxlocUpdateR_64I end module 
! CHECK-LABEL: func.func @_QMmlocmodulePmaxlocpartialmaskr_32f1d() >From 4d65a4dc415c33e9a7c50cc37e93b4ceb008577e Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Wed, 21 May 2025 10:27:16 -0700 Subject: [PATCH 3/3] modify how test runs --- flang/test/Semantics/cuf21.cuf | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/flang/test/Semantics/cuf21.cuf b/flang/test/Semantics/cuf21.cuf index 8f9c4f7064bb2..b8a3b0a5a625d 100644 --- a/flang/test/Semantics/cuf21.cuf +++ b/flang/test/Semantics/cuf21.cuf @@ -1,4 +1,5 @@ -! RUN: bbc -emit-hlfir -fcuda %s -o - | FileCheck %s +! RUN: %python %S/test_errors.py %s %flang_fc1 +! Test generic matching with scalars argument without device attr module mlocModule interface maxlocUpdate From flang-commits at lists.llvm.org Wed May 21 10:31:36 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Wed, 21 May 2025 10:31:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682e0df8.170a0220.7c280.9b9d@mx.google.com> ================ @@ -0,0 +1,29 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 +! Test generic matching with scalars argument without device attr + +module mlocModule + interface maxlocUpdate + module procedure :: & + maxlocUpdateR_32F, & + maxlocUpdateR_64F + end interface maxlocUpdate +contains + + attributes(global) subroutine maxlocPartialMaskR_32F1D() + implicit none + real(4) :: mval + + call maxlocUpdate(mval) + + end subroutine maxlocPartialMaskR_32F1D + + attributes(device) subroutine maxlocUpdateR_32F(mval) + real(4) :: mval + end subroutine maxlocUpdateR_32F + + attributes(device) subroutine maxlocUpdateR_64F(mval) + real(8) :: mval + end subroutine maxlocUpdateR_64F +end module + +! 
CHECK-LABEL: func.func @_QMmlocmodulePmaxlocpartialmaskr_32f1d() ---------------- clementval wrote: You can remove this check line. https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 10:31:46 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Wed, 21 May 2025 10:31:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682e0e02.170a0220.271210.45bf@mx.google.com> https://github.com/clementval approved this pull request. LGTM. https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 10:55:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 10:55:55 -0700 (PDT) Subject: [flang-commits] [flang] 4042a00 - [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (#140834) Message-ID: <682e13ab.050a0220.2af7ae.c6e6@mx.google.com> Author: Zhen Wang Date: 2025-05-21T10:55:52-07:00 New Revision: 4042a002cea6dc6f12e32953c820f6eae1ac1817 URL: https://github.com/llvm/llvm-project/commit/4042a002cea6dc6f12e32953c820f6eae1ac1817 DIFF: https://github.com/llvm/llvm-project/commit/4042a002cea6dc6f12e32953c820f6eae1ac1817.diff LOG: [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (#140834) Scalars inside device routines also need to implicitly set the DEVICE attribute, except for function results.
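The effect of the condition change in `SetImplicitCUDADevice` can be illustrated with a small standalone model. This is only a sketch: `SymbolModel` and its fields are hypothetical stand-ins for the queries used in the patch (`cudaDataAttr()`, `IsValue`, `IsDummy`, `IsArray`, `IsFunctionResult`), not flang's actual types, and it assumes the symbol is already known to be an object entity inside a device subprogram.

```cpp
#include <cassert>

// Hypothetical, simplified model of the properties the condition inspects.
// These field names are illustrative and are not part of flang's API.
struct SymbolModel {
  bool hasCudaDataAttr;  // an explicit CUDA data attribute is already set
  bool isValue;          // symbol has the VALUE attribute
  bool isDummy;          // symbol is a dummy argument
  bool isArray;          // symbol is an array
  bool isFunctionResult; // symbol is a function result
};

// Condition before the patch: only dummies and arrays get implicit DEVICE.
constexpr bool oldRule(const SymbolModel &s) {
  return !s.hasCudaDataAttr && !s.isValue && (s.isDummy || s.isArray);
}

// Condition after the patch: everything except function results.
constexpr bool newRule(const SymbolModel &s) {
  return !s.hasCudaDataAttr && !s.isValue && !s.isFunctionResult;
}

// A local scalar such as `real(4) :: mval` in the global subroutine of the
// test: not a dummy, not an array, not a function result.
constexpr SymbolModel localScalar{false, false, false, false, false};
// A scalar function result, which the patch deliberately excludes.
constexpr SymbolModel scalarResult{false, false, false, false, true};
// A scalar dummy argument, covered by both conditions.
constexpr SymbolModel scalarDummy{false, false, true, false, false};

static_assert(!oldRule(localScalar), "old rule skipped local scalars");
static_assert(newRule(localScalar), "new rule covers local scalars");
static_assert(!newRule(scalarResult), "function results stay excluded");
static_assert(oldRule(scalarDummy) && newRule(scalarDummy),
              "dummies are covered before and after");
```

Under this model, the local scalar `mval` in the test's `attributes(global)` subroutine is the case that changes: it receives no implicit attribute under the old condition and receives DEVICE under the new one, which is presumably what lets the generic call `maxlocUpdate(mval)` resolve against the device-routine specifics.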
Added: flang/test/Semantics/cuf21.cuf Modified: flang/lib/Semantics/resolve-names.cpp flang/test/Lower/CUDA/cuda-shared.cuf Removed: ################################################################################ diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..3f4a06444c4f3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -9376,7 +9376,7 @@ static void SetImplicitCUDADevice(bool inDeviceSubprogram, Symbol &symbol) { if (inDeviceSubprogram && symbol.has()) { auto *object{symbol.detailsIf()}; if (!object->cudaDataAttr() && !IsValue(symbol) && - (IsDummy(symbol) || object->IsArray())) { + !IsFunctionResult(symbol)) { // Implicitly set device attribute if none is set in device context. object->set_cudaDataAttr(common::CUDADataAttr::Device); } diff --git a/flang/test/Lower/CUDA/cuda-shared.cuf b/flang/test/Lower/CUDA/cuda-shared.cuf index f41011df06ae7..565857f01bdb8 100644 --- a/flang/test/Lower/CUDA/cuda-shared.cuf +++ b/flang/test/Lower/CUDA/cuda-shared.cuf @@ -9,4 +9,5 @@ end subroutine ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc} ! CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref> +! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda} ! CHECK-NOT: cuf.free diff --git a/flang/test/Semantics/cuf21.cuf b/flang/test/Semantics/cuf21.cuf new file mode 100644 index 0000000000000..b8b99a8d1d9be --- /dev/null +++ b/flang/test/Semantics/cuf21.cuf @@ -0,0 +1,27 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 +! 
Test generic matching with scalars argument without device attr + +module mlocModule + interface maxlocUpdate + module procedure :: & + maxlocUpdateR_32F, & + maxlocUpdateR_64F + end interface maxlocUpdate +contains + + attributes(global) subroutine maxlocPartialMaskR_32F1D() + implicit none + real(4) :: mval + + call maxlocUpdate(mval) + + end subroutine maxlocPartialMaskR_32F1D + + attributes(device) subroutine maxlocUpdateR_32F(mval) + real(4) :: mval + end subroutine maxlocUpdateR_32F + + attributes(device) subroutine maxlocUpdateR_64F(mval) + real(8) :: mval + end subroutine maxlocUpdateR_64F +end module From flang-commits at lists.llvm.org Wed May 21 10:55:59 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 21 May 2025 10:55:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set DEVICE attribute to scalars in device routines (PR #140834) In-Reply-To: Message-ID: <682e13af.170a0220.13992e.b013@mx.google.com> https://github.com/wangzpgi closed https://github.com/llvm/llvm-project/pull/140834 From flang-commits at lists.llvm.org Wed May 21 11:48:30 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Wed, 21 May 2025 11:48:30 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Use NVVM op for barrier0 intrinsic (PR #140947) Message-ID: https://github.com/clementval created https://github.com/llvm/llvm-project/pull/140947 The simple form of `Barrier0Op` is available in the NVVM dialect.
It is needed to use it instead of the string version since https://github.com/llvm/llvm-project/pull/140615 >From 16d12fae66ab9d04097d49acbdc66d3009a1301e Mon Sep 17 00:00:00 2001 From: Valentin Clement Date: Wed, 21 May 2025 11:47:10 -0700 Subject: [PATCH] [flang][cuda] Use NVVM op for barrier0 intrinsic --- flang/lib/Optimizer/Builder/IntrinsicCall.cpp | 7 +------ flang/test/Lower/CUDA/cuda-device-proc.cuf | 4 ++-- 2 files changed, 3 insertions(+), 8 deletions(-) diff --git a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp index 1ac0627da9524..178b6770d6b53 100644 --- a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp +++ b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp @@ -8332,12 +8332,7 @@ IntrinsicLibrary::genSum(mlir::Type resultType, // SYNCTHREADS void IntrinsicLibrary::genSyncThreads(llvm::ArrayRef args) { - constexpr llvm::StringLiteral funcName = "llvm.nvvm.barrier0"; - mlir::FunctionType funcType = - mlir::FunctionType::get(builder.getContext(), {}, {}); - auto funcOp = builder.createFunction(loc, funcName, funcType); - llvm::SmallVector noArgs; - builder.create(loc, funcOp, noArgs); + builder.create(loc); } // SYNCTHREADS_AND diff --git a/flang/test/Lower/CUDA/cuda-device-proc.cuf b/flang/test/Lower/CUDA/cuda-device-proc.cuf index 8f5e6dd36da4e..42ee7657966e2 100644 --- a/flang/test/Lower/CUDA/cuda-device-proc.cuf +++ b/flang/test/Lower/CUDA/cuda-device-proc.cuf @@ -49,7 +49,7 @@ attributes(global) subroutine devsub() end ! CHECK-LABEL: func.func @_QPdevsub() attributes {cuf.proc_attr = #cuf.cuda_proc} -! CHECK: fir.call @llvm.nvvm.barrier0() fastmath : () -> () +! CHECK: nvvm.barrier0 ! CHECK: fir.call @llvm.nvvm.bar.warp.sync(%c1{{.*}}) fastmath : (i32) -> () ! CHECK: fir.call @llvm.nvvm.membar.gl() fastmath : () -> () ! CHECK: fir.call @llvm.nvvm.membar.cta() fastmath : () -> () @@ -106,7 +106,7 @@ end ! CHECK-LABEL: func.func @_QPhost1() ! CHECK: cuf.kernel -! 
CHECK: fir.call @llvm.nvvm.barrier0() fastmath : () -> () +! CHECK: nvvm.barrier0 ! CHECK: fir.call @llvm.nvvm.bar.warp.sync(%c1{{.*}}) fastmath : (i32) -> () ! CHECK: fir.call @llvm.nvvm.barrier0.and(%c1{{.*}}) fastmath : (i32) -> i32 ! CHECK: fir.call @llvm.nvvm.barrier0.popc(%c1{{.*}}) fastmath : (i32) -> i32 From flang-commits at lists.llvm.org Wed May 21 11:49:04 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 11:49:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Use NVVM op for barrier0 intrinsic (PR #140947) In-Reply-To: Message-ID: <682e2020.050a0220.30f7ac.d24a@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Valentin Clement (バレンタイン クレメン) (clementval)
Changes The simple form of `Barrier0Op` is available in the NVVM dialect. It is needed to use it instead of the string version since https://github.com/llvm/llvm-project/pull/140615 --- Full diff: https://github.com/llvm/llvm-project/pull/140947.diff 2 Files Affected: - (modified) flang/lib/Optimizer/Builder/IntrinsicCall.cpp (+1-6) - (modified) flang/test/Lower/CUDA/cuda-device-proc.cuf (+2-2) ``````````diff diff --git a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp index 1ac0627da9524..178b6770d6b53 100644 --- a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp +++ b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp @@ -8332,12 +8332,7 @@ IntrinsicLibrary::genSum(mlir::Type resultType, // SYNCTHREADS void IntrinsicLibrary::genSyncThreads(llvm::ArrayRef args) { - constexpr llvm::StringLiteral funcName = "llvm.nvvm.barrier0"; - mlir::FunctionType funcType = - mlir::FunctionType::get(builder.getContext(), {}, {}); - auto funcOp = builder.createFunction(loc, funcName, funcType); - llvm::SmallVector noArgs; - builder.create(loc, funcOp, noArgs); + builder.create(loc); } // SYNCTHREADS_AND diff --git a/flang/test/Lower/CUDA/cuda-device-proc.cuf b/flang/test/Lower/CUDA/cuda-device-proc.cuf index 8f5e6dd36da4e..42ee7657966e2 100644 --- a/flang/test/Lower/CUDA/cuda-device-proc.cuf +++ b/flang/test/Lower/CUDA/cuda-device-proc.cuf @@ -49,7 +49,7 @@ attributes(global) subroutine devsub() end ! CHECK-LABEL: func.func @_QPdevsub() attributes {cuf.proc_attr = #cuf.cuda_proc} -! CHECK: fir.call @llvm.nvvm.barrier0() fastmath : () -> () +! CHECK: nvvm.barrier0 ! CHECK: fir.call @llvm.nvvm.bar.warp.sync(%c1{{.*}}) fastmath : (i32) -> () ! CHECK: fir.call @llvm.nvvm.membar.gl() fastmath : () -> () ! CHECK: fir.call @llvm.nvvm.membar.cta() fastmath : () -> () @@ -106,7 +106,7 @@ end ! CHECK-LABEL: func.func @_QPhost1() ! CHECK: cuf.kernel -! CHECK: fir.call @llvm.nvvm.barrier0() fastmath : () -> () +! CHECK: nvvm.barrier0 ! 
CHECK: fir.call @llvm.nvvm.bar.warp.sync(%c1{{.*}}) fastmath : (i32) -> () ! CHECK: fir.call @llvm.nvvm.barrier0.and(%c1{{.*}}) fastmath : (i32) -> i32 ! CHECK: fir.call @llvm.nvvm.barrier0.popc(%c1{{.*}}) fastmath : (i32) -> i32 ``````````
https://github.com/llvm/llvm-project/pull/140947 From flang-commits at lists.llvm.org Wed May 21 11:49:37 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Wed, 21 May 2025 11:49:37 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix OOB access for derived type mapping (PR #140948) Message-ID: https://github.com/TIFitis created https://github.com/llvm/llvm-project/pull/140948 This patch fixes unintentional OOB acess when mapping members of derived type. When `currentIndicesIdx` is `maxSize - 1`, i.e, it is the last member mapping, we can break from the loop. This was already the intended behaviour but using `continue` instead of `break` resulted in OOB access of `indices`. This fixes the following test case: ``` module mod implicit none type :: mattype real(4), pointer :: array(:, :, :) integer(4) :: scalar end type type :: data type(mattype) :: memb end type contains subroutine us_gpumem(dat) implicit none type(data), pointer :: dat !$omp target enter data map(to:dat%memb) end subroutine us_gpumem end module mod ``` >From 6b5a1bed7d0ce18861ee9b85fbefb03b3d98e175 Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Wed, 21 May 2025 19:42:31 +0100 Subject: [PATCH] [OpenMP][Flang] Fix OOB access for derived type mapping This patch fixes unintentional OOB acess when mapping members of derived type. --- flang/lib/Lower/OpenMP/Utils.cpp | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/flang/lib/Lower/OpenMP/Utils.cpp b/flang/lib/Lower/OpenMP/Utils.cpp index 173dceb07b193..711d4af287691 100644 --- a/flang/lib/Lower/OpenMP/Utils.cpp +++ b/flang/lib/Lower/OpenMP/Utils.cpp @@ -362,16 +362,18 @@ mlir::Value createParentSymAndGenIntermediateMaps( clauseLocation, firOpBuilder.getRefType(memberTy), curValue, llvm::SmallVector{idxConst}); - // Skip mapping and the subsequent load if we're the final member or not - // a type with a descriptor such as a pointer/allocatable. 
If we're a - // final member, the map will be generated by the processMap call that - // invoked this function, and if we're not a type with a descriptor then - // we have no need of generating an intermediate map for it, as we only - // need to generate a map if a member is a descriptor type (and thus - // obscures the members it contains via a pointer in which it's data needs - // mapped) - if ((currentIndicesIdx == indices.size() - 1) || - !fir::isTypeWithDescriptor(memberTy)) { + // If we're a final member, the map will be generated by the processMap + // call that invoked this function. + if (currentIndicesIdx == indices.size() - 1) + break; + + // Skip mapping and the subsequent load if we're not + // a type with a descriptor such as a pointer/allocatable. If we're not a + // type with a descriptor then we have no need of generating an + // intermediate map for it, as we only need to generate a map if a member + // is a descriptor type (and thus obscures the members it contains via a + // pointer in which it's data needs mapped). + if (!fir::isTypeWithDescriptor(memberTy)) { currentIndicesIdx++; continue; } From flang-commits at lists.llvm.org Wed May 21 11:50:39 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 11:50:39 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix OOB access for derived type mapping (PR #140948) In-Reply-To: Message-ID: <682e207f.630a0220.2a9171.2244@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Akash Banerjee (TIFitis)
Changes This patch fixes unintentional OOB access when mapping members of derived type. When `currentIndicesIdx` is `maxSize - 1`, i.e., it is the last member mapping, we can break from the loop. This was already the intended behaviour but using `continue` instead of `break` resulted in OOB access of `indices`. This fixes the following test case: ``` module mod implicit none type :: mattype real(4), pointer :: array(:, :, :) integer(4) :: scalar end type type :: data type(mattype) :: memb end type contains subroutine us_gpumem(dat) implicit none type(data), pointer :: dat !$omp target enter data map(to:dat%memb) end subroutine us_gpumem end module mod ``` --- Full diff: https://github.com/llvm/llvm-project/pull/140948.diff 1 Files Affected: - (modified) flang/lib/Lower/OpenMP/Utils.cpp (+12-10) ``````````diff diff --git a/flang/lib/Lower/OpenMP/Utils.cpp b/flang/lib/Lower/OpenMP/Utils.cpp index 173dceb07b193..711d4af287691 100644 --- a/flang/lib/Lower/OpenMP/Utils.cpp +++ b/flang/lib/Lower/OpenMP/Utils.cpp @@ -362,16 +362,18 @@ mlir::Value createParentSymAndGenIntermediateMaps( clauseLocation, firOpBuilder.getRefType(memberTy), curValue, llvm::SmallVector{idxConst}); - // Skip mapping and the subsequent load if we're the final member or not - // a type with a descriptor such as a pointer/allocatable. If we're a - // final member, the map will be generated by the processMap call that - // invoked this function, and if we're not a type with a descriptor then - // we have no need of generating an intermediate map for it, as we only - // need to generate a map if a member is a descriptor type (and thus - // obscures the members it contains via a pointer in which it's data needs - // mapped) - if ((currentIndicesIdx == indices.size() - 1) || - !fir::isTypeWithDescriptor(memberTy)) { + // If we're a final member, the map will be generated by the processMap + // call that invoked this function. 
+ if (currentIndicesIdx == indices.size() - 1) + break; + + // Skip mapping and the subsequent load if we're not + // a type with a descriptor such as a pointer/allocatable. If we're not a + // type with a descriptor then we have no need of generating an + // intermediate map for it, as we only need to generate a map if a member + // is a descriptor type (and thus obscures the members it contains via a + // pointer in which it's data needs mapped). + if (!fir::isTypeWithDescriptor(memberTy)) { currentIndicesIdx++; continue; } ``````````
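The `continue` versus `break` distinction described in this PR can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code in `flang/lib/Lower/OpenMP/Utils.cpp`: the function `countOutOfBoundsReads` and its parameters `steps` and `breakOnFinal` are hypothetical, and the model assumes the loop bound is independent of the running index into `indices`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified, hypothetical model of the loop described above; it is not the
// actual flang code. It assumes the loop bound (`steps`) is independent of
// the running index into `indices`.
// Returns how many iterations would have read past the end of `indices`.
std::size_t countOutOfBoundsReads(const std::vector<int> &indices,
                                  std::size_t steps, bool breakOnFinal) {
  std::size_t idx = 0;
  std::size_t outOfBounds = 0;
  for (std::size_t step = 0; step < steps; ++step) {
    if (idx >= indices.size()) {
      // Stands in for the invalid indices[idx] read; counted, not performed.
      ++outOfBounds;
      continue;
    }
    (void)indices[idx]; // in-bounds read of the current member index
    if (idx == indices.size() - 1) {
      if (breakOnFinal)
        break;  // fixed behaviour: stop once the final member is reached
      ++idx;    // old behaviour: advance past the end ...
      continue; // ... and keep looping
    }
    ++idx;
  }
  return outOfBounds;
}
```

In this model, two indices and three loop iterations give one out-of-bounds read with `continue` and none with `break`.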
https://github.com/llvm/llvm-project/pull/140948 From flang-commits at lists.llvm.org Wed May 21 11:59:33 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 21 May 2025 11:59:33 -0700 (PDT) Subject: [flang-commits] [flang] implicitly set device attribute for variables have VALUE attribute in device routine (PR #140952) Message-ID: https://github.com/wangzpgi created https://github.com/llvm/llvm-project/pull/140952 For variables that have VALUE attribute inside device routines, implicitly set DEVICE attribute. >From 55913c856b56967382e1013e8d8b62f9f4c4e6ad Mon Sep 17 00:00:00 2001 From: Zhen Wang Date: Wed, 21 May 2025 11:57:54 -0700 Subject: [PATCH] implicitly set device attribute for variables have VALUE attribute inside device routine --- flang/lib/Semantics/resolve-names.cpp | 3 +-- flang/test/Semantics/cuf21.cuf | 14 +++++++++----- flang/test/Semantics/modfile55.cuf | 1 + 3 files changed, 11 insertions(+), 7 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 3f4a06444c4f3..f7c6a948375e4 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -9375,8 +9375,7 @@ void ResolveNamesVisitor::CreateGeneric(const parser::GenericSpec &x) { static void SetImplicitCUDADevice(bool inDeviceSubprogram, Symbol &symbol) { if (inDeviceSubprogram && symbol.has()) { auto *object{symbol.detailsIf()}; - if (!object->cudaDataAttr() && !IsValue(symbol) && - !IsFunctionResult(symbol)) { + if (!object->cudaDataAttr() && !IsFunctionResult(symbol)) { // Implicitly set device attribute if none is set in device context. object->set_cudaDataAttr(common::CUDADataAttr::Device); } diff --git a/flang/test/Semantics/cuf21.cuf b/flang/test/Semantics/cuf21.cuf index b8b99a8d1d9be..4251493c52e65 100644 --- a/flang/test/Semantics/cuf21.cuf +++ b/flang/test/Semantics/cuf21.cuf @@ -1,5 +1,6 @@ ! RUN: %python %S/test_errors.py %s %flang_fc1 -! 
Test generic matching with scalars argument without device attr +! Test generic matching with scalars argument and argument +! with VALUE attribute without DEVICE attr inside device routine module mlocModule interface maxlocUpdate @@ -9,19 +10,22 @@ module mlocModule end interface maxlocUpdate contains - attributes(global) subroutine maxlocPartialMaskR_32F1D() + attributes(global) subroutine maxlocPartialMaskR_32F1D(back) implicit none + logical, intent(in), value :: back real(4) :: mval - call maxlocUpdate(mval) + call maxlocUpdate(mval, back) end subroutine maxlocPartialMaskR_32F1D - attributes(device) subroutine maxlocUpdateR_32F(mval) + attributes(device) subroutine maxlocUpdateR_32F(mval, back) real(4) :: mval + logical :: back end subroutine maxlocUpdateR_32F - attributes(device) subroutine maxlocUpdateR_64F(mval) + attributes(device) subroutine maxlocUpdateR_64F(mval, back) real(8) :: mval + logical :: back end subroutine maxlocUpdateR_64F end module diff --git a/flang/test/Semantics/modfile55.cuf b/flang/test/Semantics/modfile55.cuf index 2338b745d8355..abe6c30fa0f67 100644 --- a/flang/test/Semantics/modfile55.cuf +++ b/flang/test/Semantics/modfile55.cuf @@ -33,6 +33,7 @@ end !contains !attributes(global) subroutine globsub(x,y,z) !real(4),value::x +!attributes(device)x !real(4)::y !attributes(device) y !real(4)::z From flang-commits at lists.llvm.org Wed May 21 12:00:06 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 12:00:06 -0700 (PDT) Subject: [flang-commits] [flang] implicitly set device attribute for variables have VALUE attribute in device routine (PR #140952) In-Reply-To: Message-ID: <682e22b6.170a0220.366193.2e24@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Zhen Wang (wangzpgi)
Changes For variables that have VALUE attribute inside device routines, implicitly set DEVICE attribute. --- Full diff: https://github.com/llvm/llvm-project/pull/140952.diff 3 Files Affected: - (modified) flang/lib/Semantics/resolve-names.cpp (+1-2) - (modified) flang/test/Semantics/cuf21.cuf (+9-5) - (modified) flang/test/Semantics/modfile55.cuf (+1) ``````````diff diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 3f4a06444c4f3..f7c6a948375e4 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -9375,8 +9375,7 @@ void ResolveNamesVisitor::CreateGeneric(const parser::GenericSpec &x) { static void SetImplicitCUDADevice(bool inDeviceSubprogram, Symbol &symbol) { if (inDeviceSubprogram && symbol.has()) { auto *object{symbol.detailsIf()}; - if (!object->cudaDataAttr() && !IsValue(symbol) && - !IsFunctionResult(symbol)) { + if (!object->cudaDataAttr() && !IsFunctionResult(symbol)) { // Implicitly set device attribute if none is set in device context. object->set_cudaDataAttr(common::CUDADataAttr::Device); } diff --git a/flang/test/Semantics/cuf21.cuf b/flang/test/Semantics/cuf21.cuf index b8b99a8d1d9be..4251493c52e65 100644 --- a/flang/test/Semantics/cuf21.cuf +++ b/flang/test/Semantics/cuf21.cuf @@ -1,5 +1,6 @@ ! RUN: %python %S/test_errors.py %s %flang_fc1 -! Test generic matching with scalars argument without device attr +! Test generic matching with scalars argument and argument +! 
with VALUE attribute without DEVICE attr inside device routine module mlocModule interface maxlocUpdate @@ -9,19 +10,22 @@ module mlocModule end interface maxlocUpdate contains - attributes(global) subroutine maxlocPartialMaskR_32F1D() + attributes(global) subroutine maxlocPartialMaskR_32F1D(back) implicit none + logical, intent(in), value :: back real(4) :: mval - call maxlocUpdate(mval) + call maxlocUpdate(mval, back) end subroutine maxlocPartialMaskR_32F1D - attributes(device) subroutine maxlocUpdateR_32F(mval) + attributes(device) subroutine maxlocUpdateR_32F(mval, back) real(4) :: mval + logical :: back end subroutine maxlocUpdateR_32F - attributes(device) subroutine maxlocUpdateR_64F(mval) + attributes(device) subroutine maxlocUpdateR_64F(mval, back) real(8) :: mval + logical :: back end subroutine maxlocUpdateR_64F end module diff --git a/flang/test/Semantics/modfile55.cuf b/flang/test/Semantics/modfile55.cuf index 2338b745d8355..abe6c30fa0f67 100644 --- a/flang/test/Semantics/modfile55.cuf +++ b/flang/test/Semantics/modfile55.cuf @@ -33,6 +33,7 @@ end !contains !attributes(global) subroutine globsub(x,y,z) !real(4),value::x +!attributes(device)x !real(4)::y !attributes(device) y !real(4)::z ``````````
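The successive conditions in `SetImplicitCUDADevice` shown in the diffs of this thread can be compared side by side with a small standalone model. This is a sketch only: `SymbolModel` and the three predicate functions are hypothetical stand-ins for flang's `Symbol` queries (`IsValue`, `IsDummy`, `IsFunctionResult`, `IsArray`, `cudaDataAttr`), and the model assumes the symbol is already an object entity inside a device subprogram.

```cpp
#include <cassert>

// Hypothetical flattened view of the symbol properties the condition reads.
struct SymbolModel {
  bool hasCudaDataAttr;  // an explicit CUDA data attribute is already set
  bool isValue;          // has the VALUE attribute
  bool isFunctionResult; // is a function result variable
  bool isDummy;          // is a dummy argument
  bool isArray;          // is an array
};

// Condition on the removed line of the first resolve-names.cpp diff.
bool implicitDeviceBefore(const SymbolModel &s) {
  return !s.hasCudaDataAttr && !s.isValue && (s.isDummy || s.isArray);
}

// Condition after that diff: scalars in device routines are included.
bool implicitDeviceAfterScalarChange(const SymbolModel &s) {
  return !s.hasCudaDataAttr && !s.isValue && !s.isFunctionResult;
}

// Condition after PR #140952: VALUE variables are included as well.
bool implicitDeviceAfterValueChange(const SymbolModel &s) {
  return !s.hasCudaDataAttr && !s.isFunctionResult;
}

// Example symbols (illustrative only).
const SymbolModel kLocalScalar{false, false, false, false, false};
const SymbolModel kValueDummy{false, true, false, true, false};
const SymbolModel kFunctionResult{false, false, true, false, false};
```

Under this model, a local scalar such as `mval` only gets the implicit DEVICE attribute after the scalar change, a VALUE dummy such as `back` only after PR #140952, and a function result never does.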
https://github.com/llvm/llvm-project/pull/140952 From flang-commits at lists.llvm.org Wed May 21 12:00:33 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Wed, 21 May 2025 12:00:33 -0700 (PDT) Subject: [flang-commits] [flang] [flang] [cuda] implicitly set device attribute for variables have VALUE attribute in device routine (PR #140952) In-Reply-To: Message-ID: <682e22d1.630a0220.359ad9.06d9@mx.google.com> https://github.com/wangzpgi edited https://github.com/llvm/llvm-project/pull/140952 From flang-commits at lists.llvm.org Wed May 21 13:05:14 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Wed, 21 May 2025 13:05:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Use NVVM op for barrier0 intrinsic (PR #140947) In-Reply-To: Message-ID: <682e31fa.630a0220.173c0.2018@mx.google.com> clementval wrote: > I should mention that in non strictly CUF flows (OpenAcc), I am considering using gpu.barrier as it can target different back-ends. `syncthreads` is a cudadevice procedure so I don't think it needs to be portable. https://github.com/llvm/llvm-project/pull/140947 From flang-commits at lists.llvm.org Wed May 21 13:05:17 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 13:05:17 -0700 (PDT) Subject: [flang-commits] [flang] 89d9a83 - [flang][cuda] Use NVVM op for barrier0 intrinsic (#140947) Message-ID: <682e31fd.170a0220.2fa3df.3769@mx.google.com> Author: Valentin Clement (バレンタイン クレメン) Date: 2025-05-21T13:05:14-07:00 New Revision: 89d9a83b704a8f6b5bd64dac93095a9228c601d5 URL: https://github.com/llvm/llvm-project/commit/89d9a83b704a8f6b5bd64dac93095a9228c601d5 DIFF: https://github.com/llvm/llvm-project/commit/89d9a83b704a8f6b5bd64dac93095a9228c601d5.diff LOG: [flang][cuda] Use NVVM op for barrier0 intrinsic (#140947) The simple form of `Barrier0Op` is available in the NVVM dialect. 
It is needed to use it instead of the string version since https://github.com/llvm/llvm-project/pull/140615 Added: Modified: flang/lib/Optimizer/Builder/IntrinsicCall.cpp flang/test/Lower/CUDA/cuda-device-proc.cuf Removed: ################################################################################ diff --git a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp index 1ac0627da9524..178b6770d6b53 100644 --- a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp +++ b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp @@ -8332,12 +8332,7 @@ IntrinsicLibrary::genSum(mlir::Type resultType, // SYNCTHREADS void IntrinsicLibrary::genSyncThreads(llvm::ArrayRef args) { - constexpr llvm::StringLiteral funcName = "llvm.nvvm.barrier0"; - mlir::FunctionType funcType = - mlir::FunctionType::get(builder.getContext(), {}, {}); - auto funcOp = builder.createFunction(loc, funcName, funcType); - llvm::SmallVector noArgs; - builder.create(loc, funcOp, noArgs); + builder.create(loc); } // SYNCTHREADS_AND diff --git a/flang/test/Lower/CUDA/cuda-device-proc.cuf b/flang/test/Lower/CUDA/cuda-device-proc.cuf index 8f5e6dd36da4e..42ee7657966e2 100644 --- a/flang/test/Lower/CUDA/cuda-device-proc.cuf +++ b/flang/test/Lower/CUDA/cuda-device-proc.cuf @@ -49,7 +49,7 @@ attributes(global) subroutine devsub() end ! CHECK-LABEL: func.func @_QPdevsub() attributes {cuf.proc_attr = #cuf.cuda_proc} -! CHECK: fir.call @llvm.nvvm.barrier0() fastmath : () -> () +! CHECK: nvvm.barrier0 ! CHECK: fir.call @llvm.nvvm.bar.warp.sync(%c1{{.*}}) fastmath : (i32) -> () ! CHECK: fir.call @llvm.nvvm.membar.gl() fastmath : () -> () ! CHECK: fir.call @llvm.nvvm.membar.cta() fastmath : () -> () @@ -106,7 +106,7 @@ end ! CHECK-LABEL: func.func @_QPhost1() ! CHECK: cuf.kernel -! CHECK: fir.call @llvm.nvvm.barrier0() fastmath : () -> () +! CHECK: nvvm.barrier0 ! CHECK: fir.call @llvm.nvvm.bar.warp.sync(%c1{{.*}}) fastmath : (i32) -> () ! 
CHECK: fir.call @llvm.nvvm.barrier0.and(%c1{{.*}}) fastmath : (i32) -> i32 ! CHECK: fir.call @llvm.nvvm.barrier0.popc(%c1{{.*}}) fastmath : (i32) -> i32 From flang-commits at lists.llvm.org Wed May 21 13:05:20 2025 From: flang-commits at lists.llvm.org (Valentin Clement =?utf-8?b?44OQ44Os44Oz44K/44Kk44OzIOOCr+ODrOODoeODsw==?= via flang-commits) Date: Wed, 21 May 2025 13:05:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Use NVVM op for barrier0 intrinsic (PR #140947) In-Reply-To: Message-ID: <682e3200.170a0220.106bb5.be54@mx.google.com> https://github.com/clementval closed https://github.com/llvm/llvm-project/pull/140947 From flang-commits at lists.llvm.org Wed May 21 15:53:23 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 15:53:23 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix OOB access for derived type mapping (PR #140948) In-Reply-To: Message-ID: <682e5963.170a0220.65625.3aa0@mx.google.com> https://github.com/agozillon approved this pull request. LGTM, thank you for the catch/fix! 
:-) https://github.com/llvm/llvm-project/pull/140948 From flang-commits at lists.llvm.org Wed May 21 17:34:44 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 21 May 2025 17:34:44 -0700 (PDT) Subject: [flang-commits] [flang] fbb11b4 - [OpenMP][Flang] Fix OOB access for derived type mapping (#140948) Message-ID: <682e7124.170a0220.245a68.4432@mx.google.com> Author: Akash Banerjee Date: 2025-05-22T01:34:40+01:00 New Revision: fbb11b4c4e97c05623cfa624fe4c423587685cf3 URL: https://github.com/llvm/llvm-project/commit/fbb11b4c4e97c05623cfa624fe4c423587685cf3 DIFF: https://github.com/llvm/llvm-project/commit/fbb11b4c4e97c05623cfa624fe4c423587685cf3.diff LOG: [OpenMP][Flang] Fix OOB access for derived type mapping (#140948) Added: Modified: flang/lib/Lower/OpenMP/Utils.cpp Removed: ################################################################################ diff --git a/flang/lib/Lower/OpenMP/Utils.cpp b/flang/lib/Lower/OpenMP/Utils.cpp index 173dceb07b193..711d4af287691 100644 --- a/flang/lib/Lower/OpenMP/Utils.cpp +++ b/flang/lib/Lower/OpenMP/Utils.cpp @@ -362,16 +362,18 @@ mlir::Value createParentSymAndGenIntermediateMaps( clauseLocation, firOpBuilder.getRefType(memberTy), curValue, llvm::SmallVector{idxConst}); - // Skip mapping and the subsequent load if we're the final member or not - // a type with a descriptor such as a pointer/allocatable. If we're a - // final member, the map will be generated by the processMap call that - // invoked this function, and if we're not a type with a descriptor then - // we have no need of generating an intermediate map for it, as we only - // need to generate a map if a member is a descriptor type (and thus - // obscures the members it contains via a pointer in which it's data needs - // mapped) - if ((currentIndicesIdx == indices.size() - 1) || - !fir::isTypeWithDescriptor(memberTy)) { + // If we're a final member, the map will be generated by the processMap + // call that invoked this function. 
+ if (currentIndicesIdx == indices.size() - 1) + break; + + // Skip mapping and the subsequent load if we're not + // a type with a descriptor such as a pointer/allocatable. If we're not a + // type with a descriptor then we have no need of generating an + // intermediate map for it, as we only need to generate a map if a member + // is a descriptor type (and thus obscures the members it contains via a + // pointer in which it's data needs mapped). + if (!fir::isTypeWithDescriptor(memberTy)) { currentIndicesIdx++; continue; } From flang-commits at lists.llvm.org Wed May 21 17:34:46 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Wed, 21 May 2025 17:34:46 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix OOB access for derived type mapping (PR #140948) In-Reply-To: Message-ID: <682e7126.170a0220.3da756.4542@mx.google.com> https://github.com/TIFitis closed https://github.com/llvm/llvm-project/pull/140948 From flang-commits at lists.llvm.org Wed May 21 17:49:01 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Wed, 21 May 2025 17:49:01 -0700 (PDT) Subject: [flang-commits] [flang] [flang] optionally add lifetime markers to alloca created in stack-arrays (PR #140901) In-Reply-To: Message-ID: <682e747d.170a0220.24e6b7.42d0@mx.google.com> https://github.com/vzakhari approved this pull request. Looks great! 
https://github.com/llvm/llvm-project/pull/140901 From flang-commits at lists.llvm.org Wed May 21 17:53:08 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Wed, 21 May 2025 17:53:08 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix OOB access for derived type mapping (PR #140948) In-Reply-To: Message-ID: <682e7574.170a0220.22a28a.cdb1@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `openmp-offload-sles-build-only` running on `rocm-worker-hw-04-sles` while building `flang` at step 10 "Add check check-lld". Full details are available at: https://lab.llvm.org/buildbot/#/builders/140/builds/23576
Here is the relevant piece of the build log for the reference ``` Step 10 (Add check check-lld) failure: test (failure) ******************** TEST 'lld :: COFF/lto-cache-errors.ll' FAILED ******************** Exit Code: 1 Command Output (stderr): -- /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/opt -module-hash -module-summary /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/lld/test/COFF/lto-cache-errors.ll -o /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.o # RUN: at line 5 + /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/opt -module-hash -module-summary /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/lld/test/COFF/lto-cache-errors.ll -o /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.o /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/opt -module-hash -module-summary /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/lld/test/COFF/Inputs/lto-cache.ll -o /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp2.o # RUN: at line 6 + /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/opt -module-hash -module-summary /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/lld/test/COFF/Inputs/lto-cache.ll -o /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp2.o rm -Rf /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache && mkdir /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache # RUN: at line 7 + rm -Rf /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache + mkdir 
/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache chmod 444 /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache # RUN: at line 8 + chmod 444 /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache not /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/lld-link /lldltocache:/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache/nonexistant/ /out:/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp3 /entry:main /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp2.o /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.o 2>&1 | /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/FileCheck /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/lld/test/COFF/lto-cache-errors.ll # RUN: at line 11 + /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/FileCheck /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.src/lld/test/COFF/lto-cache-errors.ll + not /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/bin/lld-link /lldltocache:/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.cache/nonexistant/ /out:/home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp3 /entry:main /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp2.o /home/botworker/bbot/builds/openmp-offload-sles-build/llvm.build/tools/lld/test/COFF/Output/lto-cache-errors.ll.tmp.o -- 
******************** ```
https://github.com/llvm/llvm-project/pull/140948 From flang-commits at lists.llvm.org Wed May 21 18:12:34 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Wed, 21 May 2025 18:12:34 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <682e7a02.170a0220.27ce0e.3bea@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 From flang-commits at lists.llvm.org Wed May 21 19:48:29 2025 From: flang-commits at lists.llvm.org (Petr Hosek via flang-commits) Date: Wed, 21 May 2025 19:48:29 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [lld] [lldb] [llvm] [mlir] [polly] [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS in standalone builds (PR #138587) In-Reply-To: Message-ID: <682e907d.a70a0220.37cc85.04b8@mx.google.com> https://github.com/petrhosek approved this pull request. https://github.com/llvm/llvm-project/pull/138587 From flang-commits at lists.llvm.org Wed May 21 20:52:05 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Wed, 21 May 2025 20:52:05 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <682e9f65.170a0220.7f710.6f20@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 >From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:48:48 -0400 Subject: [PATCH 1/4] [flang] Ensure that the integer for Cray pointer is sized correctly The integer used for Cray pointers should have the size equivalent to platform's pointer size. 
modified: flang/include/flang/Evaluate/target.h modified: flang/include/flang/Tools/TargetSetup.h modified: flang/lib/Semantics/resolve-names.cpp --- flang/include/flang/Evaluate/target.h | 6 ++++++ flang/include/flang/Tools/TargetSetup.h | 3 +++ flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..281e42d5285fb 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t pointerSize() { return pointerSize_; } + void set_pointerSize(std::size_t pointerSize) { + pointerSize_ = pointerSize; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. 
} diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 30477b842ae64fb1c94f520136956cc4685648a7 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:59:20 -0400 Subject: [PATCH 2/4] clang-format --- flang/include/flang/Evaluate/target.h | 4 +--- flang/include/flang/Tools/TargetSetup.h | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d5285fb..e8b9fedc38f48 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ class TargetCharacteristics { const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e21fb4..24ab65f740ec6 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + 
targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. >From e1f91c83680d72fc1463a8db97a77298141286d9 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:41:48 -0400 Subject: [PATCH 3/4] Renaming to integerKindForPointer --- flang/include/flang/Evaluate/target.h | 8 +++++--- flang/include/flang/Tools/TargetSetup.h | 3 ++- flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index e8b9fedc38f48..7b38db2db1956 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,8 +131,10 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } - std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } + std::size_t integerKindForPointer() { return integerKindForPointer_; } + void set_integerKindForPointer(std::size_t newKind) { + integerKindForPointer_ = newKind; + } private: static constexpr int maxKind{common::maxKind}; @@ -159,7 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; - std::size_t pointerSize_{8 /* bytes */}; + std::size_t integerKindForPointer_{8}; /* For 64 bit pointer */ }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index 24ab65f740ec6..002e82aa72484 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,7 +94,8 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); - 
targetCharacteristics.set_pointerSize( + // Currently the integer kind happens to be the same as the byte size + targetCharacteristics.set_integerKindForPointer( targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 88114bfe84af3..5fd5ea8f9bc5c 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; + TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 940ff38d0abba9d1b0ab415e42474629284962d8 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:51:53 -0400 Subject: [PATCH 4/4] clang-format --- flang/lib/Semantics/resolve-names.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 5fd5ea8f9bc5c..a8dbf61c8fd68 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6689,8 +6689,8 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { "'%s' cannot be a Cray pointer as it is already a Cray pointee"_err_en_US); } pointer->set(Symbol::Flag::CrayPointer); - const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; + const DeclTypeSpec &pointerType{MakeNumericType(TypeCategory::Integer, + context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); From flang-commits at 
lists.llvm.org Thu May 22 00:24:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 22 May 2025 00:24:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang] optionally add lifetime markers to alloca created in stack-arrays (PR #140901) In-Reply-To: Message-ID: <682ed125.170a0220.1bdfbf.dad9@mx.google.com> https://github.com/jeanPerier updated https://github.com/llvm/llvm-project/pull/140901 >From 44b22f6d0fa2f93165ce64e1a34e63601d7a8bb1 Mon Sep 17 00:00:00 2001 From: Jean Perier Date: Tue, 20 May 2025 05:40:06 -0700 Subject: [PATCH 1/2] [flang] add lifetime markers to alloca created in stack-arrays --- .../flang/Optimizer/Builder/FIRBuilder.h | 14 ++- .../flang/Optimizer/Dialect/FIROpsSupport.h | 12 +++ .../flang/Optimizer/Transforms/Passes.td | 4 +- flang/lib/Optimizer/Builder/FIRBuilder.cpp | 20 +++- flang/lib/Optimizer/Dialect/FIROps.cpp | 13 +++ .../lib/Optimizer/Transforms/StackArrays.cpp | 100 ++++++++++++++---- .../test/Transforms/stack-arrays-lifetime.fir | 96 +++++++++++++++++ 7 files changed, 234 insertions(+), 25 deletions(-) create mode 100644 flang/test/Transforms/stack-arrays-lifetime.fir diff --git a/flang/include/flang/Optimizer/Builder/FIRBuilder.h b/flang/include/flang/Optimizer/Builder/FIRBuilder.h index 5309ea2c0fc09..9382d77a8d67b 100644 --- a/flang/include/flang/Optimizer/Builder/FIRBuilder.h +++ b/flang/include/flang/Optimizer/Builder/FIRBuilder.h @@ -879,7 +879,7 @@ llvm::SmallVector elideLengthsAlreadyInType(mlir::Type type, mlir::ValueRange lenParams); /// Get the address space which should be used for allocas -uint64_t getAllocaAddressSpace(mlir::DataLayout *dataLayout); +uint64_t getAllocaAddressSpace(const mlir::DataLayout *dataLayout); /// The two vectors of MLIR values have the following property: /// \p extents1[i] must have the same value as \p extents2[i] @@ -913,6 +913,18 @@ void genDimInfoFromBox(fir::FirOpBuilder &builder, mlir::Location loc, llvm::SmallVectorImpl *extents, llvm::SmallVectorImpl 
*strides); +/// Generate an LLVM dialect lifetime start marker at the current insertion +/// point given an fir.alloca and its constant size in bytes. Returns the value +/// to be passed to the lifetime end marker. +mlir::Value genLifetimeStart(mlir::OpBuilder &builder, mlir::Location loc, + fir::AllocaOp alloc, int64_t size, + const mlir::DataLayout *dl); + +/// Generate an LLVM dialect lifetime end marker at the current insertion point +/// given an llvm.ptr value and the constant size in bytes of its storage. +void genLifetimeEnd(mlir::OpBuilder &builder, mlir::Location loc, + mlir::Value mem, int64_t size); + } // namespace fir::factory #endif // FORTRAN_OPTIMIZER_BUILDER_FIRBUILDER_H diff --git a/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h b/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h index e71a622725bf4..0a2337be7455e 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h +++ b/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h @@ -125,6 +125,12 @@ static constexpr llvm::StringRef getInternalFuncNameAttrName() { return "fir.internal_name"; } +/// Attribute to mark alloca that have been given a lifetime marker so that +/// later pass do not try adding new ones. +static constexpr llvm::StringRef getHasLifetimeMarkerAttrName() { + return "fir.has_lifetime"; +} + /// Does the function, \p func, have a host-associations tuple argument? /// Some internal procedures may have access to host procedure variables. bool hasHostAssociationArgument(mlir::func::FuncOp func); @@ -221,6 +227,12 @@ inline bool hasBindcAttr(mlir::Operation *op) { return hasProcedureAttr(op); } +/// Get the allocation size of a given alloca if it has compile time constant +/// size. +std::optional getAllocaByteSize(fir::AllocaOp alloca, + const mlir::DataLayout &dl, + const fir::KindMapping &kindMap); + /// Return true, if \p rebox operation keeps the input array /// continuous if it is initially continuous. 
/// When \p checkWhole is false, then the checking is only done diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index c0d88a8e19f80..b251534e1a8f6 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -285,7 +285,9 @@ def StackArrays : Pass<"stack-arrays", "mlir::func::FuncOp"> { Convert heap allocations for arrays, even those of unknown size, into stack allocations. }]; - let dependentDialects = [ "fir::FIROpsDialect" ]; + let dependentDialects = [ + "fir::FIROpsDialect", "mlir::DLTIDialect", "mlir::LLVM::LLVMDialect" + ]; } def StackReclaim : Pass<"stack-reclaim"> { diff --git a/flang/lib/Optimizer/Builder/FIRBuilder.cpp b/flang/lib/Optimizer/Builder/FIRBuilder.cpp index 86166db355f72..68a1cc7a3aee6 100644 --- a/flang/lib/Optimizer/Builder/FIRBuilder.cpp +++ b/flang/lib/Optimizer/Builder/FIRBuilder.cpp @@ -1868,7 +1868,8 @@ void fir::factory::setInternalLinkage(mlir::func::FuncOp func) { func->setAttr("llvm.linkage", linkage); } -uint64_t fir::factory::getAllocaAddressSpace(mlir::DataLayout *dataLayout) { +uint64_t +fir::factory::getAllocaAddressSpace(const mlir::DataLayout *dataLayout) { if (dataLayout) if (mlir::Attribute addrSpace = dataLayout->getAllocaMemorySpace()) return mlir::cast(addrSpace).getUInt(); @@ -1940,3 +1941,20 @@ void fir::factory::genDimInfoFromBox( strides->push_back(dimInfo.getByteStride()); } } + +mlir::Value fir::factory::genLifetimeStart(mlir::OpBuilder &builder, + mlir::Location loc, + fir::AllocaOp alloc, int64_t size, + const mlir::DataLayout *dl) { + mlir::Type ptrTy = mlir::LLVM::LLVMPointerType::get( + alloc.getContext(), getAllocaAddressSpace(dl)); + mlir::Value cast = + builder.create(loc, ptrTy, alloc.getResult()); + builder.create(loc, size, cast); + return cast; +} + +void fir::factory::genLifetimeEnd(mlir::OpBuilder &builder, mlir::Location loc, + mlir::Value cast, int64_t size) { + 
builder.create(loc, size, cast); +} diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index e12af7782a578..cbe93907265f6 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -4804,6 +4804,19 @@ bool fir::reboxPreservesContinuity(fir::ReboxOp rebox, bool checkWhole) { return false; } +std::optional fir::getAllocaByteSize(fir::AllocaOp alloca, + const mlir::DataLayout &dl, + const fir::KindMapping &kindMap) { + mlir::Type type = alloca.getInType(); + // TODO: should use the constant operands when all info is not available in + // the type. + if (!alloca.isDynamic()) + if (auto sizeAndAlignment = + getTypeSizeAndAlignment(alloca.getLoc(), type, dl, kindMap)) + return sizeAndAlignment->first; + return std::nullopt; +} + //===----------------------------------------------------------------------===// // DeclareOp //===----------------------------------------------------------------------===// diff --git a/flang/lib/Optimizer/Transforms/StackArrays.cpp b/flang/lib/Optimizer/Transforms/StackArrays.cpp index f9b9b4f4ff385..b5671261c9a2b 100644 --- a/flang/lib/Optimizer/Transforms/StackArrays.cpp +++ b/flang/lib/Optimizer/Transforms/StackArrays.cpp @@ -13,12 +13,15 @@ #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/Dialect/Support/FIRContext.h" +#include "flang/Optimizer/Support/DataLayout.h" #include "flang/Optimizer/Transforms/Passes.h" #include "mlir/Analysis/DataFlow/ConstantPropagationAnalysis.h" #include "mlir/Analysis/DataFlow/DeadCodeAnalysis.h" #include "mlir/Analysis/DataFlow/DenseAnalysis.h" #include "mlir/Analysis/DataFlowFramework.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Diagnostics.h" @@ -48,6 +51,11 @@ static llvm::cl::opt 
maxAllocsPerFunc( "to 0 for no limit."), llvm::cl::init(1000), llvm::cl::Hidden); +static llvm::cl::opt emitLifetimeMarkers( + "stack-arrays-lifetime", + llvm::cl::desc("Add lifetime markers to generated constant size allocas"), + llvm::cl::init(false), llvm::cl::Hidden); + namespace { /// The state of an SSA value at each program point @@ -189,8 +197,11 @@ class AllocMemConversion : public mlir::OpRewritePattern { public: explicit AllocMemConversion( mlir::MLIRContext *ctx, - const StackArraysAnalysisWrapper::AllocMemMap &candidateOps) - : OpRewritePattern(ctx), candidateOps{candidateOps} {} + const StackArraysAnalysisWrapper::AllocMemMap &candidateOps, + std::optional &dl, + std::optional &kindMap) + : OpRewritePattern(ctx), candidateOps{candidateOps}, dl{dl}, + kindMap{kindMap} {} llvm::LogicalResult matchAndRewrite(fir::AllocMemOp allocmem, @@ -206,6 +217,9 @@ class AllocMemConversion : public mlir::OpRewritePattern { /// Handle to the DFA (already run) const StackArraysAnalysisWrapper::AllocMemMap &candidateOps; + const std::optional &dl; + const std::optional &kindMap; + /// If we failed to find an insertion point not inside a loop, see if it would /// be safe to use an llvm.stacksave/llvm.stackrestore inside the loop static InsertionPoint findAllocaLoopInsertionPoint( @@ -218,8 +232,12 @@ class AllocMemConversion : public mlir::OpRewritePattern { mlir::PatternRewriter &rewriter) const; /// Inserts a stacksave before oldAlloc and a stackrestore after each freemem - void insertStackSaveRestore(fir::AllocMemOp &oldAlloc, + void insertStackSaveRestore(fir::AllocMemOp oldAlloc, mlir::PatternRewriter &rewriter) const; + /// Emit lifetime markers for newAlloc between oldAlloc and each freemem. + /// If the allocation is dynamic, no life markers are emitted. 
+ void insertLifetimeMarkers(fir::AllocMemOp oldAlloc, fir::AllocaOp newAlloc, + mlir::PatternRewriter &rewriter) const; }; class StackArraysPass : public fir::impl::StackArraysBase { @@ -740,14 +758,34 @@ AllocMemConversion::insertAlloca(fir::AllocMemOp &oldAlloc, llvm::StringRef uniqName = unpackName(oldAlloc.getUniqName()); llvm::StringRef bindcName = unpackName(oldAlloc.getBindcName()); - return rewriter.create(loc, varTy, uniqName, bindcName, - oldAlloc.getTypeparams(), - oldAlloc.getShape()); + auto alloca = rewriter.create(loc, varTy, uniqName, bindcName, + oldAlloc.getTypeparams(), + oldAlloc.getShape()); + if (emitLifetimeMarkers) + insertLifetimeMarkers(oldAlloc, alloca, rewriter); + + return alloca; +} + +static void +visitFreeMemOp(fir::AllocMemOp oldAlloc, + const std::function &callBack) { + for (mlir::Operation *user : oldAlloc->getUsers()) { + if (auto declareOp = mlir::dyn_cast_if_present(user)) { + for (mlir::Operation *user : declareOp->getUsers()) { + if (mlir::isa(user)) + callBack(user); + } + } + + if (mlir::isa(user)) + callBack(user); + } } void AllocMemConversion::insertStackSaveRestore( - fir::AllocMemOp &oldAlloc, mlir::PatternRewriter &rewriter) const { - auto oldPoint = rewriter.saveInsertionPoint(); + fir::AllocMemOp oldAlloc, mlir::PatternRewriter &rewriter) const { + mlir::OpBuilder::InsertionGuard insertGuard(rewriter); auto mod = oldAlloc->getParentOfType(); fir::FirOpBuilder builder{rewriter, mod}; @@ -758,21 +796,30 @@ void AllocMemConversion::insertStackSaveRestore( builder.setInsertionPoint(user); builder.genStackRestore(user->getLoc(), sp); }; + visitFreeMemOp(oldAlloc, createStackRestoreCall); +} - for (mlir::Operation *user : oldAlloc->getUsers()) { - if (auto declareOp = mlir::dyn_cast_if_present(user)) { - for (mlir::Operation *user : declareOp->getUsers()) { - if (mlir::isa(user)) - createStackRestoreCall(user); - } - } - - if (mlir::isa(user)) { - createStackRestoreCall(user); - } +void 
AllocMemConversion::insertLifetimeMarkers( + fir::AllocMemOp oldAlloc, fir::AllocaOp newAlloc, + mlir::PatternRewriter &rewriter) const { + if (!dl || !kindMap) + return; + llvm::StringRef attrName = fir::getHasLifetimeMarkerAttrName(); + // Do not add lifetime markers, of the alloca already has any. + if (newAlloc->hasAttr(attrName)) + return; + if (std::optional size = + fir::getAllocaByteSize(newAlloc, *dl, *kindMap)) { + mlir::OpBuilder::InsertionGuard insertGuard(rewriter); + rewriter.setInsertionPoint(oldAlloc); + mlir::Value ptr = fir::factory::genLifetimeStart( + rewriter, newAlloc.getLoc(), newAlloc, *size, &*dl); + visitFreeMemOp(oldAlloc, [&](mlir::Operation *op) { + rewriter.setInsertionPoint(op); + fir::factory::genLifetimeEnd(rewriter, op->getLoc(), ptr, *size); + }); + newAlloc->setAttr(attrName, rewriter.getUnitAttr()); } - - rewriter.restoreInsertionPoint(oldPoint); } StackArraysPass::StackArraysPass(const StackArraysPass &pass) @@ -809,7 +856,16 @@ void StackArraysPass::runOnOperation() { config.setRegionSimplificationLevel( mlir::GreedySimplifyRegionLevel::Disabled); - patterns.insert(&context, *candidateOps); + auto module = func->getParentOfType(); + std::optional dl = + module ? fir::support::getOrSetMLIRDataLayout( + module, /*allowDefaultLayout=*/false) + : std::nullopt; + std::optional kindMap; + if (module) + kindMap = fir::getKindMapping(module); + + patterns.insert(&context, *candidateOps, dl, kindMap); if (mlir::failed(mlir::applyOpPatternsGreedily( opsToConvert, std::move(patterns), config))) { mlir::emitError(func->getLoc(), "error in stack arrays optimization\n"); diff --git a/flang/test/Transforms/stack-arrays-lifetime.fir b/flang/test/Transforms/stack-arrays-lifetime.fir new file mode 100644 index 0000000000000..5b2faeba132c3 --- /dev/null +++ b/flang/test/Transforms/stack-arrays-lifetime.fir @@ -0,0 +1,96 @@ +// Test insertion of llvm.lifetime for allocmem turn into alloca with constant size. 
+// RUN: fir-opt --stack-arrays -stack-arrays-lifetime %s | FileCheck %s + +module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"} { + +func.func @_QPcst_alloca(%arg0: !fir.ref> {fir.bindc_name = "x"}) { + %c1 = arith.constant 1 : index + %c100000 = arith.constant 100000 : index + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.shape %c100000 : (index) -> !fir.shape<1> + %2 = fir.declare %arg0(%1) dummy_scope %0 {uniq_name = "_QFcst_allocaEx"} : (!fir.ref>, !fir.shape<1>, !fir.dscope) -> !fir.ref> + %3 = fir.allocmem !fir.array<100000xf32> {bindc_name = ".tmp.array", uniq_name = ""} + %4 = fir.declare %3(%1) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg1 = %c1 to %c100000 step %c1 unordered { + %9 = fir.array_coor %2(%1) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %10 = fir.load %9 : !fir.ref + %11 = arith.addf %10, %10 fastmath : f32 + %12 = fir.array_coor %4(%1) %arg1 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %11 to %12 : !fir.ref + } + %5 = fir.convert %4 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%5) fastmath : (!fir.ref>) -> () + fir.freemem %4 : !fir.heap> + %6 = fir.allocmem !fir.array<100000xi32> {bindc_name = ".tmp.array", uniq_name = ""} + %7 = fir.declare %6(%1) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg1 = %c1 to %c100000 step %c1 unordered { + %9 = fir.array_coor %2(%1) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %10 = fir.load %9 : !fir.ref + %11 = fir.convert %10 : (f32) -> i32 + %12 = fir.array_coor %7(%1) %arg1 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %11 to %12 : !fir.ref + } + %8 = fir.convert %7 : (!fir.heap>) -> !fir.ref> + fir.call @_QPibar(%8) fastmath : (!fir.ref>) -> () + fir.freemem %7 : !fir.heap> + return +} +// CHECK-LABEL: func.func @_QPcst_alloca( +// CHECK-DAG: 
%[[VAL_0:.*]] = fir.alloca !fir.array<100000xf32> {bindc_name = ".tmp.array", fir.has_lifetime} +// CHECK-DAG: %[[VAL_2:.*]] = fir.alloca !fir.array<100000xi32> {bindc_name = ".tmp.array", fir.has_lifetime} +// CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_0]] : (!fir.ref>) -> !llvm.ptr +// CHECK: llvm.intr.lifetime.start 400000, %[[VAL_9]] : !llvm.ptr +// CHECK: fir.do_loop +// CHECK: fir.call @_QPbar( +// CHECK: llvm.intr.lifetime.end 400000, %[[VAL_9]] : !llvm.ptr +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_2]] : (!fir.ref>) -> !llvm.ptr +// CHECK: llvm.intr.lifetime.start 400000, %[[VAL_17]] : !llvm.ptr +// CHECK: fir.do_loop +// CHECK: fir.call @_QPibar( +// CHECK: llvm.intr.lifetime.end 400000, %[[VAL_17]] : !llvm.ptr + + +func.func @_QPdyn_alloca(%arg0: !fir.ref> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "n"}) { + %c1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.declare %arg1 dummy_scope %0 {uniq_name = "_QFdyn_allocaEn"} : (!fir.ref, !fir.dscope) -> !fir.ref + %2 = fir.load %1 : !fir.ref + %3 = fir.convert %2 : (i64) -> index + %4 = arith.cmpi sgt, %3, %c0 : index + %5 = arith.select %4, %3, %c0 : index + %6 = fir.shape %5 : (index) -> !fir.shape<1> + %7 = fir.declare %arg0(%6) dummy_scope %0 {uniq_name = "_QFdyn_allocaEx"} : (!fir.ref>, !fir.shape<1>, !fir.dscope) -> !fir.ref> + %8 = fir.allocmem !fir.array, %5 {bindc_name = ".tmp.array", uniq_name = ""} + %9 = fir.declare %8(%6) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg2 = %c1 to %5 step %c1 unordered { + %14 = fir.array_coor %7(%6) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %15 = fir.load %14 : !fir.ref + %16 = arith.addf %15, %15 fastmath : f32 + %17 = fir.array_coor %9(%6) %arg2 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %16 to %17 : !fir.ref + } + %10 = fir.convert %9 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%10) fastmath : (!fir.ref>) -> () 
+ fir.freemem %9 : !fir.heap> + %11 = fir.allocmem !fir.array, %5 {bindc_name = ".tmp.array", uniq_name = ""} + %12 = fir.declare %11(%6) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg2 = %c1 to %5 step %c1 unordered { + %14 = fir.array_coor %7(%6) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %15 = fir.load %14 : !fir.ref + %16 = arith.mulf %15, %15 fastmath : f32 + %17 = fir.array_coor %12(%6) %arg2 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %16 to %17 : !fir.ref + } + %13 = fir.convert %12 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%13) fastmath : (!fir.ref>) -> () + fir.freemem %12 : !fir.heap> + return +} +// CHECK-LABEL: func.func @_QPdyn_alloca( +// CHECK-NOT: llvm.intr.lifetime.start +// CHECK: return + +func.func private @_QPbar(!fir.ref>) +func.func private @_QPibar(!fir.ref>) +} >From 07a4b82b44eafe60eae651a75c5be607bf8deec0 Mon Sep 17 00:00:00 2001 From: jeanPerier Date: Thu, 22 May 2025 09:24:14 +0200 Subject: [PATCH 2/2] Update flang/lib/Optimizer/Transforms/StackArrays.cpp Co-authored-by: Tom Eccles --- flang/lib/Optimizer/Transforms/StackArrays.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Optimizer/Transforms/StackArrays.cpp b/flang/lib/Optimizer/Transforms/StackArrays.cpp index b5671261c9a2b..bc8a9497fbb70 100644 --- a/flang/lib/Optimizer/Transforms/StackArrays.cpp +++ b/flang/lib/Optimizer/Transforms/StackArrays.cpp @@ -805,7 +805,7 @@ void AllocMemConversion::insertLifetimeMarkers( if (!dl || !kindMap) return; llvm::StringRef attrName = fir::getHasLifetimeMarkerAttrName(); - // Do not add lifetime markers, of the alloca already has any. + // Do not add lifetime markers if the alloca already has any. 
if (newAlloc->hasAttr(attrName)) return; if (std::optional size = From flang-commits at lists.llvm.org Thu May 22 00:25:45 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 22 May 2025 00:25:45 -0700 (PDT) Subject: [flang-commits] [flang] [flang] optionally add lifetime markers to alloca created in stack-arrays (PR #140901) In-Reply-To: Message-ID: <682ed179.630a0220.414b3.b88a@mx.google.com> jeanPerier wrote: Thanks for the review! https://github.com/llvm/llvm-project/pull/140901 From flang-commits at lists.llvm.org Thu May 22 00:26:17 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 22 May 2025 00:26:17 -0700 (PDT) Subject: [flang-commits] [flang] 1f5b6ae - [flang] optionally add lifetime markers to alloca created in stack-arrays (#140901) Message-ID: <682ed199.630a0220.338d7.3a86@mx.google.com> Author: jeanPerier Date: 2025-05-22T09:26:14+02:00 New Revision: 1f5b6ae89fbc88d22c323fa56d8bdad9f7b695c3 URL: https://github.com/llvm/llvm-project/commit/1f5b6ae89fbc88d22c323fa56d8bdad9f7b695c3 DIFF: https://github.com/llvm/llvm-project/commit/1f5b6ae89fbc88d22c323fa56d8bdad9f7b695c3.diff LOG: [flang] optionally add lifetime markers to alloca created in stack-arrays (#140901) Flang at Ofast usually produces executables that consume more stack than other Fortran compilers. This is in part because the allocas created from temporary heap allocations by the StackArray pass are created at the function scope level without lifetimes, and LLVM does not/is not able to merge allocas that do not have overlapping lifetimes. This patch adds an option to generate LLVM lifetime markers in the StackArray pass at the previous heap allocation/free using the LLVM dialect operations for them. 
Added: flang/test/Transforms/stack-arrays-lifetime.fir Modified: flang/include/flang/Optimizer/Builder/FIRBuilder.h flang/include/flang/Optimizer/Dialect/FIROpsSupport.h flang/include/flang/Optimizer/Transforms/Passes.td flang/lib/Optimizer/Builder/FIRBuilder.cpp flang/lib/Optimizer/Dialect/FIROps.cpp flang/lib/Optimizer/Transforms/StackArrays.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Optimizer/Builder/FIRBuilder.h b/flang/include/flang/Optimizer/Builder/FIRBuilder.h index 5309ea2c0fc09..9382d77a8d67b 100644 --- a/flang/include/flang/Optimizer/Builder/FIRBuilder.h +++ b/flang/include/flang/Optimizer/Builder/FIRBuilder.h @@ -879,7 +879,7 @@ llvm::SmallVector elideLengthsAlreadyInType(mlir::Type type, mlir::ValueRange lenParams); /// Get the address space which should be used for allocas -uint64_t getAllocaAddressSpace(mlir::DataLayout *dataLayout); +uint64_t getAllocaAddressSpace(const mlir::DataLayout *dataLayout); /// The two vectors of MLIR values have the following property: /// \p extents1[i] must have the same value as \p extents2[i] @@ -913,6 +913,18 @@ void genDimInfoFromBox(fir::FirOpBuilder &builder, mlir::Location loc, llvm::SmallVectorImpl *extents, llvm::SmallVectorImpl *strides); +/// Generate an LLVM dialect lifetime start marker at the current insertion +/// point given an fir.alloca and its constant size in bytes. Returns the value +/// to be passed to the lifetime end marker. +mlir::Value genLifetimeStart(mlir::OpBuilder &builder, mlir::Location loc, + fir::AllocaOp alloc, int64_t size, + const mlir::DataLayout *dl); + +/// Generate an LLVM dialect lifetime end marker at the current insertion point +/// given an llvm.ptr value and the constant size in bytes of its storage. 
+void genLifetimeEnd(mlir::OpBuilder &builder, mlir::Location loc, + mlir::Value mem, int64_t size); + } // namespace fir::factory #endif // FORTRAN_OPTIMIZER_BUILDER_FIRBUILDER_H diff --git a/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h b/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h index e71a622725bf4..0a2337be7455e 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h +++ b/flang/include/flang/Optimizer/Dialect/FIROpsSupport.h @@ -125,6 +125,12 @@ static constexpr llvm::StringRef getInternalFuncNameAttrName() { return "fir.internal_name"; } +/// Attribute to mark alloca that have been given a lifetime marker so that +/// later pass do not try adding new ones. +static constexpr llvm::StringRef getHasLifetimeMarkerAttrName() { + return "fir.has_lifetime"; +} + /// Does the function, \p func, have a host-associations tuple argument? /// Some internal procedures may have access to host procedure variables. bool hasHostAssociationArgument(mlir::func::FuncOp func); @@ -221,6 +227,12 @@ inline bool hasBindcAttr(mlir::Operation *op) { return hasProcedureAttr(op); } +/// Get the allocation size of a given alloca if it has compile time constant +/// size. +std::optional getAllocaByteSize(fir::AllocaOp alloca, + const mlir::DataLayout &dl, + const fir::KindMapping &kindMap); + /// Return true, if \p rebox operation keeps the input array /// continuous if it is initially continuous. /// When \p checkWhole is false, then the checking is only done diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index c0d88a8e19f80..b251534e1a8f6 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -285,7 +285,9 @@ def StackArrays : Pass<"stack-arrays", "mlir::func::FuncOp"> { Convert heap allocations for arrays, even those of unknown size, into stack allocations. 
}]; - let dependentDialects = [ "fir::FIROpsDialect" ]; + let dependentDialects = [ + "fir::FIROpsDialect", "mlir::DLTIDialect", "mlir::LLVM::LLVMDialect" + ]; } def StackReclaim : Pass<"stack-reclaim"> { diff --git a/flang/lib/Optimizer/Builder/FIRBuilder.cpp b/flang/lib/Optimizer/Builder/FIRBuilder.cpp index 86166db355f72..68a1cc7a3aee6 100644 --- a/flang/lib/Optimizer/Builder/FIRBuilder.cpp +++ b/flang/lib/Optimizer/Builder/FIRBuilder.cpp @@ -1868,7 +1868,8 @@ void fir::factory::setInternalLinkage(mlir::func::FuncOp func) { func->setAttr("llvm.linkage", linkage); } -uint64_t fir::factory::getAllocaAddressSpace(mlir::DataLayout *dataLayout) { +uint64_t +fir::factory::getAllocaAddressSpace(const mlir::DataLayout *dataLayout) { if (dataLayout) if (mlir::Attribute addrSpace = dataLayout->getAllocaMemorySpace()) return mlir::cast(addrSpace).getUInt(); @@ -1940,3 +1941,20 @@ void fir::factory::genDimInfoFromBox( strides->push_back(dimInfo.getByteStride()); } } + +mlir::Value fir::factory::genLifetimeStart(mlir::OpBuilder &builder, + mlir::Location loc, + fir::AllocaOp alloc, int64_t size, + const mlir::DataLayout *dl) { + mlir::Type ptrTy = mlir::LLVM::LLVMPointerType::get( + alloc.getContext(), getAllocaAddressSpace(dl)); + mlir::Value cast = + builder.create(loc, ptrTy, alloc.getResult()); + builder.create(loc, size, cast); + return cast; +} + +void fir::factory::genLifetimeEnd(mlir::OpBuilder &builder, mlir::Location loc, + mlir::Value cast, int64_t size) { + builder.create(loc, size, cast); +} diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index e12af7782a578..cbe93907265f6 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -4804,6 +4804,19 @@ bool fir::reboxPreservesContinuity(fir::ReboxOp rebox, bool checkWhole) { return false; } +std::optional fir::getAllocaByteSize(fir::AllocaOp alloca, + const mlir::DataLayout &dl, + const fir::KindMapping &kindMap) { + mlir::Type 
type = alloca.getInType(); + // TODO: should use the constant operands when all info is not available in + // the type. + if (!alloca.isDynamic()) + if (auto sizeAndAlignment = + getTypeSizeAndAlignment(alloca.getLoc(), type, dl, kindMap)) + return sizeAndAlignment->first; + return std::nullopt; +} + //===----------------------------------------------------------------------===// // DeclareOp //===----------------------------------------------------------------------===// diff --git a/flang/lib/Optimizer/Transforms/StackArrays.cpp b/flang/lib/Optimizer/Transforms/StackArrays.cpp index f9b9b4f4ff385..bc8a9497fbb70 100644 --- a/flang/lib/Optimizer/Transforms/StackArrays.cpp +++ b/flang/lib/Optimizer/Transforms/StackArrays.cpp @@ -13,12 +13,15 @@ #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/Dialect/Support/FIRContext.h" +#include "flang/Optimizer/Support/DataLayout.h" #include "flang/Optimizer/Transforms/Passes.h" #include "mlir/Analysis/DataFlow/ConstantPropagationAnalysis.h" #include "mlir/Analysis/DataFlow/DeadCodeAnalysis.h" #include "mlir/Analysis/DataFlow/DenseAnalysis.h" #include "mlir/Analysis/DataFlowFramework.h" +#include "mlir/Dialect/DLTI/DLTI.h" #include "mlir/Dialect/Func/IR/FuncOps.h" +#include "mlir/Dialect/LLVMIR/LLVMDialect.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Diagnostics.h" @@ -48,6 +51,11 @@ static llvm::cl::opt maxAllocsPerFunc( "to 0 for no limit."), llvm::cl::init(1000), llvm::cl::Hidden); +static llvm::cl::opt emitLifetimeMarkers( + "stack-arrays-lifetime", + llvm::cl::desc("Add lifetime markers to generated constant size allocas"), + llvm::cl::init(false), llvm::cl::Hidden); + namespace { /// The state of an SSA value at each program point @@ -189,8 +197,11 @@ class AllocMemConversion : public mlir::OpRewritePattern { public: explicit AllocMemConversion( mlir::MLIRContext *ctx, - const 
StackArraysAnalysisWrapper::AllocMemMap &candidateOps) - : OpRewritePattern(ctx), candidateOps{candidateOps} {} + const StackArraysAnalysisWrapper::AllocMemMap &candidateOps, + std::optional &dl, + std::optional &kindMap) + : OpRewritePattern(ctx), candidateOps{candidateOps}, dl{dl}, + kindMap{kindMap} {} llvm::LogicalResult matchAndRewrite(fir::AllocMemOp allocmem, @@ -206,6 +217,9 @@ class AllocMemConversion : public mlir::OpRewritePattern { /// Handle to the DFA (already run) const StackArraysAnalysisWrapper::AllocMemMap &candidateOps; + const std::optional &dl; + const std::optional &kindMap; + /// If we failed to find an insertion point not inside a loop, see if it would /// be safe to use an llvm.stacksave/llvm.stackrestore inside the loop static InsertionPoint findAllocaLoopInsertionPoint( @@ -218,8 +232,12 @@ class AllocMemConversion : public mlir::OpRewritePattern { mlir::PatternRewriter &rewriter) const; /// Inserts a stacksave before oldAlloc and a stackrestore after each freemem - void insertStackSaveRestore(fir::AllocMemOp &oldAlloc, + void insertStackSaveRestore(fir::AllocMemOp oldAlloc, mlir::PatternRewriter &rewriter) const; + /// Emit lifetime markers for newAlloc between oldAlloc and each freemem. + /// If the allocation is dynamic, no life markers are emitted. 
+ void insertLifetimeMarkers(fir::AllocMemOp oldAlloc, fir::AllocaOp newAlloc, + mlir::PatternRewriter &rewriter) const; }; class StackArraysPass : public fir::impl::StackArraysBase { @@ -740,14 +758,34 @@ AllocMemConversion::insertAlloca(fir::AllocMemOp &oldAlloc, llvm::StringRef uniqName = unpackName(oldAlloc.getUniqName()); llvm::StringRef bindcName = unpackName(oldAlloc.getBindcName()); - return rewriter.create(loc, varTy, uniqName, bindcName, - oldAlloc.getTypeparams(), - oldAlloc.getShape()); + auto alloca = rewriter.create(loc, varTy, uniqName, bindcName, + oldAlloc.getTypeparams(), + oldAlloc.getShape()); + if (emitLifetimeMarkers) + insertLifetimeMarkers(oldAlloc, alloca, rewriter); + + return alloca; +} + +static void +visitFreeMemOp(fir::AllocMemOp oldAlloc, + const std::function &callBack) { + for (mlir::Operation *user : oldAlloc->getUsers()) { + if (auto declareOp = mlir::dyn_cast_if_present(user)) { + for (mlir::Operation *user : declareOp->getUsers()) { + if (mlir::isa(user)) + callBack(user); + } + } + + if (mlir::isa(user)) + callBack(user); + } } void AllocMemConversion::insertStackSaveRestore( - fir::AllocMemOp &oldAlloc, mlir::PatternRewriter &rewriter) const { - auto oldPoint = rewriter.saveInsertionPoint(); + fir::AllocMemOp oldAlloc, mlir::PatternRewriter &rewriter) const { + mlir::OpBuilder::InsertionGuard insertGuard(rewriter); auto mod = oldAlloc->getParentOfType(); fir::FirOpBuilder builder{rewriter, mod}; @@ -758,21 +796,30 @@ void AllocMemConversion::insertStackSaveRestore( builder.setInsertionPoint(user); builder.genStackRestore(user->getLoc(), sp); }; + visitFreeMemOp(oldAlloc, createStackRestoreCall); +} - for (mlir::Operation *user : oldAlloc->getUsers()) { - if (auto declareOp = mlir::dyn_cast_if_present(user)) { - for (mlir::Operation *user : declareOp->getUsers()) { - if (mlir::isa(user)) - createStackRestoreCall(user); - } - } - - if (mlir::isa(user)) { - createStackRestoreCall(user); - } +void 
AllocMemConversion::insertLifetimeMarkers( + fir::AllocMemOp oldAlloc, fir::AllocaOp newAlloc, + mlir::PatternRewriter &rewriter) const { + if (!dl || !kindMap) + return; + llvm::StringRef attrName = fir::getHasLifetimeMarkerAttrName(); + // Do not add lifetime markers if the alloca already has any. + if (newAlloc->hasAttr(attrName)) + return; + if (std::optional size = + fir::getAllocaByteSize(newAlloc, *dl, *kindMap)) { + mlir::OpBuilder::InsertionGuard insertGuard(rewriter); + rewriter.setInsertionPoint(oldAlloc); + mlir::Value ptr = fir::factory::genLifetimeStart( + rewriter, newAlloc.getLoc(), newAlloc, *size, &*dl); + visitFreeMemOp(oldAlloc, [&](mlir::Operation *op) { + rewriter.setInsertionPoint(op); + fir::factory::genLifetimeEnd(rewriter, op->getLoc(), ptr, *size); + }); + newAlloc->setAttr(attrName, rewriter.getUnitAttr()); } - - rewriter.restoreInsertionPoint(oldPoint); } StackArraysPass::StackArraysPass(const StackArraysPass &pass) @@ -809,7 +856,16 @@ void StackArraysPass::runOnOperation() { config.setRegionSimplificationLevel( mlir::GreedySimplifyRegionLevel::Disabled); - patterns.insert(&context, *candidateOps); + auto module = func->getParentOfType(); + std::optional dl = + module ? fir::support::getOrSetMLIRDataLayout( + module, /*allowDefaultLayout=*/false) + : std::nullopt; + std::optional kindMap; + if (module) + kindMap = fir::getKindMapping(module); + + patterns.insert(&context, *candidateOps, dl, kindMap); if (mlir::failed(mlir::applyOpPatternsGreedily( opsToConvert, std::move(patterns), config))) { mlir::emitError(func->getLoc(), "error in stack arrays optimization\n"); diff --git a/flang/test/Transforms/stack-arrays-lifetime.fir b/flang/test/Transforms/stack-arrays-lifetime.fir new file mode 100644 index 0000000000000..5b2faeba132c3 --- /dev/null +++ b/flang/test/Transforms/stack-arrays-lifetime.fir @@ -0,0 +1,96 @@ +// Test insertion of llvm.lifetime for allocmem turn into alloca with constant size. 
+// RUN: fir-opt --stack-arrays -stack-arrays-lifetime %s | FileCheck %s + +module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"} { + +func.func @_QPcst_alloca(%arg0: !fir.ref> {fir.bindc_name = "x"}) { + %c1 = arith.constant 1 : index + %c100000 = arith.constant 100000 : index + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.shape %c100000 : (index) -> !fir.shape<1> + %2 = fir.declare %arg0(%1) dummy_scope %0 {uniq_name = "_QFcst_allocaEx"} : (!fir.ref>, !fir.shape<1>, !fir.dscope) -> !fir.ref> + %3 = fir.allocmem !fir.array<100000xf32> {bindc_name = ".tmp.array", uniq_name = ""} + %4 = fir.declare %3(%1) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg1 = %c1 to %c100000 step %c1 unordered { + %9 = fir.array_coor %2(%1) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %10 = fir.load %9 : !fir.ref + %11 = arith.addf %10, %10 fastmath : f32 + %12 = fir.array_coor %4(%1) %arg1 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %11 to %12 : !fir.ref + } + %5 = fir.convert %4 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%5) fastmath : (!fir.ref>) -> () + fir.freemem %4 : !fir.heap> + %6 = fir.allocmem !fir.array<100000xi32> {bindc_name = ".tmp.array", uniq_name = ""} + %7 = fir.declare %6(%1) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg1 = %c1 to %c100000 step %c1 unordered { + %9 = fir.array_coor %2(%1) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %10 = fir.load %9 : !fir.ref + %11 = fir.convert %10 : (f32) -> i32 + %12 = fir.array_coor %7(%1) %arg1 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %11 to %12 : !fir.ref + } + %8 = fir.convert %7 : (!fir.heap>) -> !fir.ref> + fir.call @_QPibar(%8) fastmath : (!fir.ref>) -> () + fir.freemem %7 : !fir.heap> + return +} +// CHECK-LABEL: func.func @_QPcst_alloca( +// CHECK-DAG: 
%[[VAL_0:.*]] = fir.alloca !fir.array<100000xf32> {bindc_name = ".tmp.array", fir.has_lifetime} +// CHECK-DAG: %[[VAL_2:.*]] = fir.alloca !fir.array<100000xi32> {bindc_name = ".tmp.array", fir.has_lifetime} +// CHECK: %[[VAL_9:.*]] = fir.convert %[[VAL_0]] : (!fir.ref>) -> !llvm.ptr +// CHECK: llvm.intr.lifetime.start 400000, %[[VAL_9]] : !llvm.ptr +// CHECK: fir.do_loop +// CHECK: fir.call @_QPbar( +// CHECK: llvm.intr.lifetime.end 400000, %[[VAL_9]] : !llvm.ptr +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_2]] : (!fir.ref>) -> !llvm.ptr +// CHECK: llvm.intr.lifetime.start 400000, %[[VAL_17]] : !llvm.ptr +// CHECK: fir.do_loop +// CHECK: fir.call @_QPibar( +// CHECK: llvm.intr.lifetime.end 400000, %[[VAL_17]] : !llvm.ptr + + +func.func @_QPdyn_alloca(%arg0: !fir.ref> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "n"}) { + %c1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %0 = fir.dummy_scope : !fir.dscope + %1 = fir.declare %arg1 dummy_scope %0 {uniq_name = "_QFdyn_allocaEn"} : (!fir.ref, !fir.dscope) -> !fir.ref + %2 = fir.load %1 : !fir.ref + %3 = fir.convert %2 : (i64) -> index + %4 = arith.cmpi sgt, %3, %c0 : index + %5 = arith.select %4, %3, %c0 : index + %6 = fir.shape %5 : (index) -> !fir.shape<1> + %7 = fir.declare %arg0(%6) dummy_scope %0 {uniq_name = "_QFdyn_allocaEx"} : (!fir.ref>, !fir.shape<1>, !fir.dscope) -> !fir.ref> + %8 = fir.allocmem !fir.array, %5 {bindc_name = ".tmp.array", uniq_name = ""} + %9 = fir.declare %8(%6) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg2 = %c1 to %5 step %c1 unordered { + %14 = fir.array_coor %7(%6) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %15 = fir.load %14 : !fir.ref + %16 = arith.addf %15, %15 fastmath : f32 + %17 = fir.array_coor %9(%6) %arg2 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %16 to %17 : !fir.ref + } + %10 = fir.convert %9 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%10) fastmath : (!fir.ref>) -> () 
+ fir.freemem %9 : !fir.heap> + %11 = fir.allocmem !fir.array, %5 {bindc_name = ".tmp.array", uniq_name = ""} + %12 = fir.declare %11(%6) {uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> !fir.heap> + fir.do_loop %arg2 = %c1 to %5 step %c1 unordered { + %14 = fir.array_coor %7(%6) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %15 = fir.load %14 : !fir.ref + %16 = arith.mulf %15, %15 fastmath : f32 + %17 = fir.array_coor %12(%6) %arg2 : (!fir.heap>, !fir.shape<1>, index) -> !fir.ref + fir.store %16 to %17 : !fir.ref + } + %13 = fir.convert %12 : (!fir.heap>) -> !fir.ref> + fir.call @_QPbar(%13) fastmath : (!fir.ref>) -> () + fir.freemem %12 : !fir.heap> + return +} +// CHECK-LABEL: func.func @_QPdyn_alloca( +// CHECK-NOT: llvm.intr.lifetime.start +// CHECK: return + +func.func private @_QPbar(!fir.ref>) +func.func private @_QPibar(!fir.ref>) +} From flang-commits at lists.llvm.org Thu May 22 00:26:20 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 22 May 2025 00:26:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang] optionally add lifetime markers to alloca created in stack-arrays (PR #140901) In-Reply-To: Message-ID: <682ed19c.630a0220.132b69.1f37@mx.google.com> https://github.com/jeanPerier closed https://github.com/llvm/llvm-project/pull/140901 From flang-commits at lists.llvm.org Thu May 22 00:42:36 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Thu, 22 May 2025 00:42:36 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [lld] [lldb] [llvm] [mlir] [polly] [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS in standalone builds (PR #138587) In-Reply-To: Message-ID: <682ed56c.050a0220.ff60c.ef60@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `llvm-clang-x86_64-gcc-ubuntu-no-asserts` running on `doug-worker-6` while building `bolt,clang,flang,lld,lldb,mlir,polly` at step 6 "test-build-unified-tree-check-all". 
Full details are available at: https://lab.llvm.org/buildbot/#/builders/202/builds/1378
Here is the relevant piece of the build log for the reference ``` Step 6 (test-build-unified-tree-check-all) failure: test (failure) ... PASS: Clang Tools :: clang-tidy/checkers/modernize/loop-convert-rewritten-binop.cpp (22841 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/loop-convert-multidimensional.cpp (22842 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/loop-convert.c (22843 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/loop-convert-structured-binding.cpp (22844 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/loop-convert-negative.cpp (22845 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/macro-to-enum.c (22846 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/loop-convert-uppercase.cpp (22847 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/loop-convert-basic.cpp (22848 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/make-unique-cxx11.cpp (22849 of 88321) TIMEOUT: AddressSanitizer-x86_64-linux-dynamic :: TestCases/asan_lsan_deadlock.cpp (22850 of 88321) ******************** TEST 'AddressSanitizer-x86_64-linux-dynamic :: TestCases/asan_lsan_deadlock.cpp' FAILED ******************** Exit Code: -9 Timeout: Reached timeout of 900 seconds Command Output (stderr): -- /home/buildbot/buildbot-root/gcc-no-asserts/build/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64 -shared-libasan -O0 /home/buildbot/buildbot-root/gcc-no-asserts/llvm-project/compiler-rt/test/asan/TestCases/asan_lsan_deadlock.cpp -o /home/buildbot/buildbot-root/gcc-no-asserts/build/runtimes/runtimes-bins/compiler-rt/test/asan/X86_64LinuxDynamicConfig/TestCases/Output/asan_lsan_deadlock.cpp.tmp # RUN: at line 4 + /home/buildbot/buildbot-root/gcc-no-asserts/build/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls 
-gline-tables-only -m64 -shared-libasan -O0 /home/buildbot/buildbot-root/gcc-no-asserts/llvm-project/compiler-rt/test/asan/TestCases/asan_lsan_deadlock.cpp -o /home/buildbot/buildbot-root/gcc-no-asserts/build/runtimes/runtimes-bins/compiler-rt/test/asan/X86_64LinuxDynamicConfig/TestCases/Output/asan_lsan_deadlock.cpp.tmp env ASAN_OPTIONS=detect_leaks=1 not /home/buildbot/buildbot-root/gcc-no-asserts/build/runtimes/runtimes-bins/compiler-rt/test/asan/X86_64LinuxDynamicConfig/TestCases/Output/asan_lsan_deadlock.cpp.tmp 2>&1 | FileCheck /home/buildbot/buildbot-root/gcc-no-asserts/llvm-project/compiler-rt/test/asan/TestCases/asan_lsan_deadlock.cpp # RUN: at line 5 + env ASAN_OPTIONS=detect_leaks=1 not /home/buildbot/buildbot-root/gcc-no-asserts/build/runtimes/runtimes-bins/compiler-rt/test/asan/X86_64LinuxDynamicConfig/TestCases/Output/asan_lsan_deadlock.cpp.tmp + FileCheck /home/buildbot/buildbot-root/gcc-no-asserts/llvm-project/compiler-rt/test/asan/TestCases/asan_lsan_deadlock.cpp -- ******************** PASS: Clang Tools :: clang-tidy/checkers/modernize/make-shared-header.cpp (22851 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/loop-convert-extra.cpp (22852 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/macro-to-enum.cpp (22853 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/make-unique-default-init.cpp (22854 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/make-unique-inaccessible-ctors.cpp (22855 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/make-unique-macros.cpp (22856 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/make-unique-header.cpp (22857 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/pass-by-value-header.cpp (22858 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/pass-by-value-multi-fixes.cpp (22859 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/make-shared.cpp (22860 of 88321) PASS: Clang Tools :: 
clang-tidy/checkers/modernize/loop-convert-reverse.cpp (22861 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/pass-by-value-macro-header.cpp (22862 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/raw-string-literal-delimiter.cpp (22863 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/redundant-void-arg.c (22864 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/raw-string-literal-replace-shorter.cpp (22865 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/pass-by-value.cpp (22866 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/make-unique.cpp (22867 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/min-max-use-initializer-list.cpp (22868 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/raw-string-literal.cpp (22869 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/redundant-void-arg-delayed.cpp (22870 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/replace-random-shuffle.cpp (22871 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/unary-static-assert.cpp (22872 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/return-braced-init-list.cpp (22873 of 88321) PASS: Clang Tools :: clang-tidy/checkers/modernize/replace-auto-ptr.cpp (22874 of 88321) ```
https://github.com/llvm/llvm-project/pull/138587 From flang-commits at lists.llvm.org Thu May 22 02:29:02 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Thu, 22 May 2025 02:29:02 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682eee5e.650a0220.1d6213.b857@mx.google.com> https://github.com/eZWALT edited https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Thu May 22 02:29:12 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Thu, 22 May 2025 02:29:12 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682eee68.170a0220.12132b.e44b@mx.google.com> https://github.com/eZWALT edited https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Thu May 22 02:36:08 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 22 May 2025 02:36:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Added noalias attribute to function arguments. 
(PR #140803) In-Reply-To: Message-ID: <682ef008.170a0220.320ec6.dbd4@mx.google.com> ================ @@ -56,14 +47,28 @@ void FunctionAttrPass::runOnOperation() { if ((isFromModule || !func.isDeclaration()) && !fir::hasBindcAttr(func.getOperation())) { llvm::StringRef nocapture = mlir::LLVM::LLVMDialect::getNoCaptureAttrName(); + llvm::StringRef noalias = mlir::LLVM::LLVMDialect::getNoAliasAttrName(); mlir::UnitAttr unitAttr = mlir::UnitAttr::get(func.getContext()); for (auto [index, argType] : llvm::enumerate(func.getArgumentTypes())) { + bool isNoCapture = false; + bool isNoAlias = false; if (mlir::isa(argType) && !func.getArgAttr(index, fir::getTargetAttrName()) && !func.getArgAttr(index, fir::getAsynchronousAttrName()) && - !func.getArgAttr(index, fir::getVolatileAttrName())) + !func.getArgAttr(index, fir::getVolatileAttrName())) { + isNoCapture = true; + isNoAlias = !fir::isPointerType(argType); ---------------- tblah wrote: Ahh sorry my mistake. https://github.com/llvm/llvm-project/pull/140803 From flang-commits at lists.llvm.org Thu May 22 02:36:18 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 22 May 2025 02:36:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Added noalias attribute to function arguments. (PR #140803) In-Reply-To: Message-ID: <682ef012.050a0220.26a250.0eda@mx.google.com> https://github.com/tblah approved this pull request. 
LGTM, thanks https://github.com/llvm/llvm-project/pull/140803 From flang-commits at lists.llvm.org Thu May 22 02:41:07 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Thu, 22 May 2025 02:41:07 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682ef133.170a0220.366193.5ec8@mx.google.com> ================ @@ -1480,6 +1493,108 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, Stmt *&Body, SmallVectorImpl> &OriginalInits); + /// @brief Categories of loops encountered during semantic OpenMP loop + /// analysis + /// + /// This enumeration identifies the structural category of a loop or sequence + /// of loops analyzed in the context of OpenMP transformations and directives. + /// This categorization helps differentiate between original source loops + /// and the structures resulting from applying OpenMP loop transformations. + enum class OMPLoopCategory { + + /// @var OMPLoopCategory::RegularLoop + /// Represents a standard canonical loop nest found in the + /// original source code or an intact loop after transformations + /// (i.e Post/Pre loops of a loopranged fusion) + RegularLoop, + + /// @var OMPLoopCategory::TransformSingleLoop + /// Represents the resulting loop structure when an OpenMP loop + // transformation, generates a single, top-level loop + TransformSingleLoop, + + /// @var OMPLoopCategory::TransformLoopSequence + /// Represents the resulting loop structure when an OpenMP loop + /// transformation + /// generates a sequence of two or more canonical loop nests + TransformLoopSequence + }; + + /// The main recursive process of `checkTransformableLoopSequence` that + /// performs grammatical parsing of a canonical loop sequence. 
It extracts + /// key information, such as the number of top-level loops, loop statements, + /// helper expressions, and other relevant loop-related data, all in a single + /// execution to avoid redundant traversals. This analysis flattens inner + /// Loop Sequences + /// + /// \param LoopSeqStmt The AST of the original statement. + /// \param LoopSeqSize [out] Number of top level canonical loops. + /// \param NumLoops [out] Number of total canonical loops (nested too). + /// \param LoopHelpers [out] The multiple loop analyses results. + /// \param ForStmts [out] The multiple Stmt of each For loop. + /// \param OriginalInits [out] The raw original initialization statements + /// of each belonging to a loop of the loop sequence + /// \param TransformPreInits [out] The multiple collection of statements and + /// declarations that must have been executed/declared + /// before entering the loop (each belonging to a + /// particular loop transformation, nullptr otherwise) + /// \param LoopSequencePreInits [out] Additional general collection of loop + /// transformation related statements and declarations + /// not bounded to a particular loop that must be + /// executed before entering the loop transformation + /// \param LoopCategories [out] A sequence of OMPLoopCategory values, + /// one for each loop or loop transformation node + /// successfully analyzed. + /// \param Context + /// \param Kind The loop transformation directive kind. + /// \return Whether the original statement is both syntactically and + /// semantically correct according to OpenMP 6.0 canonical loop + /// sequence definition. 
+ bool analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, ---------------- eZWALT wrote: In the end, I've decided to modify all of them and avoid this rigid, fixed logic. https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Thu May 22 02:59:30 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Thu, 22 May 2025 02:59:30 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <682ef582.170a0220.5dd41.1421@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 >From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:48:48 -0400 Subject: [PATCH 1/4] [flang] Ensure that the integer for Cray pointer is sized correctly The integer used for Cray pointers should have the size equivalent to platform's pointer size. 
modified: flang/include/flang/Evaluate/target.h modified: flang/include/flang/Tools/TargetSetup.h modified: flang/lib/Semantics/resolve-names.cpp --- flang/include/flang/Evaluate/target.h | 6 ++++++ flang/include/flang/Tools/TargetSetup.h | 3 +++ flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..281e42d5285fb 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t pointerSize() { return pointerSize_; } + void set_pointerSize(std::size_t pointerSize) { + pointerSize_ = pointerSize; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. 
} diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 30477b842ae64fb1c94f520136956cc4685648a7 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:59:20 -0400 Subject: [PATCH 2/4] clang-format --- flang/include/flang/Evaluate/target.h | 4 +--- flang/include/flang/Tools/TargetSetup.h | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d5285fb..e8b9fedc38f48 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ class TargetCharacteristics { const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e21fb4..24ab65f740ec6 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + 
targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. >From e1f91c83680d72fc1463a8db97a77298141286d9 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:41:48 -0400 Subject: [PATCH 3/4] Renaming to integerKindForPointer --- flang/include/flang/Evaluate/target.h | 8 +++++--- flang/include/flang/Tools/TargetSetup.h | 3 ++- flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index e8b9fedc38f48..7b38db2db1956 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,8 +131,10 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } - std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } + std::size_t integerKindForPointer() { return integerKindForPointer_; } + void set_integerKindForPointer(std::size_t newKind) { + integerKindForPointer_ = newKind; + } private: static constexpr int maxKind{common::maxKind}; @@ -159,7 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; - std::size_t pointerSize_{8 /* bytes */}; + std::size_t integerKindForPointer_{8}; /* For 64 bit pointer */ }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index 24ab65f740ec6..002e82aa72484 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,7 +94,8 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); - 
targetCharacteristics.set_pointerSize( + // Currently the integer kind happens to be the same as the byte size + targetCharacteristics.set_integerKindForPointer( targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 88114bfe84af3..5fd5ea8f9bc5c 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; + TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 940ff38d0abba9d1b0ab415e42474629284962d8 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:51:53 -0400 Subject: [PATCH 4/4] clang-format --- flang/lib/Semantics/resolve-names.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 5fd5ea8f9bc5c..a8dbf61c8fd68 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6689,8 +6689,8 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { "'%s' cannot be a Cray pointer as it is already a Cray pointee"_err_en_US); } pointer->set(Symbol::Flag::CrayPointer); - const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; + const DeclTypeSpec &pointerType{MakeNumericType(TypeCategory::Integer, + context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); From flang-commits at 
lists.llvm.org Thu May 22 06:50:29 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 22 May 2025 06:50:29 -0700 (PDT) Subject: [flang-commits] [flang] 898df4b - [flang] Skip opt-bufferization when memory effect does not have an associated value (#140781) Message-ID: <682f2ba5.170a0220.37d0b0.d0f9@mx.google.com> Author: Asher Mancinelli Date: 2025-05-22T06:50:25-07:00 New Revision: 898df4b8ed86f6590e8496c2108c1611dca710ab URL: https://github.com/llvm/llvm-project/commit/898df4b8ed86f6590e8496c2108c1611dca710ab DIFF: https://github.com/llvm/llvm-project/commit/898df4b8ed86f6590e8496c2108c1611dca710ab.diff LOG: [flang] Skip opt-bufferization when memory effect does not have an associated value (#140781) Memory effects on the volatile memory resource may not be attached to a particular source, in which case the value of an effect will be null. This caused the added test case to crash in the optimized bufferization pass's safety analysis, because the analysis assumes it can get the SSA value modified by the memory effect. The value is null because memory effects on the volatile resource only indicate that the operation must not be reordered with respect to other volatile operations; there is no concrete SSA value that can be pointed to. This patch changes the safety checks so that memory effects which do not have associated values are treated as not safe for optimized bufferization. 
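The null-value guard described in the log message can be illustrated with a small standalone sketch. This is not the code from OptimizedBufferization.cpp: `Value`, `MemoryEffect`, and `findReadValue` are hypothetical stand-ins invented for illustration, and the rejection rules other than the null check are simplified assumptions.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for an SSA value; not the real mlir::Value.
struct Value {
  int id;
};

// Hypothetical stand-in for a memory effect instance. `value` is null when
// the effect is not attached to any particular SSA value, as the log message
// describes for effects on the volatile resource.
struct MemoryEffect {
  const Value *value;
  bool isWrite;
};

// Toy safety analysis: returns the single value that all effects read from,
// or std::nullopt when the candidate must be rejected.
std::optional<Value> findReadValue(const std::vector<MemoryEffect> &effects) {
  std::optional<Value> read;
  for (const MemoryEffect &effect : effects) {
    // An effect with no associated value cannot be analyzed further, so bail
    // out conservatively instead of dereferencing a null pointer.
    if (effect.value == nullptr)
      return std::nullopt;
    // Simplifying assumption of this sketch: any write is rejected.
    if (effect.isWrite)
      return std::nullopt;
    // Reads from two different values are rejected as well.
    if (read && read->id != effect.value->id)
      return std::nullopt;
    read = *effect.value;
  }
  return read;
}
```

In this toy model, a list containing a single read of one value is accepted, while the same list with an extra effect whose `value` is null is rejected, which mirrors the early `return std::nullopt` added by the patch.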
Added: flang/test/HLFIR/opt-bufferization-skip-volatile.fir Modified: flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp Removed: ################################################################################ diff --git a/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp b/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp index 2f6ee2592a84f..e2ca754a1817a 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp @@ -608,6 +608,12 @@ ElementalAssignBufferization::findMatch(hlfir::ElementalOp elemental) { return std::nullopt; } + if (effect.getValue() == nullptr) { + LLVM_DEBUG(llvm::dbgs() + << "side-effect with no value, cannot analyze further\n"); + return std::nullopt; + } + // allow if and only if the reads are from the elemental indices, in order // => each iteration doesn't read values written by other iterations // don't allow reads from a different value which may alias: fir alias diff --git a/flang/test/HLFIR/opt-bufferization-skip-volatile.fir b/flang/test/HLFIR/opt-bufferization-skip-volatile.fir new file mode 100644 index 0000000000000..158f92bf207d2 --- /dev/null +++ b/flang/test/HLFIR/opt-bufferization-skip-volatile.fir @@ -0,0 +1,49 @@ +// RUN: fir-opt --pass-pipeline="builtin.module(func.func(opt-bufferization))" %s | FileCheck %s + +// Ensure optimized bufferization preserves the semantics of volatile arrays +func.func @minimal_volatile_test() { + %c1 = arith.constant 1 : index + %c200 = arith.constant 200 : index + + // Create a volatile array + %1 = fir.address_of(@_QMtestEarray) : !fir.ref> + %2 = fir.shape %c200 : (index) -> !fir.shape<1> + %3 = fir.volatile_cast %1 : (!fir.ref>) -> !fir.ref, volatile> + %4:2 = hlfir.declare %3(%2) {fortran_attrs = #fir.var_attrs, uniq_name = "_QMtestEarray"} : (!fir.ref, volatile>, !fir.shape<1>) -> (!fir.ref, volatile>, !fir.ref, volatile>) + + // Create an elemental operation 
that negates each element + %5 = hlfir.elemental %2 unordered : (!fir.shape<1>) -> !hlfir.expr<200xf32> { + ^bb0(%arg1: index): + %6 = hlfir.designate %4#0 (%arg1) : (!fir.ref, volatile>, index) -> !fir.ref + %7 = fir.load %6 : !fir.ref + %8 = arith.negf %7 : f32 + hlfir.yield_element %8 : f32 + } + + // Assign the result back to the volatile array + hlfir.assign %5 to %4#0 : !hlfir.expr<200xf32>, !fir.ref, volatile> + hlfir.destroy %5 : !hlfir.expr<200xf32> + + return +} + +fir.global @_QMtestEarray : !fir.array<200xf32> + +// CHECK-LABEL: func.func @minimal_volatile_test() { +// CHECK: %[[VAL_0:.*]] = arith.constant 200 : index +// CHECK: %[[VAL_1:.*]] = fir.address_of(@_QMtestEarray) : !fir.ref> +// CHECK: %[[VAL_2:.*]] = fir.shape %[[VAL_0]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_3:.*]] = fir.volatile_cast %[[VAL_1]] : (!fir.ref>) -> !fir.ref, volatile> +// CHECK: %[[VAL_4:.*]]:2 = hlfir.declare %[[VAL_3]](%[[VAL_2]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QMtestEarray"} : (!fir.ref, volatile>, !fir.shape<1>) -> (!fir.ref, volatile>, !fir.ref, volatile>) +// CHECK: %[[VAL_5:.*]] = hlfir.elemental %[[VAL_2]] unordered : (!fir.shape<1>) -> !hlfir.expr<200xf32> { +// CHECK: ^bb0(%[[VAL_6:.*]]: index): +// CHECK: %[[VAL_7:.*]] = hlfir.designate %[[VAL_4]]#0 (%[[VAL_6]]) : (!fir.ref, volatile>, index) -> !fir.ref +// CHECK: %[[VAL_8:.*]] = fir.load %[[VAL_7]] : !fir.ref +// CHECK: %[[VAL_9:.*]] = arith.negf %[[VAL_8]] : f32 +// CHECK: hlfir.yield_element %[[VAL_9]] : f32 +// CHECK: } +// CHECK: hlfir.assign %[[VAL_5]] to %[[VAL_4]]#0 : !hlfir.expr<200xf32>, !fir.ref, volatile> +// CHECK: hlfir.destroy %[[VAL_5]] : !hlfir.expr<200xf32> +// CHECK: return +// CHECK: } +// CHECK: fir.global @_QMtestEarray : !fir.array<200xf32> From flang-commits at lists.llvm.org Thu May 22 06:50:31 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Thu, 22 May 2025 06:50:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Skip 
optimized bufferization on volatile refs (PR #140781) In-Reply-To: Message-ID: <682f2ba7.170a0220.319cc7.8e03@mx.google.com> https://github.com/ashermancinelli closed https://github.com/llvm/llvm-project/pull/140781 From flang-commits at lists.llvm.org Thu May 22 07:18:12 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Thu, 22 May 2025 07:18:12 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] Add missing trig math-to-llvm conversion patterns (PR #141069) Message-ID: https://github.com/ashermancinelli created https://github.com/llvm/llvm-project/pull/141069 asin, acos, atan, and atan2 were being lowered to libm calls instead of llvm intrinsics. Add the conversion patterns to handle these intrinsics and update tests to expect this. NOTE: I don't know what the difference is between the fast and relaxed versions of the fir tests. I followed the surrounding patterns, but relaxed and fast look identical. I expected precise to use the libm C call with no FMFs, relaxed to use the math dialect with no FMFs, and fast to use the math dialect with FMFs. NOTE: I don't know who to tag for math dialect changes, so feel free to tag someone else if you do! >From 23af523f531f1763870e7b65a03d28e2a1565155 Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Wed, 21 May 2025 17:31:21 -0700 Subject: [PATCH] Add missing trig math-to-llvm conversion patterns asin, acos, atan, and atan2 were being lowered to libm calls instead of llvm intrinsics. Add the conversion patterns to handle these intrinsics and update tests to expect this. 
--- flang/test/Intrinsics/math-codegen.fir | 170 +++++++++++++++++- mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp | 12 +- .../Conversion/MathToLLVM/math-to-llvm.mlir | 78 ++++++++ 3 files changed, 250 insertions(+), 10 deletions(-) diff --git a/flang/test/Intrinsics/math-codegen.fir b/flang/test/Intrinsics/math-codegen.fir index c45c6b23e897e..b7c4e07130662 100644 --- a/flang/test/Intrinsics/math-codegen.fir +++ b/flang/test/Intrinsics/math-codegen.fir @@ -378,13 +378,167 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { func.func private @llvm.round.f32(f32) -> f32 func.func private @llvm.round.f64(f64) -> f64 +//--- asin_fast.fir +// RUN: fir-opt %t/asin_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_fast.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- asin_relaxed.fir +// RUN: fir-opt %t/asin_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_relaxed.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func 
@_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- asin_precise.fir +// RUN: fir-opt %t/asin_precise.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_precise.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @asinf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @asinf(%1) : (f32) -> f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @asin(%1) : (f64) -> f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} +func.func private @asinf(f32) -> f32 +func.func private @asin(f64) -> f64 + +//--- acos_fast.fir +// RUN: fir-opt %t/acos_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_fast.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: 
{{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- acos_relaxed.fir +// RUN: fir-opt %t/acos_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_relaxed.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- acos_precise.fir +// RUN: fir-opt %t/acos_precise.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_precise.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @acosf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = 
llvm.call @acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @acosf(%1) : (f32) -> f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @acos(%1) : (f64) -> f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} +func.func private @acosf(f32) -> f32 +func.func private @acos(f64) -> f64 + //--- atan_fast.fir // RUN: fir-opt %t/atan_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan_fast.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -406,10 +560,10 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { //--- atan_relaxed.fir // RUN: fir-opt %t/atan_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan_relaxed.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> 
f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -458,10 +612,10 @@ func.func private @atan(f64) -> f64 //--- atan2_fast.fir // RUN: fir-opt %t/atan2_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan2_fast.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "y"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -485,10 +639,10 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fi //--- atan2_relaxed.fir // RUN: fir-opt %t/atan2_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan2_relaxed.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "y"}) -> f32 { %0 
= fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} diff --git a/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp b/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp index 97da96afac4cd..b42bb773f53ee 100644 --- a/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp +++ b/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp @@ -42,6 +42,7 @@ using CopySignOpLowering = ConvertFMFMathToLLVMPattern; using CosOpLowering = ConvertFMFMathToLLVMPattern; using CoshOpLowering = ConvertFMFMathToLLVMPattern; +using AcosOpLowering = ConvertFMFMathToLLVMPattern; using CtPopFOpLowering = VectorConvertToLLVMPattern; using Exp2OpLowering = ConvertFMFMathToLLVMPattern; @@ -62,12 +63,15 @@ using RoundOpLowering = ConvertFMFMathToLLVMPattern; using SinOpLowering = ConvertFMFMathToLLVMPattern; using SinhOpLowering = ConvertFMFMathToLLVMPattern; +using ASinOpLowering = ConvertFMFMathToLLVMPattern; using SqrtOpLowering = ConvertFMFMathToLLVMPattern; using FTruncOpLowering = ConvertFMFMathToLLVMPattern; using TanOpLowering = ConvertFMFMathToLLVMPattern; using TanhOpLowering = ConvertFMFMathToLLVMPattern; - +using ATanOpLowering = ConvertFMFMathToLLVMPattern; +using ATan2OpLowering = + ConvertFMFMathToLLVMPattern; // A `CtLz/CtTz/absi(a)` is converted into `CtLz/CtTz/absi(a, false)`. 
template struct IntOpWithFlagLowering : public ConvertOpToLLVMPattern { @@ -353,6 +357,7 @@ void mlir::populateMathToLLVMConversionPatterns( CopySignOpLowering, CosOpLowering, CoshOpLowering, + AcosOpLowering, CountLeadingZerosOpLowering, CountTrailingZerosOpLowering, CtPopFOpLowering, @@ -371,10 +376,13 @@ void mlir::populateMathToLLVMConversionPatterns( RsqrtOpLowering, SinOpLowering, SinhOpLowering, + ASinOpLowering, SqrtOpLowering, FTruncOpLowering, TanOpLowering, - TanhOpLowering + TanhOpLowering, + ATanOpLowering, + ATan2OpLowering >(converter, benefit); // clang-format on } diff --git a/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir b/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir index 974743a55932b..537fb967ef0e1 100644 --- a/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir +++ b/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir @@ -177,6 +177,84 @@ func.func @trigonometrics(%arg0: f32) { // ----- +// CHECK-LABEL: func @inverse_trigonometrics +// CHECK-SAME: [[ARG0:%.+]]: f32 +func.func @inverse_trigonometrics(%arg0: f32) { + // CHECK: llvm.intr.asin([[ARG0]]) : (f32) -> f32 + %0 = math.asin %arg0 : f32 + + // CHECK: llvm.intr.acos([[ARG0]]) : (f32) -> f32 + %1 = math.acos %arg0 : f32 + + // CHECK: llvm.intr.atan([[ARG0]]) : (f32) -> f32 + %2 = math.atan %arg0 : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2 +// CHECK-SAME: [[ARG0:%.+]]: f32, [[ARG1:%.+]]: f32 +func.func @atan2(%arg0: f32, %arg1: f32) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) : (f32, f32) -> f32 + %0 = math.atan2 %arg0, %arg1 : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @inverse_trigonometrics_vector +// CHECK-SAME: [[ARG0:%.+]]: vector<4xf32> +func.func @inverse_trigonometrics_vector(%arg0: vector<4xf32>) { + // CHECK: llvm.intr.asin([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %0 = math.asin %arg0 : vector<4xf32> + + // CHECK: llvm.intr.acos([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %1 = math.acos %arg0 : vector<4xf32> + + // CHECK: 
llvm.intr.atan([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %2 = math.atan %arg0 : vector<4xf32> + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2_vector +// CHECK-SAME: [[ARG0:%.+]]: vector<4xf32>, [[ARG1:%.+]]: vector<4xf32> +func.func @atan2_vector(%arg0: vector<4xf32>, %arg1: vector<4xf32>) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) : (vector<4xf32>, vector<4xf32>) -> vector<4xf32> + %0 = math.atan2 %arg0, %arg1 : vector<4xf32> + func.return +} + +// ----- + +// CHECK-LABEL: func @inverse_trigonometrics_fmf +// CHECK-SAME: [[ARG0:%.+]]: f32 +func.func @inverse_trigonometrics_fmf(%arg0: f32) { + // CHECK: llvm.intr.asin([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %0 = math.asin %arg0 fastmath : f32 + + // CHECK: llvm.intr.acos([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %1 = math.acos %arg0 fastmath : f32 + + // CHECK: llvm.intr.atan([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %2 = math.atan %arg0 fastmath : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2_fmf +// CHECK-SAME: [[ARG0:%.+]]: f32, [[ARG1:%.+]]: f32 +func.func @atan2_fmf(%arg0: f32, %arg1: f32) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) {fastmathFlags = #llvm.fastmath} : (f32, f32) -> f32 + %0 = math.atan2 %arg0, %arg1 fastmath : f32 + func.return +} + +// ----- + // CHECK-LABEL: func @hyperbolics // CHECK-SAME: [[ARG0:%.+]]: f32 func.func @hyperbolics(%arg0: f32) { From flang-commits at lists.llvm.org Thu May 22 07:18:35 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 22 May 2025 07:18:35 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] Add missing trig math-to-llvm conversion patterns (PR #141069) In-Reply-To: Message-ID: <682f323b.170a0220.320ec6.ed4a@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-codegen Author: Asher Mancinelli (ashermancinelli)
Changes asin, acos, atan, and atan2 were being lowered to libm calls instead of llvm intrinsics. Add the conversion patterns to handle these intrinsics and update tests to expect this. NOTE: I don't know what the difference is between the fast and relaxed versions of the fir tests. I followed the surrounding patterns, but relaxed and fast look identical. I expected precise to use the libm C call with no FMFs, relaxed to use the math dialect with no FMFs, and fast to use the math dialect with FMFs. NOTE: I don't know who to tag for math dialect changes, so feel free to tag someone else if you do! --- Full diff: https://github.com/llvm/llvm-project/pull/141069.diff 3 Files Affected: - (modified) flang/test/Intrinsics/math-codegen.fir (+162-8) - (modified) mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp (+10-2) - (modified) mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir (+78) ``````````diff diff --git a/flang/test/Intrinsics/math-codegen.fir b/flang/test/Intrinsics/math-codegen.fir index c45c6b23e897e..b7c4e07130662 100644 --- a/flang/test/Intrinsics/math-codegen.fir +++ b/flang/test/Intrinsics/math-codegen.fir @@ -378,13 +378,167 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { func.func private @llvm.round.f32(f32) -> f32 func.func private @llvm.round.f64(f64) -> f64 +//--- asin_fast.fir +// RUN: fir-opt %t/asin_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_fast.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func 
@_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- asin_relaxed.fir +// RUN: fir-opt %t/asin_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_relaxed.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- asin_precise.fir +// RUN: fir-opt %t/asin_precise.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_precise.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @asinf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @asinf(%1) : (f32) -> f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func 
@_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @asin(%1) : (f64) -> f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} +func.func private @asinf(f32) -> f32 +func.func private @asin(f64) -> f64 + +//--- acos_fast.fir +// RUN: fir-opt %t/acos_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_fast.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- acos_relaxed.fir +// RUN: fir-opt %t/acos_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_relaxed.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = 
fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- acos_precise.fir +// RUN: fir-opt %t/acos_precise.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_precise.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @acosf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @acosf(%1) : (f32) -> f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @acos(%1) : (f64) -> f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} +func.func private @acosf(f32) -> f32 +func.func private @acos(f64) -> f64 + //--- atan_fast.fir // RUN: fir-opt %t/atan_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan_fast.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 func.func @_QPtest_real4(%arg0: 
!fir.ref {fir.bindc_name = "x"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -406,10 +560,10 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { //--- atan_relaxed.fir // RUN: fir-opt %t/atan_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan_relaxed.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -458,10 +612,10 @@ func.func private @atan(f64) -> f64 //--- atan2_fast.fir // RUN: fir-opt %t/atan2_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan2_fast.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "y"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -485,10 +639,10 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fi //--- atan2_relaxed.fir // RUN: fir-opt %t/atan2_relaxed.fir 
--fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan2_relaxed.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "y"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} diff --git a/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp b/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp index 97da96afac4cd..b42bb773f53ee 100644 --- a/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp +++ b/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp @@ -42,6 +42,7 @@ using CopySignOpLowering = ConvertFMFMathToLLVMPattern; using CosOpLowering = ConvertFMFMathToLLVMPattern; using CoshOpLowering = ConvertFMFMathToLLVMPattern; +using AcosOpLowering = ConvertFMFMathToLLVMPattern; using CtPopFOpLowering = VectorConvertToLLVMPattern; using Exp2OpLowering = ConvertFMFMathToLLVMPattern; @@ -62,12 +63,15 @@ using RoundOpLowering = ConvertFMFMathToLLVMPattern; using SinOpLowering = ConvertFMFMathToLLVMPattern; using SinhOpLowering = ConvertFMFMathToLLVMPattern; +using ASinOpLowering = ConvertFMFMathToLLVMPattern; using SqrtOpLowering = ConvertFMFMathToLLVMPattern; using FTruncOpLowering = ConvertFMFMathToLLVMPattern; using TanOpLowering = ConvertFMFMathToLLVMPattern; using TanhOpLowering = ConvertFMFMathToLLVMPattern; - +using ATanOpLowering = ConvertFMFMathToLLVMPattern; +using ATan2OpLowering = + ConvertFMFMathToLLVMPattern; // A `CtLz/CtTz/absi(a)` is converted into `CtLz/CtTz/absi(a, false)`. 
template struct IntOpWithFlagLowering : public ConvertOpToLLVMPattern { @@ -353,6 +357,7 @@ void mlir::populateMathToLLVMConversionPatterns( CopySignOpLowering, CosOpLowering, CoshOpLowering, + AcosOpLowering, CountLeadingZerosOpLowering, CountTrailingZerosOpLowering, CtPopFOpLowering, @@ -371,10 +376,13 @@ void mlir::populateMathToLLVMConversionPatterns( RsqrtOpLowering, SinOpLowering, SinhOpLowering, + ASinOpLowering, SqrtOpLowering, FTruncOpLowering, TanOpLowering, - TanhOpLowering + TanhOpLowering, + ATanOpLowering, + ATan2OpLowering >(converter, benefit); // clang-format on } diff --git a/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir b/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir index 974743a55932b..537fb967ef0e1 100644 --- a/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir +++ b/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir @@ -177,6 +177,84 @@ func.func @trigonometrics(%arg0: f32) { // ----- +// CHECK-LABEL: func @inverse_trigonometrics +// CHECK-SAME: [[ARG0:%.+]]: f32 +func.func @inverse_trigonometrics(%arg0: f32) { + // CHECK: llvm.intr.asin([[ARG0]]) : (f32) -> f32 + %0 = math.asin %arg0 : f32 + + // CHECK: llvm.intr.acos([[ARG0]]) : (f32) -> f32 + %1 = math.acos %arg0 : f32 + + // CHECK: llvm.intr.atan([[ARG0]]) : (f32) -> f32 + %2 = math.atan %arg0 : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2 +// CHECK-SAME: [[ARG0:%.+]]: f32, [[ARG1:%.+]]: f32 +func.func @atan2(%arg0: f32, %arg1: f32) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) : (f32, f32) -> f32 + %0 = math.atan2 %arg0, %arg1 : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @inverse_trigonometrics_vector +// CHECK-SAME: [[ARG0:%.+]]: vector<4xf32> +func.func @inverse_trigonometrics_vector(%arg0: vector<4xf32>) { + // CHECK: llvm.intr.asin([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %0 = math.asin %arg0 : vector<4xf32> + + // CHECK: llvm.intr.acos([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %1 = math.acos %arg0 : vector<4xf32> + + // CHECK: 
llvm.intr.atan([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %2 = math.atan %arg0 : vector<4xf32> + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2_vector +// CHECK-SAME: [[ARG0:%.+]]: vector<4xf32>, [[ARG1:%.+]]: vector<4xf32> +func.func @atan2_vector(%arg0: vector<4xf32>, %arg1: vector<4xf32>) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) : (vector<4xf32>, vector<4xf32>) -> vector<4xf32> + %0 = math.atan2 %arg0, %arg1 : vector<4xf32> + func.return +} + +// ----- + +// CHECK-LABEL: func @inverse_trigonometrics_fmf +// CHECK-SAME: [[ARG0:%.+]]: f32 +func.func @inverse_trigonometrics_fmf(%arg0: f32) { + // CHECK: llvm.intr.asin([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %0 = math.asin %arg0 fastmath : f32 + + // CHECK: llvm.intr.acos([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %1 = math.acos %arg0 fastmath : f32 + + // CHECK: llvm.intr.atan([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %2 = math.atan %arg0 fastmath : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2_fmf +// CHECK-SAME: [[ARG0:%.+]]: f32, [[ARG1:%.+]]: f32 +func.func @atan2_fmf(%arg0: f32, %arg1: f32) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) {fastmathFlags = #llvm.fastmath} : (f32, f32) -> f32 + %0 = math.atan2 %arg0, %arg1 fastmath : f32 + func.return +} + +// ----- + // CHECK-LABEL: func @hyperbolics // CHECK-SAME: [[ARG0:%.+]]: f32 func.func @hyperbolics(%arg0: f32) { ``````````
https://github.com/llvm/llvm-project/pull/141069

From flang-commits at lists.llvm.org Thu May 22 07:18:35 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 22 May 2025 07:18:35 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] Add missing trig math-to-llvm conversion patterns (PR #141069)
In-Reply-To:
Message-ID: <682f323b.630a0220.d6949.4877@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-mlir-math

Author: Asher Mancinelli (ashermancinelli)
Changes

asin, acos, atan, and atan2 were being lowered to libm calls instead of LLVM intrinsics. Add the conversion patterns to handle these intrinsics and update tests to expect this.

NOTE: I don't know what the difference is between the fast and relaxed versions of the fir tests. I followed the surrounding patterns, but relaxed and fast look identical. I expected precise to use the libm C call with no FMFs, relaxed to use the math dialect with no FMFs, and fast to use the math dialect with FMFs.

NOTE: I don't know who to tag for math dialect changes, so feel free to tag someone else if you do!

---
Full diff: https://github.com/llvm/llvm-project/pull/141069.diff

3 Files Affected:

- (modified) flang/test/Intrinsics/math-codegen.fir (+162-8)
- (modified) mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp (+10-2)
- (modified) mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir (+78)

``````````diff diff --git a/flang/test/Intrinsics/math-codegen.fir b/flang/test/Intrinsics/math-codegen.fir index c45c6b23e897e..b7c4e07130662 100644 --- a/flang/test/Intrinsics/math-codegen.fir +++ b/flang/test/Intrinsics/math-codegen.fir @@ -378,13 +378,167 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { func.func private @llvm.round.f32(f32) -> f32 func.func private @llvm.round.f64(f64) -> f64 +//--- asin_fast.fir +// RUN: fir-opt %t/asin_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_fast.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func 
@_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- asin_relaxed.fir +// RUN: fir-opt %t/asin_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_relaxed.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- asin_precise.fir +// RUN: fir-opt %t/asin_precise.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_precise.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @asinf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @asinf(%1) : (f32) -> f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func 
@_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @asin(%1) : (f64) -> f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} +func.func private @asinf(f32) -> f32 +func.func private @asin(f64) -> f64 + +//--- acos_fast.fir +// RUN: fir-opt %t/acos_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_fast.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- acos_relaxed.fir +// RUN: fir-opt %t/acos_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_relaxed.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = 
fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- acos_precise.fir +// RUN: fir-opt %t/acos_precise.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_precise.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @acosf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @acosf(%1) : (f32) -> f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @acos(%1) : (f64) -> f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} +func.func private @acosf(f32) -> f32 +func.func private @acos(f64) -> f64 + //--- atan_fast.fir // RUN: fir-opt %t/atan_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan_fast.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 func.func @_QPtest_real4(%arg0: 
!fir.ref {fir.bindc_name = "x"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -406,10 +560,10 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { //--- atan_relaxed.fir // RUN: fir-opt %t/atan_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan_relaxed.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -458,10 +612,10 @@ func.func private @atan(f64) -> f64 //--- atan2_fast.fir // RUN: fir-opt %t/atan2_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan2_fast.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "y"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -485,10 +639,10 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fi //--- atan2_relaxed.fir // RUN: fir-opt %t/atan2_relaxed.fir 
--fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan2_relaxed.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "y"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} diff --git a/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp b/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp index 97da96afac4cd..b42bb773f53ee 100644 --- a/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp +++ b/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp @@ -42,6 +42,7 @@ using CopySignOpLowering = ConvertFMFMathToLLVMPattern; using CosOpLowering = ConvertFMFMathToLLVMPattern; using CoshOpLowering = ConvertFMFMathToLLVMPattern; +using AcosOpLowering = ConvertFMFMathToLLVMPattern; using CtPopFOpLowering = VectorConvertToLLVMPattern; using Exp2OpLowering = ConvertFMFMathToLLVMPattern; @@ -62,12 +63,15 @@ using RoundOpLowering = ConvertFMFMathToLLVMPattern; using SinOpLowering = ConvertFMFMathToLLVMPattern; using SinhOpLowering = ConvertFMFMathToLLVMPattern; +using ASinOpLowering = ConvertFMFMathToLLVMPattern; using SqrtOpLowering = ConvertFMFMathToLLVMPattern; using FTruncOpLowering = ConvertFMFMathToLLVMPattern; using TanOpLowering = ConvertFMFMathToLLVMPattern; using TanhOpLowering = ConvertFMFMathToLLVMPattern; - +using ATanOpLowering = ConvertFMFMathToLLVMPattern; +using ATan2OpLowering = + ConvertFMFMathToLLVMPattern; // A `CtLz/CtTz/absi(a)` is converted into `CtLz/CtTz/absi(a, false)`. 
template struct IntOpWithFlagLowering : public ConvertOpToLLVMPattern { @@ -353,6 +357,7 @@ void mlir::populateMathToLLVMConversionPatterns( CopySignOpLowering, CosOpLowering, CoshOpLowering, + AcosOpLowering, CountLeadingZerosOpLowering, CountTrailingZerosOpLowering, CtPopFOpLowering, @@ -371,10 +376,13 @@ void mlir::populateMathToLLVMConversionPatterns( RsqrtOpLowering, SinOpLowering, SinhOpLowering, + ASinOpLowering, SqrtOpLowering, FTruncOpLowering, TanOpLowering, - TanhOpLowering + TanhOpLowering, + ATanOpLowering, + ATan2OpLowering >(converter, benefit); // clang-format on } diff --git a/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir b/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir index 974743a55932b..537fb967ef0e1 100644 --- a/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir +++ b/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir @@ -177,6 +177,84 @@ func.func @trigonometrics(%arg0: f32) { // ----- +// CHECK-LABEL: func @inverse_trigonometrics +// CHECK-SAME: [[ARG0:%.+]]: f32 +func.func @inverse_trigonometrics(%arg0: f32) { + // CHECK: llvm.intr.asin([[ARG0]]) : (f32) -> f32 + %0 = math.asin %arg0 : f32 + + // CHECK: llvm.intr.acos([[ARG0]]) : (f32) -> f32 + %1 = math.acos %arg0 : f32 + + // CHECK: llvm.intr.atan([[ARG0]]) : (f32) -> f32 + %2 = math.atan %arg0 : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2 +// CHECK-SAME: [[ARG0:%.+]]: f32, [[ARG1:%.+]]: f32 +func.func @atan2(%arg0: f32, %arg1: f32) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) : (f32, f32) -> f32 + %0 = math.atan2 %arg0, %arg1 : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @inverse_trigonometrics_vector +// CHECK-SAME: [[ARG0:%.+]]: vector<4xf32> +func.func @inverse_trigonometrics_vector(%arg0: vector<4xf32>) { + // CHECK: llvm.intr.asin([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %0 = math.asin %arg0 : vector<4xf32> + + // CHECK: llvm.intr.acos([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %1 = math.acos %arg0 : vector<4xf32> + + // CHECK: 
llvm.intr.atan([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %2 = math.atan %arg0 : vector<4xf32> + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2_vector +// CHECK-SAME: [[ARG0:%.+]]: vector<4xf32>, [[ARG1:%.+]]: vector<4xf32> +func.func @atan2_vector(%arg0: vector<4xf32>, %arg1: vector<4xf32>) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) : (vector<4xf32>, vector<4xf32>) -> vector<4xf32> + %0 = math.atan2 %arg0, %arg1 : vector<4xf32> + func.return +} + +// ----- + +// CHECK-LABEL: func @inverse_trigonometrics_fmf +// CHECK-SAME: [[ARG0:%.+]]: f32 +func.func @inverse_trigonometrics_fmf(%arg0: f32) { + // CHECK: llvm.intr.asin([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %0 = math.asin %arg0 fastmath : f32 + + // CHECK: llvm.intr.acos([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %1 = math.acos %arg0 fastmath : f32 + + // CHECK: llvm.intr.atan([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %2 = math.atan %arg0 fastmath : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2_fmf +// CHECK-SAME: [[ARG0:%.+]]: f32, [[ARG1:%.+]]: f32 +func.func @atan2_fmf(%arg0: f32, %arg1: f32) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) {fastmathFlags = #llvm.fastmath} : (f32, f32) -> f32 + %0 = math.atan2 %arg0, %arg1 fastmath : f32 + func.return +} + +// ----- + // CHECK-LABEL: func @hyperbolics // CHECK-SAME: [[ARG0:%.+]]: f32 func.func @hyperbolics(%arg0: f32) { ``````````
https://github.com/llvm/llvm-project/pull/141069

From flang-commits at lists.llvm.org Thu May 22 07:24:15 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 22 May 2025 07:24:15 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718)
In-Reply-To:
Message-ID: <682f338f.a70a0220.043c.0e6f@mx.google.com>

github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command:

``````````bash
git-clang-format --diff HEAD~1 HEAD --extensions cpp -- flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp flang/lib/Optimizer/Passes/Pipelines.cpp
``````````
View the diff from clang-format here.

``````````diff
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp
index acaf59849..a6beada14 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp
+++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp
@@ -1,4 +1,5 @@
-//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops --------------------===//
+//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops
+//--------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -11,7 +12,6 @@
 // when the input array is not behind a pointer. This may change in the future.
 //===----------------------------------------------------------------------===//
 
-
 #include "flang/Optimizer/Builder/FIRBuilder.h"
 #include "flang/Optimizer/Builder/HLFIRTools.h"
 #include "flang/Optimizer/Dialect/FIRType.h"
@@ -29,9 +29,9 @@ namespace hlfir {
 #define DEBUG_TYPE "inline-hlfir-copy-in"
 
 static llvm::cl::opt noInlineHLFIRCopyIn(
-  "no-inline-hlfir-copy-in",
-  llvm::cl::desc("Do not inline hlfir.copy_in operations"),
-  llvm::cl::init(false));
+    "no-inline-hlfir-copy-in",
+    llvm::cl::desc("Do not inline hlfir.copy_in operations"),
+    llvm::cl::init(false));
 
 namespace {
 class InlineCopyInConversion : public mlir::OpRewritePattern {
``````````
https://github.com/llvm/llvm-project/pull/138718

From flang-commits at lists.llvm.org Thu May 22 07:25:07 2025
From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits)
Date: Thu, 22 May 2025 07:25:07 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718)
In-Reply-To:
Message-ID: <682f33c3.050a0220.2a8cf0.c05c@mx.google.com>

https://github.com/mrkajetanp edited https://github.com/llvm/llvm-project/pull/138718

From flang-commits at lists.llvm.org Thu May 22 07:25:56 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 22 May 2025 07:25:56 -0700 (PDT)
Subject: [flang-commits] [flang] e9cba3c - [flang][OpenMP] use attribute for delayed privatization barrier (#140092)
Message-ID: <682f33f4.170a0220.1bdff9.ee5f@mx.google.com>

Author: Tom Eccles
Date: 2025-05-22T15:25:53+01:00
New Revision: e9cba3c8edca3dc805e82afbb482b3938cb96ae2

URL: https://github.com/llvm/llvm-project/commit/e9cba3c8edca3dc805e82afbb482b3938cb96ae2
DIFF: https://github.com/llvm/llvm-project/commit/e9cba3c8edca3dc805e82afbb482b3938cb96ae2.diff

LOG: [flang][OpenMP] use attribute for delayed privatization barrier (#140092)

Fixes #136357

The barrier needs to go between the copying into firstprivate variables and the initialization call for the OpenMP construct (e.g. wsloop). There is no way of expressing this in MLIR because for delayed privatization that is all implicit (added in MLIR->LLVMIR conversion).

The previous approach put the barrier immediately before the wsloop (or similar). For delayed privatization, the firstprivate copy code would then be inserted after that, opening the possibility for the race observed in the bug report.

This patch solves the issue by instead setting an attribute on the MLIR operation, which instructs the OpenMP dialect to LLVM IR conversion to insert a barrier in the correct place.
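
The ordering problem described in the log can be illustrated with a small model. This is only a sketch under assumptions: it assumes the race involves a variable that is both firstprivate and lastprivate (as in the touched test same_var_first_lastprivate.f90), and it replays one possible interleaving of two threads in plain single-threaded C++. The function name, the values 10 and 99, and the two-thread setup are hypothetical and are not taken from flang or the OpenMP runtime.

```cpp
#include <cassert>

// Single-threaded model of ONE possible interleaving of two OpenMP threads.
// `shared` stands for a variable that is both firstprivate and lastprivate.
// Assumption: thread 0 is fast and owns the last loop iteration, thread 1 is
// slow. All names and values are hypothetical.
//
// Returns the value thread 1 sees when it initializes its firstprivate copy.
int slowThreadInitialValue(bool barrierAfterCopy) {
  int shared = 10; // value of the original variable before the construct
  int priv0 = 0;   // thread 0's private copy
  int priv1 = 0;   // thread 1's private copy

  if (barrierAfterCopy) {
    // Ordering after the patch: copy, barrier, loop.
    priv0 = shared; // thread 0: firstprivate copy
    priv1 = shared; // thread 1: firstprivate copy
    // -- barrier: no thread enters the loop until both copies are done --
    priv0 = 99;     // thread 0: body of the last iteration
    shared = priv0; // thread 0: lastprivate write-back
  } else {
    // Ordering before the patch: barrier, copy, loop.
    // -- barrier: afterwards thread 0 is free to run ahead --
    priv0 = shared; // thread 0: firstprivate copy
    priv0 = 99;     // thread 0: body of the last iteration
    shared = priv0; // thread 0: lastprivate write-back
    priv1 = shared; // thread 1: firstprivate copy, taken too late
  }
  return priv1;
}
```

In this model, calling `slowThreadInitialValue(true)` yields the original value, while `slowThreadInitialValue(false)` yields the value written back by the other thread, which is the kind of outcome a barrier placed between the copy and the loop is meant to rule out.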
Added: Modified: flang/lib/Lower/OpenMP/DataSharingProcessor.cpp flang/lib/Lower/OpenMP/DataSharingProcessor.h flang/test/Lower/OpenMP/lastprivate-allocatable.f90 flang/test/Lower/OpenMP/parallel-lastprivate-clause-scalar.f90 flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 Removed: ################################################################################ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 2a1c94407e1c8..20dc46e4710fb 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -62,7 +62,7 @@ void DataSharingProcessor::processStep1( privatize(clauseOps); - insertBarrier(); + insertBarrier(clauseOps); } void DataSharingProcessor::processStep2(mlir::Operation *op, bool isLoop) { @@ -231,9 +231,18 @@ bool DataSharingProcessor::needBarrier() { return false; } -void DataSharingProcessor::insertBarrier() { - if (needBarrier()) +void DataSharingProcessor::insertBarrier( + mlir::omp::PrivateClauseOps *clauseOps) { + if (!needBarrier()) + return; + + if (useDelayedPrivatization) { + if (clauseOps) + clauseOps->privateNeedsBarrier = + mlir::UnitAttr::get(&converter.getMLIRContext()); + } else { firOpBuilder.create(converter.getCurrentLocation()); + } } void DataSharingProcessor::insertLastPrivateCompare(mlir::Operation *op) { diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.h b/flang/lib/Lower/OpenMP/DataSharingProcessor.h index 54a42fd199831..7787e4ffb03c2 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.h +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.h @@ -100,7 +100,7 @@ class DataSharingProcessor { const omp::ObjectList &objects, llvm::SetVector &symbolSet); void collectSymbolsForPrivatization(); - void insertBarrier(); + void insertBarrier(mlir::omp::PrivateClauseOps *clauseOps); void collectDefaultSymbols(); void collectImplicitSymbols(); void collectPreDeterminedSymbols(); diff --git 
a/flang/test/Lower/OpenMP/lastprivate-allocatable.f90 b/flang/test/Lower/OpenMP/lastprivate-allocatable.f90 index 1d31edd16efea..c2626e14b51c7 100644 --- a/flang/test/Lower/OpenMP/lastprivate-allocatable.f90 +++ b/flang/test/Lower/OpenMP/lastprivate-allocatable.f90 @@ -8,7 +8,7 @@ ! CHECK: fir.store %[[VAL_2]] to %[[VAL_0]] : !fir.ref>> ! CHECK: %[[VAL_3:.*]]:2 = hlfir.declare %[[VAL_0]] {fortran_attrs = {{.*}}, uniq_name = "_QFEa"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) ! CHECK: omp.parallel { -! CHECK: omp.wsloop private(@{{.*}} %{{.*}} -> %{{.*}}, @{{.*}} %{{.*}} -> %[[VAL_17:.*]] : !fir.ref>>, !fir.ref) { +! CHECK: omp.wsloop private(@{{.*}} %{{.*}} -> %{{.*}}, @{{.*}} %{{.*}} -> %[[VAL_17:.*]] : !fir.ref>>, !fir.ref) private_barrier { ! CHECK: omp.loop_nest ! CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %{{.*}} {fortran_attrs = {{.*}}, uniq_name = "_QFEa"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) ! CHECK: %[[VAL_18:.*]]:2 = hlfir.declare %[[VAL_17]] {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) diff --git a/flang/test/Lower/OpenMP/parallel-lastprivate-clause-scalar.f90 b/flang/test/Lower/OpenMP/parallel-lastprivate-clause-scalar.f90 index 60de8fa6f46a2..5d37010f4095b 100644 --- a/flang/test/Lower/OpenMP/parallel-lastprivate-clause-scalar.f90 +++ b/flang/test/Lower/OpenMP/parallel-lastprivate-clause-scalar.f90 @@ -226,8 +226,7 @@ subroutine firstpriv_lastpriv_int(arg1, arg2) ! 
Firstprivate update -!CHECK-NEXT: omp.barrier -!CHECK: omp.wsloop private(@{{.*}} %{{.*}}#0 -> %[[CLONE1:.*]], @{{.*}} %{{.*}}#0 -> %[[IV:.*]] : !fir.ref, !fir.ref) { +!CHECK: omp.wsloop private(@{{.*}} %{{.*}}#0 -> %[[CLONE1:.*]], @{{.*}} %{{.*}}#0 -> %[[IV:.*]] : !fir.ref, !fir.ref) private_barrier { !CHECK-NEXT: omp.loop_nest (%[[INDX_WS:.*]]) : {{.*}} { !CHECK: %[[CLONE1_DECL:.*]]:2 = hlfir.declare %[[CLONE1]] {uniq_name = "_QFfirstpriv_lastpriv_int2Earg1"} : (!fir.ref) -> (!fir.ref, !fir.ref) diff --git a/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 b/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 index ee914f23aacf3..45d6f91f67f1f 100644 --- a/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 +++ b/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 @@ -20,8 +20,7 @@ subroutine first_and_lastprivate ! CHECK: func.func @{{.*}}first_and_lastprivate() ! CHECK: %[[ORIG_VAR_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "{{.*}}Evar"} ! CHECK: omp.parallel { -! CHECK: omp.barrier -! CHECK: omp.wsloop private(@{{.*}}var_firstprivate_i32 {{.*}}) { +! CHECK: omp.wsloop private(@{{.*}}var_firstprivate_i32 {{.*}}) private_barrier { ! CHECK: omp.loop_nest {{.*}} { ! CHECK: %[[PRIV_VAR_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "{{.*}}Evar"} ! CHECK: fir.if %{{.*}} { From flang-commits at lists.llvm.org Thu May 22 07:56:12 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Thu, 22 May 2025 07:56:12 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [lld] [lldb] [llvm] [mlir] [polly] [CMake] respect LLVMConfig.cmake's LLVM_DEFINITIONS in standalone builds (PR #138587) In-Reply-To: Message-ID: <682f3b0c.170a0220.1bdf95.f065@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `clang-ppc64-aix` running on `aix-ppc64` while building `bolt,clang,flang,lld,lldb,mlir,polly` at step 6 "test-build-unified-tree-check-all". 
Full details are available at: https://lab.llvm.org/buildbot/#/builders/64/builds/3735
Here is the relevant piece of the build log for the reference ``` Step 6 (test-build-unified-tree-check-all) failure: test (failure) ******************** TEST 'lit :: timeout-hang.py' FAILED ******************** Exit Code: 1 Command Output (stdout): -- # RUN: at line 13 not env -u FILECHECK_OPTS "/home/llvm/llvm-external-buildbots/workers/env/bin/python3.11" /home/llvm/llvm-external-buildbots/workers/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/utils/lit/lit.py -j1 --order=lexical Inputs/timeout-hang/run-nonexistent.txt --timeout=1 --param external=0 | "/home/llvm/llvm-external-buildbots/workers/env/bin/python3.11" /home/llvm/llvm-external-buildbots/workers/aix-ppc64/clang-ppc64-aix/build/utils/lit/tests/timeout-hang.py 1 # executed command: not env -u FILECHECK_OPTS /home/llvm/llvm-external-buildbots/workers/env/bin/python3.11 /home/llvm/llvm-external-buildbots/workers/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/utils/lit/lit.py -j1 --order=lexical Inputs/timeout-hang/run-nonexistent.txt --timeout=1 --param external=0 # .---command stderr------------ # | lit.py: /home/llvm/llvm-external-buildbots/workers/aix-ppc64/clang-ppc64-aix/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 1 seconds was requested on the command line. Forcing timeout to be 1 seconds. # `----------------------------- # executed command: /home/llvm/llvm-external-buildbots/workers/env/bin/python3.11 /home/llvm/llvm-external-buildbots/workers/aix-ppc64/clang-ppc64-aix/build/utils/lit/tests/timeout-hang.py 1 # .---command stdout------------ # | Testing took as long or longer than timeout # `----------------------------- # error: command failed with exit status: 1 -- ******************** ```
https://github.com/llvm/llvm-project/pull/138587 From flang-commits at lists.llvm.org Thu May 22 07:56:38 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Thu, 22 May 2025 07:56:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <682f3b26.170a0220.6c7a7.f6c1@mx.google.com> https://github.com/mrkajetanp ready_for_review https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Thu May 22 07:57:13 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 22 May 2025 07:57:13 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <682f3b49.a70a0220.1fecab.2d53@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Kajetan Puchalski (mrkajetanp)
Changes

hlfir.copy_in implements copying non-contiguous array slices for functions that take in arrays required to be contiguous through flang-rt. For large arrays of trivial types, this can incur overhead compared to a plain, inlined copy loop.

To address that, add a new InlineHLFIRCopyIn optimisation pass to inline hlfir.copy_in operations for trivial types. For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in), and when the input array is not behind a pointer.

Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by about 10%.

----

Accompanying implementation for the RFC which can be found [here](https://discourse.llvm.org/t/rfc-inline-hlfir-copy-in-for-trivial-types/86205).

---

Patch is 21.97 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/138718.diff

5 Files Affected:

- (modified) flang/include/flang/Optimizer/HLFIR/Passes.td (+4)
- (modified) flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt (+1)
- (added) flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp (+180)
- (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+5)
- (added) flang/test/HLFIR/inline-hlfir-copy-in.fir (+146)

``````````diff
diff --git a/flang/include/flang/Optimizer/HLFIR/Passes.td b/flang/include/flang/Optimizer/HLFIR/Passes.td
index d445140118073..04d7aec5fe489 100644
--- a/flang/include/flang/Optimizer/HLFIR/Passes.td
+++ b/flang/include/flang/Optimizer/HLFIR/Passes.td
@@ -69,6 +69,10 @@ def InlineHLFIRAssign : Pass<"inline-hlfir-assign"> {
   let summary = "Inline hlfir.assign operations";
 }
 
+def InlineHLFIRCopyIn : Pass<"inline-hlfir-copy-in"> {
+  let summary = "Inline hlfir.copy_in operations";
+}
+
 def PropagateFortranVariableAttributes : Pass<"propagate-fortran-attrs"> {
   let summary = "Propagate FortranVariableFlagsAttr attributes through HLFIR";
 }
diff 
--git a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt index d959428ebd203..cc74273d9c5d9 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt +++ b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt @@ -5,6 +5,7 @@ add_flang_library(HLFIRTransforms ConvertToFIR.cpp InlineElementals.cpp InlineHLFIRAssign.cpp + InlineHLFIRCopyIn.cpp LowerHLFIRIntrinsics.cpp LowerHLFIROrderedAssignments.cpp ScheduleOrderedAssignments.cpp diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp new file mode 100644 index 0000000000000..1e2aecaf535a0 --- /dev/null +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + mlir::Value tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); + rewriter.eraseOp(tempBox.getDefiningOp()); + + return mlir::success(); +} + +class InlineHLFIRCopyInPass + : public hlfir::impl::InlineHLFIRCopyInBase { +public: + void runOnOperation() override { + mlir::MLIRContext *context = &getContext(); + + mlir::GreedyRewriteConfig config; + // Prevent the pattern driver from merging blocks. 
+ config.setRegionSimplificationLevel( + mlir::GreedySimplifyRegionLevel::Disabled); + + mlir::RewritePatternSet patterns(context); + if (!noInlineHLFIRCopyIn) { + patterns.insert(context); + } + + if (mlir::failed(mlir::applyPatternsGreedily( + getOperation(), std::move(patterns), config))) { + mlir::emitError(getOperation()->getLoc(), + "failure in hlfir.copy_in inlining"); + signalPassFailure(); + } + } +}; +} // namespace diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..1779623fddc5a 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -255,6 +255,11 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP, pm, hlfir::createOptimizedBufferization); addNestedPassToAllTopLevelOperations( pm, hlfir::createInlineHLFIRAssign); + + if (optLevel == llvm::OptimizationLevel::O3) { + addNestedPassToAllTopLevelOperations( + pm, hlfir::createInlineHLFIRCopyIn); + } } pm.addPass(hlfir::createLowerHLFIROrderedAssignments()); pm.addPass(hlfir::createLowerHLFIRIntrinsics()); diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir new file mode 100644 index 0000000000000..7140e93f19979 --- /dev/null +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -0,0 +1,146 @@ +// Test inlining of hlfir.copy_in +// RUN: fir-opt --inline-hlfir-copy-in %s | FileCheck %s + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, 
!fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : 
(!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load 
%[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, 
%c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[V... [truncated] ``````````
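For readers who do not work with FIR daily, the shape of the code that the inlined copy-in produces (a contiguity check, a plain element loop into a temporary only when needed, and a conditional free after the call) can be pictured with the rough C++ sketch below. This is an illustration only and not code from the patch: `StridedView`, `isContiguous`, `copyIn` and `releaseCopyIn` are hypothetical names, the rank-1 `double` view is an assumed stand-in for a boxed Fortran array section, and `new[]`/`delete[]` stand in for `fir.allocmem`/`fir.freemem`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a rank-1 array section (not a flang type).
struct StridedView {
  const double *base;    // address of the first element of the section
  std::size_t extent;    // number of elements in the section
  std::ptrdiff_t stride; // distance between consecutive elements, in elements
};

// Plays the role of the contiguity check guarding the fir.if in the test.
inline bool isContiguous(const StridedView &v) {
  return v.extent <= 1 || v.stride == 1;
}

// Returns {address to pass to the callee, whether a temporary was allocated}.
// The "then" branch forwards the original storage; the "else" branch copies
// element by element into a fresh contiguous temporary.
inline std::pair<const double *, bool> copyIn(const StridedView &v) {
  if (isContiguous(v))
    return {v.base, false};
  assert(v.stride != 0 && "assumed: a section never has a zero stride");
  double *temp = new double[v.extent];
  for (std::size_t i = 0; i < v.extent; ++i)
    temp[i] = v.base[static_cast<std::ptrdiff_t>(i) * v.stride];
  return {temp, true};
}

// Runs where the copy-out used to be. Nothing is written back, because the
// pattern only fires when no copy-out is required (e.g. intent(in)); the
// temporary is simply freed if one was created.
inline void releaseCopyIn(const double *addr, bool needsCleanup) {
  if (needsCleanup)
    delete[] addr;
}
```

A caller would invoke `copyIn` before the call, pass the returned address to the callee, and invoke `releaseCopyIn` afterwards. In the sketch the second value of the pair doubles as both the cleanup flag and the "was copied" result, mirroring how the patch replaces the second result of `hlfir.copy_in` with the negated contiguity check.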
https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Thu May 22 07:22:00 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Thu, 22 May 2025 07:22:00 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <682f3308.a70a0220.2b4313.28a4@mx.google.com> https://github.com/mrkajetanp updated https://github.com/llvm/llvm-project/pull/138718 >From 6bbba43416edf2c4d310d3e92e99ad458e87d8f7 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 10 Apr 2025 14:04:52 +0000 Subject: [PATCH 1/4] [flang] Inline hlfir.copy_in for trivial types hlfir.copy_in implements copying non-contiguous array slices for functions that take in arrays required to be contiguous through a flang-rt function that calls memcpy/memmove separately on each element. For large arrays of trivial types, this can incur considerable overhead compared to a plain copy loop that is better able to take advantage of hardware pipelines. To address that, extend the InlineHLFIRAssign optimisation pass with a new pattern for inlining hlfir.copy_in operations for trivial types. For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in). Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by a factor of about 1/3rd. 
Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 117 ++++++++++++++++++ 1 file changed, 117 insertions(+) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 6e209cce07ad4..38c684eaceb7d 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,6 +13,7 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + auto results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + auto elem = hlfir::getElementAt(loc, builder, inputVariable, + loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + auto tempElem = hlfir::getElementAt(loc, builder, temp, + loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + auto refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + auto refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + auto addr = results[0]; + auto needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + auto tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. + rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); +} + class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -140,6 +256,7 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); + patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { >From 32a62862f141c3894c7a30b141ddf3b7701cb85b Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 7 May 2025 16:04:07 +0000 Subject: [PATCH 2/4] Add tests Signed-off-by: Kajetan Puchalski --- flang/test/HLFIR/inline-hlfir-assign.fir | 144 +++++++++++++++++++++++ 1 file changed, 144 insertions(+) diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index f834e7971e3d5..df7681b9c5c16 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,3 +353,147 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // 
CHECK: } + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, 
+// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result 
%[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 
dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] 
dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 
f34d4d56fd9818aa8cedf2a3600ab2205f65e047 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 8 May 2025 15:15:56 +0000 Subject: [PATCH 3/4] Address Tom's review comments Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 41 +++++++++++-------- 1 file changed, 23 insertions(+), 18 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 38c684eaceb7d..dc545ece8adff 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -158,16 +158,16 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, "CopyInOp's WasCopied has no uses"); // The copy out should always be present, either to actually copy or just // deallocate memory. - auto *copyOut = - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - if (!mlir::isa(copyOut)) + if (!copyOut) return rewriter.notifyMatchFailure(copyIn, "CopyInOp has no direct CopyOut"); // Only inline the copy_in when copy_out does not need to be done, i.e. in // case of intent(in). 
- if (::llvm::cast(copyOut).getVar()) + if (copyOut.getVar()) return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); inputVariable = @@ -175,7 +175,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); mlir::Value isContiguous = builder.create(loc, inputVariable); - auto results = + mlir::Operation::result_range results = builder .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, /*withElseRegion=*/true) @@ -195,11 +195,11 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, loc, builder, extents, /*isUnordered=*/true, flangomp::shouldUseWorkshareLowering(copyIn)); builder.setInsertionPointToStart(loopNest.body); - auto elem = hlfir::getElementAt(loc, builder, inputVariable, - loopNest.oneBasedIndices); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); elem = hlfir::loadTrivialScalar(loc, builder, elem); - auto tempElem = hlfir::getElementAt(loc, builder, temp, - loopNest.oneBasedIndices); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); builder.create(loc, elem, tempElem); builder.setInsertionPointAfter(loopNest.outerOp); @@ -209,9 +209,9 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, if (mlir::isa(temp.getType())) { result = temp; } else { - auto refTy = + fir::ReferenceType refTy = fir::ReferenceType::get(temp.getElementOrSequenceType()); - auto refVal = builder.createConvert(loc, refTy, temp); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); result = builder.create(loc, resultAddrType, refVal); } @@ -221,25 +221,30 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }) .getResults(); - auto addr = results[0]; - auto needsCleanup = results[1]; + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, 
needsCleanup, false).genThen([&] { + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { auto boxAddr = builder.create(loc, addr); - auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); builder.create(loc, heapVal); }); rewriter.eraseOp(copyOut); - auto tempBox = copyIn.getTempBox(); + mlir::Value tempBox = copyIn.getTempBox(); rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); // The TempBox is only needed for flang-rt calls which we're no longer - // generating. + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); } >From cc62dcd91b6ebaefd53264c964a97ba741883807 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 22 May 2025 13:37:53 +0000 Subject: [PATCH 4/4] Separate copy_in inlining into its own pass, add flag Signed-off-by: Kajetan Puchalski --- flang/include/flang/Optimizer/HLFIR/Passes.td | 4 + .../Optimizer/HLFIR/Transforms/CMakeLists.txt | 1 + .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 122 ------------ .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 181 ++++++++++++++++++ flang/lib/Optimizer/Passes/Pipelines.cpp | 5 + flang/test/HLFIR/inline-hlfir-assign.fir | 144 -------------- flang/test/HLFIR/inline-hlfir-copy-in.fir | 146 ++++++++++++++ 7 files changed, 337 insertions(+), 266 deletions(-) create mode 100644 flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp create mode 100644 flang/test/HLFIR/inline-hlfir-copy-in.fir diff --git a/flang/include/flang/Optimizer/HLFIR/Passes.td b/flang/include/flang/Optimizer/HLFIR/Passes.td index d445140118073..04d7aec5fe489 100644 --- 
a/flang/include/flang/Optimizer/HLFIR/Passes.td +++ b/flang/include/flang/Optimizer/HLFIR/Passes.td @@ -69,6 +69,10 @@ def InlineHLFIRAssign : Pass<"inline-hlfir-assign"> { let summary = "Inline hlfir.assign operations"; } +def InlineHLFIRCopyIn : Pass<"inline-hlfir-copy-in"> { + let summary = "Inline hlfir.copy_in operations"; +} + def PropagateFortranVariableAttributes : Pass<"propagate-fortran-attrs"> { let summary = "Propagate FortranVariableFlagsAttr attributes through HLFIR"; } diff --git a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt index d959428ebd203..cc74273d9c5d9 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt +++ b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt @@ -5,6 +5,7 @@ add_flang_library(HLFIRTransforms ConvertToFIR.cpp InlineElementals.cpp InlineHLFIRAssign.cpp + InlineHLFIRCopyIn.cpp LowerHLFIRIntrinsics.cpp LowerHLFIROrderedAssignments.cpp ScheduleOrderedAssignments.cpp diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index dc545ece8adff..6e209cce07ad4 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,7 +13,6 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" -#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -128,126 +127,6 @@ class InlineHLFIRAssignConversion } }; -class InlineCopyInConversion : public mlir::OpRewritePattern { -public: - using mlir::OpRewritePattern::OpRewritePattern; - - llvm::LogicalResult - matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const override; -}; - -llvm::LogicalResult 
-InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const { - fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); - mlir::Location loc = copyIn.getLoc(); - hlfir::Entity inputVariable{copyIn.getVar()}; - if (!fir::isa_trivial(inputVariable.getFortranElementType())) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's data type is not trivial"); - - if (fir::isPointerType(inputVariable.getType())) - return rewriter.notifyMatchFailure( - copyIn, "CopyInOp's input variable is a pointer"); - - // There should be exactly one user of WasCopied - the corresponding - // CopyOutOp. - if (copyIn.getWasCopied().getUses().empty()) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's WasCopied has no uses"); - // The copy out should always be present, either to actually copy or just - // deallocate memory. - auto copyOut = mlir::dyn_cast( - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - - if (!copyOut) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp has no direct CopyOut"); - - // Only inline the copy_in when copy_out does not need to be done, i.e. in - // case of intent(in). 
- if (copyOut.getVar()) - return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); - - inputVariable = - hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); - mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); - mlir::Value isContiguous = - builder.create(loc, inputVariable); - mlir::Operation::result_range results = - builder - .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, - /*withElseRegion=*/true) - .genThen([&]() { - mlir::Value falseVal = builder.create( - loc, builder.getI1Type(), builder.getBoolAttr(false)); - builder.create( - loc, mlir::ValueRange{inputVariable, falseVal}); - }) - .genElse([&] { - auto [temp, cleanup] = - hlfir::createTempFromMold(loc, builder, inputVariable); - mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); - llvm::SmallVector extents = - hlfir::getIndexExtents(loc, builder, shape); - hlfir::LoopNest loopNest = hlfir::genLoopNest( - loc, builder, extents, /*isUnordered=*/true, - flangomp::shouldUseWorkshareLowering(copyIn)); - builder.setInsertionPointToStart(loopNest.body); - hlfir::Entity elem = hlfir::getElementAt( - loc, builder, inputVariable, loopNest.oneBasedIndices); - elem = hlfir::loadTrivialScalar(loc, builder, elem); - hlfir::Entity tempElem = hlfir::getElementAt( - loc, builder, temp, loopNest.oneBasedIndices); - builder.create(loc, elem, tempElem); - builder.setInsertionPointAfter(loopNest.outerOp); - - mlir::Value result; - // Make sure the result is always a boxed array by boxing it - // ourselves if need be. 
- if (mlir::isa(temp.getType())) { - result = temp; - } else { - fir::ReferenceType refTy = - fir::ReferenceType::get(temp.getElementOrSequenceType()); - mlir::Value refVal = builder.createConvert(loc, refTy, temp); - result = - builder.create(loc, resultAddrType, refVal); - } - - builder.create(loc, - mlir::ValueRange{result, cleanup}); - }) - .getResults(); - - mlir::OpResult addr = results[0]; - mlir::OpResult needsCleanup = results[1]; - - builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { - auto boxAddr = builder.create(loc, addr); - fir::HeapType heapType = - fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - mlir::Value heapVal = - builder.createConvert(loc, heapType, boxAddr.getResult()); - builder.create(loc, heapVal); - }); - rewriter.eraseOp(copyOut); - - mlir::Value tempBox = copyIn.getTempBox(); - - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); - - // The TempBox is only needed for flang-rt calls which we're no longer - // generating. It should have no uses left at this stage. 
- if (!tempBox.getUses().empty()) - return mlir::failure(); - rewriter.eraseOp(tempBox.getDefiningOp()); - - return mlir::success(); -} - class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -261,7 +140,6 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); - patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp new file mode 100644 index 0000000000000..acaf5984978d9 --- /dev/null +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -0,0 +1,181 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops --------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + mlir::Value tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); + rewriter.eraseOp(tempBox.getDefiningOp()); + + return mlir::success(); +} + +class InlineHLFIRCopyInPass + : public hlfir::impl::InlineHLFIRCopyInBase { +public: + void runOnOperation() override { + mlir::MLIRContext *context = &getContext(); + + mlir::GreedyRewriteConfig config; + // Prevent the pattern driver from merging blocks. 
+ config.setRegionSimplificationLevel( + mlir::GreedySimplifyRegionLevel::Disabled); + + mlir::RewritePatternSet patterns(context); + if (!noInlineHLFIRCopyIn) { + patterns.insert(context); + } + + if (mlir::failed(mlir::applyPatternsGreedily( + getOperation(), std::move(patterns), config))) { + mlir::emitError(getOperation()->getLoc(), + "failure in hlfir.copy_in inlining"); + signalPassFailure(); + } + } +}; +} // namespace diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..1779623fddc5a 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -255,6 +255,11 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP, pm, hlfir::createOptimizedBufferization); addNestedPassToAllTopLevelOperations( pm, hlfir::createInlineHLFIRAssign); + + if (optLevel == llvm::OptimizationLevel::O3) { + addNestedPassToAllTopLevelOperations( + pm, hlfir::createInlineHLFIRCopyIn); + } } pm.addPass(hlfir::createLowerHLFIROrderedAssignments()); pm.addPass(hlfir::createLowerHLFIRIntrinsics()); diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index df7681b9c5c16..f834e7971e3d5 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,147 +353,3 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // CHECK: } - -// Test inlining of hlfir.copy_in that does not require the array to be copied out -func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, 
!fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, %c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant true -// CHECK: %[[VAL_4:.*]] = arith.constant false -// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : 
(!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 -// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 -// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { -// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 -// CHECK: } else { -// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} -// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { -// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: %[[VAL_27:.*]] = fir.load 
%[[VAL_26:.*]] : !fir.ref -// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref -// CHECK: } -// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 -// CHECK: } -// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: fir.if %[[VAL_21:.*]]#1 { -// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> -// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> -// CHECK: } -// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } - -// Test not inlining of hlfir.copy_in that requires the array to be copied out -func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, 
%c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_no_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> -// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 
-// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) -// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () -// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir new file mode 100644 index 0000000000000..7140e93f19979 --- /dev/null +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -0,0 +1,146 @@ +// Test inlining of hlfir.copy_in +// RUN: fir-opt --inline-hlfir-copy-in %s | FileCheck %s + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca 
!fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 
+// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, 
!fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 
+ %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: 
%[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } From flang-commits at lists.llvm.org Thu May 22 07:24:30 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 22 May 2025 07:24:30 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][OpenMP] use attribute for delayed privatization barrier (PR #140092) In-Reply-To: 
Message-ID: <682f339e.170a0220.18eeef.edb0@mx.google.com>

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/140092

From flang-commits at lists.llvm.org Thu May 22 07:24:47 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Thu, 22 May 2025 07:24:47 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [flang][OpenMP] use attribute for delayed privatization barrier (PR #140092)
In-Reply-To:
Message-ID: <682f33af.170a0220.375382.eb57@mx.google.com>

https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/140092

>From 5e2e1a76d93812fdb1ba6730c819b40aaeb99cb0 Mon Sep 17 00:00:00 2001
From: Tom Eccles
Date: Wed, 14 May 2025 16:49:32 +0000
Subject: [PATCH 1/4] [mlir][OpenMP] add attribute for privatization barrier

A barrier is needed at the end of initialization/copying of private variables if any of those variables is lastprivate. This ensures that all firstprivate variables receive the original value of the variable before the lastprivate clause overwrites it.

Previously this barrier was added by the flang frontend, but there is not a reliable way to put the barrier in the correct place for delayed privatization, and the OpenMP dialect could some day have other users. It is important that there are safe ways to use the constructs available in the dialect.

lastprivate is currently not modelled in the OpenMP dialect, and so there is no way to reliably determine whether there were lastprivate variables. Therefore the frontend will have to provide this information through this new attribute.
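For illustration only (this sketch is not part of the patch): the ordering requirement described above can be seen at the source level in a loop that names the same variable in both a firstprivate and a lastprivate clause. The helper name `runLoop` and the value 42 are hypothetical. The pragma is standard OpenMP; if the file is built without OpenMP support the pragma is ignored and the loop runs serially with the same observable result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not taken from the patch.
// firstprivate(x): each thread's private copy of x starts from the original x.
// lastprivate(x):  the private copy from the sequentially last iteration is
//                  written back to the original x.
// An implementation must not perform that write-back before every thread has
// initialised its private copy; the barrier discussed above enforces this.
int runLoop(int n, std::vector<int> &seen) {
  assert(n >= 1); // lastprivate needs at least one iteration to assign x
  int x = 42;
  seen.assign(n, 0);
#pragma omp parallel for schedule(static) firstprivate(x) lastprivate(x)
  for (int i = 0; i < n; ++i) {
    seen[i] = x; // every iteration should observe the original value
    if (i == n - 1)
      x = i; // only the last iteration modifies its private copy
  }
  return x; // value written back by lastprivate
}
```

Calling `runLoop(8, seen)` should leave every element of `seen` equal to the original value and return the value assigned in the last iteration; a missing barrier would show up as some thread recording the overwritten value instead.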
Part of a series of patches to fix https://github.com/llvm/llvm-project/issues/136357 --- .../mlir/Dialect/OpenMP/OpenMPClauses.td | 5 +- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 37 ++-- .../Conversion/SCFToOpenMP/SCFToOpenMP.cpp | 1 + mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | 196 +++++++++++------- mlir/test/Dialect/OpenMP/ops.mlir | 17 ++ 5 files changed, 159 insertions(+), 97 deletions(-) diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td index f8e880ea43b75..16c14ef085d6d 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td @@ -1102,7 +1102,10 @@ class OpenMP_PrivateClauseSkip< let arguments = (ins Variadic:$private_vars, - OptionalAttr:$private_syms + OptionalAttr:$private_syms, + // Set this attribute if a barrier is needed after initialization and + // copying of lastprivate variables. + UnitAttr:$private_needs_barrier ); // TODO: Add description. 
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 5a79fbf77a268..036c6a6e350a8 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -213,8 +213,8 @@ def ParallelOp : OpenMP_Op<"parallel", traits = [ let assemblyFormat = clausesAssemblyFormat # [{ custom($region, $private_vars, type($private_vars), - $private_syms, $reduction_mod, $reduction_vars, type($reduction_vars), $reduction_byref, - $reduction_syms) attr-dict + $private_syms, $private_needs_barrier, $reduction_mod, $reduction_vars, + type($reduction_vars), $reduction_byref, $reduction_syms) attr-dict }]; let hasVerifier = 1; @@ -258,8 +258,8 @@ def TeamsOp : OpenMP_Op<"teams", traits = [ let assemblyFormat = clausesAssemblyFormat # [{ custom($region, $private_vars, type($private_vars), - $private_syms, $reduction_mod, $reduction_vars, type($reduction_vars), $reduction_byref, - $reduction_syms) attr-dict + $private_syms, $private_needs_barrier, $reduction_mod, $reduction_vars, + type($reduction_vars), $reduction_byref, $reduction_syms) attr-dict }]; let hasVerifier = 1; @@ -317,8 +317,8 @@ def SectionsOp : OpenMP_Op<"sections", traits = [ let assemblyFormat = clausesAssemblyFormat # [{ custom($region, $private_vars, type($private_vars), - $private_syms, $reduction_mod, $reduction_vars, type($reduction_vars), $reduction_byref, - $reduction_syms) attr-dict + $private_syms, $private_needs_barrier, $reduction_mod, $reduction_vars, + type($reduction_vars), $reduction_byref, $reduction_syms) attr-dict }]; let hasVerifier = 1; @@ -350,7 +350,7 @@ def SingleOp : OpenMP_Op<"single", traits = [ let assemblyFormat = clausesAssemblyFormat # [{ custom($region, $private_vars, type($private_vars), - $private_syms) attr-dict + $private_syms, $private_needs_barrier) attr-dict }]; let hasVerifier = 1; @@ -505,8 +505,8 @@ def LoopOp : OpenMP_Op<"loop", traits = [ let assemblyFormat = clausesAssemblyFormat # 
[{ custom($region, $private_vars, type($private_vars), - $private_syms, $reduction_mod, $reduction_vars, type($reduction_vars), $reduction_byref, - $reduction_syms) attr-dict + $private_syms, $private_needs_barrier, $reduction_mod, $reduction_vars, + type($reduction_vars), $reduction_byref, $reduction_syms) attr-dict }]; let builders = [ @@ -557,8 +557,8 @@ def WsloopOp : OpenMP_Op<"wsloop", traits = [ let assemblyFormat = clausesAssemblyFormat # [{ custom($region, $private_vars, type($private_vars), - $private_syms, $reduction_mod, $reduction_vars, type($reduction_vars), $reduction_byref, - $reduction_syms) attr-dict + $private_syms, $private_needs_barrier, $reduction_mod, $reduction_vars, + type($reduction_vars), $reduction_byref, $reduction_syms) attr-dict }]; let hasVerifier = 1; @@ -611,8 +611,8 @@ def SimdOp : OpenMP_Op<"simd", traits = [ let assemblyFormat = clausesAssemblyFormat # [{ custom($region, $private_vars, type($private_vars), - $private_syms, $reduction_mod, $reduction_vars, type($reduction_vars), $reduction_byref, - $reduction_syms) attr-dict + $private_syms, $private_needs_barrier, $reduction_mod, $reduction_vars, + type($reduction_vars), $reduction_byref, $reduction_syms) attr-dict }]; let hasVerifier = 1; @@ -690,7 +690,7 @@ def DistributeOp : OpenMP_Op<"distribute", traits = [ let assemblyFormat = clausesAssemblyFormat # [{ custom($region, $private_vars, type($private_vars), - $private_syms) attr-dict + $private_syms, $private_needs_barrier) attr-dict }]; let hasVerifier = 1; @@ -740,7 +740,7 @@ def TaskOp custom( $region, $in_reduction_vars, type($in_reduction_vars), $in_reduction_byref, $in_reduction_syms, $private_vars, - type($private_vars), $private_syms) attr-dict + type($private_vars), $private_syms, $private_needs_barrier) attr-dict }]; let hasVerifier = 1; @@ -816,8 +816,9 @@ def TaskloopOp : OpenMP_Op<"taskloop", traits = [ custom( $region, $in_reduction_vars, type($in_reduction_vars), $in_reduction_byref, $in_reduction_syms, 
$private_vars, - type($private_vars), $private_syms, $reduction_mod, $reduction_vars, - type($reduction_vars), $reduction_byref, $reduction_syms) attr-dict + type($private_vars), $private_syms, $private_needs_barrier, + $reduction_mod, $reduction_vars, type($reduction_vars), + $reduction_byref, $reduction_syms) attr-dict }]; let extraClassDeclaration = [{ @@ -1324,7 +1325,7 @@ def TargetOp : OpenMP_Op<"target", traits = [ $host_eval_vars, type($host_eval_vars), $in_reduction_vars, type($in_reduction_vars), $in_reduction_byref, $in_reduction_syms, $map_vars, type($map_vars), $private_vars, type($private_vars), - $private_syms, $private_maps) attr-dict + $private_syms, $private_needs_barrier, $private_maps) attr-dict }]; let hasVerifier = 1; diff --git a/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp b/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp index 233739e1d6d91..71786e856c6db 100644 --- a/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp +++ b/mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp @@ -450,6 +450,7 @@ struct ParallelOpLowering : public OpRewritePattern { /* num_threads = */ numThreadsVar, /* private_vars = */ ValueRange(), /* private_syms = */ nullptr, + /* private_needs_barrier = */ nullptr, /* proc_bind_kind = */ omp::ClauseProcBindKindAttr{}, /* reduction_mod = */ nullptr, /* reduction_vars = */ llvm::SmallVector{}, diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp index 2bf7aaa46db11..57a54f21fe9de 100644 --- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp +++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp @@ -581,11 +581,14 @@ struct PrivateParseArgs { llvm::SmallVectorImpl &vars; llvm::SmallVectorImpl &types; ArrayAttr &syms; + UnitAttr &needsBarrier; DenseI64ArrayAttr *mapIndices; PrivateParseArgs(SmallVectorImpl &vars, SmallVectorImpl &types, ArrayAttr &syms, + UnitAttr &needsBarrier, DenseI64ArrayAttr *mapIndices = nullptr) - : vars(vars), types(types), syms(syms), mapIndices(mapIndices) 
{} + : vars(vars), types(types), syms(syms), needsBarrier(needsBarrier), + mapIndices(mapIndices) {} }; struct ReductionParseArgs { @@ -613,6 +616,10 @@ struct AllRegionParseArgs { }; } // namespace +static inline constexpr StringRef getPrivateNeedsBarrierSpelling() { + return "private_barrier"; +} + static ParseResult parseClauseWithRegionArgs( OpAsmParser &parser, SmallVectorImpl &operands, @@ -620,7 +627,8 @@ static ParseResult parseClauseWithRegionArgs( SmallVectorImpl &regionPrivateArgs, ArrayAttr *symbols = nullptr, DenseI64ArrayAttr *mapIndices = nullptr, DenseBoolArrayAttr *byref = nullptr, - ReductionModifierAttr *modifier = nullptr) { + ReductionModifierAttr *modifier = nullptr, + UnitAttr *needsBarrier = nullptr) { SmallVector symbolVec; SmallVector mapIndicesVec; SmallVector isByRefVec; @@ -688,6 +696,12 @@ static ParseResult parseClauseWithRegionArgs( if (parser.parseRParen()) return failure(); + if (needsBarrier) { + if (parser.parseOptionalKeyword(getPrivateNeedsBarrierSpelling()) + .succeeded()) + *needsBarrier = mlir::UnitAttr::get(parser.getContext()); + } + auto *argsBegin = regionPrivateArgs.begin(); MutableArrayRef argsSubrange(argsBegin + regionArgOffset, argsBegin + regionArgOffset + types.size()); @@ -735,7 +749,8 @@ static ParseResult parseBlockArgClause( if (failed(parseClauseWithRegionArgs( parser, privateArgs->vars, privateArgs->types, entryBlockArgs, - &privateArgs->syms, privateArgs->mapIndices))) + &privateArgs->syms, privateArgs->mapIndices, /*byref=*/nullptr, + /*modifier=*/nullptr, &privateArgs->needsBarrier))) return failure(); } return success(); @@ -824,7 +839,7 @@ static ParseResult parseTargetOpRegion( SmallVectorImpl &mapTypes, llvm::SmallVectorImpl &privateVars, llvm::SmallVectorImpl &privateTypes, ArrayAttr &privateSyms, - DenseI64ArrayAttr &privateMaps) { + UnitAttr &privateNeedsBarrier, DenseI64ArrayAttr &privateMaps) { AllRegionParseArgs args; args.hasDeviceAddrArgs.emplace(hasDeviceAddrVars, hasDeviceAddrTypes);
args.hostEvalArgs.emplace(hostEvalVars, hostEvalTypes); @@ -832,7 +847,7 @@ static ParseResult parseTargetOpRegion( inReductionByref, inReductionSyms); args.mapArgs.emplace(mapVars, mapTypes); args.privateArgs.emplace(privateVars, privateTypes, privateSyms, - &privateMaps); + privateNeedsBarrier, &privateMaps); return parseBlockArgRegion(parser, region, args); } @@ -842,11 +857,13 @@ static ParseResult parseInReductionPrivateRegion( SmallVectorImpl &inReductionTypes, DenseBoolArrayAttr &inReductionByref, ArrayAttr &inReductionSyms, llvm::SmallVectorImpl &privateVars, - llvm::SmallVectorImpl &privateTypes, ArrayAttr &privateSyms) { + llvm::SmallVectorImpl &privateTypes, ArrayAttr &privateSyms, + UnitAttr &privateNeedsBarrier) { AllRegionParseArgs args; args.inReductionArgs.emplace(inReductionVars, inReductionTypes, inReductionByref, inReductionSyms); - args.privateArgs.emplace(privateVars, privateTypes, privateSyms); + args.privateArgs.emplace(privateVars, privateTypes, privateSyms, + privateNeedsBarrier); return parseBlockArgRegion(parser, region, args); } @@ -857,14 +874,15 @@ static ParseResult parseInReductionPrivateReductionRegion( DenseBoolArrayAttr &inReductionByref, ArrayAttr &inReductionSyms, llvm::SmallVectorImpl &privateVars, llvm::SmallVectorImpl &privateTypes, ArrayAttr &privateSyms, - ReductionModifierAttr &reductionMod, + UnitAttr &privateNeedsBarrier, ReductionModifierAttr &reductionMod, SmallVectorImpl &reductionVars, SmallVectorImpl &reductionTypes, DenseBoolArrayAttr &reductionByref, ArrayAttr &reductionSyms) { AllRegionParseArgs args; args.inReductionArgs.emplace(inReductionVars, inReductionTypes, inReductionByref, inReductionSyms); - args.privateArgs.emplace(privateVars, privateTypes, privateSyms); + args.privateArgs.emplace(privateVars, privateTypes, privateSyms, + privateNeedsBarrier); args.reductionArgs.emplace(reductionVars, reductionTypes, reductionByref, reductionSyms, &reductionMod); return parseBlockArgRegion(parser, region, args); @@ 
-873,9 +891,11 @@ static ParseResult parseInReductionPrivateReductionRegion( static ParseResult parsePrivateRegion( OpAsmParser &parser, Region &region, llvm::SmallVectorImpl &privateVars, - llvm::SmallVectorImpl &privateTypes, ArrayAttr &privateSyms) { + llvm::SmallVectorImpl &privateTypes, ArrayAttr &privateSyms, + UnitAttr &privateNeedsBarrier) { AllRegionParseArgs args; - args.privateArgs.emplace(privateVars, privateTypes, privateSyms); + args.privateArgs.emplace(privateVars, privateTypes, privateSyms, + privateNeedsBarrier); return parseBlockArgRegion(parser, region, args); } @@ -883,12 +903,13 @@ static ParseResult parsePrivateReductionRegion( OpAsmParser &parser, Region &region, llvm::SmallVectorImpl &privateVars, llvm::SmallVectorImpl &privateTypes, ArrayAttr &privateSyms, - ReductionModifierAttr &reductionMod, + UnitAttr &privateNeedsBarrier, ReductionModifierAttr &reductionMod, SmallVectorImpl &reductionVars, SmallVectorImpl &reductionTypes, DenseBoolArrayAttr &reductionByref, ArrayAttr &reductionSyms) { AllRegionParseArgs args; - args.privateArgs.emplace(privateVars, privateTypes, privateSyms); + args.privateArgs.emplace(privateVars, privateTypes, privateSyms, + privateNeedsBarrier); args.reductionArgs.emplace(reductionVars, reductionTypes, reductionByref, reductionSyms, &reductionMod); return parseBlockArgRegion(parser, region, args); @@ -931,10 +952,12 @@ struct PrivatePrintArgs { ValueRange vars; TypeRange types; ArrayAttr syms; + UnitAttr needsBarrier; DenseI64ArrayAttr mapIndices; PrivatePrintArgs(ValueRange vars, TypeRange types, ArrayAttr syms, - DenseI64ArrayAttr mapIndices) - : vars(vars), types(types), syms(syms), mapIndices(mapIndices) {} + UnitAttr needsBarrier, DenseI64ArrayAttr mapIndices) + : vars(vars), types(types), syms(syms), needsBarrier(needsBarrier), + mapIndices(mapIndices) {} }; struct ReductionPrintArgs { ValueRange vars; @@ -964,7 +987,7 @@ static void printClauseWithRegionArgs( ValueRange argsSubrange, ValueRange operands, TypeRange
types, ArrayAttr symbols = nullptr, DenseI64ArrayAttr mapIndices = nullptr, DenseBoolArrayAttr byref = nullptr, - ReductionModifierAttr modifier = nullptr) { + ReductionModifierAttr modifier = nullptr, UnitAttr needsBarrier = nullptr) { if (argsSubrange.empty()) return; @@ -1006,6 +1029,9 @@ static void printClauseWithRegionArgs( p << " : "; llvm::interleaveComma(types, p); p << ") "; + + if (needsBarrier) + p << getPrivateNeedsBarrierSpelling() << " "; } static void printBlockArgClause(OpAsmPrinter &p, MLIRContext *ctx, @@ -1020,9 +1046,10 @@ static void printBlockArgClause(OpAsmPrinter &p, MLIRContext *ctx, StringRef clauseName, ValueRange argsSubrange, std::optional privateArgs) { if (privateArgs) - printClauseWithRegionArgs(p, ctx, clauseName, argsSubrange, - privateArgs->vars, privateArgs->types, - privateArgs->syms, privateArgs->mapIndices); + printClauseWithRegionArgs( + p, ctx, clauseName, argsSubrange, privateArgs->vars, privateArgs->types, + privateArgs->syms, privateArgs->mapIndices, /*byref=*/nullptr, + /*modifier=*/nullptr, privateArgs->needsBarrier); } static void @@ -1068,23 +1095,23 @@ static void printBlockArgRegion(OpAsmPrinter &p, Operation *op, Region &region, // These parseXyz functions correspond to the custom definitions // in the .td file(s).
-static void -printTargetOpRegion(OpAsmPrinter &p, Operation *op, Region &region, - ValueRange hasDeviceAddrVars, TypeRange hasDeviceAddrTypes, - ValueRange hostEvalVars, TypeRange hostEvalTypes, - ValueRange inReductionVars, TypeRange inReductionTypes, - DenseBoolArrayAttr inReductionByref, - ArrayAttr inReductionSyms, ValueRange mapVars, - TypeRange mapTypes, ValueRange privateVars, - TypeRange privateTypes, ArrayAttr privateSyms, - DenseI64ArrayAttr privateMaps) { +static void printTargetOpRegion( + OpAsmPrinter &p, Operation *op, Region &region, + ValueRange hasDeviceAddrVars, TypeRange hasDeviceAddrTypes, + ValueRange hostEvalVars, TypeRange hostEvalTypes, + ValueRange inReductionVars, TypeRange inReductionTypes, + DenseBoolArrayAttr inReductionByref, ArrayAttr inReductionSyms, + ValueRange mapVars, TypeRange mapTypes, ValueRange privateVars, + TypeRange privateTypes, ArrayAttr privateSyms, UnitAttr privateNeedsBarrier, + DenseI64ArrayAttr privateMaps) { AllRegionPrintArgs args; args.hasDeviceAddrArgs.emplace(hasDeviceAddrVars, hasDeviceAddrTypes); args.hostEvalArgs.emplace(hostEvalVars, hostEvalTypes); args.inReductionArgs.emplace(inReductionVars, inReductionTypes, inReductionByref, inReductionSyms); args.mapArgs.emplace(mapVars, mapTypes); - args.privateArgs.emplace(privateVars, privateTypes, privateSyms, privateMaps); + args.privateArgs.emplace(privateVars, privateTypes, privateSyms, + privateNeedsBarrier, privateMaps); printBlockArgRegion(p, op, region, args); } @@ -1092,11 +1119,12 @@ static void printInReductionPrivateRegion( OpAsmPrinter &p, Operation *op, Region &region, ValueRange inReductionVars, TypeRange inReductionTypes, DenseBoolArrayAttr inReductionByref, ArrayAttr inReductionSyms, ValueRange privateVars, TypeRange privateTypes, - ArrayAttr privateSyms) { + ArrayAttr privateSyms, UnitAttr privateNeedsBarrier) { AllRegionPrintArgs args; args.inReductionArgs.emplace(inReductionVars, inReductionTypes, inReductionByref, inReductionSyms);
args.privateArgs.emplace(privateVars, privateTypes, privateSyms, + privateNeedsBarrier, /*mapIndices=*/nullptr); printBlockArgRegion(p, op, region, args); } @@ -1105,13 +1133,15 @@ static void printInReductionPrivateReductionRegion( OpAsmPrinter &p, Operation *op, Region &region, ValueRange inReductionVars, TypeRange inReductionTypes, DenseBoolArrayAttr inReductionByref, ArrayAttr inReductionSyms, ValueRange privateVars, TypeRange privateTypes, - ArrayAttr privateSyms, ReductionModifierAttr reductionMod, - ValueRange reductionVars, TypeRange reductionTypes, - DenseBoolArrayAttr reductionByref, ArrayAttr reductionSyms) { + ArrayAttr privateSyms, UnitAttr privateNeedsBarrier, + ReductionModifierAttr reductionMod, ValueRange reductionVars, + TypeRange reductionTypes, DenseBoolArrayAttr reductionByref, + ArrayAttr reductionSyms) { AllRegionPrintArgs args; args.inReductionArgs.emplace(inReductionVars, inReductionTypes, inReductionByref, inReductionSyms); args.privateArgs.emplace(privateVars, privateTypes, privateSyms, + privateNeedsBarrier, /*mapIndices=*/nullptr); args.reductionArgs.emplace(reductionVars, reductionTypes, reductionByref, reductionSyms, reductionMod); @@ -1120,21 +1150,24 @@ static void printInReductionPrivateReductionRegion( static void printPrivateRegion(OpAsmPrinter &p, Operation *op, Region &region, ValueRange privateVars, TypeRange privateTypes, - ArrayAttr privateSyms) { + ArrayAttr privateSyms, + UnitAttr privateNeedsBarrier) { AllRegionPrintArgs args; args.privateArgs.emplace(privateVars, privateTypes, privateSyms, + privateNeedsBarrier, /*mapIndices=*/nullptr); printBlockArgRegion(p, op, region, args); } static void printPrivateReductionRegion( OpAsmPrinter &p, Operation *op, Region &region, ValueRange privateVars, - TypeRange privateTypes, ArrayAttr privateSyms, + TypeRange privateTypes, ArrayAttr privateSyms, UnitAttr privateNeedsBarrier, ReductionModifierAttr reductionMod, ValueRange reductionVars, TypeRange reductionTypes, DenseBoolArrayAttr
reductionByref, ArrayAttr reductionSyms) { AllRegionPrintArgs args; args.privateArgs.emplace(privateVars, privateTypes, privateSyms, + privateNeedsBarrier, /*mapIndices=*/nullptr); args.reductionArgs.emplace(reductionVars, reductionTypes, reductionByref, reductionSyms, reductionMod); @@ -1884,7 +1917,8 @@ void TargetOp::build(OpBuilder &builder, OperationState &state, /*in_reduction_vars=*/{}, /*in_reduction_byref=*/nullptr, /*in_reduction_syms=*/nullptr, clauses.isDevicePtrVars, clauses.mapVars, clauses.nowait, clauses.privateVars, - makeArrayAttr(ctx, clauses.privateSyms), clauses.threadLimit, + makeArrayAttr(ctx, clauses.privateSyms), + clauses.privateNeedsBarrier, clauses.threadLimit, /*private_maps=*/nullptr); } @@ -2149,7 +2183,8 @@ void ParallelOp::build(OpBuilder &builder, OperationState &state, ParallelOp::build(builder, state, /*allocate_vars=*/ValueRange(), /*allocator_vars=*/ValueRange(), /*if_expr=*/nullptr, /*num_threads=*/nullptr, /*private_vars=*/ValueRange(), - /*private_syms=*/nullptr, /*proc_bind_kind=*/nullptr, + /*private_syms=*/nullptr, /*private_needs_barrier=*/nullptr, + /*proc_bind_kind=*/nullptr, /*reduction_mod =*/nullptr, /*reduction_vars=*/ValueRange(), /*reduction_byref=*/nullptr, /*reduction_syms=*/nullptr); state.addAttributes(attributes); @@ -2161,8 +2196,8 @@ void ParallelOp::build(OpBuilder &builder, OperationState &state, ParallelOp::build(builder, state, clauses.allocateVars, clauses.allocatorVars, clauses.ifExpr, clauses.numThreads, clauses.privateVars, makeArrayAttr(ctx, clauses.privateSyms), - clauses.procBindKind, clauses.reductionMod, - clauses.reductionVars, + clauses.privateNeedsBarrier, clauses.procBindKind, + clauses.reductionMod, clauses.reductionVars, makeDenseBoolArrayAttr(ctx, clauses.reductionByref), makeArrayAttr(ctx, clauses.reductionSyms)); } @@ -2266,11 +2301,12 @@ static bool opInGlobalImplicitParallelRegion(Operation *op) { void TeamsOp::build(OpBuilder &builder, OperationState &state, const TeamsOperands 
&clauses) { MLIRContext *ctx = builder.getContext(); - // TODO Store clauses in op: privateVars, privateSyms. + // TODO Store clauses in op: privateVars, privateSyms, privateNeedsBarrier TeamsOp::build(builder, state, clauses.allocateVars, clauses.allocatorVars, clauses.ifExpr, clauses.numTeamsLower, clauses.numTeamsUpper, /*private_vars=*/{}, /*private_syms=*/nullptr, - clauses.reductionMod, clauses.reductionVars, + /*private_needs_barrier=*/nullptr, clauses.reductionMod, + clauses.reductionVars, makeDenseBoolArrayAttr(ctx, clauses.reductionByref), makeArrayAttr(ctx, clauses.reductionSyms), clauses.threadLimit); @@ -2327,11 +2363,11 @@ OperandRange SectionOp::getReductionVars() { void SectionsOp::build(OpBuilder &builder, OperationState &state, const SectionsOperands &clauses) { MLIRContext *ctx = builder.getContext(); - // TODO Store clauses in op: privateVars, privateSyms. + // TODO Store clauses in op: privateVars, privateSyms, privateNeedsBarrier SectionsOp::build(builder, state, clauses.allocateVars, clauses.allocatorVars, clauses.nowait, /*private_vars=*/{}, - /*private_syms=*/nullptr, clauses.reductionMod, - clauses.reductionVars, + /*private_syms=*/nullptr, /*private_needs_barrier=*/nullptr, + clauses.reductionMod, clauses.reductionVars, makeDenseBoolArrayAttr(ctx, clauses.reductionByref), makeArrayAttr(ctx, clauses.reductionSyms)); } @@ -2363,11 +2399,12 @@ LogicalResult SectionsOp::verifyRegions() { void SingleOp::build(OpBuilder &builder, OperationState &state, const SingleOperands &clauses) { MLIRContext *ctx = builder.getContext(); - // TODO Store clauses in op: privateVars, privateSyms. 
+ // TODO Store clauses in op: privateVars, privateSyms, privateNeedsBarrier SingleOp::build(builder, state, clauses.allocateVars, clauses.allocatorVars, clauses.copyprivateVars, makeArrayAttr(ctx, clauses.copyprivateSyms), clauses.nowait, - /*private_vars=*/{}, /*private_syms=*/nullptr); + /*private_vars=*/{}, /*private_syms=*/nullptr, + /*private_needs_barrier=*/nullptr); } LogicalResult SingleOp::verify() { @@ -2443,8 +2480,9 @@ void LoopOp::build(OpBuilder &builder, OperationState &state, MLIRContext *ctx = builder.getContext(); LoopOp::build(builder, state, clauses.bindKind, clauses.privateVars, - makeArrayAttr(ctx, clauses.privateSyms), clauses.order, - clauses.orderMod, clauses.reductionMod, clauses.reductionVars, + makeArrayAttr(ctx, clauses.privateSyms), + clauses.privateNeedsBarrier, clauses.order, clauses.orderMod, + clauses.reductionMod, clauses.reductionVars, makeDenseBoolArrayAttr(ctx, clauses.reductionByref), makeArrayAttr(ctx, clauses.reductionSyms)); } @@ -2472,6 +2510,7 @@ void WsloopOp::build(OpBuilder &builder, OperationState &state, /*linear_vars=*/ValueRange(), /*linear_step_vars=*/ValueRange(), /*nowait=*/false, /*order=*/nullptr, /*order_mod=*/nullptr, /*ordered=*/nullptr, /*private_vars=*/{}, /*private_syms=*/nullptr, + /*private_needs_barrier=*/false, /*reduction_mod=*/nullptr, /*reduction_vars=*/ValueRange(), /*reduction_byref=*/nullptr, /*reduction_syms=*/nullptr, /*schedule_kind=*/nullptr, @@ -2483,18 +2522,17 @@ void WsloopOp::build(OpBuilder &builder, OperationState &state, void WsloopOp::build(OpBuilder &builder, OperationState &state, const WsloopOperands &clauses) { MLIRContext *ctx = builder.getContext(); - // TODO: Store clauses in op: allocateVars, allocatorVars, privateVars, - // privateSyms. 
- WsloopOp::build(builder, state, - /*allocate_vars=*/{}, /*allocator_vars=*/{}, - clauses.linearVars, clauses.linearStepVars, clauses.nowait, - clauses.order, clauses.orderMod, clauses.ordered, - clauses.privateVars, makeArrayAttr(ctx, clauses.privateSyms), - clauses.reductionMod, clauses.reductionVars, - makeDenseBoolArrayAttr(ctx, clauses.reductionByref), - makeArrayAttr(ctx, clauses.reductionSyms), - clauses.scheduleKind, clauses.scheduleChunk, - clauses.scheduleMod, clauses.scheduleSimd); + // TODO: Store clauses in op: allocateVars, allocatorVars + WsloopOp::build( + builder, state, + /*allocate_vars=*/{}, /*allocator_vars=*/{}, clauses.linearVars, + clauses.linearStepVars, clauses.nowait, clauses.order, clauses.orderMod, + clauses.ordered, clauses.privateVars, + makeArrayAttr(ctx, clauses.privateSyms), clauses.privateNeedsBarrier, + clauses.reductionMod, clauses.reductionVars, + makeDenseBoolArrayAttr(ctx, clauses.reductionByref), + makeArrayAttr(ctx, clauses.reductionSyms), clauses.scheduleKind, + clauses.scheduleChunk, clauses.scheduleMod, clauses.scheduleSimd); } LogicalResult WsloopOp::verify() { @@ -2534,14 +2572,14 @@ LogicalResult WsloopOp::verifyRegions() { void SimdOp::build(OpBuilder &builder, OperationState &state, const SimdOperands &clauses) { MLIRContext *ctx = builder.getContext(); - // TODO Store clauses in op: linearVars, linearStepVars, privateVars, - // privateSyms. 
+ // TODO Store clauses in op: linearVars, linearStepVars SimdOp::build(builder, state, clauses.alignedVars, makeArrayAttr(ctx, clauses.alignments), clauses.ifExpr, /*linear_vars=*/{}, /*linear_step_vars=*/{}, clauses.nontemporalVars, clauses.order, clauses.orderMod, clauses.privateVars, makeArrayAttr(ctx, clauses.privateSyms), - clauses.reductionMod, clauses.reductionVars, + clauses.privateNeedsBarrier, clauses.reductionMod, + clauses.reductionVars, makeDenseBoolArrayAttr(ctx, clauses.reductionByref), makeArrayAttr(ctx, clauses.reductionSyms), clauses.safelen, clauses.simdlen); @@ -2591,7 +2629,8 @@ void DistributeOp::build(OpBuilder &builder, OperationState &state, clauses.allocatorVars, clauses.distScheduleStatic, clauses.distScheduleChunkSize, clauses.order, clauses.orderMod, clauses.privateVars, - makeArrayAttr(builder.getContext(), clauses.privateSyms)); + makeArrayAttr(builder.getContext(), clauses.privateSyms), + clauses.privateNeedsBarrier); } LogicalResult DistributeOp::verify() { @@ -2747,7 +2786,8 @@ void TaskOp::build(OpBuilder &builder, OperationState &state, makeArrayAttr(ctx, clauses.inReductionSyms), clauses.mergeable, clauses.priority, /*private_vars=*/clauses.privateVars, /*private_syms=*/makeArrayAttr(ctx, clauses.privateSyms), - clauses.untied, clauses.eventHandle); + clauses.privateNeedsBarrier, clauses.untied, + clauses.eventHandle); } LogicalResult TaskOp::verify() { @@ -2786,18 +2826,18 @@ LogicalResult TaskgroupOp::verify() { void TaskloopOp::build(OpBuilder &builder, OperationState &state, const TaskloopOperands &clauses) { MLIRContext *ctx = builder.getContext(); - TaskloopOp::build(builder, state, clauses.allocateVars, clauses.allocatorVars, - clauses.final, clauses.grainsizeMod, clauses.grainsize, - clauses.ifExpr, clauses.inReductionVars, - makeDenseBoolArrayAttr(ctx, clauses.inReductionByref), - makeArrayAttr(ctx, clauses.inReductionSyms), - clauses.mergeable, clauses.nogroup, clauses.numTasksMod, - clauses.numTasks, 
clauses.priority, - /*private_vars=*/clauses.privateVars, - /*private_syms=*/makeArrayAttr(ctx, clauses.privateSyms), - clauses.reductionMod, clauses.reductionVars, - makeDenseBoolArrayAttr(ctx, clauses.reductionByref), - makeArrayAttr(ctx, clauses.reductionSyms), clauses.untied); + TaskloopOp::build( + builder, state, clauses.allocateVars, clauses.allocatorVars, + clauses.final, clauses.grainsizeMod, clauses.grainsize, clauses.ifExpr, + clauses.inReductionVars, + makeDenseBoolArrayAttr(ctx, clauses.inReductionByref), + makeArrayAttr(ctx, clauses.inReductionSyms), clauses.mergeable, + clauses.nogroup, clauses.numTasksMod, clauses.numTasks, clauses.priority, + /*private_vars=*/clauses.privateVars, + /*private_syms=*/makeArrayAttr(ctx, clauses.privateSyms), + clauses.privateNeedsBarrier, clauses.reductionMod, clauses.reductionVars, + makeDenseBoolArrayAttr(ctx, clauses.reductionByref), + makeArrayAttr(ctx, clauses.reductionSyms), clauses.untied); } LogicalResult TaskloopOp::verify() { diff --git a/mlir/test/Dialect/OpenMP/ops.mlir b/mlir/test/Dialect/OpenMP/ops.mlir index b7e16b7ec35e2..3eef3799c4b45 100644 --- a/mlir/test/Dialect/OpenMP/ops.mlir +++ b/mlir/test/Dialect/OpenMP/ops.mlir @@ -2872,6 +2872,23 @@ func.func @parallel_op_privatizers(%arg0: !llvm.ptr, %arg1: !llvm.ptr) { return } +// CHECK-LABEL: parallel_op_privatizers_barrier +// CHECK-SAME: (%[[ARG0:[^[:space:]]+]]: !llvm.ptr, %[[ARG1:[^[:space:]]+]]: !llvm.ptr) +func.func @parallel_op_privatizers_barrier(%arg0: !llvm.ptr, %arg1: !llvm.ptr) { + // CHECK: omp.parallel private( + // CHECK-SAME: @x.privatizer %[[ARG0]] -> %[[ARG0_PRIV:[^[:space:]]+]], + // CHECK-SAME: @y.privatizer %[[ARG1]] -> %[[ARG1_PRIV:[^[:space:]]+]] : !llvm.ptr, !llvm.ptr) + // CHECK-SAME: private_barrier + omp.parallel private(@x.privatizer %arg0 -> %arg2, @y.privatizer %arg1 -> %arg3 : !llvm.ptr, !llvm.ptr) private_barrier { + // CHECK: llvm.load %[[ARG0_PRIV]] + %0 = llvm.load %arg2 : !llvm.ptr -> i32 + // CHECK: llvm.load 
%[[ARG1_PRIV]] + %1 = llvm.load %arg3 : !llvm.ptr -> i32 + omp.terminator + } + return +} + // CHECK-LABEL: omp.private {type = private} @a.privatizer : !llvm.ptr init { omp.private {type = private} @a.privatizer : !llvm.ptr init { // CHECK: ^bb0(%{{.*}}: {{.*}}, %{{.*}}: {{.*}}): >From bcbe5a7068270c26522b9acade1ab4c84d0b62a1 Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Tue, 13 May 2025 17:18:27 +0000 Subject: [PATCH 2/4] [mlir][OpenMP] Add translation of private_barrier attr to LLVMIR --- .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 37 +++++++++++++------ .../Target/LLVMIR/openmp-wsloop-private.mlir | 4 +- 2 files changed, 28 insertions(+), 13 deletions(-) diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 9f7b5605556e6..65d496ad8b774 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -1512,10 +1512,11 @@ allocatePrivateVars(llvm::IRBuilderBase &builder, } static LogicalResult copyFirstPrivateVars( - llvm::IRBuilderBase &builder, LLVM::ModuleTranslation &moduleTranslation, + mlir::Operation *op, llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, SmallVectorImpl &mlirPrivateVars, ArrayRef llvmPrivateVars, - SmallVectorImpl &privateDecls, + SmallVectorImpl &privateDecls, bool insertBarrier, llvm::DenseMap *mappedPrivateVars = nullptr) { // Apply copy region for firstprivate. 
bool needsFirstprivate = @@ -1563,6 +1564,14 @@ static LogicalResult copyFirstPrivateVars( moduleTranslation.forgetMapping(copyRegion); } + if (insertBarrier) { + llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder(); + llvm::OpenMPIRBuilder::InsertPointOrErrorTy res = + ompBuilder->createBarrier(builder.saveIP(), llvm::omp::OMPD_barrier); + if (failed(handleError(res, *op))) + return failure(); + } + return success(); } @@ -2171,8 +2180,9 @@ convertOmpTaskOp(omp::TaskOp taskOp, llvm::IRBuilderBase &builder, // firstprivate copy region setInsertPointForPossiblyEmptyBlock(builder, copyBlock); if (failed(copyFirstPrivateVars( - builder, moduleTranslation, privateVarsInfo.mlirVars, - taskStructMgr.getLLVMPrivateVarGEPs(), privateVarsInfo.privatizers))) + taskOp, builder, moduleTranslation, privateVarsInfo.mlirVars, + taskStructMgr.getLLVMPrivateVarGEPs(), privateVarsInfo.privatizers, + taskOp.getPrivateNeedsBarrier()))) return llvm::failure(); // Set up for call to createTask() @@ -2392,8 +2402,9 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, return failure(); if (failed(copyFirstPrivateVars( - builder, moduleTranslation, privateVarsInfo.mlirVars, - privateVarsInfo.llvmVars, privateVarsInfo.privatizers))) + wsloopOp, builder, moduleTranslation, privateVarsInfo.mlirVars, + privateVarsInfo.llvmVars, privateVarsInfo.privatizers, + wsloopOp.getPrivateNeedsBarrier()))) return failure(); assert(afterAllocas.get()->getSinglePredecessor()); @@ -2512,8 +2523,9 @@ convertOmpParallel(omp::ParallelOp opInst, llvm::IRBuilderBase &builder, return llvm::make_error(); if (failed(copyFirstPrivateVars( - builder, moduleTranslation, privateVarsInfo.mlirVars, - privateVarsInfo.llvmVars, privateVarsInfo.privatizers))) + opInst, builder, moduleTranslation, privateVarsInfo.mlirVars, + privateVarsInfo.llvmVars, privateVarsInfo.privatizers, + opInst.getPrivateNeedsBarrier()))) return llvm::make_error(); if (failed( @@ -4461,8 +4473,9 @@ 
convertOmpDistribute(Operation &opInst, llvm::IRBuilderBase &builder, return llvm::make_error(); if (failed(copyFirstPrivateVars( - builder, moduleTranslation, privVarsInfo.mlirVars, - privVarsInfo.llvmVars, privVarsInfo.privatizers))) + distributeOp, builder, moduleTranslation, privVarsInfo.mlirVars, + privVarsInfo.llvmVars, privVarsInfo.privatizers, + distributeOp.getPrivateNeedsBarrier()))) return llvm::make_error(); llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder(); @@ -5222,9 +5235,9 @@ convertOmpTarget(Operation &opInst, llvm::IRBuilderBase &builder, return llvm::make_error(); if (failed(copyFirstPrivateVars( - builder, moduleTranslation, privateVarsInfo.mlirVars, + targetOp, builder, moduleTranslation, privateVarsInfo.mlirVars, privateVarsInfo.llvmVars, privateVarsInfo.privatizers, - &mappedPrivateVars))) + targetOp.getPrivateNeedsBarrier(), &mappedPrivateVars))) return llvm::make_error(); SmallVector privateCleanupRegions; diff --git a/mlir/test/Target/LLVMIR/openmp-wsloop-private.mlir b/mlir/test/Target/LLVMIR/openmp-wsloop-private.mlir index 23a0ae5713aa2..0b1f45ad7ce1c 100644 --- a/mlir/test/Target/LLVMIR/openmp-wsloop-private.mlir +++ b/mlir/test/Target/LLVMIR/openmp-wsloop-private.mlir @@ -37,7 +37,7 @@ llvm.func @wsloop_private_(%arg0: !llvm.ptr {fir.bindc_name = "y"}) attributes { %7 = llvm.mlir.constant(10 : i32) : i32 %8 = llvm.mlir.constant(0 : i32) : i32 omp.parallel { - omp.wsloop private(@_QFwsloop_privateEc_firstprivate_ref_c8 %5 -> %arg1, @_QFwsloop_privateEi_private_ref_i32 %3 -> %arg2 : !llvm.ptr, !llvm.ptr) reduction(@max_f32 %1 -> %arg3 : !llvm.ptr) { + omp.wsloop private(@_QFwsloop_privateEc_firstprivate_ref_c8 %5 -> %arg1, @_QFwsloop_privateEi_private_ref_i32 %3 -> %arg2 : !llvm.ptr, !llvm.ptr) private_barrier reduction(@max_f32 %1 -> %arg3 : !llvm.ptr) { omp.loop_nest (%arg4) : i32 = (%8) to (%7) inclusive step (%6) { omp.yield } @@ -66,6 +66,8 @@ llvm.func @wsloop_private_(%arg0: !llvm.ptr {fir.bindc_name = 
"y"}) attributes { // CHECK: [[PRIVATE_CPY_BB:.*]]: // CHECK: %[[CHR_VAL:.*]] = load [1 x i8], ptr %{{.*}}, align 1 // CHECK: store [1 x i8] %[[CHR_VAL]], ptr %[[CHR]], align 1 +// CHECK: %[[THREAD_NUM:.*]] = call i32 @__kmpc_global_thread_num({{.*}}) +// CHECK: call void @__kmpc_barrier({{.*}}, i32 %[[THREAD_NUM]]) // CHECK: br label %[[RED_INIT_BB:.*]] // Third, check that reduction init took place. >From 4290004bb2bf9c7a978725262c0521902666fb06 Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Thu, 15 May 2025 11:31:49 +0000 Subject: [PATCH 3/4] [flang][OpenMP] use attribute for delayed privatization barrier Fixes #136357 The barrier needs to go between the copying into firstprivate variables and the initialization call for the OpenMP construct (e.g. wsloop). There is no way of expressing this in MLIR because for delayed privatization that is all implicit (added in MLIR->LLVMIR conversion). The previous approach put the barrier immediately before the wsloop (or similar). For delayed privatization, the firstprivate copy code would then be inserted after that, opening the possibility for the race observed in the bug report. This patch solves the issue by instead setting an attribute on the mlir operation, which will instruct openmp dialect to llvm ir conversion to insert a barrier in the correct place. 
--- flang/lib/Lower/OpenMP/DataSharingProcessor.cpp | 13 ++++++++++--- flang/lib/Lower/OpenMP/DataSharingProcessor.h | 2 +- flang/test/Lower/OpenMP/lastprivate-allocatable.f90 | 2 +- .../OpenMP/parallel-lastprivate-clause-scalar.f90 | 3 +-- .../Lower/OpenMP/same_var_first_lastprivate.f90 | 3 +-- 5 files changed, 14 insertions(+), 9 deletions(-) diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 7eec598645eac..0949fe84f209f 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -62,7 +62,7 @@ void DataSharingProcessor::processStep1( privatize(clauseOps); - insertBarrier(); + insertBarrier(clauseOps); } void DataSharingProcessor::processStep2(mlir::Operation *op, bool isLoop) { @@ -230,8 +230,15 @@ bool DataSharingProcessor::needBarrier() { return false; } -void DataSharingProcessor::insertBarrier() { - if (needBarrier()) +void DataSharingProcessor::insertBarrier( + mlir::omp::PrivateClauseOps *clauseOps) { + if (!needBarrier()) + return; + + if (useDelayedPrivatization) + clauseOps->privateNeedsBarrier = + mlir::UnitAttr::get(&converter.getMLIRContext()); + else firOpBuilder.create(converter.getCurrentLocation()); } diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.h b/flang/lib/Lower/OpenMP/DataSharingProcessor.h index 54a42fd199831..7787e4ffb03c2 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.h +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.h @@ -100,7 +100,7 @@ class DataSharingProcessor { const omp::ObjectList &objects, llvm::SetVector &symbolSet); void collectSymbolsForPrivatization(); - void insertBarrier(); + void insertBarrier(mlir::omp::PrivateClauseOps *clauseOps); void collectDefaultSymbols(); void collectImplicitSymbols(); void collectPreDeterminedSymbols(); diff --git a/flang/test/Lower/OpenMP/lastprivate-allocatable.f90 b/flang/test/Lower/OpenMP/lastprivate-allocatable.f90 index 
1d31edd16efea..c2626e14b51c7 100644 --- a/flang/test/Lower/OpenMP/lastprivate-allocatable.f90 +++ b/flang/test/Lower/OpenMP/lastprivate-allocatable.f90 @@ -8,7 +8,7 @@ ! CHECK: fir.store %[[VAL_2]] to %[[VAL_0]] : !fir.ref>> ! CHECK: %[[VAL_3:.*]]:2 = hlfir.declare %[[VAL_0]] {fortran_attrs = {{.*}}, uniq_name = "_QFEa"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) ! CHECK: omp.parallel { -! CHECK: omp.wsloop private(@{{.*}} %{{.*}} -> %{{.*}}, @{{.*}} %{{.*}} -> %[[VAL_17:.*]] : !fir.ref>>, !fir.ref) { +! CHECK: omp.wsloop private(@{{.*}} %{{.*}} -> %{{.*}}, @{{.*}} %{{.*}} -> %[[VAL_17:.*]] : !fir.ref>>, !fir.ref) private_barrier { ! CHECK: omp.loop_nest ! CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %{{.*}} {fortran_attrs = {{.*}}, uniq_name = "_QFEa"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) ! CHECK: %[[VAL_18:.*]]:2 = hlfir.declare %[[VAL_17]] {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) diff --git a/flang/test/Lower/OpenMP/parallel-lastprivate-clause-scalar.f90 b/flang/test/Lower/OpenMP/parallel-lastprivate-clause-scalar.f90 index 60de8fa6f46a2..5d37010f4095b 100644 --- a/flang/test/Lower/OpenMP/parallel-lastprivate-clause-scalar.f90 +++ b/flang/test/Lower/OpenMP/parallel-lastprivate-clause-scalar.f90 @@ -226,8 +226,7 @@ subroutine firstpriv_lastpriv_int(arg1, arg2) ! 
Firstprivate update -!CHECK-NEXT: omp.barrier -!CHECK: omp.wsloop private(@{{.*}} %{{.*}}#0 -> %[[CLONE1:.*]], @{{.*}} %{{.*}}#0 -> %[[IV:.*]] : !fir.ref, !fir.ref) { +!CHECK: omp.wsloop private(@{{.*}} %{{.*}}#0 -> %[[CLONE1:.*]], @{{.*}} %{{.*}}#0 -> %[[IV:.*]] : !fir.ref, !fir.ref) private_barrier { !CHECK-NEXT: omp.loop_nest (%[[INDX_WS:.*]]) : {{.*}} { !CHECK: %[[CLONE1_DECL:.*]]:2 = hlfir.declare %[[CLONE1]] {uniq_name = "_QFfirstpriv_lastpriv_int2Earg1"} : (!fir.ref) -> (!fir.ref, !fir.ref) diff --git a/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 b/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 index ee914f23aacf3..45d6f91f67f1f 100644 --- a/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 +++ b/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 @@ -20,8 +20,7 @@ subroutine first_and_lastprivate ! CHECK: func.func @{{.*}}first_and_lastprivate() ! CHECK: %[[ORIG_VAR_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "{{.*}}Evar"} ! CHECK: omp.parallel { -! CHECK: omp.barrier -! CHECK: omp.wsloop private(@{{.*}}var_firstprivate_i32 {{.*}}) { +! CHECK: omp.wsloop private(@{{.*}}var_firstprivate_i32 {{.*}}) private_barrier { ! CHECK: omp.loop_nest {{.*}} { ! CHECK: %[[PRIV_VAR_DECL:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "{{.*}}Evar"} ! 
CHECK: fir.if %{{.*}} { >From a5056c54864ec66e36b36961486d704de9f3511d Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Thu, 22 May 2025 14:17:38 +0000 Subject: [PATCH 4/4] Nit: add if statement around clauseOps usage --- flang/lib/Lower/OpenMP/DataSharingProcessor.cpp | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 0949fe84f209f..bfc04a0d1070e 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -235,11 +235,13 @@ void DataSharingProcessor::insertBarrier( if (!needBarrier()) return; - if (useDelayedPrivatization) - clauseOps->privateNeedsBarrier = - mlir::UnitAttr::get(&converter.getMLIRContext()); - else + if (useDelayedPrivatization) { + if (clauseOps) + clauseOps->privateNeedsBarrier = + mlir::UnitAttr::get(&converter.getMLIRContext()); + } else { firOpBuilder.create(converter.getCurrentLocation()); + } } void DataSharingProcessor::insertLastPrivateCompare(mlir::Operation *op) { From flang-commits at lists.llvm.org Thu May 22 07:26:00 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 22 May 2025 07:26:00 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [flang][OpenMP] use attribute for delayed privatization barrier (PR #140092) In-Reply-To: Message-ID: <682f33f8.170a0220.73062.e462@mx.google.com> https://github.com/tblah closed https://github.com/llvm/llvm-project/pull/140092 From flang-commits at lists.llvm.org Thu May 22 07:26:55 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Thu, 22 May 2025 07:26:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <682f342f.a70a0220.274e86.0a39@mx.google.com> https://github.com/mrkajetanp updated https://github.com/llvm/llvm-project/pull/138718 >From 
6bbba43416edf2c4d310d3e92e99ad458e87d8f7 Mon Sep 17 00:00:00 2001
From: Kajetan Puchalski
Date: Thu, 10 Apr 2025 14:04:52 +0000
Subject: [PATCH 1/4] [flang] Inline hlfir.copy_in for trivial types

hlfir.copy_in copies non-contiguous array slices for functions whose array arguments are required to be contiguous. It is implemented through a flang-rt function that calls memcpy/memmove separately on each element. For large arrays of trivial types, this can incur considerable overhead compared to a plain copy loop, which is better able to take advantage of hardware pipelines.

To address that, extend the InlineHLFIRAssign optimisation pass with a new pattern for inlining hlfir.copy_in operations for trivial types. For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in).

Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by about 1/3.
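
As a rough illustration of the shape of the inlined code (check contiguity, otherwise copy element by element into a temporary), here is a stand-alone C++ sketch. It is not the pass itself: `StridedView`, `CopyInResult` and `copyIn` are hypothetical names, the element type is fixed to `double`, and the contiguity test is a simplification of what a real descriptor check does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D strided view over doubles; stride is counted in elements.
struct StridedView {
  const double *base;
  std::size_t extent;
  std::size_t stride;
  // Simplified contiguity test for this model.
  bool isContiguous() const { return stride == 1 || extent <= 1; }
};

// Result of the copy-in: either the original storage or an owned temporary.
struct CopyInResult {
  const double *original;   // used when no copy was needed
  std::vector<double> temp; // owns the temporary when a copy was made
  bool wasCopied;           // tells the caller whether cleanup is needed
  const double *data() const { return wasCopied ? temp.data() : original; }
};

// Models the inlined pattern for the intent(in) case (no copy-out):
// pass contiguous input through, otherwise fill a temporary with a plain loop.
CopyInResult copyIn(const StridedView &view) {
  if (view.isContiguous())
    return {view.base, {}, false};

  CopyInResult result{view.base, std::vector<double>(view.extent), true};
  for (std::size_t i = 0; i < view.extent; ++i)
    result.temp[i] = view.base[i * view.stride];
  return result;
}
```

In this model the temporary is released by the `std::vector` destructor; the actual pattern instead frees the heap allocation at the position of the original copy-out, guarded by the was-copied flag.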
Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 117 ++++++++++++++++++ 1 file changed, 117 insertions(+) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 6e209cce07ad4..38c684eaceb7d 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,6 +13,7 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + auto results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + auto elem = hlfir::getElementAt(loc, builder, inputVariable, + loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + auto tempElem = hlfir::getElementAt(loc, builder, temp, + loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + auto refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + auto refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + auto addr = results[0]; + auto needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + auto tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. + rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); +} + class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -140,6 +256,7 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); + patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { >From 32a62862f141c3894c7a30b141ddf3b7701cb85b Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 7 May 2025 16:04:07 +0000 Subject: [PATCH 2/4] Add tests Signed-off-by: Kajetan Puchalski --- flang/test/HLFIR/inline-hlfir-assign.fir | 144 +++++++++++++++++++++++ 1 file changed, 144 insertions(+) diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index f834e7971e3d5..df7681b9c5c16 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,3 +353,147 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // 
CHECK: } + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, 
+// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result 
%[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 
dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] 
dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 
f34d4d56fd9818aa8cedf2a3600ab2205f65e047 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 8 May 2025 15:15:56 +0000 Subject: [PATCH 3/4] Address Tom's review comments Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 41 +++++++++++-------- 1 file changed, 23 insertions(+), 18 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 38c684eaceb7d..dc545ece8adff 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -158,16 +158,16 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, "CopyInOp's WasCopied has no uses"); // The copy out should always be present, either to actually copy or just // deallocate memory. - auto *copyOut = - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - if (!mlir::isa(copyOut)) + if (!copyOut) return rewriter.notifyMatchFailure(copyIn, "CopyInOp has no direct CopyOut"); // Only inline the copy_in when copy_out does not need to be done, i.e. in // case of intent(in). 
- if (::llvm::cast(copyOut).getVar()) + if (copyOut.getVar()) return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); inputVariable = @@ -175,7 +175,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); mlir::Value isContiguous = builder.create(loc, inputVariable); - auto results = + mlir::Operation::result_range results = builder .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, /*withElseRegion=*/true) @@ -195,11 +195,11 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, loc, builder, extents, /*isUnordered=*/true, flangomp::shouldUseWorkshareLowering(copyIn)); builder.setInsertionPointToStart(loopNest.body); - auto elem = hlfir::getElementAt(loc, builder, inputVariable, - loopNest.oneBasedIndices); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); elem = hlfir::loadTrivialScalar(loc, builder, elem); - auto tempElem = hlfir::getElementAt(loc, builder, temp, - loopNest.oneBasedIndices); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); builder.create(loc, elem, tempElem); builder.setInsertionPointAfter(loopNest.outerOp); @@ -209,9 +209,9 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, if (mlir::isa(temp.getType())) { result = temp; } else { - auto refTy = + fir::ReferenceType refTy = fir::ReferenceType::get(temp.getElementOrSequenceType()); - auto refVal = builder.createConvert(loc, refTy, temp); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); result = builder.create(loc, resultAddrType, refVal); } @@ -221,25 +221,30 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }) .getResults(); - auto addr = results[0]; - auto needsCleanup = results[1]; + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, 
needsCleanup, false).genThen([&] { + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { auto boxAddr = builder.create(loc, addr); - auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); builder.create(loc, heapVal); }); rewriter.eraseOp(copyOut); - auto tempBox = copyIn.getTempBox(); + mlir::Value tempBox = copyIn.getTempBox(); rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); // The TempBox is only needed for flang-rt calls which we're no longer - // generating. + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); } >From cb09635ad4652d384ded06831ba6054fdad80299 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 22 May 2025 13:37:53 +0000 Subject: [PATCH 4/4] Separate copy_in inlining into its own pass, add flag Signed-off-by: Kajetan Puchalski --- flang/include/flang/Optimizer/HLFIR/Passes.td | 4 + .../Optimizer/HLFIR/Transforms/CMakeLists.txt | 1 + .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 122 ------------ .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 180 ++++++++++++++++++ flang/lib/Optimizer/Passes/Pipelines.cpp | 5 + flang/test/HLFIR/inline-hlfir-assign.fir | 144 -------------- flang/test/HLFIR/inline-hlfir-copy-in.fir | 146 ++++++++++++++ 7 files changed, 336 insertions(+), 266 deletions(-) create mode 100644 flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp create mode 100644 flang/test/HLFIR/inline-hlfir-copy-in.fir diff --git a/flang/include/flang/Optimizer/HLFIR/Passes.td b/flang/include/flang/Optimizer/HLFIR/Passes.td index d445140118073..04d7aec5fe489 100644 --- 
a/flang/include/flang/Optimizer/HLFIR/Passes.td +++ b/flang/include/flang/Optimizer/HLFIR/Passes.td @@ -69,6 +69,10 @@ def InlineHLFIRAssign : Pass<"inline-hlfir-assign"> { let summary = "Inline hlfir.assign operations"; } +def InlineHLFIRCopyIn : Pass<"inline-hlfir-copy-in"> { + let summary = "Inline hlfir.copy_in operations"; +} + def PropagateFortranVariableAttributes : Pass<"propagate-fortran-attrs"> { let summary = "Propagate FortranVariableFlagsAttr attributes through HLFIR"; } diff --git a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt index d959428ebd203..cc74273d9c5d9 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt +++ b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt @@ -5,6 +5,7 @@ add_flang_library(HLFIRTransforms ConvertToFIR.cpp InlineElementals.cpp InlineHLFIRAssign.cpp + InlineHLFIRCopyIn.cpp LowerHLFIRIntrinsics.cpp LowerHLFIROrderedAssignments.cpp ScheduleOrderedAssignments.cpp diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index dc545ece8adff..6e209cce07ad4 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,7 +13,6 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" -#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -128,126 +127,6 @@ class InlineHLFIRAssignConversion } }; -class InlineCopyInConversion : public mlir::OpRewritePattern { -public: - using mlir::OpRewritePattern::OpRewritePattern; - - llvm::LogicalResult - matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const override; -}; - -llvm::LogicalResult 
-InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const { - fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); - mlir::Location loc = copyIn.getLoc(); - hlfir::Entity inputVariable{copyIn.getVar()}; - if (!fir::isa_trivial(inputVariable.getFortranElementType())) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's data type is not trivial"); - - if (fir::isPointerType(inputVariable.getType())) - return rewriter.notifyMatchFailure( - copyIn, "CopyInOp's input variable is a pointer"); - - // There should be exactly one user of WasCopied - the corresponding - // CopyOutOp. - if (copyIn.getWasCopied().getUses().empty()) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's WasCopied has no uses"); - // The copy out should always be present, either to actually copy or just - // deallocate memory. - auto copyOut = mlir::dyn_cast( - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - - if (!copyOut) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp has no direct CopyOut"); - - // Only inline the copy_in when copy_out does not need to be done, i.e. in - // case of intent(in). 
- if (copyOut.getVar()) - return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); - - inputVariable = - hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); - mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); - mlir::Value isContiguous = - builder.create(loc, inputVariable); - mlir::Operation::result_range results = - builder - .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, - /*withElseRegion=*/true) - .genThen([&]() { - mlir::Value falseVal = builder.create( - loc, builder.getI1Type(), builder.getBoolAttr(false)); - builder.create( - loc, mlir::ValueRange{inputVariable, falseVal}); - }) - .genElse([&] { - auto [temp, cleanup] = - hlfir::createTempFromMold(loc, builder, inputVariable); - mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); - llvm::SmallVector extents = - hlfir::getIndexExtents(loc, builder, shape); - hlfir::LoopNest loopNest = hlfir::genLoopNest( - loc, builder, extents, /*isUnordered=*/true, - flangomp::shouldUseWorkshareLowering(copyIn)); - builder.setInsertionPointToStart(loopNest.body); - hlfir::Entity elem = hlfir::getElementAt( - loc, builder, inputVariable, loopNest.oneBasedIndices); - elem = hlfir::loadTrivialScalar(loc, builder, elem); - hlfir::Entity tempElem = hlfir::getElementAt( - loc, builder, temp, loopNest.oneBasedIndices); - builder.create(loc, elem, tempElem); - builder.setInsertionPointAfter(loopNest.outerOp); - - mlir::Value result; - // Make sure the result is always a boxed array by boxing it - // ourselves if need be. 
- if (mlir::isa(temp.getType())) { - result = temp; - } else { - fir::ReferenceType refTy = - fir::ReferenceType::get(temp.getElementOrSequenceType()); - mlir::Value refVal = builder.createConvert(loc, refTy, temp); - result = - builder.create(loc, resultAddrType, refVal); - } - - builder.create(loc, - mlir::ValueRange{result, cleanup}); - }) - .getResults(); - - mlir::OpResult addr = results[0]; - mlir::OpResult needsCleanup = results[1]; - - builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { - auto boxAddr = builder.create(loc, addr); - fir::HeapType heapType = - fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - mlir::Value heapVal = - builder.createConvert(loc, heapType, boxAddr.getResult()); - builder.create(loc, heapVal); - }); - rewriter.eraseOp(copyOut); - - mlir::Value tempBox = copyIn.getTempBox(); - - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); - - // The TempBox is only needed for flang-rt calls which we're no longer - // generating. It should have no uses left at this stage. 
- if (!tempBox.getUses().empty()) - return mlir::failure(); - rewriter.eraseOp(tempBox.getDefiningOp()); - - return mlir::success(); -} - class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -261,7 +140,6 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); - patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp new file mode 100644 index 0000000000000..40a6bb49581eb --- /dev/null +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops ------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + mlir::Value tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); + rewriter.eraseOp(tempBox.getDefiningOp()); + + return mlir::success(); +} + +class InlineHLFIRCopyInPass + : public hlfir::impl::InlineHLFIRCopyInBase { +public: + void runOnOperation() override { + mlir::MLIRContext *context = &getContext(); + + mlir::GreedyRewriteConfig config; + // Prevent the pattern driver from merging blocks. 
+ config.setRegionSimplificationLevel( + mlir::GreedySimplifyRegionLevel::Disabled); + + mlir::RewritePatternSet patterns(context); + if (!noInlineHLFIRCopyIn) { + patterns.insert(context); + } + + if (mlir::failed(mlir::applyPatternsGreedily( + getOperation(), std::move(patterns), config))) { + mlir::emitError(getOperation()->getLoc(), + "failure in hlfir.copy_in inlining"); + signalPassFailure(); + } + } +}; +} // namespace diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..1779623fddc5a 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -255,6 +255,11 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP, pm, hlfir::createOptimizedBufferization); addNestedPassToAllTopLevelOperations( pm, hlfir::createInlineHLFIRAssign); + + if (optLevel == llvm::OptimizationLevel::O3) { + addNestedPassToAllTopLevelOperations( + pm, hlfir::createInlineHLFIRCopyIn); + } } pm.addPass(hlfir::createLowerHLFIROrderedAssignments()); pm.addPass(hlfir::createLowerHLFIRIntrinsics()); diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index df7681b9c5c16..f834e7971e3d5 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,147 +353,3 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // CHECK: } - -// Test inlining of hlfir.copy_in that does not require the array to be copied out -func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, 
!fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, %c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant true -// CHECK: %[[VAL_4:.*]] = arith.constant false -// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : 
(!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 -// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 -// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { -// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 -// CHECK: } else { -// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} -// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { -// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: %[[VAL_27:.*]] = fir.load 
%[[VAL_26:.*]] : !fir.ref -// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref -// CHECK: } -// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 -// CHECK: } -// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: fir.if %[[VAL_21:.*]]#1 { -// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> -// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> -// CHECK: } -// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } - -// Test not inlining of hlfir.copy_in that requires the array to be copied out -func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, 
%c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_no_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> -// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 
-// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) -// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () -// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir new file mode 100644 index 0000000000000..7140e93f19979 --- /dev/null +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -0,0 +1,146 @@ +// Test inlining of hlfir.copy_in +// RUN: fir-opt --inline-hlfir-copy-in %s | FileCheck %s + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca 
!fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 
+// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, 
!fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 
+ %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: 
%[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } From flang-commits at lists.llvm.org Thu May 22 07:29:39 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Thu, 22 May 2025 07:29:39 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: 
<682f34d3.a70a0220.a27ad.1217@mx.google.com>

https://github.com/mrkajetanp updated https://github.com/llvm/llvm-project/pull/138718

>From 6bbba43416edf2c4d310d3e92e99ad458e87d8f7 Mon Sep 17 00:00:00 2001
From: Kajetan Puchalski
Date: Thu, 10 Apr 2025 14:04:52 +0000
Subject: [PATCH 1/4] [flang] Inline hlfir.copy_in for trivial types

hlfir.copy_in implements copying non-contiguous array slices for functions that take in arrays required to be contiguous through a flang-rt function that calls memcpy/memmove separately on each element. For large arrays of trivial types, this can incur considerable overhead compared to a plain copy loop that is better able to take advantage of hardware pipelines.

To address that, extend the InlineHLFIRAssign optimisation pass with a new pattern for inlining hlfir.copy_in operations for trivial types. For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in).

Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by a factor of about 1/3rd.
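
For readers less familiar with copy-in semantics, the behaviour described above can be modelled in plain C++. This is only an illustrative sketch, not code from the patch: the patch emits FIR/HLFIR operations, whereas `StridedView`, `copyIn` and `releaseCopy` below are hypothetical names, and the model assumes a rank-1 slice of `double` elements.

```cpp
#include <cstddef>
#include <cstdlib>
#include <utility>

// Hypothetical stand-in for a rank-1 descriptor of f64 elements.
// This is not a flang type; it only models base address, extent and stride.
struct StridedView {
  double *base;          // address of the first element
  std::size_t extent;    // number of elements in the slice
  std::ptrdiff_t stride; // distance between elements, counted in elements
};

// Models the inlined copy-in: reuse the data when it is already contiguous,
// otherwise copy it element by element into a heap temporary.
// Returns the address to hand to the callee and a "was copied" flag.
std::pair<double *, bool> copyIn(const StridedView &v) {
  // A unit stride (or at most one element) is treated as contiguous here.
  if (v.stride == 1 || v.extent <= 1)
    return {v.base, false};

  auto *tmp = static_cast<double *>(std::malloc(v.extent * sizeof(double)));
  if (!tmp)
    std::abort();

  // Plain copy loop, in place of a per-element runtime call.
  for (std::size_t i = 0; i < v.extent; ++i)
    tmp[i] = v.base[static_cast<std::ptrdiff_t>(i) * v.stride];
  return {tmp, true};
}

// Models what is left of copy-out for an intent(in) argument: nothing is
// written back to the original slice, the temporary is only released.
void releaseCopy(double *addr, bool wasCopied) {
  if (wasCopied)
    std::free(addr);
}
```

In this model the caller would invoke `copyIn` before the call and `releaseCopy` after it, which mirrors the restriction stated above: the sketch never writes the temporary back, so it is only valid when no copy-out is required.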
Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 117 ++++++++++++++++++ 1 file changed, 117 insertions(+) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 6e209cce07ad4..38c684eaceb7d 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,6 +13,7 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + auto results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + auto elem = hlfir::getElementAt(loc, builder, inputVariable, + loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + auto tempElem = hlfir::getElementAt(loc, builder, temp, + loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + auto refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + auto refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + auto addr = results[0]; + auto needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + auto tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. + rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); +} + class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -140,6 +256,7 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); + patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { >From 32a62862f141c3894c7a30b141ddf3b7701cb85b Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 7 May 2025 16:04:07 +0000 Subject: [PATCH 2/4] Add tests Signed-off-by: Kajetan Puchalski --- flang/test/HLFIR/inline-hlfir-assign.fir | 144 +++++++++++++++++++++++ 1 file changed, 144 insertions(+) diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index f834e7971e3d5..df7681b9c5c16 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,3 +353,147 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // 
CHECK: } + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, 
+// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result 
%[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 
dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] 
dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 
f34d4d56fd9818aa8cedf2a3600ab2205f65e047 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 8 May 2025 15:15:56 +0000 Subject: [PATCH 3/4] Address Tom's review comments Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 41 +++++++++++-------- 1 file changed, 23 insertions(+), 18 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 38c684eaceb7d..dc545ece8adff 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -158,16 +158,16 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, "CopyInOp's WasCopied has no uses"); // The copy out should always be present, either to actually copy or just // deallocate memory. - auto *copyOut = - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - if (!mlir::isa(copyOut)) + if (!copyOut) return rewriter.notifyMatchFailure(copyIn, "CopyInOp has no direct CopyOut"); // Only inline the copy_in when copy_out does not need to be done, i.e. in // case of intent(in). 
-  if (::llvm::cast<hlfir::CopyOutOp>(copyOut).getVar())
+  if (copyOut.getVar())
     return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out");

   inputVariable =
@@ -175,7 +175,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
   mlir::Type resultAddrType = copyIn.getCopiedIn().getType();
   mlir::Value isContiguous =
       builder.create<fir::IsContiguousBoxOp>(loc, inputVariable);
-  auto results =
+  mlir::Operation::result_range results =
       builder
           .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous,
                    /*withElseRegion=*/true)
@@ -195,11 +195,11 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
                 loc, builder, extents, /*isUnordered=*/true,
                 flangomp::shouldUseWorkshareLowering(copyIn));
             builder.setInsertionPointToStart(loopNest.body);
-            auto elem = hlfir::getElementAt(loc, builder, inputVariable,
-                                            loopNest.oneBasedIndices);
+            hlfir::Entity elem = hlfir::getElementAt(
+                loc, builder, inputVariable, loopNest.oneBasedIndices);
             elem = hlfir::loadTrivialScalar(loc, builder, elem);
-            auto tempElem = hlfir::getElementAt(loc, builder, temp,
-                                                loopNest.oneBasedIndices);
+            hlfir::Entity tempElem = hlfir::getElementAt(
+                loc, builder, temp, loopNest.oneBasedIndices);
             builder.create<hlfir::AssignOp>(loc, elem, tempElem);
             builder.setInsertionPointAfter(loopNest.outerOp);

@@ -209,9 +209,9 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
             if (mlir::isa<fir::BaseBoxType>(temp.getType())) {
               result = temp;
             } else {
-              auto refTy =
+              fir::ReferenceType refTy =
                   fir::ReferenceType::get(temp.getElementOrSequenceType());
-              auto refVal = builder.createConvert(loc, refTy, temp);
+              mlir::Value refVal = builder.createConvert(loc, refTy, temp);
               result =
                   builder.create<fir::EmboxOp>(loc, resultAddrType, refVal);
             }
@@ -221,25 +221,30 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
           })
           .getResults();

-  auto addr = results[0];
-  auto needsCleanup = results[1];
+  mlir::OpResult addr = results[0];
+  mlir::OpResult needsCleanup = results[1];

   builder.setInsertionPoint(copyOut);
-  builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] {
+  builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] {
     auto boxAddr = builder.create<fir::BoxAddrOp>(loc, addr);
-    auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy());
-    auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult());
+    fir::HeapType heapType =
+        fir::HeapType::get(fir::BoxValue(addr).getBaseTy());
+    mlir::Value heapVal =
+        builder.createConvert(loc, heapType, boxAddr.getResult());
     builder.create<fir::FreeMemOp>(loc, heapVal);
   });
   rewriter.eraseOp(copyOut);

-  auto tempBox = copyIn.getTempBox();
+  mlir::Value tempBox = copyIn.getTempBox();

   rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)});

   // The TempBox is only needed for flang-rt calls which we're no longer
-  // generating.
+  // generating. It should have no uses left at this stage.
+  if (!tempBox.getUses().empty())
+    return mlir::failure();
   rewriter.eraseOp(tempBox.getDefiningOp());
+
   return mlir::success();
 }

>From 24bbcbdf9064fa285bc7db01904e85f8959a5f0c Mon Sep 17 00:00:00 2001
From: Kajetan Puchalski
Date: Thu, 22 May 2025 13:37:53 +0000
Subject: [PATCH 4/4] Separate copy_in inlining into its own pass, add flag

Signed-off-by: Kajetan Puchalski
---
 flang/include/flang/Optimizer/HLFIR/Passes.td |   4 +
 .../Optimizer/HLFIR/Transforms/CMakeLists.txt |   1 +
 .../HLFIR/Transforms/InlineHLFIRAssign.cpp    | 122 ------------
 .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp    | 180 ++++++++++++++++++
 flang/lib/Optimizer/Passes/Pipelines.cpp      |   5 +
 flang/test/HLFIR/inline-hlfir-assign.fir      | 144 --------------
 flang/test/HLFIR/inline-hlfir-copy-in.fir     | 146 ++++++++++++++
 7 files changed, 336 insertions(+), 266 deletions(-)
 create mode 100644 flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp
 create mode 100644 flang/test/HLFIR/inline-hlfir-copy-in.fir

diff --git a/flang/include/flang/Optimizer/HLFIR/Passes.td b/flang/include/flang/Optimizer/HLFIR/Passes.td
index d445140118073..04d7aec5fe489 100644
--- a/flang/include/flang/Optimizer/HLFIR/Passes.td
+++ b/flang/include/flang/Optimizer/HLFIR/Passes.td
@@ -69,6 +69,10 @@ def InlineHLFIRAssign : Pass<"inline-hlfir-assign"> {
   let summary = "Inline hlfir.assign operations";
 }

+def InlineHLFIRCopyIn : Pass<"inline-hlfir-copy-in"> {
+  let summary = "Inline hlfir.copy_in operations";
+}
+
 def PropagateFortranVariableAttributes : Pass<"propagate-fortran-attrs"> {
   let summary = "Propagate FortranVariableFlagsAttr attributes through HLFIR";
 }
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
index d959428ebd203..cc74273d9c5d9 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
+++ b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
@@ -5,6 +5,7 @@ add_flang_library(HLFIRTransforms
   ConvertToFIR.cpp
   InlineElementals.cpp
   InlineHLFIRAssign.cpp
+  InlineHLFIRCopyIn.cpp
   LowerHLFIRIntrinsics.cpp
   LowerHLFIROrderedAssignments.cpp
   ScheduleOrderedAssignments.cpp
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp
index dc545ece8adff..6e209cce07ad4 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp
+++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp
@@ -13,7 +13,6 @@
 #include "flang/Optimizer/Analysis/AliasAnalysis.h"
 #include "flang/Optimizer/Builder/FIRBuilder.h"
 #include "flang/Optimizer/Builder/HLFIRTools.h"
-#include "flang/Optimizer/Dialect/FIRType.h"
 #include "flang/Optimizer/HLFIR/HLFIROps.h"
 #include "flang/Optimizer/HLFIR/Passes.h"
 #include "flang/Optimizer/OpenMP/Passes.h"
@@ -128,126 +127,6 @@ class InlineHLFIRAssignConversion
   }
 };

-class InlineCopyInConversion : public mlir::OpRewritePattern<hlfir::CopyInOp> {
-public:
-  using mlir::OpRewritePattern<hlfir::CopyInOp>::OpRewritePattern;
-
-  llvm::LogicalResult
-  matchAndRewrite(hlfir::CopyInOp copyIn,
-                  mlir::PatternRewriter &rewriter) const override;
-};
-
-llvm::LogicalResult
-InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
-                                        mlir::PatternRewriter &rewriter) const {
-  fir::FirOpBuilder builder(rewriter, copyIn.getOperation());
-  mlir::Location loc = copyIn.getLoc();
-  hlfir::Entity inputVariable{copyIn.getVar()};
-  if (!fir::isa_trivial(inputVariable.getFortranElementType()))
-    return rewriter.notifyMatchFailure(copyIn,
-                                       "CopyInOp's data type is not trivial");
-
-  if (fir::isPointerType(inputVariable.getType()))
-    return rewriter.notifyMatchFailure(
-        copyIn, "CopyInOp's input variable is a pointer");
-
-  // There should be exactly one user of WasCopied - the corresponding
-  // CopyOutOp.
-  if (copyIn.getWasCopied().getUses().empty())
-    return rewriter.notifyMatchFailure(copyIn,
-                                       "CopyInOp's WasCopied has no uses");
-  // The copy out should always be present, either to actually copy or just
-  // deallocate memory.
-  auto copyOut = mlir::dyn_cast<hlfir::CopyOutOp>(
-      copyIn.getWasCopied().getUsers().begin().getCurrent().getUser());
-
-  if (!copyOut)
-    return rewriter.notifyMatchFailure(copyIn,
-                                       "CopyInOp has no direct CopyOut");
-
-  // Only inline the copy_in when copy_out does not need to be done, i.e. in
-  // case of intent(in).
-  if (copyOut.getVar())
-    return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out");
-
-  inputVariable =
-      hlfir::derefPointersAndAllocatables(loc, builder, inputVariable);
-  mlir::Type resultAddrType = copyIn.getCopiedIn().getType();
-  mlir::Value isContiguous =
-      builder.create<fir::IsContiguousBoxOp>(loc, inputVariable);
-  mlir::Operation::result_range results =
-      builder
-          .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous,
-                   /*withElseRegion=*/true)
-          .genThen([&]() {
-            mlir::Value falseVal = builder.create<mlir::arith::ConstantOp>(
-                loc, builder.getI1Type(), builder.getBoolAttr(false));
-            builder.create<fir::ResultOp>(
-                loc, mlir::ValueRange{inputVariable, falseVal});
-          })
-          .genElse([&] {
-            auto [temp, cleanup] =
-                hlfir::createTempFromMold(loc, builder, inputVariable);
-            mlir::Value shape = hlfir::genShape(loc, builder, inputVariable);
-            llvm::SmallVector<mlir::Value> extents =
-                hlfir::getIndexExtents(loc, builder, shape);
-            hlfir::LoopNest loopNest = hlfir::genLoopNest(
-                loc, builder, extents, /*isUnordered=*/true,
-                flangomp::shouldUseWorkshareLowering(copyIn));
-            builder.setInsertionPointToStart(loopNest.body);
-            hlfir::Entity elem = hlfir::getElementAt(
-                loc, builder, inputVariable, loopNest.oneBasedIndices);
-            elem = hlfir::loadTrivialScalar(loc, builder, elem);
-            hlfir::Entity tempElem = hlfir::getElementAt(
-                loc, builder, temp, loopNest.oneBasedIndices);
-            builder.create<hlfir::AssignOp>(loc, elem, tempElem);
-            builder.setInsertionPointAfter(loopNest.outerOp);
-
-            mlir::Value result;
-            // Make sure the result is always a boxed array by boxing it
-            // ourselves if need be.
-            if (mlir::isa<fir::BaseBoxType>(temp.getType())) {
-              result = temp;
-            } else {
-              fir::ReferenceType refTy =
-                  fir::ReferenceType::get(temp.getElementOrSequenceType());
-              mlir::Value refVal = builder.createConvert(loc, refTy, temp);
-              result =
-                  builder.create<fir::EmboxOp>(loc, resultAddrType, refVal);
-            }
-
-            builder.create<fir::ResultOp>(loc,
-                                          mlir::ValueRange{result, cleanup});
-          })
-          .getResults();
-
-  mlir::OpResult addr = results[0];
-  mlir::OpResult needsCleanup = results[1];
-
-  builder.setInsertionPoint(copyOut);
-  builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] {
-    auto boxAddr = builder.create<fir::BoxAddrOp>(loc, addr);
-    fir::HeapType heapType =
-        fir::HeapType::get(fir::BoxValue(addr).getBaseTy());
-    mlir::Value heapVal =
-        builder.createConvert(loc, heapType, boxAddr.getResult());
-    builder.create<fir::FreeMemOp>(loc, heapVal);
-  });
-  rewriter.eraseOp(copyOut);
-
-  mlir::Value tempBox = copyIn.getTempBox();
-
-  rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)});
-
-  // The TempBox is only needed for flang-rt calls which we're no longer
-  // generating. It should have no uses left at this stage.
-  if (!tempBox.getUses().empty())
-    return mlir::failure();
-  rewriter.eraseOp(tempBox.getDefiningOp());
-
-  return mlir::success();
-}
-
 class InlineHLFIRAssignPass
     : public hlfir::impl::InlineHLFIRAssignBase<InlineHLFIRAssignPass> {
 public:
@@ -261,7 +140,6 @@ class InlineHLFIRAssignPass

     mlir::RewritePatternSet patterns(context);
     patterns.insert<InlineHLFIRAssignConversion>(context);
-    patterns.insert<InlineCopyInConversion>(context);

     if (mlir::failed(mlir::applyPatternsGreedily(
             getOperation(), std::move(patterns), config))) {
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp
new file mode 100644
index 0000000000000..1e2aecaf535a0
--- /dev/null
+++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp
@@ -0,0 +1,180 @@
+//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+// Transform hlfir.copy_in array operations into loop nests performing element
+// per element assignments. For simplicity, the inlining is done for trivial
+// data types when the copy_in does not require a corresponding copy_out and
+// when the input array is not behind a pointer. This may change in the future.
+//===----------------------------------------------------------------------===//
+
+#include "flang/Optimizer/Builder/FIRBuilder.h"
+#include "flang/Optimizer/Builder/HLFIRTools.h"
+#include "flang/Optimizer/Dialect/FIRType.h"
+#include "flang/Optimizer/HLFIR/HLFIROps.h"
+#include "flang/Optimizer/OpenMP/Passes.h"
+#include "mlir/IR/PatternMatch.h"
+#include "mlir/Support/LLVM.h"
+#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
+
+namespace hlfir {
+#define GEN_PASS_DEF_INLINEHLFIRCOPYIN
+#include "flang/Optimizer/HLFIR/Passes.h.inc"
+} // namespace hlfir
+
+#define DEBUG_TYPE "inline-hlfir-copy-in"
+
+static llvm::cl::opt<bool> noInlineHLFIRCopyIn(
+    "no-inline-hlfir-copy-in",
+    llvm::cl::desc("Do not inline hlfir.copy_in operations"),
+    llvm::cl::init(false));
+
+namespace {
+class InlineCopyInConversion : public mlir::OpRewritePattern<hlfir::CopyInOp> {
+public:
+  using mlir::OpRewritePattern<hlfir::CopyInOp>::OpRewritePattern;
+
+  llvm::LogicalResult
+  matchAndRewrite(hlfir::CopyInOp copyIn,
+                  mlir::PatternRewriter &rewriter) const override;
+};
+
+llvm::LogicalResult
+InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
+                                        mlir::PatternRewriter &rewriter) const {
+  fir::FirOpBuilder builder(rewriter, copyIn.getOperation());
+  mlir::Location loc = copyIn.getLoc();
+  hlfir::Entity inputVariable{copyIn.getVar()};
+  if (!fir::isa_trivial(inputVariable.getFortranElementType()))
+    return rewriter.notifyMatchFailure(copyIn,
+                                       "CopyInOp's data type is not trivial");
+
+  if (fir::isPointerType(inputVariable.getType()))
+    return rewriter.notifyMatchFailure(
+        copyIn, "CopyInOp's input variable is a pointer");
+
+  // There should be exactly one user of WasCopied - the corresponding
+  // CopyOutOp.
+  if (copyIn.getWasCopied().getUses().empty())
+    return rewriter.notifyMatchFailure(copyIn,
+                                       "CopyInOp's WasCopied has no uses");
+  // The copy out should always be present, either to actually copy or just
+  // deallocate memory.
+  auto copyOut = mlir::dyn_cast<hlfir::CopyOutOp>(
+      copyIn.getWasCopied().getUsers().begin().getCurrent().getUser());
+
+  if (!copyOut)
+    return rewriter.notifyMatchFailure(copyIn,
+                                       "CopyInOp has no direct CopyOut");
+
+  // Only inline the copy_in when copy_out does not need to be done, i.e. in
+  // case of intent(in).
+  if (copyOut.getVar())
+    return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out");
+
+  inputVariable =
+      hlfir::derefPointersAndAllocatables(loc, builder, inputVariable);
+  mlir::Type resultAddrType = copyIn.getCopiedIn().getType();
+  mlir::Value isContiguous =
+      builder.create<fir::IsContiguousBoxOp>(loc, inputVariable);
+  mlir::Operation::result_range results =
+      builder
+          .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous,
+                   /*withElseRegion=*/true)
+          .genThen([&]() {
+            mlir::Value falseVal = builder.create<mlir::arith::ConstantOp>(
+                loc, builder.getI1Type(), builder.getBoolAttr(false));
+            builder.create<fir::ResultOp>(
+                loc, mlir::ValueRange{inputVariable, falseVal});
+          })
+          .genElse([&] {
+            auto [temp, cleanup] =
+                hlfir::createTempFromMold(loc, builder, inputVariable);
+            mlir::Value shape = hlfir::genShape(loc, builder, inputVariable);
+            llvm::SmallVector<mlir::Value> extents =
+                hlfir::getIndexExtents(loc, builder, shape);
+            hlfir::LoopNest loopNest = hlfir::genLoopNest(
+                loc, builder, extents, /*isUnordered=*/true,
+                flangomp::shouldUseWorkshareLowering(copyIn));
+            builder.setInsertionPointToStart(loopNest.body);
+            hlfir::Entity elem = hlfir::getElementAt(
+                loc, builder, inputVariable, loopNest.oneBasedIndices);
+            elem = hlfir::loadTrivialScalar(loc, builder, elem);
+            hlfir::Entity tempElem = hlfir::getElementAt(
+                loc, builder, temp, loopNest.oneBasedIndices);
+            builder.create<hlfir::AssignOp>(loc, elem, tempElem);
+            builder.setInsertionPointAfter(loopNest.outerOp);
+
+            mlir::Value result;
+            // Make sure the result is always a boxed array by boxing it
+            // ourselves if need be.
+            if (mlir::isa<fir::BaseBoxType>(temp.getType())) {
+              result = temp;
+            } else {
+              fir::ReferenceType refTy =
+                  fir::ReferenceType::get(temp.getElementOrSequenceType());
+              mlir::Value refVal = builder.createConvert(loc, refTy, temp);
+              result =
+                  builder.create<fir::EmboxOp>(loc, resultAddrType, refVal);
+            }
+
+            builder.create<fir::ResultOp>(loc,
+                                          mlir::ValueRange{result, cleanup});
+          })
+          .getResults();
+
+  mlir::OpResult addr = results[0];
+  mlir::OpResult needsCleanup = results[1];
+
+  builder.setInsertionPoint(copyOut);
+  builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] {
+    auto boxAddr = builder.create<fir::BoxAddrOp>(loc, addr);
+    fir::HeapType heapType =
+        fir::HeapType::get(fir::BoxValue(addr).getBaseTy());
+    mlir::Value heapVal =
+        builder.createConvert(loc, heapType, boxAddr.getResult());
+    builder.create<fir::FreeMemOp>(loc, heapVal);
+  });
+  rewriter.eraseOp(copyOut);
+
+  mlir::Value tempBox = copyIn.getTempBox();
+
+  rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)});
+
+  // The TempBox is only needed for flang-rt calls which we're no longer
+  // generating. It should have no uses left at this stage.
+  if (!tempBox.getUses().empty())
+    return mlir::failure();
+  rewriter.eraseOp(tempBox.getDefiningOp());
+
+  return mlir::success();
+}
+
+class InlineHLFIRCopyInPass
+    : public hlfir::impl::InlineHLFIRCopyInBase<InlineHLFIRCopyInPass> {
+public:
+  void runOnOperation() override {
+    mlir::MLIRContext *context = &getContext();
+
+    mlir::GreedyRewriteConfig config;
+    // Prevent the pattern driver from merging blocks.
+    config.setRegionSimplificationLevel(
+        mlir::GreedySimplifyRegionLevel::Disabled);
+
+    mlir::RewritePatternSet patterns(context);
+    if (!noInlineHLFIRCopyIn) {
+      patterns.insert<InlineCopyInConversion>(context);
+    }
+
+    if (mlir::failed(mlir::applyPatternsGreedily(
+            getOperation(), std::move(patterns), config))) {
+      mlir::emitError(getOperation()->getLoc(),
+                      "failure in hlfir.copy_in inlining");
+      signalPassFailure();
+    }
+  }
+};
+} // namespace
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 77751908e35be..1779623fddc5a 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -255,6 +255,11 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP,
         pm, hlfir::createOptimizedBufferization);
     addNestedPassToAllTopLevelOperations<PassConstructor>(
         pm, hlfir::createInlineHLFIRAssign);
+
+    if (optLevel == llvm::OptimizationLevel::O3) {
+      addNestedPassToAllTopLevelOperations<PassConstructor>(
+          pm, hlfir::createInlineHLFIRCopyIn);
+    }
   }
   pm.addPass(hlfir::createLowerHLFIROrderedAssignments());
   pm.addPass(hlfir::createLowerHLFIRIntrinsics());
diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir
index df7681b9c5c16..f834e7971e3d5 100644
--- a/flang/test/HLFIR/inline-hlfir-assign.fir
+++ b/flang/test/HLFIR/inline-hlfir-assign.fir
@@ -353,147 +353,3 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref>
 // CHECK: return
 // CHECK: }
-
-// Test inlining of hlfir.copy_in that does not require the array to be copied out
-func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) {
-  %0 = fir.alloca !fir.box>>
-  %1 = fir.dummy_scope : !fir.dscope
-  %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref)
-  %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref,
!fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, %c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant true -// CHECK: %[[VAL_4:.*]] = arith.constant false -// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : 
(!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 -// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 -// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { -// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 -// CHECK: } else { -// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} -// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { -// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: %[[VAL_27:.*]] = fir.load 
%[[VAL_26:.*]] : !fir.ref -// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref -// CHECK: } -// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 -// CHECK: } -// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: fir.if %[[VAL_21:.*]]#1 { -// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> -// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> -// CHECK: } -// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } - -// Test not inlining of hlfir.copy_in that requires the array to be copied out -func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, 
%c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_no_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> -// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 
-// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) -// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () -// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir new file mode 100644 index 0000000000000..7140e93f19979 --- /dev/null +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -0,0 +1,146 @@ +// Test inlining of hlfir.copy_in +// RUN: fir-opt --inline-hlfir-copy-in %s | FileCheck %s + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca 
!fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 
+// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, 
!fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 
+ %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: 
%[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } From flang-commits at lists.llvm.org Thu May 22 07:52:14 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Thu, 22 May 2025 07:52:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: 
<682f3a1e.170a0220.106673.0288@mx.google.com> https://github.com/mrkajetanp updated https://github.com/llvm/llvm-project/pull/138718 >From eac5f59bcb47e5ec03647e2f097214df87a5aebb Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 10 Apr 2025 14:04:52 +0000 Subject: [PATCH 1/4] [flang] Inline hlfir.copy_in for trivial types hlfir.copy_in implements copying non-contiguous array slices for functions that take in arrays required to be contiguous through a flang-rt function that calls memcpy/memmove separately on each element. For large arrays of trivial types, this can incur considerable overhead compared to a plain copy loop that is better able to take advantage of hardware pipelines. To address that, extend the InlineHLFIRAssign optimisation pass with a new pattern for inlining hlfir.copy_in operations for trivial types. For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in). Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by a factor of about 1/3rd. 
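For illustration only, the shape of the inlined copy-in described above can be sketched in plain C++. This is a minimal sketch under stated assumptions, not the code the pass generates: the pass emits FIR/HLFIR operations, whereas the names StridedView, CopyInResult, copyIn and releaseCopyIn below are hypothetical and exist only for this example. A one-dimensional array of doubles with an element stride stands in for the boxed array slice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for an array descriptor (box): a base address,
// an element count, and a stride measured in elements.
struct StridedView {
  const double *base;
  std::size_t extent;
  std::ptrdiff_t stride;
};

// What the copy-in hands back: the address the callee should read from,
// and a flag recording whether a temporary was allocated.
struct CopyInResult {
  const double *addr;
  bool wasCopied;
};

// Contiguous input is passed through untouched. Otherwise the elements
// are gathered into a fresh contiguous temporary with a plain loop
// instead of one runtime copy call per element.
CopyInResult copyIn(const StridedView &view) {
  assert(view.base != nullptr);
  if (view.stride == 1 || view.extent <= 1)
    return {view.base, false};

  double *temp =
      static_cast<double *>(std::malloc(view.extent * sizeof(double)));
  if (temp == nullptr)
    std::abort();
  for (std::size_t i = 0; i < view.extent; ++i)
    temp[i] = view.base[static_cast<std::ptrdiff_t>(i) * view.stride];
  return {temp, true};
}

// Cleanup that takes the place of the copy-out when nothing has to be
// written back: free the temporary only if one was created.
void releaseCopyIn(const CopyInResult &result) {
  if (result.wasCopied)
    std::free(const_cast<double *>(result.addr));
}
```

A caller would invoke copyIn before the call that needs contiguous data, pass result.addr to the callee, and call releaseCopyIn afterwards. Because no data is copied back, the sketch only covers the case where the callee does not modify the array, which matches the restriction stated in the commit message.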
Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 117 ++++++++++++++++++ 1 file changed, 117 insertions(+) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 6e209cce07ad4..38c684eaceb7d 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,6 +13,7 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + auto results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + auto elem = hlfir::getElementAt(loc, builder, inputVariable, + loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + auto tempElem = hlfir::getElementAt(loc, builder, temp, + loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + auto refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + auto refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + auto addr = results[0]; + auto needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + auto tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. + rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); +} + class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -140,6 +256,7 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); + patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { >From e7d832795bf8744331dbb8d72a86b79e5ae56961 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 7 May 2025 16:04:07 +0000 Subject: [PATCH 2/4] Add tests Signed-off-by: Kajetan Puchalski --- flang/test/HLFIR/inline-hlfir-assign.fir | 144 +++++++++++++++++++++++ 1 file changed, 144 insertions(+) diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index f834e7971e3d5..df7681b9c5c16 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,3 +353,147 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // 
CHECK: } + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, 
+// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result 
%[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 
dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] 
dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 
f129d8d5a2802578858ae592006a90c6fc5504d6 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 8 May 2025 15:15:56 +0000 Subject: [PATCH 3/4] Address Tom's review comments Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 41 +++++++++++-------- 1 file changed, 23 insertions(+), 18 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 38c684eaceb7d..dc545ece8adff 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -158,16 +158,16 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, "CopyInOp's WasCopied has no uses"); // The copy out should always be present, either to actually copy or just // deallocate memory. - auto *copyOut = - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - if (!mlir::isa(copyOut)) + if (!copyOut) return rewriter.notifyMatchFailure(copyIn, "CopyInOp has no direct CopyOut"); // Only inline the copy_in when copy_out does not need to be done, i.e. in // case of intent(in). 
- if (::llvm::cast(copyOut).getVar()) + if (copyOut.getVar()) return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); inputVariable = @@ -175,7 +175,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); mlir::Value isContiguous = builder.create(loc, inputVariable); - auto results = + mlir::Operation::result_range results = builder .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, /*withElseRegion=*/true) @@ -195,11 +195,11 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, loc, builder, extents, /*isUnordered=*/true, flangomp::shouldUseWorkshareLowering(copyIn)); builder.setInsertionPointToStart(loopNest.body); - auto elem = hlfir::getElementAt(loc, builder, inputVariable, - loopNest.oneBasedIndices); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); elem = hlfir::loadTrivialScalar(loc, builder, elem); - auto tempElem = hlfir::getElementAt(loc, builder, temp, - loopNest.oneBasedIndices); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); builder.create(loc, elem, tempElem); builder.setInsertionPointAfter(loopNest.outerOp); @@ -209,9 +209,9 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, if (mlir::isa(temp.getType())) { result = temp; } else { - auto refTy = + fir::ReferenceType refTy = fir::ReferenceType::get(temp.getElementOrSequenceType()); - auto refVal = builder.createConvert(loc, refTy, temp); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); result = builder.create(loc, resultAddrType, refVal); } @@ -221,25 +221,30 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }) .getResults(); - auto addr = results[0]; - auto needsCleanup = results[1]; + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, 
needsCleanup, false).genThen([&] { + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { auto boxAddr = builder.create(loc, addr); - auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); builder.create(loc, heapVal); }); rewriter.eraseOp(copyOut); - auto tempBox = copyIn.getTempBox(); + mlir::Value tempBox = copyIn.getTempBox(); rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); // The TempBox is only needed for flang-rt calls which we're no longer - // generating. + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); } >From 36a62576ee8c2681a9dc2febc129a5e3f1618ea7 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 22 May 2025 13:37:53 +0000 Subject: [PATCH 4/4] Separate copy_in inlining into its own pass, add flag Signed-off-by: Kajetan Puchalski --- flang/include/flang/Optimizer/HLFIR/Passes.td | 4 + .../Optimizer/HLFIR/Transforms/CMakeLists.txt | 1 + .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 122 ------------ .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 180 ++++++++++++++++++ flang/lib/Optimizer/Passes/Pipelines.cpp | 5 + flang/test/HLFIR/inline-hlfir-assign.fir | 144 -------------- flang/test/HLFIR/inline-hlfir-copy-in.fir | 146 ++++++++++++++ 7 files changed, 336 insertions(+), 266 deletions(-) create mode 100644 flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp create mode 100644 flang/test/HLFIR/inline-hlfir-copy-in.fir diff --git a/flang/include/flang/Optimizer/HLFIR/Passes.td b/flang/include/flang/Optimizer/HLFIR/Passes.td index d445140118073..04d7aec5fe489 100644 --- 
a/flang/include/flang/Optimizer/HLFIR/Passes.td +++ b/flang/include/flang/Optimizer/HLFIR/Passes.td @@ -69,6 +69,10 @@ def InlineHLFIRAssign : Pass<"inline-hlfir-assign"> { let summary = "Inline hlfir.assign operations"; } +def InlineHLFIRCopyIn : Pass<"inline-hlfir-copy-in"> { + let summary = "Inline hlfir.copy_in operations"; +} + def PropagateFortranVariableAttributes : Pass<"propagate-fortran-attrs"> { let summary = "Propagate FortranVariableFlagsAttr attributes through HLFIR"; } diff --git a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt index d959428ebd203..cc74273d9c5d9 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt +++ b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt @@ -5,6 +5,7 @@ add_flang_library(HLFIRTransforms ConvertToFIR.cpp InlineElementals.cpp InlineHLFIRAssign.cpp + InlineHLFIRCopyIn.cpp LowerHLFIRIntrinsics.cpp LowerHLFIROrderedAssignments.cpp ScheduleOrderedAssignments.cpp diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index dc545ece8adff..6e209cce07ad4 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,7 +13,6 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" -#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -128,126 +127,6 @@ class InlineHLFIRAssignConversion } }; -class InlineCopyInConversion : public mlir::OpRewritePattern { -public: - using mlir::OpRewritePattern::OpRewritePattern; - - llvm::LogicalResult - matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const override; -}; - -llvm::LogicalResult 
-InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const { - fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); - mlir::Location loc = copyIn.getLoc(); - hlfir::Entity inputVariable{copyIn.getVar()}; - if (!fir::isa_trivial(inputVariable.getFortranElementType())) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's data type is not trivial"); - - if (fir::isPointerType(inputVariable.getType())) - return rewriter.notifyMatchFailure( - copyIn, "CopyInOp's input variable is a pointer"); - - // There should be exactly one user of WasCopied - the corresponding - // CopyOutOp. - if (copyIn.getWasCopied().getUses().empty()) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's WasCopied has no uses"); - // The copy out should always be present, either to actually copy or just - // deallocate memory. - auto copyOut = mlir::dyn_cast( - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - - if (!copyOut) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp has no direct CopyOut"); - - // Only inline the copy_in when copy_out does not need to be done, i.e. in - // case of intent(in). 
- if (copyOut.getVar()) - return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); - - inputVariable = - hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); - mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); - mlir::Value isContiguous = - builder.create(loc, inputVariable); - mlir::Operation::result_range results = - builder - .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, - /*withElseRegion=*/true) - .genThen([&]() { - mlir::Value falseVal = builder.create( - loc, builder.getI1Type(), builder.getBoolAttr(false)); - builder.create( - loc, mlir::ValueRange{inputVariable, falseVal}); - }) - .genElse([&] { - auto [temp, cleanup] = - hlfir::createTempFromMold(loc, builder, inputVariable); - mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); - llvm::SmallVector extents = - hlfir::getIndexExtents(loc, builder, shape); - hlfir::LoopNest loopNest = hlfir::genLoopNest( - loc, builder, extents, /*isUnordered=*/true, - flangomp::shouldUseWorkshareLowering(copyIn)); - builder.setInsertionPointToStart(loopNest.body); - hlfir::Entity elem = hlfir::getElementAt( - loc, builder, inputVariable, loopNest.oneBasedIndices); - elem = hlfir::loadTrivialScalar(loc, builder, elem); - hlfir::Entity tempElem = hlfir::getElementAt( - loc, builder, temp, loopNest.oneBasedIndices); - builder.create(loc, elem, tempElem); - builder.setInsertionPointAfter(loopNest.outerOp); - - mlir::Value result; - // Make sure the result is always a boxed array by boxing it - // ourselves if need be. 
- if (mlir::isa(temp.getType())) { - result = temp; - } else { - fir::ReferenceType refTy = - fir::ReferenceType::get(temp.getElementOrSequenceType()); - mlir::Value refVal = builder.createConvert(loc, refTy, temp); - result = - builder.create(loc, resultAddrType, refVal); - } - - builder.create(loc, - mlir::ValueRange{result, cleanup}); - }) - .getResults(); - - mlir::OpResult addr = results[0]; - mlir::OpResult needsCleanup = results[1]; - - builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { - auto boxAddr = builder.create(loc, addr); - fir::HeapType heapType = - fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - mlir::Value heapVal = - builder.createConvert(loc, heapType, boxAddr.getResult()); - builder.create(loc, heapVal); - }); - rewriter.eraseOp(copyOut); - - mlir::Value tempBox = copyIn.getTempBox(); - - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); - - // The TempBox is only needed for flang-rt calls which we're no longer - // generating. It should have no uses left at this stage. 
- if (!tempBox.getUses().empty()) - return mlir::failure(); - rewriter.eraseOp(tempBox.getDefiningOp()); - - return mlir::success(); -} - class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -261,7 +140,6 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); - patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp new file mode 100644 index 0000000000000..1e2aecaf535a0 --- /dev/null +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + mlir::Value tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); + rewriter.eraseOp(tempBox.getDefiningOp()); + + return mlir::success(); +} + +class InlineHLFIRCopyInPass + : public hlfir::impl::InlineHLFIRCopyInBase { +public: + void runOnOperation() override { + mlir::MLIRContext *context = &getContext(); + + mlir::GreedyRewriteConfig config; + // Prevent the pattern driver from merging blocks. 
+ config.setRegionSimplificationLevel( + mlir::GreedySimplifyRegionLevel::Disabled); + + mlir::RewritePatternSet patterns(context); + if (!noInlineHLFIRCopyIn) { + patterns.insert(context); + } + + if (mlir::failed(mlir::applyPatternsGreedily( + getOperation(), std::move(patterns), config))) { + mlir::emitError(getOperation()->getLoc(), + "failure in hlfir.copy_in inlining"); + signalPassFailure(); + } + } +}; +} // namespace diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..1779623fddc5a 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -255,6 +255,11 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP, pm, hlfir::createOptimizedBufferization); addNestedPassToAllTopLevelOperations( pm, hlfir::createInlineHLFIRAssign); + + if (optLevel == llvm::OptimizationLevel::O3) { + addNestedPassToAllTopLevelOperations( + pm, hlfir::createInlineHLFIRCopyIn); + } } pm.addPass(hlfir::createLowerHLFIROrderedAssignments()); pm.addPass(hlfir::createLowerHLFIRIntrinsics()); diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index df7681b9c5c16..f834e7971e3d5 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,147 +353,3 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // CHECK: } - -// Test inlining of hlfir.copy_in that does not require the array to be copied out -func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, 
!fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, %c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant true -// CHECK: %[[VAL_4:.*]] = arith.constant false -// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : 
(!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 -// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 -// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { -// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 -// CHECK: } else { -// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} -// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { -// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: %[[VAL_27:.*]] = fir.load 
%[[VAL_26:.*]] : !fir.ref -// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref -// CHECK: } -// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 -// CHECK: } -// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: fir.if %[[VAL_21:.*]]#1 { -// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> -// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> -// CHECK: } -// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } - -// Test not inlining of hlfir.copy_in that requires the array to be copied out -func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, 
%c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_no_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> -// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 
-// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) -// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () -// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir new file mode 100644 index 0000000000000..7140e93f19979 --- /dev/null +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -0,0 +1,146 @@ +// Test inlining of hlfir.copy_in +// RUN: fir-opt --inline-hlfir-copy-in %s | FileCheck %s + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca 
!fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 
+// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, 
!fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 
+ %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: 
%[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>)
+// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref
+// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64
+// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index)
+// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index
+// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index
+// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref
+// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64
+// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1>
+// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box>
+// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1)
+// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref>
+// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1)
+// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> ()
+// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> ()
+// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1
+// CHECK: return
+// CHECK: }

From flang-commits at lists.llvm.org Thu May 22 11:12:31 2025
From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits)
Date: Thu, 22 May 2025 11:12:31 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cuda] Do not generate
cuf.alloc/cuf.free in device context (PR #141117)
Message-ID:

https://github.com/clementval created https://github.com/llvm/llvm-project/pull/141117

`cuf.alloc` and `cuf.free` are converted to `fir.alloca` or deleted when in device context during the CUFOpConversion pass. Do not generate them in lowering to avoid confusion.

>From 686776eab1b320721df2935ba76484adeed4785f Mon Sep 17 00:00:00 2001
From: Valentin Clement
Date: Thu, 22 May 2025 11:11:02 -0700
Subject: [PATCH] [flang][cuda] Do not generate cuf.alloc/cuf.free in device context

---
 flang/lib/Lower/ConvertVariable.cpp        | 12 ++++++++----
 flang/test/Lower/CUDA/cuda-allocatable.cuf |  2 +-
 flang/test/Lower/CUDA/cuda-shared.cuf      |  1 -
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/flang/lib/Lower/ConvertVariable.cpp b/flang/lib/Lower/ConvertVariable.cpp
index 372c71b6d4821..a28596bfd0099 100644
--- a/flang/lib/Lower/ConvertVariable.cpp
+++ b/flang/lib/Lower/ConvertVariable.cpp
@@ -25,6 +25,7 @@
 #include "flang/Lower/StatementContext.h"
 #include "flang/Lower/Support/Utils.h"
 #include "flang/Lower/SymbolMap.h"
+#include "flang/Optimizer/Builder/CUFCommon.h"
 #include "flang/Optimizer/Builder/Character.h"
 #include "flang/Optimizer/Builder/FIRBuilder.h"
 #include "flang/Optimizer/Builder/HLFIRTools.h"
@@ -735,8 +736,10 @@ static mlir::Value createNewLocal(Fortran::lower::AbstractConverter &converter,
     if (dataAttr.getValue() == cuf::DataAttribute::Shared)
       return builder.create(loc, ty, nm, symNm, lenParams,
                             indices);
-    return builder.create(loc, ty, nm, symNm, dataAttr, lenParams,
-                          indices);
+
+    if (!cuf::isCUDADeviceContext(builder.getRegion()))
+      return builder.create(loc, ty, nm, symNm, dataAttr,
+                            lenParams, indices);
   }

   // Let the builder do all the heavy lifting.
@@ -1072,8 +1075,9 @@ static void instantiateLocal(Fortran::lower::AbstractConverter &converter,
     if (mustBeDefaultInitializedAtRuntime(var))
       Fortran::lower::defaultInitializeAtRuntime(converter, var.getSymbol(),
                                                  symMap);
-    if (Fortran::semantics::NeedCUDAAlloc(var.getSymbol())) {
-      auto *builder = &converter.getFirOpBuilder();
+    auto *builder = &converter.getFirOpBuilder();
+    if (Fortran::semantics::NeedCUDAAlloc(var.getSymbol()) &&
+        !cuf::isCUDADeviceContext(builder->getRegion())) {
       cuf::DataAttributeAttr dataAttr =
           Fortran::lower::translateSymbolCUFDataAttribute(builder->getContext(),
                                                           var.getSymbol());
diff --git a/flang/test/Lower/CUDA/cuda-allocatable.cuf b/flang/test/Lower/CUDA/cuda-allocatable.cuf
index cec10dda839e9..36e768bd7d92c 100644
--- a/flang/test/Lower/CUDA/cuda-allocatable.cuf
+++ b/flang/test/Lower/CUDA/cuda-allocatable.cuf
@@ -186,7 +186,7 @@ attributes(global) subroutine sub8()
 end subroutine

 ! CHECK-LABEL: func.func @_QPsub8() attributes {cuf.proc_attr = #cuf.cuda_proc}
-! CHECK: %[[DESC:.*]] = cuf.alloc !fir.box>> {bindc_name = "a", data_attr = #cuf.cuda, uniq_name = "_QFsub8Ea"} -> !fir.ref>>>
+! CHECK: %[[DESC:.*]] = fir.alloca !fir.box>> {bindc_name = "a", uniq_name = "_QFsub8Ea"}
 ! CHECK: %[[A:.*]]:2 = hlfir.declare %[[DESC]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub8Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>)
 ! CHECK: %[[HEAP:.*]] = fir.allocmem !fir.array, %{{.*}} {fir.must_be_heap = true, uniq_name = "_QFsub8Ea.alloc"}
 ! CHECK: %[[SHAPE:.*]] = fir.shape %{{.*}} : (index) -> !fir.shape<1>
diff --git a/flang/test/Lower/CUDA/cuda-shared.cuf b/flang/test/Lower/CUDA/cuda-shared.cuf
index 565857f01bdb8..f41011df06ae7 100644
--- a/flang/test/Lower/CUDA/cuda-shared.cuf
+++ b/flang/test/Lower/CUDA/cuda-shared.cuf
@@ -9,5 +9,4 @@ end subroutine

 ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc}
 ! CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref>
-! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda}
 ! CHECK-NOT: cuf.free

From flang-commits at lists.llvm.org Thu May 22 11:13:04 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 22 May 2025 11:13:04 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cuda] Do not generate cuf.alloc/cuf.free in device context (PR #141117)
In-Reply-To:
Message-ID: <682f6930.630a0220.4d66b.8552@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-fir-hlfir

Author: Valentin Clement (バレンタイン クレメン) (clementval)
Changes `cuf.alloc` and `cuf.free` are converted to `fir.alloca` or deleted when in device context during the CUFOpConversion pass. Do not generate them in lowering to avoid confusion. --- Full diff: https://github.com/llvm/llvm-project/pull/141117.diff 3 Files Affected: - (modified) flang/lib/Lower/ConvertVariable.cpp (+8-4) - (modified) flang/test/Lower/CUDA/cuda-allocatable.cuf (+1-1) - (modified) flang/test/Lower/CUDA/cuda-shared.cuf (-1) ``````````diff diff --git a/flang/lib/Lower/ConvertVariable.cpp b/flang/lib/Lower/ConvertVariable.cpp index 372c71b6d4821..a28596bfd0099 100644 --- a/flang/lib/Lower/ConvertVariable.cpp +++ b/flang/lib/Lower/ConvertVariable.cpp @@ -25,6 +25,7 @@ #include "flang/Lower/StatementContext.h" #include "flang/Lower/Support/Utils.h" #include "flang/Lower/SymbolMap.h" +#include "flang/Optimizer/Builder/CUFCommon.h" #include "flang/Optimizer/Builder/Character.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" @@ -735,8 +736,10 @@ static mlir::Value createNewLocal(Fortran::lower::AbstractConverter &converter, if (dataAttr.getValue() == cuf::DataAttribute::Shared) return builder.create(loc, ty, nm, symNm, lenParams, indices); - return builder.create(loc, ty, nm, symNm, dataAttr, lenParams, - indices); + + if (!cuf::isCUDADeviceContext(builder.getRegion())) + return builder.create(loc, ty, nm, symNm, dataAttr, + lenParams, indices); } // Let the builder do all the heavy lifting. 
@@ -1072,8 +1075,9 @@ static void instantiateLocal(Fortran::lower::AbstractConverter &converter, if (mustBeDefaultInitializedAtRuntime(var)) Fortran::lower::defaultInitializeAtRuntime(converter, var.getSymbol(), symMap); - if (Fortran::semantics::NeedCUDAAlloc(var.getSymbol())) { - auto *builder = &converter.getFirOpBuilder(); + auto *builder = &converter.getFirOpBuilder(); + if (Fortran::semantics::NeedCUDAAlloc(var.getSymbol()) && + !cuf::isCUDADeviceContext(builder->getRegion())) { cuf::DataAttributeAttr dataAttr = Fortran::lower::translateSymbolCUFDataAttribute(builder->getContext(), var.getSymbol()); diff --git a/flang/test/Lower/CUDA/cuda-allocatable.cuf b/flang/test/Lower/CUDA/cuda-allocatable.cuf index cec10dda839e9..36e768bd7d92c 100644 --- a/flang/test/Lower/CUDA/cuda-allocatable.cuf +++ b/flang/test/Lower/CUDA/cuda-allocatable.cuf @@ -186,7 +186,7 @@ attributes(global) subroutine sub8() end subroutine ! CHECK-LABEL: func.func @_QPsub8() attributes {cuf.proc_attr = #cuf.cuda_proc} -! CHECK: %[[DESC:.*]] = cuf.alloc !fir.box>> {bindc_name = "a", data_attr = #cuf.cuda, uniq_name = "_QFsub8Ea"} -> !fir.ref>>> +! CHECK: %[[DESC:.*]] = fir.alloca !fir.box>> {bindc_name = "a", uniq_name = "_QFsub8Ea"} ! CHECK: %[[A:.*]]:2 = hlfir.declare %[[DESC]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub8Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) ! CHECK: %[[HEAP:.*]] = fir.allocmem !fir.array, %{{.*}} {fir.must_be_heap = true, uniq_name = "_QFsub8Ea.alloc"} ! CHECK: %[[SHAPE:.*]] = fir.shape %{{.*}} : (index) -> !fir.shape<1> diff --git a/flang/test/Lower/CUDA/cuda-shared.cuf b/flang/test/Lower/CUDA/cuda-shared.cuf index 565857f01bdb8..f41011df06ae7 100644 --- a/flang/test/Lower/CUDA/cuda-shared.cuf +++ b/flang/test/Lower/CUDA/cuda-shared.cuf @@ -9,5 +9,4 @@ end subroutine ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc} ! 
CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref> -! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda} ! CHECK-NOT: cuf.free ``````````
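The decision this patch moves into lowering can be summarized with a small standalone model. This is only a sketch under stated assumptions: `DataAttr`, `Context`, `chooseLocalAllocOp`, `emitsCufFree` and `checkExamples` are hypothetical names invented for illustration and are not part of flang; the real code uses `cuf::isCUDADeviceContext` and `Fortran::semantics::NeedCUDAAlloc` as shown in the diff above. It assumes C++17.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-ins for illustration only; these are not flang types.
enum class DataAttr { None, Device, Managed, Shared };
enum class Context { Host, Device };

// Name of the allocation op that lowering would emit for a local variable
// in this simplified model of createNewLocal after the patch.
constexpr std::string_view chooseLocalAllocOp(DataAttr attr, Context ctx) {
  if (attr == DataAttr::None)
    return "fir.alloca";        // no CUDA data attribute: plain alloca
  if (attr == DataAttr::Shared)
    return "cuf.shared_memory"; // shared memory keeps its dedicated op
  if (ctx == Context::Host)
    return "cuf.alloc";         // host code still gets the CUDA allocation
  return "fir.alloca";          // device context: cuf.alloc is not generated
}

// Whether a matching cuf.free would be scheduled for the variable.
// `needsCudaAlloc` models the result of the semantic query on the symbol
// (an assumption of this sketch, passed in as a plain bool).
constexpr bool emitsCufFree(bool needsCudaAlloc, Context ctx) {
  return needsCudaAlloc && ctx == Context::Host;
}

// Usage: the situations the updated tests appear to cover.
inline void checkExamples() {
  // Variable with a CUDA data attribute inside a kernel: plain fir.alloca.
  assert(chooseLocalAllocOp(DataAttr::Device, Context::Device) == "fir.alloca");
  // Same variable in host code: cuf.alloc is still emitted.
  assert(chooseLocalAllocOp(DataAttr::Device, Context::Host) == "cuf.alloc");
  // Shared memory in a kernel: cuf.shared_memory, and no cuf.free.
  assert(chooseLocalAllocOp(DataAttr::Shared, Context::Device) ==
         "cuf.shared_memory");
  assert(!emitsCufFree(true, Context::Device));
}
```

The model mirrors the two guards added in `ConvertVariable.cpp`: the allocation falls through to the generic builder path in device context, and the free is only attached when the enclosing region is not a device context.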
https://github.com/llvm/llvm-project/pull/141117

From flang-commits at lists.llvm.org Thu May 22 11:19:23 2025 From: flang-commits at lists.llvm.org (Zhen Wang via flang-commits) Date: Thu, 22 May 2025 11:19:23 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Do not generate cuf.alloc/cuf.free in device context (PR #141117) In-Reply-To: Message-ID: <682f6aab.170a0220.76e21.17e4@mx.google.com> https://github.com/wangzpgi approved this pull request. https://github.com/llvm/llvm-project/pull/141117

From flang-commits at lists.llvm.org Thu May 22 10:50:40 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Thu, 22 May 2025 10:50:40 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <682f63f0.170a0220.f4a80.dde1@mx.google.com> https://github.com/eZWALT updated https://github.com/llvm/llvm-project/pull/139293 >From 204d902b738dcd9d260963afab3d4f8f5f1c0066 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:25:33 +0000 Subject: [PATCH 1/9] Add fuse directive patch --- clang/include/clang-c/Index.h | 4 + clang/include/clang/AST/RecursiveASTVisitor.h | 3 + clang/include/clang/AST/StmtOpenMP.h | 105 +- .../clang/Basic/DiagnosticSemaKinds.td | 8 + clang/include/clang/Basic/StmtNodes.td | 1 + clang/include/clang/Sema/SemaOpenMP.h | 27 + .../include/clang/Serialization/ASTBitCodes.h | 1 + clang/lib/AST/StmtOpenMP.cpp | 25 + clang/lib/AST/StmtPrinter.cpp | 5 + clang/lib/AST/StmtProfile.cpp | 4 + clang/lib/Basic/OpenMPKinds.cpp | 2 +- clang/lib/CodeGen/CGStmt.cpp | 3 + clang/lib/CodeGen/CGStmtOpenMP.cpp | 8 + clang/lib/CodeGen/CodeGenFunction.h | 1 + clang/lib/Sema/SemaExceptionSpec.cpp | 1 + clang/lib/Sema/SemaOpenMP.cpp | 600 +++++++ clang/lib/Sema/TreeTransform.h | 11 + clang/lib/Serialization/ASTReaderStmt.cpp | 11 + 
clang/lib/Serialization/ASTWriterStmt.cpp | 6 + clang/lib/StaticAnalyzer/Core/ExprEngine.cpp | 1 + clang/test/OpenMP/fuse_ast_print.cpp | 278 +++ clang/test/OpenMP/fuse_codegen.cpp | 1511 +++++++++++++++++ clang/test/OpenMP/fuse_messages.cpp | 76 + clang/tools/libclang/CIndex.cpp | 7 + clang/tools/libclang/CXCursor.cpp | 3 + llvm/include/llvm/Frontend/OpenMP/OMP.td | 4 + .../runtime/test/transform/fuse/foreach.cpp | 192 +++ openmp/runtime/test/transform/fuse/intfor.c | 50 + .../runtime/test/transform/fuse/iterfor.cpp | 194 +++ .../fuse/parallel-wsloop-collapse-foreach.cpp | 208 +++ .../fuse/parallel-wsloop-collapse-intfor.c | 45 + 31 files changed, 3391 insertions(+), 4 deletions(-) create mode 100644 clang/test/OpenMP/fuse_ast_print.cpp create mode 100644 clang/test/OpenMP/fuse_codegen.cpp create mode 100644 clang/test/OpenMP/fuse_messages.cpp create mode 100644 openmp/runtime/test/transform/fuse/foreach.cpp create mode 100644 openmp/runtime/test/transform/fuse/intfor.c create mode 100644 openmp/runtime/test/transform/fuse/iterfor.cpp create mode 100644 openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp create mode 100644 openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c diff --git a/clang/include/clang-c/Index.h b/clang/include/clang-c/Index.h index d30d15e53802a..00046de62a742 100644 --- a/clang/include/clang-c/Index.h +++ b/clang/include/clang-c/Index.h @@ -2162,6 +2162,10 @@ enum CXCursorKind { */ CXCursor_OMPStripeDirective = 310, + /** OpenMP fuse directive + */ + CXCursor_OMPFuseDirective = 318, + /** OpenACC Compute Construct. 
*/ CXCursor_OpenACCComputeConstruct = 320, diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h index 23a8c4f1f7380..057e9e346ce4e 100644 --- a/clang/include/clang/AST/RecursiveASTVisitor.h +++ b/clang/include/clang/AST/RecursiveASTVisitor.h @@ -3080,6 +3080,9 @@ DEF_TRAVERSE_STMT(OMPUnrollDirective, DEF_TRAVERSE_STMT(OMPReverseDirective, { TRY_TO(TraverseOMPExecutableDirective(S)); }) +DEF_TRAVERSE_STMT(OMPFuseDirective, + { TRY_TO(TraverseOMPExecutableDirective(S)); }) + DEF_TRAVERSE_STMT(OMPInterchangeDirective, { TRY_TO(TraverseOMPExecutableDirective(S)); }) diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index 736bcabbad1f7..dc6f797e24ab8 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -962,6 +962,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Number of loops generated by this loop transformation. unsigned NumGeneratedLoops = 0; + /// Number of top level canonical loop nests generated by this loop + /// transformation + unsigned NumGeneratedLoopNests = 0; protected: explicit OMPLoopTransformationDirective(StmtClass SC, @@ -973,6 +976,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Set the number of loops generated by this loop transformation. void setNumGeneratedLoops(unsigned Num) { NumGeneratedLoops = Num; } + /// Set the number of top level canonical loop nests generated by this loop + /// transformation + void setNumGeneratedLoopNests(unsigned Num) { NumGeneratedLoopNests = Num; } public: /// Return the number of associated (consumed) loops. @@ -981,6 +987,10 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Return the number of loops generated by this loop transformation. 
unsigned getNumGeneratedLoops() const { return NumGeneratedLoops; } + /// Return the number of top level canonical loop nests generated by this loop + /// transformation + unsigned getNumGeneratedLoopNests() const { return NumGeneratedLoopNests; } + /// Get the de-sugared statements after the loop transformation. /// /// Might be nullptr if either the directive generates no loops and is handled @@ -995,7 +1005,8 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { Stmt::StmtClass C = T->getStmtClass(); return C == OMPTileDirectiveClass || C == OMPUnrollDirectiveClass || C == OMPReverseDirectiveClass || C == OMPInterchangeDirectiveClass || - C == OMPStripeDirectiveClass; + C == OMPStripeDirectiveClass || + C == OMPFuseDirectiveClass; } }; @@ -5562,6 +5573,7 @@ class OMPTileDirective final : public OMPLoopTransformationDirective { llvm::omp::OMPD_tile, StartLoc, EndLoc, NumLoops) { setNumGeneratedLoops(2 * NumLoops); + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5790,7 +5802,11 @@ class OMPReverseDirective final : public OMPLoopTransformationDirective { explicit OMPReverseDirective(SourceLocation StartLoc, SourceLocation EndLoc) : OMPLoopTransformationDirective(OMPReverseDirectiveClass, llvm::omp::OMPD_reverse, StartLoc, - EndLoc, 1) {} + EndLoc, 1) { + + setNumGeneratedLoopNests(1); + setNumGeneratedLoops(1); + } void setPreInits(Stmt *PreInits) { Data->getChildren()[PreInitsOffset] = PreInits; @@ -5857,7 +5873,8 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPInterchangeDirectiveClass, llvm::omp::OMPD_interchange, StartLoc, EndLoc, NumLoops) { - setNumGeneratedLoops(3 * NumLoops); + setNumGeneratedLoops(NumLoops); + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5908,6 +5925,88 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { } }; +/// Represents the '#pragma omp fuse' loop transformation directive 
+/// +/// \code{c} +/// #pragma omp fuse +/// { +/// for(int i = 0; i < m1; ++i) {...} +/// for(int j = 0; j < m2; ++j) {...} +/// ... +/// } +/// \endcode + +class OMPFuseDirective final : public OMPLoopTransformationDirective { + friend class ASTStmtReader; + friend class OMPExecutableDirective; + + // Offsets of child members. + enum { + PreInitsOffset = 0, + TransformedStmtOffset, + }; + + explicit OMPFuseDirective(SourceLocation StartLoc, SourceLocation EndLoc, + unsigned NumLoops) + : OMPLoopTransformationDirective(OMPFuseDirectiveClass, + llvm::omp::OMPD_fuse, StartLoc, EndLoc, + NumLoops) { + setNumGeneratedLoops(1); + // TODO: After implementing the looprange clause, change this logic + setNumGeneratedLoopNests(1); + } + + void setPreInits(Stmt *PreInits) { + Data->getChildren()[PreInitsOffset] = PreInits; + } + + void setTransformedStmt(Stmt *S) { + Data->getChildren()[TransformedStmtOffset] = S; + } + +public: + /// Create a new AST node representation for #pragma omp fuse' + /// + /// \param C Context of the AST + /// \param StartLoc Location of the introducer (e.g the 'omp' token) + /// \param EndLoc Location of the directive's end (e.g the tok::eod) + /// \param Clauses The directive's clauses + /// \param NumLoops Number of total affected loops + /// \param NumLoopNests Number of affected top level canonical loops + /// (number of items in the 'looprange' clause if present) + /// \param AssociatedStmt The outermost associated loop + /// \param TransformedStmt The loop nest after fusion, or nullptr in + /// dependent + /// \param PreInits Helper preinits statements for the loop nest + static OMPFuseDirective *Create(const ASTContext &C, SourceLocation StartLoc, + SourceLocation EndLoc, + ArrayRef Clauses, + unsigned NumLoops, unsigned NumLoopNests, + Stmt *AssociatedStmt, Stmt *TransformedStmt, + Stmt *PreInits); + + /// Build an empty '#pragma omp fuse' AST node for deserialization + /// + /// \param C Context of the AST + /// \param NumClauses 
Number of clauses to allocate + /// \param NumLoops Number of associated loops to allocate + static OMPFuseDirective *CreateEmpty(const ASTContext &C, unsigned NumClauses, + unsigned NumLoops); + + /// Gets the associated loops after the transformation. This is the de-sugared + /// replacement or nulltpr in dependent contexts. + Stmt *getTransformedStmt() const { + return Data->getChildren()[TransformedStmtOffset]; + } + + /// Return preinits statement. + Stmt *getPreInits() const { return Data->getChildren()[PreInitsOffset]; } + + static bool classof(const Stmt *T) { + return T->getStmtClass() == OMPFuseDirectiveClass; + } +}; + /// This represents '#pragma omp scan' directive. /// /// \code diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 78b36ceb88125..f31b6f8a3b26a 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -11558,6 +11558,14 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; +def warn_omp_different_loop_ind_var_types : Warning < + "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">; +def err_omp_not_canonical_loop : Error < + "loop after '#pragma omp %0' is not in canonical form">; +def err_omp_not_a_loop_sequence : Error < + "statement after '#pragma omp %0' must be a loop sequence containing canonical loops or loop-generating constructs">; +def err_omp_empty_loop_sequence : Error < + "loop sequence after '#pragma omp %0' must contain at least 1 canonical loop or loop-generating construct">; def err_omp_not_for : Error< "%select{statement after '#pragma omp %1' must be a for loop|" "expected %2 for loops after '#pragma omp %1'%select{|, but found only %4}3}0">; diff --git 
a/clang/include/clang/Basic/StmtNodes.td b/clang/include/clang/Basic/StmtNodes.td index 9526fa5808aa5..739160342062c 100644 --- a/clang/include/clang/Basic/StmtNodes.td +++ b/clang/include/clang/Basic/StmtNodes.td @@ -234,6 +234,7 @@ def OMPStripeDirective : StmtNode; def OMPUnrollDirective : StmtNode; def OMPReverseDirective : StmtNode; def OMPInterchangeDirective : StmtNode; +def OMPFuseDirective : StmtNode; def OMPForDirective : StmtNode; def OMPForSimdDirective : StmtNode; def OMPSectionsDirective : StmtNode; diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index 6498390fe96f7..8d78c2197c89d 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -457,6 +457,13 @@ class SemaOpenMP : public SemaBase { Stmt *AStmt, SourceLocation StartLoc, SourceLocation EndLoc); + + /// Called on well-formed '#pragma omp fuse' after parsing of its + /// clauses and the associated statement. + StmtResult ActOnOpenMPFuseDirective(ArrayRef Clauses, + Stmt *AStmt, SourceLocation StartLoc, + SourceLocation EndLoc); + /// Called on well-formed '\#pragma omp for' after parsing /// of the associated statement. StmtResult @@ -1480,6 +1487,26 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, Stmt *&Body, SmallVectorImpl> &OriginalInits); + /// Analyzes and checks a loop sequence for use by a loop transformation + /// + /// \param Kind The loop transformation directive kind. + /// \param NumLoops [out] Number of total canonical loops + /// \param LoopSeqSize [out] Number of top level canonical loops + /// \param LoopHelpers [out] The multiple loop analyses results. + /// \param LoopStmts [out] The multiple Stmt of each For loop. + /// \param OriginalInits [out] The multiple collection of statements and + /// declarations that must have been executed/declared + /// before entering the loop. 
+ /// \param Context + /// \return Whether there was an absence of errors or not + bool checkTransformableLoopSequence( + OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, + unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + ASTContext &Context); + /// Helper to keep information about the current `omp begin/end declare /// variant` nesting. struct OMPDeclareVariantScope { diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h index 5cb9998126a85..8fe9d8248d66f 100644 --- a/clang/include/clang/Serialization/ASTBitCodes.h +++ b/clang/include/clang/Serialization/ASTBitCodes.h @@ -1948,6 +1948,7 @@ enum StmtCode { STMT_OMP_UNROLL_DIRECTIVE, STMT_OMP_REVERSE_DIRECTIVE, STMT_OMP_INTERCHANGE_DIRECTIVE, + STMT_OMP_FUSE_DIRECTIVE, STMT_OMP_FOR_DIRECTIVE, STMT_OMP_FOR_SIMD_DIRECTIVE, STMT_OMP_SECTIONS_DIRECTIVE, diff --git a/clang/lib/AST/StmtOpenMP.cpp b/clang/lib/AST/StmtOpenMP.cpp index 093e1f659916f..4a6133766ef1c 100644 --- a/clang/lib/AST/StmtOpenMP.cpp +++ b/clang/lib/AST/StmtOpenMP.cpp @@ -456,6 +456,8 @@ OMPUnrollDirective::Create(const ASTContext &C, SourceLocation StartLoc, auto *Dir = createDirective( C, Clauses, AssociatedStmt, TransformedStmtOffset + 1, StartLoc, EndLoc); Dir->setNumGeneratedLoops(NumGeneratedLoops); + // The number of generated loops and loop nests during unroll matches + Dir->setNumGeneratedLoopNests(NumGeneratedLoops); Dir->setTransformedStmt(TransformedStmt); Dir->setPreInits(PreInits); return Dir; @@ -505,6 +507,29 @@ OMPInterchangeDirective::CreateEmpty(const ASTContext &C, unsigned NumClauses, SourceLocation(), SourceLocation(), NumLoops); } +OMPFuseDirective *OMPFuseDirective::Create( + const ASTContext &C, SourceLocation StartLoc, SourceLocation EndLoc, + ArrayRef Clauses, unsigned NumLoops, unsigned NumLoopNests, + Stmt *AssociatedStmt, Stmt *TransformedStmt, Stmt *PreInits) { + + OMPFuseDirective *Dir 
= createDirective( + C, Clauses, AssociatedStmt, TransformedStmtOffset + 1, StartLoc, EndLoc, + NumLoops); + Dir->setTransformedStmt(TransformedStmt); + Dir->setPreInits(PreInits); + Dir->setNumGeneratedLoopNests(NumLoopNests); + Dir->setNumGeneratedLoops(NumLoops); + return Dir; +} + +OMPFuseDirective *OMPFuseDirective::CreateEmpty(const ASTContext &C, + unsigned NumClauses, + unsigned NumLoops) { + return createEmptyDirective( + C, NumClauses, /*HasAssociatedStmt=*/true, TransformedStmtOffset + 1, + SourceLocation(), SourceLocation(), NumLoops); +} + OMPForSimdDirective * OMPForSimdDirective::Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation EndLoc, unsigned CollapsedNum, diff --git a/clang/lib/AST/StmtPrinter.cpp b/clang/lib/AST/StmtPrinter.cpp index dc8af1586624b..12a1d5a943704 100644 --- a/clang/lib/AST/StmtPrinter.cpp +++ b/clang/lib/AST/StmtPrinter.cpp @@ -791,6 +791,11 @@ void StmtPrinter::VisitOMPInterchangeDirective(OMPInterchangeDirective *Node) { PrintOMPExecutableDirective(Node); } +void StmtPrinter::VisitOMPFuseDirective(OMPFuseDirective *Node) { + Indent() << "#pragma omp fuse"; + PrintOMPExecutableDirective(Node); +} + void StmtPrinter::VisitOMPForDirective(OMPForDirective *Node) { Indent() << "#pragma omp for"; PrintOMPExecutableDirective(Node); diff --git a/clang/lib/AST/StmtProfile.cpp b/clang/lib/AST/StmtProfile.cpp index f7d1655f67ed1..99d426db985e8 100644 --- a/clang/lib/AST/StmtProfile.cpp +++ b/clang/lib/AST/StmtProfile.cpp @@ -1026,6 +1026,10 @@ void StmtProfiler::VisitOMPInterchangeDirective( VisitOMPLoopTransformationDirective(S); } +void StmtProfiler::VisitOMPFuseDirective(const OMPFuseDirective *S) { + VisitOMPLoopTransformationDirective(S); +} + void StmtProfiler::VisitOMPForDirective(const OMPForDirective *S) { VisitOMPLoopDirective(S); } diff --git a/clang/lib/Basic/OpenMPKinds.cpp b/clang/lib/Basic/OpenMPKinds.cpp index a451fc7c01841..d172450512f13 100644 --- a/clang/lib/Basic/OpenMPKinds.cpp +++ 
b/clang/lib/Basic/OpenMPKinds.cpp @@ -702,7 +702,7 @@ bool clang::isOpenMPLoopBoundSharingDirective(OpenMPDirectiveKind Kind) { bool clang::isOpenMPLoopTransformationDirective(OpenMPDirectiveKind DKind) { return DKind == OMPD_tile || DKind == OMPD_unroll || DKind == OMPD_reverse || - DKind == OMPD_interchange || DKind == OMPD_stripe; + DKind == OMPD_interchange || DKind == OMPD_stripe || DKind == OMPD_fuse; } bool clang::isOpenMPCombinedParallelADirective(OpenMPDirectiveKind DKind) { diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp index 3562b4ea22a24..4a2dc1a537d46 100644 --- a/clang/lib/CodeGen/CGStmt.cpp +++ b/clang/lib/CodeGen/CGStmt.cpp @@ -233,6 +233,9 @@ void CodeGenFunction::EmitStmt(const Stmt *S, ArrayRef Attrs) { case Stmt::OMPInterchangeDirectiveClass: EmitOMPInterchangeDirective(cast(*S)); break; + case Stmt::OMPFuseDirectiveClass: + EmitOMPFuseDirective(cast(*S)); + break; case Stmt::OMPForDirectiveClass: EmitOMPForDirective(cast(*S)); break; diff --git a/clang/lib/CodeGen/CGStmtOpenMP.cpp b/clang/lib/CodeGen/CGStmtOpenMP.cpp index 803c7ed37635e..0c664b0f89044 100644 --- a/clang/lib/CodeGen/CGStmtOpenMP.cpp +++ b/clang/lib/CodeGen/CGStmtOpenMP.cpp @@ -197,6 +197,8 @@ class OMPLoopScope : public CodeGenFunction::RunCleanupsScope { } else if (const auto *Interchange = dyn_cast(&S)) { PreInits = Interchange->getPreInits(); + } else if (const auto *Fuse = dyn_cast(&S)) { + PreInits = Fuse->getPreInits(); } else { llvm_unreachable("Unknown loop-based directive kind."); } @@ -2918,6 +2920,12 @@ void CodeGenFunction::EmitOMPInterchangeDirective( EmitStmt(S.getTransformedStmt()); } +void CodeGenFunction::EmitOMPFuseDirective(const OMPFuseDirective &S) { + // Emit the de-sugared statement + OMPTransformDirectiveScopeRAII FuseScope(*this, &S); + EmitStmt(S.getTransformedStmt()); +} + void CodeGenFunction::EmitOMPUnrollDirective(const OMPUnrollDirective &S) { bool UseOMPIRBuilder = CGM.getLangOpts().OpenMPIRBuilder; diff --git 
a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h index 78d71fc822bcb..a983901f560de 100644 --- a/clang/lib/CodeGen/CodeGenFunction.h +++ b/clang/lib/CodeGen/CodeGenFunction.h @@ -3906,6 +3906,7 @@ class CodeGenFunction : public CodeGenTypeCache { void EmitOMPUnrollDirective(const OMPUnrollDirective &S); void EmitOMPReverseDirective(const OMPReverseDirective &S); void EmitOMPInterchangeDirective(const OMPInterchangeDirective &S); + void EmitOMPFuseDirective(const OMPFuseDirective &S); void EmitOMPForDirective(const OMPForDirective &S); void EmitOMPForSimdDirective(const OMPForSimdDirective &S); void EmitOMPScopeDirective(const OMPScopeDirective &S); diff --git a/clang/lib/Sema/SemaExceptionSpec.cpp b/clang/lib/Sema/SemaExceptionSpec.cpp index c83eab53891ca..85a374e6eb9b2 100644 --- a/clang/lib/Sema/SemaExceptionSpec.cpp +++ b/clang/lib/Sema/SemaExceptionSpec.cpp @@ -1491,6 +1491,7 @@ CanThrowResult Sema::canThrow(const Stmt *S) { case Stmt::OMPUnrollDirectiveClass: case Stmt::OMPReverseDirectiveClass: case Stmt::OMPInterchangeDirectiveClass: + case Stmt::OMPFuseDirectiveClass: case Stmt::OMPSingleDirectiveClass: case Stmt::OMPTargetDataDirectiveClass: case Stmt::OMPTargetDirectiveClass: diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index f16f841d62edd..bd8bee64a9d2f 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -4404,6 +4404,7 @@ void SemaOpenMP::ActOnOpenMPRegionStart(OpenMPDirectiveKind DKind, case OMPD_unroll: case OMPD_reverse: case OMPD_interchange: + case OMPD_fuse: case OMPD_assume: break; default: @@ -6221,6 +6222,10 @@ StmtResult SemaOpenMP::ActOnOpenMPExecutableDirective( Res = ActOnOpenMPInterchangeDirective(ClausesWithImplicit, AStmt, StartLoc, EndLoc); break; + case OMPD_fuse: + Res = + ActOnOpenMPFuseDirective(ClausesWithImplicit, AStmt, StartLoc, EndLoc); + break; case OMPD_for: Res = ActOnOpenMPForDirective(ClausesWithImplicit, AStmt, StartLoc, EndLoc, 
VarsWithInheritedDSA); @@ -14193,6 +14198,8 @@ bool SemaOpenMP::checkTransformableLoopNest( DependentPreInits = Dir->getPreInits(); else if (auto *Dir = dyn_cast(Transform)) DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); else llvm_unreachable("Unhandled loop transformation"); @@ -14203,6 +14210,265 @@ bool SemaOpenMP::checkTransformableLoopNest( return Result; } +class NestedLoopCounterVisitor + : public clang::RecursiveASTVisitor { +public: + explicit NestedLoopCounterVisitor() : NestedLoopCount(0) {} + + bool VisitForStmt(clang::ForStmt *FS) { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(clang::CXXForRangeStmt *FRS) { + ++NestedLoopCount; + return true; + } + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + +private: + unsigned NestedLoopCount; +}; + +bool SemaOpenMP::checkTransformableLoopSequence( + OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, + unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + ASTContext &Context) { + + // Checks whether the given statement is a compound statement + VarsWithInheritedDSAType TmpDSA; + if (!isa(AStmt)) { + Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) + << getOpenMPDirectiveName(Kind); + return false; + } + // Callback for updating pre-inits in case there are even more + // loop-sequence-generating-constructs inside of the main compound stmt + auto OnTransformationCallback = + [&OriginalInits](OMPLoopBasedDirective *Transform) { + Stmt *DependentPreInits; + if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir 
= dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else + llvm_unreachable("Unhandled loop transformation"); + + appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + }; + + // Number of top level canonical loop nests observed (And acts as index) + LoopSeqSize = 0; + // Number of total observed loops + NumLoops = 0; + + // Following OpenMP 6.0 API Specification, a Canonical Loop Sequence follows + // the grammar: + // + // canonical-loop-sequence: + // { + // loop-sequence+ + // } + // where loop-sequence can be any of the following: + // 1. canonical-loop-sequence + // 2. loop-nest + // 3. loop-sequence-generating-construct (i.e OMPLoopTransformationDirective) + // + // To recognise and traverse this structure the following helper functions + // have been defined. handleLoopSequence serves as the recurisve entry point + // and tries to match the input AST to the canonical loop sequence grammar + // structure + + auto NLCV = NestedLoopCounterVisitor(); + // Helper functions to validate canonical loop sequence grammar is valid + auto isLoopSequenceDerivation = [](auto *Child) { + return isa(Child) || isa(Child) || + isa(Child); + }; + auto isLoopGeneratingStmt = [](auto *Child) { + return isa(Child); + }; + + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + QualType BaseInductionVarType; + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { + if (auto *For = dyn_cast(LoopStmt)) { + OriginalInits.back().push_back(For->getInit()); + ForStmts.push_back(For); + // Extract induction variable + if (auto *InitStmt = dyn_cast_or_null(For->getInit())) { + if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { + QualType InductionVarType = InitDecl->getType().getCanonicalType(); + + // Compare with first loop type + if 
(BaseInductionVarType.isNull()) { + BaseInductionVarType = InductionVarType; + } else if (!Context.hasSameType(BaseInductionVarType, + InductionVarType)) { + Diag(InitDecl->getBeginLoc(), + diag::warn_omp_different_loop_ind_var_types) + << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType + << InductionVarType; + } + } + } + + } else { + assert(isa(LoopStmt) && + "Expected canonical for or range-based for loops."); + auto *CXXFor = dyn_cast(LoopStmt); + OriginalInits.back().push_back(CXXFor->getBeginStmt()); + ForStmts.push_back(CXXFor); + } + }; + // Helper lambda functions to encapsulate the processing of different + // derivations of the canonical loop sequence grammar + // + // Modularized code for handling loop generation and transformations + auto handleLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, &OnTransformationCallback, + this](Stmt *Child) { + auto LoopTransform = dyn_cast(Child); + Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); + unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); + + // Handle the case where transformed statement is not available due to + // dependent contexts + if (!TransformedStmt) { + if (NumGeneratedLoopNests > 0) + return true; + // Unroll full + else { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + } + // Handle loop transformations with multiple loop nests + // Unroll full + if (NumGeneratedLoopNests <= 0) { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + // Future loop transformations that generate multiple canonical loops + } else if (NumGeneratedLoopNests > 1) { + llvm_unreachable("Multiple canonical loop generating transformations " + "like loop splitting are not yet supported"); + } + + // Process the transformed loop statement + Child = TransformedStmt; + 
OriginalInits.emplace_back(); + LoopHelpers.emplace_back(); + OnTransformationCallback(LoopTransform); + + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, + TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(Child->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + storeLoopStatements(TransformedStmt); + NumLoops += LoopTransform->getNumGeneratedLoops(); + return true; + }; + + // Modularized code for handling regular canonical loops + auto handleRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, + &LoopSeqSize, &NumLoops, Kind, &TmpDSA, &NLCV, + this](Stmt *Child) { + OriginalInits.emplace_back(); + LoopHelpers.emplace_back(); + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, + TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(Child->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + storeLoopStatements(Child); + NumLoops += NLCV.TraverseStmt(Child); + return true; + }; + + // Helper function to process a Loop Sequence Recursively + auto handleLoopSequence = [&](Stmt *LoopSeqStmt, + auto &handleLoopSequenceCallback) -> bool { + for (auto *Child : LoopSeqStmt->children()) { + if (!Child) + continue; + + // Skip over non-loop-sequence statements + if (!isLoopSequenceDerivation(Child)) { + Child = Child->IgnoreContainers(); + + // Ignore empty compound statement + if (!Child) + continue; + + // In the case of a nested loop sequence ignoring containers would not + // be enough, a recurisve transversal of the loop sequence is required + if (isa(Child)) { + if (!handleLoopSequenceCallback(Child, handleLoopSequenceCallback)) + return false; + // Already been treated, skip this children + continue; + } + } + // Regular loop sequence handling + if (isLoopSequenceDerivation(Child)) { + if (isLoopGeneratingStmt(Child)) { + if 
(!handleLoopGeneration(Child)) { + return false; + } + } else { + if (!handleRegularLoop(Child)) { + return false; + } + } + ++LoopSeqSize; + } else { + // Report error for invalid statement inside canonical loop sequence + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + } + return true; + }; + + // Recursive entry point to process the main loop sequence + if (!handleLoopSequence(AStmt, handleLoopSequence)) { + return false; + } + + if (LoopSeqSize <= 0) { + Diag(AStmt->getBeginLoc(), diag::err_omp_empty_loop_sequence) + << getOpenMPDirectiveName(Kind); + return false; + } + return true; +} + /// Add preinit statements that need to be propageted from the selected loop. static void addLoopPreInits(ASTContext &Context, OMPLoopBasedDirective::HelperExprs &LoopHelper, @@ -15462,6 +15728,340 @@ StmtResult SemaOpenMP::ActOnOpenMPInterchangeDirective( buildPreInits(Context, PreInits)); } +StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, + Stmt *AStmt, + SourceLocation StartLoc, + SourceLocation EndLoc) { + ASTContext &Context = getASTContext(); + DeclContext *CurrContext = SemaRef.CurContext; + Scope *CurScope = SemaRef.getCurScope(); + CaptureVars CopyTransformer(SemaRef); + + // Ensure the structured block is not empty + if (!AStmt) { + return StmtError(); + } + // Validate that the potential loop sequence is transformable for fusion + // Also collect the HelperExprs, Loop Stmts, Inits, and Number of loops + SmallVector LoopHelpers; + SmallVector LoopStmts; + SmallVector> OriginalInits; + + unsigned NumLoops; + // TODO: Support looprange clause using LoopSeqSize + unsigned LoopSeqSize; + if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, + LoopHelpers, LoopStmts, OriginalInits, + Context)) { + return StmtError(); + } + + // Defer transformation in dependent contexts + if (CurrContext->isDependentContext()) { + return OMPFuseDirective::Create(Context, StartLoc, 
EndLoc, Clauses, + NumLoops, 1, AStmt, nullptr, nullptr); + } + assert(LoopHelpers.size() == LoopSeqSize && + "Expecting loop iteration space dimensionality to match number of " + "affected loops"); + assert(OriginalInits.size() == LoopSeqSize && + "Expecting loop iteration space dimensionality to match number of " + "affected loops"); + + // PreInits hold a sequence of variable declarations that must be executed + // before the fused loop begins. These include bounds, strides, and other + // helper variables required for the transformation. + SmallVector PreInits; + + // Select the type with the largest bit width among all induction variables + QualType IVType = LoopHelpers[0].IterationVarRef->getType(); + for (unsigned int I = 1; I < LoopSeqSize; ++I) { + QualType CurrentIVType = LoopHelpers[I].IterationVarRef->getType(); + if (Context.getTypeSize(CurrentIVType) > Context.getTypeSize(IVType)) { + IVType = CurrentIVType; + } + } + uint64_t IVBitWidth = Context.getIntWidth(IVType); + + // Create pre-init declarations for all loops lower bounds, upper bounds, + // strides and num-iterations + SmallVector LBVarDecls; + SmallVector STVarDecls; + SmallVector NIVarDecls; + SmallVector UBVarDecls; + SmallVector IVVarDecls; + + // Helper lambda to create variables for bounds, strides, and other + // expressions. Generates both the variable declaration and the corresponding + // initialization statement. 
+ auto CreateHelperVarAndStmt = + [&SemaRef = this->SemaRef, &Context, &CopyTransformer, + &IVType](Expr *ExprToCopy, const std::string &BaseName, unsigned I, + bool NeedsNewVD = false) { + Expr *TransformedExpr = + AssertSuccess(CopyTransformer.TransformExpr(ExprToCopy)); + if (!TransformedExpr) + return std::pair(nullptr, StmtError()); + + auto Name = (Twine(".omp.") + BaseName + std::to_string(I)).str(); + + VarDecl *VD; + if (NeedsNewVD) { + VD = buildVarDecl(SemaRef, SourceLocation(), IVType, Name); + SemaRef.AddInitializerToDecl(VD, TransformedExpr, false); + + } else { + // Create a unique variable name + DeclRefExpr *DRE = cast(TransformedExpr); + VD = cast(DRE->getDecl()); + VD->setDeclName(&SemaRef.PP.getIdentifierTable().get(Name)); + } + // Create the corresponding declaration statement + StmtResult DeclStmt = new (Context) class DeclStmt( + DeclGroupRef(VD), SourceLocation(), SourceLocation()); + return std::make_pair(VD, DeclStmt); + }; + + // Process each single loop to generate and collect declarations + // and statements for all helper expressions + for (unsigned int I = 0; I < LoopSeqSize; ++I) { + addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], + PreInits); + + auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", I); + auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", I); + auto [STVD, STDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", I); + auto [NIVD, NIDStmt] = + CreateHelperVarAndStmt(LoopHelpers[I].NumIterations, "ni", I, true); + auto [IVVD, IVDStmt] = + CreateHelperVarAndStmt(LoopHelpers[I].IterationVarRef, "iv", I); + + if (!LBVD || !STVD || !NIVD || !IVVD) + return StmtError(); + + UBVarDecls.push_back(UBVD); + LBVarDecls.push_back(LBVD); + STVarDecls.push_back(STVD); + NIVarDecls.push_back(NIVD); + IVVarDecls.push_back(IVVD); + + PreInits.push_back(UBDStmt.get()); + PreInits.push_back(LBDStmt.get()); + PreInits.push_back(STDStmt.get()); + 
PreInits.push_back(NIDStmt.get()); + PreInits.push_back(IVDStmt.get()); + } + + auto MakeVarDeclRef = [&SemaRef = this->SemaRef](VarDecl *VD) { + return buildDeclRefExpr(SemaRef, VD, VD->getType(), VD->getLocation(), + false); + }; + + // Following up the creation of the final fused loop will be performed + // which has the following shape (considering the selected loops): + // + // for (fuse.index = 0; fuse.index < max(ni0, ni1..., nik); ++fuse.index) { + // if (fuse.index < ni0){ + // iv0 = lb0 + st0 * fuse.index; + // original.index0 = iv0 + // body(0); + // } + // if (fuse.index < ni1){ + // iv1 = lb1 + st1 * fuse.index; + // original.index1 = iv1 + // body(1); + // } + // + // ... + // + // if (fuse.index < nik){ + // ivk = lbk + stk * fuse.index; + // original.indexk = ivk + // body(k); Expr *InitVal = IntegerLiteral::Create(Context, + // llvm::APInt(IVWidth, 0), + + // } + + // 1. Create the initialized fuse index + const std::string IndexName = Twine(".omp.fuse.index").str(); + Expr *InitVal = IntegerLiteral::Create(Context, llvm::APInt(IVBitWidth, 0), + IVType, SourceLocation()); + VarDecl *IndexDecl = + buildVarDecl(SemaRef, {}, IVType, IndexName, nullptr, nullptr); + SemaRef.AddInitializerToDecl(IndexDecl, InitVal, false); + StmtResult InitStmt = new (Context) + DeclStmt(DeclGroupRef(IndexDecl), SourceLocation(), SourceLocation()); + + if (!InitStmt.isUsable()) + return StmtError(); + + auto MakeIVRef = [&SemaRef = this->SemaRef, IndexDecl, IVType, + Loc = InitVal->getExprLoc()]() { + return buildDeclRefExpr(SemaRef, IndexDecl, IVType, Loc, false); + }; + + // 2. Iteratively compute the max number of logical iterations Max(NI_1, NI_2, + // ..., NI_k) + // + // This loop accumulates the maximum value across multiple expressions, + // ensuring each step constructs a unique AST node for correctness. 
By using + // intermediate temporary variables and conditional operators, we maintain + // distinct nodes and avoid duplicating subtrees, For instance, max(a,b,c): + // omp.temp0 = max(a, b) + // omp.temp1 = max(omp.temp0, c) + // omp.fuse.max = max(omp.temp1, omp.temp0) + + ExprResult MaxExpr; + for (unsigned I = 0; I < LoopSeqSize; ++I) { + DeclRefExpr *NIRef = MakeVarDeclRef(NIVarDecls[I]); + QualType NITy = NIRef->getType(); + + if (MaxExpr.isUnset()) { + // Initialize MaxExpr with the first NI expression + MaxExpr = NIRef; + } else { + // Create a new acummulator variable t_i = MaxExpr + std::string TempName = (Twine(".omp.temp.") + Twine(I)).str(); + VarDecl *TempDecl = + buildVarDecl(SemaRef, {}, NITy, TempName, nullptr, nullptr); + TempDecl->setInit(MaxExpr.get()); + DeclRefExpr *TempRef = + buildDeclRefExpr(SemaRef, TempDecl, NITy, SourceLocation(), false); + DeclRefExpr *TempRef2 = + buildDeclRefExpr(SemaRef, TempDecl, NITy, SourceLocation(), false); + // Add a DeclStmt to PreInits to ensure the variable is declared. + StmtResult TempStmt = new (Context) + DeclStmt(DeclGroupRef(TempDecl), SourceLocation(), SourceLocation()); + + if (!TempStmt.isUsable()) + return StmtError(); + PreInits.push_back(TempStmt.get()); + + // Build MaxExpr <-(MaxExpr > NIRef ? MaxExpr : NIRef) + ExprResult Comparison = + SemaRef.BuildBinOp(nullptr, SourceLocation(), BO_GT, TempRef, NIRef); + // Handle any errors in Comparison creation + if (!Comparison.isUsable()) + return StmtError(); + + DeclRefExpr *NIRef2 = MakeVarDeclRef(NIVarDecls[I]); + // Update MaxExpr using a conditional expression to hold the max value + MaxExpr = new (Context) ConditionalOperator( + Comparison.get(), SourceLocation(), TempRef2, SourceLocation(), + NIRef2->getExprStmt(), NITy, VK_LValue, OK_Ordinary); + + if (!MaxExpr.isUsable()) + return StmtError(); + } + } + if (!MaxExpr.isUsable()) + return StmtError(); + + // 3. 
Declare the max variable + const std::string MaxName = Twine(".omp.fuse.max").str(); + VarDecl *MaxDecl = + buildVarDecl(SemaRef, {}, IVType, MaxName, nullptr, nullptr); + MaxDecl->setInit(MaxExpr.get()); + DeclRefExpr *MaxRef = buildDeclRefExpr(SemaRef, MaxDecl, IVType, {}, false); + StmtResult MaxStmt = new (Context) + DeclStmt(DeclGroupRef(MaxDecl), SourceLocation(), SourceLocation()); + + if (MaxStmt.isInvalid()) + return StmtError(); + PreInits.push_back(MaxStmt.get()); + + // 4. Create condition Expr: index < n_max + ExprResult CondExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_LT, + MakeIVRef(), MaxRef); + if (!CondExpr.isUsable()) + return StmtError(); + // 5. Increment Expr: ++index + ExprResult IncrExpr = + SemaRef.BuildUnaryOp(CurScope, SourceLocation(), UO_PreInc, MakeIVRef()); + if (!IncrExpr.isUsable()) + return StmtError(); + + // 6. Build the Fused Loop Body + // The final fused loop iterates over the maximum logical range. Inside the + // loop, each original loop's index is calculated dynamically, and its body + // is executed conditionally. 
+ // + // Each sub-loop's body is guarded by a conditional statement to ensure + // it executes only within its logical iteration range: + // + // if (fuse.index < ni_k){ + // iv_k = lb_k + st_k * fuse.index; + // original.index = iv_k + // body(k); + // } + + CompoundStmt *FusedBody = nullptr; + SmallVector FusedBodyStmts; + for (unsigned I = 0; I < LoopSeqSize; ++I) { + + // Assignment of the original sub-loop index to compute the logical index + // IV_k = LB_k + omp.fuse.index * ST_k + + ExprResult IdxExpr = + SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Mul, + MakeVarDeclRef(STVarDecls[I]), MakeIVRef()); + if (!IdxExpr.isUsable()) + return StmtError(); + IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Add, + MakeVarDeclRef(LBVarDecls[I]), IdxExpr.get()); + + if (!IdxExpr.isUsable()) + return StmtError(); + IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Assign, + MakeVarDeclRef(IVVarDecls[I]), IdxExpr.get()); + if (!IdxExpr.isUsable()) + return StmtError(); + + // Update the original i_k = IV_k + SmallVector BodyStmts; + BodyStmts.push_back(IdxExpr.get()); + llvm::append_range(BodyStmts, LoopHelpers[I].Updates); + + if (auto *SourceCXXFor = dyn_cast(LoopStmts[I])) + BodyStmts.push_back(SourceCXXFor->getLoopVarStmt()); + + Stmt *Body = (isa(LoopStmts[I])) + ?
cast(LoopStmts[I])->getBody() + : cast(LoopStmts[I])->getBody(); + + BodyStmts.push_back(Body); + + CompoundStmt *CombinedBody = + CompoundStmt::Create(Context, BodyStmts, FPOptionsOverride(), + SourceLocation(), SourceLocation()); + ExprResult Condition = + SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_LT, MakeIVRef(), + MakeVarDeclRef(NIVarDecls[I])); + + if (!Condition.isUsable()) + return StmtError(); + + IfStmt *IfStatement = IfStmt::Create( + Context, SourceLocation(), IfStatementKind::Ordinary, nullptr, nullptr, + Condition.get(), SourceLocation(), SourceLocation(), CombinedBody, + SourceLocation(), nullptr); + + FusedBodyStmts.push_back(IfStatement); + } + FusedBody = CompoundStmt::Create(Context, FusedBodyStmts, FPOptionsOverride(), + SourceLocation(), SourceLocation()); + + // 7. Construct the final fused loop + ForStmt *FusedForStmt = new (Context) + ForStmt(Context, InitStmt.get(), CondExpr.get(), nullptr, IncrExpr.get(), + FusedBody, InitStmt.get()->getBeginLoc(), SourceLocation(), + IncrExpr.get()->getEndLoc()); + + return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, NumLoops, + 1, AStmt, FusedForStmt, + buildPreInits(Context, PreInits)); +} + OMPClause *SemaOpenMP::ActOnOpenMPSingleExprClause(OpenMPClauseKind Kind, Expr *Expr, SourceLocation StartLoc, diff --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h index 335e21d927b76..034b0c8243667 100644 --- a/clang/lib/Sema/TreeTransform.h +++ b/clang/lib/Sema/TreeTransform.h @@ -9666,6 +9666,17 @@ StmtResult TreeTransform::TransformOMPInterchangeDirective( return Res; } +template +StmtResult +TreeTransform::TransformOMPFuseDirective(OMPFuseDirective *D) { + DeclarationNameInfo DirName; + getDerived().getSema().OpenMP().StartOpenMPDSABlock( + D->getDirectiveKind(), DirName, nullptr, D->getBeginLoc()); + StmtResult Res = getDerived().TransformOMPExecutableDirective(D); + getDerived().getSema().OpenMP().EndOpenMPDSABlock(Res.get()); + return Res; +} + template 
StmtResult TreeTransform::TransformOMPForDirective(OMPForDirective *D) { diff --git a/clang/lib/Serialization/ASTReaderStmt.cpp b/clang/lib/Serialization/ASTReaderStmt.cpp index 0ba0378754eb4..6762d11d6b73e 100644 --- a/clang/lib/Serialization/ASTReaderStmt.cpp +++ b/clang/lib/Serialization/ASTReaderStmt.cpp @@ -2449,6 +2449,7 @@ void ASTStmtReader::VisitOMPLoopTransformationDirective( OMPLoopTransformationDirective *D) { VisitOMPLoopBasedDirective(D); D->setNumGeneratedLoops(Record.readUInt32()); + D->setNumGeneratedLoopNests(Record.readUInt32()); } void ASTStmtReader::VisitOMPTileDirective(OMPTileDirective *D) { @@ -2471,6 +2472,10 @@ void ASTStmtReader::VisitOMPInterchangeDirective(OMPInterchangeDirective *D) { VisitOMPLoopTransformationDirective(D); } +void ASTStmtReader::VisitOMPFuseDirective(OMPFuseDirective *D) { + VisitOMPLoopTransformationDirective(D); +} + void ASTStmtReader::VisitOMPForDirective(OMPForDirective *D) { VisitOMPLoopDirective(D); D->setHasCancel(Record.readBool()); @@ -3613,6 +3618,12 @@ Stmt *ASTReader::ReadStmtFromStream(ModuleFile &F) { S = OMPReverseDirective::CreateEmpty(Context); break; } + case STMT_OMP_FUSE_DIRECTIVE: { + unsigned NumLoops = Record[ASTStmtReader::NumStmtFields]; + unsigned NumClauses = Record[ASTStmtReader::NumStmtFields + 1]; + S = OMPFuseDirective::CreateEmpty(Context, NumClauses, NumLoops); + break; + } case STMT_OMP_INTERCHANGE_DIRECTIVE: { unsigned NumLoops = Record[ASTStmtReader::NumStmtFields]; diff --git a/clang/lib/Serialization/ASTWriterStmt.cpp b/clang/lib/Serialization/ASTWriterStmt.cpp index b9eabd5ddb64c..8b909d5c93686 100644 --- a/clang/lib/Serialization/ASTWriterStmt.cpp +++ b/clang/lib/Serialization/ASTWriterStmt.cpp @@ -2454,6 +2454,7 @@ void ASTStmtWriter::VisitOMPLoopTransformationDirective( OMPLoopTransformationDirective *D) { VisitOMPLoopBasedDirective(D); Record.writeUInt32(D->getNumGeneratedLoops()); + Record.writeUInt32(D->getNumGeneratedLoopNests()); } void 
ASTStmtWriter::VisitOMPTileDirective(OMPTileDirective *D) { @@ -2481,6 +2482,11 @@ void ASTStmtWriter::VisitOMPInterchangeDirective(OMPInterchangeDirective *D) { Code = serialization::STMT_OMP_INTERCHANGE_DIRECTIVE; } +void ASTStmtWriter::VisitOMPFuseDirective(OMPFuseDirective *D) { + VisitOMPLoopTransformationDirective(D); + Code = serialization::STMT_OMP_FUSE_DIRECTIVE; +} + void ASTStmtWriter::VisitOMPForDirective(OMPForDirective *D) { VisitOMPLoopDirective(D); Record.writeBool(D->hasCancel()); diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp index 1afd4b52eb354..036945b2d1700 100644 --- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp +++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp @@ -1817,6 +1817,7 @@ void ExprEngine::Visit(const Stmt *S, ExplodedNode *Pred, case Stmt::OMPStripeDirectiveClass: case Stmt::OMPTileDirectiveClass: case Stmt::OMPInterchangeDirectiveClass: + case Stmt::OMPFuseDirectiveClass: case Stmt::OMPInteropDirectiveClass: case Stmt::OMPDispatchDirectiveClass: case Stmt::OMPMaskedDirectiveClass: diff --git a/clang/test/OpenMP/fuse_ast_print.cpp b/clang/test/OpenMP/fuse_ast_print.cpp new file mode 100644 index 0000000000000..43ce815dab024 --- /dev/null +++ b/clang/test/OpenMP/fuse_ast_print.cpp @@ -0,0 +1,278 @@ +// Check no warnings/errors +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -fsyntax-only -verify %s +// expected-no-diagnostics + +// Check AST and unparsing +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -ast-dump %s | FileCheck %s --check-prefix=DUMP +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -ast-print %s | FileCheck %s --check-prefix=PRINT + +// Check same results after serialization round-trip +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -emit-pch -o %t %s +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu 
-fopenmp -std=c++20 -fopenmp-version=60 -include-pch %t -ast-dump-all %s | FileCheck %s --check-prefix=DUMP +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -include-pch %t -ast-print %s | FileCheck %s --check-prefix=PRINT + +#ifndef HEADER +#define HEADER + +// placeholder for loop body code +extern "C" void body(...); + +// PRINT-LABEL: void foo1( +// DUMP-LABEL: FunctionDecl {{.*}} foo1 +void foo1() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + + } + +} + +// PRINT-LABEL: void foo2( +// DUMP-LABEL: FunctionDecl {{.*}} foo2 +void foo2() { + // PRINT: #pragma omp unroll partial(4) + // DUMP: OMPUnrollDirective + // DUMP-NEXT: OMPPartialClause + // DUMP-NEXT: ConstantExpr + // DUMP-NEXT: value: Int 4 + // DUMP-NEXT: IntegerLiteral {{.*}} 4 + #pragma omp unroll partial(4) + // PRINT: #pragma omp fuse + // DUMP-NEXT: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + } + +} + +//PRINT-LABEL: void foo3( +//DUMP-LABEL: FunctionTemplateDecl {{.*}} foo3 +template +void foo3() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: #pragma omp 
unroll partial(Factor1) + // DUMP: OMPUnrollDirective + #pragma omp unroll partial(Factor1) + // PRINT: for (int i = 0; i < 12; i += 1) + // DUMP: ForStmt + for (int i = 0; i < 12; i += 1) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: #pragma omp unroll partial(Factor2) + // DUMP: OMPUnrollDirective + #pragma omp unroll partial(Factor2) + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + + } +} + +// Also test instantiating the template. +void tfoo3() { + foo3<4,2>(); +} + +//PRINT-LABEL: void foo4( +//DUMP-LABEL: FunctionTemplateDecl {{.*}} foo4 +template +void foo4(int start, int end) { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (T i = start; i < end; i += Step) + // DUMP: ForStmt + for (T i = start; i < end; i += Step) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + + // PRINT: for (T j = end; j > start; j -= Step) + // DUMP: ForStmt + for (T j = end; j > start; j -= Step) { + // PRINT: body(j) + // DUMP: CallExpr + body(j); + } + + } +} + +// Also test instantiating the template. 
+void tfoo4() { + foo4(0, 64); +} + + + +// PRINT-LABEL: void foo5( +// DUMP-LABEL: FunctionDecl {{.*}} foo5 +void foo5() { + double arr[128], arr2[128]; + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT-NEXT: for (auto &&a : arr) + // DUMP-NEXT: CXXForRangeStmt + for (auto &&a: arr) + // PRINT: body(a) + // DUMP: CallExpr + body(a); + // PRINT: for (double v = 42; auto &&b : arr) + // DUMP: CXXForRangeStmt + for (double v = 42; auto &&b: arr) + // PRINT: body(b, v); + // DUMP: CallExpr + body(b, v); + // PRINT: for (auto &&c : arr2) + // DUMP: CXXForRangeStmt + for (auto &&c: arr2) + // PRINT: body(c) + // DUMP: CallExpr + body(c); + + } + +} + +// PRINT-LABEL: void foo6( +// DUMP-LABEL: FunctionDecl {{.*}} foo6 +void foo6() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i <= 10; ++i) + // DUMP: ForStmt + for (int i = 0; i <= 10; ++i) + body(i); + // PRINT: for (int j = 0; j < 100; ++j) + // DUMP: ForStmt + for(int j = 0; j < 100; ++j) + body(j); + } + // PRINT: #pragma omp unroll partial(4) + // DUMP: OMPUnrollDirective + #pragma omp unroll partial(4) + // PRINT: for (int k = 0; k < 250; ++k) + // DUMP: ForStmt + for (int k = 0; k < 250; ++k) + body(k); + } +} + +// PRINT-LABEL: void foo7( +// DUMP-LABEL: FunctionDecl {{.*}} foo7 +void foo7() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 
0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + } + } + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + } + } + } + } + +} + + + + + +#endif \ No newline at end of file diff --git a/clang/test/OpenMP/fuse_codegen.cpp b/clang/test/OpenMP/fuse_codegen.cpp new file mode 100644 index 0000000000000..6c1e21092da43 --- /dev/null +++ b/clang/test/OpenMP/fuse_codegen.cpp @@ -0,0 +1,1511 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --include-generated-funcs --replace-value-regex "pl_cond[.].+[.|,]" --prefix-filecheck-ir-name _ --version 5 +// expected-no-diagnostics + +// Check code generation +// RUN: %clang_cc1 -verify -triple x86_64-pc-linux-gnu -std=c++20 -fclang-abi-compat=latest -fopenmp -fopenmp-version=60 -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK1 + +// Check same results after serialization round-trip +// RUN: %clang_cc1 -verify -triple x86_64-pc-linux-gnu -std=c++20 -fclang-abi-compat=latest -fopenmp -fopenmp-version=60 -emit-pch -o %t %s +// RUN: %clang_cc1 -verify -triple x86_64-pc-linux-gnu -std=c++20 -fclang-abi-compat=latest -fopenmp -fopenmp-version=60 -include-pch %t -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK2 + +#ifndef HEADER +#define HEADER + +//placeholder for loop body code. +extern "C" void body(...) 
{} + +extern "C" void foo1(int start1, int end1, int step1, int start2, int end2, int step2) { + int i,j; + #pragma omp fuse + { + for(i = start1; i < end1; i += step1) body(i); + for(j = start2; j < end2; j += step2) body(j); + } + +} + +template +void foo2(T start, T end, T step){ + T i,j,k; + #pragma omp fuse + { + for(i = start; i < end; i += step) body(i); + for(j = end; j > start; j -= step) body(j); + for(k = start+step; k < end+step; k += step) body(k); + } +} + +extern "C" void tfoo2() { + foo2(0, 64, 4); +} + +extern "C" void foo3() { + double arr[256]; + #pragma omp fuse + { + #pragma omp fuse + { + for(int i = 0; i < 128; ++i) body(i); + for(int j = 0; j < 256; j+=2) body(j); + } + for(int c = 42; auto &&v: arr) body(c,v); + for(int cc = 37; auto &&vv: arr) body(cc, vv); + } +} + + +#endif +// CHECK1-LABEL: define dso_local void @body( +// CHECK1-SAME: ...) #[[ATTR0:[0-9]+]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define dso_local void @foo1( +// CHECK1-SAME: i32 noundef [[START1:%.*]], i32 noundef [[END1:%.*]], i32 noundef [[STEP1:%.*]], i32 noundef [[START2:%.*]], i32 noundef [[END2:%.*]], i32 noundef [[STEP2:%.*]]) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[START1_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[END1_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[STEP1_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[START2_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[END2_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[STEP2_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// 
CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: store i32 [[START1]], ptr [[START1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[END1]], ptr [[END1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[STEP1]], ptr [[STEP1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[START2]], ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[END2]], ptr [[END2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[STEP2]], ptr [[STEP2_ADDR]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[END1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: 
[[TMP5:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK1-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add i32 [[SUB3]], [[TMP6]] +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK1-NEXT: [[TMP17:%.*]] = 
load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK1-NEXT: br i1 [[CMP16]], label 
%[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: br i1 [[CMP17]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP30]], [[TMP31]] +// CHECK1-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] +// CHECK1-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK1-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP35]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK1-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] +// CHECK1: [[IF_THEN22]]: +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] +// CHECK1-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK1-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK1-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK1-NEXT: br label %[[IF_END27]] +// CHECK1: [[IF_END27]]: +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define dso_local void @tfoo2( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: call void @_Z4foo2IiEvT_S0_S0_(i32 noundef 0, i32 noundef 64, i32 noundef 4) +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define linkonce_odr void @_Z4foo2IiEvT_S0_S0_( +// CHECK1-SAME: i32 noundef [[START:%.*]], i32 noundef [[END:%.*]], i32 noundef [[STEP:%.*]]) #[[ATTR0]] comdat { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[START_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[END_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[STEP_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// 
CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_17:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP21:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: store i32 [[START]], ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[END]], ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[STEP]], ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP5:%.*]] = load 
i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK1-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add i32 [[SUB3]], [[TMP6]] +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr 
[[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] +// CHECK1-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK1-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// 
CHECK1-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK1-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 +// CHECK1-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK1-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]] +// CHECK1-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], label %[[COND_FALSE31:.*]] +// 
CHECK1: [[COND_TRUE30]]: +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: br label %[[COND_END32:.*]] +// CHECK1: [[COND_FALSE31]]: +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: br label %[[COND_END32]] +// CHECK1: [[COND_END32]]: +// CHECK1-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ] +// CHECK1-NEXT: store i32 [[COND33]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]] +// CHECK1-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]] +// CHECK1-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]] +// CHECK1-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]] +// CHECK1-NEXT: [[ADD38:%.*]] = add 
i32 [[TMP49]], [[MUL37]] +// CHECK1-NEXT: store i32 [[ADD38]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP52]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]] +// CHECK1-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]] +// CHECK1: [[IF_THEN40]]: +// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]] +// CHECK1-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]] +// CHECK1-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP58:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]] +// CHECK1-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]] +// CHECK1-NEXT: store i32 [[SUB44]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP61]]) +// CHECK1-NEXT: br label %[[IF_END45]] +// CHECK1: [[IF_END45]]: +// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]] +// CHECK1-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK1: [[IF_THEN47]]: +// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 +// CHECK1-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 +// CHECK1-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]] +// CHECK1-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]] +// CHECK1-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4 +// CHECK1-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 +// CHECK1-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]] +// CHECK1-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]] +// CHECK1-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP70]]) +// CHECK1-NEXT: br label %[[IF_END52]] +// CHECK1: [[IF_END52]]: +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define dso_local void @foo3( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[C:%.*]] = alloca i32, 
align 4 +// CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END224:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: store i32 128, ptr 
[[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK1-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK1-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4 +// 
CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1 +// CHECK1-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8 +// CHECK1-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK1-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK1-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64 +// CHECK1-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK1-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK1-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK1-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1 +// CHECK1-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1 +// CHECK1-NEXT: 
[[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1 +// CHECK1-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK1-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK1-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8 +// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8 +// CHECK1-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK1-NEXT: [[ADD21:%.*]] = add nsw i64 [[TMP16]], 1 +// CHECK1-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8 +// CHECK1-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8 +// CHECK1-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8 +// CHECK1-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK1-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK1-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr 
[[TMP22]] to i64 +// CHECK1-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]] +// CHECK1-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8 +// CHECK1-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1 +// CHECK1-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1 +// CHECK1-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1 +// CHECK1-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1 +// CHECK1-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK1-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8 +// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8 +// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK1-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1 +// CHECK1-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK1-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK1-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]] +// CHECK1-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]] +// CHECK1: [[COND_TRUE44]]: +// CHECK1-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK1-NEXT: br label %[[COND_END46:.*]] +// CHECK1: [[COND_FALSE45]]: +// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: br label %[[COND_END46]] +// CHECK1: [[COND_END46]]: +// CHECK1-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ] +// CHECK1-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK1-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// 
CHECK1-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]] +// CHECK1-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]] +// CHECK1: [[COND_TRUE50]]: +// CHECK1-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK1-NEXT: br label %[[COND_END52:.*]] +// CHECK1: [[COND_FALSE51]]: +// CHECK1-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: br label %[[COND_END52]] +// CHECK1: [[COND_END52]]: +// CHECK1-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ] +// CHECK1-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK1-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK1-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]] +// CHECK1-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4 +// CHECK1-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4 +// CHECK1-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64 +// CHECK1-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]] +// CHECK1-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]] +// 
CHECK1-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32 +// CHECK1-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4 +// CHECK1-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1 +// CHECK1-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]] +// CHECK1-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN64]]: +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]] +// CHECK1-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]] +// CHECK1-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1 +// CHECK1-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]] +// CHECK1-NEXT: store i32 [[ADD68]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP48]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]] +// CHECK1-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]] +// CHECK1: [[IF_THEN70]]: +// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]] +// CHECK1-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]] +// CHECK1-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2 +// CHECK1-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]] +// CHECK1-NEXT: store i32 [[ADD74]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP55]]) +// CHECK1-NEXT: br label %[[IF_END75]] +// CHECK1: [[IF_END75]]: +// CHECK1-NEXT: br label %[[IF_END76]] +// CHECK1: [[IF_END76]]: +// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]] +// CHECK1-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]] +// CHECK1: [[IF_THEN78]]: +// CHECK1-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8 +// CHECK1-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8 +// CHECK1-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]] +// CHECK1-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]] +// CHECK1-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8 +// CHECK1-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK1-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8 +// CHECK1-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1 +// CHECK1-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]] +// CHECK1-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP63]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP64]], double noundef [[TMP66]]) +// CHECK1-NEXT: br label %[[IF_END83]] +// CHECK1: [[IF_END83]]: +// CHECK1-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]] +// CHECK1-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]] +// CHECK1: [[IF_THEN85]]: +// CHECK1-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK1-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK1-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]] +// CHECK1-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]] +// CHECK1-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1 +// CHECK1-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]] +// CHECK1-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8 +// CHECK1-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8 +// CHECK1-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK1-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP75]], double noundef [[TMP77]]) +// CHECK1-NEXT: br label %[[IF_END90]] +// CHECK1: [[IF_END90]]: +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1 +// CHECK1-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @body( +// CHECK2-SAME: ...) #[[ATTR0:[0-9]+]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @foo1( +// CHECK2-SAME: i32 noundef [[START1:%.*]], i32 noundef [[END1:%.*]], i32 noundef [[STEP1:%.*]], i32 noundef [[START2:%.*]], i32 noundef [[END2:%.*]], i32 noundef [[STEP2:%.*]]) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[START1_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[END1_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[STEP1_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[START2_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[END2_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[STEP2_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, 
align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: store i32 [[START1]], ptr [[START1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[END1]], ptr [[END1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[STEP1]], ptr [[STEP1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[START2]], ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[END2]], ptr [[END2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[STEP2]], ptr [[STEP2_ADDR]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[END1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK2-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add i32 
[[SUB3]], [[TMP6]] +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr 
[[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK2-NEXT: br i1 [[CMP16]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], 
[[TMP28]] +// CHECK2-NEXT: br i1 [[CMP17]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP30]], [[TMP31]] +// CHECK2-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] +// CHECK2-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK2-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP35]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK2-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] +// CHECK2: [[IF_THEN22]]: +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] +// CHECK2-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK2-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK2-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK2-NEXT: br label %[[IF_END27]] +// CHECK2: [[IF_END27]]: +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @foo3( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[C:%.*]] = alloca i32, 
align 4 +// CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END224:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: store i32 128, ptr 
[[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK2-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK2-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4 +// 
CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1 +// CHECK2-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK2-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK2-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK2-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1 +// CHECK2-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1 +// CHECK2-NEXT: 
[[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1 +// CHECK2-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK2-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK2-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8 +// CHECK2-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK2-NEXT: [[ADD21:%.*]] = add nsw i64 [[TMP16]], 1 +// CHECK2-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8 +// CHECK2-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8 +// CHECK2-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8 +// CHECK2-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK2-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK2-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr 
[[TMP22]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]] +// CHECK2-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8 +// CHECK2-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1 +// CHECK2-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1 +// CHECK2-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1 +// CHECK2-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1 +// CHECK2-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK2-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8 +// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK2-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1 +// CHECK2-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK2-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]] +// CHECK2-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]] +// CHECK2: [[COND_TRUE44]]: +// CHECK2-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK2-NEXT: br label %[[COND_END46:.*]] +// CHECK2: [[COND_FALSE45]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: br label %[[COND_END46]] +// CHECK2: [[COND_END46]]: +// CHECK2-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ] +// CHECK2-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// 
CHECK2-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]] +// CHECK2-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]] +// CHECK2: [[COND_TRUE50]]: +// CHECK2-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: br label %[[COND_END52:.*]] +// CHECK2: [[COND_FALSE51]]: +// CHECK2-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: br label %[[COND_END52]] +// CHECK2: [[COND_END52]]: +// CHECK2-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ] +// CHECK2-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK2-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]] +// CHECK2-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4 +// CHECK2-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4 +// CHECK2-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64 +// CHECK2-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]] +// CHECK2-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]] +// 
CHECK2-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32 +// CHECK2-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4 +// CHECK2-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1 +// CHECK2-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]] +// CHECK2-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN64]]: +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]] +// CHECK2-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]] +// CHECK2-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1 +// CHECK2-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]] +// CHECK2-NEXT: store i32 [[ADD68]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP48]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]] +// CHECK2-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]] +// CHECK2: [[IF_THEN70]]: +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]] +// CHECK2-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]] +// CHECK2-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2 +// CHECK2-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]] +// CHECK2-NEXT: store i32 [[ADD74]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP55]]) +// CHECK2-NEXT: br label %[[IF_END75]] +// CHECK2: [[IF_END75]]: +// CHECK2-NEXT: br label %[[IF_END76]] +// CHECK2: [[IF_END76]]: +// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]] +// CHECK2: [[IF_THEN78]]: +// CHECK2-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8 +// CHECK2-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8 +// CHECK2-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]] +// CHECK2-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]] +// CHECK2-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8 +// CHECK2-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK2-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8 +// CHECK2-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1 +// CHECK2-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]] +// CHECK2-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP63]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP64]], double noundef [[TMP66]]) +// CHECK2-NEXT: br label %[[IF_END83]] +// CHECK2: [[IF_END83]]: +// CHECK2-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]] +// CHECK2-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]] +// CHECK2: [[IF_THEN85]]: +// CHECK2-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK2-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK2-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]] +// CHECK2-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]] +// CHECK2-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1 +// CHECK2-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]] +// CHECK2-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8 +// CHECK2-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8 +// CHECK2-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK2-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP75]], double noundef [[TMP77]]) +// CHECK2-NEXT: br label %[[IF_END90]] +// CHECK2: [[IF_END90]]: +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1 +// CHECK2-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @tfoo2( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: call void @_Z4foo2IiEvT_S0_S0_(i32 noundef 0, i32 noundef 64, i32 noundef 4) +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define linkonce_odr void @_Z4foo2IiEvT_S0_S0_( +// CHECK2-SAME: i32 noundef [[START:%.*]], i32 noundef [[END:%.*]], i32 noundef [[STEP:%.*]]) #[[ATTR0]] comdat { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[START_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[END_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[STEP_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: 
[[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_17:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP21:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: store i32 [[START]], ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[END]], ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[STEP]], ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], 
align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK2-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add i32 [[SUB3]], [[TMP6]] +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK2-NEXT: 
[[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] +// CHECK2-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK2-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr 
[[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK2-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 +// CHECK2-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK2-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]] +// CHECK2-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], 
label %[[COND_FALSE31:.*]] +// CHECK2: [[COND_TRUE30]]: +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: br label %[[COND_END32:.*]] +// CHECK2: [[COND_FALSE31]]: +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: br label %[[COND_END32]] +// CHECK2: [[COND_END32]]: +// CHECK2-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ] +// CHECK2-NEXT: store i32 [[COND33]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]] +// CHECK2-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]] +// CHECK2-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]] +// CHECK2-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]] +// 
CHECK2-NEXT: [[ADD38:%.*]] = add i32 [[TMP49]], [[MUL37]] +// CHECK2-NEXT: store i32 [[ADD38]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP52]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]] +// CHECK2: [[IF_THEN40]]: +// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]] +// CHECK2-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP58:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]] +// CHECK2-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]] +// CHECK2-NEXT: store i32 [[SUB44]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP61]]) +// CHECK2-NEXT: br label %[[IF_END45]] +// CHECK2: [[IF_END45]]: +// CHECK2-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]] +// CHECK2-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK2: [[IF_THEN47]]: +// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 +// CHECK2-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 +// CHECK2-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]] +// CHECK2-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]] +// CHECK2-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4 +// CHECK2-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 +// CHECK2-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]] +// CHECK2-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]] +// CHECK2-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP70]]) +// CHECK2-NEXT: br label %[[IF_END52]] +// CHECK2: [[IF_END52]]: +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: ret void +// +//. 
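For readers following the CHECK2 lines for `_Z4foo2IiEvT_S0_S0_`, the checked IR can be read back as roughly the C++ below. This is a hand-written sketch inferred from the CHECK lines only; it is not the code clang generates and not the test's source. The loop shapes are a reading of the IR, and the names `tripCount`, `foo2Fused`, `calls` and the recording `body` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the test's variadic `body`: it only records calls.
static std::vector<int> calls;
static void body(int v) { calls.push_back(v); }

// Trip count as the IR computes it: (hi - lo - 1 + step) udiv step.
// The IR then stores this value minus 1 as the upper bound and adds 1 back.
static unsigned tripCount(unsigned lo, unsigned hi, unsigned step) {
  return (hi - lo - 1u + step) / step;
}

void foo2Fused(int start, int end, int step) {
  // Assumption of this sketch: non-empty ranges and a positive step.
  assert(step > 0 && end > start);
  const unsigned s = static_cast<unsigned>(step);
  const unsigned lo = static_cast<unsigned>(start);
  const unsigned hi = static_cast<unsigned>(end);

  const unsigned ni0 = tripCount(lo, hi, s);         // i: start, start+step, ...
  const unsigned ni1 = tripCount(lo, hi, s);         // j: end, end-step, ...
  const unsigned ni2 = tripCount(lo + s, hi + s, s); // k: start+step, ...

  // .omp.fuse.max is the largest of the three trip counts (icmp ugt chain).
  const unsigned fuseMax = std::max({ni0, ni1, ni2});

  // One fused loop; each original body runs only while its own count allows.
  for (unsigned idx = 0; idx < fuseMax; ++idx) {
    if (idx < ni0)
      body(static_cast<int>(lo + idx * s));
    if (idx < ni1)
      body(static_cast<int>(hi - idx * s));
    if (idx < ni2)
      body(static_cast<int>(lo + s + idx * s));
  }
}
```

With the arguments used by `tfoo2` (0, 64, 4), all three trip counts come out as 16, so the guards never skip a body in that particular call.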
+// CHECK1: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]} +// CHECK1: [[META4]] = !{!"llvm.loop.mustprogress"} +// CHECK1: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} +// CHECK1: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +//. +// CHECK2: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]} +// CHECK2: [[META4]] = !{!"llvm.loop.mustprogress"} +// CHECK2: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} +// CHECK2: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +//. diff --git a/clang/test/OpenMP/fuse_messages.cpp b/clang/test/OpenMP/fuse_messages.cpp new file mode 100644 index 0000000000000..50dedfd2c0dc6 --- /dev/null +++ b/clang/test/OpenMP/fuse_messages.cpp @@ -0,0 +1,76 @@ +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -std=c++20 -fopenmp -fopenmp-version=60 -fsyntax-only -Wuninitialized -verify %s + +void func() { + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a loop sequence containing canonical loops or loop-generating constructs}} + #pragma omp fuse + ; + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + {int bar = 0;} + + // expected-error@+4 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + { + for(int i = 0; i < 10; ++i); + int x = 2; + } + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a loop sequence containing canonical loops or loop-generating constructs}} + #pragma omp fuse + #pragma omp for + for (int i = 0; i < 7; ++i) + ; + + { + // expected-error@+2 {{expected statement}} + #pragma omp fuse + } + + // expected-warning@+1 {{extra tokens at the end of '#pragma omp fuse' are ignored}} + #pragma omp fuse foo + { + for (int i = 0; i < 7; ++i) + ; + } + + + // expected-error@+1 {{unexpected OpenMP clause 'final' in directive '#pragma omp fuse'}} + #pragma omp fuse final(0) + { + for (int i = 0; i < 7; ++i) + ; + } + + //expected-error@+4 {{loop after '#pragma omp fuse' is not in canonical form}} +
//expected-error@+3 {{increment clause of OpenMP for loop must perform simple addition or subtraction on loop variable 'i'}} + #pragma omp fuse + { + for(int i = 0; i < 10; i*=2) { + ; + } + } + + //expected-error@+2 {{loop sequence after '#pragma omp fuse' must contain at least 1 canonical loop or loop-generating construct}} + #pragma omp fuse + {} + + //expected-error@+3 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + { + #pragma omp unroll full + for(int i = 0; i < 10; ++i); + + for(int j = 0; j < 10; ++j); + } + + //expected-warning@+5 {{loop sequence following '#pragma omp fuse' contains induction variables of differing types: 'int' and 'unsigned int'}} + //expected-warning@+5 {{loop sequence following '#pragma omp fuse' contains induction variables of differing types: 'int' and 'long long'}} + #pragma omp fuse + { + for(int i = 0; i < 10; ++i); + for(unsigned int j = 0; j < 10; ++j); + for(long long k = 0; k < 100; ++k); + } +} \ No newline at end of file diff --git a/clang/tools/libclang/CIndex.cpp b/clang/tools/libclang/CIndex.cpp index 06a17006fdee9..fd788ac3d69d4 100644 --- a/clang/tools/libclang/CIndex.cpp +++ b/clang/tools/libclang/CIndex.cpp @@ -2206,6 +2206,7 @@ class EnqueueVisitor : public ConstStmtVisitor, void VisitOMPUnrollDirective(const OMPUnrollDirective *D); void VisitOMPReverseDirective(const OMPReverseDirective *D); void VisitOMPInterchangeDirective(const OMPInterchangeDirective *D); + void VisitOMPFuseDirective(const OMPFuseDirective *D); void VisitOMPForDirective(const OMPForDirective *D); void VisitOMPForSimdDirective(const OMPForSimdDirective *D); void VisitOMPSectionsDirective(const OMPSectionsDirective *D); @@ -3364,6 +3365,10 @@ void EnqueueVisitor::VisitOMPInterchangeDirective( VisitOMPLoopTransformationDirective(D); } +void EnqueueVisitor::VisitOMPFuseDirective(const OMPFuseDirective *D) { + VisitOMPLoopTransformationDirective(D); +} + void EnqueueVisitor::VisitOMPForDirective(const
OMPForDirective *D) { VisitOMPLoopDirective(D); } @@ -6317,6 +6322,8 @@ CXString clang_getCursorKindSpelling(enum CXCursorKind Kind) { return cxstring::createRef("OMPReverseDirective"); case CXCursor_OMPInterchangeDirective: return cxstring::createRef("OMPInterchangeDirective"); + case CXCursor_OMPFuseDirective: + return cxstring::createRef("OMPFuseDirective"); case CXCursor_OMPForDirective: return cxstring::createRef("OMPForDirective"); case CXCursor_OMPForSimdDirective: diff --git a/clang/tools/libclang/CXCursor.cpp b/clang/tools/libclang/CXCursor.cpp index 635d03a88d105..709fa60d28d8d 100644 --- a/clang/tools/libclang/CXCursor.cpp +++ b/clang/tools/libclang/CXCursor.cpp @@ -688,6 +688,9 @@ CXCursor cxcursor::MakeCXCursor(const Stmt *S, const Decl *Parent, case Stmt::OMPInterchangeDirectiveClass: K = CXCursor_OMPInterchangeDirective; break; + case Stmt::OMPFuseDirectiveClass: + K = CXCursor_OMPFuseDirective; + break; case Stmt::OMPForDirectiveClass: K = CXCursor_OMPForDirective; break; diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 0af4b436649a3..8286cfcadaafd 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -852,6 +852,10 @@ def OMP_For : Directive<"for"> { let category = CA_Executable; let languages = [L_C]; } +def OMP_Fuse : Directive<"fuse"> { + let association = AS_Loop; + let category = CA_Executable; +} def OMP_Interchange : Directive<"interchange"> { let allowedOnceClauses = [ VersionedClause, diff --git a/openmp/runtime/test/transform/fuse/foreach.cpp b/openmp/runtime/test/transform/fuse/foreach.cpp new file mode 100644 index 0000000000000..cabf4bf8a511d --- /dev/null +++ b/openmp/runtime/test/transform/fuse/foreach.cpp @@ -0,0 +1,192 @@ +// RUN: %libomp-cxx20-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include +#include +#include + +struct Reporter { + const char *name; + + Reporter(const 
char *name) : name(name) { print("ctor"); } + + Reporter() : name("") { print("ctor"); } + + Reporter(const Reporter &that) : name(that.name) { print("copy ctor"); } + + Reporter(Reporter &&that) : name(that.name) { print("move ctor"); } + + ~Reporter() { print("dtor"); } + + const Reporter &operator=(const Reporter &that) { + print("copy assign"); + this->name = that.name; + return *this; + } + + const Reporter &operator=(Reporter &&that) { + print("move assign"); + this->name = that.name; + return *this; + } + + struct Iterator { + const Reporter *owner; + int pos; + + Iterator(const Reporter *owner, int pos) : owner(owner), pos(pos) {} + + Iterator(const Iterator &that) : owner(that.owner), pos(that.pos) { + owner->print("iterator copy ctor"); + } + + Iterator(Iterator &&that) : owner(that.owner), pos(that.pos) { + owner->print("iterator move ctor"); + } + + ~Iterator() { owner->print("iterator dtor"); } + + const Iterator &operator=(const Iterator &that) { + owner->print("iterator copy assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + const Iterator &operator=(Iterator &&that) { + owner->print("iterator move assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + bool operator==(const Iterator &that) const { + owner->print("iterator %d == %d", 2 - this->pos, 2 - that.pos); + return this->pos == that.pos; + } + + Iterator &operator++() { + owner->print("iterator prefix ++"); + pos -= 1; + return *this; + } + + Iterator operator++(int) { + owner->print("iterator postfix ++"); + auto result = *this; + pos -= 1; + return result; + } + + int operator*() const { + int result = 2 - pos; + owner->print("iterator deref: %i", result); + return result; + } + + size_t operator-(const Iterator &that) const { + int result = (2 - this->pos) - (2 - that.pos); + owner->print("iterator distance: %d", result); + return result; + } + + Iterator operator+(int steps) const { + owner->print("iterator advance: %i += 
%i", 2 - this->pos, steps); + return Iterator(owner, pos - steps); + } + + void print(const char *msg) const { owner->print(msg); } + }; + + Iterator begin() const { + print("begin()"); + return Iterator(this, 2); + } + + Iterator end() const { + print("end()"); + return Iterator(this, -1); + } + + void print(const char *msg, ...) const { + va_list args; + va_start(args, msg); + printf("[%s] ", name); + vprintf(msg, args); + printf("\n"); + va_end(args); + } +}; + +int main() { + printf("do\n"); +#pragma omp fuse + { + for (Reporter a{"C"}; auto &&v : Reporter("A")) + printf("v=%d\n", v); + for (Reporter aa{"D"}; auto &&vv : Reporter("B")) + printf("vv=%d\n", vv); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +// CHECK: [C] ctor +// CHECK-NEXT: [A] ctor +// CHECK-NEXT: [A] end() +// CHECK-NEXT: [A] begin() +// CHECK-NEXT: [A] begin() +// CHECK-NEXT: [A] iterator distance: 3 +// CHECK-NEXT: [D] ctor +// CHECK-NEXT: [B] ctor +// CHECK-NEXT: [B] end() +// CHECK-NEXT: [B] begin() +// CHECK-NEXT: [B] begin() +// CHECK-NEXT: [B] iterator distance: 3 +// CHECK-NEXT: [A] iterator advance: 0 += 0 +// CHECK-NEXT: [A] iterator move assign +// CHECK-NEXT: [A] iterator deref: 0 +// CHECK-NEXT: v=0 +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [B] iterator advance: 0 += 0 +// CHECK-NEXT: [B] iterator move assign +// CHECK-NEXT: [B] iterator deref: 0 +// CHECK-NEXT: vv=0 +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [A] iterator advance: 0 += 1 +// CHECK-NEXT: [A] iterator move assign +// CHECK-NEXT: [A] iterator deref: 1 +// CHECK-NEXT: v=1 +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [B] iterator advance: 0 += 1 +// CHECK-NEXT: [B] iterator move assign +// CHECK-NEXT: [B] iterator deref: 1 +// CHECK-NEXT: vv=1 +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [A] iterator advance: 0 += 2 +// CHECK-NEXT: [A] iterator move assign +// CHECK-NEXT: [A] iterator deref: 2 +// CHECK-NEXT: v=2 +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [B] iterator advance: 0 
+= 2 +// CHECK-NEXT: [B] iterator move assign +// CHECK-NEXT: [B] iterator deref: 2 +// CHECK-NEXT: vv=2 +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] dtor +// CHECK-NEXT: [D] dtor +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [A] dtor +// CHECK-NEXT: [C] dtor +// CHECK-NEXT: done + + +#endif diff --git a/openmp/runtime/test/transform/fuse/intfor.c b/openmp/runtime/test/transform/fuse/intfor.c new file mode 100644 index 0000000000000..b8171b4df7042 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/intfor.c @@ -0,0 +1,50 @@ +// RUN: %libomp-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include + +int main() { + printf("do\n"); +#pragma omp fuse + { + for (int i = 5; i <= 25; i += 5) + printf("i=%d\n", i); + for (int j = 10; j < 100; j += 10) + printf("j=%d\n", j); + for (int k = 10; k > 0; --k) + printf("k=%d\n", k); + } + printf("done\n"); + return EXIT_SUCCESS; +} +#endif /* HEADER */ + +// CHECK: do +// CHECK-NEXT: i=5 +// CHECK-NEXT: j=10 +// CHECK-NEXT: k=10 +// CHECK-NEXT: i=10 +// CHECK-NEXT: j=20 +// CHECK-NEXT: k=9 +// CHECK-NEXT: i=15 +// CHECK-NEXT: j=30 +// CHECK-NEXT: k=8 +// CHECK-NEXT: i=20 +// CHECK-NEXT: j=40 +// CHECK-NEXT: k=7 +// CHECK-NEXT: i=25 +// CHECK-NEXT: j=50 +// CHECK-NEXT: k=6 +// CHECK-NEXT: j=60 +// CHECK-NEXT: k=5 +// CHECK-NEXT: j=70 +// CHECK-NEXT: k=4 +// CHECK-NEXT: j=80 +// CHECK-NEXT: k=3 +// CHECK-NEXT: j=90 +// CHECK-NEXT: k=2 +// CHECK-NEXT: k=1 +// CHECK-NEXT: done diff --git a/openmp/runtime/test/transform/fuse/iterfor.cpp b/openmp/runtime/test/transform/fuse/iterfor.cpp new file mode 100644 index 0000000000000..552484b2981c4 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/iterfor.cpp @@ -0,0 +1,194 @@ +// RUN: %libomp-cxx20-compile-and-run | FileCheck %s --match-full-lines + 
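The interleaving that the `intfor.c` CHECK lines expect (i, j, k, then i, j, k again, with the shorter loops dropping out early) can be reproduced by hand. The sketch below is only an illustration of that ordering, written under the assumption that the fused loop runs up to the largest trip count and guards each body by its own count; `runFusedIntfor` is a hypothetical helper, not part of the patch or of the OpenMP runtime.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hand emulation of the three loops in intfor.c after fusion.
// Returns the lines the test would print between "do" and "done".
std::vector<std::string> runFusedIntfor() {
  std::vector<std::string> out;

  const int ni = 5;  // for (int i = 5; i <= 25; i += 5)
  const int nj = 9;  // for (int j = 10; j < 100; j += 10)
  const int nk = 10; // for (int k = 10; k > 0; --k)

  // The fused loop iterates up to the largest of the three trip counts.
  const int fuseMax = std::max({ni, nj, nk});

  for (int idx = 0; idx < fuseMax; ++idx) {
    // Each original body runs only while its own loop still has iterations.
    if (idx < ni)
      out.push_back("i=" + std::to_string(5 + 5 * idx));
    if (idx < nj)
      out.push_back("j=" + std::to_string(10 + 10 * idx));
    if (idx < nk)
      out.push_back("k=" + std::to_string(10 - idx));
  }
  return out;
}
```

The first loop stops contributing after `i=25`, the second after `j=90`, and the last fused iteration emits only `k=1`, which is the order listed in the CHECK-NEXT lines of `intfor.c`.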
+#ifndef HEADER +#define HEADER + +#include +#include +#include +#include + +struct Reporter { + const char *name; + + Reporter(const char *name) : name(name) { print("ctor"); } + + Reporter() : name("") { print("ctor"); } + + Reporter(const Reporter &that) : name(that.name) { print("copy ctor"); } + + Reporter(Reporter &&that) : name(that.name) { print("move ctor"); } + + ~Reporter() { print("dtor"); } + + const Reporter &operator=(const Reporter &that) { + print("copy assign"); + this->name = that.name; + return *this; + } + + const Reporter &operator=(Reporter &&that) { + print("move assign"); + this->name = that.name; + return *this; + } + + struct Iterator { + const Reporter *owner; + int pos; + + Iterator(const Reporter *owner, int pos) : owner(owner), pos(pos) {} + + Iterator(const Iterator &that) : owner(that.owner), pos(that.pos) { + owner->print("iterator copy ctor"); + } + + Iterator(Iterator &&that) : owner(that.owner), pos(that.pos) { + owner->print("iterator move ctor"); + } + + ~Iterator() { owner->print("iterator dtor"); } + + const Iterator &operator=(const Iterator &that) { + owner->print("iterator copy assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + const Iterator &operator=(Iterator &&that) { + owner->print("iterator move assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + bool operator==(const Iterator &that) const { + owner->print("iterator %d == %d", 2 - this->pos, 2 - that.pos); + return this->pos == that.pos; + } + + bool operator!=(const Iterator &that) const { + owner->print("iterator %d != %d", 2 - this->pos, 2 - that.pos); + return this->pos != that.pos; + } + + Iterator &operator++() { + owner->print("iterator prefix ++"); + pos -= 1; + return *this; + } + + Iterator operator++(int) { + owner->print("iterator postfix ++"); + auto result = *this; + pos -= 1; + return result; + } + + int operator*() const { + int result = 2 - pos; + owner->print("iterator deref: %i", 
result); + return result; + } + + size_t operator-(const Iterator &that) const { + int result = (2 - this->pos) - (2 - that.pos); + owner->print("iterator distance: %d", result); + return result; + } + + Iterator operator+(int steps) const { + owner->print("iterator advance: %i += %i", 2 - this->pos, steps); + return Iterator(owner, pos - steps); + } + }; + + Iterator begin() const { + print("begin()"); + return Iterator(this, 2); + } + + Iterator end() const { + print("end()"); + return Iterator(this, -1); + } + + void print(const char *msg, ...) const { + va_list args; + va_start(args, msg); + printf("[%s] ", name); + vprintf(msg, args); + printf("\n"); + va_end(args); + } +}; + +int main() { + printf("do\n"); + Reporter C("C"); + Reporter D("D"); +#pragma omp fuse + { + for (auto it = C.begin(); it != C.end(); ++it) + printf("v=%d\n", *it); + + for (auto it = D.begin(); it != D.end(); ++it) + printf("vv=%d\n", *it); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +#endif /* HEADER */ + +// CHECK: do +// CHECK: [C] ctor +// CHECK-NEXT: [D] ctor +// CHECK-NEXT: [C] begin() +// CHECK-NEXT: [C] begin() +// CHECK-NEXT: [C] end() +// CHECK-NEXT: [C] iterator distance: 3 +// CHECK-NEXT: [D] begin() +// CHECK-NEXT: [D] begin() +// CHECK-NEXT: [D] end() +// CHECK-NEXT: [D] iterator distance: 3 +// CHECK-NEXT: [C] iterator advance: 0 += 0 +// CHECK-NEXT: [C] iterator move assign +// CHECK-NEXT: [C] iterator deref: 0 +// CHECK-NEXT: v=0 +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] iterator advance: 0 += 0 +// CHECK-NEXT: [D] iterator move assign +// CHECK-NEXT: [D] iterator deref: 0 +// CHECK-NEXT: vv=0 +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator advance: 0 += 1 +// CHECK-NEXT: [C] iterator move assign +// CHECK-NEXT: [C] iterator deref: 1 +// CHECK-NEXT: v=1 +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] iterator advance: 0 += 1 +// CHECK-NEXT: [D] iterator move assign +// CHECK-NEXT: [D] iterator deref: 1 +// CHECK-NEXT: vv=1 +// 
CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator advance: 0 += 2 +// CHECK-NEXT: [C] iterator move assign +// CHECK-NEXT: [C] iterator deref: 2 +// CHECK-NEXT: v=2 +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] iterator advance: 0 += 2 +// CHECK-NEXT: [D] iterator move assign +// CHECK-NEXT: [D] iterator deref: 2 +// CHECK-NEXT: vv=2 +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: done +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] dtor +// CHECK-NEXT: [C] dtor diff --git a/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp new file mode 100644 index 0000000000000..e9f76713fe3e0 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp @@ -0,0 +1,208 @@ +// RUN: %libomp-cxx20-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include +#include +#include + +struct Reporter { + const char *name; + + Reporter(const char *name) : name(name) { print("ctor"); } + + Reporter() : name("") { print("ctor"); } + + Reporter(const Reporter &that) : name(that.name) { print("copy ctor"); } + + Reporter(Reporter &&that) : name(that.name) { print("move ctor"); } + + ~Reporter() { print("dtor"); } + + const Reporter &operator=(const Reporter &that) { + print("copy assign"); + this->name = that.name; + return *this; + } + + const Reporter &operator=(Reporter &&that) { + print("move assign"); + this->name = that.name; + return *this; + } + + struct Iterator { + const Reporter *owner; + int pos; + + Iterator(const Reporter *owner, int pos) : owner(owner), pos(pos) {} + + Iterator(const Iterator &that) : owner(that.owner), pos(that.pos) { + owner->print("iterator copy ctor"); + } + + Iterator(Iterator &&that) : 
owner(that.owner), pos(that.pos) { + owner->print("iterator move ctor"); + } + + ~Iterator() { owner->print("iterator dtor"); } + + const Iterator &operator=(const Iterator &that) { + owner->print("iterator copy assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + const Iterator &operator=(Iterator &&that) { + owner->print("iterator move assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + bool operator==(const Iterator &that) const { + owner->print("iterator %d == %d", 2 - this->pos, 2 - that.pos); + return this->pos == that.pos; + } + + Iterator &operator++() { + owner->print("iterator prefix ++"); + pos -= 1; + return *this; + } + + Iterator operator++(int) { + owner->print("iterator postfix ++"); + auto result = *this; + pos -= 1; + return result; + } + + int operator*() const { + int result = 2 - pos; + owner->print("iterator deref: %i", result); + return result; + } + + size_t operator-(const Iterator &that) const { + int result = (2 - this->pos) - (2 - that.pos); + owner->print("iterator distance: %d", result); + return result; + } + + Iterator operator+(int steps) const { + owner->print("iterator advance: %i += %i", 2 - this->pos, steps); + return Iterator(owner, pos - steps); + } + + void print(const char *msg) const { owner->print(msg); } + }; + + Iterator begin() const { + print("begin()"); + return Iterator(this, 2); + } + + Iterator end() const { + print("end()"); + return Iterator(this, -1); + } + + void print(const char *msg, ...) 
const { + va_list args; + va_start(args, msg); + printf("[%s] ", name); + vprintf(msg, args); + printf("\n"); + va_end(args); + } +}; + +int main() { + printf("do\n"); +#pragma omp parallel for collapse(2) num_threads(1) + for (int i = 0; i < 3; ++i) +#pragma omp fuse + { + for (Reporter c{"init-stmt"}; auto &&v : Reporter("range")) + printf("i=%d v=%d\n", i, v); + for (int vv = 0; vv < 3; ++vv) + printf("i=%d vv=%d\n", i, vv); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +#endif /* HEADER */ + +// CHECK: do +// CHECK-NEXT: [init-stmt] ctor +// CHECK-NEXT: [range] ctor +// CHECK-NEXT: [range] end() +// CHECK-NEXT: [range] begin() +// CHECK-NEXT: [range] begin() +// CHECK-NEXT: [range] iterator distance: 3 +// CHECK-NEXT: [range] iterator advance: 0 += 0 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 0 +// CHECK-NEXT: i=0 v=0 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=0 vv=0 +// CHECK-NEXT: [range] iterator advance: 0 += 1 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 1 +// CHECK-NEXT: i=0 v=1 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=0 vv=1 +// CHECK-NEXT: [range] iterator advance: 0 += 2 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 2 +// CHECK-NEXT: i=0 v=2 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=0 vv=2 +// CHECK-NEXT: [range] iterator advance: 0 += 0 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 0 +// CHECK-NEXT: i=1 v=0 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=1 vv=0 +// CHECK-NEXT: [range] iterator advance: 0 += 1 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 1 +// CHECK-NEXT: i=1 v=1 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=1 vv=1 +// CHECK-NEXT: [range] iterator advance: 0 += 2 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 2 +// CHECK-NEXT: i=1 v=2 
+// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=1 vv=2 +// CHECK-NEXT: [range] iterator advance: 0 += 0 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 0 +// CHECK-NEXT: i=2 v=0 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=2 vv=0 +// CHECK-NEXT: [range] iterator advance: 0 += 1 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 1 +// CHECK-NEXT: i=2 v=1 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=2 vv=1 +// CHECK-NEXT: [range] iterator advance: 0 += 2 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 2 +// CHECK-NEXT: i=2 v=2 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=2 vv=2 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: [range] dtor +// CHECK-NEXT: [init-stmt] dtor +// CHECK-NEXT: done + diff --git a/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c new file mode 100644 index 0000000000000..272908e72c429 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c @@ -0,0 +1,45 @@ +// RUN: %libomp-cxx-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include + +int main() { + printf("do\n"); +#pragma omp parallel for collapse(2) num_threads(1) + for (int i = 0; i < 3; ++i) +#pragma omp fuse + { + for (int j = 0; j < 3; ++j) + printf("i=%d j=%d\n", i, j); + for (int k = 0; k < 3; ++k) + printf("i=%d k=%d\n", i, k); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +#endif /* HEADER */ + +// CHECK: do +// CHECK: i=0 j=0 +// CHECK-NEXT: i=0 k=0 +// CHECK-NEXT: i=0 j=1 +// CHECK-NEXT: i=0 k=1 +// CHECK-NEXT: i=0 j=2 +// CHECK-NEXT: i=0 k=2 +// CHECK-NEXT: i=1 j=0 +// CHECK-NEXT: i=1 k=0 +// CHECK-NEXT: i=1 j=1 +// CHECK-NEXT: i=1 k=1 +// CHECK-NEXT: i=1 j=2 +// CHECK-NEXT: 
i=1 k=2 +// CHECK-NEXT: i=2 j=0 +// CHECK-NEXT: i=2 k=0 +// CHECK-NEXT: i=2 j=1 +// CHECK-NEXT: i=2 k=1 +// CHECK-NEXT: i=2 j=2 +// CHECK-NEXT: i=2 k=2 +// CHECK-NEXT: done >From 7e3bd1e3afcdc246da0362ffb8693b160f9d3f4a Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:28:04 +0000 Subject: [PATCH 2/9] Add looprange clause --- clang/include/clang/AST/OpenMPClause.h | 100 ++++++ clang/include/clang/AST/RecursiveASTVisitor.h | 8 + clang/include/clang/AST/StmtOpenMP.h | 18 +- .../clang/Basic/DiagnosticSemaKinds.td | 5 + clang/include/clang/Parse/Parser.h | 3 + clang/include/clang/Sema/SemaOpenMP.h | 6 + clang/lib/AST/OpenMPClause.cpp | 35 ++ clang/lib/AST/StmtOpenMP.cpp | 7 +- clang/lib/AST/StmtProfile.cpp | 7 + clang/lib/Basic/OpenMPKinds.cpp | 2 + clang/lib/Parse/ParseOpenMP.cpp | 36 ++ clang/lib/Sema/SemaOpenMP.cpp | 155 +++++++-- clang/lib/Sema/TreeTransform.h | 33 ++ clang/lib/Serialization/ASTReader.cpp | 11 + clang/lib/Serialization/ASTReaderStmt.cpp | 4 +- clang/lib/Serialization/ASTWriter.cpp | 8 + clang/test/OpenMP/fuse_ast_print.cpp | 67 ++++ clang/test/OpenMP/fuse_codegen.cpp | 320 +++++++++++++++++- clang/test/OpenMP/fuse_messages.cpp | 112 +++++- clang/tools/libclang/CIndex.cpp | 5 + llvm/include/llvm/Frontend/OpenMP/ClauseT.h | 16 +- llvm/include/llvm/Frontend/OpenMP/OMP.td | 6 + 22 files changed, 921 insertions(+), 43 deletions(-) diff --git a/clang/include/clang/AST/OpenMPClause.h b/clang/include/clang/AST/OpenMPClause.h index 6fd16bc0f03be..8f937cdef9cd0 100644 --- a/clang/include/clang/AST/OpenMPClause.h +++ b/clang/include/clang/AST/OpenMPClause.h @@ -1143,6 +1143,106 @@ class OMPFullClause final : public OMPNoChildClause { static OMPFullClause *CreateEmpty(const ASTContext &C); }; +/// This class represents the 'looprange' clause in the +/// '#pragma omp fuse' directive +/// +/// \code {c} +/// #pragma omp fuse looprange(1,2) +/// { +/// for(int i = 0; i < 64; ++i) +/// for(int j = 0; j < 256; j+=2) +/// for(int k = 127; k >= 0; --k) 
+/// \endcode +class OMPLoopRangeClause final : public OMPClause { + friend class OMPClauseReader; + + explicit OMPLoopRangeClause() + : OMPClause(llvm::omp::OMPC_looprange, {}, {}) {} + + /// Location of '(' + SourceLocation LParenLoc; + + /// Location of 'first' + SourceLocation FirstLoc; + + /// Location of 'count' + SourceLocation CountLoc; + + /// Expr associated with 'first' argument + Expr *First = nullptr; + + /// Expr associated with 'count' argument + Expr *Count = nullptr; + + /// Set 'first' + void setFirst(Expr *First) { this->First = First; } + + /// Set 'count' + void setCount(Expr *Count) { this->Count = Count; } + + /// Set location of '('. + void setLParenLoc(SourceLocation Loc) { LParenLoc = Loc; } + + /// Set location of 'first' argument + void setFirstLoc(SourceLocation Loc) { FirstLoc = Loc; } + + /// Set location of 'count' argument + void setCountLoc(SourceLocation Loc) { CountLoc = Loc; } + +public: + /// Build an AST node for a 'looprange' clause + /// + /// \param StartLoc Starting location of the clause. + /// \param LParenLoc Location of '('. + /// \param FirstLoc Location of the 'first' argument. + /// \param CountLoc Location of the 'count' argument. + static OMPLoopRangeClause * + Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation LParenLoc, + SourceLocation FirstLoc, SourceLocation CountLoc, + SourceLocation EndLoc, Expr *First, Expr *Count); + + /// Build an empty 'looprange' node for deserialization + /// + /// \param C Context of the AST. 
+ static OMPLoopRangeClause *CreateEmpty(const ASTContext &C); + + /// Returns the location of '(' + SourceLocation getLParenLoc() const { return LParenLoc; } + + /// Returns the location of 'first' + SourceLocation getFirstLoc() const { return FirstLoc; } + + /// Returns the location of 'count' + SourceLocation getCountLoc() const { return CountLoc; } + + /// Returns the argument 'first' or nullptr if not set + Expr *getFirst() const { return cast_or_null<Expr>(First); } + + /// Returns the argument 'count' or nullptr if not set + Expr *getCount() const { return cast_or_null<Expr>(Count); } + + child_range children() { + return child_range(reinterpret_cast<Stmt **>(&First), + reinterpret_cast<Stmt **>(&Count) + 1); + } + + const_child_range children() const { + auto Children = const_cast<OMPLoopRangeClause *>(this)->children(); + return const_child_range(Children.begin(), Children.end()); + } + + child_range used_children() { + return child_range(child_iterator(), child_iterator()); + } + const_child_range used_children() const { + return const_child_range(const_child_iterator(), const_child_iterator()); + } + + static bool classof(const OMPClause *T) { + return T->getClauseKind() == llvm::omp::OMPC_looprange; + } +}; + /// Representation of the 'partial' clause of the '#pragma omp unroll' /// directive. 
/// diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h index 057e9e346ce4e..94066edc64933 100644 --- a/clang/include/clang/AST/RecursiveASTVisitor.h +++ b/clang/include/clang/AST/RecursiveASTVisitor.h @@ -3400,6 +3400,14 @@ bool RecursiveASTVisitor::VisitOMPFullClause(OMPFullClause *C) { return true; } +template +bool RecursiveASTVisitor::VisitOMPLoopRangeClause( + OMPLoopRangeClause *C) { + TRY_TO(TraverseStmt(C->getFirst())); + TRY_TO(TraverseStmt(C->getCount())); + return true; +} + template bool RecursiveASTVisitor::VisitOMPPartialClause(OMPPartialClause *C) { TRY_TO(TraverseStmt(C->getFactor())); diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index dc6f797e24ab8..85bde292ca748 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -5572,7 +5572,9 @@ class OMPTileDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPTileDirectiveClass, llvm::omp::OMPD_tile, StartLoc, EndLoc, NumLoops) { + // Tiling doubles the original number of loops setNumGeneratedLoops(2 * NumLoops); + // Produces a single top-level canonical loop nest setNumGeneratedLoopNests(1); } @@ -5803,9 +5805,9 @@ class OMPReverseDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPReverseDirectiveClass, llvm::omp::OMPD_reverse, StartLoc, EndLoc, 1) { - - setNumGeneratedLoopNests(1); + // Reverse produces a single top-level canonical loop nest setNumGeneratedLoops(1); + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5873,6 +5875,8 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPInterchangeDirectiveClass, llvm::omp::OMPD_interchange, StartLoc, EndLoc, NumLoops) { + // Interchange produces a single top-level canonical loop + // nest, with the exact same amount of total loops 
setNumGeneratedLoops(NumLoops); setNumGeneratedLoopNests(1); } @@ -5950,11 +5954,7 @@ class OMPFuseDirective final : public OMPLoopTransformationDirective { unsigned NumLoops) : OMPLoopTransformationDirective(OMPFuseDirectiveClass, llvm::omp::OMPD_fuse, StartLoc, EndLoc, - NumLoops) { - setNumGeneratedLoops(1); - // TODO: After implementing the looprange clause, change this logic - setNumGeneratedLoopNests(1); - } + NumLoops) {} void setPreInits(Stmt *PreInits) { Data->getChildren()[PreInitsOffset] = PreInits; @@ -5990,8 +5990,10 @@ class OMPFuseDirective final : public OMPLoopTransformationDirective { /// \param C Context of the AST /// \param NumClauses Number of clauses to allocate /// \param NumLoops Number of associated loops to allocate + /// \param NumLoopNests Number of top level loops to allocate static OMPFuseDirective *CreateEmpty(const ASTContext &C, unsigned NumClauses, - unsigned NumLoops); + unsigned NumLoops, + unsigned NumLoopNests); /// Gets the associated loops after the transformation. This is the de-sugared /// replacement or nulltpr in dependent contexts. 
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index f31b6f8a3b26a..191618e7865dc 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -11566,6 +11566,11 @@ def err_omp_not_a_loop_sequence : Error < "statement after '#pragma omp %0' must be a loop sequence containing canonical loops or loop-generating constructs">; def err_omp_empty_loop_sequence : Error < "loop sequence after '#pragma omp %0' must contain at least 1 canonical loop or loop-generating construct">; +def err_omp_invalid_looprange : Error < + "loop range in '#pragma omp %0' exceeds the number of available loops: " + "range end '%1' is greater than the total number of loops '%2'">; +def warn_omp_redundant_fusion : Warning < + "loop range in '#pragma omp %0' contains only a single loop, resulting in redundant fusion">; def err_omp_not_for : Error< "%select{statement after '#pragma omp %1' must be a for loop|" "expected %2 for loops after '#pragma omp %1'%select{|, but found only %4}3}0">; diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h index e6492b81dfff8..965dcb7da26d8 100644 --- a/clang/include/clang/Parse/Parser.h +++ b/clang/include/clang/Parse/Parser.h @@ -6739,6 +6739,9 @@ class Parser : public CodeCompletionHandler { OpenMPClauseKind Kind, bool ParseOnly); + /// Parses the 'looprange' clause of a '#pragma omp fuse' directive. + OMPClause *ParseOpenMPLoopRangeClause(); + /// Parses the 'sizes' clause of a '#pragma omp tile' directive. 
OMPClause *ParseOpenMPSizesClause(); diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index 8d78c2197c89d..f4a075e54cebe 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -921,6 +921,12 @@ class SemaOpenMP : public SemaBase { SourceLocation StartLoc, SourceLocation LParenLoc, SourceLocation EndLoc); + + /// Called on well-formed 'looprange' clause after parsing its arguments. + OMPClause * + ActOnOpenMPLoopRangeClause(Expr *First, Expr *Count, SourceLocation StartLoc, + SourceLocation LParenLoc, SourceLocation FirstLoc, + SourceLocation CountLoc, SourceLocation EndLoc); /// Called on well-formed 'ordered' clause. OMPClause * ActOnOpenMPOrderedClause(SourceLocation StartLoc, SourceLocation EndLoc, diff --git a/clang/lib/AST/OpenMPClause.cpp b/clang/lib/AST/OpenMPClause.cpp index 0e5052b944162..0b5808eb100e4 100644 --- a/clang/lib/AST/OpenMPClause.cpp +++ b/clang/lib/AST/OpenMPClause.cpp @@ -1024,6 +1024,26 @@ OMPPartialClause *OMPPartialClause::CreateEmpty(const ASTContext &C) { return new (C) OMPPartialClause(); } +OMPLoopRangeClause * +OMPLoopRangeClause::Create(const ASTContext &C, SourceLocation StartLoc, + SourceLocation LParenLoc, SourceLocation FirstLoc, + SourceLocation CountLoc, SourceLocation EndLoc, + Expr *First, Expr *Count) { + OMPLoopRangeClause *Clause = CreateEmpty(C); + Clause->setLocStart(StartLoc); + Clause->setLParenLoc(LParenLoc); + Clause->setLocEnd(EndLoc); + Clause->setFirstLoc(FirstLoc); + Clause->setCountLoc(CountLoc); + Clause->setFirst(First); + Clause->setCount(Count); + return Clause; +} + +OMPLoopRangeClause *OMPLoopRangeClause::CreateEmpty(const ASTContext &C) { + return new (C) OMPLoopRangeClause(); +} + OMPAllocateClause *OMPAllocateClause::Create( const ASTContext &C, SourceLocation StartLoc, SourceLocation LParenLoc, Expr *Allocator, Expr *Alignment, SourceLocation ColonLoc, @@ -1888,6 +1908,21 @@ void 
OMPClausePrinter::VisitOMPPartialClause(OMPPartialClause *Node) { } } +void OMPClausePrinter::VisitOMPLoopRangeClause(OMPLoopRangeClause *Node) { + OS << "looprange"; + + Expr *First = Node->getFirst(); + Expr *Count = Node->getCount(); + + if (First && Count) { + OS << "("; + First->printPretty(OS, nullptr, Policy, 0); + OS << ","; + Count->printPretty(OS, nullptr, Policy, 0); + OS << ")"; + } +} + void OMPClausePrinter::VisitOMPAllocatorClause(OMPAllocatorClause *Node) { OS << "allocator("; Node->getAllocator()->printPretty(OS, nullptr, Policy, 0); diff --git a/clang/lib/AST/StmtOpenMP.cpp b/clang/lib/AST/StmtOpenMP.cpp index 4a6133766ef1c..06c987e7f1761 100644 --- a/clang/lib/AST/StmtOpenMP.cpp +++ b/clang/lib/AST/StmtOpenMP.cpp @@ -524,10 +524,13 @@ OMPFuseDirective *OMPFuseDirective::Create( OMPFuseDirective *OMPFuseDirective::CreateEmpty(const ASTContext &C, unsigned NumClauses, - unsigned NumLoops) { - return createEmptyDirective( + unsigned NumLoops, + unsigned NumLoopNests) { + OMPFuseDirective *Dir = createEmptyDirective( C, NumClauses, /*HasAssociatedStmt=*/true, TransformedStmtOffset + 1, SourceLocation(), SourceLocation(), NumLoops); + Dir->setNumGeneratedLoopNests(NumLoopNests); + return Dir; } OMPForSimdDirective * diff --git a/clang/lib/AST/StmtProfile.cpp b/clang/lib/AST/StmtProfile.cpp index 99d426db985e8..9f0ce076c35fa 100644 --- a/clang/lib/AST/StmtProfile.cpp +++ b/clang/lib/AST/StmtProfile.cpp @@ -511,6 +511,13 @@ void OMPClauseProfiler::VisitOMPPartialClause(const OMPPartialClause *C) { Profiler->VisitExpr(Factor); } +void OMPClauseProfiler::VisitOMPLoopRangeClause(const OMPLoopRangeClause *C) { + if (const Expr *First = C->getFirst()) + Profiler->VisitExpr(First); + if (const Expr *Count = C->getCount()) + Profiler->VisitExpr(Count); +} + void OMPClauseProfiler::VisitOMPAllocatorClause(const OMPAllocatorClause *C) { if (C->getAllocator()) Profiler->VisitStmt(C->getAllocator()); diff --git a/clang/lib/Basic/OpenMPKinds.cpp 
b/clang/lib/Basic/OpenMPKinds.cpp index d172450512f13..18330181f1509 100644 --- a/clang/lib/Basic/OpenMPKinds.cpp +++ b/clang/lib/Basic/OpenMPKinds.cpp @@ -248,6 +248,7 @@ unsigned clang::getOpenMPSimpleClauseType(OpenMPClauseKind Kind, StringRef Str, case OMPC_affinity: case OMPC_when: case OMPC_append_args: + case OMPC_looprange: break; default: break; @@ -583,6 +584,7 @@ const char *clang::getOpenMPSimpleClauseTypeName(OpenMPClauseKind Kind, case OMPC_affinity: case OMPC_when: case OMPC_append_args: + case OMPC_looprange: break; default: break; diff --git a/clang/lib/Parse/ParseOpenMP.cpp b/clang/lib/Parse/ParseOpenMP.cpp index cfffcdb01a514..ade5192d1968d 100644 --- a/clang/lib/Parse/ParseOpenMP.cpp +++ b/clang/lib/Parse/ParseOpenMP.cpp @@ -3041,6 +3041,39 @@ OMPClause *Parser::ParseOpenMPSizesClause() { OpenLoc, CloseLoc); } +OMPClause *Parser::ParseOpenMPLoopRangeClause() { + SourceLocation ClauseNameLoc = ConsumeToken(); + SourceLocation FirstLoc, CountLoc; + + BalancedDelimiterTracker T(*this, tok::l_paren, tok::annot_pragma_openmp_end); + if (T.consumeOpen()) { + Diag(Tok, diag::err_expected) << tok::l_paren; + return nullptr; + } + + FirstLoc = Tok.getLocation(); + ExprResult FirstVal = ParseConstantExpression(); + if (!FirstVal.isUsable()) { + T.skipToEnd(); + return nullptr; + } + + ExpectAndConsume(tok::comma); + + CountLoc = Tok.getLocation(); + ExprResult CountVal = ParseConstantExpression(); + if (!CountVal.isUsable()) { + T.skipToEnd(); + return nullptr; + } + + T.consumeClose(); + + return Actions.OpenMP().ActOnOpenMPLoopRangeClause( + FirstVal.get(), CountVal.get(), ClauseNameLoc, T.getOpenLocation(), + FirstLoc, CountLoc, T.getCloseLocation()); +} + OMPClause *Parser::ParseOpenMPPermutationClause() { SourceLocation ClauseNameLoc, OpenLoc, CloseLoc; SmallVector ArgExprs; @@ -3469,6 +3502,9 @@ OMPClause *Parser::ParseOpenMPClause(OpenMPDirectiveKind DKind, } Clause = ParseOpenMPClause(CKind, WrongDirective); break; + case OMPC_looprange: + Clause 
= ParseOpenMPLoopRangeClause(); + break; default: break; } diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index bd8bee64a9d2f..556b5cb43b6f8 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -14289,7 +14289,6 @@ bool SemaOpenMP::checkTransformableLoopSequence( // and tries to match the input AST to the canonical loop sequence grammar // structure - auto NLCV = NestedLoopCounterVisitor(); // Helper functions to validate canonical loop sequence grammar is valid auto isLoopSequenceDerivation = [](auto *Child) { return isa(Child) || isa(Child) || @@ -14392,7 +14391,7 @@ bool SemaOpenMP::checkTransformableLoopSequence( // Modularized code for handling regular canonical loops auto handleRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, - &LoopSeqSize, &NumLoops, Kind, &TmpDSA, &NLCV, + &LoopSeqSize, &NumLoops, Kind, &TmpDSA, this](Stmt *Child) { OriginalInits.emplace_back(); LoopHelpers.emplace_back(); @@ -14405,8 +14404,11 @@ bool SemaOpenMP::checkTransformableLoopSequence( << getOpenMPDirectiveName(Kind); return false; } + storeLoopStatements(Child); - NumLoops += NLCV.TraverseStmt(Child); + auto NLCV = NestedLoopCounterVisitor(); + NLCV.TraverseStmt(Child); + NumLoops += NLCV.getNestedLoopCount(); return true; }; @@ -15732,6 +15734,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, Stmt *AStmt, SourceLocation StartLoc, SourceLocation EndLoc) { + ASTContext &Context = getASTContext(); DeclContext *CurrContext = SemaRef.CurContext; Scope *CurScope = SemaRef.getCurScope(); @@ -15748,7 +15751,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, SmallVector> OriginalInits; unsigned NumLoops; - // TODO: Support looprange clause using LoopSeqSize unsigned LoopSeqSize; if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, LoopHelpers, LoopStmts, OriginalInits, @@ -15757,10 +15759,67 @@ StmtResult 
SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, } // Defer transformation in dependent contexts + // The NumLoopNests argument is set to a placeholder (0) + // because a dependent context could prevent determining its true value if (CurrContext->isDependentContext()) { return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, - NumLoops, 1, AStmt, nullptr, nullptr); + NumLoops, 0, AStmt, nullptr, nullptr); } + + // Handle clauses, which can be any of the following: [looprange, apply] + const OMPLoopRangeClause *LRC = + OMPExecutableDirective::getSingleClause(Clauses); + + // The clause arguments are invalidated if any error arises + // such as non-constant or non-positive arguments + if (LRC && (!LRC->getFirst() || !LRC->getCount())) + return StmtError(); + + // Delayed semantic check of LoopRange constraint + // Evaluates the loop range arguments and returns the first and count values + auto EvaluateLoopRangeArguments = [&Context](Expr *First, Expr *Count, + uint64_t &FirstVal, + uint64_t &CountVal) { + llvm::APSInt FirstInt = First->EvaluateKnownConstInt(Context); + llvm::APSInt CountInt = Count->EvaluateKnownConstInt(Context); + FirstVal = FirstInt.getZExtValue(); + CountVal = CountInt.getZExtValue(); + }; + + // Checks if the loop range is valid + auto ValidLoopRange = [](uint64_t FirstVal, uint64_t CountVal, + unsigned NumLoops) -> bool { + return FirstVal + CountVal - 1 <= NumLoops; + }; + uint64_t FirstVal = 1, CountVal = 0, LastVal = LoopSeqSize; + + if (LRC) { + EvaluateLoopRangeArguments(LRC->getFirst(), LRC->getCount(), FirstVal, + CountVal); + if (CountVal == 1) + SemaRef.Diag(LRC->getCountLoc(), diag::warn_omp_redundant_fusion) + << getOpenMPDirectiveName(OMPD_fuse); + + if (!ValidLoopRange(FirstVal, CountVal, LoopSeqSize)) { + SemaRef.Diag(LRC->getFirstLoc(), diag::err_omp_invalid_looprange) + << getOpenMPDirectiveName(OMPD_fuse) << (FirstVal + CountVal - 1) + << LoopSeqSize; + return StmtError(); + } + + LastVal = FirstVal + CountVal 
- 1; + } + + // Complete fusion generates a single canonical loop nest + // However looprange clause generates several loop nests + unsigned NumLoopNests = LRC ? LoopSeqSize - CountVal + 1 : 1; + + // Emit a warning for redundant loop fusion when the sequence contains only + // one loop. + if (LoopSeqSize == 1) + SemaRef.Diag(AStmt->getBeginLoc(), diag::warn_omp_redundant_fusion) + << getOpenMPDirectiveName(OMPD_fuse); + assert(LoopHelpers.size() == LoopSeqSize && "Expecting loop iteration space dimensionality to match number of " "affected loops"); @@ -15774,8 +15833,8 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, SmallVector PreInits; // Select the type with the largest bit width among all induction variables - QualType IVType = LoopHelpers[0].IterationVarRef->getType(); - for (unsigned int I = 1; I < LoopSeqSize; ++I) { + QualType IVType = LoopHelpers[FirstVal - 1].IterationVarRef->getType(); + for (unsigned int I = FirstVal; I < LastVal; ++I) { QualType CurrentIVType = LoopHelpers[I].IterationVarRef->getType(); if (Context.getTypeSize(CurrentIVType) > Context.getTypeSize(IVType)) { IVType = CurrentIVType; @@ -15824,20 +15883,21 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // Process each single loop to generate and collect declarations // and statements for all helper expressions - for (unsigned int I = 0; I < LoopSeqSize; ++I) { + for (unsigned int I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], PreInits); - auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", I); - auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", I); - auto [STVD, STDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", I); + auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", J); + auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", J); + auto [STVD, STDStmt] = 
CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", J); auto [NIVD, NIDStmt] = - CreateHelperVarAndStmt(LoopHelpers[I].NumIterations, "ni", I, true); + CreateHelperVarAndStmt(LoopHelpers[I].NumIterations, "ni", J, true); auto [IVVD, IVDStmt] = - CreateHelperVarAndStmt(LoopHelpers[I].IterationVarRef, "iv", I); + CreateHelperVarAndStmt(LoopHelpers[I].IterationVarRef, "iv", J); if (!LBVD || !STVD || !NIVD || !IVVD) - return StmtError(); + assert(LBVD && STVD && NIVD && IVVD && + "OpenMP Fuse Helper variables creation failed"); UBVarDecls.push_back(UBVD); LBVarDecls.push_back(LBVD); @@ -15912,8 +15972,9 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // omp.fuse.max = max(omp.temp1, omp.temp0) ExprResult MaxExpr; - for (unsigned I = 0; I < LoopSeqSize; ++I) { - DeclRefExpr *NIRef = MakeVarDeclRef(NIVarDecls[I]); + // I is the true + for (unsigned I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { + DeclRefExpr *NIRef = MakeVarDeclRef(NIVarDecls[J]); QualType NITy = NIRef->getType(); if (MaxExpr.isUnset()) { @@ -15921,7 +15982,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, MaxExpr = NIRef; } else { // Create a new acummulator variable t_i = MaxExpr - std::string TempName = (Twine(".omp.temp.") + Twine(I)).str(); + std::string TempName = (Twine(".omp.temp.") + Twine(J)).str(); VarDecl *TempDecl = buildVarDecl(SemaRef, {}, NITy, TempName, nullptr, nullptr); TempDecl->setInit(MaxExpr.get()); @@ -15944,7 +16005,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, if (!Comparison.isUsable()) return StmtError(); - DeclRefExpr *NIRef2 = MakeVarDeclRef(NIVarDecls[I]); + DeclRefExpr *NIRef2 = MakeVarDeclRef(NIVarDecls[J]); // Update MaxExpr using a conditional expression to hold the max value MaxExpr = new (Context) ConditionalOperator( Comparison.get(), SourceLocation(), TempRef2, SourceLocation(), @@ -15997,23 +16058,21 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, CompoundStmt *FusedBody = 
nullptr; SmallVector FusedBodyStmts; - for (unsigned I = 0; I < LoopSeqSize; ++I) { - + for (unsigned I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { // Assingment of the original sub-loop index to compute the logical index // IV_k = LB_k + omp.fuse.index * ST_k - ExprResult IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Mul, - MakeVarDeclRef(STVarDecls[I]), MakeIVRef()); + MakeVarDeclRef(STVarDecls[J]), MakeIVRef()); if (!IdxExpr.isUsable()) return StmtError(); IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Add, - MakeVarDeclRef(LBVarDecls[I]), IdxExpr.get()); + MakeVarDeclRef(LBVarDecls[J]), IdxExpr.get()); if (!IdxExpr.isUsable()) return StmtError(); IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Assign, - MakeVarDeclRef(IVVarDecls[I]), IdxExpr.get()); + MakeVarDeclRef(IVVarDecls[J]), IdxExpr.get()); if (!IdxExpr.isUsable()) return StmtError(); @@ -16028,7 +16087,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, Stmt *Body = (isa(LoopStmts[I])) ? cast(LoopStmts[I])->getBody() : cast(LoopStmts[I])->getBody(); - BodyStmts.push_back(Body); CompoundStmt *CombinedBody = @@ -16036,7 +16094,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, SourceLocation(), SourceLocation()); ExprResult Condition = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_LT, MakeIVRef(), - MakeVarDeclRef(NIVarDecls[I])); + MakeVarDeclRef(NIVarDecls[J])); if (!Condition.isUsable()) return StmtError(); @@ -16057,8 +16115,26 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, FusedBody, InitStmt.get()->getBeginLoc(), SourceLocation(), IncrExpr.get()->getEndLoc()); + // In the case of looprange, the result of fuse won't simply + // be a single loop (ForStmt), but rather a loop sequence + // (CompoundStmt) of 3 parts: the pre-fusion loops, the fused loop + // and the post-fusion loops, preserving its original order. 
+ Stmt *FusionStmt = FusedForStmt; + if (LRC) { + SmallVector FinalLoops; + // Gather all the pre-fusion loops + for (unsigned I = 0; I < FirstVal - 1; ++I) + FinalLoops.push_back(LoopStmts[I]); + // Gather the fused loop + FinalLoops.push_back(FusedForStmt); + // Gather all the post-fusion loops + for (unsigned I = FirstVal + CountVal - 1; I < LoopSeqSize; ++I) + FinalLoops.push_back(LoopStmts[I]); + FusionStmt = CompoundStmt::Create(Context, FinalLoops, FPOptionsOverride(), + SourceLocation(), SourceLocation()); + } return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, NumLoops, - 1, AStmt, FusedForStmt, + NumLoopNests, AStmt, FusionStmt, buildPreInits(Context, PreInits)); } @@ -17181,6 +17257,31 @@ OMPClause *SemaOpenMP::ActOnOpenMPPartialClause(Expr *FactorExpr, FactorExpr); } +OMPClause *SemaOpenMP::ActOnOpenMPLoopRangeClause( + Expr *First, Expr *Count, SourceLocation StartLoc, SourceLocation LParenLoc, + SourceLocation FirstLoc, SourceLocation CountLoc, SourceLocation EndLoc) { + + // OpenMP [6.0, Restrictions] + // First and Count must be integer expressions with positive value + ExprResult FirstVal = + VerifyPositiveIntegerConstantInClause(First, OMPC_looprange); + if (FirstVal.isInvalid()) + First = nullptr; + + ExprResult CountVal = + VerifyPositiveIntegerConstantInClause(Count, OMPC_looprange); + if (CountVal.isInvalid()) + Count = nullptr; + + // OpenMP [6.0, Restrictions] + // first + count - 1 must not evaluate to a value greater than the + // loop sequence length of the associated canonical loop sequence. 
+ // This check must be performed afterwards due to the delayed + // parsing and computation of the associated loop sequence + return OMPLoopRangeClause::Create(getASTContext(), StartLoc, LParenLoc, + FirstLoc, CountLoc, EndLoc, First, Count); +} + OMPClause *SemaOpenMP::ActOnOpenMPAlignClause(Expr *A, SourceLocation StartLoc, SourceLocation LParenLoc, SourceLocation EndLoc) { diff --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h index 034b0c8243667..d70e2a3874c07 100644 --- a/clang/lib/Sema/TreeTransform.h +++ b/clang/lib/Sema/TreeTransform.h @@ -1775,6 +1775,14 @@ class TreeTransform { LParenLoc, EndLoc); } + OMPClause * + RebuildOMPLoopRangeClause(Expr *First, Expr *Count, SourceLocation StartLoc, + SourceLocation LParenLoc, SourceLocation FirstLoc, + SourceLocation CountLoc, SourceLocation EndLoc) { + return getSema().OpenMP().ActOnOpenMPLoopRangeClause( + First, Count, StartLoc, LParenLoc, FirstLoc, CountLoc, EndLoc); + } + /// Build a new OpenMP 'allocator' clause. /// /// By default, performs semantic analysis to build the new OpenMP clause. 
@@ -10569,6 +10577,31 @@ TreeTransform::TransformOMPPartialClause(OMPPartialClause *C) {
                                    C->getEndLoc());
 }
 
+template
+OMPClause *
+TreeTransform::TransformOMPLoopRangeClause(OMPLoopRangeClause *C) {
+  ExprResult F = getDerived().TransformExpr(C->getFirst());
+  if (F.isInvalid())
+    return nullptr;
+
+  ExprResult Cn = getDerived().TransformExpr(C->getCount());
+  if (Cn.isInvalid())
+    return nullptr;
+
+  Expr *First = F.get();
+  Expr *Count = Cn.get();
+
+  bool Changed = (First != C->getFirst()) || (Count != C->getCount());
+
+  // If no changes and AlwaysRebuild() is false, return the original clause
+  if (!Changed && !getDerived().AlwaysRebuild())
+    return C;
+
+  return RebuildOMPLoopRangeClause(First, Count, C->getBeginLoc(),
+                                   C->getLParenLoc(), C->getFirstLoc(),
+                                   C->getCountLoc(), C->getEndLoc());
+}
+
 template
 OMPClause *
 TreeTransform::TransformOMPCollapseClause(OMPCollapseClause *C) {
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index d068f5e163176..8591eb9394fa5 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -11088,6 +11088,9 @@ OMPClause *OMPClauseReader::readClause() {
   case llvm::omp::OMPC_partial:
     C = OMPPartialClause::CreateEmpty(Context);
     break;
+  case llvm::omp::OMPC_looprange:
+    C = OMPLoopRangeClause::CreateEmpty(Context);
+    break;
   case llvm::omp::OMPC_allocator:
     C = new (Context) OMPAllocatorClause();
     break;
@@ -11489,6 +11492,14 @@ void OMPClauseReader::VisitOMPPartialClause(OMPPartialClause *C) {
   C->setLParenLoc(Record.readSourceLocation());
 }
 
+void OMPClauseReader::VisitOMPLoopRangeClause(OMPLoopRangeClause *C) {
+  C->setFirst(Record.readSubExpr());
+  C->setCount(Record.readSubExpr());
+  C->setLParenLoc(Record.readSourceLocation());
+  C->setFirstLoc(Record.readSourceLocation());
+  C->setCountLoc(Record.readSourceLocation());
+}
+
 void OMPClauseReader::VisitOMPAllocatorClause(OMPAllocatorClause *C) {
   C->setAllocator(Record.readExpr());
   C->setLParenLoc(Record.readSourceLocation());
diff --git a/clang/lib/Serialization/ASTReaderStmt.cpp b/clang/lib/Serialization/ASTReaderStmt.cpp
index 6762d11d6b73e..a301e1c0b0e32 100644
--- a/clang/lib/Serialization/ASTReaderStmt.cpp
+++ b/clang/lib/Serialization/ASTReaderStmt.cpp
@@ -3621,7 +3621,9 @@ Stmt *ASTReader::ReadStmtFromStream(ModuleFile &F) {
     case STMT_OMP_FUSE_DIRECTIVE: {
       unsigned NumLoops = Record[ASTStmtReader::NumStmtFields];
       unsigned NumClauses = Record[ASTStmtReader::NumStmtFields + 1];
-      S = OMPFuseDirective::CreateEmpty(Context, NumClauses, NumLoops);
+      unsigned NumLoopNests = Record[ASTStmtReader::NumStmtFields + 2];
+      S = OMPFuseDirective::CreateEmpty(Context, NumClauses, NumLoops,
+                                        NumLoopNests);
       break;
     }
 
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 1b3d3c22aa9f5..8548f7e50d34b 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -7782,6 +7782,14 @@ void OMPClauseWriter::VisitOMPPartialClause(OMPPartialClause *C) {
   Record.AddSourceLocation(C->getLParenLoc());
 }
 
+void OMPClauseWriter::VisitOMPLoopRangeClause(OMPLoopRangeClause *C) {
+  Record.AddStmt(C->getFirst());
+  Record.AddStmt(C->getCount());
+  Record.AddSourceLocation(C->getLParenLoc());
+  Record.AddSourceLocation(C->getFirstLoc());
+  Record.AddSourceLocation(C->getCountLoc());
+}
+
 void OMPClauseWriter::VisitOMPAllocatorClause(OMPAllocatorClause *C) {
   Record.AddStmt(C->getAllocator());
   Record.AddSourceLocation(C->getLParenLoc());
diff --git a/clang/test/OpenMP/fuse_ast_print.cpp b/clang/test/OpenMP/fuse_ast_print.cpp
index 43ce815dab024..ac4f0d38a9c68 100644
--- a/clang/test/OpenMP/fuse_ast_print.cpp
+++ b/clang/test/OpenMP/fuse_ast_print.cpp
@@ -271,6 +271,73 @@ void foo7() {
 }
 
+// PRINT-LABEL: void foo8(
+// DUMP-LABEL: FunctionDecl {{.*}} foo8
+void foo8() {
+  // PRINT: #pragma omp fuse looprange(2,2)
+  // DUMP: OMPFuseDirective
+  // DUMP: OMPLooprangeClause
+  #pragma omp fuse looprange(2,2)
+  // PRINT: {
+  // DUMP: CompoundStmt
+  {
+    // PRINT: for (int i = 0; i < 10; i += 2)
+    // DUMP: ForStmt
+    for (int i = 0; i < 10; i += 2)
+      // PRINT: body(i)
+      // DUMP: CallExpr
+      body(i);
+    // PRINT: for (int j = 10; j > 0; --j)
+    // DUMP: ForStmt
+    for (int j = 10; j > 0; --j)
+      // PRINT: body(j)
+      // DUMP: CallExpr
+      body(j);
+    // PRINT: for (int k = 0; k <= 10; ++k)
+    // DUMP: ForStmt
+    for (int k = 0; k <= 10; ++k)
+      // PRINT: body(k)
+      // DUMP: CallExpr
+      body(k);
+
+  }
+
+}
+
+//PRINT-LABEL: void foo9(
+//DUMP-LABEL: FunctionTemplateDecl {{.*}} foo9
+//DUMP-LABEL: NonTypeTemplateParmDecl {{.*}} F
+//DUMP-LABEL: NonTypeTemplateParmDecl {{.*}} C
+template
+void foo9() {
+  // PRINT: #pragma omp fuse looprange(F,C)
+  // DUMP: OMPFuseDirective
+  // DUMP: OMPLooprangeClause
+  #pragma omp fuse looprange(F,C)
+  // PRINT: {
+  // DUMP: CompoundStmt
+  {
+    // PRINT: for (int i = 0; i < 10; i += 2)
+    // DUMP: ForStmt
+    for (int i = 0; i < 10; i += 2)
+      // PRINT: body(i)
+      // DUMP: CallExpr
+      body(i);
+    // PRINT: for (int j = 10; j > 0; --j)
+    // DUMP: ForStmt
+    for (int j = 10; j > 0; --j)
+      // PRINT: body(j)
+      // DUMP: CallExpr
+      body(j);
+
+  }
+}
+
+// Also test instantiating the template.
+void tfoo9() { + foo9<1, 2>(); +} + diff --git a/clang/test/OpenMP/fuse_codegen.cpp b/clang/test/OpenMP/fuse_codegen.cpp index 6c1e21092da43..d9500bed3ce31 100644 --- a/clang/test/OpenMP/fuse_codegen.cpp +++ b/clang/test/OpenMP/fuse_codegen.cpp @@ -53,6 +53,18 @@ extern "C" void foo3() { } } +extern "C" void foo4() { + double arr[256]; + + #pragma omp fuse looprange(2,2) + { + for(int i = 0; i < 128; ++i) body(i); + for(int j = 0; j < 256; j+=2) body(j); + for(int k = 0; k < 64; ++k) body(k); + for(int c = 42; auto &&v: arr) body(c,v); + } +} + #endif // CHECK1-LABEL: define dso_local void @body( @@ -777,6 +789,157 @@ extern "C" void foo3() { // CHECK1-NEXT: ret void // // +// CHECK1-LABEL: define dso_local void @foo4( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// 
CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK1-NEXT: store i32 63, ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP5:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP5]], 128 +// CHECK1-NEXT: br i1 [[CMP1]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP6]]) +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP7:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND2:.*]] +// CHECK1: [[FOR_COND2]]: +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP3:%.*]] = icmp slt i32 [[TMP8]], [[TMP9]] +// CHECK1-NEXT: br i1 [[CMP3]], label %[[FOR_BODY4:.*]], label %[[FOR_END17:.*]] +// CHECK1: [[FOR_BODY4]]: +// CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP5:%.*]] = icmp slt i32 [[TMP10]], [[TMP11]] +// CHECK1-NEXT: br i1 [[CMP5]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP13]], [[TMP14]] +// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP12]], [[MUL]] +// CHECK1-NEXT: store i32 [[ADD]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[MUL6:%.*]] = mul nsw i32 [[TMP15]], 2 +// CHECK1-NEXT: [[ADD7:%.*]] = add nsw i32 0, [[MUL6]] +// CHECK1-NEXT: store i32 [[ADD7]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP16]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP8:%.*]] = icmp slt i32 [[TMP17]], [[TMP18]] +// CHECK1-NEXT: br i1 [[CMP8]], label %[[IF_THEN9:.*]], label %[[IF_END14:.*]] +// CHECK1: [[IF_THEN9]]: +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL10:%.*]] = mul nsw i32 [[TMP20]], [[TMP21]] +// CHECK1-NEXT: [[ADD11:%.*]] = add nsw i32 [[TMP19]], [[MUL10]] +// CHECK1-NEXT: store i32 [[ADD11]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[MUL12:%.*]] = mul nsw i32 [[TMP22]], 1 +// CHECK1-NEXT: [[ADD13:%.*]] = add nsw i32 0, [[MUL12]] +// CHECK1-NEXT: store i32 [[ADD13]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[K]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP23]]) +// CHECK1-NEXT: br label %[[IF_END14]] +// CHECK1: [[IF_END14]]: +// CHECK1-NEXT: br label %[[FOR_INC15:.*]] +// CHECK1: [[FOR_INC15]]: +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC16:%.*]] = add nsw i32 [[TMP24]], 1 +// CHECK1-NEXT: store i32 [[INC16]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND2]], !llvm.loop [[LOOP8:![0-9]+]] +// CHECK1: [[FOR_END17]]: +// CHECK1-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[TMP25:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP25]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP26:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY18:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP26]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY18]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND19:.*]] +// CHECK1: [[FOR_COND19]]: +// CHECK1-NEXT: [[TMP27:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP28:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK1-NEXT: [[CMP20:%.*]] = icmp ne ptr [[TMP27]], [[TMP28]] +// CHECK1-NEXT: br i1 [[CMP20]], label %[[FOR_BODY21:.*]], label %[[FOR_END23:.*]] +// CHECK1: [[FOR_BODY21]]: +// CHECK1-NEXT: [[TMP29:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP29]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP32:%.*]] = load double, ptr [[TMP31]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP30]], double noundef [[TMP32]]) +// CHECK1-NEXT: br label %[[FOR_INC22:.*]] +// CHECK1: [[FOR_INC22]]: +// CHECK1-NEXT: [[TMP33:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP33]], i32 1 +// CHECK1-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND19]] +// CHECK1: [[FOR_END23]]: +// CHECK1-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @body( // CHECK2-SAME: ...) #[[ATTR0:[0-9]+]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -1259,6 +1422,157 @@ extern "C" void foo3() { // CHECK2-NEXT: ret void // // +// CHECK2-LABEL: define dso_local void @foo4( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[V:%.*]] = alloca 
ptr, align 8 +// CHECK2-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK2-NEXT: store i32 63, ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP5]], 128 +// CHECK2-NEXT: br i1 [[CMP1]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP6]]) +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND2:.*]] +// CHECK2: [[FOR_COND2]]: +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP3:%.*]] = icmp slt i32 [[TMP8]], [[TMP9]] +// CHECK2-NEXT: br i1 [[CMP3]], label %[[FOR_BODY4:.*]], label %[[FOR_END17:.*]] +// CHECK2: [[FOR_BODY4]]: +// CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP5:%.*]] = icmp slt i32 [[TMP10]], [[TMP11]] +// CHECK2-NEXT: br i1 [[CMP5]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP13]], [[TMP14]] +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP12]], [[MUL]] +// CHECK2-NEXT: store i32 [[ADD]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL6:%.*]] = mul nsw i32 [[TMP15]], 2 +// CHECK2-NEXT: [[ADD7:%.*]] = add nsw i32 0, [[MUL6]] +// CHECK2-NEXT: store i32 [[ADD7]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP16]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP8:%.*]] = icmp slt i32 [[TMP17]], [[TMP18]] +// CHECK2-NEXT: br i1 [[CMP8]], label %[[IF_THEN9:.*]], label %[[IF_END14:.*]] +// CHECK2: [[IF_THEN9]]: +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL10:%.*]] = mul nsw i32 [[TMP20]], [[TMP21]] +// CHECK2-NEXT: [[ADD11:%.*]] = add nsw i32 [[TMP19]], [[MUL10]] +// CHECK2-NEXT: store i32 [[ADD11]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL12:%.*]] = mul nsw i32 [[TMP22]], 1 +// CHECK2-NEXT: [[ADD13:%.*]] = add nsw i32 0, [[MUL12]] +// CHECK2-NEXT: store i32 [[ADD13]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP23]]) +// CHECK2-NEXT: br label %[[IF_END14]] +// CHECK2: [[IF_END14]]: +// CHECK2-NEXT: br label %[[FOR_INC15:.*]] +// CHECK2: [[FOR_INC15]]: +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC16:%.*]] = add nsw i32 [[TMP24]], 1 +// CHECK2-NEXT: store i32 [[INC16]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND2]], !llvm.loop [[LOOP7:![0-9]+]] +// CHECK2: [[FOR_END17]]: +// CHECK2-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[TMP25:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP25]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP26:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY18:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP26]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY18]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND19:.*]] +// CHECK2: [[FOR_COND19]]: +// CHECK2-NEXT: [[TMP27:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP28:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: [[CMP20:%.*]] = icmp ne ptr [[TMP27]], [[TMP28]] +// CHECK2-NEXT: br i1 [[CMP20]], label %[[FOR_BODY21:.*]], label %[[FOR_END23:.*]] +// CHECK2: [[FOR_BODY21]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP29]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP32:%.*]] = load double, ptr [[TMP31]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP30]], double noundef [[TMP32]]) +// CHECK2-NEXT: br label %[[FOR_INC22:.*]] +// CHECK2: [[FOR_INC22]]: +// CHECK2-NEXT: [[TMP33:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP33]], i32 1 +// CHECK2-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND19]] +// CHECK2: [[FOR_END23]]: +// CHECK2-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @tfoo2( // CHECK2-SAME: ) #[[ATTR0]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -1494,7 +1808,7 @@ extern "C" void foo3() { // CHECK2-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 // CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP8:![0-9]+]] // CHECK2: [[FOR_END]]: // CHECK2-NEXT: ret void // @@ -1503,9 +1817,13 @@ extern "C" void foo3() { // CHECK1: [[META4]] = !{!"llvm.loop.mustprogress"} // CHECK1: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} // CHECK1: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +// CHECK1: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]} +// CHECK1: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]} //. // CHECK2: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]} // CHECK2: [[META4]] = !{!"llvm.loop.mustprogress"} // CHECK2: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} // CHECK2: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +// CHECK2: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]} +// CHECK2: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]} //. 
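As a reading aid for the `foo4` CHECK lines above (not part of the patch), the lowering they describe can be sketched in plain C++. This is only an illustration under stated assumptions: `foo4_lowered`, `Trace` and the single-argument `body` are hypothetical names introduced here, while the trip counts and the guard structure are read off the `.omp.ni0`, `.omp.ni1`, `.omp.fuse.max` and `.omp.fuse.index` values in the expected IR. The fourth (range-based) loop of `foo4` is left out, since it lies outside the loop range and is emitted unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the test's variadic body(...): it records its
// argument so that the order of calls can be inspected afterwards.
static std::vector<int> Trace;
static void body(int V) { Trace.push_back(V); }

// Sketch of `#pragma omp fuse looprange(2,2)` applied to the first three
// loops of foo4: loop 1 is untouched, loops 2 and 3 share one fused loop.
static void foo4_lowered() {
  // Loop 1 (i) is outside the loop range and runs to completion first.
  for (int i = 0; i < 128; ++i)
    body(i);

  // Trip counts of the two fused loops (.omp.ni0 and .omp.ni1 in the IR).
  const int NI0 = 128; // for (int j = 0; j < 256; j += 2)
  const int NI1 = 64;  // for (int k = 0; k < 64; ++k)

  // The fused loop runs for the larger of the two trip counts
  // (.omp.fuse.max) and guards each body with its own trip count.
  const int FuseMax = std::max(NI0, NI1);
  for (int Idx = 0; Idx < FuseMax; ++Idx) { // .omp.fuse.index
    if (Idx < NI0) {
      int j = 0 + Idx * 2; // original start plus logical index times step
      body(j);
    }
    if (Idx < NI1) {
      int k = 0 + Idx * 1;
      body(k);
    }
  }
}
```

Calling `foo4_lowered()` makes the interleaving visible: the `j` and `k` bodies alternate for the first 64 fused iterations, after which only the `j` body still executes.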
diff --git a/clang/test/OpenMP/fuse_messages.cpp b/clang/test/OpenMP/fuse_messages.cpp
index 50dedfd2c0dc6..2a2491d008a0b 100644
--- a/clang/test/OpenMP/fuse_messages.cpp
+++ b/clang/test/OpenMP/fuse_messages.cpp
@@ -33,6 +33,8 @@ void func() {
     {
         for (int i = 0; i < 7; ++i)
             ;
+        for(int j = 0; j < 100; ++j);
+
     }
 
@@ -41,6 +43,8 @@ void func() {
     {
         for (int i = 0; i < 7; ++i)
             ;
+        for(int j = 0; j < 100; ++j);
+
     }
 
     //expected-error@+4 {{loop after '#pragma omp fuse' is not in canonical form}}
@@ -50,6 +54,7 @@ void func() {
         for(int i = 0; i < 10; i*=2) {
             ;
         }
+        for(int j = 0; j < 100; ++j);
     }
 
     //expected-error@+2 {{loop sequence after '#pragma omp fuse' must contain at least 1 canonical loop or loop-generating construct}}
@@ -73,4 +78,109 @@ void func() {
         for(unsigned int j = 0; j < 10; ++j);
         for(long long k = 0; k < 100; ++k);
     }
-}
\ No newline at end of file
+
+    //expected-warning@+2 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}}
+    #pragma omp fuse
+    {
+        for(int i = 0; i < 10; ++i);
+    }
+
+    //expected-warning@+1 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}}
+    #pragma omp fuse looprange(1, 1)
+    {
+        for(int i = 0; i < 10; ++i);
+        for(int j = 0; j < 100; ++j);
+    }
+
+    //expected-error@+1 {{argument to 'looprange' clause must be a strictly positive integer value}}
+    #pragma omp fuse looprange(1, -1)
+    {
+        for(int i = 0; i < 10; ++i);
+        for(int j = 0; j < 100; ++j);
+    }
+
+    //expected-error@+1 {{argument to 'looprange' clause must be a strictly positive integer value}}
+    #pragma omp fuse looprange(1, 0)
+    {
+        for(int i = 0; i < 10; ++i);
+        for(int j = 0; j < 100; ++j);
+    }
+
+    const int x = 1;
+    constexpr int y = 4;
+    //expected-error@+1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '4' is greater than the total number of loops '3'}}
+    #pragma omp fuse looprange(x,y)
+    {
+        for(int i = 0; i < 10; ++i);
+        for(int j = 0; j < 100; ++j);
+        for(int k = 0; k < 50; ++k);
+    }
+
+    //expected-error@+1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '420' is greater than the total number of loops '3'}}
+    #pragma omp fuse looprange(1,420)
+    {
+        for(int i = 0; i < 10; ++i);
+        for(int j = 0; j < 100; ++j);
+        for(int k = 0; k < 50; ++k);
+    }
+}
+
+// In a template context, but expression itself not instantiation-dependent
+template
+static void templated_func() {
+
+    //expected-warning@+1 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}}
+    #pragma omp fuse looprange(2,1)
+    {
+        for(int i = 0; i < 10; ++i);
+        for(int j = 0; j < 100; ++j);
+        for(int k = 0; k < 50; ++k);
+    }
+
+    //expected-error@+1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '5' is greater than the total number of loops '3'}}
+    #pragma omp fuse looprange(3,3)
+    {
+        for(int i = 0; i < 10; ++i);
+        for(int j = 0; j < 100; ++j);
+        for(int k = 0; k < 50; ++k);
+    }
+
+}
+
+template
+static void templated_func_value_dependent() {
+
+    //expected-warning@+1 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}}
+    #pragma omp fuse looprange(V,1)
+    {
+        for(int i = 0; i < 10; ++i);
+        for(int j = 0; j < 100; ++j);
+        for(int k = 0; k < 50; ++k);
+    }
+}
+
+template
+static void templated_func_type_dependent() {
+    constexpr T s = 1;
+
+    //expected-error@+1 {{argument to 'looprange' clause must be a strictly positive integer value}}
+    #pragma omp fuse looprange(s,s-1)
+    {
+        for(int i = 0; i < 10; ++i);
+        for(int j = 0; j < 100; ++j);
+        for(int k = 0; k < 50; ++k);
+    }
+}
+
+
+void template_inst() {
+    // expected-note@+1 {{in instantiation of function template specialization 'templated_func' requested here}}
+    templated_func();
+    // expected-note@+1 {{in instantiation of function template specialization 'templated_func_value_dependent<1>' requested here}}
+    templated_func_value_dependent<1>();
+    // expected-note@+1 {{in instantiation of function template specialization 'templated_func_type_dependent' requested here}}
+    templated_func_type_dependent();
+
+}
+
+
diff --git a/clang/tools/libclang/CIndex.cpp b/clang/tools/libclang/CIndex.cpp
index fd788ac3d69d4..38f5183b146ee 100644
--- a/clang/tools/libclang/CIndex.cpp
+++ b/clang/tools/libclang/CIndex.cpp
@@ -2412,6 +2412,11 @@ void OMPClauseEnqueue::VisitOMPPartialClause(const OMPPartialClause *C) {
   Visitor->AddStmt(C->getFactor());
 }
 
+void OMPClauseEnqueue::VisitOMPLoopRangeClause(const OMPLoopRangeClause *C) {
+  Visitor->AddStmt(C->getFirst());
+  Visitor->AddStmt(C->getCount());
+}
+
 void OMPClauseEnqueue::VisitOMPAllocatorClause(const OMPAllocatorClause *C) {
   Visitor->AddStmt(C->getAllocator());
 }
diff --git a/llvm/include/llvm/Frontend/OpenMP/ClauseT.h b/llvm/include/llvm/Frontend/OpenMP/ClauseT.h
index e0714e812e5cd..dd51274c1aaf5 100644
--- a/llvm/include/llvm/Frontend/OpenMP/ClauseT.h
+++ b/llvm/include/llvm/Frontend/OpenMP/ClauseT.h
@@ -1233,6 +1233,15 @@ struct WriteT {
   using EmptyTrait = std::true_type;
 };
 
+// V6: [6.4.7] Looprange clause
+template struct LoopRangeT {
+  using Begin = E;
+  using End = E;
+
+  using TupleTrait = std::true_type;
+  std::tuple t;
+};
+
 // ---
 
 template
@@ -1263,9 +1272,10 @@ using TupleClausesT =
                  DefaultmapT, DeviceT, DistScheduleT,
                  DoacrossT, FromT, GrainsizeT,
                  IfT, InitT, InReductionT,
-                 LastprivateT, LinearT, MapT,
-                 NumTasksT, OrderT, ReductionT,
-                 ScheduleT, TaskReductionT, ToT>;
+                 LastprivateT, LinearT, LoopRangeT,
+                 MapT, NumTasksT, OrderT,
+                 ReductionT, ScheduleT,
+                 TaskReductionT, ToT>;
 
 template
 using UnionClausesT = std::variant>;
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index 8286cfcadaafd..ae19385c022d0 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -271,6 +271,9 @@ def OMPC_Linear : Clause<"linear"> {
 def
OMPC_Link : Clause<"link"> { let flangClass = "OmpObjectList"; } +def OMPC_LoopRange : Clause<"looprange"> { + let clangClass = "OMPLoopRangeClause"; +} def OMPC_Map : Clause<"map"> { let clangClass = "OMPMapClause"; let flangClass = "OmpMapClause"; @@ -853,6 +856,9 @@ def OMP_For : Directive<"for"> { let languages = [L_C]; } def OMP_Fuse : Directive<"fuse"> { + let allowedOnceClauses = [ + VersionedClause + ]; let association = AS_Loop; let category = CA_Executable; } >From c1e5fc3fe2ac7f126a76b44906b30029e3cc797b Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:30:39 +0000 Subject: [PATCH 3/9] Addef fuse to documentation --- clang/docs/OpenMPSupport.rst | 2 ++ clang/docs/ReleaseNotes.rst | 1 + 2 files changed, 3 insertions(+) diff --git a/clang/docs/OpenMPSupport.rst b/clang/docs/OpenMPSupport.rst index d6507071d4693..5f0e363792b32 100644 --- a/clang/docs/OpenMPSupport.rst +++ b/clang/docs/OpenMPSupport.rst @@ -376,6 +376,8 @@ implementation. +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | loop stripe transformation | :good:`done` | https://github.com/llvm/llvm-project/pull/119891 | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ +| loop fuse transformation | :good:`done` | :none:`unclaimed` | | ++-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | work distribute construct | :none:`unclaimed` | :none:`unclaimed` | | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | 
task_iteration | :none:`unclaimed` | :none:`unclaimed` | | diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 573ae97bff710..2188e42dc705c 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -1016,6 +1016,7 @@ OpenMP Support open parenthesis. (#GH139665) - An error is now emitted when OpenMP ``collapse`` and ``ordered`` clauses have an argument larger than what can fit within a 64-bit integer. +- Added support for 'omp fuse' directive. Improvements ^^^^^^^^^^^^ >From 33119f77c07cc3ecbb5b3360fd8f63a958e808c1 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:43:41 +0000 Subject: [PATCH 4/9] Refactored preinits handling and improved coverage --- clang/docs/OpenMPSupport.rst | 2 +- clang/include/clang/AST/StmtOpenMP.h | 5 +- clang/include/clang/Sema/SemaOpenMP.h | 96 +- clang/lib/AST/StmtOpenMP.cpp | 13 + clang/lib/Basic/OpenMPKinds.cpp | 3 +- clang/lib/CodeGen/CGExpr.cpp | 2 + clang/lib/CodeGen/CodeGenFunction.h | 4 + clang/lib/Sema/SemaOpenMP.cpp | 588 ++++--- clang/test/OpenMP/fuse_ast_print.cpp | 55 + clang/test/OpenMP/fuse_codegen.cpp | 2117 +++++++++++++++---------- 10 files changed, 1862 insertions(+), 1023 deletions(-) diff --git a/clang/docs/OpenMPSupport.rst b/clang/docs/OpenMPSupport.rst index 5f0e363792b32..b39f9d3634a63 100644 --- a/clang/docs/OpenMPSupport.rst +++ b/clang/docs/OpenMPSupport.rst @@ -376,7 +376,7 @@ implementation. 
+-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | loop stripe transformation | :good:`done` | https://github.com/llvm/llvm-project/pull/119891 | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ -| loop fuse transformation | :good:`done` | :none:`unclaimed` | | +| loop fuse transformation | :good:`prototyped` | :none:`unclaimed` | | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | work distribute construct | :none:`unclaimed` | :none:`unclaimed` | | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index 85bde292ca748..b6a948a8c6020 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -1005,8 +1005,7 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { Stmt::StmtClass C = T->getStmtClass(); return C == OMPTileDirectiveClass || C == OMPUnrollDirectiveClass || C == OMPReverseDirectiveClass || C == OMPInterchangeDirectiveClass || - C == OMPStripeDirectiveClass || - C == OMPFuseDirectiveClass; + C == OMPStripeDirectiveClass || C == OMPFuseDirectiveClass; } }; @@ -5653,6 +5652,8 @@ class OMPStripeDirective final : public OMPLoopTransformationDirective { llvm::omp::OMPD_stripe, StartLoc, EndLoc, NumLoops) { setNumGeneratedLoops(2 * NumLoops); + // Similar to Tile, it only generates a single top level loop nest + setNumGeneratedLoopNests(1); } void 
setPreInits(Stmt *PreInits) { diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index f4a075e54cebe..ac4cbe3709a0d 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -1493,16 +1493,96 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, Stmt *&Body, SmallVectorImpl> &OriginalInits); - /// Analyzes and checks a loop sequence for use by a loop transformation + /// @brief Categories of loops encountered during semantic OpenMP loop + /// analysis + /// + /// This enumeration identifies the structural category of a loop or sequence + /// of loops analyzed in the context of OpenMP transformations and directives. + /// This categorization helps differentiate between original source loops + /// and the structures resulting from applying OpenMP loop transformations. + enum class OMPLoopCategory { + + /// @var OMPLoopCategory::RegularLoop + /// Represents a standard canonical loop nest found in the + /// original source code or an intact loop after transformations + /// (i.e Post/Pre loops of a loopranged fusion) + RegularLoop, + + /// @var OMPLoopCategory::TransformSingleLoop + /// Represents the resulting loop structure when an OpenMP loop + // transformation, generates a single, top-level loop + TransformSingleLoop, + + /// @var OMPLoopCategory::TransformLoopSequence + /// Represents the resulting loop structure when an OpenMP loop + /// transformation + /// generates a sequence of two or more canonical loop nests + TransformLoopSequence + }; + + /// The main recursive process of `checkTransformableLoopSequence` that + /// performs grammatical parsing of a canonical loop sequence. It extracts + /// key information, such as the number of top-level loops, loop statements, + /// helper expressions, and other relevant loop-related data, all in a single + /// execution to avoid redundant traversals. 
This analysis flattens inner + /// Loop Sequences + /// + /// \param LoopSeqStmt The AST of the original statement. + /// \param LoopSeqSize [out] Number of top level canonical loops. + /// \param NumLoops [out] Number of total canonical loops (nested too). + /// \param LoopHelpers [out] The multiple loop analyses results. + /// \param ForStmts [out] The multiple Stmt of each For loop. + /// \param OriginalInits [out] The raw original initialization statements + /// of each belonging to a loop of the loop sequence + /// \param TransformPreInits [out] The multiple collection of statements and + /// declarations that must have been executed/declared + /// before entering the loop (each belonging to a + /// particular loop transformation, nullptr otherwise) + /// \param LoopSequencePreInits [out] Additional general collection of loop + /// transformation related statements and declarations + /// not bounded to a particular loop that must be + /// executed before entering the loop transformation + /// \param LoopCategories [out] A sequence of OMPLoopCategory values, + /// one for each loop or loop transformation node + /// successfully analyzed. + /// \param Context + /// \param Kind The loop transformation directive kind. + /// \return Whether the original statement is both syntactically and + /// semantically correct according to OpenMP 6.0 canonical loop + /// sequence definition. + bool analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind); + + /// Validates and checks whether a loop sequence can be transformed according + /// to the given directive, providing necessary setup and initialization + /// (Driver function) before recursion using `analyzeLoopSequence`. 
/// /// \param Kind The loop transformation directive kind. - /// \param NumLoops [out] Number of total canonical loops - /// \param LoopSeqSize [out] Number of top level canonical loops + /// \param AStmt The AST of the original statement + /// \param LoopSeqSize [out] Number of top level canonical loops. + /// \param NumLoops [out] Number of total canonical loops (nested too) /// \param LoopHelpers [out] The multiple loop analyses results. - /// \param LoopStmts [out] The multiple Stmt of each For loop. - /// \param OriginalInits [out] The multiple collection of statements and + /// \param ForStmts [out] The multiple Stmt of each For loop. + /// \param OriginalInits [out] The raw original initialization statements + /// of each belonging to a loop of the loop sequence + /// \param TransformsPreInits [out] The multiple collection of statements and /// declarations that must have been executed/declared - /// before entering the loop. + /// before entering the loop (each belonging to a + /// particular loop transformation, nullptr otherwise) + /// \param LoopSequencePreInits [out] Additional general collection of loop + /// transformation related statements and declarations + /// not bounded to a particular loop that must be + /// executed before entering the loop transformation + /// \param LoopCategories [out] A sequence of OMPLoopCategory values, + /// one for each loop or loop transformation node + /// successfully analyzed. /// \param Context /// \return Whether there was an absence of errors or not bool checkTransformableLoopSequence( @@ -1511,7 +1591,9 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, SmallVectorImpl> &OriginalInits, - ASTContext &Context); + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context); /// Helper to keep information about the current `omp begin/end declare /// variant` nesting. 
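For readers following the doc comments in the SemaOpenMP.h hunk above, the flattening that `analyzeLoopSequence` is described as performing can be pictured with a small standalone model. This is only a sketch under stated assumptions: `Node`, `NodeKind`, `LoopCategory`, `SeqResult`, `analyzeModel` and `demoSequence` are hypothetical names that do not exist in Clang, and nest depths are supplied by hand instead of being derived from an AST.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for AST nodes; none of these types exist in Clang.
enum class NodeKind { Sequence, LoopNest, Transform };
enum class LoopCategory { RegularLoop, TransformSingleLoop };

struct Node {
  NodeKind Kind = NodeKind::Sequence;
  unsigned Depth = 0;          // LoopNest: loops in the nest, outermost included
  unsigned GeneratedNests = 0; // Transform: top-level loop nests it produces
  unsigned GeneratedLoops = 0; // Transform: total loops it produces
  std::vector<Node> Children;  // Sequence items, or the sequence a Transform generates
};

struct SeqResult {
  unsigned LoopSeqSize = 0; // number of top-level loop nests after flattening
  unsigned NumLoops = 0;    // total number of loops, nested ones included
  std::vector<LoopCategory> Categories;
};

// Recursively walks the model and accumulates the counts in a single pass.
// Returns false when an item cannot contribute a loop (for example a
// transformation that generates zero loop nests, like a full unroll).
bool analyzeModel(const Node &N, SeqResult &R) {
  switch (N.Kind) {
  case NodeKind::Sequence:
    // Nested sequences are flattened into the enclosing one.
    for (const Node &C : N.Children)
      if (!analyzeModel(C, R))
        return false;
    return true;
  case NodeKind::LoopNest:
    ++R.LoopSeqSize;
    R.NumLoops += N.Depth;
    R.Categories.push_back(LoopCategory::RegularLoop);
    return true;
  case NodeKind::Transform:
    if (N.GeneratedNests == 0)
      return false;
    if (N.GeneratedNests > 1) {
      // A transformation that yields a loop sequence is flattened as well.
      for (const Node &C : N.Children)
        if (!analyzeModel(C, R))
          return false;
      return true;
    }
    ++R.LoopSeqSize;
    R.NumLoops += N.GeneratedLoops;
    R.Categories.push_back(LoopCategory::TransformSingleLoop);
    return true;
  }
  return false;
}

Node loopNest(unsigned Depth) {
  Node N;
  N.Kind = NodeKind::LoopNest;
  N.Depth = Depth;
  return N;
}

Node transform(unsigned Nests, unsigned Loops) {
  Node N;
  N.Kind = NodeKind::Transform;
  N.GeneratedNests = Nests;
  N.GeneratedLoops = Loops;
  return N;
}

// Models: { i/j nest; { k loop }; a transformation yielding one nest of two loops }
SeqResult demoSequence() {
  Node Inner;
  Inner.Children.push_back(loopNest(1));

  Node Outer;
  Outer.Children.push_back(loopNest(2));
  Outer.Children.push_back(Inner);
  Outer.Children.push_back(transform(1, 2));

  SeqResult R;
  bool Ok = analyzeModel(Outer, R);
  assert(Ok && "every item in the model contributes at least one loop");
  (void)Ok;
  return R;
}
```

In `demoSequence`, the inner braces contribute their loop nest directly to the outer sequence, so the model reports three top-level nests and five loops in total, with only the last entry categorised as `TransformSingleLoop`.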
diff --git a/clang/lib/AST/StmtOpenMP.cpp b/clang/lib/AST/StmtOpenMP.cpp index 06c987e7f1761..e6b52792885ba 100644 --- a/clang/lib/AST/StmtOpenMP.cpp +++ b/clang/lib/AST/StmtOpenMP.cpp @@ -457,6 +457,8 @@ OMPUnrollDirective::Create(const ASTContext &C, SourceLocation StartLoc, C, Clauses, AssociatedStmt, TransformedStmtOffset + 1, StartLoc, EndLoc); Dir->setNumGeneratedLoops(NumGeneratedLoops); // The number of generated loops and loop nests during unroll matches + // given that unroll only generates top level canonical loop nests + // so each generated loop is a top level canonical loop nest Dir->setNumGeneratedLoopNests(NumGeneratedLoops); Dir->setTransformedStmt(TransformedStmt); Dir->setPreInits(PreInits); @@ -517,6 +519,17 @@ OMPFuseDirective *OMPFuseDirective::Create( NumLoops); Dir->setTransformedStmt(TransformedStmt); Dir->setPreInits(PreInits); + // The number of top level canonical nests could + // not match the total number of generated loops + // Example: + // Before fusion: + // for (int i = 0; i < N; ++i) + // for (int j = 0; j < M; ++j) + // A[i][j] = i + j; + // + // for (int k = 0; k < P; ++k) + // B[k] = k * 2; + // Here, NumLoopNests = 2, but NumLoops = 3. 
Dir->setNumGeneratedLoopNests(NumLoopNests); Dir->setNumGeneratedLoops(NumLoops); return Dir; diff --git a/clang/lib/Basic/OpenMPKinds.cpp b/clang/lib/Basic/OpenMPKinds.cpp index 18330181f1509..53a9f80e6d3b7 100644 --- a/clang/lib/Basic/OpenMPKinds.cpp +++ b/clang/lib/Basic/OpenMPKinds.cpp @@ -704,7 +704,8 @@ bool clang::isOpenMPLoopBoundSharingDirective(OpenMPDirectiveKind Kind) { bool clang::isOpenMPLoopTransformationDirective(OpenMPDirectiveKind DKind) { return DKind == OMPD_tile || DKind == OMPD_unroll || DKind == OMPD_reverse || - DKind == OMPD_interchange || DKind == OMPD_stripe || DKind == OMPD_fuse; + DKind == OMPD_interchange || DKind == OMPD_stripe || + DKind == OMPD_fuse; } bool clang::isOpenMPCombinedParallelADirective(OpenMPDirectiveKind DKind) { diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 7cb7ee20fcf6a..1671f07bc2760 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -3242,6 +3242,8 @@ LValue CodeGenFunction::EmitDeclRefLValue(const DeclRefExpr *E) { // No other cases for now. } else { + llvm::dbgs() << "THE DAMN DECLREFEXPR HASN'T BEEN ENTERED IN LOCALDECLMAP\n"; + VD->dumpColor(); llvm_unreachable("DeclRefExpr for Decl not entered in LocalDeclMap?"); } diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h index a983901f560de..ce00198c396b6 100644 --- a/clang/lib/CodeGen/CodeGenFunction.h +++ b/clang/lib/CodeGen/CodeGenFunction.h @@ -5414,6 +5414,10 @@ class CodeGenFunction : public CodeGenTypeCache { /// Set the address of a local variable. 
void setAddrOfLocalVar(const VarDecl *VD, Address Addr) { + if (LocalDeclMap.count(VD)) { + llvm::errs() << "Warning: VarDecl already exists in map: "; + VD->dumpColor(); + } assert(!LocalDeclMap.count(VD) && "Decl already exists in LocalDeclMap!"); LocalDeclMap.insert({VD, Addr}); } diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index 556b5cb43b6f8..b0529c9352c83 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -22,6 +22,7 @@ #include "clang/AST/DeclOpenMP.h" #include "clang/AST/DynamicRecursiveASTVisitor.h" #include "clang/AST/OpenMPClause.h" +#include "clang/AST/RecursiveASTVisitor.h" #include "clang/AST/StmtCXX.h" #include "clang/AST/StmtOpenMP.h" #include "clang/AST/StmtVisitor.h" @@ -47,6 +48,7 @@ #include "llvm/Frontend/OpenMP/OMPConstants.h" #include "llvm/IR/Assumptions.h" #include +#include using namespace clang; using namespace llvm::omp; @@ -14157,6 +14159,45 @@ StmtResult SemaOpenMP::ActOnOpenMPTargetTeamsDistributeSimdDirective( getASTContext(), StartLoc, EndLoc, NestedLoopCount, Clauses, AStmt, B); } +// Overloaded base case function +template +static bool tryHandleAs(T *t, F &&) { + return false; +} + +/** + * Tries to recursively cast `t` to one of the given types and invokes `f` if successful. + * + * @tparam Class The first type to check. + * @tparam Rest The remaining types to check. + * @tparam T The base type of `t`. + * @tparam F The callable type for the function to invoke upon a successful cast. + * @param t The object to be checked. + * @param f The function to invoke if `t` matches `Class`. + * @return `true` if `t` matched any type and `f` was called, otherwise `false`. + */ +template +static bool tryHandleAs(T *t, F &&f) { + if (Class *c = dyn_cast(t)) { + f(c); + return true; + } else { + return tryHandleAs(t, std::forward(f)); + } +} + +// Updates OriginalInits by checking Transform against loop transformation +// directives and appending their pre-inits if a match is found. 
+static void updatePreInits(OMPLoopBasedDirective *Transform, + SmallVectorImpl> &PreInits) { + if (!tryHandleAs( + Transform, [&PreInits](auto *Dir) { + appendFlattenedStmtList(PreInits.back(), Dir->getPreInits()); + })) + llvm_unreachable("Unhandled loop transformation"); +} + bool SemaOpenMP::checkTransformableLoopNest( OpenMPDirectiveKind Kind, Stmt *AStmt, int NumLoops, SmallVectorImpl &LoopHelpers, @@ -14187,121 +14228,106 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } -class NestedLoopCounterVisitor - : public clang::RecursiveASTVisitor { +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + public: - explicit NestedLoopCounterVisitor() : NestedLoopCount(0) {} + explicit NestedLoopCounterVisitor() {} - bool VisitForStmt(clang::ForStmt *FS) { - ++NestedLoopCount; - return true; + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; } - bool VisitCXXForRangeStmt(clang::CXXForRangeStmt *FRS) { - ++NestedLoopCount; - return true; + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; } - unsigned getNestedLoopCount() const { return NestedLoopCount; } + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; -private: - unsigned NestedLoopCount; + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. 
+ // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || + isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) + return true; + } }; -bool SemaOpenMP::checkTransformableLoopSequence( - OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, - unsigned &NumLoops, +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, SmallVectorImpl> &OriginalInits, - ASTContext &Context) { + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { - // Checks whether the given statement is a compound statement VarsWithInheritedDSAType TmpDSA; - if (!isa(AStmt)) { - Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) - << getOpenMPDirectiveName(Kind); - return false; - } - // Callback for updating pre-inits in case there are even more - // loop-sequence-generating-constructs inside of the main compound stmt - auto OnTransformationCallback = - [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) 
- DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); - }; - - // Number of top level canonical loop nests observed (And acts as index) - LoopSeqSize = 0; - // Number of total observed loops - NumLoops = 0; - - // Following OpenMP 6.0 API Specification, a Canonical Loop Sequence follows - // the grammar: - // - // canonical-loop-sequence: - // { - // loop-sequence+ - // } - // where loop-sequence can be any of the following: - // 1. canonical-loop-sequence - // 2. loop-nest - // 3. loop-sequence-generating-construct (i.e OMPLoopTransformationDirective) - // - // To recognise and traverse this structure the following helper functions - // have been defined. handleLoopSequence serves as the recurisve entry point - // and tries to match the input AST to the canonical loop sequence grammar - // structure - - // Helper functions to validate canonical loop sequence grammar is valid - auto isLoopSequenceDerivation = [](auto *Child) { - return isa(Child) || isa(Child) || - isa(Child); - }; - auto isLoopGeneratingStmt = [](auto *Child) { - return isa(Child); - }; - + QualType BaseInductionVarType; // Helper Lambda to handle storing initialization and body statements for both // ForStmt and CXXForRangeStmt and checks for any possible mismatch between // induction variables types - QualType BaseInductionVarType; auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, this, &Context](Stmt *LoopStmt) { if (auto *For = dyn_cast(LoopStmt)) { @@ -14324,33 +14350,35 @@ bool SemaOpenMP::checkTransformableLoopSequence( } } } - } else { - assert(isa(LoopStmt) && - "Expected canonical for or range-based for loops."); - auto *CXXFor = dyn_cast(LoopStmt); + auto *CXXFor = cast(LoopStmt); OriginalInits.back().push_back(CXXFor->getBeginStmt()); ForStmts.push_back(CXXFor); } }; + // Helper lambda functions to encapsulate the processing of different // 
derivations of the canonical loop sequence grammar // // Modularized code for handling loop generation and transformations - auto handleLoopGeneration = [&storeLoopStatements, &LoopHelpers, - &OriginalInits, &LoopSeqSize, &NumLoops, Kind, - &TmpDSA, &OnTransformationCallback, - this](Stmt *Child) { + auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &TransformsPreInits, + &LoopCategories, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, &ForStmts, &Context, + &LoopSequencePreInits, this](Stmt *Child) { auto LoopTransform = dyn_cast(Child); Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); - + unsigned NumGeneratedLoops = LoopTransform->getNumGeneratedLoops(); // Handle the case where transformed statement is not available due to // dependent contexts if (!TransformedStmt) { - if (NumGeneratedLoopNests > 0) + if (NumGeneratedLoopNests > 0) { + LoopSeqSize += NumGeneratedLoopNests; + NumLoops += NumGeneratedLoops; return true; - // Unroll full + } + // Unroll full (0 loops produced) else { Diag(Child->getBeginLoc(), diag::err_omp_not_for) << 0 << getOpenMPDirectiveName(Kind); @@ -14363,38 +14391,56 @@ bool SemaOpenMP::checkTransformableLoopSequence( Diag(Child->getBeginLoc(), diag::err_omp_not_for) << 0 << getOpenMPDirectiveName(Kind); return false; - // Future loop transformations that generate multiple canonical loops - } else if (NumGeneratedLoopNests > 1) { - llvm_unreachable("Multiple canonical loop generating transformations " - "like loop splitting are not yet supported"); } + // Loop transformatons such as split or loopranged fuse + else if (NumGeneratedLoopNests > 1) { + // Get the preinits related to this loop sequence generating + // loop transformation (i.e loopranged fuse, split...) 
+ LoopSequencePreInits.emplace_back(); + // These preinits differ slightly from regular inits/pre-inits related + // to single loop generating loop transformations (interchange, unroll) + // given that they are not bounded to a particular loop nest + // so they need to be treated independently + updatePreInits(LoopTransform, LoopSequencePreInits); + return analyzeLoopSequence(TransformedStmt, LoopSeqSize, NumLoops, + LoopHelpers, ForStmts, OriginalInits, + TransformsPreInits, LoopSequencePreInits, + LoopCategories, Context, Kind); + } + // Vast majority: (Tile, Unroll, Stripe, Reverse, Interchange, Fuse all) + else { + // Process the transformed loop statement + OriginalInits.emplace_back(); + TransformsPreInits.emplace_back(); + LoopHelpers.emplace_back(); + LoopCategories.push_back(OMPLoopCategory::TransformSingleLoop); + + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, TransformedStmt, SemaRef, + *DSAStack, TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(TransformedStmt->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + storeLoopStatements(TransformedStmt); + updatePreInits(LoopTransform, TransformsPreInits); - // Process the transformed loop statement - Child = TransformedStmt; - OriginalInits.emplace_back(); - LoopHelpers.emplace_back(); - OnTransformationCallback(LoopTransform); - - unsigned IsCanonical = - checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, - TmpDSA, LoopHelpers[LoopSeqSize]); - - if (!IsCanonical) { - Diag(Child->getBeginLoc(), diag::err_omp_not_canonical_loop) - << getOpenMPDirectiveName(Kind); - return false; + NumLoops += NumGeneratedLoops; + ++LoopSeqSize; + return true; } - storeLoopStatements(TransformedStmt); - NumLoops += LoopTransform->getNumGeneratedLoops(); - return true; }; // Modularized code for handling regular canonical loops - auto handleRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, - &LoopSeqSize, 
&NumLoops, Kind, &TmpDSA, - this](Stmt *Child) { + auto analyzeRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, + &LoopSeqSize, &NumLoops, Kind, &TmpDSA, + &LoopCategories, this](Stmt *Child) { OriginalInits.emplace_back(); LoopHelpers.emplace_back(); + LoopCategories.push_back(OMPLoopCategory::RegularLoop); + unsigned IsCanonical = checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, TmpDSA, LoopHelpers[LoopSeqSize]); @@ -14412,57 +14458,114 @@ bool SemaOpenMP::checkTransformableLoopSequence( return true; }; - // Helper function to process a Loop Sequence Recursively - auto handleLoopSequence = [&](Stmt *LoopSeqStmt, - auto &handleLoopSequenceCallback) -> bool { - for (auto *Child : LoopSeqStmt->children()) { - if (!Child) - continue; + // Helper functions to validate canonical loop sequence grammar is valid + auto isLoopSequenceDerivation = [](auto *Child) { + return isa(Child) || isa(Child) || + isa(Child); + }; + auto isLoopGeneratingStmt = [](auto *Child) { + return isa(Child); + }; + - // Skip over non-loop-sequence statements - if (!isLoopSequenceDerivation(Child)) { - Child = Child->IgnoreContainers(); + // High level grammar validation + for (auto *Child : LoopSeqStmt->children()) { - // Ignore empty compound statement if (!Child) - continue; + continue; - // In the case of a nested loop sequence ignoring containers would not - // be enough, a recurisve transversal of the loop sequence is required - if (isa(Child)) { - if (!handleLoopSequenceCallback(Child, handleLoopSequenceCallback)) - return false; - // Already been treated, skip this children - continue; + // Skip over non-loop-sequence statements + if (!isLoopSequenceDerivation(Child)) { + Child = Child->IgnoreContainers(); + + // Ignore empty compound statement + if (!Child) + continue; + + // In the case of a nested loop sequence ignoring containers would not + // be enough, a recurisve transversal of the loop sequence is required + if (isa(Child)) { + if 
(!analyzeLoopSequence(Child, LoopSeqSize, NumLoops, LoopHelpers, + ForStmts, OriginalInits, TransformsPreInits, + LoopSequencePreInits, LoopCategories, Context, + Kind)) + return false; + // Already been treated, skip this children + continue; + } + } + // Regular loop sequence handling + if (isLoopSequenceDerivation(Child)) { + if (isLoopGeneratingStmt(Child)) { + if (!analyzeLoopGeneration(Child)) { + return false; } + // analyzeLoopGeneration updates Loop Sequence size accordingly + + } else { + if (!analyzeRegularLoop(Child)) { + return false; + } + // Update the Loop Sequence size by one + ++LoopSeqSize; } - // Regular loop sequence handling - if (isLoopSequenceDerivation(Child)) { - if (isLoopGeneratingStmt(Child)) { - if (!handleLoopGeneration(Child)) { - return false; - } } else { - if (!handleRegularLoop(Child)) { - return false; - } + // Report error for invalid statement inside canonical loop sequence + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; } - ++LoopSeqSize; - } else { - // Report error for invalid statement inside canonical loop sequence - Diag(Child->getBeginLoc(), diag::err_omp_not_for) - << 0 << getOpenMPDirectiveName(Kind); + } + return true; +} + +bool SemaOpenMP::checkTransformableLoopSequence( + OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, + unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context) { + + // Checks whether the given statement is a compound statement + if (!isa(AStmt)) { + Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) + << getOpenMPDirectiveName(Kind); return false; - } - } - return true; - }; + } + // Number of top level canonical loop nests observed (And acts as index) + LoopSeqSize = 0; + // Number of total observed loops + NumLoops = 
0; + + // Following OpenMP 6.0 API Specification, a Canonical Loop Sequence follows + // the grammar: + // + // canonical-loop-sequence: + // { + // loop-sequence+ + // } + // where loop-sequence can be any of the following: + // 1. canonical-loop-sequence + // 2. loop-nest + // 3. loop-sequence-generating-construct (i.e OMPLoopTransformationDirective) + // + // To recognise and traverse this structure the following helper functions + // have been defined. analyzeLoopSequence serves as the recurisve entry point + // and tries to match the input AST to the canonical loop sequence grammar + // structure. This function will perform both a semantic and syntactical + // analysis of the given statement according to OpenMP 6.0 definition of + // the aforementioned canonical loop sequence // Recursive entry point to process the main loop sequence - if (!handleLoopSequence(AStmt, handleLoopSequence)) { - return false; + if (!analyzeLoopSequence(AStmt, LoopSeqSize, NumLoops, LoopHelpers, ForStmts, + OriginalInits, TransformsPreInits, + LoopSequencePreInits, LoopCategories, Context, + Kind)) { + return false; } - if (LoopSeqSize <= 0) { Diag(AStmt->getBeginLoc(), diag::err_omp_empty_loop_sequence) << getOpenMPDirectiveName(Kind); @@ -14494,9 +14597,7 @@ static void addLoopPreInits(ASTContext &Context, RangeEnd->getBeginLoc(), RangeEnd->getEndLoc())); } - llvm::append_range(PreInits, OriginalInit); - // List of OMPCapturedExprDecl, for __begin, __end, and NumIterations if (auto *PI = cast_or_null(LoopHelper.PreInits)) { PreInits.push_back(new (Context) DeclStmt( @@ -15177,7 +15278,7 @@ StmtResult SemaOpenMP::ActOnOpenMPUnrollDirective(ArrayRef Clauses, Stmt *LoopStmt = nullptr; collectLoopStmts(AStmt, {LoopStmt}); - // Determine the PreInit declarations. 
+ // Determine the PreInit declarations.e SmallVector PreInits; addLoopPreInits(Context, LoopHelper, LoopStmt, OriginalInits[0], PreInits); @@ -15744,28 +15845,35 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, if (!AStmt) { return StmtError(); } + + unsigned NumLoops = 1; + unsigned LoopSeqSize = 1; + + // Defer transformation in dependent contexts + // The NumLoopNests argument is set to a placeholder 1 (even though + // using looprange fuse could yield up to 3 top level loop nests) + // because a dependent context could prevent determining its true value + if (CurrContext->isDependentContext()) { + return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, + NumLoops, LoopSeqSize, AStmt, nullptr, + nullptr); + } + // Validate that the potential loop sequence is transformable for fusion // Also collect the HelperExprs, Loop Stmts, Inits, and Number of loops SmallVector LoopHelpers; SmallVector LoopStmts; SmallVector> OriginalInits; - - unsigned NumLoops; - unsigned LoopSeqSize; + SmallVector> TransformsPreInits; + SmallVector> LoopSequencePreInits; + SmallVector LoopCategories; if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, LoopHelpers, LoopStmts, OriginalInits, - Context)) { + TransformsPreInits, LoopSequencePreInits, + LoopCategories, Context)) { return StmtError(); } - // Defer transformation in dependent contexts - // The NumLoopNests argument is set to a placeholder (0) - // because a dependent context could prevent determining its true value - if (CurrContext->isDependentContext()) { - return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, - NumLoops, 0, AStmt, nullptr, nullptr); - } - // Handle clauses, which can be any of the following: [looprange, apply] const OMPLoopRangeClause *LRC = OMPExecutableDirective::getSingleClause(Clauses); @@ -15827,11 +15935,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, "Expecting loop iteration space dimensionality to match 
number of " "affected loops"); - // PreInits hold a sequence of variable declarations that must be executed - // before the fused loop begins. These include bounds, strides, and other - // helper variables required for the transformation. - SmallVector PreInits; - // Select the type with the largest bit width among all induction variables QualType IVType = LoopHelpers[FirstVal - 1].IterationVarRef->getType(); for (unsigned int I = FirstVal; I < LastVal; ++I) { @@ -15843,7 +15946,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, uint64_t IVBitWidth = Context.getIntWidth(IVType); // Create pre-init declarations for all loops lower bounds, upper bounds, - // strides and num-iterations + // strides and num-iterations for every top level loop in the fusion SmallVector LBVarDecls; SmallVector STVarDecls; SmallVector NIVarDecls; @@ -15881,12 +15984,62 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, return std::make_pair(VD, DeclStmt); }; + // PreInits hold a sequence of variable declarations that must be executed + // before the fused loop begins. These include bounds, strides, and other + // helper variables required for the transformation. Other loop transforms + // also contain their own preinits + SmallVector PreInits; + // Iterator to keep track of loop transformations + unsigned int TransformIndex = 0; + + // Update the general preinits using the preinits generated by loop sequence + // generating loop transformations. These preinits differ slightly from + // single-loop transformation preinits, as they can be detached from a + // specific loop inside the multiple generated loop nests. This happens + // because certain helper variables, like '.omp.fuse.max', are introduced to + // handle fused iteration spaces and may not be directly tied to a single + // original loop. the preinit structure must ensure that hidden variables + // like '.omp.fuse.max' are still properly handled. 
+ // Transformations that apply this concept: Loopranged Fuse, Split + if (!LoopSequencePreInits.empty()) { + for (const auto &LTPreInits : LoopSequencePreInits) { + if (!LTPreInits.empty()) { + llvm::append_range(PreInits, LTPreInits); + } + } + } + // Process each single loop to generate and collect declarations - // and statements for all helper expressions + // and statements for all helper expressions related to + // particular single loop nests + + // Also In the case of the fused loops, we keep track of their original + // inits by appending them to their preinits statement, and in the case of + // transformations, also append their preinits (which contain the original + // loop initialization statement or other statements) + + // Firstly we need to update TransformIndex to match the begining of the + // looprange section + for (unsigned int I = 0; I < FirstVal - 1; ++I) { + if (LoopCategories[I] == OMPLoopCategory::TransformSingleLoop) + ++TransformIndex; + } for (unsigned int I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { - addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], - PreInits); + if (LoopCategories[I] == OMPLoopCategory::RegularLoop) { + addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], + PreInits); + } else if (LoopCategories[I] == OMPLoopCategory::TransformSingleLoop) { + // For transformed loops, insert both pre-inits and original inits. + // Order matters: pre-inits may define variables used in the original + // inits such as upper bounds... 
+ auto TransformPreInit = TransformsPreInits[TransformIndex++]; + if (!TransformPreInit.empty()) { + llvm::append_range(PreInits, TransformPreInit); + } + addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], + PreInits); + } auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", J); auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", J); auto [STVD, STDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", J); @@ -15905,7 +16058,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, NIVarDecls.push_back(NIVD); IVVarDecls.push_back(IVVD); - PreInits.push_back(UBDStmt.get()); PreInits.push_back(LBDStmt.get()); PreInits.push_back(STDStmt.get()); PreInits.push_back(NIDStmt.get()); @@ -16081,6 +16233,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, BodyStmts.push_back(IdxExpr.get()); llvm::append_range(BodyStmts, LoopHelpers[I].Updates); + // If the loop is a CXXForRangeStmt then the iterator variable is needed if (auto *SourceCXXFor = dyn_cast(LoopStmts[I])) BodyStmts.push_back(SourceCXXFor->getLoopVarStmt()); @@ -16115,21 +16268,50 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, FusedBody, InitStmt.get()->getBeginLoc(), SourceLocation(), IncrExpr.get()->getEndLoc()); - // In the case of looprange, the result of fuse won't simply - // be a single loop (ForStmt), but rather a loop sequence - // (CompoundStmt) of 3 parts: the pre-fusion loops, the fused loop - // and the post-fusion loops, preserving its original order. + // In the case of looprange, the result of fuse won't simply + // be a single loop (ForStmt), but rather a loop sequence + // (CompoundStmt) of 3 parts: the pre-fusion loops, the fused loop + // and the post-fusion loops, preserving its original order. 
+ // + // Note: If looprange clause produces a single fused loop nest then + // this compound statement wrapper is unnecessary (Therefore this + // treatment is skipped) + Stmt *FusionStmt = FusedForStmt; - if (LRC) { + if (LRC && CountVal != LoopSeqSize) { SmallVector FinalLoops; - // Gather all the pre-fusion loops - for (unsigned I = 0; I < FirstVal - 1; ++I) - FinalLoops.push_back(LoopStmts[I]); - // Gather the fused loop - FinalLoops.push_back(FusedForStmt); - // Gather all the post-fusion loops - for (unsigned I = FirstVal + CountVal - 1; I < LoopSeqSize; ++I) + // Reset the transform index + TransformIndex = 0; + + // Collect all non-fused loops before and after the fused region. + // Pre-fusion and post-fusion loops are inserted in order exploiting their + // symmetry, along with their corresponding transformation pre-inits if + // needed. The fused loop is added between the two regions. + for (unsigned I = 0; I < LoopSeqSize; ++I) { + if (I >= FirstVal - 1 && I < FirstVal + CountVal - 1) { + // Update the Transformation counter to skip already treated + // loop transformations + if (LoopCategories[I] != OMPLoopCategory::TransformSingleLoop) + ++TransformIndex; + continue; + } + + // No need to handle: + // Regular loops: they are kept intact as-is. + // Loop-sequence-generating transformations: already handled earlier. 
+ // Only TransformSingleLoop requires inserting pre-inits here + + if (LoopCategories[I] == OMPLoopCategory::TransformSingleLoop) { + auto TransformPreInit = TransformsPreInits[TransformIndex++]; + if (!TransformPreInit.empty()) { + llvm::append_range(PreInits, TransformPreInit); + } + } + FinalLoops.push_back(LoopStmts[I]); + } + + FinalLoops.insert(FinalLoops.begin() + (FirstVal - 1), FusedForStmt); FusionStmt = CompoundStmt::Create(Context, FinalLoops, FPOptionsOverride(), SourceLocation(), SourceLocation()); } diff --git a/clang/test/OpenMP/fuse_ast_print.cpp b/clang/test/OpenMP/fuse_ast_print.cpp index ac4f0d38a9c68..9d85bd1172948 100644 --- a/clang/test/OpenMP/fuse_ast_print.cpp +++ b/clang/test/OpenMP/fuse_ast_print.cpp @@ -338,6 +338,61 @@ void tfoo9() { foo9<1, 2>(); } +// PRINT-LABEL: void foo10( +// DUMP-LABEL: FunctionDecl {{.*}} foo10 +void foo10() { + // PRINT: #pragma omp fuse looprange(2,2) + // DUMP: OMPFuseDirective + // DUMP: OMPLooprangeClause + #pragma omp fuse looprange(2,2) + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int ii = 0; ii < 10; ii += 2) + // DUMP: ForStmt + for (int ii = 0; ii < 10; ii += 2) + // PRINT: body(ii) + // DUMP: CallExpr + body(ii); + // PRINT: #pragma omp fuse looprange(2,2) + // DUMP: OMPFuseDirective + // DUMP: OMPLooprangeClause + #pragma omp fuse looprange(2,2) + { + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + // PRINT: for (int jj = 10; jj > 0; --jj) + // DUMP: ForStmt + for (int jj = 10; jj > 0; --jj) + // PRINT: body(jj) + // DUMP: CallExpr + body(jj); + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + // PRINT: for (int kk = 0; kk <= 10; ++kk) + // DUMP: ForStmt 
+ for (int kk = 0; kk <= 10; ++kk) + // PRINT: body(kk) + // DUMP: CallExpr + body(kk); + } + } + +} diff --git a/clang/test/OpenMP/fuse_codegen.cpp b/clang/test/OpenMP/fuse_codegen.cpp index d9500bed3ce31..742c280ed0172 100644 --- a/clang/test/OpenMP/fuse_codegen.cpp +++ b/clang/test/OpenMP/fuse_codegen.cpp @@ -65,6 +65,23 @@ extern "C" void foo4() { } } +// This exemplifies the usage of loop transformations that generate +// more than top level canonical loop nests (e.g split, loopranged fuse...) +extern "C" void foo5() { + double arr[256]; + #pragma omp fuse looprange(2,2) + { + #pragma omp fuse looprange(2,2) + { + for(int i = 0; i < 128; ++i) body(i); + for(int j = 0; j < 256; j+=2) body(j); + for(int k = 0; k < 512; ++k) body(k); + } + for(int c = 42; auto &&v: arr) body(c,v); + for(int cc = 37; auto &&vv: arr) body(cc, vv); + } +} + #endif // CHECK1-LABEL: define dso_local void @body( @@ -88,7 +105,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -97,7 +113,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -129,107 +144,103 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK1-NEXT: 
store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = 
load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: 
[[CMP:%.*]] = icmp ugt i32 [[TMP19]], [[TMP20]] // CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] // CHECK1: [[COND_TRUE]]: -// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 // CHECK1-NEXT: br label %[[COND_END:.*]] // CHECK1: [[COND_FALSE]]: -// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 // CHECK1-NEXT: br label %[[COND_END]] // CHECK1: [[COND_END]]: -// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP21]], %[[COND_TRUE]] ], [ [[TMP22]], %[[COND_FALSE]] ] // CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK1-NEXT: br label %[[FOR_COND:.*]] // CHECK1: [[FOR_COND]]: -// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 -// CHECK1-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP23]], [[TMP24]] // CHECK1-NEXT: br i1 [[CMP16]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK1: [[FOR_BODY]]: -// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] // CHECK1-NEXT: br i1 [[CMP17]], label 
%[[IF_THEN:.*]], label %[[IF_END:.*]] // CHECK1: [[IF_THEN]]: -// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP30]], [[TMP31]] -// CHECK1-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP28]], [[TMP29]] +// CHECK1-NEXT: [[ADD18:%.*]] = add i32 [[TMP27]], [[MUL]] // CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 -// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 -// CHECK1-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] -// CHECK1-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[MUL19:%.*]] = mul i32 [[TMP31]], [[TMP32]] +// CHECK1-NEXT: [[ADD20:%.*]] = add i32 [[TMP30]], [[MUL19]] // CHECK1-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 -// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP35]]) +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP33]]) // CHECK1-NEXT: br label %[[IF_END]] // CHECK1: [[IF_END]]: -// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP34]], [[TMP35]] // CHECK1-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] // CHECK1: [[IF_THEN22]]: -// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] -// CHECK1-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL23:%.*]] = mul i32 [[TMP37]], [[TMP38]] +// CHECK1-NEXT: [[ADD24:%.*]] = add i32 [[TMP36]], [[MUL23]] // CHECK1-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] -// CHECK1-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[MUL25:%.*]] = mul i32 
[[TMP40]], [[TMP41]] +// CHECK1-NEXT: [[ADD26:%.*]] = add i32 [[TMP39]], [[MUL25]] // CHECK1-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP44]]) +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP42]]) // CHECK1-NEXT: br label %[[IF_END27]] // CHECK1: [[IF_END27]]: // CHECK1-NEXT: br label %[[FOR_INC:.*]] // CHECK1: [[FOR_INC]]: -// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP43]], 1 // CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] // CHECK1: [[FOR_END]]: @@ -256,7 +267,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -265,7 +275,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -274,7 +283,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 // CHECK1-NEXT: 
[[DOTNEW_STEP21:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 @@ -304,172 +312,166 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK1-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: 
[[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// 
CHECK1-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP18]], [[TMP19]] +// CHECK1-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 // CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 // CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] -// CHECK1-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 -// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK1-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] // CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] // CHECK1-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 -// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK1-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP24]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[SUB23:%.*]] = sub i32 [[TMP25]], [[TMP26]] // CHECK1-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 -// CHECK1-NEXT: 
[[TMP29:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK1-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] -// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK1-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP27]] +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP28]] // CHECK1-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 // CHECK1-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK1-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 -// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK1-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: [[ADD28:%.*]] = add i32 [[TMP29]], 1 // CHECK1-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 -// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP30]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP31]], [[TMP32]] // CHECK1-NEXT: br i1 [[CMP]], label 
%[[COND_TRUE:.*]], label %[[COND_FALSE:.*]]
 // CHECK1: [[COND_TRUE]]:
-// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4
+// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4
 // CHECK1-NEXT: br label %[[COND_END:.*]]
 // CHECK1: [[COND_FALSE]]:
-// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4
+// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4
 // CHECK1-NEXT: br label %[[COND_END]]
 // CHECK1: [[COND_END]]:
-// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ]
+// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP33]], %[[COND_TRUE]] ], [ [[TMP34]], %[[COND_FALSE]] ]
 // CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4
-// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4
-// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4
-// CHECK1-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]]
+// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4
+// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4
+// CHECK1-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP35]], [[TMP36]]
 // CHECK1-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], label %[[COND_FALSE31:.*]]
 // CHECK1: [[COND_TRUE30]]:
-// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4
+// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4
 // CHECK1-NEXT: br label %[[COND_END32:.*]]
 // CHECK1: [[COND_FALSE31]]:
-// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4
+// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4
 // CHECK1-NEXT: br label %[[COND_END32]]
 // CHECK1: [[COND_END32]]:
-// CHECK1-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ]
+// CHECK1-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP37]], %[[COND_TRUE30]] ], [ [[TMP38]], %[[COND_FALSE31]] ]
 // CHECK1-NEXT: store i32 [[COND33]], ptr [[DOTOMP_FUSE_MAX]], align 4
 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4
 // CHECK1-NEXT: br label %[[FOR_COND:.*]]
 // CHECK1: [[FOR_COND]]:
-// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4
-// CHECK1-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]]
+// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4
+// CHECK1-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP39]], [[TMP40]]
 // CHECK1-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]]
 // CHECK1: [[FOR_BODY]]:
-// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4
-// CHECK1-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]]
+// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4
+// CHECK1-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP41]], [[TMP42]]
 // CHECK1-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]]
 // CHECK1: [[IF_THEN]]:
-// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4
-// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4
-// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]]
-// CHECK1-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]]
+// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4
+// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4
+// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP44]], [[TMP45]]
+// CHECK1-NEXT: [[ADD36:%.*]] = add i32 [[TMP43]], [[MUL]]
 // CHECK1-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4
-// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4
-// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4
-// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4
-// CHECK1-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]]
-// CHECK1-NEXT: [[ADD38:%.*]] = add i32 [[TMP49]], [[MUL37]]
+// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4
+// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4
+// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4
+// CHECK1-NEXT: [[MUL37:%.*]] = mul i32 [[TMP47]], [[TMP48]]
+// CHECK1-NEXT: [[ADD38:%.*]] = add i32 [[TMP46]], [[MUL37]]
 // CHECK1-NEXT: store i32 [[ADD38]], ptr [[I]], align 4
-// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4
-// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP52]])
+// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[I]], align 4
+// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP49]])
 // CHECK1-NEXT: br label %[[IF_END]]
 // CHECK1: [[IF_END]]:
-// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4
-// CHECK1-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]]
+// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4
+// CHECK1-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP50]], [[TMP51]]
 // CHECK1-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]]
 // CHECK1: [[IF_THEN40]]:
-// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4
-// CHECK1-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4
-// CHECK1-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]]
-// CHECK1-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]]
+// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4
+// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4
+// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[MUL41:%.*]] = mul i32 [[TMP53]], [[TMP54]]
+// CHECK1-NEXT: [[ADD42:%.*]] = add i32 [[TMP52]], [[MUL41]]
 // CHECK1-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4
-// CHECK1-NEXT: [[TMP58:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4
-// CHECK1-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4
-// CHECK1-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4
-// CHECK1-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]]
-// CHECK1-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]]
+// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4
+// CHECK1-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4
+// CHECK1-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4
+// CHECK1-NEXT: [[MUL43:%.*]] = mul i32 [[TMP56]], [[TMP57]]
+// CHECK1-NEXT: [[SUB44:%.*]] = sub i32 [[TMP55]], [[MUL43]]
 // CHECK1-NEXT: store i32 [[SUB44]], ptr [[J]], align 4
-// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4
-// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP61]])
+// CHECK1-NEXT: [[TMP58:%.*]] = load i32, ptr [[J]], align 4
+// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP58]])
 // CHECK1-NEXT: br label %[[IF_END45]]
 // CHECK1: [[IF_END45]]:
-// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4
-// CHECK1-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]]
+// CHECK1-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4
+// CHECK1-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP59]], [[TMP60]]
 // CHECK1-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]]
 // CHECK1: [[IF_THEN47]]:
-// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4
-// CHECK1-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4
-// CHECK1-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]]
-// CHECK1-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]]
+// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4
+// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4
+// CHECK1-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[MUL48:%.*]] = mul i32 [[TMP62]], [[TMP63]]
+// CHECK1-NEXT: [[ADD49:%.*]] = add i32 [[TMP61]], [[MUL48]]
 // CHECK1-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4
-// CHECK1-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4
-// CHECK1-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4
-// CHECK1-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4
-// CHECK1-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]]
-// CHECK1-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]]
+// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4
+// CHECK1-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4
+// CHECK1-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4
+// CHECK1-NEXT: [[MUL50:%.*]] = mul i32 [[TMP65]], [[TMP66]]
+// CHECK1-NEXT: [[ADD51:%.*]] = add i32 [[TMP64]], [[MUL50]]
 // CHECK1-NEXT: store i32 [[ADD51]], ptr [[K]], align 4
-// CHECK1-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4
-// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP70]])
+// CHECK1-NEXT: [[TMP67:%.*]] = load i32, ptr [[K]], align 4
+// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP67]])
 // CHECK1-NEXT: br label %[[IF_END52]]
 // CHECK1: [[IF_END52]]:
 // CHECK1-NEXT: br label %[[FOR_INC:.*]]
 // CHECK1: [[FOR_INC]]:
-// CHECK1-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1
+// CHECK1-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP68]], 1
 // CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4
 // CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]]
 // CHECK1: [[FOR_END]]:
@@ -481,13 +483,11 @@ extern "C" void foo4() {
 // CHECK1-NEXT: [[ENTRY:.*:]]
 // CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16
 // CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4
-// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4
-// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4
@@ -497,48 +497,43 @@ extern "C" void foo4() {
 // CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4
-// CHECK1-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4
-// CHECK1-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4
-// CHECK1-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4
-// CHECK1-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8
-// CHECK1-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4
+// CHECK1-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4
+// CHECK1-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4
+// CHECK1-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8
+// CHECK1-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[C:%.*]] = alloca i32, align 4
 // CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8
 // CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8
 // CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8
-// CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8
-// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8
-// CHECK1-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8
-// CHECK1-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8
-// CHECK1-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8
-// CHECK1-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8
-// CHECK1-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8
-// CHECK1-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8
+// CHECK1-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8
+// CHECK1-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8
+// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8
+// CHECK1-NEXT: [[DOTOMP_LB116:%.*]] = alloca i64, align 8
+// CHECK1-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8
+// CHECK1-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8
+// CHECK1-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8
 // CHECK1-NEXT: [[CC:%.*]] = alloca i32, align 4
-// CHECK1-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8
-// CHECK1-NEXT: [[__END224:%.*]] = alloca ptr, align 8
-// CHECK1-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8
+// CHECK1-NEXT: [[__RANGE221:%.*]] = alloca ptr, align 8
+// CHECK1-NEXT: [[__END222:%.*]] = alloca ptr, align 8
+// CHECK1-NEXT: [[__BEGIN225:%.*]] = alloca ptr, align 8
+// CHECK1-NEXT: [[DOTCAPTURE_EXPR_27:%.*]] = alloca ptr, align 8
 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8
-// CHECK1-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8
-// CHECK1-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8
-// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8
+// CHECK1-NEXT: [[DOTCAPTURE_EXPR_30:%.*]] = alloca i64, align 8
 // CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8
 // CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8
 // CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8
 // CHECK1-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8
-// CHECK1-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8
+// CHECK1-NEXT: [[DOTOMP_TEMP_140:%.*]] = alloca i64, align 8
 // CHECK1-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8
-// CHECK1-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8
-// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8
+// CHECK1-NEXT: [[DOTOMP_FUSE_MAX46:%.*]] = alloca i64, align 8
+// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX52:%.*]] = alloca i64, align 8
 // CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8
 // CHECK1-NEXT: [[VV:%.*]] = alloca ptr, align 8
 // CHECK1-NEXT: store i32 0, ptr [[I]], align 4
-// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4
 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4
 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4
 // CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4
 // CHECK1-NEXT: store i32 0, ptr [[J]], align 4
-// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4
 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4
 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4
 // CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4
@@ -565,225 +560,219 @@ extern "C" void foo4() {
 // CHECK1-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1
 // CHECK1-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1
 // CHECK1-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4
+// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB03]], align 4
+// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4
 // CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4
-// CHECK1-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4
-// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4
-// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4
-// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4
-// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1
+// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1
 // CHECK1-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64
-// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8
+// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8
 // CHECK1-NEXT: store i32 42, ptr [[C]], align 4
 // CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8
-// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8
-// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0
+// CHECK1-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8
+// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0
 // CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256
 // CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8
+// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8
+// CHECK1-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0
+// CHECK1-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8
 // CHECK1-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8
-// CHECK1-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0
-// CHECK1-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8
-// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8
-// CHECK1-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0
-// CHECK1-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8
-// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8
-// CHECK1-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8
-// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8
-// CHECK1-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8
-// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64
-// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64
+// CHECK1-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0
+// CHECK1-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8
+// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__END2]], align 8
+// CHECK1-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8
+// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8
+// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8
+// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64
+// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64
 // CHECK1-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]]
 // CHECK1-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8
-// CHECK1-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1
-// CHECK1-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1
-// CHECK1-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1
-// CHECK1-NEXT: [[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1
-// CHECK1-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8
-// CHECK1-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8
-// CHECK1-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8
-// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8
-// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8
-// CHECK1-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8
-// CHECK1-NEXT: [[ADD21:%.*]] = add nsw i64 [[TMP16]], 1
-// CHECK1-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8
+// CHECK1-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1
+// CHECK1-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1
+// CHECK1-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1
+// CHECK1-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1
+// CHECK1-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8
+// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8
+// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8
+// CHECK1-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8
+// CHECK1-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1
+// CHECK1-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8
 // CHECK1-NEXT: store i32 37, ptr [[CC]], align 4
-// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8
-// CHECK1-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8
-// CHECK1-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0
-// CHECK1-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256
-// CHECK1-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8
-// CHECK1-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8
-// CHECK1-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0
-// CHECK1-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8
-// CHECK1-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8
-// CHECK1-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0
-// CHECK1-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8
-// CHECK1-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8
-// CHECK1-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8
-// CHECK1-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8
-// CHECK1-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8
-// CHECK1-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64
-// CHECK1-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr [[TMP22]] to i64
-// CHECK1-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]]
-// CHECK1-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8
-// CHECK1-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1
-// CHECK1-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1
-// CHECK1-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1
-// CHECK1-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1
-// CHECK1-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8
-// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8
-// CHECK1-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8
+// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE221]], align 8
+// CHECK1-NEXT: [[TMP15:%.*]] = load ptr, ptr [[__RANGE221]], align 8
+// CHECK1-NEXT: [[ARRAYDECAY23:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP15]], i64 0, i64 0
+// CHECK1-NEXT: [[ADD_PTR24:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY23]], i64 256
+// CHECK1-NEXT: store ptr [[ADD_PTR24]], ptr [[__END222]], align 8
+// CHECK1-NEXT: [[TMP16:%.*]] = load ptr, ptr [[__RANGE221]], align 8
+// CHECK1-NEXT: [[ARRAYDECAY26:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP16]], i64 0, i64 0
+// CHECK1-NEXT: store ptr [[ARRAYDECAY26]], ptr [[__BEGIN225]], align 8
+// CHECK1-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE221]], align 8
+// CHECK1-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0
+// CHECK1-NEXT: store ptr [[ARRAYDECAY28]], ptr [[DOTCAPTURE_EXPR_27]], align 8
+// CHECK1-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__END222]], align 8
+// CHECK1-NEXT: store ptr [[TMP18]], ptr [[DOTCAPTURE_EXPR_29]], align 8
+// CHECK1-NEXT: [[TMP19:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8
+// CHECK1-NEXT: [[TMP20:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8
+// CHECK1-NEXT: [[SUB_PTR_LHS_CAST31:%.*]] = ptrtoint ptr [[TMP19]] to i64
+// CHECK1-NEXT: [[SUB_PTR_RHS_CAST32:%.*]] = ptrtoint ptr [[TMP20]] to i64
+// CHECK1-NEXT: [[SUB_PTR_SUB33:%.*]] = sub i64 [[SUB_PTR_LHS_CAST31]], [[SUB_PTR_RHS_CAST32]]
+// CHECK1-NEXT: [[SUB_PTR_DIV34:%.*]] = sdiv exact i64 [[SUB_PTR_SUB33]], 8
+// CHECK1-NEXT: [[SUB35:%.*]] = sub nsw i64 [[SUB_PTR_DIV34]], 1
+// CHECK1-NEXT: [[ADD36:%.*]] = add nsw i64 [[SUB35]], 1
+// CHECK1-NEXT: [[DIV37:%.*]] = sdiv i64 [[ADD36]], 1
+// CHECK1-NEXT: [[SUB38:%.*]] = sub nsw i64 [[DIV37]], 1
+// CHECK1-NEXT: store i64 [[SUB38]], ptr [[DOTCAPTURE_EXPR_30]], align 8
 // CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8
 // CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8
-// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8
-// CHECK1-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1
-// CHECK1-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8
-// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8
-// CHECK1-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8
-// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8
-// CHECK1-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8
-// CHECK1-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]]
-// CHECK1-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]]
-// CHECK1: [[COND_TRUE44]]:
-// CHECK1-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8
-// CHECK1-NEXT: br label %[[COND_END46:.*]]
-// CHECK1: [[COND_FALSE45]]:
-// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8
-// CHECK1-NEXT: br label %[[COND_END46]]
-// CHECK1: [[COND_END46]]:
-// CHECK1-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ]
-// CHECK1-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8
-// CHECK1-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8
-// CHECK1-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8
-// CHECK1-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]]
-// CHECK1-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]]
-// CHECK1: [[COND_TRUE50]]:
-// CHECK1-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8
-// CHECK1-NEXT: br label %[[COND_END52:.*]]
-// CHECK1: [[COND_FALSE51]]:
-// CHECK1-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8
-// CHECK1-NEXT: br label %[[COND_END52]]
-// CHECK1: [[COND_END52]]:
-// CHECK1-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ]
-// CHECK1-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8
-// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8
+// CHECK1-NEXT: [[TMP21:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_30]], align 8
+// CHECK1-NEXT: [[ADD39:%.*]] = add nsw i64 [[TMP21]], 1
+// CHECK1-NEXT: store i64 [[ADD39]], ptr [[DOTOMP_NI2]], align 8
+// CHECK1-NEXT: [[TMP22:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8
+// CHECK1-NEXT: store i64 [[TMP22]], ptr [[DOTOMP_TEMP_140]], align 8
+// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8
+// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8
+// CHECK1-NEXT: [[CMP41:%.*]] = icmp sgt i64 [[TMP23]], [[TMP24]]
+// CHECK1-NEXT: br i1 [[CMP41]], label %[[COND_TRUE42:.*]], label %[[COND_FALSE43:.*]]
+// CHECK1: [[COND_TRUE42]]:
+// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8
+// CHECK1-NEXT: br label %[[COND_END44:.*]]
+// CHECK1: [[COND_FALSE43]]:
+// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8
+// CHECK1-NEXT: br label %[[COND_END44]]
+// CHECK1: [[COND_END44]]:
+// CHECK1-NEXT: [[COND45:%.*]] = phi i64 [ [[TMP25]], %[[COND_TRUE42]] ], [ [[TMP26]], %[[COND_FALSE43]] ]
+// CHECK1-NEXT: store i64 [[COND45]], ptr [[DOTOMP_TEMP_2]], align 8
+// CHECK1-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8
+// CHECK1-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8
+// CHECK1-NEXT: [[CMP47:%.*]] = icmp sgt i64 [[TMP27]], [[TMP28]]
+// CHECK1-NEXT: br i1 [[CMP47]], label %[[COND_TRUE48:.*]], label %[[COND_FALSE49:.*]]
+// CHECK1: [[COND_TRUE48]]:
+// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8
+// CHECK1-NEXT: br label %[[COND_END50:.*]]
+// CHECK1: [[COND_FALSE49]]:
+// CHECK1-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8
+// CHECK1-NEXT: br label %[[COND_END50]]
+// CHECK1: [[COND_END50]]:
+// CHECK1-NEXT: [[COND51:%.*]] = phi i64 [ [[TMP29]], %[[COND_TRUE48]] ], [ [[TMP30]], %[[COND_FALSE49]] ]
+// CHECK1-NEXT: store i64 [[COND51]], ptr [[DOTOMP_FUSE_MAX46]], align 8
+// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX52]], align 8
 // CHECK1-NEXT: br label %[[FOR_COND:.*]]
 // CHECK1: [[FOR_COND]]:
-// CHECK1-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8
-// CHECK1-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8
-// CHECK1-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]]
-// CHECK1-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]]
+// CHECK1-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8
+// CHECK1-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX46]], align 8
+// CHECK1-NEXT: [[CMP53:%.*]] = icmp slt i64 [[TMP31]], [[TMP32]]
+// CHECK1-NEXT: br i1 [[CMP53]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]]
 // CHECK1: [[FOR_BODY]]:
-// CHECK1-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8
-// CHECK1-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8
-// CHECK1-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]]
-// CHECK1-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]]
+// CHECK1-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8
+// CHECK1-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8
+// CHECK1-NEXT: [[CMP54:%.*]] = icmp slt i64 [[TMP33]], [[TMP34]]
+// CHECK1-NEXT: br i1 [[CMP54]], label %[[IF_THEN:.*]], label %[[IF_END74:.*]]
 // CHECK1: [[IF_THEN]]:
-// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4
-// CHECK1-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64
-// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4
-// CHECK1-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64
-// CHECK1-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8
-// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]]
-// CHECK1-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]]
-// CHECK1-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32
-// CHECK1-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4
-// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4
-// CHECK1-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1
-// CHECK1-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]]
-// CHECK1-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4
-// CHECK1-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]]
-// CHECK1-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]]
-// CHECK1: [[IF_THEN64]]:
-// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4
-// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4
-// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]]
-// CHECK1-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]]
-// CHECK1-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4
-// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4
-// CHECK1-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1
-// CHECK1-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]]
-// CHECK1-NEXT: store i32 [[ADD68]], ptr [[I]], align 4
-// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[I]], align 4
-// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP48]])
+// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4
+// CHECK1-NEXT: [[CONV55:%.*]] = sext i32 [[TMP35]] to i64
+// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_ST04]], align 4
+// CHECK1-NEXT: [[CONV56:%.*]] = sext i32 [[TMP36]] to i64
+// CHECK1-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8
+// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV56]], [[TMP37]]
+// CHECK1-NEXT: [[ADD57:%.*]] = add nsw i64 [[CONV55]], [[MUL]]
+// CHECK1-NEXT: [[CONV58:%.*]] = trunc i64 [[ADD57]] to i32
+// CHECK1-NEXT: store i32 [[CONV58]], ptr [[DOTOMP_IV06]], align 4
+// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4
+// CHECK1-NEXT: [[MUL59:%.*]] = mul nsw i32 [[TMP38]], 1
+// CHECK1-NEXT: [[ADD60:%.*]] = add nsw i32 0, [[MUL59]]
+// CHECK1-NEXT: store i32 [[ADD60]], ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4
+// CHECK1-NEXT: [[CMP61:%.*]] = icmp slt i32 [[TMP39]], [[TMP40]]
+// CHECK1-NEXT: br i1 [[CMP61]], label %[[IF_THEN62:.*]], label %[[IF_END:.*]]
+// CHECK1: [[IF_THEN62]]:
+// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4
+// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4
+// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[MUL63:%.*]] = mul nsw i32 [[TMP42]], [[TMP43]]
+// CHECK1-NEXT: [[ADD64:%.*]] = add nsw i32 [[TMP41]], [[MUL63]]
+// CHECK1-NEXT: store i32 [[ADD64]], ptr [[DOTOMP_IV0]], align 4
+// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4
+// CHECK1-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP44]], 1
+// CHECK1-NEXT: [[ADD66:%.*]] = add nsw i32 0, [[MUL65]]
+// CHECK1-NEXT: store i32 [[ADD66]], ptr [[I]], align 4
+// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[I]], align 4
+// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP45]])
 // CHECK1-NEXT: br label %[[IF_END]]
 // CHECK1: [[IF_END]]:
-// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4
-// CHECK1-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]]
-// CHECK1-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]]
-// CHECK1: [[IF_THEN70]]:
-// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4
-// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4
-// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK1-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]]
-// CHECK1-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]]
-// CHECK1-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4
-// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4
-// CHECK1-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2
-// CHECK1-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]]
-// CHECK1-NEXT: store i32 [[ADD74]], ptr [[J]], align 4
-// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4
-// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP55]])
-// CHECK1-NEXT: br label %[[IF_END75]]
-// CHECK1: [[IF_END75]]:
-// CHECK1-NEXT: br label %[[IF_END76]]
-// CHECK1: [[IF_END76]]:
-// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8
-// CHECK1-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8
-// CHECK1-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]]
-// CHECK1-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]]
-// CHECK1: [[IF_THEN78]]:
-// CHECK1-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8
-// CHECK1-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8
-// CHECK1-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8
-// CHECK1-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]]
-// CHECK1-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]]
-// CHECK1-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8
-// CHECK1-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8
-// CHECK1-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8
-// CHECK1-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1
-// CHECK1-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]]
-// CHECK1-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8
-// CHECK1-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8
-// CHECK1-NEXT: store ptr [[TMP63]], ptr [[V]], align 8
-// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4
-// CHECK1-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8
-// CHECK1-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8
-// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP64]], double noundef [[TMP66]])
-// CHECK1-NEXT: br label %[[IF_END83]]
-// CHECK1: [[IF_END83]]:
-// CHECK1-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8
-// CHECK1-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8
-// CHECK1-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]]
-// CHECK1-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]]
-// CHECK1: [[IF_THEN85]]:
-// CHECK1-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8
-// CHECK1-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8
-// CHECK1-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8
-// CHECK1-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]]
-// CHECK1-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]]
-// CHECK1-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8
-// CHECK1-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8
-// CHECK1-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8
-// CHECK1-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1
-// CHECK1-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]]
-// CHECK1-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8
-// CHECK1-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8
-// CHECK1-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8
-// CHECK1-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4
-// CHECK1-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8
-// CHECK1-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8
-// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP75]], double noundef [[TMP77]])
-// CHECK1-NEXT: br label %[[IF_END90]]
-// CHECK1: [[IF_END90]]:
+// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4
+// CHECK1-NEXT: [[CMP67:%.*]] = icmp slt i32 [[TMP46]], [[TMP47]]
+// CHECK1-NEXT: br i1 [[CMP67]], label %[[IF_THEN68:.*]], label %[[IF_END73:.*]]
+// CHECK1: [[IF_THEN68]]:
+// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4
+// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4
+// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK1-NEXT: [[MUL69:%.*]] = mul nsw i32 [[TMP49]], [[TMP50]]
+// CHECK1-NEXT: [[ADD70:%.*]] = add nsw i32 [[TMP48]], [[MUL69]]
+// CHECK1-NEXT: store i32 [[ADD70]], ptr [[DOTOMP_IV1]], align 4
+// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4
+// CHECK1-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP51]], 2
+// CHECK1-NEXT: [[ADD72:%.*]] = add nsw i32 0, [[MUL71]]
+// CHECK1-NEXT: store i32 [[ADD72]], ptr [[J]], align 4
+// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[J]], align 4
+// CHECK1-NEXT: call void (...)
@body(i32 noundef [[TMP52]]) +// CHECK1-NEXT: br label %[[IF_END73]] +// CHECK1: [[IF_END73]]: +// CHECK1-NEXT: br label %[[IF_END74]] +// CHECK1: [[IF_END74]]: +// CHECK1-NEXT: [[TMP53:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[TMP54:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[CMP75:%.*]] = icmp slt i64 [[TMP53]], [[TMP54]] +// CHECK1-NEXT: br i1 [[CMP75]], label %[[IF_THEN76:.*]], label %[[IF_END81:.*]] +// CHECK1: [[IF_THEN76]]: +// CHECK1-NEXT: [[TMP55:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK1-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[MUL77:%.*]] = mul nsw i64 [[TMP56]], [[TMP57]] +// CHECK1-NEXT: [[ADD78:%.*]] = add nsw i64 [[TMP55]], [[MUL77]] +// CHECK1-NEXT: store i64 [[ADD78]], ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[TMP58:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], 1 +// CHECK1-NEXT: [[ADD_PTR80:%.*]] = getelementptr inbounds double, ptr [[TMP58]], i64 [[MUL79]] +// CHECK1-NEXT: store ptr [[ADD_PTR80]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP60]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP62:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP63:%.*]] = load double, ptr [[TMP62]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP61]], double noundef [[TMP63]]) +// CHECK1-NEXT: br label %[[IF_END81]] +// CHECK1: [[IF_END81]]: +// CHECK1-NEXT: [[TMP64:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[TMP65:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[CMP82:%.*]] = icmp slt i64 [[TMP64]], [[TMP65]] +// CHECK1-NEXT: br i1 [[CMP82]], label %[[IF_THEN83:.*]], label %[[IF_END88:.*]] +// CHECK1: [[IF_THEN83]]: +// CHECK1-NEXT: [[TMP66:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK1-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK1-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[MUL84:%.*]] = mul nsw i64 [[TMP67]], [[TMP68]] +// CHECK1-NEXT: [[ADD85:%.*]] = add nsw i64 [[TMP66]], [[MUL84]] +// CHECK1-NEXT: store i64 [[ADD85]], ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[TMP69:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8 +// CHECK1-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], 1 +// CHECK1-NEXT: [[ADD_PTR87:%.*]] = getelementptr inbounds double, ptr [[TMP69]], i64 [[MUL86]] +// CHECK1-NEXT: store ptr [[ADD_PTR87]], ptr [[__BEGIN225]], align 8 +// CHECK1-NEXT: [[TMP71:%.*]] = load ptr, ptr [[__BEGIN225]], align 8 +// CHECK1-NEXT: store ptr [[TMP71]], ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP72:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK1-NEXT: [[TMP73:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP74:%.*]] = load double, ptr [[TMP73]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP72]], double noundef [[TMP74]]) +// CHECK1-NEXT: br label %[[IF_END88]] +// CHECK1: [[IF_END88]]: // CHECK1-NEXT: br label %[[FOR_INC:.*]] // CHECK1: [[FOR_INC]]: -// CHECK1-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1 -// CHECK1-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP75:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[INC:%.*]] = add nsw i64 [[TMP75]], 1 +// CHECK1-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX52]], align 8 // CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] // CHECK1: [[FOR_END]]: // CHECK1-NEXT: ret void @@ -794,13 +783,11 @@ extern "C" void foo4() { // CHECK1-NEXT: [[ENTRY:.*:]] // CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 // CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -815,12 +802,10 @@ extern "C" void foo4() { // CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: store i32 0, ptr [[J]], align 4 -// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 // CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[K]], align 4 -// CHECK1-NEXT: store i32 63, ptr [[DOTOMP_UB1]], 
align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 // CHECK1-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 @@ -940,6 +925,277 @@ extern "C" void foo4() { // CHECK1-NEXT: ret void // // +// CHECK1-LABEL: define dso_local void @foo5( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_LB116:%.*]] = 
alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_TEMP_121:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX22:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX29:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE264:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN265:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END267:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: store i32 512, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], 
%[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK1-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK1-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB03]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4 +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK1-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr 
[[__END2]], align 8 +// CHECK1-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK1-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK1-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK1-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK1-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1 +// CHECK1-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1 +// CHECK1-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1 +// CHECK1-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8 +// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8 +// CHECK1-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1 +// CHECK1-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK1-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK1-NEXT: [[TMP17:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[CMP23:%.*]] = icmp sgt i64 [[TMP16]], [[TMP17]] +// CHECK1-NEXT: br i1 [[CMP23]], label %[[COND_TRUE24:.*]], label %[[COND_FALSE25:.*]] +// CHECK1: [[COND_TRUE24]]: +// CHECK1-NEXT: [[TMP18:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK1-NEXT: br label %[[COND_END26:.*]] +// CHECK1: [[COND_FALSE25]]: +// CHECK1-NEXT: [[TMP19:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: br label %[[COND_END26]] +// CHECK1: [[COND_END26]]: +// CHECK1-NEXT: [[COND27:%.*]] = 
phi i64 [ [[TMP18]], %[[COND_TRUE24]] ], [ [[TMP19]], %[[COND_FALSE25]] ] +// CHECK1-NEXT: store i64 [[COND27]], ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK1-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[CMP28:%.*]] = icmp slt i32 [[TMP20]], 128 +// CHECK1-NEXT: br i1 [[CMP28]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP21]]) +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add nsw i32 [[TMP22]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP9:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND30:.*]] +// CHECK1: [[FOR_COND30]]: +// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK1-NEXT: [[CMP31:%.*]] = icmp slt i64 [[TMP23]], [[TMP24]] +// CHECK1-NEXT: br i1 [[CMP31]], label %[[FOR_BODY32:.*]], label %[[FOR_END63:.*]] +// CHECK1: [[FOR_BODY32]]: +// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: [[CMP33:%.*]] = icmp slt i64 [[TMP25]], [[TMP26]] +// CHECK1-NEXT: br i1 [[CMP33]], label %[[IF_THEN:.*]], label %[[IF_END53:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4 +// CHECK1-NEXT: [[CONV34:%.*]] = sext i32 [[TMP27]] to i64 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_ST04]], align 4 +// CHECK1-NEXT: [[CONV35:%.*]] = sext 
i32 [[TMP28]] to i64 +// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV35]], [[TMP29]] +// CHECK1-NEXT: [[ADD36:%.*]] = add nsw i64 [[CONV34]], [[MUL]] +// CHECK1-NEXT: [[CONV37:%.*]] = trunc i64 [[ADD36]] to i32 +// CHECK1-NEXT: store i32 [[CONV37]], ptr [[DOTOMP_IV06]], align 4 +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4 +// CHECK1-NEXT: [[MUL38:%.*]] = mul nsw i32 [[TMP30]], 1 +// CHECK1-NEXT: [[ADD39:%.*]] = add nsw i32 0, [[MUL38]] +// CHECK1-NEXT: store i32 [[ADD39]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP40:%.*]] = icmp slt i32 [[TMP31]], [[TMP32]] +// CHECK1-NEXT: br i1 [[CMP40]], label %[[IF_THEN41:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN41]]: +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL42:%.*]] = mul nsw i32 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: [[ADD43:%.*]] = add nsw i32 [[TMP33]], [[MUL42]] +// CHECK1-NEXT: store i32 [[ADD43]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[MUL44:%.*]] = mul nsw i32 [[TMP36]], 2 +// CHECK1-NEXT: [[ADD45:%.*]] = add nsw i32 0, [[MUL44]] +// CHECK1-NEXT: store i32 [[ADD45]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP37]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP46:%.*]] = icmp slt i32 [[TMP38]], [[TMP39]] +// CHECK1-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK1: [[IF_THEN47]]: +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL48:%.*]] = mul nsw i32 [[TMP41]], [[TMP42]] +// CHECK1-NEXT: [[ADD49:%.*]] = add nsw i32 [[TMP40]], [[MUL48]] +// CHECK1-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[MUL50:%.*]] = mul nsw i32 [[TMP43]], 1 +// CHECK1-NEXT: [[ADD51:%.*]] = add nsw i32 0, [[MUL50]] +// CHECK1-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[K]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK1-NEXT: br label %[[IF_END52]] +// CHECK1: [[IF_END52]]: +// CHECK1-NEXT: br label %[[IF_END53]] +// CHECK1: [[IF_END53]]: +// CHECK1-NEXT: [[TMP45:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[TMP46:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[CMP54:%.*]] = icmp slt i64 [[TMP45]], [[TMP46]] +// CHECK1-NEXT: br i1 [[CMP54]], label %[[IF_THEN55:.*]], label %[[IF_END60:.*]] +// CHECK1: [[IF_THEN55]]: +// CHECK1-NEXT: [[TMP47:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK1-NEXT: [[TMP48:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK1-NEXT: [[TMP49:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[MUL56:%.*]] = mul nsw i64 [[TMP48]], [[TMP49]] +// CHECK1-NEXT: [[ADD57:%.*]] = add nsw i64 [[TMP47]], [[MUL56]] +// CHECK1-NEXT: store i64 [[ADD57]], ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[TMP50:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[TMP51:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[MUL58:%.*]] = mul nsw i64 [[TMP51]], 1 +// CHECK1-NEXT: [[ADD_PTR59:%.*]] = getelementptr inbounds double, ptr [[TMP50]], i64 [[MUL58]] +// CHECK1-NEXT: store ptr [[ADD_PTR59]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP52:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP52]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP54:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP55:%.*]] = load double, ptr [[TMP54]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP53]], double noundef [[TMP55]]) +// CHECK1-NEXT: br label %[[IF_END60]] +// CHECK1: [[IF_END60]]: +// CHECK1-NEXT: br label %[[FOR_INC61:.*]] +// CHECK1: [[FOR_INC61]]: +// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[INC62:%.*]] = add nsw i64 [[TMP56]], 1 +// CHECK1-NEXT: store i64 [[INC62]], ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND30]], !llvm.loop [[LOOP10:![0-9]+]] +// CHECK1: [[FOR_END63]]: +// CHECK1-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE264]], align 8 +// CHECK1-NEXT: [[TMP57:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY66:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP57]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY66]], ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: [[TMP58:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY68:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP58]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR69:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY68]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR69]], ptr [[__END267]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND70:.*]] +// CHECK1: [[FOR_COND70]]: +// CHECK1-NEXT: [[TMP59:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__END267]], align 8 +// CHECK1-NEXT: [[CMP71:%.*]] = icmp ne ptr [[TMP59]], [[TMP60]] +// CHECK1-NEXT: br i1 [[CMP71]], label %[[FOR_BODY72:.*]], label %[[FOR_END74:.*]] +// CHECK1: [[FOR_BODY72]]: +// CHECK1-NEXT: [[TMP61:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: store ptr [[TMP61]], ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK1-NEXT: [[TMP63:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP64:%.*]] = load double, ptr [[TMP63]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP62]], double noundef [[TMP64]]) +// CHECK1-NEXT: br label %[[FOR_INC73:.*]] +// CHECK1: [[FOR_INC73]]: +// CHECK1-NEXT: [[TMP65:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP65]], i32 1 +// CHECK1-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND70]] +// CHECK1: [[FOR_END74]]: +// CHECK1-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @body( // CHECK2-SAME: ...) #[[ATTR0:[0-9]+]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -961,7 +1217,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -970,7 +1225,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1002,107 +1256,103 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// 
CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK2-NEXT: 
[[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP19]], [[TMP20]] // CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] // CHECK2: [[COND_TRUE]]: -// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 // CHECK2-NEXT: 
br label %[[COND_END:.*]] // CHECK2: [[COND_FALSE]]: -// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 // CHECK2-NEXT: br label %[[COND_END]] // CHECK2: [[COND_END]]: -// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP21]], %[[COND_TRUE]] ], [ [[TMP22]], %[[COND_FALSE]] ] // CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: br label %[[FOR_COND:.*]] // CHECK2: [[FOR_COND]]: -// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 -// CHECK2-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP23]], [[TMP24]] // CHECK2-NEXT: br i1 [[CMP16]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK2: [[FOR_BODY]]: -// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] // CHECK2-NEXT: br i1 [[CMP17]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] // CHECK2: [[IF_THEN]]: -// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL:%.*]] = mul 
i32 [[TMP30]], [[TMP31]] -// CHECK2-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP28]], [[TMP29]] +// CHECK2-NEXT: [[ADD18:%.*]] = add i32 [[TMP27]], [[MUL]] // CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 -// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 -// CHECK2-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] -// CHECK2-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL19:%.*]] = mul i32 [[TMP31]], [[TMP32]] +// CHECK2-NEXT: [[ADD20:%.*]] = add i32 [[TMP30]], [[MUL19]] // CHECK2-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 -// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP35]]) +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP33]]) // CHECK2-NEXT: br label %[[IF_END]] // CHECK2: [[IF_END]]: -// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP34]], [[TMP35]] // CHECK2-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] // CHECK2: [[IF_THEN22]]: -// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] -// CHECK2-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL23:%.*]] = mul i32 [[TMP37]], [[TMP38]] +// CHECK2-NEXT: [[ADD24:%.*]] = add i32 [[TMP36]], [[MUL23]] // CHECK2-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] -// CHECK2-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL25:%.*]] = mul i32 
[[TMP40]], [[TMP41]] +// CHECK2-NEXT: [[ADD26:%.*]] = add i32 [[TMP39]], [[MUL25]] // CHECK2-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP44]]) +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP42]]) // CHECK2-NEXT: br label %[[IF_END27]] // CHECK2: [[IF_END27]]: // CHECK2-NEXT: br label %[[FOR_INC:.*]] // CHECK2: [[FOR_INC]]: -// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP43]], 1 // CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] // CHECK2: [[FOR_END]]: @@ -1114,13 +1364,11 @@ extern "C" void foo4() { // CHECK2-NEXT: [[ENTRY:.*:]] // CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 // CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1130,48 +1378,43 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4 -// 
CHECK2-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[C:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_LB116:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[CC:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[__END224:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__RANGE221:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END222:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN225:%.*]] = alloca ptr, 
align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_27:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_30:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_140:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX46:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX52:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[VV:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: store i32 0, ptr [[I]], align 4 -// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 // CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[J]], align 4 -// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 // CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4 @@ -1198,225 +1441,219 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 // CHECK2-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 // CHECK2-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: store i32 0, ptr 
[[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4 // CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 -// CHECK2-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4 -// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4 -// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4 -// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 -// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1 +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1 // CHECK2-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 -// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8 // CHECK2-NEXT: store i32 42, ptr [[C]], align 4 // CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0 // CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 // CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8 // CHECK2-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 -// CHECK2-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8 -// CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0 -// 
CHECK2-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8 -// CHECK2-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8 -// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8 -// CHECK2-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 -// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64 +// CHECK2-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 // CHECK2-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] // CHECK2-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 -// CHECK2-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 -// CHECK2-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1 -// CHECK2-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1 -// CHECK2-NEXT: [[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1 -// CHECK2-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK2-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK2-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8 -// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8 -// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8 -// CHECK2-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK2-NEXT: [[ADD21:%.*]] 
= add nsw i64 [[TMP16]], 1 -// CHECK2-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK2-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1 +// CHECK2-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1 +// CHECK2-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1 +// CHECK2-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1 +// CHECK2-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8 // CHECK2-NEXT: store i32 37, ptr [[CC]], align 4 -// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 -// CHECK2-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256 -// CHECK2-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8 -// CHECK2-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0 -// CHECK2-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8 -// CHECK2-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0 -// CHECK2-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8 -// CHECK2-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8 -// CHECK2-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8 -// CHECK2-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8 -// CHECK2-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 
-// CHECK2-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64 -// CHECK2-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr [[TMP22]] to i64 -// CHECK2-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]] -// CHECK2-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8 -// CHECK2-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1 -// CHECK2-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1 -// CHECK2-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1 -// CHECK2-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1 -// CHECK2-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK2-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[TMP15:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY23:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP15]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR24:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY23]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR24]], ptr [[__END222]], align 8 +// CHECK2-NEXT: [[TMP16:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY26:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP16]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY26]], ptr [[__BEGIN225]], align 8 +// CHECK2-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY28]], ptr [[DOTCAPTURE_EXPR_27]], align 8 +// CHECK2-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__END222]], align 8 +// CHECK2-NEXT: store ptr [[TMP18]], ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP19:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP20:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8 
+// CHECK2-NEXT: [[SUB_PTR_LHS_CAST31:%.*]] = ptrtoint ptr [[TMP19]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST32:%.*]] = ptrtoint ptr [[TMP20]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB33:%.*]] = sub i64 [[SUB_PTR_LHS_CAST31]], [[SUB_PTR_RHS_CAST32]] +// CHECK2-NEXT: [[SUB_PTR_DIV34:%.*]] = sdiv exact i64 [[SUB_PTR_SUB33]], 8 +// CHECK2-NEXT: [[SUB35:%.*]] = sub nsw i64 [[SUB_PTR_DIV34]], 1 +// CHECK2-NEXT: [[ADD36:%.*]] = add nsw i64 [[SUB35]], 1 +// CHECK2-NEXT: [[DIV37:%.*]] = sdiv i64 [[ADD36]], 1 +// CHECK2-NEXT: [[SUB38:%.*]] = sub nsw i64 [[DIV37]], 1 +// CHECK2-NEXT: store i64 [[SUB38]], ptr [[DOTCAPTURE_EXPR_30]], align 8 // CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8 // CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8 -// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK2-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1 -// CHECK2-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 -// CHECK2-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK2-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK2-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]] -// CHECK2-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]] -// CHECK2: [[COND_TRUE44]]: -// CHECK2-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK2-NEXT: br label %[[COND_END46:.*]] -// CHECK2: [[COND_FALSE45]]: -// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK2-NEXT: br label %[[COND_END46]] -// CHECK2: [[COND_END46]]: -// CHECK2-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ] -// CHECK2-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK2-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK2-NEXT: 
[[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]] -// CHECK2-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]] -// CHECK2: [[COND_TRUE50]]: -// CHECK2-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK2-NEXT: br label %[[COND_END52:.*]] -// CHECK2: [[COND_FALSE51]]: -// CHECK2-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: br label %[[COND_END52]] -// CHECK2: [[COND_END52]]: -// CHECK2-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ] -// CHECK2-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8 -// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP21:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_30]], align 8 +// CHECK2-NEXT: [[ADD39:%.*]] = add nsw i64 [[TMP21]], 1 +// CHECK2-NEXT: store i64 [[ADD39]], ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[TMP22:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: store i64 [[TMP22]], ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP41:%.*]] = icmp sgt i64 [[TMP23]], [[TMP24]] +// CHECK2-NEXT: br i1 [[CMP41]], label %[[COND_TRUE42:.*]], label %[[COND_FALSE43:.*]] +// CHECK2: [[COND_TRUE42]]: +// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK2-NEXT: br label %[[COND_END44:.*]] +// CHECK2: [[COND_FALSE43]]: +// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: br label %[[COND_END44]] +// CHECK2: [[COND_END44]]: +// CHECK2-NEXT: [[COND45:%.*]] = phi i64 [ [[TMP25]], %[[COND_TRUE42]] ], [ [[TMP26]], %[[COND_FALSE43]] ] +// CHECK2-NEXT: store i64 [[COND45]], ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 
8 +// CHECK2-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP47:%.*]] = icmp sgt i64 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: br i1 [[CMP47]], label %[[COND_TRUE48:.*]], label %[[COND_FALSE49:.*]] +// CHECK2: [[COND_TRUE48]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: br label %[[COND_END50:.*]] +// CHECK2: [[COND_FALSE49]]: +// CHECK2-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: br label %[[COND_END50]] +// CHECK2: [[COND_END50]]: +// CHECK2-NEXT: [[COND51:%.*]] = phi i64 [ [[TMP29]], %[[COND_TRUE48]] ], [ [[TMP30]], %[[COND_FALSE49]] ] +// CHECK2-NEXT: store i64 [[COND51]], ptr [[DOTOMP_FUSE_MAX46]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX52]], align 8 // CHECK2-NEXT: br label %[[FOR_COND:.*]] // CHECK2: [[FOR_COND]]: -// CHECK2-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8 -// CHECK2-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]] -// CHECK2-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX46]], align 8 +// CHECK2-NEXT: [[CMP53:%.*]] = icmp slt i64 [[TMP31]], [[TMP32]] +// CHECK2-NEXT: br i1 [[CMP53]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK2: [[FOR_BODY]]: -// CHECK2-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 -// CHECK2-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]] -// CHECK2-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]] +// CHECK2-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: [[CMP54:%.*]] = icmp slt i64 
[[TMP33]], [[TMP34]] +// CHECK2-NEXT: br i1 [[CMP54]], label %[[IF_THEN:.*]], label %[[IF_END74:.*]] // CHECK2: [[IF_THEN]]: -// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4 -// CHECK2-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64 -// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4 -// CHECK2-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64 -// CHECK2-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]] -// CHECK2-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]] -// CHECK2-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32 -// CHECK2-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4 -// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4 -// CHECK2-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1 -// CHECK2-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]] -// CHECK2-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]] -// CHECK2-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]] -// CHECK2: [[IF_THEN64]]: -// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]] -// CHECK2-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]] -// CHECK2-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1 -// CHECK2-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]] -// CHECK2-NEXT: store i32 [[ADD68]], ptr [[I]], align 4 -// CHECK2-NEXT: 
[[TMP48:%.*]] = load i32, ptr [[I]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP48]]) +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: [[CONV55:%.*]] = sext i32 [[TMP35]] to i64 +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_ST04]], align 4 +// CHECK2-NEXT: [[CONV56:%.*]] = sext i32 [[TMP36]] to i64 +// CHECK2-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV56]], [[TMP37]] +// CHECK2-NEXT: [[ADD57:%.*]] = add nsw i64 [[CONV55]], [[MUL]] +// CHECK2-NEXT: [[CONV58:%.*]] = trunc i64 [[ADD57]] to i32 +// CHECK2-NEXT: store i32 [[CONV58]], ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[MUL59:%.*]] = mul nsw i32 [[TMP38]], 1 +// CHECK2-NEXT: [[ADD60:%.*]] = add nsw i32 0, [[MUL59]] +// CHECK2-NEXT: store i32 [[ADD60]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP61:%.*]] = icmp slt i32 [[TMP39]], [[TMP40]] +// CHECK2-NEXT: br i1 [[CMP61]], label %[[IF_THEN62:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN62]]: +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL63:%.*]] = mul nsw i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: [[ADD64:%.*]] = add nsw i32 [[TMP41]], [[MUL63]] +// CHECK2-NEXT: store i32 [[ADD64]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP44]], 1 +// CHECK2-NEXT: [[ADD66:%.*]] = add nsw i32 0, [[MUL65]] +// CHECK2-NEXT: store i32 [[ADD66]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, 
ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP45]]) // CHECK2-NEXT: br label %[[IF_END]] // CHECK2: [[IF_END]]: -// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]] -// CHECK2-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]] -// CHECK2: [[IF_THEN70]]: -// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]] -// CHECK2-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]] -// CHECK2-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2 -// CHECK2-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]] -// CHECK2-NEXT: store i32 [[ADD74]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4 -// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP55]]) -// CHECK2-NEXT: br label %[[IF_END75]] -// CHECK2: [[IF_END75]]: -// CHECK2-NEXT: br label %[[IF_END76]] -// CHECK2: [[IF_END76]]: -// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK2-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]] -// CHECK2-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]] -// CHECK2: [[IF_THEN78]]: -// CHECK2-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8 -// CHECK2-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8 -// CHECK2-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]] -// CHECK2-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]] -// CHECK2-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8 -// CHECK2-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK2-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8 -// CHECK2-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1 -// CHECK2-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]] -// CHECK2-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8 -// CHECK2-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 -// CHECK2-NEXT: store ptr [[TMP63]], ptr [[V]], align 8 -// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4 -// CHECK2-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8 -// CHECK2-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8 -// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP64]], double noundef [[TMP66]]) -// CHECK2-NEXT: br label %[[IF_END83]] -// CHECK2: [[IF_END83]]: -// CHECK2-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]] -// CHECK2-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]] -// CHECK2: [[IF_THEN85]]: -// CHECK2-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 -// CHECK2-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 -// CHECK2-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]] -// CHECK2-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]] -// CHECK2-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8 -// CHECK2-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 -// CHECK2-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 -// CHECK2-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1 -// CHECK2-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]] -// CHECK2-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8 -// CHECK2-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8 -// CHECK2-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8 -// CHECK2-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4 -// CHECK2-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8 -// CHECK2-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8 -// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP75]], double noundef [[TMP77]]) -// CHECK2-NEXT: br label %[[IF_END90]] -// CHECK2: [[IF_END90]]: +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP67:%.*]] = icmp slt i32 [[TMP46]], [[TMP47]] +// CHECK2-NEXT: br i1 [[CMP67]], label %[[IF_THEN68:.*]], label %[[IF_END73:.*]] +// CHECK2: [[IF_THEN68]]: +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL69:%.*]] = mul nsw i32 [[TMP49]], [[TMP50]] +// CHECK2-NEXT: [[ADD70:%.*]] = add nsw i32 [[TMP48]], [[MUL69]] +// CHECK2-NEXT: store i32 [[ADD70]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP51]], 2 +// CHECK2-NEXT: [[ADD72:%.*]] = add nsw i32 0, [[MUL71]] +// CHECK2-NEXT: store i32 [[ADD72]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP52]]) +// CHECK2-NEXT: br label %[[IF_END73]] +// CHECK2: [[IF_END73]]: +// CHECK2-NEXT: br label %[[IF_END74]] +// CHECK2: [[IF_END74]]: +// CHECK2-NEXT: [[TMP53:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP54:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP75:%.*]] = icmp slt i64 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: br i1 [[CMP75]], label %[[IF_THEN76:.*]], label %[[IF_END81:.*]] +// CHECK2: [[IF_THEN76]]: +// CHECK2-NEXT: [[TMP55:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[MUL77:%.*]] = mul nsw i64 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: [[ADD78:%.*]] = add nsw i64 [[TMP55]], [[MUL77]] +// CHECK2-NEXT: store i64 [[ADD78]], ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[TMP58:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], 1 +// CHECK2-NEXT: [[ADD_PTR80:%.*]] = getelementptr inbounds double, ptr [[TMP58]], i64 [[MUL79]] +// CHECK2-NEXT: store ptr [[ADD_PTR80]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP60]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP62:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP63:%.*]] = load double, ptr [[TMP62]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP61]], double noundef [[TMP63]]) +// CHECK2-NEXT: br label %[[IF_END81]] +// CHECK2: [[IF_END81]]: +// CHECK2-NEXT: [[TMP64:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP65:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP82:%.*]] = icmp slt i64 [[TMP64]], [[TMP65]] +// CHECK2-NEXT: br i1 [[CMP82]], label %[[IF_THEN83:.*]], label %[[IF_END88:.*]] +// CHECK2: [[IF_THEN83]]: +// CHECK2-NEXT: [[TMP66:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK2-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK2-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[MUL84:%.*]] = mul nsw i64 [[TMP67]], [[TMP68]] +// CHECK2-NEXT: [[ADD85:%.*]] = add nsw i64 [[TMP66]], [[MUL84]] +// CHECK2-NEXT: store i64 [[ADD85]], ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[TMP69:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8 +// CHECK2-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], 1 +// CHECK2-NEXT: [[ADD_PTR87:%.*]] = getelementptr inbounds double, ptr [[TMP69]], i64 [[MUL86]] +// CHECK2-NEXT: store ptr [[ADD_PTR87]], ptr [[__BEGIN225]], align 8 +// CHECK2-NEXT: [[TMP71:%.*]] = load ptr, ptr [[__BEGIN225]], align 8 +// CHECK2-NEXT: store ptr [[TMP71]], ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP72:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK2-NEXT: [[TMP73:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP74:%.*]] = load double, ptr [[TMP73]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP72]], double noundef [[TMP74]]) +// CHECK2-NEXT: br label %[[IF_END88]] +// CHECK2: [[IF_END88]]: // CHECK2-NEXT: br label %[[FOR_INC:.*]] // CHECK2: [[FOR_INC]]: -// CHECK2-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1 -// CHECK2-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP75:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i64 [[TMP75]], 1 +// CHECK2-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX52]], align 8 // CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]] // CHECK2: [[FOR_END]]: // CHECK2-NEXT: ret void @@ -1427,13 +1664,11 @@ extern "C" void foo4() { // CHECK2-NEXT: [[ENTRY:.*:]] // CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 // CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1448,12 +1683,10 @@ extern "C" void foo4() { // CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: store i32 0, ptr [[J]], align 4 -// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 // CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[K]], align 4 -// CHECK2-NEXT: store i32 63, ptr 
[[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 // CHECK2-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 @@ -1573,6 +1806,277 @@ extern "C" void foo4() { // CHECK2-NEXT: ret void // // +// CHECK2-LABEL: define dso_local void @foo5( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: 
[[DOTOMP_LB116:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_121:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX22:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX29:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE264:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN265:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END267:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: store i32 512, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: 
[[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK2-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK2-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4 +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK2-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8 +// 
CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK2-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK2-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK2-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1 +// CHECK2-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1 +// CHECK2-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1 +// CHECK2-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1 +// CHECK2-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK2-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK2-NEXT: [[TMP17:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP23:%.*]] = icmp sgt i64 [[TMP16]], [[TMP17]] +// CHECK2-NEXT: br i1 [[CMP23]], label %[[COND_TRUE24:.*]], label %[[COND_FALSE25:.*]] +// CHECK2: [[COND_TRUE24]]: +// CHECK2-NEXT: [[TMP18:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK2-NEXT: br label %[[COND_END26:.*]] +// CHECK2: [[COND_FALSE25]]: +// CHECK2-NEXT: [[TMP19:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: br label %[[COND_END26]] +// CHECK2: 
[[COND_END26]]: +// CHECK2-NEXT: [[COND27:%.*]] = phi i64 [ [[TMP18]], %[[COND_TRUE24]] ], [ [[TMP19]], %[[COND_FALSE25]] ] +// CHECK2-NEXT: store i64 [[COND27]], ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK2-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[CMP28:%.*]] = icmp slt i32 [[TMP20]], 128 +// CHECK2-NEXT: br i1 [[CMP28]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP21]]) +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i32 [[TMP22]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP8:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND30:.*]] +// CHECK2: [[FOR_COND30]]: +// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK2-NEXT: [[CMP31:%.*]] = icmp slt i64 [[TMP23]], [[TMP24]] +// CHECK2-NEXT: br i1 [[CMP31]], label %[[FOR_BODY32:.*]], label %[[FOR_END63:.*]] +// CHECK2: [[FOR_BODY32]]: +// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: [[CMP33:%.*]] = icmp slt i64 [[TMP25]], [[TMP26]] +// CHECK2-NEXT: br i1 [[CMP33]], label %[[IF_THEN:.*]], label %[[IF_END53:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: [[CONV34:%.*]] = sext i32 [[TMP27]] to i64 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr 
[[DOTOMP_ST04]], align 4 +// CHECK2-NEXT: [[CONV35:%.*]] = sext i32 [[TMP28]] to i64 +// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV35]], [[TMP29]] +// CHECK2-NEXT: [[ADD36:%.*]] = add nsw i64 [[CONV34]], [[MUL]] +// CHECK2-NEXT: [[CONV37:%.*]] = trunc i64 [[ADD36]] to i32 +// CHECK2-NEXT: store i32 [[CONV37]], ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[MUL38:%.*]] = mul nsw i32 [[TMP30]], 1 +// CHECK2-NEXT: [[ADD39:%.*]] = add nsw i32 0, [[MUL38]] +// CHECK2-NEXT: store i32 [[ADD39]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP40:%.*]] = icmp slt i32 [[TMP31]], [[TMP32]] +// CHECK2-NEXT: br i1 [[CMP40]], label %[[IF_THEN41:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN41]]: +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL42:%.*]] = mul nsw i32 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: [[ADD43:%.*]] = add nsw i32 [[TMP33]], [[MUL42]] +// CHECK2-NEXT: store i32 [[ADD43]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL44:%.*]] = mul nsw i32 [[TMP36]], 2 +// CHECK2-NEXT: [[ADD45:%.*]] = add nsw i32 0, [[MUL44]] +// CHECK2-NEXT: store i32 [[ADD45]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP37]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP46:%.*]] = icmp slt i32 [[TMP38]], [[TMP39]] +// CHECK2-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK2: [[IF_THEN47]]: +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL48:%.*]] = mul nsw i32 [[TMP41]], [[TMP42]] +// CHECK2-NEXT: [[ADD49:%.*]] = add nsw i32 [[TMP40]], [[MUL48]] +// CHECK2-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL50:%.*]] = mul nsw i32 [[TMP43]], 1 +// CHECK2-NEXT: [[ADD51:%.*]] = add nsw i32 0, [[MUL50]] +// CHECK2-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK2-NEXT: br label %[[IF_END52]] +// CHECK2: [[IF_END52]]: +// CHECK2-NEXT: br label %[[IF_END53]] +// CHECK2: [[IF_END53]]: +// CHECK2-NEXT: [[TMP45:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[TMP46:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP54:%.*]] = icmp slt i64 [[TMP45]], [[TMP46]] +// CHECK2-NEXT: br i1 [[CMP54]], label %[[IF_THEN55:.*]], label %[[IF_END60:.*]] +// CHECK2: [[IF_THEN55]]: +// CHECK2-NEXT: [[TMP47:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: [[TMP48:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP49:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[MUL56:%.*]] = mul nsw i64 [[TMP48]], [[TMP49]] +// CHECK2-NEXT: [[ADD57:%.*]] = add nsw i64 [[TMP47]], [[MUL56]] +// CHECK2-NEXT: store i64 [[ADD57]], ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[TMP50:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[TMP51:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[MUL58:%.*]] = mul nsw i64 [[TMP51]], 1 +// CHECK2-NEXT: [[ADD_PTR59:%.*]] = getelementptr inbounds double, ptr [[TMP50]], i64 [[MUL58]] +// CHECK2-NEXT: store ptr [[ADD_PTR59]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP52:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP52]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP55:%.*]] = load double, ptr [[TMP54]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP53]], double noundef [[TMP55]]) +// CHECK2-NEXT: br label %[[IF_END60]] +// CHECK2: [[IF_END60]]: +// CHECK2-NEXT: br label %[[FOR_INC61:.*]] +// CHECK2: [[FOR_INC61]]: +// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[INC62:%.*]] = add nsw i64 [[TMP56]], 1 +// CHECK2-NEXT: store i64 [[INC62]], ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND30]], !llvm.loop [[LOOP9:![0-9]+]] +// CHECK2: [[FOR_END63]]: +// CHECK2-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE264]], align 8 +// CHECK2-NEXT: [[TMP57:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY66:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP57]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY66]], ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: [[TMP58:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY68:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP58]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR69:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY68]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR69]], ptr [[__END267]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND70:.*]] +// CHECK2: [[FOR_COND70]]: +// CHECK2-NEXT: [[TMP59:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__END267]], align 8 +// CHECK2-NEXT: [[CMP71:%.*]] = icmp ne ptr [[TMP59]], [[TMP60]] +// CHECK2-NEXT: br i1 [[CMP71]], label %[[FOR_BODY72:.*]], label %[[FOR_END74:.*]] +// CHECK2: [[FOR_BODY72]]: +// CHECK2-NEXT: [[TMP61:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: store ptr [[TMP61]], ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP62:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK2-NEXT: [[TMP63:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP64:%.*]] = load double, ptr [[TMP63]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP62]], double noundef [[TMP64]]) +// CHECK2-NEXT: br label %[[FOR_INC73:.*]] +// CHECK2: [[FOR_INC73]]: +// CHECK2-NEXT: [[TMP65:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP65]], i32 1 +// CHECK2-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND70]] +// CHECK2: [[FOR_END74]]: +// CHECK2-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @tfoo2( // CHECK2-SAME: ) #[[ATTR0]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -1593,7 +2097,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -1602,7 +2105,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1611,7 +2113,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP21:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 @@ -1641,174 +2142,168 @@ extern 
"C" void foo4() { // CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr 
[[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP18]], [[TMP19]] +// CHECK2-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 // CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 // CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] -// CHECK2-NEXT: store i32 [[ADD16]], ptr [[K]], 
align 4 -// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK2-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] // CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] // CHECK2-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK2-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP24]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[SUB23:%.*]] = sub i32 [[TMP25]], [[TMP26]] // CHECK2-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 -// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] -// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[ADD25:%.*]] = add i32 
[[SUB24]], [[TMP27]] +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP28]] // CHECK2-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 // CHECK2-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK2-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 -// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK2-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: [[ADD28:%.*]] = add i32 [[TMP29]], 1 // CHECK2-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 -// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP30]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP31]], [[TMP32]] // CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] // CHECK2: [[COND_TRUE]]: -// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 // CHECK2-NEXT: br label %[[COND_END:.*]] // CHECK2: [[COND_FALSE]]: -// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] 
= load i32, ptr [[DOTOMP_NI1]], align 4 // CHECK2-NEXT: br label %[[COND_END]] // CHECK2: [[COND_END]]: -// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ] +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP33]], %[[COND_TRUE]] ], [ [[TMP34]], %[[COND_FALSE]] ] // CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4 -// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 -// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 -// CHECK2-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]] +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP35]], [[TMP36]] // CHECK2-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], label %[[COND_FALSE31:.*]] // CHECK2: [[COND_TRUE30]]: -// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 // CHECK2-NEXT: br label %[[COND_END32:.*]] // CHECK2: [[COND_FALSE31]]: -// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 // CHECK2-NEXT: br label %[[COND_END32]] // CHECK2: [[COND_END32]]: -// CHECK2-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ] +// CHECK2-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP37]], %[[COND_TRUE30]] ], [ [[TMP38]], %[[COND_FALSE31]] ] // CHECK2-NEXT: store i32 [[COND33]], ptr [[DOTOMP_FUSE_MAX]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: br label %[[FOR_COND:.*]] // CHECK2: [[FOR_COND]]: -// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 -// CHECK2-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]] +// 
CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP39]], [[TMP40]] // CHECK2-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK2: [[FOR_BODY]]: -// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]] +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP41]], [[TMP42]] // CHECK2-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] // CHECK2: [[IF_THEN]]: -// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]] -// CHECK2-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]] +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP44]], [[TMP45]] +// CHECK2-NEXT: [[ADD36:%.*]] = add i32 [[TMP43]], [[MUL]] // CHECK2-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 -// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 -// CHECK2-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]] -// CHECK2-NEXT: [[ADD38:%.*]] = add i32 [[TMP49]], [[MUL37]] +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr 
[[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL37:%.*]] = mul i32 [[TMP47]], [[TMP48]] +// CHECK2-NEXT: [[ADD38:%.*]] = add i32 [[TMP46]], [[MUL37]] // CHECK2-NEXT: store i32 [[ADD38]], ptr [[I]], align 4 -// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP52]]) +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP49]]) // CHECK2-NEXT: br label %[[IF_END]] // CHECK2: [[IF_END]]: -// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP50]], [[TMP51]] // CHECK2-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]] // CHECK2: [[IF_THEN40]]: -// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK2-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]] -// CHECK2-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]] +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL41:%.*]] = mul i32 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: [[ADD42:%.*]] = add i32 [[TMP52]], [[MUL41]] // CHECK2-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP58:%.*]] = load i32, ptr 
[[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]] -// CHECK2-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]] +// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL43:%.*]] = mul i32 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: [[SUB44:%.*]] = sub i32 [[TMP55]], [[MUL43]] // CHECK2-NEXT: store i32 [[SUB44]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP61]]) +// CHECK2-NEXT: [[TMP58:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP58]]) // CHECK2-NEXT: br label %[[IF_END45]] // CHECK2: [[IF_END45]]: -// CHECK2-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 -// CHECK2-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]] +// CHECK2-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP59]], [[TMP60]] // CHECK2-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] // CHECK2: [[IF_THEN47]]: -// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 -// CHECK2-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 -// CHECK2-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]] -// CHECK2-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]] +// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 +// CHECK2-NEXT: [[TMP62:%.*]] = 
load i32, ptr [[DOTOMP_ST2]], align 4 +// CHECK2-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL48:%.*]] = mul i32 [[TMP62]], [[TMP63]] +// CHECK2-NEXT: [[ADD49:%.*]] = add i32 [[TMP61]], [[MUL48]] // CHECK2-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4 -// CHECK2-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK2-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 -// CHECK2-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]] -// CHECK2-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]] +// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 +// CHECK2-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[MUL50:%.*]] = mul i32 [[TMP65]], [[TMP66]] +// CHECK2-NEXT: [[ADD51:%.*]] = add i32 [[TMP64]], [[MUL50]] // CHECK2-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 -// CHECK2-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP70]]) +// CHECK2-NEXT: [[TMP67:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP67]])
 // CHECK2-NEXT: br label %[[IF_END52]]
 // CHECK2: [[IF_END52]]:
 // CHECK2-NEXT: br label %[[FOR_INC:.*]]
 // CHECK2: [[FOR_INC]]:
-// CHECK2-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1
+// CHECK2-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4
+// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP68]], 1
 // CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4
-// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP8:![0-9]+]]
+// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP10:![0-9]+]]
 // CHECK2: [[FOR_END]]:
 // CHECK2-NEXT: ret void
 //
@@ -1819,6 +2314,8 @@ extern "C" void foo4() {
 // CHECK1: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]}
 // CHECK1: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]}
 // CHECK1: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]}
+// CHECK1: [[LOOP9]] = distinct !{[[LOOP9]], [[META4]]}
+// CHECK1: [[LOOP10]] = distinct !{[[LOOP10]], [[META4]]}
 //.
 // CHECK2: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]}
 // CHECK2: [[META4]] = !{!"llvm.loop.mustprogress"}
@@ -1826,4 +2323,6 @@ extern "C" void foo4() {
 // CHECK2: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]}
 // CHECK2: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]}
 // CHECK2: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]}
+// CHECK2: [[LOOP9]] = distinct !{[[LOOP9]], [[META4]]}
+// CHECK2: [[LOOP10]] = distinct !{[[LOOP10]], [[META4]]}
 //.
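
For readers following the CHECK2 lines above, the control flow they describe for the fused loop can be sketched roughly as follows. This is an illustrative reading of the expected IR, not code from the patch: the helper name fusedOrder and its trace output are hypothetical, and the sketch assumes lower bound 0 and stride 1 for every loop, as stored to .omp.lb2 and .omp.st2 above.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the patch: records which loop body would
// run, and with which logical iteration number, when three loops with
// iteration counts ni0, ni1 and ni2 are fused into a single loop.
std::vector<std::pair<int, unsigned>> fusedOrder(unsigned ni0, unsigned ni1,
                                                 unsigned ni2) {
  std::vector<std::pair<int, unsigned>> trace;
  // Corresponds to .omp.fuse.max: the fused loop runs max(ni0, ni1, ni2)
  // times (the IR computes it with two compare-and-select steps).
  const unsigned fuseMax = std::max({ni0, ni1, ni2});
  for (unsigned idx = 0; idx < fuseMax; ++idx) {
    // Each original body is guarded by its own iteration count, matching
    // the "icmp ult" checks against .omp.ni0, .omp.ni1 and .omp.ni2.
    if (idx < ni0)
      trace.emplace_back(0, idx);
    if (idx < ni1)
      trace.emplace_back(1, idx);
    if (idx < ni2)
      trace.emplace_back(2, idx);
  }
  return trace;
}
```

In the real lowering each guarded block additionally maps the logical iteration number back to the user variable (for example start + iv * step for i and k, and start - iv * step for j) before calling body; the sketch leaves that out.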
>From 823bc08b4ef97458665ed41409e03cd07598efd3 Mon Sep 17 00:00:00 2001
From: eZWALT
Date: Fri, 9 May 2025 10:44:48 +0000
Subject: [PATCH 5/9] Fixed missing diagnostic groups in warnings

---
 clang/include/clang/Basic/DiagnosticSemaKinds.td | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 191618e7865dc..a6ae0de004c8a 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -11559,7 +11559,8 @@ def note_omp_implicit_dsa : Note<
 def err_omp_loop_var_dsa : Error<
   "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">;
 def warn_omp_different_loop_ind_var_types : Warning <
-  "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">;
+  "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">,
+  InGroup;
 def err_omp_not_canonical_loop : Error <
   "loop after '#pragma omp %0' is not in canonical form">;
 def err_omp_not_a_loop_sequence : Error <
@@ -11570,7 +11571,8 @@ def err_omp_invalid_looprange : Error <
   "loop range in '#pragma omp %0' exceeds the number of available loops: "
   "range end '%1' is greater than the total number of loops '%2'">;
 def warn_omp_redundant_fusion : Warning <
-  "loop range in '#pragma omp %0' contains only a single loop, resulting in redundant fusion">;
+  "loop range in '#pragma omp %0' contains only a single loop, resulting in redundant fusion">,
+  InGroup;
 def err_omp_not_for : Error<
   "%select{statement after '#pragma omp %1' must be a for loop|"
   "expected %2 for loops after '#pragma omp %1'%select{|, but found only %4}3}0">;

>From 422ffd7ef80a83156037a34c6ad955e67c504b4d Mon Sep 17 00:00:00 2001
From: eZWALT
Date: Fri, 9 May 2025 10:49:50 +0000
Subject: [PATCH 6/9] Fixed formatting and comments

---
clang/lib/Sema/SemaOpenMP.cpp | 112 ++++++++++++++++++---------------- 1 file changed, 58 insertions(+), 54 deletions(-) diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index b0529c9352c83..485eebf23ef93 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -14160,42 +14160,43 @@ StmtResult SemaOpenMP::ActOnOpenMPTargetTeamsDistributeSimdDirective( } // Overloaded base case function -template -static bool tryHandleAs(T *t, F &&) { - return false; +template static bool tryHandleAs(T *t, F &&) { + return false; } /** - * Tries to recursively cast `t` to one of the given types and invokes `f` if successful. + * Tries to recursively cast `t` to one of the given types and invokes `f` if + * successful. * * @tparam Class The first type to check. * @tparam Rest The remaining types to check. * @tparam T The base type of `t`. - * @tparam F The callable type for the function to invoke upon a successful cast. + * @tparam F The callable type for the function to invoke upon a successful + * cast. * @param t The object to be checked. * @param f The function to invoke if `t` matches `Class`. * @return `true` if `t` matched any type and `f` was called, otherwise `false`. */ template static bool tryHandleAs(T *t, F &&f) { - if (Class *c = dyn_cast(t)) { - f(c); - return true; - } else { - return tryHandleAs(t, std::forward(f)); - } + if (Class *c = dyn_cast(t)) { + f(c); + return true; + } else { + return tryHandleAs(t, std::forward(f)); + } } // Updates OriginalInits by checking Transform against loop transformation // directives and appending their pre-inits if a match is found. 
static void updatePreInits(OMPLoopBasedDirective *Transform, SmallVectorImpl> &PreInits) { - if (!tryHandleAs( - Transform, [&PreInits](auto *Dir) { - appendFlattenedStmtList(PreInits.back(), Dir->getPreInits()); - })) - llvm_unreachable("Unhandled loop transformation"); + if (!tryHandleAs( + Transform, [&PreInits](auto *Dir) { + appendFlattenedStmtList(PreInits.back(), Dir->getPreInits()); + })) + llvm_unreachable("Unhandled loop transformation"); } bool SemaOpenMP::checkTransformableLoopNest( @@ -14273,43 +14274,42 @@ class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { unsigned getNestedLoopCount() const { return NestedLoopCount; } bool VisitForStmt(ForStmt *FS) override { - ++NestedLoopCount; - return true; + ++NestedLoopCount; + return true; } bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { - ++NestedLoopCount; - return true; + ++NestedLoopCount; + return true; } bool TraverseStmt(Stmt *S) override { - if (!S) + if (!S) return true; - // Skip traversal of all expressions, including special cases like - // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions - // may contain inner statements (and even loops), but they are not part - // of the syntactic body of the surrounding loop structure. - // Therefore must not be counted - if (isa(S)) + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. 
+ // Therefore must not be counted + if (isa(S)) return true; - // Only recurse into CompoundStmt (block {}) and loop bodies - if (isa(S) || isa(S) || - isa(S)) { + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { return DynamicRecursiveASTVisitor::TraverseStmt(S); - } + } - // Stop traversal of the rest of statements, that break perfect - // loop nesting, such as control flow (IfStmt, SwitchStmt...) - return true; + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; } bool TraverseDecl(Decl *D) override { - // Stop in the case of finding a declaration, it is not important - // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, - // FunctionDecl...) - return true; + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) + return true; } }; @@ -14467,15 +14467,14 @@ bool SemaOpenMP::analyzeLoopSequence( return isa(Child); }; - // High level grammar validation for (auto *Child : LoopSeqStmt->children()) { - if (!Child) + if (!Child) continue; - // Skip over non-loop-sequence statements - if (!isLoopSequenceDerivation(Child)) { + // Skip over non-loop-sequence statements + if (!isLoopSequenceDerivation(Child)) { Child = Child->IgnoreContainers(); // Ignore empty compound statement @@ -14493,9 +14492,9 @@ bool SemaOpenMP::analyzeLoopSequence( // Already been treated, skip this children continue; } - } - // Regular loop sequence handling - if (isLoopSequenceDerivation(Child)) { + } + // Regular loop sequence handling + if (isLoopSequenceDerivation(Child)) { if (isLoopGeneratingStmt(Child)) { if (!analyzeLoopGeneration(Child)) { return false; @@ -14509,12 +14508,12 @@ bool SemaOpenMP::analyzeLoopSequence( // Update the Loop Sequence size by one ++LoopSeqSize; } - } else { + } else { // Report error for invalid 
statement inside canonical loop sequence Diag(Child->getBeginLoc(), diag::err_omp_not_for) << 0 << getOpenMPDirectiveName(Kind); return false; - } + } } return true; } @@ -14531,9 +14530,9 @@ bool SemaOpenMP::checkTransformableLoopSequence( // Checks whether the given statement is a compound statement if (!isa(AStmt)) { - Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) - << getOpenMPDirectiveName(Kind); - return false; + Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) + << getOpenMPDirectiveName(Kind); + return false; } // Number of top level canonical loop nests observed (And acts as index) LoopSeqSize = 0; @@ -14564,7 +14563,7 @@ bool SemaOpenMP::checkTransformableLoopSequence( OriginalInits, TransformsPreInits, LoopSequencePreInits, LoopCategories, Context, Kind)) { - return false; + return false; } if (LoopSeqSize <= 0) { Diag(AStmt->getBeginLoc(), diag::err_omp_empty_loop_sequence) @@ -15278,7 +15277,7 @@ StmtResult SemaOpenMP::ActOnOpenMPUnrollDirective(ArrayRef Clauses, Stmt *LoopStmt = nullptr; collectLoopStmts(AStmt, {LoopStmt}); - // Determine the PreInit declarations.e + // Determine the PreInit declarations. SmallVector PreInits; addLoopPreInits(Context, LoopHelper, LoopStmt, OriginalInits[0], PreInits); @@ -15894,13 +15893,18 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, CountVal = CountInt.getZExtValue(); }; - // Checks if the loop range is valid + // OpenMP [6.0, Restrictions] + // first + count - 1 must not evaluate to a value greater than the + // loop sequence length of the associated canonical loop sequence. auto ValidLoopRange = [](uint64_t FirstVal, uint64_t CountVal, unsigned NumLoops) -> bool { return FirstVal + CountVal - 1 <= NumLoops; }; uint64_t FirstVal = 1, CountVal = 0, LastVal = LoopSeqSize; + // Validates the loop range after evaluating the semantic information + // and ensures that the range is valid for the given loop sequence size. 
+ // Expressions are evaluated at compile time to obtain constant values. if (LRC) { EvaluateLoopRangeArguments(LRC->getFirst(), LRC->getCount(), FirstVal, CountVal); >From ac0d9e348109f742440003945d278a9c26f56976 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:58:54 +0000 Subject: [PATCH 7/9] Added minimal changes to enable flang future implementation --- flang/include/flang/Parser/dump-parse-tree.h | 1 + flang/include/flang/Parser/parse-tree.h | 9 +++++++++ flang/lib/Lower/OpenMP/Clauses.cpp | 5 +++++ flang/lib/Lower/OpenMP/Clauses.h | 1 + flang/lib/Parser/openmp-parsers.cpp | 7 +++++++ flang/lib/Parser/unparse.cpp | 7 +++++++ flang/lib/Semantics/check-omp-structure.cpp | 9 +++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 8 files changed, 40 insertions(+) diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index df9278697346f..c220c4dafb52f 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -609,6 +609,7 @@ class ParseTreeDumper { NODE(OmpLinearClause, Modifier) NODE(parser, OmpLinearModifier) NODE_ENUM(OmpLinearModifier, Value) + NODE(parser, OmpLoopRangeClause) NODE(parser, OmpStepComplexModifier) NODE(parser, OmpStepSimpleModifier) NODE(parser, OmpLoopDirective) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 254236b510544..be80141b49e2b 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4361,6 +4361,15 @@ struct OmpLinearClause { std::tuple t; }; +// Ref: [6.0:207-208] +// +// loop-range-clause -> +// LOOPRANGE(first, count) // since 6.0 +struct OmpLoopRangeClause { + TUPLE_CLASS_BOILERPLATE(OmpLoopRangeClause); + std::tuple t; +}; + // Ref: [4.5:216-219], [5.0:315-324], [5.1:347-355], [5.2:150-158] // // map-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index 
f3088b18b77ff..ea535ab3adbe7 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -998,6 +998,11 @@ Link make(const parser::OmpClause::Link &inp, return Link{/*List=*/makeObjects(inp.v, semaCtx)}; } +LoopRange make(const parser::OmpClause::Looprange &inp, + semantics::SemanticsContext &semaCtx) { + llvm_unreachable("Unimplemented: looprange"); +} + Map make(const parser::OmpClause::Map &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpMapClause diff --git a/flang/lib/Lower/OpenMP/Clauses.h b/flang/lib/Lower/OpenMP/Clauses.h index d7ab21d428e32..bda8571e65f23 100644 --- a/flang/lib/Lower/OpenMP/Clauses.h +++ b/flang/lib/Lower/OpenMP/Clauses.h @@ -239,6 +239,7 @@ using Initializer = tomp::clause::InitializerT; using InReduction = tomp::clause::InReductionT; using IsDevicePtr = tomp::clause::IsDevicePtrT; using Lastprivate = tomp::clause::LastprivateT; +using LoopRange = tomp::clause::LoopRangeT; using Linear = tomp::clause::LinearT; using Link = tomp::clause::LinkT; using Map = tomp::clause::MapT; diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..393dbe8ada002 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -841,6 +841,11 @@ TYPE_PARSER( maybe(":"_tok >> nonemptyList(Parser{})), /*PostModified=*/pure(true))) +TYPE_PARSER( + construct(scalarIntConstantExpr, + "," >> scalarIntConstantExpr) +) + // OpenMPv5.2 12.5.2 detach-clause -> DETACH (event-handle) TYPE_PARSER(construct(Parser{})) @@ -1010,6 +1015,8 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "LINK" >> construct(construct( parenthesized(Parser{}))) || + "LOOPRANGE" >> construct(construct( + parenthesized(Parser{}))) || "MAP" >> construct(construct( parenthesized(Parser{}))) || "MATCH" >> construct(construct( diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..00b5a8c0600e1 100644 --- 
a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2314,6 +2314,13 @@ class UnparseVisitor { } } } + void Unparse(const OmpLoopRangeClause &x) { + Word("LOOPRANGE("); + Walk(std::get<0>(x.t)); + Put(", "); + Walk(std::get<1>(x.t)); + Put(")"); + } void Unparse(const OmpReductionClause &x) { using Modifier = OmpReductionClause::Modifier; Walk(std::get>>(x.t), ": "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 606014276e7ca..4af2b4909fcb6 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3383,6 +3383,15 @@ CHECK_REQ_CONSTANT_SCALAR_INT_CLAUSE(Collapse, OMPC_collapse) CHECK_REQ_CONSTANT_SCALAR_INT_CLAUSE(Safelen, OMPC_safelen) CHECK_REQ_CONSTANT_SCALAR_INT_CLAUSE(Simdlen, OMPC_simdlen) +void OmpStructureChecker::Enter(const parser::OmpClause::Looprange &x) { + context_.Say(GetContext().clauseSource, + "LOOPRANGE clause is not implemented yet"_err_en_US, + ContextDirectiveAsFortran()); +} + +void OmpStructureChecker::Enter(const parser::OmpClause::FreeAgent &x) { + context_.Say(GetContext().clauseSource, + "FREE_AGENT clause is not implemented yet"_err_en_US, // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index ae19385c022d0..3be758686c634 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -273,6 +273,7 @@ def OMPC_Link : Clause<"link"> { } def OMPC_LoopRange : Clause<"looprange"> { let clangClass = "OMPLoopRangeClause"; + let flangClass = "OmpLoopRangeClause"; } def OMPC_Map : Clause<"map"> { let clangClass = "OMPMapClause"; >From e6e00ae563e491968637e00d2a15a7272bc9d146 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Wed, 21 May 2025 13:14:22 +0000 Subject: [PATCH 8/9] Address basic PR feedback --- clang/include/clang/AST/OpenMPClause.h | 93 ++++---- clang/include/clang/AST/StmtOpenMP.h | 3 +- clang/include/clang/Sema/SemaOpenMP.h | 14 +- clang/lib/AST/OpenMPClause.cpp | 17 +- clang/lib/CodeGen/CGExpr.cpp | 5 +- clang/lib/CodeGen/CodeGenFunction.h | 4 - clang/lib/Sema/SemaOpenMP.cpp | 224 +++++++++----------- flang/lib/Semantics/check-omp-structure.cpp | 3 - 8 files changed, 166 insertions(+), 197 deletions(-) diff --git a/clang/include/clang/AST/OpenMPClause.h b/clang/include/clang/AST/OpenMPClause.h index 8f937cdef9cd0..3df5133a17fb4 100644 --- a/clang/include/clang/AST/OpenMPClause.h +++ b/clang/include/clang/AST/OpenMPClause.h @@ -1153,82 +1153,73 @@ class OMPFullClause final : public OMPNoChildClause { /// for(int j = 0; j < 256; j+=2) /// for(int k = 127; k >= 0; --k) /// \endcode -class OMPLoopRangeClause final : public OMPClause { +class OMPLoopRangeClause final + : public OMPClause, + private llvm::TrailingObjects { friend class OMPClauseReader; - - explicit OMPLoopRangeClause() - : OMPClause(llvm::omp::OMPC_looprange, {}, {}) {} + friend class llvm::TrailingObjects; /// Location of '(' SourceLocation LParenLoc; - /// Location of 'first' - SourceLocation FirstLoc; - - /// Location of 'count' - SourceLocation CountLoc; - - /// Expr associated with 'first' argument - Expr *First = nullptr; - - /// Expr associated 
with 'count' argument - Expr *Count = nullptr; - - /// Set 'first' - void setFirst(Expr *First) { this->First = First; } + /// Location of first and count expressions + SourceLocation FirstLoc, CountLoc; - /// Set 'count' - void setCount(Expr *Count) { this->Count = Count; } + /// Number of looprange arguments (always 2: first, count) + unsigned NumArgs = 2; - /// Set location of '('. - void setLParenLoc(SourceLocation Loc) { LParenLoc = Loc; } - - /// Set location of 'first' argument - void setFirstLoc(SourceLocation Loc) { FirstLoc = Loc; } + /// Set the argument expressions. + void setArgs(ArrayRef Args) { + assert(Args.size() == NumArgs && "Expected exactly 2 looprange arguments"); + std::copy(Args.begin(), Args.end(), getTrailingObjects()); + } - /// Set location of 'count' argument - void setCountLoc(SourceLocation Loc) { CountLoc = Loc; } + /// Build an empty clause for deserialization. + explicit OMPLoopRangeClause() + : OMPClause(llvm::omp::OMPC_looprange, {}, {}), NumArgs(2) {} public: - /// Build an AST node for a 'looprange' clause - /// - /// \param StartLoc Starting location of the clause. - /// \param LParenLoc Location of '('. - /// \param ModifierLoc Modifier location. - /// \param + /// Build a 'looprange' clause AST node. static OMPLoopRangeClause * Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation LParenLoc, SourceLocation FirstLoc, SourceLocation CountLoc, - SourceLocation EndLoc, Expr *First, Expr *Count); + SourceLocation EndLoc, ArrayRef Args); - /// Build an empty 'looprange' node for deserialization - /// - /// \param C Context of the AST. + /// Build an empty 'looprange' clause node. 
static OMPLoopRangeClause *CreateEmpty(const ASTContext &C); - /// Returns the location of '(' + // Location getters/setters SourceLocation getLParenLoc() const { return LParenLoc; } - - /// Returns the location of 'first' SourceLocation getFirstLoc() const { return FirstLoc; } - - /// Returns the location of 'count' SourceLocation getCountLoc() const { return CountLoc; } - /// Returns the argument 'first' or nullptr if not set - Expr *getFirst() const { return cast_or_null(First); } + void setLParenLoc(SourceLocation Loc) { LParenLoc = Loc; } + void setFirstLoc(SourceLocation Loc) { FirstLoc = Loc; } + void setCountLoc(SourceLocation Loc) { CountLoc = Loc; } - /// Returns the argument 'count' or nullptr if not set - Expr *getCount() const { return cast_or_null(Count); } + /// Get looprange arguments: first and count + Expr *getFirst() const { return getArgs()[0]; } + Expr *getCount() const { return getArgs()[1]; } - child_range children() { - return child_range(reinterpret_cast(&First), - reinterpret_cast(&Count) + 1); + /// Set looprange arguments: first and count + void setFirst(Expr *E) { getArgs()[0] = E; } + void setCount(Expr *E) { getArgs()[1] = E; } + + MutableArrayRef getArgs() { + return MutableArrayRef(getTrailingObjects(), NumArgs); + } + ArrayRef getArgs() const { + return ArrayRef(getTrailingObjects(), NumArgs); } + child_range children() { + return child_range(reinterpret_cast(getArgs().begin()), + reinterpret_cast(getArgs().end())); + } const_child_range children() const { - auto Children = const_cast(this)->children(); - return const_child_range(Children.begin(), Children.end()); + auto AR = getArgs(); + return const_child_range(reinterpret_cast(AR.begin()), + reinterpret_cast(AR.end())); } child_range used_children() { diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index b6a948a8c6020..cb871c9894d01 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -5807,7 
+5807,6 @@ class OMPReverseDirective final : public OMPLoopTransformationDirective { llvm::omp::OMPD_reverse, StartLoc, EndLoc, 1) { // Reverse produces a single top-level canonical loop nest - setNumGeneratedLoops(1); setNumGeneratedLoopNests(1); } @@ -5878,7 +5877,7 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { EndLoc, NumLoops) { // Interchange produces a single top-level canonical loop // nest, with the exact same amount of total loops - setNumGeneratedLoops(NumLoops); + setNumGeneratedLoops(3 * NumLoops); setNumGeneratedLoopNests(1); } diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index ac4cbe3709a0d..35bb884c0c1f2 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -1491,7 +1491,7 @@ class SemaOpenMP : public SemaBase { bool checkTransformableLoopNest( OpenMPDirectiveKind Kind, Stmt *AStmt, int NumLoops, SmallVectorImpl &LoopHelpers, - Stmt *&Body, SmallVectorImpl> &OriginalInits); + Stmt *&Body, SmallVectorImpl> &OriginalInits); /// @brief Categories of loops encountered during semantic OpenMP loop /// analysis @@ -1554,9 +1554,9 @@ class SemaOpenMP : public SemaBase { Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, - SmallVectorImpl> &OriginalInits, - SmallVectorImpl> &TransformsPreInits, - SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, SmallVectorImpl &LoopCategories, ASTContext &Context, OpenMPDirectiveKind Kind); @@ -1590,9 +1590,9 @@ class SemaOpenMP : public SemaBase { unsigned &NumLoops, SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, - SmallVectorImpl> &OriginalInits, - SmallVectorImpl> &TransformsPreInits, - SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> 
&LoopSequencePreInits, SmallVectorImpl &LoopCategories, ASTContext &Context); /// Helper to keep information about the current `omp begin/end declare diff --git a/clang/lib/AST/OpenMPClause.cpp b/clang/lib/AST/OpenMPClause.cpp index 0b5808eb100e4..e0570262b2a05 100644 --- a/clang/lib/AST/OpenMPClause.cpp +++ b/clang/lib/AST/OpenMPClause.cpp @@ -1026,22 +1026,25 @@ OMPPartialClause *OMPPartialClause::CreateEmpty(const ASTContext &C) { OMPLoopRangeClause * OMPLoopRangeClause::Create(const ASTContext &C, SourceLocation StartLoc, - SourceLocation LParenLoc, SourceLocation EndLoc, - SourceLocation FirstLoc, SourceLocation CountLoc, - Expr *First, Expr *Count) { + SourceLocation LParenLoc, SourceLocation FirstLoc, + SourceLocation CountLoc, SourceLocation EndLoc, + ArrayRef Args) { + + assert(Args.size() == 2 && + "looprange clause must have exactly two arguments"); OMPLoopRangeClause *Clause = CreateEmpty(C); Clause->setLocStart(StartLoc); Clause->setLParenLoc(LParenLoc); - Clause->setLocEnd(EndLoc); Clause->setFirstLoc(FirstLoc); Clause->setCountLoc(CountLoc); - Clause->setFirst(First); - Clause->setCount(Count); + Clause->setLocEnd(EndLoc); + Clause->setArgs(Args); return Clause; } OMPLoopRangeClause *OMPLoopRangeClause::CreateEmpty(const ASTContext &C) { - return new (C) OMPLoopRangeClause(); + void *Mem = C.Allocate(totalSizeToAlloc(2)); + return new (Mem) OMPLoopRangeClause(); } OMPAllocateClause *OMPAllocateClause::Create( diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 1671f07bc2760..268e4220b05b6 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -3241,11 +3241,8 @@ LValue CodeGenFunction::EmitDeclRefLValue(const DeclRefExpr *E) { var, ConvertTypeForMem(VD->getType()), getContext().getDeclAlign(VD)); // No other cases for now. 
- } else { - llvm::dbgs() << "THE DAMN DECLREFEXPR HASN'T BEEN ENTERED IN LOCALDECLMAP\n"; - VD->dumpColor(); + } else llvm_unreachable("DeclRefExpr for Decl not entered in LocalDeclMap?"); - } // Handle threadlocal function locals. if (VD->getTLSKind() != VarDecl::TLS_None) diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h index ce00198c396b6..a983901f560de 100644 --- a/clang/lib/CodeGen/CodeGenFunction.h +++ b/clang/lib/CodeGen/CodeGenFunction.h @@ -5414,10 +5414,6 @@ class CodeGenFunction : public CodeGenTypeCache { /// Set the address of a local variable. void setAddrOfLocalVar(const VarDecl *VD, Address Addr) { - if (LocalDeclMap.count(VD)) { - llvm::errs() << "Warning: VarDecl already exists in map: "; - VD->dumpColor(); - } assert(!LocalDeclMap.count(VD) && "Decl already exists in LocalDeclMap!"); LocalDeclMap.insert({VD, Addr}); } diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index 485eebf23ef93..d2da417e5cfde 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -14159,38 +14159,37 @@ StmtResult SemaOpenMP::ActOnOpenMPTargetTeamsDistributeSimdDirective( getASTContext(), StartLoc, EndLoc, NestedLoopCount, Clauses, AStmt, B); } -// Overloaded base case function +/// Overloaded base case function template static bool tryHandleAs(T *t, F &&) { return false; } -/** - * Tries to recursively cast `t` to one of the given types and invokes `f` if - * successful. - * - * @tparam Class The first type to check. - * @tparam Rest The remaining types to check. - * @tparam T The base type of `t`. - * @tparam F The callable type for the function to invoke upon a successful - * cast. - * @param t The object to be checked. - * @param f The function to invoke if `t` matches `Class`. - * @return `true` if `t` matched any type and `f` was called, otherwise `false`. - */ +/// +/// Tries to recursively cast `t` to one of the given types and invokes `f` if +/// successful. 
+/// +/// @tparam Class The first type to check. +/// @tparam Rest The remaining types to check. +/// @tparam T The base type of `t`. +/// @tparam F The callable type for the function to invoke upon a successful +/// cast. +/// @param t The object to be checked. +/// @param f The function to invoke if `t` matches `Class`. +/// @return `true` if `t` matched any type and `f` was called, otherwise +/// `false`. template static bool tryHandleAs(T *t, F &&f) { if (Class *c = dyn_cast(t)) { f(c); return true; - } else { - return tryHandleAs(t, std::forward(f)); } + return tryHandleAs(t, std::forward(f)); } -// Updates OriginalInits by checking Transform against loop transformation -// directives and appending their pre-inits if a match is found. +/// Updates OriginalInits by checking Transform against loop transformation +/// directives and appending their pre-inits if a match is found. static void updatePreInits(OMPLoopBasedDirective *Transform, - SmallVectorImpl> &PreInits) { + SmallVectorImpl> &PreInits) { if (!tryHandleAs( Transform, [&PreInits](auto *Dir) { @@ -14202,7 +14201,7 @@ static void updatePreInits(OMPLoopBasedDirective *Transform, bool SemaOpenMP::checkTransformableLoopNest( OpenMPDirectiveKind Kind, Stmt *AStmt, int NumLoops, SmallVectorImpl &LoopHelpers, - Stmt *&Body, SmallVectorImpl> &OriginalInits) { + Stmt *&Body, SmallVectorImpl> &OriginalInits) { OriginalInits.emplace_back(); bool Result = OMPLoopBasedDirective::doForAllLoops( AStmt->IgnoreContainers(), /*TryImperfectlyNestedLoops=*/false, NumLoops, @@ -14236,40 +14235,40 @@ bool SemaOpenMP::checkTransformableLoopNest( return Result; } -// Counts the total number of nested loops, including the outermost loop (the -// original loop). PRECONDITION of this visitor is that it must be invoked from -// the original loop to be analyzed. The traversal is stop for Decl's and -// Expr's given that they may contain inner loops that must not be counted. 
-// -// Example AST structure for the code: -// -// int main() { -// #pragma omp fuse -// { -// for (int i = 0; i < 100; i++) { <-- Outer loop -// []() { -// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP -// }; -// for(int j = 0; j < 5; ++j) {} <-- Inner loop -// } -// for (int r = 0; i < 100; i++) { <-- Outer loop -// struct LocalClass { -// void bar() { -// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP -// } -// }; -// for(int k = 0; k < 10; ++k) {} <-- Inner loop -// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP -// } -// } -// } -// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +/// Counts the total number of nested loops, including the outermost loop (the +/// original loop). PRECONDITION of this visitor is that it must be invoked from +/// the original loop to be analyzed. The traversal is stop for Decl's and +/// Expr's given that they may contain inner loops that must not be counted. +/// +/// Example AST structure for the code: +/// +/// int main() { +/// #pragma omp fuse +/// { +/// for (int i = 0; i < 100; i++) { <-- Outer loop +/// []() { +/// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +/// }; +/// for(int j = 0; j < 5; ++j) {} <-- Inner loop +/// } +/// for (int r = 0; i < 100; i++) { <-- Outer loop +/// struct LocalClass { +/// void bar() { +/// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +/// } +/// }; +/// for(int k = 0; k < 10; ++k) {} <-- Inner loop +/// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +/// } +/// } +/// } +/// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { private: unsigned NestedLoopCount = 0; public: - explicit NestedLoopCounterVisitor() {} + explicit NestedLoopCounterVisitor() = default; unsigned getNestedLoopCount() const { return NestedLoopCount; } @@ -14296,7 +14295,7 @@ class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { return true; // Only recurse into CompoundStmt 
(block {}) and loop bodies - if (isa(S) || isa(S) || isa(S)) { + if (isa(S)) { return DynamicRecursiveASTVisitor::TraverseStmt(S); } @@ -14317,19 +14316,18 @@ bool SemaOpenMP::analyzeLoopSequence( Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, - SmallVectorImpl> &OriginalInits, - SmallVectorImpl> &TransformsPreInits, - SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, SmallVectorImpl &LoopCategories, ASTContext &Context, OpenMPDirectiveKind Kind) { VarsWithInheritedDSAType TmpDSA; QualType BaseInductionVarType; - // Helper Lambda to handle storing initialization and body statements for both - // ForStmt and CXXForRangeStmt and checks for any possible mismatch between - // induction variables types - auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, - this, &Context](Stmt *LoopStmt) { + /// Helper Lambda to handle storing initialization and body statements for + /// both ForStmt and CXXForRangeStmt and checks for any possible mismatch + /// between induction variables types + auto StoreLoopStatements = [&](Stmt *LoopStmt) { if (auto *For = dyn_cast(LoopStmt)) { OriginalInits.back().push_back(For->getInit()); ForStmts.push_back(For); @@ -14357,16 +14355,11 @@ bool SemaOpenMP::analyzeLoopSequence( } }; - // Helper lambda functions to encapsulate the processing of different - // derivations of the canonical loop sequence grammar - // - // Modularized code for handling loop generation and transformations - auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, - &OriginalInits, &TransformsPreInits, - &LoopCategories, &LoopSeqSize, &NumLoops, Kind, - &TmpDSA, &ForStmts, &Context, - &LoopSequencePreInits, this](Stmt *Child) { - auto LoopTransform = dyn_cast(Child); + /// Helper lambda functions to encapsulate the processing of different + /// derivations of 
the canonical loop sequence grammar + /// Modularized code for handling loop generation and transformations + auto AnalyzeLoopGeneration = [&](Stmt *Child) { + auto *LoopTransform = dyn_cast(Child); Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); unsigned NumGeneratedLoops = LoopTransform->getNumGeneratedLoops(); @@ -14377,9 +14370,8 @@ bool SemaOpenMP::analyzeLoopSequence( LoopSeqSize += NumGeneratedLoopNests; NumLoops += NumGeneratedLoops; return true; - } - // Unroll full (0 loops produced) - else { + } else { + // Unroll full (0 loops produced) Diag(Child->getBeginLoc(), diag::err_omp_not_for) << 0 << getOpenMPDirectiveName(Kind); return false; @@ -14406,9 +14398,8 @@ bool SemaOpenMP::analyzeLoopSequence( LoopHelpers, ForStmts, OriginalInits, TransformsPreInits, LoopSequencePreInits, LoopCategories, Context, Kind); - } - // Vast majority: (Tile, Unroll, Stripe, Reverse, Interchange, Fuse all) - else { + } else { + // Vast majority: (Tile, Unroll, Stripe, Reverse, Interchange, Fuse all) // Process the transformed loop statement OriginalInits.emplace_back(); TransformsPreInits.emplace_back(); @@ -14424,7 +14415,7 @@ bool SemaOpenMP::analyzeLoopSequence( << getOpenMPDirectiveName(Kind); return false; } - storeLoopStatements(TransformedStmt); + StoreLoopStatements(TransformedStmt); updatePreInits(LoopTransform, TransformsPreInits); NumLoops += NumGeneratedLoops; @@ -14433,10 +14424,8 @@ bool SemaOpenMP::analyzeLoopSequence( } }; - // Modularized code for handling regular canonical loops - auto analyzeRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, - &LoopSeqSize, &NumLoops, Kind, &TmpDSA, - &LoopCategories, this](Stmt *Child) { + /// Modularized code for handling regular canonical loops + auto AnalyzeRegularLoop = [&](Stmt *Child) { OriginalInits.emplace_back(); LoopHelpers.emplace_back(); LoopCategories.push_back(OMPLoopCategory::RegularLoop); @@ -14451,19 
+14440,19 @@ bool SemaOpenMP::analyzeLoopSequence( return false; } - storeLoopStatements(Child); + StoreLoopStatements(Child); auto NLCV = NestedLoopCounterVisitor(); NLCV.TraverseStmt(Child); NumLoops += NLCV.getNestedLoopCount(); return true; }; - // Helper functions to validate canonical loop sequence grammar is valid - auto isLoopSequenceDerivation = [](auto *Child) { - return isa(Child) || isa(Child) || - isa(Child); + /// Helper functions to validate loop sequence grammar derivations + auto IsLoopSequenceDerivation = [](auto *Child) { + return isa(Child); }; - auto isLoopGeneratingStmt = [](auto *Child) { + /// Helper functions to validate loop generating grammar derivations + auto IsLoopGeneratingStmt = [](auto *Child) { return isa(Child); }; @@ -14474,7 +14463,7 @@ bool SemaOpenMP::analyzeLoopSequence( continue; // Skip over non-loop-sequence statements - if (!isLoopSequenceDerivation(Child)) { + if (!IsLoopSequenceDerivation(Child)) { Child = Child->IgnoreContainers(); // Ignore empty compound statement @@ -14494,17 +14483,17 @@ bool SemaOpenMP::analyzeLoopSequence( } } // Regular loop sequence handling - if (isLoopSequenceDerivation(Child)) { - if (isLoopGeneratingStmt(Child)) { - if (!analyzeLoopGeneration(Child)) { + if (IsLoopSequenceDerivation(Child)) { + if (IsLoopGeneratingStmt(Child)) { + if (!AnalyzeLoopGeneration(Child)) return false; - } - // analyzeLoopGeneration updates Loop Sequence size accordingly + + // AnalyzeLoopGeneration updates Loop Sequence size accordingly } else { - if (!analyzeRegularLoop(Child)) { + if (!AnalyzeRegularLoop(Child)) return false; - } + // Update the Loop Sequence size by one ++LoopSeqSize; } @@ -14523,9 +14512,9 @@ bool SemaOpenMP::checkTransformableLoopSequence( unsigned &NumLoops, SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, - SmallVectorImpl> &OriginalInits, - SmallVectorImpl> &TransformsPreInits, - SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> 
&TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, SmallVectorImpl &LoopCategories, ASTContext &Context) { // Checks whether the given statement is a compound statement @@ -14561,10 +14550,9 @@ bool SemaOpenMP::checkTransformableLoopSequence( // Recursive entry point to process the main loop sequence if (!analyzeLoopSequence(AStmt, LoopSeqSize, NumLoops, LoopHelpers, ForStmts, OriginalInits, TransformsPreInits, - LoopSequencePreInits, LoopCategories, Context, - Kind)) { + LoopSequencePreInits, LoopCategories, Context, Kind)) return false; - } + if (LoopSeqSize <= 0) { Diag(AStmt->getBeginLoc(), diag::err_omp_empty_loop_sequence) << getOpenMPDirectiveName(Kind); @@ -14656,7 +14644,7 @@ StmtResult SemaOpenMP::ActOnOpenMPTileDirective(ArrayRef Clauses, // Verify and diagnose loop nest. SmallVector LoopHelpers(NumLoops); Stmt *Body = nullptr; - SmallVector, 4> OriginalInits; + SmallVector, 4> OriginalInits; if (!checkTransformableLoopNest(OMPD_tile, AStmt, NumLoops, LoopHelpers, Body, OriginalInits)) return StmtError(); @@ -14933,7 +14921,7 @@ StmtResult SemaOpenMP::ActOnOpenMPStripeDirective(ArrayRef Clauses, // Verify and diagnose loop nest. 
SmallVector LoopHelpers(NumLoops); Stmt *Body = nullptr; - SmallVector, 4> OriginalInits; + SmallVector, 4> OriginalInits; if (!checkTransformableLoopNest(OMPD_stripe, AStmt, NumLoops, LoopHelpers, Body, OriginalInits)) return StmtError(); @@ -15194,7 +15182,7 @@ StmtResult SemaOpenMP::ActOnOpenMPUnrollDirective(ArrayRef Clauses, Stmt *Body = nullptr; SmallVector LoopHelpers( NumLoops); - SmallVector, NumLoops + 1> OriginalInits; + SmallVector, NumLoops + 1> OriginalInits; if (!checkTransformableLoopNest(OMPD_unroll, AStmt, NumLoops, LoopHelpers, Body, OriginalInits)) return StmtError(); @@ -15462,7 +15450,7 @@ StmtResult SemaOpenMP::ActOnOpenMPReverseDirective(Stmt *AStmt, Stmt *Body = nullptr; SmallVector LoopHelpers( NumLoops); - SmallVector, NumLoops + 1> OriginalInits; + SmallVector, NumLoops + 1> OriginalInits; if (!checkTransformableLoopNest(OMPD_reverse, AStmt, NumLoops, LoopHelpers, Body, OriginalInits)) return StmtError(); @@ -15654,7 +15642,7 @@ StmtResult SemaOpenMP::ActOnOpenMPInterchangeDirective( // Verify and diagnose loop nest. 
SmallVector LoopHelpers(NumLoops); Stmt *Body = nullptr; - SmallVector, 2> OriginalInits; + SmallVector, 2> OriginalInits; if (!checkTransformableLoopNest(OMPD_interchange, AStmt, NumLoops, LoopHelpers, Body, OriginalInits)) return StmtError(); @@ -15841,9 +15829,8 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, CaptureVars CopyTransformer(SemaRef); // Ensure the structured block is not empty - if (!AStmt) { + if (!AStmt) return StmtError(); - } unsigned NumLoops = 1; unsigned LoopSeqSize = 1; @@ -15862,16 +15849,15 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // Also collect the HelperExprs, Loop Stmts, Inits, and Number of loops SmallVector LoopHelpers; SmallVector LoopStmts; - SmallVector> OriginalInits; - SmallVector> TransformsPreInits; - SmallVector> LoopSequencePreInits; + SmallVector> OriginalInits; + SmallVector> TransformsPreInits; + SmallVector> LoopSequencePreInits; SmallVector LoopCategories; if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, LoopHelpers, LoopStmts, OriginalInits, TransformsPreInits, LoopSequencePreInits, - LoopCategories, Context)) { + LoopCategories, Context)) return StmtError(); - } // Handle clauses, which can be any of the following: [looprange, apply] const OMPLoopRangeClause *LRC = @@ -15961,9 +15947,8 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // expressions. Generates both the variable declaration and the corresponding // initialization statement. 
auto CreateHelperVarAndStmt = - [&SemaRef = this->SemaRef, &Context, &CopyTransformer, - &IVType](Expr *ExprToCopy, const std::string &BaseName, unsigned I, - bool NeedsNewVD = false) { + [&, &SemaRef = SemaRef](Expr *ExprToCopy, const std::string &BaseName, + unsigned I, bool NeedsNewVD = false) { Expr *TransformedExpr = AssertSuccess(CopyTransformer.TransformExpr(ExprToCopy)); if (!TransformedExpr) @@ -16007,9 +15992,8 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // Transformations that apply this concept: Loopranged Fuse, Split if (!LoopSequencePreInits.empty()) { for (const auto &LTPreInits : LoopSequencePreInits) { - if (!LTPreInits.empty()) { + if (!LTPreInits.empty()) llvm::append_range(PreInits, LTPreInits); - } } } @@ -16038,9 +16022,9 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // Order matters: pre-inits may define variables used in the original // inits such as upper bounds... auto TransformPreInit = TransformsPreInits[TransformIndex++]; - if (!TransformPreInit.empty()) { + if (!TransformPreInit.empty()) llvm::append_range(PreInits, TransformPreInit); - } + addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], PreInits); } @@ -17459,13 +17443,15 @@ OMPClause *SemaOpenMP::ActOnOpenMPLoopRangeClause( if (CountVal.isInvalid()) Count = nullptr; + SmallVector ArgsVec = {First, Count}; + // OpenMP [6.0, Restrictions] // first + count - 1 must not evaluate to a value greater than the // loop sequence length of the associated canonical loop sequence. 
// This check must be performed afterwards due to the delayed // parsing and computation of the associated loop sequence return OMPLoopRangeClause::Create(getASTContext(), StartLoc, LParenLoc, - FirstLoc, CountLoc, EndLoc, First, Count); + FirstLoc, CountLoc, EndLoc, ArgsVec); } OMPClause *SemaOpenMP::ActOnOpenMPAlignClause(Expr *A, SourceLocation StartLoc, diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 4af2b4909fcb6..ad4f54e6fdcc5 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3389,9 +3389,6 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Looprange &x) { ContextDirectiveAsFortran()); } -void OmpStructureChecker::Enter(const parser::OmpClause::FreeAgent &x) { - context_.Say(GetContext().clauseSource, - "FREE_AGENT clause is not implemented yet"_err_en_US, // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
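The OpenMP 6.0 restriction quoted in the comment above (first + count - 1 must not exceed the loop sequence length) can be illustrated in isolation. What follows is only a minimal standalone sketch, not the code in SemaOpenMP.cpp: `LoopRange`, `rangeEnd` and `exceedsLoopSequence` are hypothetical names introduced for illustration, and the real check operates on evaluated `Expr` values once the loop sequence has been analyzed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two looprange arguments (not a Clang type).
struct LoopRange {
  std::uint64_t First; // 1-based index of the first loop in the range
  std::uint64_t Count; // number of consecutive loops in the range
};

// Index of the last loop covered by the range: first + count - 1.
inline std::uint64_t rangeEnd(LoopRange R) {
  assert(R.First >= 1 && R.Count >= 1 && "looprange arguments are positive");
  return R.First + R.Count - 1;
}

// True when the range reaches past the end of a loop sequence that holds
// LoopSeqSize top-level loops, i.e. when the restriction is violated.
inline bool exceedsLoopSequence(LoopRange R, std::uint64_t LoopSeqSize) {
  return rangeEnd(R) > LoopSeqSize;
}
```

With the values used in the fuse_messages.cpp cases of this patch series, looprange(1,6) over five loops gives a range end of 6 and looprange(2,3) over three loops gives a range end of 4, so both would be rejected under this model.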
>From 4100dfe4dd04ed1c953ea4e38a65e867c8e9f73f Mon Sep 17 00:00:00 2001 From: eZWALT Date: Thu, 22 May 2025 10:39:39 +0000 Subject: [PATCH 9/9] Removed unncessary warning and updated tests accordingly --- .../clang/Basic/DiagnosticSemaKinds.td | 3 -- clang/lib/Sema/SemaOpenMP.cpp | 21 +-------- clang/test/OpenMP/fuse_messages.cpp | 43 +++++++++++++++---- 3 files changed, 35 insertions(+), 32 deletions(-) diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index a6ae0de004c8a..d1790cea6cc45 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -11558,9 +11558,6 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; -def warn_omp_different_loop_ind_var_types : Warning < - "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">, - InGroup; def err_omp_not_canonical_loop : Error < "loop after '#pragma omp %0' is not in canonical form">; def err_omp_not_a_loop_sequence : Error < diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index d2da417e5cfde..76484b577f9c1 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -14323,31 +14323,12 @@ bool SemaOpenMP::analyzeLoopSequence( OpenMPDirectiveKind Kind) { VarsWithInheritedDSAType TmpDSA; - QualType BaseInductionVarType; /// Helper Lambda to handle storing initialization and body statements for - /// both ForStmt and CXXForRangeStmt and checks for any possible mismatch - /// between induction variables types + /// both ForStmt and CXXForRangeStmt auto StoreLoopStatements = [&](Stmt *LoopStmt) { if (auto *For = dyn_cast(LoopStmt)) { OriginalInits.back().push_back(For->getInit()); ForStmts.push_back(For); - // Extract induction variable - if (auto 
*InitStmt = dyn_cast_or_null(For->getInit())) { - if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { - QualType InductionVarType = InitDecl->getType().getCanonicalType(); - - // Compare with first loop type - if (BaseInductionVarType.isNull()) { - BaseInductionVarType = InductionVarType; - } else if (!Context.hasSameType(BaseInductionVarType, - InductionVarType)) { - Diag(InitDecl->getBeginLoc(), - diag::warn_omp_different_loop_ind_var_types) - << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType - << InductionVarType; - } - } - } } else { auto *CXXFor = cast(LoopStmt); OriginalInits.back().push_back(CXXFor->getBeginStmt()); diff --git a/clang/test/OpenMP/fuse_messages.cpp b/clang/test/OpenMP/fuse_messages.cpp index 2a2491d008a0b..4902d424373e5 100644 --- a/clang/test/OpenMP/fuse_messages.cpp +++ b/clang/test/OpenMP/fuse_messages.cpp @@ -70,15 +70,6 @@ void func() { for(int j = 0; j < 10; ++j); } - //expected-warning@+5 {{loop sequence following '#pragma omp fuse' contains induction variables of differing types: 'int' and 'unsigned int'}} - //expected-warning@+5 {{loop sequence following '#pragma omp fuse' contains induction variables of differing types: 'int' and 'long long'}} - #pragma omp fuse - { - for(int i = 0; i < 10; ++i); - for(unsigned int j = 0; j < 10; ++j); - for(long long k = 0; k < 100; ++k); - } - //expected-warning@+2 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}} #pragma omp fuse { @@ -123,6 +114,40 @@ void func() { for(int j = 0; j < 100; ++j); for(int k = 0; k < 50; ++k); } + + //expected-error@+1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '6' is greater than the total number of loops '5'}} + #pragma omp fuse looprange(1,6) + { + for(int i = 0; i < 10; ++i); + 
for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + // This fusion results in 2 loops + #pragma omp fuse looprange(1,2) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } + } + + //expected-error@+1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '4' is greater than the total number of loops '3'}} + #pragma omp fuse looprange(2,3) + { + #pragma omp unroll partial(2) + for(int i = 0; i < 10; ++i); + + #pragma omp reverse + for(int j = 0; j < 10; ++j); + + #pragma omp fuse + { + { + #pragma omp reverse + for(int j = 0; j < 10; ++j); + } + for(int k = 0; k < 50; ++k); + } + } } // In a template context, but expression itself not instantiation-dependent From flang-commits at lists.llvm.org Thu May 22 13:30:38 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 22 May 2025 13:30:38 -0700 (PDT) Subject: [flang-commits] [flang] d6d52a4 - [flang][cuda] Do not generate cuf.alloc/cuf.free in device context (#141117) Message-ID: <682f896e.170a0220.3021e0.b1d1@mx.google.com> Author: Valentin Clement (バレンタイン クレメン) Date: 2025-05-22T13:30:34-07:00 New Revision: d6d52a4abc71c583bd5391c5c0b36483c65081fe URL: https://github.com/llvm/llvm-project/commit/d6d52a4abc71c583bd5391c5c0b36483c65081fe DIFF: https://github.com/llvm/llvm-project/commit/d6d52a4abc71c583bd5391c5c0b36483c65081fe.diff LOG: [flang][cuda] Do not generate cuf.alloc/cuf.free in device context (#141117) `cuf.alloc` and `cuf.free` are converted to `fir.alloca` or deleted when in device context during the CUFOpConversion pass. Do not generate them in lowering to avoid confusion. 
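The effect described in the log message can be modelled without MLIR. The sketch below is a deliberately simplified illustration under stated assumptions, not the code in ConvertVariable.cpp: `ExecContext`, `chooseLocalAllocOp` and `needsCufFree` are hypothetical names, and the returned strings merely name the operation that lowering would be expected to emit in each case.

```cpp
#include <string>

// Hypothetical model of where a local variable is being lowered.
enum class ExecContext { Host, Device };

// Names the op chosen for a local variable that carries a CUDA data
// attribute. Shared variables keep cuf.shared_memory; otherwise cuf.alloc is
// only used on the host, and device code falls back to a plain fir.alloca.
inline std::string chooseLocalAllocOp(ExecContext Ctx, bool IsShared) {
  if (IsShared)
    return "cuf.shared_memory";
  if (Ctx == ExecContext::Host)
    return "cuf.alloc";
  return "fir.alloca";
}

// A matching cuf.free is only registered when the variable needs a CUDA
// allocation and lowering is not happening in device context.
inline bool needsCufFree(ExecContext Ctx, bool NeedsCudaAlloc) {
  return NeedsCudaAlloc && Ctx == ExecContext::Host;
}
```

This mirrors the updated test expectations in the patch: the device subroutine `sub8` now checks for `fir.alloca` instead of `cuf.alloc`, and the shared-memory test no longer expects a `cuf.free`.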
Added: Modified: flang/lib/Lower/ConvertVariable.cpp flang/test/Lower/CUDA/cuda-allocatable.cuf flang/test/Lower/CUDA/cuda-shared.cuf Removed: ################################################################################ diff --git a/flang/lib/Lower/ConvertVariable.cpp b/flang/lib/Lower/ConvertVariable.cpp index 372c71b6d4821..a28596bfd0099 100644 --- a/flang/lib/Lower/ConvertVariable.cpp +++ b/flang/lib/Lower/ConvertVariable.cpp @@ -25,6 +25,7 @@ #include "flang/Lower/StatementContext.h" #include "flang/Lower/Support/Utils.h" #include "flang/Lower/SymbolMap.h" +#include "flang/Optimizer/Builder/CUFCommon.h" #include "flang/Optimizer/Builder/Character.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" @@ -735,8 +736,10 @@ static mlir::Value createNewLocal(Fortran::lower::AbstractConverter &converter, if (dataAttr.getValue() == cuf::DataAttribute::Shared) return builder.create(loc, ty, nm, symNm, lenParams, indices); - return builder.create(loc, ty, nm, symNm, dataAttr, lenParams, - indices); + + if (!cuf::isCUDADeviceContext(builder.getRegion())) + return builder.create(loc, ty, nm, symNm, dataAttr, + lenParams, indices); } // Let the builder do all the heavy lifting. 
@@ -1072,8 +1075,9 @@ static void instantiateLocal(Fortran::lower::AbstractConverter &converter, if (mustBeDefaultInitializedAtRuntime(var)) Fortran::lower::defaultInitializeAtRuntime(converter, var.getSymbol(), symMap); - if (Fortran::semantics::NeedCUDAAlloc(var.getSymbol())) { - auto *builder = &converter.getFirOpBuilder(); + auto *builder = &converter.getFirOpBuilder(); + if (Fortran::semantics::NeedCUDAAlloc(var.getSymbol()) && + !cuf::isCUDADeviceContext(builder->getRegion())) { cuf::DataAttributeAttr dataAttr = Fortran::lower::translateSymbolCUFDataAttribute(builder->getContext(), var.getSymbol()); diff --git a/flang/test/Lower/CUDA/cuda-allocatable.cuf b/flang/test/Lower/CUDA/cuda-allocatable.cuf index cec10dda839e9..36e768bd7d92c 100644 --- a/flang/test/Lower/CUDA/cuda-allocatable.cuf +++ b/flang/test/Lower/CUDA/cuda-allocatable.cuf @@ -186,7 +186,7 @@ attributes(global) subroutine sub8() end subroutine ! CHECK-LABEL: func.func @_QPsub8() attributes {cuf.proc_attr = #cuf.cuda_proc} -! CHECK: %[[DESC:.*]] = cuf.alloc !fir.box>> {bindc_name = "a", data_attr = #cuf.cuda, uniq_name = "_QFsub8Ea"} -> !fir.ref>>> +! CHECK: %[[DESC:.*]] = fir.alloca !fir.box>> {bindc_name = "a", uniq_name = "_QFsub8Ea"} ! CHECK: %[[A:.*]]:2 = hlfir.declare %[[DESC]] {data_attr = #cuf.cuda, fortran_attrs = #fir.var_attrs, uniq_name = "_QFsub8Ea"} : (!fir.ref>>>) -> (!fir.ref>>>, !fir.ref>>>) ! CHECK: %[[HEAP:.*]] = fir.allocmem !fir.array, %{{.*}} {fir.must_be_heap = true, uniq_name = "_QFsub8Ea.alloc"} ! CHECK: %[[SHAPE:.*]] = fir.shape %{{.*}} : (index) -> !fir.shape<1> diff --git a/flang/test/Lower/CUDA/cuda-shared.cuf b/flang/test/Lower/CUDA/cuda-shared.cuf index 565857f01bdb8..f41011df06ae7 100644 --- a/flang/test/Lower/CUDA/cuda-shared.cuf +++ b/flang/test/Lower/CUDA/cuda-shared.cuf @@ -9,5 +9,4 @@ end subroutine ! CHECK-LABEL: func.func @_QPsharedmem() attributes {cuf.proc_attr = #cuf.cuda_proc} ! 
CHECK: %{{.*}} = cuf.shared_memory !fir.array<32xf32> {bindc_name = "s", uniq_name = "_QFsharedmemEs"} -> !fir.ref> -! CHECK: cuf.free %{{.*}}#0 : !fir.ref {data_attr = #cuf.cuda} ! CHECK-NOT: cuf.free From flang-commits at lists.llvm.org Thu May 22 13:30:40 2025 From: flang-commits at lists.llvm.org (Valentin Clement (バレンタイン クレメン) via flang-commits) Date: Thu, 22 May 2025 13:30:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Do not generate cuf.alloc/cuf.free in device context (PR #141117) In-Reply-To: Message-ID: <682f8970.170a0220.177fd1.1fcf@mx.google.com> https://github.com/clementval closed https://github.com/llvm/llvm-project/pull/141117 From flang-commits at lists.llvm.org Thu May 22 13:38:20 2025 From: flang-commits at lists.llvm.org (Jan Svoboda via flang-commits) Date: Thu, 22 May 2025 13:38:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix build after 9e306ad4 (PR #141134) Message-ID: https://github.com/jansvoboda11 created https://github.com/llvm/llvm-project/pull/141134 None >From a81afb593b11dda3b5b7d4165e8839456e4e9c41 Mon Sep 17 00:00:00 2001 From: Jan Svoboda Date: Thu, 22 May 2025 13:32:24 -0700 Subject: [PATCH] [flang] Fix build after 9e306ad4 --- flang/include/flang/Frontend/CompilerInstance.h | 2 +- flang/include/flang/Frontend/CompilerInvocation.h | 2 +- flang/include/flang/Frontend/TextDiagnosticPrinter.h | 4 ++-- flang/lib/Frontend/CompilerInstance.cpp | 5 ++--- flang/lib/Frontend/TextDiagnosticPrinter.cpp | 2 +- flang/tools/flang-driver/driver.cpp | 10 +++++----- flang/tools/flang-driver/fc1_main.cpp | 5 ++--- flang/unittests/Frontend/CompilerInstanceTest.cpp | 5 +++-- 8 files changed, 17 insertions(+), 18 deletions(-) diff --git a/flang/include/flang/Frontend/CompilerInstance.h b/flang/include/flang/Frontend/CompilerInstance.h index 4ad95c9df42d7..4234e13597cd7 100644 --- a/flang/include/flang/Frontend/CompilerInstance.h +++ 
b/flang/include/flang/Frontend/CompilerInstance.h @@ -347,7 +347,7 @@ class CompilerInstance { /// /// \return The new object on success, or null on failure. static clang::IntrusiveRefCntPtr - createDiagnostics(clang::DiagnosticOptions *opts, + createDiagnostics(clang::DiagnosticOptions &opts, clang::DiagnosticConsumer *client = nullptr, bool shouldOwnClient = true); void createDiagnostics(clang::DiagnosticConsumer *client = nullptr, diff --git a/flang/include/flang/Frontend/CompilerInvocation.h b/flang/include/flang/Frontend/CompilerInvocation.h index d6ee1511cdb4b..06978029435b7 100644 --- a/flang/include/flang/Frontend/CompilerInvocation.h +++ b/flang/include/flang/Frontend/CompilerInvocation.h @@ -43,7 +43,7 @@ bool parseDiagnosticArgs(clang::DiagnosticOptions &opts, class CompilerInvocationBase { public: /// Options controlling the diagnostic engine. - llvm::IntrusiveRefCntPtr diagnosticOpts; + std::shared_ptr diagnosticOpts; /// Options for the preprocessor. std::shared_ptr preprocessorOpts; diff --git a/flang/include/flang/Frontend/TextDiagnosticPrinter.h b/flang/include/flang/Frontend/TextDiagnosticPrinter.h index 9c99a0c314351..4913713b6c365 100644 --- a/flang/include/flang/Frontend/TextDiagnosticPrinter.h +++ b/flang/include/flang/Frontend/TextDiagnosticPrinter.h @@ -37,13 +37,13 @@ class TextDiagnostic; class TextDiagnosticPrinter : public clang::DiagnosticConsumer { raw_ostream &os; - llvm::IntrusiveRefCntPtr diagOpts; + clang::DiagnosticOptions &diagOpts; /// A string to prefix to error messages. 
std::string prefix; public: - TextDiagnosticPrinter(raw_ostream &os, clang::DiagnosticOptions *diags); + TextDiagnosticPrinter(raw_ostream &os, clang::DiagnosticOptions &diags); ~TextDiagnosticPrinter() override; /// Set the diagnostic printer prefix string, which will be printed at the diff --git a/flang/lib/Frontend/CompilerInstance.cpp b/flang/lib/Frontend/CompilerInstance.cpp index cbd2c58eeeb47..2e0f91fb0521c 100644 --- a/flang/lib/Frontend/CompilerInstance.cpp +++ b/flang/lib/Frontend/CompilerInstance.cpp @@ -226,12 +226,11 @@ bool CompilerInstance::executeAction(FrontendAction &act) { void CompilerInstance::createDiagnostics(clang::DiagnosticConsumer *client, bool shouldOwnClient) { - diagnostics = - createDiagnostics(&getDiagnosticOpts(), client, shouldOwnClient); + diagnostics = createDiagnostics(getDiagnosticOpts(), client, shouldOwnClient); } clang::IntrusiveRefCntPtr -CompilerInstance::createDiagnostics(clang::DiagnosticOptions *opts, +CompilerInstance::createDiagnostics(clang::DiagnosticOptions &opts, clang::DiagnosticConsumer *client, bool shouldOwnClient) { clang::IntrusiveRefCntPtr diagID( diff --git a/flang/lib/Frontend/TextDiagnosticPrinter.cpp b/flang/lib/Frontend/TextDiagnosticPrinter.cpp index 65626827af3b3..043440328b1f6 100644 --- a/flang/lib/Frontend/TextDiagnosticPrinter.cpp +++ b/flang/lib/Frontend/TextDiagnosticPrinter.cpp @@ -27,7 +27,7 @@ using namespace Fortran::frontend; TextDiagnosticPrinter::TextDiagnosticPrinter(raw_ostream &diagOs, - clang::DiagnosticOptions *diags) + clang::DiagnosticOptions &diags) : os(diagOs), diagOpts(diags) {} TextDiagnosticPrinter::~TextDiagnosticPrinter() {} diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..35cc2efc0ac01 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -43,9 +43,9 @@ std::string getExecutablePath(const char *argv0) { // This lets us create the DiagnosticsEngine with a properly-filled-out 
// DiagnosticOptions instance -static clang::DiagnosticOptions * +static std::unique_ptr createAndPopulateDiagOpts(llvm::ArrayRef argv) { - auto *diagOpts = new clang::DiagnosticOptions; + auto diagOpts = std::make_unique(); // Ignore missingArgCount and the return value of ParseDiagnosticArgs. // Any errors that would be diagnosed here will also be diagnosed later, @@ -114,17 +114,17 @@ int main(int argc, const char **argv) { // Not in the frontend mode - continue in the compiler driver mode. // Create DiagnosticsEngine for the compiler driver - llvm::IntrusiveRefCntPtr diagOpts = + std::unique_ptr diagOpts = createAndPopulateDiagOpts(args); llvm::IntrusiveRefCntPtr diagID( new clang::DiagnosticIDs()); Fortran::frontend::TextDiagnosticPrinter *diagClient = - new Fortran::frontend::TextDiagnosticPrinter(llvm::errs(), &*diagOpts); + new Fortran::frontend::TextDiagnosticPrinter(llvm::errs(), *diagOpts); diagClient->setPrefix( std::string(llvm::sys::path::stem(getExecutablePath(args[0])))); - clang::DiagnosticsEngine diags(diagID, &*diagOpts, diagClient); + clang::DiagnosticsEngine diags(diagID, *diagOpts, diagClient); // Prepare the driver clang::driver::Driver theDriver(driverPath, diff --git a/flang/tools/flang-driver/fc1_main.cpp b/flang/tools/flang-driver/fc1_main.cpp index 49535275d084d..f2cd513d0028c 100644 --- a/flang/tools/flang-driver/fc1_main.cpp +++ b/flang/tools/flang-driver/fc1_main.cpp @@ -67,9 +67,8 @@ int fc1_main(llvm::ArrayRef argv, const char *argv0) { // for parsing the arguments llvm::IntrusiveRefCntPtr diagID( new clang::DiagnosticIDs()); - llvm::IntrusiveRefCntPtr diagOpts = - new clang::DiagnosticOptions(); - clang::DiagnosticsEngine diags(diagID, &*diagOpts, diagsBuffer); + clang::DiagnosticOptions diagOpts; + clang::DiagnosticsEngine diags(diagID, diagOpts, diagsBuffer); bool success = CompilerInvocation::createFromArgs(flang->getInvocation(), argv, diags, argv0); diff --git a/flang/unittests/Frontend/CompilerInstanceTest.cpp 
b/flang/unittests/Frontend/CompilerInstanceTest.cpp index 3fe2f063e996a..bf62a64be229a 100644 --- a/flang/unittests/Frontend/CompilerInstanceTest.cpp +++ b/flang/unittests/Frontend/CompilerInstanceTest.cpp @@ -67,14 +67,15 @@ TEST(CompilerInstance, AllowDiagnosticLogWithUnownedDiagnosticConsumer) { // 1. Set-up a basic DiagnosticConsumer std::string diagnosticOutput; llvm::raw_string_ostream diagnosticsOS(diagnosticOutput); + clang::DiagnosticOptions diagPrinterOpts; auto diagPrinter = std::make_unique( - diagnosticsOS, new clang::DiagnosticOptions()); + diagnosticsOS, *diagPrinterOpts; // 2. Create a CompilerInstance (to manage a DiagnosticEngine) CompilerInstance compInst; // 3. Set-up DiagnosticOptions - auto diagOpts = new clang::DiagnosticOptions(); + clang::DiagnosticOptions diagOpts; // Tell the diagnostics engine to emit the diagnostic log to STDERR. This // ensures that a chained diagnostic consumer is created so that the test can // exercise the unowned diagnostic consumer in a chained consumer. From flang-commits at lists.llvm.org Thu May 22 13:38:53 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 22 May 2025 13:38:53 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix build after 9e306ad4 (PR #141134) In-Reply-To: Message-ID: <682f8b5d.630a0220.3d0cee.837d@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-driver Author: Jan Svoboda (jansvoboda11)
Changes --- Full diff: https://github.com/llvm/llvm-project/pull/141134.diff 8 Files Affected: - (modified) flang/include/flang/Frontend/CompilerInstance.h (+1-1) - (modified) flang/include/flang/Frontend/CompilerInvocation.h (+1-1) - (modified) flang/include/flang/Frontend/TextDiagnosticPrinter.h (+2-2) - (modified) flang/lib/Frontend/CompilerInstance.cpp (+2-3) - (modified) flang/lib/Frontend/TextDiagnosticPrinter.cpp (+1-1) - (modified) flang/tools/flang-driver/driver.cpp (+5-5) - (modified) flang/tools/flang-driver/fc1_main.cpp (+2-3) - (modified) flang/unittests/Frontend/CompilerInstanceTest.cpp (+3-2) ``````````diff diff --git a/flang/include/flang/Frontend/CompilerInstance.h b/flang/include/flang/Frontend/CompilerInstance.h index 4ad95c9df42d7..4234e13597cd7 100644 --- a/flang/include/flang/Frontend/CompilerInstance.h +++ b/flang/include/flang/Frontend/CompilerInstance.h @@ -347,7 +347,7 @@ class CompilerInstance { /// /// \return The new object on success, or null on failure. static clang::IntrusiveRefCntPtr - createDiagnostics(clang::DiagnosticOptions *opts, + createDiagnostics(clang::DiagnosticOptions &opts, clang::DiagnosticConsumer *client = nullptr, bool shouldOwnClient = true); void createDiagnostics(clang::DiagnosticConsumer *client = nullptr, diff --git a/flang/include/flang/Frontend/CompilerInvocation.h b/flang/include/flang/Frontend/CompilerInvocation.h index d6ee1511cdb4b..06978029435b7 100644 --- a/flang/include/flang/Frontend/CompilerInvocation.h +++ b/flang/include/flang/Frontend/CompilerInvocation.h @@ -43,7 +43,7 @@ bool parseDiagnosticArgs(clang::DiagnosticOptions &opts, class CompilerInvocationBase { public: /// Options controlling the diagnostic engine. - llvm::IntrusiveRefCntPtr diagnosticOpts; + std::shared_ptr diagnosticOpts; /// Options for the preprocessor. 
std::shared_ptr preprocessorOpts; diff --git a/flang/include/flang/Frontend/TextDiagnosticPrinter.h b/flang/include/flang/Frontend/TextDiagnosticPrinter.h index 9c99a0c314351..4913713b6c365 100644 --- a/flang/include/flang/Frontend/TextDiagnosticPrinter.h +++ b/flang/include/flang/Frontend/TextDiagnosticPrinter.h @@ -37,13 +37,13 @@ class TextDiagnostic; class TextDiagnosticPrinter : public clang::DiagnosticConsumer { raw_ostream &os; - llvm::IntrusiveRefCntPtr diagOpts; + clang::DiagnosticOptions &diagOpts; /// A string to prefix to error messages. std::string prefix; public: - TextDiagnosticPrinter(raw_ostream &os, clang::DiagnosticOptions *diags); + TextDiagnosticPrinter(raw_ostream &os, clang::DiagnosticOptions &diags); ~TextDiagnosticPrinter() override; /// Set the diagnostic printer prefix string, which will be printed at the diff --git a/flang/lib/Frontend/CompilerInstance.cpp b/flang/lib/Frontend/CompilerInstance.cpp index cbd2c58eeeb47..2e0f91fb0521c 100644 --- a/flang/lib/Frontend/CompilerInstance.cpp +++ b/flang/lib/Frontend/CompilerInstance.cpp @@ -226,12 +226,11 @@ bool CompilerInstance::executeAction(FrontendAction &act) { void CompilerInstance::createDiagnostics(clang::DiagnosticConsumer *client, bool shouldOwnClient) { - diagnostics = - createDiagnostics(&getDiagnosticOpts(), client, shouldOwnClient); + diagnostics = createDiagnostics(getDiagnosticOpts(), client, shouldOwnClient); } clang::IntrusiveRefCntPtr -CompilerInstance::createDiagnostics(clang::DiagnosticOptions *opts, +CompilerInstance::createDiagnostics(clang::DiagnosticOptions &opts, clang::DiagnosticConsumer *client, bool shouldOwnClient) { clang::IntrusiveRefCntPtr diagID( diff --git a/flang/lib/Frontend/TextDiagnosticPrinter.cpp b/flang/lib/Frontend/TextDiagnosticPrinter.cpp index 65626827af3b3..043440328b1f6 100644 --- a/flang/lib/Frontend/TextDiagnosticPrinter.cpp +++ b/flang/lib/Frontend/TextDiagnosticPrinter.cpp @@ -27,7 +27,7 @@ using namespace Fortran::frontend; 
TextDiagnosticPrinter::TextDiagnosticPrinter(raw_ostream &diagOs, - clang::DiagnosticOptions *diags) + clang::DiagnosticOptions &diags) : os(diagOs), diagOpts(diags) {} TextDiagnosticPrinter::~TextDiagnosticPrinter() {} diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..35cc2efc0ac01 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -43,9 +43,9 @@ std::string getExecutablePath(const char *argv0) { // This lets us create the DiagnosticsEngine with a properly-filled-out // DiagnosticOptions instance -static clang::DiagnosticOptions * +static std::unique_ptr createAndPopulateDiagOpts(llvm::ArrayRef argv) { - auto *diagOpts = new clang::DiagnosticOptions; + auto diagOpts = std::make_unique(); // Ignore missingArgCount and the return value of ParseDiagnosticArgs. // Any errors that would be diagnosed here will also be diagnosed later, @@ -114,17 +114,17 @@ int main(int argc, const char **argv) { // Not in the frontend mode - continue in the compiler driver mode. 
// Create DiagnosticsEngine for the compiler driver - llvm::IntrusiveRefCntPtr diagOpts = + std::unique_ptr diagOpts = createAndPopulateDiagOpts(args); llvm::IntrusiveRefCntPtr diagID( new clang::DiagnosticIDs()); Fortran::frontend::TextDiagnosticPrinter *diagClient = - new Fortran::frontend::TextDiagnosticPrinter(llvm::errs(), &*diagOpts); + new Fortran::frontend::TextDiagnosticPrinter(llvm::errs(), *diagOpts); diagClient->setPrefix( std::string(llvm::sys::path::stem(getExecutablePath(args[0])))); - clang::DiagnosticsEngine diags(diagID, &*diagOpts, diagClient); + clang::DiagnosticsEngine diags(diagID, *diagOpts, diagClient); // Prepare the driver clang::driver::Driver theDriver(driverPath, diff --git a/flang/tools/flang-driver/fc1_main.cpp b/flang/tools/flang-driver/fc1_main.cpp index 49535275d084d..f2cd513d0028c 100644 --- a/flang/tools/flang-driver/fc1_main.cpp +++ b/flang/tools/flang-driver/fc1_main.cpp @@ -67,9 +67,8 @@ int fc1_main(llvm::ArrayRef argv, const char *argv0) { // for parsing the arguments llvm::IntrusiveRefCntPtr diagID( new clang::DiagnosticIDs()); - llvm::IntrusiveRefCntPtr diagOpts = - new clang::DiagnosticOptions(); - clang::DiagnosticsEngine diags(diagID, &*diagOpts, diagsBuffer); + clang::DiagnosticOptions diagOpts; + clang::DiagnosticsEngine diags(diagID, diagOpts, diagsBuffer); bool success = CompilerInvocation::createFromArgs(flang->getInvocation(), argv, diags, argv0); diff --git a/flang/unittests/Frontend/CompilerInstanceTest.cpp b/flang/unittests/Frontend/CompilerInstanceTest.cpp index 3fe2f063e996a..bf62a64be229a 100644 --- a/flang/unittests/Frontend/CompilerInstanceTest.cpp +++ b/flang/unittests/Frontend/CompilerInstanceTest.cpp @@ -67,14 +67,15 @@ TEST(CompilerInstance, AllowDiagnosticLogWithUnownedDiagnosticConsumer) { // 1. 
Set-up a basic DiagnosticConsumer std::string diagnosticOutput; llvm::raw_string_ostream diagnosticsOS(diagnosticOutput); + clang::DiagnosticOptions diagPrinterOpts; auto diagPrinter = std::make_unique( - diagnosticsOS, new clang::DiagnosticOptions()); + diagnosticsOS, *diagPrinterOpts; // 2. Create a CompilerInstance (to manage a DiagnosticEngine) CompilerInstance compInst; // 3. Set-up DiagnosticOptions - auto diagOpts = new clang::DiagnosticOptions(); + clang::DiagnosticOptions diagOpts; // Tell the diagnostics engine to emit the diagnostic log to STDERR. This // ensures that a chained diagnostic consumer is created so that the test can // exercise the unowned diagnostic consumer in a chained consumer. ``````````
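The ownership change in the diff above (a consumer that used to hold a ref-counted pointer to its options now borrows them by reference, and the factory returns a `std::unique_ptr`) can be sketched in isolation. This is only an illustration under stated assumptions: `Options`, `Printer`, and `makeOptions` are hypothetical stand-ins, not the clang or flang classes, and the sketch does not model `DiagnosticsEngine`.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for clang::DiagnosticOptions.
struct Options {
  bool showColors = false;
};

// Hypothetical stand-in for a diagnostic consumer. It borrows the options,
// so the caller must keep the Options object alive longer than the Printer.
class Printer {
  std::ostream &os;
  Options &opts; // reference member: accessed with '.', not '->'

public:
  Printer(std::ostream &os, Options &opts) : os(os), opts(opts) {}

  void print(const std::string &msg) {
    if (opts.showColors) // would not compile as opts->showColors
      os << "[color] ";
    os << msg << '\n';
  }
};

// Hypothetical factory in the spirit of createAndPopulateDiagOpts:
// the caller receives sole ownership instead of a raw pointer.
std::unique_ptr<Options> makeOptions(bool colors) {
  auto opts = std::make_unique<Options>();
  opts->showColors = colors; // unique_ptr still uses '->'
  return opts;
}
```

A caller can pass either a stack object (`Printer p(out, stackOpts);`) or a dereferenced owner (`Printer q(out, *heapOpts);`). In both cases the options must outlive the printer, which is the lifetime obligation that a reference member moves onto the call site.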
https://github.com/llvm/llvm-project/pull/141134

From flang-commits at lists.llvm.org Thu May 22 13:40:18 2025
From: flang-commits at lists.llvm.org (Kazu Hirata via flang-commits)
Date: Thu, 22 May 2025 13:40:18 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix build after 9e306ad4 (PR #141134)
In-Reply-To:
Message-ID: <682f8bb2.a70a0220.303610.4640@mx.google.com>

kazutakahirata wrote:

I'm testing this on top of #141131. I'm getting:

```
flang/lib/Frontend/TextDiagnosticPrinter.cpp:84:17: error: member reference type 'clang::DiagnosticOptions' is not a pointer; did you mean to use '.'?
   84 |     if (diagOpts->ShowColors)
      |         ~~~~~~~~^~
      |                 .
```

https://github.com/llvm/llvm-project/pull/141134

From flang-commits at lists.llvm.org Thu May 22 13:47:06 2025
From: flang-commits at lists.llvm.org (Jan Svoboda via flang-commits)
Date: Thu, 22 May 2025 13:47:06 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix build after 9e306ad4 (PR #141134)
In-Reply-To:
Message-ID: <682f8d4a.050a0220.375542.4544@mx.google.com>

https://github.com/jansvoboda11 updated https://github.com/llvm/llvm-project/pull/141134

>From 34cf500219e35111f6c9877632b65ac8fb8c5932 Mon Sep 17 00:00:00 2001
From: Jan Svoboda
Date: Thu, 22 May 2025 13:32:24 -0700
Subject: [PATCH 1/3] [flang] Fix build after 9e306ad4

---
 flang/include/flang/Frontend/CompilerInstance.h | 2 +-
 flang/include/flang/Frontend/CompilerInvocation.h | 2 +-
 flang/include/flang/Frontend/TextDiagnosticPrinter.h | 4 ++--
 flang/lib/Frontend/CompilerInstance.cpp | 5 ++---
 flang/lib/Frontend/TextDiagnosticPrinter.cpp | 2 +-
 flang/tools/flang-driver/driver.cpp | 10 +++++-----
 flang/tools/flang-driver/fc1_main.cpp | 5 ++---
 flang/unittests/Frontend/CompilerInstanceTest.cpp | 5 +++--
 8 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/flang/include/flang/Frontend/CompilerInstance.h b/flang/include/flang/Frontend/CompilerInstance.h
index 4ad95c9df42d7..4234e13597cd7 100644
--- a/flang/include/flang/Frontend/CompilerInstance.h
+++ b/flang/include/flang/Frontend/CompilerInstance.h @@ -347,7 +347,7 @@ class CompilerInstance { /// /// \return The new object on success, or null on failure. static clang::IntrusiveRefCntPtr - createDiagnostics(clang::DiagnosticOptions *opts, + createDiagnostics(clang::DiagnosticOptions &opts, clang::DiagnosticConsumer *client = nullptr, bool shouldOwnClient = true); void createDiagnostics(clang::DiagnosticConsumer *client = nullptr, diff --git a/flang/include/flang/Frontend/CompilerInvocation.h b/flang/include/flang/Frontend/CompilerInvocation.h index d6ee1511cdb4b..06978029435b7 100644 --- a/flang/include/flang/Frontend/CompilerInvocation.h +++ b/flang/include/flang/Frontend/CompilerInvocation.h @@ -43,7 +43,7 @@ bool parseDiagnosticArgs(clang::DiagnosticOptions &opts, class CompilerInvocationBase { public: /// Options controlling the diagnostic engine. - llvm::IntrusiveRefCntPtr diagnosticOpts; + std::shared_ptr diagnosticOpts; /// Options for the preprocessor. std::shared_ptr preprocessorOpts; diff --git a/flang/include/flang/Frontend/TextDiagnosticPrinter.h b/flang/include/flang/Frontend/TextDiagnosticPrinter.h index 9c99a0c314351..4913713b6c365 100644 --- a/flang/include/flang/Frontend/TextDiagnosticPrinter.h +++ b/flang/include/flang/Frontend/TextDiagnosticPrinter.h @@ -37,13 +37,13 @@ class TextDiagnostic; class TextDiagnosticPrinter : public clang::DiagnosticConsumer { raw_ostream &os; - llvm::IntrusiveRefCntPtr diagOpts; + clang::DiagnosticOptions &diagOpts; /// A string to prefix to error messages. 
std::string prefix; public: - TextDiagnosticPrinter(raw_ostream &os, clang::DiagnosticOptions *diags); + TextDiagnosticPrinter(raw_ostream &os, clang::DiagnosticOptions &diags); ~TextDiagnosticPrinter() override; /// Set the diagnostic printer prefix string, which will be printed at the diff --git a/flang/lib/Frontend/CompilerInstance.cpp b/flang/lib/Frontend/CompilerInstance.cpp index cbd2c58eeeb47..2e0f91fb0521c 100644 --- a/flang/lib/Frontend/CompilerInstance.cpp +++ b/flang/lib/Frontend/CompilerInstance.cpp @@ -226,12 +226,11 @@ bool CompilerInstance::executeAction(FrontendAction &act) { void CompilerInstance::createDiagnostics(clang::DiagnosticConsumer *client, bool shouldOwnClient) { - diagnostics = - createDiagnostics(&getDiagnosticOpts(), client, shouldOwnClient); + diagnostics = createDiagnostics(getDiagnosticOpts(), client, shouldOwnClient); } clang::IntrusiveRefCntPtr -CompilerInstance::createDiagnostics(clang::DiagnosticOptions *opts, +CompilerInstance::createDiagnostics(clang::DiagnosticOptions &opts, clang::DiagnosticConsumer *client, bool shouldOwnClient) { clang::IntrusiveRefCntPtr diagID( diff --git a/flang/lib/Frontend/TextDiagnosticPrinter.cpp b/flang/lib/Frontend/TextDiagnosticPrinter.cpp index 65626827af3b3..043440328b1f6 100644 --- a/flang/lib/Frontend/TextDiagnosticPrinter.cpp +++ b/flang/lib/Frontend/TextDiagnosticPrinter.cpp @@ -27,7 +27,7 @@ using namespace Fortran::frontend; TextDiagnosticPrinter::TextDiagnosticPrinter(raw_ostream &diagOs, - clang::DiagnosticOptions *diags) + clang::DiagnosticOptions &diags) : os(diagOs), diagOpts(diags) {} TextDiagnosticPrinter::~TextDiagnosticPrinter() {} diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..35cc2efc0ac01 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -43,9 +43,9 @@ std::string getExecutablePath(const char *argv0) { // This lets us create the DiagnosticsEngine with a properly-filled-out 
// DiagnosticOptions instance -static clang::DiagnosticOptions * +static std::unique_ptr createAndPopulateDiagOpts(llvm::ArrayRef argv) { - auto *diagOpts = new clang::DiagnosticOptions; + auto diagOpts = std::make_unique(); // Ignore missingArgCount and the return value of ParseDiagnosticArgs. // Any errors that would be diagnosed here will also be diagnosed later, @@ -114,17 +114,17 @@ int main(int argc, const char **argv) { // Not in the frontend mode - continue in the compiler driver mode. // Create DiagnosticsEngine for the compiler driver - llvm::IntrusiveRefCntPtr diagOpts = + std::unique_ptr diagOpts = createAndPopulateDiagOpts(args); llvm::IntrusiveRefCntPtr diagID( new clang::DiagnosticIDs()); Fortran::frontend::TextDiagnosticPrinter *diagClient = - new Fortran::frontend::TextDiagnosticPrinter(llvm::errs(), &*diagOpts); + new Fortran::frontend::TextDiagnosticPrinter(llvm::errs(), *diagOpts); diagClient->setPrefix( std::string(llvm::sys::path::stem(getExecutablePath(args[0])))); - clang::DiagnosticsEngine diags(diagID, &*diagOpts, diagClient); + clang::DiagnosticsEngine diags(diagID, *diagOpts, diagClient); // Prepare the driver clang::driver::Driver theDriver(driverPath, diff --git a/flang/tools/flang-driver/fc1_main.cpp b/flang/tools/flang-driver/fc1_main.cpp index 49535275d084d..f2cd513d0028c 100644 --- a/flang/tools/flang-driver/fc1_main.cpp +++ b/flang/tools/flang-driver/fc1_main.cpp @@ -67,9 +67,8 @@ int fc1_main(llvm::ArrayRef argv, const char *argv0) { // for parsing the arguments llvm::IntrusiveRefCntPtr diagID( new clang::DiagnosticIDs()); - llvm::IntrusiveRefCntPtr diagOpts = - new clang::DiagnosticOptions(); - clang::DiagnosticsEngine diags(diagID, &*diagOpts, diagsBuffer); + clang::DiagnosticOptions diagOpts; + clang::DiagnosticsEngine diags(diagID, diagOpts, diagsBuffer); bool success = CompilerInvocation::createFromArgs(flang->getInvocation(), argv, diags, argv0); diff --git a/flang/unittests/Frontend/CompilerInstanceTest.cpp 
b/flang/unittests/Frontend/CompilerInstanceTest.cpp index 3fe2f063e996a..bf62a64be229a 100644 --- a/flang/unittests/Frontend/CompilerInstanceTest.cpp +++ b/flang/unittests/Frontend/CompilerInstanceTest.cpp @@ -67,14 +67,15 @@ TEST(CompilerInstance, AllowDiagnosticLogWithUnownedDiagnosticConsumer) { // 1. Set-up a basic DiagnosticConsumer std::string diagnosticOutput; llvm::raw_string_ostream diagnosticsOS(diagnosticOutput); + clang::DiagnosticOptions diagPrinterOpts; auto diagPrinter = std::make_unique( - diagnosticsOS, new clang::DiagnosticOptions()); + diagnosticsOS, *diagPrinterOpts; // 2. Create a CompilerInstance (to manage a DiagnosticEngine) CompilerInstance compInst; // 3. Set-up DiagnosticOptions - auto diagOpts = new clang::DiagnosticOptions(); + clang::DiagnosticOptions diagOpts; // Tell the diagnostics engine to emit the diagnostic log to STDERR. This // ensures that a chained diagnostic consumer is created so that the test can // exercise the unowned diagnostic consumer in a chained consumer. >From 0944453f35eaf6c254002cb41eb4aee54946f423 Mon Sep 17 00:00:00 2001 From: Jan Svoboda Date: Thu, 22 May 2025 13:42:17 -0700 Subject: [PATCH 2/3] Convert -> to . 
--- flang/lib/Frontend/TextDiagnosticPrinter.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/flang/lib/Frontend/TextDiagnosticPrinter.cpp b/flang/lib/Frontend/TextDiagnosticPrinter.cpp index 043440328b1f6..911b78a109e2e 100644 --- a/flang/lib/Frontend/TextDiagnosticPrinter.cpp +++ b/flang/lib/Frontend/TextDiagnosticPrinter.cpp @@ -81,7 +81,7 @@ void TextDiagnosticPrinter::printLocForRemarks( llvm::sys::path::make_preferred(absPath); // Used for changing only the bold attribute - if (diagOpts->ShowColors) + if (diagOpts.ShowColors) os.changeColor(llvm::raw_ostream::SAVEDCOLOR, true); // Print path, file name, line and column @@ -113,11 +113,11 @@ void TextDiagnosticPrinter::HandleDiagnostic( printLocForRemarks(diagMessageStream, diagMsg); Fortran::frontend::TextDiagnostic::printDiagnosticLevel(os, level, - diagOpts->ShowColors); + diagOpts.ShowColors); Fortran::frontend::TextDiagnostic::printDiagnosticMessage( os, /*IsSupplemental=*/level == clang::DiagnosticsEngine::Note, diagMsg, - diagOpts->ShowColors); + diagOpts.ShowColors); os.flush(); } >From 889172eaf986b904e30b07e611cd18513993a692 Mon Sep 17 00:00:00 2001 From: Jan Svoboda Date: Thu, 22 May 2025 13:46:53 -0700 Subject: [PATCH 3/3] Fix CodeGenActionTest.cpp --- flang/unittests/Frontend/CodeGenActionTest.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/flang/unittests/Frontend/CodeGenActionTest.cpp b/flang/unittests/Frontend/CodeGenActionTest.cpp index e9ff095973b97..6020abc463eda 100644 --- a/flang/unittests/Frontend/CodeGenActionTest.cpp +++ b/flang/unittests/Frontend/CodeGenActionTest.cpp @@ -86,8 +86,9 @@ class LLVMConversionFailureCodeGenAction : public CodeGenAction { TEST(CodeGenAction, GracefullyHandleLLVMConversionFailure) { std::string diagnosticOutput; llvm::raw_string_ostream diagnosticsOS(diagnosticOutput); + clang::DiagnosticOptions diagOpts; auto diagPrinter = std::make_unique( - diagnosticsOS, new clang::DiagnosticOptions()); + diagnosticsOS, 
diagOpts); CompilerInstance ci; ci.createDiagnostics(diagPrinter.get(), /*ShouldOwnClient=*/false); From flang-commits at lists.llvm.org Thu May 22 13:47:59 2025 From: flang-commits at lists.llvm.org (Jan Svoboda via flang-commits) Date: Thu, 22 May 2025 13:47:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix build after 9e306ad4 (PR #141134) In-Reply-To: Message-ID: <682f8d7f.170a0220.12518c.2ce8@mx.google.com> https://github.com/jansvoboda11 updated https://github.com/llvm/llvm-project/pull/141134 >From 34cf500219e35111f6c9877632b65ac8fb8c5932 Mon Sep 17 00:00:00 2001 From: Jan Svoboda Date: Thu, 22 May 2025 13:32:24 -0700 Subject: [PATCH 1/4] [flang] Fix build after 9e306ad4 --- flang/include/flang/Frontend/CompilerInstance.h | 2 +- flang/include/flang/Frontend/CompilerInvocation.h | 2 +- flang/include/flang/Frontend/TextDiagnosticPrinter.h | 4 ++-- flang/lib/Frontend/CompilerInstance.cpp | 5 ++--- flang/lib/Frontend/TextDiagnosticPrinter.cpp | 2 +- flang/tools/flang-driver/driver.cpp | 10 +++++----- flang/tools/flang-driver/fc1_main.cpp | 5 ++--- flang/unittests/Frontend/CompilerInstanceTest.cpp | 5 +++-- 8 files changed, 17 insertions(+), 18 deletions(-) diff --git a/flang/include/flang/Frontend/CompilerInstance.h b/flang/include/flang/Frontend/CompilerInstance.h index 4ad95c9df42d7..4234e13597cd7 100644 --- a/flang/include/flang/Frontend/CompilerInstance.h +++ b/flang/include/flang/Frontend/CompilerInstance.h @@ -347,7 +347,7 @@ class CompilerInstance { /// /// \return The new object on success, or null on failure. 
static clang::IntrusiveRefCntPtr - createDiagnostics(clang::DiagnosticOptions *opts, + createDiagnostics(clang::DiagnosticOptions &opts, clang::DiagnosticConsumer *client = nullptr, bool shouldOwnClient = true); void createDiagnostics(clang::DiagnosticConsumer *client = nullptr, diff --git a/flang/include/flang/Frontend/CompilerInvocation.h b/flang/include/flang/Frontend/CompilerInvocation.h index d6ee1511cdb4b..06978029435b7 100644 --- a/flang/include/flang/Frontend/CompilerInvocation.h +++ b/flang/include/flang/Frontend/CompilerInvocation.h @@ -43,7 +43,7 @@ bool parseDiagnosticArgs(clang::DiagnosticOptions &opts, class CompilerInvocationBase { public: /// Options controlling the diagnostic engine. - llvm::IntrusiveRefCntPtr diagnosticOpts; + std::shared_ptr diagnosticOpts; /// Options for the preprocessor. std::shared_ptr preprocessorOpts; diff --git a/flang/include/flang/Frontend/TextDiagnosticPrinter.h b/flang/include/flang/Frontend/TextDiagnosticPrinter.h index 9c99a0c314351..4913713b6c365 100644 --- a/flang/include/flang/Frontend/TextDiagnosticPrinter.h +++ b/flang/include/flang/Frontend/TextDiagnosticPrinter.h @@ -37,13 +37,13 @@ class TextDiagnostic; class TextDiagnosticPrinter : public clang::DiagnosticConsumer { raw_ostream &os; - llvm::IntrusiveRefCntPtr diagOpts; + clang::DiagnosticOptions &diagOpts; /// A string to prefix to error messages. 
std::string prefix; public: - TextDiagnosticPrinter(raw_ostream &os, clang::DiagnosticOptions *diags); + TextDiagnosticPrinter(raw_ostream &os, clang::DiagnosticOptions &diags); ~TextDiagnosticPrinter() override; /// Set the diagnostic printer prefix string, which will be printed at the diff --git a/flang/lib/Frontend/CompilerInstance.cpp b/flang/lib/Frontend/CompilerInstance.cpp index cbd2c58eeeb47..2e0f91fb0521c 100644 --- a/flang/lib/Frontend/CompilerInstance.cpp +++ b/flang/lib/Frontend/CompilerInstance.cpp @@ -226,12 +226,11 @@ bool CompilerInstance::executeAction(FrontendAction &act) { void CompilerInstance::createDiagnostics(clang::DiagnosticConsumer *client, bool shouldOwnClient) { - diagnostics = - createDiagnostics(&getDiagnosticOpts(), client, shouldOwnClient); + diagnostics = createDiagnostics(getDiagnosticOpts(), client, shouldOwnClient); } clang::IntrusiveRefCntPtr -CompilerInstance::createDiagnostics(clang::DiagnosticOptions *opts, +CompilerInstance::createDiagnostics(clang::DiagnosticOptions &opts, clang::DiagnosticConsumer *client, bool shouldOwnClient) { clang::IntrusiveRefCntPtr diagID( diff --git a/flang/lib/Frontend/TextDiagnosticPrinter.cpp b/flang/lib/Frontend/TextDiagnosticPrinter.cpp index 65626827af3b3..043440328b1f6 100644 --- a/flang/lib/Frontend/TextDiagnosticPrinter.cpp +++ b/flang/lib/Frontend/TextDiagnosticPrinter.cpp @@ -27,7 +27,7 @@ using namespace Fortran::frontend; TextDiagnosticPrinter::TextDiagnosticPrinter(raw_ostream &diagOs, - clang::DiagnosticOptions *diags) + clang::DiagnosticOptions &diags) : os(diagOs), diagOpts(diags) {} TextDiagnosticPrinter::~TextDiagnosticPrinter() {} diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..35cc2efc0ac01 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -43,9 +43,9 @@ std::string getExecutablePath(const char *argv0) { // This lets us create the DiagnosticsEngine with a properly-filled-out 
// DiagnosticOptions instance -static clang::DiagnosticOptions * +static std::unique_ptr createAndPopulateDiagOpts(llvm::ArrayRef argv) { - auto *diagOpts = new clang::DiagnosticOptions; + auto diagOpts = std::make_unique(); // Ignore missingArgCount and the return value of ParseDiagnosticArgs. // Any errors that would be diagnosed here will also be diagnosed later, @@ -114,17 +114,17 @@ int main(int argc, const char **argv) { // Not in the frontend mode - continue in the compiler driver mode. // Create DiagnosticsEngine for the compiler driver - llvm::IntrusiveRefCntPtr diagOpts = + std::unique_ptr diagOpts = createAndPopulateDiagOpts(args); llvm::IntrusiveRefCntPtr diagID( new clang::DiagnosticIDs()); Fortran::frontend::TextDiagnosticPrinter *diagClient = - new Fortran::frontend::TextDiagnosticPrinter(llvm::errs(), &*diagOpts); + new Fortran::frontend::TextDiagnosticPrinter(llvm::errs(), *diagOpts); diagClient->setPrefix( std::string(llvm::sys::path::stem(getExecutablePath(args[0])))); - clang::DiagnosticsEngine diags(diagID, &*diagOpts, diagClient); + clang::DiagnosticsEngine diags(diagID, *diagOpts, diagClient); // Prepare the driver clang::driver::Driver theDriver(driverPath, diff --git a/flang/tools/flang-driver/fc1_main.cpp b/flang/tools/flang-driver/fc1_main.cpp index 49535275d084d..f2cd513d0028c 100644 --- a/flang/tools/flang-driver/fc1_main.cpp +++ b/flang/tools/flang-driver/fc1_main.cpp @@ -67,9 +67,8 @@ int fc1_main(llvm::ArrayRef argv, const char *argv0) { // for parsing the arguments llvm::IntrusiveRefCntPtr diagID( new clang::DiagnosticIDs()); - llvm::IntrusiveRefCntPtr diagOpts = - new clang::DiagnosticOptions(); - clang::DiagnosticsEngine diags(diagID, &*diagOpts, diagsBuffer); + clang::DiagnosticOptions diagOpts; + clang::DiagnosticsEngine diags(diagID, diagOpts, diagsBuffer); bool success = CompilerInvocation::createFromArgs(flang->getInvocation(), argv, diags, argv0); diff --git a/flang/unittests/Frontend/CompilerInstanceTest.cpp 
b/flang/unittests/Frontend/CompilerInstanceTest.cpp index 3fe2f063e996a..bf62a64be229a 100644 --- a/flang/unittests/Frontend/CompilerInstanceTest.cpp +++ b/flang/unittests/Frontend/CompilerInstanceTest.cpp @@ -67,14 +67,15 @@ TEST(CompilerInstance, AllowDiagnosticLogWithUnownedDiagnosticConsumer) { // 1. Set-up a basic DiagnosticConsumer std::string diagnosticOutput; llvm::raw_string_ostream diagnosticsOS(diagnosticOutput); + clang::DiagnosticOptions diagPrinterOpts; auto diagPrinter = std::make_unique( - diagnosticsOS, new clang::DiagnosticOptions()); + diagnosticsOS, *diagPrinterOpts; // 2. Create a CompilerInstance (to manage a DiagnosticEngine) CompilerInstance compInst; // 3. Set-up DiagnosticOptions - auto diagOpts = new clang::DiagnosticOptions(); + clang::DiagnosticOptions diagOpts; // Tell the diagnostics engine to emit the diagnostic log to STDERR. This // ensures that a chained diagnostic consumer is created so that the test can // exercise the unowned diagnostic consumer in a chained consumer. >From 0944453f35eaf6c254002cb41eb4aee54946f423 Mon Sep 17 00:00:00 2001 From: Jan Svoboda Date: Thu, 22 May 2025 13:42:17 -0700 Subject: [PATCH 2/4] Convert -> to . 
--- flang/lib/Frontend/TextDiagnosticPrinter.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/flang/lib/Frontend/TextDiagnosticPrinter.cpp b/flang/lib/Frontend/TextDiagnosticPrinter.cpp index 043440328b1f6..911b78a109e2e 100644 --- a/flang/lib/Frontend/TextDiagnosticPrinter.cpp +++ b/flang/lib/Frontend/TextDiagnosticPrinter.cpp @@ -81,7 +81,7 @@ void TextDiagnosticPrinter::printLocForRemarks( llvm::sys::path::make_preferred(absPath); // Used for changing only the bold attribute - if (diagOpts->ShowColors) + if (diagOpts.ShowColors) os.changeColor(llvm::raw_ostream::SAVEDCOLOR, true); // Print path, file name, line and column @@ -113,11 +113,11 @@ void TextDiagnosticPrinter::HandleDiagnostic( printLocForRemarks(diagMessageStream, diagMsg); Fortran::frontend::TextDiagnostic::printDiagnosticLevel(os, level, - diagOpts->ShowColors); + diagOpts.ShowColors); Fortran::frontend::TextDiagnostic::printDiagnosticMessage( os, /*IsSupplemental=*/level == clang::DiagnosticsEngine::Note, diagMsg, - diagOpts->ShowColors); + diagOpts.ShowColors); os.flush(); } >From 889172eaf986b904e30b07e611cd18513993a692 Mon Sep 17 00:00:00 2001 From: Jan Svoboda Date: Thu, 22 May 2025 13:46:53 -0700 Subject: [PATCH 3/4] Fix CodeGenActionTest.cpp --- flang/unittests/Frontend/CodeGenActionTest.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/flang/unittests/Frontend/CodeGenActionTest.cpp b/flang/unittests/Frontend/CodeGenActionTest.cpp index e9ff095973b97..6020abc463eda 100644 --- a/flang/unittests/Frontend/CodeGenActionTest.cpp +++ b/flang/unittests/Frontend/CodeGenActionTest.cpp @@ -86,8 +86,9 @@ class LLVMConversionFailureCodeGenAction : public CodeGenAction { TEST(CodeGenAction, GracefullyHandleLLVMConversionFailure) { std::string diagnosticOutput; llvm::raw_string_ostream diagnosticsOS(diagnosticOutput); + clang::DiagnosticOptions diagOpts; auto diagPrinter = std::make_unique( - diagnosticsOS, new clang::DiagnosticOptions()); + diagnosticsOS, 
diagOpts); CompilerInstance ci; ci.createDiagnostics(diagPrinter.get(), /*ShouldOwnClient=*/false); >From 6a6adef7513ded2f7eae3f364e3db1b934bc9b1c Mon Sep 17 00:00:00 2001 From: Jan Svoboda Date: Thu, 22 May 2025 13:47:47 -0700 Subject: [PATCH 4/4] Fix CompilerInstanceTest.cpp --- flang/unittests/Frontend/CompilerInstanceTest.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/unittests/Frontend/CompilerInstanceTest.cpp b/flang/unittests/Frontend/CompilerInstanceTest.cpp index bf62a64be229a..8621c14029c15 100644 --- a/flang/unittests/Frontend/CompilerInstanceTest.cpp +++ b/flang/unittests/Frontend/CompilerInstanceTest.cpp @@ -69,7 +69,7 @@ TEST(CompilerInstance, AllowDiagnosticLogWithUnownedDiagnosticConsumer) { llvm::raw_string_ostream diagnosticsOS(diagnosticOutput); clang::DiagnosticOptions diagPrinterOpts; auto diagPrinter = std::make_unique( - diagnosticsOS, *diagPrinterOpts; + diagnosticsOS, diagPrinterOpts); // 2. Create a CompilerInstance (to manage a DiagnosticEngine) CompilerInstance compInst; @@ -79,7 +79,7 @@ TEST(CompilerInstance, AllowDiagnosticLogWithUnownedDiagnosticConsumer) { // Tell the diagnostics engine to emit the diagnostic log to STDERR. This // ensures that a chained diagnostic consumer is created so that the test can // exercise the unowned diagnostic consumer in a chained consumer. - diagOpts->DiagnosticLogFile = "-"; + diagOpts.DiagnosticLogFile = "-"; // 4. Create a DiagnosticEngine with an unowned consumer IntrusiveRefCntPtr diags = From flang-commits at lists.llvm.org Thu May 22 13:50:43 2025 From: flang-commits at lists.llvm.org (Kazu Hirata via flang-commits) Date: Thu, 22 May 2025 13:50:43 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix build after 9e306ad4 (PR #141134) In-Reply-To: Message-ID: <682f8e23.050a0220.375c48.7004@mx.google.com> https://github.com/kazutakahirata approved this pull request. LGTM. 
With your 4 commits in this PR, I can now do `ninja all test-depends`, which includes flang and flang's unit tests. Thank you for fixing things quickly! https://github.com/llvm/llvm-project/pull/141134 From flang-commits at lists.llvm.org Thu May 22 13:51:18 2025 From: flang-commits at lists.llvm.org (Jan Svoboda via flang-commits) Date: Thu, 22 May 2025 13:51:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix build after 9e306ad4 (PR #141134) In-Reply-To: Message-ID: <682f8e46.050a0220.1284d.5568@mx.google.com> jansvoboda11 wrote: Thanks again! https://github.com/llvm/llvm-project/pull/141134 From flang-commits at lists.llvm.org Thu May 22 13:51:23 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 22 May 2025 13:51:23 -0700 (PDT) Subject: [flang-commits] [flang] 3ea2cec - [flang] Fix build after 9e306ad4 (#141134) Message-ID: <682f8e4b.170a0220.7f710.e4a1@mx.google.com> Author: Jan Svoboda Date: 2025-05-22T13:51:20-07:00 New Revision: 3ea2cec7324e1e4569cd15b9e6cb1a4a6e8aa521 URL: https://github.com/llvm/llvm-project/commit/3ea2cec7324e1e4569cd15b9e6cb1a4a6e8aa521 DIFF: https://github.com/llvm/llvm-project/commit/3ea2cec7324e1e4569cd15b9e6cb1a4a6e8aa521.diff LOG: [flang] Fix build after 9e306ad4 (#141134) Added: Modified: flang/include/flang/Frontend/CompilerInstance.h flang/include/flang/Frontend/CompilerInvocation.h flang/include/flang/Frontend/TextDiagnosticPrinter.h flang/lib/Frontend/CompilerInstance.cpp flang/lib/Frontend/TextDiagnosticPrinter.cpp flang/tools/flang-driver/driver.cpp flang/tools/flang-driver/fc1_main.cpp flang/unittests/Frontend/CodeGenActionTest.cpp flang/unittests/Frontend/CompilerInstanceTest.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Frontend/CompilerInstance.h b/flang/include/flang/Frontend/CompilerInstance.h index 4ad95c9df42d7..4234e13597cd7 100644 --- a/flang/include/flang/Frontend/CompilerInstance.h +++ 
b/flang/include/flang/Frontend/CompilerInstance.h @@ -347,7 +347,7 @@ class CompilerInstance { /// /// \return The new object on success, or null on failure. static clang::IntrusiveRefCntPtr - createDiagnostics(clang::DiagnosticOptions *opts, + createDiagnostics(clang::DiagnosticOptions &opts, clang::DiagnosticConsumer *client = nullptr, bool shouldOwnClient = true); void createDiagnostics(clang::DiagnosticConsumer *client = nullptr, diff --git a/flang/include/flang/Frontend/CompilerInvocation.h b/flang/include/flang/Frontend/CompilerInvocation.h index d6ee1511cdb4b..06978029435b7 100644 --- a/flang/include/flang/Frontend/CompilerInvocation.h +++ b/flang/include/flang/Frontend/CompilerInvocation.h @@ -43,7 +43,7 @@ bool parseDiagnosticArgs(clang::DiagnosticOptions &opts, class CompilerInvocationBase { public: /// Options controlling the diagnostic engine. - llvm::IntrusiveRefCntPtr diagnosticOpts; + std::shared_ptr diagnosticOpts; /// Options for the preprocessor. std::shared_ptr preprocessorOpts; diff --git a/flang/include/flang/Frontend/TextDiagnosticPrinter.h b/flang/include/flang/Frontend/TextDiagnosticPrinter.h index 9c99a0c314351..4913713b6c365 100644 --- a/flang/include/flang/Frontend/TextDiagnosticPrinter.h +++ b/flang/include/flang/Frontend/TextDiagnosticPrinter.h @@ -37,13 +37,13 @@ class TextDiagnostic; class TextDiagnosticPrinter : public clang::DiagnosticConsumer { raw_ostream &os; - llvm::IntrusiveRefCntPtr diagOpts; + clang::DiagnosticOptions &diagOpts; /// A string to prefix to error messages. 
std::string prefix; public: - TextDiagnosticPrinter(raw_ostream &os, clang::DiagnosticOptions *diags); + TextDiagnosticPrinter(raw_ostream &os, clang::DiagnosticOptions &diags); ~TextDiagnosticPrinter() override; /// Set the diagnostic printer prefix string, which will be printed at the diff --git a/flang/lib/Frontend/CompilerInstance.cpp b/flang/lib/Frontend/CompilerInstance.cpp index cbd2c58eeeb47..2e0f91fb0521c 100644 --- a/flang/lib/Frontend/CompilerInstance.cpp +++ b/flang/lib/Frontend/CompilerInstance.cpp @@ -226,12 +226,11 @@ bool CompilerInstance::executeAction(FrontendAction &act) { void CompilerInstance::createDiagnostics(clang::DiagnosticConsumer *client, bool shouldOwnClient) { - diagnostics = - createDiagnostics(&getDiagnosticOpts(), client, shouldOwnClient); + diagnostics = createDiagnostics(getDiagnosticOpts(), client, shouldOwnClient); } clang::IntrusiveRefCntPtr -CompilerInstance::createDiagnostics(clang::DiagnosticOptions *opts, +CompilerInstance::createDiagnostics(clang::DiagnosticOptions &opts, clang::DiagnosticConsumer *client, bool shouldOwnClient) { clang::IntrusiveRefCntPtr diagID( diff --git a/flang/lib/Frontend/TextDiagnosticPrinter.cpp b/flang/lib/Frontend/TextDiagnosticPrinter.cpp index 65626827af3b3..911b78a109e2e 100644 --- a/flang/lib/Frontend/TextDiagnosticPrinter.cpp +++ b/flang/lib/Frontend/TextDiagnosticPrinter.cpp @@ -27,7 +27,7 @@ using namespace Fortran::frontend; TextDiagnosticPrinter::TextDiagnosticPrinter(raw_ostream &diagOs, - clang::DiagnosticOptions *diags) + clang::DiagnosticOptions &diags) : os(diagOs), diagOpts(diags) {} TextDiagnosticPrinter::~TextDiagnosticPrinter() {} @@ -81,7 +81,7 @@ void TextDiagnosticPrinter::printLocForRemarks( llvm::sys::path::make_preferred(absPath); // Used for changing only the bold attribute - if (diagOpts->ShowColors) + if (diagOpts.ShowColors) os.changeColor(llvm::raw_ostream::SAVEDCOLOR, true); // Print path, file name, line and column @@ -113,11 +113,11 @@ void 
TextDiagnosticPrinter::HandleDiagnostic( printLocForRemarks(diagMessageStream, diagMsg); Fortran::frontend::TextDiagnostic::printDiagnosticLevel(os, level, - diagOpts->ShowColors); + diagOpts.ShowColors); Fortran::frontend::TextDiagnostic::printDiagnosticMessage( os, /*IsSupplemental=*/level == clang::DiagnosticsEngine::Note, diagMsg, - diagOpts->ShowColors); + diagOpts.ShowColors); os.flush(); } diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..35cc2efc0ac01 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -43,9 +43,9 @@ std::string getExecutablePath(const char *argv0) { // This lets us create the DiagnosticsEngine with a properly-filled-out // DiagnosticOptions instance -static clang::DiagnosticOptions * +static std::unique_ptr createAndPopulateDiagOpts(llvm::ArrayRef argv) { - auto *diagOpts = new clang::DiagnosticOptions; + auto diagOpts = std::make_unique(); // Ignore missingArgCount and the return value of ParseDiagnosticArgs. // Any errors that would be diagnosed here will also be diagnosed later, @@ -114,17 +114,17 @@ int main(int argc, const char **argv) { // Not in the frontend mode - continue in the compiler driver mode. 
// Create DiagnosticsEngine for the compiler driver - llvm::IntrusiveRefCntPtr diagOpts = + std::unique_ptr diagOpts = createAndPopulateDiagOpts(args); llvm::IntrusiveRefCntPtr diagID( new clang::DiagnosticIDs()); Fortran::frontend::TextDiagnosticPrinter *diagClient = - new Fortran::frontend::TextDiagnosticPrinter(llvm::errs(), &*diagOpts); + new Fortran::frontend::TextDiagnosticPrinter(llvm::errs(), *diagOpts); diagClient->setPrefix( std::string(llvm::sys::path::stem(getExecutablePath(args[0])))); - clang::DiagnosticsEngine diags(diagID, &*diagOpts, diagClient); + clang::DiagnosticsEngine diags(diagID, *diagOpts, diagClient); // Prepare the driver clang::driver::Driver theDriver(driverPath, diff --git a/flang/tools/flang-driver/fc1_main.cpp b/flang/tools/flang-driver/fc1_main.cpp index 49535275d084d..f2cd513d0028c 100644 --- a/flang/tools/flang-driver/fc1_main.cpp +++ b/flang/tools/flang-driver/fc1_main.cpp @@ -67,9 +67,8 @@ int fc1_main(llvm::ArrayRef argv, const char *argv0) { // for parsing the arguments llvm::IntrusiveRefCntPtr diagID( new clang::DiagnosticIDs()); - llvm::IntrusiveRefCntPtr diagOpts = - new clang::DiagnosticOptions(); - clang::DiagnosticsEngine diags(diagID, &*diagOpts, diagsBuffer); + clang::DiagnosticOptions diagOpts; + clang::DiagnosticsEngine diags(diagID, diagOpts, diagsBuffer); bool success = CompilerInvocation::createFromArgs(flang->getInvocation(), argv, diags, argv0); diff --git a/flang/unittests/Frontend/CodeGenActionTest.cpp b/flang/unittests/Frontend/CodeGenActionTest.cpp index e9ff095973b97..6020abc463eda 100644 --- a/flang/unittests/Frontend/CodeGenActionTest.cpp +++ b/flang/unittests/Frontend/CodeGenActionTest.cpp @@ -86,8 +86,9 @@ class LLVMConversionFailureCodeGenAction : public CodeGenAction { TEST(CodeGenAction, GracefullyHandleLLVMConversionFailure) { std::string diagnosticOutput; llvm::raw_string_ostream diagnosticsOS(diagnosticOutput); + clang::DiagnosticOptions diagOpts; auto diagPrinter = std::make_unique( - 
diagnosticsOS, new clang::DiagnosticOptions()); + diagnosticsOS, diagOpts); CompilerInstance ci; ci.createDiagnostics(diagPrinter.get(), /*ShouldOwnClient=*/false); diff --git a/flang/unittests/Frontend/CompilerInstanceTest.cpp b/flang/unittests/Frontend/CompilerInstanceTest.cpp index 3fe2f063e996a..8621c14029c15 100644 --- a/flang/unittests/Frontend/CompilerInstanceTest.cpp +++ b/flang/unittests/Frontend/CompilerInstanceTest.cpp @@ -67,18 +67,19 @@ TEST(CompilerInstance, AllowDiagnosticLogWithUnownedDiagnosticConsumer) { // 1. Set-up a basic DiagnosticConsumer std::string diagnosticOutput; llvm::raw_string_ostream diagnosticsOS(diagnosticOutput); + clang::DiagnosticOptions diagPrinterOpts; auto diagPrinter = std::make_unique( - diagnosticsOS, new clang::DiagnosticOptions()); + diagnosticsOS, diagPrinterOpts); // 2. Create a CompilerInstance (to manage a DiagnosticEngine) CompilerInstance compInst; // 3. Set-up DiagnosticOptions - auto diagOpts = new clang::DiagnosticOptions(); + clang::DiagnosticOptions diagOpts; // Tell the diagnostics engine to emit the diagnostic log to STDERR. This // ensures that a chained diagnostic consumer is created so that the test can // exercise the unowned diagnostic consumer in a chained consumer. - diagOpts->DiagnosticLogFile = "-"; + diagOpts.DiagnosticLogFile = "-"; // 4. 
Create a DiagnosticEngine with an unowned consumer
   IntrusiveRefCntPtr diags =

From flang-commits at lists.llvm.org Thu May 22 13:51:27 2025
From: flang-commits at lists.llvm.org (Jan Svoboda via flang-commits)
Date: Thu, 22 May 2025 13:51:27 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix build after 9e306ad4 (PR #141134)
In-Reply-To:
Message-ID: <682f8e4f.170a0220.25b62d.b3ed@mx.google.com>

https://github.com/jansvoboda11 closed https://github.com/llvm/llvm-project/pull/141134

From flang-commits at lists.llvm.org Thu May 22 13:51:48 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 22 May 2025 13:51:48 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727)
In-Reply-To:
Message-ID: <682f8e64.050a0220.26a250.5ea4@mx.google.com>

klausler wrote:

The use of the work queue is now optional in this patch. It's on by default for the GPU, off by default for the CPU. This should minimize the overhead of this approach when it's not needed. (Although good performance measurements remain to be done on real applications, and it's not clear what the consequences of this framework are, enabled or not.)

Reviewers: I'm unlikely to make further major changes here until you have comments. It's stable and working in both modes on the CPU.
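For readers unfamiliar with the technique named in the PR title, the following is a generic sketch of replacing recursion with an explicit work queue, plus a switch that selects either mode. It is not taken from the flang runtime: `Node`, `sumRecursive`, `sumWithWorkQueue`, and `sum` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical nested structure, standing in for data with nested components.
struct Node {
  int value;
  std::vector<Node> children;
};

// Recursive traversal: call-stack depth grows with the nesting depth.
int sumRecursive(const Node &node) {
  int total{node.value};
  for (const Node &child : node.children) {
    total += sumRecursive(child);
  }
  return total;
}

// Iterative traversal: pending work is kept in an explicit container, so
// call-stack usage stays constant however deep the nesting is.
int sumWithWorkQueue(const Node &root) {
  int total{0};
  std::vector<const Node *> pending{&root};
  while (!pending.empty()) {
    const Node *node{pending.back()};
    pending.pop_back();
    total += node->value;
    for (const Node &child : node->children) {
      pending.push_back(&child);
    }
  }
  return total;
}

// Illustrative switch only: both modes must produce the same result.
int sum(const Node &root, bool useWorkQueue) {
  return useWorkQueue ? sumWithWorkQueue(root) : sumRecursive(root);
}
```

Both paths visit every node exactly once; the work-queue version trades call-stack frames for container storage, which is the overhead the message above refers to when the feature is not needed.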
https://github.com/llvm/llvm-project/pull/137727

From flang-commits at lists.llvm.org Thu May 22 13:52:20 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 22 May 2025 13:52:20 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727)
In-Reply-To:
Message-ID: <682f8e84.630a0220.173c0.929a@mx.google.com>

https://github.com/klausler edited https://github.com/llvm/llvm-project/pull/137727

From flang-commits at lists.llvm.org Thu May 22 13:54:17 2025
From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits)
Date: Thu, 22 May 2025 13:54:17 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727)
In-Reply-To:
Message-ID: <682f8ef9.170a0220.247a23.a741@mx.google.com>

clementval wrote:

Thanks for the patch Peter. I'm going to make a build with your change and test it against some of our tests. I'll keep you posted.

https://github.com/llvm/llvm-project/pull/137727

From flang-commits at lists.llvm.org Thu May 22 14:48:44 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 22 May 2025 14:48:44 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix folding of SHAPE(SPREAD(source, dim, ncopies=-1)) (PR #141146)
Message-ID:

https://github.com/klausler created https://github.com/llvm/llvm-project/pull/141146

The number of copies on the new dimension must be clamped via MAX(0, ncopies) so that it is no less than zero.

Fixes https://github.com/llvm/llvm-project/issues/141119.

>From 7e8e7651a6e0afb934078ba3fac8589312ac457e Mon Sep 17 00:00:00 2001
From: Peter Klausler
Date: Thu, 22 May 2025 14:45:34 -0700
Subject: [PATCH] [flang] Fix folding of SHAPE(SPREAD(source,dim,ncopies=-1))

The number of copies on the new dimension must be clamped via MAX(0, ncopies) so that it is no less than zero.
Fixes https://github.com/llvm/llvm-project/issues/141119.
---
 flang/lib/Evaluate/shape.cpp        | 7 ++++---
 flang/test/Evaluate/fold-spread.f90 | 1 +
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/flang/lib/Evaluate/shape.cpp b/flang/lib/Evaluate/shape.cpp
index ac4811e9978eb..776866d1416d2 100644
--- a/flang/lib/Evaluate/shape.cpp
+++ b/flang/lib/Evaluate/shape.cpp
@@ -1073,8 +1073,8 @@ auto GetShapeHelper::operator()(const ProcedureRef &call) const -> Result {
         }
       }
     } else if (intrinsic->name == "spread") {
-      // SHAPE(SPREAD(ARRAY,DIM,NCOPIES)) = SHAPE(ARRAY) with NCOPIES inserted
-      // at position DIM.
+      // SHAPE(SPREAD(ARRAY,DIM,NCOPIES)) = SHAPE(ARRAY) with MAX(0,NCOPIES)
+      // inserted at position DIM.
       if (call.arguments().size() == 3) {
         auto arrayShape{
             (*this)(UnwrapExpr>(call.arguments().at(0)))};
@@ -1086,7 +1086,8 @@ auto GetShapeHelper::operator()(const ProcedureRef &call) const -> Result {
             if (*dim >= 1 &&
                 static_cast(*dim) <= arrayShape->size() + 1) {
               arrayShape->emplace(arrayShape->begin() + *dim - 1,
-                  ConvertToType(common::Clone(*nCopies)));
+                  Extremum{Ordering::Greater, ExtentExpr{0},
+                      ConvertToType(common::Clone(*nCopies))});
               return std::move(*arrayShape);
             }
           }
diff --git a/flang/test/Evaluate/fold-spread.f90 b/flang/test/Evaluate/fold-spread.f90
index b7e493ee061c8..c8b87e8c87811 100644
--- a/flang/test/Evaluate/fold-spread.f90
+++ b/flang/test/Evaluate/fold-spread.f90
@@ -12,4 +12,5 @@ module m1
   logical, parameter :: test_log4 = all(any(spread([.false., .true.], 2, 2), dim=2) .eqv. [.false., .true.])
   logical, parameter :: test_m2toa3 = all(spread(reshape([(j,j=1,6)],[2,3]),1,4) == &
     reshape([((j,k=1,4),j=1,6)],[4,2,3]))
+  logical, parameter :: test_shape_neg = all(shape(spread(0,1,-1)) == [0])
 end module

From flang-commits at lists.llvm.org Thu May 22 14:49:19 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 22 May 2025 14:49:19 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix folding of SHAPE(SPREAD(source, dim, ncopies=-1)) (PR #141146)
In-Reply-To:
Message-ID: <682f9bdf.170a0220.76e21.2004@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-semantics

Author: Peter Klausler (klausler)
Changes The number of copies on the new dimension must be clamped via MAX(0, ncopies) so that it is no less than zero. Fixes https://github.com/llvm/llvm-project/issues/141119. --- Full diff: https://github.com/llvm/llvm-project/pull/141146.diff 2 Files Affected: - (modified) flang/lib/Evaluate/shape.cpp (+4-3) - (modified) flang/test/Evaluate/fold-spread.f90 (+1) ``````````diff diff --git a/flang/lib/Evaluate/shape.cpp b/flang/lib/Evaluate/shape.cpp index ac4811e9978eb..776866d1416d2 100644 --- a/flang/lib/Evaluate/shape.cpp +++ b/flang/lib/Evaluate/shape.cpp @@ -1073,8 +1073,8 @@ auto GetShapeHelper::operator()(const ProcedureRef &call) const -> Result { } } } else if (intrinsic->name == "spread") { - // SHAPE(SPREAD(ARRAY,DIM,NCOPIES)) = SHAPE(ARRAY) with NCOPIES inserted - // at position DIM. + // SHAPE(SPREAD(ARRAY,DIM,NCOPIES)) = SHAPE(ARRAY) with MAX(0,NCOPIES) + // inserted at position DIM. if (call.arguments().size() == 3) { auto arrayShape{ (*this)(UnwrapExpr>(call.arguments().at(0)))}; @@ -1086,7 +1086,8 @@ auto GetShapeHelper::operator()(const ProcedureRef &call) const -> Result { if (*dim >= 1 && static_cast(*dim) <= arrayShape->size() + 1) { arrayShape->emplace(arrayShape->begin() + *dim - 1, - ConvertToType(common::Clone(*nCopies))); + Extremum{Ordering::Greater, ExtentExpr{0}, + ConvertToType(common::Clone(*nCopies))}); return std::move(*arrayShape); } } diff --git a/flang/test/Evaluate/fold-spread.f90 b/flang/test/Evaluate/fold-spread.f90 index b7e493ee061c8..c8b87e8c87811 100644 --- a/flang/test/Evaluate/fold-spread.f90 +++ b/flang/test/Evaluate/fold-spread.f90 @@ -12,4 +12,5 @@ module m1 logical, parameter :: test_log4 = all(any(spread([.false., .true.], 2, 2), dim=2) .eqv. [.false., .true.]) logical, parameter :: test_m2toa3 = all(spread(reshape([(j,j=1,6)],[2,3]),1,4) == & reshape([((j,k=1,4),j=1,6)],[4,2,3])) + logical, parameter :: test_shape_neg = all(shape(spread(0,1,-1)) == [0]) end module ``````````
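The shape rule described in this PR can be sketched outside the compiler as follows. This is a minimal standalone illustration, not flang's implementation: `spreadShape` is a hypothetical helper, and it works on plain integers where the patch above builds a symbolic `Extremum` expression.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not part of flang): computes
// SHAPE(SPREAD(source, dim, ncopies)) from the shape of `source`.
// `dim` is 1-based, as in Fortran.
std::optional<std::vector<std::int64_t>> spreadShape(
    std::vector<std::int64_t> sourceShape, std::int64_t dim,
    std::int64_t ncopies) {
  // DIM must satisfy 1 <= DIM <= rank + 1; otherwise no shape is produced.
  if (dim < 1 || static_cast<std::size_t>(dim) > sourceShape.size() + 1) {
    return std::nullopt;
  }
  // The inserted extent is MAX(0, NCOPIES): a negative count gives a
  // zero-sized dimension instead of a negative extent.
  sourceShape.insert(sourceShape.begin() + (dim - 1),
      std::max<std::int64_t>(0, ncopies));
  return sourceShape;
}
```

With a scalar source (empty shape), `spreadShape({}, 1, -1)` yields a single zero extent, which corresponds to the new `test_shape_neg` case in `fold-spread.f90`.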
https://github.com/llvm/llvm-project/pull/141146 From flang-commits at lists.llvm.org Thu May 22 16:01:12 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Thu, 22 May 2025 16:01:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <682facb8.170a0220.380c8d.aaf8@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 >From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:48:48 -0400 Subject: [PATCH 1/4] [flang] Ensure that the integer for Cray pointer is sized correctly The integer used for Cray pointers should have the size equivalent to platform's pointer size. modified: flang/include/flang/Evaluate/target.h modified: flang/include/flang/Tools/TargetSetup.h modified: flang/lib/Semantics/resolve-names.cpp --- flang/include/flang/Evaluate/target.h | 6 ++++++ flang/include/flang/Tools/TargetSetup.h | 3 +++ flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..281e42d5285fb 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t pointerSize() { return pointerSize_; } + void set_pointerSize(std::size_t pointerSize) { + pointerSize_ = pointerSize; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, 
IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. } diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 30477b842ae64fb1c94f520136956cc4685648a7 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:59:20 -0400 Subject: [PATCH 2/4] clang-format --- flang/include/flang/Evaluate/target.h | 4 +--- flang/include/flang/Tools/TargetSetup.h | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d5285fb..e8b9fedc38f48 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ class TargetCharacteristics { const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } std::size_t pointerSize() { return pointerSize_; } - 
void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e21fb4..24ab65f740ec6 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. >From e1f91c83680d72fc1463a8db97a77298141286d9 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:41:48 -0400 Subject: [PATCH 3/4] Renaming to integerKindForPointer --- flang/include/flang/Evaluate/target.h | 8 +++++--- flang/include/flang/Tools/TargetSetup.h | 3 ++- flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index e8b9fedc38f48..7b38db2db1956 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,8 +131,10 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } - std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } + std::size_t integerKindForPointer() { return integerKindForPointer_; } + void set_integerKindForPointer(std::size_t newKind) { + integerKindForPointer_ = newKind; + } private: static constexpr int maxKind{common::maxKind}; @@ -159,7 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, 
IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; - std::size_t pointerSize_{8 /* bytes */}; + std::size_t integerKindForPointer_{8}; /* For 64 bit pointer */ }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index 24ab65f740ec6..002e82aa72484 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,7 +94,8 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); - targetCharacteristics.set_pointerSize( + // Currently the integer kind happens to be the same as the byte size + targetCharacteristics.set_integerKindForPointer( targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 88114bfe84af3..5fd5ea8f9bc5c 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; + TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 940ff38d0abba9d1b0ab415e42474629284962d8 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:51:53 -0400 Subject: [PATCH 4/4] clang-format --- flang/lib/Semantics/resolve-names.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 5fd5ea8f9bc5c..a8dbf61c8fd68 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ 
b/flang/lib/Semantics/resolve-names.cpp @@ -6689,8 +6689,8 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { "'%s' cannot be a Cray pointer as it is already a Cray pointee"_err_en_US); } pointer->set(Symbol::Flag::CrayPointer); - const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; + const DeclTypeSpec &pointerType{MakeNumericType(TypeCategory::Integer, + context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); From flang-commits at lists.llvm.org Thu May 22 19:57:13 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Thu, 22 May 2025 19:57:13 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix folding of SHAPE(SPREAD(source, dim, ncopies=-1)) (PR #141146) In-Reply-To: Message-ID: <682fe409.170a0220.380c8d.b3d5@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/141146 From flang-commits at lists.llvm.org Thu May 22 23:52:29 2025 From: flang-commits at lists.llvm.org (Jean-Didier PAILLEUX via flang-commits) Date: Thu, 22 May 2025 23:52:29 -0700 (PDT) Subject: [flang-commits] [flang] [DRAFT][PRIF] Defining PRIF dialect (PR #141203) Message-ID: https://github.com/JDPailleux created https://github.com/llvm/llvm-project/pull/141203 Hello, This PR is a discussion around a dialect to define “prif” operations in Flang. This is a draft and a few operations have been proposed to start with and see if they might be suitable. 
The set of operations will be based on what is presented in the PRIF specification, which can be found here: [https://doi.org/10.25344/S4CG6G](https://doi.org/10.25344/S4CG6G)

>From bda0bd400aff06a3fa90e8259d2a4a1be77e8f3e Mon Sep 17 00:00:00 2001
From: Jean-Didier Pailleux
Date: Thu, 15 May 2025 16:50:54 +0200
Subject: [PATCH] [DRAFT][PRIF] Dialect PRIF operations

---
 .../Optimizer/Dialect/PRIF/PRIFDialect.td     |  39 +++++
 .../flang/Optimizer/Dialect/PRIF/PRIFOps.td   | 146 ++++++++++++++++++
 2 files changed, 185 insertions(+)
 create mode 100644 flang/include/flang/Optimizer/Dialect/PRIF/PRIFDialect.td
 create mode 100644 flang/include/flang/Optimizer/Dialect/PRIF/PRIFOps.td

diff --git a/flang/include/flang/Optimizer/Dialect/PRIF/PRIFDialect.td b/flang/include/flang/Optimizer/Dialect/PRIF/PRIFDialect.td
new file mode 100644
index 0000000000000..f560f71e7c1d4
--- /dev/null
+++ b/flang/include/flang/Optimizer/Dialect/PRIF/PRIFDialect.td
@@ -0,0 +1,39 @@
+//===-- PRIFDialect.td - PRIF dialect base definitions -----*- tablegen -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// Definition of the PRIF (Parallel Runtime Interface for Fortran) dialect
+///
+//===----------------------------------------------------------------------===//
+
+#ifndef FORTRAN_DIALECT_PRIF_PRIFDIALECT
+#define FORTRAN_DIALECT_PRIF_PRIFDIALECT
+
+include "mlir/IR/AttrTypeBase.td"
+include "mlir/IR/EnumAttr.td"
+include "mlir/IR/OpBase.td"
+
+def PRIFDialect : Dialect {
+  let name = "prif";
+
+  let summary = "PRIF (Parallel Runtime Interface for Fortran) dialect";
+
+  let description = [{
+    The "prif" dialect is designed to contain the basic coarray operations
+    in Fortran as descibed in the PRIF specificatin.
+ This includes synchronization operations, atomic operations, + image queries, teams, criticals, etc. The PRIF dialect operations use + the FIR types and are tightly coupled with FIR and HLFIR. + }]; + + let usePropertiesForAttributes = 1; + let cppNamespace = "::prif"; + let dependentDialects = ["fir::FIROpsDialect"]; +} + +#endif // FORTRAN_DIALECT_PRIF_PRIFDIALECT diff --git a/flang/include/flang/Optimizer/Dialect/PRIF/PRIFOps.td b/flang/include/flang/Optimizer/Dialect/PRIF/PRIFOps.td new file mode 100644 index 0000000000000..9cad800c29cf2 --- /dev/null +++ b/flang/include/flang/Optimizer/Dialect/PRIF/PRIFOps.td @@ -0,0 +1,146 @@ +//===-- PRIFOps.td - PRIF operation definitions ------------*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +/// +/// \file +/// Definition of the PRIF dialect operations +/// +//===----------------------------------------------------------------------===// + +#ifndef FORTRAN_DIALECT_PRIF_PRIF_OPS +#define FORTRAN_DIALECT_PRIF_PRIF_OPS + +include "flang/Optimizer/Dialect/PRIF/PRIFDialect.td" +include "flang/Optimizer/Dialect/FIRTypes.td" +include "flang/Optimizer/Dialect/FIRAttr.td" +include "mlir/Dialect/LLVMIR/LLVMAttrDefs.td" +include "mlir/Dialect/LLVMIR/LLVMOpBase.td" +include "mlir/Interfaces/LoopLikeInterface.td" +include "mlir/IR/BuiltinAttributes.td" + +class prif_Op traits> + : Op; + +//===----------------------------------------------------------------------===// +// Initialization and Finalization +//===----------------------------------------------------------------------===// + +def prif_InitOp : prif_Op<"init", []> { + let summary = "Initialize the parallel environment"; + let description = [{This procedure will initialize the parallel environment}]; + + let 
arguments = (ins I32:$stat); +} + +def prif_StopOp : prif_Op<"stop", [AttrSizedOperandSegments]> { + let summary = "Procedure that synchronizes all execution images, clean up the" + "parallel runtime environment, and terminates the program."; + let description = [{ + This procedure synchronizes all executing images, cleans up the parallel + runtime environment, and terminates the program. Calls to this procedure do + not return. This procedure supports both normal termination at the end of a + program, as well as any STOP statements from the user source code. + }]; + + let arguments = (ins AnyReferenceLike:$quiet, + Optional:$stop_code_int, + Optional:$stop_code_char); + + let hasVerifier = 1; +} + +def prif_ErrorStopOp : prif_Op<"error_stop", [AttrSizedOperandSegments]> { + let summary = "This procedure terminates all executing images." + "Calls to this procedure do not return"; + let description = [{ + This procedure terminates all executing images and calls to this procedure + do not return. 
+ }]; + + let arguments = (ins AnyReferenceLike:$quiet, + Optional:$stop_code_int, + Optional:$stop_code_char); + + let hasVerifier = 1; +} + +//===----------------------------------------------------------------------===// +// Image Queries +//===----------------------------------------------------------------------===// + +def prif_NumImagesOp : prif_Op<"num_images", [AttrSizedOperandSegments]> { + let summary = "Query the number of images in the specified or current team"; + + let arguments = (ins Optional:$team_number, + Optional:$team); + let results = (outs I32); + + let skipDefaultBuilders = 1; + let builders = [ + OpBuilder<(ins CArg<"mlir::Value", "{}">:$team_number, + CArg<"mlir::Value", "{}">:$team), + [{ return build($_builder, $_state, team_number, team); }]> + ]; + + let hasVerifier = 1; +} + +def prif_ThisImageOp : prif_Op<"this_image", []> { + let summary = "Determine the image index of the current image"; + + let arguments = (ins Optional:$team); + let results = (outs I32); +} + +//===----------------------------------------------------------------------===// +// Coarray Queries +//===----------------------------------------------------------------------===// + +//===----------------------------------------------------------------------===// +// Allocation and Deallocation +//===----------------------------------------------------------------------===// + +def prif_AllocateCoarrayOp : prif_Op<"allocate_coarray", + [MemoryEffects<[MemAlloc]>]> { + let summary = "Perform the allocation of a coarray and provide a " + "corresponding coarray descriptor"; + + let description = [{ + This procedure allocates a coarray and provides a handle referencing a + corresponding coarray descriptor. This call is collective over the + current team. 
+ }]; + + let arguments = (ins Arg:$lcobounds, + Arg:$ucobound, + Arg:$size_in_bytes, + Arg:$final_func, + Arg:$coarray_handle, + Arg:$allocated_memory, + Arg, "", [MemWrite]>:$errmsg); + + let results = (outs I32:$stat); + + let hasVerifier = 1; +} + +//===----------------------------------------------------------------------===// +// Synchronization +//===----------------------------------------------------------------------===// + +def prif_SyncAllOp : prif_Op<"sync_all", [AttrSizedOperandSegments]> { + let summary = + "Performs a collective synchronization of all images in the current team"; + + let arguments = (ins Optional:$errmsg); + + let results = (outs I32:$stat); + + let hasVerifier = 1; +} + +#endif // FORTRAN_DIALECT_PRIF_PRIF_OPS From flang-commits at lists.llvm.org Thu May 22 23:52:34 2025 From: flang-commits at lists.llvm.org (Jean-Didier PAILLEUX via flang-commits) Date: Thu, 22 May 2025 23:52:34 -0700 (PDT) Subject: [flang-commits] [flang] [DRAFT][PRIF] Defining PRIF dialect (PR #141203) In-Reply-To: Message-ID: <68301b32.620a0220.35df9a.768a@mx.google.com> https://github.com/JDPailleux closed https://github.com/llvm/llvm-project/pull/141203 From flang-commits at lists.llvm.org Thu May 22 23:52:59 2025 From: flang-commits at lists.llvm.org (Jean-Didier PAILLEUX via flang-commits) Date: Thu, 22 May 2025 23:52:59 -0700 (PDT) Subject: [flang-commits] [flang] [DRAFT][PRIF] Defining PRIF dialect (PR #141203) In-Reply-To: Message-ID: <68301b4b.630a0220.ac96c.987c@mx.google.com> https://github.com/JDPailleux locked https://github.com/llvm/llvm-project/pull/141203 From flang-commits at lists.llvm.org Thu May 22 23:53:04 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 22 May 2025 23:53:04 -0700 (PDT) Subject: [flang-commits] [flang] [DRAFT][PRIF] Defining PRIF dialect (PR #141203) In-Reply-To: Message-ID: <68301b50.630a0220.1f0250.9181@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Jean-Didier PAILLEUX 
(JDPailleux)
Changes Hello, this PR opens a discussion around a dialect to define “prif” operations in Flang. It is a draft: a few operations are proposed as a starting point, to see whether they are suitable. The set of operations will be based on what is presented in the PRIF specification, which can be found here: [https://doi.org/10.25344/S4CG6G](https://doi.org/10.25344/S4CG6G) --- Full diff: https://github.com/llvm/llvm-project/pull/141203.diff 2 Files Affected: - (added) flang/include/flang/Optimizer/Dialect/PRIF/PRIFDialect.td (+39) - (added) flang/include/flang/Optimizer/Dialect/PRIF/PRIFOps.td (+146) ``````````diff diff --git a/flang/include/flang/Optimizer/Dialect/PRIF/PRIFDialect.td b/flang/include/flang/Optimizer/Dialect/PRIF/PRIFDialect.td new file mode 100644 index 0000000000000..f560f71e7c1d4 --- /dev/null +++ b/flang/include/flang/Optimizer/Dialect/PRIF/PRIFDialect.td @@ -0,0 +1,39 @@ +//===-- PRIFDialect.td - PRIF dialect base definitions -----*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +/// +/// \file +/// Definition of the PRIF (Parallel Runtime Interface for Fortran) dialect +/// +//===----------------------------------------------------------------------===// + +#ifndef FORTRAN_DIALECT_PRIF_PRIFDIALECT +#define FORTRAN_DIALECT_PRIF_PRIFDIALECT + +include "mlir/IR/AttrTypeBase.td" +include "mlir/IR/EnumAttr.td" +include "mlir/IR/OpBase.td" + +def PRIFDialect : Dialect { + let name = "prif"; + + let summary = "PRIF (Parallel Runtime Interface for Fortran) dialect"; + + let description = [{ + The "prif" dialect is designed to contain the basic coarray operations + in Fortran as descibed in the PRIF specificatin. 
+ This includes synchronization operations, atomic operations, + image queries, teams, criticals, etc. The PRIF dialect operations use + the FIR types and are tightly coupled with FIR and HLFIR. + }]; + + let usePropertiesForAttributes = 1; + let cppNamespace = "::prif"; + let dependentDialects = ["fir::FIROpsDialect"]; +} + +#endif // FORTRAN_DIALECT_PRIF_PRIFDIALECT diff --git a/flang/include/flang/Optimizer/Dialect/PRIF/PRIFOps.td b/flang/include/flang/Optimizer/Dialect/PRIF/PRIFOps.td new file mode 100644 index 0000000000000..9cad800c29cf2 --- /dev/null +++ b/flang/include/flang/Optimizer/Dialect/PRIF/PRIFOps.td @@ -0,0 +1,146 @@ +//===-- PRIFOps.td - PRIF operation definitions ------------*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +/// +/// \file +/// Definition of the PRIF dialect operations +/// +//===----------------------------------------------------------------------===// + +#ifndef FORTRAN_DIALECT_PRIF_PRIF_OPS +#define FORTRAN_DIALECT_PRIF_PRIF_OPS + +include "flang/Optimizer/Dialect/PRIF/PRIFDialect.td" +include "flang/Optimizer/Dialect/FIRTypes.td" +include "flang/Optimizer/Dialect/FIRAttr.td" +include "mlir/Dialect/LLVMIR/LLVMAttrDefs.td" +include "mlir/Dialect/LLVMIR/LLVMOpBase.td" +include "mlir/Interfaces/LoopLikeInterface.td" +include "mlir/IR/BuiltinAttributes.td" + +class prif_Op traits> + : Op; + +//===----------------------------------------------------------------------===// +// Initialization and Finalization +//===----------------------------------------------------------------------===// + +def prif_InitOp : prif_Op<"init", []> { + let summary = "Initialize the parallel environment"; + let description = [{This procedure will initialize the parallel environment}]; + + let 
arguments = (ins I32:$stat); +} + +def prif_StopOp : prif_Op<"stop", [AttrSizedOperandSegments]> { + let summary = "Procedure that synchronizes all execution images, clean up the" + "parallel runtime environment, and terminates the program."; + let description = [{ + This procedure synchronizes all executing images, cleans up the parallel + runtime environment, and terminates the program. Calls to this procedure do + not return. This procedure supports both normal termination at the end of a + program, as well as any STOP statements from the user source code. + }]; + + let arguments = (ins AnyReferenceLike:$quiet, + Optional:$stop_code_int, + Optional:$stop_code_char); + + let hasVerifier = 1; +} + +def prif_ErrorStopOp : prif_Op<"error_stop", [AttrSizedOperandSegments]> { + let summary = "This procedure terminates all executing images." + "Calls to this procedure do not return"; + let description = [{ + This procedure terminates all executing images and calls to this procedure + do not return. 
+ }]; + + let arguments = (ins AnyReferenceLike:$quiet, + Optional:$stop_code_int, + Optional:$stop_code_char); + + let hasVerifier = 1; +} + +//===----------------------------------------------------------------------===// +// Image Queries +//===----------------------------------------------------------------------===// + +def prif_NumImagesOp : prif_Op<"num_images", [AttrSizedOperandSegments]> { + let summary = "Query the number of images in the specified or current team"; + + let arguments = (ins Optional:$team_number, + Optional:$team); + let results = (outs I32); + + let skipDefaultBuilders = 1; + let builders = [ + OpBuilder<(ins CArg<"mlir::Value", "{}">:$team_number, + CArg<"mlir::Value", "{}">:$team), + [{ return build($_builder, $_state, team_number, team); }]> + ]; + + let hasVerifier = 1; +} + +def prif_ThisImageOp : prif_Op<"this_image", []> { + let summary = "Determine the image index of the current image"; + + let arguments = (ins Optional:$team); + let results = (outs I32); +} + +//===----------------------------------------------------------------------===// +// Coarray Queries +//===----------------------------------------------------------------------===// + +//===----------------------------------------------------------------------===// +// Allocation and Deallocation +//===----------------------------------------------------------------------===// + +def prif_AllocateCoarrayOp : prif_Op<"allocate_coarray", + [MemoryEffects<[MemAlloc]>]> { + let summary = "Perform the allocation of a coarray and provide a " + "corresponding coarray descriptor"; + + let description = [{ + This procedure allocates a coarray and provides a handle referencing a + corresponding coarray descriptor. This call is collective over the + current team. 
+ }]; + + let arguments = (ins Arg:$lcobounds, + Arg:$ucobound, + Arg:$size_in_bytes, + Arg:$final_func, + Arg:$coarray_handle, + Arg:$allocated_memory, + Arg, "", [MemWrite]>:$errmsg); + + let results = (outs I32:$stat); + + let hasVerifier = 1; +} + +//===----------------------------------------------------------------------===// +// Synchronization +//===----------------------------------------------------------------------===// + +def prif_SyncAllOp : prif_Op<"sync_all", [AttrSizedOperandSegments]> { + let summary = + "Performs a collective synchronization of all images in the current team"; + + let arguments = (ins Optional:$errmsg); + + let results = (outs I32:$stat); + + let hasVerifier = 1; +} + +#endif // FORTRAN_DIALECT_PRIF_PRIF_OPS ``````````
https://github.com/llvm/llvm-project/pull/141203 From flang-commits at lists.llvm.org Fri May 23 00:05:02 2025 From: flang-commits at lists.llvm.org (Jean-Didier PAILLEUX via flang-commits) Date: Fri, 23 May 2025 00:05:02 -0700 (PDT) Subject: [flang-commits] [flang] [DRAFT][PRIF] Defining PRIF dialect (PR #141203) In-Reply-To: Message-ID: <68301e1e.170a0220.ebf9.046a@mx.google.com> https://github.com/JDPailleux edited https://github.com/llvm/llvm-project/pull/141203 From flang-commits at lists.llvm.org Fri May 23 01:07:48 2025 From: flang-commits at lists.llvm.org (Yang Zaizhou via flang-commits) Date: Fri, 23 May 2025 01:07:48 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] fix crash on sematic error in atomic capture clause (PR #140710) In-Reply-To: Message-ID: <68302cd4.170a0220.7c2e2.40d9@mx.google.com> Mxfg-incense wrote: @tblah I don't have merge permissions. Could you help me merge it? https://github.com/llvm/llvm-project/pull/140710 From flang-commits at lists.llvm.org Fri May 23 02:30:44 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 23 May 2025 02:30:44 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] Add missing trig math-to-llvm conversion patterns (PR #141069) In-Reply-To: Message-ID: <68304044.650a0220.32051b.02c5@mx.google.com> https://github.com/jeanPerier approved this pull request. Thanks, looks good to me. Nit: should add `[mlir][math]` to your commit/PR title. I agree it is a bit odd for the relaxed/fast FIR tests to be exactly the same. Makes sense to me to add some FMF to the fast case and check how that propagates in a separate patch. @zero9178 or @matthias-springer for more visibility among MLIR reviewers. 
https://github.com/llvm/llvm-project/pull/141069 From flang-commits at lists.llvm.org Fri May 23 05:14:43 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 23 May 2025 05:14:43 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] fix crash on sematic error in atomic capture clause (PR #140710) In-Reply-To: Message-ID: <683066b3.050a0220.3be8e8.6b91@mx.google.com> ================ @@ -0,0 +1,22 @@ +! REQUIRES: openmp_runtime + +! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags +! Semantic checks on invalid atomic capture clause + +use omp_lib ---------------- kparzysz wrote: FYI, "omp_lib" is not needed for anything in this example. https://github.com/llvm/llvm-project/pull/140710 From flang-commits at lists.llvm.org Fri May 23 05:15:13 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 23 May 2025 05:15:13 -0700 (PDT) Subject: [flang-commits] [flang] 5530474 - [Flang][OpenMP] fix crash on sematic error in atomic capture clause (#140710) Message-ID: <683066d1.a70a0220.17d434.8ea3@mx.google.com> Author: Yang Zaizhou Date: 2025-05-23T07:15:10-05:00 New Revision: 5530474e3e84edd02c85043c60e4df967fee7f26 URL: https://github.com/llvm/llvm-project/commit/5530474e3e84edd02c85043c60e4df967fee7f26 DIFF: https://github.com/llvm/llvm-project/commit/5530474e3e84edd02c85043c60e4df967fee7f26.diff LOG: [Flang][OpenMP] fix crash on sematic error in atomic capture clause (#140710) Fix a crash caused by an invalid expression in the atomic capture clause, due to the `checkForSymbolMatch` function not accounting for `GetExpr` potentially returning null. 
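The crash pattern described in the log message above can be sketched in isolation. This is only an illustration under stated assumptions: `Expr`, `symbolMatches`, and `isUpdateStatement` are hypothetical stand-ins, not flang's `SomeExpr`, `checkForSymbolMatch`, or `GetExpr`. The point is that a failed expression analysis is represented by a null pointer, so the caller filters nulls before the symbol comparison dereferences anything.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a typed expression: only the symbols it references.
struct Expr {
  std::vector<std::string> symbols;
};

// Returns true when the first symbol of `lhs` also appears in `rhs`.
// Precondition (assumed, mirroring the pointer-taking signature in the patch):
// both pointers are non-null.
inline bool symbolMatches(const Expr *lhs, const Expr *rhs) {
  assert(lhs && rhs && "caller must filter out failed expressions");
  if (lhs->symbols.empty())
    return false;
  const std::string &lhsSymbol = lhs->symbols.front();
  for (const std::string &symbol : rhs->symbols) {
    if (symbol == lhsSymbol)
      return true;
  }
  return false;
}

// Caller-side guard: a null pointer models an expression whose semantic
// analysis failed, so the symbol check is skipped instead of crashing.
inline bool isUpdateStatement(const Expr *var, const Expr *expr) {
  return var && expr && symbolMatches(var, expr);
}
```

With this shape, a statement such as `y = y + x` whose operands failed type checking would simply be reported as "not an update statement" in the sketch, leaving the earlier semantic error as the only diagnostic.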
Fix https://github.com/llvm/llvm-project/issues/139884 Added: flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 Modified: flang/include/flang/Semantics/tools.h flang/lib/Lower/OpenACC.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Semantics/check-omp-structure.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..3839bc1d2a215 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -764,19 +764,14 @@ inline bool checkForSingleVariableOnRHS( return designator != nullptr; } -/// Checks if the symbol on the LHS of the assignment statement is present in -/// the RHS expression. -inline bool checkForSymbolMatch( - const Fortran::parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - const auto *e{Fortran::semantics::GetExpr(expr)}; - const auto *v{Fortran::semantics::GetExpr(var)}; - auto varSyms{Fortran::evaluate::GetSymbolVector(*v)}; - const Fortran::semantics::Symbol &varSymbol{*varSyms.front()}; +/// Checks if the symbol on the LHS is present in the RHS expression. 
+inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, + const Fortran::semantics::SomeExpr *rhs) { + auto lhsSyms{Fortran::evaluate::GetSymbolVector(*lhs)}; + const Fortran::semantics::Symbol &lhsSymbol{*lhsSyms.front()}; for (const Fortran::semantics::Symbol &symbol : - Fortran::evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { + Fortran::evaluate::GetSymbolVector(*rhs)) { + if (lhsSymbol == symbol) { return true; } } diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 0405510baeb9b..bb8b4d8e833c2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -654,7 +654,9 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + if (Fortran::semantics::checkForSymbolMatch( + Fortran::semantics::GetExpr(stmt2Var), + Fortran::semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] const Fortran::semantics::SomeExpr &fromExpr = *Fortran::semantics::GetExpr(stmt1Expr); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 02c09d4eea041..5a975384bd371 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3198,7 +3198,8 @@ static void genAtomicCapture(lower::AbstractConverter &converter, mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { + if (semantics::checkForSymbolMatch(semantics::GetExpr(stmt2Var), + semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); mlir::Type 
elementType = converter.genType(fromExpr); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 606014276e7ca..bda0d62829506 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2910,45 +2910,47 @@ void OmpStructureChecker::CheckAtomicCaptureConstruct( .v.statement; const auto &stmt1Var{std::get(stmt1.t)}; const auto &stmt1Expr{std::get(stmt1.t)}; + const auto *v1 = GetExpr(context_, stmt1Var); + const auto *e1 = GetExpr(context_, stmt1Expr); const parser::AssignmentStmt &stmt2 = std::get(atomicCaptureConstruct.t) .v.statement; const auto &stmt2Var{std::get(stmt2.t)}; const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); + const auto *v2 = GetExpr(context_, stmt2Var); + const auto *e2 = GetExpr(context_, stmt2Expr); + + if (e1 && v1 && e2 && v2) { + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + CheckAtomicCaptureStmt(stmt1); + if (semantics::checkForSymbolMatch(v2, e2)) { + // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] + CheckAtomicUpdateStmt(stmt2); + } else { + // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] + CheckAtomicWriteStmt(stmt2); + } + if (!(*e1 == *v2)) { + context_.Say(stmt1Expr.source, + "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, + stmt1Expr.source); + } + } else if (semantics::checkForSymbolMatch(v1, e1) && + semantics::checkForSingleVariableOnRHS(stmt2)) { + // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] + CheckAtomicUpdateStmt(stmt1); + CheckAtomicCaptureStmt(stmt2); + // Variable updated in stmt1 should be captured in stmt2 + if 
(!(*v1 == *e2)) { + context_.Say(stmt1Var.GetSource(), + "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, + stmt1Var.GetSource()); + } } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); + "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } - } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } } diff --git a/flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 b/flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 new file mode 100644 index 0000000000000..cb9c73cc940db --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-capture-invalid.f90 @@ -0,0 +1,22 @@ +! REQUIRES: openmp_runtime + +! 
RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags +! Semantic checks on invalid atomic capture clause + +use omp_lib + logical x + complex y + !$omp atomic capture + !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches operand types LOGICAL(4) and COMPLEX(4) + x = y + !ERROR: Operands of + must be numeric; have COMPLEX(4) and LOGICAL(4) + y = y + x + !$omp end atomic + + !$omp atomic capture + !ERROR: Operands of + must be numeric; have COMPLEX(4) and LOGICAL(4) + y = y + x + !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches operand types LOGICAL(4) and COMPLEX(4) + x = y + !$omp end atomic +end From flang-commits at lists.llvm.org Fri May 23 05:15:16 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 23 May 2025 05:15:16 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][OpenMP] fix crash on sematic error in atomic capture clause (PR #140710) In-Reply-To: Message-ID: <683066d4.170a0220.8143e.0f57@mx.google.com> https://github.com/kparzysz closed https://github.com/llvm/llvm-project/pull/140710 From flang-commits at lists.llvm.org Fri May 23 06:24:15 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 23 May 2025 06:24:15 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <683076ff.170a0220.162f30.262f@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 >From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:48:48 -0400 Subject: [PATCH 1/4] [flang] Ensure that the integer for Cray pointer is sized correctly The integer used for Cray pointers should have the size equivalent to platform's pointer size. 
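As a rough sketch of the sizing rule stated above: the helper below derives an INTEGER kind from a pointer width in bits. The name `integerKindForPointerBits` is hypothetical, and the sketch assumes that an INTEGER kind number equals its size in bytes, so a 64-bit pointer maps to kind 8 and a 32-bit pointer to kind 4.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: maps a target pointer width in bits to the INTEGER
// kind used for a Cray pointer, assuming kind == size in bytes.
constexpr std::size_t integerKindForPointerBits(unsigned pointerBitWidth) {
  return pointerBitWidth / 8;
}

// Compile-time checks for the two common pointer widths.
static_assert(integerKindForPointerBits(64) == 8, "64-bit pointer -> kind 8");
static_assert(integerKindForPointerBits(32) == 4, "32-bit pointer -> kind 4");
```

The design point is that the kind follows the target rather than a fixed default, so a 32-bit target does not end up with an 8-byte integer holding a 4-byte address.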
modified: flang/include/flang/Evaluate/target.h modified: flang/include/flang/Tools/TargetSetup.h modified: flang/lib/Semantics/resolve-names.cpp --- flang/include/flang/Evaluate/target.h | 6 ++++++ flang/include/flang/Tools/TargetSetup.h | 3 +++ flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..281e42d5285fb 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t pointerSize() { return pointerSize_; } + void set_pointerSize(std::size_t pointerSize) { + pointerSize_ = pointerSize; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. 
} diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 30477b842ae64fb1c94f520136956cc4685648a7 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:59:20 -0400 Subject: [PATCH 2/4] clang-format --- flang/include/flang/Evaluate/target.h | 4 +--- flang/include/flang/Tools/TargetSetup.h | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d5285fb..e8b9fedc38f48 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ class TargetCharacteristics { const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e21fb4..24ab65f740ec6 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + 
targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. >From e1f91c83680d72fc1463a8db97a77298141286d9 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:41:48 -0400 Subject: [PATCH 3/4] Renaming to integerKindForPointer --- flang/include/flang/Evaluate/target.h | 8 +++++--- flang/include/flang/Tools/TargetSetup.h | 3 ++- flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index e8b9fedc38f48..7b38db2db1956 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,8 +131,10 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } - std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } + std::size_t integerKindForPointer() { return integerKindForPointer_; } + void set_integerKindForPointer(std::size_t newKind) { + integerKindForPointer_ = newKind; + } private: static constexpr int maxKind{common::maxKind}; @@ -159,7 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; - std::size_t pointerSize_{8 /* bytes */}; + std::size_t integerKindForPointer_{8}; /* For 64 bit pointer */ }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index 24ab65f740ec6..002e82aa72484 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,7 +94,8 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); - 
targetCharacteristics.set_pointerSize( + // Currently the integer kind happens to be the same as the byte size + targetCharacteristics.set_integerKindForPointer( targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 88114bfe84af3..5fd5ea8f9bc5c 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; + TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 940ff38d0abba9d1b0ab415e42474629284962d8 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:51:53 -0400 Subject: [PATCH 4/4] clang-format --- flang/lib/Semantics/resolve-names.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 5fd5ea8f9bc5c..a8dbf61c8fd68 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6689,8 +6689,8 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { "'%s' cannot be a Cray pointer as it is already a Cray pointee"_err_en_US); } pointer->set(Symbol::Flag::CrayPointer); - const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; + const DeclTypeSpec &pointerType{MakeNumericType(TypeCategory::Integer, + context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); From flang-commits at 
lists.llvm.org Fri May 23 06:53:01 2025 From: flang-commits at lists.llvm.org (Matthias Springer via flang-commits) Date: Fri, 23 May 2025 06:53:01 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] Add missing trig math-to-llvm conversion patterns (PR #141069) In-Reply-To: Message-ID: <68307dbd.050a0220.24c15f.944b@mx.google.com> https://github.com/matthias-springer approved this pull request. https://github.com/llvm/llvm-project/pull/141069 From flang-commits at lists.llvm.org Fri May 23 07:07:26 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Fri, 23 May 2025 07:07:26 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [mlir][math] Add missing trig math-to-llvm conversion patterns (PR #141069) In-Reply-To: Message-ID: <6830811e.170a0220.1bdf95.52ca@mx.google.com> https://github.com/ashermancinelli edited https://github.com/llvm/llvm-project/pull/141069 From flang-commits at lists.llvm.org Thu May 22 13:48:13 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 22 May 2025 13:48:13 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (WORK IN PROGRESS) (PR #137727) In-Reply-To: Message-ID: <682f8d8d.170a0220.18d9da.2361@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From c243d222fe682c92670349353bc326d112a5e30b Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, Destroy, and DescriptorIO. 
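The general technique described above, replacing recursion with resumable work items on an explicit queue, can be sketched as follows. This is not the runtime's implementation: `Node`, `Ticket`, and `sumIteratively` are hypothetical names, and the sketch uses a heap-backed `std::vector` that device code would likely replace with fixed-size storage. Each ticket records how far it has progressed, so it can be suspended when a child is scheduled and resumed afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical nested value, standing in for a derived type with components.
struct Node {
  int value = 0;
  std::vector<Node> children;
};

// A resumable work item: remembers which child to process next.
struct Ticket {
  const Node *node;
  std::size_t nextChild;
};

// Visits every node without recursive calls, so call-stack depth does not
// depend on how deeply the data is nested.
inline long sumIteratively(const Node &root) {
  long total = 0;
  std::vector<Ticket> queue;
  queue.push_back(Ticket{&root, 0});
  while (!queue.empty()) {
    Ticket &ticket = queue.back();
    if (ticket.nextChild == 0)
      total += ticket.node->value; // first time this ticket runs
    if (ticket.nextChild < ticket.node->children.size()) {
      // Suspend this ticket and schedule the child; the index is advanced
      // before push_back because push_back may invalidate `ticket`.
      const Node *child = &ticket.node->children[ticket.nextChild++];
      queue.push_back(Ticket{child, 0});
    } else {
      queue.pop_back(); // ticket is complete
    }
  }
  return total;
}
```

Because the pending work lives in an explicit container rather than in call frames, the maximum call-stack usage is fixed by the loop itself, which is the property the commit message cites for link-time stack size calculation.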
Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. There is a fast(?) mode in the work queue implementation that causes new work items to be executed to completion immediately upon creation, saving the overhead of actually representing and managing the work queue. This mode can't be used on GPU devices, but it is enabled by default for CPU hosts. It can be disabled easily for debugging and performance testing. --- .../include/flang-rt/runtime/environment.h | 1 + flang-rt/include/flang-rt/runtime/stat.h | 10 +- .../include/flang-rt/runtime/work-queue.h | 538 +++++++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 549 +++++++++------ flang-rt/lib/runtime/derived.cpp | 519 +++++++------- flang-rt/lib/runtime/descriptor-io.cpp | 646 +++++++++++++++++- flang-rt/lib/runtime/descriptor-io.h | 620 +---------------- flang-rt/lib/runtime/environment.cpp | 4 + flang-rt/lib/runtime/namelist.cpp | 1 + flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 145 ++++ flang-rt/unittests/Runtime/ExternalIOTest.cpp | 2 +- flang/docs/Extensions.md | 10 + flang/include/flang/Runtime/assign.h | 2 +- 15 files changed, 1974 insertions(+), 1081 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h index 16258b3bbba9b..87fe1f92ba545 100644 --- a/flang-rt/include/flang-rt/runtime/environment.h +++ b/flang-rt/include/flang-rt/runtime/environment.h @@ -63,6 +63,7 @@ struct ExecutionEnvironment { bool 
noStopMessage{false}; // NO_STOP_MESSAGE=1 inhibits "Fortran STOP" bool defaultUTF8{false}; // DEFAULT_UTF8 bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION + int internalDebugging{0}; // FLANG_RT_DEBUG // CUDA related variables std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE diff --git a/flang-rt/include/flang-rt/runtime/stat.h b/flang-rt/include/flang-rt/runtime/stat.h index 070d0bf8673fb..dc372de53506a 100644 --- a/flang-rt/include/flang-rt/runtime/stat.h +++ b/flang-rt/include/flang-rt/runtime/stat.h @@ -24,7 +24,7 @@ class Terminator; enum Stat { StatOk = 0, // required to be zero by Fortran - // Interoperable STAT= codes + // Interoperable STAT= codes (>= 11) StatBaseNull = CFI_ERROR_BASE_ADDR_NULL, StatBaseNotNull = CFI_ERROR_BASE_ADDR_NOT_NULL, StatInvalidElemLen = CFI_INVALID_ELEM_LEN, @@ -36,7 +36,7 @@ enum Stat { StatMemAllocation = CFI_ERROR_MEM_ALLOCATION, StatOutOfBounds = CFI_ERROR_OUT_OF_BOUNDS, - // Standard STAT= values + // Standard STAT= values (>= 101) StatFailedImage = FORTRAN_RUNTIME_STAT_FAILED_IMAGE, StatLocked = FORTRAN_RUNTIME_STAT_LOCKED, StatLockedOtherImage = FORTRAN_RUNTIME_STAT_LOCKED_OTHER_IMAGE, @@ -49,10 +49,14 @@ enum Stat { // Additional "processor-defined" STAT= values StatInvalidArgumentNumber = FORTRAN_RUNTIME_STAT_INVALID_ARG_NUMBER, StatMissingArgument = FORTRAN_RUNTIME_STAT_MISSING_ARG, - StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, + StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, // -1 StatMoveAllocSameAllocatable = FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE, StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION, + + // Dummy status for work queue continuation, declared here to perhaps + // avoid collisions + StatContinue = 201 }; RT_API_ATTRS const char *StatErrorString(int); diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..2b46890aeebe1 
--- /dev/null
+++ b/flang-rt/include/flang-rt/runtime/work-queue.h
@@ -0,0 +1,538 @@
+//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// Internal runtime utilities for work queues that replace the use of recursion
+// for better GPU device support.
+//
+// A work queue comprises a list of tickets. Each ticket class has a Begin()
+// member function, which is called once, and a Continue() member function
+// that can be called zero or more times. A ticket's execution terminates
+// when either of these member functions returns a status other than
+// StatContinue. When that status is not StatOk, then the whole queue
+// is shut down.
+//
+// By returning StatContinue from its Continue() member function,
+// a ticket suspends its execution so that any nested tickets that it
+// may have created can be run to completion. It is the responsibility
+// of each ticket class to maintain resumption information in its state
+// and manage its own progress. Most ticket classes inherit from
+// class ComponentsOverElements, which implements an outer loop over all
+// components of a derived type, and an inner loop over all elements
+// of a descriptor, possibly with multiple phases of execution per element.
+//
+// Tickets are created by WorkQueue::Begin...() member functions.
+// There is one of these for each "top level" recursive function in the
+// Fortran runtime support library that has been restructured into this
+// ticket framework.
+//
+// When the work queue is running tickets, it always selects the last ticket
+// on the list for execution -- "work stack" might have been a more accurate
+// name for this framework.
This ticket may, while doing its job, create
+// new tickets, and since those are pushed after the active one, the first
+// such nested ticket will be the next one executed to completion -- i.e.,
+// the order of nested WorkQueue::Begin...() calls is respected.
+// Note that a ticket's Continue() member function won't be called again
+// until all nested tickets have run to completion and it is once again
+// the last ticket on the queue.
+//
+// Example for an assignment to a derived type:
+// 1. Assign() is called, and its work queue is created. It calls
+//    WorkQueue::BeginAssign() and then WorkQueue::Run().
+// 2. Run calls AssignTicket::Begin(), which pushes a ticket via
+//    BeginFinalize() and returns StatContinue.
+// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called
+//    until one of them returns StatOk, which ends the finalization ticket.
+// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket
+//    and then returns StatOk, which ends the ticket.
+// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin()
+//    and ::Continue() are called until they are done (not StatContinue).
+//    Along the way, it may create nested AssignTickets for components,
+//    and suspend itself so that they may each run to completion.
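[Illustrative aside, not part of the patch.] The ticket discipline described in the header comment above can be sketched in a few lines of standalone C++. This is only a rough model under stated assumptions: `SketchQueue`, `SketchTicket`, `SketchStat`, and `RunDemo` are hypothetical names invented for this note, `std::function` and `std::list` stand in for the patch's `std::variant` of ticket classes and its intrusive `TicketList`, and placing nested tickets directly after the active one is one plausible reading of "pushed after the active one" rather than a description of the actual `StartTicket()` code.

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical status codes, loosely mirroring StatOk and StatContinue.
enum SketchStat { kOk = 0, kError = 1, kContinue = 201 };

class SketchQueue;
using Step = std::function<int(SketchQueue &)>;

// A ticket is resumable work: begin runs once, resume zero or more times.
struct SketchTicket {
  SketchTicket(Step b, Step r) : begin(std::move(b)), resume(std::move(r)) {}
  Step begin, resume;
  bool begun{false};
};

class SketchQueue {
public:
  // While a ticket is running, new tickets are placed directly after it, so
  // the first nested ticket ends up last on the list and is executed first.
  void Push(SketchTicket ticket) {
    auto position{running_ ? std::next(active_) : tickets_.end()};
    tickets_.insert(position, std::move(ticket));
  }

  // Always runs the last ticket on the list until the list is empty.
  int Run() {
    assert(!running_); // this sketch does not support reentrant Run() calls
    while (!tickets_.empty()) {
      active_ = std::prev(tickets_.end());
      running_ = true;
      int status{kOk};
      if (!active_->begun) {
        active_->begun = true;
        status = active_->begin(*this);
      } else {
        status = active_->resume(*this);
      }
      running_ = false;
      if (status == kContinue) {
        continue; // suspended: any nested tickets now run to completion
      }
      tickets_.erase(active_); // the ticket has finished
      if (status != kOk) {
        tickets_.clear(); // an error shuts the whole queue down
        return status;
      }
    }
    return kOk;
  }

private:
  std::list<SketchTicket> tickets_; // std::list keeps running tickets in place
  std::list<SketchTicket>::iterator active_;
  bool running_{false};
};

// A parent ticket creates two children, suspends, and finishes after them.
std::vector<std::string> RunDemo() {
  std::vector<std::string> trace;
  auto child{[&trace](const std::string &name) {
    return SketchTicket{
        [&trace, name](SketchQueue &) { trace.push_back(name); return kOk; },
        [](SketchQueue &) { return kOk; }};
  }};
  SketchQueue queue;
  queue.Push(SketchTicket{
      [&trace, child](SketchQueue &q) {
        trace.push_back("parent begin");
        q.Push(child("child 1"));
        q.Push(child("child 2"));
        return kContinue;
      },
      [&trace](SketchQueue &) { trace.push_back("parent done"); return kOk; }});
  queue.Run();
  return trace;
}
```

In this sketch the parent's resume step only runs once both children have completed, and the children run in the order in which they were created, which is the property the comment above calls respecting the order of nested `WorkQueue::Begin...()` calls.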
+
+#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_
+#define FLANG_RT_RUNTIME_WORK_QUEUE_H_
+
+#include "flang-rt/runtime/connection.h"
+#include "flang-rt/runtime/descriptor.h"
+#include "flang-rt/runtime/stat.h"
+#include "flang-rt/runtime/type-info.h"
+#include "flang/Common/api-attrs.h"
+#include "flang/Runtime/freestanding-tools.h"
+#include <variant>
+
+namespace Fortran::runtime::io {
+class IoStatementState;
+struct NonTbpDefinedIoTable;
+} // namespace Fortran::runtime::io
+
+namespace Fortran::runtime {
+class Terminator;
+class WorkQueue;
+
+// Ticket worker base classes
+
+template <typename TICKET> class ImmediateTicketRunner {
+public:
+  RT_API_ATTRS explicit ImmediateTicketRunner(TICKET &ticket)
+      : ticket_{ticket} {}
+  RT_API_ATTRS int Run(WorkQueue &workQueue) {
+    int status{ticket_.Begin(workQueue)};
+    while (status == StatContinue) {
+      status = ticket_.Continue(workQueue);
+    }
+    return status;
+  }
+
+private:
+  TICKET &ticket_;
+};
+
+// Base class for ticket workers that operate elementwise over descriptors
+class Elementwise {
+protected:
+  RT_API_ATTRS Elementwise(
+      const Descriptor &instance, const Descriptor *from = nullptr)
+      : instance_{instance}, from_{from} {
+    instance_.GetLowerBounds(subscripts_);
+    if (from_) {
+      from_->GetLowerBounds(fromSubscripts_);
+    }
+  }
+  RT_API_ATTRS bool IsComplete() const { return elementAt_ >= elements_; }
+  RT_API_ATTRS void Advance() {
+    ++elementAt_;
+    instance_.IncrementSubscripts(subscripts_);
+    if (from_) {
+      from_->IncrementSubscripts(fromSubscripts_);
+    }
+  }
+  RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; }
+  RT_API_ATTRS void Reset() {
+    elementAt_ = 0;
+    instance_.GetLowerBounds(subscripts_);
+    if (from_) {
+      from_->GetLowerBounds(fromSubscripts_);
+    }
+  }
+
+  const Descriptor &instance_, *from_{nullptr};
+  std::size_t elements_{instance_.Elements()};
+  std::size_t elementAt_{0};
+  SubscriptValue subscripts_[common::maxRank];
+  SubscriptValue fromSubscripts_[common::maxRank];
+};
+
+// Base class for ticket workers that
operate over derived type components. +class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentsOverElements : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Ticket worker classes + +// Implements derived type instance initialization +class InitializeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket + : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor 
cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket : public ImmediateTicketRunner { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : ImmediateTicketRunner{*this}, to_{to}, from_{&from}, + flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type 
intrinsic assignment. +class DerivedAssignTicket : public ImmediateTicketRunner, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ImmediateTicketRunner{*this}, + ElementsOverComponents{to, derived, &from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { + +template +class DescriptorIoTicket + : public ImmediateTicketRunner>, + private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS bool &anyIoTookPlace() { return anyIoTookPlace_; } + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; + +template +class DerivedIoTicket : public ImmediateTicketRunner>, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} 
{} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; +}; + +} // namespace io::descr + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + io::descr::DescriptorIoTicket, + io::descr::DerivedIoTicket, + io::descr::DerivedIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + // APIs for particular tasks. These can return StatOk if the work is + // completed immediately. 
+ RT_API_ATTRS int BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return InitializeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + if (runTicketsImmediately_) { + return InitializeCloneTicket{clone, original, derived, hasStat, errMsg} + .Run(*this); + } else { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); + return StatContinue; + } + } + RT_API_ATTRS int BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return FinalizeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + if (runTicketsImmediately_) { + return DestroyTicket{descriptor, derived, finalize}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived, finalize); + return StatContinue; + } + } + RT_API_ATTRS int BeginAssign(Descriptor &to, const Descriptor &from, + int flags, MemmoveFct memmoveFct) { + if (runTicketsImmediately_) { + return AssignTicket{to, from, flags, memmoveFct}.Run(*this); + } else { + StartTicket().u.emplace(to, from, flags, memmoveFct); + return StatContinue; + } + } + RT_API_ATTRS int BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) { + if (runTicketsImmediately_) { + return DerivedAssignTicket{ + to, from, derived, flags, memmoveFct, deallocateAfter} + .Run(*this); + } else { + StartTicket().u.emplace( + to, from, derived, flags, 
memmoveFct, deallocateAfter); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDescriptorIo(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DescriptorIoTicket{ + io, descriptor, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, table, anyIoTookPlace); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedIo(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DerivedIoTicket{ + io, descriptor, derived, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + return StatContinue; + } + } + + RT_API_ATTRS int Run(); + +private: +#if RT_DEVICE_COMPILATION + // Always use the work queue on a GPU device to avoid recursion. + static constexpr bool runTicketsImmediately_{false}; +#else + // Avoid the work queue overhead on the host, unless it needs + // debugging, which is so much easier there. + static constexpr bool runTicketsImmediately_{true}; +#endif + + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 9be75da9520e3..cc2000ddfdb6e 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncId)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic assignments and 
// those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,373 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + if (workQueue.BeginAssign(to, from, flags, memmoveFct) == StatContinue) { + workQueue.Run(); + } +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + if (int status{workQueue.BeginFinalize(*toDeallocate_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncId))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(newFrom, *derived)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + } + static constexpr int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (int status{workQueue.BeginAssign( + newFrom, *from_, nestedFlags, memmoveFct_)}; + status != StatOk && status != StatContinue) { + return status; } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && 
toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + if (int status{workQueue.BeginFinalize(to_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!toDerived_->noDestructionNeeded()) { + if (int status{ + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + return StatContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); } + return StatOk; } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. 
A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(to_, *toDerived_)}; + status != StatOk) { + return status; + } + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t 
toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); - } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. 
- // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } + if (toDerived_) { + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_( + instance_.ElementComponent(subscripts_, procPtr.offset), + from_->ElementComponent( + fromSubscripts_, procPtr.offset), + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementComponentBase so that it + // loops over elements at the outer level and over components at the inner. + // This yield less surprising behavior for codes that care due to the use + // of defined assignments when the ordering of their calls matters. + while (!IsComplete()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, *from_, workQueue.terminator(), fromSubscripts_); + Advance(); + if (int status{workQueue.BeginAssign( + toCompDesc, fromCompDesc, flags_, memmoveFct_)}; + status != StatOk) { + return status; + } + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_->Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } + } + toDesc->Deallocate(); + } + Advance(); + } else { + // Allocatable components of the LHS are 
unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + int nestedFlags{flags_ | DeallocateLHS}; + Advance(); + if (int status{workQueue.BeginAssign( + *toDesc, *fromDesc, nestedFlags, memmoveFct_)}; + status != StatOk) { + return status; + } + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +679,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -597,11 +694,11 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
- if (var) + if (var) { Assign(*var, temp, terminator, NoAssignFlags); + } temp.Destroy(/*finalize=*/false, /*destroyPointers=*/false, &terminator); } diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..8462d0aba1f06 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,195 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitialize(instance, derived)}; + return status == StatContinue ? 
workQueue.Run() : status; +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &comp{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + auto &pptr{*instance_.ElementComponent( + subscripts_, comp.offset)}; + pptr = comp.procInitialization; + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + SkipToNextComponent(); + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t 
bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitialize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitializeClone( + clone, original, derived, hasStat, errMsg)}; + return status == StatContinue ? workQueue.Run() : status; } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. 
- stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. - if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncId), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + if (int status{workQueue.BeginInitialize(cloneDesc, *derived)}; + status != StatOk) { + return status; } } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_)}; + status != StatOk) { + return status; + } + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); + } + } + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + if (workQueue.BeginFinalize(descriptor, derived) == StatContinue) { + workQueue.Run(); } } - return stat; } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +237,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +274,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,87 +298,94 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with the same
// descriptor so that the rank is preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + if (int status{ + workQueue.BeginFinalize(compDesc, *compDynamicType)}; + status != StatOk) { + return status; } } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginFinalize(compDesc, *compType)}; + status != StatOk) { + return status; } } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == 
typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginFinalize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + const auto &parentType{*finalizableParentType_}; + finalizableParentType_ = nullptr; + // Don't return StatOk here if the nested Finalize is still running; + // it needs this->componentDescriptor_. + return workQueue.BeginFinalize(tmpDesc, parentType); } + return StatOk; } // The order of finalization follows Fortran 2018 7.5.6.2, with @@ -373,51 +394,71 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + if (workQueue.BeginDestroy(descriptor, derived, finalize) == StatContinue) { + workQueue.Run(); + } } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + if (int status{workQueue.BeginFinalize(instance_, derived_)}; + status != StatOk && status != StatContinue) { + return status; + } } + return StatContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. 
// Contrary to finalization, the order of deallocation does not matter. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *d, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || 
componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginDestroy( + compDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..de2b9a788a25e 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,12 +7,40 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
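Editorial note, not part of the patch: the Begin()/Continue() tickets above replace direct recursion (Finalize calling Finalize, Destroy calling Destroy) with work items that a queue resumes until each one reports completion. The sketch below illustrates that general idea on a plain tree. It is a minimal illustration under stated assumptions, not the flang-rt implementation: Node, VisitTicket, VisitAll, and this WorkQueue are hypothetical names, and only the status names StatOk and StatContinue are borrowed from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical status codes, named after the values used in the patch.
enum Status { StatOk = 0, StatContinue = 1 };

// Hypothetical stand-in for a derived-type instance with nested components.
struct Node {
  std::string name;
  std::vector<Node> children;
};

class WorkQueue;

// One suspended unit of work: handle each child, then the node itself.
struct VisitTicket {
  const Node *node;
  std::size_t nextChild{0};
  int Continue(WorkQueue &queue);
};

class WorkQueue {
public:
  explicit WorkQueue(std::vector<std::string> &log) : log_{log} {}
  void Begin(const Node &node) { tickets_.push_back(VisitTicket{&node}); }
  void Record(const std::string &name) { log_.push_back(name); }
  // Resume the newest ticket until every ticket has returned StatOk.
  void Run() {
    while (!tickets_.empty()) {
      if (tickets_.back().Continue(*this) == StatOk) {
        assert(!tickets_.empty());
        tickets_.pop_back(); // a finished ticket is always the newest one
      }
    }
  }

private:
  std::deque<VisitTicket> tickets_; // explicit stack instead of the call stack
  std::vector<std::string> &log_;
};

int VisitTicket::Continue(WorkQueue &queue) {
  if (nextChild < node->children.size()) {
    const Node &child{node->children[nextChild]};
    ++nextChild; // advance before queuing, so resuming starts at the next child
    queue.Begin(child);
    return StatContinue; // nested work is pending; resume this ticket later
  }
  queue.Record(node->name); // all children done: handle the node itself
  return StatOk;
}

// Visits a tree without recursion and returns the order of completion.
std::vector<std::string> VisitAll(const Node &root) {
  std::vector<std::string> log;
  WorkQueue queue{log};
  queue.Begin(root);
  queue.Run();
  return log;
}
```

Because each ticket stores its own progress, the depth of nesting is limited by the queue's storage rather than by the call stack, which appears to be the motivation for builds that define RT_DEVICE_AVOID_RECURSION.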
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) Fortran::common::optional DefinedFormattedIo(IoStatementState &io, const Descriptor &descriptor, const typeInfo::DerivedType &derived, @@ -104,8 +132,8 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -152,5 +180,619 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, return handler.GetIoStat() == IostatOk; } +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output. 
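Editorial note, not part of the patch: the TODO above asks for automatic repetition counts such as "10*3.14159" in list-directed and NAMELIST array output. The sketch below shows the core run-length step on values that have already been formatted as text. It is only an illustration under stated assumptions: WithRepeatCounts is a hypothetical helper, not a flang-rt function, and a real implementation would also have to respect record length limits and the runtime's own output editing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Collapses each run of identical, already-formatted items into "r*value",
// separating the resulting fields with single blanks.
std::string WithRepeatCounts(const std::vector<std::string> &items) {
  std::string out;
  for (std::size_t j{0}; j < items.size();) {
    // Measure the run of items equal to items[j].
    std::size_t run{1};
    while (j + run < items.size() && items[j + run] == items[j]) {
      ++run;
    }
    if (j > 0) {
      out += ' ';
    }
    if (run > 1) {
      out += std::to_string(run) + '*';
    }
    out += items[j];
    assert(j + run <= items.size());
    j += run; // skip past the whole run
  }
  return out;
}
```

Comparing formatted text rather than raw values keeps the sketch independent of type category; whether the runtime should compare before or after editing is left open here.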
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + } + } + return StatOk; +} + +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType * + type{addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } + } + } + if (const typeInfo::SpecialBinding * + special{type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + if (DefinedUnformattedIo(io_, instance_, *type, *special)) { + anyIoTookPlace_ = true; + return StatOk; + } else { + return IostatEnd; + } + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? 
io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. + if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + if constexpr (DIR == Direction::Output) { + return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + return externalUnf + ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elements_ * elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + bool any{false}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + any = FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Complex: + switch (kind) 
{ + case 2: + any = FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + any = FormattedCharacterIO(io_, instance_); + break; + case 2: + any = FormattedCharacterIO(io_, instance_); + break; + case 4: + any = FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + any = FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + any = FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + any = FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding * + binding{derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatContinue; + } + } + if (any) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ while (!IsComplete()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + Advance(); + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element(subscripts_)); + Advance(); + if (int status{workQueue.BeginDerivedIo( + io_, elementDesc, *derived_, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + if (workQueue.BeginDescriptorIo(io, descriptor, table, anyIoTookPlace) == + StatContinue) { + workQueue.Run(); + } + return anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. -// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) 
+#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { 
RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..9382c96bd870a --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,145 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/environment.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +// FLANG_RT_DEBUG code is disabled when false. 
+static constexpr bool enableDebugOutput{false}; + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (componentAt_ >= components_) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + } + firstFree_ = next; + } +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? 
insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: new ticket\n"); + } + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), + at->ticket.begun ? "Continue" : "Begin"); + } + int stat{at->ticket.Continue(*this)}; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: ... stat %d\n", stat); + } + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp index 3833e48be3dd6..6c148b1de6f82 100644 --- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp +++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp @@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) { io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__); for (int j{1}; j <= 3; ++j) { ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc)) 
- << "OutputDescriptor() for InquireIoLength"; + << "OutputDescriptor() for InquireIoLength " << j; } ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength"; ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk) diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..32bebc1d866a4 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -843,6 +843,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. + However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. + Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. + A program using defined assignment might be able to detect the difference. 
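The two loop nestings described in the Extensions.md paragraph above can be sketched in C++ as follows. This is only an illustration under stated assumptions: `Element`, `Trace`, `AssignA`, `AssignB`, `AssignElementMajor` and `AssignComponentMajor` are hypothetical names, not part of flang-rt, and the trace stands in for whatever side effect a defined assignment might have in a real Fortran program.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a derived type with two components.
struct Element {
  int a;
  int b;
};

// Each entry records one component assignment as "component[element]".
using Trace = std::vector<std::string>;

// Assign component a of one element and log it (models an observable
// defined assignment on that component).
static void AssignA(Element &to, const Element &from, std::size_t j, Trace &t) {
  to.a = from.a;
  t.push_back("a[" + std::to_string(j) + "]");
}

// Same for component b.
static void AssignB(Element &to, const Element &from, std::size_t j, Trace &t) {
  to.b = from.b;
  t.push_back("b[" + std::to_string(j) + "]");
}

// Element-major order: all components of element j, then element j+1.
// Assumes lhs has at least rhs.size() elements.
Trace AssignElementMajor(
    std::vector<Element> &lhs, const std::vector<Element> &rhs) {
  Trace t;
  for (std::size_t j = 0; j < rhs.size(); ++j) {
    AssignA(lhs[j], rhs[j], j, t);
    AssignB(lhs[j], rhs[j], j, t);
  }
  return t;
}

// Component-major order: every element's a, then every element's b.
// This is the nesting the paragraph attributes to this compiler.
Trace AssignComponentMajor(
    std::vector<Element> &lhs, const std::vector<Element> &rhs) {
  Trace t;
  for (std::size_t j = 0; j < rhs.size(); ++j) {
    AssignA(lhs[j], rhs[j], j, t);
  }
  for (std::size_t j = 0; j < rhs.size(); ++j) {
    AssignB(lhs[j], rhs[j], j, t);
  }
  return t;
}
```

Both functions leave `lhs` with the same final values; only the order of the logged assignments differs, which is the difference a program using defined assignment might observe.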
+ ## De Facto Standard Features * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION From flang-commits at lists.llvm.org Fri May 23 07:12:29 2025 From: flang-commits at lists.llvm.org (Jameson Nash via flang-commits) Date: Fri, 23 May 2025 07:12:29 -0700 (PDT) Subject: [flang-commits] [compiler-rt] [flang] [llvm] [AArch64] fix trampoline implementation: use X15 (PR #126743) In-Reply-To: Message-ID: <6830824d.050a0220.32de7a.1e93@mx.google.com> https://github.com/vtjnash updated https://github.com/llvm/llvm-project/pull/126743 >From 087d3c7e77b8eec3d551f48a77b016e5838d140b Mon Sep 17 00:00:00 2001 From: Jameson Nash Date: Mon, 10 Feb 2025 19:21:38 +0000 Subject: [PATCH 1/3] [AArch64] fix trampoline implementation: use X15 AAPCS64 reserves any of X9-X15 for this purpose, and says not to use any of X16-X18 (like GCC chose). Simply choosing a different register fixes the problem of this being broken on any platform that actually follows the platform ABI. As a side benefit, also generate slightly better code in the trampoline itself by following the XCore implementation instead of PPC (although following the RISCV might have been slightly more readable in hindsight). 
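For readers unfamiliar with the compiler-rt helper this patch deletes: it wrote the trampoline as raw A64 instruction words, loading the target address into x17 and the locals pointer into x18 with a mov/movk sequence, then branching with `br x17`. The sketch below recomputes those words from the bit layout implied by the constants in the removed code. The helper names `EncodeMovz`, `EncodeMovk` and `EncodeBr` are hypothetical and exist only for illustration; they are not LLVM or compiler-rt APIs.

```cpp
#include <cstdint>

// MOVZ Xd, #imm16, LSL #(16 * hw): base opcode 0xd2800000, hw in bits 22:21,
// imm16 in bits 20:5, destination register in bits 4:0.
constexpr std::uint32_t EncodeMovz(
    unsigned rd, std::uint64_t value, unsigned hw) {
  return 0xd2800000u | (hw << 21) |
      (static_cast<std::uint32_t>((value >> (16 * hw)) & 0xffffu) << 5) | rd;
}

// MOVK Xd, #imm16, LSL #(16 * hw): same field layout, base opcode 0xf2800000.
// MOVK keeps the other 48 bits of Xd, so it fills in one 16-bit chunk.
constexpr std::uint32_t EncodeMovk(
    unsigned rd, std::uint64_t value, unsigned hw) {
  return 0xf2800000u | (hw << 21) |
      (static_cast<std::uint32_t>((value >> (16 * hw)) & 0xffffu) << 5) | rd;
}

// BR Xn: base opcode 0xd61f0000, register number in bits 9:5.
constexpr std::uint32_t EncodeBr(unsigned rn) {
  return 0xd61f0000u | (rn << 5);
}

// Sanity checks against constants that appear in the removed helper.
static_assert(EncodeMovz(17, 0, 0) == (0xd2800000u | 0x11), "movz x17");
static_assert(EncodeMovk(18, 0, 3) == (0xf2e00000u | 0x12), "movk x18, lsl 48");
static_assert(EncodeBr(17) == 0xd61f0220u, "br x17");
```

Four words per 64-bit value plus the final branch gives nine 32-bit words, which matches the 36-byte size check in the removed helper.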
--- compiler-rt/lib/builtins/README.txt | 5 - compiler-rt/lib/builtins/trampoline_setup.c | 42 --- .../builtins/Unit/trampoline_setup_test.c | 2 +- .../lib/Optimizer/CodeGen/BoxedProcedure.cpp | 8 +- flang/test/Fir/boxproc.fir | 4 +- .../AArch64/AArch64CallingConvention.td | 25 +- .../Target/AArch64/AArch64FrameLowering.cpp | 28 ++ .../Target/AArch64/AArch64ISelLowering.cpp | 97 ++++--- llvm/lib/TargetParser/Triple.cpp | 2 - llvm/test/CodeGen/AArch64/nest-register.ll | 16 +- .../AArch64/statepoint-call-lowering.ll | 2 +- llvm/test/CodeGen/AArch64/trampoline.ll | 257 +++++++++++++++++- llvm/test/CodeGen/AArch64/win64cc-x18.ll | 27 +- .../CodeGen/AArch64/zero-call-used-regs.ll | 16 +- 14 files changed, 385 insertions(+), 146 deletions(-) diff --git a/compiler-rt/lib/builtins/README.txt b/compiler-rt/lib/builtins/README.txt index 19f26c92a0f94..2d213d95f333a 100644 --- a/compiler-rt/lib/builtins/README.txt +++ b/compiler-rt/lib/builtins/README.txt @@ -272,11 +272,6 @@ switch32 switch8 switchu8 -// This function generates a custom trampoline function with the specific -// realFunc and localsPtr values. -void __trampoline_setup(uint32_t* trampOnStack, int trampSizeAllocated, - const void* realFunc, void* localsPtr); - // There is no C interface to the *_vfp_d8_d15_regs functions. There are // called in the prolog and epilog of Thumb1 functions. 
When the C++ ABI use // SJLJ for exceptions, each function with a catch clause or destructors needs diff --git a/compiler-rt/lib/builtins/trampoline_setup.c b/compiler-rt/lib/builtins/trampoline_setup.c index 830e25e4c0303..844eb27944142 100644 --- a/compiler-rt/lib/builtins/trampoline_setup.c +++ b/compiler-rt/lib/builtins/trampoline_setup.c @@ -41,45 +41,3 @@ COMPILER_RT_ABI void __trampoline_setup(uint32_t *trampOnStack, __clear_cache(trampOnStack, &trampOnStack[10]); } #endif // __powerpc__ && !defined(__powerpc64__) - -// The AArch64 compiler generates calls to __trampoline_setup() when creating -// trampoline functions on the stack for use with nested functions. -// This function creates a custom 36-byte trampoline function on the stack -// which loads x18 with a pointer to the outer function's locals -// and then jumps to the target nested function. -// Note: x18 is a reserved platform register on Windows and macOS. - -#if defined(__aarch64__) && defined(__ELF__) -COMPILER_RT_ABI void __trampoline_setup(uint32_t *trampOnStack, - int trampSizeAllocated, - const void *realFunc, void *localsPtr) { - // This should never happen, but if compiler did not allocate - // enough space on stack for the trampoline, abort. - if (trampSizeAllocated < 36) - compilerrt_abort(); - - // create trampoline - // Load realFunc into x17. mov/movk 16 bits at a time. 
- trampOnStack[0] = - 0xd2800000u | ((((uint64_t)realFunc >> 0) & 0xffffu) << 5) | 0x11; - trampOnStack[1] = - 0xf2a00000u | ((((uint64_t)realFunc >> 16) & 0xffffu) << 5) | 0x11; - trampOnStack[2] = - 0xf2c00000u | ((((uint64_t)realFunc >> 32) & 0xffffu) << 5) | 0x11; - trampOnStack[3] = - 0xf2e00000u | ((((uint64_t)realFunc >> 48) & 0xffffu) << 5) | 0x11; - // Load localsPtr into x18 - trampOnStack[4] = - 0xd2800000u | ((((uint64_t)localsPtr >> 0) & 0xffffu) << 5) | 0x12; - trampOnStack[5] = - 0xf2a00000u | ((((uint64_t)localsPtr >> 16) & 0xffffu) << 5) | 0x12; - trampOnStack[6] = - 0xf2c00000u | ((((uint64_t)localsPtr >> 32) & 0xffffu) << 5) | 0x12; - trampOnStack[7] = - 0xf2e00000u | ((((uint64_t)localsPtr >> 48) & 0xffffu) << 5) | 0x12; - trampOnStack[8] = 0xd61f0220; // br x17 - - // Clear instruction cache. - __clear_cache(trampOnStack, &trampOnStack[9]); -} -#endif // defined(__aarch64__) && !defined(__APPLE__) && !defined(_WIN64) diff --git a/compiler-rt/test/builtins/Unit/trampoline_setup_test.c b/compiler-rt/test/builtins/Unit/trampoline_setup_test.c index d51d35acaa02f..da115fe764271 100644 --- a/compiler-rt/test/builtins/Unit/trampoline_setup_test.c +++ b/compiler-rt/test/builtins/Unit/trampoline_setup_test.c @@ -7,7 +7,7 @@ /* * Tests nested functions - * The ppc and aarch64 compilers generates a call to __trampoline_setup + * The ppc compiler generates a call to __trampoline_setup * The i386 and x86_64 compilers generate a call to ___enable_execute_stack */ diff --git a/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp b/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp index 82b11ad7db32a..69bdb48146a54 100644 --- a/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp +++ b/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp @@ -274,12 +274,12 @@ class BoxedProcedurePass auto loc = embox.getLoc(); mlir::Type i8Ty = builder.getI8Type(); mlir::Type i8Ptr = builder.getRefType(i8Ty); - // For AArch64, PPC32 and PPC64, the thunk is populated by a call to + // For PPC32 
and PPC64, the thunk is populated by a call to // __trampoline_setup, which is defined in // compiler-rt/lib/builtins/trampoline_setup.c and requires the - // thunk size greater than 32 bytes. For RISCV and x86_64, the - // thunk setup doesn't go through __trampoline_setup and fits in 32 - // bytes. + // thunk size greater than 32 bytes. For AArch64, RISCV and x86_64, + // the thunk setup doesn't go through __trampoline_setup and fits in + // 32 bytes. fir::SequenceType::Extent thunkSize = triple.getTrampolineSize(); mlir::Type buffTy = SequenceType::get({thunkSize}, i8Ty); auto buffer = builder.create(loc, buffTy); diff --git a/flang/test/Fir/boxproc.fir b/flang/test/Fir/boxproc.fir index e99dfd0b92afd..9e5e41a94069c 100644 --- a/flang/test/Fir/boxproc.fir +++ b/flang/test/Fir/boxproc.fir @@ -3,7 +3,7 @@ // RUN: %if powerpc-registered-target %{tco --target=powerpc64le-unknown-linux-gnu %s | FileCheck %s --check-prefixes=CHECK,CHECK-PPC %} // CHECK-LABEL: define void @_QPtest_proc_dummy() -// CHECK-AARCH64: %[[VAL_3:.*]] = alloca [36 x i8], i64 1, align 1 +// CHECK-AARCH64: %[[VAL_3:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-X86: %[[VAL_3:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-PPC: %[[VAL_3:.*]] = alloca [4{{[0-8]+}} x i8], i64 1, align 1 // CHECK: %[[VAL_1:.*]] = alloca { ptr }, i64 1, align 8 @@ -63,7 +63,7 @@ func.func @_QPtest_proc_dummy_other(%arg0: !fir.boxproc<() -> ()>) { } // CHECK-LABEL: define void @_QPtest_proc_dummy_char() -// CHECK-AARCH64: %[[VAL_20:.*]] = alloca [36 x i8], i64 1, align 1 +// CHECK-AARCH64: %[[VAL_20:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-X86: %[[VAL_20:.*]] = alloca [32 x i8], i64 1, align 1 // CHECK-PPC: %[[VAL_20:.*]] = alloca [4{{[0-8]+}} x i8], i64 1, align 1 // CHECK: %[[VAL_2:.*]] = alloca { { ptr, i64 } }, i64 1, align 8 diff --git a/llvm/lib/Target/AArch64/AArch64CallingConvention.td b/llvm/lib/Target/AArch64/AArch64CallingConvention.td index 7cca6d9bc6b9c..e973269545911 100644 --- 
a/llvm/lib/Target/AArch64/AArch64CallingConvention.td +++ b/llvm/lib/Target/AArch64/AArch64CallingConvention.td @@ -28,6 +28,12 @@ class CCIfSubtarget //===----------------------------------------------------------------------===// defvar AArch64_Common = [ + // The 'nest' parameter, if any, is passed in X15. + // The previous register used here (X18) is also defined to be unavailable + // for this purpose, while all of X9-X15 were defined to be free for LLVM to + // use for this, so use X15 (which LLVM often already clobbers anyways). + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32], CCBitConvertToType>, @@ -117,13 +123,7 @@ defvar AArch64_Common = [ ]; let Entry = 1 in -def CC_AArch64_AAPCS : CallingConv>], - AArch64_Common -)>; +def CC_AArch64_AAPCS : CallingConv; let Entry = 1 in def RetCC_AArch64_AAPCS : CallingConv<[ @@ -177,6 +177,8 @@ def CC_AArch64_Win64_VarArg : CallingConv<[ // a stack layout compatible with the x64 calling convention. let Entry = 1 in def CC_AArch64_Arm64EC_VarArg : CallingConv<[ + CCIfNest>, + // Convert small floating-point values to integer. CCIfType<[f16, bf16], CCBitConvertToType>, CCIfType<[f32], CCBitConvertToType>, @@ -353,6 +355,8 @@ def RetCC_AArch64_Arm64EC_CFGuard_Check : CallingConv<[ // + Stack slots are sized as needed rather than being at least 64-bit. let Entry = 1 in def CC_AArch64_DarwinPCS : CallingConv<[ + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32, f128], CCBitConvertToType>, @@ -427,6 +431,8 @@ def CC_AArch64_DarwinPCS : CallingConv<[ let Entry = 1 in def CC_AArch64_DarwinPCS_VarArg : CallingConv<[ + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32, f128], CCBitConvertToType>, @@ -450,6 +456,8 @@ def CC_AArch64_DarwinPCS_VarArg : CallingConv<[ // same as the normal Darwin VarArgs handling. 
let Entry = 1 in def CC_AArch64_DarwinPCS_ILP32_VarArg : CallingConv<[ + CCIfNest>, + CCIfType<[v2f32], CCBitConvertToType>, CCIfType<[v2f64, v4f32, f128], CCBitConvertToType>, @@ -494,6 +502,8 @@ def CC_AArch64_DarwinPCS_ILP32_VarArg : CallingConv<[ let Entry = 1 in def CC_AArch64_GHC : CallingConv<[ + CCIfNest>, + CCIfType<[iPTR], CCBitConvertToType>, // Handle all vector types as either f64 or v2f64. @@ -522,6 +532,7 @@ def CC_AArch64_Preserve_None : CallingConv<[ // We can pass arguments in all general registers, except: // - X8, used for sret + // - X15 (on Windows), used as a temporary register in the prologue when allocating call frames // - X16/X17, used by the linker as IP0/IP1 // - X18, the platform register // - X19, the base pointer diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp index 0f33e77d4eecc..71e0b490abe98 100644 --- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp @@ -1982,6 +1982,27 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF, : 0; if (windowsRequiresStackProbe(MF, NumBytes + RealignmentPadding)) { + // Find an available register to spill the value of X15 to, if X15 is being + // used already for nest. 
+ unsigned X15Scratch = AArch64::NoRegister; + const AArch64Subtarget &STI = MF.getSubtarget(); + if (llvm::any_of(MBB.liveins(), + [&STI](const MachineBasicBlock::RegisterMaskPair &LiveIn) { + return STI.getRegisterInfo()->isSuperOrSubRegisterEq( + AArch64::X15, LiveIn.PhysReg); + })) { + X15Scratch = findScratchNonCalleeSaveRegister(&MBB); + assert(X15Scratch != AArch64::NoRegister); +#ifndef NDEBUG + LiveRegs.removeReg(AArch64::X15); // ignore X15 since we restore it +#endif + BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXrr), X15Scratch) + .addReg(AArch64::XZR) + .addReg(AArch64::X15, RegState::Undef) + .addReg(AArch64::X15, RegState::Implicit) + .setMIFlag(MachineInstr::FrameSetup); + } + uint64_t NumWords = (NumBytes + RealignmentPadding) >> 4; if (NeedsWinCFI) { HasWinCFI = true; @@ -2104,6 +2125,13 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF, // we've set a frame pointer and already finished the SEH prologue. assert(!NeedsWinCFI); } + if (X15Scratch != AArch64::NoRegister) { + BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXrr), AArch64::X15) + .addReg(AArch64::XZR) + .addReg(X15Scratch, RegState::Undef) + .addReg(X15Scratch, RegState::Implicit) + .setMIFlag(MachineInstr::FrameSetup); + } } StackOffset SVECalleeSavesSize = {}, SVELocalsSize = SVEStackSize; diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp index cee27ccf88ef1..b32b32c8fb34e 100644 --- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp @@ -7338,59 +7338,80 @@ static SDValue LowerFLDEXP(SDValue Op, SelectionDAG &DAG) { SDValue AArch64TargetLowering::LowerADJUST_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const { - // Note: x18 cannot be used for the Nest parameter on Windows and macOS. 
- if (Subtarget->isTargetDarwin() || Subtarget->isTargetWindows()) - report_fatal_error( - "ADJUST_TRAMPOLINE operation is only supported on Linux."); - return Op.getOperand(0); } SDValue AArch64TargetLowering::LowerINIT_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const { - - // Note: x18 cannot be used for the Nest parameter on Windows and macOS. - if (Subtarget->isTargetDarwin() || Subtarget->isTargetWindows()) - report_fatal_error("INIT_TRAMPOLINE operation is only supported on Linux."); - SDValue Chain = Op.getOperand(0); - SDValue Trmp = Op.getOperand(1); // trampoline + SDValue Trmp = Op.getOperand(1); // trampoline, >=32 bytes SDValue FPtr = Op.getOperand(2); // nested function SDValue Nest = Op.getOperand(3); // 'nest' parameter value - SDLoc dl(Op); - EVT PtrVT = getPointerTy(DAG.getDataLayout()); - Type *IntPtrTy = DAG.getDataLayout().getIntPtrType(*DAG.getContext()); + const Value *TrmpAddr = cast(Op.getOperand(4))->getValue(); - TargetLowering::ArgListTy Args; - TargetLowering::ArgListEntry Entry; + // ldr NestReg, .+16 + // ldr x17, .+20 + // br x17 + // .word 0 + // .nest: .qword nest + // .fptr: .qword fptr + SDValue OutChains[5]; - Entry.Ty = IntPtrTy; - Entry.Node = Trmp; - Args.push_back(Entry); + const Function *Func = + cast(cast(Op.getOperand(5))->getValue()); + CallingConv::ID CC = Func->getCallingConv(); + unsigned NestReg; - if (auto *FI = dyn_cast(Trmp.getNode())) { - MachineFunction &MF = DAG.getMachineFunction(); - MachineFrameInfo &MFI = MF.getFrameInfo(); - Entry.Node = - DAG.getConstant(MFI.getObjectSize(FI->getIndex()), dl, MVT::i64); - } else - Entry.Node = DAG.getConstant(36, dl, MVT::i64); + switch (CC) { + default: + NestReg = 0x0f; // X15 + case CallingConv::ARM64EC_Thunk_Native: + case CallingConv::ARM64EC_Thunk_X64: + // Must be kept in sync with AArch64CallingConv.td + NestReg = 0x04; // X4 + break; + } - Args.push_back(Entry); - Entry.Node = FPtr; - Args.push_back(Entry); - Entry.Node = Nest; - Args.push_back(Entry); + const 
char FptrReg = 0x11; // X17 - // Lower to a call to __trampoline_setup(Trmp, TrampSize, FPtr, ctx_reg) - TargetLowering::CallLoweringInfo CLI(DAG); - CLI.setDebugLoc(dl).setChain(Chain).setLibCallee( - CallingConv::C, Type::getVoidTy(*DAG.getContext()), - DAG.getExternalSymbol("__trampoline_setup", PtrVT), std::move(Args)); + SDValue Addr = Trmp; - std::pair CallResult = LowerCallTo(CLI); - return CallResult.second; + SDLoc dl(Op); + OutChains[0] = DAG.getStore( + Chain, dl, DAG.getConstant(0x58000080u | NestReg, dl, MVT::i32), Addr, + MachinePointerInfo(TrmpAddr)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(4, dl, MVT::i64)); + OutChains[1] = DAG.getStore( + Chain, dl, DAG.getConstant(0x580000b0u | FptrReg, dl, MVT::i32), Addr, + MachinePointerInfo(TrmpAddr, 4)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(8, dl, MVT::i64)); + OutChains[2] = + DAG.getStore(Chain, dl, DAG.getConstant(0xd61f0220u, dl, MVT::i32), Addr, + MachinePointerInfo(TrmpAddr, 8)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(16, dl, MVT::i64)); + OutChains[3] = + DAG.getStore(Chain, dl, Nest, Addr, MachinePointerInfo(TrmpAddr, 16)); + + Addr = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(24, dl, MVT::i64)); + OutChains[4] = + DAG.getStore(Chain, dl, FPtr, Addr, MachinePointerInfo(TrmpAddr, 24)); + + SDValue StoreToken = DAG.getNode(ISD::TokenFactor, dl, MVT::Other, OutChains); + + SDValue EndOfTrmp = DAG.getNode(ISD::ADD, dl, MVT::i64, Trmp, + DAG.getConstant(12, dl, MVT::i64)); + + // Call clear cache on the trampoline instructions. 
+ return DAG.getNode(ISD::CLEAR_CACHE, dl, MVT::Other, StoreToken, Trmp, + EndOfTrmp); } SDValue AArch64TargetLowering::LowerOperation(SDValue Op, diff --git a/llvm/lib/TargetParser/Triple.cpp b/llvm/lib/TargetParser/Triple.cpp index 6a559ff023caa..aa1251f3b9485 100644 --- a/llvm/lib/TargetParser/Triple.cpp +++ b/llvm/lib/TargetParser/Triple.cpp @@ -1732,8 +1732,6 @@ unsigned Triple::getTrampolineSize() const { if (isOSLinux()) return 48; break; - case Triple::aarch64: - return 36; } return 32; } diff --git a/llvm/test/CodeGen/AArch64/nest-register.ll b/llvm/test/CodeGen/AArch64/nest-register.ll index 1e1c1b044bab6..2e94dfba1fa52 100644 --- a/llvm/test/CodeGen/AArch64/nest-register.ll +++ b/llvm/test/CodeGen/AArch64/nest-register.ll @@ -1,3 +1,4 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 ; RUN: llc -disable-post-ra -verify-machineinstrs < %s -mtriple=aarch64-none-linux-gnu | FileCheck %s ; Tests that the 'nest' parameter attribute causes the relevant parameter to be @@ -5,18 +6,21 @@ define ptr @nest_receiver(ptr nest %arg) nounwind { ; CHECK-LABEL: nest_receiver: -; CHECK-NEXT: // %bb.0: -; CHECK-NEXT: mov x0, x18 -; CHECK-NEXT: ret +; CHECK: // %bb.0: +; CHECK-NEXT: mov x0, x15 +; CHECK-NEXT: ret ret ptr %arg } define ptr @nest_caller(ptr %arg) nounwind { ; CHECK-LABEL: nest_caller: -; CHECK: mov x18, x0 -; CHECK-NEXT: bl nest_receiver -; CHECK: ret +; CHECK: // %bb.0: +; CHECK-NEXT: str x30, [sp, #-16]! 
// 8-byte Folded Spill +; CHECK-NEXT: mov x15, x0 +; CHECK-NEXT: bl nest_receiver +; CHECK-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload +; CHECK-NEXT: ret %result = call ptr @nest_receiver(ptr nest %arg) ret ptr %result diff --git a/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll b/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll index 9619895c450ca..32c3eaeb9c876 100644 --- a/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll +++ b/llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll @@ -207,7 +207,7 @@ define void @test_attributes(ptr byval(%struct2) %s) gc "statepoint-example" { ; CHECK-NEXT: .cfi_offset w30, -16 ; CHECK-NEXT: ldr x8, [sp, #64] ; CHECK-NEXT: ldr q0, [sp, #48] -; CHECK-NEXT: mov x18, xzr +; CHECK-NEXT: mov x15, xzr ; CHECK-NEXT: mov w0, #42 // =0x2a ; CHECK-NEXT: mov w1, #17 // =0x11 ; CHECK-NEXT: str x8, [sp, #16] diff --git a/llvm/test/CodeGen/AArch64/trampoline.ll b/llvm/test/CodeGen/AArch64/trampoline.ll index 30ac2aa283b3e..d9016b02a0f80 100644 --- a/llvm/test/CodeGen/AArch64/trampoline.ll +++ b/llvm/test/CodeGen/AArch64/trampoline.ll @@ -1,32 +1,265 @@ -; RUN: llc -mtriple=aarch64-- < %s | FileCheck %s +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: llc -mtriple=aarch64-linux-gnu < %s | FileCheck %s --check-prefixes=CHECK-LINUX +; RUN: llc -mtriple=aarch64-none-eabi < %s | FileCheck %s --check-prefixes=CHECK-LINUX +; RUN: llc -mtriple=aarch64-pc-windows-msvc < %s | FileCheck %s --check-prefix=CHECK-PC +; RUN: llc -mtriple=aarch64-apple-darwin < %s | FileCheck %s --check-prefixes=CHECK-APPLE @trampg = internal global [36 x i8] zeroinitializer, align 8 declare void @llvm.init.trampoline(ptr, ptr, ptr); declare ptr @llvm.adjust.trampoline(ptr); -define i64 @f(ptr nest %c, i64 %x, i64 %y) { - %sum = add i64 %x, %y - ret i64 %sum +define ptr @f(ptr nest %x, i64 %y) { +; CHECK-LINUX-LABEL: f: +; CHECK-LINUX: // %bb.0: +; CHECK-LINUX-NEXT: str x29, [sp, #-16]! 
// 8-byte Folded Spill +; CHECK-LINUX-NEXT: sub sp, sp, #237, lsl #12 // =970752 +; CHECK-LINUX-NEXT: sub sp, sp, #3264 +; CHECK-LINUX-NEXT: .cfi_def_cfa_offset 974032 +; CHECK-LINUX-NEXT: .cfi_offset w29, -16 +; CHECK-LINUX-NEXT: add x0, x15, x0 +; CHECK-LINUX-NEXT: add sp, sp, #237, lsl #12 // =970752 +; CHECK-LINUX-NEXT: add sp, sp, #3264 +; CHECK-LINUX-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload +; CHECK-LINUX-NEXT: ret +; +; CHECK-PC-LABEL: f: +; CHECK-PC: .seh_proc f +; CHECK-PC-NEXT: // %bb.0: +; CHECK-PC-NEXT: stp x29, x30, [sp, #-16]! // 16-byte Folded Spill +; CHECK-PC-NEXT: .seh_save_fplr_x 16 +; CHECK-PC-NEXT: mov x9, x15 +; CHECK-PC-NEXT: mov x15, #60876 // =0xedcc +; CHECK-PC-NEXT: .seh_nop +; CHECK-PC-NEXT: bl __chkstk +; CHECK-PC-NEXT: .seh_nop +; CHECK-PC-NEXT: sub sp, sp, x15, lsl #4 +; CHECK-PC-NEXT: .seh_stackalloc 974016 +; CHECK-PC-NEXT: mov x15, x9 +; CHECK-PC-NEXT: .seh_endprologue +; CHECK-PC-NEXT: add x0, x15, x0 +; CHECK-PC-NEXT: .seh_startepilogue +; CHECK-PC-NEXT: add sp, sp, #237, lsl #12 // =970752 +; CHECK-PC-NEXT: .seh_stackalloc 970752 +; CHECK-PC-NEXT: add sp, sp, #3264 +; CHECK-PC-NEXT: .seh_stackalloc 3264 +; CHECK-PC-NEXT: ldp x29, x30, [sp], #16 // 16-byte Folded Reload +; CHECK-PC-NEXT: .seh_save_fplr_x 16 +; CHECK-PC-NEXT: .seh_endepilogue +; CHECK-PC-NEXT: ret +; CHECK-PC-NEXT: .seh_endfunclet +; CHECK-PC-NEXT: .seh_endproc +; +; CHECK-APPLE-LABEL: f: +; CHECK-APPLE: ; %bb.0: +; CHECK-APPLE-NEXT: stp x28, x27, [sp, #-16]! 
; 16-byte Folded Spill +; CHECK-APPLE-NEXT: sub sp, sp, #237, lsl #12 ; =970752 +; CHECK-APPLE-NEXT: sub sp, sp, #3264 +; CHECK-APPLE-NEXT: .cfi_def_cfa_offset 974032 +; CHECK-APPLE-NEXT: .cfi_offset w27, -8 +; CHECK-APPLE-NEXT: .cfi_offset w28, -16 +; CHECK-APPLE-NEXT: add x0, x15, x0 +; CHECK-APPLE-NEXT: add sp, sp, #237, lsl #12 ; =970752 +; CHECK-APPLE-NEXT: add sp, sp, #3264 +; CHECK-APPLE-NEXT: ldp x28, x27, [sp], #16 ; 16-byte Folded Reload +; CHECK-APPLE-NEXT: ret + %chkstack = alloca [u0xedcba x i8] + %sum = getelementptr i8, ptr %x, i64 %y + ret ptr %sum } define i64 @func1() { +; CHECK-LINUX-LABEL: func1: +; CHECK-LINUX: // %bb.0: +; CHECK-LINUX-NEXT: sub sp, sp, #64 +; CHECK-LINUX-NEXT: str x30, [sp, #48] // 8-byte Folded Spill +; CHECK-LINUX-NEXT: .cfi_def_cfa_offset 64 +; CHECK-LINUX-NEXT: .cfi_offset w30, -16 +; CHECK-LINUX-NEXT: adrp x8, :got:f +; CHECK-LINUX-NEXT: mov w9, #544 // =0x220 +; CHECK-LINUX-NEXT: add x0, sp, #8 +; CHECK-LINUX-NEXT: ldr x8, [x8, :got_lo12:f] +; CHECK-LINUX-NEXT: movk w9, #54815, lsl #16 +; CHECK-LINUX-NEXT: str w9, [sp, #16] +; CHECK-LINUX-NEXT: add x9, sp, #56 +; CHECK-LINUX-NEXT: stp x9, x8, [sp, #24] +; CHECK-LINUX-NEXT: mov x8, #132 // =0x84 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #16 +; CHECK-LINUX-NEXT: movk x8, #177, lsl #32 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #48 +; CHECK-LINUX-NEXT: str x8, [sp, #8] +; CHECK-LINUX-NEXT: add x8, sp, #8 +; CHECK-LINUX-NEXT: add x1, x8, #12 +; CHECK-LINUX-NEXT: bl __clear_cache +; CHECK-LINUX-NEXT: ldr x30, [sp, #48] // 8-byte Folded Reload +; CHECK-LINUX-NEXT: mov x0, xzr +; CHECK-LINUX-NEXT: add sp, sp, #64 +; CHECK-LINUX-NEXT: ret +; +; CHECK-PC-LABEL: func1: +; CHECK-PC: .seh_proc func1 +; CHECK-PC-NEXT: // %bb.0: +; CHECK-PC-NEXT: sub sp, sp, #64 +; CHECK-PC-NEXT: .seh_stackalloc 64 +; CHECK-PC-NEXT: str x30, [sp, #48] // 8-byte Folded Spill +; CHECK-PC-NEXT: .seh_save_reg x30, 48 +; CHECK-PC-NEXT: .seh_endprologue +; CHECK-PC-NEXT: adrp x8, f +; CHECK-PC-NEXT: add x8, 
x8, :lo12:f +; CHECK-PC-NEXT: add x9, sp, #56 +; CHECK-PC-NEXT: stp x9, x8, [sp, #24] +; CHECK-PC-NEXT: mov w8, #544 // =0x220 +; CHECK-PC-NEXT: add x0, sp, #8 +; CHECK-PC-NEXT: movk w8, #54815, lsl #16 +; CHECK-PC-NEXT: str w8, [sp, #16] +; CHECK-PC-NEXT: mov x8, #132 // =0x84 +; CHECK-PC-NEXT: movk x8, #22528, lsl #16 +; CHECK-PC-NEXT: movk x8, #177, lsl #32 +; CHECK-PC-NEXT: movk x8, #22528, lsl #48 +; CHECK-PC-NEXT: str x8, [sp, #8] +; CHECK-PC-NEXT: add x8, sp, #8 +; CHECK-PC-NEXT: add x1, x8, #12 +; CHECK-PC-NEXT: bl __clear_cache +; CHECK-PC-NEXT: mov x0, xzr +; CHECK-PC-NEXT: .seh_startepilogue +; CHECK-PC-NEXT: ldr x30, [sp, #48] // 8-byte Folded Reload +; CHECK-PC-NEXT: .seh_save_reg x30, 48 +; CHECK-PC-NEXT: add sp, sp, #64 +; CHECK-PC-NEXT: .seh_stackalloc 64 +; CHECK-PC-NEXT: .seh_endepilogue +; CHECK-PC-NEXT: ret +; CHECK-PC-NEXT: .seh_endfunclet +; CHECK-PC-NEXT: .seh_endproc +; +; CHECK-APPLE-LABEL: func1: +; CHECK-APPLE: ; %bb.0: +; CHECK-APPLE-NEXT: sub sp, sp, #64 +; CHECK-APPLE-NEXT: stp x29, x30, [sp, #48] ; 16-byte Folded Spill +; CHECK-APPLE-NEXT: .cfi_def_cfa_offset 64 +; CHECK-APPLE-NEXT: .cfi_offset w30, -8 +; CHECK-APPLE-NEXT: .cfi_offset w29, -16 +; CHECK-APPLE-NEXT: Lloh0: +; CHECK-APPLE-NEXT: adrp x8, _f@PAGE +; CHECK-APPLE-NEXT: Lloh1: +; CHECK-APPLE-NEXT: add x8, x8, _f@PAGEOFF +; CHECK-APPLE-NEXT: add x9, sp, #40 +; CHECK-APPLE-NEXT: stp x9, x8, [sp, #16] +; CHECK-APPLE-NEXT: mov w8, #544 ; =0x220 +; CHECK-APPLE-NEXT: mov x0, sp +; CHECK-APPLE-NEXT: movk w8, #54815, lsl #16 +; CHECK-APPLE-NEXT: str w8, [sp, #8] +; CHECK-APPLE-NEXT: mov x8, #132 ; =0x84 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl #16 +; CHECK-APPLE-NEXT: movk x8, #177, lsl #32 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl #48 +; CHECK-APPLE-NEXT: str x8, [sp] +; CHECK-APPLE-NEXT: mov x8, sp +; CHECK-APPLE-NEXT: add x1, x8, #12 +; CHECK-APPLE-NEXT: bl ___clear_cache +; CHECK-APPLE-NEXT: ldp x29, x30, [sp, #48] ; 16-byte Folded Reload +; CHECK-APPLE-NEXT: mov x0, xzr 
+; CHECK-APPLE-NEXT: add sp, sp, #64 +; CHECK-APPLE-NEXT: ret +; CHECK-APPLE-NEXT: .loh AdrpAdd Lloh0, Lloh1 %val = alloca i64 - %nval = bitcast ptr %val to ptr %tramp = alloca [36 x i8], align 8 - ; CHECK: mov w1, #36 - ; CHECK: bl __trampoline_setup - call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval) + call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %val) %fp = call ptr @llvm.adjust.trampoline(ptr %tramp) ret i64 0 } define i64 @func2() { +; CHECK-LINUX-LABEL: func2: +; CHECK-LINUX: // %bb.0: +; CHECK-LINUX-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill +; CHECK-LINUX-NEXT: .cfi_def_cfa_offset 16 +; CHECK-LINUX-NEXT: .cfi_offset w30, -16 +; CHECK-LINUX-NEXT: adrp x8, :got:f +; CHECK-LINUX-NEXT: mov w9, #544 // =0x220 +; CHECK-LINUX-NEXT: adrp x0, trampg +; CHECK-LINUX-NEXT: add x0, x0, :lo12:trampg +; CHECK-LINUX-NEXT: ldr x8, [x8, :got_lo12:f] +; CHECK-LINUX-NEXT: movk w9, #54815, lsl #16 +; CHECK-LINUX-NEXT: str w9, [x0, #8] +; CHECK-LINUX-NEXT: add x9, sp, #8 +; CHECK-LINUX-NEXT: add x1, x0, #12 +; CHECK-LINUX-NEXT: stp x9, x8, [x0, #16] +; CHECK-LINUX-NEXT: mov x8, #132 // =0x84 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #16 +; CHECK-LINUX-NEXT: movk x8, #177, lsl #32 +; CHECK-LINUX-NEXT: movk x8, #22528, lsl #48 +; CHECK-LINUX-NEXT: str x8, [x0] +; CHECK-LINUX-NEXT: bl __clear_cache +; CHECK-LINUX-NEXT: mov x0, xzr +; CHECK-LINUX-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload +; CHECK-LINUX-NEXT: ret +; +; CHECK-PC-LABEL: func2: +; CHECK-PC: .seh_proc func2 +; CHECK-PC-NEXT: // %bb.0: +; CHECK-PC-NEXT: str x30, [sp, #-16]! 
// 8-byte Folded Spill +; CHECK-PC-NEXT: .seh_save_reg_x x30, 16 +; CHECK-PC-NEXT: .seh_endprologue +; CHECK-PC-NEXT: adrp x0, trampg +; CHECK-PC-NEXT: add x0, x0, :lo12:trampg +; CHECK-PC-NEXT: adrp x8, f +; CHECK-PC-NEXT: add x8, x8, :lo12:f +; CHECK-PC-NEXT: add x9, sp, #8 +; CHECK-PC-NEXT: add x1, x0, #12 +; CHECK-PC-NEXT: stp x9, x8, [x0, #16] +; CHECK-PC-NEXT: mov w8, #544 // =0x220 +; CHECK-PC-NEXT: movk w8, #54815, lsl #16 +; CHECK-PC-NEXT: str w8, [x0, #8] +; CHECK-PC-NEXT: mov x8, #132 // =0x84 +; CHECK-PC-NEXT: movk x8, #22528, lsl #16 +; CHECK-PC-NEXT: movk x8, #177, lsl #32 +; CHECK-PC-NEXT: movk x8, #22528, lsl #48 +; CHECK-PC-NEXT: str x8, [x0] +; CHECK-PC-NEXT: bl __clear_cache +; CHECK-PC-NEXT: mov x0, xzr +; CHECK-PC-NEXT: .seh_startepilogue +; CHECK-PC-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload +; CHECK-PC-NEXT: .seh_save_reg_x x30, 16 +; CHECK-PC-NEXT: .seh_endepilogue +; CHECK-PC-NEXT: ret +; CHECK-PC-NEXT: .seh_endfunclet +; CHECK-PC-NEXT: .seh_endproc +; +; CHECK-APPLE-LABEL: func2: +; CHECK-APPLE: ; %bb.0: +; CHECK-APPLE-NEXT: sub sp, sp, #32 +; CHECK-APPLE-NEXT: stp x29, x30, [sp, #16] ; 16-byte Folded Spill +; CHECK-APPLE-NEXT: .cfi_def_cfa_offset 32 +; CHECK-APPLE-NEXT: .cfi_offset w30, -8 +; CHECK-APPLE-NEXT: .cfi_offset w29, -16 +; CHECK-APPLE-NEXT: Lloh2: +; CHECK-APPLE-NEXT: adrp x0, _trampg@PAGE +; CHECK-APPLE-NEXT: Lloh3: +; CHECK-APPLE-NEXT: add x0, x0, _trampg@PAGEOFF +; CHECK-APPLE-NEXT: Lloh4: +; CHECK-APPLE-NEXT: adrp x8, _f@PAGE +; CHECK-APPLE-NEXT: Lloh5: +; CHECK-APPLE-NEXT: add x8, x8, _f@PAGEOFF +; CHECK-APPLE-NEXT: add x9, sp, #8 +; CHECK-APPLE-NEXT: add x1, x0, #12 +; CHECK-APPLE-NEXT: stp x9, x8, [x0, #16] +; CHECK-APPLE-NEXT: mov w8, #544 ; =0x220 +; CHECK-APPLE-NEXT: movk w8, #54815, lsl #16 +; CHECK-APPLE-NEXT: str w8, [x0, #8] +; CHECK-APPLE-NEXT: mov x8, #132 ; =0x84 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl #16 +; CHECK-APPLE-NEXT: movk x8, #177, lsl #32 +; CHECK-APPLE-NEXT: movk x8, #22528, lsl 
#48 +; CHECK-APPLE-NEXT: str x8, [x0] +; CHECK-APPLE-NEXT: bl ___clear_cache +; CHECK-APPLE-NEXT: ldp x29, x30, [sp, #16] ; 16-byte Folded Reload +; CHECK-APPLE-NEXT: mov x0, xzr +; CHECK-APPLE-NEXT: add sp, sp, #32 +; CHECK-APPLE-NEXT: ret +; CHECK-APPLE-NEXT: .loh AdrpAdd Lloh4, Lloh5 +; CHECK-APPLE-NEXT: .loh AdrpAdd Lloh2, Lloh3 %val = alloca i64 - %nval = bitcast ptr %val to ptr - ; CHECK: mov w1, #36 - ; CHECK: bl __trampoline_setup - call void @llvm.init.trampoline(ptr @trampg, ptr @f, ptr %nval) + call void @llvm.init.trampoline(ptr @trampg, ptr @f, ptr %val) %fp = call ptr @llvm.adjust.trampoline(ptr @trampg) ret i64 0 } diff --git a/llvm/test/CodeGen/AArch64/win64cc-x18.ll b/llvm/test/CodeGen/AArch64/win64cc-x18.ll index b3e78cc9bbb81..4b45c300e9c1d 100644 --- a/llvm/test/CodeGen/AArch64/win64cc-x18.ll +++ b/llvm/test/CodeGen/AArch64/win64cc-x18.ll @@ -1,35 +1,26 @@ -; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +;; Testing that nest uses x15 on all calling conventions (except Arm64EC) -;; Testing that x18 is not clobbered when passing pointers with the nest -;; attribute on windows - -; RUN: llc < %s -mtriple=aarch64-pc-windows-msvc | FileCheck %s --check-prefixes=CHECK,CHECK-NO-X18 -; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s --check-prefixes=CHECK,CHECK-X18 +; RUN: llc < %s -mtriple=aarch64-pc-windows-msvc | FileCheck %s +; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s +; RUN: llc < %s -mtriple=aarch64-apple-darwin- | FileCheck %s define dso_local i64 @other(ptr nest %p) #0 { ; CHECK-LABEL: other: -; CHECK-X18: ldr x0, [x18] -; CHECK-NO-X18: ldr x0, [x0] +; CHECK: ldr x0, [x15] +; CHECK: ret %r = load i64, ptr %p -; CHECK: ret ret i64 %r } define dso_local void @func() #0 { ; CHECK-LABEL: func: - - +; CHECK: add x15, sp, #8 +; CHECK: bl {{_?other}} +; CHECK: ret entry: %p = alloca i64 -; CHECK: mov w8, #1 -; CHECK: stp x30, x8, [sp, #-16] -; CHECK-X18: add x18, sp, #8 store i64 1, ptr %p -; 
CHECK-NO-X18: add x0, sp, #8 -; CHECK: bl other call void @other(ptr nest %p) -; CHECK: ldr x30, [sp], #16 -; CHECK: ret ret void } diff --git a/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll b/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll index 4799ea3bcd19f..986666e015e9e 100644 --- a/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll +++ b/llvm/test/CodeGen/AArch64/zero-call-used-regs.ll @@ -93,7 +93,7 @@ define dso_local i32 @all_gpr_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c ; CHECK-NEXT: mov x5, #0 // =0x0 ; CHECK-NEXT: mov x6, #0 // =0x0 ; CHECK-NEXT: mov x7, #0 // =0x0 -; CHECK-NEXT: mov x18, #0 // =0x0 +; CHECK-NEXT: mov x15, #0 // =0x0 ; CHECK-NEXT: orr w0, w8, w2 ; CHECK-NEXT: mov x2, #0 // =0x0 ; CHECK-NEXT: mov x8, #0 // =0x0 @@ -146,7 +146,7 @@ define dso_local i32 @all_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) lo ; DEFAULT-NEXT: mov x5, #0 // =0x0 ; DEFAULT-NEXT: mov x6, #0 // =0x0 ; DEFAULT-NEXT: mov x7, #0 // =0x0 -; DEFAULT-NEXT: mov x18, #0 // =0x0 +; DEFAULT-NEXT: mov x15, #0 // =0x0 ; DEFAULT-NEXT: movi v0.2d, #0000000000000000 ; DEFAULT-NEXT: orr w0, w8, w2 ; DEFAULT-NEXT: mov x2, #0 // =0x0 @@ -169,7 +169,7 @@ define dso_local i32 @all_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) lo ; SVE-OR-SME-NEXT: mov x5, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x6, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x7, #0 // =0x0 -; SVE-OR-SME-NEXT: mov x18, #0 // =0x0 +; SVE-OR-SME-NEXT: mov x15, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z0.d, #0 // =0x0 ; SVE-OR-SME-NEXT: orr w0, w8, w2 ; SVE-OR-SME-NEXT: mov x2, #0 // =0x0 @@ -196,7 +196,7 @@ define dso_local i32 @all_arg(i32 noundef %a, i32 noundef %b, i32 noundef %c) lo ; STREAMING-COMPAT-NEXT: mov x5, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x6, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x7, #0 // =0x0 -; STREAMING-COMPAT-NEXT: mov x18, #0 // =0x0 +; STREAMING-COMPAT-NEXT: mov x15, #0 // =0x0 ; STREAMING-COMPAT-NEXT: fmov d0, xzr ; STREAMING-COMPAT-NEXT: orr w0, w8, w2 ; STREAMING-COMPAT-NEXT: mov x2, #0 
// =0x0 @@ -492,7 +492,7 @@ define dso_local double @all_gpr_arg_float(double noundef %a, float noundef %b) ; CHECK-NEXT: mov x6, #0 // =0x0 ; CHECK-NEXT: mov x7, #0 // =0x0 ; CHECK-NEXT: mov x8, #0 // =0x0 -; CHECK-NEXT: mov x18, #0 // =0x0 +; CHECK-NEXT: mov x15, #0 // =0x0 ; CHECK-NEXT: ret entry: @@ -547,7 +547,7 @@ define dso_local double @all_arg_float(double noundef %a, float noundef %b) loca ; DEFAULT-NEXT: mov x6, #0 // =0x0 ; DEFAULT-NEXT: mov x7, #0 // =0x0 ; DEFAULT-NEXT: mov x8, #0 // =0x0 -; DEFAULT-NEXT: mov x18, #0 // =0x0 +; DEFAULT-NEXT: mov x15, #0 // =0x0 ; DEFAULT-NEXT: movi v1.2d, #0000000000000000 ; DEFAULT-NEXT: movi v2.2d, #0000000000000000 ; DEFAULT-NEXT: movi v3.2d, #0000000000000000 @@ -570,7 +570,7 @@ define dso_local double @all_arg_float(double noundef %a, float noundef %b) loca ; SVE-OR-SME-NEXT: mov x6, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x7, #0 // =0x0 ; SVE-OR-SME-NEXT: mov x8, #0 // =0x0 -; SVE-OR-SME-NEXT: mov x18, #0 // =0x0 +; SVE-OR-SME-NEXT: mov x15, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z1.d, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z2.d, #0 // =0x0 ; SVE-OR-SME-NEXT: mov z3.d, #0 // =0x0 @@ -597,7 +597,7 @@ define dso_local double @all_arg_float(double noundef %a, float noundef %b) loca ; STREAMING-COMPAT-NEXT: mov x6, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x7, #0 // =0x0 ; STREAMING-COMPAT-NEXT: mov x8, #0 // =0x0 -; STREAMING-COMPAT-NEXT: mov x18, #0 // =0x0 +; STREAMING-COMPAT-NEXT: mov x15, #0 // =0x0 ; STREAMING-COMPAT-NEXT: fmov d1, xzr ; STREAMING-COMPAT-NEXT: fmov d2, xzr ; STREAMING-COMPAT-NEXT: fmov d3, xzr >From 0a07ba246e89104e454dee67ff86725df805f446 Mon Sep 17 00:00:00 2001 From: Jameson Nash Date: Tue, 20 May 2025 20:29:49 +0000 Subject: [PATCH 2/3] choose scratch register more carefully --- .../Target/AArch64/AArch64FrameLowering.cpp | 55 +++++++++++-------- 1 file changed, 32 insertions(+), 23 deletions(-) diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp 
index 71e0b490abe98..73d22286eb961 100644 --- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp @@ -327,7 +327,8 @@ static int64_t getArgumentStackToRestore(MachineFunction &MF, static bool produceCompactUnwindFrame(MachineFunction &MF); static bool needsWinCFI(const MachineFunction &MF); static StackOffset getSVEStackSize(const MachineFunction &MF); -static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB); +static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB, bool HasCall=false); +static bool requiresSaveVG(const MachineFunction &MF); /// Returns true if a homogeneous prolog or epilog code can be emitted /// for the size optimization. If possible, a frame helper call is injected. @@ -1002,6 +1003,16 @@ void AArch64FrameLowering::emitZeroCallUsedRegs(BitVector RegsToZero, } } +static bool windowsRequiresStackProbe(const MachineFunction &MF, + uint64_t StackSizeInBytes) { + const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>(); + const AArch64FunctionInfo &MFI = *MF.getInfo<AArch64FunctionInfo>(); + // TODO: When implementing stack protectors, take that into account + // for the probe threshold. + return Subtarget.isTargetWindows() && MFI.hasStackProbing() && + StackSizeInBytes >= uint64_t(MFI.getStackProbeSize()); +} + static void getLiveRegsForEntryMBB(LivePhysRegs &LiveRegs, const MachineBasicBlock &MBB) { const MachineFunction *MF = MBB.getParent(); @@ -1023,7 +1034,7 @@ static void getLiveRegsForEntryMBB(LivePhysRegs &LiveRegs, // but we would then have to make sure that we were in fact saving at least one // callee-save register in the prologue, which is additional complexity that // doesn't seem worth the benefit. 
-static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB) { +static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB, bool HasCall) { MachineFunction *MF = MBB->getParent(); // If MBB is an entry block, use X9 as the scratch register @@ -1037,6 +1048,11 @@ static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB) { const AArch64RegisterInfo &TRI = *Subtarget.getRegisterInfo(); LivePhysRegs LiveRegs(TRI); getLiveRegsForEntryMBB(LiveRegs, *MBB); + if (HasCall) { + LiveRegs.addReg(AArch64::X16); + LiveRegs.addReg(AArch64::X17); + LiveRegs.addReg(AArch64::X18); + } // Prefer X9 since it was historically used for the prologue scratch reg. const MachineRegisterInfo &MRI = MF->getRegInfo(); @@ -1077,23 +1093,16 @@ bool AArch64FrameLowering::canUseAsPrologue( MBB.isLiveIn(AArch64::NZCV)) return false; - // Don't need a scratch register if we're not going to re-align the stack or - // emit stack probes. - if (!RegInfo->hasStackRealignment(*MF) && !TLI->hasInlineStackProbe(*MF)) - return true; - // Otherwise, we can use any block as long as it has a scratch register - // available. - return findScratchNonCalleeSaveRegister(TmpMBB) != AArch64::NoRegister; -} + if (RegInfo->hasStackRealignment(*MF) || TLI->hasInlineStackProbe(*MF)) + if (findScratchNonCalleeSaveRegister(TmpMBB) == AArch64::NoRegister) + return false; -static bool windowsRequiresStackProbe(MachineFunction &MF, - uint64_t StackSizeInBytes) { - const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>(); - const AArch64FunctionInfo &MFI = *MF.getInfo<AArch64FunctionInfo>(); - // TODO: When implementing stack protectors, take that into account - // for the probe threshold. 
- return Subtarget.isTargetWindows() && MFI.hasStackProbing() && - StackSizeInBytes >= uint64_t(MFI.getStackProbeSize()); + // May need a scratch register (for return value) if require making a special call + if (requiresSaveVG(*MF) || windowsRequiresStackProbe(*MF, std::numeric_limits<uint64_t>::max())) + if (findScratchNonCalleeSaveRegister(TmpMBB, true) == AArch64::NoRegister) + return false; + + return true; } static bool needsWinCFI(const MachineFunction &MF) { @@ -1356,8 +1365,8 @@ bool requiresGetVGCall(MachineFunction &MF) { !MF.getSubtarget<AArch64Subtarget>().hasSVE(); } -static bool requiresSaveVG(MachineFunction &MF) { - AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>(); +static bool requiresSaveVG(const MachineFunction &MF) { + const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>(); // For Darwin platforms we don't save VG for non-SVE functions, even if SME // is enabled with streaming mode changes. if (!AFI->hasStreamingModeChanges()) @@ -1991,8 +2000,8 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF, return STI.getRegisterInfo()->isSuperOrSubRegisterEq( AArch64::X15, LiveIn.PhysReg); })) { - X15Scratch = findScratchNonCalleeSaveRegister(&MBB); - assert(X15Scratch != AArch64::NoRegister); + X15Scratch = findScratchNonCalleeSaveRegister(&MBB, true); + assert(X15Scratch != AArch64::NoRegister && (X15Scratch < AArch64::X15 || X15Scratch > AArch64::X17)); #ifndef NDEBUG LiveRegs.removeReg(AArch64::X15); // ignore X15 since we restore it #endif @@ -3236,7 +3245,7 @@ bool AArch64FrameLowering::spillCalleeSavedRegisters( unsigned X0Scratch = AArch64::NoRegister; if (Reg1 == AArch64::VG) { // Find an available register to store value of VG to. 
- Reg1 = findScratchNonCalleeSaveRegister(&MBB); + Reg1 = findScratchNonCalleeSaveRegister(&MBB, true); assert(Reg1 != AArch64::NoRegister); SMEAttrs Attrs(MF.getFunction()); >From 27c221ce00b7e7760659f2e4dc14302b30b7092f Mon Sep 17 00:00:00 2001 From: Jameson Nash Date: Fri, 23 May 2025 10:11:41 -0400 Subject: [PATCH 3/3] clang-format --- llvm/lib/Target/AArch64/AArch64FrameLowering.cpp | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp index 73d22286eb961..0899c5f44b2fe 100644 --- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp @@ -327,7 +327,8 @@ static int64_t getArgumentStackToRestore(MachineFunction &MF, static bool produceCompactUnwindFrame(MachineFunction &MF); static bool needsWinCFI(const MachineFunction &MF); static StackOffset getSVEStackSize(const MachineFunction &MF); -static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB, bool HasCall=false); +static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB, + bool HasCall = false); static bool requiresSaveVG(const MachineFunction &MF); /// Returns true if a homogeneous prolog or epilog code can be emitted @@ -1034,7 +1035,8 @@ static void getLiveRegsForEntryMBB(LivePhysRegs &LiveRegs, // but we would then have to make sure that we were in fact saving at least one // callee-save register in the prologue, which is additional complexity that // doesn't seem worth the benefit. 
-static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB, bool HasCall) { +static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB, + bool HasCall) { MachineFunction *MF = MBB->getParent(); // If MBB is an entry block, use X9 as the scratch register @@ -1097,8 +1099,10 @@ bool AArch64FrameLowering::canUseAsPrologue( if (findScratchNonCalleeSaveRegister(TmpMBB) == AArch64::NoRegister) return false; - // May need a scratch register (for return value) if require making a special call - if (requiresSaveVG(*MF) || windowsRequiresStackProbe(*MF, std::numeric_limits<uint64_t>::max())) + // May need a scratch register (for return value) if require making a special + // call + if (requiresSaveVG(*MF) || + windowsRequiresStackProbe(*MF, std::numeric_limits<uint64_t>::max())) if (findScratchNonCalleeSaveRegister(TmpMBB, true) == AArch64::NoRegister) return false; @@ -2001,7 +2005,8 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF, AArch64::X15, LiveIn.PhysReg); })) { X15Scratch = findScratchNonCalleeSaveRegister(&MBB, true); - assert(X15Scratch != AArch64::NoRegister && (X15Scratch < AArch64::X15 || X15Scratch > AArch64::X17)); + assert(X15Scratch != AArch64::NoRegister && + (X15Scratch < AArch64::X15 || X15Scratch > AArch64::X17)); #ifndef NDEBUG LiveRegs.removeReg(AArch64::X15); // ignore X15 since we restore it #endif From flang-commits at lists.llvm.org Fri May 23 09:21:25 2025 From: flang-commits at lists.llvm.org (Daniel Chen via flang-commits) Date: Fri, 23 May 2025 09:21:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix folding of SHAPE(SPREAD(source, dim, ncopies=-1)) (PR #141146) In-Reply-To: Message-ID: <6830a085.a70a0220.125144.e449@mx.google.com> https://github.com/DanielCChen approved this pull request. LGTM. It fixed our test case. Thanks! 
https://github.com/llvm/llvm-project/pull/141146

From flang-commits at lists.llvm.org  Fri May 23 09:48:48 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Fri, 23 May 2025 09:48:48 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Allow forward reference to non-default INTEGER dummy (PR #141254)
Message-ID:

https://github.com/klausler created https://github.com/llvm/llvm-project/pull/141254

A dummy argument with an explicit INTEGER type of non-default kind can be forward-referenced from a specification expression in many Fortran compilers. Handle by adding type declaration statements to the initial pass over a specification part's declaration constructs. Emit an optional warning under -pedantic.

Fixes https://github.com/llvm/llvm-project/issues/140941.

>From f3b4d25ea9a95541beefd8a1525ca925b0890831 Mon Sep 17 00:00:00 2001
From: Peter Klausler
Date: Fri, 23 May 2025 09:44:56 -0700
Subject: [PATCH] [flang] Allow forward reference to non-default INTEGER dummy

A dummy argument with an explicit INTEGER type of non-default kind can be forward-referenced from a specification expression in many Fortran compilers. Handle by adding type declaration statements to the initial pass over a specification part's declaration constructs. Emit an optional warning under -pedantic.

Fixes https://github.com/llvm/llvm-project/issues/140941.
---
 flang/docs/Extensions.md                      |  5 +-
 flang/lib/Semantics/resolve-names.cpp         | 75 ++++++++++++++++++-
 .../test/Semantics/OpenMP/linear-clause01.f90 |  2 -
 flang/test/Semantics/resolve103.f90           | 16 ++--
 4 files changed, 85 insertions(+), 13 deletions(-)

diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md
index 00a7e2bac84e6..e3501dffb8777 100644
--- a/flang/docs/Extensions.md
+++ b/flang/docs/Extensions.md
@@ -291,7 +291,10 @@ end
 * DATA statement initialization is allowed for procedure pointers outside structure constructors.
* Nonstandard intrinsic functions: ISNAN, SIZEOF -* A forward reference to a default INTEGER scalar dummy argument or +* A forward reference to an INTEGER dummy argument is permitted to appear + in a specification expression, such as an array bound, in a scope with + IMPLICIT NONE(TYPE). +* A forward reference to a default INTEGER scalar `COMMON` block variable is permitted to appear in a specification expression, such as an array bound, in a scope with IMPLICIT NONE(TYPE) if the name of the variable would have caused it to be implicitly typed diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index bdafc03ad2c05..e910a910a86f6 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -768,10 +768,22 @@ class ScopeHandler : public ImplicitRulesVisitor { deferImplicitTyping_ = skipImplicitTyping_ = skip; } + void NoteEarlyDeclaredDummyArgument(Symbol &symbol) { + earlyDeclaredDummyArguments_.insert(symbol); + } + bool IsEarlyDeclaredDummyArgument(Symbol &symbol) { + return earlyDeclaredDummyArguments_.find(symbol) != + earlyDeclaredDummyArguments_.end(); + } + void ForgetEarlyDeclaredDummyArgument(Symbol &symbol) { + earlyDeclaredDummyArguments_.erase(symbol); + } + private: Scope *currScope_{nullptr}; FuncResultStack funcResultStack_{*this}; std::map deferred_; + UnorderedSymbolSet earlyDeclaredDummyArguments_; }; class ModuleVisitor : public virtual ScopeHandler { @@ -1119,6 +1131,7 @@ class DeclarationVisitor : public ArraySpecVisitor, template Symbol &DeclareEntity(const parser::Name &name, Attrs attrs) { Symbol &symbol{MakeSymbol(name, attrs)}; + ForgetEarlyDeclaredDummyArgument(symbol); if (context().HasError(symbol) || symbol.has()) { return symbol; // OK or error already reported } else if (symbol.has()) { @@ -1976,6 +1989,9 @@ class ResolveNamesVisitor : public virtual ScopeHandler, Scope &topScope_; void PreSpecificationConstruct(const parser::SpecificationConstruct &); + void 
EarlyDummyTypeDeclaration( + const parser::Statement> + &); void CreateCommonBlockSymbols(const parser::CommonStmt &); void CreateObjectSymbols(const std::list &, Attr); void CreateGeneric(const parser::GenericSpec &); @@ -8488,13 +8504,24 @@ const parser::Name *DeclarationVisitor::ResolveName(const parser::Name &name) { symbol->set(Symbol::Flag::ImplicitOrError, false); if (IsUplevelReference(*symbol)) { MakeHostAssocSymbol(name, *symbol); - } else if (IsDummy(*symbol) || - (!symbol->GetType() && FindCommonBlockContaining(*symbol))) { + } else if (IsDummy(*symbol)) { CheckEntryDummyUse(name.source, symbol); + ConvertToObjectEntity(*symbol); + if (IsEarlyDeclaredDummyArgument(*symbol)) { + ForgetEarlyDeclaredDummyArgument(*symbol); + if (isImplicitNoneType()) { + context().Warn(common::LanguageFeature::ForwardRefImplicitNone, + name.source, + "'%s' was used under IMPLICIT NONE(TYPE) before being explicitly typed"_warn_en_US, + name.source); + } + } + ApplyImplicitRules(*symbol); + } else if (!symbol->GetType() && FindCommonBlockContaining(*symbol)) { ConvertToObjectEntity(*symbol); ApplyImplicitRules(*symbol); } else if (const auto *tpd{symbol->detailsIf()}; - tpd && !tpd->attr()) { + tpd && !tpd->attr()) { Say(name, "Type parameter '%s' was referenced before being declared"_err_en_US, name.source); @@ -9258,6 +9285,10 @@ void ResolveNamesVisitor::PreSpecificationConstruct( const parser::SpecificationConstruct &spec) { common::visit( common::visitors{ + [&](const parser::Statement< + common::Indirection> &y) { + EarlyDummyTypeDeclaration(y); + }, [&](const parser::Statement> &y) { CreateGeneric(std::get(y.statement.value().t)); }, @@ -9286,6 +9317,44 @@ void ResolveNamesVisitor::PreSpecificationConstruct( spec.u); } +void ResolveNamesVisitor::EarlyDummyTypeDeclaration( + const parser::Statement> + &stmt) { + context().set_location(stmt.source); + const auto &[declTypeSpec, attrs, entities] = stmt.statement.value().t; + if (const auto *intrin{ + 
std::get_if(&declTypeSpec.u)}) { + if (const auto *intType{std::get_if(&intrin->u)}) { + if (const auto &kind{intType->v}) { + if (!parser::Unwrap(*kind) && + !parser::Unwrap(*kind)) { + return; + } + } + const DeclTypeSpec *type{nullptr}; + for (const auto &ent : entities) { + const auto &objName{std::get(ent.t)}; + Resolve(objName, FindInScope(currScope(), objName)); + if (Symbol * symbol{objName.symbol}; + symbol && IsDummy(*symbol) && NeedsType(*symbol)) { + if (!type) { + type = ProcessTypeSpec(declTypeSpec); + if (!type || !type->IsNumeric(TypeCategory::Integer)) { + break; + } + } + symbol->SetType(*type); + NoteEarlyDeclaredDummyArgument(*symbol); + // Set the Implicit flag to disable bogus errors from + // being emitted later when this declaration is processed + // again normally. + symbol->set(Symbol::Flag::Implicit); + } + } + } + } +} + void ResolveNamesVisitor::CreateCommonBlockSymbols( const parser::CommonStmt &commonStmt) { for (const parser::CommonStmt::Block &block : commonStmt.blocks) { diff --git a/flang/test/Semantics/OpenMP/linear-clause01.f90 b/flang/test/Semantics/OpenMP/linear-clause01.f90 index f95e834c9026c..286def2dba119 100644 --- a/flang/test/Semantics/OpenMP/linear-clause01.f90 +++ b/flang/test/Semantics/OpenMP/linear-clause01.f90 @@ -20,10 +20,8 @@ subroutine linear_clause_02(arg_01, arg_02) !$omp declare simd linear(val(arg_01)) real, intent(in) :: arg_01(:) - !ERROR: The list item 'arg_02' specified without the REF 'linear-modifier' must be of INTEGER type !ERROR: If the `linear-modifier` is REF or UVAL, the list item 'arg_02' must be a dummy argument without the VALUE attribute !$omp declare simd linear(uval(arg_02)) - !ERROR: The type of 'arg_02' has already been implicitly declared integer, value, intent(in) :: arg_02 !ERROR: The list item 'var' specified without the REF 'linear-modifier' must be of INTEGER type diff --git a/flang/test/Semantics/resolve103.f90 b/flang/test/Semantics/resolve103.f90 index 
8f55968f43375..0acf2333b9586 100644
--- a/flang/test/Semantics/resolve103.f90
+++ b/flang/test/Semantics/resolve103.f90
@@ -1,8 +1,7 @@
 ! RUN: not %flang_fc1 -pedantic %s 2>&1 | FileCheck %s
 ! Test extension: allow forward references to dummy arguments or COMMON
 ! from specification expressions in scopes with IMPLICIT NONE(TYPE),
-! as long as those symbols are eventually typed later with the
-! same integer type they would have had without IMPLICIT NONE.
+! as long as those symbols are eventually typed later.
 
 !CHECK: warning: 'n1' was used without (or before) being explicitly typed
 !CHECK: error: No explicit type declared for dummy argument 'n1'
@@ -19,12 +18,15 @@ subroutine foo2(a, n2)
   double precision n2
 end
 
-!CHECK: warning: 'n3' was used without (or before) being explicitly typed
-!CHECK-NOT: error: Dummy argument 'n3'
-subroutine foo3(a, n3)
+!CHECK: warning: 'n3a' was used under IMPLICIT NONE(TYPE) before being explicitly typed
+!CHECK: warning: 'n3b' was used under IMPLICIT NONE(TYPE) before being explicitly typed
+!CHECK-NOT: error: Dummy argument 'n3a'
+!CHECK-NOT: error: Dummy argument 'n3b'
+subroutine foo3(a, n3a, n3b)
   implicit none
-  real a(n3)
-  integer n3
+  integer a(n3a, n3b)
+  integer n3a
+  integer(8) n3b
 end
 
 !CHECK: warning: 'n4' was used without (or before) being explicitly typed

From flang-commits at lists.llvm.org  Fri May 23 09:49:22 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Fri, 23 May 2025 09:49:22 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Allow forward reference to non-default INTEGER dummy (PR #141254)
In-Reply-To:
Message-ID: <6830a712.170a0220.2b0a7c.5483@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-semantics

Author: Peter Klausler (klausler)
Changes

A dummy argument with an explicit INTEGER type of non-default kind can be forward-referenced from a specification expression in many Fortran compilers. Handle by adding type declaration statements to the initial pass over a specification part's declaration constructs. Emit an optional warning under -pedantic.

Fixes https://github.com/llvm/llvm-project/issues/140941.

---
Full diff: https://github.com/llvm/llvm-project/pull/141254.diff

4 Files Affected:

- (modified) flang/docs/Extensions.md (+4-1)
- (modified) flang/lib/Semantics/resolve-names.cpp (+72-3)
- (modified) flang/test/Semantics/OpenMP/linear-clause01.f90 (-2)
- (modified) flang/test/Semantics/resolve103.f90 (+9-7)

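
The bookkeeping that PR #141254 describes (an early pass that types dummy arguments from their INTEGER declarations, a set that remembers which symbols were typed early, and a warning when such a symbol is referenced before its declaration is processed normally) can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyResolver`, `IntegerDecl`, and `RunFoo3` are hypothetical names, not flang classes; strings stand in for flang's `Symbol` objects; and default INTEGER is assumed to be kind 4. The real patch additionally restricts the early pass to INTEGER declarations whose kind, if present, is a literal or named constant.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Toy model only: these types and names are hypothetical, not flang's.
struct IntegerDecl {
  std::string name;                // entity being declared
  int kind;                        // INTEGER kind parameter
  std::vector<std::string> bounds; // names referenced in its array bounds
};

class ToyResolver {
public:
  ToyResolver(std::vector<std::string> dummies, bool implicitNone)
      : dummies_(dummies.begin(), dummies.end()), implicitNone_(implicitNone) {}

  // Early pass: give each still-untyped dummy argument its declared kind
  // and remember that this happened ahead of normal processing.
  void PreScan(const std::vector<IntegerDecl> &decls) {
    for (const IntegerDecl &d : decls) {
      if (dummies_.count(d.name) != 0 && kinds_.count(d.name) == 0) {
        kinds_[d.name] = d.kind;
        earlyDeclared_.insert(d.name);
      }
    }
  }

  // Normal pass over one declaration: the declared entity stops being
  // "early", then every name in its bounds is resolved as a reference.
  void Declare(const IntegerDecl &d) {
    earlyDeclared_.erase(d.name);
    kinds_[d.name] = d.kind;
    for (const std::string &ref : d.bounds) {
      Reference(ref);
    }
  }

  // Returns the kind of `name`, or -1 if it has no type yet. The first
  // reference to an early-declared name forgets it and may warn.
  int Reference(const std::string &name) {
    if (earlyDeclared_.erase(name) > 0 && implicitNone_) {
      warnings_.push_back("'" + name +
          "' was used under IMPLICIT NONE(TYPE) before being explicitly typed");
    }
    auto it = kinds_.find(name);
    return it == kinds_.end() ? -1 : it->second;
  }

  const std::vector<std::string> &warnings() const { return warnings_; }

private:
  std::unordered_set<std::string> dummies_;
  bool implicitNone_;
  std::unordered_map<std::string, int> kinds_;
  std::unordered_set<std::string> earlyDeclared_;
  std::vector<std::string> warnings_;
};

// Mirrors subroutine foo3 from resolve103.f90 (kind 4 for default INTEGER
// is an assumption of this sketch).
std::vector<std::string> RunFoo3() {
  std::vector<IntegerDecl> decls{
      {"a", 4, {"n3a", "n3b"}}, {"n3a", 4, {}}, {"n3b", 8, {}}};
  ToyResolver resolver({"a", "n3a", "n3b"}, /*implicitNone=*/true);
  resolver.PreScan(decls);
  for (const IntegerDecl &d : decls) {
    resolver.Declare(d);
  }
  assert(resolver.Reference("n3b") == 8); // non-default kind was retained
  return resolver.warnings();
}
```

Calling `RunFoo3()` yields one warning each for `n3a` and `n3b`, in that order, which matches the two `!CHECK: warning:` lines the patch adds to the test. Forgetting a name on its first reference is what keeps later references from warning a second time.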
https://github.com/llvm/llvm-project/pull/141254

From flang-commits at lists.llvm.org  Fri May 23 10:50:34 2025
From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits)
Date: Fri, 23 May 2025 10:50:34 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293)
In-Reply-To:
Message-ID: <6830b56a.170a0220.2b74a5.733f@mx.google.com>

eZWALT wrote:

@alexey-bataev not sure what happened before with this build system, but now everything works as expected. Thanks for the fast replies and have a nice weekend!

https://github.com/llvm/llvm-project/pull/139293

From flang-commits at lists.llvm.org  Fri May 23 09:55:00 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Fri, 23 May 2025 09:55:00 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727)
In-Reply-To:
Message-ID: <6830a864.170a0220.c9c2e.33e9@mx.google.com>

https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727

>From c243d222fe682c92670349353bc326d112a5e30b Mon Sep 17 00:00:00 2001
From: Peter Klausler
Date: Wed, 23 Apr 2025 14:44:23 -0700
Subject: [PATCH 1/2] [flang][runtime] Replace recursion with iterative work queue

Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, Destroy, and DescriptorIO.

Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues.

The effects of this restructuring on CPU performance are yet to be measured. There is a fast(?) mode in the work queue implementation that causes new work items to be executed to completion immediately upon creation, saving the overhead of actually representing and managing the work queue. This mode can't be used on GPU devices, but it is enabled by default for CPU hosts. It can be disabled easily for debugging and performance testing.
---
 .../include/flang-rt/runtime/environment.h    |   1 +
 flang-rt/include/flang-rt/runtime/stat.h      |  10 +-
 .../include/flang-rt/runtime/work-queue.h     | 538 +++++++++++++++
 flang-rt/lib/runtime/CMakeLists.txt           |   2 +
 flang-rt/lib/runtime/assign.cpp               | 549 +++++++++------
 flang-rt/lib/runtime/derived.cpp              | 519 +++++++-------
 flang-rt/lib/runtime/descriptor-io.cpp        | 646 +++++++++++++++++-
 flang-rt/lib/runtime/descriptor-io.h          | 620 +----------------
 flang-rt/lib/runtime/environment.cpp          |   4 +
 flang-rt/lib/runtime/namelist.cpp             |   1 +
 flang-rt/lib/runtime/type-info.cpp            |   6 +-
 flang-rt/lib/runtime/work-queue.cpp           | 145 ++++
 flang-rt/unittests/Runtime/ExternalIOTest.cpp |   2 +-
 flang/docs/Extensions.md                      |  10 +
 flang/include/flang/Runtime/assign.h          |   2 +-
 15 files changed, 1974 insertions(+), 1081 deletions(-)
 create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h
 create mode 100644 flang-rt/lib/runtime/work-queue.cpp

diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h
index 16258b3bbba9b..87fe1f92ba545 100644
--- a/flang-rt/include/flang-rt/runtime/environment.h
+++ b/flang-rt/include/flang-rt/runtime/environment.h
@@ -63,6 +63,7 @@ struct ExecutionEnvironment {
   bool noStopMessage{false}; // NO_STOP_MESSAGE=1 inhibits "Fortran STOP"
   bool defaultUTF8{false}; // DEFAULT_UTF8
   bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION
+  int
internalDebugging{0}; // FLANG_RT_DEBUG // CUDA related variables std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE diff --git a/flang-rt/include/flang-rt/runtime/stat.h b/flang-rt/include/flang-rt/runtime/stat.h index 070d0bf8673fb..dc372de53506a 100644 --- a/flang-rt/include/flang-rt/runtime/stat.h +++ b/flang-rt/include/flang-rt/runtime/stat.h @@ -24,7 +24,7 @@ class Terminator; enum Stat { StatOk = 0, // required to be zero by Fortran - // Interoperable STAT= codes + // Interoperable STAT= codes (>= 11) StatBaseNull = CFI_ERROR_BASE_ADDR_NULL, StatBaseNotNull = CFI_ERROR_BASE_ADDR_NOT_NULL, StatInvalidElemLen = CFI_INVALID_ELEM_LEN, @@ -36,7 +36,7 @@ enum Stat { StatMemAllocation = CFI_ERROR_MEM_ALLOCATION, StatOutOfBounds = CFI_ERROR_OUT_OF_BOUNDS, - // Standard STAT= values + // Standard STAT= values (>= 101) StatFailedImage = FORTRAN_RUNTIME_STAT_FAILED_IMAGE, StatLocked = FORTRAN_RUNTIME_STAT_LOCKED, StatLockedOtherImage = FORTRAN_RUNTIME_STAT_LOCKED_OTHER_IMAGE, @@ -49,10 +49,14 @@ enum Stat { // Additional "processor-defined" STAT= values StatInvalidArgumentNumber = FORTRAN_RUNTIME_STAT_INVALID_ARG_NUMBER, StatMissingArgument = FORTRAN_RUNTIME_STAT_MISSING_ARG, - StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, + StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, // -1 StatMoveAllocSameAllocatable = FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE, StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION, + + // Dummy status for work queue continuation, declared here to perhaps + // avoid collisions + StatContinue = 201 }; RT_API_ATTRS const char *StatErrorString(int); diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..2b46890aeebe1 --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,538 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the 
LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// Internal runtime utilities for work queues that replace the use of recursion
+// for better GPU device support.
+//
+// A work queue comprises a list of tickets. Each ticket class has a Begin()
+// member function, which is called once, and a Continue() member function
+// that can be called zero or more times. A ticket's execution terminates
+// when either of these member functions returns a status other than
+// StatContinue. When that status is not StatOk, then the whole queue
+// is shut down.
+//
+// By returning StatContinue from its Continue() member function,
+// a ticket suspends its execution so that any nested tickets that it
+// may have created can be run to completion. It is the responsibility
+// of each ticket class to maintain resumption information in its state
+// and manage its own progress. Most ticket classes inherit from
+// class ComponentsOverElements, which implements an outer loop over all
+// components of a derived type, and an inner loop over all elements
+// of a descriptor, possibly with multiple phases of execution per element.
+//
+// Tickets are created by WorkQueue::Begin...() member functions.
+// There is one of these for each "top level" recursive function in the
+// Fortran runtime support library that has been restructured into this
+// ticket framework.
+//
+// When the work queue is running tickets, it always selects the last ticket
+// on the list for execution -- "work stack" might have been a more accurate
+// name for this framework. This ticket may, while doing its job, create
+// new tickets, and since those are pushed after the active one, the first
+// such nested ticket will be the next one executed to completion -- i.e.,
+// the order of nested WorkQueue::Begin...() calls is respected.
+// Note that a ticket's Continue() member function won't be called again
+// until all nested tickets have run to completion and it is once again
+// the last ticket on the queue.
+//
+// Example for an assignment to a derived type:
+// 1. Assign() is called, and its work queue is created. It calls
+//    WorkQueue::BeginAssign() and then WorkQueue::Run().
+// 2. Run calls AssignTicket::Begin(), which pushes a ticket via
+//    BeginFinalize() and returns StatContinue.
+// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called
+//    until one of them returns StatOk, which ends the finalization ticket.
+// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket
+//    and then returns StatOk, which ends the ticket.
+// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin()
+//    and ::Continue() are called until they are done (not StatContinue).
+//    Along the way, it may create nested AssignTickets for components,
+//    and suspend itself so that they may each run to completion.
+ +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/connection.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include + +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; + +// Ticket worker base classes + +template class ImmediateTicketRunner { +public: + RT_API_ATTRS explicit ImmediateTicketRunner(TICKET &ticket) + : ticket_{ticket} {} + RT_API_ATTRS int Run(WorkQueue &workQueue) { + int status{ticket_.Begin(workQueue)}; + while (status == StatContinue) { + status = ticket_.Continue(workQueue); + } + return status; + } + +private: + TICKET &ticket_; +}; + +// Base class for ticket workers that operate elementwise over descriptors +class Elementwise { +protected: + RT_API_ATTRS Elementwise( + const Descriptor &instance, const Descriptor *from = nullptr) + : instance_{instance}, from_{from} { + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + RT_API_ATTRS bool IsComplete() const { return elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that 
operate over derived type components. +class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentsOverElements : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Ticket worker classes + +// Implements derived type instance initialization +class InitializeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket + : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor 
cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket : public ImmediateTicketRunner { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : ImmediateTicketRunner{*this}, to_{to}, from_{&from}, + flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type 
intrinsic assignment. +class DerivedAssignTicket : public ImmediateTicketRunner, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ImmediateTicketRunner{*this}, + ElementsOverComponents{to, derived, &from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { + +template +class DescriptorIoTicket + : public ImmediateTicketRunner>, + private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS bool &anyIoTookPlace() { return anyIoTookPlace_; } + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; + +template +class DerivedIoTicket : public ImmediateTicketRunner>, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} 
{} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; +}; + +} // namespace io::descr + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + io::descr::DescriptorIoTicket, + io::descr::DerivedIoTicket, + io::descr::DerivedIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + // APIs for particular tasks. These can return StatOk if the work is + // completed immediately. 
+ RT_API_ATTRS int BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return InitializeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + if (runTicketsImmediately_) { + return InitializeCloneTicket{clone, original, derived, hasStat, errMsg} + .Run(*this); + } else { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); + return StatContinue; + } + } + RT_API_ATTRS int BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return FinalizeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + if (runTicketsImmediately_) { + return DestroyTicket{descriptor, derived, finalize}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived, finalize); + return StatContinue; + } + } + RT_API_ATTRS int BeginAssign(Descriptor &to, const Descriptor &from, + int flags, MemmoveFct memmoveFct) { + if (runTicketsImmediately_) { + return AssignTicket{to, from, flags, memmoveFct}.Run(*this); + } else { + StartTicket().u.emplace(to, from, flags, memmoveFct); + return StatContinue; + } + } + RT_API_ATTRS int BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) { + if (runTicketsImmediately_) { + return DerivedAssignTicket{ + to, from, derived, flags, memmoveFct, deallocateAfter} + .Run(*this); + } else { + StartTicket().u.emplace( + to, from, derived, flags, 
memmoveFct, deallocateAfter); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDescriptorIo(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DescriptorIoTicket{ + io, descriptor, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, table, anyIoTookPlace); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedIo(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DerivedIoTicket{ + io, descriptor, derived, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + return StatContinue; + } + } + + RT_API_ATTRS int Run(); + +private: +#if RT_DEVICE_COMPILATION + // Always use the work queue on a GPU device to avoid recursion. + static constexpr bool runTicketsImmediately_{false}; +#else + // Avoid the work queue overhead on the host, unless it needs + // debugging, which is so much easier there. + static constexpr bool runTicketsImmediately_{true}; +#endif + + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 9be75da9520e3..cc2000ddfdb6e 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncId)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic assignments and 
// those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,373 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + if (workQueue.BeginAssign(to, from, flags, memmoveFct) == StatContinue) { + workQueue.Run(); + } +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + if (int status{workQueue.BeginFinalize(*toDeallocate_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncId))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(newFrom, *derived)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + } + static constexpr int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (int status{workQueue.BeginAssign( + newFrom, *from_, nestedFlags, memmoveFct_)}; + status != StatOk && status != StatContinue) { + return status; } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && 
toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + if (int status{workQueue.BeginFinalize(to_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!toDerived_->noDestructionNeeded()) { + if (int status{ + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + return StatContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); } + return StatOk; } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. 
A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(to_, *toDerived_)}; + status != StatOk) { + return status; + } + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t 
toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); - } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. 
- // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } + if (toDerived_) { + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_( + instance_.ElementComponent(subscripts_, procPtr.offset), + from_->ElementComponent( + fromSubscripts_, procPtr.offset), + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementsOverComponents so that it + // loops over elements at the outer level and over components at the inner. + // This yields less surprising behavior for codes that care due to the use + // of defined assignments when the ordering of their calls matters. + while (!IsComplete()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, *from_, workQueue.terminator(), fromSubscripts_); + Advance(); + if (int status{workQueue.BeginAssign( + toCompDesc, fromCompDesc, flags_, memmoveFct_)}; + status != StatOk) { + return status; + } + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_->Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } + } + toDesc->Deallocate(); + } + Advance(); + } else { + // Allocatable components of the LHS are 
unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + int nestedFlags{flags_ | DeallocateLHS}; + Advance(); + if (int status{workQueue.BeginAssign( + *toDesc, *fromDesc, nestedFlags, memmoveFct_)}; + status != StatOk) { + return status; + } + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +679,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -597,11 +694,11 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
- if (var) + if (var) { Assign(*var, temp, terminator, NoAssignFlags); + } temp.Destroy(/*finalize=*/false, /*destroyPointers=*/false, &terminator); } diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..8462d0aba1f06 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,195 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitialize(instance, derived)}; + return status == StatContinue ? 
workQueue.Run() : status; +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &comp{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + auto &pptr{*instance_.ElementComponent( + subscripts_, comp.offset)}; + pptr = comp.procInitialization; + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + SkipToNextComponent(); + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t 
bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitialize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitializeClone( + clone, original, derived, hasStat, errMsg)}; + return status == StatContinue ? workQueue.Run() : status; } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. 
- stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. - if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncId), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + if (int status{workQueue.BeginInitialize(cloneDesc, *derived)}; + status != StatOk) { + return status; } } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_)}; + status != StatOk) { + return status; + } + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); + } + } + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + if (workQueue.BeginFinalize(descriptor, derived) == StatContinue) { + workQueue.Run(); } } - return stat; } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +237,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +274,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,87 +298,94 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with the same 
// descriptor so that the rank is preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + if (int status{ + workQueue.BeginFinalize(compDesc, *compDynamicType)}; + status != StatOk) { + return status; } } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginFinalize(compDesc, *compType)}; + status != StatOk) { + return status; } } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == 
typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginFinalize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + const auto &parentType{*finalizableParentType_}; + finalizableParentType_ = nullptr; + // Don't return StatOk here if the nested Finalize is still running; + // it needs this->componentDescriptor_. + return workQueue.BeginFinalize(tmpDesc, parentType); } + return StatOk; } // The order of finalization follows Fortran 2018 7.5.6.2, with @@ -373,51 +394,71 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + if (workQueue.BeginDestroy(descriptor, derived, finalize) == StatContinue) { + workQueue.Run(); + } } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + if (int status{workQueue.BeginFinalize(instance_, derived_)}; + status != StatOk && status != StatContinue) { + return status; + } } + return StatContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. 
// Contrary to finalization, the order of deallocation does not matter. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *d, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || 
componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginDestroy( + compDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..de2b9a788a25e 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,12 +7,40 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) Fortran::common::optional DefinedFormattedIo(IoStatementState &io, const Descriptor &descriptor, const typeInfo::DerivedType &derived, @@ -104,8 +132,8 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -152,5 +180,619 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, return handler.GetIoStat() == IostatOk; } +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output. 
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + } + } + return StatOk; +} + +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType * + type{addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } + } + } + if (const typeInfo::SpecialBinding * + special{type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + if (DefinedUnformattedIo(io_, instance_, *type, *special)) { + anyIoTookPlace_ = true; + return StatOk; + } else { + return IostatEnd; + } + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? 
io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. + if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + if constexpr (DIR == Direction::Output) { + return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + return externalUnf + ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elements_ * elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + bool any{false}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + any = FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Complex: + switch (kind) 
{ + case 2: + any = FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + any = FormattedCharacterIO(io_, instance_); + break; + case 2: + any = FormattedCharacterIO(io_, instance_); + break; + case 4: + any = FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + any = FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + any = FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + any = FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding * + binding{derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatContinue; + } + } + if (any) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ while (!IsComplete()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + Advance(); + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element(subscripts_)); + Advance(); + if (int status{workQueue.BeginDerivedIo( + io_, elementDesc, *derived_, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + if (workQueue.BeginDescriptorIo(io, descriptor, table, anyIoTookPlace) == + StatContinue) { + workQueue.Run(); + } + return anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. -// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) 
+#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template <typename A> -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element<A>(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { 
RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..9382c96bd870a --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,145 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/environment.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +// FLANG_RT_DEBUG code is disabled when false. 
+static constexpr bool enableDebugOutput{false}; + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (componentAt_ >= components_) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + } + firstFree_ = next; + } +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? 
insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: new ticket\n"); + } + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), + at->ticket.begun ? "Continue" : "Begin"); + } + int stat{at->ticket.Continue(*this)}; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: ... stat %d\n", stat); + } + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp index 3833e48be3dd6..6c148b1de6f82 100644 --- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp +++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp @@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) { io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__); for (int j{1}; j <= 3; ++j) { ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc)) 
- << "OutputDescriptor() for InquireIoLength"; + << "OutputDescriptor() for InquireIoLength " << j; } ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength"; ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk) diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..32bebc1d866a4 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -843,6 +843,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. + However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. + Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. + A program using defined assignment might be able to detect the difference. 
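The loop-nesting note added to Extensions.md above can be illustrated with a small C++ sketch. It is not the flang runtime's implementation: `Logged`, `Pair`, `AssignLogged`, `AssignElementwise`, and `AssignComponentwise` are hypothetical names introduced here only to show how a component with user-visible (defined) assignment could observe the two orders.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a component that has defined assignment.
struct Logged {
  int value = 0;
};

// Hypothetical derived type with two such components.
struct Pair {
  Logged a;
  Logged b;
};

// Models a defined assignment: copies the value and records which
// component of which element was assigned, so the order is observable.
inline void AssignLogged(Logged &lhs, const Logged &rhs, char tag,
    std::size_t index, std::string &log) {
  lhs.value = rhs.value;
  log += tag;
  log += std::to_string(index);
  log += ' ';
}

// Element-major order: every component of element j is assigned
// before moving on to element j+1.
template <std::size_t N>
std::string AssignElementwise(
    std::array<Pair, N> &lhs, const std::array<Pair, N> &rhs) {
  std::string log;
  for (std::size_t j{0}; j < N; ++j) {
    AssignLogged(lhs[j].a, rhs[j].a, 'a', j, log);
    AssignLogged(lhs[j].b, rhs[j].b, 'b', j, log);
  }
  return log;
}

// Component-major order (the order the note attributes to this compiler):
// component a of every element is assigned, then component b of every element.
template <std::size_t N>
std::string AssignComponentwise(
    std::array<Pair, N> &lhs, const std::array<Pair, N> &rhs) {
  std::string log;
  for (std::size_t j{0}; j < N; ++j) {
    AssignLogged(lhs[j].a, rhs[j].a, 'a', j, log);
  }
  for (std::size_t j{0}; j < N; ++j) {
    AssignLogged(lhs[j].b, rhs[j].b, 'b', j, log);
  }
  return log;
}
```

Both functions leave the destination array with the same final values; only the recorded sequence of assignments differs, which is the kind of difference a program using defined assignment might detect.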
+ ## De Facto Standard Features * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION >From 716abece73720048697a548e8bfc91b2a603b266 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 23 May 2025 09:54:28 -0700 Subject: [PATCH 2/2] more --- flang-rt/lib/runtime/work-queue.cpp | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp index 9382c96bd870a..1c3dd5146d0bf 100644 --- a/flang-rt/lib/runtime/work-queue.cpp +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -87,9 +87,11 @@ RT_API_ATTRS Ticket &WorkQueue::StartTicket() { last_ = newTicket; } newTicket->ticket.begun = false; +#if !defined(RT_DEVICE_COMPILATION) if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { std::fprintf(stderr, "WQ: new ticket\n"); } +#endif return newTicket->ticket; } @@ -97,14 +99,18 @@ RT_API_ATTRS int WorkQueue::Run() { while (last_) { TicketList *at{last_}; insertAfter_ = last_; +#if !defined(RT_DEVICE_COMPILATION) if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), at->ticket.begun ? "Continue" : "Begin"); } +#endif int stat{at->ticket.Continue(*this)}; +#if !defined(RT_DEVICE_COMPILATION) if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { std::fprintf(stderr, "WQ: ... 
stat %d\n", stat); } +#endif insertAfter_ = nullptr; if (stat == StatOk) { if (at->previous) { From flang-commits at lists.llvm.org Fri May 23 10:48:05 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Fri, 23 May 2025 10:48:05 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <6830b4d5.050a0220.2e54.bdac@mx.google.com> https://github.com/eZWALT updated https://github.com/llvm/llvm-project/pull/139293 >From 204d902b738dcd9d260963afab3d4f8f5f1c0066 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:25:33 +0000 Subject: [PATCH 1/9] Add fuse directive patch --- clang/include/clang-c/Index.h | 4 + clang/include/clang/AST/RecursiveASTVisitor.h | 3 + clang/include/clang/AST/StmtOpenMP.h | 105 +- .../clang/Basic/DiagnosticSemaKinds.td | 8 + clang/include/clang/Basic/StmtNodes.td | 1 + clang/include/clang/Sema/SemaOpenMP.h | 27 + .../include/clang/Serialization/ASTBitCodes.h | 1 + clang/lib/AST/StmtOpenMP.cpp | 25 + clang/lib/AST/StmtPrinter.cpp | 5 + clang/lib/AST/StmtProfile.cpp | 4 + clang/lib/Basic/OpenMPKinds.cpp | 2 +- clang/lib/CodeGen/CGStmt.cpp | 3 + clang/lib/CodeGen/CGStmtOpenMP.cpp | 8 + clang/lib/CodeGen/CodeGenFunction.h | 1 + clang/lib/Sema/SemaExceptionSpec.cpp | 1 + clang/lib/Sema/SemaOpenMP.cpp | 600 +++++++ clang/lib/Sema/TreeTransform.h | 11 + clang/lib/Serialization/ASTReaderStmt.cpp | 11 + clang/lib/Serialization/ASTWriterStmt.cpp | 6 + clang/lib/StaticAnalyzer/Core/ExprEngine.cpp | 1 + clang/test/OpenMP/fuse_ast_print.cpp | 278 +++ clang/test/OpenMP/fuse_codegen.cpp | 1511 +++++++++++++++++ clang/test/OpenMP/fuse_messages.cpp | 76 + clang/tools/libclang/CIndex.cpp | 7 + clang/tools/libclang/CXCursor.cpp | 3 + llvm/include/llvm/Frontend/OpenMP/OMP.td | 4 + .../runtime/test/transform/fuse/foreach.cpp | 192 +++ 
openmp/runtime/test/transform/fuse/intfor.c | 50 + .../runtime/test/transform/fuse/iterfor.cpp | 194 +++ .../fuse/parallel-wsloop-collapse-foreach.cpp | 208 +++ .../fuse/parallel-wsloop-collapse-intfor.c | 45 + 31 files changed, 3391 insertions(+), 4 deletions(-) create mode 100644 clang/test/OpenMP/fuse_ast_print.cpp create mode 100644 clang/test/OpenMP/fuse_codegen.cpp create mode 100644 clang/test/OpenMP/fuse_messages.cpp create mode 100644 openmp/runtime/test/transform/fuse/foreach.cpp create mode 100644 openmp/runtime/test/transform/fuse/intfor.c create mode 100644 openmp/runtime/test/transform/fuse/iterfor.cpp create mode 100644 openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp create mode 100644 openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c diff --git a/clang/include/clang-c/Index.h b/clang/include/clang-c/Index.h index d30d15e53802a..00046de62a742 100644 --- a/clang/include/clang-c/Index.h +++ b/clang/include/clang-c/Index.h @@ -2162,6 +2162,10 @@ enum CXCursorKind { */ CXCursor_OMPStripeDirective = 310, + /** OpenMP fuse directive + */ + CXCursor_OMPFuseDirective = 318, + /** OpenACC Compute Construct. 
*/ CXCursor_OpenACCComputeConstruct = 320, diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h index 23a8c4f1f7380..057e9e346ce4e 100644 --- a/clang/include/clang/AST/RecursiveASTVisitor.h +++ b/clang/include/clang/AST/RecursiveASTVisitor.h @@ -3080,6 +3080,9 @@ DEF_TRAVERSE_STMT(OMPUnrollDirective, DEF_TRAVERSE_STMT(OMPReverseDirective, { TRY_TO(TraverseOMPExecutableDirective(S)); }) +DEF_TRAVERSE_STMT(OMPFuseDirective, + { TRY_TO(TraverseOMPExecutableDirective(S)); }) + DEF_TRAVERSE_STMT(OMPInterchangeDirective, { TRY_TO(TraverseOMPExecutableDirective(S)); }) diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index 736bcabbad1f7..dc6f797e24ab8 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -962,6 +962,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Number of loops generated by this loop transformation. unsigned NumGeneratedLoops = 0; + /// Number of top level canonical loop nests generated by this loop + /// transformation + unsigned NumGeneratedLoopNests = 0; protected: explicit OMPLoopTransformationDirective(StmtClass SC, @@ -973,6 +976,9 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Set the number of loops generated by this loop transformation. void setNumGeneratedLoops(unsigned Num) { NumGeneratedLoops = Num; } + /// Set the number of top level canonical loop nests generated by this loop + /// transformation + void setNumGeneratedLoopNests(unsigned Num) { NumGeneratedLoopNests = Num; } public: /// Return the number of associated (consumed) loops. @@ -981,6 +987,10 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { /// Return the number of loops generated by this loop transformation. 
unsigned getNumGeneratedLoops() const { return NumGeneratedLoops; } + /// Return the number of top level canonical loop nests generated by this loop + /// transformation + unsigned getNumGeneratedLoopNests() const { return NumGeneratedLoopNests; } + /// Get the de-sugared statements after the loop transformation. /// /// Might be nullptr if either the directive generates no loops and is handled @@ -995,7 +1005,8 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { Stmt::StmtClass C = T->getStmtClass(); return C == OMPTileDirectiveClass || C == OMPUnrollDirectiveClass || C == OMPReverseDirectiveClass || C == OMPInterchangeDirectiveClass || - C == OMPStripeDirectiveClass; + C == OMPStripeDirectiveClass || + C == OMPFuseDirectiveClass; } }; @@ -5562,6 +5573,7 @@ class OMPTileDirective final : public OMPLoopTransformationDirective { llvm::omp::OMPD_tile, StartLoc, EndLoc, NumLoops) { setNumGeneratedLoops(2 * NumLoops); + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5790,7 +5802,11 @@ class OMPReverseDirective final : public OMPLoopTransformationDirective { explicit OMPReverseDirective(SourceLocation StartLoc, SourceLocation EndLoc) : OMPLoopTransformationDirective(OMPReverseDirectiveClass, llvm::omp::OMPD_reverse, StartLoc, - EndLoc, 1) {} + EndLoc, 1) { + + setNumGeneratedLoopNests(1); + setNumGeneratedLoops(1); + } void setPreInits(Stmt *PreInits) { Data->getChildren()[PreInitsOffset] = PreInits; @@ -5857,7 +5873,8 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPInterchangeDirectiveClass, llvm::omp::OMPD_interchange, StartLoc, EndLoc, NumLoops) { - setNumGeneratedLoops(3 * NumLoops); + setNumGeneratedLoops(NumLoops); + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5908,6 +5925,88 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { } }; +/// Represents the '#pragma omp fuse' loop transformation directive 
+/// +/// \code{c} +/// #pragma omp fuse +/// { +/// for(int i = 0; i < m1; ++i) {...} +/// for(int j = 0; j < m2; ++j) {...} +/// ... +/// } +/// \endcode + +class OMPFuseDirective final : public OMPLoopTransformationDirective { + friend class ASTStmtReader; + friend class OMPExecutableDirective; + + // Offsets of child members. + enum { + PreInitsOffset = 0, + TransformedStmtOffset, + }; + + explicit OMPFuseDirective(SourceLocation StartLoc, SourceLocation EndLoc, + unsigned NumLoops) + : OMPLoopTransformationDirective(OMPFuseDirectiveClass, + llvm::omp::OMPD_fuse, StartLoc, EndLoc, + NumLoops) { + setNumGeneratedLoops(1); + // TODO: After implementing the looprange clause, change this logic + setNumGeneratedLoopNests(1); + } + + void setPreInits(Stmt *PreInits) { + Data->getChildren()[PreInitsOffset] = PreInits; + } + + void setTransformedStmt(Stmt *S) { + Data->getChildren()[TransformedStmtOffset] = S; + } + +public: + /// Create a new AST node representation for '#pragma omp fuse' + /// + /// \param C Context of the AST + /// \param StartLoc Location of the introducer (e.g. the 'omp' token) + /// \param EndLoc Location of the directive's end (e.g. the tok::eod) + /// \param Clauses The directive's clauses + /// \param NumLoops Number of total affected loops + /// \param NumLoopNests Number of affected top level canonical loops + /// (number of items in the 'looprange' clause if present) + /// \param AssociatedStmt The outermost associated loop + /// \param TransformedStmt The loop nest after fusion, or nullptr in + /// dependent + /// \param PreInits Helper preinits statements for the loop nest + static OMPFuseDirective *Create(const ASTContext &C, SourceLocation StartLoc, + SourceLocation EndLoc, + ArrayRef<OMPClause *> Clauses, + unsigned NumLoops, unsigned NumLoopNests, + Stmt *AssociatedStmt, Stmt *TransformedStmt, + Stmt *PreInits); + + /// Build an empty '#pragma omp fuse' AST node for deserialization + /// + /// \param C Context of the AST + /// \param NumClauses 
Number of clauses to allocate + /// \param NumLoops Number of associated loops to allocate + static OMPFuseDirective *CreateEmpty(const ASTContext &C, unsigned NumClauses, + unsigned NumLoops); + + /// Gets the associated loops after the transformation. This is the de-sugared + /// replacement or nullptr in dependent contexts. + Stmt *getTransformedStmt() const { + return Data->getChildren()[TransformedStmtOffset]; + } + + /// Return preinits statement. + Stmt *getPreInits() const { return Data->getChildren()[PreInitsOffset]; } + + static bool classof(const Stmt *T) { + return T->getStmtClass() == OMPFuseDirectiveClass; + } +}; + /// This represents '#pragma omp scan' directive. /// /// \code diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 78b36ceb88125..f31b6f8a3b26a 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -11558,6 +11558,14 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; +def warn_omp_different_loop_ind_var_types : Warning < + "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">; +def err_omp_not_canonical_loop : Error < + "loop after '#pragma omp %0' is not in canonical form">; +def err_omp_not_a_loop_sequence : Error < + "statement after '#pragma omp %0' must be a loop sequence containing canonical loops or loop-generating constructs">; +def err_omp_empty_loop_sequence : Error < + "loop sequence after '#pragma omp %0' must contain at least 1 canonical loop or loop-generating construct">; def err_omp_not_for : Error< "%select{statement after '#pragma omp %1' must be a for loop|" "expected %2 for loops after '#pragma omp %1'%select{|, but found only %4}3}0">; diff --git 
a/clang/include/clang/Basic/StmtNodes.td b/clang/include/clang/Basic/StmtNodes.td index 9526fa5808aa5..739160342062c 100644 --- a/clang/include/clang/Basic/StmtNodes.td +++ b/clang/include/clang/Basic/StmtNodes.td @@ -234,6 +234,7 @@ def OMPStripeDirective : StmtNode; def OMPUnrollDirective : StmtNode; def OMPReverseDirective : StmtNode; def OMPInterchangeDirective : StmtNode; +def OMPFuseDirective : StmtNode; def OMPForDirective : StmtNode; def OMPForSimdDirective : StmtNode; def OMPSectionsDirective : StmtNode; diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index 6498390fe96f7..8d78c2197c89d 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -457,6 +457,13 @@ class SemaOpenMP : public SemaBase { Stmt *AStmt, SourceLocation StartLoc, SourceLocation EndLoc); + + /// Called on well-formed '#pragma omp fuse' after parsing of its + /// clauses and the associated statement. + StmtResult ActOnOpenMPFuseDirective(ArrayRef Clauses, + Stmt *AStmt, SourceLocation StartLoc, + SourceLocation EndLoc); + /// Called on well-formed '\#pragma omp for' after parsing /// of the associated statement. StmtResult @@ -1480,6 +1487,26 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, Stmt *&Body, SmallVectorImpl> &OriginalInits); + /// Analyzes and checks a loop sequence for use by a loop transformation + /// + /// \param Kind The loop transformation directive kind. + /// \param NumLoops [out] Number of total canonical loops + /// \param LoopSeqSize [out] Number of top level canonical loops + /// \param LoopHelpers [out] The multiple loop analyses results. + /// \param LoopStmts [out] The multiple Stmt of each For loop. + /// \param OriginalInits [out] The multiple collection of statements and + /// declarations that must have been executed/declared + /// before entering the loop. 
+ /// \param Context + /// \return Whether there was an absence of errors or not + bool checkTransformableLoopSequence( + OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, + unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + ASTContext &Context); + /// Helper to keep information about the current `omp begin/end declare /// variant` nesting. struct OMPDeclareVariantScope { diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h index 5cb9998126a85..8fe9d8248d66f 100644 --- a/clang/include/clang/Serialization/ASTBitCodes.h +++ b/clang/include/clang/Serialization/ASTBitCodes.h @@ -1948,6 +1948,7 @@ enum StmtCode { STMT_OMP_UNROLL_DIRECTIVE, STMT_OMP_REVERSE_DIRECTIVE, STMT_OMP_INTERCHANGE_DIRECTIVE, + STMT_OMP_FUSE_DIRECTIVE, STMT_OMP_FOR_DIRECTIVE, STMT_OMP_FOR_SIMD_DIRECTIVE, STMT_OMP_SECTIONS_DIRECTIVE, diff --git a/clang/lib/AST/StmtOpenMP.cpp b/clang/lib/AST/StmtOpenMP.cpp index 093e1f659916f..4a6133766ef1c 100644 --- a/clang/lib/AST/StmtOpenMP.cpp +++ b/clang/lib/AST/StmtOpenMP.cpp @@ -456,6 +456,8 @@ OMPUnrollDirective::Create(const ASTContext &C, SourceLocation StartLoc, auto *Dir = createDirective( C, Clauses, AssociatedStmt, TransformedStmtOffset + 1, StartLoc, EndLoc); Dir->setNumGeneratedLoops(NumGeneratedLoops); + // The number of generated loops and loop nests during unroll matches + Dir->setNumGeneratedLoopNests(NumGeneratedLoops); Dir->setTransformedStmt(TransformedStmt); Dir->setPreInits(PreInits); return Dir; @@ -505,6 +507,29 @@ OMPInterchangeDirective::CreateEmpty(const ASTContext &C, unsigned NumClauses, SourceLocation(), SourceLocation(), NumLoops); } +OMPFuseDirective *OMPFuseDirective::Create( + const ASTContext &C, SourceLocation StartLoc, SourceLocation EndLoc, + ArrayRef Clauses, unsigned NumLoops, unsigned NumLoopNests, + Stmt *AssociatedStmt, Stmt *TransformedStmt, Stmt *PreInits) { + + OMPFuseDirective *Dir 
= createDirective( + C, Clauses, AssociatedStmt, TransformedStmtOffset + 1, StartLoc, EndLoc, + NumLoops); + Dir->setTransformedStmt(TransformedStmt); + Dir->setPreInits(PreInits); + Dir->setNumGeneratedLoopNests(NumLoopNests); + Dir->setNumGeneratedLoops(NumLoops); + return Dir; +} + +OMPFuseDirective *OMPFuseDirective::CreateEmpty(const ASTContext &C, + unsigned NumClauses, + unsigned NumLoops) { + return createEmptyDirective( + C, NumClauses, /*HasAssociatedStmt=*/true, TransformedStmtOffset + 1, + SourceLocation(), SourceLocation(), NumLoops); +} + OMPForSimdDirective * OMPForSimdDirective::Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation EndLoc, unsigned CollapsedNum, diff --git a/clang/lib/AST/StmtPrinter.cpp b/clang/lib/AST/StmtPrinter.cpp index dc8af1586624b..12a1d5a943704 100644 --- a/clang/lib/AST/StmtPrinter.cpp +++ b/clang/lib/AST/StmtPrinter.cpp @@ -791,6 +791,11 @@ void StmtPrinter::VisitOMPInterchangeDirective(OMPInterchangeDirective *Node) { PrintOMPExecutableDirective(Node); } +void StmtPrinter::VisitOMPFuseDirective(OMPFuseDirective *Node) { + Indent() << "#pragma omp fuse"; + PrintOMPExecutableDirective(Node); +} + void StmtPrinter::VisitOMPForDirective(OMPForDirective *Node) { Indent() << "#pragma omp for"; PrintOMPExecutableDirective(Node); diff --git a/clang/lib/AST/StmtProfile.cpp b/clang/lib/AST/StmtProfile.cpp index f7d1655f67ed1..99d426db985e8 100644 --- a/clang/lib/AST/StmtProfile.cpp +++ b/clang/lib/AST/StmtProfile.cpp @@ -1026,6 +1026,10 @@ void StmtProfiler::VisitOMPInterchangeDirective( VisitOMPLoopTransformationDirective(S); } +void StmtProfiler::VisitOMPFuseDirective(const OMPFuseDirective *S) { + VisitOMPLoopTransformationDirective(S); +} + void StmtProfiler::VisitOMPForDirective(const OMPForDirective *S) { VisitOMPLoopDirective(S); } diff --git a/clang/lib/Basic/OpenMPKinds.cpp b/clang/lib/Basic/OpenMPKinds.cpp index a451fc7c01841..d172450512f13 100644 --- a/clang/lib/Basic/OpenMPKinds.cpp +++ 
b/clang/lib/Basic/OpenMPKinds.cpp @@ -702,7 +702,7 @@ bool clang::isOpenMPLoopBoundSharingDirective(OpenMPDirectiveKind Kind) { bool clang::isOpenMPLoopTransformationDirective(OpenMPDirectiveKind DKind) { return DKind == OMPD_tile || DKind == OMPD_unroll || DKind == OMPD_reverse || - DKind == OMPD_interchange || DKind == OMPD_stripe; + DKind == OMPD_interchange || DKind == OMPD_stripe || DKind == OMPD_fuse; } bool clang::isOpenMPCombinedParallelADirective(OpenMPDirectiveKind DKind) { diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp index 3562b4ea22a24..4a2dc1a537d46 100644 --- a/clang/lib/CodeGen/CGStmt.cpp +++ b/clang/lib/CodeGen/CGStmt.cpp @@ -233,6 +233,9 @@ void CodeGenFunction::EmitStmt(const Stmt *S, ArrayRef Attrs) { case Stmt::OMPInterchangeDirectiveClass: EmitOMPInterchangeDirective(cast(*S)); break; + case Stmt::OMPFuseDirectiveClass: + EmitOMPFuseDirective(cast(*S)); + break; case Stmt::OMPForDirectiveClass: EmitOMPForDirective(cast(*S)); break; diff --git a/clang/lib/CodeGen/CGStmtOpenMP.cpp b/clang/lib/CodeGen/CGStmtOpenMP.cpp index 803c7ed37635e..0c664b0f89044 100644 --- a/clang/lib/CodeGen/CGStmtOpenMP.cpp +++ b/clang/lib/CodeGen/CGStmtOpenMP.cpp @@ -197,6 +197,8 @@ class OMPLoopScope : public CodeGenFunction::RunCleanupsScope { } else if (const auto *Interchange = dyn_cast(&S)) { PreInits = Interchange->getPreInits(); + } else if (const auto *Fuse = dyn_cast(&S)) { + PreInits = Fuse->getPreInits(); } else { llvm_unreachable("Unknown loop-based directive kind."); } @@ -2918,6 +2920,12 @@ void CodeGenFunction::EmitOMPInterchangeDirective( EmitStmt(S.getTransformedStmt()); } +void CodeGenFunction::EmitOMPFuseDirective(const OMPFuseDirective &S) { + // Emit the de-sugared statement + OMPTransformDirectiveScopeRAII FuseScope(*this, &S); + EmitStmt(S.getTransformedStmt()); +} + void CodeGenFunction::EmitOMPUnrollDirective(const OMPUnrollDirective &S) { bool UseOMPIRBuilder = CGM.getLangOpts().OpenMPIRBuilder; diff --git 
a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h index 78d71fc822bcb..a983901f560de 100644 --- a/clang/lib/CodeGen/CodeGenFunction.h +++ b/clang/lib/CodeGen/CodeGenFunction.h @@ -3906,6 +3906,7 @@ class CodeGenFunction : public CodeGenTypeCache { void EmitOMPUnrollDirective(const OMPUnrollDirective &S); void EmitOMPReverseDirective(const OMPReverseDirective &S); void EmitOMPInterchangeDirective(const OMPInterchangeDirective &S); + void EmitOMPFuseDirective(const OMPFuseDirective &S); void EmitOMPForDirective(const OMPForDirective &S); void EmitOMPForSimdDirective(const OMPForSimdDirective &S); void EmitOMPScopeDirective(const OMPScopeDirective &S); diff --git a/clang/lib/Sema/SemaExceptionSpec.cpp b/clang/lib/Sema/SemaExceptionSpec.cpp index c83eab53891ca..85a374e6eb9b2 100644 --- a/clang/lib/Sema/SemaExceptionSpec.cpp +++ b/clang/lib/Sema/SemaExceptionSpec.cpp @@ -1491,6 +1491,7 @@ CanThrowResult Sema::canThrow(const Stmt *S) { case Stmt::OMPUnrollDirectiveClass: case Stmt::OMPReverseDirectiveClass: case Stmt::OMPInterchangeDirectiveClass: + case Stmt::OMPFuseDirectiveClass: case Stmt::OMPSingleDirectiveClass: case Stmt::OMPTargetDataDirectiveClass: case Stmt::OMPTargetDirectiveClass: diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index f16f841d62edd..bd8bee64a9d2f 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -4404,6 +4404,7 @@ void SemaOpenMP::ActOnOpenMPRegionStart(OpenMPDirectiveKind DKind, case OMPD_unroll: case OMPD_reverse: case OMPD_interchange: + case OMPD_fuse: case OMPD_assume: break; default: @@ -6221,6 +6222,10 @@ StmtResult SemaOpenMP::ActOnOpenMPExecutableDirective( Res = ActOnOpenMPInterchangeDirective(ClausesWithImplicit, AStmt, StartLoc, EndLoc); break; + case OMPD_fuse: + Res = + ActOnOpenMPFuseDirective(ClausesWithImplicit, AStmt, StartLoc, EndLoc); + break; case OMPD_for: Res = ActOnOpenMPForDirective(ClausesWithImplicit, AStmt, StartLoc, EndLoc, 
VarsWithInheritedDSA); @@ -14193,6 +14198,8 @@ bool SemaOpenMP::checkTransformableLoopNest( DependentPreInits = Dir->getPreInits(); else if (auto *Dir = dyn_cast(Transform)) DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); else llvm_unreachable("Unhandled loop transformation"); @@ -14203,6 +14210,265 @@ bool SemaOpenMP::checkTransformableLoopNest( return Result; } +class NestedLoopCounterVisitor + : public clang::RecursiveASTVisitor { +public: + explicit NestedLoopCounterVisitor() : NestedLoopCount(0) {} + + bool VisitForStmt(clang::ForStmt *FS) { + ++NestedLoopCount; + return true; + } + + bool VisitCXXForRangeStmt(clang::CXXForRangeStmt *FRS) { + ++NestedLoopCount; + return true; + } + + unsigned getNestedLoopCount() const { return NestedLoopCount; } + +private: + unsigned NestedLoopCount; +}; + +bool SemaOpenMP::checkTransformableLoopSequence( + OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, + unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + ASTContext &Context) { + + // Checks whether the given statement is a compound statement + VarsWithInheritedDSAType TmpDSA; + if (!isa(AStmt)) { + Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) + << getOpenMPDirectiveName(Kind); + return false; + } + // Callback for updating pre-inits in case there are even more + // loop-sequence-generating-constructs inside of the main compound stmt + auto OnTransformationCallback = + [&OriginalInits](OMPLoopBasedDirective *Transform) { + Stmt *DependentPreInits; + if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir = dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else if (auto *Dir 
= dyn_cast(Transform)) + DependentPreInits = Dir->getPreInits(); + else + llvm_unreachable("Unhandled loop transformation"); + + appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + }; + + // Number of top level canonical loop nests observed (And acts as index) + LoopSeqSize = 0; + // Number of total observed loops + NumLoops = 0; + + // Following OpenMP 6.0 API Specification, a Canonical Loop Sequence follows + // the grammar: + // + // canonical-loop-sequence: + // { + // loop-sequence+ + // } + // where loop-sequence can be any of the following: + // 1. canonical-loop-sequence + // 2. loop-nest + // 3. loop-sequence-generating-construct (i.e OMPLoopTransformationDirective) + // + // To recognise and traverse this structure the following helper functions + // have been defined. handleLoopSequence serves as the recurisve entry point + // and tries to match the input AST to the canonical loop sequence grammar + // structure + + auto NLCV = NestedLoopCounterVisitor(); + // Helper functions to validate canonical loop sequence grammar is valid + auto isLoopSequenceDerivation = [](auto *Child) { + return isa(Child) || isa(Child) || + isa(Child); + }; + auto isLoopGeneratingStmt = [](auto *Child) { + return isa(Child); + }; + + // Helper Lambda to handle storing initialization and body statements for both + // ForStmt and CXXForRangeStmt and checks for any possible mismatch between + // induction variables types + QualType BaseInductionVarType; + auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, + this, &Context](Stmt *LoopStmt) { + if (auto *For = dyn_cast(LoopStmt)) { + OriginalInits.back().push_back(For->getInit()); + ForStmts.push_back(For); + // Extract induction variable + if (auto *InitStmt = dyn_cast_or_null(For->getInit())) { + if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { + QualType InductionVarType = InitDecl->getType().getCanonicalType(); + + // Compare with first loop type + if 
(BaseInductionVarType.isNull()) { + BaseInductionVarType = InductionVarType; + } else if (!Context.hasSameType(BaseInductionVarType, + InductionVarType)) { + Diag(InitDecl->getBeginLoc(), + diag::warn_omp_different_loop_ind_var_types) + << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType + << InductionVarType; + } + } + } + + } else { + assert(isa(LoopStmt) && + "Expected canonical for or range-based for loops."); + auto *CXXFor = dyn_cast(LoopStmt); + OriginalInits.back().push_back(CXXFor->getBeginStmt()); + ForStmts.push_back(CXXFor); + } + }; + // Helper lambda functions to encapsulate the processing of different + // derivations of the canonical loop sequence grammar + // + // Modularized code for handling loop generation and transformations + auto handleLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, &OnTransformationCallback, + this](Stmt *Child) { + auto LoopTransform = dyn_cast(Child); + Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); + unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); + + // Handle the case where transformed statement is not available due to + // dependent contexts + if (!TransformedStmt) { + if (NumGeneratedLoopNests > 0) + return true; + // Unroll full + else { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + } + // Handle loop transformations with multiple loop nests + // Unroll full + if (NumGeneratedLoopNests <= 0) { + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + // Future loop transformations that generate multiple canonical loops + } else if (NumGeneratedLoopNests > 1) { + llvm_unreachable("Multiple canonical loop generating transformations " + "like loop splitting are not yet supported"); + } + + // Process the transformed loop statement + Child = TransformedStmt; + 
OriginalInits.emplace_back(); + LoopHelpers.emplace_back(); + OnTransformationCallback(LoopTransform); + + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, + TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(Child->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + storeLoopStatements(TransformedStmt); + NumLoops += LoopTransform->getNumGeneratedLoops(); + return true; + }; + + // Modularized code for handling regular canonical loops + auto handleRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, + &LoopSeqSize, &NumLoops, Kind, &TmpDSA, &NLCV, + this](Stmt *Child) { + OriginalInits.emplace_back(); + LoopHelpers.emplace_back(); + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, + TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(Child->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + storeLoopStatements(Child); + NumLoops += NLCV.TraverseStmt(Child); + return true; + }; + + // Helper function to process a Loop Sequence Recursively + auto handleLoopSequence = [&](Stmt *LoopSeqStmt, + auto &handleLoopSequenceCallback) -> bool { + for (auto *Child : LoopSeqStmt->children()) { + if (!Child) + continue; + + // Skip over non-loop-sequence statements + if (!isLoopSequenceDerivation(Child)) { + Child = Child->IgnoreContainers(); + + // Ignore empty compound statement + if (!Child) + continue; + + // In the case of a nested loop sequence ignoring containers would not + // be enough, a recurisve transversal of the loop sequence is required + if (isa(Child)) { + if (!handleLoopSequenceCallback(Child, handleLoopSequenceCallback)) + return false; + // Already been treated, skip this children + continue; + } + } + // Regular loop sequence handling + if (isLoopSequenceDerivation(Child)) { + if (isLoopGeneratingStmt(Child)) { + if 
(!handleLoopGeneration(Child)) { + return false; + } + } else { + if (!handleRegularLoop(Child)) { + return false; + } + } + ++LoopSeqSize; + } else { + // Report error for invalid statement inside canonical loop sequence + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; + } + } + return true; + }; + + // Recursive entry point to process the main loop sequence + if (!handleLoopSequence(AStmt, handleLoopSequence)) { + return false; + } + + if (LoopSeqSize <= 0) { + Diag(AStmt->getBeginLoc(), diag::err_omp_empty_loop_sequence) + << getOpenMPDirectiveName(Kind); + return false; + } + return true; +} + /// Add preinit statements that need to be propageted from the selected loop. static void addLoopPreInits(ASTContext &Context, OMPLoopBasedDirective::HelperExprs &LoopHelper, @@ -15462,6 +15728,340 @@ StmtResult SemaOpenMP::ActOnOpenMPInterchangeDirective( buildPreInits(Context, PreInits)); } +StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, + Stmt *AStmt, + SourceLocation StartLoc, + SourceLocation EndLoc) { + ASTContext &Context = getASTContext(); + DeclContext *CurrContext = SemaRef.CurContext; + Scope *CurScope = SemaRef.getCurScope(); + CaptureVars CopyTransformer(SemaRef); + + // Ensure the structured block is not empty + if (!AStmt) { + return StmtError(); + } + // Validate that the potential loop sequence is transformable for fusion + // Also collect the HelperExprs, Loop Stmts, Inits, and Number of loops + SmallVector LoopHelpers; + SmallVector LoopStmts; + SmallVector> OriginalInits; + + unsigned NumLoops; + // TODO: Support looprange clause using LoopSeqSize + unsigned LoopSeqSize; + if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, + LoopHelpers, LoopStmts, OriginalInits, + Context)) { + return StmtError(); + } + + // Defer transformation in dependent contexts + if (CurrContext->isDependentContext()) { + return OMPFuseDirective::Create(Context, StartLoc, 
EndLoc, Clauses, + NumLoops, 1, AStmt, nullptr, nullptr); + } + assert(LoopHelpers.size() == LoopSeqSize && + "Expecting loop iteration space dimensionality to match number of " + "affected loops"); + assert(OriginalInits.size() == LoopSeqSize && + "Expecting loop iteration space dimensionality to match number of " + "affected loops"); + + // PreInits hold a sequence of variable declarations that must be executed + // before the fused loop begins. These include bounds, strides, and other + // helper variables required for the transformation. + SmallVector PreInits; + + // Select the type with the largest bit width among all induction variables + QualType IVType = LoopHelpers[0].IterationVarRef->getType(); + for (unsigned int I = 1; I < LoopSeqSize; ++I) { + QualType CurrentIVType = LoopHelpers[I].IterationVarRef->getType(); + if (Context.getTypeSize(CurrentIVType) > Context.getTypeSize(IVType)) { + IVType = CurrentIVType; + } + } + uint64_t IVBitWidth = Context.getIntWidth(IVType); + + // Create pre-init declarations for all loops lower bounds, upper bounds, + // strides and num-iterations + SmallVector LBVarDecls; + SmallVector STVarDecls; + SmallVector NIVarDecls; + SmallVector UBVarDecls; + SmallVector IVVarDecls; + + // Helper lambda to create variables for bounds, strides, and other + // expressions. Generates both the variable declaration and the corresponding + // initialization statement. 
+ auto CreateHelperVarAndStmt = + [&SemaRef = this->SemaRef, &Context, &CopyTransformer, + &IVType](Expr *ExprToCopy, const std::string &BaseName, unsigned I, + bool NeedsNewVD = false) { + Expr *TransformedExpr = + AssertSuccess(CopyTransformer.TransformExpr(ExprToCopy)); + if (!TransformedExpr) + return std::pair(nullptr, StmtError()); + + auto Name = (Twine(".omp.") + BaseName + std::to_string(I)).str(); + + VarDecl *VD; + if (NeedsNewVD) { + VD = buildVarDecl(SemaRef, SourceLocation(), IVType, Name); + SemaRef.AddInitializerToDecl(VD, TransformedExpr, false); + + } else { + // Create a unique variable name + DeclRefExpr *DRE = cast(TransformedExpr); + VD = cast(DRE->getDecl()); + VD->setDeclName(&SemaRef.PP.getIdentifierTable().get(Name)); + } + // Create the corresponding declaration statement + StmtResult DeclStmt = new (Context) class DeclStmt( + DeclGroupRef(VD), SourceLocation(), SourceLocation()); + return std::make_pair(VD, DeclStmt); + }; + + // Process each single loop to generate and collect declarations + // and statements for all helper expressions + for (unsigned int I = 0; I < LoopSeqSize; ++I) { + addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], + PreInits); + + auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", I); + auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", I); + auto [STVD, STDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", I); + auto [NIVD, NIDStmt] = + CreateHelperVarAndStmt(LoopHelpers[I].NumIterations, "ni", I, true); + auto [IVVD, IVDStmt] = + CreateHelperVarAndStmt(LoopHelpers[I].IterationVarRef, "iv", I); + + if (!LBVD || !STVD || !NIVD || !IVVD) + return StmtError(); + + UBVarDecls.push_back(UBVD); + LBVarDecls.push_back(LBVD); + STVarDecls.push_back(STVD); + NIVarDecls.push_back(NIVD); + IVVarDecls.push_back(IVVD); + + PreInits.push_back(UBDStmt.get()); + PreInits.push_back(LBDStmt.get()); + PreInits.push_back(STDStmt.get()); + 
PreInits.push_back(NIDStmt.get()); + PreInits.push_back(IVDStmt.get()); + } + + auto MakeVarDeclRef = [&SemaRef = this->SemaRef](VarDecl *VD) { + return buildDeclRefExpr(SemaRef, VD, VD->getType(), VD->getLocation(), + false); + }; + + // Following up the creation of the final fused loop will be performed + // which has the following shape (considering the selected loops): + // + // for (fuse.index = 0; fuse.index < max(ni0, ni1..., nik); ++fuse.index) { + // if (fuse.index < ni0){ + // iv0 = lb0 + st0 * fuse.index; + // original.index0 = iv0 + // body(0); + // } + // if (fuse.index < ni1){ + // iv1 = lb1 + st1 * fuse.index; + // original.index1 = iv1 + // body(1); + // } + // + // ... + // + // if (fuse.index < nik){ + // ivk = lbk + stk * fuse.index; + // original.indexk = ivk + // body(k); Expr *InitVal = IntegerLiteral::Create(Context, + // llvm::APInt(IVWidth, 0), + + // } + + // 1. Create the initialized fuse index + const std::string IndexName = Twine(".omp.fuse.index").str(); + Expr *InitVal = IntegerLiteral::Create(Context, llvm::APInt(IVBitWidth, 0), + IVType, SourceLocation()); + VarDecl *IndexDecl = + buildVarDecl(SemaRef, {}, IVType, IndexName, nullptr, nullptr); + SemaRef.AddInitializerToDecl(IndexDecl, InitVal, false); + StmtResult InitStmt = new (Context) + DeclStmt(DeclGroupRef(IndexDecl), SourceLocation(), SourceLocation()); + + if (!InitStmt.isUsable()) + return StmtError(); + + auto MakeIVRef = [&SemaRef = this->SemaRef, IndexDecl, IVType, + Loc = InitVal->getExprLoc()]() { + return buildDeclRefExpr(SemaRef, IndexDecl, IVType, Loc, false); + }; + + // 2. Iteratively compute the max number of logical iterations Max(NI_1, NI_2, + // ..., NI_k) + // + // This loop accumulates the maximum value across multiple expressions, + // ensuring each step constructs a unique AST node for correctness. 
By using + // intermediate temporary variables and conditional operators, we maintain + // distinct nodes and avoid duplicating subtrees. For instance, max(a,b,c): + // omp.temp0 = max(a, b) + // omp.temp1 = max(omp.temp0, c) + // omp.fuse.max = max(omp.temp1, omp.temp0) + + ExprResult MaxExpr; + for (unsigned I = 0; I < LoopSeqSize; ++I) { + DeclRefExpr *NIRef = MakeVarDeclRef(NIVarDecls[I]); + QualType NITy = NIRef->getType(); + + if (MaxExpr.isUnset()) { + // Initialize MaxExpr with the first NI expression + MaxExpr = NIRef; + } else { + // Create a new accumulator variable t_i = MaxExpr + std::string TempName = (Twine(".omp.temp.") + Twine(I)).str(); + VarDecl *TempDecl = + buildVarDecl(SemaRef, {}, NITy, TempName, nullptr, nullptr); + TempDecl->setInit(MaxExpr.get()); + DeclRefExpr *TempRef = + buildDeclRefExpr(SemaRef, TempDecl, NITy, SourceLocation(), false); + DeclRefExpr *TempRef2 = + buildDeclRefExpr(SemaRef, TempDecl, NITy, SourceLocation(), false); + // Add a DeclStmt to PreInits to ensure the variable is declared. + StmtResult TempStmt = new (Context) + DeclStmt(DeclGroupRef(TempDecl), SourceLocation(), SourceLocation()); + + if (!TempStmt.isUsable()) + return StmtError(); + PreInits.push_back(TempStmt.get()); + + // Build MaxExpr <-(MaxExpr > NIRef ? MaxExpr : NIRef) + ExprResult Comparison = + SemaRef.BuildBinOp(nullptr, SourceLocation(), BO_GT, TempRef, NIRef); + // Handle any errors in Comparison creation + if (!Comparison.isUsable()) + return StmtError(); + + DeclRefExpr *NIRef2 = MakeVarDeclRef(NIVarDecls[I]); + // Update MaxExpr using a conditional expression to hold the max value + MaxExpr = new (Context) ConditionalOperator( + Comparison.get(), SourceLocation(), TempRef2, SourceLocation(), + NIRef2->getExprStmt(), NITy, VK_LValue, OK_Ordinary); + + if (!MaxExpr.isUsable()) + return StmtError(); + } + } + if (!MaxExpr.isUsable()) + return StmtError(); + + // 3.
Declare the max variable + const std::string MaxName = Twine(".omp.fuse.max").str(); + VarDecl *MaxDecl = + buildVarDecl(SemaRef, {}, IVType, MaxName, nullptr, nullptr); + MaxDecl->setInit(MaxExpr.get()); + DeclRefExpr *MaxRef = buildDeclRefExpr(SemaRef, MaxDecl, IVType, {}, false); + StmtResult MaxStmt = new (Context) + DeclStmt(DeclGroupRef(MaxDecl), SourceLocation(), SourceLocation()); + + if (MaxStmt.isInvalid()) + return StmtError(); + PreInits.push_back(MaxStmt.get()); + + // 4. Create condition Expr: index < n_max + ExprResult CondExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_LT, + MakeIVRef(), MaxRef); + if (!CondExpr.isUsable()) + return StmtError(); + // 5. Increment Expr: ++index + ExprResult IncrExpr = + SemaRef.BuildUnaryOp(CurScope, SourceLocation(), UO_PreInc, MakeIVRef()); + if (!IncrExpr.isUsable()) + return StmtError(); + + // 6. Build the Fused Loop Body + // The final fused loop iterates over the maximum logical range. Inside the + // loop, each original loop's index is calculated dynamically, and its body + // is executed conditionally. 
+ // + // Each sub-loop's body is guarded by a conditional statement to ensure + // it executes only within its logical iteration range: + // + // if (fuse.index < ni_k){ + // iv_k = lb_k + st_k * fuse.index; + // original.index = iv_k + // body(k); + // } + + CompoundStmt *FusedBody = nullptr; + SmallVector FusedBodyStmts; + for (unsigned I = 0; I < LoopSeqSize; ++I) { + + // Assingment of the original sub-loop index to compute the logical index + // IV_k = LB_k + omp.fuse.index * ST_k + + ExprResult IdxExpr = + SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Mul, + MakeVarDeclRef(STVarDecls[I]), MakeIVRef()); + if (!IdxExpr.isUsable()) + return StmtError(); + IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Add, + MakeVarDeclRef(LBVarDecls[I]), IdxExpr.get()); + + if (!IdxExpr.isUsable()) + return StmtError(); + IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Assign, + MakeVarDeclRef(IVVarDecls[I]), IdxExpr.get()); + if (!IdxExpr.isUsable()) + return StmtError(); + + // Update the original i_k = IV_k + SmallVector BodyStmts; + BodyStmts.push_back(IdxExpr.get()); + llvm::append_range(BodyStmts, LoopHelpers[I].Updates); + + if (auto *SourceCXXFor = dyn_cast(LoopStmts[I])) + BodyStmts.push_back(SourceCXXFor->getLoopVarStmt()); + + Stmt *Body = (isa(LoopStmts[I])) + ? 
cast(LoopStmts[I])->getBody() + : cast(LoopStmts[I])->getBody(); + + BodyStmts.push_back(Body); + + CompoundStmt *CombinedBody = + CompoundStmt::Create(Context, BodyStmts, FPOptionsOverride(), + SourceLocation(), SourceLocation()); + ExprResult Condition = + SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_LT, MakeIVRef(), + MakeVarDeclRef(NIVarDecls[I])); + + if (!Condition.isUsable()) + return StmtError(); + + IfStmt *IfStatement = IfStmt::Create( + Context, SourceLocation(), IfStatementKind::Ordinary, nullptr, nullptr, + Condition.get(), SourceLocation(), SourceLocation(), CombinedBody, + SourceLocation(), nullptr); + + FusedBodyStmts.push_back(IfStatement); + } + FusedBody = CompoundStmt::Create(Context, FusedBodyStmts, FPOptionsOverride(), + SourceLocation(), SourceLocation()); + + // 7. Construct the final fused loop + ForStmt *FusedForStmt = new (Context) + ForStmt(Context, InitStmt.get(), CondExpr.get(), nullptr, IncrExpr.get(), + FusedBody, InitStmt.get()->getBeginLoc(), SourceLocation(), + IncrExpr.get()->getEndLoc()); + + return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, NumLoops, + 1, AStmt, FusedForStmt, + buildPreInits(Context, PreInits)); +} + OMPClause *SemaOpenMP::ActOnOpenMPSingleExprClause(OpenMPClauseKind Kind, Expr *Expr, SourceLocation StartLoc, diff --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h index 335e21d927b76..034b0c8243667 100644 --- a/clang/lib/Sema/TreeTransform.h +++ b/clang/lib/Sema/TreeTransform.h @@ -9666,6 +9666,17 @@ StmtResult TreeTransform<Derived>::TransformOMPInterchangeDirective( return Res; } +template <typename Derived> +StmtResult +TreeTransform<Derived>::TransformOMPFuseDirective(OMPFuseDirective *D) { + DeclarationNameInfo DirName; + getDerived().getSema().OpenMP().StartOpenMPDSABlock( + D->getDirectiveKind(), DirName, nullptr, D->getBeginLoc()); + StmtResult Res = getDerived().TransformOMPExecutableDirective(D); + getDerived().getSema().OpenMP().EndOpenMPDSABlock(Res.get()); + return Res; +} + template <typename Derived>
StmtResult TreeTransform<Derived>::TransformOMPForDirective(OMPForDirective *D) { diff --git a/clang/lib/Serialization/ASTReaderStmt.cpp b/clang/lib/Serialization/ASTReaderStmt.cpp index 0ba0378754eb4..6762d11d6b73e 100644 --- a/clang/lib/Serialization/ASTReaderStmt.cpp +++ b/clang/lib/Serialization/ASTReaderStmt.cpp @@ -2449,6 +2449,7 @@ void ASTStmtReader::VisitOMPLoopTransformationDirective( OMPLoopTransformationDirective *D) { VisitOMPLoopBasedDirective(D); D->setNumGeneratedLoops(Record.readUInt32()); + D->setNumGeneratedLoopNests(Record.readUInt32()); } void ASTStmtReader::VisitOMPTileDirective(OMPTileDirective *D) { @@ -2471,6 +2472,10 @@ void ASTStmtReader::VisitOMPInterchangeDirective(OMPInterchangeDirective *D) { VisitOMPLoopTransformationDirective(D); } +void ASTStmtReader::VisitOMPFuseDirective(OMPFuseDirective *D) { + VisitOMPLoopTransformationDirective(D); +} + void ASTStmtReader::VisitOMPForDirective(OMPForDirective *D) { VisitOMPLoopDirective(D); D->setHasCancel(Record.readBool()); @@ -3613,6 +3618,12 @@ Stmt *ASTReader::ReadStmtFromStream(ModuleFile &F) { S = OMPReverseDirective::CreateEmpty(Context); break; } + case STMT_OMP_FUSE_DIRECTIVE: { + unsigned NumLoops = Record[ASTStmtReader::NumStmtFields]; + unsigned NumClauses = Record[ASTStmtReader::NumStmtFields + 1]; + S = OMPFuseDirective::CreateEmpty(Context, NumClauses, NumLoops); + break; + } case STMT_OMP_INTERCHANGE_DIRECTIVE: { unsigned NumLoops = Record[ASTStmtReader::NumStmtFields]; diff --git a/clang/lib/Serialization/ASTWriterStmt.cpp b/clang/lib/Serialization/ASTWriterStmt.cpp index b9eabd5ddb64c..8b909d5c93686 100644 --- a/clang/lib/Serialization/ASTWriterStmt.cpp +++ b/clang/lib/Serialization/ASTWriterStmt.cpp @@ -2454,6 +2454,7 @@ void ASTStmtWriter::VisitOMPLoopTransformationDirective( OMPLoopTransformationDirective *D) { VisitOMPLoopBasedDirective(D); Record.writeUInt32(D->getNumGeneratedLoops()); + Record.writeUInt32(D->getNumGeneratedLoopNests()); } void
ASTStmtWriter::VisitOMPTileDirective(OMPTileDirective *D) { @@ -2481,6 +2482,11 @@ void ASTStmtWriter::VisitOMPInterchangeDirective(OMPInterchangeDirective *D) { Code = serialization::STMT_OMP_INTERCHANGE_DIRECTIVE; } +void ASTStmtWriter::VisitOMPFuseDirective(OMPFuseDirective *D) { + VisitOMPLoopTransformationDirective(D); + Code = serialization::STMT_OMP_FUSE_DIRECTIVE; +} + void ASTStmtWriter::VisitOMPForDirective(OMPForDirective *D) { VisitOMPLoopDirective(D); Record.writeBool(D->hasCancel()); diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp index 1afd4b52eb354..036945b2d1700 100644 --- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp +++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp @@ -1817,6 +1817,7 @@ void ExprEngine::Visit(const Stmt *S, ExplodedNode *Pred, case Stmt::OMPStripeDirectiveClass: case Stmt::OMPTileDirectiveClass: case Stmt::OMPInterchangeDirectiveClass: + case Stmt::OMPFuseDirectiveClass: case Stmt::OMPInteropDirectiveClass: case Stmt::OMPDispatchDirectiveClass: case Stmt::OMPMaskedDirectiveClass: diff --git a/clang/test/OpenMP/fuse_ast_print.cpp b/clang/test/OpenMP/fuse_ast_print.cpp new file mode 100644 index 0000000000000..43ce815dab024 --- /dev/null +++ b/clang/test/OpenMP/fuse_ast_print.cpp @@ -0,0 +1,278 @@ +// Check no warnings/errors +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -fsyntax-only -verify %s +// expected-no-diagnostics + +// Check AST and unparsing +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -ast-dump %s | FileCheck %s --check-prefix=DUMP +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -ast-print %s | FileCheck %s --check-prefix=PRINT + +// Check same results after serialization round-trip +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -emit-pch -o %t %s +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu 
-fopenmp -std=c++20 -fopenmp-version=60 -include-pch %t -ast-dump-all %s | FileCheck %s --check-prefix=DUMP +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fopenmp -std=c++20 -fopenmp-version=60 -include-pch %t -ast-print %s | FileCheck %s --check-prefix=PRINT + +#ifndef HEADER +#define HEADER + +// placeholder for loop body code +extern "C" void body(...); + +// PRINT-LABEL: void foo1( +// DUMP-LABEL: FunctionDecl {{.*}} foo1 +void foo1() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + + } + +} + +// PRINT-LABEL: void foo2( +// DUMP-LABEL: FunctionDecl {{.*}} foo2 +void foo2() { + // PRINT: #pragma omp unroll partial(4) + // DUMP: OMPUnrollDirective + // DUMP-NEXT: OMPPartialClause + // DUMP-NEXT: ConstantExpr + // DUMP-NEXT: value: Int 4 + // DUMP-NEXT: IntegerLiteral {{.*}} 4 + #pragma omp unroll partial(4) + // PRINT: #pragma omp fuse + // DUMP-NEXT: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + } + +} + +//PRINT-LABEL: void foo3( +//DUMP-LABEL: FunctionTemplateDecl {{.*}} foo3 +template +void foo3() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: #pragma omp 
unroll partial(Factor1) + // DUMP: OMPUnrollDirective + #pragma omp unroll partial(Factor1) + // PRINT: for (int i = 0; i < 12; i += 1) + // DUMP: ForStmt + for (int i = 0; i < 12; i += 1) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: #pragma omp unroll partial(Factor2) + // DUMP: OMPUnrollDirective + #pragma omp unroll partial(Factor2) + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + + } +} + +// Also test instantiating the template. +void tfoo3() { + foo3<4,2>(); +} + +//PRINT-LABEL: void foo4( +//DUMP-LABEL: FunctionTemplateDecl {{.*}} foo4 +template +void foo4(int start, int end) { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (T i = start; i < end; i += Step) + // DUMP: ForStmt + for (T i = start; i < end; i += Step) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + + // PRINT: for (T j = end; j > start; j -= Step) + // DUMP: ForStmt + for (T j = end; j > start; j -= Step) { + // PRINT: body(j) + // DUMP: CallExpr + body(j); + } + + } +} + +// Also test instantiating the template. 
+void tfoo4() { + foo4(0, 64); +} + + + +// PRINT-LABEL: void foo5( +// DUMP-LABEL: FunctionDecl {{.*}} foo5 +void foo5() { + double arr[128], arr2[128]; + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT-NEXT: for (auto &&a : arr) + // DUMP-NEXT: CXXForRangeStmt + for (auto &&a: arr) + // PRINT: body(a) + // DUMP: CallExpr + body(a); + // PRINT: for (double v = 42; auto &&b : arr) + // DUMP: CXXForRangeStmt + for (double v = 42; auto &&b: arr) + // PRINT: body(b, v); + // DUMP: CallExpr + body(b, v); + // PRINT: for (auto &&c : arr2) + // DUMP: CXXForRangeStmt + for (auto &&c: arr2) + // PRINT: body(c) + // DUMP: CallExpr + body(c); + + } + +} + +// PRINT-LABEL: void foo6( +// DUMP-LABEL: FunctionDecl {{.*}} foo6 +void foo6() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i <= 10; ++i) + // DUMP: ForStmt + for (int i = 0; i <= 10; ++i) + body(i); + // PRINT: for (int j = 0; j < 100; ++j) + // DUMP: ForStmt + for(int j = 0; j < 100; ++j) + body(j); + } + // PRINT: #pragma omp unroll partial(4) + // DUMP: OMPUnrollDirective + #pragma omp unroll partial(4) + // PRINT: for (int k = 0; k < 250; ++k) + // DUMP: ForStmt + for (int k = 0; k < 250; ++k) + body(k); + } +} + +// PRINT-LABEL: void foo7( +// DUMP-LABEL: FunctionDecl {{.*}} foo7 +void foo7() { + // PRINT: #pragma omp fuse + // DUMP: OMPFuseDirective + #pragma omp fuse + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 
0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + } + } + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + } + } + } + } + +} + + + + + +#endif \ No newline at end of file diff --git a/clang/test/OpenMP/fuse_codegen.cpp b/clang/test/OpenMP/fuse_codegen.cpp new file mode 100644 index 0000000000000..6c1e21092da43 --- /dev/null +++ b/clang/test/OpenMP/fuse_codegen.cpp @@ -0,0 +1,1511 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --include-generated-funcs --replace-value-regex "pl_cond[.].+[.|,]" --prefix-filecheck-ir-name _ --version 5 +// expected-no-diagnostics + +// Check code generation +// RUN: %clang_cc1 -verify -triple x86_64-pc-linux-gnu -std=c++20 -fclang-abi-compat=latest -fopenmp -fopenmp-version=60 -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK1 + +// Check same results after serialization round-trip +// RUN: %clang_cc1 -verify -triple x86_64-pc-linux-gnu -std=c++20 -fclang-abi-compat=latest -fopenmp -fopenmp-version=60 -emit-pch -o %t %s +// RUN: %clang_cc1 -verify -triple x86_64-pc-linux-gnu -std=c++20 -fclang-abi-compat=latest -fopenmp -fopenmp-version=60 -include-pch %t -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK2 + +#ifndef HEADER +#define HEADER + +//placeholder for loop body code. +extern "C" void body(...) 
{} + +extern "C" void foo1(int start1, int end1, int step1, int start2, int end2, int step2) { + int i,j; + #pragma omp fuse + { + for(i = start1; i < end1; i += step1) body(i); + for(j = start2; j < end2; j += step2) body(j); + } + +} + +template <typename T> +void foo2(T start, T end, T step){ + T i,j,k; + #pragma omp fuse + { + for(i = start; i < end; i += step) body(i); + for(j = end; j > start; j -= step) body(j); + for(k = start+step; k < end+step; k += step) body(k); + } +} + +extern "C" void tfoo2() { + foo2<int>(0, 64, 4); +} + +extern "C" void foo3() { + double arr[256]; + #pragma omp fuse + { + #pragma omp fuse + { + for(int i = 0; i < 128; ++i) body(i); + for(int j = 0; j < 256; j+=2) body(j); + } + for(int c = 42; auto &&v: arr) body(c,v); + for(int cc = 37; auto &&vv: arr) body(cc, vv); + } +} + + +#endif +// CHECK1-LABEL: define dso_local void @body( +// CHECK1-SAME: ...) #[[ATTR0:[0-9]+]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define dso_local void @foo1( +// CHECK1-SAME: i32 noundef [[START1:%.*]], i32 noundef [[END1:%.*]], i32 noundef [[STEP1:%.*]], i32 noundef [[START2:%.*]], i32 noundef [[END2:%.*]], i32 noundef [[STEP2:%.*]]) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[START1_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[END1_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[STEP1_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[START2_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[END2_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[STEP2_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// 
CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: store i32 [[START1]], ptr [[START1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[END1]], ptr [[END1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[STEP1]], ptr [[STEP1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[START2]], ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[END2]], ptr [[END2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[STEP2]], ptr [[STEP2_ADDR]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[END1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP1_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: 
[[TMP5:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK1-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add i32 [[SUB3]], [[TMP6]] +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK1-NEXT: [[TMP17:%.*]] = 
load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK1-NEXT: br i1 [[CMP16]], label 
%[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: br i1 [[CMP17]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP30]], [[TMP31]] +// CHECK1-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] +// CHECK1-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK1-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP35]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK1-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] +// CHECK1: [[IF_THEN22]]: +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] +// CHECK1-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK1-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK1-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK1-NEXT: br label %[[IF_END27]] +// CHECK1: [[IF_END27]]: +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define dso_local void @tfoo2( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: call void @_Z4foo2IiEvT_S0_S0_(i32 noundef 0, i32 noundef 64, i32 noundef 4) +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define linkonce_odr void @_Z4foo2IiEvT_S0_S0_( +// CHECK1-SAME: i32 noundef [[START:%.*]], i32 noundef [[END:%.*]], i32 noundef [[STEP:%.*]]) #[[ATTR0]] comdat { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[START_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[END_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[STEP_ADDR:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// 
CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_17:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTNEW_STEP21:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: store i32 [[START]], ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[END]], ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[STEP]], ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP5:%.*]] = load 
i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK1-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add i32 [[SUB3]], [[TMP6]] +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr 
[[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] +// CHECK1-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK1-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// 
CHECK1-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK1-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 +// CHECK1-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK1-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]] +// CHECK1-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], label %[[COND_FALSE31:.*]] +// 
CHECK1: [[COND_TRUE30]]: +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: br label %[[COND_END32:.*]] +// CHECK1: [[COND_FALSE31]]: +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: br label %[[COND_END32]] +// CHECK1: [[COND_END32]]: +// CHECK1-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ] +// CHECK1-NEXT: store i32 [[COND33]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]] +// CHECK1-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]] +// CHECK1-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]] +// CHECK1-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]] +// CHECK1-NEXT: [[ADD38:%.*]] = add 
i32 [[TMP49]], [[MUL37]] +// CHECK1-NEXT: store i32 [[ADD38]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP52]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]] +// CHECK1-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]] +// CHECK1: [[IF_THEN40]]: +// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]] +// CHECK1-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]] +// CHECK1-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP58:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]] +// CHECK1-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]] +// CHECK1-NEXT: store i32 [[SUB44]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP61]]) +// CHECK1-NEXT: br label %[[IF_END45]] +// CHECK1: [[IF_END45]]: +// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]] +// CHECK1-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK1: [[IF_THEN47]]: +// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 +// CHECK1-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 +// CHECK1-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]] +// CHECK1-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]] +// CHECK1-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4 +// CHECK1-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 +// CHECK1-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]] +// CHECK1-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]] +// CHECK1-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP70]]) +// CHECK1-NEXT: br label %[[IF_END52]] +// CHECK1: [[IF_END52]]: +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: ret void +// +// +// CHECK1-LABEL: define dso_local void @foo3( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[C:%.*]] = alloca i32, 
align 4 +// CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END224:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: store i32 128, ptr 
[[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK1-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK1-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4 +// 
CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1 +// CHECK1-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8 +// CHECK1-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK1-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK1-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64 +// CHECK1-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK1-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK1-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK1-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1 +// CHECK1-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1 +// CHECK1-NEXT: 
[[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1 +// CHECK1-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK1-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK1-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8 +// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8 +// CHECK1-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK1-NEXT: [[ADD21:%.*]] = add nsw i64 [[TMP16]], 1 +// CHECK1-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8 +// CHECK1-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8 +// CHECK1-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8 +// CHECK1-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK1-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK1-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr 
[[TMP22]] to i64 +// CHECK1-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]] +// CHECK1-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8 +// CHECK1-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1 +// CHECK1-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1 +// CHECK1-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1 +// CHECK1-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1 +// CHECK1-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK1-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8 +// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8 +// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK1-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1 +// CHECK1-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK1-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK1-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]] +// CHECK1-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]] +// CHECK1: [[COND_TRUE44]]: +// CHECK1-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK1-NEXT: br label %[[COND_END46:.*]] +// CHECK1: [[COND_FALSE45]]: +// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: br label %[[COND_END46]] +// CHECK1: [[COND_END46]]: +// CHECK1-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ] +// CHECK1-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK1-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// 
CHECK1-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]] +// CHECK1-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]] +// CHECK1: [[COND_TRUE50]]: +// CHECK1-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK1-NEXT: br label %[[COND_END52:.*]] +// CHECK1: [[COND_FALSE51]]: +// CHECK1-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: br label %[[COND_END52]] +// CHECK1: [[COND_END52]]: +// CHECK1-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ] +// CHECK1-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK1-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK1-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]] +// CHECK1-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4 +// CHECK1-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4 +// CHECK1-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64 +// CHECK1-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]] +// CHECK1-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]] +// 
CHECK1-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32 +// CHECK1-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4 +// CHECK1-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1 +// CHECK1-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]] +// CHECK1-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN64]]: +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]] +// CHECK1-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]] +// CHECK1-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1 +// CHECK1-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]] +// CHECK1-NEXT: store i32 [[ADD68]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP48]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]] +// CHECK1-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]] +// CHECK1: [[IF_THEN70]]: +// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]] +// CHECK1-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]] +// CHECK1-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2 +// CHECK1-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]] +// CHECK1-NEXT: store i32 [[ADD74]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP55]]) +// CHECK1-NEXT: br label %[[IF_END75]] +// CHECK1: [[IF_END75]]: +// CHECK1-NEXT: br label %[[IF_END76]] +// CHECK1: [[IF_END76]]: +// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]] +// CHECK1-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]] +// CHECK1: [[IF_THEN78]]: +// CHECK1-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8 +// CHECK1-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8 +// CHECK1-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]] +// CHECK1-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]] +// CHECK1-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8 +// CHECK1-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK1-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8 +// CHECK1-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1 +// CHECK1-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]] +// CHECK1-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP63]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP64]], double noundef [[TMP66]]) +// CHECK1-NEXT: br label %[[IF_END83]] +// CHECK1: [[IF_END83]]: +// CHECK1-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]] +// CHECK1-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]] +// CHECK1: [[IF_THEN85]]: +// CHECK1-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK1-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK1-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]] +// CHECK1-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]] +// CHECK1-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1 +// CHECK1-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]] +// CHECK1-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8 +// CHECK1-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8 +// CHECK1-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK1-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP75]], double noundef [[TMP77]]) +// CHECK1-NEXT: br label %[[IF_END90]] +// CHECK1: [[IF_END90]]: +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1 +// CHECK1-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @body( +// CHECK2-SAME: ...) #[[ATTR0:[0-9]+]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @foo1( +// CHECK2-SAME: i32 noundef [[START1:%.*]], i32 noundef [[END1:%.*]], i32 noundef [[STEP1:%.*]], i32 noundef [[START2:%.*]], i32 noundef [[END2:%.*]], i32 noundef [[STEP2:%.*]]) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[START1_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[END1_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[STEP1_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[START2_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[END2_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[STEP2_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, 
align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: store i32 [[START1]], ptr [[START1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[END1]], ptr [[END1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[STEP1]], ptr [[STEP1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[START2]], ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[END2]], ptr [[END2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[STEP2]], ptr [[STEP2_ADDR]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[START1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[END1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP1_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK2-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add i32 
[[SUB3]], [[TMP6]] +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr 
[[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK2-NEXT: br i1 [[CMP16]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], 
[[TMP28]] +// CHECK2-NEXT: br i1 [[CMP17]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP30]], [[TMP31]] +// CHECK2-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] +// CHECK2-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK2-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP35]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK2-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] +// CHECK2: [[IF_THEN22]]: +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] +// CHECK2-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK2-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK2-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK2-NEXT: br label %[[IF_END27]] +// CHECK2: [[IF_END27]]: +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @foo3( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[C:%.*]] = alloca i32, 
align 4 +// CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END224:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: store i32 128, ptr 
[[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK2-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK2-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4 +// 
CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1 +// CHECK2-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK2-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK2-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK2-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1 +// CHECK2-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1 +// CHECK2-NEXT: 
[[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1 +// CHECK2-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK2-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK2-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8 +// CHECK2-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 +// CHECK2-NEXT: [[ADD21:%.*]] = add nsw i64 [[TMP16]], 1 +// CHECK2-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8 +// CHECK2-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8 +// CHECK2-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8 +// CHECK2-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK2-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8 +// CHECK2-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr 
[[TMP22]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]] +// CHECK2-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8 +// CHECK2-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1 +// CHECK2-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1 +// CHECK2-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1 +// CHECK2-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1 +// CHECK2-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK2-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8 +// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 +// CHECK2-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1 +// CHECK2-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK2-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]] +// CHECK2-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]] +// CHECK2: [[COND_TRUE44]]: +// CHECK2-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 +// CHECK2-NEXT: br label %[[COND_END46:.*]] +// CHECK2: [[COND_FALSE45]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: br label %[[COND_END46]] +// CHECK2: [[COND_END46]]: +// CHECK2-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ] +// CHECK2-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// 
CHECK2-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]] +// CHECK2-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]] +// CHECK2: [[COND_TRUE50]]: +// CHECK2-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: br label %[[COND_END52:.*]] +// CHECK2: [[COND_FALSE51]]: +// CHECK2-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: br label %[[COND_END52]] +// CHECK2: [[COND_END52]]: +// CHECK2-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ] +// CHECK2-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8 +// CHECK2-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]] +// CHECK2-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4 +// CHECK2-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4 +// CHECK2-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64 +// CHECK2-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]] +// CHECK2-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]] +// 
CHECK2-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32 +// CHECK2-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4 +// CHECK2-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1 +// CHECK2-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]] +// CHECK2-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN64]]: +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]] +// CHECK2-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]] +// CHECK2-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1 +// CHECK2-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]] +// CHECK2-NEXT: store i32 [[ADD68]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP48]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]] +// CHECK2-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]] +// CHECK2: [[IF_THEN70]]: +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]] +// CHECK2-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]] +// CHECK2-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2 +// CHECK2-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]] +// CHECK2-NEXT: store i32 [[ADD74]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP55]]) +// CHECK2-NEXT: br label %[[IF_END75]] +// CHECK2: [[IF_END75]]: +// CHECK2-NEXT: br label %[[IF_END76]] +// CHECK2: [[IF_END76]]: +// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]] +// CHECK2: [[IF_THEN78]]: +// CHECK2-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8 +// CHECK2-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8 +// CHECK2-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]] +// CHECK2-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]] +// CHECK2-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8 +// CHECK2-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 +// CHECK2-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8 +// CHECK2-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1 +// CHECK2-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]] +// CHECK2-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP63]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP64]], double noundef [[TMP66]]) +// CHECK2-NEXT: br label %[[IF_END83]] +// CHECK2: [[IF_END83]]: +// CHECK2-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]] +// CHECK2-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]] +// CHECK2: [[IF_THEN85]]: +// CHECK2-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK2-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK2-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]] +// CHECK2-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]] +// CHECK2-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1 +// CHECK2-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]] +// CHECK2-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8 +// CHECK2-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8 +// CHECK2-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK2-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP75]], double noundef [[TMP77]]) +// CHECK2-NEXT: br label %[[IF_END90]] +// CHECK2: [[IF_END90]]: +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1 +// CHECK2-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define dso_local void @tfoo2( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: call void @_Z4foo2IiEvT_S0_S0_(i32 noundef 0, i32 noundef 64, i32 noundef 4) +// CHECK2-NEXT: ret void +// +// +// CHECK2-LABEL: define linkonce_odr void @_Z4foo2IiEvT_S0_S0_( +// CHECK2-SAME: i32 noundef [[START:%.*]], i32 noundef [[END:%.*]], i32 noundef [[STEP:%.*]]) #[[ATTR0]] comdat { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[START_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[END_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[STEP_ADDR:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_6:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: 
[[DOTNEW_STEP8:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_17:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTNEW_STEP21:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: store i32 [[START]], ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[END]], ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[STEP]], ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP1]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP3]], ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], 
align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub i32 [[TMP4]], [[TMP5]] +// CHECK2-NEXT: [[SUB3:%.*]] = sub i32 [[SUB]], 1 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add i32 [[SUB3]], [[TMP6]] +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] +// CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] +// CHECK2-NEXT: 
[[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 +// CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] +// CHECK2-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK2-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr 
[[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK2-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 +// CHECK2-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK2-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]] +// CHECK2-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], 
label %[[COND_FALSE31:.*]] +// CHECK2: [[COND_TRUE30]]: +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: br label %[[COND_END32:.*]] +// CHECK2: [[COND_FALSE31]]: +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: br label %[[COND_END32]] +// CHECK2: [[COND_END32]]: +// CHECK2-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ] +// CHECK2-NEXT: store i32 [[COND33]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]] +// CHECK2-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]] +// CHECK2-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]] +// CHECK2-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]] +// 
CHECK2-NEXT: [[ADD38:%.*]] = add i32 [[TMP49]], [[MUL37]] +// CHECK2-NEXT: store i32 [[ADD38]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP52]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]] +// CHECK2: [[IF_THEN40]]: +// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]] +// CHECK2-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP58:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]] +// CHECK2-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]] +// CHECK2-NEXT: store i32 [[SUB44]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP61]]) +// CHECK2-NEXT: br label %[[IF_END45]] +// CHECK2: [[IF_END45]]: +// CHECK2-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]] +// CHECK2-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK2: [[IF_THEN47]]: +// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 +// CHECK2-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 +// CHECK2-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]] +// CHECK2-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]] +// CHECK2-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4 +// CHECK2-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 +// CHECK2-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]] +// CHECK2-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]] +// CHECK2-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP70]]) +// CHECK2-NEXT: br label %[[IF_END52]] +// CHECK2: [[IF_END52]]: +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: ret void +// +//. 
+// CHECK1: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]} +// CHECK1: [[META4]] = !{!"llvm.loop.mustprogress"} +// CHECK1: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} +// CHECK1: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +//. +// CHECK2: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]} +// CHECK2: [[META4]] = !{!"llvm.loop.mustprogress"} +// CHECK2: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} +// CHECK2: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +//. diff --git a/clang/test/OpenMP/fuse_messages.cpp b/clang/test/OpenMP/fuse_messages.cpp new file mode 100644 index 0000000000000..50dedfd2c0dc6 --- /dev/null +++ b/clang/test/OpenMP/fuse_messages.cpp @@ -0,0 +1,76 @@ +// RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -std=c++20 -fopenmp -fopenmp-version=60 -fsyntax-only -Wuninitialized -verify %s + +void func() { + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a loop sequence containing canonical loops or loop-generating constructs}} + #pragma omp fuse + ; + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + {int bar = 0;} + + // expected-error@+4 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + { + for(int i = 0; i < 10; ++i); + int x = 2; + } + + // expected-error@+2 {{statement after '#pragma omp fuse' must be a loop sequence containing canonical loops or loop-generating constructs}} + #pragma omp fuse + #pragma omp for + for (int i = 0; i < 7; ++i) + ; + + { + // expected-error@+2 {{expected statement}} + #pragma omp fuse + } + + // expected-warning@+1 {{extra tokens at the end of '#pragma omp fuse' are ignored}} + #pragma omp fuse foo + { + for (int i = 0; i < 7; ++i) + ; + } + + + // expected-error@+1 {{unexpected OpenMP clause 'final' in directive '#pragma omp fuse'}} + #pragma omp fuse final(0) + { + for (int i = 0; i < 7; ++i) + ; + } + + //expected-error@+4 {{loop after '#pragma omp fuse' is not in canonical form}} +
//expected-error@+3 {{increment clause of OpenMP for loop must perform simple addition or subtraction on loop variable 'i'}} + #pragma omp fuse + { + for(int i = 0; i < 10; i*=2) { + ; + } + } + + //expected-error@+2 {{loop sequence after '#pragma omp fuse' must contain at least 1 canonical loop or loop-generating construct}} + #pragma omp fuse + {} + + //expected-error@+3 {{statement after '#pragma omp fuse' must be a for loop}} + #pragma omp fuse + { + #pragma omp unroll full + for(int i = 0; i < 10; ++i); + + for(int j = 0; j < 10; ++j); + } + + //expected-warning@+5 {{loop sequence following '#pragma omp fuse' contains induction variables of differing types: 'int' and 'unsigned int'}} + //expected-warning@+5 {{loop sequence following '#pragma omp fuse' contains induction variables of differing types: 'int' and 'long long'}} + #pragma omp fuse + { + for(int i = 0; i < 10; ++i); + for(unsigned int j = 0; j < 10; ++j); + for(long long k = 0; k < 100; ++k); + } +} \ No newline at end of file diff --git a/clang/tools/libclang/CIndex.cpp b/clang/tools/libclang/CIndex.cpp index 06a17006fdee9..fd788ac3d69d4 100644 --- a/clang/tools/libclang/CIndex.cpp +++ b/clang/tools/libclang/CIndex.cpp @@ -2206,6 +2206,7 @@ class EnqueueVisitor : public ConstStmtVisitor, void VisitOMPUnrollDirective(const OMPUnrollDirective *D); void VisitOMPReverseDirective(const OMPReverseDirective *D); void VisitOMPInterchangeDirective(const OMPInterchangeDirective *D); + void VisitOMPFuseDirective(const OMPFuseDirective *D); void VisitOMPForDirective(const OMPForDirective *D); void VisitOMPForSimdDirective(const OMPForSimdDirective *D); void VisitOMPSectionsDirective(const OMPSectionsDirective *D); @@ -3364,6 +3365,10 @@ void EnqueueVisitor::VisitOMPInterchangeDirective( VisitOMPLoopTransformationDirective(D); } +void EnqueueVisitor::VisitOMPFuseDirective(const OMPFuseDirective *D) { + VisitOMPLoopTransformationDirective(D); +} + void EnqueueVisitor::VisitOMPForDirective(const
OMPForDirective *D) { VisitOMPLoopDirective(D); } @@ -6317,6 +6322,8 @@ CXString clang_getCursorKindSpelling(enum CXCursorKind Kind) { return cxstring::createRef("OMPReverseDirective"); case CXCursor_OMPInterchangeDirective: return cxstring::createRef("OMPInterchangeDirective"); + case CXCursor_OMPFuseDirective: + return cxstring::createRef("OMPFuseDirective"); case CXCursor_OMPForDirective: return cxstring::createRef("OMPForDirective"); case CXCursor_OMPForSimdDirective: diff --git a/clang/tools/libclang/CXCursor.cpp b/clang/tools/libclang/CXCursor.cpp index 635d03a88d105..709fa60d28d8d 100644 --- a/clang/tools/libclang/CXCursor.cpp +++ b/clang/tools/libclang/CXCursor.cpp @@ -688,6 +688,9 @@ CXCursor cxcursor::MakeCXCursor(const Stmt *S, const Decl *Parent, case Stmt::OMPInterchangeDirectiveClass: K = CXCursor_OMPInterchangeDirective; break; + case Stmt::OMPFuseDirectiveClass: + K = CXCursor_OMPFuseDirective; + break; case Stmt::OMPForDirectiveClass: K = CXCursor_OMPForDirective; break; diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 0af4b436649a3..8286cfcadaafd 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -852,6 +852,10 @@ def OMP_For : Directive<"for"> { let category = CA_Executable; let languages = [L_C]; } +def OMP_Fuse : Directive<"fuse"> { + let association = AS_Loop; + let category = CA_Executable; +} def OMP_Interchange : Directive<"interchange"> { let allowedOnceClauses = [ VersionedClause, diff --git a/openmp/runtime/test/transform/fuse/foreach.cpp b/openmp/runtime/test/transform/fuse/foreach.cpp new file mode 100644 index 0000000000000..cabf4bf8a511d --- /dev/null +++ b/openmp/runtime/test/transform/fuse/foreach.cpp @@ -0,0 +1,192 @@ +// RUN: %libomp-cxx20-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include +#include +#include + +struct Reporter { + const char *name; + + Reporter(const 
char *name) : name(name) { print("ctor"); } + + Reporter() : name("") { print("ctor"); } + + Reporter(const Reporter &that) : name(that.name) { print("copy ctor"); } + + Reporter(Reporter &&that) : name(that.name) { print("move ctor"); } + + ~Reporter() { print("dtor"); } + + const Reporter &operator=(const Reporter &that) { + print("copy assign"); + this->name = that.name; + return *this; + } + + const Reporter &operator=(Reporter &&that) { + print("move assign"); + this->name = that.name; + return *this; + } + + struct Iterator { + const Reporter *owner; + int pos; + + Iterator(const Reporter *owner, int pos) : owner(owner), pos(pos) {} + + Iterator(const Iterator &that) : owner(that.owner), pos(that.pos) { + owner->print("iterator copy ctor"); + } + + Iterator(Iterator &&that) : owner(that.owner), pos(that.pos) { + owner->print("iterator move ctor"); + } + + ~Iterator() { owner->print("iterator dtor"); } + + const Iterator &operator=(const Iterator &that) { + owner->print("iterator copy assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + const Iterator &operator=(Iterator &&that) { + owner->print("iterator move assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + bool operator==(const Iterator &that) const { + owner->print("iterator %d == %d", 2 - this->pos, 2 - that.pos); + return this->pos == that.pos; + } + + Iterator &operator++() { + owner->print("iterator prefix ++"); + pos -= 1; + return *this; + } + + Iterator operator++(int) { + owner->print("iterator postfix ++"); + auto result = *this; + pos -= 1; + return result; + } + + int operator*() const { + int result = 2 - pos; + owner->print("iterator deref: %i", result); + return result; + } + + size_t operator-(const Iterator &that) const { + int result = (2 - this->pos) - (2 - that.pos); + owner->print("iterator distance: %d", result); + return result; + } + + Iterator operator+(int steps) const { + owner->print("iterator advance: %i += 
%i", 2 - this->pos, steps); + return Iterator(owner, pos - steps); + } + + void print(const char *msg) const { owner->print(msg); } + }; + + Iterator begin() const { + print("begin()"); + return Iterator(this, 2); + } + + Iterator end() const { + print("end()"); + return Iterator(this, -1); + } + + void print(const char *msg, ...) const { + va_list args; + va_start(args, msg); + printf("[%s] ", name); + vprintf(msg, args); + printf("\n"); + va_end(args); + } +}; + +int main() { + printf("do\n"); +#pragma omp fuse + { + for (Reporter a{"C"}; auto &&v : Reporter("A")) + printf("v=%d\n", v); + for (Reporter aa{"D"}; auto &&vv : Reporter("B")) + printf("vv=%d\n", vv); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +// CHECK: [C] ctor +// CHECK-NEXT: [A] ctor +// CHECK-NEXT: [A] end() +// CHECK-NEXT: [A] begin() +// CHECK-NEXT: [A] begin() +// CHECK-NEXT: [A] iterator distance: 3 +// CHECK-NEXT: [D] ctor +// CHECK-NEXT: [B] ctor +// CHECK-NEXT: [B] end() +// CHECK-NEXT: [B] begin() +// CHECK-NEXT: [B] begin() +// CHECK-NEXT: [B] iterator distance: 3 +// CHECK-NEXT: [A] iterator advance: 0 += 0 +// CHECK-NEXT: [A] iterator move assign +// CHECK-NEXT: [A] iterator deref: 0 +// CHECK-NEXT: v=0 +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [B] iterator advance: 0 += 0 +// CHECK-NEXT: [B] iterator move assign +// CHECK-NEXT: [B] iterator deref: 0 +// CHECK-NEXT: vv=0 +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [A] iterator advance: 0 += 1 +// CHECK-NEXT: [A] iterator move assign +// CHECK-NEXT: [A] iterator deref: 1 +// CHECK-NEXT: v=1 +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [B] iterator advance: 0 += 1 +// CHECK-NEXT: [B] iterator move assign +// CHECK-NEXT: [B] iterator deref: 1 +// CHECK-NEXT: vv=1 +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [A] iterator advance: 0 += 2 +// CHECK-NEXT: [A] iterator move assign +// CHECK-NEXT: [A] iterator deref: 2 +// CHECK-NEXT: v=2 +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [B] iterator advance: 0 
+= 2 +// CHECK-NEXT: [B] iterator move assign +// CHECK-NEXT: [B] iterator deref: 2 +// CHECK-NEXT: vv=2 +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] iterator dtor +// CHECK-NEXT: [B] dtor +// CHECK-NEXT: [D] dtor +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [A] iterator dtor +// CHECK-NEXT: [A] dtor +// CHECK-NEXT: [C] dtor +// CHECK-NEXT: done + + +#endif diff --git a/openmp/runtime/test/transform/fuse/intfor.c b/openmp/runtime/test/transform/fuse/intfor.c new file mode 100644 index 0000000000000..b8171b4df7042 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/intfor.c @@ -0,0 +1,50 @@ +// RUN: %libomp-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include + +int main() { + printf("do\n"); +#pragma omp fuse + { + for (int i = 5; i <= 25; i += 5) + printf("i=%d\n", i); + for (int j = 10; j < 100; j += 10) + printf("j=%d\n", j); + for (int k = 10; k > 0; --k) + printf("k=%d\n", k); + } + printf("done\n"); + return EXIT_SUCCESS; +} +#endif /* HEADER */ + +// CHECK: do +// CHECK-NEXT: i=5 +// CHECK-NEXT: j=10 +// CHECK-NEXT: k=10 +// CHECK-NEXT: i=10 +// CHECK-NEXT: j=20 +// CHECK-NEXT: k=9 +// CHECK-NEXT: i=15 +// CHECK-NEXT: j=30 +// CHECK-NEXT: k=8 +// CHECK-NEXT: i=20 +// CHECK-NEXT: j=40 +// CHECK-NEXT: k=7 +// CHECK-NEXT: i=25 +// CHECK-NEXT: j=50 +// CHECK-NEXT: k=6 +// CHECK-NEXT: j=60 +// CHECK-NEXT: k=5 +// CHECK-NEXT: j=70 +// CHECK-NEXT: k=4 +// CHECK-NEXT: j=80 +// CHECK-NEXT: k=3 +// CHECK-NEXT: j=90 +// CHECK-NEXT: k=2 +// CHECK-NEXT: k=1 +// CHECK-NEXT: done diff --git a/openmp/runtime/test/transform/fuse/iterfor.cpp b/openmp/runtime/test/transform/fuse/iterfor.cpp new file mode 100644 index 0000000000000..552484b2981c4 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/iterfor.cpp @@ -0,0 +1,194 @@ +// RUN: %libomp-cxx20-compile-and-run | FileCheck %s --match-full-lines + 
+#ifndef HEADER +#define HEADER + +#include +#include +#include +#include + +struct Reporter { + const char *name; + + Reporter(const char *name) : name(name) { print("ctor"); } + + Reporter() : name("") { print("ctor"); } + + Reporter(const Reporter &that) : name(that.name) { print("copy ctor"); } + + Reporter(Reporter &&that) : name(that.name) { print("move ctor"); } + + ~Reporter() { print("dtor"); } + + const Reporter &operator=(const Reporter &that) { + print("copy assign"); + this->name = that.name; + return *this; + } + + const Reporter &operator=(Reporter &&that) { + print("move assign"); + this->name = that.name; + return *this; + } + + struct Iterator { + const Reporter *owner; + int pos; + + Iterator(const Reporter *owner, int pos) : owner(owner), pos(pos) {} + + Iterator(const Iterator &that) : owner(that.owner), pos(that.pos) { + owner->print("iterator copy ctor"); + } + + Iterator(Iterator &&that) : owner(that.owner), pos(that.pos) { + owner->print("iterator move ctor"); + } + + ~Iterator() { owner->print("iterator dtor"); } + + const Iterator &operator=(const Iterator &that) { + owner->print("iterator copy assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + const Iterator &operator=(Iterator &&that) { + owner->print("iterator move assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + bool operator==(const Iterator &that) const { + owner->print("iterator %d == %d", 2 - this->pos, 2 - that.pos); + return this->pos == that.pos; + } + + bool operator!=(const Iterator &that) const { + owner->print("iterator %d != %d", 2 - this->pos, 2 - that.pos); + return this->pos != that.pos; + } + + Iterator &operator++() { + owner->print("iterator prefix ++"); + pos -= 1; + return *this; + } + + Iterator operator++(int) { + owner->print("iterator postfix ++"); + auto result = *this; + pos -= 1; + return result; + } + + int operator*() const { + int result = 2 - pos; + owner->print("iterator deref: %i", 
result); + return result; + } + + size_t operator-(const Iterator &that) const { + int result = (2 - this->pos) - (2 - that.pos); + owner->print("iterator distance: %d", result); + return result; + } + + Iterator operator+(int steps) const { + owner->print("iterator advance: %i += %i", 2 - this->pos, steps); + return Iterator(owner, pos - steps); + } + }; + + Iterator begin() const { + print("begin()"); + return Iterator(this, 2); + } + + Iterator end() const { + print("end()"); + return Iterator(this, -1); + } + + void print(const char *msg, ...) const { + va_list args; + va_start(args, msg); + printf("[%s] ", name); + vprintf(msg, args); + printf("\n"); + va_end(args); + } +}; + +int main() { + printf("do\n"); + Reporter C("C"); + Reporter D("D"); +#pragma omp fuse + { + for (auto it = C.begin(); it != C.end(); ++it) + printf("v=%d\n", *it); + + for (auto it = D.begin(); it != D.end(); ++it) + printf("vv=%d\n", *it); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +#endif /* HEADER */ + +// CHECK: do +// CHECK: [C] ctor +// CHECK-NEXT: [D] ctor +// CHECK-NEXT: [C] begin() +// CHECK-NEXT: [C] begin() +// CHECK-NEXT: [C] end() +// CHECK-NEXT: [C] iterator distance: 3 +// CHECK-NEXT: [D] begin() +// CHECK-NEXT: [D] begin() +// CHECK-NEXT: [D] end() +// CHECK-NEXT: [D] iterator distance: 3 +// CHECK-NEXT: [C] iterator advance: 0 += 0 +// CHECK-NEXT: [C] iterator move assign +// CHECK-NEXT: [C] iterator deref: 0 +// CHECK-NEXT: v=0 +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] iterator advance: 0 += 0 +// CHECK-NEXT: [D] iterator move assign +// CHECK-NEXT: [D] iterator deref: 0 +// CHECK-NEXT: vv=0 +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator advance: 0 += 1 +// CHECK-NEXT: [C] iterator move assign +// CHECK-NEXT: [C] iterator deref: 1 +// CHECK-NEXT: v=1 +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] iterator advance: 0 += 1 +// CHECK-NEXT: [D] iterator move assign +// CHECK-NEXT: [D] iterator deref: 1 +// CHECK-NEXT: vv=1 +// 
CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator advance: 0 += 2 +// CHECK-NEXT: [C] iterator move assign +// CHECK-NEXT: [C] iterator deref: 2 +// CHECK-NEXT: v=2 +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] iterator advance: 0 += 2 +// CHECK-NEXT: [D] iterator move assign +// CHECK-NEXT: [D] iterator deref: 2 +// CHECK-NEXT: vv=2 +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: done +// CHECK-NEXT: [D] iterator dtor +// CHECK-NEXT: [C] iterator dtor +// CHECK-NEXT: [D] dtor +// CHECK-NEXT: [C] dtor diff --git a/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp new file mode 100644 index 0000000000000..e9f76713fe3e0 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-foreach.cpp @@ -0,0 +1,208 @@ +// RUN: %libomp-cxx20-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include +#include +#include + +struct Reporter { + const char *name; + + Reporter(const char *name) : name(name) { print("ctor"); } + + Reporter() : name("") { print("ctor"); } + + Reporter(const Reporter &that) : name(that.name) { print("copy ctor"); } + + Reporter(Reporter &&that) : name(that.name) { print("move ctor"); } + + ~Reporter() { print("dtor"); } + + const Reporter &operator=(const Reporter &that) { + print("copy assign"); + this->name = that.name; + return *this; + } + + const Reporter &operator=(Reporter &&that) { + print("move assign"); + this->name = that.name; + return *this; + } + + struct Iterator { + const Reporter *owner; + int pos; + + Iterator(const Reporter *owner, int pos) : owner(owner), pos(pos) {} + + Iterator(const Iterator &that) : owner(that.owner), pos(that.pos) { + owner->print("iterator copy ctor"); + } + + Iterator(Iterator &&that) : 
owner(that.owner), pos(that.pos) { + owner->print("iterator move ctor"); + } + + ~Iterator() { owner->print("iterator dtor"); } + + const Iterator &operator=(const Iterator &that) { + owner->print("iterator copy assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + const Iterator &operator=(Iterator &&that) { + owner->print("iterator move assign"); + this->owner = that.owner; + this->pos = that.pos; + return *this; + } + + bool operator==(const Iterator &that) const { + owner->print("iterator %d == %d", 2 - this->pos, 2 - that.pos); + return this->pos == that.pos; + } + + Iterator &operator++() { + owner->print("iterator prefix ++"); + pos -= 1; + return *this; + } + + Iterator operator++(int) { + owner->print("iterator postfix ++"); + auto result = *this; + pos -= 1; + return result; + } + + int operator*() const { + int result = 2 - pos; + owner->print("iterator deref: %i", result); + return result; + } + + size_t operator-(const Iterator &that) const { + int result = (2 - this->pos) - (2 - that.pos); + owner->print("iterator distance: %d", result); + return result; + } + + Iterator operator+(int steps) const { + owner->print("iterator advance: %i += %i", 2 - this->pos, steps); + return Iterator(owner, pos - steps); + } + + void print(const char *msg) const { owner->print(msg); } + }; + + Iterator begin() const { + print("begin()"); + return Iterator(this, 2); + } + + Iterator end() const { + print("end()"); + return Iterator(this, -1); + } + + void print(const char *msg, ...) 
const { + va_list args; + va_start(args, msg); + printf("[%s] ", name); + vprintf(msg, args); + printf("\n"); + va_end(args); + } +}; + +int main() { + printf("do\n"); +#pragma omp parallel for collapse(2) num_threads(1) + for (int i = 0; i < 3; ++i) +#pragma omp fuse + { + for (Reporter c{"init-stmt"}; auto &&v : Reporter("range")) + printf("i=%d v=%d\n", i, v); + for (int vv = 0; vv < 3; ++vv) + printf("i=%d vv=%d\n", i, vv); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +#endif /* HEADER */ + +// CHECK: do +// CHECK-NEXT: [init-stmt] ctor +// CHECK-NEXT: [range] ctor +// CHECK-NEXT: [range] end() +// CHECK-NEXT: [range] begin() +// CHECK-NEXT: [range] begin() +// CHECK-NEXT: [range] iterator distance: 3 +// CHECK-NEXT: [range] iterator advance: 0 += 0 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 0 +// CHECK-NEXT: i=0 v=0 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=0 vv=0 +// CHECK-NEXT: [range] iterator advance: 0 += 1 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 1 +// CHECK-NEXT: i=0 v=1 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=0 vv=1 +// CHECK-NEXT: [range] iterator advance: 0 += 2 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 2 +// CHECK-NEXT: i=0 v=2 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=0 vv=2 +// CHECK-NEXT: [range] iterator advance: 0 += 0 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 0 +// CHECK-NEXT: i=1 v=0 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=1 vv=0 +// CHECK-NEXT: [range] iterator advance: 0 += 1 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 1 +// CHECK-NEXT: i=1 v=1 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=1 vv=1 +// CHECK-NEXT: [range] iterator advance: 0 += 2 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 2 +// CHECK-NEXT: i=1 v=2 
+// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=1 vv=2 +// CHECK-NEXT: [range] iterator advance: 0 += 0 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 0 +// CHECK-NEXT: i=2 v=0 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=2 vv=0 +// CHECK-NEXT: [range] iterator advance: 0 += 1 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 1 +// CHECK-NEXT: i=2 v=1 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=2 vv=1 +// CHECK-NEXT: [range] iterator advance: 0 += 2 +// CHECK-NEXT: [range] iterator move assign +// CHECK-NEXT: [range] iterator deref: 2 +// CHECK-NEXT: i=2 v=2 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: i=2 vv=2 +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: [range] iterator dtor +// CHECK-NEXT: [range] dtor +// CHECK-NEXT: [init-stmt] dtor +// CHECK-NEXT: done + diff --git a/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c new file mode 100644 index 0000000000000..272908e72c429 --- /dev/null +++ b/openmp/runtime/test/transform/fuse/parallel-wsloop-collapse-intfor.c @@ -0,0 +1,45 @@ +// RUN: %libomp-cxx-compile-and-run | FileCheck %s --match-full-lines + +#ifndef HEADER +#define HEADER + +#include +#include + +int main() { + printf("do\n"); +#pragma omp parallel for collapse(2) num_threads(1) + for (int i = 0; i < 3; ++i) +#pragma omp fuse + { + for (int j = 0; j < 3; ++j) + printf("i=%d j=%d\n", i, j); + for (int k = 0; k < 3; ++k) + printf("i=%d k=%d\n", i, k); + } + printf("done\n"); + return EXIT_SUCCESS; +} + +#endif /* HEADER */ + +// CHECK: do +// CHECK: i=0 j=0 +// CHECK-NEXT: i=0 k=0 +// CHECK-NEXT: i=0 j=1 +// CHECK-NEXT: i=0 k=1 +// CHECK-NEXT: i=0 j=2 +// CHECK-NEXT: i=0 k=2 +// CHECK-NEXT: i=1 j=0 +// CHECK-NEXT: i=1 k=0 +// CHECK-NEXT: i=1 j=1 +// CHECK-NEXT: i=1 k=1 +// CHECK-NEXT: i=1 j=2 +// CHECK-NEXT: 
i=1 k=2 +// CHECK-NEXT: i=2 j=0 +// CHECK-NEXT: i=2 k=0 +// CHECK-NEXT: i=2 j=1 +// CHECK-NEXT: i=2 k=1 +// CHECK-NEXT: i=2 j=2 +// CHECK-NEXT: i=2 k=2 +// CHECK-NEXT: done >From 7e3bd1e3afcdc246da0362ffb8693b160f9d3f4a Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:28:04 +0000 Subject: [PATCH 2/9] Add looprange clause --- clang/include/clang/AST/OpenMPClause.h | 100 ++++++ clang/include/clang/AST/RecursiveASTVisitor.h | 8 + clang/include/clang/AST/StmtOpenMP.h | 18 +- .../clang/Basic/DiagnosticSemaKinds.td | 5 + clang/include/clang/Parse/Parser.h | 3 + clang/include/clang/Sema/SemaOpenMP.h | 6 + clang/lib/AST/OpenMPClause.cpp | 35 ++ clang/lib/AST/StmtOpenMP.cpp | 7 +- clang/lib/AST/StmtProfile.cpp | 7 + clang/lib/Basic/OpenMPKinds.cpp | 2 + clang/lib/Parse/ParseOpenMP.cpp | 36 ++ clang/lib/Sema/SemaOpenMP.cpp | 155 +++++++-- clang/lib/Sema/TreeTransform.h | 33 ++ clang/lib/Serialization/ASTReader.cpp | 11 + clang/lib/Serialization/ASTReaderStmt.cpp | 4 +- clang/lib/Serialization/ASTWriter.cpp | 8 + clang/test/OpenMP/fuse_ast_print.cpp | 67 ++++ clang/test/OpenMP/fuse_codegen.cpp | 320 +++++++++++++++++- clang/test/OpenMP/fuse_messages.cpp | 112 +++++- clang/tools/libclang/CIndex.cpp | 5 + llvm/include/llvm/Frontend/OpenMP/ClauseT.h | 16 +- llvm/include/llvm/Frontend/OpenMP/OMP.td | 6 + 22 files changed, 921 insertions(+), 43 deletions(-) diff --git a/clang/include/clang/AST/OpenMPClause.h b/clang/include/clang/AST/OpenMPClause.h index 6fd16bc0f03be..8f937cdef9cd0 100644 --- a/clang/include/clang/AST/OpenMPClause.h +++ b/clang/include/clang/AST/OpenMPClause.h @@ -1143,6 +1143,106 @@ class OMPFullClause final : public OMPNoChildClause { static OMPFullClause *CreateEmpty(const ASTContext &C); }; +/// This class represents the 'looprange' clause in the +/// '#pragma omp fuse' directive +/// +/// \code {c} +/// #pragma omp fuse looprange(1,2) +/// { +/// for(int i = 0; i < 64; ++i) +/// for(int j = 0; j < 256; j+=2) +/// for(int k = 127; k >= 0; --k) 
+/// \endcode +class OMPLoopRangeClause final : public OMPClause { + friend class OMPClauseReader; + + explicit OMPLoopRangeClause() + : OMPClause(llvm::omp::OMPC_looprange, {}, {}) {} + + /// Location of '(' + SourceLocation LParenLoc; + + /// Location of 'first' + SourceLocation FirstLoc; + + /// Location of 'count' + SourceLocation CountLoc; + + /// Expr associated with 'first' argument + Expr *First = nullptr; + + /// Expr associated with 'count' argument + Expr *Count = nullptr; + + /// Set 'first' + void setFirst(Expr *First) { this->First = First; } + + /// Set 'count' + void setCount(Expr *Count) { this->Count = Count; } + + /// Set location of '('. + void setLParenLoc(SourceLocation Loc) { LParenLoc = Loc; } + + /// Set location of 'first' argument + void setFirstLoc(SourceLocation Loc) { FirstLoc = Loc; } + + /// Set location of 'count' argument + void setCountLoc(SourceLocation Loc) { CountLoc = Loc; } + +public: + /// Build an AST node for a 'looprange' clause + /// + /// \param StartLoc Starting location of the clause. + /// \param LParenLoc Location of '('. + /// \param FirstLoc Location of the 'first' argument. + /// \param CountLoc Location of the 'count' argument. + static OMPLoopRangeClause * + Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation LParenLoc, + SourceLocation FirstLoc, SourceLocation CountLoc, + SourceLocation EndLoc, Expr *First, Expr *Count); + + /// Build an empty 'looprange' node for deserialization + /// + /// \param C Context of the AST. 
+ static OMPLoopRangeClause *CreateEmpty(const ASTContext &C); + + /// Returns the location of '(' + SourceLocation getLParenLoc() const { return LParenLoc; } + + /// Returns the location of 'first' + SourceLocation getFirstLoc() const { return FirstLoc; } + + /// Returns the location of 'count' + SourceLocation getCountLoc() const { return CountLoc; } + + /// Returns the argument 'first' or nullptr if not set + Expr *getFirst() const { return cast_or_null<Expr>(First); } + + /// Returns the argument 'count' or nullptr if not set + Expr *getCount() const { return cast_or_null<Expr>(Count); } + + child_range children() { + return child_range(reinterpret_cast<Stmt **>(&First), + reinterpret_cast<Stmt **>(&Count) + 1); + } + + const_child_range children() const { + auto Children = const_cast<OMPLoopRangeClause *>(this)->children(); + return const_child_range(Children.begin(), Children.end()); + } + + child_range used_children() { + return child_range(child_iterator(), child_iterator()); + } + const_child_range used_children() const { + return const_child_range(const_child_iterator(), const_child_iterator()); + } + + static bool classof(const OMPClause *T) { + return T->getClauseKind() == llvm::omp::OMPC_looprange; + } +}; + /// Representation of the 'partial' clause of the '#pragma omp unroll' /// directive. 
/// diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h index 057e9e346ce4e..94066edc64933 100644 --- a/clang/include/clang/AST/RecursiveASTVisitor.h +++ b/clang/include/clang/AST/RecursiveASTVisitor.h @@ -3400,6 +3400,14 @@ bool RecursiveASTVisitor<Derived>::VisitOMPFullClause(OMPFullClause *C) { return true; } +template <typename Derived> +bool RecursiveASTVisitor<Derived>::VisitOMPLoopRangeClause( + OMPLoopRangeClause *C) { + TRY_TO(TraverseStmt(C->getFirst())); + TRY_TO(TraverseStmt(C->getCount())); + return true; +} + template <typename Derived> bool RecursiveASTVisitor<Derived>::VisitOMPPartialClause(OMPPartialClause *C) { TRY_TO(TraverseStmt(C->getFactor())); diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index dc6f797e24ab8..85bde292ca748 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -5572,7 +5572,9 @@ class OMPTileDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPTileDirectiveClass, llvm::omp::OMPD_tile, StartLoc, EndLoc, NumLoops) { + // Tiling doubles the original number of loops setNumGeneratedLoops(2 * NumLoops); + // Produces a single top-level canonical loop nest setNumGeneratedLoopNests(1); } @@ -5803,9 +5805,9 @@ class OMPReverseDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPReverseDirectiveClass, llvm::omp::OMPD_reverse, StartLoc, EndLoc, 1) { - - setNumGeneratedLoopNests(1); + // Reverse produces a single top-level canonical loop nest setNumGeneratedLoops(1); + setNumGeneratedLoopNests(1); } void setPreInits(Stmt *PreInits) { @@ -5873,6 +5875,8 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { : OMPLoopTransformationDirective(OMPInterchangeDirectiveClass, llvm::omp::OMPD_interchange, StartLoc, EndLoc, NumLoops) { + // Interchange produces a single top-level canonical loop + // nest, with the exact same amount of total loops 
setNumGeneratedLoops(NumLoops); setNumGeneratedLoopNests(1); } @@ -5950,11 +5954,7 @@ class OMPFuseDirective final : public OMPLoopTransformationDirective { unsigned NumLoops) : OMPLoopTransformationDirective(OMPFuseDirectiveClass, llvm::omp::OMPD_fuse, StartLoc, EndLoc, - NumLoops) { - setNumGeneratedLoops(1); - // TODO: After implementing the looprange clause, change this logic - setNumGeneratedLoopNests(1); - } + NumLoops) {} void setPreInits(Stmt *PreInits) { Data->getChildren()[PreInitsOffset] = PreInits; @@ -5990,8 +5990,10 @@ class OMPFuseDirective final : public OMPLoopTransformationDirective { /// \param C Context of the AST /// \param NumClauses Number of clauses to allocate /// \param NumLoops Number of associated loops to allocate + /// \param NumLoopNests Number of top level loops to allocate static OMPFuseDirective *CreateEmpty(const ASTContext &C, unsigned NumClauses, - unsigned NumLoops); + unsigned NumLoops, + unsigned NumLoopNests); /// Gets the associated loops after the transformation. This is the de-sugared /// replacement or nulltpr in dependent contexts. 
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index f31b6f8a3b26a..191618e7865dc 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -11566,6 +11566,11 @@ def err_omp_not_a_loop_sequence : Error < "statement after '#pragma omp %0' must be a loop sequence containing canonical loops or loop-generating constructs">; def err_omp_empty_loop_sequence : Error < "loop sequence after '#pragma omp %0' must contain at least 1 canonical loop or loop-generating construct">; +def err_omp_invalid_looprange : Error < + "loop range in '#pragma omp %0' exceeds the number of available loops: " + "range end '%1' is greater than the total number of loops '%2'">; +def warn_omp_redundant_fusion : Warning < + "loop range in '#pragma omp %0' contains only a single loop, resulting in redundant fusion">; def err_omp_not_for : Error< "%select{statement after '#pragma omp %1' must be a for loop|" "expected %2 for loops after '#pragma omp %1'%select{|, but found only %4}3}0">; diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h index e6492b81dfff8..965dcb7da26d8 100644 --- a/clang/include/clang/Parse/Parser.h +++ b/clang/include/clang/Parse/Parser.h @@ -6739,6 +6739,9 @@ class Parser : public CodeCompletionHandler { OpenMPClauseKind Kind, bool ParseOnly); + /// Parses the 'looprange' clause of a '#pragma omp fuse' directive. + OMPClause *ParseOpenMPLoopRangeClause(); + /// Parses the 'sizes' clause of a '#pragma omp tile' directive. 
OMPClause *ParseOpenMPSizesClause(); diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index 8d78c2197c89d..f4a075e54cebe 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -921,6 +921,12 @@ class SemaOpenMP : public SemaBase { SourceLocation StartLoc, SourceLocation LParenLoc, SourceLocation EndLoc); + + /// Called on well-formed 'looprange' clause after parsing its arguments. + OMPClause * + ActOnOpenMPLoopRangeClause(Expr *First, Expr *Count, SourceLocation StartLoc, + SourceLocation LParenLoc, SourceLocation FirstLoc, + SourceLocation CountLoc, SourceLocation EndLoc); /// Called on well-formed 'ordered' clause. OMPClause * ActOnOpenMPOrderedClause(SourceLocation StartLoc, SourceLocation EndLoc, diff --git a/clang/lib/AST/OpenMPClause.cpp b/clang/lib/AST/OpenMPClause.cpp index 0e5052b944162..0b5808eb100e4 100644 --- a/clang/lib/AST/OpenMPClause.cpp +++ b/clang/lib/AST/OpenMPClause.cpp @@ -1024,6 +1024,26 @@ OMPPartialClause *OMPPartialClause::CreateEmpty(const ASTContext &C) { return new (C) OMPPartialClause(); } +OMPLoopRangeClause * +OMPLoopRangeClause::Create(const ASTContext &C, SourceLocation StartLoc, + SourceLocation LParenLoc, SourceLocation FirstLoc, + SourceLocation CountLoc, SourceLocation EndLoc, + Expr *First, Expr *Count) { + OMPLoopRangeClause *Clause = CreateEmpty(C); + Clause->setLocStart(StartLoc); + Clause->setLParenLoc(LParenLoc); + Clause->setLocEnd(EndLoc); + Clause->setFirstLoc(FirstLoc); + Clause->setCountLoc(CountLoc); + Clause->setFirst(First); + Clause->setCount(Count); + return Clause; +} + +OMPLoopRangeClause *OMPLoopRangeClause::CreateEmpty(const ASTContext &C) { + return new (C) OMPLoopRangeClause(); +} + OMPAllocateClause *OMPAllocateClause::Create( const ASTContext &C, SourceLocation StartLoc, SourceLocation LParenLoc, Expr *Allocator, Expr *Alignment, SourceLocation ColonLoc, @@ -1888,6 +1908,21 @@ void 
OMPClausePrinter::VisitOMPPartialClause(OMPPartialClause *Node) { } } +void OMPClausePrinter::VisitOMPLoopRangeClause(OMPLoopRangeClause *Node) { + OS << "looprange"; + + Expr *First = Node->getFirst(); + Expr *Count = Node->getCount(); + + if (First && Count) { + OS << "("; + First->printPretty(OS, nullptr, Policy, 0); + OS << ","; + Count->printPretty(OS, nullptr, Policy, 0); + OS << ")"; + } +} + void OMPClausePrinter::VisitOMPAllocatorClause(OMPAllocatorClause *Node) { OS << "allocator("; Node->getAllocator()->printPretty(OS, nullptr, Policy, 0); diff --git a/clang/lib/AST/StmtOpenMP.cpp b/clang/lib/AST/StmtOpenMP.cpp index 4a6133766ef1c..06c987e7f1761 100644 --- a/clang/lib/AST/StmtOpenMP.cpp +++ b/clang/lib/AST/StmtOpenMP.cpp @@ -524,10 +524,13 @@ OMPFuseDirective *OMPFuseDirective::Create( OMPFuseDirective *OMPFuseDirective::CreateEmpty(const ASTContext &C, unsigned NumClauses, - unsigned NumLoops) { - return createEmptyDirective( + unsigned NumLoops, + unsigned NumLoopNests) { + OMPFuseDirective *Dir = createEmptyDirective( C, NumClauses, /*HasAssociatedStmt=*/true, TransformedStmtOffset + 1, SourceLocation(), SourceLocation(), NumLoops); + Dir->setNumGeneratedLoopNests(NumLoopNests); + return Dir; } OMPForSimdDirective * diff --git a/clang/lib/AST/StmtProfile.cpp b/clang/lib/AST/StmtProfile.cpp index 99d426db985e8..9f0ce076c35fa 100644 --- a/clang/lib/AST/StmtProfile.cpp +++ b/clang/lib/AST/StmtProfile.cpp @@ -511,6 +511,13 @@ void OMPClauseProfiler::VisitOMPPartialClause(const OMPPartialClause *C) { Profiler->VisitExpr(Factor); } +void OMPClauseProfiler::VisitOMPLoopRangeClause(const OMPLoopRangeClause *C) { + if (const Expr *First = C->getFirst()) + Profiler->VisitExpr(First); + if (const Expr *Count = C->getCount()) + Profiler->VisitExpr(Count); +} + void OMPClauseProfiler::VisitOMPAllocatorClause(const OMPAllocatorClause *C) { if (C->getAllocator()) Profiler->VisitStmt(C->getAllocator()); diff --git a/clang/lib/Basic/OpenMPKinds.cpp 
b/clang/lib/Basic/OpenMPKinds.cpp index d172450512f13..18330181f1509 100644 --- a/clang/lib/Basic/OpenMPKinds.cpp +++ b/clang/lib/Basic/OpenMPKinds.cpp @@ -248,6 +248,7 @@ unsigned clang::getOpenMPSimpleClauseType(OpenMPClauseKind Kind, StringRef Str, case OMPC_affinity: case OMPC_when: case OMPC_append_args: + case OMPC_looprange: break; default: break; @@ -583,6 +584,7 @@ const char *clang::getOpenMPSimpleClauseTypeName(OpenMPClauseKind Kind, case OMPC_affinity: case OMPC_when: case OMPC_append_args: + case OMPC_looprange: break; default: break; diff --git a/clang/lib/Parse/ParseOpenMP.cpp b/clang/lib/Parse/ParseOpenMP.cpp index cfffcdb01a514..ade5192d1968d 100644 --- a/clang/lib/Parse/ParseOpenMP.cpp +++ b/clang/lib/Parse/ParseOpenMP.cpp @@ -3041,6 +3041,39 @@ OMPClause *Parser::ParseOpenMPSizesClause() { OpenLoc, CloseLoc); } +OMPClause *Parser::ParseOpenMPLoopRangeClause() { + SourceLocation ClauseNameLoc = ConsumeToken(); + SourceLocation FirstLoc, CountLoc; + + BalancedDelimiterTracker T(*this, tok::l_paren, tok::annot_pragma_openmp_end); + if (T.consumeOpen()) { + Diag(Tok, diag::err_expected) << tok::l_paren; + return nullptr; + } + + FirstLoc = Tok.getLocation(); + ExprResult FirstVal = ParseConstantExpression(); + if (!FirstVal.isUsable()) { + T.skipToEnd(); + return nullptr; + } + + ExpectAndConsume(tok::comma); + + CountLoc = Tok.getLocation(); + ExprResult CountVal = ParseConstantExpression(); + if (!CountVal.isUsable()) { + T.skipToEnd(); + return nullptr; + } + + T.consumeClose(); + + return Actions.OpenMP().ActOnOpenMPLoopRangeClause( + FirstVal.get(), CountVal.get(), ClauseNameLoc, T.getOpenLocation(), + FirstLoc, CountLoc, T.getCloseLocation()); +} + OMPClause *Parser::ParseOpenMPPermutationClause() { SourceLocation ClauseNameLoc, OpenLoc, CloseLoc; SmallVector ArgExprs; @@ -3469,6 +3502,9 @@ OMPClause *Parser::ParseOpenMPClause(OpenMPDirectiveKind DKind, } Clause = ParseOpenMPClause(CKind, WrongDirective); break; + case OMPC_looprange: + Clause 
= ParseOpenMPLoopRangeClause(); + break; default: break; } diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index bd8bee64a9d2f..556b5cb43b6f8 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -14289,7 +14289,6 @@ bool SemaOpenMP::checkTransformableLoopSequence( // and tries to match the input AST to the canonical loop sequence grammar // structure - auto NLCV = NestedLoopCounterVisitor(); // Helper functions to validate canonical loop sequence grammar is valid auto isLoopSequenceDerivation = [](auto *Child) { return isa(Child) || isa(Child) || @@ -14392,7 +14391,7 @@ bool SemaOpenMP::checkTransformableLoopSequence( // Modularized code for handling regular canonical loops auto handleRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, - &LoopSeqSize, &NumLoops, Kind, &TmpDSA, &NLCV, + &LoopSeqSize, &NumLoops, Kind, &TmpDSA, this](Stmt *Child) { OriginalInits.emplace_back(); LoopHelpers.emplace_back(); @@ -14405,8 +14404,11 @@ bool SemaOpenMP::checkTransformableLoopSequence( << getOpenMPDirectiveName(Kind); return false; } + storeLoopStatements(Child); - NumLoops += NLCV.TraverseStmt(Child); + auto NLCV = NestedLoopCounterVisitor(); + NLCV.TraverseStmt(Child); + NumLoops += NLCV.getNestedLoopCount(); return true; }; @@ -15732,6 +15734,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, Stmt *AStmt, SourceLocation StartLoc, SourceLocation EndLoc) { + ASTContext &Context = getASTContext(); DeclContext *CurrContext = SemaRef.CurContext; Scope *CurScope = SemaRef.getCurScope(); @@ -15748,7 +15751,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, SmallVector> OriginalInits; unsigned NumLoops; - // TODO: Support looprange clause using LoopSeqSize unsigned LoopSeqSize; if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, LoopHelpers, LoopStmts, OriginalInits, @@ -15757,10 +15759,67 @@ StmtResult 
SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, } // Defer transformation in dependent contexts + // The NumLoopNests argument is set to a placeholder (0) + // because a dependent context could prevent determining its true value if (CurrContext->isDependentContext()) { return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, - NumLoops, 1, AStmt, nullptr, nullptr); + NumLoops, 0, AStmt, nullptr, nullptr); } + + // Handle clauses, which can be any of the following: [looprange, apply] + const OMPLoopRangeClause *LRC = + OMPExecutableDirective::getSingleClause(Clauses); + + // The clause arguments are invalidated if any error arises + // such as non-constant or non-positive arguments + if (LRC && (!LRC->getFirst() || !LRC->getCount())) + return StmtError(); + + // Delayed semantic check of LoopRange constraint + // Evaluates the loop range arguments and returns the first and count values + auto EvaluateLoopRangeArguments = [&Context](Expr *First, Expr *Count, + uint64_t &FirstVal, + uint64_t &CountVal) { + llvm::APSInt FirstInt = First->EvaluateKnownConstInt(Context); + llvm::APSInt CountInt = Count->EvaluateKnownConstInt(Context); + FirstVal = FirstInt.getZExtValue(); + CountVal = CountInt.getZExtValue(); + }; + + // Checks if the loop range is valid + auto ValidLoopRange = [](uint64_t FirstVal, uint64_t CountVal, + unsigned NumLoops) -> bool { + return FirstVal + CountVal - 1 <= NumLoops; + }; + uint64_t FirstVal = 1, CountVal = 0, LastVal = LoopSeqSize; + + if (LRC) { + EvaluateLoopRangeArguments(LRC->getFirst(), LRC->getCount(), FirstVal, + CountVal); + if (CountVal == 1) + SemaRef.Diag(LRC->getCountLoc(), diag::warn_omp_redundant_fusion) + << getOpenMPDirectiveName(OMPD_fuse); + + if (!ValidLoopRange(FirstVal, CountVal, LoopSeqSize)) { + SemaRef.Diag(LRC->getFirstLoc(), diag::err_omp_invalid_looprange) + << getOpenMPDirectiveName(OMPD_fuse) << (FirstVal + CountVal - 1) + << LoopSeqSize; + return StmtError(); + } + + LastVal = FirstVal + CountVal 
- 1; + } + + // Complete fusion generates a single canonical loop nest + // However looprange clause generates several loop nests + unsigned NumLoopNests = LRC ? LoopSeqSize - CountVal + 1 : 1; + + // Emit a warning for redundant loop fusion when the sequence contains only + // one loop. + if (LoopSeqSize == 1) + SemaRef.Diag(AStmt->getBeginLoc(), diag::warn_omp_redundant_fusion) + << getOpenMPDirectiveName(OMPD_fuse); + assert(LoopHelpers.size() == LoopSeqSize && "Expecting loop iteration space dimensionality to match number of " "affected loops"); @@ -15774,8 +15833,8 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, SmallVector PreInits; // Select the type with the largest bit width among all induction variables - QualType IVType = LoopHelpers[0].IterationVarRef->getType(); - for (unsigned int I = 1; I < LoopSeqSize; ++I) { + QualType IVType = LoopHelpers[FirstVal - 1].IterationVarRef->getType(); + for (unsigned int I = FirstVal; I < LastVal; ++I) { QualType CurrentIVType = LoopHelpers[I].IterationVarRef->getType(); if (Context.getTypeSize(CurrentIVType) > Context.getTypeSize(IVType)) { IVType = CurrentIVType; @@ -15824,20 +15883,21 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // Process each single loop to generate and collect declarations // and statements for all helper expressions - for (unsigned int I = 0; I < LoopSeqSize; ++I) { + for (unsigned int I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], PreInits); - auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", I); - auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", I); - auto [STVD, STDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", I); + auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", J); + auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", J); + auto [STVD, STDStmt] = 
CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", J); auto [NIVD, NIDStmt] = - CreateHelperVarAndStmt(LoopHelpers[I].NumIterations, "ni", I, true); + CreateHelperVarAndStmt(LoopHelpers[I].NumIterations, "ni", J, true); auto [IVVD, IVDStmt] = - CreateHelperVarAndStmt(LoopHelpers[I].IterationVarRef, "iv", I); + CreateHelperVarAndStmt(LoopHelpers[I].IterationVarRef, "iv", J); if (!LBVD || !STVD || !NIVD || !IVVD) - return StmtError(); + assert(LBVD && STVD && NIVD && IVVD && + "OpenMP Fuse Helper variables creation failed"); UBVarDecls.push_back(UBVD); LBVarDecls.push_back(LBVD); @@ -15912,8 +15972,9 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // omp.fuse.max = max(omp.temp1, omp.temp0) ExprResult MaxExpr; - for (unsigned I = 0; I < LoopSeqSize; ++I) { - DeclRefExpr *NIRef = MakeVarDeclRef(NIVarDecls[I]); + // I is the true + for (unsigned I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { + DeclRefExpr *NIRef = MakeVarDeclRef(NIVarDecls[J]); QualType NITy = NIRef->getType(); if (MaxExpr.isUnset()) { @@ -15921,7 +15982,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, MaxExpr = NIRef; } else { // Create a new acummulator variable t_i = MaxExpr - std::string TempName = (Twine(".omp.temp.") + Twine(I)).str(); + std::string TempName = (Twine(".omp.temp.") + Twine(J)).str(); VarDecl *TempDecl = buildVarDecl(SemaRef, {}, NITy, TempName, nullptr, nullptr); TempDecl->setInit(MaxExpr.get()); @@ -15944,7 +16005,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, if (!Comparison.isUsable()) return StmtError(); - DeclRefExpr *NIRef2 = MakeVarDeclRef(NIVarDecls[I]); + DeclRefExpr *NIRef2 = MakeVarDeclRef(NIVarDecls[J]); // Update MaxExpr using a conditional expression to hold the max value MaxExpr = new (Context) ConditionalOperator( Comparison.get(), SourceLocation(), TempRef2, SourceLocation(), @@ -15997,23 +16058,21 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, CompoundStmt *FusedBody = 
nullptr; SmallVector FusedBodyStmts; - for (unsigned I = 0; I < LoopSeqSize; ++I) { - + for (unsigned I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { // Assingment of the original sub-loop index to compute the logical index // IV_k = LB_k + omp.fuse.index * ST_k - ExprResult IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Mul, - MakeVarDeclRef(STVarDecls[I]), MakeIVRef()); + MakeVarDeclRef(STVarDecls[J]), MakeIVRef()); if (!IdxExpr.isUsable()) return StmtError(); IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Add, - MakeVarDeclRef(LBVarDecls[I]), IdxExpr.get()); + MakeVarDeclRef(LBVarDecls[J]), IdxExpr.get()); if (!IdxExpr.isUsable()) return StmtError(); IdxExpr = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_Assign, - MakeVarDeclRef(IVVarDecls[I]), IdxExpr.get()); + MakeVarDeclRef(IVVarDecls[J]), IdxExpr.get()); if (!IdxExpr.isUsable()) return StmtError(); @@ -16028,7 +16087,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, Stmt *Body = (isa(LoopStmts[I])) ? cast(LoopStmts[I])->getBody() : cast(LoopStmts[I])->getBody(); - BodyStmts.push_back(Body); CompoundStmt *CombinedBody = @@ -16036,7 +16094,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, SourceLocation(), SourceLocation()); ExprResult Condition = SemaRef.BuildBinOp(CurScope, SourceLocation(), BO_LT, MakeIVRef(), - MakeVarDeclRef(NIVarDecls[I])); + MakeVarDeclRef(NIVarDecls[J])); if (!Condition.isUsable()) return StmtError(); @@ -16057,8 +16115,26 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, FusedBody, InitStmt.get()->getBeginLoc(), SourceLocation(), IncrExpr.get()->getEndLoc()); + // In the case of looprange, the result of fuse won't simply + // be a single loop (ForStmt), but rather a loop sequence + // (CompoundStmt) of 3 parts: the pre-fusion loops, the fused loop + // and the post-fusion loops, preserving its original order. 
+ Stmt *FusionStmt = FusedForStmt; + if (LRC) { + SmallVector FinalLoops; + // Gather all the pre-fusion loops + for (unsigned I = 0; I < FirstVal - 1; ++I) + FinalLoops.push_back(LoopStmts[I]); + // Gather the fused loop + FinalLoops.push_back(FusedForStmt); + // Gather all the post-fusion loops + for (unsigned I = FirstVal + CountVal - 1; I < LoopSeqSize; ++I) + FinalLoops.push_back(LoopStmts[I]); + FusionStmt = CompoundStmt::Create(Context, FinalLoops, FPOptionsOverride(), + SourceLocation(), SourceLocation()); + } return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, NumLoops, - 1, AStmt, FusedForStmt, + NumLoopNests, AStmt, FusionStmt, buildPreInits(Context, PreInits)); } @@ -17181,6 +17257,31 @@ OMPClause *SemaOpenMP::ActOnOpenMPPartialClause(Expr *FactorExpr, FactorExpr); } +OMPClause *SemaOpenMP::ActOnOpenMPLoopRangeClause( + Expr *First, Expr *Count, SourceLocation StartLoc, SourceLocation LParenLoc, + SourceLocation FirstLoc, SourceLocation CountLoc, SourceLocation EndLoc) { + + // OpenMP [6.0, Restrictions] + // First and Count must be integer expressions with positive value + ExprResult FirstVal = + VerifyPositiveIntegerConstantInClause(First, OMPC_looprange); + if (FirstVal.isInvalid()) + First = nullptr; + + ExprResult CountVal = + VerifyPositiveIntegerConstantInClause(Count, OMPC_looprange); + if (CountVal.isInvalid()) + Count = nullptr; + + // OpenMP [6.0, Restrictions] + // first + count - 1 must not evaluate to a value greater than the + // loop sequence length of the associated canonical loop sequence. 
+ // This check must be performed afterwards due to the delayed + // parsing and computation of the associated loop sequence + return OMPLoopRangeClause::Create(getASTContext(), StartLoc, LParenLoc, + FirstLoc, CountLoc, EndLoc, First, Count); +} + OMPClause *SemaOpenMP::ActOnOpenMPAlignClause(Expr *A, SourceLocation StartLoc, SourceLocation LParenLoc, SourceLocation EndLoc) { diff --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h index 034b0c8243667..d70e2a3874c07 100644 --- a/clang/lib/Sema/TreeTransform.h +++ b/clang/lib/Sema/TreeTransform.h @@ -1775,6 +1775,14 @@ class TreeTransform { LParenLoc, EndLoc); } + OMPClause * + RebuildOMPLoopRangeClause(Expr *First, Expr *Count, SourceLocation StartLoc, + SourceLocation LParenLoc, SourceLocation FirstLoc, + SourceLocation CountLoc, SourceLocation EndLoc) { + return getSema().OpenMP().ActOnOpenMPLoopRangeClause( + First, Count, StartLoc, LParenLoc, FirstLoc, CountLoc, EndLoc); + } + /// Build a new OpenMP 'allocator' clause. /// /// By default, performs semantic analysis to build the new OpenMP clause. 
@@ -10569,6 +10577,31 @@ TreeTransform::TransformOMPPartialClause(OMPPartialClause *C) { C->getEndLoc()); } +template +OMPClause * +TreeTransform::TransformOMPLoopRangeClause(OMPLoopRangeClause *C) { + ExprResult F = getDerived().TransformExpr(C->getFirst()); + if (F.isInvalid()) + return nullptr; + + ExprResult Cn = getDerived().TransformExpr(C->getCount()); + if (Cn.isInvalid()) + return nullptr; + + Expr *First = F.get(); + Expr *Count = Cn.get(); + + bool Changed = (First != C->getFirst()) || (Count != C->getCount()); + + // If no changes and AlwaysRebuild() is false, return the original clause + if (!Changed && !getDerived().AlwaysRebuild()) + return C; + + return RebuildOMPLoopRangeClause(First, Count, C->getBeginLoc(), + C->getLParenLoc(), C->getFirstLoc(), + C->getCountLoc(), C->getEndLoc()); +} + template OMPClause * TreeTransform::TransformOMPCollapseClause(OMPCollapseClause *C) { diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp index d068f5e163176..8591eb9394fa5 100644 --- a/clang/lib/Serialization/ASTReader.cpp +++ b/clang/lib/Serialization/ASTReader.cpp @@ -11088,6 +11088,9 @@ OMPClause *OMPClauseReader::readClause() { case llvm::omp::OMPC_partial: C = OMPPartialClause::CreateEmpty(Context); break; + case llvm::omp::OMPC_looprange: + C = OMPLoopRangeClause::CreateEmpty(Context); + break; case llvm::omp::OMPC_allocator: C = new (Context) OMPAllocatorClause(); break; @@ -11489,6 +11492,14 @@ void OMPClauseReader::VisitOMPPartialClause(OMPPartialClause *C) { C->setLParenLoc(Record.readSourceLocation()); } +void OMPClauseReader::VisitOMPLoopRangeClause(OMPLoopRangeClause *C) { + C->setFirst(Record.readSubExpr()); + C->setCount(Record.readSubExpr()); + C->setLParenLoc(Record.readSourceLocation()); + C->setFirstLoc(Record.readSourceLocation()); + C->setCountLoc(Record.readSourceLocation()); +} + void OMPClauseReader::VisitOMPAllocatorClause(OMPAllocatorClause *C) { C->setAllocator(Record.readExpr()); 
C->setLParenLoc(Record.readSourceLocation()); diff --git a/clang/lib/Serialization/ASTReaderStmt.cpp b/clang/lib/Serialization/ASTReaderStmt.cpp index 6762d11d6b73e..a301e1c0b0e32 100644 --- a/clang/lib/Serialization/ASTReaderStmt.cpp +++ b/clang/lib/Serialization/ASTReaderStmt.cpp @@ -3621,7 +3621,9 @@ Stmt *ASTReader::ReadStmtFromStream(ModuleFile &F) { case STMT_OMP_FUSE_DIRECTIVE: { unsigned NumLoops = Record[ASTStmtReader::NumStmtFields]; unsigned NumClauses = Record[ASTStmtReader::NumStmtFields + 1]; - S = OMPFuseDirective::CreateEmpty(Context, NumClauses, NumLoops); + unsigned NumLoopNests = Record[ASTStmtReader::NumStmtFields + 2]; + S = OMPFuseDirective::CreateEmpty(Context, NumClauses, NumLoops, + NumLoopNests); break; } diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp index 1b3d3c22aa9f5..8548f7e50d34b 100644 --- a/clang/lib/Serialization/ASTWriter.cpp +++ b/clang/lib/Serialization/ASTWriter.cpp @@ -7782,6 +7782,14 @@ void OMPClauseWriter::VisitOMPPartialClause(OMPPartialClause *C) { Record.AddSourceLocation(C->getLParenLoc()); } +void OMPClauseWriter::VisitOMPLoopRangeClause(OMPLoopRangeClause *C) { + Record.AddStmt(C->getFirst()); + Record.AddStmt(C->getCount()); + Record.AddSourceLocation(C->getLParenLoc()); + Record.AddSourceLocation(C->getFirstLoc()); + Record.AddSourceLocation(C->getCountLoc()); +} + void OMPClauseWriter::VisitOMPAllocatorClause(OMPAllocatorClause *C) { Record.AddStmt(C->getAllocator()); Record.AddSourceLocation(C->getLParenLoc()); diff --git a/clang/test/OpenMP/fuse_ast_print.cpp b/clang/test/OpenMP/fuse_ast_print.cpp index 43ce815dab024..ac4f0d38a9c68 100644 --- a/clang/test/OpenMP/fuse_ast_print.cpp +++ b/clang/test/OpenMP/fuse_ast_print.cpp @@ -271,6 +271,73 @@ void foo7() { } +// PRINT-LABEL: void foo8( +// DUMP-LABEL: FunctionDecl {{.*}} foo8 +void foo8() { + // PRINT: #pragma omp fuse looprange(2,2) + // DUMP: OMPFuseDirective + // DUMP: OMPLooprangeClause + #pragma omp fuse 
looprange(2,2) + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + + } + +} + +//PRINT-LABEL: void foo9( +//DUMP-LABEL: FunctionTemplateDecl {{.*}} foo9 +//DUMP-LABEL: NonTypeTemplateParmDecl {{.*}} F +//DUMP-LABEL: NonTypeTemplateParmDecl {{.*}} C +template +void foo9() { + // PRINT: #pragma omp fuse looprange(F,C) + // DUMP: OMPFuseDirective + // DUMP: OMPLooprangeClause + #pragma omp fuse looprange(F,C) + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + + } +} + +// Also test instantiating the template. 
+void tfoo9() { + foo9<1, 2>(); +} + diff --git a/clang/test/OpenMP/fuse_codegen.cpp b/clang/test/OpenMP/fuse_codegen.cpp index 6c1e21092da43..d9500bed3ce31 100644 --- a/clang/test/OpenMP/fuse_codegen.cpp +++ b/clang/test/OpenMP/fuse_codegen.cpp @@ -53,6 +53,18 @@ extern "C" void foo3() { } } +extern "C" void foo4() { + double arr[256]; + + #pragma omp fuse looprange(2,2) + { + for(int i = 0; i < 128; ++i) body(i); + for(int j = 0; j < 256; j+=2) body(j); + for(int k = 0; k < 64; ++k) body(k); + for(int c = 42; auto &&v: arr) body(c,v); + } +} + #endif // CHECK1-LABEL: define dso_local void @body( @@ -777,6 +789,157 @@ extern "C" void foo3() { // CHECK1-NEXT: ret void // // +// CHECK1-LABEL: define dso_local void @foo4( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// 
CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK1-NEXT: store i32 63, ptr [[DOTOMP_UB1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP5:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP5]], 128 +// CHECK1-NEXT: br i1 [[CMP1]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP6]]) +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP7:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND2:.*]] +// CHECK1: [[FOR_COND2]]: +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP3:%.*]] = icmp slt i32 [[TMP8]], [[TMP9]] +// CHECK1-NEXT: br i1 [[CMP3]], label %[[FOR_BODY4:.*]], label %[[FOR_END17:.*]] +// CHECK1: [[FOR_BODY4]]: +// CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP5:%.*]] = icmp slt i32 [[TMP10]], [[TMP11]] +// CHECK1-NEXT: br i1 [[CMP5]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP13]], [[TMP14]] +// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP12]], [[MUL]] +// CHECK1-NEXT: store i32 [[ADD]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[MUL6:%.*]] = mul nsw i32 [[TMP15]], 2 +// CHECK1-NEXT: [[ADD7:%.*]] = add nsw i32 0, [[MUL6]] +// CHECK1-NEXT: store i32 [[ADD7]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP16]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP8:%.*]] = icmp slt i32 [[TMP17]], [[TMP18]] +// CHECK1-NEXT: br i1 [[CMP8]], label %[[IF_THEN9:.*]], label %[[IF_END14:.*]] +// CHECK1: [[IF_THEN9]]: +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL10:%.*]] = mul nsw i32 [[TMP20]], [[TMP21]] +// CHECK1-NEXT: [[ADD11:%.*]] = add nsw i32 [[TMP19]], [[MUL10]] +// CHECK1-NEXT: store i32 [[ADD11]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[MUL12:%.*]] = mul nsw i32 [[TMP22]], 1 +// CHECK1-NEXT: [[ADD13:%.*]] = add nsw i32 0, [[MUL12]] +// CHECK1-NEXT: store i32 [[ADD13]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[K]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP23]]) +// CHECK1-NEXT: br label %[[IF_END14]] +// CHECK1: [[IF_END14]]: +// CHECK1-NEXT: br label %[[FOR_INC15:.*]] +// CHECK1: [[FOR_INC15]]: +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC16:%.*]] = add nsw i32 [[TMP24]], 1 +// CHECK1-NEXT: store i32 [[INC16]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND2]], !llvm.loop [[LOOP8:![0-9]+]] +// CHECK1: [[FOR_END17]]: +// CHECK1-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[TMP25:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP25]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP26:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY18:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP26]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY18]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND19:.*]] +// CHECK1: [[FOR_COND19]]: +// CHECK1-NEXT: [[TMP27:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP28:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK1-NEXT: [[CMP20:%.*]] = icmp ne ptr [[TMP27]], [[TMP28]] +// CHECK1-NEXT: br i1 [[CMP20]], label %[[FOR_BODY21:.*]], label %[[FOR_END23:.*]] +// CHECK1: [[FOR_BODY21]]: +// CHECK1-NEXT: [[TMP29:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP29]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP32:%.*]] = load double, ptr [[TMP31]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP30]], double noundef [[TMP32]]) +// CHECK1-NEXT: br label %[[FOR_INC22:.*]] +// CHECK1: [[FOR_INC22]]: +// CHECK1-NEXT: [[TMP33:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP33]], i32 1 +// CHECK1-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND19]] +// CHECK1: [[FOR_END23]]: +// CHECK1-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @body( // CHECK2-SAME: ...) #[[ATTR0:[0-9]+]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -1259,6 +1422,157 @@ extern "C" void foo3() { // CHECK2-NEXT: ret void // // +// CHECK2-LABEL: define dso_local void @foo4( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[V:%.*]] = alloca 
ptr, align 8 +// CHECK2-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK2-NEXT: store i32 63, ptr [[DOTOMP_UB1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[CMP1:%.*]] = icmp slt i32 [[TMP5]], 128 +// CHECK2-NEXT: br i1 [[CMP1]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP6]]) +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND2:.*]] +// CHECK2: [[FOR_COND2]]: +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP3:%.*]] = icmp slt i32 [[TMP8]], [[TMP9]] +// CHECK2-NEXT: br i1 [[CMP3]], label %[[FOR_BODY4:.*]], label %[[FOR_END17:.*]] +// CHECK2: [[FOR_BODY4]]: +// CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP5:%.*]] = icmp slt i32 [[TMP10]], [[TMP11]] +// CHECK2-NEXT: br i1 [[CMP5]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i32 [[TMP13]], [[TMP14]] +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP12]], [[MUL]] +// CHECK2-NEXT: store i32 [[ADD]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL6:%.*]] = mul nsw i32 [[TMP15]], 2 +// CHECK2-NEXT: [[ADD7:%.*]] = add nsw i32 0, [[MUL6]] +// CHECK2-NEXT: store i32 [[ADD7]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP16]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP8:%.*]] = icmp slt i32 [[TMP17]], [[TMP18]] +// CHECK2-NEXT: br i1 [[CMP8]], label %[[IF_THEN9:.*]], label %[[IF_END14:.*]] +// CHECK2: [[IF_THEN9]]: +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL10:%.*]] = mul nsw i32 [[TMP20]], [[TMP21]] +// CHECK2-NEXT: [[ADD11:%.*]] = add nsw i32 [[TMP19]], [[MUL10]] +// CHECK2-NEXT: store i32 [[ADD11]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL12:%.*]] = mul nsw i32 [[TMP22]], 1 +// CHECK2-NEXT: [[ADD13:%.*]] = add nsw i32 0, [[MUL12]] +// CHECK2-NEXT: store i32 [[ADD13]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP23]]) +// CHECK2-NEXT: br label %[[IF_END14]] +// CHECK2: [[IF_END14]]: +// CHECK2-NEXT: br label %[[FOR_INC15:.*]] +// CHECK2: [[FOR_INC15]]: +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC16:%.*]] = add nsw i32 [[TMP24]], 1 +// CHECK2-NEXT: store i32 [[INC16]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND2]], !llvm.loop [[LOOP7:![0-9]+]] +// CHECK2: [[FOR_END17]]: +// CHECK2-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[TMP25:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP25]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP26:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY18:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP26]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY18]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND19:.*]] +// CHECK2: [[FOR_COND19]]: +// CHECK2-NEXT: [[TMP27:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP28:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: [[CMP20:%.*]] = icmp ne ptr [[TMP27]], [[TMP28]] +// CHECK2-NEXT: br i1 [[CMP20]], label %[[FOR_BODY21:.*]], label %[[FOR_END23:.*]] +// CHECK2: [[FOR_BODY21]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP29]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP32:%.*]] = load double, ptr [[TMP31]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP30]], double noundef [[TMP32]]) +// CHECK2-NEXT: br label %[[FOR_INC22:.*]] +// CHECK2: [[FOR_INC22]]: +// CHECK2-NEXT: [[TMP33:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP33]], i32 1 +// CHECK2-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND19]] +// CHECK2: [[FOR_END23]]: +// CHECK2-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @tfoo2( // CHECK2-SAME: ) #[[ATTR0]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -1494,7 +1808,7 @@ extern "C" void foo3() { // CHECK2-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 // CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP8:![0-9]+]] // CHECK2: [[FOR_END]]: // CHECK2-NEXT: ret void // @@ -1503,9 +1817,13 @@ extern "C" void foo3() { // CHECK1: [[META4]] = !{!"llvm.loop.mustprogress"} // CHECK1: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} // CHECK1: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +// CHECK1: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]} +// CHECK1: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]} //. // CHECK2: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]} // CHECK2: [[META4]] = !{!"llvm.loop.mustprogress"} // CHECK2: [[LOOP5]] = distinct !{[[LOOP5]], [[META4]]} // CHECK2: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} +// CHECK2: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]} +// CHECK2: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]} //. 
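Reader's note, not part of the patch: the CHECK lines above for the fused loop follow a recognizable pattern. A single counter (`.omp.fuse.index`) advances once per fused iteration, and each original body runs only while the counter is below that loop's own iteration count, with the induction variable recomputed as `lb + st * index`. The sketch below models that shape in plain C++. It is a minimal illustration under assumptions: `LoopBounds`, `fusedLoop`, and `fusedTrace` are hypothetical names, and the fused trip count is assumed to be the maximum of the individual iteration counts, which is not visible in the lines quoted here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical model of one canonical loop: iv = lb + st * logical_index.
struct LoopBounds {
  int lb; // lower bound (value of the induction variable at index 0)
  int st; // step
  int ni; // number of iterations
};

// Runs two loops as one fused loop. Each body is guarded by its own
// iteration count, mirroring the `icmp slt` + conditional branch pairs
// in the CHECK lines. Assumption: the fused loop runs max(ni0, ni1) times.
template <typename Body0, typename Body1>
void fusedLoop(const LoopBounds &l0, const LoopBounds &l1, Body0 body0,
               Body1 body1) {
  const int fuseMax = std::max(l0.ni, l1.ni);
  for (int fuseIndex = 0; fuseIndex < fuseMax; ++fuseIndex) {
    if (fuseIndex < l0.ni)
      body0(l0.lb + l0.st * fuseIndex);
    if (fuseIndex < l1.ni)
      body1(l1.lb + l1.st * fuseIndex);
  }
}

// Records the order in which the two bodies observe their induction values.
std::vector<int> fusedTrace() {
  std::vector<int> trace;
  fusedLoop({0, 1, 2}, {10, 2, 3},
            [&](int i) { trace.push_back(i); },
            [&](int k) { trace.push_back(k); });
  return trace;
}
```

With a first loop of 2 iterations and a second of 3, the bodies interleave for the first two fused iterations, and only the second body runs on the last one.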
diff --git a/clang/test/OpenMP/fuse_messages.cpp b/clang/test/OpenMP/fuse_messages.cpp index 50dedfd2c0dc6..2a2491d008a0b 100644 --- a/clang/test/OpenMP/fuse_messages.cpp +++ b/clang/test/OpenMP/fuse_messages.cpp @@ -33,6 +33,8 @@ void func() { { for (int i = 0; i < 7; ++i) ; + for(int j = 0; j < 100; ++j); + } @@ -41,6 +43,8 @@ void func() { { for (int i = 0; i < 7; ++i) ; + for(int j = 0; j < 100; ++j); + } //expected-error@+4 {{loop after '#pragma omp fuse' is not in canonical form}} @@ -50,6 +54,7 @@ void func() { for(int i = 0; i < 10; i*=2) { ; } + for(int j = 0; j < 100; ++j); } //expected-error@+2 {{loop sequence after '#pragma omp fuse' must contain at least 1 canonical loop or loop-generating construct}} @@ -73,4 +78,109 @@ void func() { for(unsigned int j = 0; j < 10; ++j); for(long long k = 0; k < 100; ++k); } -} \ No newline at end of file + + //expected-warning@+2 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}} + #pragma omp fuse + { + for(int i = 0; i < 10; ++i); + } + + //expected-warning@+1 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}} + #pragma omp fuse looprange(1, 1) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + } + + //expected-error@+1 {{argument to 'looprange' clause must be a strictly positive integer value}} + #pragma omp fuse looprange(1, -1) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + } + + //expected-error@+1 {{argument to 'looprange' clause must be a strictly positive integer value}} + #pragma omp fuse looprange(1, 0) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + } + + const int x = 1; + constexpr int y = 4; + //expected-error@+1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '4' is greater than the total number of loops '3'}} + #pragma omp fuse looprange(x,y) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j 
< 100; ++j); + for(int k = 0; k < 50; ++k); + } + + //expected-error@+1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '420' is greater than the total number of loops '3'}} + #pragma omp fuse looprange(1,420) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } +} + +// In a template context, but expression itself not instantiation-dependent +template +static void templated_func() { + + //expected-warning@+1 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}} + #pragma omp fuse looprange(2,1) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } + + //expected-error@+1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '5' is greater than the total number of loops '3'}} + #pragma omp fuse looprange(3,3) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } + +} + +template +static void templated_func_value_dependent() { + + //expected-warning@+1 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}} + #pragma omp fuse looprange(V,1) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } +} + +template +static void templated_func_type_dependent() { + constexpr T s = 1; + + //expected-error@+1 {{argument to 'looprange' clause must be a strictly positive integer value}} + #pragma omp fuse looprange(s,s-1) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } +} + + +void template_inst() { + // expected-note@+1 {{in instantiation of function template specialization 'templated_func' requested here}} + templated_func(); + // expected-note@+1 {{in instantiation of function template specialization 'templated_func_value_dependent<1>' requested here}} + 
templated_func_value_dependent<1>(); + // expected-note@+1 {{in instantiation of function template specialization 'templated_func_type_dependent' requested here}} + templated_func_type_dependent(); + +} + + diff --git a/clang/tools/libclang/CIndex.cpp b/clang/tools/libclang/CIndex.cpp index fd788ac3d69d4..38f5183b146ee 100644 --- a/clang/tools/libclang/CIndex.cpp +++ b/clang/tools/libclang/CIndex.cpp @@ -2412,6 +2412,11 @@ void OMPClauseEnqueue::VisitOMPPartialClause(const OMPPartialClause *C) { Visitor->AddStmt(C->getFactor()); } +void OMPClauseEnqueue::VisitOMPLoopRangeClause(const OMPLoopRangeClause *C) { + Visitor->AddStmt(C->getFirst()); + Visitor->AddStmt(C->getCount()); +} + void OMPClauseEnqueue::VisitOMPAllocatorClause(const OMPAllocatorClause *C) { Visitor->AddStmt(C->getAllocator()); } diff --git a/llvm/include/llvm/Frontend/OpenMP/ClauseT.h b/llvm/include/llvm/Frontend/OpenMP/ClauseT.h index e0714e812e5cd..dd51274c1aaf5 100644 --- a/llvm/include/llvm/Frontend/OpenMP/ClauseT.h +++ b/llvm/include/llvm/Frontend/OpenMP/ClauseT.h @@ -1233,6 +1233,15 @@ struct WriteT { using EmptyTrait = std::true_type; }; +// V6: [6.4.7] Looprange clause +template struct LoopRangeT { + using Begin = E; + using End = E; + + using TupleTrait = std::true_type; + std::tuple t; +}; + // --- template @@ -1263,9 +1272,10 @@ using TupleClausesT = DefaultmapT, DeviceT, DistScheduleT, DoacrossT, FromT, GrainsizeT, IfT, InitT, InReductionT, - LastprivateT, LinearT, MapT, - NumTasksT, OrderT, ReductionT, - ScheduleT, TaskReductionT, ToT>; + LastprivateT, LinearT, LoopRangeT, + MapT, NumTasksT, OrderT, + ReductionT, ScheduleT, + TaskReductionT, ToT>; template using UnionClausesT = std::variant>; diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 8286cfcadaafd..ae19385c022d0 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -271,6 +271,9 @@ def OMPC_Linear : Clause<"linear"> { def 
OMPC_Link : Clause<"link"> { let flangClass = "OmpObjectList"; } +def OMPC_LoopRange : Clause<"looprange"> { + let clangClass = "OMPLoopRangeClause"; +} def OMPC_Map : Clause<"map"> { let clangClass = "OMPMapClause"; let flangClass = "OmpMapClause"; @@ -853,6 +856,9 @@ def OMP_For : Directive<"for"> { let languages = [L_C]; } def OMP_Fuse : Directive<"fuse"> { + let allowedOnceClauses = [ + VersionedClause + ]; let association = AS_Loop; let category = CA_Executable; } >From c1e5fc3fe2ac7f126a76b44906b30029e3cc797b Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:30:39 +0000 Subject: [PATCH 3/9] Addef fuse to documentation --- clang/docs/OpenMPSupport.rst | 2 ++ clang/docs/ReleaseNotes.rst | 1 + 2 files changed, 3 insertions(+) diff --git a/clang/docs/OpenMPSupport.rst b/clang/docs/OpenMPSupport.rst index d6507071d4693..5f0e363792b32 100644 --- a/clang/docs/OpenMPSupport.rst +++ b/clang/docs/OpenMPSupport.rst @@ -376,6 +376,8 @@ implementation. +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | loop stripe transformation | :good:`done` | https://github.com/llvm/llvm-project/pull/119891 | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ +| loop fuse transformation | :good:`done` | :none:`unclaimed` | | ++-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | work distribute construct | :none:`unclaimed` | :none:`unclaimed` | | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | 
task_iteration | :none:`unclaimed` | :none:`unclaimed` | | diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 573ae97bff710..2188e42dc705c 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -1016,6 +1016,7 @@ OpenMP Support open parenthesis. (#GH139665) - An error is now emitted when OpenMP ``collapse`` and ``ordered`` clauses have an argument larger than what can fit within a 64-bit integer. +- Added support for 'omp fuse' directive. Improvements ^^^^^^^^^^^^ >From 33119f77c07cc3ecbb5b3360fd8f63a958e808c1 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:43:41 +0000 Subject: [PATCH 4/9] Refactored preinits handling and improved coverage --- clang/docs/OpenMPSupport.rst | 2 +- clang/include/clang/AST/StmtOpenMP.h | 5 +- clang/include/clang/Sema/SemaOpenMP.h | 96 +- clang/lib/AST/StmtOpenMP.cpp | 13 + clang/lib/Basic/OpenMPKinds.cpp | 3 +- clang/lib/CodeGen/CGExpr.cpp | 2 + clang/lib/CodeGen/CodeGenFunction.h | 4 + clang/lib/Sema/SemaOpenMP.cpp | 588 ++++--- clang/test/OpenMP/fuse_ast_print.cpp | 55 + clang/test/OpenMP/fuse_codegen.cpp | 2117 +++++++++++++++---------- 10 files changed, 1862 insertions(+), 1023 deletions(-) diff --git a/clang/docs/OpenMPSupport.rst b/clang/docs/OpenMPSupport.rst index 5f0e363792b32..b39f9d3634a63 100644 --- a/clang/docs/OpenMPSupport.rst +++ b/clang/docs/OpenMPSupport.rst @@ -376,7 +376,7 @@ implementation. 
+-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | loop stripe transformation | :good:`done` | https://github.com/llvm/llvm-project/pull/119891 | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ -| loop fuse transformation | :good:`done` | :none:`unclaimed` | | +| loop fuse transformation | :good:`prototyped` | :none:`unclaimed` | | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ | work distribute construct | :none:`unclaimed` | :none:`unclaimed` | | +-------------------------------------------------------------+---------------------------+---------------------------+--------------------------------------------------------------------------+ diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index 85bde292ca748..b6a948a8c6020 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -1005,8 +1005,7 @@ class OMPLoopTransformationDirective : public OMPLoopBasedDirective { Stmt::StmtClass C = T->getStmtClass(); return C == OMPTileDirectiveClass || C == OMPUnrollDirectiveClass || C == OMPReverseDirectiveClass || C == OMPInterchangeDirectiveClass || - C == OMPStripeDirectiveClass || - C == OMPFuseDirectiveClass; + C == OMPStripeDirectiveClass || C == OMPFuseDirectiveClass; } }; @@ -5653,6 +5652,8 @@ class OMPStripeDirective final : public OMPLoopTransformationDirective { llvm::omp::OMPD_stripe, StartLoc, EndLoc, NumLoops) { setNumGeneratedLoops(2 * NumLoops); + // Similar to Tile, it only generates a single top level loop nest + setNumGeneratedLoopNests(1); } void 
setPreInits(Stmt *PreInits) { diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index f4a075e54cebe..ac4cbe3709a0d 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -1493,16 +1493,96 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, Stmt *&Body, SmallVectorImpl> &OriginalInits); - /// Analyzes and checks a loop sequence for use by a loop transformation + /// @brief Categories of loops encountered during semantic OpenMP loop + /// analysis + /// + /// This enumeration identifies the structural category of a loop or sequence + /// of loops analyzed in the context of OpenMP transformations and directives. + /// This categorization helps differentiate between original source loops + /// and the structures resulting from applying OpenMP loop transformations. + enum class OMPLoopCategory { + + /// @var OMPLoopCategory::RegularLoop + /// Represents a standard canonical loop nest found in the + /// original source code or an intact loop after transformations + /// (i.e Post/Pre loops of a loopranged fusion) + RegularLoop, + + /// @var OMPLoopCategory::TransformSingleLoop + /// Represents the resulting loop structure when an OpenMP loop + // transformation, generates a single, top-level loop + TransformSingleLoop, + + /// @var OMPLoopCategory::TransformLoopSequence + /// Represents the resulting loop structure when an OpenMP loop + /// transformation + /// generates a sequence of two or more canonical loop nests + TransformLoopSequence + }; + + /// The main recursive process of `checkTransformableLoopSequence` that + /// performs grammatical parsing of a canonical loop sequence. It extracts + /// key information, such as the number of top-level loops, loop statements, + /// helper expressions, and other relevant loop-related data, all in a single + /// execution to avoid redundant traversals. 
This analysis flattens inner + /// Loop Sequences + /// + /// \param LoopSeqStmt The AST of the original statement. + /// \param LoopSeqSize [out] Number of top level canonical loops. + /// \param NumLoops [out] Number of total canonical loops (nested too). + /// \param LoopHelpers [out] The multiple loop analyses results. + /// \param ForStmts [out] The multiple Stmt of each For loop. + /// \param OriginalInits [out] The raw original initialization statements + /// of each belonging to a loop of the loop sequence + /// \param TransformPreInits [out] The multiple collection of statements and + /// declarations that must have been executed/declared + /// before entering the loop (each belonging to a + /// particular loop transformation, nullptr otherwise) + /// \param LoopSequencePreInits [out] Additional general collection of loop + /// transformation related statements and declarations + /// not bounded to a particular loop that must be + /// executed before entering the loop transformation + /// \param LoopCategories [out] A sequence of OMPLoopCategory values, + /// one for each loop or loop transformation node + /// successfully analyzed. + /// \param Context + /// \param Kind The loop transformation directive kind. + /// \return Whether the original statement is both syntactically and + /// semantically correct according to OpenMP 6.0 canonical loop + /// sequence definition. + bool analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind); + + /// Validates and checks whether a loop sequence can be transformed according + /// to the given directive, providing necessary setup and initialization + /// (Driver function) before recursion using `analyzeLoopSequence`. 
/// /// \param Kind The loop transformation directive kind. - /// \param NumLoops [out] Number of total canonical loops - /// \param LoopSeqSize [out] Number of top level canonical loops + /// \param AStmt The AST of the original statement + /// \param LoopSeqSize [out] Number of top level canonical loops. + /// \param NumLoops [out] Number of total canonical loops (nested too) /// \param LoopHelpers [out] The multiple loop analyses results. - /// \param LoopStmts [out] The multiple Stmt of each For loop. - /// \param OriginalInits [out] The multiple collection of statements and + /// \param ForStmts [out] The multiple Stmt of each For loop. + /// \param OriginalInits [out] The raw original initialization statements + /// of each belonging to a loop of the loop sequence + /// \param TransformsPreInits [out] The multiple collection of statements and /// declarations that must have been executed/declared - /// before entering the loop. + /// before entering the loop (each belonging to a + /// particular loop transformation, nullptr otherwise) + /// \param LoopSequencePreInits [out] Additional general collection of loop + /// transformation related statements and declarations + /// not bounded to a particular loop that must be + /// executed before entering the loop transformation + /// \param LoopCategories [out] A sequence of OMPLoopCategory values, + /// one for each loop or loop transformation node + /// successfully analyzed. /// \param Context /// \return Whether there was an absence of errors or not bool checkTransformableLoopSequence( @@ -1511,7 +1591,9 @@ class SemaOpenMP : public SemaBase { SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, SmallVectorImpl> &OriginalInits, - ASTContext &Context); + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context); /// Helper to keep information about the current `omp begin/end declare /// variant` nesting. 
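Reader's note, not part of the patch: the doc comments above distinguish `LoopSeqSize` (top-level canonical loop nests) from `NumLoops` (all loops, nested ones included) and state that inner loop sequences are flattened. The sketch below shows that counting rule on a toy tree. It is an illustration under assumptions, not the Sema implementation: `Node`, `Counts`, `countLoops`, `analyze`, and `exampleCounts` are hypothetical names, and the toy tree stands in for the Clang AST without modelling loop transformations or pre-inits.

```cpp
#include <vector>

// Hypothetical toy AST: a node is either a loop (children are loops nested in
// its body) or a brace-enclosed loop sequence (children are its members).
struct Node {
  enum Kind { Loop, Sequence } kind;
  std::vector<Node> children;
};

struct Counts {
  unsigned loopSeqSize = 0; // top-level loop nests, after flattening
  unsigned numLoops = 0;    // every loop, nested ones included
};

// Counts the loops in the subtree rooted at `n`, including `n` if it is a loop.
unsigned countLoops(const Node &n) {
  unsigned total = (n.kind == Node::Loop) ? 1u : 0u;
  for (const Node &child : n.children)
    total += countLoops(child);
  return total;
}

// Walks a loop sequence recursively. Nested sequences are flattened, so their
// member loops count as top-level loop nests of the enclosing sequence.
void analyze(const Node &n, Counts &out) {
  if (n.kind == Node::Sequence) {
    for (const Node &child : n.children)
      analyze(child, out);
    return;
  }
  ++out.loopSeqSize;
  out.numLoops += countLoops(n);
}

// { for i { for j {} }   { for k {} } }
Counts exampleCounts() {
  Node loopJ{Node::Loop, {}};
  Node loopI{Node::Loop, {loopJ}};
  Node loopK{Node::Loop, {}};
  Node inner{Node::Sequence, {loopK}};
  Node outer{Node::Sequence, {loopI, inner}};
  Counts counts;
  analyze(outer, counts);
  return counts;
}
```

For the example tree, the two-deep nest and the loop inside the inner braces give two top-level nests and three loops in total.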
diff --git a/clang/lib/AST/StmtOpenMP.cpp b/clang/lib/AST/StmtOpenMP.cpp index 06c987e7f1761..e6b52792885ba 100644 --- a/clang/lib/AST/StmtOpenMP.cpp +++ b/clang/lib/AST/StmtOpenMP.cpp @@ -457,6 +457,8 @@ OMPUnrollDirective::Create(const ASTContext &C, SourceLocation StartLoc, C, Clauses, AssociatedStmt, TransformedStmtOffset + 1, StartLoc, EndLoc); Dir->setNumGeneratedLoops(NumGeneratedLoops); // The number of generated loops and loop nests during unroll matches + // given that unroll only generates top level canonical loop nests + // so each generated loop is a top level canonical loop nest Dir->setNumGeneratedLoopNests(NumGeneratedLoops); Dir->setTransformedStmt(TransformedStmt); Dir->setPreInits(PreInits); @@ -517,6 +519,17 @@ OMPFuseDirective *OMPFuseDirective::Create( NumLoops); Dir->setTransformedStmt(TransformedStmt); Dir->setPreInits(PreInits); + // The number of top level canonical nests could + // not match the total number of generated loops + // Example: + // Before fusion: + // for (int i = 0; i < N; ++i) + // for (int j = 0; j < M; ++j) + // A[i][j] = i + j; + // + // for (int k = 0; k < P; ++k) + // B[k] = k * 2; + // Here, NumLoopNests = 2, but NumLoops = 3. 
Dir->setNumGeneratedLoopNests(NumLoopNests); Dir->setNumGeneratedLoops(NumLoops); return Dir; diff --git a/clang/lib/Basic/OpenMPKinds.cpp b/clang/lib/Basic/OpenMPKinds.cpp index 18330181f1509..53a9f80e6d3b7 100644 --- a/clang/lib/Basic/OpenMPKinds.cpp +++ b/clang/lib/Basic/OpenMPKinds.cpp @@ -704,7 +704,8 @@ bool clang::isOpenMPLoopBoundSharingDirective(OpenMPDirectiveKind Kind) { bool clang::isOpenMPLoopTransformationDirective(OpenMPDirectiveKind DKind) { return DKind == OMPD_tile || DKind == OMPD_unroll || DKind == OMPD_reverse || - DKind == OMPD_interchange || DKind == OMPD_stripe || DKind == OMPD_fuse; + DKind == OMPD_interchange || DKind == OMPD_stripe || + DKind == OMPD_fuse; } bool clang::isOpenMPCombinedParallelADirective(OpenMPDirectiveKind DKind) { diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 7cb7ee20fcf6a..1671f07bc2760 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -3242,6 +3242,8 @@ LValue CodeGenFunction::EmitDeclRefLValue(const DeclRefExpr *E) { // No other cases for now. } else { + llvm::dbgs() << "THE DAMN DECLREFEXPR HASN'T BEEN ENTERED IN LOCALDECLMAP\n"; + VD->dumpColor(); llvm_unreachable("DeclRefExpr for Decl not entered in LocalDeclMap?"); } diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h index a983901f560de..ce00198c396b6 100644 --- a/clang/lib/CodeGen/CodeGenFunction.h +++ b/clang/lib/CodeGen/CodeGenFunction.h @@ -5414,6 +5414,10 @@ class CodeGenFunction : public CodeGenTypeCache { /// Set the address of a local variable. 
void setAddrOfLocalVar(const VarDecl *VD, Address Addr) { + if (LocalDeclMap.count(VD)) { + llvm::errs() << "Warning: VarDecl already exists in map: "; + VD->dumpColor(); + } assert(!LocalDeclMap.count(VD) && "Decl already exists in LocalDeclMap!"); LocalDeclMap.insert({VD, Addr}); } diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index 556b5cb43b6f8..b0529c9352c83 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -22,6 +22,7 @@ #include "clang/AST/DeclOpenMP.h" #include "clang/AST/DynamicRecursiveASTVisitor.h" #include "clang/AST/OpenMPClause.h" +#include "clang/AST/RecursiveASTVisitor.h" #include "clang/AST/StmtCXX.h" #include "clang/AST/StmtOpenMP.h" #include "clang/AST/StmtVisitor.h" @@ -47,6 +48,7 @@ #include "llvm/Frontend/OpenMP/OMPConstants.h" #include "llvm/IR/Assumptions.h" #include +#include using namespace clang; using namespace llvm::omp; @@ -14157,6 +14159,45 @@ StmtResult SemaOpenMP::ActOnOpenMPTargetTeamsDistributeSimdDirective( getASTContext(), StartLoc, EndLoc, NestedLoopCount, Clauses, AStmt, B); } +// Overloaded base case function +template +static bool tryHandleAs(T *t, F &&) { + return false; +} + +/** + * Tries to recursively cast `t` to one of the given types and invokes `f` if successful. + * + * @tparam Class The first type to check. + * @tparam Rest The remaining types to check. + * @tparam T The base type of `t`. + * @tparam F The callable type for the function to invoke upon a successful cast. + * @param t The object to be checked. + * @param f The function to invoke if `t` matches `Class`. + * @return `true` if `t` matched any type and `f` was called, otherwise `false`. + */ +template +static bool tryHandleAs(T *t, F &&f) { + if (Class *c = dyn_cast(t)) { + f(c); + return true; + } else { + return tryHandleAs(t, std::forward(f)); + } +} + +// Updates OriginalInits by checking Transform against loop transformation +// directives and appending their pre-inits if a match is found. 
+static void updatePreInits(OMPLoopBasedDirective *Transform, + SmallVectorImpl> &PreInits) { + if (!tryHandleAs( + Transform, [&PreInits](auto *Dir) { + appendFlattenedStmtList(PreInits.back(), Dir->getPreInits()); + })) + llvm_unreachable("Unhandled loop transformation"); +} + bool SemaOpenMP::checkTransformableLoopNest( OpenMPDirectiveKind Kind, Stmt *AStmt, int NumLoops, SmallVectorImpl &LoopHelpers, @@ -14187,121 +14228,106 @@ bool SemaOpenMP::checkTransformableLoopNest( return false; }, [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); + updatePreInits(Transform, OriginalInits); }); assert(OriginalInits.back().empty() && "No preinit after innermost loop"); OriginalInits.pop_back(); return Result; } -class NestedLoopCounterVisitor - : public clang::RecursiveASTVisitor { +// Counts the total number of nested loops, including the outermost loop (the +// original loop). PRECONDITION of this visitor is that it must be invoked from +// the original loop to be analyzed. The traversal is stop for Decl's and +// Expr's given that they may contain inner loops that must not be counted. 
+// +// Example AST structure for the code: +// +// int main() { +// #pragma omp fuse +// { +// for (int i = 0; i < 100; i++) { <-- Outer loop +// []() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// }; +// for(int j = 0; j < 5; ++j) {} <-- Inner loop +// } +// for (int r = 0; i < 100; i++) { <-- Outer loop +// struct LocalClass { +// void bar() { +// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +// } +// }; +// for(int k = 0; k < 10; ++k) {} <-- Inner loop +// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +// } +// } +// } +// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { +private: + unsigned NestedLoopCount = 0; + public: - explicit NestedLoopCounterVisitor() : NestedLoopCount(0) {} + explicit NestedLoopCounterVisitor() {} - bool VisitForStmt(clang::ForStmt *FS) { - ++NestedLoopCount; - return true; + unsigned getNestedLoopCount() const { return NestedLoopCount; } + + bool VisitForStmt(ForStmt *FS) override { + ++NestedLoopCount; + return true; } - bool VisitCXXForRangeStmt(clang::CXXForRangeStmt *FRS) { - ++NestedLoopCount; - return true; + bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { + ++NestedLoopCount; + return true; } - unsigned getNestedLoopCount() const { return NestedLoopCount; } + bool TraverseStmt(Stmt *S) override { + if (!S) + return true; -private: - unsigned NestedLoopCount; + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. 
+ // Therefore must not be counted + if (isa(S)) + return true; + + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || + isa(S)) { + return DynamicRecursiveASTVisitor::TraverseStmt(S); + } + + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; + } + + bool TraverseDecl(Decl *D) override { + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) + return true; + } }; -bool SemaOpenMP::checkTransformableLoopSequence( - OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, - unsigned &NumLoops, +bool SemaOpenMP::analyzeLoopSequence( + Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, SmallVectorImpl> &OriginalInits, - ASTContext &Context) { + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context, + OpenMPDirectiveKind Kind) { - // Checks whether the given statement is a compound statement VarsWithInheritedDSAType TmpDSA; - if (!isa(AStmt)) { - Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) - << getOpenMPDirectiveName(Kind); - return false; - } - // Callback for updating pre-inits in case there are even more - // loop-sequence-generating-constructs inside of the main compound stmt - auto OnTransformationCallback = - [&OriginalInits](OMPLoopBasedDirective *Transform) { - Stmt *DependentPreInits; - if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) - DependentPreInits = Dir->getPreInits(); - else if (auto *Dir = dyn_cast(Transform)) 
- DependentPreInits = Dir->getPreInits(); - else - llvm_unreachable("Unhandled loop transformation"); - - appendFlattenedStmtList(OriginalInits.back(), DependentPreInits); - }; - - // Number of top level canonical loop nests observed (And acts as index) - LoopSeqSize = 0; - // Number of total observed loops - NumLoops = 0; - - // Following OpenMP 6.0 API Specification, a Canonical Loop Sequence follows - // the grammar: - // - // canonical-loop-sequence: - // { - // loop-sequence+ - // } - // where loop-sequence can be any of the following: - // 1. canonical-loop-sequence - // 2. loop-nest - // 3. loop-sequence-generating-construct (i.e OMPLoopTransformationDirective) - // - // To recognise and traverse this structure the following helper functions - // have been defined. handleLoopSequence serves as the recurisve entry point - // and tries to match the input AST to the canonical loop sequence grammar - // structure - - // Helper functions to validate canonical loop sequence grammar is valid - auto isLoopSequenceDerivation = [](auto *Child) { - return isa(Child) || isa(Child) || - isa(Child); - }; - auto isLoopGeneratingStmt = [](auto *Child) { - return isa(Child); - }; - + QualType BaseInductionVarType; // Helper Lambda to handle storing initialization and body statements for both // ForStmt and CXXForRangeStmt and checks for any possible mismatch between // induction variables types - QualType BaseInductionVarType; auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, this, &Context](Stmt *LoopStmt) { if (auto *For = dyn_cast(LoopStmt)) { @@ -14324,33 +14350,35 @@ bool SemaOpenMP::checkTransformableLoopSequence( } } } - } else { - assert(isa(LoopStmt) && - "Expected canonical for or range-based for loops."); - auto *CXXFor = dyn_cast(LoopStmt); + auto *CXXFor = cast(LoopStmt); OriginalInits.back().push_back(CXXFor->getBeginStmt()); ForStmts.push_back(CXXFor); } }; + // Helper lambda functions to encapsulate the processing of different // 
derivations of the canonical loop sequence grammar // // Modularized code for handling loop generation and transformations - auto handleLoopGeneration = [&storeLoopStatements, &LoopHelpers, - &OriginalInits, &LoopSeqSize, &NumLoops, Kind, - &TmpDSA, &OnTransformationCallback, - this](Stmt *Child) { + auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, + &OriginalInits, &TransformsPreInits, + &LoopCategories, &LoopSeqSize, &NumLoops, Kind, + &TmpDSA, &ForStmts, &Context, + &LoopSequencePreInits, this](Stmt *Child) { auto LoopTransform = dyn_cast(Child); Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); - + unsigned NumGeneratedLoops = LoopTransform->getNumGeneratedLoops(); // Handle the case where transformed statement is not available due to // dependent contexts if (!TransformedStmt) { - if (NumGeneratedLoopNests > 0) + if (NumGeneratedLoopNests > 0) { + LoopSeqSize += NumGeneratedLoopNests; + NumLoops += NumGeneratedLoops; return true; - // Unroll full + } + // Unroll full (0 loops produced) else { Diag(Child->getBeginLoc(), diag::err_omp_not_for) << 0 << getOpenMPDirectiveName(Kind); @@ -14363,38 +14391,56 @@ bool SemaOpenMP::checkTransformableLoopSequence( Diag(Child->getBeginLoc(), diag::err_omp_not_for) << 0 << getOpenMPDirectiveName(Kind); return false; - // Future loop transformations that generate multiple canonical loops - } else if (NumGeneratedLoopNests > 1) { - llvm_unreachable("Multiple canonical loop generating transformations " - "like loop splitting are not yet supported"); } + // Loop transformatons such as split or loopranged fuse + else if (NumGeneratedLoopNests > 1) { + // Get the preinits related to this loop sequence generating + // loop transformation (i.e loopranged fuse, split...) 
+ LoopSequencePreInits.emplace_back(); + // These preinits differ slightly from regular inits/pre-inits related + // to single loop generating loop transformations (interchange, unroll) + // given that they are not bounded to a particular loop nest + // so they need to be treated independently + updatePreInits(LoopTransform, LoopSequencePreInits); + return analyzeLoopSequence(TransformedStmt, LoopSeqSize, NumLoops, + LoopHelpers, ForStmts, OriginalInits, + TransformsPreInits, LoopSequencePreInits, + LoopCategories, Context, Kind); + } + // Vast majority: (Tile, Unroll, Stripe, Reverse, Interchange, Fuse all) + else { + // Process the transformed loop statement + OriginalInits.emplace_back(); + TransformsPreInits.emplace_back(); + LoopHelpers.emplace_back(); + LoopCategories.push_back(OMPLoopCategory::TransformSingleLoop); + + unsigned IsCanonical = + checkOpenMPLoop(Kind, nullptr, nullptr, TransformedStmt, SemaRef, + *DSAStack, TmpDSA, LoopHelpers[LoopSeqSize]); + + if (!IsCanonical) { + Diag(TransformedStmt->getBeginLoc(), diag::err_omp_not_canonical_loop) + << getOpenMPDirectiveName(Kind); + return false; + } + storeLoopStatements(TransformedStmt); + updatePreInits(LoopTransform, TransformsPreInits); - // Process the transformed loop statement - Child = TransformedStmt; - OriginalInits.emplace_back(); - LoopHelpers.emplace_back(); - OnTransformationCallback(LoopTransform); - - unsigned IsCanonical = - checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, - TmpDSA, LoopHelpers[LoopSeqSize]); - - if (!IsCanonical) { - Diag(Child->getBeginLoc(), diag::err_omp_not_canonical_loop) - << getOpenMPDirectiveName(Kind); - return false; + NumLoops += NumGeneratedLoops; + ++LoopSeqSize; + return true; } - storeLoopStatements(TransformedStmt); - NumLoops += LoopTransform->getNumGeneratedLoops(); - return true; }; // Modularized code for handling regular canonical loops - auto handleRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, - &LoopSeqSize, 
&NumLoops, Kind, &TmpDSA, - this](Stmt *Child) { + auto analyzeRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, + &LoopSeqSize, &NumLoops, Kind, &TmpDSA, + &LoopCategories, this](Stmt *Child) { OriginalInits.emplace_back(); LoopHelpers.emplace_back(); + LoopCategories.push_back(OMPLoopCategory::RegularLoop); + unsigned IsCanonical = checkOpenMPLoop(Kind, nullptr, nullptr, Child, SemaRef, *DSAStack, TmpDSA, LoopHelpers[LoopSeqSize]); @@ -14412,57 +14458,114 @@ bool SemaOpenMP::checkTransformableLoopSequence( return true; }; - // Helper function to process a Loop Sequence Recursively - auto handleLoopSequence = [&](Stmt *LoopSeqStmt, - auto &handleLoopSequenceCallback) -> bool { - for (auto *Child : LoopSeqStmt->children()) { - if (!Child) - continue; + // Helper functions to validate canonical loop sequence grammar is valid + auto isLoopSequenceDerivation = [](auto *Child) { + return isa(Child) || isa(Child) || + isa(Child); + }; + auto isLoopGeneratingStmt = [](auto *Child) { + return isa(Child); + }; + - // Skip over non-loop-sequence statements - if (!isLoopSequenceDerivation(Child)) { - Child = Child->IgnoreContainers(); + // High level grammar validation + for (auto *Child : LoopSeqStmt->children()) { - // Ignore empty compound statement if (!Child) - continue; + continue; - // In the case of a nested loop sequence ignoring containers would not - // be enough, a recurisve transversal of the loop sequence is required - if (isa(Child)) { - if (!handleLoopSequenceCallback(Child, handleLoopSequenceCallback)) - return false; - // Already been treated, skip this children - continue; + // Skip over non-loop-sequence statements + if (!isLoopSequenceDerivation(Child)) { + Child = Child->IgnoreContainers(); + + // Ignore empty compound statement + if (!Child) + continue; + + // In the case of a nested loop sequence ignoring containers would not + // be enough, a recurisve transversal of the loop sequence is required + if (isa(Child)) { + if 
(!analyzeLoopSequence(Child, LoopSeqSize, NumLoops, LoopHelpers, + ForStmts, OriginalInits, TransformsPreInits, + LoopSequencePreInits, LoopCategories, Context, + Kind)) + return false; + // Already been treated, skip this children + continue; + } + } + // Regular loop sequence handling + if (isLoopSequenceDerivation(Child)) { + if (isLoopGeneratingStmt(Child)) { + if (!analyzeLoopGeneration(Child)) { + return false; } + // analyzeLoopGeneration updates Loop Sequence size accordingly + + } else { + if (!analyzeRegularLoop(Child)) { + return false; + } + // Update the Loop Sequence size by one + ++LoopSeqSize; } - // Regular loop sequence handling - if (isLoopSequenceDerivation(Child)) { - if (isLoopGeneratingStmt(Child)) { - if (!handleLoopGeneration(Child)) { - return false; - } } else { - if (!handleRegularLoop(Child)) { - return false; - } + // Report error for invalid statement inside canonical loop sequence + Diag(Child->getBeginLoc(), diag::err_omp_not_for) + << 0 << getOpenMPDirectiveName(Kind); + return false; } - ++LoopSeqSize; - } else { - // Report error for invalid statement inside canonical loop sequence - Diag(Child->getBeginLoc(), diag::err_omp_not_for) - << 0 << getOpenMPDirectiveName(Kind); + } + return true; +} + +bool SemaOpenMP::checkTransformableLoopSequence( + OpenMPDirectiveKind Kind, Stmt *AStmt, unsigned &LoopSeqSize, + unsigned &NumLoops, + SmallVectorImpl &LoopHelpers, + SmallVectorImpl &ForStmts, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl &LoopCategories, ASTContext &Context) { + + // Checks whether the given statement is a compound statement + if (!isa(AStmt)) { + Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) + << getOpenMPDirectiveName(Kind); return false; - } - } - return true; - }; + } + // Number of top level canonical loop nests observed (And acts as index) + LoopSeqSize = 0; + // Number of total observed loops + NumLoops = 
0; + + // Following OpenMP 6.0 API Specification, a Canonical Loop Sequence follows + // the grammar: + // + // canonical-loop-sequence: + // { + // loop-sequence+ + // } + // where loop-sequence can be any of the following: + // 1. canonical-loop-sequence + // 2. loop-nest + // 3. loop-sequence-generating-construct (i.e OMPLoopTransformationDirective) + // + // To recognise and traverse this structure the following helper functions + // have been defined. analyzeLoopSequence serves as the recurisve entry point + // and tries to match the input AST to the canonical loop sequence grammar + // structure. This function will perform both a semantic and syntactical + // analysis of the given statement according to OpenMP 6.0 definition of + // the aforementioned canonical loop sequence // Recursive entry point to process the main loop sequence - if (!handleLoopSequence(AStmt, handleLoopSequence)) { - return false; + if (!analyzeLoopSequence(AStmt, LoopSeqSize, NumLoops, LoopHelpers, ForStmts, + OriginalInits, TransformsPreInits, + LoopSequencePreInits, LoopCategories, Context, + Kind)) { + return false; } - if (LoopSeqSize <= 0) { Diag(AStmt->getBeginLoc(), diag::err_omp_empty_loop_sequence) << getOpenMPDirectiveName(Kind); @@ -14494,9 +14597,7 @@ static void addLoopPreInits(ASTContext &Context, RangeEnd->getBeginLoc(), RangeEnd->getEndLoc())); } - llvm::append_range(PreInits, OriginalInit); - // List of OMPCapturedExprDecl, for __begin, __end, and NumIterations if (auto *PI = cast_or_null(LoopHelper.PreInits)) { PreInits.push_back(new (Context) DeclStmt( @@ -15177,7 +15278,7 @@ StmtResult SemaOpenMP::ActOnOpenMPUnrollDirective(ArrayRef Clauses, Stmt *LoopStmt = nullptr; collectLoopStmts(AStmt, {LoopStmt}); - // Determine the PreInit declarations. 
+ // Determine the PreInit declarations.e SmallVector PreInits; addLoopPreInits(Context, LoopHelper, LoopStmt, OriginalInits[0], PreInits); @@ -15744,28 +15845,35 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, if (!AStmt) { return StmtError(); } + + unsigned NumLoops = 1; + unsigned LoopSeqSize = 1; + + // Defer transformation in dependent contexts + // The NumLoopNests argument is set to a placeholder 1 (even though + // using looprange fuse could yield up to 3 top level loop nests) + // because a dependent context could prevent determining its true value + if (CurrContext->isDependentContext()) { + return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, + NumLoops, LoopSeqSize, AStmt, nullptr, + nullptr); + } + // Validate that the potential loop sequence is transformable for fusion // Also collect the HelperExprs, Loop Stmts, Inits, and Number of loops SmallVector LoopHelpers; SmallVector LoopStmts; SmallVector> OriginalInits; - - unsigned NumLoops; - unsigned LoopSeqSize; + SmallVector> TransformsPreInits; + SmallVector> LoopSequencePreInits; + SmallVector LoopCategories; if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, LoopHelpers, LoopStmts, OriginalInits, - Context)) { + TransformsPreInits, LoopSequencePreInits, + LoopCategories, Context)) { return StmtError(); } - // Defer transformation in dependent contexts - // The NumLoopNests argument is set to a placeholder (0) - // because a dependent context could prevent determining its true value - if (CurrContext->isDependentContext()) { - return OMPFuseDirective::Create(Context, StartLoc, EndLoc, Clauses, - NumLoops, 0, AStmt, nullptr, nullptr); - } - // Handle clauses, which can be any of the following: [looprange, apply] const OMPLoopRangeClause *LRC = OMPExecutableDirective::getSingleClause(Clauses); @@ -15827,11 +15935,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, "Expecting loop iteration space dimensionality to match 
number of " "affected loops"); - // PreInits hold a sequence of variable declarations that must be executed - // before the fused loop begins. These include bounds, strides, and other - // helper variables required for the transformation. - SmallVector PreInits; - // Select the type with the largest bit width among all induction variables QualType IVType = LoopHelpers[FirstVal - 1].IterationVarRef->getType(); for (unsigned int I = FirstVal; I < LastVal; ++I) { @@ -15843,7 +15946,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, uint64_t IVBitWidth = Context.getIntWidth(IVType); // Create pre-init declarations for all loops lower bounds, upper bounds, - // strides and num-iterations + // strides and num-iterations for every top level loop in the fusion SmallVector LBVarDecls; SmallVector STVarDecls; SmallVector NIVarDecls; @@ -15881,12 +15984,62 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, return std::make_pair(VD, DeclStmt); }; + // PreInits hold a sequence of variable declarations that must be executed + // before the fused loop begins. These include bounds, strides, and other + // helper variables required for the transformation. Other loop transforms + // also contain their own preinits + SmallVector PreInits; + // Iterator to keep track of loop transformations + unsigned int TransformIndex = 0; + + // Update the general preinits using the preinits generated by loop sequence + // generating loop transformations. These preinits differ slightly from + // single-loop transformation preinits, as they can be detached from a + // specific loop inside the multiple generated loop nests. This happens + // because certain helper variables, like '.omp.fuse.max', are introduced to + // handle fused iteration spaces and may not be directly tied to a single + // original loop. the preinit structure must ensure that hidden variables + // like '.omp.fuse.max' are still properly handled. 
+ // Transformations that apply this concept: Loopranged Fuse, Split + if (!LoopSequencePreInits.empty()) { + for (const auto &LTPreInits : LoopSequencePreInits) { + if (!LTPreInits.empty()) { + llvm::append_range(PreInits, LTPreInits); + } + } + } + // Process each single loop to generate and collect declarations - // and statements for all helper expressions + // and statements for all helper expressions related to + // particular single loop nests + + // Also In the case of the fused loops, we keep track of their original + // inits by appending them to their preinits statement, and in the case of + // transformations, also append their preinits (which contain the original + // loop initialization statement or other statements) + + // Firstly we need to update TransformIndex to match the begining of the + // looprange section + for (unsigned int I = 0; I < FirstVal - 1; ++I) { + if (LoopCategories[I] == OMPLoopCategory::TransformSingleLoop) + ++TransformIndex; + } for (unsigned int I = FirstVal - 1, J = 0; I < LastVal; ++I, ++J) { - addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], - PreInits); + if (LoopCategories[I] == OMPLoopCategory::RegularLoop) { + addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], + PreInits); + } else if (LoopCategories[I] == OMPLoopCategory::TransformSingleLoop) { + // For transformed loops, insert both pre-inits and original inits. + // Order matters: pre-inits may define variables used in the original + // inits such as upper bounds... 
+ auto TransformPreInit = TransformsPreInits[TransformIndex++]; + if (!TransformPreInit.empty()) { + llvm::append_range(PreInits, TransformPreInit); + } + addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], + PreInits); + } auto [UBVD, UBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].UB, "ub", J); auto [LBVD, LBDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].LB, "lb", J); auto [STVD, STDStmt] = CreateHelperVarAndStmt(LoopHelpers[I].ST, "st", J); @@ -15905,7 +16058,6 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, NIVarDecls.push_back(NIVD); IVVarDecls.push_back(IVVD); - PreInits.push_back(UBDStmt.get()); PreInits.push_back(LBDStmt.get()); PreInits.push_back(STDStmt.get()); PreInits.push_back(NIDStmt.get()); @@ -16081,6 +16233,7 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, BodyStmts.push_back(IdxExpr.get()); llvm::append_range(BodyStmts, LoopHelpers[I].Updates); + // If the loop is a CXXForRangeStmt then the iterator variable is needed if (auto *SourceCXXFor = dyn_cast(LoopStmts[I])) BodyStmts.push_back(SourceCXXFor->getLoopVarStmt()); @@ -16115,21 +16268,50 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, FusedBody, InitStmt.get()->getBeginLoc(), SourceLocation(), IncrExpr.get()->getEndLoc()); - // In the case of looprange, the result of fuse won't simply - // be a single loop (ForStmt), but rather a loop sequence - // (CompoundStmt) of 3 parts: the pre-fusion loops, the fused loop - // and the post-fusion loops, preserving its original order. + // In the case of looprange, the result of fuse won't simply + // be a single loop (ForStmt), but rather a loop sequence + // (CompoundStmt) of 3 parts: the pre-fusion loops, the fused loop + // and the post-fusion loops, preserving its original order. 
+ // + // Note: If looprange clause produces a single fused loop nest then + // this compound statement wrapper is unnecessary (Therefore this + // treatment is skipped) + Stmt *FusionStmt = FusedForStmt; - if (LRC) { + if (LRC && CountVal != LoopSeqSize) { SmallVector FinalLoops; - // Gather all the pre-fusion loops - for (unsigned I = 0; I < FirstVal - 1; ++I) - FinalLoops.push_back(LoopStmts[I]); - // Gather the fused loop - FinalLoops.push_back(FusedForStmt); - // Gather all the post-fusion loops - for (unsigned I = FirstVal + CountVal - 1; I < LoopSeqSize; ++I) + // Reset the transform index + TransformIndex = 0; + + // Collect all non-fused loops before and after the fused region. + // Pre-fusion and post-fusion loops are inserted in order exploiting their + // symmetry, along with their corresponding transformation pre-inits if + // needed. The fused loop is added between the two regions. + for (unsigned I = 0; I < LoopSeqSize; ++I) { + if (I >= FirstVal - 1 && I < FirstVal + CountVal - 1) { + // Update the Transformation counter to skip already treated + // loop transformations + if (LoopCategories[I] != OMPLoopCategory::TransformSingleLoop) + ++TransformIndex; + continue; + } + + // No need to handle: + // Regular loops: they are kept intact as-is. + // Loop-sequence-generating transformations: already handled earlier. 
+ // Only TransformSingleLoop requires inserting pre-inits here + + if (LoopCategories[I] == OMPLoopCategory::TransformSingleLoop) { + auto TransformPreInit = TransformsPreInits[TransformIndex++]; + if (!TransformPreInit.empty()) { + llvm::append_range(PreInits, TransformPreInit); + } + } + FinalLoops.push_back(LoopStmts[I]); + } + + FinalLoops.insert(FinalLoops.begin() + (FirstVal - 1), FusedForStmt); FusionStmt = CompoundStmt::Create(Context, FinalLoops, FPOptionsOverride(), SourceLocation(), SourceLocation()); } diff --git a/clang/test/OpenMP/fuse_ast_print.cpp b/clang/test/OpenMP/fuse_ast_print.cpp index ac4f0d38a9c68..9d85bd1172948 100644 --- a/clang/test/OpenMP/fuse_ast_print.cpp +++ b/clang/test/OpenMP/fuse_ast_print.cpp @@ -338,6 +338,61 @@ void tfoo9() { foo9<1, 2>(); } +// PRINT-LABEL: void foo10( +// DUMP-LABEL: FunctionDecl {{.*}} foo10 +void foo10() { + // PRINT: #pragma omp fuse looprange(2,2) + // DUMP: OMPFuseDirective + // DUMP: OMPLooprangeClause + #pragma omp fuse looprange(2,2) + // PRINT: { + // DUMP: CompoundStmt + { + // PRINT: for (int i = 0; i < 10; i += 2) + // DUMP: ForStmt + for (int i = 0; i < 10; i += 2) + // PRINT: body(i) + // DUMP: CallExpr + body(i); + // PRINT: for (int ii = 0; ii < 10; ii += 2) + // DUMP: ForStmt + for (int ii = 0; ii < 10; ii += 2) + // PRINT: body(ii) + // DUMP: CallExpr + body(ii); + // PRINT: #pragma omp fuse looprange(2,2) + // DUMP: OMPFuseDirective + // DUMP: OMPLooprangeClause + #pragma omp fuse looprange(2,2) + { + // PRINT: for (int j = 10; j > 0; --j) + // DUMP: ForStmt + for (int j = 10; j > 0; --j) + // PRINT: body(j) + // DUMP: CallExpr + body(j); + // PRINT: for (int jj = 10; jj > 0; --jj) + // DUMP: ForStmt + for (int jj = 10; jj > 0; --jj) + // PRINT: body(jj) + // DUMP: CallExpr + body(jj); + // PRINT: for (int k = 0; k <= 10; ++k) + // DUMP: ForStmt + for (int k = 0; k <= 10; ++k) + // PRINT: body(k) + // DUMP: CallExpr + body(k); + // PRINT: for (int kk = 0; kk <= 10; ++kk) + // DUMP: ForStmt 
+ for (int kk = 0; kk <= 10; ++kk) + // PRINT: body(kk) + // DUMP: CallExpr + body(kk); + } + } + +} diff --git a/clang/test/OpenMP/fuse_codegen.cpp b/clang/test/OpenMP/fuse_codegen.cpp index d9500bed3ce31..742c280ed0172 100644 --- a/clang/test/OpenMP/fuse_codegen.cpp +++ b/clang/test/OpenMP/fuse_codegen.cpp @@ -65,6 +65,23 @@ extern "C" void foo4() { } } +// This exemplifies the usage of loop transformations that generate +// more than top level canonical loop nests (e.g split, loopranged fuse...) +extern "C" void foo5() { + double arr[256]; + #pragma omp fuse looprange(2,2) + { + #pragma omp fuse looprange(2,2) + { + for(int i = 0; i < 128; ++i) body(i); + for(int j = 0; j < 256; j+=2) body(j); + for(int k = 0; k < 512; ++k) body(k); + } + for(int c = 42; auto &&v: arr) body(c,v); + for(int cc = 37; auto &&vv: arr) body(cc, vv); + } +} + #endif // CHECK1-LABEL: define dso_local void @body( @@ -88,7 +105,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -97,7 +113,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -129,107 +144,103 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK1-NEXT: 
store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = 
load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: 
[[CMP:%.*]] = icmp ugt i32 [[TMP19]], [[TMP20]] // CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] // CHECK1: [[COND_TRUE]]: -// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 // CHECK1-NEXT: br label %[[COND_END:.*]] // CHECK1: [[COND_FALSE]]: -// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 // CHECK1-NEXT: br label %[[COND_END]] // CHECK1: [[COND_END]]: -// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP21]], %[[COND_TRUE]] ], [ [[TMP22]], %[[COND_FALSE]] ] // CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK1-NEXT: br label %[[FOR_COND:.*]] // CHECK1: [[FOR_COND]]: -// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 -// CHECK1-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP23]], [[TMP24]] // CHECK1-NEXT: br i1 [[CMP16]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK1: [[FOR_BODY]]: -// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] // CHECK1-NEXT: br i1 [[CMP17]], label 
%[[IF_THEN:.*]], label %[[IF_END:.*]] // CHECK1: [[IF_THEN]]: -// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP30]], [[TMP31]] -// CHECK1-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP28]], [[TMP29]] +// CHECK1-NEXT: [[ADD18:%.*]] = add i32 [[TMP27]], [[MUL]] // CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 -// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 -// CHECK1-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] -// CHECK1-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[MUL19:%.*]] = mul i32 [[TMP31]], [[TMP32]] +// CHECK1-NEXT: [[ADD20:%.*]] = add i32 [[TMP30]], [[MUL19]] // CHECK1-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 -// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP35]]) +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP33]]) // CHECK1-NEXT: br label %[[IF_END]] // CHECK1: [[IF_END]]: -// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP34]], [[TMP35]] // CHECK1-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] // CHECK1: [[IF_THEN22]]: -// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] -// CHECK1-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL23:%.*]] = mul i32 [[TMP37]], [[TMP38]] +// CHECK1-NEXT: [[ADD24:%.*]] = add i32 [[TMP36]], [[MUL23]] // CHECK1-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] -// CHECK1-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[MUL25:%.*]] = mul i32 
[[TMP40]], [[TMP41]] +// CHECK1-NEXT: [[ADD26:%.*]] = add i32 [[TMP39]], [[MUL25]] // CHECK1-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP44]]) +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP42]]) // CHECK1-NEXT: br label %[[IF_END27]] // CHECK1: [[IF_END27]]: // CHECK1-NEXT: br label %[[FOR_INC:.*]] // CHECK1: [[FOR_INC]]: -// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP43]], 1 // CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] // CHECK1: [[FOR_END]]: @@ -256,7 +267,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -265,7 +275,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -274,7 +283,6 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 // CHECK1-NEXT: 
[[DOTNEW_STEP21:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 @@ -304,172 +312,166 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK1-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK1-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK1-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK1-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[TMP9:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK1-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK1-NEXT: 
[[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK1-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP11:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP12]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK1-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK1-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK1-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK1-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK1-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK1-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK1-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK1-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK1-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK1-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP18:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK1-NEXT: [[TMP19:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// 
CHECK1-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP18]], [[TMP19]] +// CHECK1-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 // CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 // CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] -// CHECK1-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 -// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK1-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] // CHECK1-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK1-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] // CHECK1-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK1-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 -// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK1-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: [[TMP24:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK1-NEXT: store i32 [[TMP24]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK1-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[SUB23:%.*]] = sub i32 [[TMP25]], [[TMP26]] // CHECK1-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 -// CHECK1-NEXT: 
[[TMP29:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK1-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] -// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK1-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP27]] +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP28]] // CHECK1-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 // CHECK1-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK1-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 -// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK1-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK1-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK1-NEXT: [[ADD28:%.*]] = add i32 [[TMP29]], 1 // CHECK1-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 -// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP30]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP31]], [[TMP32]] // CHECK1-NEXT: br i1 [[CMP]], label 
%[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] // CHECK1: [[COND_TRUE]]: -// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 // CHECK1-NEXT: br label %[[COND_END:.*]] // CHECK1: [[COND_FALSE]]: -// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 // CHECK1-NEXT: br label %[[COND_END]] // CHECK1: [[COND_END]]: -// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ] +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP33]], %[[COND_TRUE]] ], [ [[TMP34]], %[[COND_FALSE]] ] // CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4 -// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 -// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 -// CHECK1-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]] +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP35]], [[TMP36]] // CHECK1-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], label %[[COND_FALSE31:.*]] // CHECK1: [[COND_TRUE30]]: -// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 // CHECK1-NEXT: br label %[[COND_END32:.*]] // CHECK1: [[COND_FALSE31]]: -// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 // CHECK1-NEXT: br label %[[COND_END32]] // CHECK1: [[COND_END32]]: -// CHECK1-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ] +// CHECK1-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP37]], %[[COND_TRUE30]] ], [ [[TMP38]], %[[COND_FALSE31]] ] // CHECK1-NEXT: store i32 [[COND33]], ptr 
[[DOTOMP_FUSE_MAX]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK1-NEXT: br label %[[FOR_COND:.*]] // CHECK1: [[FOR_COND]]: -// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 -// CHECK1-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP39]], [[TMP40]] // CHECK1-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK1: [[FOR_BODY]]: -// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]] +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP41]], [[TMP42]] // CHECK1-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] // CHECK1: [[IF_THEN]]: -// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]] -// CHECK1-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]] +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL:%.*]] = mul i32 [[TMP44]], [[TMP45]] +// CHECK1-NEXT: [[ADD36:%.*]] = add i32 [[TMP43]], [[MUL]] // CHECK1-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4 -// 
CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 -// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 -// CHECK1-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]] -// CHECK1-NEXT: [[ADD38:%.*]] = add i32 [[TMP49]], [[MUL37]] +// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK1-NEXT: [[MUL37:%.*]] = mul i32 [[TMP47]], [[TMP48]] +// CHECK1-NEXT: [[ADD38:%.*]] = add i32 [[TMP46]], [[MUL37]] // CHECK1-NEXT: store i32 [[ADD38]], ptr [[I]], align 4 -// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP52]]) +// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP49]]) // CHECK1-NEXT: br label %[[IF_END]] // CHECK1: [[IF_END]]: -// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]] +// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP50]], [[TMP51]] // CHECK1-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]] // CHECK1: [[IF_THEN40]]: -// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK1-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]] -// CHECK1-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]] +// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 
4 +// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL41:%.*]] = mul i32 [[TMP53]], [[TMP54]] +// CHECK1-NEXT: [[ADD42:%.*]] = add i32 [[TMP52]], [[MUL41]] // CHECK1-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP58:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK1-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK1-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]] -// CHECK1-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]] +// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK1-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK1-NEXT: [[MUL43:%.*]] = mul i32 [[TMP56]], [[TMP57]] +// CHECK1-NEXT: [[SUB44:%.*]] = sub i32 [[TMP55]], [[MUL43]] // CHECK1-NEXT: store i32 [[SUB44]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP61]]) +// CHECK1-NEXT: [[TMP58:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP58]]) // CHECK1-NEXT: br label %[[IF_END45]] // CHECK1: [[IF_END45]]: -// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 -// CHECK1-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]] +// CHECK1-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK1-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP59]], [[TMP60]] // CHECK1-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] // CHECK1: [[IF_THEN47]]: -// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 -// CHECK1-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 -// CHECK1-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]] -// CHECK1-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]] +// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 +// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 +// CHECK1-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL48:%.*]] = mul i32 [[TMP62]], [[TMP63]] +// CHECK1-NEXT: [[ADD49:%.*]] = add i32 [[TMP61]], [[MUL48]] // CHECK1-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4 -// CHECK1-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK1-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 -// CHECK1-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK1-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]] -// CHECK1-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]] +// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK1-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 +// CHECK1-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK1-NEXT: [[MUL50:%.*]] = mul i32 
[[TMP65]], [[TMP66]] +// CHECK1-NEXT: [[ADD51:%.*]] = add i32 [[TMP64]], [[MUL50]] // CHECK1-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 -// CHECK1-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP70]]) +// CHECK1-NEXT: [[TMP67:%.*]] = load i32, ptr [[K]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP67]]) // CHECK1-NEXT: br label %[[IF_END52]] // CHECK1: [[IF_END52]]: // CHECK1-NEXT: br label %[[FOR_INC:.*]] // CHECK1: [[FOR_INC]]: -// CHECK1-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 +// CHECK1-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add i32 [[TMP68]], 1 // CHECK1-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]] // CHECK1: [[FOR_END]]: @@ -481,13 +483,11 @@ extern "C" void foo4() { // CHECK1-NEXT: [[ENTRY:.*:]] // CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 // CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -497,48 +497,43 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4 -// 
CHECK1-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[C:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_LB116:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[CC:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[__END224:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__RANGE221:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END222:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN225:%.*]] = alloca ptr, 
align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_27:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8 -// CHECK1-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_30:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_TEMP_140:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8 -// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX46:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX52:%.*]] = alloca i64, align 8 // CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: [[VV:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: store i32 0, ptr [[I]], align 4 -// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 // CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[J]], align 4 -// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 // CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4 @@ -565,225 +560,219 @@ extern "C" void foo4() { // CHECK1-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 // CHECK1-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 // CHECK1-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: store i32 0, ptr 
[[DOTOMP_LB03]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4 // CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 -// CHECK1-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4 -// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4 -// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4 -// CHECK1-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 -// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1 +// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1 // CHECK1-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 -// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8 +// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8 // CHECK1-NEXT: store i32 42, ptr [[C]], align 4 // CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 -// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK1-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0 // CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 // CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8 // CHECK1-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 -// CHECK1-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8 -// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0 -// 
CHECK1-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8 -// CHECK1-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8 -// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8 -// CHECK1-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 -// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64 +// CHECK1-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK1-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 // CHECK1-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] // CHECK1-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 -// CHECK1-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 -// CHECK1-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1 -// CHECK1-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1 -// CHECK1-NEXT: [[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1 -// CHECK1-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK1-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK1-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8 -// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8 -// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8 -// CHECK1-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK1-NEXT: [[ADD21:%.*]] 
= add nsw i64 [[TMP16]], 1 -// CHECK1-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8 +// CHECK1-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK1-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1 +// CHECK1-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1 +// CHECK1-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1 +// CHECK1-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8 +// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8 +// CHECK1-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1 +// CHECK1-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8 // CHECK1-NEXT: store i32 37, ptr [[CC]], align 4 -// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8 -// CHECK1-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 -// CHECK1-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256 -// CHECK1-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8 -// CHECK1-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0 -// CHECK1-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8 -// CHECK1-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK1-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0 -// CHECK1-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8 -// CHECK1-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8 -// CHECK1-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8 -// CHECK1-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8 -// CHECK1-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 
-// CHECK1-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64 -// CHECK1-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr [[TMP22]] to i64 -// CHECK1-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]] -// CHECK1-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8 -// CHECK1-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1 -// CHECK1-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1 -// CHECK1-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1 -// CHECK1-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1 -// CHECK1-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK1-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE221]], align 8 +// CHECK1-NEXT: [[TMP15:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY23:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP15]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR24:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY23]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR24]], ptr [[__END222]], align 8 +// CHECK1-NEXT: [[TMP16:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY26:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP16]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY26]], ptr [[__BEGIN225]], align 8 +// CHECK1-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY28]], ptr [[DOTCAPTURE_EXPR_27]], align 8 +// CHECK1-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__END222]], align 8 +// CHECK1-NEXT: store ptr [[TMP18]], ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[TMP19:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK1-NEXT: [[TMP20:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8 
+// CHECK1-NEXT: [[SUB_PTR_LHS_CAST31:%.*]] = ptrtoint ptr [[TMP19]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST32:%.*]] = ptrtoint ptr [[TMP20]] to i64 +// CHECK1-NEXT: [[SUB_PTR_SUB33:%.*]] = sub i64 [[SUB_PTR_LHS_CAST31]], [[SUB_PTR_RHS_CAST32]] +// CHECK1-NEXT: [[SUB_PTR_DIV34:%.*]] = sdiv exact i64 [[SUB_PTR_SUB33]], 8 +// CHECK1-NEXT: [[SUB35:%.*]] = sub nsw i64 [[SUB_PTR_DIV34]], 1 +// CHECK1-NEXT: [[ADD36:%.*]] = add nsw i64 [[SUB35]], 1 +// CHECK1-NEXT: [[DIV37:%.*]] = sdiv i64 [[ADD36]], 1 +// CHECK1-NEXT: [[SUB38:%.*]] = sub nsw i64 [[DIV37]], 1 +// CHECK1-NEXT: store i64 [[SUB38]], ptr [[DOTCAPTURE_EXPR_30]], align 8 // CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8 // CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8 -// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK1-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1 -// CHECK1-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8 -// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 -// CHECK1-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK1-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK1-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]] -// CHECK1-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]] -// CHECK1: [[COND_TRUE44]]: -// CHECK1-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK1-NEXT: br label %[[COND_END46:.*]] -// CHECK1: [[COND_FALSE45]]: -// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK1-NEXT: br label %[[COND_END46]] -// CHECK1: [[COND_END46]]: -// CHECK1-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ] -// CHECK1-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK1-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK1-NEXT: 
[[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK1-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]] -// CHECK1-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]] -// CHECK1: [[COND_TRUE50]]: -// CHECK1-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK1-NEXT: br label %[[COND_END52:.*]] -// CHECK1: [[COND_FALSE51]]: -// CHECK1-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK1-NEXT: br label %[[COND_END52]] -// CHECK1: [[COND_END52]]: -// CHECK1-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ] -// CHECK1-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8 -// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP21:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_30]], align 8 +// CHECK1-NEXT: [[ADD39:%.*]] = add nsw i64 [[TMP21]], 1 +// CHECK1-NEXT: store i64 [[ADD39]], ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[TMP22:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: store i64 [[TMP22]], ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[CMP41:%.*]] = icmp sgt i64 [[TMP23]], [[TMP24]] +// CHECK1-NEXT: br i1 [[CMP41]], label %[[COND_TRUE42:.*]], label %[[COND_FALSE43:.*]] +// CHECK1: [[COND_TRUE42]]: +// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK1-NEXT: br label %[[COND_END44:.*]] +// CHECK1: [[COND_FALSE43]]: +// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: br label %[[COND_END44]] +// CHECK1: [[COND_END44]]: +// CHECK1-NEXT: [[COND45:%.*]] = phi i64 [ [[TMP25]], %[[COND_TRUE42]] ], [ [[TMP26]], %[[COND_FALSE43]] ] +// CHECK1-NEXT: store i64 [[COND45]], ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK1-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 
8 +// CHECK1-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[CMP47:%.*]] = icmp sgt i64 [[TMP27]], [[TMP28]] +// CHECK1-NEXT: br i1 [[CMP47]], label %[[COND_TRUE48:.*]], label %[[COND_FALSE49:.*]] +// CHECK1: [[COND_TRUE48]]: +// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK1-NEXT: br label %[[COND_END50:.*]] +// CHECK1: [[COND_FALSE49]]: +// CHECK1-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: br label %[[COND_END50]] +// CHECK1: [[COND_END50]]: +// CHECK1-NEXT: [[COND51:%.*]] = phi i64 [ [[TMP29]], %[[COND_TRUE48]] ], [ [[TMP30]], %[[COND_FALSE49]] ] +// CHECK1-NEXT: store i64 [[COND51]], ptr [[DOTOMP_FUSE_MAX46]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX52]], align 8 // CHECK1-NEXT: br label %[[FOR_COND:.*]] // CHECK1: [[FOR_COND]]: -// CHECK1-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8 -// CHECK1-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]] -// CHECK1-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX46]], align 8 +// CHECK1-NEXT: [[CMP53:%.*]] = icmp slt i64 [[TMP31]], [[TMP32]] +// CHECK1-NEXT: br i1 [[CMP53]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK1: [[FOR_BODY]]: -// CHECK1-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 -// CHECK1-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]] -// CHECK1-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]] +// CHECK1-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: [[CMP54:%.*]] = icmp slt i64 
[[TMP33]], [[TMP34]] +// CHECK1-NEXT: br i1 [[CMP54]], label %[[IF_THEN:.*]], label %[[IF_END74:.*]] // CHECK1: [[IF_THEN]]: -// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4 -// CHECK1-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64 -// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4 -// CHECK1-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64 -// CHECK1-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]] -// CHECK1-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]] -// CHECK1-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32 -// CHECK1-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4 -// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4 -// CHECK1-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1 -// CHECK1-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]] -// CHECK1-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK1-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]] -// CHECK1-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]] -// CHECK1: [[IF_THEN64]]: -// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK1-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]] -// CHECK1-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]] -// CHECK1-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK1-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1 -// CHECK1-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]] -// CHECK1-NEXT: store i32 [[ADD68]], ptr [[I]], align 4 -// CHECK1-NEXT: 
[[TMP48:%.*]] = load i32, ptr [[I]], align 4 -// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP48]]) +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4 +// CHECK1-NEXT: [[CONV55:%.*]] = sext i32 [[TMP35]] to i64 +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_ST04]], align 4 +// CHECK1-NEXT: [[CONV56:%.*]] = sext i32 [[TMP36]] to i64 +// CHECK1-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV56]], [[TMP37]] +// CHECK1-NEXT: [[ADD57:%.*]] = add nsw i64 [[CONV55]], [[MUL]] +// CHECK1-NEXT: [[CONV58:%.*]] = trunc i64 [[ADD57]] to i32 +// CHECK1-NEXT: store i32 [[CONV58]], ptr [[DOTOMP_IV06]], align 4 +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4 +// CHECK1-NEXT: [[MUL59:%.*]] = mul nsw i32 [[TMP38]], 1 +// CHECK1-NEXT: [[ADD60:%.*]] = add nsw i32 0, [[MUL59]] +// CHECK1-NEXT: store i32 [[ADD60]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP61:%.*]] = icmp slt i32 [[TMP39]], [[TMP40]] +// CHECK1-NEXT: br i1 [[CMP61]], label %[[IF_THEN62:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN62]]: +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL63:%.*]] = mul nsw i32 [[TMP42]], [[TMP43]] +// CHECK1-NEXT: [[ADD64:%.*]] = add nsw i32 [[TMP41]], [[MUL63]] +// CHECK1-NEXT: store i32 [[ADD64]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP44]], 1 +// CHECK1-NEXT: [[ADD66:%.*]] = add nsw i32 0, [[MUL65]] +// CHECK1-NEXT: store i32 [[ADD66]], ptr [[I]], align 4 +// CHECK1-NEXT: [[TMP45:%.*]] = load i32, 
ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP45]]) // CHECK1-NEXT: br label %[[IF_END]] // CHECK1: [[IF_END]]: -// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK1-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]] -// CHECK1-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]] -// CHECK1: [[IF_THEN70]]: -// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK1-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]] -// CHECK1-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]] -// CHECK1-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK1-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2 -// CHECK1-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]] -// CHECK1-NEXT: store i32 [[ADD74]], ptr [[J]], align 4 -// CHECK1-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4 -// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP55]]) -// CHECK1-NEXT: br label %[[IF_END75]] -// CHECK1: [[IF_END75]]: -// CHECK1-NEXT: br label %[[IF_END76]] -// CHECK1: [[IF_END76]]: -// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK1-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]] -// CHECK1-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]] -// CHECK1: [[IF_THEN78]]: -// CHECK1-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8 -// CHECK1-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8 -// CHECK1-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]] -// CHECK1-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]] -// CHECK1-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8 -// CHECK1-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK1-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8 -// CHECK1-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1 -// CHECK1-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]] -// CHECK1-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8 -// CHECK1-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 -// CHECK1-NEXT: store ptr [[TMP63]], ptr [[V]], align 8 -// CHECK1-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4 -// CHECK1-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8 -// CHECK1-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8 -// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP64]], double noundef [[TMP66]]) -// CHECK1-NEXT: br label %[[IF_END83]] -// CHECK1: [[IF_END83]]: -// CHECK1-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK1-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]] -// CHECK1-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]] -// CHECK1: [[IF_THEN85]]: -// CHECK1-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 -// CHECK1-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 -// CHECK1-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]] -// CHECK1-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]] -// CHECK1-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8 -// CHECK1-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 -// CHECK1-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 -// CHECK1-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1 -// CHECK1-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]] -// CHECK1-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8 -// CHECK1-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8 -// CHECK1-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8 -// CHECK1-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4 -// CHECK1-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8 -// CHECK1-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8 -// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP75]], double noundef [[TMP77]]) -// CHECK1-NEXT: br label %[[IF_END90]] -// CHECK1: [[IF_END90]]: +// CHECK1-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP67:%.*]] = icmp slt i32 [[TMP46]], [[TMP47]] +// CHECK1-NEXT: br i1 [[CMP67]], label %[[IF_THEN68:.*]], label %[[IF_END73:.*]] +// CHECK1: [[IF_THEN68]]: +// CHECK1-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL69:%.*]] = mul nsw i32 [[TMP49]], [[TMP50]] +// CHECK1-NEXT: [[ADD70:%.*]] = add nsw i32 [[TMP48]], [[MUL69]] +// CHECK1-NEXT: store i32 [[ADD70]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP51]], 2 +// CHECK1-NEXT: [[ADD72:%.*]] = add nsw i32 0, [[MUL71]] +// CHECK1-NEXT: store i32 [[ADD72]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP52:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP52]]) +// CHECK1-NEXT: br label %[[IF_END73]] +// CHECK1: [[IF_END73]]: +// CHECK1-NEXT: br label %[[IF_END74]] +// CHECK1: [[IF_END74]]: +// CHECK1-NEXT: [[TMP53:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[TMP54:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[CMP75:%.*]] = icmp slt i64 [[TMP53]], [[TMP54]] +// CHECK1-NEXT: br i1 [[CMP75]], label %[[IF_THEN76:.*]], label %[[IF_END81:.*]] +// CHECK1: [[IF_THEN76]]: +// CHECK1-NEXT: [[TMP55:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK1-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[MUL77:%.*]] = mul nsw i64 [[TMP56]], [[TMP57]] +// CHECK1-NEXT: [[ADD78:%.*]] = add nsw i64 [[TMP55]], [[MUL77]] +// CHECK1-NEXT: store i64 [[ADD78]], ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[TMP58:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], 1 +// CHECK1-NEXT: [[ADD_PTR80:%.*]] = getelementptr inbounds double, ptr [[TMP58]], i64 [[MUL79]] +// CHECK1-NEXT: store ptr [[ADD_PTR80]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP60]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP61:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP62:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP63:%.*]] = load double, ptr [[TMP62]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP61]], double noundef [[TMP63]]) +// CHECK1-NEXT: br label %[[IF_END81]] +// CHECK1: [[IF_END81]]: +// CHECK1-NEXT: [[TMP64:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[TMP65:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK1-NEXT: [[CMP82:%.*]] = icmp slt i64 [[TMP64]], [[TMP65]] +// CHECK1-NEXT: br i1 [[CMP82]], label %[[IF_THEN83:.*]], label %[[IF_END88:.*]] +// CHECK1: [[IF_THEN83]]: +// CHECK1-NEXT: [[TMP66:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK1-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK1-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[MUL84:%.*]] = mul nsw i64 [[TMP67]], [[TMP68]] +// CHECK1-NEXT: [[ADD85:%.*]] = add nsw i64 [[TMP66]], [[MUL84]] +// CHECK1-NEXT: store i64 [[ADD85]], ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[TMP69:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8 +// CHECK1-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK1-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], 1 +// CHECK1-NEXT: [[ADD_PTR87:%.*]] = getelementptr inbounds double, ptr [[TMP69]], i64 [[MUL86]] +// CHECK1-NEXT: store ptr [[ADD_PTR87]], ptr [[__BEGIN225]], align 8 +// CHECK1-NEXT: [[TMP71:%.*]] = load ptr, ptr [[__BEGIN225]], align 8 +// CHECK1-NEXT: store ptr [[TMP71]], ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP72:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK1-NEXT: [[TMP73:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP74:%.*]] = load double, ptr [[TMP73]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP72]], double noundef [[TMP74]]) +// CHECK1-NEXT: br label %[[IF_END88]] +// CHECK1: [[IF_END88]]: // CHECK1-NEXT: br label %[[FOR_INC:.*]] // CHECK1: [[FOR_INC]]: -// CHECK1-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK1-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1 -// CHECK1-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK1-NEXT: [[TMP75:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK1-NEXT: [[INC:%.*]] = add nsw i64 [[TMP75]], 1 +// CHECK1-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX52]], align 8 // CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP6:![0-9]+]] // CHECK1: [[FOR_END]]: // CHECK1-NEXT: ret void @@ -794,13 +783,11 @@ extern "C" void foo4() { // CHECK1-NEXT: [[ENTRY:.*:]] // CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 // CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4 -// CHECK1-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -815,12 +802,10 @@ extern "C" void foo4() { // CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 // CHECK1-NEXT: store i32 0, ptr [[J]], align 4 -// CHECK1-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 // CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 // CHECK1-NEXT: store i32 0, ptr [[K]], align 4 -// CHECK1-NEXT: store i32 63, ptr [[DOTOMP_UB1]], 
align 4 // CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 // CHECK1-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 @@ -940,6 +925,277 @@ extern "C" void foo4() { // CHECK1-NEXT: ret void // // +// CHECK1-LABEL: define dso_local void @foo5( +// CHECK1-SAME: ) #[[ATTR0]] { +// CHECK1-NEXT: [[ENTRY:.*:]] +// CHECK1-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK1-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_LB116:%.*]] = 
alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_TEMP_121:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[DOTOMP_FUSE_MAX22:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[DOTOMP_FUSE_INDEX29:%.*]] = alloca i64, align 8 +// CHECK1-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK1-NEXT: [[__RANGE264:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__BEGIN265:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[__END267:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK1-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: store i32 512, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK1-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK1: [[COND_TRUE]]: +// CHECK1-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK1-NEXT: br label %[[COND_END:.*]] +// CHECK1: [[COND_FALSE]]: +// CHECK1-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: br label %[[COND_END]] +// CHECK1: [[COND_END]]: +// CHECK1-NEXT: [[COND:%.*]] = phi i32 [ [[TMP3]], 
%[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK1-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK1-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK1-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK1-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK1-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK1-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: store i32 0, ptr [[DOTOMP_LB03]], align 4 +// CHECK1-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4 +// CHECK1-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK1-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK1-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK1-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK1-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[TMP11:%.*]] = load ptr, ptr 
[[__END2]], align 8 +// CHECK1-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK1-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK1-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64 +// CHECK1-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK1-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK1-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK1-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK1-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1 +// CHECK1-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1 +// CHECK1-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1 +// CHECK1-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8 +// CHECK1-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8 +// CHECK1-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK1-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1 +// CHECK1-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK1-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK1-NEXT: [[TMP17:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[CMP23:%.*]] = icmp sgt i64 [[TMP16]], [[TMP17]] +// CHECK1-NEXT: br i1 [[CMP23]], label %[[COND_TRUE24:.*]], label %[[COND_FALSE25:.*]] +// CHECK1: [[COND_TRUE24]]: +// CHECK1-NEXT: [[TMP18:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK1-NEXT: br label %[[COND_END26:.*]] +// CHECK1: [[COND_FALSE25]]: +// CHECK1-NEXT: [[TMP19:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: br label %[[COND_END26]] +// CHECK1: [[COND_END26]]: +// CHECK1-NEXT: [[COND27:%.*]] = 
phi i64 [ [[TMP18]], %[[COND_TRUE24]] ], [ [[TMP19]], %[[COND_FALSE25]] ] +// CHECK1-NEXT: store i64 [[COND27]], ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK1-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND:.*]] +// CHECK1: [[FOR_COND]]: +// CHECK1-NEXT: [[TMP20:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[CMP28:%.*]] = icmp slt i32 [[TMP20]], 128 +// CHECK1-NEXT: br i1 [[CMP28]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK1: [[FOR_BODY]]: +// CHECK1-NEXT: [[TMP21:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: call void (...) @body(i32 noundef [[TMP21]]) +// CHECK1-NEXT: br label %[[FOR_INC:.*]] +// CHECK1: [[FOR_INC]]: +// CHECK1-NEXT: [[TMP22:%.*]] = load i32, ptr [[I]], align 4 +// CHECK1-NEXT: [[INC:%.*]] = add nsw i32 [[TMP22]], 1 +// CHECK1-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK1-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP9:![0-9]+]] +// CHECK1: [[FOR_END]]: +// CHECK1-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND30:.*]] +// CHECK1: [[FOR_COND30]]: +// CHECK1-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK1-NEXT: [[CMP31:%.*]] = icmp slt i64 [[TMP23]], [[TMP24]] +// CHECK1-NEXT: br i1 [[CMP31]], label %[[FOR_BODY32:.*]], label %[[FOR_END63:.*]] +// CHECK1: [[FOR_BODY32]]: +// CHECK1-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK1-NEXT: [[CMP33:%.*]] = icmp slt i64 [[TMP25]], [[TMP26]] +// CHECK1-NEXT: br i1 [[CMP33]], label %[[IF_THEN:.*]], label %[[IF_END53:.*]] +// CHECK1: [[IF_THEN]]: +// CHECK1-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4 +// CHECK1-NEXT: [[CONV34:%.*]] = sext i32 [[TMP27]] to i64 +// CHECK1-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_ST04]], align 4 +// CHECK1-NEXT: [[CONV35:%.*]] = sext 
i32 [[TMP28]] to i64 +// CHECK1-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV35]], [[TMP29]] +// CHECK1-NEXT: [[ADD36:%.*]] = add nsw i64 [[CONV34]], [[MUL]] +// CHECK1-NEXT: [[CONV37:%.*]] = trunc i64 [[ADD36]] to i32 +// CHECK1-NEXT: store i32 [[CONV37]], ptr [[DOTOMP_IV06]], align 4 +// CHECK1-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4 +// CHECK1-NEXT: [[MUL38:%.*]] = mul nsw i32 [[TMP30]], 1 +// CHECK1-NEXT: [[ADD39:%.*]] = add nsw i32 0, [[MUL38]] +// CHECK1-NEXT: store i32 [[ADD39]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK1-NEXT: [[CMP40:%.*]] = icmp slt i32 [[TMP31]], [[TMP32]] +// CHECK1-NEXT: br i1 [[CMP40]], label %[[IF_THEN41:.*]], label %[[IF_END:.*]] +// CHECK1: [[IF_THEN41]]: +// CHECK1-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK1-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK1-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL42:%.*]] = mul nsw i32 [[TMP34]], [[TMP35]] +// CHECK1-NEXT: [[ADD43:%.*]] = add nsw i32 [[TMP33]], [[MUL42]] +// CHECK1-NEXT: store i32 [[ADD43]], ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK1-NEXT: [[MUL44:%.*]] = mul nsw i32 [[TMP36]], 2 +// CHECK1-NEXT: [[ADD45:%.*]] = add nsw i32 0, [[MUL44]] +// CHECK1-NEXT: store i32 [[ADD45]], ptr [[J]], align 4 +// CHECK1-NEXT: [[TMP37:%.*]] = load i32, ptr [[J]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP37]]) +// CHECK1-NEXT: br label %[[IF_END]] +// CHECK1: [[IF_END]]: +// CHECK1-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK1-NEXT: [[CMP46:%.*]] = icmp slt i32 [[TMP38]], [[TMP39]] +// CHECK1-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK1: [[IF_THEN47]]: +// CHECK1-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK1-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK1-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK1-NEXT: [[MUL48:%.*]] = mul nsw i32 [[TMP41]], [[TMP42]] +// CHECK1-NEXT: [[ADD49:%.*]] = add nsw i32 [[TMP40]], [[MUL48]] +// CHECK1-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK1-NEXT: [[MUL50:%.*]] = mul nsw i32 [[TMP43]], 1 +// CHECK1-NEXT: [[ADD51:%.*]] = add nsw i32 0, [[MUL50]] +// CHECK1-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK1-NEXT: [[TMP44:%.*]] = load i32, ptr [[K]], align 4 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK1-NEXT: br label %[[IF_END52]] +// CHECK1: [[IF_END52]]: +// CHECK1-NEXT: br label %[[IF_END53]] +// CHECK1: [[IF_END53]]: +// CHECK1-NEXT: [[TMP45:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[TMP46:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK1-NEXT: [[CMP54:%.*]] = icmp slt i64 [[TMP45]], [[TMP46]] +// CHECK1-NEXT: br i1 [[CMP54]], label %[[IF_THEN55:.*]], label %[[IF_END60:.*]] +// CHECK1: [[IF_THEN55]]: +// CHECK1-NEXT: [[TMP47:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK1-NEXT: [[TMP48:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK1-NEXT: [[TMP49:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[MUL56:%.*]] = mul nsw i64 [[TMP48]], [[TMP49]] +// CHECK1-NEXT: [[ADD57:%.*]] = add nsw i64 [[TMP47]], [[MUL56]] +// CHECK1-NEXT: store i64 [[ADD57]], ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[TMP50:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK1-NEXT: [[TMP51:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK1-NEXT: [[MUL58:%.*]] = mul nsw i64 [[TMP51]], 1 +// CHECK1-NEXT: [[ADD_PTR59:%.*]] = getelementptr inbounds double, ptr [[TMP50]], i64 [[MUL58]] +// CHECK1-NEXT: store ptr [[ADD_PTR59]], ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: [[TMP52:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK1-NEXT: store ptr [[TMP52]], ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP53:%.*]] = load i32, ptr [[C]], align 4 +// CHECK1-NEXT: [[TMP54:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK1-NEXT: [[TMP55:%.*]] = load double, ptr [[TMP54]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP53]], double noundef [[TMP55]]) +// CHECK1-NEXT: br label %[[IF_END60]] +// CHECK1: [[IF_END60]]: +// CHECK1-NEXT: br label %[[FOR_INC61:.*]] +// CHECK1: [[FOR_INC61]]: +// CHECK1-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: [[INC62:%.*]] = add nsw i64 [[TMP56]], 1 +// CHECK1-NEXT: store i64 [[INC62]], ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND30]], !llvm.loop [[LOOP10:![0-9]+]] +// CHECK1: [[FOR_END63]]: +// CHECK1-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK1-NEXT: store ptr [[ARR]], ptr [[__RANGE264]], align 8 +// CHECK1-NEXT: [[TMP57:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY66:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP57]], i64 0, i64 0 +// CHECK1-NEXT: store ptr [[ARRAYDECAY66]], ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: [[TMP58:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK1-NEXT: [[ARRAYDECAY68:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP58]], i64 0, i64 0 +// CHECK1-NEXT: [[ADD_PTR69:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY68]], i64 256 +// CHECK1-NEXT: store ptr [[ADD_PTR69]], ptr [[__END267]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND70:.*]] +// CHECK1: [[FOR_COND70]]: +// CHECK1-NEXT: [[TMP59:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__END267]], align 8 +// CHECK1-NEXT: [[CMP71:%.*]] = icmp ne ptr [[TMP59]], [[TMP60]] +// CHECK1-NEXT: br i1 [[CMP71]], label %[[FOR_BODY72:.*]], label %[[FOR_END74:.*]] +// CHECK1: [[FOR_BODY72]]: +// CHECK1-NEXT: [[TMP61:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: store ptr [[TMP61]], ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP62:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK1-NEXT: [[TMP63:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK1-NEXT: [[TMP64:%.*]] = load double, ptr [[TMP63]], align 8 +// CHECK1-NEXT: call void (...) 
@body(i32 noundef [[TMP62]], double noundef [[TMP64]]) +// CHECK1-NEXT: br label %[[FOR_INC73:.*]] +// CHECK1: [[FOR_INC73]]: +// CHECK1-NEXT: [[TMP65:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP65]], i32 1 +// CHECK1-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN265]], align 8 +// CHECK1-NEXT: br label %[[FOR_COND70]] +// CHECK1: [[FOR_END74]]: +// CHECK1-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @body( // CHECK2-SAME: ...) #[[ATTR0:[0-9]+]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -961,7 +1217,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -970,7 +1225,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1002,107 +1256,103 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// 
CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[START2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[START2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[END2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[END2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP2_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK2-NEXT: 
[[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: store i32 [[TMP20]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP21]], [[TMP22]] +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP19]], [[TMP20]] // CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] // CHECK2: [[COND_TRUE]]: -// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 // CHECK2-NEXT: 
br label %[[COND_END:.*]] // CHECK2: [[COND_FALSE]]: -// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 // CHECK2-NEXT: br label %[[COND_END]] // CHECK2: [[COND_END]]: -// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP23]], %[[COND_TRUE]] ], [ [[TMP24]], %[[COND_FALSE]] ] +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP21]], %[[COND_TRUE]] ], [ [[TMP22]], %[[COND_FALSE]] ] // CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: br label %[[FOR_COND:.*]] // CHECK2: [[FOR_COND]]: -// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 -// CHECK2-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP16:%.*]] = icmp ult i32 [[TMP23]], [[TMP24]] // CHECK2-NEXT: br i1 [[CMP16]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK2: [[FOR_BODY]]: -// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP17:%.*]] = icmp ult i32 [[TMP25]], [[TMP26]] // CHECK2-NEXT: br i1 [[CMP17]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] // CHECK2: [[IF_THEN]]: -// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL:%.*]] = mul 
i32 [[TMP30]], [[TMP31]] -// CHECK2-NEXT: [[ADD18:%.*]] = add i32 [[TMP29]], [[MUL]] +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP28]], [[TMP29]] +// CHECK2-NEXT: [[ADD18:%.*]] = add i32 [[TMP27]], [[MUL]] // CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 -// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 -// CHECK2-NEXT: [[MUL19:%.*]] = mul i32 [[TMP33]], [[TMP34]] -// CHECK2-NEXT: [[ADD20:%.*]] = add i32 [[TMP32]], [[MUL19]] +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL19:%.*]] = mul i32 [[TMP31]], [[TMP32]] +// CHECK2-NEXT: [[ADD20:%.*]] = add i32 [[TMP30]], [[MUL19]] // CHECK2-NEXT: store i32 [[ADD20]], ptr [[I]], align 4 -// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[I]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP35]]) +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP33]]) // CHECK2-NEXT: br label %[[IF_END]] // CHECK2: [[IF_END]]: -// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP36]], [[TMP37]] +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP21:%.*]] = icmp ult i32 [[TMP34]], [[TMP35]] // CHECK2-NEXT: br i1 [[CMP21]], label %[[IF_THEN22:.*]], label %[[IF_END27:.*]] // CHECK2: [[IF_THEN22]]: -// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL23:%.*]] = mul i32 [[TMP39]], [[TMP40]] -// CHECK2-NEXT: [[ADD24:%.*]] = add i32 [[TMP38]], [[MUL23]] +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL23:%.*]] = mul i32 [[TMP37]], [[TMP38]] +// CHECK2-NEXT: [[ADD24:%.*]] = add i32 [[TMP36]], [[MUL23]] // CHECK2-NEXT: store i32 [[ADD24]], ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[MUL25:%.*]] = mul i32 [[TMP42]], [[TMP43]] -// CHECK2-NEXT: [[ADD26:%.*]] = add i32 [[TMP41]], [[MUL25]] +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL25:%.*]] = mul i32 
[[TMP40]], [[TMP41]] +// CHECK2-NEXT: [[ADD26:%.*]] = add i32 [[TMP39]], [[MUL25]] // CHECK2-NEXT: store i32 [[ADD26]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[J]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP44]]) +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP42]]) // CHECK2-NEXT: br label %[[IF_END27]] // CHECK2: [[IF_END27]]: // CHECK2-NEXT: br label %[[FOR_INC:.*]] // CHECK2: [[FOR_INC]]: -// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP45]], 1 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP43]], 1 // CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP3:![0-9]+]] // CHECK2: [[FOR_END]]: @@ -1114,13 +1364,11 @@ extern "C" void foo4() { // CHECK2-NEXT: [[ENTRY:.*:]] // CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 // CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1130,48 +1378,43 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB03:%.*]] = alloca i32, align 4 -// 
CHECK2-NEXT: [[DOTOMP_LB04:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_ST05:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_NI06:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_IV07:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[C:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_12:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_UB117:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_LB118:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_ST119:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_NI120:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_IV122:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_LB116:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[CC:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[__RANGE223:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[__END224:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[__BEGIN227:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__RANGE221:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END222:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN225:%.*]] = alloca ptr, 
align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_27:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_29:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_31:%.*]] = alloca ptr, align 8 -// CHECK2-NEXT: [[DOTCAPTURE_EXPR_32:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_30:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_IV2:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_TEMP_142:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_140:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[DOTOMP_TEMP_2:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_FUSE_MAX48:%.*]] = alloca i64, align 8 -// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX54:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX46:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX52:%.*]] = alloca i64, align 8 // CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[VV:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: store i32 0, ptr [[I]], align 4 -// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 // CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[J]], align 4 -// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 // CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI1]], align 4 @@ -1198,225 +1441,219 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 // CHECK2-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 // CHECK2-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: store i32 0, ptr 
[[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4 // CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 -// CHECK2-NEXT: store i32 [[TMP7]], ptr [[DOTOMP_UB03]], align 4 -// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB04]], align 4 -// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST05]], align 4 -// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 -// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP8]], 1 +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1 // CHECK2-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 -// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI06]], align 8 +// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8 // CHECK2-NEXT: store i32 42, ptr [[C]], align 4 // CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0 // CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 // CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8 // CHECK2-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY8:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 -// CHECK2-NEXT: store ptr [[ARRAYDECAY8]], ptr [[__BEGIN2]], align 8 -// CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__RANGE2]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY10:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP11]], i64 0, i64 0 -// 
CHECK2-NEXT: store ptr [[ARRAYDECAY10]], ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[__END2]], align 8 -// CHECK2-NEXT: store ptr [[TMP12]], ptr [[DOTCAPTURE_EXPR_11]], align 8 -// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_11]], align 8 -// CHECK2-NEXT: [[TMP14:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 -// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP14]] to i64 +// CHECK2-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 // CHECK2-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] // CHECK2-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 -// CHECK2-NEXT: [[SUB13:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 -// CHECK2-NEXT: [[ADD14:%.*]] = add nsw i64 [[SUB13]], 1 -// CHECK2-NEXT: [[DIV15:%.*]] = sdiv i64 [[ADD14]], 1 -// CHECK2-NEXT: [[SUB16:%.*]] = sub nsw i64 [[DIV15]], 1 -// CHECK2-NEXT: store i64 [[SUB16]], ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK2-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK2-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_UB117]], align 8 -// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB118]], align 8 -// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST119]], align 8 -// CHECK2-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_12]], align 8 -// CHECK2-NEXT: [[ADD21:%.*]] 
= add nsw i64 [[TMP16]], 1 -// CHECK2-NEXT: store i64 [[ADD21]], ptr [[DOTOMP_NI120]], align 8 +// CHECK2-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK2-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1 +// CHECK2-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1 +// CHECK2-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1 +// CHECK2-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1 +// CHECK2-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8 // CHECK2-NEXT: store i32 37, ptr [[CC]], align 4 -// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY25:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 -// CHECK2-NEXT: [[ADD_PTR26:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY25]], i64 256 -// CHECK2-NEXT: store ptr [[ADD_PTR26]], ptr [[__END224]], align 8 -// CHECK2-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP18]], i64 0, i64 0 -// CHECK2-NEXT: store ptr [[ARRAYDECAY28]], ptr [[__BEGIN227]], align 8 -// CHECK2-NEXT: [[TMP19:%.*]] = load ptr, ptr [[__RANGE223]], align 8 -// CHECK2-NEXT: [[ARRAYDECAY30:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP19]], i64 0, i64 0 -// CHECK2-NEXT: store ptr [[ARRAYDECAY30]], ptr [[DOTCAPTURE_EXPR_29]], align 8 -// CHECK2-NEXT: [[TMP20:%.*]] = load ptr, ptr [[__END224]], align 8 -// CHECK2-NEXT: store ptr [[TMP20]], ptr [[DOTCAPTURE_EXPR_31]], align 8 -// CHECK2-NEXT: [[TMP21:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_31]], align 8 -// CHECK2-NEXT: [[TMP22:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 
-// CHECK2-NEXT: [[SUB_PTR_LHS_CAST33:%.*]] = ptrtoint ptr [[TMP21]] to i64 -// CHECK2-NEXT: [[SUB_PTR_RHS_CAST34:%.*]] = ptrtoint ptr [[TMP22]] to i64 -// CHECK2-NEXT: [[SUB_PTR_SUB35:%.*]] = sub i64 [[SUB_PTR_LHS_CAST33]], [[SUB_PTR_RHS_CAST34]] -// CHECK2-NEXT: [[SUB_PTR_DIV36:%.*]] = sdiv exact i64 [[SUB_PTR_SUB35]], 8 -// CHECK2-NEXT: [[SUB37:%.*]] = sub nsw i64 [[SUB_PTR_DIV36]], 1 -// CHECK2-NEXT: [[ADD38:%.*]] = add nsw i64 [[SUB37]], 1 -// CHECK2-NEXT: [[DIV39:%.*]] = sdiv i64 [[ADD38]], 1 -// CHECK2-NEXT: [[SUB40:%.*]] = sub nsw i64 [[DIV39]], 1 -// CHECK2-NEXT: store i64 [[SUB40]], ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK2-NEXT: store i64 [[TMP23]], ptr [[DOTOMP_UB2]], align 8 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[TMP15:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY23:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP15]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR24:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY23]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR24]], ptr [[__END222]], align 8 +// CHECK2-NEXT: [[TMP16:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY26:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP16]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY26]], ptr [[__BEGIN225]], align 8 +// CHECK2-NEXT: [[TMP17:%.*]] = load ptr, ptr [[__RANGE221]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY28:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP17]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY28]], ptr [[DOTCAPTURE_EXPR_27]], align 8 +// CHECK2-NEXT: [[TMP18:%.*]] = load ptr, ptr [[__END222]], align 8 +// CHECK2-NEXT: store ptr [[TMP18]], ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP19:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 +// CHECK2-NEXT: [[TMP20:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8 
+// CHECK2-NEXT: [[SUB_PTR_LHS_CAST31:%.*]] = ptrtoint ptr [[TMP19]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST32:%.*]] = ptrtoint ptr [[TMP20]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB33:%.*]] = sub i64 [[SUB_PTR_LHS_CAST31]], [[SUB_PTR_RHS_CAST32]] +// CHECK2-NEXT: [[SUB_PTR_DIV34:%.*]] = sdiv exact i64 [[SUB_PTR_SUB33]], 8 +// CHECK2-NEXT: [[SUB35:%.*]] = sub nsw i64 [[SUB_PTR_DIV34]], 1 +// CHECK2-NEXT: [[ADD36:%.*]] = add nsw i64 [[SUB35]], 1 +// CHECK2-NEXT: [[DIV37:%.*]] = sdiv i64 [[ADD36]], 1 +// CHECK2-NEXT: [[SUB38:%.*]] = sub nsw i64 [[DIV37]], 1 +// CHECK2-NEXT: store i64 [[SUB38]], ptr [[DOTCAPTURE_EXPR_30]], align 8 // CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB2]], align 8 // CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST2]], align 8 -// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_32]], align 8 -// CHECK2-NEXT: [[ADD41:%.*]] = add nsw i64 [[TMP24]], 1 -// CHECK2-NEXT: store i64 [[ADD41]], ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 -// CHECK2-NEXT: store i64 [[TMP25]], ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK2-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK2-NEXT: [[CMP43:%.*]] = icmp sgt i64 [[TMP26]], [[TMP27]] -// CHECK2-NEXT: br i1 [[CMP43]], label %[[COND_TRUE44:.*]], label %[[COND_FALSE45:.*]] -// CHECK2: [[COND_TRUE44]]: -// CHECK2-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_TEMP_142]], align 8 -// CHECK2-NEXT: br label %[[COND_END46:.*]] -// CHECK2: [[COND_FALSE45]]: -// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK2-NEXT: br label %[[COND_END46]] -// CHECK2: [[COND_END46]]: -// CHECK2-NEXT: [[COND47:%.*]] = phi i64 [ [[TMP28]], %[[COND_TRUE44]] ], [ [[TMP29]], %[[COND_FALSE45]] ] -// CHECK2-NEXT: store i64 [[COND47]], ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK2-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK2-NEXT: 
[[TMP31:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: [[CMP49:%.*]] = icmp sgt i64 [[TMP30]], [[TMP31]] -// CHECK2-NEXT: br i1 [[CMP49]], label %[[COND_TRUE50:.*]], label %[[COND_FALSE51:.*]] -// CHECK2: [[COND_TRUE50]]: -// CHECK2-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 -// CHECK2-NEXT: br label %[[COND_END52:.*]] -// CHECK2: [[COND_FALSE51]]: -// CHECK2-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: br label %[[COND_END52]] -// CHECK2: [[COND_END52]]: -// CHECK2-NEXT: [[COND53:%.*]] = phi i64 [ [[TMP32]], %[[COND_TRUE50]] ], [ [[TMP33]], %[[COND_FALSE51]] ] -// CHECK2-NEXT: store i64 [[COND53]], ptr [[DOTOMP_FUSE_MAX48]], align 8 -// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP21:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_30]], align 8 +// CHECK2-NEXT: [[ADD39:%.*]] = add nsw i64 [[TMP21]], 1 +// CHECK2-NEXT: store i64 [[ADD39]], ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[TMP22:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: store i64 [[TMP22]], ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP41:%.*]] = icmp sgt i64 [[TMP23]], [[TMP24]] +// CHECK2-NEXT: br i1 [[CMP41]], label %[[COND_TRUE42:.*]], label %[[COND_FALSE43:.*]] +// CHECK2: [[COND_TRUE42]]: +// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_TEMP_140]], align 8 +// CHECK2-NEXT: br label %[[COND_END44:.*]] +// CHECK2: [[COND_FALSE43]]: +// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: br label %[[COND_END44]] +// CHECK2: [[COND_END44]]: +// CHECK2-NEXT: [[COND45:%.*]] = phi i64 [ [[TMP25]], %[[COND_TRUE42]] ], [ [[TMP26]], %[[COND_FALSE43]] ] +// CHECK2-NEXT: store i64 [[COND45]], ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: [[TMP27:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 
8 +// CHECK2-NEXT: [[TMP28:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP47:%.*]] = icmp sgt i64 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: br i1 [[CMP47]], label %[[COND_TRUE48:.*]], label %[[COND_FALSE49:.*]] +// CHECK2: [[COND_TRUE48]]: +// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_TEMP_2]], align 8 +// CHECK2-NEXT: br label %[[COND_END50:.*]] +// CHECK2: [[COND_FALSE49]]: +// CHECK2-NEXT: [[TMP30:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: br label %[[COND_END50]] +// CHECK2: [[COND_END50]]: +// CHECK2-NEXT: [[COND51:%.*]] = phi i64 [ [[TMP29]], %[[COND_TRUE48]] ], [ [[TMP30]], %[[COND_FALSE49]] ] +// CHECK2-NEXT: store i64 [[COND51]], ptr [[DOTOMP_FUSE_MAX46]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX52]], align 8 // CHECK2-NEXT: br label %[[FOR_COND:.*]] // CHECK2: [[FOR_COND]]: -// CHECK2-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP35:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX48]], align 8 -// CHECK2-NEXT: [[CMP55:%.*]] = icmp slt i64 [[TMP34]], [[TMP35]] -// CHECK2-NEXT: br i1 [[CMP55]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2-NEXT: [[TMP31:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP32:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX46]], align 8 +// CHECK2-NEXT: [[CMP53:%.*]] = icmp slt i64 [[TMP31]], [[TMP32]] +// CHECK2-NEXT: br i1 [[CMP53]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK2: [[FOR_BODY]]: -// CHECK2-NEXT: [[TMP36:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_NI06]], align 8 -// CHECK2-NEXT: [[CMP56:%.*]] = icmp slt i64 [[TMP36]], [[TMP37]] -// CHECK2-NEXT: br i1 [[CMP56]], label %[[IF_THEN:.*]], label %[[IF_END76:.*]] +// CHECK2-NEXT: [[TMP33:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP34:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: [[CMP54:%.*]] = icmp slt i64 
[[TMP33]], [[TMP34]] +// CHECK2-NEXT: br i1 [[CMP54]], label %[[IF_THEN:.*]], label %[[IF_END74:.*]] // CHECK2: [[IF_THEN]]: -// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_LB04]], align 4 -// CHECK2-NEXT: [[CONV57:%.*]] = sext i32 [[TMP38]] to i64 -// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_ST05]], align 4 -// CHECK2-NEXT: [[CONV58:%.*]] = sext i32 [[TMP39]] to i64 -// CHECK2-NEXT: [[TMP40:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV58]], [[TMP40]] -// CHECK2-NEXT: [[ADD59:%.*]] = add nsw i64 [[CONV57]], [[MUL]] -// CHECK2-NEXT: [[CONV60:%.*]] = trunc i64 [[ADD59]] to i32 -// CHECK2-NEXT: store i32 [[CONV60]], ptr [[DOTOMP_IV07]], align 4 -// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_IV07]], align 4 -// CHECK2-NEXT: [[MUL61:%.*]] = mul nsw i32 [[TMP41]], 1 -// CHECK2-NEXT: [[ADD62:%.*]] = add nsw i32 0, [[MUL61]] -// CHECK2-NEXT: store i32 [[ADD62]], ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: [[CMP63:%.*]] = icmp slt i32 [[TMP42]], [[TMP43]] -// CHECK2-NEXT: br i1 [[CMP63]], label %[[IF_THEN64:.*]], label %[[IF_END:.*]] -// CHECK2: [[IF_THEN64]]: -// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP45]], [[TMP46]] -// CHECK2-NEXT: [[ADD66:%.*]] = add nsw i32 [[TMP44]], [[MUL65]] -// CHECK2-NEXT: store i32 [[ADD66]], ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[MUL67:%.*]] = mul nsw i32 [[TMP47]], 1 -// CHECK2-NEXT: [[ADD68:%.*]] = add nsw i32 0, [[MUL67]] -// CHECK2-NEXT: store i32 [[ADD68]], ptr [[I]], align 4 -// CHECK2-NEXT: 
[[TMP48:%.*]] = load i32, ptr [[I]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP48]]) +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: [[CONV55:%.*]] = sext i32 [[TMP35]] to i64 +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_ST04]], align 4 +// CHECK2-NEXT: [[CONV56:%.*]] = sext i32 [[TMP36]] to i64 +// CHECK2-NEXT: [[TMP37:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV56]], [[TMP37]] +// CHECK2-NEXT: [[ADD57:%.*]] = add nsw i64 [[CONV55]], [[MUL]] +// CHECK2-NEXT: [[CONV58:%.*]] = trunc i64 [[ADD57]] to i32 +// CHECK2-NEXT: store i32 [[CONV58]], ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[MUL59:%.*]] = mul nsw i32 [[TMP38]], 1 +// CHECK2-NEXT: [[ADD60:%.*]] = add nsw i32 0, [[MUL59]] +// CHECK2-NEXT: store i32 [[ADD60]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP61:%.*]] = icmp slt i32 [[TMP39]], [[TMP40]] +// CHECK2-NEXT: br i1 [[CMP61]], label %[[IF_THEN62:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN62]]: +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL63:%.*]] = mul nsw i32 [[TMP42]], [[TMP43]] +// CHECK2-NEXT: [[ADD64:%.*]] = add nsw i32 [[TMP41]], [[MUL63]] +// CHECK2-NEXT: store i32 [[ADD64]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL65:%.*]] = mul nsw i32 [[TMP44]], 1 +// CHECK2-NEXT: [[ADD66:%.*]] = add nsw i32 0, [[MUL65]] +// CHECK2-NEXT: store i32 [[ADD66]], ptr [[I]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, 
ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP45]]) // CHECK2-NEXT: br label %[[IF_END]] // CHECK2: [[IF_END]]: -// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP69:%.*]] = icmp slt i32 [[TMP49]], [[TMP50]] -// CHECK2-NEXT: br i1 [[CMP69]], label %[[IF_THEN70:.*]], label %[[IF_END75:.*]] -// CHECK2: [[IF_THEN70]]: -// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP52]], [[TMP53]] -// CHECK2-NEXT: [[ADD72:%.*]] = add nsw i32 [[TMP51]], [[MUL71]] -// CHECK2-NEXT: store i32 [[ADD72]], ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[MUL73:%.*]] = mul nsw i32 [[TMP54]], 2 -// CHECK2-NEXT: [[ADD74:%.*]] = add nsw i32 0, [[MUL73]] -// CHECK2-NEXT: store i32 [[ADD74]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[J]], align 4 -// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP55]]) -// CHECK2-NEXT: br label %[[IF_END75]] -// CHECK2: [[IF_END75]]: -// CHECK2-NEXT: br label %[[IF_END76]] -// CHECK2: [[IF_END76]]: -// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_NI120]], align 8 -// CHECK2-NEXT: [[CMP77:%.*]] = icmp slt i64 [[TMP56]], [[TMP57]] -// CHECK2-NEXT: br i1 [[CMP77]], label %[[IF_THEN78:.*]], label %[[IF_END83:.*]] -// CHECK2: [[IF_THEN78]]: -// CHECK2-NEXT: [[TMP58:%.*]] = load i64, ptr [[DOTOMP_LB118]], align 8 -// CHECK2-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_ST119]], align 8 -// CHECK2-NEXT: [[TMP60:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], [[TMP60]] -// CHECK2-NEXT: [[ADD80:%.*]] = add nsw i64 [[TMP58]], [[MUL79]] -// CHECK2-NEXT: store i64 [[ADD80]], ptr [[DOTOMP_IV122]], align 8 -// CHECK2-NEXT: [[TMP61:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_9]], align 8 -// CHECK2-NEXT: [[TMP62:%.*]] = load i64, ptr [[DOTOMP_IV122]], align 8 -// CHECK2-NEXT: [[MUL81:%.*]] = mul nsw i64 [[TMP62]], 1 -// CHECK2-NEXT: [[ADD_PTR82:%.*]] = getelementptr inbounds double, ptr [[TMP61]], i64 [[MUL81]] -// CHECK2-NEXT: store ptr [[ADD_PTR82]], ptr [[__BEGIN2]], align 8 -// CHECK2-NEXT: [[TMP63:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 -// CHECK2-NEXT: store ptr [[TMP63]], ptr [[V]], align 8 -// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[C]], align 4 -// CHECK2-NEXT: [[TMP65:%.*]] = load ptr, ptr [[V]], align 8 -// CHECK2-NEXT: [[TMP66:%.*]] = load double, ptr [[TMP65]], align 8 -// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP64]], double noundef [[TMP66]]) -// CHECK2-NEXT: br label %[[IF_END83]] -// CHECK2: [[IF_END83]]: -// CHECK2-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 -// CHECK2-NEXT: [[CMP84:%.*]] = icmp slt i64 [[TMP67]], [[TMP68]] -// CHECK2-NEXT: br i1 [[CMP84]], label %[[IF_THEN85:.*]], label %[[IF_END90:.*]] -// CHECK2: [[IF_THEN85]]: -// CHECK2-NEXT: [[TMP69:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 -// CHECK2-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 -// CHECK2-NEXT: [[TMP71:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], [[TMP71]] -// CHECK2-NEXT: [[ADD87:%.*]] = add nsw i64 [[TMP69]], [[MUL86]] -// CHECK2-NEXT: store i64 [[ADD87]], ptr [[DOTOMP_IV2]], align 8 -// CHECK2-NEXT: [[TMP72:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_29]], align 8 -// CHECK2-NEXT: [[TMP73:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 -// CHECK2-NEXT: [[MUL88:%.*]] = mul nsw i64 [[TMP73]], 1 -// CHECK2-NEXT: [[ADD_PTR89:%.*]] = getelementptr inbounds double, ptr [[TMP72]], i64 [[MUL88]] -// CHECK2-NEXT: store ptr [[ADD_PTR89]], ptr [[__BEGIN227]], align 8 -// CHECK2-NEXT: [[TMP74:%.*]] = load ptr, ptr [[__BEGIN227]], align 8 -// CHECK2-NEXT: store ptr [[TMP74]], ptr [[VV]], align 8 -// CHECK2-NEXT: [[TMP75:%.*]] = load i32, ptr [[CC]], align 4 -// CHECK2-NEXT: [[TMP76:%.*]] = load ptr, ptr [[VV]], align 8 -// CHECK2-NEXT: [[TMP77:%.*]] = load double, ptr [[TMP76]], align 8 -// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP75]], double noundef [[TMP77]]) -// CHECK2-NEXT: br label %[[IF_END90]] -// CHECK2: [[IF_END90]]: +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP67:%.*]] = icmp slt i32 [[TMP46]], [[TMP47]] +// CHECK2-NEXT: br i1 [[CMP67]], label %[[IF_THEN68:.*]], label %[[IF_END73:.*]] +// CHECK2: [[IF_THEN68]]: +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL69:%.*]] = mul nsw i32 [[TMP49]], [[TMP50]] +// CHECK2-NEXT: [[ADD70:%.*]] = add nsw i32 [[TMP48]], [[MUL69]] +// CHECK2-NEXT: store i32 [[ADD70]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL71:%.*]] = mul nsw i32 [[TMP51]], 2 +// CHECK2-NEXT: [[ADD72:%.*]] = add nsw i32 0, [[MUL71]] +// CHECK2-NEXT: store i32 [[ADD72]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP52]]) +// CHECK2-NEXT: br label %[[IF_END73]] +// CHECK2: [[IF_END73]]: +// CHECK2-NEXT: br label %[[IF_END74]] +// CHECK2: [[IF_END74]]: +// CHECK2-NEXT: [[TMP53:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP54:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP75:%.*]] = icmp slt i64 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: br i1 [[CMP75]], label %[[IF_THEN76:.*]], label %[[IF_END81:.*]] +// CHECK2: [[IF_THEN76]]: +// CHECK2-NEXT: [[TMP55:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP57:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[MUL77:%.*]] = mul nsw i64 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: [[ADD78:%.*]] = add nsw i64 [[TMP55]], [[MUL77]] +// CHECK2-NEXT: store i64 [[ADD78]], ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[TMP58:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[TMP59:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[MUL79:%.*]] = mul nsw i64 [[TMP59]], 1 +// CHECK2-NEXT: [[ADD_PTR80:%.*]] = getelementptr inbounds double, ptr [[TMP58]], i64 [[MUL79]] +// CHECK2-NEXT: store ptr [[ADD_PTR80]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP60]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP62:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP63:%.*]] = load double, ptr [[TMP62]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP61]], double noundef [[TMP63]]) +// CHECK2-NEXT: br label %[[IF_END81]] +// CHECK2: [[IF_END81]]: +// CHECK2-NEXT: [[TMP64:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[TMP65:%.*]] = load i64, ptr [[DOTOMP_NI2]], align 8 +// CHECK2-NEXT: [[CMP82:%.*]] = icmp slt i64 [[TMP64]], [[TMP65]] +// CHECK2-NEXT: br i1 [[CMP82]], label %[[IF_THEN83:.*]], label %[[IF_END88:.*]] +// CHECK2: [[IF_THEN83]]: +// CHECK2-NEXT: [[TMP66:%.*]] = load i64, ptr [[DOTOMP_LB2]], align 8 +// CHECK2-NEXT: [[TMP67:%.*]] = load i64, ptr [[DOTOMP_ST2]], align 8 +// CHECK2-NEXT: [[TMP68:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[MUL84:%.*]] = mul nsw i64 [[TMP67]], [[TMP68]] +// CHECK2-NEXT: [[ADD85:%.*]] = add nsw i64 [[TMP66]], [[MUL84]] +// CHECK2-NEXT: store i64 [[ADD85]], ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[TMP69:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_27]], align 8 +// CHECK2-NEXT: [[TMP70:%.*]] = load i64, ptr [[DOTOMP_IV2]], align 8 +// CHECK2-NEXT: [[MUL86:%.*]] = mul nsw i64 [[TMP70]], 1 +// CHECK2-NEXT: [[ADD_PTR87:%.*]] = getelementptr inbounds double, ptr [[TMP69]], i64 [[MUL86]] +// CHECK2-NEXT: store ptr [[ADD_PTR87]], ptr [[__BEGIN225]], align 8 +// CHECK2-NEXT: [[TMP71:%.*]] = load ptr, ptr [[__BEGIN225]], align 8 +// CHECK2-NEXT: store ptr [[TMP71]], ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP72:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK2-NEXT: [[TMP73:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP74:%.*]] = load double, ptr [[TMP73]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP72]], double noundef [[TMP74]]) +// CHECK2-NEXT: br label %[[IF_END88]] +// CHECK2: [[IF_END88]]: // CHECK2-NEXT: br label %[[FOR_INC:.*]] // CHECK2: [[FOR_INC]]: -// CHECK2-NEXT: [[TMP78:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX54]], align 8 -// CHECK2-NEXT: [[INC:%.*]] = add nsw i64 [[TMP78]], 1 -// CHECK2-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX54]], align 8 +// CHECK2-NEXT: [[TMP75:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX52]], align 8 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i64 [[TMP75]], 1 +// CHECK2-NEXT: store i64 [[INC]], ptr [[DOTOMP_FUSE_INDEX52]], align 8 // CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP5:![0-9]+]] // CHECK2: [[FOR_END]]: // CHECK2-NEXT: ret void @@ -1427,13 +1664,11 @@ extern "C" void foo4() { // CHECK2-NEXT: [[ENTRY:.*:]] // CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 // CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1448,12 +1683,10 @@ extern "C" void foo4() { // CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 // CHECK2-NEXT: store i32 0, ptr [[J]], align 4 -// CHECK2-NEXT: store i32 127, ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 // CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[K]], align 4 -// CHECK2-NEXT: store i32 63, ptr 
[[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 // CHECK2-NEXT: store i32 64, ptr [[DOTOMP_NI1]], align 4 @@ -1573,6 +1806,277 @@ extern "C" void foo4() { // CHECK2-NEXT: ret void // // +// CHECK2-LABEL: define dso_local void @foo5( +// CHECK2-SAME: ) #[[ATTR0]] { +// CHECK2-NEXT: [[ENTRY:.*:]] +// CHECK2-NEXT: [[ARR:%.*]] = alloca [256 x double], align 16 +// CHECK2-NEXT: [[J:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV0:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[K:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_IV1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_TEMP_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_LB03:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_ST04:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_NI05:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV06:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[C:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN2:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_8:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_10:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[DOTCAPTURE_EXPR_11:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: 
[[DOTOMP_LB116:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_ST117:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_NI118:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_IV120:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_TEMP_121:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[DOTOMP_FUSE_MAX22:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[I:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[DOTOMP_FUSE_INDEX29:%.*]] = alloca i64, align 8 +// CHECK2-NEXT: [[V:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[CC:%.*]] = alloca i32, align 4 +// CHECK2-NEXT: [[__RANGE264:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__BEGIN265:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[__END267:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: [[VV:%.*]] = alloca ptr, align 8 +// CHECK2-NEXT: store i32 0, ptr [[J]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: store i32 128, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[K]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: store i32 512, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP0:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP0]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP1:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP2:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp sgt i32 [[TMP1]], [[TMP2]] +// CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] +// CHECK2: [[COND_TRUE]]: +// CHECK2-NEXT: [[TMP3:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: br label %[[COND_END:.*]] +// CHECK2: [[COND_FALSE]]: +// CHECK2-NEXT: [[TMP4:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: br label %[[COND_END]] +// CHECK2: [[COND_END]]: +// CHECK2-NEXT: 
[[COND:%.*]] = phi i32 [ [[TMP3]], %[[COND_TRUE]] ], [ [[TMP4]], %[[COND_FALSE]] ] +// CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP5:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: store i32 [[TMP5]], ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP6:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[SUB:%.*]] = sub nsw i32 [[TMP6]], 0 +// CHECK2-NEXT: [[DIV:%.*]] = sdiv i32 [[SUB]], 1 +// CHECK2-NEXT: [[SUB2:%.*]] = sub nsw i32 [[DIV]], 1 +// CHECK2-NEXT: store i32 [[SUB2]], ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST04]], align 4 +// CHECK2-NEXT: [[TMP7:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_1]], align 4 +// CHECK2-NEXT: [[ADD:%.*]] = add nsw i32 [[TMP7]], 1 +// CHECK2-NEXT: [[CONV:%.*]] = sext i32 [[ADD]] to i64 +// CHECK2-NEXT: store i64 [[CONV]], ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: store i32 42, ptr [[C]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[TMP8:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP8]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR]], ptr [[__END2]], align 8 +// CHECK2-NEXT: [[TMP9:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY7:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP9]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY7]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP10:%.*]] = load ptr, ptr [[__RANGE2]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY9:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP10]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY9]], ptr [[DOTCAPTURE_EXPR_8]], align 8 +// 
CHECK2-NEXT: [[TMP11:%.*]] = load ptr, ptr [[__END2]], align 8 +// CHECK2-NEXT: store ptr [[TMP11]], ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP12:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_10]], align 8 +// CHECK2-NEXT: [[TMP13:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[TMP12]] to i64 +// CHECK2-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[TMP13]] to i64 +// CHECK2-NEXT: [[SUB_PTR_SUB:%.*]] = sub i64 [[SUB_PTR_LHS_CAST]], [[SUB_PTR_RHS_CAST]] +// CHECK2-NEXT: [[SUB_PTR_DIV:%.*]] = sdiv exact i64 [[SUB_PTR_SUB]], 8 +// CHECK2-NEXT: [[SUB12:%.*]] = sub nsw i64 [[SUB_PTR_DIV]], 1 +// CHECK2-NEXT: [[ADD13:%.*]] = add nsw i64 [[SUB12]], 1 +// CHECK2-NEXT: [[DIV14:%.*]] = sdiv i64 [[ADD13]], 1 +// CHECK2-NEXT: [[SUB15:%.*]] = sub nsw i64 [[DIV14]], 1 +// CHECK2-NEXT: store i64 [[SUB15]], ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: store i64 1, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP14:%.*]] = load i64, ptr [[DOTCAPTURE_EXPR_11]], align 8 +// CHECK2-NEXT: [[ADD19:%.*]] = add nsw i64 [[TMP14]], 1 +// CHECK2-NEXT: store i64 [[ADD19]], ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[TMP15:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: store i64 [[TMP15]], ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK2-NEXT: [[TMP16:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK2-NEXT: [[TMP17:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP23:%.*]] = icmp sgt i64 [[TMP16]], [[TMP17]] +// CHECK2-NEXT: br i1 [[CMP23]], label %[[COND_TRUE24:.*]], label %[[COND_FALSE25:.*]] +// CHECK2: [[COND_TRUE24]]: +// CHECK2-NEXT: [[TMP18:%.*]] = load i64, ptr [[DOTOMP_TEMP_121]], align 8 +// CHECK2-NEXT: br label %[[COND_END26:.*]] +// CHECK2: [[COND_FALSE25]]: +// CHECK2-NEXT: [[TMP19:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: br label %[[COND_END26]] +// CHECK2: 
[[COND_END26]]: +// CHECK2-NEXT: [[COND27:%.*]] = phi i64 [ [[TMP18]], %[[COND_TRUE24]] ], [ [[TMP19]], %[[COND_FALSE25]] ] +// CHECK2-NEXT: store i64 [[COND27]], ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK2-NEXT: store i32 0, ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND:.*]] +// CHECK2: [[FOR_COND]]: +// CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[CMP28:%.*]] = icmp slt i32 [[TMP20]], 128 +// CHECK2-NEXT: br i1 [[CMP28]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] +// CHECK2: [[FOR_BODY]]: +// CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP21]]) +// CHECK2-NEXT: br label %[[FOR_INC:.*]] +// CHECK2: [[FOR_INC]]: +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add nsw i32 [[TMP22]], 1 +// CHECK2-NEXT: store i32 [[INC]], ptr [[I]], align 4 +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP8:![0-9]+]] +// CHECK2: [[FOR_END]]: +// CHECK2-NEXT: store i64 0, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND30:.*]] +// CHECK2: [[FOR_COND30]]: +// CHECK2-NEXT: [[TMP23:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[TMP24:%.*]] = load i64, ptr [[DOTOMP_FUSE_MAX22]], align 8 +// CHECK2-NEXT: [[CMP31:%.*]] = icmp slt i64 [[TMP23]], [[TMP24]] +// CHECK2-NEXT: br i1 [[CMP31]], label %[[FOR_BODY32:.*]], label %[[FOR_END63:.*]] +// CHECK2: [[FOR_BODY32]]: +// CHECK2-NEXT: [[TMP25:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[TMP26:%.*]] = load i64, ptr [[DOTOMP_NI05]], align 8 +// CHECK2-NEXT: [[CMP33:%.*]] = icmp slt i64 [[TMP25]], [[TMP26]] +// CHECK2-NEXT: br i1 [[CMP33]], label %[[IF_THEN:.*]], label %[[IF_END53:.*]] +// CHECK2: [[IF_THEN]]: +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTOMP_LB03]], align 4 +// CHECK2-NEXT: [[CONV34:%.*]] = sext i32 [[TMP27]] to i64 +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr 
[[DOTOMP_ST04]], align 4 +// CHECK2-NEXT: [[CONV35:%.*]] = sext i32 [[TMP28]] to i64 +// CHECK2-NEXT: [[TMP29:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[MUL:%.*]] = mul nsw i64 [[CONV35]], [[TMP29]] +// CHECK2-NEXT: [[ADD36:%.*]] = add nsw i64 [[CONV34]], [[MUL]] +// CHECK2-NEXT: [[CONV37:%.*]] = trunc i64 [[ADD36]] to i32 +// CHECK2-NEXT: store i32 [[CONV37]], ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_IV06]], align 4 +// CHECK2-NEXT: [[MUL38:%.*]] = mul nsw i32 [[TMP30]], 1 +// CHECK2-NEXT: [[ADD39:%.*]] = add nsw i32 0, [[MUL38]] +// CHECK2-NEXT: store i32 [[ADD39]], ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP40:%.*]] = icmp slt i32 [[TMP31]], [[TMP32]] +// CHECK2-NEXT: br i1 [[CMP40]], label %[[IF_THEN41:.*]], label %[[IF_END:.*]] +// CHECK2: [[IF_THEN41]]: +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL42:%.*]] = mul nsw i32 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: [[ADD43:%.*]] = add nsw i32 [[TMP33]], [[MUL42]] +// CHECK2-NEXT: store i32 [[ADD43]], ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[MUL44:%.*]] = mul nsw i32 [[TMP36]], 2 +// CHECK2-NEXT: [[ADD45:%.*]] = add nsw i32 0, [[MUL44]] +// CHECK2-NEXT: store i32 [[ADD45]], ptr [[J]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP37]]) +// CHECK2-NEXT: br label %[[IF_END]] +// CHECK2: [[IF_END]]: +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP46:%.*]] = icmp slt i32 [[TMP38]], [[TMP39]] +// CHECK2-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] +// CHECK2: [[IF_THEN47]]: +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL48:%.*]] = mul nsw i32 [[TMP41]], [[TMP42]] +// CHECK2-NEXT: [[ADD49:%.*]] = add nsw i32 [[TMP40]], [[MUL48]] +// CHECK2-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[MUL50:%.*]] = mul nsw i32 [[TMP43]], 1 +// CHECK2-NEXT: [[ADD51:%.*]] = add nsw i32 0, [[MUL50]] +// CHECK2-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP44]]) +// CHECK2-NEXT: br label %[[IF_END52]] +// CHECK2: [[IF_END52]]: +// CHECK2-NEXT: br label %[[IF_END53]] +// CHECK2: [[IF_END53]]: +// CHECK2-NEXT: [[TMP45:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[TMP46:%.*]] = load i64, ptr [[DOTOMP_NI118]], align 8 +// CHECK2-NEXT: [[CMP54:%.*]] = icmp slt i64 [[TMP45]], [[TMP46]] +// CHECK2-NEXT: br i1 [[CMP54]], label %[[IF_THEN55:.*]], label %[[IF_END60:.*]] +// CHECK2: [[IF_THEN55]]: +// CHECK2-NEXT: [[TMP47:%.*]] = load i64, ptr [[DOTOMP_LB116]], align 8 +// CHECK2-NEXT: [[TMP48:%.*]] = load i64, ptr [[DOTOMP_ST117]], align 8 +// CHECK2-NEXT: [[TMP49:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[MUL56:%.*]] = mul nsw i64 [[TMP48]], [[TMP49]] +// CHECK2-NEXT: [[ADD57:%.*]] = add nsw i64 [[TMP47]], [[MUL56]] +// CHECK2-NEXT: store i64 [[ADD57]], ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[TMP50:%.*]] = load ptr, ptr [[DOTCAPTURE_EXPR_8]], align 8 +// CHECK2-NEXT: [[TMP51:%.*]] = load i64, ptr [[DOTOMP_IV120]], align 8 +// CHECK2-NEXT: [[MUL58:%.*]] = mul nsw i64 [[TMP51]], 1 +// CHECK2-NEXT: [[ADD_PTR59:%.*]] = getelementptr inbounds double, ptr [[TMP50]], i64 [[MUL58]] +// CHECK2-NEXT: store ptr [[ADD_PTR59]], ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: [[TMP52:%.*]] = load ptr, ptr [[__BEGIN2]], align 8 +// CHECK2-NEXT: store ptr [[TMP52]], ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[C]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load ptr, ptr [[V]], align 8 +// CHECK2-NEXT: [[TMP55:%.*]] = load double, ptr [[TMP54]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP53]], double noundef [[TMP55]]) +// CHECK2-NEXT: br label %[[IF_END60]] +// CHECK2: [[IF_END60]]: +// CHECK2-NEXT: br label %[[FOR_INC61:.*]] +// CHECK2: [[FOR_INC61]]: +// CHECK2-NEXT: [[TMP56:%.*]] = load i64, ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: [[INC62:%.*]] = add nsw i64 [[TMP56]], 1 +// CHECK2-NEXT: store i64 [[INC62]], ptr [[DOTOMP_FUSE_INDEX29]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND30]], !llvm.loop [[LOOP9:![0-9]+]] +// CHECK2: [[FOR_END63]]: +// CHECK2-NEXT: store i32 37, ptr [[CC]], align 4 +// CHECK2-NEXT: store ptr [[ARR]], ptr [[__RANGE264]], align 8 +// CHECK2-NEXT: [[TMP57:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY66:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP57]], i64 0, i64 0 +// CHECK2-NEXT: store ptr [[ARRAYDECAY66]], ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: [[TMP58:%.*]] = load ptr, ptr [[__RANGE264]], align 8 +// CHECK2-NEXT: [[ARRAYDECAY68:%.*]] = getelementptr inbounds [256 x double], ptr [[TMP58]], i64 0, i64 0 +// CHECK2-NEXT: [[ADD_PTR69:%.*]] = getelementptr inbounds double, ptr [[ARRAYDECAY68]], i64 256 +// CHECK2-NEXT: store ptr [[ADD_PTR69]], ptr [[__END267]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND70:.*]] +// CHECK2: [[FOR_COND70]]: +// CHECK2-NEXT: [[TMP59:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: [[TMP60:%.*]] = load ptr, ptr [[__END267]], align 8 +// CHECK2-NEXT: [[CMP71:%.*]] = icmp ne ptr [[TMP59]], [[TMP60]] +// CHECK2-NEXT: br i1 [[CMP71]], label %[[FOR_BODY72:.*]], label %[[FOR_END74:.*]] +// CHECK2: [[FOR_BODY72]]: +// CHECK2-NEXT: [[TMP61:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: store ptr [[TMP61]], ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP62:%.*]] = load i32, ptr [[CC]], align 4 +// CHECK2-NEXT: [[TMP63:%.*]] = load ptr, ptr [[VV]], align 8 +// CHECK2-NEXT: [[TMP64:%.*]] = load double, ptr [[TMP63]], align 8 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP62]], double noundef [[TMP64]]) +// CHECK2-NEXT: br label %[[FOR_INC73:.*]] +// CHECK2: [[FOR_INC73]]: +// CHECK2-NEXT: [[TMP65:%.*]] = load ptr, ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: [[INCDEC_PTR:%.*]] = getelementptr inbounds nuw double, ptr [[TMP65]], i32 1 +// CHECK2-NEXT: store ptr [[INCDEC_PTR]], ptr [[__BEGIN265]], align 8 +// CHECK2-NEXT: br label %[[FOR_COND70]] +// CHECK2: [[FOR_END74]]: +// CHECK2-NEXT: ret void +// +// // CHECK2-LABEL: define dso_local void @tfoo2( // CHECK2-SAME: ) #[[ATTR0]] { // CHECK2-NEXT: [[ENTRY:.*:]] @@ -1593,7 +2097,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_2:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST0:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI0:%.*]] = alloca i32, align 4 @@ -1602,7 +2105,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_7:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP8:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_9:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST1:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI1:%.*]] = alloca i32, align 4 @@ -1611,7 +2113,6 @@ extern "C" void foo4() { // CHECK2-NEXT: [[DOTCAPTURE_EXPR_19:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTNEW_STEP21:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTCAPTURE_EXPR_22:%.*]] = alloca i32, align 4 -// CHECK2-NEXT: [[DOTOMP_UB2:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_LB2:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_ST2:%.*]] = alloca i32, align 4 // CHECK2-NEXT: [[DOTOMP_NI2:%.*]] = alloca i32, align 4 @@ -1641,174 +2142,168 @@ extern 
"C" void foo4() { // CHECK2-NEXT: [[DIV:%.*]] = udiv i32 [[ADD]], [[TMP7]] // CHECK2-NEXT: [[SUB4:%.*]] = sub i32 [[DIV]], 1 // CHECK2-NEXT: store i32 [[SUB4]], ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: store i32 [[TMP8]], ptr [[DOTOMP_UB0]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB0]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 -// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP9]], 1 +// CHECK2-NEXT: [[TMP8:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_2]], align 4 +// CHECK2-NEXT: [[ADD5:%.*]] = add i32 [[TMP8]], 1 // CHECK2-NEXT: store i32 [[ADD5]], ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[TMP9:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP9]], ptr [[J]], align 4 // CHECK2-NEXT: [[TMP10:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP10]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP12]], ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP13]], ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 -// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP14]], [[TMP15]] +// CHECK2-NEXT: store i32 [[TMP10]], ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP11:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP11]], ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[TMP12:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP12]], ptr 
[[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[TMP13:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP14:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_7]], align 4 +// CHECK2-NEXT: [[SUB10:%.*]] = sub i32 [[TMP13]], [[TMP14]] // CHECK2-NEXT: [[SUB11:%.*]] = sub i32 [[SUB10]], 1 +// CHECK2-NEXT: [[TMP15:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP15]] // CHECK2-NEXT: [[TMP16:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[ADD12:%.*]] = add i32 [[SUB11]], [[TMP16]] -// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP17]] +// CHECK2-NEXT: [[DIV13:%.*]] = udiv i32 [[ADD12]], [[TMP16]] // CHECK2-NEXT: [[SUB14:%.*]] = sub i32 [[DIV13]], 1 // CHECK2-NEXT: store i32 [[SUB14]], ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: store i32 [[TMP18]], ptr [[DOTOMP_UB1]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB1]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 -// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP19]], 1 +// CHECK2-NEXT: [[TMP17:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_9]], align 4 +// CHECK2-NEXT: [[ADD15:%.*]] = add i32 [[TMP17]], 1 // CHECK2-NEXT: store i32 [[ADD15]], ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP18:%.*]] = load i32, ptr [[START_ADDR]], align 4 +// CHECK2-NEXT: [[TMP19:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP18]], [[TMP19]] +// CHECK2-NEXT: store i32 [[ADD16]], ptr [[K]], align 4 // CHECK2-NEXT: [[TMP20:%.*]] = load i32, ptr [[START_ADDR]], align 4 // CHECK2-NEXT: [[TMP21:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: [[ADD16:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] -// CHECK2-NEXT: store i32 [[ADD16]], ptr [[K]], 
align 4 -// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[START_ADDR]], align 4 -// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] +// CHECK2-NEXT: [[ADD18:%.*]] = add nsw i32 [[TMP20]], [[TMP21]] // CHECK2-NEXT: store i32 [[ADD18]], ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[END_ADDR]], align 4 -// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP24]], [[TMP25]] +// CHECK2-NEXT: [[TMP22:%.*]] = load i32, ptr [[END_ADDR]], align 4 +// CHECK2-NEXT: [[TMP23:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: [[ADD20:%.*]] = add nsw i32 [[TMP22]], [[TMP23]] // CHECK2-NEXT: store i32 [[ADD20]], ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 -// CHECK2-NEXT: store i32 [[TMP26]], ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 -// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK2-NEXT: [[SUB23:%.*]] = sub i32 [[TMP27]], [[TMP28]] +// CHECK2-NEXT: [[TMP24:%.*]] = load i32, ptr [[STEP_ADDR]], align 4 +// CHECK2-NEXT: store i32 [[TMP24]], ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[TMP25:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_19]], align 4 +// CHECK2-NEXT: [[TMP26:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[SUB23:%.*]] = sub i32 [[TMP25]], [[TMP26]] // CHECK2-NEXT: [[SUB24:%.*]] = sub i32 [[SUB23]], 1 -// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[ADD25:%.*]] = add i32 [[SUB24]], [[TMP29]] -// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP30]] +// CHECK2-NEXT: [[TMP27:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[ADD25:%.*]] = add i32 
[[SUB24]], [[TMP27]] +// CHECK2-NEXT: [[TMP28:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[DIV26:%.*]] = udiv i32 [[ADD25]], [[TMP28]] // CHECK2-NEXT: [[SUB27:%.*]] = sub i32 [[DIV26]], 1 // CHECK2-NEXT: store i32 [[SUB27]], ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK2-NEXT: store i32 [[TMP31]], ptr [[DOTOMP_UB2]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_LB2]], align 4 // CHECK2-NEXT: store i32 1, ptr [[DOTOMP_ST2]], align 4 -// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 -// CHECK2-NEXT: [[ADD28:%.*]] = add i32 [[TMP32]], 1 +// CHECK2-NEXT: [[TMP29:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_22]], align 4 +// CHECK2-NEXT: [[ADD28:%.*]] = add i32 [[TMP29]], 1 // CHECK2-NEXT: store i32 [[ADD28]], ptr [[DOTOMP_NI2]], align 4 -// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: store i32 [[TMP33]], ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP34:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 -// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP34]], [[TMP35]] +// CHECK2-NEXT: [[TMP30:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: store i32 [[TMP30]], ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP31:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP32:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP31]], [[TMP32]] // CHECK2-NEXT: br i1 [[CMP]], label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]] // CHECK2: [[COND_TRUE]]: -// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 +// CHECK2-NEXT: [[TMP33:%.*]] = load i32, ptr [[DOTOMP_TEMP_1]], align 4 // CHECK2-NEXT: br label %[[COND_END:.*]] // CHECK2: [[COND_FALSE]]: -// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[TMP34:%.*]] 
= load i32, ptr [[DOTOMP_NI1]], align 4 // CHECK2-NEXT: br label %[[COND_END]] // CHECK2: [[COND_END]]: -// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP36]], %[[COND_TRUE]] ], [ [[TMP37]], %[[COND_FALSE]] ] +// CHECK2-NEXT: [[COND:%.*]] = phi i32 [ [[TMP33]], %[[COND_TRUE]] ], [ [[TMP34]], %[[COND_FALSE]] ] // CHECK2-NEXT: store i32 [[COND]], ptr [[DOTOMP_TEMP_2]], align 4 -// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 -// CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 -// CHECK2-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP38]], [[TMP39]] +// CHECK2-NEXT: [[TMP35:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP36:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP29:%.*]] = icmp ugt i32 [[TMP35]], [[TMP36]] // CHECK2-NEXT: br i1 [[CMP29]], label %[[COND_TRUE30:.*]], label %[[COND_FALSE31:.*]] // CHECK2: [[COND_TRUE30]]: -// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 +// CHECK2-NEXT: [[TMP37:%.*]] = load i32, ptr [[DOTOMP_TEMP_2]], align 4 // CHECK2-NEXT: br label %[[COND_END32:.*]] // CHECK2: [[COND_FALSE31]]: -// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[TMP38:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 // CHECK2-NEXT: br label %[[COND_END32]] // CHECK2: [[COND_END32]]: -// CHECK2-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP40]], %[[COND_TRUE30]] ], [ [[TMP41]], %[[COND_FALSE31]] ] +// CHECK2-NEXT: [[COND33:%.*]] = phi i32 [ [[TMP37]], %[[COND_TRUE30]] ], [ [[TMP38]], %[[COND_FALSE31]] ] // CHECK2-NEXT: store i32 [[COND33]], ptr [[DOTOMP_FUSE_MAX]], align 4 // CHECK2-NEXT: store i32 0, ptr [[DOTOMP_FUSE_INDEX]], align 4 // CHECK2-NEXT: br label %[[FOR_COND:.*]] // CHECK2: [[FOR_COND]]: -// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 -// CHECK2-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP42]], [[TMP43]] +// 
CHECK2-NEXT: [[TMP39:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP40:%.*]] = load i32, ptr [[DOTOMP_FUSE_MAX]], align 4 +// CHECK2-NEXT: [[CMP34:%.*]] = icmp ult i32 [[TMP39]], [[TMP40]] // CHECK2-NEXT: br i1 [[CMP34]], label %[[FOR_BODY:.*]], label %[[FOR_END:.*]] // CHECK2: [[FOR_BODY]]: -// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 -// CHECK2-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP44]], [[TMP45]] +// CHECK2-NEXT: [[TMP41:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP42:%.*]] = load i32, ptr [[DOTOMP_NI0]], align 4 +// CHECK2-NEXT: [[CMP35:%.*]] = icmp ult i32 [[TMP41]], [[TMP42]] // CHECK2-NEXT: br i1 [[CMP35]], label %[[IF_THEN:.*]], label %[[IF_END:.*]] // CHECK2: [[IF_THEN]]: -// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 -// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 -// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP47]], [[TMP48]] -// CHECK2-NEXT: [[ADD36:%.*]] = add i32 [[TMP46]], [[MUL]] +// CHECK2-NEXT: [[TMP43:%.*]] = load i32, ptr [[DOTOMP_LB0]], align 4 +// CHECK2-NEXT: [[TMP44:%.*]] = load i32, ptr [[DOTOMP_ST0]], align 4 +// CHECK2-NEXT: [[TMP45:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL:%.*]] = mul i32 [[TMP44]], [[TMP45]] +// CHECK2-NEXT: [[ADD36:%.*]] = add i32 [[TMP43]], [[MUL]] // CHECK2-NEXT: store i32 [[ADD36]], ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_]], align 4 -// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 -// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 -// CHECK2-NEXT: [[MUL37:%.*]] = mul i32 [[TMP50]], [[TMP51]] -// CHECK2-NEXT: [[ADD38:%.*]] = add i32 [[TMP49]], [[MUL37]] +// CHECK2-NEXT: [[TMP46:%.*]] = load i32, ptr 
[[DOTCAPTURE_EXPR_]], align 4 +// CHECK2-NEXT: [[TMP47:%.*]] = load i32, ptr [[DOTOMP_IV0]], align 4 +// CHECK2-NEXT: [[TMP48:%.*]] = load i32, ptr [[DOTNEW_STEP]], align 4 +// CHECK2-NEXT: [[MUL37:%.*]] = mul i32 [[TMP47]], [[TMP48]] +// CHECK2-NEXT: [[ADD38:%.*]] = add i32 [[TMP46]], [[MUL37]] // CHECK2-NEXT: store i32 [[ADD38]], ptr [[I]], align 4 -// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[I]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP52]]) +// CHECK2-NEXT: [[TMP49:%.*]] = load i32, ptr [[I]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP49]]) // CHECK2-NEXT: br label %[[IF_END]] // CHECK2: [[IF_END]]: -// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 -// CHECK2-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: [[TMP50:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP51:%.*]] = load i32, ptr [[DOTOMP_NI1]], align 4 +// CHECK2-NEXT: [[CMP39:%.*]] = icmp ult i32 [[TMP50]], [[TMP51]] // CHECK2-NEXT: br i1 [[CMP39]], label %[[IF_THEN40:.*]], label %[[IF_END45:.*]] // CHECK2: [[IF_THEN40]]: -// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 -// CHECK2-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 -// CHECK2-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL41:%.*]] = mul i32 [[TMP56]], [[TMP57]] -// CHECK2-NEXT: [[ADD42:%.*]] = add i32 [[TMP55]], [[MUL41]] +// CHECK2-NEXT: [[TMP52:%.*]] = load i32, ptr [[DOTOMP_LB1]], align 4 +// CHECK2-NEXT: [[TMP53:%.*]] = load i32, ptr [[DOTOMP_ST1]], align 4 +// CHECK2-NEXT: [[TMP54:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL41:%.*]] = mul i32 [[TMP53]], [[TMP54]] +// CHECK2-NEXT: [[ADD42:%.*]] = add i32 [[TMP52]], [[MUL41]] // CHECK2-NEXT: store i32 [[ADD42]], ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP58:%.*]] = load i32, ptr 
[[DOTCAPTURE_EXPR_6]], align 4 -// CHECK2-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 -// CHECK2-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 -// CHECK2-NEXT: [[MUL43:%.*]] = mul i32 [[TMP59]], [[TMP60]] -// CHECK2-NEXT: [[SUB44:%.*]] = sub i32 [[TMP58]], [[MUL43]] +// CHECK2-NEXT: [[TMP55:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_6]], align 4 +// CHECK2-NEXT: [[TMP56:%.*]] = load i32, ptr [[DOTOMP_IV1]], align 4 +// CHECK2-NEXT: [[TMP57:%.*]] = load i32, ptr [[DOTNEW_STEP8]], align 4 +// CHECK2-NEXT: [[MUL43:%.*]] = mul i32 [[TMP56]], [[TMP57]] +// CHECK2-NEXT: [[SUB44:%.*]] = sub i32 [[TMP55]], [[MUL43]] // CHECK2-NEXT: store i32 [[SUB44]], ptr [[J]], align 4 -// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[J]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP61]]) +// CHECK2-NEXT: [[TMP58:%.*]] = load i32, ptr [[J]], align 4 +// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP58]]) // CHECK2-NEXT: br label %[[IF_END45]] // CHECK2: [[IF_END45]]: -// CHECK2-NEXT: [[TMP62:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 -// CHECK2-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP62]], [[TMP63]] +// CHECK2-NEXT: [[TMP59:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[TMP60:%.*]] = load i32, ptr [[DOTOMP_NI2]], align 4 +// CHECK2-NEXT: [[CMP46:%.*]] = icmp ult i32 [[TMP59]], [[TMP60]] // CHECK2-NEXT: br i1 [[CMP46]], label %[[IF_THEN47:.*]], label %[[IF_END52:.*]] // CHECK2: [[IF_THEN47]]: -// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 -// CHECK2-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_ST2]], align 4 -// CHECK2-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[MUL48:%.*]] = mul i32 [[TMP65]], [[TMP66]] -// CHECK2-NEXT: [[ADD49:%.*]] = add i32 [[TMP64]], [[MUL48]] +// CHECK2-NEXT: [[TMP61:%.*]] = load i32, ptr [[DOTOMP_LB2]], align 4 +// CHECK2-NEXT: [[TMP62:%.*]] = 
load i32, ptr [[DOTOMP_ST2]], align 4 +// CHECK2-NEXT: [[TMP63:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[MUL48:%.*]] = mul i32 [[TMP62]], [[TMP63]] +// CHECK2-NEXT: [[ADD49:%.*]] = add i32 [[TMP61]], [[MUL48]] // CHECK2-NEXT: store i32 [[ADD49]], ptr [[DOTOMP_IV2]], align 4 -// CHECK2-NEXT: [[TMP67:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 -// CHECK2-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 -// CHECK2-NEXT: [[TMP69:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 -// CHECK2-NEXT: [[MUL50:%.*]] = mul i32 [[TMP68]], [[TMP69]] -// CHECK2-NEXT: [[ADD51:%.*]] = add i32 [[TMP67]], [[MUL50]] +// CHECK2-NEXT: [[TMP64:%.*]] = load i32, ptr [[DOTCAPTURE_EXPR_17]], align 4 +// CHECK2-NEXT: [[TMP65:%.*]] = load i32, ptr [[DOTOMP_IV2]], align 4 +// CHECK2-NEXT: [[TMP66:%.*]] = load i32, ptr [[DOTNEW_STEP21]], align 4 +// CHECK2-NEXT: [[MUL50:%.*]] = mul i32 [[TMP65]], [[TMP66]] +// CHECK2-NEXT: [[ADD51:%.*]] = add i32 [[TMP64]], [[MUL50]] // CHECK2-NEXT: store i32 [[ADD51]], ptr [[K]], align 4 -// CHECK2-NEXT: [[TMP70:%.*]] = load i32, ptr [[K]], align 4 -// CHECK2-NEXT: call void (...) @body(i32 noundef [[TMP70]]) +// CHECK2-NEXT: [[TMP67:%.*]] = load i32, ptr [[K]], align 4 +// CHECK2-NEXT: call void (...) 
@body(i32 noundef [[TMP67]]) // CHECK2-NEXT: br label %[[IF_END52]] // CHECK2: [[IF_END52]]: // CHECK2-NEXT: br label %[[FOR_INC:.*]] // CHECK2: [[FOR_INC]]: -// CHECK2-NEXT: [[TMP71:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP71]], 1 +// CHECK2-NEXT: [[TMP68:%.*]] = load i32, ptr [[DOTOMP_FUSE_INDEX]], align 4 +// CHECK2-NEXT: [[INC:%.*]] = add i32 [[TMP68]], 1 // CHECK2-NEXT: store i32 [[INC]], ptr [[DOTOMP_FUSE_INDEX]], align 4 -// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP8:![0-9]+]] +// CHECK2-NEXT: br label %[[FOR_COND]], !llvm.loop [[LOOP10:![0-9]+]] // CHECK2: [[FOR_END]]: // CHECK2-NEXT: ret void // @@ -1819,6 +2314,8 @@ extern "C" void foo4() { // CHECK1: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} // CHECK1: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]} // CHECK1: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]} +// CHECK1: [[LOOP9]] = distinct !{[[LOOP9]], [[META4]]} +// CHECK1: [[LOOP10]] = distinct !{[[LOOP10]], [[META4]]} //. // CHECK2: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]} // CHECK2: [[META4]] = !{!"llvm.loop.mustprogress"} @@ -1826,4 +2323,6 @@ extern "C" void foo4() { // CHECK2: [[LOOP6]] = distinct !{[[LOOP6]], [[META4]]} // CHECK2: [[LOOP7]] = distinct !{[[LOOP7]], [[META4]]} // CHECK2: [[LOOP8]] = distinct !{[[LOOP8]], [[META4]]} +// CHECK2: [[LOOP9]] = distinct !{[[LOOP9]], [[META4]]} +// CHECK2: [[LOOP10]] = distinct !{[[LOOP10]], [[META4]]} //. 
>From 823bc08b4ef97458665ed41409e03cd07598efd3 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:44:48 +0000 Subject: [PATCH 5/9] Fixed missing diagnostic groups in warnings --- clang/include/clang/Basic/DiagnosticSemaKinds.td | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 191618e7865dc..a6ae0de004c8a 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -11559,7 +11559,8 @@ def note_omp_implicit_dsa : Note< def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; def warn_omp_different_loop_ind_var_types : Warning < - "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">; + "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">, + InGroup; def err_omp_not_canonical_loop : Error < "loop after '#pragma omp %0' is not in canonical form">; def err_omp_not_a_loop_sequence : Error < @@ -11570,7 +11571,8 @@ def err_omp_invalid_looprange : Error < "loop range in '#pragma omp %0' exceeds the number of available loops: " "range end '%1' is greater than the total number of loops '%2'">; def warn_omp_redundant_fusion : Warning < - "loop range in '#pragma omp %0' contains only a single loop, resulting in redundant fusion">; + "loop range in '#pragma omp %0' contains only a single loop, resulting in redundant fusion">, + InGroup; def err_omp_not_for : Error< "%select{statement after '#pragma omp %1' must be a for loop|" "expected %2 for loops after '#pragma omp %1'%select{|, but found only %4}3}0">; >From 422ffd7ef80a83156037a34c6ad955e67c504b4d Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:49:50 +0000 Subject: [PATCH 6/9] Fixed formatting and comments --- 
clang/lib/Sema/SemaOpenMP.cpp | 112 ++++++++++++++++++---------------- 1 file changed, 58 insertions(+), 54 deletions(-) diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index b0529c9352c83..485eebf23ef93 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -14160,42 +14160,43 @@ StmtResult SemaOpenMP::ActOnOpenMPTargetTeamsDistributeSimdDirective( } // Overloaded base case function -template <typename T, typename F> -static bool tryHandleAs(T *t, F &&) { - return false; +template <typename T, typename F> static bool tryHandleAs(T *t, F &&) { + return false; } /** - * Tries to recursively cast `t` to one of the given types and invokes `f` if successful. + * Tries to recursively cast `t` to one of the given types and invokes `f` if + * successful. * * @tparam Class The first type to check. * @tparam Rest The remaining types to check. * @tparam T The base type of `t`. - * @tparam F The callable type for the function to invoke upon a successful cast. + * @tparam F The callable type for the function to invoke upon a successful + * cast. * @param t The object to be checked. * @param f The function to invoke if `t` matches `Class`. * @return `true` if `t` matched any type and `f` was called, otherwise `false`. */ template <typename Class, typename... Rest, typename T, typename F> static bool tryHandleAs(T *t, F &&f) { - if (Class *c = dyn_cast<Class>(t)) { - f(c); - return true; - } else { - return tryHandleAs<Rest...>(t, std::forward<F>(f)); - } + if (Class *c = dyn_cast<Class>(t)) { + f(c); + return true; + } else { + return tryHandleAs<Rest...>(t, std::forward<F>(f)); + } } // Updates OriginalInits by checking Transform against loop transformation // directives and appending their pre-inits if a match is found. 
static void updatePreInits(OMPLoopBasedDirective *Transform, SmallVectorImpl> &PreInits) { - if (!tryHandleAs( - Transform, [&PreInits](auto *Dir) { - appendFlattenedStmtList(PreInits.back(), Dir->getPreInits()); - })) - llvm_unreachable("Unhandled loop transformation"); + if (!tryHandleAs( + Transform, [&PreInits](auto *Dir) { + appendFlattenedStmtList(PreInits.back(), Dir->getPreInits()); + })) + llvm_unreachable("Unhandled loop transformation"); } bool SemaOpenMP::checkTransformableLoopNest( @@ -14273,43 +14274,42 @@ class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { unsigned getNestedLoopCount() const { return NestedLoopCount; } bool VisitForStmt(ForStmt *FS) override { - ++NestedLoopCount; - return true; + ++NestedLoopCount; + return true; } bool VisitCXXForRangeStmt(CXXForRangeStmt *FRS) override { - ++NestedLoopCount; - return true; + ++NestedLoopCount; + return true; } bool TraverseStmt(Stmt *S) override { - if (!S) + if (!S) return true; - // Skip traversal of all expressions, including special cases like - // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions - // may contain inner statements (and even loops), but they are not part - // of the syntactic body of the surrounding loop structure. - // Therefore must not be counted - if (isa(S)) + // Skip traversal of all expressions, including special cases like + // LambdaExpr, StmtExpr, BlockExpr, and RequiresExpr. These expressions + // may contain inner statements (and even loops), but they are not part + // of the syntactic body of the surrounding loop structure. 
+ // Therefore must not be counted + if (isa(S)) return true; - // Only recurse into CompoundStmt (block {}) and loop bodies - if (isa(S) || isa(S) || - isa(S)) { + // Only recurse into CompoundStmt (block {}) and loop bodies + if (isa(S) || isa(S) || isa(S)) { return DynamicRecursiveASTVisitor::TraverseStmt(S); - } + } - // Stop traversal of the rest of statements, that break perfect - // loop nesting, such as control flow (IfStmt, SwitchStmt...) - return true; + // Stop traversal of the rest of statements, that break perfect + // loop nesting, such as control flow (IfStmt, SwitchStmt...) + return true; } bool TraverseDecl(Decl *D) override { - // Stop in the case of finding a declaration, it is not important - // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, - // FunctionDecl...) - return true; + // Stop in the case of finding a declaration, it is not important + // in order to find nested loops (Possible CXXRecordDecl, RecordDecl, + // FunctionDecl...) + return true; } }; @@ -14467,15 +14467,14 @@ bool SemaOpenMP::analyzeLoopSequence( return isa(Child); }; - // High level grammar validation for (auto *Child : LoopSeqStmt->children()) { - if (!Child) + if (!Child) continue; - // Skip over non-loop-sequence statements - if (!isLoopSequenceDerivation(Child)) { + // Skip over non-loop-sequence statements + if (!isLoopSequenceDerivation(Child)) { Child = Child->IgnoreContainers(); // Ignore empty compound statement @@ -14493,9 +14492,9 @@ bool SemaOpenMP::analyzeLoopSequence( // Already been treated, skip this children continue; } - } - // Regular loop sequence handling - if (isLoopSequenceDerivation(Child)) { + } + // Regular loop sequence handling + if (isLoopSequenceDerivation(Child)) { if (isLoopGeneratingStmt(Child)) { if (!analyzeLoopGeneration(Child)) { return false; @@ -14509,12 +14508,12 @@ bool SemaOpenMP::analyzeLoopSequence( // Update the Loop Sequence size by one ++LoopSeqSize; } - } else { + } else { // Report error for invalid 
statement inside canonical loop sequence Diag(Child->getBeginLoc(), diag::err_omp_not_for) << 0 << getOpenMPDirectiveName(Kind); return false; - } + } } return true; } @@ -14531,9 +14530,9 @@ bool SemaOpenMP::checkTransformableLoopSequence( // Checks whether the given statement is a compound statement if (!isa(AStmt)) { - Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) - << getOpenMPDirectiveName(Kind); - return false; + Diag(AStmt->getBeginLoc(), diag::err_omp_not_a_loop_sequence) + << getOpenMPDirectiveName(Kind); + return false; } // Number of top level canonical loop nests observed (And acts as index) LoopSeqSize = 0; @@ -14564,7 +14563,7 @@ bool SemaOpenMP::checkTransformableLoopSequence( OriginalInits, TransformsPreInits, LoopSequencePreInits, LoopCategories, Context, Kind)) { - return false; + return false; } if (LoopSeqSize <= 0) { Diag(AStmt->getBeginLoc(), diag::err_omp_empty_loop_sequence) @@ -15278,7 +15277,7 @@ StmtResult SemaOpenMP::ActOnOpenMPUnrollDirective(ArrayRef Clauses, Stmt *LoopStmt = nullptr; collectLoopStmts(AStmt, {LoopStmt}); - // Determine the PreInit declarations.e + // Determine the PreInit declarations. SmallVector PreInits; addLoopPreInits(Context, LoopHelper, LoopStmt, OriginalInits[0], PreInits); @@ -15894,13 +15893,18 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, CountVal = CountInt.getZExtValue(); }; - // Checks if the loop range is valid + // OpenMP [6.0, Restrictions] + // first + count - 1 must not evaluate to a value greater than the + // loop sequence length of the associated canonical loop sequence. auto ValidLoopRange = [](uint64_t FirstVal, uint64_t CountVal, unsigned NumLoops) -> bool { return FirstVal + CountVal - 1 <= NumLoops; }; uint64_t FirstVal = 1, CountVal = 0, LastVal = LoopSeqSize; + // Validates the loop range after evaluating the semantic information + // and ensures that the range is valid for the given loop sequence size. 
+ // Expressions are evaluated at compile time to obtain constant values. if (LRC) { EvaluateLoopRangeArguments(LRC->getFirst(), LRC->getCount(), FirstVal, CountVal); >From ac0d9e348109f742440003945d278a9c26f56976 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Fri, 9 May 2025 10:58:54 +0000 Subject: [PATCH 7/9] Added minimal changes to enable flang future implementation --- flang/include/flang/Parser/dump-parse-tree.h | 1 + flang/include/flang/Parser/parse-tree.h | 9 +++++++++ flang/lib/Lower/OpenMP/Clauses.cpp | 5 +++++ flang/lib/Lower/OpenMP/Clauses.h | 1 + flang/lib/Parser/openmp-parsers.cpp | 7 +++++++ flang/lib/Parser/unparse.cpp | 7 +++++++ flang/lib/Semantics/check-omp-structure.cpp | 9 +++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 8 files changed, 40 insertions(+) diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index df9278697346f..c220c4dafb52f 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -609,6 +609,7 @@ class ParseTreeDumper { NODE(OmpLinearClause, Modifier) NODE(parser, OmpLinearModifier) NODE_ENUM(OmpLinearModifier, Value) + NODE(parser, OmpLoopRangeClause) NODE(parser, OmpStepComplexModifier) NODE(parser, OmpStepSimpleModifier) NODE(parser, OmpLoopDirective) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 254236b510544..be80141b49e2b 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4361,6 +4361,15 @@ struct OmpLinearClause { std::tuple t; }; +// Ref: [6.0:207-208] +// +// loop-range-clause -> +// LOOPRANGE(first, count) // since 6.0 +struct OmpLoopRangeClause { + TUPLE_CLASS_BOILERPLATE(OmpLoopRangeClause); + std::tuple t; +}; + // Ref: [4.5:216-219], [5.0:315-324], [5.1:347-355], [5.2:150-158] // // map-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index 
f3088b18b77ff..ea535ab3adbe7 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -998,6 +998,11 @@ Link make(const parser::OmpClause::Link &inp, return Link{/*List=*/makeObjects(inp.v, semaCtx)}; } +LoopRange make(const parser::OmpClause::Looprange &inp, + semantics::SemanticsContext &semaCtx) { + llvm_unreachable("Unimplemented: looprange"); +} + Map make(const parser::OmpClause::Map &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpMapClause diff --git a/flang/lib/Lower/OpenMP/Clauses.h b/flang/lib/Lower/OpenMP/Clauses.h index d7ab21d428e32..bda8571e65f23 100644 --- a/flang/lib/Lower/OpenMP/Clauses.h +++ b/flang/lib/Lower/OpenMP/Clauses.h @@ -239,6 +239,7 @@ using Initializer = tomp::clause::InitializerT; using InReduction = tomp::clause::InReductionT; using IsDevicePtr = tomp::clause::IsDevicePtrT; using Lastprivate = tomp::clause::LastprivateT; +using LoopRange = tomp::clause::LoopRangeT; using Linear = tomp::clause::LinearT; using Link = tomp::clause::LinkT; using Map = tomp::clause::MapT; diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..393dbe8ada002 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -841,6 +841,11 @@ TYPE_PARSER( maybe(":"_tok >> nonemptyList(Parser{})), /*PostModified=*/pure(true))) +TYPE_PARSER( + construct(scalarIntConstantExpr, + "," >> scalarIntConstantExpr) +) + // OpenMPv5.2 12.5.2 detach-clause -> DETACH (event-handle) TYPE_PARSER(construct(Parser{})) @@ -1010,6 +1015,8 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "LINK" >> construct(construct( parenthesized(Parser{}))) || + "LOOPRANGE" >> construct(construct( + parenthesized(Parser{}))) || "MAP" >> construct(construct( parenthesized(Parser{}))) || "MATCH" >> construct(construct( diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..00b5a8c0600e1 100644 --- 
a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2314,6 +2314,13 @@ class UnparseVisitor { } } } + void Unparse(const OmpLoopRangeClause &x) { + Word("LOOPRANGE("); + Walk(std::get<0>(x.t)); + Put(", "); + Walk(std::get<1>(x.t)); + Put(")"); + } void Unparse(const OmpReductionClause &x) { using Modifier = OmpReductionClause::Modifier; Walk(std::get>>(x.t), ": "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 606014276e7ca..4af2b4909fcb6 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3383,6 +3383,15 @@ CHECK_REQ_CONSTANT_SCALAR_INT_CLAUSE(Collapse, OMPC_collapse) CHECK_REQ_CONSTANT_SCALAR_INT_CLAUSE(Safelen, OMPC_safelen) CHECK_REQ_CONSTANT_SCALAR_INT_CLAUSE(Simdlen, OMPC_simdlen) +void OmpStructureChecker::Enter(const parser::OmpClause::Looprange &x) { + context_.Say(GetContext().clauseSource, + "LOOPRANGE clause is not implemented yet"_err_en_US, + ContextDirectiveAsFortran()); +} + +void OmpStructureChecker::Enter(const parser::OmpClause::FreeAgent &x) { + context_.Say(GetContext().clauseSource, + "FREE_AGENT clause is not implemented yet"_err_en_US, // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index ae19385c022d0..3be758686c634 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -273,6 +273,7 @@ def OMPC_Link : Clause<"link"> { } def OMPC_LoopRange : Clause<"looprange"> { let clangClass = "OMPLoopRangeClause"; + let flangClass = "OmpLoopRangeClause"; } def OMPC_Map : Clause<"map"> { let clangClass = "OMPMapClause"; >From e6e00ae563e491968637e00d2a15a7272bc9d146 Mon Sep 17 00:00:00 2001 From: eZWALT Date: Wed, 21 May 2025 13:14:22 +0000 Subject: [PATCH 8/9] Address basic PR feedback --- clang/include/clang/AST/OpenMPClause.h | 93 ++++---- clang/include/clang/AST/StmtOpenMP.h | 3 +- clang/include/clang/Sema/SemaOpenMP.h | 14 +- clang/lib/AST/OpenMPClause.cpp | 17 +- clang/lib/CodeGen/CGExpr.cpp | 5 +- clang/lib/CodeGen/CodeGenFunction.h | 4 - clang/lib/Sema/SemaOpenMP.cpp | 224 +++++++++----------- flang/lib/Semantics/check-omp-structure.cpp | 3 - 8 files changed, 166 insertions(+), 197 deletions(-) diff --git a/clang/include/clang/AST/OpenMPClause.h b/clang/include/clang/AST/OpenMPClause.h index 8f937cdef9cd0..3df5133a17fb4 100644 --- a/clang/include/clang/AST/OpenMPClause.h +++ b/clang/include/clang/AST/OpenMPClause.h @@ -1153,82 +1153,73 @@ class OMPFullClause final : public OMPNoChildClause { /// for(int j = 0; j < 256; j+=2) /// for(int k = 127; k >= 0; --k) /// \endcode -class OMPLoopRangeClause final : public OMPClause { +class OMPLoopRangeClause final + : public OMPClause, + private llvm::TrailingObjects { friend class OMPClauseReader; - - explicit OMPLoopRangeClause() - : OMPClause(llvm::omp::OMPC_looprange, {}, {}) {} + friend class llvm::TrailingObjects; /// Location of '(' SourceLocation LParenLoc; - /// Location of 'first' - SourceLocation FirstLoc; - - /// Location of 'count' - SourceLocation CountLoc; - - /// Expr associated with 'first' argument - Expr *First = nullptr; - - /// Expr associated 
with 'count' argument - Expr *Count = nullptr; - - /// Set 'first' - void setFirst(Expr *First) { this->First = First; } + /// Location of first and count expressions + SourceLocation FirstLoc, CountLoc; - /// Set 'count' - void setCount(Expr *Count) { this->Count = Count; } + /// Number of looprange arguments (always 2: first, count) + unsigned NumArgs = 2; - /// Set location of '('. - void setLParenLoc(SourceLocation Loc) { LParenLoc = Loc; } - - /// Set location of 'first' argument - void setFirstLoc(SourceLocation Loc) { FirstLoc = Loc; } + /// Set the argument expressions. + void setArgs(ArrayRef Args) { + assert(Args.size() == NumArgs && "Expected exactly 2 looprange arguments"); + std::copy(Args.begin(), Args.end(), getTrailingObjects()); + } - /// Set location of 'count' argument - void setCountLoc(SourceLocation Loc) { CountLoc = Loc; } + /// Build an empty clause for deserialization. + explicit OMPLoopRangeClause() + : OMPClause(llvm::omp::OMPC_looprange, {}, {}), NumArgs(2) {} public: - /// Build an AST node for a 'looprange' clause - /// - /// \param StartLoc Starting location of the clause. - /// \param LParenLoc Location of '('. - /// \param ModifierLoc Modifier location. - /// \param + /// Build a 'looprange' clause AST node. static OMPLoopRangeClause * Create(const ASTContext &C, SourceLocation StartLoc, SourceLocation LParenLoc, SourceLocation FirstLoc, SourceLocation CountLoc, - SourceLocation EndLoc, Expr *First, Expr *Count); + SourceLocation EndLoc, ArrayRef Args); - /// Build an empty 'looprange' node for deserialization - /// - /// \param C Context of the AST. + /// Build an empty 'looprange' clause node. 
static OMPLoopRangeClause *CreateEmpty(const ASTContext &C); - /// Returns the location of '(' + // Location getters/setters SourceLocation getLParenLoc() const { return LParenLoc; } - - /// Returns the location of 'first' SourceLocation getFirstLoc() const { return FirstLoc; } - - /// Returns the location of 'count' SourceLocation getCountLoc() const { return CountLoc; } - /// Returns the argument 'first' or nullptr if not set - Expr *getFirst() const { return cast_or_null(First); } + void setLParenLoc(SourceLocation Loc) { LParenLoc = Loc; } + void setFirstLoc(SourceLocation Loc) { FirstLoc = Loc; } + void setCountLoc(SourceLocation Loc) { CountLoc = Loc; } - /// Returns the argument 'count' or nullptr if not set - Expr *getCount() const { return cast_or_null(Count); } + /// Get looprange arguments: first and count + Expr *getFirst() const { return getArgs()[0]; } + Expr *getCount() const { return getArgs()[1]; } - child_range children() { - return child_range(reinterpret_cast(&First), - reinterpret_cast(&Count) + 1); + /// Set looprange arguments: first and count + void setFirst(Expr *E) { getArgs()[0] = E; } + void setCount(Expr *E) { getArgs()[1] = E; } + + MutableArrayRef getArgs() { + return MutableArrayRef(getTrailingObjects(), NumArgs); + } + ArrayRef getArgs() const { + return ArrayRef(getTrailingObjects(), NumArgs); } + child_range children() { + return child_range(reinterpret_cast(getArgs().begin()), + reinterpret_cast(getArgs().end())); + } const_child_range children() const { - auto Children = const_cast(this)->children(); - return const_child_range(Children.begin(), Children.end()); + auto AR = getArgs(); + return const_child_range(reinterpret_cast(AR.begin()), + reinterpret_cast(AR.end())); } child_range used_children() { diff --git a/clang/include/clang/AST/StmtOpenMP.h b/clang/include/clang/AST/StmtOpenMP.h index b6a948a8c6020..cb871c9894d01 100644 --- a/clang/include/clang/AST/StmtOpenMP.h +++ b/clang/include/clang/AST/StmtOpenMP.h @@ -5807,7 
+5807,6 @@ class OMPReverseDirective final : public OMPLoopTransformationDirective { llvm::omp::OMPD_reverse, StartLoc, EndLoc, 1) { // Reverse produces a single top-level canonical loop nest - setNumGeneratedLoops(1); setNumGeneratedLoopNests(1); } @@ -5878,7 +5877,7 @@ class OMPInterchangeDirective final : public OMPLoopTransformationDirective { EndLoc, NumLoops) { // Interchange produces a single top-level canonical loop // nest, with the exact same amount of total loops - setNumGeneratedLoops(NumLoops); + setNumGeneratedLoops(3 * NumLoops); setNumGeneratedLoopNests(1); } diff --git a/clang/include/clang/Sema/SemaOpenMP.h b/clang/include/clang/Sema/SemaOpenMP.h index ac4cbe3709a0d..35bb884c0c1f2 100644 --- a/clang/include/clang/Sema/SemaOpenMP.h +++ b/clang/include/clang/Sema/SemaOpenMP.h @@ -1491,7 +1491,7 @@ class SemaOpenMP : public SemaBase { bool checkTransformableLoopNest( OpenMPDirectiveKind Kind, Stmt *AStmt, int NumLoops, SmallVectorImpl &LoopHelpers, - Stmt *&Body, SmallVectorImpl> &OriginalInits); + Stmt *&Body, SmallVectorImpl> &OriginalInits); /// @brief Categories of loops encountered during semantic OpenMP loop /// analysis @@ -1554,9 +1554,9 @@ class SemaOpenMP : public SemaBase { Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, - SmallVectorImpl> &OriginalInits, - SmallVectorImpl> &TransformsPreInits, - SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, SmallVectorImpl &LoopCategories, ASTContext &Context, OpenMPDirectiveKind Kind); @@ -1590,9 +1590,9 @@ class SemaOpenMP : public SemaBase { unsigned &NumLoops, SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, - SmallVectorImpl> &OriginalInits, - SmallVectorImpl> &TransformsPreInits, - SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> 
&LoopSequencePreInits, SmallVectorImpl &LoopCategories, ASTContext &Context); /// Helper to keep information about the current `omp begin/end declare diff --git a/clang/lib/AST/OpenMPClause.cpp b/clang/lib/AST/OpenMPClause.cpp index 0b5808eb100e4..e0570262b2a05 100644 --- a/clang/lib/AST/OpenMPClause.cpp +++ b/clang/lib/AST/OpenMPClause.cpp @@ -1026,22 +1026,25 @@ OMPPartialClause *OMPPartialClause::CreateEmpty(const ASTContext &C) { OMPLoopRangeClause * OMPLoopRangeClause::Create(const ASTContext &C, SourceLocation StartLoc, - SourceLocation LParenLoc, SourceLocation EndLoc, - SourceLocation FirstLoc, SourceLocation CountLoc, - Expr *First, Expr *Count) { + SourceLocation LParenLoc, SourceLocation FirstLoc, + SourceLocation CountLoc, SourceLocation EndLoc, + ArrayRef Args) { + + assert(Args.size() == 2 && + "looprange clause must have exactly two arguments"); OMPLoopRangeClause *Clause = CreateEmpty(C); Clause->setLocStart(StartLoc); Clause->setLParenLoc(LParenLoc); - Clause->setLocEnd(EndLoc); Clause->setFirstLoc(FirstLoc); Clause->setCountLoc(CountLoc); - Clause->setFirst(First); - Clause->setCount(Count); + Clause->setLocEnd(EndLoc); + Clause->setArgs(Args); return Clause; } OMPLoopRangeClause *OMPLoopRangeClause::CreateEmpty(const ASTContext &C) { - return new (C) OMPLoopRangeClause(); + void *Mem = C.Allocate(totalSizeToAlloc(2)); + return new (Mem) OMPLoopRangeClause(); } OMPAllocateClause *OMPAllocateClause::Create( diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 1671f07bc2760..268e4220b05b6 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -3241,11 +3241,8 @@ LValue CodeGenFunction::EmitDeclRefLValue(const DeclRefExpr *E) { var, ConvertTypeForMem(VD->getType()), getContext().getDeclAlign(VD)); // No other cases for now. 
- } else { - llvm::dbgs() << "THE DAMN DECLREFEXPR HASN'T BEEN ENTERED IN LOCALDECLMAP\n"; - VD->dumpColor(); + } else llvm_unreachable("DeclRefExpr for Decl not entered in LocalDeclMap?"); - } // Handle threadlocal function locals. if (VD->getTLSKind() != VarDecl::TLS_None) diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h index ce00198c396b6..a983901f560de 100644 --- a/clang/lib/CodeGen/CodeGenFunction.h +++ b/clang/lib/CodeGen/CodeGenFunction.h @@ -5414,10 +5414,6 @@ class CodeGenFunction : public CodeGenTypeCache { /// Set the address of a local variable. void setAddrOfLocalVar(const VarDecl *VD, Address Addr) { - if (LocalDeclMap.count(VD)) { - llvm::errs() << "Warning: VarDecl already exists in map: "; - VD->dumpColor(); - } assert(!LocalDeclMap.count(VD) && "Decl already exists in LocalDeclMap!"); LocalDeclMap.insert({VD, Addr}); } diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index 485eebf23ef93..d2da417e5cfde 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -14159,38 +14159,37 @@ StmtResult SemaOpenMP::ActOnOpenMPTargetTeamsDistributeSimdDirective( getASTContext(), StartLoc, EndLoc, NestedLoopCount, Clauses, AStmt, B); } -// Overloaded base case function +/// Overloaded base case function template static bool tryHandleAs(T *t, F &&) { return false; } -/** - * Tries to recursively cast `t` to one of the given types and invokes `f` if - * successful. - * - * @tparam Class The first type to check. - * @tparam Rest The remaining types to check. - * @tparam T The base type of `t`. - * @tparam F The callable type for the function to invoke upon a successful - * cast. - * @param t The object to be checked. - * @param f The function to invoke if `t` matches `Class`. - * @return `true` if `t` matched any type and `f` was called, otherwise `false`. - */ +/// +/// Tries to recursively cast `t` to one of the given types and invokes `f` if +/// successful. 
+/// +/// @tparam Class The first type to check. +/// @tparam Rest The remaining types to check. +/// @tparam T The base type of `t`. +/// @tparam F The callable type for the function to invoke upon a successful +/// cast. +/// @param t The object to be checked. +/// @param f The function to invoke if `t` matches `Class`. +/// @return `true` if `t` matched any type and `f` was called, otherwise +/// `false`. template static bool tryHandleAs(T *t, F &&f) { if (Class *c = dyn_cast(t)) { f(c); return true; - } else { - return tryHandleAs(t, std::forward(f)); } + return tryHandleAs(t, std::forward(f)); } -// Updates OriginalInits by checking Transform against loop transformation -// directives and appending their pre-inits if a match is found. +/// Updates OriginalInits by checking Transform against loop transformation +/// directives and appending their pre-inits if a match is found. static void updatePreInits(OMPLoopBasedDirective *Transform, - SmallVectorImpl> &PreInits) { + SmallVectorImpl> &PreInits) { if (!tryHandleAs( Transform, [&PreInits](auto *Dir) { @@ -14202,7 +14201,7 @@ static void updatePreInits(OMPLoopBasedDirective *Transform, bool SemaOpenMP::checkTransformableLoopNest( OpenMPDirectiveKind Kind, Stmt *AStmt, int NumLoops, SmallVectorImpl &LoopHelpers, - Stmt *&Body, SmallVectorImpl> &OriginalInits) { + Stmt *&Body, SmallVectorImpl> &OriginalInits) { OriginalInits.emplace_back(); bool Result = OMPLoopBasedDirective::doForAllLoops( AStmt->IgnoreContainers(), /*TryImperfectlyNestedLoops=*/false, NumLoops, @@ -14236,40 +14235,40 @@ bool SemaOpenMP::checkTransformableLoopNest( return Result; } -// Counts the total number of nested loops, including the outermost loop (the -// original loop). PRECONDITION of this visitor is that it must be invoked from -// the original loop to be analyzed. The traversal is stop for Decl's and -// Expr's given that they may contain inner loops that must not be counted. 
-// -// Example AST structure for the code: -// -// int main() { -// #pragma omp fuse -// { -// for (int i = 0; i < 100; i++) { <-- Outer loop -// []() { -// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP -// }; -// for(int j = 0; j < 5; ++j) {} <-- Inner loop -// } -// for (int r = 0; i < 100; i++) { <-- Outer loop -// struct LocalClass { -// void bar() { -// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP -// } -// }; -// for(int k = 0; k < 10; ++k) {} <-- Inner loop -// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP -// } -// } -// } -// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops +/// Counts the total number of nested loops, including the outermost loop (the +/// original loop). PRECONDITION of this visitor is that it must be invoked from +/// the original loop to be analyzed. The traversal is stop for Decl's and +/// Expr's given that they may contain inner loops that must not be counted. +/// +/// Example AST structure for the code: +/// +/// int main() { +/// #pragma omp fuse +/// { +/// for (int i = 0; i < 100; i++) { <-- Outer loop +/// []() { +/// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +/// }; +/// for(int j = 0; j < 5; ++j) {} <-- Inner loop +/// } +/// for (int r = 0; i < 100; i++) { <-- Outer loop +/// struct LocalClass { +/// void bar() { +/// for(int j = 0; j < 100; j++) {} <-- NOT A LOOP +/// } +/// }; +/// for(int k = 0; k < 10; ++k) {} <-- Inner loop +/// {x = 5; for(k = 0; k < 10; ++k) x += k; x}; <-- NOT A LOOP +/// } +/// } +/// } +/// Result: Loop 'i' contains 2 loops, Loop 'r' also contains 2 loops class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { private: unsigned NestedLoopCount = 0; public: - explicit NestedLoopCounterVisitor() {} + explicit NestedLoopCounterVisitor() = default; unsigned getNestedLoopCount() const { return NestedLoopCount; } @@ -14296,7 +14295,7 @@ class NestedLoopCounterVisitor : public DynamicRecursiveASTVisitor { return true; // Only recurse into CompoundStmt 
(block {}) and loop bodies - if (isa(S) || isa(S) || isa(S)) { + if (isa(S)) { return DynamicRecursiveASTVisitor::TraverseStmt(S); } @@ -14317,19 +14316,18 @@ bool SemaOpenMP::analyzeLoopSequence( Stmt *LoopSeqStmt, unsigned &LoopSeqSize, unsigned &NumLoops, SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, - SmallVectorImpl> &OriginalInits, - SmallVectorImpl> &TransformsPreInits, - SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> &TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, SmallVectorImpl &LoopCategories, ASTContext &Context, OpenMPDirectiveKind Kind) { VarsWithInheritedDSAType TmpDSA; QualType BaseInductionVarType; - // Helper Lambda to handle storing initialization and body statements for both - // ForStmt and CXXForRangeStmt and checks for any possible mismatch between - // induction variables types - auto storeLoopStatements = [&OriginalInits, &ForStmts, &BaseInductionVarType, - this, &Context](Stmt *LoopStmt) { + /// Helper Lambda to handle storing initialization and body statements for + /// both ForStmt and CXXForRangeStmt and checks for any possible mismatch + /// between induction variables types + auto StoreLoopStatements = [&](Stmt *LoopStmt) { if (auto *For = dyn_cast(LoopStmt)) { OriginalInits.back().push_back(For->getInit()); ForStmts.push_back(For); @@ -14357,16 +14355,11 @@ bool SemaOpenMP::analyzeLoopSequence( } }; - // Helper lambda functions to encapsulate the processing of different - // derivations of the canonical loop sequence grammar - // - // Modularized code for handling loop generation and transformations - auto analyzeLoopGeneration = [&storeLoopStatements, &LoopHelpers, - &OriginalInits, &TransformsPreInits, - &LoopCategories, &LoopSeqSize, &NumLoops, Kind, - &TmpDSA, &ForStmts, &Context, - &LoopSequencePreInits, this](Stmt *Child) { - auto LoopTransform = dyn_cast(Child); + /// Helper lambda functions to encapsulate the processing of different + /// derivations of 
the canonical loop sequence grammar + /// Modularized code for handling loop generation and transformations + auto AnalyzeLoopGeneration = [&](Stmt *Child) { + auto *LoopTransform = dyn_cast(Child); Stmt *TransformedStmt = LoopTransform->getTransformedStmt(); unsigned NumGeneratedLoopNests = LoopTransform->getNumGeneratedLoopNests(); unsigned NumGeneratedLoops = LoopTransform->getNumGeneratedLoops(); @@ -14377,9 +14370,8 @@ bool SemaOpenMP::analyzeLoopSequence( LoopSeqSize += NumGeneratedLoopNests; NumLoops += NumGeneratedLoops; return true; - } - // Unroll full (0 loops produced) - else { + } else { + // Unroll full (0 loops produced) Diag(Child->getBeginLoc(), diag::err_omp_not_for) << 0 << getOpenMPDirectiveName(Kind); return false; @@ -14406,9 +14398,8 @@ bool SemaOpenMP::analyzeLoopSequence( LoopHelpers, ForStmts, OriginalInits, TransformsPreInits, LoopSequencePreInits, LoopCategories, Context, Kind); - } - // Vast majority: (Tile, Unroll, Stripe, Reverse, Interchange, Fuse all) - else { + } else { + // Vast majority: (Tile, Unroll, Stripe, Reverse, Interchange, Fuse all) // Process the transformed loop statement OriginalInits.emplace_back(); TransformsPreInits.emplace_back(); @@ -14424,7 +14415,7 @@ bool SemaOpenMP::analyzeLoopSequence( << getOpenMPDirectiveName(Kind); return false; } - storeLoopStatements(TransformedStmt); + StoreLoopStatements(TransformedStmt); updatePreInits(LoopTransform, TransformsPreInits); NumLoops += NumGeneratedLoops; @@ -14433,10 +14424,8 @@ bool SemaOpenMP::analyzeLoopSequence( } }; - // Modularized code for handling regular canonical loops - auto analyzeRegularLoop = [&storeLoopStatements, &LoopHelpers, &OriginalInits, - &LoopSeqSize, &NumLoops, Kind, &TmpDSA, - &LoopCategories, this](Stmt *Child) { + /// Modularized code for handling regular canonical loops + auto AnalyzeRegularLoop = [&](Stmt *Child) { OriginalInits.emplace_back(); LoopHelpers.emplace_back(); LoopCategories.push_back(OMPLoopCategory::RegularLoop); @@ -14451,19 
+14440,19 @@ bool SemaOpenMP::analyzeLoopSequence( return false; } - storeLoopStatements(Child); + StoreLoopStatements(Child); auto NLCV = NestedLoopCounterVisitor(); NLCV.TraverseStmt(Child); NumLoops += NLCV.getNestedLoopCount(); return true; }; - // Helper functions to validate canonical loop sequence grammar is valid - auto isLoopSequenceDerivation = [](auto *Child) { - return isa(Child) || isa(Child) || - isa(Child); + /// Helper functions to validate loop sequence grammar derivations + auto IsLoopSequenceDerivation = [](auto *Child) { + return isa(Child); }; - auto isLoopGeneratingStmt = [](auto *Child) { + /// Helper functions to validate loop generating grammar derivations + auto IsLoopGeneratingStmt = [](auto *Child) { return isa(Child); }; @@ -14474,7 +14463,7 @@ bool SemaOpenMP::analyzeLoopSequence( continue; // Skip over non-loop-sequence statements - if (!isLoopSequenceDerivation(Child)) { + if (!IsLoopSequenceDerivation(Child)) { Child = Child->IgnoreContainers(); // Ignore empty compound statement @@ -14494,17 +14483,17 @@ bool SemaOpenMP::analyzeLoopSequence( } } // Regular loop sequence handling - if (isLoopSequenceDerivation(Child)) { - if (isLoopGeneratingStmt(Child)) { - if (!analyzeLoopGeneration(Child)) { + if (IsLoopSequenceDerivation(Child)) { + if (IsLoopGeneratingStmt(Child)) { + if (!AnalyzeLoopGeneration(Child)) return false; - } - // analyzeLoopGeneration updates Loop Sequence size accordingly + + // AnalyzeLoopGeneration updates Loop Sequence size accordingly } else { - if (!analyzeRegularLoop(Child)) { + if (!AnalyzeRegularLoop(Child)) return false; - } + // Update the Loop Sequence size by one ++LoopSeqSize; } @@ -14523,9 +14512,9 @@ bool SemaOpenMP::checkTransformableLoopSequence( unsigned &NumLoops, SmallVectorImpl &LoopHelpers, SmallVectorImpl &ForStmts, - SmallVectorImpl> &OriginalInits, - SmallVectorImpl> &TransformsPreInits, - SmallVectorImpl> &LoopSequencePreInits, + SmallVectorImpl> &OriginalInits, + SmallVectorImpl> 
&TransformsPreInits, + SmallVectorImpl> &LoopSequencePreInits, SmallVectorImpl &LoopCategories, ASTContext &Context) { // Checks whether the given statement is a compound statement @@ -14561,10 +14550,9 @@ bool SemaOpenMP::checkTransformableLoopSequence( // Recursive entry point to process the main loop sequence if (!analyzeLoopSequence(AStmt, LoopSeqSize, NumLoops, LoopHelpers, ForStmts, OriginalInits, TransformsPreInits, - LoopSequencePreInits, LoopCategories, Context, - Kind)) { + LoopSequencePreInits, LoopCategories, Context, Kind)) return false; - } + if (LoopSeqSize <= 0) { Diag(AStmt->getBeginLoc(), diag::err_omp_empty_loop_sequence) << getOpenMPDirectiveName(Kind); @@ -14656,7 +14644,7 @@ StmtResult SemaOpenMP::ActOnOpenMPTileDirective(ArrayRef Clauses, // Verify and diagnose loop nest. SmallVector LoopHelpers(NumLoops); Stmt *Body = nullptr; - SmallVector, 4> OriginalInits; + SmallVector, 4> OriginalInits; if (!checkTransformableLoopNest(OMPD_tile, AStmt, NumLoops, LoopHelpers, Body, OriginalInits)) return StmtError(); @@ -14933,7 +14921,7 @@ StmtResult SemaOpenMP::ActOnOpenMPStripeDirective(ArrayRef Clauses, // Verify and diagnose loop nest. 
SmallVector LoopHelpers(NumLoops); Stmt *Body = nullptr; - SmallVector, 4> OriginalInits; + SmallVector, 4> OriginalInits; if (!checkTransformableLoopNest(OMPD_stripe, AStmt, NumLoops, LoopHelpers, Body, OriginalInits)) return StmtError(); @@ -15194,7 +15182,7 @@ StmtResult SemaOpenMP::ActOnOpenMPUnrollDirective(ArrayRef Clauses, Stmt *Body = nullptr; SmallVector LoopHelpers( NumLoops); - SmallVector, NumLoops + 1> OriginalInits; + SmallVector, NumLoops + 1> OriginalInits; if (!checkTransformableLoopNest(OMPD_unroll, AStmt, NumLoops, LoopHelpers, Body, OriginalInits)) return StmtError(); @@ -15462,7 +15450,7 @@ StmtResult SemaOpenMP::ActOnOpenMPReverseDirective(Stmt *AStmt, Stmt *Body = nullptr; SmallVector LoopHelpers( NumLoops); - SmallVector, NumLoops + 1> OriginalInits; + SmallVector, NumLoops + 1> OriginalInits; if (!checkTransformableLoopNest(OMPD_reverse, AStmt, NumLoops, LoopHelpers, Body, OriginalInits)) return StmtError(); @@ -15654,7 +15642,7 @@ StmtResult SemaOpenMP::ActOnOpenMPInterchangeDirective( // Verify and diagnose loop nest. 
SmallVector LoopHelpers(NumLoops); Stmt *Body = nullptr; - SmallVector, 2> OriginalInits; + SmallVector, 2> OriginalInits; if (!checkTransformableLoopNest(OMPD_interchange, AStmt, NumLoops, LoopHelpers, Body, OriginalInits)) return StmtError(); @@ -15841,9 +15829,8 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, CaptureVars CopyTransformer(SemaRef); // Ensure the structured block is not empty - if (!AStmt) { + if (!AStmt) return StmtError(); - } unsigned NumLoops = 1; unsigned LoopSeqSize = 1; @@ -15862,16 +15849,15 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // Also collect the HelperExprs, Loop Stmts, Inits, and Number of loops SmallVector LoopHelpers; SmallVector LoopStmts; - SmallVector> OriginalInits; - SmallVector> TransformsPreInits; - SmallVector> LoopSequencePreInits; + SmallVector> OriginalInits; + SmallVector> TransformsPreInits; + SmallVector> LoopSequencePreInits; SmallVector LoopCategories; if (!checkTransformableLoopSequence(OMPD_fuse, AStmt, LoopSeqSize, NumLoops, LoopHelpers, LoopStmts, OriginalInits, TransformsPreInits, LoopSequencePreInits, - LoopCategories, Context)) { + LoopCategories, Context)) return StmtError(); - } // Handle clauses, which can be any of the following: [looprange, apply] const OMPLoopRangeClause *LRC = @@ -15961,9 +15947,8 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // expressions. Generates both the variable declaration and the corresponding // initialization statement. 
auto CreateHelperVarAndStmt = - [&SemaRef = this->SemaRef, &Context, &CopyTransformer, - &IVType](Expr *ExprToCopy, const std::string &BaseName, unsigned I, - bool NeedsNewVD = false) { + [&, &SemaRef = SemaRef](Expr *ExprToCopy, const std::string &BaseName, + unsigned I, bool NeedsNewVD = false) { Expr *TransformedExpr = AssertSuccess(CopyTransformer.TransformExpr(ExprToCopy)); if (!TransformedExpr) @@ -16007,9 +15992,8 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // Transformations that apply this concept: Loopranged Fuse, Split if (!LoopSequencePreInits.empty()) { for (const auto &LTPreInits : LoopSequencePreInits) { - if (!LTPreInits.empty()) { + if (!LTPreInits.empty()) llvm::append_range(PreInits, LTPreInits); - } } } @@ -16038,9 +16022,9 @@ StmtResult SemaOpenMP::ActOnOpenMPFuseDirective(ArrayRef Clauses, // Order matters: pre-inits may define variables used in the original // inits such as upper bounds... auto TransformPreInit = TransformsPreInits[TransformIndex++]; - if (!TransformPreInit.empty()) { + if (!TransformPreInit.empty()) llvm::append_range(PreInits, TransformPreInit); - } + addLoopPreInits(Context, LoopHelpers[I], LoopStmts[I], OriginalInits[I], PreInits); } @@ -17459,13 +17443,15 @@ OMPClause *SemaOpenMP::ActOnOpenMPLoopRangeClause( if (CountVal.isInvalid()) Count = nullptr; + SmallVector ArgsVec = {First, Count}; + // OpenMP [6.0, Restrictions] // first + count - 1 must not evaluate to a value greater than the // loop sequence length of the associated canonical loop sequence. 
// This check must be performed afterwards due to the delayed // parsing and computation of the associated loop sequence return OMPLoopRangeClause::Create(getASTContext(), StartLoc, LParenLoc, - FirstLoc, CountLoc, EndLoc, First, Count); + FirstLoc, CountLoc, EndLoc, ArgsVec); } OMPClause *SemaOpenMP::ActOnOpenMPAlignClause(Expr *A, SourceLocation StartLoc, diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 4af2b4909fcb6..ad4f54e6fdcc5 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3389,9 +3389,6 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Looprange &x) { ContextDirectiveAsFortran()); } -void OmpStructureChecker::Enter(const parser::OmpClause::FreeAgent &x) { - context_.Say(GetContext().clauseSource, - "FREE_AGENT clause is not implemented yet"_err_en_US, // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
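The restriction quoted in the ActOnOpenMPLoopRangeClause comment above (first + count - 1 must not exceed the loop sequence length) can be illustrated with a small standalone check. This is only a sketch, not code from the patch: the function name `loopRangeFits` is hypothetical, and it assumes that `first` is 1-based and that both arguments must be positive.

```cpp
#include <cassert>

// Hypothetical helper, not part of the patch. Returns true when the range
// [first, first + count - 1] lies inside a loop sequence of loopSeqSize
// loops. Assumption: first is 1-based and both arguments must be positive.
static bool loopRangeFits(unsigned first, unsigned count,
                          unsigned loopSeqSize) {
  if (first == 0 || count == 0)
    return false;
  // Widen before adding so that very large values cannot wrap around.
  unsigned long long last =
      static_cast<unsigned long long>(first) + count - 1;
  return last <= loopSeqSize;
}
```

Under these assumptions, looprange(1,6) over a sequence of 5 loops is rejected because the range would end at loop 6, and looprange(2,3) over 3 loops is rejected because it would end at loop 4. As the comment notes, Sema can only evaluate this once the associated loop sequence has been analyzed, so the real check has to happen later than clause construction.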
>From 4100dfe4dd04ed1c953ea4e38a65e867c8e9f73f Mon Sep 17 00:00:00 2001 From: eZWALT Date: Thu, 22 May 2025 10:39:39 +0000 Subject: [PATCH 9/9] Removed unncessary warning and updated tests accordingly --- .../clang/Basic/DiagnosticSemaKinds.td | 3 -- clang/lib/Sema/SemaOpenMP.cpp | 21 +-------- clang/test/OpenMP/fuse_messages.cpp | 43 +++++++++++++++---- 3 files changed, 35 insertions(+), 32 deletions(-) diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index a6ae0de004c8a..d1790cea6cc45 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -11558,9 +11558,6 @@ def note_omp_implicit_dsa : Note< "implicitly determined as %0">; def err_omp_loop_var_dsa : Error< "loop iteration variable in the associated loop of 'omp %1' directive may not be %0, predetermined as %2">; -def warn_omp_different_loop_ind_var_types : Warning < - "loop sequence following '#pragma omp %0' contains induction variables of differing types: %1 and %2">, - InGroup; def err_omp_not_canonical_loop : Error < "loop after '#pragma omp %0' is not in canonical form">; def err_omp_not_a_loop_sequence : Error < diff --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp index d2da417e5cfde..76484b577f9c1 100644 --- a/clang/lib/Sema/SemaOpenMP.cpp +++ b/clang/lib/Sema/SemaOpenMP.cpp @@ -14323,31 +14323,12 @@ bool SemaOpenMP::analyzeLoopSequence( OpenMPDirectiveKind Kind) { VarsWithInheritedDSAType TmpDSA; - QualType BaseInductionVarType; /// Helper Lambda to handle storing initialization and body statements for - /// both ForStmt and CXXForRangeStmt and checks for any possible mismatch - /// between induction variables types + /// both ForStmt and CXXForRangeStmt auto StoreLoopStatements = [&](Stmt *LoopStmt) { if (auto *For = dyn_cast(LoopStmt)) { OriginalInits.back().push_back(For->getInit()); ForStmts.push_back(For); - // Extract induction variable - if (auto 
*InitStmt = dyn_cast_or_null(For->getInit())) { - if (auto *InitDecl = dyn_cast(InitStmt->getSingleDecl())) { - QualType InductionVarType = InitDecl->getType().getCanonicalType(); - - // Compare with first loop type - if (BaseInductionVarType.isNull()) { - BaseInductionVarType = InductionVarType; - } else if (!Context.hasSameType(BaseInductionVarType, - InductionVarType)) { - Diag(InitDecl->getBeginLoc(), - diag::warn_omp_different_loop_ind_var_types) - << getOpenMPDirectiveName(OMPD_fuse) << BaseInductionVarType - << InductionVarType; - } - } - } } else { auto *CXXFor = cast(LoopStmt); OriginalInits.back().push_back(CXXFor->getBeginStmt()); diff --git a/clang/test/OpenMP/fuse_messages.cpp b/clang/test/OpenMP/fuse_messages.cpp index 2a2491d008a0b..4902d424373e5 100644 --- a/clang/test/OpenMP/fuse_messages.cpp +++ b/clang/test/OpenMP/fuse_messages.cpp @@ -70,15 +70,6 @@ void func() { for(int j = 0; j < 10; ++j); } - //expected-warning@+5 {{loop sequence following '#pragma omp fuse' contains induction variables of differing types: 'int' and 'unsigned int'}} - //expected-warning@+5 {{loop sequence following '#pragma omp fuse' contains induction variables of differing types: 'int' and 'long long'}} - #pragma omp fuse - { - for(int i = 0; i < 10; ++i); - for(unsigned int j = 0; j < 10; ++j); - for(long long k = 0; k < 100; ++k); - } - //expected-warning@+2 {{loop range in '#pragma omp fuse' contains only a single loop, resulting in redundant fusion}} #pragma omp fuse { @@ -123,6 +114,40 @@ void func() { for(int j = 0; j < 100; ++j); for(int k = 0; k < 50; ++k); } + + //expected-error@+1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '6' is greater than the total number of loops '5'}} + #pragma omp fuse looprange(1,6) + { + for(int i = 0; i < 10; ++i); + for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + // This fusion results in 2 loops + #pragma omp fuse looprange(1,2) + { + for(int i = 0; i < 10; ++i); + 
for(int j = 0; j < 100; ++j); + for(int k = 0; k < 50; ++k); + } + } + + //expected-error@+1 {{loop range in '#pragma omp fuse' exceeds the number of available loops: range end '4' is greater than the total number of loops '3'}} + #pragma omp fuse looprange(2,3) + { + #pragma omp unroll partial(2) + for(int i = 0; i < 10; ++i); + + #pragma omp reverse + for(int j = 0; j < 10; ++j); + + #pragma omp fuse + { + { + #pragma omp reverse + for(int j = 0; j < 10; ++j); + } + for(int k = 0; k < 50; ++k); + } + } } // In a template context, but expression itself not instantiation-dependent From flang-commits at lists.llvm.org Fri May 23 11:27:09 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 23 May 2025 11:27:09 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix prescanner bug w/ empty macros in line continuation (PR #141274) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/141274 When processing free form source line continuation, the prescanner treats empty keyword macros as if they were spaces or tabs. After skipping over them, however, there's code that only works if the skipped characters ended with an actual space or tab. If the last skipped item was an empty keyword macro's name, the last character of that name would end up being the first character of the continuation line. Fix. >From a9e45ee33ba9e0663a91babc653a729f7a96912c Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 23 May 2025 11:21:32 -0700 Subject: [PATCH] [flang] Fix prescanner bug w/ empty macros in line continuation When processing free form source line continuation, the prescanner treats empty keyword macros as if they were spaces or tabs. After skipping over them, however, there's code that only works if the skipped characters ended with an actual space or tab. 
If the last skipped item was an empty keyword macro's name, the last character of that name would end up being the first character of the continuation line. Fix.
---
 flang/lib/Parser/prescan.cpp        | 2 +-
 flang/test/Preprocessing/bug890.F90 | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)
 create mode 100644 flang/test/Preprocessing/bug890.F90

diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp
index 3bc2ea0b37508..9aef0c9981e3c 100644
--- a/flang/lib/Parser/prescan.cpp
+++ b/flang/lib/Parser/prescan.cpp
@@ -1473,7 +1473,7 @@ const char *Prescanner::FreeFormContinuationLine(bool ampersand) {
             GetProvenanceRange(p, p + 1),
             "Character literal continuation line should have been preceded by '&'"_port_en_US);
       }
-    } else if (p > lineStart) {
+    } else if (p > lineStart && IsSpaceOrTab(p - 1)) {
       --p;
     } else {
       insertASpace_ = true;
diff --git a/flang/test/Preprocessing/bug890.F90 b/flang/test/Preprocessing/bug890.F90
new file mode 100644
index 0000000000000..0ce2d8c3f1569
--- /dev/null
+++ b/flang/test/Preprocessing/bug890.F90
@@ -0,0 +1,6 @@
+! RUN: %flang -E %s 2>&1 | FileCheck %s
+!CHECK: subroutine sub()
+#define empty
+subroutine sub ( &
+  empty)
+end

From flang-commits at lists.llvm.org Fri May 23 11:27:46 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Fri, 23 May 2025 11:27:46 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Fix prescanner bug w/ empty macros in line continuation (PR #141274)
In-Reply-To:
Message-ID: <6830be22.050a0220.1e58ec.ce16@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-parser

Author: Peter Klausler (klausler)
Changes

When processing free form source line continuation, the prescanner treats empty keyword macros as if they were spaces or tabs. After skipping over them, however, there's code that only works if the skipped characters ended with an actual space or tab. If the last skipped item was an empty keyword macro's name, the last character of that name would end up being the first character of the continuation line. Fix.

---
Full diff: https://github.com/llvm/llvm-project/pull/141274.diff

2 Files Affected:

- (modified) flang/lib/Parser/prescan.cpp (+1-1)
- (added) flang/test/Preprocessing/bug890.F90 (+6)

``````````diff
diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp
index 3bc2ea0b37508..9aef0c9981e3c 100644
--- a/flang/lib/Parser/prescan.cpp
+++ b/flang/lib/Parser/prescan.cpp
@@ -1473,7 +1473,7 @@ const char *Prescanner::FreeFormContinuationLine(bool ampersand) {
             GetProvenanceRange(p, p + 1),
             "Character literal continuation line should have been preceded by '&'"_port_en_US);
       }
-    } else if (p > lineStart) {
+    } else if (p > lineStart && IsSpaceOrTab(p - 1)) {
       --p;
     } else {
       insertASpace_ = true;
diff --git a/flang/test/Preprocessing/bug890.F90 b/flang/test/Preprocessing/bug890.F90
new file mode 100644
index 0000000000000..0ce2d8c3f1569
--- /dev/null
+++ b/flang/test/Preprocessing/bug890.F90
@@ -0,0 +1,6 @@
+! RUN: %flang -E %s 2>&1 | FileCheck %s
+!CHECK: subroutine sub()
+#define empty
+subroutine sub ( &
+  empty)
+end

``````````
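The one-line change above can be illustrated with a small stand-alone sketch. It only models the single branch touched by the patch, and every name in it (`IsBlank`, `ContinuationStart`, `ChooseStart`, the `requireBlank` switch) is hypothetical and not part of the flang prescanner; the real code works on `const char *` pointers rather than string offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the prescanner's space-or-tab test.
inline bool IsBlank(char ch) { return ch == ' ' || ch == '\t'; }

// Where the continuation text starts, and whether a space must be inserted.
struct ContinuationStart {
  std::size_t offset;
  bool insertSpace;
};

// `p` is the offset reached after skipping leading blanks and (in the real
// prescanner) empty keyword macro names on the continuation line.
// requireBlank=false models the old condition (p > lineStart);
// requireBlank=true models the patched condition, which additionally
// checks that the character just before p is a space or tab.
ContinuationStart ChooseStart(
    const std::string &line, std::size_t p, bool requireBlank) {
  assert(p <= line.size());
  if (p > 0 && (!requireBlank || IsBlank(line[p - 1]))) {
    return {p - 1, false}; // back up onto the preceding character
  }
  return {p, true}; // stay put and ask for a space to be inserted
}
```

With the continuation line from the new test, `"  empty)"`, and `p` pointing at `)`, the old condition backs up onto the `y` of the macro name, while the patched condition leaves `p` alone and requests an inserted space instead.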
https://github.com/llvm/llvm-project/pull/141274 From flang-commits at lists.llvm.org Fri May 23 11:46:18 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 23 May 2025 11:46:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix prescanner bug w/ empty macros in line continuation (PR #141274) In-Reply-To: Message-ID: <6830c27a.170a0220.18eeef.abeb@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. https://github.com/llvm/llvm-project/pull/141274 From flang-commits at lists.llvm.org Fri May 23 12:30:37 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 23 May 2025 12:30:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <6830ccdd.170a0220.2b9185.a6cf@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/140763 >From 7ba080073d823c74f02efc57c664799f684e632a Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Tue, 20 May 2025 09:54:26 -0700 Subject: [PATCH 1/2] initial commit --- flang/lib/Lower/OpenACC.cpp | 37 +++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index bb8b4d8e833c2..40357ebc3c07d 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -812,23 +812,25 @@ static void genDeclareDataOperandOperationsWithModifier( } template -static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { +static void +genDataExitOperations(fir::FirOpBuilder &builder, + llvm::SmallVector operands, bool structured, + std::optional exitLoc = std::nullopt) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); + auto opLoc = exitLoc ? 
*exitLoc : entryOp.getLoc(); if constexpr (std::is_same_v || std::is_same_v) builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getVarType(), entryOp.getBounds(), entryOp.getAsyncOperands(), + opLoc, entryOp.getAccVar(), entryOp.getVar(), entryOp.getVarType(), + entryOp.getBounds(), entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); else builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), + opLoc, entryOp.getAccVar(), entryOp.getBounds(), entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); @@ -2976,6 +2978,7 @@ static Op createComputeOp( static void genACCDataOp(Fortran::lower::AbstractConverter &converter, mlir::Location currentLocation, + mlir::Location endLocation, Fortran::lower::pft::Evaluation &eval, Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, @@ -3170,19 +3173,19 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, // Create the exit operations after the region. 
genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, attachEntryOperands, /*structured=*/true); + builder, attachEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, nocreateEntryOperands, /*structured=*/true); + builder, nocreateEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands, /*structured=*/true, endLocation); builder.restoreInsertionPoint(insPt); } @@ -3259,7 +3262,9 @@ genACC(Fortran::lower::AbstractConverter &converter, std::get(beginBlockDirective.t); const auto &accClauseList = std::get(beginBlockDirective.t); - + const auto &endBlockDirective = + std::get(blockConstruct.t); + mlir::Location endLocation = converter.genLocation(endBlockDirective.source); mlir::Location currentLocation = converter.genLocation(blockDirective.source); Fortran::lower::StatementContext stmtCtx; @@ -3268,8 +3273,8 @@ genACC(Fortran::lower::AbstractConverter &converter, semanticsContext, stmtCtx, accClauseList); } else if (blockDirective.v == llvm::acc::ACCD_data) { - genACCDataOp(converter, currentLocation, eval, semanticsContext, stmtCtx, - accClauseList); + genACCDataOp(converter, currentLocation, endLocation, eval, + semanticsContext, stmtCtx, accClauseList); } else if (blockDirective.v == llvm::acc::ACCD_serial) { createComputeOp(converter, currentLocation, eval, semanticsContext, stmtCtx, 
>From 5b17a866c73727a69d5710e5cbb43491aafd8aca Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Tue, 20 May 2025 11:08:54 -0700 Subject: [PATCH 2/2] address feedback --- flang/lib/Lower/OpenACC.cpp | 2 +- flang/test/Lower/OpenACC/locations.f90 | 11 +++++++++++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 40357ebc3c07d..02dba22c29c7f 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -819,7 +819,7 @@ genDataExitOperations(fir::FirOpBuilder &builder, for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); - auto opLoc = exitLoc ? *exitLoc : entryOp.getLoc(); + mlir::Location opLoc = exitLoc ? *exitLoc : entryOp.getLoc(); if constexpr (std::is_same_v || std::is_same_v) builder.create( diff --git a/flang/test/Lower/OpenACC/locations.f90 b/flang/test/Lower/OpenACC/locations.f90 index 84dd512a5d43f..69873b3fbca4f 100644 --- a/flang/test/Lower/OpenACC/locations.f90 +++ b/flang/test/Lower/OpenACC/locations.f90 @@ -171,6 +171,17 @@ subroutine acc_loop_fused_locations(arr) ! CHECK: acc.loop ! 
CHECK: } attributes {collapse = [3]{{.*}}} loc(fused["{{.*}}locations.f90":160:11, "{{.*}}locations.f90":161:5, "{{.*}}locations.f90":162:7, "{{.*}}locations.f90":163:9]) + subroutine data_end_locations(arr) + real, dimension(10) :: arr + + !$acc data copy(arr) + !CHECK-LABEL: acc.copyin + !CHECK-SAME: loc("{{.*}}locations.f90":177:21) + + !$acc end data + !CHECK-LABEL: acc.copyout + !CHECK-SAME: loc("{{.*}}locations.f90":181:11) + end subroutine end module From flang-commits at lists.llvm.org Fri May 23 13:39:14 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 23 May 2025 13:39:14 -0700 (PDT) Subject: [flang-commits] [flang] df03f7e - [flang][openacc] use location of end directive for exit operations (#140763) Message-ID: <6830dcf2.630a0220.1fa401.20ce@mx.google.com> Author: Andre Kuhlenschmidt Date: 2025-05-23T13:39:11-07:00 New Revision: df03f7ed4cc9555c7f55945065a6e9c3c6ad5f11 URL: https://github.com/llvm/llvm-project/commit/df03f7ed4cc9555c7f55945065a6e9c3c6ad5f11 DIFF: https://github.com/llvm/llvm-project/commit/df03f7ed4cc9555c7f55945065a6e9c3c6ad5f11.diff LOG: [flang][openacc] use location of end directive for exit operations (#140763) Make sure to preserve the location of the end statement on data declarations for use in debugging OpenACC runtime. 
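The core of this change is an optional location that overrides the entry operation's location when present. A minimal sketch of that fallback, assuming a hypothetical `SourceLoc` struct in place of `mlir::Location` and a hypothetical helper name `PickExitLoc`:

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for mlir::Location: file, line, column.
struct SourceLoc {
  std::string file;
  int line;
  int column;
};

// Mirrors the expression `exitLoc ? *exitLoc : entryOp.getLoc()` from the
// patch: use the end directive's location when one was supplied, otherwise
// fall back to the location of the matching entry operation.
SourceLoc PickExitLoc(const SourceLoc &entryLoc,
    std::optional<SourceLoc> exitLoc = std::nullopt) {
  return exitLoc ? *exitLoc : entryLoc;
}
```

Defaulting the parameter to `std::nullopt` keeps existing callers unchanged; only the caller that knows the end directive's location needs to pass it.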
Added: Modified: flang/lib/Lower/OpenACC.cpp flang/test/Lower/OpenACC/locations.f90 Removed: ################################################################################ diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index bb8b4d8e833c2..02dba22c29c7f 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -812,23 +812,25 @@ static void genDeclareDataOperandOperationsWithModifier( } template -static void genDataExitOperations(fir::FirOpBuilder &builder, - llvm::SmallVector operands, - bool structured) { +static void +genDataExitOperations(fir::FirOpBuilder &builder, + llvm::SmallVector operands, bool structured, + std::optional exitLoc = std::nullopt) { for (mlir::Value operand : operands) { auto entryOp = mlir::dyn_cast_or_null(operand.getDefiningOp()); assert(entryOp && "data entry op expected"); + mlir::Location opLoc = exitLoc ? *exitLoc : entryOp.getLoc(); if constexpr (std::is_same_v || std::is_same_v) builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getVar(), - entryOp.getVarType(), entryOp.getBounds(), entryOp.getAsyncOperands(), + opLoc, entryOp.getAccVar(), entryOp.getVar(), entryOp.getVarType(), + entryOp.getBounds(), entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); else builder.create( - entryOp.getLoc(), entryOp.getAccVar(), entryOp.getBounds(), + opLoc, entryOp.getAccVar(), entryOp.getBounds(), entryOp.getAsyncOperands(), entryOp.getAsyncOperandsDeviceTypeAttr(), entryOp.getAsyncOnlyAttr(), entryOp.getDataClause(), structured, entryOp.getImplicit(), builder.getStringAttr(*entryOp.getName())); @@ -2976,6 +2978,7 @@ static Op createComputeOp( static void genACCDataOp(Fortran::lower::AbstractConverter &converter, mlir::Location currentLocation, + mlir::Location endLocation, Fortran::lower::pft::Evaluation &eval, 
Fortran::semantics::SemanticsContext &semanticsContext, Fortran::lower::StatementContext &stmtCtx, @@ -3170,19 +3173,19 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter, // Create the exit operations after the region. genDataExitOperations( - builder, copyEntryOperands, /*structured=*/true); + builder, copyEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, copyinEntryOperands, /*structured=*/true); + builder, copyinEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, copyoutEntryOperands, /*structured=*/true); + builder, copyoutEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, attachEntryOperands, /*structured=*/true); + builder, attachEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, createEntryOperands, /*structured=*/true); + builder, createEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, nocreateEntryOperands, /*structured=*/true); + builder, nocreateEntryOperands, /*structured=*/true, endLocation); genDataExitOperations( - builder, presentEntryOperands, /*structured=*/true); + builder, presentEntryOperands, /*structured=*/true, endLocation); builder.restoreInsertionPoint(insPt); } @@ -3259,7 +3262,9 @@ genACC(Fortran::lower::AbstractConverter &converter, std::get(beginBlockDirective.t); const auto &accClauseList = std::get(beginBlockDirective.t); - + const auto &endBlockDirective = + std::get(blockConstruct.t); + mlir::Location endLocation = converter.genLocation(endBlockDirective.source); mlir::Location currentLocation = converter.genLocation(blockDirective.source); Fortran::lower::StatementContext stmtCtx; @@ -3268,8 +3273,8 @@ genACC(Fortran::lower::AbstractConverter &converter, semanticsContext, stmtCtx, accClauseList); } else if (blockDirective.v == llvm::acc::ACCD_data) { - genACCDataOp(converter, currentLocation, eval, semanticsContext, stmtCtx, - accClauseList); + 
genACCDataOp(converter, currentLocation, endLocation, eval, + semanticsContext, stmtCtx, accClauseList); } else if (blockDirective.v == llvm::acc::ACCD_serial) { createComputeOp(converter, currentLocation, eval, semanticsContext, stmtCtx, diff --git a/flang/test/Lower/OpenACC/locations.f90 b/flang/test/Lower/OpenACC/locations.f90 index 84dd512a5d43f..69873b3fbca4f 100644 --- a/flang/test/Lower/OpenACC/locations.f90 +++ b/flang/test/Lower/OpenACC/locations.f90 @@ -171,6 +171,17 @@ subroutine acc_loop_fused_locations(arr) ! CHECK: acc.loop ! CHECK: } attributes {collapse = [3]{{.*}}} loc(fused["{{.*}}locations.f90":160:11, "{{.*}}locations.f90":161:5, "{{.*}}locations.f90":162:7, "{{.*}}locations.f90":163:9]) + subroutine data_end_locations(arr) + real, dimension(10) :: arr + + !$acc data copy(arr) + !CHECK-LABEL: acc.copyin + !CHECK-SAME: loc("{{.*}}locations.f90":177:21) + + !$acc end data + !CHECK-LABEL: acc.copyout + !CHECK-SAME: loc("{{.*}}locations.f90":181:11) + end subroutine end module From flang-commits at lists.llvm.org Fri May 23 13:39:18 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 23 May 2025 13:39:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] use location of end directive for exit operations (PR #140763) In-Reply-To: Message-ID: <6830dcf6.170a0220.12fdb2.a66d@mx.google.com> https://github.com/akuhlens closed https://github.com/llvm/llvm-project/pull/140763 From flang-commits at lists.llvm.org Fri May 23 13:46:43 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 23 May 2025 13:46:43 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <6830deb3.050a0220.349151.ef34@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From c243d222fe682c92670349353bc326d112a5e30b Mon Sep 17 00:00:00 2001 From: Peter 
Klausler
Date: Wed, 23 Apr 2025 14:44:23 -0700
Subject: [PATCH 1/3] [flang][runtime] Replace recursion with iterative work queue

Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, Destroy, and DescriptorIO.

Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues.

The effects of this restructuring on CPU performance are yet to be measured. There is a fast(?) mode in the work queue implementation that causes new work items to be executed to completion immediately upon creation, saving the overhead of actually representing and managing the work queue. This mode can't be used on GPU devices, but it is enabled by default for CPU hosts. It can be disabled easily for debugging and performance testing.
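As a rough illustration of the ticket idea described in this message, here is a minimal, self-contained sketch of a recursive tree walk restructured into suspendable tickets. It is not the runtime's implementation: `MiniQueue`, `MiniTicket`, `VisitTicket`, `Node`, `PreOrder`, and `StatError` are hypothetical names, and only the Begin()/Continue() protocol, the "last ticket runs next" rule, and the `StatOk`/`StatContinue` values are taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

enum Stat { StatOk = 0, StatError = 1, StatContinue = 201 };

class MiniQueue;

// A ticket is begun once, then continued until it stops returning
// StatContinue.
struct MiniTicket {
  virtual ~MiniTicket() = default;
  virtual int Begin(MiniQueue &) = 0;
  virtual int Continue(MiniQueue &) = 0;
  bool begun{false};
};

class MiniQueue {
public:
  void Push(std::unique_ptr<MiniTicket> ticket) {
    tickets_.push_back(std::move(ticket));
  }
  // Always runs the last ticket, so nested tickets finish before the
  // ticket that created them is continued.
  int Run() {
    while (!tickets_.empty()) {
      MiniTicket *active{tickets_.back().get()};
      int status{StatOk};
      if (!active->begun) {
        active->begun = true;
        status = active->Begin(*this);
      } else {
        status = active->Continue(*this);
      }
      if (status == StatContinue) {
        continue; // suspended; any nested ticket is now last and runs next
      }
      // The finished ticket may no longer be last if it pushed a nested
      // ticket just before completing, so look it up by address.
      auto it{std::find_if(tickets_.begin(), tickets_.end(),
          [active](const std::unique_ptr<MiniTicket> &t) {
            return t.get() == active;
          })};
      assert(it != tickets_.end());
      tickets_.erase(it);
      if (status != StatOk) {
        tickets_.clear(); // an error shuts the whole queue down
        return status;
      }
    }
    return StatOk;
  }

private:
  std::vector<std::unique_ptr<MiniTicket>> tickets_;
};

struct Node {
  int id;
  std::vector<Node> children;
};

// Pre-order visit without recursion: one nested ticket per Continue() call.
class VisitTicket : public MiniTicket {
public:
  VisitTicket(const Node &node, std::vector<int> &order)
      : node_(node), order_(order) {}
  int Begin(MiniQueue &) override {
    order_.push_back(node_.id);
    return StatContinue;
  }
  int Continue(MiniQueue &queue) override {
    if (nextChild_ < node_.children.size()) {
      queue.Push(std::make_unique<VisitTicket>(
          node_.children[nextChild_++], order_));
      return StatContinue; // resume here after the child completes
    }
    return StatOk;
  }

private:
  const Node &node_;
  std::vector<int> &order_;
  std::size_t nextChild_{0}; // resumption state kept in the ticket
};

std::vector<int> PreOrder(const Node &root) {
  std::vector<int> order;
  MiniQueue queue;
  queue.Push(std::make_unique<VisitTicket>(root, order));
  queue.Run();
  return order;
}
```

The call depth stays constant however deep the tree is, since each ticket stores its own resumption state (`nextChild_`) instead of relying on a stack frame; that is the property the commit message wants for link-time stack size calculation.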
---
 .../include/flang-rt/runtime/environment.h    |   1 +
 flang-rt/include/flang-rt/runtime/stat.h      |  10 +-
 .../include/flang-rt/runtime/work-queue.h     | 538 +++++++++++++++
 flang-rt/lib/runtime/CMakeLists.txt           |   2 +
 flang-rt/lib/runtime/assign.cpp               | 549 +++++++++------
 flang-rt/lib/runtime/derived.cpp              | 519 +++++++-------
 flang-rt/lib/runtime/descriptor-io.cpp        | 646 +++++++++++++++++-
 flang-rt/lib/runtime/descriptor-io.h          | 620 +----------------
 flang-rt/lib/runtime/environment.cpp          |   4 +
 flang-rt/lib/runtime/namelist.cpp             |   1 +
 flang-rt/lib/runtime/type-info.cpp            |   6 +-
 flang-rt/lib/runtime/work-queue.cpp           | 145 ++++
 flang-rt/unittests/Runtime/ExternalIOTest.cpp |   2 +-
 flang/docs/Extensions.md                      |  10 +
 flang/include/flang/Runtime/assign.h          |   2 +-
 15 files changed, 1974 insertions(+), 1081 deletions(-)
 create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h
 create mode 100644 flang-rt/lib/runtime/work-queue.cpp

diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h
index 16258b3bbba9b..87fe1f92ba545 100644
--- a/flang-rt/include/flang-rt/runtime/environment.h
+++ b/flang-rt/include/flang-rt/runtime/environment.h
@@ -63,6 +63,7 @@ struct ExecutionEnvironment {
   bool noStopMessage{false}; // NO_STOP_MESSAGE=1 inhibits "Fortran STOP"
   bool defaultUTF8{false}; // DEFAULT_UTF8
   bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION
+  int internalDebugging{0}; // FLANG_RT_DEBUG

   // CUDA related variables
   std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE
diff --git a/flang-rt/include/flang-rt/runtime/stat.h b/flang-rt/include/flang-rt/runtime/stat.h
index 070d0bf8673fb..dc372de53506a 100644
--- a/flang-rt/include/flang-rt/runtime/stat.h
+++ b/flang-rt/include/flang-rt/runtime/stat.h
@@ -24,7 +24,7 @@ class Terminator;

 enum Stat {
   StatOk = 0, // required to be zero by Fortran
-  // Interoperable STAT= codes
+  // Interoperable STAT= codes (>= 11)
   StatBaseNull = CFI_ERROR_BASE_ADDR_NULL,
   StatBaseNotNull =
CFI_ERROR_BASE_ADDR_NOT_NULL,
   StatInvalidElemLen = CFI_INVALID_ELEM_LEN,
@@ -36,7 +36,7 @@ enum Stat {
   StatMemAllocation = CFI_ERROR_MEM_ALLOCATION,
   StatOutOfBounds = CFI_ERROR_OUT_OF_BOUNDS,

-  // Standard STAT= values
+  // Standard STAT= values (>= 101)
   StatFailedImage = FORTRAN_RUNTIME_STAT_FAILED_IMAGE,
   StatLocked = FORTRAN_RUNTIME_STAT_LOCKED,
   StatLockedOtherImage = FORTRAN_RUNTIME_STAT_LOCKED_OTHER_IMAGE,
@@ -49,10 +49,14 @@ enum Stat {
   // Additional "processor-defined" STAT= values
   StatInvalidArgumentNumber = FORTRAN_RUNTIME_STAT_INVALID_ARG_NUMBER,
   StatMissingArgument = FORTRAN_RUNTIME_STAT_MISSING_ARG,
-  StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT,
+  StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, // -1
   StatMoveAllocSameAllocatable =
       FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE,
   StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION,
+
+  // Dummy status for work queue continuation, declared here to perhaps
+  // avoid collisions
+  StatContinue = 201
 };

 RT_API_ATTRS const char *StatErrorString(int);
diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h
new file mode 100644
index 0000000000000..2b46890aeebe1
--- /dev/null
+++ b/flang-rt/include/flang-rt/runtime/work-queue.h
@@ -0,0 +1,538 @@
+//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// Internal runtime utilities for work queues that replace the use of recursion
+// for better GPU device support.
+//
+// A work queue comprises a list of tickets.
Each ticket class has a Begin()
+// member function, which is called once, and a Continue() member function
+// that can be called zero or more times. A ticket's execution terminates
+// when either of these member functions returns a status other than
+// StatContinue. When that status is not StatOk, then the whole queue
+// is shut down.
+//
+// By returning StatContinue from its Continue() member function,
+// a ticket suspends its execution so that any nested tickets that it
+// may have created can be run to completion. It is the responsibility
+// of each ticket class to maintain resumption information in its state
+// and manage its own progress. Most ticket classes inherit from
+// class ComponentsOverElements, which implements an outer loop over all
+// components of a derived type, and an inner loop over all elements
+// of a descriptor, possibly with multiple phases of execution per element.
+//
+// Tickets are created by WorkQueue::Begin...() member functions.
+// There is one of these for each "top level" recursive function in the
+// Fortran runtime support library that has been restructured into this
+// ticket framework.
+//
+// When the work queue is running tickets, it always selects the last ticket
+// on the list for execution -- "work stack" might have been a more accurate
+// name for this framework. This ticket may, while doing its job, create
+// new tickets, and since those are pushed after the active one, the first
+// such nested ticket will be the next one executed to completion -- i.e.,
+// the order of nested WorkQueue::Begin...() calls is respected.
+// Note that a ticket's Continue() member function won't be called again
+// until all nested tickets have run to completion and it is once again
+// the last ticket on the queue.
+//
+// Example for an assignment to a derived type:
+// 1. Assign() is called, and its work queue is created. It calls
+//    WorkQueue::BeginAssign() and then WorkQueue::Run().
+// 2.
Run calls AssignTicket::Begin(), which pushes a ticket via
+//    BeginFinalize() and returns StatContinue.
+// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called
+//    until one of them returns StatOk, which ends the finalization ticket.
+// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket
+//    and then returns StatOk, which ends the ticket.
+// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin()
+//    and ::Continue() are called until they are done (not StatContinue).
+//    Along the way, it may create nested AssignTickets for components,
+//    and suspend itself so that they may each run to completion.
+
+#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_
+#define FLANG_RT_RUNTIME_WORK_QUEUE_H_
+
+#include "flang-rt/runtime/connection.h"
+#include "flang-rt/runtime/descriptor.h"
+#include "flang-rt/runtime/stat.h"
+#include "flang-rt/runtime/type-info.h"
+#include "flang/Common/api-attrs.h"
+#include "flang/Runtime/freestanding-tools.h"
+#include
+
+namespace Fortran::runtime::io {
+class IoStatementState;
+struct NonTbpDefinedIoTable;
+} // namespace Fortran::runtime::io
+
+namespace Fortran::runtime {
+class Terminator;
+class WorkQueue;
+
+// Ticket worker base classes
+
+template <typename TICKET> class ImmediateTicketRunner {
+public:
+  RT_API_ATTRS explicit ImmediateTicketRunner(TICKET &ticket)
+      : ticket_{ticket} {}
+  RT_API_ATTRS int Run(WorkQueue &workQueue) {
+    int status{ticket_.Begin(workQueue)};
+    while (status == StatContinue) {
+      status = ticket_.Continue(workQueue);
+    }
+    return status;
+  }
+
+private:
+  TICKET &ticket_;
+};
+
+// Base class for ticket workers that operate elementwise over descriptors
+class Elementwise {
+protected:
+  RT_API_ATTRS Elementwise(
+      const Descriptor &instance, const Descriptor *from = nullptr)
+      : instance_{instance}, from_{from} {
+    instance_.GetLowerBounds(subscripts_);
+    if (from_) {
+      from_->GetLowerBounds(fromSubscripts_);
+    }
+  }
+  RT_API_ATTRS bool IsComplete() const { return
elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that operate over derived type components. +class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. 
+class ComponentsOverElements : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Ticket worker classes + +// Implements derived type instance initialization +class InitializeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket + : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor 
cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket : public ImmediateTicketRunner { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : ImmediateTicketRunner{*this}, to_{to}, from_{&from}, + flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type 
intrinsic assignment. +class DerivedAssignTicket : public ImmediateTicketRunner, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ImmediateTicketRunner{*this}, + ElementsOverComponents{to, derived, &from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { + +template +class DescriptorIoTicket + : public ImmediateTicketRunner>, + private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS bool &anyIoTookPlace() { return anyIoTookPlace_; } + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; + +template +class DerivedIoTicket : public ImmediateTicketRunner>, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} 
{} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; +}; + +} // namespace io::descr + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + io::descr::DescriptorIoTicket, + io::descr::DerivedIoTicket, + io::descr::DerivedIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + // APIs for particular tasks. These can return StatOk if the work is + // completed immediately. 
+ RT_API_ATTRS int BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return InitializeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + if (runTicketsImmediately_) { + return InitializeCloneTicket{clone, original, derived, hasStat, errMsg} + .Run(*this); + } else { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); + return StatContinue; + } + } + RT_API_ATTRS int BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return FinalizeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + if (runTicketsImmediately_) { + return DestroyTicket{descriptor, derived, finalize}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived, finalize); + return StatContinue; + } + } + RT_API_ATTRS int BeginAssign(Descriptor &to, const Descriptor &from, + int flags, MemmoveFct memmoveFct) { + if (runTicketsImmediately_) { + return AssignTicket{to, from, flags, memmoveFct}.Run(*this); + } else { + StartTicket().u.emplace(to, from, flags, memmoveFct); + return StatContinue; + } + } + RT_API_ATTRS int BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) { + if (runTicketsImmediately_) { + return DerivedAssignTicket{ + to, from, derived, flags, memmoveFct, deallocateAfter} + .Run(*this); + } else { + StartTicket().u.emplace( + to, from, derived, flags, 
memmoveFct, deallocateAfter); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDescriptorIo(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DescriptorIoTicket{ + io, descriptor, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, table, anyIoTookPlace); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedIo(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DerivedIoTicket{ + io, descriptor, derived, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + return StatContinue; + } + } + + RT_API_ATTRS int Run(); + +private: +#if RT_DEVICE_COMPILATION + // Always use the work queue on a GPU device to avoid recursion. + static constexpr bool runTicketsImmediately_{false}; +#else + // Avoid the work queue overhead on the host, unless it needs + // debugging, which is so much easier there. + static constexpr bool runTicketsImmediately_{true}; +#endif + + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 9be75da9520e3..cc2000ddfdb6e 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncId))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncId)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic assignments and 
// those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,373 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + if (workQueue.BeginAssign(to, from, flags, memmoveFct) == StatContinue) { + workQueue.Run(); + } +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + if (int status{workQueue.BeginFinalize(*toDeallocate_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncId))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncId))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(newFrom, *derived)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + } + static constexpr int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (int status{workQueue.BeginAssign( + newFrom, *from_, nestedFlags, memmoveFct_)}; + status != StatOk && status != StatContinue) { + return status; } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && 
toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + if (int status{workQueue.BeginFinalize(to_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!toDerived_->noDestructionNeeded()) { + if (int status{ + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + return StatContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); } + return StatOk; } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. 
A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(to_, *toDerived_)}; + status != StatOk) { + return status; + } + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t 
toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); - } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. 
- // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } + if (toDerived_) { + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_( + instance_.ElementComponent(subscripts_, procPtr.offset), + from_->ElementComponent( + fromSubscripts_, procPtr.offset), + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementsOverComponents so that it + // loops over elements at the outer level and over components at the inner. + // This yields less surprising behavior for codes that care due to the use + // of defined assignments when the ordering of their calls matters. + while (!IsComplete()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, *from_, workQueue.terminator(), fromSubscripts_); + Advance(); + if (int status{workQueue.BeginAssign( + toCompDesc, fromCompDesc, flags_, memmoveFct_)}; + status != StatOk) { + return status; + } + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_->Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } + } + toDesc->Deallocate(); + } + Advance(); + } else { + // Allocatable components of the LHS are 
unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + int nestedFlags{flags_ | DeallocateLHS}; + Advance(); + if (int status{workQueue.BeginAssign( + *toDesc, *fromDesc, nestedFlags, memmoveFct_)}; + status != StatOk) { + return status; + } + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +679,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -597,11 +694,11 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
- if (var) + if (var) { Assign(*var, temp, terminator, NoAssignFlags); + } temp.Destroy(/*finalize=*/false, /*destroyPointers=*/false, &terminator); } diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index c46ea806a430a..8462d0aba1f06 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,195 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitialize(instance, derived)}; + return status == StatContinue ? 
workQueue.Run() : status; +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &comp{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + auto &pptr{*instance_.ElementComponent( + subscripts_, comp.offset)}; + pptr = comp.procInitialization; + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + SkipToNextComponent(); + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t 
bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitialize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitializeClone( + clone, original, derived, hasStat, errMsg)}; + return status == StatContinue ? workQueue.Run() : status; } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncId), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. 
- stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. - if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncId), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + if (int status{workQueue.BeginInitialize(cloneDesc, *derived)}; + status != StatOk) { + return status; } } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_)}; + status != StatOk) { + return status; + } + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); + } + } + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + if (workQueue.BeginFinalize(descriptor, derived) == StatContinue) { + workQueue.Run(); } } - return stat; } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +237,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +274,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncId) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,87 +298,94 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with the same 
// descriptor so that the rank is preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + if (int status{ + workQueue.BeginFinalize(compDesc, *compDynamicType)}; + status != StatOk) { + return status; } } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginFinalize(compDesc, *compType)}; + status != StatOk) { + return status; } } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == 
typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginFinalize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + const auto &parentType{*finalizableParentType_}; + finalizableParentType_ = nullptr; + // Don't return StatOk here if the nested Finalize is still running; + // it needs this->componentDescriptor_. + return workQueue.BeginFinalize(tmpDesc, parentType); } + return StatOk; } // The order of finalization follows Fortran 2018 7.5.6.2, with @@ -373,51 +394,71 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + if (workQueue.BeginDestroy(descriptor, derived, finalize) == StatContinue) { + workQueue.Run(); + } } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + if (int status{workQueue.BeginFinalize(instance_, derived_)}; + status != StatOk && status != StatContinue) { + return status; + } } + return StatContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. 
// Contrary to finalization, the order of deallocation does not matter. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *d, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || 
componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginDestroy( + compDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..de2b9a788a25e 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,12 +7,40 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) Fortran::common::optional DefinedFormattedIo(IoStatementState &io, const Descriptor &descriptor, const typeInfo::DerivedType &derived, @@ -104,8 +132,8 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -152,5 +180,619 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, return handler.GetIoStat() == IostatOk; } +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output. 
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + } + } + return StatOk; +} + +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType * + type{addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } + } + } + if (const typeInfo::SpecialBinding * + special{type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + if (DefinedUnformattedIo(io_, instance_, *type, *special)) { + anyIoTookPlace_ = true; + return StatOk; + } else { + return IostatEnd; + } + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? 
io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. + if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + if constexpr (DIR == Direction::Output) { + return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + return externalUnf + ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elements_ * elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + bool any{false}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + any = FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Complex: + switch (kind) 
{ + case 2: + any = FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + any = FormattedCharacterIO(io_, instance_); + break; + case 2: + any = FormattedCharacterIO(io_, instance_); + break; + case 4: + any = FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + any = FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + any = FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + any = FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding * + binding{derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatContinue; + } + } + if (any) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ while (!IsComplete()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + Advance(); + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element<char>(subscripts_)); + Advance(); + if (int status{workQueue.BeginDerivedIo<DIR>( + io_, elementDesc, *derived_, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket<Direction::Output>::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket<Direction::Input>::Continue( + WorkQueue &); + +template <Direction DIR> +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + if (workQueue.BeginDescriptorIo<DIR>(io, descriptor, table, anyIoTookPlace) == + StatContinue) { + workQueue.Run(); + } + return anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO<Direction::Output>( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO<Direction::Input>( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. -// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) 
+#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template <typename A> -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element<A>(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { 
RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..9382c96bd870a --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,145 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/environment.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +// FLANG_RT_DEBUG code is disabled when false. 
+static constexpr bool enableDebugOutput{false}; + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (componentAt_ >= components_) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + } + firstFree_ = next; + } +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? 
insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: new ticket\n"); + } + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), + at->ticket.begun ? "Continue" : "Begin"); + } + int stat{at->ticket.Continue(*this)}; + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: ... stat %d\n", stat); + } + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp index 3833e48be3dd6..6c148b1de6f82 100644 --- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp +++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp @@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) { io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__); for (int j{1}; j <= 3; ++j) { ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc)) 
- << "OutputDescriptor() for InquireIoLength"; + << "OutputDescriptor() for InquireIoLength " << j; } ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength"; ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk) diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..32bebc1d866a4 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -843,6 +843,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. + However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. + Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. + A program using defined assignment might be able to detect the difference. 
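The two loop nestings described in the Extensions.md paragraph above can be sketched in plain C++. This is only an illustration under stated assumptions, not the runtime's implementation: `Pair`, `Trace`, `AssignElementMajor`, and `AssignComponentMajor` are hypothetical names, and the trace stands in for the side effects that a defined assignment could observe.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a derived type with two components.
struct Pair {
  int a{0};
  int b{0};
};

// Each entry records one component assignment as "<component><element index>".
using Trace = std::vector<std::string>;

// Element-major order: assign every component of element j before moving on
// to element j+1.
template <std::size_t N>
Trace AssignElementMajor(
    std::array<Pair, N> &lhs, const std::array<Pair, N> &rhs) {
  Trace trace;
  for (std::size_t j{0}; j < N; ++j) {
    lhs[j].a = rhs[j].a;
    trace.push_back("a" + std::to_string(j));
    lhs[j].b = rhs[j].b;
    trace.push_back("b" + std::to_string(j));
  }
  return trace;
}

// Component-major order: assign one component across all elements before
// moving on to the next component.
template <std::size_t N>
Trace AssignComponentMajor(
    std::array<Pair, N> &lhs, const std::array<Pair, N> &rhs) {
  Trace trace;
  for (std::size_t j{0}; j < N; ++j) {
    lhs[j].a = rhs[j].a;
    trace.push_back("a" + std::to_string(j));
  }
  for (std::size_t j{0}; j < N; ++j) {
    lhs[j].b = rhs[j].b;
    trace.push_back("b" + std::to_string(j));
  }
  return trace;
}
```

For a two-element array, the first function records a0, b0, a1, b1 and the second records a0, a1, b0, b1. Both leave the same final values in `lhs`, so only code that observes the intermediate steps can tell them apart.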
+ ## De Facto Standard Features * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION >From 716abece73720048697a548e8bfc91b2a603b266 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 23 May 2025 09:54:28 -0700 Subject: [PATCH 2/3] more --- flang-rt/lib/runtime/work-queue.cpp | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp index 9382c96bd870a..1c3dd5146d0bf 100644 --- a/flang-rt/lib/runtime/work-queue.cpp +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -87,9 +87,11 @@ RT_API_ATTRS Ticket &WorkQueue::StartTicket() { last_ = newTicket; } newTicket->ticket.begun = false; +#if !defined(RT_DEVICE_COMPILATION) if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { std::fprintf(stderr, "WQ: new ticket\n"); } +#endif return newTicket->ticket; } @@ -97,14 +99,18 @@ RT_API_ATTRS int WorkQueue::Run() { while (last_) { TicketList *at{last_}; insertAfter_ = last_; +#if !defined(RT_DEVICE_COMPILATION) if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), at->ticket.begun ? "Continue" : "Begin"); } +#endif int stat{at->ticket.Continue(*this)}; +#if !defined(RT_DEVICE_COMPILATION) if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { std::fprintf(stderr, "WQ: ... 
stat %d\n", stat); } +#endif insertAfter_ = nullptr; if (stat == StatOk) { if (at->previous) { >From 4bd179d71dc541bbb96c6a0974ec6cdc6ca892b1 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 23 May 2025 13:46:30 -0700 Subject: [PATCH 3/3] Device build fixes --- flang-rt/lib/runtime/descriptor-io.cpp | 4 ++-- flang-rt/lib/runtime/work-queue.cpp | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index de2b9a788a25e..9bf7c193d2d96 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -42,7 +42,7 @@ inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, } // Defined formatted I/O (maybe) -Fortran::common::optional DefinedFormattedIo(IoStatementState &io, +static RT_API_ATTRS Fortran::common::optional DefinedFormattedIo(IoStatementState &io, const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special, const SubscriptValue subscripts[]) { @@ -132,7 +132,7 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -static bool DefinedUnformattedIo(IoStatementState &io, +static RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp index 1c3dd5146d0bf..eee1c551aad6d 100644 --- a/flang-rt/lib/runtime/work-queue.cpp +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -13,8 +13,10 @@ namespace Fortran::runtime { +#if !defined(RT_DEVICE_COMPILATION) // FLANG_RT_DEBUG code is disabled when false. 
 static constexpr bool enableDebugOutput{false};
+#endif

 RT_OFFLOAD_API_GROUP_BEGIN

From flang-commits at lists.llvm.org Fri May 23 16:17:58 2025
From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits)
Date: Fri, 23 May 2025 16:17:58 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822)
In-Reply-To:
Message-ID: <68310226.170a0220.88986.9708@mx.google.com>

https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822

>From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001
From: Eugene Epshteyn
Date: Tue, 20 May 2025 19:48:48 -0400
Subject: [PATCH 1/4] [flang] Ensure that the integer for Cray pointer is sized correctly

The integer used for Cray pointers should have a size equal to the platform's pointer size.

modified: flang/include/flang/Evaluate/target.h
modified: flang/include/flang/Tools/TargetSetup.h
modified: flang/lib/Semantics/resolve-names.cpp
---
 flang/include/flang/Evaluate/target.h   | 6 ++++++
 flang/include/flang/Tools/TargetSetup.h | 3 +++
 flang/lib/Semantics/resolve-names.cpp   | 2 +-
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h
index cc6172b492b3c..281e42d5285fb 100644
--- a/flang/include/flang/Evaluate/target.h
+++ b/flang/include/flang/Evaluate/target.h
@@ -131,6 +131,11 @@ class TargetCharacteristics {
   IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; }
   const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; }

+  std::size_t pointerSize() { return pointerSize_; }
+  void set_pointerSize(std::size_t pointerSize) {
+    pointerSize_ = pointerSize;
+  }
+
 private:
   static constexpr int maxKind{common::maxKind};
   std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{};
@@ -156,6 +161,7 @@ class TargetCharacteristics {
       IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt,
IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. } diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 30477b842ae64fb1c94f520136956cc4685648a7 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:59:20 -0400 Subject: [PATCH 2/4] clang-format --- flang/include/flang/Evaluate/target.h | 4 +--- flang/include/flang/Tools/TargetSetup.h | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d5285fb..e8b9fedc38f48 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ class TargetCharacteristics { const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } 
std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e21fb4..24ab65f740ec6 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. >From e1f91c83680d72fc1463a8db97a77298141286d9 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:41:48 -0400 Subject: [PATCH 3/4] Renaming to integerKindForPointer --- flang/include/flang/Evaluate/target.h | 8 +++++--- flang/include/flang/Tools/TargetSetup.h | 3 ++- flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index e8b9fedc38f48..7b38db2db1956 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,8 +131,10 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } - std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } + std::size_t integerKindForPointer() { return integerKindForPointer_; } + void set_integerKindForPointer(std::size_t newKind) { + integerKindForPointer_ = newKind; + } private: static constexpr int maxKind{common::maxKind}; @@ -159,7 +161,7 @@ class 
TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; - std::size_t pointerSize_{8 /* bytes */}; + std::size_t integerKindForPointer_{8}; /* For 64 bit pointer */ }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index 24ab65f740ec6..002e82aa72484 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,7 +94,8 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); - targetCharacteristics.set_pointerSize( + // Currently the integer kind happens to be the same as the byte size + targetCharacteristics.set_integerKindForPointer( targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 88114bfe84af3..5fd5ea8f9bc5c 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; + TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 940ff38d0abba9d1b0ab415e42474629284962d8 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:51:53 -0400 Subject: [PATCH 4/4] clang-format --- flang/lib/Semantics/resolve-names.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 5fd5ea8f9bc5c..a8dbf61c8fd68 
100644
--- a/flang/lib/Semantics/resolve-names.cpp
+++ b/flang/lib/Semantics/resolve-names.cpp
@@ -6689,8 +6689,8 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) {
         "'%s' cannot be a Cray pointer as it is already a Cray pointee"_err_en_US);
   }
   pointer->set(Symbol::Flag::CrayPointer);
-  const DeclTypeSpec &pointerType{MakeNumericType(
-      TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())};
+  const DeclTypeSpec &pointerType{MakeNumericType(TypeCategory::Integer,
+      context().targetCharacteristics().integerKindForPointer())};
   const auto *type{pointer->GetType()};
   if (!type) {
     pointer->SetType(pointerType);

From flang-commits at lists.llvm.org Sat May 24 13:46:07 2025
From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits)
Date: Sat, 24 May 2025 13:46:07 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380)
Message-ID:

https://github.com/mcinally created https://github.com/llvm/llvm-project/pull/141380

This patch adds support for the -mprefer-vector-width=<value> command line option. The parsing of this option is equivalent to Clang's, and it is implemented by setting the "prefer-vector-width" function attribute.

>From 05221439896aabdcadf79d15ae6875dc34f10edd Mon Sep 17 00:00:00 2001
From: Cameron McInally
Date: Sat, 24 May 2025 13:35:13 -0700
Subject: [PATCH] [flang] Add support for -mprefer-vector-width=

This patch adds support for the -mprefer-vector-width=<value> command line option. The parsing of this option is equivalent to Clang's, and it is implemented by setting the "prefer-vector-width" function attribute.
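The parsing rule the description refers to (accept the literal `none`, otherwise require a decimal unsigned integer, otherwise report an invalid value) can be sketched without any LLVM dependency. This is a minimal standalone sketch, not the patch's code: `ParsePreferVectorWidth` is a hypothetical helper, and it uses `std::from_chars` in place of `llvm::StringRef::getAsInteger`, so edge cases such as a leading `+` may be treated differently from the real option parsing.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper (not part of the patch): returns the string to store as
// the "prefer-vector-width" attribute value, or std::nullopt when the option
// value is invalid and a diagnostic should be reported instead.
std::optional<std::string> ParsePreferVectorWidth(std::string_view value) {
  if (value == "none") {
    return std::string{"none"};
  }
  unsigned width{0};
  const char *first{value.data()};
  const char *last{value.data() + value.size()};
  // std::from_chars reports failure through 'ec'; requiring ptr == last
  // rejects trailing garbage such as "128x".
  auto [ptr, ec] = std::from_chars(first, last, width, 10);
  if (ec != std::errc{} || ptr != last) {
    return std::nullopt;
  }
  // Keep the original spelling, mirroring how the patch stores the string.
  return std::string{value};
}
```

Under these assumptions, `ParsePreferVectorWidth("256")` yields `"256"`, `ParsePreferVectorWidth("none")` yields `"none"`, and `ParsePreferVectorWidth("xxx")` yields an empty optional, which corresponds to the `CHECK-INVALID` case in the patch's driver test.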
--- clang/include/clang/Driver/Options.td | 2 +- flang/include/flang/Frontend/CodeGenOptions.h | 3 +++ .../include/flang/Optimizer/Transforms/Passes.td | 4 ++++ flang/include/flang/Tools/CrossToolHelpers.h | 3 +++ flang/lib/Frontend/CompilerInvocation.cpp | 14 ++++++++++++++ flang/lib/Frontend/FrontendActions.cpp | 2 ++ flang/lib/Optimizer/Passes/Pipelines.cpp | 2 +- flang/lib/Optimizer/Transforms/FunctionAttr.cpp | 5 +++++ flang/test/Driver/prefer-vector-width.f90 | 16 ++++++++++++++++ mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td | 1 + mlir/lib/Target/LLVMIR/ModuleImport.cpp | 4 ++++ mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | 3 +++ 12 files changed, 57 insertions(+), 2 deletions(-) create mode 100644 flang/test/Driver/prefer-vector-width.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 22261621df092..b0b642796010b 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -5480,7 +5480,7 @@ def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group, " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">, MarshallingInfoStringVector>; def mprefer_vector_width_EQ : Joined<["-"], "mprefer-vector-width=">, Group, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Specifies preferred vector width for auto-vectorization. Defaults to 'none' which allows target specific decisions.">, MarshallingInfoString>; def mstack_protector_guard_EQ : Joined<["-"], "mstack-protector-guard=">, Group, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -53,6 +53,9 @@ class CodeGenOptions : public CodeGenOptionsBase { /// The paths to the pass plugins that were registered using -fpass-plugin. 
std::vector LLVMPassPlugins; + // The prefered vector width, if requested by -mprefer-vector-width. + std::string PreferVectorWidth; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index b251534e1a8f6..2e932d9ad4a26 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -418,6 +418,10 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"preferVectorWidth", "prefer-vector-width", "std::string", + /*default=*/"", + "Set the prefer-vector-width attribute on functions in the " + "module.">, Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"", "Set the tune-cpu attribute on functions in the module.">, ]; diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 118695bbe2626..058024a4a04c5 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; InstrumentFunctionExit = "__cyg_profile_func_exit"; @@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. 
bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for + ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. bool EnableOpenMP = false; ///< Enable OpenMP lowering. std::string InstrumentFunctionEntry = diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..122eb56fa0599 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -309,6 +309,20 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ)) opts.LLVMPassPlugins.push_back(a->getValue()); + // -mprefer_vector_width option + if (const llvm::opt::Arg *a = + args.getLastArg(clang::driver::options::OPT_mprefer_vector_width_EQ)) { + llvm::StringRef s = a->getValue(); + unsigned Width; + if (s == "none") + opts.PreferVectorWidth = "none"; + else if (s.getAsInteger(10, Width)) + diags.Report(clang::diag::err_drv_invalid_value) + << a->getAsString(args) << a->getValue(); + else + opts.PreferVectorWidth = s.str(); + } + // -fembed-offload-object option for (auto *a : args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 38dfaadf1dff9..012d0fdfe645f 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -741,6 +741,8 @@ void CodeGenAction::generateLLVMIR() { config.VScaleMax = vsr->second; } + config.PreferVectorWidth = opts.PreferVectorWidth; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..5c1e1b9f77efb 100644 --- 
a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -354,7 +354,7 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - ""})); + config.PreferVectorWidth, ""})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 43e4c1a7af3cd..13f447cf738b4 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -36,6 +36,7 @@ class FunctionAttrPass : public fir::impl::FunctionAttrBase { approxFuncFPMath = options.approxFuncFPMath; noSignedZerosFPMath = options.noSignedZerosFPMath; unsafeFPMath = options.unsafeFPMath; + preferVectorWidth = options.preferVectorWidth; } FunctionAttrPass() {} void runOnOperation() override; @@ -102,6 +103,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!preferVectorWidth.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, preferVectorWidth)); LLVM_DEBUG(llvm::dbgs() << "=== End " DEBUG_TYPE " ===\n"); } diff --git a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90 new file mode 100644 index 0000000000000..d0f5fd28db826 --- /dev/null +++ b/flang/test/Driver/prefer-vector-width.f90 @@ -0,0 +1,16 @@ +! Test that -mprefer-vector-width works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! RUN: %flang_fc1 -mprefer-vector-width=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! 
RUN: %flang_fc1 -mprefer-vector-width=128 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-128 +! RUN: %flang_fc1 -mprefer-vector-width=256 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-256 +! RUN: not %flang_fc1 -mprefer-vector-width=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INVALID + +subroutine func +end subroutine func + +! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} } +! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } +! CHECK-128: attributes #0 = { "prefer-vector-width"="128" } +! CHECK-256: attributes #0 = { "prefer-vector-width"="256" } +! CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index 6fde45ce5c556..c0324d561b77b 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1893,6 +1893,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$frame_pointer, OptionalAttr:$target_cpu, OptionalAttr:$tune_cpu, + OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index b049064fbd31c..85417da798b22 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, funcOp.setTargetFeaturesAttr( LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width"); + attr.isStringAttribute()) + funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp 
b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index 047e870b7dcd8..2b7f0b11613aa 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1540,6 +1540,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) { if (auto tuneCpu = func.getTuneCpu()) llvmFunc->addFnAttr("tune-cpu", *tuneCpu); + if (auto preferVectorWidth = func.getPreferVectorWidth()) + llvmFunc->addFnAttr("prefer-vector-width", *preferVectorWidth); + if (auto attr = func.getVscaleRange()) llvmFunc->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs( getLLVMContext(), attr->getMinRange().getInt(), From flang-commits at lists.llvm.org Sat May 24 13:46:39 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 24 May 2025 13:46:39 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <6832302f.170a0220.184911.e962@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Cameron McInally (mcinally)
Changes

This patch adds support for the -mprefer-vector-width=<value> command line option. The parsing of this option is equivalent to Clang's, and it is implemented by setting the "prefer-vector-width" function attribute.

---
Full diff: https://github.com/llvm/llvm-project/pull/141380.diff

12 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+1-1)
- (modified) flang/include/flang/Frontend/CodeGenOptions.h (+3)
- (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+4)
- (modified) flang/include/flang/Tools/CrossToolHelpers.h (+3)
- (modified) flang/lib/Frontend/CompilerInvocation.cpp (+14)
- (modified) flang/lib/Frontend/FrontendActions.cpp (+2)
- (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+1-1)
- (modified) flang/lib/Optimizer/Transforms/FunctionAttr.cpp (+5)
- (added) flang/test/Driver/prefer-vector-width.f90 (+16)
- (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td (+1)
- (modified) mlir/lib/Target/LLVMIR/ModuleImport.cpp (+4)
- (modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+3)

``````````diff
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 22261621df092..b0b642796010b 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5480,7 +5480,7 @@ def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group,
 " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">,
 MarshallingInfoStringVector>;
 def mprefer_vector_width_EQ : Joined<["-"], "mprefer-vector-width=">, Group,
- Visibility<[ClangOption, CC1Option]>,
+ Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
 HelpText<"Specifies preferred vector width for auto-vectorization.
Defaults to 'none' which allows target specific decisions.">, MarshallingInfoString>; def mstack_protector_guard_EQ : Joined<["-"], "mstack-protector-guard=">, Group, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -53,6 +53,9 @@ class CodeGenOptions : public CodeGenOptionsBase { /// The paths to the pass plugins that were registered using -fpass-plugin. std::vector LLVMPassPlugins; + // The prefered vector width, if requested by -mprefer-vector-width. + std::string PreferVectorWidth; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index b251534e1a8f6..2e932d9ad4a26 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -418,6 +418,10 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"preferVectorWidth", "prefer-vector-width", "std::string", + /*default=*/"", + "Set the prefer-vector-width attribute on functions in the " + "module.">, Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"", "Set the tune-cpu attribute on functions in the module.">, ]; diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 118695bbe2626..058024a4a04c5 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { 
UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; InstrumentFunctionExit = "__cyg_profile_func_exit"; @@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for + ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. bool EnableOpenMP = false; ///< Enable OpenMP lowering. std::string InstrumentFunctionEntry = diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..122eb56fa0599 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -309,6 +309,20 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ)) opts.LLVMPassPlugins.push_back(a->getValue()); + // -mprefer_vector_width option + if (const llvm::opt::Arg *a = + args.getLastArg(clang::driver::options::OPT_mprefer_vector_width_EQ)) { + llvm::StringRef s = a->getValue(); + unsigned Width; + if (s == "none") + opts.PreferVectorWidth = "none"; + else if (s.getAsInteger(10, Width)) + diags.Report(clang::diag::err_drv_invalid_value) + << a->getAsString(args) << a->getValue(); + else + opts.PreferVectorWidth = s.str(); + } + // -fembed-offload-object option for (auto *a : args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 38dfaadf1dff9..012d0fdfe645f 100644 --- 
a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -741,6 +741,8 @@ void CodeGenAction::generateLLVMIR() { config.VScaleMax = vsr->second; } + config.PreferVectorWidth = opts.PreferVectorWidth; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..5c1e1b9f77efb 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -354,7 +354,7 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - ""})); + config.PreferVectorWidth, ""})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 43e4c1a7af3cd..13f447cf738b4 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -36,6 +36,7 @@ class FunctionAttrPass : public fir::impl::FunctionAttrBase { approxFuncFPMath = options.approxFuncFPMath; noSignedZerosFPMath = options.noSignedZerosFPMath; unsafeFPMath = options.unsafeFPMath; + preferVectorWidth = options.preferVectorWidth; } FunctionAttrPass() {} void runOnOperation() override; @@ -102,6 +103,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!preferVectorWidth.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, preferVectorWidth)); LLVM_DEBUG(llvm::dbgs() << "=== End " DEBUG_TYPE " ===\n"); } diff --git 
a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90 new file mode 100644 index 0000000000000..d0f5fd28db826 --- /dev/null +++ b/flang/test/Driver/prefer-vector-width.f90 @@ -0,0 +1,16 @@ +! Test that -mprefer-vector-width works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! RUN: %flang_fc1 -mprefer-vector-width=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! RUN: %flang_fc1 -mprefer-vector-width=128 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-128 +! RUN: %flang_fc1 -mprefer-vector-width=256 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-256 +! RUN: not %flang_fc1 -mprefer-vector-width=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INVALID + +subroutine func +end subroutine func + +! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} } +! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } +! CHECK-128: attributes #0 = { "prefer-vector-width"="128" } +! CHECK-256: attributes #0 = { "prefer-vector-width"="256" } +! 
CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index 6fde45ce5c556..c0324d561b77b 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1893,6 +1893,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$frame_pointer, OptionalAttr:$target_cpu, OptionalAttr:$tune_cpu, + OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index b049064fbd31c..85417da798b22 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, funcOp.setTargetFeaturesAttr( LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width"); + attr.isStringAttribute()) + funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index 047e870b7dcd8..2b7f0b11613aa 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1540,6 +1540,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) { if (auto tuneCpu = func.getTuneCpu()) llvmFunc->addFnAttr("tune-cpu", *tuneCpu); + if (auto preferVectorWidth = func.getPreferVectorWidth()) + llvmFunc->addFnAttr("prefer-vector-width", *preferVectorWidth); + if (auto attr = func.getVscaleRange()) llvmFunc->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs( getLLVMContext(), 
attr->getMinRange().getInt(), ``````````
https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Sat May 24 13:46:40 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 24 May 2025 13:46:40 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68323030.050a0220.25a53d.1dbb@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-mlir @llvm/pr-subscribers-clang Author: Cameron McInally (mcinally)
Changes

This patch adds support for the -mprefer-vector-width=<value> command line option. The parsing of this option is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute.

---
Full diff: https://github.com/llvm/llvm-project/pull/141380.diff

12 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+1-1)
- (modified) flang/include/flang/Frontend/CodeGenOptions.h (+3)
- (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+4)
- (modified) flang/include/flang/Tools/CrossToolHelpers.h (+3)
- (modified) flang/lib/Frontend/CompilerInvocation.cpp (+14)
- (modified) flang/lib/Frontend/FrontendActions.cpp (+2)
- (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+1-1)
- (modified) flang/lib/Optimizer/Transforms/FunctionAttr.cpp (+5)
- (added) flang/test/Driver/prefer-vector-width.f90 (+16)
- (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td (+1)
- (modified) mlir/lib/Target/LLVMIR/ModuleImport.cpp (+4)
- (modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+3)

``````````diff
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 22261621df092..b0b642796010b 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5480,7 +5480,7 @@ def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group,
   " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">,
   MarshallingInfoStringVector>;
 def mprefer_vector_width_EQ : Joined<["-"], "mprefer-vector-width=">, Group,
-  Visibility<[ClangOption, CC1Option]>,
+  Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
   HelpText<"Specifies preferred vector width for auto-vectorization. Defaults to 'none' which allows target specific decisions.">,
   MarshallingInfoString>;
 def mstack_protector_guard_EQ : Joined<["-"], "mstack-protector-guard=">, Group,
diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h
index 2b4e823b3fef4..61e56e51c4bbb 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.h
+++ b/flang/include/flang/Frontend/CodeGenOptions.h
@@ -53,6 +53,9 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// The paths to the pass plugins that were registered using -fpass-plugin.
   std::vector LLVMPassPlugins;
 
+  // The prefered vector width, if requested by -mprefer-vector-width.
+  std::string PreferVectorWidth;
+
   /// List of filenames passed in using the -fembed-offload-object option. These
   /// are offloading binaries containing device images and metadata.
   std::vector OffloadObjects;
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index b251534e1a8f6..2e932d9ad4a26 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -418,6 +418,10 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> {
            "module.">,
     Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false",
            "Set the unsafe-fp-math attribute on functions in the module.">,
+    Option<"preferVectorWidth", "prefer-vector-width", "std::string",
+           /*default=*/"",
+           "Set the prefer-vector-width attribute on functions in the "
+           "module.">,
     Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"",
            "Set the tune-cpu attribute on functions in the module.">,
   ];
diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h
index 118695bbe2626..058024a4a04c5 100644
--- a/flang/include/flang/Tools/CrossToolHelpers.h
+++ b/flang/include/flang/Tools/CrossToolHelpers.h
@@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks {
     UnsafeFPMath = mathOpts.getAssociativeMath() &&
         mathOpts.getReciprocalMath() && NoSignedZerosFPMath &&
         ApproxFuncFPMath && mathOpts.getFPContractEnabled();
+    PreferVectorWidth = opts.PreferVectorWidth;
     if (opts.InstrumentFunctions) {
       InstrumentFunctionEntry = "__cyg_profile_func_enter";
       InstrumentFunctionExit = "__cyg_profile_func_exit";
@@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks {
   bool NoSignedZerosFPMath =
       false; ///< Set no-signed-zeros-fp-math attribute for functions.
   bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions.
+  std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for
+                                      ///< functions.
   bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments.
   bool EnableOpenMP = false; ///< Enable OpenMP lowering.
   std::string InstrumentFunctionEntry =
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index ba2531819ee5e..122eb56fa0599 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -309,6 +309,20 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
   for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ))
     opts.LLVMPassPlugins.push_back(a->getValue());
 
+  // -mprefer_vector_width option
+  if (const llvm::opt::Arg *a =
+          args.getLastArg(clang::driver::options::OPT_mprefer_vector_width_EQ)) {
+    llvm::StringRef s = a->getValue();
+    unsigned Width;
+    if (s == "none")
+      opts.PreferVectorWidth = "none";
+    else if (s.getAsInteger(10, Width))
+      diags.Report(clang::diag::err_drv_invalid_value)
+          << a->getAsString(args) << a->getValue();
+    else
+      opts.PreferVectorWidth = s.str();
+  }
+
   // -fembed-offload-object option
   for (auto *a :
        args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ))
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 38dfaadf1dff9..012d0fdfe645f 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -741,6 +741,8 @@ void CodeGenAction::generateLLVMIR() {
     config.VScaleMax = vsr->second;
   }
 
+  config.PreferVectorWidth = opts.PreferVectorWidth;
+
   if (ci.getInvocation().getFrontendOpts().features.IsEnabled(
           Fortran::common::LanguageFeature::OpenMP))
     config.EnableOpenMP = true;
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 77751908e35be..5c1e1b9f77efb 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -354,7 +354,7 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm,
       {framePointerKind, config.InstrumentFunctionEntry,
        config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath,
        config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath,
-       ""}));
+       config.PreferVectorWidth, ""}));
 
   if (config.EnableOpenMP) {
     pm.addNestedPass(
diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp
index 43e4c1a7af3cd..13f447cf738b4 100644
--- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp
+++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp
@@ -36,6 +36,7 @@ class FunctionAttrPass : public fir::impl::FunctionAttrBase {
     approxFuncFPMath = options.approxFuncFPMath;
     noSignedZerosFPMath = options.noSignedZerosFPMath;
     unsafeFPMath = options.unsafeFPMath;
+    preferVectorWidth = options.preferVectorWidth;
   }
   FunctionAttrPass() {}
   void runOnOperation() override;
@@ -102,6 +103,10 @@ void FunctionAttrPass::runOnOperation() {
     func->setAttr(
         mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName),
         mlir::BoolAttr::get(context, true));
+  if (!preferVectorWidth.empty())
+    func->setAttr(
+        mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName),
+        mlir::StringAttr::get(context, preferVectorWidth));
 
   LLVM_DEBUG(llvm::dbgs() << "=== End " DEBUG_TYPE " ===\n");
 }
diff --git a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90
new file mode 100644
index 0000000000000..d0f5fd28db826
--- /dev/null
+++ b/flang/test/Driver/prefer-vector-width.f90
@@ -0,0 +1,16 @@
+! Test that -mprefer-vector-width works as expected.
+
+! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF
+! RUN: %flang_fc1 -mprefer-vector-width=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE
+! RUN: %flang_fc1 -mprefer-vector-width=128 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-128
+! RUN: %flang_fc1 -mprefer-vector-width=256 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-256
+! RUN: not %flang_fc1 -mprefer-vector-width=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INVALID
+
+subroutine func
+end subroutine func
+
+! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} }
+! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" }
+! CHECK-128: attributes #0 = { "prefer-vector-width"="128" }
+! CHECK-256: attributes #0 = { "prefer-vector-width"="256" }
+! CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx'
diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
index 6fde45ce5c556..c0324d561b77b 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
@@ -1893,6 +1893,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [
     OptionalAttr:$frame_pointer,
     OptionalAttr:$target_cpu,
     OptionalAttr:$tune_cpu,
+    OptionalAttr:$prefer_vector_width,
     OptionalAttr:$target_features,
     OptionalAttr:$unsafe_fp_math,
     OptionalAttr:$no_infs_fp_math,
diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp
index b049064fbd31c..85417da798b22 100644
--- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp
@@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func,
     funcOp.setTargetFeaturesAttr(
         LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString()));
 
+  if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width");
+      attr.isStringAttribute())
+    funcOp.setPreferVectorWidth(attr.getValueAsString());
+
   if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math");
       attr.isStringAttribute())
     funcOp.setUnsafeFpMath(attr.getValueAsBool());
diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
index 047e870b7dcd8..2b7f0b11613aa 100644
--- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
@@ -1540,6 +1540,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) {
   if (auto tuneCpu = func.getTuneCpu())
     llvmFunc->addFnAttr("tune-cpu", *tuneCpu);
 
+  if (auto preferVectorWidth = func.getPreferVectorWidth())
+    llvmFunc->addFnAttr("prefer-vector-width", *preferVectorWidth);
+
   if (auto attr = func.getVscaleRange())
     llvmFunc->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs(
         getLLVMContext(), attr->getMinRange().getInt(),
``````````
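For readers without an LLVM checkout, the accept/reject rule described above ("none" or an unsigned decimal integer, anything else diagnosed as invalid) can be sketched in standard C++. This is only an illustration under stated assumptions: `parsePreferVectorWidth` is a hypothetical name that does not appear in the patch, `std::from_chars` stands in for `llvm::StringRef::getAsInteger`, and the two may differ on edge cases.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper, not part of the patch: approximates the validation
// applied to the value of -mprefer-vector-width= without depending on LLVM.
// Returns the string that would be stored for the "prefer-vector-width"
// function attribute, or std::nullopt when the value should be rejected.
std::optional<std::string> parsePreferVectorWidth(std::string_view value) {
  // "none" is passed through unchanged.
  if (value == "none")
    return std::string("none");

  // Otherwise the whole value must parse as an unsigned base-10 integer.
  unsigned width = 0;
  const char *first = value.data();
  const char *last = first + value.size();
  auto [ptr, ec] = std::from_chars(first, last, width, 10);
  if (ec != std::errc() || ptr != last)
    return std::nullopt; // e.g. "xxx", "128x", or an empty value

  // Store the original spelling, as the patch does with s.str().
  return std::string(value);
}
```

With this sketch, "none", "128" and "256" are accepted and returned unchanged, while "xxx" yields no value, which corresponds to the CHECK-INVALID case in the new driver test.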
https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Sat May 24 13:46:41 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 24 May 2025 13:46:41 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68323031.170a0220.1212dc.ec83@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-mlir-llvm Author: Cameron McInally (mcinally)
Changes

This patch adds support for the -mprefer-vector-width=<value> command line option. The parsing of this option is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute.

---
Full diff: https://github.com/llvm/llvm-project/pull/141380.diff

12 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+1-1)
- (modified) flang/include/flang/Frontend/CodeGenOptions.h (+3)
- (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+4)
- (modified) flang/include/flang/Tools/CrossToolHelpers.h (+3)
- (modified) flang/lib/Frontend/CompilerInvocation.cpp (+14)
- (modified) flang/lib/Frontend/FrontendActions.cpp (+2)
- (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+1-1)
- (modified) flang/lib/Optimizer/Transforms/FunctionAttr.cpp (+5)
- (added) flang/test/Driver/prefer-vector-width.f90 (+16)
- (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td (+1)
- (modified) mlir/lib/Target/LLVMIR/ModuleImport.cpp (+4)
- (modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+3)
https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Sat May 24 13:46:41 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 24 May 2025 13:46:41 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68323031.170a0220.19d206.8e6f@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-driver Author: Cameron McInally (mcinally)
Changes

This patch adds support for the -mprefer-vector-width=<value> command line option. The parsing of this option is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute.

---
Full diff: https://github.com/llvm/llvm-project/pull/141380.diff

12 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+1-1)
- (modified) flang/include/flang/Frontend/CodeGenOptions.h (+3)
- (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+4)
- (modified) flang/include/flang/Tools/CrossToolHelpers.h (+3)
- (modified) flang/lib/Frontend/CompilerInvocation.cpp (+14)
- (modified) flang/lib/Frontend/FrontendActions.cpp (+2)
- (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+1-1)
- (modified) flang/lib/Optimizer/Transforms/FunctionAttr.cpp (+5)
- (added) flang/test/Driver/prefer-vector-width.f90 (+16)
- (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td (+1)
- (modified) mlir/lib/Target/LLVMIR/ModuleImport.cpp (+4)
- (modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+3)

``````````diff diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 22261621df092..b0b642796010b 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -5480,7 +5480,7 @@ def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group, " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">, MarshallingInfoStringVector>; def mprefer_vector_width_EQ : Joined<["-"], "mprefer-vector-width=">, Group, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Specifies preferred vector width for auto-vectorization. 
Defaults to 'none' which allows target specific decisions.">, MarshallingInfoString>; def mstack_protector_guard_EQ : Joined<["-"], "mstack-protector-guard=">, Group, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -53,6 +53,9 @@ class CodeGenOptions : public CodeGenOptionsBase { /// The paths to the pass plugins that were registered using -fpass-plugin. std::vector LLVMPassPlugins; + // The prefered vector width, if requested by -mprefer-vector-width. + std::string PreferVectorWidth; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index b251534e1a8f6..2e932d9ad4a26 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -418,6 +418,10 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"preferVectorWidth", "prefer-vector-width", "std::string", + /*default=*/"", + "Set the prefer-vector-width attribute on functions in the " + "module.">, Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"", "Set the tune-cpu attribute on functions in the module.">, ]; diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 118695bbe2626..058024a4a04c5 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { 
UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; InstrumentFunctionExit = "__cyg_profile_func_exit"; @@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for + ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. bool EnableOpenMP = false; ///< Enable OpenMP lowering. std::string InstrumentFunctionEntry = diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..122eb56fa0599 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -309,6 +309,20 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ)) opts.LLVMPassPlugins.push_back(a->getValue()); + // -mprefer_vector_width option + if (const llvm::opt::Arg *a = + args.getLastArg(clang::driver::options::OPT_mprefer_vector_width_EQ)) { + llvm::StringRef s = a->getValue(); + unsigned Width; + if (s == "none") + opts.PreferVectorWidth = "none"; + else if (s.getAsInteger(10, Width)) + diags.Report(clang::diag::err_drv_invalid_value) + << a->getAsString(args) << a->getValue(); + else + opts.PreferVectorWidth = s.str(); + } + // -fembed-offload-object option for (auto *a : args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 38dfaadf1dff9..012d0fdfe645f 100644 --- 
a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -741,6 +741,8 @@ void CodeGenAction::generateLLVMIR() { config.VScaleMax = vsr->second; } + config.PreferVectorWidth = opts.PreferVectorWidth; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..5c1e1b9f77efb 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -354,7 +354,7 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - ""})); + config.PreferVectorWidth, ""})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 43e4c1a7af3cd..13f447cf738b4 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -36,6 +36,7 @@ class FunctionAttrPass : public fir::impl::FunctionAttrBase { approxFuncFPMath = options.approxFuncFPMath; noSignedZerosFPMath = options.noSignedZerosFPMath; unsafeFPMath = options.unsafeFPMath; + preferVectorWidth = options.preferVectorWidth; } FunctionAttrPass() {} void runOnOperation() override; @@ -102,6 +103,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!preferVectorWidth.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, preferVectorWidth)); LLVM_DEBUG(llvm::dbgs() << "=== End " DEBUG_TYPE " ===\n"); } diff --git 
a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90 new file mode 100644 index 0000000000000..d0f5fd28db826 --- /dev/null +++ b/flang/test/Driver/prefer-vector-width.f90 @@ -0,0 +1,16 @@ +! Test that -mprefer-vector-width works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! RUN: %flang_fc1 -mprefer-vector-width=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! RUN: %flang_fc1 -mprefer-vector-width=128 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-128 +! RUN: %flang_fc1 -mprefer-vector-width=256 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-256 +! RUN: not %flang_fc1 -mprefer-vector-width=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INVALID + +subroutine func +end subroutine func + +! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} } +! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } +! CHECK-128: attributes #0 = { "prefer-vector-width"="128" } +! CHECK-256: attributes #0 = { "prefer-vector-width"="256" } +! 
CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index 6fde45ce5c556..c0324d561b77b 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1893,6 +1893,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$frame_pointer, OptionalAttr:$target_cpu, OptionalAttr:$tune_cpu, + OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index b049064fbd31c..85417da798b22 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, funcOp.setTargetFeaturesAttr( LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width"); + attr.isStringAttribute()) + funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index 047e870b7dcd8..2b7f0b11613aa 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1540,6 +1540,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) { if (auto tuneCpu = func.getTuneCpu()) llvmFunc->addFnAttr("tune-cpu", *tuneCpu); + if (auto preferVectorWidth = func.getPreferVectorWidth()) + llvmFunc->addFnAttr("prefer-vector-width", *preferVectorWidth); + if (auto attr = func.getVscaleRange()) llvmFunc->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs( getLLVMContext(), 
attr->getMinRange().getInt(), ``````````
https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Sat May 24 13:47:23 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Sat, 24 May 2025 13:47:23 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <6832305b.170a0220.24dd03.28a8@mx.google.com> mcinally wrote: This is my first patch to Flang, so please be suspicious and gentle. In particular, I am not sure if `FlangOption, FC1Option` were the correct phases to pass this option to or not. https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Sat May 24 13:48:45 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 24 May 2025 13:48:45 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <683230ad.a70a0220.293560.0360@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command: ``````````bash git-clang-format --diff HEAD~1 HEAD --extensions cpp,h -- flang/include/flang/Frontend/CodeGenOptions.h flang/include/flang/Tools/CrossToolHelpers.h flang/lib/Frontend/CompilerInvocation.cpp flang/lib/Frontend/FrontendActions.cpp flang/lib/Optimizer/Passes/Pipelines.cpp flang/lib/Optimizer/Transforms/FunctionAttr.cpp mlir/lib/Target/LLVMIR/ModuleImport.cpp mlir/lib/Target/LLVMIR/ModuleTranslation.cpp ``````````
View the diff from clang-format here. ``````````diff diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 122eb56fa..918323d66 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -310,8 +310,8 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, opts.LLVMPassPlugins.push_back(a->getValue()); // -mprefer_vector_width option - if (const llvm::opt::Arg *a = - args.getLastArg(clang::driver::options::OPT_mprefer_vector_width_EQ)) { + if (const llvm::opt::Arg *a = args.getLastArg( + clang::driver::options::OPT_mprefer_vector_width_EQ)) { llvm::StringRef s = a->getValue(); unsigned Width; if (s == "none") ``````````
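For readers following the option handling that this formatting diff touches, here is a minimal standalone sketch of the same validation rule: accept `none`, accept a base-10 unsigned integer, reject everything else. It is an illustration only and not the Flang code. The helper name `parsePreferVectorWidth` is hypothetical, and `std::from_chars` stands in for `llvm::StringRef::getAsInteger` (which returns true on failure), so edge cases may differ slightly from the real driver.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper, not part of Flang: returns the string that would be
// stored as the "prefer-vector-width" value, or std::nullopt when the
// argument should be diagnosed as invalid.
std::optional<std::string> parsePreferVectorWidth(std::string_view s) {
  // "none" is passed through unchanged.
  if (s == "none")
    return std::string("none");

  unsigned width = 0;
  const char *first = s.data();
  const char *last = s.data() + s.size();
  auto [ptr, ec] = std::from_chars(first, last, width, 10);

  // Accept only if the whole string parsed as a base-10 unsigned integer.
  if (ec != std::errc() || ptr != last)
    return std::nullopt;

  // The original spelling is kept, mirroring opts.PreferVectorWidth = s.str().
  return std::string(s);
}
```

With this sketch, `none`, `128` and `256` are accepted, while `xxx` is rejected, which matches the cases exercised by the PR's `prefer-vector-width.f90` test.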
https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Sat May 24 13:53:43 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 24 May 2025 13:53:43 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <683231d7.170a0220.15e8d2.9f4a@mx.google.com> https://github.com/klausler commented: I'm not really qualified to review driver patches, but it looks okay as far as I can tell. https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Sat May 24 13:55:55 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Sat, 24 May 2025 13:55:55 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <6832325b.170a0220.35a7bb.ce86@mx.google.com> https://github.com/mcinally updated https://github.com/llvm/llvm-project/pull/141380 >From 9f8619cb54a3a11e4c90af7f5156141ddc59e4d4 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this option is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute. 
--- clang/include/clang/Driver/Options.td | 2 +- flang/include/flang/Frontend/CodeGenOptions.h | 3 +++ .../include/flang/Optimizer/Transforms/Passes.td | 4 ++++ flang/include/flang/Tools/CrossToolHelpers.h | 3 +++ flang/lib/Frontend/CompilerInvocation.cpp | 14 ++++++++++++++ flang/lib/Frontend/FrontendActions.cpp | 2 ++ flang/lib/Optimizer/Passes/Pipelines.cpp | 2 +- flang/lib/Optimizer/Transforms/FunctionAttr.cpp | 5 +++++ flang/test/Driver/prefer-vector-width.f90 | 16 ++++++++++++++++ mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td | 1 + mlir/lib/Target/LLVMIR/ModuleImport.cpp | 4 ++++ mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | 3 +++ 12 files changed, 57 insertions(+), 2 deletions(-) create mode 100644 flang/test/Driver/prefer-vector-width.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 22261621df092..b0b642796010b 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -5480,7 +5480,7 @@ def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group, " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">, MarshallingInfoStringVector>; def mprefer_vector_width_EQ : Joined<["-"], "mprefer-vector-width=">, Group, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Specifies preferred vector width for auto-vectorization. Defaults to 'none' which allows target specific decisions.">, MarshallingInfoString>; def mstack_protector_guard_EQ : Joined<["-"], "mstack-protector-guard=">, Group, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -53,6 +53,9 @@ class CodeGenOptions : public CodeGenOptionsBase { /// The paths to the pass plugins that were registered using -fpass-plugin. 
std::vector LLVMPassPlugins; + // The prefered vector width, if requested by -mprefer-vector-width. + std::string PreferVectorWidth; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index b251534e1a8f6..2e932d9ad4a26 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -418,6 +418,10 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"preferVectorWidth", "prefer-vector-width", "std::string", + /*default=*/"", + "Set the prefer-vector-width attribute on functions in the " + "module.">, Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"", "Set the tune-cpu attribute on functions in the module.">, ]; diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 118695bbe2626..058024a4a04c5 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; InstrumentFunctionExit = "__cyg_profile_func_exit"; @@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. 
bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for + ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. bool EnableOpenMP = false; ///< Enable OpenMP lowering. std::string InstrumentFunctionEntry = diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..918323d663610 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -309,6 +309,20 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ)) opts.LLVMPassPlugins.push_back(a->getValue()); + // -mprefer_vector_width option + if (const llvm::opt::Arg *a = args.getLastArg( + clang::driver::options::OPT_mprefer_vector_width_EQ)) { + llvm::StringRef s = a->getValue(); + unsigned Width; + if (s == "none") + opts.PreferVectorWidth = "none"; + else if (s.getAsInteger(10, Width)) + diags.Report(clang::diag::err_drv_invalid_value) + << a->getAsString(args) << a->getValue(); + else + opts.PreferVectorWidth = s.str(); + } + // -fembed-offload-object option for (auto *a : args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 38dfaadf1dff9..012d0fdfe645f 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -741,6 +741,8 @@ void CodeGenAction::generateLLVMIR() { config.VScaleMax = vsr->second; } + config.PreferVectorWidth = opts.PreferVectorWidth; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..5c1e1b9f77efb 100644 --- 
a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -354,7 +354,7 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - ""})); + config.PreferVectorWidth, ""})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 43e4c1a7af3cd..13f447cf738b4 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -36,6 +36,7 @@ class FunctionAttrPass : public fir::impl::FunctionAttrBase { approxFuncFPMath = options.approxFuncFPMath; noSignedZerosFPMath = options.noSignedZerosFPMath; unsafeFPMath = options.unsafeFPMath; + preferVectorWidth = options.preferVectorWidth; } FunctionAttrPass() {} void runOnOperation() override; @@ -102,6 +103,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!preferVectorWidth.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, preferVectorWidth)); LLVM_DEBUG(llvm::dbgs() << "=== End " DEBUG_TYPE " ===\n"); } diff --git a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90 new file mode 100644 index 0000000000000..d0f5fd28db826 --- /dev/null +++ b/flang/test/Driver/prefer-vector-width.f90 @@ -0,0 +1,16 @@ +! Test that -mprefer-vector-width works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! RUN: %flang_fc1 -mprefer-vector-width=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! 
RUN: %flang_fc1 -mprefer-vector-width=128 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-128 +! RUN: %flang_fc1 -mprefer-vector-width=256 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-256 +! RUN: not %flang_fc1 -mprefer-vector-width=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INVALID + +subroutine func +end subroutine func + +! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} } +! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } +! CHECK-128: attributes #0 = { "prefer-vector-width"="128" } +! CHECK-256: attributes #0 = { "prefer-vector-width"="256" } +! CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index 6fde45ce5c556..c0324d561b77b 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1893,6 +1893,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$frame_pointer, OptionalAttr:$target_cpu, OptionalAttr:$tune_cpu, + OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index b049064fbd31c..85417da798b22 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, funcOp.setTargetFeaturesAttr( LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width"); + attr.isStringAttribute()) + funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp 
b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index 047e870b7dcd8..2b7f0b11613aa 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1540,6 +1540,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) { if (auto tuneCpu = func.getTuneCpu()) llvmFunc->addFnAttr("tune-cpu", *tuneCpu); + if (auto preferVectorWidth = func.getPreferVectorWidth()) + llvmFunc->addFnAttr("prefer-vector-width", *preferVectorWidth); + if (auto attr = func.getVscaleRange()) llvmFunc->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs( getLLVMContext(), attr->getMinRange().getInt(), From flang-commits at lists.llvm.org Sat May 24 23:22:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 24 May 2025 23:22:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <6832b71d.170a0220.dc8a5.a416@mx.google.com> https://github.com/NexMing edited https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Sat May 24 23:28:51 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 24 May 2025 23:28:51 -0700 (PDT) Subject: [flang-commits] [flang] 953302e - [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (#140374) Message-ID: <6832b8a3.050a0220.34f28d.089e@mx.google.com> Author: MingYan Date: 2025-05-25T14:28:47+08:00 New Revision: 953302eb98569c7c107d1c8b820948466a404cbe URL: https://github.com/llvm/llvm-project/commit/953302eb98569c7c107d1c8b820948466a404cbe DIFF: https://github.com/llvm/llvm-project/commit/953302eb98569c7c107d1c8b820948466a404cbe.diff LOG: [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (#140374) This patch only supports the conversion from `fir.do_loop` to `scf.for`. This pass is still experimental, and future work will focus on gradually improving this conversion pass. 
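As a reading aid for the `fir.do_loop` to `scf.for` conversion described in the log message above, the loop normalization can be modeled in plain C++. This is a sketch under stated assumptions, not the pass itself: the function name `normalizedLoopValues` is hypothetical, the step is assumed to be nonzero, and ordinary 64-bit integers stand in for MLIR `index` values. It computes the trip count as `(high - low + step) / step` with signed division, iterates a canonical counter from 0 with step 1, and rebuilds the original induction value as `low + i * step`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of the rewrite: list the induction-variable values that
// an inclusive loop over [low, high] with the given (nonzero, possibly
// negative) step would visit, using a zero-based canonical counter.
std::vector<std::int64_t> normalizedLoopValues(std::int64_t low,
                                               std::int64_t high,
                                               std::int64_t step) {
  std::vector<std::int64_t> values;

  // Trip count of the inclusive loop; signed division truncates toward zero,
  // so a loop whose bounds are already crossed yields a count <= 0.
  const std::int64_t tripCount = (high - low + step) / step;

  // Canonical loop: counter runs over [0, tripCount) with step 1.
  for (std::int64_t i = 0; i < tripCount; ++i)
    values.push_back(low + i * step); // recover the original induction value

  return values;
}
```

For example, bounds 1 to 100 with step 1 and bounds 100 to 1 with step -1 both give a trip count of 100, which corresponds to the two simplest cases in the new `do-loop.fir` test.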
Co-authored-by: yanming Added: flang/lib/Optimizer/Transforms/FIRToSCF.cpp flang/test/Fir/FirToSCF/do-loop.fir Modified: flang/include/flang/Optimizer/Support/InitFIR.h flang/include/flang/Optimizer/Transforms/Passes.h flang/include/flang/Optimizer/Transforms/Passes.td flang/lib/Optimizer/Transforms/CMakeLists.txt Removed: ################################################################################ diff --git a/flang/include/flang/Optimizer/Support/InitFIR.h b/flang/include/flang/Optimizer/Support/InitFIR.h index 1868fbb201970..fa08f41f84adf 100644 --- a/flang/include/flang/Optimizer/Support/InitFIR.h +++ b/flang/include/flang/Optimizer/Support/InitFIR.h @@ -25,6 +25,7 @@ #include "mlir/Dialect/Func/Extensions/InlinerExtension.h" #include "mlir/Dialect/LLVMIR/NVVMDialect.h" #include "mlir/Dialect/OpenACC/Transforms/Passes.h" +#include "mlir/Dialect/SCF/Transforms/Passes.h" #include "mlir/InitAllDialects.h" #include "mlir/Pass/Pass.h" #include "mlir/Pass/PassRegistry.h" @@ -103,6 +104,7 @@ inline void registerMLIRPassesForFortranTools() { mlir::registerPrintOpStatsPass(); mlir::registerInlinerPass(); mlir::registerSCCPPass(); + mlir::registerSCFPasses(); mlir::affine::registerAffineScalarReplacementPass(); mlir::registerSymbolDCEPass(); mlir::registerLocationSnapshotPass(); diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h index 6dbabd523f88a..dc8a5b9141ad2 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.h +++ b/flang/include/flang/Optimizer/Transforms/Passes.h @@ -72,6 +72,7 @@ std::unique_ptr createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {}); std::unique_ptr createMemDataFlowOptPass(); std::unique_ptr createPromoteToAffinePass(); +std::unique_ptr createFIRToSCFPass(); std::unique_ptr createAddDebugInfoPass(fir::AddDebugInfoOptions options = {}); diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index 
b251534e1a8f6..8e7f12505c59c 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -76,6 +76,17 @@ def AffineDialectDemotion : Pass<"demote-affine", "::mlir::func::FuncOp"> { ]; } +def FIRToSCFPass : Pass<"fir-to-scf"> { + let summary = "Convert FIR structured control flow ops to SCF dialect."; + let description = [{ + Convert FIR structured control flow ops to SCF dialect. + }]; + let constructor = "::fir::createFIRToSCFPass()"; + let dependentDialects = [ + "fir::FIROpsDialect", "mlir::scf::SCFDialect" + ]; +} + def AnnotateConstantOperands : Pass<"annotate-constant"> { let summary = "Annotate constant operands to all FIR operations"; let description = [{ diff --git a/flang/lib/Optimizer/Transforms/CMakeLists.txt b/flang/lib/Optimizer/Transforms/CMakeLists.txt index 170b6e2cca225..846d6c64dbd04 100644 --- a/flang/lib/Optimizer/Transforms/CMakeLists.txt +++ b/flang/lib/Optimizer/Transforms/CMakeLists.txt @@ -16,6 +16,7 @@ add_flang_library(FIRTransforms CUFComputeSharedMemoryOffsetsAndSize.cpp ArrayValueCopy.cpp ExternalNameConversion.cpp + FIRToSCF.cpp MemoryUtils.cpp MemoryAllocation.cpp StackArrays.cpp diff --git a/flang/lib/Optimizer/Transforms/FIRToSCF.cpp b/flang/lib/Optimizer/Transforms/FIRToSCF.cpp new file mode 100644 index 0000000000000..f06ad2db90d55 --- /dev/null +++ b/flang/lib/Optimizer/Transforms/FIRToSCF.cpp @@ -0,0 +1,105 @@ +//===-- FIRToSCF.cpp ------------------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "mlir/Dialect/SCF/IR/SCF.h" +#include "mlir/Transforms/DialectConversion.h" + +namespace fir { +#define GEN_PASS_DEF_FIRTOSCFPASS +#include "flang/Optimizer/Transforms/Passes.h.inc" +} // namespace fir + +using namespace fir; +using namespace mlir; + +namespace { +class FIRToSCFPass : public fir::impl::FIRToSCFPassBase<FIRToSCFPass> { +public: + void runOnOperation() override; +}; + +struct DoLoopConversion : public OpRewritePattern<fir::DoLoopOp> { + using OpRewritePattern::OpRewritePattern; + + LogicalResult matchAndRewrite(fir::DoLoopOp doLoopOp, + PatternRewriter &rewriter) const override { + auto loc = doLoopOp.getLoc(); + bool hasFinalValue = doLoopOp.getFinalValue().has_value(); + + // Get loop values from the DoLoopOp + auto low = doLoopOp.getLowerBound(); + auto high = doLoopOp.getUpperBound(); + assert(low && high && "must be a Value"); + auto step = doLoopOp.getStep(); + llvm::SmallVector<Value> iterArgs; + if (hasFinalValue) + iterArgs.push_back(low); + iterArgs.append(doLoopOp.getIterOperands().begin(), + doLoopOp.getIterOperands().end()); + + // fir.do_loop iterates over the interval [%l, %u], and the step may be + // negative. But scf.for iterates over the interval [%l, %u), and the step + // must be a positive value. + // For easier conversion, we calculate the trip count and use a canonical + // induction variable. 
+ auto diff = rewriter.create<arith::SubIOp>(loc, high, low); + auto distance = rewriter.create<arith::AddIOp>(loc, diff, step); + auto tripCount = rewriter.create<arith::DivSIOp>(loc, distance, step); + auto zero = rewriter.create<arith::ConstantIndexOp>(loc, 0); + auto one = rewriter.create<arith::ConstantIndexOp>(loc, 1); + auto scfForOp = + rewriter.create<scf::ForOp>(loc, zero, tripCount, one, iterArgs); + + auto &loopOps = doLoopOp.getBody()->getOperations(); + auto resultOp = cast<fir::ResultOp>(doLoopOp.getBody()->getTerminator()); + auto results = resultOp.getOperands(); + Block *loweredBody = scfForOp.getBody(); + + loweredBody->getOperations().splice(loweredBody->begin(), loopOps, + loopOps.begin(), + std::prev(loopOps.end())); + + rewriter.setInsertionPointToStart(loweredBody); + Value iv = + rewriter.create<arith::MulIOp>(loc, scfForOp.getInductionVar(), step); + iv = rewriter.create<arith::AddIOp>(loc, low, iv); + + if (!results.empty()) { + rewriter.setInsertionPointToEnd(loweredBody); + rewriter.create<scf::YieldOp>(resultOp->getLoc(), results); + } + doLoopOp.getInductionVar().replaceAllUsesWith(iv); + rewriter.replaceAllUsesWith(doLoopOp.getRegionIterArgs(), + hasFinalValue + ? scfForOp.getRegionIterArgs().drop_front() + : scfForOp.getRegionIterArgs()); + + // Copy all the attributes from the old to new op. 
+ scfForOp->setAttrs(doLoopOp->getAttrs()); + rewriter.replaceOp(doLoopOp, scfForOp); + return success(); + } +}; +} // namespace + +void FIRToSCFPass::runOnOperation() { + RewritePatternSet patterns(&getContext()); + patterns.add<DoLoopConversion>(patterns.getContext()); + ConversionTarget target(getContext()); + target.addIllegalOp<fir::DoLoopOp>(); + target.markUnknownOpDynamicallyLegal([](Operation *) { return true; }); + if (failed( + applyPartialConversion(getOperation(), target, std::move(patterns)))) + signalPassFailure(); +} + +std::unique_ptr<Pass> fir::createFIRToSCFPass() { + return std::make_unique<FIRToSCFPass>(); +} diff --git a/flang/test/Fir/FirToSCF/do-loop.fir b/flang/test/Fir/FirToSCF/do-loop.fir new file mode 100644 index 0000000000000..812497c8d0c74 --- /dev/null +++ b/flang/test/Fir/FirToSCF/do-loop.fir @@ -0,0 +1,230 @@ +// RUN: fir-opt %s --fir-to-scf | FileCheck %s + +// CHECK-LABEL: func.func @simple_loop( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_2:.*]] = fir.shape %[[VAL_1]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_3:.*]] = arith.constant 1 : i32 +// CHECK: %[[VAL_4:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index +// CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_8:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_9:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] { +// CHECK: %[[VAL_10:.*]] = arith.muli %[[VAL_9]], %[[VAL_0]] : index +// CHECK: %[[VAL_11:.*]] = arith.addi %[[VAL_0]], %[[VAL_10]] : index +// CHECK: %[[VAL_12:.*]] = fir.array_coor %[[ARG0]](%[[VAL_2]]) %[[VAL_11]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_3]] to %[[VAL_12]] : !fir.ref +// CHECK: } +// CHECK: return +// CHECK: } +func.func @simple_loop(%arg0: !fir.ref>) { + %c1 = arith.constant 1 : 
index + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %c1_i32 = arith.constant 1 : i32 + fir.do_loop %arg1 = %c1 to %c100 step %c1 { + %1 = fir.array_coor %arg0(%0) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + fir.store %c1_i32 to %1 : !fir.ref + } + return +} + +// CHECK-LABEL: func.func @loop_with_negtive_step( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>) { +// CHECK: %[[VAL_0:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_2:.*]] = arith.constant -1 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_0]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = arith.constant 1 : i32 +// CHECK: %[[VAL_5:.*]] = arith.subi %[[VAL_1]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_2]] : index +// CHECK: %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_2]] : index +// CHECK: %[[VAL_8:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_9:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_10:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] { +// CHECK: %[[VAL_11:.*]] = arith.muli %[[VAL_10]], %[[VAL_2]] : index +// CHECK: %[[VAL_12:.*]] = arith.addi %[[VAL_0]], %[[VAL_11]] : index +// CHECK: %[[VAL_13:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_12]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_4]] to %[[VAL_13]] : !fir.ref +// CHECK: } +// CHECK: return +// CHECK: } +func.func @loop_with_negtive_step(%arg0: !fir.ref>) { + %c100 = arith.constant 100 : index + %c1 = arith.constant 1 : index + %c-1 = arith.constant -1 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %c1_i32 = arith.constant 1 : i32 + fir.do_loop %arg1 = %c100 to %c1 step %c-1 { + %1 = fir.array_coor %arg0(%0) %arg1 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + fir.store %c1_i32 to %1 : !fir.ref + } + return +} + +// CHECK-LABEL: func.func @loop_with_results( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: 
%[[ARG1:.*]]: !fir.ref) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 0 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_4:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_8:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_9:.*]] = scf.for %[[VAL_10:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] iter_args(%[[VAL_11:.*]] = %[[VAL_1]]) -> (i32) { +// CHECK: %[[VAL_12:.*]] = arith.muli %[[VAL_10]], %[[VAL_0]] : index +// CHECK: %[[VAL_13:.*]] = arith.addi %[[VAL_0]], %[[VAL_12]] : index +// CHECK: %[[VAL_14:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_13]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: %[[VAL_15:.*]] = fir.load %[[VAL_14]] : !fir.ref +// CHECK: %[[VAL_16:.*]] = arith.addi %[[VAL_11]], %[[VAL_15]] : i32 +// CHECK: scf.yield %[[VAL_16]] : i32 +// CHECK: } +// CHECK: fir.store %[[VAL_9]] to %[[ARG1]] : !fir.ref +// CHECK: return +// CHECK: } +func.func @loop_with_results(%arg0: !fir.ref>, %arg1: !fir.ref) { + %c1 = arith.constant 1 : index + %c0_i32 = arith.constant 0 : i32 + %c100 = arith.constant 100 : index + %0 = fir.shape %c100 : (index) -> !fir.shape<1> + %1 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %c0_i32) -> (i32) { + %2 = fir.array_coor %arg0(%0) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %3 = fir.load %2 : !fir.ref + %4 = arith.addi %arg3, %3 : i32 + fir.result %4 : i32 + } + fir.store %1 to %arg1 : !fir.ref + return +} + +// CHECK-LABEL: func.func @loop_with_final_value( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = 
arith.constant 0 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.alloca index +// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_5:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_0]] : index +// CHECK: %[[VAL_8:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_9:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_10:.*]]:2 = scf.for %[[VAL_11:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] iter_args(%[[VAL_12:.*]] = %[[VAL_0]], %[[VAL_13:.*]] = %[[VAL_1]]) -> (index, i32) { +// CHECK: %[[VAL_14:.*]] = arith.muli %[[VAL_11]], %[[VAL_0]] : index +// CHECK: %[[VAL_15:.*]] = arith.addi %[[VAL_0]], %[[VAL_14]] : index +// CHECK: %[[VAL_16:.*]] = fir.array_coor %[[ARG0]](%[[VAL_4]]) %[[VAL_15]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.load %[[VAL_16]] : !fir.ref +// CHECK: %[[VAL_18:.*]] = arith.addi %[[VAL_15]], %[[VAL_0]] overflow : index +// CHECK: %[[VAL_19:.*]] = arith.addi %[[VAL_13]], %[[VAL_17]] overflow : i32 +// CHECK: scf.yield %[[VAL_18]], %[[VAL_19]] : index, i32 +// CHECK: } +// CHECK: fir.store %[[VAL_20:.*]]#0 to %[[VAL_3]] : !fir.ref +// CHECK: fir.store %[[VAL_20]]#1 to %[[ARG1]] : !fir.ref +// CHECK: return +// CHECK: } +func.func @loop_with_final_value(%arg0: !fir.ref>, %arg1: !fir.ref) { + %c1 = arith.constant 1 : index + %c0_i32 = arith.constant 0 : i32 + %c100 = arith.constant 100 : index + %0 = fir.alloca index + %1 = fir.shape %c100 : (index) -> !fir.shape<1> + %2:2 = fir.do_loop %arg2 = %c1 to %c100 step %c1 iter_args(%arg3 = %c0_i32) -> (index, i32) { + %3 = fir.array_coor %arg0(%1) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %4 = fir.load %3 : !fir.ref + %5 = arith.addi %arg2, %c1 overflow : index + %6 = arith.addi %arg3, %4 overflow : i32 + fir.result %5, %6 : index, i32 + } + 
fir.store %2#0 to %0 : !fir.ref + fir.store %2#1 to %arg1 : !fir.ref + return +} + +// CHECK-LABEL: func.func @loop_with_attribute( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>, +// CHECK-SAME: %[[ARG1:.*]]: !fir.ref) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 0 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.alloca i32 +// CHECK: %[[VAL_4:.*]] = fir.shape %[[VAL_2]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_5:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.addi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.divsi %[[VAL_6]], %[[VAL_0]] : index +// CHECK: %[[VAL_8:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_9:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_10:.*]] = %[[VAL_8]] to %[[VAL_7]] step %[[VAL_9]] { +// CHECK: %[[VAL_11:.*]] = arith.muli %[[VAL_10]], %[[VAL_0]] : index +// CHECK: %[[VAL_12:.*]] = arith.addi %[[VAL_0]], %[[VAL_11]] : index +// CHECK: %[[VAL_13:.*]] = fir.array_coor %[[ARG0]](%[[VAL_4]]) %[[VAL_12]] : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref +// CHECK: %[[VAL_14:.*]] = fir.load %[[VAL_13]] : !fir.ref +// CHECK: %[[VAL_15:.*]] = fir.load %[[VAL_3]] : !fir.ref +// CHECK: %[[VAL_16:.*]] = arith.addi %[[VAL_15]], %[[VAL_14]] : i32 +// CHECK: fir.store %[[VAL_16]] to %[[VAL_3]] : !fir.ref +// CHECK: } {operandSegmentSizes = array, reduceAttrs = [#fir.reduce_attr]} +// CHECK: return +// CHECK: } +func.func @loop_with_attribute(%arg0: !fir.ref>, %arg1: !fir.ref) { + %c1 = arith.constant 1 : index + %c0_i32 = arith.constant 0 : i32 + %c100 = arith.constant 100 : index + %0 = fir.alloca i32 + %1 = fir.shape %c100 : (index) -> !fir.shape<1> + fir.do_loop %arg2 = %c1 to %c100 step %c1 reduce(#fir.reduce_attr -> %0 : !fir.ref) { + %2 = fir.array_coor %arg0(%1) %arg2 : (!fir.ref>, !fir.shape<1>, index) -> !fir.ref + %3 = fir.load %2 : !fir.ref + %4 = fir.load %0 : !fir.ref + %5 = arith.addi 
%4, %3 : i32 + fir.store %5 to %0 : !fir.ref + fir.result + } + return +} + +// CHECK-LABEL: func.func @nested_loop( +// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>) { +// CHECK: %[[VAL_0:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : i32 +// CHECK: %[[VAL_2:.*]] = arith.constant 100 : index +// CHECK: %[[VAL_3:.*]] = fir.shape %[[VAL_2]], %[[VAL_2]] : (index, index) -> !fir.shape<2> +// CHECK: %[[VAL_4:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_5:.*]] = arith.addi %[[VAL_4]], %[[VAL_0]] : index +// CHECK: %[[VAL_6:.*]] = arith.divsi %[[VAL_5]], %[[VAL_0]] : index +// CHECK: %[[VAL_7:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_8:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_9:.*]] = %[[VAL_7]] to %[[VAL_6]] step %[[VAL_8]] { +// CHECK: %[[VAL_10:.*]] = arith.muli %[[VAL_9]], %[[VAL_0]] : index +// CHECK: %[[VAL_11:.*]] = arith.addi %[[VAL_0]], %[[VAL_10]] : index +// CHECK: %[[VAL_12:.*]] = arith.subi %[[VAL_2]], %[[VAL_0]] : index +// CHECK: %[[VAL_13:.*]] = arith.addi %[[VAL_12]], %[[VAL_0]] : index +// CHECK: %[[VAL_14:.*]] = arith.divsi %[[VAL_13]], %[[VAL_0]] : index +// CHECK: %[[VAL_15:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_16:.*]] = arith.constant 1 : index +// CHECK: scf.for %[[VAL_17:.*]] = %[[VAL_15]] to %[[VAL_14]] step %[[VAL_16]] { +// CHECK: %[[VAL_18:.*]] = arith.muli %[[VAL_17]], %[[VAL_0]] : index +// CHECK: %[[VAL_19:.*]] = arith.addi %[[VAL_0]], %[[VAL_18]] : index +// CHECK: %[[VAL_20:.*]] = fir.array_coor %[[ARG0]](%[[VAL_3]]) %[[VAL_19]], %[[VAL_11]] : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref +// CHECK: fir.store %[[VAL_1]] to %[[VAL_20]] : !fir.ref +// CHECK: } +// CHECK: } +// CHECK: return +// CHECK: } +func.func @nested_loop(%arg0: !fir.ref>) { + %c1 = arith.constant 1 : index + %c1_i32 = arith.constant 1 : i32 + %c100 = arith.constant 100 : index + %0 = fir.shape %c100, %c100 : (index, index) -> !fir.shape<2> + fir.do_loop %arg1 = %c1 to %c100 step 
%c1 { + fir.do_loop %arg2 = %c1 to %c100 step %c1 { + %1 = fir.array_coor %arg0(%0) %arg2, %arg1 : (!fir.ref>, !fir.shape<2>, index, index) -> !fir.ref + fir.store %c1_i32 to %1 : !fir.ref + } + } + return +} From flang-commits at lists.llvm.org Sat May 24 23:28:53 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 24 May 2025 23:28:53 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Add FIR structured control flow ops to SCF dialect pass. (PR #140374) In-Reply-To: Message-ID: <6832b8a5.170a0220.fa567.eee2@mx.google.com> https://github.com/NexMing closed https://github.com/llvm/llvm-project/pull/140374 From flang-commits at lists.llvm.org Sun May 25 07:02:22 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Sun, 25 May 2025 07:02:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <683322ee.630a0220.236c61.4852@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 >From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:48:48 -0400 Subject: [PATCH 1/4] [flang] Ensure that the integer for Cray pointer is sized correctly The integer used for Cray pointers should have the size equivalent to platform's pointer size. 
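A minimal sketch of the sizing rule this commit message describes, assuming (as the later revision of the patch itself notes) that the INTEGER kind number currently happens to equal the integer's size in bytes. The helper name `integerKindForPointerBits` is hypothetical and is not part of flang; the patch takes the bit width from `targetTriple.getArchPointerBitWidth()`.

```cpp
#include <cstddef>

// Hypothetical helper, for illustration only (not a flang API).
// Maps a target's pointer width in bits to the INTEGER kind used for a
// Cray pointer, assuming the kind number equals the size in bytes.
constexpr std::size_t integerKindForPointerBits(unsigned pointerBitWidth) {
  return pointerBitWidth / 8;
}

// A 64-bit target gets INTEGER(8); a 32-bit target gets INTEGER(4).
static_assert(integerKindForPointerBits(64) == 8, "64-bit pointers");
static_assert(integerKindForPointerBits(32) == 4, "32-bit pointers");
```

Before this change the kind came from `subscriptIntegerKind()`, which is a default-kind setting rather than a property of the target's pointers, so it need not match the pointer width.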
modified: flang/include/flang/Evaluate/target.h modified: flang/include/flang/Tools/TargetSetup.h modified: flang/lib/Semantics/resolve-names.cpp --- flang/include/flang/Evaluate/target.h | 6 ++++++ flang/include/flang/Tools/TargetSetup.h | 3 +++ flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..281e42d5285fb 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t pointerSize() { return pointerSize_; } + void set_pointerSize(std::size_t pointerSize) { + pointerSize_ = pointerSize; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. 
} diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 30477b842ae64fb1c94f520136956cc4685648a7 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:59:20 -0400 Subject: [PATCH 2/4] clang-format --- flang/include/flang/Evaluate/target.h | 4 +--- flang/include/flang/Tools/TargetSetup.h | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d5285fb..e8b9fedc38f48 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ class TargetCharacteristics { const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e21fb4..24ab65f740ec6 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + 
targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. >From e1f91c83680d72fc1463a8db97a77298141286d9 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:41:48 -0400 Subject: [PATCH 3/4] Renaming to integerKindForPointer --- flang/include/flang/Evaluate/target.h | 8 +++++--- flang/include/flang/Tools/TargetSetup.h | 3 ++- flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index e8b9fedc38f48..7b38db2db1956 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,8 +131,10 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } - std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } + std::size_t integerKindForPointer() { return integerKindForPointer_; } + void set_integerKindForPointer(std::size_t newKind) { + integerKindForPointer_ = newKind; + } private: static constexpr int maxKind{common::maxKind}; @@ -159,7 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; - std::size_t pointerSize_{8 /* bytes */}; + std::size_t integerKindForPointer_{8}; /* For 64 bit pointer */ }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index 24ab65f740ec6..002e82aa72484 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,7 +94,8 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); - 
targetCharacteristics.set_pointerSize( + // Currently the integer kind happens to be the same as the byte size + targetCharacteristics.set_integerKindForPointer( targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 88114bfe84af3..5fd5ea8f9bc5c 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; + TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 940ff38d0abba9d1b0ab415e42474629284962d8 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:51:53 -0400 Subject: [PATCH 4/4] clang-format --- flang/lib/Semantics/resolve-names.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 5fd5ea8f9bc5c..a8dbf61c8fd68 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6689,8 +6689,8 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { "'%s' cannot be a Cray pointer as it is already a Cray pointee"_err_en_US); } pointer->set(Symbol::Flag::CrayPointer); - const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; + const DeclTypeSpec &pointerType{MakeNumericType(TypeCategory::Integer, + context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); From flang-commits at 
lists.llvm.org Sun May 25 14:26:23 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Sun, 25 May 2025 14:26:23 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <68338aff.170a0220.e007e.6837@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822
From flang-commits at lists.llvm.org Sun May 25 23:21:29 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sun, 25 May 2025 23:21:29 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <68340869.170a0220.13f9c3.0500@mx.google.com> https://github.com/fanju110 updated https://github.com/llvm/llvm-project/pull/136098 >From 9494c9752400e4708dbc8b6a5ca4993ea9565e95 Mon Sep 17 00:00:00 2001 From: fanyikang Date: Thu, 17 Apr 2025 15:17:07 +0800 Subject: [PATCH 01/13] Add support for IR PGO (-fprofile-generate/-fprofile-use=/file) This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags: -fprofile-generate for instrumentation-based profile generation -fprofile-use=/file for profile-guided optimization Co-Authored-By: ict-ql <168183727+ict-ql at users.noreply.github.com> --- clang/include/clang/Driver/Options.td | 4 +- clang/lib/Driver/ToolChains/Flang.cpp | 8 +++ .../include/flang/Frontend/CodeGenOptions.def | 5 ++ flang/include/flang/Frontend/CodeGenOptions.h | 49 +++++++++++++++++ flang/lib/Frontend/CompilerInvocation.cpp | 12 +++++ flang/lib/Frontend/FrontendActions.cpp | 54 +++++++++++++++++++ flang/test/Driver/flang-f-opts.f90 | 5 ++ .../Inputs/gcc-flag-compatibility_IR.proftext | 19 +++++++ .../gcc-flag-compatibility_IR_entry.proftext | 14 +++++ 
flang/test/Profile/gcc-flag-compatibility.f90 | 39 ++++++++++++++ 10 files changed, 207 insertions(+), 2 deletions(-) create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext create mode 100644 flang/test/Profile/gcc-flag-compatibility.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index affc076a876ad..0b0dbc467c1e0 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1756,7 +1756,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFFFFFE">; def fprofile_generate : Flag<["-"], "fprofile-generate">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">, Group, Visibility<[ClangOption, CLOption]>, @@ -1773,7 +1773,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group, Visibility<[ClangOption, CLOption]>, Alias; def fprofile_use_EQ : Joined<["-"], "fprofile-use=">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, MetaVarName<"">, HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. 
Otherwise, it reads from file .">; def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">, diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index a8b4688aed09c..fcdbe8a6aba5a 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -882,6 +882,14 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); + + if (Args.hasArg(options::OPT_fprofile_generate)){ + CmdArgs.push_back("-fprofile-generate"); + } + if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) { + CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue())); + } + // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ, diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 57830bf51a1b3..4dec86cd8f51b 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,6 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Whether emit extra debug info for sample pgo profile collection. +CODEGENOPT(DebugInfoForProfiling, 1, 0) +CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. 
 CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the
diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h
index 2b4e823b3fef4..e052250f97e75 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.h
+++ b/flang/include/flang/Frontend/CodeGenOptions.h
@@ -148,6 +148,55 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// OpenMP is enabled.
   using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind;
+  enum ProfileInstrKind {
+    ProfileNone,       // Profile instrumentation is turned off.
+    ProfileClangInstr, // Clang instrumentation to generate execution counts
+                       // to use with PGO.
+    ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
+    ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
+  };
+
+
+  /// Name of the profile file to use as output for -fprofile-instr-generate,
+  /// -fprofile-generate, and -fcs-profile-generate.
+  std::string InstrProfileOutput;
+
+  /// Name of the profile file to use as input for -fmemory-profile-use.
+  std::string MemoryProfileUsePath;
+
+  unsigned int DebugInfoForProfiling;
+
+  unsigned int AtomicProfileUpdate;
+
+  /// Name of the profile file to use as input for -fprofile-instr-use
+  std::string ProfileInstrumentUsePath;
+
+  /// Name of the profile remapping file to apply to the profile data supplied
+  /// by -fprofile-sample-use or -fprofile-instr-use.
+  std::string ProfileRemappingFile;
+
+  /// Check if Clang profile instrumenation is on.
+  bool hasProfileClangInstr() const {
+    return getProfileInstr() == ProfileClangInstr;
+  }
+
+  /// Check if IR level profile instrumentation is on.
+  bool hasProfileIRInstr() const {
+    return getProfileInstr() == ProfileIRInstr;
+  }
+
+  /// Check if CS IR level profile instrumentation is on.
+  bool hasProfileCSIRInstr() const {
+    return getProfileInstr() == ProfileCSIRInstr;
+  }
+  /// Check if IR level profile use is on.
+  bool hasProfileIRUse() const {
+    return getProfileUse() == ProfileIRInstr ||
+           getProfileUse() == ProfileCSIRInstr;
+  }
+  /// Check if CSIR profile use is on.
+  bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }
+
   // Define accessors/mutators for code generation options of enumeration type.
 #define CODEGENOPT(Name, Bits, Default)
 #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 6f87a18d69c3d..f013fce2f3cfc 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -27,6 +27,7 @@
 #include "clang/Driver/DriverDiagnostic.h"
 #include "clang/Driver/OptionUtils.h"
 #include "clang/Driver/Options.h"
+#include "clang/Driver/Driver.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/Frontend/Debug/Options.h"
@@ -431,6 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
       opts.IsPIE = 1;
   }
+  if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) {
+    opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+    opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling);
+    opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ);
+  }
+
+  if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) {
+    opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+    opts.ProfileInstrumentUsePath = A->getValue();
+  }
+
   // -mcmodel option.
   if (const llvm::opt::Arg *a =
           args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) {
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index c1f47b12abee2..68880bdeecf8d 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -63,11 +63,14 @@
 #include "llvm/Support/Path.h"
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/ToolOutputFile.h"
+#include "llvm/Support/PGOOptions.h"
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/TargetParser/RISCVISAInfo.h"
 #include "llvm/TargetParser/RISCVTargetParser.h"
 #include "llvm/Transforms/IPO/Internalize.h"
 #include "llvm/Transforms/Utils/ModuleUtils.h"
+#include "llvm/Transforms/Instrumentation/InstrProfiling.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
 #include
 #include
@@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci,
 // Custom BeginSourceFileAction
 //===----------------------------------------------------------------------===//
+
+static llvm::cl::opt<llvm::PGOOptions::ColdFuncOpt> ClPGOColdFuncAttr(
+    "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden,
+    llvm::cl::desc(
+        "Function attribute to apply to cold functions as determined by PGO"),
+    llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default",
+                                "Default (no attribute)"),
+                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize",
+                                "Mark cold functions with optsize."),
+                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize",
+                                "Mark cold functions with minsize."),
+                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone",
+                                "Mark cold functions with optnone.")));
+
 bool PrescanAction::beginSourceFileAction() { return runPrescan(); }
 bool PrescanAndParseAction::beginSourceFileAction() {
@@ -892,6 +909,20 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
   delete tlii;
 }
+
+// Default filename used for profile generation.
+namespace llvm {
+  extern llvm::cl::opt<bool> DebugInfoCorrelate;
+  extern llvm::cl::opt<llvm::InstrProfCorrelator::ProfCorrelatorKind> ProfileCorrelate;
+
+
+std::string getDefaultProfileGenName() {
+  return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE
+             ? "default_%m.proflite"
+             : "default_%m.profraw";
+}
+}
+
 void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   CompilerInstance &ci = getInstance();
   const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts();
@@ -909,6 +940,29 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   llvm::PassInstrumentationCallbacks pic;
   llvm::PipelineTuningOptions pto;
   std::optional<llvm::PGOOptions> pgoOpt;
+
+  if (opts.hasProfileIRInstr()){
+    // // -fprofile-generate.
+    pgoOpt = llvm::PGOOptions(
+        opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
+                                        : opts.InstrProfileOutput,
+        "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr,
+        llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr,
+        opts.DebugInfoForProfiling,
+        /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate);
+  }
+  else if (opts.hasProfileIRUse()) {
+    llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> VFS = llvm::vfs::getRealFileSystem();
+    // -fprofile-use.
+    auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse
+                                             : llvm::PGOOptions::NoCSAction;
+    pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "",
+                              opts.ProfileRemappingFile,
+                              opts.MemoryProfileUsePath, VFS,
+                              llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr,
+                              opts.DebugInfoForProfiling);
+  }
+
   llvm::StandardInstrumentations si(llvmModule->getContext(),
                                     opts.DebugPassManager);
   si.registerCallbacks(pic, &mam);
diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90
index 4493a519e2010..b972b9b7b2a59 100644
--- a/flang/test/Driver/flang-f-opts.f90
+++ b/flang/test/Driver/flang-f-opts.f90
@@ -8,3 +8,8 @@
 ! CHECK-LABEL: "-fc1"
 ! CHECK: -ffp-contract=off
 ! CHECK: -O3
+
+! RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s
+! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate"
+! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s
+! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}"
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
new file mode 100644
index 0000000000000..6a6df8b1d4d5b
--- /dev/null
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
@@ -0,0 +1,19 @@
+# IR level Instrumentation Flag
+:ir
+_QQmain
+# Func Hash:
+146835646621254984
+# Num Counters:
+2
+# Counter Values:
+100
+1
+
+main
+# Func Hash:
+742261418966908927
+# Num Counters:
+1
+# Counter Values:
+1
+
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
new file mode 100644
index 0000000000000..9a46140286673
--- /dev/null
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
@@ -0,0 +1,14 @@
+# IR level Instrumentation Flag
+:ir
+:entry_first
+_QQmain
+# Func Hash:
+146835646621254984
+# Num Counters:
+2
+# Counter Values:
+100
+1
+
+
+
diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90
new file mode 100644
index 0000000000000..0124bc79b87ef
--- /dev/null
+++ b/flang/test/Profile/gcc-flag-compatibility.f90
@@ -0,0 +1,39 @@
+! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two
+! flags behave similarly to their GCC counterparts:
+!
+!   -fprofile-generate    Generates the profile file ./default.profraw
+!   -fprofile-use=/file   Uses the profile file /file
+
+! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto
+! RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s
+! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section
+! PROFILE-GEN: @__profd_{{_?}}main =
+
+
+
+! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof
+! This uses LLVM IR format profile.
+! RUN: rm -rf %t.dir
+! RUN: mkdir -p %t.dir/some/path
+! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof
+! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s
+!
+
+
+
+! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof
+! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s
+! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1}
+! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100}
+
+
+program main
+  implicit none
+  integer :: i
+  integer :: X = 0
+
+  do i = 0, 99
+     X = X + i
+  end do
+
+end program main

>From b897c7aa1e21dfe46b4acf709f3ea38d9021c164 Mon Sep 17 00:00:00 2001
From: FYK
Date: Wed, 23 Apr 2025 09:56:14 +0800
Subject: [PATCH 02/13] Update flang/lib/Frontend/FrontendActions.cpp

Remove redundant comment symbols

Co-authored-by: Tom Eccles
---
 flang/lib/Frontend/FrontendActions.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 68880bdeecf8d..cd13a6aca92cd 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -942,7 +942,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   std::optional<llvm::PGOOptions> pgoOpt;
   if (opts.hasProfileIRInstr()){
-    // // -fprofile-generate.
+    // -fprofile-generate.
     pgoOpt = llvm::PGOOptions(
         opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
                                         : opts.InstrProfileOutput,

>From bc5adfcc4ac3456f587bedd48c1a8892d27e53ae Mon Sep 17 00:00:00 2001
From: fanju110
Date: Fri, 25 Apr 2025 11:48:30 +0800
Subject: [PATCH 03/13] format code with clang-format

---
 flang/include/flang/Frontend/CodeGenOptions.h | 17 ++--
 flang/lib/Frontend/CompilerInvocation.cpp     | 15 ++--
 flang/lib/Frontend/FrontendActions.cpp        | 83 +++++++++----------
 .../Inputs/gcc-flag-compatibility_IR.proftext |  3 +-
 .../gcc-flag-compatibility_IR_entry.proftext  |  5 +-
 5 files changed, 59 insertions(+), 64 deletions(-)

diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h
index e052250f97e75..c9577862df832 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.h
+++ b/flang/include/flang/Frontend/CodeGenOptions.h
@@ -156,7 +156,6 @@ class CodeGenOptions : public CodeGenOptionsBase {
     ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
   };
-
   /// Name of the profile file to use as output for -fprofile-instr-generate,
   /// -fprofile-generate, and -fcs-profile-generate.
   std::string InstrProfileOutput;
@@ -171,7 +170,7 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// Name of the profile file to use as input for -fprofile-instr-use
   std::string ProfileInstrumentUsePath;
-  /// Name of the profile remapping file to apply to the profile data supplied
+  /// Name of the profile remapping file to apply to the profile data supplied
   /// by -fprofile-sample-use or -fprofile-instr-use.
   std::string ProfileRemappingFile;
@@ -181,19 +180,17 @@ class CodeGenOptions : public CodeGenOptionsBase {
   }
   /// Check if IR level profile instrumentation is on.
-  bool hasProfileIRInstr() const {
-    return getProfileInstr() == ProfileIRInstr;
-  }
+  bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; }
   /// Check if CS IR level profile instrumentation is on.
bool hasProfileCSIRInstr() const { return getProfileInstr() == ProfileCSIRInstr; } - /// Check if IR level profile use is on. - bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; - } + /// Check if IR level profile use is on. + bool hasProfileIRUse() const { + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; + } /// Check if CSIR profile use is on. bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index f013fce2f3cfc..b28c2c0047579 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -27,7 +27,6 @@ #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" -#include "clang/Driver/Driver.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" @@ -433,13 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr( + Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.DebugInfoForProfiling = + args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); + opts.AtomicProfileUpdate = + args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); } - + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse( + 
Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); } diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index cd13a6aca92cd..8d1ab670e4db4 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -56,21 +56,21 @@ #include "llvm/Passes/PassBuilder.h" #include "llvm/Passes/PassPlugin.h" #include "llvm/Passes/StandardInstrumentations.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/Error.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/FileSystem.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" -#include "llvm/Support/PGOOptions.h" #include "llvm/Target/TargetMachine.h" #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" -#include "llvm/Transforms/Utils/ModuleUtils.h" #include "llvm/Transforms/Instrumentation/InstrProfiling.h" -#include "llvm/ProfileData/InstrProfCorrelator.h" +#include "llvm/Transforms/Utils/ModuleUtils.h" #include #include @@ -133,19 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// - static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions 
with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); + "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), + llvm::cl::Hidden, + llvm::cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, + "default", "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, + "optsize", "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, + "minsize", "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, + "optnone", + "Mark cold functions with optnone."))); bool PrescanAction::beginSourceFileAction() { return runPrescan(); } @@ -909,19 +910,18 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } - // Default filename used for profile generation. namespace llvm { - extern llvm::cl::opt DebugInfoCorrelate; - extern llvm::cl::opt ProfileCorrelate; - +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt ProfileCorrelate; std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE + return DebugInfoCorrelate || + ProfileCorrelate != llvm::InstrProfCorrelator::NONE ? "default_%m.proflite" : "default_%m.profraw"; } -} +} // namespace llvm void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); @@ -940,29 +940,28 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; - - if (opts.hasProfileIRInstr()){ + + if (opts.hasProfileIRInstr()) { // -fprofile-generate. pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? 
llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); - } - else if (opts.hasProfileIRUse()) { - llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); - // -fprofile-use. - auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse - : llvm::PGOOptions::NoCSAction; - pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "", - opts.ProfileRemappingFile, - opts.MemoryProfileUsePath, VFS, - llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling); - } - + opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + } else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = + llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? 
llvm::PGOOptions::CSIRUse + : llvm::PGOOptions::NoCSAction; + pgoOpt = llvm::PGOOptions( + opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, + opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, + ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + } + llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext index 6a6df8b1d4d5b..2650fb5ebfd35 100644 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext @@ -15,5 +15,4 @@ main # Num Counters: 1 # Counter Values: -1 - +1 \ No newline at end of file diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext index 9a46140286673..c4a2a26557e80 100644 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext @@ -8,7 +8,4 @@ _QQmain 2 # Counter Values: 100 -1 - - - +1 \ No newline at end of file >From d64d9d95fb97d6cfa4bf4192bfb20f5c8d6b3bc3 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 25 Apr 2025 11:53:47 +0800 Subject: [PATCH 04/13] simplify push_back usage --- clang/lib/Driver/ToolChains/Flang.cpp | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index fcdbe8a6aba5a..9c7e87c455e44 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -882,13 +882,9 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); - - if (Args.hasArg(options::OPT_fprofile_generate)){ - 
CmdArgs.push_back("-fprofile-generate"); - } - if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) { - CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue())); - } + // recognise options: fprofile-generate -fprofile-use= + Args.addAllArgs( + CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ}); // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. >From 22475a85d24b22fb44ca5a5ce26542b556bae280 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Mon, 28 Apr 2025 20:33:54 +0800 Subject: [PATCH 05/13] Port the getDefaultProfileGenName definition and the ProfileInstrKind definition from clang to the llvm namespace to allow flang to reuse these code. --- clang/include/clang/Basic/CodeGenOptions.def | 4 +- clang/include/clang/Basic/CodeGenOptions.h | 22 +++++--- clang/include/clang/Basic/ProfileList.h | 9 ++-- clang/include/clang/CodeGen/BackendUtil.h | 3 ++ clang/lib/Basic/ProfileList.cpp | 20 ++++---- clang/lib/CodeGen/BackendUtil.cpp | 50 ++++++------------- clang/lib/CodeGen/CodeGenAction.cpp | 4 +- clang/lib/CodeGen/CodeGenFunction.cpp | 3 +- clang/lib/CodeGen/CodeGenModule.cpp | 2 +- clang/lib/Frontend/CompilerInvocation.cpp | 6 +-- .../include/flang/Frontend/CodeGenOptions.def | 8 +-- flang/include/flang/Frontend/CodeGenOptions.h | 28 ++++------- .../include/flang/Frontend/FrontendActions.h | 5 ++ flang/lib/Frontend/CompilerInvocation.cpp | 11 ++-- flang/lib/Frontend/FrontendActions.cpp | 28 +++-------- .../llvm/Frontend/Driver/CodeGenOptions.h | 15 +++++- llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 25 ++++++++++ 17 files changed, 123 insertions(+), 120 deletions(-) diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index a436c0ec98d5b..3dbadc85f23cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,9 +223,9 @@ 
AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index e39a73bdb13ac..e4a74cd2c12cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; + return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if any form of instrumentation is on. 
- bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } + bool hasProfileInstr() const { + return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; + } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == ProfileClangInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + } /// Check if type and variable info should be emitted. bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff 
--git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h index 92e0d13bf25b6..d9abf7bf962d2 100644 --- a/clang/include/clang/CodeGen/BackendUtil.h +++ b/clang/include/clang/CodeGen/BackendUtil.h @@ -8,6 +8,8 @@ #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include "clang/Basic/LLVM.h" #include "llvm/IR/ModuleSummaryIndex.h" @@ -19,6 +21,7 @@ template class Expected; template class IntrusiveRefCntPtr; class Module; class MemoryBufferRef; +extern cl::opt ClPGOColdFuncAttr; namespace vfs { class FileSystem; } // namespace vfs diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0c5aa6de3f3c0 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if 
(SCL->inSection(Section, "default", "allow")) @@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 7557cb8408921..963ed321b2cb9 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -103,38 +103,16 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP( "sanitizer-early-opt-ep", cl::Optional, cl::desc("Insert sanitizers on OptimizerEarlyEP.")); -// Experiment to mark cold functions as optsize/minsize/optnone. -// TODO: remove once this is exposed as a proper driver flag. 
-static cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, - cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions with minsize."), - clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); - extern cl::opt ProfileCorrelate; } // namespace llvm namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? "default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -834,12 +812,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. 
- PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse @@ -847,14 +825,14 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) @@ -863,15 +841,15 @@ void EmitAssemblyHelper::RunOptimizationPipeline( ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling - PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", 
nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4d29ceace646f..10f0a12773498 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 0154799498f5e..2e8e90493c370 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, 
// If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index cfc5c069b0849..ece7e5cad774c 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 4dec86cd8f51b..68396c4c40711 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Choose profile instrumenation kind or no instrumentation. 
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Whether emit extra debug info for sample pgo profile collection. -CODEGENOPT(DebugInfoForProfiling, 1, 0) -CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index c9577862df832..6a989f0c0775e 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. - }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fmemory-profile-use. 
std::string MemoryProfileUsePath; - unsigned int DebugInfoForProfiling; - - unsigned int AtomicProfileUpdate; - /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; @@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == llvm::driver::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h index f9a45bd6c0a56..9ba74a9dad9be 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -20,8 +20,13 @@ #include "mlir/IR/OwningOpRef.h" #include "llvm/ADT/StringRef.h" #include "llvm/IR/Module.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include +namespace llvm { +extern cl::opt ClPGOColdFuncAttr; +} // namespace llvm namespace Fortran::frontend { //===----------------------------------------------------------------------===// diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b28c2c0047579..43c8f4d596903 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = - args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = - args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); } if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); 
} diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 8d1ab670e4db4..c758aa18fbb8e 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -28,6 +28,7 @@ #include "flang/Semantics/unparse-with-symbols.h" #include "flang/Support/default-kinds.h" #include "flang/Tools/CrossToolHelpers.h" +#include "clang/CodeGen/BackendUtil.h" #include "mlir/IR/Dialect.h" #include "mlir/Parser/Parser.h" @@ -133,21 +134,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// -static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), - llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, - "default", "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, - "optsize", "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, - "minsize", "Mark cold functions with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, - "optnone", - "Mark cold functions with optnone."))); - bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -944,12 +930,12 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (opts.hasProfileIRInstr()) { // -fprofile-generate. pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, + opts.InstrProfileOutput.empty() + ? 
llvm::driver::getDefaultProfileGenName() + : opts.InstrProfileOutput, "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + llvm::PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, false, + /*PseudoProbeForProfiling=*/false, false); } else if (opts.hasProfileIRUse()) { llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); @@ -959,7 +945,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { pgoOpt = llvm::PGOOptions( opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, - ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + llvm::ClPGOColdFuncAttr, false); } llvm::StandardInstrumentations si(llvmModule->getContext(), diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index c51476e9ad3fe..6188c20cb29cf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,9 +13,14 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H +#include "llvm/ProfileData/InstrProfCorrelator.h" +#include namespace llvm { class Triple; class TargetLibraryInfoImpl; +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt + ProfileCorrelate; } // namespace llvm namespace llvm::driver { @@ -35,7 +40,15 @@ enum class VectorLibrary { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); - +enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
+}; +// Default filename used for profile generation. +std::string getDefaultProfileGenName(); } // end namespace llvm::driver #endif diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index ed7c57a930aca..818dcd3752437 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -8,7 +8,26 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/TargetParser/Triple.h" +namespace llvm { +// Experiment to mark cold functions as optsize/minsize/optnone. +// TODO: remove once this is exposed as a proper driver flag. +cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", cl::init(llvm::PGOOptions::ColdFuncOpt::Default), + cl::Hidden, + cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); +} // namespace llvm namespace llvm::driver { @@ -56,4 +75,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, return TLII; } +std::string getDefaultProfileGenName() { + return llvm::DebugInfoCorrelate || + llvm::ProfileCorrelate != InstrProfCorrelator::NONE + ? 
"default_%m.proflite" + : "default_%m.profraw"; +} } // namespace llvm::driver >From e53e689985088bbcdc253950a2ecc715592b5b3a Mon Sep 17 00:00:00 2001 From: fanju110 Date: Mon, 28 Apr 2025 21:49:36 +0800 Subject: [PATCH 06/13] Remove redundant function definitions --- flang/lib/Frontend/FrontendActions.cpp | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c758aa18fbb8e..cdd2853bcd201 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -896,18 +896,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } -// Default filename used for profile generation. -namespace llvm { -extern llvm::cl::opt DebugInfoCorrelate; -extern llvm::cl::opt ProfileCorrelate; - -std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || - ProfileCorrelate != llvm::InstrProfCorrelator::NONE - ? "default_%m.proflite" - : "default_%m.profraw"; -} -} // namespace llvm void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); >From 248175453354fecd078f5553576d16ce810e7808 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:12:32 +0800 Subject: [PATCH 07/13] Move the interface to the cpp that uses it --- clang/include/clang/Basic/CodeGenOptions.def | 4 +- clang/include/clang/Basic/CodeGenOptions.h | 22 +++++---- clang/include/clang/Basic/ProfileList.h | 9 ++-- clang/lib/Basic/ProfileList.cpp | 20 ++++----- clang/lib/CodeGen/BackendUtil.cpp | 37 +++++++-------- clang/lib/CodeGen/CodeGenAction.cpp | 4 +- clang/lib/CodeGen/CodeGenFunction.cpp | 3 +- clang/lib/CodeGen/CodeGenModule.cpp | 2 +- clang/lib/Frontend/CompilerInvocation.cpp | 6 +-- .../include/flang/Frontend/CodeGenOptions.def | 8 ++-- flang/include/flang/Frontend/CodeGenOptions.h | 28 +++++------- flang/lib/Frontend/CompilerInvocation.cpp | 11 ++--- 
 flang/lib/Frontend/FrontendActions.cpp | 45 ++++---------------
 .../llvm/Frontend/Driver/CodeGenOptions.h | 10 +++++
 llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 12 +++++
 15 files changed, 101 insertions(+), 120 deletions(-)

diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index a436c0ec98d5b..3dbadc85f23cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -223,9 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is
 CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling
 /// Choose profile instrumenation kind or no instrumentation.
-ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Choose profile kind for PGO use compilation.
-ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Partition functions into N groups and select only functions in group i to be
 /// instrumented. Selected group numbers can be 0 to N-1 inclusive.
 VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1)
diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h
index e39a73bdb13ac..e4a74cd2c12cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase {

   /// Check if Clang profile instrumenation is on.
   bool hasProfileClangInstr() const {
-    return getProfileInstr() == ProfileClangInstr;
+    return getProfileInstr() ==
+           llvm::driver::ProfileInstrKind::ProfileClangInstr;
   }

   /// Check if IR level profile instrumentation is on.
   bool hasProfileIRInstr() const {
-    return getProfileInstr() == ProfileIRInstr;
+    return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr;
   }

   /// Check if CS IR level profile instrumentation is on.
   bool hasProfileCSIRInstr() const {
-    return getProfileInstr() == ProfileCSIRInstr;
+    return getProfileInstr() ==
+           llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
   }

   /// Check if any form of instrumentation is on.
-  bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; }
+  bool hasProfileInstr() const {
+    return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone;
+  }

   /// Check if Clang profile use is on.
   bool hasProfileClangUse() const {
-    return getProfileUse() == ProfileClangInstr;
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr;
   }

   /// Check if IR level profile use is on.
   bool hasProfileIRUse() const {
-    return getProfileUse() == ProfileIRInstr ||
-           getProfileUse() == ProfileCSIRInstr;
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr ||
+           getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
   }

   /// Check if CSIR profile use is on.
-  bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }
+  bool hasProfileCSIRUse() const {
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
+  }

   /// Check if type and variable info should be emitted.
bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0c5aa6de3f3c0 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + 
llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if (SCL->inSection(Section, "default", "allow")) @@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 7557cb8408921..592e3bbbcc1cf 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -124,17 +124,10 @@ namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? 
"default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -834,12 +827,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. - PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? 
PGOOptions::CSIRUse @@ -847,31 +840,31 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); + llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling - PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. 
if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4d29ceace646f..10f0a12773498 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 0154799498f5e..2e8e90493c370 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, // If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. 
if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index cfc5c069b0849..ece7e5cad774c 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 4dec86cd8f51b..68396c4c40711 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Choose profile instrumenation kind or no instrumentation. +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Whether emit extra debug info for sample pgo profile collection. 
-CODEGENOPT(DebugInfoForProfiling, 1, 0) -CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index c9577862df832..6a989f0c0775e 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. - }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fmemory-profile-use. std::string MemoryProfileUsePath; - unsigned int DebugInfoForProfiling; - - unsigned int AtomicProfileUpdate; - /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; @@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == llvm::driver::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. 
- bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } // Define accessors/mutators for code generation options of enumeration type. #define CODEGENOPT(Name, Bits, Default) diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b28c2c0047579..43c8f4d596903 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = - args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = - args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + 
opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); } if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); } diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 8d1ab670e4db4..a650f54620543 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -133,21 +133,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// -static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), - llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, - "default", "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, - "optsize", "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, - "minsize", "Mark cold functions with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, - "optnone", - "Mark cold functions with optnone."))); - bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -910,19 +895,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } -// Default filename used for profile generation. -namespace llvm { -extern llvm::cl::opt DebugInfoCorrelate; -extern llvm::cl::opt ProfileCorrelate; - -std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || - ProfileCorrelate != llvm::InstrProfCorrelator::NONE - ? 
"default_%m.proflite" - : "default_%m.profraw"; -} -} // namespace llvm - void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts(); @@ -943,13 +915,14 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (opts.hasProfileIRInstr()) { // -fprofile-generate. - pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + pgoOpt = llvm::PGOOptions(opts.InstrProfileOutput.empty() + ? llvm::driver::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, + llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, + llvm::PGOOptions::ColdFuncOpt::Default, false, + /*PseudoProbeForProfiling=*/false, false); } else if (opts.hasProfileIRUse()) { llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); @@ -959,7 +932,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { pgoOpt = llvm::PGOOptions( opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, - ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + llvm::PGOOptions::ColdFuncOpt::Default, false); } llvm::StandardInstrumentations si(llvmModule->getContext(), diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index c51476e9ad3fe..3eb03cc3064cf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,6 +13,7 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H +#include 
namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -36,6 +37,15 @@ enum class VectorLibrary { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); +enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. +}; +// Default filename used for profile generation. +std::string getDefaultProfileGenName(); } // end namespace llvm::driver #endif diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index ed7c57a930aca..14b6b89da8465 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -8,8 +8,14 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/TargetParser/Triple.h" +namespace llvm { +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt + ProfileCorrelate; +} // namespace llvm namespace llvm::driver { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, @@ -56,4 +62,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, return TLII; } +std::string getDefaultProfileGenName() { + return llvm::DebugInfoCorrelate || + llvm::ProfileCorrelate != InstrProfCorrelator::NONE + ? 
"default_%m.proflite" + : "default_%m.profraw"; +} } // namespace llvm::driver >From 70fea2265a374f59345691f4ad7653ef4f0b6aa6 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:25:15 +0800 Subject: [PATCH 08/13] Move the interface to the cpp that uses it --- clang/include/clang/CodeGen/BackendUtil.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h index d9abf7bf962d2..92e0d13bf25b6 100644 --- a/clang/include/clang/CodeGen/BackendUtil.h +++ b/clang/include/clang/CodeGen/BackendUtil.h @@ -8,8 +8,6 @@ #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/PGOOptions.h" #include "clang/Basic/LLVM.h" #include "llvm/IR/ModuleSummaryIndex.h" @@ -21,7 +19,6 @@ template class Expected; template class IntrusiveRefCntPtr; class Module; class MemoryBufferRef; -extern cl::opt ClPGOColdFuncAttr; namespace vfs { class FileSystem; } // namespace vfs >From 5705d5eff937ca18eb44bec28a967a8629f0c085 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:26:22 +0800 Subject: [PATCH 09/13] Move the interface to the cpp that uses it --- flang/include/flang/Frontend/FrontendActions.h | 5 ----- 1 file changed, 5 deletions(-) diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h index 9ba74a9dad9be..f9a45bd6c0a56 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -20,13 +20,8 @@ #include "mlir/IR/OwningOpRef.h" #include "llvm/ADT/StringRef.h" #include "llvm/IR/Module.h" -#include "llvm/Support/CommandLine.h" -#include "llvm/Support/PGOOptions.h" #include -namespace llvm { -extern cl::opt ClPGOColdFuncAttr; -} // namespace llvm namespace Fortran::frontend { //===----------------------------------------------------------------------===// >From 
016aab17f4cc73416c6ebca61240f269aac837d2 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Wed, 30 Apr 2025 17:34:00 +0800 Subject: [PATCH 10/13] Fill in the missing code --- clang/lib/CodeGen/BackendUtil.cpp | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 2d33edbb8430d..6eb3a8638b7d1 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -103,6 +103,21 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP( "sanitizer-early-opt-ep", cl::Optional, cl::desc("Insert sanitizers on OptimizerEarlyEP.")); +// Experiment to mark cold functions as optsize/minsize/optnone. +// TODO: remove once this is exposed as a proper driver flag. +static cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, + cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); + extern cl::opt ProfileCorrelate; } // namespace llvm namespace clang { >From f36bfcfbfdc87b896f41be1ba25d8c18c339f1c1 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Thu, 1 May 2025 23:18:34 +0800 Subject: [PATCH 11/13] Adjusting the format of the code --- flang/test/Profile/gcc-flag-compatibility.f90 | 7 ------- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 7 ++++--- 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 index 0124bc79b87ef..4490c45232d28 100644 --- a/flang/test/Profile/gcc-flag-compatibility.f90 +++ 
b/flang/test/Profile/gcc-flag-compatibility.f90 @@ -9,24 +9,17 @@ ! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section ! PROFILE-GEN: @__profd_{{_?}}main = - - ! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof ! This uses LLVM IR format profile. ! RUN: rm -rf %t.dir ! RUN: mkdir -p %t.dir/some/path ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s -! - - - ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s ! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} ! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} - program main implicit none integer :: i diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 3eb03cc3064cf..98b9e1554f317 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -14,6 +14,7 @@ #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #include + namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -34,9 +35,6 @@ enum class VectorLibrary { AMDLIBM // AMD vector math library. }; -TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, - VectorLibrary Veclib); - enum ProfileInstrKind { ProfileNone, // Profile instrumentation is turned off. ProfileClangInstr, // Clang instrumentation to generate execution counts @@ -44,6 +42,9 @@ enum ProfileInstrKind { ProfileIRInstr, // IR level PGO instrumentation in LLVM. ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
}; +TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, + VectorLibrary Veclib); + // Default filename used for profile generation. std::string getDefaultProfileGenName(); } // end namespace llvm::driver >From a5c7da77d2aa6909451bed3fb0f02c9b735dc876 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 2 May 2025 00:01:26 +0800 Subject: [PATCH 12/13] Adjusting the format of the code --- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 98b9e1554f317..84bba2a964ecf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -20,6 +20,7 @@ class Triple; class TargetLibraryInfoImpl; } // namespace llvm + namespace llvm::driver { /// Vector library option used with -fveclib= @@ -42,6 +43,7 @@ enum ProfileInstrKind { ProfileIRInstr, // IR level PGO instrumentation in LLVM. ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
}; + TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); >From a99e16b29d70d2fea6d16ec06e6ca55f477b74e9 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 2 May 2025 00:07:23 +0800 Subject: [PATCH 13/13] Adjusting the format of the code --- llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 1 - llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 84bba2a964ecf..f0baa6fcdbbd3 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -20,7 +20,6 @@ class Triple; class TargetLibraryInfoImpl; } // namespace llvm - namespace llvm::driver { /// Vector library option used with -fveclib= diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index 14b6b89da8465..c48f5ed68b10b 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -16,6 +16,7 @@ extern llvm::cl::opt DebugInfoCorrelate; extern llvm::cl::opt ProfileCorrelate; } // namespace llvm + namespace llvm::driver { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, From flang-commits at lists.llvm.org Sun May 25 23:41:20 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sun, 25 May 2025 23:41:20 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <68340d10.170a0220.35d4da.ce84@mx.google.com> fanju110 wrote: > Thank you for seeing this through and making all the little changes. I have requested reviews from @MaskRay and @aeubanks for the clang side of things. Hello, I noticed that this PR has been awaiting clang review for three weeks. 
I still haven't gotten a comment from them. Would it be possible for you to suggest others who might be available to help review? Thanks https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Mon May 26 00:52:54 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 26 May 2025 00:52:54 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68341dd6.170a0220.1dff0f.3034@mx.google.com> https://github.com/jeanPerier edited https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Mon May 26 00:52:54 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 26 May 2025 00:52:54 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68341dd6.170a0220.1cc1da.2037@mx.google.com> https://github.com/jeanPerier approved this pull request. Nit about the need for MLIR tests, LGTM otherwise! https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Mon May 26 00:52:56 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Mon, 26 May 2025 00:52:56 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68341dd8.170a0220.202fd.4dc6@mx.google.com> ================ @@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, funcOp.setTargetFeaturesAttr( LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width"); + attr.isStringAttribute()) + funcOp.setPreferVectorWidth(attr.getValueAsString()); ---------------- jeanPerier wrote: Can you add an import test in mlir/test/Target/LLVMIR/Import and an export one in mlir/test/Target/LLVMIR (just like the ones for tune-cpu)? 
Even though the export is tested indirectly in flang, it is best to add an explicit test here. https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Mon May 26 04:53:56 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Mon, 26 May 2025 04:53:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <68345654.170a0220.2cd125.1a43@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 >From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:48:48 -0400 Subject: [PATCH 1/4] [flang] Ensure that the integer for Cray pointer is sized correctly The integer used for Cray pointers should have the size equivalent to platform's pointer size. modified: flang/include/flang/Evaluate/target.h modified: flang/include/flang/Tools/TargetSetup.h modified: flang/lib/Semantics/resolve-names.cpp --- flang/include/flang/Evaluate/target.h | 6 ++++++ flang/include/flang/Tools/TargetSetup.h | 3 +++ flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..281e42d5285fb 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t pointerSize() { return pointerSize_; } + void set_pointerSize(std::size_t pointerSize) { + pointerSize_ = pointerSize; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, 
IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. } diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 30477b842ae64fb1c94f520136956cc4685648a7 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:59:20 -0400 Subject: [PATCH 2/4] clang-format --- flang/include/flang/Evaluate/target.h | 4 +--- flang/include/flang/Tools/TargetSetup.h | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d5285fb..e8b9fedc38f48 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ class TargetCharacteristics { const 
IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e21fb4..24ab65f740ec6 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. >From e1f91c83680d72fc1463a8db97a77298141286d9 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:41:48 -0400 Subject: [PATCH 3/4] Renaming to integerKindForPointer --- flang/include/flang/Evaluate/target.h | 8 +++++--- flang/include/flang/Tools/TargetSetup.h | 3 ++- flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index e8b9fedc38f48..7b38db2db1956 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,8 +131,10 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } - std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } + std::size_t integerKindForPointer() { return integerKindForPointer_; } + void set_integerKindForPointer(std::size_t newKind) { + integerKindForPointer_ = newKind; + } private: static constexpr 
int maxKind{common::maxKind}; @@ -159,7 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; - std::size_t pointerSize_{8 /* bytes */}; + std::size_t integerKindForPointer_{8}; /* For 64 bit pointer */ }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index 24ab65f740ec6..002e82aa72484 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,7 +94,8 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); - targetCharacteristics.set_pointerSize( + // Currently the integer kind happens to be the same as the byte size + targetCharacteristics.set_integerKindForPointer( targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 88114bfe84af3..5fd5ea8f9bc5c 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; + TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 940ff38d0abba9d1b0ab415e42474629284962d8 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:51:53 -0400 Subject: [PATCH 4/4] clang-format --- flang/lib/Semantics/resolve-names.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp 
b/flang/lib/Semantics/resolve-names.cpp index 5fd5ea8f9bc5c..a8dbf61c8fd68 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6689,8 +6689,8 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { "'%s' cannot be a Cray pointer as it is already a Cray pointee"_err_en_US); } pointer->set(Symbol::Flag::CrayPointer); - const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; + const DeclTypeSpec &pointerType{MakeNumericType(TypeCategory::Integer, + context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); From flang-commits at lists.llvm.org Mon May 26 05:30:21 2025 From: flang-commits at lists.llvm.org (Ryotaro Kasuga via flang-commits) Date: Mon, 26 May 2025 05:30:21 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <68345edd.050a0220.2140cd.28fa@mx.google.com> kasuga-fj wrote:

There is still a correctness issue with LoopInterchange, as I reported in #140238. This problem is not specific to C/C++. The following code demonstrates a case where illegal loop-interchange occurs.

```fortran
program main
  implicit none
  real, save :: A(5, 5)
  real, target :: V(16), W(16)
  real, pointer :: X(:, :) => null(), Y(:, :) => null()
  integer :: i, j
  do i = 0, 15
    A(i / 4 + 1, mod(i, 4) + 1) = real(i)
    V(i + 1) = real(i + i)
    W(i + 1) = real(i * i)
  end do
  X(1:4, 1:4) => V(:)
  if (A(2, 2) < A(3, 3)) then
    Y(1:4, 1:4) => V(:)
  else
    Y(1:4, 1:4) => W(:)
  endif
  ! Illegal interchange occurs
  do j = 1, 4
    do i = 1, 4
      X(i, j) = Y(j, i)
      A(i + 1, j) = A(i, j) + 1
    end do
  end do
  print *, X
end program main
```

godbolt: https://godbolt.org/z/db5qaab1T

This issue should be resolved by #140709.
https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Mon May 26 08:40:36 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Mon, 26 May 2025 08:40:36 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68348b74.170a0220.1437e1.fa11@mx.google.com> https://github.com/tarunprabhu edited https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Mon May 26 08:40:36 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Mon, 26 May 2025 08:40:36 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68348b74.170a0220.34bd7a.eda2@mx.google.com> https://github.com/tarunprabhu requested changes to this pull request. Thanks for this. I second @jeanPerier's request for an MLIR test https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Mon May 26 08:40:36 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Mon, 26 May 2025 08:40:36 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68348b74.a70a0220.3aef75.473f@mx.google.com> ================ @@ -309,6 +309,20 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ)) opts.LLVMPassPlugins.push_back(a->getValue()); + // -mprefer_vector_width option + if (const llvm::opt::Arg *a = args.getLastArg( + clang::driver::options::OPT_mprefer_vector_width_EQ)) { + llvm::StringRef s = a->getValue(); + unsigned Width; ---------------- tarunprabhu wrote: Flang's coding conventions are a bit different from clang. 
Variable names should start with a lowercase letter
```suggestion
unsigned width;
```
https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Mon May 26 11:20:57 2025 From: flang-commits at lists.llvm.org (Steve Scalpone via flang-commits) Date: Mon, 26 May 2025 11:20:57 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][flang-driver] Support flag -finstrument-functions (PR #137996) In-Reply-To: Message-ID: <6834b109.170a0220.35fa67.747d@mx.google.com> sscalpone wrote: Hi @anchuraj! Nice patch! Are you interested in extending the front-end handling to support:
```
-finstrument-functions-exclude-function-list=sym,sym,...
```
```
-finstrument-functions-exclude-file-list=file,file,...
```
In my experience, I found that application engineers generally need these extensions soon after discovering -finstrument-functions :) https://github.com/llvm/llvm-project/pull/137996 From flang-commits at lists.llvm.org Mon May 26 16:06:37 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Mon, 26 May 2025 16:06:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Allow forward reference to non-default INTEGER dummy (PR #141254) In-Reply-To: Message-ID: <6834f3fd.630a0220.2da0e1.aae8@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. 
https://github.com/llvm/llvm-project/pull/141254 From flang-commits at lists.llvm.org Mon May 26 17:13:51 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Mon, 26 May 2025 17:13:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <683503bf.630a0220.359ad9.9afe@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 >From 0083c965ba32de46babcb0e633d54b1ba0121365 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:48:48 -0400 Subject: [PATCH 1/4] [flang] Ensure that the integer for Cray pointer is sized correctly The integer used for Cray pointers should have the size equivalent to platform's pointer size. modified: flang/include/flang/Evaluate/target.h modified: flang/include/flang/Tools/TargetSetup.h modified: flang/lib/Semantics/resolve-names.cpp --- flang/include/flang/Evaluate/target.h | 6 ++++++ flang/include/flang/Tools/TargetSetup.h | 3 +++ flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..281e42d5285fb 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t pointerSize() { return pointerSize_; } + void set_pointerSize(std::size_t pointerSize) { + pointerSize_ = pointerSize; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, 
IeeeFeature::UnderflowControl}; + std::size_t pointerSize_{8 /* bytes */}; }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..a30c725e21fb4 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,9 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + targetCharacteristics.set_pointerSize( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. } diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 92a3277191ae0..2f6fe4fea4da2 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 30477b842ae64fb1c94f520136956cc4685648a7 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Tue, 20 May 2025 19:59:20 -0400 Subject: [PATCH 2/4] clang-format --- flang/include/flang/Evaluate/target.h | 4 +--- flang/include/flang/Tools/TargetSetup.h | 2 +- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index 281e42d5285fb..e8b9fedc38f48 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -132,9 +132,7 @@ class TargetCharacteristics { const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } std::size_t pointerSize() { return pointerSize_; } - 
void set_pointerSize(std::size_t pointerSize) { - pointerSize_ = pointerSize; - } + void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } private: static constexpr int maxKind{common::maxKind}; diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index a30c725e21fb4..24ab65f740ec6 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -95,7 +95,7 @@ namespace Fortran::tools { targetCharacteristics.set_isOSWindows(true); targetCharacteristics.set_pointerSize( - targetTriple.getArchPointerBitWidth() / 8); + targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. >From e1f91c83680d72fc1463a8db97a77298141286d9 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:41:48 -0400 Subject: [PATCH 3/4] Renaming to integerKindForPointer --- flang/include/flang/Evaluate/target.h | 8 +++++--- flang/include/flang/Tools/TargetSetup.h | 3 ++- flang/lib/Semantics/resolve-names.cpp | 2 +- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index e8b9fedc38f48..7b38db2db1956 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,8 +131,10 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } - std::size_t pointerSize() { return pointerSize_; } - void set_pointerSize(std::size_t pointerSize) { pointerSize_ = pointerSize; } + std::size_t integerKindForPointer() { return integerKindForPointer_; } + void set_integerKindForPointer(std::size_t newKind) { + integerKindForPointer_ = newKind; + } private: static constexpr int maxKind{common::maxKind}; @@ -159,7 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, 
IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; - std::size_t pointerSize_{8 /* bytes */}; + std::size_t integerKindForPointer_{8}; /* For 64 bit pointer */ }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index 24ab65f740ec6..002e82aa72484 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,7 +94,8 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); - targetCharacteristics.set_pointerSize( + // Currently the integer kind happens to be the same as the byte size + targetCharacteristics.set_integerKindForPointer( targetTriple.getArchPointerBitWidth() / 8); // TODO: use target machine data layout to set-up the target characteristics diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 88114bfe84af3..5fd5ea8f9bc5c 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6690,7 +6690,7 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { } pointer->set(Symbol::Flag::CrayPointer); const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().pointerSize())}; + TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); >From 940ff38d0abba9d1b0ab415e42474629284962d8 Mon Sep 17 00:00:00 2001 From: Eugene Epshteyn Date: Wed, 21 May 2025 23:51:53 -0400 Subject: [PATCH 4/4] clang-format --- flang/lib/Semantics/resolve-names.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 5fd5ea8f9bc5c..a8dbf61c8fd68 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ 
b/flang/lib/Semantics/resolve-names.cpp @@ -6689,8 +6689,8 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { "'%s' cannot be a Cray pointer as it is already a Cray pointee"_err_en_US); } pointer->set(Symbol::Flag::CrayPointer); - const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().targetCharacteristics().integerKindForPointer())}; + const DeclTypeSpec &pointerType{MakeNumericType(TypeCategory::Integer, + context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); From flang-commits at lists.llvm.org Mon May 26 23:35:47 2025 From: flang-commits at lists.llvm.org (KAWASHIMA Takahiro via flang-commits) Date: Mon, 26 May 2025 23:35:47 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182) In-Reply-To: Message-ID: <68355d43.a70a0220.2c0be1.08d0@mx.google.com> kawashima-fj wrote: I've confirmed the result of Fujitsu Compiler Test Suite. The only correctness issue affected by this commit is https://github.com/fujitsu/compiler-test-suite/blob/main/Fortran/0347/0347_0240.f, which will be resolved by #140709. 
https://github.com/llvm/llvm-project/pull/140182 From flang-commits at lists.llvm.org Tue May 27 03:19:00 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Tue, 27 May 2025 03:19:00 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <68359194.050a0220.d4a4e.5aa9@mx.google.com> https://github.com/eugeneepshteyn updated https://github.com/llvm/llvm-project/pull/140822 From flang-commits at lists.llvm.org Tue May 27 06:28:46 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Tue, 27 May 2025 06:28:46 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6835be0e.630a0220.1fa401.1d89@mx.google.com> TIFitis wrote: Polite reminder for review :) https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Tue May 27 07:44:43 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Tue, 27 May 2025 07:44:43 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6835cfdb.170a0220.47242.6a4d@mx.google.com> ================ @@ -2093,7 +2093,11 @@ class UnparseVisitor { Walk(x.v, ","); } void Unparse(const OmpMapperSpecifier &x) { - Walk(std::get>(x.t), ":"); + const auto &mapperName = std::get(x.t); ---------------- kparzysz wrote: Same here (and below): ...mapperName{...}; https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Tue May 27 07:44:43 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Tue, 27 May 2025 07:44:43 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check 
and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6835cfdb.170a0220.226d38.8718@mx.google.com> ================ @@ -1389,8 +1389,28 @@ TYPE_PARSER( TYPE_PARSER(sourced(construct( verbatim("DECLARE TARGET"_tok), Parser{}))) +static OmpMapperSpecifier ConstructOmpMapperSpecifier( + std::optional &&mapperName, TypeSpec &&typeSpec, Name &&varName) { + // If a name is present, parse: name ":" typeSpec "::" name + // This matches the syntax: : :: + if (mapperName.has_value() && mapperName->ToString() != "default") { + return OmpMapperSpecifier{ + mapperName->ToString(), std::move(typeSpec), std::move(varName)}; + } + // If the name is missing, use the DerivedTypeSpec name to construct the + // default mapper name. + // This matches the syntax: :: + if (auto *derived = std::get_if(&typeSpec.u)) { ---------------- kparzysz wrote: Please use bracketed initialization in the frontend. https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Tue May 27 08:09:51 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 27 May 2025 08:09:51 -0700 (PDT) Subject: [flang-commits] [flang] 42b1df4 - [mlir][math] Add missing trig math-to-llvm conversion patterns (#141069) Message-ID: <6835d5bf.170a0220.1501d9.a077@mx.google.com> Author: Asher Mancinelli Date: 2025-05-27T08:09:48-07:00 New Revision: 42b1df43e73391f035dd370f219d2fca3482769d URL: https://github.com/llvm/llvm-project/commit/42b1df43e73391f035dd370f219d2fca3482769d DIFF: https://github.com/llvm/llvm-project/commit/42b1df43e73391f035dd370f219d2fca3482769d.diff LOG: [mlir][math] Add missing trig math-to-llvm conversion patterns (#141069) asin, acos, atan, and atan2 were being lowered to libm calls instead of llvm intrinsics. Add the conversion patterns to handle these intrinsics and update tests to expect this. 
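As a rough illustration of the effect described in the log above, here is a toy C++ model of the lowering decision. It is only a sketch under stated assumptions: `ToyPatternSet`, `patternsBefore`, `patternsAfter`, and `lowerF32` are hypothetical names made up for this note and are not part of MLIR or flang; the real change registers conversion patterns in MathToLLVM.cpp. The output strings follow the spellings used in the CHECK lines of the tests in this commit.

```cpp
#include <cassert>
#include <initializer_list>
#include <set>
#include <string>

// Hypothetical stand-in for "which math ops have a direct math-to-LLVM
// conversion pattern". This is a toy model, not the MLIR pattern API.
struct ToyPatternSet {
  std::set<std::string> ops;
  bool has(const std::string &op) const { return ops.count(op) != 0; }
};

// Assumed state before the change, restricted to a few trig ops:
// the inverse trig ops have no direct pattern.
inline ToyPatternSet patternsBefore() {
  return ToyPatternSet{{"sin", "cos", "tan"}};
}

// Assumed state after the change: asin, acos, atan, and atan2 are covered.
inline ToyPatternSet patternsAfter() {
  ToyPatternSet p = patternsBefore();
  for (const char *op : {"asin", "acos", "atan", "atan2"})
    p.ops.insert(op);
  return p;
}

// Returns the spelling of the operation the toy lowering emits for an f32
// operand: an LLVM-dialect intrinsic op when a pattern exists, otherwise a
// call to the single-precision libm function (name suffixed with 'f').
inline std::string lowerF32(const ToyPatternSet &patterns,
                            const std::string &op) {
  assert(!op.empty() && "op name must not be empty");
  if (patterns.has(op))
    return "llvm.intr." + op;
  return "llvm.call @" + op + "f";
}
```

In this model, `lowerF32(patternsBefore(), "atan")` gives `llvm.call @atanf` and `lowerF32(patternsAfter(), "atan")` gives `llvm.intr.atan`, which mirrors the CHECK-line changes for the fast and relaxed variants in math-codegen.fir. The precise variants in that test keep their explicit `fir.call` to libm and are not modeled here.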
Added: Modified: flang/test/Intrinsics/math-codegen.fir mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir Removed: ################################################################################ diff --git a/flang/test/Intrinsics/math-codegen.fir b/flang/test/Intrinsics/math-codegen.fir index c45c6b23e897e..b7c4e07130662 100644 --- a/flang/test/Intrinsics/math-codegen.fir +++ b/flang/test/Intrinsics/math-codegen.fir @@ -378,13 +378,167 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { func.func private @llvm.round.f32(f32) -> f32 func.func private @llvm.round.f64(f64) -> f64 +//--- asin_fast.fir +// RUN: fir-opt %t/asin_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_fast.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- asin_relaxed.fir +// RUN: fir-opt %t/asin_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_relaxed.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + 
+func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.asin %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- asin_precise.fir +// RUN: fir-opt %t/asin_precise.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/asin_precise.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @asinf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @asin({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @asinf(%1) : (f32) -> f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @asin(%1) : (f64) -> f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} +func.func private @asinf(f32) -> f32 +func.func private @asin(f64) -> f64 + +//--- acos_fast.fir +// RUN: fir-opt %t/acos_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_fast.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: 
{{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- acos_relaxed.fir +// RUN: fir-opt %t/acos_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_relaxed.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = math.acos %1 : f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} + +//--- acos_precise.fir +// RUN: fir-opt %t/acos_precise.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/acos_precise.fir +// CHECK: @_QPtest_real4 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @acosf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 + +// CHECK: @_QPtest_real8 +// CHECK: {{%[A-Za-z0-9._]+}} = 
llvm.call @acos({{%[A-Za-z0-9._]+}}) : (f64) -> f64 + +func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { + %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @acosf(%1) : (f32) -> f32 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f32 +} +func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { + %0 = fir.alloca f64 {bindc_name = "test_real8", uniq_name = "_QFtest_real8Etest_real8"} + %1 = fir.load %arg0 : !fir.ref + %2 = fir.call @acos(%1) : (f64) -> f64 + fir.store %2 to %0 : !fir.ref + %3 = fir.load %0 : !fir.ref + return %3 : f64 +} +func.func private @acosf(f32) -> f32 +func.func private @acos(f64) -> f64 + //--- atan_fast.fir // RUN: fir-opt %t/atan_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan_fast.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -406,10 +560,10 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f64 { //--- atan_relaxed.fir // RUN: fir-opt %t/atan_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan_relaxed.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atanf({{%[A-Za-z0-9._]+}}) : (f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan({{%[A-Za-z0-9._]+}}) : (f64) -> 
f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan({{%[A-Za-z0-9._]+}}) : (f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -458,10 +612,10 @@ func.func private @atan(f64) -> f64 //--- atan2_fast.fir // RUN: fir-opt %t/atan2_fast.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan2_fast.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "y"}) -> f32 { %0 = fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} @@ -485,10 +639,10 @@ func.func @_QPtest_real8(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fi //--- atan2_relaxed.fir // RUN: fir-opt %t/atan2_relaxed.fir --fir-to-llvm-ir="target=x86_64-unknown-linux-gnu" | FileCheck %t/atan2_relaxed.fir // CHECK: @_QPtest_real4 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2f({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f32, f32) -> f32 // CHECK: @_QPtest_real8 -// CHECK: {{%[A-Za-z0-9._]+}} = llvm.call @atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 +// CHECK: {{%[A-Za-z0-9._]+}} = llvm.intr.atan2({{%[A-Za-z0-9._]+}}, {{%[A-Za-z0-9._]+}}) : (f64, f64) -> f64 func.func @_QPtest_real4(%arg0: !fir.ref {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "y"}) -> f32 { %0 
= fir.alloca f32 {bindc_name = "test_real4", uniq_name = "_QFtest_real4Etest_real4"} diff --git a/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp b/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp index 97da96afac4cd..b42bb773f53ee 100644 --- a/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp +++ b/mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp @@ -42,6 +42,7 @@ using CopySignOpLowering = ConvertFMFMathToLLVMPattern; using CosOpLowering = ConvertFMFMathToLLVMPattern; using CoshOpLowering = ConvertFMFMathToLLVMPattern; +using AcosOpLowering = ConvertFMFMathToLLVMPattern; using CtPopFOpLowering = VectorConvertToLLVMPattern; using Exp2OpLowering = ConvertFMFMathToLLVMPattern; @@ -62,12 +63,15 @@ using RoundOpLowering = ConvertFMFMathToLLVMPattern; using SinOpLowering = ConvertFMFMathToLLVMPattern; using SinhOpLowering = ConvertFMFMathToLLVMPattern; +using ASinOpLowering = ConvertFMFMathToLLVMPattern; using SqrtOpLowering = ConvertFMFMathToLLVMPattern; using FTruncOpLowering = ConvertFMFMathToLLVMPattern; using TanOpLowering = ConvertFMFMathToLLVMPattern; using TanhOpLowering = ConvertFMFMathToLLVMPattern; - +using ATanOpLowering = ConvertFMFMathToLLVMPattern; +using ATan2OpLowering = + ConvertFMFMathToLLVMPattern; // A `CtLz/CtTz/absi(a)` is converted into `CtLz/CtTz/absi(a, false)`. 
template struct IntOpWithFlagLowering : public ConvertOpToLLVMPattern { @@ -353,6 +357,7 @@ void mlir::populateMathToLLVMConversionPatterns( CopySignOpLowering, CosOpLowering, CoshOpLowering, + AcosOpLowering, CountLeadingZerosOpLowering, CountTrailingZerosOpLowering, CtPopFOpLowering, @@ -371,10 +376,13 @@ void mlir::populateMathToLLVMConversionPatterns( RsqrtOpLowering, SinOpLowering, SinhOpLowering, + ASinOpLowering, SqrtOpLowering, FTruncOpLowering, TanOpLowering, - TanhOpLowering + TanhOpLowering, + ATanOpLowering, + ATan2OpLowering >(converter, benefit); // clang-format on } diff --git a/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir b/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir index 974743a55932b..537fb967ef0e1 100644 --- a/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir +++ b/mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir @@ -177,6 +177,84 @@ func.func @trigonometrics(%arg0: f32) { // ----- +// CHECK-LABEL: func @inverse_trigonometrics +// CHECK-SAME: [[ARG0:%.+]]: f32 +func.func @inverse_trigonometrics(%arg0: f32) { + // CHECK: llvm.intr.asin([[ARG0]]) : (f32) -> f32 + %0 = math.asin %arg0 : f32 + + // CHECK: llvm.intr.acos([[ARG0]]) : (f32) -> f32 + %1 = math.acos %arg0 : f32 + + // CHECK: llvm.intr.atan([[ARG0]]) : (f32) -> f32 + %2 = math.atan %arg0 : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2 +// CHECK-SAME: [[ARG0:%.+]]: f32, [[ARG1:%.+]]: f32 +func.func @atan2(%arg0: f32, %arg1: f32) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) : (f32, f32) -> f32 + %0 = math.atan2 %arg0, %arg1 : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @inverse_trigonometrics_vector +// CHECK-SAME: [[ARG0:%.+]]: vector<4xf32> +func.func @inverse_trigonometrics_vector(%arg0: vector<4xf32>) { + // CHECK: llvm.intr.asin([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %0 = math.asin %arg0 : vector<4xf32> + + // CHECK: llvm.intr.acos([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %1 = math.acos %arg0 : vector<4xf32> + + // CHECK: 
llvm.intr.atan([[ARG0]]) : (vector<4xf32>) -> vector<4xf32> + %2 = math.atan %arg0 : vector<4xf32> + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2_vector +// CHECK-SAME: [[ARG0:%.+]]: vector<4xf32>, [[ARG1:%.+]]: vector<4xf32> +func.func @atan2_vector(%arg0: vector<4xf32>, %arg1: vector<4xf32>) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) : (vector<4xf32>, vector<4xf32>) -> vector<4xf32> + %0 = math.atan2 %arg0, %arg1 : vector<4xf32> + func.return +} + +// ----- + +// CHECK-LABEL: func @inverse_trigonometrics_fmf +// CHECK-SAME: [[ARG0:%.+]]: f32 +func.func @inverse_trigonometrics_fmf(%arg0: f32) { + // CHECK: llvm.intr.asin([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %0 = math.asin %arg0 fastmath : f32 + + // CHECK: llvm.intr.acos([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %1 = math.acos %arg0 fastmath : f32 + + // CHECK: llvm.intr.atan([[ARG0]]) {fastmathFlags = #llvm.fastmath} : (f32) -> f32 + %2 = math.atan %arg0 fastmath : f32 + func.return +} + +// ----- + +// CHECK-LABEL: func @atan2_fmf +// CHECK-SAME: [[ARG0:%.+]]: f32, [[ARG1:%.+]]: f32 +func.func @atan2_fmf(%arg0: f32, %arg1: f32) { + // CHECK: llvm.intr.atan2([[ARG0]], [[ARG1]]) {fastmathFlags = #llvm.fastmath} : (f32, f32) -> f32 + %0 = math.atan2 %arg0, %arg1 fastmath : f32 + func.return +} + +// ----- + // CHECK-LABEL: func @hyperbolics // CHECK-SAME: [[ARG0:%.+]]: f32 func.func @hyperbolics(%arg0: f32) { From flang-commits at lists.llvm.org Tue May 27 08:09:54 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Tue, 27 May 2025 08:09:54 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [mlir][math] Add missing trig math-to-llvm conversion patterns (PR #141069) In-Reply-To: Message-ID: <6835d5c2.170a0220.16df54.e2ba@mx.google.com> https://github.com/ashermancinelli closed https://github.com/llvm/llvm-project/pull/141069 From flang-commits at lists.llvm.org Tue May 27 08:32:51 2025 From: flang-commits at 
lists.llvm.org (Valentin Clement (バレンタイン クレメン) via flang-commits) Date: Tue, 27 May 2025 08:32:51 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <6835db23.170a0220.826de.efc5@mx.google.com> https://github.com/clementval approved this pull request. I ran some tests and the recursion problem seems to be addressed as expected. https://github.com/llvm/llvm-project/pull/137727 From flang-commits at lists.llvm.org Tue May 27 09:33:55 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 27 May 2025 09:33:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <6835e973.170a0220.2b7a5.153f@mx.google.com> https://github.com/akuhlens approved this pull request. https://github.com/llvm/llvm-project/pull/140822 From flang-commits at lists.llvm.org Tue May 27 09:35:47 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 27 May 2025 09:35:47 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Allow forward reference to non-default INTEGER dummy (PR #141254) In-Reply-To: Message-ID: <6835e9e3.630a0220.2fc9ae.28e0@mx.google.com> https://github.com/akuhlens approved this pull request. https://github.com/llvm/llvm-project/pull/141254 From flang-commits at lists.llvm.org Tue May 27 09:36:57 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 27 May 2025 09:36:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix prescanner bug w/ empty macros in line continuation (PR #141274) In-Reply-To: Message-ID: <6835ea29.170a0220.2dda17.6c3f@mx.google.com> https://github.com/akuhlens approved this pull request.
https://github.com/llvm/llvm-project/pull/141274 From flang-commits at lists.llvm.org Tue May 27 09:39:38 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 27 May 2025 09:39:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix folding of SHAPE(SPREAD(source, dim, ncopies=-1)) (PR #141146) In-Reply-To: Message-ID: <6835eaca.170a0220.ab284.bbef@mx.google.com> https://github.com/akuhlens approved this pull request. https://github.com/llvm/llvm-project/pull/141146 From flang-commits at lists.llvm.org Tue May 27 09:41:30 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 27 May 2025 09:41:30 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <6835eb3a.630a0220.30548b.4a47@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command: ``````````bash git-clang-format --diff HEAD~1 HEAD --extensions cpp,h -- flang-rt/include/flang-rt/runtime/work-queue.h flang-rt/lib/runtime/work-queue.cpp flang-rt/include/flang-rt/runtime/environment.h flang-rt/include/flang-rt/runtime/stat.h flang-rt/lib/runtime/assign.cpp flang-rt/lib/runtime/derived.cpp flang-rt/lib/runtime/descriptor-io.cpp flang-rt/lib/runtime/descriptor-io.h flang-rt/lib/runtime/environment.cpp flang-rt/lib/runtime/namelist.cpp flang-rt/lib/runtime/type-info.cpp flang-rt/unittests/Runtime/ExternalIOTest.cpp flang/include/flang/Runtime/assign.h ``````````
View the diff from clang-format here. ``````````diff diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 9bf7c193d..4aa3640c1 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -42,8 +42,9 @@ inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, } // Defined formatted I/O (maybe) -static RT_API_ATTRS Fortran::common::optional DefinedFormattedIo(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &derived, +static RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( + IoStatementState &io, const Descriptor &descriptor, + const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special, const SubscriptValue subscripts[]) { Fortran::common::optional peek{ ``````````
https://github.com/llvm/llvm-project/pull/137727 From flang-commits at lists.llvm.org Tue May 27 09:45:48 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 27 May 2025 09:45:48 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix crash in error recovery (PR #140768) In-Reply-To: Message-ID: <6835ec3c.170a0220.14bc0e.c698@mx.google.com> https://github.com/akuhlens approved this pull request. https://github.com/llvm/llvm-project/pull/140768 From flang-commits at lists.llvm.org Tue May 27 10:00:27 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Tue, 27 May 2025 10:00:27 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <6835efab.170a0220.2c5c53.cac9@mx.google.com> akuhlens wrote: Never mind, don't worry about waiting for @klausler's approval. I just confirmed he is fine with landing this. https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Tue May 27 10:01:14 2025 From: flang-commits at lists.llvm.org (Walter J.T.V via flang-commits) Date: Tue, 27 May 2025 10:01:14 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293) In-Reply-To: Message-ID: <6835efda.170a0220.2dde48.acc3@mx.google.com> eZWALT wrote: gentle ping @alexey-bataev https://github.com/llvm/llvm-project/pull/139293 From flang-commits at lists.llvm.org Tue May 27 10:08:38 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 27 May 2025 10:08:38 -0700 (PDT) Subject: [flang-commits] [flang] 66a2d4b - [flang] Ensure that the integer for Cray pointer is sized correctly (#140822) Message-ID: <6835f196.050a0220.1cbcce.edbf@mx.google.com> Author: Eugene Epshteyn Date: 2025-05-27T13:08:35-04:00 New Revision: 66a2d4b1e7b54a906990d8cee6174c228604830b
URL: https://github.com/llvm/llvm-project/commit/66a2d4b1e7b54a906990d8cee6174c228604830b DIFF: https://github.com/llvm/llvm-project/commit/66a2d4b1e7b54a906990d8cee6174c228604830b.diff LOG: [flang] Ensure that the integer for Cray pointer is sized correctly (#140822) The integer used for Cray pointers should have the size equivalent to platform's pointer size. Added: Modified: flang/include/flang/Evaluate/target.h flang/include/flang/Tools/TargetSetup.h flang/lib/Semantics/resolve-names.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Evaluate/target.h b/flang/include/flang/Evaluate/target.h index cc6172b492b3c..7b38db2db1956 100644 --- a/flang/include/flang/Evaluate/target.h +++ b/flang/include/flang/Evaluate/target.h @@ -131,6 +131,11 @@ class TargetCharacteristics { IeeeFeatures &ieeeFeatures() { return ieeeFeatures_; } const IeeeFeatures &ieeeFeatures() const { return ieeeFeatures_; } + std::size_t integerKindForPointer() { return integerKindForPointer_; } + void set_integerKindForPointer(std::size_t newKind) { + integerKindForPointer_ = newKind; + } + private: static constexpr int maxKind{common::maxKind}; std::uint8_t byteSize_[common::TypeCategory_enumSize][maxKind + 1]{}; @@ -156,6 +161,7 @@ class TargetCharacteristics { IeeeFeature::Io, IeeeFeature::NaN, IeeeFeature::Rounding, IeeeFeature::Sqrt, IeeeFeature::Standard, IeeeFeature::Subnormal, IeeeFeature::UnderflowControl}; + std::size_t integerKindForPointer_{8}; /* For 64 bit pointer */ }; } // namespace Fortran::evaluate diff --git a/flang/include/flang/Tools/TargetSetup.h b/flang/include/flang/Tools/TargetSetup.h index de3fb0519cf6b..002e82aa72484 100644 --- a/flang/include/flang/Tools/TargetSetup.h +++ b/flang/include/flang/Tools/TargetSetup.h @@ -94,6 +94,10 @@ namespace Fortran::tools { if (targetTriple.isOSWindows()) targetCharacteristics.set_isOSWindows(true); + // Currently the integer kind happens to be the same as the 
byte size + targetCharacteristics.set_integerKindForPointer( + targetTriple.getArchPointerBitWidth() / 8); + // TODO: use target machine data layout to set-up the target characteristics // type size and alignment info. } diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 3f4a06444c4f3..a8dbf61c8fd68 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -6689,8 +6689,8 @@ void DeclarationVisitor::Post(const parser::BasedPointer &bp) { "'%s' cannot be a Cray pointer as it is already a Cray pointee"_err_en_US); } pointer->set(Symbol::Flag::CrayPointer); - const DeclTypeSpec &pointerType{MakeNumericType( - TypeCategory::Integer, context().defaultKinds().subscriptIntegerKind())}; + const DeclTypeSpec &pointerType{MakeNumericType(TypeCategory::Integer, + context().targetCharacteristics().integerKindForPointer())}; const auto *type{pointer->GetType()}; if (!type) { pointer->SetType(pointerType); From flang-commits at lists.llvm.org Tue May 27 10:08:42 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Tue, 27 May 2025 10:08:42 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Ensure that the integer for Cray pointer is sized correctly (PR #140822) In-Reply-To: Message-ID: <6835f19a.170a0220.3a39d3.0f5c@mx.google.com> https://github.com/eugeneepshteyn closed https://github.com/llvm/llvm-project/pull/140822 From flang-commits at lists.llvm.org Tue May 27 10:11:33 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Tue, 27 May 2025 10:11:33 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <6835f245.170a0220.d72ea.a60c@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. 
Making my approval "official" https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Tue May 27 10:32:59 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Tue, 27 May 2025 10:32:59 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6835f74b.170a0220.b11bf.b2a8@mx.google.com> https://github.com/TIFitis updated https://github.com/llvm/llvm-project/pull/140560 >From f852e36071da0b78431cbf3de808e5cca804449a Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Mon, 12 May 2025 18:41:20 +0100 Subject: [PATCH 1/3] Fix semantic check for default declare mappers. --- flang/lib/Semantics/resolve-names.cpp | 21 ++++++++++++------- .../OpenMP/declare-mapper-symbols.f90 | 18 ++++++++-------- .../Semantics/OpenMP/declare-mapper03.f90 | 6 +----- 3 files changed, 23 insertions(+), 22 deletions(-) diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index bdafc03ad2c05..42297f069499b 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,6 +38,7 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" +#include "llvm/ADT/SmallVector.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1766,14 +1767,6 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses gets processed before // the type has been fully processed. 
BeginDeclTypeSpec(); - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const parser::CharBlock defaultName{"default", 7}; - MakeSymbol( - defaultName, Attrs{}, MiscDetails{MiscDetails::Kind::ConstructName}); - } PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); @@ -1783,6 +1776,18 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); + + if (auto &mapperName{std::get>(spec.t)}) { + mapperName->symbol = + &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); + } else { + const auto &type = std::get(spec.t); + static llvm::SmallVector defaultNames; + defaultNames.emplace_back( + type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); + MakeSymbol(defaultNames.back(), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); + } } void OmpVisitor::ProcessReductionSpecifier( diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index b4e03bd1632e5..0dda5b4456987 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -2,23 +2,23 @@ program main !CHECK-LABEL: MainProgram scope: main - implicit none + implicit none - type ty - integer :: x - end type ty - !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) - !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) + type ty + integer :: x + end type ty + !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) + !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) !! Note, symbols come out in their respective scope, but not in declaration order. 
-!CHECK: default: Misc ConstructName !CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x +!CHECK: ty.default: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) -!CHECK: OtherConstruct scope: +!CHECK: OtherConstruct scope: !CHECK: maptwo (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) - + end program main diff --git a/flang/test/Semantics/OpenMP/declare-mapper03.f90 b/flang/test/Semantics/OpenMP/declare-mapper03.f90 index b70b8a67f33e0..84fc3efafb3ad 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper03.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper03.f90 @@ -5,12 +5,8 @@ integer :: y end type t1 -type :: t2 - real :: y, z -end type t2 - !error: 'default' is already declared in this scoping unit !$omp declare mapper(t1::x) map(x, x%y) -!$omp declare mapper(t2::w) map(w, w%y, w%z) +!$omp declare mapper(t1::x) map(x) end >From c2d8c55407494b3d702f927c9a68032ecc56b629 Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Wed, 14 May 2025 20:38:01 +0100 Subject: [PATCH 2/3] Change mapper name field from parser::Name to std::string. 
--- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 6 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 22 ++--- flang/lib/Parser/openmp-parsers.cpp | 22 ++++- flang/lib/Parser/unparse.cpp | 11 ++- flang/lib/Semantics/resolve-names.cpp | 16 +--- flang/test/Lower/OpenMP/declare-mapper.f90 | 95 ++++++++++++++++++- flang/test/Lower/OpenMP/map-mapper.f90 | 4 +- .../Parser/OpenMP/declare-mapper-unparse.f90 | 15 +-- .../Parser/OpenMP/metadirective-dirspec.f90 | 2 +- .../OpenMP/declare-mapper-symbols.f90 | 2 +- 11 files changed, 149 insertions(+), 48 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 254236b510544..c99006f0c1c22 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -3540,7 +3540,7 @@ WRAPPER_CLASS(OmpLocatorList, std::list); struct OmpMapperSpecifier { // Absent mapper-identifier is equivalent to DEFAULT. TUPLE_CLASS_BOILERPLATE(OmpMapperSpecifier); - std::tuple, TypeSpec, Name> t; + std::tuple t; }; // Ref: [4.5:222:1-5], [5.0:305:20-27], [5.1:337:11-19], [5.2:139:18-23], diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 02454543d0a60..8dcc8be9be5bf 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1114,9 +1114,9 @@ void ClauseProcessor::processMapObjects( typeSpec = &object.sym()->GetType()->derivedTypeSpec(); if (typeSpec) { - mapperIdName = typeSpec->name().ToString() + ".default"; - mapperIdName = - converter.mangleName(mapperIdName, *typeSpec->GetScope()); + mapperIdName = typeSpec->name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); } } }; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 61bbc709872fd..cfcba0159db8d 100644 --- 
a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2422,8 +2422,10 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, mlir::FlatSymbolRefAttr mapperId; if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) { auto &typeSpec = sym.GetType()->derivedTypeSpec(); - std::string mapperIdName = typeSpec.name().ToString() + ".default"; - mapperIdName = converter.mangleName(mapperIdName, *typeSpec.GetScope()); + std::string mapperIdName = + typeSpec.name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); if (converter.getModuleOp().lookupSymbol(mapperIdName)) mapperId = mlir::FlatSymbolRefAttr::get(&converter.getMLIRContext(), mapperIdName); @@ -4005,24 +4007,16 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, lower::StatementContext stmtCtx; const auto &spec = std::get(declareMapperConstruct.t); - const auto &mapperName{std::get>(spec.t)}; + const auto &mapperName{std::get(spec.t)}; const auto &varType{std::get(spec.t)}; const auto &varName{std::get(spec.t)}; assert(varType.declTypeSpec->category() == semantics::DeclTypeSpec::Category::TypeDerived && "Expected derived type"); - std::string mapperNameStr; - if (mapperName.has_value()) { - mapperNameStr = mapperName->ToString(); - mapperNameStr = - converter.mangleName(mapperNameStr, mapperName->symbol->owner()); - } else { - mapperNameStr = - varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"; - mapperNameStr = converter.mangleName( - mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope()); - } + std::string mapperNameStr = mapperName; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperNameStr)) + mapperNameStr = converter.mangleName(mapperNameStr, sym->owner()); // Save current insertion point before moving to the module scope to create // the DeclareMapperOp diff 
--git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..a1ed584020677 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1389,8 +1389,28 @@ TYPE_PARSER( TYPE_PARSER(sourced(construct( verbatim("DECLARE TARGET"_tok), Parser{}))) +static OmpMapperSpecifier ConstructOmpMapperSpecifier( + std::optional &&mapperName, TypeSpec &&typeSpec, Name &&varName) { + // If a name is present, parse: name ":" typeSpec "::" name + // This matches the syntax: : :: + if (mapperName.has_value() && mapperName->ToString() != "default") { + return OmpMapperSpecifier{ + mapperName->ToString(), std::move(typeSpec), std::move(varName)}; + } + // If the name is missing, use the DerivedTypeSpec name to construct the + // default mapper name. + // This matches the syntax: :: + if (auto *derived = std::get_if(&typeSpec.u)) { + return OmpMapperSpecifier{ + std::get(derived->t).ToString() + ".omp.default.mapper", + std::move(typeSpec), std::move(varName)}; + } + return OmpMapperSpecifier{std::string("omp.default.mapper"), + std::move(typeSpec), std::move(varName)}; +} + // mapper-specifier -TYPE_PARSER(construct( +TYPE_PARSER(applyFunction(ConstructOmpMapperSpecifier, maybe(name / ":" / !":"_tok), typeSpec / "::", name)) // OpenMP 5.2: 5.8.8 Declare Mapper Construct diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..1d68e8d8850fa 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2093,7 +2093,11 @@ class UnparseVisitor { Walk(x.v, ","); } void Unparse(const OmpMapperSpecifier &x) { - Walk(std::get>(x.t), ":"); + const auto &mapperName = std::get(x.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); + Put(":"); + } Walk(std::get(x.t)); Put("::"); Walk(std::get(x.t)); @@ -2796,8 +2800,9 @@ class UnparseVisitor { BeginOpenMP(); Word("!$OMP DECLARE MAPPER ("); const auto &spec{std::get(z.t)}; - if 
(auto mapname{std::get>(spec.t)}) { - Walk(mapname); + const auto &mapperName = std::get(spec.t); + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); Put(":"); } Walk(std::get(spec.t)); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 42297f069499b..322562b06b87f 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1767,7 +1767,9 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses gets processed before // the type has been fully processed. BeginDeclTypeSpec(); - + auto &mapperName{std::get(spec.t)}; + MakeSymbol(parser::CharBlock(mapperName), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); auto &varName{std::get(spec.t)}; @@ -1776,18 +1778,6 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, Walk(clauses); EndDeclTypeSpec(); PopScope(); - - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const auto &type = std::get(spec.t); - static llvm::SmallVector defaultNames; - defaultNames.emplace_back( - type.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"); - MakeSymbol(defaultNames.back(), Attrs{}, - MiscDetails{MiscDetails::Kind::ConstructName}); - } } void OmpVisitor::ProcessReductionSpecifier( diff --git a/flang/test/Lower/OpenMP/declare-mapper.f90 b/flang/test/Lower/OpenMP/declare-mapper.f90 index 867b850317e66..8a98c68a8d582 100644 --- a/flang/test/Lower/OpenMP/declare-mapper.f90 +++ b/flang/test/Lower/OpenMP/declare-mapper.f90 @@ -5,6 +5,7 @@ ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-2.f90 -o - | FileCheck %t/omp-declare-mapper-2.f90 ! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-3.f90 -o - | FileCheck %t/omp-declare-mapper-3.f90 ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-4.f90 -o - | FileCheck %t/omp-declare-mapper-4.f90 +! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-5.f90 -o - | FileCheck %t/omp-declare-mapper-5.f90 !--- omp-declare-mapper-1.f90 subroutine declare_mapper_1 @@ -22,7 +23,7 @@ subroutine declare_mapper_1 end type type(my_type2) :: t real :: x, y(nvals) - !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { + !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.omp\.default\.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { !CHECK: ^bb0(%[[VAL_0:.*]]: !fir.ref<[[MY_TYPE]]>): !CHECK: %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFdeclare_mapper_1Evar"} : (!fir.ref<[[MY_TYPE]]>) -> (!fir.ref<[[MY_TYPE]]>, !fir.ref<[[MY_TYPE]]>) !CHECK: %[[VAL_2:.*]] = hlfir.designate %[[VAL_1]]#0{"values"} {fortran_attrs = #fir.var_attrs} : (!fir.ref<[[MY_TYPE]]>) -> !fir.ref>>> @@ -149,7 +150,7 @@ subroutine declare_mapper_4 integer :: num end type - !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] + !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.omp.default.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] !$omp declare mapper (my_type :: var) map (var%num) type(my_type) :: a @@ -171,3 +172,93 @@ subroutine declare_mapper_4 a%num = 40 !$omp end target end subroutine declare_mapper_4 + +!--- omp-declare-mapper-5.f90 +program declare_mapper_5 + implicit none + + type :: mytype + integer :: x, y + end type + + !CHECK: omp.declare_mapper 
@[[INNER_MAPPER_NAMED:_QQFFuse_innermy_mapper]] : [[MY_TYPE:!fir\.type<_QFTmytype\{x:i32,y:i32\}>]] + !CHECK: omp.declare_mapper @[[INNER_MAPPER_DEFAULT:_QQFFuse_innermytype.omp.default.mapper]] : [[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_NAMED:_QQFmy_mapper]] : [[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_DEFAULT:_QQFmytype.omp.default.mapper]] : [[MY_TYPE]] + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + +contains + subroutine use_outer() + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) 
capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine + + subroutine use_inner() + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine +end program declare_mapper_5 diff --git 
a/flang/test/Lower/OpenMP/map-mapper.f90 b/flang/test/Lower/OpenMP/map-mapper.f90 index a511110cb5d18..91564bfc7bc46 100644 --- a/flang/test/Lower/OpenMP/map-mapper.f90 +++ b/flang/test/Lower/OpenMP/map-mapper.f90 @@ -8,7 +8,7 @@ program p !$omp declare mapper(xx : t1 :: nn) map(to: nn, nn%x) !$omp declare mapper(t1 :: nn) map(from: nn) - !CHECK-LABEL: omp.declare_mapper @_QQFt1.default : !fir.type<_QFTt1{x:!fir.array<256xi32>}> + !CHECK-LABEL: omp.declare_mapper @_QQFt1.omp.default.mapper : !fir.type<_QFTt1{x:!fir.array<256xi32>}> !CHECK-LABEL: omp.declare_mapper @_QQFxx : !fir.type<_QFTt1{x:!fir.array<256xi32>}> type(t1) :: a, b @@ -20,7 +20,7 @@ program p end do !$omp end target - !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.default) -> {{.*}} {name = "b"} + !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.omp.default.mapper) -> {{.*}} {name = "b"} !CHECK: omp.target map_entries(%[[MAP_B]] -> %{{.*}}, %{{.*}} -> %{{.*}} : {{.*}}, {{.*}}) { !$omp target map(mapper(default) : b) do i = 1, n diff --git a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 index 407bfd29153fa..30d75d02736f3 100644 --- a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 +++ b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 @@ -7,36 +7,37 @@ program main type ty integer :: x end type ty - + !CHECK: !$OMP DECLARE MAPPER (mymapper:ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) !PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier -!PARSE-TREE: Name = 'mymapper' +!PARSE-TREE: string = 'mymapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> 
Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' -!PARSE-TREE: Name = 'x' +!PARSE-TREE: Name = 'x' !CHECK: !$OMP DECLARE MAPPER (ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(ty :: mapped) map(mapped, mapped%x) - + !PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier +!PARSE-TREE: string = 'ty.omp.default.mapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' !PARSE-TREE: Name = 'x' - + end program main !CHECK-LABEL: end program main diff --git a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 index b6c9c58948fec..baa8b2e08c539 100644 --- a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 +++ b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 @@ -78,7 +78,7 @@ subroutine f02 !PARSE-TREE: | | OmpDirectiveSpecification !PARSE-TREE: | | | OmpDirectiveName -> llvm::omp::Directive = declare mapper !PARSE-TREE: | | | OmpArgumentList -> OmpArgument -> OmpMapperSpecifier -!PARSE-TREE: | | | | Name = 'mymapper' +!PARSE-TREE: | | | | string = 'mymapper' !PARSE-TREE: | | | | TypeSpec -> IntrinsicTypeSpec -> IntegerTypeSpec -> !PARSE-TREE: | | | | Name = 'v' !PARSE-TREE: | | | OmpClauseList -> OmpClause -> Map -> OmpMapClause diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index 0dda5b4456987..06f41ab8ce76f 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -13,7 +13,7 @@ program main !! Note, symbols come out in their respective scope, but not in declaration order. 
!CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x -!CHECK: ty.default: Misc ConstructName +!CHECK: ty.omp.default.mapper: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) >From 22772627df5eff1309bf254df2c3826b3d0380cc Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Tue, 27 May 2025 18:32:14 +0100 Subject: [PATCH 3/3] Addressed reviewer comments. --- flang/lib/Parser/openmp-parsers.cpp | 2 +- flang/lib/Parser/unparse.cpp | 4 ++-- flang/lib/Semantics/resolve-names.cpp | 1 - 3 files changed, 3 insertions(+), 4 deletions(-) diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index a1ed584020677..c08cd1ab80559 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1400,7 +1400,7 @@ static OmpMapperSpecifier ConstructOmpMapperSpecifier( // If the name is missing, use the DerivedTypeSpec name to construct the // default mapper name. 
// This matches the syntax: :: - if (auto *derived = std::get_if(&typeSpec.u)) { + if (DerivedTypeSpec * derived{std::get_if(&typeSpec.u)}) { return OmpMapperSpecifier{ std::get(derived->t).ToString() + ".omp.default.mapper", std::move(typeSpec), std::move(varName)}; diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 1d68e8d8850fa..0784a6703bbde 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2093,7 +2093,7 @@ class UnparseVisitor { Walk(x.v, ","); } void Unparse(const OmpMapperSpecifier &x) { - const auto &mapperName = std::get(x.t); + const auto &mapperName{std::get(x.t)}; if (mapperName.find("omp.default.mapper") == std::string::npos) { Walk(mapperName); Put(":"); @@ -2800,7 +2800,7 @@ class UnparseVisitor { BeginOpenMP(); Word("!$OMP DECLARE MAPPER ("); const auto &spec{std::get(z.t)}; - const auto &mapperName = std::get(spec.t); + const auto &mapperName{std::get(spec.t)}; if (mapperName.find("omp.default.mapper") == std::string::npos) { Walk(mapperName); Put(":"); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 322562b06b87f..57a307f01b24b 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -38,7 +38,6 @@ #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" #include "flang/Support/default-kinds.h" -#include "llvm/ADT/SmallVector.h" #include "llvm/Support/raw_ostream.h" #include #include From flang-commits at lists.llvm.org Tue May 27 10:42:11 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Tue, 27 May 2025 10:42:11 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6835f973.170a0220.212951.bc0e@mx.google.com> https://github.com/kparzysz edited https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Tue May 27 10:42:11 2025 
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Tue, 27 May 2025 10:42:11 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6835f973.170a0220.38e3d1.d48e@mx.google.com> https://github.com/kparzysz approved this pull request. Found a couple more initializations. https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Tue May 27 10:42:11 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Tue, 27 May 2025 10:42:11 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6835f973.170a0220.d1c68.9e1f@mx.google.com> ================ @@ -4005,24 +4007,16 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, lower::StatementContext stmtCtx; const auto &spec = std::get(declareMapperConstruct.t); - const auto &mapperName{std::get>(spec.t)}; + const auto &mapperName{std::get(spec.t)}; const auto &varType{std::get(spec.t)}; const auto &varName{std::get(spec.t)}; assert(varType.declTypeSpec->category() == semantics::DeclTypeSpec::Category::TypeDerived && "Expected derived type"); - std::string mapperNameStr; - if (mapperName.has_value()) { - mapperNameStr = mapperName->ToString(); - mapperNameStr = - converter.mangleName(mapperNameStr, mapperName->symbol->owner()); - } else { - mapperNameStr = - varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"; - mapperNameStr = converter.mangleName( - mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope()); - } + std::string mapperNameStr = mapperName; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperNameStr)) ---------------- kparzysz wrote: Brackets sym{...} https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Tue May 27 10:42:11 2025 From: 
flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Tue, 27 May 2025 10:42:11 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6835f973.170a0220.92d4f.34ec@mx.google.com> ================ @@ -2422,8 +2422,10 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, mlir::FlatSymbolRefAttr mapperId; if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) { auto &typeSpec = sym.GetType()->derivedTypeSpec(); - std::string mapperIdName = typeSpec.name().ToString() + ".default"; - mapperIdName = converter.mangleName(mapperIdName, *typeSpec.GetScope()); + std::string mapperIdName = + typeSpec.name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) ---------------- kparzysz wrote: Brackets: sym{...} https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Tue May 27 10:42:27 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Tue, 27 May 2025 10:42:27 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6835f983.170a0220.e90bf.393e@mx.google.com> https://github.com/kparzysz deleted https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Tue May 27 10:42:34 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Tue, 27 May 2025 10:42:34 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6835f98a.050a0220.d7bd1.0afc@mx.google.com> https://github.com/kparzysz deleted https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Tue May 27 10:42:45 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek 
via flang-commits) Date: Tue, 27 May 2025 10:42:45 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6835f995.170a0220.ca216.16ce@mx.google.com> https://github.com/kparzysz edited https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Tue May 27 11:20:41 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Tue, 27 May 2025 11:20:41 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68360279.170a0220.1dff0f.168a@mx.google.com> https://github.com/mcinally updated https://github.com/llvm/llvm-project/pull/141380 >From 9f8619cb54a3a11e4c90af7f5156141ddc59e4d4 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH 1/2] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this option is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute.
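The validation described above (accept "none" or a base-10 unsigned integer, reject everything else) can be sketched outside of flang roughly as follows. This is only an illustration under stated assumptions: `parsePreferVectorWidth` is a hypothetical helper that is not part of the patch, and it uses `std::from_chars` in place of the `llvm::StringRef::getAsInteger` call the patch itself relies on.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper, not part of flang or the patch: returns the string to
// store for the "prefer-vector-width" attribute, or std::nullopt when the
// value should be diagnosed as invalid.
std::optional<std::string> parsePreferVectorWidth(std::string_view value) {
  if (value.empty())
    return std::nullopt;
  // "none" is passed through unchanged.
  if (value == "none")
    return std::string(value);

  // Otherwise the whole value must be a base-10 unsigned integer.
  unsigned width = 0;
  const char *first = value.data();
  const char *last = first + value.size();
  auto [ptr, ec] = std::from_chars(first, last, width, 10);
  if (ec != std::errc{} || ptr != last)
    return std::nullopt; // not a number, trailing characters, or overflow
  return std::string(value);
}
```

A caller would report an invalid-value diagnostic when the result is empty and otherwise forward the returned string as the attribute value.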
--- clang/include/clang/Driver/Options.td | 2 +- flang/include/flang/Frontend/CodeGenOptions.h | 3 +++ .../include/flang/Optimizer/Transforms/Passes.td | 4 ++++ flang/include/flang/Tools/CrossToolHelpers.h | 3 +++ flang/lib/Frontend/CompilerInvocation.cpp | 14 ++++++++++++++ flang/lib/Frontend/FrontendActions.cpp | 2 ++ flang/lib/Optimizer/Passes/Pipelines.cpp | 2 +- flang/lib/Optimizer/Transforms/FunctionAttr.cpp | 5 +++++ flang/test/Driver/prefer-vector-width.f90 | 16 ++++++++++++++++ mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td | 1 + mlir/lib/Target/LLVMIR/ModuleImport.cpp | 4 ++++ mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | 3 +++ 12 files changed, 57 insertions(+), 2 deletions(-) create mode 100644 flang/test/Driver/prefer-vector-width.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 22261621df092..b0b642796010b 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -5480,7 +5480,7 @@ def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group, " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">, MarshallingInfoStringVector>; def mprefer_vector_width_EQ : Joined<["-"], "mprefer-vector-width=">, Group, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Specifies preferred vector width for auto-vectorization. Defaults to 'none' which allows target specific decisions.">, MarshallingInfoString>; def mstack_protector_guard_EQ : Joined<["-"], "mstack-protector-guard=">, Group, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -53,6 +53,9 @@ class CodeGenOptions : public CodeGenOptionsBase { /// The paths to the pass plugins that were registered using -fpass-plugin. 
std::vector LLVMPassPlugins; + // The prefered vector width, if requested by -mprefer-vector-width. + std::string PreferVectorWidth; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index b251534e1a8f6..2e932d9ad4a26 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -418,6 +418,10 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"preferVectorWidth", "prefer-vector-width", "std::string", + /*default=*/"", + "Set the prefer-vector-width attribute on functions in the " + "module.">, Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"", "Set the tune-cpu attribute on functions in the module.">, ]; diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 118695bbe2626..058024a4a04c5 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; InstrumentFunctionExit = "__cyg_profile_func_exit"; @@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. 
bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for + ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. bool EnableOpenMP = false; ///< Enable OpenMP lowering. std::string InstrumentFunctionEntry = diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..918323d663610 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -309,6 +309,20 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ)) opts.LLVMPassPlugins.push_back(a->getValue()); + // -mprefer_vector_width option + if (const llvm::opt::Arg *a = args.getLastArg( + clang::driver::options::OPT_mprefer_vector_width_EQ)) { + llvm::StringRef s = a->getValue(); + unsigned Width; + if (s == "none") + opts.PreferVectorWidth = "none"; + else if (s.getAsInteger(10, Width)) + diags.Report(clang::diag::err_drv_invalid_value) + << a->getAsString(args) << a->getValue(); + else + opts.PreferVectorWidth = s.str(); + } + // -fembed-offload-object option for (auto *a : args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 38dfaadf1dff9..012d0fdfe645f 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -741,6 +741,8 @@ void CodeGenAction::generateLLVMIR() { config.VScaleMax = vsr->second; } + config.PreferVectorWidth = opts.PreferVectorWidth; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..5c1e1b9f77efb 100644 --- 
a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -354,7 +354,7 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - ""})); + config.PreferVectorWidth, ""})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 43e4c1a7af3cd..13f447cf738b4 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -36,6 +36,7 @@ class FunctionAttrPass : public fir::impl::FunctionAttrBase { approxFuncFPMath = options.approxFuncFPMath; noSignedZerosFPMath = options.noSignedZerosFPMath; unsafeFPMath = options.unsafeFPMath; + preferVectorWidth = options.preferVectorWidth; } FunctionAttrPass() {} void runOnOperation() override; @@ -102,6 +103,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!preferVectorWidth.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, preferVectorWidth)); LLVM_DEBUG(llvm::dbgs() << "=== End " DEBUG_TYPE " ===\n"); } diff --git a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90 new file mode 100644 index 0000000000000..d0f5fd28db826 --- /dev/null +++ b/flang/test/Driver/prefer-vector-width.f90 @@ -0,0 +1,16 @@ +! Test that -mprefer-vector-width works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! RUN: %flang_fc1 -mprefer-vector-width=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! 
RUN: %flang_fc1 -mprefer-vector-width=128 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-128 +! RUN: %flang_fc1 -mprefer-vector-width=256 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-256 +! RUN: not %flang_fc1 -mprefer-vector-width=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INVALID + +subroutine func +end subroutine func + +! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} } +! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } +! CHECK-128: attributes #0 = { "prefer-vector-width"="128" } +! CHECK-256: attributes #0 = { "prefer-vector-width"="256" } +! CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index 6fde45ce5c556..c0324d561b77b 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1893,6 +1893,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$frame_pointer, OptionalAttr:$target_cpu, OptionalAttr:$tune_cpu, + OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index b049064fbd31c..85417da798b22 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, funcOp.setTargetFeaturesAttr( LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width"); + attr.isStringAttribute()) + funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp 
b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index 047e870b7dcd8..2b7f0b11613aa 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1540,6 +1540,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) { if (auto tuneCpu = func.getTuneCpu()) llvmFunc->addFnAttr("tune-cpu", *tuneCpu); + if (auto preferVectorWidth = func.getPreferVectorWidth()) + llvmFunc->addFnAttr("prefer-vector-width", *preferVectorWidth); + if (auto attr = func.getVscaleRange()) llvmFunc->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs( getLLVMContext(), attr->getMinRange().getInt(), >From 5bdf615715733351bae8f959f0a06a8449526bb8 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH 2/2] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this option is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute.
--- flang/lib/Frontend/CompilerInvocation.cpp | 4 ++-- mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll | 9 +++++++++ mlir/test/Target/LLVMIR/prefer-vector-width.mlir | 8 ++++++++ 3 files changed, 19 insertions(+), 2 deletions(-) create mode 100644 mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll create mode 100644 mlir/test/Target/LLVMIR/prefer-vector-width.mlir diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 918323d663610..90a002929eff0 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -313,10 +313,10 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, if (const llvm::opt::Arg *a = args.getLastArg( clang::driver::options::OPT_mprefer_vector_width_EQ)) { llvm::StringRef s = a->getValue(); - unsigned Width; + unsigned width; if (s == "none") opts.PreferVectorWidth = "none"; - else if (s.getAsInteger(10, Width)) + else if (s.getAsInteger(10, width)) diags.Report(clang::diag::err_drv_invalid_value) << a->getAsString(args) << a->getValue(); else diff --git a/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll new file mode 100644 index 0000000000000..831aa57345a3f --- /dev/null +++ b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @prefer_vector_width() +; CHECK: prefer_vector_width = "128" +define void @prefer_vector_width() #0 { + ret void +} + +attributes #0 = { "prefer-vector-width"="128" } diff --git a/mlir/test/Target/LLVMIR/prefer-vector-width.mlir b/mlir/test/Target/LLVMIR/prefer-vector-width.mlir new file mode 100644 index 0000000000000..7410e8139fd31 --- /dev/null +++ b/mlir/test/Target/LLVMIR/prefer-vector-width.mlir @@ -0,0 +1,8 @@ +// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s + +// CHECK: define void @prefer_vector_width() 
#[[ATTRS:.*]] { +// CHECK: attributes #[[ATTRS]] = { "prefer-vector-width"="128" } + +llvm.func @prefer_vector_width() attributes {prefer_vector_width = "128"} { + llvm.return +} From flang-commits at lists.llvm.org Tue May 27 11:22:44 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Tue, 27 May 2025 11:22:44 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <683602f4.170a0220.2dba03.1c03@mx.google.com> ================ @@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, funcOp.setTargetFeaturesAttr( LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width"); + attr.isStringAttribute()) + funcOp.setPreferVectorWidth(attr.getValueAsString()); ---------------- mcinally wrote: Added tests and updated formatting. 👍 👍 https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Tue May 27 11:29:40 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 27 May 2025 11:29:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68360494.a70a0220.293560.2a75@mx.google.com> https://github.com/vzakhari edited https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 27 11:29:40 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 27 May 2025 11:29:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68360494.630a0220.650f2.f768@mx.google.com> ================ @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 
+// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if 
(fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); ---------------- vzakhari wrote: Please shorten this using `user_begin`. https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 27 11:29:40 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 27 May 2025 11:29:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68360494.170a0220.ac4c2.ebd1@mx.google.com> https://github.com/vzakhari commented: Thank you for the patch! https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 27 11:29:40 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 27 May 2025 11:29:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68360494.170a0220.18d6a3.99f3@mx.google.com> ================ @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. 
For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. 
+ if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; 
+ // Make sure the result is always a boxed array by boxing it + // ourselves if need be. + if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + mlir::Value tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); ---------------- vzakhari wrote: I am not sure if it is correct to return `failure` here after the rewrites made above. I think we should not bother about getting rid of the temporary alloca here. It does not seem to be a part of this pass. 
https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 27 11:29:40 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 27 May 2025 11:29:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68360494.170a0220.5b5cb.bc31@mx.google.com> ================ @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. 
+ if (copyIn.getWasCopied().getUses().empty()) ---------------- vzakhari wrote: ```suggestion if (!copyIn.getWasCopied().hasOneUse()) ``` https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 27 11:29:40 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 27 May 2025 11:29:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68360494.630a0220.2bf203.5fd5@mx.google.com> ================ @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. 
+ if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); ---------------- vzakhari wrote: ```suggestion "CopyInOp's WasCopied has no single user"); ``` https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 27 11:29:40 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 27 May 2025 11:29:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68360494.170a0220.1e010e.0f0d@mx.google.com> ================ @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); ---------------- vzakhari wrote: Why do we do this here? I think we should leave it to the bufferization pass. https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Tue May 27 09:38:37 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 27 May 2025 09:38:37 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <6835ea8d.170a0220.27940b.6cc1@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From 8713318d0e0e549805ef4b984f340df435f5035c Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, Destroy, and DescriptorIO. Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. 
The effects of this restructuring on CPU performance are yet to be measured. There is a fast(?) mode in the work queue implementation that causes new work items to be executed to completion immediately upon creation, saving the overhead of actually representing and managing the work queue. This mode can't be used on GPU devices, but it is enabled by default for CPU hosts. It can be disabled easily for debugging and performance testing. --- .../include/flang-rt/runtime/environment.h | 1 + flang-rt/include/flang-rt/runtime/stat.h | 10 +- .../include/flang-rt/runtime/work-queue.h | 538 +++++++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 546 +++++++++------ flang-rt/lib/runtime/derived.cpp | 519 +++++++------- flang-rt/lib/runtime/descriptor-io.cpp | 648 +++++++++++++++++- flang-rt/lib/runtime/descriptor-io.h | 620 +---------------- flang-rt/lib/runtime/environment.cpp | 4 + flang-rt/lib/runtime/namelist.cpp | 1 + flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 153 +++++ flang-rt/unittests/Runtime/ExternalIOTest.cpp | 2 +- flang/docs/Extensions.md | 10 + flang/include/flang/Runtime/assign.h | 2 +- 15 files changed, 1981 insertions(+), 1081 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h index 16258b3bbba9b..87fe1f92ba545 100644 --- a/flang-rt/include/flang-rt/runtime/environment.h +++ b/flang-rt/include/flang-rt/runtime/environment.h @@ -63,6 +63,7 @@ struct ExecutionEnvironment { bool noStopMessage{false}; // NO_STOP_MESSAGE=1 inhibits "Fortran STOP" bool defaultUTF8{false}; // DEFAULT_UTF8 bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION + int internalDebugging{0}; // FLANG_RT_DEBUG // CUDA related variables std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE diff --git 
a/flang-rt/include/flang-rt/runtime/stat.h b/flang-rt/include/flang-rt/runtime/stat.h index 070d0bf8673fb..dc372de53506a 100644 --- a/flang-rt/include/flang-rt/runtime/stat.h +++ b/flang-rt/include/flang-rt/runtime/stat.h @@ -24,7 +24,7 @@ class Terminator; enum Stat { StatOk = 0, // required to be zero by Fortran - // Interoperable STAT= codes + // Interoperable STAT= codes (>= 11) StatBaseNull = CFI_ERROR_BASE_ADDR_NULL, StatBaseNotNull = CFI_ERROR_BASE_ADDR_NOT_NULL, StatInvalidElemLen = CFI_INVALID_ELEM_LEN, @@ -36,7 +36,7 @@ enum Stat { StatMemAllocation = CFI_ERROR_MEM_ALLOCATION, StatOutOfBounds = CFI_ERROR_OUT_OF_BOUNDS, - // Standard STAT= values + // Standard STAT= values (>= 101) StatFailedImage = FORTRAN_RUNTIME_STAT_FAILED_IMAGE, StatLocked = FORTRAN_RUNTIME_STAT_LOCKED, StatLockedOtherImage = FORTRAN_RUNTIME_STAT_LOCKED_OTHER_IMAGE, @@ -49,10 +49,14 @@ enum Stat { // Additional "processor-defined" STAT= values StatInvalidArgumentNumber = FORTRAN_RUNTIME_STAT_INVALID_ARG_NUMBER, StatMissingArgument = FORTRAN_RUNTIME_STAT_MISSING_ARG, - StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, + StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, // -1 StatMoveAllocSameAllocatable = FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE, StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION, + + // Dummy status for work queue continuation, declared here to perhaps + // avoid collisions + StatContinue = 201 }; RT_API_ATTRS const char *StatErrorString(int); diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..2b46890aeebe1 --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,538 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue comprises a list of tickets. Each ticket class has a Begin() +// member function, which is called once, and a Continue() member function +// that can be called zero or more times. A ticket's execution terminates +// when either of these member functions returns a status other than +// StatContinue. When that status is not StatOk, then the whole queue +// is shut down. +// +// By returning StatContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the responsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentsOverElements, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. +// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. 
+// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. Run calls AssignTicket::Begin(), which pushes a ticket via +// BeginFinalize() and returns StatContinue. +// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. 
+ +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/connection.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include + +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; + +// Ticket worker base classes + +template class ImmediateTicketRunner { +public: + RT_API_ATTRS explicit ImmediateTicketRunner(TICKET &ticket) + : ticket_{ticket} {} + RT_API_ATTRS int Run(WorkQueue &workQueue) { + int status{ticket_.Begin(workQueue)}; + while (status == StatContinue) { + status = ticket_.Continue(workQueue); + } + return status; + } + +private: + TICKET &ticket_; +}; + +// Base class for ticket workers that operate elementwise over descriptors +class Elementwise { +protected: + RT_API_ATTRS Elementwise( + const Descriptor &instance, const Descriptor *from = nullptr) + : instance_{instance}, from_{from} { + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + RT_API_ATTRS bool IsComplete() const { return elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that 
operate over derived type components. +class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentsOverElements : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Ticket worker classes + +// Implements derived type instance initialization +class InitializeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket + : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor 
cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket : public ImmediateTicketRunner { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : ImmediateTicketRunner{*this}, to_{to}, from_{&from}, + flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type 
intrinsic assignment. +class DerivedAssignTicket : public ImmediateTicketRunner, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ImmediateTicketRunner{*this}, + ElementsOverComponents{to, derived, &from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { + +template +class DescriptorIoTicket + : public ImmediateTicketRunner>, + private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS bool &anyIoTookPlace() { return anyIoTookPlace_; } + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; + +template +class DerivedIoTicket : public ImmediateTicketRunner>, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} 
{} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; +}; + +} // namespace io::descr + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + io::descr::DescriptorIoTicket, + io::descr::DerivedIoTicket, + io::descr::DerivedIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + // APIs for particular tasks. These can return StatOk if the work is + // completed immediately. 
+ RT_API_ATTRS int BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return InitializeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + if (runTicketsImmediately_) { + return InitializeCloneTicket{clone, original, derived, hasStat, errMsg} + .Run(*this); + } else { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); + return StatContinue; + } + } + RT_API_ATTRS int BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return FinalizeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + if (runTicketsImmediately_) { + return DestroyTicket{descriptor, derived, finalize}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived, finalize); + return StatContinue; + } + } + RT_API_ATTRS int BeginAssign(Descriptor &to, const Descriptor &from, + int flags, MemmoveFct memmoveFct) { + if (runTicketsImmediately_) { + return AssignTicket{to, from, flags, memmoveFct}.Run(*this); + } else { + StartTicket().u.emplace(to, from, flags, memmoveFct); + return StatContinue; + } + } + RT_API_ATTRS int BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) { + if (runTicketsImmediately_) { + return DerivedAssignTicket{ + to, from, derived, flags, memmoveFct, deallocateAfter} + .Run(*this); + } else { + StartTicket().u.emplace( + to, from, derived, flags, 
memmoveFct, deallocateAfter); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDescriptorIo(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DescriptorIoTicket{ + io, descriptor, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, table, anyIoTookPlace); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedIo(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DerivedIoTicket{ + io, descriptor, derived, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + return StatContinue; + } + } + + RT_API_ATTRS int Run(); + +private: +#if RT_DEVICE_COMPILATION + // Always use the work queue on a GPU device to avoid recursion. + static constexpr bool runTicketsImmediately_{false}; +#else + // Avoid the work queue overhead on the host, unless it needs + // debugging, which is so much easier there. + static constexpr bool runTicketsImmediately_{true}; +#endif + + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 86aeeaa88f2d1..1f73e22442731 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncId)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic assignments 
and // those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,373 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + if (workQueue.BeginAssign(to, from, flags, memmoveFct) == StatContinue) { + workQueue.Run(); + } +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + if (int status{workQueue.BeginFinalize(*toDeallocate_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncId))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(newFrom, *derived)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + } + static constexpr int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (int status{workQueue.BeginAssign( + newFrom, *from_, nestedFlags, memmoveFct_)}; + status != StatOk && status != StatContinue) { + return status; } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && 
toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + if (int status{workQueue.BeginFinalize(to_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!toDerived_->noDestructionNeeded()) { + if (int status{ + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + return StatContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); } + return StatOk; } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. 
A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(to_, *toDerived_)}; + status != StatOk) { + return status; + } + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t 
toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); + if (toDerived_) { + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. 
- // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } - } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_( + instance_.ElementComponent(subscripts_, procPtr.offset), + from_->ElementComponent( + fromSubscripts_, procPtr.offset), + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementsOverComponents so that it + // loops over elements at the outer level and over components at the inner. + // This yields less surprising behavior for codes that care due to the use + // of defined assignments when the ordering of their calls matters. + while (!IsComplete()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, *from_, workQueue.terminator(), fromSubscripts_); + Advance(); + if (int status{workQueue.BeginAssign( + toCompDesc, fromCompDesc, flags_, memmoveFct_)}; + status != StatOk) { + return status; + } + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_->Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } + } + toDesc->Deallocate(); + } + Advance(); + } else { + // Allocatable components of the LHS are 
unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + int nestedFlags{flags_ | DeallocateLHS}; + Advance(); + if (int status{workQueue.BeginAssign( + *toDesc, *fromDesc, nestedFlags, memmoveFct_)}; + status != StatOk) { + return status; + } + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +679,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -598,7 +695,6 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
if (var) { diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index 35037036f63e7..8462d0aba1f06 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,195 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitialize(instance, derived)}; + return status == StatContinue ? 
workQueue.Run() : status; +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &comp{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + auto &pptr{*instance_.ElementComponent( + subscripts_, comp.offset)}; + pptr = comp.procInitialization; + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + SkipToNextComponent(); + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t 
bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitialize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitializeClone( + clone, original, derived, hasStat, errMsg)}; + return status == StatContinue ? workQueue.Run() : status; } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. 
- stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. - if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncId), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + if (int status{workQueue.BeginInitialize(cloneDesc, *derived)}; + status != StatOk) { + return status; } } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_)}; + status != StatOk) { + return status; + } + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); + } + } + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + if (workQueue.BeginFinalize(descriptor, derived) == StatContinue) { + workQueue.Run(); } } - return stat; } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +237,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +274,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,87 +298,94 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with the 
same // descriptor so that the rank is preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + if (int status{ + workQueue.BeginFinalize(compDesc, *compDynamicType)}; + status != StatOk) { + return status; } } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginFinalize(compDesc, *compType)}; + status != StatOk) { + return status; } } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == 
typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginFinalize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + const auto &parentType{*finalizableParentType_}; + finalizableParentType_ = nullptr; + // Don't return StatOk here if the nested Finalize is still running; + // it needs this->componentDescriptor_. + return workQueue.BeginFinalize(tmpDesc, parentType); } + return StatOk; } // The order of finalization follows Fortran 2018 7.5.6.2, with @@ -373,51 +394,71 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + if (workQueue.BeginDestroy(descriptor, derived, finalize) == StatContinue) { + workQueue.Run(); + } } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + if (int status{workQueue.BeginFinalize(instance_, derived_)}; + status != StatOk && status != StatContinue) { + return status; + } } + return StatContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. 
// Contrary to finalization, the order of deallocation does not matter. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *d, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || 
componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginDestroy( + compDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..9bf7c193d2d96 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,14 +7,42 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) -Fortran::common::optional DefinedFormattedIo(IoStatementState &io, +static RT_API_ATTRS Fortran::common::optional DefinedFormattedIo(IoStatementState &io, const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special, const SubscriptValue subscripts[]) { @@ -104,8 +132,8 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -152,5 +180,619 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, return handler.GetIoStat() == IostatOk; } +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output. 
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + } + } + return StatOk; +} + +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType * + type{addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } + } + } + if (const typeInfo::SpecialBinding * + special{type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + if (DefinedUnformattedIo(io_, instance_, *type, *special)) { + anyIoTookPlace_ = true; + return StatOk; + } else { + return IostatEnd; + } + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? 
io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. + if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + if constexpr (DIR == Direction::Output) { + return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + return externalUnf + ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elements_ * elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + bool any{false}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + any = FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Complex: + switch (kind) 
{ + case 2: + any = FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + any = FormattedCharacterIO(io_, instance_); + break; + case 2: + any = FormattedCharacterIO(io_, instance_); + break; + case 4: + any = FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + any = FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + any = FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + any = FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding * + binding{derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatContinue; + } + } + if (any) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ while (!IsComplete()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + Advance(); + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element(subscripts_)); + Advance(); + if (int status{workQueue.BeginDerivedIo( + io_, elementDesc, *derived_, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + if (workQueue.BeginDescriptorIo(io, descriptor, table, anyIoTookPlace) == + StatContinue) { + workQueue.Run(); + } + return anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. -// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) 
+#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { 
   RUNTIME_CHECK(terminator, genre_ == Genre::Data);
   EstablishDescriptor(descriptor, container, terminator);
+  std::size_t offset{offset_};
   if (subscripts) {
-    descriptor.set_base_addr(container.Element<char>(subscripts) + offset_);
-  } else {
-    descriptor.set_base_addr(container.OffsetElement() + offset_);
+    offset += container.SubscriptsToByteOffset(subscripts);
   }
+  descriptor.set_base_addr(container.OffsetElement() + offset);
   descriptor.raw().attribute = CFI_attribute_pointer;
 }

diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp
new file mode 100644
index 0000000000000..eee1c551aad6d
--- /dev/null
+++ b/flang-rt/lib/runtime/work-queue.cpp
@@ -0,0 +1,153 @@
+//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "flang-rt/runtime/work-queue.h"
+#include "flang-rt/runtime/environment.h"
+#include "flang-rt/runtime/type-info.h"
+#include "flang/Common/visit.h"
+
+namespace Fortran::runtime {
+
+#if !defined(RT_DEVICE_COMPILATION)
+// FLANG_RT_DEBUG code is disabled when false.
+static constexpr bool enableDebugOutput{false}; +#endif + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (componentAt_ >= components_) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + } + firstFree_ = next; + } +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? 
insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: new ticket\n"); + } +#endif + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), + at->ticket.begun ? "Continue" : "Begin"); + } +#endif + int stat{at->ticket.Continue(*this)}; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: ... stat %d\n", stat); + } +#endif + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp index 3833e48be3dd6..6c148b1de6f82 100644 --- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp +++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp @@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) 
{ io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__); for (int j{1}; j <= 3; ++j) { ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc)) - << "OutputDescriptor() for InquireIoLength"; + << "OutputDescriptor() for InquireIoLength " << j; } ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength"; ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk) diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..32bebc1d866a4 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -843,6 +843,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. + However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. + Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. + A program using defined assignment might be able to detect the difference. 
+ ## De Facto Standard Features * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION From flang-commits at lists.llvm.org Tue May 27 09:43:59 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 27 May 2025 09:43:59 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <6835ebcf.170a0220.2de3bb.6c21@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From 9702af742d5e17469681d028b059ded9a25de6a8 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, Destroy, and DescriptorIO. Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. There is a fast(?) 
mode in the work queue implementation that causes new work items to be executed to completion immediately upon creation, saving the overhead of actually representing and managing the work queue. This mode can't be used on GPU devices, but it is enabled by default for CPU hosts. It can be disabled easily for debugging and performance testing. --- .../include/flang-rt/runtime/environment.h | 1 + flang-rt/include/flang-rt/runtime/stat.h | 10 +- .../include/flang-rt/runtime/work-queue.h | 538 +++++++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 546 +++++++++------ flang-rt/lib/runtime/derived.cpp | 519 +++++++------- flang-rt/lib/runtime/descriptor-io.cpp | 651 +++++++++++++++++- flang-rt/lib/runtime/descriptor-io.h | 620 +---------------- flang-rt/lib/runtime/environment.cpp | 4 + flang-rt/lib/runtime/namelist.cpp | 1 + flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 153 ++++ flang-rt/unittests/Runtime/ExternalIOTest.cpp | 2 +- flang/docs/Extensions.md | 10 + flang/include/flang/Runtime/assign.h | 2 +- 15 files changed, 1983 insertions(+), 1082 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h index 16258b3bbba9b..87fe1f92ba545 100644 --- a/flang-rt/include/flang-rt/runtime/environment.h +++ b/flang-rt/include/flang-rt/runtime/environment.h @@ -63,6 +63,7 @@ struct ExecutionEnvironment { bool noStopMessage{false}; // NO_STOP_MESSAGE=1 inhibits "Fortran STOP" bool defaultUTF8{false}; // DEFAULT_UTF8 bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION + int internalDebugging{0}; // FLANG_RT_DEBUG // CUDA related variables std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE diff --git a/flang-rt/include/flang-rt/runtime/stat.h b/flang-rt/include/flang-rt/runtime/stat.h index 
070d0bf8673fb..dc372de53506a 100644 --- a/flang-rt/include/flang-rt/runtime/stat.h +++ b/flang-rt/include/flang-rt/runtime/stat.h @@ -24,7 +24,7 @@ class Terminator; enum Stat { StatOk = 0, // required to be zero by Fortran - // Interoperable STAT= codes + // Interoperable STAT= codes (>= 11) StatBaseNull = CFI_ERROR_BASE_ADDR_NULL, StatBaseNotNull = CFI_ERROR_BASE_ADDR_NOT_NULL, StatInvalidElemLen = CFI_INVALID_ELEM_LEN, @@ -36,7 +36,7 @@ enum Stat { StatMemAllocation = CFI_ERROR_MEM_ALLOCATION, StatOutOfBounds = CFI_ERROR_OUT_OF_BOUNDS, - // Standard STAT= values + // Standard STAT= values (>= 101) StatFailedImage = FORTRAN_RUNTIME_STAT_FAILED_IMAGE, StatLocked = FORTRAN_RUNTIME_STAT_LOCKED, StatLockedOtherImage = FORTRAN_RUNTIME_STAT_LOCKED_OTHER_IMAGE, @@ -49,10 +49,14 @@ enum Stat { // Additional "processor-defined" STAT= values StatInvalidArgumentNumber = FORTRAN_RUNTIME_STAT_INVALID_ARG_NUMBER, StatMissingArgument = FORTRAN_RUNTIME_STAT_MISSING_ARG, - StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, + StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, // -1 StatMoveAllocSameAllocatable = FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE, StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION, + + // Dummy status for work queue continuation, declared here to perhaps + // avoid collisions + StatContinue = 201 }; RT_API_ATTRS const char *StatErrorString(int); diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..2b46890aeebe1 --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,538 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// Internal runtime utilities for work queues that replace the use of recursion
+// for better GPU device support.
+//
+// A work queue comprises a list of tickets. Each ticket class has a Begin()
+// member function, which is called once, and a Continue() member function
+// that can be called zero or more times. A ticket's execution terminates
+// when either of these member functions returns a status other than
+// StatContinue. When that status is not StatOk, then the whole queue
+// is shut down.
+//
+// By returning StatContinue from its Continue() member function,
+// a ticket suspends its execution so that any nested tickets that it
+// may have created can be run to completion. It is the responsibility
+// of each ticket class to maintain resumption information in its state
+// and manage its own progress. Most ticket classes inherit from
+// class ComponentsOverElements, which implements an outer loop over all
+// components of a derived type, and an inner loop over all elements
+// of a descriptor, possibly with multiple phases of execution per element.
+//
+// Tickets are created by WorkQueue::Begin...() member functions.
+// There is one of these for each "top level" recursive function in the
+// Fortran runtime support library that has been restructured into this
+// ticket framework.
+//
+// When the work queue is running tickets, it always selects the last ticket
+// on the list for execution -- "work stack" might have been a more accurate
+// name for this framework. This ticket may, while doing its job, create
+// new tickets, and since those are pushed after the active one, the first
+// such nested ticket will be the next one executed to completion -- i.e.,
+// the order of nested WorkQueue::Begin...() calls is respected.
+// Note that a ticket's Continue() member function won't be called again
+// until all nested tickets have run to completion and it is once again
+// the last ticket on the queue.
+//
+// Example for an assignment to a derived type:
+// 1. Assign() is called, and its work queue is created. It calls
+//    WorkQueue::BeginAssign() and then WorkQueue::Run().
+// 2. Run calls AssignTicket::Begin(), which pushes a ticket via
+//    BeginFinalize() and returns StatContinue.
+// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called
+//    until one of them returns StatOk, which ends the finalization ticket.
+// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket
+//    and then returns StatOk, which ends the ticket.
+// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin()
+//    and ::Continue() are called until they are done (not StatContinue).
+//    Along the way, it may create nested AssignTickets for components,
+//    and suspend itself so that they may each run to completion.
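The scheduling rule in the header comment just before this point (always run the last ticket on the list; tickets created by the active ticket land right after it, so the first nested ticket runs next) can be illustrated with a small standalone sketch. This is not the flang-rt implementation. `SketchQueue`, `SketchTicket`, `SketchOk`, `SketchContinue`, and `RunSketch` are hypothetical names used only here, and a `std::vector` stands in for the patch's doubly linked `TicketList`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical status codes standing in for StatOk and StatContinue.
enum SketchStat { SketchOk = 0, SketchContinue = 201 };

// A resumable unit of work. In this sketch, a ticket with depth > 0 creates
// two nested tickets when it begins, then finishes when it is resumed.
struct SketchTicket {
  std::string name;
  int depth;
  bool begun;
};

class SketchQueue {
public:
  // While a ticket is running, new tickets are inserted directly after it,
  // so the first nested ticket ends up last on the list. Otherwise they are
  // appended at the end.
  void Start(const std::string &name, int depth) {
    std::size_t pos = running_ ? active_ + 1 : tickets_.size();
    tickets_.insert(tickets_.begin() + static_cast<std::ptrdiff_t>(pos),
        SketchTicket{name, depth, false});
  }

  // Repeatedly runs the last ticket until the list is empty or a ticket
  // returns a status that is neither SketchOk nor SketchContinue.
  int Run(std::vector<std::string> &log) {
    while (!tickets_.empty()) {
      active_ = tickets_.size() - 1;
      running_ = true;
      int stat = Step(log);
      running_ = false;
      if (stat == SketchOk) {
        // The active ticket is complete; nested tickets (if any) sit after it.
        tickets_.erase(tickets_.begin() + static_cast<std::ptrdiff_t>(active_));
      } else if (stat != SketchContinue) {
        tickets_.clear(); // error: shut the whole queue down
        return stat;
      }
    }
    return SketchOk;
  }

private:
  // Plays the role of Ticket::Continue(): Begin on first call, resume after.
  int Step(std::vector<std::string> &log) {
    assert(active_ < tickets_.size());
    // Copy fields first: Start() may reallocate the vector.
    const std::string name = tickets_[active_].name;
    const int depth = tickets_[active_].depth;
    if (!tickets_[active_].begun) {
      tickets_[active_].begun = true;
      log.push_back("begin " + name);
      if (depth == 0) {
        return SketchOk; // leaf ticket: done immediately
      }
      Start(name + ".1", depth - 1);
      Start(name + ".2", depth - 1);
      return SketchContinue; // suspend until nested tickets complete
    }
    log.push_back("end " + name);
    return SketchOk;
  }

  std::vector<SketchTicket> tickets_;
  std::size_t active_ = 0;
  bool running_ = false;
};

// Runs one parent ticket with two nested leaf tickets and returns the trace.
std::vector<std::string> RunSketch() {
  SketchQueue queue;
  std::vector<std::string> log;
  queue.Start("a", 1);
  queue.Run(log);
  return log;
}
```

In this sketch, `RunSketch()` records `begin a`, `begin a.1`, `begin a.2`, `end a`: the parent is resumed only after both nested tickets have finished, and the nested tickets run in the order in which they were created.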
+ +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/connection.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include + +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; + +// Ticket worker base classes + +template class ImmediateTicketRunner { +public: + RT_API_ATTRS explicit ImmediateTicketRunner(TICKET &ticket) + : ticket_{ticket} {} + RT_API_ATTRS int Run(WorkQueue &workQueue) { + int status{ticket_.Begin(workQueue)}; + while (status == StatContinue) { + status = ticket_.Continue(workQueue); + } + return status; + } + +private: + TICKET &ticket_; +}; + +// Base class for ticket workers that operate elementwise over descriptors +class Elementwise { +protected: + RT_API_ATTRS Elementwise( + const Descriptor &instance, const Descriptor *from = nullptr) + : instance_{instance}, from_{from} { + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + RT_API_ATTRS bool IsComplete() const { return elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that 
operate over derived type components. +class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentsOverElements : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Ticket worker classes + +// Implements derived type instance initialization +class InitializeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket + : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor 
cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket : public ImmediateTicketRunner { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : ImmediateTicketRunner{*this}, to_{to}, from_{&from}, + flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type 
intrinsic assignment. +class DerivedAssignTicket : public ImmediateTicketRunner, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ImmediateTicketRunner{*this}, + ElementsOverComponents{to, derived, &from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { + +template +class DescriptorIoTicket + : public ImmediateTicketRunner>, + private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS bool &anyIoTookPlace() { return anyIoTookPlace_; } + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; + +template +class DerivedIoTicket : public ImmediateTicketRunner>, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} 
{} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; +}; + +} // namespace io::descr + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + io::descr::DescriptorIoTicket, + io::descr::DerivedIoTicket, + io::descr::DerivedIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + // APIs for particular tasks. These can return StatOk if the work is + // completed immediately. 
+ RT_API_ATTRS int BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return InitializeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + if (runTicketsImmediately_) { + return InitializeCloneTicket{clone, original, derived, hasStat, errMsg} + .Run(*this); + } else { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); + return StatContinue; + } + } + RT_API_ATTRS int BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return FinalizeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + if (runTicketsImmediately_) { + return DestroyTicket{descriptor, derived, finalize}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived, finalize); + return StatContinue; + } + } + RT_API_ATTRS int BeginAssign(Descriptor &to, const Descriptor &from, + int flags, MemmoveFct memmoveFct) { + if (runTicketsImmediately_) { + return AssignTicket{to, from, flags, memmoveFct}.Run(*this); + } else { + StartTicket().u.emplace(to, from, flags, memmoveFct); + return StatContinue; + } + } + RT_API_ATTRS int BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) { + if (runTicketsImmediately_) { + return DerivedAssignTicket{ + to, from, derived, flags, memmoveFct, deallocateAfter} + .Run(*this); + } else { + StartTicket().u.emplace( + to, from, derived, flags, 
memmoveFct, deallocateAfter); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDescriptorIo(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DescriptorIoTicket{ + io, descriptor, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, table, anyIoTookPlace); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedIo(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DerivedIoTicket{ + io, descriptor, derived, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + return StatContinue; + } + } + + RT_API_ATTRS int Run(); + +private: +#if RT_DEVICE_COMPILATION + // Always use the work queue on a GPU device to avoid recursion. + static constexpr bool runTicketsImmediately_{false}; +#else + // Avoid the work queue overhead on the host, unless it needs + // debugging, which is so much easier there. + static constexpr bool runTicketsImmediately_{true}; +#endif + + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 86aeeaa88f2d1..1f73e22442731 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncId)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic assignments 
and // those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,373 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + if (workQueue.BeginAssign(to, from, flags, memmoveFct) == StatContinue) { + workQueue.Run(); + } +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + if (int status{workQueue.BeginFinalize(*toDeallocate_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncId))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(newFrom, *derived)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + } + static constexpr int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (int status{workQueue.BeginAssign( + newFrom, *from_, nestedFlags, memmoveFct_)}; + status != StatOk && status != StatContinue) { + return status; } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && 
toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + if (int status{workQueue.BeginFinalize(to_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!toDerived_->noDestructionNeeded()) { + if (int status{ + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + return StatContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); } + return StatOk; } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. 
A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(to_, *toDerived_)}; + status != StatOk) { + return status; + } + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t 
toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); + if (toDerived_) { + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. 
- // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } - } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment<char>(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment<char>(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment<char16_t>(to, from, toAt, fromAt, + BlankPadCharacterAssignment<char16_t>(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment<char32_t>(to, from, toAt, fromAt, + BlankPadCharacterAssignment<char32_t>(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element<char>(toAt), from.Element<const char>(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element<char>(toAt), from_->Element<const char>(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_( + instance_.ElementComponent(subscripts_, procPtr.offset), + from_->ElementComponent( + fromSubscripts_, procPtr.offset), + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementComponentBase so that it + // loops over elements at the outer level and over components at the inner. + // This yield less surprising behavior for codes that care due to the use + // of defined assignments when the ordering of their calls matters. + while (!IsComplete()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, *from_, workQueue.terminator(), fromSubscripts_); + Advance(); + if (int status{workQueue.BeginAssign( + toCompDesc, fromCompDesc, flags_, memmoveFct_)}; + status != StatOk) { + return status; + } + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_->Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } + } + toDesc->Deallocate(); + } + Advance(); + } else { + // Allocatable components of the LHS are 
unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + int nestedFlags{flags_ | DeallocateLHS}; + Advance(); + if (int status{workQueue.BeginAssign( + *toDesc, *fromDesc, nestedFlags, memmoveFct_)}; + status != StatOk) { + return status; + } + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +679,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -598,7 +695,6 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
if (var) { diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index 35037036f63e7..8462d0aba1f06 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,195 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitialize(instance, derived)}; + return status == StatContinue ? 
workQueue.Run() : status; +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &comp{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + auto &pptr{*instance_.ElementComponent( + subscripts_, comp.offset)}; + pptr = comp.procInitialization; + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + SkipToNextComponent(); + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t 
bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitialize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitializeClone( + clone, original, derived, hasStat, errMsg)}; + return status == StatContinue ? workQueue.Run() : status; } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. 
- stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. - if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncId), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + if (int status{workQueue.BeginInitialize(cloneDesc, *derived)}; + status != StatOk) { + return status; } } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_)}; + status != StatOk) { + return status; + } + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); + } + } + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + if (workQueue.BeginFinalize(descriptor, derived) == StatContinue) { + workQueue.Run(); } } - return stat; } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +237,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +274,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncId) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,87 +298,94 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with the 
same // descriptor so that the rank is preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + if (int status{ + workQueue.BeginFinalize(compDesc, *compDynamicType)}; + status != StatOk) { + return status; } } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginFinalize(compDesc, *compType)}; + status != StatOk) { + return status; } } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == 
typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginFinalize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + const auto &parentType{*finalizableParentType_}; + finalizableParentType_ = nullptr; + // Don't return StatOk here if the nested Finalize is still running; + // it needs this->componentDescriptor_. + return workQueue.BeginFinalize(tmpDesc, parentType); } + return StatOk; } // The order of finalization follows Fortran 2018 7.5.6.2, with @@ -373,51 +394,71 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + if (workQueue.BeginDestroy(descriptor, derived, finalize) == StatContinue) { + workQueue.Run(); + } } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + if (int status{workQueue.BeginFinalize(instance_, derived_)}; + status != StatOk && status != StatContinue) { + return status; + } } + return StatContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. 
// Contrary to finalization, the order of deallocation does not matter. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *d, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || 
componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginDestroy( + compDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..4aa3640c1ed94 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,15 +7,44 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) -Fortran::common::optional DefinedFormattedIo(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &derived, +static RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( + IoStatementState &io, const Descriptor &descriptor, + const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special, const SubscriptValue subscripts[]) { Fortran::common::optional peek{ @@ -104,8 +133,8 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -152,5 +181,619 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, return handler.GetIoStat() == IostatOk; } +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output. 
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + } + } + return StatOk; +} + +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType * + type{addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } + } + } + if (const typeInfo::SpecialBinding * + special{type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + if (DefinedUnformattedIo(io_, instance_, *type, *special)) { + anyIoTookPlace_ = true; + return StatOk; + } else { + return IostatEnd; + } + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? 
io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. + if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + if constexpr (DIR == Direction::Output) { + return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + return externalUnf + ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elements_ * elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + bool any{false}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + any = FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Complex: + switch (kind) 
{ + case 2: + any = FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + any = FormattedCharacterIO(io_, instance_); + break; + case 2: + any = FormattedCharacterIO(io_, instance_); + break; + case 4: + any = FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + any = FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + any = FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + any = FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding * + binding{derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatContinue; + } + } + if (any) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ while (!IsComplete()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + Advance(); + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element(subscripts_)); + Advance(); + if (int status{workQueue.BeginDerivedIo( + io_, elementDesc, *derived_, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + if (workQueue.BeginDescriptorIo(io, descriptor, table, anyIoTookPlace) == + StatContinue) { + workQueue.Run(); + } + return anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. -// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) 
+#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { 
RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..eee1c551aad6d --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,153 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/environment.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +#if !defined(RT_DEVICE_COMPILATION) +// FLANG_RT_DEBUG code is disabled when false. 
+static constexpr bool enableDebugOutput{false}; +#endif + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (componentAt_ >= components_) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + } + firstFree_ = next; + } +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? 
insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: new ticket\n"); + } +#endif + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), + at->ticket.begun ? "Continue" : "Begin"); + } +#endif + int stat{at->ticket.Continue(*this)}; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: ... stat %d\n", stat); + } +#endif + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp index 3833e48be3dd6..6c148b1de6f82 100644 --- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp +++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp @@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) 
{ io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__); for (int j{1}; j <= 3; ++j) { ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc)) - << "OutputDescriptor() for InquireIoLength"; + << "OutputDescriptor() for InquireIoLength " << j; } ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength"; ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk) diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..32bebc1d866a4 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -843,6 +843,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. + However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. + Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. + A program using defined assignment might be able to detect the difference. 
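For illustration only (not part of the patch): the two loop orders described in the Extensions.md paragraph above can be sketched in a few lines of C++. `Point`, `Trace`, `AssignElementMajor` and `AssignComponentMajor` are hypothetical names invented for this sketch and do not exist in the flang runtime; the trace stands in for whatever side effects a defined assignment might have.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a derived type with two components.
struct Point {
  int x;
  int y;
};

// Records the order in which component assignments happen, the way a
// defined assignment with side effects could observe it.
using Trace = std::vector<std::string>;

static void Note(Trace &trace, char component, std::size_t element) {
  trace.push_back(std::string(1, component) + std::to_string(element));
}

// Element-major order: all components of element j are assigned before
// moving on to element j+1. Assumes lhs.size() == rhs.size().
Trace AssignElementMajor(std::vector<Point> &lhs, const std::vector<Point> &rhs) {
  Trace trace;
  for (std::size_t j = 0; j < rhs.size(); ++j) {
    lhs[j].x = rhs[j].x;
    Note(trace, 'x', j);
    lhs[j].y = rhs[j].y;
    Note(trace, 'y', j);
  }
  return trace;
}

// Component-major order: every element's x is assigned first, then every
// element's y. Assumes lhs.size() == rhs.size().
Trace AssignComponentMajor(std::vector<Point> &lhs, const std::vector<Point> &rhs) {
  Trace trace;
  for (std::size_t j = 0; j < rhs.size(); ++j) {
    lhs[j].x = rhs[j].x;
    Note(trace, 'x', j);
  }
  for (std::size_t j = 0; j < rhs.size(); ++j) {
    lhs[j].y = rhs[j].y;
    Note(trace, 'y', j);
  }
  return trace;
}
```

Both functions leave `lhs` with the same final contents; only the recorded order differs, which is the kind of difference a program using defined assignment might observe.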
+
 ## De Facto Standard Features
 
 * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the
diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h
index bc80997a1bec2..eb1f63184a177 100644
--- a/flang/include/flang/Runtime/assign.h
+++ b/flang/include/flang/Runtime/assign.h
@@ -38,7 +38,7 @@ enum AssignFlags {
   ComponentCanBeDefinedAssignment = 1 << 3,
   ExplicitLengthCharacterLHS = 1 << 4,
   PolymorphicLHS = 1 << 5,
-  DeallocateLHS = 1 << 6
+  DeallocateLHS = 1 << 6,
 };
 
 #ifdef RT_DEVICE_COMPILATION

From flang-commits at lists.llvm.org Tue May 27 10:04:31 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Tue, 27 May 2025 10:04:31 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <6835f09f.170a0220.2cd125.fb6d@mx.google.com>

https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852

>From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 07:58:29 -0500
Subject: [PATCH 01/18] [flang][OpenMP] Mark atomic clauses as unique

The current implementation of the ATOMIC construct handles these clauses
individually, and this change does not have an observable effect. At the
same time these clauses are unique as per the OpenMP spec, and this patch
reflects that in the OMP.td file.
---
 llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index eff6d57995d2b..cdfd3e3223fa8 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> {
   ];
 }
 def OMP_Atomic : Directive<"atomic"> {
-  let allowedClauses = [
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-  ];
   let allowedOnceClauses = [
     VersionedClause,
     VersionedClause,
+    VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
+    VersionedClause,
   ];
   let association = AS_Block;
   let category = CA_Executable;
@@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> {
   let category = CA_Executable;
 }
 def OMP_Critical : Directive<"critical"> {
-  let allowedClauses = [
+  let allowedOnceClauses = [
     VersionedClause,
   ];
   let association = AS_Block;

>From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 10:08:58 -0500
Subject: [PATCH 02/18] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs

The OpenMP implementation of the ATOMIC construct will change in the
near future to accommodate OpenMP 6.0. This patch separates the shared
implementations to avoid interfering with OpenACC.
--- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t 
hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. - auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite, - loc); + genAtomicWrite(converter, atomicWrite, loc); }, [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) { - Fortran::lower::genOmpAccAtomicUpdate< - Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate, - loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const Fortran::parser::AccAtomicCapture &atomicCapture) { - Fortran::lower::genOmpAccAtomicCapture< - Fortran::parser::AccAtomicCapture, void>(converter, - atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, }, atomicConstruct.u); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 312557d5da07e..fdd85e94829f3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +//===----------------------------------------------------------------------===// +// Code generation for atomic operations +//===----------------------------------------------------------------------===// + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. 
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/18] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
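A minimal sketch of the idea described above, assuming nothing about the real flang classes: an UPDATE clause whose argument is optional, so that the same clause type can serve both ATOMIC (no argument) and DEPOBJ (with an argument). All names here (`UpdateClause`, `UpdateArgument`, `describeUpdate`, the enum values) are hypothetical stand-ins for illustration, and treating a missing argument on DEPOBJ as "invalid" is a choice made for this example rather than something the patch states.

```cpp
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the two kinds of argument UPDATE may carry.
enum class DependenceType { Sink, Source };
enum class TaskDependenceType { In, Out, Inout };

using UpdateArgument = std::variant<DependenceType, TaskDependenceType>;

// Hypothetical clause model: the argument is optional, so a bare UPDATE
// can be represented by an empty std::optional.
struct UpdateClause {
  std::optional<UpdateArgument> argument;
};

enum class Directive { Atomic, Depobj };

// Visitor that names which alternative the argument holds.
struct ArgumentNamer {
  std::string operator()(DependenceType) const { return "dependence-type"; }
  std::string operator()(TaskDependenceType) const {
    return "task-dependence-type";
  }
};

// Check for the absent argument first, then dispatch on the alternative.
std::string describeUpdate(Directive dir, const UpdateClause &clause) {
  if (!clause.argument) {
    if (dir == Directive::Atomic)
      return "UPDATE";
    // Illustrative choice: a bare UPDATE is only expected on ATOMIC.
    return "invalid: argument required";
  }
  return "UPDATE(" + std::visit(ArgumentNamer{}, *clause.argument) + ")";
}
```

The order of the checks is the point of the sketch: code that previously visited the variant unconditionally has to test for the empty optional before it touches the argument.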
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/18] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be a parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause, have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed.
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse, and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 |
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. 
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expexcing two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture. 
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expexcing single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point. 
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+        update.cond = maybeCond->rhs;
+      }
+    } else {
+      context_.Say(action.source,
+          "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US);
+    }
+  }
+
+  evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)};
+
+  CheckAtomicConditionalUpdateStmt(update, source);
+  if (IsCheckForAssociated(update.cond)) {
+    if (!IsPointerAssignment(assign)) {
+      context_.Say(source,
+          "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US);
+    }
+  } else {
+    if (IsPointerAssignment(assign)) {
+      context_.Say(source,
+          "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US);
+    }
+  }
+
+  using Analysis = parser::OpenMPAtomicConstruct::Analysis;
+  x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond,
+      MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs),
+      MakeAtomicAnalysisOp(Analysis::None));
+}
+
+void OmpStructureChecker::CheckAtomicUpdateCapture(
+    const parser::OpenMPAtomicConstruct &x, const parser::Block &body,
+    parser::CharBlock source) {
+  if (body.size() != 2) {
+    context_.Say(source,
+        "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US);
+    return;
+  }
+
+  auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)};
+  if (!uec || !cec) {
+    // Diagnostics already emitted.
+    return;
+  }
+  SourcedActionStmt uact{GetActionStmt(uec)};
+  SourcedActionStmt cact{GetActionStmt(cec)};
+  auto maybeUpdate{GetEvaluateAssignment(uact.stmt)};
+  auto maybeCapture{GetEvaluateAssignment(cact.stmt)};
+
+  if (!maybeUpdate || !maybeCapture) {
+    context_.Say(source,
+        "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US);
+    return;
+  }
+
+  const evaluate::Assignment &update{*maybeUpdate};
+  const evaluate::Assignment &capture{*maybeCapture};
+  const SomeExpr &atom{update.lhs};
+
+  using Analysis = parser::OpenMPAtomicConstruct::Analysis;
+  int action;
+
+  if (IsMaybeAtomicWrite(update)) {
+    action = Analysis::Write;
+    CheckAtomicWriteAssignment(update, uact.source);
+  } else {
+    action = Analysis::Update;
+    CheckAtomicUpdateAssignment(update, uact.source);
+  }
+  CheckAtomicCaptureAssignment(capture, atom, cact.source);
+
+  if (IsPointerAssignment(update) != IsPointerAssignment(capture)) {
+    context_.Say(cact.source,
+        "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US);
+    return;
+  }
+
+  if (GetActionStmt(&body.front()).stmt == uact.stmt) {
+    x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
+        MakeAtomicAnalysisOp(action, update.rhs),
+        MakeAtomicAnalysisOp(Analysis::Read, capture.lhs));
+  } else {
+    x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
+        MakeAtomicAnalysisOp(Analysis::Read, capture.lhs),
+        MakeAtomicAnalysisOp(action, update.rhs));
+  }
+}
+
+void OmpStructureChecker::CheckAtomicConditionalUpdateCapture(
+    const parser::OpenMPAtomicConstruct &x, const parser::Block &body,
+    parser::CharBlock source) {
+  // There are two different variants of this:
+  // (1) conditional-update and capture separately:
+  // This form only allows single-statement updates, i.e. the update
+  // form "r = cond; if (r) ...)" is not allowed.
+  // (2) conditional-update combined with capture in a single statement:
+  // This form does allow the condition to be calculated separately,
+  // i.e. "r = cond; if (r) ...".
+  // Regardless of what form it is, the actual update assignment is a
+  // proper write, i.e. "x = d", where d does not depend on x.
+
+  AnalyzedCondStmt update;
+  SourcedActionStmt capture;
+  bool captureAlways{true}, captureFirst{true};
+
+  auto extractCapture{[&]() {
+    capture = update.iff;
+    captureAlways = false;
+    update.iff = SourcedActionStmt{};
+  }};
+
+  auto classifyNonUpdate{[&](const SourcedActionStmt &action) {
+    // The non-update statement is either "r = cond" or the capture.
+    if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) {
+      if (update.cond == maybeAssign->lhs) {
+        // If this is "r = cond; if (r) ...", then update the condition.
+        update.cond = maybeAssign->rhs;
+        update.source = action.source;
+        // In this form, the update and the capture are combined into
+        // an IF-THEN-ELSE statement.
+        extractCapture();
+      } else {
+        // Assume this is the capture-statement.
+        capture = action;
+      }
+    }
+  }};
+
+  if (body.size() == 2) {
+    // This could be
+    // - capture; conditional-update (in any order), or
+    // - r = cond; if (r) capture-update
+    const parser::ExecutionPartConstruct *st1{&body.front()};
+    const parser::ExecutionPartConstruct *st2{&body.back()};
+    // In either case, the conditional statement can be analyzed by
+    // AnalyzeConditionalStmt, whereas the other statement cannot.
+    if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) {
+      update = *maybeUpdate1;
+      classifyNonUpdate(GetActionStmt(st2));
+      captureFirst = false;
+    } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) {
+      update = *maybeUpdate2;
+      classifyNonUpdate(GetActionStmt(st1));
+    } else {
+      // None of the statements are conditional, this rules out the
+      // "r = cond; if (r) ..." and the "capture + conditional-update"
+      // variants.
This could still be capture + write (which is classified
+      // as conditional-update-capture in the spec).
+      auto [uec, cec]{CheckUpdateCapture(st1, st2, source)};
+      if (!uec || !cec) {
+        // Diagnostics already emitted.
+        return;
+      }
+      SourcedActionStmt uact{GetActionStmt(uec)};
+      SourcedActionStmt cact{GetActionStmt(cec)};
+      update.ift = uact;
+      capture = cact;
+      if (uec == st1) {
+        captureFirst = false;
+      }
+    }
+  } else if (body.size() == 1) {
+    if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) {
+      update = *maybeUpdate;
+      // This is the form with update and capture combined into an IF-THEN-ELSE
+      // statement. The capture-statement is always the ELSE branch.
+      extractCapture();
+    } else {
+      goto invalid;
+    }
+  } else {
+    context_.Say(source,
+        "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US);
+    return;
+  invalid:
+    context_.Say(source,
+        "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US);
+    return;
+  }
+
+  // The update must have a form `x = d` or `x => d`.
+  if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) {
+    const SomeExpr &atom{maybeWrite->lhs};
+    CheckAtomicWriteAssignment(*maybeWrite, update.ift.source);
+    if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) {
+      CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source);
+
+      if (IsPointerAssignment(*maybeWrite) !=
+          IsPointerAssignment(*maybeCapture)) {
+        context_.Say(capture.source,
+            "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US);
+        return;
+      }
+    } else {
+      context_.Say(capture.source,
+          "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US);
+      return;
+    }
+  } else {
+    context_.Say(update.ift.source,
+        "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US);
+    return;
+  }
+
+  // update.iff should be empty here, the capture statement should be
+  // stored in "capture".
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+  if (x.IsCompare() || x.IsCapture()) {
+    context_.Say(dirSpec.source,
+        "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US);
+    return;
+  }
+
+  const parser::Block &body{GetInnermostExecPart(block)};
+
+  if (body.size() == 1) {
+    SourcedActionStmt action{GetActionStmt(&body.front())};
+    if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) {
+      const SomeExpr &atom{maybeRead->rhs};
+      CheckAtomicReadAssignment(*maybeRead, action.source);
+
+      using Analysis = parser::OpenMPAtomicConstruct::Analysis;
+      x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
+          MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs),
+          MakeAtomicAnalysisOp(Analysis::None));
+    } else {
+      context_.Say(
+          x.source, "ATOMIC READ operation should be an assignment"_err_en_US);
+    }
+  } else {
+    context_.Say(x.source,
+        "ATOMIC READ operation should have a single statement"_err_en_US);
+  }
+}
+
+void OmpStructureChecker::CheckAtomicWrite(
+    const parser::OpenMPAtomicConstruct &x) {
+  auto &dirSpec{std::get(x.t)};
+  auto &block{std::get(x.t)};
+
+  // Write cannot be conditional or have a capture statement.
+  if (x.IsCompare() || x.IsCapture()) {
+    context_.Say(dirSpec.source,
+        "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US);
+    return;
+  }
+
+  const parser::Block &body{GetInnermostExecPart(block)};
+
+  if (body.size() == 1) {
+    SourcedActionStmt action{GetActionStmt(&body.front())};
+    if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) {
+      const SomeExpr &atom{maybeWrite->lhs};
+      CheckAtomicWriteAssignment(*maybeWrite, action.source);
+
+      using Analysis = parser::OpenMPAtomicConstruct::Analysis;
+      x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
+          MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs),
+          MakeAtomicAnalysisOp(Analysis::None));
+    } else {
+      context_.Say(
+          x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US);
+    }
+  } else {
+    context_.Say(x.source,
+        "ATOMIC WRITE operation should have a single statement"_err_en_US);
+  }
+}
+
+void OmpStructureChecker::CheckAtomicUpdate(
+    const parser::OpenMPAtomicConstruct &x) {
+  auto &block{std::get(x.t)};
+
+  bool isConditional{x.IsCompare()};
+  bool isCapture{x.IsCapture()};
+  const parser::Block &body{GetInnermostExecPart(block)};
+
+  if (isConditional && isCapture) {
+    CheckAtomicConditionalUpdateCapture(x, body, x.source);
+  } else if (isConditional) {
+    CheckAtomicConditionalUpdate(x, body, x.source);
+  } else if (isCapture) {
+    CheckAtomicUpdateCapture(x, body, x.source);
+  } else { // update-only
+    CheckAtomicUpdateOnly(x, body, x.source);
+  }
+}
+
+void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) {
+  // All of the following groups have the "exclusive" property, i.e. at
+  // most one clause from each group is allowed.
+  // The exclusivity-checking code should eventually be unified for all
+  // clauses, with clause groups defined in OMP.td.
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+ auto &dirSpec{std::get(x.t)}; + auto &clauseList{std::get>(dirSpec.t)}; if (clauseList) { - atomicDirectiveDefaultOrderFound_ = true; - switch (*defaultMemOrder) { - case common::OmpMemoryOrderType::Acq_Rel: - clauseList->emplace_back(common::visit( - common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause { - return parser::OmpClause::Acquire{}; - }, - [](parser::OmpAtomicCapture &) -> parser::OmpClause { - return parser::OmpClause::AcqRel{}; - }, - [](auto &) -> parser::OmpClause { - // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite} - return parser::OmpClause::Release{}; - }}, - x.u)); - break; - case common::OmpMemoryOrderType::Relaxed: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::Relaxed{}}); - break; - case common::OmpMemoryOrderType::Seq_Cst: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::SeqCst{}}); - break; - default: - // FIXME: Don't process other values at the moment since their validity - // depends on the OpenMP version (which is unavailable here). - break; + if (findMemOrderClause(*clauseList)) { + return false; } + } else { + clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){}); + } + + // Add a memory order clause to the atomic directive. + atomicDirectiveDefaultOrderFound_ = true; + switch (*defaultMemOrder) { + case common::OmpMemoryOrderType::Acq_Rel: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}}); + break; + case common::OmpMemoryOrderType::Relaxed: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}}); + break; + case common::OmpMemoryOrderType::Seq_Cst: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}}); + break; + default: + // FIXME: Don't process other values at the moment since their validity + // depends on the OpenMP version (which is unavailable here). 
+ break; } return false; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 index b82bd13622764..6f58e0939a787 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 index 88ec6fe910b9e..6729be6e5cf8b 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b8422c8f3b769 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture() !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr !CHECK: omp.atomic.capture { !CHECK: omp.atomic.update 
%[[VAL_A_BOX_ADDR]] : !fir.ptr { !CHECK: ^bb0(%[[ARG:.*]]: i32): !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32 !CHECK: omp.yield(%[[TEMP]] : i32) !CHECK: } -!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 +!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 !CHECK: } !CHECK: return !CHECK: } diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90 index 13392ad76471f..6eded49b0b15d 100644 --- a/flang/test/Lower/OpenMP/atomic-write.f90 +++ b/flang/test/Lower/OpenMP/atomic-write.f90 @@ -44,9 +44,9 @@ end program OmpAtomicWrite !CHECK-LABEL: func.func @_QPatomic_write_pointer() { !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"} !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) -!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32 !CHECK: %[[C2:.*]] = arith.constant 2 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90 deleted file mode 100644 index 5cd02698ff482..0000000000000 --- a/flang/test/Parser/OpenMP/atomic-compare.f90 +++ /dev/null @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s -! OpenMP version for documentation purposes only - it isn't used until Sema. -! This is testing for Parser errors that bail out before Sema. 
-program main - implicit none - integer :: i, j = 10 - logical :: r - - !CHECK: error: expected OpenMP construct - !$omp atomic compare write - r = i .eq. j + 1 - - !CHECK: error: expected end of line - !$omp atomic compare num_threads(4) - r = i .eq. j -end program main diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90 index 54492bf6a22a6..11e23e062bce7 100644 --- a/flang/test/Semantics/OpenMP/atomic-compare.f90 +++ b/flang/test/Semantics/OpenMP/atomic-compare.f90 @@ -44,46 +44,37 @@ !$omp end atomic ! Check for error conditions: - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic compare seq_cst seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst compare seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire compare if (b .eq. 
c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic compare acquire acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire compare acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic compare relaxed relaxed if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed compare relaxed if (b .eq. c) b = a - !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one FAIL clause can appear on the ATOMIC directive !$omp atomic fail(release) compare fail(release) if (c .eq. 
a) a = b !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index c13a11a8dd5dc..deb67e7614659 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -16,20 +16,21 @@ program sample !$omp atomic read hint(2) y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(3) y = y + 10 !$omp atomic update hint(5) y = x + y - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement y = x x = y !$omp end atomic - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic update hint(x) y = y * 1 @@ -46,7 +47,7 @@ program sample !$omp atomic hint(omp_lock_hint_speculative) x = y + x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read y = x @@ -69,36 +70,36 @@ program sample !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative) x = y + x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = y * 9 - !ERROR: Hint clause must have non-negative 
constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp atomic hint(1.0) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp atomic hint(z + omp_sync_hint_nonspeculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(k + omp_sync_hint_speculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write x = 10 * y !$omp atomic write hint(a) - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and y+x access the same storage x = y + x !$omp atomic hint(abs(-1)) write diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 new file mode 100644 index 0000000000000..2619d235380f8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -0,0 +1,89 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC READ. Expect no diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v = x + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v => x +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! 
Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic read + v = p(i) +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic read + !ERROR: Atomic variable x should be a scalar + v = x + + !$omp atomic read + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + v = y(2:4) +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic read + !ERROR: Within atomic operation x and x access the same storage + x = x + + ! Accessing same array, but not the same storage. Expect no diagnostics. + !$omp atomic read + y(1) = y(2) +end + +subroutine f05 + integer :: x, v + + !$omp atomic read + !ERROR: Atomic expression x+1_4 should be a variable + v = x + 1 +end + +subroutine f06 + character :: x, v + + !$omp atomic read + !ERROR: Atomic variable x cannot have CHARACTER type + v = x +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic read + !ERROR: Atomic variable x cannot be ALLOCATABLE + v = x +end + diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 new file mode 100644 index 0000000000000..64e31a7974b55 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -0,0 +1,77 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !$omp atomic update capture + x = v + x = x + 1 + y = x + !$omp end atomic +end + +subroutine f01 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments + !$omp atomic update capture + x = v + block + x = x + 1 + y = x + end block + !$omp end atomic +end + +subroutine f02 + integer :: x, y + + ! The update and capture statements can be inside of a single BLOCK. + ! The end-directive is then optional. Expect no diagnostics. 
+ !$omp atomic update capture + block + x = x + 1 + y = x + end block +end + +subroutine f03 + integer :: x + + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !$omp atomic update capture + x = x + 1 + x = x + 2 + !$omp end atomic +end + +subroutine f04 + integer :: x, v + + !$omp atomic update capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + v = x + x = v + !$omp end atomic +end + +subroutine f05 + integer :: x, v, z + + !$omp atomic update capture + v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + !$omp end atomic +end + +subroutine f06 + integer :: x, v, z + + !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + v = x + !$omp end atomic +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 new file mode 100644 index 0000000000000..0aee02d429dc0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -0,0 +1,83 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y + + ! The x is a direct argument of the + operator. Expect no diagnostics. + !$omp atomic update + x = x + (y - 1) +end + +subroutine f01 + integer :: x + + ! x + 0 is unusual, but legal. Expect no diagnostics. + !$omp atomic update + x = x + 0 +end + +subroutine f02 + integer :: x + + ! This is formally not allowed by the syntax restrictions of the spec, + ! but it's equivalent to either x+0 or x*1, both of which are legal. + ! Allow this case. Expect no diagnostics. 
+ !$omp atomic update + x = x +end + +subroutine f03 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator + x = (x + y) + 1 +end + +subroutine f04 + integer :: x + real :: y + + !$omp atomic update + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation + x = floor(x + y) +end + +subroutine f05 + integer :: x + real :: y + + !$omp atomic update + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + x = int(x + y) +end + +subroutine f06 + integer :: x, y + interface + function f(i, j) + integer :: f, i, j + end + end interface + + !$omp atomic update + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation + x = f(x, y) +end + +subroutine f07 + real :: x + integer :: y + + !$omp atomic update + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation + x = x ** y +end + +subroutine f08 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should appear as an argument in the update operation + x = y +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 index 21a9b87d26345..3084376b4275d 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 @@ -22,10 +22,10 @@ program sample x = x / y !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. y !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. 
y end program diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 new file mode 100644 index 0000000000000..979568ebf7140 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -0,0 +1,81 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics. + !$omp atomic write + x = v + 1 + + !$omp atomic write + x = v + 3 + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic write + x = 2 * v + 3 + + !$omp atomic write + x => v +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic write + p(i) = v +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic write + !ERROR: Atomic variable x should be a scalar + x = v + + !$omp atomic write + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + y(2:4) = v +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic write + !ERROR: Within atomic operation x and x+1_4 access the same storage + x = x + 1 + + ! Accessing same array, but not the same storage. Expect no diagnostics. 
+ !$omp atomic write + y(1) = y(2) +end + +subroutine f06 + character :: x, v + + !$omp atomic write + !ERROR: Atomic variable x cannot have CHARACTER type + x = v +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic write + !ERROR: Atomic variable x cannot be ALLOCATABLE + x = v +end + diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0e100871ea9b4..0dc8251ae356e 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! Check OpenMP 2.13.6 atomic Construct @@ -11,9 +11,13 @@ a = b !$omp end atomic + !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED) a = b + !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write a = b @@ -22,39 +26,32 @@ a = a + 1 !$omp end atomic + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic hint(1) acq_rel capture b = a a = a + 1 !$omp end atomic - !ERROR: expected end of line + !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct !$omp atomic read write + !ERROR: Atomic expression a+1._4 should be a variable a = a + 1 !$omp atomic a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 
'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic num_threads(4) a = a + 1 - !ERROR: expected end of line + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic capture num_threads(4) a = a + 1 + !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic relaxed a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' - !$omp atomic num_threads write - a = a + 1 - !$omp end parallel end diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90 index 173effe86b69c..f700c381cadd0 100644 --- a/flang/test/Semantics/OpenMP/atomic01.f90 +++ b/flang/test/Semantics/OpenMP/atomic01.f90 @@ -14,322 +14,277 @@ ! At most one memory-order-clause may appear on the construct. 
!READ - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic read seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst read seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic read acquire acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire read acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the 
READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic read relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed read relaxed i = j !UPDATE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic update seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst update seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The 
atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic update release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release update release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic update relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic 
relaxed update relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !CAPTURE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic capture seq_cst seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst capture seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic capture release release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release capture release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not 
allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic capture relaxed relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed capture relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel acq_rel capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic capture acq_rel acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel capture acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire capture i = j j = k !$omp 
end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic capture acquire acquire i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire capture acquire i = j j = k !$omp end atomic !WRITE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic write seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst write seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic write release release i = j - !ERROR: More than one memory order clause not allowed on 
OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release write release i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic write relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed write relaxed i = j !No atomic-clause - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as 
an argument in the update operation i = j ! 2.17.7.3 ! At most one hint clause may appear on the construct. - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At 
most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint 
(omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative) i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j j = k @@ -337,34 +292,26 @@ ! 2.17.7.4 ! If atomic-clause is read then memory-order-clause must not be acq_rel or release. - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic acq_rel read i = j - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read acq_rel i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic release read i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read release i = j ! 2.17.7.5 ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire. 
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acq_rel write i = j - !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acq_rel i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acquire write i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acquire i = j @@ -372,33 +319,27 @@ ! 2.17.7.6 ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire. - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acq_rel update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acquire update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive !$omp atomic acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed on the 
ATOMIC directive !$omp atomic acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j end program diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90 index c66085d00f157..45e41f2552965 100644 --- a/flang/test/Semantics/OpenMP/atomic02.f90 +++ b/flang/test/Semantics/OpenMP/atomic02.f90 @@ -28,36 +28,29 @@ program OmpAtomic !$omp atomic a = a/(b + 1) !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: The atomic variable c should appear as an argument in the update operation c = d !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. 
b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The /= operator is not a valid ATOMIC UPDATE operation l = a .NE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic m = m .AND. n @@ -76,32 +69,26 @@ program OmpAtomic !$omp atomic update a = a/(b + 1) !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: This is not a valid ATOMIC UPDATE operation c = c//d !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. 
b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic update m = m .AND. n diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index 76367495b9861..f5c189fd05318 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -25,28 +25,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur 
exactly once among the arguments of the top-level MAX operator z = MAX(y, 7, b, c) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8, a, d) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -60,26 +58,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX 
operator z = MAX(y, 7) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = MOD(y, 9) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation x = ABS(x) end program OmpAtomic @@ -92,7 +90,7 @@ subroutine conflicting_types() type(simple) ::s z = 1 !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(s%z, 4) end subroutine @@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) :: s !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator a = min(a, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a) !$omp atomic - !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)` a = min(b, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a, b) 
!$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator y = min(z, x) !$omp atomic z = max(z, y) !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k' + !ERROR: Atomic variable k should be a scalar + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation k = max(x, y) - + !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement x = min(x, k) !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - z =z + s%m + z = z + s%m end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index a9644ad95aa30..5c91ab5dc37e4 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -20,12 +20,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x 
= 1 + y !$omp atomic @@ -33,12 +31,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic @@ -46,12 +42,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic @@ -59,12 +53,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or 
explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic @@ -72,8 +64,7 @@ program OmpAtomic !$omp atomic m = n .AND. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic @@ -81,8 +72,7 @@ program OmpAtomic !$omp atomic m = n .OR. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic @@ -90,8 +80,7 @@ program OmpAtomic !$omp atomic m = n .EQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic @@ -99,8 +88,7 @@ program OmpAtomic !$omp atomic m = n .NEQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. 
l !$omp atomic update @@ -108,12 +96,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 + y !$omp atomic update @@ -121,12 +107,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic update @@ -134,12 +118,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' 
expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic update @@ -147,12 +129,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic update @@ -160,8 +140,7 @@ program OmpAtomic !$omp atomic update m = n .AND. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic update @@ -169,8 +148,7 @@ program OmpAtomic !$omp atomic update m = n .OR. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic update @@ -178,8 +156,7 @@ program OmpAtomic !$omp atomic update m = n .EQV. 
m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic update @@ -187,8 +164,7 @@ program OmpAtomic !$omp atomic update m = n .NEQV. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. l end program OmpAtomic @@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) p !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement x = x !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 1 !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and a*b access the same storage a = a * b + a !$omp atomic - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator a = b * (a + 9) !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (a+b) access the same storage a = a * (a + b) !$omp atomic - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (b+a) access the same storage a = (b + a) * a !$omp atomic - !ERROR: Atomic update statement should 
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(7) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(x) y = 2 @@ -54,7 +54,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint) y = 2 @@ -84,35 +84,35 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended) y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp critical (name) hint(1.0) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp critical (name) hint(z + omp_sync_hint_nonspeculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(k + omp_sync_hint_speculative) y = 2 !$omp end critical 
(name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended) y = 2 diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 505cbc48fef90..677b933932b44 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -20,70 +20,64 @@ program sample !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar v = y(1:3) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression x*(10_4+x) should be a variable v = x * (10 + x) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression 4_4 should be a variable v = 4 !$omp atomic read - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE v = k !$omp atomic write - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = x !$omp atomic update - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = k + x * (v * x) !$omp atomic - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = v * k !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'z%y' + !ERROR: Within atomic operation z%y and x+z%y access the same storage z%y = x + z%y !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + 
!ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage m = min(m, x, z%m) + k !$omp atomic read - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Atomic expression min(m,x,z%m)+k should be a variable m = min(m, x, z%m) + k !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar x = a - !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - a = x - !$omp atomic write !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement x = a !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar a = x !$omp atomic capture @@ -93,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = b + (x*1) !$omp end atomic @@ -103,60 +97,58 @@ program sample !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with 
CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct x = x + 10 v = b !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt] v = 1 x = 4 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct x = z%y + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 x = z%y !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct x = y(2) + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 x = y(2) !$omp end atomic !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable r cannot have CHARACTER type l = r !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable l cannot have CHARACTER type l = r end program diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90 index ae9fd086015dd..e8817c3f5ef61 100644 --- 
a/flang/test/Semantics/OpenMP/requires-atomic01.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> SeqCst !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> SeqCst !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> SeqCst !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> SeqCst !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> SeqCst !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90 index 4976a9667eb78..a3724a83456fd 100644 --- a/flang/test/Semantics/OpenMP/requires-atomic02.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> AcqRel !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! 
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> AcqRel !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> AcqRel !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! 
CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> AcqRel !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> AcqRel !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Capture + ! 
CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 >From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:41:16 -0500 Subject: [PATCH 05/18] Restart build >From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:45:28 -0500 Subject: [PATCH 06/18] Restart build >From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 07:54:54 -0500 Subject: [PATCH 07/18] Restart build >From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 08:18:01 -0500 Subject: [PATCH 08/18] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed --- flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +- flang/test/Semantics/OpenMP/atomic.f90 | 2 ++ 5 files changed, 6 insertions(+), 4 deletions(-) diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 index 2619d235380f8..73899a9ff37f2 100644 --- a/flang/test/Semantics/OpenMP/atomic-read.f90 +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index 64e31a7974b55..c427ba07d43d8 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python 
%S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 0aee02d429dc0..4595e02d01456 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 index 979568ebf7140..7965ad2dc7dbf 100644 --- a/flang/test/Semantics/OpenMP/atomic-write.f90 +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0dc8251ae356e..2caa161507d49 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,3 +1,5 @@ +! REQUIRES: openmp_runtime + ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! 
Check OpenMP 2.13.6 atomic Construct >From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 09:06:20 -0500 Subject: [PATCH 09/18] Fix examples --- flang/examples/FeatureList/FeatureList.cpp | 10 ------- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++-------------- 2 files changed, 7 insertions(+), 29 deletions(-) diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp index d1407cf0ef239..a36b8719e365d 100644 --- a/flang/examples/FeatureList/FeatureList.cpp +++ b/flang/examples/FeatureList/FeatureList.cpp @@ -445,13 +445,6 @@ struct NodeVisitor { READ_FEATURE(ObjectDecl) READ_FEATURE(OldParameterStmt) READ_FEATURE(OmpAlignedClause) - READ_FEATURE(OmpAtomic) - READ_FEATURE(OmpAtomicCapture) - READ_FEATURE(OmpAtomicCapture::Stmt1) - READ_FEATURE(OmpAtomicCapture::Stmt2) - READ_FEATURE(OmpAtomicRead) - READ_FEATURE(OmpAtomicUpdate) - READ_FEATURE(OmpAtomicWrite) READ_FEATURE(OmpBeginBlockDirective) READ_FEATURE(OmpBeginLoopDirective) READ_FEATURE(OmpBeginSectionsDirective) @@ -480,7 +473,6 @@ struct NodeVisitor { READ_FEATURE(OmpIterationOffset) READ_FEATURE(OmpIterationVector) READ_FEATURE(OmpEndAllocators) - READ_FEATURE(OmpEndAtomic) READ_FEATURE(OmpEndBlockDirective) READ_FEATURE(OmpEndCriticalDirective) READ_FEATURE(OmpEndLoopDirective) @@ -566,8 +558,6 @@ struct NodeVisitor { READ_FEATURE(OpenMPDeclareTargetConstruct) READ_FEATURE(OmpMemoryOrderType) READ_FEATURE(OmpMemoryOrderClause) - READ_FEATURE(OmpAtomicClause) - READ_FEATURE(OmpAtomicClauseList) READ_FEATURE(OmpAtomicDefaultMemOrderClause) READ_FEATURE(OpenMPFlushConstruct) READ_FEATURE(OpenMPLoopConstruct) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index dbbf86a6c6151..18a4107f9a9d1 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ 
b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) { // the directive field. [&](const auto &c) -> SourcePosition { const CharBlock &source{std::get<0>(c.t).source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPAtomicConstruct &c) -> SourcePosition { - return std::visit( - [&](const auto &o) -> SourcePosition { - const CharBlock &source{std::get(o.t).source}; - return parsing->allCooked() - .GetSourcePositionRange(source) - ->first; - }, - c.u); + const CharBlock &source{c.source}; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPSectionConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPUtilityConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, }, c.u); @@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - return std::visit( - [&](const auto &c) { - // Get source from the verbatim fields - const CharBlock &source{std::get(c.t).source}; - return "atomic-" + - normalize_construct_name(source.ToString()); - }, - c.u); + const CharBlock &source{c.source}; + return normalize_construct_name(source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; >From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:19 -0500 Subject: 
[PATCH 10/18] Fix example --- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++-- flang/test/Examples/omp-atomic.f90 | 16 +++++++++++----- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index 18a4107f9a9d1..b0964845d3b99 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - const CharBlock &source{c.source}; - return normalize_construct_name(source.ToString()); + auto &dirSpec = std::get(c.t); + auto &dirName = std::get(dirSpec.t); + return normalize_construct_name(dirName.source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90 index dcca34b633a3e..934f84f132484 100644 --- a/flang/test/Examples/omp-atomic.f90 +++ b/flang/test/Examples/omp-atomic.f90 @@ -26,25 +26,31 @@ ! CHECK:--- ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 9 -! CHECK-NEXT: construct: atomic-read +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: -! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: - clause: read ! CHECK-NEXT: details: '' +! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 12 -! CHECK-NEXT: construct: atomic-write +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: ! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' +! CHECK-NEXT: - clause: write ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 16 -! CHECK-NEXT: construct: atomic-capture +! 
CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: +! CHECK-NEXT: - clause: capture +! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;' ! CHECK-NEXT: - clause: seq_cst ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 21 -! CHECK-NEXT: construct: atomic-atomic +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: [] ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 8 >From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:47 -0500 Subject: [PATCH 11/18] Remove reference to *nullptr --- flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ab868df76d298..4d9b375553c1e 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); - mlir::Operation *secondOp = genAtomicOperation( - converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, - *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + mlir::Operation *secondOp = nullptr; + if (analysis.op1.what != analysis.None) { + secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, + atomAddr, atom, *get(analysis.op1.expr), + hint, memOrder, atomicAt, prepareAt); + } if (secondOp) { builder.setInsertionPointAfter(secondOp); >From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 2 May 2025 18:49:05 -0500 Subject: [PATCH 12/18] Updates and improvements --- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++-- flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++---- 
flang/lib/Semantics/check-omp-structure.h | 1 + .../Todo/atomic-capture-implicit-cast.f90 | 48 --- .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 - .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +- .../OpenMP/atomic-update-capture.f90 | 8 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +- 9 files changed, 381 insertions(+), 180 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 77f57b1cb85c7..8213fe33edbd0 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct { struct Op { int what; - TypedExpr expr; + AssignmentStmt::TypedAssignment assign; }; TypedExpr atom, cond; Op op0, op1; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6177b59199481..7b6c22095d723 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter, static mlir::Operation * // genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Value storeAddr = + 
fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Type storeType = fir::unwrapRefType(storeAddr.getType()); + + mlir::Value toAddr = [&]() { + if (atomType == storeType) + return storeAddr; + return builder.createTemporary(loc, atomType, ".tmp.atomval"); + }(); builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + + if (atomType != storeType) { + lower::ExprToValueMap overrides; + // The READ operation could be a part of UPDATE CAPTURE, so make sure + // we don't emit extra code into the body of the atomic op. + builder.restoreInsertionPoint(postAt); + mlir::Value load = builder.create(loc, toAddr); + overrides.try_emplace(&atom, load); + + converter.overrideExprValues(&overrides); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); + converter.resetExprOverrides(); + + builder.create(loc, value, storeAddr); + } builder.restoreInsertionPoint(saved); return op; } @@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, static mlir::Operation * // genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value value = 
fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); mlir::Value converted = builder.createConvert(loc, atomType, value); @@ -2719,19 +2746,20 @@ static mlir::Operation * genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrResizeOf(arg, atom)) { @@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, converter.overrideExprValues(&overrides); mlir::Value updated = - fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Value converted = builder.createConvert(loc, atomType, updated); builder.create(loc, converted); converter.resetExprOverrides(); @@ -2764,20 +2792,21 @@ static mlir::Operation * genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, 
lower::StatementContext &stmtCtx, int action, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: - return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Write: - return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Update: - return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign, + hint, memOrder, preAt, atomicAt, postAt); default: return nullptr; } @@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { } return ""s; }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; const SomeExpr &atom = *analysis.atom.get()->v; @@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; llvm::errs() << " op0 {\n"; llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; - 
llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << " op1 {\n"; llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; - llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << "}\n"; } @@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAtomicConstruct &construct) { - auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { - if (auto *maybe = expr.get(); maybe && maybe->v) { + auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) { + if (auto *maybe = typedWrapper.get(); maybe && maybe->v) { return &*maybe->v; } else { return nullptr; @@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, int action0 = analysis.op0.what & analysis.Action; int action1 = analysis.op1.what & analysis.Action; mlir::Operation *captureOp = nullptr; - fir::FirOpBuilder::InsertPoint atomicAt; - fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint atomicAt, postAt; if (construct.IsCapture()) { // Capturing operation. @@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, captureOp = builder.create(loc, hint, memOrder); // Set the non-atomic insertion point to before the atomic.capture. 
- prepareAt = getInsertionPointBefore(captureOp); + preAt = getInsertionPointBefore(captureOp); mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); builder.setInsertionPointToEnd(block); @@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, // atomic.capture. mlir::Operation *term = builder.create(loc); atomicAt = getInsertionPointBefore(term); + postAt = getInsertionPointAfter(captureOp); hint = nullptr; memOrder = nullptr; } else { @@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(action0 != analysis.None && action1 == analysis.None && "Expexcing single action"); assert(!(analysis.op0.what & analysis.Condition)); - atomicAt = prepareAt; + postAt = atomicAt = preAt; } mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, - *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); mlir::Operation *secondOp = nullptr; if (analysis.op1.what != analysis.None) { secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, - atomAddr, atom, *get(analysis.op1.expr), - hint, memOrder, atomicAt, prepareAt); + atomAddr, atom, *get(analysis.op1.assign), + hint, memOrder, preAt, atomicAt, postAt); } if (secondOp) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 201b38bd05ff3..f7753a5e5cc59 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } -static bool IsVarOrFunctionRef(const SomeExpr &expr) { - return evaluate::UnwrapProcedureRef(expr) != nullptr || - evaluate::IsVariable(expr); +static bool 
IsVarOrFunctionRef(const MaybeExpr &expr) { + if (expr) { + return evaluate::UnwrapProcedureRef(*expr) != nullptr || + evaluate::IsVariable(*expr); + } else { + return false; + } } static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { @@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource( namespace atomic { +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + enum class Operator { Unk, // Operators that are officially allowed in the update operation @@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (Append(v, std::move(results)), ...); + (MoveAppend(v, std::move(results)), ...); return v; } +}; -private: - static void Append(Result &acc, Result &&data) { - for (auto &&s : data) { - acc.push_back(std::move(s)); +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). 
+ return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + auto copy{x.derived()}; + return {evaluate::AsGenericExpr(std::move(copy)), {}}; } } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; }; } // namespace atomic @@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } +static MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { // Both expr and x have the form of SomeType(SomeKind(...)[1]). // Check if expr is @@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { } } +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { if (value) { expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), @@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { } } +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( - int what, const MaybeExpr &maybeExpr = std::nullopt) { + int what, + const std::optional &maybeAssign = std::nullopt) { parser::OpenMPAtomicConstruct::Analysis::Op operation; operation.what = what; - SetExpr(operation.expr, maybeExpr); + SetAssignment(operation.assign, maybeAssign); return operation; } @@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( // }; // struct Op { // int what; - // TypedExpr expr; + // TypedAssignment assign; // }; // TypedExpr atom, cond; // Op op0, op1; @@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, } } +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + /// Check 
if `expr` satisfies the following conditions for x and v: /// /// [6.0:189:10-12] @@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture( // // The two allowed cases are: // x = ... atomic-var = ... - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // or - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // x = ... atomic-var = ... // // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture @@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture( // // If the two statements don't fit these criteria, return a pair of default- // constructed values. + using ReturnTy = std::pair; SourcedActionStmt act1{GetActionStmt(ec1)}; SourcedActionStmt act2{GetActionStmt(ec2)}; @@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture( auto isUpdateCapture{ [](const evaluate::Assignment &u, const evaluate::Assignment &c) { - return u.lhs == c.rhs; + return IsSameOrConvertOf(c.rhs, u.lhs); }}; // Do some checks that narrow down the possible choices for the update // and the capture statements. This will help to emit better diagnostics. - bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); - bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; + + auto errorCaptureShouldRead{[&](const parser::CharBlock &source, + const std::string &expr) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US, + expr); + }}; - if (couldBeCapture1) { - if (couldBeCapture2) { - if (isUpdateCapture(as2, as1)) { - if (isUpdateCapture(as1, as2)) { - // If both statements could be captures and both could be updates, - // emit a warning about the ambiguity. - context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); - } - return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); - } else if (isUpdateCapture(as1, as2)) { + auto errorNeitherWorks{[&]() { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US); + }}; + + auto makeSelectionFromDet{[&](int det) -> ReturnTy { + // If det != 0, then the checks unambiguously suggest a specific + // categorization. + // If det == 0, then this function should be called only if the + // checks haven't ruled out any possibility, i.e. when both assigments + // could still be either updates or captures. 
+ if (det > 0) { + // as1 is update, as2 is capture + if (isUpdateCapture(as1, as2)) { return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - context_.Say(source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, - as1.rhs.AsFortran(), as2.rhs.AsFortran()); + errorCaptureShouldRead(act2.source, as1.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } else { // !couldBeCapture2 + } else if (det < 0) { + // as2 is update, as1 is capture if (isUpdateCapture(as2, as1)) { return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } else { - context_.Say(act2.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as1.rhs.AsFortran()); + errorCaptureShouldRead(act1.source, as2.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } - } else { // !couldBeCapture1 - if (couldBeCapture2) { - if (isUpdateCapture(as1, as2)) { - return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); - } else { + } else { + bool updateFirst{isUpdateCapture(as1, as2)}; + bool captureFirst{isUpdateCapture(as2, as1)}; + if (updateFirst && captureFirst) { + // If both assignment could be the update and both could be the + // capture, emit a warning about the ambiguity. context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as2.rhs.AsFortran()); + "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US); + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } - } else { - context_.Say(source, - "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + if (updateFirst != captureFirst) { + const parser::ExecutionPartConstruct *upd{updateFirst ? 
ec1 : ec2}; + const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2}; + return std::make_pair(upd, cap); + } + assert(!updateFirst && !captureFirst); + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); } + }}; + + if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) { + return makeSelectionFromDet(det); } + assert(det == 0 && "Prior checks should have covered det != 0"); - return std::make_pair(nullptr, nullptr); + // If neither of the statements is an RMW update, it could still be a + // "write" update. Pretty much any assignment can be a write update, so + // recompute det with cbu1 = cbu2 = true. + if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) { + return makeSelectionFromDet(writeDet); + } + + // It's only errors from here on. + + if (!cbu1 && !cbu2 && !cbc1 && !cbc2) { + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); + } + + // The remaining cases are that + // - no candidate for update, or for capture, + // - one of the assigments cannot be anything. + + if (!cbu1 && !cbu2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US); + return std::make_pair(nullptr, nullptr); + } else if (!cbc1 && !cbc2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) { + auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source; + context_.Say(src, + "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + // All cases should have been covered. 
+ llvm_unreachable("Unchecked condition"); } void OmpStructureChecker::CheckAtomicCaptureAssignment( const evaluate::Assignment &capture, const SomeExpr &atom, parser::CharBlock source) { - const SomeExpr &cap{capture.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &cap{capture.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, rsrc); - // This part should have been checked prior to callig this function. - assert(capture.rhs == atom && "This canont be a capture assignment"); + // This part should have been checked prior to calling this function. + assert(*GetConvertInput(capture.rhs) == atom && + "This canont be a capture assignment"); CheckStorageOverlap(atom, {cap}, source); } } void OmpStructureChecker::CheckAtomicReadAssignment( const evaluate::Assignment &read, parser::CharBlock source) { - const SomeExpr &atom{read.rhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; - if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + if (auto maybe{GetConvertInput(read.rhs)}) { + const SomeExpr &atom{*maybe}; + + if (!IsVarOrFunctionRef(atom)) { + ErrorShouldBeVariable(atom, rsrc); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } } else { - CheckAtomicVariable(atom, rsrc); - CheckStorageOverlap(atom, {read.lhs}, source); + ErrorShouldBeVariable(read.rhs, rsrc); } } @@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment( // one of the following forms: // x = expr // x => expr - const SomeExpr &atom{write.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{write.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + 
ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, lsrc); CheckStorageOverlap(atom, {write.rhs}, source); @@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( // x = intrinsic-procedure-name (x) // x = intrinsic-procedure-name (x, expr-list) // x = intrinsic-procedure-name (expr-list, x) - const SomeExpr &atom{update.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{update.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); // Skip other checks. return; } @@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( const SomeExpr &cond, parser::CharBlock condSource, const evaluate::Assignment &assign, parser::CharBlock assignSource) { - const SomeExpr &atom{assign.lhs}; auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + const SomeExpr &atom{assign.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, arsrc); // Skip other checks. 
return; } @@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( @@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign), MakeAtomicAnalysisOp(Analysis::None)); } @@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( if (GetActionStmt(&body.front()).stmt == uact.stmt) { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(action, update.rhs), - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + MakeAtomicAnalysisOp(action, update), + MakeAtomicAnalysisOp(Analysis::Read, capture)); } else { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), - MakeAtomicAnalysisOp(action, update.rhs)); + MakeAtomicAnalysisOp(Analysis::Read, capture), + MakeAtomicAnalysisOp(action, update)); } } @@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( if (captureFirst) { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign)); } else { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, 
capAssign.lhs)); + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign)); } } @@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead( if (body.size() == 1) { SourcedActionStmt action{GetActionStmt(&body.front())}; if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { - const SomeExpr &atom{maybeRead->rhs}; CheckAtomicReadAssignment(*maybeRead, action.source); - using Analysis = parser::OpenMPAtomicConstruct::Analysis; - x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), - MakeAtomicAnalysisOp(Analysis::None)); + if (auto maybe{GetConvertInput(maybeRead->rhs)}) { + const SomeExpr &atom{*maybe}; + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead), + MakeAtomicAnalysisOp(Analysis::None)); + } } else { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); @@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index bf6fbf16d0646..835fbe45e1c0e 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -253,6 +253,7 @@ class OmpStructureChecker void CheckStorageOverlap(const evaluate::Expr &, llvm::ArrayRef>, parser::CharBlock); + void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source); void CheckAtomicVariable( const evaluate::Expr &, parser::CharBlock); std::pair&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture 
requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..aa9d2e0ac3ff7 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -1,5 +1,3 @@ -! REQUIRES : openmp_runtime - ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s ! 
CHECK: func.func @_QPatomic_implicit_cast_read() { diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index deb67e7614659..8adb0f1a67409 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -25,7 +25,7 @@ program sample !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement y = x x = y !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index c427ba07d43d8..f808ed916fb7e 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -39,7 +39,7 @@ subroutine f02 subroutine f03 integer :: x - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture !$omp atomic update capture x = x + 1 x = x + 2 @@ -50,7 +50,7 @@ subroutine f04 integer :: x, v !$omp atomic update capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement v = x x = v !$omp end atomic @@ -60,8 +60,8 @@ subroutine f05 integer :: x, v, z !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand 
side of the capture assignment should read z v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 !$omp end atomic end @@ -70,8 +70,8 @@ subroutine f06 integer :: x, v, z !$omp atomic update capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z v = x !$omp end atomic end diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 677b933932b44..5e180aa0bbe5b 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -97,50 +97,50 @@ program sample !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture x = x + 10 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x v = b !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture !$omp 
atomic capture v = 1 x = 4 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) !$omp end atomic >From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 24 Mar 2025 15:38:58 -0500 Subject: [PATCH 13/18] DumpEvExpr: show type --- flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 2f445429a10b5..1553dac3b6687 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,6 +16,7 @@ #include #include +#include #include #include @@ -38,6 +39,17 @@ class DumpEvaluateExpr { } private: + template + struct TypeOf { + static constexpr std::string_view name{TypeOf::get()}; + static constexpr std::string_view get() { + std::string_view v(__PRETTY_FUNCTION__); 
+ v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" + return v; + } + }; + template void Show(const common::Indirection &x) { Show(x.value()); } @@ -76,7 +88,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant"); + Indent("derived constant "s + std::string(TypeOf::name)); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -84,7 +96,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant"); + Print("constant "s + std::string(TypeOf::name)); } } void Show(const Symbol &symbol); @@ -102,7 +114,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator"); + Indent("designator "s + std::string(TypeOf::name)); Show(x.u); Outdent(); } @@ -117,7 +129,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref"); + Indent("function ref "s + std::string(TypeOf::name)); Show(x.proc()); Show(x.arguments()); Outdent(); @@ -127,14 +139,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value"); + Indent("array constructor value "s + std::string(TypeOf::name)); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do"); + Indent("implied do "s + std::string(TypeOf::name)); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -148,20 +160,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op"); + Indent("unary op "s + std::string(TypeOf::name)); Show(op.left()); Outdent(); } 
template void Show(const evaluate::Operation &op) { - Indent("binary op"); + Indent("binary op "s + std::string(TypeOf::name)); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr T"); + Indent("expr <" + std::string(TypeOf::name) + ">"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index aa0b4e0f03398..66cedab94bfb4 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("expr some type"); + Indent("relational some type"); Show(x.u); Outdent(); } >From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 15:14:45 -0500 Subject: [PATCH 14/18] Handle conversion from real to complex via complex constructor --- flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++--- 1 file changed, 47 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dada9c6c2bd6f..ae81dcb5ea150 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,36 +3183,46 @@ struct ConvertCollector using Base::operator(); template // - Result operator()(const evaluate::Designator &x) const { + Result asSomeExpr(const T &x) const { auto copy{x}; return {AsGenericExpr(std::move(copy)), {}}; } + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + template // Result operator()(const evaluate::FunctionRef &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template // Result operator()(const evaluate::Constant &x) const { - auto copy{x}; - return 
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/18] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: } >From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:14:47 -0500 Subject: [PATCH 16/18] Allow conversion in update operations --- flang/include/flang/Semantics/tools.h | 17 ++++----- flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++-- flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++----------- .../Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++-- flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++---------- .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +- 7 files changed, 44 insertions(+), 57 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7f1ec59b087a2..9be2feb8ae064 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -789,14 +789,15 @@ inline bool checkForSymbolMatch( /// return the "expr" but with top-level parentheses stripped. std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); -/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). -/// Check if "expr" is -/// SomeType(SomeKind(Type( -/// Convert -/// SomeKind(...)[2]))) -/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves -/// TypeCategory. -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// it returns std::nullopt. 
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic >From 341723713929507c59d528540d32bc2e4213e920 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:21:56 -0500 Subject: [PATCH 17/18] format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e8b1b..0f553541c5ef0 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } >From 2686207342bad511f6d51b20ed923c0d2cc9047b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:22:26 -0500 Subject: [PATCH 18/18] Revert "DumpEvExpr: show type" This reverts commit 40510a3068498d15257cc7d198bce9c8cd71a902. Debug changes accidentally pushed upstream. 
--- flang/include/flang/Semantics/dump-expr.h | 30 +++++++---------------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 10 insertions(+), 22 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b6687..2f445429a10b5 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,7 +16,6 @@ #include #include -#include #include #include @@ -39,17 +38,6 @@ class DumpEvaluateExpr { } private: - template - struct TypeOf { - static constexpr std::string_view name{TypeOf::get()}; - static constexpr std::string_view get() { - std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" - return v; - } - }; - template void Show(const common::Indirection &x) { Show(x.value()); } @@ -88,7 +76,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant "s + std::string(TypeOf::name)); + Indent("derived constant"); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -96,7 +84,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant "s + std::string(TypeOf::name)); + Print("constant"); } } void Show(const Symbol &symbol); @@ -114,7 +102,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator "s + std::string(TypeOf::name)); + Indent("designator"); Show(x.u); Outdent(); } @@ -129,7 +117,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref "s + std::string(TypeOf::name)); + Indent("function ref"); Show(x.proc()); Show(x.arguments()); Outdent(); @@ 
-139,14 +127,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value "s + std::string(TypeOf::name)); + Indent("array constructor value"); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do "s + std::string(TypeOf::name)); + Indent("implied do"); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -160,20 +148,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op "s + std::string(TypeOf::name)); + Indent("unary op"); Show(op.left()); Outdent(); } template void Show(const evaluate::Operation &op) { - Indent("binary op "s + std::string(TypeOf::name)); + Indent("binary op"); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr <" + std::string(TypeOf::name) + ">"); + Indent("expr T"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 66cedab94bfb4..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("relational some type"); + Indent("expr some type"); Show(x.u); Outdent(); } From flang-commits at lists.llvm.org Tue May 27 10:43:35 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 27 May 2025 10:43:35 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <6835f9c7.a70a0220.293560.27f0@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From 25397a64d70714e9628f5d988de7ef3b4770523f 
Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, Destroy, and DescriptorIO. Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. There is a fast(?) mode in the work queue implementation that causes new work items to be executed to completion immediately upon creation, saving the overhead of actually representing and managing the work queue. This mode can't be used on GPU devices, but it is enabled by default for CPU hosts. It can be disabled easily for debugging and performance testing. 
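For readers of this archive, the general shape of the "iterative work queue with suspendable/resumable work tickets" described above can be sketched roughly as follows. This is only an illustration under stated assumptions and is not the code from the patch: `Status`, `Ticket`, `WorkStack`, `VisitTicket`, `Node`, and `PreOrder` are hypothetical names invented here, and the example walks a small tree rather than a Fortran derived type. The one idea taken from the message is that a ticket returns a "continue" status to suspend itself until the tickets it created have run to completion.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical status codes; the real runtime uses its own Stat values.
enum Status { Ok, Continue, Error };

// Toy data structure standing in for nested derived-type components.
struct Node {
  std::string name;
  std::vector<Node> children;
};

class WorkStack;

// A suspendable unit of work (hypothetical interface).
struct Ticket {
  virtual ~Ticket() = default;
  // Returns Continue to suspend until nested tickets have completed.
  virtual Status Resume(WorkStack &) = 0;
};

class WorkStack {
public:
  void Push(std::unique_ptr<Ticket> t) { tickets_.push_back(std::move(t)); }

  // Always runs the most recently pushed ticket; no recursion involved.
  Status Run() {
    while (!tickets_.empty()) {
      Ticket *top = tickets_.back().get();
      Status s = top->Resume(*this);
      if (s == Continue) {
        continue; // suspended; any ticket it pushed now runs first
      }
      // A finished ticket must not leave unfinished nested tickets behind.
      assert(top == tickets_.back().get());
      tickets_.pop_back();
      if (s != Ok) {
        tickets_.clear(); // a failure shuts the whole queue down
        return s;
      }
    }
    return Ok;
  }

private:
  std::vector<std::unique_ptr<Ticket>> tickets_;
};

// Visits one node, then one child per resumption, keeping its own progress.
class VisitTicket : public Ticket {
public:
  VisitTicket(const Node &node, std::vector<std::string> &out)
      : node_{node}, out_{out} {}

  Status Resume(WorkStack &stack) override {
    if (!entered_) {
      out_.push_back(node_.name);
      entered_ = true;
    }
    if (next_ < node_.children.size()) {
      stack.Push(std::make_unique<VisitTicket>(node_.children[next_++], out_));
      return Continue;
    }
    return Ok;
  }

private:
  const Node &node_;
  std::vector<std::string> &out_;
  std::size_t next_{0};
  bool entered_{false};
};

// Pre-order traversal driven entirely by the work stack.
std::vector<std::string> PreOrder(const Node &root) {
  std::vector<std::string> out;
  WorkStack stack;
  stack.Push(std::make_unique<VisitTicket>(root, out));
  if (stack.Run() != Ok) {
    out.clear();
  }
  return out;
}
```

In this sketch the call depth stays constant no matter how deeply the data is nested, which is the property the message is after for link-time stack size calculation; the cost is that each ticket has to carry its own resumption state (`next_` and `entered_` here).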
--- .../include/flang-rt/runtime/environment.h | 1 + flang-rt/include/flang-rt/runtime/stat.h | 10 +- .../include/flang-rt/runtime/work-queue.h | 538 +++++++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 546 +++++++++------ flang-rt/lib/runtime/derived.cpp | 519 +++++++------- flang-rt/lib/runtime/descriptor-io.cpp | 651 +++++++++++++++++- flang-rt/lib/runtime/descriptor-io.h | 620 +---------------- flang-rt/lib/runtime/environment.cpp | 4 + flang-rt/lib/runtime/namelist.cpp | 1 + flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 153 ++++ flang-rt/unittests/Runtime/ExternalIOTest.cpp | 2 +- flang/docs/Extensions.md | 10 + flang/include/flang/Runtime/assign.h | 2 +- 15 files changed, 1983 insertions(+), 1082 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h index 16258b3bbba9b..87fe1f92ba545 100644 --- a/flang-rt/include/flang-rt/runtime/environment.h +++ b/flang-rt/include/flang-rt/runtime/environment.h @@ -63,6 +63,7 @@ struct ExecutionEnvironment { bool noStopMessage{false}; // NO_STOP_MESSAGE=1 inhibits "Fortran STOP" bool defaultUTF8{false}; // DEFAULT_UTF8 bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION + int internalDebugging{0}; // FLANG_RT_DEBUG // CUDA related variables std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE diff --git a/flang-rt/include/flang-rt/runtime/stat.h b/flang-rt/include/flang-rt/runtime/stat.h index 070d0bf8673fb..dc372de53506a 100644 --- a/flang-rt/include/flang-rt/runtime/stat.h +++ b/flang-rt/include/flang-rt/runtime/stat.h @@ -24,7 +24,7 @@ class Terminator; enum Stat { StatOk = 0, // required to be zero by Fortran - // Interoperable STAT= codes + // Interoperable STAT= codes (>= 11) StatBaseNull = CFI_ERROR_BASE_ADDR_NULL, StatBaseNotNull = 
CFI_ERROR_BASE_ADDR_NOT_NULL, StatInvalidElemLen = CFI_INVALID_ELEM_LEN, @@ -36,7 +36,7 @@ enum Stat { StatMemAllocation = CFI_ERROR_MEM_ALLOCATION, StatOutOfBounds = CFI_ERROR_OUT_OF_BOUNDS, - // Standard STAT= values + // Standard STAT= values (>= 101) StatFailedImage = FORTRAN_RUNTIME_STAT_FAILED_IMAGE, StatLocked = FORTRAN_RUNTIME_STAT_LOCKED, StatLockedOtherImage = FORTRAN_RUNTIME_STAT_LOCKED_OTHER_IMAGE, @@ -49,10 +49,14 @@ enum Stat { // Additional "processor-defined" STAT= values StatInvalidArgumentNumber = FORTRAN_RUNTIME_STAT_INVALID_ARG_NUMBER, StatMissingArgument = FORTRAN_RUNTIME_STAT_MISSING_ARG, - StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, + StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, // -1 StatMoveAllocSameAllocatable = FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE, StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION, + + // Dummy status for work queue continuation, declared here to perhaps + // avoid collisions + StatContinue = 201 }; RT_API_ATTRS const char *StatErrorString(int); diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..2b46890aeebe1 --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,538 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue comprises a list of tickets. 
Each ticket class has a Begin() +// member function, which is called once, and a Continue() member function +// that can be called zero or more times. A ticket's execution terminates +// when either of these member functions returns a status other than +// StatContinue. When that status is not StatOk, then the whole queue +// is shut down. +// +// By returning StatContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the responsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentsOverElements, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. +// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. +// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. 
Run calls AssignTicket::Begin(), which pushes a ticket via +// BeginFinalize() and returns StatContinue. +// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. + +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/connection.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include <variant> + +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; + +// Ticket worker base classes + +template <typename TICKET> class ImmediateTicketRunner { +public: + RT_API_ATTRS explicit ImmediateTicketRunner(TICKET &ticket) + : ticket_{ticket} {} + RT_API_ATTRS int Run(WorkQueue &workQueue) { + int status{ticket_.Begin(workQueue)}; + while (status == StatContinue) { + status = ticket_.Continue(workQueue); + } + return status; + } + +private: + TICKET &ticket_; +}; + +// Base class for ticket workers that operate elementwise over descriptors +class Elementwise { +protected: + RT_API_ATTRS Elementwise( + const Descriptor &instance, const Descriptor *from = nullptr) + : instance_{instance}, from_{from} { + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + RT_API_ATTRS bool IsComplete() const { return 
elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that operate over derived type components. +class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. 
+class ComponentsOverElements : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Ticket worker classes + +// Implements derived type instance initialization +class InitializeTicket : public ImmediateTicketRunner<InitializeTicket>, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket + : public ImmediateTicketRunner<InitializeCloneTicket>, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor 
cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : public ImmediateTicketRunner<FinalizeTicket>, + private ComponentsOverElements { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : public ImmediateTicketRunner<DestroyTicket>, + private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket : public ImmediateTicketRunner<AssignTicket> { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : ImmediateTicketRunner{*this}, to_{to}, from_{&from}, + flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type 
intrinsic assignment. +class DerivedAssignTicket : public ImmediateTicketRunner, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ImmediateTicketRunner{*this}, + ElementsOverComponents{to, derived, &from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { + +template +class DescriptorIoTicket + : public ImmediateTicketRunner>, + private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS bool &anyIoTookPlace() { return anyIoTookPlace_; } + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; + +template +class DerivedIoTicket : public ImmediateTicketRunner>, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} 
{} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; +}; + +} // namespace io::descr + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + io::descr::DescriptorIoTicket, + io::descr::DerivedIoTicket, + io::descr::DerivedIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + // APIs for particular tasks. These can return StatOk if the work is + // completed immediately. 
+ RT_API_ATTRS int BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return InitializeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + if (runTicketsImmediately_) { + return InitializeCloneTicket{clone, original, derived, hasStat, errMsg} + .Run(*this); + } else { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); + return StatContinue; + } + } + RT_API_ATTRS int BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return FinalizeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + if (runTicketsImmediately_) { + return DestroyTicket{descriptor, derived, finalize}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived, finalize); + return StatContinue; + } + } + RT_API_ATTRS int BeginAssign(Descriptor &to, const Descriptor &from, + int flags, MemmoveFct memmoveFct) { + if (runTicketsImmediately_) { + return AssignTicket{to, from, flags, memmoveFct}.Run(*this); + } else { + StartTicket().u.emplace(to, from, flags, memmoveFct); + return StatContinue; + } + } + RT_API_ATTRS int BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) { + if (runTicketsImmediately_) { + return DerivedAssignTicket{ + to, from, derived, flags, memmoveFct, deallocateAfter} + .Run(*this); + } else { + StartTicket().u.emplace( + to, from, derived, flags, 
memmoveFct, deallocateAfter); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDescriptorIo(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DescriptorIoTicket{ + io, descriptor, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, table, anyIoTookPlace); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedIo(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DerivedIoTicket{ + io, descriptor, derived, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + return StatContinue; + } + } + + RT_API_ATTRS int Run(); + +private: +#if RT_DEVICE_COMPILATION + // Always use the work queue on a GPU device to avoid recursion. + static constexpr bool runTicketsImmediately_{false}; +#else + // Avoid the work queue overhead on the host, unless it needs + // debugging, which is so much easier there. + static constexpr bool runTicketsImmediately_{true}; +#endif + + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 86aeeaa88f2d1..b60c4214fc8e5 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncObject)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic 
assignments and // those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,373 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + if (workQueue.BeginAssign(to, from, flags, memmoveFct) == StatContinue) { + workQueue.Run(); + } +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + if (int status{workQueue.BeginFinalize(*toDeallocate_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncObject))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(newFrom, *derived)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + } + static constexpr int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (int status{workQueue.BeginAssign( + newFrom, *from_, nestedFlags, memmoveFct_)}; + status != StatOk && status != StatContinue) { + return status; } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && 
toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + if (int status{workQueue.BeginFinalize(to_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!toDerived_->noDestructionNeeded()) { + if (int status{ + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + return StatContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); } + return StatOk; } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. 
A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(to_, *toDerived_)}; + status != StatOk) { + return status; + } + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t 
toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); + if (toDerived_) { + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. 
- // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } - } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_( + instance_.ElementComponent(subscripts_, procPtr.offset), + from_->ElementComponent( + fromSubscripts_, procPtr.offset), + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementComponentBase so that it + // loops over elements at the outer level and over components at the inner. + // This yields less surprising behavior for codes that care due to the use + // of defined assignments when the ordering of their calls matters. + while (!IsComplete()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, *from_, workQueue.terminator(), fromSubscripts_); + Advance(); + if (int status{workQueue.BeginAssign( + toCompDesc, fromCompDesc, flags_, memmoveFct_)}; + status != StatOk) { + return status; + } + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_->Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } + } + toDesc->Deallocate(); + } + Advance(); + } else { + // Allocatable components of the LHS are 
unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + int nestedFlags{flags_ | DeallocateLHS}; + Advance(); + if (int status{workQueue.BeginAssign( + *toDesc, *fromDesc, nestedFlags, memmoveFct_)}; + status != StatOk) { + return status; + } + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +679,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -598,7 +695,6 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
if (var) { diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index 35037036f63e7..8166ab64cfd71 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,195 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitialize(instance, derived)}; + return status == StatContinue ? 
workQueue.Run() : status; +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &comp{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + auto &pptr{*instance_.ElementComponent( + subscripts_, comp.offset)}; + pptr = comp.procInitialization; + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + SkipToNextComponent(); + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t 
bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitialize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitializeClone( + clone, original, derived, hasStat, errMsg)}; + return status == StatContinue ? workQueue.Run() : status; } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. 
- stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. - if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncObject), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + if (int status{workQueue.BeginInitialize(cloneDesc, *derived)}; + status != StatOk) { + return status; } } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_)}; + status != StatOk) { + return status; + } + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); + } + } + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + if (workQueue.BeginFinalize(descriptor, derived) == StatContinue) { + workQueue.Run(); } } - return stat; } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +237,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +274,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,87 +298,94 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with 
the same // descriptor so that the rank is preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + if (int status{ + workQueue.BeginFinalize(compDesc, *compDynamicType)}; + status != StatOk) { + return status; } } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginFinalize(compDesc, *compType)}; + status != StatOk) { + return status; } } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == 
typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginFinalize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + const auto &parentType{*finalizableParentType_}; + finalizableParentType_ = nullptr; + // Don't return StatOk here if the nested Finalize is still running; + // it needs this->componentDescriptor_. + return workQueue.BeginFinalize(tmpDesc, parentType); } + return StatOk; } // The order of finalization follows Fortran 2018 7.5.6.2, with @@ -373,51 +394,71 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + if (workQueue.BeginDestroy(descriptor, derived, finalize) == StatContinue) { + workQueue.Run(); + } } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + if (int status{workQueue.BeginFinalize(instance_, derived_)}; + status != StatOk && status != StatContinue) { + return status; + } } + return StatContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. 
// Contrary to finalization, the order of deallocation does not matter. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *d, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || 
componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginDestroy( + compDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..4aa3640c1ed94 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,15 +7,44 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) -Fortran::common::optional DefinedFormattedIo(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &derived, +static RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( + IoStatementState &io, const Descriptor &descriptor, + const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special, const SubscriptValue subscripts[]) { Fortran::common::optional peek{ @@ -104,8 +133,8 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -152,5 +181,619 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, return handler.GetIoStat() == IostatOk; } +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output. 
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + } + } + return StatOk; +} + +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType * + type{addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } + } + } + if (const typeInfo::SpecialBinding * + special{type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + if (DefinedUnformattedIo(io_, instance_, *type, *special)) { + anyIoTookPlace_ = true; + return StatOk; + } else { + return IostatEnd; + } + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? 
io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. + if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + if constexpr (DIR == Direction::Output) { + return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + return externalUnf + ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elements_ * elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + bool any{false}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + any = FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Complex: + switch (kind) 
{ + case 2: + any = FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + any = FormattedCharacterIO(io_, instance_); + break; + case 2: + any = FormattedCharacterIO(io_, instance_); + break; + case 4: + any = FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + any = FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + any = FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + any = FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding * + binding{derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatContinue; + } + } + if (any) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ while (!IsComplete()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + Advance(); + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element(subscripts_)); + Advance(); + if (int status{workQueue.BeginDerivedIo( + io_, elementDesc, *derived_, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + if (workQueue.BeginDescriptorIo(io, descriptor, table, anyIoTookPlace) == + StatContinue) { + workQueue.Run(); + } + return anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. -// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) 
+#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { 
RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..eee1c551aad6d --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,153 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/environment.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +#if !defined(RT_DEVICE_COMPILATION) +// FLANG_RT_DEBUG code is disabled when false. 
+static constexpr bool enableDebugOutput{false}; +#endif + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (componentAt_ >= components_) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + } + firstFree_ = next; + } +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? 
insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: new ticket\n"); + } +#endif + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), + at->ticket.begun ? "Continue" : "Begin"); + } +#endif + int stat{at->ticket.Continue(*this)}; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { + std::fprintf(stderr, "WQ: ... stat %d\n", stat); + } +#endif + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp index 3833e48be3dd6..6c148b1de6f82 100644 --- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp +++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp @@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) 
{ io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__); for (int j{1}; j <= 3; ++j) { ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc)) - << "OutputDescriptor() for InquireIoLength"; + << "OutputDescriptor() for InquireIoLength " << j; } ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength"; ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk) diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..32bebc1d866a4 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -843,6 +843,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. + However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. + Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. + A program using defined assignment might be able to detect the difference. 
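For illustration only (this is not part of the patch): the two loop nestings described in the Extensions.md paragraph above can be sketched in a few lines of C++. The function name `AssignmentOrder`, the `Nesting` enum, and the `a(j)%c` labels are all hypothetical; the sketch only records the order in which (element, component) pairs would be visited, which is what a defined assignment with side effects could observe.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical illustration: which loop is outermost when assigning an
// array of derived type.
enum class Nesting { ElementsOuter, ComponentsOuter };

// Returns the visiting order as labels such as "a(1)%x" (1-based, as in
// Fortran). No real assignment is performed.
std::vector<std::string> AssignmentOrder(int elements,
    const std::vector<std::string> &components, Nesting nesting) {
  std::vector<std::string> trace;
  auto visit = [&trace](int element, const std::string &component) {
    trace.push_back("a(" + std::to_string(element) + ")%" + component);
  };
  if (nesting == Nesting::ElementsOuter) {
    // All components of one element before moving to the next element.
    for (int j = 1; j <= elements; ++j) {
      for (const auto &c : components) {
        visit(j, c);
      }
    }
  } else {
    // All elements for one component before moving to the next component,
    // which is the order the paragraph above attributes to this compiler.
    for (const auto &c : components) {
      for (int j = 1; j <= elements; ++j) {
        visit(j, c);
      }
    }
  }
  return trace;
}
```

With two elements and components `x` and `y`, the first nesting visits `a(1)%x, a(1)%y, a(2)%x, a(2)%y`, while the second visits `a(1)%x, a(2)%x, a(1)%y, a(2)%y`.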
+ ## De Facto Standard Features * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION From flang-commits at lists.llvm.org Tue May 27 12:14:57 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 27 May 2025 12:14:57 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <68360f31.170a0220.1dff0f.1a5c@mx.google.com> https://github.com/vzakhari approved this pull request. LGTM! Just one minor suggestion. https://github.com/llvm/llvm-project/pull/137727 From flang-commits at lists.llvm.org Tue May 27 12:14:57 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 27 May 2025 12:14:57 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <68360f31.170a0220.17f0ef.3844@mx.google.com> https://github.com/vzakhari edited https://github.com/llvm/llvm-project/pull/137727 From flang-commits at lists.llvm.org Tue May 27 12:14:57 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Tue, 27 May 2025 12:14:57 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <68360f31.630a0220.10c328.5622@mx.google.com> ================ @@ -0,0 +1,153 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the 
LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/environment.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +#if !defined(RT_DEVICE_COMPILATION) +// FLANG_RT_DEBUG code is disabled when false. +static constexpr bool enableDebugOutput{false}; +#endif + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (componentAt_ >= components_) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + delete firstFree_; + } + firstFree_ = next; + } +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + firstFree_ = new TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; 
+ } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && (executionEnvironment.internalDebugging & 1)) { ---------------- vzakhari wrote: nit: could you turn literal `1` into a named constant so that it is easier to see which bits of `internalDebugging` are used for what (given that we may add more bits in future)? https://github.com/llvm/llvm-project/pull/137727 From flang-commits at lists.llvm.org Tue May 27 12:39:20 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 27 May 2025 12:39:20 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <683614e8.a70a0220.27257c.e36b@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From b7202ecd5ea9f2830f49b1061c2bddf0c09830ef Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, Destroy, and DescriptorIO. Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. 
This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. There is a fast(?) mode in the work queue implementation that causes new work items to be executed to completion immediately upon creation, saving the overhead of actually representing and managing the work queue. This mode can't be used on GPU devices, but it is enabled by default for CPU hosts. It can be disabled easily for debugging and performance testing. --- .../include/flang-rt/runtime/environment.h | 3 + flang-rt/include/flang-rt/runtime/stat.h | 10 +- .../include/flang-rt/runtime/work-queue.h | 538 +++++++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 546 +++++++++------ flang-rt/lib/runtime/derived.cpp | 519 +++++++------- flang-rt/lib/runtime/descriptor-io.cpp | 651 +++++++++++++++++- flang-rt/lib/runtime/descriptor-io.h | 620 +---------------- flang-rt/lib/runtime/environment.cpp | 4 + flang-rt/lib/runtime/namelist.cpp | 1 + flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 161 +++++ flang-rt/unittests/Runtime/ExternalIOTest.cpp | 2 +- flang/docs/Extensions.md | 10 + flang/include/flang/Runtime/assign.h | 2 +- 15 files changed, 1993 insertions(+), 1082 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h index 16258b3bbba9b..e579f6012ce86 100644 --- a/flang-rt/include/flang-rt/runtime/environment.h +++ b/flang-rt/include/flang-rt/runtime/environment.h @@ -64,6 +64,9 @@ struct ExecutionEnvironment { bool defaultUTF8{false}; // DEFAULT_UTF8 bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION + enum InternalDebugging { WorkQueue = 1 }; + int internalDebugging{0}; // 
FLANG_RT_DEBUG + // CUDA related variables std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE bool cudaDeviceIsManaged{false}; // NV_CUDAFOR_DEVICE_IS_MANAGED diff --git a/flang-rt/include/flang-rt/runtime/stat.h b/flang-rt/include/flang-rt/runtime/stat.h index 070d0bf8673fb..dc372de53506a 100644 --- a/flang-rt/include/flang-rt/runtime/stat.h +++ b/flang-rt/include/flang-rt/runtime/stat.h @@ -24,7 +24,7 @@ class Terminator; enum Stat { StatOk = 0, // required to be zero by Fortran - // Interoperable STAT= codes + // Interoperable STAT= codes (>= 11) StatBaseNull = CFI_ERROR_BASE_ADDR_NULL, StatBaseNotNull = CFI_ERROR_BASE_ADDR_NOT_NULL, StatInvalidElemLen = CFI_INVALID_ELEM_LEN, @@ -36,7 +36,7 @@ enum Stat { StatMemAllocation = CFI_ERROR_MEM_ALLOCATION, StatOutOfBounds = CFI_ERROR_OUT_OF_BOUNDS, - // Standard STAT= values + // Standard STAT= values (>= 101) StatFailedImage = FORTRAN_RUNTIME_STAT_FAILED_IMAGE, StatLocked = FORTRAN_RUNTIME_STAT_LOCKED, StatLockedOtherImage = FORTRAN_RUNTIME_STAT_LOCKED_OTHER_IMAGE, @@ -49,10 +49,14 @@ enum Stat { // Additional "processor-defined" STAT= values StatInvalidArgumentNumber = FORTRAN_RUNTIME_STAT_INVALID_ARG_NUMBER, StatMissingArgument = FORTRAN_RUNTIME_STAT_MISSING_ARG, - StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, + StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, // -1 StatMoveAllocSameAllocatable = FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE, StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION, + + // Dummy status for work queue continuation, declared here to perhaps + // avoid collisions + StatContinue = 201 }; RT_API_ATTRS const char *StatErrorString(int); diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..2b46890aeebe1 --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,538 @@ +//===-- include/flang-rt/runtime/work-queue.h 
-------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue comprises a list of tickets. Each ticket class has a Begin() +// member function, which is called once, and a Continue() member function +// that can be called zero or more times. A ticket's execution terminates +// when either of these member functions returns a status other than +// StatContinue. When that status is not StatOk, then the whole queue +// is shut down. +// +// By returning StatContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the responsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentsOverElements, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. +// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. 
This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. +// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. Run calls AssignTicket::Begin(), which pushes a ticket via +// BeginFinalize() and returns StatContinue. +// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. 
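For illustration only (this is not part of the patch): the "work stack" behavior described in the header comment above can be sketched with a small stand-alone C++ model. Everything here is hypothetical (`MiniQueue`, `MiniTicket`, `kOk`, `kContinue`, `RunDemo`); it is not the flang-rt API, and it folds Begin() and Continue() into one callback that receives a call counter. It assumes only what the comment states: the last ticket on the list always runs next, a ticket suspends by returning a "continue" status, and tickets created by a running ticket are placed right after it so that the first one created is the first one run.

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the runtime's StatOk and StatContinue.
constexpr int kOk = 0;
constexpr int kContinue = 201;

class MiniQueue;
// A ticket body gets the queue and the number of times it has already run.
using Body = std::function<int(MiniQueue &, int)>;

struct MiniTicket {
  std::string name;
  Body body;
  int calls{0};
};

class MiniQueue {
public:
  void Push(std::string name, Body body) {
    // While a ticket is running, insert directly after it; each later push
    // lands before the earlier ones, so the first pushed ends up last.
    auto where = running_ ? std::next(active_) : tickets_.end();
    tickets_.insert(where, MiniTicket{std::move(name), std::move(body)});
  }
  int Run() {
    while (!tickets_.empty()) {
      active_ = std::prev(tickets_.end()); // always run the last ticket
      running_ = true;
      trace.push_back(active_->name);
      int stat = active_->body(*this, active_->calls++);
      running_ = false;
      if (stat == kOk) {
        tickets_.erase(active_); // ticket is done
      } else if (stat != kContinue) {
        tickets_.clear(); // any other status shuts the queue down
        return stat;
      }
    }
    return kOk;
  }
  std::vector<std::string> trace; // names of tickets in execution order

private:
  std::list<MiniTicket> tickets_; // list nodes stay put across inserts
  std::list<MiniTicket>::iterator active_;
  bool running_{false};
};

// One outer ticket creates two nested tickets and suspends itself.
std::vector<std::string> RunDemo() {
  MiniQueue queue;
  queue.Push("assign", [](MiniQueue &q, int call) {
    if (call == 0) {
      q.Push("finalize", [](MiniQueue &, int) { return kOk; });
      q.Push("component", [](MiniQueue &, int) { return kOk; });
      return kContinue; // resume only after both nested tickets finish
    }
    return kOk;
  });
  queue.Run();
  return queue.trace;
}
```

In this model `RunDemo()` runs the tickets in the order `assign`, `finalize`, `component`, `assign`: the nested tickets run in creation order, and the outer ticket is resumed only once it is again the last ticket on the list. A `std::list` is used so that pushing a ticket never relocates the ticket that is currently executing.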
+ +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/connection.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include + +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; + +// Ticket worker base classes + +template <typename TICKET> class ImmediateTicketRunner { +public: + RT_API_ATTRS explicit ImmediateTicketRunner(TICKET &ticket) + : ticket_{ticket} {} + RT_API_ATTRS int Run(WorkQueue &workQueue) { + int status{ticket_.Begin(workQueue)}; + while (status == StatContinue) { + status = ticket_.Continue(workQueue); + } + return status; + } + +private: + TICKET &ticket_; +}; + +// Base class for ticket workers that operate elementwise over descriptors +class Elementwise { +protected: + RT_API_ATTRS Elementwise( + const Descriptor &instance, const Descriptor *from = nullptr) + : instance_{instance}, from_{from} { + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + RT_API_ATTRS bool IsComplete() const { return elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that 
operate over derived type components. +class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentsOverElements : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Ticket worker classes + +// Implements derived type instance initialization +class InitializeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket + : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor 
cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket : public ImmediateTicketRunner { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : ImmediateTicketRunner{*this}, to_{to}, from_{&from}, + flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type 
intrinsic assignment. +class DerivedAssignTicket : public ImmediateTicketRunner, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ImmediateTicketRunner{*this}, + ElementsOverComponents{to, derived, &from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { + +template +class DescriptorIoTicket + : public ImmediateTicketRunner>, + private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS bool &anyIoTookPlace() { return anyIoTookPlace_; } + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; + +template +class DerivedIoTicket : public ImmediateTicketRunner>, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} 
{} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; +}; + +} // namespace io::descr + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + io::descr::DescriptorIoTicket, + io::descr::DerivedIoTicket, + io::descr::DerivedIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + // APIs for particular tasks. These can return StatOk if the work is + // completed immediately. 
+ RT_API_ATTRS int BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return InitializeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + if (runTicketsImmediately_) { + return InitializeCloneTicket{clone, original, derived, hasStat, errMsg} + .Run(*this); + } else { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); + return StatContinue; + } + } + RT_API_ATTRS int BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return FinalizeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + if (runTicketsImmediately_) { + return DestroyTicket{descriptor, derived, finalize}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived, finalize); + return StatContinue; + } + } + RT_API_ATTRS int BeginAssign(Descriptor &to, const Descriptor &from, + int flags, MemmoveFct memmoveFct) { + if (runTicketsImmediately_) { + return AssignTicket{to, from, flags, memmoveFct}.Run(*this); + } else { + StartTicket().u.emplace(to, from, flags, memmoveFct); + return StatContinue; + } + } + RT_API_ATTRS int BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) { + if (runTicketsImmediately_) { + return DerivedAssignTicket{ + to, from, derived, flags, memmoveFct, deallocateAfter} + .Run(*this); + } else { + StartTicket().u.emplace( + to, from, derived, flags, 
memmoveFct, deallocateAfter); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDescriptorIo(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DescriptorIoTicket{ + io, descriptor, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, table, anyIoTookPlace); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedIo(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DerivedIoTicket{ + io, descriptor, derived, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + return StatContinue; + } + } + + RT_API_ATTRS int Run(); + +private: +#if RT_DEVICE_COMPILATION + // Always use the work queue on a GPU device to avoid recursion. + static constexpr bool runTicketsImmediately_{false}; +#else + // Avoid the work queue overhead on the host, unless it needs + // debugging, which is so much easier there. + static constexpr bool runTicketsImmediately_{true}; +#endif + + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 86aeeaa88f2d1..b60c4214fc8e5 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncObject)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic 
assignments and // those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,373 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + if (workQueue.BeginAssign(to, from, flags, memmoveFct) == StatContinue) { + workQueue.Run(); + } +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + if (int status{workQueue.BeginFinalize(*toDeallocate_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncObject))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(newFrom, *derived)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + } + static constexpr int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (int status{workQueue.BeginAssign( + newFrom, *from_, nestedFlags, memmoveFct_)}; + status != StatOk && status != StatContinue) { + return status; } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && 
toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + if (int status{workQueue.BeginFinalize(to_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!toDerived_->noDestructionNeeded()) { + if (int status{ + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + return StatContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); } + return StatOk; } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. 
A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(to_, *toDerived_)}; + status != StatOk) { + return status; + } + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t 
toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); + if (toDerived_) { + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. 
- // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } - } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_( + instance_.ElementComponent(subscripts_, procPtr.offset), + from_->ElementComponent( + fromSubscripts_, procPtr.offset), + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementsOverComponents so that it + // loops over elements at the outer level and over components at the inner. + // This yields less surprising behavior for codes that use defined + // assignments and depend on the ordering of their calls. + while (!IsComplete()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, *from_, workQueue.terminator(), fromSubscripts_); + Advance(); + if (int status{workQueue.BeginAssign( + toCompDesc, fromCompDesc, flags_, memmoveFct_)}; + status != StatOk) { + return status; + } + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_->Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } + } + toDesc->Deallocate(); + } + Advance(); + } else { + // Allocatable components of the LHS are 
unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + int nestedFlags{flags_ | DeallocateLHS}; + Advance(); + if (int status{workQueue.BeginAssign( + *toDesc, *fromDesc, nestedFlags, memmoveFct_)}; + status != StatOk) { + return status; + } + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +679,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -598,7 +695,6 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
if (var) { diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index 35037036f63e7..8166ab64cfd71 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,195 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitialize(instance, derived)}; + return status == StatContinue ? 
workQueue.Run() : status; +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &comp{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + auto &pptr{*instance_.ElementComponent( + subscripts_, comp.offset)}; + pptr = comp.procInitialization; + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + SkipToNextComponent(); + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t 
bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitialize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitializeClone( + clone, original, derived, hasStat, errMsg)}; + return status == StatContinue ? workQueue.Run() : status; } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. 
- stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. - if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncObject), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + if (int status{workQueue.BeginInitialize(cloneDesc, *derived)}; + status != StatOk) { + return status; } } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_)}; + status != StatOk) { + return status; + } + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); + } + } + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + if (workQueue.BeginFinalize(descriptor, derived) == StatContinue) { + workQueue.Run(); } } - return stat; } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +237,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +274,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,87 +298,94 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with 
the same // descriptor so that the rank is preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + if (int status{ + workQueue.BeginFinalize(compDesc, *compDynamicType)}; + status != StatOk) { + return status; } } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginFinalize(compDesc, *compType)}; + status != StatOk) { + return status; } } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == 
typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginFinalize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + const auto &parentType{*finalizableParentType_}; + finalizableParentType_ = nullptr; + // Don't return StatOk here if the nested FInalize is still running; + // it needs this->componentDescriptor_. + return workQueue.BeginFinalize(tmpDesc, parentType); } + return StatOk; } // The order of finalization follows Fortran 2018 7.5.6.2, with @@ -373,51 +394,71 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + if (workQueue.BeginDestroy(descriptor, derived, finalize) == StatContinue) { + workQueue.Run(); + } } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + if (int status{workQueue.BeginFinalize(instance_, derived_)}; + status != StatOk && status != StatContinue) { + return status; + } } + return StatContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. 
// Contrary to finalization, the order of deallocation does not matter. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *d, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || 
componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginDestroy( + compDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..4aa3640c1ed94 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,15 +7,44 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) -Fortran::common::optional DefinedFormattedIo(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &derived, +static RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( + IoStatementState &io, const Descriptor &descriptor, + const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special, const SubscriptValue subscripts[]) { Fortran::common::optional peek{ @@ -104,8 +133,8 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -152,5 +181,619 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, return handler.GetIoStat() == IostatOk; } +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output. 
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + } + } + return StatOk; +} + +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType * + type{addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } + } + } + if (const typeInfo::SpecialBinding * + special{type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + if (DefinedUnformattedIo(io_, instance_, *type, *special)) { + anyIoTookPlace_ = true; + return StatOk; + } else { + return IostatEnd; + } + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? 
io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. + if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + if constexpr (DIR == Direction::Output) { + return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + return externalUnf + ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elements_ * elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + bool any{false}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + any = FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Complex: + switch (kind) 
{ + case 2: + any = FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + any = FormattedCharacterIO(io_, instance_); + break; + case 2: + any = FormattedCharacterIO(io_, instance_); + break; + case 4: + any = FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + any = FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + any = FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + any = FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding * + binding{derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatContinue; + } + } + if (any) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ while (!IsComplete()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + Advance(); + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element(subscripts_)); + Advance(); + if (int status{workQueue.BeginDerivedIo( + io_, elementDesc, *derived_, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + if (workQueue.BeginDescriptorIo(io, descriptor, table, anyIoTookPlace) == + StatContinue) { + workQueue.Run(); + } + return anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. -// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) 
+#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { 
RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..3617f8633f7f0 --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,161 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/environment.h" +#include "flang-rt/runtime/memory.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +#if !defined(RT_DEVICE_COMPILATION) +// FLANG_RT_DEBUG code is disabled when false. 
+static constexpr bool enableDebugOutput{false}; +#endif + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (componentAt_ >= components_) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + FreeMemory(firstFree_); + } + firstFree_ = next; + } +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + void *p{AllocateMemoryOrCrash(terminator_, sizeof(TicketList))}; + firstFree_ = new (p) TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? 
insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && + (executionEnvironment.internalDebugging & + ExecutionEnvironment::WorkQueue)) { + std::fprintf(stderr, "WQ: new ticket\n"); + } +#endif + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && + (executionEnvironment.internalDebugging & + ExecutionEnvironment::WorkQueue)) { + std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), + at->ticket.begun ? "Continue" : "Begin"); + } +#endif + int stat{at->ticket.Continue(*this)}; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && + (executionEnvironment.internalDebugging & + ExecutionEnvironment::WorkQueue)) { + std::fprintf(stderr, "WQ: ... 
stat %d\n", stat); + } +#endif + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp index 3833e48be3dd6..6c148b1de6f82 100644 --- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp +++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp @@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) { io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__); for (int j{1}; j <= 3; ++j) { ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc)) - << "OutputDescriptor() for InquireIoLength"; + << "OutputDescriptor() for InquireIoLength " << j; } ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength"; ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk) diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..32bebc1d866a4 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -843,6 +843,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. + However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. 
+ Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. + A program using defined assignment might be able to detect the difference. + ## De Facto Standard Features * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION From flang-commits at lists.llvm.org Tue May 27 13:24:17 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Tue, 27 May 2025 13:24:17 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <68361f71.050a0220.1d9502.0c5b@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From 24829fd2e2b1b4895b449220a2aa93b90bd00524 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, Destroy, and DescriptorIO. 
Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. There is a fast(?) mode in the work queue implementation that causes new work items to be executed to completion immediately upon creation, saving the overhead of actually representing and managing the work queue. This mode can't be used on GPU devices, but it is enabled by default for CPU hosts. It can be disabled easily for debugging and performance testing. --- .../include/flang-rt/runtime/environment.h | 3 + flang-rt/include/flang-rt/runtime/stat.h | 10 +- .../include/flang-rt/runtime/work-queue.h | 538 +++++++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 546 +++++++++------ flang-rt/lib/runtime/derived.cpp | 519 +++++++------- flang-rt/lib/runtime/descriptor-io.cpp | 651 +++++++++++++++++- flang-rt/lib/runtime/descriptor-io.h | 620 +---------------- flang-rt/lib/runtime/environment.cpp | 4 + flang-rt/lib/runtime/namelist.cpp | 1 + flang-rt/lib/runtime/tools.cpp | 4 +- flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 161 +++++ flang-rt/unittests/Runtime/ExternalIOTest.cpp | 2 +- flang/docs/Extensions.md | 10 + flang/include/flang/Runtime/assign.h | 2 +- 16 files changed, 1995 insertions(+), 1084 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h index 16258b3bbba9b..e579f6012ce86 100644 --- a/flang-rt/include/flang-rt/runtime/environment.h +++ b/flang-rt/include/flang-rt/runtime/environment.h @@ -64,6 +64,9 @@ struct 
ExecutionEnvironment { bool defaultUTF8{false}; // DEFAULT_UTF8 bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION + enum InternalDebugging { WorkQueue = 1 }; + int internalDebugging{0}; // FLANG_RT_DEBUG + // CUDA related variables std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE bool cudaDeviceIsManaged{false}; // NV_CUDAFOR_DEVICE_IS_MANAGED diff --git a/flang-rt/include/flang-rt/runtime/stat.h b/flang-rt/include/flang-rt/runtime/stat.h index 070d0bf8673fb..dc372de53506a 100644 --- a/flang-rt/include/flang-rt/runtime/stat.h +++ b/flang-rt/include/flang-rt/runtime/stat.h @@ -24,7 +24,7 @@ class Terminator; enum Stat { StatOk = 0, // required to be zero by Fortran - // Interoperable STAT= codes + // Interoperable STAT= codes (>= 11) StatBaseNull = CFI_ERROR_BASE_ADDR_NULL, StatBaseNotNull = CFI_ERROR_BASE_ADDR_NOT_NULL, StatInvalidElemLen = CFI_INVALID_ELEM_LEN, @@ -36,7 +36,7 @@ enum Stat { StatMemAllocation = CFI_ERROR_MEM_ALLOCATION, StatOutOfBounds = CFI_ERROR_OUT_OF_BOUNDS, - // Standard STAT= values + // Standard STAT= values (>= 101) StatFailedImage = FORTRAN_RUNTIME_STAT_FAILED_IMAGE, StatLocked = FORTRAN_RUNTIME_STAT_LOCKED, StatLockedOtherImage = FORTRAN_RUNTIME_STAT_LOCKED_OTHER_IMAGE, @@ -49,10 +49,14 @@ enum Stat { // Additional "processor-defined" STAT= values StatInvalidArgumentNumber = FORTRAN_RUNTIME_STAT_INVALID_ARG_NUMBER, StatMissingArgument = FORTRAN_RUNTIME_STAT_MISSING_ARG, - StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, + StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, // -1 StatMoveAllocSameAllocatable = FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE, StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION, + + // Dummy status for work queue continuation, declared here to perhaps + // avoid collisions + StatContinue = 201 }; RT_API_ATTRS const char *StatErrorString(int); diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h 
b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..2b46890aeebe1 --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,538 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue comprises a list of tickets. Each ticket class has a Begin() +// member function, which is called once, and a Continue() member function +// that can be called zero or more times. A ticket's execution terminates +// when either of these member functions returns a status other than +// StatContinue. When that status is not StatOk, then the whole queue +// is shut down. +// +// By returning StatContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the responsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentsOverElements, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. 
+// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. +// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. Run calls AssignTicket::Begin(), which pushes a ticket via +// BeginFinalize() and returns StatContinue. +// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. 
+ +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/connection.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include + +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; + +// Ticket worker base classes + +template class ImmediateTicketRunner { +public: + RT_API_ATTRS explicit ImmediateTicketRunner(TICKET &ticket) + : ticket_{ticket} {} + RT_API_ATTRS int Run(WorkQueue &workQueue) { + int status{ticket_.Begin(workQueue)}; + while (status == StatContinue) { + status = ticket_.Continue(workQueue); + } + return status; + } + +private: + TICKET &ticket_; +}; + +// Base class for ticket workers that operate elementwise over descriptors +class Elementwise { +protected: + RT_API_ATTRS Elementwise( + const Descriptor &instance, const Descriptor *from = nullptr) + : instance_{instance}, from_{from} { + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + RT_API_ATTRS bool IsComplete() const { return elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that 
operate over derived type components. +class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentsOverElements : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Ticket worker classes + +// Implements derived type instance initialization +class InitializeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket + : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor 
cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket : public ImmediateTicketRunner { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : ImmediateTicketRunner{*this}, to_{to}, from_{&from}, + flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type 
intrinsic assignment. +class DerivedAssignTicket : public ImmediateTicketRunner, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ImmediateTicketRunner{*this}, + ElementsOverComponents{to, derived, &from}, flags_{flags}, + memmoveFct_{memmoveFct}, deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { + +template +class DescriptorIoTicket + : public ImmediateTicketRunner>, + private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS bool &anyIoTookPlace() { return anyIoTookPlace_; } + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; + +template +class DerivedIoTicket : public ImmediateTicketRunner>, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} 
{} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; +}; + +} // namespace io::descr + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + io::descr::DescriptorIoTicket, + io::descr::DerivedIoTicket, + io::descr::DerivedIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + // APIs for particular tasks. These can return StatOk if the work is + // completed immediately. 
+ RT_API_ATTRS int BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return InitializeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + if (runTicketsImmediately_) { + return InitializeCloneTicket{clone, original, derived, hasStat, errMsg} + .Run(*this); + } else { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); + return StatContinue; + } + } + RT_API_ATTRS int BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return FinalizeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + if (runTicketsImmediately_) { + return DestroyTicket{descriptor, derived, finalize}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived, finalize); + return StatContinue; + } + } + RT_API_ATTRS int BeginAssign(Descriptor &to, const Descriptor &from, + int flags, MemmoveFct memmoveFct) { + if (runTicketsImmediately_) { + return AssignTicket{to, from, flags, memmoveFct}.Run(*this); + } else { + StartTicket().u.emplace(to, from, flags, memmoveFct); + return StatContinue; + } + } + RT_API_ATTRS int BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) { + if (runTicketsImmediately_) { + return DerivedAssignTicket{ + to, from, derived, flags, memmoveFct, deallocateAfter} + .Run(*this); + } else { + StartTicket().u.emplace( + to, from, derived, flags, 
memmoveFct, deallocateAfter); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDescriptorIo(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DescriptorIoTicket{ + io, descriptor, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, table, anyIoTookPlace); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedIo(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DerivedIoTicket{ + io, descriptor, derived, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + return StatContinue; + } + } + + RT_API_ATTRS int Run(); + +private: +#if RT_DEVICE_COMPILATION + // Always use the work queue on a GPU device to avoid recursion. + static constexpr bool runTicketsImmediately_{false}; +#else + // Avoid the work queue overhead on the host, unless it needs + // debugging, which is so much easier there. + static constexpr bool runTicketsImmediately_{true}; +#endif + + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index c5e7bdce5b2fd..3f9858671d1e1 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -67,6 +67,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -130,6 +131,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 86aeeaa88f2d1..b60c4214fc8e5 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncObject)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic 
assignments and // those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,373 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + if (workQueue.BeginAssign(to, from, flags, memmoveFct) == StatContinue) { + workQueue.Run(); + } +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + if (int status{workQueue.BeginFinalize(*toDeallocate_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncObject))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(newFrom, *derived)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + } + static constexpr int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (int status{workQueue.BeginAssign( + newFrom, *from_, nestedFlags, memmoveFct_)}; + status != StatOk && status != StatContinue) { + return status; } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && 
toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + if (int status{workQueue.BeginFinalize(to_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed + } else if (!toDerived_->noDestructionNeeded()) { + if (int status{ + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + return StatContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); } + return StatOk; } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. 
A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. - // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(to_, *toDerived_)}; + status != StatOk) { + return status; + } + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t 
toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); + if (toDerived_) { + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. 
- // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } - } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_( + instance_.ElementComponent(subscripts_, procPtr.offset), + from_->ElementComponent( + fromSubscripts_, procPtr.offset), + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementComponentBase so that it + // loops over elements at the outer level and over components at the inner. + // This yields less surprising behavior for codes that care due to the use + // of defined assignments when the ordering of their calls matters. + while (!IsComplete()) { + // Copy the data components (incl. the parent) first. 
+ switch (component_->genre()) { + case typeInfo::Component::Genre::Data: + if (component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{fromComponentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + toCompDesc, instance_, workQueue.terminator(), subscripts_); + component_->CreatePointerDescriptor( + fromCompDesc, *from_, workQueue.terminator(), fromSubscripts_); + Advance(); + if (int status{workQueue.BeginAssign( + toCompDesc, fromCompDesc, flags_, memmoveFct_)}; + status != StatOk) { + return status; + } + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{component_->SizeInBytes(instance_)}; + memmoveFct_(instance_.Element(subscripts_) + component_->offset(), + from_->Element(fromSubscripts_) + component_->offset(), + componentByteSize); + Advance(); + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + instance_.Element(subscripts_) + component_->offset())}; + const auto *fromDesc{reinterpret_cast( + from_->Element(fromSubscripts_) + component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (const auto *componentDerived{component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } + } + toDesc->Deallocate(); + } + Advance(); + } else { + // Allocatable components of the LHS are 
unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + // + // Be careful not to destroy/reallocate the LHS, if there is + // overlap between LHS and RHS (it seems that partial overlap + // is not possible, though). + // Invoke Assign() recursively to deal with potential aliasing. + // Force LHS deallocation with DeallocateLHS flag. + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + int nestedFlags{flags_ | DeallocateLHS}; + Advance(); + if (int status{workQueue.BeginAssign( + *toDesc, *fromDesc, nestedFlags, memmoveFct_)}; + status != StatOk) { + return status; + } + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +679,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -598,7 +695,6 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
if (var) { diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index 35037036f63e7..8166ab64cfd71 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,195 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitialize(instance, derived)}; + return status == StatContinue ? 
workQueue.Run() : status; +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &comp{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + auto &pptr{*instance_.ElementComponent( + subscripts_, comp.offset)}; + pptr = comp.procInitialization; + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + SkipToNextComponent(); + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t 
bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitialize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitializeClone( + clone, original, derived, hasStat, errMsg)}; + return status == StatContinue ? workQueue.Run() : status; } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. 
- stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. - if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncObject), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + if (int status{workQueue.BeginInitialize(cloneDesc, *derived)}; + status != StatOk) { + return status; } } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_)}; + status != StatOk) { + return status; + } + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); + } + } + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + if (workQueue.BeginFinalize(descriptor, derived) == StatContinue) { + workQueue.Run(); } } - return stat; } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +237,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +274,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,87 +298,94 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with 
the same // descriptor so that the rank is preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + if (int status{ + workQueue.BeginFinalize(compDesc, *compDynamicType)}; + status != StatOk) { + return status; } } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginFinalize(compDesc, *compType)}; + status != StatOk) { + return status; } } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == 
typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginFinalize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + const auto &parentType{*finalizableParentType_}; + finalizableParentType_ = nullptr; + // Don't return StatOk here if the nested Finalize is still running; + // it needs this->componentDescriptor_. + return workQueue.BeginFinalize(tmpDesc, parentType); } + return StatOk; } // The order of finalization follows Fortran 2018 7.5.6.2, with @@ -373,51 +394,71 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + if (workQueue.BeginDestroy(descriptor, derived, finalize) == StatContinue) { + workQueue.Run(); + } } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + if (int status{workQueue.BeginFinalize(instance_, derived_)}; + status != StatOk && status != StatContinue) { + return status; + } } + return StatContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. 
// Contrary to finalization, the order of deallocation does not matter. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *d, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || 
componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginDestroy( + compDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..4aa3640c1ed94 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,15 +7,44 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) -Fortran::common::optional DefinedFormattedIo(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &derived, +static RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( + IoStatementState &io, const Descriptor &descriptor, + const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special, const SubscriptValue subscripts[]) { Fortran::common::optional peek{ @@ -104,8 +133,8 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -152,5 +181,619 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, return handler.GetIoStat() == IostatOk; } +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output. 
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + } + } + return StatOk; +} + +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType * + type{addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } + } + } + if (const typeInfo::SpecialBinding * + special{type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + if (DefinedUnformattedIo(io_, instance_, *type, *special)) { + anyIoTookPlace_ = true; + return StatOk; + } else { + return IostatEnd; + } + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? 
io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. + if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + if constexpr (DIR == Direction::Output) { + return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + return externalUnf + ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elements_ * elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + bool any{false}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + any = FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Complex: + switch (kind) 
{ + case 2: + any = FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + any = FormattedCharacterIO(io_, instance_); + break; + case 2: + any = FormattedCharacterIO(io_, instance_); + break; + case 4: + any = FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + any = FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + any = FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + any = FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding * + binding{derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatContinue; + } + } + if (any) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ while (!IsComplete()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + Advance(); + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element(subscripts_)); + Advance(); + if (int status{workQueue.BeginDerivedIo( + io_, elementDesc, *derived_, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + if (workQueue.BeginDescriptorIo(io, descriptor, table, anyIoTookPlace) == + StatContinue) { + workQueue.Run(); + } + return anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. -// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) 
+#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/tools.cpp b/flang-rt/lib/runtime/tools.cpp index b08195cd31e05..24d05f369fcbe 100644 --- a/flang-rt/lib/runtime/tools.cpp +++ b/flang-rt/lib/runtime/tools.cpp @@ -205,7 +205,7 @@ RT_API_ATTRS void ShallowCopyInner(const Descriptor &to, const Descriptor &from, // Doing the recursion upwards instead of downwards puts the more common // 
cases earlier in the if-chain and has a tangible impact on performance.
 template struct ShallowCopyRankSpecialize {
-  static bool execute(const Descriptor &to, const Descriptor &from,
+  static RT_API_ATTRS bool execute(const Descriptor &to, const Descriptor &from,
       bool toIsContiguous, bool fromIsContiguous) {
     if (to.rank() == RANK && from.rank() == RANK) {
       ShallowCopyInner(to, from, toIsContiguous, fromIsContiguous);
@@ -217,7 +217,7 @@ template struct ShallowCopyRankSpecialize {
 };
 template struct ShallowCopyRankSpecialize {
-  static bool execute(const Descriptor &to, const Descriptor &from,
+  static RT_API_ATTRS bool execute(const Descriptor &to, const Descriptor &from,
       bool toIsContiguous, bool fromIsContiguous) {
     return false;
   }
diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp
index 82182696d70c6..451213202acef 100644
--- a/flang-rt/lib/runtime/type-info.cpp
+++ b/flang-rt/lib/runtime/type-info.cpp
@@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor,
     const SubscriptValue *subscripts) const {
   RUNTIME_CHECK(terminator, genre_ == Genre::Data);
   EstablishDescriptor(descriptor, container, terminator);
+  std::size_t offset{offset_};
   if (subscripts) {
-    descriptor.set_base_addr(container.Element(subscripts) + offset_);
-  } else {
-    descriptor.set_base_addr(container.OffsetElement() + offset_);
+    offset += container.SubscriptsToByteOffset(subscripts);
   }
+  descriptor.set_base_addr(container.OffsetElement() + offset);
   descriptor.raw().attribute = CFI_attribute_pointer;
 }
diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp
new file mode 100644
index 0000000000000..3617f8633f7f0
--- /dev/null
+++ b/flang-rt/lib/runtime/work-queue.cpp
@@ -0,0 +1,161 @@
+//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/environment.h" +#include "flang-rt/runtime/memory.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +#if !defined(RT_DEVICE_COMPILATION) +// FLANG_RT_DEBUG code is disabled when false. +static constexpr bool enableDebugOutput{false}; +#endif + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (componentAt_ >= components_) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + FreeMemory(firstFree_); + } + firstFree_ = next; + } +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + void *p{AllocateMemoryOrCrash(terminator_, sizeof(TicketList))}; + firstFree_ = new (p) TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = 
newTicket->next)) { + firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && + (executionEnvironment.internalDebugging & + ExecutionEnvironment::WorkQueue)) { + std::fprintf(stderr, "WQ: new ticket\n"); + } +#endif + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && + (executionEnvironment.internalDebugging & + ExecutionEnvironment::WorkQueue)) { + std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), + at->ticket.begun ? "Continue" : "Begin"); + } +#endif + int stat{at->ticket.Continue(*this)}; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && + (executionEnvironment.internalDebugging & + ExecutionEnvironment::WorkQueue)) { + std::fprintf(stderr, "WQ: ... 
stat %d\n", stat); + } +#endif + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp index 3833e48be3dd6..6c148b1de6f82 100644 --- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp +++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp @@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) { io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__); for (int j{1}; j <= 3; ++j) { ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc)) - << "OutputDescriptor() for InquireIoLength"; + << "OutputDescriptor() for InquireIoLength " << j; } ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength"; ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk) diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..32bebc1d866a4 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -843,6 +843,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. + However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. 
+ Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. + A program using defined assignment might be able to detect the difference. + ## De Facto Standard Features * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION From flang-commits at lists.llvm.org Tue May 27 20:48:18 2025 From: flang-commits at lists.llvm.org (Pranav Bhandarkar via flang-commits) Date: Tue, 27 May 2025 20:48:18 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] - Handle `BoxCharType` in `fir.box_offset` op (PR #141713) Message-ID: https://github.com/bhandarkar-pranav created https://github.com/llvm/llvm-project/pull/141713 To map `fir.boxchar` types reliably onto an offload target, such as a GPU, the `omp.map.info` is used to map the underlying data pointer (`fir.ref`) wrapped by the `fir.boxchar` MLIR value. The `omp.map.info` instruction needs a pointer to the underlying data pointer. Given a reference to a descriptor (`fir.box`), the `fir.box_offset` is used to obtain the address of the underlying data pointer. This PR extends `fir.box_offset` to provide the same functionality for `fir.boxchar` as well. 
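As a rough illustration of what "a pointer to the underlying data pointer" means in the description above, the sketch below models a `fir.boxchar` as a plain C++ struct holding a data pointer and a length. This mirrors the `{ ptr, i64 }` layout that the codegen test in this patch checks for. The names `BoxCharModel`, `baseAddrOf`, and `retarget` are hypothetical and exist only for this example; they are not part of flang or MLIR.

```cpp
#include <cstdint>

// Hypothetical model of a lowered fir.boxchar: a data pointer plus a length,
// mirroring the { ptr, i64 } struct seen in box-offset-codegen.fir.
struct BoxCharModel {
  char *data;
  std::int64_t len;
};

// Illustrative analogue of `fir.box_offset %boxchar base_addr`: given the
// address of the boxchar, return the address of its data-pointer field
// (the equivalent of a GEP with indices 0, 0), not the data pointer itself.
char **baseAddrOf(BoxCharModel *box) { return &box->data; }

// With the field's address in hand, a consumer can overwrite the data
// pointer in place while leaving the length untouched.
void retarget(BoxCharModel *box, char *newData) { *baseAddrOf(box) = newData; }
```

Returning the address of the field rather than its value is what would let a consumer, such as a mapping step, update the pointer stored inside the boxchar; how `omp.map.info` actually uses that address is not modelled here.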
>From 271272f7a98bf5bf5e651c70cbd5030a311cc078 Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Fri, 23 May 2025 10:23:57 -0500 Subject: [PATCH] Add the ability to the fir.box_offset op to handle references to fir.boxchar --- .../include/flang/Optimizer/Dialect/FIROps.td | 6 +++++ .../include/flang/Optimizer/Dialect/FIRType.h | 5 ++-- flang/lib/Optimizer/CodeGen/CodeGen.cpp | 25 ++++++++++++++----- flang/lib/Optimizer/Dialect/FIROps.cpp | 16 +++++++++--- flang/lib/Optimizer/Dialect/FIRType.cpp | 2 +- flang/test/Fir/box-offset-codegen.fir | 10 ++++++++ flang/test/Fir/box-offset.fir | 5 ++++ flang/test/Fir/invalid.fir | 10 +++++++- 8 files changed, 66 insertions(+), 13 deletions(-) diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td index dc66885f776f0..160de05a33b41 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.td +++ b/flang/include/flang/Optimizer/Dialect/FIROps.td @@ -3240,11 +3240,17 @@ def fir_BoxOffsetOp : fir_Op<"box_offset", [NoMemoryEffect]> { descriptor implementation must have, only the base_addr and derived_type descriptor fields can be addressed. + It also accepts the address of a fir.boxchar and returns + address of the data pointer encapsulated by the fir.boxchar. + ``` %addr = fir.box_offset %box base_addr : (!fir.ref>>) -> !fir.llvm_ptr>> %tdesc = fir.box_offset %box derived_type : (!fir.ref>>) -> !fir.llvm_ptr>> + %addr1 = fir.box_offset %boxchar base_addr : (!fir.ref>) -> !fir.llvm_ptr>> ``` + + The derived_type field cannot be used when the input to this op is a reference to a fir.boxchar. 
}]; let arguments = (ins diff --git a/flang/include/flang/Optimizer/Dialect/FIRType.h b/flang/include/flang/Optimizer/Dialect/FIRType.h index 52b14f15f89bd..01878aa41005c 100644 --- a/flang/include/flang/Optimizer/Dialect/FIRType.h +++ b/flang/include/flang/Optimizer/Dialect/FIRType.h @@ -278,8 +278,9 @@ inline mlir::Type unwrapRefType(mlir::Type t) { /// If `t` conforms with a pass-by-reference type (box, ref, ptr, etc.) then /// return the element type of `t`. Otherwise, return `t`. inline mlir::Type unwrapPassByRefType(mlir::Type t) { - if (auto eleTy = dyn_cast_ptrOrBoxEleTy(t)) - return eleTy; + if (conformsWithPassByRef(t)) + if (auto eleTy = dyn_cast_ptrOrBoxEleTy(t)) + return eleTy; return t; } diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index 205807eab403a..e383c2e3e89ab 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -3930,12 +3930,25 @@ struct BoxOffsetOpConversion : public fir::FIROpConversion { mlir::ConversionPatternRewriter &rewriter) const override { mlir::Type pty = ::getLlvmPtrType(boxOffset.getContext()); - mlir::Type boxType = fir::unwrapRefType(boxOffset.getBoxRef().getType()); - mlir::Type llvmBoxTy = - lowerTy().convertBoxTypeAsStruct(mlir::cast(boxType)); - int fieldId = boxOffset.getField() == fir::BoxFieldAttr::derived_type - ? getTypeDescFieldId(boxType) - : kAddrPosInBox; + mlir::Type boxRefType = fir::unwrapRefType(boxOffset.getBoxRef().getType()); + + assert((mlir::isa(boxRefType) || + mlir::isa(boxRefType)) && + "boxRef should be a reference to either fir.box or fir.boxchar"); + + mlir::Type llvmBoxTy; + int fieldId; + if (auto boxType = mlir::dyn_cast_or_null(boxRefType)) { + llvmBoxTy = + lowerTy().convertBoxTypeAsStruct(mlir::cast(boxType)); + fieldId = boxOffset.getField() == fir::BoxFieldAttr::derived_type + ? 
getTypeDescFieldId(boxType) + : kAddrPosInBox; + } else { + auto boxCharType = mlir::cast(boxRefType); + llvmBoxTy = lowerTy().convertType(boxCharType); + fieldId = kAddrPosInBox; + } rewriter.replaceOpWithNewOp( boxOffset, pty, llvmBoxTy, adaptor.getBoxRef(), llvm::ArrayRef{0, fieldId}); diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index cbe93907265f6..6435886d73081 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -4484,15 +4484,25 @@ void fir::IfOp::resultToSourceOps(llvm::SmallVectorImpl &results, llvm::LogicalResult fir::BoxOffsetOp::verify() { auto boxType = mlir::dyn_cast_or_null( fir::dyn_cast_ptrEleTy(getBoxRef().getType())); - if (!boxType) - return emitOpError("box_ref operand must have !fir.ref> type"); + mlir::Type boxCharType; + bool isBoxChar = false; + if (!boxType) { + boxCharType = mlir::dyn_cast_or_null( + fir::dyn_cast_ptrEleTy(getBoxRef().getType())); + if (!boxCharType) + return emitOpError("box_ref operand must have !fir.ref> or !fir.ref> type"); + isBoxChar = true; + } if (getField() != fir::BoxFieldAttr::base_addr && getField() != fir::BoxFieldAttr::derived_type) return emitOpError("cannot address provided field"); - if (getField() == fir::BoxFieldAttr::derived_type) + if (getField() == fir::BoxFieldAttr::derived_type) { + if (isBoxChar) + return emitOpError("cannot address derived_type field of a fir.boxchar"); if (!fir::boxHasAddendum(boxType)) return emitOpError("can only address derived_type field of derived type " "or unlimited polymorphic fir.box"); + } return mlir::success(); } diff --git a/flang/lib/Optimizer/Dialect/FIRType.cpp b/flang/lib/Optimizer/Dialect/FIRType.cpp index 1e6e95393c2f7..da7aa17445404 100644 --- a/flang/lib/Optimizer/Dialect/FIRType.cpp +++ b/flang/lib/Optimizer/Dialect/FIRType.cpp @@ -255,7 +255,7 @@ mlir::Type dyn_cast_ptrOrBoxEleTy(mlir::Type t) { return llvm::TypeSwitch(t) .Case([](auto p) { return p.getEleTy(); 
}) - .Case( + .Case( [](auto p) { return unwrapRefType(p.getEleTy()); }) .Default([](mlir::Type) { return mlir::Type{}; }); } diff --git a/flang/test/Fir/box-offset-codegen.fir b/flang/test/Fir/box-offset-codegen.fir index 15c9a11e5aefe..59cfda8523061 100644 --- a/flang/test/Fir/box-offset-codegen.fir +++ b/flang/test/Fir/box-offset-codegen.fir @@ -37,3 +37,13 @@ func.func @array_tdesc(%array : !fir.ref>) -> !fir.llvm_ptr>> { + %addr = fir.box_offset %boxchar base_addr : (!fir.ref>) -> !fir.llvm_ptr>> + return %addr : !fir.llvm_ptr>> +} + +// CHECK-LABEL: define ptr @boxchar_addr( +// CHECK-SAME: ptr captures(none) %[[BOXCHAR:.*]]){{.*}} { +// CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64 }, ptr %[[BOXCHAR]], i32 0, i32 0 +// CHECK: ret ptr %[[VAL_0]] diff --git a/flang/test/Fir/box-offset.fir b/flang/test/Fir/box-offset.fir index 98c2eaefb8d6b..181ad51a5dbe1 100644 --- a/flang/test/Fir/box-offset.fir +++ b/flang/test/Fir/box-offset.fir @@ -21,6 +21,9 @@ func.func @test_box_offset(%unlimited : !fir.ref>, %type_star : %addr6 = fir.box_offset %type_star base_addr : (!fir.ref>>) -> !fir.llvm_ptr>> %tdesc6 = fir.box_offset %type_star derived_type : (!fir.ref>>) -> !fir.llvm_ptr> + + %boxchar = fir.alloca !fir.boxchar<1> + %addr7 = fir.box_offset %boxchar base_addr : (!fir.ref>) -> !fir.llvm_ptr>> return } // CHECK-LABEL: func.func @test_box_offset( @@ -40,3 +43,5 @@ func.func @test_box_offset(%unlimited : !fir.ref>, %type_star : // CHECK: %[[VAL_13:.*]] = fir.box_offset %[[VAL_0]] derived_type : (!fir.ref>) -> !fir.llvm_ptr> // CHECK: %[[VAL_14:.*]] = fir.box_offset %[[VAL_1]] base_addr : (!fir.ref>>) -> !fir.llvm_ptr>> // CHECK: %[[VAL_15:.*]] = fir.box_offset %[[VAL_1]] derived_type : (!fir.ref>>) -> !fir.llvm_ptr> +// CHECK: %[[VAL_16:.*]] = fir.alloca !fir.boxchar<1> +// CHECK: %[[VAL_17:.*]] = fir.box_offset %[[VAL_16]] base_addr : (!fir.ref>) -> !fir.llvm_ptr>> diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index 
fd607fd9066f7..45cae1f82cb8e 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -972,13 +972,21 @@ func.func @rec_to_rec(%arg0: !fir.type) -> !fir.type) { - // expected-error at +1{{'fir.box_offset' op box_ref operand must have !fir.ref> type}} + // expected-error at +1{{'fir.box_offset' op box_ref operand must have !fir.ref> or !fir.ref> type}} %addr1 = fir.box_offset %not_a_box base_addr : (!fir.ref) -> !fir.llvm_ptr> return } // ----- +func.func @bad_box_offset(%boxchar : !fir.ref>) { + // expected-error at +1{{'fir.box_offset' op cannot address derived_type field of a fir.boxchar}} + %addr1 = fir.box_offset %boxchar derived_type : (!fir.ref>) -> !fir.llvm_ptr>> + return +} + +// ----- + func.func @bad_box_offset(%no_addendum : !fir.ref>) { // expected-error at +1{{'fir.box_offset' op can only address derived_type field of derived type or unlimited polymorphic fir.box}} %addr1 = fir.box_offset %no_addendum derived_type : (!fir.ref>) -> !fir.llvm_ptr>> From flang-commits at lists.llvm.org Tue May 27 20:48:52 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 27 May 2025 20:48:52 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] - Handle `BoxCharType` in `fir.box_offset` op (PR #141713) In-Reply-To: Message-ID: <683687a4.170a0220.1de21c.08d3@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Pranav Bhandarkar (bhandarkar-pranav)
Changes To map `fir.boxchar` types reliably onto an offload target, such as a GPU, the `omp.map.info` is used to map the underlying data pointer (`fir.ref<fir.char<k, ?>`) wrapped by the `fir.boxchar` MLIR value. The `omp.map.info` instruction needs a pointer to the underlying data pointer. Given a reference to a descriptor (`fir.box`), the `fir.box_offset` is used to obtain the address of the underlying data pointer. This PR extends `fir.box_offset` to provide the same functionality for `fir.boxchar` as well. --- Full diff: https://github.com/llvm/llvm-project/pull/141713.diff 8 Files Affected: - (modified) flang/include/flang/Optimizer/Dialect/FIROps.td (+6) - (modified) flang/include/flang/Optimizer/Dialect/FIRType.h (+3-2) - (modified) flang/lib/Optimizer/CodeGen/CodeGen.cpp (+19-6) - (modified) flang/lib/Optimizer/Dialect/FIROps.cpp (+13-3) - (modified) flang/lib/Optimizer/Dialect/FIRType.cpp (+1-1) - (modified) flang/test/Fir/box-offset-codegen.fir (+10) - (modified) flang/test/Fir/box-offset.fir (+5) - (modified) flang/test/Fir/invalid.fir (+9-1)
https://github.com/llvm/llvm-project/pull/141713 From flang-commits at lists.llvm.org Tue May 27 20:50:44 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Tue, 27 May 2025 20:50:44 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] - Handle `BoxCharType` in `fir.box_offset` op (PR #141713) In-Reply-To: Message-ID: <68368814.170a0220.38e8c6.0d32@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command:

``````````bash
git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- flang/include/flang/Optimizer/Dialect/FIRType.h flang/lib/Optimizer/CodeGen/CodeGen.cpp flang/lib/Optimizer/Dialect/FIROps.cpp flang/lib/Optimizer/Dialect/FIRType.cpp
``````````
View the diff from clang-format here.

``````````diff
diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
index e383c2e3e..82d960a6f 100644
--- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp
+++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
@@ -3939,11 +3939,11 @@ struct BoxOffsetOpConversion : public fir::FIROpConversion {
 mlir::Type llvmBoxTy;
 int fieldId;
 if (auto boxType = mlir::dyn_cast_or_null(boxRefType)) {
- llvmBoxTy =
- lowerTy().convertBoxTypeAsStruct(mlir::cast(boxType));
+ llvmBoxTy = lowerTy().convertBoxTypeAsStruct(
+ mlir::cast(boxType));
 fieldId = boxOffset.getField() == fir::BoxFieldAttr::derived_type
- ? getTypeDescFieldId(boxType)
- : kAddrPosInBox;
+ ? getTypeDescFieldId(boxType)
+ : kAddrPosInBox;
 } else {
 auto boxCharType = mlir::cast(boxRefType);
 llvmBoxTy = lowerTy().convertType(boxCharType);
diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp
index 6435886d7..8d3c82d00 100644
--- a/flang/lib/Optimizer/Dialect/FIROps.cpp
+++ b/flang/lib/Optimizer/Dialect/FIROps.cpp
@@ -4490,7 +4490,8 @@ llvm::LogicalResult fir::BoxOffsetOp::verify() {
 boxCharType = mlir::dyn_cast_or_null(
 fir::dyn_cast_ptrEleTy(getBoxRef().getType()));
 if (!boxCharType)
- return emitOpError("box_ref operand must have !fir.ref> or !fir.ref> type");
+ return emitOpError("box_ref operand must have !fir.ref> or "
+ "!fir.ref> type");
 isBoxChar = true;
 }
 if (getField() != fir::BoxFieldAttr::base_addr &&
``````````
https://github.com/llvm/llvm-project/pull/141713 From flang-commits at lists.llvm.org Tue May 27 20:57:37 2025 From: flang-commits at lists.llvm.org (Pranav Bhandarkar via flang-commits) Date: Tue, 27 May 2025 20:57:37 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] - Handle `BoxCharType` in `fir.box_offset` op (PR #141713) In-Reply-To: Message-ID: <683689b1.050a0220.2d9e44.0a63@mx.google.com> https://github.com/bhandarkar-pranav edited https://github.com/llvm/llvm-project/pull/141713 From flang-commits at lists.llvm.org Wed May 28 02:08:48 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 28 May 2025 02:08:48 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Retrieve shape from selector when generating assoc sym type (PR #137117) In-Reply-To: Message-ID: <6836d2a0.170a0220.1eb333.1747@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/137117 >From b707fd59ee81e0b5f2a52c1f4a018ae21fcc53ba Mon Sep 17 00:00:00 2001 From: ergawy Date: Thu, 24 Apr 2025 00:34:44 -0500 Subject: [PATCH 1/2] [flang] Retrieve shape from selector when generating assoc sym type This PR extends `genSymbolType` so that the type of an associating symbol carries the shape of the selector expression, if any. This is a fix for a bug that triggered when an associating symbol is used in a locality specifier. For example, given the following input: ```fortran associate(a => aa(4:)) do concurrent (i = 4:11) local(a) a(i) = 0 end do end associate ``` before the changes in the PR, flang would assert that we are casting between incompatible types. The issue happened since for the associating symbol (`a`), flang generated its type as `f32` rather than `!fir.array<8xf32>` as it should be in this case. 
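To see where the `8` in `!fir.array<8xf32>` comes from, the arithmetic can be sketched as follows. `aa` is declared `dimension(2:11)`, so the section `aa(4:)` runs from index 4 to the upper bound 11. The helper `sectionExtent` is a hypothetical name used only for this illustration and assumes a unit-stride section; it is not flang's shape analysis.

```cpp
#include <cstdint>

// Hypothetical helper: number of elements in the unit-stride section
// lower:upper (both bounds inclusive). An empty section yields 0.
constexpr std::int64_t sectionExtent(std::int64_t lower, std::int64_t upper) {
  return upper >= lower ? upper - lower + 1 : 0;
}

// aa is declared as dimension(2:11); aa(4:) keeps the upper bound 11.
constexpr std::int64_t aaUpper = 11;
static_assert(sectionExtent(2, aaUpper) == 10, "aa itself has 10 elements");
static_assert(sectionExtent(4, aaUpper) == 8, "a => aa(4:) has 8 elements");
```

The associating symbol `a` therefore has 8 elements, which matches the `arith.constant 8` and `fir.shape` checks in the test added by this patch.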
--- flang/lib/Lower/ConvertType.cpp | 17 ++++++++++++++ .../do_concurrent_local_assoc_entity.f90 | 22 +++++++++++++++++++ 2 files changed, 39 insertions(+) create mode 100644 flang/test/Lower/do_concurrent_local_assoc_entity.f90 diff --git a/flang/lib/Lower/ConvertType.cpp b/flang/lib/Lower/ConvertType.cpp index d45f9e7c0bf1b..875bdba6cc6ba 100644 --- a/flang/lib/Lower/ConvertType.cpp +++ b/flang/lib/Lower/ConvertType.cpp @@ -279,6 +279,23 @@ struct TypeBuilderImpl { bool isPolymorphic = (Fortran::semantics::IsPolymorphic(symbol) || Fortran::semantics::IsUnlimitedPolymorphic(symbol)) && !Fortran::semantics::IsAssumedType(symbol); + if (const auto *assocDetails = + ultimate.detailsIf()) { + const auto &selector = assocDetails->expr(); + + if (selector && selector->Rank() > 0) { + auto shapeExpr = Fortran::evaluate::GetShape( + converter.getFoldingContext(), selector); + + fir::SequenceType::Shape shape; + // If there is no shapExpr, this is an assumed-rank, and the empty shape + // will build the desired fir.array<*:T> type. + if (shapeExpr) + translateShape(shape, std::move(*shapeExpr)); + ty = fir::SequenceType::get(shape, ty); + } + } + if (ultimate.IsObjectArray()) { auto shapeExpr = Fortran::evaluate::GetShape(converter.getFoldingContext(), ultimate); diff --git a/flang/test/Lower/do_concurrent_local_assoc_entity.f90 b/flang/test/Lower/do_concurrent_local_assoc_entity.f90 new file mode 100644 index 0000000000000..ca16ecaa5c137 --- /dev/null +++ b/flang/test/Lower/do_concurrent_local_assoc_entity.f90 @@ -0,0 +1,22 @@ +! RUN: %flang_fc1 -emit-hlfir -o - %s | FileCheck %s + +subroutine local_assoc + implicit none + integer i + real, dimension(2:11) :: aa + + associate(a => aa(4:)) + do concurrent (i = 4:11) local(a) + a(i) = 0 + end do + end associate +end subroutine local_assoc + +! CHECK: %[[C8:.*]] = arith.constant 8 : index + +! CHECK: fir.do_loop {{.*}} unordered { +! 
CHECK: %[[LOCAL_ALLOC:.*]] = fir.alloca !fir.array<8xf32> {bindc_name = "a", pinned, uniq_name = "{{.*}}local_assocEa"} +! CHECK: %[[LOCAL_SHAPE:.*]] = fir.shape %[[C8]] : +! CHECK: %[[LOCAL_DECL:.*]]:2 = hlfir.declare %[[LOCAL_ALLOC]](%[[LOCAL_SHAPE]]) +! CHECK: hlfir.designate %[[LOCAL_DECL]]#0 (%{{.*}}) +! CHECK: } >From dde0225506a090c701e2ab98c1ee5837cb5ba61b Mon Sep 17 00:00:00 2001 From: ergawy Date: Wed, 28 May 2025 04:08:24 -0500 Subject: [PATCH 2/2] review comment, restructure code --- flang/lib/Lower/ConvertType.cpp | 39 +++++++------------ .../do_concurrent_local_assoc_entity.f90 | 2 +- 2 files changed, 14 insertions(+), 27 deletions(-) diff --git a/flang/lib/Lower/ConvertType.cpp b/flang/lib/Lower/ConvertType.cpp index 875bdba6cc6ba..7a2e8e5095186 100644 --- a/flang/lib/Lower/ConvertType.cpp +++ b/flang/lib/Lower/ConvertType.cpp @@ -276,36 +276,23 @@ struct TypeBuilderImpl { } else { fir::emitFatalError(loc, "symbol must have a type"); } - bool isPolymorphic = (Fortran::semantics::IsPolymorphic(symbol) || - Fortran::semantics::IsUnlimitedPolymorphic(symbol)) && - !Fortran::semantics::IsAssumedType(symbol); - if (const auto *assocDetails = - ultimate.detailsIf()) { - const auto &selector = assocDetails->expr(); - - if (selector && selector->Rank() > 0) { - auto shapeExpr = Fortran::evaluate::GetShape( - converter.getFoldingContext(), selector); - - fir::SequenceType::Shape shape; - // If there is no shapExpr, this is an assumed-rank, and the empty shape - // will build the desired fir.array<*:T> type. - if (shapeExpr) - translateShape(shape, std::move(*shapeExpr)); - ty = fir::SequenceType::get(shape, ty); - } - } - if (ultimate.IsObjectArray()) { - auto shapeExpr = - Fortran::evaluate::GetShape(converter.getFoldingContext(), ultimate); + auto shapeExpr = + Fortran::evaluate::GetShape(converter.getFoldingContext(), ultimate); + + if (shapeExpr && !shapeExpr->empty()) { + // Statically ranked array. 
fir::SequenceType::Shape shape; - // If there is no shapExpr, this is an assumed-rank, and the empty shape - // will build the desired fir.array<*:T> type. - if (shapeExpr) - translateShape(shape, std::move(*shapeExpr)); + translateShape(shape, std::move(*shapeExpr)); ty = fir::SequenceType::get(shape, ty); + } else if (!shapeExpr) { + // Assumed-rank. + ty = fir::SequenceType::get(fir::SequenceType::Shape{}, ty); } + + bool isPolymorphic = (Fortran::semantics::IsPolymorphic(symbol) || + Fortran::semantics::IsUnlimitedPolymorphic(symbol)) && + !Fortran::semantics::IsAssumedType(symbol); if (Fortran::semantics::IsPointer(symbol)) return fir::wrapInClassOrBoxType(fir::PointerType::get(ty), isPolymorphic); diff --git a/flang/test/Lower/do_concurrent_local_assoc_entity.f90 b/flang/test/Lower/do_concurrent_local_assoc_entity.f90 index ca16ecaa5c137..280827871aaf4 100644 --- a/flang/test/Lower/do_concurrent_local_assoc_entity.f90 +++ b/flang/test/Lower/do_concurrent_local_assoc_entity.f90 @@ -14,7 +14,7 @@ end subroutine local_assoc ! CHECK: %[[C8:.*]] = arith.constant 8 : index -! CHECK: fir.do_loop {{.*}} unordered { +! CHECK: fir.do_concurrent.loop {{.*}} { ! CHECK: %[[LOCAL_ALLOC:.*]] = fir.alloca !fir.array<8xf32> {bindc_name = "a", pinned, uniq_name = "{{.*}}local_assocEa"} ! CHECK: %[[LOCAL_SHAPE:.*]] = fir.shape %[[C8]] : ! 
CHECK: %[[LOCAL_DECL:.*]]:2 = hlfir.declare %[[LOCAL_ALLOC]](%[[LOCAL_SHAPE]]) From flang-commits at lists.llvm.org Wed May 28 02:09:35 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 28 May 2025 02:09:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Retrieve shape from selector when generating assoc sym type (PR #137117) In-Reply-To: Message-ID: <6836d2cf.620a0220.f9585.1d31@mx.google.com> ================ @@ -279,6 +279,23 @@ struct TypeBuilderImpl { bool isPolymorphic = (Fortran::semantics::IsPolymorphic(symbol) || Fortran::semantics::IsUnlimitedPolymorphic(symbol)) && !Fortran::semantics::IsAssumedType(symbol); + if (const auto *assocDetails = ---------------- ergawy wrote: Sorry for very late reply on my end as well. Thanks for the comment, done. https://github.com/llvm/llvm-project/pull/137117 From flang-commits at lists.llvm.org Wed May 28 02:16:59 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 28 May 2025 02:16:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Basic PFT to MLIR lowering for do concurrent locality specifiers (PR #138534) In-Reply-To: Message-ID: <6836d48b.630a0220.2b9719.12ff@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/138534 >From 9d50ab6dffa8c58569eb6b45d0b7b1dba703eb17 Mon Sep 17 00:00:00 2001 From: ergawy Date: Mon, 5 May 2025 07:15:52 -0500 Subject: [PATCH 1/3] [flang][fir] Basic PFT to MLIR lowering for do concurrent locality specifiers Extends support for `fir.do_concurrent` locality specifiers to the PFT to MLIR level. This adds code-gen for generating the newly added `fir.local` ops and referencing these ops from `fir.do_concurrent.loop` ops that have locality specifiers attached to them. This reuses the `DataSharingProcessor` component and generalizes it a bit more to allow for handling `omp.private` ops and `fir.local` ops as well. 
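The generalization described above (one privatization routine serving both `omp.private` and `fir.local`) can be sketched in isolation roughly as follows. This is only an illustration of the compile-time dispatch pattern, not the code from the patch: `OmpPrivateOp`, `FirLocalOp`, `OmpPrivateOperands`, `LocalityOperands`, and the string-based "op" records are hypothetical stand-ins for the real MLIR op and operand types.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for the two kinds of privatizer ops.
struct OmpPrivateOp { static constexpr const char *name = "omp.private"; };
struct FirLocalOp { static constexpr const char *name = "fir.local"; };

// Hypothetical stand-ins for the operand structs that collect the results.
struct OmpPrivateOperands { std::vector<std::string> privateSyms; };
struct LocalityOperands { std::vector<std::string> privateSyms; };

// One routine for both op kinds. The op type is chosen explicitly by the
// caller; the operands struct type is deduced from the pointer argument.
template <typename OpType, typename OperandsStructType>
void privatizeSymbol(const std::string &sym, bool copiesFromHost,
                     OperandsStructType *operands) {
  std::string kind;
  // Only the branch matching OpType is instantiated.
  if constexpr (std::is_same_v<OpType, OmpPrivateOp>)
    kind = copiesFromHost ? "firstprivate" : "private";
  else
    kind = copiesFromHost ? "local_init" : "local";
  // Record a textual description of the privatizer that would be created.
  operands->privateSyms.push_back(std::string(OpType::name) + " @" + sym +
                                  "." + kind);
}
```

A caller would pick the op kind at the call site, for example `privatizeSymbol<FirLocalOp>("a", /*copiesFromHost=*/false, &ops)`, so the shared logic is written once and only the op-specific pieces differ per instantiation.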
--- flang/include/flang/Lower/AbstractConverter.h | 4 + .../include/flang/Optimizer/Dialect/FIROps.h | 4 + .../include/flang/Optimizer/Dialect/FIROps.td | 15 +++ flang/lib/Lower/Bridge.cpp | 59 ++++++++-- .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 104 +++++++++++++----- flang/lib/Lower/OpenMP/DataSharingProcessor.h | 14 ++- .../Lower/do_concurrent_delayed_locality.f90 | 49 +++++++++ 7 files changed, 209 insertions(+), 40 deletions(-) create mode 100644 flang/test/Lower/do_concurrent_delayed_locality.f90 diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 1d1323642bf9c..8ae68e143cd2f 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -348,6 +348,10 @@ class AbstractConverter { virtual Fortran::lower::SymbolBox lookupOneLevelUpSymbol(const Fortran::semantics::Symbol &sym) = 0; + /// Find the symbol in the inner-most level of the local map or return null. + virtual Fortran::lower::SymbolBox + shallowLookupSymbol(const Fortran::semantics::Symbol &sym) = 0; + /// Return the mlir::SymbolTable associated to the ModuleOp. 
/// Look-ups are faster using it than using module.lookup<>, /// but the module op should be queried in case of failure diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.h b/flang/include/flang/Optimizer/Dialect/FIROps.h index 1bed227afb50d..62ef8b4b502f2 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.h +++ b/flang/include/flang/Optimizer/Dialect/FIROps.h @@ -147,6 +147,10 @@ class CoordinateIndicesAdaptor { mlir::ValueRange values; }; +struct LocalitySpecifierOperands { + llvm::SmallVector<::mlir::Value> privateVars; + llvm::SmallVector<::mlir::Attribute> privateSyms; +}; } // namespace fir #endif // FORTRAN_OPTIMIZER_DIALECT_FIROPS_H diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td index dc66885f776f0..f4b17ef7eed09 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.td +++ b/flang/include/flang/Optimizer/Dialect/FIROps.td @@ -3605,6 +3605,21 @@ def fir_LocalitySpecifierOp : fir_Op<"local", [IsolatedFromAbove]> { ]; let extraClassDeclaration = [{ + mlir::BlockArgument getInitMoldArg() { + auto &region = getInitRegion(); + return region.empty() ? nullptr : region.getArgument(0); + } + mlir::BlockArgument getInitPrivateArg() { + auto &region = getInitRegion(); + return region.empty() ? nullptr : region.getArgument(1); + } + + /// Returns true if the init region might read from the mold argument + bool initReadsFromMold() { + mlir::BlockArgument moldArg = getInitMoldArg(); + return moldArg && !moldArg.use_empty(); + } + /// Get the type for arguments to nested regions. This should /// generally be either the same as getType() or some pointer /// type (pointing to the type allocated by this op). 
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index c9e91cf3e8042..df7ff6dde1065 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -12,6 +12,8 @@ #include "flang/Lower/Bridge.h" +#include "OpenMP/DataSharingProcessor.h" +#include "OpenMP/Utils.h" #include "flang/Lower/Allocatable.h" #include "flang/Lower/CallInterface.h" #include "flang/Lower/Coarray.h" @@ -1142,6 +1144,14 @@ class FirConverter : public Fortran::lower::AbstractConverter { return name; } + /// Find the symbol in the inner-most level of the local map or return null. + Fortran::lower::SymbolBox + shallowLookupSymbol(const Fortran::semantics::Symbol &sym) override { + if (Fortran::lower::SymbolBox v = localSymbols.shallowLookupSymbol(sym)) + return v; + return {}; + } + private: FirConverter() = delete; FirConverter(const FirConverter &) = delete; @@ -1216,14 +1226,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { return {}; } - /// Find the symbol in the inner-most level of the local map or return null. - Fortran::lower::SymbolBox - shallowLookupSymbol(const Fortran::semantics::Symbol &sym) { - if (Fortran::lower::SymbolBox v = localSymbols.shallowLookupSymbol(sym)) - return v; - return {}; - } - /// Find the symbol in one level up of symbol map such as for host-association /// in OpenMP code or return null. 
Fortran::lower::SymbolBox @@ -2027,9 +2029,30 @@ class FirConverter : public Fortran::lower::AbstractConverter { void handleLocalitySpecs(const IncrementLoopInfo &info) { Fortran::semantics::SemanticsContext &semanticsContext = bridge.getSemanticsContext(); - for (const Fortran::semantics::Symbol *sym : info.localSymList) + Fortran::lower::omp::DataSharingProcessor dsp( + *this, semanticsContext, getEval(), + /*useDelayedPrivatization=*/true, localSymbols); + fir::LocalitySpecifierOperands privateClauseOps; + auto doConcurrentLoopOp = + mlir::dyn_cast_if_present(info.loopOp); + bool useDelayedPriv = + enableDelayedPrivatizationStaging && doConcurrentLoopOp; + + for (const Fortran::semantics::Symbol *sym : info.localSymList) { + if (useDelayedPriv) { + dsp.privatizeSymbol(sym, &privateClauseOps); + continue; + } + createHostAssociateVarClone(*sym, /*skipDefaultInit=*/false); + } + for (const Fortran::semantics::Symbol *sym : info.localInitSymList) { + if (useDelayedPriv) { + dsp.privatizeSymbol(sym, &privateClauseOps); + continue; + } + createHostAssociateVarClone(*sym, /*skipDefaultInit=*/true); const auto *hostDetails = sym->detailsIf(); @@ -2048,6 +2071,24 @@ class FirConverter : public Fortran::lower::AbstractConverter { sym->detailsIf(); copySymbolBinding(hostDetails->symbol(), *sym); } + + if (useDelayedPriv) { + doConcurrentLoopOp.getLocalVarsMutable().assign( + privateClauseOps.privateVars); + doConcurrentLoopOp.setLocalSymsAttr( + builder->getArrayAttr(privateClauseOps.privateSyms)); + + for (auto [sym, privateVar] : llvm::zip_equal( + dsp.getAllSymbolsToPrivatize(), privateClauseOps.privateVars)) { + auto arg = doConcurrentLoopOp.getRegion().begin()->addArgument( + privateVar.getType(), doConcurrentLoopOp.getLoc()); + bindSymbol(*sym, hlfir::translateToExtendedValue( + privateVar.getLoc(), *builder, hlfir::Entity{arg}, + /*contiguousHint=*/true) + .first); + } + } + // Note that allocatable, types with ultimate components, and type // requiring 
finalization are forbidden in LOCAL/LOCAL_INIT (F2023 C1130), // so no clean-up needs to be generated for these entities. diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 20dc46e4710fb..629478294ef5b 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -20,6 +20,7 @@ #include "flang/Optimizer/Builder/BoxValue.h" #include "flang/Optimizer/Builder/HLFIRTools.h" #include "flang/Optimizer/Builder/Todo.h" +#include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/HLFIR/HLFIRDialect.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Semantics/attr.h" @@ -53,6 +54,15 @@ DataSharingProcessor::DataSharingProcessor( }); } +DataSharingProcessor::DataSharingProcessor(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + lower::pft::Evaluation &eval, + bool useDelayedPrivatization, + lower::SymMap &symTable) + : DataSharingProcessor(converter, semaCtx, {}, eval, + /*shouldCollectPreDeterminedSymols=*/false, + useDelayedPrivatization, symTable) {} + void DataSharingProcessor::processStep1( mlir::omp::PrivateClauseOps *clauseOps) { collectSymbolsForPrivatization(); @@ -174,7 +184,8 @@ void DataSharingProcessor::cloneSymbol(const semantics::Symbol *sym) { void DataSharingProcessor::copyFirstPrivateSymbol( const semantics::Symbol *sym, mlir::OpBuilder::InsertPoint *copyAssignIP) { - if (sym->test(semantics::Symbol::Flag::OmpFirstPrivate)) + if (sym->test(semantics::Symbol::Flag::OmpFirstPrivate) || + sym->test(semantics::Symbol::Flag::LocalityLocalInit)) converter.copyHostAssociateVar(*sym, copyAssignIP); } @@ -497,9 +508,9 @@ void DataSharingProcessor::privatize(mlir::omp::PrivateClauseOps *clauseOps) { if (const auto *commonDet = sym->detailsIf()) { for (const auto &mem : commonDet->objects()) - doPrivatize(&*mem, clauseOps); + privatizeSymbol(&*mem, clauseOps); } else - doPrivatize(sym, clauseOps); + 
privatizeSymbol(sym, clauseOps); } } @@ -516,22 +527,30 @@ void DataSharingProcessor::copyLastPrivatize(mlir::Operation *op) { } } -void DataSharingProcessor::doPrivatize(const semantics::Symbol *sym, - mlir::omp::PrivateClauseOps *clauseOps) { +template +void DataSharingProcessor::privatizeSymbol( + const semantics::Symbol *symToPrivatize, OperandsStructType *clauseOps) { if (!useDelayedPrivatization) { - cloneSymbol(sym); - copyFirstPrivateSymbol(sym); + cloneSymbol(symToPrivatize); + copyFirstPrivateSymbol(symToPrivatize); return; } - lower::SymbolBox hsb = converter.lookupOneLevelUpSymbol(*sym); + const semantics::Symbol *sym = symToPrivatize->HasLocalLocality() + ? &symToPrivatize->GetUltimate() + : symToPrivatize; + lower::SymbolBox hsb = symToPrivatize->HasLocalLocality() + ? converter.shallowLookupSymbol(*sym) + : converter.lookupOneLevelUpSymbol(*sym); assert(hsb && "Host symbol box not found"); hlfir::Entity entity{hsb.getAddr()}; bool cannotHaveNonDefaultLowerBounds = !entity.mayHaveNonDefaultLowerBounds(); mlir::Location symLoc = hsb.getAddr().getLoc(); std::string privatizerName = sym->name().ToString() + ".privatizer"; - bool isFirstPrivate = sym->test(semantics::Symbol::Flag::OmpFirstPrivate); + bool isFirstPrivate = + symToPrivatize->test(semantics::Symbol::Flag::OmpFirstPrivate) || + symToPrivatize->test(semantics::Symbol::Flag::LocalityLocalInit); mlir::Value privVal = hsb.getAddr(); mlir::Type allocType = privVal.getType(); @@ -565,7 +584,7 @@ void DataSharingProcessor::doPrivatize(const semantics::Symbol *sym, mlir::Type argType = privVal.getType(); - mlir::omp::PrivateClauseOp privatizerOp = [&]() { + OpType privatizerOp = [&]() { auto moduleOp = firOpBuilder.getModule(); auto uniquePrivatizerName = fir::getTypeAsString( allocType, converter.getKindMap(), @@ -573,16 +592,25 @@ void DataSharingProcessor::doPrivatize(const semantics::Symbol *sym, (isFirstPrivate ? 
"_firstprivate" : "_private")); if (auto existingPrivatizer = - moduleOp.lookupSymbol( - uniquePrivatizerName)) + moduleOp.lookupSymbol(uniquePrivatizerName)) return existingPrivatizer; mlir::OpBuilder::InsertionGuard guard(firOpBuilder); firOpBuilder.setInsertionPointToStart(moduleOp.getBody()); - auto result = firOpBuilder.create( - symLoc, uniquePrivatizerName, allocType, - isFirstPrivate ? mlir::omp::DataSharingClauseType::FirstPrivate - : mlir::omp::DataSharingClauseType::Private); + OpType result; + + if constexpr (std::is_same_v) { + result = firOpBuilder.create( + symLoc, uniquePrivatizerName, allocType, + isFirstPrivate ? mlir::omp::DataSharingClauseType::FirstPrivate + : mlir::omp::DataSharingClauseType::Private); + } else { + result = firOpBuilder.create( + symLoc, uniquePrivatizerName, allocType, + isFirstPrivate ? fir::LocalitySpecifierType::LocalInit + : fir::LocalitySpecifierType::Local); + } + fir::ExtendedValue symExV = converter.getSymbolExtendedValue(*sym); lower::SymMapScope outerScope(symTable); @@ -625,27 +653,36 @@ void DataSharingProcessor::doPrivatize(const semantics::Symbol *sym, ©Region, /*insertPt=*/{}, {argType, argType}, {symLoc, symLoc}); firOpBuilder.setInsertionPointToEnd(copyEntryBlock); - auto addSymbol = [&](unsigned argIdx, bool force = false) { + auto addSymbol = [&](unsigned argIdx, const semantics::Symbol *symToMap, + bool force = false) { symExV.match( [&](const fir::MutableBoxValue &box) { symTable.addSymbol( - *sym, fir::substBase(box, copyRegion.getArgument(argIdx)), - force); + *symToMap, + fir::substBase(box, copyRegion.getArgument(argIdx)), force); }, [&](const auto &box) { - symTable.addSymbol(*sym, copyRegion.getArgument(argIdx), force); + symTable.addSymbol(*symToMap, copyRegion.getArgument(argIdx), + force); }); }; - addSymbol(0, true); + addSymbol(0, sym, true); lower::SymMapScope innerScope(symTable); - addSymbol(1); + addSymbol(1, symToPrivatize); auto ip = firOpBuilder.saveInsertionPoint(); - 
copyFirstPrivateSymbol(sym, &ip); - - firOpBuilder.create( - hsb.getAddr().getLoc(), symTable.shallowLookupSymbol(*sym).getAddr()); + copyFirstPrivateSymbol(symToPrivatize, &ip); + + if constexpr (std::is_same_v) { + firOpBuilder.create( + hsb.getAddr().getLoc(), + symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); + } else { + firOpBuilder.create( + hsb.getAddr().getLoc(), + symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); + } } return result; @@ -656,9 +693,22 @@ void DataSharingProcessor::doPrivatize(const semantics::Symbol *sym, clauseOps->privateVars.push_back(privVal); } - symToPrivatizer[sym] = privatizerOp; + if (symToPrivatize->HasLocalLocality()) + allPrivatizedSymbols.insert(symToPrivatize); } +template void +DataSharingProcessor::privatizeSymbol( + const semantics::Symbol *symToPrivatize, + mlir::omp::PrivateClauseOps *clauseOps); + +template void +DataSharingProcessor::privatizeSymbol( + const semantics::Symbol *symToPrivatize, + fir::LocalitySpecifierOperands *clauseOps); + } // namespace omp } // namespace lower } // namespace Fortran diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.h b/flang/lib/Lower/OpenMP/DataSharingProcessor.h index 7787e4ffb03c2..ae759bfef566b 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.h +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.h @@ -77,8 +77,6 @@ class DataSharingProcessor { llvm::SetVector preDeterminedSymbols; llvm::SetVector allPrivatizedSymbols; - llvm::DenseMap - symToPrivatizer; lower::AbstractConverter &converter; semantics::SemanticsContext &semaCtx; fir::FirOpBuilder &firOpBuilder; @@ -105,8 +103,6 @@ class DataSharingProcessor { void collectImplicitSymbols(); void collectPreDeterminedSymbols(); void privatize(mlir::omp::PrivateClauseOps *clauseOps); - void doPrivatize(const semantics::Symbol *sym, - mlir::omp::PrivateClauseOps *clauseOps); void copyLastPrivatize(mlir::Operation *op); void insertLastPrivateCompare(mlir::Operation *op); void cloneSymbol(const 
semantics::Symbol *sym); @@ -125,6 +121,11 @@ class DataSharingProcessor { bool shouldCollectPreDeterminedSymbols, bool useDelayedPrivatization, lower::SymMap &symTable); + DataSharingProcessor(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + lower::pft::Evaluation &eval, + bool useDelayedPrivatization, lower::SymMap &symTable); + // Privatisation is split into two steps. // Step1 performs cloning of all privatisation clauses and copying for // firstprivates. Step1 is performed at the place where process/processStep1 @@ -151,6 +152,11 @@ class DataSharingProcessor { ? allPrivatizedSymbols.getArrayRef() : llvm::ArrayRef(); } + + template + void privatizeSymbol(const semantics::Symbol *symToPrivatize, + OperandsStructType *clauseOps); }; } // namespace omp diff --git a/flang/test/Lower/do_concurrent_delayed_locality.f90 b/flang/test/Lower/do_concurrent_delayed_locality.f90 new file mode 100644 index 0000000000000..9b234087ed4be --- /dev/null +++ b/flang/test/Lower/do_concurrent_delayed_locality.f90 @@ -0,0 +1,49 @@ +! RUN: %flang_fc1 -emit-hlfir -mmlir --openmp-enable-delayed-privatization-staging=true -o - %s | FileCheck %s + +subroutine do_concurrent_with_locality_specs + implicit none + integer :: i, local_var, local_init_var + + do concurrent (i=1:10) local(local_var) local_init(local_init_var) + if (i < 5) then + local_var = 42 + else + local_init_var = 84 + end if + end do +end subroutine + +! CHECK: fir.local {type = local_init} @[[LOCAL_INIT_SYM:.*]] : i32 copy { +! CHECK: ^bb0(%[[ORIG_VAL:.*]]: !fir.ref, %[[LOCAL_VAL:.*]]: !fir.ref): +! CHECK: %[[ORIG_VAL_LD:.*]] = fir.load %[[ORIG_VAL]] : !fir.ref +! CHECK: hlfir.assign %[[ORIG_VAL_LD]] to %[[LOCAL_VAL]] : i32, !fir.ref +! CHECK: fir.yield(%[[LOCAL_VAL]] : !fir.ref) +! CHECK: } + +! CHECK: fir.local {type = local} @[[LOCAL_SYM:.*]] : i32 + +! CHECK-LABEL: func.func @_QPdo_concurrent_with_locality_specs() { +! 
CHECK: %[[ORIG_LOCAL_INIT_ALLOC:.*]] = fir.alloca i32 {bindc_name = "local_init_var", {{.*}}} +! CHECK: %[[ORIG_LOCAL_INIT_DECL:.*]]:2 = hlfir.declare %[[ORIG_LOCAL_INIT_ALLOC]] + +! CHECK: %[[ORIG_LOCAL_ALLOC:.*]] = fir.alloca i32 {bindc_name = "local_var", {{.*}}} +! CHECK: %[[ORIG_LOCAL_DECL:.*]]:2 = hlfir.declare %[[ORIG_LOCAL_ALLOC]] + +! CHECK: fir.do_concurrent { +! CHECK: %[[IV_DECL:.*]]:2 = hlfir.declare %{{.*}} + +! CHECK: fir.do_concurrent.loop (%{{.*}}) = (%{{.*}}) to (%{{.*}}) step (%{{.*}}) local(@[[LOCAL_SYM]] %[[ORIG_LOCAL_DECL]]#0 -> %[[LOCAL_ARG:.*]], @[[LOCAL_INIT_SYM]] %[[ORIG_LOCAL_INIT_DECL]]#0 -> %[[LOCAL_INIT_ARG:.*]] : !fir.ref, !fir.ref) { +! CHECK: %[[LOCAL_DECL:.*]]:2 = hlfir.declare %[[LOCAL_ARG]] +! CHECK: %[[LOCAL_INIT_DECL:.*]]:2 = hlfir.declare %[[LOCAL_INIT_ARG]] + +! CHECK: fir.if %{{.*}} { +! CHECK: %[[C42:.*]] = arith.constant 42 : i32 +! CHECK: hlfir.assign %[[C42]] to %[[LOCAL_DECL]]#0 : i32, !fir.ref +! CHECK: } else { +! CHECK: %[[C84:.*]] = arith.constant 84 : i32 +! CHECK: hlfir.assign %[[C84]] to %[[LOCAL_INIT_DECL]]#0 : i32, !fir.ref +! CHECK: } +! CHECK: } +! CHECK: } +! CHECK: return +! CHECK: } >From 1a5529c28f91e7cc2874032b19eb5654cebb4047 Mon Sep 17 00:00:00 2001 From: ergawy Date: Wed, 7 May 2025 02:24:56 -0500 Subject: [PATCH 2/3] add todo --- flang/lib/Lower/Bridge.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index df7ff6dde1065..9d73503fcfd8c 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -2035,6 +2035,9 @@ class FirConverter : public Fortran::lower::AbstractConverter { fir::LocalitySpecifierOperands privateClauseOps; auto doConcurrentLoopOp = mlir::dyn_cast_if_present(info.loopOp); + // TODO Promote to using `enableDelayedPrivatization` (which is enabled by + // default unlike the staging flag) once the implementation of this is more + // complete. 
bool useDelayedPriv = enableDelayedPrivatizationStaging && doConcurrentLoopOp; >From 81c4e9438a41eb3bb0f4714346b0730c84a9b236 Mon Sep 17 00:00:00 2001 From: ergawy Date: Wed, 28 May 2025 04:16:42 -0500 Subject: [PATCH 3/3] add todo --- flang/lib/Lower/Bridge.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 9d73503fcfd8c..49675d34215a9 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -2029,6 +2029,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { void handleLocalitySpecs(const IncrementLoopInfo &info) { Fortran::semantics::SemanticsContext &semanticsContext = bridge.getSemanticsContext(); + // TODO Extract `DataSharingProcessor` from omp to a more general location. Fortran::lower::omp::DataSharingProcessor dsp( *this, semanticsContext, getEval(), /*useDelayedPrivatization=*/true, localSymbols); From flang-commits at lists.llvm.org Wed May 28 02:17:57 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 28 May 2025 02:17:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Basic PFT to MLIR lowering for do concurrent locality specifiers (PR #138534) In-Reply-To: Message-ID: <6836d4c5.170a0220.19503f.15eb@mx.google.com> ================ @@ -2029,9 +2031,33 @@ class FirConverter : public Fortran::lower::AbstractConverter { void handleLocalitySpecs(const IncrementLoopInfo &info) { Fortran::semantics::SemanticsContext &semanticsContext = bridge.getSemanticsContext(); - for (const Fortran::semantics::Symbol *sym : info.localSymList) + Fortran::lower::omp::DataSharingProcessor dsp( ---------------- ergawy wrote: Added a todo in https://github.com/llvm/llvm-project/pull/138534/commits/81c4e9438a41eb3bb0f4714346b0730c84a9b236, and working on a follow up PR now. 
https://github.com/llvm/llvm-project/pull/138534 From flang-commits at lists.llvm.org Wed May 28 02:18:14 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 28 May 2025 02:18:14 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Basic PFT to MLIR lowering for do concurrent locality specifiers (PR #138534) In-Reply-To: Message-ID: <6836d4d6.050a0220.24a4d3.13c9@mx.google.com> https://github.com/ergawy edited https://github.com/llvm/llvm-project/pull/138534 From flang-commits at lists.llvm.org Wed May 28 03:36:24 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 28 May 2025 03:36:24 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <6836e728.170a0220.1e69dc.21ed@mx.google.com> kiranchandramohan wrote: Does `CCC_OVERRIDE_OPTIONS` expand to Clang Compiler Commandline Override Options? If so, `FCC_OVERRIDE_OPTIONS` expanding to Fortran Compiler Commandline Override Options seems the right replacement. If `CCC_OVERRIDE_OPTIONS` expands to Clang C Compiler Override Options, then `FFC_OVERRIDE_OPTIONS` (as suggested by @tarunprabhu) expanding to Flang Fortran Compiler Override Options is better. https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Wed May 28 03:55:49 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 28 May 2025 03:55:49 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <6836ebb5.170a0220.2b96fe.1c8e@mx.google.com> ================ @@ -0,0 +1,12 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s ---------------- kiranchandramohan wrote: Do `+`, `x`, and `X` have special meanings? Is that documented anywhere? 
I think we should document `FCC_OVERRIDE_OPTIONS`. One possible location is https://github.com/llvm/llvm-project/blob/main/flang/docs/FlangDriver.md https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Wed May 28 04:14:23 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 04:14:23 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] - Handle `BoxCharType` in `fir.box_offset` op (PR #141713) In-Reply-To: Message-ID: <6836f00f.050a0220.36570a.1514@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/141713 From flang-commits at lists.llvm.org Wed May 28 04:14:24 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 04:14:24 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] - Handle `BoxCharType` in `fir.box_offset` op (PR #141713) In-Reply-To: Message-ID: <6836f010.170a0220.381446.1ce1@mx.google.com> ================ @@ -278,8 +278,9 @@ inline mlir::Type unwrapRefType(mlir::Type t) { /// If `t` conforms with a pass-by-reference type (box, ref, ptr, etc.) then /// return the element type of `t`. Otherwise, return `t`. inline mlir::Type unwrapPassByRefType(mlir::Type t) { - if (auto eleTy = dyn_cast_ptrOrBoxEleTy(t)) - return eleTy; + if (conformsWithPassByRef(t)) ---------------- tblah wrote: Why was this needed? https://github.com/llvm/llvm-project/pull/141713 From flang-commits at lists.llvm.org Wed May 28 04:14:24 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 04:14:24 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] - Handle `BoxCharType` in `fir.box_offset` op (PR #141713) In-Reply-To: Message-ID: <6836f010.170a0220.13435d.1d73@mx.google.com> https://github.com/tblah approved this pull request. 
LGTM with one minor comment https://github.com/llvm/llvm-project/pull/141713 From flang-commits at lists.llvm.org Wed May 28 04:23:05 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 28 May 2025 04:23:05 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6836f219.170a0220.2cac45.27bc@mx.google.com> kiranchandramohan wrote: > > Thank you for seeing this through and making all the little changes. I have requested reviews from @MaskRay and @aeubanks for the clang side of things. > > Hello, I noticed that this PR has been awaiting clang review for three weeks. I still haven't gotten a comment from them. Would it be possible for you to suggest others who might be available to help review? Thanks Adding @AaronBallman for reviewing the clang side or to suggest a suitable reviewer. https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Wed May 28 05:27:56 2025 From: flang-commits at lists.llvm.org (Aaron Ballman via flang-commits) Date: Wed, 28 May 2025 05:27:56 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6837014c.050a0220.bd639.23dd@mx.google.com> https://github.com/AaronBallman approved this pull request. Clang bits LGTM! 
https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Wed May 28 06:30:42 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Wed, 28 May 2025 06:30:42 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68371002.170a0220.2e469.302a@mx.google.com> ================ @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @prefer_vector_width() +; CHECK: prefer_vector_width = "128" ---------------- tarunprabhu wrote: Should this be `CHECK-SAME`? I assume that you intend to match `prefer_vector_width` which appears in the attributes of `llvm.func`. The attributes appear on the same line, don't they? https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Wed May 28 06:31:28 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Wed, 28 May 2025 06:31:28 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68371030.170a0220.1f000e.345e@mx.google.com> ================ @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @prefer_vector_width() +; CHECK: prefer_vector_width = "128" ---------------- tarunprabhu wrote: Oh, does `CHECK-LABEL` allow one to continue checking on the same line? 
https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Wed May 28 06:32:20 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 06:32:20 -0700 (PDT) Subject: [flang-commits] [flang] 59b7b5b - [OpenMP][Flang] Fix semantic check and scoping for declare mappers (#140560) Message-ID: <68371064.170a0220.399abd.3609@mx.google.com> Author: Akash Banerjee Date: 2025-05-28T14:32:17+01:00 New Revision: 59b7b5b6b5c032ed21049d631eb5d67091f3a21c URL: https://github.com/llvm/llvm-project/commit/59b7b5b6b5c032ed21049d631eb5d67091f3a21c DIFF: https://github.com/llvm/llvm-project/commit/59b7b5b6b5c032ed21049d631eb5d67091f3a21c.diff LOG: [OpenMP][Flang] Fix semantic check and scoping for declare mappers (#140560) The current semantic check in place is incorrect, this patch fixes this. Up to 1 **'default'** named mapper should be allowed for each derived type. The current semantic check only allows up to 1 **'default'** named mapper across all derived types. This also makes sure that declare mappers follow proper scoping rules for both default and named mappers. 
Co-authored-by: Raghu Maddhipatla Added: Modified: flang/include/flang/Parser/parse-tree.h flang/lib/Lower/OpenMP/ClauseProcessor.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Parser/openmp-parsers.cpp flang/lib/Parser/unparse.cpp flang/lib/Semantics/resolve-names.cpp flang/test/Lower/OpenMP/declare-mapper.f90 flang/test/Lower/OpenMP/map-mapper.f90 flang/test/Parser/OpenMP/declare-mapper-unparse.f90 flang/test/Parser/OpenMP/metadirective-dirspec.f90 flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 flang/test/Semantics/OpenMP/declare-mapper03.f90 Removed: ################################################################################ diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 254236b510544..c99006f0c1c22 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -3540,7 +3540,7 @@ WRAPPER_CLASS(OmpLocatorList, std::list); struct OmpMapperSpecifier { // Absent mapper-identifier is equivalent to DEFAULT. 
TUPLE_CLASS_BOILERPLATE(OmpMapperSpecifier); - std::tuple, TypeSpec, Name> t; + std::tuple t; }; // Ref: [4.5:222:1-5], [5.0:305:20-27], [5.1:337:11-19], [5.2:139:18-23], diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 885871698c946..ebdda9885d5c2 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1148,9 +1148,9 @@ void ClauseProcessor::processMapObjects( typeSpec = &object.sym()->GetType()->derivedTypeSpec(); if (typeSpec) { - mapperIdName = typeSpec->name().ToString() + ".default"; - mapperIdName = - converter.mangleName(mapperIdName, *typeSpec->GetScope()); + mapperIdName = typeSpec->name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); } } }; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 5a975384bd371..ddb08f74b3841 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2422,8 +2422,10 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, mlir::FlatSymbolRefAttr mapperId; if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) { auto &typeSpec = sym.GetType()->derivedTypeSpec(); - std::string mapperIdName = typeSpec.name().ToString() + ".default"; - mapperIdName = converter.mangleName(mapperIdName, *typeSpec.GetScope()); + std::string mapperIdName = + typeSpec.name().ToString() + ".omp.default.mapper"; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) + mapperIdName = converter.mangleName(mapperIdName, sym->owner()); if (converter.getModuleOp().lookupSymbol(mapperIdName)) mapperId = mlir::FlatSymbolRefAttr::get(&converter.getMLIRContext(), mapperIdName); @@ -4006,24 +4008,16 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, lower::StatementContext stmtCtx; const 
auto &spec = std::get(declareMapperConstruct.t); - const auto &mapperName{std::get>(spec.t)}; + const auto &mapperName{std::get(spec.t)}; const auto &varType{std::get(spec.t)}; const auto &varName{std::get(spec.t)}; assert(varType.declTypeSpec->category() == semantics::DeclTypeSpec::Category::TypeDerived && "Expected derived type"); - std::string mapperNameStr; - if (mapperName.has_value()) { - mapperNameStr = mapperName->ToString(); - mapperNameStr = - converter.mangleName(mapperNameStr, mapperName->symbol->owner()); - } else { - mapperNameStr = - varType.declTypeSpec->derivedTypeSpec().name().ToString() + ".default"; - mapperNameStr = converter.mangleName( - mapperNameStr, *varType.declTypeSpec->derivedTypeSpec().GetScope()); - } + std::string mapperNameStr = mapperName; + if (auto *sym = converter.getCurrentScope().FindSymbol(mapperNameStr)) + mapperNameStr = converter.mangleName(mapperNameStr, sym->owner()); // Save current insertion point before moving to the module scope to create // the DeclareMapperOp diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..c08cd1ab80559 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1389,8 +1389,28 @@ TYPE_PARSER( TYPE_PARSER(sourced(construct( verbatim("DECLARE TARGET"_tok), Parser{}))) +static OmpMapperSpecifier ConstructOmpMapperSpecifier( + std::optional &&mapperName, TypeSpec &&typeSpec, Name &&varName) { + // If a name is present, parse: name ":" typeSpec "::" name + // This matches the syntax: : :: + if (mapperName.has_value() && mapperName->ToString() != "default") { + return OmpMapperSpecifier{ + mapperName->ToString(), std::move(typeSpec), std::move(varName)}; + } + // If the name is missing, use the DerivedTypeSpec name to construct the + // default mapper name. 
+ // This matches the syntax: :: + if (DerivedTypeSpec * derived{std::get_if(&typeSpec.u)}) { + return OmpMapperSpecifier{ + std::get(derived->t).ToString() + ".omp.default.mapper", + std::move(typeSpec), std::move(varName)}; + } + return OmpMapperSpecifier{std::string("omp.default.mapper"), + std::move(typeSpec), std::move(varName)}; +} + // mapper-specifier -TYPE_PARSER(construct( +TYPE_PARSER(applyFunction(ConstructOmpMapperSpecifier, maybe(name / ":" / !":"_tok), typeSpec / "::", name)) // OpenMP 5.2: 5.8.8 Declare Mapper Construct diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index a626888b7dfe5..0784a6703bbde 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2093,7 +2093,11 @@ class UnparseVisitor { Walk(x.v, ","); } void Unparse(const OmpMapperSpecifier &x) { - Walk(std::get>(x.t), ":"); + const auto &mapperName{std::get(x.t)}; + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); + Put(":"); + } Walk(std::get(x.t)); Put("::"); Walk(std::get(x.t)); @@ -2796,8 +2800,9 @@ class UnparseVisitor { BeginOpenMP(); Word("!$OMP DECLARE MAPPER ("); const auto &spec{std::get(z.t)}; - if (auto mapname{std::get>(spec.t)}) { - Walk(mapname); + const auto &mapperName{std::get(spec.t)}; + if (mapperName.find("omp.default.mapper") == std::string::npos) { + Walk(mapperName); Put(":"); } Walk(std::get(spec.t)); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index a8dbf61c8fd68..93f2150365a1f 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1766,15 +1766,9 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, // just following the natural flow, the map clauses gets processed before // the type has been fully processed. 
BeginDeclTypeSpec(); - if (auto &mapperName{std::get>(spec.t)}) { - mapperName->symbol = - &MakeSymbol(*mapperName, MiscDetails{MiscDetails::Kind::ConstructName}); - } else { - const parser::CharBlock defaultName{"default", 7}; - MakeSymbol( - defaultName, Attrs{}, MiscDetails{MiscDetails::Kind::ConstructName}); - } - + auto &mapperName{std::get(spec.t)}; + MakeSymbol(parser::CharBlock(mapperName), Attrs{}, + MiscDetails{MiscDetails::Kind::ConstructName}); PushScope(Scope::Kind::OtherConstruct, nullptr); Walk(std::get(spec.t)); auto &varName{std::get(spec.t)}; diff --git a/flang/test/Lower/OpenMP/declare-mapper.f90 b/flang/test/Lower/OpenMP/declare-mapper.f90 index 867b850317e66..8a98c68a8d582 100644 --- a/flang/test/Lower/OpenMP/declare-mapper.f90 +++ b/flang/test/Lower/OpenMP/declare-mapper.f90 @@ -5,6 +5,7 @@ ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-2.f90 -o - | FileCheck %t/omp-declare-mapper-2.f90 ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-3.f90 -o - | FileCheck %t/omp-declare-mapper-3.f90 ! RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-4.f90 -o - | FileCheck %t/omp-declare-mapper-4.f90 +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -fopenmp-version=50 %t/omp-declare-mapper-5.f90 -o - | FileCheck %t/omp-declare-mapper-5.f90 !--- omp-declare-mapper-1.f90 subroutine declare_mapper_1 @@ -22,7 +23,7 @@ subroutine declare_mapper_1 end type type(my_type2) :: t real :: x, y(nvals) - !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { + !CHECK:omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_1my_type\.omp\.default\.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_1Tmy_type\{num_vals:i32,values:!fir\.box>>\}>]] { !CHECK: ^bb0(%[[VAL_0:.*]]: !fir.ref<[[MY_TYPE]]>): !CHECK: %[[VAL_1:.*]]:2 = hlfir.declare %[[VAL_0]] {uniq_name = "_QFdeclare_mapper_1Evar"} : (!fir.ref<[[MY_TYPE]]>) -> (!fir.ref<[[MY_TYPE]]>, !fir.ref<[[MY_TYPE]]>) !CHECK: %[[VAL_2:.*]] = hlfir.designate %[[VAL_1]]#0{"values"} {fortran_attrs = #fir.var_attrs} : (!fir.ref<[[MY_TYPE]]>) -> !fir.ref>>> @@ -149,7 +150,7 @@ subroutine declare_mapper_4 integer :: num end type - !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.default]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] + !CHECK: omp.declare_mapper @[[MY_TYPE_MAPPER:_QQFdeclare_mapper_4my_type.omp.default.mapper]] : [[MY_TYPE:!fir\.type<_QFdeclare_mapper_4Tmy_type\{num:i32\}>]] !$omp declare mapper (my_type :: var) map (var%num) type(my_type) :: a @@ -171,3 +172,93 @@ subroutine declare_mapper_4 a%num = 40 !$omp end target end subroutine declare_mapper_4 + +!--- omp-declare-mapper-5.f90 +program declare_mapper_5 + implicit none + + type :: mytype + integer :: x, y + end type + + !CHECK: omp.declare_mapper @[[INNER_MAPPER_NAMED:_QQFFuse_innermy_mapper]] : [[MY_TYPE:!fir\.type<_QFTmytype\{x:i32,y:i32\}>]] + !CHECK: omp.declare_mapper @[[INNER_MAPPER_DEFAULT:_QQFFuse_innermytype.omp.default.mapper]] : [[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_NAMED:_QQFmy_mapper]] : 
[[MY_TYPE]] + !CHECK: omp.declare_mapper @[[OUTER_MAPPER_DEFAULT:_QQFmytype.omp.default.mapper]] : [[MY_TYPE]] + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + +contains + subroutine use_outer() + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) 
mapper(@[[OUTER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[OUTER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine + + subroutine use_inner() + !$omp declare mapper(mytype :: var) map(tofrom: var%x) + !$omp declare mapper(my_mapper : mytype :: var) map(tofrom: var%y) + + type(mytype) :: a + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(implicit, tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_DEFAULT]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(default) : a) + a%x = 10 + !$omp end target + + !CHECK: %{{.*}} = omp.map.info var_ptr(%{{.*}}#1 : !fir.ref<[[MY_TYPE]]>, [[MY_TYPE]]) map_clauses(tofrom) capture(ByRef) mapper(@[[INNER_MAPPER_NAMED]]) -> !fir.ref<[[MY_TYPE]]> {name = "a"} + !$omp target map(mapper(my_mapper) : a) + a%y = 10 + !$omp end target + end subroutine +end program declare_mapper_5 diff --git a/flang/test/Lower/OpenMP/map-mapper.f90 b/flang/test/Lower/OpenMP/map-mapper.f90 index a511110cb5d18..91564bfc7bc46 100644 --- a/flang/test/Lower/OpenMP/map-mapper.f90 +++ b/flang/test/Lower/OpenMP/map-mapper.f90 @@ -8,7 +8,7 @@ program p !$omp declare mapper(xx : t1 :: nn) map(to: nn, nn%x) !$omp 
declare mapper(t1 :: nn) map(from: nn) - !CHECK-LABEL: omp.declare_mapper @_QQFt1.default : !fir.type<_QFTt1{x:!fir.array<256xi32>}> + !CHECK-LABEL: omp.declare_mapper @_QQFt1.omp.default.mapper : !fir.type<_QFTt1{x:!fir.array<256xi32>}> !CHECK-LABEL: omp.declare_mapper @_QQFxx : !fir.type<_QFTt1{x:!fir.array<256xi32>}> type(t1) :: a, b @@ -20,7 +20,7 @@ program p end do !$omp end target - !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.default) -> {{.*}} {name = "b"} + !CHECK: %[[MAP_B:.*]] = omp.map.info var_ptr(%{{.*}} : {{.*}}, {{.*}}) map_clauses(tofrom) capture(ByRef) mapper(@_QQFt1.omp.default.mapper) -> {{.*}} {name = "b"} !CHECK: omp.target map_entries(%[[MAP_B]] -> %{{.*}}, %{{.*}} -> %{{.*}} : {{.*}}, {{.*}}) { !$omp target map(mapper(default) : b) do i = 1, n diff --git a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 index 407bfd29153fa..30d75d02736f3 100644 --- a/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 +++ b/flang/test/Parser/OpenMP/declare-mapper-unparse.f90 @@ -7,36 +7,37 @@ program main type ty integer :: x end type ty - + !CHECK: !$OMP DECLARE MAPPER (mymapper:ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) !PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier -!PARSE-TREE: Name = 'mymapper' +!PARSE-TREE: string = 'mymapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' -!PARSE-TREE: Name = 'x' +!PARSE-TREE: Name = 'x' !CHECK: !$OMP DECLARE MAPPER (ty::mapped) MAP(mapped,mapped%x) !$omp declare mapper(ty :: mapped) map(mapped, mapped%x) - + 
!PARSE-TREE: OpenMPDeclareMapperConstruct !PARSE-TREE: OmpMapperSpecifier +!PARSE-TREE: string = 'ty.omp.default.mapper' !PARSE-TREE: TypeSpec -> DerivedTypeSpec !PARSE-TREE: Name = 'ty' -!PARSE-TREE: Name = 'mapped' +!PARSE-TREE: Name = 'mapped' !PARSE-TREE: OmpMapClause !PARSE-TREE: OmpObjectList -> OmpObject -> Designator -> DataRef -> Name = 'mapped' !PARSE-TREE: OmpObject -> Designator -> DataRef -> StructureComponent !PARSE-TREE: DataRef -> Name = 'mapped' !PARSE-TREE: Name = 'x' - + end program main !CHECK-LABEL: end program main diff --git a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 index b6c9c58948fec..baa8b2e08c539 100644 --- a/flang/test/Parser/OpenMP/metadirective-dirspec.f90 +++ b/flang/test/Parser/OpenMP/metadirective-dirspec.f90 @@ -78,7 +78,7 @@ subroutine f02 !PARSE-TREE: | | OmpDirectiveSpecification !PARSE-TREE: | | | OmpDirectiveName -> llvm::omp::Directive = declare mapper !PARSE-TREE: | | | OmpArgumentList -> OmpArgument -> OmpMapperSpecifier -!PARSE-TREE: | | | | Name = 'mymapper' +!PARSE-TREE: | | | | string = 'mymapper' !PARSE-TREE: | | | | TypeSpec -> IntrinsicTypeSpec -> IntegerTypeSpec -> !PARSE-TREE: | | | | Name = 'v' !PARSE-TREE: | | | OmpClauseList -> OmpClause -> Map -> OmpMapClause diff --git a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 index b4e03bd1632e5..06f41ab8ce76f 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper-symbols.f90 @@ -2,23 +2,23 @@ program main !CHECK-LABEL: MainProgram scope: main - implicit none + implicit none - type ty - integer :: x - end type ty - !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) - !$omp declare mapper(ty :: maptwo) map(maptwo, maptwo%x) + type ty + integer :: x + end type ty + !$omp declare mapper(mymapper : ty :: mapped) map(mapped, mapped%x) + !$omp declare mapper(ty :: 
maptwo) map(maptwo, maptwo%x) !! Note, symbols come out in their respective scope, but not in declaration order. -!CHECK: default: Misc ConstructName !CHECK: mymapper: Misc ConstructName !CHECK: ty: DerivedType components: x +!CHECK: ty.omp.default.mapper: Misc ConstructName !CHECK: DerivedType scope: ty !CHECK: OtherConstruct scope: !CHECK: mapped (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) -!CHECK: OtherConstruct scope: +!CHECK: OtherConstruct scope: !CHECK: maptwo (OmpMapToFrom) {{.*}} ObjectEntity type: TYPE(ty) - + end program main diff --git a/flang/test/Semantics/OpenMP/declare-mapper03.f90 b/flang/test/Semantics/OpenMP/declare-mapper03.f90 index b70b8a67f33e0..84fc3efafb3ad 100644 --- a/flang/test/Semantics/OpenMP/declare-mapper03.f90 +++ b/flang/test/Semantics/OpenMP/declare-mapper03.f90 @@ -5,12 +5,8 @@ integer :: y end type t1 -type :: t2 - real :: y, z -end type t2 - !error: 'default' is already declared in this scoping unit !$omp declare mapper(t1::x) map(x, x%y) -!$omp declare mapper(t2::w) map(w, w%y, w%z) +!$omp declare mapper(t1::x) map(x) end From flang-commits at lists.llvm.org Wed May 28 06:32:24 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Wed, 28 May 2025 06:32:24 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <68371068.630a0220.862fb.32d8@mx.google.com> https://github.com/TIFitis closed https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Wed May 28 06:32:27 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Wed, 28 May 2025 06:32:27 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Emit default declare mappers implicitly for derived types (PR #140562) In-Reply-To: Message-ID: <6837106b.170a0220.33d6b9.3b5c@mx.google.com> https://github.com/TIFitis edited https://github.com/llvm/llvm-project/pull/140562 From flang-commits 
at lists.llvm.org Wed May 28 06:36:33 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Wed, 28 May 2025 06:36:33 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <68371161.170a0220.340651.3b64@mx.google.com> https://github.com/tarunprabhu approved this pull request. @fanju110, Thanks for seeing this through! https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Wed May 28 06:51:59 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Wed, 28 May 2025 06:51:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <683714ff.050a0220.10cf1c.4a8f@mx.google.com> ================ @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); ---------------- mrkajetanp wrote: What exactly do you mean? We need to handle the copy out to deallocate the temporary, otherwise it leaks memory. I had to add this because it wasn't getting handled automatically without this. https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Wed May 28 06:53:14 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 28 May 2025 06:53:14 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <6837154a.170a0220.127201.4ad5@mx.google.com> https://github.com/kiranchandramohan commented: The string `.omp.default.mapper` seems to be hardcoded in many places. Can we use a single function to get this string in all places? Is it possible to retain `std::optional` in the parse-tree representation? https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Wed May 28 06:58:22 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Wed, 28 May 2025 06:58:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <6837167e.050a0220.226735.5014@mx.google.com> ================ @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 
+// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if 
(fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, 
builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. + if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + mlir::Value tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); ---------------- mrkajetanp wrote: I suppose if it gets cleaned up later anyway then you're right, I'll drop this. 
https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Wed May 28 07:08:37 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Wed, 28 May 2025 07:08:37 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <683718e5.170a0220.2889cd.6248@mx.google.com> TIFitis wrote: > The string `.omp.default.mapper` seems to be hardcoded in many places. Can we use a single function to get this string in all places? > > Is it possible to retain `std::optional` in the parse-tree representation? One place I can think of for the string is `llvm/include/llvm/Frontend/OpenMP/OMPConstants.h`; do you have any other suggestions? As for retaining `std::optional`, the reason to drop it is that the name is optional in source but always carries a _default_ value, along with a symbol. With the name field being optional, we end up with a dangling default symbol, which makes symbol lookups and scoping rules awkward. Secondly, the reason I've changed it to `string` instead of `Name` is that `Name` doesn't have a data member to store non-static names, since it only has a `char*`. If we keep it as `Name`, we will need some other way to store the string generated at run time. The earlier iteration of this PR tried to do so, but there is no clean way of doing it, and it requires adding new static storage. Do you have any suggestions to fix these issues while also keeping the field as `std::optional`? 
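For reference, the "single function" idea raised in this review could look roughly like the sketch below. This is only an illustration under stated assumptions: the names `kDefaultMapperTag`, `makeDefaultMapperName`, and `isDefaultMapperName` are hypothetical and do not exist in flang or LLVM, and the header that would host them is still an open question in the thread. The behaviour mirrors what the patch does inline: a derived type `ty` gets `ty.omp.default.mapper`, a missing type name falls back to `omp.default.mapper`, and the unparser recognises a default mapper by a substring match.

```cpp
#include <string>
#include <string_view>

// Hypothetical helpers (not part of flang or LLVM): one place that owns
// the default-mapper naming scheme used by the patch.
inline constexpr std::string_view kDefaultMapperTag = "omp.default.mapper";

// Builds "<type>.omp.default.mapper"; with an empty type name it yields
// the bare tag, matching the patch's fallback for non-derived types.
inline std::string makeDefaultMapperName(std::string_view typeName) {
  std::string name(typeName);
  if (!name.empty())
    name += '.';
  name += kDefaultMapperTag;
  return name;
}

// Mirrors the substring test the patch uses in unparse.cpp to decide
// whether a mapper name was generated rather than written by the user.
inline bool isDefaultMapperName(std::string_view name) {
  return name.find(kDefaultMapperTag) != std::string_view::npos;
}
```

With helpers like these, the parser, the unparser, and both lowering sites would call `makeDefaultMapperName(typeName)` or `isDefaultMapperName(name)` instead of repeating the string literal.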
https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Wed May 28 07:27:20 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Wed, 28 May 2025 07:27:20 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68371d48.170a0220.18a3ac.7c64@mx.google.com> https://github.com/mcinally updated https://github.com/llvm/llvm-project/pull/141380 >From 9f8619cb54a3a11e4c90af7f5156141ddc59e4d4 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH 1/3] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this options is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute. --- clang/include/clang/Driver/Options.td | 2 +- flang/include/flang/Frontend/CodeGenOptions.h | 3 +++ .../include/flang/Optimizer/Transforms/Passes.td | 4 ++++ flang/include/flang/Tools/CrossToolHelpers.h | 3 +++ flang/lib/Frontend/CompilerInvocation.cpp | 14 ++++++++++++++ flang/lib/Frontend/FrontendActions.cpp | 2 ++ flang/lib/Optimizer/Passes/Pipelines.cpp | 2 +- flang/lib/Optimizer/Transforms/FunctionAttr.cpp | 5 +++++ flang/test/Driver/prefer-vector-width.f90 | 16 ++++++++++++++++ mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td | 1 + mlir/lib/Target/LLVMIR/ModuleImport.cpp | 4 ++++ mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | 3 +++ 12 files changed, 57 insertions(+), 2 deletions(-) create mode 100644 flang/test/Driver/prefer-vector-width.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 22261621df092..b0b642796010b 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -5480,7 +5480,7 @@ def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group, " = ( ['!'] ['vec-'] ('rcp'|'sqrt') 
[('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">, MarshallingInfoStringVector>; def mprefer_vector_width_EQ : Joined<["-"], "mprefer-vector-width=">, Group, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Specifies preferred vector width for auto-vectorization. Defaults to 'none' which allows target specific decisions.">, MarshallingInfoString>; def mstack_protector_guard_EQ : Joined<["-"], "mstack-protector-guard=">, Group, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -53,6 +53,9 @@ class CodeGenOptions : public CodeGenOptionsBase { /// The paths to the pass plugins that were registered using -fpass-plugin. std::vector LLVMPassPlugins; + // The prefered vector width, if requested by -mprefer-vector-width. + std::string PreferVectorWidth; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. 
std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index b251534e1a8f6..2e932d9ad4a26 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -418,6 +418,10 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"preferVectorWidth", "prefer-vector-width", "std::string", + /*default=*/"", + "Set the prefer-vector-width attribute on functions in the " + "module.">, Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"", "Set the tune-cpu attribute on functions in the module.">, ]; diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 118695bbe2626..058024a4a04c5 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; InstrumentFunctionExit = "__cyg_profile_func_exit"; @@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for + ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. bool EnableOpenMP = false; ///< Enable OpenMP lowering. 
std::string InstrumentFunctionEntry = diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..918323d663610 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -309,6 +309,20 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ)) opts.LLVMPassPlugins.push_back(a->getValue()); + // -mprefer_vector_width option + if (const llvm::opt::Arg *a = args.getLastArg( + clang::driver::options::OPT_mprefer_vector_width_EQ)) { + llvm::StringRef s = a->getValue(); + unsigned Width; + if (s == "none") + opts.PreferVectorWidth = "none"; + else if (s.getAsInteger(10, Width)) + diags.Report(clang::diag::err_drv_invalid_value) + << a->getAsString(args) << a->getValue(); + else + opts.PreferVectorWidth = s.str(); + } + // -fembed-offload-object option for (auto *a : args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 38dfaadf1dff9..012d0fdfe645f 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -741,6 +741,8 @@ void CodeGenAction::generateLLVMIR() { config.VScaleMax = vsr->second; } + config.PreferVectorWidth = opts.PreferVectorWidth; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..5c1e1b9f77efb 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -354,7 +354,7 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, 
config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - ""})); + config.PreferVectorWidth, ""})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 43e4c1a7af3cd..13f447cf738b4 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -36,6 +36,7 @@ class FunctionAttrPass : public fir::impl::FunctionAttrBase { approxFuncFPMath = options.approxFuncFPMath; noSignedZerosFPMath = options.noSignedZerosFPMath; unsafeFPMath = options.unsafeFPMath; + preferVectorWidth = options.preferVectorWidth; } FunctionAttrPass() {} void runOnOperation() override; @@ -102,6 +103,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!preferVectorWidth.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, preferVectorWidth)); LLVM_DEBUG(llvm::dbgs() << "=== End " DEBUG_TYPE " ===\n"); } diff --git a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90 new file mode 100644 index 0000000000000..d0f5fd28db826 --- /dev/null +++ b/flang/test/Driver/prefer-vector-width.f90 @@ -0,0 +1,16 @@ +! Test that -mprefer-vector-width works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! RUN: %flang_fc1 -mprefer-vector-width=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! RUN: %flang_fc1 -mprefer-vector-width=128 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-128 +! RUN: %flang_fc1 -mprefer-vector-width=256 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-256 +! 
RUN: not %flang_fc1 -mprefer-vector-width=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INVALID + +subroutine func +end subroutine func + +! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} } +! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } +! CHECK-128: attributes #0 = { "prefer-vector-width"="128" } +! CHECK-256: attributes #0 = { "prefer-vector-width"="256" } +! CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index 6fde45ce5c556..c0324d561b77b 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1893,6 +1893,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$frame_pointer, OptionalAttr:$target_cpu, OptionalAttr:$tune_cpu, + OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index b049064fbd31c..85417da798b22 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, funcOp.setTargetFeaturesAttr( LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width"); + attr.isStringAttribute()) + funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index 047e870b7dcd8..2b7f0b11613aa 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1540,6 +1540,9 @@ LogicalResult 
ModuleTranslation::convertOneFunction(LLVMFuncOp func) { if (auto tuneCpu = func.getTuneCpu()) llvmFunc->addFnAttr("tune-cpu", *tuneCpu); + if (auto preferVectorWidth = func.getPreferVectorWidth()) + llvmFunc->addFnAttr("prefer-vector-width", *preferVectorWidth); + if (auto attr = func.getVscaleRange()) llvmFunc->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs( getLLVMContext(), attr->getMinRange().getInt(), >From 5bdf615715733351bae8f959f0a06a8449526bb8 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH 2/3] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this options is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute. --- flang/lib/Frontend/CompilerInvocation.cpp | 4 ++-- mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll | 9 +++++++++ mlir/test/Target/LLVMIR/prefer-vector-width.mlir | 8 ++++++++ 3 files changed, 19 insertions(+), 2 deletions(-) create mode 100644 mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll create mode 100644 mlir/test/Target/LLVMIR/prefer-vector-width.mlir diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 918323d663610..90a002929eff0 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -313,10 +313,10 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, if (const llvm::opt::Arg *a = args.getLastArg( clang::driver::options::OPT_mprefer_vector_width_EQ)) { llvm::StringRef s = a->getValue(); - unsigned Width; + unsigned width; if (s == "none") opts.PreferVectorWidth = "none"; - else if (s.getAsInteger(10, Width)) + else if (s.getAsInteger(10, width)) diags.Report(clang::diag::err_drv_invalid_value) << a->getAsString(args) << a->getValue(); else diff --git a/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll 
b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll new file mode 100644 index 0000000000000..831aa57345a3f --- /dev/null +++ b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @prefer_vector_width() +; CHECK: prefer_vector_width = "128" +define void @prefer_vector_width() #0 { + ret void +} + +attributes #0 = { "prefer-vector-width"="128" } diff --git a/mlir/test/Target/LLVMIR/prefer-vector-width.mlir b/mlir/test/Target/LLVMIR/prefer-vector-width.mlir new file mode 100644 index 0000000000000..7410e8139fd31 --- /dev/null +++ b/mlir/test/Target/LLVMIR/prefer-vector-width.mlir @@ -0,0 +1,8 @@ +// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s + +// CHECK: define void @prefer_vector_width() #[[ATTRS:.*]] { +// CHECK: attributes #[[ATTRS]] = { "prefer-vector-width"="128" } + +llvm.func @prefer_vector_width() attributes {prefer_vector_width = "128"} { + llvm.return +} >From befabca370ba227262859aec47e4fbc93759b3a0 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH 3/3] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this options is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute. 
--- mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll index 831aa57345a3f..e30ef04924b81 100644 --- a/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll +++ b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll @@ -1,7 +1,7 @@ ; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s ; CHECK-LABEL: llvm.func @prefer_vector_width() -; CHECK: prefer_vector_width = "128" +; CHECK-SAME: prefer_vector_width = "128" define void @prefer_vector_width() #0 { ret void } From flang-commits at lists.llvm.org Wed May 28 07:29:09 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Wed, 28 May 2025 07:29:09 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68371db5.170a0220.102667.7c0b@mx.google.com> ================ @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @prefer_vector_width() +; CHECK: prefer_vector_width = "128" ---------------- mcinally wrote: It does continue where the last match ended, but no problem to update. I figured since this was a one function test, there was no risk of matching another line. Just uploaded an updated patch... 
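For readers following PR #141380, the option validation that the patch adds to `parseCodeGenArgs` (accept `none`, otherwise require a base-10 unsigned integer, otherwise report an invalid value) can be approximated in standalone C++ as below. This is a sketch, not the patch itself: `parsePreferVectorWidth` is a hypothetical helper, and `std::from_chars` is swapped in for `llvm::StringRef::getAsInteger`, so edge-case behaviour may differ slightly from the LLVM routine.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical standalone helper (not part of the patch).
// Returns the string to record as the "prefer-vector-width" attribute,
// or std::nullopt when the value should be diagnosed as invalid.
inline std::optional<std::string>
parsePreferVectorWidth(std::string_view value) {
  if (value == "none")
    return std::string("none");

  unsigned width = 0;
  const char *first = value.data();
  const char *last = first + value.size();
  // Require the whole value to be a base-10 unsigned integer.
  auto [ptr, ec] = std::from_chars(first, last, width, 10);
  if (ec != std::errc() || ptr != last)
    return std::nullopt;

  // The original spelling is kept, as the patch stores the string, not the number.
  return std::string(value);
}
```

A caller would emit the driver's invalid-value diagnostic when the result is empty and otherwise store the returned string in the code generation options.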
https://github.com/llvm/llvm-project/pull/141380

From flang-commits at lists.llvm.org Wed May 28 07:33:49 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Wed, 28 May 2025 07:33:49 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68371ecd.170a0220.6ce57.8189@mx.google.com>

https://github.com/mrkajetanp edited https://github.com/llvm/llvm-project/pull/138718

From flang-commits at lists.llvm.org Wed May 28 07:44:24 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 28 May 2025 07:44:24 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <68372148.050a0220.2fea6c.82c0@mx.google.com>

kiranchandramohan wrote:

> One place I can think of placing the string is `llvm/include/llvm/Frontend/OpenMP/OMPConstants.h`, do you have any other suggestions?

Sounds OK to me. Alternative locations (if you omit parser changes) are in include/flang/Semantics/openmp*.h.

> As for retaining std::optional, the reason to drop std::optional is that the name is optional in source but always carries a default value attached to it along with a symbol. With the name field being optional, we end up with a dangling default symbol which makes symbol lookups weird along with scoping rules.

A possibility is to change it from `std::optional` to `Name`.

> Secondly, I've changed it to string instead of Name, is that the name field doesn't have a data member to store non-static names, since it only has a char*. If we keep it as Name we will need other solutions in place to store the run-time generated string. The earlier iteration of this PR tried to do so but there is no clean way of doing it, and requires adding a new static storage.

We can add it to the Details in symbol.h like WithBindName, WithOmpDeclarative, OpenACCRoutineInfo etc.
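The storage question in this exchange, a non-owning name that needs something else to own a run-time generated string, can be illustrated with a small sketch. Everything here is hypothetical: `NameView`, `WithGeneratedName` and `MapperDetailsSketch` are stand-ins invented for illustration and are not the actual flang classes. The sketch only shows the general idea of letting a details mixin own the string so that a pointer-based name can refer to it safely.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

namespace sketch {

// Stand-in for a non-owning name: it only points at characters that are
// stored somewhere else, so that storage must outlive the view.
struct NameView {
  const char *begin = nullptr;
  std::size_t size = 0;
  std::string_view str() const { return std::string_view(begin, size); }
};

// Hypothetical mixin that owns a run-time generated name.
class WithGeneratedName {
public:
  const std::string &generatedName() const { return generatedName_; }
  void setGeneratedName(std::string name) { generatedName_ = std::move(name); }

  // The view stays valid while this object is alive and the name is unchanged.
  NameView nameView() const {
    return NameView{generatedName_.data(), generatedName_.size()};
  }

private:
  std::string generatedName_;
};

// Hypothetical details record for a declared mapper.
struct MapperDetailsSketch : WithGeneratedName {};

} // namespace sketch
```

The design point is ownership: the generated string lives as long as the details object, so no separate static storage is needed for it.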
https://github.com/llvm/llvm-project/pull/140560 From flang-commits at lists.llvm.org Wed May 28 07:44:23 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Wed, 28 May 2025 07:44:23 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68372147.170a0220.8689d.802c@mx.google.com> ================ @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @prefer_vector_width() +; CHECK: prefer_vector_width = "128" ---------------- tarunprabhu wrote: Thanks for the update and clarification. https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Wed May 28 07:44:56 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Wed, 28 May 2025 07:44:56 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68372168.170a0220.377095.7eed@mx.google.com> https://github.com/tarunprabhu approved this pull request. Thanks for all the changes. LGTM. 
https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Wed May 28 06:35:31 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 06:35:31 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [WIP] Implement workdistribute construct (PR #140523) In-Reply-To: Message-ID: <68371123.170a0220.12b134.38c7@mx.google.com> https://github.com/skc7 updated https://github.com/llvm/llvm-project/pull/140523 >From e0dff6afb7aa31330aa0516effb7a0f65df5315f Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 12:57:36 -0800 Subject: [PATCH 01/12] Add coexecute directives --- llvm/include/llvm/Frontend/OpenMP/OMP.td | 45 ++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 0af4b436649a3..752486a8105b6 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -682,6 +682,8 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let association = AS_None; let category = CA_Executable; } +def OMP_Coexecute : Directive<"coexecute"> {} +def OMP_EndCoexecute : Directive<"end coexecute"> {} def OMP_Critical : Directive<"critical"> { let allowedOnceClauses = [ VersionedClause, @@ -2198,6 +2200,33 @@ def OMP_TargetTeams : Directive<"target teams"> { let leafConstructs = [OMP_Target, OMP_Teams]; let category = CA_Executable; } +def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; +} def 
OMP_TargetTeamsDistribute : Directive<"target teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2484,6 +2513,22 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { let leafConstructs = [OMP_TaskLoop, OMP_Simd]; let category = CA_Executable; } +def OMP_TeamsCoexecute : Directive<"teams coexecute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause + ]; +} def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [ VersionedClause, >From 8b1b36f5e716b8186d98b0d5c47c0fdf649ae67b Mon Sep 17 00:00:00 2001 From: skc7 Date: Tue, 13 May 2025 11:01:45 +0530 Subject: [PATCH 02/12] [OpenMP] Fix Coexecute definitions --- llvm/include/llvm/Frontend/OpenMP/OMP.td | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 752486a8105b6..7f450b43c2e36 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -682,8 +682,15 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let association = AS_None; let category = CA_Executable; } -def OMP_Coexecute : Directive<"coexecute"> {} -def OMP_EndCoexecute : Directive<"end coexecute"> {} +def OMP_Coexecute : Directive<"coexecute"> { + let association = AS_Block; + let category = CA_Executable; +} +def OMP_EndCoexecute : Directive<"end coexecute"> { + let leafConstructs = OMP_Coexecute.leafConstructs; + let association = OMP_Coexecute.association; + let category = OMP_Coexecute.category; +} def OMP_Critical : Directive<"critical"> { let allowedOnceClauses = [ VersionedClause, @@ -2224,8 +2231,10 @@ def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { VersionedClause, VersionedClause, VersionedClause, - 
VersionedClause, + VersionedClause, ]; + let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; + let category = CA_Executable; } def OMP_TargetTeamsDistribute : Directive<"target teams distribute"> { let allowedClauses = [ @@ -2528,6 +2537,8 @@ def OMP_TeamsCoexecute : Directive<"teams coexecute"> { VersionedClause, VersionedClause ]; + let leafConstructs = [OMP_Target, OMP_Teams]; + let category = CA_Executable; } def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [ >From 9b8d66a45e602375ec779e6c5bdd43232644f9a2 Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 12:58:10 -0800 Subject: [PATCH 03/12] Add omp.coexecute op --- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 35 +++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 5a79fbf77a268..8061aa0209cc9 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -325,6 +325,41 @@ def SectionsOp : OpenMP_Op<"sections", traits = [ let hasRegionVerifier = 1; } +//===----------------------------------------------------------------------===// +// Coexecute Construct +//===----------------------------------------------------------------------===// + +def CoexecuteOp : OpenMP_Op<"coexecute"> { + let summary = "coexecute directive"; + let description = [{ + The coexecute construct specifies that the teams from the teams directive + this is nested in shall cooperate to execute the computation in this region. + There is no implicit barrier at the end as specified in the standard. + + TODO + We should probably change the defaut behaviour to have a barrier unless + nowait is specified, see below snippet. + + ``` + !$omp target teams + !$omp coexecute + tmp = matmul(x, y) + !$omp end coexecute + a = tmp(0, 0) ! there is no implicit barrier! the matmul hasnt completed! 
+ !$omp end target teams coexecute + ``` + + }]; + + let arguments = (ins UnitAttr:$nowait); + + let regions = (region AnyRegion:$region); + + let assemblyFormat = [{ + oilist(`nowait` $nowait) $region attr-dict + }]; +} + //===----------------------------------------------------------------------===// // 2.8.2 Single Construct //===----------------------------------------------------------------------===// >From 7ecec06e00230649446c77c970160d4814a90e07 Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 17:50:41 -0800 Subject: [PATCH 04/12] Initial frontend support for coexecute --- .../include/flang/Semantics/openmp-directive-sets.h | 13 +++++++++++++ flang/lib/Lower/OpenMP/OpenMP.cpp | 12 ++++++++++++ flang/lib/Parser/openmp-parsers.cpp | 5 ++++- flang/lib/Semantics/resolve-directives.cpp | 6 ++++++ 4 files changed, 35 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index dd610c9702c28..5c316e030c63f 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -143,6 +143,7 @@ static const OmpDirectiveSet topTargetSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, + Directive::OMPD_target_teams_coexecute, }; static const OmpDirectiveSet allTargetSet{topTargetSet}; @@ -187,9 +188,16 @@ static const OmpDirectiveSet allTeamsSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, + Directive::OMPD_target_teams_coexecute, } | topTeamsSet, }; +static const OmpDirectiveSet allCoexecuteSet{ + Directive::OMPD_coexecute, + Directive::OMPD_teams_coexecute, + Directive::OMPD_target_teams_coexecute, +}; + //===----------------------------------------------------------------------===// // Directive sets for 
groups of multiple directives //===----------------------------------------------------------------------===// @@ -230,6 +238,9 @@ static const OmpDirectiveSet blockConstructSet{ Directive::OMPD_taskgroup, Directive::OMPD_teams, Directive::OMPD_workshare, + Directive::OMPD_target_teams_coexecute, + Directive::OMPD_teams_coexecute, + Directive::OMPD_coexecute, }; static const OmpDirectiveSet loopConstructSet{ @@ -294,6 +305,7 @@ static const OmpDirectiveSet workShareSet{ Directive::OMPD_scope, Directive::OMPD_sections, Directive::OMPD_single, + Directive::OMPD_coexecute, } | allDoSet, }; @@ -376,6 +388,7 @@ static const OmpDirectiveSet nestedReduceWorkshareAllowedSet{ }; static const OmpDirectiveSet nestedTeamsAllowedSet{ + Directive::OMPD_coexecute, Directive::OMPD_distribute, Directive::OMPD_distribute_parallel_do, Directive::OMPD_distribute_parallel_do_simd, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 61bbc709872fd..b0c65c8e37988 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2670,6 +2670,15 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static mlir::omp::CoexecuteOp +genCoexecuteOp(Fortran::lower::AbstractConverter &converter, + Fortran::lower::pft::Evaluation &eval, + mlir::Location currentLocation, + const Fortran::parser::OmpClauseList &clauseList) { + return genOpWithBody( + converter, eval, currentLocation, /*outerCombined=*/false, &clauseList); +} + //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// @@ -3929,6 +3938,9 @@ static void genOMPDispatch(lower::AbstractConverter &converter, newOp = genTeamsOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); break; + case llvm::omp::Directive::OMPD_coexecute: + newOp = genCoexecuteOp(converter, eval, 
currentLocation, beginClauseList); + break; case llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: { unsigned version = semaCtx.langOptions().OpenMPVersion; diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..591b1642baed3 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1344,12 +1344,15 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_coexecute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_teams_coexecute), "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), + "COEXECUTE" >> pure(llvm::omp::Directive::OMPD_coexecute)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 9fa7bc8964854..ae297f204356a 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1617,6 +1617,9 @@ bool OmpAttributeVisitor::Pre(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_taskgroup: case llvm::omp::Directive::OMPD_teams: + case llvm::omp::Directive::OMPD_coexecute: + case llvm::omp::Directive::OMPD_teams_coexecute: + case llvm::omp::Directive::OMPD_target_teams_coexecute: case llvm::omp::Directive::OMPD_workshare: case 
llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: @@ -1650,6 +1653,9 @@ void OmpAttributeVisitor::Post(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_target: case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_teams: + case llvm::omp::Directive::OMPD_coexecute: + case llvm::omp::Directive::OMPD_teams_coexecute: + case llvm::omp::Directive::OMPD_target_teams_coexecute: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: case llvm::omp::Directive::OMPD_target_parallel: { >From ca0cc44c621fde89f1889fb328e66755ca3f5e3a Mon Sep 17 00:00:00 2001 From: skc7 Date: Tue, 13 May 2025 15:09:45 +0530 Subject: [PATCH 05/12] [OpenMP] Fixes for coexecute definitions --- .../flang/Semantics/openmp-directive-sets.h | 1 + flang/lib/Lower/OpenMP/OpenMP.cpp | 13 ++-- flang/test/Lower/OpenMP/coexecute.f90 | 59 +++++++++++++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 33 +++++------ 4 files changed, 83 insertions(+), 23 deletions(-) create mode 100644 flang/test/Lower/OpenMP/coexecute.f90 diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index 5c316e030c63f..43f4e642b3d86 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -173,6 +173,7 @@ static const OmpDirectiveSet topTeamsSet{ Directive::OMPD_teams_distribute_parallel_do_simd, Directive::OMPD_teams_distribute_simd, Directive::OMPD_teams_loop, + Directive::OMPD_teams_coexecute, }; static const OmpDirectiveSet bottomTeamsSet{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index b0c65c8e37988..80612bd05ad97 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2671,12 +2671,13 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, } static 
mlir::omp::CoexecuteOp -genCoexecuteOp(Fortran::lower::AbstractConverter &converter, - Fortran::lower::pft::Evaluation &eval, - mlir::Location currentLocation, - const Fortran::parser::OmpClauseList &clauseList) { +genCoexecuteOp(lower::AbstractConverter &converter, lower::SymMap &symTable, + semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, + mlir::Location loc, const ConstructQueue &queue, + ConstructQueue::const_iterator item) { return genOpWithBody( - converter, eval, currentLocation, /*outerCombined=*/false, &clauseList); + OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, + llvm::omp::Directive::OMPD_coexecute), queue, item); } //===----------------------------------------------------------------------===// @@ -3939,7 +3940,7 @@ static void genOMPDispatch(lower::AbstractConverter &converter, item); break; case llvm::omp::Directive::OMPD_coexecute: - newOp = genCoexecuteOp(converter, eval, currentLocation, beginClauseList); + newOp = genCoexecuteOp(converter, symTable, semaCtx, eval, loc, queue, item); break; case llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: { diff --git a/flang/test/Lower/OpenMP/coexecute.f90 b/flang/test/Lower/OpenMP/coexecute.f90 new file mode 100644 index 0000000000000..b14f71f9bbbfa --- /dev/null +++ b/flang/test/Lower/OpenMP/coexecute.f90 @@ -0,0 +1,59 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! CHECK-LABEL: func @_QPtarget_teams_coexecute +subroutine target_teams_coexecute() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp target teams coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end target teams coexecute +end subroutine target_teams_coexecute + +! CHECK-LABEL: func @_QPteams_coexecute +subroutine teams_coexecute() + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp teams coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! 
CHECK: omp.terminator + !$omp end teams coexecute +end subroutine teams_coexecute + +! CHECK-LABEL: func @_QPtarget_teams_coexecute_m +subroutine target_teams_coexecute_m() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp target + !$omp teams + !$omp coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end coexecute + !$omp end teams + !$omp end target +end subroutine target_teams_coexecute_m + +! CHECK-LABEL: func @_QPteams_coexecute_m +subroutine teams_coexecute_m() + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp teams + !$omp coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end coexecute + !$omp end teams +end subroutine teams_coexecute_m diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 7f450b43c2e36..3f02b6534816f 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -2209,29 +2209,28 @@ def OMP_TargetTeams : Directive<"target teams"> { } def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, + VersionedClause, VersionedClause, VersionedClause, - VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, - VersionedClause, + VersionedClause, ]; - let allowedOnceClauses = [ + VersionedClause, + VersionedClause, VersionedClause, VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, - VersionedClause, VersionedClause, VersionedClause, + VersionedClause, ]; let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; let category = CA_Executable; @@ -2524,20 +2523,20 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { } def OMP_TeamsCoexecute : 
Directive<"teams coexecute"> { let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, + VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, ]; let allowedOnceClauses = [ VersionedClause, VersionedClause, VersionedClause, - VersionedClause + VersionedClause, ]; - let leafConstructs = [OMP_Target, OMP_Teams]; + let leafConstructs = [OMP_Teams, OMP_Coexecute]; let category = CA_Executable; } def OMP_TeamsDistribute : Directive<"teams distribute"> { >From 8077858a88a2ffac2b7d726c1ae5d1f1edb64b67 Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 14 May 2025 14:48:52 +0530 Subject: [PATCH 06/12] [OpenMP] Use workdistribute instead of coexecute --- .../flang/Semantics/openmp-directive-sets.h | 24 ++--- flang/lib/Lower/OpenMP/OpenMP.cpp | 15 ++- flang/lib/Parser/openmp-parsers.cpp | 6 +- flang/lib/Semantics/resolve-directives.cpp | 12 +-- flang/test/Lower/OpenMP/coexecute.f90 | 59 ---------- flang/test/Lower/OpenMP/workdistribute.f90 | 59 ++++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 101 ++++++++++-------- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 28 ++--- 8 files changed, 152 insertions(+), 152 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/coexecute.f90 create mode 100644 flang/test/Lower/OpenMP/workdistribute.f90 diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index 43f4e642b3d86..7ced6ed9b44d6 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -143,7 +143,7 @@ static const OmpDirectiveSet topTargetSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, - Directive::OMPD_target_teams_coexecute, + Directive::OMPD_target_teams_workdistribute, }; static const OmpDirectiveSet allTargetSet{topTargetSet}; @@ -173,7 
+173,7 @@ static const OmpDirectiveSet topTeamsSet{ Directive::OMPD_teams_distribute_parallel_do_simd, Directive::OMPD_teams_distribute_simd, Directive::OMPD_teams_loop, - Directive::OMPD_teams_coexecute, + Directive::OMPD_teams_workdistribute, }; static const OmpDirectiveSet bottomTeamsSet{ @@ -189,14 +189,14 @@ static const OmpDirectiveSet allTeamsSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, - Directive::OMPD_target_teams_coexecute, + Directive::OMPD_target_teams_workdistribute, } | topTeamsSet, }; -static const OmpDirectiveSet allCoexecuteSet{ - Directive::OMPD_coexecute, - Directive::OMPD_teams_coexecute, - Directive::OMPD_target_teams_coexecute, +static const OmpDirectiveSet allWorkdistributeSet{ + Directive::OMPD_workdistribute, + Directive::OMPD_teams_workdistribute, + Directive::OMPD_target_teams_workdistribute, }; //===----------------------------------------------------------------------===// @@ -239,9 +239,9 @@ static const OmpDirectiveSet blockConstructSet{ Directive::OMPD_taskgroup, Directive::OMPD_teams, Directive::OMPD_workshare, - Directive::OMPD_target_teams_coexecute, - Directive::OMPD_teams_coexecute, - Directive::OMPD_coexecute, + Directive::OMPD_target_teams_workdistribute, + Directive::OMPD_teams_workdistribute, + Directive::OMPD_workdistribute, }; static const OmpDirectiveSet loopConstructSet{ @@ -306,7 +306,7 @@ static const OmpDirectiveSet workShareSet{ Directive::OMPD_scope, Directive::OMPD_sections, Directive::OMPD_single, - Directive::OMPD_coexecute, + Directive::OMPD_workdistribute, } | allDoSet, }; @@ -389,7 +389,7 @@ static const OmpDirectiveSet nestedReduceWorkshareAllowedSet{ }; static const OmpDirectiveSet nestedTeamsAllowedSet{ - Directive::OMPD_coexecute, + Directive::OMPD_workdistribute, Directive::OMPD_distribute, Directive::OMPD_distribute_parallel_do, Directive::OMPD_distribute_parallel_do_simd, diff --git 
a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 80612bd05ad97..42d04bceddb12 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2670,14 +2670,14 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } -static mlir::omp::CoexecuteOp -genCoexecuteOp(lower::AbstractConverter &converter, lower::SymMap &symTable, +static mlir::omp::WorkdistributeOp +genWorkdistributeOp(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, mlir::Location loc, const ConstructQueue &queue, ConstructQueue::const_iterator item) { - return genOpWithBody( + return genOpWithBody( OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, - llvm::omp::Directive::OMPD_coexecute), queue, item); + llvm::omp::Directive::OMPD_workdistribute), queue, item); } //===----------------------------------------------------------------------===// @@ -3939,16 +3939,15 @@ static void genOMPDispatch(lower::AbstractConverter &converter, newOp = genTeamsOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); break; - case llvm::omp::Directive::OMPD_coexecute: - newOp = genCoexecuteOp(converter, symTable, semaCtx, eval, loc, queue, item); - break; case llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: { unsigned version = semaCtx.langOptions().OpenMPVersion; TODO(loc, "Unhandled loop directive (" + llvm::omp::getOpenMPDirectiveName(dir, version) + ")"); } - // case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_workdistribute: + newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, item); + break; case llvm::omp::Directive::OMPD_workshare: newOp = genWorkshareOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 
591b1642baed3..5b5ee257edd1f 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1344,15 +1344,15 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_coexecute), + "TARGET TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_teams_coexecute), + "TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_teams_workdistribute), "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), - "COEXECUTE" >> pure(llvm::omp::Directive::OMPD_coexecute)))) + "WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_workdistribute)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index ae297f204356a..4636508ac144d 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1617,9 +1617,9 @@ bool OmpAttributeVisitor::Pre(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_taskgroup: case llvm::omp::Directive::OMPD_teams: - case llvm::omp::Directive::OMPD_coexecute: - case llvm::omp::Directive::OMPD_teams_coexecute: - case llvm::omp::Directive::OMPD_target_teams_coexecute: + case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_teams_workdistribute: + case llvm::omp::Directive::OMPD_target_teams_workdistribute: 
case llvm::omp::Directive::OMPD_workshare: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: @@ -1653,9 +1653,9 @@ void OmpAttributeVisitor::Post(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_target: case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_teams: - case llvm::omp::Directive::OMPD_coexecute: - case llvm::omp::Directive::OMPD_teams_coexecute: - case llvm::omp::Directive::OMPD_target_teams_coexecute: + case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_teams_workdistribute: + case llvm::omp::Directive::OMPD_target_teams_workdistribute: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: case llvm::omp::Directive::OMPD_target_parallel: { diff --git a/flang/test/Lower/OpenMP/coexecute.f90 b/flang/test/Lower/OpenMP/coexecute.f90 deleted file mode 100644 index b14f71f9bbbfa..0000000000000 --- a/flang/test/Lower/OpenMP/coexecute.f90 +++ /dev/null @@ -1,59 +0,0 @@ -! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s - -! CHECK-LABEL: func @_QPtarget_teams_coexecute -subroutine target_teams_coexecute() - ! CHECK: omp.target - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp target teams coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end target teams coexecute -end subroutine target_teams_coexecute - -! CHECK-LABEL: func @_QPteams_coexecute -subroutine teams_coexecute() - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp teams coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end teams coexecute -end subroutine teams_coexecute - -! CHECK-LABEL: func @_QPtarget_teams_coexecute_m -subroutine target_teams_coexecute_m() - ! CHECK: omp.target - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp target - !$omp teams - !$omp coexecute - ! 
CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end coexecute - !$omp end teams - !$omp end target -end subroutine target_teams_coexecute_m - -! CHECK-LABEL: func @_QPteams_coexecute_m -subroutine teams_coexecute_m() - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp teams - !$omp coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end coexecute - !$omp end teams -end subroutine teams_coexecute_m diff --git a/flang/test/Lower/OpenMP/workdistribute.f90 b/flang/test/Lower/OpenMP/workdistribute.f90 new file mode 100644 index 0000000000000..924205bb72e5e --- /dev/null +++ b/flang/test/Lower/OpenMP/workdistribute.f90 @@ -0,0 +1,59 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! CHECK-LABEL: func @_QPtarget_teams_workdistribute +subroutine target_teams_workdistribute() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp target teams workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end target teams workdistribute +end subroutine target_teams_workdistribute + +! CHECK-LABEL: func @_QPteams_workdistribute +subroutine teams_workdistribute() + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp teams workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end teams workdistribute +end subroutine teams_workdistribute + +! CHECK-LABEL: func @_QPtarget_teams_workdistribute_m +subroutine target_teams_workdistribute_m() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp target + !$omp teams + !$omp workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end workdistribute + !$omp end teams + !$omp end target +end subroutine target_teams_workdistribute_m + +! 
CHECK-LABEL: func @_QPteams_workdistribute_m +subroutine teams_workdistribute_m() + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp teams + !$omp workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end workdistribute + !$omp end teams +end subroutine teams_workdistribute_m diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 3f02b6534816f..c88a3049450de 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1292,6 +1292,15 @@ def OMP_EndWorkshare : Directive<"end workshare"> { let category = OMP_Workshare.category; let languages = [L_Fortran]; } +def OMP_Workdistribute : Directive<"workdistribute"> { + let association = AS_Block; + let category = CA_Executable; +} +def OMP_EndWorkdistribute : Directive<"end workdistribute"> { + let leafConstructs = OMP_Workdistribute.leafConstructs; + let association = OMP_Workdistribute.association; + let category = OMP_Workdistribute.category; +} //===----------------------------------------------------------------------===// // Definitions of OpenMP compound directives @@ -2207,34 +2216,6 @@ def OMP_TargetTeams : Directive<"target teams"> { let leafConstructs = [OMP_Target, OMP_Teams]; let category = CA_Executable; } -def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; - let category = CA_Executable; -} def OMP_TargetTeamsDistribute : Directive<"target 
teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2457,6 +2438,34 @@ def OMP_TargetTeamsDistributeSimd : let leafConstructs = [OMP_Target, OMP_Teams, OMP_Distribute, OMP_Simd]; let category = CA_Executable; } +def OMP_TargetTeamsWorkdistribute : Directive<"target teams workdistribute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let leafConstructs = [OMP_Target, OMP_Teams, OMP_Workdistribute]; + let category = CA_Executable; +} def OMP_target_teams_loop : Directive<"target teams loop"> { let allowedClauses = [ VersionedClause, @@ -2521,24 +2530,6 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { let leafConstructs = [OMP_TaskLoop, OMP_Simd]; let category = CA_Executable; } -def OMP_TeamsCoexecute : Directive<"teams coexecute"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let leafConstructs = [OMP_Teams, OMP_Coexecute]; - let category = CA_Executable; -} def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2726,3 +2717,21 @@ def OMP_teams_loop : Directive<"teams loop"> { let leafConstructs = [OMP_Teams, OMP_loop]; let category = CA_Executable; } +def OMP_TeamsWorkdistribute : Directive<"teams workdistribute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + 
VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let leafConstructs = [OMP_Teams, OMP_Workdistribute]; + let category = CA_Executable; +} diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 8061aa0209cc9..5e3ab0e908d21 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -326,38 +326,30 @@ def SectionsOp : OpenMP_Op<"sections", traits = [ } //===----------------------------------------------------------------------===// -// Coexecute Construct +// workdistribute Construct //===----------------------------------------------------------------------===// -def CoexecuteOp : OpenMP_Op<"coexecute"> { - let summary = "coexecute directive"; +def WorkdistributeOp : OpenMP_Op<"workdistribute"> { + let summary = "workdistribute directive"; let description = [{ - The coexecute construct specifies that the teams from the teams directive - this is nested in shall cooperate to execute the computation in this region. - There is no implicit barrier at the end as specified in the standard. - - TODO - We should probably change the defaut behaviour to have a barrier unless - nowait is specified, see below snippet. + workdistribute divides execution of the enclosed structured block into + separate units of work, each executed only once by each + initial thread in the league. ``` !$omp target teams - !$omp coexecute + !$omp workdistribute tmp = matmul(x, y) - !$omp end coexecute + !$omp end workdistribute a = tmp(0, 0) ! there is no implicit barrier! the matmul hasnt completed! 
- !$omp end target teams coexecute + !$omp end target teams workdistribute ``` }]; - let arguments = (ins UnitAttr:$nowait); - let regions = (region AnyRegion:$region); - let assemblyFormat = [{ - oilist(`nowait` $nowait) $region attr-dict - }]; + let assemblyFormat = "$region attr-dict"; } //===----------------------------------------------------------------------===// >From 085062f9ebac1079a720f614498c0b124eda8a51 Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 14 May 2025 16:17:14 +0530 Subject: [PATCH 07/12] [OpenMP] workdistribute trivial lowering Lowering logic inspired from ivanradanov coexeute lowering f56da1a207df4a40776a8570122a33f047074a3c --- .../include/flang/Optimizer/OpenMP/Passes.td | 4 + flang/lib/Optimizer/OpenMP/CMakeLists.txt | 1 + .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 101 ++++++++++++++++++ .../OpenMP/lower-workdistribute.mlir | 52 +++++++++ 4 files changed, 158 insertions(+) create mode 100644 flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute.mlir diff --git a/flang/include/flang/Optimizer/OpenMP/Passes.td b/flang/include/flang/Optimizer/OpenMP/Passes.td index 704faf0ccd856..743b6d381ed42 100644 --- a/flang/include/flang/Optimizer/OpenMP/Passes.td +++ b/flang/include/flang/Optimizer/OpenMP/Passes.td @@ -93,6 +93,10 @@ def LowerWorkshare : Pass<"lower-workshare", "::mlir::ModuleOp"> { let summary = "Lower workshare construct"; } +def LowerWorkdistribute : Pass<"lower-workdistribute", "::mlir::ModuleOp"> { + let summary = "Lower workdistribute construct"; +} + def GenericLoopConversionPass : Pass<"omp-generic-loop-conversion", "mlir::func::FuncOp"> { let summary = "Converts OpenMP generic `omp.loop` to semantically " diff --git a/flang/lib/Optimizer/OpenMP/CMakeLists.txt b/flang/lib/Optimizer/OpenMP/CMakeLists.txt index e31543328a9f9..cd746834741f9 100644 --- a/flang/lib/Optimizer/OpenMP/CMakeLists.txt +++ b/flang/lib/Optimizer/OpenMP/CMakeLists.txt @@ -7,6 +7,7 @@ 
add_flang_library(FlangOpenMPTransforms MapsForPrivatizedSymbols.cpp MapInfoFinalization.cpp MarkDeclareTarget.cpp + LowerWorkdistribute.cpp LowerWorkshare.cpp LowerNontemporal.cpp diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp new file mode 100644 index 0000000000000..75c9d2b0d494e --- /dev/null +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -0,0 +1,101 @@ +//===- LowerWorkshare.cpp - special cases for bufferization -------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the lowering of omp.workdistribute. +// +//===----------------------------------------------------------------------===// + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include + +namespace flangomp { +#define GEN_PASS_DEF_LOWERWORKDISTRIBUTE +#include "flang/Optimizer/OpenMP/Passes.h.inc" +} // namespace flangomp + +#define DEBUG_TYPE "lower-workdistribute" + +using namespace mlir; + +namespace { + +struct WorkdistributeToSingle : public mlir::OpRewritePattern { +using OpRewritePattern::OpRewritePattern; +mlir::LogicalResult + matchAndRewrite(mlir::omp::WorkdistributeOp workdistribute, + mlir::PatternRewriter &rewriter) const override { + auto loc = workdistribute->getLoc(); + auto teams = llvm::dyn_cast(workdistribute->getParentOp()); + if (!teams) { + mlir::emitError(loc, "workdistribute not nested in teams\n"); + return mlir::failure(); + } + if (workdistribute.getRegion().getBlocks().size() != 
1) { + mlir::emitError(loc, "workdistribute with multiple blocks\n"); + return mlir::failure(); + } + if (teams.getRegion().getBlocks().size() != 1) { + mlir::emitError(loc, "teams with multiple blocks\n"); + return mlir::failure(); + } + if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { + mlir::emitError(loc, "teams with multiple nested ops\n"); + return mlir::failure(); + } + mlir::Block *workdistributeBlock = &workdistribute.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teams); + rewriter.eraseOp(teams); + return mlir::success(); + } +}; + +class LowerWorkdistributePass + : public flangomp::impl::LowerWorkdistributeBase { +public: + void runOnOperation() override { + mlir::MLIRContext &context = getContext(); + mlir::RewritePatternSet patterns(&context); + mlir::GreedyRewriteConfig config; + // prevent the pattern driver form merging blocks + config.setRegionSimplificationLevel( + mlir::GreedySimplifyRegionLevel::Disabled); + + patterns.insert(&context); + mlir::Operation *op = getOperation(); + if (mlir::failed(mlir::applyPatternsGreedily(op, std::move(patterns), config))) { + mlir::emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } + } +}; +} diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute.mlir new file mode 100644 index 0000000000000..34c8c3f01976d --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute.mlir @@ -0,0 +1,52 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @_QPtarget_simple() { +// CHECK: %[[VAL_0:.*]] = arith.constant 2 : i32 +// CHECK: %[[VAL_1:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFtarget_simpleEa"} +// CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_1]] {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_3:.*]] = fir.alloca 
!fir.box> {bindc_name = "simple_var", uniq_name = "_QFtarget_simpleEsimple_var"} +// CHECK: %[[VAL_4:.*]] = fir.zero_bits !fir.heap +// CHECK: %[[VAL_5:.*]] = fir.embox %[[VAL_4]] : (!fir.heap) -> !fir.box> +// CHECK: fir.store %[[VAL_5]] to %[[VAL_3]] : !fir.ref>> +// CHECK: %[[VAL_6:.*]]:2 = hlfir.declare %[[VAL_3]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +// CHECK: hlfir.assign %[[VAL_0]] to %[[VAL_2]]#0 : i32, !fir.ref +// CHECK: %[[VAL_7:.*]] = omp.map.info var_ptr(%[[VAL_2]]#1 : !fir.ref, i32) map_clauses(to) capture(ByRef) -> !fir.ref {name = "a"} +// CHECK: omp.target map_entries(%[[VAL_7]] -> %[[VAL_8:.*]] : !fir.ref) private(@_QFtarget_simpleEsimple_var_private_ref_box_heap_i32 %[[VAL_6]]#0 -> %[[VAL_9:.*]] : !fir.ref>>) { +// CHECK: %[[VAL_10:.*]] = arith.constant 10 : i32 +// CHECK: %[[VAL_11:.*]]:2 = hlfir.declare %[[VAL_8]] {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_12:.*]]:2 = hlfir.declare %[[VAL_9]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +// CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_11]]#0 : !fir.ref +// CHECK: %[[VAL_14:.*]] = arith.addi %[[VAL_13]], %[[VAL_10]] : i32 +// CHECK: hlfir.assign %[[VAL_14]] to %[[VAL_12]]#0 realloc : i32, !fir.ref>> +// CHECK: omp.terminator +// CHECK: } +// CHECK: return +// CHECK: } +func.func @_QPtarget_simple() { + %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFtarget_simpleEa"} + %1:2 = hlfir.declare %0 {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %2 = fir.alloca !fir.box> {bindc_name = "simple_var", uniq_name = "_QFtarget_simpleEsimple_var"} + %3 = fir.zero_bits !fir.heap + %4 = fir.embox %3 : (!fir.heap) -> !fir.box> + fir.store %4 to %2 : !fir.ref>> + %5:2 = hlfir.declare %2 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> 
(!fir.ref>>, !fir.ref>>) + %c2_i32 = arith.constant 2 : i32 + hlfir.assign %c2_i32 to %1#0 : i32, !fir.ref + %6 = omp.map.info var_ptr(%1#1 : !fir.ref, i32) map_clauses(to) capture(ByRef) -> !fir.ref {name = "a"} + omp.target map_entries(%6 -> %arg0 : !fir.ref) private(@_QFtarget_simpleEsimple_var_private_ref_box_heap_i32 %5#0 -> %arg1 : !fir.ref>>){ + omp.teams { + omp.workdistribute { + %11:2 = hlfir.declare %arg0 {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %12:2 = hlfir.declare %arg1 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) + %c10_i32 = arith.constant 10 : i32 + %13 = fir.load %11#0 : !fir.ref + %14 = arith.addi %c10_i32, %13 : i32 + hlfir.assign %14 to %12#0 realloc : i32, !fir.ref>> + omp.terminator + } + omp.terminator + } + omp.terminator + } + return +} \ No newline at end of file >From c9b63efe85f7aed781a4a0fd7d0888b595f2a520 Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 14 May 2025 19:29:33 +0530 Subject: [PATCH 08/12] [Flang][OpenMP] Add workdistribute lower pass to pipeline --- flang/lib/Optimizer/Passes/Pipelines.cpp | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..15983f80c1e4b 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -278,8 +278,10 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP, addNestedPassToAllTopLevelOperations( pm, hlfir::createInlineHLFIRAssign); pm.addPass(hlfir::createConvertHLFIRtoFIR()); - if (enableOpenMP) + if (enableOpenMP) { pm.addPass(flangomp::createLowerWorkshare()); + pm.addPass(flangomp::createLowerWorkdistribute()); + } } /// Create a pass pipeline for handling certain OpenMP transformations needed >From 048c3f22d55248a21e53ee3f4be2c0b07b500039 Mon Sep 17 00:00:00 2001 From: skc7 Date: Thu, 15 May 2025 
16:39:21 +0530 Subject: [PATCH 09/12] [Flang][OpenMP] Add FissionWorkdistribute lowering. Fission logic inspired from ivanradanov implementation : c97eca4010e460aac5a3d795614ca0980bce4565 --- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 233 ++++++++++++++---- .../OpenMP/lower-workdistribute-fission.mlir | 60 +++++ ...ir => lower-workdistribute-to-single.mlir} | 2 +- 3 files changed, 243 insertions(+), 52 deletions(-) create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir rename flang/test/Transforms/OpenMP/{lower-workdistribute.mlir => lower-workdistribute-to-single.mlir} (99%) diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index 75c9d2b0d494e..f799202be2645 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -10,31 +10,26 @@ // //===----------------------------------------------------------------------===// -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Dialect/FIROps.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "flang/Optimizer/HLFIR/Passes.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/Builders.h" +#include "mlir/IR/Value.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" #include #include -#include -#include -#include +#include +#include #include +#include #include -#include #include -#include -#include #include #include -#include "mlir/Transforms/GreedyPatternRewriteDriver.h" - +#include #include namespace flangomp { @@ -48,52 +43,188 @@ using namespace mlir; namespace { -struct WorkdistributeToSingle : public mlir::OpRewritePattern { -using OpRewritePattern::OpRewritePattern; -mlir::LogicalResult - matchAndRewrite(mlir::omp::WorkdistributeOp workdistribute, - 
mlir::PatternRewriter &rewriter) const override { - auto loc = workdistribute->getLoc(); - auto teams = llvm::dyn_cast(workdistribute->getParentOp()); - if (!teams) { - mlir::emitError(loc, "workdistribute not nested in teams\n"); - return mlir::failure(); - } - if (workdistribute.getRegion().getBlocks().size() != 1) { - mlir::emitError(loc, "workdistribute with multiple blocks\n"); - return mlir::failure(); +template +static T getPerfectlyNested(Operation *op) { + if (op->getNumRegions() != 1) + return nullptr; + auto &region = op->getRegion(0); + if (region.getBlocks().size() != 1) + return nullptr; + auto *block = &region.front(); + auto *firstOp = &block->front(); + if (auto nested = dyn_cast(firstOp)) + if (firstOp->getNextNode() == block->getTerminator()) + return nested; + return nullptr; +} + +/// This is the single source of truth about whether we should parallelize an +/// operation nested in an omp.workdistribute region. +static bool shouldParallelize(Operation *op) { + // Currently we cannot parallelize operations with results that have uses + if (llvm::any_of(op->getResults(), + [](OpResult v) -> bool { return !v.use_empty(); })) + return false; + // We will parallelize unordered loops - these come from array syntax + if (auto loop = dyn_cast(op)) { + auto unordered = loop.getUnordered(); + if (!unordered) + return false; + return *unordered; + } + if (auto callOp = dyn_cast(op)) { + auto callee = callOp.getCallee(); + if (!callee) + return false; + auto *func = op->getParentOfType().lookupSymbol(*callee); + // TODO need to insert a check here whether it is a call we can actually + // parallelize currently + if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) + return true; + return false; + } + // We cannot parallise anything else + return false; +} + +struct WorkdistributeToSingle : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const 
override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); } - if (teams.getRegion().getBlocks().size() != 1) { - mlir::emitError(loc, "teams with multiple blocks\n"); - return mlir::failure(); + + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + workdistributeOp.emitWarning("unable to parallelize coexecute"); + return success(); + } +}; + +/// If B() and D() are parallelizable, +/// +/// omp.teams { +/// omp.workdistribute { +/// A() +/// B() +/// C() +/// D() +/// E() +/// } +/// } +/// +/// becomes +/// +/// A() +/// omp.teams { +/// omp.workdistribute { +/// B() +/// } +/// } +/// C() +/// omp.teams { +/// omp.workdistribute { +/// D() +/// } +/// } +/// E() + +struct FissionWorkdistribute + : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult + matchAndRewrite(omp::WorkdistributeOp workdistribute, + PatternRewriter &rewriter) const override { + auto loc = workdistribute->getLoc(); + auto teams = dyn_cast(workdistribute->getParentOp()); + if (!teams) { + emitError(loc, "workdistribute not nested in teams\n"); + return failure(); + } + if (workdistribute.getRegion().getBlocks().size() != 1) { + emitError(loc, "workdistribute with multiple blocks\n"); + return failure(); + } + if (teams.getRegion().getBlocks().size() != 1) { + emitError(loc, "teams with multiple blocks\n"); + return failure(); + } + if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { + emitError(loc, "teams with multiple nested ops\n"); + return failure(); + } + + auto *teamsBlock = &teams.getRegion().front(); + + // While we have unhandled operations in the original workdistribute + auto *workdistributeBlock = &workdistribute.getRegion().front(); + 
auto *terminator = workdistributeBlock->getTerminator(); + bool changed = false; + while (&workdistributeBlock->front() != terminator) { + rewriter.setInsertionPoint(teams); + IRMapping mapping; + llvm::SmallVector hoisted; + Operation *parallelize = nullptr; + for (auto &op : workdistribute.getOps()) { + if (&op == terminator) { + break; } - if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { - mlir::emitError(loc, "teams with multiple nested ops\n"); - return mlir::failure(); + if (shouldParallelize(&op)) { + parallelize = &op; + break; + } else { + rewriter.clone(op, mapping); + hoisted.push_back(&op); + changed = true; } - mlir::Block *workdistributeBlock = &workdistribute.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teams); - rewriter.eraseOp(teams); - return mlir::success(); + } + + for (auto *op : hoisted) + rewriter.replaceOp(op, mapping.lookup(op)); + + if (parallelize && hoisted.empty() && + parallelize->getNextNode() == terminator) + break; + if (parallelize) { + auto newTeams = rewriter.cloneWithoutRegions(teams); + auto *newTeamsBlock = rewriter.createBlock( + &newTeams.getRegion(), newTeams.getRegion().begin(), {}, {}); + for (auto arg : teamsBlock->getArguments()) + newTeamsBlock->addArgument(arg.getType(), arg.getLoc()); + auto newWorkdistribute = rewriter.create(loc); + rewriter.create(loc); + rewriter.createBlock(&newWorkdistribute.getRegion(), + newWorkdistribute.getRegion().begin(), {}, {}); + auto *cloned = rewriter.clone(*parallelize); + rewriter.replaceOp(parallelize, cloned); + rewriter.create(loc); + changed = true; + } } + return success(changed); + } }; class LowerWorkdistributePass : public flangomp::impl::LowerWorkdistributeBase { public: void runOnOperation() override { - mlir::MLIRContext &context = getContext(); - mlir::RewritePatternSet patterns(&context); - mlir::GreedyRewriteConfig config; + MLIRContext &context = getContext(); 
+ RewritePatternSet patterns(&context); + GreedyRewriteConfig config; // prevent the pattern driver form merging blocks config.setRegionSimplificationLevel( - mlir::GreedySimplifyRegionLevel::Disabled); + GreedySimplifyRegionLevel::Disabled); - patterns.insert(&context); - mlir::Operation *op = getOperation(); - if (mlir::failed(mlir::applyPatternsGreedily(op, std::move(patterns), config))) { - mlir::emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + patterns.insert(&context); + Operation *op = getOperation(); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); } } diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir new file mode 100644 index 0000000000000..ea03a10dd3d44 --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir @@ -0,0 +1,60 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @test_fission_workdistribute({{.*}}) { +// CHECK: %[[VAL_0:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_2:.*]] = arith.constant 9 : index +// CHECK: %[[VAL_3:.*]] = arith.constant 5.000000e+00 : f32 +// CHECK: fir.store %[[VAL_3]] to %[[ARG2:.*]] : !fir.ref +// CHECK: fir.do_loop %[[VAL_4:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] unordered { +// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref +// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: } +// CHECK: fir.call @regular_side_effect_func(%[[ARG2:.*]]) : (!fir.ref) -> () +// CHECK: fir.call @my_fir_parallel_runtime_func(%[[ARG3:.*]]) : (!fir.ref) -> () +// CHECK: fir.do_loop %[[VAL_8:.*]] = 
%[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] { +// CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_8]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_3]] to %[[VAL_9]] : !fir.ref +// CHECK: } +// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2:.*]] : !fir.ref +// CHECK: fir.store %[[VAL_10]] to %[[ARG3:.*]] : !fir.ref +// CHECK: return +// CHECK: } +module { +func.func @regular_side_effect_func(%arg0: !fir.ref) { + return +} +func.func @my_fir_parallel_runtime_func(%arg0: !fir.ref) attributes {fir.runtime} { + return +} +func.func @test_fission_workdistribute(%arr1: !fir.ref>, %arr2: !fir.ref>, %scalar_ref1: !fir.ref, %scalar_ref2: !fir.ref) { + %c0_idx = arith.constant 0 : index + %c1_idx = arith.constant 1 : index + %c9_idx = arith.constant 9 : index + %float_val = arith.constant 5.0 : f32 + omp.teams { + omp.workdistribute { + fir.store %float_val to %scalar_ref1 : !fir.ref + fir.do_loop %iv = %c0_idx to %c9_idx step %c1_idx unordered { + %elem_ptr_arr1 = fir.coordinate_of %arr1, %iv : (!fir.ref>, index) -> !fir.ref + %loaded_val_loop1 = fir.load %elem_ptr_arr1 : !fir.ref + %elem_ptr_arr2 = fir.coordinate_of %arr2, %iv : (!fir.ref>, index) -> !fir.ref + fir.store %loaded_val_loop1 to %elem_ptr_arr2 : !fir.ref + } + fir.call @regular_side_effect_func(%scalar_ref1) : (!fir.ref) -> () + fir.call @my_fir_parallel_runtime_func(%scalar_ref2) : (!fir.ref) -> () + fir.do_loop %jv = %c0_idx to %c9_idx step %c1_idx { + %elem_ptr_ordered_loop = fir.coordinate_of %arr1, %jv : (!fir.ref>, index) -> !fir.ref + fir.store %float_val to %elem_ptr_ordered_loop : !fir.ref + } + %loaded_for_hoist = fir.load %scalar_ref1 : !fir.ref + fir.store %loaded_for_hoist to %scalar_ref2 : !fir.ref + omp.terminator + } + omp.terminator + } + return +} +} diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir similarity index 99% rename from flang/test/Transforms/OpenMP/lower-workdistribute.mlir 
rename to flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir index 34c8c3f01976d..0cc2aeded2532 100644 --- a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir @@ -49,4 +49,4 @@ func.func @_QPtarget_simple() { omp.terminator } return -} \ No newline at end of file +} >From 5b30d3dcb80cb4cef546f5bfdf3aa389f527d07d Mon Sep 17 00:00:00 2001 From: skc7 Date: Sun, 18 May 2025 12:37:53 +0530 Subject: [PATCH 10/12] [OpenMP][Flang] Lower teams workdistribute do_loop to wsloop. Logic inspired from ivanradanov commit 5682e9ea7fcba64693f7cfdc0f1970fab2d7d4ae --- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 177 +++++++++++++++--- .../OpenMP/lower-workdistribute-doloop.mlir | 28 +++ .../OpenMP/lower-workdistribute-fission.mlir | 22 ++- 3 files changed, 193 insertions(+), 34 deletions(-) create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index f799202be2645..de208a8190650 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -6,18 +6,22 @@ // //===----------------------------------------------------------------------===// // -// This file implements the lowering of omp.workdistribute. +// This file implements the lowering and optimisations of omp.workdistribute. 
// //===----------------------------------------------------------------------===// +#include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Dialect/FIRDialect.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Optimizer/HLFIR/Passes.h" +#include "flang/Optimizer/OpenMP/Utils.h" +#include "mlir/Analysis/SliceAnalysis.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Value.h" +#include "mlir/Transforms/DialectConversion.h" #include "mlir/Transforms/GreedyPatternRewriteDriver.h" #include #include @@ -29,6 +33,7 @@ #include #include #include +#include "mlir/Transforms/RegionUtils.h" #include #include @@ -87,25 +92,6 @@ static bool shouldParallelize(Operation *op) { return false; } -struct WorkdistributeToSingle : public OpRewritePattern { - using OpRewritePattern::OpRewritePattern; - LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, - PatternRewriter &rewriter) const override { - auto workdistributeOp = getPerfectlyNested(teamsOp); - if (!workdistributeOp) { - LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); - return failure(); - } - - Block *workdistributeBlock = &workdistributeOp.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); - rewriter.eraseOp(teamsOp); - workdistributeOp.emitWarning("unable to parallelize coexecute"); - return success(); - } -}; - /// If B() and D() are parallelizable, /// /// omp.teams { @@ -210,22 +196,161 @@ struct FissionWorkdistribute } }; +static void +genLoopNestClauseOps(mlir::Location loc, + mlir::PatternRewriter &rewriter, + fir::DoLoopOp loop, + mlir::omp::LoopNestOperands &loopNestClauseOps) { + assert(loopNestClauseOps.loopLowerBounds.empty() && + "Loop nest bounds were already emitted!"); + 
loopNestClauseOps.loopLowerBounds.push_back(loop.getLowerBound()); + loopNestClauseOps.loopUpperBounds.push_back(loop.getUpperBound()); + loopNestClauseOps.loopSteps.push_back(loop.getStep()); + loopNestClauseOps.loopInclusive = rewriter.getUnitAttr(); +} + +static void +genWsLoopOp(mlir::PatternRewriter &rewriter, + fir::DoLoopOp doLoop, + const mlir::omp::LoopNestOperands &clauseOps) { + + auto wsloopOp = rewriter.create(doLoop.getLoc()); + rewriter.createBlock(&wsloopOp.getRegion()); + + auto loopNestOp = + rewriter.create(doLoop.getLoc(), clauseOps); + + // Clone the loop's body inside the loop nest construct using the + // mapped values. + rewriter.cloneRegionBefore(doLoop.getRegion(), loopNestOp.getRegion(), + loopNestOp.getRegion().begin()); + Block *clonedBlock = &loopNestOp.getRegion().back(); + mlir::Operation *terminatorOp = clonedBlock->getTerminator(); + + // Erase fir.result op of do loop and create yield op. + if (auto resultOp = dyn_cast(terminatorOp)) { + rewriter.setInsertionPoint(terminatorOp); + rewriter.create(doLoop->getLoc()); + rewriter.eraseOp(terminatorOp); + } + return; +} + +/// If fir.do_loop id present inside teams workdistribute +/// +/// omp.teams { +/// omp.workdistribute { +/// fir.do_loop unoredered { +/// ... +/// } +/// } +/// } +/// +/// Then, its lowered to +/// +/// omp.teams { +/// omp.workdistribute { +/// omp.parallel { +/// omp.wsloop { +/// omp.loop_nest +/// ... 
+/// } +/// } +/// } +/// } +/// } + +struct TeamsWorkdistributeLowering : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const override { + auto teamsLoc = teamsOp->getLoc(); + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + assert(teamsOp.getReductionVars().empty()); + + auto doLoop = getPerfectlyNested(workdistributeOp); + if (doLoop && shouldParallelize(doLoop)) { + + auto parallelOp = rewriter.create(teamsLoc); + rewriter.createBlock(&parallelOp.getRegion()); + rewriter.setInsertionPoint(rewriter.create(doLoop.getLoc())); + + mlir::omp::LoopNestOperands loopNestClauseOps; + genLoopNestClauseOps(doLoop.getLoc(), rewriter, doLoop, + loopNestClauseOps); + + genWsLoopOp(rewriter, doLoop, loopNestClauseOps); + rewriter.setInsertionPoint(doLoop); + rewriter.eraseOp(doLoop); + return success(); + } + return failure(); + } +}; + + +/// If A() and B () are present inside teams workdistribute +/// +/// omp.teams { +/// omp.workdistribute { +/// A() +/// B() +/// } +/// } +/// +/// Then, its lowered to +/// +/// A() +/// B() +/// + +struct TeamsWorkdistributeToSingle : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + return success(); + } +}; + +class LowerWorkdistributePass : public flangomp::impl::LowerWorkdistributeBase { 
public: void runOnOperation() override { MLIRContext &context = getContext(); - RewritePatternSet patterns(&context); GreedyRewriteConfig config; // prevent the pattern driver form merging blocks config.setRegionSimplificationLevel( GreedySimplifyRegionLevel::Disabled); - - patterns.insert(&context); + Operation *op = getOperation(); - if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { - emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); - signalPassFailure(); + { + RewritePatternSet patterns(&context); + patterns.insert(&context); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } + } + { + RewritePatternSet patterns(&context); + patterns.insert(&context); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } } } }; diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir new file mode 100644 index 0000000000000..666bdb3ced647 --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir @@ -0,0 +1,28 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @x({{.*}}) +// CHECK: %[[VAL_0:.*]] = arith.constant 0 : index +// CHECK: omp.parallel { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_1:.*]]) : index = (%[[ARG0:.*]]) to (%[[ARG1:.*]]) inclusive step (%[[ARG2:.*]]) { +// CHECK: fir.store %[[VAL_0]] to %[[ARG4:.*]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } +// CHECK: omp.terminator +// CHECK: } +// CHECK: return +// CHECK: } +func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref) { + omp.teams { + omp.workdistribute { + fir.do_loop %iv = %lb to %ub step %step unordered { + %zero = arith.constant 0 : index + fir.store %zero to %addr : !fir.ref + } + 
omp.terminator + } + omp.terminator + } + return +} \ No newline at end of file diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir index ea03a10dd3d44..cf50d135d01ec 100644 --- a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir @@ -6,20 +6,26 @@ // CHECK: %[[VAL_2:.*]] = arith.constant 9 : index // CHECK: %[[VAL_3:.*]] = arith.constant 5.000000e+00 : f32 // CHECK: fir.store %[[VAL_3]] to %[[ARG2:.*]] : !fir.ref -// CHECK: fir.do_loop %[[VAL_4:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] unordered { -// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref -// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: omp.parallel { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_4:.*]]) : index = (%[[VAL_0]]) to (%[[VAL_2]]) inclusive step (%[[VAL_1]]) { +// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref +// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } +// CHECK: omp.terminator // CHECK: } // CHECK: fir.call @regular_side_effect_func(%[[ARG2:.*]]) : (!fir.ref) -> () // CHECK: fir.call @my_fir_parallel_runtime_func(%[[ARG3:.*]]) : (!fir.ref) -> () // CHECK: fir.do_loop %[[VAL_8:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] { -// CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_8]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0]], %[[VAL_8]] : (!fir.ref>, index) 
-> !fir.ref // CHECK: fir.store %[[VAL_3]] to %[[VAL_9]] : !fir.ref // CHECK: } -// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2:.*]] : !fir.ref -// CHECK: fir.store %[[VAL_10]] to %[[ARG3:.*]] : !fir.ref +// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2]] : !fir.ref +// CHECK: fir.store %[[VAL_10]] to %[[ARG3]] : !fir.ref // CHECK: return // CHECK: } module { >From df65bd53111948abf6f9c2e1e0b8e27aa5e01946 Mon Sep 17 00:00:00 2001 From: skc7 Date: Mon, 19 May 2025 15:33:53 +0530 Subject: [PATCH 11/12] clang format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 18 +-- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 108 +++++++++--------- flang/lib/Parser/openmp-parsers.cpp | 6 +- .../OpenMP/lower-workdistribute-doloop.mlir | 2 +- 4 files changed, 67 insertions(+), 67 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 42d04bceddb12..ebf0710ab4feb 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2670,14 +2670,15 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } -static mlir::omp::WorkdistributeOp -genWorkdistributeOp(lower::AbstractConverter &converter, lower::SymMap &symTable, - semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - mlir::Location loc, const ConstructQueue &queue, - ConstructQueue::const_iterator item) { +static mlir::omp::WorkdistributeOp genWorkdistributeOp( + lower::AbstractConverter &converter, lower::SymMap &symTable, + semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, + mlir::Location loc, const ConstructQueue &queue, + ConstructQueue::const_iterator item) { return genOpWithBody( - OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, - llvm::omp::Directive::OMPD_workdistribute), queue, item); + OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, + llvm::omp::Directive::OMPD_workdistribute), + queue, item); } 
//===----------------------------------------------------------------------===// @@ -3946,7 +3947,8 @@ static void genOMPDispatch(lower::AbstractConverter &converter, llvm::omp::getOpenMPDirectiveName(dir, version) + ")"); } case llvm::omp::Directive::OMPD_workdistribute: - newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, item); + newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, + item); break; case llvm::omp::Directive::OMPD_workshare: newOp = genWorkshareOp(converter, symTable, stmtCtx, semaCtx, eval, loc, diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index de208a8190650..f75d4d1988fd2 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -14,15 +14,16 @@ #include "flang/Optimizer/Dialect/FIRDialect.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" -#include "flang/Optimizer/Transforms/Passes.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Utils.h" +#include "flang/Optimizer/Transforms/Passes.h" #include "mlir/Analysis/SliceAnalysis.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Value.h" #include "mlir/Transforms/DialectConversion.h" #include "mlir/Transforms/GreedyPatternRewriteDriver.h" +#include "mlir/Transforms/RegionUtils.h" #include #include #include @@ -33,7 +34,6 @@ #include #include #include -#include "mlir/Transforms/RegionUtils.h" #include #include @@ -66,30 +66,30 @@ static T getPerfectlyNested(Operation *op) { /// This is the single source of truth about whether we should parallelize an /// operation nested in an omp.workdistribute region. 
static bool shouldParallelize(Operation *op) { - // Currently we cannot parallelize operations with results that have uses - if (llvm::any_of(op->getResults(), - [](OpResult v) -> bool { return !v.use_empty(); })) + // Currently we cannot parallelize operations with results that have uses + if (llvm::any_of(op->getResults(), + [](OpResult v) -> bool { return !v.use_empty(); })) + return false; + // We will parallelize unordered loops - these come from array syntax + if (auto loop = dyn_cast(op)) { + auto unordered = loop.getUnordered(); + if (!unordered) return false; - // We will parallelize unordered loops - these come from array syntax - if (auto loop = dyn_cast(op)) { - auto unordered = loop.getUnordered(); - if (!unordered) - return false; - return *unordered; - } - if (auto callOp = dyn_cast(op)) { - auto callee = callOp.getCallee(); - if (!callee) - return false; - auto *func = op->getParentOfType().lookupSymbol(*callee); - // TODO need to insert a check here whether it is a call we can actually - // parallelize currently - if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) - return true; + return *unordered; + } + if (auto callOp = dyn_cast(op)) { + auto callee = callOp.getCallee(); + if (!callee) return false; - } - // We cannot parallise anything else + auto *func = op->getParentOfType().lookupSymbol(*callee); + // TODO need to insert a check here whether it is a call we can actually + // parallelize currently + if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) + return true; return false; + } + // We cannot parallise anything else + return false; } /// If B() and D() are parallelizable, @@ -120,12 +120,10 @@ static bool shouldParallelize(Operation *op) { /// } /// E() -struct FissionWorkdistribute - : public OpRewritePattern { +struct FissionWorkdistribute : public OpRewritePattern { using OpRewritePattern::OpRewritePattern; - LogicalResult - matchAndRewrite(omp::WorkdistributeOp workdistribute, - PatternRewriter &rewriter) 
const override { + LogicalResult matchAndRewrite(omp::WorkdistributeOp workdistribute, + PatternRewriter &rewriter) const override { auto loc = workdistribute->getLoc(); auto teams = dyn_cast(workdistribute->getParentOp()); if (!teams) { @@ -185,7 +183,7 @@ struct FissionWorkdistribute auto newWorkdistribute = rewriter.create(loc); rewriter.create(loc); rewriter.createBlock(&newWorkdistribute.getRegion(), - newWorkdistribute.getRegion().begin(), {}, {}); + newWorkdistribute.getRegion().begin(), {}, {}); auto *cloned = rewriter.clone(*parallelize); rewriter.replaceOp(parallelize, cloned); rewriter.create(loc); @@ -197,8 +195,7 @@ struct FissionWorkdistribute }; static void -genLoopNestClauseOps(mlir::Location loc, - mlir::PatternRewriter &rewriter, +genLoopNestClauseOps(mlir::Location loc, mlir::PatternRewriter &rewriter, fir::DoLoopOp loop, mlir::omp::LoopNestOperands &loopNestClauseOps) { assert(loopNestClauseOps.loopLowerBounds.empty() && @@ -209,10 +206,8 @@ genLoopNestClauseOps(mlir::Location loc, loopNestClauseOps.loopInclusive = rewriter.getUnitAttr(); } -static void -genWsLoopOp(mlir::PatternRewriter &rewriter, - fir::DoLoopOp doLoop, - const mlir::omp::LoopNestOperands &clauseOps) { +static void genWsLoopOp(mlir::PatternRewriter &rewriter, fir::DoLoopOp doLoop, + const mlir::omp::LoopNestOperands &clauseOps) { auto wsloopOp = rewriter.create(doLoop.getLoc()); rewriter.createBlock(&wsloopOp.getRegion()); @@ -236,7 +231,7 @@ genWsLoopOp(mlir::PatternRewriter &rewriter, return; } -/// If fir.do_loop id present inside teams workdistribute +/// If fir.do_loop is present inside teams workdistribute /// /// omp.teams { /// omp.workdistribute { @@ -246,7 +241,7 @@ genWsLoopOp(mlir::PatternRewriter &rewriter, /// } /// } /// -/// Then, its lowered to +/// Then, its lowered to /// /// omp.teams { /// omp.workdistribute { @@ -277,7 +272,8 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { auto parallelOp = rewriter.create(teamsLoc); 
rewriter.createBlock(&parallelOp.getRegion()); - rewriter.setInsertionPoint(rewriter.create(doLoop.getLoc())); + rewriter.setInsertionPoint( + rewriter.create(doLoop.getLoc())); mlir::omp::LoopNestOperands loopNestClauseOps; genLoopNestClauseOps(doLoop.getLoc(), rewriter, doLoop, @@ -292,7 +288,6 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { } }; - /// If A() and B () are present inside teams workdistribute /// /// omp.teams { @@ -311,17 +306,17 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { struct TeamsWorkdistributeToSingle : public OpRewritePattern { using OpRewritePattern::OpRewritePattern; LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, - PatternRewriter &rewriter) const override { - auto workdistributeOp = getPerfectlyNested(teamsOp); - if (!workdistributeOp) { - LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); - return failure(); - } - Block *workdistributeBlock = &workdistributeOp.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); - rewriter.eraseOp(teamsOp); - return success(); + PatternRewriter &rewriter) const override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + return success(); } }; @@ -332,13 +327,13 @@ class LowerWorkdistributePass MLIRContext &context = getContext(); GreedyRewriteConfig config; // prevent the pattern driver form merging blocks - config.setRegionSimplificationLevel( - GreedySimplifyRegionLevel::Disabled); - + config.setRegionSimplificationLevel(GreedySimplifyRegionLevel::Disabled); + Operation *op = getOperation(); { RewritePatternSet 
patterns(&context); - patterns.insert(&context); + patterns.insert( + &context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); @@ -346,7 +341,8 @@ class LowerWorkdistributePass } { RewritePatternSet patterns(&context); - patterns.insert(&context); + patterns.insert( + &context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); @@ -354,4 +350,4 @@ class LowerWorkdistributePass } } }; -} +} // namespace diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 5b5ee257edd1f..dc25adfe28c1d 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1344,12 +1344,14 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), + "TARGET TEAMS WORKDISTRIBUTE" >> + pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_teams_workdistribute), + "TEAMS WORKDISTRIBUTE" >> + pure(llvm::omp::Directive::OMPD_teams_workdistribute), "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), "WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_workdistribute)))) diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir index 666bdb3ced647..9fb970246b90c 100644 --- 
a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir @@ -25,4 +25,4 @@ func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref< omp.terminator } return -} \ No newline at end of file +} >From 60351b6a73ed19de8531ac63336e17be7536cf48 Mon Sep 17 00:00:00 2001 From: skc7 Date: Tue, 27 May 2025 16:24:26 +0530 Subject: [PATCH 12/12] update to workdistribute lowering --- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 194 ++++++++++-------- .../OpenMP/lower-workdistribute-doloop.mlir | 19 +- .../OpenMP/lower-workdistribute-fission.mlir | 31 +-- 3 files changed, 139 insertions(+), 105 deletions(-) diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index f75d4d1988fd2..c9c7827ace217 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -48,25 +48,21 @@ using namespace mlir; namespace { -template -static T getPerfectlyNested(Operation *op) { - if (op->getNumRegions() != 1) - return nullptr; - auto &region = op->getRegion(0); - if (region.getBlocks().size() != 1) - return nullptr; - auto *block = &region.front(); - auto *firstOp = &block->front(); - if (auto nested = dyn_cast(firstOp)) - if (firstOp->getNextNode() == block->getTerminator()) - return nested; - return nullptr; +static bool isRuntimeCall(Operation *op) { + if (auto callOp = dyn_cast(op)) { + auto callee = callOp.getCallee(); + if (!callee) + return false; + auto *func = op->getParentOfType().lookupSymbol(*callee); + if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) + return true; + } + return false; } /// This is the single source of truth about whether we should parallelize an -/// operation nested in an omp.workdistribute region. +/// operation nested in an omp.execute region. 
static bool shouldParallelize(Operation *op) { - // Currently we cannot parallelize operations with results that have uses if (llvm::any_of(op->getResults(), [](OpResult v) -> bool { return !v.use_empty(); })) return false; @@ -77,21 +73,28 @@ static bool shouldParallelize(Operation *op) { return false; return *unordered; } - if (auto callOp = dyn_cast(op)) { - auto callee = callOp.getCallee(); - if (!callee) - return false; - auto *func = op->getParentOfType().lookupSymbol(*callee); - // TODO need to insert a check here whether it is a call we can actually - // parallelize currently - if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) - return true; - return false; + if (isRuntimeCall(op)) { + return true; } // We cannot parallise anything else return false; } +template +static T getPerfectlyNested(Operation *op) { + if (op->getNumRegions() != 1) + return nullptr; + auto &region = op->getRegion(0); + if (region.getBlocks().size() != 1) + return nullptr; + auto *block = &region.front(); + auto *firstOp = &block->front(); + if (auto nested = dyn_cast(firstOp)) + if (firstOp->getNextNode() == block->getTerminator()) + return nested; + return nullptr; +} + /// If B() and D() are parallelizable, /// /// omp.teams { @@ -138,17 +141,33 @@ struct FissionWorkdistribute : public OpRewritePattern { emitError(loc, "teams with multiple blocks\n"); return failure(); } - if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { - emitError(loc, "teams with multiple nested ops\n"); - return failure(); - } auto *teamsBlock = &teams.getRegion().front(); + bool changed = false; + // Move the ops inside teams and before workdistribute outside. 
+ IRMapping irMapping; + llvm::SmallVector teamsHoisted; + for (auto &op : teams.getOps()) { + if (&op == workdistribute) { + break; + } + if (shouldParallelize(&op)) { + emitError(loc, + "teams has parallelize ops before first workdistribute\n"); + return failure(); + } else { + rewriter.setInsertionPoint(teams); + rewriter.clone(op, irMapping); + teamsHoisted.push_back(&op); + changed = true; + } + } + for (auto *op : teamsHoisted) + rewriter.replaceOp(op, irMapping.lookup(op)); // While we have unhandled operations in the original workdistribute auto *workdistributeBlock = &workdistribute.getRegion().front(); auto *terminator = workdistributeBlock->getTerminator(); - bool changed = false; while (&workdistributeBlock->front() != terminator) { rewriter.setInsertionPoint(teams); IRMapping mapping; @@ -194,9 +213,51 @@ struct FissionWorkdistribute : public OpRewritePattern { } }; +/// If fir.do_loop is present inside teams workdistribute +/// +/// omp.teams { +/// omp.workdistribute { +/// fir.do_loop unoredered { +/// ... +/// } +/// } +/// } +/// +/// Then, its lowered to +/// +/// omp.teams { +/// omp.parallel { +/// omp.distribute { +/// omp.wsloop { +/// omp.loop_nest +/// ... 
+/// } +/// } +/// } +/// } + +static void genParallelOp(Location loc, PatternRewriter &rewriter, + bool composite) { + auto parallelOp = rewriter.create(loc); + parallelOp.setComposite(composite); + rewriter.createBlock(&parallelOp.getRegion()); + rewriter.setInsertionPoint(rewriter.create(loc)); + return; +} + +static void genDistributeOp(Location loc, PatternRewriter &rewriter, + bool composite) { + mlir::omp::DistributeOperands distributeClauseOps; + auto distributeOp = + rewriter.create(loc, distributeClauseOps); + distributeOp.setComposite(composite); + auto distributeBlock = rewriter.createBlock(&distributeOp.getRegion()); + rewriter.setInsertionPointToStart(distributeBlock); + return; +} + static void -genLoopNestClauseOps(mlir::Location loc, mlir::PatternRewriter &rewriter, - fir::DoLoopOp loop, +genLoopNestClauseOps(mlir::PatternRewriter &rewriter, fir::DoLoopOp loop, mlir::omp::LoopNestOperands &loopNestClauseOps) { assert(loopNestClauseOps.loopLowerBounds.empty() && "Loop nest bounds were already emitted!"); @@ -207,9 +268,11 @@ genLoopNestClauseOps(mlir::Location loc, mlir::PatternRewriter &rewriter, } static void genWsLoopOp(mlir::PatternRewriter &rewriter, fir::DoLoopOp doLoop, - const mlir::omp::LoopNestOperands &clauseOps) { + const mlir::omp::LoopNestOperands &clauseOps, + bool composite) { auto wsloopOp = rewriter.create(doLoop.getLoc()); + wsloopOp.setComposite(composite); rewriter.createBlock(&wsloopOp.getRegion()); auto loopNestOp = @@ -231,57 +294,20 @@ static void genWsLoopOp(mlir::PatternRewriter &rewriter, fir::DoLoopOp doLoop, return; } -/// If fir.do_loop is present inside teams workdistribute -/// -/// omp.teams { -/// omp.workdistribute { -/// fir.do_loop unoredered { -/// ... -/// } -/// } -/// } -/// -/// Then, its lowered to -/// -/// omp.teams { -/// omp.workdistribute { -/// omp.parallel { -/// omp.wsloop { -/// omp.loop_nest -/// ... 
-/// } -/// } -/// } -/// } -/// } - -struct TeamsWorkdistributeLowering : public OpRewritePattern { +struct WorkdistributeDoLower : public OpRewritePattern { using OpRewritePattern::OpRewritePattern; - LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + LogicalResult matchAndRewrite(omp::WorkdistributeOp workdistribute, PatternRewriter &rewriter) const override { - auto teamsLoc = teamsOp->getLoc(); - auto workdistributeOp = getPerfectlyNested(teamsOp); - if (!workdistributeOp) { - LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); - return failure(); - } - assert(teamsOp.getReductionVars().empty()); - - auto doLoop = getPerfectlyNested(workdistributeOp); + auto doLoop = getPerfectlyNested(workdistribute); + auto wdLoc = workdistribute->getLoc(); if (doLoop && shouldParallelize(doLoop)) { - - auto parallelOp = rewriter.create(teamsLoc); - rewriter.createBlock(&parallelOp.getRegion()); - rewriter.setInsertionPoint( - rewriter.create(doLoop.getLoc())); - + assert(doLoop.getReduceOperands().empty()); + genParallelOp(wdLoc, rewriter, true); + genDistributeOp(wdLoc, rewriter, true); mlir::omp::LoopNestOperands loopNestClauseOps; - genLoopNestClauseOps(doLoop.getLoc(), rewriter, doLoop, - loopNestClauseOps); - - genWsLoopOp(rewriter, doLoop, loopNestClauseOps); - rewriter.setInsertionPoint(doLoop); - rewriter.eraseOp(doLoop); + genLoopNestClauseOps(rewriter, doLoop, loopNestClauseOps); + genWsLoopOp(rewriter, doLoop, loopNestClauseOps, true); + rewriter.eraseOp(workdistribute); return success(); } return failure(); @@ -315,7 +341,7 @@ struct TeamsWorkdistributeToSingle : public OpRewritePattern { Block *workdistributeBlock = &workdistributeOp.getRegion().front(); rewriter.eraseOp(workdistributeBlock->getTerminator()); rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); - rewriter.eraseOp(teamsOp); + rewriter.eraseOp(workdistributeOp); return success(); } }; @@ -332,8 +358,7 @@ class LowerWorkdistributePass Operation *op = getOperation(); { 
RewritePatternSet patterns(&context); - patterns.insert( - &context); + patterns.insert(&context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); @@ -341,8 +366,7 @@ class LowerWorkdistributePass } { RewritePatternSet patterns(&context); - patterns.insert( - &context); + patterns.insert(&context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir index 9fb970246b90c..f8351bb64e6e8 100644 --- a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir @@ -2,13 +2,18 @@ // CHECK-LABEL: func.func @x({{.*}}) // CHECK: %[[VAL_0:.*]] = arith.constant 0 : index -// CHECK: omp.parallel { -// CHECK: omp.wsloop { -// CHECK: omp.loop_nest (%[[VAL_1:.*]]) : index = (%[[ARG0:.*]]) to (%[[ARG1:.*]]) inclusive step (%[[ARG2:.*]]) { -// CHECK: fir.store %[[VAL_0]] to %[[ARG4:.*]] : !fir.ref -// CHECK: omp.yield -// CHECK: } -// CHECK: } +// CHECK: omp.teams { +// CHECK: omp.parallel { +// CHECK: omp.distribute { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_1:.*]]) : index = (%[[ARG0:.*]]) to (%[[ARG1:.*]]) inclusive step (%[[ARG2:.*]]) { +// CHECK: fir.store %[[VAL_0]] to %[[ARG4:.*]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } {omp.composite} +// CHECK: } {omp.composite} +// CHECK: omp.terminator +// CHECK: } {omp.composite} // CHECK: omp.terminator // CHECK: } // CHECK: return diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir index cf50d135d01ec..c562b7009664d 100644 --- a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir +++ 
b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir @@ -1,21 +1,26 @@ // RUN: fir-opt --lower-workdistribute %s | FileCheck %s -// CHECK-LABEL: func.func @test_fission_workdistribute({{.*}}) { +// CHECK-LABEL: func.func @test_fission_workdistribute( // CHECK: %[[VAL_0:.*]] = arith.constant 0 : index // CHECK: %[[VAL_1:.*]] = arith.constant 1 : index // CHECK: %[[VAL_2:.*]] = arith.constant 9 : index // CHECK: %[[VAL_3:.*]] = arith.constant 5.000000e+00 : f32 // CHECK: fir.store %[[VAL_3]] to %[[ARG2:.*]] : !fir.ref -// CHECK: omp.parallel { -// CHECK: omp.wsloop { -// CHECK: omp.loop_nest (%[[VAL_4:.*]]) : index = (%[[VAL_0]]) to (%[[VAL_2]]) inclusive step (%[[VAL_1]]) { -// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref -// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref -// CHECK: omp.yield -// CHECK: } -// CHECK: } +// CHECK: omp.teams { +// CHECK: omp.parallel { +// CHECK: omp.distribute { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_4:.*]]) : index = (%[[VAL_0]]) to (%[[VAL_2]]) inclusive step (%[[VAL_1]]) { +// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref +// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } {omp.composite} +// CHECK: } {omp.composite} +// CHECK: omp.terminator +// CHECK: } {omp.composite} // CHECK: omp.terminator // CHECK: } // CHECK: fir.call @regular_side_effect_func(%[[ARG2:.*]]) : (!fir.ref) -> () @@ -24,8 +29,8 @@ // CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0]], %[[VAL_8]] : (!fir.ref>, index) -> !fir.ref // CHECK: fir.store 
%[[VAL_3]] to %[[VAL_9]] : !fir.ref // CHECK: } -// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2]] : !fir.ref -// CHECK: fir.store %[[VAL_10]] to %[[ARG3]] : !fir.ref +// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2:.*]] : !fir.ref +// CHECK: fir.store %[[VAL_10]] to %[[ARG3:.*]] : !fir.ref // CHECK: return // CHECK: } module { From flang-commits at lists.llvm.org Wed May 28 07:32:50 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Wed, 28 May 2025 07:32:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68371e92.170a0220.193f6a.7da5@mx.google.com> https://github.com/mrkajetanp updated https://github.com/llvm/llvm-project/pull/138718 >From 005c599ffcbb1ee257ad8ca510d8ecd649fcab7b Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 10 Apr 2025 14:04:52 +0000 Subject: [PATCH 1/5] [flang] Inline hlfir.copy_in for trivial types hlfir.copy_in implements copying non-contiguous array slices for functions that take in arrays required to be contiguous through a flang-rt function that calls memcpy/memmove separately on each element. For large arrays of trivial types, this can incur considerable overhead compared to a plain copy loop that is better able to take advantage of hardware pipelines. To address that, extend the InlineHLFIRAssign optimisation pass with a new pattern for inlining hlfir.copy_in operations for trivial types. For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in). Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by a factor of about 1/3rd. 
Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 117 ++++++++++++++++++ 1 file changed, 117 insertions(+) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 6e209cce07ad4..38c684eaceb7d 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,6 +13,7 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + auto results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + auto elem = hlfir::getElementAt(loc, builder, inputVariable, + loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + auto tempElem = hlfir::getElementAt(loc, builder, temp, + loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + auto refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + auto refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + auto addr = results[0]; + auto needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + auto tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. + rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); +} + class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -140,6 +256,7 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); + patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { >From 26d2e491acd54ef942af32c8361e97b24d190625 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 7 May 2025 16:04:07 +0000 Subject: [PATCH 2/5] Add tests Signed-off-by: Kajetan Puchalski --- flang/test/HLFIR/inline-hlfir-assign.fir | 144 +++++++++++++++++++++++ 1 file changed, 144 insertions(+) diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index f834e7971e3d5..df7681b9c5c16 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,3 +353,147 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // 
CHECK: } + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, 
+// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result 
%[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 
dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] 
dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 
e51db2f45ad82798937b57e2b5c08e7bcc66deed Mon Sep 17 00:00:00 2001
From: Kajetan Puchalski
Date: Thu, 8 May 2025 15:15:56 +0000
Subject: [PATCH 3/5] Address Tom's review comments

Signed-off-by: Kajetan Puchalski
---
 .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 41 +++++++++++--------
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp
index 38c684eaceb7d..dc545ece8adff 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp
+++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp
@@ -158,16 +158,16 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
                                        "CopyInOp's WasCopied has no uses");
   // The copy out should always be present, either to actually copy or just
   // deallocate memory.
-  auto *copyOut =
-      copyIn.getWasCopied().getUsers().begin().getCurrent().getUser();
+  auto copyOut = mlir::dyn_cast(
+      copyIn.getWasCopied().getUsers().begin().getCurrent().getUser());
 
-  if (!mlir::isa(copyOut))
+  if (!copyOut)
     return rewriter.notifyMatchFailure(copyIn,
                                        "CopyInOp has no direct CopyOut");
 
   // Only inline the copy_in when copy_out does not need to be done, i.e. in
   // case of intent(in).
-  if (::llvm::cast(copyOut).getVar())
+  if (copyOut.getVar())
     return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out");
 
   inputVariable =
@@ -175,7 +175,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
   mlir::Type resultAddrType = copyIn.getCopiedIn().getType();
   mlir::Value isContiguous =
       builder.create(loc, inputVariable);
-  auto results =
+  mlir::Operation::result_range results =
       builder
           .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous,
                    /*withElseRegion=*/true)
@@ -195,11 +195,11 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
                 loc, builder, extents, /*isUnordered=*/true,
                 flangomp::shouldUseWorkshareLowering(copyIn));
             builder.setInsertionPointToStart(loopNest.body);
-            auto elem = hlfir::getElementAt(loc, builder, inputVariable,
-                                            loopNest.oneBasedIndices);
+            hlfir::Entity elem = hlfir::getElementAt(
+                loc, builder, inputVariable, loopNest.oneBasedIndices);
             elem = hlfir::loadTrivialScalar(loc, builder, elem);
-            auto tempElem = hlfir::getElementAt(loc, builder, temp,
-                                                loopNest.oneBasedIndices);
+            hlfir::Entity tempElem = hlfir::getElementAt(
+                loc, builder, temp, loopNest.oneBasedIndices);
             builder.create(loc, elem, tempElem);
             builder.setInsertionPointAfter(loopNest.outerOp);
 
@@ -209,9 +209,9 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
             if (mlir::isa(temp.getType())) {
               result = temp;
             } else {
-              auto refTy =
+              fir::ReferenceType refTy =
                   fir::ReferenceType::get(temp.getElementOrSequenceType());
-              auto refVal = builder.createConvert(loc, refTy, temp);
+              mlir::Value refVal = builder.createConvert(loc, refTy, temp);
               result =
                   builder.create(loc, resultAddrType, refVal);
             }
@@ -221,25 +221,30 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn,
           })
           .getResults();
 
-  auto addr = results[0];
-  auto needsCleanup = results[1];
+  mlir::OpResult addr = results[0];
+  mlir::OpResult needsCleanup = results[1];
 
   builder.setInsertionPoint(copyOut);
-  builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] {
+  builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] {
     auto boxAddr = builder.create(loc, addr);
-    auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy());
-    auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult());
+    fir::HeapType heapType =
+        fir::HeapType::get(fir::BoxValue(addr).getBaseTy());
+    mlir::Value heapVal =
+        builder.createConvert(loc, heapType, boxAddr.getResult());
     builder.create(loc, heapVal);
   });
   rewriter.eraseOp(copyOut);
 
-  auto tempBox = copyIn.getTempBox();
+  mlir::Value tempBox = copyIn.getTempBox();
 
   rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)});
 
   // The TempBox is only needed for flang-rt calls which we're no longer
-  // generating.
+  // generating. It should have no uses left at this stage.
+  if (!tempBox.getUses().empty())
+    return mlir::failure();
   rewriter.eraseOp(tempBox.getDefiningOp());
+
   return mlir::success();
 }
 
>From c1af5b1ead7a560112c9896b1cb2bac48b865df3 Mon Sep 17 00:00:00 2001
From: Kajetan Puchalski
Date: Thu, 22 May 2025 13:37:53 +0000
Subject: [PATCH 4/5] Separate copy_in inlining into its own pass, add flag

Signed-off-by: Kajetan Puchalski
---
 flang/include/flang/Optimizer/HLFIR/Passes.td | 4 +
 .../Optimizer/HLFIR/Transforms/CMakeLists.txt | 1 +
 .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 122 ------------
 .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 180 ++++++++++++++++++
 flang/lib/Optimizer/Passes/Pipelines.cpp | 5 +
 flang/test/HLFIR/inline-hlfir-assign.fir | 144 --------------
 flang/test/HLFIR/inline-hlfir-copy-in.fir | 146 ++++++++++++++
 7 files changed, 336 insertions(+), 266 deletions(-)
 create mode 100644 flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp
 create mode 100644 flang/test/HLFIR/inline-hlfir-copy-in.fir

diff --git a/flang/include/flang/Optimizer/HLFIR/Passes.td b/flang/include/flang/Optimizer/HLFIR/Passes.td
index d445140118073..04d7aec5fe489 100644
--- a/flang/include/flang/Optimizer/HLFIR/Passes.td
+++ b/flang/include/flang/Optimizer/HLFIR/Passes.td
@@ -69,6 +69,10 @@ def InlineHLFIRAssign : Pass<"inline-hlfir-assign"> {
   let summary = "Inline hlfir.assign operations";
 }
 
+def InlineHLFIRCopyIn : Pass<"inline-hlfir-copy-in"> {
+  let summary = "Inline hlfir.copy_in operations";
+}
+
 def PropagateFortranVariableAttributes : Pass<"propagate-fortran-attrs"> {
   let summary = "Propagate FortranVariableFlagsAttr attributes through HLFIR";
 }
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
index d959428ebd203..cc74273d9c5d9 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
+++ b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
@@ -5,6 +5,7 @@ add_flang_library(HLFIRTransforms
   ConvertToFIR.cpp
   InlineElementals.cpp
   InlineHLFIRAssign.cpp
+  InlineHLFIRCopyIn.cpp
   LowerHLFIRIntrinsics.cpp
   LowerHLFIROrderedAssignments.cpp
   ScheduleOrderedAssignments.cpp
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp
index dc545ece8adff..6e209cce07ad4 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp
+++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp
@@ -13,7 +13,6 @@
 #include "flang/Optimizer/Analysis/AliasAnalysis.h"
 #include "flang/Optimizer/Builder/FIRBuilder.h"
 #include "flang/Optimizer/Builder/HLFIRTools.h"
-#include "flang/Optimizer/Dialect/FIRType.h"
 #include "flang/Optimizer/HLFIR/HLFIROps.h"
 #include "flang/Optimizer/HLFIR/Passes.h"
 #include "flang/Optimizer/OpenMP/Passes.h"
@@ -128,126 +127,6 @@ class InlineHLFIRAssignConversion
   }
 };
 
-class InlineCopyInConversion : public mlir::OpRewritePattern {
-public:
-  using mlir::OpRewritePattern::OpRewritePattern;
-
-  llvm::LogicalResult
-  matchAndRewrite(hlfir::CopyInOp copyIn,
-                  mlir::PatternRewriter &rewriter) const override;
-};
-
-llvm::LogicalResult
-InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const { - fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); - mlir::Location loc = copyIn.getLoc(); - hlfir::Entity inputVariable{copyIn.getVar()}; - if (!fir::isa_trivial(inputVariable.getFortranElementType())) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's data type is not trivial"); - - if (fir::isPointerType(inputVariable.getType())) - return rewriter.notifyMatchFailure( - copyIn, "CopyInOp's input variable is a pointer"); - - // There should be exactly one user of WasCopied - the corresponding - // CopyOutOp. - if (copyIn.getWasCopied().getUses().empty()) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's WasCopied has no uses"); - // The copy out should always be present, either to actually copy or just - // deallocate memory. - auto copyOut = mlir::dyn_cast( - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - - if (!copyOut) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp has no direct CopyOut"); - - // Only inline the copy_in when copy_out does not need to be done, i.e. in - // case of intent(in). 
- if (copyOut.getVar()) - return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); - - inputVariable = - hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); - mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); - mlir::Value isContiguous = - builder.create(loc, inputVariable); - mlir::Operation::result_range results = - builder - .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, - /*withElseRegion=*/true) - .genThen([&]() { - mlir::Value falseVal = builder.create( - loc, builder.getI1Type(), builder.getBoolAttr(false)); - builder.create( - loc, mlir::ValueRange{inputVariable, falseVal}); - }) - .genElse([&] { - auto [temp, cleanup] = - hlfir::createTempFromMold(loc, builder, inputVariable); - mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); - llvm::SmallVector extents = - hlfir::getIndexExtents(loc, builder, shape); - hlfir::LoopNest loopNest = hlfir::genLoopNest( - loc, builder, extents, /*isUnordered=*/true, - flangomp::shouldUseWorkshareLowering(copyIn)); - builder.setInsertionPointToStart(loopNest.body); - hlfir::Entity elem = hlfir::getElementAt( - loc, builder, inputVariable, loopNest.oneBasedIndices); - elem = hlfir::loadTrivialScalar(loc, builder, elem); - hlfir::Entity tempElem = hlfir::getElementAt( - loc, builder, temp, loopNest.oneBasedIndices); - builder.create(loc, elem, tempElem); - builder.setInsertionPointAfter(loopNest.outerOp); - - mlir::Value result; - // Make sure the result is always a boxed array by boxing it - // ourselves if need be. 
- if (mlir::isa(temp.getType())) { - result = temp; - } else { - fir::ReferenceType refTy = - fir::ReferenceType::get(temp.getElementOrSequenceType()); - mlir::Value refVal = builder.createConvert(loc, refTy, temp); - result = - builder.create(loc, resultAddrType, refVal); - } - - builder.create(loc, - mlir::ValueRange{result, cleanup}); - }) - .getResults(); - - mlir::OpResult addr = results[0]; - mlir::OpResult needsCleanup = results[1]; - - builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { - auto boxAddr = builder.create(loc, addr); - fir::HeapType heapType = - fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - mlir::Value heapVal = - builder.createConvert(loc, heapType, boxAddr.getResult()); - builder.create(loc, heapVal); - }); - rewriter.eraseOp(copyOut); - - mlir::Value tempBox = copyIn.getTempBox(); - - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); - - // The TempBox is only needed for flang-rt calls which we're no longer - // generating. It should have no uses left at this stage. 
- if (!tempBox.getUses().empty()) - return mlir::failure(); - rewriter.eraseOp(tempBox.getDefiningOp()); - - return mlir::success(); -} - class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -261,7 +140,6 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); - patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp new file mode 100644 index 0000000000000..1e2aecaf535a0 --- /dev/null +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + mlir::Value tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); + rewriter.eraseOp(tempBox.getDefiningOp()); + + return mlir::success(); +} + +class InlineHLFIRCopyInPass + : public hlfir::impl::InlineHLFIRCopyInBase { +public: + void runOnOperation() override { + mlir::MLIRContext *context = &getContext(); + + mlir::GreedyRewriteConfig config; + // Prevent the pattern driver from merging blocks. 
+    config.setRegionSimplificationLevel(
+        mlir::GreedySimplifyRegionLevel::Disabled);
+
+    mlir::RewritePatternSet patterns(context);
+    if (!noInlineHLFIRCopyIn) {
+      patterns.insert(context);
+    }
+
+    if (mlir::failed(mlir::applyPatternsGreedily(
+            getOperation(), std::move(patterns), config))) {
+      mlir::emitError(getOperation()->getLoc(),
+                      "failure in hlfir.copy_in inlining");
+      signalPassFailure();
+    }
+  }
+};
+} // namespace
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 77751908e35be..1779623fddc5a 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -255,6 +255,11 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP,
         pm, hlfir::createOptimizedBufferization);
     addNestedPassToAllTopLevelOperations(
         pm, hlfir::createInlineHLFIRAssign);
+
+    if (optLevel == llvm::OptimizationLevel::O3) {
+      addNestedPassToAllTopLevelOperations(
+          pm, hlfir::createInlineHLFIRCopyIn);
+    }
   }
   pm.addPass(hlfir::createLowerHLFIROrderedAssignments());
   pm.addPass(hlfir::createLowerHLFIRIntrinsics());
diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir
index df7681b9c5c16..f834e7971e3d5 100644
--- a/flang/test/HLFIR/inline-hlfir-assign.fir
+++ b/flang/test/HLFIR/inline-hlfir-assign.fir
@@ -353,147 +353,3 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref>
 // CHECK: return
 // CHECK: }
-
-// Test inlining of hlfir.copy_in that does not require the array to be copied out
-func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) {
-  %0 = fir.alloca !fir.box>>
-  %1 = fir.dummy_scope : !fir.dscope
-  %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref)
-  %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, 
!fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, %c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant true -// CHECK: %[[VAL_4:.*]] = arith.constant false -// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : 
(!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 -// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 -// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { -// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 -// CHECK: } else { -// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} -// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { -// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: %[[VAL_27:.*]] = fir.load 
%[[VAL_26:.*]] : !fir.ref -// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref -// CHECK: } -// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 -// CHECK: } -// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: fir.if %[[VAL_21:.*]]#1 { -// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> -// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> -// CHECK: } -// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } - -// Test not inlining of hlfir.copy_in that requires the array to be copied out -func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, 
%c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_no_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> -// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 
-// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) -// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () -// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir new file mode 100644 index 0000000000000..7140e93f19979 --- /dev/null +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -0,0 +1,146 @@ +// Test inlining of hlfir.copy_in +// RUN: fir-opt --inline-hlfir-copy-in %s | FileCheck %s + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca 
!fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 
+// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, 
!fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 
+ %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: 
%[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 5d27cf26d5ec227068b76f13479d18da33319472 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 28 May 2025 13:44:53 +0000 Subject: [PATCH 5/5] Support arrays behind a pointer, add metadata to disable vectorizing --- .../flang/Optimizer/Builder/HLFIRTools.h | 8 ++- 
flang/lib/Optimizer/Builder/HLFIRTools.cpp | 13 +++- .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 66 ++++++++++--------- 3 files changed, 52 insertions(+), 35 deletions(-) diff --git a/flang/include/flang/Optimizer/Builder/HLFIRTools.h b/flang/include/flang/Optimizer/Builder/HLFIRTools.h index ed00cec04dc39..2cbad6e268a38 100644 --- a/flang/include/flang/Optimizer/Builder/HLFIRTools.h +++ b/flang/include/flang/Optimizer/Builder/HLFIRTools.h @@ -374,12 +374,14 @@ struct LoopNest { /// loop constructs currently. LoopNest genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ValueRange extents, bool isUnordered = false, - bool emitWorkshareLoop = false); + bool emitWorkshareLoop = false, + bool couldVectorize = true); inline LoopNest genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::Value shape, bool isUnordered = false, - bool emitWorkshareLoop = false) { + bool emitWorkshareLoop = false, + bool couldVectorize = true) { return genLoopNest(loc, builder, getIndexExtents(loc, builder, shape), - isUnordered, emitWorkshareLoop); + isUnordered, emitWorkshareLoop, couldVectorize); } /// The type of a callback that generates the body of a reduction diff --git a/flang/lib/Optimizer/Builder/HLFIRTools.cpp b/flang/lib/Optimizer/Builder/HLFIRTools.cpp index f24dc2caeedfc..14aae5d7118a1 100644 --- a/flang/lib/Optimizer/Builder/HLFIRTools.cpp +++ b/flang/lib/Optimizer/Builder/HLFIRTools.cpp @@ -21,6 +21,7 @@ #include "mlir/IR/IRMapping.h" #include "mlir/Support/LLVM.h" #include "llvm/ADT/TypeSwitch.h" +#include #include #include @@ -932,7 +933,8 @@ mlir::Value hlfir::inlineElementalOp( hlfir::LoopNest hlfir::genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ValueRange extents, bool isUnordered, - bool emitWorkshareLoop) { + bool emitWorkshareLoop, + bool couldVectorize) { emitWorkshareLoop = emitWorkshareLoop && isUnordered; hlfir::LoopNest loopNest; assert(!extents.empty() && "must have at least one extent"); @@ -967,6 +969,15 @@ 
hlfir::LoopNest hlfir::genLoopNest(mlir::Location loc, auto ub = builder.createConvert(loc, indexType, extent); auto doLoop = builder.create(loc, one, ub, one, isUnordered); + if (!couldVectorize) { + mlir::LLVM::LoopVectorizeAttr va{mlir::LLVM::LoopVectorizeAttr::get( + builder.getContext(), + /*disable=*/builder.getBoolAttr(true), {}, {}, {}, {}, {}, {})}; + mlir::LLVM::LoopAnnotationAttr la = mlir::LLVM::LoopAnnotationAttr::get( + builder.getContext(), {}, /*vectorize=*/va, {}, /*unroll*/ {}, + /*unroll_and_jam*/ {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}); + doLoop.setLoopAnnotationAttr(la); + } loopNest.body = doLoop.getBody(); builder.setInsertionPointToStart(loopNest.body); // Reverse the indices so they are in column-major order. diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp index 1e2aecaf535a0..d1cbe3241c07b 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -52,19 +52,15 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, return rewriter.notifyMatchFailure(copyIn, "CopyInOp's data type is not trivial"); - if (fir::isPointerType(inputVariable.getType())) - return rewriter.notifyMatchFailure( - copyIn, "CopyInOp's input variable is a pointer"); - // There should be exactly one user of WasCopied - the corresponding // CopyOutOp. - if (copyIn.getWasCopied().getUses().empty()) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's WasCopied has no uses"); + if (!copyIn.getWasCopied().hasOneUse()) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's WasCopied has no single user"); // The copy out should always be present, either to actually copy or just // deallocate memory. 
auto copyOut = mlir::dyn_cast( - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + copyIn.getWasCopied().user_begin().getCurrent().getUser()); if (!copyOut) return rewriter.notifyMatchFailure(copyIn, @@ -77,28 +73,45 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, inputVariable = hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); - mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Type sequenceType = + hlfir::getFortranElementOrSequenceType(inputVariable.getType()); + fir::BoxType resultBoxType = fir::BoxType::get(sequenceType); mlir::Value isContiguous = builder.create(loc, inputVariable); mlir::Operation::result_range results = builder - .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + .genIfOp(loc, {resultBoxType, builder.getI1Type()}, isContiguous, /*withElseRegion=*/true) .genThen([&]() { - mlir::Value falseVal = builder.create( - loc, builder.getI1Type(), builder.getBoolAttr(false)); + mlir::Value result = inputVariable; + if (fir::isPointerType(inputVariable.getType())) { + auto boxAddr = builder.create(loc, inputVariable); + fir::ReferenceType refTy = fir::ReferenceType::get(sequenceType); + mlir::Value refVal = builder.createConvert(loc, refTy, boxAddr); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + result = builder.create(loc, resultBoxType, refVal, + shape); + } builder.create( - loc, mlir::ValueRange{inputVariable, falseVal}); + loc, mlir::ValueRange{result, builder.createBool(loc, false)}); }) .genElse([&] { - auto [temp, cleanup] = - hlfir::createTempFromMold(loc, builder, inputVariable); mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); llvm::SmallVector extents = hlfir::getIndexExtents(loc, builder, shape); - hlfir::LoopNest loopNest = hlfir::genLoopNest( - loc, builder, extents, /*isUnordered=*/true, - flangomp::shouldUseWorkshareLowering(copyIn)); + llvm::StringRef tmpName{".tmp.copy_in"}; + llvm::SmallVector lenParams; 
+ mlir::Value alloc = builder.createHeapTemporary( + loc, sequenceType, tmpName, extents, lenParams); + + auto declareOp = builder.create( + loc, alloc, tmpName, shape, lenParams, + /*dummy_scope=*/nullptr); + hlfir::Entity temp{declareOp.getBase()}; + hlfir::LoopNest loopNest = + hlfir::genLoopNest(loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn), + /*couldVectorize=*/false); builder.setInsertionPointToStart(loopNest.body); hlfir::Entity elem = hlfir::getElementAt( loc, builder, inputVariable, loopNest.oneBasedIndices); @@ -117,12 +130,12 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, fir::ReferenceType refTy = fir::ReferenceType::get(temp.getElementOrSequenceType()); mlir::Value refVal = builder.createConvert(loc, refTy, temp); - result = - builder.create(loc, resultAddrType, refVal); + result = builder.create(loc, resultBoxType, refVal, + shape); } - builder.create(loc, - mlir::ValueRange{result, cleanup}); + builder.create( + loc, mlir::ValueRange{result, builder.createBool(loc, true)}); }) .getResults(); @@ -140,16 +153,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }); rewriter.eraseOp(copyOut); - mlir::Value tempBox = copyIn.getTempBox(); - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); - - // The TempBox is only needed for flang-rt calls which we're no longer - // generating. It should have no uses left at this stage. 
-  if (!tempBox.getUses().empty())
-    return mlir::failure();
-  rewriter.eraseOp(tempBox.getDefiningOp());
-
   return mlir::success();
 }

From flang-commits at lists.llvm.org Wed May 28 08:00:24 2025
From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits)
Date: Wed, 28 May 2025 08:00:24 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718)
In-Reply-To:
Message-ID: <68372508.050a0220.2fea6c.844d@mx.google.com>

================
@@ -0,0 +1,180 @@
+//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+// Transform hlfir.copy_in array operations into loop nests performing element
+// per element assignments. For simplicity, the inlining is done for trivial
+// data types when the copy_in does not require a corresponding copy_out and
+// when the input array is not behind a pointer. This may change in the future.
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+            if (mlir::isa(temp.getType())) {
+              result = temp;
+            } else {
+              fir::ReferenceType refTy =
+                  fir::ReferenceType::get(temp.getElementOrSequenceType());
+              mlir::Value refVal = builder.createConvert(loc, refTy, temp);
+              result =
+                  builder.create(loc, resultAddrType, refVal);
+            }
+
+            builder.create(loc,
+                           mlir::ValueRange{result, cleanup});
+          })
+          .getResults();
+
+  mlir::OpResult addr = results[0];
+  mlir::OpResult needsCleanup = results[1];
+
+  builder.setInsertionPoint(copyOut);
----------------
vzakhari wrote:

Sorry for not being clear. `CopyOut` op should be converted to FIR by `ConvertHLFIRtoFIR` pass, so it is not quite necessary to handle it here. Maybe I am missing something, if you see that the `CopyOut` does not get converted by the regular Flang pass pipeline.

https://github.com/llvm/llvm-project/pull/138718

From flang-commits at lists.llvm.org Wed May 28 08:22:24 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Wed, 28 May 2025 08:22:24 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Allow forward reference to non-default INTEGER dummy (PR #141254)
In-Reply-To:
Message-ID: <68372a30.170a0220.15498c.8427@mx.google.com>

https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/141254

>From 631895111f6b09b7b76f90532a24f42de995f268 Mon Sep 17 00:00:00 2001
From: Peter Klausler
Date: Fri, 23 May 2025 09:44:56 -0700
Subject: [PATCH] [flang] Allow forward reference to non-default INTEGER dummy

A dummy argument with an explicit INTEGER type of non-default kind
can be forward-referenced from a specification expression in many
Fortran compilers. Handle by adding type declaration statements to
the initial pass over a specification part's declaration constructs.
Emit an optional warning under -pedantic.

Fixes https://github.com/llvm/llvm-project/issues/140941.
--- flang/docs/Extensions.md | 5 +- .../include/flang/Support/Fortran-features.h | 2 +- flang/lib/Semantics/resolve-names.cpp | 101 ++++++++++++++++-- .../test/Semantics/OpenMP/linear-clause01.f90 | 2 - flang/test/Semantics/resolve103.f90 | 16 +-- 5 files changed, 104 insertions(+), 22 deletions(-) diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..e3501dffb8777 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -291,7 +291,10 @@ end * DATA statement initialization is allowed for procedure pointers outside structure constructors. * Nonstandard intrinsic functions: ISNAN, SIZEOF -* A forward reference to a default INTEGER scalar dummy argument or +* A forward reference to an INTEGER dummy argument is permitted to appear + in a specification expression, such as an array bound, in a scope with + IMPLICIT NONE(TYPE). +* A forward reference to a default INTEGER scalar `COMMON` block variable is permitted to appear in a specification expression, such as an array bound, in a scope with IMPLICIT NONE(TYPE) if the name of the variable would have caused it to be implicitly typed diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 0e18eaedf2139..e696da9042480 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -55,7 +55,7 @@ ENUM_CLASS(LanguageFeature, BackslashEscapes, OldDebugLines, UndefinableAsynchronousOrVolatileActual, AutomaticInMainProgram, PrintCptr, SavedLocalInSpecExpr, PrintNamelist, AssumedRankPassedToNonAssumedRank, IgnoreIrrelevantAttributes, Unsigned, AmbiguousStructureConstructor, - ContiguousOkForSeqAssociation) + ContiguousOkForSeqAssociation, ForwardRefExplicitTypeDummy) // Portability and suspicious usage warnings ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 
3f4a06444c4f3..0db27d5c80dcd 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -768,10 +768,22 @@ class ScopeHandler : public ImplicitRulesVisitor { deferImplicitTyping_ = skipImplicitTyping_ = skip; } + void NoteEarlyDeclaredDummyArgument(Symbol &symbol) { + earlyDeclaredDummyArguments_.insert(symbol); + } + bool IsEarlyDeclaredDummyArgument(Symbol &symbol) { + return earlyDeclaredDummyArguments_.find(symbol) != + earlyDeclaredDummyArguments_.end(); + } + void ForgetEarlyDeclaredDummyArgument(Symbol &symbol) { + earlyDeclaredDummyArguments_.erase(symbol); + } + private: Scope *currScope_{nullptr}; FuncResultStack funcResultStack_{*this}; std::map deferred_; + UnorderedSymbolSet earlyDeclaredDummyArguments_; }; class ModuleVisitor : public virtual ScopeHandler { @@ -1976,6 +1988,9 @@ class ResolveNamesVisitor : public virtual ScopeHandler, Scope &topScope_; void PreSpecificationConstruct(const parser::SpecificationConstruct &); + void EarlyDummyTypeDeclaration( + const parser::Statement> + &); void CreateCommonBlockSymbols(const parser::CommonStmt &); void CreateObjectSymbols(const std::list &, Attr); void CreateGeneric(const parser::GenericSpec &); @@ -5611,6 +5626,7 @@ Symbol &DeclarationVisitor::DeclareUnknownEntity( } else { Symbol &symbol{DeclareEntity(name, attrs)}; if (auto *type{GetDeclTypeSpec()}) { + ForgetEarlyDeclaredDummyArgument(symbol); SetType(name, *type); } charInfo_.length.reset(); @@ -5687,6 +5703,7 @@ Symbol &DeclarationVisitor::DeclareProcEntity( symbol.set(Symbol::Flag::Subroutine); } } else if (auto *type{GetDeclTypeSpec()}) { + ForgetEarlyDeclaredDummyArgument(symbol); SetType(name, *type); symbol.set(Symbol::Flag::Function); } @@ -5701,6 +5718,7 @@ Symbol &DeclarationVisitor::DeclareObjectEntity( Symbol &symbol{DeclareEntity(name, attrs)}; if (auto *details{symbol.detailsIf()}) { if (auto *type{GetDeclTypeSpec()}) { + ForgetEarlyDeclaredDummyArgument(symbol); SetType(name, *type); } if 
(!arraySpec().empty()) { @@ -5711,9 +5729,11 @@ Symbol &DeclarationVisitor::DeclareObjectEntity( context().SetError(symbol); } } else if (MustBeScalar(symbol)) { - context().Warn(common::UsageWarning::PreviousScalarUse, name.source, - "'%s' appeared earlier as a scalar actual argument to a specification function"_warn_en_US, - name.source); + if (!context().HasError(symbol)) { + context().Warn(common::UsageWarning::PreviousScalarUse, name.source, + "'%s' appeared earlier as a scalar actual argument to a specification function"_warn_en_US, + name.source); + } } else if (details->init() || symbol.test(Symbol::Flag::InDataStmt)) { Say(name, "'%s' was initialized earlier as a scalar"_err_en_US); } else { @@ -8467,6 +8487,11 @@ const parser::Name *DeclarationVisitor::ResolveDataRef( x.u); } +static bool TypesMismatchIfNonNull( + const DeclTypeSpec *type1, const DeclTypeSpec *type2) { + return type1 && type2 && *type1 != *type2; +} + // If implicit types are allowed, ensure name is in the symbol table. // Otherwise, report an error if it hasn't been declared. 
const parser::Name *DeclarationVisitor::ResolveName(const parser::Name &name) { @@ -8488,13 +8513,30 @@ const parser::Name *DeclarationVisitor::ResolveName(const parser::Name &name) { symbol->set(Symbol::Flag::ImplicitOrError, false); if (IsUplevelReference(*symbol)) { MakeHostAssocSymbol(name, *symbol); - } else if (IsDummy(*symbol) || - (!symbol->GetType() && FindCommonBlockContaining(*symbol))) { + } else if (IsDummy(*symbol)) { CheckEntryDummyUse(name.source, symbol); + ConvertToObjectEntity(*symbol); + if (IsEarlyDeclaredDummyArgument(*symbol)) { + ForgetEarlyDeclaredDummyArgument(*symbol); + if (isImplicitNoneType()) { + context().Warn(common::LanguageFeature::ForwardRefImplicitNone, + name.source, + "'%s' was used under IMPLICIT NONE(TYPE) before being explicitly typed"_warn_en_US, + name.source); + } else if (TypesMismatchIfNonNull( + symbol->GetType(), GetImplicitType(*symbol))) { + context().Warn(common::LanguageFeature::ForwardRefExplicitTypeDummy, + name.source, + "'%s' was used before being explicitly typed (and its implicit type would differ)"_warn_en_US, + name.source); + } + } + ApplyImplicitRules(*symbol); + } else if (!symbol->GetType() && FindCommonBlockContaining(*symbol)) { ConvertToObjectEntity(*symbol); ApplyImplicitRules(*symbol); } else if (const auto *tpd{symbol->detailsIf()}; - tpd && !tpd->attr()) { + tpd && !tpd->attr()) { Say(name, "Type parameter '%s' was referenced before being declared"_err_en_US, name.source); @@ -9037,11 +9079,6 @@ static bool IsLocallyImplicitGlobalSymbol( return false; } -static bool TypesMismatchIfNonNull( - const DeclTypeSpec *type1, const DeclTypeSpec *type2) { - return type1 && type2 && *type1 != *type2; -} - // Check and set the Function or Subroutine flag on symbol; false on error. 
bool ResolveNamesVisitor::SetProcFlag( const parser::Name &name, Symbol &symbol, Symbol::Flag flag) { @@ -9258,6 +9295,10 @@ void ResolveNamesVisitor::PreSpecificationConstruct( const parser::SpecificationConstruct &spec) { common::visit( common::visitors{ + [&](const parser::Statement< + common::Indirection> &y) { + EarlyDummyTypeDeclaration(y); + }, [&](const parser::Statement> &y) { CreateGeneric(std::get(y.statement.value().t)); }, @@ -9286,6 +9327,44 @@ void ResolveNamesVisitor::PreSpecificationConstruct( spec.u); } +void ResolveNamesVisitor::EarlyDummyTypeDeclaration( + const parser::Statement> + &stmt) { + context().set_location(stmt.source); + const auto &[declTypeSpec, attrs, entities] = stmt.statement.value().t; + if (const auto *intrin{ + std::get_if(&declTypeSpec.u)}) { + if (const auto *intType{std::get_if(&intrin->u)}) { + if (const auto &kind{intType->v}) { + if (!parser::Unwrap(*kind) && + !parser::Unwrap(*kind)) { + return; + } + } + const DeclTypeSpec *type{nullptr}; + for (const auto &ent : entities) { + const auto &objName{std::get(ent.t)}; + Resolve(objName, FindInScope(currScope(), objName)); + if (Symbol * symbol{objName.symbol}; + symbol && IsDummy(*symbol) && NeedsType(*symbol)) { + if (!type) { + type = ProcessTypeSpec(declTypeSpec); + if (!type || !type->IsNumeric(TypeCategory::Integer)) { + break; + } + } + symbol->SetType(*type); + NoteEarlyDeclaredDummyArgument(*symbol); + // Set the Implicit flag to disable bogus errors from + // being emitted later when this declaration is processed + // again normally. 
+ symbol->set(Symbol::Flag::Implicit); + } + } + } + } +} + void ResolveNamesVisitor::CreateCommonBlockSymbols( const parser::CommonStmt &commonStmt) { for (const parser::CommonStmt::Block &block : commonStmt.blocks) { diff --git a/flang/test/Semantics/OpenMP/linear-clause01.f90 b/flang/test/Semantics/OpenMP/linear-clause01.f90 index f95e834c9026c..286def2dba119 100644 --- a/flang/test/Semantics/OpenMP/linear-clause01.f90 +++ b/flang/test/Semantics/OpenMP/linear-clause01.f90 @@ -20,10 +20,8 @@ subroutine linear_clause_02(arg_01, arg_02) !$omp declare simd linear(val(arg_01)) real, intent(in) :: arg_01(:) - !ERROR: The list item 'arg_02' specified without the REF 'linear-modifier' must be of INTEGER type !ERROR: If the `linear-modifier` is REF or UVAL, the list item 'arg_02' must be a dummy argument without the VALUE attribute !$omp declare simd linear(uval(arg_02)) - !ERROR: The type of 'arg_02' has already been implicitly declared integer, value, intent(in) :: arg_02 !ERROR: The list item 'var' specified without the REF 'linear-modifier' must be of INTEGER type diff --git a/flang/test/Semantics/resolve103.f90 b/flang/test/Semantics/resolve103.f90 index 8f55968f43375..0acf2333b9586 100644 --- a/flang/test/Semantics/resolve103.f90 +++ b/flang/test/Semantics/resolve103.f90 @@ -1,8 +1,7 @@ ! RUN: not %flang_fc1 -pedantic %s 2>&1 | FileCheck %s ! Test extension: allow forward references to dummy arguments or COMMON ! from specification expressions in scopes with IMPLICIT NONE(TYPE), -! as long as those symbols are eventually typed later with the -! same integer type they would have had without IMPLICIT NONE. +! as long as those symbols are eventually typed later. 
 !CHECK: warning: 'n1' was used without (or before) being explicitly typed
 !CHECK: error: No explicit type declared for dummy argument 'n1'
@@ -19,12 +18,15 @@ subroutine foo2(a, n2)
   double precision n2
 end
 
-!CHECK: warning: 'n3' was used without (or before) being explicitly typed
-!CHECK-NOT: error: Dummy argument 'n3'
-subroutine foo3(a, n3)
+!CHECK: warning: 'n3a' was used under IMPLICIT NONE(TYPE) before being explicitly typed
+!CHECK: warning: 'n3b' was used under IMPLICIT NONE(TYPE) before being explicitly typed
+!CHECK-NOT: error: Dummy argument 'n3a'
+!CHECK-NOT: error: Dummy argument 'n3b'
+subroutine foo3(a, n3a, n3b)
   implicit none
-  real a(n3)
-  integer n3
+  integer a(n3a, n3b)
+  integer n3a
+  integer(8) n3b
 end
 
 !CHECK: warning: 'n4' was used without (or before) being explicitly typed

From flang-commits at lists.llvm.org Wed May 28 08:23:14 2025
From: flang-commits at lists.llvm.org (Pranav Bhandarkar via flang-commits)
Date: Wed, 28 May 2025 08:23:14 -0700 (PDT)
Subject: [flang-commits] [flang] [Flang] - Handle `BoxCharType` in `fir.box_offset` op (PR #141713)
In-Reply-To:
Message-ID: <68372a62.050a0220.2fea6c.8633@mx.google.com>

https://github.com/bhandarkar-pranav updated https://github.com/llvm/llvm-project/pull/141713

>From 271272f7a98bf5bf5e651c70cbd5030a311cc078 Mon Sep 17 00:00:00 2001
From: Pranav Bhandarkar
Date: Fri, 23 May 2025 10:23:57 -0500
Subject: [PATCH 1/2] Add the ability to the fir.box_offset op to handle references to fir.boxchar

---
 .../include/flang/Optimizer/Dialect/FIROps.td |  6 +++++
 .../include/flang/Optimizer/Dialect/FIRType.h |  5 ++--
 flang/lib/Optimizer/CodeGen/CodeGen.cpp       | 25 ++++++++++++++-----
 flang/lib/Optimizer/Dialect/FIROps.cpp        | 16 +++++++++---
 flang/lib/Optimizer/Dialect/FIRType.cpp       |  2 +-
 flang/test/Fir/box-offset-codegen.fir         | 10 ++++++++
 flang/test/Fir/box-offset.fir                 |  5 ++++
 flang/test/Fir/invalid.fir                    | 10 +++++++-
 8 files changed, 66 insertions(+), 13 deletions(-)

diff --git 
a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td index dc66885f776f0..160de05a33b41 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.td +++ b/flang/include/flang/Optimizer/Dialect/FIROps.td @@ -3240,11 +3240,17 @@ def fir_BoxOffsetOp : fir_Op<"box_offset", [NoMemoryEffect]> { descriptor implementation must have, only the base_addr and derived_type descriptor fields can be addressed. + It also accepts the address of a fir.boxchar and returns + address of the data pointer encapsulated by the fir.boxchar. + ``` %addr = fir.box_offset %box base_addr : (!fir.ref>>) -> !fir.llvm_ptr>> %tdesc = fir.box_offset %box derived_type : (!fir.ref>>) -> !fir.llvm_ptr>> + %addr1 = fir.box_offset %boxchar base_addr : (!fir.ref>) -> !fir.llvm_ptr>> ``` + + The derived_type field cannot be used when the input to this op is a reference to a fir.boxchar. }]; let arguments = (ins diff --git a/flang/include/flang/Optimizer/Dialect/FIRType.h b/flang/include/flang/Optimizer/Dialect/FIRType.h index 52b14f15f89bd..01878aa41005c 100644 --- a/flang/include/flang/Optimizer/Dialect/FIRType.h +++ b/flang/include/flang/Optimizer/Dialect/FIRType.h @@ -278,8 +278,9 @@ inline mlir::Type unwrapRefType(mlir::Type t) { /// If `t` conforms with a pass-by-reference type (box, ref, ptr, etc.) then /// return the element type of `t`. Otherwise, return `t`. 
inline mlir::Type unwrapPassByRefType(mlir::Type t) { - if (auto eleTy = dyn_cast_ptrOrBoxEleTy(t)) - return eleTy; + if (conformsWithPassByRef(t)) + if (auto eleTy = dyn_cast_ptrOrBoxEleTy(t)) + return eleTy; return t; } diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index 205807eab403a..e383c2e3e89ab 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -3930,12 +3930,25 @@ struct BoxOffsetOpConversion : public fir::FIROpConversion { mlir::ConversionPatternRewriter &rewriter) const override { mlir::Type pty = ::getLlvmPtrType(boxOffset.getContext()); - mlir::Type boxType = fir::unwrapRefType(boxOffset.getBoxRef().getType()); - mlir::Type llvmBoxTy = - lowerTy().convertBoxTypeAsStruct(mlir::cast(boxType)); - int fieldId = boxOffset.getField() == fir::BoxFieldAttr::derived_type - ? getTypeDescFieldId(boxType) - : kAddrPosInBox; + mlir::Type boxRefType = fir::unwrapRefType(boxOffset.getBoxRef().getType()); + + assert((mlir::isa(boxRefType) || + mlir::isa(boxRefType)) && + "boxRef should be a reference to either fir.box or fir.boxchar"); + + mlir::Type llvmBoxTy; + int fieldId; + if (auto boxType = mlir::dyn_cast_or_null(boxRefType)) { + llvmBoxTy = + lowerTy().convertBoxTypeAsStruct(mlir::cast(boxType)); + fieldId = boxOffset.getField() == fir::BoxFieldAttr::derived_type + ? 
getTypeDescFieldId(boxType) + : kAddrPosInBox; + } else { + auto boxCharType = mlir::cast(boxRefType); + llvmBoxTy = lowerTy().convertType(boxCharType); + fieldId = kAddrPosInBox; + } rewriter.replaceOpWithNewOp( boxOffset, pty, llvmBoxTy, adaptor.getBoxRef(), llvm::ArrayRef{0, fieldId}); diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index cbe93907265f6..6435886d73081 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -4484,15 +4484,25 @@ void fir::IfOp::resultToSourceOps(llvm::SmallVectorImpl &results, llvm::LogicalResult fir::BoxOffsetOp::verify() { auto boxType = mlir::dyn_cast_or_null( fir::dyn_cast_ptrEleTy(getBoxRef().getType())); - if (!boxType) - return emitOpError("box_ref operand must have !fir.ref> type"); + mlir::Type boxCharType; + bool isBoxChar = false; + if (!boxType) { + boxCharType = mlir::dyn_cast_or_null( + fir::dyn_cast_ptrEleTy(getBoxRef().getType())); + if (!boxCharType) + return emitOpError("box_ref operand must have !fir.ref> or !fir.ref> type"); + isBoxChar = true; + } if (getField() != fir::BoxFieldAttr::base_addr && getField() != fir::BoxFieldAttr::derived_type) return emitOpError("cannot address provided field"); - if (getField() == fir::BoxFieldAttr::derived_type) + if (getField() == fir::BoxFieldAttr::derived_type) { + if (isBoxChar) + return emitOpError("cannot address derived_type field of a fir.boxchar"); if (!fir::boxHasAddendum(boxType)) return emitOpError("can only address derived_type field of derived type " "or unlimited polymorphic fir.box"); + } return mlir::success(); } diff --git a/flang/lib/Optimizer/Dialect/FIRType.cpp b/flang/lib/Optimizer/Dialect/FIRType.cpp index 1e6e95393c2f7..da7aa17445404 100644 --- a/flang/lib/Optimizer/Dialect/FIRType.cpp +++ b/flang/lib/Optimizer/Dialect/FIRType.cpp @@ -255,7 +255,7 @@ mlir::Type dyn_cast_ptrOrBoxEleTy(mlir::Type t) { return llvm::TypeSwitch(t) .Case([](auto p) { return p.getEleTy(); 
}) - .Case( + .Case( [](auto p) { return unwrapRefType(p.getEleTy()); }) .Default([](mlir::Type) { return mlir::Type{}; }); } diff --git a/flang/test/Fir/box-offset-codegen.fir b/flang/test/Fir/box-offset-codegen.fir index 15c9a11e5aefe..59cfda8523061 100644 --- a/flang/test/Fir/box-offset-codegen.fir +++ b/flang/test/Fir/box-offset-codegen.fir @@ -37,3 +37,13 @@ func.func @array_tdesc(%array : !fir.ref>) -> !fir.llvm_ptr>> { + %addr = fir.box_offset %boxchar base_addr : (!fir.ref>) -> !fir.llvm_ptr>> + return %addr : !fir.llvm_ptr>> +} + +// CHECK-LABEL: define ptr @boxchar_addr( +// CHECK-SAME: ptr captures(none) %[[BOXCHAR:.*]]){{.*}} { +// CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64 }, ptr %[[BOXCHAR]], i32 0, i32 0 +// CHECK: ret ptr %[[VAL_0]] diff --git a/flang/test/Fir/box-offset.fir b/flang/test/Fir/box-offset.fir index 98c2eaefb8d6b..181ad51a5dbe1 100644 --- a/flang/test/Fir/box-offset.fir +++ b/flang/test/Fir/box-offset.fir @@ -21,6 +21,9 @@ func.func @test_box_offset(%unlimited : !fir.ref>, %type_star : %addr6 = fir.box_offset %type_star base_addr : (!fir.ref>>) -> !fir.llvm_ptr>> %tdesc6 = fir.box_offset %type_star derived_type : (!fir.ref>>) -> !fir.llvm_ptr> + + %boxchar = fir.alloca !fir.boxchar<1> + %addr7 = fir.box_offset %boxchar base_addr : (!fir.ref>) -> !fir.llvm_ptr>> return } // CHECK-LABEL: func.func @test_box_offset( @@ -40,3 +43,5 @@ func.func @test_box_offset(%unlimited : !fir.ref>, %type_star : // CHECK: %[[VAL_13:.*]] = fir.box_offset %[[VAL_0]] derived_type : (!fir.ref>) -> !fir.llvm_ptr> // CHECK: %[[VAL_14:.*]] = fir.box_offset %[[VAL_1]] base_addr : (!fir.ref>>) -> !fir.llvm_ptr>> // CHECK: %[[VAL_15:.*]] = fir.box_offset %[[VAL_1]] derived_type : (!fir.ref>>) -> !fir.llvm_ptr> +// CHECK: %[[VAL_16:.*]] = fir.alloca !fir.boxchar<1> +// CHECK: %[[VAL_17:.*]] = fir.box_offset %[[VAL_16]] base_addr : (!fir.ref>) -> !fir.llvm_ptr>> diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index 
fd607fd9066f7..45cae1f82cb8e 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -972,13 +972,21 @@ func.func @rec_to_rec(%arg0: !fir.type) -> !fir.type) { - // expected-error at +1{{'fir.box_offset' op box_ref operand must have !fir.ref> type}} + // expected-error at +1{{'fir.box_offset' op box_ref operand must have !fir.ref> or !fir.ref> type}} %addr1 = fir.box_offset %not_a_box base_addr : (!fir.ref) -> !fir.llvm_ptr> return } // ----- +func.func @bad_box_offset(%boxchar : !fir.ref>) { + // expected-error at +1{{'fir.box_offset' op cannot address derived_type field of a fir.boxchar}} + %addr1 = fir.box_offset %boxchar derived_type : (!fir.ref>) -> !fir.llvm_ptr>> + return +} + +// ----- + func.func @bad_box_offset(%no_addendum : !fir.ref>) { // expected-error at +1{{'fir.box_offset' op can only address derived_type field of derived type or unlimited polymorphic fir.box}} %addr1 = fir.box_offset %no_addendum derived_type : (!fir.ref>) -> !fir.llvm_ptr>> >From ada55c9750f6c2c2309ed7356c64e4751290cc3e Mon Sep 17 00:00:00 2001 From: Pranav Bhandarkar Date: Wed, 28 May 2025 10:21:12 -0500 Subject: [PATCH 2/2] Fix clang formatt issues --- flang/lib/Optimizer/CodeGen/CodeGen.cpp | 8 ++++---- flang/lib/Optimizer/Dialect/FIROps.cpp | 3 ++- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp index e383c2e3e89ab..82d960a6fc61e 100644 --- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp +++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp @@ -3939,11 +3939,11 @@ struct BoxOffsetOpConversion : public fir::FIROpConversion { mlir::Type llvmBoxTy; int fieldId; if (auto boxType = mlir::dyn_cast_or_null(boxRefType)) { - llvmBoxTy = - lowerTy().convertBoxTypeAsStruct(mlir::cast(boxType)); + llvmBoxTy = lowerTy().convertBoxTypeAsStruct( + mlir::cast(boxType)); fieldId = boxOffset.getField() == fir::BoxFieldAttr::derived_type - ? getTypeDescFieldId(boxType) - : kAddrPosInBox; + ? 
getTypeDescFieldId(boxType) + : kAddrPosInBox; } else { auto boxCharType = mlir::cast(boxRefType); llvmBoxTy = lowerTy().convertType(boxCharType); diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index 6435886d73081..8d3c82d00eec5 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -4490,7 +4490,8 @@ llvm::LogicalResult fir::BoxOffsetOp::verify() { boxCharType = mlir::dyn_cast_or_null( fir::dyn_cast_ptrEleTy(getBoxRef().getType())); if (!boxCharType) - return emitOpError("box_ref operand must have !fir.ref> or !fir.ref> type"); + return emitOpError("box_ref operand must have !fir.ref> or " + "!fir.ref> type"); isBoxChar = true; } if (getField() != fir::BoxFieldAttr::base_addr && From flang-commits at lists.llvm.org Wed May 28 08:23:21 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Wed, 28 May 2025 08:23:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68372a69.a70a0220.3b82a5.8243@mx.google.com> ================ @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); ---------------- mrkajetanp wrote: Ah right I see, so I should just replace the operands and then the copy out should get handled correctly by the lowering? https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Wed May 28 08:34:21 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Wed, 28 May 2025 08:34:21 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68372cfd.a70a0220.c6cf0.86d4@mx.google.com> https://github.com/mcinally updated https://github.com/llvm/llvm-project/pull/141380 >From 9f8619cb54a3a11e4c90af7f5156141ddc59e4d4 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH 1/3] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this options is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute. 
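As a rough illustration of the option validation this commit message describes (a sketch, not the code from the patch): the value is accepted when it is `none` or a decimal unsigned integer, and rejected otherwise. The helper name `parsePreferVectorWidth` and the `std::optional` return type are assumptions made for this example. The patch itself uses `llvm::StringRef::getAsInteger` and reports invalid values through the driver diagnostics, so corner cases may differ.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper (not part of the patch): returns the string that would
// be stored as the "prefer-vector-width" function attribute, or std::nullopt
// when the option value is invalid.
std::optional<std::string> parsePreferVectorWidth(std::string_view value) {
  // "none" is passed through unchanged.
  if (value == "none")
    return std::string("none");

  // Otherwise the whole value must parse as a base-10 unsigned integer.
  unsigned width = 0;
  const char *first = value.data();
  const char *last = first + value.size();
  auto [ptr, ec] = std::from_chars(first, last, width, 10);
  if (ec != std::errc() || ptr != last)
    return std::nullopt;

  // The attribute keeps the original spelling of the number.
  return std::string(value);
}
```

With this sketch, `128` and `none` are accepted, while `xxx`, an empty value, or a value with trailing characters such as `128x` are rejected, which corresponds to the `CHECK-INVALID` case in the driver test.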
--- clang/include/clang/Driver/Options.td | 2 +- flang/include/flang/Frontend/CodeGenOptions.h | 3 +++ .../include/flang/Optimizer/Transforms/Passes.td | 4 ++++ flang/include/flang/Tools/CrossToolHelpers.h | 3 +++ flang/lib/Frontend/CompilerInvocation.cpp | 14 ++++++++++++++ flang/lib/Frontend/FrontendActions.cpp | 2 ++ flang/lib/Optimizer/Passes/Pipelines.cpp | 2 +- flang/lib/Optimizer/Transforms/FunctionAttr.cpp | 5 +++++ flang/test/Driver/prefer-vector-width.f90 | 16 ++++++++++++++++ mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td | 1 + mlir/lib/Target/LLVMIR/ModuleImport.cpp | 4 ++++ mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | 3 +++ 12 files changed, 57 insertions(+), 2 deletions(-) create mode 100644 flang/test/Driver/prefer-vector-width.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 22261621df092..b0b642796010b 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -5480,7 +5480,7 @@ def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group, " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">, MarshallingInfoStringVector>; def mprefer_vector_width_EQ : Joined<["-"], "mprefer-vector-width=">, Group, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Specifies preferred vector width for auto-vectorization. Defaults to 'none' which allows target specific decisions.">, MarshallingInfoString>; def mstack_protector_guard_EQ : Joined<["-"], "mstack-protector-guard=">, Group, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -53,6 +53,9 @@ class CodeGenOptions : public CodeGenOptionsBase { /// The paths to the pass plugins that were registered using -fpass-plugin. 
std::vector LLVMPassPlugins; + // The prefered vector width, if requested by -mprefer-vector-width. + std::string PreferVectorWidth; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index b251534e1a8f6..2e932d9ad4a26 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -418,6 +418,10 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"preferVectorWidth", "prefer-vector-width", "std::string", + /*default=*/"", + "Set the prefer-vector-width attribute on functions in the " + "module.">, Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"", "Set the tune-cpu attribute on functions in the module.">, ]; diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 118695bbe2626..058024a4a04c5 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; InstrumentFunctionExit = "__cyg_profile_func_exit"; @@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. 
bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for + ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. bool EnableOpenMP = false; ///< Enable OpenMP lowering. std::string InstrumentFunctionEntry = diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..918323d663610 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -309,6 +309,20 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ)) opts.LLVMPassPlugins.push_back(a->getValue()); + // -mprefer_vector_width option + if (const llvm::opt::Arg *a = args.getLastArg( + clang::driver::options::OPT_mprefer_vector_width_EQ)) { + llvm::StringRef s = a->getValue(); + unsigned Width; + if (s == "none") + opts.PreferVectorWidth = "none"; + else if (s.getAsInteger(10, Width)) + diags.Report(clang::diag::err_drv_invalid_value) + << a->getAsString(args) << a->getValue(); + else + opts.PreferVectorWidth = s.str(); + } + // -fembed-offload-object option for (auto *a : args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 38dfaadf1dff9..012d0fdfe645f 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -741,6 +741,8 @@ void CodeGenAction::generateLLVMIR() { config.VScaleMax = vsr->second; } + config.PreferVectorWidth = opts.PreferVectorWidth; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..5c1e1b9f77efb 100644 --- 
a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -354,7 +354,7 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - ""})); + config.PreferVectorWidth, ""})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 43e4c1a7af3cd..13f447cf738b4 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -36,6 +36,7 @@ class FunctionAttrPass : public fir::impl::FunctionAttrBase { approxFuncFPMath = options.approxFuncFPMath; noSignedZerosFPMath = options.noSignedZerosFPMath; unsafeFPMath = options.unsafeFPMath; + preferVectorWidth = options.preferVectorWidth; } FunctionAttrPass() {} void runOnOperation() override; @@ -102,6 +103,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!preferVectorWidth.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, preferVectorWidth)); LLVM_DEBUG(llvm::dbgs() << "=== End " DEBUG_TYPE " ===\n"); } diff --git a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90 new file mode 100644 index 0000000000000..d0f5fd28db826 --- /dev/null +++ b/flang/test/Driver/prefer-vector-width.f90 @@ -0,0 +1,16 @@ +! Test that -mprefer-vector-width works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! RUN: %flang_fc1 -mprefer-vector-width=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! 
RUN: %flang_fc1 -mprefer-vector-width=128 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-128 +! RUN: %flang_fc1 -mprefer-vector-width=256 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-256 +! RUN: not %flang_fc1 -mprefer-vector-width=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INVALID + +subroutine func +end subroutine func + +! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} } +! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } +! CHECK-128: attributes #0 = { "prefer-vector-width"="128" } +! CHECK-256: attributes #0 = { "prefer-vector-width"="256" } +! CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index 6fde45ce5c556..c0324d561b77b 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1893,6 +1893,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$frame_pointer, OptionalAttr:$target_cpu, OptionalAttr:$tune_cpu, + OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index b049064fbd31c..85417da798b22 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, funcOp.setTargetFeaturesAttr( LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width"); + attr.isStringAttribute()) + funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp 
b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index 047e870b7dcd8..2b7f0b11613aa 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1540,6 +1540,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) { if (auto tuneCpu = func.getTuneCpu()) llvmFunc->addFnAttr("tune-cpu", *tuneCpu); + if (auto preferVectorWidth = func.getPreferVectorWidth()) + llvmFunc->addFnAttr("prefer-vector-width", *preferVectorWidth); + if (auto attr = func.getVscaleRange()) llvmFunc->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs( getLLVMContext(), attr->getMinRange().getInt(), >From 5bdf615715733351bae8f959f0a06a8449526bb8 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH 2/3] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this options is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute. 
--- flang/lib/Frontend/CompilerInvocation.cpp | 4 ++-- mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll | 9 +++++++++ mlir/test/Target/LLVMIR/prefer-vector-width.mlir | 8 ++++++++ 3 files changed, 19 insertions(+), 2 deletions(-) create mode 100644 mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll create mode 100644 mlir/test/Target/LLVMIR/prefer-vector-width.mlir diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 918323d663610..90a002929eff0 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -313,10 +313,10 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, if (const llvm::opt::Arg *a = args.getLastArg( clang::driver::options::OPT_mprefer_vector_width_EQ)) { llvm::StringRef s = a->getValue(); - unsigned Width; + unsigned width; if (s == "none") opts.PreferVectorWidth = "none"; - else if (s.getAsInteger(10, Width)) + else if (s.getAsInteger(10, width)) diags.Report(clang::diag::err_drv_invalid_value) << a->getAsString(args) << a->getValue(); else diff --git a/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll new file mode 100644 index 0000000000000..831aa57345a3f --- /dev/null +++ b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @prefer_vector_width() +; CHECK: prefer_vector_width = "128" +define void @prefer_vector_width() #0 { + ret void +} + +attributes #0 = { "prefer-vector-width"="128" } diff --git a/mlir/test/Target/LLVMIR/prefer-vector-width.mlir b/mlir/test/Target/LLVMIR/prefer-vector-width.mlir new file mode 100644 index 0000000000000..7410e8139fd31 --- /dev/null +++ b/mlir/test/Target/LLVMIR/prefer-vector-width.mlir @@ -0,0 +1,8 @@ +// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s + +// CHECK: define void @prefer_vector_width() 
#[[ATTRS:.*]] { +// CHECK: attributes #[[ATTRS]] = { "prefer-vector-width"="128" } + +llvm.func @prefer_vector_width() attributes {prefer_vector_width = "128"} { + llvm.return +} >From befabca370ba227262859aec47e4fbc93759b3a0 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH 3/3] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this options is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute. --- mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll index 831aa57345a3f..e30ef04924b81 100644 --- a/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll +++ b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll @@ -1,7 +1,7 @@ ; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s ; CHECK-LABEL: llvm.func @prefer_vector_width() -; CHECK: prefer_vector_width = "128" +; CHECK-SAME: prefer_vector_width = "128" define void @prefer_vector_width() #0 { ret void } From flang-commits at lists.llvm.org Wed May 28 08:42:23 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Wed, 28 May 2025 08:42:23 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Retrieve shape from selector when generating assoc sym type (PR #137117) In-Reply-To: Message-ID: <68372edf.170a0220.2ca76d.8476@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/137117 >From cae0024d339c0fe31cd31ce67553a4143c6cc1b0 Mon Sep 17 00:00:00 2001 From: ergawy Date: Thu, 24 Apr 2025 00:34:44 -0500 Subject: [PATCH 1/2] [flang] Retrieve shape from selector when generating assoc sym type This PR extends `genSymbolType` so that the type of an associating symbol carries the 
shape of the selector expression, if any. This is a fix for a bug that triggered when an associating symbol is used in a locality specifier. For example, given the following input: ```fortran associate(a => aa(4:)) do concurrent (i = 4:11) local(a) a(i) = 0 end do end associate ``` before the changes in the PR, flang would assert that we are casting between incompatible types. The issue happened since for the associating symbol (`a`), flang generated its type as `f32` rather than `!fir.array<8xf32>` as it should be in this case. --- flang/lib/Lower/ConvertType.cpp | 17 ++++++++++++++ .../do_concurrent_local_assoc_entity.f90 | 22 +++++++++++++++++++ 2 files changed, 39 insertions(+) create mode 100644 flang/test/Lower/do_concurrent_local_assoc_entity.f90 diff --git a/flang/lib/Lower/ConvertType.cpp b/flang/lib/Lower/ConvertType.cpp index d45f9e7c0bf1b..875bdba6cc6ba 100644 --- a/flang/lib/Lower/ConvertType.cpp +++ b/flang/lib/Lower/ConvertType.cpp @@ -279,6 +279,23 @@ struct TypeBuilderImpl { bool isPolymorphic = (Fortran::semantics::IsPolymorphic(symbol) || Fortran::semantics::IsUnlimitedPolymorphic(symbol)) && !Fortran::semantics::IsAssumedType(symbol); + if (const auto *assocDetails = + ultimate.detailsIf()) { + const auto &selector = assocDetails->expr(); + + if (selector && selector->Rank() > 0) { + auto shapeExpr = Fortran::evaluate::GetShape( + converter.getFoldingContext(), selector); + + fir::SequenceType::Shape shape; + // If there is no shapExpr, this is an assumed-rank, and the empty shape + // will build the desired fir.array<*:T> type. 
+ if (shapeExpr) + translateShape(shape, std::move(*shapeExpr)); + ty = fir::SequenceType::get(shape, ty); + } + } + if (ultimate.IsObjectArray()) { auto shapeExpr = Fortran::evaluate::GetShape(converter.getFoldingContext(), ultimate); diff --git a/flang/test/Lower/do_concurrent_local_assoc_entity.f90 b/flang/test/Lower/do_concurrent_local_assoc_entity.f90 new file mode 100644 index 0000000000000..ca16ecaa5c137 --- /dev/null +++ b/flang/test/Lower/do_concurrent_local_assoc_entity.f90 @@ -0,0 +1,22 @@ +! RUN: %flang_fc1 -emit-hlfir -o - %s | FileCheck %s + +subroutine local_assoc + implicit none + integer i + real, dimension(2:11) :: aa + + associate(a => aa(4:)) + do concurrent (i = 4:11) local(a) + a(i) = 0 + end do + end associate +end subroutine local_assoc + +! CHECK: %[[C8:.*]] = arith.constant 8 : index + +! CHECK: fir.do_loop {{.*}} unordered { +! CHECK: %[[LOCAL_ALLOC:.*]] = fir.alloca !fir.array<8xf32> {bindc_name = "a", pinned, uniq_name = "{{.*}}local_assocEa"} +! CHECK: %[[LOCAL_SHAPE:.*]] = fir.shape %[[C8]] : +! CHECK: %[[LOCAL_DECL:.*]]:2 = hlfir.declare %[[LOCAL_ALLOC]](%[[LOCAL_SHAPE]]) +! CHECK: hlfir.designate %[[LOCAL_DECL]]#0 (%{{.*}}) +! 
CHECK: } >From 741a623604b6c18cc200e4c3cc900233ebfe658b Mon Sep 17 00:00:00 2001 From: ergawy Date: Wed, 28 May 2025 04:08:24 -0500 Subject: [PATCH 2/2] review comment, restructure code --- flang/lib/Lower/ConvertType.cpp | 39 +++++++------------ .../do_concurrent_local_assoc_entity.f90 | 2 +- 2 files changed, 14 insertions(+), 27 deletions(-) diff --git a/flang/lib/Lower/ConvertType.cpp b/flang/lib/Lower/ConvertType.cpp index 875bdba6cc6ba..7a2e8e5095186 100644 --- a/flang/lib/Lower/ConvertType.cpp +++ b/flang/lib/Lower/ConvertType.cpp @@ -276,36 +276,23 @@ struct TypeBuilderImpl { } else { fir::emitFatalError(loc, "symbol must have a type"); } - bool isPolymorphic = (Fortran::semantics::IsPolymorphic(symbol) || - Fortran::semantics::IsUnlimitedPolymorphic(symbol)) && - !Fortran::semantics::IsAssumedType(symbol); - if (const auto *assocDetails = - ultimate.detailsIf()) { - const auto &selector = assocDetails->expr(); - - if (selector && selector->Rank() > 0) { - auto shapeExpr = Fortran::evaluate::GetShape( - converter.getFoldingContext(), selector); - - fir::SequenceType::Shape shape; - // If there is no shapExpr, this is an assumed-rank, and the empty shape - // will build the desired fir.array<*:T> type. - if (shapeExpr) - translateShape(shape, std::move(*shapeExpr)); - ty = fir::SequenceType::get(shape, ty); - } - } - if (ultimate.IsObjectArray()) { - auto shapeExpr = - Fortran::evaluate::GetShape(converter.getFoldingContext(), ultimate); + auto shapeExpr = + Fortran::evaluate::GetShape(converter.getFoldingContext(), ultimate); + + if (shapeExpr && !shapeExpr->empty()) { + // Statically ranked array. fir::SequenceType::Shape shape; - // If there is no shapExpr, this is an assumed-rank, and the empty shape - // will build the desired fir.array<*:T> type. - if (shapeExpr) - translateShape(shape, std::move(*shapeExpr)); + translateShape(shape, std::move(*shapeExpr)); ty = fir::SequenceType::get(shape, ty); + } else if (!shapeExpr) { + // Assumed-rank. 
+ ty = fir::SequenceType::get(fir::SequenceType::Shape{}, ty); } + + bool isPolymorphic = (Fortran::semantics::IsPolymorphic(symbol) || + Fortran::semantics::IsUnlimitedPolymorphic(symbol)) && + !Fortran::semantics::IsAssumedType(symbol); if (Fortran::semantics::IsPointer(symbol)) return fir::wrapInClassOrBoxType(fir::PointerType::get(ty), isPolymorphic); diff --git a/flang/test/Lower/do_concurrent_local_assoc_entity.f90 b/flang/test/Lower/do_concurrent_local_assoc_entity.f90 index ca16ecaa5c137..280827871aaf4 100644 --- a/flang/test/Lower/do_concurrent_local_assoc_entity.f90 +++ b/flang/test/Lower/do_concurrent_local_assoc_entity.f90 @@ -14,7 +14,7 @@ end subroutine local_assoc ! CHECK: %[[C8:.*]] = arith.constant 8 : index -! CHECK: fir.do_loop {{.*}} unordered { +! CHECK: fir.do_concurrent.loop {{.*}} { ! CHECK: %[[LOCAL_ALLOC:.*]] = fir.alloca !fir.array<8xf32> {bindc_name = "a", pinned, uniq_name = "{{.*}}local_assocEa"} ! CHECK: %[[LOCAL_SHAPE:.*]] = fir.shape %[[C8]] : ! CHECK: %[[LOCAL_DECL:.*]]:2 = hlfir.declare %[[LOCAL_ALLOC]](%[[LOCAL_SHAPE]]) From flang-commits at lists.llvm.org Wed May 28 09:55:56 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Wed, 28 May 2025 09:55:56 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][rt] Enable Count and CountDim for device build (PR #141684) In-Reply-To: Message-ID: <6837401c.170a0220.a5be2.a487@mx.google.com> https://github.com/clementval closed https://github.com/llvm/llvm-project/pull/141684 From flang-commits at lists.llvm.org Wed May 28 10:19:06 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 10:19:06 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458a.a70a0220.1a998e.b070@mx.google.com> https://github.com/tblah commented: Thanks @kparzysz for the 
massive effort this patch must have taken to implement. I have a few minor comments but I agree with your overall approach. Right now I cannot build this with -Werror: ``` /home/$USER/llvm-project/flang/lib/Semantics/check-omp-structure.cpp:2820:56: error: missing field 'iff' initializer [-Werror,-Wmissing-field-initializers] 2820 | GetActionStmt(std::get(s.t))}; |  ^ 1 error generated. ninja: build stopped: subcommand failed. ``` https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Wed May 28 10:19:06 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 10:19:06 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458a.170a0220.38362a.aa76@mx.google.com> ================ @@ -4217,49 +3783,168 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAllocatorsConstruct &allocsConstruct) { TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case 
parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; ---------------- tblah wrote: Please could you add a lit test for this https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Wed May 28 10:19:06 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 10:19:06 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458a.050a0220.1f34ab.a7d2@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Wed May 28 10:19:06 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 10:19:06 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458a.170a0220.25027f.a4b4@mx.google.com> ================ @@ -16,10 +16,18 @@ #include 
"flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); ---------------- tblah wrote: Please could you add a comment to this line explaining some context, so that if this ever fails, whoever debugs it knows where to look. https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Wed May 28 10:19:06 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 10:19:06 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458a.170a0220.18afe4.a3ef@mx.google.com> ================ @@ -0,0 +1,83 @@ +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 + +subroutine f00 + integer :: x, y + + ! The x is a direct argument of the + operator. Expect no diagnostics. + !$omp atomic update + x = x + (y - 1) +end + +subroutine f01 + integer :: x + + ! x + 0 is unusual, but legal. Expect no diagnostics. + !$omp atomic update + x = x + 0 +end + +subroutine f02 + integer :: x + + ! This is formally not allowed by the syntax restrictions of the spec, + ! but it's equivalent to either x+0 or x*1, both of which are legal. + ! Allow this case. Expect no diagnostics. + !$omp atomic update + x = x +end + +subroutine f03 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator + x = (x + y) + 1 +end + +subroutine f04 + integer :: x + real :: y + + !$omp atomic update + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation + x = floor(x + y) +end + +subroutine f05 + integer :: x + real :: y + + ! An explicit conversion is accepted as an extension. 
+ !$omp atomic update + x = int(x + y) ---------------- tblah wrote: Please could you document this extension (maybe flang/docs/Extensions.md or at least explain in the commit message about these consequences of using evaluate::Expr). https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Wed May 28 10:19:06 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 10:19:06 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458a.170a0220.374177.a5ad@mx.google.com> ================ @@ -2673,645 +2673,211 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// - -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. 
-static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. 
-static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr, - lower::StatementContext *atomicCaptureStmtCtx = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, ---------------- tblah wrote: It definitely doesn't need to be done with this patch, but I wonder if the clause processing can be moved to Clauses.cpp https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Wed May 28 10:19:07 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 
10:19:07 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458b.170a0220.319297.a0ec@mx.google.com> ================ @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point. +[[maybe_unused]] static std::string ahead(const ParseState &state) { ---------------- tblah wrote: This is now unused. Are you keeping it deliberately as a debugging tool or was this forgotten due to the attribute? https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Wed May 28 10:19:08 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 10:19:08 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458c.170a0220.319297.a0ee@mx.google.com> ================ @@ -777,5 +777,22 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// is returns std::nullopt. 
+MaybeExpr GetConvertInput(const SomeExpr &x); + ---------------- tblah wrote: Should these be defined in tools.cpp not check-omp-structure.cpp? https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Wed May 28 10:19:07 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 10:19:07 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458b.050a0220.348b11.a946@mx.google.com> ================ @@ -2656,527 +2665,1857 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; } } + return SourcedActionStmt{}; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != 
variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); } - return false; + return SourcedActionStmt{}; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - 
common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; - } - } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update 
statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); - } +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. +static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; } - ErrIfAllocatableVariable(var); + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; + } else { + return std::nullopt; + } + }, + x->u); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] + // Extract the evaluate::Expr from ScalarLogicalExpr. 
+ auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. -void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const auto *v1 = GetExpr(context_, stmt1Var); - const auto *e1 = GetExpr(context_, stmt1Expr); - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - const auto *v2 = GetExpr(context_, stmt2Var); - const auto *e2 = GetExpr(context_, stmt2Expr); - - if (e1 && v1 && e2 && v2) { - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(v2, e2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } - if (!(*e1 == *v2)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(v1, e1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - if (!(*v1 == *e2)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); - } - } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } + return std::nullopt; } -} -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. 
+ return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; } } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; } } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); - } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); + return std::nullopt; } -} -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { 
- const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, - }, - x.u); + return std::nullopt; } -void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { - dirContext_.pop_back(); +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension ---------------- tblah wrote: Do you think it is worth documenting these extensions somewhere? 
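For readers following the quoted hunk, the splitting rule described in the `SplitAssignmentSource` comment (prefer `=>`, otherwise pick an `=` that is not part of `<=`, `>=`, `==`, or `/=`) can be sketched standalone as below. This is not the patch's code. `TrimBlanks` and `SplitAssignmentText` are hypothetical names, the sketch works on plain `std::string_view` rather than `parser::CharBlock`, it scans forward instead of backward, it also skips an `=` that is followed by another `=`, and it returns `std::nullopt` where the patch calls `llvm_unreachable`. All of those are assumptions made for illustration.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>
#include <utility>

// Hypothetical helper (not from the patch): strip leading and trailing
// blanks. Emptiness is checked before any character is read, so an empty
// or all-blank view is never accessed out of bounds.
static std::string_view TrimBlanks(std::string_view v) {
  while (!v.empty() && v.front() == ' ') {
    v.remove_prefix(1);
  }
  while (!v.empty() && v.back() == ' ') {
    v.remove_suffix(1);
  }
  return v;
}

// Hypothetical stand-in for the quoted SplitAssignmentSource: returns the
// trimmed (lhs, rhs) pair, or std::nullopt if no assignment operator is
// found in the text.
static std::optional<std::pair<std::string_view, std::string_view>>
SplitAssignmentText(std::string_view sv) {
  // Pointer assignment: split at the first "=>".
  if (auto where = sv.find("=>"); where != std::string_view::npos) {
    return std::make_pair(
        TrimBlanks(sv.substr(0, where)), TrimBlanks(sv.substr(where + 2)));
  }
  // Plain assignment: the first '=' that is neither the tail of
  // "<=", ">=", "==", "/=" nor the head of "==".
  constexpr std::string_view excluded{"<>=/"};
  for (std::size_t i = 1; i < sv.size(); ++i) {
    if (sv[i] != '=') {
      continue;
    }
    bool tailOfOperator = excluded.find(sv[i - 1]) != std::string_view::npos;
    bool headOfEquality = i + 1 < sv.size() && sv[i + 1] == '=';
    if (!tailOfOperator && !headOfEquality) {
      return std::make_pair(
          TrimBlanks(sv.substr(0, i)), TrimBlanks(sv.substr(i + 1)));
    }
  }
  return std::nullopt;
}
```

Calling `SplitAssignmentText("x = y == z")` under these assumptions yields the pair `("x", "y == z")`, while a bare comparison such as `"x == y"` yields `std::nullopt`.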
https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Wed May 28 10:19:08 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 10:19:08 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458c.a70a0220.50a6d.b130@mx.google.com> ================ @@ -4217,49 +3783,168 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAllocatorsConstruct &allocsConstruct) { TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { ---------------- tblah wrote: Ultra-nit, but it isn't too far fetched that some other construct might one day need analysis following a similar pattern. 
```suggestion dumpAtomicAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { ``` https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Wed May 28 10:19:08 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 10:19:08 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458c.170a0220.3d0280.a639@mx.google.com> ================ @@ -2656,527 +2665,1857 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; } } + return SourcedActionStmt{}; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != 
variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); } - return false; + return SourcedActionStmt{}; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - 
common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; - } - } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update 
statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); - } +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. +static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; } - ErrIfAllocatableVariable(var); + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; + } else { + return std::nullopt; + } + }, + x->u); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] + // Extract the evaluate::Expr from ScalarLogicalExpr. 
+ auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. -void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const auto *v1 = GetExpr(context_, stmt1Var); - const auto *e1 = GetExpr(context_, stmt1Expr); - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - const auto *v2 = GetExpr(context_, stmt2Var); - const auto *e2 = GetExpr(context_, stmt2Expr); - - if (e1 && v1 && e2 && v2) { - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(v2, e2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } - if (!(*e1 == *v2)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(v1, e1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - if (!(*v1 == *e2)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); - } - } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } + return std::nullopt; } -} -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. 
+ return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; } } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; } } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); - } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); + return std::nullopt; } -} -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { 
- const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, - }, - x.u); + return std::nullopt; } -void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { - dirContext_.pop_back(); +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity";
----------------
tblah wrote:

From above:
```
// extension: x = x is allowed (*), but we should never print
// "identity" as the name of the operator
```
Maybe this should be an `llvm_unreachable`, plus some logic at each call site?

https://github.com/llvm/llvm-project/pull/137852

From flang-commits at lists.llvm.org Wed May 28 09:12:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 09:12:21 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [WIP] Implement workdistribute construct (PR #140523) In-Reply-To: Message-ID: <683735e5.170a0220.26844f.9438@mx.google.com> https://github.com/skc7 updated https://github.com/llvm/llvm-project/pull/140523

From e0dff6afb7aa31330aa0516effb7a0f65df5315f Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 12:57:36 -0800 Subject: [PATCH 01/13] Add coexecute directives --- llvm/include/llvm/Frontend/OpenMP/OMP.td | 45 ++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 0af4b436649a3..752486a8105b6 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -682,6 +682,8 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let association = AS_None; let category = CA_Executable; } +def OMP_Coexecute : Directive<"coexecute"> {} +def OMP_EndCoexecute : Directive<"end coexecute"> {} def OMP_Critical : Directive<"critical"> { let allowedOnceClauses = [ VersionedClause, @@ -2198,6 +2200,33 @@ def
OMP_TargetTeams : Directive<"target teams"> { let leafConstructs = [OMP_Target, OMP_Teams]; let category = CA_Executable; } +def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; +} def OMP_TargetTeamsDistribute : Directive<"target teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2484,6 +2513,22 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { let leafConstructs = [OMP_TaskLoop, OMP_Simd]; let category = CA_Executable; } +def OMP_TeamsCoexecute : Directive<"teams coexecute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause + ]; +} def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [ VersionedClause,

From 8b1b36f5e716b8186d98b0d5c47c0fdf649ae67b Mon Sep 17 00:00:00 2001 From: skc7 Date: Tue, 13 May 2025 11:01:45 +0530 Subject: [PATCH 02/13] [OpenMP] Fix Coexecute definitions --- llvm/include/llvm/Frontend/OpenMP/OMP.td | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 752486a8105b6..7f450b43c2e36 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -682,8 +682,15 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let association = AS_None; let category = CA_Executable; } -def
OMP_Coexecute : Directive<"coexecute"> {} -def OMP_EndCoexecute : Directive<"end coexecute"> {} +def OMP_Coexecute : Directive<"coexecute"> { + let association = AS_Block; + let category = CA_Executable; +} +def OMP_EndCoexecute : Directive<"end coexecute"> { + let leafConstructs = OMP_Coexecute.leafConstructs; + let association = OMP_Coexecute.association; + let category = OMP_Coexecute.category; +} def OMP_Critical : Directive<"critical"> { let allowedOnceClauses = [ VersionedClause, @@ -2224,8 +2231,10 @@ def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { VersionedClause, VersionedClause, VersionedClause, - VersionedClause, + VersionedClause, ]; + let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; + let category = CA_Executable; } def OMP_TargetTeamsDistribute : Directive<"target teams distribute"> { let allowedClauses = [ @@ -2528,6 +2537,8 @@ def OMP_TeamsCoexecute : Directive<"teams coexecute"> { VersionedClause, VersionedClause ]; + let leafConstructs = [OMP_Target, OMP_Teams]; + let category = CA_Executable; } def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [

From 9b8d66a45e602375ec779e6c5bdd43232644f9a2 Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 12:58:10 -0800 Subject: [PATCH 03/13] Add omp.coexecute op --- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 35 +++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 5a79fbf77a268..8061aa0209cc9 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -325,6 +325,41 @@ def SectionsOp : OpenMP_Op<"sections", traits = [ let hasRegionVerifier = 1; } +//===----------------------------------------------------------------------===// +// Coexecute Construct +//===----------------------------------------------------------------------===// + +def CoexecuteOp :
OpenMP_Op<"coexecute"> { + let summary = "coexecute directive"; + let description = [{ + The coexecute construct specifies that the teams from the teams directive + this is nested in shall cooperate to execute the computation in this region. + There is no implicit barrier at the end as specified in the standard. + + TODO + We should probably change the defaut behaviour to have a barrier unless + nowait is specified, see below snippet. + + ``` + !$omp target teams + !$omp coexecute + tmp = matmul(x, y) + !$omp end coexecute + a = tmp(0, 0) ! there is no implicit barrier! the matmul hasnt completed! + !$omp end target teams coexecute + ``` + + }]; + + let arguments = (ins UnitAttr:$nowait); + + let regions = (region AnyRegion:$region); + + let assemblyFormat = [{ + oilist(`nowait` $nowait) $region attr-dict + }]; +} + //===----------------------------------------------------------------------===// // 2.8.2 Single Construct //===----------------------------------------------------------------------===//

From 7ecec06e00230649446c77c970160d4814a90e07 Mon Sep 17 00:00:00 2001 From: Ivan Radanov Ivanov Date: Mon, 4 Dec 2023 17:50:41 -0800 Subject: [PATCH 04/13] Initial frontend support for coexecute --- .../include/flang/Semantics/openmp-directive-sets.h | 13 +++++++++++++ flang/lib/Lower/OpenMP/OpenMP.cpp | 12 ++++++++++++ flang/lib/Parser/openmp-parsers.cpp | 5 ++++- flang/lib/Semantics/resolve-directives.cpp | 6 ++++++ 4 files changed, 35 insertions(+), 1 deletion(-) diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index dd610c9702c28..5c316e030c63f 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -143,6 +143,7 @@ static const OmpDirectiveSet topTargetSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, +
Directive::OMPD_target_teams_coexecute, }; static const OmpDirectiveSet allTargetSet{topTargetSet}; @@ -187,9 +188,16 @@ static const OmpDirectiveSet allTeamsSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, + Directive::OMPD_target_teams_coexecute, } | topTeamsSet, }; +static const OmpDirectiveSet allCoexecuteSet{ + Directive::OMPD_coexecute, + Directive::OMPD_teams_coexecute, + Directive::OMPD_target_teams_coexecute, +}; + //===----------------------------------------------------------------------===// // Directive sets for groups of multiple directives //===----------------------------------------------------------------------===// @@ -230,6 +238,9 @@ static const OmpDirectiveSet blockConstructSet{ Directive::OMPD_taskgroup, Directive::OMPD_teams, Directive::OMPD_workshare, + Directive::OMPD_target_teams_coexecute, + Directive::OMPD_teams_coexecute, + Directive::OMPD_coexecute, }; static const OmpDirectiveSet loopConstructSet{ @@ -294,6 +305,7 @@ static const OmpDirectiveSet workShareSet{ Directive::OMPD_scope, Directive::OMPD_sections, Directive::OMPD_single, + Directive::OMPD_coexecute, } | allDoSet, }; @@ -376,6 +388,7 @@ static const OmpDirectiveSet nestedReduceWorkshareAllowedSet{ }; static const OmpDirectiveSet nestedTeamsAllowedSet{ + Directive::OMPD_coexecute, Directive::OMPD_distribute, Directive::OMPD_distribute_parallel_do, Directive::OMPD_distribute_parallel_do_simd, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 61bbc709872fd..b0c65c8e37988 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2670,6 +2670,15 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +static mlir::omp::CoexecuteOp +genCoexecuteOp(Fortran::lower::AbstractConverter &converter, + Fortran::lower::pft::Evaluation &eval, + mlir::Location currentLocation, + 
const Fortran::parser::OmpClauseList &clauseList) { + return genOpWithBody( + converter, eval, currentLocation, /*outerCombined=*/false, &clauseList); +} + //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// @@ -3929,6 +3938,9 @@ static void genOMPDispatch(lower::AbstractConverter &converter, newOp = genTeamsOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); break; + case llvm::omp::Directive::OMPD_coexecute: + newOp = genCoexecuteOp(converter, eval, currentLocation, beginClauseList); + break; case llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: { unsigned version = semaCtx.langOptions().OpenMPVersion; diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 52d3a5844c969..591b1642baed3 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1344,12 +1344,15 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_coexecute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_teams_coexecute), "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), + "COEXECUTE" >> pure(llvm::omp::Directive::OMPD_coexecute)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/resolve-directives.cpp 
b/flang/lib/Semantics/resolve-directives.cpp index 9fa7bc8964854..ae297f204356a 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1617,6 +1617,9 @@ bool OmpAttributeVisitor::Pre(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_taskgroup: case llvm::omp::Directive::OMPD_teams: + case llvm::omp::Directive::OMPD_coexecute: + case llvm::omp::Directive::OMPD_teams_coexecute: + case llvm::omp::Directive::OMPD_target_teams_coexecute: case llvm::omp::Directive::OMPD_workshare: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: @@ -1650,6 +1653,9 @@ void OmpAttributeVisitor::Post(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_target: case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_teams: + case llvm::omp::Directive::OMPD_coexecute: + case llvm::omp::Directive::OMPD_teams_coexecute: + case llvm::omp::Directive::OMPD_target_teams_coexecute: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: case llvm::omp::Directive::OMPD_target_parallel: { >From ca0cc44c621fde89f1889fb328e66755ca3f5e3a Mon Sep 17 00:00:00 2001 From: skc7 Date: Tue, 13 May 2025 15:09:45 +0530 Subject: [PATCH 05/13] [OpenMP] Fixes for coexecute definitions --- .../flang/Semantics/openmp-directive-sets.h | 1 + flang/lib/Lower/OpenMP/OpenMP.cpp | 13 ++-- flang/test/Lower/OpenMP/coexecute.f90 | 59 +++++++++++++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 33 +++++------ 4 files changed, 83 insertions(+), 23 deletions(-) create mode 100644 flang/test/Lower/OpenMP/coexecute.f90 diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index 5c316e030c63f..43f4e642b3d86 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ 
b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -173,6 +173,7 @@ static const OmpDirectiveSet topTeamsSet{ Directive::OMPD_teams_distribute_parallel_do_simd, Directive::OMPD_teams_distribute_simd, Directive::OMPD_teams_loop, + Directive::OMPD_teams_coexecute, }; static const OmpDirectiveSet bottomTeamsSet{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index b0c65c8e37988..80612bd05ad97 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2671,12 +2671,13 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, } static mlir::omp::CoexecuteOp -genCoexecuteOp(Fortran::lower::AbstractConverter &converter, - Fortran::lower::pft::Evaluation &eval, - mlir::Location currentLocation, - const Fortran::parser::OmpClauseList &clauseList) { +genCoexecuteOp(lower::AbstractConverter &converter, lower::SymMap &symTable, + semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, + mlir::Location loc, const ConstructQueue &queue, + ConstructQueue::const_iterator item) { return genOpWithBody( - converter, eval, currentLocation, /*outerCombined=*/false, &clauseList); + OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, + llvm::omp::Directive::OMPD_coexecute), queue, item); } //===----------------------------------------------------------------------===// @@ -3939,7 +3940,7 @@ static void genOMPDispatch(lower::AbstractConverter &converter, item); break; case llvm::omp::Directive::OMPD_coexecute: - newOp = genCoexecuteOp(converter, eval, currentLocation, beginClauseList); + newOp = genCoexecuteOp(converter, symTable, semaCtx, eval, loc, queue, item); break; case llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: { diff --git a/flang/test/Lower/OpenMP/coexecute.f90 b/flang/test/Lower/OpenMP/coexecute.f90 new file mode 100644 index 0000000000000..b14f71f9bbbfa --- /dev/null +++ b/flang/test/Lower/OpenMP/coexecute.f90 @@ -0,0 +1,59 @@ +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! CHECK-LABEL: func @_QPtarget_teams_coexecute +subroutine target_teams_coexecute() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp target teams coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end target teams coexecute +end subroutine target_teams_coexecute + +! CHECK-LABEL: func @_QPteams_coexecute +subroutine teams_coexecute() + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp teams coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end teams coexecute +end subroutine teams_coexecute + +! CHECK-LABEL: func @_QPtarget_teams_coexecute_m +subroutine target_teams_coexecute_m() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp target + !$omp teams + !$omp coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end coexecute + !$omp end teams + !$omp end target +end subroutine target_teams_coexecute_m + +! CHECK-LABEL: func @_QPteams_coexecute_m +subroutine teams_coexecute_m() + ! CHECK: omp.teams + ! CHECK: omp.coexecute + !$omp teams + !$omp coexecute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! 
CHECK: omp.terminator + !$omp end coexecute + !$omp end teams +end subroutine teams_coexecute_m diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 7f450b43c2e36..3f02b6534816f 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -2209,29 +2209,28 @@ def OMP_TargetTeams : Directive<"target teams"> { } def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, + VersionedClause, VersionedClause, VersionedClause, - VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, - VersionedClause, + VersionedClause, ]; - let allowedOnceClauses = [ + VersionedClause, + VersionedClause, VersionedClause, VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, - VersionedClause, VersionedClause, VersionedClause, + VersionedClause, ]; let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; let category = CA_Executable; @@ -2524,20 +2523,20 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { } def OMP_TeamsCoexecute : Directive<"teams coexecute"> { let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, VersionedClause, + VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, ]; let allowedOnceClauses = [ VersionedClause, VersionedClause, VersionedClause, - VersionedClause + VersionedClause, ]; - let leafConstructs = [OMP_Target, OMP_Teams]; + let leafConstructs = [OMP_Teams, OMP_Coexecute]; let category = CA_Executable; } def OMP_TeamsDistribute : Directive<"teams distribute"> { >From 8077858a88a2ffac2b7d726c1ae5d1f1edb64b67 Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 14 May 2025 14:48:52 +0530 Subject: [PATCH 06/13] [OpenMP] Use workdistribute 
instead of coexecute --- .../flang/Semantics/openmp-directive-sets.h | 24 ++--- flang/lib/Lower/OpenMP/OpenMP.cpp | 15 ++- flang/lib/Parser/openmp-parsers.cpp | 6 +- flang/lib/Semantics/resolve-directives.cpp | 12 +-- flang/test/Lower/OpenMP/coexecute.f90 | 59 ---------- flang/test/Lower/OpenMP/workdistribute.f90 | 59 ++++++++++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 101 ++++++++++-------- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 28 ++--- 8 files changed, 152 insertions(+), 152 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/coexecute.f90 create mode 100644 flang/test/Lower/OpenMP/workdistribute.f90 diff --git a/flang/include/flang/Semantics/openmp-directive-sets.h b/flang/include/flang/Semantics/openmp-directive-sets.h index 43f4e642b3d86..7ced6ed9b44d6 100644 --- a/flang/include/flang/Semantics/openmp-directive-sets.h +++ b/flang/include/flang/Semantics/openmp-directive-sets.h @@ -143,7 +143,7 @@ static const OmpDirectiveSet topTargetSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, - Directive::OMPD_target_teams_coexecute, + Directive::OMPD_target_teams_workdistribute, }; static const OmpDirectiveSet allTargetSet{topTargetSet}; @@ -173,7 +173,7 @@ static const OmpDirectiveSet topTeamsSet{ Directive::OMPD_teams_distribute_parallel_do_simd, Directive::OMPD_teams_distribute_simd, Directive::OMPD_teams_loop, - Directive::OMPD_teams_coexecute, + Directive::OMPD_teams_workdistribute, }; static const OmpDirectiveSet bottomTeamsSet{ @@ -189,14 +189,14 @@ static const OmpDirectiveSet allTeamsSet{ Directive::OMPD_target_teams_distribute_parallel_do_simd, Directive::OMPD_target_teams_distribute_simd, Directive::OMPD_target_teams_loop, - Directive::OMPD_target_teams_coexecute, + Directive::OMPD_target_teams_workdistribute, } | topTeamsSet, }; -static const OmpDirectiveSet allCoexecuteSet{ - Directive::OMPD_coexecute, - Directive::OMPD_teams_coexecute, - 
Directive::OMPD_target_teams_coexecute, +static const OmpDirectiveSet allWorkdistributeSet{ + Directive::OMPD_workdistribute, + Directive::OMPD_teams_workdistribute, + Directive::OMPD_target_teams_workdistribute, }; //===----------------------------------------------------------------------===// @@ -239,9 +239,9 @@ static const OmpDirectiveSet blockConstructSet{ Directive::OMPD_taskgroup, Directive::OMPD_teams, Directive::OMPD_workshare, - Directive::OMPD_target_teams_coexecute, - Directive::OMPD_teams_coexecute, - Directive::OMPD_coexecute, + Directive::OMPD_target_teams_workdistribute, + Directive::OMPD_teams_workdistribute, + Directive::OMPD_workdistribute, }; static const OmpDirectiveSet loopConstructSet{ @@ -306,7 +306,7 @@ static const OmpDirectiveSet workShareSet{ Directive::OMPD_scope, Directive::OMPD_sections, Directive::OMPD_single, - Directive::OMPD_coexecute, + Directive::OMPD_workdistribute, } | allDoSet, }; @@ -389,7 +389,7 @@ static const OmpDirectiveSet nestedReduceWorkshareAllowedSet{ }; static const OmpDirectiveSet nestedTeamsAllowedSet{ - Directive::OMPD_coexecute, + Directive::OMPD_workdistribute, Directive::OMPD_distribute, Directive::OMPD_distribute_parallel_do, Directive::OMPD_distribute_parallel_do_simd, diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 80612bd05ad97..42d04bceddb12 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2670,14 +2670,14 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } -static mlir::omp::CoexecuteOp -genCoexecuteOp(lower::AbstractConverter &converter, lower::SymMap &symTable, +static mlir::omp::WorkdistributeOp +genWorkdistributeOp(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, mlir::Location loc, const ConstructQueue &queue, ConstructQueue::const_iterator item) { - return genOpWithBody( + return 
genOpWithBody( OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, - llvm::omp::Directive::OMPD_coexecute), queue, item); + llvm::omp::Directive::OMPD_workdistribute), queue, item); } //===----------------------------------------------------------------------===// @@ -3939,16 +3939,15 @@ static void genOMPDispatch(lower::AbstractConverter &converter, newOp = genTeamsOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); break; - case llvm::omp::Directive::OMPD_coexecute: - newOp = genCoexecuteOp(converter, symTable, semaCtx, eval, loc, queue, item); - break; case llvm::omp::Directive::OMPD_tile: case llvm::omp::Directive::OMPD_unroll: { unsigned version = semaCtx.langOptions().OpenMPVersion; TODO(loc, "Unhandled loop directive (" + llvm::omp::getOpenMPDirectiveName(dir, version) + ")"); } - // case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_workdistribute: + newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, item); + break; case llvm::omp::Directive::OMPD_workshare: newOp = genWorkshareOp(converter, symTable, stmtCtx, semaCtx, eval, loc, queue, item); diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 591b1642baed3..5b5ee257edd1f 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1344,15 +1344,15 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_coexecute), + "TARGET TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - 
"TEAMS COEXECUTE" >> pure(llvm::omp::Directive::OMPD_teams_coexecute), + "TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_teams_workdistribute), "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), - "COEXECUTE" >> pure(llvm::omp::Directive::OMPD_coexecute)))) + "WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_workdistribute)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index ae297f204356a..4636508ac144d 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -1617,9 +1617,9 @@ bool OmpAttributeVisitor::Pre(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_taskgroup: case llvm::omp::Directive::OMPD_teams: - case llvm::omp::Directive::OMPD_coexecute: - case llvm::omp::Directive::OMPD_teams_coexecute: - case llvm::omp::Directive::OMPD_target_teams_coexecute: + case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_teams_workdistribute: + case llvm::omp::Directive::OMPD_target_teams_workdistribute: case llvm::omp::Directive::OMPD_workshare: case llvm::omp::Directive::OMPD_parallel_workshare: case llvm::omp::Directive::OMPD_target_teams: @@ -1653,9 +1653,9 @@ void OmpAttributeVisitor::Post(const parser::OpenMPBlockConstruct &x) { case llvm::omp::Directive::OMPD_target: case llvm::omp::Directive::OMPD_task: case llvm::omp::Directive::OMPD_teams: - case llvm::omp::Directive::OMPD_coexecute: - case llvm::omp::Directive::OMPD_teams_coexecute: - case llvm::omp::Directive::OMPD_target_teams_coexecute: + case llvm::omp::Directive::OMPD_workdistribute: + case llvm::omp::Directive::OMPD_teams_workdistribute: + case llvm::omp::Directive::OMPD_target_teams_workdistribute: case llvm::omp::Directive::OMPD_parallel_workshare: case 
llvm::omp::Directive::OMPD_target_teams: case llvm::omp::Directive::OMPD_target_parallel: { diff --git a/flang/test/Lower/OpenMP/coexecute.f90 b/flang/test/Lower/OpenMP/coexecute.f90 deleted file mode 100644 index b14f71f9bbbfa..0000000000000 --- a/flang/test/Lower/OpenMP/coexecute.f90 +++ /dev/null @@ -1,59 +0,0 @@ -! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s - -! CHECK-LABEL: func @_QPtarget_teams_coexecute -subroutine target_teams_coexecute() - ! CHECK: omp.target - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp target teams coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end target teams coexecute -end subroutine target_teams_coexecute - -! CHECK-LABEL: func @_QPteams_coexecute -subroutine teams_coexecute() - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp teams coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end teams coexecute -end subroutine teams_coexecute - -! CHECK-LABEL: func @_QPtarget_teams_coexecute_m -subroutine target_teams_coexecute_m() - ! CHECK: omp.target - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp target - !$omp teams - !$omp coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! CHECK: omp.terminator - ! CHECK: omp.terminator - !$omp end coexecute - !$omp end teams - !$omp end target -end subroutine target_teams_coexecute_m - -! CHECK-LABEL: func @_QPteams_coexecute_m -subroutine teams_coexecute_m() - ! CHECK: omp.teams - ! CHECK: omp.coexecute - !$omp teams - !$omp coexecute - ! CHECK: fir.call - call f1() - ! CHECK: omp.terminator - ! 
CHECK: omp.terminator - !$omp end coexecute - !$omp end teams -end subroutine teams_coexecute_m diff --git a/flang/test/Lower/OpenMP/workdistribute.f90 b/flang/test/Lower/OpenMP/workdistribute.f90 new file mode 100644 index 0000000000000..924205bb72e5e --- /dev/null +++ b/flang/test/Lower/OpenMP/workdistribute.f90 @@ -0,0 +1,59 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +! CHECK-LABEL: func @_QPtarget_teams_workdistribute +subroutine target_teams_workdistribute() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp target teams workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end target teams workdistribute +end subroutine target_teams_workdistribute + +! CHECK-LABEL: func @_QPteams_workdistribute +subroutine teams_workdistribute() + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp teams workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end teams workdistribute +end subroutine teams_workdistribute + +! CHECK-LABEL: func @_QPtarget_teams_workdistribute_m +subroutine target_teams_workdistribute_m() + ! CHECK: omp.target + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp target + !$omp teams + !$omp workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! CHECK: omp.terminator + ! CHECK: omp.terminator + !$omp end workdistribute + !$omp end teams + !$omp end target +end subroutine target_teams_workdistribute_m + +! CHECK-LABEL: func @_QPteams_workdistribute_m +subroutine teams_workdistribute_m() + ! CHECK: omp.teams + ! CHECK: omp.workdistribute + !$omp teams + !$omp workdistribute + ! CHECK: fir.call + call f1() + ! CHECK: omp.terminator + ! 
CHECK: omp.terminator + !$omp end workdistribute + !$omp end teams +end subroutine teams_workdistribute_m diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index 3f02b6534816f..c88a3049450de 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -1292,6 +1292,15 @@ def OMP_EndWorkshare : Directive<"end workshare"> { let category = OMP_Workshare.category; let languages = [L_Fortran]; } +def OMP_Workdistribute : Directive<"workdistribute"> { + let association = AS_Block; + let category = CA_Executable; +} +def OMP_EndWorkdistribute : Directive<"end workdistribute"> { + let leafConstructs = OMP_Workdistribute.leafConstructs; + let association = OMP_Workdistribute.association; + let category = OMP_Workdistribute.category; +} //===----------------------------------------------------------------------===// // Definitions of OpenMP compound directives @@ -2207,34 +2216,6 @@ def OMP_TargetTeams : Directive<"target teams"> { let leafConstructs = [OMP_Target, OMP_Teams]; let category = CA_Executable; } -def OMP_TargetTeamsCoexecute : Directive<"target teams coexecute"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let leafConstructs = [OMP_Target, OMP_Teams, OMP_Coexecute]; - let category = CA_Executable; -} def OMP_TargetTeamsDistribute : Directive<"target teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2457,6 +2438,34 @@ def OMP_TargetTeamsDistributeSimd : let leafConstructs = [OMP_Target, OMP_Teams, OMP_Distribute, OMP_Simd]; let category = CA_Executable; } +def 
OMP_TargetTeamsWorkdistribute : Directive<"target teams workdistribute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let leafConstructs = [OMP_Target, OMP_Teams, OMP_Workdistribute]; + let category = CA_Executable; +} def OMP_target_teams_loop : Directive<"target teams loop"> { let allowedClauses = [ VersionedClause, @@ -2521,24 +2530,6 @@ def OMP_TaskLoopSimd : Directive<"taskloop simd"> { let leafConstructs = [OMP_TaskLoop, OMP_Simd]; let category = CA_Executable; } -def OMP_TeamsCoexecute : Directive<"teams coexecute"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let allowedOnceClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; - let leafConstructs = [OMP_Teams, OMP_Coexecute]; - let category = CA_Executable; -} def OMP_TeamsDistribute : Directive<"teams distribute"> { let allowedClauses = [ VersionedClause, @@ -2726,3 +2717,21 @@ def OMP_teams_loop : Directive<"teams loop"> { let leafConstructs = [OMP_Teams, OMP_loop]; let category = CA_Executable; } +def OMP_TeamsWorkdistribute : Directive<"teams workdistribute"> { + let allowedClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let allowedOnceClauses = [ + VersionedClause, + VersionedClause, + VersionedClause, + VersionedClause, + ]; + let leafConstructs = [OMP_Teams, OMP_Workdistribute]; + let category = CA_Executable; +} diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 8061aa0209cc9..5e3ab0e908d21 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -326,38 +326,30 @@ def SectionsOp : OpenMP_Op<"sections", traits = [ } //===----------------------------------------------------------------------===// -// Coexecute Construct +// workdistribute Construct //===----------------------------------------------------------------------===// -def CoexecuteOp : OpenMP_Op<"coexecute"> { - let summary = "coexecute directive"; +def WorkdistributeOp : OpenMP_Op<"workdistribute"> { + let summary = "workdistribute directive"; let description = [{ - The coexecute construct specifies that the teams from the teams directive - this is nested in shall cooperate to execute the computation in this region. - There is no implicit barrier at the end as specified in the standard. - - TODO - We should probably change the defaut behaviour to have a barrier unless - nowait is specified, see below snippet. + workdistribute divides execution of the enclosed structured block into + separate units of work, each executed only once by each + initial thread in the league. ``` !$omp target teams - !$omp coexecute + !$omp workdistribute tmp = matmul(x, y) - !$omp end coexecute + !$omp end workdistribute a = tmp(0, 0) ! there is no implicit barrier! the matmul hasnt completed! 
- !$omp end target teams coexecute + !$omp end target teams workdistribute ``` }]; - let arguments = (ins UnitAttr:$nowait); - let regions = (region AnyRegion:$region); - let assemblyFormat = [{ - oilist(`nowait` $nowait) $region attr-dict - }]; + let assemblyFormat = "$region attr-dict"; } //===----------------------------------------------------------------------===// >From 085062f9ebac1079a720f614498c0b124eda8a51 Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 14 May 2025 16:17:14 +0530 Subject: [PATCH 07/13] [OpenMP] workdistribute trivial lowering Lowering logic inspired from ivanradanov coexeute lowering f56da1a207df4a40776a8570122a33f047074a3c --- .../include/flang/Optimizer/OpenMP/Passes.td | 4 + flang/lib/Optimizer/OpenMP/CMakeLists.txt | 1 + .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 101 ++++++++++++++++++ .../OpenMP/lower-workdistribute.mlir | 52 +++++++++ 4 files changed, 158 insertions(+) create mode 100644 flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute.mlir diff --git a/flang/include/flang/Optimizer/OpenMP/Passes.td b/flang/include/flang/Optimizer/OpenMP/Passes.td index 704faf0ccd856..743b6d381ed42 100644 --- a/flang/include/flang/Optimizer/OpenMP/Passes.td +++ b/flang/include/flang/Optimizer/OpenMP/Passes.td @@ -93,6 +93,10 @@ def LowerWorkshare : Pass<"lower-workshare", "::mlir::ModuleOp"> { let summary = "Lower workshare construct"; } +def LowerWorkdistribute : Pass<"lower-workdistribute", "::mlir::ModuleOp"> { + let summary = "Lower workdistribute construct"; +} + def GenericLoopConversionPass : Pass<"omp-generic-loop-conversion", "mlir::func::FuncOp"> { let summary = "Converts OpenMP generic `omp.loop` to semantically " diff --git a/flang/lib/Optimizer/OpenMP/CMakeLists.txt b/flang/lib/Optimizer/OpenMP/CMakeLists.txt index e31543328a9f9..cd746834741f9 100644 --- a/flang/lib/Optimizer/OpenMP/CMakeLists.txt +++ b/flang/lib/Optimizer/OpenMP/CMakeLists.txt @@ -7,6 +7,7 @@ 
add_flang_library(FlangOpenMPTransforms MapsForPrivatizedSymbols.cpp MapInfoFinalization.cpp MarkDeclareTarget.cpp + LowerWorkdistribute.cpp LowerWorkshare.cpp LowerNontemporal.cpp diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp new file mode 100644 index 0000000000000..75c9d2b0d494e --- /dev/null +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -0,0 +1,101 @@ +//===- LowerWorkshare.cpp - special cases for bufferization -------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements the lowering of omp.workdistribute. +// +//===----------------------------------------------------------------------===// + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +#include + +namespace flangomp { +#define GEN_PASS_DEF_LOWERWORKDISTRIBUTE +#include "flang/Optimizer/OpenMP/Passes.h.inc" +} // namespace flangomp + +#define DEBUG_TYPE "lower-workdistribute" + +using namespace mlir; + +namespace { + +struct WorkdistributeToSingle : public mlir::OpRewritePattern { +using OpRewritePattern::OpRewritePattern; +mlir::LogicalResult + matchAndRewrite(mlir::omp::WorkdistributeOp workdistribute, + mlir::PatternRewriter &rewriter) const override { + auto loc = workdistribute->getLoc(); + auto teams = llvm::dyn_cast(workdistribute->getParentOp()); + if (!teams) { + mlir::emitError(loc, "workdistribute not nested in teams\n"); + return mlir::failure(); + } + if (workdistribute.getRegion().getBlocks().size() != 
1) { + mlir::emitError(loc, "workdistribute with multiple blocks\n"); + return mlir::failure(); + } + if (teams.getRegion().getBlocks().size() != 1) { + mlir::emitError(loc, "teams with multiple blocks\n"); + return mlir::failure(); + } + if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { + mlir::emitError(loc, "teams with multiple nested ops\n"); + return mlir::failure(); + } + mlir::Block *workdistributeBlock = &workdistribute.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teams); + rewriter.eraseOp(teams); + return mlir::success(); + } +}; + +class LowerWorkdistributePass + : public flangomp::impl::LowerWorkdistributeBase { +public: + void runOnOperation() override { + mlir::MLIRContext &context = getContext(); + mlir::RewritePatternSet patterns(&context); + mlir::GreedyRewriteConfig config; + // prevent the pattern driver form merging blocks + config.setRegionSimplificationLevel( + mlir::GreedySimplifyRegionLevel::Disabled); + + patterns.insert(&context); + mlir::Operation *op = getOperation(); + if (mlir::failed(mlir::applyPatternsGreedily(op, std::move(patterns), config))) { + mlir::emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } + } +}; +} diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute.mlir new file mode 100644 index 0000000000000..34c8c3f01976d --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute.mlir @@ -0,0 +1,52 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @_QPtarget_simple() { +// CHECK: %[[VAL_0:.*]] = arith.constant 2 : i32 +// CHECK: %[[VAL_1:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFtarget_simpleEa"} +// CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[VAL_1]] {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_3:.*]] = fir.alloca 
!fir.box> {bindc_name = "simple_var", uniq_name = "_QFtarget_simpleEsimple_var"} +// CHECK: %[[VAL_4:.*]] = fir.zero_bits !fir.heap +// CHECK: %[[VAL_5:.*]] = fir.embox %[[VAL_4]] : (!fir.heap) -> !fir.box> +// CHECK: fir.store %[[VAL_5]] to %[[VAL_3]] : !fir.ref>> +// CHECK: %[[VAL_6:.*]]:2 = hlfir.declare %[[VAL_3]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +// CHECK: hlfir.assign %[[VAL_0]] to %[[VAL_2]]#0 : i32, !fir.ref +// CHECK: %[[VAL_7:.*]] = omp.map.info var_ptr(%[[VAL_2]]#1 : !fir.ref, i32) map_clauses(to) capture(ByRef) -> !fir.ref {name = "a"} +// CHECK: omp.target map_entries(%[[VAL_7]] -> %[[VAL_8:.*]] : !fir.ref) private(@_QFtarget_simpleEsimple_var_private_ref_box_heap_i32 %[[VAL_6]]#0 -> %[[VAL_9:.*]] : !fir.ref>>) { +// CHECK: %[[VAL_10:.*]] = arith.constant 10 : i32 +// CHECK: %[[VAL_11:.*]]:2 = hlfir.declare %[[VAL_8]] {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_12:.*]]:2 = hlfir.declare %[[VAL_9]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) +// CHECK: %[[VAL_13:.*]] = fir.load %[[VAL_11]]#0 : !fir.ref +// CHECK: %[[VAL_14:.*]] = arith.addi %[[VAL_13]], %[[VAL_10]] : i32 +// CHECK: hlfir.assign %[[VAL_14]] to %[[VAL_12]]#0 realloc : i32, !fir.ref>> +// CHECK: omp.terminator +// CHECK: } +// CHECK: return +// CHECK: } +func.func @_QPtarget_simple() { + %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFtarget_simpleEa"} + %1:2 = hlfir.declare %0 {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %2 = fir.alloca !fir.box> {bindc_name = "simple_var", uniq_name = "_QFtarget_simpleEsimple_var"} + %3 = fir.zero_bits !fir.heap + %4 = fir.embox %3 : (!fir.heap) -> !fir.box> + fir.store %4 to %2 : !fir.ref>> + %5:2 = hlfir.declare %2 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> 
(!fir.ref>>, !fir.ref>>) + %c2_i32 = arith.constant 2 : i32 + hlfir.assign %c2_i32 to %1#0 : i32, !fir.ref + %6 = omp.map.info var_ptr(%1#1 : !fir.ref, i32) map_clauses(to) capture(ByRef) -> !fir.ref {name = "a"} + omp.target map_entries(%6 -> %arg0 : !fir.ref) private(@_QFtarget_simpleEsimple_var_private_ref_box_heap_i32 %5#0 -> %arg1 : !fir.ref>>){ + omp.teams { + omp.workdistribute { + %11:2 = hlfir.declare %arg0 {uniq_name = "_QFtarget_simpleEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) + %12:2 = hlfir.declare %arg1 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFtarget_simpleEsimple_var"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) + %c10_i32 = arith.constant 10 : i32 + %13 = fir.load %11#0 : !fir.ref + %14 = arith.addi %c10_i32, %13 : i32 + hlfir.assign %14 to %12#0 realloc : i32, !fir.ref>> + omp.terminator + } + omp.terminator + } + omp.terminator + } + return +} \ No newline at end of file >From c9b63efe85f7aed781a4a0fd7d0888b595f2a520 Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 14 May 2025 19:29:33 +0530 Subject: [PATCH 08/13] [Flang][OpenMP] Add workdistribute lower pass to pipeline --- flang/lib/Optimizer/Passes/Pipelines.cpp | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..15983f80c1e4b 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -278,8 +278,10 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP, addNestedPassToAllTopLevelOperations( pm, hlfir::createInlineHLFIRAssign); pm.addPass(hlfir::createConvertHLFIRtoFIR()); - if (enableOpenMP) + if (enableOpenMP) { pm.addPass(flangomp::createLowerWorkshare()); + pm.addPass(flangomp::createLowerWorkdistribute()); + } } /// Create a pass pipeline for handling certain OpenMP transformations needed >From 048c3f22d55248a21e53ee3f4be2c0b07b500039 Mon Sep 17 00:00:00 2001 From: skc7 Date: Thu, 15 May 2025 
16:39:21 +0530 Subject: [PATCH 09/13] [Flang][OpenMP] Add FissionWorkdistribute lowering. Fission logic inspired from ivanradanov implementation : c97eca4010e460aac5a3d795614ca0980bce4565 --- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 233 ++++++++++++++---- .../OpenMP/lower-workdistribute-fission.mlir | 60 +++++ ...ir => lower-workdistribute-to-single.mlir} | 2 +- 3 files changed, 243 insertions(+), 52 deletions(-) create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir rename flang/test/Transforms/OpenMP/{lower-workdistribute.mlir => lower-workdistribute-to-single.mlir} (99%) diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index 75c9d2b0d494e..f799202be2645 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -10,31 +10,26 @@ // //===----------------------------------------------------------------------===// -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include +#include "flang/Optimizer/Dialect/FIRDialect.h" +#include "flang/Optimizer/Dialect/FIROps.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/Transforms/Passes.h" +#include "flang/Optimizer/HLFIR/Passes.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/Builders.h" +#include "mlir/IR/Value.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" #include #include -#include -#include -#include +#include +#include #include +#include #include -#include #include -#include -#include #include #include -#include "mlir/Transforms/GreedyPatternRewriteDriver.h" - +#include #include namespace flangomp { @@ -48,52 +43,188 @@ using namespace mlir; namespace { -struct WorkdistributeToSingle : public mlir::OpRewritePattern { -using OpRewritePattern::OpRewritePattern; -mlir::LogicalResult - matchAndRewrite(mlir::omp::WorkdistributeOp workdistribute, - 
mlir::PatternRewriter &rewriter) const override { - auto loc = workdistribute->getLoc(); - auto teams = llvm::dyn_cast(workdistribute->getParentOp()); - if (!teams) { - mlir::emitError(loc, "workdistribute not nested in teams\n"); - return mlir::failure(); - } - if (workdistribute.getRegion().getBlocks().size() != 1) { - mlir::emitError(loc, "workdistribute with multiple blocks\n"); - return mlir::failure(); + +template +static T getPerfectlyNested(Operation *op) { + if (op->getNumRegions() != 1) + return nullptr; + auto &region = op->getRegion(0); + if (region.getBlocks().size() != 1) + return nullptr; + auto *block = &region.front(); + auto *firstOp = &block->front(); + if (auto nested = dyn_cast(firstOp)) + if (firstOp->getNextNode() == block->getTerminator()) + return nested; + return nullptr; +} + +/// This is the single source of truth about whether we should parallelize an +/// operation nested in an omp.workdistribute region. +static bool shouldParallelize(Operation *op) { + // Currently we cannot parallelize operations with results that have uses + if (llvm::any_of(op->getResults(), + [](OpResult v) -> bool { return !v.use_empty(); })) + return false; + // We will parallelize unordered loops - these come from array syntax + if (auto loop = dyn_cast(op)) { + auto unordered = loop.getUnordered(); + if (!unordered) + return false; + return *unordered; + } + if (auto callOp = dyn_cast(op)) { + auto callee = callOp.getCallee(); + if (!callee) + return false; + auto *func = op->getParentOfType().lookupSymbol(*callee); + // TODO need to insert a check here whether it is a call we can actually + // parallelize currently + if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) + return true; + return false; + } + // We cannot parallise anything else + return false; +} + +struct WorkdistributeToSingle : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const 
override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); } - if (teams.getRegion().getBlocks().size() != 1) { - mlir::emitError(loc, "teams with multiple blocks\n"); - return mlir::failure(); + + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + workdistributeOp.emitWarning("unable to parallelize coexecute"); + return success(); + } +}; + +/// If B() and D() are parallelizable, +/// +/// omp.teams { +/// omp.workdistribute { +/// A() +/// B() +/// C() +/// D() +/// E() +/// } +/// } +/// +/// becomes +/// +/// A() +/// omp.teams { +/// omp.workdistribute { +/// B() +/// } +/// } +/// C() +/// omp.teams { +/// omp.workdistribute { +/// D() +/// } +/// } +/// E() + +struct FissionWorkdistribute + : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult + matchAndRewrite(omp::WorkdistributeOp workdistribute, + PatternRewriter &rewriter) const override { + auto loc = workdistribute->getLoc(); + auto teams = dyn_cast(workdistribute->getParentOp()); + if (!teams) { + emitError(loc, "workdistribute not nested in teams\n"); + return failure(); + } + if (workdistribute.getRegion().getBlocks().size() != 1) { + emitError(loc, "workdistribute with multiple blocks\n"); + return failure(); + } + if (teams.getRegion().getBlocks().size() != 1) { + emitError(loc, "teams with multiple blocks\n"); + return failure(); + } + if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { + emitError(loc, "teams with multiple nested ops\n"); + return failure(); + } + + auto *teamsBlock = &teams.getRegion().front(); + + // While we have unhandled operations in the original workdistribute + auto *workdistributeBlock = &workdistribute.getRegion().front(); + 
auto *terminator = workdistributeBlock->getTerminator(); + bool changed = false; + while (&workdistributeBlock->front() != terminator) { + rewriter.setInsertionPoint(teams); + IRMapping mapping; + llvm::SmallVector hoisted; + Operation *parallelize = nullptr; + for (auto &op : workdistribute.getOps()) { + if (&op == terminator) { + break; } - if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { - mlir::emitError(loc, "teams with multiple nested ops\n"); - return mlir::failure(); + if (shouldParallelize(&op)) { + parallelize = &op; + break; + } else { + rewriter.clone(op, mapping); + hoisted.push_back(&op); + changed = true; } - mlir::Block *workdistributeBlock = &workdistribute.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teams); - rewriter.eraseOp(teams); - return mlir::success(); + } + + for (auto *op : hoisted) + rewriter.replaceOp(op, mapping.lookup(op)); + + if (parallelize && hoisted.empty() && + parallelize->getNextNode() == terminator) + break; + if (parallelize) { + auto newTeams = rewriter.cloneWithoutRegions(teams); + auto *newTeamsBlock = rewriter.createBlock( + &newTeams.getRegion(), newTeams.getRegion().begin(), {}, {}); + for (auto arg : teamsBlock->getArguments()) + newTeamsBlock->addArgument(arg.getType(), arg.getLoc()); + auto newWorkdistribute = rewriter.create(loc); + rewriter.create(loc); + rewriter.createBlock(&newWorkdistribute.getRegion(), + newWorkdistribute.getRegion().begin(), {}, {}); + auto *cloned = rewriter.clone(*parallelize); + rewriter.replaceOp(parallelize, cloned); + rewriter.create(loc); + changed = true; + } } + return success(changed); + } }; class LowerWorkdistributePass : public flangomp::impl::LowerWorkdistributeBase { public: void runOnOperation() override { - mlir::MLIRContext &context = getContext(); - mlir::RewritePatternSet patterns(&context); - mlir::GreedyRewriteConfig config; + MLIRContext &context = getContext(); 
+ RewritePatternSet patterns(&context); + GreedyRewriteConfig config; // prevent the pattern driver form merging blocks config.setRegionSimplificationLevel( - mlir::GreedySimplifyRegionLevel::Disabled); + GreedySimplifyRegionLevel::Disabled); - patterns.insert(&context); - mlir::Operation *op = getOperation(); - if (mlir::failed(mlir::applyPatternsGreedily(op, std::move(patterns), config))) { - mlir::emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + patterns.insert(&context); + Operation *op = getOperation(); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); } } diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir new file mode 100644 index 0000000000000..ea03a10dd3d44 --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir @@ -0,0 +1,60 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @test_fission_workdistribute({{.*}}) { +// CHECK: %[[VAL_0:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_2:.*]] = arith.constant 9 : index +// CHECK: %[[VAL_3:.*]] = arith.constant 5.000000e+00 : f32 +// CHECK: fir.store %[[VAL_3]] to %[[ARG2:.*]] : !fir.ref +// CHECK: fir.do_loop %[[VAL_4:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] unordered { +// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref +// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: } +// CHECK: fir.call @regular_side_effect_func(%[[ARG2:.*]]) : (!fir.ref) -> () +// CHECK: fir.call @my_fir_parallel_runtime_func(%[[ARG3:.*]]) : (!fir.ref) -> () +// CHECK: fir.do_loop %[[VAL_8:.*]] = 
%[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] { +// CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_8]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_3]] to %[[VAL_9]] : !fir.ref +// CHECK: } +// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2:.*]] : !fir.ref +// CHECK: fir.store %[[VAL_10]] to %[[ARG3:.*]] : !fir.ref +// CHECK: return +// CHECK: } +module { +func.func @regular_side_effect_func(%arg0: !fir.ref) { + return +} +func.func @my_fir_parallel_runtime_func(%arg0: !fir.ref) attributes {fir.runtime} { + return +} +func.func @test_fission_workdistribute(%arr1: !fir.ref>, %arr2: !fir.ref>, %scalar_ref1: !fir.ref, %scalar_ref2: !fir.ref) { + %c0_idx = arith.constant 0 : index + %c1_idx = arith.constant 1 : index + %c9_idx = arith.constant 9 : index + %float_val = arith.constant 5.0 : f32 + omp.teams { + omp.workdistribute { + fir.store %float_val to %scalar_ref1 : !fir.ref + fir.do_loop %iv = %c0_idx to %c9_idx step %c1_idx unordered { + %elem_ptr_arr1 = fir.coordinate_of %arr1, %iv : (!fir.ref>, index) -> !fir.ref + %loaded_val_loop1 = fir.load %elem_ptr_arr1 : !fir.ref + %elem_ptr_arr2 = fir.coordinate_of %arr2, %iv : (!fir.ref>, index) -> !fir.ref + fir.store %loaded_val_loop1 to %elem_ptr_arr2 : !fir.ref + } + fir.call @regular_side_effect_func(%scalar_ref1) : (!fir.ref) -> () + fir.call @my_fir_parallel_runtime_func(%scalar_ref2) : (!fir.ref) -> () + fir.do_loop %jv = %c0_idx to %c9_idx step %c1_idx { + %elem_ptr_ordered_loop = fir.coordinate_of %arr1, %jv : (!fir.ref>, index) -> !fir.ref + fir.store %float_val to %elem_ptr_ordered_loop : !fir.ref + } + %loaded_for_hoist = fir.load %scalar_ref1 : !fir.ref + fir.store %loaded_for_hoist to %scalar_ref2 : !fir.ref + omp.terminator + } + omp.terminator + } + return +} +} diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir similarity index 99% rename from flang/test/Transforms/OpenMP/lower-workdistribute.mlir 
rename to flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir index 34c8c3f01976d..0cc2aeded2532 100644 --- a/flang/test/Transforms/OpenMP/lower-workdistribute.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-to-single.mlir @@ -49,4 +49,4 @@ func.func @_QPtarget_simple() { omp.terminator } return -} \ No newline at end of file +} >From 5b30d3dcb80cb4cef546f5bfdf3aa389f527d07d Mon Sep 17 00:00:00 2001 From: skc7 Date: Sun, 18 May 2025 12:37:53 +0530 Subject: [PATCH 10/13] [OpenMP][Flang] Lower teams workdistribute do_loop to wsloop. Logic inspired from ivanradanov commit 5682e9ea7fcba64693f7cfdc0f1970fab2d7d4ae --- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 177 +++++++++++++++--- .../OpenMP/lower-workdistribute-doloop.mlir | 28 +++ .../OpenMP/lower-workdistribute-fission.mlir | 22 ++- 3 files changed, 193 insertions(+), 34 deletions(-) create mode 100644 flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index f799202be2645..de208a8190650 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -6,18 +6,22 @@ // //===----------------------------------------------------------------------===// // -// This file implements the lowering of omp.workdistribute. +// This file implements the lowering and optimisations of omp.workdistribute. 
// //===----------------------------------------------------------------------===// +#include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Dialect/FIRDialect.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Optimizer/HLFIR/Passes.h" +#include "flang/Optimizer/OpenMP/Utils.h" +#include "mlir/Analysis/SliceAnalysis.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Value.h" +#include "mlir/Transforms/DialectConversion.h" #include "mlir/Transforms/GreedyPatternRewriteDriver.h" #include #include @@ -29,6 +33,7 @@ #include #include #include +#include "mlir/Transforms/RegionUtils.h" #include #include @@ -87,25 +92,6 @@ static bool shouldParallelize(Operation *op) { return false; } -struct WorkdistributeToSingle : public OpRewritePattern { - using OpRewritePattern::OpRewritePattern; - LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, - PatternRewriter &rewriter) const override { - auto workdistributeOp = getPerfectlyNested(teamsOp); - if (!workdistributeOp) { - LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); - return failure(); - } - - Block *workdistributeBlock = &workdistributeOp.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); - rewriter.eraseOp(teamsOp); - workdistributeOp.emitWarning("unable to parallelize coexecute"); - return success(); - } -}; - /// If B() and D() are parallelizable, /// /// omp.teams { @@ -210,22 +196,161 @@ struct FissionWorkdistribute } }; +static void +genLoopNestClauseOps(mlir::Location loc, + mlir::PatternRewriter &rewriter, + fir::DoLoopOp loop, + mlir::omp::LoopNestOperands &loopNestClauseOps) { + assert(loopNestClauseOps.loopLowerBounds.empty() && + "Loop nest bounds were already emitted!"); + 
loopNestClauseOps.loopLowerBounds.push_back(loop.getLowerBound()); + loopNestClauseOps.loopUpperBounds.push_back(loop.getUpperBound()); + loopNestClauseOps.loopSteps.push_back(loop.getStep()); + loopNestClauseOps.loopInclusive = rewriter.getUnitAttr(); +} + +static void +genWsLoopOp(mlir::PatternRewriter &rewriter, + fir::DoLoopOp doLoop, + const mlir::omp::LoopNestOperands &clauseOps) { + + auto wsloopOp = rewriter.create(doLoop.getLoc()); + rewriter.createBlock(&wsloopOp.getRegion()); + + auto loopNestOp = + rewriter.create(doLoop.getLoc(), clauseOps); + + // Clone the loop's body inside the loop nest construct using the + // mapped values. + rewriter.cloneRegionBefore(doLoop.getRegion(), loopNestOp.getRegion(), + loopNestOp.getRegion().begin()); + Block *clonedBlock = &loopNestOp.getRegion().back(); + mlir::Operation *terminatorOp = clonedBlock->getTerminator(); + + // Erase fir.result op of do loop and create yield op. + if (auto resultOp = dyn_cast(terminatorOp)) { + rewriter.setInsertionPoint(terminatorOp); + rewriter.create(doLoop->getLoc()); + rewriter.eraseOp(terminatorOp); + } + return; +} + +/// If fir.do_loop id present inside teams workdistribute +/// +/// omp.teams { +/// omp.workdistribute { +/// fir.do_loop unoredered { +/// ... +/// } +/// } +/// } +/// +/// Then, its lowered to +/// +/// omp.teams { +/// omp.workdistribute { +/// omp.parallel { +/// omp.wsloop { +/// omp.loop_nest +/// ... 
+/// } +/// } +/// } +/// } +/// } + +struct TeamsWorkdistributeLowering : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const override { + auto teamsLoc = teamsOp->getLoc(); + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + assert(teamsOp.getReductionVars().empty()); + + auto doLoop = getPerfectlyNested(workdistributeOp); + if (doLoop && shouldParallelize(doLoop)) { + + auto parallelOp = rewriter.create(teamsLoc); + rewriter.createBlock(&parallelOp.getRegion()); + rewriter.setInsertionPoint(rewriter.create(doLoop.getLoc())); + + mlir::omp::LoopNestOperands loopNestClauseOps; + genLoopNestClauseOps(doLoop.getLoc(), rewriter, doLoop, + loopNestClauseOps); + + genWsLoopOp(rewriter, doLoop, loopNestClauseOps); + rewriter.setInsertionPoint(doLoop); + rewriter.eraseOp(doLoop); + return success(); + } + return failure(); + } +}; + + +/// If A() and B () are present inside teams workdistribute +/// +/// omp.teams { +/// omp.workdistribute { +/// A() +/// B() +/// } +/// } +/// +/// Then, its lowered to +/// +/// A() +/// B() +/// + +struct TeamsWorkdistributeToSingle : public OpRewritePattern { + using OpRewritePattern::OpRewritePattern; + LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + PatternRewriter &rewriter) const override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + return success(); + } +}; + class LowerWorkdistributePass : public flangomp::impl::LowerWorkdistributeBase { 
public: void runOnOperation() override { MLIRContext &context = getContext(); - RewritePatternSet patterns(&context); GreedyRewriteConfig config; // prevent the pattern driver form merging blocks config.setRegionSimplificationLevel( GreedySimplifyRegionLevel::Disabled); - - patterns.insert(&context); + Operation *op = getOperation(); - if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { - emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); - signalPassFailure(); + { + RewritePatternSet patterns(&context); + patterns.insert(&context); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } + } + { + RewritePatternSet patterns(&context); + patterns.insert(&context); + if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { + emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); + signalPassFailure(); + } } } }; diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir new file mode 100644 index 0000000000000..666bdb3ced647 --- /dev/null +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir @@ -0,0 +1,28 @@ +// RUN: fir-opt --lower-workdistribute %s | FileCheck %s + +// CHECK-LABEL: func.func @x({{.*}}) +// CHECK: %[[VAL_0:.*]] = arith.constant 0 : index +// CHECK: omp.parallel { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_1:.*]]) : index = (%[[ARG0:.*]]) to (%[[ARG1:.*]]) inclusive step (%[[ARG2:.*]]) { +// CHECK: fir.store %[[VAL_0]] to %[[ARG4:.*]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } +// CHECK: omp.terminator +// CHECK: } +// CHECK: return +// CHECK: } +func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref) { + omp.teams { + omp.workdistribute { + fir.do_loop %iv = %lb to %ub step %step unordered { + %zero = arith.constant 0 : index + fir.store %zero to %addr : !fir.ref + } + 
omp.terminator + } + omp.terminator + } + return +} \ No newline at end of file diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir index ea03a10dd3d44..cf50d135d01ec 100644 --- a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir @@ -6,20 +6,26 @@ // CHECK: %[[VAL_2:.*]] = arith.constant 9 : index // CHECK: %[[VAL_3:.*]] = arith.constant 5.000000e+00 : f32 // CHECK: fir.store %[[VAL_3]] to %[[ARG2:.*]] : !fir.ref -// CHECK: fir.do_loop %[[VAL_4:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] unordered { -// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref -// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: omp.parallel { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_4:.*]]) : index = (%[[VAL_0]]) to (%[[VAL_2]]) inclusive step (%[[VAL_1]]) { +// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref +// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } +// CHECK: omp.terminator // CHECK: } // CHECK: fir.call @regular_side_effect_func(%[[ARG2:.*]]) : (!fir.ref) -> () // CHECK: fir.call @my_fir_parallel_runtime_func(%[[ARG3:.*]]) : (!fir.ref) -> () // CHECK: fir.do_loop %[[VAL_8:.*]] = %[[VAL_0]] to %[[VAL_2]] step %[[VAL_1]] { -// CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_8]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0]], %[[VAL_8]] : (!fir.ref>, index) 
-> !fir.ref // CHECK: fir.store %[[VAL_3]] to %[[VAL_9]] : !fir.ref // CHECK: } -// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2:.*]] : !fir.ref -// CHECK: fir.store %[[VAL_10]] to %[[ARG3:.*]] : !fir.ref +// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2]] : !fir.ref +// CHECK: fir.store %[[VAL_10]] to %[[ARG3]] : !fir.ref // CHECK: return // CHECK: } module { >From df65bd53111948abf6f9c2e1e0b8e27aa5e01946 Mon Sep 17 00:00:00 2001 From: skc7 Date: Mon, 19 May 2025 15:33:53 +0530 Subject: [PATCH 11/13] clang format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 18 +-- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 108 +++++++++--------- flang/lib/Parser/openmp-parsers.cpp | 6 +- .../OpenMP/lower-workdistribute-doloop.mlir | 2 +- 4 files changed, 67 insertions(+), 67 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 42d04bceddb12..ebf0710ab4feb 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2670,14 +2670,15 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } -static mlir::omp::WorkdistributeOp -genWorkdistributeOp(lower::AbstractConverter &converter, lower::SymMap &symTable, - semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - mlir::Location loc, const ConstructQueue &queue, - ConstructQueue::const_iterator item) { +static mlir::omp::WorkdistributeOp genWorkdistributeOp( + lower::AbstractConverter &converter, lower::SymMap &symTable, + semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, + mlir::Location loc, const ConstructQueue &queue, + ConstructQueue::const_iterator item) { return genOpWithBody( - OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, - llvm::omp::Directive::OMPD_workdistribute), queue, item); + OpWithBodyGenInfo(converter, symTable, semaCtx, loc, eval, + llvm::omp::Directive::OMPD_workdistribute), + queue, item); } 
//===----------------------------------------------------------------------===// @@ -3946,7 +3947,8 @@ static void genOMPDispatch(lower::AbstractConverter &converter, llvm::omp::getOpenMPDirectiveName(dir, version) + ")"); } case llvm::omp::Directive::OMPD_workdistribute: - newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, item); + newOp = genWorkdistributeOp(converter, symTable, semaCtx, eval, loc, queue, + item); break; case llvm::omp::Directive::OMPD_workshare: newOp = genWorkshareOp(converter, symTable, stmtCtx, semaCtx, eval, loc, diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index de208a8190650..f75d4d1988fd2 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -14,15 +14,16 @@ #include "flang/Optimizer/Dialect/FIRDialect.h" #include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/Dialect/FIRType.h" -#include "flang/Optimizer/Transforms/Passes.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Utils.h" +#include "flang/Optimizer/Transforms/Passes.h" #include "mlir/Analysis/SliceAnalysis.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/IR/Builders.h" #include "mlir/IR/Value.h" #include "mlir/Transforms/DialectConversion.h" #include "mlir/Transforms/GreedyPatternRewriteDriver.h" +#include "mlir/Transforms/RegionUtils.h" #include #include #include @@ -33,7 +34,6 @@ #include #include #include -#include "mlir/Transforms/RegionUtils.h" #include #include @@ -66,30 +66,30 @@ static T getPerfectlyNested(Operation *op) { /// This is the single source of truth about whether we should parallelize an /// operation nested in an omp.workdistribute region. 
static bool shouldParallelize(Operation *op) { - // Currently we cannot parallelize operations with results that have uses - if (llvm::any_of(op->getResults(), - [](OpResult v) -> bool { return !v.use_empty(); })) + // Currently we cannot parallelize operations with results that have uses + if (llvm::any_of(op->getResults(), + [](OpResult v) -> bool { return !v.use_empty(); })) + return false; + // We will parallelize unordered loops - these come from array syntax + if (auto loop = dyn_cast(op)) { + auto unordered = loop.getUnordered(); + if (!unordered) return false; - // We will parallelize unordered loops - these come from array syntax - if (auto loop = dyn_cast(op)) { - auto unordered = loop.getUnordered(); - if (!unordered) - return false; - return *unordered; - } - if (auto callOp = dyn_cast(op)) { - auto callee = callOp.getCallee(); - if (!callee) - return false; - auto *func = op->getParentOfType().lookupSymbol(*callee); - // TODO need to insert a check here whether it is a call we can actually - // parallelize currently - if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) - return true; + return *unordered; + } + if (auto callOp = dyn_cast(op)) { + auto callee = callOp.getCallee(); + if (!callee) return false; - } - // We cannot parallise anything else + auto *func = op->getParentOfType().lookupSymbol(*callee); + // TODO need to insert a check here whether it is a call we can actually + // parallelize currently + if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) + return true; return false; + } + // We cannot parallise anything else + return false; } /// If B() and D() are parallelizable, @@ -120,12 +120,10 @@ static bool shouldParallelize(Operation *op) { /// } /// E() -struct FissionWorkdistribute - : public OpRewritePattern { +struct FissionWorkdistribute : public OpRewritePattern { using OpRewritePattern::OpRewritePattern; - LogicalResult - matchAndRewrite(omp::WorkdistributeOp workdistribute, - PatternRewriter &rewriter) 
const override { + LogicalResult matchAndRewrite(omp::WorkdistributeOp workdistribute, + PatternRewriter &rewriter) const override { auto loc = workdistribute->getLoc(); auto teams = dyn_cast(workdistribute->getParentOp()); if (!teams) { @@ -185,7 +183,7 @@ struct FissionWorkdistribute auto newWorkdistribute = rewriter.create(loc); rewriter.create(loc); rewriter.createBlock(&newWorkdistribute.getRegion(), - newWorkdistribute.getRegion().begin(), {}, {}); + newWorkdistribute.getRegion().begin(), {}, {}); auto *cloned = rewriter.clone(*parallelize); rewriter.replaceOp(parallelize, cloned); rewriter.create(loc); @@ -197,8 +195,7 @@ struct FissionWorkdistribute }; static void -genLoopNestClauseOps(mlir::Location loc, - mlir::PatternRewriter &rewriter, +genLoopNestClauseOps(mlir::Location loc, mlir::PatternRewriter &rewriter, fir::DoLoopOp loop, mlir::omp::LoopNestOperands &loopNestClauseOps) { assert(loopNestClauseOps.loopLowerBounds.empty() && @@ -209,10 +206,8 @@ genLoopNestClauseOps(mlir::Location loc, loopNestClauseOps.loopInclusive = rewriter.getUnitAttr(); } -static void -genWsLoopOp(mlir::PatternRewriter &rewriter, - fir::DoLoopOp doLoop, - const mlir::omp::LoopNestOperands &clauseOps) { +static void genWsLoopOp(mlir::PatternRewriter &rewriter, fir::DoLoopOp doLoop, + const mlir::omp::LoopNestOperands &clauseOps) { auto wsloopOp = rewriter.create(doLoop.getLoc()); rewriter.createBlock(&wsloopOp.getRegion()); @@ -236,7 +231,7 @@ genWsLoopOp(mlir::PatternRewriter &rewriter, return; } -/// If fir.do_loop id present inside teams workdistribute +/// If fir.do_loop is present inside teams workdistribute /// /// omp.teams { /// omp.workdistribute { @@ -246,7 +241,7 @@ genWsLoopOp(mlir::PatternRewriter &rewriter, /// } /// } /// -/// Then, its lowered to +/// Then, its lowered to /// /// omp.teams { /// omp.workdistribute { @@ -277,7 +272,8 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { auto parallelOp = rewriter.create(teamsLoc); 
rewriter.createBlock(&parallelOp.getRegion()); - rewriter.setInsertionPoint(rewriter.create(doLoop.getLoc())); + rewriter.setInsertionPoint( + rewriter.create(doLoop.getLoc())); mlir::omp::LoopNestOperands loopNestClauseOps; genLoopNestClauseOps(doLoop.getLoc(), rewriter, doLoop, @@ -292,7 +288,6 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { } }; - /// If A() and B () are present inside teams workdistribute /// /// omp.teams { @@ -311,17 +306,17 @@ struct TeamsWorkdistributeLowering : public OpRewritePattern { struct TeamsWorkdistributeToSingle : public OpRewritePattern { using OpRewritePattern::OpRewritePattern; LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, - PatternRewriter &rewriter) const override { - auto workdistributeOp = getPerfectlyNested(teamsOp); - if (!workdistributeOp) { - LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); - return failure(); - } - Block *workdistributeBlock = &workdistributeOp.getRegion().front(); - rewriter.eraseOp(workdistributeBlock->getTerminator()); - rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); - rewriter.eraseOp(teamsOp); - return success(); + PatternRewriter &rewriter) const override { + auto workdistributeOp = getPerfectlyNested(teamsOp); + if (!workdistributeOp) { + LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); + return failure(); + } + Block *workdistributeBlock = &workdistributeOp.getRegion().front(); + rewriter.eraseOp(workdistributeBlock->getTerminator()); + rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); + rewriter.eraseOp(teamsOp); + return success(); } }; @@ -332,13 +327,13 @@ class LowerWorkdistributePass MLIRContext &context = getContext(); GreedyRewriteConfig config; // prevent the pattern driver form merging blocks - config.setRegionSimplificationLevel( - GreedySimplifyRegionLevel::Disabled); - + config.setRegionSimplificationLevel(GreedySimplifyRegionLevel::Disabled); + Operation *op = getOperation(); { RewritePatternSet 
patterns(&context); - patterns.insert(&context); + patterns.insert( + &context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); @@ -346,7 +341,8 @@ class LowerWorkdistributePass } { RewritePatternSet patterns(&context); - patterns.insert(&context); + patterns.insert( + &context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); @@ -354,4 +350,4 @@ class LowerWorkdistributePass } } }; -} +} // namespace diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index 5b5ee257edd1f..dc25adfe28c1d 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1344,12 +1344,14 @@ TYPE_PARSER( "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), + "TARGET TEAMS WORKDISTRIBUTE" >> + pure(llvm::omp::Directive::OMPD_target_teams_workdistribute), "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), "TARGET" >> pure(llvm::omp::Directive::OMPD_target), "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_teams_workdistribute), + "TEAMS WORKDISTRIBUTE" >> + pure(llvm::omp::Directive::OMPD_teams_workdistribute), "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare), "WORKDISTRIBUTE" >> pure(llvm::omp::Directive::OMPD_workdistribute)))) diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir index 666bdb3ced647..9fb970246b90c 100644 --- 
a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir @@ -25,4 +25,4 @@ func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref< omp.terminator } return -} \ No newline at end of file +} >From 60351b6a73ed19de8531ac63336e17be7536cf48 Mon Sep 17 00:00:00 2001 From: skc7 Date: Tue, 27 May 2025 16:24:26 +0530 Subject: [PATCH 12/13] update to workdistribute lowering --- .../Optimizer/OpenMP/LowerWorkdistribute.cpp | 194 ++++++++++-------- .../OpenMP/lower-workdistribute-doloop.mlir | 19 +- .../OpenMP/lower-workdistribute-fission.mlir | 31 +-- 3 files changed, 139 insertions(+), 105 deletions(-) diff --git a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp index f75d4d1988fd2..c9c7827ace217 100644 --- a/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp +++ b/flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp @@ -48,25 +48,21 @@ using namespace mlir; namespace { -template -static T getPerfectlyNested(Operation *op) { - if (op->getNumRegions() != 1) - return nullptr; - auto &region = op->getRegion(0); - if (region.getBlocks().size() != 1) - return nullptr; - auto *block = &region.front(); - auto *firstOp = &block->front(); - if (auto nested = dyn_cast(firstOp)) - if (firstOp->getNextNode() == block->getTerminator()) - return nested; - return nullptr; +static bool isRuntimeCall(Operation *op) { + if (auto callOp = dyn_cast(op)) { + auto callee = callOp.getCallee(); + if (!callee) + return false; + auto *func = op->getParentOfType().lookupSymbol(*callee); + if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) + return true; + } + return false; } /// This is the single source of truth about whether we should parallelize an -/// operation nested in an omp.workdistribute region. +/// operation nested in an omp.execute region. 
static bool shouldParallelize(Operation *op) { - // Currently we cannot parallelize operations with results that have uses if (llvm::any_of(op->getResults(), [](OpResult v) -> bool { return !v.use_empty(); })) return false; @@ -77,21 +73,28 @@ static bool shouldParallelize(Operation *op) { return false; return *unordered; } - if (auto callOp = dyn_cast(op)) { - auto callee = callOp.getCallee(); - if (!callee) - return false; - auto *func = op->getParentOfType().lookupSymbol(*callee); - // TODO need to insert a check here whether it is a call we can actually - // parallelize currently - if (func->getAttr(fir::FIROpsDialect::getFirRuntimeAttrName())) - return true; - return false; + if (isRuntimeCall(op)) { + return true; } // We cannot parallise anything else return false; } +template +static T getPerfectlyNested(Operation *op) { + if (op->getNumRegions() != 1) + return nullptr; + auto &region = op->getRegion(0); + if (region.getBlocks().size() != 1) + return nullptr; + auto *block = &region.front(); + auto *firstOp = &block->front(); + if (auto nested = dyn_cast(firstOp)) + if (firstOp->getNextNode() == block->getTerminator()) + return nested; + return nullptr; +} + /// If B() and D() are parallelizable, /// /// omp.teams { @@ -138,17 +141,33 @@ struct FissionWorkdistribute : public OpRewritePattern { emitError(loc, "teams with multiple blocks\n"); return failure(); } - if (teams.getRegion().getBlocks().front().getOperations().size() != 2) { - emitError(loc, "teams with multiple nested ops\n"); - return failure(); - } auto *teamsBlock = &teams.getRegion().front(); + bool changed = false; + // Move the ops inside teams and before workdistribute outside. 
+ IRMapping irMapping; + llvm::SmallVector teamsHoisted; + for (auto &op : teams.getOps()) { + if (&op == workdistribute) { + break; + } + if (shouldParallelize(&op)) { + emitError(loc, + "teams has parallelize ops before first workdistribute\n"); + return failure(); + } else { + rewriter.setInsertionPoint(teams); + rewriter.clone(op, irMapping); + teamsHoisted.push_back(&op); + changed = true; + } + } + for (auto *op : teamsHoisted) + rewriter.replaceOp(op, irMapping.lookup(op)); // While we have unhandled operations in the original workdistribute auto *workdistributeBlock = &workdistribute.getRegion().front(); auto *terminator = workdistributeBlock->getTerminator(); - bool changed = false; while (&workdistributeBlock->front() != terminator) { rewriter.setInsertionPoint(teams); IRMapping mapping; @@ -194,9 +213,51 @@ struct FissionWorkdistribute : public OpRewritePattern { } }; +/// If fir.do_loop is present inside teams workdistribute +/// +/// omp.teams { +/// omp.workdistribute { +/// fir.do_loop unoredered { +/// ... +/// } +/// } +/// } +/// +/// Then, its lowered to +/// +/// omp.teams { +/// omp.parallel { +/// omp.distribute { +/// omp.wsloop { +/// omp.loop_nest +/// ... 
+/// } +/// } +/// } +/// } + +static void genParallelOp(Location loc, PatternRewriter &rewriter, + bool composite) { + auto parallelOp = rewriter.create(loc); + parallelOp.setComposite(composite); + rewriter.createBlock(&parallelOp.getRegion()); + rewriter.setInsertionPoint(rewriter.create(loc)); + return; +} + +static void genDistributeOp(Location loc, PatternRewriter &rewriter, + bool composite) { + mlir::omp::DistributeOperands distributeClauseOps; + auto distributeOp = + rewriter.create(loc, distributeClauseOps); + distributeOp.setComposite(composite); + auto distributeBlock = rewriter.createBlock(&distributeOp.getRegion()); + rewriter.setInsertionPointToStart(distributeBlock); + return; +} + static void -genLoopNestClauseOps(mlir::Location loc, mlir::PatternRewriter &rewriter, - fir::DoLoopOp loop, +genLoopNestClauseOps(mlir::PatternRewriter &rewriter, fir::DoLoopOp loop, mlir::omp::LoopNestOperands &loopNestClauseOps) { assert(loopNestClauseOps.loopLowerBounds.empty() && "Loop nest bounds were already emitted!"); @@ -207,9 +268,11 @@ genLoopNestClauseOps(mlir::Location loc, mlir::PatternRewriter &rewriter, } static void genWsLoopOp(mlir::PatternRewriter &rewriter, fir::DoLoopOp doLoop, - const mlir::omp::LoopNestOperands &clauseOps) { + const mlir::omp::LoopNestOperands &clauseOps, + bool composite) { auto wsloopOp = rewriter.create(doLoop.getLoc()); + wsloopOp.setComposite(composite); rewriter.createBlock(&wsloopOp.getRegion()); auto loopNestOp = @@ -231,57 +294,20 @@ static void genWsLoopOp(mlir::PatternRewriter &rewriter, fir::DoLoopOp doLoop, return; } -/// If fir.do_loop is present inside teams workdistribute -/// -/// omp.teams { -/// omp.workdistribute { -/// fir.do_loop unoredered { -/// ... -/// } -/// } -/// } -/// -/// Then, its lowered to -/// -/// omp.teams { -/// omp.workdistribute { -/// omp.parallel { -/// omp.wsloop { -/// omp.loop_nest -/// ... 
-/// } -/// } -/// } -/// } -/// } - -struct TeamsWorkdistributeLowering : public OpRewritePattern { +struct WorkdistributeDoLower : public OpRewritePattern { using OpRewritePattern::OpRewritePattern; - LogicalResult matchAndRewrite(omp::TeamsOp teamsOp, + LogicalResult matchAndRewrite(omp::WorkdistributeOp workdistribute, PatternRewriter &rewriter) const override { - auto teamsLoc = teamsOp->getLoc(); - auto workdistributeOp = getPerfectlyNested(teamsOp); - if (!workdistributeOp) { - LLVM_DEBUG(llvm::dbgs() << DEBUG_TYPE << " No workdistribute nested\n"); - return failure(); - } - assert(teamsOp.getReductionVars().empty()); - - auto doLoop = getPerfectlyNested(workdistributeOp); + auto doLoop = getPerfectlyNested(workdistribute); + auto wdLoc = workdistribute->getLoc(); if (doLoop && shouldParallelize(doLoop)) { - - auto parallelOp = rewriter.create(teamsLoc); - rewriter.createBlock(&parallelOp.getRegion()); - rewriter.setInsertionPoint( - rewriter.create(doLoop.getLoc())); - + assert(doLoop.getReduceOperands().empty()); + genParallelOp(wdLoc, rewriter, true); + genDistributeOp(wdLoc, rewriter, true); mlir::omp::LoopNestOperands loopNestClauseOps; - genLoopNestClauseOps(doLoop.getLoc(), rewriter, doLoop, - loopNestClauseOps); - - genWsLoopOp(rewriter, doLoop, loopNestClauseOps); - rewriter.setInsertionPoint(doLoop); - rewriter.eraseOp(doLoop); + genLoopNestClauseOps(rewriter, doLoop, loopNestClauseOps); + genWsLoopOp(rewriter, doLoop, loopNestClauseOps, true); + rewriter.eraseOp(workdistribute); return success(); } return failure(); @@ -315,7 +341,7 @@ struct TeamsWorkdistributeToSingle : public OpRewritePattern { Block *workdistributeBlock = &workdistributeOp.getRegion().front(); rewriter.eraseOp(workdistributeBlock->getTerminator()); rewriter.inlineBlockBefore(workdistributeBlock, teamsOp); - rewriter.eraseOp(teamsOp); + rewriter.eraseOp(workdistributeOp); return success(); } }; @@ -332,8 +358,7 @@ class LowerWorkdistributePass Operation *op = getOperation(); { 
RewritePatternSet patterns(&context); - patterns.insert( - &context); + patterns.insert(&context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); @@ -341,8 +366,7 @@ class LowerWorkdistributePass } { RewritePatternSet patterns(&context); - patterns.insert( - &context); + patterns.insert(&context); if (failed(applyPatternsGreedily(op, std::move(patterns), config))) { emitError(op->getLoc(), DEBUG_TYPE " pass failed\n"); signalPassFailure(); diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir index 9fb970246b90c..f8351bb64e6e8 100644 --- a/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir +++ b/flang/test/Transforms/OpenMP/lower-workdistribute-doloop.mlir @@ -2,13 +2,18 @@ // CHECK-LABEL: func.func @x({{.*}}) // CHECK: %[[VAL_0:.*]] = arith.constant 0 : index -// CHECK: omp.parallel { -// CHECK: omp.wsloop { -// CHECK: omp.loop_nest (%[[VAL_1:.*]]) : index = (%[[ARG0:.*]]) to (%[[ARG1:.*]]) inclusive step (%[[ARG2:.*]]) { -// CHECK: fir.store %[[VAL_0]] to %[[ARG4:.*]] : !fir.ref -// CHECK: omp.yield -// CHECK: } -// CHECK: } +// CHECK: omp.teams { +// CHECK: omp.parallel { +// CHECK: omp.distribute { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_1:.*]]) : index = (%[[ARG0:.*]]) to (%[[ARG1:.*]]) inclusive step (%[[ARG2:.*]]) { +// CHECK: fir.store %[[VAL_0]] to %[[ARG4:.*]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } {omp.composite} +// CHECK: } {omp.composite} +// CHECK: omp.terminator +// CHECK: } {omp.composite} // CHECK: omp.terminator // CHECK: } // CHECK: return diff --git a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir index cf50d135d01ec..c562b7009664d 100644 --- a/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir +++ 
b/flang/test/Transforms/OpenMP/lower-workdistribute-fission.mlir @@ -1,21 +1,26 @@ // RUN: fir-opt --lower-workdistribute %s | FileCheck %s -// CHECK-LABEL: func.func @test_fission_workdistribute({{.*}}) { +// CHECK-LABEL: func.func @test_fission_workdistribute( // CHECK: %[[VAL_0:.*]] = arith.constant 0 : index // CHECK: %[[VAL_1:.*]] = arith.constant 1 : index // CHECK: %[[VAL_2:.*]] = arith.constant 9 : index // CHECK: %[[VAL_3:.*]] = arith.constant 5.000000e+00 : f32 // CHECK: fir.store %[[VAL_3]] to %[[ARG2:.*]] : !fir.ref -// CHECK: omp.parallel { -// CHECK: omp.wsloop { -// CHECK: omp.loop_nest (%[[VAL_4:.*]]) : index = (%[[VAL_0]]) to (%[[VAL_2]]) inclusive step (%[[VAL_1]]) { -// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref -// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref -// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref -// CHECK: omp.yield -// CHECK: } -// CHECK: } +// CHECK: omp.teams { +// CHECK: omp.parallel { +// CHECK: omp.distribute { +// CHECK: omp.wsloop { +// CHECK: omp.loop_nest (%[[VAL_4:.*]]) : index = (%[[VAL_0]]) to (%[[VAL_2]]) inclusive step (%[[VAL_1]]) { +// CHECK: %[[VAL_5:.*]] = fir.coordinate_of %[[ARG0:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: %[[VAL_6:.*]] = fir.load %[[VAL_5]] : !fir.ref +// CHECK: %[[VAL_7:.*]] = fir.coordinate_of %[[ARG1:.*]], %[[VAL_4]] : (!fir.ref>, index) -> !fir.ref +// CHECK: fir.store %[[VAL_6]] to %[[VAL_7]] : !fir.ref +// CHECK: omp.yield +// CHECK: } +// CHECK: } {omp.composite} +// CHECK: } {omp.composite} +// CHECK: omp.terminator +// CHECK: } {omp.composite} // CHECK: omp.terminator // CHECK: } // CHECK: fir.call @regular_side_effect_func(%[[ARG2:.*]]) : (!fir.ref) -> () @@ -24,8 +29,8 @@ // CHECK: %[[VAL_9:.*]] = fir.coordinate_of %[[ARG0]], %[[VAL_8]] : (!fir.ref>, index) -> !fir.ref // CHECK: fir.store 
%[[VAL_3]] to %[[VAL_9]] : !fir.ref // CHECK: } -// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2]] : !fir.ref -// CHECK: fir.store %[[VAL_10]] to %[[ARG3]] : !fir.ref +// CHECK: %[[VAL_10:.*]] = fir.load %[[ARG2:.*]] : !fir.ref +// CHECK: fir.store %[[VAL_10]] to %[[ARG3:.*]] : !fir.ref // CHECK: return // CHECK: } module { >From fdc6938dff8456cf5864cc40b999e9855943e70b Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 28 May 2025 21:41:25 +0530 Subject: [PATCH 13/13] Fix basic-program.fir test. --- flang/test/Fir/basic-program.fir | 1 + 1 file changed, 1 insertion(+) diff --git a/flang/test/Fir/basic-program.fir b/flang/test/Fir/basic-program.fir index 7ac8b92f48953..a611629eeb280 100644 --- a/flang/test/Fir/basic-program.fir +++ b/flang/test/Fir/basic-program.fir @@ -69,6 +69,7 @@ func.func @_QQmain() { // PASSES-NEXT: InlineHLFIRAssign // PASSES-NEXT: ConvertHLFIRtoFIR // PASSES-NEXT: LowerWorkshare +// PASSES-NEXT: LowerWorkdistribute // PASSES-NEXT: CSE // PASSES-NEXT: (S) 0 num-cse'd - Number of operations CSE'd // PASSES-NEXT: (S) 0 num-dce'd - Number of operations DCE'd From flang-commits at lists.llvm.org Wed May 28 10:19:09 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Wed, 28 May 2025 10:19:09 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6837458d.170a0220.19503f.a298@mx.google.com> ================ @@ -2656,527 +2665,1857 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = 
- dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. +static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } 
+// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. +struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; } } + return SourcedActionStmt{}; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); } - return false; + return SourcedActionStmt{}; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; - } - } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); - } +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; } - ErrIfAllocatableVariable(var); + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; + } else { + return std::nullopt; + } + }, + x->u); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const auto *v1 = GetExpr(context_, stmt1Var); - const auto *e1 = GetExpr(context_, stmt1Expr); - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - const auto *v2 = GetExpr(context_, stmt2Var); - const auto *e2 = GetExpr(context_, stmt2Expr); - - if (e1 && v1 && e2 && v2) { - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(v2, e2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } - if (!(*e1 == *v2)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(v1, e1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - if (!(*v1 == *e2)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); - } - } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } + return std::nullopt; } -} -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. 
+ return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; } } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; } } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); - } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); + return std::nullopt; } -} -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { 
- const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, - }, - x.u); + return std::nullopt; } -void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { - dirContext_.pop_back(); +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; + } } -// Clauses -// Mainly categorized as -// 1. Checks on 'OmpClauseList' from 'parse-tree.h'. -// 2. Checks on clauses which fall under 'struct OmpClause' from parse-tree.h. -// 3. Checks on clauses which are not in 'struct OmpClause' from parse-tree.h. 
+template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} -void OmpStructureChecker::Leave(const parser::OmpClauseList &) { - // 2.7.1 Loop Construct Restriction - if (llvm::omp::allDoSet.test(GetContext().directive)) { - if (auto *clause{FindClause(llvm::omp::Clause::OMPC_schedule)}) { - // only one schedule clause is allowed - const auto &schedClause{std::get(clause->u)}; - auto &modifiers{OmpGetModifiers(schedClause.v)}; - auto *ordering{ - OmpGetUniqueModifier(modifiers)}; - if (ordering && - ordering->v == parser::OmpOrderingModifier::Value::Nonmonotonic) { - if (FindClause(llvm::omp::Clause::OMPC_ordered)) { - context_.Say(clause->source, - "The NONMONOTONIC modifier cannot be specified " - "if an ORDERED clause is specified"_err_en_US); - } + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } + return result; + } - if (auto *clause{FindClause(llvm::omp::Clause::OMPC_ordered)}) { - // only one ordered clause is allowed - const auto &orderedClause{ - std::get(clause->u)}; + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. 
+ return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. + return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } - if (orderedClause.v) { - CheckNotAllowedIfClause( - llvm::omp::Clause::OMPC_ordered, {llvm::omp::Clause::OMPC_linear}); + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } - if (auto *clause2{FindClause(llvm::omp::Clause::OMPC_collapse)}) { - const auto &collapseClause{ - std::get(clause2->u)}; - // ordered and collapse both have parameters - if (const auto orderedValue{GetIntValue(orderedClause.v)}) { - if (const auto collapseValue{GetIntValue(collapseClause.v)}) { - if (*orderedValue > 0 && *orderedValue < *collapseValue) { - context_.Say(clause->source, - "The parameter of the ORDERED clause must be " - "greater than or equal to " - "the parameter of the COLLAPSE clause"_err_en_US); - } - } - } - } + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. 
+ if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); } + } + } - // TODO: ordered region binding check (requires nesting implementation) +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const 
evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; } - } // doSet + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } - // 2.8.1 Simd Construct Restriction - if (llvm::omp::allSimdSet.test(GetContext().directive)) { - if (auto *clause{FindClause(llvm::omp::Clause::OMPC_simdlen)}) { - if (auto *clause2{FindClause(llvm::omp::Clause::OMPC_safelen)}) { - const auto &simdlenClause{ - std::get(clause->u)}; - const auto &safelenClause{ - std::get(clause2->u)}; - // simdlen and safelen both have parameters - if (const auto simdlenValue{GetIntValue(simdlenClause.v)}) { - if (const auto safelenValue{GetIntValue(safelenClause.v)}) { - if (*safelenValue > 0 && *simdlenValue > *safelenValue) { - context_.Say(clause->source, - "The parameter of the SIMDLEN clause must be less than or " - "equal to the parameter of the SAFELEN clause"_err_en_US); - } - } - } + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. 
only + // collect top-level designators). + auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (MoveAppend(v, std::move(results)), ...); + return v; + } +}; + +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result asSomeExpr(const T &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::Constant &x) const { + return asSomeExpr(x); + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. 
+ if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); } + } else { + return asSomeExpr(x.derived()); } + } - // 2.11.5 Simd construct restriction (OpenMP 5.1) - if (auto *sl_clause{FindClause(llvm::omp::Clause::OMPC_safelen)}) { - if (auto *o_clause{FindClause(llvm::omp::Clause::OMPC_order)}) { - const auto &orderClause{ - std::get(o_clause->u)}; - if (std::get(orderClause.v.t) == - parser::OmpOrderClause::Ordering::Concurrent) { - context_.Say(sl_clause->source, - "The `SAFELEN` clause cannot appear in the `SIMD` directive " - "with `ORDER(CONCURRENT)` clause"_err_en_US); - } + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; } - } // SIMD + } - // Semantic checks related to presence of multiple list items within the same - // clause - CheckMultListItems(); + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. 
+ static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; - if (GetContext().directive == llvm::omp::Directive::OMPD_task) { - if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { - unsigned version{context_.langOptions().OpenMPVersion}; - if (version == 50 || version == 51) { + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; +}; +} // namespace atomic + +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); +} + +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} + +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} + +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return GetTopLevelOperation(cond).first == atomic::Operator::Associated; 
+} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in any the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; + } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, + const std::optional &maybeAssign = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetAssignment(operation.assign, maybeAssign); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; + // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedAssignment assign; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return 
an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var (with optional converts) + // or + // ... = x capture-var = atomic-var (with optional converts) + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + using ReturnTy = std::pair; + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return IsSameOrConvertOf(c.rhs, u.lhs); + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; ---------------- tblah wrote: I think this is easier to understand written as a logical expression (if I have understood this correctly): ``` bool cbu1c2 = cbu1 && cbc2; bool cbu2c1 = cbc1 && cbu2; // det > 0 bool u1c2 = cbu1c2 && !cbu2c1; // det < 0 bool c1u2 = !cbu1c2 && cbu2c1; // det == 0 bool ambiguous = !u1c2 && !c1u2; ``` Using the multiply and subtraction is very clever but I don't think it is appropriate here because the code is already challenging to understand. https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Wed May 28 11:53:28 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Wed, 28 May 2025 11:53:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables (PR #141823) Message-ID: https://github.com/kparzysz created https://github.com/llvm/llvm-project/pull/141823 The check if the arguments are variable list items was missing, leading to a crash in lowering in some invalid situations. This fixes the first testcase reported in https://github.com/llvm/llvm-project/issues/141481 >From 7103b042de5e0bf6212a0a13b1a76e66bf633b67 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 28 May 2025 13:49:56 -0500 Subject: [PATCH] [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables The check if the arguments are variable list items was missing, leading to a crash in lowering in some invalid situations.
This fixes the first testcase reported in https://github.com/llvm/llvm-project/issues/141481 --- flang/lib/Semantics/check-omp-structure.cpp | 25 ++++++++++--------- flang/lib/Semantics/check-omp-structure.h | 1 + flang/test/Semantics/OpenMP/copyprivate04.f90 | 1 + flang/test/Semantics/OpenMP/copyprivate05.f90 | 12 +++++++++ 4 files changed, 27 insertions(+), 12 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/copyprivate05.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bda0d62829506..297cd32270705 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -390,6 +390,16 @@ std::optional OmpStructureChecker::IsContiguous( object.u); } +void OmpStructureChecker::CheckVariableListItem( + const SymbolSourceMap &symbols) { + for (auto &[symbol, source] : symbols) { + if (!IsVariableListItem(*symbol)) { + context_.SayWithDecl(*symbol, source, "'%s' must be a variable"_err_en_US, + symbol->name()); + } + } +} + void OmpStructureChecker::CheckMultipleOccurrence( semantics::UnorderedSymbolSet &listVars, const std::list &nameList, const parser::CharBlock &item, @@ -4587,6 +4597,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Copyprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_copyprivate); SymbolSourceMap symbols; GetSymbolsInObjectList(x.v, symbols); + CheckVariableListItem(symbols); CheckIntentInPointer(symbols, llvm::omp::Clause::OMPC_copyprivate); CheckCopyingPolymorphicAllocatable( symbols, llvm::omp::Clause::OMPC_copyprivate); @@ -4859,12 +4870,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::From &x) { const auto &objList{std::get(x.v.t)}; SymbolSourceMap symbols; GetSymbolsInObjectList(objList, symbols); - for (const auto &[symbol, source] : symbols) { - if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl( - *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); - } - } + 
CheckVariableListItem(symbols); // Ref: [4.5:109:19] // If a list item is an array section it must specify contiguous storage. @@ -4904,12 +4910,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::To &x) { const auto &objList{std::get(x.v.t)}; SymbolSourceMap symbols; GetSymbolsInObjectList(objList, symbols); - for (const auto &[symbol, source] : symbols) { - if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl( - *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); - } - } + CheckVariableListItem(symbols); // Ref: [4.5:109:19] // If a list item is an array section it must specify contiguous storage. diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 587959f7d506f..1a8059d8548ed 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -174,6 +174,7 @@ class OmpStructureChecker bool IsExtendedListItem(const Symbol &sym); bool IsCommonBlock(const Symbol &sym); std::optional IsContiguous(const parser::OmpObject &object); + void CheckVariableListItem(const SymbolSourceMap &symbols); void CheckMultipleOccurrence(semantics::UnorderedSymbolSet &listVars, const std::list &nameList, const parser::CharBlock &item, const std::string &clauseName); diff --git a/flang/test/Semantics/OpenMP/copyprivate04.f90 b/flang/test/Semantics/OpenMP/copyprivate04.f90 index 291cf1103fb27..8d7800229bc5f 100644 --- a/flang/test/Semantics/OpenMP/copyprivate04.f90 +++ b/flang/test/Semantics/OpenMP/copyprivate04.f90 @@ -70,6 +70,7 @@ program omp_copyprivate ! Named constants are shared. 
!$omp single !ERROR: COPYPRIVATE variable 'pi' is not PRIVATE or THREADPRIVATE in outer context + !ERROR: 'pi' must be a variable !$omp end single copyprivate(pi) !$omp parallel do diff --git a/flang/test/Semantics/OpenMP/copyprivate05.f90 b/flang/test/Semantics/OpenMP/copyprivate05.f90 new file mode 100644 index 0000000000000..129f8f0b5144e --- /dev/null +++ b/flang/test/Semantics/OpenMP/copyprivate05.f90 @@ -0,0 +1,12 @@ +!RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +! The first testcase from https://github.com/llvm/llvm-project/issues/141481 + +subroutine f00 + type t + end type + +!ERROR: 't' must be a variable +!$omp single copyprivate(t) +!$omp end single +end From flang-commits at lists.llvm.org Wed May 28 11:54:04 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 11:54:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables (PR #141823) In-Reply-To: Message-ID: <68375bcc.170a0220.1b4429.af5b@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Krzysztof Parzyszek (kparzysz)
Changes The check if the arguments are variable list items was missing, leading to a crash in lowering in some invalid situations. This fixes the first testcase reported in https://github.com/llvm/llvm-project/issues/141481 --- Full diff: https://github.com/llvm/llvm-project/pull/141823.diff 4 Files Affected: - (modified) flang/lib/Semantics/check-omp-structure.cpp (+13-12) - (modified) flang/lib/Semantics/check-omp-structure.h (+1) - (modified) flang/test/Semantics/OpenMP/copyprivate04.f90 (+1) - (added) flang/test/Semantics/OpenMP/copyprivate05.f90 (+12) ``````````diff diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bda0d62829506..297cd32270705 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -390,6 +390,16 @@ std::optional OmpStructureChecker::IsContiguous( object.u); } +void OmpStructureChecker::CheckVariableListItem( + const SymbolSourceMap &symbols) { + for (auto &[symbol, source] : symbols) { + if (!IsVariableListItem(*symbol)) { + context_.SayWithDecl(*symbol, source, "'%s' must be a variable"_err_en_US, + symbol->name()); + } + } +} + void OmpStructureChecker::CheckMultipleOccurrence( semantics::UnorderedSymbolSet &listVars, const std::list &nameList, const parser::CharBlock &item, @@ -4587,6 +4597,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Copyprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_copyprivate); SymbolSourceMap symbols; GetSymbolsInObjectList(x.v, symbols); + CheckVariableListItem(symbols); CheckIntentInPointer(symbols, llvm::omp::Clause::OMPC_copyprivate); CheckCopyingPolymorphicAllocatable( symbols, llvm::omp::Clause::OMPC_copyprivate); @@ -4859,12 +4870,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::From &x) { const auto &objList{std::get(x.v.t)}; SymbolSourceMap symbols; GetSymbolsInObjectList(objList, symbols); - for (const auto &[symbol, source] : symbols) { - if 
(!IsVariableListItem(*symbol)) { - context_.SayWithDecl( - *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); - } - } + CheckVariableListItem(symbols); // Ref: [4.5:109:19] // If a list item is an array section it must specify contiguous storage. @@ -4904,12 +4910,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::To &x) { const auto &objList{std::get(x.v.t)}; SymbolSourceMap symbols; GetSymbolsInObjectList(objList, symbols); - for (const auto &[symbol, source] : symbols) { - if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl( - *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); - } - } + CheckVariableListItem(symbols); // Ref: [4.5:109:19] // If a list item is an array section it must specify contiguous storage. diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 587959f7d506f..1a8059d8548ed 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -174,6 +174,7 @@ class OmpStructureChecker bool IsExtendedListItem(const Symbol &sym); bool IsCommonBlock(const Symbol &sym); std::optional IsContiguous(const parser::OmpObject &object); + void CheckVariableListItem(const SymbolSourceMap &symbols); void CheckMultipleOccurrence(semantics::UnorderedSymbolSet &listVars, const std::list &nameList, const parser::CharBlock &item, const std::string &clauseName); diff --git a/flang/test/Semantics/OpenMP/copyprivate04.f90 b/flang/test/Semantics/OpenMP/copyprivate04.f90 index 291cf1103fb27..8d7800229bc5f 100644 --- a/flang/test/Semantics/OpenMP/copyprivate04.f90 +++ b/flang/test/Semantics/OpenMP/copyprivate04.f90 @@ -70,6 +70,7 @@ program omp_copyprivate ! Named constants are shared. 
!$omp single !ERROR: COPYPRIVATE variable 'pi' is not PRIVATE or THREADPRIVATE in outer context + !ERROR: 'pi' must be a variable !$omp end single copyprivate(pi) !$omp parallel do diff --git a/flang/test/Semantics/OpenMP/copyprivate05.f90 b/flang/test/Semantics/OpenMP/copyprivate05.f90 new file mode 100644 index 0000000000000..129f8f0b5144e --- /dev/null +++ b/flang/test/Semantics/OpenMP/copyprivate05.f90 @@ -0,0 +1,12 @@ +!RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +! The first testcase from https://github.com/llvm/llvm-project/issues/141481 + +subroutine f00 + type t + end type + +!ERROR: 't' must be a variable +!$omp single copyprivate(t) +!$omp end single +end ``````````
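The refactoring in the diff above (one shared check replacing the loops that were repeated in the TO and FROM handlers, and newly called for COPYPRIVATE) can be sketched in isolation. This is a minimal illustration only: `Sym`, `SymSourceList`, and `checkVariableListItems` are hypothetical stand-ins, not flang's `Symbol`, `SymbolSourceMap`, or `CheckVariableListItem`, and the `isVariable` flag simply assumes the answer that `IsVariableListItem` would compute.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a semantic symbol; not flang's Symbol class.
struct Sym {
  std::string name;
  bool isVariable; // assumed result of a variable-list-item query
};

// Hypothetical stand-in for a symbol-to-source map: each symbol is paired
// with the source line on which it appears in the clause.
using SymSourceList = std::vector<std::pair<const Sym *, int>>;

// Shared helper: emit one diagnostic per list item that is not a variable.
// Every clause handler calls this instead of repeating the loop.
std::vector<std::string> checkVariableListItems(const SymSourceList &symbols) {
  std::vector<std::string> diags;
  for (const auto &[sym, line] : symbols) {
    if (!sym->isVariable) {
      diags.push_back("line " + std::to_string(line) + ": '" + sym->name +
                      "' must be a variable");
    }
  }
  return diags;
}
```

With a derived type name `t` and an ordinary variable `x` in the same list, only `t` would be reported, which mirrors the expectation in the new copyprivate05.f90 test.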
https://github.com/llvm/llvm-project/pull/141823 From flang-commits at lists.llvm.org Wed May 28 11:56:32 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 11:56:32 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables (PR #141823) In-Reply-To: Message-ID: <68375c60.170a0220.1de2f4.b48e@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command: ``````````bash git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- flang/lib/Semantics/check-omp-structure.cpp flang/lib/Semantics/check-omp-structure.h ``````````
View the diff from clang-format here. ``````````diff diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 297cd3227..b0bc478d9 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -394,8 +394,8 @@ void OmpStructureChecker::CheckVariableListItem( const SymbolSourceMap &symbols) { for (auto &[symbol, source] : symbols) { if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl(*symbol, source, "'%s' must be a variable"_err_en_US, - symbol->name()); + context_.SayWithDecl( + *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); } } } ``````````
https://github.com/llvm/llvm-project/pull/141823 From flang-commits at lists.llvm.org Wed May 28 12:02:43 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Wed, 28 May 2025 12:02:43 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables (PR #141823) In-Reply-To: Message-ID: <68375dd3.170a0220.2f77cb.b205@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/141823 >From 7103b042de5e0bf6212a0a13b1a76e66bf633b67 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 28 May 2025 13:49:56 -0500 Subject: [PATCH 1/2] [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables The check if the arguments are variable list items was missing, leading to a crash in lowering in some invalid situations. This fixes the first testcase reported in https://github.com/llvm/llvm-project/issues/141481 --- flang/lib/Semantics/check-omp-structure.cpp | 25 ++++++++++--------- flang/lib/Semantics/check-omp-structure.h | 1 + flang/test/Semantics/OpenMP/copyprivate04.f90 | 1 + flang/test/Semantics/OpenMP/copyprivate05.f90 | 12 +++++++++ 4 files changed, 27 insertions(+), 12 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/copyprivate05.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bda0d62829506..297cd32270705 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -390,6 +390,16 @@ std::optional OmpStructureChecker::IsContiguous( object.u); } +void OmpStructureChecker::CheckVariableListItem( + const SymbolSourceMap &symbols) { + for (auto &[symbol, source] : symbols) { + if (!IsVariableListItem(*symbol)) { + context_.SayWithDecl(*symbol, source, "'%s' must be a variable"_err_en_US, + symbol->name()); + } + } +} + void OmpStructureChecker::CheckMultipleOccurrence( semantics::UnorderedSymbolSet &listVars, const 
std::list &nameList, const parser::CharBlock &item, @@ -4587,6 +4597,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Copyprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_copyprivate); SymbolSourceMap symbols; GetSymbolsInObjectList(x.v, symbols); + CheckVariableListItem(symbols); CheckIntentInPointer(symbols, llvm::omp::Clause::OMPC_copyprivate); CheckCopyingPolymorphicAllocatable( symbols, llvm::omp::Clause::OMPC_copyprivate); @@ -4859,12 +4870,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::From &x) { const auto &objList{std::get(x.v.t)}; SymbolSourceMap symbols; GetSymbolsInObjectList(objList, symbols); - for (const auto &[symbol, source] : symbols) { - if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl( - *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); - } - } + CheckVariableListItem(symbols); // Ref: [4.5:109:19] // If a list item is an array section it must specify contiguous storage. @@ -4904,12 +4910,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::To &x) { const auto &objList{std::get(x.v.t)}; SymbolSourceMap symbols; GetSymbolsInObjectList(objList, symbols); - for (const auto &[symbol, source] : symbols) { - if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl( - *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); - } - } + CheckVariableListItem(symbols); // Ref: [4.5:109:19] // If a list item is an array section it must specify contiguous storage. 
diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 587959f7d506f..1a8059d8548ed 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -174,6 +174,7 @@ class OmpStructureChecker bool IsExtendedListItem(const Symbol &sym); bool IsCommonBlock(const Symbol &sym); std::optional IsContiguous(const parser::OmpObject &object); + void CheckVariableListItem(const SymbolSourceMap &symbols); void CheckMultipleOccurrence(semantics::UnorderedSymbolSet &listVars, const std::list &nameList, const parser::CharBlock &item, const std::string &clauseName); diff --git a/flang/test/Semantics/OpenMP/copyprivate04.f90 b/flang/test/Semantics/OpenMP/copyprivate04.f90 index 291cf1103fb27..8d7800229bc5f 100644 --- a/flang/test/Semantics/OpenMP/copyprivate04.f90 +++ b/flang/test/Semantics/OpenMP/copyprivate04.f90 @@ -70,6 +70,7 @@ program omp_copyprivate ! Named constants are shared. !$omp single !ERROR: COPYPRIVATE variable 'pi' is not PRIVATE or THREADPRIVATE in outer context + !ERROR: 'pi' must be a variable !$omp end single copyprivate(pi) !$omp parallel do diff --git a/flang/test/Semantics/OpenMP/copyprivate05.f90 b/flang/test/Semantics/OpenMP/copyprivate05.f90 new file mode 100644 index 0000000000000..129f8f0b5144e --- /dev/null +++ b/flang/test/Semantics/OpenMP/copyprivate05.f90 @@ -0,0 +1,12 @@ +!RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +! 
The first testcase from https://github.com/llvm/llvm-project/issues/141481 + +subroutine f00 + type t + end type + +!ERROR: 't' must be a variable +!$omp single copyprivate(t) +!$omp end single +end >From e52b6bf45685fc806b9cab2fe0f3243d0f9467ab Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 28 May 2025 14:02:30 -0500 Subject: [PATCH 2/2] format --- flang/lib/Semantics/check-omp-structure.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 297cd32270705..b0bc478d96a1e 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -394,8 +394,8 @@ void OmpStructureChecker::CheckVariableListItem( const SymbolSourceMap &symbols) { for (auto &[symbol, source] : symbols) { if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl(*symbol, source, "'%s' must be a variable"_err_en_US, - symbol->name()); + context_.SayWithDecl( + *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); } } } From flang-commits at lists.llvm.org Wed May 28 13:10:46 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Wed, 28 May 2025 13:10:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Treat ClassType as BoxType in COPYPRIVATE (PR #141844) Message-ID: https://github.com/kparzysz created https://github.com/llvm/llvm-project/pull/141844 This fixes the second problem reported in https://github.com/llvm/llvm-project/issues/141481 >From 4da5be5562b65570db85163a17902eb0605ac9eb Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 28 May 2025 15:08:20 -0500 Subject: [PATCH] [flang][OpenMP] Treat ClassType as BoxType in COPYPRIVATE This fixes the second problem reported in https://github.com/llvm/llvm-project/issues/141481 --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 3 +++ flang/test/Lower/OpenMP/copyprivate4.f90 | 18 ++++++++++++++++++ 2 
files changed, 21 insertions(+) create mode 100644 flang/test/Lower/OpenMP/copyprivate4.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 885871698c946..afbc77d48cb53 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -743,6 +743,9 @@ void TypeInfo::typeScan(mlir::Type ty) { } else if (auto bty = mlir::dyn_cast(ty)) { inBox = true; typeScan(bty.getEleTy()); + } else if (auto cty = mlir::dyn_cast(ty)) { + inBox = true; + typeScan(cty.getEleTy()); } else if (auto cty = mlir::dyn_cast(ty)) { charLen = cty.getLen(); } else if (auto hty = mlir::dyn_cast(ty)) { diff --git a/flang/test/Lower/OpenMP/copyprivate4.f90 b/flang/test/Lower/OpenMP/copyprivate4.f90 new file mode 100644 index 0000000000000..02fdbc71edc59 --- /dev/null +++ b/flang/test/Lower/OpenMP/copyprivate4.f90 @@ -0,0 +1,18 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +!The second testcase from https://github.com/llvm/llvm-project/issues/141481 + +!Check that we don't crash on this. + +!CHECK: omp.single copyprivate(%6#0 -> @_copy_class_ptr_rec__QFf01Tt : !fir.ref>>>) { +!CHECK: omp.terminator +!CHECK: } + +subroutine f01 + type t + end type + class(t), pointer :: tt + +!$omp single copyprivate(tt) +!$omp end single +end From flang-commits at lists.llvm.org Wed May 28 13:11:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 13:11:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Treat ClassType as BoxType in COPYPRIVATE (PR #141844) In-Reply-To: Message-ID: <68376de9.170a0220.ea282.b6ce@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Krzysztof Parzyszek (kparzysz)
Changes This fixes the second problem reported in https://github.com/llvm/llvm-project/issues/141481 --- Full diff: https://github.com/llvm/llvm-project/pull/141844.diff 2 Files Affected: - (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+3) - (added) flang/test/Lower/OpenMP/copyprivate4.f90 (+18) ``````````diff diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 885871698c946..afbc77d48cb53 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -743,6 +743,9 @@ void TypeInfo::typeScan(mlir::Type ty) { } else if (auto bty = mlir::dyn_cast(ty)) { inBox = true; typeScan(bty.getEleTy()); + } else if (auto cty = mlir::dyn_cast(ty)) { + inBox = true; + typeScan(cty.getEleTy()); } else if (auto cty = mlir::dyn_cast(ty)) { charLen = cty.getLen(); } else if (auto hty = mlir::dyn_cast(ty)) { diff --git a/flang/test/Lower/OpenMP/copyprivate4.f90 b/flang/test/Lower/OpenMP/copyprivate4.f90 new file mode 100644 index 0000000000000..02fdbc71edc59 --- /dev/null +++ b/flang/test/Lower/OpenMP/copyprivate4.f90 @@ -0,0 +1,18 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +!The second testcase from https://github.com/llvm/llvm-project/issues/141481 + +!Check that we don't crash on this. + +!CHECK: omp.single copyprivate(%6#0 -> @_copy_class_ptr_rec__QFf01Tt : !fir.ref>>>) { +!CHECK: omp.terminator +!CHECK: } + +subroutine f01 + type t + end type + class(t), pointer :: tt + +!$omp single copyprivate(tt) +!$omp end single +end ``````````
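The idea behind the three added lines (a class type is scanned the same way as a box type: set the in-box flag and recurse into the element type) can be sketched with a toy type tree. Everything here is an assumption made for illustration: `Kind`, `Ty`, `TypeInfoSketch`, and `mk` are hypothetical and are not the MLIR/FIR classes used by `TypeInfo::typeScan`.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy type tree; the kind names loosely echo FIR type names but are hypothetical.
enum class Kind { Ref, Box, Class, Pointer, Record, Int };

struct Ty {
  Kind kind;
  std::shared_ptr<Ty> elem; // wrapped type; null for leaf kinds
};

struct TypeInfoSketch {
  bool inBox = false;
  bool sawLeaf = false;

  void scan(const Ty &ty) {
    switch (ty.kind) {
    case Kind::Box:
    case Kind::Class: // the polymorphic descriptor is handled like a box
      inBox = true;
      scan(*ty.elem);
      break;
    case Kind::Ref:
    case Kind::Pointer: // transparent wrappers: just look through them
      scan(*ty.elem);
      break;
    default: // leaf type reached
      sawLeaf = true;
      break;
    }
  }
};

// Convenience constructor for building nested toy types.
std::shared_ptr<Ty> mk(Kind k, std::shared_ptr<Ty> e = nullptr) {
  return std::make_shared<Ty>(Ty{k, std::move(e)});
}
```

In this sketch, scanning a reference to a class descriptor that wraps a pointer to a record sets `inBox`, whereas without the `Kind::Class` case the scan would stop at the descriptor and treat it as a leaf.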
https://github.com/llvm/llvm-project/pull/141844 From flang-commits at lists.llvm.org Wed May 28 12:53:39 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Wed, 28 May 2025 12:53:39 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <683769c3.170a0220.12fbc7.b075@mx.google.com> https://github.com/mrkajetanp updated https://github.com/llvm/llvm-project/pull/138718 >From 005c599ffcbb1ee257ad8ca510d8ecd649fcab7b Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 10 Apr 2025 14:04:52 +0000 Subject: [PATCH 1/5] [flang] Inline hlfir.copy_in for trivial types hlfir.copy_in implements copying non-contiguous array slices for functions that take in arrays required to be contiguous through a flang-rt function that calls memcpy/memmove separately on each element. For large arrays of trivial types, this can incur considerable overhead compared to a plain copy loop that is better able to take advantage of hardware pipelines. To address that, extend the InlineHLFIRAssign optimisation pass with a new pattern for inlining hlfir.copy_in operations for trivial types. For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in). Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by a factor of about 1/3rd. 
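As a rough picture of what the inlined copy-in described above does at run time, consider the following sketch. It is not the generated FIR and not the flang-rt routine; `StridedView` and `copyIn` are hypothetical names, and a one-dimensional strided slice of doubles is assumed for simplicity. If the slice is already contiguous it is passed through untouched; otherwise a temporary is filled by a plain element loop, and the caller is told that a temporary exists so it can be released after the call (no copy-out is performed, matching the intent(in) case).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D strided view; stands in for a possibly non-contiguous slice.
struct StridedView {
  const double *base;
  std::size_t count;
  std::size_t stride; // in elements; 1 means contiguous
};

// Returns a pointer to contiguous data for the callee. When a copy is needed,
// `temp` owns the storage and `wasCopied` is set so the caller knows that a
// temporary has to be cleaned up afterwards.
const double *copyIn(const StridedView &v, std::vector<double> &temp,
                     bool &wasCopied) {
  if (v.stride == 1) {
    wasCopied = false; // already contiguous: pass the original storage
    return v.base;
  }
  temp.resize(v.count);
  for (std::size_t i = 0; i < v.count; ++i) {
    temp[i] = v.base[i * v.stride]; // plain element-wise copy loop
  }
  wasCopied = true;
  return temp.data();
}
```

The design point is that the loop body is a simple load and store that a compiler can pipeline or vectorize, instead of one library copy call per element.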
Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 117 ++++++++++++++++++ 1 file changed, 117 insertions(+) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 6e209cce07ad4..38c684eaceb7d 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,6 +13,7 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + auto results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + auto elem = hlfir::getElementAt(loc, builder, inputVariable, + loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + auto tempElem = hlfir::getElementAt(loc, builder, temp, + loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + auto refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + auto refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + auto addr = results[0]; + auto needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + auto tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. + rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); +} + class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -140,6 +256,7 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); + patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { >From 26d2e491acd54ef942af32c8361e97b24d190625 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 7 May 2025 16:04:07 +0000 Subject: [PATCH 2/5] Add tests Signed-off-by: Kajetan Puchalski --- flang/test/HLFIR/inline-hlfir-assign.fir | 144 +++++++++++++++++++++++ 1 file changed, 144 insertions(+) diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index f834e7971e3d5..df7681b9c5c16 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,3 +353,147 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // 
CHECK: } + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, 
+// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result 
%[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 
dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] 
dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 
e51db2f45ad82798937b57e2b5c08e7bcc66deed Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 8 May 2025 15:15:56 +0000 Subject: [PATCH 3/5] Address Tom's review comments Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 41 +++++++++++-------- 1 file changed, 23 insertions(+), 18 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 38c684eaceb7d..dc545ece8adff 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -158,16 +158,16 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, "CopyInOp's WasCopied has no uses"); // The copy out should always be present, either to actually copy or just // deallocate memory. - auto *copyOut = - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - if (!mlir::isa(copyOut)) + if (!copyOut) return rewriter.notifyMatchFailure(copyIn, "CopyInOp has no direct CopyOut"); // Only inline the copy_in when copy_out does not need to be done, i.e. in // case of intent(in). 
- if (::llvm::cast(copyOut).getVar()) + if (copyOut.getVar()) return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); inputVariable = @@ -175,7 +175,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); mlir::Value isContiguous = builder.create(loc, inputVariable); - auto results = + mlir::Operation::result_range results = builder .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, /*withElseRegion=*/true) @@ -195,11 +195,11 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, loc, builder, extents, /*isUnordered=*/true, flangomp::shouldUseWorkshareLowering(copyIn)); builder.setInsertionPointToStart(loopNest.body); - auto elem = hlfir::getElementAt(loc, builder, inputVariable, - loopNest.oneBasedIndices); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); elem = hlfir::loadTrivialScalar(loc, builder, elem); - auto tempElem = hlfir::getElementAt(loc, builder, temp, - loopNest.oneBasedIndices); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); builder.create(loc, elem, tempElem); builder.setInsertionPointAfter(loopNest.outerOp); @@ -209,9 +209,9 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, if (mlir::isa(temp.getType())) { result = temp; } else { - auto refTy = + fir::ReferenceType refTy = fir::ReferenceType::get(temp.getElementOrSequenceType()); - auto refVal = builder.createConvert(loc, refTy, temp); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); result = builder.create(loc, resultAddrType, refVal); } @@ -221,25 +221,30 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }) .getResults(); - auto addr = results[0]; - auto needsCleanup = results[1]; + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, 
needsCleanup, false).genThen([&] { + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { auto boxAddr = builder.create(loc, addr); - auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); builder.create(loc, heapVal); }); rewriter.eraseOp(copyOut); - auto tempBox = copyIn.getTempBox(); + mlir::Value tempBox = copyIn.getTempBox(); rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); // The TempBox is only needed for flang-rt calls which we're no longer - // generating. + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); } >From c1af5b1ead7a560112c9896b1cb2bac48b865df3 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 22 May 2025 13:37:53 +0000 Subject: [PATCH 4/5] Separate copy_in inlining into its own pass, add flag Signed-off-by: Kajetan Puchalski --- flang/include/flang/Optimizer/HLFIR/Passes.td | 4 + .../Optimizer/HLFIR/Transforms/CMakeLists.txt | 1 + .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 122 ------------ .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 180 ++++++++++++++++++ flang/lib/Optimizer/Passes/Pipelines.cpp | 5 + flang/test/HLFIR/inline-hlfir-assign.fir | 144 -------------- flang/test/HLFIR/inline-hlfir-copy-in.fir | 146 ++++++++++++++ 7 files changed, 336 insertions(+), 266 deletions(-) create mode 100644 flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp create mode 100644 flang/test/HLFIR/inline-hlfir-copy-in.fir diff --git a/flang/include/flang/Optimizer/HLFIR/Passes.td b/flang/include/flang/Optimizer/HLFIR/Passes.td index d445140118073..04d7aec5fe489 100644 --- 
a/flang/include/flang/Optimizer/HLFIR/Passes.td +++ b/flang/include/flang/Optimizer/HLFIR/Passes.td @@ -69,6 +69,10 @@ def InlineHLFIRAssign : Pass<"inline-hlfir-assign"> { let summary = "Inline hlfir.assign operations"; } +def InlineHLFIRCopyIn : Pass<"inline-hlfir-copy-in"> { + let summary = "Inline hlfir.copy_in operations"; +} + def PropagateFortranVariableAttributes : Pass<"propagate-fortran-attrs"> { let summary = "Propagate FortranVariableFlagsAttr attributes through HLFIR"; } diff --git a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt index d959428ebd203..cc74273d9c5d9 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt +++ b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt @@ -5,6 +5,7 @@ add_flang_library(HLFIRTransforms ConvertToFIR.cpp InlineElementals.cpp InlineHLFIRAssign.cpp + InlineHLFIRCopyIn.cpp LowerHLFIRIntrinsics.cpp LowerHLFIROrderedAssignments.cpp ScheduleOrderedAssignments.cpp diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index dc545ece8adff..6e209cce07ad4 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,7 +13,6 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" -#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -128,126 +127,6 @@ class InlineHLFIRAssignConversion } }; -class InlineCopyInConversion : public mlir::OpRewritePattern { -public: - using mlir::OpRewritePattern::OpRewritePattern; - - llvm::LogicalResult - matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const override; -}; - -llvm::LogicalResult 
-InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const { - fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); - mlir::Location loc = copyIn.getLoc(); - hlfir::Entity inputVariable{copyIn.getVar()}; - if (!fir::isa_trivial(inputVariable.getFortranElementType())) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's data type is not trivial"); - - if (fir::isPointerType(inputVariable.getType())) - return rewriter.notifyMatchFailure( - copyIn, "CopyInOp's input variable is a pointer"); - - // There should be exactly one user of WasCopied - the corresponding - // CopyOutOp. - if (copyIn.getWasCopied().getUses().empty()) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's WasCopied has no uses"); - // The copy out should always be present, either to actually copy or just - // deallocate memory. - auto copyOut = mlir::dyn_cast( - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - - if (!copyOut) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp has no direct CopyOut"); - - // Only inline the copy_in when copy_out does not need to be done, i.e. in - // case of intent(in). 
- if (copyOut.getVar()) - return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); - - inputVariable = - hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); - mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); - mlir::Value isContiguous = - builder.create(loc, inputVariable); - mlir::Operation::result_range results = - builder - .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, - /*withElseRegion=*/true) - .genThen([&]() { - mlir::Value falseVal = builder.create( - loc, builder.getI1Type(), builder.getBoolAttr(false)); - builder.create( - loc, mlir::ValueRange{inputVariable, falseVal}); - }) - .genElse([&] { - auto [temp, cleanup] = - hlfir::createTempFromMold(loc, builder, inputVariable); - mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); - llvm::SmallVector extents = - hlfir::getIndexExtents(loc, builder, shape); - hlfir::LoopNest loopNest = hlfir::genLoopNest( - loc, builder, extents, /*isUnordered=*/true, - flangomp::shouldUseWorkshareLowering(copyIn)); - builder.setInsertionPointToStart(loopNest.body); - hlfir::Entity elem = hlfir::getElementAt( - loc, builder, inputVariable, loopNest.oneBasedIndices); - elem = hlfir::loadTrivialScalar(loc, builder, elem); - hlfir::Entity tempElem = hlfir::getElementAt( - loc, builder, temp, loopNest.oneBasedIndices); - builder.create(loc, elem, tempElem); - builder.setInsertionPointAfter(loopNest.outerOp); - - mlir::Value result; - // Make sure the result is always a boxed array by boxing it - // ourselves if need be. 
- if (mlir::isa(temp.getType())) { - result = temp; - } else { - fir::ReferenceType refTy = - fir::ReferenceType::get(temp.getElementOrSequenceType()); - mlir::Value refVal = builder.createConvert(loc, refTy, temp); - result = - builder.create(loc, resultAddrType, refVal); - } - - builder.create(loc, - mlir::ValueRange{result, cleanup}); - }) - .getResults(); - - mlir::OpResult addr = results[0]; - mlir::OpResult needsCleanup = results[1]; - - builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { - auto boxAddr = builder.create(loc, addr); - fir::HeapType heapType = - fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - mlir::Value heapVal = - builder.createConvert(loc, heapType, boxAddr.getResult()); - builder.create(loc, heapVal); - }); - rewriter.eraseOp(copyOut); - - mlir::Value tempBox = copyIn.getTempBox(); - - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); - - // The TempBox is only needed for flang-rt calls which we're no longer - // generating. It should have no uses left at this stage. 
- if (!tempBox.getUses().empty()) - return mlir::failure(); - rewriter.eraseOp(tempBox.getDefiningOp()); - - return mlir::success(); -} - class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -261,7 +140,6 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); - patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp new file mode 100644 index 0000000000000..1e2aecaf535a0 --- /dev/null +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + mlir::Value tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); + rewriter.eraseOp(tempBox.getDefiningOp()); + + return mlir::success(); +} + +class InlineHLFIRCopyInPass + : public hlfir::impl::InlineHLFIRCopyInBase { +public: + void runOnOperation() override { + mlir::MLIRContext *context = &getContext(); + + mlir::GreedyRewriteConfig config; + // Prevent the pattern driver from merging blocks. 
+ config.setRegionSimplificationLevel( + mlir::GreedySimplifyRegionLevel::Disabled); + + mlir::RewritePatternSet patterns(context); + if (!noInlineHLFIRCopyIn) { + patterns.insert(context); + } + + if (mlir::failed(mlir::applyPatternsGreedily( + getOperation(), std::move(patterns), config))) { + mlir::emitError(getOperation()->getLoc(), + "failure in hlfir.copy_in inlining"); + signalPassFailure(); + } + } +}; +} // namespace diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..1779623fddc5a 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -255,6 +255,11 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP, pm, hlfir::createOptimizedBufferization); addNestedPassToAllTopLevelOperations( pm, hlfir::createInlineHLFIRAssign); + + if (optLevel == llvm::OptimizationLevel::O3) { + addNestedPassToAllTopLevelOperations( + pm, hlfir::createInlineHLFIRCopyIn); + } } pm.addPass(hlfir::createLowerHLFIROrderedAssignments()); pm.addPass(hlfir::createLowerHLFIRIntrinsics()); diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index df7681b9c5c16..f834e7971e3d5 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,147 +353,3 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // CHECK: } - -// Test inlining of hlfir.copy_in that does not require the array to be copied out -func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, 
!fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, %c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant true -// CHECK: %[[VAL_4:.*]] = arith.constant false -// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : 
(!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 -// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 -// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { -// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 -// CHECK: } else { -// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} -// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { -// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: %[[VAL_27:.*]] = fir.load 
%[[VAL_26:.*]] : !fir.ref -// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref -// CHECK: } -// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 -// CHECK: } -// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: fir.if %[[VAL_21:.*]]#1 { -// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> -// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> -// CHECK: } -// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } - -// Test not inlining of hlfir.copy_in that requires the array to be copied out -func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, 
%c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_no_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> -// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 
-// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) -// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () -// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir new file mode 100644 index 0000000000000..7140e93f19979 --- /dev/null +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -0,0 +1,146 @@ +// Test inlining of hlfir.copy_in +// RUN: fir-opt --inline-hlfir-copy-in %s | FileCheck %s + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca 
!fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 
+// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, 
!fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 
+ %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: 
%[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From bed8a6af87cff1a404255c92da06d8e7ca07e908 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 28 May 2025 13:44:53 +0000 Subject: [PATCH 5/5] Support arrays behind a pointer, add metadata to disable vectorizing --- .../flang/Optimizer/Builder/HLFIRTools.h | 8 ++- 
flang/lib/Optimizer/Builder/HLFIRTools.cpp | 13 +++- .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 66 ++++++++++--------- flang/test/HLFIR/inline-hlfir-copy-in.fir | 6 +- 4 files changed, 55 insertions(+), 38 deletions(-) diff --git a/flang/include/flang/Optimizer/Builder/HLFIRTools.h b/flang/include/flang/Optimizer/Builder/HLFIRTools.h index ed00cec04dc39..2cbad6e268a38 100644 --- a/flang/include/flang/Optimizer/Builder/HLFIRTools.h +++ b/flang/include/flang/Optimizer/Builder/HLFIRTools.h @@ -374,12 +374,14 @@ struct LoopNest { /// loop constructs currently. LoopNest genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ValueRange extents, bool isUnordered = false, - bool emitWorkshareLoop = false); + bool emitWorkshareLoop = false, + bool couldVectorize = true); inline LoopNest genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::Value shape, bool isUnordered = false, - bool emitWorkshareLoop = false) { + bool emitWorkshareLoop = false, + bool couldVectorize = true) { return genLoopNest(loc, builder, getIndexExtents(loc, builder, shape), - isUnordered, emitWorkshareLoop); + isUnordered, emitWorkshareLoop, couldVectorize); } /// The type of a callback that generates the body of a reduction diff --git a/flang/lib/Optimizer/Builder/HLFIRTools.cpp b/flang/lib/Optimizer/Builder/HLFIRTools.cpp index f24dc2caeedfc..14aae5d7118a1 100644 --- a/flang/lib/Optimizer/Builder/HLFIRTools.cpp +++ b/flang/lib/Optimizer/Builder/HLFIRTools.cpp @@ -21,6 +21,7 @@ #include "mlir/IR/IRMapping.h" #include "mlir/Support/LLVM.h" #include "llvm/ADT/TypeSwitch.h" +#include #include #include @@ -932,7 +933,8 @@ mlir::Value hlfir::inlineElementalOp( hlfir::LoopNest hlfir::genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ValueRange extents, bool isUnordered, - bool emitWorkshareLoop) { + bool emitWorkshareLoop, + bool couldVectorize) { emitWorkshareLoop = emitWorkshareLoop && isUnordered; hlfir::LoopNest loopNest; assert(!extents.empty() && "must have 
at least one extent"); @@ -967,6 +969,15 @@ hlfir::LoopNest hlfir::genLoopNest(mlir::Location loc, auto ub = builder.createConvert(loc, indexType, extent); auto doLoop = builder.create(loc, one, ub, one, isUnordered); + if (!couldVectorize) { + mlir::LLVM::LoopVectorizeAttr va{mlir::LLVM::LoopVectorizeAttr::get( + builder.getContext(), + /*disable=*/builder.getBoolAttr(true), {}, {}, {}, {}, {}, {})}; + mlir::LLVM::LoopAnnotationAttr la = mlir::LLVM::LoopAnnotationAttr::get( + builder.getContext(), {}, /*vectorize=*/va, {}, /*unroll*/ {}, + /*unroll_and_jam*/ {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}); + doLoop.setLoopAnnotationAttr(la); + } loopNest.body = doLoop.getBody(); builder.setInsertionPointToStart(loopNest.body); // Reverse the indices so they are in column-major order. diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp index 1e2aecaf535a0..d1cbe3241c07b 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -52,19 +52,15 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, return rewriter.notifyMatchFailure(copyIn, "CopyInOp's data type is not trivial"); - if (fir::isPointerType(inputVariable.getType())) - return rewriter.notifyMatchFailure( - copyIn, "CopyInOp's input variable is a pointer"); - // There should be exactly one user of WasCopied - the corresponding // CopyOutOp. - if (copyIn.getWasCopied().getUses().empty()) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's WasCopied has no uses"); + if (!copyIn.getWasCopied().hasOneUse()) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's WasCopied has no single user"); // The copy out should always be present, either to actually copy or just // deallocate memory. 
auto copyOut = mlir::dyn_cast( - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + copyIn.getWasCopied().user_begin().getCurrent().getUser()); if (!copyOut) return rewriter.notifyMatchFailure(copyIn, @@ -77,28 +73,45 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, inputVariable = hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); - mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Type sequenceType = + hlfir::getFortranElementOrSequenceType(inputVariable.getType()); + fir::BoxType resultBoxType = fir::BoxType::get(sequenceType); mlir::Value isContiguous = builder.create(loc, inputVariable); mlir::Operation::result_range results = builder - .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + .genIfOp(loc, {resultBoxType, builder.getI1Type()}, isContiguous, /*withElseRegion=*/true) .genThen([&]() { - mlir::Value falseVal = builder.create( - loc, builder.getI1Type(), builder.getBoolAttr(false)); + mlir::Value result = inputVariable; + if (fir::isPointerType(inputVariable.getType())) { + auto boxAddr = builder.create(loc, inputVariable); + fir::ReferenceType refTy = fir::ReferenceType::get(sequenceType); + mlir::Value refVal = builder.createConvert(loc, refTy, boxAddr); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + result = builder.create(loc, resultBoxType, refVal, + shape); + } builder.create( - loc, mlir::ValueRange{inputVariable, falseVal}); + loc, mlir::ValueRange{result, builder.createBool(loc, false)}); }) .genElse([&] { - auto [temp, cleanup] = - hlfir::createTempFromMold(loc, builder, inputVariable); mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); llvm::SmallVector extents = hlfir::getIndexExtents(loc, builder, shape); - hlfir::LoopNest loopNest = hlfir::genLoopNest( - loc, builder, extents, /*isUnordered=*/true, - flangomp::shouldUseWorkshareLowering(copyIn)); + llvm::StringRef tmpName{".tmp.copy_in"}; + llvm::SmallVector lenParams; 
+ mlir::Value alloc = builder.createHeapTemporary( + loc, sequenceType, tmpName, extents, lenParams); + + auto declareOp = builder.create( + loc, alloc, tmpName, shape, lenParams, + /*dummy_scope=*/nullptr); + hlfir::Entity temp{declareOp.getBase()}; + hlfir::LoopNest loopNest = + hlfir::genLoopNest(loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn), + /*couldVectorize=*/false); builder.setInsertionPointToStart(loopNest.body); hlfir::Entity elem = hlfir::getElementAt( loc, builder, inputVariable, loopNest.oneBasedIndices); @@ -117,12 +130,12 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, fir::ReferenceType refTy = fir::ReferenceType::get(temp.getElementOrSequenceType()); mlir::Value refVal = builder.createConvert(loc, refTy, temp); - result = - builder.create(loc, resultAddrType, refVal); + result = builder.create(loc, resultBoxType, refVal, + shape); } - builder.create(loc, - mlir::ValueRange{result, cleanup}); + builder.create( + loc, mlir::ValueRange{result, builder.createBool(loc, true)}); }) .getResults(); @@ -140,16 +153,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }); rewriter.eraseOp(copyOut); - mlir::Value tempBox = copyIn.getTempBox(); - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); - - // The TempBox is only needed for flang-rt calls which we're no longer - // generating. It should have no uses left at this stage. 
- if (!tempBox.getUses().empty()) - return mlir::failure(); - rewriter.eraseOp(tempBox.getDefiningOp()); - return mlir::success(); } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir index 7140e93f19979..7a5b6e591f7c7 100644 --- a/flang/test/HLFIR/inline-hlfir-copy-in.fir +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -60,9 +60,9 @@ func.func private @_test_inline_copy_in(%arg0: !fir.box> { // CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { // CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 // CHECK: } else { -// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} -// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp.copy_in", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp.copy_in"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered attributes {loopAnnotation = #loop_annotation} { // CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref // CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref // CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref From flang-commits at lists.llvm.org Wed May 28 13:16:00 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Wed, 28 May 2025 13:16:00 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Treat ClassType as BoxType in COPYPRIVATE (PR #141844) In-Reply-To: Message-ID: <68376f00.170a0220.28bec7.02ed@mx.google.com> 
https://github.com/kiranchandramohan approved this pull request. LG. Please wait for @luporl https://github.com/llvm/llvm-project/pull/141844 From flang-commits at lists.llvm.org Wed May 28 13:25:00 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Wed, 28 May 2025 13:25:00 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <6837711c.050a0220.1fd100.b5f9@mx.google.com> ================ @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); ---------------- vzakhari wrote: Yes, I think it should work if you just relink the operands of the `CopyOut`. https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Wed May 28 13:58:28 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 28 May 2025 13:58:28 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang] Extension: allow char string edit descriptors in input formats (PR #140624) In-Reply-To: Message-ID: <683778f4.170a0220.374968.044d@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/140624 From flang-commits at lists.llvm.org Wed May 28 13:58:24 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 13:58:24 -0700 (PDT) Subject: [flang-commits] [flang] 4c6b60a - [flang] Extension: allow char string edit descriptors in input formats (#140624) Message-ID: <683778f0.a70a0220.8ceee.013d@mx.google.com> Author: Peter Klausler Date: 2025-05-28T13:58:22-07:00 New Revision: 4c6b60a639a674f22f5b3aeac0f05ad5aacc552a URL: https://github.com/llvm/llvm-project/commit/4c6b60a639a674f22f5b3aeac0f05ad5aacc552a DIFF: https://github.com/llvm/llvm-project/commit/4c6b60a639a674f22f5b3aeac0f05ad5aacc552a.diff LOG: [flang] Extension: allow char string edit descriptors in input formats (#140624) FORMAT("J=",I3) is accepted by a few other Fortran compilers as a valid format for input as well as for output. 
The character string edit descriptor "J=" is interpreted as if it had been 2X on input, causing two characters to be skipped over. The skipped characters don't have to match the characters in the literal string. An optional warning is emitted under control of the -pedantic option. Added: Modified: flang-rt/include/flang-rt/runtime/format-implementation.h flang-rt/unittests/Runtime/NumericalFormatTest.cpp flang/docs/Extensions.md flang/include/flang/Common/format.h flang/test/Semantics/io09.f90 Removed: ################################################################################ diff --git a/flang-rt/include/flang-rt/runtime/format-implementation.h b/flang-rt/include/flang-rt/runtime/format-implementation.h index 8f4eb1161dd14..85dc922bc31bc 100644 --- a/flang-rt/include/flang-rt/runtime/format-implementation.h +++ b/flang-rt/include/flang-rt/runtime/format-implementation.h @@ -427,7 +427,11 @@ RT_API_ATTRS int FormatControl::CueUpNextDataEdit( } else { --chars; } - EmitAscii(context, format_ + start, chars); + if constexpr (std::is_base_of_v) { + context.HandleRelativePosition(chars); + } else { + EmitAscii(context, format_ + start, chars); + } } else if (ch == 'H') { // 9HHOLLERITH if (!repeat || *repeat < 1 || offset_ + *repeat > formatLength_) { @@ -435,7 +439,12 @@ RT_API_ATTRS int FormatControl::CueUpNextDataEdit( maybeReversionPoint); return 0; } - EmitAscii(context, format_ + offset_, static_cast(*repeat)); + if constexpr (std::is_base_of_v) { + context.HandleRelativePosition(static_cast(*repeat)); + } else { + EmitAscii( + context, format_ + offset_, static_cast(*repeat)); + } offset_ += *repeat; } else if (ch >= 'A' && ch <= 'Z') { int start{offset_ - 1}; diff --git a/flang-rt/unittests/Runtime/NumericalFormatTest.cpp b/flang-rt/unittests/Runtime/NumericalFormatTest.cpp index a752f9d6c723b..f1492d0e39fec 100644 --- a/flang-rt/unittests/Runtime/NumericalFormatTest.cpp +++ b/flang-rt/unittests/Runtime/NumericalFormatTest.cpp @@ -882,6 +882,7 @@ 
TEST(IOApiTests, EditDoubleInputValues) { {"(F18.1)", " 125", 0x4029000000000000, 0}, {"(F18.2)", " 125", 0x3ff4000000000000, 0}, {"(F18.3)", " 125", 0x3fc0000000000000, 0}, + {"('str',F3.0)", "xxx125", 0x405f400000000000, 0}, {"(-1P,F18.0)", " 125", 0x4093880000000000, 0}, // 1250 {"(1P,F18.0)", " 125", 0x4029000000000000, 0}, // 12.5 {"(BZ,F18.0)", " 125 ", 0x4093880000000000, 0}, // 1250 diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 00a7e2bac84e6..1cc4881438cc1 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -424,6 +424,10 @@ end * A zero field width is allowed for logical formatted output (`L0`). * `OPEN(..., FORM='BINARY')` is accepted as a legacy synonym for the standard `OPEN(..., FORM='UNFORMATTED', ACCESS='STREAM')`. +* A character string edit descriptor is allowed in an input format + with an optional compilation-time warning. When executed, it + is treated as an 'nX' positioning control descriptor that skips + over the same number of characters, without comparison. 
### Extensions supported when enabled by options diff --git a/flang/include/flang/Common/format.h b/flang/include/flang/Common/format.h index da416506ffb5d..1650f56140b4d 100644 --- a/flang/include/flang/Common/format.h +++ b/flang/include/flang/Common/format.h @@ -430,11 +430,11 @@ template void FormatValidator::NextToken() { } } SetLength(); - if (stmt_ == IoStmtKind::Read && - previousToken_.kind() != TokenKind::DT) { // 13.3.2p6 - ReportError("String edit descriptor in READ format expression"); - } else if (token_.kind() != TokenKind::String) { + if (token_.kind() != TokenKind::String) { ReportError("Unterminated string"); + } else if (stmt_ == IoStmtKind::Read && + previousToken_.kind() != TokenKind::DT) { // 13.3.2p6 + ReportWarning("String edit descriptor in READ format expression"); } break; default: diff --git a/flang/test/Semantics/io09.f90 b/flang/test/Semantics/io09.f90 index 495cbf059005c..7fc9d8ffe7b4b 100644 --- a/flang/test/Semantics/io09.f90 +++ b/flang/test/Semantics/io09.f90 @@ -1,8 +1,8 @@ -! RUN: %python %S/test_errors.py %s %flang_fc1 - !ERROR: String edit descriptor in READ format expression +! 
RUN: %python %S/test_errors.py %s %flang_fc1 -pedantic + !WARNING: String edit descriptor in READ format expression read(*,'("abc")') - !ERROR: String edit descriptor in READ format expression + !ERROR: Unterminated string !ERROR: Unterminated format expression read(*,'("abc)') From flang-commits at lists.llvm.org Wed May 28 13:59:51 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 13:59:51 -0700 (PDT) Subject: [flang-commits] [flang] ff82884 - [flang] Fix crash in error recovery (#140768) Message-ID: <68377947.050a0220.a7828.0184@mx.google.com> Author: Peter Klausler Date: 2025-05-28T13:59:48-07:00 New Revision: ff8288442dad15d66b7a22ccad94e926e2f56deb URL: https://github.com/llvm/llvm-project/commit/ff8288442dad15d66b7a22ccad94e926e2f56deb DIFF: https://github.com/llvm/llvm-project/commit/ff8288442dad15d66b7a22ccad94e926e2f56deb.diff LOG: [flang] Fix crash in error recovery (#140768) When a TYPE(*) dummy argument is erroneously used as a component value in a structure constructor, semantics crashes if the structure constructor had been initially parsed as a potential function reference. Clean out stale typed expressions when reanalyzing the reconstructed parse subtree to ensure that errors are caught the next time around. 
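The log message above describes bypassing saved (cached) analysis results while a subtree is reanalyzed. A minimal sketch of that idea follows, assuming a toy analyzer with a string-keyed cache; `ToyAnalyzer` and `CacheBypass` are hypothetical names for illustration and are not flang's `ExpressionAnalyzer` or its `DoNotUseSavedTypedExprs()`.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical miniature of an analyzer that caches one result per expression.
class ToyAnalyzer {
private:
  bool useCache_{true};
  std::map<std::string, std::string> cache_;

public:
  // RAII guard (illustrative): disables cache lookups for its lifetime and
  // restores the previous setting when it goes out of scope.
  class CacheBypass {
  public:
    explicit CacheBypass(ToyAnalyzer &analyzer)
        : analyzer_{analyzer}, saved_{analyzer.useCache_} {
      analyzer_.useCache_ = false;
    }
    ~CacheBypass() { analyzer_.useCache_ = saved_; }
    CacheBypass(const CacheBypass &) = delete;
    CacheBypass &operator=(const CacheBypass &) = delete;

  private:
    ToyAnalyzer &analyzer_;
    bool saved_;
  };

  // Returns the cached result when lookups are enabled; otherwise recomputes
  // the result for the given context and overwrites the cache entry.
  std::string Analyze(const std::string &expr, const std::string &context) {
    if (useCache_) {
      auto it = cache_.find(expr);
      if (it != cache_.end()) {
        return it->second;
      }
    }
    std::string result = context + ":" + expr;
    cache_[expr] = result;
    return result;
  }
};
```

In this sketch, a result first computed for `xx` as an actual argument would be reused unchanged for `xx` as a structure component unless a `CacheBypass` is alive during the second call, which is the kind of stale reuse the commit guards against.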
Added: flang/test/Semantics/bug869.f90 Modified: flang/lib/Semantics/expression.cpp Removed: ################################################################################ diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index b3ad608ee6744..d68e71f57f141 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -3376,6 +3376,10 @@ MaybeExpr ExpressionAnalyzer::Analyze(const parser::FunctionReference &funcRef, auto &mutableRef{const_cast(funcRef)}; *structureConstructor = mutableRef.ConvertToStructureConstructor(type.derivedTypeSpec()); + // Don't use saved typed expressions left over from argument + // analysis; they might not be valid structure components + // (e.g., a TYPE(*) argument) + auto restorer{DoNotUseSavedTypedExprs()}; return Analyze(structureConstructor->value()); } } @@ -4058,7 +4062,7 @@ MaybeExpr ExpressionAnalyzer::ExprOrVariable( // first to be sure. std::optional ctor; result = Analyze(funcRef->value(), &ctor); - if (result && ctor) { + if (ctor) { // A misparsed function reference is really a structure // constructor. Repair the parse tree in situ. const_cast(x).u = std::move(*ctor); diff --git a/flang/test/Semantics/bug869.f90 b/flang/test/Semantics/bug869.f90 new file mode 100644 index 0000000000000..ddc7dffcc2fa4 --- /dev/null +++ b/flang/test/Semantics/bug869.f90 @@ -0,0 +1,10 @@ +! RUN: %python %S/test_errors.py %s %flang_fc1 +! 
Regression test for crash +subroutine sub(xx) + type(*) :: xx + type ty + end type + type(ty) obj + !ERROR: TYPE(*) dummy argument may only be used as an actual argument + obj = ty(xx) +end From flang-commits at lists.llvm.org Wed May 28 13:59:54 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 28 May 2025 13:59:54 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix crash in error recovery (PR #140768) In-Reply-To: Message-ID: <6837794a.a70a0220.28047f.018f@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/140768 From flang-commits at lists.llvm.org Wed May 28 14:01:13 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 14:01:13 -0700 (PDT) Subject: [flang-commits] [flang] a6432b9 - [flang] Fix folding of SHAPE(SPREAD(source, dim, ncopies=-1)) (#141146) Message-ID: <68377999.170a0220.30c11c.0161@mx.google.com> Author: Peter Klausler Date: 2025-05-28T14:01:10-07:00 New Revision: a6432b971af1f38c604c9b835c5c52baf5c9b1f7 URL: https://github.com/llvm/llvm-project/commit/a6432b971af1f38c604c9b835c5c52baf5c9b1f7 DIFF: https://github.com/llvm/llvm-project/commit/a6432b971af1f38c604c9b835c5c52baf5c9b1f7.diff LOG: [flang] Fix folding of SHAPE(SPREAD(source,dim,ncopies=-1)) (#141146) The number of copies on the new dimension must be clamped via MAX(0, ncopies) so that it is no less than zero. Fixes https://github.com/llvm/llvm-project/issues/141119. 
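The clamping rule in the log message can be sketched outside the compiler as plain shape arithmetic. This is a simplified illustration on constant extents, not the code in flang/lib/Evaluate/shape.cpp; the function name `SpreadShape` and its signature are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Illustrative helper: shape of SPREAD(source, dim, ncopies) given the shape
// of source. The new extent is MAX(0, ncopies), inserted at 1-based position
// dim. Returns std::nullopt when dim is outside 1..rank+1.
std::optional<std::vector<std::int64_t>> SpreadShape(
    std::vector<std::int64_t> shape, int dim, std::int64_t ncopies) {
  if (dim < 1 || static_cast<std::size_t>(dim) > shape.size() + 1) {
    return std::nullopt;
  }
  shape.insert(shape.begin() + (dim - 1), std::max<std::int64_t>(0, ncopies));
  return shape;
}
```

With a scalar source (empty shape), `dim` of 1, and `ncopies` of -1, the sketch yields a one-element shape holding 0, which corresponds to the `test_shape_neg` case added to fold-spread.f90.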
Added: Modified: flang/lib/Evaluate/shape.cpp flang/test/Evaluate/fold-spread.f90 Removed: ################################################################################ diff --git a/flang/lib/Evaluate/shape.cpp b/flang/lib/Evaluate/shape.cpp index ac4811e9978eb..776866d1416d2 100644 --- a/flang/lib/Evaluate/shape.cpp +++ b/flang/lib/Evaluate/shape.cpp @@ -1073,8 +1073,8 @@ auto GetShapeHelper::operator()(const ProcedureRef &call) const -> Result { } } } else if (intrinsic->name == "spread") { - // SHAPE(SPREAD(ARRAY,DIM,NCOPIES)) = SHAPE(ARRAY) with NCOPIES inserted - // at position DIM. + // SHAPE(SPREAD(ARRAY,DIM,NCOPIES)) = SHAPE(ARRAY) with MAX(0,NCOPIES) + // inserted at position DIM. if (call.arguments().size() == 3) { auto arrayShape{ (*this)(UnwrapExpr>(call.arguments().at(0)))}; @@ -1086,7 +1086,8 @@ auto GetShapeHelper::operator()(const ProcedureRef &call) const -> Result { if (*dim >= 1 && static_cast(*dim) <= arrayShape->size() + 1) { arrayShape->emplace(arrayShape->begin() + *dim - 1, - ConvertToType(common::Clone(*nCopies))); + Extremum{Ordering::Greater, ExtentExpr{0}, + ConvertToType(common::Clone(*nCopies))}); return std::move(*arrayShape); } } diff --git a/flang/test/Evaluate/fold-spread.f90 b/flang/test/Evaluate/fold-spread.f90 index b7e493ee061c8..c8b87e8c87811 100644 --- a/flang/test/Evaluate/fold-spread.f90 +++ b/flang/test/Evaluate/fold-spread.f90 @@ -12,4 +12,5 @@ module m1 logical, parameter :: test_log4 = all(any(spread([.false., .true.], 2, 2), dim=2) .eqv. 
[.false., .true.]) logical, parameter :: test_m2toa3 = all(spread(reshape([(j,j=1,6)],[2,3]),1,4) == & reshape([((j,k=1,4),j=1,6)],[4,2,3])) + logical, parameter :: test_shape_neg = all(shape(spread(0,1,-1)) == [0]) end module From flang-commits at lists.llvm.org Wed May 28 14:01:16 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 28 May 2025 14:01:16 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix folding of SHAPE(SPREAD(source, dim, ncopies=-1)) (PR #141146) In-Reply-To: Message-ID: <6837799c.170a0220.95e07.014d@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/141146 From flang-commits at lists.llvm.org Wed May 28 14:01:31 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 14:01:31 -0700 (PDT) Subject: [flang-commits] [flang] 1457125 - [flang] Allow forward reference to non-default INTEGER dummy (#141254) Message-ID: <683779ab.170a0220.2a6532.00e0@mx.google.com> Author: Peter Klausler Date: 2025-05-28T14:01:28-07:00 New Revision: 145712555f6cbcfb4c7e07d5ee3459570c2a581a URL: https://github.com/llvm/llvm-project/commit/145712555f6cbcfb4c7e07d5ee3459570c2a581a DIFF: https://github.com/llvm/llvm-project/commit/145712555f6cbcfb4c7e07d5ee3459570c2a581a.diff LOG: [flang] Allow forward reference to non-default INTEGER dummy (#141254) A dummy argument with an explicit INTEGER type of non-default kind can be forward-referenced from a specification expression in many Fortran compilers. Handle by adding type declaration statements to the initial pass over a specification part's declaration constructs. Emit an optional warning under -pedantic. Fixes https://github.com/llvm/llvm-project/issues/140941. 
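The approach in the log message (an initial pass that notes explicitly typed INTEGER dummies, then a normal in-order pass that warns on a forward use) can be sketched as follows. The statement model, `Stmt`, and `ResolveWithEarlyTyping` are hypothetical simplifications for illustration; the real logic lives in flang/lib/Semantics/resolve-names.cpp and handles far more cases.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified statement model; not flang's parse tree.
struct Stmt {
  enum class Kind { Use, DeclareInteger } kind;
  std::string name;
};

// Returns the warnings a two-pass resolver might emit for one scope.
std::vector<std::string> ResolveWithEarlyTyping(
    const std::vector<std::string> &dummies, const std::vector<Stmt> &stmts) {
  const std::set<std::string> dummySet(dummies.begin(), dummies.end());

  // Pass 1: note every dummy argument that is given an explicit INTEGER type
  // somewhere in the specification part.
  std::set<std::string> earlyDeclared;
  for (const Stmt &s : stmts) {
    if (s.kind == Stmt::Kind::DeclareInteger && dummySet.count(s.name) > 0) {
      earlyDeclared.insert(s.name);
    }
  }

  // Pass 2: walk the statements in source order.
  std::vector<std::string> warnings;
  for (const Stmt &s : stmts) {
    if (s.kind == Stmt::Kind::DeclareInteger) {
      // The declaration has now been reached normally; later uses are fine.
      earlyDeclared.erase(s.name);
    } else if (earlyDeclared.erase(s.name) > 0) {
      // A use that precedes the declaration: warn once, then forget the name.
      warnings.push_back(
          "'" + s.name + "' was used before being explicitly typed");
    }
  }
  return warnings;
}
```

Erasing the name on its first forward use mirrors the "note, then forget" bookkeeping in the patch, so each early-typed dummy produces at most one warning in this sketch.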
Added: Modified: flang/docs/Extensions.md flang/include/flang/Support/Fortran-features.h flang/lib/Semantics/resolve-names.cpp flang/test/Semantics/OpenMP/linear-clause01.f90 flang/test/Semantics/resolve103.f90 Removed: ################################################################################ diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 1cc4881438cc1..51969de5ac7fe 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -291,7 +291,10 @@ end * DATA statement initialization is allowed for procedure pointers outside structure constructors. * Nonstandard intrinsic functions: ISNAN, SIZEOF -* A forward reference to a default INTEGER scalar dummy argument or +* A forward reference to an INTEGER dummy argument is permitted to appear + in a specification expression, such as an array bound, in a scope with + IMPLICIT NONE(TYPE). +* A forward reference to a default INTEGER scalar `COMMON` block variable is permitted to appear in a specification expression, such as an array bound, in a scope with IMPLICIT NONE(TYPE) if the name of the variable would have caused it to be implicitly typed diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 0e18eaedf2139..e696da9042480 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -55,7 +55,7 @@ ENUM_CLASS(LanguageFeature, BackslashEscapes, OldDebugLines, UndefinableAsynchronousOrVolatileActual, AutomaticInMainProgram, PrintCptr, SavedLocalInSpecExpr, PrintNamelist, AssumedRankPassedToNonAssumedRank, IgnoreIrrelevantAttributes, Unsigned, AmbiguousStructureConstructor, - ContiguousOkForSeqAssociation) + ContiguousOkForSeqAssociation, ForwardRefExplicitTypeDummy) // Portability and suspicious usage warnings ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 
93f2150365a1f..57035c57ee16f 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -768,10 +768,22 @@ class ScopeHandler : public ImplicitRulesVisitor { deferImplicitTyping_ = skipImplicitTyping_ = skip; } + void NoteEarlyDeclaredDummyArgument(Symbol &symbol) { + earlyDeclaredDummyArguments_.insert(symbol); + } + bool IsEarlyDeclaredDummyArgument(Symbol &symbol) { + return earlyDeclaredDummyArguments_.find(symbol) != + earlyDeclaredDummyArguments_.end(); + } + void ForgetEarlyDeclaredDummyArgument(Symbol &symbol) { + earlyDeclaredDummyArguments_.erase(symbol); + } + private: Scope *currScope_{nullptr}; FuncResultStack funcResultStack_{*this}; std::map deferred_; + UnorderedSymbolSet earlyDeclaredDummyArguments_; }; class ModuleVisitor : public virtual ScopeHandler { @@ -1970,6 +1982,9 @@ class ResolveNamesVisitor : public virtual ScopeHandler, Scope &topScope_; void PreSpecificationConstruct(const parser::SpecificationConstruct &); + void EarlyDummyTypeDeclaration( + const parser::Statement> + &); void CreateCommonBlockSymbols(const parser::CommonStmt &); void CreateObjectSymbols(const std::list &, Attr); void CreateGeneric(const parser::GenericSpec &); @@ -5605,6 +5620,7 @@ Symbol &DeclarationVisitor::DeclareUnknownEntity( } else { Symbol &symbol{DeclareEntity(name, attrs)}; if (auto *type{GetDeclTypeSpec()}) { + ForgetEarlyDeclaredDummyArgument(symbol); SetType(name, *type); } charInfo_.length.reset(); @@ -5681,6 +5697,7 @@ Symbol &DeclarationVisitor::DeclareProcEntity( symbol.set(Symbol::Flag::Subroutine); } } else if (auto *type{GetDeclTypeSpec()}) { + ForgetEarlyDeclaredDummyArgument(symbol); SetType(name, *type); symbol.set(Symbol::Flag::Function); } @@ -5695,6 +5712,7 @@ Symbol &DeclarationVisitor::DeclareObjectEntity( Symbol &symbol{DeclareEntity(name, attrs)}; if (auto *details{symbol.detailsIf()}) { if (auto *type{GetDeclTypeSpec()}) { + ForgetEarlyDeclaredDummyArgument(symbol); SetType(name, *type); } if 
(!arraySpec().empty()) { @@ -5705,9 +5723,11 @@ Symbol &DeclarationVisitor::DeclareObjectEntity( context().SetError(symbol); } } else if (MustBeScalar(symbol)) { - context().Warn(common::UsageWarning::PreviousScalarUse, name.source, - "'%s' appeared earlier as a scalar actual argument to a specification function"_warn_en_US, - name.source); + if (!context().HasError(symbol)) { + context().Warn(common::UsageWarning::PreviousScalarUse, name.source, + "'%s' appeared earlier as a scalar actual argument to a specification function"_warn_en_US, + name.source); + } } else if (details->init() || symbol.test(Symbol::Flag::InDataStmt)) { Say(name, "'%s' was initialized earlier as a scalar"_err_en_US); } else { @@ -8461,6 +8481,11 @@ const parser::Name *DeclarationVisitor::ResolveDataRef( x.u); } +static bool TypesMismatchIfNonNull( + const DeclTypeSpec *type1, const DeclTypeSpec *type2) { + return type1 && type2 && *type1 != *type2; +} + // If implicit types are allowed, ensure name is in the symbol table. // Otherwise, report an error if it hasn't been declared. 
const parser::Name *DeclarationVisitor::ResolveName(const parser::Name &name) { @@ -8482,13 +8507,30 @@ const parser::Name *DeclarationVisitor::ResolveName(const parser::Name &name) { symbol->set(Symbol::Flag::ImplicitOrError, false); if (IsUplevelReference(*symbol)) { MakeHostAssocSymbol(name, *symbol); - } else if (IsDummy(*symbol) || - (!symbol->GetType() && FindCommonBlockContaining(*symbol))) { + } else if (IsDummy(*symbol)) { CheckEntryDummyUse(name.source, symbol); + ConvertToObjectEntity(*symbol); + if (IsEarlyDeclaredDummyArgument(*symbol)) { + ForgetEarlyDeclaredDummyArgument(*symbol); + if (isImplicitNoneType()) { + context().Warn(common::LanguageFeature::ForwardRefImplicitNone, + name.source, + "'%s' was used under IMPLICIT NONE(TYPE) before being explicitly typed"_warn_en_US, + name.source); + } else if (TypesMismatchIfNonNull( + symbol->GetType(), GetImplicitType(*symbol))) { + context().Warn(common::LanguageFeature::ForwardRefExplicitTypeDummy, + name.source, + "'%s' was used before being explicitly typed (and its implicit type would differ)"_warn_en_US, + name.source); + } + } + ApplyImplicitRules(*symbol); + } else if (!symbol->GetType() && FindCommonBlockContaining(*symbol)) { ConvertToObjectEntity(*symbol); ApplyImplicitRules(*symbol); } else if (const auto *tpd{symbol->detailsIf()}; - tpd && !tpd->attr()) { + tpd && !tpd->attr()) { Say(name, "Type parameter '%s' was referenced before being declared"_err_en_US, name.source); @@ -9031,11 +9073,6 @@ static bool IsLocallyImplicitGlobalSymbol( return false; } -static bool TypesMismatchIfNonNull( - const DeclTypeSpec *type1, const DeclTypeSpec *type2) { - return type1 && type2 && *type1 != *type2; -} - // Check and set the Function or Subroutine flag on symbol; false on error. 
bool ResolveNamesVisitor::SetProcFlag( const parser::Name &name, Symbol &symbol, Symbol::Flag flag) { @@ -9252,6 +9289,10 @@ void ResolveNamesVisitor::PreSpecificationConstruct( const parser::SpecificationConstruct &spec) { common::visit( common::visitors{ + [&](const parser::Statement< + common::Indirection> &y) { + EarlyDummyTypeDeclaration(y); + }, [&](const parser::Statement> &y) { CreateGeneric(std::get(y.statement.value().t)); }, @@ -9280,6 +9321,44 @@ void ResolveNamesVisitor::PreSpecificationConstruct( spec.u); } +void ResolveNamesVisitor::EarlyDummyTypeDeclaration( + const parser::Statement> + &stmt) { + context().set_location(stmt.source); + const auto &[declTypeSpec, attrs, entities] = stmt.statement.value().t; + if (const auto *intrin{ + std::get_if(&declTypeSpec.u)}) { + if (const auto *intType{std::get_if(&intrin->u)}) { + if (const auto &kind{intType->v}) { + if (!parser::Unwrap(*kind) && + !parser::Unwrap(*kind)) { + return; + } + } + const DeclTypeSpec *type{nullptr}; + for (const auto &ent : entities) { + const auto &objName{std::get(ent.t)}; + Resolve(objName, FindInScope(currScope(), objName)); + if (Symbol * symbol{objName.symbol}; + symbol && IsDummy(*symbol) && NeedsType(*symbol)) { + if (!type) { + type = ProcessTypeSpec(declTypeSpec); + if (!type || !type->IsNumeric(TypeCategory::Integer)) { + break; + } + } + symbol->SetType(*type); + NoteEarlyDeclaredDummyArgument(*symbol); + // Set the Implicit flag to disable bogus errors from + // being emitted later when this declaration is processed + // again normally. 
+ symbol->set(Symbol::Flag::Implicit); + } + } + } + } +} + void ResolveNamesVisitor::CreateCommonBlockSymbols( const parser::CommonStmt &commonStmt) { for (const parser::CommonStmt::Block &block : commonStmt.blocks) { diff --git a/flang/test/Semantics/OpenMP/linear-clause01.f90 b/flang/test/Semantics/OpenMP/linear-clause01.f90 index f95e834c9026c..286def2dba119 100644 --- a/flang/test/Semantics/OpenMP/linear-clause01.f90 +++ b/flang/test/Semantics/OpenMP/linear-clause01.f90 @@ -20,10 +20,8 @@ subroutine linear_clause_02(arg_01, arg_02) !$omp declare simd linear(val(arg_01)) real, intent(in) :: arg_01(:) - !ERROR: The list item 'arg_02' specified without the REF 'linear-modifier' must be of INTEGER type !ERROR: If the `linear-modifier` is REF or UVAL, the list item 'arg_02' must be a dummy argument without the VALUE attribute !$omp declare simd linear(uval(arg_02)) - !ERROR: The type of 'arg_02' has already been implicitly declared integer, value, intent(in) :: arg_02 !ERROR: The list item 'var' specified without the REF 'linear-modifier' must be of INTEGER type diff --git a/flang/test/Semantics/resolve103.f90 b/flang/test/Semantics/resolve103.f90 index 8f55968f43375..0acf2333b9586 100644 --- a/flang/test/Semantics/resolve103.f90 +++ b/flang/test/Semantics/resolve103.f90 @@ -1,8 +1,7 @@ ! RUN: not %flang_fc1 -pedantic %s 2>&1 | FileCheck %s ! Test extension: allow forward references to dummy arguments or COMMON ! from specification expressions in scopes with IMPLICIT NONE(TYPE), -! as long as those symbols are eventually typed later with the -! same integer type they would have had without IMPLICIT NONE. +! as long as those symbols are eventually typed later. 
!CHECK: warning: 'n1' was used without (or before) being explicitly typed !CHECK: error: No explicit type declared for dummy argument 'n1' @@ -19,12 +18,15 @@ subroutine foo2(a, n2) double precision n2 end -!CHECK: warning: 'n3' was used without (or before) being explicitly typed -!CHECK-NOT: error: Dummy argument 'n3' -subroutine foo3(a, n3) +!CHECK: warning: 'n3a' was used under IMPLICIT NONE(TYPE) before being explicitly typed +!CHECK: warning: 'n3b' was used under IMPLICIT NONE(TYPE) before being explicitly typed +!CHECK-NOT: error: Dummy argument 'n3a' +!CHECK-NOT: error: Dummy argument 'n3b' +subroutine foo3(a, n3a, n3b) implicit none - real a(n3) - integer n3 + integer a(n3a, n3b) + integer n3a + integer(8) n3b end !CHECK: warning: 'n4' was used without (or before) being explicitly typed From flang-commits at lists.llvm.org Wed May 28 14:01:35 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 28 May 2025 14:01:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Allow forward reference to non-default INTEGER dummy (PR #141254) In-Reply-To: Message-ID: <683779af.630a0220.52d4b.b987@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/141254 From flang-commits at lists.llvm.org Wed May 28 14:01:58 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 14:01:58 -0700 (PDT) Subject: [flang-commits] [flang] f48f844 - [flang] Fix prescanner bug w/ empty macros in line continuation (#141274) Message-ID: <683779c6.a70a0220.7b245.010e@mx.google.com> Author: Peter Klausler Date: 2025-05-28T14:01:55-07:00 New Revision: f48f844f3ccef16dc5bb3ac6e7be582f01d45317 URL: https://github.com/llvm/llvm-project/commit/f48f844f3ccef16dc5bb3ac6e7be582f01d45317 DIFF: https://github.com/llvm/llvm-project/commit/f48f844f3ccef16dc5bb3ac6e7be582f01d45317.diff LOG: [flang] Fix prescanner bug w/ empty macros in line continuation (#141274) When processing free form source line 
continuation, the prescanner treats empty keyword macros as if they were spaces or tabs. After skipping over them, however, there's code that only works if the skipped characters ended with an actual space or tab. If the last skipped item was an empty keyword macro's name, the last character of that name would end up being the first character of the continuation line. Fix. Added: flang/test/Preprocessing/bug890.F90 Modified: flang/lib/Parser/prescan.cpp Removed: ################################################################################ diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp index 3bc2ea0b37508..9aef0c9981e3c 100644 --- a/flang/lib/Parser/prescan.cpp +++ b/flang/lib/Parser/prescan.cpp @@ -1473,7 +1473,7 @@ const char *Prescanner::FreeFormContinuationLine(bool ampersand) { GetProvenanceRange(p, p + 1), "Character literal continuation line should have been preceded by '&'"_port_en_US); } - } else if (p > lineStart) { + } else if (p > lineStart && IsSpaceOrTab(p - 1)) { --p; } else { insertASpace_ = true; diff --git a/flang/test/Preprocessing/bug890.F90 b/flang/test/Preprocessing/bug890.F90 new file mode 100644 index 0000000000000..0ce2d8c3f1569 --- /dev/null +++ b/flang/test/Preprocessing/bug890.F90 @@ -0,0 +1,6 @@ +! 
RUN: %flang -E %s 2>&1 | FileCheck %s +!CHECK: subroutine sub() +#define empty +subroutine sub ( & + empty) +end From flang-commits at lists.llvm.org Wed May 28 14:02:56 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Wed, 28 May 2025 14:02:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix prescanner bug w/ empty macros in line continuation (PR #141274) In-Reply-To: Message-ID: <68377a00.170a0220.3836f7.04cc@mx.google.com> https://github.com/klausler closed https://github.com/llvm/llvm-project/pull/141274 From flang-commits at lists.llvm.org Wed May 28 14:05:06 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Wed, 28 May 2025 14:05:06 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow polymorphic entity in data clauses (PR #141856) Message-ID: https://github.com/clementval created https://github.com/llvm/llvm-project/pull/141856 - Attach the mappable interface to ClassType - Remove the TODO since derived type in descriptor can be handled as other descriptors >From f9fa71463a07263d992298479af46a5bf9329473 Mon Sep 17 00:00:00 2001 From: Valentin Clement Date: Wed, 28 May 2025 14:02:38 -0700 Subject: [PATCH] [flang][openacc] Allow polymorphic entity in data clauses --- .../Optimizer/Builder/DirectivesCommon.h | 3 --- .../OpenACC/RegisterOpenACCExtensions.cpp | 2 ++ flang/test/Lower/OpenACC/acc-enter-data.f90 | 20 +++++++++++++++++++ 3 files changed, 22 insertions(+), 3 deletions(-) diff --git a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h index 3f30c761acb4e..f71a2ccd07bfd 100644 --- a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h +++ b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h @@ -75,9 +75,6 @@ inline AddrAndBoundsInfo getDataOperandBaseAddr(fir::FirOpBuilder &builder, if (auto boxTy = mlir::dyn_cast( 
fir::unwrapRefType(symAddr.getType()))) { - if (mlir::isa(boxTy.getEleTy())) - TODO(loc, "derived type"); - // In case of a box reference, load it here to get the box value. // This is preferrable because then the same box value can then be used for // all address/dimension retrievals. For Fortran optional though, leave diff --git a/flang/lib/Optimizer/OpenACC/RegisterOpenACCExtensions.cpp b/flang/lib/Optimizer/OpenACC/RegisterOpenACCExtensions.cpp index 16115c7cb6e59..5f174ad4b40fe 100644 --- a/flang/lib/Optimizer/OpenACC/RegisterOpenACCExtensions.cpp +++ b/flang/lib/Optimizer/OpenACC/RegisterOpenACCExtensions.cpp @@ -22,6 +22,8 @@ void registerOpenACCExtensions(mlir::DialectRegistry &registry) { fir::SequenceType::attachInterface>( *ctx); fir::BoxType::attachInterface>(*ctx); + fir::ClassType::attachInterface>( + *ctx); fir::ReferenceType::attachInterface< OpenACCPointerLikeModel>(*ctx); diff --git a/flang/test/Lower/OpenACC/acc-enter-data.f90 b/flang/test/Lower/OpenACC/acc-enter-data.f90 index f7396660a6d3c..2718c96a563fb 100644 --- a/flang/test/Lower/OpenACC/acc-enter-data.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data.f90 @@ -2,6 +2,14 @@ ! RUN: bbc -fopenacc -emit-hlfir %s -o - | FileCheck %s +module mod1 + implicit none + type :: derived + real :: m + contains + end type derived +end module mod1 + subroutine acc_enter_data integer :: async = 1 real, dimension(10, 10) :: a, b, c @@ -651,3 +659,15 @@ subroutine acc_enter_data_single_array_element() !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap>) end subroutine + +subroutine test_class(a) + use mod1 + class(derived) :: a + !$acc enter data copyin(a) +end subroutine + +! CHECK-LABEL: func.func @_QPtest_class( +! CHECK-SAME: %[[ARG0:.*]]: !fir.class> {fir.bindc_name = "a"}) { +! CHECK: %[[DECL_ARG0:.*]]:2 = hlfir.declare %[[ARG0]] dummy_scope %0 {uniq_name = "_QFtest_classEa"} : (!fir.class>, !fir.dscope) -> (!fir.class>, !fir.class>) +! 
CHECK: %[[COPYIN:.*]] = acc.copyin var(%[[DECL_ARG0]]#0 : !fir.class>) -> !fir.class> {name = "a", structured = false} +! CHECK: acc.enter_data dataOperands(%[[COPYIN]] : !fir.class>) From flang-commits at lists.llvm.org Wed May 28 14:05:40 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 14:05:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow polymorphic entity in data clauses (PR #141856) In-Reply-To: Message-ID: <68377aa4.050a0220.208c8d.01a4@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Valentin Clement (バレンタイン クレメン) (clementval)
Changes - Attach the mappable interface to ClassType - Remove the TODO since derived type in descriptor can be handled as other descriptors --- Full diff: https://github.com/llvm/llvm-project/pull/141856.diff 3 Files Affected: - (modified) flang/include/flang/Optimizer/Builder/DirectivesCommon.h (-3) - (modified) flang/lib/Optimizer/OpenACC/RegisterOpenACCExtensions.cpp (+2) - (modified) flang/test/Lower/OpenACC/acc-enter-data.f90 (+20) ``````````diff diff --git a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h index 3f30c761acb4e..f71a2ccd07bfd 100644 --- a/flang/include/flang/Optimizer/Builder/DirectivesCommon.h +++ b/flang/include/flang/Optimizer/Builder/DirectivesCommon.h @@ -75,9 +75,6 @@ inline AddrAndBoundsInfo getDataOperandBaseAddr(fir::FirOpBuilder &builder, if (auto boxTy = mlir::dyn_cast( fir::unwrapRefType(symAddr.getType()))) { - if (mlir::isa(boxTy.getEleTy())) - TODO(loc, "derived type"); - // In case of a box reference, load it here to get the box value. // This is preferrable because then the same box value can then be used for // all address/dimension retrievals. 
For Fortran optional though, leave diff --git a/flang/lib/Optimizer/OpenACC/RegisterOpenACCExtensions.cpp b/flang/lib/Optimizer/OpenACC/RegisterOpenACCExtensions.cpp index 16115c7cb6e59..5f174ad4b40fe 100644 --- a/flang/lib/Optimizer/OpenACC/RegisterOpenACCExtensions.cpp +++ b/flang/lib/Optimizer/OpenACC/RegisterOpenACCExtensions.cpp @@ -22,6 +22,8 @@ void registerOpenACCExtensions(mlir::DialectRegistry &registry) { fir::SequenceType::attachInterface>( *ctx); fir::BoxType::attachInterface>(*ctx); + fir::ClassType::attachInterface>( + *ctx); fir::ReferenceType::attachInterface< OpenACCPointerLikeModel>(*ctx); diff --git a/flang/test/Lower/OpenACC/acc-enter-data.f90 b/flang/test/Lower/OpenACC/acc-enter-data.f90 index f7396660a6d3c..2718c96a563fb 100644 --- a/flang/test/Lower/OpenACC/acc-enter-data.f90 +++ b/flang/test/Lower/OpenACC/acc-enter-data.f90 @@ -2,6 +2,14 @@ ! RUN: bbc -fopenacc -emit-hlfir %s -o - | FileCheck %s +module mod1 + implicit none + type :: derived + real :: m + contains + end type derived +end module mod1 + subroutine acc_enter_data integer :: async = 1 real, dimension(10, 10) :: a, b, c @@ -651,3 +659,15 @@ subroutine acc_enter_data_single_array_element() !CHECK: acc.enter_data dataOperands(%[[CREATE]] : !fir.heap>) end subroutine + +subroutine test_class(a) + use mod1 + class(derived) :: a + !$acc enter data copyin(a) +end subroutine + +! CHECK-LABEL: func.func @_QPtest_class( +! CHECK-SAME: %[[ARG0:.*]]: !fir.class> {fir.bindc_name = "a"}) { +! CHECK: %[[DECL_ARG0:.*]]:2 = hlfir.declare %[[ARG0]] dummy_scope %0 {uniq_name = "_QFtest_classEa"} : (!fir.class>, !fir.dscope) -> (!fir.class>, !fir.class>) +! CHECK: %[[COPYIN:.*]] = acc.copyin var(%[[DECL_ARG0]]#0 : !fir.class>) -> !fir.class> {name = "a", structured = false} +! CHECK: acc.enter_data dataOperands(%[[COPYIN]] : !fir.class>) ``````````
https://github.com/llvm/llvm-project/pull/141856 From flang-commits at lists.llvm.org Wed May 28 14:56:17 2025 From: flang-commits at lists.llvm.org (Razvan Lupusoru via flang-commits) Date: Wed, 28 May 2025 14:56:17 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openacc] Allow polymorphic entity in data clauses (PR #141856) In-Reply-To: Message-ID: <68378681.170a0220.274780.06ec@mx.google.com> https://github.com/razvanlupusoru approved this pull request. Thank you! :) https://github.com/llvm/llvm-project/pull/141856 From flang-commits at lists.llvm.org Wed May 28 20:09:46 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Wed, 28 May 2025 20:09:46 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6837cffa.170a0220.d8a6e.0d03@mx.google.com> fanju110 wrote: > @fanju110, Thanks for seeing this through! Hi @tarunprabhu , If everything looks good, could you please merge this PR at your convenience (I don't have merge rights)? Thank you! https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Wed May 28 20:16:40 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Wed, 28 May 2025 20:16:40 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][rt] Enable Count and CountDim for device build (PR #141684) In-Reply-To: Message-ID: <6837d198.170a0220.111914.0d62@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `ppc64-flang-aix` running on `ppc64-flang-aix-test` while building `flang-rt` at step 5 "build-unified-tree". Full details are available at: https://lab.llvm.org/buildbot/#/builders/201/builds/4830
Here is the relevant piece of the build log for the reference ``` Step 5 (build-unified-tree) failure: build (failure) ... 2.444 [18/162/83] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/ucmpdi2.c.o 2.449 [17/162/84] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/udivdi3.c.o 2.459 [16/162/85] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/udivmoddi4.c.o 2.466 [15/162/86] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/udivmodsi4.c.o 2.472 [14/162/87] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/udivmodti4.c.o 2.477 [13/162/88] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/udivsi3.c.o 2.485 [12/162/89] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/udivti3.c.o 2.508 [11/162/90] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/umoddi3.c.o 2.525 [10/162/91] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/umodsi3.c.o 2.532 [9/162/92] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/muldc3.c.o FAILED: CMakeFiles/clang_rt.builtins-powerpc.dir/muldc3.c.o /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/build/./bin/clang --target=powerpc64-ibm-aix -DVISIBILITY_HIDDEN -O3 -DNDEBUG -m32 -fno-lto -std=c11 -fPIC -fno-builtin -fvisibility=hidden -fomit-frame-pointer -MD -MT CMakeFiles/clang_rt.builtins-powerpc.dir/muldc3.c.o -MF CMakeFiles/clang_rt.builtins-powerpc.dir/muldc3.c.o.d -o CMakeFiles/clang_rt.builtins-powerpc.dir/muldc3.c.o -c /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/llvm-project/compiler-rt/lib/builtins/muldc3.c fatal error: error in backend: Cannot select: intrinsic %llvm.ppc.altivec.vupklsw PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. 
Program arguments: /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/build/./bin/clang --target=powerpc64-ibm-aix -DVISIBILITY_HIDDEN -O3 -DNDEBUG -m32 -fno-lto -std=c11 -fPIC -fno-builtin -fvisibility=hidden -fomit-frame-pointer -MD -MT CMakeFiles/clang_rt.builtins-powerpc.dir/muldc3.c.o -MF CMakeFiles/clang_rt.builtins-powerpc.dir/muldc3.c.o.d -o CMakeFiles/clang_rt.builtins-powerpc.dir/muldc3.c.o -c /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/llvm-project/compiler-rt/lib/builtins/muldc3.c 1. parser at end of file 2. Code generation 3. Running pass 'Function Pass Manager' on module '/home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/llvm-project/compiler-rt/lib/builtins/muldc3.c'. 4. Running pass 'PowerPC DAG->DAG Pattern Instruction Selection' on function '@__muldc3' clang: error: clang frontend command failed with exit code 70 (use -v to see invocation) clang version 21.0.0git Target: powerpc-ibm-aix Thread model: posix InstalledDir: /home/llvm/llvm-external-buildbots/workers/ppc64-flang-aix-test/ppc64-flang-aix-build/build/bin Build config: +assertions clang: note: diagnostic msg: ******************** PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT: Preprocessed source(s) and associated run script(s) are located at: clang: note: diagnostic msg: /tmp/muldc3-a038d0.c clang: note: diagnostic msg: /tmp/muldc3-a038d0.sh clang: note: diagnostic msg: ******************** 2.540 [9/161/93] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/umodti3.c.o 2.548 [9/160/94] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/emutls.c.o 2.551 [9/159/95] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/enable_execute_stack.c.o 2.552 [9/158/96] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/gcc_personality_v0.c.o 2.554 [9/157/97] Building C object CMakeFiles/clang_rt.builtins-powerpc.dir/clear_cache.c.o 2.556 [9/156/98] 
Building C object CMakeFiles/clang_rt.builtins-powerpc64.dir/ppc/fixunstfdi.c.o 2.556 [9/155/99] Building C object CMakeFiles/clang_rt.builtins-powerpc64.dir/ppc/floatunditf.c.o 2.558 [9/154/100] Building C object CMakeFiles/clang_rt.builtins-powerpc64.dir/ppc/gcc_qadd.c.o 2.577 [9/153/101] Building C object CMakeFiles/clang_rt.builtins-powerpc64.dir/absvdi2.c.o 2.588 [9/152/102] Building C object CMakeFiles/clang_rt.builtins-powerpc64.dir/ppc/gcc_qmul.c.o 2.619 [9/151/103] Building C object CMakeFiles/clang_rt.builtins-powerpc64.dir/ppc/divtc3.c.o 2.624 [9/150/104] Building C object CMakeFiles/clang_rt.builtins-powerpc64.dir/absvsi2.c.o 2.633 [9/149/105] Building C object CMakeFiles/clang_rt.builtins-powerpc64.dir/apple_versioning.c.o ```
https://github.com/llvm/llvm-project/pull/141684 From flang-commits at lists.llvm.org Thu May 29 00:37:39 2025 From: flang-commits at lists.llvm.org (Paul Osmialowski via flang-commits) Date: Thu, 29 May 2025 00:37:39 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68380ec3.170a0220.1f28b9.1d5d@mx.google.com> pawosm-arm wrote: > @pawosm-arm -- Sorry to tag you explicitly, I do not have access to formally request reviews using the Reviewers section. No worries, I will look into it today. https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 00:40:53 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 00:40:53 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68380f85.170a0220.6b9d9.1ee0@mx.google.com> github-actions[bot] wrote: :warning: Python code formatter, darker found issues in your code. :warning:
You can test this locally with the following command: ``````````bash darker --check --diff -r HEAD~1...HEAD flang/docs/conf.py ``````````
View the diff from darker here. ``````````diff --- conf.py 2025-05-29 00:21:16.000000 +0000 +++ conf.py 2025-05-29 07:40:22.608149 +0000 @@ -225,13 +225,11 @@ # -- Options for manual page output -------------------------------------------- # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [ - ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) -] +man_pages = [("index", "flang", "Flang Documentation", ["Flang Contributors"], 1)] # If true, show URL addresses after external links. # man_show_urls = False ``````````
https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 02:04:31 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 02:04:31 -0700 (PDT) Subject: [flang-commits] [flang] e33cd96 - [flang][fir] Basic PFT to MLIR lowering for do concurrent locality specifiers (#138534) Message-ID: <6838231f.050a0220.327f9f.1f9b@mx.google.com> Author: Kareem Ergawy Date: 2025-05-29T11:04:27+02:00 New Revision: e33cd9690fe11305acd7df35532d712844b9049e URL: https://github.com/llvm/llvm-project/commit/e33cd9690fe11305acd7df35532d712844b9049e DIFF: https://github.com/llvm/llvm-project/commit/e33cd9690fe11305acd7df35532d712844b9049e.diff LOG: [flang][fir] Basic PFT to MLIR lowering for do concurrent locality specifiers (#138534) Extends support for `fir.do_concurrent` locality specifiers to the PFT to MLIR level. This adds code-gen for generating the newly added `fir.local` ops and referencing these ops from `fir.do_concurrent.loop` ops that have locality specifiers attached to them. This reuses the `DataSharingProcessor` component and generalizes it a bit more to allow for handling `omp.private` ops and `fir.local` ops as well. 
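The generalization described above can be pictured with a small, self-contained sketch. It is not the Flang implementation: `OmpPrivateOp`, `FirLocalOp`, `OmpOperands`, and `LocalityOperands` are hypothetical stand-ins for the real MLIR op and operand-struct types, and the template parameter order is an assumption, since this archive has dropped the template argument lists from the diff. The sketch only shows the general technique of sharing one routine between two op kinds and branching with `if constexpr` where they differ.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins; these are NOT the real MLIR/Flang classes.
struct OmpPrivateOp {
  std::string name;
  std::string kind;
};
struct FirLocalOp {
  std::string name;
  std::string kind;
};

// Hypothetical operand collectors, one per dialect.
struct OmpOperands {
  std::vector<std::string> privateVars;
  std::vector<std::string> privateSyms;
};
struct LocalityOperands {
  std::vector<std::string> privateVars;
  std::vector<std::string> privateSyms;
};

// One routine serves both op kinds; only the creation step differs, and the
// unused branch is discarded at compile time by `if constexpr`.
template <typename OpType, typename OperandsType>
OpType privatizeSymbol(const std::string &sym, bool copiesFromHost,
                       OperandsType &operands) {
  OpType op;
  op.name = sym + ".privatizer";
  if constexpr (std::is_same_v<OpType, OmpPrivateOp>)
    op.kind = copiesFromHost ? "firstprivate" : "private";
  else
    op.kind = copiesFromHost ? "local_init" : "local";

  // Record the variable and the symbol of its privatizer for the caller.
  operands.privateVars.push_back(sym);
  operands.privateSyms.push_back(op.name);
  return op;
}
```

A caller would instantiate it once per op kind, for example `privatizeSymbol<FirLocalOp>("local_var", false, ops)` with a `LocalityOperands ops;`, and then attach the collected operands to the loop op.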
PR stack: - https://github.com/llvm/llvm-project/pull/137928 - https://github.com/llvm/llvm-project/pull/138505 - https://github.com/llvm/llvm-project/pull/138506 - https://github.com/llvm/llvm-project/pull/138512 - https://github.com/llvm/llvm-project/pull/138534 (this PR) - https://github.com/llvm/llvm-project/pull/138816 Added: flang/test/Lower/do_concurrent_delayed_locality.f90 Modified: flang/include/flang/Lower/AbstractConverter.h flang/include/flang/Optimizer/Dialect/FIROps.h flang/include/flang/Optimizer/Dialect/FIROps.td flang/lib/Lower/Bridge.cpp flang/lib/Lower/OpenMP/DataSharingProcessor.cpp flang/lib/Lower/OpenMP/DataSharingProcessor.h Removed: ################################################################################ diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h index 1d1323642bf9c..8ae68e143cd2f 100644 --- a/flang/include/flang/Lower/AbstractConverter.h +++ b/flang/include/flang/Lower/AbstractConverter.h @@ -348,6 +348,10 @@ class AbstractConverter { virtual Fortran::lower::SymbolBox lookupOneLevelUpSymbol(const Fortran::semantics::Symbol &sym) = 0; + /// Find the symbol in the inner-most level of the local map or return null. + virtual Fortran::lower::SymbolBox + shallowLookupSymbol(const Fortran::semantics::Symbol &sym) = 0; + /// Return the mlir::SymbolTable associated to the ModuleOp. 
/// Look-ups are faster using it than using module.lookup<>, /// but the module op should be queried in case of failure diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.h b/flang/include/flang/Optimizer/Dialect/FIROps.h index 1bed227afb50d..62ef8b4b502f2 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.h +++ b/flang/include/flang/Optimizer/Dialect/FIROps.h @@ -147,6 +147,10 @@ class CoordinateIndicesAdaptor { mlir::ValueRange values; }; +struct LocalitySpecifierOperands { + llvm::SmallVector<::mlir::Value> privateVars; + llvm::SmallVector<::mlir::Attribute> privateSyms; +}; } // namespace fir #endif // FORTRAN_OPTIMIZER_DIALECT_FIROPS_H diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td index dc66885f776f0..f4b17ef7eed09 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.td +++ b/flang/include/flang/Optimizer/Dialect/FIROps.td @@ -3605,6 +3605,21 @@ def fir_LocalitySpecifierOp : fir_Op<"local", [IsolatedFromAbove]> { ]; let extraClassDeclaration = [{ + mlir::BlockArgument getInitMoldArg() { + auto &region = getInitRegion(); + return region.empty() ? nullptr : region.getArgument(0); + } + mlir::BlockArgument getInitPrivateArg() { + auto &region = getInitRegion(); + return region.empty() ? nullptr : region.getArgument(1); + } + + /// Returns true if the init region might read from the mold argument + bool initReadsFromMold() { + mlir::BlockArgument moldArg = getInitMoldArg(); + return moldArg && !moldArg.use_empty(); + } + /// Get the type for arguments to nested regions. This should /// generally be either the same as getType() or some pointer /// type (pointing to the type allocated by this op). 
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index c9e91cf3e8042..49675d34215a9 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -12,6 +12,8 @@ #include "flang/Lower/Bridge.h" +#include "OpenMP/DataSharingProcessor.h" +#include "OpenMP/Utils.h" #include "flang/Lower/Allocatable.h" #include "flang/Lower/CallInterface.h" #include "flang/Lower/Coarray.h" @@ -1142,6 +1144,14 @@ class FirConverter : public Fortran::lower::AbstractConverter { return name; } + /// Find the symbol in the inner-most level of the local map or return null. + Fortran::lower::SymbolBox + shallowLookupSymbol(const Fortran::semantics::Symbol &sym) override { + if (Fortran::lower::SymbolBox v = localSymbols.shallowLookupSymbol(sym)) + return v; + return {}; + } + private: FirConverter() = delete; FirConverter(const FirConverter &) = delete; @@ -1216,14 +1226,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { return {}; } - /// Find the symbol in the inner-most level of the local map or return null. - Fortran::lower::SymbolBox - shallowLookupSymbol(const Fortran::semantics::Symbol &sym) { - if (Fortran::lower::SymbolBox v = localSymbols.shallowLookupSymbol(sym)) - return v; - return {}; - } - /// Find the symbol in one level up of symbol map such as for host-association /// in OpenMP code or return null. Fortran::lower::SymbolBox @@ -2027,9 +2029,34 @@ class FirConverter : public Fortran::lower::AbstractConverter { void handleLocalitySpecs(const IncrementLoopInfo &info) { Fortran::semantics::SemanticsContext &semanticsContext = bridge.getSemanticsContext(); - for (const Fortran::semantics::Symbol *sym : info.localSymList) + // TODO Extract `DataSharingProcessor` from omp to a more general location. 
+ Fortran::lower::omp::DataSharingProcessor dsp( + *this, semanticsContext, getEval(), + /*useDelayedPrivatization=*/true, localSymbols); + fir::LocalitySpecifierOperands privateClauseOps; + auto doConcurrentLoopOp = + mlir::dyn_cast_if_present(info.loopOp); + // TODO Promote to using `enableDelayedPrivatization` (which is enabled by + // default unlike the staging flag) once the implementation of this is more + // complete. + bool useDelayedPriv = + enableDelayedPrivatizationStaging && doConcurrentLoopOp; + + for (const Fortran::semantics::Symbol *sym : info.localSymList) { + if (useDelayedPriv) { + dsp.privatizeSymbol(sym, &privateClauseOps); + continue; + } + createHostAssociateVarClone(*sym, /*skipDefaultInit=*/false); + } + for (const Fortran::semantics::Symbol *sym : info.localInitSymList) { + if (useDelayedPriv) { + dsp.privatizeSymbol(sym, &privateClauseOps); + continue; + } + createHostAssociateVarClone(*sym, /*skipDefaultInit=*/true); const auto *hostDetails = sym->detailsIf(); @@ -2048,6 +2075,24 @@ class FirConverter : public Fortran::lower::AbstractConverter { sym->detailsIf(); copySymbolBinding(hostDetails->symbol(), *sym); } + + if (useDelayedPriv) { + doConcurrentLoopOp.getLocalVarsMutable().assign( + privateClauseOps.privateVars); + doConcurrentLoopOp.setLocalSymsAttr( + builder->getArrayAttr(privateClauseOps.privateSyms)); + + for (auto [sym, privateVar] : llvm::zip_equal( + dsp.getAllSymbolsToPrivatize(), privateClauseOps.privateVars)) { + auto arg = doConcurrentLoopOp.getRegion().begin()->addArgument( + privateVar.getType(), doConcurrentLoopOp.getLoc()); + bindSymbol(*sym, hlfir::translateToExtendedValue( + privateVar.getLoc(), *builder, hlfir::Entity{arg}, + /*contiguousHint=*/true) + .first); + } + } + // Note that allocatable, types with ultimate components, and type // requiring finalization are forbidden in LOCAL/LOCAL_INIT (F2023 C1130), // so no clean-up needs to be generated for these entities. 
diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 20dc46e4710fb..629478294ef5b 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -20,6 +20,7 @@ #include "flang/Optimizer/Builder/BoxValue.h" #include "flang/Optimizer/Builder/HLFIRTools.h" #include "flang/Optimizer/Builder/Todo.h" +#include "flang/Optimizer/Dialect/FIROps.h" #include "flang/Optimizer/HLFIR/HLFIRDialect.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Semantics/attr.h" @@ -53,6 +54,15 @@ DataSharingProcessor::DataSharingProcessor( }); } +DataSharingProcessor::DataSharingProcessor(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + lower::pft::Evaluation &eval, + bool useDelayedPrivatization, + lower::SymMap &symTable) + : DataSharingProcessor(converter, semaCtx, {}, eval, + /*shouldCollectPreDeterminedSymols=*/false, + useDelayedPrivatization, symTable) {} + void DataSharingProcessor::processStep1( mlir::omp::PrivateClauseOps *clauseOps) { collectSymbolsForPrivatization(); @@ -174,7 +184,8 @@ void DataSharingProcessor::cloneSymbol(const semantics::Symbol *sym) { void DataSharingProcessor::copyFirstPrivateSymbol( const semantics::Symbol *sym, mlir::OpBuilder::InsertPoint *copyAssignIP) { - if (sym->test(semantics::Symbol::Flag::OmpFirstPrivate)) + if (sym->test(semantics::Symbol::Flag::OmpFirstPrivate) || + sym->test(semantics::Symbol::Flag::LocalityLocalInit)) converter.copyHostAssociateVar(*sym, copyAssignIP); } @@ -497,9 +508,9 @@ void DataSharingProcessor::privatize(mlir::omp::PrivateClauseOps *clauseOps) { if (const auto *commonDet = sym->detailsIf()) { for (const auto &mem : commonDet->objects()) - doPrivatize(&*mem, clauseOps); + privatizeSymbol(&*mem, clauseOps); } else - doPrivatize(sym, clauseOps); + privatizeSymbol(sym, clauseOps); } } @@ -516,22 +527,30 @@ void DataSharingProcessor::copyLastPrivatize(mlir::Operation 
*op) { } } -void DataSharingProcessor::doPrivatize(const semantics::Symbol *sym, - mlir::omp::PrivateClauseOps *clauseOps) { +template +void DataSharingProcessor::privatizeSymbol( + const semantics::Symbol *symToPrivatize, OperandsStructType *clauseOps) { if (!useDelayedPrivatization) { - cloneSymbol(sym); - copyFirstPrivateSymbol(sym); + cloneSymbol(symToPrivatize); + copyFirstPrivateSymbol(symToPrivatize); return; } - lower::SymbolBox hsb = converter.lookupOneLevelUpSymbol(*sym); + const semantics::Symbol *sym = symToPrivatize->HasLocalLocality() + ? &symToPrivatize->GetUltimate() + : symToPrivatize; + lower::SymbolBox hsb = symToPrivatize->HasLocalLocality() + ? converter.shallowLookupSymbol(*sym) + : converter.lookupOneLevelUpSymbol(*sym); assert(hsb && "Host symbol box not found"); hlfir::Entity entity{hsb.getAddr()}; bool cannotHaveNonDefaultLowerBounds = !entity.mayHaveNonDefaultLowerBounds(); mlir::Location symLoc = hsb.getAddr().getLoc(); std::string privatizerName = sym->name().ToString() + ".privatizer"; - bool isFirstPrivate = sym->test(semantics::Symbol::Flag::OmpFirstPrivate); + bool isFirstPrivate = + symToPrivatize->test(semantics::Symbol::Flag::OmpFirstPrivate) || + symToPrivatize->test(semantics::Symbol::Flag::LocalityLocalInit); mlir::Value privVal = hsb.getAddr(); mlir::Type allocType = privVal.getType(); @@ -565,7 +584,7 @@ void DataSharingProcessor::doPrivatize(const semantics::Symbol *sym, mlir::Type argType = privVal.getType(); - mlir::omp::PrivateClauseOp privatizerOp = [&]() { + OpType privatizerOp = [&]() { auto moduleOp = firOpBuilder.getModule(); auto uniquePrivatizerName = fir::getTypeAsString( allocType, converter.getKindMap(), @@ -573,16 +592,25 @@ void DataSharingProcessor::doPrivatize(const semantics::Symbol *sym, (isFirstPrivate ? 
"_firstprivate" : "_private")); if (auto existingPrivatizer = - moduleOp.lookupSymbol( - uniquePrivatizerName)) + moduleOp.lookupSymbol(uniquePrivatizerName)) return existingPrivatizer; mlir::OpBuilder::InsertionGuard guard(firOpBuilder); firOpBuilder.setInsertionPointToStart(moduleOp.getBody()); - auto result = firOpBuilder.create( - symLoc, uniquePrivatizerName, allocType, - isFirstPrivate ? mlir::omp::DataSharingClauseType::FirstPrivate - : mlir::omp::DataSharingClauseType::Private); + OpType result; + + if constexpr (std::is_same_v) { + result = firOpBuilder.create( + symLoc, uniquePrivatizerName, allocType, + isFirstPrivate ? mlir::omp::DataSharingClauseType::FirstPrivate + : mlir::omp::DataSharingClauseType::Private); + } else { + result = firOpBuilder.create( + symLoc, uniquePrivatizerName, allocType, + isFirstPrivate ? fir::LocalitySpecifierType::LocalInit + : fir::LocalitySpecifierType::Local); + } + fir::ExtendedValue symExV = converter.getSymbolExtendedValue(*sym); lower::SymMapScope outerScope(symTable); @@ -625,27 +653,36 @@ void DataSharingProcessor::doPrivatize(const semantics::Symbol *sym, &copyRegion, /*insertPt=*/{}, {argType, argType}, {symLoc, symLoc}); firOpBuilder.setInsertionPointToEnd(copyEntryBlock); - auto addSymbol = [&](unsigned argIdx, bool force = false) { + auto addSymbol = [&](unsigned argIdx, const semantics::Symbol *symToMap, + bool force = false) { symExV.match( [&](const fir::MutableBoxValue &box) { symTable.addSymbol( - *sym, fir::substBase(box, copyRegion.getArgument(argIdx)), - force); + *symToMap, + fir::substBase(box, copyRegion.getArgument(argIdx)), force); }, [&](const auto &box) { - symTable.addSymbol(*sym, copyRegion.getArgument(argIdx), force); + symTable.addSymbol(*symToMap, copyRegion.getArgument(argIdx), + force); }); }; - addSymbol(0, true); + addSymbol(0, sym, true); lower::SymMapScope innerScope(symTable); - addSymbol(1); + addSymbol(1, symToPrivatize); auto ip = firOpBuilder.saveInsertionPoint(); - 
copyFirstPrivateSymbol(sym, &ip); - - firOpBuilder.create( - hsb.getAddr().getLoc(), symTable.shallowLookupSymbol(*sym).getAddr()); + copyFirstPrivateSymbol(symToPrivatize, &ip); + + if constexpr (std::is_same_v) { + firOpBuilder.create( + hsb.getAddr().getLoc(), + symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); + } else { + firOpBuilder.create( + hsb.getAddr().getLoc(), + symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); + } } return result; @@ -656,9 +693,22 @@ void DataSharingProcessor::doPrivatize(const semantics::Symbol *sym, clauseOps->privateVars.push_back(privVal); } - symToPrivatizer[sym] = privatizerOp; + if (symToPrivatize->HasLocalLocality()) + allPrivatizedSymbols.insert(symToPrivatize); } +template void +DataSharingProcessor::privatizeSymbol( + const semantics::Symbol *symToPrivatize, + mlir::omp::PrivateClauseOps *clauseOps); + +template void +DataSharingProcessor::privatizeSymbol( + const semantics::Symbol *symToPrivatize, + fir::LocalitySpecifierOperands *clauseOps); + } // namespace omp } // namespace lower } // namespace Fortran diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.h b/flang/lib/Lower/OpenMP/DataSharingProcessor.h index 7787e4ffb03c2..ae759bfef566b 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.h +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.h @@ -77,8 +77,6 @@ class DataSharingProcessor { llvm::SetVector preDeterminedSymbols; llvm::SetVector allPrivatizedSymbols; - llvm::DenseMap - symToPrivatizer; lower::AbstractConverter &converter; semantics::SemanticsContext &semaCtx; fir::FirOpBuilder &firOpBuilder; @@ -105,8 +103,6 @@ class DataSharingProcessor { void collectImplicitSymbols(); void collectPreDeterminedSymbols(); void privatize(mlir::omp::PrivateClauseOps *clauseOps); - void doPrivatize(const semantics::Symbol *sym, - mlir::omp::PrivateClauseOps *clauseOps); void copyLastPrivatize(mlir::Operation *op); void insertLastPrivateCompare(mlir::Operation *op); void cloneSymbol(const 
semantics::Symbol *sym); @@ -125,6 +121,11 @@ class DataSharingProcessor { bool shouldCollectPreDeterminedSymbols, bool useDelayedPrivatization, lower::SymMap &symTable); + DataSharingProcessor(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + lower::pft::Evaluation &eval, + bool useDelayedPrivatization, lower::SymMap &symTable); + // Privatisation is split into two steps. // Step1 performs cloning of all privatisation clauses and copying for // firstprivates. Step1 is performed at the place where process/processStep1 @@ -151,6 +152,11 @@ class DataSharingProcessor { ? allPrivatizedSymbols.getArrayRef() : llvm::ArrayRef(); } + + template + void privatizeSymbol(const semantics::Symbol *symToPrivatize, + OperandsStructType *clauseOps); }; } // namespace omp diff --git a/flang/test/Lower/do_concurrent_delayed_locality.f90 b/flang/test/Lower/do_concurrent_delayed_locality.f90 new file mode 100644 index 0000000000000..9b234087ed4be --- /dev/null +++ b/flang/test/Lower/do_concurrent_delayed_locality.f90 @@ -0,0 +1,49 @@ +! RUN: %flang_fc1 -emit-hlfir -mmlir --openmp-enable-delayed-privatization-staging=true -o - %s | FileCheck %s + +subroutine do_concurrent_with_locality_specs + implicit none + integer :: i, local_var, local_init_var + + do concurrent (i=1:10) local(local_var) local_init(local_init_var) + if (i < 5) then + local_var = 42 + else + local_init_var = 84 + end if + end do +end subroutine + +! CHECK: fir.local {type = local_init} @[[LOCAL_INIT_SYM:.*]] : i32 copy { +! CHECK: ^bb0(%[[ORIG_VAL:.*]]: !fir.ref, %[[LOCAL_VAL:.*]]: !fir.ref): +! CHECK: %[[ORIG_VAL_LD:.*]] = fir.load %[[ORIG_VAL]] : !fir.ref +! CHECK: hlfir.assign %[[ORIG_VAL_LD]] to %[[LOCAL_VAL]] : i32, !fir.ref +! CHECK: fir.yield(%[[LOCAL_VAL]] : !fir.ref) +! CHECK: } + +! CHECK: fir.local {type = local} @[[LOCAL_SYM:.*]] : i32 + +! CHECK-LABEL: func.func @_QPdo_concurrent_with_locality_specs() { +! 
CHECK: %[[ORIG_LOCAL_INIT_ALLOC:.*]] = fir.alloca i32 {bindc_name = "local_init_var", {{.*}}} +! CHECK: %[[ORIG_LOCAL_INIT_DECL:.*]]:2 = hlfir.declare %[[ORIG_LOCAL_INIT_ALLOC]] + +! CHECK: %[[ORIG_LOCAL_ALLOC:.*]] = fir.alloca i32 {bindc_name = "local_var", {{.*}}} +! CHECK: %[[ORIG_LOCAL_DECL:.*]]:2 = hlfir.declare %[[ORIG_LOCAL_ALLOC]] + +! CHECK: fir.do_concurrent { +! CHECK: %[[IV_DECL:.*]]:2 = hlfir.declare %{{.*}} + +! CHECK: fir.do_concurrent.loop (%{{.*}}) = (%{{.*}}) to (%{{.*}}) step (%{{.*}}) local(@[[LOCAL_SYM]] %[[ORIG_LOCAL_DECL]]#0 -> %[[LOCAL_ARG:.*]], @[[LOCAL_INIT_SYM]] %[[ORIG_LOCAL_INIT_DECL]]#0 -> %[[LOCAL_INIT_ARG:.*]] : !fir.ref, !fir.ref) { +! CHECK: %[[LOCAL_DECL:.*]]:2 = hlfir.declare %[[LOCAL_ARG]] +! CHECK: %[[LOCAL_INIT_DECL:.*]]:2 = hlfir.declare %[[LOCAL_INIT_ARG]] + +! CHECK: fir.if %{{.*}} { +! CHECK: %[[C42:.*]] = arith.constant 42 : i32 +! CHECK: hlfir.assign %[[C42]] to %[[LOCAL_DECL]]#0 : i32, !fir.ref +! CHECK: } else { +! CHECK: %[[C84:.*]] = arith.constant 84 : i32 +! CHECK: hlfir.assign %[[C84]] to %[[LOCAL_INIT_DECL]]#0 : i32, !fir.ref +! CHECK: } +! CHECK: } +! CHECK: } +! CHECK: return +! 
CHECK: } From flang-commits at lists.llvm.org Thu May 29 02:04:34 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 02:04:34 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir] Basic PFT to MLIR lowering for do concurrent locality specifiers (PR #138534) In-Reply-To: Message-ID: <68382322.a70a0220.48b9d.1f33@mx.google.com> https://github.com/ergawy closed https://github.com/llvm/llvm-project/pull/138534 From flang-commits at lists.llvm.org Thu May 29 02:04:36 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 02:04:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Generlize names of delayed privatization CLI flags (PR #138816) In-Reply-To: Message-ID: <68382324.620a0220.2cc6e6.20f7@mx.google.com> https://github.com/ergawy edited https://github.com/llvm/llvm-project/pull/138816 From flang-commits at lists.llvm.org Thu May 29 02:07:38 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 02:07:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Generlize names of delayed privatization CLI flags (PR #138816) In-Reply-To: Message-ID: <683823da.170a0220.31e733.2832@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/138816 >From 68494ba320bccccb7d713a749d15927071f14b57 Mon Sep 17 00:00:00 2001 From: ergawy Date: Wed, 7 May 2025 02:41:14 -0500 Subject: [PATCH] [flang] Generlize names of delayed privatization CLI flags Remove the `openmp` prefix from delayed privatization/localization flags since they are now used for `do concurrent` as well. 
--- flang/include/flang/Support/Flags.h | 17 ++++++++++++++++ flang/lib/Lower/Bridge.cpp | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 1 + flang/lib/Lower/OpenMP/Utils.cpp | 12 ----------- flang/lib/Lower/OpenMP/Utils.h | 2 -- flang/lib/Support/CMakeLists.txt | 1 + flang/lib/Support/Flags.cpp | 20 +++++++++++++++++++ .../distribute-standalone-private.f90 | 4 ++-- .../DelayedPrivatization/equivalence.f90 | 4 ++-- .../target-private-allocatable.f90 | 4 ++-- .../target-private-multiple-variables.f90 | 4 ++-- .../target-private-simple.f90 | 4 ++-- .../OpenMP/allocatable-multiple-vars.f90 | 4 ++-- .../OpenMP/cfg-conversion-omp.private.f90 | 2 +- .../test/Lower/OpenMP/debug_info_conflict.f90 | 2 +- ...elayed-privatization-allocatable-array.f90 | 4 ++-- ...privatization-allocatable-firstprivate.f90 | 6 +++--- ...ayed-privatization-allocatable-private.f90 | 4 ++-- .../OpenMP/delayed-privatization-array.f90 | 12 +++++------ .../delayed-privatization-character-array.f90 | 8 ++++---- .../delayed-privatization-character.f90 | 8 ++++---- .../delayed-privatization-default-init.f90 | 4 ++-- .../delayed-privatization-firstprivate.f90 | 4 ++-- ...rivatization-lower-allocatable-to-llvm.f90 | 2 +- .../OpenMP/delayed-privatization-pointer.f90 | 4 ++-- ...yed-privatization-private-firstprivate.f90 | 4 ++-- .../OpenMP/delayed-privatization-private.f90 | 4 ++-- .../delayed-privatization-reduction-byref.f90 | 2 +- .../delayed-privatization-reduction.f90 | 4 ++-- .../different_vars_lastprivate_barrier.f90 | 2 +- .../Lower/OpenMP/firstprivate-commonblock.f90 | 2 +- .../test/Lower/OpenMP/private-commonblock.f90 | 2 +- .../Lower/OpenMP/private-derived-type.f90 | 4 ++-- .../OpenMP/same_var_first_lastprivate.f90 | 2 +- .../Lower/do_concurrent_delayed_locality.f90 | 2 +- 35 files changed, 96 insertions(+), 71 deletions(-) create mode 100644 flang/include/flang/Support/Flags.h create mode 100644 flang/lib/Support/Flags.cpp diff --git a/flang/include/flang/Support/Flags.h 
b/flang/include/flang/Support/Flags.h new file mode 100644 index 0000000000000..bcbb72f8e50d0 --- /dev/null +++ b/flang/include/flang/Support/Flags.h @@ -0,0 +1,17 @@ +//===-- include/flang/Support/Flags.h ---------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#ifndef FORTRAN_SUPPORT_FLAGS_H_ +#define FORTRAN_SUPPORT_FLAGS_H_ + +#include "llvm/Support/CommandLine.h" + +extern llvm::cl::opt enableDelayedPrivatization; +extern llvm::cl::opt enableDelayedPrivatizationStaging; + +#endif // FORTRAN_SUPPORT_FLAGS_H_ diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 49675d34215a9..9f3c50a52973a 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -13,7 +13,6 @@ #include "flang/Lower/Bridge.h" #include "OpenMP/DataSharingProcessor.h" -#include "OpenMP/Utils.h" #include "flang/Lower/Allocatable.h" #include "flang/Lower/CallInterface.h" #include "flang/Lower/Coarray.h" @@ -63,6 +62,7 @@ #include "flang/Semantics/runtime-type-info.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" +#include "flang/Support/Flags.h" #include "flang/Support/Version.h" #include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" #include "mlir/IR/BuiltinAttributes.h" diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ddb08f74b3841..19e106e2593ab 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -34,6 +34,7 @@ #include "flang/Parser/parse-tree.h" #include "flang/Semantics/openmp-directive-sets.h" #include "flang/Semantics/tools.h" +#include "flang/Support/Flags.h" #include "flang/Support/OpenMP-utils.h" #include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" #include 
"mlir/Dialect/OpenMP/OpenMPDialect.h" diff --git a/flang/lib/Lower/OpenMP/Utils.cpp b/flang/lib/Lower/OpenMP/Utils.cpp index 711d4af287691..c226c2558e7aa 100644 --- a/flang/lib/Lower/OpenMP/Utils.cpp +++ b/flang/lib/Lower/OpenMP/Utils.cpp @@ -33,18 +33,6 @@ llvm::cl::opt treatIndexAsSection( llvm::cl::desc("In the OpenMP data clauses treat `a(N)` as `a(N:N)`."), llvm::cl::init(true)); -llvm::cl::opt enableDelayedPrivatization( - "openmp-enable-delayed-privatization", - llvm::cl::desc( - "Emit `[first]private` variables as clauses on the MLIR ops."), - llvm::cl::init(true)); - -llvm::cl::opt enableDelayedPrivatizationStaging( - "openmp-enable-delayed-privatization-staging", - llvm::cl::desc("For partially supported constructs, emit `[first]private` " - "variables as clauses on the MLIR ops."), - llvm::cl::init(false)); - namespace Fortran { namespace lower { namespace omp { diff --git a/flang/lib/Lower/OpenMP/Utils.h b/flang/lib/Lower/OpenMP/Utils.h index 30b4613837b9a..a7eb2dc5ee664 100644 --- a/flang/lib/Lower/OpenMP/Utils.h +++ b/flang/lib/Lower/OpenMP/Utils.h @@ -17,8 +17,6 @@ #include extern llvm::cl::opt treatIndexAsSection; -extern llvm::cl::opt enableDelayedPrivatization; -extern llvm::cl::opt enableDelayedPrivatizationStaging; namespace fir { class FirOpBuilder; diff --git a/flang/lib/Support/CMakeLists.txt b/flang/lib/Support/CMakeLists.txt index 4ee381589a208..363f57ce97dae 100644 --- a/flang/lib/Support/CMakeLists.txt +++ b/flang/lib/Support/CMakeLists.txt @@ -44,6 +44,7 @@ endif() add_flang_library(FortranSupport default-kinds.cpp + Flags.cpp Fortran.cpp Fortran-features.cpp idioms.cpp diff --git a/flang/lib/Support/Flags.cpp b/flang/lib/Support/Flags.cpp new file mode 100644 index 0000000000000..02f64981618dd --- /dev/null +++ b/flang/lib/Support/Flags.cpp @@ -0,0 +1,20 @@ +//===-- lib/Support/Flags.cpp ---------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 
+// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Support/Flags.h" + +llvm::cl::opt enableDelayedPrivatization("enable-delayed-privatization", + llvm::cl::desc( + "Emit private/local variables as clauses/specifiers on MLIR ops."), + llvm::cl::init(true)); + +llvm::cl::opt enableDelayedPrivatizationStaging( + "enable-delayed-privatization-staging", + llvm::cl::desc("For partially supported constructs, emit private/local " + "variables as clauses/specifiers on MLIR ops."), + llvm::cl::init(false)); diff --git a/flang/test/Lower/OpenMP/DelayedPrivatization/distribute-standalone-private.f90 b/flang/test/Lower/OpenMP/DelayedPrivatization/distribute-standalone-private.f90 index a9c85db79fa31..92aeb3fbc1ee7 100644 --- a/flang/test/Lower/OpenMP/DelayedPrivatization/distribute-standalone-private.f90 +++ b/flang/test/Lower/OpenMP/DelayedPrivatization/distribute-standalone-private.f90 @@ -1,6 +1,6 @@ -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization-staging \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization-staging \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization-staging -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization-staging -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine standalone_distribute diff --git a/flang/test/Lower/OpenMP/DelayedPrivatization/equivalence.f90 b/flang/test/Lower/OpenMP/DelayedPrivatization/equivalence.f90 index 721bfff012f14..5234862feaa76 100644 --- a/flang/test/Lower/OpenMP/DelayedPrivatization/equivalence.f90 +++ b/flang/test/Lower/OpenMP/DelayedPrivatization/equivalence.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for variables that are storage associated via `EQUIVALENCE`. -! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine private_common diff --git a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90 b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90 index 87c2c2ae26796..3d93fbc6e446e 100644 --- a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90 +++ b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90 @@ -1,8 +1,8 @@ ! Tests delayed privatization for `targets ... private(..)` for allocatables. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization-staging \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization-staging \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization-staging -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization-staging -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine target_allocatable diff --git a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-multiple-variables.f90 b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-multiple-variables.f90 index ad7bfb3d7c247..12e15a2aafc2d 100644 --- a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-multiple-variables.f90 +++ b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-multiple-variables.f90 @@ -1,8 +1,8 @@ ! Tests delayed privatization for `targets ... private(..)` for allocatables. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization-staging \ +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization-staging \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization-staging -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization-staging -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine target_allocatable(lb, ub, l) diff --git a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-simple.f90 b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-simple.f90 index 5abf2cbb15c92..f543068d29753 100644 --- a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-simple.f90 +++ b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-simple.f90 @@ -1,8 +1,8 @@ ! Tests delayed privatization for `targets ... private(..)` for simple variables. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization-staging \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization-staging \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization-staging -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization-staging -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine target_simple diff --git a/flang/test/Lower/OpenMP/allocatable-multiple-vars.f90 b/flang/test/Lower/OpenMP/allocatable-multiple-vars.f90 index e6450a13e13a0..91ba75f2198e3 100644 --- a/flang/test/Lower/OpenMP/allocatable-multiple-vars.f90 +++ b/flang/test/Lower/OpenMP/allocatable-multiple-vars.f90 @@ -1,9 +1,9 @@ ! Test early privatization for multiple allocatable variables. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization=false \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization=false \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization=false -o - %s 2>&1 |\ +! 
RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization=false -o - %s 2>&1 |\ ! RUN: FileCheck %s subroutine delayed_privatization_allocatable diff --git a/flang/test/Lower/OpenMP/cfg-conversion-omp.private.f90 b/flang/test/Lower/OpenMP/cfg-conversion-omp.private.f90 index 8b8adf2b140c7..f8d771d10d281 100644 --- a/flang/test/Lower/OpenMP/cfg-conversion-omp.private.f90 +++ b/flang/test/Lower/OpenMP/cfg-conversion-omp.private.f90 @@ -2,7 +2,7 @@ ! RUN: split-file %s %t && cd %t -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - test.f90 2>&1 | \ ! RUN: fir-opt --cfg-conversion -o test.cfg-conv.mlir ! RUN: FileCheck --input-file=test.cfg-conv.mlir %s --check-prefix="CFGConv" diff --git a/flang/test/Lower/OpenMP/debug_info_conflict.f90 b/flang/test/Lower/OpenMP/debug_info_conflict.f90 index 5e52db281da23..b80900476053a 100644 --- a/flang/test/Lower/OpenMP/debug_info_conflict.f90 +++ b/flang/test/Lower/OpenMP/debug_info_conflict.f90 @@ -1,7 +1,7 @@ ! Tests that there no debug-info conflicts arise because of DI attached to nested ! OMP regions arguments. -! RUN: %flang -c -fopenmp -g -mmlir --openmp-enable-delayed-privatization=true \ +! RUN: %flang -c -fopenmp -g -mmlir --enable-delayed-privatization=true \ ! RUN: %s -o - 2>&1 | FileCheck %s subroutine bar (b) diff --git a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-array.f90 b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-array.f90 index 9b6dbabf0c6ff..d1c7167546b43 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-array.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-array.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for allocatable arrays. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! 
RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 |\ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 |\ ! RUN: FileCheck %s subroutine delayed_privatization_private(var1, l1) diff --git a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-firstprivate.f90 b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-firstprivate.f90 index 01ca1073ae849..612bb55c770f7 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-firstprivate.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-firstprivate.f90 @@ -2,9 +2,9 @@ ! RUN: split-file %s %t -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/test_ir.f90 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %t/test_ir.f90 2>&1 |\ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %t/test_ir.f90 2>&1 |\ ! RUN: FileCheck %s !--- test_ir.f90 @@ -38,7 +38,7 @@ subroutine delayed_privatization_allocatable ! CHECK-NEXT: hlfir.assign %[[ORIG_BASE_LD]] to %[[PRIV_PRIV_ARG]] realloc ! CHECK-NEXT: } -! RUN: %flang -c -emit-llvm -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang -c -emit-llvm -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/test_compilation_to_obj.f90 | \ ! RUN: llvm-dis 2>&1 |\ ! RUN: FileCheck %s -check-prefix=LLVM diff --git a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-private.f90 b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-private.f90 index 4ce66f52110e0..67cfb864bc8f3 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-private.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-private.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for allocatables: `private`. -! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 |\ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 |\ ! RUN: FileCheck %s subroutine delayed_privatization_allocatable diff --git a/flang/test/Lower/OpenMP/delayed-privatization-array.f90 b/flang/test/Lower/OpenMP/delayed-privatization-array.f90 index c447fa6f27a75..9aaf75f66dbbb 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-array.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-array.f90 @@ -2,19 +2,19 @@ ! RUN: split-file %s %t -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/one_dim_array.f90 2>&1 | FileCheck %s --check-prefix=ONE_DIM -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - \ ! RUN: %t/one_dim_array.f90 2>&1 | FileCheck %s --check-prefix=ONE_DIM -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/two_dim_array.f90 2>&1 | FileCheck %s --check-prefix=TWO_DIM -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - \ ! RUN: %t/two_dim_array.f90 2>&1 | FileCheck %s --check-prefix=TWO_DIM -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/one_dim_array_default_lb.f90 2>&1 | FileCheck %s --check-prefix=ONE_DIM_DEFAULT_LB -! 
RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - \ ! RUN: %t/one_dim_array_default_lb.f90 2>&1 | FileCheck %s --check-prefix=ONE_DIM_DEFAULT_LB !--- one_dim_array.f90 diff --git a/flang/test/Lower/OpenMP/delayed-privatization-character-array.f90 b/flang/test/Lower/OpenMP/delayed-privatization-character-array.f90 index 4c7287283c7ad..383b033d772aa 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-character-array.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-character-array.f90 @@ -2,14 +2,14 @@ ! RUN: split-file %s %t -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/static_len.f90 2>&1 | FileCheck %s --check-prefix=STATIC_LEN -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %t/static_len.f90 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %t/static_len.f90 2>&1 \ ! RUN: | FileCheck %s --check-prefix=STATIC_LEN -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/dyn_len.f90 2>&1 | FileCheck %s --check-prefix=DYN_LEN -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %t/dyn_len.f90 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %t/dyn_len.f90 2>&1 \ ! RUN: | FileCheck %s --check-prefix=DYN_LEN !--- static_len.f90 diff --git a/flang/test/Lower/OpenMP/delayed-privatization-character.f90 b/flang/test/Lower/OpenMP/delayed-privatization-character.f90 index 3d1a312963371..d0f7ef6f2cd0c 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-character.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-character.f90 @@ -2,14 +2,14 @@ ! RUN: split-file %s %t -! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/dyn_len.f90 2>&1 | FileCheck %s --check-prefix=DYN_LEN -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %t/dyn_len.f90 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %t/dyn_len.f90 2>&1 \ ! RUN: | FileCheck %s --check-prefix=DYN_LEN -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/static_len.f90 2>&1 | FileCheck %s --check-prefix=STATIC_LEN -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %t/static_len.f90 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %t/static_len.f90 2>&1 \ ! RUN: | FileCheck %s --check-prefix=STATIC_LEN !--- dyn_len.f90 diff --git a/flang/test/Lower/OpenMP/delayed-privatization-default-init.f90 b/flang/test/Lower/OpenMP/delayed-privatization-default-init.f90 index 87d4605217a8a..cb17e4cd6afc1 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-default-init.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-default-init.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for derived types with default initialization. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 |\ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 |\ ! 
RUN: FileCheck %s subroutine delayed_privatization_default_init diff --git a/flang/test/Lower/OpenMP/delayed-privatization-firstprivate.f90 b/flang/test/Lower/OpenMP/delayed-privatization-firstprivate.f90 index 904ea783ad5b4..3f80b5e1bd209 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-firstprivate.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-firstprivate.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for the `firstprivate` clause. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine delayed_privatization_firstprivate diff --git a/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 b/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 index ac9a6d8746cf2..effc356590e9a 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 @@ -1,7 +1,7 @@ ! Tests the OMPIRBuilder can handle multiple privatization regions that contain ! multiple BBs (for example, for allocatables). -! RUN: %flang -S -emit-llvm -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang -S -emit-llvm -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s subroutine foo(x) diff --git a/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90 b/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90 index f39ac9199e8bd..38ead806199c1 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for pointers: `private`. 
-! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 |\ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 |\ ! RUN: FileCheck %s subroutine delayed_privatization_pointer diff --git a/flang/test/Lower/OpenMP/delayed-privatization-private-firstprivate.f90 b/flang/test/Lower/OpenMP/delayed-privatization-private-firstprivate.f90 index d961210dcbc38..ad53703d3122e 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-private-firstprivate.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-private-firstprivate.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for both `private` and `firstprivate` clauses. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine delayed_privatization_private_firstprivate diff --git a/flang/test/Lower/OpenMP/delayed-privatization-private.f90 b/flang/test/Lower/OpenMP/delayed-privatization-private.f90 index 69c362e4828bf..84d6caedf5010 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-private.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-private.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for the `private` clause. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 \ +! 
RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine delayed_privatization_private diff --git a/flang/test/Lower/OpenMP/delayed-privatization-reduction-byref.f90 b/flang/test/Lower/OpenMP/delayed-privatization-reduction-byref.f90 index f463f2b4630ae..4b6a643f94059 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-reduction-byref.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-reduction-byref.f90 @@ -3,7 +3,7 @@ ! that the block arguments are added in the proper order (reductions first and ! then delayed privatization. -! RUN: bbc -emit-hlfir -fopenmp --force-byref-reduction --openmp-enable-delayed-privatization -o - %s 2>&1 | FileCheck %s +! RUN: bbc -emit-hlfir -fopenmp --force-byref-reduction --enable-delayed-privatization -o - %s 2>&1 | FileCheck %s subroutine red_and_delayed_private integer :: red diff --git a/flang/test/Lower/OpenMP/delayed-privatization-reduction.f90 b/flang/test/Lower/OpenMP/delayed-privatization-reduction.f90 index a1ddbc30d6e46..f8f78b0531091 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-reduction.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-reduction.f90 @@ -3,9 +3,9 @@ ! that the block arguments are added in the proper order (reductions first and ! then delayed privatization. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 \ ! 
RUN: | FileCheck %s subroutine red_and_delayed_private diff --git a/flang/test/Lower/OpenMP/different_vars_lastprivate_barrier.f90 b/flang/test/Lower/OpenMP/different_vars_lastprivate_barrier.f90 index b74e083925aba..5bf634c86652b 100644 --- a/flang/test/Lower/OpenMP/different_vars_lastprivate_barrier.f90 +++ b/flang/test/Lower/OpenMP/different_vars_lastprivate_barrier.f90 @@ -1,4 +1,4 @@ -! RUN: %flang_fc1 -fopenmp -mmlir --openmp-enable-delayed-privatization-staging=true -emit-hlfir %s -o - | FileCheck %s +! RUN: %flang_fc1 -fopenmp -mmlir --enable-delayed-privatization-staging=true -emit-hlfir %s -o - | FileCheck %s subroutine first_and_lastprivate(var) integer i diff --git a/flang/test/Lower/OpenMP/firstprivate-commonblock.f90 b/flang/test/Lower/OpenMP/firstprivate-commonblock.f90 index 315e1b7745a6f..1b029c193b7b6 100644 --- a/flang/test/Lower/OpenMP/firstprivate-commonblock.f90 +++ b/flang/test/Lower/OpenMP/firstprivate-commonblock.f90 @@ -1,5 +1,5 @@ ! RUN: %flang_fc1 -emit-hlfir -fopenmp \ -! RUN: -mmlir --openmp-enable-delayed-privatization=true -o - %s 2>&1 \ +! RUN: -mmlir --enable-delayed-privatization=true -o - %s 2>&1 \ ! RUN: | FileCheck %s !CHECK: func.func @_QPfirstprivate_common() { diff --git a/flang/test/Lower/OpenMP/private-commonblock.f90 b/flang/test/Lower/OpenMP/private-commonblock.f90 index 009b086a0c7fd..8f5f641dea325 100644 --- a/flang/test/Lower/OpenMP/private-commonblock.f90 +++ b/flang/test/Lower/OpenMP/private-commonblock.f90 @@ -1,5 +1,5 @@ ! RUN: %flang_fc1 -emit-hlfir -fopenmp \ -! RUN: -mmlir --openmp-enable-delayed-privatization=true -o - %s 2>&1 \ +! RUN: -mmlir --enable-delayed-privatization=true -o - %s 2>&1 \ ! 
RUN: | FileCheck %s !CHECK: func.func @_QPprivate_common() { diff --git a/flang/test/Lower/OpenMP/private-derived-type.f90 b/flang/test/Lower/OpenMP/private-derived-type.f90 index cb51c2b34b424..3947a0d68b58c 100644 --- a/flang/test/Lower/OpenMP/private-derived-type.f90 +++ b/flang/test/Lower/OpenMP/private-derived-type.f90 @@ -1,5 +1,5 @@ -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization-staging=true -o - %s | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization-staging=true -o - %s | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization-staging=true -o - %s | FileCheck %s +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization-staging=true -o - %s | FileCheck %s subroutine s4 type y3 diff --git a/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 b/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 index 45d6f91f67f1f..14d860c30f6f2 100644 --- a/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 +++ b/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 @@ -1,4 +1,4 @@ -! RUN: %flang_fc1 -fopenmp -mmlir --openmp-enable-delayed-privatization-staging=true -emit-hlfir %s -o - | FileCheck %s +! RUN: %flang_fc1 -fopenmp -mmlir --enable-delayed-privatization-staging=true -emit-hlfir %s -o - | FileCheck %s subroutine first_and_lastprivate integer i diff --git a/flang/test/Lower/do_concurrent_delayed_locality.f90 b/flang/test/Lower/do_concurrent_delayed_locality.f90 index 9b234087ed4be..6cae0eb46db13 100644 --- a/flang/test/Lower/do_concurrent_delayed_locality.f90 +++ b/flang/test/Lower/do_concurrent_delayed_locality.f90 @@ -1,4 +1,4 @@ -! RUN: %flang_fc1 -emit-hlfir -mmlir --openmp-enable-delayed-privatization-staging=true -o - %s | FileCheck %s +! 
RUN: %flang_fc1 -emit-hlfir -mmlir --enable-delayed-privatization-staging=true -o - %s | FileCheck %s subroutine do_concurrent_with_locality_specs implicit none From flang-commits at lists.llvm.org Thu May 29 02:18:01 2025 From: flang-commits at lists.llvm.org (Paul Osmialowski via flang-commits) Date: Thu, 29 May 2025 02:18:01 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68382649.170a0220.28bec7.22cf@mx.google.com> pawosm-arm wrote: I think more people need to look into it. AFAIR the plan was that flang does not use the `.rst` files for documentation; it was supposed to rely on the markdown files entirely. https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 03:06:26 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Thu, 29 May 2025 03:06:26 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <683831a2.170a0220.3a0a34.266d@mx.google.com> https://github.com/abidh updated https://github.com/llvm/llvm-project/pull/140556 >From 5d20af48673adebc2ab3e1a6c8442f67d84f1847 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Mon, 19 May 2025 15:21:25 +0100 Subject: [PATCH 1/2] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. This PR adds functionality to change the flang command line using the environment variable `FCC_OVERRIDE_OPTIONS`. It is quite similar to what `CCC_OVERRIDE_OPTIONS` does for clang. The name `FCC_OVERRIDE_OPTIONS` seemed like the most obvious one to me, but I am open to other ideas. `applyOverrideOptions` now takes an extra argument: the name of the environment variable. Previously `CCC_OVERRIDE_OPTIONS` was hardcoded.
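To make the kind of editing concrete, here is a minimal sketch of how an override string could be applied to an argument list. It is a toy model under stated assumptions, not the code in `clang/lib/Driver/Driver.cpp`: the function name `applyEdits` is hypothetical, the argument list is assumed to exclude the program name (so `^FOO` goes to the very front), and only four of the edit forms that `CCC_OVERRIDE_OPTIONS` accepts are modelled (`^FOO` adds at the beginning, `+FOO` adds at the end, `xOPTION` removes every literal match, `XOPTION` removes every literal match plus the argument after it). The `#`, `s/XXX/YYY/` and `Ox` forms are left out.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Toy model only (hypothetical helper, not the clang::driver implementation).
// Applies a space-separated list of edits, in order, to a copy of `args`.
std::vector<std::string> applyEdits(std::vector<std::string> args,
                                    const std::string &edits) {
  std::istringstream in(edits);
  std::string edit;
  while (in >> edit) {
    if (edit.size() < 2)
      continue; // nothing to operate on; unsupported in this sketch
    const char kind = edit[0];
    const std::string operand = edit.substr(1);
    if (kind == '^') {
      // Add at the beginning (assumes args holds no program name).
      args.insert(args.begin(), operand);
    } else if (kind == '+') {
      // Add at the end.
      args.push_back(operand);
    } else if (kind == 'x') {
      // Remove every literal occurrence of the operand.
      args.erase(std::remove(args.begin(), args.end(), operand), args.end());
    } else if (kind == 'X') {
      // Remove every literal occurrence together with the following argument.
      for (auto it = args.begin(); it != args.end();) {
        if (*it == operand) {
          auto last = (it + 1 != args.end()) ? it + 2 : it + 1;
          it = args.erase(it, last);
        } else {
          ++it;
        }
      }
    }
    // Any other leading character is ignored by this sketch.
  }
  return args;
}
```

In this model, `applyEdits({"-Werror", "-c", "foo.f90"}, "x-Werror +-g")` yields `-c foo.f90 -g`, which corresponds to the `RM-WERROR` case in the new `fcc_override.f90` test (delete `-Werror`, add `-g` at the end).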
--- clang/include/clang/Driver/Driver.h | 2 +- clang/lib/Driver/Driver.cpp | 4 ++-- clang/tools/driver/driver.cpp | 2 +- flang/test/Driver/fcc_override.f90 | 12 ++++++++++++ flang/tools/flang-driver/driver.cpp | 7 +++++++ 5 files changed, 23 insertions(+), 4 deletions(-) create mode 100644 flang/test/Driver/fcc_override.f90 diff --git a/clang/include/clang/Driver/Driver.h b/clang/include/clang/Driver/Driver.h index b463dc2a93550..7ca848f11b561 100644 --- a/clang/include/clang/Driver/Driver.h +++ b/clang/include/clang/Driver/Driver.h @@ -879,7 +879,7 @@ llvm::Error expandResponseFiles(SmallVectorImpl &Args, /// See applyOneOverrideOption. void applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideOpts, - llvm::StringSet<> &SavedStrings, + llvm::StringSet<> &SavedStrings, StringRef EnvVar, raw_ostream *OS = nullptr); } // end namespace driver diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp index a648cc928afdc..a8fea35926a0d 100644 --- a/clang/lib/Driver/Driver.cpp +++ b/clang/lib/Driver/Driver.cpp @@ -7289,7 +7289,7 @@ static void applyOneOverrideOption(raw_ostream &OS, void driver::applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideStr, llvm::StringSet<> &SavedStrings, - raw_ostream *OS) { + StringRef EnvVar, raw_ostream *OS) { if (!OS) OS = &llvm::nulls(); @@ -7298,7 +7298,7 @@ void driver::applyOverrideOptions(SmallVectorImpl &Args, OS = &llvm::nulls(); } - *OS << "### CCC_OVERRIDE_OPTIONS: " << OverrideStr << "\n"; + *OS << "### " << EnvVar << ": " << OverrideStr << "\n"; // This does not need to be efficient. diff --git a/clang/tools/driver/driver.cpp b/clang/tools/driver/driver.cpp index 82f47ab973064..81964c65c2892 100644 --- a/clang/tools/driver/driver.cpp +++ b/clang/tools/driver/driver.cpp @@ -305,7 +305,7 @@ int clang_main(int Argc, char **Argv, const llvm::ToolContext &ToolContext) { if (const char *OverrideStr = ::getenv("CCC_OVERRIDE_OPTIONS")) { // FIXME: Driver shouldn't take extra initial argument. 
driver::applyOverrideOptions(Args, OverrideStr, SavedStrings, - &llvm::errs()); + "CCC_OVERRIDE_OPTIONS", &llvm::errs()); } std::string Path = GetExecutablePath(ToolContext.Path, CanonicalPrefixes); diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 new file mode 100644 index 0000000000000..55a07803fdde5 --- /dev/null +++ b/flang/test/Driver/fcc_override.f90 @@ -0,0 +1,12 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s +! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR + +! CHECK: "-fc1" +! CHECK-NOT: "-Oignore" +! CHECK: "-Omagic" +! CHECK-NOT: "-Oignore" + +! RM-WERROR: ### FCC_OVERRIDE_OPTIONS: x-Werror +-g +! RM-WERROR-NEXT: ### Deleting argument -Werror +! RM-WERROR-NEXT: ### Adding argument -g at end +! RM-WERROR-NOT: "-Werror" diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..ad0efa3279cef 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -111,6 +111,13 @@ int main(int argc, const char **argv) { } } + llvm::StringSet<> SavedStrings; + // Handle FCC_OVERRIDE_OPTIONS, used for editing a command line behind the + // scenes. + if (const char *OverrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) + clang::driver::applyOverrideOptions(args, OverrideStr, SavedStrings, + "FCC_OVERRIDE_OPTIONS", &llvm::errs()); + // Not in the frontend mode - continue in the compiler driver mode. // Create DiagnosticsEngine for the compiler driver >From d1f2c9b8abd2690612a4b886a7a85b8e7f57d359 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 11:05:57 +0100 Subject: [PATCH 2/2] Add documentation for FCC_OVERRIDE_OPTIONS. 
--- flang/docs/FlangDriver.md | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index 97744f0bee069..f93df8701e677 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -614,3 +614,28 @@ nvfortran defines `-fast` as - `-Mcache_align`: there is no equivalent flag in Flang or Clang. - `-Mflushz`: flush-to-zero mode - when `-ffast-math` is specified, Flang will link to `crtfastmath.o` to ensure denormal numbers are flushed to zero. + + +## FCC_OVERRIDE_OPTIONS + +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of +edits to the input argument lists. The value of this environment variable is +a space separated list of edits to perform. These edits are applied in order to +the input argument lists. Edits should be one of the following forms: + +- `#`: Silence information about the changes to the command line arguments. + +- `^FOO`: Add `FOO` as a new argument at the beginning of the command line. + +- `+FOO`: Add `FOO` as a new argument at the end of the command line. + +- `s/XXX/YYY/`: Substitute the regular expression `XXX` with `YYY` in the + command line. + +- `xOPTION`: Removes all instances of the literal argument `OPTION`. + +- `XOPTION`: Removes all instances of the literal argument `OPTION`, and the + following argument. + +- `Ox`: Removes all flags matching `O` or `O[sz0-9]` and adds `Ox` at the end + of the command line. \ No newline at end of file From flang-commits at lists.llvm.org Thu May 29 03:08:42 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Thu, 29 May 2025 03:08:42 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <6838322a.050a0220.3193d5.bb38@mx.google.com> ================ @@ -0,0 +1,12 @@ +! 
RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s ---------------- abidh wrote: Yes, those have special meaning and they were only documented in the comments. I have now added documentation for it in the suggested file. https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Thu May 29 03:11:06 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Thu, 29 May 2025 03:11:06 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <683832ba.170a0220.27f8a0.2575@mx.google.com> abidh wrote: > Does `CCC_OVERRIDE_OPTIONS` expands to Clang Compiler Commandline Override Options? If so `FCC_OVERRIDE_OPTIONS` expanding to Fortran Compiler Commandline Override Options seems the right replacement. > > If `CCC_OVERRIDE_OPTIONS` expands to Clang C Compiler Override Options then `FFC_OVERRIDE_OPTIONS` (as suggested by @tarunprabhu) expanding to Flang Fortran Compiler Overrided Options is better. The `CCC_OVERRIDE_OPTIONS` was introduced in a19fad as replacement of `QA_OVERRIDE_GCC3_OPTIONS`. I am not sure if there is a definite word on what it actually expands to. I am happy with either of them. https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Thu May 29 03:24:01 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 03:24:01 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow structure component in `task depend` clauses (PR #141923) Message-ID: https://github.com/ergawy created https://github.com/llvm/llvm-project/pull/141923 Even though the spec (version 5.2) prohibits strcuture components from being specified in `depend` clauses, this restriction is not sensible. 
This PR rectifies the issue by lifting that restriction and allowing structure components in `depend` clauses (which is allowed by OpenMP 6.0). >From 18a503d2c70a7aad7e397cb1c85ac8dcad34f72b Mon Sep 17 00:00:00 2001 From: ergawy Date: Thu, 29 May 2025 05:19:15 -0500 Subject: [PATCH] [flang][OpenMP] Allow structure component in `task depend` clauses Even though the spec (version 5.2) prohibits strcuture components from being specified in `depend` clauses, this restriction is not sensible. This PR rectifies the issue by lifting that restriction and allowing structure components in `depend` clauses (which is allowed by OpenMP 6.0). --- flang/include/flang/Evaluate/tools.h | 10 ++++ flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 5 ++ flang/lib/Semantics/check-omp-structure.cpp | 8 +-- .../task-depend-structure-component.f90 | 21 ++++++++ flang/test/Semantics/OpenMP/depend02.f90 | 49 ------------------- 5 files changed, 38 insertions(+), 55 deletions(-) create mode 100644 flang/test/Lower/OpenMP/task-depend-structure-component.f90 delete mode 100644 flang/test/Semantics/OpenMP/depend02.f90 diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h index 7f2e91ae128bd..4efd88b13183f 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -414,6 +414,16 @@ const Symbol *IsArrayElement(const Expr &expr, bool intoSubstring = true, return nullptr; } +template +bool isStructureCompnent(const Fortran::evaluate::Expr &expr) { + if (auto dataRef{ExtractDataRef(expr, /*intoSubstring=*/false)}) { + const Fortran::evaluate::DataRef *ref{&*dataRef}; + return std::holds_alternative(ref->u); + } + + return false; +} + template std::optional ExtractNamedEntity(const A &x) { if (auto dataRef{ExtractDataRef(x)}) { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index ebdda9885d5c2..8506d562c5094 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ 
b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -944,6 +944,11 @@ bool ClauseProcessor::processDepend(lower::SymMap &symMap, converter.getCurrentLocation(), converter, expr, symMap, stmtCtx); dependVar = entity.getBase(); } + } else if (evaluate::isStructureCompnent(*object.ref())) { + SomeExpr expr = *object.ref(); + hlfir::EntityWithAttributes entity = convertExprToHLFIR( + converter.getCurrentLocation(), converter, expr, symMap, stmtCtx); + dependVar = entity.getBase(); } else { semantics::Symbol *sym = object.sym(); dependVar = converter.getSymbolAddress(*sym); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bda0d62829506..a3eeac4524eff 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -5493,12 +5493,8 @@ void OmpStructureChecker::CheckDependList(const parser::DataRef &d) { // Check if the base element is valid on Depend Clause CheckDependList(elem.value().base); }, - [&](const common::Indirection &) { - context_.Say(GetContext().clauseSource, - "A variable that is part of another variable " - "(such as an element of a structure) but is not an array " - "element or an array section cannot appear in a DEPEND " - "clause"_err_en_US); + [&](const common::Indirection &comp) { + CheckDependList(comp.value().base); }, [&](const common::Indirection &) { context_.Say(GetContext().clauseSource, diff --git a/flang/test/Lower/OpenMP/task-depend-structure-component.f90 b/flang/test/Lower/OpenMP/task-depend-structure-component.f90 new file mode 100644 index 0000000000000..7cf6dbfac2729 --- /dev/null +++ b/flang/test/Lower/OpenMP/task-depend-structure-component.f90 @@ -0,0 +1,21 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +subroutine depend + type :: my_struct + integer :: my_component(10) + end type + + type(my_struct) :: my_var + + !$omp task depend(in:my_var%my_component) + !$omp end task +end subroutine depend + +! 
CHECK: %[[VAR_ALLOC:.*]] = fir.alloca !fir.type<{{.*}}my_struct{{.*}}> {bindc_name = "my_var", {{.*}}} +! CHECK: %[[VAR_DECL:.*]]:2 = hlfir.declare %[[VAR_ALLOC]] + +! CHECK: %[[COMP_SELECTOR:.*]] = hlfir.designate %[[VAR_DECL]]#0{"my_component"} + +! CHECK: omp.task depend(taskdependin -> %[[COMP_SELECTOR]] : {{.*}}) { +! CHECK: omp.terminator +! CHECK: } diff --git a/flang/test/Semantics/OpenMP/depend02.f90 b/flang/test/Semantics/OpenMP/depend02.f90 deleted file mode 100644 index 76c02c8f9cbab..0000000000000 --- a/flang/test/Semantics/OpenMP/depend02.f90 +++ /dev/null @@ -1,49 +0,0 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp -! OpenMP Version 4.5 -! 2.13.9 Depend Clause -! A variable that is part of another variable -! (such as an element of a structure) but is not an array element or -! an array section cannot appear in a DEPEND clause - -subroutine vec_mult(N) - implicit none - integer :: i, N - real, allocatable :: p(:), v1(:), v2(:) - - type my_type - integer :: a(10) - end type my_type - - type(my_type) :: my_var - allocate( p(N), v1(N), v2(N) ) - - !$omp parallel num_threads(2) - !$omp single - - !$omp task depend(out:v1) - call init(v1, N) - !$omp end task - - !$omp task depend(out:v2) - call init(v2, N) - !$omp end task - - !ERROR: A variable that is part of another variable (such as an element of a structure) but is not an array element or an array section cannot appear in a DEPEND clause - !$omp target nowait depend(in:v1,v2, my_var%a) depend(out:p) & - !$omp& map(to:v1,v2) map(from: p) - !$omp parallel do - do i=1,N - p(i) = v1(i) * v2(i) - end do - !$omp end target - - !$omp task depend(in:p) - call output(p, N) - !$omp end task - - !$omp end single - !$omp end parallel - - deallocate( p, v1, v2 ) - -end subroutine From flang-commits at lists.llvm.org Thu May 29 03:24:34 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 03:24:34 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow 
structure component in `task depend` clauses (PR #141923) In-Reply-To: Message-ID: <683835e2.170a0220.269365.2493@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics @llvm/pr-subscribers-flang-fir-hlfir Author: Kareem Ergawy (ergawy)
Changes Even though the spec (version 5.2) prohibits strcuture components from being specified in `depend` clauses, this restriction is not sensible. This PR rectifies the issue by lifting that restriction and allowing structure components in `depend` clauses (which is allowed by OpenMP 6.0). --- Full diff: https://github.com/llvm/llvm-project/pull/141923.diff 5 Files Affected: - (modified) flang/include/flang/Evaluate/tools.h (+10) - (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+5) - (modified) flang/lib/Semantics/check-omp-structure.cpp (+2-6) - (added) flang/test/Lower/OpenMP/task-depend-structure-component.f90 (+21) - (removed) flang/test/Semantics/OpenMP/depend02.f90 (-49)
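For readers unfamiliar with the check, the `isStructureCompnent` helper that this PR adds to `flang/include/flang/Evaluate/tools.h` asks which alternative a data reference's variant currently holds. Below is a minimal, self-contained sketch of that idea. It does not use flang's real types: `SymbolRef`, `Component`, `ArrayRef`, `DataRef`, and `isStructureComponentSketch` are hypothetical stand-ins invented for illustration, and the `std::optional` argument stands in for whatever `ExtractDataRef` returns in the actual patch.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the kinds of designator a data reference can be.
// The real flang classes carry far more information than this.
struct SymbolRef { std::string name; };                      // e.g. v1
struct Component { std::string base; std::string member; };  // e.g. my_var%my_component
struct ArrayRef { std::string base; int subscript; };        // e.g. p(1)

// Hypothetical data reference: a tagged union over the alternatives above.
struct DataRef {
  std::variant<SymbolRef, Component, ArrayRef> u;
};

// Returns true only when a data reference was extracted and its outermost
// alternative is a structure component; an empty optional models the case
// where the expression is not a data reference at all.
bool isStructureComponentSketch(const std::optional<DataRef> &dataRef) {
  if (dataRef) {
    return std::holds_alternative<Component>(dataRef->u);
  }
  return false;
}
```

With these stand-ins, `isStructureComponentSketch(DataRef{Component{"my_var", "my_component"}})` is true, while a plain symbol, an array reference, or an empty optional yields false. In the patch, that false case is when `processDepend` takes one of its other branches instead of the new structure-component branch.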
https://github.com/llvm/llvm-project/pull/141923 From flang-commits at lists.llvm.org Thu May 29 03:27:06 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 03:27:06 -0700 (PDT) Subject: [flang-commits] [flang] 7e9887a - [flang] Generlize names of delayed privatization CLI flags (#138816) Message-ID: <6838367a.a70a0220.22ad38.24dd@mx.google.com> Author: Kareem Ergawy Date: 2025-05-29T12:27:03+02:00 New Revision: 7e9887a50df2de9c666f5e7ceb46c27bfccc618f URL: https://github.com/llvm/llvm-project/commit/7e9887a50df2de9c666f5e7ceb46c27bfccc618f DIFF: https://github.com/llvm/llvm-project/commit/7e9887a50df2de9c666f5e7ceb46c27bfccc618f.diff LOG: [flang] Generlize names of delayed privatization CLI flags (#138816) Remove the `openmp` prefix from delayed privatization/localization flags since they are now used for `do concurrent` as well. PR stack: - https://github.com/llvm/llvm-project/pull/137928 - https://github.com/llvm/llvm-project/pull/138505 - https://github.com/llvm/llvm-project/pull/138506 - https://github.com/llvm/llvm-project/pull/138512 - https://github.com/llvm/llvm-project/pull/138534 - https://github.com/llvm/llvm-project/pull/138816 (this PR) Added: flang/include/flang/Support/Flags.h flang/lib/Support/Flags.cpp Modified: flang/lib/Lower/Bridge.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Lower/OpenMP/Utils.cpp flang/lib/Lower/OpenMP/Utils.h flang/lib/Support/CMakeLists.txt flang/test/Lower/OpenMP/DelayedPrivatization/distribute-standalone-private.f90 flang/test/Lower/OpenMP/DelayedPrivatization/equivalence.f90 flang/test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90 flang/test/Lower/OpenMP/DelayedPrivatization/target-private-multiple-variables.f90 flang/test/Lower/OpenMP/DelayedPrivatization/target-private-simple.f90 flang/test/Lower/OpenMP/allocatable-multiple-vars.f90 flang/test/Lower/OpenMP/cfg-conversion-omp.private.f90 flang/test/Lower/OpenMP/debug_info_conflict.f90 
flang/test/Lower/OpenMP/delayed-privatization-allocatable-array.f90 flang/test/Lower/OpenMP/delayed-privatization-allocatable-firstprivate.f90 flang/test/Lower/OpenMP/delayed-privatization-allocatable-private.f90 flang/test/Lower/OpenMP/delayed-privatization-array.f90 flang/test/Lower/OpenMP/delayed-privatization-character-array.f90 flang/test/Lower/OpenMP/delayed-privatization-character.f90 flang/test/Lower/OpenMP/delayed-privatization-default-init.f90 flang/test/Lower/OpenMP/delayed-privatization-firstprivate.f90 flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 flang/test/Lower/OpenMP/delayed-privatization-pointer.f90 flang/test/Lower/OpenMP/delayed-privatization-private-firstprivate.f90 flang/test/Lower/OpenMP/delayed-privatization-private.f90 flang/test/Lower/OpenMP/delayed-privatization-reduction-byref.f90 flang/test/Lower/OpenMP/delayed-privatization-reduction.f90 flang/test/Lower/OpenMP/different_vars_lastprivate_barrier.f90 flang/test/Lower/OpenMP/firstprivate-commonblock.f90 flang/test/Lower/OpenMP/private-commonblock.f90 flang/test/Lower/OpenMP/private-derived-type.f90 flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 flang/test/Lower/do_concurrent_delayed_locality.f90 Removed: ################################################################################ diff --git a/flang/include/flang/Support/Flags.h b/flang/include/flang/Support/Flags.h new file mode 100644 index 0000000000000..bcbb72f8e50d0 --- /dev/null +++ b/flang/include/flang/Support/Flags.h @@ -0,0 +1,17 @@ +//===-- include/flang/Support/Flags.h ---------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#ifndef FORTRAN_SUPPORT_FLAGS_H_ +#define FORTRAN_SUPPORT_FLAGS_H_ + +#include "llvm/Support/CommandLine.h" + +extern llvm::cl::opt enableDelayedPrivatization; +extern llvm::cl::opt enableDelayedPrivatizationStaging; + +#endif // FORTRAN_SUPPORT_FLAGS_H_ diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 49675d34215a9..9f3c50a52973a 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -13,7 +13,6 @@ #include "flang/Lower/Bridge.h" #include "OpenMP/DataSharingProcessor.h" -#include "OpenMP/Utils.h" #include "flang/Lower/Allocatable.h" #include "flang/Lower/CallInterface.h" #include "flang/Lower/Coarray.h" @@ -63,6 +62,7 @@ #include "flang/Semantics/runtime-type-info.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" +#include "flang/Support/Flags.h" #include "flang/Support/Version.h" #include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" #include "mlir/IR/BuiltinAttributes.h" diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ddb08f74b3841..19e106e2593ab 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -34,6 +34,7 @@ #include "flang/Parser/parse-tree.h" #include "flang/Semantics/openmp-directive-sets.h" #include "flang/Semantics/tools.h" +#include "flang/Support/Flags.h" #include "flang/Support/OpenMP-utils.h" #include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h" #include "mlir/Dialect/OpenMP/OpenMPDialect.h" diff --git a/flang/lib/Lower/OpenMP/Utils.cpp b/flang/lib/Lower/OpenMP/Utils.cpp index 711d4af287691..c226c2558e7aa 100644 --- a/flang/lib/Lower/OpenMP/Utils.cpp +++ b/flang/lib/Lower/OpenMP/Utils.cpp @@ -33,18 +33,6 @@ llvm::cl::opt treatIndexAsSection( llvm::cl::desc("In the OpenMP data clauses treat `a(N)` as `a(N:N)`."), llvm::cl::init(true)); -llvm::cl::opt 
enableDelayedPrivatization( - "openmp-enable-delayed-privatization", - llvm::cl::desc( - "Emit `[first]private` variables as clauses on the MLIR ops."), - llvm::cl::init(true)); - -llvm::cl::opt enableDelayedPrivatizationStaging( - "openmp-enable-delayed-privatization-staging", - llvm::cl::desc("For partially supported constructs, emit `[first]private` " - "variables as clauses on the MLIR ops."), - llvm::cl::init(false)); - namespace Fortran { namespace lower { namespace omp { diff --git a/flang/lib/Lower/OpenMP/Utils.h b/flang/lib/Lower/OpenMP/Utils.h index 30b4613837b9a..a7eb2dc5ee664 100644 --- a/flang/lib/Lower/OpenMP/Utils.h +++ b/flang/lib/Lower/OpenMP/Utils.h @@ -17,8 +17,6 @@ #include extern llvm::cl::opt treatIndexAsSection; -extern llvm::cl::opt enableDelayedPrivatization; -extern llvm::cl::opt enableDelayedPrivatizationStaging; namespace fir { class FirOpBuilder; diff --git a/flang/lib/Support/CMakeLists.txt b/flang/lib/Support/CMakeLists.txt index 4ee381589a208..363f57ce97dae 100644 --- a/flang/lib/Support/CMakeLists.txt +++ b/flang/lib/Support/CMakeLists.txt @@ -44,6 +44,7 @@ endif() add_flang_library(FortranSupport default-kinds.cpp + Flags.cpp Fortran.cpp Fortran-features.cpp idioms.cpp diff --git a/flang/lib/Support/Flags.cpp b/flang/lib/Support/Flags.cpp new file mode 100644 index 0000000000000..02f64981618dd --- /dev/null +++ b/flang/lib/Support/Flags.cpp @@ -0,0 +1,20 @@ +//===-- lib/Support/Flags.cpp ---------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Support/Flags.h" + +llvm::cl::opt<bool> enableDelayedPrivatization("enable-delayed-privatization", + llvm::cl::desc( + "Emit private/local variables as clauses/specifiers on MLIR ops."), + llvm::cl::init(true)); + +llvm::cl::opt<bool> enableDelayedPrivatizationStaging( + "enable-delayed-privatization-staging", + llvm::cl::desc("For partially supported constructs, emit private/local " + "variables as clauses/specifiers on MLIR ops."), + llvm::cl::init(false)); diff --git a/flang/test/Lower/OpenMP/DelayedPrivatization/distribute-standalone-private.f90 b/flang/test/Lower/OpenMP/DelayedPrivatization/distribute-standalone-private.f90 index a9c85db79fa31..92aeb3fbc1ee7 100644 --- a/flang/test/Lower/OpenMP/DelayedPrivatization/distribute-standalone-private.f90 +++ b/flang/test/Lower/OpenMP/DelayedPrivatization/distribute-standalone-private.f90 @@ -1,6 +1,6 @@ -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization-staging \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization-staging \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization-staging -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization-staging -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine standalone_distribute diff --git a/flang/test/Lower/OpenMP/DelayedPrivatization/equivalence.f90 b/flang/test/Lower/OpenMP/DelayedPrivatization/equivalence.f90 index 721bfff012f14..5234862feaa76 100644 --- a/flang/test/Lower/OpenMP/DelayedPrivatization/equivalence.f90 +++ b/flang/test/Lower/OpenMP/DelayedPrivatization/equivalence.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for variables that are storage associated via `EQUIVALENCE`. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine private_common diff --git a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90 b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90 index 87c2c2ae26796..3d93fbc6e446e 100644 --- a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90 +++ b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-allocatable.f90 @@ -1,8 +1,8 @@ ! Tests delayed privatization for `targets ... private(..)` for allocatables. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization-staging \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization-staging \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization-staging -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization-staging -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine target_allocatable diff --git a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-multiple-variables.f90 b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-multiple-variables.f90 index ad7bfb3d7c247..12e15a2aafc2d 100644 --- a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-multiple-variables.f90 +++ b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-multiple-variables.f90 @@ -1,8 +1,8 @@ ! Tests delayed privatization for `targets ... private(..)` for allocatables. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization-staging \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization-staging \ ! RUN: -o - %s 2>&1 | FileCheck %s -! 
RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization-staging -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization-staging -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine target_allocatable(lb, ub, l) diff --git a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-simple.f90 b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-simple.f90 index 5abf2cbb15c92..f543068d29753 100644 --- a/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-simple.f90 +++ b/flang/test/Lower/OpenMP/DelayedPrivatization/target-private-simple.f90 @@ -1,8 +1,8 @@ ! Tests delayed privatization for `targets ... private(..)` for simple variables. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization-staging \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization-staging \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization-staging -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization-staging -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine target_simple diff --git a/flang/test/Lower/OpenMP/allocatable-multiple-vars.f90 b/flang/test/Lower/OpenMP/allocatable-multiple-vars.f90 index e6450a13e13a0..91ba75f2198e3 100644 --- a/flang/test/Lower/OpenMP/allocatable-multiple-vars.f90 +++ b/flang/test/Lower/OpenMP/allocatable-multiple-vars.f90 @@ -1,9 +1,9 @@ ! Test early privatization for multiple allocatable variables. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization=false \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization=false \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization=false -o - %s 2>&1 |\ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization=false -o - %s 2>&1 |\ ! 
RUN: FileCheck %s subroutine delayed_privatization_allocatable diff --git a/flang/test/Lower/OpenMP/cfg-conversion-omp.private.f90 b/flang/test/Lower/OpenMP/cfg-conversion-omp.private.f90 index 8b8adf2b140c7..f8d771d10d281 100644 --- a/flang/test/Lower/OpenMP/cfg-conversion-omp.private.f90 +++ b/flang/test/Lower/OpenMP/cfg-conversion-omp.private.f90 @@ -2,7 +2,7 @@ ! RUN: split-file %s %t && cd %t -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - test.f90 2>&1 | \ ! RUN: fir-opt --cfg-conversion -o test.cfg-conv.mlir ! RUN: FileCheck --input-file=test.cfg-conv.mlir %s --check-prefix="CFGConv" diff --git a/flang/test/Lower/OpenMP/debug_info_conflict.f90 b/flang/test/Lower/OpenMP/debug_info_conflict.f90 index 5e52db281da23..b80900476053a 100644 --- a/flang/test/Lower/OpenMP/debug_info_conflict.f90 +++ b/flang/test/Lower/OpenMP/debug_info_conflict.f90 @@ -1,7 +1,7 @@ ! Tests that there no debug-info conflicts arise because of DI attached to nested ! OMP regions arguments. -! RUN: %flang -c -fopenmp -g -mmlir --openmp-enable-delayed-privatization=true \ +! RUN: %flang -c -fopenmp -g -mmlir --enable-delayed-privatization=true \ ! RUN: %s -o - 2>&1 | FileCheck %s subroutine bar (b) diff --git a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-array.f90 b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-array.f90 index 9b6dbabf0c6ff..d1c7167546b43 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-array.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-array.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for allocatable arrays. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! 
RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 |\ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 |\ ! RUN: FileCheck %s subroutine delayed_privatization_private(var1, l1) diff --git a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-firstprivate.f90 b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-firstprivate.f90 index 01ca1073ae849..612bb55c770f7 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-firstprivate.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-firstprivate.f90 @@ -2,9 +2,9 @@ ! RUN: split-file %s %t -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/test_ir.f90 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %t/test_ir.f90 2>&1 |\ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %t/test_ir.f90 2>&1 |\ ! RUN: FileCheck %s !--- test_ir.f90 @@ -38,7 +38,7 @@ subroutine delayed_privatization_allocatable ! CHECK-NEXT: hlfir.assign %[[ORIG_BASE_LD]] to %[[PRIV_PRIV_ARG]] realloc ! CHECK-NEXT: } -! RUN: %flang -c -emit-llvm -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang -c -emit-llvm -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/test_compilation_to_obj.f90 | \ ! RUN: llvm-dis 2>&1 |\ ! RUN: FileCheck %s -check-prefix=LLVM diff --git a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-private.f90 b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-private.f90 index 4ce66f52110e0..67cfb864bc8f3 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-allocatable-private.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-allocatable-private.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for allocatables: `private`. -! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 |\ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 |\ ! RUN: FileCheck %s subroutine delayed_privatization_allocatable diff --git a/flang/test/Lower/OpenMP/delayed-privatization-array.f90 b/flang/test/Lower/OpenMP/delayed-privatization-array.f90 index c447fa6f27a75..9aaf75f66dbbb 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-array.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-array.f90 @@ -2,19 +2,19 @@ ! RUN: split-file %s %t -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/one_dim_array.f90 2>&1 | FileCheck %s --check-prefix=ONE_DIM -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - \ ! RUN: %t/one_dim_array.f90 2>&1 | FileCheck %s --check-prefix=ONE_DIM -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/two_dim_array.f90 2>&1 | FileCheck %s --check-prefix=TWO_DIM -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - \ ! RUN: %t/two_dim_array.f90 2>&1 | FileCheck %s --check-prefix=TWO_DIM -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/one_dim_array_default_lb.f90 2>&1 | FileCheck %s --check-prefix=ONE_DIM_DEFAULT_LB -! 
RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - \ ! RUN: %t/one_dim_array_default_lb.f90 2>&1 | FileCheck %s --check-prefix=ONE_DIM_DEFAULT_LB !--- one_dim_array.f90 diff --git a/flang/test/Lower/OpenMP/delayed-privatization-character-array.f90 b/flang/test/Lower/OpenMP/delayed-privatization-character-array.f90 index 4c7287283c7ad..383b033d772aa 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-character-array.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-character-array.f90 @@ -2,14 +2,14 @@ ! RUN: split-file %s %t -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/static_len.f90 2>&1 | FileCheck %s --check-prefix=STATIC_LEN -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %t/static_len.f90 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %t/static_len.f90 2>&1 \ ! RUN: | FileCheck %s --check-prefix=STATIC_LEN -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/dyn_len.f90 2>&1 | FileCheck %s --check-prefix=DYN_LEN -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %t/dyn_len.f90 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %t/dyn_len.f90 2>&1 \ ! RUN: | FileCheck %s --check-prefix=DYN_LEN !--- static_len.f90 diff --git a/flang/test/Lower/OpenMP/delayed-privatization-character.f90 b/flang/test/Lower/OpenMP/delayed-privatization-character.f90 index 3d1a312963371..d0f7ef6f2cd0c 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-character.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-character.f90 @@ -2,14 +2,14 @@ ! RUN: split-file %s %t -! 
RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/dyn_len.f90 2>&1 | FileCheck %s --check-prefix=DYN_LEN -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %t/dyn_len.f90 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %t/dyn_len.f90 2>&1 \ ! RUN: | FileCheck %s --check-prefix=DYN_LEN -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %t/static_len.f90 2>&1 | FileCheck %s --check-prefix=STATIC_LEN -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %t/static_len.f90 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %t/static_len.f90 2>&1 \ ! RUN: | FileCheck %s --check-prefix=STATIC_LEN !--- dyn_len.f90 diff --git a/flang/test/Lower/OpenMP/delayed-privatization-default-init.f90 b/flang/test/Lower/OpenMP/delayed-privatization-default-init.f90 index 87d4605217a8a..cb17e4cd6afc1 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-default-init.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-default-init.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for derived types with default initialization. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 |\ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 |\ ! 
RUN: FileCheck %s subroutine delayed_privatization_default_init diff --git a/flang/test/Lower/OpenMP/delayed-privatization-firstprivate.f90 b/flang/test/Lower/OpenMP/delayed-privatization-firstprivate.f90 index 904ea783ad5b4..3f80b5e1bd209 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-firstprivate.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-firstprivate.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for the `firstprivate` clause. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine delayed_privatization_firstprivate diff --git a/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 b/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 index ac9a6d8746cf2..effc356590e9a 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 @@ -1,7 +1,7 @@ ! Tests the OMPIRBuilder can handle multiple privatization regions that contain ! multiple BBs (for example, for allocatables). -! RUN: %flang -S -emit-llvm -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang -S -emit-llvm -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s subroutine foo(x) diff --git a/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90 b/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90 index f39ac9199e8bd..38ead806199c1 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-pointer.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for pointers: `private`. 
-! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 |\ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 |\ ! RUN: FileCheck %s subroutine delayed_privatization_pointer diff --git a/flang/test/Lower/OpenMP/delayed-privatization-private-firstprivate.f90 b/flang/test/Lower/OpenMP/delayed-privatization-private-firstprivate.f90 index d961210dcbc38..ad53703d3122e 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-private-firstprivate.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-private-firstprivate.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for both `private` and `firstprivate` clauses. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine delayed_privatization_private_firstprivate diff --git a/flang/test/Lower/OpenMP/delayed-privatization-private.f90 b/flang/test/Lower/OpenMP/delayed-privatization-private.f90 index 69c362e4828bf..84d6caedf5010 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-private.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-private.f90 @@ -1,8 +1,8 @@ ! Test delayed privatization for the `private` clause. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 \ +! 
RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 \ ! RUN: | FileCheck %s subroutine delayed_privatization_private diff --git a/flang/test/Lower/OpenMP/delayed-privatization-reduction-byref.f90 b/flang/test/Lower/OpenMP/delayed-privatization-reduction-byref.f90 index f463f2b4630ae..4b6a643f94059 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-reduction-byref.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-reduction-byref.f90 @@ -3,7 +3,7 @@ ! that the block arguments are added in the proper order (reductions first and ! then delayed privatization. -! RUN: bbc -emit-hlfir -fopenmp --force-byref-reduction --openmp-enable-delayed-privatization -o - %s 2>&1 | FileCheck %s +! RUN: bbc -emit-hlfir -fopenmp --force-byref-reduction --enable-delayed-privatization -o - %s 2>&1 | FileCheck %s subroutine red_and_delayed_private integer :: red diff --git a/flang/test/Lower/OpenMP/delayed-privatization-reduction.f90 b/flang/test/Lower/OpenMP/delayed-privatization-reduction.f90 index a1ddbc30d6e46..f8f78b0531091 100644 --- a/flang/test/Lower/OpenMP/delayed-privatization-reduction.f90 +++ b/flang/test/Lower/OpenMP/delayed-privatization-reduction.f90 @@ -3,9 +3,9 @@ ! that the block arguments are added in the proper order (reductions first and ! then delayed privatization. -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization \ +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization \ ! RUN: -o - %s 2>&1 | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization -o - %s 2>&1 \ +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization -o - %s 2>&1 \ ! 
RUN: | FileCheck %s subroutine red_and_delayed_private diff --git a/flang/test/Lower/OpenMP/different_vars_lastprivate_barrier.f90 b/flang/test/Lower/OpenMP/different_vars_lastprivate_barrier.f90 index b74e083925aba..5bf634c86652b 100644 --- a/flang/test/Lower/OpenMP/different_vars_lastprivate_barrier.f90 +++ b/flang/test/Lower/OpenMP/different_vars_lastprivate_barrier.f90 @@ -1,4 +1,4 @@ -! RUN: %flang_fc1 -fopenmp -mmlir --openmp-enable-delayed-privatization-staging=true -emit-hlfir %s -o - | FileCheck %s +! RUN: %flang_fc1 -fopenmp -mmlir --enable-delayed-privatization-staging=true -emit-hlfir %s -o - | FileCheck %s subroutine first_and_lastprivate(var) integer i diff --git a/flang/test/Lower/OpenMP/firstprivate-commonblock.f90 b/flang/test/Lower/OpenMP/firstprivate-commonblock.f90 index 315e1b7745a6f..1b029c193b7b6 100644 --- a/flang/test/Lower/OpenMP/firstprivate-commonblock.f90 +++ b/flang/test/Lower/OpenMP/firstprivate-commonblock.f90 @@ -1,5 +1,5 @@ ! RUN: %flang_fc1 -emit-hlfir -fopenmp \ -! RUN: -mmlir --openmp-enable-delayed-privatization=true -o - %s 2>&1 \ +! RUN: -mmlir --enable-delayed-privatization=true -o - %s 2>&1 \ ! RUN: | FileCheck %s !CHECK: func.func @_QPfirstprivate_common() { diff --git a/flang/test/Lower/OpenMP/private-commonblock.f90 b/flang/test/Lower/OpenMP/private-commonblock.f90 index 009b086a0c7fd..8f5f641dea325 100644 --- a/flang/test/Lower/OpenMP/private-commonblock.f90 +++ b/flang/test/Lower/OpenMP/private-commonblock.f90 @@ -1,5 +1,5 @@ ! RUN: %flang_fc1 -emit-hlfir -fopenmp \ -! RUN: -mmlir --openmp-enable-delayed-privatization=true -o - %s 2>&1 \ +! RUN: -mmlir --enable-delayed-privatization=true -o - %s 2>&1 \ ! 
RUN: | FileCheck %s !CHECK: func.func @_QPprivate_common() { diff --git a/flang/test/Lower/OpenMP/private-derived-type.f90 b/flang/test/Lower/OpenMP/private-derived-type.f90 index cb51c2b34b424..3947a0d68b58c 100644 --- a/flang/test/Lower/OpenMP/private-derived-type.f90 +++ b/flang/test/Lower/OpenMP/private-derived-type.f90 @@ -1,5 +1,5 @@ -! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --openmp-enable-delayed-privatization-staging=true -o - %s | FileCheck %s -! RUN: bbc -emit-hlfir -fopenmp --openmp-enable-delayed-privatization-staging=true -o - %s | FileCheck %s +! RUN: %flang_fc1 -emit-hlfir -fopenmp -mmlir --enable-delayed-privatization-staging=true -o - %s | FileCheck %s +! RUN: bbc -emit-hlfir -fopenmp --enable-delayed-privatization-staging=true -o - %s | FileCheck %s subroutine s4 type y3 diff --git a/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 b/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 index 45d6f91f67f1f..14d860c30f6f2 100644 --- a/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 +++ b/flang/test/Lower/OpenMP/same_var_first_lastprivate.f90 @@ -1,4 +1,4 @@ -! RUN: %flang_fc1 -fopenmp -mmlir --openmp-enable-delayed-privatization-staging=true -emit-hlfir %s -o - | FileCheck %s +! RUN: %flang_fc1 -fopenmp -mmlir --enable-delayed-privatization-staging=true -emit-hlfir %s -o - | FileCheck %s subroutine first_and_lastprivate integer i diff --git a/flang/test/Lower/do_concurrent_delayed_locality.f90 b/flang/test/Lower/do_concurrent_delayed_locality.f90 index 9b234087ed4be..6cae0eb46db13 100644 --- a/flang/test/Lower/do_concurrent_delayed_locality.f90 +++ b/flang/test/Lower/do_concurrent_delayed_locality.f90 @@ -1,4 +1,4 @@ -! RUN: %flang_fc1 -emit-hlfir -mmlir --openmp-enable-delayed-privatization-staging=true -o - %s | FileCheck %s +! 
RUN: %flang_fc1 -emit-hlfir -mmlir --enable-delayed-privatization-staging=true -o - %s | FileCheck %s subroutine do_concurrent_with_locality_specs implicit none From flang-commits at lists.llvm.org Thu May 29 03:27:09 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 03:27:09 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Generlize names of delayed privatization CLI flags (PR #138816) In-Reply-To: Message-ID: <6838367d.050a0220.2626e0.240a@mx.google.com> https://github.com/ergawy closed https://github.com/llvm/llvm-project/pull/138816 From flang-commits at lists.llvm.org Thu May 29 03:27:12 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 03:27:12 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir][OpenMP] Refactor privtization code into shared location (PR #141767) In-Reply-To: Message-ID: <68383680.170a0220.16bdbd.1f63@mx.google.com> https://github.com/ergawy edited https://github.com/llvm/llvm-project/pull/141767 From flang-commits at lists.llvm.org Thu May 29 03:28:05 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 03:28:05 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir][OpenMP] Refactor privtization code into shared location (PR #141767) In-Reply-To: Message-ID: <683836b5.170a0220.2208a0.26e1@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/141767 >From a69412fbe209d2049943f1f781dcb03e2b5cc8de Mon Sep 17 00:00:00 2001 From: ergawy Date: Wed, 28 May 2025 06:18:33 -0500 Subject: [PATCH] [flang][fir][OpenMP] Refactor privtization code into shared location Refactors the utils needed to create privatization/localization ops for both the fir and OpenMP dialects into a shared location, isolating OpenMP-specific code out of it as much as possible. 
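The shape of this refactor can be sketched in isolation: a shared template helper builds the privatizer/localizer op, and the caller passes in a callback that fills the init region, so the dialect-specific part stays out of the shared code. The following is only a minimal illustration under stated assumptions. `OmpPrivatizer`, `FirLocalizer`, `privatizeForOpenMP` and `localizeForDoConcurrent` are hypothetical stand-ins, not the real MLIR or flang classes, and the helper's signature is deliberately simplified compared to the one in the patch.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for the two dialect ops (not the real MLIR classes).
struct OmpPrivatizer {
  std::string name;
  bool hasInitRegion = false;
};
struct FirLocalizer {
  std::string name;
  bool hasInitRegion = false;
};

// Shared helper (simplified): creates the op and, when initialization is
// needed, delegates init-region generation to the caller-provided callback.
template <typename OpType>
OpType privatizeSymbol(const std::string &symName, bool needsInit,
                       const std::function<void(OpType &)> &initGen,
                       std::vector<std::string> &allPrivatizedSymbols) {
  OpType op;
  op.name = symName + ".privatizer";
  if (needsInit)
    initGen(op); // dialect-specific work happens in the callback
  // Simplification: the patch only records symbols with local locality.
  allPrivatizedSymbols.push_back(symName);
  return op;
}

// OpenMP-side caller (hypothetical): supplies a callback that populates the
// init region.
inline OmpPrivatizer privatizeForOpenMP(const std::string &sym,
                                        std::vector<std::string> &seen) {
  return privatizeSymbol<OmpPrivatizer>(
      sym, /*needsInit=*/true,
      [](OmpPrivatizer &op) { op.hasInitRegion = true; }, seen);
}

// `do concurrent`-side caller (hypothetical): init regions are not supported
// yet, so the callback reports an error, mirroring the TODO in the patch.
inline FirLocalizer localizeForDoConcurrent(const std::string &sym,
                                            bool needsInit,
                                            std::vector<std::string> &seen) {
  return privatizeSymbol<FirLocalizer>(
      sym, needsInit,
      [](FirLocalizer &) {
        throw std::logic_error(
            "Localizers that need init regions are not supported yet.");
      },
      seen);
}
```

The point of the callback parameter is that the shared helper never names an OpenMP-only utility. Each caller decides what "initialize" means for its own dialect.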
--- flang/include/flang/Lower/Support/Utils.h | 8 + flang/lib/Lower/Bridge.cpp | 23 +- .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 207 +++--------------- flang/lib/Lower/OpenMP/DataSharingProcessor.h | 4 +- flang/lib/Lower/Support/Utils.cpp | 180 +++++++++++++++ 5 files changed, 239 insertions(+), 183 deletions(-) diff --git a/flang/include/flang/Lower/Support/Utils.h b/flang/include/flang/Lower/Support/Utils.h index baaf644dd6efb..8ad3a903beee9 100644 --- a/flang/include/flang/Lower/Support/Utils.h +++ b/flang/include/flang/Lower/Support/Utils.h @@ -94,6 +94,14 @@ bool isEqual(const Fortran::lower::SomeExpr *x, const Fortran::lower::SomeExpr *y); bool isEqual(const Fortran::lower::ExplicitIterSpace::ArrayBases &x, const Fortran::lower::ExplicitIterSpace::ArrayBases &y); + +template +void privatizeSymbol( + lower::AbstractConverter &converter, fir::FirOpBuilder &firOpBuilder, + lower::SymMap &symTable, std::function initGen, + llvm::SetVector &allPrivatizedSymbols, + const semantics::Symbol *symToPrivatize, OperandsStructType *clauseOps); + } // end namespace Fortran::lower // DenseMapInfo for pointers to Fortran::lower::SomeExpr. diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 9f3c50a52973a..4e6db3eaa990d 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -2029,10 +2029,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { void handleLocalitySpecs(const IncrementLoopInfo &info) { Fortran::semantics::SemanticsContext &semanticsContext = bridge.getSemanticsContext(); - // TODO Extract `DataSharingProcessor` from omp to a more general location. - Fortran::lower::omp::DataSharingProcessor dsp( - *this, semanticsContext, getEval(), - /*useDelayedPrivatization=*/true, localSymbols); fir::LocalitySpecifierOperands privateClauseOps; auto doConcurrentLoopOp = mlir::dyn_cast_if_present(info.loopOp); @@ -2041,10 +2037,17 @@ class FirConverter : public Fortran::lower::AbstractConverter { // complete. 
bool useDelayedPriv = enableDelayedPrivatizationStaging && doConcurrentLoopOp; + llvm::SetVector allPrivatizedSymbols; for (const Fortran::semantics::Symbol *sym : info.localSymList) { if (useDelayedPriv) { - dsp.privatizeSymbol(sym, &privateClauseOps); + Fortran::lower::privatizeSymbol( + *this, this->getFirOpBuilder(), localSymbols, + [this](fir::LocalitySpecifierOp result, mlir::Type argType) { + TODO(this->toLocation(), + "Localizers that need init regions are not supported yet."); + }, + allPrivatizedSymbols, sym, &privateClauseOps); continue; } @@ -2053,7 +2056,13 @@ class FirConverter : public Fortran::lower::AbstractConverter { for (const Fortran::semantics::Symbol *sym : info.localInitSymList) { if (useDelayedPriv) { - dsp.privatizeSymbol(sym, &privateClauseOps); + Fortran::lower::privatizeSymbol( + *this, this->getFirOpBuilder(), localSymbols, + [this](fir::LocalitySpecifierOp result, mlir::Type argType) { + TODO(this->toLocation(), + "Localizers that need init regions are not supported yet."); + }, + allPrivatizedSymbols, sym, &privateClauseOps); continue; } @@ -2083,7 +2092,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { builder->getArrayAttr(privateClauseOps.privateSyms)); for (auto [sym, privateVar] : llvm::zip_equal( - dsp.getAllSymbolsToPrivatize(), privateClauseOps.privateVars)) { + allPrivatizedSymbols, privateClauseOps.privateVars)) { auto arg = doConcurrentLoopOp.getRegion().begin()->addArgument( privateVar.getType(), doConcurrentLoopOp.getLoc()); bindSymbol(*sym, hlfir::translateToExtendedValue( diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 629478294ef5b..03109c82a976a 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -16,6 +16,7 @@ #include "Utils.h" #include "flang/Lower/ConvertVariable.h" #include "flang/Lower/PFTBuilder.h" +#include "flang/Lower/Support/Utils.h" #include 
"flang/Lower/SymbolMap.h" #include "flang/Optimizer/Builder/BoxValue.h" #include "flang/Optimizer/Builder/HLFIRTools.h" @@ -527,188 +528,48 @@ void DataSharingProcessor::copyLastPrivatize(mlir::Operation *op) { } } -template void DataSharingProcessor::privatizeSymbol( - const semantics::Symbol *symToPrivatize, OperandsStructType *clauseOps) { + const semantics::Symbol *symToPrivatize, + mlir::omp::PrivateClauseOps *clauseOps) { if (!useDelayedPrivatization) { cloneSymbol(symToPrivatize); copyFirstPrivateSymbol(symToPrivatize); return; } - const semantics::Symbol *sym = symToPrivatize->HasLocalLocality() - ? &symToPrivatize->GetUltimate() - : symToPrivatize; - lower::SymbolBox hsb = symToPrivatize->HasLocalLocality() - ? converter.shallowLookupSymbol(*sym) - : converter.lookupOneLevelUpSymbol(*sym); - assert(hsb && "Host symbol box not found"); - hlfir::Entity entity{hsb.getAddr()}; - bool cannotHaveNonDefaultLowerBounds = !entity.mayHaveNonDefaultLowerBounds(); - - mlir::Location symLoc = hsb.getAddr().getLoc(); - std::string privatizerName = sym->name().ToString() + ".privatizer"; - bool isFirstPrivate = - symToPrivatize->test(semantics::Symbol::Flag::OmpFirstPrivate) || - symToPrivatize->test(semantics::Symbol::Flag::LocalityLocalInit); - - mlir::Value privVal = hsb.getAddr(); - mlir::Type allocType = privVal.getType(); - if (!mlir::isa(privVal.getType())) - allocType = fir::unwrapRefType(privVal.getType()); - - if (auto poly = mlir::dyn_cast(allocType)) { - if (!mlir::isa(poly.getEleTy()) && isFirstPrivate) - TODO(symLoc, "create polymorphic host associated copy"); - } - - // fir.array<> cannot be converted to any single llvm type and fir helpers - // are not available in openmp to llvmir translation so we cannot generate - // an alloca for a fir.array type there. Get around this by boxing all - // arrays. 
- if (mlir::isa(allocType)) { - entity = genVariableBox(symLoc, firOpBuilder, entity); - privVal = entity.getBase(); - allocType = privVal.getType(); - } - - if (mlir::isa(privVal.getType())) { - // Boxes should be passed by reference into nested regions: - auto oldIP = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); - auto alloca = firOpBuilder.create(symLoc, privVal.getType()); - firOpBuilder.restoreInsertionPoint(oldIP); - firOpBuilder.create(symLoc, privVal, alloca); - privVal = alloca; - } - - mlir::Type argType = privVal.getType(); - - OpType privatizerOp = [&]() { - auto moduleOp = firOpBuilder.getModule(); - auto uniquePrivatizerName = fir::getTypeAsString( - allocType, converter.getKindMap(), - converter.mangleName(*sym) + - (isFirstPrivate ? "_firstprivate" : "_private")); - - if (auto existingPrivatizer = - moduleOp.lookupSymbol(uniquePrivatizerName)) - return existingPrivatizer; - - mlir::OpBuilder::InsertionGuard guard(firOpBuilder); - firOpBuilder.setInsertionPointToStart(moduleOp.getBody()); - OpType result; - - if constexpr (std::is_same_v) { - result = firOpBuilder.create( - symLoc, uniquePrivatizerName, allocType, - isFirstPrivate ? mlir::omp::DataSharingClauseType::FirstPrivate - : mlir::omp::DataSharingClauseType::Private); - } else { - result = firOpBuilder.create( - symLoc, uniquePrivatizerName, allocType, - isFirstPrivate ? fir::LocalitySpecifierType::LocalInit - : fir::LocalitySpecifierType::Local); - } - - fir::ExtendedValue symExV = converter.getSymbolExtendedValue(*sym); - lower::SymMapScope outerScope(symTable); - - // Populate the `init` region. - // We need to initialize in the following cases: - // 1. The allocation was for a derived type which requires initialization - // (this can be skipped if it will be initialized anyway by the copy - // region, unless the derived type has allocatable components) - // 2. The allocation was for any kind of box - // 3. 
The allocation was for a boxed character - const bool needsInitialization = - (Fortran::lower::hasDefaultInitialization(sym->GetUltimate()) && - (!isFirstPrivate || hlfir::mayHaveAllocatableComponent(allocType))) || - mlir::isa(allocType) || - mlir::isa(allocType); - if (needsInitialization) { - mlir::Region &initRegion = result.getInitRegion(); - mlir::Block *initBlock = firOpBuilder.createBlock( - &initRegion, /*insertPt=*/{}, {argType, argType}, {symLoc, symLoc}); - - populateByRefInitAndCleanupRegions( - converter, symLoc, argType, /*scalarInitValue=*/nullptr, initBlock, - result.getInitPrivateArg(), result.getInitMoldArg(), - result.getDeallocRegion(), - isFirstPrivate ? DeclOperationKind::FirstPrivate - : DeclOperationKind::Private, - sym, cannotHaveNonDefaultLowerBounds); - // TODO: currently there are false positives from dead uses of the mold - // arg - if (result.initReadsFromMold()) - mightHaveReadHostSym.insert(sym); - } - - // Populate the `copy` region if this is a `firstprivate`. - if (isFirstPrivate) { - mlir::Region &copyRegion = result.getCopyRegion(); - // First block argument corresponding to the original/host value while - // second block argument corresponding to the privatized value. 
- mlir::Block *copyEntryBlock = firOpBuilder.createBlock( - &copyRegion, /*insertPt=*/{}, {argType, argType}, {symLoc, symLoc}); - firOpBuilder.setInsertionPointToEnd(copyEntryBlock); - - auto addSymbol = [&](unsigned argIdx, const semantics::Symbol *symToMap, - bool force = false) { - symExV.match( - [&](const fir::MutableBoxValue &box) { - symTable.addSymbol( - *symToMap, - fir::substBase(box, copyRegion.getArgument(argIdx)), force); - }, - [&](const auto &box) { - symTable.addSymbol(*symToMap, copyRegion.getArgument(argIdx), - force); - }); - }; - - addSymbol(0, sym, true); - lower::SymMapScope innerScope(symTable); - addSymbol(1, symToPrivatize); - - auto ip = firOpBuilder.saveInsertionPoint(); - copyFirstPrivateSymbol(symToPrivatize, &ip); - - if constexpr (std::is_same_v) { - firOpBuilder.create( - hsb.getAddr().getLoc(), - symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); - } else { - firOpBuilder.create( - hsb.getAddr().getLoc(), - symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); - } - } - - return result; - }(); - - if (clauseOps) { - clauseOps->privateSyms.push_back(mlir::SymbolRefAttr::get(privatizerOp)); - clauseOps->privateVars.push_back(privVal); - } + auto initGen = [&](mlir::omp::PrivateClauseOp result, mlir::Type argType) { + lower::SymbolBox hsb = converter.lookupOneLevelUpSymbol(*symToPrivatize); + assert(hsb && "Host symbol box not found"); + hlfir::Entity entity{hsb.getAddr()}; + bool cannotHaveNonDefaultLowerBounds = + !entity.mayHaveNonDefaultLowerBounds(); + + mlir::Region &initRegion = result.getInitRegion(); + mlir::Location symLoc = hsb.getAddr().getLoc(); + mlir::Block *initBlock = firOpBuilder.createBlock( + &initRegion, /*insertPt=*/{}, {argType, argType}, {symLoc, symLoc}); + + bool emitCopyRegion = + symToPrivatize->test(semantics::Symbol::Flag::OmpFirstPrivate); + + populateByRefInitAndCleanupRegions( + converter, symLoc, argType, /*scalarInitValue=*/nullptr, initBlock, + result.getInitPrivateArg(), 
result.getInitMoldArg(), + result.getDeallocRegion(), + emitCopyRegion ? omp::DeclOperationKind::FirstPrivate + : omp::DeclOperationKind::Private, + symToPrivatize, cannotHaveNonDefaultLowerBounds); + // TODO: currently there are false positives from dead uses of the mold + // arg + if (result.initReadsFromMold()) + mightHaveReadHostSym.insert(symToPrivatize); + }; - if (symToPrivatize->HasLocalLocality()) - allPrivatizedSymbols.insert(symToPrivatize); + Fortran::lower::privatizeSymbol( + converter, firOpBuilder, symTable, initGen, allPrivatizedSymbols, + symToPrivatize, clauseOps); } - -template void -DataSharingProcessor::privatizeSymbol( - const semantics::Symbol *symToPrivatize, - mlir::omp::PrivateClauseOps *clauseOps); - -template void -DataSharingProcessor::privatizeSymbol( - const semantics::Symbol *symToPrivatize, - fir::LocalitySpecifierOperands *clauseOps); - } // namespace omp } // namespace lower } // namespace Fortran diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.h b/flang/lib/Lower/OpenMP/DataSharingProcessor.h index ae759bfef566b..8a7dbb2ae30b7 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.h +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.h @@ -153,10 +153,8 @@ class DataSharingProcessor { : llvm::ArrayRef(); } - template void privatizeSymbol(const semantics::Symbol *symToPrivatize, - OperandsStructType *clauseOps); + mlir::omp::PrivateClauseOps *clauseOps); }; } // namespace omp diff --git a/flang/lib/Lower/Support/Utils.cpp b/flang/lib/Lower/Support/Utils.cpp index 668ee31a36bc3..de810cb2f4b34 100644 --- a/flang/lib/Lower/Support/Utils.cpp +++ b/flang/lib/Lower/Support/Utils.cpp @@ -633,4 +633,184 @@ bool isEqual(const Fortran::lower::ExplicitIterSpace::ArrayBases &x, }}, x, y); } + +void copyFirstPrivateSymbol(lower::AbstractConverter &converter, + const semantics::Symbol *sym, + mlir::OpBuilder::InsertPoint *copyAssignIP) { + if (sym->test(semantics::Symbol::Flag::OmpFirstPrivate) || + 
sym->test(semantics::Symbol::Flag::LocalityLocalInit)) + converter.copyHostAssociateVar(*sym, copyAssignIP); +} + +template +void privatizeSymbol( + lower::AbstractConverter &converter, fir::FirOpBuilder &firOpBuilder, + lower::SymMap &symTable, std::function initGen, + llvm::SetVector &allPrivatizedSymbols, + const semantics::Symbol *symToPrivatize, OperandsStructType *clauseOps) { + const semantics::Symbol *sym = symToPrivatize->HasLocalLocality() + ? &symToPrivatize->GetUltimate() + : symToPrivatize; + lower::SymbolBox hsb = symToPrivatize->HasLocalLocality() + ? converter.shallowLookupSymbol(*sym) + : converter.lookupOneLevelUpSymbol(*sym); + assert(hsb && "Host symbol box not found"); + hlfir::Entity entity{hsb.getAddr()}; + bool cannotHaveNonDefaultLowerBounds = !entity.mayHaveNonDefaultLowerBounds(); + + mlir::Location symLoc = hsb.getAddr().getLoc(); + std::string privatizerName = sym->name().ToString() + ".privatizer"; + bool emitCopyRegion = + symToPrivatize->test(semantics::Symbol::Flag::OmpFirstPrivate) || + symToPrivatize->test(semantics::Symbol::Flag::LocalityLocalInit); + + mlir::Value privVal = hsb.getAddr(); + mlir::Type allocType = privVal.getType(); + if (!mlir::isa(privVal.getType())) + allocType = fir::unwrapRefType(privVal.getType()); + + if (auto poly = mlir::dyn_cast(allocType)) { + if (!mlir::isa(poly.getEleTy()) && emitCopyRegion) + TODO(symLoc, "create polymorphic host associated copy"); + } + + // fir.array<> cannot be converted to any single llvm type and fir helpers + // are not available in openmp to llvmir translation so we cannot generate + // an alloca for a fir.array type there. Get around this by boxing all + // arrays. 
+ if (mlir::isa(allocType)) { + entity = genVariableBox(symLoc, firOpBuilder, entity); + privVal = entity.getBase(); + allocType = privVal.getType(); + } + + if (mlir::isa(privVal.getType())) { + // Boxes should be passed by reference into nested regions: + auto oldIP = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + auto alloca = firOpBuilder.create(symLoc, privVal.getType()); + firOpBuilder.restoreInsertionPoint(oldIP); + firOpBuilder.create(symLoc, privVal, alloca); + privVal = alloca; + } + + mlir::Type argType = privVal.getType(); + + OpType privatizerOp = [&]() { + auto moduleOp = firOpBuilder.getModule(); + auto uniquePrivatizerName = fir::getTypeAsString( + allocType, converter.getKindMap(), + converter.mangleName(*sym) + + (emitCopyRegion ? "_firstprivate" : "_private")); + + if (auto existingPrivatizer = + moduleOp.lookupSymbol(uniquePrivatizerName)) + return existingPrivatizer; + + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(moduleOp.getBody()); + OpType result; + + if constexpr (std::is_same_v) { + result = firOpBuilder.create( + symLoc, uniquePrivatizerName, allocType, + emitCopyRegion ? mlir::omp::DataSharingClauseType::FirstPrivate + : mlir::omp::DataSharingClauseType::Private); + } else { + result = firOpBuilder.create( + symLoc, uniquePrivatizerName, allocType, + emitCopyRegion ? fir::LocalitySpecifierType::LocalInit + : fir::LocalitySpecifierType::Local); + } + + fir::ExtendedValue symExV = converter.getSymbolExtendedValue(*sym); + lower::SymMapScope outerScope(symTable); + + // Populate the `init` region. + // We need to initialize in the following cases: + // 1. The allocation was for a derived type which requires initialization + // (this can be skipped if it will be initialized anyway by the copy + // region, unless the derived type has allocatable components) + // 2. The allocation was for any kind of box + // 3. 
The allocation was for a boxed character + const bool needsInitialization = + (Fortran::lower::hasDefaultInitialization(sym->GetUltimate()) && + (!emitCopyRegion || hlfir::mayHaveAllocatableComponent(allocType))) || + mlir::isa(allocType) || + mlir::isa(allocType); + if (needsInitialization) { + initGen(result, argType); + } + + // Populate the `copy` region if this is a `firstprivate`. + if (emitCopyRegion) { + mlir::Region &copyRegion = result.getCopyRegion(); + // First block argument corresponding to the original/host value while + // second block argument corresponding to the privatized value. + mlir::Block *copyEntryBlock = firOpBuilder.createBlock( + &copyRegion, /*insertPt=*/{}, {argType, argType}, {symLoc, symLoc}); + firOpBuilder.setInsertionPointToEnd(copyEntryBlock); + + auto addSymbol = [&](unsigned argIdx, const semantics::Symbol *symToMap, + bool force = false) { + symExV.match( + [&](const fir::MutableBoxValue &box) { + symTable.addSymbol( + *symToMap, + fir::substBase(box, copyRegion.getArgument(argIdx)), force); + }, + [&](const auto &box) { + symTable.addSymbol(*symToMap, copyRegion.getArgument(argIdx), + force); + }); + }; + + addSymbol(0, sym, true); + lower::SymMapScope innerScope(symTable); + addSymbol(1, symToPrivatize); + + auto ip = firOpBuilder.saveInsertionPoint(); + copyFirstPrivateSymbol(converter, symToPrivatize, &ip); + + if constexpr (std::is_same_v) { + firOpBuilder.create( + hsb.getAddr().getLoc(), + symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); + } else { + firOpBuilder.create( + hsb.getAddr().getLoc(), + symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); + } + } + + return result; + }(); + + if (clauseOps) { + clauseOps->privateSyms.push_back(mlir::SymbolRefAttr::get(privatizerOp)); + clauseOps->privateVars.push_back(privVal); + } + + if (symToPrivatize->HasLocalLocality()) + allPrivatizedSymbols.insert(symToPrivatize); +} + +template void +privatizeSymbol( + lower::AbstractConverter &converter, fir::FirOpBuilder 
&firOpBuilder, + lower::SymMap &symTable, + std::function initGen, + llvm::SetVector &allPrivatizedSymbols, + const semantics::Symbol *symToPrivatize, + mlir::omp::PrivateClauseOps *clauseOps); + +template void +privatizeSymbol( + lower::AbstractConverter &converter, fir::FirOpBuilder &firOpBuilder, + lower::SymMap &symTable, + std::function initGen, + llvm::SetVector &allPrivatizedSymbols, + const semantics::Symbol *symToPrivatize, + fir::LocalitySpecifierOperands *clauseOps); + } // end namespace Fortran::lower From flang-commits at lists.llvm.org Thu May 29 03:44:51 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 29 May 2025 03:44:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir][OpenMP] Refactor privtization code into shared location (PR #141767) In-Reply-To: Message-ID: <68383aa3.050a0220.11f05a.2538@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/141767 From flang-commits at lists.llvm.org Thu May 29 03:44:51 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 29 May 2025 03:44:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir][OpenMP] Refactor privtization code into shared location (PR #141767) In-Reply-To: Message-ID: <68383aa3.170a0220.3a6d5b.2067@mx.google.com> https://github.com/tblah approved this pull request. LGTM, thanks https://github.com/llvm/llvm-project/pull/141767 From flang-commits at lists.llvm.org Thu May 29 03:46:56 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 29 May 2025 03:46:56 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow structure component in `task depend` clauses (PR #141923) In-Reply-To: Message-ID: <68383b20.170a0220.3a91c7.21d5@mx.google.com> https://github.com/tblah approved this pull request. LGTM, thanks! 
https://github.com/llvm/llvm-project/pull/141923 From flang-commits at lists.llvm.org Thu May 29 04:05:07 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Thu, 29 May 2025 04:05:07 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68383f63.170a0220.18ad4.2183@mx.google.com> https://github.com/snarang181 updated https://github.com/llvm/llvm-project/pull/141882 >From 560d42eab635803b217a04237b2d2ac7f02a1f7b Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Wed, 28 May 2025 20:21:16 -0400 Subject: [PATCH 1/3] [Flang][Docs] Add Sphinx man page support for Flang This patch enables building Flang man pages by: - Adding a `man_pages` entry in flang/docs/conf.py for Sphinx man builder. - Adding a minimal `index.rst` as the master document. - Adding placeholder `.rst` files for FIRLangRef and FlangCommandLineReference to fix toctree references. These changes unblock builds using `-DLLVM_BUILD_MANPAGES=ON` and allow `ninja docs-flang-man` to generate `flang.1`. 
Fixes #141757 --- flang/docs/FIRLangRef.rst | 4 ++++ flang/docs/FlangCommandLineReference.rst | 4 ++++ flang/docs/conf.py | 4 +++- flang/docs/index.rst | 10 ++++++++++ 4 files changed, 21 insertions(+), 1 deletion(-) create mode 100644 flang/docs/FIRLangRef.rst create mode 100644 flang/docs/FlangCommandLineReference.rst create mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst new file mode 100644 index 0000000000000..91edd67fdcad8 --- /dev/null +++ b/flang/docs/FIRLangRef.rst @@ -0,0 +1,4 @@ +FIR Language Reference +====================== + +(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst new file mode 100644 index 0000000000000..71f77f28ba72c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.rst @@ -0,0 +1,4 @@ +Flang Command Line Reference +============================ + +(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 48f7b69f5d750..46907f144e25a 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -227,7 +227,9 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [] +man_pages = [ + ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) +] # If true, show URL addresses after external links. # man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst new file mode 100644 index 0000000000000..09677eb87704f --- /dev/null +++ b/flang/docs/index.rst @@ -0,0 +1,10 @@ +Flang Documentation +==================== + +Welcome to the Flang documentation. + +.. 
toctree:: + :maxdepth: 1 + + FIRLangRef + FlangCommandLineReference >From e79151b0a5cef4ee51e1e76fd6d32b68714f32fd Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 06:53:34 -0400 Subject: [PATCH 2/3] Remove .rst files and point conf.py to pick up .md --- flang/docs/FIRLangRef.rst | 4 ---- flang/docs/FlangCommandLineReference.rst | 4 ---- flang/docs/conf.py | 5 ++--- flang/docs/index.rst | 10 ---------- 4 files changed, 2 insertions(+), 21 deletions(-) delete mode 100644 flang/docs/FIRLangRef.rst delete mode 100644 flang/docs/FlangCommandLineReference.rst delete mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst deleted file mode 100644 index 91edd67fdcad8..0000000000000 --- a/flang/docs/FIRLangRef.rst +++ /dev/null @@ -1,4 +0,0 @@ -FIR Language Reference -====================== - -(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst deleted file mode 100644 index 71f77f28ba72c..0000000000000 --- a/flang/docs/FlangCommandLineReference.rst +++ /dev/null @@ -1,4 +0,0 @@ -Flang Command Line Reference -============================ - -(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 46907f144e25a..4fd81440c8176 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -42,6 +42,7 @@ # Add any paths that contain templates here, relative to this directory. templates_path = ["_templates"] +source_suffix = [".md"] myst_heading_anchors = 6 import sphinx @@ -227,9 +228,7 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [ - ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) -] +man_pages = [("index", "flang", "Flang Documentation", ["Flang Contributors"], 1)] # If true, show URL addresses after external links. 
# man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst deleted file mode 100644 index 09677eb87704f..0000000000000 --- a/flang/docs/index.rst +++ /dev/null @@ -1,10 +0,0 @@ -Flang Documentation -==================== - -Welcome to the Flang documentation. - -.. toctree:: - :maxdepth: 1 - - FIRLangRef - FlangCommandLineReference >From d3cf067f197b9b26ee75d3d41dc654b4e00acbcb Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 07:03:35 -0400 Subject: [PATCH 3/3] While building man pages, the .md files were being used. Due to that, the myst_parser was explictly imported. Adding Placeholder .md files which are required by index.md --- flang/docs/FIRLangRef.md | 3 +++ flang/docs/FlangCommandLineReference.md | 3 +++ flang/docs/conf.py | 10 +++++----- 3 files changed, 11 insertions(+), 5 deletions(-) create mode 100644 flang/docs/FIRLangRef.md create mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md new file mode 100644 index 0000000000000..8e4052f14fc7c --- /dev/null +++ b/flang/docs/FIRLangRef.md @@ -0,0 +1,3 @@ +# FIR Language Reference + +_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md new file mode 100644 index 0000000000000..ee8d7b83dc50c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.md @@ -0,0 +1,3 @@ +# Flang Command Line Reference + +_TODO: Add Flang CLI documentation._ diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 4fd81440c8176..7223661625689 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -10,6 +10,7 @@ # serve to show the default. from datetime import date + # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. 
@@ -28,16 +29,15 @@ "sphinx.ext.autodoc", ] -# When building man pages, we do not use the markdown pages, -# So, we can continue without the myst_parser dependencies. -# Doing so reduces dependencies of some packaged llvm distributions. + try: import myst_parser extensions.append("myst_parser") except ImportError: - if not tags.has("builder-man"): - raise + raise ImportError( + "myst_parser is required to build documentation, including man pages." + ) # Add any paths that contain templates here, relative to this directory. From flang-commits at lists.llvm.org Thu May 29 04:08:02 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Thu, 29 May 2025 04:08:02 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68384012.170a0220.1d4586.2a28@mx.google.com> snarang181 wrote: > I think more people need to look into it. AFAIR the plan was that flang does not use the `.rst` files for documentation, it was supposed to rely on the markdown files entirely. @pawosm-arm, thanks for looking into it. I got rid of the `.rst` files. When I added the explicit source suffix, it seemed like the man-page build needed the parser for the .md files. With the `myst_parser` library not being imported, I was hitting parsing errors of the .md files, so I explicitly changed the requirement. 
With that and adding placeholder missing `.md` files, I can generate a man doc at `${BUILD_DIR}/tools/flang/docs/man/flang.1` https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 04:11:11 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Thu, 29 May 2025 04:11:11 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <683840cf.170a0220.37239c.2136@mx.google.com> kiranchandramohan wrote: `ninja flang-doc` might be generating files like `FIRLangRef.md`. Would adding these files manually affect that? Note that the HTML files generated from these are displayed currently in the following pages: https://flang.llvm.org/docs/ https://flang.llvm.org/docs/FIRLangRef.html https://flang.llvm.org/docs/FlangCommandLineReference.html https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 04:13:35 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Thu, 29 May 2025 04:13:35 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <6838415f.170a0220.36a6c5.22f8@mx.google.com> snarang181 wrote: > `ninja flang-doc` might be generating files like `FIRLangRef.md`. Would adding these files manually affect that? > > Note that the HTML files generated from these are displayed currently in the following pages: > > https://flang.llvm.org/docs/ > > https://flang.llvm.org/docs/FIRLangRef.html > > https://flang.llvm.org/docs/FlangCommandLineReference.html It did not seem like the build process got affected locally for me. Is there any other way of testing this out? 
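A minimal sketch of the import policy discussed in this message, assuming the behaviour shown in the `conf.py` hunks of the patch: before the change, a missing `myst_parser` was tolerated when only the man builder ran; after it, the parser is always required because the man page is now built from `.md` sources. The helper name `select_extensions` and its parameters `building_man`, `man_needs_markdown`, and `import_module` are hypothetical and exist only for illustration; the real `conf.py` performs this check inline using `tags.has("builder-man")`.

```python
import importlib


def select_extensions(building_man, man_needs_markdown,
                      import_module=importlib.import_module):
    """Return the Sphinx extension list under the two import policies.

    Hypothetical helper for illustration only; not part of flang/docs/conf.py.
    Only the extension entry visible in the diff context is listed here.
    """
    extensions = ["sphinx.ext.autodoc"]
    try:
        import_module("myst_parser")
        extensions.append("myst_parser")
    except ImportError:
        # Old policy: a man-only build that reads no Markdown may continue.
        if building_man and not man_needs_markdown:
            return extensions
        # New policy (and every non-man build): Markdown sources need the parser.
        raise ImportError(
            "myst_parser is required to build documentation, including man pages."
        )
    return extensions
```

Calling `select_extensions(True, True)` models the patched behaviour: the call fails with `ImportError` unless `myst_parser` is installed. Passing `man_needs_markdown=False` models the earlier, more permissive policy.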
https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 04:13:47 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 04:13:47 -0700 (PDT) Subject: [flang-commits] [flang] f8dcb05 - [flang][fir][OpenMP] Refactor privtization code into shared location (#141767) Message-ID: <6838416b.170a0220.3cd638.21f7@mx.google.com> Author: Kareem Ergawy Date: 2025-05-29T13:13:44+02:00 New Revision: f8dcb059ae06376b0991936026d5befb3d7b109b URL: https://github.com/llvm/llvm-project/commit/f8dcb059ae06376b0991936026d5befb3d7b109b DIFF: https://github.com/llvm/llvm-project/commit/f8dcb059ae06376b0991936026d5befb3d7b109b.diff LOG: [flang][fir][OpenMP] Refactor privtization code into shared location (#141767) Refactors the utils needed to create privtization/locatization ops for both the fir and OpenMP dialects into a shared location isolating OpenMP stuff out of it as much as possible. Added: Modified: flang/include/flang/Lower/Support/Utils.h flang/lib/Lower/Bridge.cpp flang/lib/Lower/OpenMP/DataSharingProcessor.cpp flang/lib/Lower/OpenMP/DataSharingProcessor.h flang/lib/Lower/Support/Utils.cpp Removed: ################################################################################ diff --git a/flang/include/flang/Lower/Support/Utils.h b/flang/include/flang/Lower/Support/Utils.h index baaf644dd6efb..8ad3a903beee9 100644 --- a/flang/include/flang/Lower/Support/Utils.h +++ b/flang/include/flang/Lower/Support/Utils.h @@ -94,6 +94,14 @@ bool isEqual(const Fortran::lower::SomeExpr *x, const Fortran::lower::SomeExpr *y); bool isEqual(const Fortran::lower::ExplicitIterSpace::ArrayBases &x, const Fortran::lower::ExplicitIterSpace::ArrayBases &y); + +template +void privatizeSymbol( + lower::AbstractConverter &converter, fir::FirOpBuilder &firOpBuilder, + lower::SymMap &symTable, std::function initGen, + llvm::SetVector &allPrivatizedSymbols, + const semantics::Symbol *symToPrivatize, OperandsStructType *clauseOps); + } 
// end namespace Fortran::lower // DenseMapInfo for pointers to Fortran::lower::SomeExpr. diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index 9f3c50a52973a..4e6db3eaa990d 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -2029,10 +2029,6 @@ class FirConverter : public Fortran::lower::AbstractConverter { void handleLocalitySpecs(const IncrementLoopInfo &info) { Fortran::semantics::SemanticsContext &semanticsContext = bridge.getSemanticsContext(); - // TODO Extract `DataSharingProcessor` from omp to a more general location. - Fortran::lower::omp::DataSharingProcessor dsp( - *this, semanticsContext, getEval(), - /*useDelayedPrivatization=*/true, localSymbols); fir::LocalitySpecifierOperands privateClauseOps; auto doConcurrentLoopOp = mlir::dyn_cast_if_present(info.loopOp); @@ -2041,10 +2037,17 @@ class FirConverter : public Fortran::lower::AbstractConverter { // complete. bool useDelayedPriv = enableDelayedPrivatizationStaging && doConcurrentLoopOp; + llvm::SetVector allPrivatizedSymbols; for (const Fortran::semantics::Symbol *sym : info.localSymList) { if (useDelayedPriv) { - dsp.privatizeSymbol(sym, &privateClauseOps); + Fortran::lower::privatizeSymbol( + *this, this->getFirOpBuilder(), localSymbols, + [this](fir::LocalitySpecifierOp result, mlir::Type argType) { + TODO(this->toLocation(), + "Localizers that need init regions are not supported yet."); + }, + allPrivatizedSymbols, sym, &privateClauseOps); continue; } @@ -2053,7 +2056,13 @@ class FirConverter : public Fortran::lower::AbstractConverter { for (const Fortran::semantics::Symbol *sym : info.localInitSymList) { if (useDelayedPriv) { - dsp.privatizeSymbol(sym, &privateClauseOps); + Fortran::lower::privatizeSymbol( + *this, this->getFirOpBuilder(), localSymbols, + [this](fir::LocalitySpecifierOp result, mlir::Type argType) { + TODO(this->toLocation(), + "Localizers that need init regions are not supported yet."); + }, + allPrivatizedSymbols, sym, 
&privateClauseOps); continue; } @@ -2083,7 +2092,7 @@ class FirConverter : public Fortran::lower::AbstractConverter { builder->getArrayAttr(privateClauseOps.privateSyms)); for (auto [sym, privateVar] : llvm::zip_equal( - dsp.getAllSymbolsToPrivatize(), privateClauseOps.privateVars)) { + allPrivatizedSymbols, privateClauseOps.privateVars)) { auto arg = doConcurrentLoopOp.getRegion().begin()->addArgument( privateVar.getType(), doConcurrentLoopOp.getLoc()); bindSymbol(*sym, hlfir::translateToExtendedValue( diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 629478294ef5b..03109c82a976a 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -16,6 +16,7 @@ #include "Utils.h" #include "flang/Lower/ConvertVariable.h" #include "flang/Lower/PFTBuilder.h" +#include "flang/Lower/Support/Utils.h" #include "flang/Lower/SymbolMap.h" #include "flang/Optimizer/Builder/BoxValue.h" #include "flang/Optimizer/Builder/HLFIRTools.h" @@ -527,188 +528,48 @@ void DataSharingProcessor::copyLastPrivatize(mlir::Operation *op) { } } -template void DataSharingProcessor::privatizeSymbol( - const semantics::Symbol *symToPrivatize, OperandsStructType *clauseOps) { + const semantics::Symbol *symToPrivatize, + mlir::omp::PrivateClauseOps *clauseOps) { if (!useDelayedPrivatization) { cloneSymbol(symToPrivatize); copyFirstPrivateSymbol(symToPrivatize); return; } - const semantics::Symbol *sym = symToPrivatize->HasLocalLocality() - ? &symToPrivatize->GetUltimate() - : symToPrivatize; - lower::SymbolBox hsb = symToPrivatize->HasLocalLocality() - ? 
converter.shallowLookupSymbol(*sym) - : converter.lookupOneLevelUpSymbol(*sym); - assert(hsb && "Host symbol box not found"); - hlfir::Entity entity{hsb.getAddr()}; - bool cannotHaveNonDefaultLowerBounds = !entity.mayHaveNonDefaultLowerBounds(); - - mlir::Location symLoc = hsb.getAddr().getLoc(); - std::string privatizerName = sym->name().ToString() + ".privatizer"; - bool isFirstPrivate = - symToPrivatize->test(semantics::Symbol::Flag::OmpFirstPrivate) || - symToPrivatize->test(semantics::Symbol::Flag::LocalityLocalInit); - - mlir::Value privVal = hsb.getAddr(); - mlir::Type allocType = privVal.getType(); - if (!mlir::isa(privVal.getType())) - allocType = fir::unwrapRefType(privVal.getType()); - - if (auto poly = mlir::dyn_cast(allocType)) { - if (!mlir::isa(poly.getEleTy()) && isFirstPrivate) - TODO(symLoc, "create polymorphic host associated copy"); - } - - // fir.array<> cannot be converted to any single llvm type and fir helpers - // are not available in openmp to llvmir translation so we cannot generate - // an alloca for a fir.array type there. Get around this by boxing all - // arrays. - if (mlir::isa(allocType)) { - entity = genVariableBox(symLoc, firOpBuilder, entity); - privVal = entity.getBase(); - allocType = privVal.getType(); - } - - if (mlir::isa(privVal.getType())) { - // Boxes should be passed by reference into nested regions: - auto oldIP = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); - auto alloca = firOpBuilder.create(symLoc, privVal.getType()); - firOpBuilder.restoreInsertionPoint(oldIP); - firOpBuilder.create(symLoc, privVal, alloca); - privVal = alloca; - } - - mlir::Type argType = privVal.getType(); - - OpType privatizerOp = [&]() { - auto moduleOp = firOpBuilder.getModule(); - auto uniquePrivatizerName = fir::getTypeAsString( - allocType, converter.getKindMap(), - converter.mangleName(*sym) + - (isFirstPrivate ? 
"_firstprivate" : "_private")); - - if (auto existingPrivatizer = - moduleOp.lookupSymbol(uniquePrivatizerName)) - return existingPrivatizer; - - mlir::OpBuilder::InsertionGuard guard(firOpBuilder); - firOpBuilder.setInsertionPointToStart(moduleOp.getBody()); - OpType result; - - if constexpr (std::is_same_v) { - result = firOpBuilder.create( - symLoc, uniquePrivatizerName, allocType, - isFirstPrivate ? mlir::omp::DataSharingClauseType::FirstPrivate - : mlir::omp::DataSharingClauseType::Private); - } else { - result = firOpBuilder.create( - symLoc, uniquePrivatizerName, allocType, - isFirstPrivate ? fir::LocalitySpecifierType::LocalInit - : fir::LocalitySpecifierType::Local); - } - - fir::ExtendedValue symExV = converter.getSymbolExtendedValue(*sym); - lower::SymMapScope outerScope(symTable); - - // Populate the `init` region. - // We need to initialize in the following cases: - // 1. The allocation was for a derived type which requires initialization - // (this can be skipped if it will be initialized anyway by the copy - // region, unless the derived type has allocatable components) - // 2. The allocation was for any kind of box - // 3. The allocation was for a boxed character - const bool needsInitialization = - (Fortran::lower::hasDefaultInitialization(sym->GetUltimate()) && - (!isFirstPrivate || hlfir::mayHaveAllocatableComponent(allocType))) || - mlir::isa(allocType) || - mlir::isa(allocType); - if (needsInitialization) { - mlir::Region &initRegion = result.getInitRegion(); - mlir::Block *initBlock = firOpBuilder.createBlock( - &initRegion, /*insertPt=*/{}, {argType, argType}, {symLoc, symLoc}); - - populateByRefInitAndCleanupRegions( - converter, symLoc, argType, /*scalarInitValue=*/nullptr, initBlock, - result.getInitPrivateArg(), result.getInitMoldArg(), - result.getDeallocRegion(), - isFirstPrivate ? 
DeclOperationKind::FirstPrivate - : DeclOperationKind::Private, - sym, cannotHaveNonDefaultLowerBounds); - // TODO: currently there are false positives from dead uses of the mold - // arg - if (result.initReadsFromMold()) - mightHaveReadHostSym.insert(sym); - } - - // Populate the `copy` region if this is a `firstprivate`. - if (isFirstPrivate) { - mlir::Region &copyRegion = result.getCopyRegion(); - // First block argument corresponding to the original/host value while - // second block argument corresponding to the privatized value. - mlir::Block *copyEntryBlock = firOpBuilder.createBlock( - &copyRegion, /*insertPt=*/{}, {argType, argType}, {symLoc, symLoc}); - firOpBuilder.setInsertionPointToEnd(copyEntryBlock); - - auto addSymbol = [&](unsigned argIdx, const semantics::Symbol *symToMap, - bool force = false) { - symExV.match( - [&](const fir::MutableBoxValue &box) { - symTable.addSymbol( - *symToMap, - fir::substBase(box, copyRegion.getArgument(argIdx)), force); - }, - [&](const auto &box) { - symTable.addSymbol(*symToMap, copyRegion.getArgument(argIdx), - force); - }); - }; - - addSymbol(0, sym, true); - lower::SymMapScope innerScope(symTable); - addSymbol(1, symToPrivatize); - - auto ip = firOpBuilder.saveInsertionPoint(); - copyFirstPrivateSymbol(symToPrivatize, &ip); - - if constexpr (std::is_same_v) { - firOpBuilder.create( - hsb.getAddr().getLoc(), - symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); - } else { - firOpBuilder.create( - hsb.getAddr().getLoc(), - symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); - } - } - - return result; - }(); - - if (clauseOps) { - clauseOps->privateSyms.push_back(mlir::SymbolRefAttr::get(privatizerOp)); - clauseOps->privateVars.push_back(privVal); - } + auto initGen = [&](mlir::omp::PrivateClauseOp result, mlir::Type argType) { + lower::SymbolBox hsb = converter.lookupOneLevelUpSymbol(*symToPrivatize); + assert(hsb && "Host symbol box not found"); + hlfir::Entity entity{hsb.getAddr()}; + bool
cannotHaveNonDefaultLowerBounds = + !entity.mayHaveNonDefaultLowerBounds(); + + mlir::Region &initRegion = result.getInitRegion(); + mlir::Location symLoc = hsb.getAddr().getLoc(); + mlir::Block *initBlock = firOpBuilder.createBlock( + &initRegion, /*insertPt=*/{}, {argType, argType}, {symLoc, symLoc}); + + bool emitCopyRegion = + symToPrivatize->test(semantics::Symbol::Flag::OmpFirstPrivate); + + populateByRefInitAndCleanupRegions( + converter, symLoc, argType, /*scalarInitValue=*/nullptr, initBlock, + result.getInitPrivateArg(), result.getInitMoldArg(), + result.getDeallocRegion(), + emitCopyRegion ? omp::DeclOperationKind::FirstPrivate + : omp::DeclOperationKind::Private, + symToPrivatize, cannotHaveNonDefaultLowerBounds); + // TODO: currently there are false positives from dead uses of the mold + // arg + if (result.initReadsFromMold()) + mightHaveReadHostSym.insert(symToPrivatize); + }; - if (symToPrivatize->HasLocalLocality()) - allPrivatizedSymbols.insert(symToPrivatize); + Fortran::lower::privatizeSymbol( + converter, firOpBuilder, symTable, initGen, allPrivatizedSymbols, + symToPrivatize, clauseOps); } - -template void -DataSharingProcessor::privatizeSymbol( - const semantics::Symbol *symToPrivatize, - mlir::omp::PrivateClauseOps *clauseOps); - -template void -DataSharingProcessor::privatizeSymbol( - const semantics::Symbol *symToPrivatize, - fir::LocalitySpecifierOperands *clauseOps); - } // namespace omp } // namespace lower } // namespace Fortran diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.h b/flang/lib/Lower/OpenMP/DataSharingProcessor.h index ae759bfef566b..8a7dbb2ae30b7 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.h +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.h @@ -153,10 +153,8 @@ class DataSharingProcessor { : llvm::ArrayRef(); } - template void privatizeSymbol(const semantics::Symbol *symToPrivatize, - OperandsStructType *clauseOps); + mlir::omp::PrivateClauseOps *clauseOps); }; } // namespace omp diff --git 
a/flang/lib/Lower/Support/Utils.cpp b/flang/lib/Lower/Support/Utils.cpp index 668ee31a36bc3..de810cb2f4b34 100644 --- a/flang/lib/Lower/Support/Utils.cpp +++ b/flang/lib/Lower/Support/Utils.cpp @@ -633,4 +633,184 @@ bool isEqual(const Fortran::lower::ExplicitIterSpace::ArrayBases &x, }}, x, y); } + +void copyFirstPrivateSymbol(lower::AbstractConverter &converter, + const semantics::Symbol *sym, + mlir::OpBuilder::InsertPoint *copyAssignIP) { + if (sym->test(semantics::Symbol::Flag::OmpFirstPrivate) || + sym->test(semantics::Symbol::Flag::LocalityLocalInit)) + converter.copyHostAssociateVar(*sym, copyAssignIP); +} + +template +void privatizeSymbol( + lower::AbstractConverter &converter, fir::FirOpBuilder &firOpBuilder, + lower::SymMap &symTable, std::function initGen, + llvm::SetVector &allPrivatizedSymbols, + const semantics::Symbol *symToPrivatize, OperandsStructType *clauseOps) { + const semantics::Symbol *sym = symToPrivatize->HasLocalLocality() + ? &symToPrivatize->GetUltimate() + : symToPrivatize; + lower::SymbolBox hsb = symToPrivatize->HasLocalLocality() + ? 
converter.shallowLookupSymbol(*sym) + : converter.lookupOneLevelUpSymbol(*sym); + assert(hsb && "Host symbol box not found"); + hlfir::Entity entity{hsb.getAddr()}; + bool cannotHaveNonDefaultLowerBounds = !entity.mayHaveNonDefaultLowerBounds(); + + mlir::Location symLoc = hsb.getAddr().getLoc(); + std::string privatizerName = sym->name().ToString() + ".privatizer"; + bool emitCopyRegion = + symToPrivatize->test(semantics::Symbol::Flag::OmpFirstPrivate) || + symToPrivatize->test(semantics::Symbol::Flag::LocalityLocalInit); + + mlir::Value privVal = hsb.getAddr(); + mlir::Type allocType = privVal.getType(); + if (!mlir::isa(privVal.getType())) + allocType = fir::unwrapRefType(privVal.getType()); + + if (auto poly = mlir::dyn_cast(allocType)) { + if (!mlir::isa(poly.getEleTy()) && emitCopyRegion) + TODO(symLoc, "create polymorphic host associated copy"); + } + + // fir.array<> cannot be converted to any single llvm type and fir helpers + // are not available in openmp to llvmir translation so we cannot generate + // an alloca for a fir.array type there. Get around this by boxing all + // arrays. + if (mlir::isa(allocType)) { + entity = genVariableBox(symLoc, firOpBuilder, entity); + privVal = entity.getBase(); + allocType = privVal.getType(); + } + + if (mlir::isa(privVal.getType())) { + // Boxes should be passed by reference into nested regions: + auto oldIP = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointToStart(firOpBuilder.getAllocaBlock()); + auto alloca = firOpBuilder.create(symLoc, privVal.getType()); + firOpBuilder.restoreInsertionPoint(oldIP); + firOpBuilder.create(symLoc, privVal, alloca); + privVal = alloca; + } + + mlir::Type argType = privVal.getType(); + + OpType privatizerOp = [&]() { + auto moduleOp = firOpBuilder.getModule(); + auto uniquePrivatizerName = fir::getTypeAsString( + allocType, converter.getKindMap(), + converter.mangleName(*sym) + + (emitCopyRegion ?
"_firstprivate" : "_private")); + + if (auto existingPrivatizer = + moduleOp.lookupSymbol(uniquePrivatizerName)) + return existingPrivatizer; + + mlir::OpBuilder::InsertionGuard guard(firOpBuilder); + firOpBuilder.setInsertionPointToStart(moduleOp.getBody()); + OpType result; + + if constexpr (std::is_same_v) { + result = firOpBuilder.create( + symLoc, uniquePrivatizerName, allocType, + emitCopyRegion ? mlir::omp::DataSharingClauseType::FirstPrivate + : mlir::omp::DataSharingClauseType::Private); + } else { + result = firOpBuilder.create( + symLoc, uniquePrivatizerName, allocType, + emitCopyRegion ? fir::LocalitySpecifierType::LocalInit + : fir::LocalitySpecifierType::Local); + } + + fir::ExtendedValue symExV = converter.getSymbolExtendedValue(*sym); + lower::SymMapScope outerScope(symTable); + + // Populate the `init` region. + // We need to initialize in the following cases: + // 1. The allocation was for a derived type which requires initialization + // (this can be skipped if it will be initialized anyway by the copy + // region, unless the derived type has allocatable components) + // 2. The allocation was for any kind of box + // 3. The allocation was for a boxed character + const bool needsInitialization = + (Fortran::lower::hasDefaultInitialization(sym->GetUltimate()) && + (!emitCopyRegion || hlfir::mayHaveAllocatableComponent(allocType))) || + mlir::isa(allocType) || + mlir::isa(allocType); + if (needsInitialization) { + initGen(result, argType); + } + + // Populate the `copy` region if this is a `firstprivate`. + if (emitCopyRegion) { + mlir::Region &copyRegion = result.getCopyRegion(); + // First block argument corresponding to the original/host value while + // second block argument corresponding to the privatized value.
+ mlir::Block *copyEntryBlock = firOpBuilder.createBlock( + &copyRegion, /*insertPt=*/{}, {argType, argType}, {symLoc, symLoc}); + firOpBuilder.setInsertionPointToEnd(copyEntryBlock); + + auto addSymbol = [&](unsigned argIdx, const semantics::Symbol *symToMap, + bool force = false) { + symExV.match( + [&](const fir::MutableBoxValue &box) { + symTable.addSymbol( + *symToMap, + fir::substBase(box, copyRegion.getArgument(argIdx)), force); + }, + [&](const auto &box) { + symTable.addSymbol(*symToMap, copyRegion.getArgument(argIdx), + force); + }); + }; + + addSymbol(0, sym, true); + lower::SymMapScope innerScope(symTable); + addSymbol(1, symToPrivatize); + + auto ip = firOpBuilder.saveInsertionPoint(); + copyFirstPrivateSymbol(converter, symToPrivatize, &ip); + + if constexpr (std::is_same_v) { + firOpBuilder.create( + hsb.getAddr().getLoc(), + symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); + } else { + firOpBuilder.create( + hsb.getAddr().getLoc(), + symTable.shallowLookupSymbol(*symToPrivatize).getAddr()); + } + } + + return result; + }(); + + if (clauseOps) { + clauseOps->privateSyms.push_back(mlir::SymbolRefAttr::get(privatizerOp)); + clauseOps->privateVars.push_back(privVal); + } + + if (symToPrivatize->HasLocalLocality()) + allPrivatizedSymbols.insert(symToPrivatize); +} + +template void +privatizeSymbol( + lower::AbstractConverter &converter, fir::FirOpBuilder &firOpBuilder, + lower::SymMap &symTable, + std::function initGen, + llvm::SetVector &allPrivatizedSymbols, + const semantics::Symbol *symToPrivatize, + mlir::omp::PrivateClauseOps *clauseOps); + +template void +privatizeSymbol( + lower::AbstractConverter &converter, fir::FirOpBuilder &firOpBuilder, + lower::SymMap &symTable, + std::function initGen, + llvm::SetVector &allPrivatizedSymbols, + const semantics::Symbol *symToPrivatize, + fir::LocalitySpecifierOperands *clauseOps); + } // end namespace Fortran::lower From flang-commits at lists.llvm.org Thu May 29 04:13:50 2025 From: flang-commits at
lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 04:13:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir][OpenMP] Refactor privtization code into shared location (PR #141767) In-Reply-To: Message-ID: <6838416e.630a0220.3b1a90.d3ca@mx.google.com> https://github.com/ergawy closed https://github.com/llvm/llvm-project/pull/141767 From flang-commits at lists.llvm.org Thu May 29 04:31:56 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Thu, 29 May 2025 04:31:56 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <683845ac.050a0220.19bd5d.2741@mx.google.com> kiranchandramohan wrote: Can you enable `-DLLVM_ENABLE_SPHINX=ON -DSPHINX_WARNINGS_AS_ERRORS=OFF` in the build options and try `ninja docs-flang-html`? The patch that added FIRLangRef.md to docs is https://reviews.llvm.org/D128650 https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 05:13:16 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Thu, 29 May 2025 05:13:16 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68384f5c.170a0220.3210ca.2d04@mx.google.com> snarang181 wrote: > https://reviews.llvm.org/D128650 @kiranchandramohan -- this runs for me successfully and does not interfere with the placeholder file. The final `FIRLangRef.md` that is produced in the build artifacts has been generated by tablegen. 
https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 05:25:52 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Thu, 29 May 2025 05:25:52 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68385250.170a0220.110c9b.266a@mx.google.com> kiranchandramohan wrote: > > https://reviews.llvm.org/D128650 > > @kiranchandramohan -- this runs for me successfully and does not interfere with the placeholder file. The final `FIRLangRef.md` that is produced in the build artifacts has been generated by tablegen. What about the man pages and its contents. Is it the empty file or the one produced by tablegen? https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 05:37:01 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 29 May 2025 05:37:01 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Treat ClassType as BoxType in COPYPRIVATE (PR #141844) In-Reply-To: Message-ID: <683854ed.170a0220.1bd25c.264c@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/141844 >From 4da5be5562b65570db85163a17902eb0605ac9eb Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 28 May 2025 15:08:20 -0500 Subject: [PATCH] [flang][OpenMP] Treat ClassType as BoxType in COPYPRIVATE This fixes the second problem reported in https://github.com/llvm/llvm-project/issues/141481 --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 3 +++ flang/test/Lower/OpenMP/copyprivate4.f90 | 18 ++++++++++++++++++ 2 files changed, 21 insertions(+) create mode 100644 flang/test/Lower/OpenMP/copyprivate4.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 885871698c946..afbc77d48cb53 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ 
b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -743,6 +743,9 @@ void TypeInfo::typeScan(mlir::Type ty) { } else if (auto bty = mlir::dyn_cast(ty)) { inBox = true; typeScan(bty.getEleTy()); + } else if (auto cty = mlir::dyn_cast(ty)) { + inBox = true; + typeScan(cty.getEleTy()); } else if (auto cty = mlir::dyn_cast(ty)) { charLen = cty.getLen(); } else if (auto hty = mlir::dyn_cast(ty)) { diff --git a/flang/test/Lower/OpenMP/copyprivate4.f90 b/flang/test/Lower/OpenMP/copyprivate4.f90 new file mode 100644 index 0000000000000..02fdbc71edc59 --- /dev/null +++ b/flang/test/Lower/OpenMP/copyprivate4.f90 @@ -0,0 +1,18 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +!The second testcase from https://github.com/llvm/llvm-project/issues/141481 + +!Check that we don't crash on this. + +!CHECK: omp.single copyprivate(%6#0 -> @_copy_class_ptr_rec__QFf01Tt : !fir.ref>>>) { +!CHECK: omp.terminator +!CHECK: } + +subroutine f01 + type t + end type + class(t), pointer :: tt + +!$omp single copyprivate(tt) +!$omp end single +end From flang-commits at lists.llvm.org Thu May 29 05:37:11 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 29 May 2025 05:37:11 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables (PR #141823) In-Reply-To: Message-ID: <683854f7.050a0220.3c7a0b.2c15@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/141823 >From 7103b042de5e0bf6212a0a13b1a76e66bf633b67 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 28 May 2025 13:49:56 -0500 Subject: [PATCH 1/2] [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables The check if the arguments are variable list items was missing, leading to a crash in lowering in some invalid situations. 
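A minimal sketch of the missing check described above, reduced to standard C++ so that it compiles on its own. `Symbol`, its `isVariable` flag, the `SymbolSourceMap` alias, and the free function `checkVariableListItem` are simplified stand-ins assumed for illustration, not flang's real types; only the diagnostic wording is taken from the patch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified symbol. The isVariable flag stands in for the
// result of the real IsVariableListItem(symbol) query.
struct Symbol {
  std::string name;
  bool isVariable;
};

// Hypothetical stand-in for the symbol -> source-location map that the
// clause handlers collect from an object list.
using SymbolSourceMap = std::multimap<const Symbol *, std::string>;

// One shared check: every list item must be a variable. Instead of emitting
// through a semantics context, this sketch returns the messages it would emit.
std::vector<std::string> checkVariableListItem(const SymbolSourceMap &symbols) {
  std::vector<std::string> errors;
  for (const auto &entry : symbols) {
    const Symbol *symbol = entry.first;
    if (!symbol->isVariable)
      errors.push_back("'" + symbol->name + "' must be a variable");
  }
  return errors;
}
```

Keeping the loop in a single helper is what lets the COPYPRIVATE handler reuse the check that the TO and FROM handlers previously spelled out inline.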
This fixes the first testcase reported in https://github.com/llvm/llvm-project/issues/141481 --- flang/lib/Semantics/check-omp-structure.cpp | 25 ++++++++++--------- flang/lib/Semantics/check-omp-structure.h | 1 + flang/test/Semantics/OpenMP/copyprivate04.f90 | 1 + flang/test/Semantics/OpenMP/copyprivate05.f90 | 12 +++++++++ 4 files changed, 27 insertions(+), 12 deletions(-) create mode 100644 flang/test/Semantics/OpenMP/copyprivate05.f90 diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bda0d62829506..297cd32270705 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -390,6 +390,16 @@ std::optional OmpStructureChecker::IsContiguous( object.u); } +void OmpStructureChecker::CheckVariableListItem( + const SymbolSourceMap &symbols) { + for (auto &[symbol, source] : symbols) { + if (!IsVariableListItem(*symbol)) { + context_.SayWithDecl(*symbol, source, "'%s' must be a variable"_err_en_US, + symbol->name()); + } + } +} + void OmpStructureChecker::CheckMultipleOccurrence( semantics::UnorderedSymbolSet &listVars, const std::list &nameList, const parser::CharBlock &item, @@ -4587,6 +4597,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Copyprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_copyprivate); SymbolSourceMap symbols; GetSymbolsInObjectList(x.v, symbols); + CheckVariableListItem(symbols); CheckIntentInPointer(symbols, llvm::omp::Clause::OMPC_copyprivate); CheckCopyingPolymorphicAllocatable( symbols, llvm::omp::Clause::OMPC_copyprivate); @@ -4859,12 +4870,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::From &x) { const auto &objList{std::get(x.v.t)}; SymbolSourceMap symbols; GetSymbolsInObjectList(objList, symbols); - for (const auto &[symbol, source] : symbols) { - if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl( - *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); - } - } + 
CheckVariableListItem(symbols); // Ref: [4.5:109:19] // If a list item is an array section it must specify contiguous storage. @@ -4904,12 +4910,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::To &x) { const auto &objList{std::get(x.v.t)}; SymbolSourceMap symbols; GetSymbolsInObjectList(objList, symbols); - for (const auto &[symbol, source] : symbols) { - if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl( - *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); - } - } + CheckVariableListItem(symbols); // Ref: [4.5:109:19] // If a list item is an array section it must specify contiguous storage. diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 587959f7d506f..1a8059d8548ed 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -174,6 +174,7 @@ class OmpStructureChecker bool IsExtendedListItem(const Symbol &sym); bool IsCommonBlock(const Symbol &sym); std::optional IsContiguous(const parser::OmpObject &object); + void CheckVariableListItem(const SymbolSourceMap &symbols); void CheckMultipleOccurrence(semantics::UnorderedSymbolSet &listVars, const std::list &nameList, const parser::CharBlock &item, const std::string &clauseName); diff --git a/flang/test/Semantics/OpenMP/copyprivate04.f90 b/flang/test/Semantics/OpenMP/copyprivate04.f90 index 291cf1103fb27..8d7800229bc5f 100644 --- a/flang/test/Semantics/OpenMP/copyprivate04.f90 +++ b/flang/test/Semantics/OpenMP/copyprivate04.f90 @@ -70,6 +70,7 @@ program omp_copyprivate ! Named constants are shared. 
!$omp single !ERROR: COPYPRIVATE variable 'pi' is not PRIVATE or THREADPRIVATE in outer context + !ERROR: 'pi' must be a variable !$omp end single copyprivate(pi) !$omp parallel do diff --git a/flang/test/Semantics/OpenMP/copyprivate05.f90 b/flang/test/Semantics/OpenMP/copyprivate05.f90 new file mode 100644 index 0000000000000..129f8f0b5144e --- /dev/null +++ b/flang/test/Semantics/OpenMP/copyprivate05.f90 @@ -0,0 +1,12 @@ +!RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +! The first testcase from https://github.com/llvm/llvm-project/issues/141481 + +subroutine f00 + type t + end type + +!ERROR: 't' must be a variable +!$omp single copyprivate(t) +!$omp end single +end >From e52b6bf45685fc806b9cab2fe0f3243d0f9467ab Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 28 May 2025 14:02:30 -0500 Subject: [PATCH 2/2] format --- flang/lib/Semantics/check-omp-structure.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 297cd32270705..b0bc478d96a1e 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -394,8 +394,8 @@ void OmpStructureChecker::CheckVariableListItem( const SymbolSourceMap &symbols) { for (auto &[symbol, source] : symbols) { if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl(*symbol, source, "'%s' must be a variable"_err_en_US, - symbol->name()); + context_.SayWithDecl( + *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); } } } From flang-commits at lists.llvm.org Thu May 29 05:40:34 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Thu, 29 May 2025 05:40:34 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <683855c2.050a0220.1d473e.2b00@mx.google.com> snarang181 wrote: > > > https://reviews.llvm.org/D128650 > > > > > > 
@kiranchandramohan -- this runs for me successfully and does not interfere with the placeholder file. The final `FIRLangRef.md` that is produced in the build artifacts has been generated by tablegen. > > What about the man pages and its contents. Is it the empty file or the one produced by tablegen? As I mentioned the man page produced at `${BUILD_DIR}/tools/flang/docs/man/flang.1` looks like the one produced by tablegen. The `FIRLangRef.md` looks to be completely overwritten in the build directory. For the other placeholder file, `FlangCommandLineReference.md`, there is no fallout either. https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 05:51:41 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Thu, 29 May 2025 05:51:41 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <6838585d.050a0220.33acde.2d6d@mx.google.com> kiranchandramohan wrote: > > > > https://reviews.llvm.org/D128650 > > > > > > > > > @kiranchandramohan -- this runs for me successfully and does not interfere with the placeholder file. The final `FIRLangRef.md` that is produced in the build artifacts has been generated by tablegen. > > > > > > What about the man pages and its contents. Is it the empty file or the one produced by tablegen? > > As I mentioned the man page produced at `${BUILD_DIR}/tools/flang/docs/man/flang.1` looks like the one produced by tablegen. The `FIRLangRef.md` looks to be completely overwritten in the build directory. For the other placeholder file, `FlangCommandLineReference.md`, there is no fallout either. Can we do something similar to https://reviews.llvm.org/D128650 to avoid needing dummy `FIRLangRef.md` and `FlangCommandLineReference`? 
https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 05:56:59 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Thu, 29 May 2025 05:56:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir][OpenMP] Refactor privtization code into shared location (PR #141767) In-Reply-To: Message-ID: <6838599b.170a0220.2f8529.27dc@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `ppc64le-flang-rhel-clang` running on `ppc64le-flang-rhel-test` while building `flang` at step 5 "build-unified-tree". Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/29376
Here is the relevant piece of the build log for the reference ``` Step 5 (build-unified-tree) failure: build (failure) ... 84.526 [30/23/6784] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/VectorSubscripts.cpp.o 86.102 [30/22/6785] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/HostAssociations.cpp.o 88.296 [30/21/6786] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/ConvertProcedureDesignator.cpp.o 88.747 [30/20/6787] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/ConvertType.cpp.o 90.028 [30/19/6788] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/CustomIntrinsicCall.cpp.o 91.041 [30/18/6789] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/IO.cpp.o 92.396 [30/17/6790] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/Allocatable.cpp.o 92.634 [30/16/6791] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/ConvertConstant.cpp.o 94.901 [30/15/6792] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/ConvertCall.cpp.o 95.036 [30/14/6793] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/Support/Utils.cpp.o FAILED: tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/Support/Utils.cpp.o ccache /home/buildbots/llvm-external-buildbots/clang.19.1.7/bin/clang++ -DFLANG_INCLUDE_TESTS=1 -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/tools/flang/lib/Lower -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/lib/Lower -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/include 
-I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/tools/flang/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/llvm/include -isystem /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/../mlir/include -isystem /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/tools/mlir/include -isystem /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/tools/clang/include -isystem /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/llvm/../clang/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Werror -Wno-deprecated-copy -Wno-string-conversion -Wno-ctad-maybe-unsupported -Wno-unused-command-line-argument -Wstring-conversion -Wcovered-switch-default -Wno-nested-anon-types -Xclang -fno-pch-timestamp -O3 -DNDEBUG -std=c++17 -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -Winvalid-pch -Xclang -include-pch -Xclang /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/cmake_pch.hxx.pch -Xclang -include -Xclang 
/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/cmake_pch.hxx -MD -MT tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/Support/Utils.cpp.o -MF tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/Support/Utils.cpp.o.d -o tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/Support/Utils.cpp.o -c /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/lib/Lower/Support/Utils.cpp /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/lib/Lower/Support/Utils.cpp:659:8: error: unused variable 'cannotHaveNonDefaultLowerBounds' [-Werror,-Wunused-variable] 659 | bool cannotHaveNonDefaultLowerBounds = !entity.mayHaveNonDefaultLowerBounds(); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. 97.532 [30/13/6794] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/ConvertArrayConstructor.cpp.o 98.778 [30/12/6795] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/OpenMP/Utils.cpp.o 101.043 [30/11/6796] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/ConvertVariable.cpp.o 105.262 [30/10/6797] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/OpenMP/Decomposer.cpp.o 108.608 [30/9/6798] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/ConvertExprToHLFIR.cpp.o 110.604 [30/8/6799] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/OpenMP/ClauseProcessor.cpp.o 126.428 [30/7/6800] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/OpenMP/OpenMP.cpp.o 127.670 [30/6/6801] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/OpenMP/DataSharingProcessor.cpp.o 131.726 [30/5/6802] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/OpenMP/Clauses.cpp.o 132.491 [30/4/6803] Building 
CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/Bridge.cpp.o 136.167 [30/3/6804] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/OpenACC.cpp.o 152.361 [30/2/6805] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/PFTBuilder.cpp.o 212.036 [30/1/6806] Building CXX object tools/flang/lib/Lower/CMakeFiles/FortranLower.dir/ConvertExpr.cpp.o ninja: build stopped: subcommand failed. ```
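The error in the log above is an unused local being rejected because the build passes `-Werror` together with `-Wall`. A hedged sketch of the pattern follows; `Entity` and `privatizerNameFor` are hypothetical stand-ins, not flang's `hlfir::Entity` or `privatizeSymbol`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the entity type; only the one query is modelled.
struct Entity {
  bool explicitLowerBounds = false;
  bool mayHaveNonDefaultLowerBounds() const { return explicitLowerBounds; }
};

// Before the fix, the function body contained a local that nothing read:
//
//   bool cannotHaveNonDefaultLowerBounds =
//       !entity.mayHaveNonDefaultLowerBounds();
//
// With -Wall that is a -Wunused-variable warning, and -Werror promotes it to
// a hard error. The query has no side effects, so dropping the local does not
// change behaviour.
std::string privatizerNameFor(const Entity &entity,
                              const std::string &symName) {
  (void)entity; // parameter kept only to mirror the original call shape
  return symName + ".privatizer";
}
```

If such a value were still read, but only inside an `assert`, marking it `[[maybe_unused]]` would be the usual alternative to deleting it.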
https://github.com/llvm/llvm-project/pull/141767 From flang-commits at lists.llvm.org Thu May 29 06:01:03 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 06:01:03 -0700 (PDT) Subject: [flang-commits] [flang] [NFC][flang] Remove unused variable from `privatizeSymbol` (PR #141938) Message-ID: https://github.com/ergawy created https://github.com/llvm/llvm-project/pull/141938 None >From 9b643133941c94e4e1b6ad322aab9a68cd51c291 Mon Sep 17 00:00:00 2001 From: ergawy Date: Thu, 29 May 2025 07:59:27 -0500 Subject: [PATCH] [NFC][flang] Remove unused variable from `privatizeSymbol` --- flang/lib/Lower/Support/Utils.cpp | 1 - 1 file changed, 1 deletion(-) diff --git a/flang/lib/Lower/Support/Utils.cpp b/flang/lib/Lower/Support/Utils.cpp index de810cb2f4b34..2de9db992e278 100644 --- a/flang/lib/Lower/Support/Utils.cpp +++ b/flang/lib/Lower/Support/Utils.cpp @@ -656,7 +656,6 @@ void privatizeSymbol( : converter.lookupOneLevelUpSymbol(*sym); assert(hsb && "Host symbol box not found"); hlfir::Entity entity{hsb.getAddr()}; - bool cannotHaveNonDefaultLowerBounds = !entity.mayHaveNonDefaultLowerBounds(); mlir::Location symLoc = hsb.getAddr().getLoc(); std::string privatizerName = sym->name().ToString() + ".privatizer"; From flang-commits at lists.llvm.org Thu May 29 06:01:37 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 06:01:37 -0700 (PDT) Subject: [flang-commits] [flang] [NFC][flang] Remove unused variable from `privatizeSymbol` (PR #141938) In-Reply-To: Message-ID: <68385ab1.630a0220.18b6fa.d6a8@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Kareem Ergawy (ergawy)
Changes --- Full diff: https://github.com/llvm/llvm-project/pull/141938.diff 1 Files Affected: - (modified) flang/lib/Lower/Support/Utils.cpp (-1) ``````````diff diff --git a/flang/lib/Lower/Support/Utils.cpp b/flang/lib/Lower/Support/Utils.cpp index de810cb2f4b34..2de9db992e278 100644 --- a/flang/lib/Lower/Support/Utils.cpp +++ b/flang/lib/Lower/Support/Utils.cpp @@ -656,7 +656,6 @@ void privatizeSymbol( : converter.lookupOneLevelUpSymbol(*sym); assert(hsb && "Host symbol box not found"); hlfir::Entity entity{hsb.getAddr()}; - bool cannotHaveNonDefaultLowerBounds = !entity.mayHaveNonDefaultLowerBounds(); mlir::Location symLoc = hsb.getAddr().getLoc(); std::string privatizerName = sym->name().ToString() + ".privatizer"; ``````````
https://github.com/llvm/llvm-project/pull/141938 From flang-commits at lists.llvm.org Thu May 29 06:01:58 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Thu, 29 May 2025 06:01:58 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68385ac6.170a0220.23356a.30d9@mx.google.com> https://github.com/snarang181 updated https://github.com/llvm/llvm-project/pull/141882 >From 560d42eab635803b217a04237b2d2ac7f02a1f7b Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Wed, 28 May 2025 20:21:16 -0400 Subject: [PATCH 1/4] [Flang][Docs] Add Sphinx man page support for Flang This patch enables building Flang man pages by: - Adding a `man_pages` entry in flang/docs/conf.py for Sphinx man builder. - Adding a minimal `index.rst` as the master document. - Adding placeholder `.rst` files for FIRLangRef and FlangCommandLineReference to fix toctree references. These changes unblock builds using `-DLLVM_BUILD_MANPAGES=ON` and allow `ninja docs-flang-man` to generate `flang.1`. 
Fixes #141757 --- flang/docs/FIRLangRef.rst | 4 ++++ flang/docs/FlangCommandLineReference.rst | 4 ++++ flang/docs/conf.py | 4 +++- flang/docs/index.rst | 10 ++++++++++ 4 files changed, 21 insertions(+), 1 deletion(-) create mode 100644 flang/docs/FIRLangRef.rst create mode 100644 flang/docs/FlangCommandLineReference.rst create mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst new file mode 100644 index 0000000000000..91edd67fdcad8 --- /dev/null +++ b/flang/docs/FIRLangRef.rst @@ -0,0 +1,4 @@ +FIR Language Reference +====================== + +(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst new file mode 100644 index 0000000000000..71f77f28ba72c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.rst @@ -0,0 +1,4 @@ +Flang Command Line Reference +============================ + +(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 48f7b69f5d750..46907f144e25a 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -227,7 +227,9 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [] +man_pages = [ + ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) +] # If true, show URL addresses after external links. # man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst new file mode 100644 index 0000000000000..09677eb87704f --- /dev/null +++ b/flang/docs/index.rst @@ -0,0 +1,10 @@ +Flang Documentation +==================== + +Welcome to the Flang documentation. + +.. 
toctree:: + :maxdepth: 1 + + FIRLangRef + FlangCommandLineReference >From e79151b0a5cef4ee51e1e76fd6d32b68714f32fd Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 06:53:34 -0400 Subject: [PATCH 2/4] Remove .rst files and point conf.py to pick up .md --- flang/docs/FIRLangRef.rst | 4 ---- flang/docs/FlangCommandLineReference.rst | 4 ---- flang/docs/conf.py | 5 ++--- flang/docs/index.rst | 10 ---------- 4 files changed, 2 insertions(+), 21 deletions(-) delete mode 100644 flang/docs/FIRLangRef.rst delete mode 100644 flang/docs/FlangCommandLineReference.rst delete mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst deleted file mode 100644 index 91edd67fdcad8..0000000000000 --- a/flang/docs/FIRLangRef.rst +++ /dev/null @@ -1,4 +0,0 @@ -FIR Language Reference -====================== - -(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst deleted file mode 100644 index 71f77f28ba72c..0000000000000 --- a/flang/docs/FlangCommandLineReference.rst +++ /dev/null @@ -1,4 +0,0 @@ -Flang Command Line Reference -============================ - -(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 46907f144e25a..4fd81440c8176 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -42,6 +42,7 @@ # Add any paths that contain templates here, relative to this directory. templates_path = ["_templates"] +source_suffix = [".md"] myst_heading_anchors = 6 import sphinx @@ -227,9 +228,7 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [ - ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) -] +man_pages = [("index", "flang", "Flang Documentation", ["Flang Contributors"], 1)] # If true, show URL addresses after external links. 
# man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst deleted file mode 100644 index 09677eb87704f..0000000000000 --- a/flang/docs/index.rst +++ /dev/null @@ -1,10 +0,0 @@ -Flang Documentation -==================== - -Welcome to the Flang documentation. - -.. toctree:: - :maxdepth: 1 - - FIRLangRef - FlangCommandLineReference >From d3cf067f197b9b26ee75d3d41dc654b4e00acbcb Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 07:03:35 -0400 Subject: [PATCH 3/4] While building man pages, the .md files were being used. Due to that, the myst_parser was explicitly imported. Adding Placeholder .md files which are required by index.md --- flang/docs/FIRLangRef.md | 3 +++ flang/docs/FlangCommandLineReference.md | 3 +++ flang/docs/conf.py | 10 +++++----- 3 files changed, 11 insertions(+), 5 deletions(-) create mode 100644 flang/docs/FIRLangRef.md create mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md new file mode 100644 index 0000000000000..8e4052f14fc7c --- /dev/null +++ b/flang/docs/FIRLangRef.md @@ -0,0 +1,3 @@ +# FIR Language Reference + +_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md new file mode 100644 index 0000000000000..ee8d7b83dc50c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.md @@ -0,0 +1,3 @@ +# Flang Command Line Reference + +_TODO: Add Flang CLI documentation._ diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 4fd81440c8176..7223661625689 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -10,6 +10,7 @@ # serve to show the default. from datetime import date + # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. 
@@ -28,16 +29,15 @@ "sphinx.ext.autodoc", ] -# When building man pages, we do not use the markdown pages, -# So, we can continue without the myst_parser dependencies. -# Doing so reduces dependencies of some packaged llvm distributions. + try: import myst_parser extensions.append("myst_parser") except ImportError: - if not tags.has("builder-man"): - raise + raise ImportError( + "myst_parser is required to build documentation, including man pages." + ) # Add any paths that contain templates here, relative to this directory. >From 4f24ea6695aa189eaa7cb2bd395e297664edb837 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 09:01:22 -0400 Subject: [PATCH 4/4] Remove placeholder .md files --- flang/docs/FIRLangRef.md | 3 --- flang/docs/FlangCommandLineReference.md | 3 --- 2 files changed, 6 deletions(-) delete mode 100644 flang/docs/FIRLangRef.md delete mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md deleted file mode 100644 index 8e4052f14fc7c..0000000000000 --- a/flang/docs/FIRLangRef.md +++ /dev/null @@ -1,3 +0,0 @@ -# FIR Language Reference - -_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md deleted file mode 100644 index ee8d7b83dc50c..0000000000000 --- a/flang/docs/FlangCommandLineReference.md +++ /dev/null @@ -1,3 +0,0 @@ -# Flang Command Line Reference - -_TODO: Add Flang CLI documentation._ From flang-commits at lists.llvm.org Thu May 29 06:03:22 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 06:03:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Allow structure component in `task depend` clauses (PR #141923) In-Reply-To: Message-ID: <68385b1a.050a0220.918cb.2d11@mx.google.com> https://github.com/ergawy updated https://github.com/llvm/llvm-project/pull/141923 >From 18a503d2c70a7aad7e397cb1c85ac8dcad34f72b Mon Sep 17 
00:00:00 2001 From: ergawy Date: Thu, 29 May 2025 05:19:15 -0500 Subject: [PATCH 1/2] [flang][OpenMP] Allow structure component in `task depend` clauses Even though the spec (version 5.2) prohibits structure components from being specified in `depend` clauses, this restriction is not sensible. This PR rectifies the issue by lifting that restriction and allowing structure components in `depend` clauses (which is allowed by OpenMP 6.0). --- flang/include/flang/Evaluate/tools.h | 10 ++++ flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 5 ++ flang/lib/Semantics/check-omp-structure.cpp | 8 +-- .../task-depend-structure-component.f90 | 21 ++++++++ flang/test/Semantics/OpenMP/depend02.f90 | 49 ------------------- 5 files changed, 38 insertions(+), 55 deletions(-) create mode 100644 flang/test/Lower/OpenMP/task-depend-structure-component.f90 delete mode 100644 flang/test/Semantics/OpenMP/depend02.f90 diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h index 7f2e91ae128bd..4efd88b13183f 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -414,6 +414,16 @@ const Symbol *IsArrayElement(const Expr &expr, bool intoSubstring = true, return nullptr; } +template +bool isStructureCompnent(const Fortran::evaluate::Expr &expr) { + if (auto dataRef{ExtractDataRef(expr, /*intoSubstring=*/false)}) { + const Fortran::evaluate::DataRef *ref{&*dataRef}; + return std::holds_alternative(ref->u); + } + + return false; +} + template std::optional ExtractNamedEntity(const A &x) { if (auto dataRef{ExtractDataRef(x)}) { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index ebdda9885d5c2..8506d562c5094 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -944,6 +944,11 @@ bool ClauseProcessor::processDepend(lower::SymMap &symMap, converter.getCurrentLocation(), converter, expr, symMap, stmtCtx); dependVar = 
entity.getBase(); } + } else if (evaluate::isStructureCompnent(*object.ref())) { + SomeExpr expr = *object.ref(); + hlfir::EntityWithAttributes entity = convertExprToHLFIR( + converter.getCurrentLocation(), converter, expr, symMap, stmtCtx); + dependVar = entity.getBase(); } else { semantics::Symbol *sym = object.sym(); dependVar = converter.getSymbolAddress(*sym); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bda0d62829506..a3eeac4524eff 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -5493,12 +5493,8 @@ void OmpStructureChecker::CheckDependList(const parser::DataRef &d) { // Check if the base element is valid on Depend Clause CheckDependList(elem.value().base); }, - [&](const common::Indirection &) { - context_.Say(GetContext().clauseSource, - "A variable that is part of another variable " - "(such as an element of a structure) but is not an array " - "element or an array section cannot appear in a DEPEND " - "clause"_err_en_US); + [&](const common::Indirection &comp) { + CheckDependList(comp.value().base); }, [&](const common::Indirection &) { context_.Say(GetContext().clauseSource, diff --git a/flang/test/Lower/OpenMP/task-depend-structure-component.f90 b/flang/test/Lower/OpenMP/task-depend-structure-component.f90 new file mode 100644 index 0000000000000..7cf6dbfac2729 --- /dev/null +++ b/flang/test/Lower/OpenMP/task-depend-structure-component.f90 @@ -0,0 +1,21 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +subroutine depend + type :: my_struct + integer :: my_component(10) + end type + + type(my_struct) :: my_var + + !$omp task depend(in:my_var%my_component) + !$omp end task +end subroutine depend + +! CHECK: %[[VAR_ALLOC:.*]] = fir.alloca !fir.type<{{.*}}my_struct{{.*}}> {bindc_name = "my_var", {{.*}}} +! CHECK: %[[VAR_DECL:.*]]:2 = hlfir.declare %[[VAR_ALLOC]] + +! 
CHECK: %[[COMP_SELECTOR:.*]] = hlfir.designate %[[VAR_DECL]]#0{"my_component"} + +! CHECK: omp.task depend(taskdependin -> %[[COMP_SELECTOR]] : {{.*}}) { +! CHECK: omp.terminator +! CHECK: } diff --git a/flang/test/Semantics/OpenMP/depend02.f90 b/flang/test/Semantics/OpenMP/depend02.f90 deleted file mode 100644 index 76c02c8f9cbab..0000000000000 --- a/flang/test/Semantics/OpenMP/depend02.f90 +++ /dev/null @@ -1,49 +0,0 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp -! OpenMP Version 4.5 -! 2.13.9 Depend Clause -! A variable that is part of another variable -! (such as an element of a structure) but is not an array element or -! an array section cannot appear in a DEPEND clause - -subroutine vec_mult(N) - implicit none - integer :: i, N - real, allocatable :: p(:), v1(:), v2(:) - - type my_type - integer :: a(10) - end type my_type - - type(my_type) :: my_var - allocate( p(N), v1(N), v2(N) ) - - !$omp parallel num_threads(2) - !$omp single - - !$omp task depend(out:v1) - call init(v1, N) - !$omp end task - - !$omp task depend(out:v2) - call init(v2, N) - !$omp end task - - !ERROR: A variable that is part of another variable (such as an element of a structure) but is not an array element or an array section cannot appear in a DEPEND clause - !$omp target nowait depend(in:v1,v2, my_var%a) depend(out:p) & - !$omp& map(to:v1,v2) map(from: p) - !$omp parallel do - do i=1,N - p(i) = v1(i) * v2(i) - end do - !$omp end target - - !$omp task depend(in:p) - call output(p, N) - !$omp end task - - !$omp end single - !$omp end parallel - - deallocate( p, v1, v2 ) - -end subroutine >From d8c9f24ea2e2ae598247d4268dc67151939808b4 Mon Sep 17 00:00:00 2001 From: ergawy Date: Thu, 29 May 2025 08:03:01 -0500 Subject: [PATCH 2/2] fix typo --- flang/include/flang/Evaluate/tools.h | 2 +- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h 
index 4efd88b13183f..4dce1257a6507 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -415,7 +415,7 @@ const Symbol *IsArrayElement(const Expr &expr, bool intoSubstring = true, } template -bool isStructureCompnent(const Fortran::evaluate::Expr &expr) { +bool isStructureComponent(const Fortran::evaluate::Expr &expr) { if (auto dataRef{ExtractDataRef(expr, /*intoSubstring=*/false)}) { const Fortran::evaluate::DataRef *ref{&*dataRef}; return std::holds_alternative(ref->u); diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 8506d562c5094..307e2ac2c66f9 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -944,7 +944,7 @@ bool ClauseProcessor::processDepend(lower::SymMap &symMap, converter.getCurrentLocation(), converter, expr, symMap, stmtCtx); dependVar = entity.getBase(); } - } else if (evaluate::isStructureCompnent(*object.ref())) { + } else if (evaluate::isStructureComponent(*object.ref())) { SomeExpr expr = *object.ref(); hlfir::EntityWithAttributes entity = convertExprToHLFIR( converter.getCurrentLocation(), converter, expr, symMap, stmtCtx); From flang-commits at lists.llvm.org Thu May 29 06:03:47 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Thu, 29 May 2025 06:03:47 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68385b33.050a0220.c19af.2da5@mx.google.com> snarang181 wrote: > > > > > https://reviews.llvm.org/D128650 > > > > > > > > > > > > @kiranchandramohan -- this runs for me successfully and does not interfere with the placeholder file. The final `FIRLangRef.md` that is produced in the build artifacts has been generated by tablegen. > > > > > > > > > What about the man pages and its contents. Is it the empty file or the one produced by tablegen? 
> > > > > > As I mentioned the man page produced at `${BUILD_DIR}/tools/flang/docs/man/flang.1` looks like the one produced by tablegen. The `FIRLangRef.md` looks to be completely overwritten in the build directory. For the other placeholder file, `FlangCommandLineReference.md`, there is no fallout either. > > Can we do something similar to https://reviews.llvm.org/D128650 to avoid needing dummy `FIRLangRef.md` and `FlangCommandLineReference`? @kiranchandramohan, I removed the placeholder `.md` files and the man pages get generated now. Just a warning of the form `WARNING: toctree contains reference to nonexisting document 'FIRLangRef'` `WARNING: toctree contains reference to nonexisting document 'FlangCommandLineReference'` This is probably OK? https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 06:06:38 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 06:06:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][fir][OpenMP] Refactor privtization code into shared location (PR #141767) In-Reply-To: Message-ID: <68385bde.050a0220.2d672f.2ed4@mx.google.com> ergawy wrote: > LLVM Buildbot has detected a new failure on builder `ppc64le-flang-rhel-clang` running on `ppc64le-flang-rhel-test` while building `flang` at step 5 "build-unified-tree". 
> > Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/29376 > Here is the relevant piece of the build log for the reference Hopefully, fixed in https://github.com/llvm/llvm-project/pull/141938 https://github.com/llvm/llvm-project/pull/141767 From flang-commits at lists.llvm.org Thu May 29 06:52:51 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Thu, 29 May 2025 06:52:51 -0700 (PDT) Subject: [flang-commits] [flang] [NFC][flang] Remove unused variable from `privatizeSymbol` (PR #141938) In-Reply-To: Message-ID: <683866b3.050a0220.1a5c2f.4431@mx.google.com> https://github.com/tblah approved this pull request. I hope this is still being passed to the OpenMP init function LGTM https://github.com/llvm/llvm-project/pull/141938 From flang-commits at lists.llvm.org Thu May 29 06:53:17 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 29 May 2025 06:53:17 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <683866cd.170a0220.2a003c.4999@mx.google.com> ================ @@ -614,3 +614,28 @@ nvfortran defines `-fast` as - `-Mcache_align`: there is no equivalent flag in Flang or Clang. - `-Mflushz`: flush-to-zero mode - when `-ffast-math` is specified, Flang will link to `crtfastmath.o` to ensure denormal numbers are flushed to zero. + + +## FCC_OVERRIDE_OPTIONS + +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of +edits to the input argument lists. The value of this environment variable is ---------------- tarunprabhu wrote: Can we also document how this works relative to configuration files. For example, will the substitution option `s/XXX/YYY/` be applied to any options added by the configuration files or only to those passed in explicitly by the user? 
https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Thu May 29 06:53:17 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 29 May 2025 06:53:17 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <683866cd.170a0220.230a5f.4130@mx.google.com> ================ @@ -0,0 +1,12 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s ---------------- tarunprabhu wrote: We may also want to have a test for the interaction of this environment variable with options added by a config file. https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Thu May 29 06:54:47 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 29 May 2025 06:54:47 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <68386727.050a0220.7d3a7.49bb@mx.google.com> tarunprabhu wrote: > > Does `CCC_OVERRIDE_OPTIONS` expand to Clang Compiler Commandline Override Options? If so `FCC_OVERRIDE_OPTIONS` expanding to Fortran Compiler Commandline Override Options seems the right replacement. > > If `CCC_OVERRIDE_OPTIONS` expands to Clang C Compiler Override Options then `FFC_OVERRIDE_OPTIONS` (as suggested by @tarunprabhu) expanding to Flang Fortran Compiler Override Options is better. > > I am happy with either of them. Maybe we should split the difference and call it `FFF_OVERRIDE_OPTIONS`. More Fortran for everyone! 
:-) https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Thu May 29 07:06:12 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 29 May 2025 07:06:12 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <683869d4.170a0220.383749.4e06@mx.google.com> tarunprabhu wrote: > > @fanju110, Thanks for seeing this through! > > Hi @tarunprabhu, If everything looks good, could you please merge this PR at your convenience (I don't have merge rights)? Thank you! It looks like you need to run `clang-format` on some of the code. See [here](https://github.com/llvm/llvm-project/actions/runs/15247681502/job/43119228106?pr=136098). For changes that you have made in `clang/`, please run it only on the code that you have changed and not the entire file. For code in `flang/`, you may run it on the entire file (in fact, you should). https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Thu May 29 07:14:44 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Thu, 29 May 2025 07:14:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68386bd4.170a0220.284cc1.2493@mx.google.com> ================ @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. 
For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. 
+ if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; 
+ // Make sure the result is always a boxed array by boxing it + // ourselves if need be. + if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); ---------------- mrkajetanp wrote: Ah nice it does work, thanks for the idea! https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Thu May 29 07:31:56 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 29 May 2025 07:31:56 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68386fdc.170a0220.29c3f4.7952@mx.google.com> tarunprabhu wrote: > @kiranchandramohan, I removed the placeholder `.md` files and the man pages get generated now. Just a warning of the form `WARNING: toctree contains reference to nonexisting document 'FIRLangRef'` `WARNING: toctree contains reference to nonexisting document 'FlangCommandLineReference'` This is probably OK? It might not be ok. If the buildkite fails, as it does [here](https://github.com/llvm/llvm-project/actions/runs/15324450618/job/43119288153?pr=141882), it will probably break buildbots too. I am not sure what is causing this though. 
https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 07:35:18 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Thu, 29 May 2025 07:35:18 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <683870a6.170a0220.46939.8177@mx.google.com> snarang181 wrote: > > @kiranchandramohan, I removed the placeholder `.md` files and the man pages get generated now. Just a warning of the form `WARNING: toctree contains reference to nonexisting document 'FIRLangRef'` `WARNING: toctree contains reference to nonexisting document 'FlangCommandLineReference'` This is probably OK? > > It might not be ok. If the buildkite fails, as it does [here](https://github.com/llvm/llvm-project/actions/runs/15324450618/job/43119288153?pr=141882), it will probably break buildbots too. I am not sure what is causing this though. Looking further into it. https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 07:44:05 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 07:44:05 -0700 (PDT) Subject: [flang-commits] [flang] 11c7a0c - [NFC][flang] Remove unused variable from `privatizeSymbol` (#141938) Message-ID: <683872b5.170a0220.2412de.83ce@mx.google.com> Author: Kareem Ergawy Date: 2025-05-29T16:44:01+02:00 New Revision: 11c7a0c3f780f505eef7021480f457b2f2a1ff89 URL: https://github.com/llvm/llvm-project/commit/11c7a0c3f780f505eef7021480f457b2f2a1ff89 DIFF: https://github.com/llvm/llvm-project/commit/11c7a0c3f780f505eef7021480f457b2f2a1ff89.diff LOG: [NFC][flang] Remove unused variable from `privatizeSymbol` (#141938) Added: Modified: flang/lib/Lower/Support/Utils.cpp Removed: ################################################################################ diff --git a/flang/lib/Lower/Support/Utils.cpp b/flang/lib/Lower/Support/Utils.cpp index de810cb2f4b34..2de9db992e278 100644 
--- a/flang/lib/Lower/Support/Utils.cpp +++ b/flang/lib/Lower/Support/Utils.cpp @@ -656,7 +656,6 @@ void privatizeSymbol( : converter.lookupOneLevelUpSymbol(*sym); assert(hsb && "Host symbol box not found"); hlfir::Entity entity{hsb.getAddr()}; - bool cannotHaveNonDefaultLowerBounds = !entity.mayHaveNonDefaultLowerBounds(); mlir::Location symLoc = hsb.getAddr().getLoc(); std::string privatizerName = sym->name().ToString() + ".privatizer"; From flang-commits at lists.llvm.org Thu May 29 07:44:08 2025 From: flang-commits at lists.llvm.org (Kareem Ergawy via flang-commits) Date: Thu, 29 May 2025 07:44:08 -0700 (PDT) Subject: [flang-commits] [flang] [NFC][flang] Remove unused variable from `privatizeSymbol` (PR #141938) In-Reply-To: Message-ID: <683872b8.170a0220.47242.6e12@mx.google.com> https://github.com/ergawy closed https://github.com/llvm/llvm-project/pull/141938 From flang-commits at lists.llvm.org Thu May 29 07:13:24 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Thu, 29 May 2025 07:13:24 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68386b84.050a0220.f9f50.25f4@mx.google.com> https://github.com/mrkajetanp updated https://github.com/llvm/llvm-project/pull/138718 >From 78206935db8c87bb9e2c89da3574b933406f112e Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 10 Apr 2025 14:04:52 +0000 Subject: [PATCH 1/6] [flang] Inline hlfir.copy_in for trivial types hlfir.copy_in implements copying non-contiguous array slices for functions that take in arrays required to be contiguous through a flang-rt function that calls memcpy/memmove separately on each element. For large arrays of trivial types, this can incur considerable overhead compared to a plain copy loop that is better able to take advantage of hardware pipelines. 
To address that, extend the InlineHLFIRAssign optimisation pass with a new pattern for inlining hlfir.copy_in operations for trivial types. For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in). Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by a factor of about 1/3rd. Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 117 ++++++++++++++++++ 1 file changed, 117 insertions(+) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 6e209cce07ad4..38c684eaceb7d 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,6 +13,7 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if 
(fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + auto results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + auto elem = hlfir::getElementAt(loc, builder, inputVariable, + loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + auto 
tempElem = hlfir::getElementAt(loc, builder, temp, + loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. + if (mlir::isa(temp.getType())) { + result = temp; + } else { + auto refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + auto refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + auto addr = results[0]; + auto needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + auto tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. 
+ rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); +} + class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -140,6 +256,7 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); + patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { >From 154b758b0da725c0d0f9b41cdc3713a05e2239a7 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 7 May 2025 16:04:07 +0000 Subject: [PATCH 2/6] Add tests Signed-off-by: Kajetan Puchalski --- flang/test/HLFIR/inline-hlfir-assign.fir | 144 +++++++++++++++++++++++ 1 file changed, 144 insertions(+) diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index f834e7971e3d5..df7681b9c5c16 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,3 +353,147 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // CHECK: } + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + 
%8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 
: !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = 
hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = 
arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert 
%[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 6d334d77917a9e02b3e397dd1b3ea8605320c795 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 8 May 2025 15:15:56 +0000 Subject: [PATCH 3/6] Address Tom's review comments Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 41 +++++++++++-------- 1 file changed, 23 insertions(+), 18 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 38c684eaceb7d..dc545ece8adff 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -158,16 +158,16 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, "CopyInOp's WasCopied has no uses"); // The copy out should always be present, either to actually copy or just // deallocate memory. 
- auto *copyOut = - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - if (!mlir::isa(copyOut)) + if (!copyOut) return rewriter.notifyMatchFailure(copyIn, "CopyInOp has no direct CopyOut"); // Only inline the copy_in when copy_out does not need to be done, i.e. in // case of intent(in). - if (::llvm::cast(copyOut).getVar()) + if (copyOut.getVar()) return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); inputVariable = @@ -175,7 +175,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); mlir::Value isContiguous = builder.create(loc, inputVariable); - auto results = + mlir::Operation::result_range results = builder .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, /*withElseRegion=*/true) @@ -195,11 +195,11 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, loc, builder, extents, /*isUnordered=*/true, flangomp::shouldUseWorkshareLowering(copyIn)); builder.setInsertionPointToStart(loopNest.body); - auto elem = hlfir::getElementAt(loc, builder, inputVariable, - loopNest.oneBasedIndices); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); elem = hlfir::loadTrivialScalar(loc, builder, elem); - auto tempElem = hlfir::getElementAt(loc, builder, temp, - loopNest.oneBasedIndices); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); builder.create(loc, elem, tempElem); builder.setInsertionPointAfter(loopNest.outerOp); @@ -209,9 +209,9 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, if (mlir::isa(temp.getType())) { result = temp; } else { - auto refTy = + fir::ReferenceType refTy = fir::ReferenceType::get(temp.getElementOrSequenceType()); - auto refVal = builder.createConvert(loc, refTy, temp); + mlir::Value refVal = 
builder.createConvert(loc, refTy, temp); result = builder.create(loc, resultAddrType, refVal); } @@ -221,25 +221,30 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }) .getResults(); - auto addr = results[0]; - auto needsCleanup = results[1]; + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { auto boxAddr = builder.create(loc, addr); - auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); builder.create(loc, heapVal); }); rewriter.eraseOp(copyOut); - auto tempBox = copyIn.getTempBox(); + mlir::Value tempBox = copyIn.getTempBox(); rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); // The TempBox is only needed for flang-rt calls which we're no longer - // generating. + // generating. It should have no uses left at this stage. 
+ if (!tempBox.getUses().empty()) + return mlir::failure(); rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); }

>From 6a9d0fd6cc3c72ed7382bd78128a4cd59b75abe9 Mon Sep 17 00:00:00 2001
From: Kajetan Puchalski
Date: Thu, 22 May 2025 13:37:53 +0000
Subject: [PATCH 4/6] Separate copy_in inlining into its own pass, add flag

Signed-off-by: Kajetan Puchalski
---
 flang/include/flang/Optimizer/HLFIR/Passes.td | 4 +
 .../Optimizer/HLFIR/Transforms/CMakeLists.txt | 1 +
 .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 122 ------------
 .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 180 ++++++++++++++++++
 flang/lib/Optimizer/Passes/Pipelines.cpp | 5 +
 flang/test/HLFIR/inline-hlfir-assign.fir | 144 --------------
 flang/test/HLFIR/inline-hlfir-copy-in.fir | 146 ++++++++++++++
 7 files changed, 336 insertions(+), 266 deletions(-)
 create mode 100644 flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp
 create mode 100644 flang/test/HLFIR/inline-hlfir-copy-in.fir

diff --git a/flang/include/flang/Optimizer/HLFIR/Passes.td b/flang/include/flang/Optimizer/HLFIR/Passes.td
index d445140118073..04d7aec5fe489 100644
--- a/flang/include/flang/Optimizer/HLFIR/Passes.td
+++ b/flang/include/flang/Optimizer/HLFIR/Passes.td
@@ -69,6 +69,10 @@ def InlineHLFIRAssign : Pass<"inline-hlfir-assign"> {
   let summary = "Inline hlfir.assign operations";
 }

+def InlineHLFIRCopyIn : Pass<"inline-hlfir-copy-in"> {
+  let summary = "Inline hlfir.copy_in operations";
+}
+
 def PropagateFortranVariableAttributes : Pass<"propagate-fortran-attrs"> {
   let summary = "Propagate FortranVariableFlagsAttr attributes through HLFIR";
 }
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
index d959428ebd203..cc74273d9c5d9 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
+++ b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
@@ -5,6 +5,7 @@ add_flang_library(HLFIRTransforms
   ConvertToFIR.cpp
   InlineElementals.cpp
InlineHLFIRAssign.cpp + InlineHLFIRCopyIn.cpp LowerHLFIRIntrinsics.cpp LowerHLFIROrderedAssignments.cpp ScheduleOrderedAssignments.cpp diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index dc545ece8adff..6e209cce07ad4 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,7 +13,6 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" -#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -128,126 +127,6 @@ class InlineHLFIRAssignConversion } }; -class InlineCopyInConversion : public mlir::OpRewritePattern { -public: - using mlir::OpRewritePattern::OpRewritePattern; - - llvm::LogicalResult - matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const override; -}; - -llvm::LogicalResult -InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const { - fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); - mlir::Location loc = copyIn.getLoc(); - hlfir::Entity inputVariable{copyIn.getVar()}; - if (!fir::isa_trivial(inputVariable.getFortranElementType())) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's data type is not trivial"); - - if (fir::isPointerType(inputVariable.getType())) - return rewriter.notifyMatchFailure( - copyIn, "CopyInOp's input variable is a pointer"); - - // There should be exactly one user of WasCopied - the corresponding - // CopyOutOp. - if (copyIn.getWasCopied().getUses().empty()) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's WasCopied has no uses"); - // The copy out should always be present, either to actually copy or just - // deallocate memory. 
- auto copyOut = mlir::dyn_cast( - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - - if (!copyOut) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp has no direct CopyOut"); - - // Only inline the copy_in when copy_out does not need to be done, i.e. in - // case of intent(in). - if (copyOut.getVar()) - return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); - - inputVariable = - hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); - mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); - mlir::Value isContiguous = - builder.create(loc, inputVariable); - mlir::Operation::result_range results = - builder - .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, - /*withElseRegion=*/true) - .genThen([&]() { - mlir::Value falseVal = builder.create( - loc, builder.getI1Type(), builder.getBoolAttr(false)); - builder.create( - loc, mlir::ValueRange{inputVariable, falseVal}); - }) - .genElse([&] { - auto [temp, cleanup] = - hlfir::createTempFromMold(loc, builder, inputVariable); - mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); - llvm::SmallVector extents = - hlfir::getIndexExtents(loc, builder, shape); - hlfir::LoopNest loopNest = hlfir::genLoopNest( - loc, builder, extents, /*isUnordered=*/true, - flangomp::shouldUseWorkshareLowering(copyIn)); - builder.setInsertionPointToStart(loopNest.body); - hlfir::Entity elem = hlfir::getElementAt( - loc, builder, inputVariable, loopNest.oneBasedIndices); - elem = hlfir::loadTrivialScalar(loc, builder, elem); - hlfir::Entity tempElem = hlfir::getElementAt( - loc, builder, temp, loopNest.oneBasedIndices); - builder.create(loc, elem, tempElem); - builder.setInsertionPointAfter(loopNest.outerOp); - - mlir::Value result; - // Make sure the result is always a boxed array by boxing it - // ourselves if need be. 
- if (mlir::isa(temp.getType())) { - result = temp; - } else { - fir::ReferenceType refTy = - fir::ReferenceType::get(temp.getElementOrSequenceType()); - mlir::Value refVal = builder.createConvert(loc, refTy, temp); - result = - builder.create(loc, resultAddrType, refVal); - } - - builder.create(loc, - mlir::ValueRange{result, cleanup}); - }) - .getResults(); - - mlir::OpResult addr = results[0]; - mlir::OpResult needsCleanup = results[1]; - - builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { - auto boxAddr = builder.create(loc, addr); - fir::HeapType heapType = - fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - mlir::Value heapVal = - builder.createConvert(loc, heapType, boxAddr.getResult()); - builder.create(loc, heapVal); - }); - rewriter.eraseOp(copyOut); - - mlir::Value tempBox = copyIn.getTempBox(); - - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); - - // The TempBox is only needed for flang-rt calls which we're no longer - // generating. It should have no uses left at this stage. 
- if (!tempBox.getUses().empty()) - return mlir::failure(); - rewriter.eraseOp(tempBox.getDefiningOp()); - - return mlir::success(); -} - class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -261,7 +140,6 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); - patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp new file mode 100644 index 0000000000000..1e2aecaf535a0 --- /dev/null +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + mlir::Value tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); + rewriter.eraseOp(tempBox.getDefiningOp()); + + return mlir::success(); +} + +class InlineHLFIRCopyInPass + : public hlfir::impl::InlineHLFIRCopyInBase { +public: + void runOnOperation() override { + mlir::MLIRContext *context = &getContext(); + + mlir::GreedyRewriteConfig config; + // Prevent the pattern driver from merging blocks. 
+ config.setRegionSimplificationLevel( + mlir::GreedySimplifyRegionLevel::Disabled); + + mlir::RewritePatternSet patterns(context); + if (!noInlineHLFIRCopyIn) { + patterns.insert(context); + } + + if (mlir::failed(mlir::applyPatternsGreedily( + getOperation(), std::move(patterns), config))) { + mlir::emitError(getOperation()->getLoc(), + "failure in hlfir.copy_in inlining"); + signalPassFailure(); + } + } +}; +} // namespace diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..1779623fddc5a 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -255,6 +255,11 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP, pm, hlfir::createOptimizedBufferization); addNestedPassToAllTopLevelOperations( pm, hlfir::createInlineHLFIRAssign); + + if (optLevel == llvm::OptimizationLevel::O3) { + addNestedPassToAllTopLevelOperations( + pm, hlfir::createInlineHLFIRCopyIn); + } } pm.addPass(hlfir::createLowerHLFIROrderedAssignments()); pm.addPass(hlfir::createLowerHLFIRIntrinsics()); diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index df7681b9c5c16..f834e7971e3d5 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,147 +353,3 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // CHECK: } - -// Test inlining of hlfir.copy_in that does not require the array to be copied out -func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, 
!fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, %c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant true -// CHECK: %[[VAL_4:.*]] = arith.constant false -// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : 
(!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 -// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 -// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { -// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 -// CHECK: } else { -// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} -// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { -// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: %[[VAL_27:.*]] = fir.load 
%[[VAL_26:.*]] : !fir.ref -// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref -// CHECK: } -// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 -// CHECK: } -// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: fir.if %[[VAL_21:.*]]#1 { -// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> -// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> -// CHECK: } -// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } - -// Test not inlining of hlfir.copy_in that requires the array to be copied out -func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, 
%c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_no_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> -// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 
-// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) -// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () -// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir new file mode 100644 index 0000000000000..7140e93f19979 --- /dev/null +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -0,0 +1,146 @@ +// Test inlining of hlfir.copy_in +// RUN: fir-opt --inline-hlfir-copy-in %s | FileCheck %s + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca 
!fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 
+// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, 
!fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 
+ %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: 
%[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 63f66ae55347275c3f42c456a70dfbb688836fe6 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 28 May 2025 13:44:53 +0000 Subject: [PATCH 5/6] Support arrays behind a pointer, add metadata to disable vectorizing --- .../flang/Optimizer/Builder/HLFIRTools.h | 8 ++- 
flang/lib/Optimizer/Builder/HLFIRTools.cpp | 13 +++- .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 66 ++++++++++--------- flang/test/HLFIR/inline-hlfir-copy-in.fir | 6 +- 4 files changed, 55 insertions(+), 38 deletions(-) diff --git a/flang/include/flang/Optimizer/Builder/HLFIRTools.h b/flang/include/flang/Optimizer/Builder/HLFIRTools.h index ed00cec04dc39..2cbad6e268a38 100644 --- a/flang/include/flang/Optimizer/Builder/HLFIRTools.h +++ b/flang/include/flang/Optimizer/Builder/HLFIRTools.h @@ -374,12 +374,14 @@ struct LoopNest { /// loop constructs currently. LoopNest genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ValueRange extents, bool isUnordered = false, - bool emitWorkshareLoop = false); + bool emitWorkshareLoop = false, + bool couldVectorize = true); inline LoopNest genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::Value shape, bool isUnordered = false, - bool emitWorkshareLoop = false) { + bool emitWorkshareLoop = false, + bool couldVectorize = true) { return genLoopNest(loc, builder, getIndexExtents(loc, builder, shape), - isUnordered, emitWorkshareLoop); + isUnordered, emitWorkshareLoop, couldVectorize); } /// The type of a callback that generates the body of a reduction diff --git a/flang/lib/Optimizer/Builder/HLFIRTools.cpp b/flang/lib/Optimizer/Builder/HLFIRTools.cpp index f24dc2caeedfc..14aae5d7118a1 100644 --- a/flang/lib/Optimizer/Builder/HLFIRTools.cpp +++ b/flang/lib/Optimizer/Builder/HLFIRTools.cpp @@ -21,6 +21,7 @@ #include "mlir/IR/IRMapping.h" #include "mlir/Support/LLVM.h" #include "llvm/ADT/TypeSwitch.h" +#include #include #include @@ -932,7 +933,8 @@ mlir::Value hlfir::inlineElementalOp( hlfir::LoopNest hlfir::genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ValueRange extents, bool isUnordered, - bool emitWorkshareLoop) { + bool emitWorkshareLoop, + bool couldVectorize) { emitWorkshareLoop = emitWorkshareLoop && isUnordered; hlfir::LoopNest loopNest; assert(!extents.empty() && "must have 
at least one extent"); @@ -967,6 +969,15 @@ hlfir::LoopNest hlfir::genLoopNest(mlir::Location loc, auto ub = builder.createConvert(loc, indexType, extent); auto doLoop = builder.create(loc, one, ub, one, isUnordered); + if (!couldVectorize) { + mlir::LLVM::LoopVectorizeAttr va{mlir::LLVM::LoopVectorizeAttr::get( + builder.getContext(), + /*disable=*/builder.getBoolAttr(true), {}, {}, {}, {}, {}, {})}; + mlir::LLVM::LoopAnnotationAttr la = mlir::LLVM::LoopAnnotationAttr::get( + builder.getContext(), {}, /*vectorize=*/va, {}, /*unroll*/ {}, + /*unroll_and_jam*/ {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}); + doLoop.setLoopAnnotationAttr(la); + } loopNest.body = doLoop.getBody(); builder.setInsertionPointToStart(loopNest.body); // Reverse the indices so they are in column-major order. diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp index 1e2aecaf535a0..d1cbe3241c07b 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -52,19 +52,15 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, return rewriter.notifyMatchFailure(copyIn, "CopyInOp's data type is not trivial"); - if (fir::isPointerType(inputVariable.getType())) - return rewriter.notifyMatchFailure( - copyIn, "CopyInOp's input variable is a pointer"); - // There should be exactly one user of WasCopied - the corresponding // CopyOutOp. - if (copyIn.getWasCopied().getUses().empty()) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's WasCopied has no uses"); + if (!copyIn.getWasCopied().hasOneUse()) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's WasCopied has no single user"); // The copy out should always be present, either to actually copy or just // deallocate memory. 
auto copyOut = mlir::dyn_cast( - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + copyIn.getWasCopied().user_begin().getCurrent().getUser()); if (!copyOut) return rewriter.notifyMatchFailure(copyIn, @@ -77,28 +73,45 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, inputVariable = hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); - mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Type sequenceType = + hlfir::getFortranElementOrSequenceType(inputVariable.getType()); + fir::BoxType resultBoxType = fir::BoxType::get(sequenceType); mlir::Value isContiguous = builder.create(loc, inputVariable); mlir::Operation::result_range results = builder - .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + .genIfOp(loc, {resultBoxType, builder.getI1Type()}, isContiguous, /*withElseRegion=*/true) .genThen([&]() { - mlir::Value falseVal = builder.create( - loc, builder.getI1Type(), builder.getBoolAttr(false)); + mlir::Value result = inputVariable; + if (fir::isPointerType(inputVariable.getType())) { + auto boxAddr = builder.create(loc, inputVariable); + fir::ReferenceType refTy = fir::ReferenceType::get(sequenceType); + mlir::Value refVal = builder.createConvert(loc, refTy, boxAddr); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + result = builder.create(loc, resultBoxType, refVal, + shape); + } builder.create( - loc, mlir::ValueRange{inputVariable, falseVal}); + loc, mlir::ValueRange{result, builder.createBool(loc, false)}); }) .genElse([&] { - auto [temp, cleanup] = - hlfir::createTempFromMold(loc, builder, inputVariable); mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); llvm::SmallVector extents = hlfir::getIndexExtents(loc, builder, shape); - hlfir::LoopNest loopNest = hlfir::genLoopNest( - loc, builder, extents, /*isUnordered=*/true, - flangomp::shouldUseWorkshareLowering(copyIn)); + llvm::StringRef tmpName{".tmp.copy_in"}; + llvm::SmallVector lenParams; 
+ mlir::Value alloc = builder.createHeapTemporary( + loc, sequenceType, tmpName, extents, lenParams); + + auto declareOp = builder.create( + loc, alloc, tmpName, shape, lenParams, + /*dummy_scope=*/nullptr); + hlfir::Entity temp{declareOp.getBase()}; + hlfir::LoopNest loopNest = + hlfir::genLoopNest(loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn), + /*couldVectorize=*/false); builder.setInsertionPointToStart(loopNest.body); hlfir::Entity elem = hlfir::getElementAt( loc, builder, inputVariable, loopNest.oneBasedIndices); @@ -117,12 +130,12 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, fir::ReferenceType refTy = fir::ReferenceType::get(temp.getElementOrSequenceType()); mlir::Value refVal = builder.createConvert(loc, refTy, temp); - result = - builder.create(loc, resultAddrType, refVal); + result = builder.create(loc, resultBoxType, refVal, + shape); } - builder.create(loc, - mlir::ValueRange{result, cleanup}); + builder.create( + loc, mlir::ValueRange{result, builder.createBool(loc, true)}); }) .getResults(); @@ -140,16 +153,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }); rewriter.eraseOp(copyOut); - mlir::Value tempBox = copyIn.getTempBox(); - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); - - // The TempBox is only needed for flang-rt calls which we're no longer - // generating. It should have no uses left at this stage. 
- if (!tempBox.getUses().empty()) - return mlir::failure(); - rewriter.eraseOp(tempBox.getDefiningOp()); - return mlir::success(); } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir index 7140e93f19979..7a5b6e591f7c7 100644 --- a/flang/test/HLFIR/inline-hlfir-copy-in.fir +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -60,9 +60,9 @@ func.func private @_test_inline_copy_in(%arg0: !fir.box> { // CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { // CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 // CHECK: } else { -// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} -// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp.copy_in", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp.copy_in"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered attributes {loopAnnotation = #loop_annotation} { // CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref // CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref // CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref >From 837d45b0661c29661969407e6ede3f7a70e4739e Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 29 May 2025 14:10:36 +0000 Subject: [PATCH 6/6] Keep the copy-out to deallocate the temporary Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 20 +++++++------------ flang/test/HLFIR/inline-hlfir-copy-in.fir | 6 +----- 2 
files changed, 8 insertions(+), 18 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp index d1cbe3241c07b..0cad503afe16d 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -139,21 +139,15 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }) .getResults(); - mlir::OpResult addr = results[0]; + mlir::OpResult resultBox = results[0]; mlir::OpResult needsCleanup = results[1]; - builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { - auto boxAddr = builder.create(loc, addr); - fir::HeapType heapType = - fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - mlir::Value heapVal = - builder.createConvert(loc, heapType, boxAddr.getResult()); - builder.create(loc, heapVal); - }); - rewriter.eraseOp(copyOut); - - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + auto alloca = builder.create(loc, resultBox.getType()); + auto store = builder.create(loc, resultBox, alloca); + copyOut->setOperand(0, store.getMemref()); + copyOut->setOperand(1, needsCleanup); + + rewriter.replaceOp(copyIn, {resultBox, builder.genNot(loc, isContiguous)}); return mlir::success(); } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir index 7a5b6e591f7c7..c1d5e11939b7c 100644 --- a/flang/test/HLFIR/inline-hlfir-copy-in.fir +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -73,11 +73,7 @@ func.func private @_test_inline_copy_in(%arg0: !fir.box> { // CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> // CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) // CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: fir.if %[[VAL_21:.*]]#1 
{ -// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> -// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> -// CHECK: } +// CHECK: hlfir.copy_out %16, %15#1 : (!fir.ref>>, i1) -> () // CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 // CHECK: return // CHECK: } From flang-commits at lists.llvm.org Thu May 29 08:16:54 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Thu, 29 May 2025 08:16:54 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [NFC][OpenMP] Move the default declare mapper name suffix to OMPConstants.h (PR #141964) Message-ID: https://github.com/TIFitis created https://github.com/llvm/llvm-project/pull/141964 This patch moves the default declare mapper name suffix ".omp.default.mapper" to the OMPConstants.h file to be used everywhere for lowering. >From 9daed33f9403a8c45f6ecba6fc7bc6dcac0f83c1 Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Thu, 29 May 2025 16:12:18 +0100 Subject: [PATCH] [NFC][OpenMP] Move the default declare mapper name suffix to OMPConstants.h This patch moves the default declare mapper name suffix ".omp.default.mapper" to the OMPConstants.h file to be used everywhere for lowering. 
---
 flang/lib/Lower/OpenMP/ClauseProcessor.cpp       | 2 +-
 flang/lib/Lower/OpenMP/OpenMP.cpp                | 2 +-
 flang/lib/Parser/openmp-parsers.cpp              | 2 +-
 llvm/include/llvm/Frontend/OpenMP/OMPConstants.h | 3 +++
 4 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index ebdda9885d5c2..49a3b64c03a7e 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -1148,7 +1148,7 @@ void ClauseProcessor::processMapObjects(
     typeSpec = &object.sym()->GetType()->derivedTypeSpec();
 
   if (typeSpec) {
-    mapperIdName = typeSpec->name().ToString() + ".omp.default.mapper";
+    mapperIdName = typeSpec->name().ToString() + llvm::omp::OmpDefaultMapperName;
     if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName))
       mapperIdName = converter.mangleName(mapperIdName, sym->owner());
   }
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index ddb08f74b3841..fa711060c6b90 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2423,7 +2423,7 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
   if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) {
     auto &typeSpec = sym.GetType()->derivedTypeSpec();
     std::string mapperIdName =
-        typeSpec.name().ToString() + ".omp.default.mapper";
+        typeSpec.name().ToString() + llvm::omp::OmpDefaultMapperName;
     if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName))
       mapperIdName = converter.mangleName(mapperIdName, sym->owner());
     if (converter.getModuleOp().lookupSymbol(mapperIdName))
diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp
index c08cd1ab80559..08326fad8c143 100644
--- a/flang/lib/Parser/openmp-parsers.cpp
+++ b/flang/lib/Parser/openmp-parsers.cpp
@@ -1402,7 +1402,7 @@ static OmpMapperSpecifier ConstructOmpMapperSpecifier(
   // This matches the syntax: ::
   if (DerivedTypeSpec * derived{std::get_if(&typeSpec.u)}) {
     return OmpMapperSpecifier{
-        std::get(derived->t).ToString() + ".omp.default.mapper",
+        std::get(derived->t).ToString() + llvm::omp::OmpDefaultMapperName,
         std::move(typeSpec), std::move(varName)};
   }
   return OmpMapperSpecifier{std::string("omp.default.mapper"),
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h b/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
index 338b56226f204..6e1bce12af8e4 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
@@ -190,6 +190,9 @@ enum class OMPScheduleType {
   LLVM_MARK_AS_BITMASK_ENUM(/* LargestValue */ ModifierMask)
 };
 
+// Default OpenMP mapper name suffix.
+inline constexpr const char *OmpDefaultMapperName = ".omp.default.mapper";
+
 /// Values for bit flags used to specify the mapping type for
 /// offloading.
 enum class OpenMPOffloadMappingFlags : uint64_t {

From flang-commits at lists.llvm.org Thu May 29 08:17:42 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 29 May 2025 08:17:42 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [NFC][OpenMP] Move the default declare mapper name suffix to OMPConstants.h (PR #141964)
In-Reply-To:
Message-ID: <68387a96.170a0220.335bcc.8d75@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-openmp

Author: Akash Banerjee (TIFitis)
Changes

This patch moves the default declare mapper name suffix ".omp.default.mapper" to the OMPConstants.h file to be used everywhere for lowering.

---
Full diff: https://github.com/llvm/llvm-project/pull/141964.diff

4 Files Affected:

- (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+1-1)
- (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+1-1)
- (modified) flang/lib/Parser/openmp-parsers.cpp (+1-1)
- (modified) llvm/include/llvm/Frontend/OpenMP/OMPConstants.h (+3)

``````````diff
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index ebdda9885d5c2..49a3b64c03a7e 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -1148,7 +1148,7 @@ void ClauseProcessor::processMapObjects(
     typeSpec = &object.sym()->GetType()->derivedTypeSpec();
 
   if (typeSpec) {
-    mapperIdName = typeSpec->name().ToString() + ".omp.default.mapper";
+    mapperIdName = typeSpec->name().ToString() + llvm::omp::OmpDefaultMapperName;
     if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName))
       mapperIdName = converter.mangleName(mapperIdName, sym->owner());
   }
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index ddb08f74b3841..fa711060c6b90 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2423,7 +2423,7 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
   if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) {
     auto &typeSpec = sym.GetType()->derivedTypeSpec();
     std::string mapperIdName =
-        typeSpec.name().ToString() + ".omp.default.mapper";
+        typeSpec.name().ToString() + llvm::omp::OmpDefaultMapperName;
     if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName))
       mapperIdName = converter.mangleName(mapperIdName, sym->owner());
     if (converter.getModuleOp().lookupSymbol(mapperIdName))
diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp
index c08cd1ab80559..08326fad8c143 100644
--- a/flang/lib/Parser/openmp-parsers.cpp
+++ b/flang/lib/Parser/openmp-parsers.cpp
@@ -1402,7 +1402,7 @@ static OmpMapperSpecifier ConstructOmpMapperSpecifier(
   // This matches the syntax: ::
   if (DerivedTypeSpec * derived{std::get_if(&typeSpec.u)}) {
     return OmpMapperSpecifier{
-        std::get(derived->t).ToString() + ".omp.default.mapper",
+        std::get(derived->t).ToString() + llvm::omp::OmpDefaultMapperName,
         std::move(typeSpec), std::move(varName)};
   }
   return OmpMapperSpecifier{std::string("omp.default.mapper"),
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h b/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
index 338b56226f204..6e1bce12af8e4 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
@@ -190,6 +190,9 @@ enum class OMPScheduleType {
   LLVM_MARK_AS_BITMASK_ENUM(/* LargestValue */ ModifierMask)
 };
 
+// Default OpenMP mapper name suffix.
+inline constexpr const char *OmpDefaultMapperName = ".omp.default.mapper";
+
 /// Values for bit flags used to specify the mapping type for
 /// offloading.
 enum class OpenMPOffloadMappingFlags : uint64_t {
``````````
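
For readers skimming the diff, the pattern it adopts (one shared suffix constant appended to a derived type's name) can be sketched outside the LLVM tree. This is only an illustration under stated assumptions, not the flang code: the namespace `sketch`, the helper `defaultMapperNameFor`, and the type name `my_type` are hypothetical. Only the suffix string and the form of the constant's declaration are taken from the diff.

```cpp
#include <cassert>
#include <string>

namespace sketch {
// Same declaration form as the constant added to OMPConstants.h in the diff;
// the enclosing namespace is a stand-in here, not llvm::omp.
inline constexpr const char *OmpDefaultMapperName = ".omp.default.mapper";

// Hypothetical helper: builds a default mapper name by appending the shared
// suffix to a derived type's name (std::string + const char * concatenation).
inline std::string defaultMapperNameFor(const std::string &typeName) {
  return typeName + OmpDefaultMapperName;
}

// Small self-check showing the expected shape of the resulting name.
inline void checkSketch() {
  assert(defaultMapperNameFor("my_type") == "my_type.omp.default.mapper");
}
} // namespace sketch
```

Keeping the suffix in one constant means every site that builds or looks up a default mapper name uses the same string, which is the motivation stated in the patch description.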
https://github.com/llvm/llvm-project/pull/141964

From flang-commits at lists.llvm.org Thu May 29 08:20:17 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 29 May 2025 08:20:17 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [NFC][OpenMP] Move the default declare mapper name suffix to OMPConstants.h (PR #141964)
In-Reply-To:
Message-ID: <68387b31.050a0220.1f2d65.9895@mx.google.com>

github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

``````````bash
git-clang-format --diff HEAD~1 HEAD --extensions cpp,h -- flang/lib/Lower/OpenMP/ClauseProcessor.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Parser/openmp-parsers.cpp llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
``````````

View the diff from clang-format here.

``````````diff
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index 49a3b64c0..99b0d4dd6 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -1148,7 +1148,8 @@ void ClauseProcessor::processMapObjects(
     typeSpec = &object.sym()->GetType()->derivedTypeSpec();
 
   if (typeSpec) {
-    mapperIdName = typeSpec->name().ToString() + llvm::omp::OmpDefaultMapperName;
+    mapperIdName =
+        typeSpec->name().ToString() + llvm::omp::OmpDefaultMapperName;
     if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName))
       mapperIdName = converter.mangleName(mapperIdName, sym->owner());
   }
``````````
https://github.com/llvm/llvm-project/pull/141964 From flang-commits at lists.llvm.org Thu May 29 08:30:17 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Thu, 29 May 2025 08:30:17 -0700 (PDT) Subject: [flang-commits] [flang] [OpenMP][Flang] Fix semantic check and scoping for declare mappers (PR #140560) In-Reply-To: Message-ID: <68387d89.170a0220.3f351.607a@mx.google.com> TIFitis wrote: > > One place I can think of placing the string is `llvm/include/llvm/Frontend/OpenMP/OMPConstants.h`, do you have any other suggestions? > > Sounds OK to me. Alternative locations (if you omit parser changes) are in include/flang/Semantics/openmp*.h. > I've created a PR for this here. For now the string is used in the parse tree as well so I've added it to OMPConstants.h, perhaps we can move it to include/flang/Semantics/openmp*.h in the future. > > As for retaining std::optional, the reason to drop std::optional is that the name is optional in source but always carries a default value attached to it along with a symbol. With the name field being optional, we end up with a dangling default symbol which makes symbol lookups weird along with scoping rules. > > A possibility is to change it from `std::optional` to `Name`. > > > Secondly, I've changed it to string instead of Name, is that the name field doesn't have a data member to store non-static names, since it only has a char*. If we keep it as Name we will need other solutions in place to store the run-time generated string. The earlier iteration of this PR tried to do so but there is no clean way of doing it, and requires adding a new static storage. > > We can add it to the Details in symbol.h like WithBindName, WithOmpDeclarative, OpenACCRoutineInfo etc. I experimented with changing it to just `Name`, but the issue is that I couldn’t find a way to assign it a unique default name based on the derived type during parsing. Since `Name` is immutable, it can't be updated later. 
Given that the default mapper, and declare mappers in general, can be referenced both implicitly and explicitly, having a consistent naming scheme seems vital to me, especially for preserving scoping rules. Otherwise, we’d have to account for multiple special cases during lowering, rather than relying on a single solution that works for both named and default mappers.

Using a `string` feels like a clean approach to me, with the small trade-off of being inconsistent with other similar name fields. However, since the name field for a declare mapper is unique due to its default naming scheme, this seems like a worthwhile compromise.

That said, if you'd still prefer it to be changed to `Name`, I'm happy to revert this patch and let someone else take over, as I’m not too familiar with the parsing side of things. Thanks! 🙂

https://github.com/llvm/llvm-project/pull/140560

From flang-commits at lists.llvm.org Thu May 29 08:47:52 2025
From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits)
Date: Thu, 29 May 2025 08:47:52 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380)
In-Reply-To:
Message-ID: <683881a8.630a0220.1b2b2f.5678@mx.google.com>

================
@@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks {
   bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions.
   bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions.
+  std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for
----------------
eugeneepshteyn wrote:

There is no need to initialize this to an empty string: a default-constructed `std::string` is already empty, so calling `empty()` on it returns `true`.
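As a small illustration of the point about `std::string` initialization, here is a minimal sketch in standard C++. The struct and member names (`PipelineConfigSketch`, `explicitInit`, `defaultInit`) are hypothetical stand-ins for this example only and are not taken from flang.

```cpp
#include <string>

// Hypothetical stand-in for a configuration struct; names are illustrative
// only and do not correspond to flang's MLIRToLLVMPassPipelineConfig.
struct PipelineConfigSketch {
  std::string explicitInit = ""; // explicitly initialized from an empty literal
  std::string defaultInit;       // default-constructed, no initializer
};

// Returns true when both members start out empty and compare equal.
bool bothStartEmpty() {
  PipelineConfigSketch cfg;
  return cfg.explicitInit.empty() && cfg.defaultInit.empty() &&
         cfg.explicitInit == cfg.defaultInit;
}
```

Both spellings produce the same initial state, so the choice between them is a matter of style and consistency with neighbouring members.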
https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Thu May 29 09:30:28 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Thu, 29 May 2025 09:30:28 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <68388ba4.170a0220.3f450.9620@mx.google.com> https://github.com/abidh updated https://github.com/llvm/llvm-project/pull/140556 >From 5d20af48673adebc2ab3e1a6c8442f67d84f1847 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Mon, 19 May 2025 15:21:25 +0100 Subject: [PATCH 1/3] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. This PR add functionality to change flang command line using environment variable `FCC_OVERRIDE_OPTIONS`. It is quite similar to what `CCC_OVERRIDE_OPTIONS` does for clang. The `FCC_OVERRIDE_OPTIONS` seemed like the most obvious name to me but I am open to other ideas. The `applyOverrideOptions` now takes an extra argument that is the name of the environment variable. Previously `CCC_OVERRIDE_OPTIONS` was hardcoded. --- clang/include/clang/Driver/Driver.h | 2 +- clang/lib/Driver/Driver.cpp | 4 ++-- clang/tools/driver/driver.cpp | 2 +- flang/test/Driver/fcc_override.f90 | 12 ++++++++++++ flang/tools/flang-driver/driver.cpp | 7 +++++++ 5 files changed, 23 insertions(+), 4 deletions(-) create mode 100644 flang/test/Driver/fcc_override.f90 diff --git a/clang/include/clang/Driver/Driver.h b/clang/include/clang/Driver/Driver.h index b463dc2a93550..7ca848f11b561 100644 --- a/clang/include/clang/Driver/Driver.h +++ b/clang/include/clang/Driver/Driver.h @@ -879,7 +879,7 @@ llvm::Error expandResponseFiles(SmallVectorImpl &Args, /// See applyOneOverrideOption. 
void applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideOpts, - llvm::StringSet<> &SavedStrings, + llvm::StringSet<> &SavedStrings, StringRef EnvVar, raw_ostream *OS = nullptr); } // end namespace driver diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp index a648cc928afdc..a8fea35926a0d 100644 --- a/clang/lib/Driver/Driver.cpp +++ b/clang/lib/Driver/Driver.cpp @@ -7289,7 +7289,7 @@ static void applyOneOverrideOption(raw_ostream &OS, void driver::applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideStr, llvm::StringSet<> &SavedStrings, - raw_ostream *OS) { + StringRef EnvVar, raw_ostream *OS) { if (!OS) OS = &llvm::nulls(); @@ -7298,7 +7298,7 @@ void driver::applyOverrideOptions(SmallVectorImpl &Args, OS = &llvm::nulls(); } - *OS << "### CCC_OVERRIDE_OPTIONS: " << OverrideStr << "\n"; + *OS << "### " << EnvVar << ": " << OverrideStr << "\n"; // This does not need to be efficient. diff --git a/clang/tools/driver/driver.cpp b/clang/tools/driver/driver.cpp index 82f47ab973064..81964c65c2892 100644 --- a/clang/tools/driver/driver.cpp +++ b/clang/tools/driver/driver.cpp @@ -305,7 +305,7 @@ int clang_main(int Argc, char **Argv, const llvm::ToolContext &ToolContext) { if (const char *OverrideStr = ::getenv("CCC_OVERRIDE_OPTIONS")) { // FIXME: Driver shouldn't take extra initial argument. driver::applyOverrideOptions(Args, OverrideStr, SavedStrings, - &llvm::errs()); + "CCC_OVERRIDE_OPTIONS", &llvm::errs()); } std::string Path = GetExecutablePath(ToolContext.Path, CanonicalPrefixes); diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 new file mode 100644 index 0000000000000..55a07803fdde5 --- /dev/null +++ b/flang/test/Driver/fcc_override.f90 @@ -0,0 +1,12 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s +! 
RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR + +! CHECK: "-fc1" +! CHECK-NOT: "-Oignore" +! CHECK: "-Omagic" +! CHECK-NOT: "-Oignore" + +! RM-WERROR: ### FCC_OVERRIDE_OPTIONS: x-Werror +-g +! RM-WERROR-NEXT: ### Deleting argument -Werror +! RM-WERROR-NEXT: ### Adding argument -g at end +! RM-WERROR-NOT: "-Werror" diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..ad0efa3279cef 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -111,6 +111,13 @@ int main(int argc, const char **argv) { } } + llvm::StringSet<> SavedStrings; + // Handle FCC_OVERRIDE_OPTIONS, used for editing a command line behind the + // scenes. + if (const char *OverrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) + clang::driver::applyOverrideOptions(args, OverrideStr, SavedStrings, + "FCC_OVERRIDE_OPTIONS", &llvm::errs()); + // Not in the frontend mode - continue in the compiler driver mode. // Create DiagnosticsEngine for the compiler driver >From d1f2c9b8abd2690612a4b886a7a85b8e7f57d359 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 11:05:57 +0100 Subject: [PATCH 2/3] Add documentation for FCC_OVERRIDE_OPTIONS. --- flang/docs/FlangDriver.md | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index 97744f0bee069..f93df8701e677 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -614,3 +614,28 @@ nvfortran defines `-fast` as - `-Mcache_align`: there is no equivalent flag in Flang or Clang. - `-Mflushz`: flush-to-zero mode - when `-ffast-math` is specified, Flang will link to `crtfastmath.o` to ensure denormal numbers are flushed to zero. + + +## FCC_OVERRIDE_OPTIONS + +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of +edits to the input argument lists. 
The value of this environment variable is +a space separated list of edits to perform. These edits are applied in order to +the input argument lists. Edits should be one of the following forms: + +- `#`: Silence information about the changes to the command line arguments. + +- `^FOO`: Add `FOO` as a new argument at the beginning of the command line. + +- `+FOO`: Add `FOO` as a new argument at the end of the command line. + +- `s/XXX/YYY/`: Substitute the regular expression `XXX` with `YYY` in the + command line. + +- `xOPTION`: Removes all instances of the literal argument `OPTION`. + +- `XOPTION`: Removes all instances of the literal argument `OPTION`, and the + following argument. + +- `Ox`: Removes all flags matching `O` or `O[sz0-9]` and adds `Ox` at the end + of the command line. \ No newline at end of file >From d093a6ac74f8c0058e134ec55fbbf2b8edf9b477 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 17:28:46 +0100 Subject: [PATCH 3/3] Mention that effect on options added by the config files. --- flang/docs/FlangDriver.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index f93df8701e677..0302cb1dc33b9 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -638,4 +638,6 @@ the input argument lists. Edits should be one of the following forms: following argument. - `Ox`: Removes all flags matching `O` or `O[sz0-9]` and adds `Ox` at the end - of the command line. \ No newline at end of file + of the command line. + +This environment variable does not affect the options added by the config files. From flang-commits at lists.llvm.org Thu May 29 09:31:44 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Thu, 29 May 2025 09:31:44 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. 
(PR #140556)
In-Reply-To:
Message-ID: <68388bf0.170a0220.1d29f5.9525@mx.google.com>

================
@@ -614,3 +614,28 @@ nvfortran defines `-fast` as
 - `-Mcache_align`: there is no equivalent flag in Flang or Clang.
 - `-Mflushz`: flush-to-zero mode - when `-ffast-math` is specified, Flang will link to `crtfastmath.o` to ensure denormal numbers are flushed to zero.
+
+
+## FCC_OVERRIDE_OPTIONS
+
+The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of
+edits to the input argument lists. The value of this environment variable is
----------------
abidh wrote:

It does not affect the options added by the configuration files. The behavior is consistent with clang and `CCC_OVERRIDE_OPTIONS`. I added a line in the documentation for it.

https://github.com/llvm/llvm-project/pull/140556

From flang-commits at lists.llvm.org Thu May 29 10:21:37 2025
From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits)
Date: Thu, 29 May 2025 10:21:37 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cuda] Allow compiler directive in cuda code (PR #141991)
Message-ID:

https://github.com/clementval created https://github.com/llvm/llvm-project/pull/141991

None

>From 6433cf5650c06739e382f789ab735c8bfdae1d62 Mon Sep 17 00:00:00 2001
From: Valentin Clement
Date: Thu, 29 May 2025 10:20:53 -0700
Subject: [PATCH] [flang][cuda] Allow compiler directive in cuda code

---
 flang/lib/Semantics/check-cuda.cpp |  3 +++
 flang/test/Semantics/cuf09.cuf     | 11 +++++++++++
 2 files changed, 14 insertions(+)

diff --git a/flang/lib/Semantics/check-cuda.cpp b/flang/lib/Semantics/check-cuda.cpp
index fd1ec2b2c69f8..c024640af1220 100644
--- a/flang/lib/Semantics/check-cuda.cpp
+++ b/flang/lib/Semantics/check-cuda.cpp
@@ -321,6 +321,9 @@ template class DeviceContextChecker {
               Check(std::get(c.t));
             }
           },
+          [&](const common::Indirection &x) {
+            // TODO(CUDA): Check for unsupported compiler directive here.
+ }, [&](const auto &x) { if (auto source{parser::GetSource(x)}) { context_.Say(*source, diff --git a/flang/test/Semantics/cuf09.cuf b/flang/test/Semantics/cuf09.cuf index 4a6d9ab09387d..1e23819f9afe8 100644 --- a/flang/test/Semantics/cuf09.cuf +++ b/flang/test/Semantics/cuf09.cuf @@ -18,6 +18,17 @@ module m !WARNING: I/O statement might not be supported on device write(12,'(10F4.1)'), x end + attributes(global) subroutine devsub3(n) + implicit none + integer :: n + integer :: i, ig, iGrid + iGrid = gridDim%x*blockDim%x + ig = (blockIdx%x-1)*blockDim%x + threadIdx%x + + !dir$ nounroll + do i = ig, n, iGrid + end do + end subroutine attributes(global) subroutine hostglobal(a) integer :: a(*) i = threadIdx%x From flang-commits at lists.llvm.org Thu May 29 10:22:13 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 10:22:13 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Allow compiler directive in cuda code (PR #141991) In-Reply-To: Message-ID: <683897c5.170a0220.1741f1.a70d@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Valentin Clement (バレンタイン クレメン) (clementval)
Changes

---
Full diff: https://github.com/llvm/llvm-project/pull/141991.diff

2 Files Affected:

- (modified) flang/lib/Semantics/check-cuda.cpp (+3)
- (modified) flang/test/Semantics/cuf09.cuf (+11)

``````````diff
diff --git a/flang/lib/Semantics/check-cuda.cpp b/flang/lib/Semantics/check-cuda.cpp
index fd1ec2b2c69f8..c024640af1220 100644
--- a/flang/lib/Semantics/check-cuda.cpp
+++ b/flang/lib/Semantics/check-cuda.cpp
@@ -321,6 +321,9 @@ template class DeviceContextChecker {
               Check(std::get(c.t));
             }
           },
+          [&](const common::Indirection &x) {
+            // TODO(CUDA): Check for unsupported compiler directive here.
+          },
           [&](const auto &x) {
             if (auto source{parser::GetSource(x)}) {
               context_.Say(*source,
diff --git a/flang/test/Semantics/cuf09.cuf b/flang/test/Semantics/cuf09.cuf
index 4a6d9ab09387d..1e23819f9afe8 100644
--- a/flang/test/Semantics/cuf09.cuf
+++ b/flang/test/Semantics/cuf09.cuf
@@ -18,6 +18,17 @@ module m
     !WARNING: I/O statement might not be supported on device
     write(12,'(10F4.1)'), x
   end
+  attributes(global) subroutine devsub3(n)
+    implicit none
+    integer :: n
+    integer :: i, ig, iGrid
+    iGrid = gridDim%x*blockDim%x
+    ig = (blockIdx%x-1)*blockDim%x + threadIdx%x
+
+    !dir$ nounroll
+    do i = ig, n, iGrid
+    end do
+  end subroutine
   attributes(global) subroutine hostglobal(a)
     integer :: a(*)
     i = threadIdx%x
``````````
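The change above works by placing a type-specific lambda ahead of the generic `const auto &` fallback in a visitor, so the directive node no longer reaches the branch that emits a diagnostic. A minimal sketch of that pattern in standard C++17 follows. It uses `std::variant` and `std::visit` rather than flang's own visitor utilities, and every type and function name in it (`AssignmentStmt`, `CompilerDirective`, `PrintStmt`, `Node`, `Overloaded`, `checkNode`) is a hypothetical stand-in, not flang's real parse-tree API.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-ins for parse-tree node types; not flang's real classes.
struct AssignmentStmt {};
struct CompilerDirective {};
struct PrintStmt {};

using Node = std::variant<AssignmentStmt, CompilerDirective, PrintStmt>;

// Standard C++17 "overloaded" idiom: inherit operator() from every lambda.
template <class... Ts> struct Overloaded : Ts... {
  using Ts::operator()...;
};
template <class... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

// Returns a diagnostic for the node, or an empty string when it is accepted.
std::string checkNode(const Node &node) {
  return std::visit(
      Overloaded{
          [](const AssignmentStmt &) { return std::string{}; },
          // Exact-type overload: preferred over the generic lambda below,
          // so a directive is accepted without a diagnostic.
          [](const CompilerDirective &) { return std::string{}; },
          // Catch-all for every node type without a dedicated overload.
          [](const auto &) {
            return std::string{"statement not allowed in device code"};
          },
      },
      node);
}
```

For example, `checkNode(Node{CompilerDirective{}})` yields an empty string, while `checkNode(Node{PrintStmt{}})` falls through to the catch-all. Overload resolution prefers the non-template lambda when both match equally well, which is why adding one specific overload is enough to exempt a node type.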
https://github.com/llvm/llvm-project/pull/141991

From flang-commits at lists.llvm.org Thu May 29 10:26:50 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Thu, 29 May 2025 10:26:50 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <683898da.050a0220.3a3799.b106@mx.google.com>

================
@@ -16,10 +16,18 @@
 #include "flang/Semantics/openmp-modifiers.h"
 #include "flang/Semantics/tools.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringSwitch.h"
 #include

 namespace Fortran::semantics {

+static_assert(std::is_same_v>);
----------------
kparzysz wrote:

I deleted that. `SomeExpr` is a typedef defined independently (but identically) in several places, and I wanted to make sure that we detect if the definition changes. We don't need that check: if the definitions start to disagree, it would cause lots of other issues.

https://github.com/llvm/llvm-project/pull/137852

From flang-commits at lists.llvm.org Thu May 29 10:29:10 2025
From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits)
Date: Thu, 29 May 2025 10:29:10 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380)
In-Reply-To:
Message-ID: <68389966.050a0220.88554.b5b1@mx.google.com>

================
@@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks {
   bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions.
   bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions.
+  std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for
----------------
mcinally wrote:

If there are no objections, I'll leave this. The other `std::string` members are initialized the same way, so it is probably best to stay in that style.
https://github.com/llvm/llvm-project/pull/141380

From flang-commits at lists.llvm.org Thu May 29 10:30:04 2025
From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits)
Date: Thu, 29 May 2025 10:30:04 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380)
In-Reply-To:
Message-ID: <6838999c.630a0220.35b08b.7991@mx.google.com>

mcinally wrote:

@tarunprabhu Would you mind mashing the "Merge" button? It looks like getting commit access approval is taking longer than expected.

https://github.com/llvm/llvm-project/pull/141380

From flang-commits at lists.llvm.org Thu May 29 10:32:24 2025
From: flang-commits at lists.llvm.org (Pranav Bhandarkar via flang-commits)
Date: Thu, 29 May 2025 10:32:24 -0700 (PDT)
Subject: [flang-commits] [flang] [Flang] - Handle `BoxCharType` in `fir.box_offset` op (PR #141713)
In-Reply-To:
Message-ID: <68389a28.170a0220.28c75b.aa79@mx.google.com>

https://github.com/bhandarkar-pranav edited https://github.com/llvm/llvm-project/pull/141713

From flang-commits at lists.llvm.org Thu May 29 10:32:56 2025
From: flang-commits at lists.llvm.org (Pranav Bhandarkar via flang-commits)
Date: Thu, 29 May 2025 10:32:56 -0700 (PDT)
Subject: [flang-commits] [flang] [Flang] - Handle `BoxCharType` in `fir.box_offset` op (PR #141713)
In-Reply-To:
Message-ID: <68389a48.a70a0220.79a88.b095@mx.google.com>

https://github.com/bhandarkar-pranav edited https://github.com/llvm/llvm-project/pull/141713

From flang-commits at lists.llvm.org Thu May 29 10:37:06 2025
From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits)
Date: Thu, 29 May 2025 10:37:06 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380)
In-Reply-To:
Message-ID: <68389b42.170a0220.52293.79f9@mx.google.com>

tarunprabhu wrote:

> @tarunprabhu Would you mind mashing the "Merge" button?
It looks like getting commit access approval is taking longer than expected. You may need to rebase before I can merge. https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Thu May 29 10:37:33 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 29 May 2025 10:37:33 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68389b5d.170a0220.2389b1.b24e@mx.google.com> ================ @@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for ---------------- tarunprabhu wrote: I think it is better to stay in the style of the other initializations. https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Thu May 29 10:38:16 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Thu, 29 May 2025 10:38:16 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <68389b88.050a0220.11a714.b35d@mx.google.com> https://github.com/mcinally updated https://github.com/llvm/llvm-project/pull/141380 >From 9f8619cb54a3a11e4c90af7f5156141ddc59e4d4 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH 1/3] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this options is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute. 
--- clang/include/clang/Driver/Options.td | 2 +- flang/include/flang/Frontend/CodeGenOptions.h | 3 +++ .../include/flang/Optimizer/Transforms/Passes.td | 4 ++++ flang/include/flang/Tools/CrossToolHelpers.h | 3 +++ flang/lib/Frontend/CompilerInvocation.cpp | 14 ++++++++++++++ flang/lib/Frontend/FrontendActions.cpp | 2 ++ flang/lib/Optimizer/Passes/Pipelines.cpp | 2 +- flang/lib/Optimizer/Transforms/FunctionAttr.cpp | 5 +++++ flang/test/Driver/prefer-vector-width.f90 | 16 ++++++++++++++++ mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td | 1 + mlir/lib/Target/LLVMIR/ModuleImport.cpp | 4 ++++ mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | 3 +++ 12 files changed, 57 insertions(+), 2 deletions(-) create mode 100644 flang/test/Driver/prefer-vector-width.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 22261621df092..b0b642796010b 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -5480,7 +5480,7 @@ def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group, " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">, MarshallingInfoStringVector>; def mprefer_vector_width_EQ : Joined<["-"], "mprefer-vector-width=">, Group, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Specifies preferred vector width for auto-vectorization. Defaults to 'none' which allows target specific decisions.">, MarshallingInfoString>; def mstack_protector_guard_EQ : Joined<["-"], "mstack-protector-guard=">, Group, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -53,6 +53,9 @@ class CodeGenOptions : public CodeGenOptionsBase { /// The paths to the pass plugins that were registered using -fpass-plugin. 
std::vector LLVMPassPlugins; + // The prefered vector width, if requested by -mprefer-vector-width. + std::string PreferVectorWidth; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index b251534e1a8f6..2e932d9ad4a26 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -418,6 +418,10 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"preferVectorWidth", "prefer-vector-width", "std::string", + /*default=*/"", + "Set the prefer-vector-width attribute on functions in the " + "module.">, Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"", "Set the tune-cpu attribute on functions in the module.">, ]; diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 118695bbe2626..058024a4a04c5 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; InstrumentFunctionExit = "__cyg_profile_func_exit"; @@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. 
bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for + ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. bool EnableOpenMP = false; ///< Enable OpenMP lowering. std::string InstrumentFunctionEntry = diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..918323d663610 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -309,6 +309,20 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ)) opts.LLVMPassPlugins.push_back(a->getValue()); + // -mprefer_vector_width option + if (const llvm::opt::Arg *a = args.getLastArg( + clang::driver::options::OPT_mprefer_vector_width_EQ)) { + llvm::StringRef s = a->getValue(); + unsigned Width; + if (s == "none") + opts.PreferVectorWidth = "none"; + else if (s.getAsInteger(10, Width)) + diags.Report(clang::diag::err_drv_invalid_value) + << a->getAsString(args) << a->getValue(); + else + opts.PreferVectorWidth = s.str(); + } + // -fembed-offload-object option for (auto *a : args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 38dfaadf1dff9..012d0fdfe645f 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -741,6 +741,8 @@ void CodeGenAction::generateLLVMIR() { config.VScaleMax = vsr->second; } + config.PreferVectorWidth = opts.PreferVectorWidth; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..5c1e1b9f77efb 100644 --- 
a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -354,7 +354,7 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - ""})); + config.PreferVectorWidth, ""})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 43e4c1a7af3cd..13f447cf738b4 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -36,6 +36,7 @@ class FunctionAttrPass : public fir::impl::FunctionAttrBase { approxFuncFPMath = options.approxFuncFPMath; noSignedZerosFPMath = options.noSignedZerosFPMath; unsafeFPMath = options.unsafeFPMath; + preferVectorWidth = options.preferVectorWidth; } FunctionAttrPass() {} void runOnOperation() override; @@ -102,6 +103,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!preferVectorWidth.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, preferVectorWidth)); LLVM_DEBUG(llvm::dbgs() << "=== End " DEBUG_TYPE " ===\n"); } diff --git a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90 new file mode 100644 index 0000000000000..d0f5fd28db826 --- /dev/null +++ b/flang/test/Driver/prefer-vector-width.f90 @@ -0,0 +1,16 @@ +! Test that -mprefer-vector-width works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! RUN: %flang_fc1 -mprefer-vector-width=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! 
RUN: %flang_fc1 -mprefer-vector-width=128 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-128 +! RUN: %flang_fc1 -mprefer-vector-width=256 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-256 +! RUN: not %flang_fc1 -mprefer-vector-width=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INVALID + +subroutine func +end subroutine func + +! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} } +! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } +! CHECK-128: attributes #0 = { "prefer-vector-width"="128" } +! CHECK-256: attributes #0 = { "prefer-vector-width"="256" } +! CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index 6fde45ce5c556..c0324d561b77b 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1893,6 +1893,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$frame_pointer, OptionalAttr:$target_cpu, OptionalAttr:$tune_cpu, + OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index b049064fbd31c..85417da798b22 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, funcOp.setTargetFeaturesAttr( LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width"); + attr.isStringAttribute()) + funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp 
b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index 047e870b7dcd8..2b7f0b11613aa 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1540,6 +1540,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) { if (auto tuneCpu = func.getTuneCpu()) llvmFunc->addFnAttr("tune-cpu", *tuneCpu); + if (auto preferVectorWidth = func.getPreferVectorWidth()) + llvmFunc->addFnAttr("prefer-vector-width", *preferVectorWidth); + if (auto attr = func.getVscaleRange()) llvmFunc->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs( getLLVMContext(), attr->getMinRange().getInt(), >From 5bdf615715733351bae8f959f0a06a8449526bb8 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH 2/3] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this options is equivalent to Clang's and it is implemented by setting the "prefer-vector-width" function attribute. 
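For reference, the validation that the first patch in this series applies to the option value (accept the literal `none`, accept an unsigned decimal integer, reject anything else) can be sketched in standard C++ as below. This is only an approximation under stated assumptions: the function name `parsePreferVectorWidth` is hypothetical, and `std::from_chars` stands in for `llvm::StringRef::getAsInteger`, so edge cases may differ from the real driver.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical helper, not part of flang. Returns the string to store for the
// "prefer-vector-width" attribute, or std::nullopt when the value is rejected.
std::optional<std::string> parsePreferVectorWidth(std::string_view value) {
  if (value == "none")
    return std::string{"none"};

  unsigned width = 0;
  const char *first = value.data();
  const char *last = first + value.size();
  // The whole value must parse as an unsigned base-10 integer.
  auto [ptr, ec] = std::from_chars(first, last, width, 10);
  if (ec != std::errc{} || ptr != last)
    return std::nullopt;

  // Like the patch, keep the original spelling rather than the parsed number.
  return std::string{value};
}
```

With this sketch, `128` and `none` are accepted unchanged, while `xxx`, an empty value, or a value with trailing characters such as `256bit` is rejected, which corresponds to the `invalid value` diagnostic exercised by the driver test in the patch.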
--- flang/lib/Frontend/CompilerInvocation.cpp | 4 ++-- mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll | 9 +++++++++ mlir/test/Target/LLVMIR/prefer-vector-width.mlir | 8 ++++++++ 3 files changed, 19 insertions(+), 2 deletions(-) create mode 100644 mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll create mode 100644 mlir/test/Target/LLVMIR/prefer-vector-width.mlir diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 918323d663610..90a002929eff0 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -313,10 +313,10 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, if (const llvm::opt::Arg *a = args.getLastArg( clang::driver::options::OPT_mprefer_vector_width_EQ)) { llvm::StringRef s = a->getValue(); - unsigned Width; + unsigned width; if (s == "none") opts.PreferVectorWidth = "none"; - else if (s.getAsInteger(10, Width)) + else if (s.getAsInteger(10, width)) diags.Report(clang::diag::err_drv_invalid_value) << a->getAsString(args) << a->getValue(); else diff --git a/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll new file mode 100644 index 0000000000000..831aa57345a3f --- /dev/null +++ b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @prefer_vector_width() +; CHECK: prefer_vector_width = "128" +define void @prefer_vector_width() #0 { + ret void +} + +attributes #0 = { "prefer-vector-width"="128" } diff --git a/mlir/test/Target/LLVMIR/prefer-vector-width.mlir b/mlir/test/Target/LLVMIR/prefer-vector-width.mlir new file mode 100644 index 0000000000000..7410e8139fd31 --- /dev/null +++ b/mlir/test/Target/LLVMIR/prefer-vector-width.mlir @@ -0,0 +1,8 @@ +// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s + +// CHECK: define void @prefer_vector_width() 
#[[ATTRS:.*]] { +// CHECK: attributes #[[ATTRS]] = { "prefer-vector-width"="128" } + +llvm.func @prefer_vector_width() attributes {prefer_vector_width = "128"} { + llvm.return +} From befabca370ba227262859aec47e4fbc93759b3a0 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 24 May 2025 13:35:13 -0700 Subject: [PATCH 3/3] [flang] Add support for -mprefer-vector-width= This patch adds support for the -mprefer-vector-width= command line option. The parsing of this option is equivalent to Clang's, and it is implemented by setting the "prefer-vector-width" function attribute. --- mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll index 831aa57345a3f..e30ef04924b81 100644 --- a/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll +++ b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll @@ -1,7 +1,7 @@ ; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s ; CHECK-LABEL: llvm.func @prefer_vector_width() -; CHECK: prefer_vector_width = "128" +; CHECK-SAME: prefer_vector_width = "128" define void @prefer_vector_width() #0 { ret void } From flang-commits at lists.llvm.org Thu May 29 10:44:04 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 29 May 2025 10:44:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Allow compiler directive in cuda code (PR #141991) In-Reply-To: Message-ID: <68389ce4.170a0220.230a5f.ae41@mx.google.com> https://github.com/klausler approved this pull request. 
https://github.com/llvm/llvm-project/pull/141991 From flang-commits at lists.llvm.org Thu May 29 10:46:47 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 10:46:47 -0700 (PDT) Subject: [flang-commits] [flang] ef11bdc - [flang][cuda] Allow compiler directive in cuda code (#141991) Message-ID: <68389d87.170a0220.352b84.b601@mx.google.com> Author: Valentin Clement (バレンタイン クレメン) Date: 2025-05-29T10:46:44-07:00 New Revision: ef11bdc7fe43ed9d418795c56668a1c6f8c6e35f URL: https://github.com/llvm/llvm-project/commit/ef11bdc7fe43ed9d418795c56668a1c6f8c6e35f DIFF: https://github.com/llvm/llvm-project/commit/ef11bdc7fe43ed9d418795c56668a1c6f8c6e35f.diff LOG: [flang][cuda] Allow compiler directive in cuda code (#141991) Added: Modified: flang/lib/Semantics/check-cuda.cpp flang/test/Semantics/cuf09.cuf Removed: ################################################################################ diff --git a/flang/lib/Semantics/check-cuda.cpp b/flang/lib/Semantics/check-cuda.cpp index fd1ec2b2c69f8..c024640af1220 100644 --- a/flang/lib/Semantics/check-cuda.cpp +++ b/flang/lib/Semantics/check-cuda.cpp @@ -321,6 +321,9 @@ template class DeviceContextChecker { Check(std::get(c.t)); } }, + [&](const common::Indirection &x) { + // TODO(CUDA): Check for unsupported compiler directive here. 
+ }, [&](const auto &x) { if (auto source{parser::GetSource(x)}) { context_.Say(*source, diff --git a/flang/test/Semantics/cuf09.cuf b/flang/test/Semantics/cuf09.cuf index 4a6d9ab09387d..1e23819f9afe8 100644 --- a/flang/test/Semantics/cuf09.cuf +++ b/flang/test/Semantics/cuf09.cuf @@ -18,6 +18,17 @@ module m !WARNING: I/O statement might not be supported on device write(12,'(10F4.1)'), x end + attributes(global) subroutine devsub3(n) + implicit none + integer :: n + integer :: i, ig, iGrid + iGrid = gridDim%x*blockDim%x + ig = (blockIdx%x-1)*blockDim%x + threadIdx%x + + !dir$ nounroll + do i = ig, n, iGrid + end do + end subroutine attributes(global) subroutine hostglobal(a) integer :: a(*) i = threadIdx%x From flang-commits at lists.llvm.org Thu May 29 10:46:50 2025 From: flang-commits at lists.llvm.org (Valentin Clement バレンタイン クレメン via flang-commits) Date: Thu, 29 May 2025 10:46:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cuda] Allow compiler directive in cuda code (PR #141991) In-Reply-To: Message-ID: <68389d8a.050a0220.14b51.68a6@mx.google.com> https://github.com/clementval closed https://github.com/llvm/llvm-project/pull/141991 From flang-commits at lists.llvm.org Thu May 29 11:07:35 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Thu, 29 May 2025 11:07:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <6838a267.170a0220.2c5695.819a@mx.google.com> ================ @@ -0,0 +1,178 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + // There should be exactly one user of WasCopied - the 
corresponding + // CopyOutOp. + if (!copyIn.getWasCopied().hasOneUse()) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's WasCopied has no single user"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().user_begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type sequenceType = + hlfir::getFortranElementOrSequenceType(inputVariable.getType()); + fir::BoxType resultBoxType = fir::BoxType::get(sequenceType); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultBoxType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value result = inputVariable; + if (fir::isPointerType(inputVariable.getType())) { ---------------- vzakhari wrote: I think a `ReboxOp` should be just enough instead of all the code for the `EmboxOp` (see `genNonOptionalCopyIn` in `ConvertToFIR.cpp`). Please also make sure that you handle assumed-rank cases and optional vars properly (please add tests for those as well). 
https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Thu May 29 11:07:35 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Thu, 29 May 2025 11:07:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <6838a267.170a0220.846ca.be21@mx.google.com> ================ @@ -0,0 +1,178 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (!copyIn.getWasCopied().hasOneUse()) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's WasCopied has no single user"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().user_begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type sequenceType = + hlfir::getFortranElementOrSequenceType(inputVariable.getType()); + fir::BoxType resultBoxType = fir::BoxType::get(sequenceType); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultBoxType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value result = inputVariable; + if (fir::isPointerType(inputVariable.getType())) { + auto boxAddr = builder.create(loc, inputVariable); + fir::ReferenceType refTy = fir::ReferenceType::get(sequenceType); + mlir::Value refVal = builder.createConvert(loc, refTy, boxAddr); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + result = builder.create(loc, resultBoxType, refVal, + shape); + } + builder.create( + loc, mlir::ValueRange{result, builder.createBool(loc, false)}); + }) + .genElse([&] { + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + llvm::StringRef tmpName{".tmp.copy_in"}; + llvm::SmallVector lenParams; + mlir::Value alloc = builder.createHeapTemporary( + loc, sequenceType, tmpName, extents, lenParams); + + auto declareOp = builder.create( + loc, alloc, tmpName, shape, lenParams, + /*dummy_scope=*/nullptr); + hlfir::Entity temp{declareOp.getBase()}; + hlfir::LoopNest loopNest = + hlfir::genLoopNest(loc, builder, extents, /*isUnordered=*/true, + 
flangomp::shouldUseWorkshareLowering(copyIn), + /*couldVectorize=*/false); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. + if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = builder.create(loc, resultBoxType, refVal, + shape); + } + + builder.create( + loc, mlir::ValueRange{result, builder.createBool(loc, true)}); + }) + .getResults(); + + mlir::OpResult resultBox = results[0]; + mlir::OpResult needsCleanup = results[1]; + + auto alloca = builder.create(loc, resultBox.getType()); ---------------- vzakhari wrote: Please use `copyIn.getTempBox()` as a temporary box instead. https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Thu May 29 11:07:35 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Thu, 29 May 2025 11:07:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <6838a267.170a0220.182745.be09@mx.google.com> ================ @@ -0,0 +1,178 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. +//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + // There should be exactly one user of WasCopied - the 
corresponding + // CopyOutOp. + if (!copyIn.getWasCopied().hasOneUse()) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's WasCopied has no single user"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().user_begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type sequenceType = + hlfir::getFortranElementOrSequenceType(inputVariable.getType()); + fir::BoxType resultBoxType = fir::BoxType::get(sequenceType); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultBoxType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value result = inputVariable; + if (fir::isPointerType(inputVariable.getType())) { + auto boxAddr = builder.create(loc, inputVariable); + fir::ReferenceType refTy = fir::ReferenceType::get(sequenceType); + mlir::Value refVal = builder.createConvert(loc, refTy, boxAddr); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + result = builder.create(loc, resultBoxType, refVal, + shape); + } + builder.create( + loc, mlir::ValueRange{result, builder.createBool(loc, false)}); + }) + .genElse([&] { + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + llvm::StringRef tmpName{".tmp.copy_in"}; + llvm::SmallVector lenParams; + mlir::Value alloc = builder.createHeapTemporary( + loc, sequenceType, tmpName, extents, lenParams); + + auto 
declareOp = builder.create( + loc, alloc, tmpName, shape, lenParams, + /*dummy_scope=*/nullptr); + hlfir::Entity temp{declareOp.getBase()}; + hlfir::LoopNest loopNest = + hlfir::genLoopNest(loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn), + /*couldVectorize=*/false); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. + if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = builder.create(loc, resultBoxType, refVal, + shape); + } + + builder.create( + loc, mlir::ValueRange{result, builder.createBool(loc, true)}); + }) + .getResults(); + + mlir::OpResult resultBox = results[0]; + mlir::OpResult needsCleanup = results[1]; + + auto alloca = builder.create(loc, resultBox.getType()); + auto store = builder.create(loc, resultBox, alloca); + copyOut->setOperand(0, store.getMemref()); ---------------- vzakhari wrote: I think this is not okay without using `start/finalizeOpModification`, but I do not think we need to do it at all. I suggest taking `getTempBox` and storing the result box into it after the generated IF. Then the `replaceOp` should just be enough, and no explicit update of the copy-out will be required. 
https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Thu May 29 11:07:36 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Thu, 29 May 2025 11:07:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <6838a268.170a0220.233464.b3ae@mx.google.com> ================ @@ -967,6 +969,15 @@ hlfir::LoopNest hlfir::genLoopNest(mlir::Location loc, auto ub = builder.createConvert(loc, indexType, extent); auto doLoop = builder.create(loc, one, ub, one, isUnordered); + if (!couldVectorize) { ---------------- vzakhari wrote: This seems to be an unrelated change. https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Thu May 29 11:22:43 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Thu, 29 May 2025 11:22:43 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <6838a5f3.170a0220.383749.b251@mx.google.com> mcinally wrote: Seems ok now: "Changes can be cleanly merged." If it's still not available, I may have messed something up on my side. I'll review the GitHub docs to see if I missed something... https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Thu May 29 11:25:54 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 29 May 2025 11:25:54 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <6838a6b2.170a0220.327f22.b1c7@mx.google.com> tarunprabhu wrote: > Seems ok now: "Changes can be cleanly merged." > > If it's still not available, I may have messed something up on my side. I'll review the GitHub docs to see if I missed something... 
It's probably fine, but I'd like to give the buildkites a chance to finish before I merge it. https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Thu May 29 11:44:18 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Thu, 29 May 2025 11:44:18 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [NFC][OpenMP] Move the default declare mapper name suffix to OMPConstants.h (PR #141964) In-Reply-To: Message-ID: <6838ab02.170a0220.258f79.b6cc@mx.google.com> https://github.com/kiranchandramohan approved this pull request. LG. Thanks for the change. https://github.com/llvm/llvm-project/pull/141964 From flang-commits at lists.llvm.org Thu May 29 11:50:33 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 29 May 2025 11:50:33 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6838ac79.050a0220.1cafa9.b896@mx.google.com> ================ @@ -2656,527 +2665,1857 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. 
+/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. +static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; } } + return SourcedActionStmt{}; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != 
variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); } - return false; + return SourcedActionStmt{}; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - 
common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; - } - } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update 
statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); - } +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. +static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; } - ErrIfAllocatableVariable(var); + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; + } else { + return std::nullopt; + } + }, + x->u); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] + // Extract the evaluate::Expr from ScalarLogicalExpr. 
+ auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. -void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const auto *v1 = GetExpr(context_, stmt1Var); - const auto *e1 = GetExpr(context_, stmt1Expr); - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - const auto *v2 = GetExpr(context_, stmt2Var); - const auto *e2 = GetExpr(context_, stmt2Expr); - - if (e1 && v1 && e2 && v2) { - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(v2, e2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } - if (!(*e1 == *v2)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(v1, e1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - if (!(*v1 == *e2)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); - } - } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } + return std::nullopt; } -} -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. 
+ return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; } } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; } } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); - } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); + return std::nullopt; } -} -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { 
- const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, - }, - x.u); + return std::nullopt; } -void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { - dirContext_.pop_back(); +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+};
+
+std::string ToString(Operator op) {
+  switch (op) {
+  case Operator::Add:
+    return "+";
+  case Operator::And:
+    return "AND";
+  case Operator::Associated:
+    return "ASSOCIATED";
+  case Operator::Div:
+    return "/";
+  case Operator::Eq:
+    return "==";
+  case Operator::Eqv:
+    return "EQV";
+  case Operator::Ge:
+    return ">=";
+  case Operator::Gt:
+    return ">";
+  case Operator::Identity:
+    return "identity";
----------------
kparzysz wrote:

Done

https://github.com/llvm/llvm-project/pull/137852

From flang-commits at lists.llvm.org Thu May 29 11:52:16 2025
From: flang-commits at lists.llvm.org (Raghu Maddhipatla via flang-commits)
Date: Thu, 29 May 2025 11:52:16 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [NFC][OpenMP] Move the default declare mapper name suffix to OMPConstants.h (PR #141964)
In-Reply-To:
Message-ID: <6838ace0.050a0220.282fa4.bf2b@mx.google.com>

https://github.com/raghavendhra approved this pull request.

https://github.com/llvm/llvm-project/pull/141964

From flang-commits at lists.llvm.org Thu May 29 11:56:21 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Thu, 29 May 2025 11:56:21 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <6838add5.050a0220.3bcf0a.bafa@mx.google.com>

================
@@ -777,5 +777,22 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs,
   }
   return false;
 }
+
+/// If the top-level operation (ignoring parentheses) is either an
+/// evaluate::FunctionRef, or a specialization of evaluate::Operation,
+/// then return the list of arguments (wrapped in SomeExpr). Otherwise,
+/// return the "expr" but with top-level parentheses stripped.
+std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr);
+
+/// Check if expr is same as x, or a sequence of Convert operations on x.
+bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x);
+
+/// Strip away any top-level Convert operations (if any exist) and return
+/// the input value. A ComplexConstructor(x, 0) is also considered as a
+/// convert operation.
+/// If the input is not Operation, Designator, FunctionRef or Constant,
+/// is returns std::nullopt.
+MaybeExpr GetConvertInput(const SomeExpr &x);
+
----------------
kparzysz wrote:

The implementations all use the "atomic" namespace in check-omp-structure.cpp, which I'd rather keep "private" to that file. If you think these declarations don't belong here, then maybe we could invent a new header omp-tools.h or something like that? Let me know what you think.

https://github.com/llvm/llvm-project/pull/137852

From flang-commits at lists.llvm.org Thu May 29 11:57:21 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Thu, 29 May 2025 11:57:21 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <6838ae11.170a0220.57d1f.8dcb@mx.google.com>

================
@@ -24,6 +24,12 @@
 // OpenMP Directives and Clauses
 namespace Fortran::parser {
+// Helper function to print the buffer contents starting at the current point.
+[[maybe_unused]] static std::string ahead(const ParseState &state) {
----------------
kparzysz wrote:

I have written/deleted this repeatedly when debugging parser issues, so I kept it around on purpose. Should I remove it?
https://github.com/llvm/llvm-project/pull/137852

From flang-commits at lists.llvm.org Thu May 29 11:59:29 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Thu, 29 May 2025 11:59:29 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <6838ae91.170a0220.38b7b3.b737@mx.google.com>

kparzysz wrote:

Addressed a subset of comments, mostly in check-omp-structure.cpp.

https://github.com/llvm/llvm-project/pull/137852

From flang-commits at lists.llvm.org Thu May 29 12:00:06 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Thu, 29 May 2025 12:00:06 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <6838aeb6.630a0220.ef545.85e5@mx.google.com>

================
@@ -2656,527 +2665,1857 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) {
   }
 }
-inline void OmpStructureChecker::ErrIfAllocatableVariable(
-    const parser::Variable &var) {
-  // Err out if the given symbol has
-  // ALLOCATABLE attribute
-  if (const auto *e{GetExpr(context_, var)})
-    for (const Symbol &symbol : evaluate::CollectSymbols(*e))
-      if (IsAllocatable(symbol)) {
-        const auto &designator =
-            std::get>(var.u);
-        const auto *dataRef =
-            std::get_if(&designator.value().u);
-        const parser::Name *name =
-            dataRef ? std::get_if(&dataRef->u) : nullptr;
-        if (name)
-          context_.Say(name->source,
-              "%s must not have ALLOCATABLE "
-              "attribute"_err_en_US,
-              name->ToString());
+/// parser::Block is a list of executable constructs, parser::BlockConstruct
+/// is Fortran's BLOCK/ENDBLOCK construct.
+/// Strip the outermost BlockConstructs, return the reference to the Block
+/// in the executable part of the innermost of the stripped constructs.
+/// Specifically, if the given `block` has a single entry (it's a list), and
+/// the entry is a BlockConstruct, get the Block contained within. Repeat
+/// this step as many times as possible.
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) {
+  const parser::Block *iter{&block};
+  while (iter->size() == 1) {
+    const parser::ExecutionPartConstruct &ep{iter->front()};
+    if (auto *exec{std::get_if(&ep.u)}) {
+      using BlockConstruct = common::Indirection;
+      if (auto *bc{std::get_if(&exec->u)}) {
+        iter = &std::get(bc->value().t);
+        continue;
       }
+    }
+    break;
+  }
+  return *iter;
 }
-inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch(
-    const parser::Variable &var, const parser::Expr &expr) {
-  // Err out if the symbol on the LHS is also used on the RHS of the assignment
-  // statement
-  const auto *e{GetExpr(context_, expr)};
-  const auto *v{GetExpr(context_, var)};
-  if (e && v) {
-    auto vSyms{evaluate::GetSymbolVector(*v)};
-    const Symbol &varSymbol = vSyms.front();
-    for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) {
-      if (varSymbol == symbol) {
-        const common::Indirection *designator =
-            std::get_if>(&expr.u);
-        if (designator) {
-          auto *z{var.typedExpr.get()};
-          auto *c{expr.typedExpr.get()};
-          if (z->v == c->v) {
-            context_.Say(expr.source,
-                "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US,
-                var.GetSource());
-          }
-        } else {
-          context_.Say(expr.source,
-              "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US,
-              var.GetSource());
-        }
-      }
+// There is no consistent way to get the source of a given ActionStmt, so
+// extract the source information from Statement when we can,
+// and keep it around for error reporting in further analyses.
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; } } + return SourcedActionStmt{}; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != 
variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); } - return false; + return SourcedActionStmt{}; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - 
common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; - } - } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update 
statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); - } +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. +static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; } - ErrIfAllocatableVariable(var); + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; + } else { + return std::nullopt; + } + }, + x->u); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] + // Extract the evaluate::Expr from ScalarLogicalExpr. 
+ auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. -void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const auto *v1 = GetExpr(context_, stmt1Var); - const auto *e1 = GetExpr(context_, stmt1Expr); - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - const auto *v2 = GetExpr(context_, stmt2Var); - const auto *e2 = GetExpr(context_, stmt2Expr); - - if (e1 && v1 && e2 && v2) { - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(v2, e2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } - if (!(*e1 == *v2)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(v1, e1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - if (!(*v1 == *e2)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); - } - } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } + return std::nullopt; } -} -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. 
+ return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; } } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; } } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); - } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); + return std::nullopt; } -} -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { 
- const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, - }, - x.u); + return std::nullopt; } -void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { - dirContext_.pop_back(); +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension ---------------- kparzysz wrote: Yes, will do. https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Thu May 29 12:08:16 2025 From: flang-commits at lists.llvm.org (Leandro Lupori via flang-commits) Date: Thu, 29 May 2025 12:08:16 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Treat ClassType as BoxType in COPYPRIVATE (PR #141844) In-Reply-To: Message-ID: <6838b0a0.050a0220.362627.7a93@mx.google.com> https://github.com/luporl approved this pull request. LGTM, thanks for the fix. 
https://github.com/llvm/llvm-project/pull/141844 From flang-commits at lists.llvm.org Thu May 29 12:32:52 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 12:32:52 -0700 (PDT) Subject: [flang-commits] [flang] 4811c67 - [flang][OpenMP] Treat ClassType as BoxType in COPYPRIVATE (#141844) Message-ID: <6838b664.170a0220.26b8bd.b930@mx.google.com> Author: Krzysztof Parzyszek Date: 2025-05-29T14:32:49-05:00 New Revision: 4811c67d62b840a7f5d3320de0b15ba96e27d2e4 URL: https://github.com/llvm/llvm-project/commit/4811c67d62b840a7f5d3320de0b15ba96e27d2e4 DIFF: https://github.com/llvm/llvm-project/commit/4811c67d62b840a7f5d3320de0b15ba96e27d2e4.diff LOG: [flang][OpenMP] Treat ClassType as BoxType in COPYPRIVATE (#141844) This fixes the second problem reported in https://github.com/llvm/llvm-project/issues/141481 Added: flang/test/Lower/OpenMP/copyprivate4.f90 Modified: flang/lib/Lower/OpenMP/ClauseProcessor.cpp Removed: ################################################################################ diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index ebdda9885d5c2..a1fff6c5b7d90 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -743,6 +743,9 @@ void TypeInfo::typeScan(mlir::Type ty) { } else if (auto bty = mlir::dyn_cast(ty)) { inBox = true; typeScan(bty.getEleTy()); + } else if (auto cty = mlir::dyn_cast(ty)) { + inBox = true; + typeScan(cty.getEleTy()); } else if (auto cty = mlir::dyn_cast(ty)) { charLen = cty.getLen(); } else if (auto hty = mlir::dyn_cast(ty)) { diff --git a/flang/test/Lower/OpenMP/copyprivate4.f90 b/flang/test/Lower/OpenMP/copyprivate4.f90 new file mode 100644 index 0000000000000..02fdbc71edc59 --- /dev/null +++ b/flang/test/Lower/OpenMP/copyprivate4.f90 @@ -0,0 +1,18 @@ +!RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s + +!The second testcase from 
https://github.com/llvm/llvm-project/issues/141481 + +!Check that we don't crash on this. + +!CHECK: omp.single copyprivate(%6#0 -> @_copy_class_ptr_rec__QFf01Tt : !fir.ref>>>) { +!CHECK: omp.terminator +!CHECK: } + +subroutine f01 + type t + end type + class(t), pointer :: tt + +!$omp single copyprivate(tt) +!$omp end single +end From flang-commits at lists.llvm.org Thu May 29 12:32:55 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 29 May 2025 12:32:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Treat ClassType as BoxType in COPYPRIVATE (PR #141844) In-Reply-To: Message-ID: <6838b667.170a0220.2eb05f.c5ae@mx.google.com> https://github.com/kparzysz closed https://github.com/llvm/llvm-project/pull/141844 From flang-commits at lists.llvm.org Thu May 29 12:34:31 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 29 May 2025 12:34:31 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Correct defined assignment case (PR #142020) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/142020 When a generic ASSIGNMENT(=) has elemental and non-elemental specific procedures that match the actual arguments, the non-elemental procedure must take precedence. We get this right for generics defined with interface blocks, but the type-bound case fails if the non-elemental specific takes a non-default PASS argument. Fixes https://github.com/llvm/llvm-project/issues/141807. >From 1eb6e57dc34820cfdd4ff3e0893456ef42b358d7 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Thu, 29 May 2025 12:30:18 -0700 Subject: [PATCH] [flang] Correct defined assignment case When a generic ASSIGNMENT(=) has elemental and non-elemental specific procedures that match the actual arguments, the non-elemental procedure must take precedence. 
We get this right for generics defined with interface blocks, but the type-bound case fails if the non-elemental specific takes a non-default PASS argument. Fixes https://github.com/llvm/llvm-project/issues/141807. --- flang/lib/Semantics/expression.cpp | 34 +++++++++++++++++++----------- flang/test/Semantics/bug141807.f90 | 32 ++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+), 12 deletions(-) create mode 100644 flang/test/Semantics/bug141807.f90 diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index d68e71f57f141..f4af738284ed7 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -2907,7 +2907,7 @@ std::pair ExpressionAnalyzer::ResolveGeneric( continue; } // Matching distance is smaller than the previously matched - // specific. Let it go thourgh so the current procedure is picked. + // specific. Let it go through so the current procedure is picked. } else { // 16.9.144(6): a bare NULL() is not allowed as an actual // argument to a generic procedure if the specific procedure @@ -4824,31 +4824,41 @@ bool ArgumentAnalyzer::OkLogicalIntegerAssignment( std::optional ArgumentAnalyzer::GetDefinedAssignmentProc() { const Symbol *proc{nullptr}; + bool isProcElemental{false}; std::optional passedObjectIndex; std::string oprNameString{"assignment(=)"}; parser::CharBlock oprName{oprNameString}; const auto &scope{context_.context().FindScope(source_)}; - // If multiple resolutions were possible, they will have been already - // diagnosed. 
{ auto restorer{context_.GetContextualMessages().DiscardMessages()}; if (const Symbol *symbol{scope.FindSymbol(oprName)}) { ExpressionAnalyzer::AdjustActuals noAdjustment; proc = context_.ResolveGeneric(*symbol, actuals_, noAdjustment, true).first; + if (proc) { + isProcElemental = IsElementalProcedure(*proc); + } } - for (std::size_t i{0}; !proc && i < actuals_.size(); ++i) { + for (std::size_t i{0}; (!proc || isProcElemental) && i < actuals_.size(); + ++i) { const Symbol *generic{nullptr}; if (const Symbol * binding{FindBoundOp(oprName, i, generic, /*isSubroutine=*/true)}) { - if (CheckAccessibleSymbol(scope, DEREF(generic))) { - // ignore inaccessible type-bound ASSIGNMENT(=) generic - } else if (const Symbol * - resolution{GetBindingResolution(GetType(i), *binding)}) { - proc = resolution; - } else { - proc = binding; - passedObjectIndex = i; + // ignore inaccessible type-bound ASSIGNMENT(=) generic + if (!CheckAccessibleSymbol(scope, DEREF(generic))) { + const Symbol *resolution{GetBindingResolution(GetType(i), *binding)}; + const Symbol &newProc{*(resolution ? resolution : binding)}; + bool isElemental{IsElementalProcedure(newProc)}; + if (!proc || !isElemental) { + // Non-elemental resolution overrides elemental + proc = &newProc; + isProcElemental = isElemental; + if (resolution) { + passedObjectIndex.reset(); + } else { + passedObjectIndex = i; + } + } } } } diff --git a/flang/test/Semantics/bug141807.f90 b/flang/test/Semantics/bug141807.f90 new file mode 100644 index 0000000000000..48539f19927c1 --- /dev/null +++ b/flang/test/Semantics/bug141807.f90 @@ -0,0 +1,32 @@ +!RUN: %flang_fc1 -fdebug-unparse %s | FileCheck %s +!Ensure that non-elemental specific takes precedence over elemental +!defined assignment, even with non-default PASS argument. 
+module m
+  type base
+    integer :: n = -999
+  contains
+    procedure, pass(from) :: array_assign_scalar
+    procedure :: elemental_assign
+    generic :: assignment(=) => array_assign_scalar, elemental_assign
+  end type
+ contains
+  subroutine array_assign_scalar(to, from)
+    class(base), intent(out) :: to(:)
+    class(base), intent(in) :: from
+    to%n = from%n
+  end
+  impure elemental subroutine elemental_assign(to, from)
+    class(base), intent(out) :: to
+    class(base), intent(in) :: from
+    to%n = from%n
+  end
+end
+
+use m
+type(base) :: array(1), scalar
+scalar%n = 1
+!CHECK: CALL array_assign_scalar(array,(scalar))
+array = scalar
+!CHECK: CALL elemental_assign(array,[base::scalar])
+array = [scalar]
+end

From flang-commits at lists.llvm.org Thu May 29 12:35:07 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Thu, 29 May 2025 12:35:07 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Correct defined assignment case (PR #142020)
In-Reply-To:
Message-ID: <6838b6eb.050a0220.33acde.c6d5@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-semantics

Author: Peter Klausler (klausler)
Changes When a generic ASSIGNMENT(=) has elemental and non-elemental specific procedures that match the actual arguments, the non-elemental procedure must take precedence. We get this right for generics defined with interface blocks, but the type-bound case fails if the non-elemental specific takes a non-default PASS argument. Fixes https://github.com/llvm/llvm-project/issues/141807. --- Full diff: https://github.com/llvm/llvm-project/pull/142020.diff 2 Files Affected: - (modified) flang/lib/Semantics/expression.cpp (+22-12) - (added) flang/test/Semantics/bug141807.f90 (+32) ``````````diff diff --git a/flang/lib/Semantics/expression.cpp b/flang/lib/Semantics/expression.cpp index d68e71f57f141..f4af738284ed7 100644 --- a/flang/lib/Semantics/expression.cpp +++ b/flang/lib/Semantics/expression.cpp @@ -2907,7 +2907,7 @@ std::pair ExpressionAnalyzer::ResolveGeneric( continue; } // Matching distance is smaller than the previously matched - // specific. Let it go thourgh so the current procedure is picked. + // specific. Let it go through so the current procedure is picked. } else { // 16.9.144(6): a bare NULL() is not allowed as an actual // argument to a generic procedure if the specific procedure @@ -4824,31 +4824,41 @@ bool ArgumentAnalyzer::OkLogicalIntegerAssignment( std::optional ArgumentAnalyzer::GetDefinedAssignmentProc() { const Symbol *proc{nullptr}; + bool isProcElemental{false}; std::optional passedObjectIndex; std::string oprNameString{"assignment(=)"}; parser::CharBlock oprName{oprNameString}; const auto &scope{context_.context().FindScope(source_)}; - // If multiple resolutions were possible, they will have been already - // diagnosed. 
{ auto restorer{context_.GetContextualMessages().DiscardMessages()}; if (const Symbol *symbol{scope.FindSymbol(oprName)}) { ExpressionAnalyzer::AdjustActuals noAdjustment; proc = context_.ResolveGeneric(*symbol, actuals_, noAdjustment, true).first; + if (proc) { + isProcElemental = IsElementalProcedure(*proc); + } } - for (std::size_t i{0}; !proc && i < actuals_.size(); ++i) { + for (std::size_t i{0}; (!proc || isProcElemental) && i < actuals_.size(); + ++i) { const Symbol *generic{nullptr}; if (const Symbol * binding{FindBoundOp(oprName, i, generic, /*isSubroutine=*/true)}) { - if (CheckAccessibleSymbol(scope, DEREF(generic))) { - // ignore inaccessible type-bound ASSIGNMENT(=) generic - } else if (const Symbol * - resolution{GetBindingResolution(GetType(i), *binding)}) { - proc = resolution; - } else { - proc = binding; - passedObjectIndex = i; + // ignore inaccessible type-bound ASSIGNMENT(=) generic + if (!CheckAccessibleSymbol(scope, DEREF(generic))) { + const Symbol *resolution{GetBindingResolution(GetType(i), *binding)}; + const Symbol &newProc{*(resolution ? resolution : binding)}; + bool isElemental{IsElementalProcedure(newProc)}; + if (!proc || !isElemental) { + // Non-elemental resolution overrides elemental + proc = &newProc; + isProcElemental = isElemental; + if (resolution) { + passedObjectIndex.reset(); + } else { + passedObjectIndex = i; + } + } } } } diff --git a/flang/test/Semantics/bug141807.f90 b/flang/test/Semantics/bug141807.f90 new file mode 100644 index 0000000000000..48539f19927c1 --- /dev/null +++ b/flang/test/Semantics/bug141807.f90 @@ -0,0 +1,32 @@ +!RUN: %flang_fc1 -fdebug-unparse %s | FileCheck %s +!Ensure that non-elemental specific takes precedence over elemental +!defined assignment, even with non-default PASS argument. 
+module m
+  type base
+    integer :: n = -999
+  contains
+    procedure, pass(from) :: array_assign_scalar
+    procedure :: elemental_assign
+    generic :: assignment(=) => array_assign_scalar, elemental_assign
+  end type
+ contains
+  subroutine array_assign_scalar(to, from)
+    class(base), intent(out) :: to(:)
+    class(base), intent(in) :: from
+    to%n = from%n
+  end
+  impure elemental subroutine elemental_assign(to, from)
+    class(base), intent(out) :: to
+    class(base), intent(in) :: from
+    to%n = from%n
+  end
+end
+
+use m
+type(base) :: array(1), scalar
+scalar%n = 1
+!CHECK: CALL array_assign_scalar(array,(scalar))
+array = scalar
+!CHECK: CALL elemental_assign(array,[base::scalar])
+array = [scalar]
+end
``````````
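
For readers skimming the patch, the precedence rule it implements (a non-elemental specific overrides an elemental one) can be sketched in isolation. This is only an illustration under assumptions: `Candidate` and `pickDefinedAssignment` are hypothetical names, not flang types or functions, and the sketch flattens the two sources of candidates in the real code (the scope-level generic and the per-argument type-bound bindings) into a single list.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a matching specific procedure; not a flang type.
struct Candidate {
  std::string name;
  bool isElemental;
};

// Returns the index of the chosen candidate, or std::nullopt if there is none.
// The first candidate is taken provisionally. A later non-elemental candidate
// replaces an elemental choice, and the scan stops as soon as the current
// choice is non-elemental, mirroring the loop condition in the patch.
std::optional<std::size_t> pickDefinedAssignment(
    const std::vector<Candidate> &candidates) {
  std::optional<std::size_t> chosen;
  bool chosenIsElemental = false;
  for (std::size_t i = 0;
       (!chosen || chosenIsElemental) && i < candidates.size(); ++i) {
    bool isElemental = candidates[i].isElemental;
    if (!chosen || !isElemental) {
      // Non-elemental resolution overrides elemental.
      chosen = i;
      chosenIsElemental = isElemental;
    }
  }
  return chosen;
}
```

With the two specifics from the test case listed as `{"elemental_assign", true}` followed by `{"array_assign_scalar", false}`, the sketch selects the second entry. With only the elemental entry present, it falls back to that one.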
https://github.com/llvm/llvm-project/pull/142020 From flang-commits at lists.llvm.org Thu May 29 12:47:29 2025 From: flang-commits at lists.llvm.org (Leandro Lupori via flang-commits) Date: Thu, 29 May 2025 12:47:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables (PR #141823) In-Reply-To: Message-ID: <6838b9d1.650a0220.a2d54.7235@mx.google.com> https://github.com/luporl approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/141823 From flang-commits at lists.llvm.org Thu May 29 12:56:27 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 29 May 2025 12:56:27 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <6838bbeb.170a0220.2648e7.be0c@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/142022 >From 8f3fd2daab46f477e87043c66b3049dff4a5b20e Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:11:04 -0700 Subject: [PATCH 1/2] initial commit --- flang/include/flang/Common/enum-class.h | 47 ++++- .../include/flang/Support/Fortran-features.h | 51 ++++-- flang/lib/Frontend/CompilerInvocation.cpp | 62 ++++--- flang/lib/Support/CMakeLists.txt | 1 + flang/lib/Support/Fortran-features.cpp | 168 ++++++++++++++---- flang/lib/Support/enum-class.cpp | 24 +++ flang/test/Driver/disable-diagnostic.f90 | 19 ++ flang/test/Driver/werror-wrong.f90 | 7 +- flang/test/Driver/wextra-ok.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 3 + flang/unittests/Common/EnumClassTests.cpp | 45 +++++ .../unittests/Common/FortranFeaturesTest.cpp | 142 +++++++++++++++ 12 files changed, 483 insertions(+), 88 deletions(-) create mode 100644 flang/lib/Support/enum-class.cpp create mode 100644 flang/test/Driver/disable-diagnostic.f90 create mode 100644 flang/unittests/Common/EnumClassTests.cpp create mode 100644 
flang/unittests/Common/FortranFeaturesTest.cpp diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index 41575d45091a8..baf9fe418141d 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -18,8 +18,9 @@ #define FORTRAN_COMMON_ENUM_CLASS_H_ #include -#include - +#include +#include +#include namespace Fortran::common { constexpr std::size_t CountEnumNames(const char *p) { @@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) { return result; } +template +std::optional inline fmap(std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +std::optional FindEnumIndex( + Predicate pred, int size, const std::string_view *names); + +using FindEnumIndexType = std::optional( + Predicate, int, const std::string_view *); + +template +std::optional inline FindEnum( + Predicate pred, std::function(Predicate)> find) { + std::function f = [](int x) { return static_cast(x); }; + return fmap(find(pred), f); +} + #define ENUM_CLASS(NAME, ...) 
\ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ ::Fortran::common::CountEnumNames(#__VA_ARGS__)}; \ + [[maybe_unused]] static constexpr std::array NAME##_names{ \ + ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ [[maybe_unused]] static inline std::string_view EnumToString(NAME e) { \ - static const constexpr auto names{ \ - ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ - return names[static_cast(e)]; \ + return NAME##_names[static_cast(e)]; \ } +#define ENUM_CLASS_EXTRA(NAME) \ + [[maybe_unused]] inline std::optional Find##NAME##Index( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnumIndex( \ + p, NAME##_enumSize, NAME##_names.data()); \ + } \ + [[maybe_unused]] inline std::optional Find##NAME( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + } \ + [[maybe_unused]] inline std::optional StringTo##NAME( \ + const std::string_view name) { \ + return Find##NAME( \ + [name](const std::string_view s) -> bool { return name == s; }); \ + } } // namespace Fortran::common #endif // FORTRAN_COMMON_ENUM_CLASS_H_ diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index e696da9042480..d5aa7357ffea0 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -12,6 +12,8 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" #include "flang/Common/idioms.h" +#include "llvm/Support/Error.h" +#include "llvm/Support/raw_ostream.h" #include #include @@ -79,12 +81,13 @@ ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, NullActualForDefaultIntentAllocatable, UseAssociationIntoSameNameSubprogram, HostAssociatedIntentOutInSpecExpr, NonVolatilePointerToVolatile) +// Generate default String -> Enum mapping. 
+ENUM_CLASS_EXTRA(LanguageFeature) +ENUM_CLASS_EXTRA(UsageWarning) + using LanguageFeatures = EnumSet; using UsageWarnings = EnumSet; -std::optional FindLanguageFeature(const char *); -std::optional FindUsageWarning(const char *); - class LanguageFeatureControl { public: LanguageFeatureControl(); @@ -97,8 +100,10 @@ class LanguageFeatureControl { void EnableWarning(UsageWarning w, bool yes = true) { warnUsage_.set(w, yes); } - void WarnOnAllNonstandard(bool yes = true) { warnAllLanguage_ = yes; } - void WarnOnAllUsage(bool yes = true) { warnAllUsage_ = yes; } + void WarnOnAllNonstandard(bool yes = true); + bool IsWarnOnAllNonstandard() const { return warnAllLanguage_; } + void WarnOnAllUsage(bool yes = true); + bool IsWarnOnAllUsage() const { return warnAllUsage_; } void DisableAllNonstandardWarnings() { warnAllLanguage_ = false; warnLanguage_.clear(); @@ -107,16 +112,16 @@ class LanguageFeatureControl { warnAllUsage_ = false; warnUsage_.clear(); } - - bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } - bool ShouldWarn(LanguageFeature f) const { - return (warnAllLanguage_ && f != LanguageFeature::OpenMP && - f != LanguageFeature::OpenACC && f != LanguageFeature::CUDA) || - warnLanguage_.test(f); - } - bool ShouldWarn(UsageWarning w) const { - return warnAllUsage_ || warnUsage_.test(w); + void DisableAllWarnings() { + disableAllWarnings_ = true; + DisableAllNonstandardWarnings(); + DisableAllUsageWarnings(); } + bool applyCLIOption(llvm::StringRef input); + bool AreWarningsDisabled() const { return disableAllWarnings_; } + bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } + bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); } + bool ShouldWarn(UsageWarning w) const { return warnUsage_.test(w); } // Return all spellings of operators names, depending on features enabled std::vector GetNames(LogicalOperator) const; std::vector GetNames(RelationalOperator) const; @@ -127,6 +132,24 @@ class 
LanguageFeatureControl { bool warnAllLanguage_{false}; UsageWarnings warnUsage_; bool warnAllUsage_{false}; + bool disableAllWarnings_{false}; }; + +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). Just exposed for the template below. +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find); + +template +std::optional> parseCLIEnum( + llvm::StringRef input, std::function(Predicate)> find) { + using To = std::pair; + using From = std::pair; + static std::function cast = [](From x) { + return std::pair{x.first, static_cast(x.second)}; + }; + return fmap(parseCLIEnumIndex(input, find), cast); +} + } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..9ea568549bd6c 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -34,6 +34,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" +#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" @@ -45,6 +46,7 @@ #include #include #include +#include using namespace Fortran::frontend; @@ -971,10 +973,23 @@ static bool parseSemaArgs(CompilerInvocation &res, llvm::opt::ArgList &args, /// Parses all diagnostics related arguments and populates the variables /// options accordingly. Returns false if new errors are generated. +/// FC1 driver entry point for parsing diagnostic arguments. static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, clang::DiagnosticsEngine &diags) { unsigned numErrorsBefore = diags.getNumErrors(); + auto &features = res.getFrontendOpts().features; + // The order of these flags (-pedantic -W -w) is important and is + // chosen to match clang's behavior. 
+ + // -pedantic + if (args.hasArg(clang::driver::options::OPT_pedantic)) { + features.WarnOnAllNonstandard(); + features.WarnOnAllUsage(); + res.setEnableConformanceChecks(); + res.setEnableUsageChecks(); + } + // -Werror option // TODO: Currently throws a Diagnostic for anything other than -W, // this has to change when other -W's are supported. @@ -984,21 +999,27 @@ static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, for (const auto &wArg : wArgs) { if (wArg == "error") { res.setWarnAsErr(true); - } else { - const unsigned diagID = - diags.getCustomDiagID(clang::DiagnosticsEngine::Error, - "Only `-Werror` is supported currently."); - diags.Report(diagID); + // -W(no-) + } else if (!features.applyCLIOption(wArg)) { + const unsigned diagID = diags.getCustomDiagID( + clang::DiagnosticsEngine::Error, "Unknown diagnostic option: -W%0"); + diags.Report(diagID) << wArg; } } } + // -w + if (args.hasArg(clang::driver::options::OPT_w)) { + features.DisableAllWarnings(); + res.setDisableWarnings(); + } + // Default to off for `flang -fc1`. - res.getFrontendOpts().showColors = - parseShowColorsArgs(args, /*defaultDiagColor=*/false); + bool showColors = parseShowColorsArgs(args, false); - // Honor color diagnostics. 
- res.getDiagnosticOpts().ShowColors = res.getFrontendOpts().showColors; + diags.getDiagnosticOptions().ShowColors = showColors; + res.getDiagnosticOpts().ShowColors = showColors; + res.getFrontendOpts().showColors = showColors; return diags.getNumErrors() == numErrorsBefore; } @@ -1074,16 +1095,6 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, Fortran::common::LanguageFeature::OpenACC); } - // -pedantic - if (args.hasArg(clang::driver::options::OPT_pedantic)) { - res.setEnableConformanceChecks(); - res.setEnableUsageChecks(); - } - - // -w - if (args.hasArg(clang::driver::options::OPT_w)) - res.setDisableWarnings(); - // -std=f2018 // TODO: Set proper options when more fortran standards // are supported. @@ -1092,6 +1103,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, // We only allow f2018 as the given standard if (standard == "f2018") { res.setEnableConformanceChecks(); + res.getFrontendOpts().features.WarnOnAllNonstandard(); } else { const unsigned diagID = diags.getCustomDiagID(clang::DiagnosticsEngine::Error, @@ -1099,6 +1111,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, diags.Report(diagID); } } + return diags.getNumErrors() == numErrorsBefore; } @@ -1694,16 +1707,7 @@ void CompilerInvocation::setFortranOpts() { if (frontendOptions.needProvenanceRangeToCharBlockMappings) fortranOptions.needProvenanceRangeToCharBlockMappings = true; - if (getEnableConformanceChecks()) - fortranOptions.features.WarnOnAllNonstandard(); - - if (getEnableUsageChecks()) - fortranOptions.features.WarnOnAllUsage(); - - if (getDisableWarnings()) { - fortranOptions.features.DisableAllNonstandardWarnings(); - fortranOptions.features.DisableAllUsageWarnings(); - } + fortranOptions.features = frontendOptions.features; } std::unique_ptr diff --git a/flang/lib/Support/CMakeLists.txt b/flang/lib/Support/CMakeLists.txt index 363f57ce97dae..9ef31a2a6dcc7 100644 --- 
a/flang/lib/Support/CMakeLists.txt +++ b/flang/lib/Support/CMakeLists.txt @@ -44,6 +44,7 @@ endif() add_flang_library(FortranSupport default-kinds.cpp + enum-class.cpp Flags.cpp Fortran.cpp Fortran-features.cpp diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index bee8984102b82..55abf0385d185 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -9,6 +9,8 @@ #include "flang/Support/Fortran-features.h" #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/Support/raw_ostream.h" namespace Fortran::common { @@ -94,57 +96,123 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Ignore case and any inserted punctuation (like '-'/'_') -static std::optional GetWarningChar(char ch) { - if (ch >= 'a' && ch <= 'z') { - return ch; - } else if (ch >= 'A' && ch <= 'Z') { - return ch - 'A' + 'a'; - } else if (ch >= '0' && ch <= '9') { - return ch; - } else { - return std::nullopt; +// Split a string with camel case into the individual words. +// Note, the small vector is just an array of a few pointers and lengths +// into the original input string. So all this allocation should be pretty +// cheap. 
+llvm::SmallVector splitCamelCase(llvm::StringRef input) { + using namespace llvm; + if (input.empty()) { + return {}; } + SmallVector parts{}; + parts.reserve(input.size()); + auto check = [&input](size_t j, function_ref predicate) { + return j < input.size() && predicate(input[j]); + }; + size_t i{0}; + size_t startWord = i; + for (; i < input.size(); i++) { + if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || + ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { + parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); + startWord = i + 1; + } + } + parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); + return parts; } -static bool WarningNameMatch(const char *a, const char *b) { - while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); - } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); +// Split a string whith hyphens into the individual words. +llvm::SmallVector splitHyphenated(llvm::StringRef input) { + auto parts = llvm::SmallVector{}; + llvm::SplitString(input, parts, "-"); + return parts; +} + +// Check if two strings are equal while normalizing case for the +// right word which is assumed to be a single word in camel case. +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { + size_t ls = l.size(); + if (ls != r.size()) + return false; + size_t j{0}; + // Process the upper case characters. + for (; j < ls; j++) { + char rc = r[j]; + char rc2l = llvm::toLower(rc); + if (rc == rc2l) { + // Past run of Uppers Case; + break; } - if (!ach && !bch) { - return true; - } else if (!ach || !bch || *ach != *bch) { + if (l[j] != rc2l) + return false; + } + // Process the lower case characters. 
+ for (; j < ls; j++) { + if (l[j] != r[j]) { return false; } - ++a, ++b; } + return true; } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find) { + auto parts = splitHyphenated(input); + bool negated = false; + if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { + negated = true; + // Remove the "no" part + parts = llvm::SmallVector(parts.begin() + 1, parts.end()); + } + size_t chars = 0; + for (auto p : parts) { + chars += p.size(); + } + auto pred = [&](auto s) { + if (chars != s.size()) { + return false; + } + auto ccParts = splitCamelCase(s); + auto num_ccParts = ccParts.size(); + if (parts.size() != num_ccParts) { + return false; + } + for (size_t i{0}; i < num_ccParts; i++) { + if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { + return false; } } - } - return std::nullopt; + return true; + }; + auto cast = [negated](int x) { return std::pair{!negated, x}; }; + return fmap>(find(pred), cast); } -std::optional FindLanguageFeature(const char *name) { - return ScanEnum(name); +std::optional> parseCLILanguageFeature( + llvm::StringRef input) { + return parseCLIEnum(input, FindLanguageFeatureIndex); } -std::optional FindUsageWarning(const char *name) { - return ScanEnum(name); +std::optional> parseCLIUsageWarning( + llvm::StringRef input) { + return parseCLIEnum(input, FindUsageWarningIndex); +} + +// Take a string from the CLI and apply it to the LanguageFeatureControl. +// Return true if the option was applied recognized. 
+bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { + if (auto result = parseCLILanguageFeature(input)) { + EnableWarning(result->second, result->first); + return true; + } else if (auto result = parseCLIUsageWarning(input)) { + EnableWarning(result->second, result->first); + return true; + } + return false; } std::vector LanguageFeatureControl::GetNames( @@ -201,4 +269,32 @@ std::vector LanguageFeatureControl::GetNames( } } +template +void ForEachEnum(std::function f) { + for (size_t j{0}; j < N; ++j) { + f(static_cast(j)); + } +} + +void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { + warnAllLanguage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + // should be equivalent to: reset().flip() set ... + ForEachEnum( + [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + if (yes) { + // These three features do not need to be warned about, + // but we do want their feature flags. + warnLanguage_.set(LanguageFeature::OpenMP, false); + warnLanguage_.set(LanguageFeature::OpenACC, false); + warnLanguage_.set(LanguageFeature::CUDA, false); + } +} + +void LanguageFeatureControl::WarnOnAllUsage(bool yes) { + warnAllUsage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + ForEachEnum( + [&](UsageWarning w) { warnUsage_.set(w, yes); }); +} } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp new file mode 100644 index 0000000000000..ed11318382b35 --- /dev/null +++ b/flang/lib/Support/enum-class.cpp @@ -0,0 +1,24 @@ +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include +#include +namespace Fortran::common { + +std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; + } + } + return std::nullopt; +} + + +} // namespace Fortran::common \ No newline at end of file diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 new file mode 100644 index 0000000000000..8a58e63cfa3ac --- /dev/null +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -0,0 +1,19 @@ +! RUN: %flang -Wknown-bad-implicit-interface %s -c 2>&1 | FileCheck %s --check-prefix=WARN +! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty +! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 +! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 +! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface +! ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface + +program disable_diagnostic + REAL :: x + INTEGER :: y + ! CHECK-NOT: warning + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(x) + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(y) +end program disable_diagnostic + +subroutine sub() +end subroutine sub \ No newline at end of file diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 58adf6f745d5e..33f0aff8a1739 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -1,6 +1,7 @@ ! Ensure that only argument -Werror is supported. -! 
RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG -! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG +! RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG1 +! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 -! WRONG: Only `-Werror` is supported currently. +! WRONG1: error: Unknown diagnostic option: -Wall +! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file diff --git a/flang/test/Driver/wextra-ok.f90 b/flang/test/Driver/wextra-ok.f90 index 441029aa0af27..db15c7f14aa35 100644 --- a/flang/test/Driver/wextra-ok.f90 +++ b/flang/test/Driver/wextra-ok.f90 @@ -5,7 +5,7 @@ ! RUN: not %flang -std=f2018 -Wblah -Wextra %s -c 2>&1 | FileCheck %s --check-prefix=WRONG ! CHECK-OK: the warning option '-Wextra' is not supported -! WRONG: Only `-Werror` is supported currently. +! WRONG: Unknown diagnostic option: -Wblah program wextra_ok end program wextra_ok diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index bda02ed29a5ef..19cc5a20fecf4 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -1,3 +1,6 @@ add_flang_unittest(FlangCommonTests + EnumClassTests.cpp FastIntSetTest.cpp + FortranFeaturesTest.cpp ) +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp new file mode 100644 index 0000000000000..f67c453cfad15 --- /dev/null +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -0,0 +1,45 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Common/template.h" +#include "gtest/gtest.h" + +using namespace Fortran::common; +using namespace std; + +ENUM_CLASS(TestEnum, One, Two, + Three) +ENUM_CLASS_EXTRA(TestEnum) + +TEST(EnumClassTest, EnumToString) { + ASSERT_EQ(EnumToString(TestEnum::One), "One"); + ASSERT_EQ(EnumToString(TestEnum::Two), "Two"); + ASSERT_EQ(EnumToString(TestEnum::Three), "Three"); +} + +TEST(EnumClassTest, EnumToStringData) { + ASSERT_STREQ(EnumToString(TestEnum::One).data(), "One, Two, Three"); +} + +TEST(EnumClassTest, StringToEnum) { + ASSERT_EQ(StringToTestEnum("One"), std::optional{TestEnum::One}); + ASSERT_EQ(StringToTestEnum("Two"), std::optional{TestEnum::Two}); + ASSERT_EQ(StringToTestEnum("Three"), std::optional{TestEnum::Three}); + ASSERT_EQ(StringToTestEnum("Four"), std::nullopt); + ASSERT_EQ(StringToTestEnum(""), std::nullopt); + ASSERT_EQ(StringToTestEnum("One, Two, Three"), std::nullopt); +} + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, FindNameNormal) { + auto p1 = [](auto s) { return s == "TwentyOne"; }; + ASSERT_EQ(FindTestEnumExtra(p1), std::optional{TestEnumExtra::TwentyOne}); +} diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp new file mode 100644 index 0000000000000..7ec7054f14f6e --- /dev/null +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -0,0 +1,142 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Support/Fortran-features.h" +#include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/ErrorHandling.h" +#include "gtest/gtest.h" + +namespace Fortran::common { + +// Not currently exported from Fortran-features.h +llvm::SmallVector splitCamelCase(llvm::StringRef input); +llvm::SmallVector splitHyphenated(llvm::StringRef input); +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, SplitCamelCase) { + + auto parts = splitCamelCase("oP"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("o", 1))) { + ADD_FAILURE() << "First part is not o"; + } + if (parts[1].compare(llvm::StringRef("P", 1))) { + ADD_FAILURE() << "Second part is not P"; + } + + parts = splitCamelCase("OPName"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("OP", 2))) { + ADD_FAILURE() << "First part is not OP"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("OpName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("Op", 2))) { + ADD_FAILURE() << "First part is not Op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("opName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("op", 2))) { + ADD_FAILURE() << "First part is not op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("FlangTestProgram123"); + ASSERT_EQ(parts.size(), (size_t)3); + if 
(parts[0].compare(llvm::StringRef("Flang", 5))) { + ADD_FAILURE() << "First part is not Flang"; + } + if (parts[1].compare(llvm::StringRef("Test", 4))) { + ADD_FAILURE() << "Second part is not Test"; + } + if (parts[2].compare(llvm::StringRef("Program123", 10))) { + ADD_FAILURE() << "Third part is not Program123"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, SplitHyphenated) { + auto parts = splitHyphenated("no-twenty-one"); + ASSERT_EQ(parts.size(), (size_t)3); + if (parts[0].compare(llvm::StringRef("no", 2))) { + ADD_FAILURE() << "First part is not no"; + } + if (parts[1].compare(llvm::StringRef("twenty", 6))) { + ADD_FAILURE() << "Second part is not twenty"; + } + if (parts[2].compare(llvm::StringRef("one", 3))) { + ADD_FAILURE() << "Third part is not one"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); + + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); +} + +std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); +} + 
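The matching rule exercised by the unit tests above (a lower-case, hyphen-separated option matched word by word against a CamelCase enumerator name, with an optional leading "no-") can be sketched without LLVM roughly as follows. This is only an illustrative standard-library approximation, not the code from the patch: the names `splitCamel`, `splitHyphens`, `wordMatches`, `parseOption` and `ParsedOption` are hypothetical, and ASCII-only names are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>
#include <vector>

// ASCII-only classifiers (assumption: option and enum names are plain ASCII).
static bool isUpper(char c) { return c >= 'A' && c <= 'Z'; }
static bool isLower(char c) { return c >= 'a' && c <= 'z'; }
static bool isDigit(char c) { return c >= '0' && c <= '9'; }
static char toLower(char c) {
  return isUpper(c) ? static_cast<char>(c - 'A' + 'a') : c;
}

// Hypothetical helper: split a CamelCase name into words. A word ends at i
// when s[i] is lower/digit and s[i+1] is upper, or when s[i] and s[i+1] are
// upper and s[i+2] is lower (end of an acronym, as in "OPName").
std::vector<std::string_view> splitCamel(std::string_view s) {
  std::vector<std::string_view> parts;
  if (s.empty()) return parts;
  auto at = [&s](std::size_t j, bool (*pred)(char)) {
    return j < s.size() && pred(s[j]);
  };
  std::size_t start = 0;
  for (std::size_t i = 0; i < s.size(); ++i) {
    bool acronymEnd = at(i, isUpper) && at(i + 1, isUpper) && at(i + 2, isLower);
    bool wordEnd = (at(i, isLower) || at(i, isDigit)) && at(i + 1, isUpper);
    if (acronymEnd || wordEnd) {
      parts.push_back(s.substr(start, i - start + 1));
      start = i + 1;
    }
  }
  parts.push_back(s.substr(start));
  return parts;
}

// Hypothetical helper: split on '-', dropping empty pieces.
std::vector<std::string_view> splitHyphens(std::string_view s) {
  std::vector<std::string_view> parts;
  std::size_t start = 0;
  while (start <= s.size()) {
    std::size_t end = s.find('-', start);
    if (end == std::string_view::npos) end = s.size();
    if (end > start) parts.push_back(s.substr(start, end - start));
    start = end + 1;
  }
  return parts;
}

// The leading upper-case run of `camel` is compared after lower-casing;
// every remaining character must match exactly.
bool wordMatches(std::string_view lower, std::string_view camel) {
  if (lower.size() != camel.size()) return false;
  std::size_t j = 0;
  for (; j < camel.size() && isUpper(camel[j]); ++j)
    if (lower[j] != toLower(camel[j])) return false;
  for (; j < camel.size(); ++j)
    if (lower[j] != camel[j]) return false;
  return true;
}

struct ParsedOption {
  bool enable;       // false when the option had a leading "no-"
  std::size_t index; // position of the matched name in `names`
};

// Hypothetical driver: find the first name whose words match the option.
std::optional<ParsedOption> parseOption(
    std::string_view opt, const std::vector<std::string_view> &names) {
  auto parts = splitHyphens(opt);
  bool negated = !parts.empty() && parts.front() == "no";
  if (negated) parts.erase(parts.begin());
  for (std::size_t n = 0; n < names.size(); ++n) {
    auto words = splitCamel(names[n]);
    if (words.size() != parts.size()) continue;
    bool ok = true;
    for (std::size_t i = 0; ok && i < words.size(); ++i)
      ok = wordMatches(parts[i], words[i]);
    if (ok) return ParsedOption{!negated, n};
  }
  return std::nullopt;
}
```

With the names TwentyOne, FortyTwo and SevenSevenSeven, this sketch accepts `twenty-one` and `no-forty-two` but rejects `TwentyOne` and `Twenty-One`, which is the same shape of behavior the driver test expects for `-WKnownBadImplicitInterface` and `-WKnown-Bad-Implicit-Interface`.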
+TEST(EnumClassTest, parseCLIEnumOption) { + auto result = parseCLITestEnumExtraOption("no-twenty-one"); + auto expected = std::pair(false, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("twenty-one"); + expected = std::pair(true, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-forty-two"); + expected = std::pair(false, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("forty-two"); + expected = std::pair(true, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-seven-seven-seven"); + expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("seven-seven-seven"); + expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); +} + +} // namespace Fortran::common >From 49a0579f9477936b72f0580823b4dd6824697512 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:56:14 -0700 Subject: [PATCH 2/2] adjust headers --- flang/include/flang/Support/Fortran-features.h | 4 +--- flang/lib/Frontend/CompilerInvocation.cpp | 5 ----- flang/lib/Support/Fortran-features.cpp | 1 - 3 files changed, 1 insertion(+), 9 deletions(-) diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index d5aa7357ffea0..4a8b0da4c0d4d 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -11,9 +11,7 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" -#include "flang/Common/idioms.h" -#include "llvm/Support/Error.h" -#include "llvm/Support/raw_ostream.h" +#include "llvm/ADT/StringRef.h" #include #include diff --git a/flang/lib/Frontend/CompilerInvocation.cpp 
b/flang/lib/Frontend/CompilerInvocation.cpp index 9ea568549bd6c..d8bf601d0171d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -20,11 +20,9 @@ #include "flang/Support/Version.h" #include "flang/Tools/TargetSetup.h" #include "flang/Version.inc" -#include "clang/Basic/AllDiagnostics.h" #include "clang/Basic/DiagnosticDriver.h" #include "clang/Basic/DiagnosticOptions.h" #include "clang/Driver/Driver.h" -#include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" #include "llvm/ADT/StringRef.h" @@ -34,9 +32,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" -#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" -#include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" #include "llvm/Support/Process.h" #include "llvm/Support/raw_ostream.h" @@ -46,7 +42,6 @@ #include #include #include -#include using namespace Fortran::frontend; diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 55abf0385d185..0e394162ef577 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -10,7 +10,6 @@ #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" -#include "llvm/Support/raw_ostream.h" namespace Fortran::common { From flang-commits at lists.llvm.org Thu May 29 12:57:35 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Thu, 29 May 2025 12:57:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <6838bc2f.630a0220.1c4760.0832@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/142022 >From 8f3fd2daab46f477e87043c66b3049dff4a5b20e Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 
12:11:04 -0700 Subject: [PATCH 1/3] initial commit --- flang/include/flang/Common/enum-class.h | 47 ++++- .../include/flang/Support/Fortran-features.h | 51 ++++-- flang/lib/Frontend/CompilerInvocation.cpp | 62 ++++--- flang/lib/Support/CMakeLists.txt | 1 + flang/lib/Support/Fortran-features.cpp | 168 ++++++++++++++---- flang/lib/Support/enum-class.cpp | 24 +++ flang/test/Driver/disable-diagnostic.f90 | 19 ++ flang/test/Driver/werror-wrong.f90 | 7 +- flang/test/Driver/wextra-ok.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 3 + flang/unittests/Common/EnumClassTests.cpp | 45 +++++ .../unittests/Common/FortranFeaturesTest.cpp | 142 +++++++++++++++ 12 files changed, 483 insertions(+), 88 deletions(-) create mode 100644 flang/lib/Support/enum-class.cpp create mode 100644 flang/test/Driver/disable-diagnostic.f90 create mode 100644 flang/unittests/Common/EnumClassTests.cpp create mode 100644 flang/unittests/Common/FortranFeaturesTest.cpp diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index 41575d45091a8..baf9fe418141d 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -18,8 +18,9 @@ #define FORTRAN_COMMON_ENUM_CLASS_H_ #include -#include - +#include +#include +#include namespace Fortran::common { constexpr std::size_t CountEnumNames(const char *p) { @@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) { return result; } +template +std::optional inline fmap(std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. 
+std::optional FindEnumIndex( + Predicate pred, int size, const std::string_view *names); + +using FindEnumIndexType = std::optional( + Predicate, int, const std::string_view *); + +template +std::optional inline FindEnum( + Predicate pred, std::function(Predicate)> find) { + std::function f = [](int x) { return static_cast(x); }; + return fmap(find(pred), f); +} + #define ENUM_CLASS(NAME, ...) \ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ ::Fortran::common::CountEnumNames(#__VA_ARGS__)}; \ + [[maybe_unused]] static constexpr std::array NAME##_names{ \ + ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ [[maybe_unused]] static inline std::string_view EnumToString(NAME e) { \ - static const constexpr auto names{ \ - ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ - return names[static_cast(e)]; \ + return NAME##_names[static_cast(e)]; \ } +#define ENUM_CLASS_EXTRA(NAME) \ + [[maybe_unused]] inline std::optional Find##NAME##Index( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnumIndex( \ + p, NAME##_enumSize, NAME##_names.data()); \ + } \ + [[maybe_unused]] inline std::optional Find##NAME( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + } \ + [[maybe_unused]] inline std::optional StringTo##NAME( \ + const std::string_view name) { \ + return Find##NAME( \ + [name](const std::string_view s) -> bool { return name == s; }); \ + } } // namespace Fortran::common #endif // FORTRAN_COMMON_ENUM_CLASS_H_ diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index e696da9042480..d5aa7357ffea0 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -12,6 +12,8 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" #include "flang/Common/idioms.h" +#include "llvm/Support/Error.h" +#include "llvm/Support/raw_ostream.h" #include 
#include @@ -79,12 +81,13 @@ ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, NullActualForDefaultIntentAllocatable, UseAssociationIntoSameNameSubprogram, HostAssociatedIntentOutInSpecExpr, NonVolatilePointerToVolatile) +// Generate default String -> Enum mapping. +ENUM_CLASS_EXTRA(LanguageFeature) +ENUM_CLASS_EXTRA(UsageWarning) + using LanguageFeatures = EnumSet; using UsageWarnings = EnumSet; -std::optional FindLanguageFeature(const char *); -std::optional FindUsageWarning(const char *); - class LanguageFeatureControl { public: LanguageFeatureControl(); @@ -97,8 +100,10 @@ class LanguageFeatureControl { void EnableWarning(UsageWarning w, bool yes = true) { warnUsage_.set(w, yes); } - void WarnOnAllNonstandard(bool yes = true) { warnAllLanguage_ = yes; } - void WarnOnAllUsage(bool yes = true) { warnAllUsage_ = yes; } + void WarnOnAllNonstandard(bool yes = true); + bool IsWarnOnAllNonstandard() const { return warnAllLanguage_; } + void WarnOnAllUsage(bool yes = true); + bool IsWarnOnAllUsage() const { return warnAllUsage_; } void DisableAllNonstandardWarnings() { warnAllLanguage_ = false; warnLanguage_.clear(); @@ -107,16 +112,16 @@ class LanguageFeatureControl { warnAllUsage_ = false; warnUsage_.clear(); } - - bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } - bool ShouldWarn(LanguageFeature f) const { - return (warnAllLanguage_ && f != LanguageFeature::OpenMP && - f != LanguageFeature::OpenACC && f != LanguageFeature::CUDA) || - warnLanguage_.test(f); - } - bool ShouldWarn(UsageWarning w) const { - return warnAllUsage_ || warnUsage_.test(w); + void DisableAllWarnings() { + disableAllWarnings_ = true; + DisableAllNonstandardWarnings(); + DisableAllUsageWarnings(); } + bool applyCLIOption(llvm::StringRef input); + bool AreWarningsDisabled() const { return disableAllWarnings_; } + bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } + bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); } + bool 
ShouldWarn(UsageWarning w) const { return warnUsage_.test(w); } // Return all spellings of operators names, depending on features enabled std::vector GetNames(LogicalOperator) const; std::vector GetNames(RelationalOperator) const; @@ -127,6 +132,24 @@ class LanguageFeatureControl { bool warnAllLanguage_{false}; UsageWarnings warnUsage_; bool warnAllUsage_{false}; + bool disableAllWarnings_{false}; }; + +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). Just exposed for the template below. +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find); + +template +std::optional> parseCLIEnum( + llvm::StringRef input, std::function(Predicate)> find) { + using To = std::pair; + using From = std::pair; + static std::function cast = [](From x) { + return std::pair{x.first, static_cast(x.second)}; + }; + return fmap(parseCLIEnumIndex(input, find), cast); +} + } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..9ea568549bd6c 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -34,6 +34,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" +#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" @@ -45,6 +46,7 @@ #include #include #include +#include using namespace Fortran::frontend; @@ -971,10 +973,23 @@ static bool parseSemaArgs(CompilerInvocation &res, llvm::opt::ArgList &args, /// Parses all diagnostics related arguments and populates the variables /// options accordingly. Returns false if new errors are generated. +/// FC1 driver entry point for parsing diagnostic arguments. 
static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, clang::DiagnosticsEngine &diags) { unsigned numErrorsBefore = diags.getNumErrors(); + auto &features = res.getFrontendOpts().features; + // The order of these flags (-pedantic -W -w) is important and is + // chosen to match clang's behavior. + + // -pedantic + if (args.hasArg(clang::driver::options::OPT_pedantic)) { + features.WarnOnAllNonstandard(); + features.WarnOnAllUsage(); + res.setEnableConformanceChecks(); + res.setEnableUsageChecks(); + } + // -Werror option // TODO: Currently throws a Diagnostic for anything other than -W, // this has to change when other -W's are supported. @@ -984,21 +999,27 @@ static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, for (const auto &wArg : wArgs) { if (wArg == "error") { res.setWarnAsErr(true); - } else { - const unsigned diagID = - diags.getCustomDiagID(clang::DiagnosticsEngine::Error, - "Only `-Werror` is supported currently."); - diags.Report(diagID); + // -W(no-) + } else if (!features.applyCLIOption(wArg)) { + const unsigned diagID = diags.getCustomDiagID( + clang::DiagnosticsEngine::Error, "Unknown diagnostic option: -W%0"); + diags.Report(diagID) << wArg; } } } + // -w + if (args.hasArg(clang::driver::options::OPT_w)) { + features.DisableAllWarnings(); + res.setDisableWarnings(); + } + // Default to off for `flang -fc1`. - res.getFrontendOpts().showColors = - parseShowColorsArgs(args, /*defaultDiagColor=*/false); + bool showColors = parseShowColorsArgs(args, false); - // Honor color diagnostics. 
- res.getDiagnosticOpts().ShowColors = res.getFrontendOpts().showColors; + diags.getDiagnosticOptions().ShowColors = showColors; + res.getDiagnosticOpts().ShowColors = showColors; + res.getFrontendOpts().showColors = showColors; return diags.getNumErrors() == numErrorsBefore; } @@ -1074,16 +1095,6 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, Fortran::common::LanguageFeature::OpenACC); } - // -pedantic - if (args.hasArg(clang::driver::options::OPT_pedantic)) { - res.setEnableConformanceChecks(); - res.setEnableUsageChecks(); - } - - // -w - if (args.hasArg(clang::driver::options::OPT_w)) - res.setDisableWarnings(); - // -std=f2018 // TODO: Set proper options when more fortran standards // are supported. @@ -1092,6 +1103,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, // We only allow f2018 as the given standard if (standard == "f2018") { res.setEnableConformanceChecks(); + res.getFrontendOpts().features.WarnOnAllNonstandard(); } else { const unsigned diagID = diags.getCustomDiagID(clang::DiagnosticsEngine::Error, @@ -1099,6 +1111,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, diags.Report(diagID); } } + return diags.getNumErrors() == numErrorsBefore; } @@ -1694,16 +1707,7 @@ void CompilerInvocation::setFortranOpts() { if (frontendOptions.needProvenanceRangeToCharBlockMappings) fortranOptions.needProvenanceRangeToCharBlockMappings = true; - if (getEnableConformanceChecks()) - fortranOptions.features.WarnOnAllNonstandard(); - - if (getEnableUsageChecks()) - fortranOptions.features.WarnOnAllUsage(); - - if (getDisableWarnings()) { - fortranOptions.features.DisableAllNonstandardWarnings(); - fortranOptions.features.DisableAllUsageWarnings(); - } + fortranOptions.features = frontendOptions.features; } std::unique_ptr diff --git a/flang/lib/Support/CMakeLists.txt b/flang/lib/Support/CMakeLists.txt index 363f57ce97dae..9ef31a2a6dcc7 100644 --- 
a/flang/lib/Support/CMakeLists.txt +++ b/flang/lib/Support/CMakeLists.txt @@ -44,6 +44,7 @@ endif() add_flang_library(FortranSupport default-kinds.cpp + enum-class.cpp Flags.cpp Fortran.cpp Fortran-features.cpp diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index bee8984102b82..55abf0385d185 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -9,6 +9,8 @@ #include "flang/Support/Fortran-features.h" #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/Support/raw_ostream.h" namespace Fortran::common { @@ -94,57 +96,123 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Ignore case and any inserted punctuation (like '-'/'_') -static std::optional GetWarningChar(char ch) { - if (ch >= 'a' && ch <= 'z') { - return ch; - } else if (ch >= 'A' && ch <= 'Z') { - return ch - 'A' + 'a'; - } else if (ch >= '0' && ch <= '9') { - return ch; - } else { - return std::nullopt; +// Split a string with camel case into the individual words. +// Note, the small vector is just an array of a few pointers and lengths +// into the original input string. So all this allocation should be pretty +// cheap. 
+llvm::SmallVector splitCamelCase(llvm::StringRef input) { + using namespace llvm; + if (input.empty()) { + return {}; } + SmallVector parts{}; + parts.reserve(input.size()); + auto check = [&input](size_t j, function_ref predicate) { + return j < input.size() && predicate(input[j]); + }; + size_t i{0}; + size_t startWord = i; + for (; i < input.size(); i++) { + if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || + ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { + parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); + startWord = i + 1; + } + } + parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); + return parts; } -static bool WarningNameMatch(const char *a, const char *b) { - while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); - } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); +// Split a string with hyphens into the individual words. +llvm::SmallVector splitHyphenated(llvm::StringRef input) { + auto parts = llvm::SmallVector{}; + llvm::SplitString(input, parts, "-"); + return parts; +} + +// Check if two strings are equal while normalizing case for the +// right word which is assumed to be a single word in camel case. +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { + size_t ls = l.size(); + if (ls != r.size()) + return false; + size_t j{0}; + // Process the upper case characters. + for (; j < ls; j++) { + char rc = r[j]; + char rc2l = llvm::toLower(rc); + if (rc == rc2l) { + // Past the run of upper case characters. + break; } - if (!ach && !bch) { - return true; - } else if (!ach || !bch || *ach != *bch) { + if (l[j] != rc2l) + return false; + } + // Process the lower case characters. 
+ for (; j < ls; j++) { + if (l[j] != r[j]) { return false; } - ++a, ++b; } + return true; } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Parse a CLI enum option; return the enum index and whether it should be +// enabled (true) or disabled (false). +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find) { + auto parts = splitHyphenated(input); + bool negated = false; + if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { + negated = true; + // Remove the "no" part + parts = llvm::SmallVector(parts.begin() + 1, parts.end()); + } + size_t chars = 0; + for (auto p : parts) { + chars += p.size(); + } + auto pred = [&](auto s) { + if (chars != s.size()) { + return false; + } + auto ccParts = splitCamelCase(s); + auto num_ccParts = ccParts.size(); + if (parts.size() != num_ccParts) { + return false; + } + for (size_t i{0}; i < num_ccParts; i++) { + if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { + return false; } } - } - return std::nullopt; + return true; + }; + auto cast = [negated](int x) { return std::pair{!negated, x}; }; + return fmap>(find(pred), cast); } -std::optional FindLanguageFeature(const char *name) { - return ScanEnum(name); +std::optional> parseCLILanguageFeature( + llvm::StringRef input) { + return parseCLIEnum(input, FindLanguageFeatureIndex); } -std::optional FindUsageWarning(const char *name) { - return ScanEnum(name); +std::optional> parseCLIUsageWarning( + llvm::StringRef input) { + return parseCLIEnum(input, FindUsageWarningIndex); +} + +// Take a string from the CLI and apply it to the LanguageFeatureControl. +// Return true if the option was recognized and applied. 
+bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { + if (auto result = parseCLILanguageFeature(input)) { + EnableWarning(result->second, result->first); + return true; + } else if (auto result = parseCLIUsageWarning(input)) { + EnableWarning(result->second, result->first); + return true; + } + return false; } std::vector LanguageFeatureControl::GetNames( @@ -201,4 +269,32 @@ std::vector LanguageFeatureControl::GetNames( } } +template +void ForEachEnum(std::function f) { + for (size_t j{0}; j < N; ++j) { + f(static_cast(j)); + } +} + +void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { + warnAllLanguage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + // should be equivalent to: reset().flip() set ... + ForEachEnum( + [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + if (yes) { + // These three features do not need to be warned about, + // but we do want their feature flags. + warnLanguage_.set(LanguageFeature::OpenMP, false); + warnLanguage_.set(LanguageFeature::OpenACC, false); + warnLanguage_.set(LanguageFeature::CUDA, false); + } +} + +void LanguageFeatureControl::WarnOnAllUsage(bool yes) { + warnAllUsage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + ForEachEnum( + [&](UsageWarning w) { warnUsage_.set(w, yes); }); +} } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp new file mode 100644 index 0000000000000..ed11318382b35 --- /dev/null +++ b/flang/lib/Support/enum-class.cpp @@ -0,0 +1,24 @@ +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include +#include +namespace Fortran::common { + +std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; + } + } + return std::nullopt; +} + + +} // namespace Fortran::common \ No newline at end of file diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 new file mode 100644 index 0000000000000..8a58e63cfa3ac --- /dev/null +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -0,0 +1,19 @@ +! RUN: %flang -Wknown-bad-implicit-interface %s -c 2>&1 | FileCheck %s --check-prefix=WARN +! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty +! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 +! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 +! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface +! ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface + +program disable_diagnostic + REAL :: x + INTEGER :: y + ! CHECK-NOT: warning + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(x) + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(y) +end program disable_diagnostic + +subroutine sub() +end subroutine sub \ No newline at end of file diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 58adf6f745d5e..33f0aff8a1739 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -1,6 +1,7 @@ ! Ensure that only argument -Werror is supported. -! 
RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG -! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG +! RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG1 +! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 -! WRONG: Only `-Werror` is supported currently. +! WRONG1: error: Unknown diagnostic option: -Wall +! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file diff --git a/flang/test/Driver/wextra-ok.f90 b/flang/test/Driver/wextra-ok.f90 index 441029aa0af27..db15c7f14aa35 100644 --- a/flang/test/Driver/wextra-ok.f90 +++ b/flang/test/Driver/wextra-ok.f90 @@ -5,7 +5,7 @@ ! RUN: not %flang -std=f2018 -Wblah -Wextra %s -c 2>&1 | FileCheck %s --check-prefix=WRONG ! CHECK-OK: the warning option '-Wextra' is not supported -! WRONG: Only `-Werror` is supported currently. +! WRONG: Unknown diagnostic option: -Wblah program wextra_ok end program wextra_ok diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index bda02ed29a5ef..19cc5a20fecf4 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -1,3 +1,6 @@ add_flang_unittest(FlangCommonTests + EnumClassTests.cpp FastIntSetTest.cpp + FortranFeaturesTest.cpp ) +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp new file mode 100644 index 0000000000000..f67c453cfad15 --- /dev/null +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -0,0 +1,45 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Common/template.h" +#include "gtest/gtest.h" + +using namespace Fortran::common; +using namespace std; + +ENUM_CLASS(TestEnum, One, Two, + Three) +ENUM_CLASS_EXTRA(TestEnum) + +TEST(EnumClassTest, EnumToString) { + ASSERT_EQ(EnumToString(TestEnum::One), "One"); + ASSERT_EQ(EnumToString(TestEnum::Two), "Two"); + ASSERT_EQ(EnumToString(TestEnum::Three), "Three"); +} + +TEST(EnumClassTest, EnumToStringData) { + ASSERT_STREQ(EnumToString(TestEnum::One).data(), "One, Two, Three"); +} + +TEST(EnumClassTest, StringToEnum) { + ASSERT_EQ(StringToTestEnum("One"), std::optional{TestEnum::One}); + ASSERT_EQ(StringToTestEnum("Two"), std::optional{TestEnum::Two}); + ASSERT_EQ(StringToTestEnum("Three"), std::optional{TestEnum::Three}); + ASSERT_EQ(StringToTestEnum("Four"), std::nullopt); + ASSERT_EQ(StringToTestEnum(""), std::nullopt); + ASSERT_EQ(StringToTestEnum("One, Two, Three"), std::nullopt); +} + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, FindNameNormal) { + auto p1 = [](auto s) { return s == "TwentyOne"; }; + ASSERT_EQ(FindTestEnumExtra(p1), std::optional{TestEnumExtra::TwentyOne}); +} diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp new file mode 100644 index 0000000000000..7ec7054f14f6e --- /dev/null +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -0,0 +1,142 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Support/Fortran-features.h" +#include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/ErrorHandling.h" +#include "gtest/gtest.h" + +namespace Fortran::common { + +// Not currently exported from Fortran-features.h +llvm::SmallVector splitCamelCase(llvm::StringRef input); +llvm::SmallVector splitHyphenated(llvm::StringRef input); +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, SplitCamelCase) { + + auto parts = splitCamelCase("oP"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("o", 1))) { + ADD_FAILURE() << "First part is not OP"; + } + if (parts[1].compare(llvm::StringRef("P", 1))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("OPName"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("OP", 2))) { + ADD_FAILURE() << "First part is not OP"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("OpName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("Op", 2))) { + ADD_FAILURE() << "First part is not Op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("opName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("op", 2))) { + ADD_FAILURE() << "First part is not op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("FlangTestProgram123"); + ASSERT_EQ(parts.size(), (size_t)3); + if 
(parts[0].compare(llvm::StringRef("Flang", 5))) { + ADD_FAILURE() << "First part is not Flang"; + } + if (parts[1].compare(llvm::StringRef("Test", 4))) { + ADD_FAILURE() << "Second part is not Test"; + } + if (parts[2].compare(llvm::StringRef("Program123", 10))) { + ADD_FAILURE() << "Third part is not Program123"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, SplitHyphenated) { + auto parts = splitHyphenated("no-twenty-one"); + ASSERT_EQ(parts.size(), (size_t)3); + if (parts[0].compare(llvm::StringRef("no", 2))) { + ADD_FAILURE() << "First part is not twenty"; + } + if (parts[1].compare(llvm::StringRef("twenty", 6))) { + ADD_FAILURE() << "Second part is not one"; + } + if (parts[2].compare(llvm::StringRef("one", 3))) { + ADD_FAILURE() << "Third part is not one"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); + + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); +} + +std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); +} + 
+TEST(EnumClassTest, parseCLIEnumOption) { + auto result = parseCLITestEnumExtraOption("no-twenty-one"); + auto expected = std::pair(false, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("twenty-one"); + expected = std::pair(true, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-forty-two"); + expected = std::pair(false, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("forty-two"); + expected = std::pair(true, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-seven-seven-seven"); + expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("seven-seven-seven"); + expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); +} + +} // namespace Fortran::common >From 49a0579f9477936b72f0580823b4dd6824697512 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:56:14 -0700 Subject: [PATCH 2/3] adjust headers --- flang/include/flang/Support/Fortran-features.h | 4 +--- flang/lib/Frontend/CompilerInvocation.cpp | 5 ----- flang/lib/Support/Fortran-features.cpp | 1 - 3 files changed, 1 insertion(+), 9 deletions(-) diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index d5aa7357ffea0..4a8b0da4c0d4d 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -11,9 +11,7 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" -#include "flang/Common/idioms.h" -#include "llvm/Support/Error.h" -#include "llvm/Support/raw_ostream.h" +#include "llvm/ADT/StringRef.h" #include #include diff --git a/flang/lib/Frontend/CompilerInvocation.cpp 
b/flang/lib/Frontend/CompilerInvocation.cpp index 9ea568549bd6c..d8bf601d0171d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -20,11 +20,9 @@ #include "flang/Support/Version.h" #include "flang/Tools/TargetSetup.h" #include "flang/Version.inc" -#include "clang/Basic/AllDiagnostics.h" #include "clang/Basic/DiagnosticDriver.h" #include "clang/Basic/DiagnosticOptions.h" #include "clang/Driver/Driver.h" -#include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" #include "llvm/ADT/StringRef.h" @@ -34,9 +32,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" -#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" -#include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" #include "llvm/Support/Process.h" #include "llvm/Support/raw_ostream.h" @@ -46,7 +42,6 @@ #include #include #include -#include using namespace Fortran::frontend; diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 55abf0385d185..0e394162ef577 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -10,7 +10,6 @@ #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" -#include "llvm/Support/raw_ostream.h" namespace Fortran::common { >From fa2db7090c6d374ce1a835ad26d19a1d7bd42262 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:57:22 -0700 Subject: [PATCH 3/3] reformat --- flang/lib/Support/enum-class.cpp | 20 ++++++++++--------- flang/unittests/Common/EnumClassTests.cpp | 5 ++--- .../unittests/Common/FortranFeaturesTest.cpp | 18 ++++++++++------- 3 files changed, 24 insertions(+), 19 deletions(-) diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ed11318382b35..ac57f27ef1c9e 100644 --- 
a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -1,4 +1,5 @@ -//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ +//-*-===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -7,18 +8,19 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" -#include #include +#include namespace Fortran::common { -std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { - if (pred(names[i])) { - return i; - } +std::optional FindEnumIndex( + std::function pred, int size, + const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; } - return std::nullopt; + } + return std::nullopt; } - } // namespace Fortran::common \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp index f67c453cfad15..c9224a8ceba54 100644 --- a/flang/unittests/Common/EnumClassTests.cpp +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -6,15 +6,14 @@ // //===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Common/template.h" -#include "gtest/gtest.h" using namespace Fortran::common; using namespace std; -ENUM_CLASS(TestEnum, One, Two, - Three) +ENUM_CLASS(TestEnum, One, Two, Three) ENUM_CLASS_EXTRA(TestEnum) TEST(EnumClassTest, EnumToString) { diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 7ec7054f14f6e..597928e7fe56e 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -6,12 +6,12 @@ // 
//===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Support/Fortran-features.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" -#include "gtest/gtest.h" namespace Fortran::common { @@ -34,7 +34,7 @@ TEST(EnumClassTest, SplitCamelCase) { if (parts[1].compare(llvm::StringRef("P", 1))) { ADD_FAILURE() << "Second part is not Name"; } - + parts = splitCamelCase("OPName"); ASSERT_EQ(parts.size(), (size_t)2); @@ -114,13 +114,15 @@ TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); } -std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); +std::optional> parseCLITestEnumExtraOption( + llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); } TEST(EnumClassTest, parseCLIEnumOption) { auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = std::pair(false, TestEnumExtra::TwentyOne); + auto expected = + std::pair(false, TestEnumExtra::TwentyOne); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("twenty-one"); expected = std::pair(true, TestEnumExtra::TwentyOne); @@ -132,10 +134,12 @@ TEST(EnumClassTest, parseCLIEnumOption) { expected = std::pair(true, TestEnumExtra::FortyTwo); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(false, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(true, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); } From flang-commits at 
lists.llvm.org Thu May 29 13:05:08 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 13:05:08 -0700 (PDT) Subject: [flang-commits] [flang] 798ae82 - [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables (#141823) Message-ID: <6838bdf4.630a0220.10fb10.7fcb@mx.google.com> Author: Krzysztof Parzyszek Date: 2025-05-29T15:05:05-05:00 New Revision: 798ae823997b417cb85da098f16e4b1101d9b68c URL: https://github.com/llvm/llvm-project/commit/798ae823997b417cb85da098f16e4b1101d9b68c DIFF: https://github.com/llvm/llvm-project/commit/798ae823997b417cb85da098f16e4b1101d9b68c.diff LOG: [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables (#141823) The check if the arguments are variable list items was missing, leading to a crash in lowering in some invalid situations. This fixes the first testcase reported in https://github.com/llvm/llvm-project/issues/141481 Added: flang/test/Semantics/OpenMP/copyprivate05.f90 Modified: flang/lib/Semantics/check-omp-structure.cpp flang/lib/Semantics/check-omp-structure.h flang/test/Semantics/OpenMP/copyprivate04.f90 Removed: ################################################################################ diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bda0d62829506..b0bc478d96a1e 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -390,6 +390,16 @@ std::optional OmpStructureChecker::IsContiguous( object.u); } +void OmpStructureChecker::CheckVariableListItem( + const SymbolSourceMap &symbols) { + for (auto &[symbol, source] : symbols) { + if (!IsVariableListItem(*symbol)) { + context_.SayWithDecl( + *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); + } + } +} + void OmpStructureChecker::CheckMultipleOccurrence( semantics::UnorderedSymbolSet &listVars, const std::list &nameList, const parser::CharBlock &item, @@ -4587,6 +4597,7 @@ void 
OmpStructureChecker::Enter(const parser::OmpClause::Copyprivate &x) { CheckAllowedClause(llvm::omp::Clause::OMPC_copyprivate); SymbolSourceMap symbols; GetSymbolsInObjectList(x.v, symbols); + CheckVariableListItem(symbols); CheckIntentInPointer(symbols, llvm::omp::Clause::OMPC_copyprivate); CheckCopyingPolymorphicAllocatable( symbols, llvm::omp::Clause::OMPC_copyprivate); @@ -4859,12 +4870,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::From &x) { const auto &objList{std::get(x.v.t)}; SymbolSourceMap symbols; GetSymbolsInObjectList(objList, symbols); - for (const auto &[symbol, source] : symbols) { - if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl( - *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); - } - } + CheckVariableListItem(symbols); // Ref: [4.5:109:19] // If a list item is an array section it must specify contiguous storage. @@ -4904,12 +4910,7 @@ void OmpStructureChecker::Enter(const parser::OmpClause::To &x) { const auto &objList{std::get(x.v.t)}; SymbolSourceMap symbols; GetSymbolsInObjectList(objList, symbols); - for (const auto &[symbol, source] : symbols) { - if (!IsVariableListItem(*symbol)) { - context_.SayWithDecl( - *symbol, source, "'%s' must be a variable"_err_en_US, symbol->name()); - } - } + CheckVariableListItem(symbols); // Ref: [4.5:109:19] // If a list item is an array section it must specify contiguous storage. 
diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 587959f7d506f..1a8059d8548ed 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -174,6 +174,7 @@ class OmpStructureChecker bool IsExtendedListItem(const Symbol &sym); bool IsCommonBlock(const Symbol &sym); std::optional IsContiguous(const parser::OmpObject &object); + void CheckVariableListItem(const SymbolSourceMap &symbols); void CheckMultipleOccurrence(semantics::UnorderedSymbolSet &listVars, const std::list &nameList, const parser::CharBlock &item, const std::string &clauseName); diff --git a/flang/test/Semantics/OpenMP/copyprivate04.f90 b/flang/test/Semantics/OpenMP/copyprivate04.f90 index 291cf1103fb27..8d7800229bc5f 100644 --- a/flang/test/Semantics/OpenMP/copyprivate04.f90 +++ b/flang/test/Semantics/OpenMP/copyprivate04.f90 @@ -70,6 +70,7 @@ program omp_copyprivate ! Named constants are shared. !$omp single !ERROR: COPYPRIVATE variable 'pi' is not PRIVATE or THREADPRIVATE in outer context + !ERROR: 'pi' must be a variable !$omp end single copyprivate(pi) !$omp parallel do diff --git a/flang/test/Semantics/OpenMP/copyprivate05.f90 b/flang/test/Semantics/OpenMP/copyprivate05.f90 new file mode 100644 index 0000000000000..129f8f0b5144e --- /dev/null +++ b/flang/test/Semantics/OpenMP/copyprivate05.f90 @@ -0,0 +1,12 @@ +!RUN: %python %S/../test_errors.py %s %flang_fc1 -fopenmp + +! 
The first testcase from https://github.com/llvm/llvm-project/issues/141481
+
+subroutine f00
+ type t
+ end type
+
+!ERROR: 't' must be a variable
+!$omp single copyprivate(t)
+!$omp end single
+end

From flang-commits at lists.llvm.org Thu May 29 13:05:12 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Thu, 29 May 2025 13:05:12 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][OpenMP] Verify that arguments to COPYPRIVATE are variables (PR #141823)
In-Reply-To:
Message-ID: <6838bdf8.170a0220.be583.8e54@mx.google.com>

https://github.com/kparzysz closed https://github.com/llvm/llvm-project/pull/141823

From flang-commits at lists.llvm.org Thu May 29 13:11:15 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 29 May 2025 13:11:15 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <6838bf63.170a0220.1288f7.bb06@mx.google.com>

https://github.com/klausler requested changes to this pull request.

https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Thu May 29 13:11:15 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 29 May 2025 13:11:15 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <6838bf63.050a0220.1cafa9.bf8c@mx.google.com>

================
@@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) {
 return result;
 }
+template
+std::optional inline fmap(std::optional x, std::function f) {
+ return x ? std::optional{f(*x)} : std::nullopt;
+}
+
+using Predicate = std::function;
+// Finds the first index for which the predicate returns true.
+std::optional FindEnumIndex(
+ Predicate pred, int size, const std::string_view *names);
+
+using FindEnumIndexType = std::optional(
+ Predicate, int, const std::string_view *);
+
+template
+std::optional inline FindEnum(
+ Predicate pred, std::function(Predicate)> find) {
+ std::function f = [](int x) { return static_cast(x); };
----------------
klausler wrote:

Braced initialization always in f18 before lowering or in the runtime.

https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Thu May 29 13:11:16 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 29 May 2025 13:11:16 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <6838bf64.050a0220.b13ec.ca27@mx.google.com>

================
@@ -127,6 +132,24 @@ class LanguageFeatureControl {
 bool warnAllLanguage_{false};
 UsageWarnings warnUsage_;
 bool warnAllUsage_{false};
+ bool disableAllWarnings_{false};
 };
+
+// Parse a CLI enum option return the enum index and whether it should be
----------------
klausler wrote:

Need some punctuation between the words "option" and "return".

https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Thu May 29 13:11:16 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 29 May 2025 13:11:16 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <6838bf64.170a0220.3ca331.c6a2@mx.google.com>

================
@@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) {
 return result;
 }
+template
+std::optional inline fmap(std::optional x, std::function f) {
+ return x ? std::optional{f(*x)} : std::nullopt;
+}
+
+using Predicate = std::function;
+// Finds the first index for which the predicate returns true.
+std::optional FindEnumIndex(
+ Predicate pred, int size, const std::string_view *names);
----------------
klausler wrote:

`std::size_t`, not `int`

https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Thu May 29 13:11:16 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 29 May 2025 13:11:16 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <6838bf64.170a0220.29f79a.c986@mx.google.com>

================
@@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) {
 return result;
 }
+template
+std::optional inline fmap(std::optional x, std::function f) {
----------------
klausler wrote:

`fmap` isn't conforming with the naming guidelines in our style. And it's not a fully generic fmap that works with other monad-like C++ types.

https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Thu May 29 13:11:16 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 29 May 2025 13:11:16 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <6838bf64.170a0220.2fb5b6.8c77@mx.google.com>

================
@@ -127,6 +132,24 @@ class LanguageFeatureControl {
 bool warnAllLanguage_{false};
 UsageWarnings warnUsage_;
 bool warnAllUsage_{false};
+ bool disableAllWarnings_{false};
 };
+
+// Parse a CLI enum option return the enum index and whether it should be
+// enabled (true) or disabled (false). Just exposed for the template below.
+std::optional> parseCLIEnumIndex(
----------------
klausler wrote:

Capitalize.
https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Thu May 29 13:11:16 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 29 May 2025 13:11:16 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <6838bf64.630a0220.fa646.8ab1@mx.google.com>

================
@@ -0,0 +1,19 @@
+! RUN: %flang -Wknown-bad-implicit-interface %s -c 2>&1 | FileCheck %s --check-prefix=WARN
+! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty
+! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1
+! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2
+! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface
+! ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface
+
+program disable_diagnostic
+ REAL :: x
+ INTEGER :: y
+ ! CHECK-NOT: warning
+ ! WARN: warning: If the procedure's interface were explicit, this reference would be in error
+ call sub(x)
+ ! WARN: warning: If the procedure's interface were explicit, this reference would be in error
+ call sub(y)
+end program disable_diagnostic
+
+subroutine sub()
+end subroutine sub
----------------
klausler wrote:

missing newline?
https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Thu May 29 13:11:16 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 29 May 2025 13:11:16 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <6838bf64.170a0220.18ad4.c001@mx.google.com>

================
@@ -94,57 +96,123 @@ LanguageFeatureControl::LanguageFeatureControl() {
 warnLanguage_.set(LanguageFeature::NullActualForAllocatable);
 }
-// Ignore case and any inserted punctuation (like '-'/'_')
----------------
klausler wrote:

Restore this function, or write something like it, and use it to implement a string comparison that ignores non-alphabetic characters and normalizes to lower case.

https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Thu May 29 13:11:16 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 29 May 2025 13:11:16 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <6838bf64.050a0220.3ae2e5.cd2a@mx.google.com>

================
@@ -0,0 +1,24 @@
+//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "flang/Common/enum-class.h"
+#include
+#include
+namespace Fortran::common {
+
+std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) {
+ for (int i = 0; i < size; ++i) {
----------------
klausler wrote:

Braced initialization, and `std::size_t`.
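
For illustration only (not part of the patch or the review): a minimal sketch of what the quoted loop could look like with both requested changes applied, braced initialization and a `std::size_t` index. The `Predicate` alias and the `std::optional<std::size_t>` return type are assumptions, because the archive dropped the template arguments of the quoted declaration.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string_view>

// Assumed shape of the patch's Predicate alias (template arguments were
// lost in the archived diff, so this signature is a guess).
using Predicate = std::function<bool(std::string_view)>;

// Returns the index of the first name accepted by pred, or std::nullopt
// when no name matches.
std::optional<std::size_t> FindEnumIndex(
    const Predicate &pred, std::size_t size, const std::string_view *names) {
  for (std::size_t j{0}; j < size; ++j) { // braced init, unsigned index
    if (pred(names[j])) {
      return j;
    }
  }
  return std::nullopt;
}
```

With a `std::size_t` count and index there is no signed/unsigned comparison in the loop condition, which may be part of what the comment is after.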
https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Thu May 29 13:11:16 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 29 May 2025 13:11:16 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <6838bf64.170a0220.30917f.80b3@mx.google.com>

================
@@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) {
 return result;
 }
+template
----------------
klausler wrote:

This enum-class.h header file gets included into most compilations, directly or indirectly. These little helpers need to have less generic names, keep their generic names but appear in a more general Common header, or be put into a nested namespace here.

https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Thu May 29 13:11:16 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Thu, 29 May 2025 13:11:16 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <6838bf64.050a0220.330e92.748b@mx.google.com>

================
@@ -201,4 +269,32 @@ std::vector LanguageFeatureControl::GetNames(
 }
 }
+template
+void ForEachEnum(std::function f) {
+ for (size_t j{0}; j < N; ++j) {
+ f(static_cast(j));
+ }
+}
+
+void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) {
+ warnAllLanguage_ = yes;
+ disableAllWarnings_ = yes ? false : disableAllWarnings_;
----------------
klausler wrote:

Write this as an `if` and it will be clearer.
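
For illustration only (not from the thread): a minimal sketch of the `if` form the comment asks for. `WarningControl` is a hypothetical, cut-down stand-in for `LanguageFeatureControl`; only the two flags touched by the quoted statement are modeled.

```cpp
// Hypothetical stand-in for LanguageFeatureControl; the real class has
// many more members.
struct WarningControl {
  bool warnAllUsage_{false};
  bool disableAllWarnings_{false};

  void WarnOnAllUsage(bool yes) {
    warnAllUsage_ = yes;
    // Same effect as: disableAllWarnings_ = yes ? false : disableAllWarnings_;
    // Enabling all warnings clears the global disable; disabling leaves it
    // untouched.
    if (yes) {
      disableAllWarnings_ = false;
    }
  }
};
```

The conditional form avoids the self-assignment in the `false` arm of the ternary, so a reader can see that the flag is only ever cleared here.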
https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Thu May 29 13:11:16 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 29 May 2025 13:11:16 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <6838bf64.170a0220.1288cf.c2e0@mx.google.com> ================ @@ -1,6 +1,7 @@ ! Ensure that only argument -Werror is supported. -! RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG -! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG +! RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG1 +! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 -! WRONG: Only `-Werror` is supported currently. +! WRONG1: error: Unknown diagnostic option: -Wall +! WRONG2: error: Unknown diagnostic option: -WX ---------------- klausler wrote: missing newline? https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Thu May 29 13:11:16 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 29 May 2025 13:11:16 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <6838bf64.170a0220.95e07.bb33@mx.google.com> ================ @@ -0,0 +1,24 @@ +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include +#include +namespace Fortran::common { + +std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; + } + } + return std::nullopt; +} + + +} // namespace Fortran::common ---------------- klausler wrote: There's a weird icon below this line on GitHub's code review page that might signify a missing newline. https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Thu May 29 13:14:22 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Thu, 29 May 2025 13:14:22 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <6838c01e.170a0220.29c3f4.bfa9@mx.google.com> ================ @@ -967,6 +969,15 @@ hlfir::LoopNest hlfir::genLoopNest(mlir::Location loc, auto ub = builder.createConvert(loc, indexType, extent); auto doLoop = builder.create(loc, one, ub, one, isUnordered); + if (!couldVectorize) { ---------------- mrkajetanp wrote: It's so that genLoopNest can optionally annotate the generated loops as not vectorisable, as discussed in the RFC discussion. Do you mean I should post that on its own as a separate prerequisite PR? 
https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Thu May 29 13:19:04 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Thu, 29 May 2025 13:19:04 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <6838c138.050a0220.b1695.cfb9@mx.google.com> ================ @@ -967,6 +969,15 @@ hlfir::LoopNest hlfir::genLoopNest(mlir::Location loc, auto ub = builder.createConvert(loc, indexType, extent); auto doLoop = builder.create(loc, one, ub, one, isUnordered); + if (!couldVectorize) { ---------------- vzakhari wrote: I would prefer it to be a separate commit, but then it will be a non-testable change. Okay, let's keep it, but please make sure your LIT tests check that proper attributes are set in `#loop_annotation`. https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Thu May 29 13:54:54 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 29 May 2025 13:54:54 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #141380) In-Reply-To: Message-ID: <6838c99e.050a0220.208c8d.c4e0@mx.google.com> tarunprabhu wrote: > > Seems ok now: "Changes can be cleanly merged." > > If it's still not available, I may have messed something up on my side. I'll review the GitHub docs to see if I missed something... > > It's probably fine, but I'd like to give the buildkites a chance to finish before I merge it. Ugh! Looks like the buildkites succeeded, but now there are conflicts with main. My apologies, but could you rebase this again? 
https://github.com/llvm/llvm-project/pull/141380 From flang-commits at lists.llvm.org Thu May 29 14:10:01 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 29 May 2025 14:10:01 -0700 (PDT) Subject: [flang-commits] [flang] [flang][NFC] Clean up code in two new functions (PR #142037) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/142037 Two recently-added functions in Semantics/tools.h need some cleaning up to conform to the coding style of the project. One of them should actually be in Parser/tools.{h,cpp}, the other doesn't need to be defined in the header. >From 2e2d9da17886ffb8ce5a9fe4d81e3e3df3d169e8 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Thu, 29 May 2025 14:05:34 -0700 Subject: [PATCH] [flang][NFC] Clean up code in two new functions Two recently-added functions in Semantics/tools.h need some cleaning up to conform to the coding style of the project. One of them should actually be in Parser/tools.{h,cpp}, the other doesn't need to be defined in the header. --- flang/include/flang/Parser/tools.h | 3 +++ flang/include/flang/Semantics/tools.h | 26 ++------------------- flang/lib/Lower/OpenACC.cpp | 4 ++-- flang/lib/Lower/OpenMP/OpenMP.cpp | 4 ++-- flang/lib/Parser/tools.cpp | 6 +++++ flang/lib/Semantics/check-omp-structure.cpp | 8 +++---- flang/lib/Semantics/tools.cpp | 12 ++++++++++ 7 files changed, 31 insertions(+), 32 deletions(-) diff --git a/flang/include/flang/Parser/tools.h b/flang/include/flang/Parser/tools.h index f1ead11734fa0..447bccd5d35a6 100644 --- a/flang/include/flang/Parser/tools.h +++ b/flang/include/flang/Parser/tools.h @@ -250,5 +250,8 @@ template std::optional GetLastSource(A &x) { return GetSourceHelper::GetSource(const_cast(x)); } +// Checks whether the assignment statement has a single variable on the RHS. 
+bool CheckForSingleVariableOnRHS(const AssignmentStmt &); + } // namespace Fortran::parser #endif // FORTRAN_PARSER_TOOLS_H_ diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 3839bc1d2a215..1c1526d51c509 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -753,29 +753,7 @@ std::string GetCommonBlockObjectName(const Symbol &, bool underscoring); // Check for ambiguous USE associations bool HadUseError(SemanticsContext &, SourceName at, const Symbol *); -/// Checks if the assignment statement has a single variable on the RHS. -inline bool checkForSingleVariableOnRHS( - const Fortran::parser::AssignmentStmt &assignmentStmt) { - const Fortran::parser::Expr &expr{ - std::get(assignmentStmt.t)}; - const Fortran::common::Indirection *designator = - std::get_if>( - &expr.u); - return designator != nullptr; -} - -/// Checks if the symbol on the LHS is present in the RHS expression. -inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, - const Fortran::semantics::SomeExpr *rhs) { - auto lhsSyms{Fortran::evaluate::GetSymbolVector(*lhs)}; - const Fortran::semantics::Symbol &lhsSymbol{*lhsSyms.front()}; - for (const Fortran::semantics::Symbol &symbol : - Fortran::evaluate::GetSymbolVector(*rhs)) { - if (lhsSymbol == symbol) { - return true; - } - } - return false; -} +// Checks whether the symbol on the LHS is present in the RHS expression. 
+bool CheckForSymbolMatch(const SomeExpr *lhs,const SomeExpr *rhs); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 02dba22c29c7f..c10e1777614cd 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -653,8 +653,8 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch( + if (Fortran::parser::CheckForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::CheckForSymbolMatch( Fortran::semantics::GetExpr(stmt2Var), Fortran::semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ddb08f74b3841..e07f33671e728 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3199,8 +3199,8 @@ static void genAtomicCapture(lower::AbstractConverter &converter, firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(semantics::GetExpr(stmt2Var), + if (parser::CheckForSingleVariableOnRHS(stmt1)) { + if (semantics::CheckForSymbolMatch(semantics::GetExpr(stmt2Var), semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); diff --git a/flang/lib/Parser/tools.cpp b/flang/lib/Parser/tools.cpp index 6e5f1ed2fc66f..85f0858a8f147 100644 --- a/flang/lib/Parser/tools.cpp +++ b/flang/lib/Parser/tools.cpp 
@@ -174,4 +174,10 @@ const CoindexedNamedObject *GetCoindexedNamedObject( }, allocateObject.u); } + +bool CheckForSingleVariableOnRHS(const AssignmentStmt &assignmentStmt) { + const Expr &expr{std::get(assignmentStmt.t)}; + return std::holds_alternative>(expr.u); +} + } // namespace Fortran::parser diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bda0d62829506..ae5dca1b95f6e 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2922,9 +2922,9 @@ void OmpStructureChecker::CheckAtomicCaptureConstruct( const auto *e2 = GetExpr(context_, stmt2Expr); if (e1 && v1 && e2 && v2) { - if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (parser::CheckForSingleVariableOnRHS(stmt1)) { CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(v2, e2)) { + if (CheckForSymbolMatch(v2, e2)) { // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] CheckAtomicUpdateStmt(stmt2); } else { @@ -2936,8 +2936,8 @@ void OmpStructureChecker::CheckAtomicCaptureConstruct( "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, stmt1Expr.source); } - } else if (semantics::checkForSymbolMatch(v1, e1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { + } else if (CheckForSymbolMatch(v1, e1) && + parser::CheckForSingleVariableOnRHS(stmt2)) { // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] CheckAtomicUpdateStmt(stmt1); CheckAtomicCaptureStmt(stmt2); diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 1d1e3ac044166..d8e9385f6973c 100644 --- a/flang/lib/Semantics/tools.cpp +++ b/flang/lib/Semantics/tools.cpp @@ -1756,4 +1756,16 @@ bool HadUseError( } } +bool CheckForSymbolMatch(const SomeExpr *lhs, const SomeExpr *rhs) { + if (lhs && rhs) { + if (const Symbol *first{evaluate::GetFirstSymbol(*lhs)}) { + for 
(const Symbol &symbol : evaluate::GetSymbolVector(*rhs)) { + if (first == &symbol) { + return true; + } + } + } + } + return false; +} } // namespace Fortran::semantics From flang-commits at lists.llvm.org Thu May 29 14:10:33 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 14:10:33 -0700 (PDT) Subject: [flang-commits] [flang] [flang][NFC] Clean up code in two new functions (PR #142037) In-Reply-To: Message-ID: <6838cd49.050a0220.208c8d.c5e5@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Peter Klausler (klausler)
Changes Two recently-added functions in Semantics/tools.h need some cleaning up to conform to the coding style of the project. One of them should actually be in Parser/tools.{h,cpp}, the other doesn't need to be defined in the header. --- Full diff: https://github.com/llvm/llvm-project/pull/142037.diff 7 Files Affected: - (modified) flang/include/flang/Parser/tools.h (+3) - (modified) flang/include/flang/Semantics/tools.h (+2-24) - (modified) flang/lib/Lower/OpenACC.cpp (+2-2) - (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+2-2) - (modified) flang/lib/Parser/tools.cpp (+6) - (modified) flang/lib/Semantics/check-omp-structure.cpp (+4-4) - (modified) flang/lib/Semantics/tools.cpp (+12) ``````````diff diff --git a/flang/include/flang/Parser/tools.h b/flang/include/flang/Parser/tools.h index f1ead11734fa0..447bccd5d35a6 100644 --- a/flang/include/flang/Parser/tools.h +++ b/flang/include/flang/Parser/tools.h @@ -250,5 +250,8 @@ template std::optional GetLastSource(A &x) { return GetSourceHelper::GetSource(const_cast(x)); } +// Checks whether the assignment statement has a single variable on the RHS. +bool CheckForSingleVariableOnRHS(const AssignmentStmt &); + } // namespace Fortran::parser #endif // FORTRAN_PARSER_TOOLS_H_ diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 3839bc1d2a215..1c1526d51c509 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -753,29 +753,7 @@ std::string GetCommonBlockObjectName(const Symbol &, bool underscoring); // Check for ambiguous USE associations bool HadUseError(SemanticsContext &, SourceName at, const Symbol *); -/// Checks if the assignment statement has a single variable on the RHS. 
-inline bool checkForSingleVariableOnRHS( - const Fortran::parser::AssignmentStmt &assignmentStmt) { - const Fortran::parser::Expr &expr{ - std::get(assignmentStmt.t)}; - const Fortran::common::Indirection *designator = - std::get_if>( - &expr.u); - return designator != nullptr; -} - -/// Checks if the symbol on the LHS is present in the RHS expression. -inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, - const Fortran::semantics::SomeExpr *rhs) { - auto lhsSyms{Fortran::evaluate::GetSymbolVector(*lhs)}; - const Fortran::semantics::Symbol &lhsSymbol{*lhsSyms.front()}; - for (const Fortran::semantics::Symbol &symbol : - Fortran::evaluate::GetSymbolVector(*rhs)) { - if (lhsSymbol == symbol) { - return true; - } - } - return false; -} +// Checks whether the symbol on the LHS is present in the RHS expression. +bool CheckForSymbolMatch(const SomeExpr *lhs,const SomeExpr *rhs); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 02dba22c29c7f..c10e1777614cd 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -653,8 +653,8 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch( + if (Fortran::parser::CheckForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::CheckForSymbolMatch( Fortran::semantics::GetExpr(stmt2Var), Fortran::semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ddb08f74b3841..e07f33671e728 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ 
-3199,8 +3199,8 @@ static void genAtomicCapture(lower::AbstractConverter &converter, firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(semantics::GetExpr(stmt2Var), + if (parser::CheckForSingleVariableOnRHS(stmt1)) { + if (semantics::CheckForSymbolMatch(semantics::GetExpr(stmt2Var), semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); diff --git a/flang/lib/Parser/tools.cpp b/flang/lib/Parser/tools.cpp index 6e5f1ed2fc66f..85f0858a8f147 100644 --- a/flang/lib/Parser/tools.cpp +++ b/flang/lib/Parser/tools.cpp @@ -174,4 +174,10 @@ const CoindexedNamedObject *GetCoindexedNamedObject( }, allocateObject.u); } + +bool CheckForSingleVariableOnRHS(const AssignmentStmt &assignmentStmt) { + const Expr &expr{std::get(assignmentStmt.t)}; + return std::holds_alternative>(expr.u); +} + } // namespace Fortran::parser diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bda0d62829506..ae5dca1b95f6e 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2922,9 +2922,9 @@ void OmpStructureChecker::CheckAtomicCaptureConstruct( const auto *e2 = GetExpr(context_, stmt2Expr); if (e1 && v1 && e2 && v2) { - if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (parser::CheckForSingleVariableOnRHS(stmt1)) { CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(v2, e2)) { + if (CheckForSymbolMatch(v2, e2)) { // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] CheckAtomicUpdateStmt(stmt2); } else { @@ -2936,8 +2936,8 @@ void OmpStructureChecker::CheckAtomicCaptureConstruct( "Captured variable/array element/derived-type 
component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, stmt1Expr.source); } - } else if (semantics::checkForSymbolMatch(v1, e1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { + } else if (CheckForSymbolMatch(v1, e1) && + parser::CheckForSingleVariableOnRHS(stmt2)) { // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] CheckAtomicUpdateStmt(stmt1); CheckAtomicCaptureStmt(stmt2); diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 1d1e3ac044166..d8e9385f6973c 100644 --- a/flang/lib/Semantics/tools.cpp +++ b/flang/lib/Semantics/tools.cpp @@ -1756,4 +1756,16 @@ bool HadUseError( } } +bool CheckForSymbolMatch(const SomeExpr *lhs, const SomeExpr *rhs) { + if (lhs && rhs) { + if (const Symbol *first{evaluate::GetFirstSymbol(*lhs)}) { + for (const Symbol &symbol : evaluate::GetSymbolVector(*rhs)) { + if (first == &symbol) { + return true; + } + } + } + } + return false; +} } // namespace Fortran::semantics ``````````
https://github.com/llvm/llvm-project/pull/142037 From flang-commits at lists.llvm.org Thu May 29 14:12:25 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 14:12:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang][NFC] Clean up code in two new functions (PR #142037) In-Reply-To: Message-ID: <6838cdb9.050a0220.2183f5.bf47@mx.google.com> github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command: ``````````bash git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- flang/include/flang/Parser/tools.h flang/include/flang/Semantics/tools.h flang/lib/Lower/OpenACC.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Parser/tools.cpp flang/lib/Semantics/check-omp-structure.cpp flang/lib/Semantics/tools.cpp ``````````
View the diff from clang-format here. ``````````diff diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 1c1526d51..4b2bb4fa1 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -754,6 +754,6 @@ std::string GetCommonBlockObjectName(const Symbol &, bool underscoring); bool HadUseError(SemanticsContext &, SourceName at, const Symbol *); // Checks whether the symbol on the LHS is present in the RHS expression. -bool CheckForSymbolMatch(const SomeExpr *lhs,const SomeExpr *rhs); +bool CheckForSymbolMatch(const SomeExpr *lhs, const SomeExpr *rhs); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ ``````````
https://github.com/llvm/llvm-project/pull/142037 From flang-commits at lists.llvm.org Thu May 29 14:14:25 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 29 May 2025 14:14:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang][NFC] Clean up code in two new functions (PR #142037) In-Reply-To: Message-ID: <6838ce31.050a0220.14b51.75a9@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/142037 >From 6a6edec1e09e9123a8b9d1e1e059b0beee6e4aaf Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Thu, 29 May 2025 14:13:41 -0700 Subject: [PATCH] [flang][NFC] Clean up code in two new functions Two recently-added functions in Semantics/tools.h need some cleaning up to conform to the coding style of the project. One of them should actually be in Parser/tools.{h,cpp}, the other doesn't need to be defined in the header. --- flang/include/flang/Parser/tools.h | 3 +++ flang/include/flang/Semantics/tools.h | 26 ++------------------- flang/lib/Lower/OpenACC.cpp | 4 ++-- flang/lib/Lower/OpenMP/OpenMP.cpp | 4 ++-- flang/lib/Parser/tools.cpp | 6 +++++ flang/lib/Semantics/check-omp-structure.cpp | 8 +++---- flang/lib/Semantics/tools.cpp | 12 ++++++++++ 7 files changed, 31 insertions(+), 32 deletions(-) diff --git a/flang/include/flang/Parser/tools.h b/flang/include/flang/Parser/tools.h index f1ead11734fa0..447bccd5d35a6 100644 --- a/flang/include/flang/Parser/tools.h +++ b/flang/include/flang/Parser/tools.h @@ -250,5 +250,8 @@ template std::optional GetLastSource(A &x) { return GetSourceHelper::GetSource(const_cast(x)); } +// Checks whether the assignment statement has a single variable on the RHS. 
+bool CheckForSingleVariableOnRHS(const AssignmentStmt &); + } // namespace Fortran::parser #endif // FORTRAN_PARSER_TOOLS_H_ diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 3839bc1d2a215..4b2bb4fa167f8 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -753,29 +753,7 @@ std::string GetCommonBlockObjectName(const Symbol &, bool underscoring); // Check for ambiguous USE associations bool HadUseError(SemanticsContext &, SourceName at, const Symbol *); -/// Checks if the assignment statement has a single variable on the RHS. -inline bool checkForSingleVariableOnRHS( - const Fortran::parser::AssignmentStmt &assignmentStmt) { - const Fortran::parser::Expr &expr{ - std::get(assignmentStmt.t)}; - const Fortran::common::Indirection *designator = - std::get_if>( - &expr.u); - return designator != nullptr; -} - -/// Checks if the symbol on the LHS is present in the RHS expression. -inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, - const Fortran::semantics::SomeExpr *rhs) { - auto lhsSyms{Fortran::evaluate::GetSymbolVector(*lhs)}; - const Fortran::semantics::Symbol &lhsSymbol{*lhsSyms.front()}; - for (const Fortran::semantics::Symbol &symbol : - Fortran::evaluate::GetSymbolVector(*rhs)) { - if (lhsSymbol == symbol) { - return true; - } - } - return false; -} +// Checks whether the symbol on the LHS is present in the RHS expression. 
+bool CheckForSymbolMatch(const SomeExpr *lhs, const SomeExpr *rhs); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 02dba22c29c7f..c10e1777614cd 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -653,8 +653,8 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter, firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch( + if (Fortran::parser::CheckForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::CheckForSymbolMatch( Fortran::semantics::GetExpr(stmt2Var), Fortran::semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ddb08f74b3841..e07f33671e728 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3199,8 +3199,8 @@ static void genAtomicCapture(lower::AbstractConverter &converter, firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); mlir::Block &block = atomicCaptureOp->getRegion(0).back(); firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(semantics::GetExpr(stmt2Var), + if (parser::CheckForSingleVariableOnRHS(stmt1)) { + if (semantics::CheckForSymbolMatch(semantics::GetExpr(stmt2Var), semantics::GetExpr(stmt2Expr))) { // Atomic capture construct is of the form [capture-stmt, update-stmt] const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); diff --git a/flang/lib/Parser/tools.cpp b/flang/lib/Parser/tools.cpp index 6e5f1ed2fc66f..85f0858a8f147 100644 --- a/flang/lib/Parser/tools.cpp +++ b/flang/lib/Parser/tools.cpp 
@@ -174,4 +174,10 @@ const CoindexedNamedObject *GetCoindexedNamedObject( }, allocateObject.u); } + +bool CheckForSingleVariableOnRHS(const AssignmentStmt &assignmentStmt) { + const Expr &expr{std::get(assignmentStmt.t)}; + return std::holds_alternative>(expr.u); +} + } // namespace Fortran::parser diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bda0d62829506..ae5dca1b95f6e 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2922,9 +2922,9 @@ void OmpStructureChecker::CheckAtomicCaptureConstruct( const auto *e2 = GetExpr(context_, stmt2Expr); if (e1 && v1 && e2 && v2) { - if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (parser::CheckForSingleVariableOnRHS(stmt1)) { CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(v2, e2)) { + if (CheckForSymbolMatch(v2, e2)) { // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] CheckAtomicUpdateStmt(stmt2); } else { @@ -2936,8 +2936,8 @@ void OmpStructureChecker::CheckAtomicCaptureConstruct( "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, stmt1Expr.source); } - } else if (semantics::checkForSymbolMatch(v1, e1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { + } else if (CheckForSymbolMatch(v1, e1) && + parser::CheckForSingleVariableOnRHS(stmt2)) { // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] CheckAtomicUpdateStmt(stmt1); CheckAtomicCaptureStmt(stmt2); diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 1d1e3ac044166..d8e9385f6973c 100644 --- a/flang/lib/Semantics/tools.cpp +++ b/flang/lib/Semantics/tools.cpp @@ -1756,4 +1756,16 @@ bool HadUseError( } } +bool CheckForSymbolMatch(const SomeExpr *lhs, const SomeExpr *rhs) { + if (lhs && rhs) { + if (const Symbol *first{evaluate::GetFirstSymbol(*lhs)}) { + for 
(const Symbol &symbol : evaluate::GetSymbolVector(*rhs)) { + if (first == &symbol) { + return true; + } + } + } + } + return false; +} } // namespace Fortran::semantics From flang-commits at lists.llvm.org Thu May 29 17:51:02 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 29 May 2025 17:51:02 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <683900f6.170a0220.1442d5.c70e@mx.google.com> https://github.com/tarunprabhu edited https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Thu May 29 17:51:02 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 29 May 2025 17:51:02 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <683900f6.170a0220.33fd92.05de@mx.google.com> https://github.com/tarunprabhu commented: Thanks for the changes, Abid. Regarding my suggestions in the documentation, feel free to incorporate or discard them as you see fit. I'll leave it to you and @kiranchandramohan to decide what the name of the environment variable should be. Other than that, this looks good. https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Thu May 29 17:51:02 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Thu, 29 May 2025 17:51:02 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <683900f6.170a0220.22c2a8.c84a@mx.google.com> ================ @@ -614,3 +614,30 @@ nvfortran defines `-fast` as - `-Mcache_align`: there is no equivalent flag in Flang or Clang. - `-Mflushz`: flush-to-zero mode - when `-ffast-math` is specified, Flang will link to `crtfastmath.o` to ensure denormal numbers are flushed to zero. 
+ + +## FCC_OVERRIDE_OPTIONS + +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of +edits to the input argument lists. The value of this environment variable is +a space separated list of edits to perform. These edits are applied in order to +the input argument lists. Edits should be one of the following forms: ---------------- tarunprabhu wrote: What do you think of this? ```suggestion The environment variable `FCC_OVERRIDE_OPTIONS` can be used to edit flang's command line arguments. The value of this variable is a space-separated list of edits to perform. The edits are applied in the order in which they appear in `FCC_OVERRIDE_OPTIONS`. Each edit should be one of the following forms: ``` https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Thu May 29 19:04:17 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Thu, 29 May 2025 19:04:17 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68391221.170a0220.186f8c.cb72@mx.google.com> snarang181 wrote: @kiranchandramohan @tarunprabhu, what do you think about modifying `index.md` in the following manner? This would load things on demand, so we should not hit the `nonexistent document` errors. ``` ```{eval} import os if os.path.exists('FlangCommandLineReference.rst'): print('FlangCommandLineReference') ``` https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Thu May 29 19:48:31 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Thu, 29 May 2025 19:48:31 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [mlir][flang] Added Weighted[Region]BranchOpInterface's. 
(PR #142079) Message-ID: https://github.com/vzakhari created https://github.com/llvm/llvm-project/pull/142079 The new interfaces provide getters and setters for the weight information about the branches of BranchOpInterface and RegionBranchOpInterface operations. These interfaces are done the same way as LLVM dialect's BranchWeightOpInterface. The plan is to produce this information in Flang, e.g. mark most probably "cold" code as such and allow LLVM to order basic blocks accordingly. An example of such a code is copy loops generated for arrays repacking - we can mark it as "cold" assuming that the copy will not happen dynamically. If the copy actually happens the overhead of the copy is probably high enough so that we may not care about the little overhead of jumping to the "cold" code and fetching it. >From f258ed9be16829b6d5c9261c1a0b153c697271e7 Mon Sep 17 00:00:00 2001 From: Slava Zakharin Date: Thu, 29 May 2025 19:09:16 -0700 Subject: [PATCH] [mlir][flang] Added Weighted[Region]BranchOpInterface's. The new interfaces provide getters and setters for the weight information about the branches of BranchOpInterface and RegionBranchOpInterface operations. These interfaces are done the same way as LLVM dialect's BranchWeightOpInterface. The plan is to produce this information in Flang, e.g. mark most probably "cold" code as such and allow LLVM to order basic blocks accordingly. An example of such a code is copy loops generated for arrays repacking - we can mark it as "cold" assuming that the copy will not happen dynamically. If the copy actually happens the overhead of the copy is probably high enough so that we may not care about the little overhead of jumping to the "cold" code and fetching it. 
--- .../include/flang/Optimizer/Dialect/FIROps.td | 18 ++- flang/lib/Optimizer/Dialect/FIROps.cpp | 21 +++- .../Transforms/ControlFlowConverter.cpp | 4 +- flang/test/Fir/cfg-conversion-if.fir | 46 ++++++++ flang/test/Fir/fir-ops.fir | 16 +++ flang/test/Fir/invalid.fir | 37 ++++++ .../Dialect/ControlFlow/IR/ControlFlowOps.td | 34 +++--- .../mlir/Interfaces/ControlFlowInterfaces.h | 20 ++++ .../mlir/Interfaces/ControlFlowInterfaces.td | 107 ++++++++++++++++++ .../ControlFlowToLLVM/ControlFlowToLLVM.cpp | 6 +- mlir/lib/Interfaces/ControlFlowInterfaces.cpp | 49 ++++++++ .../Conversion/ControlFlowToLLVM/branch.mlir | 14 +++ mlir/test/Dialect/ControlFlow/invalid.mlir | 36 ++++++ mlir/test/Dialect/ControlFlow/ops.mlir | 10 ++ 14 files changed, 396 insertions(+), 22 deletions(-) create mode 100644 flang/test/Fir/cfg-conversion-if.fir diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td index f4b17ef7eed09..7001e25a9bcda 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.td +++ b/flang/include/flang/Optimizer/Dialect/FIROps.td @@ -2323,9 +2323,13 @@ def fir_DoLoopOp : region_Op<"do_loop", [AttrSizedOperandSegments, }]; } -def fir_IfOp : region_Op<"if", [DeclareOpInterfaceMethods, RecursiveMemoryEffects, - NoRegionArguments]> { +def fir_IfOp + : region_Op< + "if", [DeclareOpInterfaceMethods< + RegionBranchOpInterface, ["getRegionInvocationBounds", + "getEntrySuccessorRegions"]>, + RecursiveMemoryEffects, NoRegionArguments, + WeightedRegionBranchOpInterface]> { let summary = "if-then-else conditional operation"; let description = [{ Used to conditionally execute operations. 
This operation is the FIR @@ -2342,7 +2346,8 @@ def fir_IfOp : region_Op<"if", [DeclareOpInterfaceMethods:$region_weights); let results = (outs Variadic:$results); let regions = (region @@ -2371,6 +2376,11 @@ def fir_IfOp : region_Op<"if", [DeclareOpInterfaceMethods &results, unsigned resultNum); + + /// Returns the display name string for the region_weights attribute. + static constexpr llvm::StringRef getWeightsAttrAssemblyName() { + return "weights"; + } }]; } diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index cbe93907265f6..2949120894132 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -4418,6 +4418,19 @@ mlir::ParseResult fir::IfOp::parse(mlir::OpAsmParser &parser, parser.resolveOperand(cond, i1Type, result.operands)) return mlir::failure(); + if (mlir::succeeded( + parser.parseOptionalKeyword(getWeightsAttrAssemblyName()))) { + if (parser.parseLParen()) + return mlir::failure(); + mlir::DenseI32ArrayAttr weights; + if (parser.parseCustomAttributeWithFallback(weights, mlir::Type{})) + return mlir::failure(); + if (weights) + result.addAttribute(getRegionWeightsAttrName(result.name), weights); + if (parser.parseRParen()) + return mlir::failure(); + } + if (parser.parseOptionalArrowTypeList(result.types)) return mlir::failure(); @@ -4449,6 +4462,11 @@ llvm::LogicalResult fir::IfOp::verify() { void fir::IfOp::print(mlir::OpAsmPrinter &p) { bool printBlockTerminators = false; p << ' ' << getCondition(); + if (auto weights = getRegionWeightsAttr()) { + p << ' ' << getWeightsAttrAssemblyName() << '('; + p.printStrippedAttrOrType(weights); + p << ')'; + } if (!getResults().empty()) { p << " -> (" << getResultTypes() << ')'; printBlockTerminators = true; @@ -4464,7 +4482,8 @@ void fir::IfOp::print(mlir::OpAsmPrinter &p) { p.printRegion(otherReg, /*printEntryBlockArgs=*/false, printBlockTerminators); } - p.printOptionalAttrDict((*this)->getAttrs()); + 
p.printOptionalAttrDict((*this)->getAttrs(), + /*elideAttrs=*/{getRegionWeightsAttrName()}); } void fir::IfOp::resultToSourceOps(llvm::SmallVectorImpl &results, diff --git a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp index 8a9e9b80134b8..5256ef8d53d85 100644 --- a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp +++ b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp @@ -212,9 +212,11 @@ class CfgIfConv : public mlir::OpRewritePattern { } rewriter.setInsertionPointToEnd(condBlock); - rewriter.create( + auto branchOp = rewriter.create( loc, ifOp.getCondition(), ifOpBlock, llvm::ArrayRef(), otherwiseBlock, llvm::ArrayRef()); + if (auto weights = ifOp.getRegionWeightsOrNull()) + branchOp.setBranchWeights(weights); rewriter.replaceOp(ifOp, continueBlock->getArguments()); return success(); } diff --git a/flang/test/Fir/cfg-conversion-if.fir b/flang/test/Fir/cfg-conversion-if.fir new file mode 100644 index 0000000000000..1e30ee8e64f02 --- /dev/null +++ b/flang/test/Fir/cfg-conversion-if.fir @@ -0,0 +1,46 @@ +// RUN: fir-opt --split-input-file --cfg-conversion %s | FileCheck %s + +func.func private @callee() -> none + +// CHECK-LABEL: func.func @if_then( +// CHECK-SAME: %[[ARG0:.*]]: i1) { +// CHECK: cf.cond_br %[[ARG0]] weights([10, 90]), ^bb1, ^bb2 +// CHECK: ^bb1: +// CHECK: %[[VAL_0:.*]] = fir.call @callee() : () -> none +// CHECK: cf.br ^bb2 +// CHECK: ^bb2: +// CHECK: return +// CHECK: } +func.func @if_then(%cond: i1) { + fir.if %cond weights([10, 90]) { + fir.call @callee() : () -> none + } + return +} + +// ----- + +// CHECK-LABEL: func.func @if_then_else( +// CHECK-SAME: %[[ARG0:.*]]: i1) -> i32 { +// CHECK: %[[VAL_0:.*]] = arith.constant 0 : i32 +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : i32 +// CHECK: cf.cond_br %[[ARG0]] weights([90, 10]), ^bb1, ^bb2 +// CHECK: ^bb1: +// CHECK: cf.br ^bb3(%[[VAL_0]] : i32) +// CHECK: ^bb2: +// CHECK: cf.br ^bb3(%[[VAL_1]] : i32) +// CHECK: 
^bb3(%[[VAL_2:.*]]: i32): +// CHECK: cf.br ^bb4 +// CHECK: ^bb4: +// CHECK: return %[[VAL_2]] : i32 +// CHECK: } +func.func @if_then_else(%cond: i1) -> i32 { + %c0 = arith.constant 0 : i32 + %c1 = arith.constant 1 : i32 + %result = fir.if %cond weights([90, 10]) -> i32 { + fir.result %c0 : i32 + } else { + fir.result %c1 : i32 + } + return %result : i32 +} diff --git a/flang/test/Fir/fir-ops.fir b/flang/test/Fir/fir-ops.fir index 9c444d2f4e0bc..3585bf9efca3e 100644 --- a/flang/test/Fir/fir-ops.fir +++ b/flang/test/Fir/fir-ops.fir @@ -1015,3 +1015,19 @@ func.func @test_box_total_elements(%arg0: !fir.class> %6 = arith.addi %2, %5 : index return %6 : index } + +// CHECK-LABEL: func.func @test_if_weights( +// CHECK-SAME: %[[ARG0:.*]]: i1) { +func.func @test_if_weights(%cond: i1) { +// CHECK: fir.if %[[ARG0]] weights([99, 1]) { +// CHECK: } + fir.if %cond weights([99, 1]) { + } +// CHECK: fir.if %[[ARG0]] weights([99, 1]) { +// CHECK: } else { +// CHECK: } + fir.if %cond weights ([99,1]) { + } else { + } + return +} diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index fd607fd9066f7..0391cdbef71e5 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -1385,3 +1385,40 @@ fir.local {type = local_init} @x.localizer : f32 init { ^bb0(%arg0: f32, %arg1: f32): fir.yield(%arg0 : f32) } + +// ----- + +func.func @wrong_weights_number_in_if_then(%cond: i1) { +// expected-error @below {{number of weights (1) does not match the number of regions (2)}} + fir.if %cond weights([50]) { + } + return +} + +// ----- + +func.func @wrong_weights_number_in_if_then_else(%cond: i1) { +// expected-error @below {{number of weights (3) does not match the number of regions (2)}} + fir.if %cond weights([50, 40, 10]) { + } else { + } + return +} + +// ----- + +func.func @negative_weight_in_if_then(%cond: i1) { +// expected-error @below {{weight #0 must be non-negative}} + fir.if %cond weights([-1, 101]) { + } + return +} + +// ----- + +func.func 
@wrong_total_weight_in_if_then(%cond: i1) { +// expected-error @below {{total weight 101 is not 100}} + fir.if %cond weights([1, 100]) { + } + return +} diff --git a/mlir/include/mlir/Dialect/ControlFlow/IR/ControlFlowOps.td b/mlir/include/mlir/Dialect/ControlFlow/IR/ControlFlowOps.td index 48f12b46a57f1..79da81ba049dd 100644 --- a/mlir/include/mlir/Dialect/ControlFlow/IR/ControlFlowOps.td +++ b/mlir/include/mlir/Dialect/ControlFlow/IR/ControlFlowOps.td @@ -112,10 +112,11 @@ def BranchOp : CF_Op<"br", [ // CondBranchOp //===----------------------------------------------------------------------===// -def CondBranchOp : CF_Op<"cond_br", - [AttrSizedOperandSegments, - DeclareOpInterfaceMethods<BranchOpInterface, ["getSuccessorForOperands"]>, - Pure, Terminator]> { +def CondBranchOp + : CF_Op<"cond_br", [AttrSizedOperandSegments, + DeclareOpInterfaceMethods< + BranchOpInterface, ["getSuccessorForOperands"]>, + WeightedBranchOpInterface, Pure, Terminator]> { let summary = "Conditional branch operation"; let description = [{ The `cf.cond_br` terminator operation represents a conditional branch on a @@ -144,20 +145,23 @@ def CondBranchOp : CF_Op<"cond_br", ``` }]; - let arguments = (ins I1:$condition, - Variadic<AnyType>:$trueDestOperands, - Variadic<AnyType>:$falseDestOperands); + let arguments = (ins I1:$condition, Variadic<AnyType>:$trueDestOperands, + Variadic<AnyType>:$falseDestOperands, + OptionalAttr<DenseI32ArrayAttr>:$branch_weights); let successors = (successor AnySuccessor:$trueDest, AnySuccessor:$falseDest); - let builders = [ - OpBuilder<(ins "Value":$condition, "Block *":$trueDest, - "ValueRange":$trueOperands, "Block *":$falseDest, - "ValueRange":$falseOperands), [{ - build($_builder, $_state, condition, trueOperands, falseOperands, trueDest, + let builders = [OpBuilder<(ins "Value":$condition, "Block *":$trueDest, + "ValueRange":$trueOperands, + "Block *":$falseDest, + "ValueRange":$falseOperands), + [{ + build($_builder, $_state, condition, trueOperands, falseOperands, /*branch_weights=*/{}, trueDest, falseDest); }]>, - OpBuilder<(ins "Value":$condition, 
"Block *":$trueDest, - "Block *":$falseDest, CArg<"ValueRange", "{}">:$falseOperands), [{ + OpBuilder<(ins "Value":$condition, "Block *":$trueDest, + "Block *":$falseDest, + CArg<"ValueRange", "{}">:$falseOperands), + [{ build($_builder, $_state, condition, trueDest, ValueRange(), falseDest, falseOperands); }]>]; @@ -216,7 +220,7 @@ def CondBranchOp : CF_Op<"cond_br", let hasCanonicalizer = 1; let assemblyFormat = [{ - $condition `,` + $condition (`weights` `(` $branch_weights^ `)` )? `,` $trueDest (`(` $trueDestOperands^ `:` type($trueDestOperands) `)`)? `,` $falseDest (`(` $falseDestOperands^ `:` type($falseDestOperands) `)`)? attr-dict diff --git a/mlir/include/mlir/Interfaces/ControlFlowInterfaces.h b/mlir/include/mlir/Interfaces/ControlFlowInterfaces.h index 7f6967f11444f..d63800c12d132 100644 --- a/mlir/include/mlir/Interfaces/ControlFlowInterfaces.h +++ b/mlir/include/mlir/Interfaces/ControlFlowInterfaces.h @@ -142,6 +142,26 @@ LogicalResult verifyBranchSuccessorOperands(Operation *op, unsigned succNo, const SuccessorOperands &operands); } // namespace detail +//===----------------------------------------------------------------------===// +// WeightedBranchOpInterface +//===----------------------------------------------------------------------===// + +namespace detail { +/// Verify that the branch weights attached to an operation +/// implementing WeightedBranchOpInterface are correct. +LogicalResult verifyBranchWeights(Operation *op); +} // namespace detail + +//===----------------------------------------------------------------------===// +// WeightedRegionBranchOpInterface +//===----------------------------------------------------------------------===// + +namespace detail { +/// Verify that the region weights attached to an operation +/// implementing WeightedRegionBranchOpInterface are correct. 
+LogicalResult verifyRegionBranchWeights(Operation *op); +} // namespace detail + //===----------------------------------------------------------------------===// // RegionBranchOpInterface //===----------------------------------------------------------------------===// diff --git a/mlir/include/mlir/Interfaces/ControlFlowInterfaces.td b/mlir/include/mlir/Interfaces/ControlFlowInterfaces.td index 69bce78e946c8..7a47b686ac7d1 100644 --- a/mlir/include/mlir/Interfaces/ControlFlowInterfaces.td +++ b/mlir/include/mlir/Interfaces/ControlFlowInterfaces.td @@ -375,6 +375,113 @@ def SelectLikeOpInterface : OpInterface<"SelectLikeOpInterface"> { ]; } +//===----------------------------------------------------------------------===// +// WeightedBranchOpInterface +//===----------------------------------------------------------------------===// + +def WeightedBranchOpInterface : OpInterface<"WeightedBranchOpInterface"> { + let description = [{ + This interface provides weight information for branching terminator + operations, i.e. terminator operations with successors. + + This interface provides methods for getting/setting integer non-negative + weight of each branch in the range from 0 to 100. The sum of weights + must be 100. The number of weights must match the number of successors + of the operation. + + The weights specify the probability (in percents) of taking + a particular branch. + + The default implementations of the methods expect the operation + to have an attribute of type DenseI32ArrayAttr named branch_weights. 
+ }]; + let cppNamespace = "::mlir"; + + let methods = [InterfaceMethod< + /*desc=*/"Returns the branch weights attribute or nullptr", + /*returnType=*/"::mlir::DenseI32ArrayAttr", + /*methodName=*/"getBranchWeightsOrNull", + /*args=*/(ins), + /*methodBody=*/[{}], + /*defaultImpl=*/[{ + auto op = cast<ConcreteOp>(this->getOperation()); + return op.getBranchWeightsAttr(); + }]>, + InterfaceMethod< + /*desc=*/"Sets the branch weights attribute", + /*returnType=*/"void", + /*methodName=*/"setBranchWeights", + /*args=*/(ins "::mlir::DenseI32ArrayAttr":$attr), + /*methodBody=*/[{}], + /*defaultImpl=*/[{ + auto op = cast<ConcreteOp>(this->getOperation()); + op.setBranchWeightsAttr(attr); + }]>, + ]; + + let verify = [{ + return ::mlir::detail::verifyBranchWeights($_op); + }]; +} + +//===----------------------------------------------------------------------===// +// WeightedRegionBranchOpInterface +//===----------------------------------------------------------------------===// + +// TODO: the probabilities of entering a particular region seem +// to correlate with the values returned by +// RegionBranchOpInterface::invocationBounds(), and we should probably +// verify that the values are consistent. In that case, should +// WeightedRegionBranchOpInterface extend RegionBranchOpInterface? +def WeightedRegionBranchOpInterface + : OpInterface<"WeightedRegionBranchOpInterface"> { + let description = [{ + This interface provides weight information for region operations + that exhibit branching behavior between held regions. + + This interface provides methods for getting/setting integer non-negative + weight of each branch in the range from 0 to 100. The sum of weights + must be 100. The number of weights must match the number of regions + held by the operation (including empty regions). + + The weights specify the probability (in percents) of branching + to a particular region when first executing the operation. 
+ For example, for loop-like operations with a single region + the weight specifies the probability of entering the loop. + In this case, the weight must be either 0 or 100. + + The default implementations of the methods expect the operation + to have an attribute of type DenseI32ArrayAttr named region_weights. + }]; + let cppNamespace = "::mlir"; + + let methods = [InterfaceMethod< + /*desc=*/"Returns the region weights attribute or nullptr", + /*returnType=*/"::mlir::DenseI32ArrayAttr", + /*methodName=*/"getRegionWeightsOrNull", + /*args=*/(ins), + /*methodBody=*/[{}], + /*defaultImpl=*/[{ + auto op = cast<ConcreteOp>(this->getOperation()); + return op.getRegionWeightsAttr(); + }]>, + InterfaceMethod< + /*desc=*/"Sets the region weights attribute", + /*returnType=*/"void", + /*methodName=*/"setRegionWeights", + /*args=*/(ins "::mlir::DenseI32ArrayAttr":$attr), + /*methodBody=*/[{}], + /*defaultImpl=*/[{ + auto op = cast<ConcreteOp>(this->getOperation()); + op.setRegionWeightsAttr(attr); + }]>, + ]; + + let verify = [{ + return ::mlir::detail::verifyRegionBranchWeights($_op); + }]; +} + //===----------------------------------------------------------------------===// // ControlFlow Traits //===----------------------------------------------------------------------===// diff --git a/mlir/lib/Conversion/ControlFlowToLLVM/ControlFlowToLLVM.cpp b/mlir/lib/Conversion/ControlFlowToLLVM/ControlFlowToLLVM.cpp index debfd003bd5b5..12769e486a3c7 100644 --- a/mlir/lib/Conversion/ControlFlowToLLVM/ControlFlowToLLVM.cpp +++ b/mlir/lib/Conversion/ControlFlowToLLVM/ControlFlowToLLVM.cpp @@ -166,10 +166,14 @@ struct CondBranchOpLowering : public ConvertOpToLLVMPattern<cf::CondBranchOp> { TypeRange(adaptor.getFalseDestOperands())); if (failed(convertedFalseBlock)) return failure(); - Operation *newOp = rewriter.replaceOpWithNewOp<LLVM::CondBrOp>( + auto newOp = rewriter.replaceOpWithNewOp<LLVM::CondBrOp>( op, adaptor.getCondition(), *convertedTrueBlock, adaptor.getTrueDestOperands(), *convertedFalseBlock, adaptor.getFalseDestOperands()); + if (auto 
weights = op.getBranchWeightsOrNull()) { + newOp.setBranchWeights(weights); + op.removeBranchWeightsAttr(); + } // TODO: We should not just forward all attributes like that. But there are // existing Flang tests that depend on this behavior. newOp->setAttrs(op->getAttrDictionary()); diff --git a/mlir/lib/Interfaces/ControlFlowInterfaces.cpp b/mlir/lib/Interfaces/ControlFlowInterfaces.cpp index 2ae334b517a31..e587e8f1af178 100644 --- a/mlir/lib/Interfaces/ControlFlowInterfaces.cpp +++ b/mlir/lib/Interfaces/ControlFlowInterfaces.cpp @@ -80,6 +80,55 @@ detail::verifyBranchSuccessorOperands(Operation *op, unsigned succNo, return success(); } +//===----------------------------------------------------------------------===// +// WeightedBranchOpInterface +//===----------------------------------------------------------------------===// + +LogicalResult detail::verifyBranchWeights(Operation *op) { + auto weights = cast<WeightedBranchOpInterface>(op).getBranchWeightsOrNull(); + if (weights) { + if (weights.size() != op->getNumSuccessors()) + return op->emitError() << "number of weights (" << weights.size() + << ") does not match the number of successors (" + << op->getNumSuccessors() << ")"; + int32_t total = 0; + for (auto weight : llvm::enumerate(weights.asArrayRef())) { + if (weight.value() < 0) + return op->emitError() + << "weight #" << weight.index() << " must be non-negative"; + total += weight.value(); + } + if (total != 100) + return op->emitError() << "total weight " << total << " is not 100"; + } + return mlir::success(); +} + +//===----------------------------------------------------------------------===// +// WeightedRegionBranchOpInterface +//===----------------------------------------------------------------------===// + +LogicalResult detail::verifyRegionBranchWeights(Operation *op) { + auto weights = + cast<WeightedRegionBranchOpInterface>(op).getRegionWeightsOrNull(); + if (weights) { + if (weights.size() != op->getNumRegions()) + return op->emitError() << "number of weights (" << weights.size() + << ") does not 
match the number of regions (" + << op->getNumRegions() << ")"; + int32_t total = 0; + for (auto weight : llvm::enumerate(weights.asArrayRef())) { + if (weight.value() < 0) + return op->emitError() + << "weight #" << weight.index() << " must be non-negative"; + total += weight.value(); + } + if (total != 100) + return op->emitError() << "total weight " << total << " is not 100"; + } + return mlir::success(); +} + //===----------------------------------------------------------------------===// // RegionBranchOpInterface //===----------------------------------------------------------------------===// diff --git a/mlir/test/Conversion/ControlFlowToLLVM/branch.mlir b/mlir/test/Conversion/ControlFlowToLLVM/branch.mlir index 9a0f2b7714544..7c78211d59010 100644 --- a/mlir/test/Conversion/ControlFlowToLLVM/branch.mlir +++ b/mlir/test/Conversion/ControlFlowToLLVM/branch.mlir @@ -67,3 +67,17 @@ func.func @unreachable_block() { ^bb1(%arg0: index): cf.br ^bb1(%arg0 : index) } + +// ----- + +// Test case for cf.cond_br with weights. 
+ +// CHECK-LABEL: func.func @cf_cond_br_with_weights( +func.func @cf_cond_br_with_weights(%cond: i1, %a: index, %b: index) -> index { +// CHECK: llvm.cond_br %{{.*}} weights([90, 10]), ^bb1(%{{.*}} : i64), ^bb2(%{{.*}} : i64) + cf.cond_br %cond, ^bb1(%a : index), ^bb2(%b : index) {branch_weights = array<i32: 90, 10>} +^bb1(%arg1: index): + return %arg1 : index +^bb2(%arg2: index): + return %arg2 : index +} diff --git a/mlir/test/Dialect/ControlFlow/invalid.mlir b/mlir/test/Dialect/ControlFlow/invalid.mlir index b51d8095c9974..6024c6d55ac64 100644 --- a/mlir/test/Dialect/ControlFlow/invalid.mlir +++ b/mlir/test/Dialect/ControlFlow/invalid.mlir @@ -67,3 +67,39 @@ func.func @switch_missing_default(%flag : i32, %caseOperand : i32) { ^bb3(%bb3arg : i32): return } + +// ----- + +// CHECK-LABEL: func @wrong_weights_number +func.func @wrong_weights_number(%cond: i1) { + // expected-error@+1 {{number of weights (1) does not match the number of successors (2)}} + cf.cond_br %cond weights([100]), ^bb1, ^bb2 + ^bb1: + return + ^bb2: + return +} + +// ----- + +// CHECK-LABEL: func @negative_weight +func.func @negative_weight(%cond: i1) { + // expected-error@+1 {{weight #0 must be non-negative}} + cf.cond_br %cond weights([-1, 101]), ^bb1, ^bb2 + ^bb1: + return + ^bb2: + return +} + +// ----- + +// CHECK-LABEL: func @wrong_total_weight +func.func @wrong_total_weight(%cond: i1) { + // expected-error@+1 {{total weight 101 is not 100}} + cf.cond_br %cond weights([100, 1]), ^bb1, ^bb2 + ^bb1: + return + ^bb2: + return +} diff --git a/mlir/test/Dialect/ControlFlow/ops.mlir b/mlir/test/Dialect/ControlFlow/ops.mlir index c9317c7613972..160534240e0fa 100644 --- a/mlir/test/Dialect/ControlFlow/ops.mlir +++ b/mlir/test/Dialect/ControlFlow/ops.mlir @@ -51,3 +51,13 @@ func.func @switch_result_number(%arg0: i32) { ^bb2: return } + +// CHECK-LABEL: func @cond_weights +func.func @cond_weights(%cond: i1) { +// CHECK: cf.cond_br %{{.*}} weights([60, 40]), ^{{.*}}, ^{{.*}} + cf.cond_br %cond 
weights([60, 40]), ^bb1, ^bb2 + ^bb1: + return + ^bb2: + return +} From flang-commits at lists.llvm.org Thu May 29 19:49:04 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 19:49:04 -0700 (PDT) Subject: [flang-commits] [flang] [mlir] [mlir][flang] Added Weighted[Region]BranchOpInterface's. (PR #142079) In-Reply-To: Message-ID: <68391ca0.170a0220.39ed5d.c8a1@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-mlir Author: Slava Zakharin (vzakhari)
Changes The new interfaces provide getters and setters for the weight information about the branches of BranchOpInterface and RegionBranchOpInterface operations. These interfaces are implemented the same way as the LLVM dialect's BranchWeightOpInterface. The plan is to produce this information in Flang, e.g. to mark code that is most likely "cold" as such and allow LLVM to order basic blocks accordingly. An example of such code is the copy loops generated for array repacking: we can mark them as "cold", assuming that the copy will not happen dynamically. If the copy actually happens, its overhead is probably high enough that we may not care about the small extra cost of jumping to the "cold" code and fetching it. --- Patch is 23.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/142079.diff 14 Files Affected: - (modified) flang/include/flang/Optimizer/Dialect/FIROps.td (+14-4) - (modified) flang/lib/Optimizer/Dialect/FIROps.cpp (+20-1) - (modified) flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp (+3-1) - (added) flang/test/Fir/cfg-conversion-if.fir (+46) - (modified) flang/test/Fir/fir-ops.fir (+16) - (modified) flang/test/Fir/invalid.fir (+37) - (modified) mlir/include/mlir/Dialect/ControlFlow/IR/ControlFlowOps.td (+19-15) - (modified) mlir/include/mlir/Interfaces/ControlFlowInterfaces.h (+20) - (modified) mlir/include/mlir/Interfaces/ControlFlowInterfaces.td (+107) - (modified) mlir/lib/Conversion/ControlFlowToLLVM/ControlFlowToLLVM.cpp (+5-1) - (modified) mlir/lib/Interfaces/ControlFlowInterfaces.cpp (+49) - (modified) mlir/test/Conversion/ControlFlowToLLVM/branch.mlir (+14) - (modified) mlir/test/Dialect/ControlFlow/invalid.mlir (+36) - (modified) mlir/test/Dialect/ControlFlow/ops.mlir (+10) ``````````diff diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td index f4b17ef7eed09..7001e25a9bcda 100644 --- a/flang/include/flang/Optimizer/Dialect/FIROps.td +++ 
b/flang/include/flang/Optimizer/Dialect/FIROps.td @@ -2323,9 +2323,13 @@ def fir_DoLoopOp : region_Op<"do_loop", [AttrSizedOperandSegments, }]; } -def fir_IfOp : region_Op<"if", [DeclareOpInterfaceMethods, RecursiveMemoryEffects, - NoRegionArguments]> { +def fir_IfOp + : region_Op< + "if", [DeclareOpInterfaceMethods< + RegionBranchOpInterface, ["getRegionInvocationBounds", + "getEntrySuccessorRegions"]>, + RecursiveMemoryEffects, NoRegionArguments, + WeightedRegionBranchOpInterface]> { let summary = "if-then-else conditional operation"; let description = [{ Used to conditionally execute operations. This operation is the FIR @@ -2342,7 +2346,8 @@ def fir_IfOp : region_Op<"if", [DeclareOpInterfaceMethods:$region_weights); let results = (outs Variadic:$results); let regions = (region @@ -2371,6 +2376,11 @@ def fir_IfOp : region_Op<"if", [DeclareOpInterfaceMethods &results, unsigned resultNum); + + /// Returns the display name string for the region_weights attribute. + static constexpr llvm::StringRef getWeightsAttrAssemblyName() { + return "weights"; + } }]; } diff --git a/flang/lib/Optimizer/Dialect/FIROps.cpp b/flang/lib/Optimizer/Dialect/FIROps.cpp index cbe93907265f6..2949120894132 100644 --- a/flang/lib/Optimizer/Dialect/FIROps.cpp +++ b/flang/lib/Optimizer/Dialect/FIROps.cpp @@ -4418,6 +4418,19 @@ mlir::ParseResult fir::IfOp::parse(mlir::OpAsmParser &parser, parser.resolveOperand(cond, i1Type, result.operands)) return mlir::failure(); + if (mlir::succeeded( + parser.parseOptionalKeyword(getWeightsAttrAssemblyName()))) { + if (parser.parseLParen()) + return mlir::failure(); + mlir::DenseI32ArrayAttr weights; + if (parser.parseCustomAttributeWithFallback(weights, mlir::Type{})) + return mlir::failure(); + if (weights) + result.addAttribute(getRegionWeightsAttrName(result.name), weights); + if (parser.parseRParen()) + return mlir::failure(); + } + if (parser.parseOptionalArrowTypeList(result.types)) return mlir::failure(); @@ -4449,6 +4462,11 @@ llvm::LogicalResult 
fir::IfOp::verify() { void fir::IfOp::print(mlir::OpAsmPrinter &p) { bool printBlockTerminators = false; p << ' ' << getCondition(); + if (auto weights = getRegionWeightsAttr()) { + p << ' ' << getWeightsAttrAssemblyName() << '('; + p.printStrippedAttrOrType(weights); + p << ')'; + } if (!getResults().empty()) { p << " -> (" << getResultTypes() << ')'; printBlockTerminators = true; @@ -4464,7 +4482,8 @@ void fir::IfOp::print(mlir::OpAsmPrinter &p) { p.printRegion(otherReg, /*printEntryBlockArgs=*/false, printBlockTerminators); } - p.printOptionalAttrDict((*this)->getAttrs()); + p.printOptionalAttrDict((*this)->getAttrs(), + /*elideAttrs=*/{getRegionWeightsAttrName()}); } void fir::IfOp::resultToSourceOps(llvm::SmallVectorImpl &results, diff --git a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp index 8a9e9b80134b8..5256ef8d53d85 100644 --- a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp +++ b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp @@ -212,9 +212,11 @@ class CfgIfConv : public mlir::OpRewritePattern { } rewriter.setInsertionPointToEnd(condBlock); - rewriter.create( + auto branchOp = rewriter.create( loc, ifOp.getCondition(), ifOpBlock, llvm::ArrayRef(), otherwiseBlock, llvm::ArrayRef()); + if (auto weights = ifOp.getRegionWeightsOrNull()) + branchOp.setBranchWeights(weights); rewriter.replaceOp(ifOp, continueBlock->getArguments()); return success(); } diff --git a/flang/test/Fir/cfg-conversion-if.fir b/flang/test/Fir/cfg-conversion-if.fir new file mode 100644 index 0000000000000..1e30ee8e64f02 --- /dev/null +++ b/flang/test/Fir/cfg-conversion-if.fir @@ -0,0 +1,46 @@ +// RUN: fir-opt --split-input-file --cfg-conversion %s | FileCheck %s + +func.func private @callee() -> none + +// CHECK-LABEL: func.func @if_then( +// CHECK-SAME: %[[ARG0:.*]]: i1) { +// CHECK: cf.cond_br %[[ARG0]] weights([10, 90]), ^bb1, ^bb2 +// CHECK: ^bb1: +// CHECK: %[[VAL_0:.*]] = fir.call @callee() : 
() -> none +// CHECK: cf.br ^bb2 +// CHECK: ^bb2: +// CHECK: return +// CHECK: } +func.func @if_then(%cond: i1) { + fir.if %cond weights([10, 90]) { + fir.call @callee() : () -> none + } + return +} + +// ----- + +// CHECK-LABEL: func.func @if_then_else( +// CHECK-SAME: %[[ARG0:.*]]: i1) -> i32 { +// CHECK: %[[VAL_0:.*]] = arith.constant 0 : i32 +// CHECK: %[[VAL_1:.*]] = arith.constant 1 : i32 +// CHECK: cf.cond_br %[[ARG0]] weights([90, 10]), ^bb1, ^bb2 +// CHECK: ^bb1: +// CHECK: cf.br ^bb3(%[[VAL_0]] : i32) +// CHECK: ^bb2: +// CHECK: cf.br ^bb3(%[[VAL_1]] : i32) +// CHECK: ^bb3(%[[VAL_2:.*]]: i32): +// CHECK: cf.br ^bb4 +// CHECK: ^bb4: +// CHECK: return %[[VAL_2]] : i32 +// CHECK: } +func.func @if_then_else(%cond: i1) -> i32 { + %c0 = arith.constant 0 : i32 + %c1 = arith.constant 1 : i32 + %result = fir.if %cond weights([90, 10]) -> i32 { + fir.result %c0 : i32 + } else { + fir.result %c1 : i32 + } + return %result : i32 +} diff --git a/flang/test/Fir/fir-ops.fir b/flang/test/Fir/fir-ops.fir index 9c444d2f4e0bc..3585bf9efca3e 100644 --- a/flang/test/Fir/fir-ops.fir +++ b/flang/test/Fir/fir-ops.fir @@ -1015,3 +1015,19 @@ func.func @test_box_total_elements(%arg0: !fir.class> %6 = arith.addi %2, %5 : index return %6 : index } + +// CHECK-LABEL: func.func @test_if_weights( +// CHECK-SAME: %[[ARG0:.*]]: i1) { +func.func @test_if_weights(%cond: i1) { +// CHECK: fir.if %[[ARG0]] weights([99, 1]) { +// CHECK: } + fir.if %cond weights([99, 1]) { + } +// CHECK: fir.if %[[ARG0]] weights([99, 1]) { +// CHECK: } else { +// CHECK: } + fir.if %cond weights ([99,1]) { + } else { + } + return +} diff --git a/flang/test/Fir/invalid.fir b/flang/test/Fir/invalid.fir index fd607fd9066f7..0391cdbef71e5 100644 --- a/flang/test/Fir/invalid.fir +++ b/flang/test/Fir/invalid.fir @@ -1385,3 +1385,40 @@ fir.local {type = local_init} @x.localizer : f32 init { ^bb0(%arg0: f32, %arg1: f32): fir.yield(%arg0 : f32) } + +// ----- + +func.func @wrong_weights_number_in_if_then(%cond: i1) { +// 
expected-error @below {{number of weights (1) does not match the number of regions (2)}} + fir.if %cond weights([50]) { + } + return +} + +// ----- + +func.func @wrong_weights_number_in_if_then_else(%cond: i1) { +// expected-error @below {{number of weights (3) does not match the number of regions (2)}} + fir.if %cond weights([50, 40, 10]) { + } else { + } + return +} + +// ----- + +func.func @negative_weight_in_if_then(%cond: i1) { +// expected-error @below {{weight #0 must be non-negative}} + fir.if %cond weights([-1, 101]) { + } + return +} + +// ----- + +func.func @wrong_total_weight_in_if_then(%cond: i1) { +// expected-error @below {{total weight 101 is not 100}} + fir.if %cond weights([1, 100]) { + } + return +} diff --git a/mlir/include/mlir/Dialect/ControlFlow/IR/ControlFlowOps.td b/mlir/include/mlir/Dialect/ControlFlow/IR/ControlFlowOps.td index 48f12b46a57f1..79da81ba049dd 100644 --- a/mlir/include/mlir/Dialect/ControlFlow/IR/ControlFlowOps.td +++ b/mlir/include/mlir/Dialect/ControlFlow/IR/ControlFlowOps.td @@ -112,10 +112,11 @@ def BranchOp : CF_Op<"br", [ // CondBranchOp //===----------------------------------------------------------------------===// -def CondBranchOp : CF_Op<"cond_br", - [AttrSizedOperandSegments, - DeclareOpInterfaceMethods, - Pure, Terminator]> { +def CondBranchOp + : CF_Op<"cond_br", [AttrSizedOperandSegments, + DeclareOpInterfaceMethods< + BranchOpInterface, ["getSuccessorForOperands"]>, + WeightedBranchOpInterface, Pure, Terminator]> { let summary = "Conditional branch operation"; let description = [{ The `cf.cond_br` terminator operation represents a conditional branch on a @@ -144,20 +145,23 @@ def CondBranchOp : CF_Op<"cond_br", ``` }]; - let arguments = (ins I1:$condition, - Variadic:$trueDestOperands, - Variadic:$falseDestOperands); + let arguments = (ins I1:$condition, Variadic:$trueDestOperands, + Variadic:$falseDestOperands, + OptionalAttr:$branch_weights); let successors = (successor AnySuccessor:$trueDest, 
AnySuccessor:$falseDest); - let builders = [ - OpBuilder<(ins "Value":$condition, "Block *":$trueDest, - "ValueRange":$trueOperands, "Block *":$falseDest, - "ValueRange":$falseOperands), [{ - build($_builder, $_state, condition, trueOperands, falseOperands, trueDest, + let builders = [OpBuilder<(ins "Value":$condition, "Block *":$trueDest, + "ValueRange":$trueOperands, + "Block *":$falseDest, + "ValueRange":$falseOperands), + [{ + build($_builder, $_state, condition, trueOperands, falseOperands, /*branch_weights=*/{}, trueDest, falseDest); }]>, - OpBuilder<(ins "Value":$condition, "Block *":$trueDest, - "Block *":$falseDest, CArg<"ValueRange", "{}">:$falseOperands), [{ + OpBuilder<(ins "Value":$condition, "Block *":$trueDest, + "Block *":$falseDest, + CArg<"ValueRange", "{}">:$falseOperands), + [{ build($_builder, $_state, condition, trueDest, ValueRange(), falseDest, falseOperands); }]>]; @@ -216,7 +220,7 @@ def CondBranchOp : CF_Op<"cond_br", let hasCanonicalizer = 1; let assemblyFormat = [{ - $condition `,` + $condition (`weights` `(` $branch_weights^ `)` )? `,` $trueDest (`(` $trueDestOperands^ `:` type($trueDestOperands) `)`)? `,` $falseDest (`(` $falseDestOperands^ `:` type($falseDestOperands) `)`)? 
attr-dict diff --git a/mlir/include/mlir/Interfaces/ControlFlowInterfaces.h b/mlir/include/mlir/Interfaces/ControlFlowInterfaces.h index 7f6967f11444f..d63800c12d132 100644 --- a/mlir/include/mlir/Interfaces/ControlFlowInterfaces.h +++ b/mlir/include/mlir/Interfaces/ControlFlowInterfaces.h @@ -142,6 +142,26 @@ LogicalResult verifyBranchSuccessorOperands(Operation *op, unsigned succNo, const SuccessorOperands &operands); } // namespace detail +//===----------------------------------------------------------------------===// +// WeightedBranchOpInterface +//===----------------------------------------------------------------------===// + +namespace detail { +/// Verify that the branch weights attached to an operation +/// implementing WeightedBranchOpInterface are correct. +LogicalResult verifyBranchWeights(Operation *op); +} // namespace detail + +//===----------------------------------------------------------------------===// +// WeightedRegiobBranchOpInterface +//===----------------------------------------------------------------------===// + +namespace detail { +/// Verify that the region weights attached to an operation +/// implementing WeightedRegiobBranchOpInterface are correct. 
+LogicalResult verifyRegionBranchWeights(Operation *op); +} // namespace detail + //===----------------------------------------------------------------------===// // RegionBranchOpInterface //===----------------------------------------------------------------------===// diff --git a/mlir/include/mlir/Interfaces/ControlFlowInterfaces.td b/mlir/include/mlir/Interfaces/ControlFlowInterfaces.td index 69bce78e946c8..7a47b686ac7d1 100644 --- a/mlir/include/mlir/Interfaces/ControlFlowInterfaces.td +++ b/mlir/include/mlir/Interfaces/ControlFlowInterfaces.td @@ -375,6 +375,113 @@ def SelectLikeOpInterface : OpInterface<"SelectLikeOpInterface"> { ]; } +//===----------------------------------------------------------------------===// +// WeightedBranchOpInterface +//===----------------------------------------------------------------------===// + +def WeightedBranchOpInterface : OpInterface<"WeightedBranchOpInterface"> { + let description = [{ + This interface provides weight information for branching terminator + operations, i.e. terminator operations with successors. + + This interface provides methods for getting/setting integer non-negative + weight of each branch in the range from 0 to 100. The sum of weights + must be 100. The number of weights must match the number of successors + of the operation. + + The weights specify the probability (in percents) of taking + a particular branch. + + The default implementations of the methods expect the operation + to have an attribute of type DenseI32ArrayAttr named branch_weights. 
+ }]; + let cppNamespace = "::mlir"; + + let methods = [InterfaceMethod< + /*desc=*/"Returns the branch weights attribute or nullptr", + /*returnType=*/"::mlir::DenseI32ArrayAttr", + /*methodName=*/"getBranchWeightsOrNull", + /*args=*/(ins), + /*methodBody=*/[{}], + /*defaultImpl=*/[{ + auto op = cast(this->getOperation()); + return op.getBranchWeightsAttr(); + }]>, + InterfaceMethod< + /*desc=*/"Sets the branch weights attribute", + /*returnType=*/"void", + /*methodName=*/"setBranchWeights", + /*args=*/(ins "::mlir::DenseI32ArrayAttr":$attr), + /*methodBody=*/[{}], + /*defaultImpl=*/[{ + auto op = cast(this->getOperation()); + op.setBranchWeightsAttr(attr); + }]>, + ]; + + let verify = [{ + return ::mlir::detail::verifyBranchWeights($_op); + }]; +} + +//===----------------------------------------------------------------------===// +// WeightedRegionBranchOpInterface +//===----------------------------------------------------------------------===// + +// TODO: the probabilities of entering a particular region seem +// to correlate with the values returned by +// RegionBranchOpInterface::invocationBounds(), and we should probably +// verify that the values are consistent. In that case, should +// WeightedRegionBranchOpInterface extend RegionBranchOpInterface? +def WeightedRegionBranchOpInterface + : OpInterface<"WeightedRegionBranchOpInterface"> { + let description = [{ + This interface provides weight information for region operations + that exhibit branching behavior between held regions. + + This interface provides methods for getting/setting integer non-negative + weight of each branch in the range from 0 to 100. The sum of weights + must be 100. The number of weights must match the number of regions + held by the operation (including empty regions). + + The weights specify the probability (in percents) of branching + to a particular region when first executing the operation. 
+ For example, for loop-like operations with a single region + the weight specifies the probability of entering the loop. + In this case, the weight must be either 0 or 100. + + The default implementations of the methods expect the operation + to have an attribute of type DenseI32ArrayAttr named branch_weights. + }]; + let cppNamespace = "::mlir"; + + let methods = [InterfaceMethod< + /*desc=*/"Returns the region weights attribute or nullptr", + /*returnType=*/"::mlir::DenseI32ArrayAttr", + /*methodName=*/"getRegionWeightsOrNull", + /*args=*/(ins), + /*methodBody=*/[{}], + /*defaultImpl=*/[{ + auto op = cast(this->getOperation()); + return op.getRegionWeightsAttr(); + }]>, + InterfaceMethod< + /*desc=*/"Sets the region weights attribute", + /*returnType=*/"void", + /*methodName=*/"setRegionWeights", + /*args=*/(ins "::mlir::DenseI32ArrayAttr":$attr), + /*methodBody=*/[{}], + /*defaultImpl=*/[{ + auto op = cast(this->getOperation()); + op.setRegionWeightsAttr(attr); + }]>, + ]; + + let verify = [{ + return ::mlir::detail::verifyRegionBranchWeights($_op); + }]; +} + //===----------------------------------------------------------------------===// // ControlFlow Traits //===----------------------------------------------------------------------===// diff --git a/mlir/lib/Conversion/ControlFlowToLLVM/ControlFlowToLLVM.cpp b/mlir/lib/Conversion/ControlFlowToLLVM/ControlFlowToLLVM.cpp index debfd003bd5b5..12769e486a3c7 100644 --- a/mlir/lib/Conversion/ControlFlowToLLVM/ControlFlowToLLVM.cpp +++ b/mlir/lib/Conversion/ControlFlowToLLVM/ControlFlowToLLVM.cpp @@ -166,10 +166,14 @@ struct CondBranchOpLowering : public ConvertOpToLLVMPattern { TypeRange(adaptor.getFalseDestOperands())); if (failed(convertedFalseBlock)) return failure(); - Operation *newOp = rewriter.replaceOpWithNewOp( + auto newOp = rewriter.replaceOpWithNewOp( op, adaptor.getCondition(), *convertedTrueBlock, adaptor.getTrueDestOperands(), *convertedFalseBlock, adaptor.getFalseDestOperands()); + if (auto 
weights = op.getBranchWeightsOrNull()) { + newOp.setBranchWeights(weights); + op.removeBranchWeightsAttr(); + } // TODO: We should not just forward all attributes like that. But there are // existing Flang tests that depend on this behavior. newOp->setAttrs(op->getAttrDictionary()); diff --git a/mlir/lib/Interfaces/ControlFlowInterfaces.cpp b/mlir/lib/Interfaces/ControlFlowInterfaces.cpp index 2ae334b517a31..e587e8f1af178 100644 --- a/mlir/lib/Interfaces/ControlFlowInterfaces.cpp +++ b/mlir/lib/Interfaces/ControlFlowInterfaces.cpp @@ -80,6 +80,55 @@ detail::verifyBranchSuccessorOperands(Operation *op, unsigned succNo, return success(); } +//===----------------------------------------------------------------------===// +// WeightedBranchOpInterface +//===----------------------------------------------------------------------===// + +LogicalResult detail::verifyBranchWeights(Operation *op) { + auto weights = cast(op).getBranchWeightsOrNull(); + if (weights) { + if (weights.size() != op->getNumSuccessors()) + return op->emitError() << "number of weights (" << weights.size() + << ") does not match the number of successors (" + << op->getNumSuccessors() << ")"; + int32_t total = 0; + for (auto weight : llvm::enumerate(weights.asArrayRef())) { + if (weight.value() < 0) + return op->emitError() + << "weight #" << weight.index() << " must be non-negative"; + total += weight.value(); + } + if (total != 100) + return op->emitError() << "total weight " << total << " is not 100"; + } + return mlir::success(); +} + +//===--------------------... [truncated] ``````````
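For readers skimming the truncated diff above: the three checks that `detail::verifyBranchWeights` performs (weight count matches the successor count, each weight is non-negative, the weights sum to 100) can be sketched without MLIR roughly as follows. This is only an illustration under stated assumptions. `checkBranchWeights` is a hypothetical helper, not part of the patch or of MLIR. It returns an error string instead of calling `op->emitError()`, it always receives a weight list (the patch skips verification when the attribute is absent), and it accumulates into a 64-bit total where the patch uses `int32_t`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-alone helper (not an MLIR API): returns an error
// message if the weights are invalid, or std::nullopt if all checks pass.
std::optional<std::string> checkBranchWeights(
    const std::vector<std::int32_t> &weights, std::size_t numSuccessors) {
  // 1. There must be exactly one weight per successor.
  if (weights.size() != numSuccessors)
    return "number of weights (" + std::to_string(weights.size()) +
           ") does not match the number of successors (" +
           std::to_string(numSuccessors) + ")";

  // 2. Each weight must be non-negative; accumulate the total as we go.
  //    A 64-bit accumulator is used here so the sum cannot overflow.
  std::int64_t total = 0;
  for (std::size_t i = 0; i < weights.size(); ++i) {
    if (weights[i] < 0)
      return "weight #" + std::to_string(i) + " must be non-negative";
    total += weights[i];
  }

  // 3. The weights are percentages, so they must sum to exactly 100.
  if (total != 100)
    return "total weight " + std::to_string(total) + " is not 100";

  return std::nullopt;
}
```

With this sketch, the inputs from the `invalid.fir` tests earlier in the diff (`[50]`, `[-1, 101]`, `[1, 100]` with two targets) each trip a different check, although the tests themselves go through the region variant, whose message says "regions" rather than "successors".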
https://github.com/llvm/llvm-project/pull/142079 From flang-commits at lists.llvm.org Thu May 29 20:22:27 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Thu, 29 May 2025 20:22:27 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <68392473.a70a0220.3545f0.d18b@mx.google.com> ================ @@ -127,6 +130,24 @@ class LanguageFeatureControl { bool warnAllLanguage_{false}; UsageWarnings warnUsage_; bool warnAllUsage_{false}; + bool disableAllWarnings_{false}; }; + +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). Just exposed for the template below. +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find); + +template +std::optional> parseCLIEnum( + llvm::StringRef input, std::function(Predicate)> find) { ---------------- eugeneepshteyn wrote: `std::string_view` would probably work just as well here. 
https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Thu May 29 20:24:29 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Thu, 29 May 2025 20:24:29 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683924ed.170a0220.230a80.cb6a@mx.google.com> ================ @@ -107,16 +110,16 @@ class LanguageFeatureControl { warnAllUsage_ = false; warnUsage_.clear(); } - - bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } - bool ShouldWarn(LanguageFeature f) const { - return (warnAllLanguage_ && f != LanguageFeature::OpenMP && - f != LanguageFeature::OpenACC && f != LanguageFeature::CUDA) || - warnLanguage_.test(f); - } - bool ShouldWarn(UsageWarning w) const { - return warnAllUsage_ || warnUsage_.test(w); + void DisableAllWarnings() { + disableAllWarnings_ = true; + DisableAllNonstandardWarnings(); + DisableAllUsageWarnings(); } + bool applyCLIOption(llvm::StringRef input); ---------------- eugeneepshteyn wrote: I thought this part of the code intentionally avoided explicit LLVM dependencies? Perhaps use `std::string_view` instead? FWIW, `llvm::StringRef` has functionality to convert to/from `std::string_view`. 
https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Thu May 29 21:22:32 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 21:22:32 -0700 (PDT) Subject: [flang-commits] [flang] f5d3470 - [flang][OpenMP] Allow structure component in `task depend` clauses (#141923) Message-ID: <68393288.170a0220.351d83.0e51@mx.google.com> Author: Kareem Ergawy Date: 2025-05-30T06:22:29+02:00 New Revision: f5d3470d425f9ee99436bfdc0c1ae1ce03bab385 URL: https://github.com/llvm/llvm-project/commit/f5d3470d425f9ee99436bfdc0c1ae1ce03bab385 DIFF: https://github.com/llvm/llvm-project/commit/f5d3470d425f9ee99436bfdc0c1ae1ce03bab385.diff LOG: [flang][OpenMP] Allow structure component in `task depend` clauses (#141923) Even though the spec (version 5.2) prohibits structure components from being specified in `depend` clauses, this restriction is not sensible. This PR rectifies the issue by lifting that restriction and allowing structure components in `depend` clauses (which is allowed by OpenMP 6.0). 
Added: flang/test/Lower/OpenMP/task-depend-structure-component.f90 Modified: flang/include/flang/Evaluate/tools.h flang/lib/Lower/OpenMP/ClauseProcessor.cpp flang/lib/Semantics/check-omp-structure.cpp Removed: flang/test/Semantics/OpenMP/depend02.f90 ################################################################################ diff --git a/flang/include/flang/Evaluate/tools.h b/flang/include/flang/Evaluate/tools.h index 7f2e91ae128bd..4dce1257a6507 100644 --- a/flang/include/flang/Evaluate/tools.h +++ b/flang/include/flang/Evaluate/tools.h @@ -414,6 +414,16 @@ const Symbol *IsArrayElement(const Expr &expr, bool intoSubstring = true, return nullptr; } +template +bool isStructureComponent(const Fortran::evaluate::Expr &expr) { + if (auto dataRef{ExtractDataRef(expr, /*intoSubstring=*/false)}) { + const Fortran::evaluate::DataRef *ref{&*dataRef}; + return std::holds_alternative(ref->u); + } + + return false; +} + template std::optional ExtractNamedEntity(const A &x) { if (auto dataRef{ExtractDataRef(x)}) { diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index a1fff6c5b7d90..f9a6f9506f510 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -947,6 +947,11 @@ bool ClauseProcessor::processDepend(lower::SymMap &symMap, converter.getCurrentLocation(), converter, expr, symMap, stmtCtx); dependVar = entity.getBase(); } + } else if (evaluate::isStructureComponent(*object.ref())) { + SomeExpr expr = *object.ref(); + hlfir::EntityWithAttributes entity = convertExprToHLFIR( + converter.getCurrentLocation(), converter, expr, symMap, stmtCtx); + dependVar = entity.getBase(); } else { semantics::Symbol *sym = object.sym(); dependVar = converter.getSymbolAddress(*sym); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index b0bc478d96a1e..76dfd40c6a62c 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -5494,12 +5494,8 @@ void OmpStructureChecker::CheckDependList(const parser::DataRef &d) { // Check if the base element is valid on Depend Clause CheckDependList(elem.value().base); }, - [&](const common::Indirection &) { - context_.Say(GetContext().clauseSource, - "A variable that is part of another variable " - "(such as an element of a structure) but is not an array " - "element or an array section cannot appear in a DEPEND " - "clause"_err_en_US); + [&](const common::Indirection &comp) { + CheckDependList(comp.value().base); }, [&](const common::Indirection &) { context_.Say(GetContext().clauseSource, diff --git a/flang/test/Lower/OpenMP/task-depend-structure-component.f90 b/flang/test/Lower/OpenMP/task-depend-structure-component.f90 new file mode 100644 index 0000000000000..7cf6dbfac2729 --- /dev/null +++ b/flang/test/Lower/OpenMP/task-depend-structure-component.f90 @@ -0,0 +1,21 @@ +! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s + +subroutine depend + type :: my_struct + integer :: my_component(10) + end type + + type(my_struct) :: my_var + + !$omp task depend(in:my_var%my_component) + !$omp end task +end subroutine depend + +! CHECK: %[[VAR_ALLOC:.*]] = fir.alloca !fir.type<{{.*}}my_struct{{.*}}> {bindc_name = "my_var", {{.*}}} +! CHECK: %[[VAR_DECL:.*]]:2 = hlfir.declare %[[VAR_ALLOC]] + +! CHECK: %[[COMP_SELECTOR:.*]] = hlfir.designate %[[VAR_DECL]]#0{"my_component"} + +! CHECK: omp.task depend(taskdependin -> %[[COMP_SELECTOR]] : {{.*}}) { +! CHECK: omp.terminator +! CHECK: } diff --git a/flang/test/Semantics/OpenMP/depend02.f90 b/flang/test/Semantics/OpenMP/depend02.f90 deleted file mode 100644 index 76c02c8f9cbab..0000000000000 --- a/flang/test/Semantics/OpenMP/depend02.f90 +++ /dev/null @@ -1,49 +0,0 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp -! OpenMP Version 4.5 -! 2.13.9 Depend Clause -! A variable that is part of another variable -! 
(such as an element of a structure) but is not an array element or -! an array section cannot appear in a DEPEND clause - -subroutine vec_mult(N) - implicit none - integer :: i, N - real, allocatable :: p(:), v1(:), v2(:) - - type my_type - integer :: a(10) - end type my_type - - type(my_type) :: my_var - allocate( p(N), v1(N), v2(N) ) - - !$omp parallel num_threads(2) - !$omp single - - !$omp task depend(out:v1) - call init(v1, N) - !$omp end task - - !$omp task depend(out:v2) - call init(v2, N) - !$omp end task - - !ERROR: A variable that is part of another variable (such as an element of a structure) but is not an array element or an array section cannot appear in a DEPEND clause - !$omp target nowait depend(in:v1,v2, my_var%a) depend(out:p) & - !$omp& map(to:v1,v2) map(from: p) - !$omp parallel do - do i=1,N - p(i) = v1(i) * v2(i) - end do - !$omp end target - - !$omp task depend(in:p) - call output(p, N) - !$omp end task - - !$omp end single - !$omp end parallel - - deallocate( p, v1, v2 ) - -end subroutine From flang-commits at lists.llvm.org Thu May 29 22:33:22 2025 From: flang-commits at lists.llvm.org (Fangrui Song via flang-commits) Date: Thu, 29 May 2025 22:33:22 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <68394322.050a0220.908ad.df13@mx.google.com> https://github.com/MaskRay edited https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Thu May 29 22:33:23 2025 From: flang-commits at lists.llvm.org (Fangrui Song via flang-commits) Date: Thu, 29 May 2025 22:33:23 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. 
(PR #140556) In-Reply-To: Message-ID: <68394323.170a0220.13729e.d354@mx.google.com> https://github.com/MaskRay commented: clang driver code looks good https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Thu May 29 22:33:23 2025 From: flang-commits at lists.llvm.org (Fangrui Song via flang-commits) Date: Thu, 29 May 2025 22:33:23 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <68394323.170a0220.d58ae.cdd4@mx.google.com> ================ @@ -0,0 +1,12 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s ---------------- MaskRay wrote: `--target=x86_64-unknown-linux-gnu` instead of the long deprecated `-target ` https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Thu May 29 23:47:55 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 23:47:55 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6839549b.170a0220.e7414.d587@mx.google.com> fanju110 wrote: > > > @fanju110, Thanks for seeing this through! > > > > > > Hi @tarunprabhu , If everything looks good, could you please merge this PR at your convenience (I don't have merge rights)? Thank you! > > It looks like you need to run `clang-format` on some of the code. See [here](https://github.com/llvm/llvm-project/actions/runs/15247681502/job/43119228106?pr=136098). For changes that you have made in `clang/`, please run it only on the code that you have changed and not the entire file. For code in `flang/`, you may run it on the entire file (in fact, you should)( Hi, I have run clang-format as you suggested! 
https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Thu May 29 23:53:01 2025 From: flang-commits at lists.llvm.org (=?UTF-8?B?Um9nZXIgRmVycmVyIEliw6HDsWV6?= via flang-commits) Date: Thu, 29 May 2025 23:53:01 -0700 (PDT) Subject: [flang-commits] [flang] [Flang][Preprocessor] Avoid creating an empty token when a kind suffix is torn by a pasting operator (PR #139795) In-Reply-To: Message-ID: <683955cd.170a0220.1e21c5.148d@mx.google.com> rofirrim wrote: @jeanPerier maybe you know who could review this? Thanks a lot! https://github.com/llvm/llvm-project/pull/139795 From flang-commits at lists.llvm.org Fri May 30 00:10:43 2025 From: flang-commits at lists.llvm.org (Yussur Mustafa Oraji via flang-commits) Date: Fri, 30 May 2025 00:10:43 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <683959f3.170a0220.3ae26b.158b@mx.google.com> https://github.com/N00byKing updated https://github.com/llvm/llvm-project/pull/136827 >From da77ccb2e85eda47d24ffb06281649f7b5fed050 Mon Sep 17 00:00:00 2001 From: Yussur Mustafa Oraji Date: Wed, 23 Apr 2025 10:33:04 +0200 Subject: [PATCH] [flang] Add __COUNTER__ preprocessor macro --- flang/docs/Extensions.md | 2 ++ flang/docs/Preprocessing.md | 12 ++++++++++++ flang/include/flang/Parser/preprocessor.h | 2 ++ flang/lib/Parser/preprocessor.cpp | 4 ++++ flang/test/Preprocessing/counter.F90 | 9 +++++++++ 5 files changed, 29 insertions(+) create mode 100644 flang/test/Preprocessing/counter.F90 diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 05e21ef2d33b5..d18833c066801 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -509,6 +509,8 @@ end * We respect Fortran comments in macro actual arguments (like GNU, Intel, NAG; unlike PGI and XLF) on the principle that macro calls should be treated like function references. Fortran's line continuation methods also work. 
+* We implement the `__COUNTER__` preprocessing extension, + see [Non-standard Extensions](Preprocessing.md#non-standard-extensions) ## Standard features not silently accepted diff --git a/flang/docs/Preprocessing.md b/flang/docs/Preprocessing.md index 0b70d857833ce..db815b9244edf 100644 --- a/flang/docs/Preprocessing.md +++ b/flang/docs/Preprocessing.md @@ -138,6 +138,18 @@ text. OpenMP-style directives that look like comments are not addressed by this scheme but are obvious extensions. +## Currently implemented built-ins + +* `__DATE__`: Date, given as e.g. "Jun 16 1904" +* `__TIME__`: Time in 24-hour format including seconds, e.g. "09:24:13" +* `__TIMESTAMP__`: Date, time and year of last modification, given as e.g. "Fri May 9 09:16:17 2025" +* `__FILE__`: Current file +* `__LINE__`: Current line + +### Non-standard Extensions + +* `__COUNTER__`: Replaced by sequential integers on each expansion, starting from 0. + ## Appendix `N` in the table below means "not supported"; this doesn't mean a bug, it just means that a particular behavior was diff --git a/flang/include/flang/Parser/preprocessor.h b/flang/include/flang/Parser/preprocessor.h index 86528a7e68def..834c84a639a74 100644 --- a/flang/include/flang/Parser/preprocessor.h +++ b/flang/include/flang/Parser/preprocessor.h @@ -121,6 +121,8 @@ class Preprocessor { std::list names_; std::unordered_map definitions_; std::stack ifStack_; + + unsigned int counterVal_{0}; }; } // namespace Fortran::parser #endif // FORTRAN_PARSER_PREPROCESSOR_H_ diff --git a/flang/lib/Parser/preprocessor.cpp b/flang/lib/Parser/preprocessor.cpp index a47f9c32ad27c..4549b1c505569 100644 --- a/flang/lib/Parser/preprocessor.cpp +++ b/flang/lib/Parser/preprocessor.cpp @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -299,6 +300,7 @@ void Preprocessor::DefineStandardMacros() { Define("__FILE__"s, "__FILE__"s); Define("__LINE__"s, "__LINE__"s); Define("__TIMESTAMP__"s, "__TIMESTAMP__"s); + Define("__COUNTER__"s, 
"__COUNTER__"s); } void Preprocessor::Define(const std::string &macro, const std::string &value) { @@ -421,6 +423,8 @@ std::optional Preprocessor::MacroReplacement( repl = "\""s + time + '"'; } } + } else if (name == "__COUNTER__") { + repl = std::to_string(counterVal_++); } if (!repl.empty()) { ProvenanceRange insert{allSources_.AddCompilerInsertion(repl)}; diff --git a/flang/test/Preprocessing/counter.F90 b/flang/test/Preprocessing/counter.F90 new file mode 100644 index 0000000000000..9761c8fb7f355 --- /dev/null +++ b/flang/test/Preprocessing/counter.F90 @@ -0,0 +1,9 @@ +! RUN: %flang -E %s | FileCheck %s +! CHECK: print *, 0 +! CHECK: print *, 1 +! CHECK: print *, 2 +! Check incremental counter macro +#define foo bar +print *, __COUNTER__ +print *, __COUNTER__ +print *, __COUNTER__ From flang-commits at lists.llvm.org Fri May 30 01:07:35 2025 From: flang-commits at lists.llvm.org (Yussur Mustafa Oraji via flang-commits) Date: Fri, 30 May 2025 01:07:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <68396747.170a0220.23f106.d44c@mx.google.com> N00byKing wrote: Thanks for the reviews :) Also, I just added the file extension to the markdown reference, which made CI break earlier. Let me know if there are any further changes needed. Otherwise, I can't merge it myself, so I will need someone to do it for me. https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Fri May 30 02:30:12 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Fri, 30 May 2025 02:30:12 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. 
(PR #140556) In-Reply-To: Message-ID: <68397aa4.170a0220.279e47.d9ca@mx.google.com> https://github.com/abidh updated https://github.com/llvm/llvm-project/pull/140556 >From 5d20af48673adebc2ab3e1a6c8442f67d84f1847 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Mon, 19 May 2025 15:21:25 +0100 Subject: [PATCH 1/5] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. This PR adds functionality to change the flang command line using the environment variable `FCC_OVERRIDE_OPTIONS`. It is quite similar to what `CCC_OVERRIDE_OPTIONS` does for clang. The `FCC_OVERRIDE_OPTIONS` seemed like the most obvious name to me but I am open to other ideas. The `applyOverrideOptions` now takes an extra argument that is the name of the environment variable. Previously `CCC_OVERRIDE_OPTIONS` was hardcoded. --- clang/include/clang/Driver/Driver.h | 2 +- clang/lib/Driver/Driver.cpp | 4 ++-- clang/tools/driver/driver.cpp | 2 +- flang/test/Driver/fcc_override.f90 | 12 ++++++++++++ flang/tools/flang-driver/driver.cpp | 7 +++++++ 5 files changed, 23 insertions(+), 4 deletions(-) create mode 100644 flang/test/Driver/fcc_override.f90 diff --git a/clang/include/clang/Driver/Driver.h b/clang/include/clang/Driver/Driver.h index b463dc2a93550..7ca848f11b561 100644 --- a/clang/include/clang/Driver/Driver.h +++ b/clang/include/clang/Driver/Driver.h @@ -879,7 +879,7 @@ llvm::Error expandResponseFiles(SmallVectorImpl &Args, /// See applyOneOverrideOption. 
void applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideOpts, - llvm::StringSet<> &SavedStrings, + llvm::StringSet<> &SavedStrings, StringRef EnvVar, raw_ostream *OS = nullptr); } // end namespace driver diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp index a648cc928afdc..a8fea35926a0d 100644 --- a/clang/lib/Driver/Driver.cpp +++ b/clang/lib/Driver/Driver.cpp @@ -7289,7 +7289,7 @@ static void applyOneOverrideOption(raw_ostream &OS, void driver::applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideStr, llvm::StringSet<> &SavedStrings, - raw_ostream *OS) { + StringRef EnvVar, raw_ostream *OS) { if (!OS) OS = &llvm::nulls(); @@ -7298,7 +7298,7 @@ void driver::applyOverrideOptions(SmallVectorImpl &Args, OS = &llvm::nulls(); } - *OS << "### CCC_OVERRIDE_OPTIONS: " << OverrideStr << "\n"; + *OS << "### " << EnvVar << ": " << OverrideStr << "\n"; // This does not need to be efficient. diff --git a/clang/tools/driver/driver.cpp b/clang/tools/driver/driver.cpp index 82f47ab973064..81964c65c2892 100644 --- a/clang/tools/driver/driver.cpp +++ b/clang/tools/driver/driver.cpp @@ -305,7 +305,7 @@ int clang_main(int Argc, char **Argv, const llvm::ToolContext &ToolContext) { if (const char *OverrideStr = ::getenv("CCC_OVERRIDE_OPTIONS")) { // FIXME: Driver shouldn't take extra initial argument. driver::applyOverrideOptions(Args, OverrideStr, SavedStrings, - &llvm::errs()); + "CCC_OVERRIDE_OPTIONS", &llvm::errs()); } std::string Path = GetExecutablePath(ToolContext.Path, CanonicalPrefixes); diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 new file mode 100644 index 0000000000000..55a07803fdde5 --- /dev/null +++ b/flang/test/Driver/fcc_override.f90 @@ -0,0 +1,12 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s +! 
RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR + +! CHECK: "-fc1" +! CHECK-NOT: "-Oignore" +! CHECK: "-Omagic" +! CHECK-NOT: "-Oignore" + +! RM-WERROR: ### FCC_OVERRIDE_OPTIONS: x-Werror +-g +! RM-WERROR-NEXT: ### Deleting argument -Werror +! RM-WERROR-NEXT: ### Adding argument -g at end +! RM-WERROR-NOT: "-Werror" diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..ad0efa3279cef 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -111,6 +111,13 @@ int main(int argc, const char **argv) { } } + llvm::StringSet<> SavedStrings; + // Handle FCC_OVERRIDE_OPTIONS, used for editing a command line behind the + // scenes. + if (const char *OverrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) + clang::driver::applyOverrideOptions(args, OverrideStr, SavedStrings, + "FCC_OVERRIDE_OPTIONS", &llvm::errs()); + // Not in the frontend mode - continue in the compiler driver mode. // Create DiagnosticsEngine for the compiler driver >From d1f2c9b8abd2690612a4b886a7a85b8e7f57d359 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 11:05:57 +0100 Subject: [PATCH 2/5] Add documentation for FCC_OVERRIDE_OPTIONS. --- flang/docs/FlangDriver.md | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index 97744f0bee069..f93df8701e677 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -614,3 +614,28 @@ nvfortran defines `-fast` as - `-Mcache_align`: there is no equivalent flag in Flang or Clang. - `-Mflushz`: flush-to-zero mode - when `-ffast-math` is specified, Flang will link to `crtfastmath.o` to ensure denormal numbers are flushed to zero. + + +## FCC_OVERRIDE_OPTIONS + +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of +edits to the input argument lists. 
The value of this environment variable is +a space separated list of edits to perform. These edits are applied in order to +the input argument lists. Edits should be one of the following forms: + +- `#`: Silence information about the changes to the command line arguments. + +- `^FOO`: Add `FOO` as a new argument at the beginning of the command line. + +- `+FOO`: Add `FOO` as a new argument at the end of the command line. + +- `s/XXX/YYY/`: Substitute the regular expression `XXX` with `YYY` in the + command line. + +- `xOPTION`: Removes all instances of the literal argument `OPTION`. + +- `XOPTION`: Removes all instances of the literal argument `OPTION`, and the + following argument. + +- `Ox`: Removes all flags matching `O` or `O[sz0-9]` and adds `Ox` at the end + of the command line. \ No newline at end of file >From d093a6ac74f8c0058e134ec55fbbf2b8edf9b477 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 17:28:46 +0100 Subject: [PATCH 3/5] Mention that effect on options added by the config files. --- flang/docs/FlangDriver.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index f93df8701e677..0302cb1dc33b9 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -638,4 +638,6 @@ the input argument lists. Edits should be one of the following forms: following argument. - `Ox`: Removes all flags matching `O` or `O[sz0-9]` and adds `Ox` at the end - of the command line. \ No newline at end of file + of the command line. + +This environment variable does not affect the options added by the config files. >From 9faf4d384a40514b15cc3bf270303843e8dd4822 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 20:36:59 +0100 Subject: [PATCH 4/5] Add a test for option from config file. 
--- flang/test/Driver/Inputs/config-7.cfg | 1 + flang/test/Driver/fcc_override.f90 | 5 +++++ 2 files changed, 6 insertions(+) create mode 100644 flang/test/Driver/Inputs/config-7.cfg diff --git a/flang/test/Driver/Inputs/config-7.cfg b/flang/test/Driver/Inputs/config-7.cfg new file mode 100644 index 0000000000000..2f41be663b282 --- /dev/null +++ b/flang/test/Driver/Inputs/config-7.cfg @@ -0,0 +1 @@ +-Werror diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 index 55a07803fdde5..417919b5d667a 100644 --- a/flang/test/Driver/fcc_override.f90 +++ b/flang/test/Driver/fcc_override.f90 @@ -1,5 +1,6 @@ ! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s ! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR +! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror" %flang --config=%S/Inputs/config-7.cfg -### %s -c 2>&1 | FileCheck %s -check-prefix=CONF ! CHECK: "-fc1" ! CHECK-NOT: "-Oignore" @@ -10,3 +11,7 @@ ! RM-WERROR-NEXT: ### Deleting argument -Werror ! RM-WERROR-NEXT: ### Adding argument -g at end ! RM-WERROR-NOT: "-Werror" + +! Test that FCC_OVERRIDE_OPTIONS does not affect the options from config files. +! CONF: ### FCC_OVERRIDE_OPTIONS: x-Werror +! CONF: "-Werror" >From db45474fc5625223b5240aa7f7ef094d2d80d5ae Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Fri, 30 May 2025 10:28:49 +0100 Subject: [PATCH 5/5] Handle review comments. 
--- flang/docs/FlangDriver.md | 8 ++++---- flang/test/Driver/fcc_override.f90 | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index 0302cb1dc33b9..e6750c92567a4 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -618,10 +618,10 @@ nvfortran defines `-fast` as ## FCC_OVERRIDE_OPTIONS -The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of -edits to the input argument lists. The value of this environment variable is -a space separated list of edits to perform. These edits are applied in order to -the input argument lists. Edits should be one of the following forms: +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to edit flang's +command line arguments. The value of this variable is a space-separated list of +edits to perform. The edits are applied in the order in which they appear in +`FCC_OVERRIDE_OPTIONS`. Each edit should be one of the following forms: - `#`: Silence information about the changes to the command line arguments. diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 index 417919b5d667a..71def0847f150 100644 --- a/flang/test/Driver/fcc_override.f90 +++ b/flang/test/Driver/fcc_override.f90 @@ -1,5 +1,5 @@ ! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s -! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR +! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang --target=x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR ! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror" %flang --config=%S/Inputs/config-7.cfg -### %s -c 2>&1 | FileCheck %s -check-prefix=CONF ! 
CHECK: "-fc1" From flang-commits at lists.llvm.org Fri May 30 02:30:20 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Fri, 30 May 2025 02:30:20 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <68397aac.050a0220.18a286.e408@mx.google.com> ================ @@ -614,3 +614,30 @@ nvfortran defines `-fast` as - `-Mcache_align`: there is no equivalent flag in Flang or Clang. - `-Mflushz`: flush-to-zero mode - when `-ffast-math` is specified, Flang will link to `crtfastmath.o` to ensure denormal numbers are flushed to zero. + + +## FCC_OVERRIDE_OPTIONS + +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of +edits to the input argument lists. The value of this environment variable is +a space separated list of edits to perform. These edits are applied in order to +the input argument lists. Edits should be one of the following forms: ---------------- abidh wrote: Thanks. This looks better than what I had originally. I have updated the docs. https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Fri May 30 02:42:21 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 30 May 2025 02:42:21 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <68397d7d.170a0220.bfafd.d872@mx.google.com> kiranchandramohan wrote: > @kiranchandramohan @tarunprabhu, what do you guys think about modifying `index.md` in the following manner ? This will load things on demand and we should not be hitting the `nonexistent document` errors. 
>
> ```
> ```{eval}
> import os
> if os.path.exists('FlangCommandLineReference.rst'):
>     print('FlangCommandLineReference')
> ```

I would prefer something like https://reviews.llvm.org/D128650 where the generated md/rst files are copied by the `docs-flang-man` target before it hits the code that triggers the `nonexistent document` warning/errors.

https://github.com/llvm/llvm-project/pull/141882

From flang-commits at lists.llvm.org Fri May 30 02:47:25 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Fri, 30 May 2025 02:47:25 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [mlir][flang] Added Weighted[Region]BranchOpInterface's. (PR #142079)
In-Reply-To:
Message-ID: <68397ead.170a0220.1860ee.1d28@mx.google.com>

https://github.com/tblah approved this pull request.

Good idea! LGTM but wait for an MLIR reviewer too

https://github.com/llvm/llvm-project/pull/142079

From flang-commits at lists.llvm.org Fri May 30 02:47:25 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Fri, 30 May 2025 02:47:25 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [mlir][flang] Added Weighted[Region]BranchOpInterface's. (PR #142079)
In-Reply-To:
Message-ID: <68397ead.050a0220.24387d.e8c3@mx.google.com>

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/142079

From flang-commits at lists.llvm.org Fri May 30 02:47:25 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Fri, 30 May 2025 02:47:25 -0700 (PDT)
Subject: [flang-commits] [flang] [mlir] [mlir][flang] Added Weighted[Region]BranchOpInterface's.
(PR #142079) In-Reply-To: Message-ID: <68397ead.050a0220.a7828.e5d5@mx.google.com> ================ @@ -80,6 +80,55 @@ detail::verifyBranchSuccessorOperands(Operation *op, unsigned succNo, return success(); } +//===----------------------------------------------------------------------===// +// WeightedBranchOpInterface +//===----------------------------------------------------------------------===// + +LogicalResult detail::verifyBranchWeights(Operation *op) { + auto weights = cast(op).getBranchWeightsOrNull(); + if (weights) { + if (weights.size() != op->getNumSuccessors()) + return op->emitError() << "number of weights (" << weights.size() + << ") does not match the number of successors (" + << op->getNumSuccessors() << ")"; + int32_t total = 0; + for (auto weight : llvm::enumerate(weights.asArrayRef())) { + if (weight.value() < 0) + return op->emitError() + << "weight #" << weight.index() << " must be non-negative"; + total += weight.value(); + } + if (total != 100) + return op->emitError() << "total weight " << total << " is not 100"; + } + return mlir::success(); +} + +//===----------------------------------------------------------------------===// +// WeightedRegionBranchOpInterface +//===----------------------------------------------------------------------===// + +LogicalResult detail::verifyRegionBranchWeights(Operation *op) { ---------------- tblah wrote: nit: Both of these could share a helper https://github.com/llvm/llvm-project/pull/142079 From flang-commits at lists.llvm.org Fri May 30 02:52:00 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 30 May 2025 02:52:00 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <68397fc0.170a0220.12fbab.1d66@mx.google.com> ================ @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the 
current point. +[[maybe_unused]] static std::string ahead(const ParseState &state) { ---------------- tblah wrote: I think it is okay to keep if that's on purpose. I was just checking it hadn't been left in by accident. https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Fri May 30 02:53:45 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 30 May 2025 02:53:45 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <68398029.170a0220.21d22d.1c9a@mx.google.com> ================ @@ -111,6 +111,13 @@ int main(int argc, const char **argv) { } } + llvm::StringSet<> SavedStrings; + // Handle FCC_OVERRIDE_OPTIONS, used for editing a command line behind the + // scenes. + if (const char *OverrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) ---------------- kiranchandramohan wrote: ```suggestion if (const char *overrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) ``` https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Fri May 30 02:53:46 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 30 May 2025 02:53:46 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <6839802a.050a0220.327f9f.e528@mx.google.com> https://github.com/kiranchandramohan edited https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Fri May 30 02:53:46 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 30 May 2025 02:53:46 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. 
(PR #140556)
In-Reply-To:
Message-ID: <6839802a.050a0220.17001d.9adf@mx.google.com>

================
@@ -111,6 +111,13 @@ int main(int argc, const char **argv) {
     }
   }

+  llvm::StringSet<> SavedStrings;
----------------
kiranchandramohan wrote:

```suggestion
  llvm::StringSet<> savedStrings;
```

https://github.com/llvm/llvm-project/pull/140556

From flang-commits at lists.llvm.org Fri May 30 02:53:47 2025
From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits)
Date: Fri, 30 May 2025 02:53:47 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556)
In-Reply-To:
Message-ID: <6839802b.050a0220.255788.e8f1@mx.google.com>

https://github.com/kiranchandramohan approved this pull request.

LGTM. Thanks.

https://github.com/llvm/llvm-project/pull/140556

From flang-commits at lists.llvm.org Fri May 30 03:13:55 2025
From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits)
Date: Fri, 30 May 2025 03:13:55 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556)
In-Reply-To:
Message-ID: <683984e3.170a0220.fe08a.230e@mx.google.com>

https://github.com/abidh updated https://github.com/llvm/llvm-project/pull/140556

>From 5d20af48673adebc2ab3e1a6c8442f67d84f1847 Mon Sep 17 00:00:00 2001
From: Abid Qadeer
Date: Mon, 19 May 2025 15:21:25 +0100
Subject: [PATCH 1/6] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS.

This PR adds functionality to change the flang command line using the environment variable `FCC_OVERRIDE_OPTIONS`. It is quite similar to what `CCC_OVERRIDE_OPTIONS` does for clang. The name `FCC_OVERRIDE_OPTIONS` seemed like the most obvious one to me, but I am open to other ideas.

`applyOverrideOptions` now takes an extra argument that is the name of the environment variable. Previously `CCC_OVERRIDE_OPTIONS` was hardcoded.
--- clang/include/clang/Driver/Driver.h | 2 +- clang/lib/Driver/Driver.cpp | 4 ++-- clang/tools/driver/driver.cpp | 2 +- flang/test/Driver/fcc_override.f90 | 12 ++++++++++++ flang/tools/flang-driver/driver.cpp | 7 +++++++ 5 files changed, 23 insertions(+), 4 deletions(-) create mode 100644 flang/test/Driver/fcc_override.f90 diff --git a/clang/include/clang/Driver/Driver.h b/clang/include/clang/Driver/Driver.h index b463dc2a93550..7ca848f11b561 100644 --- a/clang/include/clang/Driver/Driver.h +++ b/clang/include/clang/Driver/Driver.h @@ -879,7 +879,7 @@ llvm::Error expandResponseFiles(SmallVectorImpl &Args, /// See applyOneOverrideOption. void applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideOpts, - llvm::StringSet<> &SavedStrings, + llvm::StringSet<> &SavedStrings, StringRef EnvVar, raw_ostream *OS = nullptr); } // end namespace driver diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp index a648cc928afdc..a8fea35926a0d 100644 --- a/clang/lib/Driver/Driver.cpp +++ b/clang/lib/Driver/Driver.cpp @@ -7289,7 +7289,7 @@ static void applyOneOverrideOption(raw_ostream &OS, void driver::applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideStr, llvm::StringSet<> &SavedStrings, - raw_ostream *OS) { + StringRef EnvVar, raw_ostream *OS) { if (!OS) OS = &llvm::nulls(); @@ -7298,7 +7298,7 @@ void driver::applyOverrideOptions(SmallVectorImpl &Args, OS = &llvm::nulls(); } - *OS << "### CCC_OVERRIDE_OPTIONS: " << OverrideStr << "\n"; + *OS << "### " << EnvVar << ": " << OverrideStr << "\n"; // This does not need to be efficient. diff --git a/clang/tools/driver/driver.cpp b/clang/tools/driver/driver.cpp index 82f47ab973064..81964c65c2892 100644 --- a/clang/tools/driver/driver.cpp +++ b/clang/tools/driver/driver.cpp @@ -305,7 +305,7 @@ int clang_main(int Argc, char **Argv, const llvm::ToolContext &ToolContext) { if (const char *OverrideStr = ::getenv("CCC_OVERRIDE_OPTIONS")) { // FIXME: Driver shouldn't take extra initial argument. 
driver::applyOverrideOptions(Args, OverrideStr, SavedStrings, - &llvm::errs()); + "CCC_OVERRIDE_OPTIONS", &llvm::errs()); } std::string Path = GetExecutablePath(ToolContext.Path, CanonicalPrefixes); diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 new file mode 100644 index 0000000000000..55a07803fdde5 --- /dev/null +++ b/flang/test/Driver/fcc_override.f90 @@ -0,0 +1,12 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s +! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR + +! CHECK: "-fc1" +! CHECK-NOT: "-Oignore" +! CHECK: "-Omagic" +! CHECK-NOT: "-Oignore" + +! RM-WERROR: ### FCC_OVERRIDE_OPTIONS: x-Werror +-g +! RM-WERROR-NEXT: ### Deleting argument -Werror +! RM-WERROR-NEXT: ### Adding argument -g at end +! RM-WERROR-NOT: "-Werror" diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..ad0efa3279cef 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -111,6 +111,13 @@ int main(int argc, const char **argv) { } } + llvm::StringSet<> SavedStrings; + // Handle FCC_OVERRIDE_OPTIONS, used for editing a command line behind the + // scenes. + if (const char *OverrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) + clang::driver::applyOverrideOptions(args, OverrideStr, SavedStrings, + "FCC_OVERRIDE_OPTIONS", &llvm::errs()); + // Not in the frontend mode - continue in the compiler driver mode. // Create DiagnosticsEngine for the compiler driver >From d1f2c9b8abd2690612a4b886a7a85b8e7f57d359 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 11:05:57 +0100 Subject: [PATCH 2/6] Add documentation for FCC_OVERRIDE_OPTIONS. 
--- flang/docs/FlangDriver.md | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index 97744f0bee069..f93df8701e677 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -614,3 +614,28 @@ nvfortran defines `-fast` as - `-Mcache_align`: there is no equivalent flag in Flang or Clang. - `-Mflushz`: flush-to-zero mode - when `-ffast-math` is specified, Flang will link to `crtfastmath.o` to ensure denormal numbers are flushed to zero. + + +## FCC_OVERRIDE_OPTIONS + +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of +edits to the input argument lists. The value of this environment variable is +a space separated list of edits to perform. These edits are applied in order to +the input argument lists. Edits should be one of the following forms: + +- `#`: Silence information about the changes to the command line arguments. + +- `^FOO`: Add `FOO` as a new argument at the beginning of the command line. + +- `+FOO`: Add `FOO` as a new argument at the end of the command line. + +- `s/XXX/YYY/`: Substitute the regular expression `XXX` with `YYY` in the + command line. + +- `xOPTION`: Removes all instances of the literal argument `OPTION`. + +- `XOPTION`: Removes all instances of the literal argument `OPTION`, and the + following argument. + +- `Ox`: Removes all flags matching `O` or `O[sz0-9]` and adds `Ox` at the end + of the command line. \ No newline at end of file >From d093a6ac74f8c0058e134ec55fbbf2b8edf9b477 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 17:28:46 +0100 Subject: [PATCH 3/6] Mention that effect on options added by the config files. 
--- flang/docs/FlangDriver.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index f93df8701e677..0302cb1dc33b9 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -638,4 +638,6 @@ the input argument lists. Edits should be one of the following forms: following argument. - `Ox`: Removes all flags matching `O` or `O[sz0-9]` and adds `Ox` at the end - of the command line. \ No newline at end of file + of the command line. + +This environment variable does not affect the options added by the config files. >From 9faf4d384a40514b15cc3bf270303843e8dd4822 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 20:36:59 +0100 Subject: [PATCH 4/6] Add a test for option from config file. --- flang/test/Driver/Inputs/config-7.cfg | 1 + flang/test/Driver/fcc_override.f90 | 5 +++++ 2 files changed, 6 insertions(+) create mode 100644 flang/test/Driver/Inputs/config-7.cfg diff --git a/flang/test/Driver/Inputs/config-7.cfg b/flang/test/Driver/Inputs/config-7.cfg new file mode 100644 index 0000000000000..2f41be663b282 --- /dev/null +++ b/flang/test/Driver/Inputs/config-7.cfg @@ -0,0 +1 @@ +-Werror diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 index 55a07803fdde5..417919b5d667a 100644 --- a/flang/test/Driver/fcc_override.f90 +++ b/flang/test/Driver/fcc_override.f90 @@ -1,5 +1,6 @@ ! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s ! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR +! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror" %flang --config=%S/Inputs/config-7.cfg -### %s -c 2>&1 | FileCheck %s -check-prefix=CONF ! CHECK: "-fc1" ! CHECK-NOT: "-Oignore" @@ -10,3 +11,7 @@ ! RM-WERROR-NEXT: ### Deleting argument -Werror ! 
RM-WERROR-NEXT: ### Adding argument -g at end ! RM-WERROR-NOT: "-Werror" + +! Test that FCC_OVERRIDE_OPTIONS does not affect the options from config files. +! CONF: ### FCC_OVERRIDE_OPTIONS: x-Werror +! CONF: "-Werror" >From db45474fc5625223b5240aa7f7ef094d2d80d5ae Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Fri, 30 May 2025 10:28:49 +0100 Subject: [PATCH 5/6] Handle review comments. --- flang/docs/FlangDriver.md | 8 ++++---- flang/test/Driver/fcc_override.f90 | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index 0302cb1dc33b9..e6750c92567a4 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -618,10 +618,10 @@ nvfortran defines `-fast` as ## FCC_OVERRIDE_OPTIONS -The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of -edits to the input argument lists. The value of this environment variable is -a space separated list of edits to perform. These edits are applied in order to -the input argument lists. Edits should be one of the following forms: +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to edit flang's +command line arguments. The value of this variable is a space-separated list of +edits to perform. The edits are applied in the order in which they appear in +`FCC_OVERRIDE_OPTIONS`. Each edit should be one of the following forms: - `#`: Silence information about the changes to the command line arguments. diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 index 417919b5d667a..71def0847f150 100644 --- a/flang/test/Driver/fcc_override.f90 +++ b/flang/test/Driver/fcc_override.f90 @@ -1,5 +1,5 @@ ! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s -! 
RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR +! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang --target=x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR ! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror" %flang --config=%S/Inputs/config-7.cfg -### %s -c 2>&1 | FileCheck %s -check-prefix=CONF ! CHECK: "-fc1" >From 2efd653c1a575fc77d53d5db7db1cf5225f8039d Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Fri, 30 May 2025 11:13:29 +0100 Subject: [PATCH 6/6] Handle review comments. --- flang/tools/flang-driver/driver.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ad0efa3279cef..62dbfc72bb191 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -111,11 +111,11 @@ int main(int argc, const char **argv) { } } - llvm::StringSet<> SavedStrings; + llvm::StringSet<> savedStrings; // Handle FCC_OVERRIDE_OPTIONS, used for editing a command line behind the // scenes. - if (const char *OverrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) - clang::driver::applyOverrideOptions(args, OverrideStr, SavedStrings, + if (const char *overrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) + clang::driver::applyOverrideOptions(args, overrideStr, savedStrings, "FCC_OVERRIDE_OPTIONS", &llvm::errs()); // Not in the frontend mode - continue in the compiler driver mode. 
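
Editorial sketch (not part of the patch series above): the edit forms listed in the `FCC_OVERRIDE_OPTIONS` documentation can be illustrated with a small standalone function. This is a simplified model written only for illustration, not the code in `clang::driver::applyOverrideOptions`. The name `applyEditsSketch` is hypothetical, the `#` and `s/XXX/YYY/` forms are left out, and it assumes that `args[0]` is the program name and is never edited.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, NOT the clang/flang implementation: applies a subset
// of the documented edit forms (^FOO, +FOO, xOPTION, XOPTION, Ox) to a copy
// of the argument list. Assumption: args[0] is the program name.
std::vector<std::string> applyEditsSketch(std::vector<std::string> args,
                                          const std::string &overrides) {
  if (args.empty())
    return args;
  std::istringstream in(overrides);
  std::string edit;
  // Edits are whitespace-separated and applied in the order they appear.
  while (in >> edit) {
    const char kind = edit[0];
    const std::string rest = edit.substr(1);
    if (kind == '^') {
      // Add a new argument at the beginning, right after the program name.
      args.insert(args.begin() + 1, rest);
    } else if (kind == '+') {
      // Add a new argument at the end.
      args.push_back(rest);
    } else if (kind == 'x' || kind == 'X') {
      // Remove every literal match; 'X' also removes the following argument.
      for (std::size_t i = 1; i < args.size();) {
        if (args[i] != rest) {
          ++i;
          continue;
        }
        args.erase(args.begin() + i);
        if (kind == 'X' && i < args.size())
          args.erase(args.begin() + i);
      }
    } else if (kind == 'O') {
      // Remove -O and -O[sz0-9], then append the new level, e.g. -Omagic.
      for (std::size_t i = 1; i < args.size();) {
        const std::string &a = args[i];
        const bool isOptLevel =
            a == "-O" ||
            (a.size() == 3 && a.compare(0, 2, "-O") == 0 &&
             (a[2] == 's' || a[2] == 'z' || (a[2] >= '0' && a[2] <= '9')));
        if (isOptLevel)
          args.erase(args.begin() + i);
        else
          ++i;
      }
      args.push_back("-" + edit);
    }
    // Any other edit form (including '#' and 's/XXX/YYY/') is ignored here.
  }
  return args;
}
```

Under these assumptions, calling `applyEditsSketch({"flang", "-Werror", "x.f90", "-c"}, "x-Werror +-g")` yields `flang x.f90 -c -g`, which mirrors the `RM-WERROR` test in the patch: `-Werror` is deleted and `-g` is added at the end.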
From flang-commits at lists.llvm.org Fri May 30 03:19:40 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Fri, 30 May 2025 03:19:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <6839863c.a70a0220.79a88.e4ca@mx.google.com> ================ @@ -0,0 +1,178 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (!copyIn.getWasCopied().hasOneUse()) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's WasCopied has no single user"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().user_begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type sequenceType = + hlfir::getFortranElementOrSequenceType(inputVariable.getType()); + fir::BoxType resultBoxType = fir::BoxType::get(sequenceType); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultBoxType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value result = inputVariable; + if (fir::isPointerType(inputVariable.getType())) { + auto boxAddr = builder.create(loc, inputVariable); + fir::ReferenceType refTy = fir::ReferenceType::get(sequenceType); + mlir::Value refVal = builder.createConvert(loc, refTy, boxAddr); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + result = builder.create(loc, resultBoxType, refVal, + shape); + } + builder.create( + loc, mlir::ValueRange{result, builder.createBool(loc, false)}); + }) + .genElse([&] { + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + llvm::StringRef tmpName{".tmp.copy_in"}; + llvm::SmallVector lenParams; + mlir::Value alloc = builder.createHeapTemporary( + loc, sequenceType, tmpName, extents, lenParams); + + auto declareOp = builder.create( + loc, alloc, tmpName, shape, lenParams, + /*dummy_scope=*/nullptr); + hlfir::Entity temp{declareOp.getBase()}; + hlfir::LoopNest loopNest = + hlfir::genLoopNest(loc, builder, extents, /*isUnordered=*/true, + 
flangomp::shouldUseWorkshareLowering(copyIn), + /*couldVectorize=*/false); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. + if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = builder.create(loc, resultBoxType, refVal, + shape); + } + + builder.create( + loc, mlir::ValueRange{result, builder.createBool(loc, true)}); + }) + .getResults(); + + mlir::OpResult resultBox = results[0]; + mlir::OpResult needsCleanup = results[1]; + + auto alloca = builder.create(loc, resultBox.getType()); ---------------- mrkajetanp wrote: That tempBox is a ``` %3 = "fir.alloca"() <{in_type = !fir.box>>, operandSegmentSizes = array}> : () -> !fir.ref>>> ``` while the box resulting from the if block is just a box because in one of the branches it's not actually a heap type. If I try storing it directly I get a type mismatch error on that account because the op is: ``` "fir.store"(%20#0, %5) : (!fir.box>, !fir.ref>>>) -> () ``` Do you mean I should force convert the resulting box? Or do some other type tricks? The thinking here was that the existing tempBox will get cleaned up anyway so it's better to make a specific new alloca with the type I want. 
https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Fri May 30 03:19:57 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Fri, 30 May 2025 03:19:57 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <6839864d.050a0220.a44f4.ebba@mx.google.com> https://github.com/mrkajetanp edited https://github.com/llvm/llvm-project/pull/138718 From flang-commits at lists.llvm.org Fri May 30 03:34:35 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 30 May 2025 03:34:35 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <683989bb.050a0220.ae6dd.e511@mx.google.com> ================ @@ -777,5 +777,22 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// is returns std::nullopt. +MaybeExpr GetConvertInput(const SomeExpr &x); + ---------------- tblah wrote: I just thought it was confusing declaring them in a header with one name then defining them in a different cpp file. 
But you're right, it doesn't make a lot of sense putting them in check-omp-structure.h either. I think these functions do not look very specific to atomic so it should be okay to extract them if you don't mind. Otherwise I am open to other ideas or if you feel strongly they can stay where they are. I don't think there's an actual style guide rule saying these can't be in a different header and cpp file. https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Thu May 29 08:46:19 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 29 May 2025 08:46:19 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <6838814b.170a0220.15da2.9100@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From 9392a42caf6d8956cfbe0e3b404f0434909c8c41 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, Destroy, and DescriptorIO. Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. There is a fast(?) 
mode in the work queue implementation that causes new work items to be executed to completion immediately upon creation, saving the overhead of actually representing and managing the work queue. This mode can't be used on GPU devices, but it is enabled by default for CPU hosts. It can be disabled easily for debugging and performance testing. --- .../include/flang-rt/runtime/environment.h | 3 + flang-rt/include/flang-rt/runtime/stat.h | 10 +- .../include/flang-rt/runtime/work-queue.h | 548 +++++++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 595 ++++++++++------ flang-rt/lib/runtime/derived.cpp | 519 +++++++------- flang-rt/lib/runtime/descriptor-io.cpp | 651 +++++++++++++++++- flang-rt/lib/runtime/descriptor-io.h | 620 +---------------- flang-rt/lib/runtime/environment.cpp | 4 + flang-rt/lib/runtime/namelist.cpp | 1 + flang-rt/lib/runtime/tools.cpp | 4 +- flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 161 +++++ flang-rt/unittests/Runtime/ExternalIOTest.cpp | 2 +- flang/docs/Extensions.md | 10 + flang/include/flang/Runtime/assign.h | 2 +- 16 files changed, 2054 insertions(+), 1084 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h index 16258b3bbba9b..e579f6012ce86 100644 --- a/flang-rt/include/flang-rt/runtime/environment.h +++ b/flang-rt/include/flang-rt/runtime/environment.h @@ -64,6 +64,9 @@ struct ExecutionEnvironment { bool defaultUTF8{false}; // DEFAULT_UTF8 bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION + enum InternalDebugging { WorkQueue = 1 }; + int internalDebugging{0}; // FLANG_RT_DEBUG + // CUDA related variables std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE bool cudaDeviceIsManaged{false}; // NV_CUDAFOR_DEVICE_IS_MANAGED diff --git 
a/flang-rt/include/flang-rt/runtime/stat.h b/flang-rt/include/flang-rt/runtime/stat.h index 070d0bf8673fb..dc372de53506a 100644 --- a/flang-rt/include/flang-rt/runtime/stat.h +++ b/flang-rt/include/flang-rt/runtime/stat.h @@ -24,7 +24,7 @@ class Terminator; enum Stat { StatOk = 0, // required to be zero by Fortran - // Interoperable STAT= codes + // Interoperable STAT= codes (>= 11) StatBaseNull = CFI_ERROR_BASE_ADDR_NULL, StatBaseNotNull = CFI_ERROR_BASE_ADDR_NOT_NULL, StatInvalidElemLen = CFI_INVALID_ELEM_LEN, @@ -36,7 +36,7 @@ enum Stat { StatMemAllocation = CFI_ERROR_MEM_ALLOCATION, StatOutOfBounds = CFI_ERROR_OUT_OF_BOUNDS, - // Standard STAT= values + // Standard STAT= values (>= 101) StatFailedImage = FORTRAN_RUNTIME_STAT_FAILED_IMAGE, StatLocked = FORTRAN_RUNTIME_STAT_LOCKED, StatLockedOtherImage = FORTRAN_RUNTIME_STAT_LOCKED_OTHER_IMAGE, @@ -49,10 +49,14 @@ enum Stat { // Additional "processor-defined" STAT= values StatInvalidArgumentNumber = FORTRAN_RUNTIME_STAT_INVALID_ARG_NUMBER, StatMissingArgument = FORTRAN_RUNTIME_STAT_MISSING_ARG, - StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, + StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, // -1 StatMoveAllocSameAllocatable = FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE, StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION, + + // Dummy status for work queue continuation, declared here to perhaps + // avoid collisions + StatContinue = 201 }; RT_API_ATTRS const char *StatErrorString(int); diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..85b995c254b5b --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,548 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue comprises a list of tickets. Each ticket class has a Begin() +// member function, which is called once, and a Continue() member function +// that can be called zero or more times. A ticket's execution terminates +// when either of these member functions returns a status other than +// StatContinue. When that status is not StatOk, then the whole queue +// is shut down. +// +// By returning StatContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the responsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentsOverElements, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. +// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. 
+// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. Run calls AssignTicket::Begin(), which pushes a ticket via +// BeginFinalize() and returns StatContinue. +// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. 
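The scheduling rules in the header comment above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code from the patch: `MiniQueue`, `Step`, and `TraceAssignExample` are hypothetical names, and the model uses `std::list` and `std::function` where the real runtime uses its own `TicketList` nodes and a `std::variant` of ticket classes. It reproduces only the behaviour the comment describes: the last ticket on the list always runs next, nested tickets are placed directly after the active one, and a ticket ends when it returns something other than `StatContinue`.

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Status names mirror the patch; only their distinctness matters here.
enum Status { StatOk = 0, StatContinue = 201 };

// Hypothetical stand-in for WorkQueue (illustration only).
class MiniQueue {
public:
  using Step = std::function<int(MiniQueue &)>;

  // Creates a ticket. When called from a running ticket, the new ticket is
  // inserted directly after the active one, so the first nested ticket
  // created ends up last on the list and therefore runs first.
  void Start(Step begin, Step resume) {
    auto pos = running_ ? std::next(active_) : tickets_.end();
    tickets_.insert(pos, Ticket{std::move(begin), std::move(resume), false});
  }

  // Always runs the last ticket: Begin() once, then Continue() until the
  // ticket returns a status other than StatContinue.
  int Run() {
    while (!tickets_.empty()) {
      active_ = std::prev(tickets_.end());
      running_ = true;
      int status;
      if (!active_->begun) {
        active_->begun = true;
        status = active_->begin(*this);
      } else {
        status = active_->resume(*this);
      }
      running_ = false;
      if (status == StatContinue) {
        continue; // suspended; any nested tickets now run first
      }
      tickets_.erase(active_); // this ticket is finished
      if (status != StatOk) {
        tickets_.clear(); // an error shuts the whole queue down
        return status;
      }
    }
    return StatOk;
  }

private:
  struct Ticket {
    Step begin, resume;
    bool begun;
  };
  std::list<Ticket> tickets_;
  std::list<Ticket>::iterator active_;
  bool running_{false};
};

// Replays the five-step derived type assignment example as a call trace.
std::vector<std::string> TraceAssignExample() {
  std::vector<std::string> trace;
  MiniQueue queue;
  // Builds a step that records its name and returns a fixed status.
  auto step = [&trace](const char *name, int status) {
    return [&trace, name, status](MiniQueue &) -> int {
      trace.push_back(name);
      return status;
    };
  };
  queue.Start(
      [&](MiniQueue &q) -> int {
        trace.push_back("Assign::Begin");
        q.Start(step("Finalize::Begin", StatContinue),
            step("Finalize::Continue", StatOk));
        return StatContinue; // suspend until finalization is complete
      },
      [&](MiniQueue &q) -> int {
        trace.push_back("Assign::Continue");
        q.Start(step("DerivedAssign::Begin", StatContinue),
            step("DerivedAssign::Continue", StatOk));
        return StatOk; // this ticket ends; the nested one remains
      });
  trace.push_back(queue.Run() == StatOk ? "done" : "failed");
  return trace;
}
```

Calling `TraceAssignExample()` returns the order in which the member functions were invoked, which can be compared by hand against steps 1 through 5 of the comment. The sketch deliberately omits the run-immediately mode, the static ticket pool, and device attributes such as `RT_API_ATTRS`.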
+ +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/connection.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include + +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; + +// Ticket worker base classes + +template class ImmediateTicketRunner { +public: + RT_API_ATTRS explicit ImmediateTicketRunner(TICKET &ticket) + : ticket_{ticket} {} + RT_API_ATTRS int Run(WorkQueue &workQueue) { + int status{ticket_.Begin(workQueue)}; + while (status == StatContinue) { + status = ticket_.Continue(workQueue); + } + return status; + } + +private: + TICKET &ticket_; +}; + +// Base class for ticket workers that operate elementwise over descriptors +class Elementwise { +protected: + RT_API_ATTRS Elementwise( + const Descriptor &instance, const Descriptor *from = nullptr) + : instance_{instance}, from_{from} { + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + RT_API_ATTRS bool IsComplete() const { return elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that 
operate over derived type components. +class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentsOverElements : protected Componentwise, protected Elementwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Componentwise{derived}, Elementwise{instance, from} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Ticket worker classes + +// Implements derived type instance initialization +class InitializeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket + : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor 
cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket : public ImmediateTicketRunner { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : ImmediateTicketRunner{*this}, to_{to}, from_{&from}, + flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type 
intrinsic assignment. +template +class DerivedAssignTicket + : public ImmediateTicketRunner>, + private std::conditional_t { +public: + using Base = std::conditional_t; + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ImmediateTicketRunner{*this}, + Base{to, derived, &from}, flags_{flags}, memmoveFct_{memmoveFct}, + deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + static constexpr bool isComponentwise_{IS_COMPONENTWISE}; + bool isContiguousComponentwise_{isComponentwise_ && + this->instance_.IsContiguous() && this->from_->IsContiguous()}; + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { + +template +class DescriptorIoTicket + : public ImmediateTicketRunner>, + private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS bool &anyIoTookPlace() { return anyIoTookPlace_; } + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; + +template +class DerivedIoTicket : public ImmediateTicketRunner>, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType 
&derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; +}; + +} // namespace io::descr + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + DerivedAssignTicket, + io::descr::DescriptorIoTicket, + io::descr::DescriptorIoTicket, + io::descr::DerivedIoTicket, + io::descr::DerivedIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + // APIs for particular tasks. These can return StatOk if the work is + // completed immediately. 
+ RT_API_ATTRS int BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return InitializeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + if (runTicketsImmediately_) { + return InitializeCloneTicket{clone, original, derived, hasStat, errMsg} + .Run(*this); + } else { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); + return StatContinue; + } + } + RT_API_ATTRS int BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return FinalizeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + if (runTicketsImmediately_) { + return DestroyTicket{descriptor, derived, finalize}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived, finalize); + return StatContinue; + } + } + RT_API_ATTRS int BeginAssign(Descriptor &to, const Descriptor &from, + int flags, MemmoveFct memmoveFct) { + if (runTicketsImmediately_) { + return AssignTicket{to, from, flags, memmoveFct}.Run(*this); + } else { + StartTicket().u.emplace(to, from, flags, memmoveFct); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) { + if (runTicketsImmediately_) { + return DerivedAssignTicket{ + to, from, derived, flags, memmoveFct, deallocateAfter} + .Run(*this); + } else { + StartTicket().u.emplace>( + to, from, derived, 
flags, memmoveFct, deallocateAfter); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDescriptorIo(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DescriptorIoTicket{ + io, descriptor, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, table, anyIoTookPlace); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedIo(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DerivedIoTicket{ + io, descriptor, derived, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + return StatContinue; + } + } + + RT_API_ATTRS int Run(); + +private: +#if RT_DEVICE_COMPILATION + // Always use the work queue on a GPU device to avoid recursion. + static constexpr bool runTicketsImmediately_{false}; +#else + // Avoid the work queue overhead on the host, unless it needs + // debugging, which is so much easier there. + static constexpr bool runTicketsImmediately_{true}; +#endif + + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index a3f63b4315644..332c0872e065f 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -68,6 +68,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -131,6 +132,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 86aeeaa88f2d1..742b938121068 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncObject)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic 
assignments and // those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,422 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + if (workQueue.BeginAssign(to, from, flags, memmoveFct) == StatContinue) { + workQueue.Run(); + } +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + if (int status{workQueue.BeginFinalize(*toDeallocate_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncObject))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum * addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(newFrom, *derived)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + static constexpr int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (int status{workQueue.BeginAssign( + newFrom, *from_, nestedFlags, memmoveFct_)}; + status != StatOk && status != StatContinue) { + return status; + } + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && 
toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + if (int status{workQueue.BeginFinalize(to_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + } else if (!toDerived_->noDestructionNeeded()) { + if (int status{ + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false)}; + status != StatOk && status != StatContinue) { + return status; } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed } } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. 
- // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + return StatContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); + } + return StatOk; + } + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(to_, *toDerived_)}; + status != StatOk) { + return status; + } + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t 
toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); - } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. 
- // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } + if (toDerived_) { + // TODO: don't go componentwise if any defined assignment, maybe + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +template +RT_API_ATTRS int DerivedAssignTicket::Begin( + WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{this->derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{this->IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_(this->instance_.template ElementComponent( + this->subscripts_, procPtr.offset), + this->from_->template ElementComponent( + this->fromSubscripts_, procPtr.offset), + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} +template RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &); +template RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &); + +template +RT_API_ATTRS int DerivedAssignTicket::Continue( + WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementComponentBase so that it + // loops over elements at the outer level and over components at the inner. 
+ // This yield less surprising behavior for codes that care due to the use + // of defined assignments when the ordering of their calls matters. + while (!this->IsComplete()) { + // Copy the data components (incl. the parent) first. + switch (this->component_->genre()) { + case typeInfo::Component::Genre::Data: + if (this->component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{this->componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{this->fromComponentDescriptor_.descriptor()}; + this->component_->CreatePointerDescriptor(toCompDesc, this->instance_, + workQueue.terminator(), this->subscripts_); + this->component_->CreatePointerDescriptor(fromCompDesc, *this->from_, + workQueue.terminator(), this->fromSubscripts_); + this->Advance(); + if (int status{workQueue.BeginAssign( + toCompDesc, fromCompDesc, flags_, memmoveFct_)}; + status != StatOk) { + return status; + } + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{ + this->component_->SizeInBytes(this->instance_)}; + if (isContiguousComponentwise_) { + std::size_t offset{this->component_->offset()}; + char *to{this->instance_.template OffsetElement(offset)}; + const char *from{ + this->from_->template OffsetElement(offset)}; + std::size_t toElementStride{this->instance_.ElementBytes()}; + std::size_t fromElementStride{ + this->from_->rank() == 0 ? 
0 : this->from_->ElementBytes()}; + for (std::size_t n{this->elements_}; n--; + to += toElementStride, from += fromElementStride) { + memmoveFct_(to, from, componentByteSize); + } + this->Componentwise::Advance(); + } else { + memmoveFct_( + this->instance_.template Element(this->subscripts_) + + this->component_->offset(), + this->from_->template Element(this->fromSubscripts_) + + this->component_->offset(), + componentByteSize); + this->Advance(); + } + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{ + this->component_->SizeInBytes(this->instance_)}; + if (isContiguousComponentwise_) { + std::size_t offset{this->component_->offset()}; + char *to{this->instance_.template OffsetElement(offset)}; + const char *from{ + this->from_->template OffsetElement(offset)}; + std::size_t toElementStride{this->instance_.ElementBytes()}; + std::size_t fromElementStride{ + this->from_->rank() == 0 ? 0 : this->from_->ElementBytes()}; + for (std::size_t n{this->elements_}; n--; + to += toElementStride, from += fromElementStride) { + memmoveFct_(to, from, componentByteSize); + } + this->Componentwise::Advance(); + } else { + memmoveFct_(this->instance_.template Element(this->subscripts_) + + this->component_->offset(), + this->from_->template Element(this->fromSubscripts_) + + this->component_->offset(), + componentByteSize); + this->Advance(); + } + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + this->instance_.template Element(this->subscripts_) + + this->component_->offset())}; + const auto *fromDesc{reinterpret_cast( + this->from_->template Element(this->fromSubscripts_) + + this->component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (this->phase_ == 0) { + this->phase_++; + if (const auto *componentDerived{this->component_->derivedType()}; + componentDerived && 
!componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } + } + toDesc->Deallocate(); + } + this->Advance(); + } else { + // Allocatable components of the LHS are unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + this->Advance(); + int nestedFlags{flags_}; + if (this->derived_.noFinalizationNeeded() && + this->derived_.noInitializationNeeded() && + this->derived_.noDestructionNeeded()) { + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + } else { + // Force LHS deallocation with DeallocateLHS flag. + nestedFlags |= DeallocateLHS; + } + if (int status{workQueue.BeginAssign( + *toDesc, *fromDesc, nestedFlags, memmoveFct_)}; + status != StatOk) { + return status; + } + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} +template RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &); +template RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &); RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +728,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -598,7 +744,6 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
if (var) { diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index 35037036f63e7..8166ab64cfd71 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,195 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitialize(instance, derived)}; + return status == StatContinue ? 
workQueue.Run() : status; +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &comp{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + auto &pptr{*instance_.ElementComponent( + subscripts_, comp.offset)}; + pptr = comp.procInitialization; + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + SkipToNextComponent(); + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t 
bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitialize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitializeClone( + clone, original, derived, hasStat, errMsg)}; + return status == StatContinue ? workQueue.Run() : status; } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. 
- stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. - if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncObject), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + if (int status{workQueue.BeginInitialize(cloneDesc, *derived)}; + status != StatOk) { + return status; } } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType * + derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_)}; + status != StatOk) { + return status; + } + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); + } + } + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + if (workQueue.BeginFinalize(descriptor, derived) == StatContinue) { + workQueue.Run(); } } - return stat; } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +237,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +274,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,87 +298,94 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with 
the same // descriptor so that the rank is preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType * + compDynamicType{addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + if (int status{ + workQueue.BeginFinalize(compDesc, *compDynamicType)}; + status != StatOk) { + return status; } } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType * compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginFinalize(compDesc, *compType)}; + status != StatOk) { + return status; } } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == 
typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginFinalize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + const auto &parentType{*finalizableParentType_}; + finalizableParentType_ = nullptr; + // Don't return StatOk here if the nested FInalize is still running; + // it needs this->componentDescriptor_. + return workQueue.BeginFinalize(tmpDesc, parentType); } + return StatOk; } // The order of finalization follows Fortran 2018 7.5.6.2, with @@ -373,51 +394,71 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + if (workQueue.BeginDestroy(descriptor, derived, finalize) == StatContinue) { + workQueue.Run(); + } } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + if (int status{workQueue.BeginFinalize(instance_, derived_)}; + status != StatOk && status != StatContinue) { + return status; + } } + return StatContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. 
// Contrary to finalization, the order of deallocation does not matter. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *d, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || 
componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginDestroy( + compDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..4aa3640c1ed94 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,15 +7,44 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) -Fortran::common::optional DefinedFormattedIo(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &derived, +static RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( + IoStatementState &io, const Descriptor &descriptor, + const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special, const SubscriptValue subscripts[]) { Fortran::common::optional peek{ @@ -104,8 +133,8 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -152,5 +181,619 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, return handler.GetIoStat() == IostatOk; } +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output. 
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + } + } + return StatOk; +} + +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType * + type{addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } + } + } + if (const typeInfo::SpecialBinding * + special{type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + if (DefinedUnformattedIo(io_, instance_, *type, *special)) { + anyIoTookPlace_ = true; + return StatOk; + } else { + return IostatEnd; + } + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? 
io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. + if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + if constexpr (DIR == Direction::Output) { + return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + return externalUnf + ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elements_ * elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + bool any{false}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + any = FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Complex: + switch (kind) 
{ + case 2: + any = FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + any = FormattedCharacterIO(io_, instance_); + break; + case 2: + any = FormattedCharacterIO(io_, instance_); + break; + case 4: + any = FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + any = FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + any = FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + any = FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding * + binding{derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatContinue; + } + } + if (any) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ while (!IsComplete()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + Advance(); + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element(subscripts_)); + Advance(); + if (int status{workQueue.BeginDerivedIo( + io_, elementDesc, *derived_, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + if (workQueue.BeginDescriptorIo(io, descriptor, table, anyIoTookPlace) == + StatContinue) { + workQueue.Run(); + } + return anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. -// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) 
+#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/tools.cpp b/flang-rt/lib/runtime/tools.cpp index b08195cd31e05..24d05f369fcbe 100644 --- a/flang-rt/lib/runtime/tools.cpp +++ b/flang-rt/lib/runtime/tools.cpp @@ -205,7 +205,7 @@ RT_API_ATTRS void ShallowCopyInner(const Descriptor &to, const Descriptor &from, // Doing the recursion upwards instead of downwards puts the more common // 
cases earlier in the if-chain and has a tangible impact on performance. template struct ShallowCopyRankSpecialize { - static bool execute(const Descriptor &to, const Descriptor &from, + static RT_API_ATTRS bool execute(const Descriptor &to, const Descriptor &from, bool toIsContiguous, bool fromIsContiguous) { if (to.rank() == RANK && from.rank() == RANK) { ShallowCopyInner(to, from, toIsContiguous, fromIsContiguous); @@ -217,7 +217,7 @@ template struct ShallowCopyRankSpecialize { }; template struct ShallowCopyRankSpecialize { - static bool execute(const Descriptor &to, const Descriptor &from, + static RT_API_ATTRS bool execute(const Descriptor &to, const Descriptor &from, bool toIsContiguous, bool fromIsContiguous) { return false; } diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp index 82182696d70c6..451213202acef 100644 --- a/flang-rt/lib/runtime/type-info.cpp +++ b/flang-rt/lib/runtime/type-info.cpp @@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor, const SubscriptValue *subscripts) const { RUNTIME_CHECK(terminator, genre_ == Genre::Data); EstablishDescriptor(descriptor, container, terminator); + std::size_t offset{offset_}; if (subscripts) { - descriptor.set_base_addr(container.Element(subscripts) + offset_); - } else { - descriptor.set_base_addr(container.OffsetElement() + offset_); + offset += container.SubscriptsToByteOffset(subscripts); } + descriptor.set_base_addr(container.OffsetElement() + offset); descriptor.raw().attribute = CFI_attribute_pointer; } diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp new file mode 100644 index 0000000000000..a508ecb637102 --- /dev/null +++ b/flang-rt/lib/runtime/work-queue.cpp @@ -0,0 +1,161 @@ +//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 
+// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang-rt/runtime/work-queue.h" +#include "flang-rt/runtime/environment.h" +#include "flang-rt/runtime/memory.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/visit.h" + +namespace Fortran::runtime { + +#if !defined(RT_DEVICE_COMPILATION) +// FLANG_RT_DEBUG code is disabled when false. +static constexpr bool enableDebugOutput{false}; +#endif + +RT_OFFLOAD_API_GROUP_BEGIN + +RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived) + : derived_{derived}, components_{derived_.component().Elements()} { + GetComponent(); +} + +RT_API_ATTRS void Componentwise::GetComponent() { + if (IsComplete()) { + component_ = nullptr; + } else { + const Descriptor &componentDesc{derived_.component()}; + component_ = componentDesc.ZeroBasedIndexedElement( + componentAt_); + } +} + +RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) { + if (!begun) { + begun = true; + return common::visit( + [&workQueue]( + auto &specificTicket) { return specificTicket.Begin(workQueue); }, + u); + } else { + return common::visit( + [&workQueue](auto &specificTicket) { + return specificTicket.Continue(workQueue); + }, + u); + } +} + +RT_API_ATTRS WorkQueue::~WorkQueue() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } + while (firstFree_) { + TicketList *next{firstFree_->next}; + if (!firstFree_->isStatic) { + FreeMemory(firstFree_); + } + firstFree_ = next; + } +} + +RT_API_ATTRS Ticket &WorkQueue::StartTicket() { + if (!firstFree_) { + void *p{AllocateMemoryOrCrash(terminator_, sizeof(TicketList))}; + firstFree_ = new (p) TicketList; + firstFree_->isStatic = false; + } + TicketList *newTicket{firstFree_}; + if ((firstFree_ = newTicket->next)) { 
+ firstFree_->previous = nullptr; + } + TicketList *after{insertAfter_ ? insertAfter_->next : nullptr}; + if ((newTicket->previous = insertAfter_ ? insertAfter_ : last_)) { + newTicket->previous->next = newTicket; + } else { + first_ = newTicket; + } + if ((newTicket->next = after)) { + after->previous = newTicket; + } else { + last_ = newTicket; + } + newTicket->ticket.begun = false; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && + (executionEnvironment.internalDebugging & + ExecutionEnvironment::WorkQueue)) { + std::fprintf(stderr, "WQ: new ticket\n"); + } +#endif + return newTicket->ticket; +} + +RT_API_ATTRS int WorkQueue::Run() { + while (last_) { + TicketList *at{last_}; + insertAfter_ = last_; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && + (executionEnvironment.internalDebugging & + ExecutionEnvironment::WorkQueue)) { + std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(), + at->ticket.begun ? "Continue" : "Begin"); + } +#endif + int stat{at->ticket.Continue(*this)}; +#if !defined(RT_DEVICE_COMPILATION) + if (enableDebugOutput && + (executionEnvironment.internalDebugging & + ExecutionEnvironment::WorkQueue)) { + std::fprintf(stderr, "WQ: ... 
stat %d\n", stat); + } +#endif + insertAfter_ = nullptr; + if (stat == StatOk) { + if (at->previous) { + at->previous->next = at->next; + } else { + first_ = at->next; + } + if (at->next) { + at->next->previous = at->previous; + } else { + last_ = at->previous; + } + if ((at->next = firstFree_)) { + at->next->previous = at; + } + at->previous = nullptr; + firstFree_ = at; + } else if (stat != StatContinue) { + Stop(); + return stat; + } + } + return StatOk; +} + +RT_API_ATTRS void WorkQueue::Stop() { + if (last_) { + if ((last_->next = firstFree_)) { + last_->next->previous = last_; + } + firstFree_ = first_; + first_ = last_ = nullptr; + } +} + +RT_OFFLOAD_API_GROUP_END + +} // namespace Fortran::runtime diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp index 3833e48be3dd6..6c148b1de6f82 100644 --- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp +++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp @@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) { io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__); for (int j{1}; j <= 3; ++j) { ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc)) - << "OutputDescriptor() for InquireIoLength"; + << "OutputDescriptor() for InquireIoLength " << j; } ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength"; ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk) diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 51969de5ac7fe..377d01dbec69a 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -850,6 +850,16 @@ print *, [(j,j=1,10)] warning since such values may have become defined by the time the nested expression's value is required. +* Intrinsic assignment of arrays is defined elementally, and intrinsic + assignment of derived type components is defined componentwise. + However, when intrinsic assignment takes place for an array of derived + type, the order of the loop nesting is not defined. 
+ Some compilers will loop over the elements, assigning all of the components + of each element before proceeding to the next element. + This compiler loops over all of the components, and assigns all of + the elements for each component before proceeding to the next component. + A program using defined assignment might be able to detect the difference. + ## De Facto Standard Features * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h index bc80997a1bec2..eb1f63184a177 100644 --- a/flang/include/flang/Runtime/assign.h +++ b/flang/include/flang/Runtime/assign.h @@ -38,7 +38,7 @@ enum AssignFlags { ComponentCanBeDefinedAssignment = 1 << 3, ExplicitLengthCharacterLHS = 1 << 4, PolymorphicLHS = 1 << 5, - DeallocateLHS = 1 << 6 + DeallocateLHS = 1 << 6, }; #ifdef RT_DEVICE_COMPILATION From flang-commits at lists.llvm.org Thu May 29 09:01:50 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Thu, 29 May 2025 09:01:50 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727) In-Reply-To: Message-ID: <683884ee.170a0220.14af18.911b@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727 >From 121bfc89bc63672b974a678617ae77627997f752 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Wed, 23 Apr 2025 14:44:23 -0700 Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue Recursion, both direct and indirect, prevents accurate stack size calculation at link time for GPU device code. Restructure these recursive (often mutually so) routines in the Fortran runtime with new implementations based on an iterative work queue with suspendable/resumable work tickets: Assign, Initialize, initializeClone, Finalize, Destroy, and DescriptorIO. 
Note that derived type FINAL subroutine calls, defined assignments, and defined I/O procedures all perform callbacks into user code, which may well reenter the runtime library. This kind of recursion is not handled by this change, although it may be possible to do so in the future using thread-local work queues. The effects of this restructuring on CPU performance are yet to be measured. There is a fast(?) mode in the work queue implementation that causes new work items to be executed to completion immediately upon creation, saving the overhead of actually representing and managing the work queue. This mode can't be used on GPU devices, but it is enabled by default for CPU hosts. It can be disabled easily for debugging and performance testing. --- .../include/flang-rt/runtime/environment.h | 3 + flang-rt/include/flang-rt/runtime/stat.h | 10 +- .../include/flang-rt/runtime/work-queue.h | 548 +++++++++++++++ flang-rt/lib/runtime/CMakeLists.txt | 2 + flang-rt/lib/runtime/assign.cpp | 595 ++++++++++------ flang-rt/lib/runtime/derived.cpp | 517 +++++++------- flang-rt/lib/runtime/descriptor-io.cpp | 651 +++++++++++++++++- flang-rt/lib/runtime/descriptor-io.h | 620 +---------------- flang-rt/lib/runtime/environment.cpp | 4 + flang-rt/lib/runtime/namelist.cpp | 1 + flang-rt/lib/runtime/tools.cpp | 4 +- flang-rt/lib/runtime/type-info.cpp | 6 +- flang-rt/lib/runtime/work-queue.cpp | 161 +++++ flang-rt/unittests/Runtime/ExternalIOTest.cpp | 2 +- flang/docs/Extensions.md | 10 + flang/include/flang/Runtime/assign.h | 2 +- 16 files changed, 2052 insertions(+), 1084 deletions(-) create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h create mode 100644 flang-rt/lib/runtime/work-queue.cpp diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h index 16258b3bbba9b..e579f6012ce86 100644 --- a/flang-rt/include/flang-rt/runtime/environment.h +++ b/flang-rt/include/flang-rt/runtime/environment.h @@ -64,6 +64,9 @@ struct 
ExecutionEnvironment { bool defaultUTF8{false}; // DEFAULT_UTF8 bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION + enum InternalDebugging { WorkQueue = 1 }; + int internalDebugging{0}; // FLANG_RT_DEBUG + // CUDA related variables std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE bool cudaDeviceIsManaged{false}; // NV_CUDAFOR_DEVICE_IS_MANAGED diff --git a/flang-rt/include/flang-rt/runtime/stat.h b/flang-rt/include/flang-rt/runtime/stat.h index 070d0bf8673fb..dc372de53506a 100644 --- a/flang-rt/include/flang-rt/runtime/stat.h +++ b/flang-rt/include/flang-rt/runtime/stat.h @@ -24,7 +24,7 @@ class Terminator; enum Stat { StatOk = 0, // required to be zero by Fortran - // Interoperable STAT= codes + // Interoperable STAT= codes (>= 11) StatBaseNull = CFI_ERROR_BASE_ADDR_NULL, StatBaseNotNull = CFI_ERROR_BASE_ADDR_NOT_NULL, StatInvalidElemLen = CFI_INVALID_ELEM_LEN, @@ -36,7 +36,7 @@ enum Stat { StatMemAllocation = CFI_ERROR_MEM_ALLOCATION, StatOutOfBounds = CFI_ERROR_OUT_OF_BOUNDS, - // Standard STAT= values + // Standard STAT= values (>= 101) StatFailedImage = FORTRAN_RUNTIME_STAT_FAILED_IMAGE, StatLocked = FORTRAN_RUNTIME_STAT_LOCKED, StatLockedOtherImage = FORTRAN_RUNTIME_STAT_LOCKED_OTHER_IMAGE, @@ -49,10 +49,14 @@ enum Stat { // Additional "processor-defined" STAT= values StatInvalidArgumentNumber = FORTRAN_RUNTIME_STAT_INVALID_ARG_NUMBER, StatMissingArgument = FORTRAN_RUNTIME_STAT_MISSING_ARG, - StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, + StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, // -1 StatMoveAllocSameAllocatable = FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE, StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION, + + // Dummy status for work queue continuation, declared here to perhaps + // avoid collisions + StatContinue = 201 }; RT_API_ATTRS const char *StatErrorString(int); diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h 
b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..85b995c254b5b --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,548 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue comprises a list of tickets. Each ticket class has a Begin() +// member function, which is called once, and a Continue() member function +// that can be called zero or more times. A ticket's execution terminates +// when either of these member functions returns a status other than +// StatContinue. When that status is not StatOk, then the whole queue +// is shut down. +// +// By returning StatContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the responsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentsOverElements, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. 
+// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. +// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. Run calls AssignTicket::Begin(), which pushes a ticket via +// BeginFinalize() and returns StatContinue. +// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. 
+ +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/connection.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include + +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; + +// Ticket worker base classes + +template class ImmediateTicketRunner { +public: + RT_API_ATTRS explicit ImmediateTicketRunner(TICKET &ticket) + : ticket_{ticket} {} + RT_API_ATTRS int Run(WorkQueue &workQueue) { + int status{ticket_.Begin(workQueue)}; + while (status == StatContinue) { + status = ticket_.Continue(workQueue); + } + return status; + } + +private: + TICKET &ticket_; +}; + +// Base class for ticket workers that operate elementwise over descriptors +class Elementwise { +protected: + RT_API_ATTRS Elementwise( + const Descriptor &instance, const Descriptor *from = nullptr) + : instance_{instance}, from_{from} { + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + RT_API_ATTRS bool IsComplete() const { return elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that 
operate over derived type components. +class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentsOverElements : protected Componentwise, protected Elementwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Componentwise{derived}, Elementwise{instance, from} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Ticket worker classes + +// Implements derived type instance initialization +class InitializeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket + : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor 
cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket : public ImmediateTicketRunner { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : ImmediateTicketRunner{*this}, to_{to}, from_{&from}, + flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type 
intrinsic assignment. +template +class DerivedAssignTicket + : public ImmediateTicketRunner>, + private std::conditional_t { +public: + using Base = std::conditional_t; + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ImmediateTicketRunner{*this}, + Base{to, derived, &from}, flags_{flags}, memmoveFct_{memmoveFct}, + deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + static constexpr bool isComponentwise_{IS_COMPONENTWISE}; + bool isContiguousComponentwise_{isComponentwise_ && + this->instance_.IsContiguous() && this->from_->IsContiguous()}; + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { + +template +class DescriptorIoTicket + : public ImmediateTicketRunner>, + private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS bool &anyIoTookPlace() { return anyIoTookPlace_; } + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; + +template +class DerivedIoTicket : public ImmediateTicketRunner>, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType 
&derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; +}; + +} // namespace io::descr + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + DerivedAssignTicket, + io::descr::DescriptorIoTicket, + io::descr::DescriptorIoTicket, + io::descr::DerivedIoTicket, + io::descr::DerivedIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + // APIs for particular tasks. These can return StatOk if the work is + // completed immediately. 
+ RT_API_ATTRS int BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return InitializeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + if (runTicketsImmediately_) { + return InitializeCloneTicket{clone, original, derived, hasStat, errMsg} + .Run(*this); + } else { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); + return StatContinue; + } + } + RT_API_ATTRS int BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return FinalizeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + if (runTicketsImmediately_) { + return DestroyTicket{descriptor, derived, finalize}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived, finalize); + return StatContinue; + } + } + RT_API_ATTRS int BeginAssign(Descriptor &to, const Descriptor &from, + int flags, MemmoveFct memmoveFct) { + if (runTicketsImmediately_) { + return AssignTicket{to, from, flags, memmoveFct}.Run(*this); + } else { + StartTicket().u.emplace(to, from, flags, memmoveFct); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) { + if (runTicketsImmediately_) { + return DerivedAssignTicket{ + to, from, derived, flags, memmoveFct, deallocateAfter} + .Run(*this); + } else { + StartTicket().u.emplace>( + to, from, derived, 
flags, memmoveFct, deallocateAfter); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDescriptorIo(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DescriptorIoTicket{ + io, descriptor, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, table, anyIoTookPlace); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedIo(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DerivedIoTicket{ + io, descriptor, derived, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + return StatContinue; + } + } + + RT_API_ATTRS int Run(); + +private: +#if RT_DEVICE_COMPILATION + // Always use the work queue on a GPU device to avoid recursion. + static constexpr bool runTicketsImmediately_{false}; +#else + // Avoid the work queue overhead on the host, unless it needs + // debugging, which is so much easier there. + static constexpr bool runTicketsImmediately_{true}; +#endif + + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index a3f63b4315644..332c0872e065f 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -68,6 +68,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -131,6 +132,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 86aeeaa88f2d1..ee4500db17015 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncObject)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic 
assignments and // those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,422 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + if (workQueue.BeginAssign(to, from, flags, memmoveFct) == StatContinue) { + workQueue.Run(); + } +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + if (int status{workQueue.BeginFinalize(*toDeallocate_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncObject))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum *addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(newFrom, *derived)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + static constexpr int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (int status{workQueue.BeginAssign( + newFrom, *from_, nestedFlags, memmoveFct_)}; + status != StatOk && status != StatContinue) { + return status; + } + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && 
toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + if (int status{workQueue.BeginFinalize(to_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + } else if (!toDerived_->noDestructionNeeded()) { + if (int status{ + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false)}; + status != StatOk && status != StatContinue) { + return status; } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed } } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. 
- // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + return StatContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); + } + return StatOk; + } + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(to_, *toDerived_)}; + status != StatOk) { + return status; + } + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t 
toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); - } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. 
- // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } - } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); - } + if (toDerived_) { + // TODO: don't go componentwise if any defined assignment, maybe + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment(to, from, toAt, fromAt, + BlankPadCharacterAssignment(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +template +RT_API_ATTRS int DerivedAssignTicket::Begin( + WorkQueue &workQueue) { + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. + int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{this->derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{this->IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_(this->instance_.template ElementComponent( + this->subscripts_, procPtr.offset), + this->from_->template ElementComponent( + this->fromSubscripts_, procPtr.offset), + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} +template RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &); +template RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &); + +template +RT_API_ATTRS int DerivedAssignTicket::Continue( + WorkQueue &workQueue) { + // DerivedAssignTicket inherits from ElementComponentBase so that it + // loops over elements at the outer level and over components at the inner. 
+ // This yield less surprising behavior for codes that care due to the use + // of defined assignments when the ordering of their calls matters. + while (!this->IsComplete()) { + // Copy the data components (incl. the parent) first. + switch (this->component_->genre()) { + case typeInfo::Component::Genre::Data: + if (this->component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{this->componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{this->fromComponentDescriptor_.descriptor()}; + this->component_->CreatePointerDescriptor(toCompDesc, this->instance_, + workQueue.terminator(), this->subscripts_); + this->component_->CreatePointerDescriptor(fromCompDesc, *this->from_, + workQueue.terminator(), this->fromSubscripts_); + this->Advance(); + if (int status{workQueue.BeginAssign( + toCompDesc, fromCompDesc, flags_, memmoveFct_)}; + status != StatOk) { + return status; + } + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{ + this->component_->SizeInBytes(this->instance_)}; + if (isContiguousComponentwise_) { + std::size_t offset{this->component_->offset()}; + char *to{this->instance_.template OffsetElement(offset)}; + const char *from{ + this->from_->template OffsetElement(offset)}; + std::size_t toElementStride{this->instance_.ElementBytes()}; + std::size_t fromElementStride{ + this->from_->rank() == 0 ? 
0 : this->from_->ElementBytes()}; + for (std::size_t n{this->elements_}; n--; + to += toElementStride, from += fromElementStride) { + memmoveFct_(to, from, componentByteSize); + } + this->Componentwise::Advance(); + } else { + memmoveFct_( + this->instance_.template Element(this->subscripts_) + + this->component_->offset(), + this->from_->template Element(this->fromSubscripts_) + + this->component_->offset(), + componentByteSize); + this->Advance(); + } + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{ + this->component_->SizeInBytes(this->instance_)}; + if (isContiguousComponentwise_) { + std::size_t offset{this->component_->offset()}; + char *to{this->instance_.template OffsetElement(offset)}; + const char *from{ + this->from_->template OffsetElement(offset)}; + std::size_t toElementStride{this->instance_.ElementBytes()}; + std::size_t fromElementStride{ + this->from_->rank() == 0 ? 0 : this->from_->ElementBytes()}; + for (std::size_t n{this->elements_}; n--; + to += toElementStride, from += fromElementStride) { + memmoveFct_(to, from, componentByteSize); + } + this->Componentwise::Advance(); + } else { + memmoveFct_(this->instance_.template Element(this->subscripts_) + + this->component_->offset(), + this->from_->template Element(this->fromSubscripts_) + + this->component_->offset(), + componentByteSize); + this->Advance(); + } + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + this->instance_.template Element(this->subscripts_) + + this->component_->offset())}; + const auto *fromDesc{reinterpret_cast( + this->from_->template Element(this->fromSubscripts_) + + this->component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (this->phase_ == 0) { + this->phase_++; + if (const auto *componentDerived{this->component_->derivedType()}; + componentDerived && 
!componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } + } + toDesc->Deallocate(); + } + this->Advance(); + } else { + // Allocatable components of the LHS are unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + this->Advance(); + int nestedFlags{flags_}; + if (this->derived_.noFinalizationNeeded() && + this->derived_.noInitializationNeeded() && + this->derived_.noDestructionNeeded()) { + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + } else { + // Force LHS deallocation with DeallocateLHS flag. + nestedFlags |= DeallocateLHS; + } + if (int status{workQueue.BeginAssign( + *toDesc, *fromDesc, nestedFlags, memmoveFct_)}; + status != StatOk) { + return status; + } + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} +template RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &); +template RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &); RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +728,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -598,7 +744,6 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
if (var) { diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index 35037036f63e7..8ab737c701b01 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,193 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitialize(instance, derived)}; + return status == StatContinue ? 
workQueue.Run() : status; +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &comp{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + auto &pptr{*instance_.ElementComponent( + subscripts_, comp.offset)}; + pptr = comp.procInitialization; + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + SkipToNextComponent(); + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t 
bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitialize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitializeClone( + clone, original, derived, hasStat, errMsg)}; + return status == StatContinue ? workQueue.Run() : status; } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. 
- stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. - if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncObject), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum *addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + if (int status{workQueue.BeginInitialize(cloneDesc, *derived)}; + status != StatOk) { + return status; } } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum *addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType *derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_)}; + status != StatOk) { + return status; + } + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); + } + } + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + if (workQueue.BeginFinalize(descriptor, derived) == StatContinue) { + workQueue.Run(); } } - return stat; } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +235,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +272,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,87 +296,94 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with 
the same // descriptor so that the rank is preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum *addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType *compDynamicType{ + addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + if (int status{ + workQueue.BeginFinalize(compDesc, *compDynamicType)}; + status != StatOk) { + return status; } } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType *compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginFinalize(compDesc, *compType)}; + status != StatOk) { + return status; } } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == 
typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginFinalize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + const auto &parentType{*finalizableParentType_}; + finalizableParentType_ = nullptr; + // Don't return StatOk here if the nested FInalize is still running; + // it needs this->componentDescriptor_. + return workQueue.BeginFinalize(tmpDesc, parentType); } + return StatOk; } // The order of finalization follows Fortran 2018 7.5.6.2, with @@ -373,51 +392,71 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + if (workQueue.BeginDestroy(descriptor, derived, finalize) == StatContinue) { + workQueue.Run(); + } } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + if (int status{workQueue.BeginFinalize(instance_, derived_)}; + status != StatOk && status != StatContinue) { + return status; + } } + return StatContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. 
// Contrary to finalization, the order of deallocation does not matter. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *d, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || 
componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginDestroy( + compDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..364724b89ba0d 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,15 +7,44 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) -Fortran::common::optional DefinedFormattedIo(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &derived, +static RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( + IoStatementState &io, const Descriptor &descriptor, + const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special, const SubscriptValue subscripts[]) { Fortran::common::optional peek{ @@ -104,8 +133,8 @@ Fortran::common::optional DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -152,5 +181,619 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, return handler.GetIoStat() == IostatOk; } +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output. 
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + } + } + return StatOk; +} + +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType *type{ + addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } + } + } + if (const typeInfo::SpecialBinding *special{ + type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + if (DefinedUnformattedIo(io_, instance_, *type, *special)) { + anyIoTookPlace_ = true; + return StatOk; + } else { + return IostatEnd; + } + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? 
io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. + if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + if constexpr (DIR == Direction::Output) { + return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + return externalUnf + ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elements_ * elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + bool any{false}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + any = FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Complex: + switch (kind) 
{ + case 2: + any = FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + any = FormattedCharacterIO(io_, instance_); + break; + case 2: + any = FormattedCharacterIO(io_, instance_); + break; + case 4: + any = FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + any = FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + any = FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + any = FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding *binding{ + derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatContinue; + } + } + if (any) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ while (!IsComplete()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + Advance(); + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element(subscripts_)); + Advance(); + if (int status{workQueue.BeginDerivedIo( + io_, elementDesc, *derived_, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + if (workQueue.BeginDescriptorIo(io, descriptor, table, anyIoTookPlace) == + StatContinue) { + workQueue.Run(); + } + return anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. -// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) 
+#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/tools.cpp b/flang-rt/lib/runtime/tools.cpp index b08195cd31e05..24d05f369fcbe 100644 --- a/flang-rt/lib/runtime/tools.cpp +++ b/flang-rt/lib/runtime/tools.cpp @@ -205,7 +205,7 @@ RT_API_ATTRS void ShallowCopyInner(const Descriptor &to, const Descriptor &from, // Doing the recursion upwards instead of downwards puts the more common // 
cases earlier in the if-chain and has a tangible impact on performance.
 template struct ShallowCopyRankSpecialize {
-  static bool execute(const Descriptor &to, const Descriptor &from,
+  static RT_API_ATTRS bool execute(const Descriptor &to, const Descriptor &from,
       bool toIsContiguous, bool fromIsContiguous) {
     if (to.rank() == RANK && from.rank() == RANK) {
       ShallowCopyInner(to, from, toIsContiguous, fromIsContiguous);
@@ -217,7 +217,7 @@ template struct ShallowCopyRankSpecialize {
 };
 template struct ShallowCopyRankSpecialize {
-  static bool execute(const Descriptor &to, const Descriptor &from,
+  static RT_API_ATTRS bool execute(const Descriptor &to, const Descriptor &from,
       bool toIsContiguous, bool fromIsContiguous) {
     return false;
   }
diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp
index 82182696d70c6..451213202acef 100644
--- a/flang-rt/lib/runtime/type-info.cpp
+++ b/flang-rt/lib/runtime/type-info.cpp
@@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor,
     const SubscriptValue *subscripts) const {
   RUNTIME_CHECK(terminator, genre_ == Genre::Data);
   EstablishDescriptor(descriptor, container, terminator);
+  std::size_t offset{offset_};
   if (subscripts) {
-    descriptor.set_base_addr(container.Element(subscripts) + offset_);
-  } else {
-    descriptor.set_base_addr(container.OffsetElement() + offset_);
+    offset += container.SubscriptsToByteOffset(subscripts);
   }
+  descriptor.set_base_addr(container.OffsetElement() + offset);
   descriptor.raw().attribute = CFI_attribute_pointer;
 }
diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp
new file mode 100644
index 0000000000000..a508ecb637102
--- /dev/null
+++ b/flang-rt/lib/runtime/work-queue.cpp
@@ -0,0 +1,161 @@
+//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "flang-rt/runtime/work-queue.h"
+#include "flang-rt/runtime/environment.h"
+#include "flang-rt/runtime/memory.h"
+#include "flang-rt/runtime/type-info.h"
+#include "flang/Common/visit.h"
+
+namespace Fortran::runtime {
+
+#if !defined(RT_DEVICE_COMPILATION)
+// FLANG_RT_DEBUG code is disabled when false.
+static constexpr bool enableDebugOutput{false};
+#endif
+
+RT_OFFLOAD_API_GROUP_BEGIN
+
+RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived)
+    : derived_{derived}, components_{derived_.component().Elements()} {
+  GetComponent();
+}
+
+RT_API_ATTRS void Componentwise::GetComponent() {
+  if (IsComplete()) {
+    component_ = nullptr;
+  } else {
+    const Descriptor &componentDesc{derived_.component()};
+    component_ = componentDesc.ZeroBasedIndexedElement(
+        componentAt_);
+  }
+}
+
+RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) {
+  if (!begun) {
+    begun = true;
+    return common::visit(
+        [&workQueue](
+            auto &specificTicket) { return specificTicket.Begin(workQueue); },
+        u);
+  } else {
+    return common::visit(
+        [&workQueue](auto &specificTicket) {
+          return specificTicket.Continue(workQueue);
+        },
+        u);
+  }
+}
+
+RT_API_ATTRS WorkQueue::~WorkQueue() {
+  if (last_) {
+    if ((last_->next = firstFree_)) {
+      last_->next->previous = last_;
+    }
+    firstFree_ = first_;
+    first_ = last_ = nullptr;
+  }
+  while (firstFree_) {
+    TicketList *next{firstFree_->next};
+    if (!firstFree_->isStatic) {
+      FreeMemory(firstFree_);
+    }
+    firstFree_ = next;
+  }
+}
+
+RT_API_ATTRS Ticket &WorkQueue::StartTicket() {
+  if (!firstFree_) {
+    void *p{AllocateMemoryOrCrash(terminator_, sizeof(TicketList))};
+    firstFree_ = new (p) TicketList;
+    firstFree_->isStatic = false;
+  }
+  TicketList *newTicket{firstFree_};
+  if ((firstFree_ = newTicket->next)) {
+    firstFree_->previous = nullptr;
+  }
+  TicketList *after{insertAfter_ ? insertAfter_->next : nullptr};
+  if ((newTicket->previous = insertAfter_ ? insertAfter_ : last_)) {
+    newTicket->previous->next = newTicket;
+  } else {
+    first_ = newTicket;
+  }
+  if ((newTicket->next = after)) {
+    after->previous = newTicket;
+  } else {
+    last_ = newTicket;
+  }
+  newTicket->ticket.begun = false;
+#if !defined(RT_DEVICE_COMPILATION)
+  if (enableDebugOutput &&
+      (executionEnvironment.internalDebugging &
+          ExecutionEnvironment::WorkQueue)) {
+    std::fprintf(stderr, "WQ: new ticket\n");
+  }
+#endif
+  return newTicket->ticket;
+}
+
+RT_API_ATTRS int WorkQueue::Run() {
+  while (last_) {
+    TicketList *at{last_};
+    insertAfter_ = last_;
+#if !defined(RT_DEVICE_COMPILATION)
+    if (enableDebugOutput &&
+        (executionEnvironment.internalDebugging &
+            ExecutionEnvironment::WorkQueue)) {
+      std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(),
+          at->ticket.begun ? "Continue" : "Begin");
+    }
+#endif
+    int stat{at->ticket.Continue(*this)};
+#if !defined(RT_DEVICE_COMPILATION)
+    if (enableDebugOutput &&
+        (executionEnvironment.internalDebugging &
+            ExecutionEnvironment::WorkQueue)) {
+      std::fprintf(stderr, "WQ: ... stat %d\n", stat);
+    }
+#endif
+    insertAfter_ = nullptr;
+    if (stat == StatOk) {
+      if (at->previous) {
+        at->previous->next = at->next;
+      } else {
+        first_ = at->next;
+      }
+      if (at->next) {
+        at->next->previous = at->previous;
+      } else {
+        last_ = at->previous;
+      }
+      if ((at->next = firstFree_)) {
+        at->next->previous = at;
+      }
+      at->previous = nullptr;
+      firstFree_ = at;
+    } else if (stat != StatContinue) {
+      Stop();
+      return stat;
+    }
+  }
+  return StatOk;
+}
+
+RT_API_ATTRS void WorkQueue::Stop() {
+  if (last_) {
+    if ((last_->next = firstFree_)) {
+      last_->next->previous = last_;
+    }
+    firstFree_ = first_;
+    first_ = last_ = nullptr;
+  }
+}
+
+RT_OFFLOAD_API_GROUP_END
+
+} // namespace Fortran::runtime
diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp
index 3833e48be3dd6..6c148b1de6f82 100644
--- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp
+++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp
@@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) {
   io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__);
   for (int j{1}; j <= 3; ++j) {
     ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc))
-        << "OutputDescriptor() for InquireIoLength";
+        << "OutputDescriptor() for InquireIoLength " << j;
   }
   ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength";
   ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk)
diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md
index 51969de5ac7fe..377d01dbec69a 100644
--- a/flang/docs/Extensions.md
+++ b/flang/docs/Extensions.md
@@ -850,6 +850,16 @@ print *, [(j,j=1,10)]
   warning since such values may have become defined by the time
   the nested expression's value is required.
+* Intrinsic assignment of arrays is defined elementally, and intrinsic
+  assignment of derived type components is defined componentwise.
+  However, when intrinsic assignment takes place for an array of derived
+  type, the order of the loop nesting is not defined.
+  Some compilers will loop over the elements, assigning all of the components
+  of each element before proceeding to the next element.
+  This compiler loops over all of the components, and assigns all of
+  the elements for each component before proceeding to the next component.
+  A program using defined assignment might be able to detect the difference.
+
 ## De Facto Standard Features
 * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the
diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h
index bc80997a1bec2..eb1f63184a177 100644
--- a/flang/include/flang/Runtime/assign.h
+++ b/flang/include/flang/Runtime/assign.h
@@ -38,7 +38,7 @@ enum AssignFlags {
   ComponentCanBeDefinedAssignment = 1 << 3,
   ExplicitLengthCharacterLHS = 1 << 4,
   PolymorphicLHS = 1 << 5,
-  DeallocateLHS = 1 << 6
+  DeallocateLHS = 1 << 6,
 };
 #ifdef RT_DEVICE_COMPILATION

From flang-commits at lists.llvm.org Thu May 29 11:50:23 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 29 May 2025 11:50:23 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6838ac6f.170a0220.1b8e03.b5df@mx.google.com>
================
@@ -2656,527 +2665,1857 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) {
   }
 }
-inline void OmpStructureChecker::ErrIfAllocatableVariable(
-    const parser::Variable &var) {
-  // Err out if the given symbol has
-  // ALLOCATABLE attribute
-  if (const auto *e{GetExpr(context_, var)})
-    for (const Symbol &symbol : evaluate::CollectSymbols(*e))
-      if (IsAllocatable(symbol)) {
-        const auto &designator =
-            std::get>(var.u);
-        const auto *dataRef =
-            std::get_if(&designator.value().u);
-        const parser::Name *name =
-            dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. +static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } +// There 
is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. +struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; } } + return SourcedActionStmt{}; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); } - return false; + return SourcedActionStmt{}; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; - } - } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); - } +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; } - ErrIfAllocatableVariable(var); + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; + } else { + return std::nullopt; + } + }, + x->u); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const auto *v1 = GetExpr(context_, stmt1Var); - const auto *e1 = GetExpr(context_, stmt1Expr); - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - const auto *v2 = GetExpr(context_, stmt2Var); - const auto *e2 = GetExpr(context_, stmt2Expr); - - if (e1 && v1 && e2 && v2) { - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(v2, e2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } - if (!(*e1 == *v2)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(v1, e1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - if (!(*v1 == *e2)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); - } - } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } + return std::nullopt; } -} -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. 
+ return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; } } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; } } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); - } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); + return std::nullopt; } -} -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { 
- const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, - }, - x.u); + return std::nullopt; } -void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { - dirContext_.pop_back(); +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; + } } -// Clauses -// Mainly categorized as -// 1. Checks on 'OmpClauseList' from 'parse-tree.h'. -// 2. Checks on clauses which fall under 'struct OmpClause' from parse-tree.h. -// 3. Checks on clauses which are not in 'struct OmpClause' from parse-tree.h. 
+template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} -void OmpStructureChecker::Leave(const parser::OmpClauseList &) { - // 2.7.1 Loop Construct Restriction - if (llvm::omp::allDoSet.test(GetContext().directive)) { - if (auto *clause{FindClause(llvm::omp::Clause::OMPC_schedule)}) { - // only one schedule clause is allowed - const auto &schedClause{std::get(clause->u)}; - auto &modifiers{OmpGetModifiers(schedClause.v)}; - auto *ordering{ - OmpGetUniqueModifier(modifiers)}; - if (ordering && - ordering->v == parser::OmpOrderingModifier::Value::Nonmonotonic) { - if (FindClause(llvm::omp::Clause::OMPC_ordered)) { - context_.Say(clause->source, - "The NONMONOTONIC modifier cannot be specified " - "if an ORDERED clause is specified"_err_en_US); - } + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } + return result; + } - if (auto *clause{FindClause(llvm::omp::Clause::OMPC_ordered)}) { - // only one ordered clause is allowed - const auto &orderedClause{ - std::get(clause->u)}; + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. 
+ return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. + return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } - if (orderedClause.v) { - CheckNotAllowedIfClause( - llvm::omp::Clause::OMPC_ordered, {llvm::omp::Clause::OMPC_linear}); + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } - if (auto *clause2{FindClause(llvm::omp::Clause::OMPC_collapse)}) { - const auto &collapseClause{ - std::get(clause2->u)}; - // ordered and collapse both have parameters - if (const auto orderedValue{GetIntValue(orderedClause.v)}) { - if (const auto collapseValue{GetIntValue(collapseClause.v)}) { - if (*orderedValue > 0 && *orderedValue < *collapseValue) { - context_.Say(clause->source, - "The parameter of the ORDERED clause must be " - "greater than or equal to " - "the parameter of the COLLAPSE clause"_err_en_US); - } - } - } - } + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. 
+ if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); } + } + } - // TODO: ordered region binding check (requires nesting implementation) +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const 
evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; } - } // doSet + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } - // 2.8.1 Simd Construct Restriction - if (llvm::omp::allSimdSet.test(GetContext().directive)) { - if (auto *clause{FindClause(llvm::omp::Clause::OMPC_simdlen)}) { - if (auto *clause2{FindClause(llvm::omp::Clause::OMPC_safelen)}) { - const auto &simdlenClause{ - std::get(clause->u)}; - const auto &safelenClause{ - std::get(clause2->u)}; - // simdlen and safelen both have parameters - if (const auto simdlenValue{GetIntValue(simdlenClause.v)}) { - if (const auto safelenValue{GetIntValue(safelenClause.v)}) { - if (*safelenValue > 0 && *simdlenValue > *safelenValue) { - context_.Say(clause->source, - "The parameter of the SIMDLEN clause must be less than or " - "equal to the parameter of the SAFELEN clause"_err_en_US); - } - } - } + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. 
only + // collect top-level designators). + auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (MoveAppend(v, std::move(results)), ...); + return v; + } +}; + +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result asSomeExpr(const T &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::Constant &x) const { + return asSomeExpr(x); + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. 
+ if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); } + } else { + return asSomeExpr(x.derived()); } + } - // 2.11.5 Simd construct restriction (OpenMP 5.1) - if (auto *sl_clause{FindClause(llvm::omp::Clause::OMPC_safelen)}) { - if (auto *o_clause{FindClause(llvm::omp::Clause::OMPC_order)}) { - const auto &orderClause{ - std::get(o_clause->u)}; - if (std::get(orderClause.v.t) == - parser::OmpOrderClause::Ordering::Concurrent) { - context_.Say(sl_clause->source, - "The `SAFELEN` clause cannot appear in the `SIMD` directive " - "with `ORDER(CONCURRENT)` clause"_err_en_US); - } + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; } - } // SIMD + } - // Semantic checks related to presence of multiple list items within the same - // clause - CheckMultListItems(); + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. 
+ static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; - if (GetContext().directive == llvm::omp::Directive::OMPD_task) { - if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { - unsigned version{context_.langOptions().OpenMPVersion}; - if (version == 50 || version == 51) { + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; +}; +} // namespace atomic + +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); +} + +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} + +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} + +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return GetTopLevelOperation(cond).first == atomic::Operator::Associated; 
+} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in any the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; + } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, + const std::optional &maybeAssign = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetAssignment(operation.assign, maybeAssign); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; + // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedAssignment assign; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return 
an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var (with optional converts) + // or + // ... = x capture-var = atomic-var (with optional converts) + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + using ReturnTy = std::pair; + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return IsSameOrConvertOf(c.rhs, u.lhs); + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; ---------------- kparzysz wrote: I added an explanation for the use of determinant instead. This technique is used twice, and I think the explanation makes it clear. Let me know what you think. https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Thu May 29 11:58:11 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 29 May 2025 11:58:11 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6838ae43.170a0220.2befdd.b7ce@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852 >From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 07:58:29 -0500 Subject: [PATCH 01/22] [flang][OpenMP] Mark atomic clauses as unique The current implementation of the ATOMIC construct handles these clauses individually, and this change does not have an observable effect. At the same time these clauses are unique as per the OpenMP spec, and this patch reflects that in the OMP.td file. 
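For readers following the review comment earlier in this message about the determinant in `CheckUpdateCapture`, here is a minimal standalone sketch of the idea. It relies only on the arithmetic quoted above (`det = cbu1*cbc2 - cbu2*cbc1`). The names `Pairing` and `ClassifyPair` are hypothetical and do not appear in the patch, and what the real code does after computing `det` (diagnostics, tie-breaking) is not reproduced here.

```cpp
// Illustrative sketch only: Pairing and ClassifyPair are made-up names.
// cbuN means "statement N could be the update"; cbcN means "statement N
// could be the capture", as in the quoted patch.
enum class Pairing { FirstIsUpdate, FirstIsCapture, Undecided };

constexpr Pairing ClassifyPair(bool cbu1, bool cbu2, bool cbc1, bool cbc2) {
  //     |cbu1 cbu2|
  // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1
  int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)};
  if (det > 0) {
    // Only the pairing (stmt1 = update, stmt2 = capture) is feasible.
    return Pairing::FirstIsUpdate;
  }
  if (det < 0) {
    // Only the pairing (stmt1 = capture, stmt2 = update) is feasible.
    return Pairing::FirstIsCapture;
  }
  // det == 0: either both pairings are feasible or neither is, so the
  // determinant alone cannot decide and further checks are needed.
  return Pairing::Undecided;
}
```

With `x = x + 1; v = x` the flags would be `cbu1 = true, cbc2 = true` and the other two false, so the first statement is classified as the update. For `a = b; b = a` both `cbu` flags are false and the result is undecided. The comment in the patch says that ambiguous case is resolved by treating the first statement as the capture.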
--- llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index eff6d57995d2b..cdfd3e3223fa8 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> { ]; } def OMP_Atomic : Directive<"atomic"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; let allowedOnceClauses = [ VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, + VersionedClause, ]; let association = AS_Block; let category = CA_Executable; @@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let category = CA_Executable; } def OMP_Critical : Directive<"critical"> { - let allowedClauses = [ + let allowedOnceClauses = [ VersionedClause, ]; let association = AS_Block; >From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 10:08:58 -0500 Subject: [PATCH 02/22] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs The OpenMP implementation of the ATOMIC construct will change in the near future to accommodate OpenMP 6.0. This patch separates the shared implementations to avoid interfering with OpenACC. 
--- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t 
hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. - auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite, - loc); + genAtomicWrite(converter, atomicWrite, loc); }, [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) { - Fortran::lower::genOmpAccAtomicUpdate< - Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate, - loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const Fortran::parser::AccAtomicCapture &atomicCapture) { - Fortran::lower::genOmpAccAtomicCapture< - Fortran::parser::AccAtomicCapture, void>(converter, - atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, }, atomicConstruct.u); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 312557d5da07e..fdd85e94829f3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +//===----------------------------------------------------------------------===// +// Code generation for atomic operations +//===----------------------------------------------------------------------===// + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. 
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/22] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
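As a rough illustration of the commit message above (a sketch, not code from this patch): a clause whose argument may be absent can be modeled as an optional variant, and the lowering step then has to handle the empty case explicitly. All names here (`UpdateClause`, `LoweredUpdate`, `lowerUpdate`, `DepType`, `TaskDepType`, `DependenceKind`) are hypothetical stand-ins chosen for the example. Only the overall shape is meant to mirror the `make(...)` change for `parser::OmpClause::Update` in this series, where a missing argument lowers to `std::nullopt`.

```cpp
#include <optional>
#include <variant>

// Hypothetical stand-ins for the two parser-level argument kinds that
// UPDATE accepts on DEPOBJ. These are NOT the real flang classes.
enum class DependenceKind { In, Out, Inout };
struct DepType { DependenceKind kind; };
struct TaskDepType { DependenceKind kind; };

// Parser-level clause: on ATOMIC the argument is absent, on DEPOBJ it
// holds one of the two alternatives.
struct UpdateClause {
  std::optional<std::variant<DepType, TaskDepType>> arg;
};

// Lowered form: the dependence type is optional as well.
struct LoweredUpdate {
  std::optional<DependenceKind> dependenceType;
};

// Convert the parser-level clause to the lowered form. The argument is
// visited only when it is present; otherwise the result carries nullopt.
inline LoweredUpdate lowerUpdate(const UpdateClause &clause) {
  if (clause.arg) {
    return std::visit(
        [](const auto &s) { return LoweredUpdate{s.kind}; }, *clause.arg);
  }
  return LoweredUpdate{std::nullopt};
}
```

With this shape, a bare UPDATE (as on ATOMIC) yields an empty `dependenceType`, while UPDATE with an argument (as on DEPOBJ) yields the corresponding kind. Callers that require an argument can check for the empty case themselves.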
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/22] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be a parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause, have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows the precedent of having a `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed. 
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 | 
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. 
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expecting two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture. 
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expecting single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point. 
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a contiguous subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ...)" is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+ auto &dirSpec{std::get(x.t)}; + auto &clauseList{std::get>(dirSpec.t)}; if (clauseList) { - atomicDirectiveDefaultOrderFound_ = true; - switch (*defaultMemOrder) { - case common::OmpMemoryOrderType::Acq_Rel: - clauseList->emplace_back(common::visit( - common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause { - return parser::OmpClause::Acquire{}; - }, - [](parser::OmpAtomicCapture &) -> parser::OmpClause { - return parser::OmpClause::AcqRel{}; - }, - [](auto &) -> parser::OmpClause { - // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite} - return parser::OmpClause::Release{}; - }}, - x.u)); - break; - case common::OmpMemoryOrderType::Relaxed: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::Relaxed{}}); - break; - case common::OmpMemoryOrderType::Seq_Cst: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::SeqCst{}}); - break; - default: - // FIXME: Don't process other values at the moment since their validity - // depends on the OpenMP version (which is unavailable here). - break; + if (findMemOrderClause(*clauseList)) { + return false; } + } else { + clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){}); + } + + // Add a memory order clause to the atomic directive. + atomicDirectiveDefaultOrderFound_ = true; + switch (*defaultMemOrder) { + case common::OmpMemoryOrderType::Acq_Rel: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}}); + break; + case common::OmpMemoryOrderType::Relaxed: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}}); + break; + case common::OmpMemoryOrderType::Seq_Cst: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}}); + break; + default: + // FIXME: Don't process other values at the moment since their validity + // depends on the OpenMP version (which is unavailable here). 
+ break; } return false; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 index b82bd13622764..6f58e0939a787 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 index 88ec6fe910b9e..6729be6e5cf8b 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b8422c8f3b769 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture() !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr !CHECK: omp.atomic.capture { !CHECK: omp.atomic.update 
%[[VAL_A_BOX_ADDR]] : !fir.ptr { !CHECK: ^bb0(%[[ARG:.*]]: i32): !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32 !CHECK: omp.yield(%[[TEMP]] : i32) !CHECK: } -!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 +!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 !CHECK: } !CHECK: return !CHECK: } diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90 index 13392ad76471f..6eded49b0b15d 100644 --- a/flang/test/Lower/OpenMP/atomic-write.f90 +++ b/flang/test/Lower/OpenMP/atomic-write.f90 @@ -44,9 +44,9 @@ end program OmpAtomicWrite !CHECK-LABEL: func.func @_QPatomic_write_pointer() { !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"} !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) -!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32 !CHECK: %[[C2:.*]] = arith.constant 2 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90 deleted file mode 100644 index 5cd02698ff482..0000000000000 --- a/flang/test/Parser/OpenMP/atomic-compare.f90 +++ /dev/null @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s -! OpenMP version for documentation purposes only - it isn't used until Sema. -! This is testing for Parser errors that bail out before Sema. 
-program main - implicit none - integer :: i, j = 10 - logical :: r - - !CHECK: error: expected OpenMP construct - !$omp atomic compare write - r = i .eq. j + 1 - - !CHECK: error: expected end of line - !$omp atomic compare num_threads(4) - r = i .eq. j -end program main diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90 index 54492bf6a22a6..11e23e062bce7 100644 --- a/flang/test/Semantics/OpenMP/atomic-compare.f90 +++ b/flang/test/Semantics/OpenMP/atomic-compare.f90 @@ -44,46 +44,37 @@ !$omp end atomic ! Check for error conditions: - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic compare seq_cst seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst compare seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire compare if (b .eq. 
c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic compare acquire acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire compare acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic compare relaxed relaxed if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed compare relaxed if (b .eq. c) b = a - !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one FAIL clause can appear on the ATOMIC directive !$omp atomic fail(release) compare fail(release) if (c .eq. 
a) a = b !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index c13a11a8dd5dc..deb67e7614659 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -16,20 +16,21 @@ program sample !$omp atomic read hint(2) y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(3) y = y + 10 !$omp atomic update hint(5) y = x + y - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement y = x x = y !$omp end atomic - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic update hint(x) y = y * 1 @@ -46,7 +47,7 @@ program sample !$omp atomic hint(omp_lock_hint_speculative) x = y + x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read y = x @@ -69,36 +70,36 @@ program sample !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative) x = y + x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = y * 9 - !ERROR: Hint clause must have non-negative 
constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp atomic hint(1.0) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp atomic hint(z + omp_sync_hint_nonspeculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(k + omp_sync_hint_speculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write x = 10 * y !$omp atomic write hint(a) - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and y+x access the same storage x = y + x !$omp atomic hint(abs(-1)) write diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 new file mode 100644 index 0000000000000..2619d235380f8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -0,0 +1,89 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC READ. Expect no diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v = x + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v => x +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! 
Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic read + v = p(i) +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic read + !ERROR: Atomic variable x should be a scalar + v = x + + !$omp atomic read + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + v = y(2:4) +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic read + !ERROR: Within atomic operation x and x access the same storage + x = x + + ! Accessing same array, but not the same storage. Expect no diagnostics. + !$omp atomic read + y(1) = y(2) +end + +subroutine f05 + integer :: x, v + + !$omp atomic read + !ERROR: Atomic expression x+1_4 should be a variable + v = x + 1 +end + +subroutine f06 + character :: x, v + + !$omp atomic read + !ERROR: Atomic variable x cannot have CHARACTER type + v = x +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic read + !ERROR: Atomic variable x cannot be ALLOCATABLE + v = x +end + diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 new file mode 100644 index 0000000000000..64e31a7974b55 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -0,0 +1,77 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !$omp atomic update capture + x = v + x = x + 1 + y = x + !$omp end atomic +end + +subroutine f01 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments + !$omp atomic update capture + x = v + block + x = x + 1 + y = x + end block + !$omp end atomic +end + +subroutine f02 + integer :: x, y + + ! The update and capture statements can be inside of a single BLOCK. + ! The end-directive is then optional. Expect no diagnostics. 
+ !$omp atomic update capture + block + x = x + 1 + y = x + end block +end + +subroutine f03 + integer :: x + + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !$omp atomic update capture + x = x + 1 + x = x + 2 + !$omp end atomic +end + +subroutine f04 + integer :: x, v + + !$omp atomic update capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + v = x + x = v + !$omp end atomic +end + +subroutine f05 + integer :: x, v, z + + !$omp atomic update capture + v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + !$omp end atomic +end + +subroutine f06 + integer :: x, v, z + + !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + v = x + !$omp end atomic +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 new file mode 100644 index 0000000000000..0aee02d429dc0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -0,0 +1,83 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y + + ! The x is a direct argument of the + operator. Expect no diagnostics. + !$omp atomic update + x = x + (y - 1) +end + +subroutine f01 + integer :: x + + ! x + 0 is unusual, but legal. Expect no diagnostics. + !$omp atomic update + x = x + 0 +end + +subroutine f02 + integer :: x + + ! This is formally not allowed by the syntax restrictions of the spec, + ! but it's equivalent to either x+0 or x*1, both of which are legal. + ! Allow this case. Expect no diagnostics. 
+ !$omp atomic update + x = x +end + +subroutine f03 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator + x = (x + y) + 1 +end + +subroutine f04 + integer :: x + real :: y + + !$omp atomic update + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation + x = floor(x + y) +end + +subroutine f05 + integer :: x + real :: y + + !$omp atomic update + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + x = int(x + y) +end + +subroutine f06 + integer :: x, y + interface + function f(i, j) + integer :: f, i, j + end + end interface + + !$omp atomic update + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation + x = f(x, y) +end + +subroutine f07 + real :: x + integer :: y + + !$omp atomic update + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation + x = x ** y +end + +subroutine f08 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should appear as an argument in the update operation + x = y +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 index 21a9b87d26345..3084376b4275d 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 @@ -22,10 +22,10 @@ program sample x = x / y !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. y !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. 
y end program diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 new file mode 100644 index 0000000000000..979568ebf7140 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -0,0 +1,81 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics. + !$omp atomic write + x = v + 1 + + !$omp atomic write + x = v + 3 + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic write + x = 2 * v + 3 + + !$omp atomic write + x => v +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic write + p(i) = v +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic write + !ERROR: Atomic variable x should be a scalar + x = v + + !$omp atomic write + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + y(2:4) = v +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic write + !ERROR: Within atomic operation x and x+1_4 access the same storage + x = x + 1 + + ! Accessing same array, but not the same storage. Expect no diagnostics. 
+ !$omp atomic write + y(1) = y(2) +end + +subroutine f06 + character :: x, v + + !$omp atomic write + !ERROR: Atomic variable x cannot have CHARACTER type + x = v +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic write + !ERROR: Atomic variable x cannot be ALLOCATABLE + x = v +end + diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0e100871ea9b4..0dc8251ae356e 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! Check OpenMP 2.13.6 atomic Construct @@ -11,9 +11,13 @@ a = b !$omp end atomic + !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED) a = b + !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write a = b @@ -22,39 +26,32 @@ a = a + 1 !$omp end atomic + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic hint(1) acq_rel capture b = a a = a + 1 !$omp end atomic - !ERROR: expected end of line + !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct !$omp atomic read write + !ERROR: Atomic expression a+1._4 should be a variable a = a + 1 !$omp atomic a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 
'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic num_threads(4) a = a + 1 - !ERROR: expected end of line + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic capture num_threads(4) a = a + 1 + !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic relaxed a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' - !$omp atomic num_threads write - a = a + 1 - !$omp end parallel end diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90 index 173effe86b69c..f700c381cadd0 100644 --- a/flang/test/Semantics/OpenMP/atomic01.f90 +++ b/flang/test/Semantics/OpenMP/atomic01.f90 @@ -14,322 +14,277 @@ ! At most one memory-order-clause may appear on the construct. 
!READ - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic read seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst read seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic read acquire acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire read acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the 
READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic read relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed read relaxed i = j !UPDATE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic update seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst update seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The 
atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic update release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release update release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic update relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic 
relaxed update relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !CAPTURE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic capture seq_cst seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst capture seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic capture release release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release capture release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not 
allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic capture relaxed relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed capture relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel acq_rel capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic capture acq_rel acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel capture acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire capture i = j j = k !$omp 
end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic capture acquire acquire i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire capture acquire i = j j = k !$omp end atomic !WRITE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic write seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst write seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic write release release i = j - !ERROR: More than one memory order clause not allowed on 
OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release write release i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic write relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed write relaxed i = j !No atomic-clause - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as 
an argument in the update operation i = j ! 2.17.7.3 ! At most one hint clause may appear on the construct. - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At 
most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint 
(omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative) i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j j = k @@ -337,34 +292,26 @@ ! 2.17.7.4 ! If atomic-clause is read then memory-order-clause must not be acq_rel or release. - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic acq_rel read i = j - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read acq_rel i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic release read i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read release i = j ! 2.17.7.5 ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire. 
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acq_rel write i = j - !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acq_rel i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acquire write i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acquire i = j @@ -372,33 +319,27 @@ ! 2.17.7.6 ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire. - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acq_rel update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acquire update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive !$omp atomic acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed on the 
ATOMIC directive !$omp atomic acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j end program diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90 index c66085d00f157..45e41f2552965 100644 --- a/flang/test/Semantics/OpenMP/atomic02.f90 +++ b/flang/test/Semantics/OpenMP/atomic02.f90 @@ -28,36 +28,29 @@ program OmpAtomic !$omp atomic a = a/(b + 1) !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: The atomic variable c should appear as an argument in the update operation c = d !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. 
b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The /= operator is not a valid ATOMIC UPDATE operation l = a .NE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic m = m .AND. n @@ -76,32 +69,26 @@ program OmpAtomic !$omp atomic update a = a/(b + 1) !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: This is not a valid ATOMIC UPDATE operation c = c//d !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. 
b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic update m = m .AND. n diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index 76367495b9861..f5c189fd05318 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -25,28 +25,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur 
exactly once among the arguments of the top-level MAX operator z = MAX(y, 7, b, c) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8, a, d) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -60,26 +58,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX 
operator z = MAX(y, 7) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = MOD(y, 9) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation x = ABS(x) end program OmpAtomic @@ -92,7 +90,7 @@ subroutine conflicting_types() type(simple) ::s z = 1 !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(s%z, 4) end subroutine @@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) :: s !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator a = min(a, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a) !$omp atomic - !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)` a = min(b, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a, b) 
!$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator y = min(z, x) !$omp atomic z = max(z, y) !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k' + !ERROR: Atomic variable k should be a scalar + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation k = max(x, y) - + !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement x = min(x, k) !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - z =z + s%m + z = z + s%m end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index a9644ad95aa30..5c91ab5dc37e4 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -20,12 +20,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x 
= 1 + y !$omp atomic @@ -33,12 +31,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic @@ -46,12 +42,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic @@ -59,12 +53,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or 
explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic @@ -72,8 +64,7 @@ program OmpAtomic !$omp atomic m = n .AND. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic @@ -81,8 +72,7 @@ program OmpAtomic !$omp atomic m = n .OR. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic @@ -90,8 +80,7 @@ program OmpAtomic !$omp atomic m = n .EQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic @@ -99,8 +88,7 @@ program OmpAtomic !$omp atomic m = n .NEQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. 
l !$omp atomic update @@ -108,12 +96,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 + y !$omp atomic update @@ -121,12 +107,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic update @@ -134,12 +118,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' 
expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic update @@ -147,12 +129,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic update @@ -160,8 +140,7 @@ program OmpAtomic !$omp atomic update m = n .AND. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic update @@ -169,8 +148,7 @@ program OmpAtomic !$omp atomic update m = n .OR. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic update @@ -178,8 +156,7 @@ program OmpAtomic !$omp atomic update m = n .EQV. 
m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic update @@ -187,8 +164,7 @@ program OmpAtomic !$omp atomic update m = n .NEQV. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. l end program OmpAtomic @@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) p !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement x = x !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 1 !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and a*b access the same storage a = a * b + a !$omp atomic - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator a = b * (a + 9) !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (a+b) access the same storage a = a * (a + b) !$omp atomic - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (b+a) access the same storage a = (b + a) * a !$omp atomic - !ERROR: Atomic update statement should 
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(7) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(x) y = 2 @@ -54,7 +54,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint) y = 2 @@ -84,35 +84,35 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended) y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp critical (name) hint(1.0) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp critical (name) hint(z + omp_sync_hint_nonspeculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(k + omp_sync_hint_speculative) y = 2 !$omp end critical 
(name)
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
  !ERROR: Must be a constant value
  !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended)
  y = 2
diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
index 505cbc48fef90..677b933932b44 100644
--- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
+++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
@@ -20,70 +20,64 @@ program sample
  !$omp atomic read
  !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4)
- !ERROR: Expected scalar expression on the RHS of atomic assignment statement
+ !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar
  v = y(1:3)
  !$omp atomic read
- !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement
+ !ERROR: Atomic expression x*(10_4+x) should be a variable
  v = x * (10 + x)
  !$omp atomic read
- !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement
+ !ERROR: Atomic expression 4_4 should be a variable
  v = 4
  !$omp atomic read
- !ERROR: k must not have ALLOCATABLE attribute
+ !ERROR: Atomic variable k cannot be ALLOCATABLE
  v = k
  !$omp atomic write
- !ERROR: k must not have ALLOCATABLE attribute
+ !ERROR: Atomic variable k cannot be ALLOCATABLE
  k = x
  !$omp atomic update
- !ERROR: k must not have ALLOCATABLE attribute
+ !ERROR: Atomic variable k cannot be ALLOCATABLE
  k = k + x * (v * x)
  !$omp atomic
- !ERROR: k must not have ALLOCATABLE attribute
+ !ERROR: Atomic variable k cannot be ALLOCATABLE
  k = v * k
  !$omp atomic write
- !ERROR: RHS expression on atomic assignment statement cannot access 'z%y'
+ !ERROR: Within atomic operation z%y and x+z%y access the same storage
  z%y = x + z%y
  !$omp atomic write
- !ERROR: RHS expression on atomic assignment statement cannot access 'x'
+
!ERROR: Within atomic operation x and x access the same storage
  x = x
  !$omp atomic write
- !ERROR: RHS expression on atomic assignment statement cannot access 'm'
+ !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage
  m = min(m, x, z%m) + k
  !$omp atomic read
- !ERROR: RHS expression on atomic assignment statement cannot access 'x'
+ !ERROR: Within atomic operation x and x access the same storage
  x = x
  !$omp atomic read
- !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement
- !ERROR: RHS expression on atomic assignment statement cannot access 'm'
+ !ERROR: Atomic expression min(m,x,z%m)+k should be a variable
  m = min(m, x, z%m) + k
  !$omp atomic read
  !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4)
- !ERROR: Expected scalar expression on the RHS of atomic assignment statement
+ !ERROR: Atomic variable a should be a scalar
  x = a
- !$omp atomic read
- !ERROR: Expected scalar variable on the LHS of atomic assignment statement
- a = x
-
  !$omp atomic write
  !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4)
- !ERROR: Expected scalar expression on the RHS of atomic assignment statement
  x = a
  !$omp atomic write
- !ERROR: Expected scalar variable on the LHS of atomic assignment statement
+ !ERROR: Atomic variable a should be a scalar
  a = x
  !$omp atomic capture
@@ -93,7 +87,7 @@ program sample
  !$omp atomic release capture
  v = x
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = b + (x*1)
  !$omp end atomic
@@ -103,60 +97,58 @@ program sample
  !$omp end atomic
  !$omp atomic capture
- !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct
  v = x
+ !ERROR: In ATOMIC UPDATE operation with
CAPTURE the update statement should assign to x
  b = b + 1
  !$omp end atomic
  !$omp atomic capture
- !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct
  v = x
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
  b = 10
  !$omp end atomic
+ !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
  !$omp atomic capture
- !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct
  x = x + 10
  v = b
  !$omp end atomic
+ !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
  !$omp atomic capture
- !ERROR: Invalid ATOMIC CAPTURE construct statements.
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]
  v = 1
  x = 4
  !$omp end atomic
  !$omp atomic capture
- !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct
  x = z%y
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y
  z%m = z%m + 1.0
  !$omp end atomic
  !$omp atomic capture
- !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y
  z%m = z%m + 1.0
  x = z%y
  !$omp end atomic
  !$omp atomic capture
- !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct
  x = y(2)
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8)
  y(1) = y(1) + 1
  !$omp end atomic
  !$omp atomic capture
- !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8)
  y(1) = y(1) + 1
  x = y(2)
  !$omp end atomic
  !$omp atomic read
- !ERROR: Expected scalar variable on the LHS of atomic assignment statement
- !ERROR: Expected scalar expression on the RHS of atomic assignment statement
+ !ERROR: Atomic variable r cannot have CHARACTER type
  l = r
  !$omp atomic write
- !ERROR: Expected scalar variable on the LHS of atomic assignment statement
- !ERROR: Expected scalar expression on the RHS of atomic assignment statement
+ !ERROR: Atomic variable l cannot have CHARACTER type
  l = r
  end program
diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90
index ae9fd086015dd..e8817c3f5ef61 100644
---
a/flang/test/Semantics/OpenMP/requires-atomic01.f90
+++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90
@@ -10,20 +10,23 @@ program requires
  ! READ
  ! ----------------------------------------------------------------------------
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Read
+ ! CHECK: OmpClause -> SeqCst
  !$omp atomic read
  i = j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK-NOT: OmpClause -> SeqCst
+ ! CHECK: OmpClause -> Relaxed
+ ! CHECK: OmpClause -> Read
  !$omp atomic relaxed read
  i = j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Read
+ ! CHECK-NOT: OmpClause -> SeqCst
+ ! CHECK: OmpClause -> Relaxed
  !$omp atomic read relaxed
  i = j
@@ -31,20 +34,23 @@ program requires
  ! WRITE
  ! ----------------------------------------------------------------------------
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Write
+ ! CHECK: OmpClause -> SeqCst
  !$omp atomic write
  i = j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK-NOT: OmpClause -> SeqCst
+ ! CHECK: OmpClause -> Relaxed
+ ! CHECK: OmpClause -> Write
  !$omp atomic relaxed write
  i = j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
- !
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Write
+ ! CHECK-NOT: OmpClause -> SeqCst
+ ! CHECK: OmpClause -> Relaxed
  !$omp atomic write relaxed
  i = j
@@ -52,31 +58,34 @@ program requires
  ! UPDATE
  ! ----------------------------------------------------------------------------
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Update
+ ! CHECK: OmpClause -> SeqCst
  !$omp atomic update
  i = i + j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK-NOT: OmpClause -> SeqCst
+ ! CHECK: OmpClause -> Relaxed
+ ! CHECK: OmpClause -> Update
  !$omp atomic relaxed update
  i = i + j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Update
+ ! CHECK-NOT: OmpClause -> SeqCst
+ ! CHECK: OmpClause -> Relaxed
  !$omp atomic update relaxed
  i = i + j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> SeqCst
  !$omp atomic
  i = i + j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK-NOT: OmpClause -> SeqCst
+ ! CHECK: OmpClause -> Relaxed
  !$omp atomic relaxed
  i = i + j
@@ -84,24 +93,27 @@ program requires
  ! CAPTURE
  ! ----------------------------------------------------------------------------
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
- !
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Capture
+ ! CHECK: OmpClause -> SeqCst
  !$omp atomic capture
  i = j
  j = j + 1
  !$omp end atomic
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK-NOT: OmpClause -> SeqCst
+ ! CHECK: OmpClause -> Relaxed
+ ! CHECK: OmpClause -> Capture
  !$omp atomic relaxed capture
  i = j
  j = j + 1
  !$omp end atomic
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Capture
+ ! CHECK-NOT: OmpClause -> SeqCst
+ ! CHECK: OmpClause -> Relaxed
  !$omp atomic capture relaxed
  i = j
  j = j + 1
diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90
index 4976a9667eb78..a3724a83456fd 100644
--- a/flang/test/Semantics/OpenMP/requires-atomic02.f90
+++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90
@@ -10,20 +10,23 @@ program requires
  ! READ
  ! ----------------------------------------------------------------------------
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Read
+ ! CHECK: OmpClause -> AcqRel
  !$omp atomic read
  i = j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK-NOT: OmpClause -> AcqRel
+ ! CHECK: OmpClause -> Relaxed
+ ! CHECK: OmpClause -> Read
  !$omp atomic relaxed read
  i = j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
- !
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Read
+ ! CHECK-NOT: OmpClause -> AcqRel
+ ! CHECK: OmpClause -> Relaxed
  !$omp atomic read relaxed
  i = j
@@ -31,20 +34,23 @@ program requires
  ! WRITE
  ! ----------------------------------------------------------------------------
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Write
+ ! CHECK: OmpClause -> AcqRel
  !$omp atomic write
  i = j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK-NOT: OmpClause -> AcqRel
+ ! CHECK: OmpClause -> Relaxed
+ ! CHECK: OmpClause -> Write
  !$omp atomic relaxed write
  i = j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Write
+ ! CHECK-NOT: OmpClause -> AcqRel
+ ! CHECK: OmpClause -> Relaxed
  !$omp atomic write relaxed
  i = j
@@ -52,31 +58,34 @@ program requires
  ! UPDATE
  ! ----------------------------------------------------------------------------
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Update
+ ! CHECK: OmpClause -> AcqRel
  !$omp atomic update
  i = i + j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK-NOT: OmpClause -> AcqRel
+ ! CHECK: OmpClause -> Relaxed
+ !
CHECK: OmpClause -> Update
  !$omp atomic relaxed update
  i = i + j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Update
+ ! CHECK-NOT: OmpClause -> AcqRel
+ ! CHECK: OmpClause -> Relaxed
  !$omp atomic update relaxed
  i = i + j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> AcqRel
  !$omp atomic
  i = i + j
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK-NOT: OmpClause -> AcqRel
+ ! CHECK: OmpClause -> Relaxed
  !$omp atomic relaxed
  i = i + j
@@ -84,24 +93,27 @@ program requires
  ! CAPTURE
  ! ----------------------------------------------------------------------------
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK: OmpClause -> Capture
+ ! CHECK: OmpClause -> AcqRel
  !$omp atomic capture
  i = j
  j = j + 1
  !$omp end atomic
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK-NOT: OmpClause -> AcqRel
+ ! CHECK: OmpClause -> Relaxed
+ ! CHECK: OmpClause -> Capture
  !$omp atomic relaxed capture
  i = j
  j = j + 1
  !$omp end atomic
- ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
- ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel
- ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+ ! CHECK-LABEL: OpenMPAtomicConstruct
+ ! CHECK-NOT: OmpClause -> AcqRel
+ ! CHECK: OmpClause -> Capture
+ !
CHECK: OmpClause -> Relaxed
  !$omp atomic capture relaxed
  i = j
  j = j + 1

>From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:41:16 -0500
Subject: [PATCH 05/22] Restart build

>From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:45:28 -0500
Subject: [PATCH 06/22] Restart build

>From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 07:54:54 -0500
Subject: [PATCH 07/22] Restart build

>From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 08:18:01 -0500
Subject: [PATCH 08/22] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed
---
 flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic.f90 | 2 ++
 5 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90
index 2619d235380f8..73899a9ff37f2 100644
--- a/flang/test/Semantics/OpenMP/atomic-read.f90
+++ b/flang/test/Semantics/OpenMP/atomic-read.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
index 64e31a7974b55..c427ba07d43d8 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python
%S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, y, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90
index 0aee02d429dc0..4595e02d01456 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-only.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, y
diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90
index 979568ebf7140..7965ad2dc7dbf 100644
--- a/flang/test/Semantics/OpenMP/atomic-write.f90
+++ b/flang/test/Semantics/OpenMP/atomic-write.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90
index 0dc8251ae356e..2caa161507d49 100644
--- a/flang/test/Semantics/OpenMP/atomic.f90
+++ b/flang/test/Semantics/OpenMP/atomic.f90
@@ -1,3 +1,5 @@
+! REQUIRES: openmp_runtime
+
 ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src
 use omp_lib
 !
Check OpenMP 2.13.6 atomic Construct

>From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 09:06:20 -0500
Subject: [PATCH 09/22] Fix examples
---
 flang/examples/FeatureList/FeatureList.cpp | 10 -------
 .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++--------------
 2 files changed, 7 insertions(+), 29 deletions(-)
diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp
index d1407cf0ef239..a36b8719e365d 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -445,13 +445,6 @@ struct NodeVisitor {
  READ_FEATURE(ObjectDecl)
  READ_FEATURE(OldParameterStmt)
  READ_FEATURE(OmpAlignedClause)
- READ_FEATURE(OmpAtomic)
- READ_FEATURE(OmpAtomicCapture)
- READ_FEATURE(OmpAtomicCapture::Stmt1)
- READ_FEATURE(OmpAtomicCapture::Stmt2)
- READ_FEATURE(OmpAtomicRead)
- READ_FEATURE(OmpAtomicUpdate)
- READ_FEATURE(OmpAtomicWrite)
  READ_FEATURE(OmpBeginBlockDirective)
  READ_FEATURE(OmpBeginLoopDirective)
  READ_FEATURE(OmpBeginSectionsDirective)
@@ -480,7 +473,6 @@ struct NodeVisitor {
  READ_FEATURE(OmpIterationOffset)
  READ_FEATURE(OmpIterationVector)
  READ_FEATURE(OmpEndAllocators)
- READ_FEATURE(OmpEndAtomic)
  READ_FEATURE(OmpEndBlockDirective)
  READ_FEATURE(OmpEndCriticalDirective)
  READ_FEATURE(OmpEndLoopDirective)
@@ -566,8 +558,6 @@ struct NodeVisitor {
  READ_FEATURE(OpenMPDeclareTargetConstruct)
  READ_FEATURE(OmpMemoryOrderType)
  READ_FEATURE(OmpMemoryOrderClause)
- READ_FEATURE(OmpAtomicClause)
- READ_FEATURE(OmpAtomicClauseList)
  READ_FEATURE(OmpAtomicDefaultMemOrderClause)
  READ_FEATURE(OpenMPFlushConstruct)
  READ_FEATURE(OpenMPLoopConstruct)
diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index dbbf86a6c6151..18a4107f9a9d1 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++
b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) {
  // the directive field.
  [&](const auto &c) -> SourcePosition {
  const CharBlock &source{std::get<0>(c.t).source};
- return (parsing->allCooked().GetSourcePositionRange(source))->first;
+ return parsing->allCooked().GetSourcePositionRange(source)->first;
  },
  [&](const OpenMPAtomicConstruct &c) -> SourcePosition {
- return std::visit(
- [&](const auto &o) -> SourcePosition {
- const CharBlock &source{std::get(o.t).source};
- return parsing->allCooked()
- .GetSourcePositionRange(source)
- ->first;
- },
- c.u);
+ const CharBlock &source{c.source};
+ return parsing->allCooked().GetSourcePositionRange(source)->first;
  },
  [&](const OpenMPSectionConstruct &c) -> SourcePosition {
  const CharBlock &source{c.source};
- return (parsing->allCooked().GetSourcePositionRange(source))->first;
+ return parsing->allCooked().GetSourcePositionRange(source)->first;
  },
  [&](const OpenMPUtilityConstruct &c) -> SourcePosition {
  const CharBlock &source{c.source};
- return (parsing->allCooked().GetSourcePositionRange(source))->first;
+ return parsing->allCooked().GetSourcePositionRange(source)->first;
  },
  },
  c.u);
@@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
  return normalize_construct_name(source.ToString());
  },
  [&](const OpenMPAtomicConstruct &c) -> std::string {
- return std::visit(
- [&](const auto &c) {
- // Get source from the verbatim fields
- const CharBlock &source{std::get(c.t).source};
- return "atomic-" +
- normalize_construct_name(source.ToString());
- },
- c.u);
+ const CharBlock &source{c.source};
+ return normalize_construct_name(source.ToString());
  },
  [&](const OpenMPUtilityConstruct &c) -> std::string {
  const CharBlock &source{c.source};

>From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:19 -0500
Subject:
[PATCH 10/22] Fix example
---
 .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++--
 flang/test/Examples/omp-atomic.f90 | 16 +++++++++++-----
 2 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index 18a4107f9a9d1..b0964845d3b99 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
  return normalize_construct_name(source.ToString());
  },
  [&](const OpenMPAtomicConstruct &c) -> std::string {
- const CharBlock &source{c.source};
- return normalize_construct_name(source.ToString());
+ auto &dirSpec = std::get(c.t);
+ auto &dirName = std::get(dirSpec.t);
+ return normalize_construct_name(dirName.source.ToString());
  },
  [&](const OpenMPUtilityConstruct &c) -> std::string {
  const CharBlock &source{c.source};
diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90
index dcca34b633a3e..934f84f132484 100644
--- a/flang/test/Examples/omp-atomic.f90
+++ b/flang/test/Examples/omp-atomic.f90
@@ -26,25 +26,31 @@
 ! CHECK:---
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 9
-! CHECK-NEXT: construct: atomic-read
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
-! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: - clause: read
 ! CHECK-NEXT: details: ''
+! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 12
-! CHECK-NEXT: construct: atomic-write
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
 ! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
+! CHECK-NEXT: - clause: write
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 16
-! CHECK-NEXT: construct: atomic-capture
+!
CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
+! CHECK-NEXT: - clause: capture
+! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;'
 ! CHECK-NEXT: - clause: seq_cst
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 21
-! CHECK-NEXT: construct: atomic-atomic
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses: []
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 8

>From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:47 -0500
Subject: [PATCH 11/22] Remove reference to *nullptr
---
 flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index ab868df76d298..4d9b375553c1e 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
  assert(firstOp && "Should have created an atomic operation");
  atomicAt = getInsertionPointAfter(firstOp);
- mlir::Operation *secondOp = genAtomicOperation(
- converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom,
- *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt);
+ mlir::Operation *secondOp = nullptr;
+ if (analysis.op1.what != analysis.None) {
+ secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what,
+ atomAddr, atom, *get(analysis.op1.expr),
+ hint, memOrder, atomicAt, prepareAt);
+ }
  if (secondOp) {
  builder.setInsertionPointAfter(secondOp);

>From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 2 May 2025 18:49:05 -0500
Subject: [PATCH 12/22] Updates and improvements
---
 flang/include/flang/Parser/parse-tree.h | 2 +-
 flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++--
 flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++----
flang/lib/Semantics/check-omp-structure.h | 1 +
 .../Todo/atomic-capture-implicit-cast.f90 | 48 ---
 .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 -
 .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +-
 .../OpenMP/atomic-update-capture.f90 | 8 +-
 .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +-
 9 files changed, 381 insertions(+), 180 deletions(-)
 delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90
diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h
index 77f57b1cb85c7..8213fe33edbd0 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct {
  struct Op {
  int what;
- TypedExpr expr;
+ AssignmentStmt::TypedAssignment assign;
  };
  TypedExpr atom, cond;
  Op op0, op1;
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 6177b59199481..7b6c22095d723 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter,
  static mlir::Operation * //
  genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc,
  lower::StatementContext &stmtCtx, mlir::Value atomAddr,
- const semantics::SomeExpr &atom, const semantics::SomeExpr &expr,
- mlir::IntegerAttr hint,
+ const semantics::SomeExpr &atom,
+ const evaluate::Assignment &assign, mlir::IntegerAttr hint,
  mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+ fir::FirOpBuilder::InsertPoint preAt,
  fir::FirOpBuilder::InsertPoint atomicAt,
- fir::FirOpBuilder::InsertPoint prepareAt) {
+ fir::FirOpBuilder::InsertPoint postAt) {
  fir::FirOpBuilder &builder = converter.getFirOpBuilder();
  fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint();
- builder.restoreInsertionPoint(prepareAt);
+ builder.restoreInsertionPoint(preAt);
- mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc));
+ mlir::Value storeAddr =
+
fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc));
  mlir::Type atomType = fir::unwrapRefType(atomAddr.getType());
+ mlir::Type storeType = fir::unwrapRefType(storeAddr.getType());
+
+ mlir::Value toAddr = [&]() {
+ if (atomType == storeType)
+ return storeAddr;
+ return builder.createTemporary(loc, atomType, ".tmp.atomval");
+ }();
  builder.restoreInsertionPoint(atomicAt);
  mlir::Operation *op = builder.create(
  loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder);
+
+ if (atomType != storeType) {
+ lower::ExprToValueMap overrides;
+ // The READ operation could be a part of UPDATE CAPTURE, so make sure
+ // we don't emit extra code into the body of the atomic op.
+ builder.restoreInsertionPoint(postAt);
+ mlir::Value load = builder.create(loc, toAddr);
+ overrides.try_emplace(&atom, load);
+
+ converter.overrideExprValues(&overrides);
+ mlir::Value value =
+ fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc));
+ converter.resetExprOverrides();
+
+ builder.create(loc, value, storeAddr);
+ }
  builder.restoreInsertionPoint(saved);
  return op;
  }
@@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc,
  static mlir::Operation * //
  genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc,
  lower::StatementContext &stmtCtx, mlir::Value atomAddr,
- const semantics::SomeExpr &atom, const semantics::SomeExpr &expr,
- mlir::IntegerAttr hint,
+ const semantics::SomeExpr &atom,
+ const evaluate::Assignment &assign, mlir::IntegerAttr hint,
  mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+ fir::FirOpBuilder::InsertPoint preAt,
  fir::FirOpBuilder::InsertPoint atomicAt,
- fir::FirOpBuilder::InsertPoint prepareAt) {
+ fir::FirOpBuilder::InsertPoint postAt) {
  fir::FirOpBuilder &builder = converter.getFirOpBuilder();
  fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint();
- builder.restoreInsertionPoint(prepareAt);
+ builder.restoreInsertionPoint(preAt);
- mlir::Value value =
fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); mlir::Value converted = builder.createConvert(loc, atomType, value); @@ -2719,19 +2746,20 @@ static mlir::Operation * genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrResizeOf(arg, atom)) { @@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, converter.overrideExprValues(&overrides); mlir::Value updated = - fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Value converted = builder.createConvert(loc, atomType, updated); builder.create(loc, converted); converter.resetExprOverrides(); @@ -2764,20 +2792,21 @@ static mlir::Operation * genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, 
lower::StatementContext &stmtCtx, int action, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: - return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Write: - return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Update: - return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign, + hint, memOrder, preAt, atomicAt, postAt); default: return nullptr; } @@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { } return ""s; }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; const SomeExpr &atom = *analysis.atom.get()->v; @@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; llvm::errs() << " op0 {\n"; llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; - 
llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << " op1 {\n"; llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; - llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << "}\n"; } @@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAtomicConstruct &construct) { - auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { - if (auto *maybe = expr.get(); maybe && maybe->v) { + auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) { + if (auto *maybe = typedWrapper.get(); maybe && maybe->v) { return &*maybe->v; } else { return nullptr; @@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, int action0 = analysis.op0.what & analysis.Action; int action1 = analysis.op1.what & analysis.Action; mlir::Operation *captureOp = nullptr; - fir::FirOpBuilder::InsertPoint atomicAt; - fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint atomicAt, postAt; if (construct.IsCapture()) { // Capturing operation. @@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, captureOp = builder.create(loc, hint, memOrder); // Set the non-atomic insertion point to before the atomic.capture. 
- prepareAt = getInsertionPointBefore(captureOp); + preAt = getInsertionPointBefore(captureOp); mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); builder.setInsertionPointToEnd(block); @@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, // atomic.capture. mlir::Operation *term = builder.create(loc); atomicAt = getInsertionPointBefore(term); + postAt = getInsertionPointAfter(captureOp); hint = nullptr; memOrder = nullptr; } else { @@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(action0 != analysis.None && action1 == analysis.None && "Expexcing single action"); assert(!(analysis.op0.what & analysis.Condition)); - atomicAt = prepareAt; + postAt = atomicAt = preAt; } mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, - *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); mlir::Operation *secondOp = nullptr; if (analysis.op1.what != analysis.None) { secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, - atomAddr, atom, *get(analysis.op1.expr), - hint, memOrder, atomicAt, prepareAt); + atomAddr, atom, *get(analysis.op1.assign), + hint, memOrder, preAt, atomicAt, postAt); } if (secondOp) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 201b38bd05ff3..f7753a5e5cc59 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } -static bool IsVarOrFunctionRef(const SomeExpr &expr) { - return evaluate::UnwrapProcedureRef(expr) != nullptr || - evaluate::IsVariable(expr); +static bool 
IsVarOrFunctionRef(const MaybeExpr &expr) { + if (expr) { + return evaluate::UnwrapProcedureRef(*expr) != nullptr || + evaluate::IsVariable(*expr); + } else { + return false; + } } static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { @@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource( namespace atomic { +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + enum class Operator { Unk, // Operators that are officially allowed in the update operation @@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (Append(v, std::move(results)), ...); + (MoveAppend(v, std::move(results)), ...); return v; } +}; -private: - static void Append(Result &acc, Result &&data) { - for (auto &&s : data) { - acc.push_back(std::move(s)); +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). 
+ return Combine(
+ {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>()));
+ } else {
+ auto copy{x.derived()};
+ return {evaluate::AsGenericExpr(std::move(copy)), {}};
 }
 }
+
+ template //
+ Result Combine(Result &&result, Rs &&...results) const {
+ Result v(std::move(result));
+ auto setValue{[](MaybeExpr &x, MaybeExpr &&y) {
+ assert((!x.has_value() || !y.has_value()) && "Multiple designators");
+ if (!x.has_value()) {
+ x = std::move(y);
+ }
+ }};
+ (setValue(v.first, std::move(results).first), ...);
+ (MoveAppend(v.second, std::move(results).second), ...);
+ return v;
+ }
+
+private:
+ template //
+ struct is_convert {
+ static constexpr bool value{false};
+ };
+ template //
+ struct is_convert> {
+ static constexpr bool value{true};
+ };
+ template //
+ struct is_convert> {
+ // Conversion from complex to real.
+ static constexpr bool value{true};
+ };
+ template //
+ static constexpr bool is_convert_v = is_convert::value;
+};
+
+struct VariableFinder : public evaluate::AnyTraverse {
+ using Base = evaluate::AnyTraverse;
+ VariableFinder(const SomeExpr &v) : Base(*this), var(v) {}
+
+ using Base::operator();
+
+ template
+ bool operator()(const evaluate::Designator &x) const {
+ auto copy{x};
+ return evaluate::AsGenericExpr(std::move(copy)) == var;
+ }
+
+ template
+ bool operator()(const evaluate::FunctionRef &x) const {
+ auto copy{x};
+ return evaluate::AsGenericExpr(std::move(copy)) == var;
+ }
+
+private:
+ const SomeExpr &var;
 };
 } // namespace atomic
@@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) {
 return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr;
 }
+static MaybeExpr GetConvertInput(const SomeExpr &x) {
+ // This returns SomeExpr(x) when x is a designator/functionref/constant.
+ return atomic::ConvertCollector{}(x).first;
+}
+
+static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) {
+ // Check if expr is same as x, or a sequence of Convert operations on x.
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { // Both expr and x have the form of SomeType(SomeKind(...)[1]). // Check if expr is @@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { } } +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { if (value) { expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), @@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { } } +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( - int what, const MaybeExpr &maybeExpr = std::nullopt) { + int what, + const std::optional &maybeAssign = std::nullopt) { parser::OpenMPAtomicConstruct::Analysis::Op operation; operation.what = what; - SetExpr(operation.expr, maybeExpr); + SetAssignment(operation.assign, maybeAssign); return operation; } @@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( // }; // struct Op { // int what; - // TypedExpr expr; + // TypedAssignment assign; // }; // TypedExpr atom, cond; // Op op0, op1; @@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, } } +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + /// Check 
if `expr` satisfies the following conditions for x and v: /// /// [6.0:189:10-12] @@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture( // // The two allowed cases are: // x = ... atomic-var = ... - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // or - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // x = ... atomic-var = ... // // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture @@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture( // // If the two statements don't fit these criteria, return a pair of default- // constructed values. + using ReturnTy = std::pair; SourcedActionStmt act1{GetActionStmt(ec1)}; SourcedActionStmt act2{GetActionStmt(ec2)}; @@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture( auto isUpdateCapture{ [](const evaluate::Assignment &u, const evaluate::Assignment &c) { - return u.lhs == c.rhs; + return IsSameOrConvertOf(c.rhs, u.lhs); }}; // Do some checks that narrow down the possible choices for the update // and the capture statements. This will help to emit better diagnostics. - bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); - bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)};
+ bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)};
+ bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))};
+ bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))};
+
+ // |cbu1 cbu2|
+ // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1
+ int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)};
+
+ auto errorCaptureShouldRead{[&](const parser::CharBlock &source,
+ const std::string &expr) {
+ context_.Say(source,
+ "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US,
+ expr);
+ }};
- if (couldBeCapture1) {
- if (couldBeCapture2) {
- if (isUpdateCapture(as2, as1)) {
- if (isUpdateCapture(as1, as2)) {
- // If both statements could be captures and both could be updates,
- // emit a warning about the ambiguity.
- context_.Say(act1.source,
- "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US);
- }
- return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1);
- } else if (isUpdateCapture(as1, as2)) {
+ auto errorNeitherWorks{[&]() {
+ context_.Say(source,
+ "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US);
+ }};
+
+ auto makeSelectionFromDet{[&](int det) -> ReturnTy {
+ // If det != 0, then the checks unambiguously suggest a specific
+ // categorization.
+ // If det == 0, then this function should be called only if the
+ // checks haven't ruled out any possibility, i.e. when both assigments
+ // could still be either updates or captures.
+ if (det > 0) {
+ // as1 is update, as2 is capture
+ if (isUpdateCapture(as1, as2)) {
 return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2);
 } else {
- context_.Say(source,
- "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US,
- as1.rhs.AsFortran(), as2.rhs.AsFortran());
+ errorCaptureShouldRead(act2.source, as1.lhs.AsFortran());
+ return std::make_pair(nullptr, nullptr);
 }
- } else { // !couldBeCapture2
+ } else if (det < 0) {
+ // as2 is update, as1 is capture
 if (isUpdateCapture(as2, as1)) {
 return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1);
 } else {
- context_.Say(act2.source,
- "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US,
- as1.rhs.AsFortran());
+ errorCaptureShouldRead(act1.source, as2.lhs.AsFortran());
+ return std::make_pair(nullptr, nullptr);
 }
- }
- } else { // !couldBeCapture1
- if (couldBeCapture2) {
- if (isUpdateCapture(as1, as2)) {
- return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2);
- } else {
+ } else {
+ bool updateFirst{isUpdateCapture(as1, as2)};
+ bool captureFirst{isUpdateCapture(as2, as1)};
+ if (updateFirst && captureFirst) {
+ // If both assignment could be the update and both could be the
+ // capture, emit a warning about the ambiguity.
 context_.Say(act1.source,
- "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US,
- as2.rhs.AsFortran());
+ "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US);
+ return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1);
 }
- } else {
- context_.Say(source,
- "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US);
+ if (updateFirst != captureFirst) {
+ const parser::ExecutionPartConstruct *upd{updateFirst ? ec1 : ec2};
+ const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2};
+ return std::make_pair(upd, cap);
+ }
+ assert(!updateFirst && !captureFirst);
+ errorNeitherWorks();
+ return std::make_pair(nullptr, nullptr);
 }
+ }};
+
+ if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) {
+ return makeSelectionFromDet(det);
 }
+ assert(det == 0 && "Prior checks should have covered det != 0");
- return std::make_pair(nullptr, nullptr);
+ // If neither of the statements is an RMW update, it could still be a
+ // "write" update. Pretty much any assignment can be a write update, so
+ // recompute det with cbu1 = cbu2 = true.
+ if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) {
+ return makeSelectionFromDet(writeDet);
+ }
+
+ // It's only errors from here on.
+
+ if (!cbu1 && !cbu2 && !cbc1 && !cbc2) {
+ errorNeitherWorks();
+ return std::make_pair(nullptr, nullptr);
+ }
+
+ // The remaining cases are that
+ // - no candidate for update, or for capture,
+ // - one of the assigments cannot be anything.
+
+ if (!cbu1 && !cbu2) {
+ context_.Say(source,
+ "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US);
+ return std::make_pair(nullptr, nullptr);
+ } else if (!cbc1 && !cbc2) {
+ context_.Say(source,
+ "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US);
+ return std::make_pair(nullptr, nullptr);
+ }
+
+ if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) {
+ auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source;
+ context_.Say(src,
+ "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US);
+ return std::make_pair(nullptr, nullptr);
+ }
+
+ // All cases should have been covered.
+ llvm_unreachable("Unchecked condition"); } void OmpStructureChecker::CheckAtomicCaptureAssignment( const evaluate::Assignment &capture, const SomeExpr &atom, parser::CharBlock source) { - const SomeExpr &cap{capture.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &cap{capture.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, rsrc); - // This part should have been checked prior to callig this function. - assert(capture.rhs == atom && "This canont be a capture assignment"); + // This part should have been checked prior to calling this function. + assert(*GetConvertInput(capture.rhs) == atom && + "This canont be a capture assignment"); CheckStorageOverlap(atom, {cap}, source); } } void OmpStructureChecker::CheckAtomicReadAssignment( const evaluate::Assignment &read, parser::CharBlock source) { - const SomeExpr &atom{read.rhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; - if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + if (auto maybe{GetConvertInput(read.rhs)}) { + const SomeExpr &atom{*maybe}; + + if (!IsVarOrFunctionRef(atom)) { + ErrorShouldBeVariable(atom, rsrc); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } } else { - CheckAtomicVariable(atom, rsrc); - CheckStorageOverlap(atom, {read.lhs}, source); + ErrorShouldBeVariable(read.rhs, rsrc); } } @@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment( // one of the following forms: // x = expr // x => expr - const SomeExpr &atom{write.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{write.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + 
ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, lsrc); CheckStorageOverlap(atom, {write.rhs}, source); @@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( // x = intrinsic-procedure-name (x) // x = intrinsic-procedure-name (x, expr-list) // x = intrinsic-procedure-name (expr-list, x) - const SomeExpr &atom{update.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{update.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); // Skip other checks. return; } @@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( const SomeExpr &cond, parser::CharBlock condSource, const evaluate::Assignment &assign, parser::CharBlock assignSource) { - const SomeExpr &atom{assign.lhs}; auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + const SomeExpr &atom{assign.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, arsrc); // Skip other checks. 
return; } @@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( @@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign), MakeAtomicAnalysisOp(Analysis::None)); } @@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( if (GetActionStmt(&body.front()).stmt == uact.stmt) { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(action, update.rhs), - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + MakeAtomicAnalysisOp(action, update), + MakeAtomicAnalysisOp(Analysis::Read, capture)); } else { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), - MakeAtomicAnalysisOp(action, update.rhs)); + MakeAtomicAnalysisOp(Analysis::Read, capture), + MakeAtomicAnalysisOp(action, update)); } } @@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( if (captureFirst) { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign)); } else { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, 
capAssign.lhs)); + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign)); } } @@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead( if (body.size() == 1) { SourcedActionStmt action{GetActionStmt(&body.front())}; if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { - const SomeExpr &atom{maybeRead->rhs}; CheckAtomicReadAssignment(*maybeRead, action.source); - using Analysis = parser::OpenMPAtomicConstruct::Analysis; - x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), - MakeAtomicAnalysisOp(Analysis::None)); + if (auto maybe{GetConvertInput(maybeRead->rhs)}) { + const SomeExpr &atom{*maybe}; + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead), + MakeAtomicAnalysisOp(Analysis::None)); + } } else { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); @@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index bf6fbf16d0646..835fbe45e1c0e 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -253,6 +253,7 @@ class OmpStructureChecker void CheckStorageOverlap(const evaluate::Expr &, llvm::ArrayRef>, parser::CharBlock); + void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source); void CheckAtomicVariable( const evaluate::Expr &, parser::CharBlock); std::pair&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture 
requiring implicit type casts
-subroutine capture_with_convert_f32_to_i32()
- implicit none
- integer :: k, v, i
-
- k = 1
- v = 0
-
- !$omp atomic capture
- v = k
- k = (i + 1) * 3.14
- !$omp end atomic
-end subroutine
-
-subroutine capture_with_convert_i32_to_f64()
- real(8) :: x
- integer :: v
- x = 1.0
- v = 0
- !$omp atomic capture
- v = x
- x = v
- !$omp end atomic
-end subroutine capture_with_convert_i32_to_f64
-
-subroutine capture_with_convert_f64_to_i32()
- integer :: x
- real(8) :: v
- x = 1
- v = 0
- !$omp atomic capture
- x = v
- v = x
- !$omp end atomic
-end subroutine capture_with_convert_f64_to_i32
-
-subroutine capture_with_convert_i32_to_f32()
- real(4) :: x
- integer :: v
- x = 1.0
- v = 0
- !$omp atomic capture
- v = x
- x = x + v
- !$omp end atomic
-end subroutine capture_with_convert_i32_to_f32
diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
index 75f1cbfc979b9..aa9d2e0ac3ff7 100644
--- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
+++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
@@ -1,5 +1,3 @@
-! REQUIRES : openmp_runtime
-
 ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s
 ! CHECK: func.func @_QPatomic_implicit_cast_read() {
diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
index deb67e7614659..8adb0f1a67409 100644
--- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
+++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
@@ -25,7 +25,7 @@ program sample
 !ERROR: The synchronization hint is not valid
 !$omp atomic hint(7) capture
- !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+ !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement
 y = x
 x = y
 !$omp end atomic
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
index c427ba07d43d8..f808ed916fb7e 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -39,7 +39,7 @@ subroutine f02
 subroutine f03
 integer :: x
- !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture
 !$omp atomic update capture
 x = x + 1
 x = x + 2
@@ -50,7 +50,7 @@ subroutine f04
 integer :: x, v
 !$omp atomic update capture
- !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+ !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement
 v = x
 x = v
 !$omp end atomic
@@ -60,8 +60,8 @@ subroutine f05
 integer :: x, v, z
 !$omp atomic update capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z
 v = x
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 z = x + 1
 !$omp end atomic
 end
@@ -70,8 +70,8 @@ subroutine f06
 integer :: x, v, z
 !$omp atomic update capture
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 z = x + 1
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z
 v = x
 !$omp end atomic
 end
diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
index 677b933932b44..5e180aa0bbe5b 100644
--- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
+++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
@@ -97,50 +97,50 @@ program sample
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b
 v = x
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 b = b + 1
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b
 v = x
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 b = 10
 !$omp end atomic
- !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
 !$omp atomic capture
 x = x + 10
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x
 v = b
 !$omp end atomic
- !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture
 !$omp atomic capture
 v = 1
 x = 4
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m
 x = z%y
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y
 z%m = z%m + 1.0
 !$omp end atomic
 !$omp atomic capture
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y
 z%m = z%m + 1.0
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m
 x = z%y
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8)
 x = y(2)
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8)
 y(1) = y(1) + 1
 !$omp end atomic
 !$omp atomic capture
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8)
 y(1) = y(1) + 1
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8)
 x = y(2)
 !$omp end atomic

>From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Mon, 24 Mar 2025 15:38:58 -0500
Subject: [PATCH 13/22] DumpEvExpr: show type

---
 flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++-------
 flang/lib/Semantics/dump-expr.cpp | 2 +-
 2 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h
index 2f445429a10b5..1553dac3b6687 100644
--- a/flang/include/flang/Semantics/dump-expr.h
+++ b/flang/include/flang/Semantics/dump-expr.h
@@ -16,6 +16,7 @@
 #include
 #include
+#include
 #include
 #include
@@ -38,6 +39,17 @@ class DumpEvaluateExpr {
 }
 private:
+ template
+ struct TypeOf {
+ static constexpr std::string_view name{TypeOf::get()};
+ static constexpr std::string_view get() {
+ std::string_view v(__PRETTY_FUNCTION__);
+ v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" + return v; + } + }; + template void Show(const common::Indirection &x) { Show(x.value()); } @@ -76,7 +88,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant"); + Indent("derived constant "s + std::string(TypeOf::name)); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -84,7 +96,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant"); + Print("constant "s + std::string(TypeOf::name)); } } void Show(const Symbol &symbol); @@ -102,7 +114,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator"); + Indent("designator "s + std::string(TypeOf::name)); Show(x.u); Outdent(); } @@ -117,7 +129,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref"); + Indent("function ref "s + std::string(TypeOf::name)); Show(x.proc()); Show(x.arguments()); Outdent(); @@ -127,14 +139,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value"); + Indent("array constructor value "s + std::string(TypeOf::name)); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do"); + Indent("implied do "s + std::string(TypeOf::name)); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -148,20 +160,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op"); + Indent("unary op "s + std::string(TypeOf::name)); Show(op.left()); Outdent(); } 
template void Show(const evaluate::Operation &op) { - Indent("binary op"); + Indent("binary op "s + std::string(TypeOf::name)); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr T"); + Indent("expr <" + std::string(TypeOf::name) + ">"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index aa0b4e0f03398..66cedab94bfb4 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("expr some type"); + Indent("relational some type"); Show(x.u); Outdent(); } >From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 15:14:45 -0500 Subject: [PATCH 14/22] Handle conversion from real to complex via complex constructor --- flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++--- 1 file changed, 47 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dada9c6c2bd6f..ae81dcb5ea150 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,36 +3183,46 @@ struct ConvertCollector using Base::operator(); template // - Result operator()(const evaluate::Designator &x) const { + Result asSomeExpr(const T &x) const { auto copy{x}; return {AsGenericExpr(std::move(copy)), {}}; } + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + template // Result operator()(const evaluate::FunctionRef &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template // Result operator()(const evaluate::Constant &x) const { - auto copy{x}; - return 
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/22] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: } >From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:14:47 -0500 Subject: [PATCH 16/22] Allow conversion in update operations --- flang/include/flang/Semantics/tools.h | 17 ++++----- flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++-- flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++----------- .../Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++-- flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++---------- .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +- 7 files changed, 44 insertions(+), 57 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7f1ec59b087a2..9be2feb8ae064 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -789,14 +789,15 @@ inline bool checkForSymbolMatch( /// return the "expr" but with top-level parentheses stripped. std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); -/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). -/// Check if "expr" is -/// SomeType(SomeKind(Type( -/// Convert -/// SomeKind(...)[2]))) -/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves -/// TypeCategory. -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// is returns std::nullopt. 
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic >From 341723713929507c59d528540d32bc2e4213e920 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:21:56 -0500 Subject: [PATCH 17/22] format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e8b1b..0f553541c5ef0 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } >From 2686207342bad511f6d51b20ed923c0d2cc9047b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:22:26 -0500 Subject: [PATCH 18/22] Revert "DumpEvExpr: show type" This reverts commit 40510a3068498d15257cc7d198bce9c8cd71a902. Debug changes accidentally pushed upstream. 
--- flang/include/flang/Semantics/dump-expr.h | 30 +++++++---------------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 10 insertions(+), 22 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b6687..2f445429a10b5 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,7 +16,6 @@ #include #include -#include #include #include @@ -39,17 +38,6 @@ class DumpEvaluateExpr { } private: - template - struct TypeOf { - static constexpr std::string_view name{TypeOf::get()}; - static constexpr std::string_view get() { - std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" - return v; - } - }; - template void Show(const common::Indirection &x) { Show(x.value()); } @@ -88,7 +76,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant "s + std::string(TypeOf::name)); + Indent("derived constant"); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -96,7 +84,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant "s + std::string(TypeOf::name)); + Print("constant"); } } void Show(const Symbol &symbol); @@ -114,7 +102,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator "s + std::string(TypeOf::name)); + Indent("designator"); Show(x.u); Outdent(); } @@ -129,7 +117,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref "s + std::string(TypeOf::name)); + Indent("function ref"); Show(x.proc()); Show(x.arguments()); Outdent(); @@ 
-139,14 +127,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value "s + std::string(TypeOf::name)); + Indent("array constructor value"); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do "s + std::string(TypeOf::name)); + Indent("implied do"); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -160,20 +148,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op "s + std::string(TypeOf::name)); + Indent("unary op"); Show(op.left()); Outdent(); } template void Show(const evaluate::Operation &op) { - Indent("binary op "s + std::string(TypeOf::name)); + Indent("binary op"); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr <" + std::string(TypeOf::name) + ">"); + Indent("expr T"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 66cedab94bfb4..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("relational some type"); + Indent("expr some type"); Show(x.u); Outdent(); } >From c00fc531bcf742c409fc974da94c5b362fa9132c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:37:19 -0500 Subject: [PATCH 19/22] Delete unnecessary static_assert --- flang/lib/Semantics/check-omp-structure.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 6005dda7c26fe..2e59553d5e130 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -21,8 +21,6 @@ namespace Fortran::semantics { -static_assert(std::is_same_v>); - template static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { return !(e == f); >From 45b012c16b77c757a0d09b2a229bad49fed8d26f Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:25 -0500 Subject: [PATCH 20/22] Add missing initializer for 'iff' --- flang/lib/Semantics/check-omp-structure.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 2e59553d5e130..aa1bd136b371f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2815,7 +2815,7 @@ static std::optional AnalyzeConditionalStmt( } } else { AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, - GetActionStmt(std::get(s.t))}; + GetActionStmt(std::get(s.t)), SourcedActionStmt{}}; if (result.ift.stmt) { return result; } >From daeac25991bf14fb08c3accabe068c074afa1eb7 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:47 -0500 Subject: [PATCH 21/22] Add asserts for printing "Identity" as top-level operator --- flang/lib/Semantics/check-omp-structure.cpp | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa1bd136b371f..062b45deac865 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3823,6 +3823,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, atomic::ToString(top.first)); @@ -3852,6 +3853,8 @@ void 
OmpStructureChecker::CheckAtomicUpdateAssignment( "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, atom.AsFortran(), atomic::ToString(top.first)); @@ -3898,16 +3901,20 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, atomic::ToString(top.first)); } break; } + case atomic::Operator::Identity: case atomic::Operator::True: case atomic::Operator::False: break; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, atomic::ToString(top.first)); >From ae121e5c37453af1a4aba7c77939c2c1c45b75fa Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 13:15:58 -0500 Subject: [PATCH 22/22] Explain the use of determinant --- flang/lib/Semantics/check-omp-structure.cpp | 31 ++++++++++++++++++--- 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 062b45deac865..bc6a09b9768ef 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3606,10 +3606,33 @@ OmpStructureChecker::CheckUpdateCapture( // subexpression of the right-hand side. // 2. An assignment could be a capture (cbc) if the right-hand side is // a variable (or a function ref), with potential type conversions. 
-  bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)};
-  bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)};
-  bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))};
-  bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))};
+  bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; // Can as1 be an update?
+  bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; // Can as2 be an update?
+  bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; // Can 1 be capture?
+  bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; // Can 2 be capture?
+
+  // We want to diagnose cases where both assignments cannot be an update,
+  // or both cannot be a capture, as well as cases where either assignment
+  // cannot be any of these two.
+  //
+  // If we organize these boolean values into a matrix
+  //   |cbu1 cbu2|
+  //   |cbc1 cbc2|
+  // then we want to diagnose cases where the matrix has a zero (i.e. "false")
+  // row or column, including the case where everything is zero. All these
+  // cases correspond to the determinant of the matrix being 0, which suggests
+  // that checking the det may be a convenient diagnostic check. There is only
+  // one additional case where the det is 0, which is when the matrix is all 1
+  // ("true"). The "all true" case represents the situation where both
+  // assignments could be an update as well as a capture. On the other hand,
+  // whenever det != 0, the roles of the update and the capture can be
+  // unambiguously assigned to as1 and as2 [1].
+  //
+  // [1] This can be easily verified by hand: there are 10 2x2 matrices with
+  // det = 0, leaving 6 cases where det != 0:
+  //   0 1   0 1   1 0   1 0   1 1   1 1
+  //   1 0   1 1   0 1   1 1   0 1   1 0
+  // In each case the classification is unambiguous.
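The counting argument in footnote [1] can also be checked mechanically. The following is a minimal standalone sketch, not part of the patch: it enumerates all 16 boolean 2x2 matrices, computes det = cbu1*cbc2 - cbu2*cbc1 with booleans treated as 0/1, and compares "det == 0" against "has a zero row or column, or is all true". The names `det`, `hasZeroRowOrColumn`, `Counts` and `enumerateMatrices` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical helper (not from the patch): determinant of
//   |cbu1 cbu2|
//   |cbc1 cbc2|
// with each boolean treated as 0 or 1.
static int det(bool cbu1, bool cbu2, bool cbc1, bool cbc2) {
  return int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1);
}

// True if some row or some column of the matrix is entirely false.
static bool hasZeroRowOrColumn(bool cbu1, bool cbu2, bool cbc1, bool cbc2) {
  return (!cbu1 && !cbu2) || (!cbc1 && !cbc2) || // zero row
         (!cbu1 && !cbc1) || (!cbu2 && !cbc2);   // zero column
}

struct Counts {
  int detZero;    // matrices with det == 0
  int detNonZero; // matrices with det != 0
  int mismatches; // matrices where the comment's characterization fails
};

// Walk over all 16 combinations of the four flags.
static Counts enumerateMatrices() {
  Counts c{0, 0, 0};
  for (int bits = 0; bits < 16; ++bits) {
    bool cbu1 = (bits & 8) != 0;
    bool cbu2 = (bits & 4) != 0;
    bool cbc1 = (bits & 2) != 0;
    bool cbc2 = (bits & 1) != 0;
    bool allTrue = cbu1 && cbu2 && cbc1 && cbc2;
    // The comment claims det == 0 exactly in these situations.
    bool expectZero = hasZeroRowOrColumn(cbu1, cbu2, cbc1, cbc2) || allTrue;
    int d = det(cbu1, cbu2, cbc1, cbc2);
    if (d == 0)
      ++c.detZero;
    else
      ++c.detNonZero;
    if ((d == 0) != expectZero)
      ++c.mismatches;
  }
  return c;
}
```

Calling `enumerateMatrices()` should report 10 matrices with det == 0, 6 with det != 0, and no mismatches, which agrees with the counts given in the comment.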
// |cbu1 cbu2| // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 From flang-commits at lists.llvm.org Thu May 29 11:58:20 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Thu, 29 May 2025 11:58:20 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6838ae4c.170a0220.48b5d.b55a@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852 >From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 07:58:29 -0500 Subject: [PATCH 01/22] [flang][OpenMP] Mark atomic clauses as unique The current implementation of the ATOMIC construct handles these clauses individually, and this change does not have an observable effect. At the same time these clauses are unique as per the OpenMP spec, and this patch reflects that in the OMP.td file. --- llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index eff6d57995d2b..cdfd3e3223fa8 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> { ]; } def OMP_Atomic : Directive<"atomic"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; let allowedOnceClauses = [ VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, + VersionedClause, ]; let association = AS_Block; let category = CA_Executable; @@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let category = CA_Executable; } def OMP_Critical : 
Directive<"critical"> { - let allowedClauses = [ + let allowedOnceClauses = [ VersionedClause, ]; let association = AS_Block; >From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 10:08:58 -0500 Subject: [PATCH 02/22] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs The OpenMP implementation of the ATOMIC construct will change in the near future to accommodate OpenMP 6.0. This patch separates the shared implementations to avoid interfering with OpenACC. --- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. 
-static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi 
%arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. - llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite, - loc); + genAtomicWrite(converter, atomicWrite, loc); }, [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) { - Fortran::lower::genOmpAccAtomicUpdate< - Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate, - loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const Fortran::parser::AccAtomicCapture &atomicCapture) { - Fortran::lower::genOmpAccAtomicCapture< - Fortran::parser::AccAtomicCapture, void>(converter, - atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, }, atomicConstruct.u); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 312557d5da07e..fdd85e94829f3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +//===----------------------------------------------------------------------===// +// Code generation for atomic operations +//===----------------------------------------------------------------------===// + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. 
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/22] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/22] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed. 
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 | 
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. 
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expecting two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture. 
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expecting single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point. 
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ...)" is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+ auto &dirSpec{std::get(x.t)}; + auto &clauseList{std::get>(dirSpec.t)}; if (clauseList) { - atomicDirectiveDefaultOrderFound_ = true; - switch (*defaultMemOrder) { - case common::OmpMemoryOrderType::Acq_Rel: - clauseList->emplace_back(common::visit( - common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause { - return parser::OmpClause::Acquire{}; - }, - [](parser::OmpAtomicCapture &) -> parser::OmpClause { - return parser::OmpClause::AcqRel{}; - }, - [](auto &) -> parser::OmpClause { - // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite} - return parser::OmpClause::Release{}; - }}, - x.u)); - break; - case common::OmpMemoryOrderType::Relaxed: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::Relaxed{}}); - break; - case common::OmpMemoryOrderType::Seq_Cst: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::SeqCst{}}); - break; - default: - // FIXME: Don't process other values at the moment since their validity - // depends on the OpenMP version (which is unavailable here). - break; + if (findMemOrderClause(*clauseList)) { + return false; } + } else { + clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){}); + } + + // Add a memory order clause to the atomic directive. + atomicDirectiveDefaultOrderFound_ = true; + switch (*defaultMemOrder) { + case common::OmpMemoryOrderType::Acq_Rel: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}}); + break; + case common::OmpMemoryOrderType::Relaxed: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}}); + break; + case common::OmpMemoryOrderType::Seq_Cst: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}}); + break; + default: + // FIXME: Don't process other values at the moment since their validity + // depends on the OpenMP version (which is unavailable here). 
+ break; } return false; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 index b82bd13622764..6f58e0939a787 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 index 88ec6fe910b9e..6729be6e5cf8b 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b8422c8f3b769 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture() !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr !CHECK: omp.atomic.capture { !CHECK: omp.atomic.update 
%[[VAL_A_BOX_ADDR]] : !fir.ptr { !CHECK: ^bb0(%[[ARG:.*]]: i32): !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32 !CHECK: omp.yield(%[[TEMP]] : i32) !CHECK: } -!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 +!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 !CHECK: } !CHECK: return !CHECK: } diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90 index 13392ad76471f..6eded49b0b15d 100644 --- a/flang/test/Lower/OpenMP/atomic-write.f90 +++ b/flang/test/Lower/OpenMP/atomic-write.f90 @@ -44,9 +44,9 @@ end program OmpAtomicWrite !CHECK-LABEL: func.func @_QPatomic_write_pointer() { !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"} !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) -!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32 !CHECK: %[[C2:.*]] = arith.constant 2 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90 deleted file mode 100644 index 5cd02698ff482..0000000000000 --- a/flang/test/Parser/OpenMP/atomic-compare.f90 +++ /dev/null @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s -! OpenMP version for documentation purposes only - it isn't used until Sema. -! This is testing for Parser errors that bail out before Sema. 
-program main - implicit none - integer :: i, j = 10 - logical :: r - - !CHECK: error: expected OpenMP construct - !$omp atomic compare write - r = i .eq. j + 1 - - !CHECK: error: expected end of line - !$omp atomic compare num_threads(4) - r = i .eq. j -end program main diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90 index 54492bf6a22a6..11e23e062bce7 100644 --- a/flang/test/Semantics/OpenMP/atomic-compare.f90 +++ b/flang/test/Semantics/OpenMP/atomic-compare.f90 @@ -44,46 +44,37 @@ !$omp end atomic ! Check for error conditions: - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic compare seq_cst seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst compare seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire compare if (b .eq. 
c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic compare acquire acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire compare acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic compare relaxed relaxed if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed compare relaxed if (b .eq. c) b = a - !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one FAIL clause can appear on the ATOMIC directive !$omp atomic fail(release) compare fail(release) if (c .eq. 
a) a = b !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index c13a11a8dd5dc..deb67e7614659 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -16,20 +16,21 @@ program sample !$omp atomic read hint(2) y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(3) y = y + 10 !$omp atomic update hint(5) y = x + y - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement y = x x = y !$omp end atomic - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic update hint(x) y = y * 1 @@ -46,7 +47,7 @@ program sample !$omp atomic hint(omp_lock_hint_speculative) x = y + x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read y = x @@ -69,36 +70,36 @@ program sample !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative) x = y + x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = y * 9 - !ERROR: Hint clause must have non-negative 
constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp atomic hint(1.0) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp atomic hint(z + omp_sync_hint_nonspeculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(k + omp_sync_hint_speculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write x = 10 * y !$omp atomic write hint(a) - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and y+x access the same storage x = y + x !$omp atomic hint(abs(-1)) write diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 new file mode 100644 index 0000000000000..2619d235380f8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -0,0 +1,89 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC READ. Expect no diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v = x + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v => x +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! 
Atomic variable can be a function reference. Expect no diagnostics. + !$omp atomic read + v = p(i) +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic read + !ERROR: Atomic variable x should be a scalar + v = x + + !$omp atomic read + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + v = y(2:4) +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic read + !ERROR: Within atomic operation x and x access the same storage + x = x + + ! Accessing same array, but not the same storage. Expect no diagnostics. + !$omp atomic read + y(1) = y(2) +end + +subroutine f05 + integer :: x, v + + !$omp atomic read + !ERROR: Atomic expression x+1_4 should be a variable + v = x + 1 +end + +subroutine f06 + character :: x, v + + !$omp atomic read + !ERROR: Atomic variable x cannot have CHARACTER type + v = x +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic read + !ERROR: Atomic variable x cannot be ALLOCATABLE + v = x +end + diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 new file mode 100644 index 0000000000000..64e31a7974b55 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -0,0 +1,77 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !$omp atomic update capture + x = v + x = x + 1 + y = x + !$omp end atomic +end + +subroutine f01 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments + !$omp atomic update capture + x = v + block + x = x + 1 + y = x + end block + !$omp end atomic +end + +subroutine f02 + integer :: x, y + + ! The update and capture statements can be inside of a single BLOCK. + ! The end-directive is then optional. Expect no diagnostics. 
+ !$omp atomic update capture + block + x = x + 1 + y = x + end block +end + +subroutine f03 + integer :: x + + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !$omp atomic update capture + x = x + 1 + x = x + 2 + !$omp end atomic +end + +subroutine f04 + integer :: x, v + + !$omp atomic update capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + v = x + x = v + !$omp end atomic +end + +subroutine f05 + integer :: x, v, z + + !$omp atomic update capture + v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + !$omp end atomic +end + +subroutine f06 + integer :: x, v, z + + !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + v = x + !$omp end atomic +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 new file mode 100644 index 0000000000000..0aee02d429dc0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -0,0 +1,83 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y + + ! The x is a direct argument of the + operator. Expect no diagnostics. + !$omp atomic update + x = x + (y - 1) +end + +subroutine f01 + integer :: x + + ! x + 0 is unusual, but legal. Expect no diagnostics. + !$omp atomic update + x = x + 0 +end + +subroutine f02 + integer :: x + + ! This is formally not allowed by the syntax restrictions of the spec, + ! but it's equivalent to either x+0 or x*1, both of which are legal. + ! Allow this case. Expect no diagnostics. 
+ !$omp atomic update + x = x +end + +subroutine f03 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator + x = (x + y) + 1 +end + +subroutine f04 + integer :: x + real :: y + + !$omp atomic update + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation + x = floor(x + y) +end + +subroutine f05 + integer :: x + real :: y + + !$omp atomic update + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + x = int(x + y) +end + +subroutine f06 + integer :: x, y + interface + function f(i, j) + integer :: f, i, j + end + end interface + + !$omp atomic update + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation + x = f(x, y) +end + +subroutine f07 + real :: x + integer :: y + + !$omp atomic update + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation + x = x ** y +end + +subroutine f08 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should appear as an argument in the update operation + x = y +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 index 21a9b87d26345..3084376b4275d 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 @@ -22,10 +22,10 @@ program sample x = x / y !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. y !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. 
y end program diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 new file mode 100644 index 0000000000000..979568ebf7140 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -0,0 +1,81 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics. + !$omp atomic write + x = v + 1 + + !$omp atomic write + x = v + 3 + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic write + x = 2 * v + 3 + + !$omp atomic write + x => v +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! Atomic variable can be a function reference. Expect no diagnostics. + !$omp atomic write + p(i) = v +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic write + !ERROR: Atomic variable x should be a scalar + x = v + + !$omp atomic write + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + y(2:4) = v +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic write + !ERROR: Within atomic operation x and x+1_4 access the same storage + x = x + 1 + + ! Accessing same array, but not the same storage. Expect no diagnostics. 
+ !$omp atomic write + y(1) = y(2) +end + +subroutine f06 + character :: x, v + + !$omp atomic write + !ERROR: Atomic variable x cannot have CHARACTER type + x = v +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic write + !ERROR: Atomic variable x cannot be ALLOCATABLE + x = v +end + diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0e100871ea9b4..0dc8251ae356e 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! Check OpenMP 2.13.6 atomic Construct @@ -11,9 +11,13 @@ a = b !$omp end atomic + !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED) a = b + !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write a = b @@ -22,39 +26,32 @@ a = a + 1 !$omp end atomic + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic hint(1) acq_rel capture b = a a = a + 1 !$omp end atomic - !ERROR: expected end of line + !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct !$omp atomic read write + !ERROR: Atomic expression a+1._4 should be a variable a = a + 1 !$omp atomic a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 
'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic num_threads(4) a = a + 1 - !ERROR: expected end of line + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic capture num_threads(4) a = a + 1 + !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic relaxed a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' - !$omp atomic num_threads write - a = a + 1 - !$omp end parallel end diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90 index 173effe86b69c..f700c381cadd0 100644 --- a/flang/test/Semantics/OpenMP/atomic01.f90 +++ b/flang/test/Semantics/OpenMP/atomic01.f90 @@ -14,322 +14,277 @@ ! At most one memory-order-clause may appear on the construct. 
!READ - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic read seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst read seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic read acquire acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire read acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the 
READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic read relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed read relaxed i = j !UPDATE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic update seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst update seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The 
atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic update release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release update release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic update relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic 
relaxed update relaxed
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  !CAPTURE
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst seq_cst capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic capture seq_cst seq_cst
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst capture seq_cst
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic release release capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic capture release release
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic release capture release
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the CAPTURE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed relaxed capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the CAPTURE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic capture relaxed relaxed
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the CAPTURE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed capture relaxed
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive
  !$omp atomic acq_rel acq_rel capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive
  !$omp atomic capture acq_rel acq_rel
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive
  !$omp atomic acq_rel capture acq_rel
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
  !$omp atomic acquire acquire capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
  !$omp atomic capture acquire acquire
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
  !$omp atomic acquire capture acquire
  i = j
  j = k
  !$omp end atomic
  !WRITE
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the WRITE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst seq_cst write
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the WRITE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic write seq_cst seq_cst
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the WRITE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst write seq_cst
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the WRITE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic release release write
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the WRITE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic write release release
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the WRITE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic release write release
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the WRITE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed relaxed write
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the WRITE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic write relaxed relaxed
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the WRITE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed write relaxed
  i = j
  !No atomic-clause
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed relaxed
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst seq_cst
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
  !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic release release
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  ! 2.17.7.3
  ! At most one hint clause may appear on the construct.
- !ERROR: At most one HINT clause can appear on the READ directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read
  i = j
- !ERROR: At most one HINT clause can appear on the READ directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative)
  i = j
- !ERROR: At most one HINT clause can appear on the READ directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended)
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative)
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended)
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative)
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended)
  i = j
- !ERROR: At most one HINT clause can appear on the UPDATE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: At most one HINT clause can appear on the UPDATE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative)
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: At most one HINT clause can appear on the UPDATE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended)
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative)
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative)
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended)
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: At most one HINT clause can appear on the CAPTURE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: At most one HINT clause can appear on the CAPTURE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative)
  i = j
  j = k
  !$omp end atomic
- !ERROR: At most one HINT clause can appear on the CAPTURE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended)
  i = j
  j = k
@@ -337,34 +292,26 @@
  ! 2.17.7.4
  ! If atomic-clause is read then memory-order-clause must not be acq_rel or release.
- !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive
  !$omp atomic acq_rel read
  i = j
- !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive
  !$omp atomic read acq_rel
  i = j
- !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive
  !$omp atomic release read
  i = j
- !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive
  !$omp atomic read release
  i = j
  ! 2.17.7.5
  ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire.
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic acq_rel write
  i = j
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic write acq_rel
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic acquire write
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic write acquire
  i = j
@@ -372,33 +319,27 @@
  ! 2.17.7.6
  ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire.
- !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic acq_rel update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic update acq_rel
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic acquire update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic update acquire
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive
  !$omp atomic acq_rel
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQUIRE is not allowed on the ATOMIC directive
  !$omp atomic acquire
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  end program
diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90
index c66085d00f157..45e41f2552965 100644
--- a/flang/test/Semantics/OpenMP/atomic02.f90
+++ b/flang/test/Semantics/OpenMP/atomic02.f90
@@ -28,36 +28,29 @@ program OmpAtomic
  !$omp atomic
  a = a/(b + 1)
  !$omp atomic
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The ** operator is not a valid ATOMIC UPDATE operation
  a = a**4
  !$omp atomic
- !ERROR: Expected scalar variable on the LHS of atomic update assignment statement
- !ERROR: Invalid or missing operator in atomic update statement
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
+ !ERROR: Atomic variable c cannot have CHARACTER type
+ !ERROR: The atomic variable c should appear as an argument in the update operation
  c = d
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The < operator is not a valid ATOMIC UPDATE operation
  l = a .LT. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The <= operator is not a valid ATOMIC UPDATE operation
  l = a .LE. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The == operator is not a valid ATOMIC UPDATE operation
  l = a .EQ. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The /= operator is not a valid ATOMIC UPDATE operation
  l = a .NE. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The >= operator is not a valid ATOMIC UPDATE operation
  l = a .GE. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The > operator is not a valid ATOMIC UPDATE operation
  l = a .GT. b
  !$omp atomic
  m = m .AND. n
@@ -76,32 +69,26 @@ program OmpAtomic
  !$omp atomic update
  a = a/(b + 1)
  !$omp atomic update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The ** operator is not a valid ATOMIC UPDATE operation
  a = a**4
  !$omp atomic update
- !ERROR: Expected scalar variable on the LHS of atomic update assignment statement
- !ERROR: Invalid or missing operator in atomic update statement
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
+ !ERROR: Atomic variable c cannot have CHARACTER type
+ !ERROR: This is not a valid ATOMIC UPDATE operation
  c = c//d
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The < operator is not a valid ATOMIC UPDATE operation
  l = a .LT. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The <= operator is not a valid ATOMIC UPDATE operation
  l = a .LE. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The == operator is not a valid ATOMIC UPDATE operation
  l = a .EQ. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The >= operator is not a valid ATOMIC UPDATE operation
  l = a .GE. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The > operator is not a valid ATOMIC UPDATE operation
  l = a .GT. b
  !$omp atomic update
  m = m .AND. n
diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90
index 76367495b9861..f5c189fd05318 100644
--- a/flang/test/Semantics/OpenMP/atomic03.f90
+++ b/flang/test/Semantics/OpenMP/atomic03.f90
@@ -25,28 +25,26 @@ program OmpAtomic
  y = MIN(y, 8)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator
  z = IAND(y, 4)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator
  z = IOR(y, 5)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator
  z = IEOR(y, 6)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX operator
  z = MAX(y, 7, b, c)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator
  z = MIN(y, 8, a, d)
  !$omp atomic
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y'
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  y = FRACTION(x)
  !$omp atomic
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y'
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  y = REAL(x)
  !$omp atomic update
  y = IAND(y, 4)
@@ -60,26 +58,26 @@ program OmpAtomic
  y = MIN(y, 8)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator
  z = IAND(y, 4)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator
  z = IOR(y, 5)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator
  z = IEOR(y, 6)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX operator
  z = MAX(y, 7)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator
  z = MIN(y, 8)
  !$omp atomic update
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
+ !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation
  y = MOD(y, 9)
  !$omp atomic update
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
+ !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation
  x = ABS(x)
  end program OmpAtomic
@@ -92,7 +90,7 @@ subroutine conflicting_types()
  type(simple) ::s
  z = 1
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator
  z = IAND(s%z, 4)
  end subroutine
@@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts()
  type(some_type) :: s
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a'
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator
  a = min(a, a, b)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a'
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator
  a = max(b, a, b, a)
  !$omp atomic
- !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)`
  a = min(b, a, b)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a'
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator
  a = max(b, a, b, a, b)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y'
+ !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator
  y = min(z, x)
  !$omp atomic
  z = max(z, y)
  !$omp atomic update
- !ERROR: Expected scalar variable on the LHS of atomic update assignment statement
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k'
+ !ERROR: Atomic variable k should be a scalar
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  k = max(x, y)
-
+
  !$omp atomic
  !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4)
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
  x = min(x, k)
  !$omp atomic
  !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4)
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
- z =z + s%m
+ z = z + s%m
  end subroutine
diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90
index a9644ad95aa30..5c91ab5dc37e4 100644
--- a/flang/test/Semantics/OpenMP/atomic04.f90
+++ b/flang/test/Semantics/OpenMP/atomic04.f90
@@ -20,12 +20,10 @@ program OmpAtomic
  !$omp atomic
  x = 1 + x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y + 1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 + y
  !$omp atomic
@@ -33,12 +31,10 @@ program OmpAtomic
  !$omp atomic
  x = 1 - x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y - 1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 - y
  !$omp atomic
@@ -46,12 +42,10 @@ program OmpAtomic
  !$omp atomic
  x = 1*x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y*1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1*y
  !$omp atomic
@@ -59,12 +53,10 @@ program OmpAtomic
  !$omp atomic
  x = 1/x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y/1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1/y
  !$omp atomic
@@ -72,8 +64,7 @@ program OmpAtomic
  !$omp atomic
  m = n .AND. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator
  m = n .AND. l
  !$omp atomic
@@ -81,8 +72,7 @@ program OmpAtomic
  !$omp atomic
  m = n .OR. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator
  m = n .OR. l
  !$omp atomic
@@ -90,8 +80,7 @@ program OmpAtomic
  !$omp atomic
  m = n .EQV. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator
  m = n .EQV. l
  !$omp atomic
@@ -99,8 +88,7 @@ program OmpAtomic
  !$omp atomic
  m = n .NEQV. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator
  m = n .NEQV. l
  !$omp atomic update
@@ -108,12 +96,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1 + x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y + 1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 + y
  !$omp atomic update
@@ -121,12 +107,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1 - x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y - 1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 - y
  !$omp atomic update
@@ -134,12 +118,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1*x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y*1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1*y
  !$omp atomic update
@@ -147,12 +129,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1/x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y/1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1/y
  !$omp atomic update
@@ -160,8 +140,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .AND. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator
  m = n .AND. l
  !$omp atomic update
@@ -169,8 +148,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .OR. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator
  m = n .OR. l
  !$omp atomic update
@@ -178,8 +156,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .EQV. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator
  m = n .EQV. l
  !$omp atomic update
@@ -187,8 +164,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .NEQV. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator
  m = n .NEQV. l
  end program OmpAtomic
@@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts()
  type(some_type) p
  !$omp atomic
- !ERROR: Invalid or missing operator in atomic update statement
  x = x
  !$omp atomic update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: This is not a valid ATOMIC UPDATE operation
  x = 1
  !$omp atomic update
- !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement
+ !ERROR: Within atomic operation a and a*b access the same storage
  a = a * b + a
  !$omp atomic
- !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a`
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator
  a = b * (a + 9)
  !$omp atomic update
- !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement
+ !ERROR: Within atomic operation a and (a+b) access the same storage
  a = a * (a + b)
  !$omp atomic
- !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement
+ !ERROR: Within atomic operation a and (b+a) access the same storage
  a = (b + a) * a
  !$omp atomic
- !ERROR: Atomic update statement should
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(7) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(x) y = 2 @@ -54,7 +54,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint) y = 2 @@ -84,35 +84,35 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended) y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp critical (name) hint(1.0) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp critical (name) hint(z + omp_sync_hint_nonspeculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(k + omp_sync_hint_speculative) y = 2 !$omp end critical 
(name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended) y = 2 diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 505cbc48fef90..677b933932b44 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -20,70 +20,64 @@ program sample !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar v = y(1:3) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression x*(10_4+x) should be a variable v = x * (10 + x) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression 4_4 should be a variable v = 4 !$omp atomic read - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE v = k !$omp atomic write - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = x !$omp atomic update - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = k + x * (v * x) !$omp atomic - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = v * k !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'z%y' + !ERROR: Within atomic operation z%y and x+z%y access the same storage z%y = x + z%y !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + 
!ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage m = min(m, x, z%m) + k !$omp atomic read - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Atomic expression min(m,x,z%m)+k should be a variable m = min(m, x, z%m) + k !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar x = a - !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - a = x - !$omp atomic write !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement x = a !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar a = x !$omp atomic capture @@ -93,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = b + (x*1) !$omp end atomic @@ -103,60 +97,58 @@ program sample !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with 
CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct x = x + 10 v = b !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt] v = 1 x = 4 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct x = z%y + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 x = z%y !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct x = y(2) + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 x = y(2) !$omp end atomic !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable r cannot have CHARACTER type l = r !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable l cannot have CHARACTER type l = r end program diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90 index ae9fd086015dd..e8817c3f5ef61 100644 --- 
a/flang/test/Semantics/OpenMP/requires-atomic01.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> SeqCst !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> SeqCst !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> SeqCst !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> SeqCst !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> SeqCst !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90 index 4976a9667eb78..a3724a83456fd 100644 --- a/flang/test/Semantics/OpenMP/requires-atomic02.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> AcqRel !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! 
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> AcqRel !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> AcqRel !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! 
CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> AcqRel !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> AcqRel !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Capture + ! 
CHECK: OmpClause -> Relaxed
   !$omp atomic capture relaxed
   i = j
   j = j + 1

>From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:41:16 -0500
Subject: [PATCH 05/22] Restart build

>From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:45:28 -0500
Subject: [PATCH 06/22] Restart build

>From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 07:54:54 -0500
Subject: [PATCH 07/22] Restart build

>From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 08:18:01 -0500
Subject: [PATCH 08/22] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed

---
 flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic.f90 | 2 ++
 5 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90
index 2619d235380f8..73899a9ff37f2 100644
--- a/flang/test/Semantics/OpenMP/atomic-read.f90
+++ b/flang/test/Semantics/OpenMP/atomic-read.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
index 64e31a7974b55..c427ba07d43d8 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, y, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90
index 0aee02d429dc0..4595e02d01456 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-only.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, y
diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90
index 979568ebf7140..7965ad2dc7dbf 100644
--- a/flang/test/Semantics/OpenMP/atomic-write.f90
+++ b/flang/test/Semantics/OpenMP/atomic-write.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90
index 0dc8251ae356e..2caa161507d49 100644
--- a/flang/test/Semantics/OpenMP/atomic.f90
+++ b/flang/test/Semantics/OpenMP/atomic.f90
@@ -1,3 +1,5 @@
+! REQUIRES: openmp_runtime
+
 ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src
 use omp_lib
 ! Check OpenMP 2.13.6 atomic Construct

>From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 09:06:20 -0500
Subject: [PATCH 09/22] Fix examples

---
 flang/examples/FeatureList/FeatureList.cpp | 10 -------
 .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++--------------
 2 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp
index d1407cf0ef239..a36b8719e365d 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -445,13 +445,6 @@ struct NodeVisitor {
   READ_FEATURE(ObjectDecl)
   READ_FEATURE(OldParameterStmt)
   READ_FEATURE(OmpAlignedClause)
-  READ_FEATURE(OmpAtomic)
-  READ_FEATURE(OmpAtomicCapture)
-  READ_FEATURE(OmpAtomicCapture::Stmt1)
-  READ_FEATURE(OmpAtomicCapture::Stmt2)
-  READ_FEATURE(OmpAtomicRead)
-  READ_FEATURE(OmpAtomicUpdate)
-  READ_FEATURE(OmpAtomicWrite)
   READ_FEATURE(OmpBeginBlockDirective)
   READ_FEATURE(OmpBeginLoopDirective)
   READ_FEATURE(OmpBeginSectionsDirective)
@@ -480,7 +473,6 @@ struct NodeVisitor {
   READ_FEATURE(OmpIterationOffset)
   READ_FEATURE(OmpIterationVector)
   READ_FEATURE(OmpEndAllocators)
-  READ_FEATURE(OmpEndAtomic)
   READ_FEATURE(OmpEndBlockDirective)
   READ_FEATURE(OmpEndCriticalDirective)
   READ_FEATURE(OmpEndLoopDirective)
@@ -566,8 +558,6 @@ struct NodeVisitor {
   READ_FEATURE(OpenMPDeclareTargetConstruct)
   READ_FEATURE(OmpMemoryOrderType)
   READ_FEATURE(OmpMemoryOrderClause)
-  READ_FEATURE(OmpAtomicClause)
-  READ_FEATURE(OmpAtomicClauseList)
   READ_FEATURE(OmpAtomicDefaultMemOrderClause)
   READ_FEATURE(OpenMPFlushConstruct)
   READ_FEATURE(OpenMPLoopConstruct)
diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index dbbf86a6c6151..18a4107f9a9d1 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) {
           // the directive field.
           [&](const auto &c) -> SourcePosition {
             const CharBlock &source{std::get<0>(c.t).source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPAtomicConstruct &c) -> SourcePosition {
-            return std::visit(
-                [&](const auto &o) -> SourcePosition {
-                  const CharBlock &source{std::get(o.t).source};
-                  return parsing->allCooked()
-                      .GetSourcePositionRange(source)
-                      ->first;
-                },
-                c.u);
+            const CharBlock &source{c.source};
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPSectionConstruct &c) -> SourcePosition {
             const CharBlock &source{c.source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPUtilityConstruct &c) -> SourcePosition {
             const CharBlock &source{c.source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
       },
       c.u);
@@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
             return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPAtomicConstruct &c) -> std::string {
-            return std::visit(
-                [&](const auto &c) {
-                  // Get source from the verbatim fields
-                  const CharBlock &source{std::get(c.t).source};
-                  return "atomic-" +
-                      normalize_construct_name(source.ToString());
-                },
-                c.u);
+            const CharBlock &source{c.source};
+            return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPUtilityConstruct &c) -> std::string {
             const CharBlock &source{c.source};

>From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:19 -0500
Subject: [PATCH 10/22] Fix example

---
 .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++--
 flang/test/Examples/omp-atomic.f90 | 16 +++++++++++-----
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index 18a4107f9a9d1..b0964845d3b99 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
             return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPAtomicConstruct &c) -> std::string {
-            const CharBlock &source{c.source};
-            return normalize_construct_name(source.ToString());
+            auto &dirSpec = std::get(c.t);
+            auto &dirName = std::get(dirSpec.t);
+            return normalize_construct_name(dirName.source.ToString());
           },
           [&](const OpenMPUtilityConstruct &c) -> std::string {
             const CharBlock &source{c.source};
diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90
index dcca34b633a3e..934f84f132484 100644
--- a/flang/test/Examples/omp-atomic.f90
+++ b/flang/test/Examples/omp-atomic.f90
@@ -26,25 +26,31 @@
 ! CHECK:---
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 9
-! CHECK-NEXT: construct: atomic-read
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
-! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: - clause: read
 ! CHECK-NEXT: details: ''
+! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 12
-! CHECK-NEXT: construct: atomic-write
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
 ! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
+! CHECK-NEXT: - clause: write
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 16
-! CHECK-NEXT: construct: atomic-capture
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
+! CHECK-NEXT: - clause: capture
+! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;'
 ! CHECK-NEXT: - clause: seq_cst
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 21
-! CHECK-NEXT: construct: atomic-atomic
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses: []
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 8

>From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:47 -0500
Subject: [PATCH 11/22] Remove reference to *nullptr

---
 flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index ab868df76d298..4d9b375553c1e 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
   assert(firstOp && "Should have created an atomic operation");
   atomicAt = getInsertionPointAfter(firstOp);

-  mlir::Operation *secondOp = genAtomicOperation(
-      converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom,
-      *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt);
+  mlir::Operation *secondOp = nullptr;
+  if (analysis.op1.what != analysis.None) {
+    secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what,
+                                  atomAddr, atom, *get(analysis.op1.expr),
+                                  hint, memOrder, atomicAt, prepareAt);
+  }

   if (secondOp) {
     builder.setInsertionPointAfter(secondOp);

>From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 2 May 2025 18:49:05 -0500
Subject: [PATCH 12/22] Updates and improvements

---
 flang/include/flang/Parser/parse-tree.h | 2 +-
 flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++--
 flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++----
CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: +! CHECK-NEXT: - clause: capture +! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;' ! CHECK-NEXT: - clause: seq_cst ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 21 -! CHECK-NEXT: construct: atomic-atomic +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: [] ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 8 >From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:47 -0500 Subject: [PATCH 11/22] Remove reference to *nullptr --- flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ab868df76d298..4d9b375553c1e 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); - mlir::Operation *secondOp = genAtomicOperation( - converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, - *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + mlir::Operation *secondOp = nullptr; + if (analysis.op1.what != analysis.None) { + secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, + atomAddr, atom, *get(analysis.op1.expr), + hint, memOrder, atomicAt, prepareAt); + } if (secondOp) { builder.setInsertionPointAfter(secondOp); >From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 2 May 2025 18:49:05 -0500 Subject: [PATCH 12/22] Updates and improvements --- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++-- flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++---- 
flang/lib/Semantics/check-omp-structure.h | 1 + .../Todo/atomic-capture-implicit-cast.f90 | 48 --- .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 - .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +- .../OpenMP/atomic-update-capture.f90 | 8 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +- 9 files changed, 381 insertions(+), 180 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 77f57b1cb85c7..8213fe33edbd0 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct { struct Op { int what; - TypedExpr expr; + AssignmentStmt::TypedAssignment assign; }; TypedExpr atom, cond; Op op0, op1; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6177b59199481..7b6c22095d723 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter, static mlir::Operation * // genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Value storeAddr = + 
fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Type storeType = fir::unwrapRefType(storeAddr.getType()); + + mlir::Value toAddr = [&]() { + if (atomType == storeType) + return storeAddr; + return builder.createTemporary(loc, atomType, ".tmp.atomval"); + }(); builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + + if (atomType != storeType) { + lower::ExprToValueMap overrides; + // The READ operation could be a part of UPDATE CAPTURE, so make sure + // we don't emit extra code into the body of the atomic op. + builder.restoreInsertionPoint(postAt); + mlir::Value load = builder.create(loc, toAddr); + overrides.try_emplace(&atom, load); + + converter.overrideExprValues(&overrides); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); + converter.resetExprOverrides(); + + builder.create(loc, value, storeAddr); + } builder.restoreInsertionPoint(saved); return op; } @@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, static mlir::Operation * // genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value value = 
fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); mlir::Value converted = builder.createConvert(loc, atomType, value); @@ -2719,19 +2746,20 @@ static mlir::Operation * genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrResizeOf(arg, atom)) { @@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, converter.overrideExprValues(&overrides); mlir::Value updated = - fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Value converted = builder.createConvert(loc, atomType, updated); builder.create(loc, converted); converter.resetExprOverrides(); @@ -2764,20 +2792,21 @@ static mlir::Operation * genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, 
lower::StatementContext &stmtCtx, int action, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: - return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Write: - return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Update: - return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign, + hint, memOrder, preAt, atomicAt, postAt); default: return nullptr; } @@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { } return ""s; }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; const SomeExpr &atom = *analysis.atom.get()->v; @@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; llvm::errs() << " op0 {\n"; llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; - 
llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << " op1 {\n"; llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; - llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << "}\n"; } @@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAtomicConstruct &construct) { - auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { - if (auto *maybe = expr.get(); maybe && maybe->v) { + auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) { + if (auto *maybe = typedWrapper.get(); maybe && maybe->v) { return &*maybe->v; } else { return nullptr; @@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, int action0 = analysis.op0.what & analysis.Action; int action1 = analysis.op1.what & analysis.Action; mlir::Operation *captureOp = nullptr; - fir::FirOpBuilder::InsertPoint atomicAt; - fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint atomicAt, postAt; if (construct.IsCapture()) { // Capturing operation. @@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, captureOp = builder.create(loc, hint, memOrder); // Set the non-atomic insertion point to before the atomic.capture. 
- prepareAt = getInsertionPointBefore(captureOp); + preAt = getInsertionPointBefore(captureOp); mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); builder.setInsertionPointToEnd(block); @@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, // atomic.capture. mlir::Operation *term = builder.create(loc); atomicAt = getInsertionPointBefore(term); + postAt = getInsertionPointAfter(captureOp); hint = nullptr; memOrder = nullptr; } else { @@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(action0 != analysis.None && action1 == analysis.None && "Expexcing single action"); assert(!(analysis.op0.what & analysis.Condition)); - atomicAt = prepareAt; + postAt = atomicAt = preAt; } mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, - *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); mlir::Operation *secondOp = nullptr; if (analysis.op1.what != analysis.None) { secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, - atomAddr, atom, *get(analysis.op1.expr), - hint, memOrder, atomicAt, prepareAt); + atomAddr, atom, *get(analysis.op1.assign), + hint, memOrder, preAt, atomicAt, postAt); } if (secondOp) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 201b38bd05ff3..f7753a5e5cc59 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } -static bool IsVarOrFunctionRef(const SomeExpr &expr) { - return evaluate::UnwrapProcedureRef(expr) != nullptr || - evaluate::IsVariable(expr); +static bool 
IsVarOrFunctionRef(const MaybeExpr &expr) { + if (expr) { + return evaluate::UnwrapProcedureRef(*expr) != nullptr || + evaluate::IsVariable(*expr); + } else { + return false; + } } static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { @@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource( namespace atomic { +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + enum class Operator { Unk, // Operators that are officially allowed in the update operation @@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (Append(v, std::move(results)), ...); + (MoveAppend(v, std::move(results)), ...); return v; } +}; -private: - static void Append(Result &acc, Result &&data) { - for (auto &&s : data) { - acc.push_back(std::move(s)); +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). 
+ return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + auto copy{x.derived()}; + return {evaluate::AsGenericExpr(std::move(copy)), {}}; } } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; }; } // namespace atomic @@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } +static MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { // Both expr and x have the form of SomeType(SomeKind(...)[1]). // Check if expr is @@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { } } +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { if (value) { expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), @@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { } } +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( - int what, const MaybeExpr &maybeExpr = std::nullopt) { + int what, + const std::optional &maybeAssign = std::nullopt) { parser::OpenMPAtomicConstruct::Analysis::Op operation; operation.what = what; - SetExpr(operation.expr, maybeExpr); + SetAssignment(operation.assign, maybeAssign); return operation; } @@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( // }; // struct Op { // int what; - // TypedExpr expr; + // TypedAssignment assign; // }; // TypedExpr atom, cond; // Op op0, op1; @@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, } } +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + /// Check 
if `expr` satisfies the following conditions for x and v: /// /// [6.0:189:10-12] @@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture( // // The two allowed cases are: // x = ... atomic-var = ... - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // or - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // x = ... atomic-var = ... // // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture @@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture( // // If the two statements don't fit these criteria, return a pair of default- // constructed values. + using ReturnTy = std::pair; SourcedActionStmt act1{GetActionStmt(ec1)}; SourcedActionStmt act2{GetActionStmt(ec2)}; @@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture( auto isUpdateCapture{ [](const evaluate::Assignment &u, const evaluate::Assignment &c) { - return u.lhs == c.rhs; + return IsSameOrConvertOf(c.rhs, u.lhs); }}; // Do some checks that narrow down the possible choices for the update // and the capture statements. This will help to emit better diagnostics. - bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); - bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; + + auto errorCaptureShouldRead{[&](const parser::CharBlock &source, + const std::string &expr) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US, + expr); + }}; - if (couldBeCapture1) { - if (couldBeCapture2) { - if (isUpdateCapture(as2, as1)) { - if (isUpdateCapture(as1, as2)) { - // If both statements could be captures and both could be updates, - // emit a warning about the ambiguity. - context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); - } - return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); - } else if (isUpdateCapture(as1, as2)) { + auto errorNeitherWorks{[&]() { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US); + }}; + + auto makeSelectionFromDet{[&](int det) -> ReturnTy { + // If det != 0, then the checks unambiguously suggest a specific + // categorization. + // If det == 0, then this function should be called only if the + // checks haven't ruled out any possibility, i.e. when both assigments + // could still be either updates or captures. 
+ if (det > 0) { + // as1 is update, as2 is capture + if (isUpdateCapture(as1, as2)) { return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - context_.Say(source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, - as1.rhs.AsFortran(), as2.rhs.AsFortran()); + errorCaptureShouldRead(act2.source, as1.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } else { // !couldBeCapture2 + } else if (det < 0) { + // as2 is update, as1 is capture if (isUpdateCapture(as2, as1)) { return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } else { - context_.Say(act2.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as1.rhs.AsFortran()); + errorCaptureShouldRead(act1.source, as2.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } - } else { // !couldBeCapture1 - if (couldBeCapture2) { - if (isUpdateCapture(as1, as2)) { - return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); - } else { + } else { + bool updateFirst{isUpdateCapture(as1, as2)}; + bool captureFirst{isUpdateCapture(as2, as1)}; + if (updateFirst && captureFirst) { + // If both assignment could be the update and both could be the + // capture, emit a warning about the ambiguity. context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as2.rhs.AsFortran()); + "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US); + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } - } else { - context_.Say(source, - "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + if (updateFirst != captureFirst) { + const parser::ExecutionPartConstruct *upd{updateFirst ? 
ec1 : ec2}; + const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2}; + return std::make_pair(upd, cap); + } + assert(!updateFirst && !captureFirst); + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); } + }}; + + if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) { + return makeSelectionFromDet(det); } + assert(det == 0 && "Prior checks should have covered det != 0"); - return std::make_pair(nullptr, nullptr); + // If neither of the statements is an RMW update, it could still be a + // "write" update. Pretty much any assignment can be a write update, so + // recompute det with cbu1 = cbu2 = true. + if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) { + return makeSelectionFromDet(writeDet); + } + + // It's only errors from here on. + + if (!cbu1 && !cbu2 && !cbc1 && !cbc2) { + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); + } + + // The remaining cases are that + // - no candidate for update, or for capture, + // - one of the assigments cannot be anything. + + if (!cbu1 && !cbu2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US); + return std::make_pair(nullptr, nullptr); + } else if (!cbc1 && !cbc2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) { + auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source; + context_.Say(src, + "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + // All cases should have been covered. 
+ llvm_unreachable("Unchecked condition"); } void OmpStructureChecker::CheckAtomicCaptureAssignment( const evaluate::Assignment &capture, const SomeExpr &atom, parser::CharBlock source) { - const SomeExpr &cap{capture.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &cap{capture.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, rsrc); - // This part should have been checked prior to callig this function. - assert(capture.rhs == atom && "This canont be a capture assignment"); + // This part should have been checked prior to calling this function. + assert(*GetConvertInput(capture.rhs) == atom && + "This canont be a capture assignment"); CheckStorageOverlap(atom, {cap}, source); } } void OmpStructureChecker::CheckAtomicReadAssignment( const evaluate::Assignment &read, parser::CharBlock source) { - const SomeExpr &atom{read.rhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; - if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + if (auto maybe{GetConvertInput(read.rhs)}) { + const SomeExpr &atom{*maybe}; + + if (!IsVarOrFunctionRef(atom)) { + ErrorShouldBeVariable(atom, rsrc); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } } else { - CheckAtomicVariable(atom, rsrc); - CheckStorageOverlap(atom, {read.lhs}, source); + ErrorShouldBeVariable(read.rhs, rsrc); } } @@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment( // one of the following forms: // x = expr // x => expr - const SomeExpr &atom{write.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{write.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + 
ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, lsrc); CheckStorageOverlap(atom, {write.rhs}, source); @@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( // x = intrinsic-procedure-name (x) // x = intrinsic-procedure-name (x, expr-list) // x = intrinsic-procedure-name (expr-list, x) - const SomeExpr &atom{update.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{update.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); // Skip other checks. return; } @@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( const SomeExpr &cond, parser::CharBlock condSource, const evaluate::Assignment &assign, parser::CharBlock assignSource) { - const SomeExpr &atom{assign.lhs}; auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + const SomeExpr &atom{assign.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, arsrc); // Skip other checks. 
return; } @@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( @@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign), MakeAtomicAnalysisOp(Analysis::None)); } @@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( if (GetActionStmt(&body.front()).stmt == uact.stmt) { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(action, update.rhs), - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + MakeAtomicAnalysisOp(action, update), + MakeAtomicAnalysisOp(Analysis::Read, capture)); } else { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), - MakeAtomicAnalysisOp(action, update.rhs)); + MakeAtomicAnalysisOp(Analysis::Read, capture), + MakeAtomicAnalysisOp(action, update)); } } @@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( if (captureFirst) { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign)); } else { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, 
capAssign.lhs)); + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign)); } } @@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead( if (body.size() == 1) { SourcedActionStmt action{GetActionStmt(&body.front())}; if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { - const SomeExpr &atom{maybeRead->rhs}; CheckAtomicReadAssignment(*maybeRead, action.source); - using Analysis = parser::OpenMPAtomicConstruct::Analysis; - x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), - MakeAtomicAnalysisOp(Analysis::None)); + if (auto maybe{GetConvertInput(maybeRead->rhs)}) { + const SomeExpr &atom{*maybe}; + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead), + MakeAtomicAnalysisOp(Analysis::None)); + } } else { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); @@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index bf6fbf16d0646..835fbe45e1c0e 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -253,6 +253,7 @@ class OmpStructureChecker void CheckStorageOverlap(const evaluate::Expr &, llvm::ArrayRef>, parser::CharBlock); + void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source); void CheckAtomicVariable( const evaluate::Expr &, parser::CharBlock); std::pair&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture 
requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..aa9d2e0ac3ff7 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -1,5 +1,3 @@ -! REQUIRES : openmp_runtime - ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s ! 
CHECK: func.func @_QPatomic_implicit_cast_read() { diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index deb67e7614659..8adb0f1a67409 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -25,7 +25,7 @@ program sample !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement y = x x = y !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index c427ba07d43d8..f808ed916fb7e 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -39,7 +39,7 @@ subroutine f02 subroutine f03 integer :: x - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture !$omp atomic update capture x = x + 1 x = x + 2 @@ -50,7 +50,7 @@ subroutine f04 integer :: x, v !$omp atomic update capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement v = x x = v !$omp end atomic @@ -60,8 +60,8 @@ subroutine f05 integer :: x, v, z !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand 
side of the capture assignment should read z v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 !$omp end atomic end @@ -70,8 +70,8 @@ subroutine f06 integer :: x, v, z !$omp atomic update capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z v = x !$omp end atomic end diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 677b933932b44..5e180aa0bbe5b 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -97,50 +97,50 @@ program sample !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture x = x + 10 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x v = b !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture !$omp 
atomic capture v = 1 x = 4 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) !$omp end atomic >From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 24 Mar 2025 15:38:58 -0500 Subject: [PATCH 13/22] DumpEvExpr: show type --- flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 2f445429a10b5..1553dac3b6687 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,6 +16,7 @@ #include #include +#include #include #include @@ -38,6 +39,17 @@ class DumpEvaluateExpr { } private: + template + struct TypeOf { + static constexpr std::string_view name{TypeOf::get()}; + static constexpr std::string_view get() { + std::string_view v(__PRETTY_FUNCTION__); 
+ v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" + return v; + } + }; + template void Show(const common::Indirection &x) { Show(x.value()); } @@ -76,7 +88,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant"); + Indent("derived constant "s + std::string(TypeOf::name)); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -84,7 +96,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant"); + Print("constant "s + std::string(TypeOf::name)); } } void Show(const Symbol &symbol); @@ -102,7 +114,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator"); + Indent("designator "s + std::string(TypeOf::name)); Show(x.u); Outdent(); } @@ -117,7 +129,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref"); + Indent("function ref "s + std::string(TypeOf::name)); Show(x.proc()); Show(x.arguments()); Outdent(); @@ -127,14 +139,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value"); + Indent("array constructor value "s + std::string(TypeOf::name)); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do"); + Indent("implied do "s + std::string(TypeOf::name)); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -148,20 +160,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op"); + Indent("unary op "s + std::string(TypeOf::name)); Show(op.left()); Outdent(); } 
template void Show(const evaluate::Operation &op) { - Indent("binary op"); + Indent("binary op "s + std::string(TypeOf::name)); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr T"); + Indent("expr <" + std::string(TypeOf::name) + ">"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index aa0b4e0f03398..66cedab94bfb4 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("expr some type"); + Indent("relational some type"); Show(x.u); Outdent(); } >From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 15:14:45 -0500 Subject: [PATCH 14/22] Handle conversion from real to complex via complex constructor --- flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++--- 1 file changed, 47 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dada9c6c2bd6f..ae81dcb5ea150 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,36 +3183,46 @@ struct ConvertCollector using Base::operator(); template // - Result operator()(const evaluate::Designator &x) const { + Result asSomeExpr(const T &x) const { auto copy{x}; return {AsGenericExpr(std::move(copy)), {}}; } + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + template // Result operator()(const evaluate::FunctionRef &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template // Result operator()(const evaluate::Constant &x) const { - auto copy{x}; - return 
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/22] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: }

>From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 16 May 2025 09:14:47 -0500
Subject: [PATCH 16/22] Allow conversion in update operations

---
 flang/include/flang/Semantics/tools.h | 17 ++++-----
 flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++--
 flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++-----------
 .../Semantics/OpenMP/atomic-update-only.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++--
 flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++----------
 .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +-
 7 files changed, 44 insertions(+), 57 deletions(-)

diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h
index 7f1ec59b087a2..9be2feb8ae064 100644
--- a/flang/include/flang/Semantics/tools.h
+++ b/flang/include/flang/Semantics/tools.h
@@ -789,14 +789,15 @@ inline bool checkForSymbolMatch(
 /// return the "expr" but with top-level parentheses stripped.
 std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr);
-/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]).
-/// Check if "expr" is
-/// SomeType(SomeKind(Type(
-/// Convert
-/// SomeKind(...)[2])))
-/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves
-/// TypeCategory.
-bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x);
+/// Check if expr is same as x, or a sequence of Convert operations on x.
+bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x);
+
+/// Strip away any top-level Convert operations (if any exist) and return
+/// the input value. A ComplexConstructor(x, 0) is also considered as a
+/// convert operation.
+/// If the input is not Operation, Designator, FunctionRef or Constant,
+/// it returns std::nullopt.
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic >From 341723713929507c59d528540d32bc2e4213e920 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:21:56 -0500 Subject: [PATCH 17/22] format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e8b1b..0f553541c5ef0 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } >From 2686207342bad511f6d51b20ed923c0d2cc9047b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:22:26 -0500 Subject: [PATCH 18/22] Revert "DumpEvExpr: show type" This reverts commit 40510a3068498d15257cc7d198bce9c8cd71a902. Debug changes accidentally pushed upstream. 
--- flang/include/flang/Semantics/dump-expr.h | 30 +++++++---------------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 10 insertions(+), 22 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b6687..2f445429a10b5 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,7 +16,6 @@ #include #include -#include #include #include @@ -39,17 +38,6 @@ class DumpEvaluateExpr { } private: - template - struct TypeOf { - static constexpr std::string_view name{TypeOf::get()}; - static constexpr std::string_view get() { - std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" - return v; - } - }; - template void Show(const common::Indirection &x) { Show(x.value()); } @@ -88,7 +76,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant "s + std::string(TypeOf::name)); + Indent("derived constant"); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -96,7 +84,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant "s + std::string(TypeOf::name)); + Print("constant"); } } void Show(const Symbol &symbol); @@ -114,7 +102,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator "s + std::string(TypeOf::name)); + Indent("designator"); Show(x.u); Outdent(); } @@ -129,7 +117,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref "s + std::string(TypeOf::name)); + Indent("function ref"); Show(x.proc()); Show(x.arguments()); Outdent(); @@ 
-139,14 +127,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value "s + std::string(TypeOf::name)); + Indent("array constructor value"); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do "s + std::string(TypeOf::name)); + Indent("implied do"); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -160,20 +148,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op "s + std::string(TypeOf::name)); + Indent("unary op"); Show(op.left()); Outdent(); } template void Show(const evaluate::Operation &op) { - Indent("binary op "s + std::string(TypeOf::name)); + Indent("binary op"); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr <" + std::string(TypeOf::name) + ">"); + Indent("expr T"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 66cedab94bfb4..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("relational some type"); + Indent("expr some type"); Show(x.u); Outdent(); } >From c00fc531bcf742c409fc974da94c5b362fa9132c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:37:19 -0500 Subject: [PATCH 19/22] Delete unnecessary static_assert --- flang/lib/Semantics/check-omp-structure.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 6005dda7c26fe..2e59553d5e130 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -21,8 +21,6 @@ namespace Fortran::semantics { -static_assert(std::is_same_v>); - template static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { return !(e == f); >From 45b012c16b77c757a0d09b2a229bad49fed8d26f Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:25 -0500 Subject: [PATCH 20/22] Add missing initializer for 'iff' --- flang/lib/Semantics/check-omp-structure.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 2e59553d5e130..aa1bd136b371f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2815,7 +2815,7 @@ static std::optional AnalyzeConditionalStmt( } } else { AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, - GetActionStmt(std::get(s.t))}; + GetActionStmt(std::get(s.t)), SourcedActionStmt{}}; if (result.ift.stmt) { return result; } >From daeac25991bf14fb08c3accabe068c074afa1eb7 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:47 -0500 Subject: [PATCH 21/22] Add asserts for printing "Identity" as top-level operator --- flang/lib/Semantics/check-omp-structure.cpp | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa1bd136b371f..062b45deac865 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3823,6 +3823,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, atomic::ToString(top.first)); @@ -3852,6 +3853,8 @@ void 
OmpStructureChecker::CheckAtomicUpdateAssignment( "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, atom.AsFortran(), atomic::ToString(top.first)); @@ -3898,16 +3901,20 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, atomic::ToString(top.first)); } break; } + case atomic::Operator::Identity: case atomic::Operator::True: case atomic::Operator::False: break; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, atomic::ToString(top.first)); >From ae121e5c37453af1a4aba7c77939c2c1c45b75fa Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 13:15:58 -0500 Subject: [PATCH 22/22] Explain the use of determinant --- flang/lib/Semantics/check-omp-structure.cpp | 31 ++++++++++++++++++--- 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 062b45deac865..bc6a09b9768ef 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3606,10 +3606,33 @@ OmpStructureChecker::CheckUpdateCapture( // subexpression of the right-hand side. // 2. An assignment could be a capture (cbc) if the right-hand side is // a variable (or a function ref), with potential type conversions. 
- bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; - bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; - bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; - bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; // Can as1 be an update? + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; // Can as2 be an update? + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; // Can 1 be capture? + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; // Can 2 be capture? + + // We want to diagnose cases where both assignments cannot be an update, + // or both cannot be a capture, as well as cases where either assignment + // cannot be any of these two. + // + // If we organize these boolean values into a matrix + // |cbu1 cbu2| + // |cbc1 cbc2| + // then we want to diagnose cases where the matrix has a zero (i.e. "false") + // row or column, including the case where everything is zero. All these + // cases correspond to the determinant of the matrix being 0, which suggests + // that checking the det may be a convenient diagnostic check. There is only + // one additional case where the det is 0, which is when the matrx is all 1 + // ("true"). The "all true" case represents the situation where both + // assignments could be an update as well as a capture. On the other hand, + // whenever det != 0, the roles of the update and the capture can be + // unambiguously assigned to as1 and as2 [1]. + // + // [1] This can be easily verified by hand: there are 10 2x2 matrices with + // det = 0, leaving 6 cases where det != 0: + // 0 1 0 1 1 0 1 0 1 1 1 1 + // 1 0 1 1 0 1 1 1 0 1 1 0 + // In each case the classification is unambiguous. 
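The counting argument in the comment above can be checked mechanically. The following is a minimal standalone sketch, not code from the patch: `det2`, `isDiagnosed` and `countZeroDet` are hypothetical helper names introduced only for illustration. It enumerates all 16 boolean 2x2 matrices and confirms that the determinant is zero exactly when the matrix has an all-false row or column or is all-true, and that this happens in 10 of the 16 cases.

```cpp
// Standalone check of the determinant argument; not part of the patch.
#include <cassert>

// Determinant of the matrix
//   |cbu1 cbu2|
//   |cbc1 cbc2|
// with boolean entries treated as 0 or 1.
constexpr int det2(bool cbu1, bool cbu2, bool cbc1, bool cbc2) {
  return int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1);
}

// True if the matrix has an all-false row or column, or is all-true.
// These are the cases the comment describes as having det == 0.
constexpr bool isDiagnosed(bool cbu1, bool cbu2, bool cbc1, bool cbc2) {
  bool zeroRow = (!cbu1 && !cbu2) || (!cbc1 && !cbc2);
  bool zeroCol = (!cbu1 && !cbc1) || (!cbu2 && !cbc2);
  bool allTrue = cbu1 && cbu2 && cbc1 && cbc2;
  return zeroRow || zeroCol || allTrue;
}

// Counts the matrices (out of 16) whose determinant is zero.
// Returns -1 if det == 0 ever disagrees with isDiagnosed().
constexpr int countZeroDet() {
  int count = 0;
  for (int bits = 0; bits < 16; ++bits) {
    bool a = bits & 1, b = bits & 2, c = bits & 4, d = bits & 8;
    bool zero = det2(a, b, c, d) == 0;
    if (zero != isDiagnosed(a, b, c, d))
      return -1;
    if (zero)
      ++count;
  }
  return count;
}

// 10 matrices have det == 0, leaving 6 with det != 0.
static_assert(countZeroDet() == 10, "expected 10 singular 0/1 matrices");
```

By plain arithmetic, `det2` returns 1 only when `cbu1` and `cbc2` are both true and at least one of `cbu2`, `cbc1` is false, and it returns -1 in the mirrored situation. How the checker maps the sign to the update and capture roles is not shown in this part of the patch, so that mapping is left as an assumption here.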
// |cbu1 cbu2| // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 From flang-commits at lists.llvm.org Thu May 29 20:07:08 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Thu, 29 May 2025 20:07:08 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <683920dc.170a0220.1d2b6c.c954@mx.google.com> https://github.com/fanju110 updated https://github.com/llvm/llvm-project/pull/136098 >From 9494c9752400e4708dbc8b6a5ca4993ea9565e95 Mon Sep 17 00:00:00 2001 From: fanyikang Date: Thu, 17 Apr 2025 15:17:07 +0800 Subject: [PATCH 01/14] Add support for IR PGO (-fprofile-generate/-fprofile-use=/file) This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags: -fprofile-generate for instrumentation-based profile generation -fprofile-use=/file for profile-guided optimization Co-Authored-By: ict-ql <168183727+ict-ql at users.noreply.github.com> --- clang/include/clang/Driver/Options.td | 4 +- clang/lib/Driver/ToolChains/Flang.cpp | 8 +++ .../include/flang/Frontend/CodeGenOptions.def | 5 ++ flang/include/flang/Frontend/CodeGenOptions.h | 49 +++++++++++++++++ flang/lib/Frontend/CompilerInvocation.cpp | 12 +++++ flang/lib/Frontend/FrontendActions.cpp | 54 +++++++++++++++++++ flang/test/Driver/flang-f-opts.f90 | 5 ++ .../Inputs/gcc-flag-compatibility_IR.proftext | 19 +++++++ .../gcc-flag-compatibility_IR_entry.proftext | 14 +++++ flang/test/Profile/gcc-flag-compatibility.f90 | 39 ++++++++++++++ 10 files changed, 207 insertions(+), 2 deletions(-) create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext create mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext create mode 100644 flang/test/Profile/gcc-flag-compatibility.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index affc076a876ad..0b0dbc467c1e0 100644 --- 
a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1756,7 +1756,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFFFFFE">; def fprofile_generate : Flag<["-"], "fprofile-generate">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">, Group, Visibility<[ClangOption, CLOption]>, @@ -1773,7 +1773,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group, Visibility<[ClangOption, CLOption]>, Alias; def fprofile_use_EQ : Joined<["-"], "fprofile-use=">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, MetaVarName<"">, HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. Otherwise, it reads from file .">; def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">, diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index a8b4688aed09c..fcdbe8a6aba5a 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -882,6 +882,14 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); + + if (Args.hasArg(options::OPT_fprofile_generate)){ + CmdArgs.push_back("-fprofile-generate"); + } + if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) { + CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue())); + } + // Forward flags for OpenMP. 
We don't do this if the current action is an // device offloading action other than OpenMP. if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ, diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 57830bf51a1b3..4dec86cd8f51b 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,6 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Whether emit extra debug info for sample pgo profile collection. +CODEGENOPT(DebugInfoForProfiling, 1, 0) +CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..e052250f97e75 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,6 +148,55 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; + enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. + }; + + + /// Name of the profile file to use as output for -fprofile-instr-generate, + /// -fprofile-generate, and -fcs-profile-generate. 
+ std::string InstrProfileOutput; + + /// Name of the profile file to use as input for -fmemory-profile-use. + std::string MemoryProfileUsePath; + + unsigned int DebugInfoForProfiling; + + unsigned int AtomicProfileUpdate; + + /// Name of the profile file to use as input for -fprofile-instr-use + std::string ProfileInstrumentUsePath; + + /// Name of the profile remapping file to apply to the profile data supplied + /// by -fprofile-sample-use or -fprofile-instr-use. + std::string ProfileRemappingFile; + + /// Check if Clang profile instrumenation is on. + bool hasProfileClangInstr() const { + return getProfileInstr() == ProfileClangInstr; + } + + /// Check if IR level profile instrumentation is on. + bool hasProfileIRInstr() const { + return getProfileInstr() == ProfileIRInstr; + } + + /// Check if CS IR level profile instrumentation is on. + bool hasProfileCSIRInstr() const { + return getProfileInstr() == ProfileCSIRInstr; + } + /// Check if IR level profile use is on. + bool hasProfileIRUse() const { + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; + } + /// Check if CSIR profile use is on. + bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 6f87a18d69c3d..f013fce2f3cfc 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -27,6 +27,7 @@ #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" +#include "clang/Driver/Driver.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" @@ -431,6 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, opts.IsPIE = 1; } + if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { + opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); + opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + } + + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { + opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.ProfileInstrumentUsePath = A->getValue(); + } + // -mcmodel option. 
if (const llvm::opt::Arg *a = args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) { diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index c1f47b12abee2..68880bdeecf8d 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -63,11 +63,14 @@ #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Target/TargetMachine.h" #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" #include "llvm/Transforms/Utils/ModuleUtils.h" +#include "llvm/Transforms/Instrumentation/InstrProfiling.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include #include @@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// + +static llvm::cl::opt ClPGOColdFuncAttr( + "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden, + llvm::cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); + bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -892,6 +909,20 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } + +// Default filename used for profile generation. 
+namespace llvm { + extern llvm::cl::opt DebugInfoCorrelate; + extern llvm::cl::opt ProfileCorrelate; + + +std::string getDefaultProfileGenName() { + return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE + ? "default_%m.proflite" + : "default_%m.profraw"; +} +} + void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts(); @@ -909,6 +940,29 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; + + if (opts.hasProfileIRInstr()){ + // // -fprofile-generate. + pgoOpt = llvm::PGOOptions( + opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + } + else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse + : llvm::PGOOptions::NoCSAction; + pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "", + opts.ProfileRemappingFile, + opts.MemoryProfileUsePath, VFS, + llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling); + } + llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90 index 4493a519e2010..b972b9b7b2a59 100644 --- a/flang/test/Driver/flang-f-opts.f90 +++ b/flang/test/Driver/flang-f-opts.f90 @@ -8,3 +8,8 @@ ! CHECK-LABEL: "-fc1" ! CHECK: -ffp-contract=off ! CHECK: -O3 + +! 
RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s +! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate" +! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s +! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}" diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext new file mode 100644 index 0000000000000..6a6df8b1d4d5b --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext @@ -0,0 +1,19 @@ +# IR level Instrumentation Flag +:ir +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 + +main +# Func Hash: +742261418966908927 +# Num Counters: +1 +# Counter Values: +1 + diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext new file mode 100644 index 0000000000000..9a46140286673 --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext @@ -0,0 +1,14 @@ +# IR level Instrumentation Flag +:ir +:entry_first +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 + + + diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 new file mode 100644 index 0000000000000..0124bc79b87ef --- /dev/null +++ b/flang/test/Profile/gcc-flag-compatibility.f90 @@ -0,0 +1,39 @@ +! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two +! flags behave similarly to their GCC counterparts: +! +! -fprofile-generate Generates the profile file ./default.profraw +! -fprofile-use=/file Uses the profile file /file + +! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto +! 
RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s +! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section +! PROFILE-GEN: @__profd_{{_?}}main = + + + +! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof +! This uses LLVM IR format profile. +! RUN: rm -rf %t.dir +! RUN: mkdir -p %t.dir/some/path +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s +! + + + +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s +! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} +! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} + + +program main + implicit none + integer :: i + integer :: X = 0 + + do i = 0, 99 + X = X + i + end do + +end program main >From b897c7aa1e21dfe46b4acf709f3ea38d9021c164 Mon Sep 17 00:00:00 2001 From: FYK Date: Wed, 23 Apr 2025 09:56:14 +0800 Subject: [PATCH 02/14] Update flang/lib/Frontend/FrontendActions.cpp Remove redundant comment symbols Co-authored-by: Tom Eccles --- flang/lib/Frontend/FrontendActions.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 68880bdeecf8d..cd13a6aca92cd 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -942,7 +942,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { std::optional pgoOpt; if (opts.hasProfileIRInstr()){ - // // -fprofile-generate. + // -fprofile-generate. pgoOpt = llvm::PGOOptions( opts.InstrProfileOutput.empty() ? 
llvm::getDefaultProfileGenName() : opts.InstrProfileOutput, >From bc5adfcc4ac3456f587bedd48c1a8892d27e53ae Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 25 Apr 2025 11:48:30 +0800 Subject: [PATCH 03/14] format code with clang-format --- flang/include/flang/Frontend/CodeGenOptions.h | 17 ++-- flang/lib/Frontend/CompilerInvocation.cpp | 15 ++-- flang/lib/Frontend/FrontendActions.cpp | 83 +++++++++---------- .../Inputs/gcc-flag-compatibility_IR.proftext | 3 +- .../gcc-flag-compatibility_IR_entry.proftext | 5 +- 5 files changed, 59 insertions(+), 64 deletions(-) diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index e052250f97e75..c9577862df832 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -156,7 +156,6 @@ class CodeGenOptions : public CodeGenOptionsBase { ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -171,7 +170,7 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; - /// Name of the profile remapping file to apply to the profile data supplied + /// Name of the profile remapping file to apply to the profile data supplied /// by -fprofile-sample-use or -fprofile-instr-use. std::string ProfileRemappingFile; @@ -181,19 +180,17 @@ class CodeGenOptions : public CodeGenOptionsBase { } /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; - } + bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. 
bool hasProfileCSIRInstr() const { return getProfileInstr() == ProfileCSIRInstr; } - /// Check if IR level profile use is on. - bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; - } + /// Check if IR level profile use is on. + bool hasProfileIRUse() const { + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; + } /// Check if CSIR profile use is on. bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index f013fce2f3cfc..b28c2c0047579 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -27,7 +27,6 @@ #include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" -#include "clang/Driver/Driver.h" #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" @@ -433,13 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr( + Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.DebugInfoForProfiling = + args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); + opts.AtomicProfileUpdate = + args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); } - + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse( + 
Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); } diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index cd13a6aca92cd..8d1ab670e4db4 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -56,21 +56,21 @@ #include "llvm/Passes/PassBuilder.h" #include "llvm/Passes/PassPlugin.h" #include "llvm/Passes/StandardInstrumentations.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/Error.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/FileSystem.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" -#include "llvm/Support/PGOOptions.h" #include "llvm/Target/TargetMachine.h" #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" -#include "llvm/Transforms/Utils/ModuleUtils.h" #include "llvm/Transforms/Instrumentation/InstrProfiling.h" -#include "llvm/ProfileData/InstrProfCorrelator.h" +#include "llvm/Transforms/Utils/ModuleUtils.h" #include #include @@ -133,19 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// - static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions 
with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); + "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), + llvm::cl::Hidden, + llvm::cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, + "default", "Default (no attribute)"), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, + "optsize", "Mark cold functions with optsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, + "minsize", "Mark cold functions with minsize."), + clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, + "optnone", + "Mark cold functions with optnone."))); bool PrescanAction::beginSourceFileAction() { return runPrescan(); } @@ -909,19 +910,18 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags, delete tlii; } - // Default filename used for profile generation. namespace llvm { - extern llvm::cl::opt DebugInfoCorrelate; - extern llvm::cl::opt ProfileCorrelate; - +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt ProfileCorrelate; std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE + return DebugInfoCorrelate || + ProfileCorrelate != llvm::InstrProfCorrelator::NONE ? "default_%m.proflite" : "default_%m.profraw"; } -} +} // namespace llvm void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { CompilerInstance &ci = getInstance(); @@ -940,29 +940,28 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; - - if (opts.hasProfileIRInstr()){ + + if (opts.hasProfileIRInstr()) { // -fprofile-generate. pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? 
llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); - } - else if (opts.hasProfileIRUse()) { - llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); - // -fprofile-use. - auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse - : llvm::PGOOptions::NoCSAction; - pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "", - opts.ProfileRemappingFile, - opts.MemoryProfileUsePath, VFS, - llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling); - } - + opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, + opts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + } else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = + llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? 
llvm::PGOOptions::CSIRUse + : llvm::PGOOptions::NoCSAction; + pgoOpt = llvm::PGOOptions( + opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, + opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, + ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + } + llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext index 6a6df8b1d4d5b..2650fb5ebfd35 100644 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext @@ -15,5 +15,4 @@ main # Num Counters: 1 # Counter Values: -1 - +1 \ No newline at end of file diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext index 9a46140286673..c4a2a26557e80 100644 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext @@ -8,7 +8,4 @@ _QQmain 2 # Counter Values: 100 -1 - - - +1 \ No newline at end of file >From d64d9d95fb97d6cfa4bf4192bfb20f5c8d6b3bc3 Mon Sep 17 00:00:00 2001 From: fanju110 Date: Fri, 25 Apr 2025 11:53:47 +0800 Subject: [PATCH 04/14] simplify push_back usage --- clang/lib/Driver/ToolChains/Flang.cpp | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index fcdbe8a6aba5a..9c7e87c455e44 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -882,13 +882,9 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); - - if (Args.hasArg(options::OPT_fprofile_generate)){ - 
CmdArgs.push_back("-fprofile-generate");
-  }
-  if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) {
-    CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue()));
-  }
+  // recognise options: fprofile-generate -fprofile-use=
+  Args.addAllArgs(
+      CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ});

   // Forward flags for OpenMP. We don't do this if the current action is an
   // device offloading action other than OpenMP.

>From 22475a85d24b22fb44ca5a5ce26542b556bae280 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Mon, 28 Apr 2025 20:33:54 +0800
Subject: [PATCH 05/14] Port the getDefaultProfileGenName definition and the ProfileInstrKind definition from clang to the llvm namespace to allow flang to reuse these code.

---
 clang/include/clang/Basic/CodeGenOptions.def | 4 +-
 clang/include/clang/Basic/CodeGenOptions.h | 22 +++++---
 clang/include/clang/Basic/ProfileList.h | 9 ++--
 clang/include/clang/CodeGen/BackendUtil.h | 3 ++
 clang/lib/Basic/ProfileList.cpp | 20 ++++----
 clang/lib/CodeGen/BackendUtil.cpp | 50 ++++++-------------
 clang/lib/CodeGen/CodeGenAction.cpp | 4 +-
 clang/lib/CodeGen/CodeGenFunction.cpp | 3 +-
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 clang/lib/Frontend/CompilerInvocation.cpp | 6 +--
 .../include/flang/Frontend/CodeGenOptions.def | 8 +--
 flang/include/flang/Frontend/CodeGenOptions.h | 28 ++++-------
 .../include/flang/Frontend/FrontendActions.h | 5 ++
 flang/lib/Frontend/CompilerInvocation.cpp | 11 ++--
 flang/lib/Frontend/FrontendActions.cpp | 28 +++--------
 .../llvm/Frontend/Driver/CodeGenOptions.h | 15 +++++-
 llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 25 ++++++++++
 17 files changed, 123 insertions(+), 120 deletions(-)

diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index a436c0ec98d5b..3dbadc85f23cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -223,9 +223,9 @@
AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index e39a73bdb13ac..e4a74cd2c12cb 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; + return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if any form of instrumentation is on. 
- bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } + bool hasProfileInstr() const { + return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; + } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == ProfileClangInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + } /// Check if type and variable info should be emitted. bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff 
--git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h index 92e0d13bf25b6..d9abf7bf962d2 100644 --- a/clang/include/clang/CodeGen/BackendUtil.h +++ b/clang/include/clang/CodeGen/BackendUtil.h @@ -8,6 +8,8 @@ #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include "clang/Basic/LLVM.h" #include "llvm/IR/ModuleSummaryIndex.h" @@ -19,6 +21,7 @@ template class Expected; template class IntrusiveRefCntPtr; class Module; class MemoryBufferRef; +extern cl::opt ClPGOColdFuncAttr; namespace vfs { class FileSystem; } // namespace vfs diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0c5aa6de3f3c0 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if 
(SCL->inSection(Section, "default", "allow")) @@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 7557cb8408921..963ed321b2cb9 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -103,38 +103,16 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP( "sanitizer-early-opt-ep", cl::Optional, cl::desc("Insert sanitizers on OptimizerEarlyEP.")); -// Experiment to mark cold functions as optsize/minsize/optnone. -// TODO: remove once this is exposed as a proper driver flag. 
-static cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, - cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions with minsize."), - clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); - extern cl::opt ProfileCorrelate; } // namespace llvm namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? "default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -834,12 +812,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. 
- PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse @@ -847,14 +825,14 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) @@ -863,15 +841,15 @@ void EmitAssemblyHelper::RunOptimizationPipeline( ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling - PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", 
nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4d29ceace646f..10f0a12773498 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 0154799498f5e..2e8e90493c370 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, 
// If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index cfc5c069b0849..ece7e5cad774c 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index 4dec86cd8f51b..68396c4c40711 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) -ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +/// Choose profile instrumenation kind or no instrumentation. 
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Whether emit extra debug info for sample pgo profile collection. -CODEGENOPT(DebugInfoForProfiling, 1, 0) -CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index c9577862df832..6a989f0c0775e 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. - }; - /// Name of the profile file to use as output for -fprofile-instr-generate, /// -fprofile-generate, and -fcs-profile-generate. std::string InstrProfileOutput; @@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Name of the profile file to use as input for -fmemory-profile-use. 
std::string MemoryProfileUsePath; - unsigned int DebugInfoForProfiling; - - unsigned int AtomicProfileUpdate; - /// Name of the profile file to use as input for -fprofile-instr-use std::string ProfileInstrumentUsePath; @@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == llvm::driver::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; } + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h index f9a45bd6c0a56..9ba74a9dad9be 100644 --- a/flang/include/flang/Frontend/FrontendActions.h +++ b/flang/include/flang/Frontend/FrontendActions.h @@ -20,8 +20,13 @@ #include "mlir/IR/OwningOpRef.h" #include "llvm/ADT/StringRef.h" #include "llvm/IR/Module.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/PGOOptions.h" #include +namespace llvm { +extern cl::opt ClPGOColdFuncAttr; +} // namespace llvm namespace Fortran::frontend { //===----------------------------------------------------------------------===// diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index b28c2c0047579..43c8f4d596903 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, } if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); - opts.DebugInfoForProfiling = - args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling); - opts.AtomicProfileUpdate = - args.hasArg(clang::driver::options::OPT_fprofile_update_EQ); + opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); } if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse( - Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr); + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); opts.ProfileInstrumentUsePath = A->getValue(); 
} diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 8d1ab670e4db4..c758aa18fbb8e 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -28,6 +28,7 @@ #include "flang/Semantics/unparse-with-symbols.h" #include "flang/Support/default-kinds.h" #include "flang/Tools/CrossToolHelpers.h" +#include "clang/CodeGen/BackendUtil.h" #include "mlir/IR/Dialect.h" #include "mlir/Parser/Parser.h" @@ -133,21 +134,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci, // Custom BeginSourceFileAction //===----------------------------------------------------------------------===// -static llvm::cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), - llvm::cl::Hidden, - llvm::cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, - "default", "Default (no attribute)"), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, - "optsize", "Mark cold functions with optsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, - "minsize", "Mark cold functions with minsize."), - clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, - "optnone", - "Mark cold functions with optnone."))); - bool PrescanAction::beginSourceFileAction() { return runPrescan(); } bool PrescanAndParseAction::beginSourceFileAction() { @@ -944,12 +930,12 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { if (opts.hasProfileIRInstr()) { // -fprofile-generate. pgoOpt = llvm::PGOOptions( - opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName() - : opts.InstrProfileOutput, + opts.InstrProfileOutput.empty() + ? 
llvm::driver::getDefaultProfileGenName() + : opts.InstrProfileOutput, "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr, - opts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate); + llvm::PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, false, + /*PseudoProbeForProfiling=*/false, false); } else if (opts.hasProfileIRUse()) { llvm::IntrusiveRefCntPtr VFS = llvm::vfs::getRealFileSystem(); @@ -959,7 +945,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { pgoOpt = llvm::PGOOptions( opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, - ClPGOColdFuncAttr, opts.DebugInfoForProfiling); + llvm::ClPGOColdFuncAttr, false); } llvm::StandardInstrumentations si(llvmModule->getContext(), diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index c51476e9ad3fe..6188c20cb29cf 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,9 +13,14 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H +#include "llvm/ProfileData/InstrProfCorrelator.h" +#include namespace llvm { class Triple; class TargetLibraryInfoImpl; +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt + ProfileCorrelate; } // namespace llvm namespace llvm::driver { @@ -35,7 +40,15 @@ enum class VectorLibrary { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); - +enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. 
+};
+// Default filename used for profile generation.
+std::string getDefaultProfileGenName();
 } // end namespace llvm::driver

 #endif
diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
index ed7c57a930aca..818dcd3752437 100644
--- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
+++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
@@ -8,7 +8,26 @@

 #include "llvm/Frontend/Driver/CodeGenOptions.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
+#include "llvm/Support/PGOOptions.h"
 #include "llvm/TargetParser/Triple.h"

+namespace llvm {
+// Experiment to mark cold functions as optsize/minsize/optnone.
+// TODO: remove once this is exposed as a proper driver flag.
+cl::opt ClPGOColdFuncAttr(
+    "pgo-cold-func-opt", cl::init(llvm::PGOOptions::ColdFuncOpt::Default),
+    cl::Hidden,
+    cl::desc(
+        "Function attribute to apply to cold functions as determined by PGO"),
+    cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default",
+                          "Default (no attribute)"),
+               clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize",
+                          "Mark cold functions with optsize."),
+               clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize",
+                          "Mark cold functions with minsize."),
+               clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone",
+                          "Mark cold functions with optnone.")));
+} // namespace llvm
 namespace llvm::driver {

@@ -56,4 +75,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
   return TLII;
 }

+std::string getDefaultProfileGenName() {
+  return llvm::DebugInfoCorrelate ||
+                 llvm::ProfileCorrelate != InstrProfCorrelator::NONE
+             ? "default_%m.proflite"
+             : "default_%m.profraw";
+}
 } // namespace llvm::driver

>From e53e689985088bbcdc253950a2ecc715592b5b3a Mon Sep 17 00:00:00 2001
From: fanju110
Date: Mon, 28 Apr 2025 21:49:36 +0800
Subject: [PATCH 06/14] Remove redundant function definitions

---
 flang/lib/Frontend/FrontendActions.cpp | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index c758aa18fbb8e..cdd2853bcd201 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -896,18 +896,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
   delete tlii;
 }

-// Default filename used for profile generation.
-namespace llvm {
-extern llvm::cl::opt DebugInfoCorrelate;
-extern llvm::cl::opt ProfileCorrelate;
-
-std::string getDefaultProfileGenName() {
-  return DebugInfoCorrelate ||
-                 ProfileCorrelate != llvm::InstrProfCorrelator::NONE
-             ? "default_%m.proflite"
-             : "default_%m.profraw";
-}
-} // namespace llvm

 void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   CompilerInstance &ci = getInstance();

>From 248175453354fecd078f5553576d16ce810e7808 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Wed, 30 Apr 2025 17:12:32 +0800
Subject: [PATCH 07/14] Move the interface to the cpp that uses it

---
 clang/include/clang/Basic/CodeGenOptions.def | 4 +-
 clang/include/clang/Basic/CodeGenOptions.h | 22 +++++----
 clang/include/clang/Basic/ProfileList.h | 9 ++--
 clang/lib/Basic/ProfileList.cpp | 20 ++++-----
 clang/lib/CodeGen/BackendUtil.cpp | 37 +++++++--------
 clang/lib/CodeGen/CodeGenAction.cpp | 4 +-
 clang/lib/CodeGen/CodeGenFunction.cpp | 3 +-
 clang/lib/CodeGen/CodeGenModule.cpp | 2 +-
 clang/lib/Frontend/CompilerInvocation.cpp | 6 +--
 .../include/flang/Frontend/CodeGenOptions.def | 8 ++--
 flang/include/flang/Frontend/CodeGenOptions.h | 28 +++++-------
 flang/lib/Frontend/CompilerInvocation.cpp | 11 ++---
 flang/lib/Frontend/FrontendActions.cpp | 45 ++++---------------
 .../llvm/Frontend/Driver/CodeGenOptions.h | 10 +++++
 llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 12 +++++
 15 files changed, 101 insertions(+), 120 deletions(-)

diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index a436c0ec98d5b..3dbadc85f23cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -223,9 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is
 CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling
 /// Choose profile instrumenation kind or no instrumentation.
-ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Choose profile kind for PGO use compilation.
-ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Partition functions into N groups and select only functions in group i to be
 /// instrumented. Selected group numbers can be 0 to N-1 inclusive.
 VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1)
diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h
index e39a73bdb13ac..e4a74cd2c12cb 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -509,35 +509,41 @@ class CodeGenOptions : public CodeGenOptionsBase {

   /// Check if Clang profile instrumenation is on.
   bool hasProfileClangInstr() const {
-    return getProfileInstr() == ProfileClangInstr;
+    return getProfileInstr() ==
+           llvm::driver::ProfileInstrKind::ProfileClangInstr;
   }

   /// Check if IR level profile instrumentation is on.
   bool hasProfileIRInstr() const {
-    return getProfileInstr() == ProfileIRInstr;
+    return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr;
   }

   /// Check if CS IR level profile instrumentation is on.
   bool hasProfileCSIRInstr() const {
-    return getProfileInstr() == ProfileCSIRInstr;
+    return getProfileInstr() ==
+           llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
   }

   /// Check if any form of instrumentation is on.
-  bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; }
+  bool hasProfileInstr() const {
+    return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone;
+  }

   /// Check if Clang profile use is on.
   bool hasProfileClangUse() const {
-    return getProfileUse() == ProfileClangInstr;
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr;
   }

   /// Check if IR level profile use is on.
   bool hasProfileIRUse() const {
-    return getProfileUse() == ProfileIRInstr ||
-           getProfileUse() == ProfileCSIRInstr;
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr ||
+           getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
   }

   /// Check if CSIR profile use is on.
-  bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }
+  bool hasProfileCSIRUse() const {
+    return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr;
+  }

   /// Check if type and variable info should be emitted.
   bool hasReducedDebugInfo() const {
diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h
index b4217e49c18a3..5338ef3992ade 100644
--- a/clang/include/clang/Basic/ProfileList.h
+++ b/clang/include/clang/Basic/ProfileList.h
@@ -49,17 +49,16 @@ class ProfileList {
   ~ProfileList();

   bool isEmpty() const { return Empty; }
-  ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const;
+  ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const;

   std::optional
   isFunctionExcluded(StringRef FunctionName,
-                     CodeGenOptions::ProfileInstrKind Kind) const;
+                     llvm::driver::ProfileInstrKind Kind) const;
   std::optional
   isLocationExcluded(SourceLocation Loc,
-                     CodeGenOptions::ProfileInstrKind Kind) const;
+                     llvm::driver::ProfileInstrKind Kind) const;
   std::optional
-  isFileExcluded(StringRef FileName,
-                 CodeGenOptions::ProfileInstrKind Kind) const;
+  isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const;
 };

 } // namespace clang
diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp
index 8fa16e2eb069a..0c5aa6de3f3c0 100644
--- a/clang/lib/Basic/ProfileList.cpp
+++ b/clang/lib/Basic/ProfileList.cpp
@@ -71,22 +71,22 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM)

 ProfileList::~ProfileList() = default;

-static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) {
+static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) {
   switch (Kind) {
-  case CodeGenOptions::ProfileNone:
+  case llvm::driver::ProfileInstrKind::ProfileNone:
     return "";
-  case CodeGenOptions::ProfileClangInstr:
+  case llvm::driver::ProfileInstrKind::ProfileClangInstr:
     return "clang";
-  case CodeGenOptions::ProfileIRInstr:
+  case llvm::driver::ProfileInstrKind::ProfileIRInstr:
     return "llvm";
-  case CodeGenOptions::ProfileCSIRInstr:
+  case llvm::driver::ProfileInstrKind::ProfileCSIRInstr:
     return "csllvm";
   }
-  llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum");
+  llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum");
 }

 ProfileList::ExclusionType
-ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const {
+ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const {
   StringRef Section = getSectionName(Kind);
   // Check for "default:"
   if (SCL->inSection(Section, "default", "allow"))
@@ -117,7 +117,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix,

 std::optional
 ProfileList::isFunctionExcluded(StringRef FunctionName,
-                                CodeGenOptions::ProfileInstrKind Kind) const {
+                                llvm::driver::ProfileInstrKind Kind) const {
   StringRef Section = getSectionName(Kind);
   // Check for "function:="
   if (auto V = inSection(Section, "function", FunctionName))
@@ -131,13 +131,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName,

 std::optional
 ProfileList::isLocationExcluded(SourceLocation Loc,
-                                CodeGenOptions::ProfileInstrKind Kind) const {
+                                llvm::driver::ProfileInstrKind Kind) const {
   return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind);
 }

 std::optional
 ProfileList::isFileExcluded(StringRef FileName,
-                            CodeGenOptions::ProfileInstrKind Kind) const {
+                            llvm::driver::ProfileInstrKind Kind) const {
   StringRef Section = getSectionName(Kind);
   // Check for "source:="
   if (auto V = inSection(Section, "source", FileName))
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 7557cb8408921..592e3bbbcc1cf 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -124,17 +124,10 @@ namespace clang {
 extern llvm::cl::opt ClSanitizeGuardChecks;
 }

-// Default filename used for profile generation.
-static std::string getDefaultProfileGenName() {
-  return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE
-             ? "default_%m.proflite"
-             : "default_%m.profraw";
-}
-
 // Path and name of file used for profile generation
 static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) {
   std::string FileName = CodeGenOpts.InstrProfileOutput.empty()
-                             ? getDefaultProfileGenName()
+                             ? llvm::driver::getDefaultProfileGenName()
                              : CodeGenOpts.InstrProfileOutput;
   if (CodeGenOpts.ContinuousProfileSync)
     FileName = "%c" + FileName;
@@ -834,12 +827,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline(

   if (CodeGenOpts.hasProfileIRInstr())
     // -fprofile-generate.
-    PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "",
-                        CodeGenOpts.MemoryProfileUsePath, nullptr,
-                        PGOOptions::IRInstr, PGOOptions::NoCSAction,
-                        ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling,
-                        /*PseudoProbeForProfiling=*/false,
-                        CodeGenOpts.AtomicProfileUpdate);
+    PGOOpt = PGOOptions(
+        getProfileGenName(CodeGenOpts), "", "",
+        CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr,
+        PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr,
+        CodeGenOpts.DebugInfoForProfiling,
+        /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate);
   else if (CodeGenOpts.hasProfileIRUse()) {
     // -fprofile-use.
     auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse
@@ -847,31 +840,31 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
     PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "",
                         CodeGenOpts.ProfileRemappingFile,
                         CodeGenOpts.MemoryProfileUsePath, VFS,
-                        PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr,
+                        PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr,
                         CodeGenOpts.DebugInfoForProfiling);
   } else if (!CodeGenOpts.SampleProfileFile.empty())
     // -fprofile-sample-use
     PGOOpt = PGOOptions(
         CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile,
         CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse,
-        PGOOptions::NoCSAction, ClPGOColdFuncAttr,
+        PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr,
         CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling);
   else if (!CodeGenOpts.MemoryProfileUsePath.empty())
     // -fmemory-profile-use (without any of the above options)
     PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS,
                         PGOOptions::NoAction, PGOOptions::NoCSAction,
-                        ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling);
+                        llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling);
   else if (CodeGenOpts.PseudoProbeForProfiling)
     // -fpseudo-probe-for-profiling
-    PGOOpt =
-        PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr,
-                   PGOOptions::NoAction, PGOOptions::NoCSAction,
-                   ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true);
+    PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr,
+                        PGOOptions::NoAction, PGOOptions::NoCSAction,
+                        llvm::ClPGOColdFuncAttr,
+                        CodeGenOpts.DebugInfoForProfiling, true);
   else if (CodeGenOpts.DebugInfoForProfiling)
     // -fdebug-info-for-profiling
     PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr,
                         PGOOptions::NoAction, PGOOptions::NoCSAction,
-                        ClPGOColdFuncAttr, true);
+                        llvm::ClPGOColdFuncAttr, true);

   // Check to see if we want to generate a CS profile.
   if (CodeGenOpts.hasProfileCSIRInstr()) {
diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp
index 1f5eb427b566f..5493cc92bd8b0 100644
--- a/clang/lib/CodeGen/CodeGenAction.cpp
+++ b/clang/lib/CodeGen/CodeGenAction.cpp
@@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) {
   std::unique_ptr OptRecordFile =
       std::move(*OptRecordFileOrErr);

-  if (OptRecordFile &&
-      CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone)
+  if (OptRecordFile && CodeGenOpts.getProfileUse() !=
+                           llvm::driver::ProfileInstrKind::ProfileNone)
     Ctx.setDiagnosticsHotnessRequested(true);

   if (CodeGenOpts.MisExpect) {
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index 4d29ceace646f..10f0a12773498 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -951,7 +951,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
     }
   }

-  if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) {
+  if (CGM.getCodeGenOpts().getProfileInstr() !=
+      llvm::driver::ProfileInstrKind::ProfileNone) {
     switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) {
     case ProfileList::Skip:
       Fn->addFnAttr(llvm::Attribute::SkipProfile);
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 0154799498f5e..2e8e90493c370 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -3568,7 +3568,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn,
   // If the profile list is empty, then instrument everything.
   if (ProfileList.isEmpty())
     return ProfileList::Allow;
-  CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr();
+  llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr();
   // First, check the function name.
   if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind))
     return *V;
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index cfc5c069b0849..ece7e5cad774c 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -1514,11 +1514,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts,
   // which is available (might be one or both).
   if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) {
     if (PGOReader->hasCSIRLevelProfile())
-      Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr);
+      Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr);
     else
-      Opts.setProfileUse(CodeGenOptions::ProfileIRInstr);
+      Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr);
   } else
-    Opts.setProfileUse(CodeGenOptions::ProfileClangInstr);
+    Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr);
 }

 void CompilerInvocation::setDefaultPointerAuthOptions(
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def
index 4dec86cd8f51b..68396c4c40711 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.def
+++ b/flang/include/flang/Frontend/CodeGenOptions.def
@@ -24,11 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified.
 CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
                                    ///< pass manager.

-ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
-ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
+/// Choose profile instrumenation kind or no instrumentation.
+ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
+/// Choose profile kind for PGO use compilation.
+ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone)
 /// Whether emit extra debug info for sample pgo profile collection.
-CODEGENOPT(DebugInfoForProfiling, 1, 0)
-CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level.
 CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module.
 CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the
diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h
index c9577862df832..6a989f0c0775e 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.h
+++ b/flang/include/flang/Frontend/CodeGenOptions.h
@@ -148,14 +148,6 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// OpenMP is enabled.
   using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind;

-  enum ProfileInstrKind {
-    ProfileNone, // Profile instrumentation is turned off.
-    ProfileClangInstr, // Clang instrumentation to generate execution counts
-    // to use with PGO.
-    ProfileIRInstr, // IR level PGO instrumentation in LLVM.
-    ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
-  };
-
   /// Name of the profile file to use as output for -fprofile-instr-generate,
   /// -fprofile-generate, and -fcs-profile-generate.
   std::string InstrProfileOutput;
@@ -163,10 +155,6 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// Name of the profile file to use as input for -fmemory-profile-use.
   std::string MemoryProfileUsePath;

-  unsigned int DebugInfoForProfiling;
-
-  unsigned int AtomicProfileUpdate;
-
   /// Name of the profile file to use as input for -fprofile-instr-use
   std::string ProfileInstrumentUsePath;

@@ -176,23 +164,27 @@ class CodeGenOptions : public CodeGenOptionsBase {

   /// Check if Clang profile instrumenation is on.
   bool hasProfileClangInstr() const {
-    return getProfileInstr() == ProfileClangInstr;
+    return getProfileInstr() == llvm::driver::ProfileClangInstr;
   }

   /// Check if IR level profile instrumentation is on.
-  bool hasProfileIRInstr() const { return getProfileInstr() == ProfileIRInstr; }
+  bool hasProfileIRInstr() const {
+    return getProfileInstr() == llvm::driver::ProfileIRInstr;
+  }

   /// Check if CS IR level profile instrumentation is on.
   bool hasProfileCSIRInstr() const {
-    return getProfileInstr() == ProfileCSIRInstr;
+    return getProfileInstr() == llvm::driver::ProfileCSIRInstr;
   }

   /// Check if IR level profile use is on.
   bool hasProfileIRUse() const {
-    return getProfileUse() == ProfileIRInstr ||
-           getProfileUse() == ProfileCSIRInstr;
+    return getProfileUse() == llvm::driver::ProfileIRInstr ||
+           getProfileUse() == llvm::driver::ProfileCSIRInstr;
   }

   /// Check if CSIR profile use is on.
-  bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }
+  bool hasProfileCSIRUse() const {
+    return getProfileUse() == llvm::driver::ProfileCSIRInstr;
+  }

   // Define accessors/mutators for code generation options of enumeration type.
 #define CODEGENOPT(Name, Bits, Default)
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index b28c2c0047579..43c8f4d596903 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -30,6 +30,7 @@
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/Frontend/Debug/Options.h"
+#include "llvm/Frontend/Driver/CodeGenOptions.h"
 #include "llvm/Option/Arg.h"
 #include "llvm/Option/ArgList.h"
 #include "llvm/Option/OptTable.h"
@@ -432,17 +433,11 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
   }

   if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) {
-    opts.setProfileInstr(
-        Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
-    opts.DebugInfoForProfiling =
-        args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling);
-    opts.AtomicProfileUpdate =
-        args.hasArg(clang::driver::options::OPT_fprofile_update_EQ);
+    opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr);
   }

   if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) {
-    opts.setProfileUse(
-        Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+    opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr);
     opts.ProfileInstrumentUsePath = A->getValue();
   }

diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 8d1ab670e4db4..a650f54620543 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -133,21 +133,6 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci,
 // Custom BeginSourceFileAction
 //===----------------------------------------------------------------------===//

-static llvm::cl::opt ClPGOColdFuncAttr(
-    "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default),
-    llvm::cl::Hidden,
-    llvm::cl::desc(
-        "Function attribute to apply to cold functions as determined by PGO"),
-    llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default,
-                                "default", "Default (no attribute)"),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize,
-                                "optsize", "Mark cold functions with optsize."),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize,
-                                "minsize", "Mark cold functions with minsize."),
-                     clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone,
-                                "optnone",
-                                "Mark cold functions with optnone.")));
-
 bool PrescanAction::beginSourceFileAction() { return runPrescan(); }

 bool PrescanAndParseAction::beginSourceFileAction() {
@@ -910,19 +895,6 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
   delete tlii;
 }

-// Default filename used for profile generation.
-namespace llvm {
-extern llvm::cl::opt DebugInfoCorrelate;
-extern llvm::cl::opt ProfileCorrelate;
-
-std::string getDefaultProfileGenName() {
-  return DebugInfoCorrelate ||
-                 ProfileCorrelate != llvm::InstrProfCorrelator::NONE
-             ? "default_%m.proflite"
-             : "default_%m.profraw";
-}
-} // namespace llvm
-
 void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   CompilerInstance &ci = getInstance();
   const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts();
@@ -943,13 +915,14 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {

   if (opts.hasProfileIRInstr()) {
     // -fprofile-generate.
-    pgoOpt = llvm::PGOOptions(
-        opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
-                                        : opts.InstrProfileOutput,
-        "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr,
-        llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr,
-        opts.DebugInfoForProfiling,
-        /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate);
+    pgoOpt = llvm::PGOOptions(opts.InstrProfileOutput.empty()
+                                  ? llvm::driver::getDefaultProfileGenName()
+                                  : opts.InstrProfileOutput,
+                              "", "", opts.MemoryProfileUsePath, nullptr,
+                              llvm::PGOOptions::IRInstr,
+                              llvm::PGOOptions::NoCSAction,
+                              llvm::PGOOptions::ColdFuncOpt::Default, false,
+                              /*PseudoProbeForProfiling=*/false, false);
   } else if (opts.hasProfileIRUse()) {
     llvm::IntrusiveRefCntPtr VFS =
         llvm::vfs::getRealFileSystem();
@@ -959,7 +932,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
     pgoOpt = llvm::PGOOptions(
         opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile,
         opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction,
-        ClPGOColdFuncAttr, opts.DebugInfoForProfiling);
+        llvm::PGOOptions::ColdFuncOpt::Default, false);
   }

   llvm::StandardInstrumentations si(llvmModule->getContext(),
diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index c51476e9ad3fe..3eb03cc3064cf 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -13,6 +13,7 @@
 #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H
 #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H

+#include
 namespace llvm {
 class Triple;
 class TargetLibraryInfoImpl;
@@ -36,6 +37,15 @@ enum class VectorLibrary {
 TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
                                   VectorLibrary Veclib);

+enum ProfileInstrKind {
+  ProfileNone,       // Profile instrumentation is turned off.
+  ProfileClangInstr, // Clang instrumentation to generate execution counts
+                     // to use with PGO.
+  ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
+  ProfileCSIRInstr,  // IR level PGO context sensitive instrumentation in LLVM.
+};
+// Default filename used for profile generation.
+std::string getDefaultProfileGenName();
 } // end namespace llvm::driver

 #endif
diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
index ed7c57a930aca..14b6b89da8465 100644
--- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
+++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
@@ -8,8 +8,14 @@

 #include "llvm/Frontend/Driver/CodeGenOptions.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
 #include "llvm/TargetParser/Triple.h"

+namespace llvm {
+extern llvm::cl::opt DebugInfoCorrelate;
+extern llvm::cl::opt
+    ProfileCorrelate;
+} // namespace llvm
 namespace llvm::driver {

 TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
@@ -56,4 +62,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
   return TLII;
 }

+std::string getDefaultProfileGenName() {
+  return llvm::DebugInfoCorrelate ||
+                 llvm::ProfileCorrelate != InstrProfCorrelator::NONE
+             ? "default_%m.proflite"
+             : "default_%m.profraw";
+}
 } // namespace llvm::driver

>From 70fea2265a374f59345691f4ad7653ef4f0b6aa6 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Wed, 30 Apr 2025 17:25:15 +0800
Subject: [PATCH 08/14] Move the interface to the cpp that uses it

---
 clang/include/clang/CodeGen/BackendUtil.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/clang/include/clang/CodeGen/BackendUtil.h b/clang/include/clang/CodeGen/BackendUtil.h
index d9abf7bf962d2..92e0d13bf25b6 100644
--- a/clang/include/clang/CodeGen/BackendUtil.h
+++ b/clang/include/clang/CodeGen/BackendUtil.h
@@ -8,8 +8,6 @@
 #ifndef LLVM_CLANG_CODEGEN_BACKENDUTIL_H
 #define LLVM_CLANG_CODEGEN_BACKENDUTIL_H

-#include "llvm/Support/CommandLine.h"
-#include "llvm/Support/PGOOptions.h"
 #include "clang/Basic/LLVM.h"
 #include "llvm/IR/ModuleSummaryIndex.h"

@@ -21,7 +19,6 @@ template class Expected;
 template class IntrusiveRefCntPtr;
 class Module;
 class MemoryBufferRef;
-extern cl::opt ClPGOColdFuncAttr;
 namespace vfs {
 class FileSystem;
 } // namespace vfs

>From 5705d5eff937ca18eb44bec28a967a8629f0c085 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Wed, 30 Apr 2025 17:26:22 +0800
Subject: [PATCH 09/14] Move the interface to the cpp that uses it

---
 flang/include/flang/Frontend/FrontendActions.h | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/flang/include/flang/Frontend/FrontendActions.h b/flang/include/flang/Frontend/FrontendActions.h
index 9ba74a9dad9be..f9a45bd6c0a56 100644
--- a/flang/include/flang/Frontend/FrontendActions.h
+++ b/flang/include/flang/Frontend/FrontendActions.h
@@ -20,13 +20,8 @@
 #include "mlir/IR/OwningOpRef.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/IR/Module.h"
-#include "llvm/Support/CommandLine.h"
-#include "llvm/Support/PGOOptions.h"
 #include

-namespace llvm {
-extern cl::opt ClPGOColdFuncAttr;
-} // namespace llvm
 namespace Fortran::frontend {

 //===----------------------------------------------------------------------===//

>From 016aab17f4cc73416c6ebca61240f269aac837d2 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Wed, 30 Apr 2025 17:34:00 +0800
Subject: [PATCH 10/14] Fill in the missing code

---
 clang/lib/CodeGen/BackendUtil.cpp | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 2d33edbb8430d..6eb3a8638b7d1 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -103,6 +103,21 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP(
     "sanitizer-early-opt-ep", cl::Optional,
     cl::desc("Insert sanitizers on OptimizerEarlyEP."));

+// Experiment to mark cold functions as optsize/minsize/optnone.
+// TODO: remove once this is exposed as a proper driver flag.
+static cl::opt ClPGOColdFuncAttr(
+    "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden,
+    cl::desc(
+        "Function attribute to apply to cold functions as determined by PGO"),
+    cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default",
+                          "Default (no attribute)"),
+               clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize",
+                          "Mark cold functions with optsize."),
+               clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize",
+                          "Mark cold functions with minsize."),
+               clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone",
+                          "Mark cold functions with optnone.")));
+
 extern cl::opt ProfileCorrelate;
 } // namespace llvm
 namespace clang {

>From f36bfcfbfdc87b896f41be1ba25d8c18c339f1c1 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Thu, 1 May 2025 23:18:34 +0800
Subject: [PATCH 11/14] Adjusting the format of the code

---
 flang/test/Profile/gcc-flag-compatibility.f90 | 7 -------
 llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 7 ++++---
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90
index 0124bc79b87ef..4490c45232d28 100644
--- a/flang/test/Profile/gcc-flag-compatibility.f90
+++ b/flang/test/Profile/gcc-flag-compatibility.f90
@@ -9,24 +9,17 @@
 ! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section
 ! PROFILE-GEN: @__profd_{{_?}}main =

-
-
 ! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof
 ! This uses LLVM IR format profile.
 ! RUN: rm -rf %t.dir
 ! RUN: mkdir -p %t.dir/some/path
 ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof
 ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s
-!
-
-
-
 ! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof
 ! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s
 ! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1}
 ! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100}
-
 program main
   implicit none
   integer :: i
diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index 3eb03cc3064cf..98b9e1554f317 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -14,6 +14,7 @@
 #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H

 #include
+
 namespace llvm {
 class Triple;
 class TargetLibraryInfoImpl;
@@ -34,9 +35,6 @@ enum class VectorLibrary {
   AMDLIBM // AMD vector math library.
 };

-TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
-                                  VectorLibrary Veclib);
-
 enum ProfileInstrKind {
   ProfileNone,       // Profile instrumentation is turned off.
   ProfileClangInstr, // Clang instrumentation to generate execution counts
@@ -44,6 +42,9 @@ enum ProfileInstrKind {
   ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
   ProfileCSIRInstr,  // IR level PGO context sensitive instrumentation in LLVM.
 };
+TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
+                                  VectorLibrary Veclib);
+
 // Default filename used for profile generation.
 std::string getDefaultProfileGenName();
 } // end namespace llvm::driver

>From a5c7da77d2aa6909451bed3fb0f02c9b735dc876 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Fri, 2 May 2025 00:01:26 +0800
Subject: [PATCH 12/14] Adjusting the format of the code

---
 llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index 98b9e1554f317..84bba2a964ecf 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -20,6 +20,7 @@ class Triple;
 class TargetLibraryInfoImpl;
 } // namespace llvm

+
 namespace llvm::driver {

 /// Vector library option used with -fveclib=
@@ -42,6 +43,7 @@ enum ProfileInstrKind {
   ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
   ProfileCSIRInstr,  // IR level PGO context sensitive instrumentation in LLVM.
 };
+
 TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
                                   VectorLibrary Veclib);

>From a99e16b29d70d2fea6d16ec06e6ca55f477b74e9 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Fri, 2 May 2025 00:07:23 +0800
Subject: [PATCH 13/14] Adjusting the format of the code

---
 llvm/include/llvm/Frontend/Driver/CodeGenOptions.h | 1 -
 llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
index 84bba2a964ecf..f0baa6fcdbbd3 100644
--- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
+++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h
@@ -20,7 +20,6 @@ class Triple;
 class TargetLibraryInfoImpl;
 } // namespace llvm

-
 namespace llvm::driver {

 /// Vector library option used with -fveclib=
diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
index 14b6b89da8465..c48f5ed68b10b 100644
--- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
+++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
@@ -16,6 +16,7 @@ extern llvm::cl::opt DebugInfoCorrelate;
 extern llvm::cl::opt ProfileCorrelate;
 } // namespace llvm

+
 namespace llvm::driver {

 TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,

>From edc2045f6051bb7903fb088a66d25052d75e8ca7 Mon Sep 17 00:00:00 2001
From: fanju110
Date: Fri, 30 May 2025 10:08:57 +0800
Subject: [PATCH 14/14] Using clang-format for changed files

---
 clang/lib/CodeGen/BackendUtil.cpp | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 8f7ad0e39ba1f..daf49f6cedd0e 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -106,17 +106,17 @@ static cl::opt ClSanitizeOnOptimizerEarlyEP(

 // Experiment to mark cold functions as optsize/minsize/optnone.
// TODO: remove once this is exposed as a proper driver flag. static cl::opt ClPGOColdFuncAttr( - "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, - cl::desc( - "Function attribute to apply to cold functions as determined by PGO"), - cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", - "Default (no attribute)"), - clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", - "Mark cold functions with optsize."), - clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", - "Mark cold functions with minsize."), - clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", - "Mark cold functions with optnone."))); + "pgo-cold-func-opt", cl::init(PGOOptions::ColdFuncOpt::Default), cl::Hidden, + cl::desc( + "Function attribute to apply to cold functions as determined by PGO"), + cl::values(clEnumValN(PGOOptions::ColdFuncOpt::Default, "default", + "Default (no attribute)"), + clEnumValN(PGOOptions::ColdFuncOpt::OptSize, "optsize", + "Mark cold functions with optsize."), + clEnumValN(PGOOptions::ColdFuncOpt::MinSize, "minsize", + "Mark cold functions with minsize."), + clEnumValN(PGOOptions::ColdFuncOpt::OptNone, "optnone", + "Mark cold functions with optnone."))); extern cl::opt ProfileCorrelate; } // namespace llvm @@ -853,9 +853,10 @@ void EmitAssemblyHelper::RunOptimizationPipeline( CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) - PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); + PGOOpt = + PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // -fpseudo-probe-for-profiling PGOOpt = PGOOptions("", "", "", 
                         /*MemoryProfile=*/"", nullptr,

From flang-commits at lists.llvm.org Fri May 30 03:37:46 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Fri, 30 May 2025 03:37:46 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <68398a7a.050a0220.2d2390.e456@mx.google.com>

================
@@ -2656,527 +2665,1857 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) {
   }
 }

-inline void OmpStructureChecker::ErrIfAllocatableVariable(
-    const parser::Variable &var) {
-  // Err out if the given symbol has
-  // ALLOCATABLE attribute
-  if (const auto *e{GetExpr(context_, var)})
-    for (const Symbol &symbol : evaluate::CollectSymbols(*e))
-      if (IsAllocatable(symbol)) {
-        const auto &designator =
-            std::get>(var.u);
-        const auto *dataRef =
-            std::get_if(&designator.value().u);
-        const parser::Name *name =
-            dataRef ? std::get_if(&dataRef->u) : nullptr;
-        if (name)
-          context_.Say(name->source,
-              "%s must not have ALLOCATABLE "
-              "attribute"_err_en_US,
-              name->ToString());
+/// parser::Block is a list of executable constructs, parser::BlockConstruct
+/// is Fortran's BLOCK/ENDBLOCK construct.
+/// Strip the outermost BlockConstructs, return the reference to the Block
+/// in the executable part of the innermost of the stripped constructs.
+/// Specifically, if the given `block` has a single entry (it's a list), and
+/// the entry is a BlockConstruct, get the Block contained within. Repeat
+/// this step as many times as possible.
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; } } + return SourcedActionStmt{}; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != 
variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); } - return false; + return SourcedActionStmt{}; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - 
common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; - } - } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update 
statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); - } +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. +static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; } - ErrIfAllocatableVariable(var); + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; + } else { + return std::nullopt; + } + }, + x->u); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] + // Extract the evaluate::Expr from ScalarLogicalExpr. 
+ auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. -void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const auto *v1 = GetExpr(context_, stmt1Var); - const auto *e1 = GetExpr(context_, stmt1Expr); - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - const auto *v2 = GetExpr(context_, stmt2Var); - const auto *e2 = GetExpr(context_, stmt2Expr); - - if (e1 && v1 && e2 && v2) { - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(v2, e2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } - if (!(*e1 == *v2)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(v1, e1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - if (!(*v1 == *e2)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); - } - } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); } + return std::nullopt; } -} -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. 
+ return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; } } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; } } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); - } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); + return std::nullopt; } -} -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { 
- const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, - }, - x.u); + return std::nullopt; } -void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { - dirContext_.pop_back(); +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; + } } -// Clauses -// Mainly categorized as -// 1. Checks on 'OmpClauseList' from 'parse-tree.h'. -// 2. Checks on clauses which fall under 'struct OmpClause' from parse-tree.h. -// 3. Checks on clauses which are not in 'struct OmpClause' from parse-tree.h. 
+template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} -void OmpStructureChecker::Leave(const parser::OmpClauseList &) { - // 2.7.1 Loop Construct Restriction - if (llvm::omp::allDoSet.test(GetContext().directive)) { - if (auto *clause{FindClause(llvm::omp::Clause::OMPC_schedule)}) { - // only one schedule clause is allowed - const auto &schedClause{std::get(clause->u)}; - auto &modifiers{OmpGetModifiers(schedClause.v)}; - auto *ordering{ - OmpGetUniqueModifier(modifiers)}; - if (ordering && - ordering->v == parser::OmpOrderingModifier::Value::Nonmonotonic) { - if (FindClause(llvm::omp::Clause::OMPC_ordered)) { - context_.Say(clause->source, - "The NONMONOTONIC modifier cannot be specified " - "if an ORDERED clause is specified"_err_en_US); - } + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } + return result; + } - if (auto *clause{FindClause(llvm::omp::Clause::OMPC_ordered)}) { - // only one ordered clause is allowed - const auto &orderedClause{ - std::get(clause->u)}; + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. 
+ return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. + return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } - if (orderedClause.v) { - CheckNotAllowedIfClause( - llvm::omp::Clause::OMPC_ordered, {llvm::omp::Clause::OMPC_linear}); + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } - if (auto *clause2{FindClause(llvm::omp::Clause::OMPC_collapse)}) { - const auto &collapseClause{ - std::get(clause2->u)}; - // ordered and collapse both have parameters - if (const auto orderedValue{GetIntValue(orderedClause.v)}) { - if (const auto collapseValue{GetIntValue(collapseClause.v)}) { - if (*orderedValue > 0 && *orderedValue < *collapseValue) { - context_.Say(clause->source, - "The parameter of the ORDERED clause must be " - "greater than or equal to " - "the parameter of the COLLAPSE clause"_err_en_US); - } - } - } - } + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. 
+ if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); } + } + } - // TODO: ordered region binding check (requires nesting implementation) +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const 
evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; } - } // doSet + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } - // 2.8.1 Simd Construct Restriction - if (llvm::omp::allSimdSet.test(GetContext().directive)) { - if (auto *clause{FindClause(llvm::omp::Clause::OMPC_simdlen)}) { - if (auto *clause2{FindClause(llvm::omp::Clause::OMPC_safelen)}) { - const auto &simdlenClause{ - std::get(clause->u)}; - const auto &safelenClause{ - std::get(clause2->u)}; - // simdlen and safelen both have parameters - if (const auto simdlenValue{GetIntValue(simdlenClause.v)}) { - if (const auto safelenValue{GetIntValue(safelenClause.v)}) { - if (*safelenValue > 0 && *simdlenValue > *safelenValue) { - context_.Say(clause->source, - "The parameter of the SIMDLEN clause must be less than or " - "equal to the parameter of the SAFELEN clause"_err_en_US); - } - } - } + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. 
only + // collect top-level designators). + auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (MoveAppend(v, std::move(results)), ...); + return v; + } +}; + +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result asSomeExpr(const T &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::Constant &x) const { + return asSomeExpr(x); + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. 
+ if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); } + } else { + return asSomeExpr(x.derived()); } + } - // 2.11.5 Simd construct restriction (OpenMP 5.1) - if (auto *sl_clause{FindClause(llvm::omp::Clause::OMPC_safelen)}) { - if (auto *o_clause{FindClause(llvm::omp::Clause::OMPC_order)}) { - const auto &orderClause{ - std::get(o_clause->u)}; - if (std::get(orderClause.v.t) == - parser::OmpOrderClause::Ordering::Concurrent) { - context_.Say(sl_clause->source, - "The `SAFELEN` clause cannot appear in the `SIMD` directive " - "with `ORDER(CONCURRENT)` clause"_err_en_US); - } + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; } - } // SIMD + } - // Semantic checks related to presence of multiple list items within the same - // clause - CheckMultListItems(); + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. 
+ static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; - if (GetContext().directive == llvm::omp::Directive::OMPD_task) { - if (auto *detachClause{FindClause(llvm::omp::Clause::OMPC_detach)}) { - unsigned version{context_.langOptions().OpenMPVersion}; - if (version == 50 || version == 51) { + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; +}; +} // namespace atomic + +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); +} + +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} + +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} + +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return GetTopLevelOperation(cond).first == atomic::Operator::Associated; 
+} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in any the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; + } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, + const std::optional &maybeAssign = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetAssignment(operation.assign, maybeAssign); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; + // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedAssignment assign; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return 
an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var (with optional converts) + // or + // ... = x capture-var = atomic-var (with optional converts) + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + using ReturnTy = std::pair; + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return IsSameOrConvertOf(c.rhs, u.lhs); + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+  bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)};
+  bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)};
+  bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))};
+  bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))};
+
+  //     |cbu1 cbu2|
+  // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1
+  int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)};
----------------
tblah wrote:

Okay, that looks good to me.

https://github.com/llvm/llvm-project/pull/137852

From flang-commits at lists.llvm.org Fri May 30 04:50:15 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Fri, 30 May 2025 04:50:15 -0700 (PDT)
Subject: [flang-commits] [flang] Revert "Reland "[flang] Added noalias attribute to function arguments… (PR #142128)
Message-ID:

https://github.com/tblah created https://github.com/llvm/llvm-project/pull/142128

…. (#140803)""

This reverts commit a0d699a8e686cba99690cf28463d14526c5bfbc8.

With this enabled we see a 70% performance regression for exchange2_r on neoverse-v1 (AWS Graviton 3) using `-mcpu=native -Ofast -flto`. There is also a smaller regression on neoverse-v2.

This appears to be because function specialization is no longer kicking in during LTO for digits_2. This can be seen in the output executable: previously it contained specialized copies of the function with names like `_QMbrute_forcePdigits_2.specialized.4`. Now there are no names like this.

The bug is not in this patch as such but in the function specialization pass. However, given the size of the regression, I would like to request that this be reverted until function specialization has been fixed.

>From 61914aead5a5f9e0e336b7534b3d41e1a9b90896 Mon Sep 17 00:00:00 2001
From: Tom Eccles
Date: Fri, 30 May 2025 11:45:26 +0000
Subject: [PATCH] Revert "Reland "[flang] Added noalias attribute to function arguments. (#140803)""

This reverts commit a0d699a8e686cba99690cf28463d14526c5bfbc8.
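For readers of the archive, the per-argument attribute rule that this revert removes can be sketched as a small standalone model. This is an illustration only and not code from the patch: `ArgKind`, `ArgInfo`, `ArgAttrs`, `isPlainReference`, `beforeRevert` and `afterRevert` are hypothetical names, and the sketch leaves out the enclosing conditions (the module/declaration and BIND(C) checks) that guard the real logic in FunctionAttr.cpp.

```cpp
// Simplified model of the per-argument attribute decision.
// All names here are hypothetical; they do not exist in flang.
enum class ArgKind { Reference, PointerReference, Box, Other };

struct ArgInfo {
  ArgKind kind;
  bool hasTarget;       // argument carries the TARGET attribute
  bool hasAsynchronous; // argument carries the ASYNCHRONOUS attribute
  bool hasVolatile;     // argument carries the VOLATILE attribute
};

struct ArgAttrs {
  bool noCapture;
  bool noAlias;
};

// A reference argument with none of TARGET, ASYNCHRONOUS or VOLATILE.
constexpr bool isPlainReference(ArgInfo a) {
  return (a.kind == ArgKind::Reference ||
          a.kind == ArgKind::PointerReference) &&
         !a.hasTarget && !a.hasAsynchronous && !a.hasVolatile;
}

// Before the revert: both attributes are gated by pass options, box
// arguments qualify for both, and references to pointers get no noalias.
constexpr ArgAttrs beforeRevert(ArgInfo a, bool setNoCapture,
                                bool setNoAlias) {
  return ArgAttrs{
      setNoCapture && (isPlainReference(a) || a.kind == ArgKind::Box),
      setNoAlias &&
          ((isPlainReference(a) && a.kind != ArgKind::PointerReference) ||
           a.kind == ArgKind::Box)};
}

// After the revert: only nocapture, only for plain references, no option.
constexpr ArgAttrs afterRevert(ArgInfo a) {
  return ArgAttrs{isPlainReference(a), false};
}

// Compile-time checks of the two models.
static_assert(beforeRevert(ArgInfo{ArgKind::Box, false, false, false}, true,
                           true).noAlias,
              "box arguments were noalias before the revert");
static_assert(!afterRevert(ArgInfo{ArgKind::Box, false, false, false})
                   .noCapture,
              "box arguments get no attribute after the revert");
static_assert(afterRevert(ArgInfo{ArgKind::Reference, false, false, false})
                  .noCapture,
              "plain references keep nocapture after the revert");
```

In this model a box argument loses both attributes after the revert while a plain reference keeps nocapture, which is consistent with the test updates in the patch: patterns for box arguments become a bare `ptr %...`, and patterns for reference arguments become `ptr captures(none) %...`.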
---
 .../flang/Optimizer/Transforms/Passes.td | 6 -
 flang/lib/Optimizer/Passes/Pipelines.cpp | 6 +-
 .../lib/Optimizer/Transforms/FunctionAttr.cpp | 29 ++---
 flang/test/Fir/array-coor.fir | 2 +-
 flang/test/Fir/arrayset.fir | 2 +-
 flang/test/Fir/arrexp.fir | 18 +--
 flang/test/Fir/box-offset-codegen.fir | 8 +-
 flang/test/Fir/box-typecode.fir | 2 +-
 flang/test/Fir/box.fir | 18 +--
 flang/test/Fir/boxproc.fir | 4 +-
 flang/test/Fir/commute.fir | 2 +-
 flang/test/Fir/coordinateof.fir | 2 +-
 flang/test/Fir/embox.fir | 8 +-
 flang/test/Fir/field-index.fir | 4 +-
 .../Fir/ignore-missing-type-descriptor.fir | 2 +-
 flang/test/Fir/polymorphic.fir | 2 +-
 flang/test/Fir/rebox.fir | 12 +-
 .../test/Fir/struct-passing-x86-64-byval.fir | 48 ++++----
 .../Fir/target-rewrite-complex-10-x86.fir | 2 +-
 flang/test/Fir/target.fir | 8 +-
 flang/test/Fir/tbaa-codegen.fir | 2 +-
 flang/test/Fir/tbaa-codegen2.fir | 2 +-
 flang/test/Integration/OpenMP/copyprivate.f90 | 34 +++---
 flang/test/Integration/debug-local-var-2.f90 | 4 +-
 flang/test/Integration/unroll-loops.f90 | 2 +-
 flang/test/Lower/HLFIR/unroll-loops.fir | 2 +-
 flang/test/Lower/forall/character-1.f90 | 2 +-
 .../constant-argument-globalisation.fir | 4 +-
 .../Transforms/function-attrs-noalias.fir | 113 ------------------
 flang/test/Transforms/function-attrs.fir | 27 +----
 30 files changed, 112 insertions(+), 265 deletions(-)
 delete mode 100644 flang/test/Transforms/function-attrs-noalias.fir

diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index 71493535db8ba..8e7f12505c59c 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -431,12 +431,6 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> {
              "Set the unsafe-fp-math attribute on functions in the module.">,
       Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"",
              "Set the tune-cpu attribute on functions in the module.">,
-      Option<"setNoCapture", "set-nocapture", "bool", /*default=*/"false",
-             "Set LLVM nocapture attribute on function arguments, "
-             "if possible">,
-      Option<"setNoAlias", "set-noalias", "bool", /*default=*/"false",
-             "Set LLVM noalias attribute on function arguments, "
-             "if possible">,
   ];
 }
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 378913fcb1329..77751908e35be 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -350,15 +350,11 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm,
   else
     framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None;
-  bool setNoCapture = false, setNoAlias = false;
-  if (config.OptLevel.isOptimizingForSpeed())
-    setNoCapture = setNoAlias = true;
-
   pm.addPass(fir::createFunctionAttr(
       {framePointerKind, config.InstrumentFunctionEntry,
        config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath,
        config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath,
-       /*tuneCPU=*/"", setNoCapture, setNoAlias}));
+       ""}));
   if (config.EnableOpenMP) {
     pm.addNestedPass(
diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp
index c8cdba0d6f9c4..43e4c1a7af3cd 100644
--- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp
+++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp
@@ -27,8 +27,17 @@ namespace {
 class FunctionAttrPass : public fir::impl::FunctionAttrBase {
 public:
-  FunctionAttrPass(const fir::FunctionAttrOptions &options) : Base{options} {}
-  FunctionAttrPass() = default;
+  FunctionAttrPass(const fir::FunctionAttrOptions &options) {
+    instrumentFunctionEntry = options.instrumentFunctionEntry;
+    instrumentFunctionExit = options.instrumentFunctionExit;
+    framePointerKind = options.framePointerKind;
+    noInfsFPMath = options.noInfsFPMath;
+    noNaNsFPMath = options.noNaNsFPMath;
+    approxFuncFPMath = options.approxFuncFPMath;
+    noSignedZerosFPMath = options.noSignedZerosFPMath;
+    unsafeFPMath = options.unsafeFPMath;
+  }
+  FunctionAttrPass() {}
   void runOnOperation() override;
 };
@@ -47,28 +56,14 @@ void FunctionAttrPass::runOnOperation() {
   if ((isFromModule || !func.isDeclaration()) &&
       !fir::hasBindcAttr(func.getOperation())) {
     llvm::StringRef nocapture = mlir::LLVM::LLVMDialect::getNoCaptureAttrName();
-    llvm::StringRef noalias = mlir::LLVM::LLVMDialect::getNoAliasAttrName();
     mlir::UnitAttr unitAttr = mlir::UnitAttr::get(func.getContext());
     for (auto [index, argType] : llvm::enumerate(func.getArgumentTypes())) {
-      bool isNoCapture = false;
-      bool isNoAlias = false;
       if (mlir::isa(argType) &&
           !func.getArgAttr(index, fir::getTargetAttrName()) &&
           !func.getArgAttr(index, fir::getAsynchronousAttrName()) &&
-          !func.getArgAttr(index, fir::getVolatileAttrName())) {
-        isNoCapture = true;
-        isNoAlias = !fir::isPointerType(argType);
-      } else if (mlir::isa(argType)) {
-        // !fir.box arguments will be passed as descriptor pointers
-        // at LLVM IR dialect level - they cannot be captured,
-        // and cannot alias with anything within the function.
- isNoCapture = isNoAlias = true; - } - if (isNoCapture && setNoCapture) + !func.getArgAttr(index, fir::getVolatileAttrName())) func.setArgAttr(index, nocapture, unitAttr); - if (isNoAlias && setNoAlias) - func.setArgAttr(index, noalias, unitAttr); } } diff --git a/flang/test/Fir/array-coor.fir b/flang/test/Fir/array-coor.fir index 2caa727a10c50..a765670d20b28 100644 --- a/flang/test/Fir/array-coor.fir +++ b/flang/test/Fir/array-coor.fir @@ -33,7 +33,7 @@ func.func @test_array_coor_box_component_slice(%arg0: !fir.box) -> () // CHECK-LABEL: define void @test_array_coor_box_component_slice( -// CHECK-SAME: ptr {{[^%]*}}%[[VAL_0:.*]]) +// CHECK-SAME: ptr %[[VAL_0:.*]]) // CHECK: %[[VAL_1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[VAL_0]], i32 0, i32 7, i32 0, i32 2 // CHECK: %[[VAL_2:.*]] = load i64, ptr %[[VAL_1]] // CHECK: %[[VAL_3:.*]] = mul nsw i64 1, %[[VAL_2]] diff --git a/flang/test/Fir/arrayset.fir b/flang/test/Fir/arrayset.fir index cb26971cb962d..dab939aba1702 100644 --- a/flang/test/Fir/arrayset.fir +++ b/flang/test/Fir/arrayset.fir @@ -1,7 +1,7 @@ // RUN: tco %s | FileCheck %s // RUN: %flang_fc1 -emit-llvm %s -o - | FileCheck %s -// CHECK-LABEL: define void @x( +// CHECK-LABEL: define void @x(ptr captures(none) %0) func.func @x(%arr : !fir.ref>) { %1 = arith.constant 0 : index %2 = arith.constant 9 : index diff --git a/flang/test/Fir/arrexp.fir b/flang/test/Fir/arrexp.fir index 6c7f71f6f1f9c..924c1fab8d84b 100644 --- a/flang/test/Fir/arrexp.fir +++ b/flang/test/Fir/arrexp.fir @@ -1,7 +1,7 @@ // RUN: tco %s | FileCheck %s // CHECK-LABEL: define void @f1 -// CHECK: (ptr {{[^%]*}}%[[A:[^,]*]], {{.*}}, float %[[F:.*]]) +// CHECK: (ptr captures(none) %[[A:[^,]*]], {{.*}}, float %[[F:.*]]) func.func @f1(%a : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -23,7 +23,7 
@@ func.func @f1(%a : !fir.ref>, %n : index, %m : index, %o : i } // CHECK-LABEL: define void @f2 -// CHECK: (ptr {{[^%]*}}%[[A:[^,]*]], {{.*}}, float %[[F:.*]]) +// CHECK: (ptr captures(none) %[[A:[^,]*]], {{.*}}, float %[[F:.*]]) func.func @f2(%a : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -47,7 +47,7 @@ func.func @f2(%a : !fir.ref>, %b : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -72,7 +72,7 @@ func.func @f3(%a : !fir.ref>, %b : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -102,7 +102,7 @@ func.func @f4(%a : !fir.ref>, %b : !fir.ref>, %arg1: !fir.box>, %arg2: f32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -135,7 +135,7 @@ func.func @f5(%arg0: !fir.box>, %arg1: !fir.box>, %arg1: f32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -165,7 +165,7 @@ func.func @f6(%arg0: !fir.box>, %arg1: f32) { // Non contiguous array with lower bounds (x = y(100), with y(4:)) // Test array_coor offset computation. // CHECK-LABEL: define void @f7( -// CHECK: ptr {{[^%]*}}%[[X:[^,]*]], ptr {{[^%]*}}%[[Y:.*]]) +// CHECK: ptr captures(none) %[[X:[^,]*]], ptr %[[Y:.*]]) func.func @f7(%arg0: !fir.ref, %arg1: !fir.box>) { %c4 = arith.constant 4 : index %c100 = arith.constant 100 : index @@ -181,7 +181,7 @@ func.func @f7(%arg0: !fir.ref, %arg1: !fir.box>) { // Test A(:, :)%x reference codegen with A constant shape. 
// CHECK-LABEL: define void @f8( -// CHECK-SAME: ptr {{[^%]*}}%[[A:.*]], i32 %[[I:.*]]) +// CHECK-SAME: ptr captures(none) %[[A:.*]], i32 %[[I:.*]]) func.func @f8(%a : !fir.ref>>, %i : i32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -198,7 +198,7 @@ func.func @f8(%a : !fir.ref>>, %i : i32) { // Test casts in in array_coor offset computation when type parameters are not i64 // CHECK-LABEL: define ptr @f9( -// CHECK-SAME: i32 %[[I:.*]], i64 %{{.*}}, i64 %{{.*}}, ptr {{[^%]*}}%[[C:.*]]) +// CHECK-SAME: i32 %[[I:.*]], i64 %{{.*}}, i64 %{{.*}}, ptr captures(none) %[[C:.*]]) func.func @f9(%i: i32, %e : i64, %j: i64, %c: !fir.ref>>) -> !fir.ref> { %s = fir.shape %e, %e : (i64, i64) -> !fir.shape<2> // CHECK: %[[CAST:.*]] = sext i32 %[[I]] to i64 diff --git a/flang/test/Fir/box-offset-codegen.fir b/flang/test/Fir/box-offset-codegen.fir index 11d5750ffc385..15c9a11e5aefe 100644 --- a/flang/test/Fir/box-offset-codegen.fir +++ b/flang/test/Fir/box-offset-codegen.fir @@ -7,7 +7,7 @@ func.func @scalar_addr(%scalar : !fir.ref>>) -> !fir.llvm_ return %addr : !fir.llvm_ptr>> } // CHECK-LABEL: define ptr @scalar_addr( -// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 0 // CHECK: ret ptr %[[VAL_0]] @@ -16,7 +16,7 @@ func.func @scalar_tdesc(%scalar : !fir.ref>>) -> !fir.llvm return %tdesc : !fir.llvm_ptr>> } // CHECK-LABEL: define ptr @scalar_tdesc( -// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 7 // CHECK: ret ptr %[[VAL_0]] @@ -25,7 +25,7 @@ func.func @array_addr(%array : !fir.ref>>> } // CHECK-LABEL: define ptr @array_addr( -// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr captures(none) 
%[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 0 // CHECK: ret ptr %[[VAL_0]] @@ -34,6 +34,6 @@ func.func @array_tdesc(%array : !fir.ref>> } // CHECK-LABEL: define ptr @array_tdesc( -// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 8 // CHECK: ret ptr %[[VAL_0]] diff --git a/flang/test/Fir/box-typecode.fir b/flang/test/Fir/box-typecode.fir index a8d43eba39889..766c5165b947c 100644 --- a/flang/test/Fir/box-typecode.fir +++ b/flang/test/Fir/box-typecode.fir @@ -6,7 +6,7 @@ func.func @test_box_typecode(%a: !fir.class) -> i32 { } // CHECK-LABEL: @test_box_typecode( -// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]) +// CHECK-SAME: ptr %[[BOX:.*]]) // CHECK: %[[GEP:.*]] = getelementptr { ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}} }, ptr %[[BOX]], i32 0, i32 4 // CHECK: %[[TYPE_CODE:.*]] = load i8, ptr %[[GEP]] // CHECK: %[[TYPE_CODE_CONV:.*]] = sext i8 %[[TYPE_CODE]] to i32 diff --git a/flang/test/Fir/box.fir b/flang/test/Fir/box.fir index c0cf3d8375983..5e931a2e0d9aa 100644 --- a/flang/test/Fir/box.fir +++ b/flang/test/Fir/box.fir @@ -24,7 +24,7 @@ func.func private @g(%b : !fir.box) func.func private @ga(%b : !fir.box>) // CHECK-LABEL: define void @f -// CHECK: (ptr {{[^%]*}}%[[ARG:.*]]) +// CHECK: (ptr captures(none) %[[ARG:.*]]) func.func @f(%a : !fir.ref) { // CHECK: %[[DESC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 } // CHECK: %[[INS0:.*]] = insertvalue {{.*}} { ptr undef, i64 4, i32 20240719, i8 0, i8 27, i8 0, i8 0 }, ptr %[[ARG]], 0 @@ -38,7 +38,7 @@ func.func @f(%a : !fir.ref) { } // CHECK-LABEL: define void @fa -// CHECK: (ptr {{[^%]*}}%[[ARG:.*]]) +// CHECK: (ptr captures(none) %[[ARG:.*]]) func.func @fa(%a : !fir.ref>) { %c = fir.convert %a : (!fir.ref>) -> 
!fir.ref> %c1 = arith.constant 1 : index @@ -54,7 +54,7 @@ func.func @fa(%a : !fir.ref>) { // Boxing of a scalar character of dynamic length // CHECK-LABEL: define void @b1( -// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]]) +// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]]) func.func @b1(%arg0 : !fir.ref>, %arg1 : index) -> !fir.box> { // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 } // CHECK: %[[size:.*]] = mul i64 1, %[[arg1]] @@ -69,8 +69,8 @@ func.func @b1(%arg0 : !fir.ref>, %arg1 : index) -> !fir.box>>, %arg1 : index) -> !fir.box>> { %1 = fir.shape %arg1 : (index) -> !fir.shape<1> // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } @@ -85,7 +85,7 @@ func.func @b2(%arg0 : !fir.ref>>, %arg1 : index) -> // Boxing of a dynamic array of character of dynamic length // CHECK-LABEL: define void @b3( -// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]], i64 %[[arg2:.*]]) +// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]], i64 %[[arg2:.*]]) func.func @b3(%arg0 : !fir.ref>>, %arg1 : index, %arg2 : index) -> !fir.box>> { %1 = fir.shape %arg2 : (index) -> !fir.shape<1> // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } @@ -103,7 +103,7 @@ func.func @b3(%arg0 : !fir.ref>>, %arg1 : index, %ar // Boxing of a static array of character of dynamic length // CHECK-LABEL: define void @b4( -// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]]) +// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]]) func.func @b4(%arg0 : !fir.ref>>, %arg1 : index) -> !fir.box>> { %c_7 = arith.constant 7 : index %1 = fir.shape %c_7 : (index) -> !fir.shape<1> @@ -122,7 +122,7 @@ func.func @b4(%arg0 : !fir.ref>>, %arg1 : index) -> // Storing a fir.box into a fir.ref (modifying 
descriptors). // CHECK-LABEL: define void @b5( -// CHECK-SAME: ptr {{[^%]*}}%[[arg0:.*]], ptr {{[^%]*}}%[[arg1:.*]]) +// CHECK-SAME: ptr captures(none) %[[arg0:.*]], ptr %[[arg1:.*]]) func.func @b5(%arg0 : !fir.ref>>>, %arg1 : !fir.box>>) { fir.store %arg1 to %arg0 : !fir.ref>>> // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr %0, ptr %1, i32 72, i1 false) @@ -132,7 +132,7 @@ func.func @b5(%arg0 : !fir.ref>>>, %arg1 func.func private @callee6(!fir.box) -> i32 // CHECK-LABEL: define i32 @box6( -// CHECK-SAME: ptr {{[^%]*}}%[[ARG0:.*]], i64 %[[ARG1:.*]], i64 %[[ARG2:.*]]) +// CHECK-SAME: ptr captures(none) %[[ARG0:.*]], i64 %[[ARG1:.*]], i64 %[[ARG2:.*]]) func.func @box6(%0 : !fir.ref>, %1 : index, %2 : index) -> i32 { %c100 = arith.constant 100 : index %c50 = arith.constant 50 : index diff --git a/flang/test/Fir/boxproc.fir b/flang/test/Fir/boxproc.fir index 5d82522055adc..e99dfd0b92afd 100644 --- a/flang/test/Fir/boxproc.fir +++ b/flang/test/Fir/boxproc.fir @@ -16,7 +16,7 @@ // CHECK: call void @_QPtest_proc_dummy_other(ptr %[[VAL_6]]) // CHECK-LABEL: define void @_QFtest_proc_dummyPtest_proc_dummy_a(ptr -// CHECK-SAME: {{[^%]*}}%[[VAL_0:.*]], ptr nest {{[^%]*}}%[[VAL_1:.*]]) +// CHECK-SAME: captures(none) %[[VAL_0:.*]], ptr nest captures(none) %[[VAL_1:.*]]) // CHECK-LABEL: define void @_QPtest_proc_dummy_other(ptr // CHECK-SAME: %[[VAL_0:.*]]) @@ -92,7 +92,7 @@ func.func @_QPtest_proc_dummy_other(%arg0: !fir.boxproc<() -> ()>) { // CHECK: call void @llvm.stackrestore.p0(ptr %[[VAL_27]]) // CHECK-LABEL: define { ptr, i64 } @_QFtest_proc_dummy_charPgen_message(ptr -// CHECK-SAME: {{[^%]*}}%[[VAL_0:.*]], i64 %[[VAL_1:.*]], ptr nest {{[^%]*}}%[[VAL_2:.*]]) +// CHECK-SAME: captures(none) %[[VAL_0:.*]], i64 %[[VAL_1:.*]], ptr nest captures(none) %[[VAL_2:.*]]) // CHECK: %[[VAL_3:.*]] = getelementptr { { ptr, i64 } }, ptr %[[VAL_2]], i32 0, i32 0 // CHECK: %[[VAL_4:.*]] = load { ptr, i64 }, ptr %[[VAL_3]], align 8 // CHECK: %[[VAL_5:.*]] = extractvalue { ptr, i64 } 
%[[VAL_4]], 0 diff --git a/flang/test/Fir/commute.fir b/flang/test/Fir/commute.fir index 8713c8ff24e7f..a857ba55b00c5 100644 --- a/flang/test/Fir/commute.fir +++ b/flang/test/Fir/commute.fir @@ -11,7 +11,7 @@ func.func @f1(%a : i32, %b : i32) -> i32 { return %3 : i32 } -// CHECK-LABEL: define i32 @f2(ptr {{[^%]*}}%0) +// CHECK-LABEL: define i32 @f2(ptr captures(none) %0) func.func @f2(%a : !fir.ref) -> i32 { %1 = fir.load %a : !fir.ref // CHECK: %[[r2:.*]] = load diff --git a/flang/test/Fir/coordinateof.fir b/flang/test/Fir/coordinateof.fir index a01e9e9d1fc40..693bdf716ba1d 100644 --- a/flang/test/Fir/coordinateof.fir +++ b/flang/test/Fir/coordinateof.fir @@ -62,7 +62,7 @@ func.func @foo5(%box : !fir.box>>, %i : index) -> i32 } // CHECK-LABEL: @foo6 -// CHECK-SAME: (ptr {{[^%]*}}%[[box:.*]], i64 %{{.*}}, ptr {{[^%]*}}%{{.*}}) +// CHECK-SAME: (ptr %[[box:.*]], i64 %{{.*}}, ptr captures(none) %{{.*}}) func.func @foo6(%box : !fir.box>>>, %i : i64 , %res : !fir.ref>) { // CHECK: %[[addr_gep:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr %[[box]], i32 0, i32 0 // CHECK: %[[addr:.*]] = load ptr, ptr %[[addr_gep]] diff --git a/flang/test/Fir/embox.fir b/flang/test/Fir/embox.fir index 0f304cff2c79e..18b5efbc6a0e4 100644 --- a/flang/test/Fir/embox.fir +++ b/flang/test/Fir/embox.fir @@ -2,7 +2,7 @@ // RUN: %flang_fc1 -mmlir -disable-external-name-interop -emit-llvm %s -o -| FileCheck %s -// CHECK-LABEL: define void @_QPtest_callee( +// CHECK-LABEL: define void @_QPtest_callee(ptr %0) func.func @_QPtest_callee(%arg0: !fir.box>) { return } @@ -29,7 +29,7 @@ func.func @_QPtest_slice() { return } -// CHECK-LABEL: define void @_QPtest_dt_callee( +// CHECK-LABEL: define void @_QPtest_dt_callee(ptr %0) func.func @_QPtest_dt_callee(%arg0: !fir.box>) { return } @@ -63,7 +63,7 @@ func.func @_QPtest_dt_slice() { func.func private @takesRank2CharBox(!fir.box>>) // CHECK-LABEL: define void @emboxSubstring( -// CHECK-SAME: ptr {{[^%]*}}%[[arg0:.*]]) +// 
CHECK-SAME: ptr captures(none) %[[arg0:.*]]) func.func @emboxSubstring(%arg0: !fir.ref>>) { %c2 = arith.constant 2 : index %c3 = arith.constant 3 : index @@ -84,7 +84,7 @@ func.func @emboxSubstring(%arg0: !fir.ref>>) { func.func private @do_something(!fir.box>) -> () // CHECK: define void @fir_dev_issue_1416 -// CHECK-SAME: ptr {{[^%]*}}%[[base_addr:.*]], i64 %[[low:.*]], i64 %[[up:.*]], i64 %[[at:.*]]) +// CHECK-SAME: ptr captures(none) %[[base_addr:.*]], i64 %[[low:.*]], i64 %[[up:.*]], i64 %[[at:.*]]) func.func @fir_dev_issue_1416(%arg0: !fir.ref>, %low: index, %up: index, %at : index) { // Test fir.embox with a constant interior array shape. %c1 = arith.constant 1 : index diff --git a/flang/test/Fir/field-index.fir b/flang/test/Fir/field-index.fir index 19cfd2c04ad99..55d173201f29a 100644 --- a/flang/test/Fir/field-index.fir +++ b/flang/test/Fir/field-index.fir @@ -7,7 +7,7 @@ // CHECK-DAG: %[[c:.*]] = type { float, %[[b]] } // CHECK-LABEL: @simple_field -// CHECK-SAME: (ptr {{[^%]*}}%[[arg0:.*]]) +// CHECK-SAME: (ptr captures(none) %[[arg0:.*]]) func.func @simple_field(%arg0: !fir.ref>) -> i32 { // CHECK: %[[GEP:.*]] = getelementptr %a, ptr %[[arg0]], i32 0, i32 1 %2 = fir.coordinate_of %arg0, i : (!fir.ref>) -> !fir.ref @@ -17,7 +17,7 @@ func.func @simple_field(%arg0: !fir.ref>) -> i32 { } // CHECK-LABEL: @derived_field -// CHECK-SAME: (ptr {{[^%]*}}%[[arg0:.*]]) +// CHECK-SAME: (ptr captures(none) %[[arg0:.*]]) func.func @derived_field(%arg0: !fir.ref}>>) -> i32 { // CHECK: %[[GEP:.*]] = getelementptr %c, ptr %[[arg0]], i32 0, i32 1, i32 1 %3 = fir.coordinate_of %arg0, some_b, i : (!fir.ref}>>) -> !fir.ref diff --git a/flang/test/Fir/ignore-missing-type-descriptor.fir b/flang/test/Fir/ignore-missing-type-descriptor.fir index d3e6dd166ca45..f9dcb7db77afe 100644 --- a/flang/test/Fir/ignore-missing-type-descriptor.fir +++ b/flang/test/Fir/ignore-missing-type-descriptor.fir @@ -15,7 +15,7 @@ func.func @test_embox(%addr: !fir.ref) { return } // CHECK-LABEL: 
define void @test_embox( -// CHECK-SAME: ptr {{[^%]*}}%[[ADDR:.*]]) +// CHECK-SAME: ptr captures(none) %[[ADDR:.*]]) // CHECK: insertvalue { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] } // CHECK-SAME: { ptr undef, i64 4, // CHECK-SAME: i32 20240719, i8 0, i8 42, i8 0, i8 1, ptr null, [1 x i64] zeroinitializer }, diff --git a/flang/test/Fir/polymorphic.fir b/flang/test/Fir/polymorphic.fir index 84fa2e950633f..ea1099af6b988 100644 --- a/flang/test/Fir/polymorphic.fir +++ b/flang/test/Fir/polymorphic.fir @@ -86,7 +86,7 @@ func.func @_QMunlimitedPsub1(%arg0: !fir.class> {fir.bindc_na } // CHECK-LABEL: define void @_QMunlimitedPsub1( -// CHECK-SAME: ptr {{[^%]*}}%[[ARRAY:.*]]){{.*}}{ +// CHECK-SAME: ptr %[[ARRAY:.*]]){{.*}}{ // CHECK: %[[BOX:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] } // CHECK: %{{.}} = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[ARRAY]], i32 0, i32 7, i32 0, i32 2 // CHECK: %[[TYPE_DESC_GEP:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[ARRAY]], i32 0, i32 8 diff --git a/flang/test/Fir/rebox.fir b/flang/test/Fir/rebox.fir index 0c9f6d9bb94ad..140308be6a814 100644 --- a/flang/test/Fir/rebox.fir +++ b/flang/test/Fir/rebox.fir @@ -9,7 +9,7 @@ func.func private @bar1(!fir.box>) // CHECK-LABEL: define void @test_rebox_1( -// CHECK-SAME: ptr {{[^%]*}}%[[INBOX:.*]]) +// CHECK-SAME: ptr %[[INBOX:.*]]) func.func @test_rebox_1(%arg0: !fir.box>) { // CHECK: %[[OUTBOX_ALLOC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } %c2 = arith.constant 2 : index @@ -54,7 +54,7 @@ func.func @test_rebox_1(%arg0: !fir.box>) { func.func private @bar_rebox_test2(!fir.box>>) // CHECK-LABEL: define void @test_rebox_2( -// CHECK-SAME: ptr {{[^%]*}}%[[INBOX:.*]]) +// CHECK-SAME: ptr %[[INBOX:.*]]) func.func @test_rebox_2(%arg0: !fir.box>>) { %c1 = arith.constant 1 : index %c4 = arith.constant 4 : index @@ -82,7 +82,7 @@ func.func @test_rebox_2(%arg0: 
!fir.box>>) { func.func private @bar_rebox_test3(!fir.box>) // CHECK-LABEL: define void @test_rebox_3( -// CHECK-SAME: ptr {{[^%]*}}%[[INBOX:.*]]) +// CHECK-SAME: ptr %[[INBOX:.*]]) func.func @test_rebox_3(%arg0: !fir.box>) { // CHECK: %[[OUTBOX_ALLOC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [3 x [3 x i64]] } %c2 = arith.constant 2 : index @@ -116,7 +116,7 @@ func.func @test_rebox_3(%arg0: !fir.box>) { // time constant length. // CHECK-LABEL: define void @test_rebox_4( -// CHECK-SAME: ptr {{[^%]*}}%[[INPUT:.*]]) +// CHECK-SAME: ptr %[[INPUT:.*]]) func.func @test_rebox_4(%arg0: !fir.box>>) { // CHECK: %[[NEWBOX_STORAGE:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } // CHECK: %[[EXTENT_GEP:.*]] = getelementptr {{{.*}}}, ptr %[[INPUT]], i32 0, i32 7, i32 0, i32 1 @@ -144,7 +144,7 @@ func.func private @bar_test_rebox_4(!fir.box>>) { // CHECK: %[[OUTBOX_ALLOC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } %c1 = arith.constant 1 : index @@ -184,7 +184,7 @@ func.func @test_cmplx_1(%arg0: !fir.box>>) { // end subroutine // CHECK-LABEL: define void @test_cmplx_2( -// CHECK-SAME: ptr {{[^%]*}}%[[INBOX:.*]]) +// CHECK-SAME: ptr %[[INBOX:.*]]) func.func @test_cmplx_2(%arg0: !fir.box>>) { // CHECK: %[[OUTBOX_ALLOC:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } %c7 = arith.constant 7 : index diff --git a/flang/test/Fir/struct-passing-x86-64-byval.fir b/flang/test/Fir/struct-passing-x86-64-byval.fir index 997d2930f836c..8451c26095226 100644 --- a/flang/test/Fir/struct-passing-x86-64-byval.fir +++ b/flang/test/Fir/struct-passing-x86-64-byval.fir @@ -80,27 +80,27 @@ func.func @not_enough_int_reg_3(%arg0: i32, %arg1: i32, %arg2: i32, %arg3: i32, } } -// CHECK: define void @takes_toobig(ptr noalias byval(%toobig) align 8 captures(none) %{{.*}}) { -// CHECK: define void @takes_toobig_align16(ptr noalias byval(%toobig_align16) align 16 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_int_reg_1(i32 %{{.*}}, i32 
%{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias byval(%fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_int_reg_1b(ptr noalias captures(none) %{{.*}}, ptr noalias captures(none) %{{.*}}, ptr noalias captures(none) %{{.*}}, ptr noalias captures(none) %{{.*}}, ptr noalias captures(none) %{{.*}}, ptr noalias captures(none) %{{.*}}, ptr noalias byval(%fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_int_reg_2(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias byval(%fits_in_2_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @ftakes_toobig(ptr noalias byval(%ftoobig) align 8 captures(none) %{{.*}}) { -// CHECK: define void @ftakes_toobig_align16(ptr noalias byval(%ftoobig_align16) align 16 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_sse_reg_1(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr noalias byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_sse_reg_1b(<2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, ptr noalias byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_sse_reg_1c(double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, ptr noalias byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_sse_reg_2(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr noalias byval(%fits_in_2_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @test_contains_x87(ptr noalias byval(%contains_x87) align 16 captures(none) %{{.*}}) { -// CHECK: define void 
@test_contains_complex_x87(ptr noalias byval(%contains_complex_x87) align 16 captures(none) %{{.*}}) { -// CHECK: define void @test_nested_toobig(ptr noalias byval(%nested_toobig) align 8 captures(none) %{{.*}}) { -// CHECK: define void @test_badly_aligned(ptr noalias byval(%badly_aligned) align 8 captures(none) %{{.*}}) { -// CHECK: define void @test_logical_toobig(ptr noalias byval(%logical_too_big) align 8 captures(none) %{{.*}}) { -// CHECK: define void @l_not_enough_int_reg(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias byval(%l_fits_in_2_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @test_complex_toobig(ptr noalias byval(%complex_too_big) align 8 captures(none) %{{.*}}) { -// CHECK: define void @cplx_not_enough_sse_reg_1(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr noalias byval(%cplx_fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @test_char_to_big(ptr noalias byval(%char_too_big) align 8 captures(none) %{{.*}}) { -// CHECK: define void @char_not_enough_int_reg_1(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias byval(%char_fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @mix_not_enough_int_reg_1(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias byval(%mix_in_1_int_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @mix_not_enough_sse_reg_2(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr noalias byval(%mix_in_1_int_reg_1_sse_reg) align 8 captures(none) %{{.*}}) { -// CHECK: define void @not_enough_int_reg_3(ptr noalias sret({ fp128, fp128 }) align 16 captures(none) %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr noalias byval(%fits_in_1_int_reg) align 8 captures(none) %{{.*}}) +// 
CHECK: define void @takes_toobig(ptr byval(%toobig) align 8 captures(none) %{{.*}}) { +// CHECK: define void @takes_toobig_align16(ptr byval(%toobig_align16) align 16 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_int_reg_1(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_int_reg_1b(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}, ptr byval(%fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_int_reg_2(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%fits_in_2_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @ftakes_toobig(ptr byval(%ftoobig) align 8 captures(none) %{{.*}}) { +// CHECK: define void @ftakes_toobig_align16(ptr byval(%ftoobig_align16) align 16 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_sse_reg_1(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_sse_reg_1b(<2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, <2 x float> %{{.*}}, ptr byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_sse_reg_1c(double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, double %{{.*}}, ptr byval(%fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_sse_reg_2(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr byval(%fits_in_2_sse_reg) align 8 captures(none) %{{.*}}) { +// 
CHECK: define void @test_contains_x87(ptr byval(%contains_x87) align 16 captures(none) %{{.*}}) { +// CHECK: define void @test_contains_complex_x87(ptr byval(%contains_complex_x87) align 16 captures(none) %{{.*}}) { +// CHECK: define void @test_nested_toobig(ptr byval(%nested_toobig) align 8 captures(none) %{{.*}}) { +// CHECK: define void @test_badly_aligned(ptr byval(%badly_aligned) align 8 captures(none) %{{.*}}) { +// CHECK: define void @test_logical_toobig(ptr byval(%logical_too_big) align 8 captures(none) %{{.*}}) { +// CHECK: define void @l_not_enough_int_reg(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%l_fits_in_2_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @test_complex_toobig(ptr byval(%complex_too_big) align 8 captures(none) %{{.*}}) { +// CHECK: define void @cplx_not_enough_sse_reg_1(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr byval(%cplx_fits_in_1_sse_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @test_char_to_big(ptr byval(%char_too_big) align 8 captures(none) %{{.*}}) { +// CHECK: define void @char_not_enough_int_reg_1(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%char_fits_in_1_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @mix_not_enough_int_reg_1(i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%mix_in_1_int_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @mix_not_enough_sse_reg_2(float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, float %{{.*}}, ptr byval(%mix_in_1_int_reg_1_sse_reg) align 8 captures(none) %{{.*}}) { +// CHECK: define void @not_enough_int_reg_3(ptr sret({ fp128, fp128 }) align 16 captures(none) %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, i32 %{{.*}}, ptr byval(%fits_in_1_int_reg) align 8 
captures(none) %{{.*}}) diff --git a/flang/test/Fir/target-rewrite-complex-10-x86.fir b/flang/test/Fir/target-rewrite-complex-10-x86.fir index 5f917ee42d598..6404b4f766d39 100644 --- a/flang/test/Fir/target-rewrite-complex-10-x86.fir +++ b/flang/test/Fir/target-rewrite-complex-10-x86.fir @@ -30,5 +30,5 @@ func.func @takecomplex10(%z: complex) { // AMD64: %[[VAL_3:.*]] = fir.alloca complex // AMD64: fir.store %[[VAL_2]] to %[[VAL_3]] : !fir.ref> -// AMD64_LLVM: define void @takecomplex10(ptr noalias byval({ x86_fp80, x86_fp80 }) align 16 captures(none) %0) +// AMD64_LLVM: define void @takecomplex10(ptr byval({ x86_fp80, x86_fp80 }) align 16 captures(none) %0) } diff --git a/flang/test/Fir/target.fir b/flang/test/Fir/target.fir index e1190649e0803..781d153f525ff 100644 --- a/flang/test/Fir/target.fir +++ b/flang/test/Fir/target.fir @@ -26,7 +26,7 @@ func.func @gen4() -> complex { return %6 : complex } -// I32-LABEL: define void @gen8(ptr noalias sret({ double, double }) align 4 captures(none) % +// I32-LABEL: define void @gen8(ptr sret({ double, double }) align 4 captures(none) % // X64-LABEL: define { double, double } @gen8() // AARCH64-LABEL: define { double, double } @gen8() // PPC-LABEL: define { double, double } @gen8() @@ -93,9 +93,9 @@ func.func @call8() { return } -// I32-LABEL: define i64 @char1lensum(ptr {{[^%]*}}%0, ptr {{[^%]*}}%1, i32 %2, i32 %3) -// X64-LABEL: define i64 @char1lensum(ptr {{[^%]*}}%0, ptr {{[^%]*}}%1, i64 %2, i64 %3) -// PPC-LABEL: define i64 @char1lensum(ptr {{[^%]*}}%0, ptr {{[^%]*}}%1, i64 %2, i64 %3) +// I32-LABEL: define i64 @char1lensum(ptr captures(none) %0, ptr captures(none) %1, i32 %2, i32 %3) +// X64-LABEL: define i64 @char1lensum(ptr captures(none) %0, ptr captures(none) %1, i64 %2, i64 %3) +// PPC-LABEL: define i64 @char1lensum(ptr captures(none) %0, ptr captures(none) %1, i64 %2, i64 %3) func.func @char1lensum(%arg0 : !fir.boxchar<1>, %arg1 : !fir.boxchar<1>) -> i64 { // X64-DAG: %[[p0:.*]] = insertvalue { ptr, i64 } undef, 
ptr %1, 0 // X64-DAG: = insertvalue { ptr, i64 } %[[p0]], i64 %3, 1 diff --git a/flang/test/Fir/tbaa-codegen.fir b/flang/test/Fir/tbaa-codegen.fir index b6b0982b3934e..87bb15c0fea6c 100644 --- a/flang/test/Fir/tbaa-codegen.fir +++ b/flang/test/Fir/tbaa-codegen.fir @@ -28,7 +28,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ } // CHECK-LABEL: define void @_QPsimple( -// CHECK-SAME: ptr {{[^%]*}}%[[ARG0:.*]]){{.*}}{ +// CHECK-SAME: ptr %[[ARG0:.*]]){{.*}}{ // [...] // load a(2): // CHECK: %[[VAL20:.*]] = getelementptr i8, ptr %{{.*}}, i64 %{{.*}} diff --git a/flang/test/Fir/tbaa-codegen2.fir b/flang/test/Fir/tbaa-codegen2.fir index 69b36c2611505..8f8b6a29129e7 100644 --- a/flang/test/Fir/tbaa-codegen2.fir +++ b/flang/test/Fir/tbaa-codegen2.fir @@ -60,7 +60,7 @@ module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.targ } } // CHECK-LABEL: define void @_QPfunc( -// CHECK-SAME: ptr {{[^%]*}}%[[ARG0:.*]]){{.*}}{ +// CHECK-SAME: ptr %[[ARG0:.*]]){{.*}}{ // [...] 
// CHECK: %[[VAL5:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr %[[ARG0]], i32 0, i32 7, i32 0, i32 0 // box access: diff --git a/flang/test/Integration/OpenMP/copyprivate.f90 b/flang/test/Integration/OpenMP/copyprivate.f90 index e0e4abe015438..3bae003ea8d83 100644 --- a/flang/test/Integration/OpenMP/copyprivate.f90 +++ b/flang/test/Integration/OpenMP/copyprivate.f90 @@ -8,25 +8,25 @@ !RUN: %flang_fc1 -emit-llvm -fopenmp %s -o - | FileCheck %s -!CHECK-DAG: define internal void @_copy_box_Uxi32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_box_10xi32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_i64(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_box_Uxi64(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_f32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_box_2x3xf32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_z32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_box_10xz32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_l32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_box_5xl32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_c8x8(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_box_10xc8x8(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_c16x5(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_rec__QFtest_typesTdt(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_box_heap_Uxi32(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) -!CHECK-DAG: define internal void @_copy_box_ptr_Uxc8x9(ptr {{[^%]*}}%{{.*}}, ptr {{[^%]*}}%{{.*}}) 
+!CHECK-DAG: define internal void @_copy_box_Uxi32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_box_10xi32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_i64(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_box_Uxi64(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_f32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_box_2x3xf32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_z32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_box_10xz32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_l32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_box_5xl32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_c8x8(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_box_10xc8x8(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_c16x5(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_rec__QFtest_typesTdt(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_box_heap_Uxi32(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) +!CHECK-DAG: define internal void @_copy_box_ptr_Uxc8x9(ptr captures(none) %{{.*}}, ptr captures(none) %{{.*}}) !CHECK-LABEL: define internal void @_copy_i32( -!CHECK-SAME: ptr {{[^%]*}}%[[DST:.*]], ptr {{[^%]*}}%[[SRC:.*]]){{.*}} { +!CHECK-SAME: ptr captures(none) %[[DST:.*]], ptr captures(none) %[[SRC:.*]]){{.*}} { !CHECK-NEXT: %[[SRC_VAL:.*]] = load i32, ptr %[[SRC]] !CHECK-NEXT: store i32 
%[[SRC_VAL]], ptr %[[DST]] !CHECK-NEXT: ret void diff --git a/flang/test/Integration/debug-local-var-2.f90 b/flang/test/Integration/debug-local-var-2.f90 index 468bb0c5a1289..08aeb0999b01b 100644 --- a/flang/test/Integration/debug-local-var-2.f90 +++ b/flang/test/Integration/debug-local-var-2.f90 @@ -20,7 +20,7 @@ ! BOTH-LABEL: } ! BOTH-LABEL: define {{.*}}i64 @_QFPfn1 -! BOTH-SAME: (ptr {{[^%]*}}%[[ARG1:.*]], ptr {{[^%]*}}%[[ARG2:.*]], ptr {{[^%]*}}%[[ARG3:.*]]) +! BOTH-SAME: (ptr captures(none) %[[ARG1:.*]], ptr captures(none) %[[ARG2:.*]], ptr captures(none) %[[ARG3:.*]]) ! RECORDS-DAG: #dbg_declare(ptr %[[ARG1]], ![[A1:.*]], !DIExpression(), !{{.*}}) ! RECORDS-DAG: #dbg_declare(ptr %[[ARG2]], ![[B1:.*]], !DIExpression(), !{{.*}}) ! RECORDS-DAG: #dbg_declare(ptr %[[ARG3]], ![[C1:.*]], !DIExpression(), !{{.*}}) @@ -29,7 +29,7 @@ ! BOTH-LABEL: } ! BOTH-LABEL: define {{.*}}i32 @_QFPfn2 -! BOTH-SAME: (ptr {{[^%]*}}%[[FN2ARG1:.*]], ptr {{[^%]*}}%[[FN2ARG2:.*]], ptr {{[^%]*}}%[[FN2ARG3:.*]]) +! BOTH-SAME: (ptr captures(none) %[[FN2ARG1:.*]], ptr captures(none) %[[FN2ARG2:.*]], ptr captures(none) %[[FN2ARG3:.*]]) ! RECORDS-DAG: #dbg_declare(ptr %[[FN2ARG1]], ![[A2:.*]], !DIExpression(), !{{.*}}) ! RECORDS-DAG: #dbg_declare(ptr %[[FN2ARG2]], ![[B2:.*]], !DIExpression(), !{{.*}}) ! RECORDS-DAG: #dbg_declare(ptr %[[FN2ARG3]], ![[C2:.*]], !DIExpression(), !{{.*}}) diff --git a/flang/test/Integration/unroll-loops.f90 b/flang/test/Integration/unroll-loops.f90 index 87ab9efeb703b..debe45e0ec359 100644 --- a/flang/test/Integration/unroll-loops.f90 +++ b/flang/test/Integration/unroll-loops.f90 @@ -13,7 +13,7 @@ ! RUN: %if x86-registered-target %{ %{check-nounroll} %} ! ! CHECK-LABEL: @unroll -! CHECK-SAME: (ptr {{[^%]*}}%[[ARG0:.*]]) +! 
CHECK-SAME: (ptr writeonly captures(none) %[[ARG0:.*]]) subroutine unroll(a) integer(kind=8), intent(out) :: a(1000) integer(kind=8) :: i diff --git a/flang/test/Lower/HLFIR/unroll-loops.fir b/flang/test/Lower/HLFIR/unroll-loops.fir index 89e8ce82d6f3f..1321f39677405 100644 --- a/flang/test/Lower/HLFIR/unroll-loops.fir +++ b/flang/test/Lower/HLFIR/unroll-loops.fir @@ -11,7 +11,7 @@ // RUN: %if x86-registered-target %{ %{check-nounroll} %} // CHECK-LABEL: @unroll -// CHECK-SAME: (ptr {{[^%]*}}%[[ARG0:.*]]) +// CHECK-SAME: (ptr writeonly captures(none) %[[ARG0:.*]]) func.func @unroll(%arg0: !fir.ref> {fir.bindc_name = "a"}) { %scope = fir.dummy_scope : !fir.dscope %c1000 = arith.constant 1000 : index diff --git a/flang/test/Lower/forall/character-1.f90 b/flang/test/Lower/forall/character-1.f90 index 1e4bb73350871..69064ddfcf0be 100644 --- a/flang/test/Lower/forall/character-1.f90 +++ b/flang/test/Lower/forall/character-1.f90 @@ -22,7 +22,7 @@ end subroutine sub end program test ! CHECK-LABEL: define internal void @_QFPsub( -! CHECK-SAME: ptr {{[^%]*}}%[[arg:.*]]) +! CHECK-SAME: ptr %[[arg:.*]]) ! CHECK: %[[extent:.*]] = getelementptr { {{.*}}, [1 x [3 x i64]] }, ptr %[[arg]], i32 0, i32 7, i64 0, i32 1 ! CHECK: %[[extval:.*]] = load i64, ptr %[[extent]] ! 
CHECK: %[[elesize:.*]] = getelementptr { {{.*}}, [1 x [3 x i64]] }, ptr %[[arg]], i32 0, i32 1
diff --git a/flang/test/Transforms/constant-argument-globalisation.fir b/flang/test/Transforms/constant-argument-globalisation.fir
index 4e5995bdf2207..02349de40bc0b 100644
--- a/flang/test/Transforms/constant-argument-globalisation.fir
+++ b/flang/test/Transforms/constant-argument-globalisation.fir
@@ -49,8 +49,8 @@ module {
 // DISABLE-LABEL: ; ModuleID =
 // DISABLE-NOT: @_extruded
 // DISABLE: define void @sub1(
-// DISABLE-SAME: ptr {{[^%]*}}[[ARG0:%.*]],
-// DISABLE-SAME: ptr {{[^%]*}}[[ARG1:%.*]])
+// DISABLE-SAME: ptr captures(none) [[ARG0:%.*]],
+// DISABLE-SAME: ptr captures(none) [[ARG1:%.*]])
 // DISABLE-SAME: {
 // DISABLE: [[CONST_R0:%.*]] = alloca double
 // DISABLE: [[CONST_R1:%.*]] = alloca double
diff --git a/flang/test/Transforms/function-attrs-noalias.fir b/flang/test/Transforms/function-attrs-noalias.fir
deleted file mode 100644
index 6733fa96457bc..0000000000000
--- a/flang/test/Transforms/function-attrs-noalias.fir
+++ /dev/null
@@ -1,113 +0,0 @@
-// RUN: fir-opt --function-attr="set-noalias=true" %s | FileCheck %s
-
-// Test the annotation of function arguments with llvm.noalias.
-
-// Test !fir.ref arguments.
-// CHECK-LABEL: func.func private @test_ref(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.ref {llvm.noalias}) {
-func.func private @test_ref(%arg0: !fir.ref) {
-  return
-}
-
-// CHECK-LABEL: func.func private @test_ref_target(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.ref {fir.target}) {
-func.func private @test_ref_target(%arg0: !fir.ref {fir.target}) {
-  return
-}
-
-// CHECK-LABEL: func.func private @test_ref_volatile(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.ref {fir.volatile}) {
-func.func private @test_ref_volatile(%arg0: !fir.ref {fir.volatile}) {
-  return
-}
-
-// CHECK-LABEL: func.func private @test_ref_asynchronous(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.ref {fir.asynchronous}) {
-func.func private @test_ref_asynchronous(%arg0: !fir.ref {fir.asynchronous}) {
-  return
-}
-
-// CHECK-LABEL: func.func private @test_ref_box(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.ref> {llvm.noalias}) {
-// Test !fir.ref> arguments:
-func.func private @test_ref_box(%arg0: !fir.ref>) {
-  return
-}
-
-// CHECK-LABEL: func.func private @test_ref_box_target(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.ref> {fir.target}) {
-func.func private @test_ref_box_target(%arg0: !fir.ref> {fir.target}) {
-  return
-}
-
-// CHECK-LABEL: func.func private @test_ref_box_volatile(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.ref> {fir.volatile}) {
-func.func private @test_ref_box_volatile(%arg0: !fir.ref> {fir.volatile}) {
-  return
-}
-
-// CHECK-LABEL: func.func private @test_ref_box_asynchronous(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.ref> {fir.asynchronous}) {
-func.func private @test_ref_box_asynchronous(%arg0: !fir.ref> {fir.asynchronous}) {
-  return
-}
-
-// Test POINTER arguments.
-// CHECK-LABEL: func.func private @test_ref_box_ptr(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>>) {
-func.func private @test_ref_box_ptr(%arg0: !fir.ref>>) {
-  return
-}
-
-// Test ALLOCATABLE arguments.
-// CHECK-LABEL: func.func private @test_ref_box_heap(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.ref>> {llvm.noalias}) {
-func.func private @test_ref_box_heap(%arg0: !fir.ref>>) {
-  return
-}
-
-// BIND(C) functions are not annotated.
-// CHECK-LABEL: func.func private @test_ref_bindc(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.ref)
-func.func private @test_ref_bindc(%arg0: !fir.ref) attributes {fir.bindc_name = "test_ref_bindc", fir.proc_attrs = #fir.proc_attrs} {
-  return
-}
-
-// Test function declaration from a module.
-// CHECK-LABEL: func.func private @_QMtest_modPcheck_module(
-// CHECK-SAME: !fir.ref {llvm.noalias})
-func.func private @_QMtest_modPcheck_module(!fir.ref)
-
-// Test !fir.box arguments:
-// CHECK-LABEL: func.func private @test_box(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.box {llvm.noalias}) {
-func.func private @test_box(%arg0: !fir.box) {
-  return
-}
-
-// CHECK-LABEL: func.func private @test_box_target(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.target, llvm.noalias}) {
-func.func private @test_box_target(%arg0: !fir.box {fir.target}) {
-  return
-}
-
-// CHECK-LABEL: func.func private @test_box_volatile(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.volatile, llvm.noalias}) {
-func.func private @test_box_volatile(%arg0: !fir.box {fir.volatile}) {
-  return
-}
-
-// CHECK-LABEL: func.func private @test_box_asynchronous(
-// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.asynchronous, llvm.noalias}) {
-func.func private @test_box_asynchronous(%arg0: !fir.box {fir.asynchronous}) {
-  return
-}
-
-// !fir.boxchar<> is lowered before FunctionAttrPass, but let's
-// make sure we do not annotate it.
-// CHECK-LABEL: func.func private @test_boxchar( -// CHECK-SAME: %[[ARG0:.*]]: !fir.boxchar<1>) { -func.func private @test_boxchar(%arg0: !fir.boxchar<1>) { - return -} - diff --git a/flang/test/Transforms/function-attrs.fir b/flang/test/Transforms/function-attrs.fir index 8e3a896fd58bf..5f871a1a7b6c5 100644 --- a/flang/test/Transforms/function-attrs.fir +++ b/flang/test/Transforms/function-attrs.fir @@ -1,4 +1,4 @@ -// RUN: fir-opt --function-attr="set-nocapture=true" %s | FileCheck %s +// RUN: fir-opt --function-attr %s | FileCheck %s // If a function has a body and is not bind(c), and if the dummy argument doesn't have the target, // asynchronous, volatile, or pointer attribute, then add llvm.nocapture to the dummy argument. @@ -43,28 +43,3 @@ func.func private @_QMarg_modPcheck_args(!fir.ref {fir.target}, !fir.ref {llvm.nocapture}, // CHECK-SAME: !fir.boxchar<1>, // CHECK-SAME: !fir.ref> {llvm.nocapture}) - -// Test !fir.box arguments: -// CHECK-LABEL: func.func private @test_box( -// CHECK-SAME: %[[ARG0:.*]]: !fir.box {llvm.nocapture}) { -func.func private @test_box(%arg0: !fir.box) { - return -} - -// CHECK-LABEL: func.func private @test_box_target( -// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.target, llvm.nocapture}) { -func.func private @test_box_target(%arg0: !fir.box {fir.target}) { - return -} - -// CHECK-LABEL: func.func private @test_box_volatile( -// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.volatile, llvm.nocapture}) { -func.func private @test_box_volatile(%arg0: !fir.box {fir.volatile}) { - return -} - -// CHECK-LABEL: func.func private @test_box_asynchronous( -// CHECK-SAME: %[[ARG0:.*]]: !fir.box {fir.asynchronous, llvm.nocapture}) { -func.func private @test_box_asynchronous(%arg0: !fir.box {fir.asynchronous}) { - return -} From flang-commits at lists.llvm.org Fri May 30 04:50:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 04:50:50 -0700 (PDT) Subject: [flang-commits] 
[flang] Revert "Reland "[flang] Added noalias attribute to function arguments… (PR #142128) In-Reply-To: Message-ID: <68399b9a.170a0220.d63f8.ddea@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Tom Eccles (tblah)
Changes …. (#140803)"" This reverts commit a0d699a8e686cba99690cf28463d14526c5bfbc8. With this enabled we see a 70% performance regression for exchange2_r on neoverse-v1 (AWS Graviton 3) using `-mcpu=native -Ofast -flto`. There is also a smaller regression on neoverse-v2. This appears to be because function specialization is no longer kicking in during LTO for digits_2. This can be seen in the output executable: previously it contained specialized copies of the function with names like `_QMbrute_forcePdigits_2.specialized.4`. Now there are no names like this. The bug is not in this patch as such but in the function specialization pass; however, given the size of the regression, I would like to request that this be reverted until function specialization has been fixed. --- Patch is 52.48 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/142128.diff 30 Files Affected: - (modified) flang/include/flang/Optimizer/Transforms/Passes.td (-6) - (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+1-5) - (modified) flang/lib/Optimizer/Transforms/FunctionAttr.cpp (+12-17) - (modified) flang/test/Fir/array-coor.fir (+1-1) - (modified) flang/test/Fir/arrayset.fir (+1-1) - (modified) flang/test/Fir/arrexp.fir (+9-9) - (modified) flang/test/Fir/box-offset-codegen.fir (+4-4) - (modified) flang/test/Fir/box-typecode.fir (+1-1) - (modified) flang/test/Fir/box.fir (+9-9) - (modified) flang/test/Fir/boxproc.fir (+2-2) - (modified) flang/test/Fir/commute.fir (+1-1) - (modified) flang/test/Fir/coordinateof.fir (+1-1) - (modified) flang/test/Fir/embox.fir (+4-4) - (modified) flang/test/Fir/field-index.fir (+2-2) - (modified) flang/test/Fir/ignore-missing-type-descriptor.fir (+1-1) - (modified) flang/test/Fir/polymorphic.fir (+1-1) - (modified) flang/test/Fir/rebox.fir (+6-6) - (modified) flang/test/Fir/struct-passing-x86-64-byval.fir (+24-24) - (modified) flang/test/Fir/target-rewrite-complex-10-x86.fir (+1-1) - (modified) 
flang/test/Fir/target.fir (+4-4) - (modified) flang/test/Fir/tbaa-codegen.fir (+1-1) - (modified) flang/test/Fir/tbaa-codegen2.fir (+1-1) - (modified) flang/test/Integration/OpenMP/copyprivate.f90 (+17-17) - (modified) flang/test/Integration/debug-local-var-2.f90 (+2-2) - (modified) flang/test/Integration/unroll-loops.f90 (+1-1) - (modified) flang/test/Lower/HLFIR/unroll-loops.fir (+1-1) - (modified) flang/test/Lower/forall/character-1.f90 (+1-1) - (modified) flang/test/Transforms/constant-argument-globalisation.fir (+2-2) - (removed) flang/test/Transforms/function-attrs-noalias.fir (-113) - (modified) flang/test/Transforms/function-attrs.fir (+1-26) ``````````diff diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index 71493535db8ba..8e7f12505c59c 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -431,12 +431,6 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "Set the unsafe-fp-math attribute on functions in the module.">, Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"", "Set the tune-cpu attribute on functions in the module.">, - Option<"setNoCapture", "set-nocapture", "bool", /*default=*/"false", - "Set LLVM nocapture attribute on function arguments, " - "if possible">, - Option<"setNoAlias", "set-noalias", "bool", /*default=*/"false", - "Set LLVM noalias attribute on function arguments, " - "if possible">, ]; } diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 378913fcb1329..77751908e35be 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -350,15 +350,11 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, else framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None; - bool setNoCapture = false, setNoAlias = false; - if 
(config.OptLevel.isOptimizingForSpeed()) - setNoCapture = setNoAlias = true; - pm.addPass(fir::createFunctionAttr( {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - /*tuneCPU=*/"", setNoCapture, setNoAlias})); + ""})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index c8cdba0d6f9c4..43e4c1a7af3cd 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -27,8 +27,17 @@ namespace { class FunctionAttrPass : public fir::impl::FunctionAttrBase { public: - FunctionAttrPass(const fir::FunctionAttrOptions &options) : Base{options} {} - FunctionAttrPass() = default; + FunctionAttrPass(const fir::FunctionAttrOptions &options) { + instrumentFunctionEntry = options.instrumentFunctionEntry; + instrumentFunctionExit = options.instrumentFunctionExit; + framePointerKind = options.framePointerKind; + noInfsFPMath = options.noInfsFPMath; + noNaNsFPMath = options.noNaNsFPMath; + approxFuncFPMath = options.approxFuncFPMath; + noSignedZerosFPMath = options.noSignedZerosFPMath; + unsafeFPMath = options.unsafeFPMath; + } + FunctionAttrPass() {} void runOnOperation() override; }; @@ -47,28 +56,14 @@ void FunctionAttrPass::runOnOperation() { if ((isFromModule || !func.isDeclaration()) && !fir::hasBindcAttr(func.getOperation())) { llvm::StringRef nocapture = mlir::LLVM::LLVMDialect::getNoCaptureAttrName(); - llvm::StringRef noalias = mlir::LLVM::LLVMDialect::getNoAliasAttrName(); mlir::UnitAttr unitAttr = mlir::UnitAttr::get(func.getContext()); for (auto [index, argType] : llvm::enumerate(func.getArgumentTypes())) { - bool isNoCapture = false; - bool isNoAlias = false; if (mlir::isa(argType) && !func.getArgAttr(index, fir::getTargetAttrName()) && !func.getArgAttr(index, 
fir::getAsynchronousAttrName()) && - !func.getArgAttr(index, fir::getVolatileAttrName())) { - isNoCapture = true; - isNoAlias = !fir::isPointerType(argType); - } else if (mlir::isa(argType)) { - // !fir.box arguments will be passed as descriptor pointers - // at LLVM IR dialect level - they cannot be captured, - // and cannot alias with anything within the function. - isNoCapture = isNoAlias = true; - } - if (isNoCapture && setNoCapture) + !func.getArgAttr(index, fir::getVolatileAttrName())) func.setArgAttr(index, nocapture, unitAttr); - if (isNoAlias && setNoAlias) - func.setArgAttr(index, noalias, unitAttr); } } diff --git a/flang/test/Fir/array-coor.fir b/flang/test/Fir/array-coor.fir index 2caa727a10c50..a765670d20b28 100644 --- a/flang/test/Fir/array-coor.fir +++ b/flang/test/Fir/array-coor.fir @@ -33,7 +33,7 @@ func.func @test_array_coor_box_component_slice(%arg0: !fir.box) -> () // CHECK-LABEL: define void @test_array_coor_box_component_slice( -// CHECK-SAME: ptr {{[^%]*}}%[[VAL_0:.*]]) +// CHECK-SAME: ptr %[[VAL_0:.*]]) // CHECK: %[[VAL_1:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[VAL_0]], i32 0, i32 7, i32 0, i32 2 // CHECK: %[[VAL_2:.*]] = load i64, ptr %[[VAL_1]] // CHECK: %[[VAL_3:.*]] = mul nsw i64 1, %[[VAL_2]] diff --git a/flang/test/Fir/arrayset.fir b/flang/test/Fir/arrayset.fir index cb26971cb962d..dab939aba1702 100644 --- a/flang/test/Fir/arrayset.fir +++ b/flang/test/Fir/arrayset.fir @@ -1,7 +1,7 @@ // RUN: tco %s | FileCheck %s // RUN: %flang_fc1 -emit-llvm %s -o - | FileCheck %s -// CHECK-LABEL: define void @x( +// CHECK-LABEL: define void @x(ptr captures(none) %0) func.func @x(%arr : !fir.ref>) { %1 = arith.constant 0 : index %2 = arith.constant 9 : index diff --git a/flang/test/Fir/arrexp.fir b/flang/test/Fir/arrexp.fir index 6c7f71f6f1f9c..924c1fab8d84b 100644 --- a/flang/test/Fir/arrexp.fir +++ b/flang/test/Fir/arrexp.fir @@ -1,7 +1,7 @@ // RUN: tco %s | FileCheck %s // CHECK-LABEL: 
define void @f1 -// CHECK: (ptr {{[^%]*}}%[[A:[^,]*]], {{.*}}, float %[[F:.*]]) +// CHECK: (ptr captures(none) %[[A:[^,]*]], {{.*}}, float %[[F:.*]]) func.func @f1(%a : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -23,7 +23,7 @@ func.func @f1(%a : !fir.ref>, %n : index, %m : index, %o : i } // CHECK-LABEL: define void @f2 -// CHECK: (ptr {{[^%]*}}%[[A:[^,]*]], {{.*}}, float %[[F:.*]]) +// CHECK: (ptr captures(none) %[[A:[^,]*]], {{.*}}, float %[[F:.*]]) func.func @f2(%a : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -47,7 +47,7 @@ func.func @f2(%a : !fir.ref>, %b : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -72,7 +72,7 @@ func.func @f3(%a : !fir.ref>, %b : !fir.ref>, %b : !fir.ref>, %n : index, %m : index, %o : index, %p : index, %f : f32) { %c1 = arith.constant 1 : index %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shapeshift<2> @@ -102,7 +102,7 @@ func.func @f4(%a : !fir.ref>, %b : !fir.ref>, %arg1: !fir.box>, %arg2: f32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -135,7 +135,7 @@ func.func @f5(%arg0: !fir.box>, %arg1: !fir.box>, %arg1: f32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -165,7 +165,7 @@ func.func @f6(%arg0: !fir.box>, %arg1: f32) { // Non contiguous array with lower bounds (x = y(100), with y(4:)) // Test array_coor offset computation. 
// CHECK-LABEL: define void @f7( -// CHECK: ptr {{[^%]*}}%[[X:[^,]*]], ptr {{[^%]*}}%[[Y:.*]]) +// CHECK: ptr captures(none) %[[X:[^,]*]], ptr %[[Y:.*]]) func.func @f7(%arg0: !fir.ref, %arg1: !fir.box>) { %c4 = arith.constant 4 : index %c100 = arith.constant 100 : index @@ -181,7 +181,7 @@ func.func @f7(%arg0: !fir.ref, %arg1: !fir.box>) { // Test A(:, :)%x reference codegen with A constant shape. // CHECK-LABEL: define void @f8( -// CHECK-SAME: ptr {{[^%]*}}%[[A:.*]], i32 %[[I:.*]]) +// CHECK-SAME: ptr captures(none) %[[A:.*]], i32 %[[I:.*]]) func.func @f8(%a : !fir.ref>>, %i : i32) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index @@ -198,7 +198,7 @@ func.func @f8(%a : !fir.ref>>, %i : i32) { // Test casts in in array_coor offset computation when type parameters are not i64 // CHECK-LABEL: define ptr @f9( -// CHECK-SAME: i32 %[[I:.*]], i64 %{{.*}}, i64 %{{.*}}, ptr {{[^%]*}}%[[C:.*]]) +// CHECK-SAME: i32 %[[I:.*]], i64 %{{.*}}, i64 %{{.*}}, ptr captures(none) %[[C:.*]]) func.func @f9(%i: i32, %e : i64, %j: i64, %c: !fir.ref>>) -> !fir.ref> { %s = fir.shape %e, %e : (i64, i64) -> !fir.shape<2> // CHECK: %[[CAST:.*]] = sext i32 %[[I]] to i64 diff --git a/flang/test/Fir/box-offset-codegen.fir b/flang/test/Fir/box-offset-codegen.fir index 11d5750ffc385..15c9a11e5aefe 100644 --- a/flang/test/Fir/box-offset-codegen.fir +++ b/flang/test/Fir/box-offset-codegen.fir @@ -7,7 +7,7 @@ func.func @scalar_addr(%scalar : !fir.ref>>) -> !fir.llvm_ return %addr : !fir.llvm_ptr>> } // CHECK-LABEL: define ptr @scalar_addr( -// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 0 // CHECK: ret ptr %[[VAL_0]] @@ -16,7 +16,7 @@ func.func @scalar_tdesc(%scalar : !fir.ref>>) -> !fir.llvm return %tdesc : !fir.llvm_ptr>> } // CHECK-LABEL: define ptr @scalar_tdesc( -// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ 
+// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 7 // CHECK: ret ptr %[[VAL_0]] @@ -25,7 +25,7 @@ func.func @array_addr(%array : !fir.ref>>> } // CHECK-LABEL: define ptr @array_addr( -// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 0 // CHECK: ret ptr %[[VAL_0]] @@ -34,6 +34,6 @@ func.func @array_tdesc(%array : !fir.ref>> } // CHECK-LABEL: define ptr @array_tdesc( -// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]){{.*}}{ +// CHECK-SAME: ptr captures(none) %[[BOX:.*]]){{.*}}{ // CHECK: %[[VAL_0:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]], ptr, [1 x i64] }, ptr %[[BOX]], i32 0, i32 8 // CHECK: ret ptr %[[VAL_0]] diff --git a/flang/test/Fir/box-typecode.fir b/flang/test/Fir/box-typecode.fir index a8d43eba39889..766c5165b947c 100644 --- a/flang/test/Fir/box-typecode.fir +++ b/flang/test/Fir/box-typecode.fir @@ -6,7 +6,7 @@ func.func @test_box_typecode(%a: !fir.class) -> i32 { } // CHECK-LABEL: @test_box_typecode( -// CHECK-SAME: ptr {{[^%]*}}%[[BOX:.*]]) +// CHECK-SAME: ptr %[[BOX:.*]]) // CHECK: %[[GEP:.*]] = getelementptr { ptr, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}}, i{{.*}} }, ptr %[[BOX]], i32 0, i32 4 // CHECK: %[[TYPE_CODE:.*]] = load i8, ptr %[[GEP]] // CHECK: %[[TYPE_CODE_CONV:.*]] = sext i8 %[[TYPE_CODE]] to i32 diff --git a/flang/test/Fir/box.fir b/flang/test/Fir/box.fir index c0cf3d8375983..5e931a2e0d9aa 100644 --- a/flang/test/Fir/box.fir +++ b/flang/test/Fir/box.fir @@ -24,7 +24,7 @@ func.func private @g(%b : !fir.box) func.func private @ga(%b : !fir.box>) // CHECK-LABEL: define void @f -// CHECK: (ptr {{[^%]*}}%[[ARG:.*]]) +// CHECK: (ptr captures(none) %[[ARG:.*]]) func.func @f(%a : !fir.ref) { // CHECK: %[[DESC:.*]] = alloca { 
ptr, i64, i32, i8, i8, i8, i8 } // CHECK: %[[INS0:.*]] = insertvalue {{.*}} { ptr undef, i64 4, i32 20240719, i8 0, i8 27, i8 0, i8 0 }, ptr %[[ARG]], 0 @@ -38,7 +38,7 @@ func.func @f(%a : !fir.ref) { } // CHECK-LABEL: define void @fa -// CHECK: (ptr {{[^%]*}}%[[ARG:.*]]) +// CHECK: (ptr captures(none) %[[ARG:.*]]) func.func @fa(%a : !fir.ref>) { %c = fir.convert %a : (!fir.ref>) -> !fir.ref> %c1 = arith.constant 1 : index @@ -54,7 +54,7 @@ func.func @fa(%a : !fir.ref>) { // Boxing of a scalar character of dynamic length // CHECK-LABEL: define void @b1( -// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]]) +// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]]) func.func @b1(%arg0 : !fir.ref>, %arg1 : index) -> !fir.box> { // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8 } // CHECK: %[[size:.*]] = mul i64 1, %[[arg1]] @@ -69,8 +69,8 @@ func.func @b1(%arg0 : !fir.ref>, %arg1 : index) -> !fir.box>>, %arg1 : index) -> !fir.box>> { %1 = fir.shape %arg1 : (index) -> !fir.shape<1> // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } @@ -85,7 +85,7 @@ func.func @b2(%arg0 : !fir.ref>>, %arg1 : index) -> // Boxing of a dynamic array of character of dynamic length // CHECK-LABEL: define void @b3( -// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], i64 %[[arg1:.*]], i64 %[[arg2:.*]]) +// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]], i64 %[[arg2:.*]]) func.func @b3(%arg0 : !fir.ref>>, %arg1 : index, %arg2 : index) -> !fir.box>> { %1 = fir.shape %arg2 : (index) -> !fir.shape<1> // CHECK: %[[alloca:.*]] = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] } @@ -103,7 +103,7 @@ func.func @b3(%arg0 : !fir.ref>>, %arg1 : index, %ar // Boxing of a static array of character of dynamic length // CHECK-LABEL: define void @b4( -// CHECK-SAME: ptr {{[^%]*}}%[[res:.*]], ptr {{[^%]*}}%[[arg0:.*]], 
i64 %[[arg1:.*]]) +// CHECK-SAME: ptr captures(none) %[[res:.*]], ptr captures(none) %[[arg0:.*]], i64 %[[arg1:.*]]) func.func @b4(%arg0 : !fir.ref>>, %arg1 : index) -> !fir.box>> { %c_7 = arith.constant 7 : index %1 = fir.shape %c_7 : (index) -> !fir.shape<1> @@ -122,7 +122,7 @@ func.func @b4(%arg0 : !fir.ref>>, %arg1 : index) -> // Storing a fir.box into a fir.ref (modifying descriptors). // CHECK-LABEL: define void @b5( -// CHECK-SAME: ptr {{[^%]*}}%[[arg0:.*]], ptr {{[^%]*}}%[[arg1:.*]]) +// CHECK-SAME: ptr captures(none) %[[arg0:.*]], ptr %[[arg1:.*]]) func.func @b5(%arg0 : !fir.ref>>>, %arg1 : !fir.box>>) { fir.store %arg1 to %arg0 : !fir.ref>>> // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr %0, ptr %1, i32 72, i1 false) @@ -132,7 +132,7 @@ func.func @b5(%arg0 : !fir.ref>>>, %arg1 func.func private @callee6(!fir.box) -> i32 // CHECK-LABEL: define i32 @box6( -// CHECK-SAME: ptr {{[^%]*}}%[[ARG0:.*]], i64 %[[ARG1:.*]], i64 %[[ARG2:.*]]) +// CHECK-SAME: ptr captures(none) %[[ARG0:.*]], i64 %[[ARG1:.*]], i64 %[[ARG2:.*]]) func.func @box6(%0 : !fir.ref>, %1 : index, %2 : index) -> i32 { %c100 = arith.constant 100 : index %c50 = arith.constant 50 : index diff --git a/flang/test/Fir/boxproc.fir b/flang/test/Fir/boxproc.fir index 5d82522055adc..e99dfd0b92afd 100644 --- a/flang/test/Fir/boxproc.fir +++ b/flang/test/Fir/boxproc.fir @@ -16,7 +16,7 @@ // CHECK: call void @_QPtest_proc_dummy_other(ptr %[[VAL_6]]) // CHECK-LABEL: define void @_QFtest_proc_dummyPtest_proc_dummy_a(ptr -// CHECK-SAME: {{[^%]*}}%[[VAL_0:.*]], ptr nest {{[^%]*}}%[[VAL_1:.*]]) +// CHECK-SAME: captures(none) %[[VAL_0:.*]], ptr nest captures(none) %[[VAL_1:.*]]) // CHECK-LABEL: define void @_QPtest_proc_dummy_other(ptr // CHECK-SAME: %[[VAL_0:.*]]) @@ -92,7 +92,7 @@ func.func @_QPtest_proc_dummy_other(%arg0: !fir.boxproc<() -> ()>) { // CHECK: call void @llvm.stackrestore.p0(ptr %[[VAL_27]]) // CHECK-LABEL: define { ptr, i64 } @_QFtest_proc_dummy_charPgen_message(ptr -// CHECK-SAME: 
{{[^%]*}}%[[VAL_0:.*]], i64 %[[VAL_1:.*]], ptr nest {{[^%]*}}%[[VAL_2:.*]]) +// CHECK-SAME: captures(none) %[[VAL_0:.*]], i64 %[[VAL_1:.*]], ptr nest captures(none) %[[VAL_2:.*]]) // CHECK: %[[VAL_3:.*]] = getelementptr { { ptr, i64 } }, ptr %[[VAL_2]], i32 0, i32 0 // CHECK: %[[VAL_4:.*]] = load { ptr, i64 }, ptr %[[VAL_3]], align 8 // CHECK: %[[VAL_5:.*]] = extractvalue { ptr, i64 } %[[VAL_4]], 0 diff --git a/flang/test/Fir/commute.fir b/flang/test/Fir/commute.fir index 8713c8ff24e7f..a857ba55b00c5 100644 --- a/flang/test/Fir/commute.fir +++ b/flang/test/Fir/commute.fir @@ -11,7 +11,7 @@ func.func @f1(%a : i32, %b : i32) -> i32 { return %3 : i32 } -// CHECK-LABEL: define i32 @f2(ptr {{[^%]*}}%0) +// CHECK-LABEL: define i32 @f2(ptr captures(none) %0) func.func @f2(%a : !fir.ref) -> i32 { %1 = fir.load %a : !fir.ref // CHECK: %[[r2:.*]] = load diff --git a/flang/test/Fir/coordinateof.fir b/flang/test/Fir/coordinateof.fir index a01e9e9d1fc40..693bdf716ba1d 100644 --- a/flang/test/Fir/coordinateof.fir +++ b/flang/test/Fir/coordinateof.fir @@ -62,7 +62,7 @@ func.func @foo5(%box : !fir.box>>, %i : index) -> i32 } // CHECK-LABEL: @foo6 -// CHECK-SAME: (ptr {{[^%]*}}%[[box:.*]], i64 %{{.*}}, ptr {{[^%]*}}%{{.*}}) +// CHECK-SAME: (ptr %[[box:.*]], i64 %{{.*}}, ptr captures(none) %{{.*}}) func.func @foo6(%box : !fir.box>>>, %i : i64 , %res : !fir.ref>) { // CHECK: %[[addr_gep:.*]] = getelementptr { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, ptr %[[box]], i32 0, i32 0 // CHECK: %[[addr:.*]] = load ptr, ptr %[[addr_gep]] diff --git a/flang/test/Fir/embox.fir b/flang/test/Fir... [truncated] ``````````
https://github.com/llvm/llvm-project/pull/142128 From flang-commits at lists.llvm.org Fri May 30 04:51:19 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 30 May 2025 04:51:19 -0700 (PDT) Subject: [flang-commits] [flang] Revert "Reland "[flang] Added noalias attribute to function arguments… (PR #142128) In-Reply-To: Message-ID: <68399bb7.170a0220.1181f0.e559@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/142128 From flang-commits at lists.llvm.org Fri May 30 05:01:21 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Fri, 30 May 2025 05:01:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Inline hlfir.copy_in for trivial types (PR #138718) In-Reply-To: Message-ID: <68399e11.170a0220.230a5f.e33c@mx.google.com> https://github.com/mrkajetanp updated https://github.com/llvm/llvm-project/pull/138718 >From 78206935db8c87bb9e2c89da3574b933406f112e Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 10 Apr 2025 14:04:52 +0000 Subject: [PATCH 1/7] [flang] Inline hlfir.copy_in for trivial types hlfir.copy_in implements copying non-contiguous array slices for functions that take in arrays required to be contiguous through a flang-rt function that calls memcpy/memmove separately on each element. For large arrays of trivial types, this can incur considerable overhead compared to a plain copy loop that is better able to take advantage of hardware pipelines. To address that, extend the InlineHLFIRAssign optimisation pass with a new pattern for inlining hlfir.copy_in operations for trivial types. For the time being, the pattern is only applied in cases where the copy-in does not require a corresponding copy-out, such as when the function being called declares the array parameter as intent(in). 
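As a rough illustration of the copy-in behaviour described above (not the code from this patch), the following standalone C++ sketch shows the same decision: pass a contiguous array through untouched, otherwise copy it element by element into a temporary with a plain loop. `StridedView`, `CopyInResult` and `copyIn` are hypothetical names invented for this sketch; the real pattern operates on FIR/HLFIR operations and descriptors rather than raw pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Fortran array descriptor (1-D, double only).
struct StridedView {
  const double *base;  // address of the first element
  std::size_t extent;  // number of elements
  std::size_t stride;  // distance between elements, counted in elements
};

// Hypothetical result pair: contiguous data plus a "was copied" flag,
// mirroring the two results the commit message attributes to copy-in.
struct CopyInResult {
  const double *data;
  bool wasCopied;
};

// Contiguous input is passed through as-is. Non-contiguous input is
// gathered into `temp` by a simple loop instead of one runtime call
// per element.
CopyInResult copyIn(const StridedView &view, std::vector<double> &temp) {
  assert(view.base != nullptr);
  if (view.stride == 1 || view.extent <= 1)
    return {view.base, false};
  temp.resize(view.extent);
  for (std::size_t i = 0; i < view.extent; ++i)
    temp[i] = view.base[i * view.stride];
  return {temp.data(), true};
}
```

In this sketch the caller owns `temp`, so no explicit cleanup is needed; only the `wasCopied` flag is modelled, which is the value a later cleanup step would test before freeing a temporary.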
Applying this optimisation reduces the runtime of thornado-mini's DeleptonizationProblem by a factor of about 1/3rd. Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 117 ++++++++++++++++++ 1 file changed, 117 insertions(+) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 6e209cce07ad4..38c684eaceb7d 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,6 +13,7 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -127,6 +128,121 @@ class InlineHLFIRAssignConversion } }; +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. 
+ if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. + auto *copyOut = + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + + if (!mlir::isa(copyOut)) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (::llvm::cast(copyOut).getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + auto results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + auto elem = hlfir::getElementAt(loc, builder, inputVariable, + loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + auto tempElem = hlfir::getElementAt(loc, builder, temp, + loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always 
a boxed array by boxing it + // ourselves if need be. + if (mlir::isa(temp.getType())) { + result = temp; + } else { + auto refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + auto refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + auto addr = results[0]; + auto needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + auto tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. 
+ rewriter.eraseOp(tempBox.getDefiningOp()); + return mlir::success(); +} + class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -140,6 +256,7 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); + patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { >From 154b758b0da725c0d0f9b41cdc3713a05e2239a7 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 7 May 2025 16:04:07 +0000 Subject: [PATCH 2/7] Add tests Signed-off-by: Kajetan Puchalski --- flang/test/HLFIR/inline-hlfir-assign.fir | 144 +++++++++++++++++++++++ 1 file changed, 144 insertions(+) diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index f834e7971e3d5..df7681b9c5c16 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,3 +353,147 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // CHECK: } + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + 
%8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 
: !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = 
hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = 
arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert 
%[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 6d334d77917a9e02b3e397dd1b3ea8605320c795 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 8 May 2025 15:15:56 +0000 Subject: [PATCH 3/7] Address Tom's review comments Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 41 +++++++++++-------- 1 file changed, 23 insertions(+), 18 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index 38c684eaceb7d..dc545ece8adff 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -158,16 +158,16 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, "CopyInOp's WasCopied has no uses"); // The copy out should always be present, either to actually copy or just // deallocate memory. 
- auto *copyOut = - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser(); + auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - if (!mlir::isa(copyOut)) + if (!copyOut) return rewriter.notifyMatchFailure(copyIn, "CopyInOp has no direct CopyOut"); // Only inline the copy_in when copy_out does not need to be done, i.e. in // case of intent(in). - if (::llvm::cast(copyOut).getVar()) + if (copyOut.getVar()) return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); inputVariable = @@ -175,7 +175,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); mlir::Value isContiguous = builder.create(loc, inputVariable); - auto results = + mlir::Operation::result_range results = builder .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, /*withElseRegion=*/true) @@ -195,11 +195,11 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, loc, builder, extents, /*isUnordered=*/true, flangomp::shouldUseWorkshareLowering(copyIn)); builder.setInsertionPointToStart(loopNest.body); - auto elem = hlfir::getElementAt(loc, builder, inputVariable, - loopNest.oneBasedIndices); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); elem = hlfir::loadTrivialScalar(loc, builder, elem); - auto tempElem = hlfir::getElementAt(loc, builder, temp, - loopNest.oneBasedIndices); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); builder.create(loc, elem, tempElem); builder.setInsertionPointAfter(loopNest.outerOp); @@ -209,9 +209,9 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, if (mlir::isa(temp.getType())) { result = temp; } else { - auto refTy = + fir::ReferenceType refTy = fir::ReferenceType::get(temp.getElementOrSequenceType()); - auto refVal = builder.createConvert(loc, refTy, temp); + mlir::Value refVal = 
builder.createConvert(loc, refTy, temp); result = builder.create(loc, resultAddrType, refVal); } @@ -221,25 +221,30 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }) .getResults(); - auto addr = results[0]; - auto needsCleanup = results[1]; + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, needsCleanup, false).genThen([&] { + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { auto boxAddr = builder.create(loc, addr); - auto heapType = fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - auto heapVal = builder.createConvert(loc, heapType, boxAddr.getResult()); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); builder.create(loc, heapVal); }); rewriter.eraseOp(copyOut); - auto tempBox = copyIn.getTempBox(); + mlir::Value tempBox = copyIn.getTempBox(); rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); // The TempBox is only needed for flang-rt calls which we're no longer - // generating. + // generating. It should have no uses left at this stage. 
+  if (!tempBox.getUses().empty())
+    return mlir::failure();
   rewriter.eraseOp(tempBox.getDefiningOp());
+
   return mlir::success();
 }

>From 6a9d0fd6cc3c72ed7382bd78128a4cd59b75abe9 Mon Sep 17 00:00:00 2001
From: Kajetan Puchalski
Date: Thu, 22 May 2025 13:37:53 +0000
Subject: [PATCH 4/7] Separate copy_in inlining into its own pass, add flag

Signed-off-by: Kajetan Puchalski
---
 flang/include/flang/Optimizer/HLFIR/Passes.td | 4 +
 .../Optimizer/HLFIR/Transforms/CMakeLists.txt | 1 +
 .../HLFIR/Transforms/InlineHLFIRAssign.cpp | 122 ------------
 .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 180 ++++++++++++++++++
 flang/lib/Optimizer/Passes/Pipelines.cpp | 5 +
 flang/test/HLFIR/inline-hlfir-assign.fir | 144 --------------
 flang/test/HLFIR/inline-hlfir-copy-in.fir | 146 ++++++++++++++
 7 files changed, 336 insertions(+), 266 deletions(-)
 create mode 100644 flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp
 create mode 100644 flang/test/HLFIR/inline-hlfir-copy-in.fir

diff --git a/flang/include/flang/Optimizer/HLFIR/Passes.td b/flang/include/flang/Optimizer/HLFIR/Passes.td
index d445140118073..04d7aec5fe489 100644
--- a/flang/include/flang/Optimizer/HLFIR/Passes.td
+++ b/flang/include/flang/Optimizer/HLFIR/Passes.td
@@ -69,6 +69,10 @@ def InlineHLFIRAssign : Pass<"inline-hlfir-assign"> {
   let summary = "Inline hlfir.assign operations";
 }

+def InlineHLFIRCopyIn : Pass<"inline-hlfir-copy-in"> {
+  let summary = "Inline hlfir.copy_in operations";
+}
+
 def PropagateFortranVariableAttributes : Pass<"propagate-fortran-attrs"> {
   let summary = "Propagate FortranVariableFlagsAttr attributes through HLFIR";
 }
diff --git a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
index d959428ebd203..cc74273d9c5d9 100644
--- a/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
+++ b/flang/lib/Optimizer/HLFIR/Transforms/CMakeLists.txt
@@ -5,6 +5,7 @@ add_flang_library(HLFIRTransforms
   ConvertToFIR.cpp
   InlineElementals.cpp
InlineHLFIRAssign.cpp + InlineHLFIRCopyIn.cpp LowerHLFIRIntrinsics.cpp LowerHLFIROrderedAssignments.cpp ScheduleOrderedAssignments.cpp diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp index dc545ece8adff..6e209cce07ad4 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp @@ -13,7 +13,6 @@ #include "flang/Optimizer/Analysis/AliasAnalysis.h" #include "flang/Optimizer/Builder/FIRBuilder.h" #include "flang/Optimizer/Builder/HLFIRTools.h" -#include "flang/Optimizer/Dialect/FIRType.h" #include "flang/Optimizer/HLFIR/HLFIROps.h" #include "flang/Optimizer/HLFIR/Passes.h" #include "flang/Optimizer/OpenMP/Passes.h" @@ -128,126 +127,6 @@ class InlineHLFIRAssignConversion } }; -class InlineCopyInConversion : public mlir::OpRewritePattern { -public: - using mlir::OpRewritePattern::OpRewritePattern; - - llvm::LogicalResult - matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const override; -}; - -llvm::LogicalResult -InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, - mlir::PatternRewriter &rewriter) const { - fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); - mlir::Location loc = copyIn.getLoc(); - hlfir::Entity inputVariable{copyIn.getVar()}; - if (!fir::isa_trivial(inputVariable.getFortranElementType())) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's data type is not trivial"); - - if (fir::isPointerType(inputVariable.getType())) - return rewriter.notifyMatchFailure( - copyIn, "CopyInOp's input variable is a pointer"); - - // There should be exactly one user of WasCopied - the corresponding - // CopyOutOp. - if (copyIn.getWasCopied().getUses().empty()) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's WasCopied has no uses"); - // The copy out should always be present, either to actually copy or just - // deallocate memory. 
- auto copyOut = mlir::dyn_cast( - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); - - if (!copyOut) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp has no direct CopyOut"); - - // Only inline the copy_in when copy_out does not need to be done, i.e. in - // case of intent(in). - if (copyOut.getVar()) - return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); - - inputVariable = - hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); - mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); - mlir::Value isContiguous = - builder.create(loc, inputVariable); - mlir::Operation::result_range results = - builder - .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, - /*withElseRegion=*/true) - .genThen([&]() { - mlir::Value falseVal = builder.create( - loc, builder.getI1Type(), builder.getBoolAttr(false)); - builder.create( - loc, mlir::ValueRange{inputVariable, falseVal}); - }) - .genElse([&] { - auto [temp, cleanup] = - hlfir::createTempFromMold(loc, builder, inputVariable); - mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); - llvm::SmallVector extents = - hlfir::getIndexExtents(loc, builder, shape); - hlfir::LoopNest loopNest = hlfir::genLoopNest( - loc, builder, extents, /*isUnordered=*/true, - flangomp::shouldUseWorkshareLowering(copyIn)); - builder.setInsertionPointToStart(loopNest.body); - hlfir::Entity elem = hlfir::getElementAt( - loc, builder, inputVariable, loopNest.oneBasedIndices); - elem = hlfir::loadTrivialScalar(loc, builder, elem); - hlfir::Entity tempElem = hlfir::getElementAt( - loc, builder, temp, loopNest.oneBasedIndices); - builder.create(loc, elem, tempElem); - builder.setInsertionPointAfter(loopNest.outerOp); - - mlir::Value result; - // Make sure the result is always a boxed array by boxing it - // ourselves if need be. 
- if (mlir::isa(temp.getType())) { - result = temp; - } else { - fir::ReferenceType refTy = - fir::ReferenceType::get(temp.getElementOrSequenceType()); - mlir::Value refVal = builder.createConvert(loc, refTy, temp); - result = - builder.create(loc, resultAddrType, refVal); - } - - builder.create(loc, - mlir::ValueRange{result, cleanup}); - }) - .getResults(); - - mlir::OpResult addr = results[0]; - mlir::OpResult needsCleanup = results[1]; - - builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { - auto boxAddr = builder.create(loc, addr); - fir::HeapType heapType = - fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - mlir::Value heapVal = - builder.createConvert(loc, heapType, boxAddr.getResult()); - builder.create(loc, heapVal); - }); - rewriter.eraseOp(copyOut); - - mlir::Value tempBox = copyIn.getTempBox(); - - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); - - // The TempBox is only needed for flang-rt calls which we're no longer - // generating. It should have no uses left at this stage. 
- if (!tempBox.getUses().empty()) - return mlir::failure(); - rewriter.eraseOp(tempBox.getDefiningOp()); - - return mlir::success(); -} - class InlineHLFIRAssignPass : public hlfir::impl::InlineHLFIRAssignBase { public: @@ -261,7 +140,6 @@ class InlineHLFIRAssignPass mlir::RewritePatternSet patterns(context); patterns.insert(context); - patterns.insert(context); if (mlir::failed(mlir::applyPatternsGreedily( getOperation(), std::move(patterns), config))) { diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp new file mode 100644 index 0000000000000..1e2aecaf535a0 --- /dev/null +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -0,0 +1,180 @@ +//===- InlineHLFIRCopyIn.cpp - Inline hlfir.copy_in ops -------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// Transform hlfir.copy_in array operations into loop nests performing element +// per element assignments. For simplicity, the inlining is done for trivial +// data types when the copy_in does not require a corresponding copy_out and +// when the input array is not behind a pointer. This may change in the future. 
+//===----------------------------------------------------------------------===// + +#include "flang/Optimizer/Builder/FIRBuilder.h" +#include "flang/Optimizer/Builder/HLFIRTools.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/HLFIR/HLFIROps.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" + +namespace hlfir { +#define GEN_PASS_DEF_INLINEHLFIRCOPYIN +#include "flang/Optimizer/HLFIR/Passes.h.inc" +} // namespace hlfir + +#define DEBUG_TYPE "inline-hlfir-copy-in" + +static llvm::cl::opt noInlineHLFIRCopyIn( + "no-inline-hlfir-copy-in", + llvm::cl::desc("Do not inline hlfir.copy_in operations"), + llvm::cl::init(false)); + +namespace { +class InlineCopyInConversion : public mlir::OpRewritePattern { +public: + using mlir::OpRewritePattern::OpRewritePattern; + + llvm::LogicalResult + matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const override; +}; + +llvm::LogicalResult +InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, + mlir::PatternRewriter &rewriter) const { + fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); + mlir::Location loc = copyIn.getLoc(); + hlfir::Entity inputVariable{copyIn.getVar()}; + if (!fir::isa_trivial(inputVariable.getFortranElementType())) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's data type is not trivial"); + + if (fir::isPointerType(inputVariable.getType())) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's input variable is a pointer"); + + // There should be exactly one user of WasCopied - the corresponding + // CopyOutOp. + if (copyIn.getWasCopied().getUses().empty()) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp's WasCopied has no uses"); + // The copy out should always be present, either to actually copy or just + // deallocate memory. 
+ auto copyOut = mlir::dyn_cast( + copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + + if (!copyOut) + return rewriter.notifyMatchFailure(copyIn, + "CopyInOp has no direct CopyOut"); + + // Only inline the copy_in when copy_out does not need to be done, i.e. in + // case of intent(in). + if (copyOut.getVar()) + return rewriter.notifyMatchFailure(copyIn, "CopyIn needs a copy-out"); + + inputVariable = + hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Value isContiguous = + builder.create(loc, inputVariable); + mlir::Operation::result_range results = + builder + .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + /*withElseRegion=*/true) + .genThen([&]() { + mlir::Value falseVal = builder.create( + loc, builder.getI1Type(), builder.getBoolAttr(false)); + builder.create( + loc, mlir::ValueRange{inputVariable, falseVal}); + }) + .genElse([&] { + auto [temp, cleanup] = + hlfir::createTempFromMold(loc, builder, inputVariable); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + llvm::SmallVector extents = + hlfir::getIndexExtents(loc, builder, shape); + hlfir::LoopNest loopNest = hlfir::genLoopNest( + loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn)); + builder.setInsertionPointToStart(loopNest.body); + hlfir::Entity elem = hlfir::getElementAt( + loc, builder, inputVariable, loopNest.oneBasedIndices); + elem = hlfir::loadTrivialScalar(loc, builder, elem); + hlfir::Entity tempElem = hlfir::getElementAt( + loc, builder, temp, loopNest.oneBasedIndices); + builder.create(loc, elem, tempElem); + builder.setInsertionPointAfter(loopNest.outerOp); + + mlir::Value result; + // Make sure the result is always a boxed array by boxing it + // ourselves if need be. 
+ if (mlir::isa(temp.getType())) { + result = temp; + } else { + fir::ReferenceType refTy = + fir::ReferenceType::get(temp.getElementOrSequenceType()); + mlir::Value refVal = builder.createConvert(loc, refTy, temp); + result = + builder.create(loc, resultAddrType, refVal); + } + + builder.create(loc, + mlir::ValueRange{result, cleanup}); + }) + .getResults(); + + mlir::OpResult addr = results[0]; + mlir::OpResult needsCleanup = results[1]; + + builder.setInsertionPoint(copyOut); + builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { + auto boxAddr = builder.create(loc, addr); + fir::HeapType heapType = + fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); + mlir::Value heapVal = + builder.createConvert(loc, heapType, boxAddr.getResult()); + builder.create(loc, heapVal); + }); + rewriter.eraseOp(copyOut); + + mlir::Value tempBox = copyIn.getTempBox(); + + rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + + // The TempBox is only needed for flang-rt calls which we're no longer + // generating. It should have no uses left at this stage. + if (!tempBox.getUses().empty()) + return mlir::failure(); + rewriter.eraseOp(tempBox.getDefiningOp()); + + return mlir::success(); +} + +class InlineHLFIRCopyInPass + : public hlfir::impl::InlineHLFIRCopyInBase { +public: + void runOnOperation() override { + mlir::MLIRContext *context = &getContext(); + + mlir::GreedyRewriteConfig config; + // Prevent the pattern driver from merging blocks. 
+ config.setRegionSimplificationLevel( + mlir::GreedySimplifyRegionLevel::Disabled); + + mlir::RewritePatternSet patterns(context); + if (!noInlineHLFIRCopyIn) { + patterns.insert(context); + } + + if (mlir::failed(mlir::applyPatternsGreedily( + getOperation(), std::move(patterns), config))) { + mlir::emitError(getOperation()->getLoc(), + "failure in hlfir.copy_in inlining"); + signalPassFailure(); + } + } +}; +} // namespace diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 77751908e35be..1779623fddc5a 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -255,6 +255,11 @@ void createHLFIRToFIRPassPipeline(mlir::PassManager &pm, bool enableOpenMP, pm, hlfir::createOptimizedBufferization); addNestedPassToAllTopLevelOperations( pm, hlfir::createInlineHLFIRAssign); + + if (optLevel == llvm::OptimizationLevel::O3) { + addNestedPassToAllTopLevelOperations( + pm, hlfir::createInlineHLFIRCopyIn); + } } pm.addPass(hlfir::createLowerHLFIROrderedAssignments()); pm.addPass(hlfir::createLowerHLFIRIntrinsics()); diff --git a/flang/test/HLFIR/inline-hlfir-assign.fir b/flang/test/HLFIR/inline-hlfir-assign.fir index df7681b9c5c16..f834e7971e3d5 100644 --- a/flang/test/HLFIR/inline-hlfir-assign.fir +++ b/flang/test/HLFIR/inline-hlfir-assign.fir @@ -353,147 +353,3 @@ func.func @_QPtest_expr_rhs(%arg0: !fir.ref> // CHECK: return // CHECK: } - -// Test inlining of hlfir.copy_in that does not require the array to be copied out -func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, 
!fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, %c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant true -// CHECK: %[[VAL_4:.*]] = arith.constant false -// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : 
(!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 -// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 -// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { -// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 -// CHECK: } else { -// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} -// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { -// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: %[[VAL_27:.*]] = fir.load 
%[[VAL_26:.*]] : !fir.ref -// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref -// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref -// CHECK: } -// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 -// CHECK: } -// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: fir.if %[[VAL_21:.*]]#1 { -// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> -// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> -// CHECK: } -// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } - -// Test not inlining of hlfir.copy_in that requires the array to be copied out -func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { - %0 = fir.alloca !fir.box>> - %1 = fir.dummy_scope : !fir.dscope - %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) - %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) - %5 = fir.load %2#0 : !fir.ref - %6 = fir.convert %5 : (i32) -> i64 - %c1 = arith.constant 1 : index - %c1_0 = arith.constant 1 : index - %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) - %c1_1 = arith.constant 1 : index - %c0 = arith.constant 0 : index - %8 = arith.subi %7#1, %c1 : index - %9 = arith.addi %8, 
%c1_1 : index - %10 = arith.divsi %9, %c1_1 : index - %11 = arith.cmpi sgt, %10, %c0 : index - %12 = arith.select %11, %10, %c0 : index - %13 = fir.load %3#0 : !fir.ref - %14 = fir.convert %13 : (i32) -> i64 - %15 = fir.shape %12 : (index) -> !fir.shape<1> - %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> - %c100_i32 = arith.constant 100 : i32 - %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) - %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> - %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) - fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () - hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () - hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 - return -} - -// CHECK-LABEL: func.func private @_test_no_inline_copy_in( -// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, -// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, -// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { -// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 -// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index -// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index -// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> -// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope -// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) -// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) -// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref -// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 
-// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) -// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index -// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref -// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 -// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> -// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> -// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) -// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) -// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () -// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 -// CHECK: return -// CHECK: } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir new file mode 100644 index 0000000000000..7140e93f19979 --- /dev/null +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -0,0 +1,146 @@ +// Test inlining of hlfir.copy_in +// RUN: fir-opt --inline-hlfir-copy-in %s | FileCheck %s + +// Test inlining of hlfir.copy_in that does not require the array to be copied out +func.func private @_test_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca 
!fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 + %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#0) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 : (!fir.ref>>>, i1) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant true +// CHECK: %[[VAL_4:.*]] = arith.constant false +// CHECK: %[[VAL_5:.*]] = arith.constant 100 : i32 
+// CHECK: %[[VAL_6:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_7:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_8:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_22:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_8:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_22:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_7:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_6:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_7:.*]]:%[[VAL_13:.*]]#1:%[[VAL_7:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]] = fir.is_contiguous_box %[[VAL_19:.*]] whole : (!fir.box>) -> i1 +// CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { +// CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 +// CHECK: } else { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, 
!fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref +// CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref +// CHECK: hlfir.assign %[[VAL_27:.*]] to %[[VAL_28:.*]] : f64, !fir.ref +// CHECK: } +// CHECK: fir.result %[[VAL_25:.*]]#0, %[[VAL_3:.*]] : !fir.box>, i1 +// CHECK: } +// CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: fir.if %[[VAL_21:.*]]#1 { +// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> +// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> +// CHECK: } +// CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } + +// Test not inlining of hlfir.copy_in that requires the array to be copied out +func.func private @_test_no_inline_copy_in(%arg0: !fir.box> {fir.bindc_name = "x"}, %arg1: !fir.ref {fir.bindc_name = "i"}, %arg2: !fir.ref {fir.bindc_name = "j"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg1 dummy_scope %1 {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %3:2 = hlfir.declare %arg2 dummy_scope %1 {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) + %4:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %5 = fir.load %2#0 : !fir.ref + %6 = fir.convert %5 : (i32) -> i64 
+ %c1 = arith.constant 1 : index + %c1_0 = arith.constant 1 : index + %7:3 = fir.box_dims %4#1, %c1_0 : (!fir.box>, index) -> (index, index, index) + %c1_1 = arith.constant 1 : index + %c0 = arith.constant 0 : index + %8 = arith.subi %7#1, %c1 : index + %9 = arith.addi %8, %c1_1 : index + %10 = arith.divsi %9, %c1_1 : index + %11 = arith.cmpi sgt, %10, %c0 : index + %12 = arith.select %11, %10, %c0 : index + %13 = fir.load %3#0 : !fir.ref + %14 = fir.convert %13 : (i32) -> i64 + %15 = fir.shape %12 : (index) -> !fir.shape<1> + %16 = hlfir.designate %4#0 (%6, %c1:%7#1:%c1_1, %14) shape %15 : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> + %c100_i32 = arith.constant 100 : i32 + %17:2 = hlfir.copy_in %16 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %18 = fir.box_addr %17#0 : (!fir.box>) -> !fir.ref> + %19:3 = hlfir.associate %c100_i32 {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) + fir.call @_QFPsb(%18, %19#1) fastmath : (!fir.ref>, !fir.ref) -> () + hlfir.copy_out %0, %17#1 to %16 : (!fir.ref>>>, i1, !fir.box>) -> () + hlfir.end_associate %19#1, %19#2 : !fir.ref, i1 + return +} + +// CHECK-LABEL: func.func private @_test_no_inline_copy_in( +// CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, +// CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, +// CHECK-SAME: %[[VAL_2:.*]]: !fir.ref {fir.bindc_name = "j"}) { +// CHECK: %[[VAL_3:.*]] = arith.constant 100 : i32 +// CHECK: %[[VAL_4:.*]] = arith.constant 0 : index +// CHECK: %[[VAL_5:.*]] = arith.constant 1 : index +// CHECK: %[[VAL_6:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_7:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_1:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ei"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: %[[VAL_9:.*]]:2 = hlfir.declare %[[VAL_2:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ej"} : (!fir.ref, !fir.dscope) -> (!fir.ref, !fir.ref) +// CHECK: 
%[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0:.*]] dummy_scope %[[VAL_7:.*]] {uniq_name = "_QFFsb2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_8:.*]]#0 : !fir.ref +// CHECK: %[[VAL_12:.*]] = fir.convert %[[VAL_11:.*]] : (i32) -> i64 +// CHECK: %[[VAL_13:.*]]:3 = fir.box_dims %[[VAL_10:.*]]#1, %[[VAL_5:.*]] : (!fir.box>, index) -> (index, index, index) +// CHECK: %[[VAL_14:.*]] = arith.cmpi sgt, %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_15:.*]] = arith.select %[[VAL_14:.*]], %[[VAL_13:.*]]#1, %[[VAL_4:.*]] : index +// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_9:.*]]#0 : !fir.ref +// CHECK: %[[VAL_17:.*]] = fir.convert %[[VAL_16:.*]] : (i32) -> i64 +// CHECK: %[[VAL_18:.*]] = fir.shape %[[VAL_15:.*]] : (index) -> !fir.shape<1> +// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_10:.*]]#0 (%[[VAL_12:.*]], %[[VAL_5:.*]]:%[[VAL_13:.*]]#1:%[[VAL_5:.*]], %[[VAL_17:.*]]) shape %[[VAL_18:.*]] : (!fir.box>, i64, index, index, index, i64, !fir.shape<1>) -> !fir.box> +// CHECK: %[[VAL_20:.*]]:2 = hlfir.copy_in %[[VAL_19:.*]] to %[[VAL_6:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_21:.*]] = fir.box_addr %[[VAL_20:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: %[[VAL_22:.*]]:3 = hlfir.associate %[[VAL_3:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) +// CHECK: fir.call @_QFPsb(%[[VAL_21:.*]], %[[VAL_22:.*]]#1) fastmath : (!fir.ref>, !fir.ref) -> () +// CHECK: hlfir.copy_out %[[VAL_6:.*]], %[[VAL_20:.*]]#1 to %[[VAL_19:.*]] : (!fir.ref>>>, i1, !fir.box>) -> () +// CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 +// CHECK: return +// CHECK: } >From 63f66ae55347275c3f42c456a70dfbb688836fe6 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Wed, 28 May 2025 13:44:53 +0000 Subject: [PATCH 5/7] Support arrays behind a pointer, add metadata to disable vectorizing --- .../flang/Optimizer/Builder/HLFIRTools.h | 8 ++- 
flang/lib/Optimizer/Builder/HLFIRTools.cpp | 13 +++- .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 66 ++++++++++--------- flang/test/HLFIR/inline-hlfir-copy-in.fir | 6 +- 4 files changed, 55 insertions(+), 38 deletions(-) diff --git a/flang/include/flang/Optimizer/Builder/HLFIRTools.h b/flang/include/flang/Optimizer/Builder/HLFIRTools.h index ed00cec04dc39..2cbad6e268a38 100644 --- a/flang/include/flang/Optimizer/Builder/HLFIRTools.h +++ b/flang/include/flang/Optimizer/Builder/HLFIRTools.h @@ -374,12 +374,14 @@ struct LoopNest { /// loop constructs currently. LoopNest genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ValueRange extents, bool isUnordered = false, - bool emitWorkshareLoop = false); + bool emitWorkshareLoop = false, + bool couldVectorize = true); inline LoopNest genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::Value shape, bool isUnordered = false, - bool emitWorkshareLoop = false) { + bool emitWorkshareLoop = false, + bool couldVectorize = true) { return genLoopNest(loc, builder, getIndexExtents(loc, builder, shape), - isUnordered, emitWorkshareLoop); + isUnordered, emitWorkshareLoop, couldVectorize); } /// The type of a callback that generates the body of a reduction diff --git a/flang/lib/Optimizer/Builder/HLFIRTools.cpp b/flang/lib/Optimizer/Builder/HLFIRTools.cpp index f24dc2caeedfc..14aae5d7118a1 100644 --- a/flang/lib/Optimizer/Builder/HLFIRTools.cpp +++ b/flang/lib/Optimizer/Builder/HLFIRTools.cpp @@ -21,6 +21,7 @@ #include "mlir/IR/IRMapping.h" #include "mlir/Support/LLVM.h" #include "llvm/ADT/TypeSwitch.h" +#include #include #include @@ -932,7 +933,8 @@ mlir::Value hlfir::inlineElementalOp( hlfir::LoopNest hlfir::genLoopNest(mlir::Location loc, fir::FirOpBuilder &builder, mlir::ValueRange extents, bool isUnordered, - bool emitWorkshareLoop) { + bool emitWorkshareLoop, + bool couldVectorize) { emitWorkshareLoop = emitWorkshareLoop && isUnordered; hlfir::LoopNest loopNest; assert(!extents.empty() && "must have 
at least one extent"); @@ -967,6 +969,15 @@ hlfir::LoopNest hlfir::genLoopNest(mlir::Location loc, auto ub = builder.createConvert(loc, indexType, extent); auto doLoop = builder.create(loc, one, ub, one, isUnordered); + if (!couldVectorize) { + mlir::LLVM::LoopVectorizeAttr va{mlir::LLVM::LoopVectorizeAttr::get( + builder.getContext(), + /*disable=*/builder.getBoolAttr(true), {}, {}, {}, {}, {}, {})}; + mlir::LLVM::LoopAnnotationAttr la = mlir::LLVM::LoopAnnotationAttr::get( + builder.getContext(), {}, /*vectorize=*/va, {}, /*unroll*/ {}, + /*unroll_and_jam*/ {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}); + doLoop.setLoopAnnotationAttr(la); + } loopNest.body = doLoop.getBody(); builder.setInsertionPointToStart(loopNest.body); // Reverse the indices so they are in column-major order. diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp index 1e2aecaf535a0..d1cbe3241c07b 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -52,19 +52,15 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, return rewriter.notifyMatchFailure(copyIn, "CopyInOp's data type is not trivial"); - if (fir::isPointerType(inputVariable.getType())) - return rewriter.notifyMatchFailure( - copyIn, "CopyInOp's input variable is a pointer"); - // There should be exactly one user of WasCopied - the corresponding // CopyOutOp. - if (copyIn.getWasCopied().getUses().empty()) - return rewriter.notifyMatchFailure(copyIn, - "CopyInOp's WasCopied has no uses"); + if (!copyIn.getWasCopied().hasOneUse()) + return rewriter.notifyMatchFailure( + copyIn, "CopyInOp's WasCopied has no single user"); // The copy out should always be present, either to actually copy or just // deallocate memory. 
auto copyOut = mlir::dyn_cast( - copyIn.getWasCopied().getUsers().begin().getCurrent().getUser()); + copyIn.getWasCopied().user_begin().getCurrent().getUser()); if (!copyOut) return rewriter.notifyMatchFailure(copyIn, @@ -77,28 +73,45 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, inputVariable = hlfir::derefPointersAndAllocatables(loc, builder, inputVariable); - mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); + mlir::Type sequenceType = + hlfir::getFortranElementOrSequenceType(inputVariable.getType()); + fir::BoxType resultBoxType = fir::BoxType::get(sequenceType); mlir::Value isContiguous = builder.create(loc, inputVariable); mlir::Operation::result_range results = builder - .genIfOp(loc, {resultAddrType, builder.getI1Type()}, isContiguous, + .genIfOp(loc, {resultBoxType, builder.getI1Type()}, isContiguous, /*withElseRegion=*/true) .genThen([&]() { - mlir::Value falseVal = builder.create( - loc, builder.getI1Type(), builder.getBoolAttr(false)); + mlir::Value result = inputVariable; + if (fir::isPointerType(inputVariable.getType())) { + auto boxAddr = builder.create(loc, inputVariable); + fir::ReferenceType refTy = fir::ReferenceType::get(sequenceType); + mlir::Value refVal = builder.createConvert(loc, refTy, boxAddr); + mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); + result = builder.create(loc, resultBoxType, refVal, + shape); + } builder.create( - loc, mlir::ValueRange{inputVariable, falseVal}); + loc, mlir::ValueRange{result, builder.createBool(loc, false)}); }) .genElse([&] { - auto [temp, cleanup] = - hlfir::createTempFromMold(loc, builder, inputVariable); mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); llvm::SmallVector extents = hlfir::getIndexExtents(loc, builder, shape); - hlfir::LoopNest loopNest = hlfir::genLoopNest( - loc, builder, extents, /*isUnordered=*/true, - flangomp::shouldUseWorkshareLowering(copyIn)); + llvm::StringRef tmpName{".tmp.copy_in"}; + llvm::SmallVector lenParams; 
+ mlir::Value alloc = builder.createHeapTemporary( + loc, sequenceType, tmpName, extents, lenParams); + + auto declareOp = builder.create( + loc, alloc, tmpName, shape, lenParams, + /*dummy_scope=*/nullptr); + hlfir::Entity temp{declareOp.getBase()}; + hlfir::LoopNest loopNest = + hlfir::genLoopNest(loc, builder, extents, /*isUnordered=*/true, + flangomp::shouldUseWorkshareLowering(copyIn), + /*couldVectorize=*/false); builder.setInsertionPointToStart(loopNest.body); hlfir::Entity elem = hlfir::getElementAt( loc, builder, inputVariable, loopNest.oneBasedIndices); @@ -117,12 +130,12 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, fir::ReferenceType refTy = fir::ReferenceType::get(temp.getElementOrSequenceType()); mlir::Value refVal = builder.createConvert(loc, refTy, temp); - result = - builder.create(loc, resultAddrType, refVal); + result = builder.create(loc, resultBoxType, refVal, + shape); } - builder.create(loc, - mlir::ValueRange{result, cleanup}); + builder.create( + loc, mlir::ValueRange{result, builder.createBool(loc, true)}); }) .getResults(); @@ -140,16 +153,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }); rewriter.eraseOp(copyOut); - mlir::Value tempBox = copyIn.getTempBox(); - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); - - // The TempBox is only needed for flang-rt calls which we're no longer - // generating. It should have no uses left at this stage. 
- if (!tempBox.getUses().empty()) - return mlir::failure(); - rewriter.eraseOp(tempBox.getDefiningOp()); - return mlir::success(); } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir index 7140e93f19979..7a5b6e591f7c7 100644 --- a/flang/test/HLFIR/inline-hlfir-copy-in.fir +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -60,9 +60,9 @@ func.func private @_test_inline_copy_in(%arg0: !fir.box> { // CHECK: %[[VAL_21:.*]]:2 = fir.if %[[VAL_20:.*]] -> (!fir.box>, i1) { // CHECK: fir.result %[[VAL_19:.*]], %[[VAL_4:.*]] : !fir.box>, i1 // CHECK: } else { -// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp", uniq_name = ""} -// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) -// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered { +// CHECK: %[[VAL_24:.*]] = fir.allocmem !fir.array, %[[VAL_15:.*]] {bindc_name = ".tmp.copy_in", uniq_name = ""} +// CHECK: %[[VAL_25:.*]]:2 = hlfir.declare %[[VAL_24:.*]](%[[VAL_18:.*]]) {uniq_name = ".tmp.copy_in"} : (!fir.heap>, !fir.shape<1>) -> (!fir.box>, !fir.heap>) +// CHECK: fir.do_loop %arg3 = %[[VAL_7:.*]] to %[[VAL_15:.*]] step %[[VAL_7:.*]] unordered attributes {loopAnnotation = #loop_annotation} { // CHECK: %[[VAL_26:.*]] = hlfir.designate %[[VAL_19:.*]] (%arg3) : (!fir.box>, index) -> !fir.ref // CHECK: %[[VAL_27:.*]] = fir.load %[[VAL_26:.*]] : !fir.ref // CHECK: %[[VAL_28:.*]] = hlfir.designate %[[VAL_25:.*]]#0 (%arg3) : (!fir.box>, index) -> !fir.ref >From 837d45b0661c29661969407e6ede3f7a70e4739e Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Thu, 29 May 2025 14:10:36 +0000 Subject: [PATCH 6/7] Keep the copy-out to deallocate the temporary Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 20 +++++++------------ flang/test/HLFIR/inline-hlfir-copy-in.fir | 6 +----- 2 
files changed, 8 insertions(+), 18 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp index d1cbe3241c07b..0cad503afe16d 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -139,21 +139,15 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, }) .getResults(); - mlir::OpResult addr = results[0]; + mlir::OpResult resultBox = results[0]; mlir::OpResult needsCleanup = results[1]; - builder.setInsertionPoint(copyOut); - builder.genIfOp(loc, {}, needsCleanup, /*withElseRegion=*/false).genThen([&] { - auto boxAddr = builder.create(loc, addr); - fir::HeapType heapType = - fir::HeapType::get(fir::BoxValue(addr).getBaseTy()); - mlir::Value heapVal = - builder.createConvert(loc, heapType, boxAddr.getResult()); - builder.create(loc, heapVal); - }); - rewriter.eraseOp(copyOut); - - rewriter.replaceOp(copyIn, {addr, builder.genNot(loc, isContiguous)}); + auto alloca = builder.create(loc, resultBox.getType()); + auto store = builder.create(loc, resultBox, alloca); + copyOut->setOperand(0, store.getMemref()); + copyOut->setOperand(1, needsCleanup); + + rewriter.replaceOp(copyIn, {resultBox, builder.genNot(loc, isContiguous)}); return mlir::success(); } diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir index 7a5b6e591f7c7..c1d5e11939b7c 100644 --- a/flang/test/HLFIR/inline-hlfir-copy-in.fir +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -73,11 +73,7 @@ func.func private @_test_inline_copy_in(%arg0: !fir.box> { // CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> // CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref, !fir.ref, i1) // CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath : (!fir.ref>, !fir.ref) -> () -// CHECK: fir.if %[[VAL_21:.*]]#1 
{ -// CHECK: %[[VAL_24:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box>) -> !fir.ref> -// CHECK: %[[VAL_25:.*]] = fir.convert %[[VAL_24:.*]] : (!fir.ref>) -> !fir.heap> -// CHECK: fir.freemem %[[VAL_25:.*]] : !fir.heap> -// CHECK: } +// CHECK: hlfir.copy_out %16, %15#1 : (!fir.ref>>, i1) -> () // CHECK: hlfir.end_associate %[[VAL_23:.*]]#1, %[[VAL_23:.*]]#2 : !fir.ref, i1 // CHECK: return // CHECK: } >From 4c0e8e80d2a23b6afc8862966634b063b6bdfed4 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Fri, 30 May 2025 11:59:31 +0000 Subject: [PATCH 7/7] Use rebox, assumed-rank handling, expand tests Signed-off-by: Kajetan Puchalski --- .../HLFIR/Transforms/InlineHLFIRCopyIn.cpp | 17 +++-- flang/test/HLFIR/inline-hlfir-copy-in.fir | 64 +++++++++++++++++++ 2 files changed, 75 insertions(+), 6 deletions(-) diff --git a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp index 0cad503afe16d..7e8acc515ee26 100644 --- a/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp +++ b/flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRCopyIn.cpp @@ -48,6 +48,7 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, fir::FirOpBuilder builder(rewriter, copyIn.getOperation()); mlir::Location loc = copyIn.getLoc(); hlfir::Entity inputVariable{copyIn.getVar()}; + mlir::Type resultAddrType = copyIn.getCopiedIn().getType(); if (!fir::isa_trivial(inputVariable.getFortranElementType())) return rewriter.notifyMatchFailure(copyIn, "CopyInOp's data type is not trivial"); @@ -66,6 +67,10 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, return rewriter.notifyMatchFailure(copyIn, "CopyInOp has no direct CopyOut"); + if (mlir::cast(resultAddrType).isAssumedRank()) + return rewriter.notifyMatchFailure(copyIn, + "The result array is assumed-rank"); + // Only inline the copy_in when copy_out does not need to be done, i.e. in // case of intent(in). 
if (copyOut.getVar()) @@ -85,12 +90,9 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, .genThen([&]() { mlir::Value result = inputVariable; if (fir::isPointerType(inputVariable.getType())) { - auto boxAddr = builder.create(loc, inputVariable); - fir::ReferenceType refTy = fir::ReferenceType::get(sequenceType); - mlir::Value refVal = builder.createConvert(loc, refTy, boxAddr); - mlir::Value shape = hlfir::genShape(loc, builder, inputVariable); - result = builder.create(loc, resultBoxType, refVal, - shape); + result = builder.create( + loc, resultBoxType, inputVariable, mlir::Value{}, + mlir::Value{}); } builder.create( loc, mlir::ValueRange{result, builder.createBool(loc, false)}); @@ -142,10 +144,13 @@ InlineCopyInConversion::matchAndRewrite(hlfir::CopyInOp copyIn, mlir::OpResult resultBox = results[0]; mlir::OpResult needsCleanup = results[1]; + // Prepare the corresponding copyOut to free the temporary if it is required auto alloca = builder.create(loc, resultBox.getType()); auto store = builder.create(loc, resultBox, alloca); + rewriter.startOpModification(copyOut); copyOut->setOperand(0, store.getMemref()); copyOut->setOperand(1, needsCleanup); + rewriter.finalizeOpModification(copyOut); rewriter.replaceOp(copyIn, {resultBox, builder.genNot(loc, isContiguous)}); return mlir::success(); diff --git a/flang/test/HLFIR/inline-hlfir-copy-in.fir b/flang/test/HLFIR/inline-hlfir-copy-in.fir index c1d5e11939b7c..f3c4b38962a0c 100644 --- a/flang/test/HLFIR/inline-hlfir-copy-in.fir +++ b/flang/test/HLFIR/inline-hlfir-copy-in.fir @@ -34,6 +34,8 @@ func.func private @_test_inline_copy_in(%arg0: !fir.box> { return } +// CHECK: #loop_vectorize = #llvm.loop_vectorize +// CHECK: #loop_annotation = #llvm.loop_annotation // CHECK-LABEL: func.func private @_test_inline_copy_in( // CHECK-SAME: %[[VAL_0:.*]]: !fir.box> {fir.bindc_name = "x"}, // CHECK-SAME: %[[VAL_1:.*]]: !fir.ref {fir.bindc_name = "i"}, @@ -140,3 +142,65 @@ func.func private 
@_test_no_inline_copy_in(%arg0: !fir.box // CHECK: hlfir.end_associate %[[VAL_22:.*]]#1, %[[VAL_22:.*]]#2 : !fir.ref, i1 // CHECK: return // CHECK: } + +// Test not inlining optional dummy arguments (no direct copy-out) +func.func @_QPoptional_copy_in_out(%arg0: !fir.box> {fir.bindc_name = "x", fir.optional}) { + %false = arith.constant false + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg0 dummy_scope %1 {fortran_attrs = #fir.var_attrs, uniq_name = "_QFoptional_copy_in_outEx"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %3 = fir.is_present %2#0 : (!fir.box>) -> i1 + %4:2 = fir.if %3 -> (!fir.ref>, i1) { + %5:2 = hlfir.copy_in %2#0 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + %6 = fir.box_addr %5#0 : (!fir.box>) -> !fir.ref> + fir.result %6, %5#1 : !fir.ref>, i1 + } else { + %5 = fir.absent !fir.ref> + fir.result %5, %false : !fir.ref>, i1 + } + fir.call @_QPtakes_optional_explicit(%4#0) fastmath : (!fir.ref>) -> () + hlfir.copy_out %0, %4#1 : (!fir.ref>>>, i1) -> () + return +} + +// CHECK-LABEL: func.func @_QPoptional_copy_in_out( +// CHECK-SAME: %[[ARG_0:.*]]: !fir.box> {fir.bindc_name = "x", fir.optional}) { +// CHECK: %false = arith.constant false +// CHECK: %[[VAL_0:.*]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_1:.*]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_2:.*]]:2 = hlfir.declare %[[ARG_0:.*]] dummy_scope %[[VAL_1:.*]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFoptional_copy_in_outEx"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_3:.*]] = fir.is_present %[[VAL_2:.*]]#0 : (!fir.box>) -> i1 +// CHECK: %[[VAL_4:.*]]:2 = fir.if %[[VAL_3:.*]] -> (!fir.ref>, i1) { +// CHECK: %[[VAL_5:.*]]:2 = hlfir.copy_in %[[VAL_2:.*]]#0 to %[[VAL_0:.*]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: %[[VAL_6:.*]] = fir.box_addr %[[VAL_5:.*]]#0 : (!fir.box>) -> !fir.ref> +// CHECK: fir.result %[[VAL_6:.*]], %[[VAL_5:.*]]#1 : !fir.ref>, i1 +// CHECK: } else { +// 
CHECK: %[[VAL_5:.*]] = fir.absent !fir.ref> +// CHECK: fir.result %[[VAL_5:.*]], %false : !fir.ref>, i1 +// CHECK: } +// CHECK: fir.call @_QPtakes_optional_explicit(%[[VAL_4:.*]]#0) fastmath : (!fir.ref>) -> () +// CHECK: hlfir.copy_out %[[VAL_0:.*]], %[[VAL_4:.*]]#1 : (!fir.ref>>>, i1) -> () +// CHECK: return +// CHECK: } + +// Test not inlining of assumed-rank arrays +func.func @_QPtest_copy_in_out_2(%arg0: !fir.box> {fir.bindc_name = "x"}) { + %0 = fir.alloca !fir.box>> + %1 = fir.dummy_scope : !fir.dscope + %2:2 = hlfir.declare %arg0 dummy_scope %1 {uniq_name = "_QFtest_copy_in_out_2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) + %3:2 = hlfir.copy_in %2#0 to %0 : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) + fir.call @_QPtakes_contiguous_intentin(%3#0) fastmath : (!fir.box>) -> () + hlfir.copy_out %0, %3#1 : (!fir.ref>>>, i1) -> () + return +} + +// CHECK-LABEL: func.func @_QPtest_copy_in_out_2( +// CHECK-SAME: %[[ARG_0]]: !fir.box> {fir.bindc_name = "x"}) { +// CHECK: %[[VAL_0]] = fir.alloca !fir.box>> +// CHECK: %[[VAL_1]] = fir.dummy_scope : !fir.dscope +// CHECK: %[[VAL_2]]:2 = hlfir.declare %[[ARG_0]] dummy_scope %[[VAL_1]] {uniq_name = "_QFtest_copy_in_out_2Ex"} : (!fir.box>, !fir.dscope) -> (!fir.box>, !fir.box>) +// CHECK: %[[VAL_3]]:2 = hlfir.copy_in %[[VAL_2]]#0 to %[[VAL_0]] : (!fir.box>, !fir.ref>>>) -> (!fir.box>, i1) +// CHECK: fir.call @_QPtakes_contiguous_intentin(%[[VAL_3]]#0) fastmath : (!fir.box>) -> () +// CHECK: hlfir.copy_out %[[VAL_0]], %[[VAL_3]]#1 : (!fir.ref>>>, i1) -> () +// CHECK: return +// CHECK: } From flang-commits at lists.llvm.org Fri May 30 05:33:34 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 30 May 2025 05:33:34 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <6839a59e.170a0220.1192d8.e72f@mx.google.com> https://github.com/eugeneepshteyn updated 
https://github.com/llvm/llvm-project/pull/136827 >From da77ccb2e85eda47d24ffb06281649f7b5fed050 Mon Sep 17 00:00:00 2001 From: Yussur Mustafa Oraji Date: Wed, 23 Apr 2025 10:33:04 +0200 Subject: [PATCH] [flang] Add __COUNTER__ preprocessor macro --- flang/docs/Extensions.md | 2 ++ flang/docs/Preprocessing.md | 12 ++++++++++++ flang/include/flang/Parser/preprocessor.h | 2 ++ flang/lib/Parser/preprocessor.cpp | 4 ++++ flang/test/Preprocessing/counter.F90 | 9 +++++++++ 5 files changed, 29 insertions(+) create mode 100644 flang/test/Preprocessing/counter.F90 diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 05e21ef2d33b5..d18833c066801 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -509,6 +509,8 @@ end * We respect Fortran comments in macro actual arguments (like GNU, Intel, NAG; unlike PGI and XLF) on the principle that macro calls should be treated like function references. Fortran's line continuation methods also work. +* We implement the `__COUNTER__` preprocessing extension, + see [Non-standard Extensions](Preprocessing.md#non-standard-extensions) ## Standard features not silently accepted diff --git a/flang/docs/Preprocessing.md b/flang/docs/Preprocessing.md index 0b70d857833ce..db815b9244edf 100644 --- a/flang/docs/Preprocessing.md +++ b/flang/docs/Preprocessing.md @@ -138,6 +138,18 @@ text. OpenMP-style directives that look like comments are not addressed by this scheme but are obvious extensions. +## Currently implemented built-ins + +* `__DATE__`: Date, given as e.g. "Jun 16 1904" +* `__TIME__`: Time in 24-hour format including seconds, e.g. "09:24:13" +* `__TIMESTAMP__`: Date, time and year of last modification, given as e.g. "Fri May 9 09:16:17 2025" +* `__FILE__`: Current file +* `__LINE__`: Current line + +### Non-standard Extensions + +* `__COUNTER__`: Replaced by sequential integers on each expansion, starting from 0. 
+ ## Appendix `N` in the table below means "not supported"; this doesn't mean a bug, it just means that a particular behavior was diff --git a/flang/include/flang/Parser/preprocessor.h b/flang/include/flang/Parser/preprocessor.h index 86528a7e68def..834c84a639a74 100644 --- a/flang/include/flang/Parser/preprocessor.h +++ b/flang/include/flang/Parser/preprocessor.h @@ -121,6 +121,8 @@ class Preprocessor { std::list names_; std::unordered_map definitions_; std::stack ifStack_; + + unsigned int counterVal_{0}; }; } // namespace Fortran::parser #endif // FORTRAN_PARSER_PREPROCESSOR_H_ diff --git a/flang/lib/Parser/preprocessor.cpp b/flang/lib/Parser/preprocessor.cpp index a47f9c32ad27c..4549b1c505569 100644 --- a/flang/lib/Parser/preprocessor.cpp +++ b/flang/lib/Parser/preprocessor.cpp @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -299,6 +300,7 @@ void Preprocessor::DefineStandardMacros() { Define("__FILE__"s, "__FILE__"s); Define("__LINE__"s, "__LINE__"s); Define("__TIMESTAMP__"s, "__TIMESTAMP__"s); + Define("__COUNTER__"s, "__COUNTER__"s); } void Preprocessor::Define(const std::string &macro, const std::string &value) { @@ -421,6 +423,8 @@ std::optional Preprocessor::MacroReplacement( repl = "\""s + time + '"'; } } + } else if (name == "__COUNTER__") { + repl = std::to_string(counterVal_++); } if (!repl.empty()) { ProvenanceRange insert{allSources_.AddCompilerInsertion(repl)}; diff --git a/flang/test/Preprocessing/counter.F90 b/flang/test/Preprocessing/counter.F90 new file mode 100644 index 0000000000000..9761c8fb7f355 --- /dev/null +++ b/flang/test/Preprocessing/counter.F90 @@ -0,0 +1,9 @@ +! RUN: %flang -E %s | FileCheck %s +! CHECK: print *, 0 +! CHECK: print *, 1 +! CHECK: print *, 2 +! 
Check incremental counter macro +#define foo bar +print *, __COUNTER__ +print *, __COUNTER__ +print *, __COUNTER__ From flang-commits at lists.llvm.org Fri May 30 05:33:57 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 05:33:57 -0700 (PDT) Subject: [flang-commits] [flang] 5c3bf36 - [flang] Add __COUNTER__ preprocessor macro (#136827) Message-ID: <6839a5b5.170a0220.601a4.253f@mx.google.com> Author: Yussur Mustafa Oraji Date: 2025-05-30T08:33:53-04:00 New Revision: 5c3bf36c996e0e8e1b6fcdd2fc116d3e5305df13 URL: https://github.com/llvm/llvm-project/commit/5c3bf36c996e0e8e1b6fcdd2fc116d3e5305df13 DIFF: https://github.com/llvm/llvm-project/commit/5c3bf36c996e0e8e1b6fcdd2fc116d3e5305df13.diff LOG: [flang] Add __COUNTER__ preprocessor macro (#136827) This commit adds support for the `__COUNTER__` preprocessor macro, which works the same as the one found in clang. It is useful to generate unique names at compile-time. Added: flang/test/Preprocessing/counter.F90 Modified: flang/docs/Extensions.md flang/docs/Preprocessing.md flang/include/flang/Parser/preprocessor.h flang/lib/Parser/preprocessor.cpp Removed: ################################################################################ diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md index 51969de5ac7fe..7d5e35a3991dc 100644 --- a/flang/docs/Extensions.md +++ b/flang/docs/Extensions.md @@ -521,6 +521,8 @@ end * We respect Fortran comments in macro actual arguments (like GNU, Intel, NAG; unlike PGI and XLF) on the principle that macro calls should be treated like function references. Fortran's line continuation methods also work. 
+* We implement the `__COUNTER__` preprocessing extension, + see [Non-standard Extensions](Preprocessing.md#non-standard-extensions) ## Standard features not silently accepted diff --git a/flang/docs/Preprocessing.md b/flang/docs/Preprocessing.md index 0b70d857833ce..db815b9244edf 100644 --- a/flang/docs/Preprocessing.md +++ b/flang/docs/Preprocessing.md @@ -138,6 +138,18 @@ text. OpenMP-style directives that look like comments are not addressed by this scheme but are obvious extensions. +## Currently implemented built-ins + +* `__DATE__`: Date, given as e.g. "Jun 16 1904" +* `__TIME__`: Time in 24-hour format including seconds, e.g. "09:24:13" +* `__TIMESTAMP__`: Date, time and year of last modification, given as e.g. "Fri May 9 09:16:17 2025" +* `__FILE__`: Current file +* `__LINE__`: Current line + +### Non-standard Extensions + +* `__COUNTER__`: Replaced by sequential integers on each expansion, starting from 0. + ## Appendix `N` in the table below means "not supported"; this doesn't mean a bug, it just means that a particular behavior was diff --git a/flang/include/flang/Parser/preprocessor.h b/flang/include/flang/Parser/preprocessor.h index 15810a34ee6a5..bb13b4463fa80 100644 --- a/flang/include/flang/Parser/preprocessor.h +++ b/flang/include/flang/Parser/preprocessor.h @@ -122,6 +122,8 @@ class Preprocessor { std::list names_; std::unordered_map definitions_; std::stack ifStack_; + + unsigned int counterVal_{0}; }; } // namespace Fortran::parser #endif // FORTRAN_PARSER_PREPROCESSOR_H_ diff --git a/flang/lib/Parser/preprocessor.cpp b/flang/lib/Parser/preprocessor.cpp index a5de14d864762..ef24c704db885 100644 --- a/flang/lib/Parser/preprocessor.cpp +++ b/flang/lib/Parser/preprocessor.cpp @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -299,6 +300,7 @@ void Preprocessor::DefineStandardMacros() { Define("__FILE__"s, "__FILE__"s); Define("__LINE__"s, "__LINE__"s); Define("__TIMESTAMP__"s, "__TIMESTAMP__"s); + Define("__COUNTER__"s, 
"__COUNTER__"s); } static const std::string idChars{ @@ -495,6 +497,8 @@ std::optional Preprocessor::MacroReplacement( repl = "\""s + time + '"'; } } + } else if (name == "__COUNTER__") { + repl = std::to_string(counterVal_++); } if (!repl.empty()) { ProvenanceRange insert{allSources_.AddCompilerInsertion(repl)}; diff --git a/flang/test/Preprocessing/counter.F90 b/flang/test/Preprocessing/counter.F90 new file mode 100644 index 0000000000000..9761c8fb7f355 --- /dev/null +++ b/flang/test/Preprocessing/counter.F90 @@ -0,0 +1,9 @@ +! RUN: %flang -E %s | FileCheck %s +! CHECK: print *, 0 +! CHECK: print *, 1 +! CHECK: print *, 2 +! Check incremental counter macro +#define foo bar +print *, __COUNTER__ +print *, __COUNTER__ +print *, __COUNTER__ From flang-commits at lists.llvm.org Fri May 30 05:33:59 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 30 May 2025 05:33:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <6839a5b7.170a0220.123f9b.ea12@mx.google.com> https://github.com/eugeneepshteyn closed https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Fri May 30 05:34:18 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 05:34:18 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Add __COUNTER__ preprocessor macro (PR #136827) In-Reply-To: Message-ID: <6839a5ca.170a0220.922b5.e450@mx.google.com> github-actions[bot] wrote: @N00byKing Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our [build bots](https://lab.llvm.org/buildbot/). If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. 
It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail [here](https://llvm.org/docs/MyFirstTypoFix.html#myfirsttypofix-issues-after-landing-your-pr). If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of [LLVM development](https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy). You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done! https://github.com/llvm/llvm-project/pull/136827 From flang-commits at lists.llvm.org Fri May 30 05:27:36 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 30 May 2025 05:27:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #140066) In-Reply-To: Message-ID: <6839a438.050a0220.7d3a7.f9e3@mx.google.com> ================ @@ -290,6 +290,10 @@ class SemanticsContext { // Top-level ProgramTrees are owned by the SemanticsContext for persistence. ProgramTree &SaveProgramTree(ProgramTree &&); + // Store (and get a reference to the stored string) for mangled names + // used for OpenMP DECLARE REDUCTION. + std::string &StoreUserReductionName(const std::string &name); ---------------- kiranchandramohan wrote: I was hoping we could get this to work without the requirement for a string storage like this. 
https://github.com/llvm/llvm-project/pull/140066 From flang-commits at lists.llvm.org Fri May 30 05:27:36 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 30 May 2025 05:27:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #140066) In-Reply-To: Message-ID: <6839a438.170a0220.d8a6e.e330@mx.google.com> ================ @@ -3508,6 +3509,17 @@ bool OmpStructureChecker::CheckReductionOperator( break; } } + // User-defined operators are OK if there has been a declared reduction + // for that. We mangle those names to store the user details. + if (const auto *definedOp{std::get_if(&dOpr.u)}) { + std::string mangled = MangleDefinedOperator(definedOp->v.symbol->name()); + const Scope &scope = definedOp->v.symbol->owner(); + if (const Symbol *symbol = scope.FindSymbol(mangled)) { ---------------- kiranchandramohan wrote: Nit: Braced initialization for all three assignments here. https://github.com/llvm/llvm-project/pull/140066 From flang-commits at lists.llvm.org Fri May 30 05:27:37 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 30 May 2025 05:27:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang][openmp]Add UserReductionDetails and use in DECLARE REDUCTION (PR #140066) In-Reply-To: Message-ID: <6839a439.170a0220.164cc8.e4ef@mx.google.com> ================ @@ -1785,14 +1794,91 @@ void OmpVisitor::ProcessMapperSpecifier(const parser::OmpMapperSpecifier &spec, PopScope(); } +parser::CharBlock MakeNameFromOperator( + const parser::DefinedOperator::IntrinsicOperator &op, + SemanticsContext &context) { + switch (op) { + case parser::DefinedOperator::IntrinsicOperator::Multiply: + return parser::CharBlock{"op.*", 4}; + case parser::DefinedOperator::IntrinsicOperator::Add: + return parser::CharBlock{"op.+", 4}; + case parser::DefinedOperator::IntrinsicOperator::Subtract: + return parser::CharBlock{"op.-", 4}; + + case 
parser::DefinedOperator::IntrinsicOperator::AND: + return parser::CharBlock{"op.AND", 6}; + case parser::DefinedOperator::IntrinsicOperator::OR: + return parser::CharBlock{"op.OR", 6}; + case parser::DefinedOperator::IntrinsicOperator::EQV: + return parser::CharBlock{"op.EQV", 7}; + case parser::DefinedOperator::IntrinsicOperator::NEQV: + return parser::CharBlock{"op.NEQV", 8}; + + default: + context.Say("Unsupported operator in DECLARE REDUCTION"_err_en_US); + return parser::CharBlock{"op.?", 4}; + } +} + +parser::CharBlock MangleSpecialFunctions(const parser::CharBlock &name) { + return llvm::StringSwitch(name.ToString()) + .Case("max", {"op.max", 6}) + .Case("min", {"op.min", 6}) + .Case("iand", {"op.iand", 7}) + .Case("ior", {"op.ior", 6}) + .Case("ieor", {"op.ieor", 7}) + .Default(name); +} + +std::string MangleDefinedOperator(const parser::CharBlock &name) { + CHECK(name[0] == '.' && name[name.size() - 1] == '.'); + return "op" + name.ToString(); +} + +template void OmpVisitor::ProcessReductionSpecifier( const parser::OmpReductionSpecifier &spec, - const std::optional &clauses) { + const std::optional &clauses, ---------------- kiranchandramohan wrote: The changes in this function look a bit hacky. Ideally we shouldn't have random `parser::Name` not attached to the parse-tree and floating symbols. 
https://github.com/llvm/llvm-project/pull/140066 From flang-commits at lists.llvm.org Fri May 30 06:36:06 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Fri, 30 May 2025 06:36:06 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [NFC][OpenMP] Move the default declare mapper name suffix to OMPConstants.h (PR #141964) In-Reply-To: Message-ID: <6839b446.a70a0220.29f43f.ff9b@mx.google.com> https://github.com/TIFitis updated https://github.com/llvm/llvm-project/pull/141964 >From 9daed33f9403a8c45f6ecba6fc7bc6dcac0f83c1 Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Thu, 29 May 2025 16:12:18 +0100 Subject: [PATCH 1/2] [NFC][OpenMP] Move the default declare mapper name suffix to OMPConstants.h This patch moves the default declare mapper name suffix ".omp.default.mapper" to the OMPConstants.h file to be used everywhere for lowering. --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- flang/lib/Parser/openmp-parsers.cpp | 2 +- llvm/include/llvm/Frontend/OpenMP/OMPConstants.h | 3 +++ 4 files changed, 6 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index ebdda9885d5c2..49a3b64c03a7e 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1148,7 +1148,7 @@ void ClauseProcessor::processMapObjects( typeSpec = &object.sym()->GetType()->derivedTypeSpec(); if (typeSpec) { - mapperIdName = typeSpec->name().ToString() + ".omp.default.mapper"; + mapperIdName = typeSpec->name().ToString() + llvm::omp::OmpDefaultMapperName; if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) mapperIdName = converter.mangleName(mapperIdName, sym->owner()); } diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ddb08f74b3841..fa711060c6b90 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ 
-2423,7 +2423,7 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) { auto &typeSpec = sym.GetType()->derivedTypeSpec(); std::string mapperIdName = - typeSpec.name().ToString() + ".omp.default.mapper"; + typeSpec.name().ToString() + llvm::omp::OmpDefaultMapperName; if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) mapperIdName = converter.mangleName(mapperIdName, sym->owner()); if (converter.getModuleOp().lookupSymbol(mapperIdName)) diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index c08cd1ab80559..08326fad8c143 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1402,7 +1402,7 @@ static OmpMapperSpecifier ConstructOmpMapperSpecifier( // This matches the syntax: :: if (DerivedTypeSpec * derived{std::get_if(&typeSpec.u)}) { return OmpMapperSpecifier{ - std::get(derived->t).ToString() + ".omp.default.mapper", + std::get(derived->t).ToString() + llvm::omp::OmpDefaultMapperName, std::move(typeSpec), std::move(varName)}; } return OmpMapperSpecifier{std::string("omp.default.mapper"), diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h b/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h index 338b56226f204..6e1bce12af8e4 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h +++ b/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h @@ -190,6 +190,9 @@ enum class OMPScheduleType { LLVM_MARK_AS_BITMASK_ENUM(/* LargestValue */ ModifierMask) }; +// Default OpenMP mapper name suffix. +inline constexpr const char *OmpDefaultMapperName = ".omp.default.mapper"; + /// Values for bit flags used to specify the mapping type for /// offloading. enum class OpenMPOffloadMappingFlags : uint64_t { >From 38887433567aea17c017c45a5fc3fb58ed00384c Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Fri, 30 May 2025 14:35:41 +0100 Subject: [PATCH 2/2] Fix clang format. 
--- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 49a3b64c03a7e..99b0d4dd696d4 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1148,7 +1148,8 @@ void ClauseProcessor::processMapObjects( typeSpec = &object.sym()->GetType()->derivedTypeSpec(); if (typeSpec) { - mapperIdName = typeSpec->name().ToString() + llvm::omp::OmpDefaultMapperName; + mapperIdName = + typeSpec->name().ToString() + llvm::omp::OmpDefaultMapperName; if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) mapperIdName = converter.mangleName(mapperIdName, sym->owner()); } From flang-commits at lists.llvm.org Fri May 30 06:39:07 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 06:39:07 -0700 (PDT) Subject: [flang-commits] [flang] 99ae675 - [NFC][OpenMP] Move the default declare mapper name suffix to OMPConstants.h (#141964) Message-ID: <6839b4fb.170a0220.1f6b4f.ea47@mx.google.com> Author: Akash Banerjee Date: 2025-05-30T14:39:03+01:00 New Revision: 99ae675fb7957f3eb8b65e9086dae4bbc722f221 URL: https://github.com/llvm/llvm-project/commit/99ae675fb7957f3eb8b65e9086dae4bbc722f221 DIFF: https://github.com/llvm/llvm-project/commit/99ae675fb7957f3eb8b65e9086dae4bbc722f221.diff LOG: [NFC][OpenMP] Move the default declare mapper name suffix to OMPConstants.h (#141964) This patch moves the default declare mapper name suffix ".omp.default.mapper" to the OMPConstants.h file to be used everywhere for lowering. 
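For readers skimming the archive, the pattern described in the log above (one shared suffix constant instead of a repeated string literal) can be sketched roughly as follows. This is only an illustration under stated assumptions: the namespace `sketch`, the constant `DefaultMapperSuffix`, and the helper `defaultMapperId` are hypothetical stand-ins and are not part of the patch; the real constant is `llvm::omp::OmpDefaultMapperName` in OMPConstants.h.

```cpp
#include <string>

namespace sketch {

// Hypothetical stand-in for llvm::omp::OmpDefaultMapperName. Declaring it
// `inline constexpr` (C++17) lets a header define it once for all users.
inline constexpr const char *DefaultMapperSuffix = ".omp.default.mapper";

// Hypothetical helper: builds the default mapper identifier for a derived
// type by appending the shared suffix to the type name.
inline std::string defaultMapperId(const std::string &typeName) {
  return typeName + DefaultMapperSuffix;
}

} // namespace sketch
```

With a single definition, the call sites touched by the patch cannot drift apart if the suffix ever changes; for example, `sketch::defaultMapperId("my_type")` yields `my_type.omp.default.mapper`.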
Added: Modified: flang/lib/Lower/OpenMP/ClauseProcessor.cpp flang/lib/Lower/OpenMP/OpenMP.cpp flang/lib/Parser/openmp-parsers.cpp llvm/include/llvm/Frontend/OpenMP/OMPConstants.h Removed: ################################################################################ diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index f9a6f9506f510..88baad8827e92 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1156,7 +1156,8 @@ void ClauseProcessor::processMapObjects( typeSpec = &object.sym()->GetType()->derivedTypeSpec(); if (typeSpec) { - mapperIdName = typeSpec->name().ToString() + ".omp.default.mapper"; + mapperIdName = + typeSpec->name().ToString() + llvm::omp::OmpDefaultMapperName; if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) mapperIdName = converter.mangleName(mapperIdName, sym->owner()); } diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 19e106e2593ab..6892e571e62a3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2424,7 +2424,7 @@ genTargetOp(lower::AbstractConverter &converter, lower::SymMap &symTable, if (sym.GetType()->category() == semantics::DeclTypeSpec::TypeDerived) { auto &typeSpec = sym.GetType()->derivedTypeSpec(); std::string mapperIdName = - typeSpec.name().ToString() + ".omp.default.mapper"; + typeSpec.name().ToString() + llvm::omp::OmpDefaultMapperName; if (auto *sym = converter.getCurrentScope().FindSymbol(mapperIdName)) mapperIdName = converter.mangleName(mapperIdName, sym->owner()); if (converter.getModuleOp().lookupSymbol(mapperIdName)) diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index c08cd1ab80559..08326fad8c143 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -1402,7 +1402,7 @@ static OmpMapperSpecifier ConstructOmpMapperSpecifier( // This 
matches the syntax: :: if (DerivedTypeSpec * derived{std::get_if(&typeSpec.u)}) { return OmpMapperSpecifier{ - std::get(derived->t).ToString() + ".omp.default.mapper", + std::get(derived->t).ToString() + llvm::omp::OmpDefaultMapperName, std::move(typeSpec), std::move(varName)}; } return OmpMapperSpecifier{std::string("omp.default.mapper"), diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h b/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h index 338b56226f204..6e1bce12af8e4 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h +++ b/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h @@ -190,6 +190,9 @@ enum class OMPScheduleType { LLVM_MARK_AS_BITMASK_ENUM(/* LargestValue */ ModifierMask) }; +// Default OpenMP mapper name suffix. +inline constexpr const char *OmpDefaultMapperName = ".omp.default.mapper"; + /// Values for bit flags used to specify the mapping type for /// offloading. enum class OpenMPOffloadMappingFlags : uint64_t { From flang-commits at lists.llvm.org Fri May 30 06:39:10 2025 From: flang-commits at lists.llvm.org (Akash Banerjee via flang-commits) Date: Fri, 30 May 2025 06:39:10 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [NFC][OpenMP] Move the default declare mapper name suffix to OMPConstants.h (PR #141964) In-Reply-To: Message-ID: <6839b4fe.170a0220.174b1b.eae1@mx.google.com> https://github.com/TIFitis closed https://github.com/llvm/llvm-project/pull/141964 From flang-commits at lists.llvm.org Fri May 30 06:49:43 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 30 May 2025 06:49:43 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #142073) In-Reply-To: Message-ID: <6839b777.170a0220.2f94c5.f786@mx.google.com> https://github.com/tarunprabhu approved this pull request. 
LGTM https://github.com/llvm/llvm-project/pull/142073 From flang-commits at lists.llvm.org Fri May 30 06:50:23 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 06:50:23 -0700 (PDT) Subject: [flang-commits] [flang] ce9cef7 - [flang] Add support for -mprefer-vector-width= (#142073) Message-ID: <6839b79f.050a0220.233dc.b224@mx.google.com> Author: Cameron McInally Date: 2025-05-30T07:50:18-06:00 New Revision: ce9cef79ea3f1ee86e4dc674d4c05b2fa8b3c7a8 URL: https://github.com/llvm/llvm-project/commit/ce9cef79ea3f1ee86e4dc674d4c05b2fa8b3c7a8 DIFF: https://github.com/llvm/llvm-project/commit/ce9cef79ea3f1ee86e4dc674d4c05b2fa8b3c7a8.diff LOG: [flang] Add support for -mprefer-vector-width= (#142073) This patch adds support for the -mprefer-vector-width= command line option. The parsing of this option is equivalent to Clang's, and it is implemented by setting the "prefer-vector-width" function attribute. Co-authored-by: Cameron McInally Added: flang/test/Driver/prefer-vector-width.f90 mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll mlir/test/Target/LLVMIR/prefer-vector-width.mlir Modified: clang/include/clang/Driver/Options.td flang/include/flang/Frontend/CodeGenOptions.h flang/include/flang/Optimizer/Transforms/Passes.td flang/include/flang/Tools/CrossToolHelpers.h flang/lib/Frontend/CompilerInvocation.cpp flang/lib/Frontend/FrontendActions.cpp flang/lib/Optimizer/Passes/Pipelines.cpp flang/lib/Optimizer/Transforms/FunctionAttr.cpp mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td mlir/lib/Target/LLVMIR/ModuleImport.cpp mlir/lib/Target/LLVMIR/ModuleTranslation.cpp Removed: ################################################################################ diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 6e72576862890..5ca31c253ed8f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -5480,7 +5480,7 @@ def mrecip_EQ : CommaJoined<["-"], "mrecip=">, 
Group, " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">, MarshallingInfoStringVector>; def mprefer_vector_width_EQ : Joined<["-"], "mprefer-vector-width=">, Group, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Specifies preferred vector width for auto-vectorization. Defaults to 'none' which allows target specific decisions.">, MarshallingInfoString>; def mstack_protector_guard_EQ : Joined<["-"], "mstack-protector-guard=">, Group, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 2b4e823b3fef4..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -53,6 +53,9 @@ class CodeGenOptions : public CodeGenOptionsBase { /// The paths to the pass plugins that were registered using -fpass-plugin. std::vector LLVMPassPlugins; + // The prefered vector width, if requested by -mprefer-vector-width. + std::string PreferVectorWidth; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. 
std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index 71493535db8ba..1b1970412676d 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -429,6 +429,10 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"preferVectorWidth", "prefer-vector-width", "std::string", + /*default=*/"", + "Set the prefer-vector-width attribute on functions in the " + "module.">, Option<"tuneCPU", "tune-cpu", "std::string", /*default=*/"", "Set the tune-cpu attribute on functions in the module.">, Option<"setNoCapture", "set-nocapture", "bool", /*default=*/"false", diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 118695bbe2626..058024a4a04c5 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; InstrumentFunctionExit = "__cyg_profile_func_exit"; @@ -126,6 +127,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for + ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. 
bool EnableOpenMP = false; ///< Enable OpenMP lowering. std::string InstrumentFunctionEntry = diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..90a002929eff0 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -309,6 +309,20 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, for (auto *a : args.filtered(clang::driver::options::OPT_fpass_plugin_EQ)) opts.LLVMPassPlugins.push_back(a->getValue()); + // -mprefer_vector_width option + if (const llvm::opt::Arg *a = args.getLastArg( + clang::driver::options::OPT_mprefer_vector_width_EQ)) { + llvm::StringRef s = a->getValue(); + unsigned width; + if (s == "none") + opts.PreferVectorWidth = "none"; + else if (s.getAsInteger(10, width)) + diags.Report(clang::diag::err_drv_invalid_value) + << a->getAsString(args) << a->getValue(); + else + opts.PreferVectorWidth = s.str(); + } + // -fembed-offload-object option for (auto *a : args.filtered(clang::driver::options::OPT_fembed_offload_object_EQ)) diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 38dfaadf1dff9..012d0fdfe645f 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -741,6 +741,8 @@ void CodeGenAction::generateLLVMIR() { config.VScaleMax = vsr->second; } + config.PreferVectorWidth = opts.PreferVectorWidth; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 378913fcb1329..0c774eede4c9a 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -358,7 +358,7 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, 
config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - /*tuneCPU=*/"", setNoCapture, setNoAlias})); + config.PreferVectorWidth, /*tuneCPU=*/"", setNoCapture, setNoAlias})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index c8cdba0d6f9c4..041aa8717d20e 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -107,6 +107,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!preferVectorWidth.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, preferVectorWidth)); LLVM_DEBUG(llvm::dbgs() << "=== End " DEBUG_TYPE " ===\n"); } diff --git a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90 new file mode 100644 index 0000000000000..d0f5fd28db826 --- /dev/null +++ b/flang/test/Driver/prefer-vector-width.f90 @@ -0,0 +1,16 @@ +! Test that -mprefer-vector-width works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! RUN: %flang_fc1 -mprefer-vector-width=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! RUN: %flang_fc1 -mprefer-vector-width=128 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-128 +! RUN: %flang_fc1 -mprefer-vector-width=256 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-256 +! RUN: not %flang_fc1 -mprefer-vector-width=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INVALID + +subroutine func +end subroutine func + +! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} } +! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } +! 
CHECK-128: attributes #0 = { "prefer-vector-width"="128" } +! CHECK-256: attributes #0 = { "prefer-vector-width"="256" } +! CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index 6fde45ce5c556..c0324d561b77b 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1893,6 +1893,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$frame_pointer, OptionalAttr:$target_cpu, OptionalAttr:$tune_cpu, + OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index b049064fbd31c..85417da798b22 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2636,6 +2636,10 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, funcOp.setTargetFeaturesAttr( LLVM::TargetFeaturesAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("prefer-vector-width"); + attr.isStringAttribute()) + funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index e3264985ecd6e..4cc419c7cde5b 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1549,6 +1549,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) { if (auto tuneCpu = func.getTuneCpu()) llvmFunc->addFnAttr("tune-cpu", *tuneCpu); + if (auto preferVectorWidth = func.getPreferVectorWidth()) + llvmFunc->addFnAttr("prefer-vector-width", *preferVectorWidth); + if (auto attr = 
func.getVscaleRange()) llvmFunc->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs( getLLVMContext(), attr->getMinRange().getInt(), diff --git a/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll new file mode 100644 index 0000000000000..e30ef04924b81 --- /dev/null +++ b/mlir/test/Target/LLVMIR/Import/prefer-vector-width.ll @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @prefer_vector_width() +; CHECK-SAME: prefer_vector_width = "128" +define void @prefer_vector_width() #0 { + ret void +} + +attributes #0 = { "prefer-vector-width"="128" } diff --git a/mlir/test/Target/LLVMIR/prefer-vector-width.mlir b/mlir/test/Target/LLVMIR/prefer-vector-width.mlir new file mode 100644 index 0000000000000..7410e8139fd31 --- /dev/null +++ b/mlir/test/Target/LLVMIR/prefer-vector-width.mlir @@ -0,0 +1,8 @@ +// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s + +// CHECK: define void @prefer_vector_width() #[[ATTRS:.*]] { +// CHECK: attributes #[[ATTRS]] = { "prefer-vector-width"="128" } + +llvm.func @prefer_vector_width() attributes {prefer_vector_width = "128"} { + llvm.return +} From flang-commits at lists.llvm.org Fri May 30 06:50:24 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 30 May 2025 06:50:24 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #142073) In-Reply-To: Message-ID: <6839b7a0.a70a0220.2826eb.012d@mx.google.com> https://github.com/tarunprabhu closed https://github.com/llvm/llvm-project/pull/142073 From flang-commits at lists.llvm.org Fri May 30 06:57:56 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Fri, 30 May 2025 06:57:56 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #142073) In-Reply-To: Message-ID: 
<6839b964.170a0220.24229e.430d@mx.google.com>

mcinally wrote:

Thanks, @tarunprabhu!

https://github.com/llvm/llvm-project/pull/142073

From flang-commits at lists.llvm.org Fri May 30 07:13:58 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Fri, 30 May 2025 07:13:58 -0700 (PDT)
Subject: [flang-commits] [flang] d27a210 - Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (#136098)
Message-ID: <6839bd26.050a0220.2d2390.14a5@mx.google.com>

Author: FYK
Date: 2025-05-30T08:13:53-06:00
New Revision: d27a210a77af63568db9f829702b4b2c98473a46

URL: https://github.com/llvm/llvm-project/commit/d27a210a77af63568db9f829702b4b2c98473a46
DIFF: https://github.com/llvm/llvm-project/commit/d27a210a77af63568db9f829702b4b2c98473a46.diff

LOG: Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (#136098)

This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags:
- `-fprofile-generate` for instrumentation-based profile generation
- `-fprofile-use=/file` for profile-guided optimization

Resolves #74216 (implements IR PGO support phase)

**Key changes:**
- Frontend flag handling aligned with Clang/GCC semantics
- Instrumentation hooks into LLVM PGO infrastructure
- LIT tests verifying:
  - Instrumentation metadata generation
  - Profile loading from specified path
  - Branch weight attribution (IR checks)

**Tests:**
- Added gcc-flag-compatibility.f90 test module verifying:
  - Flag parsing boundary conditions
  - IR-level profile annotation consistency
  - Profile input path normalization rules
- SPEC2006 benchmark results will be shared in comments

For details on LLVM's PGO framework, refer to [Clang PGO Documentation](https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization).

This implementation was developed by [XSCC Compiler Team](https://github.com/orgs/OpenXiangShan/teams/xscc).
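
As a rough illustration of the flag handling described in the log, the sketch below models how the two flags might map onto code-generation options. It is a standalone approximation, not the Flang sources: `ProfileKind`, `PgoOptions`, and `parsePgoArgs` are hypothetical names, chosen to echo the `llvm::driver::ProfileInstrKind` enumerators and the `parseCodeGenArgs` changes in the diff that follows. The "last occurrence wins" behaviour for `-fprofile-use=` is assumed from the patch's use of `getLastArg`.

```cpp
// Standalone sketch under stated assumptions; none of these names exist in
// LLVM or Flang. The enumerators loosely mirror ProfileInstrKind in the patch.
#include <string>
#include <vector>

enum class ProfileKind { None, ClangInstr, IRInstr, CSIRInstr };

struct PgoOptions {
  ProfileKind instr = ProfileKind::None; // set by -fprofile-generate
  ProfileKind use = ProfileKind::None;   // set by -fprofile-use=...
  std::string usePath;                   // profile file named by -fprofile-use=
};

// Scan the argument list. Both flags select IR-level PGO; a later
// -fprofile-use= overrides an earlier one (assumed to match getLastArg).
PgoOptions parsePgoArgs(const std::vector<std::string> &args) {
  const std::string usePrefix = "-fprofile-use=";
  PgoOptions opts;
  for (const std::string &arg : args) {
    if (arg == "-fprofile-generate") {
      opts.instr = ProfileKind::IRInstr;
    } else if (arg.compare(0, usePrefix.size(), usePrefix) == 0) {
      opts.use = ProfileKind::IRInstr;
      opts.usePath = arg.substr(usePrefix.size());
    }
  }
  return opts;
}
```

In the real workflow exercised by the tests in this patch, a program is first built with `-fprofile-generate`, the collected profile is converted with `llvm-profdata merge`, and the result is then passed back through `-fprofile-use=`.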
--------- Co-authored-by: ict-ql <168183727+ict-ql at users.noreply.github.com> Co-authored-by: Tom Eccles Added: flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext flang/test/Profile/gcc-flag-compatibility.f90 Modified: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Basic/CodeGenOptions.h clang/include/clang/Basic/ProfileList.h clang/include/clang/Driver/Options.td clang/lib/Basic/ProfileList.cpp clang/lib/CodeGen/BackendUtil.cpp clang/lib/CodeGen/CodeGenAction.cpp clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/Driver/ToolChains/Flang.cpp clang/lib/Frontend/CompilerInvocation.cpp flang/include/flang/Frontend/CodeGenOptions.def flang/include/flang/Frontend/CodeGenOptions.h flang/lib/Frontend/CompilerInvocation.cpp flang/lib/Frontend/FrontendActions.cpp flang/test/Driver/flang-f-opts.f90 llvm/include/llvm/Frontend/Driver/CodeGenOptions.h llvm/lib/Frontend/Driver/CodeGenOptions.cpp Removed: ################################################################################ diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index aad4e107cbeb3..11dad53a52efe 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,9 +223,11 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 4, ProfileNone) + +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 4, llvm::driver::ProfileInstrKind::ProfileNone) + /// Choose profile kind for PGO use compilation. 
-ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index 278803f7bb960..bffbd00b1bd72 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -518,35 +518,41 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == ProfileClangInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile instrumentation is on. bool hasProfileIRInstr() const { - return getProfileInstr() == ProfileIRInstr; + return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == ProfileCSIRInstr; + return getProfileInstr() == + llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if any form of instrumentation is on. - bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } + bool hasProfileInstr() const { + return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; + } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == ProfileClangInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; } /// Check if IR level profile use is on. 
bool hasProfileIRUse() const { - return getProfileUse() == ProfileIRInstr || - getProfileUse() == ProfileCSIRInstr; + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + } /// Check if type and variable info should be emitted. bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index b4217e49c18a3..5338ef3992ade 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,17 +49,16 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; + ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const; + llvm::driver::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; }; } // namespace clang diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 5ca31c253ed8f..5c79c66b55eb3 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1772,7 +1772,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFFFFFE">; def fprofile_generate : 
Flag<["-"], "fprofile-generate">, - Group, Visibility<[ClangOption, CLOption]>, + Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">, Group, Visibility<[ClangOption, CLOption]>, @@ -1789,7 +1789,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group, Visibility<[ClangOption, CLOption]>, Alias; def fprofile_use_EQ : Joined<["-"], "fprofile-use=">, Group, - Visibility<[ClangOption, CLOption]>, + Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, MetaVarName<"">, HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. Otherwise, it reads from file .">; def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">, diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 2d37014294b92..bea65579f396b 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -70,24 +70,24 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { +static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { switch (Kind) { - case CodeGenOptions::ProfileNone: + case llvm::driver::ProfileInstrKind::ProfileNone: return ""; - case CodeGenOptions::ProfileClangInstr: + case llvm::driver::ProfileInstrKind::ProfileClangInstr: return "clang"; - case CodeGenOptions::ProfileIRInstr: + case llvm::driver::ProfileInstrKind::ProfileIRInstr: return "llvm"; - case CodeGenOptions::ProfileCSIRInstr: + case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: return "csllvm"; case CodeGenOptions::ProfileIRSampleColdCov: return "sample-coldcov"; } - llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); + 
llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { +ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if (SCL->inSection(Section, "default", "allow")) @@ -118,7 +118,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -132,13 +132,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - CodeGenOptions::ProfileInstrKind Kind) const { + llvm::driver::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index cd5fc48c4a22b..03e10b1138a71 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -123,17 +123,10 @@ namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } -// Default filename used for profile generation. -static std::string getDefaultProfileGenName() { - return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE - ? 
"default_%m.proflite" - : "default_%m.profraw"; -} - // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? getDefaultProfileGenName() + ? llvm::driver::getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -835,12 +828,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. - PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, - PGOOptions::IRInstr, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, - CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions( + getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? 
PGOOptions::CSIRUse @@ -848,31 +841,32 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, ClPGOColdFuncAttr, + PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) - PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); - else if (CodeGenOpts.PseudoProbeForProfiling) - // -fpseudo-probe-for-profiling PGOOpt = - PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, + PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); + llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); + else if (CodeGenOpts.PseudoProbeForProfiling) + // -fpseudo-probe-for-profiling + PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + llvm::ClPGOColdFuncAttr, + CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - ClPGOColdFuncAttr, true); + llvm::ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS 
profile. if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 1f5eb427b566f..5493cc92bd8b0 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && - CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) + if (OptRecordFile && CodeGenOpts.getProfileUse() != + llvm::driver::ProfileInstrKind::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 4193f0a1b278f..30aec87c909eb 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -940,7 +940,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != + llvm::driver::ProfileInstrKind::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 6d2c705338ecf..264f1bdee81c6 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3601,7 +3601,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, // If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. 
if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index dcc46469df3e9..e303631cc1d57 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -883,6 +883,10 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); + // recognise options: fprofile-generate -fprofile-use= + Args.addAllArgs( + CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ}); + // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ, diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index 9c33910eff57e..11d0dc6b7b6f1 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1499,11 +1499,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). 
if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); else - Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); } else - Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); + Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index a697872836569..ae12aec518108 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,8 +24,15 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. + +/// Choose profile instrumenation kind or no instrumentation. +ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +/// Choose profile kind for PGO use compilation. +ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) + CODEGENOPT(InstrumentFunctions, 1, 0) ///< Set when -finstrument_functions is ///< enabled on the compile step. + CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. 
CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 61e56e51c4bbb..06203670f97b9 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -151,6 +151,44 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; + /// Name of the profile file to use as output for -fprofile-instr-generate, + /// -fprofile-generate, and -fcs-profile-generate. + std::string InstrProfileOutput; + + /// Name of the profile file to use as input for -fmemory-profile-use. + std::string MemoryProfileUsePath; + + /// Name of the profile file to use as input for -fprofile-instr-use + std::string ProfileInstrumentUsePath; + + /// Name of the profile remapping file to apply to the profile data supplied + /// by -fprofile-sample-use or -fprofile-instr-use. + std::string ProfileRemappingFile; + + /// Check if Clang profile instrumenation is on. + bool hasProfileClangInstr() const { + return getProfileInstr() == llvm::driver::ProfileClangInstr; + } + + /// Check if IR level profile instrumentation is on. + bool hasProfileIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileIRInstr; + } + + /// Check if CS IR level profile instrumentation is on. + bool hasProfileCSIRInstr() const { + return getProfileInstr() == llvm::driver::ProfileCSIRInstr; + } + /// Check if IR level profile use is on. + bool hasProfileIRUse() const { + return getProfileUse() == llvm::driver::ProfileIRInstr || + getProfileUse() == llvm::driver::ProfileCSIRInstr; + } + /// Check if CSIR profile use is on. + bool hasProfileCSIRUse() const { + return getProfileUse() == llvm::driver::ProfileCSIRInstr; + } + // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 90a002929eff0..0571aea8ec801 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,6 +30,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" +#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -452,6 +453,15 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, opts.IsPIE = 1; } + if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { + opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); + } + + if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { + opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); + opts.ProfileInstrumentUsePath = A->getValue(); + } + // -mcmodel option. 
if (const llvm::opt::Arg *a = args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) { diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 012d0fdfe645f..da8fa518ab3e1 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -56,10 +56,12 @@ #include "llvm/Passes/PassBuilder.h" #include "llvm/Passes/PassPlugin.h" #include "llvm/Passes/StandardInstrumentations.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/Error.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/FileSystem.h" +#include "llvm/Support/PGOOptions.h" #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" @@ -67,6 +69,7 @@ #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" +#include "llvm/Transforms/Instrumentation/InstrProfiling.h" #include "llvm/Transforms/Utils/ModuleUtils.h" #include #include @@ -918,6 +921,29 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; + + if (opts.hasProfileIRInstr()) { + // -fprofile-generate. + pgoOpt = llvm::PGOOptions(opts.InstrProfileOutput.empty() + ? llvm::driver::getDefaultProfileGenName() + : opts.InstrProfileOutput, + "", "", opts.MemoryProfileUsePath, nullptr, + llvm::PGOOptions::IRInstr, + llvm::PGOOptions::NoCSAction, + llvm::PGOOptions::ColdFuncOpt::Default, false, + /*PseudoProbeForProfiling=*/false, false); + } else if (opts.hasProfileIRUse()) { + llvm::IntrusiveRefCntPtr VFS = + llvm::vfs::getRealFileSystem(); + // -fprofile-use. + auto CSAction = opts.hasProfileCSIRUse() ? 
llvm::PGOOptions::CSIRUse + : llvm::PGOOptions::NoCSAction; + pgoOpt = llvm::PGOOptions( + opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, + opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, + llvm::PGOOptions::ColdFuncOpt::Default, false); + } + llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90 index 4493a519e2010..b972b9b7b2a59 100644 --- a/flang/test/Driver/flang-f-opts.f90 +++ b/flang/test/Driver/flang-f-opts.f90 @@ -8,3 +8,8 @@ ! CHECK-LABEL: "-fc1" ! CHECK: -ffp-contract=off ! CHECK: -O3 + +! RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s +! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate" +! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s +! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}" diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext new file mode 100644 index 0000000000000..2650fb5ebfd35 --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext @@ -0,0 +1,18 @@ +# IR level Instrumentation Flag +:ir +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 + +main +# Func Hash: +742261418966908927 +# Num Counters: +1 +# Counter Values: +1 \ No newline at end of file diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext new file mode 100644 index 0000000000000..c4a2a26557e80 --- /dev/null +++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext @@ -0,0 +1,11 @@ +# IR level Instrumentation Flag +:ir +:entry_first +_QQmain +# Func Hash: +146835646621254984 +# Num Counters: +2 +# Counter Values: +100 +1 \ No newline at end of 
file diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 new file mode 100644 index 0000000000000..4490c45232d28 --- /dev/null +++ b/flang/test/Profile/gcc-flag-compatibility.f90 @@ -0,0 +1,32 @@ +! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two +! flags behave similarly to their GCC counterparts: +! +! -fprofile-generate Generates the profile file ./default.profraw +! -fprofile-use=/file Uses the profile file /file + +! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto +! RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s +! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section +! PROFILE-GEN: @__profd_{{_?}}main = + +! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof +! This uses LLVM IR format profile. +! RUN: rm -rf %t.dir +! RUN: mkdir -p %t.dir/some/path +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s +! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof +! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s +! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} +! 
PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} + +program main + implicit none + integer :: i + integer :: X = 0 + + do i = 0, 99 + X = X + i + end do + +end program main diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index ee52645f2e51b..82f583bc459e6 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,6 +13,8 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H +#include + namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -46,9 +48,19 @@ enum class VectorLibrary { AMDLIBM // AMD vector math library. }; +enum ProfileInstrKind { + ProfileNone, // Profile instrumentation is turned off. + ProfileClangInstr, // Clang instrumentation to generate execution counts + // to use with PGO. + ProfileIRInstr, // IR level PGO instrumentation in LLVM. + ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. +}; + TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); +// Default filename used for profile generation. 
+std::string getDefaultProfileGenName(); } // end namespace llvm::driver #endif diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index 52080dea93c98..df884908845d2 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -8,8 +8,15 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" +#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/TargetParser/Triple.h" +namespace llvm { +extern llvm::cl::opt DebugInfoCorrelate; +extern llvm::cl::opt + ProfileCorrelate; +} // namespace llvm + namespace llvm::driver { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, @@ -56,4 +63,10 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, return TLII; } +std::string getDefaultProfileGenName() { + return llvm::DebugInfoCorrelate || + llvm::ProfileCorrelate != InstrProfCorrelator::NONE + ? "default_%m.proflite" + : "default_%m.profraw"; +} } // namespace llvm::driver From flang-commits at lists.llvm.org Fri May 30 07:14:07 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 30 May 2025 07:14:07 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6839bd2f.170a0220.18d289.0d58@mx.google.com> https://github.com/tarunprabhu closed https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Fri May 30 07:14:29 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 07:14:29 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6839bd45.170a0220.1f6b4f.0c5f@mx.google.com> github-actions[bot] wrote: @fanju110 Congratulations on having your first Pull Request (PR) merged 
into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our [build bots](https://lab.llvm.org/buildbot/). If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail [here](https://llvm.org/docs/MyFirstTypoFix.html#myfirsttypofix-issues-after-landing-your-pr). If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of [LLVM development](https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy). You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done! https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Fri May 30 07:19:11 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Fri, 30 May 2025 07:19:11 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6839be5f.a70a0220.2b4a1.2b27@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `amdgpu-offload-rhel-8-cmake-build-only` running on `rocm-docker-rhel-8` while building `clang,flang,llvm` at step 4 "annotate". Full details are available at: https://lab.llvm.org/buildbot/#/builders/204/builds/10902
Here is the relevant piece of the build log for the reference ``` Step 4 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py --jobs=32' (failure) ... [4577/7836] Linking CXX executable bin/mlir-irdl-to-cpp [4578/7836] Linking CXX shared library lib/libMLIRArithDialect.so.21.0git [4579/7836] Building CXX object tools/mlir/lib/Transforms/CMakeFiles/obj.MLIRTransforms.dir/OpStats.cpp.o [4580/7836] Building CXX object tools/mlir/lib/Transforms/CMakeFiles/obj.MLIRTransforms.dir/PrintIR.cpp.o [4581/7836] Linking CXX shared library lib/libMLIRPDLToPDLInterp.so.21.0git [4582/7836] Building CXX object tools/mlir/lib/Transforms/CMakeFiles/obj.MLIRTransforms.dir/Mem2Reg.cpp.o [4583/7836] Building CXX object tools/mlir/lib/Transforms/CMakeFiles/obj.MLIRTransforms.dir/SCCP.cpp.o [4584/7836] Linking CXX shared library lib/libLLVMIRPrinter.so.21.0git [4585/7836] Linking CXX shared library lib/libLLVMFrontendAtomic.so.21.0git [4586/7836] Linking CXX shared library lib/libLLVMFrontendDriver.so.21.0git FAILED: lib/libLLVMFrontendDriver.so.21.0git : && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -Wno-misleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -Wl,-z,defs -Wl,-z,nodelete -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/build/./lib -Wl,--gc-sections -shared -Wl,-soname,libLLVMFrontendDriver.so.21.0git -o lib/libLLVMFrontendDriver.so.21.0git lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o -Wl,-rpath,"\$ORIGIN/../lib:/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/build/lib:" lib/libLLVMAnalysis.so.21.0git lib/libLLVMCore.so.21.0git 
lib/libLLVMSupport.so.21.0git -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/build/lib && : lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o: In function `llvm::driver::getDefaultProfileGenName[abi:cxx11]()': CodeGenOptions.cpp:(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x3): undefined reference to `llvm::DebugInfoCorrelate' CodeGenOptions.cpp:(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x21): undefined reference to `llvm::ProfileCorrelate' collect2: error: ld returned 1 exit status [4587/7836] Building CXX object tools/mlir/lib/Transforms/CMakeFiles/obj.MLIRTransforms.dir/RemoveDeadValues.cpp.o [4588/7836] Linking CXX shared library lib/libMLIRQueryLib.so.21.0git [4589/7836] Building CXX object tools/mlir/lib/Transforms/CMakeFiles/obj.MLIRTransforms.dir/StripDebugInfo.cpp.o [4590/7836] Building CXX object tools/mlir/lib/Transforms/CMakeFiles/obj.MLIRTransforms.dir/SymbolDCE.cpp.o [4591/7836] Creating library symlink lib/libMLIRArithDialect.so [4592/7836] Creating library symlink lib/libMLIRPDLToPDLInterp.so [4593/7836] Building CXX object tools/mlir/lib/Transforms/CMakeFiles/obj.MLIRTransforms.dir/SROA.cpp.o [4594/7836] Linking CXX shared library lib/libLLVMTarget.so.21.0git [4595/7836] Linking CXX shared library lib/libMLIRPluginsLib.so.21.0git [4596/7836] Building CXX object tools/mlir/lib/Transforms/CMakeFiles/obj.MLIRTransforms.dir/TopologicalSort.cpp.o [4597/7836] Building CXX object tools/mlir/lib/Transforms/CMakeFiles/obj.MLIRTransforms.dir/SymbolPrivatize.cpp.o [4598/7836] Creating library symlink lib/libLLVMIRPrinter.so [4599/7836] Creating library symlink lib/libLLVMFrontendAtomic.so [4600/7836] Building CXX object tools/mlir/lib/Transforms/CMakeFiles/obj.MLIRTransforms.dir/ViewOpGraph.cpp.o [4601/7836] Building CXX object tools/mlir/lib/Transforms/Utils/CMakeFiles/obj.MLIRTransformUtils.dir/ControlFlowSinkUtils.cpp.o [4602/7836] Building CXX object 
tools/mlir/lib/Transforms/Utils/CMakeFiles/obj.MLIRTransformUtils.dir/CFGToSCF.cpp.o [4603/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Module.cpp.o [4604/7836] Building CXX object tools/mlir/lib/Transforms/Utils/CMakeFiles/obj.MLIRTransformUtils.dir/CommutativityUtils.cpp.o [4605/7836] Linking CXX shared library lib/libLLVMBitWriter.so.21.0git [4606/7836] Linking CXX shared library lib/libMLIRPDLLCodeGen.so.21.0git [4607/7836] Linking CXX shared library lib/libLLVMSandboxIR.so.21.0git [4608/7836] Linking CXX shared library lib/libLLVMTransformUtils.so.21.0git [4609/7836] Building AMDGPUGenAsmWriter.inc... [4610/7836] Building CXX object tools/clang/lib/Lex/CMakeFiles/obj.clangLex.dir/LiteralSupport.cpp.o [4611/7836] Building AMDGPUGenAsmMatcher.inc... [4612/7836] Building CXX object lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/AsmPrinter.cpp.o [4613/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Attributes.cpp.o [4614/7836] Building AMDGPUGenGlobalISel.inc... [4615/7836] Building AMDGPUGenDAGISel.inc... [4616/7836] Building AMDGPUGenInstrInfo.inc... [4617/7836] Building AMDGPUGenRegisterBank.inc... [4618/7836] Building CXX object lib/LTO/CMakeFiles/LLVMLTO.dir/LTO.cpp.o In file included from /usr/include/c++/8/cassert:44, Step 7 (build cmake config) failure: build cmake config (failure) ... 
```
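For reference, the two undefined symbols in the log above are the `extern` option globals that `getDefaultProfileGenName()` reads in the patch. The link line lists only `libLLVMAnalysis`, `libLLVMCore` and `libLLVMSupport`, and `-Wl,-z,defs` makes any symbol left unresolved in a shared library a hard link error. The sketch below is a self-contained stand-in rather than the LLVM code: `CorrelatorKind`, `DebugInfoCorrelateFlag`, `ProfileCorrelateKind` and `getDefaultProfileGenNameSketch` are hypothetical names that replace the `llvm::cl::opt` globals with locally defined variables, so only the file-name selection logic from the diff is shown.

```cpp
#include <string>

// Hypothetical stand-in for InstrProfCorrelator::ProfCorrelatorKind.
enum class CorrelatorKind { NONE, DEBUG_INFO, BINARY };

// Hypothetical stand-ins for the two option globals. The patch only declares
// them `extern`; here they are defined locally so the sketch links on its own.
static bool DebugInfoCorrelateFlag = false;
static CorrelatorKind ProfileCorrelateKind = CorrelatorKind::NONE;

// Mirrors the selection in the diff: a lightweight ".proflite" default name
// when either correlation mode is requested, ".profraw" otherwise.
std::string getDefaultProfileGenNameSketch() {
  return DebugInfoCorrelateFlag ||
                 ProfileCorrelateKind != CorrelatorKind::NONE
             ? "default_%m.proflite"
             : "default_%m.profraw";
}
```

With both stand-ins at their defaults the function yields the `.profraw` name; setting either one switches it to `.proflite`.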
https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Fri May 30 07:20:10 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 30 May 2025 07:20:10 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <6839be9a.050a0220.ae6dd.1f0c@mx.google.com> ================ @@ -0,0 +1,17 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s ---------------- tarunprabhu wrote: Should this also be `--target`? ```suggestion ! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s ``` https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Fri May 30 07:20:12 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Fri, 30 May 2025 07:20:12 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6839be9c.a70a0220.26f6c9.202b@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `amdgpu-offload-rhel-9-cmake-build-only` running on `rocm-docker-rhel-9` while building `clang,flang,llvm` at step 4 "annotate". Full details are available at: https://lab.llvm.org/buildbot/#/builders/205/builds/10879
Here is the relevant piece of the build log for the reference ``` Step 4 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py --jobs=32' (failure) ... [5073/7836] Creating library symlink lib/libLLVMAnalysis.so [5074/7836] Linking CXX shared library lib/libMLIRTosaToArith.so.21.0git [5075/7836] Linking CXX shared library lib/libMLIRTosaToSCF.so.21.0git [5076/7836] Linking CXX shared library lib/libMLIRTosaToMLProgram.so.21.0git [5077/7836] Linking CXX shared library lib/libMLIRLinalgDialect.so.21.0git [5078/7836] Linking CXX shared library lib/libLLVMIRPrinter.so.21.0git [5079/7836] Creating library symlink lib/libLLVMIRPrinter.so [5080/7836] Linking CXX shared library lib/libLLVMFrontendAtomic.so.21.0git [5081/7836] Creating library symlink lib/libLLVMFrontendAtomic.so [5082/7836] Linking CXX shared library lib/libLLVMFrontendDriver.so.21.0git FAILED: lib/libLLVMFrontendDriver.so.21.0git : && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -Wl,-z,defs -Wl,-z,nodelete -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/./lib -Wl,--gc-sections -shared -Wl,-soname,libLLVMFrontendDriver.so.21.0git -o lib/libLLVMFrontendDriver.so.21.0git lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o -Wl,-rpath,"\$ORIGIN/../lib:/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/lib:" lib/libLLVMAnalysis.so.21.0git lib/libLLVMCore.so.21.0git lib/libLLVMSupport.so.21.0git 
-Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/lib && : /usr/bin/ld: lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o: in function `llvm::driver::getDefaultProfileGenName[abi:cxx11]()': CodeGenOptions.cpp:(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x3): undefined reference to `llvm::DebugInfoCorrelate' /usr/bin/ld: CodeGenOptions.cpp:(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x21): undefined reference to `llvm::ProfileCorrelate' collect2: error: ld returned 1 exit status [5083/7836] Linking CXX shared library lib/libMLIRTosaToTensor.so.21.0git [5084/7836] Linking CXX shared library lib/libMLIRMemRefTransforms.so.21.0git [5085/7836] Linking CXX shared library lib/libLLVMBitWriter.so.21.0git [5086/7836] Linking CXX shared library lib/libLLVMTarget.so.21.0git [5087/7836] Linking CXX shared library lib/libLLVMSandboxIR.so.21.0git [5088/7836] Building AMDGPUGenCallingConv.inc... [5089/7836] Linking CXX shared library lib/libMLIRSPIRVDialect.so.21.0git [5090/7836] Linking CXX shared library lib/libLLVMTransformUtils.so.21.0git [5091/7836] Building AMDGPUGenSearchableTables.inc... 
[5092/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/CodeGenOptions.cpp.o [5093/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Builtins.cpp.o [5094/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Targets/ARC.cpp.o [5095/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/OpenCLOptions.cpp.o [5096/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/ProfileList.cpp.o /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/clang/lib/Basic/ProfileList.cpp: In function ‘llvm::StringRef getSectionName(llvm::driver::ProfileInstrKind)’: /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/clang/lib/Basic/ProfileList.cpp:83:3: warning: case value ‘4’ not in enumerated type ‘llvm::driver::ProfileInstrKind’ [-Wswitch] 83 | case CodeGenOptions::ProfileIRSampleColdCov: | ^~~~ [5097/7836] Building AMDGPUGenAsmWriter.inc... [5098/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/TargetInfo.cpp.o [5099/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Targets/AMDGPU.cpp.o [5100/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Module.cpp.o [5101/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Targets/ARM.cpp.o [5102/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Targets/AArch64.cpp.o [5103/7836] Building AMDGPUGenGlobalISel.inc... [5104/7836] Building AMDGPUGenDAGISel.inc... [5105/7836] Building AMDGPUGenAsmMatcher.inc... 
[5106/7836] Building CXX object lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/AsmPrinter.cpp.o [5107/7836] Building CXX object tools/clang/lib/Frontend/Rewrite/CMakeFiles/obj.clangRewriteFrontend.dir/RewriteModernObjC.cpp.o [5108/7836] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Attributes.cpp.o [5109/7836] Building AMDGPUGenRegisterBank.inc... [5110/7836] Building AMDGPUGenInstrInfo.inc... [5111/7836] Building AMDGPUGenRegisterInfo.inc... Step 7 (build cmake config) failure: build cmake config (failure) ... ```
https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Fri May 30 07:20:20 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 30 May 2025 07:20:20 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <6839bea4.050a0220.2183f5.1c96@mx.google.com> https://github.com/tarunprabhu edited https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Fri May 30 07:21:25 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Fri, 30 May 2025 07:21:25 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6839bee5.170a0220.36074a.5c0c@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `amdgpu-offload-ubuntu-22-cmake-build-only` running on `rocm-docker-ubu-22` while building `clang,flang,llvm` at step 4 "annotate". Full details are available at: https://lab.llvm.org/buildbot/#/builders/203/builds/12089
Here is the relevant piece of the build log for the reference ``` Step 4 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py --jobs=32' (failure) ... [4788/7836] Linking CXX shared library lib/libMLIRNVGPUDialect.so.21.0git [4789/7836] Linking CXX shared library lib/libMLIRShardingInterface.so.21.0git [4790/7836] Linking CXX shared library lib/libMLIRArithToEmitC.so.21.0git [4791/7836] Linking CXX shared library lib/libLLVMIRPrinter.so.21.0git [4792/7836] Linking CXX shared library lib/libMLIRBufferizationDialect.so.21.0git [4793/7836] Linking CXX shared library lib/libLLVMFrontendAtomic.so.21.0git [4794/7836] Creating library symlink lib/libMLIRArithToEmitC.so [4795/7836] Creating library symlink lib/libLLVMIRPrinter.so [4796/7836] Creating library symlink lib/libLLVMFrontendAtomic.so [4797/7836] Linking CXX shared library lib/libLLVMFrontendDriver.so.21.0git FAILED: lib/libLLVMFrontendDriver.so.21.0git : && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -Wl,-z,defs -Wl,-z,nodelete -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/./lib -Wl,--gc-sections -shared -Wl,-soname,libLLVMFrontendDriver.so.21.0git -o lib/libLLVMFrontendDriver.so.21.0git lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o -Wl,-rpath,"\$ORIGIN/../lib:/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/lib:" lib/libLLVMAnalysis.so.21.0git lib/libLLVMCore.so.21.0git lib/libLLVMSupport.so.21.0git 
-Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/lib && : /usr/bin/ld: lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o: in function `llvm::driver::getDefaultProfileGenName[abi:cxx11]()': CodeGenOptions.cpp:(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x7): undefined reference to `llvm::DebugInfoCorrelate' /usr/bin/ld: CodeGenOptions.cpp:(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x25): undefined reference to `llvm::ProfileCorrelate' collect2: error: ld returned 1 exit status [4798/7836] Creating library symlink lib/libMLIRTensorUtils.so [4799/7836] Creating library symlink lib/libMLIRBufferizationDialect.so [4800/7836] Creating library symlink lib/libMLIRShardingInterface.so [4801/7836] Linking CXX shared library lib/libLLVMBitWriter.so.21.0git [4802/7836] Linking CXX shared library lib/libMLIRSCFDialect.so.21.0git [4803/7836] Linking CXX shared library lib/libLLVMTarget.so.21.0git [4804/7836] Building CXX object tools/mlir/test/lib/Dialect/TestIRDLToCpp/CMakeFiles/MLIRTestIRDLToCppDialect.dir/TestIRDLToCppDialect.cpp.o In file included from /usr/include/c++/11/cassert:44, from /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/llvm/include/llvm/IR/ProfileSummary.h:17, from /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/llvm/include/llvm/IR/Module.h:30, from /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/mlir/include/mlir/Dialect/LLVMIR/LLVMDialect.h:38, from /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/mlir/include/mlir/Target/LLVMIR/LLVMTranslationInterface.h:16, from /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-project/mlir/test/lib/Dialect/TestIRDLToCpp/TestIRDLToCppDialect.cpp:22: /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/tools/mlir/test/lib/Dialect/TestIRDLToCpp/test_irdl_to_cpp.irdl.mlir.cpp.inc: In static 
member function ‘static llvm::StringRef mlir::test_irdl_to_cpp::BarOp::getOperandName(unsigned int)’: /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/tools/mlir/test/lib/Dialect/TestIRDLToCpp/test_irdl_to_cpp.irdl.mlir.cpp.inc:223:18: warning: comparison of unsigned expression in ‘< 0’ is always false [-Wtype-limits] 223 | assert(index < 0 && "invalid attribute index"); | ~~~~~~^~~ [4805/7836] Linking CXX shared library lib/libMLIRShapeDialect.so.21.0git [4806/7836] Linking CXX shared library lib/libLLVMSandboxIR.so.21.0git [4807/7836] Linking CXX shared library lib/libLLVMTransformUtils.so.21.0git [4808/7836] Building AMDGPUGenPostLegalizeGICombiner.inc... [4809/7836] Building CXX object tools/clang/lib/Lex/CMakeFiles/obj.clangLex.dir/LiteralSupport.cpp.o [4810/7836] Building AMDGPUGenDisassemblerTables.inc... [4811/7836] Building AMDGPUGenRegBankGICombiner.inc... [4812/7836] Building AMDGPUGenPreLegalizeGICombiner.inc... [4813/7836] Building AMDGPUGenMCCodeEmitter.inc... [4814/7836] Linking CXX shared library lib/libMLIRSPIRVDialect.so.21.0git [4815/7836] Building AMDGPUGenSubtargetInfo.inc... [4816/7836] Building CXX object tools/clang/lib/Frontend/Rewrite/CMakeFiles/obj.clangRewriteFrontend.dir/RewriteModernObjC.cpp.o [4817/7836] Building AMDGPUGenSearchableTables.inc... [4818/7836] Building CXX object tools/clang/lib/Interpreter/CMakeFiles/obj.clangInterpreter.dir/DeviceOffload.cpp.o [4819/7836] Building AMDGPUGenCallingConv.inc... [4820/7836] Building CXX object tools/clang/lib/Tooling/CMakeFiles/obj.clangTooling.dir/GuessTargetAndModeCompilationDatabase.cpp.o Step 7 (build cmake config) failure: build cmake config (failure) ... 
```
https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Fri May 30 07:24:42 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Fri, 30 May 2025 07:24:42 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6839bfaa.170a0220.b373.1e54@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `openmp-offload-amdgpu-runtime-2` running on `rocm-worker-hw-02` while building `clang,flang,llvm` at step 5 "compile-openmp". Full details are available at: https://lab.llvm.org/buildbot/#/builders/10/builds/6370
Here is the relevant piece of the build log for the reference ``` Step 5 (compile-openmp) failure: build (failure) ... 9.419 [1581/64/3007] Building CXX object tools/clang/lib/Parse/CMakeFiles/obj.clangParse.dir/ParseObjc.cpp.o 9.433 [1580/64/3008] Linking CXX shared library lib/libLLVMIRPrinter.so.21.0git 9.443 [1579/64/3009] Creating library symlink lib/libLLVMIRPrinter.so 9.477 [1578/64/3010] Building CXX object tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaCUDA.cpp.o 9.478 [1577/64/3011] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Targets/X86.cpp.o 9.502 [1576/64/3012] Linking CXX shared library lib/libLLVMFrontendAtomic.so.21.0git 9.510 [1575/64/3013] Linking CXX shared library lib/libLLVMBitWriter.so.21.0git 9.512 [1574/64/3014] Creating library symlink lib/libLLVMFrontendAtomic.so 9.520 [1573/64/3015] Creating library symlink lib/libLLVMBitWriter.so 9.538 [1572/64/3016] Linking CXX shared library lib/libLLVMFrontendDriver.so.21.0git FAILED: lib/libLLVMFrontendDriver.so.21.0git : && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -Wl,-z,defs -Wl,-z,nodelete -Wl,-rpath-link,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./lib -Wl,--gc-sections -shared -Wl,-soname,libLLVMFrontendDriver.so.21.0git -o lib/libLLVMFrontendDriver.so.21.0git lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o -Wl,-rpath,"\$ORIGIN/../lib:/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/lib:" 
lib/libLLVMAnalysis.so.21.0git lib/libLLVMCore.so.21.0git lib/libLLVMSupport.so.21.0git -Wl,-rpath-link,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/lib && : /usr/bin/ld: lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o: in function `llvm::driver::getDefaultProfileGenName[abi:cxx11]()': CodeGenOptions.cpp:(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x7): undefined reference to `llvm::DebugInfoCorrelate' /usr/bin/ld: CodeGenOptions.cpp:(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x25): undefined reference to `llvm::ProfileCorrelate' collect2: error: ld returned 1 exit status 9.544 [1572/63/3017] Building CXX object tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaAttr.cpp.o 9.583 [1572/62/3018] Linking CXX shared library lib/libLLVMCGData.so.21.0git 9.586 [1572/61/3019] Linking CXX shared library lib/libLLVMTarget.so.21.0git 9.694 [1572/60/3020] Building CXX object tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaCast.cpp.o 9.700 [1572/59/3021] Linking CXX shared library lib/libLLVMTransformUtils.so.21.0git 9.848 [1572/58/3022] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/FormatString.cpp.o 9.969 [1572/57/3023] Building CXX object tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaARM.cpp.o 9.989 [1572/56/3024] Building CXX object tools/clang/lib/Parse/CMakeFiles/obj.clangParse.dir/ParsePragma.cpp.o 9.993 [1572/55/3025] Building AMDGPUGenMCPseudoLowering.inc... 10.129 [1572/54/3026] Building CXX object tools/clang/lib/Parse/CMakeFiles/obj.clangParse.dir/ParseDeclCXX.cpp.o 10.429 [1572/53/3027] Building AMDGPUGenPreLegalizeGICombiner.inc... 10.455 [1572/52/3028] Building AMDGPUGenRegBankGICombiner.inc... 10.775 [1572/51/3029] Building CXX object tools/clang/lib/Parse/CMakeFiles/obj.clangParse.dir/ParseOpenMP.cpp.o 10.801 [1572/50/3030] Building AMDGPUGenPostLegalizeGICombiner.inc... 10.832 [1572/49/3031] Building AMDGPUGenMCCodeEmitter.inc... 
10.963 [1572/48/3032] Building CXX object tools/clang/lib/Basic/CMakeFiles/obj.clangBasic.dir/Targets.cpp.o 11.041 [1572/47/3033] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/ScanfFormatString.cpp.o 11.044 [1572/46/3034] Building AMDGPUGenSubtargetInfo.inc... 11.077 [1572/45/3035] Building AMDGPUGenDisassemblerTables.inc... 11.111 [1572/44/3036] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/PrintfFormatString.cpp.o 11.182 [1572/43/3037] Building CXX object tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/ParsedAttr.cpp.o 11.220 [1572/42/3038] Building CXX object tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/ABIInfoImpl.cpp.o 11.478 [1572/41/3039] Building CXX object tools/clang/lib/Parse/CMakeFiles/obj.clangParse.dir/ParseDecl.cpp.o 11.514 [1572/40/3040] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/RecordLayout.cpp.o 11.719 [1572/39/3041] Building CXX object tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaDeclObjC.cpp.o 11.809 [1572/38/3042] Building AMDGPUGenSearchableTables.inc... 11.988 [1572/37/3043] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/ItaniumCXXABI.cpp.o 12.064 [1572/36/3044] Building AMDGPUGenCallingConv.inc... 12.092 [1572/35/3045] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/ByteCode/InterpBuiltinBitCast.cpp.o 12.209 [1572/34/3046] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/MicrosoftCXXABI.cpp.o 12.377 [1572/33/3047] Building CXX object lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/AsmPrinter.cpp.o 12.635 [1572/32/3048] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/Mangle.cpp.o 12.712 [1572/31/3049] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/DeclBase.cpp.o ```
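A note on the failure above: the link line for `libLLVMFrontendDriver.so.21.0git` lists only `libLLVMAnalysis`, `libLLVMCore` and `libLLVMSupport`, and it passes `-Wl,-z,defs`, so a symbol that none of those libraries defines is a hard error in a shared-library build. The two unresolved names, `llvm::DebugInfoCorrelate` and `llvm::ProfileCorrelate`, are the globals read by `llvm::driver::getDefaultProfileGenName()`. The following is a minimal, self-contained sketch of that function's selection logic only. `CorrelatorKind`, `defaultProfileGenName` and its two parameters are hypothetical stand-ins for the real globals and are not LLVM API.

```cpp
#include <string>

// Hypothetical stand-in for the correlator-kind enum used by the real code;
// only the "none" case matters for the file-name choice.
enum class CorrelatorKind { None, DebugInfo, Binary };

// Mirrors the selection made by getDefaultProfileGenName() as it appears in
// the BackendUtil.cpp hunk of the revert commit: a ".proflite" name when
// either form of correlation is requested, a ".profraw" name otherwise.
// The real function reads two global options; here they are passed in as
// parameters (an assumption of this sketch) so that it links on its own.
std::string defaultProfileGenName(bool debugInfoCorrelate,
                                  CorrelatorKind profileCorrelate) {
  return debugInfoCorrelate || profileCorrelate != CorrelatorKind::None
             ? "default_%m.proflite"
             : "default_%m.profraw";
}
```

Passing the flags in as parameters only sidesteps the unresolved globals for the purpose of illustration. In the real tree the fix would presumably involve making whichever library defines them available to `LLVMFrontendDriver`, and this log alone does not identify that library.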
https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Fri May 30 07:27:12 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 07:27:12 -0700 (PDT) Subject: [flang-commits] [flang] 597340b - Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler" (#142159) Message-ID: <6839c040.170a0220.24449.6856@mx.google.com> Author: Tarun Prabhu Date: 2025-05-30T08:27:08-06:00 New Revision: 597340b5b666bdee2887f56c111407b6737cbf34 URL: https://github.com/llvm/llvm-project/commit/597340b5b666bdee2887f56c111407b6737cbf34 DIFF: https://github.com/llvm/llvm-project/commit/597340b5b666bdee2887f56c111407b6737cbf34.diff LOG: Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler" (#142159) Reverts llvm/llvm-project#136098 Added: Modified: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Basic/CodeGenOptions.h clang/include/clang/Basic/ProfileList.h clang/include/clang/Driver/Options.td clang/lib/Basic/ProfileList.cpp clang/lib/CodeGen/BackendUtil.cpp clang/lib/CodeGen/CodeGenAction.cpp clang/lib/CodeGen/CodeGenFunction.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/Driver/ToolChains/Flang.cpp clang/lib/Frontend/CompilerInvocation.cpp flang/include/flang/Frontend/CodeGenOptions.def flang/include/flang/Frontend/CodeGenOptions.h flang/lib/Frontend/CompilerInvocation.cpp flang/lib/Frontend/FrontendActions.cpp flang/test/Driver/flang-f-opts.f90 llvm/include/llvm/Frontend/Driver/CodeGenOptions.h llvm/lib/Frontend/Driver/CodeGenOptions.cpp Removed: flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext flang/test/Profile/gcc-flag-compatibility.f90 ################################################################################ diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index 11dad53a52efe..aad4e107cbeb3 100644 --- 
a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,11 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. - -ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 4, llvm::driver::ProfileInstrKind::ProfileNone) - +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 4, ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index bffbd00b1bd72..278803f7bb960 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -518,41 +518,35 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == - llvm::driver::ProfileInstrKind::ProfileClangInstr; + return getProfileInstr() == ProfileClangInstr; } /// Check if IR level profile instrumentation is on. bool hasProfileIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; + return getProfileInstr() == ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. 
bool hasProfileCSIRInstr() const { - return getProfileInstr() == - llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + return getProfileInstr() == ProfileCSIRInstr; } /// Check if any form of instrumentation is on. - bool hasProfileInstr() const { - return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; - } + bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; + return getProfileUse() == ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || - getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; - } + bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } /// Check if type and variable info should be emitted. 
bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index 5338ef3992ade..b4217e49c18a3 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,16 +49,17 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; + ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - llvm::driver::ProfileInstrKind Kind) const; + CodeGenOptions::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - llvm::driver::ProfileInstrKind Kind) const; + CodeGenOptions::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, + CodeGenOptions::ProfileInstrKind Kind) const; }; } // namespace clang diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 5c79c66b55eb3..5ca31c253ed8f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1772,7 +1772,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFFFFFE">; def fprofile_generate : Flag<["-"], "fprofile-generate">, - Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, + Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">, Group, Visibility<[ClangOption, CLOption]>, @@ -1789,7 +1789,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group, Visibility<[ClangOption, CLOption]>, Alias; def fprofile_use_EQ : Joined<["-"], 
"fprofile-use=">, Group, - Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, + Visibility<[ClangOption, CLOption]>, MetaVarName<"">, HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. Otherwise, it reads from file .">; def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">, diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index bea65579f396b..2d37014294b92 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -70,24 +70,24 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { +static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { switch (Kind) { - case llvm::driver::ProfileInstrKind::ProfileNone: + case CodeGenOptions::ProfileNone: return ""; - case llvm::driver::ProfileInstrKind::ProfileClangInstr: + case CodeGenOptions::ProfileClangInstr: return "clang"; - case llvm::driver::ProfileInstrKind::ProfileIRInstr: + case CodeGenOptions::ProfileIRInstr: return "llvm"; - case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: + case CodeGenOptions::ProfileCSIRInstr: return "csllvm"; case CodeGenOptions::ProfileIRSampleColdCov: return "sample-coldcov"; } - llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); + llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { +ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if (SCL->inSection(Section, "default", "allow")) @@ -118,7 +118,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - llvm::driver::ProfileInstrKind Kind) const { + 
CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -132,13 +132,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - llvm::driver::ProfileInstrKind Kind) const { + CodeGenOptions::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - llvm::driver::ProfileInstrKind Kind) const { + CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 03e10b1138a71..cd5fc48c4a22b 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -123,10 +123,17 @@ namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } +// Default filename used for profile generation. +static std::string getDefaultProfileGenName() { + return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE + ? "default_%m.proflite" + : "default_%m.profraw"; +} + // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? llvm::driver::getDefaultProfileGenName() + ? getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -828,12 +835,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. 
- PGOOpt = PGOOptions( - getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, - PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, - CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, + PGOOptions::IRInstr, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, + CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse @@ -841,32 +848,31 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + PGOOptions::NoCSAction, ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) - PGOOpt = - PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); + PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, + PGOOptions::NoAction, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // 
-fpseudo-probe-for-profiling - PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, - CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = + PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, true); + ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 5493cc92bd8b0..1f5eb427b566f 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && CodeGenOpts.getProfileUse() != - llvm::driver::ProfileInstrKind::ProfileNone) + if (OptRecordFile && + CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 30aec87c909eb..4193f0a1b278f 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -940,8 +940,7 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != - llvm::driver::ProfileInstrKind::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git 
a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 264f1bdee81c6..6d2c705338ecf 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3601,7 +3601,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, // If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index e303631cc1d57..dcc46469df3e9 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -883,10 +883,6 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); - // recognise options: fprofile-generate -fprofile-use= - Args.addAllArgs( - CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ}); - // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ, diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index 11d0dc6b7b6f1..9c33910eff57e 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1499,11 +1499,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). 
if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); + Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); else - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); + Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); } else - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); + Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index ae12aec518108..a697872836569 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,15 +24,8 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. - -/// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) -/// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) - CODEGENOPT(InstrumentFunctions, 1, 0) ///< Set when -finstrument_functions is ///< enabled on the compile step. - CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. 
CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 06203670f97b9..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -151,44 +151,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - /// Name of the profile file to use as output for -fprofile-instr-generate, - /// -fprofile-generate, and -fcs-profile-generate. - std::string InstrProfileOutput; - - /// Name of the profile file to use as input for -fmemory-profile-use. - std::string MemoryProfileUsePath; - - /// Name of the profile file to use as input for -fprofile-instr-use - std::string ProfileInstrumentUsePath; - - /// Name of the profile remapping file to apply to the profile data supplied - /// by -fprofile-sample-use or -fprofile-instr-use. - std::string ProfileRemappingFile; - - /// Check if Clang profile instrumenation is on. - bool hasProfileClangInstr() const { - return getProfileInstr() == llvm::driver::ProfileClangInstr; - } - - /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileIRInstr; - } - - /// Check if CS IR level profile instrumentation is on. - bool hasProfileCSIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileCSIRInstr; - } - /// Check if IR level profile use is on. - bool hasProfileIRUse() const { - return getProfileUse() == llvm::driver::ProfileIRInstr || - getProfileUse() == llvm::driver::ProfileCSIRInstr; - } - /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { - return getProfileUse() == llvm::driver::ProfileCSIRInstr; - } - // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 0571aea8ec801..90a002929eff0 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,7 +30,6 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" -#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -453,15 +452,6 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, opts.IsPIE = 1; } - if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); - } - - if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); - opts.ProfileInstrumentUsePath = A->getValue(); - } - // -mcmodel option. 
if (const llvm::opt::Arg *a = args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) { diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index da8fa518ab3e1..012d0fdfe645f 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -56,12 +56,10 @@ #include "llvm/Passes/PassBuilder.h" #include "llvm/Passes/PassPlugin.h" #include "llvm/Passes/StandardInstrumentations.h" -#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/Error.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/FileSystem.h" -#include "llvm/Support/PGOOptions.h" #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" @@ -69,7 +67,6 @@ #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" -#include "llvm/Transforms/Instrumentation/InstrProfiling.h" #include "llvm/Transforms/Utils/ModuleUtils.h" #include #include @@ -921,29 +918,6 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; - - if (opts.hasProfileIRInstr()) { - // -fprofile-generate. - pgoOpt = llvm::PGOOptions(opts.InstrProfileOutput.empty() - ? llvm::driver::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, - llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, - llvm::PGOOptions::ColdFuncOpt::Default, false, - /*PseudoProbeForProfiling=*/false, false); - } else if (opts.hasProfileIRUse()) { - llvm::IntrusiveRefCntPtr VFS = - llvm::vfs::getRealFileSystem(); - // -fprofile-use. - auto CSAction = opts.hasProfileCSIRUse() ? 
llvm::PGOOptions::CSIRUse - : llvm::PGOOptions::NoCSAction; - pgoOpt = llvm::PGOOptions( - opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, - opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, - llvm::PGOOptions::ColdFuncOpt::Default, false); - } - llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90 index b972b9b7b2a59..4493a519e2010 100644 --- a/flang/test/Driver/flang-f-opts.f90 +++ b/flang/test/Driver/flang-f-opts.f90 @@ -8,8 +8,3 @@ ! CHECK-LABEL: "-fc1" ! CHECK: -ffp-contract=off ! CHECK: -O3 - -! RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s -! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate" -! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s -! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}" diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext deleted file mode 100644 index 2650fb5ebfd35..0000000000000 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext +++ /dev/null @@ -1,18 +0,0 @@ -# IR level Instrumentation Flag -:ir -_QQmain -# Func Hash: -146835646621254984 -# Num Counters: -2 -# Counter Values: -100 -1 - -main -# Func Hash: -742261418966908927 -# Num Counters: -1 -# Counter Values: -1 \ No newline at end of file diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext deleted file mode 100644 index c4a2a26557e80..0000000000000 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext +++ /dev/null @@ -1,11 +0,0 @@ -# IR level Instrumentation Flag -:ir -:entry_first -_QQmain -# Func Hash: -146835646621254984 -# Num Counters: -2 -# Counter Values: -100 -1 \ No newline at 
end of file diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 deleted file mode 100644 index 4490c45232d28..0000000000000 --- a/flang/test/Profile/gcc-flag-compatibility.f90 +++ /dev/null @@ -1,32 +0,0 @@ -! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two -! flags behave similarly to their GCC counterparts: -! -! -fprofile-generate Generates the profile file ./default.profraw -! -fprofile-use=/file Uses the profile file /file - -! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto -! RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s -! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section -! PROFILE-GEN: @__profd_{{_?}}main = - -! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof -! This uses LLVM IR format profile. -! RUN: rm -rf %t.dir -! RUN: mkdir -p %t.dir/some/path -! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof -! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s -! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof -! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s -! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} -! 
PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} - -program main - implicit none - integer :: i - integer :: X = 0 - - do i = 0, 99 - X = X + i - end do - -end program main diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 82f583bc459e6..ee52645f2e51b 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,8 +13,6 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H -#include - namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -48,19 +46,9 @@ enum class VectorLibrary { AMDLIBM // AMD vector math library. }; -enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. -}; - TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); -// Default filename used for profile generation. 
-std::string getDefaultProfileGenName(); } // end namespace llvm::driver #endif diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp index df884908845d2..52080dea93c98 100644 --- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp +++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp @@ -8,15 +8,8 @@ #include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Analysis/TargetLibraryInfo.h" -#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/TargetParser/Triple.h" -namespace llvm { -extern llvm::cl::opt DebugInfoCorrelate; -extern llvm::cl::opt - ProfileCorrelate; -} // namespace llvm - namespace llvm::driver { TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, @@ -63,10 +56,4 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, return TLII; } -std::string getDefaultProfileGenName() { - return llvm::DebugInfoCorrelate || - llvm::ProfileCorrelate != InstrProfCorrelator::NONE - ? "default_%m.proflite" - : "default_%m.profraw"; -} } // namespace llvm::driver From flang-commits at lists.llvm.org Fri May 30 07:29:04 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 30 May 2025 07:29:04 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6839c0b0.050a0220.2d7a87.3bec@mx.google.com> tarunprabhu wrote: @fanju110 The PR caused some buildbot failures, so I have [reverted](https://github.com/llvm/llvm-project/commit/597340b5b666bdee2887f56c111407b6737cbf34) it. You can see the error [here](https://lab.llvm.org/buildbot/#/builders/203/builds/12089/steps/7/logs/stdio). 
https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Fri May 30 07:34:16 2025 From: flang-commits at lists.llvm.org (Kajetan Puchalski via flang-commits) Date: Fri, 30 May 2025 07:34:16 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Resolve names for declare simd uniform clause (PR #142160) Message-ID: https://github.com/mrkajetanp created https://github.com/llvm/llvm-project/pull/142160 Add a visitor for OmpClause::Uniform to resolve its parameter names. Fixes issue #140741. >From 8c97501ba478596c1d3374f812314dae38201fa4 Mon Sep 17 00:00:00 2001 From: Kajetan Puchalski Date: Fri, 30 May 2025 09:20:24 +0000 Subject: [PATCH] [flang][OpenMP] Resolve names for declare simd uniform clause Add a visitor for OmpClause::Uniform to resolve its parameter names. Fixes issue #140741. Signed-off-by: Kajetan Puchalski --- flang/include/flang/Semantics/symbol.h | 2 +- flang/lib/Semantics/resolve-directives.cpp | 7 +++++++ flang/lib/Semantics/resolve-names.cpp | 5 +++++ .../test/Semantics/OpenMP/declare-simd-uniform.f90 | 13 +++++++++++++ 4 files changed, 26 insertions(+), 1 deletion(-) create mode 100644 flang/test/Semantics/OpenMP/declare-simd-uniform.f90 diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 4cded64d170cd..9ebdd3a8081ed 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -786,7 +786,7 @@ class Symbol { OmpExecutableAllocateDirective, OmpDeclareSimd, OmpDeclareTarget, OmpThreadprivate, OmpDeclareReduction, OmpFlushed, OmpCriticalLock, OmpIfSpecified, OmpNone, OmpPreDetermined, OmpImplicit, OmpDependObject, - OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction); + OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction, OmpUniform); using Flags = common::EnumSet; const Scope &owner() const { return *owner_; } diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 
9fa7bc8964854..ecb4f02732a4b 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -550,6 +550,13 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { return false; } + bool Pre(const parser::OmpClause::Uniform &x) { + for (const auto &name : x.v) { + ResolveOmpName(name, Symbol::Flag::OmpUniform); + } + return false; + } + bool Pre(const parser::OmpInReductionClause &x) { auto &objects{std::get(x.t)}; ResolveOmpObjectList(objects, Symbol::Flag::OmpInReduction); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 57035c57ee16f..7bea6fdb00e55 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1511,6 +1511,11 @@ class OmpVisitor : public virtual DeclarationVisitor { return false; } + bool Pre(const parser::OpenMPDeclareSimdConstruct &x) { + AddOmpSourceRange(x.source); + return true; + } + bool Pre(const parser::OmpInitializerProc &x) { auto &procDes = std::get(x.t); auto &name = std::get(procDes.u); diff --git a/flang/test/Semantics/OpenMP/declare-simd-uniform.f90 b/flang/test/Semantics/OpenMP/declare-simd-uniform.f90 new file mode 100644 index 0000000000000..3a0c52b8e7b0e --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-simd-uniform.f90 @@ -0,0 +1,13 @@ +! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! 
Test declare simd with uniform clause + +function add2(a,b,i,fact,alc) result(c) + !$omp declare simd(add2) uniform(a,b,fact) + integer :: i + integer,pointer::alc + double precision :: a(*),b(*),fact,c + c = a(i) + b(i) + fact +end function + +end + From flang-commits at lists.llvm.org Fri May 30 07:34:49 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 07:34:49 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Resolve names for declare simd uniform clause (PR #142160) In-Reply-To: Message-ID: <6839c209.170a0220.23238a.2fd5@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Kajetan Puchalski (mrkajetanp)
Changes Add a visitor for OmpClause::Uniform to resolve its parameter names. Fixes issue #140741. --- Full diff: https://github.com/llvm/llvm-project/pull/142160.diff 4 Files Affected: - (modified) flang/include/flang/Semantics/symbol.h (+1-1) - (modified) flang/lib/Semantics/resolve-directives.cpp (+7) - (modified) flang/lib/Semantics/resolve-names.cpp (+5) - (added) flang/test/Semantics/OpenMP/declare-simd-uniform.f90 (+13) ``````````diff diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 4cded64d170cd..9ebdd3a8081ed 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -786,7 +786,7 @@ class Symbol { OmpExecutableAllocateDirective, OmpDeclareSimd, OmpDeclareTarget, OmpThreadprivate, OmpDeclareReduction, OmpFlushed, OmpCriticalLock, OmpIfSpecified, OmpNone, OmpPreDetermined, OmpImplicit, OmpDependObject, - OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction); + OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction, OmpUniform); using Flags = common::EnumSet; const Scope &owner() const { return *owner_; } diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 9fa7bc8964854..ecb4f02732a4b 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -550,6 +550,13 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { return false; } + bool Pre(const parser::OmpClause::Uniform &x) { + for (const auto &name : x.v) { + ResolveOmpName(name, Symbol::Flag::OmpUniform); + } + return false; + } + bool Pre(const parser::OmpInReductionClause &x) { auto &objects{std::get(x.t)}; ResolveOmpObjectList(objects, Symbol::Flag::OmpInReduction); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 57035c57ee16f..7bea6fdb00e55 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1511,6 +1511,11 @@ 
class OmpVisitor : public virtual DeclarationVisitor { return false; } + bool Pre(const parser::OpenMPDeclareSimdConstruct &x) { + AddOmpSourceRange(x.source); + return true; + } + bool Pre(const parser::OmpInitializerProc &x) { auto &procDes = std::get(x.t); auto &name = std::get(procDes.u); diff --git a/flang/test/Semantics/OpenMP/declare-simd-uniform.f90 b/flang/test/Semantics/OpenMP/declare-simd-uniform.f90 new file mode 100644 index 0000000000000..3a0c52b8e7b0e --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-simd-uniform.f90 @@ -0,0 +1,13 @@ +! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! Test declare simd with uniform clause + +function add2(a,b,i,fact,alc) result(c) + !$omp declare simd(add2) uniform(a,b,fact) + integer :: i + integer,pointer::alc + double precision :: a(*),b(*),fact,c + c = a(i) + b(i) + fact +end function + +end + ``````````
https://github.com/llvm/llvm-project/pull/142160

From flang-commits at lists.llvm.org Fri May 30 07:34:50 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Fri, 30 May 2025 07:34:50 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][OpenMP] Resolve names for declare simd uniform clause (PR #142160)
In-Reply-To:
Message-ID: <6839c20a.050a0220.312495.e4ae@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-semantics

Author: Kajetan Puchalski (mrkajetanp)
Changes Add a visitor for OmpClause::Uniform to resolve its parameter names. Fixes issue #140741. --- Full diff: https://github.com/llvm/llvm-project/pull/142160.diff 4 Files Affected: - (modified) flang/include/flang/Semantics/symbol.h (+1-1) - (modified) flang/lib/Semantics/resolve-directives.cpp (+7) - (modified) flang/lib/Semantics/resolve-names.cpp (+5) - (added) flang/test/Semantics/OpenMP/declare-simd-uniform.f90 (+13) ``````````diff diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 4cded64d170cd..9ebdd3a8081ed 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -786,7 +786,7 @@ class Symbol { OmpExecutableAllocateDirective, OmpDeclareSimd, OmpDeclareTarget, OmpThreadprivate, OmpDeclareReduction, OmpFlushed, OmpCriticalLock, OmpIfSpecified, OmpNone, OmpPreDetermined, OmpImplicit, OmpDependObject, - OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction); + OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction, OmpUniform); using Flags = common::EnumSet; const Scope &owner() const { return *owner_; } diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 9fa7bc8964854..ecb4f02732a4b 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -550,6 +550,13 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { return false; } + bool Pre(const parser::OmpClause::Uniform &x) { + for (const auto &name : x.v) { + ResolveOmpName(name, Symbol::Flag::OmpUniform); + } + return false; + } + bool Pre(const parser::OmpInReductionClause &x) { auto &objects{std::get(x.t)}; ResolveOmpObjectList(objects, Symbol::Flag::OmpInReduction); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index 57035c57ee16f..7bea6fdb00e55 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1511,6 +1511,11 @@ 
class OmpVisitor : public virtual DeclarationVisitor { return false; } + bool Pre(const parser::OpenMPDeclareSimdConstruct &x) { + AddOmpSourceRange(x.source); + return true; + } + bool Pre(const parser::OmpInitializerProc &x) { auto &procDes = std::get(x.t); auto &name = std::get(procDes.u); diff --git a/flang/test/Semantics/OpenMP/declare-simd-uniform.f90 b/flang/test/Semantics/OpenMP/declare-simd-uniform.f90 new file mode 100644 index 0000000000000..3a0c52b8e7b0e --- /dev/null +++ b/flang/test/Semantics/OpenMP/declare-simd-uniform.f90 @@ -0,0 +1,13 @@ +! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! Test declare simd with uniform clause + +function add2(a,b,i,fact,alc) result(c) + !$omp declare simd(add2) uniform(a,b,fact) + integer :: i + integer,pointer::alc + double precision :: a(*),b(*),fact,c + c = a(i) + b(i) + fact +end function + +end + ``````````
https://github.com/llvm/llvm-project/pull/142160

From flang-commits at lists.llvm.org Fri May 30 07:35:57 2025
From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits)
Date: Fri, 30 May 2025 07:35:57 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #142073)
In-Reply-To:
Message-ID: <6839c24d.170a0220.202dfc.7684@mx.google.com>

llvm-ci wrote:

LLVM Buildbot has detected a new failure on builder `ppc64le-flang-rhel-clang` running on `ppc64le-flang-rhel-test` while building `clang,flang,mlir` at step 6 "test-build-unified-tree-check-flang".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/29494
Here is the relevant piece of the build log for the reference ``` Step 6 (test-build-unified-tree-check-flang) failure: test (failure) ******************** TEST 'Flang :: Driver/prefer-vector-width.f90' FAILED ******************** Exit Code: 1 Command Output (stderr): -- /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -fc1 -emit-llvm -o - /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Driver/prefer-vector-width.f90 2>&1| /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Driver/prefer-vector-width.f90 --check-prefix=CHECK-DEF # RUN: at line 3 + /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Driver/prefer-vector-width.f90 --check-prefix=CHECK-DEF + /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -fc1 -emit-llvm -o - /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Driver/prefer-vector-width.f90 /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -fc1 -mprefer-vector-width=none -emit-llvm -o - /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Driver/prefer-vector-width.f90 2>&1| /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/FileCheck 
/home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Driver/prefer-vector-width.f90 --check-prefix=CHECK-NONE # RUN: at line 4 + /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/flang -fc1 -mprefer-vector-width=none -emit-llvm -o - /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Driver/prefer-vector-width.f90 + /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/build/bin/FileCheck /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Driver/prefer-vector-width.f90 --check-prefix=CHECK-NONE /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Driver/prefer-vector-width.f90:13:15: error: CHECK-NONE: expected string not found in input ! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } ^ :1:1: note: scanning from here ; ModuleID = 'FIRModule' ^ :10:1: note: possible intended match here attributes #0 = { "prefer-vector-width"="none" "target-features"="+64bit" } ^ Input file: Check file: /home/buildbots/llvm-external-buildbots/workers/ppc64le-flang-rhel-test/ppc64le-flang-rhel-clang-build/llvm-project/flang/test/Driver/prefer-vector-width.f90 -dump-input=help explains the following input dump. 
Input was: <<<<<< 1: ; ModuleID = 'FIRModule' check:13'0 X~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found 2: source_filename = "FIRModule" check:13'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 3: target datalayout = "e-m:e-Fn32-i64:64-i128:128-n32:64-S128-v256:256:256-v512:512:512" check:13'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4: target triple = "powerpc64le-unknown-linux-gnu" check:13'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5: check:13'0 ~ 6: define void @func_() #0 { check:13'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 7: ret void check:13'0 ~~~~~~~~~~ 8: } check:13'0 ~~ 9: check:13'0 ~ 10: attributes #0 = { "prefer-vector-width"="none" "target-features"="+64bit" } check:13'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ check:13'1 ? possible intended match 11: ... ```
https://github.com/llvm/llvm-project/pull/142073

From flang-commits at lists.llvm.org Fri May 30 07:13:37 2025
From: flang-commits at lists.llvm.org (Leandro Lupori via flang-commits)
Date: Fri, 30 May 2025 07:13:37 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][OpenMP] Explicitly set Shared DSA in symbols (PR #142154)
Message-ID:

https://github.com/luporl created https://github.com/llvm/llvm-project/pull/142154

Before this change, OmpShared was not always set in shared symbols. Instead, the absence of private flags was interpreted as shared DSA. The problem was that symbols with no flags, with only a host association, could also mean "has same DSA as in the enclosing context". Now shared symbols behave the same as private ones and can be treated the same way.

Because of the host-association symbols with no flags mentioned above, it was also incorrect to simply test the flags of a given symbol to find out whether it was private or shared. The function GetSymbolDSA() was added to fix this. It would be better to avoid the need for these special symbols, but this would require changes to how symbols are collected in lowering.

Besides that, some semantic checks need to know whether a DSA clause was used or not. To avoid confusing implicit symbols with DSA clauses, a new flag was added: OmpExplicit. It is now set for all symbols with explicitly determined data-sharing attributes.

With the changes above, AddToContextObjectWithDSA() and the symbol-to-DSA map could probably be removed and the DSA could be obtained directly from the symbol, but this was not attempted.

Some debug messages were also added, with the "omp" DEBUG_TYPE, to make it easier to debug the creation of implicit symbols and to visualize all associations of a given symbol.
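The host-association fallback described above can be sketched in isolation. The following is only a minimal illustration under stated assumptions, not the code from the patch: `ToyFlag`, `ToySymbol`, `MakeFlags`, `Has`, `GetToySymbolDSA` and `RunToyExample` are hypothetical names, a `std::bitset` stands in for flang's `Symbol::Flags`, and a plain `host` pointer stands in for a symbol's host-association details. It shows why testing a symbol's own flags is not enough, and how following the host association recovers the DSA.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Toy stand-ins, NOT flang's real types. Every name in this block is
// hypothetical and exists only for this illustration.
enum class ToyFlag : std::size_t {
  OmpPrivate,
  OmpFirstPrivate,
  OmpLastPrivate,
  OmpShared,
  OmpLinear,
  OmpReduction,
  OmpExplicit,
  OmpImplicit,
  Count // number of flags, used only to size the bitset
};
using ToyFlags = std::bitset<static_cast<std::size_t>(ToyFlag::Count)>;

// Build a flag set from a list of individual flags.
inline ToyFlags MakeFlags(std::initializer_list<ToyFlag> list) {
  ToyFlags flags;
  for (ToyFlag f : list) {
    flags.set(static_cast<std::size_t>(f));
  }
  return flags;
}

// Test whether a single flag is present in a flag set.
inline bool Has(const ToyFlags &flags, ToyFlag f) {
  return flags.test(static_cast<std::size_t>(f));
}

struct ToySymbol {
  ToyFlags flags;
  // Assumption: models "this symbol is host-associated with that one".
  const ToySymbol *host = nullptr;
};

// Return only the data-sharing flags of a symbol. If the symbol carries
// none, fall back to the symbol it is host-associated with, if any.
inline ToyFlags GetToySymbolDSA(const ToySymbol &symbol) {
  const ToyFlags dsaMask = MakeFlags({ToyFlag::OmpPrivate,
      ToyFlag::OmpFirstPrivate, ToyFlag::OmpLastPrivate, ToyFlag::OmpShared,
      ToyFlag::OmpLinear, ToyFlag::OmpReduction});
  const ToyFlags dsa = symbol.flags & dsaMask;
  if (dsa.any()) {
    return dsa;
  }
  if (symbol.host != nullptr) {
    return GetToySymbolDSA(*symbol.host);
  }
  return ToyFlags{};
}

// Small self-check: a flagless symbol inherits the DSA of its host.
inline void RunToyExample() {
  ToySymbol outer;
  outer.flags = MakeFlags({ToyFlag::OmpShared, ToyFlag::OmpExplicit});
  ToySymbol inner; // no flags of its own, only a host association
  inner.host = &outer;
  assert(!Has(inner.flags, ToyFlag::OmpShared)); // naive test misses it
  assert(Has(GetToySymbolDSA(inner), ToyFlag::OmpShared)); // fallback finds it
}
```

In this sketch the mask keeps non-DSA flags such as `OmpExplicit` out of the result, so a caller asking for the DSA only ever sees data-sharing flags. Whether the real helper should also report explicit or implicit status is a separate design question that this illustration does not address.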
Fixes #130533

>From 38294dd11c7b12e793e42569f94d76a156a92cfe Mon Sep 17 00:00:00 2001
From: Leandro Lupori
Date: Fri, 23 May 2025 11:25:16 -0300
Subject: [PATCH] [flang][OpenMP] Explicitly set Shared DSA in symbols

Before this change, OmpShared was not always set in shared symbols. Instead, absence of private flags was interpreted as shared DSA. The problem was that symbols with no flags, with only a host association, could also mean "has same DSA as in the enclosing context". Now shared symbols behave the same as private and can be treated the same way.

Because of the host association symbols with no flags mentioned above, it was also incorrect to simply test the flags of a given symbol to find out if it was private or shared. The function GetSymbolDSA() was added to fix this. It would be better to avoid the need of these special symbols, but this would require changes to how symbols are collected in lowering.

Besides that, some semantic checks need to know if a DSA clause was used or not. To avoid confusing implicit symbols with DSA clauses a new flag was added: OmpExplicit. It is now set for all symbols with explicitly determined data-sharing attributes.

With the changes above, AddToContextObjectWithDSA() and the symbol to DSA map could probably be removed and the DSA could be obtained directly from the symbol, but this was not attempted.

Some debug messages were also added, with the "omp" DEBUG_TYPE, to make it easier to debug the creation of implicit symbols and to visualize all associations of a given symbol.
Fixes #130533 --- flang/include/flang/Semantics/openmp-dsa.h | 20 ++ flang/include/flang/Semantics/symbol.h | 4 +- flang/lib/Lower/Bridge.cpp | 4 +- flang/lib/Semantics/CMakeLists.txt | 1 + flang/lib/Semantics/openmp-dsa.cpp | 29 ++ flang/lib/Semantics/resolve-directives.cpp | 301 +++++++++++++----- flang/test/Semantics/OpenMP/common-block.f90 | 6 +- flang/test/Semantics/OpenMP/copyprivate03.f90 | 12 + .../test/Semantics/OpenMP/default-clause.f90 | 6 +- .../Semantics/OpenMP/do05-positivecase.f90 | 6 +- flang/test/Semantics/OpenMP/do20.f90 | 2 +- flang/test/Semantics/OpenMP/forall.f90 | 4 +- flang/test/Semantics/OpenMP/implicit-dsa.f90 | 35 +- flang/test/Semantics/OpenMP/reduction08.f90 | 20 +- flang/test/Semantics/OpenMP/reduction09.f90 | 14 +- flang/test/Semantics/OpenMP/reduction11.f90 | 2 +- flang/test/Semantics/OpenMP/scan2.f90 | 4 +- flang/test/Semantics/OpenMP/symbol01.f90 | 12 +- flang/test/Semantics/OpenMP/symbol02.f90 | 8 +- flang/test/Semantics/OpenMP/symbol03.f90 | 8 +- flang/test/Semantics/OpenMP/symbol04.f90 | 4 +- flang/test/Semantics/OpenMP/symbol05.f90 | 2 +- flang/test/Semantics/OpenMP/symbol06.f90 | 2 +- flang/test/Semantics/OpenMP/symbol07.f90 | 4 +- flang/test/Semantics/OpenMP/symbol08.f90 | 36 +-- flang/test/Semantics/OpenMP/symbol09.f90 | 4 +- 26 files changed, 380 insertions(+), 170 deletions(-) create mode 100644 flang/include/flang/Semantics/openmp-dsa.h create mode 100644 flang/lib/Semantics/openmp-dsa.cpp diff --git a/flang/include/flang/Semantics/openmp-dsa.h b/flang/include/flang/Semantics/openmp-dsa.h new file mode 100644 index 0000000000000..4b94a679f29ef --- /dev/null +++ b/flang/include/flang/Semantics/openmp-dsa.h @@ -0,0 +1,20 @@ +//===-- include/flang/Semantics/openmp-dsa.h --------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#ifndef FORTRAN_SEMANTICS_OPENMP_DSA_H_ +#define FORTRAN_SEMANTICS_OPENMP_DSA_H_ + +#include "flang/Semantics/symbol.h" + +namespace Fortran::semantics { + +Symbol::Flags GetSymbolDSA(const Symbol &symbol); + +} // namespace Fortran::semantics + +#endif // FORTRAN_SEMANTICS_OPENMP_DSA_H_ diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 4cded64d170cd..59920e08cc926 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -785,8 +785,8 @@ class Symbol { OmpAllocate, OmpDeclarativeAllocateDirective, OmpExecutableAllocateDirective, OmpDeclareSimd, OmpDeclareTarget, OmpThreadprivate, OmpDeclareReduction, OmpFlushed, OmpCriticalLock, - OmpIfSpecified, OmpNone, OmpPreDetermined, OmpImplicit, OmpDependObject, - OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction); + OmpIfSpecified, OmpNone, OmpPreDetermined, OmpExplicit, OmpImplicit, + OmpDependObject, OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction); using Flags = common::EnumSet; const Scope &owner() const { return *owner_; } diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index c9e91cf3e8042..86d5e0d37bc38 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -58,6 +58,7 @@ #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Parser/parse-tree.h" #include "flang/Runtime/iostat-consts.h" +#include "flang/Semantics/openmp-dsa.h" #include "flang/Semantics/runtime-type-info.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" @@ -1385,7 +1386,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { if (isUnordered || sym.has() || sym.has()) { if (!shallowLookupSymbol(sym) && - !sym.test(Fortran::semantics::Symbol::Flag::OmpShared)) { + !GetSymbolDSA(sym).test( + 
Fortran::semantics::Symbol::Flag::OmpShared)) { // Do concurrent loop variables are not mapped yet since they are local // to the Do concurrent scope (same for OpenMP loops). mlir::OpBuilder::InsertPoint insPt = builder->saveInsertionPoint(); diff --git a/flang/lib/Semantics/CMakeLists.txt b/flang/lib/Semantics/CMakeLists.txt index bd8cc47365f06..18c89587843a9 100644 --- a/flang/lib/Semantics/CMakeLists.txt +++ b/flang/lib/Semantics/CMakeLists.txt @@ -32,6 +32,7 @@ add_flang_library(FortranSemantics dump-expr.cpp expression.cpp mod-file.cpp + openmp-dsa.cpp openmp-modifiers.cpp pointer-assignment.cpp program-tree.cpp diff --git a/flang/lib/Semantics/openmp-dsa.cpp b/flang/lib/Semantics/openmp-dsa.cpp new file mode 100644 index 0000000000000..48aa36febe5c5 --- /dev/null +++ b/flang/lib/Semantics/openmp-dsa.cpp @@ -0,0 +1,29 @@ +//===-- flang/lib/Semantics/openmp-dsa.cpp ----------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Semantics/openmp-dsa.h" + +namespace Fortran::semantics { + +Symbol::Flags GetSymbolDSA(const Symbol &symbol) { + Symbol::Flags dsaFlags{Symbol::Flag::OmpPrivate, + Symbol::Flag::OmpFirstPrivate, Symbol::Flag::OmpLastPrivate, + Symbol::Flag::OmpShared, Symbol::Flag::OmpLinear, + Symbol::Flag::OmpReduction}; + Symbol::Flags dsa{symbol.flags() & dsaFlags}; + if (dsa.any()) { + return dsa; + } + // If no DSA are set use those from the host associated symbol, if any. 
+ if (const auto *details{symbol.detailsIf()}) { + return GetSymbolDSA(details->symbol()); + } + return {}; +} + +} // namespace Fortran::semantics diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 9fa7bc8964854..e604bca4213c8 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -19,9 +19,11 @@ #include "flang/Parser/parse-tree.h" #include "flang/Parser/tools.h" #include "flang/Semantics/expression.h" +#include "flang/Semantics/openmp-dsa.h" #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" +#include "llvm/Support/Debug.h" #include #include #include @@ -111,10 +113,9 @@ template class DirectiveAttributeVisitor { const parser::Name *GetLoopIndex(const parser::DoConstruct &); const parser::DoConstruct *GetDoConstructIf( const parser::ExecutionPartConstruct &); - Symbol *DeclareNewPrivateAccessEntity(const Symbol &, Symbol::Flag, Scope &); - Symbol *DeclarePrivateAccessEntity( - const parser::Name &, Symbol::Flag, Scope &); - Symbol *DeclarePrivateAccessEntity(Symbol &, Symbol::Flag, Scope &); + Symbol *DeclareNewAccessEntity(const Symbol &, Symbol::Flag, Scope &); + Symbol *DeclareAccessEntity(const parser::Name &, Symbol::Flag, Scope &); + Symbol *DeclareAccessEntity(Symbol &, Symbol::Flag, Scope &); Symbol *DeclareOrMarkOtherAccessEntity(const parser::Name &, Symbol::Flag); UnorderedSymbolSet dataSharingAttributeObjects_; // on one directive @@ -749,10 +750,11 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { Symbol::Flags ompFlagsRequireNewSymbol{Symbol::Flag::OmpPrivate, Symbol::Flag::OmpLinear, Symbol::Flag::OmpFirstPrivate, - Symbol::Flag::OmpLastPrivate, Symbol::Flag::OmpReduction, - Symbol::Flag::OmpCriticalLock, Symbol::Flag::OmpCopyIn, - Symbol::Flag::OmpUseDevicePtr, Symbol::Flag::OmpUseDeviceAddr, - Symbol::Flag::OmpIsDevicePtr, Symbol::Flag::OmpHasDeviceAddr}; + 
Symbol::Flag::OmpLastPrivate, Symbol::Flag::OmpShared, + Symbol::Flag::OmpReduction, Symbol::Flag::OmpCriticalLock, + Symbol::Flag::OmpCopyIn, Symbol::Flag::OmpUseDevicePtr, + Symbol::Flag::OmpUseDeviceAddr, Symbol::Flag::OmpIsDevicePtr, + Symbol::Flag::OmpHasDeviceAddr}; Symbol::Flags ompFlagsRequireMark{Symbol::Flag::OmpThreadprivate, Symbol::Flag::OmpDeclareTarget, Symbol::Flag::OmpExclusiveScan, @@ -829,8 +831,24 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { void IssueNonConformanceWarning( llvm::omp::Directive D, parser::CharBlock source); - void CreateImplicitSymbols( - const Symbol *symbol, std::optional setFlag = std::nullopt); + void CreateImplicitSymbols(const Symbol *symbol); + + void AddToContextObjectWithExplicitDSA(Symbol &symbol, Symbol::Flag flag) { + AddToContextObjectWithDSA(symbol, flag); + if (dataSharingAttributeFlags.test(flag)) { + symbol.set(Symbol::Flag::OmpExplicit); + } + } + + // Clear any previous data-sharing attribute flags and set the new ones. + // Needed when setting PreDetermined DSAs, that take precedence over + // Implicit ones. 
+ void SetSymbolDSA(Symbol &symbol, Symbol::Flags flags) { + symbol.flags() &= ~(dataSharingAttributeFlags | + Symbol::Flags{Symbol::Flag::OmpExplicit, Symbol::Flag::OmpImplicit, + Symbol::Flag::OmpPreDetermined}); + symbol.flags() |= flags; + } }; template @@ -867,7 +885,7 @@ const parser::DoConstruct *DirectiveAttributeVisitor::GetDoConstructIf( } template -Symbol *DirectiveAttributeVisitor::DeclareNewPrivateAccessEntity( +Symbol *DirectiveAttributeVisitor::DeclareNewAccessEntity( const Symbol &object, Symbol::Flag flag, Scope &scope) { assert(object.owner() != currScope()); auto &symbol{MakeAssocSymbol(object.name(), object, scope)}; @@ -880,20 +898,20 @@ Symbol *DirectiveAttributeVisitor::DeclareNewPrivateAccessEntity( } template -Symbol *DirectiveAttributeVisitor::DeclarePrivateAccessEntity( +Symbol *DirectiveAttributeVisitor::DeclareAccessEntity( const parser::Name &name, Symbol::Flag flag, Scope &scope) { if (!name.symbol) { return nullptr; // not resolved by Name Resolution step, do nothing } - name.symbol = DeclarePrivateAccessEntity(*name.symbol, flag, scope); + name.symbol = DeclareAccessEntity(*name.symbol, flag, scope); return name.symbol; } template -Symbol *DirectiveAttributeVisitor::DeclarePrivateAccessEntity( +Symbol *DirectiveAttributeVisitor::DeclareAccessEntity( Symbol &object, Symbol::Flag flag, Scope &scope) { if (object.owner() != currScope()) { - return DeclareNewPrivateAccessEntity(object, flag, scope); + return DeclareNewAccessEntity(object, flag, scope); } else { object.set(flag); return &object; @@ -1600,6 +1618,20 @@ void AccAttributeVisitor::CheckMultipleAppearances( } } +#ifndef NDEBUG + +#define DEBUG_TYPE "omp" + +static llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, const Symbol::Flags &flags); + +namespace dbg { +static void DumpAssocSymbols(llvm::raw_ostream &os, const Symbol &sym); +static std::string ScopeSourcePos(const Fortran::semantics::Scope &scope); +} // namespace dbg + +#endif + bool 
OmpAttributeVisitor::Pre(const parser::OpenMPBlockConstruct &x) { const auto &beginBlockDir{std::get(x.t)}; const auto &beginDir{std::get(beginBlockDir.t)}; @@ -1792,12 +1824,12 @@ void OmpAttributeVisitor::ResolveSeqLoopIndexInParallelOrTaskConstruct( } } } - // If this symbol is already Private or Firstprivate in the enclosing - // OpenMP parallel or task then there is nothing to do here. + // If this symbol already has an explicit data-sharing attribute in the + // enclosing OpenMP parallel or task then there is nothing to do here. if (auto *symbol{targetIt->scope.FindSymbol(iv.source)}) { if (symbol->owner() == targetIt->scope) { - if (symbol->test(Symbol::Flag::OmpPrivate) || - symbol->test(Symbol::Flag::OmpFirstPrivate)) { + if (symbol->test(Symbol::Flag::OmpExplicit) && + (symbol->flags() & dataSharingAttributeFlags).any()) { return; } } @@ -1806,7 +1838,8 @@ void OmpAttributeVisitor::ResolveSeqLoopIndexInParallelOrTaskConstruct( // parallel or task if (auto *symbol{ResolveOmp(iv, Symbol::Flag::OmpPrivate, targetIt->scope)}) { targetIt++; - symbol->set(Symbol::Flag::OmpPreDetermined); + SetSymbolDSA( + *symbol, {Symbol::Flag::OmpPreDetermined, Symbol::Flag::OmpPrivate}); iv.symbol = symbol; // adjust the symbol within region for (auto it{dirContext_.rbegin()}; it != targetIt; ++it) { AddToContextObjectWithDSA(*symbol, Symbol::Flag::OmpPrivate, *it); @@ -1918,7 +1951,7 @@ void OmpAttributeVisitor::PrivatizeAssociatedLoopIndexAndCheckLoopLevel( const parser::Name *iv{GetLoopIndex(*loop)}; if (iv) { if (auto *symbol{ResolveOmp(*iv, ivDSA, currScope())}) { - symbol->set(Symbol::Flag::OmpPreDetermined); + SetSymbolDSA(*symbol, {Symbol::Flag::OmpPreDetermined, ivDSA}); iv->symbol = symbol; // adjust the symbol within region AddToContextObjectWithDSA(*symbol, ivDSA); } @@ -2178,42 +2211,48 @@ static bool IsPrivatizable(const Symbol *sym) { misc->kind() != MiscDetails::Kind::ConstructName)); } -void OmpAttributeVisitor::CreateImplicitSymbols( - const Symbol *symbol, 
std::optional setFlag) { +void OmpAttributeVisitor::CreateImplicitSymbols(const Symbol *symbol) { if (!IsPrivatizable(symbol)) { return; } + LLVM_DEBUG(llvm::dbgs() << "CreateImplicitSymbols: " << *symbol << '\n'); + // Implicitly determined DSAs // OMP 5.2 5.1.1 - Variables Referenced in a Construct Symbol *lastDeclSymbol = nullptr; - std::optional prevDSA; + Symbol::Flags prevDSA; for (int dirDepth{0}; dirDepth < (int)dirContext_.size(); ++dirDepth) { DirContext &dirContext = dirContext_[dirDepth]; - std::optional dsa; + Symbol::Flags dsa; - for (auto symMap : dirContext.objectWithDSA) { - // if the `symbol` already has a data-sharing attribute - if (symMap.first->name() == symbol->name()) { - dsa = symMap.second; - break; + Scope &scope{context_.FindScope(dirContext.directiveSource)}; + auto it{scope.find(symbol->name())}; + if (it != scope.end()) { + // There is already a symbol in the current scope, use its DSA. + dsa = GetSymbolDSA(*it->second); + } else { + for (auto symMap : dirContext.objectWithDSA) { + if (symMap.first->name() == symbol->name()) { + // `symbol` already has a data-sharing attribute in the current + // context, use it. + dsa.set(symMap.second); + break; + } } } // When handling each implicit rule for a given symbol, one of the - // following 3 actions may be taken: - // 1. Declare a new private symbol. - // 2. Create a new association symbol with no flags, that will represent - // a shared symbol in the current scope. Note that symbols without - // any private flags are considered as shared. - // 3. Use the last declared private symbol, by inserting a new symbol - // in the scope being processed, associated with it. - // If no private symbol was declared previously, then no association - // is needed and the symbol from the enclosing scope will be - // inherited by the current one. + // following actions may be taken: + // 1. Declare a new private or shared symbol. + // 2. 
Use the last declared symbol, by inserting a new symbol in the + // scope being processed, associated with it. + // If no symbol was declared previously, then no association is needed + // and the symbol from the enclosing scope will be inherited by the + // current one. // // Because of how symbols are collected in lowering, not inserting a new - // symbol in the last case could lead to the conclusion that a symbol + // symbol in the second case could lead to the conclusion that a symbol // from an enclosing construct was declared in the current construct, // which would result in wrong privatization code being generated. // Consider the following example: @@ -2231,46 +2270,71 @@ void OmpAttributeVisitor::CreateImplicitSymbols( // it would have the private flag set. // This would make x appear to be defined in p2, causing it to be // privatized in p2 and its privatization in p1 to be skipped. - auto makePrivateSymbol = [&](Symbol::Flag flag) { + auto makeSymbol = [&](Symbol::Flags flags) { const Symbol *hostSymbol = lastDeclSymbol ? lastDeclSymbol : &symbol->GetUltimate(); - lastDeclSymbol = DeclareNewPrivateAccessEntity( + assert(flags.LeastElement()); + Symbol::Flag flag = *flags.LeastElement(); + lastDeclSymbol = DeclareNewAccessEntity( *hostSymbol, flag, context_.FindScope(dirContext.directiveSource)); - if (setFlag) { - lastDeclSymbol->set(*setFlag); - } + lastDeclSymbol->flags() |= flags; return lastDeclSymbol; }; - auto makeSharedSymbol = [&](std::optional flag = {}) { - const Symbol *hostSymbol = - lastDeclSymbol ? lastDeclSymbol : &symbol->GetUltimate(); - Symbol &assocSymbol = MakeAssocSymbol(symbol->name(), *hostSymbol, - context_.FindScope(dirContext.directiveSource)); - if (flag) { - assocSymbol.set(*flag); - } - }; auto useLastDeclSymbol = [&]() { if (lastDeclSymbol) { - makeSharedSymbol(); + const Symbol *hostSymbol = + lastDeclSymbol ? 
lastDeclSymbol : &symbol->GetUltimate(); + MakeAssocSymbol(symbol->name(), *hostSymbol, + context_.FindScope(dirContext.directiveSource)); } }; +#ifndef NDEBUG + auto printImplicitRule = [&](const char *id) { + LLVM_DEBUG(llvm::dbgs() << "\t" << id << ": dsa: " << dsa << '\n'); + LLVM_DEBUG( + llvm::dbgs() << "\t\tScope: " << dbg::ScopeSourcePos(scope) << '\n'); + }; +#define PRINT_IMPLICIT_RULE(id) printImplicitRule(id) +#else +#define PRINT_IMPLICIT_RULE(id) +#endif + bool taskGenDir = llvm::omp::taskGeneratingSet.test(dirContext.directive); bool targetDir = llvm::omp::allTargetSet.test(dirContext.directive); bool parallelDir = llvm::omp::allParallelSet.test(dirContext.directive); bool teamsDir = llvm::omp::allTeamsSet.test(dirContext.directive); - if (dsa.has_value()) { - if (dsa.value() == Symbol::Flag::OmpShared && - (parallelDir || taskGenDir || teamsDir)) { - makeSharedSymbol(Symbol::Flag::OmpShared); + if (dsa.any()) { + if (parallelDir || taskGenDir || teamsDir) { + Symbol *prevDeclSymbol{lastDeclSymbol}; + // NOTE As `dsa` will match that of the symbol in the current scope + // (if any), we won't override the DSA of any existing symbol. + if ((dsa & dataSharingAttributeFlags).any()) { + makeSymbol(dsa); + } + // Fix host association of explicit symbols, as they can be created + // before implicit ones in enclosing scope. + if (prevDeclSymbol && prevDeclSymbol != lastDeclSymbol && + lastDeclSymbol->test(Symbol::Flag::OmpExplicit)) { + const auto *hostAssoc{lastDeclSymbol->detailsIf<HostAssocDetails>()}; + if (hostAssoc && hostAssoc->symbol() != *prevDeclSymbol) { + lastDeclSymbol->set_details(HostAssocDetails{*prevDeclSymbol}); + } + } } - // Private symbols will have been declared already. prevDSA = dsa; + PRINT_IMPLICIT_RULE("0) already has DSA"); continue; } + // NOTE Because of how lowering uses OmpImplicit flag, we can only set it + // for symbols with private DSA. 
+ // Also, as the default clause is handled separately in lowering, + // don't mark its symbols with OmpImplicit either. + // Ideally, lowering should be changed and all implicit symbols + // should be marked with OmpImplicit. + if (dirContext.defaultDSA == Symbol::Flag::OmpPrivate || dirContext.defaultDSA == Symbol::Flag::OmpFirstPrivate || dirContext.defaultDSA == Symbol::Flag::OmpShared) { @@ -2279,33 +2343,34 @@ void OmpAttributeVisitor::CreateImplicitSymbols( if (!parallelDir && !taskGenDir && !teamsDir) { return; } - if (dirContext.defaultDSA != Symbol::Flag::OmpShared) { - makePrivateSymbol(dirContext.defaultDSA); - } else { - makeSharedSymbol(); - } - dsa = dirContext.defaultDSA; + dsa = {dirContext.defaultDSA}; + makeSymbol(dsa); + PRINT_IMPLICIT_RULE("1) default"); } else if (parallelDir) { // 2) parallel -> shared - makeSharedSymbol(); - dsa = Symbol::Flag::OmpShared; + dsa = {Symbol::Flag::OmpShared}; + makeSymbol(dsa); + PRINT_IMPLICIT_RULE("2) parallel"); } else if (!taskGenDir && !targetDir) { // 3) enclosing context - useLastDeclSymbol(); dsa = prevDSA; + useLastDeclSymbol(); + PRINT_IMPLICIT_RULE("3) enclosing context"); } else if (targetDir) { // TODO 4) not mapped target variable -> firstprivate dsa = prevDSA; } else if (taskGenDir) { // TODO 5) dummy arg in orphaned taskgen construct -> firstprivate - if (prevDSA == Symbol::Flag::OmpShared) { + if (prevDSA.test(Symbol::Flag::OmpShared)) { // 6) shared in enclosing context -> shared - makeSharedSymbol(); - dsa = Symbol::Flag::OmpShared; + dsa = {Symbol::Flag::OmpShared}; + makeSymbol(dsa); + PRINT_IMPLICIT_RULE("6) taskgen: shared"); } else { // 7) firstprivate - dsa = Symbol::Flag::OmpFirstPrivate; - makePrivateSymbol(*dsa)->set(Symbol::Flag::OmpImplicit); + dsa = {Symbol::Flag::OmpFirstPrivate}; + makeSymbol(dsa)->set(Symbol::Flag::OmpImplicit); + PRINT_IMPLICIT_RULE("7) taskgen: firstprivate"); } } prevDSA = dsa; @@ -2371,7 +2436,7 @@ void OmpAttributeVisitor::ResolveOmpName( if 
(ResolveName(&name)) { if (auto *resolvedSymbol{ResolveOmp(name, ompFlag, currScope())}) { if (dataSharingAttributeFlags.test(ompFlag)) { - AddToContextObjectWithDSA(*resolvedSymbol, ompFlag); + AddToContextObjectWithExplicitDSA(*resolvedSymbol, ompFlag); } } } else if (ompFlag == Symbol::Flag::OmpCriticalLock) { @@ -2484,7 +2549,7 @@ void OmpAttributeVisitor::ResolveOmpObject( if (dataCopyingAttributeFlags.test(ompFlag)) { CheckDataCopyingClause(*name, *symbol, ompFlag); } else { - AddToContextObjectWithDSA(*symbol, ompFlag); + AddToContextObjectWithExplicitDSA(*symbol, ompFlag); if (dataSharingAttributeFlags.test(ompFlag)) { CheckMultipleAppearances(*name, *symbol, ompFlag); } @@ -2588,8 +2653,14 @@ void OmpAttributeVisitor::ResolveOmpObject( GetContext().directive))) { for (Symbol::Flag ompFlag1 : dataMappingAttributeFlags) { for (Symbol::Flag ompFlag2 : dataSharingAttributeFlags) { - checkExclusivelists( - hostAssocSym, ompFlag1, symbol, ompFlag2); + if ((hostAssocSym->test(ompFlag2) && + hostAssocSym->test( + Symbol::Flag::OmpExplicit)) || + (symbol->test(ompFlag2) && + symbol->test(Symbol::Flag::OmpExplicit))) { + checkExclusivelists( + hostAssocSym, ompFlag1, symbol, ompFlag2); + } } } } @@ -2624,7 +2695,7 @@ void OmpAttributeVisitor::ResolveOmpObject( if (dataCopyingAttributeFlags.test(ompFlag)) { CheckDataCopyingClause(name, *resolvedObject, ompFlag); } else { - AddToContextObjectWithDSA(*resolvedObject, ompFlag); + AddToContextObjectWithExplicitDSA(*resolvedObject, ompFlag); } details.replace_object(*resolvedObject, index); } @@ -2643,7 +2714,7 @@ void OmpAttributeVisitor::ResolveOmpObject( Symbol *OmpAttributeVisitor::ResolveOmp( const parser::Name &name, Symbol::Flag ompFlag, Scope &scope) { if (ompFlagsRequireNewSymbol.test(ompFlag)) { - return DeclarePrivateAccessEntity(name, ompFlag, scope); + return DeclareAccessEntity(name, ompFlag, scope); } else { return DeclareOrMarkOtherAccessEntity(name, ompFlag); } @@ -2652,7 +2723,7 @@ Symbol 
*OmpAttributeVisitor::ResolveOmp( Symbol *OmpAttributeVisitor::ResolveOmp( Symbol &symbol, Symbol::Flag ompFlag, Scope &scope) { if (ompFlagsRequireNewSymbol.test(ompFlag)) { - return DeclarePrivateAccessEntity(symbol, ompFlag, scope); + return DeclareAccessEntity(symbol, ompFlag, scope); } else { return DeclareOrMarkOtherAccessEntity(symbol, ompFlag); } @@ -2831,10 +2902,16 @@ static bool IsSymbolThreadprivate(const Symbol &symbol) { } static bool IsSymbolPrivate(const Symbol &symbol) { - if (symbol.test(Symbol::Flag::OmpPrivate) || - symbol.test(Symbol::Flag::OmpFirstPrivate)) { + LLVM_DEBUG(llvm::dbgs() << "IsSymbolPrivate(" << symbol.name() << "):\n"); + LLVM_DEBUG(dbg::DumpAssocSymbols(llvm::dbgs(), symbol)); + + if (Symbol::Flags dsa{GetSymbolDSA(symbol)}; dsa.any()) { + if (dsa.test(Symbol::Flag::OmpShared)) { + return false; + } return true; } + // A symbol that has not gone through constructs that may privatize the // original symbol may be predetermined as private. // (OMP 5.2 5.1.1 - Variables Referenced in a Construct) @@ -3080,4 +3157,60 @@ void OmpAttributeVisitor::IssueNonConformanceWarning( context_.Warn(common::UsageWarning::OpenMPUsage, source, "%s"_warn_en_US, warnStrOS.str()); } + +#ifndef NDEBUG + +static llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, const Symbol::Flags &flags) { + flags.Dump(os, Symbol::EnumToString); + return os; +} + +namespace dbg { + +static llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, std::optional srcPos) { + if (srcPos) { + os << *srcPos.value().path << ":" << srcPos.value().line << ": "; + } + return os; +} + +static std::optional GetSourcePosition( + const Fortran::semantics::Scope &scope, + const Fortran::parser::CharBlock &src) { + parser::AllCookedSources &allCookedSources{ + scope.context().allCookedSources()}; + if (std::optional prange{ + allCookedSources.GetProvenanceRange(src)}) { + return allCookedSources.allSources().GetSourcePosition(prange->start()); + } + return std::nullopt; +} + +// 
Returns a string containing the source location of `scope` followed by +// its first source line. +static std::string ScopeSourcePos(const Fortran::semantics::Scope &scope) { + const parser::CharBlock &sourceRange{scope.sourceRange()}; + std::string src{sourceRange.ToString()}; + size_t nl{src.find('\n')}; + std::string str; + llvm::raw_string_ostream ss{str}; + + ss << GetSourcePosition(scope, sourceRange) << src.substr(0, nl); + return str; +} + +static void DumpAssocSymbols(llvm::raw_ostream &os, const Symbol &sym) { + os << '\t' << sym << '\n'; + os << "\t\tOwner: " << ScopeSourcePos(sym.owner()) << '\n'; + if (const auto *details{sym.detailsIf<HostAssocDetails>()}) { + DumpAssocSymbols(os, details->symbol()); + } +} + +} // namespace dbg + +#endif + } // namespace Fortran::semantics diff --git a/flang/test/Semantics/OpenMP/common-block.f90 b/flang/test/Semantics/OpenMP/common-block.f90 index e1ddd120da857..93f29b12eacae 100644 --- a/flang/test/Semantics/OpenMP/common-block.f90 +++ b/flang/test/Semantics/OpenMP/common-block.f90 @@ -10,9 +10,9 @@ program main common /blk/ a, b, c !$omp parallel private(/blk/) !CHECK: OtherConstruct scope: size=0 alignment=1 - !CHECK: a (OmpPrivate): HostAssoc - !CHECK: b (OmpPrivate): HostAssoc - !CHECK: c (OmpPrivate): HostAssoc + !CHECK: a (OmpPrivate, OmpExplicit): HostAssoc + !CHECK: b (OmpPrivate, OmpExplicit): HostAssoc + !CHECK: c (OmpPrivate, OmpExplicit): HostAssoc call sub(a, b, c) !$omp end parallel end program diff --git a/flang/test/Semantics/OpenMP/copyprivate03.f90 b/flang/test/Semantics/OpenMP/copyprivate03.f90 index 9d39fdb6b13c8..fae190645b5e7 100644 --- a/flang/test/Semantics/OpenMP/copyprivate03.f90 +++ b/flang/test/Semantics/OpenMP/copyprivate03.f90 @@ -6,6 +6,8 @@ program omp_copyprivate integer :: a(10), b(10) + real, dimension(:), allocatable :: c + real, dimension(:), pointer :: d integer, save :: k !$omp threadprivate(k) @@ -43,4 +45,14 @@ program omp_copyprivate print *, a, b + !$omp task + !$omp parallel private(c, d) 
+ allocate(c(5)) + allocate(d(10)) + !$omp single + c = 22 + d = 33 + !$omp end single copyprivate(c, d) + !$omp end parallel + !$omp end task end program omp_copyprivate diff --git a/flang/test/Semantics/OpenMP/default-clause.f90 b/flang/test/Semantics/OpenMP/default-clause.f90 index 9cde77be2babe..d4c38ea56de53 100644 --- a/flang/test/Semantics/OpenMP/default-clause.f90 +++ b/flang/test/Semantics/OpenMP/default-clause.f90 @@ -15,8 +15,8 @@ program sample !CHECK: OtherConstruct scope: size=0 alignment=1 !CHECK: a (OmpPrivate): HostAssoc !CHECK: k (OmpPrivate): HostAssoc - !CHECK: x (OmpFirstPrivate): HostAssoc - !CHECK: y (OmpPrivate): HostAssoc + !CHECK: x (OmpFirstPrivate, OmpExplicit): HostAssoc + !CHECK: y (OmpPrivate, OmpExplicit): HostAssoc !CHECK: z (OmpPrivate): HostAssoc !$omp parallel default(private) !CHECK: OtherConstruct scope: size=0 alignment=1 @@ -34,7 +34,7 @@ program sample !$omp parallel default(firstprivate) shared(y) private(w) !CHECK: OtherConstruct scope: size=0 alignment=1 !CHECK: k (OmpFirstPrivate): HostAssoc - !CHECK: w (OmpPrivate): HostAssoc + !CHECK: w (OmpPrivate, OmpExplicit): HostAssoc !CHECK: z (OmpFirstPrivate): HostAssoc y = 30 w = 40 diff --git a/flang/test/Semantics/OpenMP/do05-positivecase.f90 b/flang/test/Semantics/OpenMP/do05-positivecase.f90 index 5e1b1b86f72f6..8481cb2fc2ca0 100644 --- a/flang/test/Semantics/OpenMP/do05-positivecase.f90 +++ b/flang/test/Semantics/OpenMP/do05-positivecase.f90 @@ -20,12 +20,12 @@ program omp_do !$omp parallel default(shared) !$omp do !DEF: /omp_do/OtherConstruct2/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) - !DEF: /omp_do/OtherConstruct2/n HostAssoc INTEGER(4) + !DEF: /omp_do/OtherConstruct2/OtherConstruct1/n HostAssoc INTEGER(4) do i=1,n !$omp parallel !$omp single !DEF: /work EXTERNAL (Subroutine) ProcEntity - !DEF: /omp_do/OtherConstruct2/OtherConstruct1/OtherConstruct1/i HostAssoc INTEGER(4) + !DEF: 
/omp_do/OtherConstruct2/OtherConstruct1/OtherConstruct1/OtherConstruct1/i HostAssoc INTEGER(4) call work(i, 1) !$omp end single !$omp end parallel @@ -34,7 +34,7 @@ program omp_do !$omp end parallel !$omp parallel private(i) - !DEF: /omp_do/OtherConstruct3/i (OmpPrivate) HostAssoc INTEGER(4) + !DEF: /omp_do/OtherConstruct3/i (OmpPrivate, OmpExplicit) HostAssoc INTEGER(4) do i=1,10 !$omp single print *, "hello" diff --git a/flang/test/Semantics/OpenMP/do20.f90 b/flang/test/Semantics/OpenMP/do20.f90 index 040a82079590f..ee305ad1a34cf 100644 --- a/flang/test/Semantics/OpenMP/do20.f90 +++ b/flang/test/Semantics/OpenMP/do20.f90 @@ -10,7 +10,7 @@ subroutine shared_iv !$omp parallel shared(i) !$omp single - !DEF: /shared_iv/OtherConstruct1/i (OmpShared) HostAssoc INTEGER(4) + !DEF: /shared_iv/OtherConstruct1/OtherConstruct1/i HostAssoc INTEGER(4) do i = 0, 1 end do !$omp end single diff --git a/flang/test/Semantics/OpenMP/forall.f90 b/flang/test/Semantics/OpenMP/forall.f90 index 58492664a4e85..b862b4b27641b 100644 --- a/flang/test/Semantics/OpenMP/forall.f90 +++ b/flang/test/Semantics/OpenMP/forall.f90 @@ -18,8 +18,8 @@ !$omp parallel !DEF: /MainProgram1/OtherConstruct1/Forall1/i (Implicit) ObjectEntity INTEGER(4) - !DEF: /MainProgram1/OtherConstruct1/a HostAssoc INTEGER(4) - !DEF: /MainProgram1/OtherConstruct1/b HostAssoc INTEGER(4) + !DEF: /MainProgram1/OtherConstruct1/a (OmpShared) HostAssoc INTEGER(4) + !DEF: /MainProgram1/OtherConstruct1/b (OmpShared) HostAssoc INTEGER(4) forall(i = 1:5) a(i) = b(i) * 2 !$omp end parallel diff --git a/flang/test/Semantics/OpenMP/implicit-dsa.f90 b/flang/test/Semantics/OpenMP/implicit-dsa.f90 index a7ed834b0f1c6..7e38435274b7b 100644 --- a/flang/test/Semantics/OpenMP/implicit-dsa.f90 +++ b/flang/test/Semantics/OpenMP/implicit-dsa.f90 @@ -14,15 +14,15 @@ subroutine implicit_dsa_test1 !$omp task private(y) shared(z) !DEF: /implicit_dsa_test1/OtherConstruct1/x (OmpFirstPrivate, OmpImplicit) HostAssoc INTEGER(4) - !DEF: 
/implicit_dsa_test1/OtherConstruct1/y (OmpPrivate) HostAssoc INTEGER(4) - !DEF: /implicit_dsa_test1/OtherConstruct1/z (OmpShared) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test1/OtherConstruct1/y (OmpPrivate, OmpExplicit) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test1/OtherConstruct1/z (OmpShared, OmpExplicit) HostAssoc INTEGER(4) x = y + z !$omp end task !$omp task default(shared) - !DEF: /implicit_dsa_test1/OtherConstruct2/x HostAssoc INTEGER(4) - !DEF: /implicit_dsa_test1/OtherConstruct2/y HostAssoc INTEGER(4) - !DEF: /implicit_dsa_test1/OtherConstruct2/z HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test1/OtherConstruct2/x (OmpShared) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test1/OtherConstruct2/y (OmpShared) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test1/OtherConstruct2/z (OmpShared) HostAssoc INTEGER(4) x = y + z !$omp end task @@ -61,16 +61,16 @@ subroutine implicit_dsa_test3 !$omp parallel !$omp task - !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct1/x HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct1/x (OmpShared) HostAssoc INTEGER(4) x = 1 - !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct1/y HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct1/y (OmpShared) HostAssoc INTEGER(4) y = 1 !$omp end task !$omp task firstprivate(x) - !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct2/x (OmpFirstPrivate) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct2/x (OmpFirstPrivate, OmpExplicit) HostAssoc INTEGER(4) x = 1 - !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct2/z HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct2/z (OmpShared) HostAssoc INTEGER(4) z = 1 !$omp end task !$omp end parallel @@ -110,7 +110,7 @@ subroutine implicit_dsa_test5 !$omp parallel default(private) !$omp task !$omp parallel - !DEF: /implicit_dsa_test5/OtherConstruct1/OtherConstruct1/OtherConstruct1/x HostAssoc INTEGER(4) + !DEF: 
/implicit_dsa_test5/OtherConstruct1/OtherConstruct1/OtherConstruct1/x (OmpShared) HostAssoc INTEGER(4) x = 1 !$omp end parallel !$omp end task @@ -133,7 +133,7 @@ subroutine implicit_dsa_test6 !$omp end parallel !$omp parallel default(firstprivate) shared(y) - !DEF: /implicit_dsa_test6/OtherConstruct1/OtherConstruct2/y (OmpShared) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test6/OtherConstruct1/OtherConstruct2/y (OmpShared, OmpExplicit) HostAssoc INTEGER(4) !DEF: /implicit_dsa_test6/OtherConstruct1/OtherConstruct2/x (OmpFirstPrivate) HostAssocINTEGER(4) !DEF: /implicit_dsa_test6/OtherConstruct1/OtherConstruct2/z (OmpFirstPrivate) HostAssocINTEGER(4) y = x + z @@ -156,3 +156,16 @@ subroutine implicit_dsa_test7 !$omp end taskgroup !$omp end task end subroutine + +! Predetermined loop iteration variable. +!DEF: /implicit_dsa_test8 (Subroutine) Subprogram +subroutine implicit_dsa_test8 + !DEF: /implicit_dsa_test8/i ObjectEntity INTEGER(4) + integer i + + !$omp task + !DEF: /implicit_dsa_test8/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) + do i = 1, 10 + end do + !$omp end task +end subroutine diff --git a/flang/test/Semantics/OpenMP/reduction08.f90 b/flang/test/Semantics/OpenMP/reduction08.f90 index 9442fbd4d5978..01a06eb7d7414 100644 --- a/flang/test/Semantics/OpenMP/reduction08.f90 +++ b/flang/test/Semantics/OpenMP/reduction08.f90 @@ -13,9 +13,9 @@ program omp_reduction !$omp parallel do reduction(max:k) !DEF: /omp_reduction/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct1/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct1/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) !DEF: /omp_reduction/max ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /omp_reduction/OtherConstruct1/m HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct1/m (OmpShared) HostAssoc INTEGER(4) k = max(k, m) end do !$omp end parallel do @@ -23,9 +23,9 @@ program 
omp_reduction !$omp parallel do reduction(min:k) !DEF: /omp_reduction/OtherConstruct2/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct2/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct2/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) !DEF: /omp_reduction/min ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /omp_reduction/OtherConstruct2/m HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct2/m (OmpShared) HostAssoc INTEGER(4) k = min(k, m) end do !$omp end parallel do @@ -33,9 +33,9 @@ program omp_reduction !$omp parallel do reduction(iand:k) !DEF: /omp_reduction/OtherConstruct3/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct3/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct3/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) !DEF: /omp_reduction/iand ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /omp_reduction/OtherConstruct3/m HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct3/m (OmpShared) HostAssoc INTEGER(4) k = iand(k, m) end do !$omp end parallel do @@ -43,9 +43,9 @@ program omp_reduction !$omp parallel do reduction(ior:k) !DEF: /omp_reduction/OtherConstruct4/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct4/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct4/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) !DEF: /omp_reduction/ior ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /omp_reduction/OtherConstruct4/m HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct4/m (OmpShared) HostAssoc INTEGER(4) k = ior(k, m) end do !$omp end parallel do @@ -53,9 +53,9 @@ program omp_reduction !$omp parallel do reduction(ieor:k) !DEF: /omp_reduction/OtherConstruct5/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct5/k (OmpReduction) 
HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct5/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) !DEF: /omp_reduction/ieor ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /omp_reduction/OtherConstruct5/m HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct5/m (OmpShared) HostAssoc INTEGER(4) k = ieor(k,m) end do !$omp end parallel do diff --git a/flang/test/Semantics/OpenMP/reduction09.f90 b/flang/test/Semantics/OpenMP/reduction09.f90 index 1af2fc4fd9691..d6c71c30d2834 100644 --- a/flang/test/Semantics/OpenMP/reduction09.f90 +++ b/flang/test/Semantics/OpenMP/reduction09.f90 @@ -16,7 +16,7 @@ program omp_reduction !$omp do reduction(+:k) !DEF: /omp_reduction/OtherConstruct1/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct1/OtherConstruct1/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct1/OtherConstruct1/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) k = k+1 end do !$omp end do @@ -26,7 +26,7 @@ program omp_reduction !$omp parallel do reduction(+:a(10)) !DEF: /omp_reduction/OtherConstruct2/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct2/k HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct2/k (OmpShared) HostAssoc INTEGER(4) k = k+1 end do !$omp end parallel do @@ -35,7 +35,7 @@ program omp_reduction !$omp parallel do reduction(+:a(1:10:1)) !DEF: /omp_reduction/OtherConstruct3/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct3/k HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct3/k (OmpShared) HostAssoc INTEGER(4) k = k+1 end do !$omp end parallel do @@ -43,7 +43,7 @@ program omp_reduction !$omp parallel do reduction(+:b(1:10:1,1:5,2)) !DEF: /omp_reduction/OtherConstruct4/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct4/k HostAssoc INTEGER(4) + !DEF: 
/omp_reduction/OtherConstruct4/k (OmpShared) HostAssoc INTEGER(4) k = k+1 end do !$omp end parallel do @@ -51,7 +51,7 @@ program omp_reduction !$omp parallel do reduction(+:b(1:10:1,1:5,2:5:1)) !DEF: /omp_reduction/OtherConstruct5/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct5/k HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct5/k (OmpShared) HostAssoc INTEGER(4) k = k+1 end do !$omp end parallel do @@ -60,7 +60,7 @@ program omp_reduction !$omp do reduction(+:k) reduction(+:j) !DEF: /omp_reduction/OtherConstruct6/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct6/OtherConstruct1/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct6/OtherConstruct1/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) k = k+1 end do !$omp end do @@ -69,7 +69,7 @@ program omp_reduction !$omp do reduction(+:k) reduction(*:j) reduction(+:l) !DEF: /omp_reduction/OtherConstruct7/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct7/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct7/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) k = k+1 end do !$omp end do diff --git a/flang/test/Semantics/OpenMP/reduction11.f90 b/flang/test/Semantics/OpenMP/reduction11.f90 index 3893fe70b407f..b2ad0f6a6ee11 100644 --- a/flang/test/Semantics/OpenMP/reduction11.f90 +++ b/flang/test/Semantics/OpenMP/reduction11.f90 @@ -12,7 +12,7 @@ program omp_reduction ! CHECK: OtherConstruct scope ! CHECK: i (OmpPrivate, OmpPreDetermined): HostAssoc - ! CHECK: k (OmpReduction): HostAssoc + ! CHECK: k (OmpReduction, OmpExplicit): HostAssoc ! 
CHECK: max, INTRINSIC: ProcEntity !$omp parallel do reduction(max:k) do i=1,10 diff --git a/flang/test/Semantics/OpenMP/scan2.f90 b/flang/test/Semantics/OpenMP/scan2.f90 index 5232e63aa6b4f..ffe84910f88a2 100644 --- a/flang/test/Semantics/OpenMP/scan2.f90 +++ b/flang/test/Semantics/OpenMP/scan2.f90 @@ -12,13 +12,13 @@ program omp_reduction ! CHECK: OtherConstruct scope ! CHECK: i (OmpPrivate, OmpPreDetermined): HostAssoc - ! CHECK: k (OmpReduction, OmpInclusiveScan, OmpInScanReduction): HostAssoc + ! CHECK: k (OmpReduction, OmpExplicit, OmpInclusiveScan, OmpInScanReduction): HostAssoc !$omp parallel do reduction(inscan, +:k) do i=1,10 !$omp scan inclusive(k) end do !$omp end parallel do - ! CHECK: m (OmpReduction, OmpExclusiveScan, OmpInScanReduction): HostAssoc + ! CHECK: m (OmpReduction, OmpExplicit, OmpExclusiveScan, OmpInScanReduction): HostAssoc !$omp parallel do reduction(inscan, +:m) do i=1,10 !$omp scan exclusive(m) diff --git a/flang/test/Semantics/OpenMP/symbol01.f90 b/flang/test/Semantics/OpenMP/symbol01.f90 index a40a8563fde1f..595b6b89c84fd 100644 --- a/flang/test/Semantics/OpenMP/symbol01.f90 +++ b/flang/test/Semantics/OpenMP/symbol01.f90 @@ -47,22 +47,22 @@ program mm !$omp parallel do private(a,t,/c/) shared(c) !DEF: /mm/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /mm/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(4) - !DEF: /mm/OtherConstruct1/b HostAssoc INTEGER(4) + !DEF: /mm/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(4) + !DEF: /mm/OtherConstruct1/b (OmpShared) HostAssoc INTEGER(4) !REF: /mm/OtherConstruct1/i a = a+b(i) - !DEF: /mm/OtherConstruct1/t (OmpPrivate) HostAssoc TYPE(myty) + !DEF: /mm/OtherConstruct1/t (OmpPrivate, OmpExplicit) HostAssoc TYPE(myty) !REF: /md/myty/a !REF: /mm/OtherConstruct1/i t%a = i - !DEF: /mm/OtherConstruct1/y (OmpPrivate) HostAssoc REAL(4) + !DEF: /mm/OtherConstruct1/y (OmpPrivate, OmpExplicit) HostAssoc REAL(4) y = 0. 
- !DEF: /mm/OtherConstruct1/x (OmpPrivate) HostAssoc REAL(4) + !DEF: /mm/OtherConstruct1/x (OmpPrivate, OmpExplicit) HostAssoc REAL(4) !REF: /mm/OtherConstruct1/a !REF: /mm/OtherConstruct1/i !REF: /mm/OtherConstruct1/y x = a+i+y - !DEF: /mm/OtherConstruct1/c (OmpShared) HostAssoc REAL(4) + !DEF: /mm/OtherConstruct1/c (OmpShared, OmpExplicit) HostAssoc REAL(4) c = 3.0 end do end program diff --git a/flang/test/Semantics/OpenMP/symbol02.f90 b/flang/test/Semantics/OpenMP/symbol02.f90 index 31d9cb2e46ba8..9007da042845a 100644 --- a/flang/test/Semantics/OpenMP/symbol02.f90 +++ b/flang/test/Semantics/OpenMP/symbol02.f90 @@ -11,13 +11,13 @@ !DEF: /MainProgram1/c (Implicit) ObjectEntity REAL(4) c = 0 !$omp parallel private(a,b) shared(c,d) - !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(4) a = 3. - !DEF: /MainProgram1/OtherConstruct1/b (OmpPrivate) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/b (OmpPrivate, OmpExplicit) HostAssoc REAL(4) b = 4 - !DEF: /MainProgram1/OtherConstruct1/c (OmpShared) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/c (OmpShared, OmpExplicit) HostAssoc REAL(4) c = 5 - !DEF: /MainProgram1/OtherConstruct1/d (OmpShared) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/d (OmpShared, OmpExplicit) HostAssoc REAL(4) d = 6 !$omp end parallel !DEF: /MainProgram1/a (Implicit) ObjectEntity REAL(4) diff --git a/flang/test/Semantics/OpenMP/symbol03.f90 b/flang/test/Semantics/OpenMP/symbol03.f90 index 08defb40e56a7..d67c1fdf333c4 100644 --- a/flang/test/Semantics/OpenMP/symbol03.f90 +++ b/flang/test/Semantics/OpenMP/symbol03.f90 @@ -7,14 +7,14 @@ !DEF: /MainProgram1/b (Implicit) ObjectEntity REAL(4) b = 2 !$omp parallel private(a) shared(b) - !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(4) a = 3. 
- !DEF: /MainProgram1/OtherConstruct1/b (OmpShared) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/b (OmpShared, OmpExplicit) HostAssoc REAL(4) b = 4 !$omp parallel private(b) shared(a) - !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/a (OmpShared) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/a (OmpShared, OmpExplicit) HostAssoc REAL(4) a = 5. - !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/b (OmpPrivate) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/b (OmpPrivate, OmpExplicit) HostAssoc REAL(4) b = 6 !$omp end parallel !$omp end parallel diff --git a/flang/test/Semantics/OpenMP/symbol04.f90 b/flang/test/Semantics/OpenMP/symbol04.f90 index 808d1e0dd09be..834b166266376 100644 --- a/flang/test/Semantics/OpenMP/symbol04.f90 +++ b/flang/test/Semantics/OpenMP/symbol04.f90 @@ -9,12 +9,12 @@ !REF: /MainProgram1/a a = 3.14 !$omp parallel private(a) - !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(8) + !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(8) a = 2. !$omp do private(a) !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(8) + !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(8) a = 1. 
end do !$omp end parallel diff --git a/flang/test/Semantics/OpenMP/symbol05.f90 b/flang/test/Semantics/OpenMP/symbol05.f90 index 1ad0c10a40135..fe01f15d20aa3 100644 --- a/flang/test/Semantics/OpenMP/symbol05.f90 +++ b/flang/test/Semantics/OpenMP/symbol05.f90 @@ -15,7 +15,7 @@ subroutine foo !DEF: /mm/foo/a ObjectEntity INTEGER(4) integer :: a = 3 !$omp parallel - !DEF: /mm/foo/OtherConstruct1/a HostAssoc INTEGER(4) + !DEF: /mm/foo/OtherConstruct1/a (OmpShared) HostAssoc INTEGER(4) a = 1 !DEF: /mm/i PUBLIC (Implicit, OmpThreadprivate) ObjectEntity INTEGER(4) !REF: /mm/foo/OtherConstruct1/a diff --git a/flang/test/Semantics/OpenMP/symbol06.f90 b/flang/test/Semantics/OpenMP/symbol06.f90 index 906264eb12642..daf3874b79af6 100644 --- a/flang/test/Semantics/OpenMP/symbol06.f90 +++ b/flang/test/Semantics/OpenMP/symbol06.f90 @@ -10,7 +10,7 @@ !$omp parallel do firstprivate(a) lastprivate(a) !DEF: /MainProgram1/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /MainProgram1/OtherConstruct1/a (OmpFirstPrivate, OmpLastPrivate) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/a (OmpFirstPrivate, OmpLastPrivate, OmpExplicit) HostAssoc REAL(4) a = 2. end do end program diff --git a/flang/test/Semantics/OpenMP/symbol07.f90 b/flang/test/Semantics/OpenMP/symbol07.f90 index a375942ebb1d9..86b7305411347 100644 --- a/flang/test/Semantics/OpenMP/symbol07.f90 +++ b/flang/test/Semantics/OpenMP/symbol07.f90 @@ -21,9 +21,9 @@ subroutine function_call_in_region !DEF: /function_call_in_region/b ObjectEntity REAL(4) real :: b = 5. 
!$omp parallel default(none) private(a) shared(b) - !DEF: /function_call_in_region/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(4) + !DEF: /function_call_in_region/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(4) !REF: /function_call_in_region/foo - !DEF: /function_call_in_region/OtherConstruct1/b (OmpShared) HostAssoc REAL(4) + !DEF: /function_call_in_region/OtherConstruct1/b (OmpShared, OmpExplicit) HostAssoc REAL(4) a = foo(b) !$omp end parallel !REF: /function_call_in_region/a diff --git a/flang/test/Semantics/OpenMP/symbol08.f90 b/flang/test/Semantics/OpenMP/symbol08.f90 index 80ae1c6d2242b..545bccc86b068 100644 --- a/flang/test/Semantics/OpenMP/symbol08.f90 +++ b/flang/test/Semantics/OpenMP/symbol08.f90 @@ -28,19 +28,19 @@ subroutine test_do !DEF: /test_do/k ObjectEntity INTEGER(4) integer i, j, k !$omp parallel - !DEF: /test_do/OtherConstruct1/i HostAssoc INTEGER(4) + !DEF: /test_do/OtherConstruct1/i (OmpShared) HostAssoc INTEGER(4) i = 99 !$omp do collapse(2) !DEF: /test_do/OtherConstruct1/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,5 !DEF: /test_do/OtherConstruct1/OtherConstruct1/j (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do j=6,10 - !DEF: /test_do/OtherConstruct1/a HostAssoc REAL(4) + !DEF: /test_do/OtherConstruct1/OtherConstruct1/a HostAssoc REAL(4) a(1,1,1) = 0. !DEF: /test_do/OtherConstruct1/k (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do k=11,15 - !REF: /test_do/OtherConstruct1/a - !REF: /test_do/OtherConstruct1/k + !REF: /test_do/OtherConstruct1/OtherConstruct1/a + !DEF: /test_do/OtherConstruct1/OtherConstruct1/k HostAssoc INTEGER(4) !REF: /test_do/OtherConstruct1/OtherConstruct1/j !REF: /test_do/OtherConstruct1/OtherConstruct1/i a(k,j,i) = 1. 
@@ -65,9 +65,9 @@ subroutine test_pardo do i=1,5 !DEF: /test_pardo/OtherConstruct1/j (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do j=6,10 - !DEF: /test_pardo/OtherConstruct1/a HostAssoc REAL(4) + !DEF: /test_pardo/OtherConstruct1/a (OmpShared) HostAssoc REAL(4) a(1,1,1) = 0. - !DEF: /test_pardo/OtherConstruct1/k (OmpPrivate) HostAssoc INTEGER(4) + !DEF: /test_pardo/OtherConstruct1/k (OmpPrivate, OmpExplicit) HostAssoc INTEGER(4) do k=11,15 !REF: /test_pardo/OtherConstruct1/a !REF: /test_pardo/OtherConstruct1/k @@ -91,7 +91,7 @@ subroutine test_taskloop !$omp taskloop private(j) !DEF: /test_taskloop/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,5 - !DEF: /test_taskloop/OtherConstruct1/j (OmpPrivate) HostAssoc INTEGER(4) + !DEF: /test_taskloop/OtherConstruct1/j (OmpPrivate, OmpExplicit) HostAssoc INTEGER(4) !REF: /test_taskloop/OtherConstruct1/i do j=1,i !DEF: /test_taskloop/OtherConstruct1/a (OmpFirstPrivate, OmpImplicit) HostAssoc REAL(4) @@ -139,15 +139,15 @@ subroutine dotprod (b, c, n, block_size, num_teams, block_threads) do i0=1,n,block_size !$omp parallel do reduction(+: sum) !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/i0 HostAssoc INTEGER(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/i0 (OmpShared) HostAssoc INTEGER(4) !DEF: /dotprod/min ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/block_size HostAssoc INTEGER(4) - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/n HostAssoc INTEGER(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/block_size (OmpShared) HostAssoc INTEGER(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/n (OmpShared) 
HostAssoc INTEGER(4) do i=i0,min(i0+block_size, n) - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/sum (OmpReduction) HostAssoc REAL(4) - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/b HostAssoc REAL(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/sum (OmpReduction, OmpExplicit) HostAssoc REAL(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/b (OmpShared) HostAssoc REAL(4) !REF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/i - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/c HostAssoc REAL(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/c (OmpShared) HostAssoc REAL(4) sum = sum+b(i)*c(i) end do end do @@ -175,7 +175,7 @@ subroutine test_simd do j=6,10 !DEF: /test_simd/OtherConstruct1/k (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do k=11,15 - !DEF: /test_simd/OtherConstruct1/a HostAssoc REAL(4) + !DEF: /test_simd/OtherConstruct1/a (OmpShared) HostAssoc REAL(4) !REF: /test_simd/OtherConstruct1/k !REF: /test_simd/OtherConstruct1/j !REF: /test_simd/OtherConstruct1/i @@ -202,7 +202,7 @@ subroutine test_simd_multi do j=6,10 !DEF: /test_simd_multi/OtherConstruct1/k (OmpLastPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do k=11,15 - !DEF: /test_simd_multi/OtherConstruct1/a HostAssoc REAL(4) + !DEF: /test_simd_multi/OtherConstruct1/a (OmpShared) HostAssoc REAL(4) !REF: /test_simd_multi/OtherConstruct1/k !REF: /test_simd_multi/OtherConstruct1/j !REF: /test_simd_multi/OtherConstruct1/i @@ -224,11 +224,11 @@ subroutine test_seq_loop !REF: /test_seq_loop/j j = -1 !$omp parallel - !DEF: /test_seq_loop/OtherConstruct1/i HostAssoc INTEGER(4) - !DEF: /test_seq_loop/OtherConstruct1/j HostAssoc INTEGER(4) + !DEF: /test_seq_loop/OtherConstruct1/i (OmpShared) HostAssoc INTEGER(4) + !DEF: /test_seq_loop/OtherConstruct1/j (OmpShared) HostAssoc 
INTEGER(4) print *, i, j !$omp parallel - !DEF: /test_seq_loop/OtherConstruct1/OtherConstruct1/i HostAssoc INTEGER(4) + !DEF: /test_seq_loop/OtherConstruct1/OtherConstruct1/i (OmpShared) HostAssoc INTEGER(4) !DEF: /test_seq_loop/OtherConstruct1/OtherConstruct1/j (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) print *, i, j !$omp do diff --git a/flang/test/Semantics/OpenMP/symbol09.f90 b/flang/test/Semantics/OpenMP/symbol09.f90 index a375942ebb1d9..86b7305411347 100644 --- a/flang/test/Semantics/OpenMP/symbol09.f90 +++ b/flang/test/Semantics/OpenMP/symbol09.f90 @@ -21,9 +21,9 @@ subroutine function_call_in_region !DEF: /function_call_in_region/b ObjectEntity REAL(4) real :: b = 5. !$omp parallel default(none) private(a) shared(b) - !DEF: /function_call_in_region/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(4) + !DEF: /function_call_in_region/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(4) !REF: /function_call_in_region/foo - !DEF: /function_call_in_region/OtherConstruct1/b (OmpShared) HostAssoc REAL(4) + !DEF: /function_call_in_region/OtherConstruct1/b (OmpShared, OmpExplicit) HostAssoc REAL(4) a = foo(b) !$omp end parallel !REF: /function_call_in_region/a From flang-commits at lists.llvm.org Fri May 30 07:14:13 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 07:14:13 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Explicitly set Shared DSA in symbols (PR #142154) In-Reply-To: Message-ID: <6839bd35.170a0220.2e4f54.0f03@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir Author: Leandro Lupori (luporl)
Changes

Before this change, OmpShared was not always set in shared symbols. Instead, the absence of private flags was interpreted as shared DSA. The problem was that symbols with no flags, with only a host association, could also mean "has same DSA as in the enclosing context". Now shared symbols behave the same as private and can be treated the same way.

Because of the host association symbols with no flags mentioned above, it was also incorrect to simply test the flags of a given symbol to find out if it was private or shared. The function GetSymbolDSA() was added to fix this. It would be better to avoid the need for these special symbols, but this would require changes to how symbols are collected in lowering.

Besides that, some semantic checks need to know if a DSA clause was used or not. To avoid confusing implicit symbols with DSA clauses, a new flag was added: OmpExplicit. It is now set for all symbols with explicitly determined data-sharing attributes.

With the changes above, AddToContextObjectWithDSA() and the symbol to DSA map could probably be removed and the DSA could be obtained directly from the symbol, but this was not attempted.

Some debug messages were also added, with the "omp" DEBUG_TYPE, to make it easier to debug the creation of implicit symbols and to visualize all associations of a given symbol.
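As a rough illustration of the lookup described above (not the code from the patch), the sketch below models the idea behind GetSymbolDSA() with toy types: a symbol that carries no DSA flags of its own takes the DSA of the symbol it is host-associated with. `ToySymbol`, the `Flags` bitmask and `ToyGetSymbolDSA` are hypothetical stand-ins invented for this example; the real function operates on `Fortran::semantics::Symbol` and `HostAssocDetails`.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: a plain bitmask instead of flang's Symbol::Flags.
using Flags = std::uint32_t;

constexpr Flags OmpPrivate = 1u << 0;
constexpr Flags OmpFirstPrivate = 1u << 1;
constexpr Flags OmpLastPrivate = 1u << 2;
constexpr Flags OmpShared = 1u << 3;
constexpr Flags OmpLinear = 1u << 4;
constexpr Flags OmpReduction = 1u << 5;
constexpr Flags OmpExplicit = 1u << 6; // not a DSA, only records its origin

// The six flags treated as data-sharing attributes in this sketch.
constexpr Flags kDsaMask = OmpPrivate | OmpFirstPrivate | OmpLastPrivate |
    OmpShared | OmpLinear | OmpReduction;

// Hypothetical stand-in for a symbol with an optional host association.
struct ToySymbol {
  Flags flags{0};
  const ToySymbol *host{nullptr}; // enclosing-scope symbol, if any
};

// Return the symbol's own DSA flags; if it has none, fall back to the
// host-associated symbol, and so on up the chain.
Flags ToyGetSymbolDSA(const ToySymbol &symbol) {
  Flags dsa = symbol.flags & kDsaMask;
  if (dsa != 0) {
    return dsa;
  }
  if (symbol.host != nullptr) {
    return ToyGetSymbolDSA(*symbol.host);
  }
  return 0;
}

// A flagless association symbol nested inside a private one is private too.
bool ToyNestedSymbolIsPrivate() {
  ToySymbol outer{OmpPrivate | OmpExplicit, nullptr};
  ToySymbol inner{0, &outer};
  return ToyGetSymbolDSA(inner) == OmpPrivate;
}
```

Testing `inner.flags` directly would report no DSA at all, which is the situation the description calls incorrect; following the host association gives the private answer instead.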
Fixes #130533

---

Patch is 55.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/142154.diff

26 Files Affected:

- (added) flang/include/flang/Semantics/openmp-dsa.h (+20)
- (modified) flang/include/flang/Semantics/symbol.h (+2-2)
- (modified) flang/lib/Lower/Bridge.cpp (+3-1)
- (modified) flang/lib/Semantics/CMakeLists.txt (+1)
- (added) flang/lib/Semantics/openmp-dsa.cpp (+29)
- (modified) flang/lib/Semantics/resolve-directives.cpp (+217-84)
- (modified) flang/test/Semantics/OpenMP/common-block.f90 (+3-3)
- (modified) flang/test/Semantics/OpenMP/copyprivate03.f90 (+12)
- (modified) flang/test/Semantics/OpenMP/default-clause.f90 (+3-3)
- (modified) flang/test/Semantics/OpenMP/do05-positivecase.f90 (+3-3)
- (modified) flang/test/Semantics/OpenMP/do20.f90 (+1-1)
- (modified) flang/test/Semantics/OpenMP/forall.f90 (+2-2)
- (modified) flang/test/Semantics/OpenMP/implicit-dsa.f90 (+24-11)
- (modified) flang/test/Semantics/OpenMP/reduction08.f90 (+10-10)
- (modified) flang/test/Semantics/OpenMP/reduction09.f90 (+7-7)
- (modified) flang/test/Semantics/OpenMP/reduction11.f90 (+1-1)
- (modified) flang/test/Semantics/OpenMP/scan2.f90 (+2-2)
- (modified) flang/test/Semantics/OpenMP/symbol01.f90 (+6-6)
- (modified) flang/test/Semantics/OpenMP/symbol02.f90 (+4-4)
- (modified) flang/test/Semantics/OpenMP/symbol03.f90 (+4-4)
- (modified) flang/test/Semantics/OpenMP/symbol04.f90 (+2-2)
- (modified) flang/test/Semantics/OpenMP/symbol05.f90 (+1-1)
- (modified) flang/test/Semantics/OpenMP/symbol06.f90 (+1-1)
- (modified) flang/test/Semantics/OpenMP/symbol07.f90 (+2-2)
- (modified) flang/test/Semantics/OpenMP/symbol08.f90 (+18-18)
- (modified) flang/test/Semantics/OpenMP/symbol09.f90 (+2-2)

``````````diff
diff --git a/flang/include/flang/Semantics/openmp-dsa.h b/flang/include/flang/Semantics/openmp-dsa.h
new file mode 100644
index 0000000000000..4b94a679f29ef
--- /dev/null
+++ b/flang/include/flang/Semantics/openmp-dsa.h
@@ -0,0 +1,20 @@
+//===-- include/flang/Semantics/openmp-dsa.h --------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef FORTRAN_SEMANTICS_OPENMP_DSA_H_
+#define FORTRAN_SEMANTICS_OPENMP_DSA_H_
+
+#include "flang/Semantics/symbol.h"
+
+namespace Fortran::semantics {
+
+Symbol::Flags GetSymbolDSA(const Symbol &symbol);
+
+} // namespace Fortran::semantics
+
+#endif // FORTRAN_SEMANTICS_OPENMP_DSA_H_
diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h
index 4cded64d170cd..59920e08cc926 100644
--- a/flang/include/flang/Semantics/symbol.h
+++ b/flang/include/flang/Semantics/symbol.h
@@ -785,8 +785,8 @@ class Symbol {
       OmpAllocate, OmpDeclarativeAllocateDirective,
       OmpExecutableAllocateDirective, OmpDeclareSimd, OmpDeclareTarget,
       OmpThreadprivate, OmpDeclareReduction, OmpFlushed, OmpCriticalLock,
-      OmpIfSpecified, OmpNone, OmpPreDetermined, OmpImplicit, OmpDependObject,
-      OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction);
+      OmpIfSpecified, OmpNone, OmpPreDetermined, OmpExplicit, OmpImplicit,
+      OmpDependObject, OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction);
   using Flags = common::EnumSet;
 
   const Scope &owner() const { return *owner_; }
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index c9e91cf3e8042..86d5e0d37bc38 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -58,6 +58,7 @@
 #include "flang/Optimizer/Transforms/Passes.h"
 #include "flang/Parser/parse-tree.h"
 #include "flang/Runtime/iostat-consts.h"
+#include "flang/Semantics/openmp-dsa.h"
 #include "flang/Semantics/runtime-type-info.h"
 #include "flang/Semantics/symbol.h"
 #include "flang/Semantics/tools.h"
@@ -1385,7 +1386,8 @@ class FirConverter : public
Fortran::lower::AbstractConverter { if (isUnordered || sym.has() || sym.has()) { if (!shallowLookupSymbol(sym) && - !sym.test(Fortran::semantics::Symbol::Flag::OmpShared)) { + !GetSymbolDSA(sym).test( + Fortran::semantics::Symbol::Flag::OmpShared)) { // Do concurrent loop variables are not mapped yet since they are local // to the Do concurrent scope (same for OpenMP loops). mlir::OpBuilder::InsertPoint insPt = builder->saveInsertionPoint(); diff --git a/flang/lib/Semantics/CMakeLists.txt b/flang/lib/Semantics/CMakeLists.txt index bd8cc47365f06..18c89587843a9 100644 --- a/flang/lib/Semantics/CMakeLists.txt +++ b/flang/lib/Semantics/CMakeLists.txt @@ -32,6 +32,7 @@ add_flang_library(FortranSemantics dump-expr.cpp expression.cpp mod-file.cpp + openmp-dsa.cpp openmp-modifiers.cpp pointer-assignment.cpp program-tree.cpp diff --git a/flang/lib/Semantics/openmp-dsa.cpp b/flang/lib/Semantics/openmp-dsa.cpp new file mode 100644 index 0000000000000..48aa36febe5c5 --- /dev/null +++ b/flang/lib/Semantics/openmp-dsa.cpp @@ -0,0 +1,29 @@ +//===-- flang/lib/Semantics/openmp-dsa.cpp ----------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Semantics/openmp-dsa.h" + +namespace Fortran::semantics { + +Symbol::Flags GetSymbolDSA(const Symbol &symbol) { + Symbol::Flags dsaFlags{Symbol::Flag::OmpPrivate, + Symbol::Flag::OmpFirstPrivate, Symbol::Flag::OmpLastPrivate, + Symbol::Flag::OmpShared, Symbol::Flag::OmpLinear, + Symbol::Flag::OmpReduction}; + Symbol::Flags dsa{symbol.flags() & dsaFlags}; + if (dsa.any()) { + return dsa; + } + // If no DSA are set use those from the host associated symbol, if any. 
+ if (const auto *details{symbol.detailsIf()}) { + return GetSymbolDSA(details->symbol()); + } + return {}; +} + +} // namespace Fortran::semantics diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 9fa7bc8964854..e604bca4213c8 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -19,9 +19,11 @@ #include "flang/Parser/parse-tree.h" #include "flang/Parser/tools.h" #include "flang/Semantics/expression.h" +#include "flang/Semantics/openmp-dsa.h" #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" +#include "llvm/Support/Debug.h" #include #include #include @@ -111,10 +113,9 @@ template class DirectiveAttributeVisitor { const parser::Name *GetLoopIndex(const parser::DoConstruct &); const parser::DoConstruct *GetDoConstructIf( const parser::ExecutionPartConstruct &); - Symbol *DeclareNewPrivateAccessEntity(const Symbol &, Symbol::Flag, Scope &); - Symbol *DeclarePrivateAccessEntity( - const parser::Name &, Symbol::Flag, Scope &); - Symbol *DeclarePrivateAccessEntity(Symbol &, Symbol::Flag, Scope &); + Symbol *DeclareNewAccessEntity(const Symbol &, Symbol::Flag, Scope &); + Symbol *DeclareAccessEntity(const parser::Name &, Symbol::Flag, Scope &); + Symbol *DeclareAccessEntity(Symbol &, Symbol::Flag, Scope &); Symbol *DeclareOrMarkOtherAccessEntity(const parser::Name &, Symbol::Flag); UnorderedSymbolSet dataSharingAttributeObjects_; // on one directive @@ -749,10 +750,11 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { Symbol::Flags ompFlagsRequireNewSymbol{Symbol::Flag::OmpPrivate, Symbol::Flag::OmpLinear, Symbol::Flag::OmpFirstPrivate, - Symbol::Flag::OmpLastPrivate, Symbol::Flag::OmpReduction, - Symbol::Flag::OmpCriticalLock, Symbol::Flag::OmpCopyIn, - Symbol::Flag::OmpUseDevicePtr, Symbol::Flag::OmpUseDeviceAddr, - Symbol::Flag::OmpIsDevicePtr, Symbol::Flag::OmpHasDeviceAddr}; + 
Symbol::Flag::OmpLastPrivate, Symbol::Flag::OmpShared, + Symbol::Flag::OmpReduction, Symbol::Flag::OmpCriticalLock, + Symbol::Flag::OmpCopyIn, Symbol::Flag::OmpUseDevicePtr, + Symbol::Flag::OmpUseDeviceAddr, Symbol::Flag::OmpIsDevicePtr, + Symbol::Flag::OmpHasDeviceAddr}; Symbol::Flags ompFlagsRequireMark{Symbol::Flag::OmpThreadprivate, Symbol::Flag::OmpDeclareTarget, Symbol::Flag::OmpExclusiveScan, @@ -829,8 +831,24 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { void IssueNonConformanceWarning( llvm::omp::Directive D, parser::CharBlock source); - void CreateImplicitSymbols( - const Symbol *symbol, std::optional setFlag = std::nullopt); + void CreateImplicitSymbols(const Symbol *symbol); + + void AddToContextObjectWithExplicitDSA(Symbol &symbol, Symbol::Flag flag) { + AddToContextObjectWithDSA(symbol, flag); + if (dataSharingAttributeFlags.test(flag)) { + symbol.set(Symbol::Flag::OmpExplicit); + } + } + + // Clear any previous data-sharing attribute flags and set the new ones. + // Needed when setting PreDetermined DSAs, that take precedence over + // Implicit ones. 
+ void SetSymbolDSA(Symbol &symbol, Symbol::Flags flags) { + symbol.flags() &= ~(dataSharingAttributeFlags | + Symbol::Flags{Symbol::Flag::OmpExplicit, Symbol::Flag::OmpImplicit, + Symbol::Flag::OmpPreDetermined}); + symbol.flags() |= flags; + } }; template @@ -867,7 +885,7 @@ const parser::DoConstruct *DirectiveAttributeVisitor::GetDoConstructIf( } template -Symbol *DirectiveAttributeVisitor::DeclareNewPrivateAccessEntity( +Symbol *DirectiveAttributeVisitor::DeclareNewAccessEntity( const Symbol &object, Symbol::Flag flag, Scope &scope) { assert(object.owner() != currScope()); auto &symbol{MakeAssocSymbol(object.name(), object, scope)}; @@ -880,20 +898,20 @@ Symbol *DirectiveAttributeVisitor::DeclareNewPrivateAccessEntity( } template -Symbol *DirectiveAttributeVisitor::DeclarePrivateAccessEntity( +Symbol *DirectiveAttributeVisitor::DeclareAccessEntity( const parser::Name &name, Symbol::Flag flag, Scope &scope) { if (!name.symbol) { return nullptr; // not resolved by Name Resolution step, do nothing } - name.symbol = DeclarePrivateAccessEntity(*name.symbol, flag, scope); + name.symbol = DeclareAccessEntity(*name.symbol, flag, scope); return name.symbol; } template -Symbol *DirectiveAttributeVisitor::DeclarePrivateAccessEntity( +Symbol *DirectiveAttributeVisitor::DeclareAccessEntity( Symbol &object, Symbol::Flag flag, Scope &scope) { if (object.owner() != currScope()) { - return DeclareNewPrivateAccessEntity(object, flag, scope); + return DeclareNewAccessEntity(object, flag, scope); } else { object.set(flag); return &object; @@ -1600,6 +1618,20 @@ void AccAttributeVisitor::CheckMultipleAppearances( } } +#ifndef NDEBUG + +#define DEBUG_TYPE "omp" + +static llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, const Symbol::Flags &flags); + +namespace dbg { +static void DumpAssocSymbols(llvm::raw_ostream &os, const Symbol &sym); +static std::string ScopeSourcePos(const Fortran::semantics::Scope &scope); +} // namespace dbg + +#endif + bool 
OmpAttributeVisitor::Pre(const parser::OpenMPBlockConstruct &x) { const auto &beginBlockDir{std::get(x.t)}; const auto &beginDir{std::get(beginBlockDir.t)}; @@ -1792,12 +1824,12 @@ void OmpAttributeVisitor::ResolveSeqLoopIndexInParallelOrTaskConstruct( } } } - // If this symbol is already Private or Firstprivate in the enclosing - // OpenMP parallel or task then there is nothing to do here. + // If this symbol already has an explicit data-sharing attribute in the + // enclosing OpenMP parallel or task then there is nothing to do here. if (auto *symbol{targetIt->scope.FindSymbol(iv.source)}) { if (symbol->owner() == targetIt->scope) { - if (symbol->test(Symbol::Flag::OmpPrivate) || - symbol->test(Symbol::Flag::OmpFirstPrivate)) { + if (symbol->test(Symbol::Flag::OmpExplicit) && + (symbol->flags() & dataSharingAttributeFlags).any()) { return; } } @@ -1806,7 +1838,8 @@ void OmpAttributeVisitor::ResolveSeqLoopIndexInParallelOrTaskConstruct( // parallel or task if (auto *symbol{ResolveOmp(iv, Symbol::Flag::OmpPrivate, targetIt->scope)}) { targetIt++; - symbol->set(Symbol::Flag::OmpPreDetermined); + SetSymbolDSA( + *symbol, {Symbol::Flag::OmpPreDetermined, Symbol::Flag::OmpPrivate}); iv.symbol = symbol; // adjust the symbol within region for (auto it{dirContext_.rbegin()}; it != targetIt; ++it) { AddToContextObjectWithDSA(*symbol, Symbol::Flag::OmpPrivate, *it); @@ -1918,7 +1951,7 @@ void OmpAttributeVisitor::PrivatizeAssociatedLoopIndexAndCheckLoopLevel( const parser::Name *iv{GetLoopIndex(*loop)}; if (iv) { if (auto *symbol{ResolveOmp(*iv, ivDSA, currScope())}) { - symbol->set(Symbol::Flag::OmpPreDetermined); + SetSymbolDSA(*symbol, {Symbol::Flag::OmpPreDetermined, ivDSA}); iv->symbol = symbol; // adjust the symbol within region AddToContextObjectWithDSA(*symbol, ivDSA); } @@ -2178,42 +2211,48 @@ static bool IsPrivatizable(const Symbol *sym) { misc->kind() != MiscDetails::Kind::ConstructName)); } -void OmpAttributeVisitor::CreateImplicitSymbols( - const Symbol *symbol, 
std::optional setFlag) { +void OmpAttributeVisitor::CreateImplicitSymbols(const Symbol *symbol) { if (!IsPrivatizable(symbol)) { return; } + LLVM_DEBUG(llvm::dbgs() << "CreateImplicitSymbols: " << *symbol << '\n'); + // Implicitly determined DSAs // OMP 5.2 5.1.1 - Variables Referenced in a Construct Symbol *lastDeclSymbol = nullptr; - std::optional prevDSA; + Symbol::Flags prevDSA; for (int dirDepth{0}; dirDepth < (int)dirContext_.size(); ++dirDepth) { DirContext &dirContext = dirContext_[dirDepth]; - std::optional dsa; + Symbol::Flags dsa; - for (auto symMap : dirContext.objectWithDSA) { - // if the `symbol` already has a data-sharing attribute - if (symMap.first->name() == symbol->name()) { - dsa = symMap.second; - break; + Scope &scope{context_.FindScope(dirContext.directiveSource)}; + auto it{scope.find(symbol->name())}; + if (it != scope.end()) { + // There is already a symbol in the current scope, use its DSA. + dsa = GetSymbolDSA(*it->second); + } else { + for (auto symMap : dirContext.objectWithDSA) { + if (symMap.first->name() == symbol->name()) { + // `symbol` already has a data-sharing attribute in the current + // context, use it. + dsa.set(symMap.second); + break; + } } } // When handling each implicit rule for a given symbol, one of the - // following 3 actions may be taken: - // 1. Declare a new private symbol. - // 2. Create a new association symbol with no flags, that will represent - // a shared symbol in the current scope. Note that symbols without - // any private flags are considered as shared. - // 3. Use the last declared private symbol, by inserting a new symbol - // in the scope being processed, associated with it. - // If no private symbol was declared previously, then no association - // is needed and the symbol from the enclosing scope will be - // inherited by the current one. + // following actions may be taken: + // 1. Declare a new private or shared symbol. + // 2. 
Use the last declared symbol, by inserting a new symbol in the + // scope being processed, associated with it. + // If no symbol was declared previously, then no association is needed + // and the symbol from the enclosing scope will be inherited by the + // current one. // // Because of how symbols are collected in lowering, not inserting a new - // symbol in the last case could lead to the conclusion that a symbol + // symbol in the second case could lead to the conclusion that a symbol // from an enclosing construct was declared in the current construct, // which would result in wrong privatization code being generated. // Consider the following example: @@ -2231,46 +2270,71 @@ void OmpAttributeVisitor::CreateImplicitSymbols( // it would have the private flag set. // This would make x appear to be defined in p2, causing it to be // privatized in p2 and its privatization in p1 to be skipped. - auto makePrivateSymbol = [&](Symbol::Flag flag) { + auto makeSymbol = [&](Symbol::Flags flags) { const Symbol *hostSymbol = lastDeclSymbol ? lastDeclSymbol : &symbol->GetUltimate(); - lastDeclSymbol = DeclareNewPrivateAccessEntity( + assert(flags.LeastElement()); + Symbol::Flag flag = *flags.LeastElement(); + lastDeclSymbol = DeclareNewAccessEntity( *hostSymbol, flag, context_.FindScope(dirContext.directiveSource)); - if (setFlag) { - lastDeclSymbol->set(*setFlag); - } + lastDeclSymbol->flags() |= flags; return lastDeclSymbol; }; - auto makeSharedSymbol = [&](std::optional flag = {}) { - const Symbol *hostSymbol = - lastDeclSymbol ? lastDeclSymbol : &symbol->GetUltimate(); - Symbol &assocSymbol = MakeAssocSymbol(symbol->name(), *hostSymbol, - context_.FindScope(dirContext.directiveSource)); - if (flag) { - assocSymbol.set(*flag); - } - }; auto useLastDeclSymbol = [&]() { if (lastDeclSymbol) { - makeSharedSymbol(); + const Symbol *hostSymbol = + lastDeclSymbol ? 
lastDeclSymbol : &symbol->GetUltimate(); + MakeAssocSymbol(symbol->name(), *hostSymbol, + context_.FindScope(dirContext.directiveSource)); } }; +#ifndef NDEBUG + auto printImplicitRule = [&](const char *id) { + LLVM_DEBUG(llvm::dbgs() << "\t" << id << ": dsa: " << dsa << '\n'); + LLVM_DEBUG( + llvm::dbgs() << "\t\tScope: " << dbg::ScopeSourcePos(scope) << '\n'); + }; +#define PRINT_IMPLICIT_RULE(id) printImplicitRule(id) +#else +#define PRINT_IMPLICIT_RULE(id) +#endif + bool taskGenDir = llvm::omp::taskGeneratingSet.test(dirContext.directive); bool targetDir = llvm::omp::allTargetSet.test(dirContext.directive); bool parallelDir = llvm::omp::allParallelSet.test(dirContext.directive); bool teamsDir = llvm::omp::allTeamsSet.test(dirContext.directive); - if (dsa.has_value()) { - if (dsa.value() == Symbol::Flag::OmpShared && - (parallelDir || taskGenDir || teamsDir)) { - makeSharedSymbol(Symbol::Flag::OmpShared); + if (dsa.any()) { + if (parallelDir || taskGenDir || teamsDir) { + Symbol *prevDeclSymbol{lastDeclSymbol}; + // NOTE As `dsa` will match that of the symbol in the current scope + // (if any), we won't override the DSA of any existing symbol. + if ((dsa & dataSharingAttributeFlags).any()) { + makeSymbol(dsa); + } + // Fix host association of explicit symbols, as they can be created + // before implicit ones in enclosing scope. + if (prevDeclSymbol && prevDeclSymbol != lastDeclSymbol && + lastDeclSymbol->test(Symbol::Flag::OmpExplicit)) { + const auto *hostAssoc{lastDeclSymbol->detailsIf()}; + if (hostAssoc && hostAssoc->symbol() != *prevDeclSymbol) { + lastDeclSymbol->set_details(HostAssocDetails{*prevDeclSymbol}); + } + } } - // Private symbols will have been declared already. prevDSA = dsa; + PRINT_IMPLICIT_RULE("0) already has DSA"); continue; } + // NOTE Because of how lowering uses OmpImplicit flag, we can only set it + // for symbols with private DSA. 
+ // Also, as the default clause is handled separately in lowering, + // don't mark its symbols with OmpImplicit either. + // Ideally, lowering should be changed and all implicit symbols + // should be marked with OmpImplicit. + if (dirContext.defaultDSA == Symbol::Flag::OmpPrivate || dirContext.defaultDSA == Symbol::Flag::OmpFirstPrivate || dirContext.defaultDSA == Symbol::Flag::OmpShared) { @@ -2279,33 +2343,34 @@ void OmpAttributeVisitor::CreateImplicitSymbols( if (!parallelDir && !taskGenDir && !teamsDir) { return; } - if (dirContext.defaultDSA != Symbol::Flag::OmpShared) { - makePrivateSymbol(dirContext.defaultDSA); - } else { - makeSharedSymbol(); - } - dsa = dirContext.defaultDSA; + dsa = {dirContext.defaultDSA}; + makeSymbol(dsa); + PRINT_IMPLICIT_RULE("1) default"); } else if (parallelDir) { // 2) parallel -> shared - makeSharedSymbol(); - dsa = Symbol::Flag::OmpShared; + dsa = {Symbol::Flag::OmpShared}; + makeSymbol(dsa); + PRINT_IMPLICIT_RULE("2) parallel"); } else if (!taskGenDir && !targetDir) { // 3) enclosing context - useLastDeclSymbol(); dsa = prevDSA; + useLastDeclSymbol(); + PRINT_IMPLICIT_RULE("3) enclosing context"); } else if (targetDir) { // TODO 4) not mapped target variable -> firstprivate dsa = prevDSA; } else if (taskGenDir) { // TODO 5) dummy arg in orphaned taskgen construct -> firstprivate - if (prevDSA == Symbol::Flag::OmpShared) { + if (prevDSA.test(Symbol::Flag::OmpShared)) { // 6) shared in enclosing context -> shared - makeSharedSymbol()... [truncated] ``````````
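The rule cascade in the (truncated) `CreateImplicitSymbols` hunk above can be summarized with a small standalone sketch. Everything in it is a hypothetical illustration: `Dsa`, `DirKind`, and `pickImplicitDsa` are not flang names, and the real code operates on `Symbol::Flags` and scopes rather than a plain enum. Only the rules visible in the diff are modeled. The branch for task-generating constructs whose enclosing DSA is not shared is cut off in the message, so the sketch returns `std::nullopt` there instead of guessing. It also leaves out the early `return` the patch performs under a default clause when the directive is neither parallel, task-generating, nor teams.

```cpp
#include <optional>

// Hypothetical stand-ins for illustration only; these are not flang types.
enum class Dsa { None, Private, FirstPrivate, Shared };

struct DirKind {
  bool parallel = false;
  bool taskGen = false;
  bool target = false;
  bool teams = false;
};

// Returns the DSA that the rules visible in the diff would pick at one
// directive nesting level. std::nullopt marks the case that the truncated
// diff does not show.
std::optional<Dsa> pickImplicitDsa(Dsa existing, Dsa defaultDsa, Dsa prev,
                                   const DirKind &dir) {
  if (existing != Dsa::None)
    return existing; // 0) symbol already has a DSA at this level
  if (defaultDsa != Dsa::None)
    return defaultDsa; // 1) default clause (early-return case omitted)
  if (dir.parallel)
    return Dsa::Shared; // 2) parallel -> shared
  if (!dir.taskGen && !dir.target)
    return prev; // 3) inherit from the enclosing context
  if (dir.target)
    return prev; // 4) marked TODO in the patch; keeps the previous DSA
  if (prev == Dsa::Shared)
    return Dsa::Shared; // 6) shared in enclosing context -> shared
  return std::nullopt; // remaining task rules are truncated in the message
}
```

With no explicit DSA and no default clause, a `parallel` level yields `Dsa::Shared` under this sketch, while a directive that is neither task-generating nor a target construct inherits whatever the enclosing level produced.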
https://github.com/llvm/llvm-project/pull/142154

From flang-commits at lists.llvm.org Fri May 30 07:25:19 2025
From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits)
Date: Fri, 30 May 2025 07:25:19 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler" (PR #142159)
Message-ID:

https://github.com/tarunprabhu created https://github.com/llvm/llvm-project/pull/142159

Reverts llvm/llvm-project#136098

>From b9503fe262c416111ee77be30767a791cf750fb8 Mon Sep 17 00:00:00 2001
From: Tarun Prabhu
Date: Fri, 30 May 2025 08:22:15 -0600
Subject: [PATCH] Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang comp…"
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This reverts commit d27a210a77af63568db9f829702b4b2c98473a46.
---
 clang/include/clang/Basic/CodeGenOptions.def  |  6 +--
 clang/include/clang/Basic/CodeGenOptions.h    | 22 ++++------
 clang/include/clang/Basic/ProfileList.h       |  9 ++--
 clang/include/clang/Driver/Options.td         |  4 +-
 clang/lib/Basic/ProfileList.cpp               | 20 ++++-----
 clang/lib/CodeGen/BackendUtil.cpp             | 42 +++++++++++--------
 clang/lib/CodeGen/CodeGenAction.cpp           |  4 +-
 clang/lib/CodeGen/CodeGenFunction.cpp         |  3 +-
 clang/lib/CodeGen/CodeGenModule.cpp           |  2 +-
 clang/lib/Driver/ToolChains/Flang.cpp         |  4 --
 clang/lib/Frontend/CompilerInvocation.cpp     |  6 +--
 .../include/flang/Frontend/CodeGenOptions.def |  7 ----
 flang/include/flang/Frontend/CodeGenOptions.h | 38 -----------------
 flang/lib/Frontend/CompilerInvocation.cpp     | 10 -----
 flang/lib/Frontend/FrontendActions.cpp        | 26 ------------
 flang/test/Driver/flang-f-opts.f90            |  5 ---
 .../Inputs/gcc-flag-compatibility_IR.proftext | 18 --------
 .../gcc-flag-compatibility_IR_entry.proftext  | 11 -----
 flang/test/Profile/gcc-flag-compatibility.f90 | 32 --------------
 .../llvm/Frontend/Driver/CodeGenOptions.h     | 12 ------
llvm/lib/Frontend/Driver/CodeGenOptions.cpp | 13 ------ 21 files changed, 58 insertions(+), 236 deletions(-) delete mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext delete mode 100644 flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext delete mode 100644 flang/test/Profile/gcc-flag-compatibility.f90 diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index 11dad53a52efe..aad4e107cbeb3 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,11 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. - -ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 4, llvm::driver::ProfileInstrKind::ProfileNone) - +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 4, ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index bffbd00b1bd72..278803f7bb960 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -518,41 +518,35 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. 
bool hasProfileClangInstr() const { - return getProfileInstr() == - llvm::driver::ProfileInstrKind::ProfileClangInstr; + return getProfileInstr() == ProfileClangInstr; } /// Check if IR level profile instrumentation is on. bool hasProfileIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; + return getProfileInstr() == ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == - llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + return getProfileInstr() == ProfileCSIRInstr; } /// Check if any form of instrumentation is on. - bool hasProfileInstr() const { - return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; - } + bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } /// Check if Clang profile use is on. bool hasProfileClangUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; + return getProfileUse() == ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || - getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; - } + bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } /// Check if type and variable info should be emitted. 
bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index 5338ef3992ade..b4217e49c18a3 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,16 +49,17 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; + ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - llvm::driver::ProfileInstrKind Kind) const; + CodeGenOptions::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - llvm::driver::ProfileInstrKind Kind) const; + CodeGenOptions::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, + CodeGenOptions::ProfileInstrKind Kind) const; }; } // namespace clang diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 5c79c66b55eb3..5ca31c253ed8f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1772,7 +1772,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFFFFFE">; def fprofile_generate : Flag<["-"], "fprofile-generate">, - Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, + Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">, Group, Visibility<[ClangOption, CLOption]>, @@ -1789,7 +1789,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group, Visibility<[ClangOption, CLOption]>, Alias; def fprofile_use_EQ : Joined<["-"], 
"fprofile-use=">, Group, - Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, + Visibility<[ClangOption, CLOption]>, MetaVarName<"">, HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. Otherwise, it reads from file .">; def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">, diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index bea65579f396b..2d37014294b92 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -70,24 +70,24 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { +static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { switch (Kind) { - case llvm::driver::ProfileInstrKind::ProfileNone: + case CodeGenOptions::ProfileNone: return ""; - case llvm::driver::ProfileInstrKind::ProfileClangInstr: + case CodeGenOptions::ProfileClangInstr: return "clang"; - case llvm::driver::ProfileInstrKind::ProfileIRInstr: + case CodeGenOptions::ProfileIRInstr: return "llvm"; - case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: + case CodeGenOptions::ProfileCSIRInstr: return "csllvm"; case CodeGenOptions::ProfileIRSampleColdCov: return "sample-coldcov"; } - llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); + llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { +ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if (SCL->inSection(Section, "default", "allow")) @@ -118,7 +118,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - llvm::driver::ProfileInstrKind Kind) const { + 
CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -132,13 +132,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional ProfileList::isLocationExcluded(SourceLocation Loc, - llvm::driver::ProfileInstrKind Kind) const { + CodeGenOptions::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - llvm::driver::ProfileInstrKind Kind) const { + CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 03e10b1138a71..cd5fc48c4a22b 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -123,10 +123,17 @@ namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } +// Default filename used for profile generation. +static std::string getDefaultProfileGenName() { + return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE + ? "default_%m.proflite" + : "default_%m.profraw"; +} + // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? llvm::driver::getDefaultProfileGenName() + ? getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -828,12 +835,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. 
- PGOOpt = PGOOptions( - getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, - PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, - CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, + PGOOptions::IRInstr, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, + CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse @@ -841,32 +848,31 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + PGOOptions::NoCSAction, ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) - PGOOpt = - PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); + PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, + PGOOptions::NoAction, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // 
-fpseudo-probe-for-profiling - PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, - CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = + PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, true); + ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 5493cc92bd8b0..1f5eb427b566f 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && CodeGenOpts.getProfileUse() != - llvm::driver::ProfileInstrKind::ProfileNone) + if (OptRecordFile && + CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 30aec87c909eb..4193f0a1b278f 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -940,8 +940,7 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != - llvm::driver::ProfileInstrKind::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git 
a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 264f1bdee81c6..6d2c705338ecf 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3601,7 +3601,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, // If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index e303631cc1d57..dcc46469df3e9 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -883,10 +883,6 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); - // recognise options: fprofile-generate -fprofile-use= - Args.addAllArgs( - CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ}); - // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ, diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index 11d0dc6b7b6f1..9c33910eff57e 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1499,11 +1499,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). 
if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); + Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); else - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); + Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); } else - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); + Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index ae12aec518108..a697872836569 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,15 +24,8 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. - -/// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) -/// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) - CODEGENOPT(InstrumentFunctions, 1, 0) ///< Set when -finstrument_functions is ///< enabled on the compile step. - CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. 
CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 06203670f97b9..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -151,44 +151,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - /// Name of the profile file to use as output for -fprofile-instr-generate, - /// -fprofile-generate, and -fcs-profile-generate. - std::string InstrProfileOutput; - - /// Name of the profile file to use as input for -fmemory-profile-use. - std::string MemoryProfileUsePath; - - /// Name of the profile file to use as input for -fprofile-instr-use - std::string ProfileInstrumentUsePath; - - /// Name of the profile remapping file to apply to the profile data supplied - /// by -fprofile-sample-use or -fprofile-instr-use. - std::string ProfileRemappingFile; - - /// Check if Clang profile instrumenation is on. - bool hasProfileClangInstr() const { - return getProfileInstr() == llvm::driver::ProfileClangInstr; - } - - /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileIRInstr; - } - - /// Check if CS IR level profile instrumentation is on. - bool hasProfileCSIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileCSIRInstr; - } - /// Check if IR level profile use is on. - bool hasProfileIRUse() const { - return getProfileUse() == llvm::driver::ProfileIRInstr || - getProfileUse() == llvm::driver::ProfileCSIRInstr; - } - /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { - return getProfileUse() == llvm::driver::ProfileCSIRInstr; - } - // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 0571aea8ec801..90a002929eff0 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -30,7 +30,6 @@ #include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/Frontend/Debug/Options.h" -#include "llvm/Frontend/Driver/CodeGenOptions.h" #include "llvm/Option/Arg.h" #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" @@ -453,15 +452,6 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts, opts.IsPIE = 1; } - if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) { - opts.setProfileInstr(llvm::driver::ProfileInstrKind::ProfileIRInstr); - } - - if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) { - opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); - opts.ProfileInstrumentUsePath = A->getValue(); - } - // -mcmodel option. 
if (const llvm::opt::Arg *a = args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) { diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index da8fa518ab3e1..012d0fdfe645f 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -56,12 +56,10 @@ #include "llvm/Passes/PassBuilder.h" #include "llvm/Passes/PassPlugin.h" #include "llvm/Passes/StandardInstrumentations.h" -#include "llvm/ProfileData/InstrProfCorrelator.h" #include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/Error.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/FileSystem.h" -#include "llvm/Support/PGOOptions.h" #include "llvm/Support/Path.h" #include "llvm/Support/SourceMgr.h" #include "llvm/Support/ToolOutputFile.h" @@ -69,7 +67,6 @@ #include "llvm/TargetParser/RISCVISAInfo.h" #include "llvm/TargetParser/RISCVTargetParser.h" #include "llvm/Transforms/IPO/Internalize.h" -#include "llvm/Transforms/Instrumentation/InstrProfiling.h" #include "llvm/Transforms/Utils/ModuleUtils.h" #include #include @@ -921,29 +918,6 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) { llvm::PassInstrumentationCallbacks pic; llvm::PipelineTuningOptions pto; std::optional pgoOpt; - - if (opts.hasProfileIRInstr()) { - // -fprofile-generate. - pgoOpt = llvm::PGOOptions(opts.InstrProfileOutput.empty() - ? llvm::driver::getDefaultProfileGenName() - : opts.InstrProfileOutput, - "", "", opts.MemoryProfileUsePath, nullptr, - llvm::PGOOptions::IRInstr, - llvm::PGOOptions::NoCSAction, - llvm::PGOOptions::ColdFuncOpt::Default, false, - /*PseudoProbeForProfiling=*/false, false); - } else if (opts.hasProfileIRUse()) { - llvm::IntrusiveRefCntPtr VFS = - llvm::vfs::getRealFileSystem(); - // -fprofile-use. - auto CSAction = opts.hasProfileCSIRUse() ? 
llvm::PGOOptions::CSIRUse - : llvm::PGOOptions::NoCSAction; - pgoOpt = llvm::PGOOptions( - opts.ProfileInstrumentUsePath, "", opts.ProfileRemappingFile, - opts.MemoryProfileUsePath, VFS, llvm::PGOOptions::IRUse, CSAction, - llvm::PGOOptions::ColdFuncOpt::Default, false); - } - llvm::StandardInstrumentations si(llvmModule->getContext(), opts.DebugPassManager); si.registerCallbacks(pic, &mam); diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90 index b972b9b7b2a59..4493a519e2010 100644 --- a/flang/test/Driver/flang-f-opts.f90 +++ b/flang/test/Driver/flang-f-opts.f90 @@ -8,8 +8,3 @@ ! CHECK-LABEL: "-fc1" ! CHECK: -ffp-contract=off ! CHECK: -O3 - -! RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s -! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate" -! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s -! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}" diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext deleted file mode 100644 index 2650fb5ebfd35..0000000000000 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext +++ /dev/null @@ -1,18 +0,0 @@ -# IR level Instrumentation Flag -:ir -_QQmain -# Func Hash: -146835646621254984 -# Num Counters: -2 -# Counter Values: -100 -1 - -main -# Func Hash: -742261418966908927 -# Num Counters: -1 -# Counter Values: -1 \ No newline at end of file diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext deleted file mode 100644 index c4a2a26557e80..0000000000000 --- a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext +++ /dev/null @@ -1,11 +0,0 @@ -# IR level Instrumentation Flag -:ir -:entry_first -_QQmain -# Func Hash: -146835646621254984 -# Num Counters: -2 -# Counter Values: -100 -1 \ No newline at 
end of file diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90 deleted file mode 100644 index 4490c45232d28..0000000000000 --- a/flang/test/Profile/gcc-flag-compatibility.f90 +++ /dev/null @@ -1,32 +0,0 @@ -! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two -! flags behave similarly to their GCC counterparts: -! -! -fprofile-generate Generates the profile file ./default.profraw -! -fprofile-use=/file Uses the profile file /file - -! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto -! RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s -! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section -! PROFILE-GEN: @__profd_{{_?}}main = - -! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof -! This uses LLVM IR format profile. -! RUN: rm -rf %t.dir -! RUN: mkdir -p %t.dir/some/path -! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof -! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s -! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof -! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s -! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1} -! 
PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100} - -program main - implicit none - integer :: i - integer :: X = 0 - - do i = 0, 99 - X = X + i - end do - -end program main diff --git a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h index 82f583bc459e6..ee52645f2e51b 100644 --- a/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h +++ b/llvm/include/llvm/Frontend/Driver/CodeGenOptions.h @@ -13,8 +13,6 @@ #ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H #define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H -#include - namespace llvm { class Triple; class TargetLibraryInfoImpl; @@ -48,19 +46,9 @@ enum class VectorLibrary { AMDLIBM // AMD vector math library. }; -enum ProfileInstrKind { - ProfileNone, // Profile instrumentation is turned off. - ProfileClangInstr, // Clang instrumentation to generate execution counts - // to use with PGO. - ProfileIRInstr, // IR level PGO instrumentation in LLVM. - ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. -}; - TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple, VectorLibrary Veclib); -// Default filename used for profile generation. 
-std::string getDefaultProfileGenName();
 } // end namespace llvm::driver
 
 #endif
diff --git a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
index df884908845d2..52080dea93c98 100644
--- a/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
+++ b/llvm/lib/Frontend/Driver/CodeGenOptions.cpp
@@ -8,15 +8,8 @@
 
 #include "llvm/Frontend/Driver/CodeGenOptions.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
-#include "llvm/ProfileData/InstrProfCorrelator.h"
 #include "llvm/TargetParser/Triple.h"
 
-namespace llvm {
-extern llvm::cl::opt DebugInfoCorrelate;
-extern llvm::cl::opt
-    ProfileCorrelate;
-} // namespace llvm
-
 namespace llvm::driver {
 
 TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
@@ -63,10 +56,4 @@ TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
   return TLII;
 }
 
-std::string getDefaultProfileGenName() {
-  return llvm::DebugInfoCorrelate ||
-                 llvm::ProfileCorrelate != InstrProfCorrelator::NONE
-             ? "default_%m.proflite"
-             : "default_%m.profraw";
-}
 } // namespace llvm::driver

From flang-commits at lists.llvm.org Fri May 30 07:25:56 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Fri, 30 May 2025 07:25:56 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler" (PR #142159)
In-Reply-To:
Message-ID: <6839bff4.050a0220.312495.d302@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-clang-driver

Author: Tarun Prabhu (tarunprabhu)
Changes

Reverts llvm/llvm-project#136098

---

Patch is 29.10 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/142159.diff

21 Files Affected:

- (modified) clang/include/clang/Basic/CodeGenOptions.def (+2-4)
- (modified) clang/include/clang/Basic/CodeGenOptions.h (+8-14)
- (modified) clang/include/clang/Basic/ProfileList.h (+5-4)
- (modified) clang/include/clang/Driver/Options.td (+2-2)
- (modified) clang/lib/Basic/ProfileList.cpp (+10-10)
- (modified) clang/lib/CodeGen/BackendUtil.cpp (+24-18)
- (modified) clang/lib/CodeGen/CodeGenAction.cpp (+2-2)
- (modified) clang/lib/CodeGen/CodeGenFunction.cpp (+1-2)
- (modified) clang/lib/CodeGen/CodeGenModule.cpp (+1-1)
- (modified) clang/lib/Driver/ToolChains/Flang.cpp (-4)
- (modified) clang/lib/Frontend/CompilerInvocation.cpp (+3-3)
- (modified) flang/include/flang/Frontend/CodeGenOptions.def (-7)
- (modified) flang/include/flang/Frontend/CodeGenOptions.h (-38)
- (modified) flang/lib/Frontend/CompilerInvocation.cpp (-10)
- (modified) flang/lib/Frontend/FrontendActions.cpp (-26)
- (modified) flang/test/Driver/flang-f-opts.f90 (-5)
- (removed) flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext (-18)
- (removed) flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext (-11)
- (removed) flang/test/Profile/gcc-flag-compatibility.f90 (-32)
- (modified) llvm/include/llvm/Frontend/Driver/CodeGenOptions.h (-12)
- (modified) llvm/lib/Frontend/Driver/CodeGenOptions.cpp (-13)

``````````diff
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 11dad53a52efe..aad4e107cbeb3 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -223,11 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is
 CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous
instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. - -ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 4, llvm::driver::ProfileInstrKind::ProfileNone) - +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 4, ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index bffbd00b1bd72..278803f7bb960 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -518,41 +518,35 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == - llvm::driver::ProfileInstrKind::ProfileClangInstr; + return getProfileInstr() == ProfileClangInstr; } /// Check if IR level profile instrumentation is on. bool hasProfileIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; + return getProfileInstr() == ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == - llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + return getProfileInstr() == ProfileCSIRInstr; } /// Check if any form of instrumentation is on. - bool hasProfileInstr() const { - return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; - } + bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } /// Check if Clang profile use is on. 
bool hasProfileClangUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; + return getProfileUse() == ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || - getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; - } + bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } /// Check if type and variable info should be emitted. bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index 5338ef3992ade..b4217e49c18a3 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,16 +49,17 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; + ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - llvm::driver::ProfileInstrKind Kind) const; + CodeGenOptions::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - llvm::driver::ProfileInstrKind Kind) const; + CodeGenOptions::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, + CodeGenOptions::ProfileInstrKind Kind) const; }; } // namespace clang diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 5c79c66b55eb3..5ca31c253ed8f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1772,7 
+1772,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFFFFFE">; def fprofile_generate : Flag<["-"], "fprofile-generate">, - Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, + Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">, Group, Visibility<[ClangOption, CLOption]>, @@ -1789,7 +1789,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group, Visibility<[ClangOption, CLOption]>, Alias; def fprofile_use_EQ : Joined<["-"], "fprofile-use=">, Group, - Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, + Visibility<[ClangOption, CLOption]>, MetaVarName<"">, HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. 
Otherwise, it reads from file .">; def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">, diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index bea65579f396b..2d37014294b92 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -70,24 +70,24 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { +static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { switch (Kind) { - case llvm::driver::ProfileInstrKind::ProfileNone: + case CodeGenOptions::ProfileNone: return ""; - case llvm::driver::ProfileInstrKind::ProfileClangInstr: + case CodeGenOptions::ProfileClangInstr: return "clang"; - case llvm::driver::ProfileInstrKind::ProfileIRInstr: + case CodeGenOptions::ProfileIRInstr: return "llvm"; - case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: + case CodeGenOptions::ProfileCSIRInstr: return "csllvm"; case CodeGenOptions::ProfileIRSampleColdCov: return "sample-coldcov"; } - llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); + llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { +ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if (SCL->inSection(Section, "default", "allow")) @@ -118,7 +118,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - llvm::driver::ProfileInstrKind Kind) const { + CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -132,13 +132,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional 
ProfileList::isLocationExcluded(SourceLocation Loc, - llvm::driver::ProfileInstrKind Kind) const { + CodeGenOptions::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - llvm::driver::ProfileInstrKind Kind) const { + CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 03e10b1138a71..cd5fc48c4a22b 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -123,10 +123,17 @@ namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } +// Default filename used for profile generation. +static std::string getDefaultProfileGenName() { + return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE + ? "default_%m.proflite" + : "default_%m.profraw"; +} + // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? llvm::driver::getDefaultProfileGenName() + ? getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -828,12 +835,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. 
- PGOOpt = PGOOptions( - getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, - PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, - CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, + PGOOptions::IRInstr, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, + CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse @@ -841,32 +848,31 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + PGOOptions::NoCSAction, ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) - PGOOpt = - PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); + PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, + PGOOptions::NoAction, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // 
-fpseudo-probe-for-profiling - PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, - CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = + PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, true); + ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 5493cc92bd8b0..1f5eb427b566f 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && CodeGenOpts.getProfileUse() != - llvm::driver::ProfileInstrKind::ProfileNone) + if (OptRecordFile && + CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 30aec87c909eb..4193f0a1b278f 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -940,8 +940,7 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != - llvm::driver::ProfileInstrKind::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git 
a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 264f1bdee81c6..6d2c705338ecf 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3601,7 +3601,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, // If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index e303631cc1d57..dcc46469df3e9 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -883,10 +883,6 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); - // recognise options: fprofile-generate -fprofile-use= - Args.addAllArgs( - CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ}); - // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ, diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index 11d0dc6b7b6f1..9c33910eff57e 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1499,11 +1499,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). 
if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); + Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); else - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); + Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); } else - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); + Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index ae12aec518108..a697872836569 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,15 +24,8 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. - -/// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) -/// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) - CODEGENOPT(InstrumentFunctions, 1, 0) ///< Set when -finstrument_functions is ///< enabled on the compile step. - CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. 
CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 06203670f97b9..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -151,44 +151,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - /// Name of the profile file to use as output for -fprofile-instr-generate, - /// -fprofile-generate, and -fcs-profile-generate. - std::string InstrProfileOutput; - - /// Name of the profile file to use as input for -fmemory-profile-use. - std::string MemoryProfileUsePath; - - /// Name of the profile file to use as input for -fprofile-instr-use - std::string ProfileInstrumentUsePath; - - /// Name of the profile remapping file to apply to the profile data supplied - /// by -fprofile-sample-use or -fprofile-instr-use. - std::string ProfileRemappingFile; - - /// Check if Clang profile instrumenation is on. - bool hasProfileClangInstr() const { - return getProfileInstr() == llvm::driver::ProfileClangInstr; - } - - /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileIRInstr; - } - - /// Check if CS IR level profile instrumentation is on. - bool hasProfileCSIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileCSIRInstr; - } - /// Check if IR level profile use is on. - bool hasProfileIRUse() const { - return getProfileUse() == llvm::driver::ProfileIRInstr || - getProfileUse() == llvm::driver::ProfileCSIRInstr; - } - /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { - return getProfileUse() == llvm::driver::ProfileCSIRInstr; - } - // Define accessors/mutators for code generation options of enumeration type. 
 #define CODEGENOPT(Name, Bits, Default)
 #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerI...
[truncated]

``````````
https://github.com/llvm/llvm-project/pull/142159

From flang-commits at lists.llvm.org Fri May 30 07:25:56 2025
From: flang-commits at lists.llvm.org (via flang-commits)
Date: Fri, 30 May 2025 07:25:56 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [llvm] Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler" (PR #142159)
In-Reply-To:
Message-ID: <6839bff4.050a0220.c105a.352a@mx.google.com>

llvmbot wrote:

@llvm/pr-subscribers-flang-driver @llvm/pr-subscribers-clang-codegen

Author: Tarun Prabhu (tarunprabhu)
Changes

Reverts llvm/llvm-project#136098

---

Patch is 29.10 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/142159.diff

21 Files Affected:

- (modified) clang/include/clang/Basic/CodeGenOptions.def (+2-4)
- (modified) clang/include/clang/Basic/CodeGenOptions.h (+8-14)
- (modified) clang/include/clang/Basic/ProfileList.h (+5-4)
- (modified) clang/include/clang/Driver/Options.td (+2-2)
- (modified) clang/lib/Basic/ProfileList.cpp (+10-10)
- (modified) clang/lib/CodeGen/BackendUtil.cpp (+24-18)
- (modified) clang/lib/CodeGen/CodeGenAction.cpp (+2-2)
- (modified) clang/lib/CodeGen/CodeGenFunction.cpp (+1-2)
- (modified) clang/lib/CodeGen/CodeGenModule.cpp (+1-1)
- (modified) clang/lib/Driver/ToolChains/Flang.cpp (-4)
- (modified) clang/lib/Frontend/CompilerInvocation.cpp (+3-3)
- (modified) flang/include/flang/Frontend/CodeGenOptions.def (-7)
- (modified) flang/include/flang/Frontend/CodeGenOptions.h (-38)
- (modified) flang/lib/Frontend/CompilerInvocation.cpp (-10)
- (modified) flang/lib/Frontend/FrontendActions.cpp (-26)
- (modified) flang/test/Driver/flang-f-opts.f90 (-5)
- (removed) flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext (-18)
- (removed) flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext (-11)
- (removed) flang/test/Profile/gcc-flag-compatibility.f90 (-32)
- (modified) llvm/include/llvm/Frontend/Driver/CodeGenOptions.h (-12)
- (modified) llvm/lib/Frontend/Driver/CodeGenOptions.cpp (-13)

``````````diff
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 11dad53a52efe..aad4e107cbeb3 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -223,11 +223,9 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is
 CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous
instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. - -ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 4, llvm::driver::ProfileInstrKind::ProfileNone) - +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 4, ProfileNone) /// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) +ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) /// Partition functions into N groups and select only functions in group i to be /// instrumented. Selected group numbers can be 0 to N-1 inclusive. VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index bffbd00b1bd72..278803f7bb960 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -518,41 +518,35 @@ class CodeGenOptions : public CodeGenOptionsBase { /// Check if Clang profile instrumenation is on. bool hasProfileClangInstr() const { - return getProfileInstr() == - llvm::driver::ProfileInstrKind::ProfileClangInstr; + return getProfileInstr() == ProfileClangInstr; } /// Check if IR level profile instrumentation is on. bool hasProfileIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileInstrKind::ProfileIRInstr; + return getProfileInstr() == ProfileIRInstr; } /// Check if CS IR level profile instrumentation is on. bool hasProfileCSIRInstr() const { - return getProfileInstr() == - llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + return getProfileInstr() == ProfileCSIRInstr; } /// Check if any form of instrumentation is on. - bool hasProfileInstr() const { - return getProfileInstr() != llvm::driver::ProfileInstrKind::ProfileNone; - } + bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; } /// Check if Clang profile use is on. 
bool hasProfileClangUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileClangInstr; + return getProfileUse() == ProfileClangInstr; } /// Check if IR level profile use is on. bool hasProfileIRUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileIRInstr || - getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; + return getProfileUse() == ProfileIRInstr || + getProfileUse() == ProfileCSIRInstr; } /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { - return getProfileUse() == llvm::driver::ProfileInstrKind::ProfileCSIRInstr; - } + bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; } /// Check if type and variable info should be emitted. bool hasReducedDebugInfo() const { diff --git a/clang/include/clang/Basic/ProfileList.h b/clang/include/clang/Basic/ProfileList.h index 5338ef3992ade..b4217e49c18a3 100644 --- a/clang/include/clang/Basic/ProfileList.h +++ b/clang/include/clang/Basic/ProfileList.h @@ -49,16 +49,17 @@ class ProfileList { ~ProfileList(); bool isEmpty() const { return Empty; } - ExclusionType getDefault(llvm::driver::ProfileInstrKind Kind) const; + ExclusionType getDefault(CodeGenOptions::ProfileInstrKind Kind) const; std::optional isFunctionExcluded(StringRef FunctionName, - llvm::driver::ProfileInstrKind Kind) const; + CodeGenOptions::ProfileInstrKind Kind) const; std::optional isLocationExcluded(SourceLocation Loc, - llvm::driver::ProfileInstrKind Kind) const; + CodeGenOptions::ProfileInstrKind Kind) const; std::optional - isFileExcluded(StringRef FileName, llvm::driver::ProfileInstrKind Kind) const; + isFileExcluded(StringRef FileName, + CodeGenOptions::ProfileInstrKind Kind) const; }; } // namespace clang diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 5c79c66b55eb3..5ca31c253ed8f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1772,7 
+1772,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">, HelpText<"Maximum number of test vectors in MC/DC coverage">, MarshallingInfoInt, "0x7FFFFFFE">; def fprofile_generate : Flag<["-"], "fprofile-generate">, - Group, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, + Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">, Group, Visibility<[ClangOption, CLOption]>, @@ -1789,7 +1789,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group, Visibility<[ClangOption, CLOption]>, Alias; def fprofile_use_EQ : Joined<["-"], "fprofile-use=">, Group, - Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>, + Visibility<[ClangOption, CLOption]>, MetaVarName<"">, HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from /default.profdata. 
Otherwise, it reads from file .">; def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">, diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index bea65579f396b..2d37014294b92 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -70,24 +70,24 @@ ProfileList::ProfileList(ArrayRef Paths, SourceManager &SM) ProfileList::~ProfileList() = default; -static StringRef getSectionName(llvm::driver::ProfileInstrKind Kind) { +static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { switch (Kind) { - case llvm::driver::ProfileInstrKind::ProfileNone: + case CodeGenOptions::ProfileNone: return ""; - case llvm::driver::ProfileInstrKind::ProfileClangInstr: + case CodeGenOptions::ProfileClangInstr: return "clang"; - case llvm::driver::ProfileInstrKind::ProfileIRInstr: + case CodeGenOptions::ProfileIRInstr: return "llvm"; - case llvm::driver::ProfileInstrKind::ProfileCSIRInstr: + case CodeGenOptions::ProfileCSIRInstr: return "csllvm"; case CodeGenOptions::ProfileIRSampleColdCov: return "sample-coldcov"; } - llvm_unreachable("Unhandled llvm::driver::ProfileInstrKind enum"); + llvm_unreachable("Unhandled CodeGenOptions::ProfileInstrKind enum"); } ProfileList::ExclusionType -ProfileList::getDefault(llvm::driver::ProfileInstrKind Kind) const { +ProfileList::getDefault(CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "default:" if (SCL->inSection(Section, "default", "allow")) @@ -118,7 +118,7 @@ ProfileList::inSection(StringRef Section, StringRef Prefix, std::optional ProfileList::isFunctionExcluded(StringRef FunctionName, - llvm::driver::ProfileInstrKind Kind) const { + CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "function:=" if (auto V = inSection(Section, "function", FunctionName)) @@ -132,13 +132,13 @@ ProfileList::isFunctionExcluded(StringRef FunctionName, std::optional 
ProfileList::isLocationExcluded(SourceLocation Loc, - llvm::driver::ProfileInstrKind Kind) const { + CodeGenOptions::ProfileInstrKind Kind) const { return isFileExcluded(SM.getFilename(SM.getFileLoc(Loc)), Kind); } std::optional ProfileList::isFileExcluded(StringRef FileName, - llvm::driver::ProfileInstrKind Kind) const { + CodeGenOptions::ProfileInstrKind Kind) const { StringRef Section = getSectionName(Kind); // Check for "source:=" if (auto V = inSection(Section, "source", FileName)) diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 03e10b1138a71..cd5fc48c4a22b 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -123,10 +123,17 @@ namespace clang { extern llvm::cl::opt ClSanitizeGuardChecks; } +// Default filename used for profile generation. +static std::string getDefaultProfileGenName() { + return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE + ? "default_%m.proflite" + : "default_%m.profraw"; +} + // Path and name of file used for profile generation static std::string getProfileGenName(const CodeGenOptions &CodeGenOpts) { std::string FileName = CodeGenOpts.InstrProfileOutput.empty() - ? llvm::driver::getDefaultProfileGenName() + ? getDefaultProfileGenName() : CodeGenOpts.InstrProfileOutput; if (CodeGenOpts.ContinuousProfileSync) FileName = "%c" + FileName; @@ -828,12 +835,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline( if (CodeGenOpts.hasProfileIRInstr()) // -fprofile-generate. 
- PGOOpt = PGOOptions( - getProfileGenName(CodeGenOpts), "", "", - CodeGenOpts.MemoryProfileUsePath, nullptr, PGOOptions::IRInstr, - PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, - CodeGenOpts.DebugInfoForProfiling, - /*PseudoProbeForProfiling=*/false, CodeGenOpts.AtomicProfileUpdate); + PGOOpt = PGOOptions(getProfileGenName(CodeGenOpts), "", "", + CodeGenOpts.MemoryProfileUsePath, nullptr, + PGOOptions::IRInstr, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, + /*PseudoProbeForProfiling=*/false, + CodeGenOpts.AtomicProfileUpdate); else if (CodeGenOpts.hasProfileIRUse()) { // -fprofile-use. auto CSAction = CodeGenOpts.hasProfileCSIRUse() ? PGOOptions::CSIRUse @@ -841,32 +848,31 @@ void EmitAssemblyHelper::RunOptimizationPipeline( PGOOpt = PGOOptions(CodeGenOpts.ProfileInstrumentUsePath, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::IRUse, CSAction, llvm::ClPGOColdFuncAttr, + PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); } else if (!CodeGenOpts.SampleProfileFile.empty()) // -fprofile-sample-use PGOOpt = PGOOptions( CodeGenOpts.SampleProfileFile, "", CodeGenOpts.ProfileRemappingFile, CodeGenOpts.MemoryProfileUsePath, VFS, PGOOptions::SampleUse, - PGOOptions::NoCSAction, llvm::ClPGOColdFuncAttr, + PGOOptions::NoCSAction, ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, CodeGenOpts.PseudoProbeForProfiling); else if (!CodeGenOpts.MemoryProfileUsePath.empty()) // -fmemory-profile-use (without any of the above options) - PGOOpt = - PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, - PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); + PGOOpt = PGOOptions("", "", "", CodeGenOpts.MemoryProfileUsePath, VFS, + PGOOptions::NoAction, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling); else if (CodeGenOpts.PseudoProbeForProfiling) // 
-fpseudo-probe-for-profiling - PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, - PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, - CodeGenOpts.DebugInfoForProfiling, true); + PGOOpt = + PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, + PGOOptions::NoAction, PGOOptions::NoCSAction, + ClPGOColdFuncAttr, CodeGenOpts.DebugInfoForProfiling, true); else if (CodeGenOpts.DebugInfoForProfiling) // -fdebug-info-for-profiling PGOOpt = PGOOptions("", "", "", /*MemoryProfile=*/"", nullptr, PGOOptions::NoAction, PGOOptions::NoCSAction, - llvm::ClPGOColdFuncAttr, true); + ClPGOColdFuncAttr, true); // Check to see if we want to generate a CS profile. if (CodeGenOpts.hasProfileCSIRInstr()) { diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp index 5493cc92bd8b0..1f5eb427b566f 100644 --- a/clang/lib/CodeGen/CodeGenAction.cpp +++ b/clang/lib/CodeGen/CodeGenAction.cpp @@ -273,8 +273,8 @@ void BackendConsumer::HandleTranslationUnit(ASTContext &C) { std::unique_ptr OptRecordFile = std::move(*OptRecordFileOrErr); - if (OptRecordFile && CodeGenOpts.getProfileUse() != - llvm::driver::ProfileInstrKind::ProfileNone) + if (OptRecordFile && + CodeGenOpts.getProfileUse() != CodeGenOptions::ProfileNone) Ctx.setDiagnosticsHotnessRequested(true); if (CodeGenOpts.MisExpect) { diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp index 30aec87c909eb..4193f0a1b278f 100644 --- a/clang/lib/CodeGen/CodeGenFunction.cpp +++ b/clang/lib/CodeGen/CodeGenFunction.cpp @@ -940,8 +940,7 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy, } } - if (CGM.getCodeGenOpts().getProfileInstr() != - llvm::driver::ProfileInstrKind::ProfileNone) { + if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone) { switch (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc)) { case ProfileList::Skip: Fn->addFnAttr(llvm::Attribute::SkipProfile); diff --git 
a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp index 264f1bdee81c6..6d2c705338ecf 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -3601,7 +3601,7 @@ CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn, // If the profile list is empty, then instrument everything. if (ProfileList.isEmpty()) return ProfileList::Allow; - llvm::driver::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); + CodeGenOptions::ProfileInstrKind Kind = getCodeGenOpts().getProfileInstr(); // First, check the function name. if (auto V = ProfileList.isFunctionExcluded(Fn->getName(), Kind)) return *V; diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp index e303631cc1d57..dcc46469df3e9 100644 --- a/clang/lib/Driver/ToolChains/Flang.cpp +++ b/clang/lib/Driver/ToolChains/Flang.cpp @@ -883,10 +883,6 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption Args.AddLastArg(CmdArgs, options::OPT_w); - // recognise options: fprofile-generate -fprofile-use= - Args.addAllArgs( - CmdArgs, {options::OPT_fprofile_generate, options::OPT_fprofile_use_EQ}); - // Forward flags for OpenMP. We don't do this if the current action is an // device offloading action other than OpenMP. if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ, diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp index 11d0dc6b7b6f1..9c33910eff57e 100644 --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -1499,11 +1499,11 @@ static void setPGOUseInstrumentor(CodeGenOptions &Opts, // which is available (might be one or both). 
if (PGOReader->isIRLevelProfile() || PGOReader->hasMemoryProfile()) { if (PGOReader->hasCSIRLevelProfile()) - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileCSIRInstr); + Opts.setProfileUse(CodeGenOptions::ProfileCSIRInstr); else - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileIRInstr); + Opts.setProfileUse(CodeGenOptions::ProfileIRInstr); } else - Opts.setProfileUse(llvm::driver::ProfileInstrKind::ProfileClangInstr); + Opts.setProfileUse(CodeGenOptions::ProfileClangInstr); } void CompilerInvocation::setDefaultPointerAuthOptions( diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def index ae12aec518108..a697872836569 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.def +++ b/flang/include/flang/Frontend/CodeGenOptions.def @@ -24,15 +24,8 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified. CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new ///< pass manager. - -/// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) -/// Choose profile kind for PGO use compilation. -ENUM_CODEGENOPT(ProfileUse, llvm::driver::ProfileInstrKind, 2, llvm::driver::ProfileInstrKind::ProfileNone) - CODEGENOPT(InstrumentFunctions, 1, 0) ///< Set when -finstrument_functions is ///< enabled on the compile step. - CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level. CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module. 
CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 06203670f97b9..61e56e51c4bbb 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -151,44 +151,6 @@ class CodeGenOptions : public CodeGenOptionsBase { /// OpenMP is enabled. using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind; - /// Name of the profile file to use as output for -fprofile-instr-generate, - /// -fprofile-generate, and -fcs-profile-generate. - std::string InstrProfileOutput; - - /// Name of the profile file to use as input for -fmemory-profile-use. - std::string MemoryProfileUsePath; - - /// Name of the profile file to use as input for -fprofile-instr-use - std::string ProfileInstrumentUsePath; - - /// Name of the profile remapping file to apply to the profile data supplied - /// by -fprofile-sample-use or -fprofile-instr-use. - std::string ProfileRemappingFile; - - /// Check if Clang profile instrumenation is on. - bool hasProfileClangInstr() const { - return getProfileInstr() == llvm::driver::ProfileClangInstr; - } - - /// Check if IR level profile instrumentation is on. - bool hasProfileIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileIRInstr; - } - - /// Check if CS IR level profile instrumentation is on. - bool hasProfileCSIRInstr() const { - return getProfileInstr() == llvm::driver::ProfileCSIRInstr; - } - /// Check if IR level profile use is on. - bool hasProfileIRUse() const { - return getProfileUse() == llvm::driver::ProfileIRInstr || - getProfileUse() == llvm::driver::ProfileCSIRInstr; - } - /// Check if CSIR profile use is on. - bool hasProfileCSIRUse() const { - return getProfileUse() == llvm::driver::ProfileCSIRInstr; - } - // Define accessors/mutators for code generation options of enumeration type. 
#define CODEGENOPT(Name, Bits, Default) #define ENUM_CODEGENOPT(Name, Type, Bits, Default) \ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerI... [truncated] ``````````
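For readers skimming the revert, the profile file naming that the `BackendUtil.cpp` hunk moves back into clang can be sketched in isolation. This is only an illustration under stated assumptions: `ProfileNameOptions`, `defaultProfileGenName` and `profileGenName` are hypothetical stand-ins, not names from the patch, and the real code reads `DebugInfoCorrelate`, `ProfileCorrelate` and `CodeGenOptions` directly. The sketch also assumes the helper returns the name once the `%c` prefix is applied, since the hunk is cut off at that point.

```cpp
#include <string>

// Hypothetical stand-in for the state the patch consults. The real code uses
// llvm::DebugInfoCorrelate, llvm::ProfileCorrelate and clang's CodeGenOptions.
struct ProfileNameOptions {
  bool correlate = false;         // models DebugInfoCorrelate || ProfileCorrelate != NONE
  bool continuousSync = false;    // models CodeGenOpts.ContinuousProfileSync
  std::string instrProfileOutput; // models CodeGenOpts.InstrProfileOutput
};

// Default filename used for profile generation, as in the quoted hunk.
std::string defaultProfileGenName(bool correlate) {
  return correlate ? "default_%m.proflite" : "default_%m.profraw";
}

// Path and name of the file used for profile generation. Only the part of
// the logic visible in the quoted hunk is modeled here.
std::string profileGenName(const ProfileNameOptions &opts) {
  std::string fileName = opts.instrProfileOutput.empty()
                             ? defaultProfileGenName(opts.correlate)
                             : opts.instrProfileOutput;
  if (opts.continuousSync)
    fileName = "%c" + fileName;
  return fileName;
}
```

The linker errors quoted later in this thread (undefined references to `llvm::DebugInfoCorrelate` and `llvm::ProfileCorrelate`) concern exactly the two globals that the `correlate` flag stands in for here.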
https://github.com/llvm/llvm-project/pull/142159 From flang-commits at lists.llvm.org Fri May 30 07:26:11 2025 From: flang-commits at lists.llvm.org (Jan Patrick Lehr via flang-commits) Date: Fri, 30 May 2025 07:26:11 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler" (PR #142159) In-Reply-To: Message-ID: <6839c003.170a0220.6c5b.7a0b@mx.google.com> https://github.com/jplehr approved this pull request. LG https://github.com/llvm/llvm-project/pull/142159 From flang-commits at lists.llvm.org Fri May 30 07:27:14 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 30 May 2025 07:27:14 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler" (PR #142159) In-Reply-To: Message-ID: <6839c042.170a0220.d63f8.1ff2@mx.google.com> https://github.com/tarunprabhu closed https://github.com/llvm/llvm-project/pull/142159 From flang-commits at lists.llvm.org Fri May 30 07:29:46 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 30 May 2025 07:29:46 -0700 (PDT) Subject: [flang-commits] [flang] Revert "Reland "[flang] Added noalias attribute to function arguments… (PR #142128) In-Reply-To: Message-ID: <6839c0da.170a0220.1234fd.2e89@mx.google.com> ================ @@ -350,15 +350,11 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, else framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None; - bool setNoCapture = false, setNoAlias = false; - if (config.OptLevel.isOptimizingForSpeed()) ---------------- vzakhari wrote: Tom, will you be okay if I just remove this `if`? The LIT tests should still work, and it will be easier to re-enable the code when function specialization is fixed.
https://github.com/llvm/llvm-project/pull/142128 From flang-commits at lists.llvm.org Fri May 30 07:35:28 2025 From: flang-commits at lists.llvm.org (Leandro Lupori via flang-commits) Date: Fri, 30 May 2025 07:35:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Explicitly set Shared DSA in symbols (PR #142154) In-Reply-To: Message-ID: <6839c230.a70a0220.22ad38.50d9@mx.google.com> https://github.com/luporl edited https://github.com/llvm/llvm-project/pull/142154 From flang-commits at lists.llvm.org Fri May 30 07:51:06 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 30 May 2025 07:51:06 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #142073) In-Reply-To: Message-ID: <6839c5da.050a0220.2d52e2.f090@mx.google.com> tarunprabhu wrote: @mcinally, It looks like the buildbot failure is because a test is written expecting a single attribute, but two are present, although the attribute being checked for is present. https://github.com/llvm/llvm-project/pull/142073 From flang-commits at lists.llvm.org Fri May 30 07:53:35 2025 From: flang-commits at lists.llvm.org (Kiran Chandramohan via flang-commits) Date: Fri, 30 May 2025 07:53:35 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Resolve names for declare simd uniform clause (PR #142160) In-Reply-To: Message-ID: <6839c66f.170a0220.19c62a.3b51@mx.google.com> ================ @@ -550,6 +550,13 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { return false; } + bool Pre(const parser::OmpClause::Uniform &x) { + for (const auto &name : x.v) { + ResolveOmpName(name, Symbol::Flag::OmpUniform); ---------------- kiranchandramohan wrote: Can you add a `debug-dump-symbols` test to show that the symbol corresponding to name has the `OmpUniform` flag? 
https://github.com/llvm/llvm-project/pull/142160 From flang-commits at lists.llvm.org Fri May 30 08:02:19 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 30 May 2025 08:02:19 -0700 (PDT) Subject: [flang-commits] [flang] Revert "Reland "[flang] Added noalias attribute to function arguments… (PR #142128) In-Reply-To: Message-ID: <6839c87b.170a0220.2f6183.80ea@mx.google.com> https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/142128 >From 645f6c332a1e7b9566900cd071cc44eca50d27a5 Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Fri, 30 May 2025 14:53:07 +0000 Subject: [PATCH] [flang] Disable noalias captures(none) by default This is due to a 70% regression in exchange2_r on neoverse-v2 due to function specialization no longer triggering in the LTO pipeline. --- flang/lib/Optimizer/Passes/Pipelines.cpp | 15 ++++++++++++--- flang/test/Fir/polymorphic.fir | 2 +- flang/test/Fir/struct-passing-x86-64-byval.fir | 2 +- flang/test/Fir/target-rewrite-complex-10-x86.fir | 2 +- flang/test/Fir/target.fir | 2 +- 5 files changed, 16 insertions(+), 7 deletions(-) diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 0c774eede4c9a..ec17a93b53ff4 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -10,6 +10,14 @@ /// common to flang and the test tools. #include "flang/Optimizer/Passes/Pipelines.h" +#include "llvm/Support/CommandLine.h" + +/// Force setting the no-alias attribute on fuction arguments when possible. +static llvm::cl::opt forceNoAlias("force-no-alias", llvm::cl::Hidden, + llvm::cl::init(false)); +/// Force setting the no-capture attribute on fuction arguments when possible.
+static llvm::cl::opt forceNoCapture("force-no-capture", llvm::cl::Hidden, + llvm::cl::init(false)); namespace fir { @@ -350,9 +358,10 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, else framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None; - bool setNoCapture = false, setNoAlias = false; - if (config.OptLevel.isOptimizingForSpeed()) - setNoCapture = setNoAlias = true; + // TODO: re-enable setNoAlias by default (when optimizing for speed) once + // function specialization is fixed. + bool setNoAlias = forceNoAlias; + bool setNoCapture = forceNoCapture; pm.addPass(fir::createFunctionAttr( {framePointerKind, config.InstrumentFunctionEntry, diff --git a/flang/test/Fir/polymorphic.fir b/flang/test/Fir/polymorphic.fir index 84fa2e950633f..d9a13a99477ce 100644 --- a/flang/test/Fir/polymorphic.fir +++ b/flang/test/Fir/polymorphic.fir @@ -1,4 +1,4 @@ -// RUN: tco %s | FileCheck %s +// RUN: tco --force-no-capture %s | FileCheck %s // Test code gen for unlimited polymorphic type descriptor. diff --git a/flang/test/Fir/struct-passing-x86-64-byval.fir b/flang/test/Fir/struct-passing-x86-64-byval.fir index 997d2930f836c..dd25b80a3f81d 100644 --- a/flang/test/Fir/struct-passing-x86-64-byval.fir +++ b/flang/test/Fir/struct-passing-x86-64-byval.fir @@ -1,7 +1,7 @@ // Test X86-64 ABI rewrite of struct passed by value (BIND(C), VALUE derived types). // This test test cases where the struct must be passed on the stack according // to the System V ABI. 
-// RUN: tco --target=x86_64-unknown-linux-gnu %s | FileCheck %s +// RUN: tco --target=x86_64-unknown-linux-gnu --force-no-capture --force-no-alias %s | FileCheck %s module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} { diff --git a/flang/test/Fir/target-rewrite-complex-10-x86.fir b/flang/test/Fir/target-rewrite-complex-10-x86.fir index 5f917ee42d598..b05187c65a932 100644 --- a/flang/test/Fir/target-rewrite-complex-10-x86.fir +++ b/flang/test/Fir/target-rewrite-complex-10-x86.fir @@ -1,6 +1,6 @@ // Test COMPLEX(10) passing and returning on X86 // RUN: fir-opt --target-rewrite="target=x86_64-unknown-linux-gnu" %s | FileCheck %s --check-prefix=AMD64 -// RUN: tco -target="x86_64-unknown-linux-gnu" %s | FileCheck %s --check-prefix=AMD64_LLVM +// RUN: tco -target="x86_64-unknown-linux-gnu" --force-no-alias --force-no-capture %s | FileCheck %s --check-prefix=AMD64_LLVM module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} { diff --git a/flang/test/Fir/target.fir b/flang/test/Fir/target.fir index e1190649e0803..d40bcae4a8ad2 100644 --- a/flang/test/Fir/target.fir +++ b/flang/test/Fir/target.fir @@ -1,4 +1,4 @@ -// RUN: tco --target=i386-unknown-linux-gnu %s | FileCheck %s --check-prefix=I32 +// RUN: tco --target=i386-unknown-linux-gnu --force-no-alias --force-no-capture %s | FileCheck %s --check-prefix=I32 // RUN: tco --target=x86_64-unknown-linux-gnu %s | FileCheck %s --check-prefix=X64 // RUN: tco --target=aarch64-unknown-linux-gnu %s | FileCheck %s --check-prefix=AARCH64 // RUN: tco --target=powerpc64le-unknown-linux-gnu %s | FileCheck %s --check-prefix=PPC From flang-commits at lists.llvm.org Fri May 30 08:03:34 2025 From: flang-commits 
at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 30 May 2025 08:03:34 -0700 (PDT) Subject: [flang-commits] [flang] Revert "Reland "[flang] Added noalias attribute to function arguments… (PR #142128) In-Reply-To: Message-ID: <6839c8c6.170a0220.4818d.826c@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/142128 From flang-commits at lists.llvm.org Fri May 30 08:04:05 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 30 May 2025 08:04:05 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Disable noalias captures(none) by default (PR #142128) In-Reply-To: Message-ID: <6839c8e5.170a0220.e4567.3acb@mx.google.com> https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/142128 From flang-commits at lists.llvm.org Fri May 30 08:09:37 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Fri, 30 May 2025 08:09:37 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <6839ca31.a70a0220.26f6c9.488b@mx.google.com> https://github.com/abidh updated https://github.com/llvm/llvm-project/pull/140556 >From 5d20af48673adebc2ab3e1a6c8442f67d84f1847 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Mon, 19 May 2025 15:21:25 +0100 Subject: [PATCH 1/7] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. This PR adds functionality to change the flang command line using the environment variable `FCC_OVERRIDE_OPTIONS`. It is quite similar to what `CCC_OVERRIDE_OPTIONS` does for clang. The `FCC_OVERRIDE_OPTIONS` seemed like the most obvious name to me but I am open to other ideas. The `applyOverrideOptions` now takes an extra argument that is the name of the environment variable. Previously `CCC_OVERRIDE_OPTIONS` was hardcoded.
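To make the edit syntax concrete, here is a minimal standalone sketch of how a list of override edits could be applied to an argument vector. It is an illustration only, not the `clang::driver::applyOverrideOptions` implementation. The names `applyEditsSketch` and `isOptLevelFlag` are hypothetical; the sketch assumes the vector excludes the program name, omits the "###" log lines, and uses `std::regex` (ECMAScript syntax) for the `s/XXX/YYY/` form, which may differ from the regular-expression flavor the driver uses. The edit forms follow the list documented for `FCC_OVERRIDE_OPTIONS` in this patch series.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// True for "-O" or "-O" followed by exactly one of s, z, 0-9.
static bool isOptLevelFlag(const std::string &arg) {
  if (arg == "-O")
    return true;
  return arg.size() == 3 && arg.compare(0, 2, "-O") == 0 &&
         (arg[2] == 's' || arg[2] == 'z' ||
          std::isdigit(static_cast<unsigned char>(arg[2])));
}

// Hypothetical helper: applies space-separated edits, in order, to args.
std::vector<std::string> applyEditsSketch(std::vector<std::string> args,
                                          std::string edits) {
  // A leading '#' only silences logging, which this sketch does not model.
  if (!edits.empty() && edits[0] == '#')
    edits.erase(0, 1);

  std::istringstream stream(edits);
  std::string edit;
  while (stream >> edit) {
    const std::string rest = edit.substr(1);
    if (edit[0] == '^') {
      args.insert(args.begin(), rest); // add at the beginning
    } else if (edit[0] == '+') {
      args.push_back(rest); // add at the end
    } else if (edit.size() >= 4 && edit.compare(0, 2, "s/") == 0 &&
               edit.back() == '/') {
      const std::size_t mid = edit.find('/', 2);
      if (mid == edit.size() - 1)
        continue; // malformed: no separator between pattern and replacement
      const std::regex pattern(edit.substr(2, mid - 2));
      const std::string replacement =
          edit.substr(mid + 1, edit.size() - mid - 2);
      for (auto &arg : args)
        arg = std::regex_replace(arg, pattern, replacement,
                                 std::regex_constants::format_first_only);
    } else if (edit[0] == 'x' || edit[0] == 'X') {
      // 'x' removes the literal argument; 'X' also removes the one after it.
      const bool dropNext = edit[0] == 'X';
      for (std::size_t i = 0; i < args.size();) {
        if (args[i] != rest) {
          ++i;
          continue;
        }
        const std::size_t count = (dropNext && i + 1 < args.size()) ? 2 : 1;
        args.erase(args.begin() + i, args.begin() + i + count);
      }
    } else if (edit[0] == 'O') {
      // Drop existing optimization-level flags, then append "-" + edit.
      args.erase(std::remove_if(args.begin(), args.end(), isOptLevelFlag),
                 args.end());
      args.push_back("-" + edit);
    }
  }
  return args;
}
```

With the edits `x-Werror +-g` from the test added in this series, the sketch drops `-Werror` and appends `-g`, which lines up with the "Deleting argument" and "Adding argument ... at end" messages that the `RM-WERROR` checks expect from the real driver.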
--- clang/include/clang/Driver/Driver.h | 2 +- clang/lib/Driver/Driver.cpp | 4 ++-- clang/tools/driver/driver.cpp | 2 +- flang/test/Driver/fcc_override.f90 | 12 ++++++++++++ flang/tools/flang-driver/driver.cpp | 7 +++++++ 5 files changed, 23 insertions(+), 4 deletions(-) create mode 100644 flang/test/Driver/fcc_override.f90 diff --git a/clang/include/clang/Driver/Driver.h b/clang/include/clang/Driver/Driver.h index b463dc2a93550..7ca848f11b561 100644 --- a/clang/include/clang/Driver/Driver.h +++ b/clang/include/clang/Driver/Driver.h @@ -879,7 +879,7 @@ llvm::Error expandResponseFiles(SmallVectorImpl &Args, /// See applyOneOverrideOption. void applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideOpts, - llvm::StringSet<> &SavedStrings, + llvm::StringSet<> &SavedStrings, StringRef EnvVar, raw_ostream *OS = nullptr); } // end namespace driver diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp index a648cc928afdc..a8fea35926a0d 100644 --- a/clang/lib/Driver/Driver.cpp +++ b/clang/lib/Driver/Driver.cpp @@ -7289,7 +7289,7 @@ static void applyOneOverrideOption(raw_ostream &OS, void driver::applyOverrideOptions(SmallVectorImpl &Args, const char *OverrideStr, llvm::StringSet<> &SavedStrings, - raw_ostream *OS) { + StringRef EnvVar, raw_ostream *OS) { if (!OS) OS = &llvm::nulls(); @@ -7298,7 +7298,7 @@ void driver::applyOverrideOptions(SmallVectorImpl &Args, OS = &llvm::nulls(); } - *OS << "### CCC_OVERRIDE_OPTIONS: " << OverrideStr << "\n"; + *OS << "### " << EnvVar << ": " << OverrideStr << "\n"; // This does not need to be efficient. diff --git a/clang/tools/driver/driver.cpp b/clang/tools/driver/driver.cpp index 82f47ab973064..81964c65c2892 100644 --- a/clang/tools/driver/driver.cpp +++ b/clang/tools/driver/driver.cpp @@ -305,7 +305,7 @@ int clang_main(int Argc, char **Argv, const llvm::ToolContext &ToolContext) { if (const char *OverrideStr = ::getenv("CCC_OVERRIDE_OPTIONS")) { // FIXME: Driver shouldn't take extra initial argument. 
driver::applyOverrideOptions(Args, OverrideStr, SavedStrings, - &llvm::errs()); + "CCC_OVERRIDE_OPTIONS", &llvm::errs()); } std::string Path = GetExecutablePath(ToolContext.Path, CanonicalPrefixes); diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 new file mode 100644 index 0000000000000..55a07803fdde5 --- /dev/null +++ b/flang/test/Driver/fcc_override.f90 @@ -0,0 +1,12 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s +! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR + +! CHECK: "-fc1" +! CHECK-NOT: "-Oignore" +! CHECK: "-Omagic" +! CHECK-NOT: "-Oignore" + +! RM-WERROR: ### FCC_OVERRIDE_OPTIONS: x-Werror +-g +! RM-WERROR-NEXT: ### Deleting argument -Werror +! RM-WERROR-NEXT: ### Adding argument -g at end +! RM-WERROR-NOT: "-Werror" diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ed52988feaa59..ad0efa3279cef 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -111,6 +111,13 @@ int main(int argc, const char **argv) { } } + llvm::StringSet<> SavedStrings; + // Handle FCC_OVERRIDE_OPTIONS, used for editing a command line behind the + // scenes. + if (const char *OverrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) + clang::driver::applyOverrideOptions(args, OverrideStr, SavedStrings, + "FCC_OVERRIDE_OPTIONS", &llvm::errs()); + // Not in the frontend mode - continue in the compiler driver mode. // Create DiagnosticsEngine for the compiler driver >From d1f2c9b8abd2690612a4b886a7a85b8e7f57d359 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 11:05:57 +0100 Subject: [PATCH 2/7] Add documentation for FCC_OVERRIDE_OPTIONS. 
--- flang/docs/FlangDriver.md | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index 97744f0bee069..f93df8701e677 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -614,3 +614,28 @@ nvfortran defines `-fast` as - `-Mcache_align`: there is no equivalent flag in Flang or Clang. - `-Mflushz`: flush-to-zero mode - when `-ffast-math` is specified, Flang will link to `crtfastmath.o` to ensure denormal numbers are flushed to zero. + + +## FCC_OVERRIDE_OPTIONS + +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of +edits to the input argument lists. The value of this environment variable is +a space separated list of edits to perform. These edits are applied in order to +the input argument lists. Edits should be one of the following forms: + +- `#`: Silence information about the changes to the command line arguments. + +- `^FOO`: Add `FOO` as a new argument at the beginning of the command line. + +- `+FOO`: Add `FOO` as a new argument at the end of the command line. + +- `s/XXX/YYY/`: Substitute the regular expression `XXX` with `YYY` in the + command line. + +- `xOPTION`: Removes all instances of the literal argument `OPTION`. + +- `XOPTION`: Removes all instances of the literal argument `OPTION`, and the + following argument. + +- `Ox`: Removes all flags matching `O` or `O[sz0-9]` and adds `Ox` at the end + of the command line. \ No newline at end of file >From d093a6ac74f8c0058e134ec55fbbf2b8edf9b477 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 17:28:46 +0100 Subject: [PATCH 3/7] Mention that effect on options added by the config files. 
--- flang/docs/FlangDriver.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index f93df8701e677..0302cb1dc33b9 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -638,4 +638,6 @@ the input argument lists. Edits should be one of the following forms: following argument. - `Ox`: Removes all flags matching `O` or `O[sz0-9]` and adds `Ox` at the end - of the command line. \ No newline at end of file + of the command line. + +This environment variable does not affect the options added by the config files. >From 9faf4d384a40514b15cc3bf270303843e8dd4822 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Thu, 29 May 2025 20:36:59 +0100 Subject: [PATCH 4/7] Add a test for option from config file. --- flang/test/Driver/Inputs/config-7.cfg | 1 + flang/test/Driver/fcc_override.f90 | 5 +++++ 2 files changed, 6 insertions(+) create mode 100644 flang/test/Driver/Inputs/config-7.cfg diff --git a/flang/test/Driver/Inputs/config-7.cfg b/flang/test/Driver/Inputs/config-7.cfg new file mode 100644 index 0000000000000..2f41be663b282 --- /dev/null +++ b/flang/test/Driver/Inputs/config-7.cfg @@ -0,0 +1 @@ +-Werror diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 index 55a07803fdde5..417919b5d667a 100644 --- a/flang/test/Driver/fcc_override.f90 +++ b/flang/test/Driver/fcc_override.f90 @@ -1,5 +1,6 @@ ! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s ! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR +! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror" %flang --config=%S/Inputs/config-7.cfg -### %s -c 2>&1 | FileCheck %s -check-prefix=CONF ! CHECK: "-fc1" ! CHECK-NOT: "-Oignore" @@ -10,3 +11,7 @@ ! RM-WERROR-NEXT: ### Deleting argument -Werror ! 
RM-WERROR-NEXT: ### Adding argument -g at end ! RM-WERROR-NOT: "-Werror" + +! Test that FCC_OVERRIDE_OPTIONS does not affect the options from config files. +! CONF: ### FCC_OVERRIDE_OPTIONS: x-Werror +! CONF: "-Werror" >From db45474fc5625223b5240aa7f7ef094d2d80d5ae Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Fri, 30 May 2025 10:28:49 +0100 Subject: [PATCH 5/7] Handle review comments. --- flang/docs/FlangDriver.md | 8 ++++---- flang/test/Driver/fcc_override.f90 | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/flang/docs/FlangDriver.md b/flang/docs/FlangDriver.md index 0302cb1dc33b9..e6750c92567a4 100644 --- a/flang/docs/FlangDriver.md +++ b/flang/docs/FlangDriver.md @@ -618,10 +618,10 @@ nvfortran defines `-fast` as ## FCC_OVERRIDE_OPTIONS -The environment variable `FCC_OVERRIDE_OPTIONS` can be used to apply a list of -edits to the input argument lists. The value of this environment variable is -a space separated list of edits to perform. These edits are applied in order to -the input argument lists. Edits should be one of the following forms: +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to edit flang's +command line arguments. The value of this variable is a space-separated list of +edits to perform. The edits are applied in the order in which they appear in +`FCC_OVERRIDE_OPTIONS`. Each edit should be one of the following forms: - `#`: Silence information about the changes to the command line arguments. diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 index 417919b5d667a..71def0847f150 100644 --- a/flang/test/Driver/fcc_override.f90 +++ b/flang/test/Driver/fcc_override.f90 @@ -1,5 +1,5 @@ ! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s -! 
RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang -target x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR +! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang --target=x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR ! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror" %flang --config=%S/Inputs/config-7.cfg -### %s -c 2>&1 | FileCheck %s -check-prefix=CONF ! CHECK: "-fc1" >From 2efd653c1a575fc77d53d5db7db1cf5225f8039d Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Fri, 30 May 2025 11:13:29 +0100 Subject: [PATCH 6/7] Handle review comments. --- flang/tools/flang-driver/driver.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/flang/tools/flang-driver/driver.cpp b/flang/tools/flang-driver/driver.cpp index ad0efa3279cef..62dbfc72bb191 100644 --- a/flang/tools/flang-driver/driver.cpp +++ b/flang/tools/flang-driver/driver.cpp @@ -111,11 +111,11 @@ int main(int argc, const char **argv) { } } - llvm::StringSet<> SavedStrings; + llvm::StringSet<> savedStrings; // Handle FCC_OVERRIDE_OPTIONS, used for editing a command line behind the // scenes. - if (const char *OverrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) - clang::driver::applyOverrideOptions(args, OverrideStr, SavedStrings, + if (const char *overrideStr = ::getenv("FCC_OVERRIDE_OPTIONS")) + clang::driver::applyOverrideOptions(args, overrideStr, savedStrings, "FCC_OVERRIDE_OPTIONS", &llvm::errs()); // Not in the frontend mode - continue in the compiler driver mode. >From f41f8ffe3d537291fd870fb649f4d4b31361c262 Mon Sep 17 00:00:00 2001 From: Abid Qadeer Date: Fri, 30 May 2025 16:09:19 +0100 Subject: [PATCH 7/7] Replace -target with --target= to address review comments. 
--- flang/test/Driver/fcc_override.f90 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/test/Driver/fcc_override.f90 b/flang/test/Driver/fcc_override.f90 index 71def0847f150..2717d203c2ea3 100644 --- a/flang/test/Driver/fcc_override.f90 +++ b/flang/test/Driver/fcc_override.f90 @@ -1,4 +1,4 @@ -! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang --target=x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s ! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror +-g" %flang --target=x86_64-unknown-linux-gnu -Werror %s -c -### 2>&1 | FileCheck %s -check-prefix=RM-WERROR ! RUN: env FCC_OVERRIDE_OPTIONS="x-Werror" %flang --config=%S/Inputs/config-7.cfg -### %s -c 2>&1 | FileCheck %s -check-prefix=CONF From flang-commits at lists.llvm.org Fri May 30 08:09:53 2025 From: flang-commits at lists.llvm.org (Abid Qadeer via flang-commits) Date: Fri, 30 May 2025 08:09:53 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <6839ca41.050a0220.223f49.4a70@mx.google.com> ================ @@ -0,0 +1,17 @@ +! RUN: env FCC_OVERRIDE_OPTIONS="#+-Os +-Oz +-O +-O3 +-Oignore +a +b +c xb Xa Omagic ^-### " %flang -target x86_64-unknown-linux-gnu %s -O2 b -O3 2>&1 | FileCheck %s ---------------- abidh wrote: Indeed. Thanks for catching this. Fixed now. https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Fri May 30 08:27:41 2025 From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits) Date: Fri, 30 May 2025 08:27:41 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. 
(PR #140556) In-Reply-To: Message-ID: <6839ce6d.170a0220.2017c3.3e53@mx.google.com> https://github.com/tarunprabhu approved this pull request. Thanks Abid. LGTM. https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Fri May 30 08:50:46 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Fri, 30 May 2025 08:50:46 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6839d3d6.170a0220.1a967a.43b2@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `llvm-nvptx-nvidia-ubuntu` running on `as-builder-7` while building `clang,flang,llvm` at step 5 "build-unified-tree". Full details are available at: https://lab.llvm.org/buildbot/#/builders/180/builds/18512
Here is the relevant piece of the build log for the reference ``` Step 5 (build-unified-tree) failure: build (failure) ... In file included from /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/tools/llvm-link/llvm-link.cpp:36: /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h: In constructor ‘llvm::FunctionImporter::ImportListsTy::ImportListsTy()’: /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h:273:33: warning: member ‘llvm::FunctionImporter::ImportListsTy::ImportIDs’ is used uninitialized [-Wuninitialized] 273 | ImportListsTy() : EmptyList(ImportIDs) {} | ^~~~~~~~~ /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h: In constructor ‘llvm::FunctionImporter::ImportListsTy::ImportListsTy(size_t)’: /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h:274:44: warning: member ‘llvm::FunctionImporter::ImportListsTy::ImportIDs’ is used uninitialized [-Wuninitialized] 274 | ImportListsTy(size_t Size) : EmptyList(ImportIDs), ListsImpl(Size) {} | ^~~~~~~~~ 45.744 [637/11/2102] Linking CXX shared library lib/libLLVMFrontendDriver.so.21.0git FAILED: lib/libLLVMFrontendDriver.so.21.0git : && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -Wl,-z,defs -Wl,-z,nodelete 
-fuse-ld=gold -Wl,--gc-sections -shared -Wl,-soname,libLLVMFrontendDriver.so.21.0git -o lib/libLLVMFrontendDriver.so.21.0git lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o -Wl,-rpath,"\$ORIGIN/../lib:/home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/build/lib:" lib/libLLVMAnalysis.so.21.0git lib/libLLVMCore.so.21.0git lib/libLLVMSupport.so.21.0git -Wl,-rpath-link,/home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/build/lib && : lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o:CodeGenOptions.cpp:function llvm::driver::getDefaultProfileGenName[abi:cxx11]():(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x7): error: undefined reference to 'llvm::DebugInfoCorrelate' lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o:CodeGenOptions.cpp:function llvm::driver::getDefaultProfileGenName[abi:cxx11]():(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x1e): error: undefined reference to 'llvm::ProfileCorrelate' collect2: error: ld returned 1 exit status 45.949 [637/4/2109] Linking CXX shared library lib/libLLVMTransformUtils.so.21.0git 53.427 [637/2/2111] Building CXX object lib/LTO/CMakeFiles/LLVMLTO.dir/LTO.cpp.o In file included from /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/include/llvm/LTO/LTO.h:32, from /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/lib/LTO/LTO.cpp:13: /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h: In constructor ‘llvm::FunctionImporter::ImportListsTy::ImportListsTy()’: /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h:273:33: warning: member ‘llvm::FunctionImporter::ImportListsTy::ImportIDs’ is used uninitialized [-Wuninitialized] 273 | ImportListsTy() : EmptyList(ImportIDs) {} | ^~~~~~~~~ 
/home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h: In constructor ‘llvm::FunctionImporter::ImportListsTy::ImportListsTy(size_t)’: /home/buildbot/worker/as-builder-7/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h:274:44: warning: member ‘llvm::FunctionImporter::ImportListsTy::ImportIDs’ is used uninitialized [-Wuninitialized] 274 | ImportListsTy(size_t Size) : EmptyList(ImportIDs), ListsImpl(Size) {} | ^~~~~~~~~ 54.081 [637/1/2112] Building CXX object lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/AsmPrinter.cpp.o ninja: build stopped: subcommand failed. ```
https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Fri May 30 08:52:25 2025 From: flang-commits at lists.llvm.org (LLVM Continuous Integration via flang-commits) Date: Fri, 30 May 2025 08:52:25 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <6839d439.050a0220.3c7a0b.6609@mx.google.com> llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `llvm-nvptx64-nvidia-ubuntu` running on `as-builder-7` while building `clang,flang,llvm` at step 5 "build-unified-tree". Full details are available at: https://lab.llvm.org/buildbot/#/builders/160/builds/18369
Here is the relevant piece of the build log for the reference ``` Step 5 (build-unified-tree) failure: build (failure) ... In file included from /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/tools/llvm-link/llvm-link.cpp:36: /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h: In constructor ‘llvm::FunctionImporter::ImportListsTy::ImportListsTy()’: /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h:273:33: warning: member ‘llvm::FunctionImporter::ImportListsTy::ImportIDs’ is used uninitialized [-Wuninitialized] 273 | ImportListsTy() : EmptyList(ImportIDs) {} | ^~~~~~~~~ /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h: In constructor ‘llvm::FunctionImporter::ImportListsTy::ImportListsTy(size_t)’: /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h:274:44: warning: member ‘llvm::FunctionImporter::ImportListsTy::ImportIDs’ is used uninitialized [-Wuninitialized] 274 | ImportListsTy(size_t Size) : EmptyList(ImportIDs), ListsImpl(Size) {} | ^~~~~~~~~ 45.902 [639/11/2100] Linking CXX shared library lib/libLLVMFrontendDriver.so.21.0git FAILED: lib/libLLVMFrontendDriver.so.21.0git : && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -Wl,-z,defs -Wl,-z,nodelete 
-fuse-ld=gold -Wl,--gc-sections -shared -Wl,-soname,libLLVMFrontendDriver.so.21.0git -o lib/libLLVMFrontendDriver.so.21.0git lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o -Wl,-rpath,"\$ORIGIN/../lib:/home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/build/lib:" lib/libLLVMAnalysis.so.21.0git lib/libLLVMCore.so.21.0git lib/libLLVMSupport.so.21.0git -Wl,-rpath-link,/home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/build/lib && : lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o:CodeGenOptions.cpp:function llvm::driver::getDefaultProfileGenName[abi:cxx11]():(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x7): error: undefined reference to 'llvm::DebugInfoCorrelate' lib/Frontend/Driver/CMakeFiles/LLVMFrontendDriver.dir/CodeGenOptions.cpp.o:CodeGenOptions.cpp:function llvm::driver::getDefaultProfileGenName[abi:cxx11]():(.text._ZN4llvm6driver24getDefaultProfileGenNameB5cxx11Ev+0x1e): error: undefined reference to 'llvm::ProfileCorrelate' collect2: error: ld returned 1 exit status 46.104 [639/4/2107] Linking CXX shared library lib/libLLVMTransformUtils.so.21.0git 50.893 [639/2/2109] Building CXX object lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/AsmPrinter.cpp.o 53.468 [639/1/2110] Building CXX object lib/LTO/CMakeFiles/LLVMLTO.dir/LTO.cpp.o In file included from /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/include/llvm/LTO/LTO.h:32, from /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/lib/LTO/LTO.cpp:13: /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h: In constructor ‘llvm::FunctionImporter::ImportListsTy::ImportListsTy()’: /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h:273:33: warning: member ‘llvm::FunctionImporter::ImportListsTy::ImportIDs’ is used 
uninitialized [-Wuninitialized] 273 | ImportListsTy() : EmptyList(ImportIDs) {} | ^~~~~~~~~ /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h: In constructor ‘llvm::FunctionImporter::ImportListsTy::ImportListsTy(size_t)’: /home/buildbot/worker/as-builder-7/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/include/llvm/Transforms/IPO/FunctionImport.h:274:44: warning: member ‘llvm::FunctionImporter::ImportListsTy::ImportIDs’ is used uninitialized [-Wuninitialized] 274 | ImportListsTy(size_t Size) : EmptyList(ImportIDs), ListsImpl(Size) {} | ^~~~~~~~~ ninja: build stopped: subcommand failed. ```
https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Fri May 30 08:53:16 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Fri, 30 May 2025 08:53:16 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mrecip[=] (PR #142172) Message-ID: https://github.com/mcinally created https://github.com/llvm/llvm-project/pull/142172 This patch adds support for the -mrecip command line option. The parsing of this option is equivalent to Clang's and it is implemented by setting the "reciprocal-estimates" function attribute. From e7c60252b02109d3f0e05cef7ecc7fbc4f1de197 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Fri, 30 May 2025 08:31:31 -0700 Subject: [PATCH] [flang] Add support for -mrecip[=] This patch adds support for the -mrecip command line option. The parsing of this option is equivalent to Clang's and it is implemented by setting the "reciprocal-estimates" function attribute. --- clang/include/clang/Driver/Options.td | 3 +- flang/include/flang/Frontend/CodeGenOptions.h | 3 + .../flang/Optimizer/Transforms/Passes.td | 3 + flang/include/flang/Tools/CrossToolHelpers.h | 3 + flang/lib/Frontend/CompilerInvocation.cpp | 159 ++++++++++++++++++ flang/lib/Frontend/FrontendActions.cpp | 2 + flang/lib/Optimizer/Passes/Pipelines.cpp | 3 +- .../lib/Optimizer/Transforms/FunctionAttr.cpp | 4 + flang/test/Driver/mrecip.f90 | 27 +++ mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td | 1 + mlir/lib/Target/LLVMIR/ModuleImport.cpp | 5 + mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | 3 + mlir/test/Target/LLVMIR/Import/mrecip.ll | 9 + mlir/test/Target/LLVMIR/mrecip.mlir | 8 + 14 files changed, 231 insertions(+), 2 deletions(-) create mode 100644 flang/test/Driver/mrecip.f90 create mode 100644 mlir/test/Target/LLVMIR/Import/mrecip.ll create mode 100644 mlir/test/Target/LLVMIR/mrecip.mlir diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 
5ca31c253ed8f..291e9ae223805 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -5472,9 +5472,10 @@ def mno_implicit_float : Flag<["-"], "mno-implicit-float">, Group, HelpText<"Don't generate implicit floating point or vector instructions">; def mimplicit_float : Flag<["-"], "mimplicit-float">, Group; def mrecip : Flag<["-"], "mrecip">, Group, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Equivalent to '-mrecip=all'">; def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Control use of approximate reciprocal and reciprocal square root instructions followed by iterations of " "Newton-Raphson refinement. " " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 61e56e51c4bbb..5c1f7ef52ca14 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -56,6 +56,9 @@ class CodeGenOptions : public CodeGenOptionsBase { // The prefered vector width, if requested by -mprefer-vector-width. std::string PreferVectorWidth; + // List of reciprocal estimate sub-options. + std::string Reciprocals; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. 
std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index 1b1970412676d..34842f9785942 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -429,6 +429,9 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"reciprocals", "mrecip", "std::string", /*default=*/"", + "Set the reciprocal-estimates attribute on functions in the " + "module.">, Option<"preferVectorWidth", "prefer-vector-width", "std::string", /*default=*/"", "Set the prefer-vector-width attribute on functions in the " diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 058024a4a04c5..337685c82af5f 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + Reciprocals = opts.Reciprocals; PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; @@ -127,6 +128,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string Reciprocals = ""; ///< Set reciprocal-estimate attribute for + ///< functions. std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. 
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 90a002929eff0..957f2041ffd8d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -1251,6 +1251,164 @@ static bool parseIntegerOverflowArgs(CompilerInvocation &invoc, return true; } +/// This is a helper function for validating the optional refinement step +/// parameter in reciprocal argument strings. Return false if there is an error +/// parsing the refinement step. Otherwise, return true and set the Position +/// of the refinement step in the input string. +/// +/// \param [in] in The input string +/// \param [in] a The compiler invocation arguments to parse +/// \param [out] position The position of the refinement step in input string +/// \param [out] diags DiagnosticsEngine to report erros with +static bool getRefinementStep(llvm::StringRef in, const llvm::opt::Arg &a, + size_t &position, + clang::DiagnosticsEngine &diags) { + const char refinementStepToken = ':'; + position = in.find(refinementStepToken); + if (position != llvm::StringRef::npos) { + llvm::StringRef option = a.getOption().getName(); + llvm::StringRef refStep = in.substr(position + 1); + // Allow exactly one numeric character for the additional refinement + // step parameter. This is reasonable for all currently-supported + // operations and architectures because we would expect that a larger value + // of refinement steps would cause the estimate "optimization" to + // under-perform the native operation. Also, if the estimate does not + // converge quickly, it probably will not ever converge, so further + // refinement steps will not produce a better answer. 
+ if (refStep.size() != 1) { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + char refStepChar = refStep[0]; + if (refStepChar < '0' || refStepChar > '9') { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + } + return true; +} + +/// Parses all -mrecip= arguments and populates the +/// CompilerInvocation accordingly. Returns false if new errors are generated. +/// +/// \param [out] invoc Stores the processed arguments +/// \param [in] args The compiler invocation arguments to parse +/// \param [out] diags DiagnosticsEngine to report erros with +static bool parseMRecip(CompilerInvocation &invoc, llvm::opt::ArgList &args, + clang::DiagnosticsEngine &diags) { + llvm::StringRef disabledPrefixIn = "!"; + llvm::StringRef disabledPrefixOut = "!"; + llvm::StringRef enabledPrefixOut = ""; + llvm::StringRef out = ""; + Fortran::frontend::CodeGenOptions &opts = invoc.getCodeGenOpts(); + + const llvm::opt::Arg *a = + args.getLastArg(clang::driver::options::OPT_mrecip, + clang::driver::options::OPT_mrecip_EQ); + if (!a) + return true; + + unsigned numOptions = a->getNumValues(); + if (numOptions == 0) { + // No option is the same as "all". + opts.Reciprocals = "all"; + return true; + } + + // Pass through "all", "none", or "default" with an optional refinement step. + if (numOptions == 1) { + llvm::StringRef val = a->getValue(0); + size_t refStepLoc; + if (!getRefinementStep(val, *a, refStepLoc, diags)) + return false; + llvm::StringRef valBase = val.slice(0, refStepLoc); + if (valBase == "all" || valBase == "none" || valBase == "default") { + opts.Reciprocals = args.MakeArgString(val); + return true; + } + } + + // Each reciprocal type may be enabled or disabled individually. + // Check each input value for validity, concatenate them all back together, + // and pass through. 
+ + llvm::StringMap optionStrings; + optionStrings.insert(std::make_pair("divd", false)); + optionStrings.insert(std::make_pair("divf", false)); + optionStrings.insert(std::make_pair("divh", false)); + optionStrings.insert(std::make_pair("vec-divd", false)); + optionStrings.insert(std::make_pair("vec-divf", false)); + optionStrings.insert(std::make_pair("vec-divh", false)); + optionStrings.insert(std::make_pair("sqrtd", false)); + optionStrings.insert(std::make_pair("sqrtf", false)); + optionStrings.insert(std::make_pair("sqrth", false)); + optionStrings.insert(std::make_pair("vec-sqrtd", false)); + optionStrings.insert(std::make_pair("vec-sqrtf", false)); + optionStrings.insert(std::make_pair("vec-sqrth", false)); + + for (unsigned i = 0; i != numOptions; ++i) { + llvm::StringRef val = a->getValue(i); + + bool isDisabled = val.starts_with(disabledPrefixIn); + // Ignore the disablement token for string matching. + if (isDisabled) + val = val.substr(1); + + size_t refStep; + if (!getRefinementStep(val, *a, refStep, diags)) + return false; + + llvm::StringRef valBase = val.slice(0, refStep); + llvm::StringMap::iterator optionIter = optionStrings.find(valBase); + if (optionIter == optionStrings.end()) { + // Try again specifying float suffix. + optionIter = optionStrings.find(valBase.str() + 'f'); + if (optionIter == optionStrings.end()) { + // The input name did not match any known option string. + diags.Report(clang::diag::err_drv_invalid_value) + << a->getOption().getName() << val; + return false; + } + // The option was specified without a half or float or double suffix. + // Make sure that the double or half entry was not already specified. + // The float entry will be checked below. + if (optionStrings[valBase.str() + 'd'] || + optionStrings[valBase.str() + 'h']) { + diags.Report(clang::diag::err_drv_invalid_value) + << a->getOption().getName() << val; + return false; + } + } + + if (optionIter->second == true) { + // Duplicate option specified. 
+ diags.Report(clang::diag::err_drv_invalid_value) + << a->getOption().getName() << val; + return false; + } + + // Mark the matched option as found. Do not allow duplicate specifiers. + optionIter->second = true; + + // If the precision was not specified, also mark the double and half entry + // as found. + if (valBase.back() != 'f' && valBase.back() != 'd' && + valBase.back() != 'h') { + optionStrings[valBase.str() + 'd'] = true; + optionStrings[valBase.str() + 'h'] = true; + } + + // Build the output string. + llvm::StringRef prefix = isDisabled ? disabledPrefixOut : enabledPrefixOut; + out = args.MakeArgString(out + prefix + val); + if (i != numOptions - 1) + out = args.MakeArgString(out + ","); + } + + opts.Reciprocals = args.MakeArgString(out); // Handle the rest. + return true; +} + /// Parses all floating point related arguments and populates the /// CompilerInvocation accordingly. /// Returns false if new errors are generated. @@ -1398,6 +1556,7 @@ static bool parseLangOptionsArgs(CompilerInvocation &invoc, success &= parseIntegerOverflowArgs(invoc, args, diags); success &= parseFloatingPointArgs(invoc, args, diags); + success &= parseMRecip(invoc, args, diags); success &= parseVScaleArgs(invoc, args, diags); return success; diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 012d0fdfe645f..31803b27b8ceb 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -743,6 +743,8 @@ void CodeGenAction::generateLLVMIR() { config.PreferVectorWidth = opts.PreferVectorWidth; + config.Reciprocals = opts.Reciprocals; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 0c774eede4c9a..06eaa18fe7284 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ 
-358,7 +358,8 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - config.PreferVectorWidth, /*tuneCPU=*/"", setNoCapture, setNoAlias})); + config.Reciprocals, config.PreferVectorWidth, /*tuneCPU=*/"", + setNoCapture, setNoAlias})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 041aa8717d20e..5ac4ed8a93b6b 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -107,6 +107,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!reciprocals.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getReciprocalEstimatesAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, reciprocals)); if (!preferVectorWidth.empty()) func->setAttr( mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), diff --git a/flang/test/Driver/mrecip.f90 b/flang/test/Driver/mrecip.f90 new file mode 100644 index 0000000000000..9ec65c0ef4dde --- /dev/null +++ b/flang/test/Driver/mrecip.f90 @@ -0,0 +1,27 @@ +! Test that -mrecip[=] works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-OMIT +! RUN: %flang_fc1 -mrecip -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NOARG +! RUN: %flang_fc1 -mrecip=all -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-ALL +! RUN: %flang_fc1 -mrecip=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! RUN: %flang_fc1 -mrecip=default -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! 
RUN: %flang_fc1 -mrecip=divd,divf,divh,vec-divd,vec-divf,vec-divh,sqrtd,sqrtf,sqrth,vec-sqrtd,vec-sqrtf,vec-sqrth -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-POS +! RUN: %flang_fc1 -mrecip=!divd,!divf,!divh,!vec-divd,!vec-divf,!vec-divh,!sqrtd,!sqrtf,!sqrth,!vec-sqrtd,!vec-sqrtf,!vec-sqrth +! -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NEG +! RUN: %flang_fc1 -mrecip=!divd,divf,!divh,sqrtd,!sqrtf,sqrth -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-MIX +! RUN: not %flang_fc1 -mrecip=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INV +! RUN: not %flang_fc1 -mrecip=divd,divd -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DUP + +subroutine func +end subroutine func + +! CHECK-OMIT-NOT: attributes #0 = { "reciprocal-estimates"={{.*}} } +! CHECK-NOARG: attributes #0 = { "reciprocal-estimates"="all" } +! CHECK-ALL: attributes #0 = { "reciprocal-estimates"="all" } +! CHECK-NONE: attributes #0 = { "reciprocal-estimates"="none" } +! CHECK-DEF: attributes #0 = { "reciprocal-estimates"="default" } +! CHECK-POS: attributes #0 = { "reciprocal-estimates"="divd,divf,divh,vec-divd,vec-divf,vec-divh,sqrtd,sqrtf,sqrth,vec-sqrtd,vec-sqrtf,vec-sqrth" } +! CHECK-NEG: attributes #0 = { "reciprocal-estimates"="!divd,!divf,!divh,!vec-divd,!vec-divf,!vec-divh,!sqrtd,!sqrtf,!sqrth,!vec-sqrtd,!vec-sqrtf,!vec-sqrth" } +! CHECK-MIX: attributes #0 = { "reciprocal-estimates"="!divd,divf,!divh,sqrtd,!sqrtf,sqrth" } +! CHECK-INV: error: invalid value 'xxx' in 'mrecip=' +! 
CHECK-DUP: error: invalid value 'divd' in 'mrecip=' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index c0324d561b77b..e5d35fa82fbce 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1895,6 +1895,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$tune_cpu, OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, + OptionalAttr:$reciprocal_estimates, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, OptionalAttr:$no_nans_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index 85417da798b22..1e6e60a78b37d 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2640,6 +2640,11 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, attr.isStringAttribute()) funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("reciprocal-estimates"); + attr.isStringAttribute()) + funcOp.setReciprocalEstimatesAttr( + StringAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index 4cc419c7cde5b..bc66625bf374e 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1557,6 +1557,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) { getLLVMContext(), attr->getMinRange().getInt(), attr->getMaxRange().getInt())); + if (auto reciprocalEstimates = func.getReciprocalEstimates()) + llvmFunc->addFnAttr("reciprocal-estimates", *reciprocalEstimates); + if (auto unsafeFpMath = func.getUnsafeFpMath()) llvmFunc->addFnAttr("unsafe-fp-math", llvm::toStringRef(*unsafeFpMath)); diff 
--git a/mlir/test/Target/LLVMIR/Import/mrecip.ll b/mlir/test/Target/LLVMIR/Import/mrecip.ll new file mode 100644 index 0000000000000..01814cce70059 --- /dev/null +++ b/mlir/test/Target/LLVMIR/Import/mrecip.ll @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @reciprocal_estimates() +; CHECK-SAME: reciprocal_estimates = "all" +define void @reciprocal_estimates() #0 { + ret void +} + +attributes #0 = { "reciprocal-estimates" = "all" } diff --git a/mlir/test/Target/LLVMIR/mrecip.mlir b/mlir/test/Target/LLVMIR/mrecip.mlir new file mode 100644 index 0000000000000..e0bc66c272f6a --- /dev/null +++ b/mlir/test/Target/LLVMIR/mrecip.mlir @@ -0,0 +1,8 @@ +// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s + +// CHECK: define void @reciprocal_estimates() #[[ATTRS:.*]] { +// CHECK: attributes #[[ATTRS]] = { "reciprocal-estimates"="all" } + +llvm.func @reciprocal_estimates() attributes {reciprocal_estimates = "all"} { + llvm.return +} From flang-commits at lists.llvm.org Fri May 30 08:53:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 08:53:50 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mrecip[=] (PR #142172) In-Reply-To: Message-ID: <6839d48e.a70a0220.3ab12c.5ba3@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-clang Author: Cameron McInally (mcinally)
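For readers following the -mrecip patch quoted above (PR #142172), the value-handling rules of `parseMRecip` and `getRefinementStep` can be sketched in isolation roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: it uses the C++17 standard library instead of `llvm::StringRef`/`llvm::StringMap`, the names `findRefinementStep` and `normalizeMRecip` are hypothetical, the comma-separated values are assumed to be already split into a vector, and `std::nullopt` stands in for the places where the patch reports `err_drv_invalid_value`.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not from the patch): locate an optional ":<digit>"
// refinement-step suffix. Returns false if the suffix is malformed; otherwise
// sets `pos` to the index of ':' (std::string::npos when there is none).
static bool findRefinementStep(const std::string &in, std::size_t &pos) {
  pos = in.find(':');
  if (pos == std::string::npos)
    return true;
  const std::string step = in.substr(pos + 1);
  // Exactly one decimal digit is accepted, mirroring the patch.
  return step.size() == 1 && step[0] >= '0' && step[0] <= '9';
}

// Hypothetical stand-in for parseMRecip: returns the string that would be
// stored in the "reciprocal-estimates" attribute, or std::nullopt on error.
static std::optional<std::string>
normalizeMRecip(const std::vector<std::string> &values) {
  if (values.empty())
    return std::string("all"); // bare -mrecip behaves like -mrecip=all

  // A single "all", "none" or "default" (with optional step) passes through.
  if (values.size() == 1) {
    std::size_t pos = 0;
    if (!findRefinementStep(values[0], pos))
      return std::nullopt;
    const std::string base = values[0].substr(0, pos);
    if (base == "all" || base == "none" || base == "default")
      return values[0];
  }

  // Known sub-options, each initially "not yet specified".
  std::map<std::string, bool> seen;
  for (const char *op : {"div", "vec-div", "sqrt", "vec-sqrt"})
    for (char suffix : {'d', 'f', 'h'})
      seen[std::string(op) + suffix] = false;

  std::string out;
  for (std::size_t i = 0; i != values.size(); ++i) {
    std::string val = values[i];
    const bool disabled = !val.empty() && val[0] == '!';
    if (disabled)
      val.erase(0, 1); // ignore the '!' token while matching

    std::size_t pos = 0;
    if (!findRefinementStep(val, pos))
      return std::nullopt;
    const std::string base = val.substr(0, pos);

    auto it = seen.find(base);
    const bool noSuffix = (it == seen.end());
    if (noSuffix) {
      // No precision suffix: match the float entry and require that the
      // double and half entries were not already given.
      it = seen.find(base + 'f');
      if (it == seen.end())
        return std::nullopt; // unknown sub-option
      if (seen[base + 'd'] || seen[base + 'h'])
        return std::nullopt;
    }
    if (it->second)
      return std::nullopt; // duplicate sub-option
    it->second = true;
    if (noSuffix) {
      seen[base + 'd'] = true;
      seen[base + 'h'] = true;
    }

    // Rebuild the comma-separated output, keeping any ":<digit>" suffix.
    if (disabled)
      out += '!';
    out += val;
    if (i + 1 != values.size())
      out += ',';
  }
  return out;
}
```

Under these assumptions, the inputs exercised by `flang/test/Driver/mrecip.f90` behave as that test expects: an empty value list yields `all`, the mixed list `!divd,divf,!divh,sqrtd,!sqrtf,sqrth` is passed through unchanged, and `xxx` or a repeated `divd` is rejected.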
Changes

This patch adds support for the -mrecip command line option. The parsing of this option is equivalent to Clang's and it is implemented by setting the "reciprocal-estimates" function attribute.

---
Full diff: https://github.com/llvm/llvm-project/pull/142172.diff

14 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+2-1)
- (modified) flang/include/flang/Frontend/CodeGenOptions.h (+3)
- (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+3)
- (modified) flang/include/flang/Tools/CrossToolHelpers.h (+3)
- (modified) flang/lib/Frontend/CompilerInvocation.cpp (+159)
- (modified) flang/lib/Frontend/FrontendActions.cpp (+2)
- (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+2-1)
- (modified) flang/lib/Optimizer/Transforms/FunctionAttr.cpp (+4)
- (added) flang/test/Driver/mrecip.f90 (+27)
- (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td (+1)
- (modified) mlir/lib/Target/LLVMIR/ModuleImport.cpp (+5)
- (modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+3)
- (added) mlir/test/Target/LLVMIR/Import/mrecip.ll (+9)
- (added) mlir/test/Target/LLVMIR/mrecip.mlir (+8)

``````````diff
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 5ca31c253ed8f..291e9ae223805 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -5472,9 +5472,10 @@ def mno_implicit_float : Flag<["-"], "mno-implicit-float">, Group, HelpText<"Don't generate implicit floating point or vector instructions">; def mimplicit_float : Flag<["-"], "mimplicit-float">, Group; def mrecip : Flag<["-"], "mrecip">, Group, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Equivalent to '-mrecip=all'">; def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Control use of approximate reciprocal and reciprocal square root instructions followed by 
iterations of " "Newton-Raphson refinement. " " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 61e56e51c4bbb..5c1f7ef52ca14 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -56,6 +56,9 @@ class CodeGenOptions : public CodeGenOptionsBase { // The prefered vector width, if requested by -mprefer-vector-width. std::string PreferVectorWidth; + // List of reciprocal estimate sub-options. + std::string Reciprocals; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index 1b1970412676d..34842f9785942 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -429,6 +429,9 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"reciprocals", "mrecip", "std::string", /*default=*/"", + "Set the reciprocal-estimates attribute on functions in the " + "module.">, Option<"preferVectorWidth", "prefer-vector-width", "std::string", /*default=*/"", "Set the prefer-vector-width attribute on functions in the " diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 058024a4a04c5..337685c82af5f 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && 
mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + Reciprocals = opts.Reciprocals; PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; @@ -127,6 +128,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string Reciprocals = ""; ///< Set reciprocal-estimate attribute for + ///< functions. std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 90a002929eff0..957f2041ffd8d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -1251,6 +1251,164 @@ static bool parseIntegerOverflowArgs(CompilerInvocation &invoc, return true; } +/// This is a helper function for validating the optional refinement step +/// parameter in reciprocal argument strings. Return false if there is an error +/// parsing the refinement step. Otherwise, return true and set the Position +/// of the refinement step in the input string. 
+/// +/// \param [in] in The input string +/// \param [in] a The compiler invocation arguments to parse +/// \param [out] position The position of the refinement step in input string +/// \param [out] diags DiagnosticsEngine to report erros with +static bool getRefinementStep(llvm::StringRef in, const llvm::opt::Arg &a, + size_t &position, + clang::DiagnosticsEngine &diags) { + const char refinementStepToken = ':'; + position = in.find(refinementStepToken); + if (position != llvm::StringRef::npos) { + llvm::StringRef option = a.getOption().getName(); + llvm::StringRef refStep = in.substr(position + 1); + // Allow exactly one numeric character for the additional refinement + // step parameter. This is reasonable for all currently-supported + // operations and architectures because we would expect that a larger value + // of refinement steps would cause the estimate "optimization" to + // under-perform the native operation. Also, if the estimate does not + // converge quickly, it probably will not ever converge, so further + // refinement steps will not produce a better answer. + if (refStep.size() != 1) { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + char refStepChar = refStep[0]; + if (refStepChar < '0' || refStepChar > '9') { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + } + return true; +} + +/// Parses all -mrecip= arguments and populates the +/// CompilerInvocation accordingly. Returns false if new errors are generated. 
+/// +/// \param [out] invoc Stores the processed arguments +/// \param [in] args The compiler invocation arguments to parse +/// \param [out] diags DiagnosticsEngine to report erros with +static bool parseMRecip(CompilerInvocation &invoc, llvm::opt::ArgList &args, + clang::DiagnosticsEngine &diags) { + llvm::StringRef disabledPrefixIn = "!"; + llvm::StringRef disabledPrefixOut = "!"; + llvm::StringRef enabledPrefixOut = ""; + llvm::StringRef out = ""; + Fortran::frontend::CodeGenOptions &opts = invoc.getCodeGenOpts(); + + const llvm::opt::Arg *a = + args.getLastArg(clang::driver::options::OPT_mrecip, + clang::driver::options::OPT_mrecip_EQ); + if (!a) + return true; + + unsigned numOptions = a->getNumValues(); + if (numOptions == 0) { + // No option is the same as "all". + opts.Reciprocals = "all"; + return true; + } + + // Pass through "all", "none", or "default" with an optional refinement step. + if (numOptions == 1) { + llvm::StringRef val = a->getValue(0); + size_t refStepLoc; + if (!getRefinementStep(val, *a, refStepLoc, diags)) + return false; + llvm::StringRef valBase = val.slice(0, refStepLoc); + if (valBase == "all" || valBase == "none" || valBase == "default") { + opts.Reciprocals = args.MakeArgString(val); + return true; + } + } + + // Each reciprocal type may be enabled or disabled individually. + // Check each input value for validity, concatenate them all back together, + // and pass through. 
+ + llvm::StringMap optionStrings; + optionStrings.insert(std::make_pair("divd", false)); + optionStrings.insert(std::make_pair("divf", false)); + optionStrings.insert(std::make_pair("divh", false)); + optionStrings.insert(std::make_pair("vec-divd", false)); + optionStrings.insert(std::make_pair("vec-divf", false)); + optionStrings.insert(std::make_pair("vec-divh", false)); + optionStrings.insert(std::make_pair("sqrtd", false)); + optionStrings.insert(std::make_pair("sqrtf", false)); + optionStrings.insert(std::make_pair("sqrth", false)); + optionStrings.insert(std::make_pair("vec-sqrtd", false)); + optionStrings.insert(std::make_pair("vec-sqrtf", false)); + optionStrings.insert(std::make_pair("vec-sqrth", false)); + + for (unsigned i = 0; i != numOptions; ++i) { + llvm::StringRef val = a->getValue(i); + + bool isDisabled = val.starts_with(disabledPrefixIn); + // Ignore the disablement token for string matching. + if (isDisabled) + val = val.substr(1); + + size_t refStep; + if (!getRefinementStep(val, *a, refStep, diags)) + return false; + + llvm::StringRef valBase = val.slice(0, refStep); + llvm::StringMap::iterator optionIter = optionStrings.find(valBase); + if (optionIter == optionStrings.end()) { + // Try again specifying float suffix. + optionIter = optionStrings.find(valBase.str() + 'f'); + if (optionIter == optionStrings.end()) { + // The input name did not match any known option string. + diags.Report(clang::diag::err_drv_invalid_value) + << a->getOption().getName() << val; + return false; + } + // The option was specified without a half or float or double suffix. + // Make sure that the double or half entry was not already specified. + // The float entry will be checked below. + if (optionStrings[valBase.str() + 'd'] || + optionStrings[valBase.str() + 'h']) { + diags.Report(clang::diag::err_drv_invalid_value) + << a->getOption().getName() << val; + return false; + } + } + + if (optionIter->second == true) { + // Duplicate option specified. 
+ diags.Report(clang::diag::err_drv_invalid_value) + << a->getOption().getName() << val; + return false; + } + + // Mark the matched option as found. Do not allow duplicate specifiers. + optionIter->second = true; + + // If the precision was not specified, also mark the double and half entry + // as found. + if (valBase.back() != 'f' && valBase.back() != 'd' && + valBase.back() != 'h') { + optionStrings[valBase.str() + 'd'] = true; + optionStrings[valBase.str() + 'h'] = true; + } + + // Build the output string. + llvm::StringRef prefix = isDisabled ? disabledPrefixOut : enabledPrefixOut; + out = args.MakeArgString(out + prefix + val); + if (i != numOptions - 1) + out = args.MakeArgString(out + ","); + } + + opts.Reciprocals = args.MakeArgString(out); // Handle the rest. + return true; +} + /// Parses all floating point related arguments and populates the /// CompilerInvocation accordingly. /// Returns false if new errors are generated. @@ -1398,6 +1556,7 @@ static bool parseLangOptionsArgs(CompilerInvocation &invoc, success &= parseIntegerOverflowArgs(invoc, args, diags); success &= parseFloatingPointArgs(invoc, args, diags); + success &= parseMRecip(invoc, args, diags); success &= parseVScaleArgs(invoc, args, diags); return success; diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 012d0fdfe645f..31803b27b8ceb 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -743,6 +743,8 @@ void CodeGenAction::generateLLVMIR() { config.PreferVectorWidth = opts.PreferVectorWidth; + config.Reciprocals = opts.Reciprocals; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 0c774eede4c9a..06eaa18fe7284 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ 
-358,7 +358,8 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - config.PreferVectorWidth, /*tuneCPU=*/"", setNoCapture, setNoAlias})); + config.Reciprocals, config.PreferVectorWidth, /*tuneCPU=*/"", + setNoCapture, setNoAlias})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 041aa8717d20e..5ac4ed8a93b6b 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -107,6 +107,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!reciprocals.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getReciprocalEstimatesAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, reciprocals)); if (!preferVectorWidth.empty()) func->setAttr( mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), diff --git a/flang/test/Driver/mrecip.f90 b/flang/test/Driver/mrecip.f90 new file mode 100644 index 0000000000000..9ec65c0ef4dde --- /dev/null +++ b/flang/test/Driver/mrecip.f90 @@ -0,0 +1,27 @@ +! Test that -mrecip[=] works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-OMIT +! RUN: %flang_fc1 -mrecip -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NOARG +! RUN: %flang_fc1 -mrecip=all -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-ALL +! RUN: %flang_fc1 -mrecip=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! RUN: %flang_fc1 -mrecip=default -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! 
RUN: %flang_fc1 -mrecip=divd,divf,divh,vec-divd,vec-divf,vec-divh,sqrtd,sqrtf,sqrth,vec-sqrtd,vec-sqrtf,vec-sqrth -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-POS +! RUN: %flang_fc1 -mrecip=!divd,!divf,!divh,!vec-divd,!vec-divf,!vec-divh,!sqrtd,!sqrtf,!sqrth,!vec-sqrtd,!vec-sqrtf,!vec-sqrth +! -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NEG +! RUN: %flang_fc1 -mrecip=!divd,divf,!divh,sqrtd,!sqrtf,sqrth -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-MIX +! RUN: not %flang_fc1 -mrecip=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INV +! RUN: not %flang_fc1 -mrecip=divd,divd -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DUP + +subroutine func +end subroutine func + +! CHECK-OMIT-NOT: attributes #0 = { "reciprocal-estimates"={{.*}} } +! CHECK-NOARG: attributes #0 = { "reciprocal-estimates"="all" } +! CHECK-ALL: attributes #0 = { "reciprocal-estimates"="all" } +! CHECK-NONE: attributes #0 = { "reciprocal-estimates"="none" } +! CHECK-DEF: attributes #0 = { "reciprocal-estimates"="default" } +! CHECK-POS: attributes #0 = { "reciprocal-estimates"="divd,divf,divh,vec-divd,vec-divf,vec-divh,sqrtd,sqrtf,sqrth,vec-sqrtd,vec-sqrtf,vec-sqrth" } +! CHECK-NEG: attributes #0 = { "reciprocal-estimates"="!divd,!divf,!divh,!vec-divd,!vec-divf,!vec-divh,!sqrtd,!sqrtf,!sqrth,!vec-sqrtd,!vec-sqrtf,!vec-sqrth" } +! CHECK-MIX: attributes #0 = { "reciprocal-estimates"="!divd,divf,!divh,sqrtd,!sqrtf,sqrth" } +! CHECK-INV: error: invalid value 'xxx' in 'mrecip=' +! 
CHECK-DUP: error: invalid value 'divd' in 'mrecip=' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index c0324d561b77b..e5d35fa82fbce 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1895,6 +1895,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$tune_cpu, OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, + OptionalAttr:$reciprocal_estimates, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, OptionalAttr:$no_nans_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index 85417da798b22..1e6e60a78b37d 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2640,6 +2640,11 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, attr.isStringAttribute()) funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("reciprocal-estimates"); + attr.isStringAttribute()) + funcOp.setReciprocalEstimatesAttr( + StringAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index 4cc419c7cde5b..bc66625bf374e 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1557,6 +1557,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) { getLLVMContext(), attr->getMinRange().getInt(), attr->getMaxRange().getInt())); + if (auto reciprocalEstimates = func.getReciprocalEstimates()) + llvmFunc->addFnAttr("reciprocal-estimates", *reciprocalEstimates); + if (auto unsafeFpMath = func.getUnsafeFpMath()) llvmFunc->addFnAttr("unsafe-fp-math", llvm::toStringRef(*unsafeFpMath)); diff 
--git a/mlir/test/Target/LLVMIR/Import/mrecip.ll b/mlir/test/Target/LLVMIR/Import/mrecip.ll new file mode 100644 index 0000000000000..01814cce70059 --- /dev/null +++ b/mlir/test/Target/LLVMIR/Import/mrecip.ll @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @reciprocal_estimates() +; CHECK-SAME: reciprocal_estimates = "all" +define void @reciprocal_estimates() #0 { + ret void +} + +attributes #0 = { "reciprocal-estimates" = "all" } diff --git a/mlir/test/Target/LLVMIR/mrecip.mlir b/mlir/test/Target/LLVMIR/mrecip.mlir new file mode 100644 index 0000000000000..e0bc66c272f6a --- /dev/null +++ b/mlir/test/Target/LLVMIR/mrecip.mlir @@ -0,0 +1,8 @@ +// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s + +// CHECK: define void @reciprocal_estimates() #[[ATTRS:.*]] { +// CHECK: attributes #[[ATTRS]] = { "reciprocal-estimates"="all" } + +llvm.func @reciprocal_estimates() attributes {reciprocal_estimates = "all"} { + llvm.return +} ``````````
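For readers skimming the diff, the validation rules in parseMRecip can be sketched without any LLVM dependencies. This is only an illustration of the logic as it reads in the patch, not the code that was committed: the names `normalizeRecip` and `splitRefinementStep` are hypothetical, `std::map` stands in for `llvm::StringMap`, and an empty optional stands in for the `err_drv_invalid_value` diagnostic.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: extracts the name in front of an optional ":N" suffix.
// Returns false when ':' is present but not followed by exactly one digit.
static bool splitRefinementStep(const std::string &in, std::string &base) {
  std::size_t pos = in.find(':');
  base = in.substr(0, pos); // pos == npos keeps the whole string
  if (pos == std::string::npos)
    return true;
  std::string step = in.substr(pos + 1);
  return step.size() == 1 && step[0] >= '0' && step[0] <= '9';
}

// Hypothetical stand-in for parseMRecip: takes the comma-separated values of
// -mrecip= (empty for plain -mrecip) and returns the attribute string, or
// std::nullopt where the patch would report an invalid value.
std::optional<std::string>
normalizeRecip(const std::vector<std::string> &values) {
  if (values.empty())
    return std::string("all"); // plain -mrecip means "all"

  std::string base;
  if (values.size() == 1) {
    if (!splitRefinementStep(values[0], base))
      return std::nullopt;
    if (base == "all" || base == "none" || base == "default")
      return values[0]; // passed through, refinement step included
  }

  // Every known sub-option, mapped to "already specified".
  std::map<std::string, bool> seen;
  const std::vector<std::string> names = {
      "divd",  "divf",  "divh",  "vec-divd",  "vec-divf",  "vec-divh",
      "sqrtd", "sqrtf", "sqrth", "vec-sqrtd", "vec-sqrtf", "vec-sqrth"};
  for (const std::string &name : names)
    seen[name] = false;

  std::string out;
  for (std::size_t i = 0; i < values.size(); ++i) {
    std::string val = values[i];
    bool disabled = !val.empty() && val[0] == '!';
    if (disabled)
      val.erase(0, 1); // ignore the '!' while matching names
    if (!splitRefinementStep(val, base))
      return std::nullopt;

    auto it = seen.find(base);
    if (it == seen.end()) {
      // No precision suffix: match the float entry, and reject the value if
      // the double or half entry was already given.
      it = seen.find(base + 'f');
      if (it == seen.end() || seen[base + 'd'] || seen[base + 'h'])
        return std::nullopt;
    }
    if (it->second)
      return std::nullopt; // duplicate specifier
    it->second = true;

    // An unsuffixed name also claims the double and half entries.
    char last = base.back();
    if (last != 'f' && last != 'd' && last != 'h') {
      seen[base + 'd'] = true;
      seen[base + 'h'] = true;
    }

    if (i != 0)
      out += ',';
    out += (disabled ? "!" : "") + val;
  }
  return out;
}
```

Under these assumptions, `normalizeRecip({"!divd", "divf", "sqrt:2"})` yields `!divd,divf,sqrt:2`, while `{"xxx"}`, `{"divd", "divd"}` and `{"div:12"}` are all rejected, which matches the CHECK-INV and CHECK-DUP cases in flang/test/Driver/mrecip.f90.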
https://github.com/llvm/llvm-project/pull/142172 From flang-commits at lists.llvm.org Fri May 30 08:53:51 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 08:53:51 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mrecip[=] (PR #142172) In-Reply-To: Message-ID: <6839d48f.170a0220.39ed5d.4795@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-mlir-llvm Author: Cameron McInally (mcinally)
Changes This patch adds support for the -mrecip command line option. The parsing of this options is equivalent to Clang's and it is implemented by setting the "reciprocal-estimates" function attribute. --- Full diff: https://github.com/llvm/llvm-project/pull/142172.diff 14 Files Affected: - (modified) clang/include/clang/Driver/Options.td (+2-1) - (modified) flang/include/flang/Frontend/CodeGenOptions.h (+3) - (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+3) - (modified) flang/include/flang/Tools/CrossToolHelpers.h (+3) - (modified) flang/lib/Frontend/CompilerInvocation.cpp (+159) - (modified) flang/lib/Frontend/FrontendActions.cpp (+2) - (modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+2-1) - (modified) flang/lib/Optimizer/Transforms/FunctionAttr.cpp (+4) - (added) flang/test/Driver/mrecip.f90 (+27) - (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td (+1) - (modified) mlir/lib/Target/LLVMIR/ModuleImport.cpp (+5) - (modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+3) - (added) mlir/test/Target/LLVMIR/Import/mrecip.ll (+9) - (added) mlir/test/Target/LLVMIR/mrecip.mlir (+8) ``````````diff diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 5ca31c253ed8f..291e9ae223805 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -5472,9 +5472,10 @@ def mno_implicit_float : Flag<["-"], "mno-implicit-float">, Group, HelpText<"Don't generate implicit floating point or vector instructions">; def mimplicit_float : Flag<["-"], "mimplicit-float">, Group; def mrecip : Flag<["-"], "mrecip">, Group, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Equivalent to '-mrecip=all'">; def mrecip_EQ : CommaJoined<["-"], "mrecip=">, Group, - Visibility<[ClangOption, CC1Option]>, + Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, HelpText<"Control use of approximate reciprocal and reciprocal square root instructions followed by 
iterations of " "Newton-Raphson refinement. " " = ( ['!'] ['vec-'] ('rcp'|'sqrt') [('h'|'s'|'d')] [':'] ) | 'all' | 'default' | 'none'">, diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h index 61e56e51c4bbb..5c1f7ef52ca14 100644 --- a/flang/include/flang/Frontend/CodeGenOptions.h +++ b/flang/include/flang/Frontend/CodeGenOptions.h @@ -56,6 +56,9 @@ class CodeGenOptions : public CodeGenOptionsBase { // The prefered vector width, if requested by -mprefer-vector-width. std::string PreferVectorWidth; + // List of reciprocal estimate sub-options. + std::string Reciprocals; + /// List of filenames passed in using the -fembed-offload-object option. These /// are offloading binaries containing device images and metadata. std::vector OffloadObjects; diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td index 1b1970412676d..34842f9785942 100644 --- a/flang/include/flang/Optimizer/Transforms/Passes.td +++ b/flang/include/flang/Optimizer/Transforms/Passes.td @@ -429,6 +429,9 @@ def FunctionAttr : Pass<"function-attr", "mlir::func::FuncOp"> { "module.">, Option<"unsafeFPMath", "unsafe-fp-math", "bool", /*default=*/"false", "Set the unsafe-fp-math attribute on functions in the module.">, + Option<"reciprocals", "mrecip", "std::string", /*default=*/"", + "Set the reciprocal-estimates attribute on functions in the " + "module.">, Option<"preferVectorWidth", "prefer-vector-width", "std::string", /*default=*/"", "Set the prefer-vector-width attribute on functions in the " diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h index 058024a4a04c5..337685c82af5f 100644 --- a/flang/include/flang/Tools/CrossToolHelpers.h +++ b/flang/include/flang/Tools/CrossToolHelpers.h @@ -102,6 +102,7 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { UnsafeFPMath = mathOpts.getAssociativeMath() && 
mathOpts.getReciprocalMath() && NoSignedZerosFPMath && ApproxFuncFPMath && mathOpts.getFPContractEnabled(); + Reciprocals = opts.Reciprocals; PreferVectorWidth = opts.PreferVectorWidth; if (opts.InstrumentFunctions) { InstrumentFunctionEntry = "__cyg_profile_func_enter"; @@ -127,6 +128,8 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks { bool NoSignedZerosFPMath = false; ///< Set no-signed-zeros-fp-math attribute for functions. bool UnsafeFPMath = false; ///< Set unsafe-fp-math attribute for functions. + std::string Reciprocals = ""; ///< Set reciprocal-estimate attribute for + ///< functions. std::string PreferVectorWidth = ""; ///< Set prefer-vector-width attribute for ///< functions. bool NSWOnLoopVarInc = true; ///< Add nsw flag to loop variable increments. diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index 90a002929eff0..957f2041ffd8d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -1251,6 +1251,164 @@ static bool parseIntegerOverflowArgs(CompilerInvocation &invoc, return true; } +/// This is a helper function for validating the optional refinement step +/// parameter in reciprocal argument strings. Return false if there is an error +/// parsing the refinement step. Otherwise, return true and set the Position +/// of the refinement step in the input string. 
+/// +/// \param [in] in The input string +/// \param [in] a The compiler invocation arguments to parse +/// \param [out] position The position of the refinement step in input string +/// \param [out] diags DiagnosticsEngine to report erros with +static bool getRefinementStep(llvm::StringRef in, const llvm::opt::Arg &a, + size_t &position, + clang::DiagnosticsEngine &diags) { + const char refinementStepToken = ':'; + position = in.find(refinementStepToken); + if (position != llvm::StringRef::npos) { + llvm::StringRef option = a.getOption().getName(); + llvm::StringRef refStep = in.substr(position + 1); + // Allow exactly one numeric character for the additional refinement + // step parameter. This is reasonable for all currently-supported + // operations and architectures because we would expect that a larger value + // of refinement steps would cause the estimate "optimization" to + // under-perform the native operation. Also, if the estimate does not + // converge quickly, it probably will not ever converge, so further + // refinement steps will not produce a better answer. + if (refStep.size() != 1) { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + char refStepChar = refStep[0]; + if (refStepChar < '0' || refStepChar > '9') { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + } + return true; +} + +/// Parses all -mrecip= arguments and populates the +/// CompilerInvocation accordingly. Returns false if new errors are generated. 
+/// +/// \param [out] invoc Stores the processed arguments +/// \param [in] args The compiler invocation arguments to parse +/// \param [out] diags DiagnosticsEngine to report erros with +static bool parseMRecip(CompilerInvocation &invoc, llvm::opt::ArgList &args, + clang::DiagnosticsEngine &diags) { + llvm::StringRef disabledPrefixIn = "!"; + llvm::StringRef disabledPrefixOut = "!"; + llvm::StringRef enabledPrefixOut = ""; + llvm::StringRef out = ""; + Fortran::frontend::CodeGenOptions &opts = invoc.getCodeGenOpts(); + + const llvm::opt::Arg *a = + args.getLastArg(clang::driver::options::OPT_mrecip, + clang::driver::options::OPT_mrecip_EQ); + if (!a) + return true; + + unsigned numOptions = a->getNumValues(); + if (numOptions == 0) { + // No option is the same as "all". + opts.Reciprocals = "all"; + return true; + } + + // Pass through "all", "none", or "default" with an optional refinement step. + if (numOptions == 1) { + llvm::StringRef val = a->getValue(0); + size_t refStepLoc; + if (!getRefinementStep(val, *a, refStepLoc, diags)) + return false; + llvm::StringRef valBase = val.slice(0, refStepLoc); + if (valBase == "all" || valBase == "none" || valBase == "default") { + opts.Reciprocals = args.MakeArgString(val); + return true; + } + } + + // Each reciprocal type may be enabled or disabled individually. + // Check each input value for validity, concatenate them all back together, + // and pass through. 
+ + llvm::StringMap optionStrings; + optionStrings.insert(std::make_pair("divd", false)); + optionStrings.insert(std::make_pair("divf", false)); + optionStrings.insert(std::make_pair("divh", false)); + optionStrings.insert(std::make_pair("vec-divd", false)); + optionStrings.insert(std::make_pair("vec-divf", false)); + optionStrings.insert(std::make_pair("vec-divh", false)); + optionStrings.insert(std::make_pair("sqrtd", false)); + optionStrings.insert(std::make_pair("sqrtf", false)); + optionStrings.insert(std::make_pair("sqrth", false)); + optionStrings.insert(std::make_pair("vec-sqrtd", false)); + optionStrings.insert(std::make_pair("vec-sqrtf", false)); + optionStrings.insert(std::make_pair("vec-sqrth", false)); + + for (unsigned i = 0; i != numOptions; ++i) { + llvm::StringRef val = a->getValue(i); + + bool isDisabled = val.starts_with(disabledPrefixIn); + // Ignore the disablement token for string matching. + if (isDisabled) + val = val.substr(1); + + size_t refStep; + if (!getRefinementStep(val, *a, refStep, diags)) + return false; + + llvm::StringRef valBase = val.slice(0, refStep); + llvm::StringMap::iterator optionIter = optionStrings.find(valBase); + if (optionIter == optionStrings.end()) { + // Try again specifying float suffix. + optionIter = optionStrings.find(valBase.str() + 'f'); + if (optionIter == optionStrings.end()) { + // The input name did not match any known option string. + diags.Report(clang::diag::err_drv_invalid_value) + << a->getOption().getName() << val; + return false; + } + // The option was specified without a half or float or double suffix. + // Make sure that the double or half entry was not already specified. + // The float entry will be checked below. + if (optionStrings[valBase.str() + 'd'] || + optionStrings[valBase.str() + 'h']) { + diags.Report(clang::diag::err_drv_invalid_value) + << a->getOption().getName() << val; + return false; + } + } + + if (optionIter->second == true) { + // Duplicate option specified. 
+ diags.Report(clang::diag::err_drv_invalid_value) + << a->getOption().getName() << val; + return false; + } + + // Mark the matched option as found. Do not allow duplicate specifiers. + optionIter->second = true; + + // If the precision was not specified, also mark the double and half entry + // as found. + if (valBase.back() != 'f' && valBase.back() != 'd' && + valBase.back() != 'h') { + optionStrings[valBase.str() + 'd'] = true; + optionStrings[valBase.str() + 'h'] = true; + } + + // Build the output string. + llvm::StringRef prefix = isDisabled ? disabledPrefixOut : enabledPrefixOut; + out = args.MakeArgString(out + prefix + val); + if (i != numOptions - 1) + out = args.MakeArgString(out + ","); + } + + opts.Reciprocals = args.MakeArgString(out); // Handle the rest. + return true; +} + /// Parses all floating point related arguments and populates the /// CompilerInvocation accordingly. /// Returns false if new errors are generated. @@ -1398,6 +1556,7 @@ static bool parseLangOptionsArgs(CompilerInvocation &invoc, success &= parseIntegerOverflowArgs(invoc, args, diags); success &= parseFloatingPointArgs(invoc, args, diags); + success &= parseMRecip(invoc, args, diags); success &= parseVScaleArgs(invoc, args, diags); return success; diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp index 012d0fdfe645f..31803b27b8ceb 100644 --- a/flang/lib/Frontend/FrontendActions.cpp +++ b/flang/lib/Frontend/FrontendActions.cpp @@ -743,6 +743,8 @@ void CodeGenAction::generateLLVMIR() { config.PreferVectorWidth = opts.PreferVectorWidth; + config.Reciprocals = opts.Reciprocals; + if (ci.getInvocation().getFrontendOpts().features.IsEnabled( Fortran::common::LanguageFeature::OpenMP)) config.EnableOpenMP = true; diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 0c774eede4c9a..06eaa18fe7284 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ 
-358,7 +358,8 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, {framePointerKind, config.InstrumentFunctionEntry, config.InstrumentFunctionExit, config.NoInfsFPMath, config.NoNaNsFPMath, config.ApproxFuncFPMath, config.NoSignedZerosFPMath, config.UnsafeFPMath, - config.PreferVectorWidth, /*tuneCPU=*/"", setNoCapture, setNoAlias})); + config.Reciprocals, config.PreferVectorWidth, /*tuneCPU=*/"", + setNoCapture, setNoAlias})); if (config.EnableOpenMP) { pm.addNestedPass( diff --git a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp index 041aa8717d20e..5ac4ed8a93b6b 100644 --- a/flang/lib/Optimizer/Transforms/FunctionAttr.cpp +++ b/flang/lib/Optimizer/Transforms/FunctionAttr.cpp @@ -107,6 +107,10 @@ void FunctionAttrPass::runOnOperation() { func->setAttr( mlir::LLVM::LLVMFuncOp::getUnsafeFpMathAttrName(llvmFuncOpName), mlir::BoolAttr::get(context, true)); + if (!reciprocals.empty()) + func->setAttr( + mlir::LLVM::LLVMFuncOp::getReciprocalEstimatesAttrName(llvmFuncOpName), + mlir::StringAttr::get(context, reciprocals)); if (!preferVectorWidth.empty()) func->setAttr( mlir::LLVM::LLVMFuncOp::getPreferVectorWidthAttrName(llvmFuncOpName), diff --git a/flang/test/Driver/mrecip.f90 b/flang/test/Driver/mrecip.f90 new file mode 100644 index 0000000000000..9ec65c0ef4dde --- /dev/null +++ b/flang/test/Driver/mrecip.f90 @@ -0,0 +1,27 @@ +! Test that -mrecip[=] works as expected. + +! RUN: %flang_fc1 -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-OMIT +! RUN: %flang_fc1 -mrecip -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NOARG +! RUN: %flang_fc1 -mrecip=all -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-ALL +! RUN: %flang_fc1 -mrecip=none -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NONE +! RUN: %flang_fc1 -mrecip=default -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DEF +! 
RUN: %flang_fc1 -mrecip=divd,divf,divh,vec-divd,vec-divf,vec-divh,sqrtd,sqrtf,sqrth,vec-sqrtd,vec-sqrtf,vec-sqrth -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-POS +! RUN: %flang_fc1 -mrecip=!divd,!divf,!divh,!vec-divd,!vec-divf,!vec-divh,!sqrtd,!sqrtf,!sqrth,!vec-sqrtd,!vec-sqrtf,!vec-sqrth +! -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-NEG +! RUN: %flang_fc1 -mrecip=!divd,divf,!divh,sqrtd,!sqrtf,sqrth -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-MIX +! RUN: not %flang_fc1 -mrecip=xxx -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-INV +! RUN: not %flang_fc1 -mrecip=divd,divd -emit-llvm -o - %s 2>&1| FileCheck %s --check-prefix=CHECK-DUP + +subroutine func +end subroutine func + +! CHECK-OMIT-NOT: attributes #0 = { "reciprocal-estimates"={{.*}} } +! CHECK-NOARG: attributes #0 = { "reciprocal-estimates"="all" } +! CHECK-ALL: attributes #0 = { "reciprocal-estimates"="all" } +! CHECK-NONE: attributes #0 = { "reciprocal-estimates"="none" } +! CHECK-DEF: attributes #0 = { "reciprocal-estimates"="default" } +! CHECK-POS: attributes #0 = { "reciprocal-estimates"="divd,divf,divh,vec-divd,vec-divf,vec-divh,sqrtd,sqrtf,sqrth,vec-sqrtd,vec-sqrtf,vec-sqrth" } +! CHECK-NEG: attributes #0 = { "reciprocal-estimates"="!divd,!divf,!divh,!vec-divd,!vec-divf,!vec-divh,!sqrtd,!sqrtf,!sqrth,!vec-sqrtd,!vec-sqrtf,!vec-sqrth" } +! CHECK-MIX: attributes #0 = { "reciprocal-estimates"="!divd,divf,!divh,sqrtd,!sqrtf,sqrth" } +! CHECK-INV: error: invalid value 'xxx' in 'mrecip=' +! 
CHECK-DUP: error: invalid value 'divd' in 'mrecip=' diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td index c0324d561b77b..e5d35fa82fbce 100644 --- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td +++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td @@ -1895,6 +1895,7 @@ def LLVM_LLVMFuncOp : LLVM_Op<"func", [ OptionalAttr:$tune_cpu, OptionalAttr:$prefer_vector_width, OptionalAttr:$target_features, + OptionalAttr:$reciprocal_estimates, OptionalAttr:$unsafe_fp_math, OptionalAttr:$no_infs_fp_math, OptionalAttr:$no_nans_fp_math, diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp index 85417da798b22..1e6e60a78b37d 100644 --- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp @@ -2640,6 +2640,11 @@ void ModuleImport::processFunctionAttributes(llvm::Function *func, attr.isStringAttribute()) funcOp.setPreferVectorWidth(attr.getValueAsString()); + if (llvm::Attribute attr = func->getFnAttribute("reciprocal-estimates"); + attr.isStringAttribute()) + funcOp.setReciprocalEstimatesAttr( + StringAttr::get(context, attr.getValueAsString())); + if (llvm::Attribute attr = func->getFnAttribute("unsafe-fp-math"); attr.isStringAttribute()) funcOp.setUnsafeFpMath(attr.getValueAsBool()); diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp index 4cc419c7cde5b..bc66625bf374e 100644 --- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp @@ -1557,6 +1557,9 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) { getLLVMContext(), attr->getMinRange().getInt(), attr->getMaxRange().getInt())); + if (auto reciprocalEstimates = func.getReciprocalEstimates()) + llvmFunc->addFnAttr("reciprocal-estimates", *reciprocalEstimates); + if (auto unsafeFpMath = func.getUnsafeFpMath()) llvmFunc->addFnAttr("unsafe-fp-math", llvm::toStringRef(*unsafeFpMath)); diff 
--git a/mlir/test/Target/LLVMIR/Import/mrecip.ll b/mlir/test/Target/LLVMIR/Import/mrecip.ll new file mode 100644 index 0000000000000..01814cce70059 --- /dev/null +++ b/mlir/test/Target/LLVMIR/Import/mrecip.ll @@ -0,0 +1,9 @@ +; RUN: mlir-translate -import-llvm -split-input-file %s | FileCheck %s + +; CHECK-LABEL: llvm.func @reciprocal_estimates() +; CHECK-SAME: reciprocal_estimates = "all" +define void @reciprocal_estimates() #0 { + ret void +} + +attributes #0 = { "reciprocal-estimates" = "all" } diff --git a/mlir/test/Target/LLVMIR/mrecip.mlir b/mlir/test/Target/LLVMIR/mrecip.mlir new file mode 100644 index 0000000000000..e0bc66c272f6a --- /dev/null +++ b/mlir/test/Target/LLVMIR/mrecip.mlir @@ -0,0 +1,8 @@ +// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s + +// CHECK: define void @reciprocal_estimates() #[[ATTRS:.*]] { +// CHECK: attributes #[[ATTRS]] = { "reciprocal-estimates"="all" } + +llvm.func @reciprocal_estimates() attributes {reciprocal_estimates = "all"} { + llvm.return +} ``````````
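The driver test above pins down the accepted shape of a `-mrecip=` value: a bare `-mrecip` maps to `all`; `all`, `none` and `default` pass through; otherwise the value is a comma-separated list of names, each optionally prefixed with `!`, and unknown or repeated names are rejected. A minimal sketch of that validation follows, assuming only what the test shows. `RecipResult` and `checkRecipValue` are hypothetical names, not the functions added by this PR, and the sketch deliberately ignores the optional refinement-step suffix mentioned in the review thread. Reporting the name without its `!` prefix on error is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <set>
#include <string>
#include <string_view>

// Hypothetical result type: `attr` holds the string to store in the
// "reciprocal-estimates" attribute, or is empty when validation failed,
// in which case `badToken` names the offending entry.
struct RecipResult {
  std::optional<std::string> attr;
  std::string badToken;
};

// Hypothetical validator for the value of -mrecip[=]. An empty value stands
// for the bare `-mrecip` spelling. Only the names exercised by the test are
// accepted; refinement steps (e.g. a ":N" suffix) are not modelled.
inline RecipResult checkRecipValue(std::string_view value) {
  static const std::set<std::string_view> known = {
      "divd",  "divf",  "divh",  "vec-divd",  "vec-divf",  "vec-divh",
      "sqrtd", "sqrtf", "sqrth", "vec-sqrtd", "vec-sqrtf", "vec-sqrth"};

  if (value.empty())
    return {"all", ""};
  if (value == "all" || value == "none" || value == "default")
    return {std::string(value), ""};

  std::set<std::string_view> seen;
  std::size_t pos = 0;
  while (true) {
    const std::size_t comma = value.find(',', pos);
    std::string_view name = value.substr(
        pos, comma == std::string_view::npos ? std::string_view::npos
                                             : comma - pos);
    // A leading '!' disables the estimate; it does not change the name.
    if (!name.empty() && name.front() == '!')
      name.remove_prefix(1);
    // Reject unknown names and names that were already listed.
    if (known.count(name) == 0 || !seen.insert(name).second)
      return {std::nullopt, std::string(name)};
    if (comma == std::string_view::npos)
      break;
    pos = comma + 1;
  }
  assert(!seen.empty() && "at least one name was validated");
  // On success the original spelling is forwarded unchanged.
  return {std::string(value), ""};
}
```

For example, `checkRecipValue("!divd,divf")` yields the attribute string `!divd,divf`, while `checkRecipValue("divd,divd")` fails and reports `divd`, mirroring the `CHECK-MIX` and `CHECK-DUP` cases.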
https://github.com/llvm/llvm-project/pull/142172 From flang-commits at lists.llvm.org Fri May 30 09:07:02 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 30 May 2025 09:07:02 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mrecip[=] (PR #142172) In-Reply-To: Message-ID: <6839d7a6.630a0220.8659a.1f38@mx.google.com> https://github.com/vzakhari commented: LGTM, though, I think we'd better reuse the code from `Clang.cpp`. `flangFrontend` already depends on `clangDriver`, so we just need to export `ParseMRecip` and `getRefinementStep` from `clangDriver` (and probably replace their `Driver` argument with a `DiagnosticEngine` argument, so that it works for both clang and flang). https://github.com/llvm/llvm-project/pull/142172 From flang-commits at lists.llvm.org Fri May 30 09:17:26 2025 From: flang-commits at lists.llvm.org (Asher Mancinelli via flang-commits) Date: Fri, 30 May 2025 09:17:26 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Propagate volatile on openmp reduction vars (PR #142182) Message-ID: https://github.com/ashermancinelli created https://github.com/llvm/llvm-project/pull/142182 In the openmp reduction clause lowering, reduction variables' reference types were not propagating the volatility of the original variable's type. Add a test and address cases of this in ReductionProcessor. I'm also looking for other cases of this in the openmp bridge, but haven't found any yet. >From cfcec421649158f23d3bf3d37cfbfa313ba779fa Mon Sep 17 00:00:00 2001 From: Asher Mancinelli Date: Thu, 29 May 2025 15:36:13 -0700 Subject: [PATCH] [flang] Propagate volatile on openmp reduction vars In the openmp reduction clause lowering, reduction variables' reference types were not propagating the volatility of the original variable's type. Add a test and address cases of this in ReductionProcessor.
--- flang/lib/Lower/OpenMP/ReductionProcessor.cpp | 6 ++- flang/test/Lower/volatile-openmp1.f90 | 46 +++++++++++++++++++ 2 files changed, 50 insertions(+), 2 deletions(-) create mode 100644 flang/test/Lower/volatile-openmp1.f90 diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp index b8aa0deb42dd6..7ef0f2a0ef7c5 100644 --- a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp @@ -389,7 +389,8 @@ static void genBoxCombiner(fir::FirOpBuilder &builder, mlir::Location loc, hlfir::LoopNest nest = hlfir::genLoopNest( loc, builder, shapeShift.getExtents(), /*isUnordered=*/true); builder.setInsertionPointToStart(nest.body); - mlir::Type refTy = fir::ReferenceType::get(seqTy.getEleTy()); + const bool seqIsVolatile = fir::isa_volatile_type(seqTy.getEleTy()); + mlir::Type refTy = fir::ReferenceType::get(seqTy.getEleTy(), seqIsVolatile); auto lhsEleAddr = builder.create( loc, refTy, lhs, shapeShift, /*slice=*/mlir::Value{}, nest.oneBasedIndices, /*typeparms=*/mlir::ValueRange{}); @@ -659,7 +660,8 @@ void ReductionProcessor::processReductionArguments( assert(fir::isa_ref_type(symVal.getType()) && "reduction input var is passed by reference"); mlir::Type elementType = fir::dyn_cast_ptrEleTy(symVal.getType()); - mlir::Type refTy = fir::ReferenceType::get(elementType); + const bool symIsVolatile = fir::isa_volatile_type(symVal.getType()); + mlir::Type refTy = fir::ReferenceType::get(elementType, symIsVolatile); reductionVars.push_back( builder.createConvert(currentLocation, refTy, symVal)); diff --git a/flang/test/Lower/volatile-openmp1.f90 b/flang/test/Lower/volatile-openmp1.f90 new file mode 100644 index 0000000000000..163db953b6b80 --- /dev/null +++ b/flang/test/Lower/volatile-openmp1.f90 @@ -0,0 +1,46 @@ +! 
RUN: bbc --strict-fir-volatile-verifier -fopenmp %s -o - | FileCheck %s +program main +implicit none +integer,volatile::a +integer::n,i +a=0 +n=1000 +!$omp parallel +!$omp do reduction(+:a) + do i=1,n + a=a+1 + end do +!$omp end parallel +end program + +! CHECK-LABEL: func.func @_QQmain() attributes {fir.bindc_name = "main"} { +! CHECK: %[[VAL_0:.*]] = arith.constant 1 : i32 +! CHECK: %[[VAL_1:.*]] = arith.constant 1000 : i32 +! CHECK: %[[VAL_2:.*]] = arith.constant 0 : i32 +! CHECK: %[[VAL_3:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[VAL_4:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFEa"} +! CHECK: %[[VAL_5:.*]] = fir.volatile_cast %[[VAL_4]] : (!fir.ref) -> !fir.ref +! CHECK: %[[VAL_6:.*]]:2 = hlfir.declare %[[VAL_5]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL_7:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFEi"} +! CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_7]] {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL_9:.*]] = fir.alloca i32 {bindc_name = "n", uniq_name = "_QFEn"} +! CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_9]] {uniq_name = "_QFEn"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[VAL_2]] to %[[VAL_6]]#0 : i32, !fir.ref +! CHECK: hlfir.assign %[[VAL_1]] to %[[VAL_10]]#0 : i32, !fir.ref +! CHECK: omp.parallel { +! CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_10]]#0 : !fir.ref +! CHECK: omp.wsloop private(@_QFEi_private_i32 %[[VAL_8]]#0 -> %[[VAL_12:.*]] : !fir.ref) reduction(@add_reduction_i32 %[[VAL_6]]#0 -> %[[VAL_13:.*]] : !fir.ref) { +! CHECK: omp.loop_nest (%[[VAL_14:.*]]) : i32 = (%[[VAL_0]]) to (%[[VAL_11]]) inclusive step (%[[VAL_0]]) { +! CHECK: %[[VAL_15:.*]]:2 = hlfir.declare %[[VAL_12]] {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_13]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! 
CHECK: hlfir.assign %[[VAL_14]] to %[[VAL_15]]#0 : i32, !fir.ref +! CHECK: %[[VAL_17:.*]] = fir.load %[[VAL_16]]#0 : !fir.ref +! CHECK: %[[VAL_18:.*]] = arith.addi %[[VAL_17]], %[[VAL_0]] : i32 +! CHECK: hlfir.assign %[[VAL_18]] to %[[VAL_16]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } From flang-commits at lists.llvm.org Fri May 30 09:17:34 2025 From: flang-commits at lists.llvm.org (Kelvin Li via flang-commits) Date: Fri, 30 May 2025 09:17:34 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mrecip[=] (PR #142172) In-Reply-To: Message-ID: <6839da1e.a70a0220.2b4a1.6a23@mx.google.com> ================ @@ -1251,6 +1251,164 @@ static bool parseIntegerOverflowArgs(CompilerInvocation &invoc, return true; } +/// This is a helper function for validating the optional refinement step +/// parameter in reciprocal argument strings. Return false if there is an error +/// parsing the refinement step. Otherwise, return true and set the Position +/// of the refinement step in the input string. +/// +/// \param [in] in The input string +/// \param [in] a The compiler invocation arguments to parse +/// \param [out] position The position of the refinement step in input string +/// \param [out] diags DiagnosticsEngine to report erros with ---------------- kkwli wrote: erros → errors https://github.com/llvm/llvm-project/pull/142172 From flang-commits at lists.llvm.org Fri May 30 09:17:50 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 09:17:50 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Propagate volatile on openmp reduction vars (PR #142182) In-Reply-To: Message-ID: <6839da2e.050a0220.14b51.fb15@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-openmp Author: Asher Mancinelli (ashermancinelli)
Changes In the openmp reduction clause lowering, reduction variables' reference types were not propagating the volatility of the original variable's type. Add a test and address cases of this in ReductionProcessor. I'm also looking for other cases of this in the openmp bridge, but haven't found any yet. --- Full diff: https://github.com/llvm/llvm-project/pull/142182.diff 2 Files Affected: - (modified) flang/lib/Lower/OpenMP/ReductionProcessor.cpp (+4-2) - (added) flang/test/Lower/volatile-openmp1.f90 (+46) ``````````diff diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp index b8aa0deb42dd6..7ef0f2a0ef7c5 100644 --- a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp @@ -389,7 +389,8 @@ static void genBoxCombiner(fir::FirOpBuilder &builder, mlir::Location loc, hlfir::LoopNest nest = hlfir::genLoopNest( loc, builder, shapeShift.getExtents(), /*isUnordered=*/true); builder.setInsertionPointToStart(nest.body); - mlir::Type refTy = fir::ReferenceType::get(seqTy.getEleTy()); + const bool seqIsVolatile = fir::isa_volatile_type(seqTy.getEleTy()); + mlir::Type refTy = fir::ReferenceType::get(seqTy.getEleTy(), seqIsVolatile); auto lhsEleAddr = builder.create( loc, refTy, lhs, shapeShift, /*slice=*/mlir::Value{}, nest.oneBasedIndices, /*typeparms=*/mlir::ValueRange{}); @@ -659,7 +660,8 @@ void ReductionProcessor::processReductionArguments( assert(fir::isa_ref_type(symVal.getType()) && "reduction input var is passed by reference"); mlir::Type elementType = fir::dyn_cast_ptrEleTy(symVal.getType()); - mlir::Type refTy = fir::ReferenceType::get(elementType); + const bool symIsVolatile = fir::isa_volatile_type(symVal.getType()); + mlir::Type refTy = fir::ReferenceType::get(elementType, symIsVolatile); reductionVars.push_back( builder.createConvert(currentLocation, refTy, symVal)); diff --git a/flang/test/Lower/volatile-openmp1.f90 b/flang/test/Lower/volatile-openmp1.f90 new 
file mode 100644 index 0000000000000..163db953b6b80 --- /dev/null +++ b/flang/test/Lower/volatile-openmp1.f90 @@ -0,0 +1,46 @@ +! RUN: bbc --strict-fir-volatile-verifier -fopenmp %s -o - | FileCheck %s +program main +implicit none +integer,volatile::a +integer::n,i +a=0 +n=1000 +!$omp parallel +!$omp do reduction(+:a) + do i=1,n + a=a+1 + end do +!$omp end parallel +end program + +! CHECK-LABEL: func.func @_QQmain() attributes {fir.bindc_name = "main"} { +! CHECK: %[[VAL_0:.*]] = arith.constant 1 : i32 +! CHECK: %[[VAL_1:.*]] = arith.constant 1000 : i32 +! CHECK: %[[VAL_2:.*]] = arith.constant 0 : i32 +! CHECK: %[[VAL_3:.*]] = fir.dummy_scope : !fir.dscope +! CHECK: %[[VAL_4:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFEa"} +! CHECK: %[[VAL_5:.*]] = fir.volatile_cast %[[VAL_4]] : (!fir.ref) -> !fir.ref +! CHECK: %[[VAL_6:.*]]:2 = hlfir.declare %[[VAL_5]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL_7:.*]] = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFEi"} +! CHECK: %[[VAL_8:.*]]:2 = hlfir.declare %[[VAL_7]] {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: %[[VAL_9:.*]] = fir.alloca i32 {bindc_name = "n", uniq_name = "_QFEn"} +! CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_9]] {uniq_name = "_QFEn"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[VAL_2]] to %[[VAL_6]]#0 : i32, !fir.ref +! CHECK: hlfir.assign %[[VAL_1]] to %[[VAL_10]]#0 : i32, !fir.ref +! CHECK: omp.parallel { +! CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_10]]#0 : !fir.ref +! CHECK: omp.wsloop private(@_QFEi_private_i32 %[[VAL_8]]#0 -> %[[VAL_12:.*]] : !fir.ref) reduction(@add_reduction_i32 %[[VAL_6]]#0 -> %[[VAL_13:.*]] : !fir.ref) { +! CHECK: omp.loop_nest (%[[VAL_14:.*]]) : i32 = (%[[VAL_0]]) to (%[[VAL_11]]) inclusive step (%[[VAL_0]]) { +! CHECK: %[[VAL_15:.*]]:2 = hlfir.declare %[[VAL_12]] {uniq_name = "_QFEi"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! 
CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_13]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +! CHECK: hlfir.assign %[[VAL_14]] to %[[VAL_15]]#0 : i32, !fir.ref +! CHECK: %[[VAL_17:.*]] = fir.load %[[VAL_16]]#0 : !fir.ref +! CHECK: %[[VAL_18:.*]] = arith.addi %[[VAL_17]], %[[VAL_0]] : i32 +! CHECK: hlfir.assign %[[VAL_18]] to %[[VAL_16]]#0 : i32, !fir.ref +! CHECK: omp.yield +! CHECK: } +! CHECK: } +! CHECK: omp.terminator +! CHECK: } +! CHECK: return +! CHECK: } ``````````
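The core of the change above is that a reference type rebuilt from an element type alone silently loses the volatile flag, so the flag has to be carried over explicitly. The following is a toy C++ model of that idea, not FIR code: `RefType`, `rebuildDroppingVolatile` and `rebuildKeepingVolatile` are hypothetical stand-ins for `fir::ReferenceType::get` being called without and with the volatility argument.

```cpp
#include <string>

// Hypothetical stand-in for a reference type: an element type name plus a
// volatile flag that defaults to false, as when the flag is not passed.
struct RefType {
  std::string eleTy;
  bool isVolatile = false;
};

// Models the pre-patch behaviour: the new reference type is built from the
// element type only, so a volatile original becomes non-volatile.
inline RefType rebuildDroppingVolatile(const RefType &original) {
  return RefType{original.eleTy};
}

// Models the patched behaviour: volatility is queried from the original
// type and forwarded when the new reference type is created.
inline RefType rebuildKeepingVolatile(const RefType &original) {
  return RefType{original.eleTy, original.isVolatile};
}
```

With a volatile `i32` reference as input, the first helper returns a plain reference and the second returns a volatile one, which is the difference the new `volatile-openmp1.f90` test is meant to guard.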
https://github.com/llvm/llvm-project/pull/142182 From flang-commits at lists.llvm.org Fri May 30 09:20:28 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 30 May 2025 09:20:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Propagate volatile on openmp reduction vars (PR #142182) In-Reply-To: Message-ID: <6839dacc.050a0220.1e3100.5faa@mx.google.com> https://github.com/vzakhari approved this pull request. LGTM! https://github.com/llvm/llvm-project/pull/142182 From flang-commits at lists.llvm.org Fri May 30 08:36:53 2025 From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits) Date: Fri, 30 May 2025 08:36:53 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Disable noalias captures(none) by default (PR #142128) In-Reply-To: Message-ID: <6839d095.170a0220.1b298e.46ac@mx.google.com> ================ @@ -350,9 +358,10 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, else framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None; - bool setNoCapture = false, setNoAlias = false; - if (config.OptLevel.isOptimizingForSpeed()) - setNoCapture = setNoAlias = true; + // TODO: re-enable setNoAlias by default (when optimizing for speed) once + // function specialization is fixed. + bool setNoAlias = forceNoAlias; + bool setNoCapture = forceNoCapture; ---------------- vzakhari wrote: Oh, I think this will revert more than I did. Nocapture was set before my change, but only for ref types (not for boxes). Can you give me some time today to disable it correctly to resolve the exchange issue and avoid other regressions? Or is it crucial to resolve the exchange issue immediately? Please let me know.
https://github.com/llvm/llvm-project/pull/142128 From flang-commits at lists.llvm.org Fri May 30 08:52:08 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 30 May 2025 08:52:08 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Disable noalias captures(none) by default (PR #142128) In-Reply-To: Message-ID: <6839d428.050a0220.2d52e2.f81f@mx.google.com> https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/142128 >From 645f6c332a1e7b9566900cd071cc44eca50d27a5 Mon Sep 17 00:00:00 2001 From: Tom Eccles Date: Fri, 30 May 2025 14:53:07 +0000 Subject: [PATCH 1/2] [flang] Disable noalias captures(none) by default This is due to a 70% regression in exchange2_r on neoverse-v2 due to function specialization no longer triggering in the LTO pipline. --- flang/lib/Optimizer/Passes/Pipelines.cpp | 15 ++++++++++++--- flang/test/Fir/polymorphic.fir | 2 +- flang/test/Fir/struct-passing-x86-64-byval.fir | 2 +- flang/test/Fir/target-rewrite-complex-10-x86.fir | 2 +- flang/test/Fir/target.fir | 2 +- 5 files changed, 16 insertions(+), 7 deletions(-) diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index 0c774eede4c9a..ec17a93b53ff4 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -10,6 +10,14 @@ /// common to flang and the test tools. #include "flang/Optimizer/Passes/Pipelines.h" +#include "llvm/Support/CommandLine.h" + +/// Force setting the no-alias attribute on fuction arguments when possible. +static llvm::cl::opt forceNoAlias("force-no-alias", llvm::cl::Hidden, + llvm::cl::init(false)); +/// Force setting the no-capture attribute on fuction arguments when possible. 
+static llvm::cl::opt forceNoCapture("force-no-capture", llvm::cl::Hidden, + llvm::cl::init(false)); namespace fir { @@ -350,9 +358,10 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, else framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None; - bool setNoCapture = false, setNoAlias = false; - if (config.OptLevel.isOptimizingForSpeed()) - setNoCapture = setNoAlias = true; + // TODO: re-enable setNoAlias by default (when optimizing for speed) once + // function specialization is fixed. + bool setNoAlias = forceNoAlias; + bool setNoCapture = forceNoCapture; pm.addPass(fir::createFunctionAttr( {framePointerKind, config.InstrumentFunctionEntry, diff --git a/flang/test/Fir/polymorphic.fir b/flang/test/Fir/polymorphic.fir index 84fa2e950633f..d9a13a99477ce 100644 --- a/flang/test/Fir/polymorphic.fir +++ b/flang/test/Fir/polymorphic.fir @@ -1,4 +1,4 @@ -// RUN: tco %s | FileCheck %s +// RUN: tco --force-no-capture %s | FileCheck %s // Test code gen for unlimited polymorphic type descriptor. diff --git a/flang/test/Fir/struct-passing-x86-64-byval.fir b/flang/test/Fir/struct-passing-x86-64-byval.fir index 997d2930f836c..dd25b80a3f81d 100644 --- a/flang/test/Fir/struct-passing-x86-64-byval.fir +++ b/flang/test/Fir/struct-passing-x86-64-byval.fir @@ -1,7 +1,7 @@ // Test X86-64 ABI rewrite of struct passed by value (BIND(C), VALUE derived types). // This test test cases where the struct must be passed on the stack according // to the System V ABI. 
-// RUN: tco --target=x86_64-unknown-linux-gnu %s | FileCheck %s +// RUN: tco --target=x86_64-unknown-linux-gnu --force-no-capture --force-no-alias %s | FileCheck %s module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} { diff --git a/flang/test/Fir/target-rewrite-complex-10-x86.fir b/flang/test/Fir/target-rewrite-complex-10-x86.fir index 5f917ee42d598..b05187c65a932 100644 --- a/flang/test/Fir/target-rewrite-complex-10-x86.fir +++ b/flang/test/Fir/target-rewrite-complex-10-x86.fir @@ -1,6 +1,6 @@ // Test COMPLEX(10) passing and returning on X86 // RUN: fir-opt --target-rewrite="target=x86_64-unknown-linux-gnu" %s | FileCheck %s --check-prefix=AMD64 -// RUN: tco -target="x86_64-unknown-linux-gnu" %s | FileCheck %s --check-prefix=AMD64_LLVM +// RUN: tco -target="x86_64-unknown-linux-gnu" --force-no-alias --force-no-capture %s | FileCheck %s --check-prefix=AMD64_LLVM module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} { diff --git a/flang/test/Fir/target.fir b/flang/test/Fir/target.fir index e1190649e0803..d40bcae4a8ad2 100644 --- a/flang/test/Fir/target.fir +++ b/flang/test/Fir/target.fir @@ -1,4 +1,4 @@ -// RUN: tco --target=i386-unknown-linux-gnu %s | FileCheck %s --check-prefix=I32 +// RUN: tco --target=i386-unknown-linux-gnu --force-no-alias --force-no-capture %s | FileCheck %s --check-prefix=I32 // RUN: tco --target=x86_64-unknown-linux-gnu %s | FileCheck %s --check-prefix=X64 // RUN: tco --target=aarch64-unknown-linux-gnu %s | FileCheck %s --check-prefix=AARCH64 // RUN: tco --target=powerpc64le-unknown-linux-gnu %s | FileCheck %s --check-prefix=PPC >From e956bfcc84d4db925b7fbe9d8b9c44b0e76fef71 Mon Sep 17 00:00:00 2001 From: Tom 
Eccles Date: Fri, 30 May 2025 15:51:51 +0000 Subject: [PATCH 2/2] Re-enable captures(none) --- flang/lib/Optimizer/Passes/Pipelines.cpp | 5 +---- flang/test/Fir/polymorphic.fir | 2 +- flang/test/Fir/struct-passing-x86-64-byval.fir | 2 +- flang/test/Fir/target-rewrite-complex-10-x86.fir | 2 +- flang/test/Fir/target.fir | 2 +- 5 files changed, 5 insertions(+), 8 deletions(-) diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp index ec17a93b53ff4..00248032daf92 100644 --- a/flang/lib/Optimizer/Passes/Pipelines.cpp +++ b/flang/lib/Optimizer/Passes/Pipelines.cpp @@ -15,9 +15,6 @@ /// Force setting the no-alias attribute on fuction arguments when possible. static llvm::cl::opt forceNoAlias("force-no-alias", llvm::cl::Hidden, llvm::cl::init(false)); -/// Force setting the no-capture attribute on fuction arguments when possible. -static llvm::cl::opt forceNoCapture("force-no-capture", llvm::cl::Hidden, - llvm::cl::init(false)); namespace fir { @@ -361,7 +358,7 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, // TODO: re-enable setNoAlias by default (when optimizing for speed) once // function specialization is fixed. bool setNoAlias = forceNoAlias; - bool setNoCapture = forceNoCapture; + bool setNoCapture = config.OptLevel.isOptimizingForSpeed(); pm.addPass(fir::createFunctionAttr( {framePointerKind, config.InstrumentFunctionEntry, diff --git a/flang/test/Fir/polymorphic.fir b/flang/test/Fir/polymorphic.fir index d9a13a99477ce..84fa2e950633f 100644 --- a/flang/test/Fir/polymorphic.fir +++ b/flang/test/Fir/polymorphic.fir @@ -1,4 +1,4 @@ -// RUN: tco --force-no-capture %s | FileCheck %s +// RUN: tco %s | FileCheck %s // Test code gen for unlimited polymorphic type descriptor. 
diff --git a/flang/test/Fir/struct-passing-x86-64-byval.fir b/flang/test/Fir/struct-passing-x86-64-byval.fir index dd25b80a3f81d..e22c3a23ef9da 100644 --- a/flang/test/Fir/struct-passing-x86-64-byval.fir +++ b/flang/test/Fir/struct-passing-x86-64-byval.fir @@ -1,7 +1,7 @@ // Test X86-64 ABI rewrite of struct passed by value (BIND(C), VALUE derived types). // This test test cases where the struct must be passed on the stack according // to the System V ABI. -// RUN: tco --target=x86_64-unknown-linux-gnu --force-no-capture --force-no-alias %s | FileCheck %s +// RUN: tco --target=x86_64-unknown-linux-gnu --force-no-alias %s | FileCheck %s module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} { diff --git a/flang/test/Fir/target-rewrite-complex-10-x86.fir b/flang/test/Fir/target-rewrite-complex-10-x86.fir index b05187c65a932..d54f98b72b3d2 100644 --- a/flang/test/Fir/target-rewrite-complex-10-x86.fir +++ b/flang/test/Fir/target-rewrite-complex-10-x86.fir @@ -1,6 +1,6 @@ // Test COMPLEX(10) passing and returning on X86 // RUN: fir-opt --target-rewrite="target=x86_64-unknown-linux-gnu" %s | FileCheck %s --check-prefix=AMD64 -// RUN: tco -target="x86_64-unknown-linux-gnu" --force-no-alias --force-no-capture %s | FileCheck %s --check-prefix=AMD64_LLVM +// RUN: tco -target="x86_64-unknown-linux-gnu" --force-no-alias %s | FileCheck %s --check-prefix=AMD64_LLVM module attributes {fir.defaultkind = "a1c4d8i4l4r4", fir.kindmap = "", llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} { diff --git a/flang/test/Fir/target.fir b/flang/test/Fir/target.fir index d40bcae4a8ad2..b04e23a018e7e 100644 --- a/flang/test/Fir/target.fir +++ b/flang/test/Fir/target.fir @@ -1,4 +1,4 @@ -// RUN: tco --target=i386-unknown-linux-gnu 
--force-no-alias --force-no-capture %s | FileCheck %s --check-prefix=I32 +// RUN: tco --target=i386-unknown-linux-gnu --force-no-alias %s | FileCheck %s --check-prefix=I32 // RUN: tco --target=x86_64-unknown-linux-gnu %s | FileCheck %s --check-prefix=X64 // RUN: tco --target=aarch64-unknown-linux-gnu %s | FileCheck %s --check-prefix=AARCH64 // RUN: tco --target=powerpc64le-unknown-linux-gnu %s | FileCheck %s --check-prefix=PPC From flang-commits at lists.llvm.org Fri May 30 08:52:44 2025 From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits) Date: Fri, 30 May 2025 08:52:44 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Disable noalias captures(none) by default (PR #142128) In-Reply-To: Message-ID: <6839d44c.170a0220.123f9b.4e74@mx.google.com> ================ @@ -350,9 +358,10 @@ void createDefaultFIRCodeGenPassPipeline(mlir::PassManager &pm, else framePointerKind = mlir::LLVM::framePointerKind::FramePointerKind::None; - bool setNoCapture = false, setNoAlias = false; - if (config.OptLevel.isOptimizingForSpeed()) - setNoCapture = setNoAlias = true; + // TODO: re-enable setNoAlias by default (when optimizing for speed) once + // function specialization is fixed. + bool setNoAlias = forceNoAlias; + bool setNoCapture = forceNoCapture; ---------------- tblah wrote: My bad I was rushing and missed this. It turns out to work fine with nocapture still enabled. I've updated the patch. 
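After the second patch in this series, the two argument attributes are decided independently: `noalias` is only set when the hidden `--force-no-alias` option is given, while `captures(none)` again follows the optimization level. A minimal sketch of that decision, assuming nothing beyond what the diff shows, is given here. `ArgAttrChoice` and `chooseArgAttrs` are hypothetical names used for illustration and the two `bool` parameters stand in for `config.OptLevel.isOptimizingForSpeed()` and the `forceNoAlias` option.

```cpp
// Hypothetical summary of which argument attributes the pipeline requests.
struct ArgAttrChoice {
  bool setNoAlias;
  bool setNoCapture;
};

// Models the state after PATCH 2/2: noalias is opt-in through the hidden
// flag, nocapture is enabled whenever the pipeline optimizes for speed.
inline ArgAttrChoice chooseArgAttrs(bool optimizingForSpeed,
                                    bool forceNoAlias) {
  return ArgAttrChoice{/*setNoAlias=*/forceNoAlias,
                       /*setNoCapture=*/optimizingForSpeed};
}
```

This is why the updated tests only need `--force-no-alias`: with the flag absent, an optimized build still gets nocapture but no longer gets noalias.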
https://github.com/llvm/llvm-project/pull/142128

From flang-commits at lists.llvm.org Fri May 30 08:52:58 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Fri, 30 May 2025 08:52:58 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Disable noalias by default (PR #142128)
In-Reply-To:
Message-ID: <6839d45a.a70a0220.8ceee.4ea8@mx.google.com>

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/142128

From flang-commits at lists.llvm.org Fri May 30 08:54:47 2025
From: flang-commits at lists.llvm.org (Slava Zakharin via flang-commits)
Date: Fri, 30 May 2025 08:54:47 -0700 (PDT)
Subject: [flang-commits] [flang] [flang] Disable noalias by default (PR #142128)
In-Reply-To:
Message-ID: <6839d4c7.170a0220.24c270.4a73@mx.google.com>

https://github.com/vzakhari approved this pull request.

LGTM. Thank you, Tom!

https://github.com/llvm/llvm-project/pull/142128

From flang-commits at lists.llvm.org Fri May 30 09:27:53 2025
From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits)
Date: Fri, 30 May 2025 09:27:53 -0700 (PDT)
Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882)
In-Reply-To:
Message-ID: <6839dc89.630a0220.263bc4.2536@mx.google.com>

https://github.com/snarang181 updated https://github.com/llvm/llvm-project/pull/141882

>From 05d155d3def0173ed099b8d7588dca1ea644754a Mon Sep 17 00:00:00 2001
From: Samarth Narang
Date: Wed, 28 May 2025 20:21:16 -0400
Subject: [PATCH 1/4] [Flang][Docs] Add Sphinx man page support for Flang

This patch enables building Flang man pages by:
- Adding a `man_pages` entry in flang/docs/conf.py for Sphinx man builder.
- Adding a minimal `index.rst` as the master document.
- Adding placeholder `.rst` files for FIRLangRef and FlangCommandLineReference
  to fix toctree references.

These changes unblock builds using `-DLLVM_BUILD_MANPAGES=ON` and allow
`ninja docs-flang-man` to generate `flang.1`.

Fixes #141757 --- flang/docs/FIRLangRef.rst | 4 ++++ flang/docs/FlangCommandLineReference.rst | 4 ++++ flang/docs/conf.py | 4 +++- flang/docs/index.rst | 10 ++++++++++ 4 files changed, 21 insertions(+), 1 deletion(-) create mode 100644 flang/docs/FIRLangRef.rst create mode 100644 flang/docs/FlangCommandLineReference.rst create mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst new file mode 100644 index 0000000000000..91edd67fdcad8 --- /dev/null +++ b/flang/docs/FIRLangRef.rst @@ -0,0 +1,4 @@ +FIR Language Reference +====================== + +(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst new file mode 100644 index 0000000000000..71f77f28ba72c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.rst @@ -0,0 +1,4 @@ +Flang Command Line Reference +============================ + +(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 48f7b69f5d750..46907f144e25a 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -227,7 +227,9 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [] +man_pages = [ + ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) +] # If true, show URL addresses after external links. # man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst new file mode 100644 index 0000000000000..09677eb87704f --- /dev/null +++ b/flang/docs/index.rst @@ -0,0 +1,10 @@ +Flang Documentation +==================== + +Welcome to the Flang documentation. + +.. 
toctree:: + :maxdepth: 1 + + FIRLangRef + FlangCommandLineReference >From c255837afe37348ed2e1fe480bc2b621025f8774 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 06:53:34 -0400 Subject: [PATCH 2/4] Remove .rst files and point conf.py to pick up .md --- flang/docs/FIRLangRef.rst | 4 ---- flang/docs/FlangCommandLineReference.rst | 4 ---- flang/docs/conf.py | 5 ++--- flang/docs/index.rst | 10 ---------- 4 files changed, 2 insertions(+), 21 deletions(-) delete mode 100644 flang/docs/FIRLangRef.rst delete mode 100644 flang/docs/FlangCommandLineReference.rst delete mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst deleted file mode 100644 index 91edd67fdcad8..0000000000000 --- a/flang/docs/FIRLangRef.rst +++ /dev/null @@ -1,4 +0,0 @@ -FIR Language Reference -====================== - -(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst deleted file mode 100644 index 71f77f28ba72c..0000000000000 --- a/flang/docs/FlangCommandLineReference.rst +++ /dev/null @@ -1,4 +0,0 @@ -Flang Command Line Reference -============================ - -(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 46907f144e25a..4fd81440c8176 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -42,6 +42,7 @@ # Add any paths that contain templates here, relative to this directory. templates_path = ["_templates"] +source_suffix = [".md"] myst_heading_anchors = 6 import sphinx @@ -227,9 +228,7 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [ - ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) -] +man_pages = [("index", "flang", "Flang Documentation", ["Flang Contributors"], 1)] # If true, show URL addresses after external links. 
# man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst deleted file mode 100644 index 09677eb87704f..0000000000000 --- a/flang/docs/index.rst +++ /dev/null @@ -1,10 +0,0 @@ -Flang Documentation -==================== - -Welcome to the Flang documentation. - -.. toctree:: - :maxdepth: 1 - - FIRLangRef - FlangCommandLineReference >From c1e998189c0b61e9910c2b613f8c610ae67c43ff Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 07:03:35 -0400 Subject: [PATCH 3/4] While building man pages, the .md files were being used. Due to that, the myst_parser was explictly imported. Adding Placeholder .md files which are required by index.md --- flang/docs/FIRLangRef.md | 3 +++ flang/docs/FlangCommandLineReference.md | 3 +++ flang/docs/conf.py | 10 +++++----- 3 files changed, 11 insertions(+), 5 deletions(-) create mode 100644 flang/docs/FIRLangRef.md create mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md new file mode 100644 index 0000000000000..8e4052f14fc7c --- /dev/null +++ b/flang/docs/FIRLangRef.md @@ -0,0 +1,3 @@ +# FIR Language Reference + +_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md new file mode 100644 index 0000000000000..ee8d7b83dc50c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.md @@ -0,0 +1,3 @@ +# Flang Command Line Reference + +_TODO: Add Flang CLI documentation._ diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 4fd81440c8176..7223661625689 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -10,6 +10,7 @@ # serve to show the default. from datetime import date + # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. 
@@ -28,16 +29,15 @@ "sphinx.ext.autodoc", ] -# When building man pages, we do not use the markdown pages, -# So, we can continue without the myst_parser dependencies. -# Doing so reduces dependencies of some packaged llvm distributions. + try: import myst_parser extensions.append("myst_parser") except ImportError: - if not tags.has("builder-man"): - raise + raise ImportError( + "myst_parser is required to build documentation, including man pages." + ) # Add any paths that contain templates here, relative to this directory. >From bee779e542853266d69e01d2e724636a219430dd Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 09:01:22 -0400 Subject: [PATCH 4/4] Remove placeholder .md files --- flang/docs/FIRLangRef.md | 3 --- flang/docs/FlangCommandLineReference.md | 3 --- 2 files changed, 6 deletions(-) delete mode 100644 flang/docs/FIRLangRef.md delete mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md deleted file mode 100644 index 8e4052f14fc7c..0000000000000 --- a/flang/docs/FIRLangRef.md +++ /dev/null @@ -1,3 +0,0 @@ -# FIR Language Reference - -_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md deleted file mode 100644 index ee8d7b83dc50c..0000000000000 --- a/flang/docs/FlangCommandLineReference.md +++ /dev/null @@ -1,3 +0,0 @@ -# Flang Command Line Reference - -_TODO: Add Flang CLI documentation._ From flang-commits at lists.llvm.org Fri May 30 09:29:51 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 30 May 2025 09:29:51 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mrecip[=] (PR #142172) In-Reply-To: Message-ID: <6839dcff.630a0220.1c1ab1.1a06@mx.google.com> ================ @@ -1251,6 +1251,164 @@ static bool parseIntegerOverflowArgs(CompilerInvocation &invoc, return true; } +/// This is a helper function for 
validating the optional refinement step +/// parameter in reciprocal argument strings. Return false if there is an error +/// parsing the refinement step. Otherwise, return true and set the Position +/// of the refinement step in the input string. +/// +/// \param [in] in The input string +/// \param [in] a The compiler invocation arguments to parse +/// \param [out] position The position of the refinement step in input string +/// \param [out] diags DiagnosticsEngine to report erros with +static bool getRefinementStep(llvm::StringRef in, const llvm::opt::Arg &a, + size_t &position, + clang::DiagnosticsEngine &diags) { + const char refinementStepToken = ':'; + position = in.find(refinementStepToken); + if (position != llvm::StringRef::npos) { + llvm::StringRef option = a.getOption().getName(); + llvm::StringRef refStep = in.substr(position + 1); + // Allow exactly one numeric character for the additional refinement + // step parameter. This is reasonable for all currently-supported + // operations and architectures because we would expect that a larger value + // of refinement steps would cause the estimate "optimization" to + // under-perform the native operation. Also, if the estimate does not + // converge quickly, it probably will not ever converge, so further + // refinement steps will not produce a better answer. + if (refStep.size() != 1) { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + char refStepChar = refStep[0]; + if (refStepChar < '0' || refStepChar > '9') { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + } + return true; +} + +/// Parses all -mrecip= arguments and populates the +/// CompilerInvocation accordingly. Returns false if new errors are generated. 
+/// +/// \param [out] invoc Stores the processed arguments +/// \param [in] args The compiler invocation arguments to parse +/// \param [out] diags DiagnosticsEngine to report erros with +static bool parseMRecip(CompilerInvocation &invoc, llvm::opt::ArgList &args, + clang::DiagnosticsEngine &diags) { + llvm::StringRef disabledPrefixIn = "!"; + llvm::StringRef disabledPrefixOut = "!"; + llvm::StringRef enabledPrefixOut = ""; + llvm::StringRef out = ""; + Fortran::frontend::CodeGenOptions &opts = invoc.getCodeGenOpts(); + + const llvm::opt::Arg *a = + args.getLastArg(clang::driver::options::OPT_mrecip, + clang::driver::options::OPT_mrecip_EQ); + if (!a) + return true; + + unsigned numOptions = a->getNumValues(); + if (numOptions == 0) { + // No option is the same as "all". + opts.Reciprocals = "all"; + return true; + } + + // Pass through "all", "none", or "default" with an optional refinement step. + if (numOptions == 1) { + llvm::StringRef val = a->getValue(0); + size_t refStepLoc; + if (!getRefinementStep(val, *a, refStepLoc, diags)) + return false; + llvm::StringRef valBase = val.slice(0, refStepLoc); + if (valBase == "all" || valBase == "none" || valBase == "default") { + opts.Reciprocals = args.MakeArgString(val); + return true; + } + } + + // Each reciprocal type may be enabled or disabled individually. + // Check each input value for validity, concatenate them all back together, + // and pass through. 
+
+  llvm::StringMap<bool> optionStrings;
+  optionStrings.insert(std::make_pair("divd", false));
+  optionStrings.insert(std::make_pair("divf", false));
+  optionStrings.insert(std::make_pair("divh", false));
+  optionStrings.insert(std::make_pair("vec-divd", false));
+  optionStrings.insert(std::make_pair("vec-divf", false));
+  optionStrings.insert(std::make_pair("vec-divh", false));
+  optionStrings.insert(std::make_pair("sqrtd", false));
+  optionStrings.insert(std::make_pair("sqrtf", false));
+  optionStrings.insert(std::make_pair("sqrth", false));
+  optionStrings.insert(std::make_pair("vec-sqrtd", false));
+  optionStrings.insert(std::make_pair("vec-sqrtf", false));
+  optionStrings.insert(std::make_pair("vec-sqrth", false));
+
+  for (unsigned i = 0; i != numOptions; ++i) {
+    llvm::StringRef val = a->getValue(i);
+
+    bool isDisabled = val.starts_with(disabledPrefixIn);
+    // Ignore the disablement token for string matching.
+    if (isDisabled)
+      val = val.substr(1);
+
+    size_t refStep;
+    if (!getRefinementStep(val, *a, refStep, diags))
+      return false;
+
+    llvm::StringRef valBase = val.slice(0, refStep);
+    llvm::StringMap<bool>::iterator optionIter = optionStrings.find(valBase);
+    if (optionIter == optionStrings.end()) {
+      // Try again specifying float suffix.
+      optionIter = optionStrings.find(valBase.str() + 'f');
+      if (optionIter == optionStrings.end()) {
+        // The input name did not match any known option string.
+        diags.Report(clang::diag::err_drv_invalid_value)
+            << a->getOption().getName() << val;
+        return false;
+      }
+      // The option was specified without a half or float or double suffix.
+      // Make sure that the double or half entry was not already specified.
+      // The float entry will be checked below.
+      if (optionStrings[valBase.str() + 'd'] ||
+          optionStrings[valBase.str() + 'h']) {
----------------
eugeneepshteyn wrote:

Using operator `[]` on a map has a side-effect of creating an entry, if it's not already in the map. Is this intended here?

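For reference, the side effect raised in the comment above can be reproduced outside LLVM. The following is a minimal sketch, assuming `std::map<std::string, bool>` as a stand-in for `llvm::StringMap<bool>` (both default-construct a missing entry on subscript). `OptionMap`, `isSet` and `entriesAddedBySubscript` are hypothetical names used only for illustration; they are not part of the patch.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Assumption: std::map stands in for llvm::StringMap<bool> so that the
// sketch builds without LLVM headers.
using OptionMap = std::map<std::string, bool>;

// Read-only lookup: returns the stored flag, or false when the key is
// absent. The map is never modified.
bool isSet(const OptionMap &options, const std::string &key) {
  auto it = options.find(key);
  return it != options.end() && it->second;
}

// Returns how many entries a subscript lookup of `key` adds to the map:
// 1 when the key was absent, 0 when it was already present.
std::size_t entriesAddedBySubscript(OptionMap &options,
                                    const std::string &key) {
  const std::size_t before = options.size();
  (void)options[key]; // default-constructs a `false` entry if key is missing
  return options.size() - before;
}
```

In the quoted hunk, every base name that has an `f` entry also appears to have pre-inserted `d` and `h` entries, so the subscript there would probably not grow the map. A `find`-based read simply makes that independent of the map's initial contents.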
https://github.com/llvm/llvm-project/pull/142172 From flang-commits at lists.llvm.org Fri May 30 09:29:52 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 30 May 2025 09:29:52 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mrecip[=] (PR #142172) In-Reply-To: Message-ID: <6839dd00.170a0220.1bc561.5850@mx.google.com> ================ @@ -1251,6 +1251,164 @@ static bool parseIntegerOverflowArgs(CompilerInvocation &invoc, return true; } +/// This is a helper function for validating the optional refinement step +/// parameter in reciprocal argument strings. Return false if there is an error +/// parsing the refinement step. Otherwise, return true and set the Position +/// of the refinement step in the input string. +/// +/// \param [in] in The input string +/// \param [in] a The compiler invocation arguments to parse +/// \param [out] position The position of the refinement step in input string +/// \param [out] diags DiagnosticsEngine to report erros with +static bool getRefinementStep(llvm::StringRef in, const llvm::opt::Arg &a, + size_t &position, + clang::DiagnosticsEngine &diags) { + const char refinementStepToken = ':'; + position = in.find(refinementStepToken); + if (position != llvm::StringRef::npos) { + llvm::StringRef option = a.getOption().getName(); + llvm::StringRef refStep = in.substr(position + 1); + // Allow exactly one numeric character for the additional refinement + // step parameter. This is reasonable for all currently-supported + // operations and architectures because we would expect that a larger value + // of refinement steps would cause the estimate "optimization" to + // under-perform the native operation. Also, if the estimate does not + // converge quickly, it probably will not ever converge, so further + // refinement steps will not produce a better answer. 
+ if (refStep.size() != 1) { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + char refStepChar = refStep[0]; + if (refStepChar < '0' || refStepChar > '9') { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + } + return true; +} + +/// Parses all -mrecip= arguments and populates the +/// CompilerInvocation accordingly. Returns false if new errors are generated. +/// +/// \param [out] invoc Stores the processed arguments +/// \param [in] args The compiler invocation arguments to parse +/// \param [out] diags DiagnosticsEngine to report erros with +static bool parseMRecip(CompilerInvocation &invoc, llvm::opt::ArgList &args, + clang::DiagnosticsEngine &diags) { + llvm::StringRef disabledPrefixIn = "!"; + llvm::StringRef disabledPrefixOut = "!"; + llvm::StringRef enabledPrefixOut = ""; + llvm::StringRef out = ""; + Fortran::frontend::CodeGenOptions &opts = invoc.getCodeGenOpts(); + + const llvm::opt::Arg *a = + args.getLastArg(clang::driver::options::OPT_mrecip, + clang::driver::options::OPT_mrecip_EQ); + if (!a) + return true; + + unsigned numOptions = a->getNumValues(); + if (numOptions == 0) { + // No option is the same as "all". + opts.Reciprocals = "all"; + return true; + } + + // Pass through "all", "none", or "default" with an optional refinement step. + if (numOptions == 1) { + llvm::StringRef val = a->getValue(0); + size_t refStepLoc; + if (!getRefinementStep(val, *a, refStepLoc, diags)) + return false; + llvm::StringRef valBase = val.slice(0, refStepLoc); + if (valBase == "all" || valBase == "none" || valBase == "default") { + opts.Reciprocals = args.MakeArgString(val); + return true; + } + } + + // Each reciprocal type may be enabled or disabled individually. + // Check each input value for validity, concatenate them all back together, + // and pass through. 
+
+  llvm::StringMap<bool> optionStrings;
+  optionStrings.insert(std::make_pair("divd", false));
+  optionStrings.insert(std::make_pair("divf", false));
+  optionStrings.insert(std::make_pair("divh", false));
+  optionStrings.insert(std::make_pair("vec-divd", false));
+  optionStrings.insert(std::make_pair("vec-divf", false));
+  optionStrings.insert(std::make_pair("vec-divh", false));
+  optionStrings.insert(std::make_pair("sqrtd", false));
+  optionStrings.insert(std::make_pair("sqrtf", false));
+  optionStrings.insert(std::make_pair("sqrth", false));
+  optionStrings.insert(std::make_pair("vec-sqrtd", false));
+  optionStrings.insert(std::make_pair("vec-sqrtf", false));
+  optionStrings.insert(std::make_pair("vec-sqrth", false));
----------------
eugeneepshteyn wrote:

FYI, it may be cleaner to do something like this: `optionStrings["divd"] = false;`, etc.

https://github.com/llvm/llvm-project/pull/142172

From flang-commits at lists.llvm.org Fri May 30 09:33:32 2025
From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits)
Date: Fri, 30 May 2025 09:33:32 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mrecip[=] (PR #142172)
In-Reply-To:
Message-ID: <6839dddc.170a0220.e7414.6316@mx.google.com>

================
@@ -1251,6 +1251,164 @@ static bool parseIntegerOverflowArgs(CompilerInvocation &invoc,
   return true;
 }
+/// This is a helper function for validating the optional refinement step
+/// parameter in reciprocal argument strings. Return false if there is an error
+/// parsing the refinement step. Otherwise, return true and set the Position
+/// of the refinement step in the input string.
+/// +/// \param [in] in The input string +/// \param [in] a The compiler invocation arguments to parse +/// \param [out] position The position of the refinement step in input string +/// \param [out] diags DiagnosticsEngine to report erros with +static bool getRefinementStep(llvm::StringRef in, const llvm::opt::Arg &a, + size_t &position, + clang::DiagnosticsEngine &diags) { + const char refinementStepToken = ':'; + position = in.find(refinementStepToken); + if (position != llvm::StringRef::npos) { + llvm::StringRef option = a.getOption().getName(); + llvm::StringRef refStep = in.substr(position + 1); + // Allow exactly one numeric character for the additional refinement + // step parameter. This is reasonable for all currently-supported + // operations and architectures because we would expect that a larger value + // of refinement steps would cause the estimate "optimization" to + // under-perform the native operation. Also, if the estimate does not + // converge quickly, it probably will not ever converge, so further + // refinement steps will not produce a better answer. + if (refStep.size() != 1) { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + char refStepChar = refStep[0]; + if (refStepChar < '0' || refStepChar > '9') { + diags.Report(clang::diag::err_drv_invalid_value) << option << refStep; + return false; + } + } + return true; +} + +/// Parses all -mrecip= arguments and populates the +/// CompilerInvocation accordingly. Returns false if new errors are generated. 
+/// +/// \param [out] invoc Stores the processed arguments +/// \param [in] args The compiler invocation arguments to parse +/// \param [out] diags DiagnosticsEngine to report erros with +static bool parseMRecip(CompilerInvocation &invoc, llvm::opt::ArgList &args, + clang::DiagnosticsEngine &diags) { + llvm::StringRef disabledPrefixIn = "!"; + llvm::StringRef disabledPrefixOut = "!"; + llvm::StringRef enabledPrefixOut = ""; + llvm::StringRef out = ""; + Fortran::frontend::CodeGenOptions &opts = invoc.getCodeGenOpts(); + + const llvm::opt::Arg *a = + args.getLastArg(clang::driver::options::OPT_mrecip, + clang::driver::options::OPT_mrecip_EQ); + if (!a) + return true; + + unsigned numOptions = a->getNumValues(); + if (numOptions == 0) { + // No option is the same as "all". + opts.Reciprocals = "all"; + return true; + } + + // Pass through "all", "none", or "default" with an optional refinement step. + if (numOptions == 1) { + llvm::StringRef val = a->getValue(0); + size_t refStepLoc; + if (!getRefinementStep(val, *a, refStepLoc, diags)) + return false; + llvm::StringRef valBase = val.slice(0, refStepLoc); + if (valBase == "all" || valBase == "none" || valBase == "default") { + opts.Reciprocals = args.MakeArgString(val); + return true; + } + } + + // Each reciprocal type may be enabled or disabled individually. + // Check each input value for validity, concatenate them all back together, + // and pass through. 
+
+  llvm::StringMap<bool> optionStrings;
+  optionStrings.insert(std::make_pair("divd", false));
+  optionStrings.insert(std::make_pair("divf", false));
+  optionStrings.insert(std::make_pair("divh", false));
+  optionStrings.insert(std::make_pair("vec-divd", false));
+  optionStrings.insert(std::make_pair("vec-divf", false));
+  optionStrings.insert(std::make_pair("vec-divh", false));
+  optionStrings.insert(std::make_pair("sqrtd", false));
+  optionStrings.insert(std::make_pair("sqrtf", false));
+  optionStrings.insert(std::make_pair("sqrth", false));
+  optionStrings.insert(std::make_pair("vec-sqrtd", false));
+  optionStrings.insert(std::make_pair("vec-sqrtf", false));
+  optionStrings.insert(std::make_pair("vec-sqrth", false));
+
+  for (unsigned i = 0; i != numOptions; ++i) {
+    llvm::StringRef val = a->getValue(i);
+
+    bool isDisabled = val.starts_with(disabledPrefixIn);
+    // Ignore the disablement token for string matching.
+    if (isDisabled)
+      val = val.substr(1);
+
+    size_t refStep;
+    if (!getRefinementStep(val, *a, refStep, diags))
+      return false;
+
+    llvm::StringRef valBase = val.slice(0, refStep);
+    llvm::StringMap<bool>::iterator optionIter = optionStrings.find(valBase);
+    if (optionIter == optionStrings.end()) {
+      // Try again specifying float suffix.
+      optionIter = optionStrings.find(valBase.str() + 'f');
+      if (optionIter == optionStrings.end()) {
+        // The input name did not match any known option string.
+        diags.Report(clang::diag::err_drv_invalid_value)
+            << a->getOption().getName() << val;
+        return false;
+      }
+      // The option was specified without a half or float or double suffix.
+      // Make sure that the double or half entry was not already specified.
+      // The float entry will be checked below.
+      if (optionStrings[valBase.str() + 'd'] ||
+          optionStrings[valBase.str() + 'h']) {
+        diags.Report(clang::diag::err_drv_invalid_value)
+            << a->getOption().getName() << val;
+        return false;
+      }
+    }
+
+    if (optionIter->second == true) {
+      // Duplicate option specified.
+ diags.Report(clang::diag::err_drv_invalid_value) + << a->getOption().getName() << val; + return false; + } + + // Mark the matched option as found. Do not allow duplicate specifiers. + optionIter->second = true; + + // If the precision was not specified, also mark the double and half entry + // as found. + if (valBase.back() != 'f' && valBase.back() != 'd' && + valBase.back() != 'h') { + optionStrings[valBase.str() + 'd'] = true; + optionStrings[valBase.str() + 'h'] = true; + } + + // Build the output string. + llvm::StringRef prefix = isDisabled ? disabledPrefixOut : enabledPrefixOut; + out = args.MakeArgString(out + prefix + val); + if (i != numOptions - 1) + out = args.MakeArgString(out + ","); + } + + opts.Reciprocals = args.MakeArgString(out); // Handle the rest. ---------------- eugeneepshteyn wrote: I haven't found any issues with the string memory ownership, but just to be sure maybe run this under valgrind. https://github.com/llvm/llvm-project/pull/142172 From flang-commits at lists.llvm.org Fri May 30 10:49:19 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 30 May 2025 10:49:19 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6839ef9f.170a0220.26b8bd.7ffb@mx.google.com> ================ @@ -777,5 +777,22 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Check if expr is same as x, or a sequence of Convert operations on x. 
+bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x);
+
+/// Strip away any top-level Convert operations (if any exist) and return
+/// the input value. A ComplexConstructor(x, 0) is also considered as a
+/// convert operation.
+/// If the input is not Operation, Designator, FunctionRef or Constant,
+/// is returns std::nullopt.
+MaybeExpr GetConvertInput(const SomeExpr &x);
+
----------------
kparzysz wrote:

Done. I put the operation codes in an "operation" namespace in tools.h, and the implementations of the common functions in tools.cpp.

https://github.com/llvm/llvm-project/pull/137852

From flang-commits at lists.llvm.org Fri May 30 10:15:23 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Fri, 30 May 2025 10:15:23 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][OpenMP] Explicitly set Shared DSA in symbols (PR #142154)
In-Reply-To:
Message-ID: <6839e7ab.050a0220.23c765.7f5c@mx.google.com>

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/142154

From flang-commits at lists.llvm.org Fri May 30 10:15:23 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Fri, 30 May 2025 10:15:23 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][OpenMP] Explicitly set Shared DSA in symbols (PR #142154)
In-Reply-To:
Message-ID: <6839e7ab.170a0220.1118ef.71f5@mx.google.com>

https://github.com/tblah approved this pull request.

LGTM, the approach seems sensible and it is okay with me. Do you plan to follow up with the lowering changes required to clean this up?

https://github.com/llvm/llvm-project/pull/142154

From flang-commits at lists.llvm.org Fri May 30 10:15:24 2025
From: flang-commits at lists.llvm.org (Tom Eccles via flang-commits)
Date: Fri, 30 May 2025 10:15:24 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][OpenMP] Explicitly set Shared DSA in symbols (PR #142154)
In-Reply-To:
Message-ID: <6839e7ac.630a0220.319794.3dea@mx.google.com>

================
@@ -43,4 +45,14 @@ program omp_copyprivate
   print *, a, b
+  !$omp task
+  !$omp parallel private(c, d)
+  allocate(c(5))
+  allocate(d(10))
+  !$omp single
+  c = 22
+  d = 33
+  !$omp end single copyprivate(c, d)
----------------
tblah wrote:

nit: please could you add a comment explaining that this is checking the error is not generated in this case to see that c and d inherit PRIVATE DSA from the enclosing PARALLEL.

https://github.com/llvm/llvm-project/pull/142154

From flang-commits at lists.llvm.org Fri May 30 10:18:13 2025
From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits)
Date: Fri, 30 May 2025 10:18:13 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][runtime] Replace recursion with iterative work queue (PR #137727)
In-Reply-To:
Message-ID: <6839e855.050a0220.175b01.886f@mx.google.com>

https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/137727

>From 61187093aa87f00484d9ddc486c415255092a919 Mon Sep 17 00:00:00 2001
From: Peter Klausler
Date: Wed, 23 Apr 2025 14:44:23 -0700
Subject: [PATCH] [flang][runtime] Replace recursion with iterative work queue

Recursion, both direct and indirect, prevents accurate stack size
calculation at link time for GPU device code. Restructure these
recursive (often mutually so) routines in the Fortran runtime
with new implementations based on an iterative work queue with
suspendable/resumable work tickets: Assign, Initialize, initializeClone,
Finalize, Destroy, and DescriptorIO.

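The general idea in the paragraph above, trading call-stack recursion for an explicit queue of resumable work items, can be sketched in a few lines. This is only an illustration under stated assumptions: `Node`, `Ticket` and `sumIteratively` are hypothetical names and do not reflect the actual flang-rt interface added in `work-queue.h`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical nested value: a payload plus child components, standing in
// for a derived type with nested components.
struct Node {
  int value{0};
  std::vector<Node> children;
};

// A work ticket remembers which node it is processing and how far it got,
// so it can be suspended while a child is handled and resumed afterwards.
struct Ticket {
  const Node *node;
  std::size_t nextChild{0};
};

// Sums all values without recursion. Pending tickets live in an explicit
// container, so the call-stack depth stays constant regardless of nesting.
int sumIteratively(const Node &root) {
  int total = 0;
  std::vector<Ticket> queue;
  queue.push_back(Ticket{&root});
  while (!queue.empty()) {
    Ticket &ticket = queue.back();
    if (ticket.nextChild == 0)
      total += ticket.node->value; // first visit: handle the node itself
    if (ticket.nextChild < ticket.node->children.size()) {
      // Suspend this ticket and enqueue a new ticket for the next child.
      const Node *child = &ticket.node->children[ticket.nextChild++];
      queue.push_back(Ticket{child}); // `ticket` must not be used after this
    } else {
      queue.pop_back(); // all children done: the ticket is complete
    }
  }
  return total;
}
```

The explicit container makes the maximum amount of pending work a data-dependent quantity rather than a stack-depth one, which is the property the commit message is after for GPU device code.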
Note that derived type FINAL subroutine calls, defined assignments,
and defined I/O procedures all perform callbacks into user code,
which may well reenter the runtime library. This kind of recursion
is not handled by this change, although it may be possible to do so
in the future using thread-local work queues.

The effects of this restructuring on CPU performance are yet to be
measured. There is a fast(?) mode in the work queue implementation
that causes new work items to be executed to completion immediately
upon creation, saving the overhead of actually representing and
managing the work queue. This mode can't be used on GPU devices,
but it is enabled by default for CPU hosts. It can be disabled easily
for debugging and performance testing.
---
 .../include/flang-rt/runtime/environment.h    |   3 +
 flang-rt/include/flang-rt/runtime/stat.h      |  10 +-
 flang-rt/include/flang-rt/runtime/type-info.h |   2 +
 .../include/flang-rt/runtime/work-queue.h     | 548 +++++++++++++++
 flang-rt/lib/runtime/CMakeLists.txt           |   2 +
 flang-rt/lib/runtime/assign.cpp               | 622 +++++++++++------
 flang-rt/lib/runtime/derived.cpp              | 517 +++++++-------
 flang-rt/lib/runtime/descriptor-io.cpp        | 651 +++++++++++++++++-
 flang-rt/lib/runtime/descriptor-io.h          | 620 +----------------
 flang-rt/lib/runtime/environment.cpp          |   4 +
 flang-rt/lib/runtime/namelist.cpp             |   1 +
 flang-rt/lib/runtime/tools.cpp                |   4 +-
 flang-rt/lib/runtime/type-info.cpp            |   6 +-
 flang-rt/lib/runtime/work-queue.cpp           | 161 +++++
 flang-rt/unittests/Runtime/ExternalIOTest.cpp |   2 +-
 flang/docs/Extensions.md                      |  10 +
 flang/include/flang/Runtime/assign.h          |   2 +-
 flang/include/flang/Semantics/tools.h         |   7 +-
 flang/lib/Semantics/runtime-type-info.cpp     |   4 +
 flang/lib/Semantics/tools.cpp                 |  32 +
 flang/module/__fortran_type_info.f90          |   3 +-
 flang/test/Lower/volatile-openmp.f90          |   8 +-
 flang/test/Semantics/typeinfo01.f90           |  26 +-
 flang/test/Semantics/typeinfo03.f90           |   2 +-
 flang/test/Semantics/typeinfo04.f90           |   8 +-
 flang/test/Semantics/typeinfo05.f90           |   4 +-
 flang/test/Semantics/typeinfo06.f90           |   4 +-
 flang/test/Semantics/typeinfo07.f90           |   8 +-
 flang/test/Semantics/typeinfo08.f90           |   2 +-
 flang/test/Semantics/typeinfo11.f90           |   2 +-
 flang/test/Semantics/typeinfo12.f90           |  67 ++
 31 files changed, 2225 insertions(+), 1117 deletions(-)
 create mode 100644 flang-rt/include/flang-rt/runtime/work-queue.h
 create mode 100644 flang-rt/lib/runtime/work-queue.cpp
 create mode 100644 flang/test/Semantics/typeinfo12.f90

diff --git a/flang-rt/include/flang-rt/runtime/environment.h b/flang-rt/include/flang-rt/runtime/environment.h
index 16258b3bbba9b..e579f6012ce86 100644
--- a/flang-rt/include/flang-rt/runtime/environment.h
+++ b/flang-rt/include/flang-rt/runtime/environment.h
@@ -64,6 +64,9 @@ struct ExecutionEnvironment {
   bool defaultUTF8{false}; // DEFAULT_UTF8
   bool checkPointerDeallocation{true}; // FORT_CHECK_POINTER_DEALLOCATION
 
+  enum InternalDebugging { WorkQueue = 1 };
+  int internalDebugging{0}; // FLANG_RT_DEBUG
+
   // CUDA related variables
   std::size_t cudaStackLimit{0}; // ACC_OFFLOAD_STACK_SIZE
   bool cudaDeviceIsManaged{false}; // NV_CUDAFOR_DEVICE_IS_MANAGED
diff --git a/flang-rt/include/flang-rt/runtime/stat.h b/flang-rt/include/flang-rt/runtime/stat.h
index 070d0bf8673fb..dc372de53506a 100644
--- a/flang-rt/include/flang-rt/runtime/stat.h
+++ b/flang-rt/include/flang-rt/runtime/stat.h
@@ -24,7 +24,7 @@ class Terminator;
 enum Stat {
   StatOk = 0, // required to be zero by Fortran
 
-  // Interoperable STAT= codes
+  // Interoperable STAT= codes (>= 11)
   StatBaseNull = CFI_ERROR_BASE_ADDR_NULL,
   StatBaseNotNull = CFI_ERROR_BASE_ADDR_NOT_NULL,
   StatInvalidElemLen = CFI_INVALID_ELEM_LEN,
@@ -36,7 +36,7 @@ enum Stat {
   StatMemAllocation = CFI_ERROR_MEM_ALLOCATION,
   StatOutOfBounds = CFI_ERROR_OUT_OF_BOUNDS,
 
-  // Standard STAT= values
+  // Standard STAT= values (>= 101)
   StatFailedImage = FORTRAN_RUNTIME_STAT_FAILED_IMAGE,
   StatLocked = FORTRAN_RUNTIME_STAT_LOCKED,
   StatLockedOtherImage = FORTRAN_RUNTIME_STAT_LOCKED_OTHER_IMAGE,
@@ -49,10 +49,14 @@
enum Stat { // Additional "processor-defined" STAT= values StatInvalidArgumentNumber = FORTRAN_RUNTIME_STAT_INVALID_ARG_NUMBER, StatMissingArgument = FORTRAN_RUNTIME_STAT_MISSING_ARG, - StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, + StatValueTooShort = FORTRAN_RUNTIME_STAT_VALUE_TOO_SHORT, // -1 StatMoveAllocSameAllocatable = FORTRAN_RUNTIME_STAT_MOVE_ALLOC_SAME_ALLOCATABLE, StatBadPointerDeallocation = FORTRAN_RUNTIME_STAT_BAD_POINTER_DEALLOCATION, + + // Dummy status for work queue continuation, declared here to perhaps + // avoid collisions + StatContinue = 201 }; RT_API_ATTRS const char *StatErrorString(int); diff --git a/flang-rt/include/flang-rt/runtime/type-info.h b/flang-rt/include/flang-rt/runtime/type-info.h index 5e79efde164f2..9bde3adba87f5 100644 --- a/flang-rt/include/flang-rt/runtime/type-info.h +++ b/flang-rt/include/flang-rt/runtime/type-info.h @@ -240,6 +240,7 @@ class DerivedType { RT_API_ATTRS bool noFinalizationNeeded() const { return noFinalizationNeeded_; } + RT_API_ATTRS bool noDefinedAssignment() const { return noDefinedAssignment_; } RT_API_ATTRS std::size_t LenParameters() const { return lenParameterKind().Elements(); @@ -322,6 +323,7 @@ class DerivedType { bool noInitializationNeeded_{false}; bool noDestructionNeeded_{false}; bool noFinalizationNeeded_{false}; + bool noDefinedAssignment_{false}; }; } // namespace Fortran::runtime::typeInfo diff --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h new file mode 100644 index 0000000000000..878b18373e1d2 --- /dev/null +++ b/flang-rt/include/flang-rt/runtime/work-queue.h @@ -0,0 +1,548 @@ +//===-- include/flang-rt/runtime/work-queue.h -------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +// Internal runtime utilities for work queues that replace the use of recursion +// for better GPU device support. +// +// A work queue comprises a list of tickets. Each ticket class has a Begin() +// member function, which is called once, and a Continue() member function +// that can be called zero or more times. A ticket's execution terminates +// when either of these member functions returns a status other than +// StatContinue. When that status is not StatOk, then the whole queue +// is shut down. +// +// By returning StatContinue from its Continue() member function, +// a ticket suspends its execution so that any nested tickets that it +// may have created can be run to completion. It is the responsibility +// of each ticket class to maintain resumption information in its state +// and manage its own progress. Most ticket classes inherit from +// class ComponentsOverElements, which implements an outer loop over all +// components of a derived type, and an inner loop over all elements +// of a descriptor, possibly with multiple phases of execution per element. +// +// Tickets are created by WorkQueue::Begin...() member functions. +// There is one of these for each "top level" recursive function in the +// Fortran runtime support library that has been restructured into this +// ticket framework. +// +// When the work queue is running tickets, it always selects the last ticket +// on the list for execution -- "work stack" might have been a more accurate +// name for this framework. This ticket may, while doing its job, create +// new tickets, and since those are pushed after the active one, the first +// such nested ticket will be the next one executed to completion -- i.e., +// the order of nested WorkQueue::Begin...() calls is respected. 
+// Note that a ticket's Continue() member function won't be called again +// until all nested tickets have run to completion and it is once again +// the last ticket on the queue. +// +// Example for an assignment to a derived type: +// 1. Assign() is called, and its work queue is created. It calls +// WorkQueue::BeginAssign() and then WorkQueue::Run(). +// 2. Run calls AssignTicket::Begin(), which pushes a ticket via +// BeginFinalize() and returns StatContinue. +// 3. FinalizeTicket::Begin() and FinalizeTicket::Continue() are called +// until one of them returns StatOk, which ends the finalization ticket. +// 4. AssignTicket::Continue() is then called; it creates a DerivedAssignTicket +// and then returns StatOk, which ends the ticket. +// 5. At this point, only one ticket remains. DerivedAssignTicket::Begin() +// and ::Continue() are called until they are done (not StatContinue). +// Along the way, it may create nested AssignTickets for components, +// and suspend itself so that they may each run to completion. 
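The ticket protocol described in the header comment above (Begin() once, then Continue() until a status other than StatContinue comes back) can be sketched in isolation. This is a simplified illustration rather than code from the patch: `CountdownTicket` and `RunToCompletion` are hypothetical names, and the constants are only assumed to mirror `StatOk` and `StatContinue`.

```cpp
#include <cassert>

// Assumed stand-ins for the runtime's StatOk and StatContinue.
constexpr int kOk = 0;
constexpr int kContinue = 201;

// Hypothetical ticket: it keeps its own resumption state (remaining_) so
// that each Continue() call performs one step and then yields.
class CountdownTicket {
public:
  explicit CountdownTicket(int steps) : remaining_{steps} {}
  int Begin() { return remaining_ > 0 ? kContinue : kOk; }
  int Continue() {
    assert(remaining_ > 0); // only called while work remains
    ++stepsRun_;
    return --remaining_ > 0 ? kContinue : kOk;
  }
  int stepsRun() const { return stepsRun_; }

private:
  int remaining_;
  int stepsRun_{0};
};

// Drives one ticket to completion: Begin() once, then Continue() for as
// long as the ticket reports kContinue. Returns the final status.
template <typename TICKET> int RunToCompletion(TICKET &ticket) {
  int status{ticket.Begin()};
  while (status == kContinue) {
    status = ticket.Continue();
  }
  return status;
}
```

A ticket constructed with three steps runs Continue() three times before reporting completion. A ticket with no work finishes in Begin() and is never continued.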
+ +#ifndef FLANG_RT_RUNTIME_WORK_QUEUE_H_ +#define FLANG_RT_RUNTIME_WORK_QUEUE_H_ + +#include "flang-rt/runtime/connection.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/stat.h" +#include "flang-rt/runtime/type-info.h" +#include "flang/Common/api-attrs.h" +#include "flang/Runtime/freestanding-tools.h" +#include <variant> + +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io + +namespace Fortran::runtime { +class Terminator; +class WorkQueue; + +// Ticket worker base classes + +template <typename TICKET> class ImmediateTicketRunner { +public: + RT_API_ATTRS explicit ImmediateTicketRunner(TICKET &ticket) + : ticket_{ticket} {} + RT_API_ATTRS int Run(WorkQueue &workQueue) { + int status{ticket_.Begin(workQueue)}; + while (status == StatContinue) { + status = ticket_.Continue(workQueue); + } + return status; + } + +private: + TICKET &ticket_; +}; + +// Base class for ticket workers that operate elementwise over descriptors +class Elementwise { +protected: + RT_API_ATTRS Elementwise( + const Descriptor &instance, const Descriptor *from = nullptr) + : instance_{instance}, from_{from} { + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + RT_API_ATTRS bool IsComplete() const { return elementAt_ >= elements_; } + RT_API_ATTRS void Advance() { + ++elementAt_; + instance_.IncrementSubscripts(subscripts_); + if (from_) { + from_->IncrementSubscripts(fromSubscripts_); + } + } + RT_API_ATTRS void SkipToEnd() { elementAt_ = elements_; } + RT_API_ATTRS void Reset() { + elementAt_ = 0; + instance_.GetLowerBounds(subscripts_); + if (from_) { + from_->GetLowerBounds(fromSubscripts_); + } + } + + const Descriptor &instance_, *from_{nullptr}; + std::size_t elements_{instance_.Elements()}; + std::size_t elementAt_{0}; + SubscriptValue subscripts_[common::maxRank]; + SubscriptValue fromSubscripts_[common::maxRank]; +}; + +// Base class for ticket workers that
operate over derived type components. +class Componentwise { +protected: + RT_API_ATTRS Componentwise(const typeInfo::DerivedType &); + RT_API_ATTRS bool IsComplete() const { return componentAt_ >= components_; } + RT_API_ATTRS void Advance() { + ++componentAt_; + GetComponent(); + } + RT_API_ATTRS void SkipToEnd() { + component_ = nullptr; + componentAt_ = components_; + } + RT_API_ATTRS void Reset() { + component_ = nullptr; + componentAt_ = 0; + GetComponent(); + } + RT_API_ATTRS void GetComponent(); + + const typeInfo::DerivedType &derived_; + std::size_t components_{0}, componentAt_{0}; + const typeInfo::Component *component_{nullptr}; + StaticDescriptor componentDescriptor_; +}; + +// Base class for ticket workers that operate over derived type components +// in an outer loop, and elements in an inner loop. +class ComponentsOverElements : protected Componentwise, protected Elementwise { +protected: + RT_API_ATTRS ComponentsOverElements(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Componentwise{derived}, Elementwise{instance, from} { + if (Elementwise::IsComplete()) { + Componentwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Componentwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextElement(); + if (Elementwise::IsComplete()) { + Elementwise::Reset(); + Componentwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Elementwise::Advance(); + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Advance(); + } + RT_API_ATTRS void Reset() { + phase_ = 0; + Elementwise::Reset(); + Componentwise::Reset(); + } + + int phase_{0}; +}; + +// Base class for ticket workers that operate over elements in an outer loop, +// type components in an inner loop. 
+class ElementsOverComponents : protected Elementwise, protected Componentwise { +protected: + RT_API_ATTRS ElementsOverComponents(const Descriptor &instance, + const typeInfo::DerivedType &derived, const Descriptor *from = nullptr) + : Elementwise{instance, from}, Componentwise{derived} { + if (Componentwise::IsComplete()) { + Elementwise::SkipToEnd(); + } + } + RT_API_ATTRS bool IsComplete() const { return Elementwise::IsComplete(); } + RT_API_ATTRS void Advance() { + SkipToNextComponent(); + if (Componentwise::IsComplete()) { + Componentwise::Reset(); + Elementwise::Advance(); + } + } + RT_API_ATTRS void SkipToNextComponent() { + phase_ = 0; + Componentwise::Advance(); + } + RT_API_ATTRS void SkipToNextElement() { + phase_ = 0; + Componentwise::Reset(); + Elementwise::Advance(); + } + + int phase_{0}; +}; + +// Ticket worker classes + +// Implements derived type instance initialization +class InitializeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); +}; + +// Initializes one derived type instance from the value of another +class InitializeCloneTicket + : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS InitializeCloneTicket(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{original, derived}, clone_{clone}, + hasStat_{hasStat}, errMsg_{errMsg} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const Descriptor &clone_; + bool hasStat_{false}; + const Descriptor *errMsg_{nullptr}; + StaticDescriptor 
cloneComponentDescriptor_; +}; + +// Implements derived type instance finalization +class FinalizeTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS FinalizeTicket( + const Descriptor &instance, const typeInfo::DerivedType &derived) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + const typeInfo::DerivedType *finalizableParentType_{nullptr}; +}; + +// Implements derived type instance destruction +class DestroyTicket : public ImmediateTicketRunner, + private ComponentsOverElements { +public: + RT_API_ATTRS DestroyTicket(const Descriptor &instance, + const typeInfo::DerivedType &derived, bool finalize) + : ImmediateTicketRunner{*this}, + ComponentsOverElements{instance, derived}, finalize_{finalize} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + bool finalize_{false}; +}; + +// Implements general intrinsic assignment +class AssignTicket : public ImmediateTicketRunner { +public: + RT_API_ATTRS AssignTicket( + Descriptor &to, const Descriptor &from, int flags, MemmoveFct memmoveFct) + : ImmediateTicketRunner{*this}, to_{to}, from_{&from}, + flags_{flags}, memmoveFct_{memmoveFct} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + RT_API_ATTRS bool IsSimpleMemmove() const { + return !toDerived_ && to_.rank() == from_->rank() && to_.IsContiguous() && + from_->IsContiguous() && to_.ElementBytes() == from_->ElementBytes(); + } + RT_API_ATTRS Descriptor &GetTempDescriptor(); + + Descriptor &to_; + const Descriptor *from_{nullptr}; + int flags_{0}; // enum AssignFlags + MemmoveFct memmoveFct_{nullptr}; + StaticDescriptor tempDescriptor_; + const typeInfo::DerivedType *toDerived_{nullptr}; + Descriptor *toDeallocate_{nullptr}; + bool persist_{false}; + bool done_{false}; +}; + +// Implements derived type 
intrinsic assignment. +template +class DerivedAssignTicket + : public ImmediateTicketRunner>, + private std::conditional_t { +public: + using Base = std::conditional_t; + RT_API_ATTRS DerivedAssignTicket(const Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) + : ImmediateTicketRunner{*this}, + Base{to, derived, &from}, flags_{flags}, memmoveFct_{memmoveFct}, + deallocateAfter_{deallocateAfter} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + +private: + static constexpr bool isComponentwise_{IS_COMPONENTWISE}; + bool toIsContiguous_{this->instance_.IsContiguous()}; + bool fromIsContiguous_{this->from_->IsContiguous()}; + int flags_{0}; + MemmoveFct memmoveFct_{nullptr}; + Descriptor *deallocateAfter_{nullptr}; + StaticDescriptor fromComponentDescriptor_; +}; + +namespace io::descr { + +template +class DescriptorIoTicket + : public ImmediateTicketRunner>, + private Elementwise { +public: + RT_API_ATTRS DescriptorIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + Elementwise{descriptor}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &); + RT_API_ATTRS int Continue(WorkQueue &); + RT_API_ATTRS bool &anyIoTookPlace() { return anyIoTookPlace_; } + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; + common::optional nonTbpSpecial_; + const typeInfo::DerivedType *derived_{nullptr}; + const typeInfo::SpecialBinding *special_{nullptr}; + StaticDescriptor elementDescriptor_; +}; + +template +class DerivedIoTicket : public ImmediateTicketRunner>, + private ElementsOverComponents { +public: + RT_API_ATTRS DerivedIoTicket(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + 
const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) + : ImmediateTicketRunner(*this), + ElementsOverComponents{descriptor, derived}, io_{io}, table_{table}, + anyIoTookPlace_{anyIoTookPlace} {} + RT_API_ATTRS int Begin(WorkQueue &) { return StatContinue; } + RT_API_ATTRS int Continue(WorkQueue &); + +private: + io::IoStatementState &io_; + const io::NonTbpDefinedIoTable *table_{nullptr}; + bool &anyIoTookPlace_; +}; + +} // namespace io::descr + +struct NullTicket { + RT_API_ATTRS int Begin(WorkQueue &) const { return StatOk; } + RT_API_ATTRS int Continue(WorkQueue &) const { return StatOk; } +}; + +struct Ticket { + RT_API_ATTRS int Continue(WorkQueue &); + bool begun{false}; + std::variant, + DerivedAssignTicket, + io::descr::DescriptorIoTicket, + io::descr::DescriptorIoTicket, + io::descr::DerivedIoTicket, + io::descr::DerivedIoTicket> + u; +}; + +class WorkQueue { +public: + RT_API_ATTRS explicit WorkQueue(Terminator &terminator) + : terminator_{terminator} { + for (int j{1}; j < numStatic_; ++j) { + static_[j].previous = &static_[j - 1]; + static_[j - 1].next = &static_[j]; + } + } + RT_API_ATTRS ~WorkQueue(); + RT_API_ATTRS Terminator &terminator() { return terminator_; }; + + // APIs for particular tasks. These can return StatOk if the work is + // completed immediately. 
+ RT_API_ATTRS int BeginInitialize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return InitializeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginInitializeClone(const Descriptor &clone, + const Descriptor &original, const typeInfo::DerivedType &derived, + bool hasStat, const Descriptor *errMsg) { + if (runTicketsImmediately_) { + return InitializeCloneTicket{clone, original, derived, hasStat, errMsg} + .Run(*this); + } else { + StartTicket().u.emplace( + clone, original, derived, hasStat, errMsg); + return StatContinue; + } + } + RT_API_ATTRS int BeginFinalize( + const Descriptor &descriptor, const typeInfo::DerivedType &derived) { + if (runTicketsImmediately_) { + return FinalizeTicket{descriptor, derived}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived); + return StatContinue; + } + } + RT_API_ATTRS int BeginDestroy(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, bool finalize) { + if (runTicketsImmediately_) { + return DestroyTicket{descriptor, derived, finalize}.Run(*this); + } else { + StartTicket().u.emplace(descriptor, derived, finalize); + return StatContinue; + } + } + RT_API_ATTRS int BeginAssign(Descriptor &to, const Descriptor &from, + int flags, MemmoveFct memmoveFct) { + if (runTicketsImmediately_) { + return AssignTicket{to, from, flags, memmoveFct}.Run(*this); + } else { + StartTicket().u.emplace(to, from, flags, memmoveFct); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedAssign(Descriptor &to, const Descriptor &from, + const typeInfo::DerivedType &derived, int flags, MemmoveFct memmoveFct, + Descriptor *deallocateAfter) { + if (runTicketsImmediately_) { + return DerivedAssignTicket{ + to, from, derived, flags, memmoveFct, deallocateAfter} + .Run(*this); + } else { + StartTicket().u.emplace>( + to, from, derived, 
flags, memmoveFct, deallocateAfter); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDescriptorIo(io::IoStatementState &io, + const Descriptor &descriptor, const io::NonTbpDefinedIoTable *table, + bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DescriptorIoTicket{ + io, descriptor, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, table, anyIoTookPlace); + return StatContinue; + } + } + template + RT_API_ATTRS int BeginDerivedIo(io::IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, + const io::NonTbpDefinedIoTable *table, bool &anyIoTookPlace) { + if (runTicketsImmediately_) { + return io::descr::DerivedIoTicket{ + io, descriptor, derived, table, anyIoTookPlace} + .Run(*this); + } else { + StartTicket().u.emplace>( + io, descriptor, derived, table, anyIoTookPlace); + return StatContinue; + } + } + + RT_API_ATTRS int Run(); + +private: +#if RT_DEVICE_COMPILATION + // Always use the work queue on a GPU device to avoid recursion. + static constexpr bool runTicketsImmediately_{false}; +#else + // Avoid the work queue overhead on the host, unless it needs + // debugging, which is so much easier there. + static constexpr bool runTicketsImmediately_{true}; +#endif + + // Most uses of the work queue won't go very deep. 
+ static constexpr int numStatic_{2}; + + struct TicketList { + bool isStatic{true}; + Ticket ticket; + TicketList *previous{nullptr}, *next{nullptr}; + }; + + RT_API_ATTRS Ticket &StartTicket(); + RT_API_ATTRS void Stop(); + + Terminator &terminator_; + TicketList *first_{nullptr}, *last_{nullptr}, *insertAfter_{nullptr}; + TicketList static_[numStatic_]; + TicketList *firstFree_{static_}; +}; + +} // namespace Fortran::runtime +#endif // FLANG_RT_RUNTIME_WORK_QUEUE_H_ diff --git a/flang-rt/lib/runtime/CMakeLists.txt b/flang-rt/lib/runtime/CMakeLists.txt index a3f63b4315644..332c0872e065f 100644 --- a/flang-rt/lib/runtime/CMakeLists.txt +++ b/flang-rt/lib/runtime/CMakeLists.txt @@ -68,6 +68,7 @@ set(supported_sources type-info.cpp unit.cpp utf.cpp + work-queue.cpp ) # List of source not used for GPU offloading. @@ -131,6 +132,7 @@ set(gpu_sources type-code.cpp type-info.cpp utf.cpp + work-queue.cpp complex-powi.cpp reduce.cpp reduction.cpp diff --git a/flang-rt/lib/runtime/assign.cpp b/flang-rt/lib/runtime/assign.cpp index 86aeeaa88f2d1..41b130cc8f257 100644 --- a/flang-rt/lib/runtime/assign.cpp +++ b/flang-rt/lib/runtime/assign.cpp @@ -14,6 +14,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -102,11 +103,7 @@ static RT_API_ATTRS int AllocateAssignmentLHS( toDim.SetByteStride(stride); stride *= toDim.Extent(); } - int result{ReturnError(terminator, to.Allocate(kNoAsyncObject))}; - if (result == StatOk && derived && !derived->noInitializationNeeded()) { - result = ReturnError(terminator, Initialize(to, *derived, terminator)); - } - return result; + return ReturnError(terminator, to.Allocate(kNoAsyncObject)); } // least <= 0, most >= 0 @@ -231,6 +228,8 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, } } +RT_OFFLOAD_API_GROUP_BEGIN + // Common implementation of assignments, both intrinsic 
assignments and // those cases of polymorphic user-defined ASSIGNMENT(=) TBPs that could not // be resolved in semantics. Most assignment statements do not need any @@ -244,274 +243,453 @@ static RT_API_ATTRS void BlankPadCharacterAssignment(Descriptor &to, // dealing with array constructors. RT_API_ATTRS void Assign(Descriptor &to, const Descriptor &from, Terminator &terminator, int flags, MemmoveFct memmoveFct) { - bool mustDeallocateLHS{(flags & DeallocateLHS) || - MustDeallocateLHS(to, from, terminator, flags)}; - DescriptorAddendum *toAddendum{to.Addendum()}; - const typeInfo::DerivedType *toDerived{ - toAddendum ? toAddendum->derivedType() : nullptr}; - if (toDerived && (flags & NeedFinalization) && - toDerived->noFinalizationNeeded()) { - flags &= ~NeedFinalization; - } - std::size_t toElementBytes{to.ElementBytes()}; - std::size_t fromElementBytes{from.ElementBytes()}; - // The following lambda definition violates the conding style, - // but cuda-11.8 nvcc hits an internal error with the brace initialization. - auto isSimpleMemmove = [&]() { - return !toDerived && to.rank() == from.rank() && to.IsContiguous() && - from.IsContiguous() && toElementBytes == fromElementBytes; - }; - StaticDescriptor deferredDeallocStatDesc; - Descriptor *deferDeallocation{nullptr}; - if (MayAlias(to, from)) { + WorkQueue workQueue{terminator}; + if (workQueue.BeginAssign(to, from, flags, memmoveFct) == StatContinue) { + workQueue.Run(); + } +} + +RT_API_ATTRS int AssignTicket::Begin(WorkQueue &workQueue) { + bool mustDeallocateLHS{(flags_ & DeallocateLHS) || + MustDeallocateLHS(to_, *from_, workQueue.terminator(), flags_)}; + DescriptorAddendum *toAddendum{to_.Addendum()}; + toDerived_ = toAddendum ? 
toAddendum->derivedType() : nullptr; + if (toDerived_ && (flags_ & NeedFinalization) && + toDerived_->noFinalizationNeeded()) { + flags_ &= ~NeedFinalization; + } + if (MayAlias(to_, *from_)) { if (mustDeallocateLHS) { - deferDeallocation = &deferredDeallocStatDesc.descriptor(); + // Convert the LHS into a temporary, then make it look deallocated. + toDeallocate_ = &tempDescriptor_.descriptor(); + persist_ = true; // tempDescriptor_ state must outlive child tickets std::memcpy( - reinterpret_cast(deferDeallocation), &to, to.SizeInBytes()); - to.set_base_addr(nullptr); - } else if (!isSimpleMemmove()) { + reinterpret_cast(toDeallocate_), &to_, to_.SizeInBytes()); + to_.set_base_addr(nullptr); + if (toDerived_ && (flags_ & NeedFinalization)) { + if (int status{workQueue.BeginFinalize(*toDeallocate_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + flags_ &= ~NeedFinalization; + } + } else if (!IsSimpleMemmove()) { // Handle LHS/RHS aliasing by copying RHS into a temp, then // recursively assigning from that temp. - auto descBytes{from.SizeInBytes()}; - StaticDescriptor staticDesc; - Descriptor &newFrom{staticDesc.descriptor()}; - std::memcpy(reinterpret_cast(&newFrom), &from, descBytes); + auto descBytes{from_->SizeInBytes()}; + Descriptor &newFrom{tempDescriptor_.descriptor()}; + persist_ = true; // tempDescriptor_ state must outlive child tickets + std::memcpy(reinterpret_cast(&newFrom), from_, descBytes); // Pretend the temporary descriptor is for an ALLOCATABLE // entity, otherwise, the Deallocate() below will not // free the descriptor memory. newFrom.raw().attribute = CFI_attribute_allocatable; - auto stat{ReturnError(terminator, newFrom.Allocate(kNoAsyncObject))}; - if (stat == StatOk) { - if (HasDynamicComponent(from)) { - // If 'from' has allocatable/automatic component, we cannot - // just make a shallow copy of the descriptor member. - // This will still leave data overlap in 'to' and 'newFrom'. 
- // For example: - // type t - // character, allocatable :: c(:) - // end type t - // type(t) :: x(3) - // x(2:3) = x(1:2) - // We have to make a deep copy into 'newFrom' in this case. - RTNAME(AssignTemporary) - (newFrom, from, terminator.sourceFileName(), terminator.sourceLine()); - } else { - ShallowCopy(newFrom, from, true, from.IsContiguous()); + if (int stat{ReturnError( + workQueue.terminator(), newFrom.Allocate(kNoAsyncObject))}; + stat != StatOk) { + return stat; + } + if (HasDynamicComponent(*from_)) { + // If 'from' has allocatable/automatic component, we cannot + // just make a shallow copy of the descriptor member. + // This will still leave data overlap in 'to' and 'newFrom'. + // For example: + // type t + // character, allocatable :: c(:) + // end type t + // type(t) :: x(3) + // x(2:3) = x(1:2) + // We have to make a deep copy into 'newFrom' in this case. + if (const DescriptorAddendum *addendum{newFrom.Addendum()}) { + if (const auto *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(newFrom, *derived)}; + status != StatOk && status != StatContinue) { + return status; + } + } + } + } + static constexpr int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (int status{workQueue.BeginAssign( + newFrom, *from_, nestedFlags, memmoveFct_)}; + status != StatOk && status != StatContinue) { + return status; } - Assign(to, newFrom, terminator, - flags & - (NeedFinalization | ComponentCanBeDefinedAssignment | - ExplicitLengthCharacterLHS | CanBeDefinedAssignment)); - newFrom.Deallocate(); + } else { + ShallowCopy(newFrom, *from_, true, from_->IsContiguous()); } - return; + from_ = &newFrom; + flags_ &= NeedFinalization | ComponentCanBeDefinedAssignment | + ExplicitLengthCharacterLHS | CanBeDefinedAssignment; + toDeallocate_ = &newFrom; } } - if (to.IsAllocatable()) { + if (to_.IsAllocatable()) { if (mustDeallocateLHS) { - if (deferDeallocation) { - if ((flags & NeedFinalization) && 
toDerived) { - Finalize(*deferDeallocation, *toDerived, &terminator); - flags &= ~NeedFinalization; - } - } else { - to.Destroy((flags & NeedFinalization) != 0, /*destroyPointers=*/false, - &terminator); - flags &= ~NeedFinalization; + if (!toDeallocate_ && to_.IsAllocated()) { + toDeallocate_ = &to_; } - } else if (to.rank() != from.rank() && !to.IsAllocated()) { - terminator.Crash("Assign: mismatched ranks (%d != %d) in assignment to " - "unallocated allocatable", - to.rank(), from.rank()); + } else if (to_.rank() != from_->rank() && !to_.IsAllocated()) { + workQueue.terminator().Crash("Assign: mismatched ranks (%d != %d) in " + "assignment to unallocated allocatable", + to_.rank(), from_->rank()); } - if (!to.IsAllocated()) { - if (AllocateAssignmentLHS(to, from, terminator, flags) != StatOk) { - return; + } else if (!to_.IsAllocated()) { + workQueue.terminator().Crash( + "Assign: left-hand side variable is neither allocated nor allocatable"); + } + if (toDerived_ && to_.IsAllocated()) { + // Schedule finalization or destruction of the LHS. + if (flags_ & NeedFinalization) { + if (int status{workQueue.BeginFinalize(to_, *toDerived_)}; + status != StatOk && status != StatContinue) { + return status; + } + } else if (!toDerived_->noDestructionNeeded()) { + if (int status{ + workQueue.BeginDestroy(to_, *toDerived_, /*finalize=*/false)}; + status != StatOk && status != StatContinue) { + return status; } - flags &= ~NeedFinalization; - toElementBytes = to.ElementBytes(); // may have changed } } - if (toDerived && (flags & CanBeDefinedAssignment)) { - // Check for a user-defined assignment type-bound procedure; - // see 10.2.1.4-5. A user-defined assignment TBP defines all of - // the semantics, including allocatable (re)allocation and any - // finalization. 
- // - // Note that the aliasing and LHS (re)allocation handling above - // needs to run even with CanBeDefinedAssignment flag, when - // the Assign() is invoked recursively for component-per-component - // assignments. - if (to.rank() == 0) { - if (const auto *special{toDerived->FindSpecialBinding( + return StatContinue; +} + +RT_API_ATTRS int AssignTicket::Continue(WorkQueue &workQueue) { + if (done_) { + // All child tickets are complete; can release this ticket's state. + if (toDeallocate_) { + toDeallocate_->Deallocate(); + } + return StatOk; + } + // All necessary finalization or destruction that was initiated by Begin() + // has been completed. Deallocation may be pending, and if it's for the LHS, + // do it now so that the LHS gets reallocated. + if (toDeallocate_ == &to_) { + toDeallocate_ = nullptr; + to_.Deallocate(); + } + // Allocate the LHS if needed + if (!to_.IsAllocated()) { + if (int stat{ + AllocateAssignmentLHS(to_, *from_, workQueue.terminator(), flags_)}; + stat != StatOk) { + return stat; + } + const auto *addendum{to_.Addendum()}; + toDerived_ = addendum ? addendum->derivedType() : nullptr; + if (toDerived_ && !toDerived_->noInitializationNeeded()) { + if (int status{workQueue.BeginInitialize(to_, *toDerived_)}; + status != StatOk) { + return status; + } + } + } + // Check for a user-defined assignment type-bound procedure; + // see 10.2.1.4-5. + // Note that the aliasing and LHS (re)allocation handling above + // needs to run even with CanBeDefinedAssignment flag, since + // Assign() can be invoked recursively for component-wise assignments. 
+ if (toDerived_ && (flags_ & CanBeDefinedAssignment)) { + if (to_.rank() == 0) { + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ScalarAssignment)}) { - return DoScalarDefinedAssignment(to, from, *special); + DoScalarDefinedAssignment(to_, *from_, *special); + done_ = true; + return StatContinue; } } - if (const auto *special{toDerived->FindSpecialBinding( + if (const auto *special{toDerived_->FindSpecialBinding( typeInfo::SpecialBinding::Which::ElementalAssignment)}) { - return DoElementalDefinedAssignment(to, from, *toDerived, *special); + DoElementalDefinedAssignment(to_, *from_, *toDerived_, *special); + done_ = true; + return StatContinue; } } - SubscriptValue toAt[maxRank]; - to.GetLowerBounds(toAt); - // Scalar expansion of the RHS is implied by using the same empty - // subscript values on each (seemingly) elemental reference into - // "from". - SubscriptValue fromAt[maxRank]; - from.GetLowerBounds(fromAt); - std::size_t toElements{to.Elements()}; - if (from.rank() > 0 && toElements != from.Elements()) { - terminator.Crash("Assign: mismatching element counts in array assignment " - "(to %zd, from %zd)", - toElements, from.Elements()); + // Intrinsic assignment + std::size_t toElements{to_.Elements()}; + if (from_->rank() > 0 && toElements != from_->Elements()) { + workQueue.terminator().Crash("Assign: mismatching element counts in array " + "assignment (to %zd, from %zd)", + toElements, from_->Elements()); } - if (to.type() != from.type()) { - terminator.Crash("Assign: mismatching types (to code %d != from code %d)", - to.type().raw(), from.type().raw()); + if (to_.type() != from_->type()) { + workQueue.terminator().Crash( + "Assign: mismatching types (to code %d != from code %d)", + to_.type().raw(), from_->type().raw()); } - if (toElementBytes > fromElementBytes && !to.type().IsCharacter()) { - terminator.Crash("Assign: mismatching non-character element sizes (to %zd " - "bytes != from %zd bytes)", + std::size_t 
toElementBytes{to_.ElementBytes()}; + std::size_t fromElementBytes{from_->ElementBytes()}; + if (toElementBytes > fromElementBytes && !to_.type().IsCharacter()) { + workQueue.terminator().Crash("Assign: mismatching non-character element " + "sizes (to %zd bytes != from %zd bytes)", toElementBytes, fromElementBytes); } - if (const typeInfo::DerivedType * - updatedToDerived{toAddendum ? toAddendum->derivedType() : nullptr}) { - // Derived type intrinsic assignment, which is componentwise and elementwise - // for all components, including parent components (10.2.1.2-3). - // The target is first finalized if still necessary (7.5.6.3(1)) - if (flags & NeedFinalization) { - Finalize(to, *updatedToDerived, &terminator); - } else if (updatedToDerived && !updatedToDerived->noDestructionNeeded()) { - Destroy(to, /*finalize=*/false, *updatedToDerived, &terminator); - } - // Copy the data components (incl. the parent) first. - const Descriptor &componentDesc{updatedToDerived->component()}; - std::size_t numComponents{componentDesc.Elements()}; - for (std::size_t j{0}; j < toElements; - ++j, to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - for (std::size_t k{0}; k < numComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement( - k)}; // TODO: exploit contiguity here - // Use PolymorphicLHS for components so that the right things happen - // when the components are polymorphic; when they're not, they're both - // not, and their declared types will match. 
- int nestedFlags{MaybeReallocate | PolymorphicLHS}; - if (flags & ComponentCanBeDefinedAssignment) { - nestedFlags |= - CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; - } - switch (comp.genre()) { - case typeInfo::Component::Genre::Data: - if (comp.category() == TypeCategory::Derived) { - StaticDescriptor statDesc[2]; - Descriptor &toCompDesc{statDesc[0].descriptor()}; - Descriptor &fromCompDesc{statDesc[1].descriptor()}; - comp.CreatePointerDescriptor(toCompDesc, to, terminator, toAt); - comp.CreatePointerDescriptor( - fromCompDesc, from, terminator, fromAt); - Assign(toCompDesc, fromCompDesc, terminator, nestedFlags); - } else { // Component has intrinsic type; simply copy raw bytes - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } - break; - case typeInfo::Component::Genre::Pointer: { - std::size_t componentByteSize{comp.SizeInBytes(to)}; - memmoveFct(to.Element(toAt) + comp.offset(), - from.Element(fromAt) + comp.offset(), - componentByteSize); - } break; - case typeInfo::Component::Genre::Allocatable: - case typeInfo::Component::Genre::Automatic: { - auto *toDesc{reinterpret_cast( - to.Element(toAt) + comp.offset())}; - const auto *fromDesc{reinterpret_cast( - from.Element(fromAt) + comp.offset())}; - // Allocatable components of the LHS are unconditionally - // deallocated before assignment (F'2018 10.2.1.3(13)(1)), - // unlike a "top-level" assignment to a variable, where - // deallocation is optional. - // - // Be careful not to destroy/reallocate the LHS, if there is - // overlap between LHS and RHS (it seems that partial overlap - // is not possible, though). - // Invoke Assign() recursively to deal with potential aliasing. - if (toDesc->IsAllocatable()) { - if (!fromDesc->IsAllocated()) { - // No aliasing. - // - // If to is not allocated, the Destroy() call is a no-op. 
- // This is just a shortcut, because the recursive Assign() - // below would initiate the destruction for to. - // No finalization is required. - toDesc->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); - continue; // F'2018 10.2.1.3(13)(2) - } - } - // Force LHS deallocation with DeallocateLHS flag. - // The actual deallocation may be avoided, if the existing - // location can be reoccupied. - Assign(*toDesc, *fromDesc, terminator, nestedFlags | DeallocateLHS); - } break; - } + if (toDerived_) { + if (toDerived_->noDefinedAssignment()) { // componentwise + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } - // Copy procedure pointer components - const Descriptor &procPtrDesc{updatedToDerived->procPtr()}; - std::size_t numProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < numProcPtrs; ++k) { - const auto &procPtr{ - *procPtrDesc.ZeroBasedIndexedElement( - k)}; - memmoveFct(to.Element(toAt) + procPtr.offset, - from.Element(fromAt) + procPtr.offset, - sizeof(typeInfo::ProcedurePointer)); + } else { // elementwise + if (int status{workQueue.BeginDerivedAssign( + to_, *from_, *toDerived_, flags_, memmoveFct_, toDeallocate_)}; + status != StatOk && status != StatContinue) { + return status; } } - } else { // intrinsic type, intrinsic assignment - if (isSimpleMemmove()) { - memmoveFct(to.raw().base_addr, from.raw().base_addr, - toElements * toElementBytes); - } else if (toElementBytes > fromElementBytes) { // blank padding - switch (to.type().raw()) { + toDeallocate_ = nullptr; + } else if (IsSimpleMemmove()) { + memmoveFct_(to_.raw().base_addr, from_->raw().base_addr, + toElements * toElementBytes); + } else { + // Scalar expansion of the RHS is implied by using the same empty + // subscript values on each (seemingly) elemental reference into + // "from". 
+ SubscriptValue toAt[maxRank]; + to_.GetLowerBounds(toAt); + SubscriptValue fromAt[maxRank]; + from_->GetLowerBounds(fromAt); + if (toElementBytes > fromElementBytes) { // blank padding + switch (to_.type().raw()) { case CFI_type_signed_char: case CFI_type_char: - BlankPadCharacterAssignment<char>(to, from, toAt, fromAt, toElements, + BlankPadCharacterAssignment<char>(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char16_t: - BlankPadCharacterAssignment<char16_t>(to, from, toAt, fromAt, + BlankPadCharacterAssignment<char16_t>(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; case CFI_type_char32_t: - BlankPadCharacterAssignment<char32_t>(to, from, toAt, fromAt, + BlankPadCharacterAssignment<char32_t>(to_, *from_, toAt, fromAt, toElements, toElementBytes, fromElementBytes); break; default: - terminator.Crash("unexpected type code %d in blank padded Assign()", - to.type().raw()); + workQueue.terminator().Crash( + "unexpected type code %d in blank padded Assign()", + to_.type().raw()); } } else { // elemental copies, possibly with character truncation for (std::size_t n{toElements}; n-- > 0; - to.IncrementSubscripts(toAt), from.IncrementSubscripts(fromAt)) { - memmoveFct(to.Element(toAt), from.Element(fromAt), + to_.IncrementSubscripts(toAt), from_->IncrementSubscripts(fromAt)) { + memmoveFct_(to_.Element(toAt), from_->Element(fromAt), toElementBytes); } } } - if (deferDeallocation) { - // deferDeallocation is used only when LHS is an allocatable. - // The finalization has already been run for it. 
- deferDeallocation->Destroy( - /*finalize=*/false, /*destroyPointers=*/false, &terminator); + if (persist_) { + done_ = true; + return StatContinue; + } else { + if (toDeallocate_) { + toDeallocate_->Deallocate(); + toDeallocate_ = nullptr; + } + return StatOk; } } -RT_OFFLOAD_API_GROUP_BEGIN +template <bool IS_COMPONENTWISE> +RT_API_ATTRS int DerivedAssignTicket<IS_COMPONENTWISE>::Begin( + WorkQueue &workQueue) { + if (toIsContiguous_ && fromIsContiguous_ && + this->derived_.noDestructionNeeded() && + this->derived_.noDefinedAssignment() && + this->instance_.rank() == this->from_->rank()) { + if (std::size_t elementBytes{this->instance_.ElementBytes()}; + elementBytes == this->from_->ElementBytes()) { + // Fastest path. Both LHS and RHS are contiguous, RHS is not a scalar + // to be expanded, the types have the same size, and there are no + // allocatable components or defined ASSIGNMENT(=) at any level. + memmoveFct_(this->instance_.template OffsetElement(), + this->from_->template OffsetElement(), + this->instance_.Elements() * elementBytes); + return StatOk; + } + } + // Use PolymorphicLHS for components so that the right things happen + // when the components are polymorphic; when they're not, they're both + // not, and their declared types will match. 
+ int nestedFlags{MaybeReallocate | PolymorphicLHS}; + if (flags_ & ComponentCanBeDefinedAssignment) { + nestedFlags |= CanBeDefinedAssignment | ComponentCanBeDefinedAssignment; + } + flags_ = nestedFlags; + // Copy procedure pointer components + const Descriptor &procPtrDesc{this->derived_.procPtr()}; + bool noDataComponents{this->IsComplete()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &procPtr{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + memmoveFct_(this->instance_.template ElementComponent( + this->subscripts_, procPtr.offset), + this->from_->template ElementComponent( + this->fromSubscripts_, procPtr.offset), + sizeof(typeInfo::ProcedurePointer)); + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + if (noDataComponents) { + return StatOk; + } + return StatContinue; +} +template RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &); +template RT_API_ATTRS int DerivedAssignTicket::Begin(WorkQueue &); + +template +RT_API_ATTRS int DerivedAssignTicket::Continue( + WorkQueue &workQueue) { + while (!this->IsComplete()) { + // Copy the data components (incl. the parent) first. 
+ switch (this->component_->genre()) { + case typeInfo::Component::Genre::Data: + if (this->component_->category() == TypeCategory::Derived) { + Descriptor &toCompDesc{this->componentDescriptor_.descriptor()}; + Descriptor &fromCompDesc{this->fromComponentDescriptor_.descriptor()}; + this->component_->CreatePointerDescriptor(toCompDesc, this->instance_, + workQueue.terminator(), this->subscripts_); + this->component_->CreatePointerDescriptor(fromCompDesc, *this->from_, + workQueue.terminator(), this->fromSubscripts_); + this->Advance(); + if (int status{workQueue.BeginAssign( + toCompDesc, fromCompDesc, flags_, memmoveFct_)}; + status != StatOk) { + return status; + } + } else { // Component has intrinsic type; simply copy raw bytes + std::size_t componentByteSize{ + this->component_->SizeInBytes(this->instance_)}; + if (IS_COMPONENTWISE && toIsContiguous_ && fromIsContiguous_) { + std::size_t offset{this->component_->offset()}; + char *to{this->instance_.template OffsetElement(offset)}; + const char *from{ + this->from_->template OffsetElement(offset)}; + std::size_t toElementStride{this->instance_.ElementBytes()}; + std::size_t fromElementStride{ + this->from_->rank() == 0 ? 
0 : this->from_->ElementBytes()}; + if (toElementStride == fromElementStride && + toElementStride == componentByteSize) { + memmoveFct_(to, from, this->elements_ * componentByteSize); + } else { + for (std::size_t n{this->elements_}; n--; + to += toElementStride, from += fromElementStride) { + memmoveFct_(to, from, componentByteSize); + } + } + this->Componentwise::Advance(); + } else { + memmoveFct_( + this->instance_.template Element(this->subscripts_) + + this->component_->offset(), + this->from_->template Element(this->fromSubscripts_) + + this->component_->offset(), + componentByteSize); + this->Advance(); + } + } + break; + case typeInfo::Component::Genre::Pointer: { + std::size_t componentByteSize{ + this->component_->SizeInBytes(this->instance_)}; + if (IS_COMPONENTWISE && toIsContiguous_ && fromIsContiguous_) { + std::size_t offset{this->component_->offset()}; + char *to{this->instance_.template OffsetElement(offset)}; + const char *from{ + this->from_->template OffsetElement(offset)}; + std::size_t toElementStride{this->instance_.ElementBytes()}; + std::size_t fromElementStride{ + this->from_->rank() == 0 ? 
0 : this->from_->ElementBytes()}; + if (toElementStride == fromElementStride && + toElementStride == componentByteSize) { + memmoveFct_(to, from, this->elements_ * componentByteSize); + } else { + for (std::size_t n{this->elements_}; n--; + to += toElementStride, from += fromElementStride) { + memmoveFct_(to, from, componentByteSize); + } + } + this->Componentwise::Advance(); + } else { + memmoveFct_(this->instance_.template Element(this->subscripts_) + + this->component_->offset(), + this->from_->template Element(this->fromSubscripts_) + + this->component_->offset(), + componentByteSize); + this->Advance(); + } + } break; + case typeInfo::Component::Genre::Allocatable: + case typeInfo::Component::Genre::Automatic: { + auto *toDesc{reinterpret_cast( + this->instance_.template Element(this->subscripts_) + + this->component_->offset())}; + const auto *fromDesc{reinterpret_cast( + this->from_->template Element(this->fromSubscripts_) + + this->component_->offset())}; + if (toDesc->IsAllocatable() && !fromDesc->IsAllocated()) { + if (toDesc->IsAllocated()) { + if (this->phase_ == 0) { + this->phase_++; + if (const auto *componentDerived{this->component_->derivedType()}; + componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *toDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } + } + toDesc->Deallocate(); + } + this->Advance(); + } else { + // Allocatable components of the LHS are unconditionally + // deallocated before assignment (F'2018 10.2.1.3(13)(1)), + // unlike a "top-level" assignment to a variable, where + // deallocation is optional. + this->Advance(); + int nestedFlags{flags_}; + if (this->derived_.noFinalizationNeeded() && + this->derived_.noInitializationNeeded() && + this->derived_.noDestructionNeeded()) { + // The actual deallocation may be avoided, if the existing + // location can be reoccupied. + } else { + // Force LHS deallocation with DeallocateLHS flag. 
+ nestedFlags |= DeallocateLHS; + } + if (int status{workQueue.BeginAssign( + *toDesc, *fromDesc, nestedFlags, memmoveFct_)}; + status != StatOk) { + return status; + } + } + } break; + } + } + if (deallocateAfter_) { + deallocateAfter_->Deallocate(); + } + return StatOk; +} +template RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &); +template RT_API_ATTRS int DerivedAssignTicket::Continue(WorkQueue &); RT_API_ATTRS void DoFromSourceAssign(Descriptor &alloc, const Descriptor &source, Terminator &terminator, MemmoveFct memmoveFct) { @@ -581,7 +759,6 @@ void RTDEF(AssignTemporary)(Descriptor &to, const Descriptor &from, } } } - Assign(to, from, terminator, MaybeReallocate | PolymorphicLHS); } @@ -598,7 +775,6 @@ void RTDEF(CopyInAssign)(Descriptor &temp, const Descriptor &var, void RTDEF(CopyOutAssign)( Descriptor *var, Descriptor &temp, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - // Copyout from the temporary must not cause any finalizations // for LHS. The variable must be properly initialized already. 
if (var) { diff --git a/flang-rt/lib/runtime/derived.cpp b/flang-rt/lib/runtime/derived.cpp index 35037036f63e7..8ab737c701b01 100644 --- a/flang-rt/lib/runtime/derived.cpp +++ b/flang-rt/lib/runtime/derived.cpp @@ -12,6 +12,7 @@ #include "flang-rt/runtime/terminator.h" #include "flang-rt/runtime/tools.h" #include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" namespace Fortran::runtime { @@ -30,180 +31,193 @@ static RT_API_ATTRS void GetComponentExtents(SubscriptValue (&extents)[maxRank], } RT_API_ATTRS int Initialize(const Descriptor &instance, - const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, - const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{instance.Elements()}; - int stat{StatOk}; - // Initialize data components in each element; the per-element iterations - // constitute the inner loops, not the outer ones - std::size_t myComponents{componentDesc.Elements()}; - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &allocDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(allocDesc, instance, terminator); + const typeInfo::DerivedType &derived, Terminator &terminator, bool, + const Descriptor *) { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitialize(instance, derived)}; + return status == StatContinue ? 
workQueue.Run() : status; +} + +RT_API_ATTRS int InitializeTicket::Begin(WorkQueue &) { + // Initialize procedure pointer components in each element + const Descriptor &procPtrDesc{derived_.procPtr()}; + if (std::size_t numProcPtrs{procPtrDesc.Elements()}) { + bool noDataComponents{IsComplete()}; + for (std::size_t k{0}; k < numProcPtrs; ++k) { + const auto &comp{ + *procPtrDesc.ZeroBasedIndexedElement(k)}; + // Loop only over elements + if (noDataComponents) { + Elementwise::Reset(); + } + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + auto &pptr{*instance_.ElementComponent( + subscripts_, comp.offset)}; + pptr = comp.procInitialization; + } + } + if (noDataComponents) { + return StatOk; + } + Elementwise::Reset(); + } + return StatContinue; +} + +RT_API_ATTRS int InitializeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + // Establish allocatable descriptors + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &allocDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + allocDesc, instance_, workQueue.terminator()); allocDesc.raw().attribute = CFI_attribute_allocatable; - if (comp.genre() == typeInfo::Component::Genre::Automatic) { - stat = ReturnError( - terminator, allocDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{allocDesc.Addendum()}) { - if (const auto *derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - stat = Initialize( - allocDesc, *derived, terminator, hasStat, errMsg); - } - } - } - } - if (stat != StatOk) { - break; - } - } } - } else if (const void *init{comp.initialization()}) { + SkipToNextComponent(); + } else if (const void *init{component_->initialization()}) { // Explicit initialization of data pointers and // non-allocatable non-automatic components - std::size_t 
bytes{comp.SizeInBytes(instance)}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - char *ptr{instance.ElementComponent(at, comp.offset())}; + std::size_t bytes{component_->SizeInBytes(instance_)}; + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + char *ptr{instance_.ElementComponent( + subscripts_, component_->offset())}; std::memcpy(ptr, init, bytes); } - } else if (comp.genre() == typeInfo::Component::Genre::Pointer) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Pointer) { // Data pointers without explicit initialization are established // so that they are valid right-hand side targets of pointer // assignment statements. - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - Descriptor &ptrDesc{ - *instance.ElementComponent(at, comp.offset())}; - comp.EstablishDescriptor(ptrDesc, instance, terminator); + for (; !Elementwise::IsComplete(); Elementwise::Advance()) { + Descriptor &ptrDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + component_->EstablishDescriptor( + ptrDesc, instance_, workQueue.terminator()); ptrDesc.raw().attribute = CFI_attribute_pointer; } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noInitializationNeeded()) { + SkipToNextComponent(); + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noInitializationNeeded()) { // Default initialization of non-pointer non-allocatable/automatic - // data component. Handles parent component's elements. Recursive. + // data component. Handles parent component's elements. 
SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, instance); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - compDesc.Establish(compType, - instance.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = Initialize(compDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; - } + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitialize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - // Initialize procedure pointer components in each element - const Descriptor &procPtrDesc{derived.procPtr()}; - std::size_t myProcPtrs{procPtrDesc.Elements()}; - for (std::size_t k{0}; k < myProcPtrs; ++k) { - const auto &comp{ - *procPtrDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - instance.GetLowerBounds(at); - for (std::size_t j{0}; j++ < elements; instance.IncrementSubscripts(at)) { - auto &pptr{*instance.ElementComponent( - at, comp.offset)}; - pptr = comp.procInitialization; - } - } - return stat; + return StatOk; } RT_API_ATTRS int InitializeClone(const Descriptor &clone, - const Descriptor &orig, const typeInfo::DerivedType &derived, + const Descriptor &original, const typeInfo::DerivedType &derived, Terminator &terminator, bool hasStat, const Descriptor *errMsg) { - const Descriptor &componentDesc{derived.component()}; - std::size_t elements{orig.Elements()}; - int stat{StatOk}; - - // Skip pointers and unallocated variables. 
- if (orig.IsPointer() || !orig.IsAllocated()) { - return stat; + if (original.IsPointer() || !original.IsAllocated()) { + return StatOk; // nothing to do + } else { + WorkQueue workQueue{terminator}; + int status{workQueue.BeginInitializeClone( + clone, original, derived, hasStat, errMsg)}; + return status == StatContinue ? workQueue.Run() : status; } - // Initialize each data component. - std::size_t components{componentDesc.Elements()}; - for (std::size_t i{0}; i < components; ++i) { - const typeInfo::Component &comp{ - *componentDesc.ZeroBasedIndexedElement(i)}; - SubscriptValue at[maxRank]; - orig.GetLowerBounds(at); - // Allocate allocatable components that are also allocated in the original - // object. - if (comp.genre() == typeInfo::Component::Genre::Allocatable) { - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { - Descriptor &origDesc{ - *orig.ElementComponent(at, comp.offset())}; - Descriptor &cloneDesc{ - *clone.ElementComponent(at, comp.offset())}; - if (origDesc.IsAllocated()) { +} + +RT_API_ATTRS int InitializeCloneTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable) { + Descriptor &origDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + if (origDesc.IsAllocated()) { + Descriptor &cloneDesc{*clone_.ElementComponent( + subscripts_, component_->offset())}; + if (phase_ == 0) { + ++phase_; cloneDesc.ApplyMold(origDesc, origDesc.rank()); - stat = ReturnError( - terminator, cloneDesc.Allocate(kNoAsyncObject), errMsg, hasStat); - if (stat == StatOk) { - if (const DescriptorAddendum * addendum{cloneDesc.Addendum()}) { - if (const typeInfo::DerivedType * - derived{addendum->derivedType()}) { - if (!derived->noInitializationNeeded()) { - // Perform default initialization for the allocated element. 
- stat = Initialize( - cloneDesc, *derived, terminator, hasStat, errMsg); - } - // Initialize derived type's allocatables. - if (stat == StatOk) { - stat = InitializeClone(cloneDesc, origDesc, *derived, - terminator, hasStat, errMsg); + if (int stat{ReturnError(workQueue.terminator(), + cloneDesc.Allocate(kNoAsyncObject), errMsg_, hasStat_)}; + stat != StatOk) { + return stat; + } + if (const DescriptorAddendum *addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType *derived{addendum->derivedType()}) { + if (!derived->noInitializationNeeded()) { + // Perform default initialization for the allocated element. + if (int status{workQueue.BeginInitialize(cloneDesc, *derived)}; + status != StatOk) { + return status; } } } } } - if (stat != StatOk) { - break; + if (phase_ == 1) { + ++phase_; + if (const DescriptorAddendum *addendum{cloneDesc.Addendum()}) { + if (const typeInfo::DerivedType *derived{addendum->derivedType()}) { + // Initialize derived type's allocatables. + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, *derived, hasStat_, errMsg_)}; + status != StatOk) { + return status; + } + } + } } } - } else if (comp.genre() == typeInfo::Component::Genre::Data && - comp.derivedType()) { - // Handle nested derived types. - const typeInfo::DerivedType &compType{*comp.derivedType()}; - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, orig); - // Data components don't have descriptors, allocate them. - StaticDescriptor origStaticDesc; - StaticDescriptor cloneStaticDesc; - Descriptor &origDesc{origStaticDesc.descriptor()}; - Descriptor &cloneDesc{cloneStaticDesc.descriptor()}; - // Initialize each element. - for (std::size_t j{0}; j < elements; ++j, orig.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (component_->derivedType()) { + // Handle nested derived types. 
+ const typeInfo::DerivedType &compType{*component_->derivedType()}; + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &origDesc{componentDescriptor_.descriptor()}; + Descriptor &cloneDesc{cloneComponentDescriptor_.descriptor()}; origDesc.Establish(compType, - orig.ElementComponent(at, comp.offset()), comp.rank(), - extents); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); cloneDesc.Establish(compType, - clone.ElementComponent(at, comp.offset()), comp.rank(), - extents); - stat = InitializeClone( - cloneDesc, origDesc, compType, terminator, hasStat, errMsg); - if (stat != StatOk) { - break; + clone_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginInitializeClone( + cloneDesc, origDesc, compType, hasStat_, errMsg_)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } + } else { + SkipToNextComponent(); + } + } + return StatOk; +} + +// Fortran 2018 subclause 7.5.6.2 +RT_API_ATTRS void Finalize(const Descriptor &descriptor, + const typeInfo::DerivedType &derived, Terminator *terminator) { + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Finalize() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? 
*terminator : stubTerminator}; + if (workQueue.BeginFinalize(descriptor, derived) == StatContinue) { + workQueue.Run(); } } - return stat; } static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( @@ -221,7 +235,7 @@ static RT_API_ATTRS const typeInfo::SpecialBinding *FindFinal( } static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { + const typeInfo::DerivedType &derived, Terminator &terminator) { if (const auto *special{FindFinal(derived, descriptor.rank())}) { if (special->which() == typeInfo::SpecialBinding::Which::ElementalFinal) { std::size_t elements{descriptor.Elements()}; @@ -258,9 +272,7 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, copy = descriptor; copy.set_base_addr(nullptr); copy.raw().attribute = CFI_attribute_allocatable; - Terminator stubTerminator{"CallFinalProcedure() in Fortran runtime", 0}; - RUNTIME_CHECK(terminator ? *terminator : stubTerminator, - copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); + RUNTIME_CHECK(terminator, copy.Allocate(kNoAsyncObject) == CFI_SUCCESS); ShallowCopyDiscontiguousToContiguous(copy, descriptor); argDescriptor = &copy; } @@ -284,87 +296,94 @@ static RT_API_ATTRS void CallFinalSubroutine(const Descriptor &descriptor, } } -// Fortran 2018 subclause 7.5.6.2 -RT_API_ATTRS void Finalize(const Descriptor &descriptor, - const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noFinalizationNeeded() || !descriptor.IsAllocated()) { - return; - } - CallFinalSubroutine(descriptor, derived, terminator); - const auto *parentType{derived.GetParentType()}; - bool recurse{parentType && !parentType->noFinalizationNeeded()}; +RT_API_ATTRS int FinalizeTicket::Begin(WorkQueue &workQueue) { + CallFinalSubroutine(instance_, derived_, workQueue.terminator()); // If there's a finalizable parent component, handle it last, as required // by the Fortran standard (7.5.6.2), and do so recursively with 
the same // descriptor so that the rank is preserved. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - for (auto k{recurse ? std::size_t{1} - /* skip first component, it's the parent */ - : 0}; - k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - if (comp.genre() == typeInfo::Component::Genre::Allocatable && - comp.category() == TypeCategory::Derived) { + finalizableParentType_ = derived_.GetParentType(); + if (finalizableParentType_) { + if (finalizableParentType_->noFinalizationNeeded()) { + finalizableParentType_ = nullptr; + } else { + SkipToNextComponent(); + } + } + return StatContinue; +} + +RT_API_ATTRS int FinalizeTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Allocatable && + component_->category() == TypeCategory::Derived) { // Component may be polymorphic or unlimited polymorphic. Need to use the // dynamic type to check whether finalization is needed. 
- for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - if (const DescriptorAddendum * addendum{compDesc.Addendum()}) { - if (const typeInfo::DerivedType * - compDynamicType{addendum->derivedType()}) { - if (!compDynamicType->noFinalizationNeeded()) { - Finalize(compDesc, *compDynamicType, terminator); + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (const DescriptorAddendum *addendum{compDesc.Addendum()}) { + if (const typeInfo::DerivedType *compDynamicType{ + addendum->derivedType()}) { + if (!compDynamicType->noFinalizationNeeded()) { + if (int status{ + workQueue.BeginFinalize(compDesc, *compDynamicType)}; + status != StatOk) { + return status; } } } } } - } else if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - if (const typeInfo::DerivedType * compType{comp.derivedType()}) { - if (!compType->noFinalizationNeeded()) { - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - const Descriptor &compDesc{ - *descriptor.ElementComponent(at, comp.offset())}; - if (compDesc.IsAllocated()) { - Finalize(compDesc, *compType, terminator); - } + } else if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + if (const typeInfo::DerivedType *compType{component_->derivedType()}; + compType && !compType->noFinalizationNeeded()) { + const Descriptor &compDesc{*instance_.ElementComponent( + subscripts_, component_->offset())}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginFinalize(compDesc, *compType)}; + status != StatOk) { + return status; } } + } else { + SkipToNextComponent(); } - } else if (comp.genre() == 
typeInfo::Component::Genre::Data && - comp.derivedType() && !comp.derivedType()->noFinalizationNeeded()) { + } else if (component_->genre() == typeInfo::Component::Genre::Data && + component_->derivedType() && + !component_->derivedType()->noFinalizationNeeded()) { SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { - compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Finalize(compDesc, compType, terminator); + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*component_->derivedType()}; + compDesc.Establish(compType, + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginFinalize(compDesc, compType)}; + status != StatOk) { + return status; } + } else { + SkipToNextComponent(); } } - if (recurse) { - StaticDescriptor statDesc; - Descriptor &tmpDesc{statDesc.descriptor()}; - tmpDesc = descriptor; + // Last, do the parent component, if any and finalizable. 
+ if (finalizableParentType_) { + Descriptor &tmpDesc{componentDescriptor_.descriptor()}; + tmpDesc = instance_; tmpDesc.raw().attribute = CFI_attribute_pointer; - tmpDesc.Addendum()->set_derivedType(parentType); - tmpDesc.raw().elem_len = parentType->sizeInBytes(); - Finalize(tmpDesc, *parentType, terminator); + tmpDesc.Addendum()->set_derivedType(finalizableParentType_); + tmpDesc.raw().elem_len = finalizableParentType_->sizeInBytes(); + const auto &parentType{*finalizableParentType_}; + finalizableParentType_ = nullptr; + // Don't return StatOk here if the nested Finalize is still running; + // it needs this->componentDescriptor_. + return workQueue.BeginFinalize(tmpDesc, parentType); } + return StatOk; } // The order of finalization follows Fortran 2018 7.5.6.2, with @@ -373,51 +392,71 @@ RT_API_ATTRS void Finalize(const Descriptor &descriptor, // preceding any deallocation. RT_API_ATTRS void Destroy(const Descriptor &descriptor, bool finalize, const typeInfo::DerivedType &derived, Terminator *terminator) { - if (derived.noDestructionNeeded() || !descriptor.IsAllocated()) { - return; + if (!derived.noFinalizationNeeded() && descriptor.IsAllocated()) { + Terminator stubTerminator{"Destroy() in Fortran runtime", 0}; + WorkQueue workQueue{terminator ? *terminator : stubTerminator}; + if (workQueue.BeginDestroy(descriptor, derived, finalize) == StatContinue) { + workQueue.Run(); + } } - if (finalize && !derived.noFinalizationNeeded()) { - Finalize(descriptor, derived, terminator); +} + +RT_API_ATTRS int DestroyTicket::Begin(WorkQueue &workQueue) { + if (finalize_ && !derived_.noFinalizationNeeded()) { + if (int status{workQueue.BeginFinalize(instance_, derived_)}; + status != StatOk && status != StatContinue) { + return status; + } } + return StatContinue; +} + +RT_API_ATTRS int DestroyTicket::Continue(WorkQueue &workQueue) { // Deallocate all direct and indirect allocatable and automatic components. 
// Contrary to finalization, the order of deallocation does not matter. - const Descriptor &componentDesc{derived.component()}; - std::size_t myComponents{componentDesc.Elements()}; - std::size_t elements{descriptor.Elements()}; - SubscriptValue at[maxRank]; - descriptor.GetLowerBounds(at); - for (std::size_t k{0}; k < myComponents; ++k) { - const auto &comp{ - *componentDesc.ZeroBasedIndexedElement(k)}; - const bool destroyComp{ - comp.derivedType() && !comp.derivedType()->noDestructionNeeded()}; - if (comp.genre() == typeInfo::Component::Genre::Allocatable || - comp.genre() == typeInfo::Component::Genre::Automatic) { - for (std::size_t j{0}; j < elements; ++j) { - Descriptor *d{ - descriptor.ElementComponent(at, comp.offset())}; - if (destroyComp) { - Destroy(*d, /*finalize=*/false, *comp.derivedType(), terminator); + while (!IsComplete()) { + const auto *componentDerived{component_->derivedType()}; + if (component_->genre() == typeInfo::Component::Genre::Allocatable || + component_->genre() == typeInfo::Component::Genre::Automatic) { + Descriptor *d{instance_.ElementComponent( + subscripts_, component_->offset())}; + if (d->IsAllocated()) { + if (phase_ == 0) { + ++phase_; + if (componentDerived && !componentDerived->noDestructionNeeded()) { + if (int status{workQueue.BeginDestroy( + *d, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } + } } d->Deallocate(); - descriptor.IncrementSubscripts(at); } - } else if (destroyComp && - comp.genre() == typeInfo::Component::Genre::Data) { - SubscriptValue extents[maxRank]; - GetComponentExtents(extents, comp, descriptor); - StaticDescriptor staticDescriptor; - Descriptor &compDesc{staticDescriptor.descriptor()}; - const typeInfo::DerivedType &compType{*comp.derivedType()}; - for (std::size_t j{0}; j++ < elements; - descriptor.IncrementSubscripts(at)) { + Advance(); + } else if (component_->genre() == typeInfo::Component::Genre::Data) { + if (!componentDerived || 
componentDerived->noDestructionNeeded()) { + SkipToNextComponent(); + } else { + SubscriptValue extents[maxRank]; + GetComponentExtents(extents, *component_, instance_); + Descriptor &compDesc{componentDescriptor_.descriptor()}; + const typeInfo::DerivedType &compType{*componentDerived}; compDesc.Establish(compType, - descriptor.ElementComponent(at, comp.offset()), comp.rank(), - extents); - Destroy(compDesc, /*finalize=*/false, *comp.derivedType(), terminator); + instance_.ElementComponent(subscripts_, component_->offset()), + component_->rank(), extents); + Advance(); + if (int status{workQueue.BeginDestroy( + compDesc, *componentDerived, /*finalize=*/false)}; + status != StatOk) { + return status; + } } + } else { + SkipToNextComponent(); } } + return StatOk; } RT_API_ATTRS bool HasDynamicComponent(const Descriptor &descriptor) { diff --git a/flang-rt/lib/runtime/descriptor-io.cpp b/flang-rt/lib/runtime/descriptor-io.cpp index 3db1455af52fe..364724b89ba0d 100644 --- a/flang-rt/lib/runtime/descriptor-io.cpp +++ b/flang-rt/lib/runtime/descriptor-io.cpp @@ -7,15 +7,44 @@ //===----------------------------------------------------------------------===// #include "descriptor-io.h" +#include "edit-input.h" +#include "edit-output.h" +#include "unit.h" +#include "flang-rt/runtime/descriptor.h" +#include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/namelist.h" +#include "flang-rt/runtime/terminator.h" +#include "flang-rt/runtime/type-info.h" +#include "flang-rt/runtime/work-queue.h" +#include "flang/Common/optional.h" #include "flang/Common/restorer.h" +#include "flang/Common/uint128.h" +#include "flang/Runtime/cpp-type.h" #include "flang/Runtime/freestanding-tools.h" +// Implementation of I/O data list item transfers based on descriptors. +// (All I/O items come through here so that the code is exercised for test; +// some scalar I/O data transfer APIs could be changed to bypass their use +// of descriptors in the future for better efficiency.) 
+ namespace Fortran::runtime::io::descr { RT_OFFLOAD_API_GROUP_BEGIN +template <typename A> +inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, + const Descriptor &descriptor, const SubscriptValue subscripts[]) { + A *p{descriptor.Element<A>(subscripts)}; + if (!p) { + io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " + "address or subscripts out of range"); + } + return *p; +} + // Defined formatted I/O (maybe) -Fortran::common::optional<bool> DefinedFormattedIo(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &derived, +static RT_API_ATTRS Fortran::common::optional<bool> DefinedFormattedIo( + IoStatementState &io, const Descriptor &descriptor, + const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special, const SubscriptValue subscripts[]) { Fortran::common::optional peek{ @@ -104,8 +133,8 @@ Fortran::common::optional<bool> DefinedFormattedIo(IoStatementState &io, } // Defined unformatted I/O -bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, - const typeInfo::DerivedType &derived, +static RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &io, + const Descriptor &descriptor, const typeInfo::DerivedType &derived, const typeInfo::SpecialBinding &special) { // Unformatted I/O must have an external unit (or child thereof). IoErrorHandler &handler{io.GetIoErrorHandler()}; @@ -152,5 +181,619 @@ bool DefinedUnformattedIo(IoStatementState &io, const Descriptor &descriptor, return handler.GetIoStat() == IostatOk; } +// Per-category descriptor-based I/O templates + +// TODO (perhaps as a nontrivial but small starter project): implement +// automatic repetition counts, like "10*3.14159", for list-directed and +// NAMELIST array output. 
+ +template +inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, + const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!EditIntegerOutput(io, *edit, x, isSigned)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditIntegerInput( + io, *edit, reinterpret_cast(&x), KIND, isSigned)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedIntegerIO: subscripts out of bounds"); + } + } else { + return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedRealIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + if (auto edit{io.GetNextDataEdit()}) { + RawType &x{ExtractElement(io, descriptor, subscripts)}; + if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditRealInput(io, *edit, reinterpret_cast(&x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedRealIO: subscripts out of bounds"); + } + } else { + 
return false; + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedComplexIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + bool isListOutput{ + io.get_if>() != nullptr}; + using RawType = typename RealOutputEditing::BinaryFloatingPoint; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + RawType *x{&ExtractElement(io, descriptor, subscripts)}; + if (isListOutput) { + DataEdit rEdit, iEdit; + rEdit.descriptor = DataEdit::ListDirectedRealPart; + iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; + rEdit.modes = iEdit.modes = io.mutableModes(); + if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || + !RealOutputEditing{io, x[1]}.Edit(iEdit)) { + return false; + } + } else { + for (int k{0}; k < 2; ++k, ++x) { + auto edit{io.GetNextDataEdit()}; + if (!edit) { + return false; + } else if constexpr (DIR == Direction::Output) { + if (!RealOutputEditing{io, *x}.Edit(*edit)) { + return false; + } + } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { + break; + } else if (EditRealInput( + io, *edit, reinterpret_cast(x))) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedComplexIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedCharacterIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + std::size_t length{descriptor.ElementBytes() / sizeof(A)}; + auto *listOutput{io.get_if>()}; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + A *x{&ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if 
(!ListDirectedCharacterOutput(io, *listOutput, x, length)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditCharacterOutput(io, *edit, x, length)) { + return false; + } + } else { // input + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + if (EditCharacterInput(io, *edit, x, length)) { + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedCharacterIO: subscripts out of bounds"); + } + } + return true; +} + +template +inline RT_API_ATTRS bool FormattedLogicalIO( + IoStatementState &io, const Descriptor &descriptor) { + std::size_t numElements{descriptor.Elements()}; + SubscriptValue subscripts[maxRank]; + descriptor.GetLowerBounds(subscripts); + auto *listOutput{io.get_if>()}; + using IntType = CppTypeFor; + bool anyInput{false}; + for (std::size_t j{0}; j < numElements; ++j) { + IntType &x{ExtractElement(io, descriptor, subscripts)}; + if (listOutput) { + if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { + return false; + } + } else if (auto edit{io.GetNextDataEdit()}) { + if constexpr (DIR == Direction::Output) { + if (!EditLogicalOutput(io, *edit, x != 0)) { + return false; + } + } else { + if (edit->descriptor != DataEdit::ListDirectedNullValue) { + bool truth{}; + if (EditLogicalInput(io, *edit, truth)) { + x = truth; + anyInput = true; + } else { + return anyInput && edit->IsNamelist(); + } + } + } + } else { + return false; + } + if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { + io.GetIoErrorHandler().Crash( + "FormattedLogicalIO: subscripts out of bounds"); + } + } + return true; +} + +template +RT_API_ATTRS int DerivedIoTicket::Continue(WorkQueue &workQueue) { + while (!IsComplete()) { + if (component_->genre() == typeInfo::Component::Genre::Data) { + // 
Create a descriptor for the component + Descriptor &compDesc{componentDescriptor_.descriptor()}; + component_->CreatePointerDescriptor( + compDesc, instance_, io_.GetIoErrorHandler(), subscripts_); + Advance(); + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } else { + // Component is itself a descriptor + char *pointer{ + instance_.Element(subscripts_) + component_->offset()}; + const Descriptor &compDesc{ + *reinterpret_cast(pointer)}; + Advance(); + if (compDesc.IsAllocated()) { + if (int status{workQueue.BeginDescriptorIo( + io_, compDesc, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + } + } + return StatOk; +} + +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DerivedIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Begin(WorkQueue &workQueue) { + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + if (handler.InError()) { + return handler.GetIoStat(); + } + if (!io_.get_if>()) { + handler.Crash("DescriptorIO() called for wrong I/O direction"); + return handler.GetIoStat(); + } + if constexpr (DIR == Direction::Input) { + if (!io_.BeginReadingRecord()) { + return StatOk; + } + } + if (!io_.get_if>()) { + // Unformatted I/O + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + if (const typeInfo::DerivedType *type{ + addendum ? addendum->derivedType() : nullptr}) { + // derived type unformatted I/O + if (table_) { + if (const auto *definedIo{table_->Find(*type, + DIR == Direction::Input + ? common::DefinedIo::ReadUnformatted + : common::DefinedIo::WriteUnformatted)}) { + if (definedIo->subroutine) { + typeInfo::SpecialBinding special{DIR == Direction::Input + ? 
typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false}; + if (DefinedUnformattedIo(io_, instance_, *type, special)) { + anyIoTookPlace_ = true; + return StatOk; + } + } else { + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } + } + } + if (const typeInfo::SpecialBinding *special{ + type->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadUnformatted + : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || special->isTypeBound()) { + // defined derived type unformatted I/O + if (DefinedUnformattedIo(io_, instance_, *type, *special)) { + anyIoTookPlace_ = true; + return StatOk; + } else { + return IostatEnd; + } + } + } + // Default derived type unformatted I/O + // TODO: If no component at any level has defined READ or WRITE + // (as appropriate), the elements are contiguous, and no byte swapping + // is active, do a block transfer via the code below. + int status{workQueue.BeginDerivedIo( + io_, instance_, *type, table_, anyIoTookPlace_)}; + return status == StatContinue ? StatOk : status; // done here + } else { + // intrinsic type unformatted I/O + auto *externalUnf{io_.get_if>()}; + ChildUnformattedIoStatementState *childUnf{nullptr}; + InquireIOLengthState *inq{nullptr}; + bool swapEndianness{false}; + if (externalUnf) { + swapEndianness = externalUnf->unit().swapEndianness(); + } else { + childUnf = io_.get_if>(); + if (!childUnf) { + inq = DIR == Direction::Output ? 
io_.get_if() + : nullptr; + RUNTIME_CHECK(handler, inq != nullptr); + } + } + std::size_t elementBytes{instance_.ElementBytes()}; + std::size_t swappingBytes{elementBytes}; + if (auto maybeCatAndKind{instance_.type().GetCategoryAndKind()}) { + // Byte swapping units can be smaller than elements, namely + // for COMPLEX and CHARACTER. + if (maybeCatAndKind->first == TypeCategory::Character) { + // swap each character position independently + swappingBytes = maybeCatAndKind->second; // kind + } else if (maybeCatAndKind->first == TypeCategory::Complex) { + // swap real and imaginary components independently + swappingBytes /= 2; + } + } + using CharType = + std::conditional_t; + auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { + if constexpr (DIR == Direction::Output) { + return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) + : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) + : inq->Emit(&x, totalBytes, swappingBytes); + } else { + return externalUnf + ? externalUnf->Receive(&x, totalBytes, swappingBytes) + : childUnf->Receive(&x, totalBytes, swappingBytes); + } + }}; + if (!swapEndianness && + instance_.IsContiguous()) { // contiguous unformatted I/O + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elements_ * elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O + for (; !IsComplete(); Advance()) { + char &x{ExtractElement(io_, instance_, subscripts_)}; + if (Transfer(x, elementBytes)) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } + } + } + // Unformatted I/O never needs to call Continue(). 
+ return StatOk; + } + // Formatted I/O + if (auto catAndKind{instance_.type().GetCategoryAndKind()}) { + TypeCategory cat{catAndKind->first}; + int kind{catAndKind->second}; + bool any{false}; + switch (cat) { + case TypeCategory::Integer: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, true); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, true); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, true); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, true); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, true); + break; + default: + handler.Crash( + "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Unsigned: + switch (kind) { + case 1: + any = FormattedIntegerIO<1, DIR>(io_, instance_, false); + break; + case 2: + any = FormattedIntegerIO<2, DIR>(io_, instance_, false); + break; + case 4: + any = FormattedIntegerIO<4, DIR>(io_, instance_, false); + break; + case 8: + any = FormattedIntegerIO<8, DIR>(io_, instance_, false); + break; + case 16: + any = FormattedIntegerIO<16, DIR>(io_, instance_, false); + break; + default: + handler.Crash( + "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Real: + switch (kind) { + case 2: + any = FormattedRealIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedRealIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedRealIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedRealIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedRealIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedRealIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: REAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Complex: + switch (kind) 
{ + case 2: + any = FormattedComplexIO<2, DIR>(io_, instance_); + break; + case 3: + any = FormattedComplexIO<3, DIR>(io_, instance_); + break; + case 4: + any = FormattedComplexIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedComplexIO<8, DIR>(io_, instance_); + break; + case 10: + any = FormattedComplexIO<10, DIR>(io_, instance_); + break; + // TODO: case double/double + case 16: + any = FormattedComplexIO<16, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Character: + switch (kind) { + case 1: + any = FormattedCharacterIO(io_, instance_); + break; + case 2: + any = FormattedCharacterIO(io_, instance_); + break; + case 4: + any = FormattedCharacterIO(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Logical: + switch (kind) { + case 1: + any = FormattedLogicalIO<1, DIR>(io_, instance_); + break; + case 2: + any = FormattedLogicalIO<2, DIR>(io_, instance_); + break; + case 4: + any = FormattedLogicalIO<4, DIR>(io_, instance_); + break; + case 8: + any = FormattedLogicalIO<8, DIR>(io_, instance_); + break; + default: + handler.Crash( + "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); + return IostatEnd; + } + break; + case TypeCategory::Derived: { + // Derived type information must be present for formatted I/O. + IoErrorHandler &handler{io_.GetIoErrorHandler()}; + const DescriptorAddendum *addendum{instance_.Addendum()}; + RUNTIME_CHECK(handler, addendum != nullptr); + derived_ = addendum->derivedType(); + RUNTIME_CHECK(handler, derived_ != nullptr); + if (table_) { + if (const auto *definedIo{table_->Find(*derived_, + DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted + : common::DefinedIo::WriteFormatted)}) { + if (definedIo->subroutine) { + nonTbpSpecial_.emplace(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted, + definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, + false); + special_ = &*nonTbpSpecial_; + } + } + } + if (!special_) { + if (const typeInfo::SpecialBinding *binding{ + derived_->FindSpecialBinding(DIR == Direction::Input + ? typeInfo::SpecialBinding::Which::ReadFormatted + : typeInfo::SpecialBinding::Which::WriteFormatted)}) { + if (!table_ || !table_->ignoreNonTbpEntries || + binding->isTypeBound()) { + special_ = binding; + } + } + } + return StatContinue; + } + } + if (any) { + anyIoTookPlace_ = true; + } else { + return IostatEnd; + } + } else { + handler.Crash("DescriptorIO: bad type code (%d) in descriptor", + static_cast(instance_.type().raw())); + return handler.GetIoStat(); + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Begin( + WorkQueue &); + +template +RT_API_ATTRS int DescriptorIoTicket::Continue(WorkQueue &workQueue) { + // Only derived type formatted I/O gets here. 
+ while (!IsComplete()) { + if (special_) { + if (auto defined{DefinedFormattedIo( + io_, instance_, *derived_, *special_, subscripts_)}) { + anyIoTookPlace_ |= *defined; + Advance(); + continue; + } + } + Descriptor &elementDesc{elementDescriptor_.descriptor()}; + elementDesc.Establish( + *derived_, nullptr, 0, nullptr, CFI_attribute_pointer); + elementDesc.set_base_addr(instance_.Element(subscripts_)); + Advance(); + if (int status{workQueue.BeginDerivedIo( + io_, elementDesc, *derived_, table_, anyIoTookPlace_)}; + status != StatOk) { + return status; + } + } + return StatOk; +} + +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); +template RT_API_ATTRS int DescriptorIoTicket::Continue( + WorkQueue &); + +template +RT_API_ATTRS bool DescriptorIO(IoStatementState &io, + const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { + bool anyIoTookPlace{false}; + WorkQueue workQueue{io.GetIoErrorHandler()}; + if (workQueue.BeginDescriptorIo(io, descriptor, table, anyIoTookPlace) == + StatContinue) { + workQueue.Run(); + } + return anyIoTookPlace; +} + +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); + RT_OFFLOAD_API_GROUP_END } // namespace Fortran::runtime::io::descr diff --git a/flang-rt/lib/runtime/descriptor-io.h b/flang-rt/lib/runtime/descriptor-io.h index eb60f106c9203..88ad59bd24b53 100644 --- a/flang-rt/lib/runtime/descriptor-io.h +++ b/flang-rt/lib/runtime/descriptor-io.h @@ -9,619 +9,27 @@ #ifndef FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ #define FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ -// Implementation of I/O data list item transfers based on descriptors. -// (All I/O items come through here so that the code is exercised for test; -// some scalar I/O data transfer APIs could be changed to bypass their use -// of descriptors in the future for better efficiency.) 
+#include "flang-rt/runtime/connection.h" -#include "edit-input.h" -#include "edit-output.h" -#include "unit.h" -#include "flang-rt/runtime/descriptor.h" -#include "flang-rt/runtime/io-stmt.h" -#include "flang-rt/runtime/namelist.h" -#include "flang-rt/runtime/terminator.h" -#include "flang-rt/runtime/type-info.h" -#include "flang/Common/optional.h" -#include "flang/Common/uint128.h" -#include "flang/Runtime/cpp-type.h" +namespace Fortran::runtime { +class Descriptor; +} // namespace Fortran::runtime -namespace Fortran::runtime::io::descr { -template -inline RT_API_ATTRS A &ExtractElement(IoStatementState &io, - const Descriptor &descriptor, const SubscriptValue subscripts[]) { - A *p{descriptor.Element(subscripts)}; - if (!p) { - io.GetIoErrorHandler().Crash("Bad address for I/O item -- null base " - "address or subscripts out of range"); - } - return *p; -} - -// Per-category descriptor-based I/O templates - -// TODO (perhaps as a nontrivial but small starter project): implement -// automatic repetition counts, like "10*3.14159", for list-directed and -// NAMELIST array output. 
- -template -inline RT_API_ATTRS bool FormattedIntegerIO(IoStatementState &io, - const Descriptor &descriptor, [[maybe_unused]] bool isSigned) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!EditIntegerOutput(io, *edit, x, isSigned)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditIntegerInput( - io, *edit, reinterpret_cast(&x), KIND, isSigned)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedIntegerIO: subscripts out of bounds"); - } - } else { - return false; - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedRealIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - if (auto edit{io.GetNextDataEdit()}) { - RawType &x{ExtractElement(io, descriptor, subscripts)}; - if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditRealInput(io, *edit, reinterpret_cast(&x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedRealIO: subscripts out of bounds"); - } - } else { - 
return false; - } - } - return true; -} +namespace Fortran::runtime::io { +class IoStatementState; +struct NonTbpDefinedIoTable; +} // namespace Fortran::runtime::io -template -inline RT_API_ATTRS bool FormattedComplexIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - bool isListOutput{ - io.get_if>() != nullptr}; - using RawType = typename RealOutputEditing::BinaryFloatingPoint; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - RawType *x{&ExtractElement(io, descriptor, subscripts)}; - if (isListOutput) { - DataEdit rEdit, iEdit; - rEdit.descriptor = DataEdit::ListDirectedRealPart; - iEdit.descriptor = DataEdit::ListDirectedImaginaryPart; - rEdit.modes = iEdit.modes = io.mutableModes(); - if (!RealOutputEditing{io, x[0]}.Edit(rEdit) || - !RealOutputEditing{io, x[1]}.Edit(iEdit)) { - return false; - } - } else { - for (int k{0}; k < 2; ++k, ++x) { - auto edit{io.GetNextDataEdit()}; - if (!edit) { - return false; - } else if constexpr (DIR == Direction::Output) { - if (!RealOutputEditing{io, *x}.Edit(*edit)) { - return false; - } - } else if (edit->descriptor == DataEdit::ListDirectedNullValue) { - break; - } else if (EditRealInput( - io, *edit, reinterpret_cast(x))) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedComplexIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedCharacterIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t length{descriptor.ElementBytes() / sizeof(A)}; - auto *listOutput{io.get_if>()}; - bool anyInput{false}; - for 
(std::size_t j{0}; j < numElements; ++j) { - A *x{&ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedCharacterOutput(io, *listOutput, x, length)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditCharacterOutput(io, *edit, x, length)) { - return false; - } - } else { // input - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - if (EditCharacterInput(io, *edit, x, length)) { - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedCharacterIO: subscripts out of bounds"); - } - } - return true; -} - -template -inline RT_API_ATTRS bool FormattedLogicalIO( - IoStatementState &io, const Descriptor &descriptor) { - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - auto *listOutput{io.get_if>()}; - using IntType = CppTypeFor; - bool anyInput{false}; - for (std::size_t j{0}; j < numElements; ++j) { - IntType &x{ExtractElement(io, descriptor, subscripts)}; - if (listOutput) { - if (!ListDirectedLogicalOutput(io, *listOutput, x != 0)) { - return false; - } - } else if (auto edit{io.GetNextDataEdit()}) { - if constexpr (DIR == Direction::Output) { - if (!EditLogicalOutput(io, *edit, x != 0)) { - return false; - } - } else { - if (edit->descriptor != DataEdit::ListDirectedNullValue) { - bool truth{}; - if (EditLogicalInput(io, *edit, truth)) { - x = truth; - anyInput = true; - } else { - return anyInput && edit->IsNamelist(); - } - } - } - } else { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && j + 1 < numElements) { - io.GetIoErrorHandler().Crash( - "FormattedLogicalIO: subscripts out of bounds"); - } - } - return true; -} +namespace Fortran::runtime::io::descr { template -static 
RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, +RT_API_ATTRS bool DescriptorIO(IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable * = nullptr); -// For intrinsic (not defined) derived type I/O, formatted & unformatted -template -static RT_API_ATTRS bool DefaultComponentIO(IoStatementState &io, - const typeInfo::Component &component, const Descriptor &origDescriptor, - const SubscriptValue origSubscripts[], Terminator &terminator, - const NonTbpDefinedIoTable *table) { -#if !defined(RT_DEVICE_AVOID_RECURSION) - if (component.genre() == typeInfo::Component::Genre::Data) { - // Create a descriptor for the component - StaticDescriptor statDesc; - Descriptor &desc{statDesc.descriptor()}; - component.CreatePointerDescriptor( - desc, origDescriptor, terminator, origSubscripts); - return DescriptorIO(io, desc, table); - } else { - // Component is itself a descriptor - char *pointer{ - origDescriptor.Element(origSubscripts) + component.offset()}; - const Descriptor &compDesc{*reinterpret_cast(pointer)}; - return compDesc.IsAllocated() && DescriptorIO(io, compDesc, table); - } -#else - terminator.Crash("not yet implemented: component IO"); -#endif -} - -template -static RT_API_ATTRS bool DefaultComponentwiseFormattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table, const SubscriptValue subscripts[]) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - // Return true for NAMELIST input if any component appeared. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && k > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -template -static RT_API_ATTRS bool DefaultComponentwiseUnformattedIO(IoStatementState &io, - const Descriptor &descriptor, const typeInfo::DerivedType &type, - const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const Descriptor &compArray{type.component()}; - RUNTIME_CHECK(handler, compArray.rank() == 1); - std::size_t numComponents{compArray.Elements()}; - std::size_t numElements{descriptor.Elements()}; - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - SubscriptValue at[maxRank]; - compArray.GetLowerBounds(at); - for (std::size_t k{0}; k < numComponents; - ++k, compArray.IncrementSubscripts(at)) { - const typeInfo::Component &component{ - *compArray.Element(at)}; - if (!DefaultComponentIO( - io, component, descriptor, subscripts, handler, table)) { - return false; - } - } - } - return true; -} - -RT_API_ATTRS Fortran::common::optional DefinedFormattedIo( - IoStatementState &, const Descriptor &, const typeInfo::DerivedType &, - const typeInfo::SpecialBinding &, const SubscriptValue[]); - -template -static RT_API_ATTRS bool FormattedDerivedTypeIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - // Derived type information must be present for formatted I/O. - const DescriptorAddendum *addendum{descriptor.Addendum()}; - RUNTIME_CHECK(handler, addendum != nullptr); - const typeInfo::DerivedType *type{addendum->derivedType()}; - RUNTIME_CHECK(handler, type != nullptr); - Fortran::common::optional nonTbpSpecial; - const typeInfo::SpecialBinding *special{nullptr}; - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? 
common::DefinedIo::ReadFormatted - : common::DefinedIo::WriteFormatted)}) { - if (definedIo->subroutine) { - nonTbpSpecial.emplace(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false); - special = &*nonTbpSpecial; - } - } - } - if (!special) { - if (const typeInfo::SpecialBinding * - binding{type->FindSpecialBinding(DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadFormatted - : typeInfo::SpecialBinding::Which::WriteFormatted)}) { - if (!table || !table->ignoreNonTbpEntries || binding->isTypeBound()) { - special = binding; - } - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - std::size_t numElements{descriptor.Elements()}; - for (std::size_t j{0}; j < numElements; - ++j, descriptor.IncrementSubscripts(subscripts)) { - Fortran::common::optional result; - if (special) { - result = DefinedFormattedIo(io, descriptor, *type, *special, subscripts); - } - if (!result) { - result = DefaultComponentwiseFormattedIO( - io, descriptor, *type, table, subscripts); - } - if (!result.value()) { - // Return true for NAMELIST input if we got anything. 
- auto *listInput{ - io.get_if>()}; - return DIR == Direction::Input && j > 0 && listInput && - listInput->inNamelistSequence(); - } - } - return true; -} - -RT_API_ATTRS bool DefinedUnformattedIo(IoStatementState &, const Descriptor &, - const typeInfo::DerivedType &, const typeInfo::SpecialBinding &); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); +extern template RT_API_ATTRS bool DescriptorIO( + IoStatementState &, const Descriptor &, const NonTbpDefinedIoTable *); -// Unformatted I/O -template -static RT_API_ATTRS bool UnformattedDescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table = nullptr) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - const DescriptorAddendum *addendum{descriptor.Addendum()}; - if (const typeInfo::DerivedType * - type{addendum ? addendum->derivedType() : nullptr}) { - // derived type unformatted I/O - if (table) { - if (const auto *definedIo{table->Find(*type, - DIR == Direction::Input ? common::DefinedIo::ReadUnformatted - : common::DefinedIo::WriteUnformatted)}) { - if (definedIo->subroutine) { - typeInfo::SpecialBinding special{DIR == Direction::Input - ? typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted, - definedIo->subroutine, definedIo->isDtvArgPolymorphic, false, - false}; - if (Fortran::common::optional wasDefined{ - DefinedUnformattedIo(io, descriptor, *type, special)}) { - return *wasDefined; - } - } else { - return DefaultComponentwiseUnformattedIO( - io, descriptor, *type, table); - } - } - } - if (const typeInfo::SpecialBinding * - special{type->FindSpecialBinding(DIR == Direction::Input - ? 
typeInfo::SpecialBinding::Which::ReadUnformatted - : typeInfo::SpecialBinding::Which::WriteUnformatted)}) { - if (!table || !table->ignoreNonTbpEntries || special->isTypeBound()) { - // defined derived type unformatted I/O - return DefinedUnformattedIo(io, descriptor, *type, *special); - } - } - // Default derived type unformatted I/O - // TODO: If no component at any level has defined READ or WRITE - // (as appropriate), the elements are contiguous, and no byte swapping - // is active, do a block transfer via the code below. - return DefaultComponentwiseUnformattedIO(io, descriptor, *type, table); - } else { - // intrinsic type unformatted I/O - auto *externalUnf{io.get_if>()}; - auto *childUnf{io.get_if>()}; - auto *inq{ - DIR == Direction::Output ? io.get_if() : nullptr}; - RUNTIME_CHECK(handler, externalUnf || childUnf || inq); - std::size_t elementBytes{descriptor.ElementBytes()}; - std::size_t numElements{descriptor.Elements()}; - std::size_t swappingBytes{elementBytes}; - if (auto maybeCatAndKind{descriptor.type().GetCategoryAndKind()}) { - // Byte swapping units can be smaller than elements, namely - // for COMPLEX and CHARACTER. - if (maybeCatAndKind->first == TypeCategory::Character) { - // swap each character position independently - swappingBytes = maybeCatAndKind->second; // kind - } else if (maybeCatAndKind->first == TypeCategory::Complex) { - // swap real and imaginary components independently - swappingBytes /= 2; - } - } - SubscriptValue subscripts[maxRank]; - descriptor.GetLowerBounds(subscripts); - using CharType = - std::conditional_t; - auto Transfer{[=](CharType &x, std::size_t totalBytes) -> bool { - if constexpr (DIR == Direction::Output) { - return externalUnf ? externalUnf->Emit(&x, totalBytes, swappingBytes) - : childUnf ? childUnf->Emit(&x, totalBytes, swappingBytes) - : inq->Emit(&x, totalBytes, swappingBytes); - } else { - return externalUnf ? 
externalUnf->Receive(&x, totalBytes, swappingBytes) - : childUnf->Receive(&x, totalBytes, swappingBytes); - } - }}; - bool swapEndianness{externalUnf && externalUnf->unit().swapEndianness()}; - if (!swapEndianness && - descriptor.IsContiguous()) { // contiguous unformatted I/O - char &x{ExtractElement(io, descriptor, subscripts)}; - return Transfer(x, numElements * elementBytes); - } else { // non-contiguous or byte-swapped intrinsic type unformatted I/O - for (std::size_t j{0}; j < numElements; ++j) { - char &x{ExtractElement(io, descriptor, subscripts)}; - if (!Transfer(x, elementBytes)) { - return false; - } - if (!descriptor.IncrementSubscripts(subscripts) && - j + 1 < numElements) { - handler.Crash("DescriptorIO: subscripts out of bounds"); - } - } - return true; - } - } -} - -template -static RT_API_ATTRS bool DescriptorIO(IoStatementState &io, - const Descriptor &descriptor, const NonTbpDefinedIoTable *table) { - IoErrorHandler &handler{io.GetIoErrorHandler()}; - if (handler.InError()) { - return false; - } - if (!io.get_if>()) { - handler.Crash("DescriptorIO() called for wrong I/O direction"); - return false; - } - if constexpr (DIR == Direction::Input) { - if (!io.BeginReadingRecord()) { - return false; - } - } - if (!io.get_if>()) { - return UnformattedDescriptorIO(io, descriptor, table); - } - if (auto catAndKind{descriptor.type().GetCategoryAndKind()}) { - TypeCategory cat{catAndKind->first}; - int kind{catAndKind->second}; - switch (cat) { - case TypeCategory::Integer: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, true); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, true); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, true); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, true); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, true); - default: - handler.Crash( - "not yet implemented: INTEGER(KIND=%d) in formatted IO", kind); - return false; - } - case 
TypeCategory::Unsigned: - switch (kind) { - case 1: - return FormattedIntegerIO<1, DIR>(io, descriptor, false); - case 2: - return FormattedIntegerIO<2, DIR>(io, descriptor, false); - case 4: - return FormattedIntegerIO<4, DIR>(io, descriptor, false); - case 8: - return FormattedIntegerIO<8, DIR>(io, descriptor, false); - case 16: - return FormattedIntegerIO<16, DIR>(io, descriptor, false); - default: - handler.Crash( - "not yet implemented: UNSIGNED(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Real: - switch (kind) { - case 2: - return FormattedRealIO<2, DIR>(io, descriptor); - case 3: - return FormattedRealIO<3, DIR>(io, descriptor); - case 4: - return FormattedRealIO<4, DIR>(io, descriptor); - case 8: - return FormattedRealIO<8, DIR>(io, descriptor); - case 10: - return FormattedRealIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedRealIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: REAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Complex: - switch (kind) { - case 2: - return FormattedComplexIO<2, DIR>(io, descriptor); - case 3: - return FormattedComplexIO<3, DIR>(io, descriptor); - case 4: - return FormattedComplexIO<4, DIR>(io, descriptor); - case 8: - return FormattedComplexIO<8, DIR>(io, descriptor); - case 10: - return FormattedComplexIO<10, DIR>(io, descriptor); - // TODO: case double/double - case 16: - return FormattedComplexIO<16, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: COMPLEX(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Character: - switch (kind) { - case 1: - return FormattedCharacterIO(io, descriptor); - case 2: - return FormattedCharacterIO(io, descriptor); - case 4: - return FormattedCharacterIO(io, descriptor); - default: - handler.Crash( - "not yet implemented: CHARACTER(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Logical: 
- switch (kind) { - case 1: - return FormattedLogicalIO<1, DIR>(io, descriptor); - case 2: - return FormattedLogicalIO<2, DIR>(io, descriptor); - case 4: - return FormattedLogicalIO<4, DIR>(io, descriptor); - case 8: - return FormattedLogicalIO<8, DIR>(io, descriptor); - default: - handler.Crash( - "not yet implemented: LOGICAL(KIND=%d) in formatted IO", kind); - return false; - } - case TypeCategory::Derived: - return FormattedDerivedTypeIO(io, descriptor, table); - } - } - handler.Crash("DescriptorIO: bad type code (%d) in descriptor", - static_cast(descriptor.type().raw())); - return false; -} } // namespace Fortran::runtime::io::descr #endif // FLANG_RT_RUNTIME_DESCRIPTOR_IO_H_ diff --git a/flang-rt/lib/runtime/environment.cpp b/flang-rt/lib/runtime/environment.cpp index 1d5304254ed0e..0f0564403c0e2 100644 --- a/flang-rt/lib/runtime/environment.cpp +++ b/flang-rt/lib/runtime/environment.cpp @@ -143,6 +143,10 @@ void ExecutionEnvironment::Configure(int ac, const char *av[], } } + if (auto *x{std::getenv("FLANG_RT_DEBUG")}) { + internalDebugging = std::strtol(x, nullptr, 10); + } + if (auto *x{std::getenv("ACC_OFFLOAD_STACK_SIZE")}) { char *end; auto n{std::strtoul(x, &end, 10)}; diff --git a/flang-rt/lib/runtime/namelist.cpp b/flang-rt/lib/runtime/namelist.cpp index b0cf2180fc6d4..1bef387a9771f 100644 --- a/flang-rt/lib/runtime/namelist.cpp +++ b/flang-rt/lib/runtime/namelist.cpp @@ -10,6 +10,7 @@ #include "descriptor-io.h" #include "flang-rt/runtime/emit-encoded.h" #include "flang-rt/runtime/io-stmt.h" +#include "flang-rt/runtime/type-info.h" #include "flang/Runtime/io-api.h" #include #include diff --git a/flang-rt/lib/runtime/tools.cpp b/flang-rt/lib/runtime/tools.cpp index b08195cd31e05..24d05f369fcbe 100644 --- a/flang-rt/lib/runtime/tools.cpp +++ b/flang-rt/lib/runtime/tools.cpp @@ -205,7 +205,7 @@ RT_API_ATTRS void ShallowCopyInner(const Descriptor &to, const Descriptor &from, // Doing the recursion upwards instead of downwards puts the more common // 
cases earlier in the if-chain and has a tangible impact on performance.
 template struct ShallowCopyRankSpecialize {
-  static bool execute(const Descriptor &to, const Descriptor &from,
+  static RT_API_ATTRS bool execute(const Descriptor &to, const Descriptor &from,
       bool toIsContiguous, bool fromIsContiguous) {
     if (to.rank() == RANK && from.rank() == RANK) {
       ShallowCopyInner(to, from, toIsContiguous, fromIsContiguous);
@@ -217,7 +217,7 @@ template struct ShallowCopyRankSpecialize {
 };
 
 template struct ShallowCopyRankSpecialize {
-  static bool execute(const Descriptor &to, const Descriptor &from,
+  static RT_API_ATTRS bool execute(const Descriptor &to, const Descriptor &from,
       bool toIsContiguous, bool fromIsContiguous) {
     return false;
   }
diff --git a/flang-rt/lib/runtime/type-info.cpp b/flang-rt/lib/runtime/type-info.cpp
index 82182696d70c6..451213202acef 100644
--- a/flang-rt/lib/runtime/type-info.cpp
+++ b/flang-rt/lib/runtime/type-info.cpp
@@ -140,11 +140,11 @@ RT_API_ATTRS void Component::CreatePointerDescriptor(Descriptor &descriptor,
     const SubscriptValue *subscripts) const {
   RUNTIME_CHECK(terminator, genre_ == Genre::Data);
   EstablishDescriptor(descriptor, container, terminator);
+  std::size_t offset{offset_};
   if (subscripts) {
-    descriptor.set_base_addr(container.Element(subscripts) + offset_);
-  } else {
-    descriptor.set_base_addr(container.OffsetElement() + offset_);
+    offset += container.SubscriptsToByteOffset(subscripts);
   }
+  descriptor.set_base_addr(container.OffsetElement() + offset);
   descriptor.raw().attribute = CFI_attribute_pointer;
 }
 
diff --git a/flang-rt/lib/runtime/work-queue.cpp b/flang-rt/lib/runtime/work-queue.cpp
new file mode 100644
index 0000000000000..a508ecb637102
--- /dev/null
+++ b/flang-rt/lib/runtime/work-queue.cpp
@@ -0,0 +1,161 @@
+//===-- lib/runtime/work-queue.cpp ------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "flang-rt/runtime/work-queue.h"
+#include "flang-rt/runtime/environment.h"
+#include "flang-rt/runtime/memory.h"
+#include "flang-rt/runtime/type-info.h"
+#include "flang/Common/visit.h"
+
+namespace Fortran::runtime {
+
+#if !defined(RT_DEVICE_COMPILATION)
+// FLANG_RT_DEBUG code is disabled when false.
+static constexpr bool enableDebugOutput{false};
+#endif
+
+RT_OFFLOAD_API_GROUP_BEGIN
+
+RT_API_ATTRS Componentwise::Componentwise(const typeInfo::DerivedType &derived)
+    : derived_{derived}, components_{derived_.component().Elements()} {
+  GetComponent();
+}
+
+RT_API_ATTRS void Componentwise::GetComponent() {
+  if (IsComplete()) {
+    component_ = nullptr;
+  } else {
+    const Descriptor &componentDesc{derived_.component()};
+    component_ = componentDesc.ZeroBasedIndexedElement(
+        componentAt_);
+  }
+}
+
+RT_API_ATTRS int Ticket::Continue(WorkQueue &workQueue) {
+  if (!begun) {
+    begun = true;
+    return common::visit(
+        [&workQueue](
+            auto &specificTicket) { return specificTicket.Begin(workQueue); },
+        u);
+  } else {
+    return common::visit(
+        [&workQueue](auto &specificTicket) {
+          return specificTicket.Continue(workQueue);
+        },
+        u);
+  }
+}
+
+RT_API_ATTRS WorkQueue::~WorkQueue() {
+  if (last_) {
+    if ((last_->next = firstFree_)) {
+      last_->next->previous = last_;
+    }
+    firstFree_ = first_;
+    first_ = last_ = nullptr;
+  }
+  while (firstFree_) {
+    TicketList *next{firstFree_->next};
+    if (!firstFree_->isStatic) {
+      FreeMemory(firstFree_);
+    }
+    firstFree_ = next;
+  }
+}
+
+RT_API_ATTRS Ticket &WorkQueue::StartTicket() {
+  if (!firstFree_) {
+    void *p{AllocateMemoryOrCrash(terminator_, sizeof(TicketList))};
+    firstFree_ = new (p) TicketList;
+    firstFree_->isStatic = false;
+  }
+  TicketList *newTicket{firstFree_};
+  if ((firstFree_ = newTicket->next)) {
+    firstFree_->previous = nullptr;
+  }
+  TicketList *after{insertAfter_ ? insertAfter_->next : nullptr};
+  if ((newTicket->previous = insertAfter_ ? insertAfter_ : last_)) {
+    newTicket->previous->next = newTicket;
+  } else {
+    first_ = newTicket;
+  }
+  if ((newTicket->next = after)) {
+    after->previous = newTicket;
+  } else {
+    last_ = newTicket;
+  }
+  newTicket->ticket.begun = false;
+#if !defined(RT_DEVICE_COMPILATION)
+  if (enableDebugOutput &&
+      (executionEnvironment.internalDebugging &
+          ExecutionEnvironment::WorkQueue)) {
+    std::fprintf(stderr, "WQ: new ticket\n");
+  }
+#endif
+  return newTicket->ticket;
+}
+
+RT_API_ATTRS int WorkQueue::Run() {
+  while (last_) {
+    TicketList *at{last_};
+    insertAfter_ = last_;
+#if !defined(RT_DEVICE_COMPILATION)
+    if (enableDebugOutput &&
+        (executionEnvironment.internalDebugging &
+            ExecutionEnvironment::WorkQueue)) {
+      std::fprintf(stderr, "WQ: %zd %s\n", at->ticket.u.index(),
+          at->ticket.begun ? "Continue" : "Begin");
+    }
+#endif
+    int stat{at->ticket.Continue(*this)};
+#if !defined(RT_DEVICE_COMPILATION)
+    if (enableDebugOutput &&
+        (executionEnvironment.internalDebugging &
+            ExecutionEnvironment::WorkQueue)) {
+      std::fprintf(stderr, "WQ: ... stat %d\n", stat);
+    }
+#endif
+    insertAfter_ = nullptr;
+    if (stat == StatOk) {
+      if (at->previous) {
+        at->previous->next = at->next;
+      } else {
+        first_ = at->next;
+      }
+      if (at->next) {
+        at->next->previous = at->previous;
+      } else {
+        last_ = at->previous;
+      }
+      if ((at->next = firstFree_)) {
+        at->next->previous = at;
+      }
+      at->previous = nullptr;
+      firstFree_ = at;
+    } else if (stat != StatContinue) {
+      Stop();
+      return stat;
+    }
+  }
+  return StatOk;
+}
+
+RT_API_ATTRS void WorkQueue::Stop() {
+  if (last_) {
+    if ((last_->next = firstFree_)) {
+      last_->next->previous = last_;
+    }
+    firstFree_ = first_;
+    first_ = last_ = nullptr;
+  }
+}
+
+RT_OFFLOAD_API_GROUP_END
+
+} // namespace Fortran::runtime
diff --git a/flang-rt/unittests/Runtime/ExternalIOTest.cpp b/flang-rt/unittests/Runtime/ExternalIOTest.cpp
index 3833e48be3dd6..6c148b1de6f82 100644
--- a/flang-rt/unittests/Runtime/ExternalIOTest.cpp
+++ b/flang-rt/unittests/Runtime/ExternalIOTest.cpp
@@ -184,7 +184,7 @@ TEST(ExternalIOTests, TestSequentialFixedUnformatted) {
   io = IONAME(BeginInquireIoLength)(__FILE__, __LINE__);
   for (int j{1}; j <= 3; ++j) {
     ASSERT_TRUE(IONAME(OutputDescriptor)(io, desc))
-        << "OutputDescriptor() for InquireIoLength";
+        << "OutputDescriptor() for InquireIoLength " << j;
   }
   ASSERT_EQ(IONAME(GetIoLength)(io), 3 * recl) << "GetIoLength";
   ASSERT_EQ(IONAME(EndIoStatement)(io), IostatOk)
diff --git a/flang/docs/Extensions.md b/flang/docs/Extensions.md
index 51969de5ac7fe..377d01dbec69a 100644
--- a/flang/docs/Extensions.md
+++ b/flang/docs/Extensions.md
@@ -850,6 +850,16 @@ print *, [(j,j=1,10)]
   warning since such values may have become defined by the time the nested
   expression's value is required.
 
+* Intrinsic assignment of arrays is defined elementally, and intrinsic
+  assignment of derived type components is defined componentwise.
+  However, when intrinsic assignment takes place for an array of derived
+  type, the order of the loop nesting is not defined.
+  Some compilers will loop over the elements, assigning all of the components
+  of each element before proceeding to the next element.
+  This compiler loops over all of the components, and assigns all of
+  the elements for each component before proceeding to the next component.
+  A program using defined assignment might be able to detect the difference.
+
 ## De Facto Standard Features
 
 * `EXTENDS_TYPE_OF()` returns `.TRUE.` if both of its arguments have the
diff --git a/flang/include/flang/Runtime/assign.h b/flang/include/flang/Runtime/assign.h
index bc80997a1bec2..eb1f63184a177 100644
--- a/flang/include/flang/Runtime/assign.h
+++ b/flang/include/flang/Runtime/assign.h
@@ -38,7 +38,7 @@ enum AssignFlags {
   ComponentCanBeDefinedAssignment = 1 << 3,
   ExplicitLengthCharacterLHS = 1 << 4,
   PolymorphicLHS = 1 << 5,
-  DeallocateLHS = 1 << 6
+  DeallocateLHS = 1 << 6,
 };
 
 #ifdef RT_DEVICE_COMPILATION
diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h
index 3839bc1d2a215..79f7032aac312 100644
--- a/flang/include/flang/Semantics/tools.h
+++ b/flang/include/flang/Semantics/tools.h
@@ -182,9 +182,12 @@ const Symbol *HasImpureFinal(
     const Symbol &, std::optional rank = std::nullopt);
 // Is this type finalizable or does it contain any polymorphic allocatable
 // ultimate components?
-bool MayRequireFinalization(const DerivedTypeSpec &derived);
+bool MayRequireFinalization(const DerivedTypeSpec &);
 // Does this type have an allocatable direct component?
-bool HasAllocatableDirectComponent(const DerivedTypeSpec &derived);
+bool HasAllocatableDirectComponent(const DerivedTypeSpec &);
+// Does this type have any defined assignment at any level (or any polymorphic
+// allocatable)?
+bool MayHaveDefinedAssignment(const DerivedTypeSpec &);
 
 bool IsInBlankCommon(const Symbol &);
 bool IsAssumedLengthCharacter(const Symbol &);
diff --git a/flang/lib/Semantics/runtime-type-info.cpp b/flang/lib/Semantics/runtime-type-info.cpp
index 98295f3705a71..e1fe087b8740e 100644
--- a/flang/lib/Semantics/runtime-type-info.cpp
+++ b/flang/lib/Semantics/runtime-type-info.cpp
@@ -661,6 +661,10 @@ const Symbol *RuntimeTableBuilder::DescribeType(
     AddValue(dtValues, derivedTypeSchema_, "nofinalizationneeded"s,
         IntExpr<1>(
             derivedTypeSpec && !MayRequireFinalization(*derivedTypeSpec)));
+    // Similarly, a flag to enable optimized runtime assignment.
+    AddValue(dtValues, derivedTypeSchema_, "nodefinedassignment"s,
+        IntExpr<1>(
+            derivedTypeSpec && !MayHaveDefinedAssignment(*derivedTypeSpec)));
   }
   dtObject.get().set_init(MaybeExpr{
       StructureExpr(Structure(derivedTypeSchema_, std::move(dtValues)))});
diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp
index 1d1e3ac044166..3247addc905ba 100644
--- a/flang/lib/Semantics/tools.cpp
+++ b/flang/lib/Semantics/tools.cpp
@@ -813,6 +813,38 @@ bool HasAllocatableDirectComponent(const DerivedTypeSpec &derived) {
   return std::any_of(directs.begin(), directs.end(), IsAllocatable);
 }
 
+static bool MayHaveDefinedAssignment(
+    const DerivedTypeSpec &derived, std::set &checked) {
+  if (const Scope *scope{derived.GetScope()};
+      scope && checked.find(scope) == checked.end()) {
+    checked.insert(scope);
+    for (const auto &[_, symbolRef] : *scope) {
+      if (const auto *generic{symbolRef->detailsIf()}) {
+        if (generic->kind().IsAssignment()) {
+          return true;
+        }
+      } else if (symbolRef->has() &&
+          !IsPointer(*symbolRef)) {
+        if (const DeclTypeSpec *type{symbolRef->GetType()}) {
+          if (type->IsPolymorphic()) {
+            return true;
+          } else if (const DerivedTypeSpec *derived{type->AsDerived()}) {
+            if (MayHaveDefinedAssignment(*derived, checked)) {
+              return true;
+            }
+          }
+        }
+      }
+    }
+  }
+  return false;
+}
+
+bool MayHaveDefinedAssignment(const DerivedTypeSpec &derived) {
+  std::set checked;
+  return MayHaveDefinedAssignment(derived, checked);
+}
+
 bool IsAssumedLengthCharacter(const Symbol &symbol) {
   if (const DeclTypeSpec * type{symbol.GetType()}) {
     return type->category() == DeclTypeSpec::Character &&
diff --git a/flang/module/__fortran_type_info.f90 b/flang/module/__fortran_type_info.f90
index b30a6bf697563..7226b06504d28 100644
--- a/flang/module/__fortran_type_info.f90
+++ b/flang/module/__fortran_type_info.f90
@@ -52,7 +52,8 @@
     integer(1) :: noInitializationNeeded ! 1 if no component w/ init
     integer(1) :: noDestructionNeeded ! 1 if no component w/ dealloc/final
     integer(1) :: noFinalizationNeeded ! 1 if nothing finalizeable
-    integer(1) :: __padding0(4)
+    integer(1) :: noDefinedAssignment ! 1 if no defined ASSIGNMENT(=)
+    integer(1) :: __padding0(3)
   end type
 
   type :: Binding
diff --git a/flang/test/Lower/volatile-openmp.f90 b/flang/test/Lower/volatile-openmp.f90
index 28f0bf78f33c9..2e05b652822b5 100644
--- a/flang/test/Lower/volatile-openmp.f90
+++ b/flang/test/Lower/volatile-openmp.f90
@@ -23,11 +23,11 @@
 ! CHECK: %[[VAL_11:.*]] = fir.address_of(@_QFEcontainer) : !fir.ref>>}>>
 ! CHECK: %[[VAL_12:.*]] = fir.volatile_cast %[[VAL_11]] : (!fir.ref>>}>>) -> !fir.ref>>}>, volatile>
 ! CHECK: %[[VAL_13:.*]]:2 = hlfir.declare %[[VAL_12]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFEcontainer"} : (!fir.ref>>}>, volatile>) -> (!fir.ref>>}>, volatile>, !fir.ref>>}>, volatile>)
-!
CHECK: %[[VAL_14:.*]] = fir.address_of(@_QFE.c.t) : !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>> +! CHECK: %[[VAL_14:.*]] = fir.address_of(@_QFE.c.t) : !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,nodefinedassignment:i8,__padding0:!fir.array<3xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>> ! 
CHECK: %[[VAL_15:.*]] = fir.shape_shift %[[VAL_0]], %[[VAL_1]] : (index, index) -> !fir.shapeshift<1> -! CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_14]](%[[VAL_15]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} : (!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.shapeshift<1>) -> 
(!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>) -! 
CHECK: %[[VAL_17:.*]] = fir.address_of(@_QFE.dt.t) : !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>> -! CHECK: %[[VAL_18:.*]]:2 = hlfir.declare %[[VAL_17]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.dt.t"} : (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) -> 
(!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>, !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,__padding0:!fir.array<4xi8>}>>) +! 
CHECK: %[[VAL_16:.*]]:2 = hlfir.declare %[[VAL_14]](%[[VAL_15]]) {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.c.t"} : (!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,nodefinedassignment:i8,__padding0:!fir.array<3xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, !fir.shapeshift<1>) -> (!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,nodefinedassignment:i8,__padding0:!fir.array<3xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>, 
!fir.ref>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,nodefinedassignment:i8,__padding0:!fir.array<3xi8>}>>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>) +! CHECK: %[[VAL_17:.*]] = fir.address_of(@_QFE.dt.t) : !fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,nodefinedassignment:i8,__padding0:!fir.array<3xi8>}>> +! 
CHECK: %[[VAL_18:.*]]:2 = hlfir.declare %[[VAL_17]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFE.dt.t"} : (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,nodefinedassignment:i8,__padding0:!fir.array<3xi8>}>>) -> (!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,nodefinedassignment:i8,__padding0:!fir.array<3xi8>}>>, 
!fir.ref,name:!fir.box>>}>>>>,name:!fir.box>>,sizeinbytes:i64,uninstantiated:!fir.box>>,kindparameter:!fir.box>>,lenparameterkind:!fir.box>>,component:!fir.box>>,genre:i8,category:i8,kind:i8,rank:i8,__padding0:!fir.array<4xi8>,offset:i64,characterlen:!fir.type<_QM__fortran_type_infoTvalue{{[<]?}}{genre:i8,__padding0:!fir.array<7xi8>,value:i64}{{[>]?}}>,derived:!fir.box>>,lenvalue:!fir.box,value:i64}{{[>]?}}>>>>,bounds:!fir.box,value:i64}{{[>]?}}>>>>,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_ptr{__address:i64}>}>>>>,procptr:!fir.box>>,offset:i64,initialization:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}>>>>,special:!fir.box,proc:!fir.type<_QM__fortran_builtinsT__builtin_c_funptr{__address:i64}>}{{[>]?}}>>>>,specialbitset:i32,hasparent:i8,noinitializationneeded:i8,nodestructionneeded:i8,nofinalizationneeded:i8,nodefinedassignment:i8,__padding0:!fir.array<3xi8>}>>) ! CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_13]]#0{"array"} {fortran_attrs = #fir.var_attrs} : (!fir.ref>>}>, volatile>) -> !fir.ref>>, volatile> ! CHECK: %[[VAL_20:.*]] = fir.load %[[VAL_19]] : !fir.ref>>, volatile> ! 
CHECK: %[[VAL_21:.*]]:3 = fir.box_dims %[[VAL_20]], %[[VAL_0]] : (!fir.box>>, index) -> (index, index, index)
diff --git a/flang/test/Semantics/typeinfo01.f90 b/flang/test/Semantics/typeinfo01.f90
index c1427f28753cf..7b1a19c3a9725 100644
--- a/flang/test/Semantics/typeinfo01.f90
+++ b/flang/test/Semantics/typeinfo01.f90
@@ -8,7 +8,7 @@ module m01
   end type
 !CHECK: Module scope: m01
 !CHECK: .c.t1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:0_8 init:[component::component(name=.n.n,genre=1_1,category=0_1,kind=4_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=NULL(),lenvalue=NULL(),bounds=NULL(),initialization=NULL())]
-!CHECK: .dt.t1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t1,sizeinbytes=4_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t1,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1)
+!CHECK: .dt.t1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t1,sizeinbytes=4_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t1,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1)
 !CHECK: .n.n, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: CHARACTER(1_8,1) init:"n"
 !CHECK: .n.t1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: CHARACTER(2_8,1) init:"t1"
 !CHECK: DerivedType scope: t1
@@ -23,8 +23,8 @@ module m02
   end type
 !CHECK: .c.child, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:1_8
init:[component::component(name=.n.parent,genre=1_1,category=6_1,kind=0_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=.dt.parent,lenvalue=NULL(),bounds=NULL(),initialization=NULL()),component(name=.n.cn,genre=1_1,category=0_1,kind=4_1,rank=0_1,offset=4_8,characterlen=value(genre=1_1,value=0_8),derived=NULL(),lenvalue=NULL(),bounds=NULL(),initialization=NULL())] !CHECK: .c.parent, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:0_8 init:[component::component(name=.n.pn,genre=1_1,category=0_1,kind=4_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=NULL(),lenvalue=NULL(),bounds=NULL(),initialization=NULL())] -!CHECK: .dt.child, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.child,sizeinbytes=8_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.child,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) -!CHECK: .dt.parent, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.parent,sizeinbytes=4_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.parent,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.child, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.child,sizeinbytes=8_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.child,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) +!CHECK: .dt.parent, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) 
init:derivedtype(binding=NULL(),name=.n.parent,sizeinbytes=4_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.parent,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) end module module m03 @@ -35,7 +35,7 @@ module m03 type(kpdt(4)) :: x !CHECK: .c.kpdt.4, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:0_8 init:[component::component(name=.n.a,genre=1_1,category=2_1,kind=4_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=NULL(),lenvalue=NULL(),bounds=NULL(),initialization=NULL())] !CHECK: .dt.kpdt, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(name=.n.kpdt,uninstantiated=NULL(),kindparameter=.kp.kpdt,lenparameterkind=NULL()) -!CHECK: .dt.kpdt.4, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.kpdt,sizeinbytes=4_8,uninstantiated=.dt.kpdt,kindparameter=.kp.kpdt.4,lenparameterkind=NULL(),component=.c.kpdt.4,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.kpdt.4, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.kpdt,sizeinbytes=4_8,uninstantiated=.dt.kpdt,kindparameter=.kp.kpdt.4,lenparameterkind=NULL(),component=.c.kpdt.4,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) !CHECK: .kp.kpdt.4, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: INTEGER(8) shape: 0_8:0_8 init:[INTEGER(8)::4_8] end module @@ -49,7 +49,7 @@ module m04 subroutine s1(x) class(tbps), intent(in) :: x end subroutine -!CHECK: .dt.tbps, SAVE, TARGET (CompilerCreated, ReadOnly): 
ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.tbps,name=.n.tbps,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.tbps, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.tbps,name=.n.tbps,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) !CHECK: .v.tbps, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:1_8 init:[binding::binding(proc=s1,name=.n.b1),binding(proc=s1,name=.n.b2)] end module @@ -61,7 +61,7 @@ module m05 subroutine s1(x) class(t), intent(in) :: x end subroutine -!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t,sizeinbytes=8_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=.p.t,special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t,sizeinbytes=8_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=.p.t,special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) !CHECK: .p.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(procptrcomponent) shape: 0_8:0_8 init:[procptrcomponent::procptrcomponent(name=.n.p1,offset=0_8,initialization=s1)] end module @@ -85,8 +85,8 
@@ subroutine s2(x, y) class(t), intent(in) :: y end subroutine !CHECK: .c.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:0_8 init:[component::component(name=.n.t,genre=1_1,category=6_1,kind=0_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=.dt.t,lenvalue=NULL(),bounds=NULL(),initialization=NULL())] -!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=2_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) -!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t2,name=.n.t2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=2_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=0_1) +!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t2,name=.n.t2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=0_1) !CHECK: .s.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: 
TYPE(specialbinding) shape: 0_8:0_8 init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s1)] !CHECK: .v.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s1,name=.n.s1)] !CHECK: .v.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s2,name=.n.s1)] @@ -103,7 +103,7 @@ impure elemental subroutine s1(x, y) class(t), intent(out) :: x class(t), intent(in) :: y end subroutine -!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=4_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=4_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=0_1) !CHECK: .s.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 init:[specialbinding::specialbinding(which=2_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s1)] !CHECK: .v.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s1,name=.n.s1)] end module @@ -126,7 +126,7 @@ impure elemental subroutine s3(x) subroutine s4(x) type(t), contiguous :: x(:,:,:) end subroutine -!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) 
init:derivedtype(binding=NULL(),name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=7296_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=0_1,nofinalizationneeded=0_1) +!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=7296_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=0_1,nofinalizationneeded=0_1,nodefinedassignment=1_1) !CHECK: .s.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:3_8 init:[specialbinding::specialbinding(which=7_1,isargdescriptorset=0_1,istypebound=1_1,isargcontiguousset=0_1,proc=s3),specialbinding(which=10_1,isargdescriptorset=1_1,istypebound=1_1,isargcontiguousset=0_1,proc=s1),specialbinding(which=11_1,isargdescriptorset=0_1,istypebound=1_1,isargcontiguousset=1_1,proc=s2),specialbinding(which=12_1,isargdescriptorset=1_1,istypebound=1_1,isargcontiguousset=1_1,proc=s4)] end module @@ -168,7 +168,7 @@ subroutine wu(x,u,iostat,iomsg) integer, intent(out) :: iostat character(len=*), intent(inout) :: iomsg end subroutine -!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=120_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) 
init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=120_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) !CHECK: .s.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:3_8 init:[specialbinding::specialbinding(which=3_1,isargdescriptorset=1_1,istypebound=1_1,isargcontiguousset=0_1,proc=rf),specialbinding(which=4_1,isargdescriptorset=1_1,istypebound=1_1,isargcontiguousset=0_1,proc=ru),specialbinding(which=5_1,isargdescriptorset=1_1,istypebound=1_1,isargcontiguousset=0_1,proc=wf),specialbinding(which=6_1,isargdescriptorset=1_1,istypebound=1_1,isargcontiguousset=0_1,proc=wu)] !CHECK: .v.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:3_8 init:[binding::binding(proc=rf,name=.n.rf),binding(proc=ru,name=.n.ru),binding(proc=wf,name=.n.wf),binding(proc=wu,name=.n.wu)] end module @@ -217,7 +217,7 @@ subroutine wu(x,u,iostat,iomsg) integer, intent(out) :: iostat character(len=*), intent(inout) :: iomsg end subroutine -!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=120_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=120_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) !CHECK: .s.t, SAVE, 
TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:3_8 init:[specialbinding::specialbinding(which=3_1,isargdescriptorset=0_1,istypebound=0_1,isargcontiguousset=0_1,proc=rf),specialbinding(which=4_1,isargdescriptorset=0_1,istypebound=0_1,isargcontiguousset=0_1,proc=ru),specialbinding(which=5_1,isargdescriptorset=0_1,istypebound=0_1,isargcontiguousset=0_1,proc=wf),specialbinding(which=6_1,isargdescriptorset=0_1,istypebound=0_1,isargcontiguousset=0_1,proc=wu)] end module @@ -234,7 +234,7 @@ module m11 !CHECK: .c.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:3_8 init:[component::component(name=.n.allocatable,genre=3_1,category=2_1,kind=4_1,rank=1_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=NULL(),lenvalue=NULL(),bounds=NULL(),initialization=NULL()),component(name=.n.pointer,genre=2_1,category=2_1,kind=4_1,rank=0_1,offset=48_8,characterlen=value(genre=1_1,value=0_8),derived=NULL(),lenvalue=NULL(),bounds=NULL(),initialization=.di.t.pointer),component(name=.n.chauto,genre=4_1,category=4_1,kind=1_1,rank=0_1,offset=72_8,characterlen=value(genre=3_1,value=0_8),derived=NULL(),lenvalue=NULL(),bounds=NULL(),initialization=NULL()),component(name=.n.automatic,genre=4_1,category=2_1,kind=4_1,rank=1_1,offset=96_8,characterlen=value(genre=1_1,value=0_8),derived=NULL(),lenvalue=NULL(),bounds=.b.t.automatic,initialization=NULL())] !CHECK: .di.t.pointer, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(.dp.t.pointer) init:.dp.t.pointer(pointer=target) !CHECK: .dp.t.pointer (CompilerCreated): DerivedType components: pointer -!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) 
init:derivedtype(binding=NULL(),name=.n.t,sizeinbytes=144_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=.lpk.t,component=.c.t,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=1_1) +!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t,sizeinbytes=144_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=.lpk.t,component=.c.t,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) !CHECK: .lpk.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: INTEGER(1) shape: 0_8:0_8 init:[INTEGER(1)::8_1] !CHECK: DerivedType scope: .dp.t.pointer size=24 alignment=8 instantiation of .dp.t.pointer !CHECK: pointer, POINTER size=24 offset=0: ObjectEntity type: REAL(4) diff --git a/flang/test/Semantics/typeinfo03.f90 b/flang/test/Semantics/typeinfo03.f90 index f0c0a817da4a4..e2552d0a21d6f 100644 --- a/flang/test/Semantics/typeinfo03.f90 +++ b/flang/test/Semantics/typeinfo03.f90 @@ -6,4 +6,4 @@ module m class(*), pointer :: sp, ap(:) end type end module -!CHECK: .dt.haspointer, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.haspointer,sizeinbytes=104_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.haspointer,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.haspointer, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) 
init:derivedtype(binding=NULL(),name=.n.haspointer,sizeinbytes=104_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.haspointer,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) diff --git a/flang/test/Semantics/typeinfo04.f90 b/flang/test/Semantics/typeinfo04.f90 index de8464321a409..94dd2199db35a 100644 --- a/flang/test/Semantics/typeinfo04.f90 +++ b/flang/test/Semantics/typeinfo04.f90 @@ -7,18 +7,18 @@ module m contains final :: final end type -!CHECK: .dt.finalizable, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.finalizable,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.finalizable,specialbitset=128_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=0_1,nofinalizationneeded=0_1) +!CHECK: .dt.finalizable, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.finalizable,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.finalizable,specialbitset=128_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=0_1,nofinalizationneeded=0_1,nodefinedassignment=1_1) type, abstract :: t1 end type -!CHECK: .dt.t1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(name=.n.t1,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) 
init:derivedtype(name=.n.t1,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) type, abstract :: t2 real, allocatable :: a(:) end type -!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(name=.n.t2,sizeinbytes=48_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=1_1) +!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(name=.n.t2,sizeinbytes=48_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) type, abstract :: t3 type(finalizable) :: x end type -!CHECK: .dt.t3, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(name=.n.t3,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t3,procptr=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=0_1,nofinalizationneeded=0_1) +!CHECK: .dt.t3, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(name=.n.t3,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t3,procptr=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=0_1,nofinalizationneeded=0_1,nodefinedassignment=1_1) contains impure elemental subroutine final(x) type(finalizable), intent(in out) :: x diff --git a/flang/test/Semantics/typeinfo05.f90 b/flang/test/Semantics/typeinfo05.f90 
index 2a7f12a153eb8..df1aecf3821de 100644 --- a/flang/test/Semantics/typeinfo05.f90 +++ b/flang/test/Semantics/typeinfo05.f90 @@ -7,10 +7,10 @@ program main type t1 type(t2), pointer :: b end type t1 -!CHECK: .dt.t1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t1,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t1,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t1,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t1,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) type :: t2 type(t1) :: a end type t2 -! CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t2,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +! 
CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t2,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) end program main diff --git a/flang/test/Semantics/typeinfo06.f90 b/flang/test/Semantics/typeinfo06.f90 index 2385709a8eb44..22f37b1a4369d 100644 --- a/flang/test/Semantics/typeinfo06.f90 +++ b/flang/test/Semantics/typeinfo06.f90 @@ -7,10 +7,10 @@ program main type t1 type(t2), allocatable :: b end type t1 -!CHECK: .dt.t1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t1,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t1,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=1_1) +!CHECK: .dt.t1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t1,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t1,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) type :: t2 type(t1) :: a end type t2 -! CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t2,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=1_1) +! 
CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t2,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) end program main diff --git a/flang/test/Semantics/typeinfo07.f90 b/flang/test/Semantics/typeinfo07.f90 index e8766d9811db8..ab20d6f601106 100644 --- a/flang/test/Semantics/typeinfo07.f90 +++ b/flang/test/Semantics/typeinfo07.f90 @@ -16,7 +16,7 @@ type(t_container_extension) :: wrapper end type end -! CHECK: .dt.t_container, SAVE, TARGET (CompilerCreated, ReadOnly): {{.*}}noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=0_1) -! CHECK: .dt.t_container_extension, SAVE, TARGET (CompilerCreated, ReadOnly): {{.*}}noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=0_1) -! CHECK: .dt.t_container_not_polymorphic, SAVE, TARGET (CompilerCreated, ReadOnly): {{.*}}noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=1_1) -! CHECK: .dt.t_container_wrapper, SAVE, TARGET (CompilerCreated, ReadOnly): {{.*}}noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=0_1) +! CHECK: .dt.t_container, SAVE, TARGET (CompilerCreated, ReadOnly): {{.*}}noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=0_1,nodefinedassignment=0_1) +! CHECK: .dt.t_container_extension, SAVE, TARGET (CompilerCreated, ReadOnly): {{.*}}noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=0_1,nodefinedassignment=0_1) +! CHECK: .dt.t_container_not_polymorphic, SAVE, TARGET (CompilerCreated, ReadOnly): {{.*}}noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) +! 
CHECK: .dt.t_container_wrapper, SAVE, TARGET (CompilerCreated, ReadOnly): {{.*}}noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=0_1,nodefinedassignment=0_1) diff --git a/flang/test/Semantics/typeinfo08.f90 b/flang/test/Semantics/typeinfo08.f90 index 689cf469dee3b..391a66f3d6664 100644 --- a/flang/test/Semantics/typeinfo08.f90 +++ b/flang/test/Semantics/typeinfo08.f90 @@ -13,7 +13,7 @@ module m !CHECK: Module scope: m size=0 alignment=1 sourceRange=113 bytes !CHECK: .c.s, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:0_8 init:[component::component(name=.n.t1,genre=1_1,category=6_1,kind=0_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),lenvalue=NULL(),bounds=NULL(),initialization=NULL())] -!CHECK: .dt.s, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.s,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=.lpk.s,component=.c.s,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.s, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.s,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=.lpk.s,component=.c.s,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) !CHECK: .lpk.s, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: INTEGER(1) shape: 0_8:0_8 init:[INTEGER(1)::4_1] !CHECK: .n.s, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: CHARACTER(1_8,1) init:"s" !CHECK: .n.t1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: CHARACTER(2_8,1) init:"t1" diff --git a/flang/test/Semantics/typeinfo11.f90 b/flang/test/Semantics/typeinfo11.f90 index 92efc8f9ea54b..08e0b95abb763 
100644 --- a/flang/test/Semantics/typeinfo11.f90 +++ b/flang/test/Semantics/typeinfo11.f90 @@ -14,4 +14,4 @@ type(t2) x end -!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t2,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=0_1) +!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.t2,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=0_1,nodefinedassignment=0_1) diff --git a/flang/test/Semantics/typeinfo12.f90 b/flang/test/Semantics/typeinfo12.f90 new file mode 100644 index 0000000000000..6b23b63d28b1d --- /dev/null +++ b/flang/test/Semantics/typeinfo12.f90 @@ -0,0 +1,67 @@ +!RUN: bbc --dump-symbols %s | FileCheck %s +!Check "nodefinedassignment" settings. + +module m01 + + type hasAsst1 + contains + procedure asst1 + generic :: assignment(=) => asst1 + end type +!CHECK: .dt.hasasst1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.hasasst1,name=.n.hasasst1,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.hasasst1,specialbitset=4_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=0_1) + + type hasAsst2 ! 
no defined assignment relevant to the runtime + end type + interface assignment(=) + procedure asst2 + end interface +!CHECK: .dt.hasasst2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.hasasst2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) + + type test1 + type(hasAsst1) c + end type +!CHECK: .dt.test1, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.test1,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.test1,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=0_1) + + type test2 + type(hasAsst2) c + end type +!CHECK: .dt.test2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.test2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.test2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) + + type test3 + type(hasAsst1), pointer :: p + end type +!CHECK: .dt.test3, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.test3,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.test3,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) + + type test4 + type(hasAsst2), pointer :: p + end type +!CHECK: .dt.test4, SAVE, TARGET 
(CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.test4,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.test4,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) + + type, extends(hasAsst1) :: test5 + end type +!CHECK: .dt.test5, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.test5,name=.n.test5,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.test5,procptr=NULL(),special=.s.test5,specialbitset=4_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=0_1) + + type, extends(hasAsst2) :: test6 + end type +!CHECK: .dt.test6, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.test6,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.test6,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) + + type test7 + type(test7), allocatable :: c + end type +!CHECK: .dt.test7, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=NULL(),name=.n.test7,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.test7,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=1_1,nodefinedassignment=1_1) + + type test8 + class(test8), allocatable :: c + end type +!CHECK: .dt.test8, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) 
init:derivedtype(binding=NULL(),name=.n.test8,sizeinbytes=40_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.test8,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=0_1,noinitializationneeded=0_1,nodestructionneeded=0_1,nofinalizationneeded=0_1,nodefinedassignment=0_1) + + contains + impure elemental subroutine asst1(left, right) + class(hasAsst1), intent(out) :: left + class(hasAsst1), intent(in) :: right + end + impure elemental subroutine asst2(left, right) + class(hasAsst2), intent(out) :: left + class(hasAsst2), intent(in) :: right + end +end From flang-commits at lists.llvm.org Fri May 30 10:48:21 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 30 May 2025 10:48:21 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6839ef65.170a0220.24dec9.b94d@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852 >From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 07:58:29 -0500 Subject: [PATCH 01/24] [flang][OpenMP] Mark atomic clauses as unique The current implementation of the ATOMIC construct handles these clauses individually, and this change does not have an observable effect. At the same time these clauses are unique as per the OpenMP spec, and this patch reflects that in the OMP.td file. 
---
 llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index eff6d57995d2b..cdfd3e3223fa8 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> {
   ];
 }
 def OMP_Atomic : Directive<"atomic"> {
-  let allowedClauses = [
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-  ];
   let allowedOnceClauses = [
     VersionedClause,
     VersionedClause,
+    VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
+    VersionedClause,
   ];
   let association = AS_Block;
   let category = CA_Executable;
@@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> {
   let category = CA_Executable;
 }
 def OMP_Critical : Directive<"critical"> {
-  let allowedClauses = [
+  let allowedOnceClauses = [
     VersionedClause,
   ];
   let association = AS_Block;

>From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 10:08:58 -0500
Subject: [PATCH 02/24] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs

The OpenMP implementation of the ATOMIC construct will change in the
near future to accommodate OpenMP 6.0. This patch separates the shared
implementations to avoid interfering with OpenACC.
--- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t 
hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. - auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite, - loc); + genAtomicWrite(converter, atomicWrite, loc); }, [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) { - Fortran::lower::genOmpAccAtomicUpdate< - Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate, - loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const Fortran::parser::AccAtomicCapture &atomicCapture) { - Fortran::lower::genOmpAccAtomicCapture< - Fortran::parser::AccAtomicCapture, void>(converter, - atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, }, atomicConstruct.u); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 312557d5da07e..fdd85e94829f3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +//===----------------------------------------------------------------------===// +// Code generation for atomic operations +//===----------------------------------------------------------------------===// + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. 
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/24] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
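For readers following the commit message above, here is a minimal standalone sketch of the idea of one UPDATE clause with an optional argument. It is not flang code: `DependenceType`, `UpdateClause`, `Directive` and `checkUpdate` are hypothetical names invented for illustration, and the diagnostic strings are made up. The only assumption taken from the description is that UPDATE carries no argument on ATOMIC and a dependence-type on DEPOBJ.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration only; these are not flang's types.
enum class DependenceType { In, Out, Inout };
enum class Directive { Atomic, Depobj };

struct UpdateClause {
  // Empty for "ATOMIC UPDATE", set for "DEPOBJ UPDATE(dependence-type)".
  std::optional<DependenceType> dependenceType;
};

// Returns an empty string if the clause form fits the directive,
// otherwise a (made-up) diagnostic message.
std::string checkUpdate(Directive dir, const UpdateClause &clause) {
  if (!clause.dependenceType) {
    // The argument-less form is only expected on ATOMIC.
    return dir == Directive::Atomic
               ? std::string{}
               : std::string{"UPDATE requires an argument on this directive"};
  }
  // The form with an argument is only expected on DEPOBJ.
  return dir == Directive::Depobj
             ? std::string{}
             : std::string{"UPDATE must not have an argument on ATOMIC"};
}
```

The sketch returns a message for the mismatched cases purely for illustration; how the actual checker reports them is not shown here.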
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/24] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be a parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed.
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 |
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. 
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expexcing two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture. 
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expexcing single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point. 
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) {
+  const parser::Block *iter{&block};
+  while (iter->size() == 1) {
+    const parser::ExecutionPartConstruct &ep{iter->front()};
+    if (auto *exec{std::get_if(&ep.u)}) {
+      using BlockConstruct = common::Indirection;
+      if (auto *bc{std::get_if(&exec->u)}) {
+        iter = &std::get(bc->value().t);
+        continue;
       }
+    }
+    break;
+  }
+  return *iter;
 }
-inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch(
-    const parser::Variable &var, const parser::Expr &expr) {
-  // Err out if the symbol on the LHS is also used on the RHS of the assignment
-  // statement
-  const auto *e{GetExpr(context_, expr)};
-  const auto *v{GetExpr(context_, var)};
-  if (e && v) {
-    auto vSyms{evaluate::GetSymbolVector(*v)};
-    const Symbol &varSymbol = vSyms.front();
-    for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) {
-      if (varSymbol == symbol) {
-        const common::Indirection *designator =
-            std::get_if>(&expr.u);
-        if (designator) {
-          auto *z{var.typedExpr.get()};
-          auto *c{expr.typedExpr.get()};
-          if (z->v == c->v) {
-            context_.Say(expr.source,
-                "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US,
-                var.GetSource());
-          }
+// There is no consistent way to get the source of a given ActionStmt, so
+// extract the source information from Statement when we can,
+// and keep it around for error reporting in further analyses.
+struct SourcedActionStmt {
+  const parser::ActionStmt *stmt{nullptr};
+  parser::CharBlock source;
+
+  operator bool() const { return stmt != nullptr; }
+};
+
+struct AnalyzedCondStmt {
+  SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted
+  parser::CharBlock source;
+  SourcedActionStmt ift, iff;
+};
+
+static SourcedActionStmt GetActionStmt(
+    const parser::ExecutionPartConstruct *x) {
+  if (x == nullptr) {
+    return SourcedActionStmt{};
+  }
+  if (auto *exec{std::get_if(&x->u)}) {
+    using ActionStmt = parser::Statement;
+    if (auto *stmt{std::get_if(&exec->u)}) {
+      return SourcedActionStmt{&stmt->statement, stmt->source};
+    }
+  }
+  return SourcedActionStmt{};
+}
+
+static SourcedActionStmt GetActionStmt(const parser::Block &block) {
+  if (block.size() == 1) {
+    return GetActionStmt(&block.front());
+  }
+  return SourcedActionStmt{};
+}
+
+// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption
+// is that the ActionStmt will be either an assignment or a pointer-assignment,
+// otherwise return std::nullopt.
+static std::optional GetEvaluateAssignment(
+    const parser::ActionStmt *x) {
+  if (x == nullptr) {
+    return std::nullopt;
+  }
+
+  using AssignmentStmt = common::Indirection;
+  using PointerAssignmentStmt =
+      common::Indirection;
+  using TypedAssignment = parser::AssignmentStmt::TypedAssignment;
+
+  return common::visit(
+      [](auto &&s) -> std::optional {
+        using BareS = llvm::remove_cvref_t;
+        if constexpr (std::is_same_v ||
+            std::is_same_v) {
+          const TypedAssignment &typed{s.value().typedAssignment};
+          // ForwardOwningPointer typedAssignment
+          // `- GenericAssignmentWrapper ^.get()
+          // `- std::optional ^->v
+          return typed.get()->v;
         } else {
-          context_.Say(expr.source,
-              "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US,
-              var.GetSource());
+          return std::nullopt;
         }
+      },
+      x->u);
+}
+
+static std::optional AnalyzeConditionalStmt(
+    const parser::ExecutionPartConstruct *x) {
+  if (x == nullptr) {
+    return std::nullopt;
+  }
+
+  // Extract the evaluate::Expr from ScalarLogicalExpr.
+  auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) {
+    // ScalarLogicalExpr is Scalar>>
+    const parser::Expr &expr{logical.thing.thing.value()};
+    return GetEvaluateExpr(expr);
+  }};
+
+  // Recognize either
+  // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or
+  // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct.
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+  auto trim{[](std::string_view v) {
+    const char *begin{v.data()};
+    const char *end{begin + v.size()};
+    while (*begin == ' ' && begin != end) {
+      ++begin;
+    }
+    while (begin != end && end[-1] == ' ') {
+      --end;
+    }
+    assert(begin != end && "Source should not be empty");
+    return parser::CharBlock(begin, end - begin);
+  }};
+
+  std::string_view sv(source.begin(), source.size());
+
+  if (auto where{sv.find("=>")}; where != sv.npos) {
+    std::string_view lhs(sv.data(), where);
+    std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2);
+    return std::make_pair(trim(lhs), trim(rhs));
+  }
+
+  // Go backwards, since all the exclusions above end with a '='.
+  for (size_t next{source.size()}; next > 1; --next) {
+    if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) {
+      std::string_view lhs(sv.data(), next - 1);
+      std::string_view rhs(sv.data() + next, sv.size() - next);
+      return std::make_pair(trim(lhs), trim(rhs));
+    }
+  }
+  llvm_unreachable("Could not find assignment operator");
+}
+
+namespace atomic {
+
+enum class Operator {
+  Unk,
+  // Operators that are officially allowed in the update operation
+  Add,
+  And,
+  Associated,
+  Div,
+  Eq,
+  Eqv,
+  Ge, // extension
+  Gt,
+  Identity, // extension: x = x is allowed (*), but we should never print
+            // "identity" as the name of the operator
+  Le, // extension
+  Lt,
+  Max,
+  Min,
+  Mul,
+  Ne, // extension
+  Neqv,
+  Or,
+  Sub,
+  // Operators that we recognize for technical reasons
+  True,
+  False,
+  Not,
+  Convert,
+  Resize,
+  Intrinsic,
+  Call,
+  Pow,
+
+  // (*): "x = x + 0" is a valid update statement, but it will be folded
+  // to "x = x" by the time we look at it. Since the source statements
+  // "x = x" and "x = x + 0" will end up looking the same, accept the
+  // former as an extension.
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ...)" is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+ auto &dirSpec{std::get(x.t)}; + auto &clauseList{std::get>(dirSpec.t)}; if (clauseList) { - atomicDirectiveDefaultOrderFound_ = true; - switch (*defaultMemOrder) { - case common::OmpMemoryOrderType::Acq_Rel: - clauseList->emplace_back(common::visit( - common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause { - return parser::OmpClause::Acquire{}; - }, - [](parser::OmpAtomicCapture &) -> parser::OmpClause { - return parser::OmpClause::AcqRel{}; - }, - [](auto &) -> parser::OmpClause { - // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite} - return parser::OmpClause::Release{}; - }}, - x.u)); - break; - case common::OmpMemoryOrderType::Relaxed: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::Relaxed{}}); - break; - case common::OmpMemoryOrderType::Seq_Cst: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::SeqCst{}}); - break; - default: - // FIXME: Don't process other values at the moment since their validity - // depends on the OpenMP version (which is unavailable here). - break; + if (findMemOrderClause(*clauseList)) { + return false; } + } else { + clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){}); + } + + // Add a memory order clause to the atomic directive. + atomicDirectiveDefaultOrderFound_ = true; + switch (*defaultMemOrder) { + case common::OmpMemoryOrderType::Acq_Rel: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}}); + break; + case common::OmpMemoryOrderType::Relaxed: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}}); + break; + case common::OmpMemoryOrderType::Seq_Cst: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}}); + break; + default: + // FIXME: Don't process other values at the moment since their validity + // depends on the OpenMP version (which is unavailable here). 
+ break; } return false; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 index b82bd13622764..6f58e0939a787 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 index 88ec6fe910b9e..6729be6e5cf8b 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b8422c8f3b769 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture() !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr !CHECK: omp.atomic.capture { !CHECK: omp.atomic.update 
%[[VAL_A_BOX_ADDR]] : !fir.ptr { !CHECK: ^bb0(%[[ARG:.*]]: i32): !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32 !CHECK: omp.yield(%[[TEMP]] : i32) !CHECK: } -!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 +!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 !CHECK: } !CHECK: return !CHECK: } diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90 index 13392ad76471f..6eded49b0b15d 100644 --- a/flang/test/Lower/OpenMP/atomic-write.f90 +++ b/flang/test/Lower/OpenMP/atomic-write.f90 @@ -44,9 +44,9 @@ end program OmpAtomicWrite !CHECK-LABEL: func.func @_QPatomic_write_pointer() { !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"} !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) -!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32 !CHECK: %[[C2:.*]] = arith.constant 2 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90 deleted file mode 100644 index 5cd02698ff482..0000000000000 --- a/flang/test/Parser/OpenMP/atomic-compare.f90 +++ /dev/null @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s -! OpenMP version for documentation purposes only - it isn't used until Sema. -! This is testing for Parser errors that bail out before Sema. 
-program main - implicit none - integer :: i, j = 10 - logical :: r - - !CHECK: error: expected OpenMP construct - !$omp atomic compare write - r = i .eq. j + 1 - - !CHECK: error: expected end of line - !$omp atomic compare num_threads(4) - r = i .eq. j -end program main diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90 index 54492bf6a22a6..11e23e062bce7 100644 --- a/flang/test/Semantics/OpenMP/atomic-compare.f90 +++ b/flang/test/Semantics/OpenMP/atomic-compare.f90 @@ -44,46 +44,37 @@ !$omp end atomic ! Check for error conditions: - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic compare seq_cst seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst compare seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire compare if (b .eq. 
c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic compare acquire acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire compare acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic compare relaxed relaxed if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed compare relaxed if (b .eq. c) b = a - !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one FAIL clause can appear on the ATOMIC directive !$omp atomic fail(release) compare fail(release) if (c .eq. 
a) a = b !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index c13a11a8dd5dc..deb67e7614659 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -16,20 +16,21 @@ program sample !$omp atomic read hint(2) y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(3) y = y + 10 !$omp atomic update hint(5) y = x + y - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement y = x x = y !$omp end atomic - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic update hint(x) y = y * 1 @@ -46,7 +47,7 @@ program sample !$omp atomic hint(omp_lock_hint_speculative) x = y + x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read y = x @@ -69,36 +70,36 @@ program sample !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative) x = y + x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = y * 9 - !ERROR: Hint clause must have non-negative 
constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp atomic hint(1.0) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp atomic hint(z + omp_sync_hint_nonspeculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(k + omp_sync_hint_speculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write x = 10 * y !$omp atomic write hint(a) - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and y+x access the same storage x = y + x !$omp atomic hint(abs(-1)) write diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 new file mode 100644 index 0000000000000..2619d235380f8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -0,0 +1,89 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC READ. Expect no diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v = x + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v => x +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! 
Atomic variable can be a function reference. Expect no diagnostics. + !$omp atomic read + v = p(i) +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic read + !ERROR: Atomic variable x should be a scalar + v = x + + !$omp atomic read + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + v = y(2:4) +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic read + !ERROR: Within atomic operation x and x access the same storage + x = x + + ! Accessing same array, but not the same storage. Expect no diagnostics. + !$omp atomic read + y(1) = y(2) +end + +subroutine f05 + integer :: x, v + + !$omp atomic read + !ERROR: Atomic expression x+1_4 should be a variable + v = x + 1 +end + +subroutine f06 + character :: x, v + + !$omp atomic read + !ERROR: Atomic variable x cannot have CHARACTER type + v = x +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic read + !ERROR: Atomic variable x cannot be ALLOCATABLE + v = x +end + diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 new file mode 100644 index 0000000000000..64e31a7974b55 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -0,0 +1,77 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !$omp atomic update capture + x = v + x = x + 1 + y = x + !$omp end atomic +end + +subroutine f01 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments + !$omp atomic update capture + x = v + block + x = x + 1 + y = x + end block + !$omp end atomic +end + +subroutine f02 + integer :: x, y + + ! The update and capture statements can be inside of a single BLOCK. + ! The end-directive is then optional. Expect no diagnostics. 
+ !$omp atomic update capture + block + x = x + 1 + y = x + end block +end + +subroutine f03 + integer :: x + + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !$omp atomic update capture + x = x + 1 + x = x + 2 + !$omp end atomic +end + +subroutine f04 + integer :: x, v + + !$omp atomic update capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + v = x + x = v + !$omp end atomic +end + +subroutine f05 + integer :: x, v, z + + !$omp atomic update capture + v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + !$omp end atomic +end + +subroutine f06 + integer :: x, v, z + + !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + v = x + !$omp end atomic +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 new file mode 100644 index 0000000000000..0aee02d429dc0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -0,0 +1,83 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y + + ! The x is a direct argument of the + operator. Expect no diagnostics. + !$omp atomic update + x = x + (y - 1) +end + +subroutine f01 + integer :: x + + ! x + 0 is unusual, but legal. Expect no diagnostics. + !$omp atomic update + x = x + 0 +end + +subroutine f02 + integer :: x + + ! This is formally not allowed by the syntax restrictions of the spec, + ! but it's equivalent to either x+0 or x*1, both of which are legal. + ! Allow this case. Expect no diagnostics. 
+ !$omp atomic update + x = x +end + +subroutine f03 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator + x = (x + y) + 1 +end + +subroutine f04 + integer :: x + real :: y + + !$omp atomic update + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation + x = floor(x + y) +end + +subroutine f05 + integer :: x + real :: y + + !$omp atomic update + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + x = int(x + y) +end + +subroutine f06 + integer :: x, y + interface + function f(i, j) + integer :: f, i, j + end + end interface + + !$omp atomic update + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation + x = f(x, y) +end + +subroutine f07 + real :: x + integer :: y + + !$omp atomic update + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation + x = x ** y +end + +subroutine f08 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should appear as an argument in the update operation + x = y +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 index 21a9b87d26345..3084376b4275d 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 @@ -22,10 +22,10 @@ program sample x = x / y !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. y !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. 
y end program diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 new file mode 100644 index 0000000000000..979568ebf7140 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -0,0 +1,81 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics. + !$omp atomic write + x = v + 1 + + !$omp atomic write + x = v + 3 + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic write + x = 2 * v + 3 + + !$omp atomic write + x => v +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! Atomic variable can be a function reference. Expect no diagnostics. + !$omp atomic write + p(i) = v +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic write + !ERROR: Atomic variable x should be a scalar + x = v + + !$omp atomic write + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + y(2:4) = v +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic write + !ERROR: Within atomic operation x and x+1_4 access the same storage + x = x + 1 + + ! Accessing same array, but not the same storage. Expect no diagnostics. 
+ !$omp atomic write + y(1) = y(2) +end + +subroutine f06 + character :: x, v + + !$omp atomic write + !ERROR: Atomic variable x cannot have CHARACTER type + x = v +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic write + !ERROR: Atomic variable x cannot be ALLOCATABLE + x = v +end + diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0e100871ea9b4..0dc8251ae356e 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! Check OpenMP 2.13.6 atomic Construct @@ -11,9 +11,13 @@ a = b !$omp end atomic + !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED) a = b + !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write a = b @@ -22,39 +26,32 @@ a = a + 1 !$omp end atomic + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic hint(1) acq_rel capture b = a a = a + 1 !$omp end atomic - !ERROR: expected end of line + !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct !$omp atomic read write + !ERROR: Atomic expression a+1._4 should be a variable a = a + 1 !$omp atomic a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 
'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic num_threads(4) a = a + 1 - !ERROR: expected end of line + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic capture num_threads(4) a = a + 1 + !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic relaxed a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' - !$omp atomic num_threads write - a = a + 1 - !$omp end parallel end diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90 index 173effe86b69c..f700c381cadd0 100644 --- a/flang/test/Semantics/OpenMP/atomic01.f90 +++ b/flang/test/Semantics/OpenMP/atomic01.f90 @@ -14,322 +14,277 @@ ! At most one memory-order-clause may appear on the construct. 
!READ - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic read seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst read seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic read acquire acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire read acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the 
READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic read relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed read relaxed i = j !UPDATE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic update seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst update seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The 
atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic update release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release update release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic update relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic 
relaxed update relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !CAPTURE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic capture seq_cst seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst capture seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic capture release release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release capture release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not 
allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic capture relaxed relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed capture relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel acq_rel capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic capture acq_rel acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel capture acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire capture i = j j = k !$omp 
end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic capture acquire acquire i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire capture acquire i = j j = k !$omp end atomic !WRITE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic write seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst write seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic write release release i = j - !ERROR: More than one memory order clause not allowed on 
OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release write release i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic write relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed write relaxed i = j !No atomic-clause - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as 
an argument in the update operation i = j ! 2.17.7.3 ! At most one hint clause may appear on the construct. - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At 
most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint 
(omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative) i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j j = k @@ -337,34 +292,26 @@ ! 2.17.7.4 ! If atomic-clause is read then memory-order-clause must not be acq_rel or release. - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic acq_rel read i = j - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read acq_rel i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic release read i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read release i = j ! 2.17.7.5 ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire. 
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acq_rel write i = j - !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acq_rel i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acquire write i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acquire i = j @@ -372,33 +319,27 @@ ! 2.17.7.6 ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire. - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acq_rel update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acquire update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive !$omp atomic acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed on the 
ATOMIC directive !$omp atomic acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j end program diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90 index c66085d00f157..45e41f2552965 100644 --- a/flang/test/Semantics/OpenMP/atomic02.f90 +++ b/flang/test/Semantics/OpenMP/atomic02.f90 @@ -28,36 +28,29 @@ program OmpAtomic !$omp atomic a = a/(b + 1) !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: The atomic variable c should appear as an argument in the update operation c = d !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. 
b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The /= operator is not a valid ATOMIC UPDATE operation l = a .NE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic m = m .AND. n @@ -76,32 +69,26 @@ program OmpAtomic !$omp atomic update a = a/(b + 1) !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: This is not a valid ATOMIC UPDATE operation c = c//d !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. 
b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic update m = m .AND. n diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index 76367495b9861..f5c189fd05318 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -25,28 +25,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur 
exactly once among the arguments of the top-level MAX operator z = MAX(y, 7, b, c) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8, a, d) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -60,26 +58,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX 
operator z = MAX(y, 7) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = MOD(y, 9) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation x = ABS(x) end program OmpAtomic @@ -92,7 +90,7 @@ subroutine conflicting_types() type(simple) ::s z = 1 !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(s%z, 4) end subroutine @@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) :: s !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator a = min(a, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a) !$omp atomic - !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)` a = min(b, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a, b) 
!$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator y = min(z, x) !$omp atomic z = max(z, y) !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k' + !ERROR: Atomic variable k should be a scalar + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation k = max(x, y) - + !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement x = min(x, k) !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - z =z + s%m + z = z + s%m end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index a9644ad95aa30..5c91ab5dc37e4 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -20,12 +20,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x 
= 1 + y !$omp atomic @@ -33,12 +31,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic @@ -46,12 +42,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic @@ -59,12 +53,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or 
explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic @@ -72,8 +64,7 @@ program OmpAtomic !$omp atomic m = n .AND. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic @@ -81,8 +72,7 @@ program OmpAtomic !$omp atomic m = n .OR. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic @@ -90,8 +80,7 @@ program OmpAtomic !$omp atomic m = n .EQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic @@ -99,8 +88,7 @@ program OmpAtomic !$omp atomic m = n .NEQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. 
l !$omp atomic update @@ -108,12 +96,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 + y !$omp atomic update @@ -121,12 +107,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic update @@ -134,12 +118,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' 
expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic update @@ -147,12 +129,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic update @@ -160,8 +140,7 @@ program OmpAtomic !$omp atomic update m = n .AND. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic update @@ -169,8 +148,7 @@ program OmpAtomic !$omp atomic update m = n .OR. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic update @@ -178,8 +156,7 @@ program OmpAtomic !$omp atomic update m = n .EQV. 
m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic update @@ -187,8 +164,7 @@ program OmpAtomic !$omp atomic update m = n .NEQV. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. l end program OmpAtomic @@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) p !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement x = x !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 1 !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and a*b access the same storage a = a * b + a !$omp atomic - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator a = b * (a + 9) !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (a+b) access the same storage a = a * (a + b) !$omp atomic - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (b+a) access the same storage a = (b + a) * a !$omp atomic - !ERROR: Atomic update statement should 
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(7) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(x) y = 2 @@ -54,7 +54,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint) y = 2 @@ -84,35 +84,35 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended) y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp critical (name) hint(1.0) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp critical (name) hint(z + omp_sync_hint_nonspeculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(k + omp_sync_hint_speculative) y = 2 !$omp end critical 
(name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended) y = 2 diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 505cbc48fef90..677b933932b44 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -20,70 +20,64 @@ program sample !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar v = y(1:3) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression x*(10_4+x) should be a variable v = x * (10 + x) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression 4_4 should be a variable v = 4 !$omp atomic read - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE v = k !$omp atomic write - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = x !$omp atomic update - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = k + x * (v * x) !$omp atomic - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = v * k !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'z%y' + !ERROR: Within atomic operation z%y and x+z%y access the same storage z%y = x + z%y !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + 
!ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage m = min(m, x, z%m) + k !$omp atomic read - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Atomic expression min(m,x,z%m)+k should be a variable m = min(m, x, z%m) + k !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar x = a - !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - a = x - !$omp atomic write !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement x = a !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar a = x !$omp atomic capture @@ -93,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = b + (x*1) !$omp end atomic @@ -103,60 +97,58 @@ program sample !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with 
CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct x = x + 10 v = b !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt] v = 1 x = 4 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct x = z%y + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 x = z%y !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct x = y(2) + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 x = y(2) !$omp end atomic !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable r cannot have CHARACTER type l = r !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable l cannot have CHARACTER type l = r end program diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90 index ae9fd086015dd..e8817c3f5ef61 100644 --- 
a/flang/test/Semantics/OpenMP/requires-atomic01.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> SeqCst !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> SeqCst !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> SeqCst !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> SeqCst !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> SeqCst !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90 index 4976a9667eb78..a3724a83456fd 100644 --- a/flang/test/Semantics/OpenMP/requires-atomic02.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> AcqRel !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! 
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> AcqRel !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> AcqRel !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! 
CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> AcqRel !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> AcqRel !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Capture + ! 
CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 >From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:41:16 -0500 Subject: [PATCH 05/24] Restart build >From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:45:28 -0500 Subject: [PATCH 06/24] Restart build >From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 07:54:54 -0500 Subject: [PATCH 07/24] Restart build >From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 08:18:01 -0500 Subject: [PATCH 08/24] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed --- flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +- flang/test/Semantics/OpenMP/atomic.f90 | 2 ++ 5 files changed, 6 insertions(+), 4 deletions(-) diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 index 2619d235380f8..73899a9ff37f2 100644 --- a/flang/test/Semantics/OpenMP/atomic-read.f90 +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index 64e31a7974b55..c427ba07d43d8 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python 
%S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 0aee02d429dc0..4595e02d01456 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 index 979568ebf7140..7965ad2dc7dbf 100644 --- a/flang/test/Semantics/OpenMP/atomic-write.f90 +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0dc8251ae356e..2caa161507d49 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,3 +1,5 @@ +! REQUIRES: openmp_runtime + ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! 
Check OpenMP 2.13.6 atomic Construct >From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 09:06:20 -0500 Subject: [PATCH 09/24] Fix examples --- flang/examples/FeatureList/FeatureList.cpp | 10 ------- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++-------------- 2 files changed, 7 insertions(+), 29 deletions(-) diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp index d1407cf0ef239..a36b8719e365d 100644 --- a/flang/examples/FeatureList/FeatureList.cpp +++ b/flang/examples/FeatureList/FeatureList.cpp @@ -445,13 +445,6 @@ struct NodeVisitor { READ_FEATURE(ObjectDecl) READ_FEATURE(OldParameterStmt) READ_FEATURE(OmpAlignedClause) - READ_FEATURE(OmpAtomic) - READ_FEATURE(OmpAtomicCapture) - READ_FEATURE(OmpAtomicCapture::Stmt1) - READ_FEATURE(OmpAtomicCapture::Stmt2) - READ_FEATURE(OmpAtomicRead) - READ_FEATURE(OmpAtomicUpdate) - READ_FEATURE(OmpAtomicWrite) READ_FEATURE(OmpBeginBlockDirective) READ_FEATURE(OmpBeginLoopDirective) READ_FEATURE(OmpBeginSectionsDirective) @@ -480,7 +473,6 @@ struct NodeVisitor { READ_FEATURE(OmpIterationOffset) READ_FEATURE(OmpIterationVector) READ_FEATURE(OmpEndAllocators) - READ_FEATURE(OmpEndAtomic) READ_FEATURE(OmpEndBlockDirective) READ_FEATURE(OmpEndCriticalDirective) READ_FEATURE(OmpEndLoopDirective) @@ -566,8 +558,6 @@ struct NodeVisitor { READ_FEATURE(OpenMPDeclareTargetConstruct) READ_FEATURE(OmpMemoryOrderType) READ_FEATURE(OmpMemoryOrderClause) - READ_FEATURE(OmpAtomicClause) - READ_FEATURE(OmpAtomicClauseList) READ_FEATURE(OmpAtomicDefaultMemOrderClause) READ_FEATURE(OpenMPFlushConstruct) READ_FEATURE(OpenMPLoopConstruct) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index dbbf86a6c6151..18a4107f9a9d1 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ 
b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) { // the directive field. [&](const auto &c) -> SourcePosition { const CharBlock &source{std::get<0>(c.t).source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPAtomicConstruct &c) -> SourcePosition { - return std::visit( - [&](const auto &o) -> SourcePosition { - const CharBlock &source{std::get(o.t).source}; - return parsing->allCooked() - .GetSourcePositionRange(source) - ->first; - }, - c.u); + const CharBlock &source{c.source}; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPSectionConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPUtilityConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, }, c.u); @@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - return std::visit( - [&](const auto &c) { - // Get source from the verbatim fields - const CharBlock &source{std::get(c.t).source}; - return "atomic-" + - normalize_construct_name(source.ToString()); - }, - c.u); + const CharBlock &source{c.source}; + return normalize_construct_name(source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; >From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:19 -0500 Subject: 
[PATCH 10/24] Fix example --- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++-- flang/test/Examples/omp-atomic.f90 | 16 +++++++++++----- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index 18a4107f9a9d1..b0964845d3b99 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - const CharBlock &source{c.source}; - return normalize_construct_name(source.ToString()); + auto &dirSpec = std::get(c.t); + auto &dirName = std::get(dirSpec.t); + return normalize_construct_name(dirName.source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90 index dcca34b633a3e..934f84f132484 100644 --- a/flang/test/Examples/omp-atomic.f90 +++ b/flang/test/Examples/omp-atomic.f90 @@ -26,25 +26,31 @@ ! CHECK:--- ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 9 -! CHECK-NEXT: construct: atomic-read +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: -! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: - clause: read ! CHECK-NEXT: details: '' +! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 12 -! CHECK-NEXT: construct: atomic-write +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: ! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' +! CHECK-NEXT: - clause: write ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 16 -! CHECK-NEXT: construct: atomic-capture +! 
CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: +! CHECK-NEXT: - clause: capture +! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;' ! CHECK-NEXT: - clause: seq_cst ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 21 -! CHECK-NEXT: construct: atomic-atomic +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: [] ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 8 >From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:47 -0500 Subject: [PATCH 11/24] Remove reference to *nullptr --- flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ab868df76d298..4d9b375553c1e 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); - mlir::Operation *secondOp = genAtomicOperation( - converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, - *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + mlir::Operation *secondOp = nullptr; + if (analysis.op1.what != analysis.None) { + secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, + atomAddr, atom, *get(analysis.op1.expr), + hint, memOrder, atomicAt, prepareAt); + } if (secondOp) { builder.setInsertionPointAfter(secondOp); >From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 2 May 2025 18:49:05 -0500 Subject: [PATCH 12/24] Updates and improvements --- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++-- flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++---- 
flang/lib/Semantics/check-omp-structure.h | 1 + .../Todo/atomic-capture-implicit-cast.f90 | 48 --- .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 - .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +- .../OpenMP/atomic-update-capture.f90 | 8 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +- 9 files changed, 381 insertions(+), 180 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 77f57b1cb85c7..8213fe33edbd0 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct { struct Op { int what; - TypedExpr expr; + AssignmentStmt::TypedAssignment assign; }; TypedExpr atom, cond; Op op0, op1; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6177b59199481..7b6c22095d723 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter, static mlir::Operation * // genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Value storeAddr = + 
fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Type storeType = fir::unwrapRefType(storeAddr.getType()); + + mlir::Value toAddr = [&]() { + if (atomType == storeType) + return storeAddr; + return builder.createTemporary(loc, atomType, ".tmp.atomval"); + }(); builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + + if (atomType != storeType) { + lower::ExprToValueMap overrides; + // The READ operation could be a part of UPDATE CAPTURE, so make sure + // we don't emit extra code into the body of the atomic op. + builder.restoreInsertionPoint(postAt); + mlir::Value load = builder.create(loc, toAddr); + overrides.try_emplace(&atom, load); + + converter.overrideExprValues(&overrides); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); + converter.resetExprOverrides(); + + builder.create(loc, value, storeAddr); + } builder.restoreInsertionPoint(saved); return op; } @@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, static mlir::Operation * // genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value value = 
fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); mlir::Value converted = builder.createConvert(loc, atomType, value); @@ -2719,19 +2746,20 @@ static mlir::Operation * genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrResizeOf(arg, atom)) { @@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, converter.overrideExprValues(&overrides); mlir::Value updated = - fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Value converted = builder.createConvert(loc, atomType, updated); builder.create(loc, converted); converter.resetExprOverrides(); @@ -2764,20 +2792,21 @@ static mlir::Operation * genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, 
lower::StatementContext &stmtCtx, int action, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: - return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Write: - return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Update: - return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign, + hint, memOrder, preAt, atomicAt, postAt); default: return nullptr; } @@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { } return ""s; }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; const SomeExpr &atom = *analysis.atom.get()->v; @@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; llvm::errs() << " op0 {\n"; llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; - 
llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << " op1 {\n"; llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; - llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << "}\n"; } @@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAtomicConstruct &construct) { - auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { - if (auto *maybe = expr.get(); maybe && maybe->v) { + auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) { + if (auto *maybe = typedWrapper.get(); maybe && maybe->v) { return &*maybe->v; } else { return nullptr; @@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, int action0 = analysis.op0.what & analysis.Action; int action1 = analysis.op1.what & analysis.Action; mlir::Operation *captureOp = nullptr; - fir::FirOpBuilder::InsertPoint atomicAt; - fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint atomicAt, postAt; if (construct.IsCapture()) { // Capturing operation. @@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, captureOp = builder.create(loc, hint, memOrder); // Set the non-atomic insertion point to before the atomic.capture. 
- prepareAt = getInsertionPointBefore(captureOp); + preAt = getInsertionPointBefore(captureOp); mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); builder.setInsertionPointToEnd(block); @@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, // atomic.capture. mlir::Operation *term = builder.create(loc); atomicAt = getInsertionPointBefore(term); + postAt = getInsertionPointAfter(captureOp); hint = nullptr; memOrder = nullptr; } else { @@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(action0 != analysis.None && action1 == analysis.None && "Expexcing single action"); assert(!(analysis.op0.what & analysis.Condition)); - atomicAt = prepareAt; + postAt = atomicAt = preAt; } mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, - *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); mlir::Operation *secondOp = nullptr; if (analysis.op1.what != analysis.None) { secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, - atomAddr, atom, *get(analysis.op1.expr), - hint, memOrder, atomicAt, prepareAt); + atomAddr, atom, *get(analysis.op1.assign), + hint, memOrder, preAt, atomicAt, postAt); } if (secondOp) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 201b38bd05ff3..f7753a5e5cc59 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } -static bool IsVarOrFunctionRef(const SomeExpr &expr) { - return evaluate::UnwrapProcedureRef(expr) != nullptr || - evaluate::IsVariable(expr); +static bool 
IsVarOrFunctionRef(const MaybeExpr &expr) { + if (expr) { + return evaluate::UnwrapProcedureRef(*expr) != nullptr || + evaluate::IsVariable(*expr); + } else { + return false; + } } static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { @@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource( namespace atomic { +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + enum class Operator { Unk, // Operators that are officially allowed in the update operation @@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (Append(v, std::move(results)), ...); + (MoveAppend(v, std::move(results)), ...); return v; } +}; -private: - static void Append(Result &acc, Result &&data) { - for (auto &&s : data) { - acc.push_back(std::move(s)); +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). 
+ return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + auto copy{x.derived()}; + return {evaluate::AsGenericExpr(std::move(copy)), {}}; } } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; }; } // namespace atomic @@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } +static MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { // Both expr and x have the form of SomeType(SomeKind(...)[1]). // Check if expr is @@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { } } +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { if (value) { expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), @@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { } } +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( - int what, const MaybeExpr &maybeExpr = std::nullopt) { + int what, + const std::optional &maybeAssign = std::nullopt) { parser::OpenMPAtomicConstruct::Analysis::Op operation; operation.what = what; - SetExpr(operation.expr, maybeExpr); + SetAssignment(operation.assign, maybeAssign); return operation; } @@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( // }; // struct Op { // int what; - // TypedExpr expr; + // TypedAssignment assign; // }; // TypedExpr atom, cond; // Op op0, op1; @@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, } } +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + /// Check 
if `expr` satisfies the following conditions for x and v: /// /// [6.0:189:10-12] @@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture( // // The two allowed cases are: // x = ... atomic-var = ... - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // or - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // x = ... atomic-var = ... // // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture @@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture( // // If the two statements don't fit these criteria, return a pair of default- // constructed values. + using ReturnTy = std::pair; SourcedActionStmt act1{GetActionStmt(ec1)}; SourcedActionStmt act2{GetActionStmt(ec2)}; @@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture( auto isUpdateCapture{ [](const evaluate::Assignment &u, const evaluate::Assignment &c) { - return u.lhs == c.rhs; + return IsSameOrConvertOf(c.rhs, u.lhs); }}; // Do some checks that narrow down the possible choices for the update // and the capture statements. This will help to emit better diagnostics. - bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); - bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; + + auto errorCaptureShouldRead{[&](const parser::CharBlock &source, + const std::string &expr) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US, + expr); + }}; - if (couldBeCapture1) { - if (couldBeCapture2) { - if (isUpdateCapture(as2, as1)) { - if (isUpdateCapture(as1, as2)) { - // If both statements could be captures and both could be updates, - // emit a warning about the ambiguity. - context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); - } - return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); - } else if (isUpdateCapture(as1, as2)) { + auto errorNeitherWorks{[&]() { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US); + }}; + + auto makeSelectionFromDet{[&](int det) -> ReturnTy { + // If det != 0, then the checks unambiguously suggest a specific + // categorization. + // If det == 0, then this function should be called only if the + // checks haven't ruled out any possibility, i.e. when both assignments + // could still be either updates or captures. 
+ if (det > 0) { + // as1 is update, as2 is capture + if (isUpdateCapture(as1, as2)) { return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - context_.Say(source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, - as1.rhs.AsFortran(), as2.rhs.AsFortran()); + errorCaptureShouldRead(act2.source, as1.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } else { // !couldBeCapture2 + } else if (det < 0) { + // as2 is update, as1 is capture if (isUpdateCapture(as2, as1)) { return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } else { - context_.Say(act2.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as1.rhs.AsFortran()); + errorCaptureShouldRead(act1.source, as2.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } - } else { // !couldBeCapture1 - if (couldBeCapture2) { - if (isUpdateCapture(as1, as2)) { - return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); - } else { + } else { + bool updateFirst{isUpdateCapture(as1, as2)}; + bool captureFirst{isUpdateCapture(as2, as1)}; + if (updateFirst && captureFirst) { + // If both assignment could be the update and both could be the + // capture, emit a warning about the ambiguity. context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as2.rhs.AsFortran()); + "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US); + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } - } else { - context_.Say(source, - "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + if (updateFirst != captureFirst) { + const parser::ExecutionPartConstruct *upd{updateFirst ? 
ec1 : ec2}; + const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2}; + return std::make_pair(upd, cap); + } + assert(!updateFirst && !captureFirst); + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); } + }}; + + if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) { + return makeSelectionFromDet(det); } + assert(det == 0 && "Prior checks should have covered det != 0"); - return std::make_pair(nullptr, nullptr); + // If neither of the statements is an RMW update, it could still be a + // "write" update. Pretty much any assignment can be a write update, so + // recompute det with cbu1 = cbu2 = true. + if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) { + return makeSelectionFromDet(writeDet); + } + + // It's only errors from here on. + + if (!cbu1 && !cbu2 && !cbc1 && !cbc2) { + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); + } + + // The remaining cases are that + // - no candidate for update, or for capture, + // - one of the assignments cannot be anything. + + if (!cbu1 && !cbu2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US); + return std::make_pair(nullptr, nullptr); + } else if (!cbc1 && !cbc2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) { + auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source; + context_.Say(src, + "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + // All cases should have been covered. 
+ llvm_unreachable("Unchecked condition"); } void OmpStructureChecker::CheckAtomicCaptureAssignment( const evaluate::Assignment &capture, const SomeExpr &atom, parser::CharBlock source) { - const SomeExpr &cap{capture.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &cap{capture.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, rsrc); - // This part should have been checked prior to callig this function. - assert(capture.rhs == atom && "This canont be a capture assignment"); + // This part should have been checked prior to calling this function. + assert(*GetConvertInput(capture.rhs) == atom && + "This canont be a capture assignment"); CheckStorageOverlap(atom, {cap}, source); } } void OmpStructureChecker::CheckAtomicReadAssignment( const evaluate::Assignment &read, parser::CharBlock source) { - const SomeExpr &atom{read.rhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; - if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + if (auto maybe{GetConvertInput(read.rhs)}) { + const SomeExpr &atom{*maybe}; + + if (!IsVarOrFunctionRef(atom)) { + ErrorShouldBeVariable(atom, rsrc); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } } else { - CheckAtomicVariable(atom, rsrc); - CheckStorageOverlap(atom, {read.lhs}, source); + ErrorShouldBeVariable(read.rhs, rsrc); } } @@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment( // one of the following forms: // x = expr // x => expr - const SomeExpr &atom{write.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{write.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + 
ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, lsrc); CheckStorageOverlap(atom, {write.rhs}, source); @@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( // x = intrinsic-procedure-name (x) // x = intrinsic-procedure-name (x, expr-list) // x = intrinsic-procedure-name (expr-list, x) - const SomeExpr &atom{update.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{update.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); // Skip other checks. return; } @@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( const SomeExpr &cond, parser::CharBlock condSource, const evaluate::Assignment &assign, parser::CharBlock assignSource) { - const SomeExpr &atom{assign.lhs}; auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + const SomeExpr &atom{assign.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, arsrc); // Skip other checks. 
return; } @@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( @@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign), MakeAtomicAnalysisOp(Analysis::None)); } @@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( if (GetActionStmt(&body.front()).stmt == uact.stmt) { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(action, update.rhs), - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + MakeAtomicAnalysisOp(action, update), + MakeAtomicAnalysisOp(Analysis::Read, capture)); } else { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), - MakeAtomicAnalysisOp(action, update.rhs)); + MakeAtomicAnalysisOp(Analysis::Read, capture), + MakeAtomicAnalysisOp(action, update)); } } @@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( if (captureFirst) { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign)); } else { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, 
capAssign.lhs)); + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign)); } } @@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead( if (body.size() == 1) { SourcedActionStmt action{GetActionStmt(&body.front())}; if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { - const SomeExpr &atom{maybeRead->rhs}; CheckAtomicReadAssignment(*maybeRead, action.source); - using Analysis = parser::OpenMPAtomicConstruct::Analysis; - x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), - MakeAtomicAnalysisOp(Analysis::None)); + if (auto maybe{GetConvertInput(maybeRead->rhs)}) { + const SomeExpr &atom{*maybe}; + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead), + MakeAtomicAnalysisOp(Analysis::None)); + } } else { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); @@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index bf6fbf16d0646..835fbe45e1c0e 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -253,6 +253,7 @@ class OmpStructureChecker void CheckStorageOverlap(const evaluate::Expr &, llvm::ArrayRef>, parser::CharBlock); + void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source); void CheckAtomicVariable( const evaluate::Expr &, parser::CharBlock); std::pair&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture 
requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..aa9d2e0ac3ff7 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -1,5 +1,3 @@ -! REQUIRES : openmp_runtime - ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s ! 
CHECK: func.func @_QPatomic_implicit_cast_read() { diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index deb67e7614659..8adb0f1a67409 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -25,7 +25,7 @@ program sample !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement y = x x = y !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index c427ba07d43d8..f808ed916fb7e 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -39,7 +39,7 @@ subroutine f02 subroutine f03 integer :: x - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture !$omp atomic update capture x = x + 1 x = x + 2 @@ -50,7 +50,7 @@ subroutine f04 integer :: x, v !$omp atomic update capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement v = x x = v !$omp end atomic @@ -60,8 +60,8 @@ subroutine f05 integer :: x, v, z !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand 
side of the capture assignment should read z v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 !$omp end atomic end @@ -70,8 +70,8 @@ subroutine f06 integer :: x, v, z !$omp atomic update capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z v = x !$omp end atomic end diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 677b933932b44..5e180aa0bbe5b 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -97,50 +97,50 @@ program sample !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture x = x + 10 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x v = b !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture !$omp 
atomic capture v = 1 x = 4 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) !$omp end atomic >From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 24 Mar 2025 15:38:58 -0500 Subject: [PATCH 13/24] DumpEvExpr: show type --- flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 2f445429a10b5..1553dac3b6687 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,6 +16,7 @@ #include #include +#include #include #include @@ -38,6 +39,17 @@ class DumpEvaluateExpr { } private: + template + struct TypeOf { + static constexpr std::string_view name{TypeOf::get()}; + static constexpr std::string_view get() { + std::string_view v(__PRETTY_FUNCTION__); 
+ v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" + return v; + } + }; + template void Show(const common::Indirection &x) { Show(x.value()); } @@ -76,7 +88,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant"); + Indent("derived constant "s + std::string(TypeOf::name)); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -84,7 +96,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant"); + Print("constant "s + std::string(TypeOf::name)); } } void Show(const Symbol &symbol); @@ -102,7 +114,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator"); + Indent("designator "s + std::string(TypeOf::name)); Show(x.u); Outdent(); } @@ -117,7 +129,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref"); + Indent("function ref "s + std::string(TypeOf::name)); Show(x.proc()); Show(x.arguments()); Outdent(); @@ -127,14 +139,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value"); + Indent("array constructor value "s + std::string(TypeOf::name)); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do"); + Indent("implied do "s + std::string(TypeOf::name)); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -148,20 +160,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op"); + Indent("unary op "s + std::string(TypeOf::name)); Show(op.left()); Outdent(); } 
template void Show(const evaluate::Operation &op) { - Indent("binary op"); + Indent("binary op "s + std::string(TypeOf::name)); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr T"); + Indent("expr <" + std::string(TypeOf::name) + ">"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index aa0b4e0f03398..66cedab94bfb4 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("expr some type"); + Indent("relational some type"); Show(x.u); Outdent(); } >From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 15:14:45 -0500 Subject: [PATCH 14/24] Handle conversion from real to complex via complex constructor --- flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++--- 1 file changed, 47 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dada9c6c2bd6f..ae81dcb5ea150 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,36 +3183,46 @@ struct ConvertCollector using Base::operator(); template // - Result operator()(const evaluate::Designator &x) const { + Result asSomeExpr(const T &x) const { auto copy{x}; return {AsGenericExpr(std::move(copy)), {}}; } + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + template // Result operator()(const evaluate::FunctionRef &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template // Result operator()(const evaluate::Constant &x) const { - auto copy{x}; - return 
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/24] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: } >From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:14:47 -0500 Subject: [PATCH 16/24] Allow conversion in update operations --- flang/include/flang/Semantics/tools.h | 17 ++++----- flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++-- flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++----------- .../Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++-- flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++---------- .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +- 7 files changed, 44 insertions(+), 57 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7f1ec59b087a2..9be2feb8ae064 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -789,14 +789,15 @@ inline bool checkForSymbolMatch( /// return the "expr" but with top-level parentheses stripped. std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); -/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). -/// Check if "expr" is -/// SomeType(SomeKind(Type( -/// Convert -/// SomeKind(...)[2]))) -/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves -/// TypeCategory. -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// it returns std::nullopt. 
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic >From 341723713929507c59d528540d32bc2e4213e920 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:21:56 -0500 Subject: [PATCH 17/24] format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e8b1b..0f553541c5ef0 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } >From 2686207342bad511f6d51b20ed923c0d2cc9047b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:22:26 -0500 Subject: [PATCH 18/24] Revert "DumpEvExpr: show type" This reverts commit 40510a3068498d15257cc7d198bce9c8cd71a902. Debug changes accidentally pushed upstream. 
--- flang/include/flang/Semantics/dump-expr.h | 30 +++++++---------------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 10 insertions(+), 22 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b6687..2f445429a10b5 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,7 +16,6 @@ #include #include -#include #include #include @@ -39,17 +38,6 @@ class DumpEvaluateExpr { } private: - template - struct TypeOf { - static constexpr std::string_view name{TypeOf::get()}; - static constexpr std::string_view get() { - std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" - return v; - } - }; - template void Show(const common::Indirection &x) { Show(x.value()); } @@ -88,7 +76,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant "s + std::string(TypeOf::name)); + Indent("derived constant"); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -96,7 +84,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant "s + std::string(TypeOf::name)); + Print("constant"); } } void Show(const Symbol &symbol); @@ -114,7 +102,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator "s + std::string(TypeOf::name)); + Indent("designator"); Show(x.u); Outdent(); } @@ -129,7 +117,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref "s + std::string(TypeOf::name)); + Indent("function ref"); Show(x.proc()); Show(x.arguments()); Outdent(); @@ 
-139,14 +127,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value "s + std::string(TypeOf::name)); + Indent("array constructor value"); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do "s + std::string(TypeOf::name)); + Indent("implied do"); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -160,20 +148,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op "s + std::string(TypeOf::name)); + Indent("unary op"); Show(op.left()); Outdent(); } template void Show(const evaluate::Operation &op) { - Indent("binary op "s + std::string(TypeOf::name)); + Indent("binary op"); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr <" + std::string(TypeOf::name) + ">"); + Indent("expr T"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 66cedab94bfb4..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("relational some type"); + Indent("expr some type"); Show(x.u); Outdent(); } >From c00fc531bcf742c409fc974da94c5b362fa9132c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:37:19 -0500 Subject: [PATCH 19/24] Delete unnecessary static_assert --- flang/lib/Semantics/check-omp-structure.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 6005dda7c26fe..2e59553d5e130 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -21,8 +21,6 @@ namespace Fortran::semantics { -static_assert(std::is_same_v>); - template static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { return !(e == f); >From 45b012c16b77c757a0d09b2a229bad49fed8d26f Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:25 -0500 Subject: [PATCH 20/24] Add missing initializer for 'iff' --- flang/lib/Semantics/check-omp-structure.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 2e59553d5e130..aa1bd136b371f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2815,7 +2815,7 @@ static std::optional AnalyzeConditionalStmt( } } else { AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, - GetActionStmt(std::get(s.t))}; + GetActionStmt(std::get(s.t)), SourcedActionStmt{}}; if (result.ift.stmt) { return result; } >From daeac25991bf14fb08c3accabe068c074afa1eb7 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:47 -0500 Subject: [PATCH 21/24] Add asserts for printing "Identity" as top-level operator --- flang/lib/Semantics/check-omp-structure.cpp | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa1bd136b371f..062b45deac865 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3823,6 +3823,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, atomic::ToString(top.first)); @@ -3852,6 +3853,8 @@ void 
OmpStructureChecker::CheckAtomicUpdateAssignment( "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, atom.AsFortran(), atomic::ToString(top.first)); @@ -3898,16 +3901,20 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, atomic::ToString(top.first)); } break; } + case atomic::Operator::Identity: case atomic::Operator::True: case atomic::Operator::False: break; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, atomic::ToString(top.first)); >From ae121e5c37453af1a4aba7c77939c2c1c45b75fa Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 13:15:58 -0500 Subject: [PATCH 22/24] Explain the use of determinant --- flang/lib/Semantics/check-omp-structure.cpp | 31 ++++++++++++++++++--- 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 062b45deac865..bc6a09b9768ef 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3606,10 +3606,33 @@ OmpStructureChecker::CheckUpdateCapture( // subexpression of the right-hand side. // 2. An assignment could be a capture (cbc) if the right-hand side is // a variable (or a function ref), with potential type conversions. 
- bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; - bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; - bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; - bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; // Can as1 be an update? + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; // Can as2 be an update? + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; // Can 1 be capture? + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; // Can 2 be capture? + + // We want to diagnose cases where both assignments cannot be an update, + // or both cannot be a capture, as well as cases where either assignment + // cannot be any of these two. + // + // If we organize these boolean values into a matrix + // |cbu1 cbu2| + // |cbc1 cbc2| + // then we want to diagnose cases where the matrix has a zero (i.e. "false") + // row or column, including the case where everything is zero. All these + // cases correspond to the determinant of the matrix being 0, which suggests + // that checking the det may be a convenient diagnostic check. There is only + // one additional case where the det is 0, which is when the matrix is all 1 + // ("true"). The "all true" case represents the situation where both + // assignments could be an update as well as a capture. On the other hand, + // whenever det != 0, the roles of the update and the capture can be + // unambiguously assigned to as1 and as2 [1]. + // + // [1] This can be easily verified by hand: there are 10 2x2 matrices with + // det = 0, leaving 6 cases where det != 0: + // 0 1 0 1 1 0 1 0 1 1 1 1 + // 1 0 1 1 0 1 1 1 0 1 1 0 + // In each case the classification is unambiguous. 
// |cbu1 cbu2| // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 >From cae0e8fcd3f6b8c2bc3ad8f85599ef4765c6afc5 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 11:48:18 -0500 Subject: [PATCH 23/24] Deal with assignments that failed Fortran semantic checks Don't emit diagnostics for those. --- flang/lib/Semantics/check-omp-structure.cpp | 66 ++++++++++++++------- 1 file changed, 46 insertions(+), 20 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bc6a09b9768ef..89a3a407441a8 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2726,6 +2726,9 @@ static SourcedActionStmt GetActionStmt(const parser::Block &block) { // Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption // is that the ActionStmt will be either an assignment or a pointer-assignment, // otherwise return std::nullopt. +// Note: This function can return std::nullopt on [Pointer]AssignmentStmt where +// the "typedAssignment" is unset. This can happen if there are semantic errors +// in the purported assignment. static std::optional GetEvaluateAssignment( const parser::ActionStmt *x) { if (x == nullptr) { @@ -2754,6 +2757,29 @@ static std::optional GetEvaluateAssignment( x->u); } +// Check if the ActionStmt is actually a [Pointer]AssignmentStmt. This is +// to separate cases where the source has something that looks like an +// assignment, but is semantically wrong (diagnosed by general semantic +// checks), and where the source has some other statement (which we want +// to report as "should be an assignment"). 
+static bool IsAssignment(const parser::ActionStmt *x) { + if (x == nullptr) { + return false; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + + return common::visit( + [](auto &&s) -> bool { + using BareS = llvm::remove_cvref_t; + return std::is_same_v || + std::is_same_v; + }, + x->u); +} + static std::optional AnalyzeConditionalStmt( const parser::ExecutionPartConstruct *x) { if (x == nullptr) { @@ -3588,8 +3614,10 @@ OmpStructureChecker::CheckUpdateCapture( auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; if (!maybeAssign1 || !maybeAssign2) { - context_.Say(source, - "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + if (!IsAssignment(act1.stmt) || !IsAssignment(act2.stmt)) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + } return std::make_pair(nullptr, nullptr); } @@ -3956,7 +3984,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( // The if-true statement must be present, and must be an assignment. 
auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; if (!maybeAssign) { - if (update.ift.stmt) { + if (update.ift.stmt && !IsAssignment(update.ift.stmt)) { context_.Say(update.ift.source, "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); } else { @@ -3992,7 +4020,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); } @@ -4094,17 +4122,11 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( } SourcedActionStmt uact{GetActionStmt(uec)}; SourcedActionStmt cact{GetActionStmt(cec)}; - auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; - auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; - - if (!maybeUpdate || !maybeCapture) { - context_.Say(source, - "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); - return; - } + // The "dereferences" of std::optional are guaranteed to be valid after + // CheckUpdateCapture. 
+ evaluate::Assignment update{*GetEvaluateAssignment(uact.stmt)}; + evaluate::Assignment capture{*GetEvaluateAssignment(cact.stmt)}; - const evaluate::Assignment &update{*maybeUpdate}; - const evaluate::Assignment &capture{*maybeCapture}; const SomeExpr &atom{update.lhs}; using Analysis = parser::OpenMPAtomicConstruct::Analysis; @@ -4242,13 +4264,17 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( return; } } else { - context_.Say(capture.source, - "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + if (!IsAssignment(capture.stmt)) { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + } return; } } else { - context_.Say(update.ift.source, - "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + if (!IsAssignment(update.ift.stmt)) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + } return; } @@ -4316,7 +4342,7 @@ void OmpStructureChecker::CheckAtomicRead( MakeAtomicAnalysisOp(Analysis::Read, maybeRead), MakeAtomicAnalysisOp(Analysis::None)); } - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); } @@ -4350,7 +4376,7 @@ void OmpStructureChecker::CheckAtomicWrite( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); } >From 6bc8c10c793ebac02c78daec33e7fb5e6becb8e0 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 12:47:00 -0500 Subject: [PATCH 24/24] Move common functions to tools.cpp --- flang/include/flang/Semantics/tools.h | 134 +++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 
flang/lib/Semantics/check-omp-structure.cpp | 506 ++------------------ flang/lib/Semantics/tools.cpp | 310 ++++++++++++ 4 files changed, 484 insertions(+), 468 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 821f1ae34fd5b..25fadceefceb0 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -778,11 +778,135 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, return false; } -/// If the top-level operation (ignoring parentheses) is either an -/// evaluate::FunctionRef, or a specialization of evaluate::Operation, -/// then return the list of arguments (wrapped in SomeExpr). Otherwise, -/// return the "expr" but with top-level parentheses stripped. -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); +namespace operation { + +enum class Operator { + Add, + And, + Associated, + Call, + Convert, + Div, + Eq, + Eqv, + False, + Ge, + Gt, + Identity, + Intrinsic, + Lt, + Max, + Min, + Mul, + Ne, + Neqv, + Not, + Or, + Pow, + Resize, // Convert within the same TypeCategory + Sub, + True, + Unknown, +}; + +std::string ToString(Operator op); + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unknown; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return 
Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unknown; +} + +template +Operator OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Add; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Sub; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Mul; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Div; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } +} + +template // +Operator OperationCode(const T &) { + return Operator::Unknown; +} + +Operator OperationCode(const evaluate::ProcedureDesignator &proc); + +} // namespace operation + +/// Return information about the top-level operation (ignoring parentheses): +/// the operation code and the list of arguments. +std::pair> +GetTopLevelOperation(const SomeExpr &expr); /// Check if expr is same as x, or a sequence of Convert operations on x. bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ad5eae4ae39a2..c74f7627c5e25 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2828,7 +2828,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, // This must exist by now. 
SomeExpr input = *semantics::GetConvertInput(assign.rhs); - std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; + std::vector args{semantics::GetTopLevelOperation(input).second}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrConvertOf(arg, atom)) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 89a3a407441a8..f29a56d5fd92a 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2891,290 +2891,6 @@ static std::pair SplitAssignmentSource( namespace atomic { -template static void MoveAppend(V &accum, V &&other) { - for (auto &&s : other) { - accum.push_back(std::move(s)); - } -} - -enum class Operator { - Unk, - // Operators that are officially allowed in the update operation - Add, - And, - Associated, - Div, - Eq, - Eqv, - Ge, // extension - Gt, - Identity, // extension: x = x is allowed (*), but we should never print - // "identity" as the name of the operator - Le, // extension - Lt, - Max, - Min, - Mul, - Ne, // extension - Neqv, - Or, - Sub, - // Operators that we recognize for technical reasons - True, - False, - Not, - Convert, - Resize, - Intrinsic, - Call, - Pow, - - // (*): "x = x + 0" is a valid update statement, but it will be folded - // to "x = x" by the time we look at it. Since the source statements - // "x = x" and "x = x + 0" will end up looking the same, accept the - // former as an extension. 
-}; - -std::string ToString(Operator op) { - switch (op) { - case Operator::Add: - return "+"; - case Operator::And: - return "AND"; - case Operator::Associated: - return "ASSOCIATED"; - case Operator::Div: - return "/"; - case Operator::Eq: - return "=="; - case Operator::Eqv: - return "EQV"; - case Operator::Ge: - return ">="; - case Operator::Gt: - return ">"; - case Operator::Identity: - return "identity"; - case Operator::Le: - return "<="; - case Operator::Lt: - return "<"; - case Operator::Max: - return "MAX"; - case Operator::Min: - return "MIN"; - case Operator::Mul: - return "*"; - case Operator::Neqv: - return "NEQV/EOR"; - case Operator::Ne: - return "/="; - case Operator::Or: - return "OR"; - case Operator::Sub: - return "-"; - case Operator::True: - return ".TRUE."; - case Operator::False: - return ".FALSE."; - case Operator::Not: - return "NOT"; - case Operator::Convert: - return "type-conversion"; - case Operator::Resize: - return "resize"; - case Operator::Intrinsic: - return "intrinsic"; - case Operator::Call: - return "function-call"; - case Operator::Pow: - return "**"; - default: - return "??"; - } -} - -template // -struct ArgumentExtractor - : public evaluate::Traverse, - std::pair>, false> { - using Arguments = std::vector; - using Result = std::pair; - using Base = evaluate::Traverse, - Result, false>; - static constexpr auto IgnoreResizes = IgnoreResizingConverts; - static constexpr auto Logical = common::TypeCategory::Logical; - ArgumentExtractor() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result operator()( - const evaluate::Constant> &x) const { - if (const auto &val{x.GetScalarValue()}) { - return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) - : std::make_pair(Operator::False, Arguments{}); - } - return Default(); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - Result result{OperationCode(x.proc()), {}}; - for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { - if (auto *e{x.UnwrapArgExpr(i)}) { - result.second.push_back(*e); - } - } - return result; - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. - return (*this)(x.template operand<0>()); - } - if constexpr (IgnoreResizes && - std::is_same_v>) { - // Ignore conversions within the same category. - // Atomic operations on int(kind=1) may be implicitly widened - // to int(kind=4) for example. - return (*this)(x.template operand<0>()); - } else { - return std::make_pair( - OperationCode(x), OperationArgs(x, std::index_sequence_for{})); - } - } - - template // - Result operator()(const evaluate::Designator &x) const { - evaluate::Designator copy{x}; - Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; - return result; - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - // There shouldn't be any combining needed, since we're stopping the - // traversal at the top-level operation, but implement one that picks - // the first non-empty result. 
- if constexpr (sizeof...(Rs) == 0) { - return std::move(result); - } else { - if (!result.second.empty()) { - return std::move(result); - } else { - return Combine(std::move(results)...); - } - } - } - -private: - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) - const { - switch (op.derived().logicalOperator) { - case common::LogicalOperator::And: - return Operator::And; - case common::LogicalOperator::Or: - return Operator::Or; - case common::LogicalOperator::Eqv: - return Operator::Eqv; - case common::LogicalOperator::Neqv: - return Operator::Neqv; - case common::LogicalOperator::Not: - return Operator::Not; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - switch (op.derived().opr) { - case common::RelationalOperator::LT: - return Operator::Lt; - case common::RelationalOperator::LE: - return Operator::Le; - case common::RelationalOperator::EQ: - return Operator::Eq; - case common::RelationalOperator::NE: - return Operator::Ne; - case common::RelationalOperator::GE: - return Operator::Ge; - case common::RelationalOperator::GT: - return Operator::Gt; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Add; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Sub; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Mul; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Div; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - if constexpr (C == 
T::category) { - return Operator::Resize; - } else { - return Operator::Convert; - } - } - Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { - Operator code = llvm::StringSwitch(proc.GetName()) - .Case("associated", Operator::Associated) - .Case("min", Operator::Min) - .Case("max", Operator::Max) - .Case("iand", Operator::And) - .Case("ior", Operator::Or) - .Case("ieor", Operator::Neqv) - .Default(Operator::Call); - if (code == Operator::Call && proc.GetSpecificIntrinsic()) { - return Operator::Intrinsic; - } - return code; - } - template // - Operator OperationCode(const T &) const { - return Operator::Unk; - } - - template - Arguments OperationArgs(const evaluate::Operation &x, - std::index_sequence) const { - return Arguments{SomeExpr(x.template operand())...}; - } -}; - struct DesignatorCollector : public evaluate::Traverse, false> { using Result = std::vector; @@ -3196,125 +2912,14 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (MoveAppend(v, std::move(results)), ...); - return v; - } -}; - -struct ConvertCollector - : public evaluate::Traverse>, false> { - using Result = std::pair>; - using Base = evaluate::Traverse; - ConvertCollector() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result asSomeExpr(const T &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; - } - - template // - Result operator()(const evaluate::Designator &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::Constant &x) const { - return asSomeExpr(x); - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore parentheses. 
- return (*this)(x.template operand<0>()); - } else if constexpr (is_convert_v) { - // Convert should always have a typed result, so it should be safe to - // dereference x.GetType(). - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else if constexpr (is_complex_constructor_v) { - // This is a conversion iff the imaginary operand is 0. - if (IsZero(x.template operand<1>())) { - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else { - return asSomeExpr(x.derived()); - } - } else { - return asSomeExpr(x.derived()); - } - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - Result v(std::move(result)); - auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { - assert((!x.has_value() || !y.has_value()) && "Multiple designators"); - if (!x.has_value()) { - x = std::move(y); + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); } }}; - (setValue(v.first, std::move(results).first), ...); - (MoveAppend(v.second, std::move(results).second), ...); + (moveAppend(v, std::move(results)), ...); return v; } - -private: - template // - static bool IsZero(const T &x) { - return false; - } - template // - static bool IsZero(const evaluate::Expr &x) { - return common::visit([](auto &&s) { return IsZero(s); }, x.u); - } - template // - static bool IsZero(const evaluate::Constant &x) { - if (auto &&maybeScalar{x.GetScalarValue()}) { - return maybeScalar->IsZero(); - } else { - return false; - } - } - - template // - struct is_convert { - static constexpr bool value{false}; - }; - template // - struct is_convert> { - static constexpr bool value{true}; - }; - template // - struct is_convert> { - // Conversion from complex to real. 
- static constexpr bool value{true}; - }; - template // - static constexpr bool is_convert_v = is_convert::value; - - template // - struct is_complex_constructor { - static constexpr bool value{false}; - }; - template // - struct is_complex_constructor> { - static constexpr bool value{true}; - }; - template // - static constexpr bool is_complex_constructor_v = - is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { @@ -3347,22 +2952,13 @@ static bool IsAllocatable(const SomeExpr &expr) { return !syms.empty() && IsAllocatable(syms.back()); } -static std::pair> GetTopLevelOperation( - const SomeExpr &expr) { - return atomic::ArgumentExtractor{}(expr); -} - -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { - return GetTopLevelOperation(expr).second; -} - static bool IsPointerAssignment(const evaluate::Assignment &x) { return std::holds_alternative(x.u) || std::holds_alternative(x.u); } static bool IsCheckForAssociated(const SomeExpr &cond) { - return GetTopLevelOperation(cond).first == atomic::Operator::Associated; + return GetTopLevelOperation(cond).first == operation::Operator::Associated; } static bool HasCommonDesignatorSymbols( @@ -3455,23 +3051,7 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -MaybeExpr GetConvertInput(const SomeExpr &x) { - // This returns SomeExpr(x) when x is a designator/functionref/constant. - return atomic::ConvertCollector{}(x).first; -} - -bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { - // Check if expr is same as x, or a sequence of Convert operations on x. 
- if (expr == x) { - return true; - } else if (auto maybe{GetConvertInput(expr)}) { - return *maybe == x; - } else { - return false; - } -} - -bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { +static bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3839,45 +3419,46 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - std::pair> top{ - atomic::Operator::Unk, {}}; + std::pair> top{ + operation::Operator::Unknown, {}}; if (auto &&maybeInput{GetConvertInput(update.rhs)}) { top = GetTopLevelOperation(*maybeInput); } switch (top.first) { - case atomic::Operator::Add: - case atomic::Operator::Sub: - case atomic::Operator::Mul: - case atomic::Operator::Div: - case atomic::Operator::And: - case atomic::Operator::Or: - case atomic::Operator::Eqv: - case atomic::Operator::Neqv: - case atomic::Operator::Min: - case atomic::Operator::Max: - case atomic::Operator::Identity: + case operation::Operator::Add: + case operation::Operator::Sub: + case operation::Operator::Mul: + case operation::Operator::Div: + case operation::Operator::And: + case operation::Operator::Or: + case operation::Operator::Eqv: + case operation::Operator::Neqv: + case operation::Operator::Min: + case operation::Operator::Max: + case operation::Operator::Identity: break; - case atomic::Operator::Call: + case operation::Operator::Call: context_.Say(source, "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Convert: + case operation::Operator::Convert: context_.Say(source, "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Intrinsic: + case operation::Operator::Intrinsic: context_.Say(source, "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Unk: + case operation::Operator::Unknown: 
context_.Say( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); return; } // Check if `atom` occurs exactly once in the argument list. @@ -3898,17 +3479,17 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( }()}; if (unique == top.second.end()) { - if (top.first == atomic::Operator::Identity) { + if (top.first == operation::Operator::Identity) { // This is "x = y". context_.Say(rsrc, "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, - atom.AsFortran(), atomic::ToString(top.first)); + atom.AsFortran(), operation::ToString(top.first)); } } else { CheckStorageOverlap(atom, nonAtom, source); @@ -3933,18 +3514,18 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( // Missing arguments to operations would have been diagnosed by now. 
switch (top.first) { - case atomic::Operator::Associated: + case operation::Operator::Associated: if (atom != top.second.front()) { context_.Say(assignSource, "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); } break; // x equalop e | e equalop x (allowing "e equalop x" is an extension) - case atomic::Operator::Eq: - case atomic::Operator::Eqv: + case operation::Operator::Eq: + case operation::Operator::Eqv: // x ordop expr | expr ordop x - case atomic::Operator::Lt: - case atomic::Operator::Gt: { + case operation::Operator::Lt: + case operation::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; if (IsSameOrConvertOf(arg0, atom)) { @@ -3952,23 +3533,24 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); } break; } - case atomic::Operator::Identity: - case atomic::Operator::True: - case atomic::Operator::False: + case operation::Operator::Identity: + case operation::Operator::True: + case operation::Operator::False: break; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); break; } } diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 1d1e3ac044166..fce930dcc1d02 100644 --- a/flang/lib/Semantics/tools.cpp +++ 
b/flang/lib/Semantics/tools.cpp @@ -17,6 +17,7 @@ #include "flang/Semantics/tools.h" #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringSwitch.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1756,4 +1757,313 @@ bool HadUseError( } } +namespace operation { +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() + ? std::make_pair(operation::Operator::True, Arguments{}) + : std::make_pair(operation::Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{operation::OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); + } + } + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair(operation::OperationCode(x), + OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{ + operation::Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; +} // namespace operation + +std::string operation::ToString(operation::Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + 
return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; + } +} + +operation::Operator operation::OperationCode( + const evaluate::ProcedureDesignator &proc) { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; +} + +std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return operation::ArgumentExtractor{}(expr); +} + +namespace operation { +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result asSomeExpr(const T &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::Constant &x) const { + return asSomeExpr(x); + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. 
+ if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } + } else { + return asSomeExpr(x.derived()); + } + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (moveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; +}; +} // namespace operation + +MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. 
+  return operation::ConvertCollector{}(x).first;
+}
+
+bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) {
+  // Check if expr is same as x, or a sequence of Convert operations on x.
+  if (expr == x) {
+    return true;
+  } else if (auto maybe{GetConvertInput(expr)}) {
+    return *maybe == x;
+  } else {
+    return false;
+  }
+}
+
 } // namespace Fortran::semantics

From flang-commits at lists.llvm.org Fri May 30 10:49:29 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Fri, 30 May 2025 10:49:29 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <6839efa9.050a0220.c105a.91a4@mx.google.com>

https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852

>From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 07:58:29 -0500
Subject: [PATCH 01/24] [flang][OpenMP] Mark atomic clauses as unique

The current implementation of the ATOMIC construct handles these clauses
individually, and this change does not have an observable effect. At the
same time these clauses are unique as per the OpenMP spec, and this patch
reflects that in the OMP.td file.
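For illustration only (a minimal sketch, not code from this patch): a clause listed under `allowedOnceClauses` may appear at most once on a directive, while one under `allowedClauses` may repeat. Assuming a checker that receives the clause names as plain strings, the uniqueness test could look roughly like this. The helper name `FindRepeatedOnceClauses` and the string-based representation are hypothetical and do not correspond to the TableGen-generated tables or to flang's actual checker.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of flang or LLVM): returns the names of
// clauses that occur more than once on a directive even though they are
// listed as "allowed once". The result is sorted by clause name.
std::vector<std::string> FindRepeatedOnceClauses(
    const std::vector<std::string> &clausesOnDirective,
    const std::set<std::string> &allowedOnce) {
  // Count occurrences, but only for clauses that must be unique.
  std::map<std::string, int> counts;
  for (const std::string &name : clausesOnDirective) {
    if (allowedOnce.count(name) != 0) {
      ++counts[name];
    }
  }
  // Collect every unique-only clause that was seen at least twice.
  std::vector<std::string> repeated;
  for (const auto &[name, count] : counts) {
    if (count > 1) {
      repeated.push_back(name);
    }
  }
  return repeated;
}
```

With this sketch, a directive carrying `seq_cst` twice would be reported when `seq_cst` is in the allowed-once set, and would pass silently if it were only in a set of repeatable clauses.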
---
 llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index eff6d57995d2b..cdfd3e3223fa8 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> {
   ];
 }
 def OMP_Atomic : Directive<"atomic"> {
-  let allowedClauses = [
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-  ];
   let allowedOnceClauses = [
     VersionedClause,
     VersionedClause,
+    VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
+    VersionedClause,
   ];
   let association = AS_Block;
   let category = CA_Executable;
@@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> {
   let category = CA_Executable;
 }
 def OMP_Critical : Directive<"critical"> {
-  let allowedClauses = [
+  let allowedOnceClauses = [
     VersionedClause,
   ];
   let association = AS_Block;

>From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 10:08:58 -0500
Subject: [PATCH 02/24] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs

The OpenMP implementation of the ATOMIC construct will change in the
near future to accommodate OpenMP 6.0. This patch separates the shared
implementations to avoid interfering with OpenACC.
--- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t 
hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. - auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite,
- loc);
+ genAtomicWrite(converter, atomicWrite, loc);
 },
 [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) {
- Fortran::lower::genOmpAccAtomicUpdate<
- Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate,
- loc);
+ genAtomicUpdate(converter, atomicUpdate, loc);
 },
 [&](const Fortran::parser::AccAtomicCapture &atomicCapture) {
- Fortran::lower::genOmpAccAtomicCapture<
- Fortran::parser::AccAtomicCapture, void>(converter,
- atomicCapture, loc);
+ genAtomicCapture(converter, atomicCapture, loc);
 },
 },
 atomicConstruct.u);
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 312557d5da07e..fdd85e94829f3 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
 queue, item, clauseOps);
 }
+//===----------------------------------------------------------------------===//
+// Code generation for atomic operations
+//===----------------------------------------------------------------------===//
+
+/// Populates \p hint and \p memoryOrder with appropriate clause information
+/// if present on atomic construct.
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/24] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
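As an illustration only (not part of the patch), the optional-argument handling described above can be modeled in standalone C++. This is a minimal sketch under stated assumptions: `DependenceType`, `TaskDependenceType`, `UpdateArg`, `UpdateClause` and `describe` are hypothetical stand-ins, not the flang parser or lowering classes. It shows only the general shape of an argument that is present for DEPOBJ and absent for ATOMIC.

```cpp
#include <optional>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical stand-ins for the two kinds of UPDATE argument.
enum class DependenceType { Sink, Source };
enum class TaskDependenceType { In, Out, Inout };

// On DEPOBJ, UPDATE carries exactly one of the two alternatives.
using UpdateArg = std::variant<DependenceType, TaskDependenceType>;

// The clause argument is optional: an empty optional models the
// argument-less UPDATE used on ATOMIC.
struct UpdateClause {
  std::optional<UpdateArg> v;
};

// Check for a missing argument first, then dispatch on the alternative.
std::string describe(const UpdateClause &clause) {
  if (!clause.v)
    return "update (no argument)";
  return std::visit(
      [](auto &&arg) -> std::string {
        using T = std::decay_t<decltype(arg)>;
        if constexpr (std::is_same_v<T, DependenceType>)
          return "update(dependence-type)";
        else
          return "update(task-dependence-type)";
      },
      *clause.v);
}
```

In this sketch a default-constructed `UpdateClause{}` plays the role of the ATOMIC form, while `UpdateClause{UpdateArg{TaskDependenceType::In}}` plays the role of the DEPOBJ form. Testing the optional before visiting the variant follows the same order as the `if (inp.v)` guard in the patch.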
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/24] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed. 
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse, and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 | 
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. 
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expexcing two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture. 
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expexcing single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point. 
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
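Editorial sketch (not part of the patch): the comment above describes a speculative scan for the end-directive, bounded by "BodyLimit", plus a guard against re-entering the atomic parser while that scan is running. The toy model below illustrates the idea under loud assumptions: the source is a plain list of lines, and every name in it (`ToyState`, `ToyAtomicParser`, `ParseProgram`, the `attempts` counter) is hypothetical and does not exist in flang. Only the overall shape follows the comment: try to find `end atomic` within a bounded number of statements, otherwise fall back to a single-statement body.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy model, not flang code: the "source" is a list of lines and
// the parse state is an index into that list.
struct ToyState {
  const std::vector<std::string> *lines{nullptr};
  std::size_t pos{0};
  bool AtEnd() const { return pos >= lines->size(); }
  const std::string &Peek() const { return (*lines)[pos]; }
};

struct ToyAtomic {
  std::size_t bodySize{0}; // number of statements in the body
  bool hasEnd{false}; // whether "end atomic" was consumed
};

class ToyAtomicParser {
public:
  static constexpr std::size_t BodyLimit{5};
  // Counts the calls that got past the guard (for illustration only).
  static inline std::size_t attempts{0};

  static std::optional<ToyAtomic> Parse(ToyState &state) {
    // The guard: a nested attempt made while scanning a body fails at once.
    if (recursing_ || state.AtEnd() || state.Peek() != "atomic") {
      return std::nullopt;
    }
    recursing_ = true;
    ++attempts;
    const ToyState start{state};
    ++state.pos; // consume the begin-directive
    ToyAtomic result;
    if (!ParseStatement(state)) {
      state = start; // nothing usable follows: backtrack and fail
      recursing_ = false;
      return std::nullopt;
    }
    result.bodySize = 1;
    // Speculatively look for "end atomic" within BodyLimit further items.
    const ToyState afterFirst{state};
    std::size_t extra{0};
    bool found{false};
    for (std::size_t i{0}; i != BodyLimit && !state.AtEnd(); ++i) {
      if (state.Peek() == "end atomic") {
        ++state.pos;
        found = true;
        break;
      }
      if (!ParseStatement(state)) {
        break;
      }
      ++extra;
    }
    if (found) {
      result.bodySize += extra;
      result.hasEnd = true;
    } else {
      state = afterFirst; // no end-directive: keep the single statement
    }
    recursing_ = false;
    return result;
  }

private:
  // A general "executable construct": it tries the atomic parser first, the
  // way a real statement parser would, then falls back to a plain statement.
  static bool ParseStatement(ToyState &state) {
    if (Parse(state)) {
      return true;
    }
    if (!state.AtEnd() && state.Peek() != "atomic" &&
        state.Peek() != "end atomic") {
      ++state.pos;
      return true;
    }
    return false;
  }

  static inline bool recursing_{false};
};

struct ToyResult {
  std::vector<ToyAtomic> constructs;
  std::size_t attempts{0};
};

// Parses every atomic construct in the list and skips anything else.
ToyResult ParseProgram(const std::vector<std::string> &lines) {
  ToyAtomicParser::attempts = 0;
  ToyState state{&lines, 0};
  ToyResult result;
  while (!state.AtEnd()) {
    if (auto atomic{ToyAtomicParser::Parse(state)}) {
      result.constructs.push_back(*atomic);
    } else {
      ++state.pos;
    }
  }
  result.attempts = ToyAtomicParser::attempts;
  return result;
}
```

In this toy, calling `ParseProgram` on a run of back-to-back constructs without end-directives makes one guarded attempt per construct, because the speculative scan stops as soon as it meets the next `atomic` line. Without the `recursing_` check, each scan would start a nested scan of its own, which is the blow-up the comment warns about.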
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
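Editorial sketch (not part of the patch): the doc comment above describes peeling off nested BLOCK constructs until the block either has more than one entry or its single entry is not a BLOCK construct. A minimal standalone illustration follows, assuming a simplified tree in which a block is a vector of nodes and each node is either a statement or a nested block construct. The types `ToyStmt`, `ToyBlockConstruct` and `ToyBlock` and the helpers `Wrap` and `Stmts` are hypothetical stand-ins, not the flang parse-tree classes.

```cpp
#include <cassert>
#include <initializer_list>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical simplified parse tree, for illustration only.
struct ToyBlockConstruct;
struct ToyStmt {
  std::string text;
};
using ToyNode = std::variant<ToyStmt, std::unique_ptr<ToyBlockConstruct>>;
using ToyBlock = std::vector<ToyNode>;
struct ToyBlockConstruct {
  ToyBlock body;
};

// While the block consists of exactly one BLOCK construct, descend into it.
const ToyBlock &InnermostExecPart(const ToyBlock &block) {
  const ToyBlock *iter{&block};
  while (iter->size() == 1) {
    if (auto *bc{std::get_if<std::unique_ptr<ToyBlockConstruct>>(
            &iter->front())}) {
      iter = &(*bc)->body;
      continue;
    }
    break;
  }
  return *iter;
}

// Test helper: wrap a block in one more BLOCK/ENDBLOCK layer.
ToyBlock Wrap(ToyBlock inner) {
  auto bc{std::make_unique<ToyBlockConstruct>()};
  bc->body = std::move(inner);
  ToyBlock outer;
  outer.emplace_back(std::move(bc));
  return outer;
}

// Test helper: build a block of plain statements.
ToyBlock Stmts(std::initializer_list<const char *> texts) {
  ToyBlock block;
  for (const char *t : texts) {
    block.emplace_back(ToyStmt{t});
  }
  return block;
}
```

A block that holds a BLOCK construct next to other statements is returned unchanged, since only a sole BLOCK construct is treated as a wrapper.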
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ...)" is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+ auto &dirSpec{std::get(x.t)}; + auto &clauseList{std::get>(dirSpec.t)}; if (clauseList) { - atomicDirectiveDefaultOrderFound_ = true; - switch (*defaultMemOrder) { - case common::OmpMemoryOrderType::Acq_Rel: - clauseList->emplace_back(common::visit( - common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause { - return parser::OmpClause::Acquire{}; - }, - [](parser::OmpAtomicCapture &) -> parser::OmpClause { - return parser::OmpClause::AcqRel{}; - }, - [](auto &) -> parser::OmpClause { - // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite} - return parser::OmpClause::Release{}; - }}, - x.u)); - break; - case common::OmpMemoryOrderType::Relaxed: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::Relaxed{}}); - break; - case common::OmpMemoryOrderType::Seq_Cst: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::SeqCst{}}); - break; - default: - // FIXME: Don't process other values at the moment since their validity - // depends on the OpenMP version (which is unavailable here). - break; + if (findMemOrderClause(*clauseList)) { + return false; } + } else { + clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){}); + } + + // Add a memory order clause to the atomic directive. + atomicDirectiveDefaultOrderFound_ = true; + switch (*defaultMemOrder) { + case common::OmpMemoryOrderType::Acq_Rel: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}}); + break; + case common::OmpMemoryOrderType::Relaxed: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}}); + break; + case common::OmpMemoryOrderType::Seq_Cst: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}}); + break; + default: + // FIXME: Don't process other values at the moment since their validity + // depends on the OpenMP version (which is unavailable here). 
+ break; } return false; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 index b82bd13622764..6f58e0939a787 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 index 88ec6fe910b9e..6729be6e5cf8b 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b8422c8f3b769 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture() !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr !CHECK: omp.atomic.capture { !CHECK: omp.atomic.update 
%[[VAL_A_BOX_ADDR]] : !fir.ptr { !CHECK: ^bb0(%[[ARG:.*]]: i32): !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32 !CHECK: omp.yield(%[[TEMP]] : i32) !CHECK: } -!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 +!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 !CHECK: } !CHECK: return !CHECK: } diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90 index 13392ad76471f..6eded49b0b15d 100644 --- a/flang/test/Lower/OpenMP/atomic-write.f90 +++ b/flang/test/Lower/OpenMP/atomic-write.f90 @@ -44,9 +44,9 @@ end program OmpAtomicWrite !CHECK-LABEL: func.func @_QPatomic_write_pointer() { !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"} !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) -!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32 !CHECK: %[[C2:.*]] = arith.constant 2 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90 deleted file mode 100644 index 5cd02698ff482..0000000000000 --- a/flang/test/Parser/OpenMP/atomic-compare.f90 +++ /dev/null @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s -! OpenMP version for documentation purposes only - it isn't used until Sema. -! This is testing for Parser errors that bail out before Sema. 
-program main - implicit none - integer :: i, j = 10 - logical :: r - - !CHECK: error: expected OpenMP construct - !$omp atomic compare write - r = i .eq. j + 1 - - !CHECK: error: expected end of line - !$omp atomic compare num_threads(4) - r = i .eq. j -end program main diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90 index 54492bf6a22a6..11e23e062bce7 100644 --- a/flang/test/Semantics/OpenMP/atomic-compare.f90 +++ b/flang/test/Semantics/OpenMP/atomic-compare.f90 @@ -44,46 +44,37 @@ !$omp end atomic ! Check for error conditions: - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic compare seq_cst seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst compare seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire compare if (b .eq. 
c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic compare acquire acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire compare acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic compare relaxed relaxed if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed compare relaxed if (b .eq. c) b = a - !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one FAIL clause can appear on the ATOMIC directive !$omp atomic fail(release) compare fail(release) if (c .eq. 
a) a = b !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index c13a11a8dd5dc..deb67e7614659 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -16,20 +16,21 @@ program sample !$omp atomic read hint(2) y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(3) y = y + 10 !$omp atomic update hint(5) y = x + y - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement y = x x = y !$omp end atomic - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic update hint(x) y = y * 1 @@ -46,7 +47,7 @@ program sample !$omp atomic hint(omp_lock_hint_speculative) x = y + x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read y = x @@ -69,36 +70,36 @@ program sample !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative) x = y + x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = y * 9 - !ERROR: Hint clause must have non-negative 
constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp atomic hint(1.0) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp atomic hint(z + omp_sync_hint_nonspeculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(k + omp_sync_hint_speculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write x = 10 * y !$omp atomic write hint(a) - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and y+x access the same storage x = y + x !$omp atomic hint(abs(-1)) write diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 new file mode 100644 index 0000000000000..2619d235380f8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -0,0 +1,89 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC READ. Expect no diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v = x + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v => x +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! 
Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic read + v = p(i) +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic read + !ERROR: Atomic variable x should be a scalar + v = x + + !$omp atomic read + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + v = y(2:4) +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic read + !ERROR: Within atomic operation x and x access the same storage + x = x + + ! Accessing same array, but not the same storage. Expect no diagnostics. + !$omp atomic read + y(1) = y(2) +end + +subroutine f05 + integer :: x, v + + !$omp atomic read + !ERROR: Atomic expression x+1_4 should be a variable + v = x + 1 +end + +subroutine f06 + character :: x, v + + !$omp atomic read + !ERROR: Atomic variable x cannot have CHARACTER type + v = x +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic read + !ERROR: Atomic variable x cannot be ALLOCATABLE + v = x +end + diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 new file mode 100644 index 0000000000000..64e31a7974b55 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -0,0 +1,77 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !$omp atomic update capture + x = v + x = x + 1 + y = x + !$omp end atomic +end + +subroutine f01 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments + !$omp atomic update capture + x = v + block + x = x + 1 + y = x + end block + !$omp end atomic +end + +subroutine f02 + integer :: x, y + + ! The update and capture statements can be inside of a single BLOCK. + ! The end-directive is then optional. Expect no diagnostics. 
+ !$omp atomic update capture + block + x = x + 1 + y = x + end block +end + +subroutine f03 + integer :: x + + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !$omp atomic update capture + x = x + 1 + x = x + 2 + !$omp end atomic +end + +subroutine f04 + integer :: x, v + + !$omp atomic update capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + v = x + x = v + !$omp end atomic +end + +subroutine f05 + integer :: x, v, z + + !$omp atomic update capture + v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + !$omp end atomic +end + +subroutine f06 + integer :: x, v, z + + !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + v = x + !$omp end atomic +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 new file mode 100644 index 0000000000000..0aee02d429dc0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -0,0 +1,83 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y + + ! The x is a direct argument of the + operator. Expect no diagnostics. + !$omp atomic update + x = x + (y - 1) +end + +subroutine f01 + integer :: x + + ! x + 0 is unusual, but legal. Expect no diagnostics. + !$omp atomic update + x = x + 0 +end + +subroutine f02 + integer :: x + + ! This is formally not allowed by the syntax restrictions of the spec, + ! but it's equivalent to either x+0 or x*1, both of which are legal. + ! Allow this case. Expect no diagnostics. 
+ !$omp atomic update + x = x +end + +subroutine f03 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator + x = (x + y) + 1 +end + +subroutine f04 + integer :: x + real :: y + + !$omp atomic update + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation + x = floor(x + y) +end + +subroutine f05 + integer :: x + real :: y + + !$omp atomic update + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + x = int(x + y) +end + +subroutine f06 + integer :: x, y + interface + function f(i, j) + integer :: f, i, j + end + end interface + + !$omp atomic update + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation + x = f(x, y) +end + +subroutine f07 + real :: x + integer :: y + + !$omp atomic update + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation + x = x ** y +end + +subroutine f08 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should appear as an argument in the update operation + x = y +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 index 21a9b87d26345..3084376b4275d 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 @@ -22,10 +22,10 @@ program sample x = x / y !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. y !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. 
y end program diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 new file mode 100644 index 0000000000000..979568ebf7140 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -0,0 +1,81 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics. + !$omp atomic write + x = v + 1 + + !$omp atomic write + x = v + 3 + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic write + x = 2 * v + 3 + + !$omp atomic write + x => v +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic write + p(i) = v +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic write + !ERROR: Atomic variable x should be a scalar + x = v + + !$omp atomic write + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + y(2:4) = v +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic write + !ERROR: Within atomic operation x and x+1_4 access the same storage + x = x + 1 + + ! Accessing same array, but not the same storage. Expect no diagnostics. 
+ !$omp atomic write + y(1) = y(2) +end + +subroutine f06 + character :: x, v + + !$omp atomic write + !ERROR: Atomic variable x cannot have CHARACTER type + x = v +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic write + !ERROR: Atomic variable x cannot be ALLOCATABLE + x = v +end + diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0e100871ea9b4..0dc8251ae356e 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! Check OpenMP 2.13.6 atomic Construct @@ -11,9 +11,13 @@ a = b !$omp end atomic + !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED) a = b + !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write a = b @@ -22,39 +26,32 @@ a = a + 1 !$omp end atomic + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic hint(1) acq_rel capture b = a a = a + 1 !$omp end atomic - !ERROR: expected end of line + !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct !$omp atomic read write + !ERROR: Atomic expression a+1._4 should be a variable a = a + 1 !$omp atomic a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 
'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic num_threads(4) a = a + 1 - !ERROR: expected end of line + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic capture num_threads(4) a = a + 1 + !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic relaxed a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' - !$omp atomic num_threads write - a = a + 1 - !$omp end parallel end diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90 index 173effe86b69c..f700c381cadd0 100644 --- a/flang/test/Semantics/OpenMP/atomic01.f90 +++ b/flang/test/Semantics/OpenMP/atomic01.f90 @@ -14,322 +14,277 @@ ! At most one memory-order-clause may appear on the construct. 
!READ - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic read seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst read seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic read acquire acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire read acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the 
READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic read relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed read relaxed i = j !UPDATE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic update seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst update seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The 
atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic update release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release update release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic update relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic 
relaxed update relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !CAPTURE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic capture seq_cst seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst capture seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic capture release release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release capture release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not 
allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic capture relaxed relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed capture relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel acq_rel capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic capture acq_rel acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel capture acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire capture i = j j = k !$omp 
end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic capture acquire acquire i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire capture acquire i = j j = k !$omp end atomic !WRITE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic write seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst write seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic write release release i = j - !ERROR: More than one memory order clause not allowed on 
OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release write release i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic write relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed write relaxed i = j !No atomic-clause - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as 
an argument in the update operation i = j ! 2.17.7.3 ! At most one hint clause may appear on the construct. - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At 
most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint 
(omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative) i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j j = k @@ -337,34 +292,26 @@ ! 2.17.7.4 ! If atomic-clause is read then memory-order-clause must not be acq_rel or release. - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic acq_rel read i = j - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read acq_rel i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic release read i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read release i = j ! 2.17.7.5 ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire. 
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acq_rel write i = j - !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acq_rel i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acquire write i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acquire i = j @@ -372,33 +319,27 @@ ! 2.17.7.6 ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire. - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acq_rel update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acquire update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive !$omp atomic acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed on the 
ATOMIC directive !$omp atomic acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j end program diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90 index c66085d00f157..45e41f2552965 100644 --- a/flang/test/Semantics/OpenMP/atomic02.f90 +++ b/flang/test/Semantics/OpenMP/atomic02.f90 @@ -28,36 +28,29 @@ program OmpAtomic !$omp atomic a = a/(b + 1) !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: The atomic variable c should appear as an argument in the update operation c = d !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. 
b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The /= operator is not a valid ATOMIC UPDATE operation l = a .NE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic m = m .AND. n @@ -76,32 +69,26 @@ program OmpAtomic !$omp atomic update a = a/(b + 1) !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: This is not a valid ATOMIC UPDATE operation c = c//d !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. 
b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic update m = m .AND. n diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index 76367495b9861..f5c189fd05318 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -25,28 +25,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur 
exactly once among the arguments of the top-level MAX operator z = MAX(y, 7, b, c) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8, a, d) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -60,26 +58,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX 
operator z = MAX(y, 7) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = MOD(y, 9) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation x = ABS(x) end program OmpAtomic @@ -92,7 +90,7 @@ subroutine conflicting_types() type(simple) ::s z = 1 !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(s%z, 4) end subroutine @@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) :: s !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator a = min(a, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a) !$omp atomic - !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)` a = min(b, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a, b) 
!$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator y = min(z, x) !$omp atomic z = max(z, y) !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k' + !ERROR: Atomic variable k should be a scalar + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation k = max(x, y) - + !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement x = min(x, k) !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - z =z + s%m + z = z + s%m end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index a9644ad95aa30..5c91ab5dc37e4 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -20,12 +20,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x 
= 1 + y !$omp atomic @@ -33,12 +31,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic @@ -46,12 +42,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic @@ -59,12 +53,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or 
explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic @@ -72,8 +64,7 @@ program OmpAtomic !$omp atomic m = n .AND. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic @@ -81,8 +72,7 @@ program OmpAtomic !$omp atomic m = n .OR. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic @@ -90,8 +80,7 @@ program OmpAtomic !$omp atomic m = n .EQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic @@ -99,8 +88,7 @@ program OmpAtomic !$omp atomic m = n .NEQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. 
l !$omp atomic update @@ -108,12 +96,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 + y !$omp atomic update @@ -121,12 +107,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic update @@ -134,12 +118,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' 
expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic update @@ -147,12 +129,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic update @@ -160,8 +140,7 @@ program OmpAtomic !$omp atomic update m = n .AND. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic update @@ -169,8 +148,7 @@ program OmpAtomic !$omp atomic update m = n .OR. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic update @@ -178,8 +156,7 @@ program OmpAtomic !$omp atomic update m = n .EQV. 
m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic update @@ -187,8 +164,7 @@ program OmpAtomic !$omp atomic update m = n .NEQV. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. l end program OmpAtomic @@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) p !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement x = x !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 1 !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and a*b access the same storage a = a * b + a !$omp atomic - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator a = b * (a + 9) !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (a+b) access the same storage a = a * (a + b) !$omp atomic - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (b+a) access the same storage a = (b + a) * a !$omp atomic - !ERROR: Atomic update statement should 
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value
+  !ERROR: The synchronization hint is not valid
   !$omp critical (name) hint(7)
   y = 2
   !$omp end critical (name)

-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Must be a constant value
   !$omp critical (name) hint(x)
   y = 2
@@ -54,7 +54,7 @@ program sample
   y = 2
   !$omp end critical (name)

-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Must be a constant value
   !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint)
   y = 2
@@ -84,35 +84,35 @@ program sample
   y = 2
   !$omp end critical (name)

-  !ERROR: Hint clause value is not a valid OpenMP synchronization value
+  !ERROR: The synchronization hint is not valid
   !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended)
   y = 2
   !$omp end critical (name)

-  !ERROR: Hint clause value is not a valid OpenMP synchronization value
+  !ERROR: The synchronization hint is not valid
   !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative)
   y = 2
   !$omp end critical (name)

-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Must have INTEGER type, but is REAL(4)
   !$omp critical (name) hint(1.0)
   y = 2
   !$omp end critical (name)

-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4)
   !$omp critical (name) hint(z + omp_sync_hint_nonspeculative)
   y = 2
   !$omp end critical (name)

-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Must be a constant value
   !$omp critical (name) hint(k + omp_sync_hint_speculative)
   y = 2
   !$omp end critical (name)

-  !ERROR: Hint clause must have non-negative constant integer expression
+  !ERROR: Synchronization hint must be a constant integer value
   !ERROR: Must be a constant value
   !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended)
   y = 2
diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
index 505cbc48fef90..677b933932b44 100644
--- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
+++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
@@ -20,70 +20,64 @@ program sample

   !$omp atomic read
   !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4)
-  !ERROR: Expected scalar expression on the RHS of atomic assignment statement
+  !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar
   v = y(1:3)

   !$omp atomic read
-  !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement
+  !ERROR: Atomic expression x*(10_4+x) should be a variable
   v = x * (10 + x)

   !$omp atomic read
-  !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement
+  !ERROR: Atomic expression 4_4 should be a variable
   v = 4

   !$omp atomic read
-  !ERROR: k must not have ALLOCATABLE attribute
+  !ERROR: Atomic variable k cannot be ALLOCATABLE
   v = k

   !$omp atomic write
-  !ERROR: k must not have ALLOCATABLE attribute
+  !ERROR: Atomic variable k cannot be ALLOCATABLE
   k = x

   !$omp atomic update
-  !ERROR: k must not have ALLOCATABLE attribute
+  !ERROR: Atomic variable k cannot be ALLOCATABLE
   k = k + x * (v * x)

   !$omp atomic
-  !ERROR: k must not have ALLOCATABLE attribute
+  !ERROR: Atomic variable k cannot be ALLOCATABLE
   k = v * k

   !$omp atomic write
-  !ERROR: RHS expression on atomic assignment statement cannot access 'z%y'
+  !ERROR: Within atomic operation z%y and x+z%y access the same storage
   z%y = x + z%y

   !$omp atomic write
-  !ERROR: RHS expression on atomic assignment statement cannot access 'x'
+  !ERROR: Within atomic operation x and x access the same storage
   x = x

   !$omp atomic write
-  !ERROR: RHS expression on atomic assignment statement cannot access 'm'
+  !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage
   m = min(m, x, z%m) + k

   !$omp atomic read
-  !ERROR: RHS expression on atomic assignment statement cannot access 'x'
+  !ERROR: Within atomic operation x and x access the same storage
   x = x

   !$omp atomic read
-  !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement
-  !ERROR: RHS expression on atomic assignment statement cannot access 'm'
+  !ERROR: Atomic expression min(m,x,z%m)+k should be a variable
   m = min(m, x, z%m) + k

   !$omp atomic read
   !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4)
-  !ERROR: Expected scalar expression on the RHS of atomic assignment statement
+  !ERROR: Atomic variable a should be a scalar
   x = a

-  !$omp atomic read
-  !ERROR: Expected scalar variable on the LHS of atomic assignment statement
-  a = x
-
   !$omp atomic write
   !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4)
-  !ERROR: Expected scalar expression on the RHS of atomic assignment statement
   x = a

   !$omp atomic write
-  !ERROR: Expected scalar variable on the LHS of atomic assignment statement
+  !ERROR: Atomic variable a should be a scalar
   a = x

   !$omp atomic capture
@@ -93,7 +87,7 @@ program sample

   !$omp atomic release capture
   v = x
-  !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
+  !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
   x = b + (x*1)
   !$omp end atomic

@@ -103,60 +97,58 @@ program sample
   !$omp end atomic

   !$omp atomic capture
-  !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct
   v = x
+  !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
   b = b + 1
   !$omp end atomic

   !$omp atomic capture
-  !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct
   v = x
+  !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
   b = 10
   !$omp end atomic

+  !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
   !$omp atomic capture
-  !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct
   x = x + 10
   v = b
   !$omp end atomic

+  !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
   !$omp atomic capture
-  !ERROR: Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]
   v = 1
   x = 4
   !$omp end atomic

   !$omp atomic capture
-  !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct
   x = z%y
+  !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y
   z%m = z%m + 1.0
   !$omp end atomic

   !$omp atomic capture
-  !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct
+  !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y
   z%m = z%m + 1.0
   x = z%y
   !$omp end atomic

   !$omp atomic capture
-  !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct
   x = y(2)
+  !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8)
   y(1) = y(1) + 1
   !$omp end atomic

   !$omp atomic capture
-  !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct
+  !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8)
   y(1) = y(1) + 1
   x = y(2)
   !$omp end atomic

   !$omp atomic read
-  !ERROR: Expected scalar variable on the LHS of atomic assignment statement
-  !ERROR: Expected scalar expression on the RHS of atomic assignment statement
+  !ERROR: Atomic variable r cannot have CHARACTER type
   l = r

   !$omp atomic write
-  !ERROR: Expected scalar variable on the LHS of atomic assignment statement
-  !ERROR: Expected scalar expression on the RHS of atomic assignment statement
+  !ERROR: Atomic variable l cannot have CHARACTER type
   l = r
 end program
diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90
index ae9fd086015dd..e8817c3f5ef61 100644
--- a/flang/test/Semantics/OpenMP/requires-atomic01.f90
+++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90
@@ -10,20 +10,23 @@ program requires
   ! READ
   ! ----------------------------------------------------------------------------

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Read
+  ! CHECK: OmpClause -> SeqCst
   !$omp atomic read
   i = j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK-NOT: OmpClause -> SeqCst
+  ! CHECK: OmpClause -> Relaxed
+  ! CHECK: OmpClause -> Read
   !$omp atomic relaxed read
   i = j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Read
+  ! CHECK-NOT: OmpClause -> SeqCst
+  ! CHECK: OmpClause -> Relaxed
   !$omp atomic read relaxed
   i = j

@@ -31,20 +34,23 @@ program requires
   ! WRITE
   ! ----------------------------------------------------------------------------

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Write
+  ! CHECK: OmpClause -> SeqCst
   !$omp atomic write
   i = j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK-NOT: OmpClause -> SeqCst
+  ! CHECK: OmpClause -> Relaxed
+  ! CHECK: OmpClause -> Write
   !$omp atomic relaxed write
   i = j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Write
+  ! CHECK-NOT: OmpClause -> SeqCst
+  ! CHECK: OmpClause -> Relaxed
   !$omp atomic write relaxed
   i = j

@@ -52,31 +58,34 @@ program requires
   ! UPDATE
   ! ----------------------------------------------------------------------------

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Update
+  ! CHECK: OmpClause -> SeqCst
   !$omp atomic update
   i = i + j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK-NOT: OmpClause -> SeqCst
+  ! CHECK: OmpClause -> Relaxed
+  ! CHECK: OmpClause -> Update
   !$omp atomic relaxed update
   i = i + j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Update
+  ! CHECK-NOT: OmpClause -> SeqCst
+  ! CHECK: OmpClause -> Relaxed
   !$omp atomic update relaxed
   i = i + j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> SeqCst
   !$omp atomic
   i = i + j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK-NOT: OmpClause -> SeqCst
+  ! CHECK: OmpClause -> Relaxed
   !$omp atomic relaxed
   i = i + j

@@ -84,24 +93,27 @@ program requires
   ! CAPTURE
   ! ----------------------------------------------------------------------------

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Capture
+  ! CHECK: OmpClause -> SeqCst
   !$omp atomic capture
   i = j
   j = j + 1
   !$omp end atomic

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK-NOT: OmpClause -> SeqCst
+  ! CHECK: OmpClause -> Relaxed
+  ! CHECK: OmpClause -> Capture
   !$omp atomic relaxed capture
   i = j
   j = j + 1
   !$omp end atomic

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Capture
+  ! CHECK-NOT: OmpClause -> SeqCst
+  ! CHECK: OmpClause -> Relaxed
   !$omp atomic capture relaxed
   i = j
   j = j + 1
diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90
index 4976a9667eb78..a3724a83456fd 100644
--- a/flang/test/Semantics/OpenMP/requires-atomic02.f90
+++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90
@@ -10,20 +10,23 @@ program requires
   ! READ
   ! ----------------------------------------------------------------------------

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Read
+  ! CHECK: OmpClause -> AcqRel
   !$omp atomic read
   i = j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK-NOT: OmpClause -> AcqRel
+  ! CHECK: OmpClause -> Relaxed
+  ! CHECK: OmpClause -> Read
   !$omp atomic relaxed read
   i = j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Read
+  ! CHECK-NOT: OmpClause -> AcqRel
+  ! CHECK: OmpClause -> Relaxed
   !$omp atomic read relaxed
   i = j

@@ -31,20 +34,23 @@ program requires
   ! WRITE
   ! ----------------------------------------------------------------------------

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Write
+  ! CHECK: OmpClause -> AcqRel
   !$omp atomic write
   i = j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK-NOT: OmpClause -> AcqRel
+  ! CHECK: OmpClause -> Relaxed
+  ! CHECK: OmpClause -> Write
   !$omp atomic relaxed write
   i = j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Write
+  ! CHECK-NOT: OmpClause -> AcqRel
+  ! CHECK: OmpClause -> Relaxed
   !$omp atomic write relaxed
   i = j

@@ -52,31 +58,34 @@ program requires
   ! UPDATE
   ! ----------------------------------------------------------------------------

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Update
+  ! CHECK: OmpClause -> AcqRel
   !$omp atomic update
   i = i + j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK-NOT: OmpClause -> AcqRel
+  ! CHECK: OmpClause -> Relaxed
+  ! CHECK: OmpClause -> Update
   !$omp atomic relaxed update
   i = i + j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Update
+  ! CHECK-NOT: OmpClause -> AcqRel
+  ! CHECK: OmpClause -> Relaxed
   !$omp atomic update relaxed
   i = i + j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> AcqRel
   !$omp atomic
   i = i + j

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK-NOT: OmpClause -> AcqRel
+  ! CHECK: OmpClause -> Relaxed
   !$omp atomic relaxed
   i = i + j

@@ -84,24 +93,27 @@ program requires
   ! CAPTURE
   ! ----------------------------------------------------------------------------

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK: OmpClause -> Capture
+  ! CHECK: OmpClause -> AcqRel
   !$omp atomic capture
   i = j
   j = j + 1
   !$omp end atomic

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK-NOT: OmpClause -> AcqRel
+  ! CHECK: OmpClause -> Relaxed
+  ! CHECK: OmpClause -> Capture
   !$omp atomic relaxed capture
   i = j
   j = j + 1
   !$omp end atomic

-  ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture
-  ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel
-  ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed
+  ! CHECK-LABEL: OpenMPAtomicConstruct
+  ! CHECK-NOT: OmpClause -> AcqRel
+  ! CHECK: OmpClause -> Capture
+  ! CHECK: OmpClause -> Relaxed
   !$omp atomic capture relaxed
   i = j
   j = j + 1

>From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:41:16 -0500
Subject: [PATCH 05/24] Restart build

>From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:45:28 -0500
Subject: [PATCH 06/24] Restart build

>From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 07:54:54 -0500
Subject: [PATCH 07/24] Restart build

>From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 08:18:01 -0500
Subject: [PATCH 08/24] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed

---
 flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic.f90 | 2 ++
 5 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90
index 2619d235380f8..73899a9ff37f2 100644
--- a/flang/test/Semantics/OpenMP/atomic-read.f90
+++ b/flang/test/Semantics/OpenMP/atomic-read.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
index 64e31a7974b55..c427ba07d43d8 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, y, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90
index 0aee02d429dc0..4595e02d01456 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-only.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, y
diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90
index 979568ebf7140..7965ad2dc7dbf 100644
--- a/flang/test/Semantics/OpenMP/atomic-write.f90
+++ b/flang/test/Semantics/OpenMP/atomic-write.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90
index 0dc8251ae356e..2caa161507d49 100644
--- a/flang/test/Semantics/OpenMP/atomic.f90
+++ b/flang/test/Semantics/OpenMP/atomic.f90
@@ -1,3 +1,5 @@
+! REQUIRES: openmp_runtime
+
 ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src
 use omp_lib
 ! Check OpenMP 2.13.6 atomic Construct

>From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 09:06:20 -0500
Subject: [PATCH 09/24] Fix examples

---
 flang/examples/FeatureList/FeatureList.cpp | 10 -------
 .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++--------------
 2 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp
index d1407cf0ef239..a36b8719e365d 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -445,13 +445,6 @@ struct NodeVisitor {
   READ_FEATURE(ObjectDecl)
   READ_FEATURE(OldParameterStmt)
   READ_FEATURE(OmpAlignedClause)
-  READ_FEATURE(OmpAtomic)
-  READ_FEATURE(OmpAtomicCapture)
-  READ_FEATURE(OmpAtomicCapture::Stmt1)
-  READ_FEATURE(OmpAtomicCapture::Stmt2)
-  READ_FEATURE(OmpAtomicRead)
-  READ_FEATURE(OmpAtomicUpdate)
-  READ_FEATURE(OmpAtomicWrite)
   READ_FEATURE(OmpBeginBlockDirective)
   READ_FEATURE(OmpBeginLoopDirective)
   READ_FEATURE(OmpBeginSectionsDirective)
@@ -480,7 +473,6 @@ struct NodeVisitor {
   READ_FEATURE(OmpIterationOffset)
   READ_FEATURE(OmpIterationVector)
   READ_FEATURE(OmpEndAllocators)
-  READ_FEATURE(OmpEndAtomic)
   READ_FEATURE(OmpEndBlockDirective)
   READ_FEATURE(OmpEndCriticalDirective)
   READ_FEATURE(OmpEndLoopDirective)
@@ -566,8 +558,6 @@ struct NodeVisitor {
   READ_FEATURE(OpenMPDeclareTargetConstruct)
   READ_FEATURE(OmpMemoryOrderType)
   READ_FEATURE(OmpMemoryOrderClause)
-  READ_FEATURE(OmpAtomicClause)
-  READ_FEATURE(OmpAtomicClauseList)
   READ_FEATURE(OmpAtomicDefaultMemOrderClause)
   READ_FEATURE(OpenMPFlushConstruct)
   READ_FEATURE(OpenMPLoopConstruct)
diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index dbbf86a6c6151..18a4107f9a9d1 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) {
           // the directive field.
           [&](const auto &c) -> SourcePosition {
             const CharBlock &source{std::get<0>(c.t).source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPAtomicConstruct &c) -> SourcePosition {
-            return std::visit(
-                [&](const auto &o) -> SourcePosition {
-                  const CharBlock &source{std::get(o.t).source};
-                  return parsing->allCooked()
-                      .GetSourcePositionRange(source)
-                      ->first;
-                },
-                c.u);
+            const CharBlock &source{c.source};
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPSectionConstruct &c) -> SourcePosition {
             const CharBlock &source{c.source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPUtilityConstruct &c) -> SourcePosition {
             const CharBlock &source{c.source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
       },
       c.u);
@@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
             return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPAtomicConstruct &c) -> std::string {
-            return std::visit(
-                [&](const auto &c) {
-                  // Get source from the verbatim fields
-                  const CharBlock &source{std::get(c.t).source};
-                  return "atomic-" +
-                      normalize_construct_name(source.ToString());
-                },
-                c.u);
+            const CharBlock &source{c.source};
+            return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPUtilityConstruct &c) -> std::string {
             const CharBlock &source{c.source};

>From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:19 -0500
Subject: [PATCH 10/24] Fix example

---
 .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++--
 flang/test/Examples/omp-atomic.f90 | 16 +++++++++++-----
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index 18a4107f9a9d1..b0964845d3b99 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
             return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPAtomicConstruct &c) -> std::string {
-            const CharBlock &source{c.source};
-            return normalize_construct_name(source.ToString());
+            auto &dirSpec = std::get(c.t);
+            auto &dirName = std::get(dirSpec.t);
+            return normalize_construct_name(dirName.source.ToString());
           },
           [&](const OpenMPUtilityConstruct &c) -> std::string {
             const CharBlock &source{c.source};
diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90
index dcca34b633a3e..934f84f132484 100644
--- a/flang/test/Examples/omp-atomic.f90
+++ b/flang/test/Examples/omp-atomic.f90
@@ -26,25 +26,31 @@
 ! CHECK:---
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 9
-! CHECK-NEXT: construct: atomic-read
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
-! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: - clause: read
 ! CHECK-NEXT: details: ''
+! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 12
-! CHECK-NEXT: construct: atomic-write
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
 ! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
+! CHECK-NEXT: - clause: write
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 16
-! CHECK-NEXT: construct: atomic-capture
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
+! CHECK-NEXT: - clause: capture
+! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;'
 ! CHECK-NEXT: - clause: seq_cst
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 21
-! CHECK-NEXT: construct: atomic-atomic
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses: []
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 8

>From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:47 -0500
Subject: [PATCH 11/24] Remove reference to *nullptr

---
 flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index ab868df76d298..4d9b375553c1e 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
   assert(firstOp && "Should have created an atomic operation");
   atomicAt = getInsertionPointAfter(firstOp);

-  mlir::Operation *secondOp = genAtomicOperation(
-      converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom,
-      *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt);
+  mlir::Operation *secondOp = nullptr;
+  if (analysis.op1.what != analysis.None) {
+    secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what,
+                                  atomAddr, atom, *get(analysis.op1.expr),
+                                  hint, memOrder, atomicAt, prepareAt);
+  }

   if (secondOp) {
     builder.setInsertionPointAfter(secondOp);

>From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 2 May 2025 18:49:05 -0500
Subject: [PATCH 12/24] Updates and improvements

---
 flang/include/flang/Parser/parse-tree.h | 2 +-
 flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++--
 flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++----
 flang/lib/Semantics/check-omp-structure.h | 1 +
 .../Todo/atomic-capture-implicit-cast.f90 | 48 ---
 .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 -
 .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +-
 .../OpenMP/atomic-update-capture.f90 | 8 +-
 .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +-
 9 files changed, 381 insertions(+), 180 deletions(-)
 delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90

diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h
index 77f57b1cb85c7..8213fe33edbd0 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct {

     struct Op {
       int what;
-      TypedExpr expr;
+      AssignmentStmt::TypedAssignment assign;
     };
     TypedExpr atom, cond;
     Op op0, op1;
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 6177b59199481..7b6c22095d723 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter,
 static mlir::Operation * //
 genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc,
               lower::StatementContext &stmtCtx, mlir::Value atomAddr,
-              const semantics::SomeExpr &atom, const semantics::SomeExpr &expr,
-              mlir::IntegerAttr hint,
+              const semantics::SomeExpr &atom,
+              const evaluate::Assignment &assign, mlir::IntegerAttr hint,
               mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+              fir::FirOpBuilder::InsertPoint preAt,
               fir::FirOpBuilder::InsertPoint atomicAt,
-              fir::FirOpBuilder::InsertPoint prepareAt) {
+              fir::FirOpBuilder::InsertPoint postAt) {
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint();
-  builder.restoreInsertionPoint(prepareAt);
+  builder.restoreInsertionPoint(preAt);

-  mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc));
+  mlir::Value storeAddr =
+      fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc));
   mlir::Type atomType = fir::unwrapRefType(atomAddr.getType());
+  mlir::Type storeType = fir::unwrapRefType(storeAddr.getType());
+
+  mlir::Value toAddr = [&]() {
+    if (atomType == storeType)
+      return storeAddr;
+    return builder.createTemporary(loc, atomType, ".tmp.atomval");
+  }();

   builder.restoreInsertionPoint(atomicAt);
   mlir::Operation *op = builder.create(
       loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder);
+
+  if (atomType != storeType) {
+    lower::ExprToValueMap overrides;
+    // The READ operation could be a part of UPDATE CAPTURE, so make sure
+    // we don't emit extra code into the body of the atomic op.
+    builder.restoreInsertionPoint(postAt);
+    mlir::Value load = builder.create(loc, toAddr);
+    overrides.try_emplace(&atom, load);
+
+    converter.overrideExprValues(&overrides);
+    mlir::Value value =
+        fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc));
+    converter.resetExprOverrides();
+
+    builder.create(loc, value, storeAddr);
+  }
   builder.restoreInsertionPoint(saved);
   return op;
 }
@@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc,
 static mlir::Operation * //
 genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc,
                lower::StatementContext &stmtCtx, mlir::Value atomAddr,
-               const semantics::SomeExpr &atom, const semantics::SomeExpr &expr,
-               mlir::IntegerAttr hint,
+               const semantics::SomeExpr &atom,
+               const evaluate::Assignment &assign, mlir::IntegerAttr hint,
                mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+               fir::FirOpBuilder::InsertPoint preAt,
                fir::FirOpBuilder::InsertPoint atomicAt,
-               fir::FirOpBuilder::InsertPoint prepareAt) {
+               fir::FirOpBuilder::InsertPoint postAt) {
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint();
-  builder.restoreInsertionPoint(prepareAt);
+  builder.restoreInsertionPoint(preAt);

-  mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc));
+  mlir::Value value =
+      fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc));
   mlir::Type atomType = fir::unwrapRefType(atomAddr.getType());
   mlir::Value converted = builder.createConvert(loc, atomType, value);

@@ -2719,19 +2746,20 @@ static mlir::Operation *
 genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc,
                 lower::StatementContext &stmtCtx, mlir::Value atomAddr,
                 const semantics::SomeExpr &atom,
-                const semantics::SomeExpr &expr, mlir::IntegerAttr hint,
+                const evaluate::Assignment &assign, mlir::IntegerAttr hint,
                 mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+                fir::FirOpBuilder::InsertPoint preAt,
                 fir::FirOpBuilder::InsertPoint atomicAt,
-                fir::FirOpBuilder::InsertPoint prepareAt) {
+                fir::FirOpBuilder::InsertPoint postAt) {
   lower::ExprToValueMap overrides;
   lower::StatementContext naCtx;
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint();
-  builder.restoreInsertionPoint(prepareAt);
+  builder.restoreInsertionPoint(preAt);

   mlir::Type atomType = fir::unwrapRefType(atomAddr.getType());

-  std::vector args{semantics::GetOpenMPTopLevelArguments(expr)};
+  std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)};
   assert(!args.empty() && "Update operation without arguments");
   for (auto &arg : args) {
     if (!semantics::IsSameOrResizeOf(arg, atom)) {
@@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc,

   converter.overrideExprValues(&overrides);
   mlir::Value updated =
-      fir::getBase(converter.genExprValue(expr, stmtCtx, &loc));
+      fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc));
   mlir::Value converted = builder.createConvert(loc, atomType, updated);
   builder.create(loc, converted);
   converter.resetExprOverrides();
@@ -2764,20 +2792,21 @@ static mlir::Operation *
 genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc,
                    lower::StatementContext &stmtCtx, int action,
                    mlir::Value atomAddr, const semantics::SomeExpr &atom,
-                   const semantics::SomeExpr &expr, mlir::IntegerAttr hint,
+                   const evaluate::Assignment &assign, mlir::IntegerAttr hint,
                    mlir::omp::ClauseMemoryOrderKindAttr memOrder,
+                   fir::FirOpBuilder::InsertPoint preAt,
                    fir::FirOpBuilder::InsertPoint atomicAt,
-                   fir::FirOpBuilder::InsertPoint prepareAt) {
+                   fir::FirOpBuilder::InsertPoint postAt) {
   switch (action) {
   case parser::OpenMPAtomicConstruct::Analysis::Read:
-    return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint,
-                         memOrder, atomicAt, prepareAt);
+    return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint,
+                         memOrder, preAt, atomicAt, postAt);
   case parser::OpenMPAtomicConstruct::Analysis::Write:
-    return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint,
-                          memOrder, atomicAt, prepareAt);
+    return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint,
+                          memOrder, preAt, atomicAt, postAt);
   case parser::OpenMPAtomicConstruct::Analysis::Update:
-    return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint,
-                           memOrder, atomicAt, prepareAt);
+    return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign,
+                           hint, memOrder, preAt, atomicAt, postAt);
   default:
     return nullptr;
   }
@@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) {
     }
     return ""s;
   };
+  auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) {
+    if (auto *maybe = assign.get(); maybe && maybe->v) {
+      std::string str;
+      llvm::raw_string_ostream os(str);
+      maybe->v->AsFortran(os);
+      return str;
+    }
+    return ""s;
+  };

   const SomeExpr &atom = *analysis.atom.get()->v;

@@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) {
   llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n";
   llvm::errs() << " op0 {\n";
   llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n";
-  llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n";
+  llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n";
   llvm::errs() << " }\n";
   llvm::errs() << " op1 {\n";
   llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n";
-  llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n";
+  llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n";
   llvm::errs() << " }\n";
   llvm::errs() << "}\n";
 }
@@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
                    semantics::SemanticsContext &semaCtx,
                    lower::pft::Evaluation &eval,
                    const parser::OpenMPAtomicConstruct &construct) {
-  auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * {
-    if (auto *maybe = expr.get(); maybe && maybe->v) {
+  auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) {
+    if (auto *maybe = typedWrapper.get(); maybe && maybe->v) {
       return &*maybe->v;
     } else {
       return nullptr;
@@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
   int action0 = analysis.op0.what & analysis.Action;
   int action1 = analysis.op1.what & analysis.Action;
   mlir::Operation *captureOp = nullptr;
-  fir::FirOpBuilder::InsertPoint atomicAt;
-  fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint();
+  fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint();
+  fir::FirOpBuilder::InsertPoint atomicAt, postAt;

   if (construct.IsCapture()) {
     // Capturing operation.
@@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
     captureOp = builder.create(loc, hint, memOrder);
     // Set the non-atomic insertion point to before the atomic.capture.
- prepareAt = getInsertionPointBefore(captureOp); + preAt = getInsertionPointBefore(captureOp); mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); builder.setInsertionPointToEnd(block); @@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, // atomic.capture. mlir::Operation *term = builder.create(loc); atomicAt = getInsertionPointBefore(term); + postAt = getInsertionPointAfter(captureOp); hint = nullptr; memOrder = nullptr; } else { @@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(action0 != analysis.None && action1 == analysis.None && "Expexcing single action"); assert(!(analysis.op0.what & analysis.Condition)); - atomicAt = prepareAt; + postAt = atomicAt = preAt; } mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, - *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); mlir::Operation *secondOp = nullptr; if (analysis.op1.what != analysis.None) { secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, - atomAddr, atom, *get(analysis.op1.expr), - hint, memOrder, atomicAt, prepareAt); + atomAddr, atom, *get(analysis.op1.assign), + hint, memOrder, preAt, atomicAt, postAt); } if (secondOp) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 201b38bd05ff3..f7753a5e5cc59 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } -static bool IsVarOrFunctionRef(const SomeExpr &expr) { - return evaluate::UnwrapProcedureRef(expr) != nullptr || - evaluate::IsVariable(expr); +static bool 
IsVarOrFunctionRef(const MaybeExpr &expr) { + if (expr) { + return evaluate::UnwrapProcedureRef(*expr) != nullptr || + evaluate::IsVariable(*expr); + } else { + return false; + } } static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { @@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource( namespace atomic { +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + enum class Operator { Unk, // Operators that are officially allowed in the update operation @@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (Append(v, std::move(results)), ...); + (MoveAppend(v, std::move(results)), ...); return v; } +}; -private: - static void Append(Result &acc, Result &&data) { - for (auto &&s : data) { - acc.push_back(std::move(s)); +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). 
+ return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + auto copy{x.derived()}; + return {evaluate::AsGenericExpr(std::move(copy)), {}}; } } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; }; } // namespace atomic @@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } +static MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { // Both expr and x have the form of SomeType(SomeKind(...)[1]). // Check if expr is @@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { } } +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { if (value) { expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), @@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { } } +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( - int what, const MaybeExpr &maybeExpr = std::nullopt) { + int what, + const std::optional &maybeAssign = std::nullopt) { parser::OpenMPAtomicConstruct::Analysis::Op operation; operation.what = what; - SetExpr(operation.expr, maybeExpr); + SetAssignment(operation.assign, maybeAssign); return operation; } @@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( // }; // struct Op { // int what; - // TypedExpr expr; + // TypedAssignment assign; // }; // TypedExpr atom, cond; // Op op0, op1; @@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, } } +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + /// Check 
if `expr` satisfies the following conditions for x and v: /// /// [6.0:189:10-12] @@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture( // // The two allowed cases are: // x = ... atomic-var = ... - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // or - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // x = ... atomic-var = ... // // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture @@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture( // // If the two statements don't fit these criteria, return a pair of default- // constructed values. + using ReturnTy = std::pair; SourcedActionStmt act1{GetActionStmt(ec1)}; SourcedActionStmt act2{GetActionStmt(ec2)}; @@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture( auto isUpdateCapture{ [](const evaluate::Assignment &u, const evaluate::Assignment &c) { - return u.lhs == c.rhs; + return IsSameOrConvertOf(c.rhs, u.lhs); }}; // Do some checks that narrow down the possible choices for the update // and the capture statements. This will help to emit better diagnostics. - bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); - bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; + + auto errorCaptureShouldRead{[&](const parser::CharBlock &source, + const std::string &expr) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US, + expr); + }}; - if (couldBeCapture1) { - if (couldBeCapture2) { - if (isUpdateCapture(as2, as1)) { - if (isUpdateCapture(as1, as2)) { - // If both statements could be captures and both could be updates, - // emit a warning about the ambiguity. - context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); - } - return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); - } else if (isUpdateCapture(as1, as2)) { + auto errorNeitherWorks{[&]() { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US); + }}; + + auto makeSelectionFromDet{[&](int det) -> ReturnTy { + // If det != 0, then the checks unambiguously suggest a specific + // categorization. + // If det == 0, then this function should be called only if the + // checks haven't ruled out any possibility, i.e. when both assigments + // could still be either updates or captures. 
+ if (det > 0) { + // as1 is update, as2 is capture + if (isUpdateCapture(as1, as2)) { return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - context_.Say(source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, - as1.rhs.AsFortran(), as2.rhs.AsFortran()); + errorCaptureShouldRead(act2.source, as1.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } else { // !couldBeCapture2 + } else if (det < 0) { + // as2 is update, as1 is capture if (isUpdateCapture(as2, as1)) { return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } else { - context_.Say(act2.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as1.rhs.AsFortran()); + errorCaptureShouldRead(act1.source, as2.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } - } else { // !couldBeCapture1 - if (couldBeCapture2) { - if (isUpdateCapture(as1, as2)) { - return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); - } else { + } else { + bool updateFirst{isUpdateCapture(as1, as2)}; + bool captureFirst{isUpdateCapture(as2, as1)}; + if (updateFirst && captureFirst) { + // If both assignment could be the update and both could be the + // capture, emit a warning about the ambiguity. context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as2.rhs.AsFortran()); + "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US); + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } - } else { - context_.Say(source, - "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + if (updateFirst != captureFirst) { + const parser::ExecutionPartConstruct *upd{updateFirst ? 
ec1 : ec2}; + const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2}; + return std::make_pair(upd, cap); + } + assert(!updateFirst && !captureFirst); + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); } + }}; + + if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) { + return makeSelectionFromDet(det); } + assert(det == 0 && "Prior checks should have covered det != 0"); - return std::make_pair(nullptr, nullptr); + // If neither of the statements is an RMW update, it could still be a + // "write" update. Pretty much any assignment can be a write update, so + // recompute det with cbu1 = cbu2 = true. + if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) { + return makeSelectionFromDet(writeDet); + } + + // It's only errors from here on. + + if (!cbu1 && !cbu2 && !cbc1 && !cbc2) { + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); + } + + // The remaining cases are that + // - no candidate for update, or for capture, + // - one of the assigments cannot be anything. + + if (!cbu1 && !cbu2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US); + return std::make_pair(nullptr, nullptr); + } else if (!cbc1 && !cbc2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) { + auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source; + context_.Say(src, + "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + // All cases should have been covered. 
+ llvm_unreachable("Unchecked condition"); } void OmpStructureChecker::CheckAtomicCaptureAssignment( const evaluate::Assignment &capture, const SomeExpr &atom, parser::CharBlock source) { - const SomeExpr &cap{capture.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &cap{capture.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, rsrc); - // This part should have been checked prior to callig this function. - assert(capture.rhs == atom && "This canont be a capture assignment"); + // This part should have been checked prior to calling this function. + assert(*GetConvertInput(capture.rhs) == atom && + "This canont be a capture assignment"); CheckStorageOverlap(atom, {cap}, source); } } void OmpStructureChecker::CheckAtomicReadAssignment( const evaluate::Assignment &read, parser::CharBlock source) { - const SomeExpr &atom{read.rhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; - if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + if (auto maybe{GetConvertInput(read.rhs)}) { + const SomeExpr &atom{*maybe}; + + if (!IsVarOrFunctionRef(atom)) { + ErrorShouldBeVariable(atom, rsrc); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } } else { - CheckAtomicVariable(atom, rsrc); - CheckStorageOverlap(atom, {read.lhs}, source); + ErrorShouldBeVariable(read.rhs, rsrc); } } @@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment( // one of the following forms: // x = expr // x => expr - const SomeExpr &atom{write.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{write.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + 
ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, lsrc); CheckStorageOverlap(atom, {write.rhs}, source); @@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( // x = intrinsic-procedure-name (x) // x = intrinsic-procedure-name (x, expr-list) // x = intrinsic-procedure-name (expr-list, x) - const SomeExpr &atom{update.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{update.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); // Skip other checks. return; } @@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( const SomeExpr &cond, parser::CharBlock condSource, const evaluate::Assignment &assign, parser::CharBlock assignSource) { - const SomeExpr &atom{assign.lhs}; auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + const SomeExpr &atom{assign.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, arsrc); // Skip other checks. 
return; } @@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( @@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign), MakeAtomicAnalysisOp(Analysis::None)); } @@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( if (GetActionStmt(&body.front()).stmt == uact.stmt) { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(action, update.rhs), - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + MakeAtomicAnalysisOp(action, update), + MakeAtomicAnalysisOp(Analysis::Read, capture)); } else { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), - MakeAtomicAnalysisOp(action, update.rhs)); + MakeAtomicAnalysisOp(Analysis::Read, capture), + MakeAtomicAnalysisOp(action, update)); } } @@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( if (captureFirst) { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign)); } else { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, 
capAssign.lhs)); + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign)); } } @@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead( if (body.size() == 1) { SourcedActionStmt action{GetActionStmt(&body.front())}; if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { - const SomeExpr &atom{maybeRead->rhs}; CheckAtomicReadAssignment(*maybeRead, action.source); - using Analysis = parser::OpenMPAtomicConstruct::Analysis; - x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), - MakeAtomicAnalysisOp(Analysis::None)); + if (auto maybe{GetConvertInput(maybeRead->rhs)}) { + const SomeExpr &atom{*maybe}; + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead), + MakeAtomicAnalysisOp(Analysis::None)); + } } else { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); @@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index bf6fbf16d0646..835fbe45e1c0e 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -253,6 +253,7 @@ class OmpStructureChecker void CheckStorageOverlap(const evaluate::Expr &, llvm::ArrayRef>, parser::CharBlock); + void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source); void CheckAtomicVariable( const evaluate::Expr &, parser::CharBlock); std::pair&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture 
requiring implicit type casts
-subroutine capture_with_convert_f32_to_i32()
- implicit none
- integer :: k, v, i
-
- k = 1
- v = 0
-
- !$omp atomic capture
- v = k
- k = (i + 1) * 3.14
- !$omp end atomic
-end subroutine
-
-subroutine capture_with_convert_i32_to_f64()
- real(8) :: x
- integer :: v
- x = 1.0
- v = 0
- !$omp atomic capture
- v = x
- x = v
- !$omp end atomic
-end subroutine capture_with_convert_i32_to_f64
-
-subroutine capture_with_convert_f64_to_i32()
- integer :: x
- real(8) :: v
- x = 1
- v = 0
- !$omp atomic capture
- x = v
- v = x
- !$omp end atomic
-end subroutine capture_with_convert_f64_to_i32
-
-subroutine capture_with_convert_i32_to_f32()
- real(4) :: x
- integer :: v
- x = 1.0
- v = 0
- !$omp atomic capture
- v = x
- x = x + v
- !$omp end atomic
-end subroutine capture_with_convert_i32_to_f32
diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
index 75f1cbfc979b9..aa9d2e0ac3ff7 100644
--- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
+++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90
@@ -1,5 +1,3 @@
-! REQUIRES : openmp_runtime
-
 ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s
 ! CHECK: func.func @_QPatomic_implicit_cast_read() {
diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
index deb67e7614659..8adb0f1a67409 100644
--- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
+++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
@@ -25,7 +25,7 @@ program sample
 !ERROR: The synchronization hint is not valid
 !$omp atomic hint(7) capture
- !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+ !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement
 y = x
 x = y
 !$omp end atomic
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
index c427ba07d43d8..f808ed916fb7e 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -39,7 +39,7 @@ subroutine f02
 subroutine f03
 integer :: x
- !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture
 !$omp atomic update capture
 x = x + 1
 x = x + 2
@@ -50,7 +50,7 @@ subroutine f04
 integer :: x, v
 !$omp atomic update capture
- !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+ !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement
 v = x
 x = v
 !$omp end atomic
@@ -60,8 +60,8 @@ subroutine f05
 integer :: x, v, z
 !$omp atomic update capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z
 v = x
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 z = x + 1
 !$omp end atomic
 end
@@ -70,8 +70,8 @@ subroutine f06
 integer :: x, v, z
 !$omp atomic update capture
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 z = x + 1
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z
 v = x
 !$omp end atomic
 end
diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
index 677b933932b44..5e180aa0bbe5b 100644
--- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
+++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90
@@ -97,50 +97,50 @@ program sample
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b
 v = x
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 b = b + 1
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b
 v = x
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
 b = 10
 !$omp end atomic
- !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
 !$omp atomic capture
 x = x + 10
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x
 v = b
 !$omp end atomic
- !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture
 !$omp atomic capture
 v = 1
 x = 4
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m
 x = z%y
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y
 z%m = z%m + 1.0
 !$omp end atomic
 !$omp atomic capture
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y
 z%m = z%m + 1.0
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m
 x = z%y
 !$omp end atomic
 !$omp atomic capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8)
 x = y(2)
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8)
 y(1) = y(1) + 1
 !$omp end atomic
 !$omp atomic capture
- !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8)
 y(1) = y(1) + 1
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8)
 x = y(2)
 !$omp end atomic

>From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Mon, 24 Mar 2025 15:38:58 -0500
Subject: [PATCH 13/24] DumpEvExpr: show type

---
 flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++-------
 flang/lib/Semantics/dump-expr.cpp | 2 +-
 2 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h
index 2f445429a10b5..1553dac3b6687 100644
--- a/flang/include/flang/Semantics/dump-expr.h
+++ b/flang/include/flang/Semantics/dump-expr.h
@@ -16,6 +16,7 @@
 #include
 #include
+#include
 #include
 #include
@@ -38,6 +39,17 @@ class DumpEvaluateExpr {
 }
 private:
+ template
+ struct TypeOf {
+ static constexpr std::string_view name{TypeOf::get()};
+ static constexpr std::string_view get() {
+ std::string_view v(__PRETTY_FUNCTION__);
+ v.remove_prefix(99); // Strip the part "... [with T = "
+ v.remove_suffix(50); // Strip the ending "; string_view = ...]"
+ return v;
+ }
+ };
+
 template void Show(const common::Indirection &x) {
 Show(x.value());
 }
@@ -76,7 +88,7 @@ class DumpEvaluateExpr {
 void Show(const evaluate::NullPointer &);
 template void Show(const evaluate::Constant &x) {
 if constexpr (T::category == common::TypeCategory::Derived) {
- Indent("derived constant");
+ Indent("derived constant "s + std::string(TypeOf::name));
 for (const auto &map : x.values()) {
 for (const auto &pair : map) {
 Show(pair.second.value());
@@ -84,7 +96,7 @@ class DumpEvaluateExpr {
 }
 Outdent();
 } else {
- Print("constant");
+ Print("constant "s + std::string(TypeOf::name));
 }
 }
 void Show(const Symbol &symbol);
@@ -102,7 +114,7 @@ class DumpEvaluateExpr {
 void Show(const evaluate::Substring &x);
 void Show(const evaluate::ComplexPart &x);
 template void Show(const evaluate::Designator &x) {
- Indent("designator");
+ Indent("designator "s + std::string(TypeOf::name));
 Show(x.u);
 Outdent();
 }
@@ -117,7 +129,7 @@ class DumpEvaluateExpr {
 Outdent();
 }
 template void Show(const evaluate::FunctionRef &x) {
- Indent("function ref");
+ Indent("function ref "s + std::string(TypeOf::name));
 Show(x.proc());
 Show(x.arguments());
 Outdent();
@@ -127,14 +139,14 @@ class DumpEvaluateExpr {
 }
 template
 void Show(const evaluate::ArrayConstructorValues &x) {
- Indent("array constructor value");
+ Indent("array constructor value "s + std::string(TypeOf::name));
 for (auto &v : x) {
 Show(v);
 }
 Outdent();
 }
 template void Show(const evaluate::ImpliedDo &x) {
- Indent("implied do");
+ Indent("implied do "s + std::string(TypeOf::name));
 Show(x.lower());
 Show(x.upper());
 Show(x.stride());
@@ -148,20 +160,20 @@ class DumpEvaluateExpr {
 void Show(const evaluate::StructureConstructor &x);
 template
 void Show(const evaluate::Operation &op) {
- Indent("unary op");
+ Indent("unary op "s + std::string(TypeOf::name));
 Show(op.left());
 Outdent();
 }
 template
 void Show(const evaluate::Operation &op) {
- Indent("binary op");
+ Indent("binary op "s + std::string(TypeOf::name));
 Show(op.left());
 Show(op.right());
 Outdent();
 }
 void Show(const evaluate::Relational &x);
 template void Show(const evaluate::Expr &x) {
- Indent("expr T");
+ Indent("expr <" + std::string(TypeOf::name) + ">");
 Show(x.u);
 Outdent();
 }
diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp
index aa0b4e0f03398..66cedab94bfb4 100644
--- a/flang/lib/Semantics/dump-expr.cpp
+++ b/flang/lib/Semantics/dump-expr.cpp
@@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) {
 }
 void DumpEvaluateExpr::Show(const evaluate::Relational &x) {
- Indent("expr some type");
+ Indent("relational some type");
 Show(x.u);
 Outdent();
 }

>From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Thu, 15 May 2025 15:14:45 -0500
Subject: [PATCH 14/24] Handle conversion from real to complex via complex constructor

---
 flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++---
 1 file changed, 47 insertions(+), 8 deletions(-)

diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp
index dada9c6c2bd6f..ae81dcb5ea150 100644
--- a/flang/lib/Semantics/check-omp-structure.cpp
+++ b/flang/lib/Semantics/check-omp-structure.cpp
@@ -3183,36 +3183,46 @@ struct ConvertCollector
 using Base::operator();
 template //
- Result operator()(const evaluate::Designator &x) const {
+ Result asSomeExpr(const T &x) const {
 auto copy{x};
 return {AsGenericExpr(std::move(copy)), {}};
 }
+ template //
+ Result operator()(const evaluate::Designator &x) const {
+ return asSomeExpr(x);
+ }
+
 template //
 Result operator()(const evaluate::FunctionRef &x) const {
- auto copy{x};
- return {AsGenericExpr(std::move(copy)), {}};
+ return asSomeExpr(x);
 }
 template //
 Result operator()(const evaluate::Constant &x) const {
- auto copy{x};
- return
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/24] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: } >From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:14:47 -0500 Subject: [PATCH 16/24] Allow conversion in update operations --- flang/include/flang/Semantics/tools.h | 17 ++++----- flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++-- flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++----------- .../Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++-- flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++---------- .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +- 7 files changed, 44 insertions(+), 57 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7f1ec59b087a2..9be2feb8ae064 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -789,14 +789,15 @@ inline bool checkForSymbolMatch( /// return the "expr" but with top-level parentheses stripped. std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); -/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). -/// Check if "expr" is -/// SomeType(SomeKind(Type( -/// Convert -/// SomeKind(...)[2]))) -/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves -/// TypeCategory. -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// it returns std::nullopt.
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic >From 341723713929507c59d528540d32bc2e4213e920 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:21:56 -0500 Subject: [PATCH 17/24] format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e8b1b..0f553541c5ef0 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } >From 2686207342bad511f6d51b20ed923c0d2cc9047b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:22:26 -0500 Subject: [PATCH 18/24] Revert "DumpEvExpr: show type" This reverts commit 40510a3068498d15257cc7d198bce9c8cd71a902. Debug changes accidentally pushed upstream. 
--- flang/include/flang/Semantics/dump-expr.h | 30 +++++++---------------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 10 insertions(+), 22 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b6687..2f445429a10b5 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,7 +16,6 @@ #include #include -#include #include #include @@ -39,17 +38,6 @@ class DumpEvaluateExpr { } private: - template - struct TypeOf { - static constexpr std::string_view name{TypeOf::get()}; - static constexpr std::string_view get() { - std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" - return v; - } - }; - template void Show(const common::Indirection &x) { Show(x.value()); } @@ -88,7 +76,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant "s + std::string(TypeOf::name)); + Indent("derived constant"); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -96,7 +84,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant "s + std::string(TypeOf::name)); + Print("constant"); } } void Show(const Symbol &symbol); @@ -114,7 +102,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator "s + std::string(TypeOf::name)); + Indent("designator"); Show(x.u); Outdent(); } @@ -129,7 +117,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref "s + std::string(TypeOf::name)); + Indent("function ref"); Show(x.proc()); Show(x.arguments()); Outdent(); @@ 
-139,14 +127,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value "s + std::string(TypeOf::name)); + Indent("array constructor value"); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do "s + std::string(TypeOf::name)); + Indent("implied do"); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -160,20 +148,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op "s + std::string(TypeOf::name)); + Indent("unary op"); Show(op.left()); Outdent(); } template void Show(const evaluate::Operation &op) { - Indent("binary op "s + std::string(TypeOf::name)); + Indent("binary op"); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr <" + std::string(TypeOf::name) + ">"); + Indent("expr T"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 66cedab94bfb4..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("relational some type"); + Indent("expr some type"); Show(x.u); Outdent(); } >From c00fc531bcf742c409fc974da94c5b362fa9132c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:37:19 -0500 Subject: [PATCH 19/24] Delete unnecessary static_assert --- flang/lib/Semantics/check-omp-structure.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 6005dda7c26fe..2e59553d5e130 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -21,8 +21,6 @@ namespace Fortran::semantics { -static_assert(std::is_same_v>); - template static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { return !(e == f); >From 45b012c16b77c757a0d09b2a229bad49fed8d26f Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:25 -0500 Subject: [PATCH 20/24] Add missing initializer for 'iff' --- flang/lib/Semantics/check-omp-structure.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 2e59553d5e130..aa1bd136b371f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2815,7 +2815,7 @@ static std::optional AnalyzeConditionalStmt( } } else { AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, - GetActionStmt(std::get(s.t))}; + GetActionStmt(std::get(s.t)), SourcedActionStmt{}}; if (result.ift.stmt) { return result; } >From daeac25991bf14fb08c3accabe068c074afa1eb7 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:47 -0500 Subject: [PATCH 21/24] Add asserts for printing "Identity" as top-level operator --- flang/lib/Semantics/check-omp-structure.cpp | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa1bd136b371f..062b45deac865 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3823,6 +3823,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, atomic::ToString(top.first)); @@ -3852,6 +3853,8 @@ void 
OmpStructureChecker::CheckAtomicUpdateAssignment( "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, atom.AsFortran(), atomic::ToString(top.first)); @@ -3898,16 +3901,20 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, atomic::ToString(top.first)); } break; } + case atomic::Operator::Identity: case atomic::Operator::True: case atomic::Operator::False: break; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, atomic::ToString(top.first)); >From ae121e5c37453af1a4aba7c77939c2c1c45b75fa Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 13:15:58 -0500 Subject: [PATCH 22/24] Explain the use of determinant --- flang/lib/Semantics/check-omp-structure.cpp | 31 ++++++++++++++++++--- 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 062b45deac865..bc6a09b9768ef 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3606,10 +3606,33 @@ OmpStructureChecker::CheckUpdateCapture( // subexpression of the right-hand side. // 2. An assignment could be a capture (cbc) if the right-hand side is // a variable (or a function ref), with potential type conversions. 
-  bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)};
-  bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)};
-  bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))};
-  bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))};
+  bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; // Can as1 be an update?
+  bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; // Can as2 be an update?
+  bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; // Can 1 be capture?
+  bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; // Can 2 be capture?
+
+  // We want to diagnose cases where both assignments cannot be an update,
+  // or both cannot be a capture, as well as cases where either assignment
+  // cannot be any of these two.
+  //
+  // If we organize these boolean values into a matrix
+  //   |cbu1 cbu2|
+  //   |cbc1 cbc2|
+  // then we want to diagnose cases where the matrix has a zero (i.e. "false")
+  // row or column, including the case where everything is zero. All these
+  // cases correspond to the determinant of the matrix being 0, which suggests
+  // that checking the det may be a convenient diagnostic check. There is only
+  // one additional case where the det is 0, which is when the matrix is all 1
+  // ("true"). The "all true" case represents the situation where both
+  // assignments could be an update as well as a capture. On the other hand,
+  // whenever det != 0, the roles of the update and the capture can be
+  // unambiguously assigned to as1 and as2 [1].
+  //
+  // [1] This can be easily verified by hand: there are 10 2x2 matrices with
+  // det = 0, leaving 6 cases where det != 0:
+  //   0 1   0 1   1 0   1 0   1 1   1 1
+  //   1 0   1 1   0 1   1 1   0 1   1 0
+  // In each case the classification is unambiguous.
   //     |cbu1 cbu2|
   // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1

>From cae0e8fcd3f6b8c2bc3ad8f85599ef4765c6afc5 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 30 May 2025 11:48:18 -0500
Subject: [PATCH 23/24] Deal with assignments that failed Fortran semantic checks

Don't emit diagnostics for those.
---
 flang/lib/Semantics/check-omp-structure.cpp | 66 ++++++++++++++-------
 1 file changed, 46 insertions(+), 20 deletions(-)

diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp
index bc6a09b9768ef..89a3a407441a8 100644
--- a/flang/lib/Semantics/check-omp-structure.cpp
+++ b/flang/lib/Semantics/check-omp-structure.cpp
@@ -2726,6 +2726,9 @@ static SourcedActionStmt GetActionStmt(const parser::Block &block) {
 // Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption
 // is that the ActionStmt will be either an assignment or a pointer-assignment,
 // otherwise return std::nullopt.
+// Note: This function can return std::nullopt on [Pointer]AssignmentStmt where
+// the "typedAssignment" is unset. This can happen if there are semantic errors
+// in the purported assignment.
 static std::optional GetEvaluateAssignment(
     const parser::ActionStmt *x) {
   if (x == nullptr) {
@@ -2754,6 +2757,29 @@ static std::optional GetEvaluateAssignment(
       x->u);
 }
 
+// Check if the ActionStmt is actually a [Pointer]AssignmentStmt. This is
+// to separate cases where the source has something that looks like an
+// assignment, but is semantically wrong (diagnosed by general semantic
+// checks), and where the source has some other statement (which we want
+// to report as "should be an assignment").
+static bool IsAssignment(const parser::ActionStmt *x) { + if (x == nullptr) { + return false; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + + return common::visit( + [](auto &&s) -> bool { + using BareS = llvm::remove_cvref_t; + return std::is_same_v || + std::is_same_v; + }, + x->u); +} + static std::optional AnalyzeConditionalStmt( const parser::ExecutionPartConstruct *x) { if (x == nullptr) { @@ -3588,8 +3614,10 @@ OmpStructureChecker::CheckUpdateCapture( auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; if (!maybeAssign1 || !maybeAssign2) { - context_.Say(source, - "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + if (!IsAssignment(act1.stmt) || !IsAssignment(act2.stmt)) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + } return std::make_pair(nullptr, nullptr); } @@ -3956,7 +3984,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( // The if-true statement must be present, and must be an assignment. 
auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; if (!maybeAssign) { - if (update.ift.stmt) { + if (update.ift.stmt && !IsAssignment(update.ift.stmt)) { context_.Say(update.ift.source, "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); } else { @@ -3992,7 +4020,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); } @@ -4094,17 +4122,11 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( } SourcedActionStmt uact{GetActionStmt(uec)}; SourcedActionStmt cact{GetActionStmt(cec)}; - auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; - auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; - - if (!maybeUpdate || !maybeCapture) { - context_.Say(source, - "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); - return; - } + // The "dereferences" of std::optional are guaranteed to be valid after + // CheckUpdateCapture. 
+ evaluate::Assignment update{*GetEvaluateAssignment(uact.stmt)}; + evaluate::Assignment capture{*GetEvaluateAssignment(cact.stmt)}; - const evaluate::Assignment &update{*maybeUpdate}; - const evaluate::Assignment &capture{*maybeCapture}; const SomeExpr &atom{update.lhs}; using Analysis = parser::OpenMPAtomicConstruct::Analysis; @@ -4242,13 +4264,17 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( return; } } else { - context_.Say(capture.source, - "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + if (!IsAssignment(capture.stmt)) { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + } return; } } else { - context_.Say(update.ift.source, - "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + if (!IsAssignment(update.ift.stmt)) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + } return; } @@ -4316,7 +4342,7 @@ void OmpStructureChecker::CheckAtomicRead( MakeAtomicAnalysisOp(Analysis::Read, maybeRead), MakeAtomicAnalysisOp(Analysis::None)); } - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); } @@ -4350,7 +4376,7 @@ void OmpStructureChecker::CheckAtomicWrite( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); } >From 6bc8c10c793ebac02c78daec33e7fb5e6becb8e0 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 12:47:00 -0500 Subject: [PATCH 24/24] Move common functions to tools.cpp --- flang/include/flang/Semantics/tools.h | 134 +++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 
flang/lib/Semantics/check-omp-structure.cpp | 506 ++------------------ flang/lib/Semantics/tools.cpp | 310 ++++++++++++ 4 files changed, 484 insertions(+), 468 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 821f1ae34fd5b..25fadceefceb0 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -778,11 +778,135 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, return false; } -/// If the top-level operation (ignoring parentheses) is either an -/// evaluate::FunctionRef, or a specialization of evaluate::Operation, -/// then return the list of arguments (wrapped in SomeExpr). Otherwise, -/// return the "expr" but with top-level parentheses stripped. -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); +namespace operation { + +enum class Operator { + Add, + And, + Associated, + Call, + Convert, + Div, + Eq, + Eqv, + False, + Ge, + Gt, + Identity, + Intrinsic, + Lt, + Max, + Min, + Mul, + Ne, + Neqv, + Not, + Or, + Pow, + Resize, // Convert within the same TypeCategory + Sub, + True, + Unknown, +}; + +std::string ToString(Operator op); + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unknown; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return 
Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unknown; +} + +template +Operator OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Add; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Sub; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Mul; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Div; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } +} + +template // +Operator OperationCode(const T &) { + return Operator::Unknown; +} + +Operator OperationCode(const evaluate::ProcedureDesignator &proc); + +} // namespace operation + +/// Return information about the top-level operation (ignoring parentheses): +/// the operation code and the list of arguments. +std::pair> +GetTopLevelOperation(const SomeExpr &expr); /// Check if expr is same as x, or a sequence of Convert operations on x. bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ad5eae4ae39a2..c74f7627c5e25 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2828,7 +2828,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, // This must exist by now. 
SomeExpr input = *semantics::GetConvertInput(assign.rhs); - std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; + std::vector args{semantics::GetTopLevelOperation(input).second}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrConvertOf(arg, atom)) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 89a3a407441a8..f29a56d5fd92a 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2891,290 +2891,6 @@ static std::pair SplitAssignmentSource( namespace atomic { -template static void MoveAppend(V &accum, V &&other) { - for (auto &&s : other) { - accum.push_back(std::move(s)); - } -} - -enum class Operator { - Unk, - // Operators that are officially allowed in the update operation - Add, - And, - Associated, - Div, - Eq, - Eqv, - Ge, // extension - Gt, - Identity, // extension: x = x is allowed (*), but we should never print - // "identity" as the name of the operator - Le, // extension - Lt, - Max, - Min, - Mul, - Ne, // extension - Neqv, - Or, - Sub, - // Operators that we recognize for technical reasons - True, - False, - Not, - Convert, - Resize, - Intrinsic, - Call, - Pow, - - // (*): "x = x + 0" is a valid update statement, but it will be folded - // to "x = x" by the time we look at it. Since the source statements - // "x = x" and "x = x + 0" will end up looking the same, accept the - // former as an extension. 
-}; - -std::string ToString(Operator op) { - switch (op) { - case Operator::Add: - return "+"; - case Operator::And: - return "AND"; - case Operator::Associated: - return "ASSOCIATED"; - case Operator::Div: - return "/"; - case Operator::Eq: - return "=="; - case Operator::Eqv: - return "EQV"; - case Operator::Ge: - return ">="; - case Operator::Gt: - return ">"; - case Operator::Identity: - return "identity"; - case Operator::Le: - return "<="; - case Operator::Lt: - return "<"; - case Operator::Max: - return "MAX"; - case Operator::Min: - return "MIN"; - case Operator::Mul: - return "*"; - case Operator::Neqv: - return "NEQV/EOR"; - case Operator::Ne: - return "/="; - case Operator::Or: - return "OR"; - case Operator::Sub: - return "-"; - case Operator::True: - return ".TRUE."; - case Operator::False: - return ".FALSE."; - case Operator::Not: - return "NOT"; - case Operator::Convert: - return "type-conversion"; - case Operator::Resize: - return "resize"; - case Operator::Intrinsic: - return "intrinsic"; - case Operator::Call: - return "function-call"; - case Operator::Pow: - return "**"; - default: - return "??"; - } -} - -template // -struct ArgumentExtractor - : public evaluate::Traverse, - std::pair>, false> { - using Arguments = std::vector; - using Result = std::pair; - using Base = evaluate::Traverse, - Result, false>; - static constexpr auto IgnoreResizes = IgnoreResizingConverts; - static constexpr auto Logical = common::TypeCategory::Logical; - ArgumentExtractor() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result operator()( - const evaluate::Constant> &x) const { - if (const auto &val{x.GetScalarValue()}) { - return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) - : std::make_pair(Operator::False, Arguments{}); - } - return Default(); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - Result result{OperationCode(x.proc()), {}}; - for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { - if (auto *e{x.UnwrapArgExpr(i)}) { - result.second.push_back(*e); - } - } - return result; - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. - return (*this)(x.template operand<0>()); - } - if constexpr (IgnoreResizes && - std::is_same_v>) { - // Ignore conversions within the same category. - // Atomic operations on int(kind=1) may be implicitly widened - // to int(kind=4) for example. - return (*this)(x.template operand<0>()); - } else { - return std::make_pair( - OperationCode(x), OperationArgs(x, std::index_sequence_for{})); - } - } - - template // - Result operator()(const evaluate::Designator &x) const { - evaluate::Designator copy{x}; - Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; - return result; - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - // There shouldn't be any combining needed, since we're stopping the - // traversal at the top-level operation, but implement one that picks - // the first non-empty result. 
- if constexpr (sizeof...(Rs) == 0) { - return std::move(result); - } else { - if (!result.second.empty()) { - return std::move(result); - } else { - return Combine(std::move(results)...); - } - } - } - -private: - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) - const { - switch (op.derived().logicalOperator) { - case common::LogicalOperator::And: - return Operator::And; - case common::LogicalOperator::Or: - return Operator::Or; - case common::LogicalOperator::Eqv: - return Operator::Eqv; - case common::LogicalOperator::Neqv: - return Operator::Neqv; - case common::LogicalOperator::Not: - return Operator::Not; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - switch (op.derived().opr) { - case common::RelationalOperator::LT: - return Operator::Lt; - case common::RelationalOperator::LE: - return Operator::Le; - case common::RelationalOperator::EQ: - return Operator::Eq; - case common::RelationalOperator::NE: - return Operator::Ne; - case common::RelationalOperator::GE: - return Operator::Ge; - case common::RelationalOperator::GT: - return Operator::Gt; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Add; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Sub; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Mul; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Div; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - if constexpr (C == 
T::category) { - return Operator::Resize; - } else { - return Operator::Convert; - } - } - Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { - Operator code = llvm::StringSwitch(proc.GetName()) - .Case("associated", Operator::Associated) - .Case("min", Operator::Min) - .Case("max", Operator::Max) - .Case("iand", Operator::And) - .Case("ior", Operator::Or) - .Case("ieor", Operator::Neqv) - .Default(Operator::Call); - if (code == Operator::Call && proc.GetSpecificIntrinsic()) { - return Operator::Intrinsic; - } - return code; - } - template // - Operator OperationCode(const T &) const { - return Operator::Unk; - } - - template - Arguments OperationArgs(const evaluate::Operation &x, - std::index_sequence) const { - return Arguments{SomeExpr(x.template operand())...}; - } -}; - struct DesignatorCollector : public evaluate::Traverse, false> { using Result = std::vector; @@ -3196,125 +2912,14 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (MoveAppend(v, std::move(results)), ...); - return v; - } -}; - -struct ConvertCollector - : public evaluate::Traverse>, false> { - using Result = std::pair>; - using Base = evaluate::Traverse; - ConvertCollector() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result asSomeExpr(const T &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; - } - - template // - Result operator()(const evaluate::Designator &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::Constant &x) const { - return asSomeExpr(x); - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore parentheses. 
- return (*this)(x.template operand<0>()); - } else if constexpr (is_convert_v) { - // Convert should always have a typed result, so it should be safe to - // dereference x.GetType(). - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else if constexpr (is_complex_constructor_v) { - // This is a conversion iff the imaginary operand is 0. - if (IsZero(x.template operand<1>())) { - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else { - return asSomeExpr(x.derived()); - } - } else { - return asSomeExpr(x.derived()); - } - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - Result v(std::move(result)); - auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { - assert((!x.has_value() || !y.has_value()) && "Multiple designators"); - if (!x.has_value()) { - x = std::move(y); + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); } }}; - (setValue(v.first, std::move(results).first), ...); - (MoveAppend(v.second, std::move(results).second), ...); + (moveAppend(v, std::move(results)), ...); return v; } - -private: - template // - static bool IsZero(const T &x) { - return false; - } - template // - static bool IsZero(const evaluate::Expr &x) { - return common::visit([](auto &&s) { return IsZero(s); }, x.u); - } - template // - static bool IsZero(const evaluate::Constant &x) { - if (auto &&maybeScalar{x.GetScalarValue()}) { - return maybeScalar->IsZero(); - } else { - return false; - } - } - - template // - struct is_convert { - static constexpr bool value{false}; - }; - template // - struct is_convert> { - static constexpr bool value{true}; - }; - template // - struct is_convert> { - // Conversion from complex to real. 
- static constexpr bool value{true}; - }; - template // - static constexpr bool is_convert_v = is_convert::value; - - template // - struct is_complex_constructor { - static constexpr bool value{false}; - }; - template // - struct is_complex_constructor> { - static constexpr bool value{true}; - }; - template // - static constexpr bool is_complex_constructor_v = - is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { @@ -3347,22 +2952,13 @@ static bool IsAllocatable(const SomeExpr &expr) { return !syms.empty() && IsAllocatable(syms.back()); } -static std::pair> GetTopLevelOperation( - const SomeExpr &expr) { - return atomic::ArgumentExtractor{}(expr); -} - -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { - return GetTopLevelOperation(expr).second; -} - static bool IsPointerAssignment(const evaluate::Assignment &x) { return std::holds_alternative(x.u) || std::holds_alternative(x.u); } static bool IsCheckForAssociated(const SomeExpr &cond) { - return GetTopLevelOperation(cond).first == atomic::Operator::Associated; + return GetTopLevelOperation(cond).first == operation::Operator::Associated; } static bool HasCommonDesignatorSymbols( @@ -3455,23 +3051,7 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -MaybeExpr GetConvertInput(const SomeExpr &x) { - // This returns SomeExpr(x) when x is a designator/functionref/constant. - return atomic::ConvertCollector{}(x).first; -} - -bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { - // Check if expr is same as x, or a sequence of Convert operations on x. 
- if (expr == x) { - return true; - } else if (auto maybe{GetConvertInput(expr)}) { - return *maybe == x; - } else { - return false; - } -} - -bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { +static bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3839,45 +3419,46 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - std::pair> top{ - atomic::Operator::Unk, {}}; + std::pair> top{ + operation::Operator::Unknown, {}}; if (auto &&maybeInput{GetConvertInput(update.rhs)}) { top = GetTopLevelOperation(*maybeInput); } switch (top.first) { - case atomic::Operator::Add: - case atomic::Operator::Sub: - case atomic::Operator::Mul: - case atomic::Operator::Div: - case atomic::Operator::And: - case atomic::Operator::Or: - case atomic::Operator::Eqv: - case atomic::Operator::Neqv: - case atomic::Operator::Min: - case atomic::Operator::Max: - case atomic::Operator::Identity: + case operation::Operator::Add: + case operation::Operator::Sub: + case operation::Operator::Mul: + case operation::Operator::Div: + case operation::Operator::And: + case operation::Operator::Or: + case operation::Operator::Eqv: + case operation::Operator::Neqv: + case operation::Operator::Min: + case operation::Operator::Max: + case operation::Operator::Identity: break; - case atomic::Operator::Call: + case operation::Operator::Call: context_.Say(source, "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Convert: + case operation::Operator::Convert: context_.Say(source, "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Intrinsic: + case operation::Operator::Intrinsic: context_.Say(source, "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Unk: + case operation::Operator::Unknown: 
context_.Say( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); return; } // Check if `atom` occurs exactly once in the argument list. @@ -3898,17 +3479,17 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( }()}; if (unique == top.second.end()) { - if (top.first == atomic::Operator::Identity) { + if (top.first == operation::Operator::Identity) { // This is "x = y". context_.Say(rsrc, "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, - atom.AsFortran(), atomic::ToString(top.first)); + atom.AsFortran(), operation::ToString(top.first)); } } else { CheckStorageOverlap(atom, nonAtom, source); @@ -3933,18 +3514,18 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( // Missing arguments to operations would have been diagnosed by now. 
switch (top.first) { - case atomic::Operator::Associated: + case operation::Operator::Associated: if (atom != top.second.front()) { context_.Say(assignSource, "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); } break; // x equalop e | e equalop x (allowing "e equalop x" is an extension) - case atomic::Operator::Eq: - case atomic::Operator::Eqv: + case operation::Operator::Eq: + case operation::Operator::Eqv: // x ordop expr | expr ordop x - case atomic::Operator::Lt: - case atomic::Operator::Gt: { + case operation::Operator::Lt: + case operation::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; if (IsSameOrConvertOf(arg0, atom)) { @@ -3952,23 +3533,24 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); } break; } - case atomic::Operator::Identity: - case atomic::Operator::True: - case atomic::Operator::False: + case operation::Operator::Identity: + case operation::Operator::True: + case operation::Operator::False: break; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); break; } } diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 1d1e3ac044166..fce930dcc1d02 100644 --- a/flang/lib/Semantics/tools.cpp +++ 
b/flang/lib/Semantics/tools.cpp @@ -17,6 +17,7 @@ #include "flang/Semantics/tools.h" #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringSwitch.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1756,4 +1757,313 @@ bool HadUseError( } } +namespace operation { +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() + ? std::make_pair(operation::Operator::True, Arguments{}) + : std::make_pair(operation::Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{operation::OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); + } + } + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair(operation::OperationCode(x), + OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{ + operation::Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; +} // namespace operation + +std::string operation::ToString(operation::Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + 
return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; + } +} + +operation::Operator operation::OperationCode( + const evaluate::ProcedureDesignator &proc) { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; +} + +std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return operation::ArgumentExtractor{}(expr); +} + +namespace operation { +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result asSomeExpr(const T &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::Constant &x) const { + return asSomeExpr(x); + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. 
+ if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } + } else { + return asSomeExpr(x.derived()); + } + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (moveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; +}; +} // namespace operation + +MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. 
+ return operation::ConvertCollector{}(x).first; +} + +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. + if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + } // namespace Fortran::semantics From flang-commits at lists.llvm.org Fri May 30 11:05:39 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 30 May 2025 11:05:39 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <6839f373.050a0220.24387d.8c13@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852 >From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 07:58:29 -0500 Subject: [PATCH 01/25] [flang][OpenMP] Mark atomic clauses as unique The current implementation of the ATOMIC construct handles these clauses individually, and this change does not have an observable effect. At the same time these clauses are unique as per the OpenMP spec, and this patch reflects that in the OMP.td file. 
--- llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index eff6d57995d2b..cdfd3e3223fa8 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> { ]; } def OMP_Atomic : Directive<"atomic"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; let allowedOnceClauses = [ VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, + VersionedClause, ]; let association = AS_Block; let category = CA_Executable; @@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let category = CA_Executable; } def OMP_Critical : Directive<"critical"> { - let allowedClauses = [ + let allowedOnceClauses = [ VersionedClause, ]; let association = AS_Block; >From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 10:08:58 -0500 Subject: [PATCH 02/25] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs The OpenMP implementation of the ATOMIC construct will change in the near future to accommodate OpenMP 6.0. This patch separates the shared implementations to avoid interfering with OpenACC. 
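The separation described in this commit message can be pictured with a small stand-alone sketch. It is an illustration only, not code from the patch: `OmpTag`, `AccTag`, `emitAtomicWriteShared`, the `omp_sketch` and `acc_sketch` namespaces, and the strings they return are all hypothetical stand-ins, and none of them are flang or MLIR APIs. The sketch assumes C++17 and shows one shared template that branches on the directive family being replaced by one plain function per family.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical tags standing in for the two directive families.
struct OmpTag {};
struct AccTag {};

// "Before": one shared template that branches on the directive family,
// so an OpenMP-only change also edits code that OpenACC instantiates.
template <typename Tag>
std::string emitAtomicWriteShared(const std::string &lhs,
                                  const std::string &rhs) {
  if constexpr (std::is_same_v<Tag, OmpTag>) {
    // OpenMP-only extras (hint, memory order) live in this branch.
    return "omp.write " + lhs + " <- " + rhs + " [hint, memory_order]";
  } else {
    return "acc.write " + lhs + " <- " + rhs;
  }
}

// "After": each family owns a plain function, so the OpenMP side can
// change without touching the OpenACC side.
namespace omp_sketch {
inline std::string emitAtomicWrite(const std::string &lhs,
                                   const std::string &rhs) {
  return "omp.write " + lhs + " <- " + rhs + " [hint, memory_order]";
}
} // namespace omp_sketch

namespace acc_sketch {
inline std::string emitAtomicWrite(const std::string &lhs,
                                   const std::string &rhs) {
  return "acc.write " + lhs + " <- " + rhs;
}
} // namespace acc_sketch

// Sanity check: the split functions return what the shared template did.
inline void checkSplitMatchesShared() {
  assert(emitAtomicWriteShared<OmpTag>("x", "v") ==
         omp_sketch::emitAtomicWrite("x", "v"));
  assert(emitAtomicWriteShared<AccTag>("x", "v") ==
         acc_sketch::emitAtomicWrite("x", "v"));
}
```

The split duplicates some code. In exchange, later OpenMP-specific changes stay local to the OpenMP function, which is the trade-off the commit message describes.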
--- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t 
hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. - auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite, - loc); + genAtomicWrite(converter, atomicWrite, loc); }, [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) { - Fortran::lower::genOmpAccAtomicUpdate< - Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate, - loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const Fortran::parser::AccAtomicCapture &atomicCapture) { - Fortran::lower::genOmpAccAtomicCapture< - Fortran::parser::AccAtomicCapture, void>(converter, - atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, }, atomicConstruct.u); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 312557d5da07e..fdd85e94829f3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +//===----------------------------------------------------------------------===// +// Code generation for atomic operations +//===----------------------------------------------------------------------===// + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. 
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/25] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
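
As an illustration of the optional-argument handling described above, here is a minimal, self-contained sketch. All type and function names in it (DependenceKind, UpdateArgument, ParsedUpdateClause, LoweredUpdate, lowerUpdate) are hypothetical stand-ins chosen for this example and are not flang's actual classes; the sketch only assumes that the clause argument can be modelled as an optional variant.

```cpp
#include <optional>
#include <variant>

// Hypothetical stand-ins for the parser-side types; not flang's real classes.
enum class DependenceKind { In, Out, Inout };
struct DependenceType { DependenceKind kind; };
struct TaskDependenceType { DependenceKind kind; };

// The argument of UPDATE as used on DEPOBJ: one of two alternatives.
struct UpdateArgument {
  std::variant<DependenceType, TaskDependenceType> u;
};

// The parsed clause: the argument is absent when UPDATE appears on ATOMIC.
struct ParsedUpdateClause {
  std::optional<UpdateArgument> v;
};

// The lowered clause: carries a dependence kind only if one was written.
struct LoweredUpdate {
  std::optional<DependenceKind> dependenceType;
};

// Convert the parsed clause, checking for a missing argument before
// visiting the variant, in the same spirit as the Clauses.cpp change.
LoweredUpdate lowerUpdate(const ParsedUpdateClause &clause) {
  if (clause.v) {
    return std::visit(
        [](const auto &alt) { return LoweredUpdate{alt.kind}; },
        clause.v->u);
  }
  return LoweredUpdate{std::nullopt};
}
```

With these assumed types, a default-constructed ParsedUpdateClause models the ATOMIC usage and lowers to an empty dependenceType, while a clause built from a DependenceType or TaskDependenceType models the DEPOBJ usage and lowers to the corresponding kind.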
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/25] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed. 
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse, and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 | 
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. 
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expecting two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture. 
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expecting single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point. 
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (begin != end && *begin == ' ') { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ..." is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+ auto &dirSpec{std::get(x.t)}; + auto &clauseList{std::get>(dirSpec.t)}; if (clauseList) { - atomicDirectiveDefaultOrderFound_ = true; - switch (*defaultMemOrder) { - case common::OmpMemoryOrderType::Acq_Rel: - clauseList->emplace_back(common::visit( - common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause { - return parser::OmpClause::Acquire{}; - }, - [](parser::OmpAtomicCapture &) -> parser::OmpClause { - return parser::OmpClause::AcqRel{}; - }, - [](auto &) -> parser::OmpClause { - // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite} - return parser::OmpClause::Release{}; - }}, - x.u)); - break; - case common::OmpMemoryOrderType::Relaxed: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::Relaxed{}}); - break; - case common::OmpMemoryOrderType::Seq_Cst: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::SeqCst{}}); - break; - default: - // FIXME: Don't process other values at the moment since their validity - // depends on the OpenMP version (which is unavailable here). - break; + if (findMemOrderClause(*clauseList)) { + return false; } + } else { + clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){}); + } + + // Add a memory order clause to the atomic directive. + atomicDirectiveDefaultOrderFound_ = true; + switch (*defaultMemOrder) { + case common::OmpMemoryOrderType::Acq_Rel: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}}); + break; + case common::OmpMemoryOrderType::Relaxed: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}}); + break; + case common::OmpMemoryOrderType::Seq_Cst: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}}); + break; + default: + // FIXME: Don't process other values at the moment since their validity + // depends on the OpenMP version (which is unavailable here). 
+ break; } return false; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 index b82bd13622764..6f58e0939a787 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 index 88ec6fe910b9e..6729be6e5cf8b 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b8422c8f3b769 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture() !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr !CHECK: omp.atomic.capture { !CHECK: omp.atomic.update 
%[[VAL_A_BOX_ADDR]] : !fir.ptr { !CHECK: ^bb0(%[[ARG:.*]]: i32): !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32 !CHECK: omp.yield(%[[TEMP]] : i32) !CHECK: } -!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 +!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 !CHECK: } !CHECK: return !CHECK: } diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90 index 13392ad76471f..6eded49b0b15d 100644 --- a/flang/test/Lower/OpenMP/atomic-write.f90 +++ b/flang/test/Lower/OpenMP/atomic-write.f90 @@ -44,9 +44,9 @@ end program OmpAtomicWrite !CHECK-LABEL: func.func @_QPatomic_write_pointer() { !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"} !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) -!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32 !CHECK: %[[C2:.*]] = arith.constant 2 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90 deleted file mode 100644 index 5cd02698ff482..0000000000000 --- a/flang/test/Parser/OpenMP/atomic-compare.f90 +++ /dev/null @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s -! OpenMP version for documentation purposes only - it isn't used until Sema. -! This is testing for Parser errors that bail out before Sema. 
-program main - implicit none - integer :: i, j = 10 - logical :: r - - !CHECK: error: expected OpenMP construct - !$omp atomic compare write - r = i .eq. j + 1 - - !CHECK: error: expected end of line - !$omp atomic compare num_threads(4) - r = i .eq. j -end program main diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90 index 54492bf6a22a6..11e23e062bce7 100644 --- a/flang/test/Semantics/OpenMP/atomic-compare.f90 +++ b/flang/test/Semantics/OpenMP/atomic-compare.f90 @@ -44,46 +44,37 @@ !$omp end atomic ! Check for error conditions: - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic compare seq_cst seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst compare seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire compare if (b .eq. 
c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic compare acquire acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire compare acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic compare relaxed relaxed if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed compare relaxed if (b .eq. c) b = a - !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one FAIL clause can appear on the ATOMIC directive !$omp atomic fail(release) compare fail(release) if (c .eq. 
a) a = b !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index c13a11a8dd5dc..deb67e7614659 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -16,20 +16,21 @@ program sample !$omp atomic read hint(2) y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(3) y = y + 10 !$omp atomic update hint(5) y = x + y - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement y = x x = y !$omp end atomic - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic update hint(x) y = y * 1 @@ -46,7 +47,7 @@ program sample !$omp atomic hint(omp_lock_hint_speculative) x = y + x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read y = x @@ -69,36 +70,36 @@ program sample !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative) x = y + x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = y * 9 - !ERROR: Hint clause must have non-negative 
constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp atomic hint(1.0) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp atomic hint(z + omp_sync_hint_nonspeculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(k + omp_sync_hint_speculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write x = 10 * y !$omp atomic write hint(a) - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and y+x access the same storage x = y + x !$omp atomic hint(abs(-1)) write diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 new file mode 100644 index 0000000000000..2619d235380f8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -0,0 +1,89 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC READ. Expect no diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v = x + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v => x +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! 
Atomic variable can be a function reference. Expect no diagnostics. + !$omp atomic read + v = p(i) +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic read + !ERROR: Atomic variable x should be a scalar + v = x + + !$omp atomic read + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + v = y(2:4) +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic read + !ERROR: Within atomic operation x and x access the same storage + x = x + + ! Accessing same array, but not the same storage. Expect no diagnostics. + !$omp atomic read + y(1) = y(2) +end + +subroutine f05 + integer :: x, v + + !$omp atomic read + !ERROR: Atomic expression x+1_4 should be a variable + v = x + 1 +end + +subroutine f06 + character :: x, v + + !$omp atomic read + !ERROR: Atomic variable x cannot have CHARACTER type + v = x +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic read + !ERROR: Atomic variable x cannot be ALLOCATABLE + v = x +end + diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 new file mode 100644 index 0000000000000..64e31a7974b55 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -0,0 +1,77 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !$omp atomic update capture + x = v + x = x + 1 + y = x + !$omp end atomic +end + +subroutine f01 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments + !$omp atomic update capture + x = v + block + x = x + 1 + y = x + end block + !$omp end atomic +end + +subroutine f02 + integer :: x, y + + ! The update and capture statements can be inside of a single BLOCK. + ! The end-directive is then optional. Expect no diagnostics. 
+ !$omp atomic update capture + block + x = x + 1 + y = x + end block +end + +subroutine f03 + integer :: x + + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !$omp atomic update capture + x = x + 1 + x = x + 2 + !$omp end atomic +end + +subroutine f04 + integer :: x, v + + !$omp atomic update capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + v = x + x = v + !$omp end atomic +end + +subroutine f05 + integer :: x, v, z + + !$omp atomic update capture + v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + !$omp end atomic +end + +subroutine f06 + integer :: x, v, z + + !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + v = x + !$omp end atomic +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 new file mode 100644 index 0000000000000..0aee02d429dc0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -0,0 +1,83 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y + + ! The x is a direct argument of the + operator. Expect no diagnostics. + !$omp atomic update + x = x + (y - 1) +end + +subroutine f01 + integer :: x + + ! x + 0 is unusual, but legal. Expect no diagnostics. + !$omp atomic update + x = x + 0 +end + +subroutine f02 + integer :: x + + ! This is formally not allowed by the syntax restrictions of the spec, + ! but it's equivalent to either x+0 or x*1, both of which are legal. + ! Allow this case. Expect no diagnostics. 
+ !$omp atomic update + x = x +end + +subroutine f03 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator + x = (x + y) + 1 +end + +subroutine f04 + integer :: x + real :: y + + !$omp atomic update + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation + x = floor(x + y) +end + +subroutine f05 + integer :: x + real :: y + + !$omp atomic update + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + x = int(x + y) +end + +subroutine f06 + integer :: x, y + interface + function f(i, j) + integer :: f, i, j + end + end interface + + !$omp atomic update + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation + x = f(x, y) +end + +subroutine f07 + real :: x + integer :: y + + !$omp atomic update + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation + x = x ** y +end + +subroutine f08 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should appear as an argument in the update operation + x = y +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 index 21a9b87d26345..3084376b4275d 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 @@ -22,10 +22,10 @@ program sample x = x / y !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. y !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. 
y end program diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 new file mode 100644 index 0000000000000..979568ebf7140 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -0,0 +1,81 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics. + !$omp atomic write + x = v + 1 + + !$omp atomic write + x = v + 3 + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic write + x = 2 * v + 3 + + !$omp atomic write + x => v +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! Atomic variable can be a function reference. Expect no diagnostics. + !$omp atomic write + p(i) = v +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic write + !ERROR: Atomic variable x should be a scalar + x = v + + !$omp atomic write + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + y(2:4) = v +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic write + !ERROR: Within atomic operation x and x+1_4 access the same storage + x = x + 1 + + ! Accessing same array, but not the same storage. Expect no diagnostics. 
+ !$omp atomic write + y(1) = y(2) +end + +subroutine f06 + character :: x, v + + !$omp atomic write + !ERROR: Atomic variable x cannot have CHARACTER type + x = v +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic write + !ERROR: Atomic variable x cannot be ALLOCATABLE + x = v +end + diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0e100871ea9b4..0dc8251ae356e 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! Check OpenMP 2.13.6 atomic Construct @@ -11,9 +11,13 @@ a = b !$omp end atomic + !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED) a = b + !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write a = b @@ -22,39 +26,32 @@ a = a + 1 !$omp end atomic + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic hint(1) acq_rel capture b = a a = a + 1 !$omp end atomic - !ERROR: expected end of line + !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct !$omp atomic read write + !ERROR: Atomic expression a+1._4 should be a variable a = a + 1 !$omp atomic a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 
'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic num_threads(4) a = a + 1 - !ERROR: expected end of line + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic capture num_threads(4) a = a + 1 + !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic relaxed a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' - !$omp atomic num_threads write - a = a + 1 - !$omp end parallel end diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90 index 173effe86b69c..f700c381cadd0 100644 --- a/flang/test/Semantics/OpenMP/atomic01.f90 +++ b/flang/test/Semantics/OpenMP/atomic01.f90 @@ -14,322 +14,277 @@ ! At most one memory-order-clause may appear on the construct. 
!READ - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic read seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst read seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic read acquire acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire read acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the 
READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic read relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed read relaxed i = j !UPDATE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic update seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst update seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The 
atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic update release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release update release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic update relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic 
relaxed update relaxed
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  !CAPTURE
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst seq_cst capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic capture seq_cst seq_cst
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst capture seq_cst
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic release release capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic capture release release
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic release capture release
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the CAPTURE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed relaxed capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the CAPTURE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic capture relaxed relaxed
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the CAPTURE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed capture relaxed
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive
  !$omp atomic acq_rel acq_rel capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive
  !$omp atomic capture acq_rel acq_rel
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive
  !$omp atomic acq_rel capture acq_rel
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
  !$omp atomic acquire acquire capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
  !$omp atomic capture acquire acquire
  i = j
  j = k
  !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
  !$omp atomic acquire capture acquire
  i = j
  j = k
  !$omp end atomic
  !WRITE
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the WRITE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst seq_cst write
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the WRITE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic write seq_cst seq_cst
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the WRITE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst write seq_cst
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the WRITE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic release release write
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the WRITE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic write release release
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the WRITE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic release write release
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the WRITE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed relaxed write
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the WRITE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic write relaxed relaxed
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the WRITE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed write relaxed
  i = j
  !No atomic-clause
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
  !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed relaxed
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
  !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst seq_cst
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
  !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
  !$omp atomic release release
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  ! 2.17.7.3
  ! At most one hint clause may appear on the construct.
- !ERROR: At most one HINT clause can appear on the READ directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read
  i = j
- !ERROR: At most one HINT clause can appear on the READ directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative)
  i = j
- !ERROR: At most one HINT clause can appear on the READ directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended)
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative)
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended)
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative)
  i = j
- !ERROR: At most one HINT clause can appear on the WRITE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended)
  i = j
- !ERROR: At most one HINT clause can appear on the UPDATE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: At most one HINT clause can appear on the UPDATE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative)
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: At most one HINT clause can appear on the UPDATE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended)
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative)
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative)
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended)
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: At most one HINT clause can appear on the CAPTURE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture
  i = j
  j = k
  !$omp end atomic
- !ERROR: At most one HINT clause can appear on the CAPTURE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative)
  i = j
  j = k
  !$omp end atomic
- !ERROR: At most one HINT clause can appear on the CAPTURE directive
+ !ERROR: At most one HINT clause can appear on the ATOMIC directive
  !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended)
  i = j
  j = k
@@ -337,34 +292,26 @@
  ! 2.17.7.4
  ! If atomic-clause is read then memory-order-clause must not be acq_rel or release.
- !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive
  !$omp atomic acq_rel read
  i = j
- !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive
  !$omp atomic read acq_rel
  i = j
- !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive
  !$omp atomic release read
  i = j
- !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive
  !$omp atomic read release
  i = j
  ! 2.17.7.5
  ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire.
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic acq_rel write
  i = j
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic write acq_rel
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic acquire write
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic write acquire
  i = j
@@ -372,33 +319,27 @@
  ! 2.17.7.6
  ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire.
- !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic acq_rel update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic update acq_rel
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic acquire update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic update acquire
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive
  !$omp atomic acq_rel
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQUIRE is not allowed on the ATOMIC directive
  !$omp atomic acquire
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  end program
diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90
index c66085d00f157..45e41f2552965 100644
--- a/flang/test/Semantics/OpenMP/atomic02.f90
+++ b/flang/test/Semantics/OpenMP/atomic02.f90
@@ -28,36 +28,29 @@ program OmpAtomic
  !$omp atomic
  a = a/(b + 1)
  !$omp atomic
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The ** operator is not a valid ATOMIC UPDATE operation
  a = a**4
  !$omp atomic
- !ERROR: Expected scalar variable on the LHS of atomic update assignment statement
- !ERROR: Invalid or missing operator in atomic update statement
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
+ !ERROR: Atomic variable c cannot have CHARACTER type
+ !ERROR: The atomic variable c should appear as an argument in the update operation
  c = d
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The < operator is not a valid ATOMIC UPDATE operation
  l = a .LT. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The <= operator is not a valid ATOMIC UPDATE operation
  l = a .LE. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The == operator is not a valid ATOMIC UPDATE operation
  l = a .EQ. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The /= operator is not a valid ATOMIC UPDATE operation
  l = a .NE. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The >= operator is not a valid ATOMIC UPDATE operation
  l = a .GE. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The > operator is not a valid ATOMIC UPDATE operation
  l = a .GT. b
  !$omp atomic
  m = m .AND. n
@@ -76,32 +69,26 @@ program OmpAtomic
  !$omp atomic update
  a = a/(b + 1)
  !$omp atomic update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The ** operator is not a valid ATOMIC UPDATE operation
  a = a**4
  !$omp atomic update
- !ERROR: Expected scalar variable on the LHS of atomic update assignment statement
- !ERROR: Invalid or missing operator in atomic update statement
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
+ !ERROR: Atomic variable c cannot have CHARACTER type
+ !ERROR: This is not a valid ATOMIC UPDATE operation
  c = c//d
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The < operator is not a valid ATOMIC UPDATE operation
  l = a .LT. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The <= operator is not a valid ATOMIC UPDATE operation
  l = a .LE. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The == operator is not a valid ATOMIC UPDATE operation
  l = a .EQ. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The >= operator is not a valid ATOMIC UPDATE operation
  l = a .GE. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The > operator is not a valid ATOMIC UPDATE operation
  l = a .GT. b
  !$omp atomic update
  m = m .AND. n
diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90
index 76367495b9861..f5c189fd05318 100644
--- a/flang/test/Semantics/OpenMP/atomic03.f90
+++ b/flang/test/Semantics/OpenMP/atomic03.f90
@@ -25,28 +25,26 @@ program OmpAtomic
  y = MIN(y, 8)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator
  z = IAND(y, 4)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator
  z = IOR(y, 5)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator
  z = IEOR(y, 6)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX operator
  z = MAX(y, 7, b, c)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator
  z = MIN(y, 8, a, d)
  !$omp atomic
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y'
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  y = FRACTION(x)
  !$omp atomic
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y'
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  y = REAL(x)
  !$omp atomic update
  y = IAND(y, 4)
@@ -60,26 +58,26 @@ program OmpAtomic
  y = MIN(y, 8)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator
  z = IAND(y, 4)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator
  z = IOR(y, 5)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator
  z = IEOR(y, 6)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX operator
  z = MAX(y, 7)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator
  z = MIN(y, 8)
  !$omp atomic update
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
+ !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation
  y = MOD(y, 9)
  !$omp atomic update
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
+ !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation
  x = ABS(x)
  end program OmpAtomic
@@ -92,7 +90,7 @@ subroutine conflicting_types()
  type(simple) ::s
  z = 1
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator
  z = IAND(s%z, 4)
  end subroutine
@@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts()
  type(some_type) :: s
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a'
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator
  a = min(a, a, b)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a'
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator
  a = max(b, a, b, a)
  !$omp atomic
- !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)`
  a = min(b, a, b)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a'
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator
  a = max(b, a, b, a, b)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y'
+ !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator
  y = min(z, x)
  !$omp atomic
  z = max(z, y)
  !$omp atomic update
- !ERROR: Expected scalar variable on the LHS of atomic update assignment statement
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k'
+ !ERROR: Atomic variable k should be a scalar
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  k = max(x, y)
-
+
  !$omp atomic
  !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4)
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
  x = min(x, k)
  !$omp atomic
  !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4)
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
- z =z + s%m
+ z = z + s%m
  end subroutine
diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90
index a9644ad95aa30..5c91ab5dc37e4 100644
--- a/flang/test/Semantics/OpenMP/atomic04.f90
+++ b/flang/test/Semantics/OpenMP/atomic04.f90
@@ -20,12 +20,10 @@ program OmpAtomic
  !$omp atomic
  x = 1 + x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y + 1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 + y
  !$omp atomic
@@ -33,12 +31,10 @@ program OmpAtomic
  !$omp atomic
  x = 1 - x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y - 1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 - y
  !$omp atomic
@@ -46,12 +42,10 @@ program OmpAtomic
  !$omp atomic
  x = 1*x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y*1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1*y
  !$omp atomic
@@ -59,12 +53,10 @@ program OmpAtomic
  !$omp atomic
  x = 1/x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y/1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1/y
  !$omp atomic
@@ -72,8 +64,7 @@ program OmpAtomic
  !$omp atomic
  m = n .AND. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator
  m = n .AND. l
  !$omp atomic
@@ -81,8 +72,7 @@ program OmpAtomic
  !$omp atomic
  m = n .OR. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator
  m = n .OR. l
  !$omp atomic
@@ -90,8 +80,7 @@ program OmpAtomic
  !$omp atomic
  m = n .EQV. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator
  m = n .EQV. l
  !$omp atomic
@@ -99,8 +88,7 @@ program OmpAtomic
  !$omp atomic
  m = n .NEQV. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator
  m = n .NEQV. l
  !$omp atomic update
@@ -108,12 +96,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1 + x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y + 1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 + y
  !$omp atomic update
@@ -121,12 +107,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1 - x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y - 1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 - y
  !$omp atomic update
@@ -134,12 +118,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1*x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y*1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1*y
  !$omp atomic update
@@ -147,12 +129,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1/x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y/1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1/y
  !$omp atomic update
@@ -160,8 +140,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .AND. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator
  m = n .AND. l
  !$omp atomic update
@@ -169,8 +148,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .OR. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator
  m = n .OR. l
  !$omp atomic update
@@ -178,8 +156,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .EQV. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator
  m = n .EQV. l
  !$omp atomic update
@@ -187,8 +164,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .NEQV. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator
  m = n .NEQV. l
  end program OmpAtomic
@@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts()
  type(some_type) p
  !$omp atomic
- !ERROR: Invalid or missing operator in atomic update statement
  x = x
  !$omp atomic update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: This is not a valid ATOMIC UPDATE operation
  x = 1
  !$omp atomic update
- !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement
+ !ERROR: Within atomic operation a and a*b access the same storage
  a = a * b + a
  !$omp atomic
- !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a`
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator
  a = b * (a + 9)
  !$omp atomic update
- !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement
+ !ERROR: Within atomic operation a and (a+b) access the same storage
  a = a * (a + b)
  !$omp atomic
- !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement
+ !ERROR: Within atomic operation a and (b+a) access the same storage
  a = (b + a) * a
  !$omp atomic
- !ERROR: Atomic update statement should
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(7) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(x) y = 2 @@ -54,7 +54,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint) y = 2 @@ -84,35 +84,35 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended) y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp critical (name) hint(1.0) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp critical (name) hint(z + omp_sync_hint_nonspeculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(k + omp_sync_hint_speculative) y = 2 !$omp end critical 
(name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended) y = 2 diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 505cbc48fef90..677b933932b44 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -20,70 +20,64 @@ program sample !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar v = y(1:3) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression x*(10_4+x) should be a variable v = x * (10 + x) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression 4_4 should be a variable v = 4 !$omp atomic read - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE v = k !$omp atomic write - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = x !$omp atomic update - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = k + x * (v * x) !$omp atomic - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = v * k !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'z%y' + !ERROR: Within atomic operation z%y and x+z%y access the same storage z%y = x + z%y !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + 
!ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage m = min(m, x, z%m) + k !$omp atomic read - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Atomic expression min(m,x,z%m)+k should be a variable m = min(m, x, z%m) + k !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar x = a - !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - a = x - !$omp atomic write !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement x = a !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar a = x !$omp atomic capture @@ -93,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = b + (x*1) !$omp end atomic @@ -103,60 +97,58 @@ program sample !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with 
CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct x = x + 10 v = b !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt] v = 1 x = 4 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct x = z%y + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 x = z%y !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct x = y(2) + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 x = y(2) !$omp end atomic !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable r cannot have CHARACTER type l = r !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable l cannot have CHARACTER type l = r end program diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90 index ae9fd086015dd..e8817c3f5ef61 100644 --- 
a/flang/test/Semantics/OpenMP/requires-atomic01.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> SeqCst !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> SeqCst !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> SeqCst !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> SeqCst !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> SeqCst !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90 index 4976a9667eb78..a3724a83456fd 100644 --- a/flang/test/Semantics/OpenMP/requires-atomic02.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> AcqRel !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! 
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> AcqRel !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> AcqRel !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! 
CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> AcqRel !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> AcqRel !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Capture + ! 
CHECK: OmpClause -> Relaxed
 !$omp atomic capture relaxed
 i = j
 j = j + 1

>From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:41:16 -0500
Subject: [PATCH 05/25] Restart build

>From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:45:28 -0500
Subject: [PATCH 06/25] Restart build

>From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 07:54:54 -0500
Subject: [PATCH 07/25] Restart build

>From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 08:18:01 -0500
Subject: [PATCH 08/25] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed

---
 flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic.f90 | 2 ++
 5 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90
index 2619d235380f8..73899a9ff37f2 100644
--- a/flang/test/Semantics/OpenMP/atomic-read.f90
+++ b/flang/test/Semantics/OpenMP/atomic-read.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
index 64e31a7974b55..c427ba07d43d8 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, y, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90
index 0aee02d429dc0..4595e02d01456 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-only.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, y
diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90
index 979568ebf7140..7965ad2dc7dbf 100644
--- a/flang/test/Semantics/OpenMP/atomic-write.f90
+++ b/flang/test/Semantics/OpenMP/atomic-write.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90
index 0dc8251ae356e..2caa161507d49 100644
--- a/flang/test/Semantics/OpenMP/atomic.f90
+++ b/flang/test/Semantics/OpenMP/atomic.f90
@@ -1,3 +1,5 @@
+! REQUIRES: openmp_runtime
+
 ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src
 use omp_lib
 ! Check OpenMP 2.13.6 atomic Construct

>From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 09:06:20 -0500
Subject: [PATCH 09/25] Fix examples

---
 flang/examples/FeatureList/FeatureList.cpp | 10 -------
 .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++--------------
 2 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp
index d1407cf0ef239..a36b8719e365d 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -445,13 +445,6 @@ struct NodeVisitor {
 READ_FEATURE(ObjectDecl)
 READ_FEATURE(OldParameterStmt)
 READ_FEATURE(OmpAlignedClause)
- READ_FEATURE(OmpAtomic)
- READ_FEATURE(OmpAtomicCapture)
- READ_FEATURE(OmpAtomicCapture::Stmt1)
- READ_FEATURE(OmpAtomicCapture::Stmt2)
- READ_FEATURE(OmpAtomicRead)
- READ_FEATURE(OmpAtomicUpdate)
- READ_FEATURE(OmpAtomicWrite)
 READ_FEATURE(OmpBeginBlockDirective)
 READ_FEATURE(OmpBeginLoopDirective)
 READ_FEATURE(OmpBeginSectionsDirective)
@@ -480,7 +473,6 @@ struct NodeVisitor {
 READ_FEATURE(OmpIterationOffset)
 READ_FEATURE(OmpIterationVector)
 READ_FEATURE(OmpEndAllocators)
- READ_FEATURE(OmpEndAtomic)
 READ_FEATURE(OmpEndBlockDirective)
 READ_FEATURE(OmpEndCriticalDirective)
 READ_FEATURE(OmpEndLoopDirective)
@@ -566,8 +558,6 @@ struct NodeVisitor {
 READ_FEATURE(OpenMPDeclareTargetConstruct)
 READ_FEATURE(OmpMemoryOrderType)
 READ_FEATURE(OmpMemoryOrderClause)
- READ_FEATURE(OmpAtomicClause)
- READ_FEATURE(OmpAtomicClauseList)
 READ_FEATURE(OmpAtomicDefaultMemOrderClause)
 READ_FEATURE(OpenMPFlushConstruct)
 READ_FEATURE(OpenMPLoopConstruct)
diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index dbbf86a6c6151..18a4107f9a9d1 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) {
 // the directive field.
 [&](const auto &c) -> SourcePosition {
 const CharBlock &source{std::get<0>(c.t).source};
- return (parsing->allCooked().GetSourcePositionRange(source))->first;
+ return parsing->allCooked().GetSourcePositionRange(source)->first;
 },
 [&](const OpenMPAtomicConstruct &c) -> SourcePosition {
- return std::visit(
- [&](const auto &o) -> SourcePosition {
- const CharBlock &source{std::get(o.t).source};
- return parsing->allCooked()
- .GetSourcePositionRange(source)
- ->first;
- },
- c.u);
+ const CharBlock &source{c.source};
+ return parsing->allCooked().GetSourcePositionRange(source)->first;
 },
 [&](const OpenMPSectionConstruct &c) -> SourcePosition {
 const CharBlock &source{c.source};
- return (parsing->allCooked().GetSourcePositionRange(source))->first;
+ return parsing->allCooked().GetSourcePositionRange(source)->first;
 },
 [&](const OpenMPUtilityConstruct &c) -> SourcePosition {
 const CharBlock &source{c.source};
- return (parsing->allCooked().GetSourcePositionRange(source))->first;
+ return parsing->allCooked().GetSourcePositionRange(source)->first;
 },
 },
 c.u);
@@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
 return normalize_construct_name(source.ToString());
 },
 [&](const OpenMPAtomicConstruct &c) -> std::string {
- return std::visit(
- [&](const auto &c) {
- // Get source from the verbatim fields
- const CharBlock &source{std::get(c.t).source};
- return "atomic-" +
- normalize_construct_name(source.ToString());
- },
- c.u);
+ const CharBlock &source{c.source};
+ return normalize_construct_name(source.ToString());
 },
 [&](const OpenMPUtilityConstruct &c) -> std::string {
 const CharBlock &source{c.source};

>From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:19 -0500
Subject: [PATCH 10/25] Fix example

---
 .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++--
 flang/test/Examples/omp-atomic.f90 | 16 +++++++++++-----
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index 18a4107f9a9d1..b0964845d3b99 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
 return normalize_construct_name(source.ToString());
 },
 [&](const OpenMPAtomicConstruct &c) -> std::string {
- const CharBlock &source{c.source};
- return normalize_construct_name(source.ToString());
+ auto &dirSpec = std::get(c.t);
+ auto &dirName = std::get(dirSpec.t);
+ return normalize_construct_name(dirName.source.ToString());
 },
 [&](const OpenMPUtilityConstruct &c) -> std::string {
 const CharBlock &source{c.source};
diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90
index dcca34b633a3e..934f84f132484 100644
--- a/flang/test/Examples/omp-atomic.f90
+++ b/flang/test/Examples/omp-atomic.f90
@@ -26,25 +26,31 @@
 ! CHECK:---
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 9
-! CHECK-NEXT: construct: atomic-read
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
-! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: - clause: read
 ! CHECK-NEXT: details: ''
+! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 12
-! CHECK-NEXT: construct: atomic-write
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
 ! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
+! CHECK-NEXT: - clause: write
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 16
-! CHECK-NEXT: construct: atomic-capture
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
+! CHECK-NEXT: - clause: capture
+! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;'
 ! CHECK-NEXT: - clause: seq_cst
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 21
-! CHECK-NEXT: construct: atomic-atomic
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses: []
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 8

>From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:47 -0500
Subject: [PATCH 11/25] Remove reference to *nullptr

---
 flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index ab868df76d298..4d9b375553c1e 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
 assert(firstOp && "Should have created an atomic operation");
 atomicAt = getInsertionPointAfter(firstOp);
- mlir::Operation *secondOp = genAtomicOperation(
- converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom,
- *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt);
+ mlir::Operation *secondOp = nullptr;
+ if (analysis.op1.what != analysis.None) {
+ secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what,
+ atomAddr, atom, *get(analysis.op1.expr),
+ hint, memOrder, atomicAt, prepareAt);
+ }
 if (secondOp) {
 builder.setInsertionPointAfter(secondOp);

>From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 2 May 2025 18:49:05 -0500
Subject: [PATCH 12/25] Updates and improvements

---
 flang/include/flang/Parser/parse-tree.h | 2 +-
 flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++--
 flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++----
flang/lib/Semantics/check-omp-structure.h | 1 + .../Todo/atomic-capture-implicit-cast.f90 | 48 --- .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 - .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +- .../OpenMP/atomic-update-capture.f90 | 8 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +- 9 files changed, 381 insertions(+), 180 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 77f57b1cb85c7..8213fe33edbd0 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct { struct Op { int what; - TypedExpr expr; + AssignmentStmt::TypedAssignment assign; }; TypedExpr atom, cond; Op op0, op1; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6177b59199481..7b6c22095d723 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter, static mlir::Operation * // genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Value storeAddr = + 
fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Type storeType = fir::unwrapRefType(storeAddr.getType()); + + mlir::Value toAddr = [&]() { + if (atomType == storeType) + return storeAddr; + return builder.createTemporary(loc, atomType, ".tmp.atomval"); + }(); builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + + if (atomType != storeType) { + lower::ExprToValueMap overrides; + // The READ operation could be a part of UPDATE CAPTURE, so make sure + // we don't emit extra code into the body of the atomic op. + builder.restoreInsertionPoint(postAt); + mlir::Value load = builder.create(loc, toAddr); + overrides.try_emplace(&atom, load); + + converter.overrideExprValues(&overrides); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); + converter.resetExprOverrides(); + + builder.create(loc, value, storeAddr); + } builder.restoreInsertionPoint(saved); return op; } @@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, static mlir::Operation * // genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value value = 
fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); mlir::Value converted = builder.createConvert(loc, atomType, value); @@ -2719,19 +2746,20 @@ static mlir::Operation * genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrResizeOf(arg, atom)) { @@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, converter.overrideExprValues(&overrides); mlir::Value updated = - fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Value converted = builder.createConvert(loc, atomType, updated); builder.create(loc, converted); converter.resetExprOverrides(); @@ -2764,20 +2792,21 @@ static mlir::Operation * genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, 
lower::StatementContext &stmtCtx, int action, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: - return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Write: - return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Update: - return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign, + hint, memOrder, preAt, atomicAt, postAt); default: return nullptr; } @@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { } return ""s; }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; const SomeExpr &atom = *analysis.atom.get()->v; @@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; llvm::errs() << " op0 {\n"; llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; - 
llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << " op1 {\n"; llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; - llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << "}\n"; } @@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAtomicConstruct &construct) { - auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { - if (auto *maybe = expr.get(); maybe && maybe->v) { + auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) { + if (auto *maybe = typedWrapper.get(); maybe && maybe->v) { return &*maybe->v; } else { return nullptr; @@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, int action0 = analysis.op0.what & analysis.Action; int action1 = analysis.op1.what & analysis.Action; mlir::Operation *captureOp = nullptr; - fir::FirOpBuilder::InsertPoint atomicAt; - fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint atomicAt, postAt; if (construct.IsCapture()) { // Capturing operation. @@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, captureOp = builder.create(loc, hint, memOrder); // Set the non-atomic insertion point to before the atomic.capture. 
- prepareAt = getInsertionPointBefore(captureOp); + preAt = getInsertionPointBefore(captureOp); mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); builder.setInsertionPointToEnd(block); @@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, // atomic.capture. mlir::Operation *term = builder.create(loc); atomicAt = getInsertionPointBefore(term); + postAt = getInsertionPointAfter(captureOp); hint = nullptr; memOrder = nullptr; } else { @@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(action0 != analysis.None && action1 == analysis.None && "Expexcing single action"); assert(!(analysis.op0.what & analysis.Condition)); - atomicAt = prepareAt; + postAt = atomicAt = preAt; } mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, - *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); mlir::Operation *secondOp = nullptr; if (analysis.op1.what != analysis.None) { secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, - atomAddr, atom, *get(analysis.op1.expr), - hint, memOrder, atomicAt, prepareAt); + atomAddr, atom, *get(analysis.op1.assign), + hint, memOrder, preAt, atomicAt, postAt); } if (secondOp) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 201b38bd05ff3..f7753a5e5cc59 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } -static bool IsVarOrFunctionRef(const SomeExpr &expr) { - return evaluate::UnwrapProcedureRef(expr) != nullptr || - evaluate::IsVariable(expr); +static bool 
IsVarOrFunctionRef(const MaybeExpr &expr) { + if (expr) { + return evaluate::UnwrapProcedureRef(*expr) != nullptr || + evaluate::IsVariable(*expr); + } else { + return false; + } } static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { @@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource( namespace atomic { +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + enum class Operator { Unk, // Operators that are officially allowed in the update operation @@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (Append(v, std::move(results)), ...); + (MoveAppend(v, std::move(results)), ...); return v; } +}; -private: - static void Append(Result &acc, Result &&data) { - for (auto &&s : data) { - acc.push_back(std::move(s)); +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). 
+ return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + auto copy{x.derived()}; + return {evaluate::AsGenericExpr(std::move(copy)), {}}; } } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; }; } // namespace atomic @@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } +static MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { // Both expr and x have the form of SomeType(SomeKind(...)[1]). // Check if expr is @@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { } } +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { if (value) { expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), @@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { } } +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( - int what, const MaybeExpr &maybeExpr = std::nullopt) { + int what, + const std::optional &maybeAssign = std::nullopt) { parser::OpenMPAtomicConstruct::Analysis::Op operation; operation.what = what; - SetExpr(operation.expr, maybeExpr); + SetAssignment(operation.assign, maybeAssign); return operation; } @@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( // }; // struct Op { // int what; - // TypedExpr expr; + // TypedAssignment assign; // }; // TypedExpr atom, cond; // Op op0, op1; @@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, } } +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + /// Check 
if `expr` satisfies the following conditions for x and v: /// /// [6.0:189:10-12] @@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture( // // The two allowed cases are: // x = ... atomic-var = ... - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // or - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // x = ... atomic-var = ... // // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture @@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture( // // If the two statements don't fit these criteria, return a pair of default- // constructed values. + using ReturnTy = std::pair; SourcedActionStmt act1{GetActionStmt(ec1)}; SourcedActionStmt act2{GetActionStmt(ec2)}; @@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture( auto isUpdateCapture{ [](const evaluate::Assignment &u, const evaluate::Assignment &c) { - return u.lhs == c.rhs; + return IsSameOrConvertOf(c.rhs, u.lhs); }}; // Do some checks that narrow down the possible choices for the update // and the capture statements. This will help to emit better diagnostics. - bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); - bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; + + auto errorCaptureShouldRead{[&](const parser::CharBlock &source, + const std::string &expr) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US, + expr); + }}; - if (couldBeCapture1) { - if (couldBeCapture2) { - if (isUpdateCapture(as2, as1)) { - if (isUpdateCapture(as1, as2)) { - // If both statements could be captures and both could be updates, - // emit a warning about the ambiguity. - context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); - } - return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); - } else if (isUpdateCapture(as1, as2)) { + auto errorNeitherWorks{[&]() { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US); + }}; + + auto makeSelectionFromDet{[&](int det) -> ReturnTy { + // If det != 0, then the checks unambiguously suggest a specific + // categorization. + // If det == 0, then this function should be called only if the + // checks haven't ruled out any possibility, i.e. when both assigments + // could still be either updates or captures. 
+ if (det > 0) { + // as1 is update, as2 is capture + if (isUpdateCapture(as1, as2)) { return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - context_.Say(source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, - as1.rhs.AsFortran(), as2.rhs.AsFortran()); + errorCaptureShouldRead(act2.source, as1.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } else { // !couldBeCapture2 + } else if (det < 0) { + // as2 is update, as1 is capture if (isUpdateCapture(as2, as1)) { return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } else { - context_.Say(act2.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as1.rhs.AsFortran()); + errorCaptureShouldRead(act1.source, as2.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } - } else { // !couldBeCapture1 - if (couldBeCapture2) { - if (isUpdateCapture(as1, as2)) { - return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); - } else { + } else { + bool updateFirst{isUpdateCapture(as1, as2)}; + bool captureFirst{isUpdateCapture(as2, as1)}; + if (updateFirst && captureFirst) { + // If both assignment could be the update and both could be the + // capture, emit a warning about the ambiguity. context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as2.rhs.AsFortran()); + "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US); + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } - } else { - context_.Say(source, - "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + if (updateFirst != captureFirst) { + const parser::ExecutionPartConstruct *upd{updateFirst ? 
ec1 : ec2}; + const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2}; + return std::make_pair(upd, cap); + } + assert(!updateFirst && !captureFirst); + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); } + }}; + + if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) { + return makeSelectionFromDet(det); } + assert(det == 0 && "Prior checks should have covered det != 0"); - return std::make_pair(nullptr, nullptr); + // If neither of the statements is an RMW update, it could still be a + // "write" update. Pretty much any assignment can be a write update, so + // recompute det with cbu1 = cbu2 = true. + if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) { + return makeSelectionFromDet(writeDet); + } + + // It's only errors from here on. + + if (!cbu1 && !cbu2 && !cbc1 && !cbc2) { + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); + } + + // The remaining cases are that + // - no candidate for update, or for capture, + // - one of the assigments cannot be anything. + + if (!cbu1 && !cbu2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US); + return std::make_pair(nullptr, nullptr); + } else if (!cbc1 && !cbc2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) { + auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source; + context_.Say(src, + "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + // All cases should have been covered. 
+ llvm_unreachable("Unchecked condition"); } void OmpStructureChecker::CheckAtomicCaptureAssignment( const evaluate::Assignment &capture, const SomeExpr &atom, parser::CharBlock source) { - const SomeExpr &cap{capture.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &cap{capture.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, rsrc); - // This part should have been checked prior to callig this function. - assert(capture.rhs == atom && "This canont be a capture assignment"); + // This part should have been checked prior to calling this function. + assert(*GetConvertInput(capture.rhs) == atom && + "This canont be a capture assignment"); CheckStorageOverlap(atom, {cap}, source); } } void OmpStructureChecker::CheckAtomicReadAssignment( const evaluate::Assignment &read, parser::CharBlock source) { - const SomeExpr &atom{read.rhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; - if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + if (auto maybe{GetConvertInput(read.rhs)}) { + const SomeExpr &atom{*maybe}; + + if (!IsVarOrFunctionRef(atom)) { + ErrorShouldBeVariable(atom, rsrc); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } } else { - CheckAtomicVariable(atom, rsrc); - CheckStorageOverlap(atom, {read.lhs}, source); + ErrorShouldBeVariable(read.rhs, rsrc); } } @@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment( // one of the following forms: // x = expr // x => expr - const SomeExpr &atom{write.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{write.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + 
ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, lsrc); CheckStorageOverlap(atom, {write.rhs}, source); @@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( // x = intrinsic-procedure-name (x) // x = intrinsic-procedure-name (x, expr-list) // x = intrinsic-procedure-name (expr-list, x) - const SomeExpr &atom{update.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{update.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); // Skip other checks. return; } @@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( const SomeExpr &cond, parser::CharBlock condSource, const evaluate::Assignment &assign, parser::CharBlock assignSource) { - const SomeExpr &atom{assign.lhs}; auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + const SomeExpr &atom{assign.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, arsrc); // Skip other checks. 
return; } @@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( @@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign), MakeAtomicAnalysisOp(Analysis::None)); } @@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( if (GetActionStmt(&body.front()).stmt == uact.stmt) { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(action, update.rhs), - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + MakeAtomicAnalysisOp(action, update), + MakeAtomicAnalysisOp(Analysis::Read, capture)); } else { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), - MakeAtomicAnalysisOp(action, update.rhs)); + MakeAtomicAnalysisOp(Analysis::Read, capture), + MakeAtomicAnalysisOp(action, update)); } } @@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( if (captureFirst) { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign)); } else { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, 
capAssign.lhs)); + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign)); } } @@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead( if (body.size() == 1) { SourcedActionStmt action{GetActionStmt(&body.front())}; if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { - const SomeExpr &atom{maybeRead->rhs}; CheckAtomicReadAssignment(*maybeRead, action.source); - using Analysis = parser::OpenMPAtomicConstruct::Analysis; - x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), - MakeAtomicAnalysisOp(Analysis::None)); + if (auto maybe{GetConvertInput(maybeRead->rhs)}) { + const SomeExpr &atom{*maybe}; + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead), + MakeAtomicAnalysisOp(Analysis::None)); + } } else { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); @@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index bf6fbf16d0646..835fbe45e1c0e 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -253,6 +253,7 @@ class OmpStructureChecker void CheckStorageOverlap(const evaluate::Expr &, llvm::ArrayRef>, parser::CharBlock); + void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source); void CheckAtomicVariable( const evaluate::Expr &, parser::CharBlock); std::pair&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture 
requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..aa9d2e0ac3ff7 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -1,5 +1,3 @@ -! REQUIRES : openmp_runtime - ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s ! 
CHECK: func.func @_QPatomic_implicit_cast_read() { diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index deb67e7614659..8adb0f1a67409 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -25,7 +25,7 @@ program sample !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement y = x x = y !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index c427ba07d43d8..f808ed916fb7e 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -39,7 +39,7 @@ subroutine f02 subroutine f03 integer :: x - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture !$omp atomic update capture x = x + 1 x = x + 2 @@ -50,7 +50,7 @@ subroutine f04 integer :: x, v !$omp atomic update capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement v = x x = v !$omp end atomic @@ -60,8 +60,8 @@ subroutine f05 integer :: x, v, z !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand 
side of the capture assignment should read z v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 !$omp end atomic end @@ -70,8 +70,8 @@ subroutine f06 integer :: x, v, z !$omp atomic update capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z v = x !$omp end atomic end diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 677b933932b44..5e180aa0bbe5b 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -97,50 +97,50 @@ program sample !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture x = x + 10 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x v = b !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture !$omp 
atomic capture v = 1 x = 4 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) !$omp end atomic >From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 24 Mar 2025 15:38:58 -0500 Subject: [PATCH 13/25] DumpEvExpr: show type --- flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 2f445429a10b5..1553dac3b6687 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,6 +16,7 @@ #include #include +#include #include #include @@ -38,6 +39,17 @@ class DumpEvaluateExpr { } private: + template + struct TypeOf { + static constexpr std::string_view name{TypeOf::get()}; + static constexpr std::string_view get() { + std::string_view v(__PRETTY_FUNCTION__); 
+ v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" + return v; + } + }; + template void Show(const common::Indirection &x) { Show(x.value()); } @@ -76,7 +88,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant"); + Indent("derived constant "s + std::string(TypeOf::name)); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -84,7 +96,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant"); + Print("constant "s + std::string(TypeOf::name)); } } void Show(const Symbol &symbol); @@ -102,7 +114,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator"); + Indent("designator "s + std::string(TypeOf::name)); Show(x.u); Outdent(); } @@ -117,7 +129,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref"); + Indent("function ref "s + std::string(TypeOf::name)); Show(x.proc()); Show(x.arguments()); Outdent(); @@ -127,14 +139,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value"); + Indent("array constructor value "s + std::string(TypeOf::name)); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do"); + Indent("implied do "s + std::string(TypeOf::name)); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -148,20 +160,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op"); + Indent("unary op "s + std::string(TypeOf::name)); Show(op.left()); Outdent(); } 
template void Show(const evaluate::Operation &op) { - Indent("binary op"); + Indent("binary op "s + std::string(TypeOf::name)); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr T"); + Indent("expr <" + std::string(TypeOf::name) + ">"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index aa0b4e0f03398..66cedab94bfb4 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("expr some type"); + Indent("relational some type"); Show(x.u); Outdent(); } >From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 15:14:45 -0500 Subject: [PATCH 14/25] Handle conversion from real to complex via complex constructor --- flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++--- 1 file changed, 47 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dada9c6c2bd6f..ae81dcb5ea150 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,36 +3183,46 @@ struct ConvertCollector using Base::operator(); template // - Result operator()(const evaluate::Designator &x) const { + Result asSomeExpr(const T &x) const { auto copy{x}; return {AsGenericExpr(std::move(copy)), {}}; } + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + template // Result operator()(const evaluate::FunctionRef &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template // Result operator()(const evaluate::Constant &x) const { - auto copy{x}; - return 
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/25] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: } >From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:14:47 -0500 Subject: [PATCH 16/25] Allow conversion in update operations --- flang/include/flang/Semantics/tools.h | 17 ++++----- flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++-- flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++----------- .../Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++-- flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++---------- .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +- 7 files changed, 44 insertions(+), 57 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7f1ec59b087a2..9be2feb8ae064 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -789,14 +789,15 @@ inline bool checkForSymbolMatch( /// return the "expr" but with top-level parentheses stripped. std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); -/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). -/// Check if "expr" is -/// SomeType(SomeKind(Type( -/// Convert -/// SomeKind(...)[2]))) -/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves -/// TypeCategory. -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// it returns std::nullopt. 
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic >From 341723713929507c59d528540d32bc2e4213e920 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:21:56 -0500 Subject: [PATCH 17/25] format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e8b1b..0f553541c5ef0 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } >From 2686207342bad511f6d51b20ed923c0d2cc9047b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:22:26 -0500 Subject: [PATCH 18/25] Revert "DumpEvExpr: show type" This reverts commit 40510a3068498d15257cc7d198bce9c8cd71a902. Debug changes accidentally pushed upstream. 
--- flang/include/flang/Semantics/dump-expr.h | 30 +++++++---------------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 10 insertions(+), 22 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b6687..2f445429a10b5 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,7 +16,6 @@ #include #include -#include #include #include @@ -39,17 +38,6 @@ class DumpEvaluateExpr { } private: - template - struct TypeOf { - static constexpr std::string_view name{TypeOf::get()}; - static constexpr std::string_view get() { - std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" - return v; - } - }; - template void Show(const common::Indirection &x) { Show(x.value()); } @@ -88,7 +76,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant "s + std::string(TypeOf::name)); + Indent("derived constant"); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -96,7 +84,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant "s + std::string(TypeOf::name)); + Print("constant"); } } void Show(const Symbol &symbol); @@ -114,7 +102,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator "s + std::string(TypeOf::name)); + Indent("designator"); Show(x.u); Outdent(); } @@ -129,7 +117,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref "s + std::string(TypeOf::name)); + Indent("function ref"); Show(x.proc()); Show(x.arguments()); Outdent(); @@ 
-139,14 +127,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value "s + std::string(TypeOf::name)); + Indent("array constructor value"); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do "s + std::string(TypeOf::name)); + Indent("implied do"); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -160,20 +148,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op "s + std::string(TypeOf::name)); + Indent("unary op"); Show(op.left()); Outdent(); } template void Show(const evaluate::Operation &op) { - Indent("binary op "s + std::string(TypeOf::name)); + Indent("binary op"); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr <" + std::string(TypeOf::name) + ">"); + Indent("expr T"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 66cedab94bfb4..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("relational some type"); + Indent("expr some type"); Show(x.u); Outdent(); } >From c00fc531bcf742c409fc974da94c5b362fa9132c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:37:19 -0500 Subject: [PATCH 19/25] Delete unnecessary static_assert --- flang/lib/Semantics/check-omp-structure.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 6005dda7c26fe..2e59553d5e130 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -21,8 +21,6 @@ namespace Fortran::semantics { -static_assert(std::is_same_v>); - template static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { return !(e == f); >From 45b012c16b77c757a0d09b2a229bad49fed8d26f Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:25 -0500 Subject: [PATCH 20/25] Add missing initializer for 'iff' --- flang/lib/Semantics/check-omp-structure.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 2e59553d5e130..aa1bd136b371f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2815,7 +2815,7 @@ static std::optional AnalyzeConditionalStmt( } } else { AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, - GetActionStmt(std::get(s.t))}; + GetActionStmt(std::get(s.t)), SourcedActionStmt{}}; if (result.ift.stmt) { return result; } >From daeac25991bf14fb08c3accabe068c074afa1eb7 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:47 -0500 Subject: [PATCH 21/25] Add asserts for printing "Identity" as top-level operator --- flang/lib/Semantics/check-omp-structure.cpp | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa1bd136b371f..062b45deac865 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3823,6 +3823,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, atomic::ToString(top.first)); @@ -3852,6 +3853,8 @@ void 
OmpStructureChecker::CheckAtomicUpdateAssignment( "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, atom.AsFortran(), atomic::ToString(top.first)); @@ -3898,16 +3901,20 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, atomic::ToString(top.first)); } break; } + case atomic::Operator::Identity: case atomic::Operator::True: case atomic::Operator::False: break; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, atomic::ToString(top.first)); >From ae121e5c37453af1a4aba7c77939c2c1c45b75fa Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 13:15:58 -0500 Subject: [PATCH 22/25] Explain the use of determinant --- flang/lib/Semantics/check-omp-structure.cpp | 31 ++++++++++++++++++--- 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 062b45deac865..bc6a09b9768ef 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3606,10 +3606,33 @@ OmpStructureChecker::CheckUpdateCapture( // subexpression of the right-hand side. // 2. An assignment could be a capture (cbc) if the right-hand side is // a variable (or a function ref), with potential type conversions. 
- bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; - bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; - bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; - bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; // Can as1 be an update? + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; // Can as2 be an update? + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; // Can 1 be capture? + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; // Can 2 be capture? + + // We want to diagnose cases where both assignments cannot be an update, + // or both cannot be a capture, as well as cases where either assignment + // cannot be any of these two. + // + // If we organize these boolean values into a matrix + // |cbu1 cbu2| + // |cbc1 cbc2| + // then we want to diagnose cases where the matrix has a zero (i.e. "false") + // row or column, including the case where everything is zero. All these + // cases correspond to the determinant of the matrix being 0, which suggests + // that checking the det may be a convenient diagnostic check. There is only + // one additional case where the det is 0, which is when the matrix is all 1 + // ("true"). The "all true" case represents the situation where both + // assignments could be an update as well as a capture. On the other hand, + // whenever det != 0, the roles of the update and the capture can be + // unambiguously assigned to as1 and as2 [1]. + // + // [1] This can be easily verified by hand: there are 10 2x2 matrices with + // det = 0, leaving 6 cases where det != 0: + // 0 1 0 1 1 0 1 0 1 1 1 1 + // 1 0 1 1 0 1 1 1 0 1 1 0 + // In each case the classification is unambiguous.
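As a sanity check on the counting argument in the comment above, the following standalone sketch enumerates all 16 boolean 2x2 matrices. It is not part of the patch: `Summary`, `Det` and `Enumerate` are hypothetical names introduced here for illustration only, and reading det > 0 as "as1 is the update, as2 is the capture" is an assumption derived from the formula cbu1*cbc2 - cbu2*cbc1, not quoted from the flang source.

```cpp
#include <cassert>

// Illustrative only: none of these names exist in flang.
struct Summary {
  int detZero = 0;      // matrices with det == 0 (the diagnosed cases)
  int detNonZero = 0;   // matrices with det != 0 (roles can be assigned)
  int zeroRowOrCol = 0; // matrices with an all-false row or column
  int allTrue = 0;      // the single matrix where every entry is true
};

//     |cbu1 cbu2|
// det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1
inline int Det(bool cbu1, bool cbu2, bool cbc1, bool cbc2) {
  return int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1);
}

inline Summary Enumerate() {
  Summary s;
  for (int bits = 0; bits < 16; ++bits) {
    bool cbu1 = (bits & 1) != 0; // could as1 be an update?
    bool cbu2 = (bits & 2) != 0; // could as2 be an update?
    bool cbc1 = (bits & 4) != 0; // could as1 be a capture?
    bool cbc2 = (bits & 8) != 0; // could as2 be a capture?

    bool pairA = cbu1 && cbc2; // as1 updates, as2 captures
    bool pairB = cbu2 && cbc1; // as2 updates, as1 captures
    bool zeroLine = (!cbu1 && !cbu2) || (!cbc1 && !cbc2) ||
                    (!cbu1 && !cbc1) || (!cbu2 && !cbc2);

    int det = Det(cbu1, cbu2, cbc1, cbc2);
    if (det != 0) {
      ++s.detNonZero;
      // Exactly one pairing is feasible, and the sign of det selects it.
      assert(pairA != pairB);
      assert((det > 0) == pairA);
      assert(!zeroLine);
    } else {
      ++s.detZero;
      // Either no pairing is feasible, or both are (the all-true case).
      assert(pairA == pairB);
    }
    if (zeroLine) {
      ++s.zeroRowOrCol;
    }
    if (cbu1 && cbu2 && cbc1 && cbc2) {
      ++s.allTrue;
    }
  }
  return s;
}
```

Calling `Enumerate()` should report 10 matrices with det = 0 (9 with a false row or column, plus the all-true one) and 6 with det != 0, which matches the footnote in the comment.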
// |cbu1 cbu2| // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 >From cae0e8fcd3f6b8c2bc3ad8f85599ef4765c6afc5 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 11:48:18 -0500 Subject: [PATCH 23/25] Deal with assignments that failed Fortran semantic checks Don't emit diagnostics for those. --- flang/lib/Semantics/check-omp-structure.cpp | 66 ++++++++++++++------- 1 file changed, 46 insertions(+), 20 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bc6a09b9768ef..89a3a407441a8 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2726,6 +2726,9 @@ static SourcedActionStmt GetActionStmt(const parser::Block &block) { // Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption // is that the ActionStmt will be either an assignment or a pointer-assignment, // otherwise return std::nullopt. +// Note: This function can return std::nullopt on [Pointer]AssignmentStmt where +// the "typedAssignment" is unset. This can happen if there are semantic errors +// in the purported assignment. static std::optional GetEvaluateAssignment( const parser::ActionStmt *x) { if (x == nullptr) { @@ -2754,6 +2757,29 @@ static std::optional GetEvaluateAssignment( x->u); } +// Check if the ActionStmt is actually a [Pointer]AssignmentStmt. This is +// to separate cases where the source has something that looks like an +// assignment, but is semantically wrong (diagnosed by general semantic +// checks), and where the source has some other statement (which we want +// to report as "should be an assignment").
+static bool IsAssignment(const parser::ActionStmt *x) { + if (x == nullptr) { + return false; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + + return common::visit( + [](auto &&s) -> bool { + using BareS = llvm::remove_cvref_t; + return std::is_same_v || + std::is_same_v; + }, + x->u); +} + static std::optional AnalyzeConditionalStmt( const parser::ExecutionPartConstruct *x) { if (x == nullptr) { @@ -3588,8 +3614,10 @@ OmpStructureChecker::CheckUpdateCapture( auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; if (!maybeAssign1 || !maybeAssign2) { - context_.Say(source, - "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + if (!IsAssignment(act1.stmt) || !IsAssignment(act2.stmt)) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + } return std::make_pair(nullptr, nullptr); } @@ -3956,7 +3984,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( // The if-true statement must be present, and must be an assignment. 
auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; if (!maybeAssign) { - if (update.ift.stmt) { + if (update.ift.stmt && !IsAssignment(update.ift.stmt)) { context_.Say(update.ift.source, "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); } else { @@ -3992,7 +4020,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); } @@ -4094,17 +4122,11 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( } SourcedActionStmt uact{GetActionStmt(uec)}; SourcedActionStmt cact{GetActionStmt(cec)}; - auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; - auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; - - if (!maybeUpdate || !maybeCapture) { - context_.Say(source, - "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); - return; - } + // The "dereferences" of std::optional are guaranteed to be valid after + // CheckUpdateCapture. 
+ evaluate::Assignment update{*GetEvaluateAssignment(uact.stmt)}; + evaluate::Assignment capture{*GetEvaluateAssignment(cact.stmt)}; - const evaluate::Assignment &update{*maybeUpdate}; - const evaluate::Assignment &capture{*maybeCapture}; const SomeExpr &atom{update.lhs}; using Analysis = parser::OpenMPAtomicConstruct::Analysis; @@ -4242,13 +4264,17 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( return; } } else { - context_.Say(capture.source, - "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + if (!IsAssignment(capture.stmt)) { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + } return; } } else { - context_.Say(update.ift.source, - "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + if (!IsAssignment(update.ift.stmt)) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + } return; } @@ -4316,7 +4342,7 @@ void OmpStructureChecker::CheckAtomicRead( MakeAtomicAnalysisOp(Analysis::Read, maybeRead), MakeAtomicAnalysisOp(Analysis::None)); } - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); } @@ -4350,7 +4376,7 @@ void OmpStructureChecker::CheckAtomicWrite( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); } >From 6bc8c10c793ebac02c78daec33e7fb5e6becb8e0 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 12:47:00 -0500 Subject: [PATCH 24/25] Move common functions to tools.cpp --- flang/include/flang/Semantics/tools.h | 134 +++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 
flang/lib/Semantics/check-omp-structure.cpp | 506 ++------------------ flang/lib/Semantics/tools.cpp | 310 ++++++++++++ 4 files changed, 484 insertions(+), 468 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 821f1ae34fd5b..25fadceefceb0 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -778,11 +778,135 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, return false; } -/// If the top-level operation (ignoring parentheses) is either an -/// evaluate::FunctionRef, or a specialization of evaluate::Operation, -/// then return the list of arguments (wrapped in SomeExpr). Otherwise, -/// return the "expr" but with top-level parentheses stripped. -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); +namespace operation { + +enum class Operator { + Add, + And, + Associated, + Call, + Convert, + Div, + Eq, + Eqv, + False, + Ge, + Gt, + Identity, + Intrinsic, + Lt, + Max, + Min, + Mul, + Ne, + Neqv, + Not, + Or, + Pow, + Resize, // Convert within the same TypeCategory + Sub, + True, + Unknown, +}; + +std::string ToString(Operator op); + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unknown; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return 
Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unknown; +} + +template +Operator OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Add; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Sub; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Mul; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Div; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } +} + +template // +Operator OperationCode(const T &) { + return Operator::Unknown; +} + +Operator OperationCode(const evaluate::ProcedureDesignator &proc); + +} // namespace operation + +/// Return information about the top-level operation (ignoring parentheses): +/// the operation code and the list of arguments. +std::pair> +GetTopLevelOperation(const SomeExpr &expr); /// Check if expr is same as x, or a sequence of Convert operations on x. bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ad5eae4ae39a2..c74f7627c5e25 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2828,7 +2828,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, // This must exist by now. 
SomeExpr input = *semantics::GetConvertInput(assign.rhs); - std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; + std::vector args{semantics::GetTopLevelOperation(input).second}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrConvertOf(arg, atom)) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 89a3a407441a8..f29a56d5fd92a 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2891,290 +2891,6 @@ static std::pair SplitAssignmentSource( namespace atomic { -template static void MoveAppend(V &accum, V &&other) { - for (auto &&s : other) { - accum.push_back(std::move(s)); - } -} - -enum class Operator { - Unk, - // Operators that are officially allowed in the update operation - Add, - And, - Associated, - Div, - Eq, - Eqv, - Ge, // extension - Gt, - Identity, // extension: x = x is allowed (*), but we should never print - // "identity" as the name of the operator - Le, // extension - Lt, - Max, - Min, - Mul, - Ne, // extension - Neqv, - Or, - Sub, - // Operators that we recognize for technical reasons - True, - False, - Not, - Convert, - Resize, - Intrinsic, - Call, - Pow, - - // (*): "x = x + 0" is a valid update statement, but it will be folded - // to "x = x" by the time we look at it. Since the source statements - // "x = x" and "x = x + 0" will end up looking the same, accept the - // former as an extension. 
-}; - -std::string ToString(Operator op) { - switch (op) { - case Operator::Add: - return "+"; - case Operator::And: - return "AND"; - case Operator::Associated: - return "ASSOCIATED"; - case Operator::Div: - return "/"; - case Operator::Eq: - return "=="; - case Operator::Eqv: - return "EQV"; - case Operator::Ge: - return ">="; - case Operator::Gt: - return ">"; - case Operator::Identity: - return "identity"; - case Operator::Le: - return "<="; - case Operator::Lt: - return "<"; - case Operator::Max: - return "MAX"; - case Operator::Min: - return "MIN"; - case Operator::Mul: - return "*"; - case Operator::Neqv: - return "NEQV/EOR"; - case Operator::Ne: - return "/="; - case Operator::Or: - return "OR"; - case Operator::Sub: - return "-"; - case Operator::True: - return ".TRUE."; - case Operator::False: - return ".FALSE."; - case Operator::Not: - return "NOT"; - case Operator::Convert: - return "type-conversion"; - case Operator::Resize: - return "resize"; - case Operator::Intrinsic: - return "intrinsic"; - case Operator::Call: - return "function-call"; - case Operator::Pow: - return "**"; - default: - return "??"; - } -} - -template // -struct ArgumentExtractor - : public evaluate::Traverse, - std::pair>, false> { - using Arguments = std::vector; - using Result = std::pair; - using Base = evaluate::Traverse, - Result, false>; - static constexpr auto IgnoreResizes = IgnoreResizingConverts; - static constexpr auto Logical = common::TypeCategory::Logical; - ArgumentExtractor() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result operator()( - const evaluate::Constant> &x) const { - if (const auto &val{x.GetScalarValue()}) { - return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) - : std::make_pair(Operator::False, Arguments{}); - } - return Default(); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - Result result{OperationCode(x.proc()), {}}; - for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { - if (auto *e{x.UnwrapArgExpr(i)}) { - result.second.push_back(*e); - } - } - return result; - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. - return (*this)(x.template operand<0>()); - } - if constexpr (IgnoreResizes && - std::is_same_v>) { - // Ignore conversions within the same category. - // Atomic operations on int(kind=1) may be implicitly widened - // to int(kind=4) for example. - return (*this)(x.template operand<0>()); - } else { - return std::make_pair( - OperationCode(x), OperationArgs(x, std::index_sequence_for{})); - } - } - - template // - Result operator()(const evaluate::Designator &x) const { - evaluate::Designator copy{x}; - Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; - return result; - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - // There shouldn't be any combining needed, since we're stopping the - // traversal at the top-level operation, but implement one that picks - // the first non-empty result. 
- if constexpr (sizeof...(Rs) == 0) { - return std::move(result); - } else { - if (!result.second.empty()) { - return std::move(result); - } else { - return Combine(std::move(results)...); - } - } - } - -private: - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) - const { - switch (op.derived().logicalOperator) { - case common::LogicalOperator::And: - return Operator::And; - case common::LogicalOperator::Or: - return Operator::Or; - case common::LogicalOperator::Eqv: - return Operator::Eqv; - case common::LogicalOperator::Neqv: - return Operator::Neqv; - case common::LogicalOperator::Not: - return Operator::Not; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - switch (op.derived().opr) { - case common::RelationalOperator::LT: - return Operator::Lt; - case common::RelationalOperator::LE: - return Operator::Le; - case common::RelationalOperator::EQ: - return Operator::Eq; - case common::RelationalOperator::NE: - return Operator::Ne; - case common::RelationalOperator::GE: - return Operator::Ge; - case common::RelationalOperator::GT: - return Operator::Gt; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Add; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Sub; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Mul; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Div; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - if constexpr (C == 
T::category) { - return Operator::Resize; - } else { - return Operator::Convert; - } - } - Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { - Operator code = llvm::StringSwitch(proc.GetName()) - .Case("associated", Operator::Associated) - .Case("min", Operator::Min) - .Case("max", Operator::Max) - .Case("iand", Operator::And) - .Case("ior", Operator::Or) - .Case("ieor", Operator::Neqv) - .Default(Operator::Call); - if (code == Operator::Call && proc.GetSpecificIntrinsic()) { - return Operator::Intrinsic; - } - return code; - } - template // - Operator OperationCode(const T &) const { - return Operator::Unk; - } - - template - Arguments OperationArgs(const evaluate::Operation &x, - std::index_sequence) const { - return Arguments{SomeExpr(x.template operand())...}; - } -}; - struct DesignatorCollector : public evaluate::Traverse, false> { using Result = std::vector; @@ -3196,125 +2912,14 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (MoveAppend(v, std::move(results)), ...); - return v; - } -}; - -struct ConvertCollector - : public evaluate::Traverse>, false> { - using Result = std::pair>; - using Base = evaluate::Traverse; - ConvertCollector() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result asSomeExpr(const T &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; - } - - template // - Result operator()(const evaluate::Designator &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::Constant &x) const { - return asSomeExpr(x); - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore parentheses. 
- return (*this)(x.template operand<0>()); - } else if constexpr (is_convert_v) { - // Convert should always have a typed result, so it should be safe to - // dereference x.GetType(). - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else if constexpr (is_complex_constructor_v) { - // This is a conversion iff the imaginary operand is 0. - if (IsZero(x.template operand<1>())) { - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else { - return asSomeExpr(x.derived()); - } - } else { - return asSomeExpr(x.derived()); - } - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - Result v(std::move(result)); - auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { - assert((!x.has_value() || !y.has_value()) && "Multiple designators"); - if (!x.has_value()) { - x = std::move(y); + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); } }}; - (setValue(v.first, std::move(results).first), ...); - (MoveAppend(v.second, std::move(results).second), ...); + (moveAppend(v, std::move(results)), ...); return v; } - -private: - template // - static bool IsZero(const T &x) { - return false; - } - template // - static bool IsZero(const evaluate::Expr &x) { - return common::visit([](auto &&s) { return IsZero(s); }, x.u); - } - template // - static bool IsZero(const evaluate::Constant &x) { - if (auto &&maybeScalar{x.GetScalarValue()}) { - return maybeScalar->IsZero(); - } else { - return false; - } - } - - template // - struct is_convert { - static constexpr bool value{false}; - }; - template // - struct is_convert> { - static constexpr bool value{true}; - }; - template // - struct is_convert> { - // Conversion from complex to real. 
- static constexpr bool value{true}; - }; - template // - static constexpr bool is_convert_v = is_convert::value; - - template // - struct is_complex_constructor { - static constexpr bool value{false}; - }; - template // - struct is_complex_constructor> { - static constexpr bool value{true}; - }; - template // - static constexpr bool is_complex_constructor_v = - is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { @@ -3347,22 +2952,13 @@ static bool IsAllocatable(const SomeExpr &expr) { return !syms.empty() && IsAllocatable(syms.back()); } -static std::pair> GetTopLevelOperation( - const SomeExpr &expr) { - return atomic::ArgumentExtractor{}(expr); -} - -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { - return GetTopLevelOperation(expr).second; -} - static bool IsPointerAssignment(const evaluate::Assignment &x) { return std::holds_alternative(x.u) || std::holds_alternative(x.u); } static bool IsCheckForAssociated(const SomeExpr &cond) { - return GetTopLevelOperation(cond).first == atomic::Operator::Associated; + return GetTopLevelOperation(cond).first == operation::Operator::Associated; } static bool HasCommonDesignatorSymbols( @@ -3455,23 +3051,7 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -MaybeExpr GetConvertInput(const SomeExpr &x) { - // This returns SomeExpr(x) when x is a designator/functionref/constant. - return atomic::ConvertCollector{}(x).first; -} - -bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { - // Check if expr is same as x, or a sequence of Convert operations on x. 
- if (expr == x) { - return true; - } else if (auto maybe{GetConvertInput(expr)}) { - return *maybe == x; - } else { - return false; - } -} - -bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { +static bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3839,45 +3419,46 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - std::pair> top{ - atomic::Operator::Unk, {}}; + std::pair> top{ + operation::Operator::Unknown, {}}; if (auto &&maybeInput{GetConvertInput(update.rhs)}) { top = GetTopLevelOperation(*maybeInput); } switch (top.first) { - case atomic::Operator::Add: - case atomic::Operator::Sub: - case atomic::Operator::Mul: - case atomic::Operator::Div: - case atomic::Operator::And: - case atomic::Operator::Or: - case atomic::Operator::Eqv: - case atomic::Operator::Neqv: - case atomic::Operator::Min: - case atomic::Operator::Max: - case atomic::Operator::Identity: + case operation::Operator::Add: + case operation::Operator::Sub: + case operation::Operator::Mul: + case operation::Operator::Div: + case operation::Operator::And: + case operation::Operator::Or: + case operation::Operator::Eqv: + case operation::Operator::Neqv: + case operation::Operator::Min: + case operation::Operator::Max: + case operation::Operator::Identity: break; - case atomic::Operator::Call: + case operation::Operator::Call: context_.Say(source, "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Convert: + case operation::Operator::Convert: context_.Say(source, "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Intrinsic: + case operation::Operator::Intrinsic: context_.Say(source, "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Unk: + case operation::Operator::Unknown: 
context_.Say( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); return; } // Check if `atom` occurs exactly once in the argument list. @@ -3898,17 +3479,17 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( }()}; if (unique == top.second.end()) { - if (top.first == atomic::Operator::Identity) { + if (top.first == operation::Operator::Identity) { // This is "x = y". context_.Say(rsrc, "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, - atom.AsFortran(), atomic::ToString(top.first)); + atom.AsFortran(), operation::ToString(top.first)); } } else { CheckStorageOverlap(atom, nonAtom, source); @@ -3933,18 +3514,18 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( // Missing arguments to operations would have been diagnosed by now. 
switch (top.first) { - case atomic::Operator::Associated: + case operation::Operator::Associated: if (atom != top.second.front()) { context_.Say(assignSource, "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); } break; // x equalop e | e equalop x (allowing "e equalop x" is an extension) - case atomic::Operator::Eq: - case atomic::Operator::Eqv: + case operation::Operator::Eq: + case operation::Operator::Eqv: // x ordop expr | expr ordop x - case atomic::Operator::Lt: - case atomic::Operator::Gt: { + case operation::Operator::Lt: + case operation::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; if (IsSameOrConvertOf(arg0, atom)) { @@ -3952,23 +3533,24 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); } break; } - case atomic::Operator::Identity: - case atomic::Operator::True: - case atomic::Operator::False: + case operation::Operator::Identity: + case operation::Operator::True: + case operation::Operator::False: break; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); break; } } diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 1d1e3ac044166..fce930dcc1d02 100644 --- a/flang/lib/Semantics/tools.cpp +++ 
b/flang/lib/Semantics/tools.cpp @@ -17,6 +17,7 @@ #include "flang/Semantics/tools.h" #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringSwitch.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1756,4 +1757,313 @@ bool HadUseError( } } +namespace operation { +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() + ? std::make_pair(operation::Operator::True, Arguments{}) + : std::make_pair(operation::Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{operation::OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); + } + } + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair(operation::OperationCode(x), + OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{ + operation::Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; +} // namespace operation + +std::string operation::ToString(operation::Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + 
return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; + } +} + +operation::Operator operation::OperationCode( + const evaluate::ProcedureDesignator &proc) { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; +} + +std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return operation::ArgumentExtractor{}(expr); +} + +namespace operation { +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result asSomeExpr(const T &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::Constant &x) const { + return asSomeExpr(x); + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. 
+ if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } + } else { + return asSomeExpr(x.derived()); + } + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (moveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; +}; +} // namespace operation + +MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. 
+ return operation::ConvertCollector{}(x).first; +} + +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. + if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + } // namespace Fortran::semantics >From a83a1cf262eb9f01aafbcf099a8467aa9b861187 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 13:05:15 -0500 Subject: [PATCH 25/25] format --- flang/include/flang/Semantics/tools.h | 28 +++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 25fadceefceb0..9454f0b489192 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -830,8 +830,8 @@ Operator OperationCode( } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { switch (op.derived().opr) { case common::RelationalOperator::LT: return Operator::Lt; @@ -855,26 +855,26 @@ Operator OperationCode(const evaluate::Operation, Ts...> &op) { } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Sub; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Mul; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Div; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Pow; } @@ -885,8 +885,8 @@ Operator OperationCode( } template -Operator 
-OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { if constexpr (C == T::category) { return Operator::Resize; } else { @@ -905,8 +905,8 @@ Operator OperationCode(const evaluate::ProcedureDesignator &proc); /// Return information about the top-level operation (ignoring parentheses): /// the operation code and the list of arguments. -std::pair> -GetTopLevelOperation(const SomeExpr &expr); +std::pair> GetTopLevelOperation( + const SomeExpr &expr); /// Check if expr is same as x, or a sequence of Convert operations on x. bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); From flang-commits at lists.llvm.org Fri May 30 11:11:16 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Fri, 30 May 2025 11:11:16 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <6839f4c4.050a0220.358d84.94c5@mx.google.com> https://github.com/snarang181 updated https://github.com/llvm/llvm-project/pull/141882 >From 05d155d3def0173ed099b8d7588dca1ea644754a Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Wed, 28 May 2025 20:21:16 -0400 Subject: [PATCH 1/5] [Flang][Docs] Add Sphinx man page support for Flang This patch enables building Flang man pages by: - Adding a `man_pages` entry in flang/docs/conf.py for Sphinx man builder. - Adding a minimal `index.rst` as the master document. - Adding placeholder `.rst` files for FIRLangRef and FlangCommandLineReference to fix toctree references. These changes unblock builds using `-DLLVM_BUILD_MANPAGES=ON` and allow `ninja docs-flang-man` to generate `flang.1`. 
Fixes #141757 --- flang/docs/FIRLangRef.rst | 4 ++++ flang/docs/FlangCommandLineReference.rst | 4 ++++ flang/docs/conf.py | 4 +++- flang/docs/index.rst | 10 ++++++++++ 4 files changed, 21 insertions(+), 1 deletion(-) create mode 100644 flang/docs/FIRLangRef.rst create mode 100644 flang/docs/FlangCommandLineReference.rst create mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst new file mode 100644 index 0000000000000..91edd67fdcad8 --- /dev/null +++ b/flang/docs/FIRLangRef.rst @@ -0,0 +1,4 @@ +FIR Language Reference +====================== + +(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst new file mode 100644 index 0000000000000..71f77f28ba72c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.rst @@ -0,0 +1,4 @@ +Flang Command Line Reference +============================ + +(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 48f7b69f5d750..46907f144e25a 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -227,7 +227,9 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [] +man_pages = [ + ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) +] # If true, show URL addresses after external links. # man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst new file mode 100644 index 0000000000000..09677eb87704f --- /dev/null +++ b/flang/docs/index.rst @@ -0,0 +1,10 @@ +Flang Documentation +==================== + +Welcome to the Flang documentation. + +.. 
toctree:: + :maxdepth: 1 + + FIRLangRef + FlangCommandLineReference >From c255837afe37348ed2e1fe480bc2b621025f8774 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 06:53:34 -0400 Subject: [PATCH 2/5] Remove .rst files and point conf.py to pick up .md --- flang/docs/FIRLangRef.rst | 4 ---- flang/docs/FlangCommandLineReference.rst | 4 ---- flang/docs/conf.py | 5 ++--- flang/docs/index.rst | 10 ---------- 4 files changed, 2 insertions(+), 21 deletions(-) delete mode 100644 flang/docs/FIRLangRef.rst delete mode 100644 flang/docs/FlangCommandLineReference.rst delete mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst deleted file mode 100644 index 91edd67fdcad8..0000000000000 --- a/flang/docs/FIRLangRef.rst +++ /dev/null @@ -1,4 +0,0 @@ -FIR Language Reference -====================== - -(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst deleted file mode 100644 index 71f77f28ba72c..0000000000000 --- a/flang/docs/FlangCommandLineReference.rst +++ /dev/null @@ -1,4 +0,0 @@ -Flang Command Line Reference -============================ - -(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 46907f144e25a..4fd81440c8176 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -42,6 +42,7 @@ # Add any paths that contain templates here, relative to this directory. templates_path = ["_templates"] +source_suffix = [".md"] myst_heading_anchors = 6 import sphinx @@ -227,9 +228,7 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [ - ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) -] +man_pages = [("index", "flang", "Flang Documentation", ["Flang Contributors"], 1)] # If true, show URL addresses after external links. 
# man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst deleted file mode 100644 index 09677eb87704f..0000000000000 --- a/flang/docs/index.rst +++ /dev/null @@ -1,10 +0,0 @@ -Flang Documentation -==================== - -Welcome to the Flang documentation. - -.. toctree:: - :maxdepth: 1 - - FIRLangRef - FlangCommandLineReference >From c1e998189c0b61e9910c2b613f8c610ae67c43ff Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 07:03:35 -0400 Subject: [PATCH 3/5] While building man pages, the .md files were being used. Due to that, the myst_parser was explicitly imported. Adding Placeholder .md files which are required by index.md --- flang/docs/FIRLangRef.md | 3 +++ flang/docs/FlangCommandLineReference.md | 3 +++ flang/docs/conf.py | 10 +++++----- 3 files changed, 11 insertions(+), 5 deletions(-) create mode 100644 flang/docs/FIRLangRef.md create mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md new file mode 100644 index 0000000000000..8e4052f14fc7c --- /dev/null +++ b/flang/docs/FIRLangRef.md @@ -0,0 +1,3 @@ +# FIR Language Reference + +_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md new file mode 100644 index 0000000000000..ee8d7b83dc50c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.md @@ -0,0 +1,3 @@ +# Flang Command Line Reference + +_TODO: Add Flang CLI documentation._ diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 4fd81440c8176..7223661625689 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -10,6 +10,7 @@ # serve to show the default. from datetime import date + # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. 
@@ -28,16 +29,15 @@ "sphinx.ext.autodoc", ] -# When building man pages, we do not use the markdown pages, -# So, we can continue without the myst_parser dependencies. -# Doing so reduces dependencies of some packaged llvm distributions. + try: import myst_parser extensions.append("myst_parser") except ImportError: - if not tags.has("builder-man"): - raise + raise ImportError( + "myst_parser is required to build documentation, including man pages." + ) # Add any paths that contain templates here, relative to this directory. >From bee779e542853266d69e01d2e724636a219430dd Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 09:01:22 -0400 Subject: [PATCH 4/5] Remove placeholder .md files --- flang/docs/FIRLangRef.md | 3 --- flang/docs/FlangCommandLineReference.md | 3 --- 2 files changed, 6 deletions(-) delete mode 100644 flang/docs/FIRLangRef.md delete mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md deleted file mode 100644 index 8e4052f14fc7c..0000000000000 --- a/flang/docs/FIRLangRef.md +++ /dev/null @@ -1,3 +0,0 @@ -# FIR Language Reference - -_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md deleted file mode 100644 index ee8d7b83dc50c..0000000000000 --- a/flang/docs/FlangCommandLineReference.md +++ /dev/null @@ -1,3 +0,0 @@ -# Flang Command Line Reference - -_TODO: Add Flang CLI documentation._ >From 1d69ea595b1e3e2fd8532b36665171c8084a0e30 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Fri, 30 May 2025 14:10:36 -0400 Subject: [PATCH 5/5] Enable docs-flang-html to build --- flang/docs/conf.py | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 7223661625689..03f5973392d65 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -42,7 +42,10 @@ # Add any paths that contain templates here, relative to this directory. 
templates_path = ["_templates"] -source_suffix = [".md"] +source_suffix = { + ".rst": "restructuredtext", + ".md": "markdown", +} myst_heading_anchors = 6 import sphinx From flang-commits at lists.llvm.org Fri May 30 11:35:40 2025 From: flang-commits at lists.llvm.org (Leandro Lupori via flang-commits) Date: Fri, 30 May 2025 11:35:40 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Explicitly set Shared DSA in symbols (PR #142154) In-Reply-To: Message-ID: <6839fa7c.170a0220.276f58.d309@mx.google.com> https://github.com/luporl updated https://github.com/llvm/llvm-project/pull/142154 >From 38294dd11c7b12e793e42569f94d76a156a92cfe Mon Sep 17 00:00:00 2001 From: Leandro Lupori Date: Fri, 23 May 2025 11:25:16 -0300 Subject: [PATCH 1/2] [flang][OpenMP] Explicitly set Shared DSA in symbols Before this change, OmpShared was not always set in shared symbols. Instead, absence of private flags was interpreted as shared DSA. The problem was that symbols with no flags, with only a host association, could also mean "has same DSA as in the enclosing context". Now shared symbols behave the same as private and can be treated the same way. Because of the host association symbols with no flags mentioned above, it was also incorrect to simply test the flags of a given symbol to find out if it was private or shared. The function GetSymbolDSA() was added to fix this. It would be better to avoid the need of these special symbols, but this would require changes to how symbols are collected in lowering. Besides that, some semantic checks need to know if a DSA clause was used or not. To avoid confusing implicit symbols with DSA clauses a new flag was added: OmpExplicit. It is now set for all symbols with explicitly determined data-sharing attributes. With the changes above, AddToContextObjectWithDSA() and the symbol to DSA map could probably be removed and the DSA could be obtained directly from the symbol, but this was not attempted. 
Some debug messages were also added, with the "omp" DEBUG_TYPE, to make it easier to debug the creation of implicit symbols and to visualize all associations of a given symbol. Fixes #130533 --- flang/include/flang/Semantics/openmp-dsa.h | 20 ++ flang/include/flang/Semantics/symbol.h | 4 +- flang/lib/Lower/Bridge.cpp | 4 +- flang/lib/Semantics/CMakeLists.txt | 1 + flang/lib/Semantics/openmp-dsa.cpp | 29 ++ flang/lib/Semantics/resolve-directives.cpp | 301 +++++++++++++----- flang/test/Semantics/OpenMP/common-block.f90 | 6 +- flang/test/Semantics/OpenMP/copyprivate03.f90 | 12 + .../test/Semantics/OpenMP/default-clause.f90 | 6 +- .../Semantics/OpenMP/do05-positivecase.f90 | 6 +- flang/test/Semantics/OpenMP/do20.f90 | 2 +- flang/test/Semantics/OpenMP/forall.f90 | 4 +- flang/test/Semantics/OpenMP/implicit-dsa.f90 | 35 +- flang/test/Semantics/OpenMP/reduction08.f90 | 20 +- flang/test/Semantics/OpenMP/reduction09.f90 | 14 +- flang/test/Semantics/OpenMP/reduction11.f90 | 2 +- flang/test/Semantics/OpenMP/scan2.f90 | 4 +- flang/test/Semantics/OpenMP/symbol01.f90 | 12 +- flang/test/Semantics/OpenMP/symbol02.f90 | 8 +- flang/test/Semantics/OpenMP/symbol03.f90 | 8 +- flang/test/Semantics/OpenMP/symbol04.f90 | 4 +- flang/test/Semantics/OpenMP/symbol05.f90 | 2 +- flang/test/Semantics/OpenMP/symbol06.f90 | 2 +- flang/test/Semantics/OpenMP/symbol07.f90 | 4 +- flang/test/Semantics/OpenMP/symbol08.f90 | 36 +-- flang/test/Semantics/OpenMP/symbol09.f90 | 4 +- 26 files changed, 380 insertions(+), 170 deletions(-) create mode 100644 flang/include/flang/Semantics/openmp-dsa.h create mode 100644 flang/lib/Semantics/openmp-dsa.cpp diff --git a/flang/include/flang/Semantics/openmp-dsa.h b/flang/include/flang/Semantics/openmp-dsa.h new file mode 100644 index 0000000000000..4b94a679f29ef --- /dev/null +++ b/flang/include/flang/Semantics/openmp-dsa.h @@ -0,0 +1,20 @@ +//===-- include/flang/Semantics/openmp-dsa.h --------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the 
Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#ifndef FORTRAN_SEMANTICS_OPENMP_DSA_H_ +#define FORTRAN_SEMANTICS_OPENMP_DSA_H_ + +#include "flang/Semantics/symbol.h" + +namespace Fortran::semantics { + +Symbol::Flags GetSymbolDSA(const Symbol &symbol); + +} // namespace Fortran::semantics + +#endif // FORTRAN_SEMANTICS_OPENMP_DSA_H_ diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h index 4cded64d170cd..59920e08cc926 100644 --- a/flang/include/flang/Semantics/symbol.h +++ b/flang/include/flang/Semantics/symbol.h @@ -785,8 +785,8 @@ class Symbol { OmpAllocate, OmpDeclarativeAllocateDirective, OmpExecutableAllocateDirective, OmpDeclareSimd, OmpDeclareTarget, OmpThreadprivate, OmpDeclareReduction, OmpFlushed, OmpCriticalLock, - OmpIfSpecified, OmpNone, OmpPreDetermined, OmpImplicit, OmpDependObject, - OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction); + OmpIfSpecified, OmpNone, OmpPreDetermined, OmpExplicit, OmpImplicit, + OmpDependObject, OmpInclusiveScan, OmpExclusiveScan, OmpInScanReduction); using Flags = common::EnumSet; const Scope &owner() const { return *owner_; } diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp index c9e91cf3e8042..86d5e0d37bc38 100644 --- a/flang/lib/Lower/Bridge.cpp +++ b/flang/lib/Lower/Bridge.cpp @@ -58,6 +58,7 @@ #include "flang/Optimizer/Transforms/Passes.h" #include "flang/Parser/parse-tree.h" #include "flang/Runtime/iostat-consts.h" +#include "flang/Semantics/openmp-dsa.h" #include "flang/Semantics/runtime-type-info.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" @@ -1385,7 +1386,8 @@ class FirConverter : public Fortran::lower::AbstractConverter { if (isUnordered || sym.has() || sym.has()) { if (!shallowLookupSymbol(sym) && - 
!sym.test(Fortran::semantics::Symbol::Flag::OmpShared)) { + !GetSymbolDSA(sym).test( + Fortran::semantics::Symbol::Flag::OmpShared)) { // Do concurrent loop variables are not mapped yet since they are local // to the Do concurrent scope (same for OpenMP loops). mlir::OpBuilder::InsertPoint insPt = builder->saveInsertionPoint(); diff --git a/flang/lib/Semantics/CMakeLists.txt b/flang/lib/Semantics/CMakeLists.txt index bd8cc47365f06..18c89587843a9 100644 --- a/flang/lib/Semantics/CMakeLists.txt +++ b/flang/lib/Semantics/CMakeLists.txt @@ -32,6 +32,7 @@ add_flang_library(FortranSemantics dump-expr.cpp expression.cpp mod-file.cpp + openmp-dsa.cpp openmp-modifiers.cpp pointer-assignment.cpp program-tree.cpp diff --git a/flang/lib/Semantics/openmp-dsa.cpp b/flang/lib/Semantics/openmp-dsa.cpp new file mode 100644 index 0000000000000..48aa36febe5c5 --- /dev/null +++ b/flang/lib/Semantics/openmp-dsa.cpp @@ -0,0 +1,29 @@ +//===-- flang/lib/Semantics/openmp-dsa.cpp ----------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Semantics/openmp-dsa.h" + +namespace Fortran::semantics { + +Symbol::Flags GetSymbolDSA(const Symbol &symbol) { + Symbol::Flags dsaFlags{Symbol::Flag::OmpPrivate, + Symbol::Flag::OmpFirstPrivate, Symbol::Flag::OmpLastPrivate, + Symbol::Flag::OmpShared, Symbol::Flag::OmpLinear, + Symbol::Flag::OmpReduction}; + Symbol::Flags dsa{symbol.flags() & dsaFlags}; + if (dsa.any()) { + return dsa; + } + // If no DSA are set use those from the host associated symbol, if any. 
+ if (const auto *details{symbol.detailsIf()}) { + return GetSymbolDSA(details->symbol()); + } + return {}; +} + +} // namespace Fortran::semantics diff --git a/flang/lib/Semantics/resolve-directives.cpp b/flang/lib/Semantics/resolve-directives.cpp index 9fa7bc8964854..e604bca4213c8 100644 --- a/flang/lib/Semantics/resolve-directives.cpp +++ b/flang/lib/Semantics/resolve-directives.cpp @@ -19,9 +19,11 @@ #include "flang/Parser/parse-tree.h" #include "flang/Parser/tools.h" #include "flang/Semantics/expression.h" +#include "flang/Semantics/openmp-dsa.h" #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/symbol.h" #include "flang/Semantics/tools.h" +#include "llvm/Support/Debug.h" #include #include #include @@ -111,10 +113,9 @@ template class DirectiveAttributeVisitor { const parser::Name *GetLoopIndex(const parser::DoConstruct &); const parser::DoConstruct *GetDoConstructIf( const parser::ExecutionPartConstruct &); - Symbol *DeclareNewPrivateAccessEntity(const Symbol &, Symbol::Flag, Scope &); - Symbol *DeclarePrivateAccessEntity( - const parser::Name &, Symbol::Flag, Scope &); - Symbol *DeclarePrivateAccessEntity(Symbol &, Symbol::Flag, Scope &); + Symbol *DeclareNewAccessEntity(const Symbol &, Symbol::Flag, Scope &); + Symbol *DeclareAccessEntity(const parser::Name &, Symbol::Flag, Scope &); + Symbol *DeclareAccessEntity(Symbol &, Symbol::Flag, Scope &); Symbol *DeclareOrMarkOtherAccessEntity(const parser::Name &, Symbol::Flag); UnorderedSymbolSet dataSharingAttributeObjects_; // on one directive @@ -749,10 +750,11 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { Symbol::Flags ompFlagsRequireNewSymbol{Symbol::Flag::OmpPrivate, Symbol::Flag::OmpLinear, Symbol::Flag::OmpFirstPrivate, - Symbol::Flag::OmpLastPrivate, Symbol::Flag::OmpReduction, - Symbol::Flag::OmpCriticalLock, Symbol::Flag::OmpCopyIn, - Symbol::Flag::OmpUseDevicePtr, Symbol::Flag::OmpUseDeviceAddr, - Symbol::Flag::OmpIsDevicePtr, Symbol::Flag::OmpHasDeviceAddr}; + 
Symbol::Flag::OmpLastPrivate, Symbol::Flag::OmpShared, + Symbol::Flag::OmpReduction, Symbol::Flag::OmpCriticalLock, + Symbol::Flag::OmpCopyIn, Symbol::Flag::OmpUseDevicePtr, + Symbol::Flag::OmpUseDeviceAddr, Symbol::Flag::OmpIsDevicePtr, + Symbol::Flag::OmpHasDeviceAddr}; Symbol::Flags ompFlagsRequireMark{Symbol::Flag::OmpThreadprivate, Symbol::Flag::OmpDeclareTarget, Symbol::Flag::OmpExclusiveScan, @@ -829,8 +831,24 @@ class OmpAttributeVisitor : DirectiveAttributeVisitor { void IssueNonConformanceWarning( llvm::omp::Directive D, parser::CharBlock source); - void CreateImplicitSymbols( - const Symbol *symbol, std::optional setFlag = std::nullopt); + void CreateImplicitSymbols(const Symbol *symbol); + + void AddToContextObjectWithExplicitDSA(Symbol &symbol, Symbol::Flag flag) { + AddToContextObjectWithDSA(symbol, flag); + if (dataSharingAttributeFlags.test(flag)) { + symbol.set(Symbol::Flag::OmpExplicit); + } + } + + // Clear any previous data-sharing attribute flags and set the new ones. + // Needed when setting PreDetermined DSAs, that take precedence over + // Implicit ones. 
+ void SetSymbolDSA(Symbol &symbol, Symbol::Flags flags) { + symbol.flags() &= ~(dataSharingAttributeFlags | + Symbol::Flags{Symbol::Flag::OmpExplicit, Symbol::Flag::OmpImplicit, + Symbol::Flag::OmpPreDetermined}); + symbol.flags() |= flags; + } }; template @@ -867,7 +885,7 @@ const parser::DoConstruct *DirectiveAttributeVisitor::GetDoConstructIf( } template -Symbol *DirectiveAttributeVisitor::DeclareNewPrivateAccessEntity( +Symbol *DirectiveAttributeVisitor::DeclareNewAccessEntity( const Symbol &object, Symbol::Flag flag, Scope &scope) { assert(object.owner() != currScope()); auto &symbol{MakeAssocSymbol(object.name(), object, scope)}; @@ -880,20 +898,20 @@ Symbol *DirectiveAttributeVisitor::DeclareNewPrivateAccessEntity( } template -Symbol *DirectiveAttributeVisitor::DeclarePrivateAccessEntity( +Symbol *DirectiveAttributeVisitor::DeclareAccessEntity( const parser::Name &name, Symbol::Flag flag, Scope &scope) { if (!name.symbol) { return nullptr; // not resolved by Name Resolution step, do nothing } - name.symbol = DeclarePrivateAccessEntity(*name.symbol, flag, scope); + name.symbol = DeclareAccessEntity(*name.symbol, flag, scope); return name.symbol; } template -Symbol *DirectiveAttributeVisitor::DeclarePrivateAccessEntity( +Symbol *DirectiveAttributeVisitor::DeclareAccessEntity( Symbol &object, Symbol::Flag flag, Scope &scope) { if (object.owner() != currScope()) { - return DeclareNewPrivateAccessEntity(object, flag, scope); + return DeclareNewAccessEntity(object, flag, scope); } else { object.set(flag); return &object; @@ -1600,6 +1618,20 @@ void AccAttributeVisitor::CheckMultipleAppearances( } } +#ifndef NDEBUG + +#define DEBUG_TYPE "omp" + +static llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, const Symbol::Flags &flags); + +namespace dbg { +static void DumpAssocSymbols(llvm::raw_ostream &os, const Symbol &sym); +static std::string ScopeSourcePos(const Fortran::semantics::Scope &scope); +} // namespace dbg + +#endif + bool 
OmpAttributeVisitor::Pre(const parser::OpenMPBlockConstruct &x) { const auto &beginBlockDir{std::get(x.t)}; const auto &beginDir{std::get(beginBlockDir.t)}; @@ -1792,12 +1824,12 @@ void OmpAttributeVisitor::ResolveSeqLoopIndexInParallelOrTaskConstruct( } } } - // If this symbol is already Private or Firstprivate in the enclosing - // OpenMP parallel or task then there is nothing to do here. + // If this symbol already has an explicit data-sharing attribute in the + // enclosing OpenMP parallel or task then there is nothing to do here. if (auto *symbol{targetIt->scope.FindSymbol(iv.source)}) { if (symbol->owner() == targetIt->scope) { - if (symbol->test(Symbol::Flag::OmpPrivate) || - symbol->test(Symbol::Flag::OmpFirstPrivate)) { + if (symbol->test(Symbol::Flag::OmpExplicit) && + (symbol->flags() & dataSharingAttributeFlags).any()) { return; } } @@ -1806,7 +1838,8 @@ void OmpAttributeVisitor::ResolveSeqLoopIndexInParallelOrTaskConstruct( // parallel or task if (auto *symbol{ResolveOmp(iv, Symbol::Flag::OmpPrivate, targetIt->scope)}) { targetIt++; - symbol->set(Symbol::Flag::OmpPreDetermined); + SetSymbolDSA( + *symbol, {Symbol::Flag::OmpPreDetermined, Symbol::Flag::OmpPrivate}); iv.symbol = symbol; // adjust the symbol within region for (auto it{dirContext_.rbegin()}; it != targetIt; ++it) { AddToContextObjectWithDSA(*symbol, Symbol::Flag::OmpPrivate, *it); @@ -1918,7 +1951,7 @@ void OmpAttributeVisitor::PrivatizeAssociatedLoopIndexAndCheckLoopLevel( const parser::Name *iv{GetLoopIndex(*loop)}; if (iv) { if (auto *symbol{ResolveOmp(*iv, ivDSA, currScope())}) { - symbol->set(Symbol::Flag::OmpPreDetermined); + SetSymbolDSA(*symbol, {Symbol::Flag::OmpPreDetermined, ivDSA}); iv->symbol = symbol; // adjust the symbol within region AddToContextObjectWithDSA(*symbol, ivDSA); } @@ -2178,42 +2211,48 @@ static bool IsPrivatizable(const Symbol *sym) { misc->kind() != MiscDetails::Kind::ConstructName)); } -void OmpAttributeVisitor::CreateImplicitSymbols( - const Symbol *symbol, 
std::optional setFlag) { +void OmpAttributeVisitor::CreateImplicitSymbols(const Symbol *symbol) { if (!IsPrivatizable(symbol)) { return; } + LLVM_DEBUG(llvm::dbgs() << "CreateImplicitSymbols: " << *symbol << '\n'); + // Implicitly determined DSAs // OMP 5.2 5.1.1 - Variables Referenced in a Construct Symbol *lastDeclSymbol = nullptr; - std::optional prevDSA; + Symbol::Flags prevDSA; for (int dirDepth{0}; dirDepth < (int)dirContext_.size(); ++dirDepth) { DirContext &dirContext = dirContext_[dirDepth]; - std::optional dsa; + Symbol::Flags dsa; - for (auto symMap : dirContext.objectWithDSA) { - // if the `symbol` already has a data-sharing attribute - if (symMap.first->name() == symbol->name()) { - dsa = symMap.second; - break; + Scope &scope{context_.FindScope(dirContext.directiveSource)}; + auto it{scope.find(symbol->name())}; + if (it != scope.end()) { + // There is already a symbol in the current scope, use its DSA. + dsa = GetSymbolDSA(*it->second); + } else { + for (auto symMap : dirContext.objectWithDSA) { + if (symMap.first->name() == symbol->name()) { + // `symbol` already has a data-sharing attribute in the current + // context, use it. + dsa.set(symMap.second); + break; + } } } // When handling each implicit rule for a given symbol, one of the - // following 3 actions may be taken: - // 1. Declare a new private symbol. - // 2. Create a new association symbol with no flags, that will represent - // a shared symbol in the current scope. Note that symbols without - // any private flags are considered as shared. - // 3. Use the last declared private symbol, by inserting a new symbol - // in the scope being processed, associated with it. - // If no private symbol was declared previously, then no association - // is needed and the symbol from the enclosing scope will be - // inherited by the current one. + // following actions may be taken: + // 1. Declare a new private or shared symbol. + // 2. 
Use the last declared symbol, by inserting a new symbol in the + // scope being processed, associated with it. + // If no symbol was declared previously, then no association is needed + // and the symbol from the enclosing scope will be inherited by the + // current one. // // Because of how symbols are collected in lowering, not inserting a new - // symbol in the last case could lead to the conclusion that a symbol + // symbol in the second case could lead to the conclusion that a symbol // from an enclosing construct was declared in the current construct, // which would result in wrong privatization code being generated. // Consider the following example: @@ -2231,46 +2270,71 @@ void OmpAttributeVisitor::CreateImplicitSymbols( // it would have the private flag set. // This would make x appear to be defined in p2, causing it to be // privatized in p2 and its privatization in p1 to be skipped. - auto makePrivateSymbol = [&](Symbol::Flag flag) { + auto makeSymbol = [&](Symbol::Flags flags) { const Symbol *hostSymbol = lastDeclSymbol ? lastDeclSymbol : &symbol->GetUltimate(); - lastDeclSymbol = DeclareNewPrivateAccessEntity( + assert(flags.LeastElement()); + Symbol::Flag flag = *flags.LeastElement(); + lastDeclSymbol = DeclareNewAccessEntity( *hostSymbol, flag, context_.FindScope(dirContext.directiveSource)); - if (setFlag) { - lastDeclSymbol->set(*setFlag); - } + lastDeclSymbol->flags() |= flags; return lastDeclSymbol; }; - auto makeSharedSymbol = [&](std::optional flag = {}) { - const Symbol *hostSymbol = - lastDeclSymbol ? lastDeclSymbol : &symbol->GetUltimate(); - Symbol &assocSymbol = MakeAssocSymbol(symbol->name(), *hostSymbol, - context_.FindScope(dirContext.directiveSource)); - if (flag) { - assocSymbol.set(*flag); - } - }; auto useLastDeclSymbol = [&]() { if (lastDeclSymbol) { - makeSharedSymbol(); + const Symbol *hostSymbol = + lastDeclSymbol ? 
lastDeclSymbol : &symbol->GetUltimate(); + MakeAssocSymbol(symbol->name(), *hostSymbol, + context_.FindScope(dirContext.directiveSource)); } }; +#ifndef NDEBUG + auto printImplicitRule = [&](const char *id) { + LLVM_DEBUG(llvm::dbgs() << "\t" << id << ": dsa: " << dsa << '\n'); + LLVM_DEBUG( + llvm::dbgs() << "\t\tScope: " << dbg::ScopeSourcePos(scope) << '\n'); + }; +#define PRINT_IMPLICIT_RULE(id) printImplicitRule(id) +#else +#define PRINT_IMPLICIT_RULE(id) +#endif + bool taskGenDir = llvm::omp::taskGeneratingSet.test(dirContext.directive); bool targetDir = llvm::omp::allTargetSet.test(dirContext.directive); bool parallelDir = llvm::omp::allParallelSet.test(dirContext.directive); bool teamsDir = llvm::omp::allTeamsSet.test(dirContext.directive); - if (dsa.has_value()) { - if (dsa.value() == Symbol::Flag::OmpShared && - (parallelDir || taskGenDir || teamsDir)) { - makeSharedSymbol(Symbol::Flag::OmpShared); + if (dsa.any()) { + if (parallelDir || taskGenDir || teamsDir) { + Symbol *prevDeclSymbol{lastDeclSymbol}; + // NOTE As `dsa` will match that of the symbol in the current scope + // (if any), we won't override the DSA of any existing symbol. + if ((dsa & dataSharingAttributeFlags).any()) { + makeSymbol(dsa); + } + // Fix host association of explicit symbols, as they can be created + // before implicit ones in enclosing scope. + if (prevDeclSymbol && prevDeclSymbol != lastDeclSymbol && + lastDeclSymbol->test(Symbol::Flag::OmpExplicit)) { + const auto *hostAssoc{lastDeclSymbol->detailsIf()}; + if (hostAssoc && hostAssoc->symbol() != *prevDeclSymbol) { + lastDeclSymbol->set_details(HostAssocDetails{*prevDeclSymbol}); + } + } } - // Private symbols will have been declared already. prevDSA = dsa; + PRINT_IMPLICIT_RULE("0) already has DSA"); continue; } + // NOTE Because of how lowering uses OmpImplicit flag, we can only set it + // for symbols with private DSA. 
+ // Also, as the default clause is handled separately in lowering, + // don't mark its symbols with OmpImplicit either. + // Ideally, lowering should be changed and all implicit symbols + // should be marked with OmpImplicit. + if (dirContext.defaultDSA == Symbol::Flag::OmpPrivate || dirContext.defaultDSA == Symbol::Flag::OmpFirstPrivate || dirContext.defaultDSA == Symbol::Flag::OmpShared) { @@ -2279,33 +2343,34 @@ void OmpAttributeVisitor::CreateImplicitSymbols( if (!parallelDir && !taskGenDir && !teamsDir) { return; } - if (dirContext.defaultDSA != Symbol::Flag::OmpShared) { - makePrivateSymbol(dirContext.defaultDSA); - } else { - makeSharedSymbol(); - } - dsa = dirContext.defaultDSA; + dsa = {dirContext.defaultDSA}; + makeSymbol(dsa); + PRINT_IMPLICIT_RULE("1) default"); } else if (parallelDir) { // 2) parallel -> shared - makeSharedSymbol(); - dsa = Symbol::Flag::OmpShared; + dsa = {Symbol::Flag::OmpShared}; + makeSymbol(dsa); + PRINT_IMPLICIT_RULE("2) parallel"); } else if (!taskGenDir && !targetDir) { // 3) enclosing context - useLastDeclSymbol(); dsa = prevDSA; + useLastDeclSymbol(); + PRINT_IMPLICIT_RULE("3) enclosing context"); } else if (targetDir) { // TODO 4) not mapped target variable -> firstprivate dsa = prevDSA; } else if (taskGenDir) { // TODO 5) dummy arg in orphaned taskgen construct -> firstprivate - if (prevDSA == Symbol::Flag::OmpShared) { + if (prevDSA.test(Symbol::Flag::OmpShared)) { // 6) shared in enclosing context -> shared - makeSharedSymbol(); - dsa = Symbol::Flag::OmpShared; + dsa = {Symbol::Flag::OmpShared}; + makeSymbol(dsa); + PRINT_IMPLICIT_RULE("6) taskgen: shared"); } else { // 7) firstprivate - dsa = Symbol::Flag::OmpFirstPrivate; - makePrivateSymbol(*dsa)->set(Symbol::Flag::OmpImplicit); + dsa = {Symbol::Flag::OmpFirstPrivate}; + makeSymbol(dsa)->set(Symbol::Flag::OmpImplicit); + PRINT_IMPLICIT_RULE("7) taskgen: firstprivate"); } } prevDSA = dsa; @@ -2371,7 +2436,7 @@ void OmpAttributeVisitor::ResolveOmpName( if 
(ResolveName(&name)) { if (auto *resolvedSymbol{ResolveOmp(name, ompFlag, currScope())}) { if (dataSharingAttributeFlags.test(ompFlag)) { - AddToContextObjectWithDSA(*resolvedSymbol, ompFlag); + AddToContextObjectWithExplicitDSA(*resolvedSymbol, ompFlag); } } } else if (ompFlag == Symbol::Flag::OmpCriticalLock) { @@ -2484,7 +2549,7 @@ void OmpAttributeVisitor::ResolveOmpObject( if (dataCopyingAttributeFlags.test(ompFlag)) { CheckDataCopyingClause(*name, *symbol, ompFlag); } else { - AddToContextObjectWithDSA(*symbol, ompFlag); + AddToContextObjectWithExplicitDSA(*symbol, ompFlag); if (dataSharingAttributeFlags.test(ompFlag)) { CheckMultipleAppearances(*name, *symbol, ompFlag); } @@ -2588,8 +2653,14 @@ void OmpAttributeVisitor::ResolveOmpObject( GetContext().directive))) { for (Symbol::Flag ompFlag1 : dataMappingAttributeFlags) { for (Symbol::Flag ompFlag2 : dataSharingAttributeFlags) { - checkExclusivelists( - hostAssocSym, ompFlag1, symbol, ompFlag2); + if ((hostAssocSym->test(ompFlag2) && + hostAssocSym->test( + Symbol::Flag::OmpExplicit)) || + (symbol->test(ompFlag2) && + symbol->test(Symbol::Flag::OmpExplicit))) { + checkExclusivelists( + hostAssocSym, ompFlag1, symbol, ompFlag2); + } } } } @@ -2624,7 +2695,7 @@ void OmpAttributeVisitor::ResolveOmpObject( if (dataCopyingAttributeFlags.test(ompFlag)) { CheckDataCopyingClause(name, *resolvedObject, ompFlag); } else { - AddToContextObjectWithDSA(*resolvedObject, ompFlag); + AddToContextObjectWithExplicitDSA(*resolvedObject, ompFlag); } details.replace_object(*resolvedObject, index); } @@ -2643,7 +2714,7 @@ void OmpAttributeVisitor::ResolveOmpObject( Symbol *OmpAttributeVisitor::ResolveOmp( const parser::Name &name, Symbol::Flag ompFlag, Scope &scope) { if (ompFlagsRequireNewSymbol.test(ompFlag)) { - return DeclarePrivateAccessEntity(name, ompFlag, scope); + return DeclareAccessEntity(name, ompFlag, scope); } else { return DeclareOrMarkOtherAccessEntity(name, ompFlag); } @@ -2652,7 +2723,7 @@ Symbol 
*OmpAttributeVisitor::ResolveOmp( Symbol *OmpAttributeVisitor::ResolveOmp( Symbol &symbol, Symbol::Flag ompFlag, Scope &scope) { if (ompFlagsRequireNewSymbol.test(ompFlag)) { - return DeclarePrivateAccessEntity(symbol, ompFlag, scope); + return DeclareAccessEntity(symbol, ompFlag, scope); } else { return DeclareOrMarkOtherAccessEntity(symbol, ompFlag); } @@ -2831,10 +2902,16 @@ static bool IsSymbolThreadprivate(const Symbol &symbol) { } static bool IsSymbolPrivate(const Symbol &symbol) { - if (symbol.test(Symbol::Flag::OmpPrivate) || - symbol.test(Symbol::Flag::OmpFirstPrivate)) { + LLVM_DEBUG(llvm::dbgs() << "IsSymbolPrivate(" << symbol.name() << "):\n"); + LLVM_DEBUG(dbg::DumpAssocSymbols(llvm::dbgs(), symbol)); + + if (Symbol::Flags dsa{GetSymbolDSA(symbol)}; dsa.any()) { + if (dsa.test(Symbol::Flag::OmpShared)) { + return false; + } return true; } + // A symbol that has not gone through constructs that may privatize the // original symbol may be predetermined as private. // (OMP 5.2 5.1.1 - Variables Referenced in a Construct) @@ -3080,4 +3157,60 @@ void OmpAttributeVisitor::IssueNonConformanceWarning( context_.Warn(common::UsageWarning::OpenMPUsage, source, "%s"_warn_en_US, warnStrOS.str()); } + +#ifndef NDEBUG + +static llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, const Symbol::Flags &flags) { + flags.Dump(os, Symbol::EnumToString); + return os; +} + +namespace dbg { + +static llvm::raw_ostream &operator<<( + llvm::raw_ostream &os, std::optional srcPos) { + if (srcPos) { + os << *srcPos.value().path << ":" << srcPos.value().line << ": "; + } + return os; +} + +static std::optional GetSourcePosition( + const Fortran::semantics::Scope &scope, + const Fortran::parser::CharBlock &src) { + parser::AllCookedSources &allCookedSources{ + scope.context().allCookedSources()}; + if (std::optional prange{ + allCookedSources.GetProvenanceRange(src)}) { + return allCookedSources.allSources().GetSourcePosition(prange->start()); + } + return std::nullopt; +} + +// 
Returns a string containing the source location of `scope` followed by +// its first source line. +static std::string ScopeSourcePos(const Fortran::semantics::Scope &scope) { + const parser::CharBlock &sourceRange{scope.sourceRange()}; + std::string src{sourceRange.ToString()}; + size_t nl{src.find('\n')}; + std::string str; + llvm::raw_string_ostream ss{str}; + + ss << GetSourcePosition(scope, sourceRange) << src.substr(0, nl); + return str; +} + +static void DumpAssocSymbols(llvm::raw_ostream &os, const Symbol &sym) { + os << '\t' << sym << '\n'; + os << "\t\tOwner: " << ScopeSourcePos(sym.owner()) << '\n'; + if (const auto *details{sym.detailsIf()}) { + DumpAssocSymbols(os, details->symbol()); + } +} + +} // namespace dbg + +#endif + } // namespace Fortran::semantics diff --git a/flang/test/Semantics/OpenMP/common-block.f90 b/flang/test/Semantics/OpenMP/common-block.f90 index e1ddd120da857..93f29b12eacae 100644 --- a/flang/test/Semantics/OpenMP/common-block.f90 +++ b/flang/test/Semantics/OpenMP/common-block.f90 @@ -10,9 +10,9 @@ program main common /blk/ a, b, c !$omp parallel private(/blk/) !CHECK: OtherConstruct scope: size=0 alignment=1 - !CHECK: a (OmpPrivate): HostAssoc - !CHECK: b (OmpPrivate): HostAssoc - !CHECK: c (OmpPrivate): HostAssoc + !CHECK: a (OmpPrivate, OmpExplicit): HostAssoc + !CHECK: b (OmpPrivate, OmpExplicit): HostAssoc + !CHECK: c (OmpPrivate, OmpExplicit): HostAssoc call sub(a, b, c) !$omp end parallel end program diff --git a/flang/test/Semantics/OpenMP/copyprivate03.f90 b/flang/test/Semantics/OpenMP/copyprivate03.f90 index 9d39fdb6b13c8..fae190645b5e7 100644 --- a/flang/test/Semantics/OpenMP/copyprivate03.f90 +++ b/flang/test/Semantics/OpenMP/copyprivate03.f90 @@ -6,6 +6,8 @@ program omp_copyprivate integer :: a(10), b(10) + real, dimension(:), allocatable :: c + real, dimension(:), pointer :: d integer, save :: k !$omp threadprivate(k) @@ -43,4 +45,14 @@ program omp_copyprivate print *, a, b + !$omp task + !$omp parallel private(c, d) 
+ allocate(c(5)) + allocate(d(10)) + !$omp single + c = 22 + d = 33 + !$omp end single copyprivate(c, d) + !$omp end parallel + !$omp end task end program omp_copyprivate diff --git a/flang/test/Semantics/OpenMP/default-clause.f90 b/flang/test/Semantics/OpenMP/default-clause.f90 index 9cde77be2babe..d4c38ea56de53 100644 --- a/flang/test/Semantics/OpenMP/default-clause.f90 +++ b/flang/test/Semantics/OpenMP/default-clause.f90 @@ -15,8 +15,8 @@ program sample !CHECK: OtherConstruct scope: size=0 alignment=1 !CHECK: a (OmpPrivate): HostAssoc !CHECK: k (OmpPrivate): HostAssoc - !CHECK: x (OmpFirstPrivate): HostAssoc - !CHECK: y (OmpPrivate): HostAssoc + !CHECK: x (OmpFirstPrivate, OmpExplicit): HostAssoc + !CHECK: y (OmpPrivate, OmpExplicit): HostAssoc !CHECK: z (OmpPrivate): HostAssoc !$omp parallel default(private) !CHECK: OtherConstruct scope: size=0 alignment=1 @@ -34,7 +34,7 @@ program sample !$omp parallel default(firstprivate) shared(y) private(w) !CHECK: OtherConstruct scope: size=0 alignment=1 !CHECK: k (OmpFirstPrivate): HostAssoc - !CHECK: w (OmpPrivate): HostAssoc + !CHECK: w (OmpPrivate, OmpExplicit): HostAssoc !CHECK: z (OmpFirstPrivate): HostAssoc y = 30 w = 40 diff --git a/flang/test/Semantics/OpenMP/do05-positivecase.f90 b/flang/test/Semantics/OpenMP/do05-positivecase.f90 index 5e1b1b86f72f6..8481cb2fc2ca0 100644 --- a/flang/test/Semantics/OpenMP/do05-positivecase.f90 +++ b/flang/test/Semantics/OpenMP/do05-positivecase.f90 @@ -20,12 +20,12 @@ program omp_do !$omp parallel default(shared) !$omp do !DEF: /omp_do/OtherConstruct2/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) - !DEF: /omp_do/OtherConstruct2/n HostAssoc INTEGER(4) + !DEF: /omp_do/OtherConstruct2/OtherConstruct1/n HostAssoc INTEGER(4) do i=1,n !$omp parallel !$omp single !DEF: /work EXTERNAL (Subroutine) ProcEntity - !DEF: /omp_do/OtherConstruct2/OtherConstruct1/OtherConstruct1/i HostAssoc INTEGER(4) + !DEF: 
/omp_do/OtherConstruct2/OtherConstruct1/OtherConstruct1/OtherConstruct1/i HostAssoc INTEGER(4) call work(i, 1) !$omp end single !$omp end parallel @@ -34,7 +34,7 @@ program omp_do !$omp end parallel !$omp parallel private(i) - !DEF: /omp_do/OtherConstruct3/i (OmpPrivate) HostAssoc INTEGER(4) + !DEF: /omp_do/OtherConstruct3/i (OmpPrivate, OmpExplicit) HostAssoc INTEGER(4) do i=1,10 !$omp single print *, "hello" diff --git a/flang/test/Semantics/OpenMP/do20.f90 b/flang/test/Semantics/OpenMP/do20.f90 index 040a82079590f..ee305ad1a34cf 100644 --- a/flang/test/Semantics/OpenMP/do20.f90 +++ b/flang/test/Semantics/OpenMP/do20.f90 @@ -10,7 +10,7 @@ subroutine shared_iv !$omp parallel shared(i) !$omp single - !DEF: /shared_iv/OtherConstruct1/i (OmpShared) HostAssoc INTEGER(4) + !DEF: /shared_iv/OtherConstruct1/OtherConstruct1/i HostAssoc INTEGER(4) do i = 0, 1 end do !$omp end single diff --git a/flang/test/Semantics/OpenMP/forall.f90 b/flang/test/Semantics/OpenMP/forall.f90 index 58492664a4e85..b862b4b27641b 100644 --- a/flang/test/Semantics/OpenMP/forall.f90 +++ b/flang/test/Semantics/OpenMP/forall.f90 @@ -18,8 +18,8 @@ !$omp parallel !DEF: /MainProgram1/OtherConstruct1/Forall1/i (Implicit) ObjectEntity INTEGER(4) - !DEF: /MainProgram1/OtherConstruct1/a HostAssoc INTEGER(4) - !DEF: /MainProgram1/OtherConstruct1/b HostAssoc INTEGER(4) + !DEF: /MainProgram1/OtherConstruct1/a (OmpShared) HostAssoc INTEGER(4) + !DEF: /MainProgram1/OtherConstruct1/b (OmpShared) HostAssoc INTEGER(4) forall(i = 1:5) a(i) = b(i) * 2 !$omp end parallel diff --git a/flang/test/Semantics/OpenMP/implicit-dsa.f90 b/flang/test/Semantics/OpenMP/implicit-dsa.f90 index a7ed834b0f1c6..7e38435274b7b 100644 --- a/flang/test/Semantics/OpenMP/implicit-dsa.f90 +++ b/flang/test/Semantics/OpenMP/implicit-dsa.f90 @@ -14,15 +14,15 @@ subroutine implicit_dsa_test1 !$omp task private(y) shared(z) !DEF: /implicit_dsa_test1/OtherConstruct1/x (OmpFirstPrivate, OmpImplicit) HostAssoc INTEGER(4) - !DEF: 
/implicit_dsa_test1/OtherConstruct1/y (OmpPrivate) HostAssoc INTEGER(4) - !DEF: /implicit_dsa_test1/OtherConstruct1/z (OmpShared) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test1/OtherConstruct1/y (OmpPrivate, OmpExplicit) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test1/OtherConstruct1/z (OmpShared, OmpExplicit) HostAssoc INTEGER(4) x = y + z !$omp end task !$omp task default(shared) - !DEF: /implicit_dsa_test1/OtherConstruct2/x HostAssoc INTEGER(4) - !DEF: /implicit_dsa_test1/OtherConstruct2/y HostAssoc INTEGER(4) - !DEF: /implicit_dsa_test1/OtherConstruct2/z HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test1/OtherConstruct2/x (OmpShared) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test1/OtherConstruct2/y (OmpShared) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test1/OtherConstruct2/z (OmpShared) HostAssoc INTEGER(4) x = y + z !$omp end task @@ -61,16 +61,16 @@ subroutine implicit_dsa_test3 !$omp parallel !$omp task - !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct1/x HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct1/x (OmpShared) HostAssoc INTEGER(4) x = 1 - !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct1/y HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct1/y (OmpShared) HostAssoc INTEGER(4) y = 1 !$omp end task !$omp task firstprivate(x) - !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct2/x (OmpFirstPrivate) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct2/x (OmpFirstPrivate, OmpExplicit) HostAssoc INTEGER(4) x = 1 - !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct2/z HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test3/OtherConstruct1/OtherConstruct2/z (OmpShared) HostAssoc INTEGER(4) z = 1 !$omp end task !$omp end parallel @@ -110,7 +110,7 @@ subroutine implicit_dsa_test5 !$omp parallel default(private) !$omp task !$omp parallel - !DEF: /implicit_dsa_test5/OtherConstruct1/OtherConstruct1/OtherConstruct1/x HostAssoc INTEGER(4) + !DEF: 
/implicit_dsa_test5/OtherConstruct1/OtherConstruct1/OtherConstruct1/x (OmpShared) HostAssoc INTEGER(4) x = 1 !$omp end parallel !$omp end task @@ -133,7 +133,7 @@ subroutine implicit_dsa_test6 !$omp end parallel !$omp parallel default(firstprivate) shared(y) - !DEF: /implicit_dsa_test6/OtherConstruct1/OtherConstruct2/y (OmpShared) HostAssoc INTEGER(4) + !DEF: /implicit_dsa_test6/OtherConstruct1/OtherConstruct2/y (OmpShared, OmpExplicit) HostAssoc INTEGER(4) !DEF: /implicit_dsa_test6/OtherConstruct1/OtherConstruct2/x (OmpFirstPrivate) HostAssocINTEGER(4) !DEF: /implicit_dsa_test6/OtherConstruct1/OtherConstruct2/z (OmpFirstPrivate) HostAssocINTEGER(4) y = x + z @@ -156,3 +156,16 @@ subroutine implicit_dsa_test7 !$omp end taskgroup !$omp end task end subroutine + +! Predetermined loop iteration variable. +!DEF: /implicit_dsa_test8 (Subroutine) Subprogram +subroutine implicit_dsa_test8 + !DEF: /implicit_dsa_test8/i ObjectEntity INTEGER(4) + integer i + + !$omp task + !DEF: /implicit_dsa_test8/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) + do i = 1, 10 + end do + !$omp end task +end subroutine diff --git a/flang/test/Semantics/OpenMP/reduction08.f90 b/flang/test/Semantics/OpenMP/reduction08.f90 index 9442fbd4d5978..01a06eb7d7414 100644 --- a/flang/test/Semantics/OpenMP/reduction08.f90 +++ b/flang/test/Semantics/OpenMP/reduction08.f90 @@ -13,9 +13,9 @@ program omp_reduction !$omp parallel do reduction(max:k) !DEF: /omp_reduction/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct1/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct1/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) !DEF: /omp_reduction/max ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /omp_reduction/OtherConstruct1/m HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct1/m (OmpShared) HostAssoc INTEGER(4) k = max(k, m) end do !$omp end parallel do @@ -23,9 +23,9 @@ program 
omp_reduction !$omp parallel do reduction(min:k) !DEF: /omp_reduction/OtherConstruct2/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct2/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct2/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) !DEF: /omp_reduction/min ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /omp_reduction/OtherConstruct2/m HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct2/m (OmpShared) HostAssoc INTEGER(4) k = min(k, m) end do !$omp end parallel do @@ -33,9 +33,9 @@ program omp_reduction !$omp parallel do reduction(iand:k) !DEF: /omp_reduction/OtherConstruct3/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct3/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct3/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) !DEF: /omp_reduction/iand ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /omp_reduction/OtherConstruct3/m HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct3/m (OmpShared) HostAssoc INTEGER(4) k = iand(k, m) end do !$omp end parallel do @@ -43,9 +43,9 @@ program omp_reduction !$omp parallel do reduction(ior:k) !DEF: /omp_reduction/OtherConstruct4/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct4/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct4/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) !DEF: /omp_reduction/ior ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /omp_reduction/OtherConstruct4/m HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct4/m (OmpShared) HostAssoc INTEGER(4) k = ior(k, m) end do !$omp end parallel do @@ -53,9 +53,9 @@ program omp_reduction !$omp parallel do reduction(ieor:k) !DEF: /omp_reduction/OtherConstruct5/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct5/k (OmpReduction) 
HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct5/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) !DEF: /omp_reduction/ieor ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /omp_reduction/OtherConstruct5/m HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct5/m (OmpShared) HostAssoc INTEGER(4) k = ieor(k,m) end do !$omp end parallel do diff --git a/flang/test/Semantics/OpenMP/reduction09.f90 b/flang/test/Semantics/OpenMP/reduction09.f90 index 1af2fc4fd9691..d6c71c30d2834 100644 --- a/flang/test/Semantics/OpenMP/reduction09.f90 +++ b/flang/test/Semantics/OpenMP/reduction09.f90 @@ -16,7 +16,7 @@ program omp_reduction !$omp do reduction(+:k) !DEF: /omp_reduction/OtherConstruct1/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct1/OtherConstruct1/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct1/OtherConstruct1/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) k = k+1 end do !$omp end do @@ -26,7 +26,7 @@ program omp_reduction !$omp parallel do reduction(+:a(10)) !DEF: /omp_reduction/OtherConstruct2/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct2/k HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct2/k (OmpShared) HostAssoc INTEGER(4) k = k+1 end do !$omp end parallel do @@ -35,7 +35,7 @@ program omp_reduction !$omp parallel do reduction(+:a(1:10:1)) !DEF: /omp_reduction/OtherConstruct3/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct3/k HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct3/k (OmpShared) HostAssoc INTEGER(4) k = k+1 end do !$omp end parallel do @@ -43,7 +43,7 @@ program omp_reduction !$omp parallel do reduction(+:b(1:10:1,1:5,2)) !DEF: /omp_reduction/OtherConstruct4/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct4/k HostAssoc INTEGER(4) + !DEF: 
/omp_reduction/OtherConstruct4/k (OmpShared) HostAssoc INTEGER(4) k = k+1 end do !$omp end parallel do @@ -51,7 +51,7 @@ program omp_reduction !$omp parallel do reduction(+:b(1:10:1,1:5,2:5:1)) !DEF: /omp_reduction/OtherConstruct5/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct5/k HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct5/k (OmpShared) HostAssoc INTEGER(4) k = k+1 end do !$omp end parallel do @@ -60,7 +60,7 @@ program omp_reduction !$omp do reduction(+:k) reduction(+:j) !DEF: /omp_reduction/OtherConstruct6/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct6/OtherConstruct1/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct6/OtherConstruct1/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) k = k+1 end do !$omp end do @@ -69,7 +69,7 @@ program omp_reduction !$omp do reduction(+:k) reduction(*:j) reduction(+:l) !DEF: /omp_reduction/OtherConstruct7/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /omp_reduction/OtherConstruct7/k (OmpReduction) HostAssoc INTEGER(4) + !DEF: /omp_reduction/OtherConstruct7/k (OmpReduction, OmpExplicit) HostAssoc INTEGER(4) k = k+1 end do !$omp end do diff --git a/flang/test/Semantics/OpenMP/reduction11.f90 b/flang/test/Semantics/OpenMP/reduction11.f90 index 3893fe70b407f..b2ad0f6a6ee11 100644 --- a/flang/test/Semantics/OpenMP/reduction11.f90 +++ b/flang/test/Semantics/OpenMP/reduction11.f90 @@ -12,7 +12,7 @@ program omp_reduction ! CHECK: OtherConstruct scope ! CHECK: i (OmpPrivate, OmpPreDetermined): HostAssoc - ! CHECK: k (OmpReduction): HostAssoc + ! CHECK: k (OmpReduction, OmpExplicit): HostAssoc ! 
CHECK: max, INTRINSIC: ProcEntity !$omp parallel do reduction(max:k) do i=1,10 diff --git a/flang/test/Semantics/OpenMP/scan2.f90 b/flang/test/Semantics/OpenMP/scan2.f90 index 5232e63aa6b4f..ffe84910f88a2 100644 --- a/flang/test/Semantics/OpenMP/scan2.f90 +++ b/flang/test/Semantics/OpenMP/scan2.f90 @@ -12,13 +12,13 @@ program omp_reduction ! CHECK: OtherConstruct scope ! CHECK: i (OmpPrivate, OmpPreDetermined): HostAssoc - ! CHECK: k (OmpReduction, OmpInclusiveScan, OmpInScanReduction): HostAssoc + ! CHECK: k (OmpReduction, OmpExplicit, OmpInclusiveScan, OmpInScanReduction): HostAssoc !$omp parallel do reduction(inscan, +:k) do i=1,10 !$omp scan inclusive(k) end do !$omp end parallel do - ! CHECK: m (OmpReduction, OmpExclusiveScan, OmpInScanReduction): HostAssoc + ! CHECK: m (OmpReduction, OmpExplicit, OmpExclusiveScan, OmpInScanReduction): HostAssoc !$omp parallel do reduction(inscan, +:m) do i=1,10 !$omp scan exclusive(m) diff --git a/flang/test/Semantics/OpenMP/symbol01.f90 b/flang/test/Semantics/OpenMP/symbol01.f90 index a40a8563fde1f..595b6b89c84fd 100644 --- a/flang/test/Semantics/OpenMP/symbol01.f90 +++ b/flang/test/Semantics/OpenMP/symbol01.f90 @@ -47,22 +47,22 @@ program mm !$omp parallel do private(a,t,/c/) shared(c) !DEF: /mm/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /mm/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(4) - !DEF: /mm/OtherConstruct1/b HostAssoc INTEGER(4) + !DEF: /mm/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(4) + !DEF: /mm/OtherConstruct1/b (OmpShared) HostAssoc INTEGER(4) !REF: /mm/OtherConstruct1/i a = a+b(i) - !DEF: /mm/OtherConstruct1/t (OmpPrivate) HostAssoc TYPE(myty) + !DEF: /mm/OtherConstruct1/t (OmpPrivate, OmpExplicit) HostAssoc TYPE(myty) !REF: /md/myty/a !REF: /mm/OtherConstruct1/i t%a = i - !DEF: /mm/OtherConstruct1/y (OmpPrivate) HostAssoc REAL(4) + !DEF: /mm/OtherConstruct1/y (OmpPrivate, OmpExplicit) HostAssoc REAL(4) y = 0. 
- !DEF: /mm/OtherConstruct1/x (OmpPrivate) HostAssoc REAL(4) + !DEF: /mm/OtherConstruct1/x (OmpPrivate, OmpExplicit) HostAssoc REAL(4) !REF: /mm/OtherConstruct1/a !REF: /mm/OtherConstruct1/i !REF: /mm/OtherConstruct1/y x = a+i+y - !DEF: /mm/OtherConstruct1/c (OmpShared) HostAssoc REAL(4) + !DEF: /mm/OtherConstruct1/c (OmpShared, OmpExplicit) HostAssoc REAL(4) c = 3.0 end do end program diff --git a/flang/test/Semantics/OpenMP/symbol02.f90 b/flang/test/Semantics/OpenMP/symbol02.f90 index 31d9cb2e46ba8..9007da042845a 100644 --- a/flang/test/Semantics/OpenMP/symbol02.f90 +++ b/flang/test/Semantics/OpenMP/symbol02.f90 @@ -11,13 +11,13 @@ !DEF: /MainProgram1/c (Implicit) ObjectEntity REAL(4) c = 0 !$omp parallel private(a,b) shared(c,d) - !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(4) a = 3. - !DEF: /MainProgram1/OtherConstruct1/b (OmpPrivate) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/b (OmpPrivate, OmpExplicit) HostAssoc REAL(4) b = 4 - !DEF: /MainProgram1/OtherConstruct1/c (OmpShared) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/c (OmpShared, OmpExplicit) HostAssoc REAL(4) c = 5 - !DEF: /MainProgram1/OtherConstruct1/d (OmpShared) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/d (OmpShared, OmpExplicit) HostAssoc REAL(4) d = 6 !$omp end parallel !DEF: /MainProgram1/a (Implicit) ObjectEntity REAL(4) diff --git a/flang/test/Semantics/OpenMP/symbol03.f90 b/flang/test/Semantics/OpenMP/symbol03.f90 index 08defb40e56a7..d67c1fdf333c4 100644 --- a/flang/test/Semantics/OpenMP/symbol03.f90 +++ b/flang/test/Semantics/OpenMP/symbol03.f90 @@ -7,14 +7,14 @@ !DEF: /MainProgram1/b (Implicit) ObjectEntity REAL(4) b = 2 !$omp parallel private(a) shared(b) - !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(4) a = 3. 
- !DEF: /MainProgram1/OtherConstruct1/b (OmpShared) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/b (OmpShared, OmpExplicit) HostAssoc REAL(4) b = 4 !$omp parallel private(b) shared(a) - !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/a (OmpShared) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/a (OmpShared, OmpExplicit) HostAssoc REAL(4) a = 5. - !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/b (OmpPrivate) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/b (OmpPrivate, OmpExplicit) HostAssoc REAL(4) b = 6 !$omp end parallel !$omp end parallel diff --git a/flang/test/Semantics/OpenMP/symbol04.f90 b/flang/test/Semantics/OpenMP/symbol04.f90 index 808d1e0dd09be..834b166266376 100644 --- a/flang/test/Semantics/OpenMP/symbol04.f90 +++ b/flang/test/Semantics/OpenMP/symbol04.f90 @@ -9,12 +9,12 @@ !REF: /MainProgram1/a a = 3.14 !$omp parallel private(a) - !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(8) + !DEF: /MainProgram1/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(8) a = 2. !$omp do private(a) !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(8) + !DEF: /MainProgram1/OtherConstruct1/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(8) a = 1. 
end do !$omp end parallel diff --git a/flang/test/Semantics/OpenMP/symbol05.f90 b/flang/test/Semantics/OpenMP/symbol05.f90 index 1ad0c10a40135..fe01f15d20aa3 100644 --- a/flang/test/Semantics/OpenMP/symbol05.f90 +++ b/flang/test/Semantics/OpenMP/symbol05.f90 @@ -15,7 +15,7 @@ subroutine foo !DEF: /mm/foo/a ObjectEntity INTEGER(4) integer :: a = 3 !$omp parallel - !DEF: /mm/foo/OtherConstruct1/a HostAssoc INTEGER(4) + !DEF: /mm/foo/OtherConstruct1/a (OmpShared) HostAssoc INTEGER(4) a = 1 !DEF: /mm/i PUBLIC (Implicit, OmpThreadprivate) ObjectEntity INTEGER(4) !REF: /mm/foo/OtherConstruct1/a diff --git a/flang/test/Semantics/OpenMP/symbol06.f90 b/flang/test/Semantics/OpenMP/symbol06.f90 index 906264eb12642..daf3874b79af6 100644 --- a/flang/test/Semantics/OpenMP/symbol06.f90 +++ b/flang/test/Semantics/OpenMP/symbol06.f90 @@ -10,7 +10,7 @@ !$omp parallel do firstprivate(a) lastprivate(a) !DEF: /MainProgram1/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,10 - !DEF: /MainProgram1/OtherConstruct1/a (OmpFirstPrivate, OmpLastPrivate) HostAssoc REAL(4) + !DEF: /MainProgram1/OtherConstruct1/a (OmpFirstPrivate, OmpLastPrivate, OmpExplicit) HostAssoc REAL(4) a = 2. end do end program diff --git a/flang/test/Semantics/OpenMP/symbol07.f90 b/flang/test/Semantics/OpenMP/symbol07.f90 index a375942ebb1d9..86b7305411347 100644 --- a/flang/test/Semantics/OpenMP/symbol07.f90 +++ b/flang/test/Semantics/OpenMP/symbol07.f90 @@ -21,9 +21,9 @@ subroutine function_call_in_region !DEF: /function_call_in_region/b ObjectEntity REAL(4) real :: b = 5. 
!$omp parallel default(none) private(a) shared(b) - !DEF: /function_call_in_region/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(4) + !DEF: /function_call_in_region/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(4) !REF: /function_call_in_region/foo - !DEF: /function_call_in_region/OtherConstruct1/b (OmpShared) HostAssoc REAL(4) + !DEF: /function_call_in_region/OtherConstruct1/b (OmpShared, OmpExplicit) HostAssoc REAL(4) a = foo(b) !$omp end parallel !REF: /function_call_in_region/a diff --git a/flang/test/Semantics/OpenMP/symbol08.f90 b/flang/test/Semantics/OpenMP/symbol08.f90 index 80ae1c6d2242b..545bccc86b068 100644 --- a/flang/test/Semantics/OpenMP/symbol08.f90 +++ b/flang/test/Semantics/OpenMP/symbol08.f90 @@ -28,19 +28,19 @@ subroutine test_do !DEF: /test_do/k ObjectEntity INTEGER(4) integer i, j, k !$omp parallel - !DEF: /test_do/OtherConstruct1/i HostAssoc INTEGER(4) + !DEF: /test_do/OtherConstruct1/i (OmpShared) HostAssoc INTEGER(4) i = 99 !$omp do collapse(2) !DEF: /test_do/OtherConstruct1/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,5 !DEF: /test_do/OtherConstruct1/OtherConstruct1/j (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do j=6,10 - !DEF: /test_do/OtherConstruct1/a HostAssoc REAL(4) + !DEF: /test_do/OtherConstruct1/OtherConstruct1/a HostAssoc REAL(4) a(1,1,1) = 0. !DEF: /test_do/OtherConstruct1/k (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do k=11,15 - !REF: /test_do/OtherConstruct1/a - !REF: /test_do/OtherConstruct1/k + !REF: /test_do/OtherConstruct1/OtherConstruct1/a + !DEF: /test_do/OtherConstruct1/OtherConstruct1/k HostAssoc INTEGER(4) !REF: /test_do/OtherConstruct1/OtherConstruct1/j !REF: /test_do/OtherConstruct1/OtherConstruct1/i a(k,j,i) = 1. 
@@ -65,9 +65,9 @@ subroutine test_pardo do i=1,5 !DEF: /test_pardo/OtherConstruct1/j (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do j=6,10 - !DEF: /test_pardo/OtherConstruct1/a HostAssoc REAL(4) + !DEF: /test_pardo/OtherConstruct1/a (OmpShared) HostAssoc REAL(4) a(1,1,1) = 0. - !DEF: /test_pardo/OtherConstruct1/k (OmpPrivate) HostAssoc INTEGER(4) + !DEF: /test_pardo/OtherConstruct1/k (OmpPrivate, OmpExplicit) HostAssoc INTEGER(4) do k=11,15 !REF: /test_pardo/OtherConstruct1/a !REF: /test_pardo/OtherConstruct1/k @@ -91,7 +91,7 @@ subroutine test_taskloop !$omp taskloop private(j) !DEF: /test_taskloop/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do i=1,5 - !DEF: /test_taskloop/OtherConstruct1/j (OmpPrivate) HostAssoc INTEGER(4) + !DEF: /test_taskloop/OtherConstruct1/j (OmpPrivate, OmpExplicit) HostAssoc INTEGER(4) !REF: /test_taskloop/OtherConstruct1/i do j=1,i !DEF: /test_taskloop/OtherConstruct1/a (OmpFirstPrivate, OmpImplicit) HostAssoc REAL(4) @@ -139,15 +139,15 @@ subroutine dotprod (b, c, n, block_size, num_teams, block_threads) do i0=1,n,block_size !$omp parallel do reduction(+: sum) !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/i (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/i0 HostAssoc INTEGER(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/i0 (OmpShared) HostAssoc INTEGER(4) !DEF: /dotprod/min ELEMENTAL, INTRINSIC, PURE (Function) ProcEntity - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/block_size HostAssoc INTEGER(4) - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/n HostAssoc INTEGER(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/block_size (OmpShared) HostAssoc INTEGER(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/n (OmpShared) 
HostAssoc INTEGER(4) do i=i0,min(i0+block_size, n) - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/sum (OmpReduction) HostAssoc REAL(4) - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/b HostAssoc REAL(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/sum (OmpReduction, OmpExplicit) HostAssoc REAL(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/b (OmpShared) HostAssoc REAL(4) !REF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/i - !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/c HostAssoc REAL(4) + !DEF: /dotprod/OtherConstruct1/OtherConstruct1/OtherConstruct1/OtherConstruct1/c (OmpShared) HostAssoc REAL(4) sum = sum+b(i)*c(i) end do end do @@ -175,7 +175,7 @@ subroutine test_simd do j=6,10 !DEF: /test_simd/OtherConstruct1/k (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do k=11,15 - !DEF: /test_simd/OtherConstruct1/a HostAssoc REAL(4) + !DEF: /test_simd/OtherConstruct1/a (OmpShared) HostAssoc REAL(4) !REF: /test_simd/OtherConstruct1/k !REF: /test_simd/OtherConstruct1/j !REF: /test_simd/OtherConstruct1/i @@ -202,7 +202,7 @@ subroutine test_simd_multi do j=6,10 !DEF: /test_simd_multi/OtherConstruct1/k (OmpLastPrivate, OmpPreDetermined) HostAssoc INTEGER(4) do k=11,15 - !DEF: /test_simd_multi/OtherConstruct1/a HostAssoc REAL(4) + !DEF: /test_simd_multi/OtherConstruct1/a (OmpShared) HostAssoc REAL(4) !REF: /test_simd_multi/OtherConstruct1/k !REF: /test_simd_multi/OtherConstruct1/j !REF: /test_simd_multi/OtherConstruct1/i @@ -224,11 +224,11 @@ subroutine test_seq_loop !REF: /test_seq_loop/j j = -1 !$omp parallel - !DEF: /test_seq_loop/OtherConstruct1/i HostAssoc INTEGER(4) - !DEF: /test_seq_loop/OtherConstruct1/j HostAssoc INTEGER(4) + !DEF: /test_seq_loop/OtherConstruct1/i (OmpShared) HostAssoc INTEGER(4) + !DEF: /test_seq_loop/OtherConstruct1/j (OmpShared) HostAssoc 
INTEGER(4) print *, i, j !$omp parallel - !DEF: /test_seq_loop/OtherConstruct1/OtherConstruct1/i HostAssoc INTEGER(4) + !DEF: /test_seq_loop/OtherConstruct1/OtherConstruct1/i (OmpShared) HostAssoc INTEGER(4) !DEF: /test_seq_loop/OtherConstruct1/OtherConstruct1/j (OmpPrivate, OmpPreDetermined) HostAssoc INTEGER(4) print *, i, j !$omp do diff --git a/flang/test/Semantics/OpenMP/symbol09.f90 b/flang/test/Semantics/OpenMP/symbol09.f90 index a375942ebb1d9..86b7305411347 100644 --- a/flang/test/Semantics/OpenMP/symbol09.f90 +++ b/flang/test/Semantics/OpenMP/symbol09.f90 @@ -21,9 +21,9 @@ subroutine function_call_in_region !DEF: /function_call_in_region/b ObjectEntity REAL(4) real :: b = 5. !$omp parallel default(none) private(a) shared(b) - !DEF: /function_call_in_region/OtherConstruct1/a (OmpPrivate) HostAssoc REAL(4) + !DEF: /function_call_in_region/OtherConstruct1/a (OmpPrivate, OmpExplicit) HostAssoc REAL(4) !REF: /function_call_in_region/foo - !DEF: /function_call_in_region/OtherConstruct1/b (OmpShared) HostAssoc REAL(4) + !DEF: /function_call_in_region/OtherConstruct1/b (OmpShared, OmpExplicit) HostAssoc REAL(4) a = foo(b) !$omp end parallel !REF: /function_call_in_region/a >From 6c89ac271e4c3b4bfad0636fdaaf07a1004ad8a8 Mon Sep 17 00:00:00 2001 From: Leandro Lupori Date: Fri, 30 May 2025 15:33:44 -0300 Subject: [PATCH 2/2] Add a comment explaining a new test --- flang/test/Semantics/OpenMP/copyprivate03.f90 | 2 ++ 1 file changed, 2 insertions(+) diff --git a/flang/test/Semantics/OpenMP/copyprivate03.f90 b/flang/test/Semantics/OpenMP/copyprivate03.f90 index fae190645b5e7..249b27e4b2b86 100644 --- a/flang/test/Semantics/OpenMP/copyprivate03.f90 +++ b/flang/test/Semantics/OpenMP/copyprivate03.f90 @@ -52,6 +52,8 @@ program omp_copyprivate !$omp single c = 22 d = 33 + !Check that 'c' and 'd' inherit PRIVATE DSA from the enclosing PARALLEL + !and no error occurs. 
 !$omp end single copyprivate(c, d) !$omp end parallel !$omp end task From flang-commits at lists.llvm.org Fri May 30 11:37:51 2025 From: flang-commits at lists.llvm.org (Leandro Lupori via flang-commits) Date: Fri, 30 May 2025 11:37:51 -0700 (PDT) Subject: [flang-commits] [flang] [flang][OpenMP] Explicitly set Shared DSA in symbols (PR #142154) In-Reply-To: Message-ID: <6839faff.050a0220.c94e2.8db3@mx.google.com> luporl wrote: > Do you plan to follow up with the lowering changes required to clean this up? Yes, but it may take a while. https://github.com/llvm/llvm-project/pull/142154 From flang-commits at lists.llvm.org Fri May 30 11:53:15 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Fri, 30 May 2025 11:53:15 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <6839fe9b.170a0220.2e47e7.880a@mx.google.com> snarang181 wrote: Hi @kiranchandramohan @tarunprabhu, with my latest commit the html target builds without any warnings. When I trigger the man build via `ninja docs-flang-man`, I see ``` WARNING: toctree contains reference to nonexisting document 'FIRLangRef' WARNING: toctree contains reference to nonexisting document 'FlangCommandLineReference' ``` If we build with `-DSPHINX_WARNINGS_AS_ERRORS=OFF`, the man page builds successfully. Without that flag, the warnings are treated as fatal errors and the man build fails. Should we add this as a TODO and open a separate issue for it? I think this patch is tailored to the missing `man_config` issue and `man` not being built at all. Let me know what you think. 
https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Fri May 30 12:05:28 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 30 May 2025 12:05:28 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix corner case of defined component assignment (PR #142201) Message-ID: https://github.com/klausler created https://github.com/llvm/llvm-project/pull/142201 For componentwise assignment in intrinsic derived type assignment, the runtime type information's special binding table is currently populated only with type-bound ASSIGNMENT(=) procedures that have the same derived type for both arguments. This restriction excludes all defined assignments for cases that cannot arise in this context, like defined assignments from intrinsic types or incompatible derived types. However, this restriction also excludes defined assignments from distinct but compatible derived types, i.e. ancestors. Loosen it a little to allow them. Fixes https://github.com/llvm/llvm-project/issues/142151. >From 18e580d37bef07e0b805497c04a21700bafc014a Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 30 May 2025 11:59:44 -0700 Subject: [PATCH] [flang] Fix corner case of defined component assignment For componentwise assignment in intrinsic derived type assignment, the runtime type information's special binding table is currently populated only with type-bound ASSIGNMENT(=) procedures that have the same derived type for both arguments. This restriction excludes all defined assignments for cases that cannot arise in this context, like defined assignments from intrinsic types or incompatible derived types. However, this restriction also excludes defined assignments from distinct but compatible derived types, i.e. ancestors. Loosen it a little to allow them. Fixes https://github.com/llvm/llvm-project/issues/142151. 
--- flang/lib/Semantics/runtime-type-info.cpp | 12 ++++++--- flang/test/Semantics/typeinfo01.f90 | 33 +++++++++++++++++++++-- 2 files changed, 39 insertions(+), 6 deletions(-) diff --git a/flang/lib/Semantics/runtime-type-info.cpp b/flang/lib/Semantics/runtime-type-info.cpp index 98295f3705a71..378697237ba9b 100644 --- a/flang/lib/Semantics/runtime-type-info.cpp +++ b/flang/lib/Semantics/runtime-type-info.cpp @@ -1121,10 +1121,10 @@ void RuntimeTableBuilder::DescribeSpecialProc( int argThatMightBeDescriptor{0}; MaybeExpr which; if (isAssignment) { - // Only type-bound asst's with the same type on both dummy arguments + // Only type-bound asst's with compatible types on both dummy arguments // are germane to the runtime, which needs only these to implement // component assignment as part of intrinsic assignment. - // Non-type-bound generic INTERFACEs and assignments from distinct + // Non-type-bound generic INTERFACEs and assignments from incompatible // types must not be used for component intrinsic assignment. CHECK(proc->dummyArguments.size() == 2); const auto t1{ @@ -1137,8 +1137,12 @@ void RuntimeTableBuilder::DescribeSpecialProc( .type.type()}; if (!binding || t1.category() != TypeCategory::Derived || t2.category() != TypeCategory::Derived || - t1.IsUnlimitedPolymorphic() || t2.IsUnlimitedPolymorphic() || - t1.GetDerivedTypeSpec() != t2.GetDerivedTypeSpec()) { + t1.IsUnlimitedPolymorphic() || t2.IsUnlimitedPolymorphic()) { + return; + } + if (!derivedTypeSpec || + !derivedTypeSpec->MatchesOrExtends(t1.GetDerivedTypeSpec()) || + !derivedTypeSpec->MatchesOrExtends(t2.GetDerivedTypeSpec())) { return; } which = proc->IsElemental() ? elementalAssignmentEnum_ diff --git a/flang/test/Semantics/typeinfo01.f90 b/flang/test/Semantics/typeinfo01.f90 index c1427f28753cf..d228cd2a84ca4 100644 --- a/flang/test/Semantics/typeinfo01.f90 +++ b/flang/test/Semantics/typeinfo01.f90 @@ -73,7 +73,7 @@ module m06 end type type, extends(t) :: t2 contains - procedure :: s1 => s2 ! 
override + procedure :: s1 => s2 end type contains subroutine s1(x, y) @@ -86,8 +86,37 @@ subroutine s2(x, y) end subroutine !CHECK: .c.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:0_8 init:[component::component(name=.n.t,genre=1_1,category=6_1,kind=0_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=.dt.t,lenvalue=NULL(),bounds=NULL(),initialization=NULL())] !CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=2_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) -!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t2,name=.n.t2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t2,name=.n.t2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=.s.t2,specialbitset=2_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) !CHECK: .s.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s1)] +!CHECK: .s.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 
init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s2)] +!CHECK: .v.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s1,name=.n.s1)] +!CHECK: .v.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s2,name=.n.s1)] +end module + +module m06a + type :: t + contains + procedure, pass(y) :: s1 + generic :: assignment(=) => s1 + end type + type, extends(t) :: t2 + contains + procedure, pass(y) :: s1 => s2 + end type + contains + subroutine s1(x, y) + class(t), intent(out) :: x + class(t), intent(in) :: y + end subroutine + subroutine s2(x, y) + class(t), intent(out) :: x + class(t2), intent(in) :: y + end subroutine +!CHECK: .c.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:0_8 init:[component::component(name=.n.t,genre=1_1,category=6_1,kind=0_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=.dt.t,lenvalue=NULL(),bounds=NULL(),initialization=NULL())] +!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=2_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t2,name=.n.t2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=.s.t2,specialbitset=2_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .s.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 
init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s1)] +!CHECK: .s.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s2)] !CHECK: .v.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s1,name=.n.s1)] !CHECK: .v.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s2,name=.n.s1)] end module From flang-commits at lists.llvm.org Fri May 30 12:05:55 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 30 May 2025 12:05:55 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix corner case of defined component assignment (PR #142201) In-Reply-To: Message-ID: <683a0193.050a0220.a44f4.9bbc@mx.google.com> https://github.com/klausler edited https://github.com/llvm/llvm-project/pull/142201 From flang-commits at lists.llvm.org Fri May 30 12:06:03 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 12:06:03 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix corner case of defined component assignment (PR #142201) In-Reply-To: Message-ID: <683a019b.170a0220.2dd613.9347@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-semantics Author: Peter Klausler (klausler)
Changes For componentwise assignment in derived type intrinsic assignment, the runtime type information's special binding table is currently populated only with type-bound ASSIGNMENT(=) procedures that have the same derived type for both arguments. This restriction excludes all defined assignments for cases that cannot arise in this context, like defined assignments from intrinsic types or incompatible derived types. However, this restriction also excludes defined assignments from distinct but compatible derived types, i.e. ancestors. Loosen it a little to allow them. Fixes https://github.com/llvm/llvm-project/issues/142151. --- Full diff: https://github.com/llvm/llvm-project/pull/142201.diff 2 Files Affected: - (modified) flang/lib/Semantics/runtime-type-info.cpp (+8-4) - (modified) flang/test/Semantics/typeinfo01.f90 (+31-2) ``````````diff diff --git a/flang/lib/Semantics/runtime-type-info.cpp b/flang/lib/Semantics/runtime-type-info.cpp index 98295f3705a71..378697237ba9b 100644 --- a/flang/lib/Semantics/runtime-type-info.cpp +++ b/flang/lib/Semantics/runtime-type-info.cpp @@ -1121,10 +1121,10 @@ void RuntimeTableBuilder::DescribeSpecialProc( int argThatMightBeDescriptor{0}; MaybeExpr which; if (isAssignment) { - // Only type-bound asst's with the same type on both dummy arguments + // Only type-bound asst's with compatible types on both dummy arguments // are germane to the runtime, which needs only these to implement // component assignment as part of intrinsic assignment. - // Non-type-bound generic INTERFACEs and assignments from distinct + // Non-type-bound generic INTERFACEs and assignments from incompatible // types must not be used for component intrinsic assignment. 
CHECK(proc->dummyArguments.size() == 2); const auto t1{ @@ -1137,8 +1137,12 @@ void RuntimeTableBuilder::DescribeSpecialProc( .type.type()}; if (!binding || t1.category() != TypeCategory::Derived || t2.category() != TypeCategory::Derived || - t1.IsUnlimitedPolymorphic() || t2.IsUnlimitedPolymorphic() || - t1.GetDerivedTypeSpec() != t2.GetDerivedTypeSpec()) { + t1.IsUnlimitedPolymorphic() || t2.IsUnlimitedPolymorphic()) { + return; + } + if (!derivedTypeSpec || + !derivedTypeSpec->MatchesOrExtends(t1.GetDerivedTypeSpec()) || + !derivedTypeSpec->MatchesOrExtends(t2.GetDerivedTypeSpec())) { return; } which = proc->IsElemental() ? elementalAssignmentEnum_ diff --git a/flang/test/Semantics/typeinfo01.f90 b/flang/test/Semantics/typeinfo01.f90 index c1427f28753cf..d228cd2a84ca4 100644 --- a/flang/test/Semantics/typeinfo01.f90 +++ b/flang/test/Semantics/typeinfo01.f90 @@ -73,7 +73,7 @@ module m06 end type type, extends(t) :: t2 contains - procedure :: s1 => s2 ! override + procedure :: s1 => s2 end type contains subroutine s1(x, y) @@ -86,8 +86,37 @@ subroutine s2(x, y) end subroutine !CHECK: .c.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:0_8 init:[component::component(name=.n.t,genre=1_1,category=6_1,kind=0_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=.dt.t,lenvalue=NULL(),bounds=NULL(),initialization=NULL())] !CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=2_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) -!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) 
init:derivedtype(binding=.v.t2,name=.n.t2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t2,name=.n.t2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=.s.t2,specialbitset=2_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) !CHECK: .s.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s1)] +!CHECK: .s.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s2)] +!CHECK: .v.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s1,name=.n.s1)] +!CHECK: .v.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s2,name=.n.s1)] +end module + +module m06a + type :: t + contains + procedure, pass(y) :: s1 + generic :: assignment(=) => s1 + end type + type, extends(t) :: t2 + contains + procedure, pass(y) :: s1 => s2 + end type + contains + subroutine s1(x, y) + class(t), intent(out) :: x + class(t), intent(in) :: y + end subroutine + subroutine s2(x, y) + class(t), intent(out) :: x + class(t2), intent(in) :: y + end subroutine +!CHECK: .c.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:0_8 
init:[component::component(name=.n.t,genre=1_1,category=6_1,kind=0_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=.dt.t,lenvalue=NULL(),bounds=NULL(),initialization=NULL())] +!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=2_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t2,name=.n.t2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=.s.t2,specialbitset=2_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .s.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s1)] +!CHECK: .s.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s2)] !CHECK: .v.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s1,name=.n.s1)] !CHECK: .v.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s2,name=.n.s1)] end module ``````````
https://github.com/llvm/llvm-project/pull/142201 From flang-commits at lists.llvm.org Fri May 30 12:06:25 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Fri, 30 May 2025 12:06:25 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix corner case of defined component assignment (PR #142201) In-Reply-To: Message-ID: <683a01b1.630a0220.1cfb1f.55af@mx.google.com> https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/142201 >From 714ace59cd36d7ff513d351e0c01d5b39908d058 Mon Sep 17 00:00:00 2001 From: Peter Klausler Date: Fri, 30 May 2025 11:59:44 -0700 Subject: [PATCH] [flang] Fix corner case of defined component assignment For componentwise assignment in derived type intrinsic assignment, the runtime type information's special binding table is currently populated only with type-bound ASSIGNMENT(=) procedures that have the same derived type for both arguments. This restriction excludes all defined assignments for cases that cannot arise in this context, like defined assignments from intrinsic types or incompatible derived types. However, this restriction also excludes defined assignments from distinct but compatible derived types, i.e. ancestors. Loosen it a little to allow them. Fixes https://github.com/llvm/llvm-project/issues/142151. 
--- flang/lib/Semantics/runtime-type-info.cpp | 12 ++++++--- flang/test/Semantics/typeinfo01.f90 | 33 +++++++++++++++++++++-- 2 files changed, 39 insertions(+), 6 deletions(-) diff --git a/flang/lib/Semantics/runtime-type-info.cpp b/flang/lib/Semantics/runtime-type-info.cpp index 98295f3705a71..378697237ba9b 100644 --- a/flang/lib/Semantics/runtime-type-info.cpp +++ b/flang/lib/Semantics/runtime-type-info.cpp @@ -1121,10 +1121,10 @@ void RuntimeTableBuilder::DescribeSpecialProc( int argThatMightBeDescriptor{0}; MaybeExpr which; if (isAssignment) { - // Only type-bound asst's with the same type on both dummy arguments + // Only type-bound asst's with compatible types on both dummy arguments // are germane to the runtime, which needs only these to implement // component assignment as part of intrinsic assignment. - // Non-type-bound generic INTERFACEs and assignments from distinct + // Non-type-bound generic INTERFACEs and assignments from incompatible // types must not be used for component intrinsic assignment. CHECK(proc->dummyArguments.size() == 2); const auto t1{ @@ -1137,8 +1137,12 @@ void RuntimeTableBuilder::DescribeSpecialProc( .type.type()}; if (!binding || t1.category() != TypeCategory::Derived || t2.category() != TypeCategory::Derived || - t1.IsUnlimitedPolymorphic() || t2.IsUnlimitedPolymorphic() || - t1.GetDerivedTypeSpec() != t2.GetDerivedTypeSpec()) { + t1.IsUnlimitedPolymorphic() || t2.IsUnlimitedPolymorphic()) { + return; + } + if (!derivedTypeSpec || + !derivedTypeSpec->MatchesOrExtends(t1.GetDerivedTypeSpec()) || + !derivedTypeSpec->MatchesOrExtends(t2.GetDerivedTypeSpec())) { return; } which = proc->IsElemental() ? elementalAssignmentEnum_ diff --git a/flang/test/Semantics/typeinfo01.f90 b/flang/test/Semantics/typeinfo01.f90 index c1427f28753cf..d228cd2a84ca4 100644 --- a/flang/test/Semantics/typeinfo01.f90 +++ b/flang/test/Semantics/typeinfo01.f90 @@ -73,7 +73,7 @@ module m06 end type type, extends(t) :: t2 contains - procedure :: s1 => s2 ! 
override + procedure :: s1 => s2 end type contains subroutine s1(x, y) @@ -86,8 +86,37 @@ subroutine s2(x, y) end subroutine !CHECK: .c.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:0_8 init:[component::component(name=.n.t,genre=1_1,category=6_1,kind=0_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=.dt.t,lenvalue=NULL(),bounds=NULL(),initialization=NULL())] !CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=2_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) -!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t2,name=.n.t2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=NULL(),specialbitset=0_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t2,name=.n.t2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=.s.t2,specialbitset=2_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) !CHECK: .s.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s1)] +!CHECK: .s.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 
init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s2)] +!CHECK: .v.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s1,name=.n.s1)] +!CHECK: .v.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s2,name=.n.s1)] +end module + +module m06a + type :: t + contains + procedure, pass(y) :: s1 + generic :: assignment(=) => s1 + end type + type, extends(t) :: t2 + contains + procedure, pass(y) :: s1 => s2 + end type + contains + subroutine s1(x, y) + class(t), intent(out) :: x + class(t), intent(in) :: y + end subroutine + subroutine s2(x, y) + class(t), intent(out) :: x + class(t2), intent(in) :: y + end subroutine +!CHECK: .c.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(component) shape: 0_8:0_8 init:[component::component(name=.n.t,genre=1_1,category=6_1,kind=0_1,rank=0_1,offset=0_8,characterlen=value(genre=1_1,value=0_8),derived=.dt.t,lenvalue=NULL(),bounds=NULL(),initialization=NULL())] +!CHECK: .dt.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t,name=.n.t,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=NULL(),procptr=NULL(),special=.s.t,specialbitset=2_4,hasparent=0_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .dt.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(derivedtype) init:derivedtype(binding=.v.t2,name=.n.t2,sizeinbytes=0_8,uninstantiated=NULL(),kindparameter=NULL(),lenparameterkind=NULL(),component=.c.t2,procptr=NULL(),special=.s.t2,specialbitset=2_4,hasparent=1_1,noinitializationneeded=1_1,nodestructionneeded=1_1,nofinalizationneeded=1_1) +!CHECK: .s.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 
init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s1)] +!CHECK: .s.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(specialbinding) shape: 0_8:0_8 init:[specialbinding::specialbinding(which=1_1,isargdescriptorset=3_1,istypebound=1_1,isargcontiguousset=0_1,proc=s2)] !CHECK: .v.t, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s1,name=.n.s1)] !CHECK: .v.t2, SAVE, TARGET (CompilerCreated, ReadOnly): ObjectEntity type: TYPE(binding) shape: 0_8:0_8 init:[binding::binding(proc=s2,name=.n.s1)] end module From flang-commits at lists.llvm.org Fri May 30 12:47:37 2025 From: flang-commits at lists.llvm.org (Eugene Epshteyn via flang-commits) Date: Fri, 30 May 2025 12:47:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix corner case of defined component assignment (PR #142201) In-Reply-To: Message-ID: <683a0b59.170a0220.357ea7.86b8@mx.google.com> https://github.com/eugeneepshteyn approved this pull request. 
https://github.com/llvm/llvm-project/pull/142201 From flang-commits at lists.llvm.org Fri May 30 13:06:18 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 30 May 2025 13:06:18 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <683a0fba.050a0220.3b636e.9922@mx.google.com> ================ @@ -4217,49 +3783,168 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAllocatorsConstruct &allocsConstruct) { TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { ---------------- kparzysz wrote: Done https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Fri May 30 13:06:25 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 30 May 2025 13:06:25 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <683a0fc1.170a0220.8e6a6.ce45@mx.google.com> ================ @@ -4217,49 +3783,168 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// 
-// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAllocatorsConstruct &allocsConstruct) { TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; ---------------- kparzysz wrote: Done https://github.com/llvm/llvm-project/pull/137852 From 
flang-commits at lists.llvm.org Fri May 30 13:06:51 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 30 May 2025 13:06:51 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <683a0fdb.170a0220.2a6532.8e09@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852 >From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 07:58:29 -0500 Subject: [PATCH 01/28] [flang][OpenMP] Mark atomic clauses as unique The current implementation of the ATOMIC construct handles these clauses individually, and this change does not have an observable effect. At the same time these clauses are unique as per the OpenMP spec, and this patch reflects that in the OMP.td file. --- llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index eff6d57995d2b..cdfd3e3223fa8 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> { ]; } def OMP_Atomic : Directive<"atomic"> { - let allowedClauses = [ - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - VersionedClause, - ]; let allowedOnceClauses = [ VersionedClause, VersionedClause, + VersionedClause, + VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, VersionedClause, VersionedClause, + VersionedClause, VersionedClause, + VersionedClause, ]; let association = AS_Block; let category = CA_Executable; @@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> { let category = CA_Executable; } def OMP_Critical : Directive<"critical"> { - let allowedClauses = [ + let 
allowedOnceClauses = [ VersionedClause, ]; let association = AS_Block; >From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 10:08:58 -0500 Subject: [PATCH 02/28] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs The OpenMP implementation of the ATOMIC construct will change in the near future to accommodate OpenMP 6.0. This patch separates the shared implementations to avoid interfering with OpenACC. --- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. 
-static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi 
%arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. - llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite, - loc); + genAtomicWrite(converter, atomicWrite, loc); }, [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) { - Fortran::lower::genOmpAccAtomicUpdate< - Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate, - loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const Fortran::parser::AccAtomicCapture &atomicCapture) { - Fortran::lower::genOmpAccAtomicCapture< - Fortran::parser::AccAtomicCapture, void>(converter, - atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, }, atomicConstruct.u); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 312557d5da07e..fdd85e94829f3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +//===----------------------------------------------------------------------===// +// Code generation for atomic operations +//===----------------------------------------------------------------------===// + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. 
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/28] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
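The lowering change described above (an UPDATE clause whose argument may be absent) can be illustrated with a minimal, standalone sketch. All type and function names below (`DependenceType`, `OmpUpdateClause`, `Update`, `makeUpdate`) are simplified stand-ins chosen for illustration, not the actual flang definitions; the sketch only assumes that the clause argument is an optional wrapper around a two-alternative variant, as the patch suggests.

```cpp
#include <optional>
#include <variant>

// Hypothetical stand-ins for the parser-side argument types.
enum class DependenceType { In, Out, Inout };
struct OmpDependenceType { DependenceType v; };
struct OmpTaskDependenceType { DependenceType v; };

// The clause argument: one of two alternatives (DEPOBJ usage).
struct OmpUpdateClause {
  std::variant<OmpDependenceType, OmpTaskDependenceType> u;
};

// Hypothetical lowered form: the dependence type is optional because
// UPDATE on ATOMIC carries no argument at all.
struct Update {
  std::optional<DependenceType> depType;
};

// Convert the (possibly absent) parsed argument to the lowered form.
Update makeUpdate(const std::optional<OmpUpdateClause> &arg) {
  if (!arg) {
    // ATOMIC UPDATE: no argument present.
    return Update{std::nullopt};
  }
  // DEPOBJ UPDATE(...): unwrap whichever alternative was parsed.
  return std::visit([](const auto &s) { return Update{s.v}; }, arg->u);
}
```

With this shape, a caller can tell the ATOMIC form from the DEPOBJ form by checking whether `depType` holds a value, rather than by asserting that exactly one variant alternative is present.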
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/28] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be a parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause, have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having a `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed. 
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse, and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 | 
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. 
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expecting two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture. 
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expecting single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point. 
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor {
     Put("\n");
     EndOpenMP();
   }
+  void Unparse(const OmpFailClause &x) { Walk(x.v); }
   void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); }
-  void Unparse(const OmpAtomicClause &x) {
-    common::visit(common::visitors{
-        [&](const OmpMemoryOrderClause &y) { Walk(y); },
-        [&](const OmpFailClause &y) {
-          Word("FAIL(");
-          Walk(y.v);
-          Put(")");
-        },
-        [&](const OmpHintClause &y) {
-          Word("HINT(");
-          Walk(y.v);
-          Put(")");
-        },
-        },
-        x.u);
-  }
   void Unparse(const OmpMetadirectiveDirective &x) {
     BeginOpenMP();
     Word("!$OMP METADIRECTIVE ");
diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp
index c582de6df0319..bf3dff8ddab28 100644
--- a/flang/lib/Semantics/check-omp-structure.cpp
+++ b/flang/lib/Semantics/check-omp-structure.cpp
@@ -16,10 +16,18 @@
 #include "flang/Semantics/openmp-modifiers.h"
 #include "flang/Semantics/tools.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringSwitch.h"
 #include
 namespace Fortran::semantics {
+static_assert(std::is_same_v>);
+
+template
+static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) {
+  return !(e == f);
+}
+
 // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'.
 #define CHECK_SIMPLE_CLAUSE(X, Y) \
   void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \
@@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj(
   return nullptr;
 }
+static bool IsVarOrFunctionRef(const SomeExpr &expr) {
+  return evaluate::UnwrapProcedureRef(expr) != nullptr ||
+      evaluate::IsVariable(expr);
+}
+
+static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) {
+  const parser::TypedExpr &typedExpr{parserExpr.typedExpr};
+  // ForwardOwningPointer typedExpr
+  // `- GenericExprWrapper ^.get()
+  //    `- std::optional ^->v
+  return typedExpr.get()->v;
+}
+
+static std::optional GetDynamicType(
+    const parser::Expr &parserExpr) {
+  if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) {
+    return maybeExpr->GetType();
+  } else {
+    return std::nullopt;
+  }
+}
+
 // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment
 // statements and the expressions enclosed in an OpenMP Workshare construct
 class OmpWorkshareBlockChecker {
@@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction(
   }
 }
-template
-void OmpStructureChecker::CheckHintClause(
-    D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) {
-  bool foundHint{false};
+void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) {
+  CheckAllowedClause(llvm::omp::Clause::OMPC_hint);
+  auto &dirCtx{GetContext()};
-  auto checkForValidHintClause = [&](const D *clauseList) {
-    for (const auto &clause : clauseList->v) {
-      const parser::OmpHintClause *ompHintClause = nullptr;
-      if constexpr (std::is_same_v) {
-        ompHintClause = std::get_if(&clause.u);
-      } else if constexpr (std::is_same_v) {
-        if (auto *hint{std::get_if(&clause.u)}) {
-          ompHintClause = &hint->v;
-        }
-      }
-      if (!ompHintClause)
-        continue;
-      if (foundHint) {
-        context_.Say(clause.source,
-            "At most one HINT clause can appear on the %s directive"_err_en_US,
-            parser::ToUpperCaseLetters(dirName));
-      }
-      foundHint = true;
-      std::optional hintValue =
GetIntValue(ompHintClause->v);
-      if (hintValue && *hintValue >= 0) {
-        /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/
-        if ((*hintValue & 0xC) == 0xC
-            /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/
-            || (*hintValue & 0x3) == 0x3)
-          context_.Say(clause.source,
-              "Hint clause value "
-              "is not a valid OpenMP synchronization value"_err_en_US);
-      } else {
-        context_.Say(clause.source,
-            "Hint clause must have non-negative constant "
-            "integer expression"_err_en_US);
+  if (std::optional maybeVal{GetIntValue(x.v.v)}) {
+    int64_t val{*maybeVal};
+    if (val >= 0) {
+      // Check contradictory values.
+      if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative
+          (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended
+        context_.Say(dirCtx.clauseSource,
+            "The synchronization hint is not valid"_err_en_US);
       }
+    } else {
+      context_.Say(dirCtx.clauseSource,
+          "Synchronization hint must be non-negative"_err_en_US);
     }
-  };
-
-  if (leftOmpClauseList) {
-    checkForValidHintClause(leftOmpClauseList);
-  }
-  if (rightOmpClauseList) {
-    checkForValidHintClause(rightOmpClauseList);
+  } else {
+    context_.Say(dirCtx.clauseSource,
+        "Synchronization hint must be a constant integer value"_err_en_US);
   }
 }
@@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) {
 void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) {
   const auto &dir{std::get(x.t)};
+  const auto &dirSource{std::get(dir.t).source};
   const auto &endDir{std::get(x.t)};
-  PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical);
+  PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical);
   const auto &block{std::get(x.t)};
   CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source);
   const auto &dirName{std::get>(dir.t)};
@@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) {
         "Hint clause other than omp_sync_hint_none cannot be
specified for "
         "an unnamed CRITICAL directive"_err_en_US});
   }
-  CheckHintClause(&ompClause, nullptr, "CRITICAL");
 }
 void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) {
@@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) {
   }
 }
-inline void OmpStructureChecker::ErrIfAllocatableVariable(
-    const parser::Variable &var) {
-  // Err out if the given symbol has
-  // ALLOCATABLE attribute
-  if (const auto *e{GetExpr(context_, var)})
-    for (const Symbol &symbol : evaluate::CollectSymbols(*e))
-      if (IsAllocatable(symbol)) {
-        const auto &designator =
-            std::get>(var.u);
-        const auto *dataRef =
-            std::get_if(&designator.value().u);
-        const parser::Name *name =
-            dataRef ? std::get_if(&dataRef->u) : nullptr;
-        if (name)
-          context_.Say(name->source,
-              "%s must not have ALLOCATABLE "
-              "attribute"_err_en_US,
-              name->ToString());
+/// parser::Block is a list of executable constructs, parser::BlockConstruct
+/// is Fortran's BLOCK/ENDBLOCK construct.
+/// Strip the outermost BlockConstructs, return the reference to the Block
+/// in the executable part of the innermost of the stripped constructs.
+/// Specifically, if the given `block` has a single entry (it's a list), and
+/// the entry is a BlockConstruct, get the Block contained within. Repeat
+/// this step as many times as possible.
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) {
+  const parser::Block *iter{&block};
+  while (iter->size() == 1) {
+    const parser::ExecutionPartConstruct &ep{iter->front()};
+    if (auto *exec{std::get_if(&ep.u)}) {
+      using BlockConstruct = common::Indirection;
+      if (auto *bc{std::get_if(&exec->u)}) {
+        iter = &std::get(bc->value().t);
+        continue;
       }
+    }
+    break;
+  }
+  return *iter;
 }
-inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch(
-    const parser::Variable &var, const parser::Expr &expr) {
-  // Err out if the symbol on the LHS is also used on the RHS of the assignment
-  // statement
-  const auto *e{GetExpr(context_, expr)};
-  const auto *v{GetExpr(context_, var)};
-  if (e && v) {
-    auto vSyms{evaluate::GetSymbolVector(*v)};
-    const Symbol &varSymbol = vSyms.front();
-    for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) {
-      if (varSymbol == symbol) {
-        const common::Indirection *designator =
-            std::get_if>(&expr.u);
-        if (designator) {
-          auto *z{var.typedExpr.get()};
-          auto *c{expr.typedExpr.get()};
-          if (z->v == c->v) {
-            context_.Say(expr.source,
-                "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US,
-                var.GetSource());
-          }
+// There is no consistent way to get the source of a given ActionStmt, so
+// extract the source information from Statement when we can,
+// and keep it around for error reporting in further analyses.
+struct SourcedActionStmt {
+  const parser::ActionStmt *stmt{nullptr};
+  parser::CharBlock source;
+
+  operator bool() const { return stmt != nullptr; }
+};
+
+struct AnalyzedCondStmt {
+  SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted
+  parser::CharBlock source;
+  SourcedActionStmt ift, iff;
+};
+
+static SourcedActionStmt GetActionStmt(
+    const parser::ExecutionPartConstruct *x) {
+  if (x == nullptr) {
+    return SourcedActionStmt{};
+  }
+  if (auto *exec{std::get_if(&x->u)}) {
+    using ActionStmt = parser::Statement;
+    if (auto *stmt{std::get_if(&exec->u)}) {
+      return SourcedActionStmt{&stmt->statement, stmt->source};
+    }
+  }
+  return SourcedActionStmt{};
+}
+
+static SourcedActionStmt GetActionStmt(const parser::Block &block) {
+  if (block.size() == 1) {
+    return GetActionStmt(&block.front());
+  }
+  return SourcedActionStmt{};
+}
+
+// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption
+// is that the ActionStmt will be either an assignment or a pointer-assignment,
+// otherwise return std::nullopt.
+static std::optional GetEvaluateAssignment(
+    const parser::ActionStmt *x) {
+  if (x == nullptr) {
+    return std::nullopt;
+  }
+
+  using AssignmentStmt = common::Indirection;
+  using PointerAssignmentStmt =
+      common::Indirection;
+  using TypedAssignment = parser::AssignmentStmt::TypedAssignment;
+
+  return common::visit(
+      [](auto &&s) -> std::optional {
+        using BareS = llvm::remove_cvref_t;
+        if constexpr (std::is_same_v ||
+            std::is_same_v) {
+          const TypedAssignment &typed{s.value().typedAssignment};
+          // ForwardOwningPointer typedAssignment
+          // `- GenericAssignmentWrapper ^.get()
+          //    `- std::optional ^->v
+          return typed.get()->v;
         } else {
-          context_.Say(expr.source,
-              "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US,
-              var.GetSource());
+          return std::nullopt;
         }
+      },
+      x->u);
+}
+
+static std::optional AnalyzeConditionalStmt(
+    const parser::ExecutionPartConstruct *x) {
+  if (x == nullptr) {
+    return std::nullopt;
+  }
+
+  // Extract the evaluate::Expr from ScalarLogicalExpr.
+  auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) {
+    // ScalarLogicalExpr is Scalar>>
+    const parser::Expr &expr{logical.thing.thing.value()};
+    return GetEvaluateExpr(expr);
+  }};
+
+  // Recognize either
+  // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or
+  // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct.
+
+  if (auto &&action{GetActionStmt(x)}) {
+    if (auto *ifs{std::get_if>(
+            &action.stmt->u)}) {
+      const parser::IfStmt &s{ifs->value()};
+      auto &&maybeCond{
+          getFromLogical(std::get(s.t))};
+      auto &thenStmt{
+          std::get>(s.t)};
+      if (maybeCond) {
+        return AnalyzedCondStmt{std::move(*maybeCond), action.source,
+            SourcedActionStmt{&thenStmt.statement, thenStmt.source},
+            SourcedActionStmt{}};
       }
     }
+    return std::nullopt;
   }
+
+  if (auto *exec{std::get_if(&x->u)}) {
+    if (auto *ifc{
+            std::get_if>(&exec->u)}) {
+      using ElseBlock = parser::IfConstruct::ElseBlock;
+      using ElseIfBlock = parser::IfConstruct::ElseIfBlock;
+      const parser::IfConstruct &s{ifc->value()};
+
+      if (!std::get>(s.t).empty()) {
+        // Not expecting any else-if statements.
+        return std::nullopt;
+      }
+      auto &stmt{std::get>(s.t)};
+      auto &&maybeCond{getFromLogical(
+          std::get(stmt.statement.t))};
+      if (!maybeCond) {
+        return std::nullopt;
+      }
+
+      if (auto &maybeElse{std::get>(s.t)}) {
+        AnalyzedCondStmt result{std::move(*maybeCond), stmt.source,
+            GetActionStmt(std::get(s.t)),
+            GetActionStmt(std::get(maybeElse->t))};
+        if (result.ift.stmt && result.iff.stmt) {
+          return result;
+        }
+      } else {
+        AnalyzedCondStmt result{std::move(*maybeCond), stmt.source,
+            GetActionStmt(std::get(s.t))};
+        if (result.ift.stmt) {
+          return result;
+        }
+      }
+    }
+    return std::nullopt;
+  }
+
+  return std::nullopt;
 }
-inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt(
-    const parser::Variable &var, const parser::Expr &expr) {
-  // Err out if either the variable on the LHS or the expression on the RHS of
-  // the assignment statement are non-scalar (i.e.
have rank > 0 or is of
-  // CHARACTER type)
-  const auto *e{GetExpr(context_, expr)};
-  const auto *v{GetExpr(context_, var)};
-  if (e && v) {
-    if (e->Rank() != 0 ||
-        (e->GetType().has_value() &&
-            e->GetType().value().category() == common::TypeCategory::Character))
-      context_.Say(expr.source,
-          "Expected scalar expression "
-          "on the RHS of atomic assignment "
-          "statement"_err_en_US);
-    if (v->Rank() != 0 ||
-        (v->GetType().has_value() &&
-            v->GetType()->category() == common::TypeCategory::Character))
-      context_.Say(var.GetSource(),
-          "Expected scalar variable "
-          "on the LHS of atomic assignment "
-          "statement"_err_en_US);
-  }
-}
-
-template
-bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) {
-  using AllowedBinaryOperators =
-      std::variant;
-  using BinaryOperators = std::variant;
-
-  if constexpr (common::HasMember) {
-    const auto &variableName{variable.GetSource().ToString()};
-    const auto &exprLeft{std::get<0>(node.t)};
-    const auto &exprRight{std::get<1>(node.t)};
-    if ((exprLeft.value().source.ToString() != variableName) &&
-        (exprRight.value().source.ToString() != variableName)) {
-      context_.Say(variable.GetSource(),
-          "Atomic update statement should be of form "
-          "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US,
-          variableName, variableName, variableName, variableName);
-    }
-    return common::HasMember;
+static std::pair SplitAssignmentSource(
+    parser::CharBlock source) {
+  // Find => in the range, if not found, find = that is not a part of
+  // <=, >=, ==, or /=.
+  auto trim{[](std::string_view v) {
+    const char *begin{v.data()};
+    const char *end{begin + v.size()};
+    while (*begin == ' ' && begin != end) {
+      ++begin;
+    }
+    while (begin != end && end[-1] == ' ') {
+      --end;
+    }
+    assert(begin != end && "Source should not be empty");
+    return parser::CharBlock(begin, end - begin);
+  }};
+
+  std::string_view sv(source.begin(), source.size());
+
+  if (auto where{sv.find("=>")}; where != sv.npos) {
+    std::string_view lhs(sv.data(), where);
+    std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2);
+    return std::make_pair(trim(lhs), trim(rhs));
+  }
+
+  // Go backwards, since all the exclusions above end with a '='.
+  for (size_t next{source.size()}; next > 1; --next) {
+    if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) {
+      std::string_view lhs(sv.data(), next - 1);
+      std::string_view rhs(sv.data() + next, sv.size() - next);
+      return std::make_pair(trim(lhs), trim(rhs));
+    }
+  }
+  llvm_unreachable("Could not find assignment operator");
+}
+
+namespace atomic {
+
+enum class Operator {
+  Unk,
+  // Operators that are officially allowed in the update operation
+  Add,
+  And,
+  Associated,
+  Div,
+  Eq,
+  Eqv,
+  Ge, // extension
+  Gt,
+  Identity, // extension: x = x is allowed (*), but we should never print
+            // "identity" as the name of the operator
+  Le, // extension
+  Lt,
+  Max,
+  Min,
+  Mul,
+  Ne, // extension
+  Neqv,
+  Or,
+  Sub,
+  // Operators that we recognize for technical reasons
+  True,
+  False,
+  Not,
+  Convert,
+  Resize,
+  Intrinsic,
+  Call,
+  Pow,
+
+  // (*): "x = x + 0" is a valid update statement, but it will be folded
+  // to "x = x" by the time we look at it. Since the source statements
+  // "x = x" and "x = x + 0" will end up looking the same, accept the
+  // former as an extension.
+};
+
+std::string ToString(Operator op) {
+  switch (op) {
+  case Operator::Add:
+    return "+";
+  case Operator::And:
+    return "AND";
+  case Operator::Associated:
+    return "ASSOCIATED";
+  case Operator::Div:
+    return "/";
+  case Operator::Eq:
+    return "==";
+  case Operator::Eqv:
+    return "EQV";
+  case Operator::Ge:
+    return ">=";
+  case Operator::Gt:
+    return ">";
+  case Operator::Identity:
+    return "identity";
+  case Operator::Le:
+    return "<=";
+  case Operator::Lt:
+    return "<";
+  case Operator::Max:
+    return "MAX";
+  case Operator::Min:
+    return "MIN";
+  case Operator::Mul:
+    return "*";
+  case Operator::Neqv:
+    return "NEQV/EOR";
+  case Operator::Ne:
+    return "/=";
+  case Operator::Or:
+    return "OR";
+  case Operator::Sub:
+    return "-";
+  case Operator::True:
+    return ".TRUE.";
+  case Operator::False:
+    return ".FALSE.";
+  case Operator::Not:
+    return "NOT";
+  case Operator::Convert:
+    return "type-conversion";
+  case Operator::Resize:
+    return "resize";
+  case Operator::Intrinsic:
+    return "intrinsic";
+  case Operator::Call:
+    return "function-call";
+  case Operator::Pow:
+    return "**";
+  default:
+    return "??";
   }
-  return false;
 }
-void OmpStructureChecker::CheckAtomicCaptureStmt(
-    const parser::AssignmentStmt &assignmentStmt) {
-  const auto &var{std::get(assignmentStmt.t)};
-  const auto &expr{std::get(assignmentStmt.t)};
-  common::visit(
-      common::visitors{
-          [&](const common::Indirection &designator) {
-            const auto *dataRef =
-                std::get_if(&designator.value().u);
-            const auto *name =
-                dataRef ?
std::get_if(&dataRef->u) : nullptr;
-            if (name && IsAllocatable(*name->symbol))
-              context_.Say(name->source,
-                  "%s must not have ALLOCATABLE "
-                  "attribute"_err_en_US,
-                  name->ToString());
-          },
-          [&](const auto &) {
-            // Anything other than a `parser::Designator` is not allowed
-            context_.Say(expr.source,
-                "Expected scalar variable "
-                "of intrinsic type on RHS of atomic "
-                "assignment statement"_err_en_US);
-          }},
-      expr.u);
-  ErrIfLHSAndRHSSymbolsMatch(var, expr);
-  ErrIfNonScalarAssignmentStmt(var, expr);
-}
-
-void OmpStructureChecker::CheckAtomicWriteStmt(
-    const parser::AssignmentStmt &assignmentStmt) {
-  const auto &var{std::get(assignmentStmt.t)};
-  const auto &expr{std::get(assignmentStmt.t)};
-  ErrIfAllocatableVariable(var);
-  ErrIfLHSAndRHSSymbolsMatch(var, expr);
-  ErrIfNonScalarAssignmentStmt(var, expr);
-}
-
-void OmpStructureChecker::CheckAtomicUpdateStmt(
-    const parser::AssignmentStmt &assignment) {
-  const auto &expr{std::get(assignment.t)};
-  const auto &var{std::get(assignment.t)};
-  bool isIntrinsicProcedure{false};
-  bool isValidOperator{false};
-  common::visit(
-      common::visitors{
-          [&](const common::Indirection &x) {
-            isIntrinsicProcedure = true;
-            const auto &procedureDesignator{
-                std::get(x.value().v.t)};
-            const parser::Name *name{
-                std::get_if(&procedureDesignator.u)};
-            if (name &&
-                !(name->source == "max" || name->source == "min" ||
-                    name->source == "iand" || name->source == "ior" ||
-                    name->source == "ieor")) {
-              context_.Say(expr.source,
-                  "Invalid intrinsic procedure name in "
-                  "OpenMP ATOMIC (UPDATE) statement"_err_en_US);
-            }
-          },
-          [&](const auto &x) {
-            if (!IsOperatorValid(x, var)) {
-              context_.Say(expr.source,
-                  "Invalid or missing operator in atomic update "
-                  "statement"_err_en_US);
-            } else
-              isValidOperator = true;
-          },
-      },
-      expr.u);
-  if (const auto *e{GetExpr(context_, expr)}) {
-    const auto *v{GetExpr(context_, var)};
-    if (e->Rank() != 0 ||
-        (e->GetType().has_value() &&
-            e->GetType().value().category() ==
common::TypeCategory::Character))
-      context_.Say(expr.source,
-          "Expected scalar expression "
-          "on the RHS of atomic update assignment "
-          "statement"_err_en_US);
-    if (v->Rank() != 0 ||
-        (v->GetType().has_value() &&
-            v->GetType()->category() == common::TypeCategory::Character))
-      context_.Say(var.GetSource(),
-          "Expected scalar variable "
-          "on the LHS of atomic update assignment "
-          "statement"_err_en_US);
-    auto vSyms{evaluate::GetSymbolVector(*v)};
-    const Symbol &varSymbol = vSyms.front();
-    int numOfSymbolMatches{0};
-    SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)};
-    for (const Symbol &symbol : exprSymbols) {
-      if (varSymbol == symbol) {
-        numOfSymbolMatches++;
+template //
+struct ArgumentExtractor
+    : public evaluate::Traverse,
+          std::pair>, false> {
+  using Arguments = std::vector;
+  using Result = std::pair;
+  using Base = evaluate::Traverse,
+      Result, false>;
+  static constexpr auto IgnoreResizes = IgnoreResizingConverts;
+  static constexpr auto Logical = common::TypeCategory::Logical;
+  ArgumentExtractor() : Base(*this) {}
+
+  Result Default() const { return {}; }
+
+  using Base::operator();
+
+  template //
+  Result operator()(
+      const evaluate::Constant> &x) const {
+    if (const auto &val{x.GetScalarValue()}) {
+      return val->IsTrue() ?
std::make_pair(Operator::True, Arguments{})
+                           : std::make_pair(Operator::False, Arguments{});
+    }
+    return Default();
+  }
+
+  template //
+  Result operator()(const evaluate::FunctionRef &x) const {
+    Result result{OperationCode(x.proc()), {}};
+    for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) {
+      if (auto *e{x.UnwrapArgExpr(i)}) {
+        result.second.push_back(*e);
       }
     }
-    if (isIntrinsicProcedure) {
-      std::string varName = var.GetSource().ToString();
-      if (numOfSymbolMatches != 1)
-        context_.Say(expr.source,
-            "Intrinsic procedure"
-            " arguments in atomic update statement"
-            " must have exactly one occurence of '%s'"_err_en_US,
-            varName);
-      else if (varSymbol != exprSymbols.front() &&
-          varSymbol != exprSymbols.back())
-        context_.Say(expr.source,
-            "Atomic update statement "
-            "should be of the form `%s = intrinsic_procedure(%s, expr_list)` "
-            "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US,
-            varName, varName, varName, varName);
-    } else if (isValidOperator) {
-      if (numOfSymbolMatches != 1)
-        context_.Say(expr.source,
-            "Exactly one occurence of '%s' "
-            "expected on the RHS of atomic update assignment statement"_err_en_US,
-            var.GetSource().ToString());
+    return result;
+  }
+
+  template
+  Result operator()(const evaluate::Operation &x) const {
+    if constexpr (std::is_same_v>) {
+      // Ignore top-level parentheses.
+      return (*this)(x.template operand<0>());
+    }
+    if constexpr (IgnoreResizes &&
+        std::is_same_v>) {
+      // Ignore conversions within the same category.
+      // Atomic operations on int(kind=1) may be implicitly widened
+      // to int(kind=4) for example.
+      return (*this)(x.template operand<0>());
+    } else {
+      return std::make_pair(
+          OperationCode(x), OperationArgs(x, std::index_sequence_for{}));
+    }
+  }
+
+  template //
+  Result operator()(const evaluate::Designator &x) const {
+    evaluate::Designator copy{x};
+    Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}};
+    return result;
+  }
+
+  template //
+  Result Combine(Result &&result, Rs &&...results) const {
+    // There shouldn't be any combining needed, since we're stopping the
+    // traversal at the top-level operation, but implement one that picks
+    // the first non-empty result.
+    if constexpr (sizeof...(Rs) == 0) {
+      return std::move(result);
+    } else {
+      if (!result.second.empty()) {
+        return std::move(result);
+      } else {
+        return Combine(std::move(results)...);
+      }
+    }
+  }
+
+private:
+  template
+  Operator OperationCode(
+      const evaluate::Operation, Ts...> &op)
+      const {
+    switch (op.derived().logicalOperator) {
+    case common::LogicalOperator::And:
+      return Operator::And;
+    case common::LogicalOperator::Or:
+      return Operator::Or;
+    case common::LogicalOperator::Eqv:
+      return Operator::Eqv;
+    case common::LogicalOperator::Neqv:
+      return Operator::Neqv;
+    case common::LogicalOperator::Not:
+      return Operator::Not;
+    }
+    return Operator::Unk;
+  }
+  template
+  Operator OperationCode(
+      const evaluate::Operation, Ts...> &op) const {
+    switch (op.derived().opr) {
+    case common::RelationalOperator::LT:
+      return Operator::Lt;
+    case common::RelationalOperator::LE:
+      return Operator::Le;
+    case common::RelationalOperator::EQ:
+      return Operator::Eq;
+    case common::RelationalOperator::NE:
+      return Operator::Ne;
+    case common::RelationalOperator::GE:
+      return Operator::Ge;
+    case common::RelationalOperator::GT:
+      return Operator::Gt;
+    }
+    return Operator::Unk;
+  }
+  template
+  Operator OperationCode(
+      const evaluate::Operation, Ts...> &op) const {
+    return Operator::Add;
+  }
+  template
+  Operator OperationCode(
+      const evaluate::Operation, Ts...> &op)
const {
+    return Operator::Sub;
+  }
+  template
+  Operator OperationCode(
+      const evaluate::Operation, Ts...> &op) const {
+    return Operator::Mul;
+  }
+  template
+  Operator OperationCode(
+      const evaluate::Operation, Ts...> &op) const {
+    return Operator::Div;
+  }
+  template
+  Operator OperationCode(
+      const evaluate::Operation, Ts...> &op) const {
+    return Operator::Pow;
+  }
+  template
+  Operator OperationCode(
+      const evaluate::Operation, Ts...> &op) const {
+    return Operator::Pow;
+  }
+  template
+  Operator OperationCode(
+      const evaluate::Operation, Ts...> &op) const {
+    if constexpr (C == T::category) {
+      return Operator::Resize;
+    } else {
+      return Operator::Convert;
+    }
+  }
+  Operator OperationCode(const evaluate::ProcedureDesignator &proc) const {
+    Operator code = llvm::StringSwitch(proc.GetName())
+                        .Case("associated", Operator::Associated)
+                        .Case("min", Operator::Min)
+                        .Case("max", Operator::Max)
+                        .Case("iand", Operator::And)
+                        .Case("ior", Operator::Or)
+                        .Case("ieor", Operator::Neqv)
+                        .Default(Operator::Call);
+    if (code == Operator::Call && proc.GetSpecificIntrinsic()) {
+      return Operator::Intrinsic;
+    }
+    return code;
+  }
+  template //
+  Operator OperationCode(const T &) const {
+    return Operator::Unk;
+  }
+
+  template
+  Arguments OperationArgs(const evaluate::Operation &x,
+      std::index_sequence) const {
+    return Arguments{SomeExpr(x.template operand())...};
+  }
+};
+
+struct DesignatorCollector : public evaluate::Traverse, false> {
+  using Result = std::vector;
+  using Base = evaluate::Traverse;
+  DesignatorCollector() : Base(*this) {}
+
+  Result Default() const { return {}; }
+
+  using Base::operator();
+
+  template //
+  Result operator()(const evaluate::Designator &x) const {
+    // Once in a designator, don't traverse it any further (i.e. only
+    // collect top-level designators).
+    auto copy{x};
+    return Result{AsGenericExpr(std::move(copy))};
+  }
+
+  template //
+  Result Combine(Result &&result, Rs &&...results) const {
+    Result v(std::move(result));
+    (Append(v, std::move(results)), ...);
+    return v;
+  }
+
+private:
+  static void Append(Result &acc, Result &&data) {
+    for (auto &&s : data) {
+      acc.push_back(std::move(s));
     }
   }
+};
+} // namespace atomic
-  ErrIfAllocatableVariable(var);
+static bool IsAllocatable(const SomeExpr &expr) {
+  std::vector dsgs{atomic::DesignatorCollector{}(expr)};
+  assert(dsgs.size() == 1 && "Should have a single top-level designator");
+  evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())};
+  return !syms.empty() && IsAllocatable(syms.back());
 }
-void OmpStructureChecker::CheckAtomicCompareConstruct(
-    const parser::OmpAtomicCompare &atomicCompareConstruct) {
+static std::pair> GetTopLevelOperation(
+    const SomeExpr &expr) {
+  return atomic::ArgumentExtractor{}(expr);
+}
-  // TODO: Check that the if-stmt is `if (var == expr) var = new`
-  // [with or without then/end-do]
+std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) {
+  return GetTopLevelOperation(expr).second;
+}
-  unsigned version{context_.langOptions().OpenMPVersion};
-  if (version < 51) {
-    context_.Say(atomicCompareConstruct.source,
-        "%s construct not allowed in %s, %s"_err_en_US,
-        atomicCompareConstruct.source, ThisVersion(version), TryVersion(51));
-  }
-
-  // TODO: More work needed here. Some of the Update restrictions need to
-  // be added, but Update isn't the same either.
-}
-
-// TODO: Allow cond-update-stmt once compare clause is supported.
-void OmpStructureChecker::CheckAtomicCaptureConstruct(
-    const parser::OmpAtomicCapture &atomicCaptureConstruct) {
-  const parser::AssignmentStmt &stmt1 =
-      std::get(atomicCaptureConstruct.t)
-          .v.statement;
-  const auto &stmt1Var{std::get(stmt1.t)};
-  const auto &stmt1Expr{std::get(stmt1.t)};
-
-  const parser::AssignmentStmt &stmt2 =
-      std::get(atomicCaptureConstruct.t)
-          .v.statement;
-  const auto &stmt2Var{std::get(stmt2.t)};
-  const auto &stmt2Expr{std::get(stmt2.t)};
-
-  if (semantics::checkForSingleVariableOnRHS(stmt1)) {
-    CheckAtomicCaptureStmt(stmt1);
-    if (semantics::checkForSymbolMatch(stmt2)) {
-      // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt]
-      CheckAtomicUpdateStmt(stmt2);
-    } else {
-      // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt]
-      CheckAtomicWriteStmt(stmt2);
-    }
-    auto *v{stmt2Var.typedExpr.get()};
-    auto *e{stmt1Expr.typedExpr.get()};
-    if (v && e && !(v->v == e->v)) {
-      context_.Say(stmt1Expr.source,
-          "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US,
-          stmt1Expr.source);
-    }
-  } else if (semantics::checkForSymbolMatch(stmt1) &&
-      semantics::checkForSingleVariableOnRHS(stmt2)) {
-    // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt]
-    CheckAtomicUpdateStmt(stmt1);
-    CheckAtomicCaptureStmt(stmt2);
-    // Variable updated in stmt1 should be captured in stmt2
-    auto *v{stmt1Var.typedExpr.get()};
-    auto *e{stmt2Expr.typedExpr.get()};
-    if (v && e && !(v->v == e->v)) {
-      context_.Say(stmt1Var.GetSource(),
-          "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US,
-          stmt1Var.GetSource());
+static bool IsPointerAssignment(const evaluate::Assignment &x) {
+  return std::holds_alternative(x.u) ||
+      std::holds_alternative(x.u);
+}
+
+static bool IsCheckForAssociated(const SomeExpr &cond) {
+  return
GetTopLevelOperation(cond).first == atomic::Operator::Associated;
+}
+
+static bool HasCommonDesignatorSymbols(
+    const evaluate::SymbolVector &baseSyms, const SomeExpr &other) {
+  // Compare the designators used in "other" with the designators whose
+  // symbols are given in baseSyms.
+  // This is a part of the check if these two expressions can access the same
+  // storage: if the designators used in them are different enough, then they
+  // will be assumed not to access the same memory.
+  //
+  // Consider an (array element) expression x%y(w%z), the corresponding symbol
+  // vector will be {x, y, w, z} (i.e. the symbols for these names).
+  // Check whether this exact sequence appears anywhere in any the symbol
+  // vector for "other". This will be true for x(y) and x(y+1), so this is
+  // not a sufficient condition, but can be used to eliminate candidates
+  // before doing more exhaustive checks.
+  //
+  // If any of the symbols in this sequence are function names, assume that
+  // there is no storage overlap, mostly because it would be impossible in
+  // general to determine what storage the function will access.
+  // Note: if f is pure, then two calls to f will access the same storage
+  // when called with the same arguments. This check is not done yet.
+
+  if (llvm::any_of(
+          baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) {
+    // If there is a function symbol in the chain then we can't infer much
+    // about the accessed storage.
+    return false;
+  }
+
+  auto isSubsequence{// Is u a subsequence of v.
+      [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) {
+        size_t us{u.size()}, vs{v.size()};
+        if (us > vs) {
+          return false;
+        }
+        for (size_t off{0}; off != vs - us + 1; ++off) {
+          bool same{true};
+          for (size_t i{0}; i != us; ++i) {
+            if (u[i] != v[off + i]) {
+              same = false;
+              break;
+            }
+          }
+          if (same) {
+            return true;
+          }
+        }
+        return false;
+      }};
+
+  evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)};
+  return isSubsequence(baseSyms, otherSyms);
+}
+
+static bool HasCommonTopLevelDesignators(
+    const std::vector &baseDsgs, const SomeExpr &other) {
+  // Compare designators directly as expressions. This will ensure
+  // that x(y) and x(y+1) are not flagged as overlapping, whereas
+  // the symbol vectors for both of these would be identical.
+  std::vector otherDsgs{atomic::DesignatorCollector{}(other)};
+
+  for (auto &s : baseDsgs) {
+    if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) {
+      return true;
+    }
+  }
+  return false;
+}
+
+static const SomeExpr *HasStorageOverlap(
+    const SomeExpr &base, llvm::ArrayRef exprs) {
+  evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)};
+  std::vector baseDsgs{atomic::DesignatorCollector{}(base)};
+
+  for (const SomeExpr &expr : exprs) {
+    if (!HasCommonDesignatorSymbols(baseSyms, expr)) {
+      continue;
+    }
+    if (HasCommonTopLevelDesignators(baseDsgs, expr)) {
+      return &expr;
     }
+  }
+  return nullptr;
+}
+
+static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) {
+  // This ignores function calls, so it will accept "f(x) = f(x) + 1"
+  // for example.
+  return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr;
+}
+
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) {
+  // Both expr and x have the form of SomeType(SomeKind(...)[1]).
+  // Check if expr is
+  // SomeType(SomeKind(Type(
+  // Convert
+  // SomeKind(...)[2])))
+  // where SomeKind(...) [1] and [2] are equal, and the Convert preserves
+  // TypeCategory.
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ...)" is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+ auto &dirSpec{std::get(x.t)}; + auto &clauseList{std::get>(dirSpec.t)}; if (clauseList) { - atomicDirectiveDefaultOrderFound_ = true; - switch (*defaultMemOrder) { - case common::OmpMemoryOrderType::Acq_Rel: - clauseList->emplace_back(common::visit( - common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause { - return parser::OmpClause::Acquire{}; - }, - [](parser::OmpAtomicCapture &) -> parser::OmpClause { - return parser::OmpClause::AcqRel{}; - }, - [](auto &) -> parser::OmpClause { - // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite} - return parser::OmpClause::Release{}; - }}, - x.u)); - break; - case common::OmpMemoryOrderType::Relaxed: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::Relaxed{}}); - break; - case common::OmpMemoryOrderType::Seq_Cst: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::SeqCst{}}); - break; - default: - // FIXME: Don't process other values at the moment since their validity - // depends on the OpenMP version (which is unavailable here). - break; + if (findMemOrderClause(*clauseList)) { + return false; } + } else { + clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){}); + } + + // Add a memory order clause to the atomic directive. + atomicDirectiveDefaultOrderFound_ = true; + switch (*defaultMemOrder) { + case common::OmpMemoryOrderType::Acq_Rel: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}}); + break; + case common::OmpMemoryOrderType::Relaxed: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}}); + break; + case common::OmpMemoryOrderType::Seq_Cst: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}}); + break; + default: + // FIXME: Don't process other values at the moment since their validity + // depends on the OpenMP version (which is unavailable here). 
+ break; } return false; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 index b82bd13622764..6f58e0939a787 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 index 88ec6fe910b9e..6729be6e5cf8b 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b8422c8f3b769 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture() !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr !CHECK: omp.atomic.capture { !CHECK: omp.atomic.update 
%[[VAL_A_BOX_ADDR]] : !fir.ptr { !CHECK: ^bb0(%[[ARG:.*]]: i32): !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32 !CHECK: omp.yield(%[[TEMP]] : i32) !CHECK: } -!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 +!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 !CHECK: } !CHECK: return !CHECK: } diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90 index 13392ad76471f..6eded49b0b15d 100644 --- a/flang/test/Lower/OpenMP/atomic-write.f90 +++ b/flang/test/Lower/OpenMP/atomic-write.f90 @@ -44,9 +44,9 @@ end program OmpAtomicWrite !CHECK-LABEL: func.func @_QPatomic_write_pointer() { !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"} !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) -!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32 !CHECK: %[[C2:.*]] = arith.constant 2 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90 deleted file mode 100644 index 5cd02698ff482..0000000000000 --- a/flang/test/Parser/OpenMP/atomic-compare.f90 +++ /dev/null @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s -! OpenMP version for documentation purposes only - it isn't used until Sema. -! This is testing for Parser errors that bail out before Sema. 
-program main - implicit none - integer :: i, j = 10 - logical :: r - - !CHECK: error: expected OpenMP construct - !$omp atomic compare write - r = i .eq. j + 1 - - !CHECK: error: expected end of line - !$omp atomic compare num_threads(4) - r = i .eq. j -end program main diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90 index 54492bf6a22a6..11e23e062bce7 100644 --- a/flang/test/Semantics/OpenMP/atomic-compare.f90 +++ b/flang/test/Semantics/OpenMP/atomic-compare.f90 @@ -44,46 +44,37 @@ !$omp end atomic ! Check for error conditions: - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic compare seq_cst seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst compare seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire compare if (b .eq. 
c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic compare acquire acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire compare acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic compare relaxed relaxed if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed compare relaxed if (b .eq. c) b = a - !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one FAIL clause can appear on the ATOMIC directive !$omp atomic fail(release) compare fail(release) if (c .eq. 
a) a = b !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index c13a11a8dd5dc..deb67e7614659 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -16,20 +16,21 @@ program sample !$omp atomic read hint(2) y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(3) y = y + 10 !$omp atomic update hint(5) y = x + y - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement y = x x = y !$omp end atomic - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic update hint(x) y = y * 1 @@ -46,7 +47,7 @@ program sample !$omp atomic hint(omp_lock_hint_speculative) x = y + x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read y = x @@ -69,36 +70,36 @@ program sample !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative) x = y + x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = y * 9 - !ERROR: Hint clause must have non-negative 
constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp atomic hint(1.0) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp atomic hint(z + omp_sync_hint_nonspeculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(k + omp_sync_hint_speculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write x = 10 * y !$omp atomic write hint(a) - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and y+x access the same storage x = y + x !$omp atomic hint(abs(-1)) write diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 new file mode 100644 index 0000000000000..2619d235380f8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -0,0 +1,89 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC READ. Expect no diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v = x + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v => x +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! 
Atomic variable can be a function reference. Expect no diagostics.
+ !$omp atomic read
+ v = p(i)
+end
+
+subroutine f03
+ integer :: x(3), y(5), v(3)
+
+ !$omp atomic read
+ !ERROR: Atomic variable x should be a scalar
+ v = x
+
+ !$omp atomic read
+ !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar
+ v = y(2:4)
+end
+
+subroutine f04
+ integer :: x, y(3), v
+
+ !$omp atomic read
+ !ERROR: Within atomic operation x and x access the same storage
+ x = x
+
+ ! Accessing same array, but not the same storage. Expect no diagnostics.
+ !$omp atomic read
+ y(1) = y(2)
+end
+
+subroutine f05
+ integer :: x, v
+
+ !$omp atomic read
+ !ERROR: Atomic expression x+1_4 should be a variable
+ v = x + 1
+end
+
+subroutine f06
+ character :: x, v
+
+ !$omp atomic read
+ !ERROR: Atomic variable x cannot have CHARACTER type
+ v = x
+end
+
+subroutine f07
+ integer, allocatable :: x
+ integer :: v
+
+ allocate(x)
+
+ !$omp atomic read
+ !ERROR: Atomic variable x cannot be ALLOCATABLE
+ v = x
+end
+
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
new file mode 100644
index 0000000000000..64e31a7974b55
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -0,0 +1,77 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+ integer :: x, y, v
+
+ !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements
+ !$omp atomic update capture
+ x = v
+ x = x + 1
+ y = x
+ !$omp end atomic
+end
+
+subroutine f01
+ integer :: x, y, v
+
+ !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments
+ !$omp atomic update capture
+ x = v
+ block
+ x = x + 1
+ y = x
+ end block
+ !$omp end atomic
+end
+
+subroutine f02
+ integer :: x, y
+
+ ! The update and capture statements can be inside of a single BLOCK.
+ ! The end-directive is then optional. Expect no diagnostics.
+ !$omp atomic update capture
+ block
+ x = x + 1
+ y = x
+ end block
+end
+
+subroutine f03
+ integer :: x
+
+ !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+ !$omp atomic update capture
+ x = x + 1
+ x = x + 2
+ !$omp end atomic
+end
+
+subroutine f04
+ integer :: x, v
+
+ !$omp atomic update capture
+ !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+ v = x
+ x = v
+ !$omp end atomic
+end
+
+subroutine f05
+ integer :: x, v, z
+
+ !$omp atomic update capture
+ v = x
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
+ z = x + 1
+ !$omp end atomic
+end
+
+subroutine f06
+ integer :: x, v, z
+
+ !$omp atomic update capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
+ z = x + 1
+ v = x
+ !$omp end atomic
+end
diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90
new file mode 100644
index 0000000000000..0aee02d429dc0
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90
@@ -0,0 +1,83 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+ integer :: x, y
+
+ ! The x is a direct argument of the + operator. Expect no diagnostics.
+ !$omp atomic update
+ x = x + (y - 1)
+end
+
+subroutine f01
+ integer :: x
+
+ ! x + 0 is unusual, but legal. Expect no diagnostics.
+ !$omp atomic update
+ x = x + 0
+end
+
+subroutine f02
+ integer :: x
+
+ ! This is formally not allowed by the syntax restrictions of the spec,
+ ! but it's equivalent to either x+0 or x*1, both of which are legal.
+ ! Allow this case. Expect no diagnostics.
+ !$omp atomic update + x = x +end + +subroutine f03 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator + x = (x + y) + 1 +end + +subroutine f04 + integer :: x + real :: y + + !$omp atomic update + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation + x = floor(x + y) +end + +subroutine f05 + integer :: x + real :: y + + !$omp atomic update + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + x = int(x + y) +end + +subroutine f06 + integer :: x, y + interface + function f(i, j) + integer :: f, i, j + end + end interface + + !$omp atomic update + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation + x = f(x, y) +end + +subroutine f07 + real :: x + integer :: y + + !$omp atomic update + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation + x = x ** y +end + +subroutine f08 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should appear as an argument in the update operation + x = y +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 index 21a9b87d26345..3084376b4275d 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 @@ -22,10 +22,10 @@ program sample x = x / y !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. y !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. 
y
 end program
diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90
new file mode 100644
index 0000000000000..979568ebf7140
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-write.f90
@@ -0,0 +1,81 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+ integer :: x, v
+ ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics.
+ !$omp atomic write
+ x = v + 1
+
+ !$omp atomic write
+ x = v + 3
+ !$omp end atomic
+end
+
+subroutine f01
+ integer, pointer :: x, v
+ ! Intrinsic assignment and pointer assignment are both ok. Expect no
+ ! diagnostics.
+ !$omp atomic write
+ x = 2 * v + 3
+
+ !$omp atomic write
+ x => v
+end
+
+subroutine f02(i)
+ integer :: i, v
+ interface
+ function p(i)
+ integer, pointer :: p
+ integer :: i
+ end
+ end interface
+
+ ! Atomic variable can be a function reference. Expect no diagostics.
+ !$omp atomic write
+ p(i) = v
+end
+
+subroutine f03
+ integer :: x(3), y(5), v(3)
+
+ !$omp atomic write
+ !ERROR: Atomic variable x should be a scalar
+ x = v
+
+ !$omp atomic write
+ !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar
+ y(2:4) = v
+end
+
+subroutine f04
+ integer :: x, y(3), v
+
+ !$omp atomic write
+ !ERROR: Within atomic operation x and x+1_4 access the same storage
+ x = x + 1
+
+ ! Accessing same array, but not the same storage. Expect no diagnostics.
+ !$omp atomic write + y(1) = y(2) +end + +subroutine f06 + character :: x, v + + !$omp atomic write + !ERROR: Atomic variable x cannot have CHARACTER type + x = v +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic write + !ERROR: Atomic variable x cannot be ALLOCATABLE + x = v +end + diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0e100871ea9b4..0dc8251ae356e 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! Check OpenMP 2.13.6 atomic Construct @@ -11,9 +11,13 @@ a = b !$omp end atomic + !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED) a = b + !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write a = b @@ -22,39 +26,32 @@ a = a + 1 !$omp end atomic + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic hint(1) acq_rel capture b = a a = a + 1 !$omp end atomic - !ERROR: expected end of line + !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct !$omp atomic read write + !ERROR: Atomic expression a+1._4 should be a variable a = a + 1 !$omp atomic a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 
'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic num_threads(4) a = a + 1 - !ERROR: expected end of line + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic capture num_threads(4) a = a + 1 + !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic relaxed a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' - !$omp atomic num_threads write - a = a + 1 - !$omp end parallel end diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90 index 173effe86b69c..f700c381cadd0 100644 --- a/flang/test/Semantics/OpenMP/atomic01.f90 +++ b/flang/test/Semantics/OpenMP/atomic01.f90 @@ -14,322 +14,277 @@ ! At most one memory-order-clause may appear on the construct. 
!READ - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic read seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst read seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic read acquire acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire read acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the 
READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic read relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed read relaxed i = j !UPDATE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic update seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst update seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The 
atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic update release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release update release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic update relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic 
relaxed update relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !CAPTURE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic capture seq_cst seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst capture seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic capture release release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release capture release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not 
allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic capture relaxed relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed capture relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel acq_rel capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic capture acq_rel acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel capture acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire capture i = j j = k !$omp 
end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic capture acquire acquire i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire capture acquire i = j j = k !$omp end atomic !WRITE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic write seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst write seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic write release release i = j - !ERROR: More than one memory order clause not allowed on 
OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release write release i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic write relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed write relaxed i = j !No atomic-clause - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as 
an argument in the update operation i = j ! 2.17.7.3 ! At most one hint clause may appear on the construct. - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At 
most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint 
(omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative) i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j j = k @@ -337,34 +292,26 @@ ! 2.17.7.4 ! If atomic-clause is read then memory-order-clause must not be acq_rel or release. - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic acq_rel read i = j - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read acq_rel i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic release read i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read release i = j ! 2.17.7.5 ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire. 
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acq_rel write i = j - !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acq_rel i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acquire write i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acquire i = j @@ -372,33 +319,27 @@ ! 2.17.7.6 ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire. - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acq_rel update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acquire update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive !$omp atomic acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed on the 
ATOMIC directive !$omp atomic acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j end program diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90 index c66085d00f157..45e41f2552965 100644 --- a/flang/test/Semantics/OpenMP/atomic02.f90 +++ b/flang/test/Semantics/OpenMP/atomic02.f90 @@ -28,36 +28,29 @@ program OmpAtomic !$omp atomic a = a/(b + 1) !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: The atomic variable c should appear as an argument in the update operation c = d !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. 
b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The /= operator is not a valid ATOMIC UPDATE operation l = a .NE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic m = m .AND. n @@ -76,32 +69,26 @@ program OmpAtomic !$omp atomic update a = a/(b + 1) !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: This is not a valid ATOMIC UPDATE operation c = c//d !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. 
b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic update m = m .AND. n diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index 76367495b9861..f5c189fd05318 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -25,28 +25,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur 
exactly once among the arguments of the top-level MAX operator z = MAX(y, 7, b, c) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8, a, d) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -60,26 +58,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX 
operator z = MAX(y, 7) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = MOD(y, 9) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation x = ABS(x) end program OmpAtomic @@ -92,7 +90,7 @@ subroutine conflicting_types() type(simple) ::s z = 1 !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(s%z, 4) end subroutine @@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) :: s !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator a = min(a, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a) !$omp atomic - !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)` a = min(b, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a, b) 
!$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator y = min(z, x) !$omp atomic z = max(z, y) !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k' + !ERROR: Atomic variable k should be a scalar + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation k = max(x, y) - + !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement x = min(x, k) !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - z =z + s%m + z = z + s%m end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index a9644ad95aa30..5c91ab5dc37e4 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -20,12 +20,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x 
= 1 + y !$omp atomic @@ -33,12 +31,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic @@ -46,12 +42,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic @@ -59,12 +53,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or 
explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic @@ -72,8 +64,7 @@ program OmpAtomic !$omp atomic m = n .AND. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic @@ -81,8 +72,7 @@ program OmpAtomic !$omp atomic m = n .OR. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic @@ -90,8 +80,7 @@ program OmpAtomic !$omp atomic m = n .EQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic @@ -99,8 +88,7 @@ program OmpAtomic !$omp atomic m = n .NEQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. 
l !$omp atomic update @@ -108,12 +96,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 + y !$omp atomic update @@ -121,12 +107,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic update @@ -134,12 +118,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' 
expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic update @@ -147,12 +129,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic update @@ -160,8 +140,7 @@ program OmpAtomic !$omp atomic update m = n .AND. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic update @@ -169,8 +148,7 @@ program OmpAtomic !$omp atomic update m = n .OR. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic update @@ -178,8 +156,7 @@ program OmpAtomic !$omp atomic update m = n .EQV. 
m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic update @@ -187,8 +164,7 @@ program OmpAtomic !$omp atomic update m = n .NEQV. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. l end program OmpAtomic @@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) p !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement x = x !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 1 !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and a*b access the same storage a = a * b + a !$omp atomic - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator a = b * (a + 9) !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (a+b) access the same storage a = a * (a + b) !$omp atomic - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (b+a) access the same storage a = (b + a) * a !$omp atomic - !ERROR: Atomic update statement should 
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(7) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(x) y = 2 @@ -54,7 +54,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint) y = 2 @@ -84,35 +84,35 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended) y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp critical (name) hint(1.0) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp critical (name) hint(z + omp_sync_hint_nonspeculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(k + omp_sync_hint_speculative) y = 2 !$omp end critical 
(name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended) y = 2 diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 505cbc48fef90..677b933932b44 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -20,70 +20,64 @@ program sample !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar v = y(1:3) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression x*(10_4+x) should be a variable v = x * (10 + x) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression 4_4 should be a variable v = 4 !$omp atomic read - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE v = k !$omp atomic write - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = x !$omp atomic update - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = k + x * (v * x) !$omp atomic - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = v * k !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'z%y' + !ERROR: Within atomic operation z%y and x+z%y access the same storage z%y = x + z%y !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + 
!ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage m = min(m, x, z%m) + k !$omp atomic read - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Atomic expression min(m,x,z%m)+k should be a variable m = min(m, x, z%m) + k !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar x = a - !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - a = x - !$omp atomic write !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement x = a !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar a = x !$omp atomic capture @@ -93,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = b + (x*1) !$omp end atomic @@ -103,60 +97,58 @@ program sample !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with 
CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct x = x + 10 v = b !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt] v = 1 x = 4 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct x = z%y + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 x = z%y !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct x = y(2) + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 x = y(2) !$omp end atomic !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable r cannot have CHARACTER type l = r !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable l cannot have CHARACTER type l = r end program diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90 index ae9fd086015dd..e8817c3f5ef61 100644 --- 
a/flang/test/Semantics/OpenMP/requires-atomic01.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> SeqCst !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> SeqCst !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> SeqCst !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> SeqCst !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> SeqCst !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90 index 4976a9667eb78..a3724a83456fd 100644 --- a/flang/test/Semantics/OpenMP/requires-atomic02.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> AcqRel !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! 
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> AcqRel !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> AcqRel !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! 
CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> AcqRel !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> AcqRel !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Capture + ! 
CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 >From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:41:16 -0500 Subject: [PATCH 05/28] Restart build >From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:45:28 -0500 Subject: [PATCH 06/28] Restart build >From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 07:54:54 -0500 Subject: [PATCH 07/28] Restart build >From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 08:18:01 -0500 Subject: [PATCH 08/28] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed --- flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +- flang/test/Semantics/OpenMP/atomic.f90 | 2 ++ 5 files changed, 6 insertions(+), 4 deletions(-) diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 index 2619d235380f8..73899a9ff37f2 100644 --- a/flang/test/Semantics/OpenMP/atomic-read.f90 +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index 64e31a7974b55..c427ba07d43d8 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python 
%S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 0aee02d429dc0..4595e02d01456 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 index 979568ebf7140..7965ad2dc7dbf 100644 --- a/flang/test/Semantics/OpenMP/atomic-write.f90 +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0dc8251ae356e..2caa161507d49 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,3 +1,5 @@ +! REQUIRES: openmp_runtime + ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! 
Check OpenMP 2.13.6 atomic Construct >From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 09:06:20 -0500 Subject: [PATCH 09/28] Fix examples --- flang/examples/FeatureList/FeatureList.cpp | 10 ------- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++-------------- 2 files changed, 7 insertions(+), 29 deletions(-) diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp index d1407cf0ef239..a36b8719e365d 100644 --- a/flang/examples/FeatureList/FeatureList.cpp +++ b/flang/examples/FeatureList/FeatureList.cpp @@ -445,13 +445,6 @@ struct NodeVisitor { READ_FEATURE(ObjectDecl) READ_FEATURE(OldParameterStmt) READ_FEATURE(OmpAlignedClause) - READ_FEATURE(OmpAtomic) - READ_FEATURE(OmpAtomicCapture) - READ_FEATURE(OmpAtomicCapture::Stmt1) - READ_FEATURE(OmpAtomicCapture::Stmt2) - READ_FEATURE(OmpAtomicRead) - READ_FEATURE(OmpAtomicUpdate) - READ_FEATURE(OmpAtomicWrite) READ_FEATURE(OmpBeginBlockDirective) READ_FEATURE(OmpBeginLoopDirective) READ_FEATURE(OmpBeginSectionsDirective) @@ -480,7 +473,6 @@ struct NodeVisitor { READ_FEATURE(OmpIterationOffset) READ_FEATURE(OmpIterationVector) READ_FEATURE(OmpEndAllocators) - READ_FEATURE(OmpEndAtomic) READ_FEATURE(OmpEndBlockDirective) READ_FEATURE(OmpEndCriticalDirective) READ_FEATURE(OmpEndLoopDirective) @@ -566,8 +558,6 @@ struct NodeVisitor { READ_FEATURE(OpenMPDeclareTargetConstruct) READ_FEATURE(OmpMemoryOrderType) READ_FEATURE(OmpMemoryOrderClause) - READ_FEATURE(OmpAtomicClause) - READ_FEATURE(OmpAtomicClauseList) READ_FEATURE(OmpAtomicDefaultMemOrderClause) READ_FEATURE(OpenMPFlushConstruct) READ_FEATURE(OpenMPLoopConstruct) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index dbbf86a6c6151..18a4107f9a9d1 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ 
b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) { // the directive field. [&](const auto &c) -> SourcePosition { const CharBlock &source{std::get<0>(c.t).source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPAtomicConstruct &c) -> SourcePosition { - return std::visit( - [&](const auto &o) -> SourcePosition { - const CharBlock &source{std::get(o.t).source}; - return parsing->allCooked() - .GetSourcePositionRange(source) - ->first; - }, - c.u); + const CharBlock &source{c.source}; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPSectionConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPUtilityConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, }, c.u); @@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - return std::visit( - [&](const auto &c) { - // Get source from the verbatim fields - const CharBlock &source{std::get(c.t).source}; - return "atomic-" + - normalize_construct_name(source.ToString()); - }, - c.u); + const CharBlock &source{c.source}; + return normalize_construct_name(source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; >From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:19 -0500 Subject: 
[PATCH 10/28] Fix example --- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++-- flang/test/Examples/omp-atomic.f90 | 16 +++++++++++----- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index 18a4107f9a9d1..b0964845d3b99 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - const CharBlock &source{c.source}; - return normalize_construct_name(source.ToString()); + auto &dirSpec = std::get(c.t); + auto &dirName = std::get(dirSpec.t); + return normalize_construct_name(dirName.source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90 index dcca34b633a3e..934f84f132484 100644 --- a/flang/test/Examples/omp-atomic.f90 +++ b/flang/test/Examples/omp-atomic.f90 @@ -26,25 +26,31 @@ ! CHECK:--- ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 9 -! CHECK-NEXT: construct: atomic-read +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: -! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: - clause: read ! CHECK-NEXT: details: '' +! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 12 -! CHECK-NEXT: construct: atomic-write +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: ! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' +! CHECK-NEXT: - clause: write ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 16 -! CHECK-NEXT: construct: atomic-capture +! 
CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: +! CHECK-NEXT: - clause: capture +! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;' ! CHECK-NEXT: - clause: seq_cst ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 21 -! CHECK-NEXT: construct: atomic-atomic +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: [] ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 8 >From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:47 -0500 Subject: [PATCH 11/28] Remove reference to *nullptr --- flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ab868df76d298..4d9b375553c1e 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); - mlir::Operation *secondOp = genAtomicOperation( - converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, - *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + mlir::Operation *secondOp = nullptr; + if (analysis.op1.what != analysis.None) { + secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, + atomAddr, atom, *get(analysis.op1.expr), + hint, memOrder, atomicAt, prepareAt); + } if (secondOp) { builder.setInsertionPointAfter(secondOp); >From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 2 May 2025 18:49:05 -0500 Subject: [PATCH 12/28] Updates and improvements --- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++-- flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++---- 
flang/lib/Semantics/check-omp-structure.h | 1 + .../Todo/atomic-capture-implicit-cast.f90 | 48 --- .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 - .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +- .../OpenMP/atomic-update-capture.f90 | 8 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +- 9 files changed, 381 insertions(+), 180 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 77f57b1cb85c7..8213fe33edbd0 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct { struct Op { int what; - TypedExpr expr; + AssignmentStmt::TypedAssignment assign; }; TypedExpr atom, cond; Op op0, op1; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6177b59199481..7b6c22095d723 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter, static mlir::Operation * // genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Value storeAddr = + 
fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Type storeType = fir::unwrapRefType(storeAddr.getType()); + + mlir::Value toAddr = [&]() { + if (atomType == storeType) + return storeAddr; + return builder.createTemporary(loc, atomType, ".tmp.atomval"); + }(); builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + + if (atomType != storeType) { + lower::ExprToValueMap overrides; + // The READ operation could be a part of UPDATE CAPTURE, so make sure + // we don't emit extra code into the body of the atomic op. + builder.restoreInsertionPoint(postAt); + mlir::Value load = builder.create(loc, toAddr); + overrides.try_emplace(&atom, load); + + converter.overrideExprValues(&overrides); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); + converter.resetExprOverrides(); + + builder.create(loc, value, storeAddr); + } builder.restoreInsertionPoint(saved); return op; } @@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, static mlir::Operation * // genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value value = 
fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); mlir::Value converted = builder.createConvert(loc, atomType, value); @@ -2719,19 +2746,20 @@ static mlir::Operation * genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrResizeOf(arg, atom)) { @@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, converter.overrideExprValues(&overrides); mlir::Value updated = - fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Value converted = builder.createConvert(loc, atomType, updated); builder.create(loc, converted); converter.resetExprOverrides(); @@ -2764,20 +2792,21 @@ static mlir::Operation * genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, 
lower::StatementContext &stmtCtx, int action, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: - return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Write: - return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Update: - return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign, + hint, memOrder, preAt, atomicAt, postAt); default: return nullptr; } @@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { } return ""s; }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; const SomeExpr &atom = *analysis.atom.get()->v; @@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; llvm::errs() << " op0 {\n"; llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; - 
llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << " op1 {\n"; llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; - llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << "}\n"; } @@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAtomicConstruct &construct) { - auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { - if (auto *maybe = expr.get(); maybe && maybe->v) { + auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) { + if (auto *maybe = typedWrapper.get(); maybe && maybe->v) { return &*maybe->v; } else { return nullptr; @@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, int action0 = analysis.op0.what & analysis.Action; int action1 = analysis.op1.what & analysis.Action; mlir::Operation *captureOp = nullptr; - fir::FirOpBuilder::InsertPoint atomicAt; - fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint atomicAt, postAt; if (construct.IsCapture()) { // Capturing operation. @@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, captureOp = builder.create(loc, hint, memOrder); // Set the non-atomic insertion point to before the atomic.capture. 
- prepareAt = getInsertionPointBefore(captureOp); + preAt = getInsertionPointBefore(captureOp); mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); builder.setInsertionPointToEnd(block); @@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, // atomic.capture. mlir::Operation *term = builder.create(loc); atomicAt = getInsertionPointBefore(term); + postAt = getInsertionPointAfter(captureOp); hint = nullptr; memOrder = nullptr; } else { @@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(action0 != analysis.None && action1 == analysis.None && "Expexcing single action"); assert(!(analysis.op0.what & analysis.Condition)); - atomicAt = prepareAt; + postAt = atomicAt = preAt; } mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, - *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); mlir::Operation *secondOp = nullptr; if (analysis.op1.what != analysis.None) { secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, - atomAddr, atom, *get(analysis.op1.expr), - hint, memOrder, atomicAt, prepareAt); + atomAddr, atom, *get(analysis.op1.assign), + hint, memOrder, preAt, atomicAt, postAt); } if (secondOp) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 201b38bd05ff3..f7753a5e5cc59 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } -static bool IsVarOrFunctionRef(const SomeExpr &expr) { - return evaluate::UnwrapProcedureRef(expr) != nullptr || - evaluate::IsVariable(expr); +static bool 
IsVarOrFunctionRef(const MaybeExpr &expr) { + if (expr) { + return evaluate::UnwrapProcedureRef(*expr) != nullptr || + evaluate::IsVariable(*expr); + } else { + return false; + } } static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { @@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource( namespace atomic { +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + enum class Operator { Unk, // Operators that are officially allowed in the update operation @@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (Append(v, std::move(results)), ...); + (MoveAppend(v, std::move(results)), ...); return v; } +}; -private: - static void Append(Result &acc, Result &&data) { - for (auto &&s : data) { - acc.push_back(std::move(s)); +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). 
+ return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + auto copy{x.derived()}; + return {evaluate::AsGenericExpr(std::move(copy)), {}}; } } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; }; } // namespace atomic @@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } +static MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { // Both expr and x have the form of SomeType(SomeKind(...)[1]). // Check if expr is @@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { } } +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { if (value) { expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), @@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { } } +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( - int what, const MaybeExpr &maybeExpr = std::nullopt) { + int what, + const std::optional &maybeAssign = std::nullopt) { parser::OpenMPAtomicConstruct::Analysis::Op operation; operation.what = what; - SetExpr(operation.expr, maybeExpr); + SetAssignment(operation.assign, maybeAssign); return operation; } @@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( // }; // struct Op { // int what; - // TypedExpr expr; + // TypedAssignment assign; // }; // TypedExpr atom, cond; // Op op0, op1; @@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, } } +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + /// Check 
if `expr` satisfies the following conditions for x and v: /// /// [6.0:189:10-12] @@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture( // // The two allowed cases are: // x = ... atomic-var = ... - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // or - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // x = ... atomic-var = ... // // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture @@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture( // // If the two statements don't fit these criteria, return a pair of default- // constructed values. + using ReturnTy = std::pair; SourcedActionStmt act1{GetActionStmt(ec1)}; SourcedActionStmt act2{GetActionStmt(ec2)}; @@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture( auto isUpdateCapture{ [](const evaluate::Assignment &u, const evaluate::Assignment &c) { - return u.lhs == c.rhs; + return IsSameOrConvertOf(c.rhs, u.lhs); }}; // Do some checks that narrow down the possible choices for the update // and the capture statements. This will help to emit better diagnostics. - bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); - bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; + + auto errorCaptureShouldRead{[&](const parser::CharBlock &source, + const std::string &expr) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US, + expr); + }}; - if (couldBeCapture1) { - if (couldBeCapture2) { - if (isUpdateCapture(as2, as1)) { - if (isUpdateCapture(as1, as2)) { - // If both statements could be captures and both could be updates, - // emit a warning about the ambiguity. - context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); - } - return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); - } else if (isUpdateCapture(as1, as2)) { + auto errorNeitherWorks{[&]() { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US); + }}; + + auto makeSelectionFromDet{[&](int det) -> ReturnTy { + // If det != 0, then the checks unambiguously suggest a specific + // categorization. + // If det == 0, then this function should be called only if the + // checks haven't ruled out any possibility, i.e. when both assignments + // could still be either updates or captures. 
+ if (det > 0) { + // as1 is update, as2 is capture + if (isUpdateCapture(as1, as2)) { return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - context_.Say(source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, - as1.rhs.AsFortran(), as2.rhs.AsFortran()); + errorCaptureShouldRead(act2.source, as1.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } else { // !couldBeCapture2 + } else if (det < 0) { + // as2 is update, as1 is capture if (isUpdateCapture(as2, as1)) { return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } else { - context_.Say(act2.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as1.rhs.AsFortran()); + errorCaptureShouldRead(act1.source, as2.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } - } else { // !couldBeCapture1 - if (couldBeCapture2) { - if (isUpdateCapture(as1, as2)) { - return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); - } else { + } else { + bool updateFirst{isUpdateCapture(as1, as2)}; + bool captureFirst{isUpdateCapture(as2, as1)}; + if (updateFirst && captureFirst) { + // If both assignments could be the update and both could be the + // capture, emit a warning about the ambiguity. context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as2.rhs.AsFortran()); + "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US); + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } - } else { - context_.Say(source, - "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + if (updateFirst != captureFirst) { + const parser::ExecutionPartConstruct *upd{updateFirst ? 
ec1 : ec2}; + const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2}; + return std::make_pair(upd, cap); + } + assert(!updateFirst && !captureFirst); + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); } + }}; + + if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) { + return makeSelectionFromDet(det); } + assert(det == 0 && "Prior checks should have covered det != 0"); - return std::make_pair(nullptr, nullptr); + // If neither of the statements is an RMW update, it could still be a + // "write" update. Pretty much any assignment can be a write update, so + // recompute det with cbu1 = cbu2 = true. + if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) { + return makeSelectionFromDet(writeDet); + } + + // It's only errors from here on. + + if (!cbu1 && !cbu2 && !cbc1 && !cbc2) { + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); + } + + // The remaining cases are that + // - no candidate for update, or for capture, + // - one of the assignments cannot be anything. + + if (!cbu1 && !cbu2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US); + return std::make_pair(nullptr, nullptr); + } else if (!cbc1 && !cbc2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) { + auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source; + context_.Say(src, + "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + // All cases should have been covered. 
+ llvm_unreachable("Unchecked condition"); } void OmpStructureChecker::CheckAtomicCaptureAssignment( const evaluate::Assignment &capture, const SomeExpr &atom, parser::CharBlock source) { - const SomeExpr &cap{capture.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &cap{capture.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, rsrc); - // This part should have been checked prior to callig this function. - assert(capture.rhs == atom && "This canont be a capture assignment"); + // This part should have been checked prior to calling this function. + assert(*GetConvertInput(capture.rhs) == atom && + "This canont be a capture assignment"); CheckStorageOverlap(atom, {cap}, source); } } void OmpStructureChecker::CheckAtomicReadAssignment( const evaluate::Assignment &read, parser::CharBlock source) { - const SomeExpr &atom{read.rhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; - if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + if (auto maybe{GetConvertInput(read.rhs)}) { + const SomeExpr &atom{*maybe}; + + if (!IsVarOrFunctionRef(atom)) { + ErrorShouldBeVariable(atom, rsrc); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } } else { - CheckAtomicVariable(atom, rsrc); - CheckStorageOverlap(atom, {read.lhs}, source); + ErrorShouldBeVariable(read.rhs, rsrc); } } @@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment( // one of the following forms: // x = expr // x => expr - const SomeExpr &atom{write.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{write.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + 
ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, lsrc); CheckStorageOverlap(atom, {write.rhs}, source); @@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( // x = intrinsic-procedure-name (x) // x = intrinsic-procedure-name (x, expr-list) // x = intrinsic-procedure-name (expr-list, x) - const SomeExpr &atom{update.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{update.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); // Skip other checks. return; } @@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( const SomeExpr &cond, parser::CharBlock condSource, const evaluate::Assignment &assign, parser::CharBlock assignSource) { - const SomeExpr &atom{assign.lhs}; auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + const SomeExpr &atom{assign.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, arsrc); // Skip other checks. 
return; } @@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( @@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign), MakeAtomicAnalysisOp(Analysis::None)); } @@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( if (GetActionStmt(&body.front()).stmt == uact.stmt) { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(action, update.rhs), - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + MakeAtomicAnalysisOp(action, update), + MakeAtomicAnalysisOp(Analysis::Read, capture)); } else { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), - MakeAtomicAnalysisOp(action, update.rhs)); + MakeAtomicAnalysisOp(Analysis::Read, capture), + MakeAtomicAnalysisOp(action, update)); } } @@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( if (captureFirst) { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign)); } else { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, 
capAssign.lhs)); + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign)); } } @@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead( if (body.size() == 1) { SourcedActionStmt action{GetActionStmt(&body.front())}; if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { - const SomeExpr &atom{maybeRead->rhs}; CheckAtomicReadAssignment(*maybeRead, action.source); - using Analysis = parser::OpenMPAtomicConstruct::Analysis; - x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), - MakeAtomicAnalysisOp(Analysis::None)); + if (auto maybe{GetConvertInput(maybeRead->rhs)}) { + const SomeExpr &atom{*maybe}; + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead), + MakeAtomicAnalysisOp(Analysis::None)); + } } else { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); @@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index bf6fbf16d0646..835fbe45e1c0e 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -253,6 +253,7 @@ class OmpStructureChecker void CheckStorageOverlap(const evaluate::Expr &, llvm::ArrayRef>, parser::CharBlock); + void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source); void CheckAtomicVariable( const evaluate::Expr &, parser::CharBlock); std::pair&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture 
requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..aa9d2e0ac3ff7 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -1,5 +1,3 @@ -! REQUIRES : openmp_runtime - ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s ! 
CHECK: func.func @_QPatomic_implicit_cast_read() { diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index deb67e7614659..8adb0f1a67409 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -25,7 +25,7 @@ program sample !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement y = x x = y !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index c427ba07d43d8..f808ed916fb7e 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -39,7 +39,7 @@ subroutine f02 subroutine f03 integer :: x - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture !$omp atomic update capture x = x + 1 x = x + 2 @@ -50,7 +50,7 @@ subroutine f04 integer :: x, v !$omp atomic update capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement v = x x = v !$omp end atomic @@ -60,8 +60,8 @@ subroutine f05 integer :: x, v, z !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand 
side of the capture assignment should read z v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 !$omp end atomic end @@ -70,8 +70,8 @@ subroutine f06 integer :: x, v, z !$omp atomic update capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z v = x !$omp end atomic end diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 677b933932b44..5e180aa0bbe5b 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -97,50 +97,50 @@ program sample !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture x = x + 10 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x v = b !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture !$omp 
atomic capture v = 1 x = 4 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) !$omp end atomic >From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 24 Mar 2025 15:38:58 -0500 Subject: [PATCH 13/28] DumpEvExpr: show type --- flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 2f445429a10b5..1553dac3b6687 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,6 +16,7 @@ #include #include +#include #include #include @@ -38,6 +39,17 @@ class DumpEvaluateExpr { } private: + template + struct TypeOf { + static constexpr std::string_view name{TypeOf::get()}; + static constexpr std::string_view get() { + std::string_view v(__PRETTY_FUNCTION__); 
+ v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" + return v; + } + }; + template void Show(const common::Indirection &x) { Show(x.value()); } @@ -76,7 +88,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant"); + Indent("derived constant "s + std::string(TypeOf::name)); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -84,7 +96,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant"); + Print("constant "s + std::string(TypeOf::name)); } } void Show(const Symbol &symbol); @@ -102,7 +114,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator"); + Indent("designator "s + std::string(TypeOf::name)); Show(x.u); Outdent(); } @@ -117,7 +129,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref"); + Indent("function ref "s + std::string(TypeOf::name)); Show(x.proc()); Show(x.arguments()); Outdent(); @@ -127,14 +139,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value"); + Indent("array constructor value "s + std::string(TypeOf::name)); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do"); + Indent("implied do "s + std::string(TypeOf::name)); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -148,20 +160,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op"); + Indent("unary op "s + std::string(TypeOf::name)); Show(op.left()); Outdent(); } 
template void Show(const evaluate::Operation &op) { - Indent("binary op"); + Indent("binary op "s + std::string(TypeOf::name)); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr T"); + Indent("expr <" + std::string(TypeOf::name) + ">"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index aa0b4e0f03398..66cedab94bfb4 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("expr some type"); + Indent("relational some type"); Show(x.u); Outdent(); } >From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 15:14:45 -0500 Subject: [PATCH 14/28] Handle conversion from real to complex via complex constructor --- flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++--- 1 file changed, 47 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dada9c6c2bd6f..ae81dcb5ea150 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,36 +3183,46 @@ struct ConvertCollector using Base::operator(); template // - Result operator()(const evaluate::Designator &x) const { + Result asSomeExpr(const T &x) const { auto copy{x}; return {AsGenericExpr(std::move(copy)), {}}; } + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + template // Result operator()(const evaluate::FunctionRef &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template // Result operator()(const evaluate::Constant &x) const { - auto copy{x}; - return 
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/28] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: } >From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:14:47 -0500 Subject: [PATCH 16/28] Allow conversion in update operations --- flang/include/flang/Semantics/tools.h | 17 ++++----- flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++-- flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++----------- .../Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++-- flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++---------- .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +- 7 files changed, 44 insertions(+), 57 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7f1ec59b087a2..9be2feb8ae064 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -789,14 +789,15 @@ inline bool checkForSymbolMatch( /// return the "expr" but with top-level parentheses stripped. std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); -/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). -/// Check if "expr" is -/// SomeType(SomeKind(Type( -/// Convert -/// SomeKind(...)[2]))) -/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves -/// TypeCategory. -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// it returns std::nullopt. 
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic >From 341723713929507c59d528540d32bc2e4213e920 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:21:56 -0500 Subject: [PATCH 17/28] format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e8b1b..0f553541c5ef0 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } >From 2686207342bad511f6d51b20ed923c0d2cc9047b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:22:26 -0500 Subject: [PATCH 18/28] Revert "DumpEvExpr: show type" This reverts commit 40510a3068498d15257cc7d198bce9c8cd71a902. Debug changes accidentally pushed upstream. 
--- flang/include/flang/Semantics/dump-expr.h | 30 +++++++---------------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 10 insertions(+), 22 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b6687..2f445429a10b5 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,7 +16,6 @@ #include #include -#include #include #include @@ -39,17 +38,6 @@ class DumpEvaluateExpr { } private: - template - struct TypeOf { - static constexpr std::string_view name{TypeOf::get()}; - static constexpr std::string_view get() { - std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" - return v; - } - }; - template void Show(const common::Indirection &x) { Show(x.value()); } @@ -88,7 +76,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant "s + std::string(TypeOf::name)); + Indent("derived constant"); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -96,7 +84,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant "s + std::string(TypeOf::name)); + Print("constant"); } } void Show(const Symbol &symbol); @@ -114,7 +102,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator "s + std::string(TypeOf::name)); + Indent("designator"); Show(x.u); Outdent(); } @@ -129,7 +117,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref "s + std::string(TypeOf::name)); + Indent("function ref"); Show(x.proc()); Show(x.arguments()); Outdent(); @@ 
-139,14 +127,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value "s + std::string(TypeOf::name)); + Indent("array constructor value"); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do "s + std::string(TypeOf::name)); + Indent("implied do"); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -160,20 +148,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op "s + std::string(TypeOf::name)); + Indent("unary op"); Show(op.left()); Outdent(); } template void Show(const evaluate::Operation &op) { - Indent("binary op "s + std::string(TypeOf::name)); + Indent("binary op"); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr <" + std::string(TypeOf::name) + ">"); + Indent("expr T"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 66cedab94bfb4..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("relational some type"); + Indent("expr some type"); Show(x.u); Outdent(); } >From c00fc531bcf742c409fc974da94c5b362fa9132c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:37:19 -0500 Subject: [PATCH 19/28] Delete unnecessary static_assert --- flang/lib/Semantics/check-omp-structure.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 6005dda7c26fe..2e59553d5e130 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -21,8 +21,6 @@ namespace Fortran::semantics { -static_assert(std::is_same_v>); - template static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { return !(e == f); >From 45b012c16b77c757a0d09b2a229bad49fed8d26f Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:25 -0500 Subject: [PATCH 20/28] Add missing initializer for 'iff' --- flang/lib/Semantics/check-omp-structure.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 2e59553d5e130..aa1bd136b371f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2815,7 +2815,7 @@ static std::optional AnalyzeConditionalStmt( } } else { AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, - GetActionStmt(std::get(s.t))}; + GetActionStmt(std::get(s.t)), SourcedActionStmt{}}; if (result.ift.stmt) { return result; } >From daeac25991bf14fb08c3accabe068c074afa1eb7 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:47 -0500 Subject: [PATCH 21/28] Add asserts for printing "Identity" as top-level operator --- flang/lib/Semantics/check-omp-structure.cpp | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa1bd136b371f..062b45deac865 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3823,6 +3823,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, atomic::ToString(top.first)); @@ -3852,6 +3853,8 @@ void 
OmpStructureChecker::CheckAtomicUpdateAssignment( "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, atom.AsFortran(), atomic::ToString(top.first)); @@ -3898,16 +3901,20 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, atomic::ToString(top.first)); } break; } + case atomic::Operator::Identity: case atomic::Operator::True: case atomic::Operator::False: break; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, atomic::ToString(top.first)); >From ae121e5c37453af1a4aba7c77939c2c1c45b75fa Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 13:15:58 -0500 Subject: [PATCH 22/28] Explain the use of determinant --- flang/lib/Semantics/check-omp-structure.cpp | 31 ++++++++++++++++++--- 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 062b45deac865..bc6a09b9768ef 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3606,10 +3606,33 @@ OmpStructureChecker::CheckUpdateCapture( // subexpression of the right-hand side. // 2. An assignment could be a capture (cbc) if the right-hand side is // a variable (or a function ref), with potential type conversions. 
- bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; - bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; - bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; - bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; // Can as1 be an update? + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; // Can as2 be an update? + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; // Can 1 be capture? + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; // Can 2 be capture? + + // We want to diagnose cases where both assignments cannot be an update, + // or both cannot be a capture, as well as cases where either assignment + // cannot be any of these two. + // + // If we organize these boolean values into a matrix + // |cbu1 cbu2| + // |cbc1 cbc2| + // then we want to diagnose cases where the matrix has a zero (i.e. "false") + // row or column, including the case where everything is zero. All these + // cases correspond to the determinant of the matrix being 0, which suggests + // that checking the det may be a convenient diagnostic check. There is only + // one additional case where the det is 0, which is when the matrix is all 1 + // ("true"). The "all true" case represents the situation where both + // assignments could be an update as well as a capture. On the other hand, + // whenever det != 0, the roles of the update and the capture can be + // unambiguously assigned to as1 and as2 [1]. + // + // [1] This can be easily verified by hand: there are 10 2x2 matrices with + // det = 0, leaving 6 cases where det != 0: + // 0 1 0 1 1 0 1 0 1 1 1 1 + // 1 0 1 1 0 1 1 1 0 1 1 0 + // In each case the classification is unambiguous. 
// |cbu1 cbu2| // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 >From cae0e8fcd3f6b8c2bc3ad8f85599ef4765c6afc5 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 11:48:18 -0500 Subject: [PATCH 23/28] Deal with assignments that failed Fortran semantic checks Don't emit diagnostics for those. --- flang/lib/Semantics/check-omp-structure.cpp | 66 ++++++++++++++------- 1 file changed, 46 insertions(+), 20 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bc6a09b9768ef..89a3a407441a8 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2726,6 +2726,9 @@ static SourcedActionStmt GetActionStmt(const parser::Block &block) { // Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption // is that the ActionStmt will be either an assignment or a pointer-assignment, // otherwise return std::nullopt. +// Note: This function can return std::nullopt on [Pointer]AssignmentStmt where +// the "typedAssignment" is unset. This can happen if there are semantic errors +// in the purported assignment. static std::optional GetEvaluateAssignment( const parser::ActionStmt *x) { if (x == nullptr) { @@ -2754,6 +2757,29 @@ static std::optional GetEvaluateAssignment( x->u); } +// Check if the ActionStmt is actually a [Pointer]AssignmentStmt. This is +// to separate cases where the source has something that looks like an +// assignment, but is semantically wrong (diagnosed by general semantic +// checks), and where the source has some other statement (which we want +// to report as "should be an assignment"). 
+static bool IsAssignment(const parser::ActionStmt *x) { + if (x == nullptr) { + return false; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + + return common::visit( + [](auto &&s) -> bool { + using BareS = llvm::remove_cvref_t; + return std::is_same_v || + std::is_same_v; + }, + x->u); +} + static std::optional AnalyzeConditionalStmt( const parser::ExecutionPartConstruct *x) { if (x == nullptr) { @@ -3588,8 +3614,10 @@ OmpStructureChecker::CheckUpdateCapture( auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; if (!maybeAssign1 || !maybeAssign2) { - context_.Say(source, - "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + if (!IsAssignment(act1.stmt) || !IsAssignment(act2.stmt)) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + } return std::make_pair(nullptr, nullptr); } @@ -3956,7 +3984,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( // The if-true statement must be present, and must be an assignment. 
auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; if (!maybeAssign) { - if (update.ift.stmt) { + if (update.ift.stmt && !IsAssignment(update.ift.stmt)) { context_.Say(update.ift.source, "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); } else { @@ -3992,7 +4020,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); } @@ -4094,17 +4122,11 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( } SourcedActionStmt uact{GetActionStmt(uec)}; SourcedActionStmt cact{GetActionStmt(cec)}; - auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; - auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; - - if (!maybeUpdate || !maybeCapture) { - context_.Say(source, - "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); - return; - } + // The "dereferences" of std::optional are guaranteed to be valid after + // CheckUpdateCapture. 
+ evaluate::Assignment update{*GetEvaluateAssignment(uact.stmt)}; + evaluate::Assignment capture{*GetEvaluateAssignment(cact.stmt)}; - const evaluate::Assignment &update{*maybeUpdate}; - const evaluate::Assignment &capture{*maybeCapture}; const SomeExpr &atom{update.lhs}; using Analysis = parser::OpenMPAtomicConstruct::Analysis; @@ -4242,13 +4264,17 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( return; } } else { - context_.Say(capture.source, - "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + if (!IsAssignment(capture.stmt)) { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + } return; } } else { - context_.Say(update.ift.source, - "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + if (!IsAssignment(update.ift.stmt)) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + } return; } @@ -4316,7 +4342,7 @@ void OmpStructureChecker::CheckAtomicRead( MakeAtomicAnalysisOp(Analysis::Read, maybeRead), MakeAtomicAnalysisOp(Analysis::None)); } - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); } @@ -4350,7 +4376,7 @@ void OmpStructureChecker::CheckAtomicWrite( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); } >From 6bc8c10c793ebac02c78daec33e7fb5e6becb8e0 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 12:47:00 -0500 Subject: [PATCH 24/28] Move common functions to tools.cpp --- flang/include/flang/Semantics/tools.h | 134 +++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 
flang/lib/Semantics/check-omp-structure.cpp | 506 ++------------------ flang/lib/Semantics/tools.cpp | 310 ++++++++++++ 4 files changed, 484 insertions(+), 468 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 821f1ae34fd5b..25fadceefceb0 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -778,11 +778,135 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, return false; } -/// If the top-level operation (ignoring parentheses) is either an -/// evaluate::FunctionRef, or a specialization of evaluate::Operation, -/// then return the list of arguments (wrapped in SomeExpr). Otherwise, -/// return the "expr" but with top-level parentheses stripped. -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); +namespace operation { + +enum class Operator { + Add, + And, + Associated, + Call, + Convert, + Div, + Eq, + Eqv, + False, + Ge, + Gt, + Identity, + Intrinsic, + Lt, + Max, + Min, + Mul, + Ne, + Neqv, + Not, + Or, + Pow, + Resize, // Convert within the same TypeCategory + Sub, + True, + Unknown, +}; + +std::string ToString(Operator op); + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unknown; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return 
Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unknown; +} + +template +Operator OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Add; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Sub; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Mul; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Div; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } +} + +template // +Operator OperationCode(const T &) { + return Operator::Unknown; +} + +Operator OperationCode(const evaluate::ProcedureDesignator &proc); + +} // namespace operation + +/// Return information about the top-level operation (ignoring parentheses): +/// the operation code and the list of arguments. +std::pair> +GetTopLevelOperation(const SomeExpr &expr); /// Check if expr is same as x, or a sequence of Convert operations on x. bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ad5eae4ae39a2..c74f7627c5e25 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2828,7 +2828,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, // This must exist by now. 
SomeExpr input = *semantics::GetConvertInput(assign.rhs); - std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; + std::vector args{semantics::GetTopLevelOperation(input).second}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrConvertOf(arg, atom)) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 89a3a407441a8..f29a56d5fd92a 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2891,290 +2891,6 @@ static std::pair SplitAssignmentSource( namespace atomic { -template static void MoveAppend(V &accum, V &&other) { - for (auto &&s : other) { - accum.push_back(std::move(s)); - } -} - -enum class Operator { - Unk, - // Operators that are officially allowed in the update operation - Add, - And, - Associated, - Div, - Eq, - Eqv, - Ge, // extension - Gt, - Identity, // extension: x = x is allowed (*), but we should never print - // "identity" as the name of the operator - Le, // extension - Lt, - Max, - Min, - Mul, - Ne, // extension - Neqv, - Or, - Sub, - // Operators that we recognize for technical reasons - True, - False, - Not, - Convert, - Resize, - Intrinsic, - Call, - Pow, - - // (*): "x = x + 0" is a valid update statement, but it will be folded - // to "x = x" by the time we look at it. Since the source statements - // "x = x" and "x = x + 0" will end up looking the same, accept the - // former as an extension. 
-}; - -std::string ToString(Operator op) { - switch (op) { - case Operator::Add: - return "+"; - case Operator::And: - return "AND"; - case Operator::Associated: - return "ASSOCIATED"; - case Operator::Div: - return "/"; - case Operator::Eq: - return "=="; - case Operator::Eqv: - return "EQV"; - case Operator::Ge: - return ">="; - case Operator::Gt: - return ">"; - case Operator::Identity: - return "identity"; - case Operator::Le: - return "<="; - case Operator::Lt: - return "<"; - case Operator::Max: - return "MAX"; - case Operator::Min: - return "MIN"; - case Operator::Mul: - return "*"; - case Operator::Neqv: - return "NEQV/EOR"; - case Operator::Ne: - return "/="; - case Operator::Or: - return "OR"; - case Operator::Sub: - return "-"; - case Operator::True: - return ".TRUE."; - case Operator::False: - return ".FALSE."; - case Operator::Not: - return "NOT"; - case Operator::Convert: - return "type-conversion"; - case Operator::Resize: - return "resize"; - case Operator::Intrinsic: - return "intrinsic"; - case Operator::Call: - return "function-call"; - case Operator::Pow: - return "**"; - default: - return "??"; - } -} - -template // -struct ArgumentExtractor - : public evaluate::Traverse, - std::pair>, false> { - using Arguments = std::vector; - using Result = std::pair; - using Base = evaluate::Traverse, - Result, false>; - static constexpr auto IgnoreResizes = IgnoreResizingConverts; - static constexpr auto Logical = common::TypeCategory::Logical; - ArgumentExtractor() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result operator()( - const evaluate::Constant> &x) const { - if (const auto &val{x.GetScalarValue()}) { - return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) - : std::make_pair(Operator::False, Arguments{}); - } - return Default(); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - Result result{OperationCode(x.proc()), {}}; - for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { - if (auto *e{x.UnwrapArgExpr(i)}) { - result.second.push_back(*e); - } - } - return result; - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. - return (*this)(x.template operand<0>()); - } - if constexpr (IgnoreResizes && - std::is_same_v>) { - // Ignore conversions within the same category. - // Atomic operations on int(kind=1) may be implicitly widened - // to int(kind=4) for example. - return (*this)(x.template operand<0>()); - } else { - return std::make_pair( - OperationCode(x), OperationArgs(x, std::index_sequence_for{})); - } - } - - template // - Result operator()(const evaluate::Designator &x) const { - evaluate::Designator copy{x}; - Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; - return result; - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - // There shouldn't be any combining needed, since we're stopping the - // traversal at the top-level operation, but implement one that picks - // the first non-empty result. 
- if constexpr (sizeof...(Rs) == 0) { - return std::move(result); - } else { - if (!result.second.empty()) { - return std::move(result); - } else { - return Combine(std::move(results)...); - } - } - } - -private: - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) - const { - switch (op.derived().logicalOperator) { - case common::LogicalOperator::And: - return Operator::And; - case common::LogicalOperator::Or: - return Operator::Or; - case common::LogicalOperator::Eqv: - return Operator::Eqv; - case common::LogicalOperator::Neqv: - return Operator::Neqv; - case common::LogicalOperator::Not: - return Operator::Not; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - switch (op.derived().opr) { - case common::RelationalOperator::LT: - return Operator::Lt; - case common::RelationalOperator::LE: - return Operator::Le; - case common::RelationalOperator::EQ: - return Operator::Eq; - case common::RelationalOperator::NE: - return Operator::Ne; - case common::RelationalOperator::GE: - return Operator::Ge; - case common::RelationalOperator::GT: - return Operator::Gt; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Add; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Sub; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Mul; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Div; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - if constexpr (C == 
T::category) { - return Operator::Resize; - } else { - return Operator::Convert; - } - } - Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { - Operator code = llvm::StringSwitch(proc.GetName()) - .Case("associated", Operator::Associated) - .Case("min", Operator::Min) - .Case("max", Operator::Max) - .Case("iand", Operator::And) - .Case("ior", Operator::Or) - .Case("ieor", Operator::Neqv) - .Default(Operator::Call); - if (code == Operator::Call && proc.GetSpecificIntrinsic()) { - return Operator::Intrinsic; - } - return code; - } - template // - Operator OperationCode(const T &) const { - return Operator::Unk; - } - - template - Arguments OperationArgs(const evaluate::Operation &x, - std::index_sequence) const { - return Arguments{SomeExpr(x.template operand())...}; - } -}; - struct DesignatorCollector : public evaluate::Traverse, false> { using Result = std::vector; @@ -3196,125 +2912,14 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (MoveAppend(v, std::move(results)), ...); - return v; - } -}; - -struct ConvertCollector - : public evaluate::Traverse>, false> { - using Result = std::pair>; - using Base = evaluate::Traverse; - ConvertCollector() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result asSomeExpr(const T &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; - } - - template // - Result operator()(const evaluate::Designator &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::Constant &x) const { - return asSomeExpr(x); - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore parentheses. 
- return (*this)(x.template operand<0>()); - } else if constexpr (is_convert_v) { - // Convert should always have a typed result, so it should be safe to - // dereference x.GetType(). - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else if constexpr (is_complex_constructor_v) { - // This is a conversion iff the imaginary operand is 0. - if (IsZero(x.template operand<1>())) { - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else { - return asSomeExpr(x.derived()); - } - } else { - return asSomeExpr(x.derived()); - } - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - Result v(std::move(result)); - auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { - assert((!x.has_value() || !y.has_value()) && "Multiple designators"); - if (!x.has_value()) { - x = std::move(y); + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); } }}; - (setValue(v.first, std::move(results).first), ...); - (MoveAppend(v.second, std::move(results).second), ...); + (moveAppend(v, std::move(results)), ...); return v; } - -private: - template // - static bool IsZero(const T &x) { - return false; - } - template // - static bool IsZero(const evaluate::Expr &x) { - return common::visit([](auto &&s) { return IsZero(s); }, x.u); - } - template // - static bool IsZero(const evaluate::Constant &x) { - if (auto &&maybeScalar{x.GetScalarValue()}) { - return maybeScalar->IsZero(); - } else { - return false; - } - } - - template // - struct is_convert { - static constexpr bool value{false}; - }; - template // - struct is_convert> { - static constexpr bool value{true}; - }; - template // - struct is_convert> { - // Conversion from complex to real. 
- static constexpr bool value{true}; - }; - template // - static constexpr bool is_convert_v = is_convert::value; - - template // - struct is_complex_constructor { - static constexpr bool value{false}; - }; - template // - struct is_complex_constructor> { - static constexpr bool value{true}; - }; - template // - static constexpr bool is_complex_constructor_v = - is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { @@ -3347,22 +2952,13 @@ static bool IsAllocatable(const SomeExpr &expr) { return !syms.empty() && IsAllocatable(syms.back()); } -static std::pair> GetTopLevelOperation( - const SomeExpr &expr) { - return atomic::ArgumentExtractor{}(expr); -} - -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { - return GetTopLevelOperation(expr).second; -} - static bool IsPointerAssignment(const evaluate::Assignment &x) { return std::holds_alternative(x.u) || std::holds_alternative(x.u); } static bool IsCheckForAssociated(const SomeExpr &cond) { - return GetTopLevelOperation(cond).first == atomic::Operator::Associated; + return GetTopLevelOperation(cond).first == operation::Operator::Associated; } static bool HasCommonDesignatorSymbols( @@ -3455,23 +3051,7 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -MaybeExpr GetConvertInput(const SomeExpr &x) { - // This returns SomeExpr(x) when x is a designator/functionref/constant. - return atomic::ConvertCollector{}(x).first; -} - -bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { - // Check if expr is same as x, or a sequence of Convert operations on x. 
- if (expr == x) { - return true; - } else if (auto maybe{GetConvertInput(expr)}) { - return *maybe == x; - } else { - return false; - } -} - -bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { +static bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3839,45 +3419,46 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - std::pair> top{ - atomic::Operator::Unk, {}}; + std::pair> top{ + operation::Operator::Unknown, {}}; if (auto &&maybeInput{GetConvertInput(update.rhs)}) { top = GetTopLevelOperation(*maybeInput); } switch (top.first) { - case atomic::Operator::Add: - case atomic::Operator::Sub: - case atomic::Operator::Mul: - case atomic::Operator::Div: - case atomic::Operator::And: - case atomic::Operator::Or: - case atomic::Operator::Eqv: - case atomic::Operator::Neqv: - case atomic::Operator::Min: - case atomic::Operator::Max: - case atomic::Operator::Identity: + case operation::Operator::Add: + case operation::Operator::Sub: + case operation::Operator::Mul: + case operation::Operator::Div: + case operation::Operator::And: + case operation::Operator::Or: + case operation::Operator::Eqv: + case operation::Operator::Neqv: + case operation::Operator::Min: + case operation::Operator::Max: + case operation::Operator::Identity: break; - case atomic::Operator::Call: + case operation::Operator::Call: context_.Say(source, "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Convert: + case operation::Operator::Convert: context_.Say(source, "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Intrinsic: + case operation::Operator::Intrinsic: context_.Say(source, "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Unk: + case operation::Operator::Unknown: 
context_.Say( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); return; } // Check if `atom` occurs exactly once in the argument list. @@ -3898,17 +3479,17 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( }()}; if (unique == top.second.end()) { - if (top.first == atomic::Operator::Identity) { + if (top.first == operation::Operator::Identity) { // This is "x = y". context_.Say(rsrc, "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, - atom.AsFortran(), atomic::ToString(top.first)); + atom.AsFortran(), operation::ToString(top.first)); } } else { CheckStorageOverlap(atom, nonAtom, source); @@ -3933,18 +3514,18 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( // Missing arguments to operations would have been diagnosed by now. 
switch (top.first) { - case atomic::Operator::Associated: + case operation::Operator::Associated: if (atom != top.second.front()) { context_.Say(assignSource, "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); } break; // x equalop e | e equalop x (allowing "e equalop x" is an extension) - case atomic::Operator::Eq: - case atomic::Operator::Eqv: + case operation::Operator::Eq: + case operation::Operator::Eqv: // x ordop expr | expr ordop x - case atomic::Operator::Lt: - case atomic::Operator::Gt: { + case operation::Operator::Lt: + case operation::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; if (IsSameOrConvertOf(arg0, atom)) { @@ -3952,23 +3533,24 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); } break; } - case atomic::Operator::Identity: - case atomic::Operator::True: - case atomic::Operator::False: + case operation::Operator::Identity: + case operation::Operator::True: + case operation::Operator::False: break; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); break; } } diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 1d1e3ac044166..fce930dcc1d02 100644 --- a/flang/lib/Semantics/tools.cpp +++ 
b/flang/lib/Semantics/tools.cpp @@ -17,6 +17,7 @@ #include "flang/Semantics/tools.h" #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringSwitch.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1756,4 +1757,313 @@ bool HadUseError( } } +namespace operation { +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() + ? std::make_pair(operation::Operator::True, Arguments{}) + : std::make_pair(operation::Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{operation::OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); + } + } + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair(operation::OperationCode(x), + OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{ + operation::Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; +} // namespace operation + +std::string operation::ToString(operation::Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + 
return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; + } +} + +operation::Operator operation::OperationCode( + const evaluate::ProcedureDesignator &proc) { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; +} + +std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return operation::ArgumentExtractor{}(expr); +} + +namespace operation { +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result asSomeExpr(const T &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::Constant &x) const { + return asSomeExpr(x); + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. 
+ if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } + } else { + return asSomeExpr(x.derived()); + } + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (moveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; +}; +} // namespace operation + +MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. 
+ return operation::ConvertCollector{}(x).first; +} + +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. + if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + } // namespace Fortran::semantics >From a83a1cf262eb9f01aafbcf099a8467aa9b861187 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 13:05:15 -0500 Subject: [PATCH 25/28] format --- flang/include/flang/Semantics/tools.h | 28 +++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 25fadceefceb0..9454f0b489192 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -830,8 +830,8 @@ Operator OperationCode( } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { switch (op.derived().opr) { case common::RelationalOperator::LT: return Operator::Lt; @@ -855,26 +855,26 @@ Operator OperationCode(const evaluate::Operation, Ts...> &op) { } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Sub; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Mul; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Div; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Pow; } @@ -885,8 +885,8 @@ Operator OperationCode( } template -Operator 
-OperationCode(const evaluate::Operation, Ts...> &op) {
+Operator OperationCode(
+ const evaluate::Operation, Ts...> &op) {
 if constexpr (C == T::category) {
 return Operator::Resize;
 } else {
@@ -905,8 +905,8 @@ Operator OperationCode(const evaluate::ProcedureDesignator &proc);
 /// Return information about the top-level operation (ignoring parentheses):
 /// the operation code and the list of arguments.
-std::pair>
-GetTopLevelOperation(const SomeExpr &expr);
+std::pair> GetTopLevelOperation(
+ const SomeExpr &expr);
 /// Check if expr is same as x, or a sequence of Convert operations on x.
 bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x);

>From 9770e4d3c5b0a858f8b5864a7aada01946763450 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 30 May 2025 14:45:18 -0500
Subject: [PATCH 26/28] Restore accidentally removed Le

---
 flang/include/flang/Semantics/tools.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h
index 9454f0b489192..9766effba3ebe 100644
--- a/flang/include/flang/Semantics/tools.h
+++ b/flang/include/flang/Semantics/tools.h
@@ -794,6 +794,7 @@ enum class Operator {
 Gt,
 Identity,
 Intrinsic,
+ Le,
 Lt,
 Max,
 Min,

>From 9b8aaa5586334b48fbb28c103eda1091168342e0 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 30 May 2025 14:45:47 -0500
Subject: [PATCH 27/28] Recognize constants as "operations"

This allows emitting slightly better diagnostic messages.
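The overload dispatch that this patch builds on can be sketched outside of flang. The snippet below is only an illustration, not code from the PR: `Constant`, `Opaque`, the two-value `Operator` enum, and `checkDispatch` are hypothetical stand-ins for the real `evaluate::` types. It shows how an `OperationCode` overload specific to constants is preferred over the catch-all template that returns `Unknown`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; the real node types live in flang's evaluate namespace.
template <typename T> struct Constant {
  T value;
};
struct Opaque {}; // a node kind with no dedicated overload

enum class Operator { Unknown, Constant };

// Specific overload: every Constant<T> is classified as Operator::Constant.
template <typename T> Operator OperationCode(const Constant<T> &) {
  return Operator::Constant;
}

// Catch-all: anything without a dedicated overload is Unknown.
template <typename T> Operator OperationCode(const T &) {
  return Operator::Unknown;
}

// Mirrors the shape of operation::ToString for the two codes used here.
std::string ToString(Operator op) {
  switch (op) {
  case Operator::Constant:
    return "constant";
  case Operator::Unknown:
  default:
    return "??";
  }
}

// Small self-check showing which overload is selected in each case.
inline void checkDispatch() {
  assert(OperationCode(Constant<int>{10}) == Operator::Constant);
  assert(OperationCode(Opaque{}) == Operator::Unknown);
}
```

Both templates are viable for a `Constant<int>` argument; partial ordering picks the more specialized one, so no explicit tag or `if constexpr` is needed in this sketch.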
--- flang/include/flang/Semantics/tools.h | 8 ++- flang/lib/Semantics/check-omp-structure.cpp | 1 + flang/lib/Semantics/tools.cpp | 71 +++++++++++---------- flang/test/Semantics/OpenMP/atomic04.f90 | 2 +- flang/test/Semantics/OpenMP/atomic05.f90 | 2 +- 5 files changed, 48 insertions(+), 36 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 9766effba3ebe..7a2be79f14a29 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -781,10 +781,12 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, namespace operation { enum class Operator { + Unknown, Add, And, Associated, Call, + Constant, Convert, Div, Eq, @@ -807,7 +809,6 @@ enum class Operator { Resize, // Convert within the same TypeCategory Sub, True, - Unknown, }; std::string ToString(Operator op); @@ -895,6 +896,11 @@ Operator OperationCode( } } +template +Operator OperationCode(const evaluate::Constant &x) { + return Operator::Constant; +} + template // Operator OperationCode(const T &) { return Operator::Unknown; diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 7f96e48a303fe..3c27a3968a7c9 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3459,6 +3459,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( context_.Say(source, "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); return; + case operation::Operator::Constant: case operation::Operator::Unknown: context_.Say( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index fce930dcc1d02..a8cd8a6ec2228 100644 --- a/flang/lib/Semantics/tools.cpp +++ b/flang/lib/Semantics/tools.cpp @@ -1758,6 +1758,12 @@ bool HadUseError( } namespace operation { +template // +SomeExpr asSomeExpr(const T &x) { + auto 
copy{x}; + return AsGenericExpr(std::move(copy)); +} + template // struct ArgumentExtractor : public evaluate::Traverse, @@ -1816,10 +1822,12 @@ struct ArgumentExtractor template // Result operator()(const evaluate::Designator &x) const { - evaluate::Designator copy{x}; - Result result{ - operation::Operator::Identity, {AsGenericExpr(std::move(copy))}}; - return result; + return {operation::Operator::Identity, {asSomeExpr(x)}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + return {operation::Operator::Identity, {asSomeExpr(x)}}; } template // @@ -1849,24 +1857,37 @@ struct ArgumentExtractor std::string operation::ToString(operation::Operator op) { switch (op) { + default: + case Operator::Unknown: + return "??"; case Operator::Add: return "+"; case Operator::And: return "AND"; case Operator::Associated: return "ASSOCIATED"; + case Operator::Call: + return "function-call"; + case Operator::Constant: + return "constant"; + case Operator::Convert: + return "type-conversion"; case Operator::Div: return "/"; case Operator::Eq: return "=="; case Operator::Eqv: return "EQV"; + case Operator::False: + return ".FALSE."; case Operator::Ge: return ">="; case Operator::Gt: return ">"; case Operator::Identity: return "identity"; + case Operator::Intrinsic: + return "intrinsic"; case Operator::Le: return "<="; case Operator::Lt: @@ -1877,32 +1898,22 @@ std::string operation::ToString(operation::Operator op) { return "MIN"; case Operator::Mul: return "*"; - case Operator::Neqv: - return "NEQV/EOR"; case Operator::Ne: return "/="; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Not: + return "NOT"; case Operator::Or: return "OR"; + case Operator::Pow: + return "**"; + case Operator::Resize: + return "resize"; case Operator::Sub: return "-"; case Operator::True: return ".TRUE."; - case Operator::False: - return ".FALSE."; - case Operator::Not: - return "NOT"; - case Operator::Convert: - return "type-conversion"; - case Operator::Resize: 
- return "resize"; - case Operator::Intrinsic: - return "intrinsic"; - case Operator::Call: - return "function-call"; - case Operator::Pow: - return "**"; - default: - return "??"; } } @@ -1939,25 +1950,19 @@ struct ConvertCollector using Base::operator(); - template // - Result asSomeExpr(const T &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; - } - template // Result operator()(const evaluate::Designator &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template // Result operator()(const evaluate::FunctionRef &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template // Result operator()(const evaluate::Constant &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template @@ -1976,10 +1981,10 @@ struct ConvertCollector return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); } else { - return asSomeExpr(x.derived()); + return {asSomeExpr(x.derived()), {}}; } } else { - return asSomeExpr(x.derived()); + return {asSomeExpr(x.derived()), {}}; } } diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index d603ba8b3937c..0f69befed1414 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -180,7 +180,7 @@ subroutine more_invalid_atomic_update_stmts() x = x !$omp atomic update - !ERROR: This is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1 !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index e0103be4cae4a..77ffc6e57f1a3 100644 --- a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -19,7 +19,7 @@ program OmpAtomic x = 2 * 4 !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: This is not a valid ATOMIC 
UPDATE operation
+  !ERROR: The atomic variable x should appear as an argument in the update operation
   x = 10
   !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct
   !$omp atomic capture release, seq_cst

>From f7bc109276a7bb647b0c8f3d65af63fbfb3249dc Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 30 May 2025 15:04:57 -0500
Subject: [PATCH 28/28] Add lit tests for dumping atomic analysis

---
 flang/lib/Lower/OpenMP/OpenMP.cpp             |  8 +-
 .../Lower/OpenMP/dump-atomic-analysis.f90     | 82 +++++++++++++++++++
 2 files changed, 89 insertions(+), 1 deletion(-)
 create mode 100644 flang/test/Lower/OpenMP/dump-atomic-analysis.f90

diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 30acf8baba082..4c50717f8fde4 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -40,11 +40,14 @@
 #include "mlir/Dialect/OpenMP/OpenMPDialect.h"
 #include "mlir/Transforms/RegionUtils.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/Support/CommandLine.h"
 #include "llvm/Frontend/OpenMP/OMPConstants.h"

 using namespace Fortran::lower::omp;
 using namespace Fortran::common::openmp;

+static llvm::cl::opt DumpAtomicAnalysis("fdebug-dump-atomic-analysis");
+
 //===----------------------------------------------------------------------===//
 // Code generation helper functions
 //===----------------------------------------------------------------------===//
@@ -3790,7 +3793,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
 //===----------------------------------------------------------------------===//

 [[maybe_unused]] static void
-dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) {
+dumpAtomicAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) {
   auto whatStr = [](int k) {
     std::string txt = "?";
     switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) {
@@ -3869,6 +3872,9 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
   lower::StatementContext stmtCtx;
   const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis;

+  if (DumpAtomicAnalysis)
+    dumpAtomicAnalysis(analysis);
+
   const semantics::SomeExpr &atom = *get(analysis.atom);
   mlir::Location loc = converter.genLocation(construct.source);
   mlir::Value atomAddr =
diff --git a/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 b/flang/test/Lower/OpenMP/dump-atomic-analysis.f90
new file mode 100644
index 0000000000000..55c49f98cd2e8
--- /dev/null
+++ b/flang/test/Lower/OpenMP/dump-atomic-analysis.f90
@@ -0,0 +1,82 @@
+!RUN: %flang_fc1 -fopenmp -fopenmp-version=60 -emit-hlfir -mmlir -fdebug-dump-atomic-analysis %s -o /dev/null |& FileCheck %s
+
+subroutine f00(x)
+  integer :: x, v
+  !$omp atomic read
+  v = x
+end
+
+!CHECK: Analysis {
+!CHECK-NEXT: atom: x
+!CHECK-NEXT: cond:
+!CHECK-NEXT: op0 {
+!CHECK-NEXT: what: Read
+!CHECK-NEXT: assign: v=x
+!CHECK-NEXT: }
+!CHECK-NEXT: op1 {
+!CHECK-NEXT: what: None
+!CHECK-NEXT: assign:
+!CHECK-NEXT: }
+!CHECK-NEXT: }
+
+
+subroutine f01(v)
+  integer :: x, v
+  !$omp atomic write
+  x = v
+end
+
+!CHECK: Analysis {
+!CHECK-NEXT: atom: x
+!CHECK-NEXT: cond:
+!CHECK-NEXT: op0 {
+!CHECK-NEXT: what: Write
+!CHECK-NEXT: assign: x=v
+!CHECK-NEXT: }
+!CHECK-NEXT: op1 {
+!CHECK-NEXT: what: None
+!CHECK-NEXT: assign:
+!CHECK-NEXT: }
+!CHECK-NEXT: }
+
+
+subroutine f02(x, v)
+  integer :: x, v
+  !$omp atomic update
+  x = x + v
+end
+
+!CHECK: Analysis {
+!CHECK-NEXT: atom: x
+!CHECK-NEXT: cond:
+!CHECK-NEXT: op0 {
+!CHECK-NEXT: what: Update
+!CHECK-NEXT: assign: x=x+v
+!CHECK-NEXT: }
+!CHECK-NEXT: op1 {
+!CHECK-NEXT: what: None
+!CHECK-NEXT: assign:
+!CHECK-NEXT: }
+!CHECK-NEXT: }
+
+
+subroutine f03(x, v)
+  integer :: x, v, t
+  !$omp atomic update capture
+  t = x
+  x = x + v
+  !$omp end atomic
+end
+
+!CHECK: Analysis {
+!CHECK-NEXT: atom: x
+!CHECK-NEXT: cond:
+!CHECK-NEXT: op0 {
+!CHECK-NEXT: what: Read
+!CHECK-NEXT: assign: t=x
+!CHECK-NEXT: }
+!CHECK-NEXT: op1 {
+!CHECK-NEXT: what: Update
+!CHECK-NEXT: assign: x=x+v
+!CHECK-NEXT: }
+!CHECK-NEXT: }

From flang-commits at lists.llvm.org Fri May 30 13:07:01 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Fri, 30 May 2025 13:07:01 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <683a0fe5.170a0220.2ddc0b.8d56@mx.google.com>

https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852

>From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 07:58:29 -0500
Subject: [PATCH 01/28] [flang][OpenMP] Mark atomic clauses as unique

The current implementation of the ATOMIC construct handles these clauses
individually, and this change does not have an observable effect. At the
same time these clauses are unique as per the OpenMP spec, and this patch
reflects that in the OMP.td file.
---
 llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index eff6d57995d2b..cdfd3e3223fa8 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> {
   ];
 }
 def OMP_Atomic : Directive<"atomic"> {
-  let allowedClauses = [
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-  ];
   let allowedOnceClauses = [
     VersionedClause,
     VersionedClause,
+    VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
+    VersionedClause,
   ];
   let association = AS_Block;
   let category = CA_Executable;
@@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> {
   let category = CA_Executable;
 }
 def OMP_Critical : Directive<"critical"> {
-  let allowedClauses = [
+  let allowedOnceClauses = [
     VersionedClause,
   ];
   let association = AS_Block;

>From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 10:08:58 -0500
Subject: [PATCH 02/28] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs

The OpenMP implementation of the ATOMIC construct will change in the
near future to accommodate OpenMP 6.0. This patch separates the shared
implementations to avoid interfering with OpenACC.
--- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t 
hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. - auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite, - loc); + genAtomicWrite(converter, atomicWrite, loc); }, [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) { - Fortran::lower::genOmpAccAtomicUpdate< - Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate, - loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const Fortran::parser::AccAtomicCapture &atomicCapture) { - Fortran::lower::genOmpAccAtomicCapture< - Fortran::parser::AccAtomicCapture, void>(converter, - atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, }, atomicConstruct.u); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 312557d5da07e..fdd85e94829f3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +//===----------------------------------------------------------------------===// +// Code generation for atomic operations +//===----------------------------------------------------------------------===// + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. 
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/28] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
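A minimal sketch of the idea described in the commit message above: an UPDATE clause whose argument may be absent (the ATOMIC form) or present (the DEPOBJ form). The types `DependenceType`, `TaskDependenceType`, `UpdateClause` and the helper `dependenceKind` are hypothetical stand-ins for illustration only; they are not the actual flang parser or lowering classes.

```cpp
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the two kinds of argument an UPDATE clause
// can carry on DEPOBJ. These are NOT the real flang parse-tree nodes.
struct DependenceType { std::string kind; };      // e.g. "sink", "source"
struct TaskDependenceType { std::string kind; };  // e.g. "in", "out", "inout"

// Hypothetical model of an UPDATE clause with an optional argument:
// empty for ATOMIC's bare UPDATE, populated for DEPOBJ's UPDATE(...).
struct UpdateClause {
  std::optional<std::variant<DependenceType, TaskDependenceType>> v;
};

// Returns the dependence kind when an argument was supplied, and
// std::nullopt for the argument-less form.
std::optional<std::string> dependenceKind(const UpdateClause &clause) {
  if (!clause.v)
    return std::nullopt;
  return std::visit(
      [](const auto &s) -> std::optional<std::string> { return s.kind; },
      *clause.v);
}
```

Checking the optional before visiting the variant is the same shape as the `if (inp.v) ... else ...` branch in the Clauses.cpp hunk of this patch, though the real code builds an `Update` clause object rather than a string.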
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/28] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed. 
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse, and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 | 
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. 
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expecting two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture.
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expecting single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point.
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (begin != end && *begin == ' ') { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ...)" is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
 std::optional defaultMemOrder;
@@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) {
 return false;
 }
- auto findMemOrderClause =
- [](const std::list &clauses) {
- return llvm::any_of(clauses, [](const auto &clause) {
- return std::get_if(&clause.u);
+ auto findMemOrderClause{[](const parser::OmpClauseList &clauses) {
+ return llvm::any_of(
+ clauses.v, [](auto &clause) -> const parser::OmpClause * {
+ switch (clause.Id()) {
+ case llvm::omp::Clause::OMPC_acq_rel:
+ case llvm::omp::Clause::OMPC_acquire:
+ case llvm::omp::Clause::OMPC_relaxed:
+ case llvm::omp::Clause::OMPC_release:
+ case llvm::omp::Clause::OMPC_seq_cst:
+ return &clause;
+ default:
+ return nullptr;
+ }
 });
- };
-
- // Get the clause list to which the new memory order clause must be added,
- // only if there are no other memory order clauses present for this atomic
- // directive.
- std::list *clauseList = common::visit(
- common::visitors{[&](parser::OmpAtomic &atomicConstruct) {
- // OmpAtomic only has a single list of clauses.
- auto &clauses{std::get(
- atomicConstruct.t)};
- return !findMemOrderClause(clauses.v) ? &clauses.v
- : nullptr;
- },
- [&](auto &atomicConstruct) {
- // All other atomic constructs have two lists of clauses.
- auto &clausesLhs{std::get<0>(atomicConstruct.t)};
- auto &clausesRhs{std::get<2>(atomicConstruct.t)};
- return !findMemOrderClause(clausesLhs.v) &&
- !findMemOrderClause(clausesRhs.v)
- ? &clausesRhs.v
- : nullptr;
- }},
- x.u);
+ }};
- // Add a memory order clause to the atomic directive.
+ auto &dirSpec{std::get(x.t)};
+ auto &clauseList{std::get>(dirSpec.t)};
 if (clauseList) {
- atomicDirectiveDefaultOrderFound_ = true;
- switch (*defaultMemOrder) {
- case common::OmpMemoryOrderType::Acq_Rel:
- clauseList->emplace_back(common::visit(
- common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause {
- return parser::OmpClause::Acquire{};
- },
- [](parser::OmpAtomicCapture &) -> parser::OmpClause {
- return parser::OmpClause::AcqRel{};
- },
- [](auto &) -> parser::OmpClause {
- // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite}
- return parser::OmpClause::Release{};
- }},
- x.u));
- break;
- case common::OmpMemoryOrderType::Relaxed:
- clauseList->emplace_back(
- parser::OmpClause{parser::OmpClause::Relaxed{}});
- break;
- case common::OmpMemoryOrderType::Seq_Cst:
- clauseList->emplace_back(
- parser::OmpClause{parser::OmpClause::SeqCst{}});
- break;
- default:
- // FIXME: Don't process other values at the moment since their validity
- // depends on the OpenMP version (which is unavailable here).
- break;
+ if (findMemOrderClause(*clauseList)) {
+ return false;
 }
+ } else {
+ clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){});
+ }
+
+ // Add a memory order clause to the atomic directive.
+ atomicDirectiveDefaultOrderFound_ = true;
+ switch (*defaultMemOrder) {
+ case common::OmpMemoryOrderType::Acq_Rel:
+ clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}});
+ break;
+ case common::OmpMemoryOrderType::Relaxed:
+ clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}});
+ break;
+ case common::OmpMemoryOrderType::Seq_Cst:
+ clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}});
+ break;
+ default:
+ // FIXME: Don't process other values at the moment since their validity
+ // depends on the OpenMP version (which is unavailable here).
+ break;
 }
 return false;
diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90
index b82bd13622764..6f58e0939a787 100644
--- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90
+++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90
@@ -1,6 +1,6 @@
 ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s
-! CHECK: not yet implemented: OpenMP atomic compare
+! CHECK: not yet implemented: OpenMP ATOMIC COMPARE
 program p
 integer :: x
 logical :: r
diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90
index 88ec6fe910b9e..6729be6e5cf8b 100644
--- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90
+++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90
@@ -1,6 +1,6 @@
 ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s
-! CHECK: not yet implemented: OpenMP atomic compare
+! CHECK: not yet implemented: OpenMP ATOMIC COMPARE
 program p
 integer :: x
 logical :: r
diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90
index bbb08220af9d9..b8422c8f3b769 100644
--- a/flang/test/Lower/OpenMP/atomic-capture.f90
+++ b/flang/test/Lower/OpenMP/atomic-capture.f90
@@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture()
 !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr
 !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>>
 !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr
+!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr
 !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>>
 !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr
-!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr
 !CHECK: omp.atomic.capture {
 !CHECK: omp.atomic.update
%[[VAL_A_BOX_ADDR]] : !fir.ptr {
 !CHECK: ^bb0(%[[ARG:.*]]: i32):
 !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32
 !CHECK: omp.yield(%[[TEMP]] : i32)
 !CHECK: }
-!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32
+!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32
 !CHECK: }
 !CHECK: return
 !CHECK: }
diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90
index 13392ad76471f..6eded49b0b15d 100644
--- a/flang/test/Lower/OpenMP/atomic-write.f90
+++ b/flang/test/Lower/OpenMP/atomic-write.f90
@@ -44,9 +44,9 @@ end program OmpAtomicWrite
 !CHECK-LABEL: func.func @_QPatomic_write_pointer() {
 !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"}
 !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>)
-!CHECK: %[[C1:.*]] = arith.constant 1 : i32
 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>>
 !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr
+!CHECK: %[[C1:.*]] = arith.constant 1 : i32
 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32
 !CHECK: %[[C2:.*]] = arith.constant 2 : i32
 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>>
diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90
deleted file mode 100644
index 5cd02698ff482..0000000000000
--- a/flang/test/Parser/OpenMP/atomic-compare.f90
+++ /dev/null
@@ -1,16 +0,0 @@
-! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s
-! OpenMP version for documentation purposes only - it isn't used until Sema.
-! This is testing for Parser errors that bail out before Sema.
-program main
- implicit none
- integer :: i, j = 10
- logical :: r
-
- !CHECK: error: expected OpenMP construct
- !$omp atomic compare write
- r = i .eq. j + 1
-
- !CHECK: error: expected end of line
- !$omp atomic compare num_threads(4)
- r = i .eq. j
-end program main
diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90
index 54492bf6a22a6..11e23e062bce7 100644
--- a/flang/test/Semantics/OpenMP/atomic-compare.f90
+++ b/flang/test/Semantics/OpenMP/atomic-compare.f90
@@ -44,46 +44,37 @@
 !$omp end atomic
 ! Check for error conditions:
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic seq_cst seq_cst compare
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic compare seq_cst seq_cst
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic seq_cst compare seq_cst
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
 !$omp atomic acquire acquire compare
 if (b .eq.
c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
 !$omp atomic compare acquire acquire
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
 !$omp atomic acquire compare acquire
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic relaxed relaxed compare
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic compare relaxed relaxed
 if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic relaxed compare relaxed
 if (b .eq. c) b = a
- !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct
+ !ERROR: At most one FAIL clause can appear on the ATOMIC directive
 !$omp atomic fail(release) compare fail(release)
 if (c .eq.
a) a = b
 !$omp end atomic
diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
index c13a11a8dd5dc..deb67e7614659 100644
--- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
+++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90
@@ -16,20 +16,21 @@ program sample
 !$omp atomic read hint(2)
 y = x
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
 !$omp atomic hint(3)
 y = y + 10
 !$omp atomic update hint(5)
 y = x + y
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
 !$omp atomic hint(7) capture
+ !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
 y = x
 x = y
 !$omp end atomic
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Must be a constant value
 !$omp atomic update hint(x)
 y = y * 1
@@ -46,7 +47,7 @@ program sample
 !$omp atomic hint(omp_lock_hint_speculative)
 x = y + x
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Must be a constant value
 !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read
 y = x
@@ -69,36 +70,36 @@ program sample
 !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative)
 x = y + x
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
 !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read
 y = x
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
 !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative)
 y = y * 9
- !ERROR: Hint clause must have non-negative
constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Must have INTEGER type, but is REAL(4)
 !$omp atomic hint(1.0) read
 y = x
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4)
 !$omp atomic hint(z + omp_sync_hint_nonspeculative) read
 y = x
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Must be a constant value
 !$omp atomic hint(k + omp_sync_hint_speculative) read
 y = x
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
 !ERROR: Must be a constant value
 !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write
 x = 10 * y
 !$omp atomic write hint(a)
- !ERROR: RHS expression on atomic assignment statement cannot access 'x'
+ !ERROR: Within atomic operation x and y+x access the same storage
 x = y + x
 !$omp atomic hint(abs(-1)) write
diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90
new file mode 100644
index 0000000000000..2619d235380f8
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-read.f90
@@ -0,0 +1,89 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+ integer :: x, v
+ ! The end-directive is optional in ATOMIC READ. Expect no diagnostics.
+ !$omp atomic read
+ v = x
+
+ !$omp atomic read
+ v = x
+ !$omp end atomic
+end
+
+subroutine f01
+ integer, pointer :: x, v
+ ! Intrinsic assignment and pointer assignment are both ok. Expect no
+ ! diagnostics.
+ !$omp atomic read
+ v = x
+
+ !$omp atomic read
+ v => x
+end
+
+subroutine f02(i)
+ integer :: i, v
+ interface
+ function p(i)
+ integer, pointer :: p
+ integer :: i
+ end
+ end interface
+
+ !
Atomic variable can be a function reference. Expect no diagostics.
+ !$omp atomic read
+ v = p(i)
+end
+
+subroutine f03
+ integer :: x(3), y(5), v(3)
+
+ !$omp atomic read
+ !ERROR: Atomic variable x should be a scalar
+ v = x
+
+ !$omp atomic read
+ !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar
+ v = y(2:4)
+end
+
+subroutine f04
+ integer :: x, y(3), v
+
+ !$omp atomic read
+ !ERROR: Within atomic operation x and x access the same storage
+ x = x
+
+ ! Accessing same array, but not the same storage. Expect no diagnostics.
+ !$omp atomic read
+ y(1) = y(2)
+end
+
+subroutine f05
+ integer :: x, v
+
+ !$omp atomic read
+ !ERROR: Atomic expression x+1_4 should be a variable
+ v = x + 1
+end
+
+subroutine f06
+ character :: x, v
+
+ !$omp atomic read
+ !ERROR: Atomic variable x cannot have CHARACTER type
+ v = x
+end
+
+subroutine f07
+ integer, allocatable :: x
+ integer :: v
+
+ allocate(x)
+
+ !$omp atomic read
+ !ERROR: Atomic variable x cannot be ALLOCATABLE
+ v = x
+end
+
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
new file mode 100644
index 0000000000000..64e31a7974b55
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -0,0 +1,77 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+ integer :: x, y, v
+
+ !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements
+ !$omp atomic update capture
+ x = v
+ x = x + 1
+ y = x
+ !$omp end atomic
+end
+
+subroutine f01
+ integer :: x, y, v
+
+ !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments
+ !$omp atomic update capture
+ x = v
+ block
+ x = x + 1
+ y = x
+ end block
+ !$omp end atomic
+end
+
+subroutine f02
+ integer :: x, y
+
+ ! The update and capture statements can be inside of a single BLOCK.
+ ! The end-directive is then optional. Expect no diagnostics.
+ !$omp atomic update capture
+ block
+ x = x + 1
+ y = x
+ end block
+end
+
+subroutine f03
+ integer :: x
+
+ !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)
+ !$omp atomic update capture
+ x = x + 1
+ x = x + 2
+ !$omp end atomic
+end
+
+subroutine f04
+ integer :: x, v
+
+ !$omp atomic update capture
+ !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement
+ v = x
+ x = v
+ !$omp end atomic
+end
+
+subroutine f05
+ integer :: x, v, z
+
+ !$omp atomic update capture
+ v = x
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
+ z = x + 1
+ !$omp end atomic
+end
+
+subroutine f06
+ integer :: x, v, z
+
+ !$omp atomic update capture
+ !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x
+ z = x + 1
+ v = x
+ !$omp end atomic
+end
diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90
new file mode 100644
index 0000000000000..0aee02d429dc0
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90
@@ -0,0 +1,83 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+ integer :: x, y
+
+ ! The x is a direct argument of the + operator. Expect no diagnostics.
+ !$omp atomic update
+ x = x + (y - 1)
+end
+
+subroutine f01
+ integer :: x
+
+ ! x + 0 is unusual, but legal. Expect no diagnostics.
+ !$omp atomic update
+ x = x + 0
+end
+
+subroutine f02
+ integer :: x
+
+ ! This is formally not allowed by the syntax restrictions of the spec,
+ ! but it's equivalent to either x+0 or x*1, both of which are legal.
+ ! Allow this case. Expect no diagnostics.
+ !$omp atomic update
+ x = x
+end
+
+subroutine f03
+ integer :: x, y
+
+ !$omp atomic update
+ !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator
+ x = (x + y) + 1
+end
+
+subroutine f04
+ integer :: x
+ real :: y
+
+ !$omp atomic update
+ !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation
+ x = floor(x + y)
+end
+
+subroutine f05
+ integer :: x
+ real :: y
+
+ !$omp atomic update
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
+ x = int(x + y)
+end
+
+subroutine f06
+ integer :: x, y
+ interface
+ function f(i, j)
+ integer :: f, i, j
+ end
+ end interface
+
+ !$omp atomic update
+ !ERROR: A call to this function is not a valid ATOMIC UPDATE operation
+ x = f(x, y)
+end
+
+subroutine f07
+ real :: x
+ integer :: y
+
+ !$omp atomic update
+ !ERROR: The ** operator is not a valid ATOMIC UPDATE operation
+ x = x ** y
+end
+
+subroutine f08
+ integer :: x, y
+
+ !$omp atomic update
+ !ERROR: The atomic variable x should appear as an argument in the update operation
+ x = y
+end
diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90
index 21a9b87d26345..3084376b4275d 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90
@@ -22,10 +22,10 @@ program sample
 x = x / y
 !$omp atomic update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: A call to this function is not a valid ATOMIC UPDATE operation
 x = x .MYOPERATOR. y
 !$omp atomic
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: A call to this function is not a valid ATOMIC UPDATE operation
 x = x .MYOPERATOR.
y
 end program
diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90
new file mode 100644
index 0000000000000..979568ebf7140
--- /dev/null
+++ b/flang/test/Semantics/OpenMP/atomic-write.f90
@@ -0,0 +1,81 @@
+!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+
+subroutine f00
+ integer :: x, v
+ ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics.
+ !$omp atomic write
+ x = v + 1
+
+ !$omp atomic write
+ x = v + 3
+ !$omp end atomic
+end
+
+subroutine f01
+ integer, pointer :: x, v
+ ! Intrinsic assignment and pointer assignment are both ok. Expect no
+ ! diagnostics.
+ !$omp atomic write
+ x = 2 * v + 3
+
+ !$omp atomic write
+ x => v
+end
+
+subroutine f02(i)
+ integer :: i, v
+ interface
+ function p(i)
+ integer, pointer :: p
+ integer :: i
+ end
+ end interface
+
+ ! Atomic variable can be a function reference. Expect no diagostics.
+ !$omp atomic write
+ p(i) = v
+end
+
+subroutine f03
+ integer :: x(3), y(5), v(3)
+
+ !$omp atomic write
+ !ERROR: Atomic variable x should be a scalar
+ x = v
+
+ !$omp atomic write
+ !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar
+ y(2:4) = v
+end
+
+subroutine f04
+ integer :: x, y(3), v
+
+ !$omp atomic write
+ !ERROR: Within atomic operation x and x+1_4 access the same storage
+ x = x + 1
+
+ ! Accessing same array, but not the same storage. Expect no diagnostics.
+ !$omp atomic write
+ y(1) = y(2)
+end
+
+subroutine f06
+ character :: x, v
+
+ !$omp atomic write
+ !ERROR: Atomic variable x cannot have CHARACTER type
+ x = v
+end
+
+subroutine f07
+ integer, allocatable :: x
+ integer :: v
+
+ allocate(x)
+
+ !$omp atomic write
+ !ERROR: Atomic variable x cannot be ALLOCATABLE
+ x = v
+end
+
diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90
index 0e100871ea9b4..0dc8251ae356e 100644
--- a/flang/test/Semantics/OpenMP/atomic.f90
+++ b/flang/test/Semantics/OpenMP/atomic.f90
@@ -1,4 +1,4 @@
-! RUN: %python %S/../test_errors.py %s %flang -fopenmp
+! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src
 use omp_lib
 ! Check OpenMP 2.13.6 atomic Construct
@@ -11,9 +11,13 @@
 a = b
 !$omp end atomic
+ !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
+ !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED)
 a = b
+ !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
+ !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write
 a = b
@@ -22,39 +26,32 @@
 a = a + 1
 !$omp end atomic
+ !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
+ !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
 !$omp atomic hint(1) acq_rel capture
 b = a
 a = a + 1
 !$omp end atomic
- !ERROR: expected end of line
+ !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct
 !$omp atomic read write
+ !ERROR: Atomic expression a+1._4 should be a variable
 a = a + 1
 !$omp atomic
 a = a + 1
- !ERROR: expected 'UPDATE'
- !ERROR: expected 'WRITE'
- !ERROR: expected
'COMPARE'
- !ERROR: expected 'CAPTURE'
- !ERROR: expected 'READ'
+ !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive
 !$omp atomic num_threads(4)
 a = a + 1
- !ERROR: expected end of line
+ !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements
+ !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive
 !$omp atomic capture num_threads(4)
 a = a + 1
+ !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50
 !$omp atomic relaxed
 a = a + 1
- !ERROR: expected 'UPDATE'
- !ERROR: expected 'WRITE'
- !ERROR: expected 'COMPARE'
- !ERROR: expected 'CAPTURE'
- !ERROR: expected 'READ'
- !$omp atomic num_threads write
- a = a + 1
-
 !$omp end parallel
 end
diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90
index 173effe86b69c..f700c381cadd0 100644
--- a/flang/test/Semantics/OpenMP/atomic01.f90
+++ b/flang/test/Semantics/OpenMP/atomic01.f90
@@ -14,322 +14,277 @@
 ! At most one memory-order-clause may appear on the construct.
 !READ
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the READ directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic seq_cst seq_cst read
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the READ directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic read seq_cst seq_cst
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the READ directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic seq_cst read seq_cst
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the READ directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
 !$omp atomic acquire acquire read
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the READ directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
 !$omp atomic read acquire acquire
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the READ directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
 !$omp atomic acquire read acquire
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the READ directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic relaxed relaxed read
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the
READ directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic read relaxed relaxed
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the READ directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic relaxed read relaxed
 i = j
 !UPDATE
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic seq_cst seq_cst update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic update seq_cst seq_cst
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic seq_cst update seq_cst
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the UPDATE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
 !$omp atomic release release update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The
atomic variable i should appear as an argument in the update operation
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the UPDATE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
 !$omp atomic update release release
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the UPDATE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
 !$omp atomic release update release
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the UPDATE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic relaxed relaxed update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the UPDATE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic update relaxed relaxed
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
 i = j
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the UPDATE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
 !$omp atomic
relaxed update relaxed
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
 i = j
 !CAPTURE
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic seq_cst seq_cst capture
 i = j
 j = k
 !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic capture seq_cst seq_cst
 i = j
 j = k
 !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
 !$omp atomic seq_cst capture seq_cst
 i = j
 j = k
 !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
 !$omp atomic release release capture
 i = j
 j = k
 !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
 !$omp atomic capture release release
 i = j
 j = k
 !$omp end atomic
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELEASE clause can appear on the CAPTURE directive
+ !ERROR: At most one RELEASE clause can appear on the ATOMIC directive
 !$omp atomic release capture release
 i = j
 j = k
 !$omp end atomic
- !ERROR: More than one memory order clause not
allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic capture relaxed relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed capture relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel acq_rel capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic capture acq_rel acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel capture acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire capture i = j j = k !$omp 
end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic capture acquire acquire i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire capture acquire i = j j = k !$omp end atomic !WRITE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic write seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst write seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic write release release i = j - !ERROR: More than one memory order clause not allowed on 
OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release write release i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic write relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed write relaxed i = j !No atomic-clause - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as 
an argument in the update operation i = j ! 2.17.7.3 ! At most one hint clause may appear on the construct. - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At 
most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint 
(omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative) i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j j = k @@ -337,34 +292,26 @@ ! 2.17.7.4 ! If atomic-clause is read then memory-order-clause must not be acq_rel or release. - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic acq_rel read i = j - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read acq_rel i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic release read i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read release i = j ! 2.17.7.5 ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire. 
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic acq_rel write
  i = j
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic write acq_rel
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic acquire write
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive
  !$omp atomic write acquire
  i = j
@@ -372,33 +319,27 @@
  ! 2.17.7.6
  ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire.
- !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic acq_rel update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic update acq_rel
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic acquire update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive
  !$omp atomic update acquire
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive
  !$omp atomic acq_rel
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
- !ERROR: Clause ACQUIRE is not allowed on the ATOMIC directive
  !$omp atomic acquire
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The atomic variable i should appear as an argument in the update operation
  i = j
  end program
diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90
index c66085d00f157..45e41f2552965 100644
--- a/flang/test/Semantics/OpenMP/atomic02.f90
+++ b/flang/test/Semantics/OpenMP/atomic02.f90
@@ -28,36 +28,29 @@ program OmpAtomic
  !$omp atomic
  a = a/(b + 1)
  !$omp atomic
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The ** operator is not a valid ATOMIC UPDATE operation
  a = a**4
  !$omp atomic
- !ERROR: Expected scalar variable on the LHS of atomic update assignment statement
- !ERROR: Invalid or missing operator in atomic update statement
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
+ !ERROR: Atomic variable c cannot have CHARACTER type
+ !ERROR: The atomic variable c should appear as an argument in the update operation
  c = d
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The < operator is not a valid ATOMIC UPDATE operation
  l = a .LT. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The <= operator is not a valid ATOMIC UPDATE operation
  l = a .LE. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The == operator is not a valid ATOMIC UPDATE operation
  l = a .EQ. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The /= operator is not a valid ATOMIC UPDATE operation
  l = a .NE. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The >= operator is not a valid ATOMIC UPDATE operation
  l = a .GE. b
  !$omp atomic
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The > operator is not a valid ATOMIC UPDATE operation
  l = a .GT. b
  !$omp atomic
  m = m .AND. n
@@ -76,32 +69,26 @@ program OmpAtomic
  !$omp atomic update
  a = a/(b + 1)
  !$omp atomic update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The ** operator is not a valid ATOMIC UPDATE operation
  a = a**4
  !$omp atomic update
- !ERROR: Expected scalar variable on the LHS of atomic update assignment statement
- !ERROR: Invalid or missing operator in atomic update statement
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
+ !ERROR: Atomic variable c cannot have CHARACTER type
+ !ERROR: This is not a valid ATOMIC UPDATE operation
  c = c//d
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The < operator is not a valid ATOMIC UPDATE operation
  l = a .LT. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The <= operator is not a valid ATOMIC UPDATE operation
  l = a .LE. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The == operator is not a valid ATOMIC UPDATE operation
  l = a .EQ. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The >= operator is not a valid ATOMIC UPDATE operation
  l = a .GE. b
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l`
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: The > operator is not a valid ATOMIC UPDATE operation
  l = a .GT. b
  !$omp atomic update
  m = m .AND. n
diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90
index 76367495b9861..f5c189fd05318 100644
--- a/flang/test/Semantics/OpenMP/atomic03.f90
+++ b/flang/test/Semantics/OpenMP/atomic03.f90
@@ -25,28 +25,26 @@ program OmpAtomic
  y = MIN(y, 8)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator
  z = IAND(y, 4)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator
  z = IOR(y, 5)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator
  z = IEOR(y, 6)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX operator
  z = MAX(y, 7, b, c)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator
  z = MIN(y, 8, a, d)
  !$omp atomic
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y'
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  y = FRACTION(x)
  !$omp atomic
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y'
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  y = REAL(x)
  !$omp atomic update
  y = IAND(y, 4)
@@ -60,26 +58,26 @@ program OmpAtomic
  y = MIN(y, 8)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator
  z = IAND(y, 4)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator
  z = IOR(y, 5)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator
  z = IEOR(y, 6)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX operator
  z = MAX(y, 7)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator
  z = MIN(y, 8)
  !$omp atomic update
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
+ !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation
  y = MOD(y, 9)
  !$omp atomic update
- !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement
+ !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation
  x = ABS(x)
  end program OmpAtomic
@@ -92,7 +90,7 @@ subroutine conflicting_types()
  type(simple) ::s
  z = 1
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z'
+ !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator
  z = IAND(s%z, 4)
  end subroutine
@@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts()
  type(some_type) :: s
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a'
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator
  a = min(a, a, b)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a'
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator
  a = max(b, a, b, a)
  !$omp atomic
- !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)`
  a = min(b, a, b)
  !$omp atomic
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a'
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator
  a = max(b, a, b, a, b)
  !$omp atomic update
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y'
+ !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator
  y = min(z, x)
  !$omp atomic
  z = max(z, y)
  !$omp atomic update
- !ERROR: Expected scalar variable on the LHS of atomic update assignment statement
- !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k'
+ !ERROR: Atomic variable k should be a scalar
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  k = max(x, y)
-
+
  !$omp atomic
  !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4)
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
  x = min(x, k)
  !$omp atomic
  !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4)
- !ERROR: Expected scalar expression on the RHS of atomic update assignment statement
- z =z + s%m
+ z = z + s%m
  end subroutine
diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90
index a9644ad95aa30..5c91ab5dc37e4 100644
--- a/flang/test/Semantics/OpenMP/atomic04.f90
+++ b/flang/test/Semantics/OpenMP/atomic04.f90
@@ -20,12 +20,10 @@ program OmpAtomic
  !$omp atomic
  x = 1 + x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y + 1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 + y
  !$omp atomic
@@ -33,12 +31,10 @@ program OmpAtomic
  !$omp atomic
  x = 1 - x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y - 1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 - y
  !$omp atomic
@@ -46,12 +42,10 @@ program OmpAtomic
  !$omp atomic
  x = 1*x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y*1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1*y
  !$omp atomic
@@ -59,12 +53,10 @@ program OmpAtomic
  !$omp atomic
  x = 1/x
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y/1
  !$omp atomic
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1/y
  !$omp atomic
@@ -72,8 +64,7 @@ program OmpAtomic
  !$omp atomic
  m = n .AND. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator
  m = n .AND. l
  !$omp atomic
@@ -81,8 +72,7 @@ program OmpAtomic
  !$omp atomic
  m = n .OR. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator
  m = n .OR. l
  !$omp atomic
@@ -90,8 +80,7 @@ program OmpAtomic
  !$omp atomic
  m = n .EQV. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator
  m = n .EQV. l
  !$omp atomic
@@ -99,8 +88,7 @@ program OmpAtomic
  !$omp atomic
  m = n .NEQV. m
  !$omp atomic
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator
  m = n .NEQV. l
  !$omp atomic update
@@ -108,12 +96,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1 + x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y + 1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 + y
  !$omp atomic update
@@ -121,12 +107,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1 - x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y - 1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1 - y
  !$omp atomic update
@@ -134,12 +118,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1*x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y*1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1*y
  !$omp atomic update
@@ -147,12 +129,10 @@ program OmpAtomic
  !$omp atomic update
  x = 1/x
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = y/1
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x`
- !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement
+ !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation
  x = 1/y
  !$omp atomic update
@@ -160,8 +140,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .AND. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator
  m = n .AND. l
  !$omp atomic update
@@ -169,8 +148,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .OR. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator
  m = n .OR. l
  !$omp atomic update
@@ -178,8 +156,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .EQV. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator
  m = n .EQV. l
  !$omp atomic update
@@ -187,8 +164,7 @@ program OmpAtomic
  !$omp atomic update
  m = n .NEQV. m
  !$omp atomic update
- !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m`
- !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement
+ !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator
  m = n .NEQV. l
  end program OmpAtomic
@@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts()
  type(some_type) p
  !$omp atomic
- !ERROR: Invalid or missing operator in atomic update statement
  x = x
  !$omp atomic update
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: This is not a valid ATOMIC UPDATE operation
  x = 1
  !$omp atomic update
- !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement
+ !ERROR: Within atomic operation a and a*b access the same storage
  a = a * b + a
  !$omp atomic
- !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a`
+ !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator
  a = b * (a + 9)
  !$omp atomic update
- !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement
+ !ERROR: Within atomic operation a and (a+b) access the same storage
  a = a * (a + b)
  !$omp atomic
- !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement
+ !ERROR: Within atomic operation a and (b+a) access the same storage
  a = (b + a) * a
  !$omp atomic
- !ERROR: Atomic update statement should
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(7) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(x) y = 2 @@ -54,7 +54,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint) y = 2 @@ -84,35 +84,35 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended) y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp critical (name) hint(1.0) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp critical (name) hint(z + omp_sync_hint_nonspeculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(k + omp_sync_hint_speculative) y = 2 !$omp end critical 
(name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended) y = 2 diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 505cbc48fef90..677b933932b44 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -20,70 +20,64 @@ program sample !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar v = y(1:3) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression x*(10_4+x) should be a variable v = x * (10 + x) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression 4_4 should be a variable v = 4 !$omp atomic read - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE v = k !$omp atomic write - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = x !$omp atomic update - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = k + x * (v * x) !$omp atomic - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = v * k !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'z%y' + !ERROR: Within atomic operation z%y and x+z%y access the same storage z%y = x + z%y !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + 
!ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage m = min(m, x, z%m) + k !$omp atomic read - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Atomic expression min(m,x,z%m)+k should be a variable m = min(m, x, z%m) + k !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar x = a - !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - a = x - !$omp atomic write !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement x = a !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar a = x !$omp atomic capture @@ -93,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = b + (x*1) !$omp end atomic @@ -103,60 +97,58 @@ program sample !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with 
CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct x = x + 10 v = b !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt] v = 1 x = 4 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct x = z%y + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 x = z%y !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct x = y(2) + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 x = y(2) !$omp end atomic !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable r cannot have CHARACTER type l = r !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable l cannot have CHARACTER type l = r end program diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90 index ae9fd086015dd..e8817c3f5ef61 100644 --- 
a/flang/test/Semantics/OpenMP/requires-atomic01.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> SeqCst !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> SeqCst !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> SeqCst !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> SeqCst !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> SeqCst !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90 index 4976a9667eb78..a3724a83456fd 100644 --- a/flang/test/Semantics/OpenMP/requires-atomic02.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> AcqRel !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! 
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> AcqRel !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> AcqRel !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! 
CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> AcqRel !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> AcqRel !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Capture + ! 
CHECK: OmpClause -> Relaxed
   !$omp atomic capture relaxed
   i = j
   j = j + 1

>From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:41:16 -0500
Subject: [PATCH 05/28] Restart build

>From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:45:28 -0500
Subject: [PATCH 06/28] Restart build

>From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 07:54:54 -0500
Subject: [PATCH 07/28] Restart build

>From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 08:18:01 -0500
Subject: [PATCH 08/28] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed

---
 flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic.f90 | 2 ++
 5 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90
index 2619d235380f8..73899a9ff37f2 100644
--- a/flang/test/Semantics/OpenMP/atomic-read.f90
+++ b/flang/test/Semantics/OpenMP/atomic-read.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
index 64e31a7974b55..c427ba07d43d8 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, y, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90
index 0aee02d429dc0..4595e02d01456 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-only.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, y
diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90
index 979568ebf7140..7965ad2dc7dbf 100644
--- a/flang/test/Semantics/OpenMP/atomic-write.f90
+++ b/flang/test/Semantics/OpenMP/atomic-write.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60

 subroutine f00
   integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90
index 0dc8251ae356e..2caa161507d49 100644
--- a/flang/test/Semantics/OpenMP/atomic.f90
+++ b/flang/test/Semantics/OpenMP/atomic.f90
@@ -1,3 +1,5 @@
+! REQUIRES: openmp_runtime
+
 ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src
 use omp_lib
 ! Check OpenMP 2.13.6 atomic Construct

>From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 09:06:20 -0500
Subject: [PATCH 09/28] Fix examples

---
 flang/examples/FeatureList/FeatureList.cpp | 10 -------
 .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++--------------
 2 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp
index d1407cf0ef239..a36b8719e365d 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -445,13 +445,6 @@ struct NodeVisitor {
   READ_FEATURE(ObjectDecl)
   READ_FEATURE(OldParameterStmt)
   READ_FEATURE(OmpAlignedClause)
-  READ_FEATURE(OmpAtomic)
-  READ_FEATURE(OmpAtomicCapture)
-  READ_FEATURE(OmpAtomicCapture::Stmt1)
-  READ_FEATURE(OmpAtomicCapture::Stmt2)
-  READ_FEATURE(OmpAtomicRead)
-  READ_FEATURE(OmpAtomicUpdate)
-  READ_FEATURE(OmpAtomicWrite)
   READ_FEATURE(OmpBeginBlockDirective)
   READ_FEATURE(OmpBeginLoopDirective)
   READ_FEATURE(OmpBeginSectionsDirective)
@@ -480,7 +473,6 @@ struct NodeVisitor {
   READ_FEATURE(OmpIterationOffset)
   READ_FEATURE(OmpIterationVector)
   READ_FEATURE(OmpEndAllocators)
-  READ_FEATURE(OmpEndAtomic)
   READ_FEATURE(OmpEndBlockDirective)
   READ_FEATURE(OmpEndCriticalDirective)
   READ_FEATURE(OmpEndLoopDirective)
@@ -566,8 +558,6 @@ struct NodeVisitor {
   READ_FEATURE(OpenMPDeclareTargetConstruct)
   READ_FEATURE(OmpMemoryOrderType)
   READ_FEATURE(OmpMemoryOrderClause)
-  READ_FEATURE(OmpAtomicClause)
-  READ_FEATURE(OmpAtomicClauseList)
   READ_FEATURE(OmpAtomicDefaultMemOrderClause)
   READ_FEATURE(OpenMPFlushConstruct)
   READ_FEATURE(OpenMPLoopConstruct)
diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index dbbf86a6c6151..18a4107f9a9d1 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ 
b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) {
           // the directive field.
           [&](const auto &c) -> SourcePosition {
             const CharBlock &source{std::get<0>(c.t).source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPAtomicConstruct &c) -> SourcePosition {
-            return std::visit(
-                [&](const auto &o) -> SourcePosition {
-                  const CharBlock &source{std::get(o.t).source};
-                  return parsing->allCooked()
-                      .GetSourcePositionRange(source)
-                      ->first;
-                },
-                c.u);
+            const CharBlock &source{c.source};
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPSectionConstruct &c) -> SourcePosition {
             const CharBlock &source{c.source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
           [&](const OpenMPUtilityConstruct &c) -> SourcePosition {
             const CharBlock &source{c.source};
-            return (parsing->allCooked().GetSourcePositionRange(source))->first;
+            return parsing->allCooked().GetSourcePositionRange(source)->first;
           },
       },
       c.u);
@@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
             return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPAtomicConstruct &c) -> std::string {
-            return std::visit(
-                [&](const auto &c) {
-                  // Get source from the verbatim fields
-                  const CharBlock &source{std::get(c.t).source};
-                  return "atomic-" +
-                      normalize_construct_name(source.ToString());
-                },
-                c.u);
+            const CharBlock &source{c.source};
+            return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPUtilityConstruct &c) -> std::string {
             const CharBlock &source{c.source};

>From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:19 -0500
Subject: [PATCH 10/28] Fix example

---
 .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++--
 flang/test/Examples/omp-atomic.f90 | 16 +++++++++++-----
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index 18a4107f9a9d1..b0964845d3b99 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
@@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) {
             return normalize_construct_name(source.ToString());
           },
           [&](const OpenMPAtomicConstruct &c) -> std::string {
-            const CharBlock &source{c.source};
-            return normalize_construct_name(source.ToString());
+            auto &dirSpec = std::get(c.t);
+            auto &dirName = std::get(dirSpec.t);
+            return normalize_construct_name(dirName.source.ToString());
           },
           [&](const OpenMPUtilityConstruct &c) -> std::string {
             const CharBlock &source{c.source};
diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90
index dcca34b633a3e..934f84f132484 100644
--- a/flang/test/Examples/omp-atomic.f90
+++ b/flang/test/Examples/omp-atomic.f90
@@ -26,25 +26,31 @@
 ! CHECK:---
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 9
-! CHECK-NEXT: construct: atomic-read
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
-! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: - clause: read
 ! CHECK-NEXT: details: ''
+! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 12
-! CHECK-NEXT: construct: atomic-write
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
 ! CHECK-NEXT: - clause: seq_cst
+! CHECK-NEXT: details: 'name_modifier=atomic;'
+! CHECK-NEXT: - clause: write
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 16
-! CHECK-NEXT: construct: atomic-capture
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses:
+! CHECK-NEXT: - clause: capture
+! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;'
 ! CHECK-NEXT: - clause: seq_cst
 ! CHECK-NEXT: details: ''
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 21
-! CHECK-NEXT: construct: atomic-atomic
+! CHECK-NEXT: construct: atomic
 ! CHECK-NEXT: clauses: []
 ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90'
 ! CHECK-NEXT: line: 8

>From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 12:01:47 -0500
Subject: [PATCH 11/28] Remove reference to *nullptr

---
 flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index ab868df76d298..4d9b375553c1e 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
   assert(firstOp && "Should have created an atomic operation");
   atomicAt = getInsertionPointAfter(firstOp);

-  mlir::Operation *secondOp = genAtomicOperation(
-      converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom,
-      *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt);
+  mlir::Operation *secondOp = nullptr;
+  if (analysis.op1.what != analysis.None) {
+    secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what,
+                                  atomAddr, atom, *get(analysis.op1.expr),
+                                  hint, memOrder, atomicAt, prepareAt);
+  }

   if (secondOp) {
     builder.setInsertionPointAfter(secondOp);

>From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 2 May 2025 18:49:05 -0500
Subject: [PATCH 12/28] Updates and improvements

---
 flang/include/flang/Parser/parse-tree.h | 2 +-
 flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++--
 flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++----
flang/lib/Semantics/check-omp-structure.h | 1 + .../Todo/atomic-capture-implicit-cast.f90 | 48 --- .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 - .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +- .../OpenMP/atomic-update-capture.f90 | 8 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +- 9 files changed, 381 insertions(+), 180 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 77f57b1cb85c7..8213fe33edbd0 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct { struct Op { int what; - TypedExpr expr; + AssignmentStmt::TypedAssignment assign; }; TypedExpr atom, cond; Op op0, op1; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6177b59199481..7b6c22095d723 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter, static mlir::Operation * // genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Value storeAddr = + 
fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Type storeType = fir::unwrapRefType(storeAddr.getType()); + + mlir::Value toAddr = [&]() { + if (atomType == storeType) + return storeAddr; + return builder.createTemporary(loc, atomType, ".tmp.atomval"); + }(); builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + + if (atomType != storeType) { + lower::ExprToValueMap overrides; + // The READ operation could be a part of UPDATE CAPTURE, so make sure + // we don't emit extra code into the body of the atomic op. + builder.restoreInsertionPoint(postAt); + mlir::Value load = builder.create(loc, toAddr); + overrides.try_emplace(&atom, load); + + converter.overrideExprValues(&overrides); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); + converter.resetExprOverrides(); + + builder.create(loc, value, storeAddr); + } builder.restoreInsertionPoint(saved); return op; } @@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, static mlir::Operation * // genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value value = 
fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); mlir::Value converted = builder.createConvert(loc, atomType, value); @@ -2719,19 +2746,20 @@ static mlir::Operation * genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrResizeOf(arg, atom)) { @@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, converter.overrideExprValues(&overrides); mlir::Value updated = - fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Value converted = builder.createConvert(loc, atomType, updated); builder.create(loc, converted); converter.resetExprOverrides(); @@ -2764,20 +2792,21 @@ static mlir::Operation * genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, 
lower::StatementContext &stmtCtx, int action, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: - return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Write: - return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Update: - return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign, + hint, memOrder, preAt, atomicAt, postAt); default: return nullptr; } @@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { } return ""s; }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; const SomeExpr &atom = *analysis.atom.get()->v; @@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; llvm::errs() << " op0 {\n"; llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; - 
llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << " op1 {\n"; llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; - llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << "}\n"; } @@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAtomicConstruct &construct) { - auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { - if (auto *maybe = expr.get(); maybe && maybe->v) { + auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) { + if (auto *maybe = typedWrapper.get(); maybe && maybe->v) { return &*maybe->v; } else { return nullptr; @@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, int action0 = analysis.op0.what & analysis.Action; int action1 = analysis.op1.what & analysis.Action; mlir::Operation *captureOp = nullptr; - fir::FirOpBuilder::InsertPoint atomicAt; - fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint atomicAt, postAt; if (construct.IsCapture()) { // Capturing operation. @@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, captureOp = builder.create(loc, hint, memOrder); // Set the non-atomic insertion point to before the atomic.capture. 
- prepareAt = getInsertionPointBefore(captureOp); + preAt = getInsertionPointBefore(captureOp); mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); builder.setInsertionPointToEnd(block); @@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, // atomic.capture. mlir::Operation *term = builder.create(loc); atomicAt = getInsertionPointBefore(term); + postAt = getInsertionPointAfter(captureOp); hint = nullptr; memOrder = nullptr; } else { @@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(action0 != analysis.None && action1 == analysis.None && "Expexcing single action"); assert(!(analysis.op0.what & analysis.Condition)); - atomicAt = prepareAt; + postAt = atomicAt = preAt; } mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, - *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); mlir::Operation *secondOp = nullptr; if (analysis.op1.what != analysis.None) { secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, - atomAddr, atom, *get(analysis.op1.expr), - hint, memOrder, atomicAt, prepareAt); + atomAddr, atom, *get(analysis.op1.assign), + hint, memOrder, preAt, atomicAt, postAt); } if (secondOp) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 201b38bd05ff3..f7753a5e5cc59 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } -static bool IsVarOrFunctionRef(const SomeExpr &expr) { - return evaluate::UnwrapProcedureRef(expr) != nullptr || - evaluate::IsVariable(expr); +static bool 
IsVarOrFunctionRef(const MaybeExpr &expr) { + if (expr) { + return evaluate::UnwrapProcedureRef(*expr) != nullptr || + evaluate::IsVariable(*expr); + } else { + return false; + } } static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { @@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource( namespace atomic { +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + enum class Operator { Unk, // Operators that are officially allowed in the update operation @@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (Append(v, std::move(results)), ...); + (MoveAppend(v, std::move(results)), ...); return v; } +}; -private: - static void Append(Result &acc, Result &&data) { - for (auto &&s : data) { - acc.push_back(std::move(s)); +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). 
+ return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + auto copy{x.derived()}; + return {evaluate::AsGenericExpr(std::move(copy)), {}}; } } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; }; } // namespace atomic @@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } +static MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { // Both expr and x have the form of SomeType(SomeKind(...)[1]). // Check if expr is @@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { } } +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { if (value) { expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), @@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { } } +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( - int what, const MaybeExpr &maybeExpr = std::nullopt) { + int what, + const std::optional &maybeAssign = std::nullopt) { parser::OpenMPAtomicConstruct::Analysis::Op operation; operation.what = what; - SetExpr(operation.expr, maybeExpr); + SetAssignment(operation.assign, maybeAssign); return operation; } @@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( // }; // struct Op { // int what; - // TypedExpr expr; + // TypedAssignment assign; // }; // TypedExpr atom, cond; // Op op0, op1; @@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, } } +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + /// Check 
if `expr` satisfies the following conditions for x and v: /// /// [6.0:189:10-12] @@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture( // // The two allowed cases are: // x = ... atomic-var = ... - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // or - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // x = ... atomic-var = ... // // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture @@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture( // // If the two statements don't fit these criteria, return a pair of default- // constructed values. + using ReturnTy = std::pair; SourcedActionStmt act1{GetActionStmt(ec1)}; SourcedActionStmt act2{GetActionStmt(ec2)}; @@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture( auto isUpdateCapture{ [](const evaluate::Assignment &u, const evaluate::Assignment &c) { - return u.lhs == c.rhs; + return IsSameOrConvertOf(c.rhs, u.lhs); }}; // Do some checks that narrow down the possible choices for the update // and the capture statements. This will help to emit better diagnostics. - bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); - bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; + + auto errorCaptureShouldRead{[&](const parser::CharBlock &source, + const std::string &expr) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US, + expr); + }}; - if (couldBeCapture1) { - if (couldBeCapture2) { - if (isUpdateCapture(as2, as1)) { - if (isUpdateCapture(as1, as2)) { - // If both statements could be captures and both could be updates, - // emit a warning about the ambiguity. - context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); - } - return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); - } else if (isUpdateCapture(as1, as2)) { + auto errorNeitherWorks{[&]() { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US); + }}; + + auto makeSelectionFromDet{[&](int det) -> ReturnTy { + // If det != 0, then the checks unambiguously suggest a specific + // categorization. + // If det == 0, then this function should be called only if the + // checks haven't ruled out any possibility, i.e. when both assigments + // could still be either updates or captures. 
+ if (det > 0) { + // as1 is update, as2 is capture + if (isUpdateCapture(as1, as2)) { return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - context_.Say(source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, - as1.rhs.AsFortran(), as2.rhs.AsFortran()); + errorCaptureShouldRead(act2.source, as1.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } else { // !couldBeCapture2 + } else if (det < 0) { + // as2 is update, as1 is capture if (isUpdateCapture(as2, as1)) { return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } else { - context_.Say(act2.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as1.rhs.AsFortran()); + errorCaptureShouldRead(act1.source, as2.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } - } else { // !couldBeCapture1 - if (couldBeCapture2) { - if (isUpdateCapture(as1, as2)) { - return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); - } else { + } else { + bool updateFirst{isUpdateCapture(as1, as2)}; + bool captureFirst{isUpdateCapture(as2, as1)}; + if (updateFirst && captureFirst) { + // If both assignment could be the update and both could be the + // capture, emit a warning about the ambiguity. context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as2.rhs.AsFortran()); + "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US); + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } - } else { - context_.Say(source, - "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + if (updateFirst != captureFirst) { + const parser::ExecutionPartConstruct *upd{updateFirst ? 
ec1 : ec2}; + const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2}; + return std::make_pair(upd, cap); + } + assert(!updateFirst && !captureFirst); + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); } + }}; + + if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) { + return makeSelectionFromDet(det); } + assert(det == 0 && "Prior checks should have covered det != 0"); - return std::make_pair(nullptr, nullptr); + // If neither of the statements is an RMW update, it could still be a + // "write" update. Pretty much any assignment can be a write update, so + // recompute det with cbu1 = cbu2 = true. + if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) { + return makeSelectionFromDet(writeDet); + } + + // It's only errors from here on. + + if (!cbu1 && !cbu2 && !cbc1 && !cbc2) { + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); + } + + // The remaining cases are that + // - no candidate for update, or for capture, + // - one of the assigments cannot be anything. + + if (!cbu1 && !cbu2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US); + return std::make_pair(nullptr, nullptr); + } else if (!cbc1 && !cbc2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) { + auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source; + context_.Say(src, + "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + // All cases should have been covered. 
+ llvm_unreachable("Unchecked condition"); } void OmpStructureChecker::CheckAtomicCaptureAssignment( const evaluate::Assignment &capture, const SomeExpr &atom, parser::CharBlock source) { - const SomeExpr &cap{capture.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &cap{capture.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, rsrc); - // This part should have been checked prior to callig this function. - assert(capture.rhs == atom && "This canont be a capture assignment"); + // This part should have been checked prior to calling this function. + assert(*GetConvertInput(capture.rhs) == atom && + "This canont be a capture assignment"); CheckStorageOverlap(atom, {cap}, source); } } void OmpStructureChecker::CheckAtomicReadAssignment( const evaluate::Assignment &read, parser::CharBlock source) { - const SomeExpr &atom{read.rhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; - if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + if (auto maybe{GetConvertInput(read.rhs)}) { + const SomeExpr &atom{*maybe}; + + if (!IsVarOrFunctionRef(atom)) { + ErrorShouldBeVariable(atom, rsrc); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } } else { - CheckAtomicVariable(atom, rsrc); - CheckStorageOverlap(atom, {read.lhs}, source); + ErrorShouldBeVariable(read.rhs, rsrc); } } @@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment( // one of the following forms: // x = expr // x => expr - const SomeExpr &atom{write.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{write.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + 
ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, lsrc); CheckStorageOverlap(atom, {write.rhs}, source); @@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( // x = intrinsic-procedure-name (x) // x = intrinsic-procedure-name (x, expr-list) // x = intrinsic-procedure-name (expr-list, x) - const SomeExpr &atom{update.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{update.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); // Skip other checks. return; } @@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( const SomeExpr &cond, parser::CharBlock condSource, const evaluate::Assignment &assign, parser::CharBlock assignSource) { - const SomeExpr &atom{assign.lhs}; auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + const SomeExpr &atom{assign.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, arsrc); // Skip other checks. 
return; } @@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( @@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign), MakeAtomicAnalysisOp(Analysis::None)); } @@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( if (GetActionStmt(&body.front()).stmt == uact.stmt) { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(action, update.rhs), - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + MakeAtomicAnalysisOp(action, update), + MakeAtomicAnalysisOp(Analysis::Read, capture)); } else { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), - MakeAtomicAnalysisOp(action, update.rhs)); + MakeAtomicAnalysisOp(Analysis::Read, capture), + MakeAtomicAnalysisOp(action, update)); } } @@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( if (captureFirst) { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign)); } else { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, 
capAssign.lhs)); + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign)); } } @@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead( if (body.size() == 1) { SourcedActionStmt action{GetActionStmt(&body.front())}; if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { - const SomeExpr &atom{maybeRead->rhs}; CheckAtomicReadAssignment(*maybeRead, action.source); - using Analysis = parser::OpenMPAtomicConstruct::Analysis; - x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), - MakeAtomicAnalysisOp(Analysis::None)); + if (auto maybe{GetConvertInput(maybeRead->rhs)}) { + const SomeExpr &atom{*maybe}; + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead), + MakeAtomicAnalysisOp(Analysis::None)); + } } else { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); @@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index bf6fbf16d0646..835fbe45e1c0e 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -253,6 +253,7 @@ class OmpStructureChecker void CheckStorageOverlap(const evaluate::Expr &, llvm::ArrayRef>, parser::CharBlock); + void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source); void CheckAtomicVariable( const evaluate::Expr &, parser::CharBlock); std::pair&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture 
requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..aa9d2e0ac3ff7 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -1,5 +1,3 @@ -! REQUIRES : openmp_runtime - ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s ! 
CHECK: func.func @_QPatomic_implicit_cast_read() { diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index deb67e7614659..8adb0f1a67409 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -25,7 +25,7 @@ program sample !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement y = x x = y !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index c427ba07d43d8..f808ed916fb7e 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -39,7 +39,7 @@ subroutine f02 subroutine f03 integer :: x - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture !$omp atomic update capture x = x + 1 x = x + 2 @@ -50,7 +50,7 @@ subroutine f04 integer :: x, v !$omp atomic update capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement v = x x = v !$omp end atomic @@ -60,8 +60,8 @@ subroutine f05 integer :: x, v, z !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand 
side of the capture assignment should read z v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 !$omp end atomic end @@ -70,8 +70,8 @@ subroutine f06 integer :: x, v, z !$omp atomic update capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z v = x !$omp end atomic end diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 677b933932b44..5e180aa0bbe5b 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -97,50 +97,50 @@ program sample !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture x = x + 10 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x v = b !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture !$omp 
atomic capture v = 1 x = 4 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) !$omp end atomic >From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 24 Mar 2025 15:38:58 -0500 Subject: [PATCH 13/28] DumpEvExpr: show type --- flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 2f445429a10b5..1553dac3b6687 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,6 +16,7 @@ #include #include +#include #include #include @@ -38,6 +39,17 @@ class DumpEvaluateExpr { } private: + template + struct TypeOf { + static constexpr std::string_view name{TypeOf::get()}; + static constexpr std::string_view get() { + std::string_view v(__PRETTY_FUNCTION__); 
+ v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" + return v; + } + }; + template void Show(const common::Indirection &x) { Show(x.value()); } @@ -76,7 +88,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant"); + Indent("derived constant "s + std::string(TypeOf::name)); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -84,7 +96,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant"); + Print("constant "s + std::string(TypeOf::name)); } } void Show(const Symbol &symbol); @@ -102,7 +114,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator"); + Indent("designator "s + std::string(TypeOf::name)); Show(x.u); Outdent(); } @@ -117,7 +129,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref"); + Indent("function ref "s + std::string(TypeOf::name)); Show(x.proc()); Show(x.arguments()); Outdent(); @@ -127,14 +139,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value"); + Indent("array constructor value "s + std::string(TypeOf::name)); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do"); + Indent("implied do "s + std::string(TypeOf::name)); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -148,20 +160,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op"); + Indent("unary op "s + std::string(TypeOf::name)); Show(op.left()); Outdent(); } 
template void Show(const evaluate::Operation &op) { - Indent("binary op"); + Indent("binary op "s + std::string(TypeOf::name)); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr T"); + Indent("expr <" + std::string(TypeOf::name) + ">"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index aa0b4e0f03398..66cedab94bfb4 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("expr some type"); + Indent("relational some type"); Show(x.u); Outdent(); } >From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 15:14:45 -0500 Subject: [PATCH 14/28] Handle conversion from real to complex via complex constructor --- flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++--- 1 file changed, 47 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dada9c6c2bd6f..ae81dcb5ea150 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,36 +3183,46 @@ struct ConvertCollector using Base::operator(); template // - Result operator()(const evaluate::Designator &x) const { + Result asSomeExpr(const T &x) const { auto copy{x}; return {AsGenericExpr(std::move(copy)), {}}; } + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + template // Result operator()(const evaluate::FunctionRef &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template // Result operator()(const evaluate::Constant &x) const { - auto copy{x}; - return 
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/28] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: } >From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:14:47 -0500 Subject: [PATCH 16/28] Allow conversion in update operations --- flang/include/flang/Semantics/tools.h | 17 ++++----- flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++-- flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++----------- .../Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++-- flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++---------- .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +- 7 files changed, 44 insertions(+), 57 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7f1ec59b087a2..9be2feb8ae064 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -789,14 +789,15 @@ inline bool checkForSymbolMatch( /// return the "expr" but with top-level parentheses stripped. std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); -/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). -/// Check if "expr" is -/// SomeType(SomeKind(Type( -/// Convert -/// SomeKind(...)[2]))) -/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves -/// TypeCategory. -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// is returns std::nullopt. 
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic >From 341723713929507c59d528540d32bc2e4213e920 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:21:56 -0500 Subject: [PATCH 17/28] format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e8b1b..0f553541c5ef0 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } >From 2686207342bad511f6d51b20ed923c0d2cc9047b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:22:26 -0500 Subject: [PATCH 18/28] Revert "DumpEvExpr: show type" This reverts commit 40510a3068498d15257cc7d198bce9c8cd71a902. Debug changes accidentally pushed upstream. 
--- flang/include/flang/Semantics/dump-expr.h | 30 +++++++---------------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 10 insertions(+), 22 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b6687..2f445429a10b5 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,7 +16,6 @@ #include #include -#include #include #include @@ -39,17 +38,6 @@ class DumpEvaluateExpr { } private: - template - struct TypeOf { - static constexpr std::string_view name{TypeOf::get()}; - static constexpr std::string_view get() { - std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" - return v; - } - }; - template void Show(const common::Indirection &x) { Show(x.value()); } @@ -88,7 +76,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant "s + std::string(TypeOf::name)); + Indent("derived constant"); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -96,7 +84,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant "s + std::string(TypeOf::name)); + Print("constant"); } } void Show(const Symbol &symbol); @@ -114,7 +102,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator "s + std::string(TypeOf::name)); + Indent("designator"); Show(x.u); Outdent(); } @@ -129,7 +117,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref "s + std::string(TypeOf::name)); + Indent("function ref"); Show(x.proc()); Show(x.arguments()); Outdent(); @@ 
-139,14 +127,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value "s + std::string(TypeOf::name)); + Indent("array constructor value"); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do "s + std::string(TypeOf::name)); + Indent("implied do"); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -160,20 +148,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op "s + std::string(TypeOf::name)); + Indent("unary op"); Show(op.left()); Outdent(); } template void Show(const evaluate::Operation &op) { - Indent("binary op "s + std::string(TypeOf::name)); + Indent("binary op"); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr <" + std::string(TypeOf::name) + ">"); + Indent("expr T"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 66cedab94bfb4..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("relational some type"); + Indent("expr some type"); Show(x.u); Outdent(); } >From c00fc531bcf742c409fc974da94c5b362fa9132c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:37:19 -0500 Subject: [PATCH 19/28] Delete unnecessary static_assert --- flang/lib/Semantics/check-omp-structure.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 6005dda7c26fe..2e59553d5e130 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -21,8 +21,6 @@ namespace Fortran::semantics { -static_assert(std::is_same_v>); - template static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { return !(e == f); >From 45b012c16b77c757a0d09b2a229bad49fed8d26f Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:25 -0500 Subject: [PATCH 20/28] Add missing initializer for 'iff' --- flang/lib/Semantics/check-omp-structure.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 2e59553d5e130..aa1bd136b371f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2815,7 +2815,7 @@ static std::optional AnalyzeConditionalStmt( } } else { AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, - GetActionStmt(std::get(s.t))}; + GetActionStmt(std::get(s.t)), SourcedActionStmt{}}; if (result.ift.stmt) { return result; } >From daeac25991bf14fb08c3accabe068c074afa1eb7 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:47 -0500 Subject: [PATCH 21/28] Add asserts for printing "Identity" as top-level operator --- flang/lib/Semantics/check-omp-structure.cpp | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa1bd136b371f..062b45deac865 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3823,6 +3823,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, atomic::ToString(top.first)); @@ -3852,6 +3853,8 @@ void 
OmpStructureChecker::CheckAtomicUpdateAssignment( "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, atom.AsFortran(), atomic::ToString(top.first)); @@ -3898,16 +3901,20 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, atomic::ToString(top.first)); } break; } + case atomic::Operator::Identity: case atomic::Operator::True: case atomic::Operator::False: break; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, atomic::ToString(top.first)); >From ae121e5c37453af1a4aba7c77939c2c1c45b75fa Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 13:15:58 -0500 Subject: [PATCH 22/28] Explain the use of determinant --- flang/lib/Semantics/check-omp-structure.cpp | 31 ++++++++++++++++++--- 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 062b45deac865..bc6a09b9768ef 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3606,10 +3606,33 @@ OmpStructureChecker::CheckUpdateCapture( // subexpression of the right-hand side. // 2. An assignment could be a capture (cbc) if the right-hand side is // a variable (or a function ref), with potential type conversions. 
- bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; - bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; - bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; - bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; // Can as1 be an update? + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; // Can as2 be an update? + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; // Can 1 be capture? + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; // Can 2 be capture? + + // We want to diagnose cases where both assignments cannot be an update, + // or both cannot be a capture, as well as cases where either assignment + // cannot be any of these two. + // + // If we organize these boolean values into a matrix + // |cbu1 cbu2| + // |cbc1 cbc2| + // then we want to diagnose cases where the matrix has a zero (i.e. "false") + // row or column, including the case where everything is zero. All these + // cases correspond to the determinant of the matrix being 0, which suggests + // that checking the det may be a convenient diagnostic check. There is only + // one additional case where the det is 0, which is when the matrix is all 1 + // ("true"). The "all true" case represents the situation where both + // assignments could be an update as well as a capture. On the other hand, + // whenever det != 0, the roles of the update and the capture can be + // unambiguously assigned to as1 and as2 [1]. + // + // [1] This can be easily verified by hand: there are 10 2x2 matrices with + // det = 0, leaving 6 cases where det != 0: + // 0 1 0 1 1 0 1 0 1 1 1 1 + // 1 0 1 1 0 1 1 1 0 1 1 0 + // In each case the classification is unambiguous.
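Note (not part of the patch): the counting argument in the comment above can also be checked mechanically. The sketch below is a standalone illustration, not flang code. The names Det, Counts and Enumerate are hypothetical, and "unambiguous" is assumed to mean that exactly one of the two role assignments (as1 updates and as2 captures, or the reverse) is possible.

```cpp
// Hypothetical standalone check, not part of flang: enumerate all 16
// combinations of the four flags and compare the determinant test with a
// direct "exactly one role assignment works" test.

// Determinant of the matrix
//   |cbu1 cbu2|
//   |cbc1 cbc2|
int Det(bool cbu1, bool cbu2, bool cbc1, bool cbc2) {
  return int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1);
}

struct Counts {
  int zeroDet;    // matrices with det == 0 (would be diagnosed)
  int nonZeroDet; // matrices with det != 0 (roles can be assigned)
  int mismatches; // cases where the det test and the direct test disagree
};

Counts Enumerate() {
  Counts counts{0, 0, 0};
  for (int bits = 0; bits < 16; ++bits) {
    bool cbu1 = (bits & 1) != 0;
    bool cbu2 = (bits & 2) != 0;
    bool cbc1 = (bits & 4) != 0;
    bool cbc2 = (bits & 8) != 0;

    // Assumed meaning of "unambiguous": exactly one of the two possible
    // role assignments is feasible.
    bool as1Updates = cbu1 && cbc2; // as1 is the update, as2 the capture
    bool as2Updates = cbu2 && cbc1; // as2 is the update, as1 the capture
    bool unambiguous = as1Updates != as2Updates;

    int det = Det(cbu1, cbu2, cbc1, cbc2);
    if ((det != 0) != unambiguous) {
      ++counts.mismatches;
    }
    if (det == 0) {
      ++counts.zeroDet;
    } else {
      ++counts.nonZeroDet;
    }
  }
  return counts;
}
```

Calling Enumerate() from a small main() should report 10 matrices with det == 0, 6 with det != 0, and no mismatches, which agrees with footnote [1]. The reason is that the determinant is the difference of two 0/1 products, so it is non-zero exactly when the two products differ.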
// |cbu1 cbu2| // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 >From cae0e8fcd3f6b8c2bc3ad8f85599ef4765c6afc5 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 11:48:18 -0500 Subject: [PATCH 23/28] Deal with assignments that failed Fortran semantic checks Don't emit diagnostics for those. --- flang/lib/Semantics/check-omp-structure.cpp | 66 ++++++++++++++------- 1 file changed, 46 insertions(+), 20 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bc6a09b9768ef..89a3a407441a8 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2726,6 +2726,9 @@ static SourcedActionStmt GetActionStmt(const parser::Block &block) { // Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption // is that the ActionStmt will be either an assignment or a pointer-assignment, // otherwise return std::nullopt. +// Note: This function can return std::nullopt on [Pointer]AssignmentStmt where +// the "typedAssignment" is unset. This can happen if there are semantic errors +// in the purported assignment. static std::optional GetEvaluateAssignment( const parser::ActionStmt *x) { if (x == nullptr) { @@ -2754,6 +2757,29 @@ static std::optional GetEvaluateAssignment( x->u); } +// Check if the ActionStmt is actually a [Pointer]AssignmentStmt. This is +// to separate cases where the source has something that looks like an +// assignment, but is semantically wrong (diagnosed by general semantic +// checks), and where the source has some other statement (which we want +// to report as "should be an assignment").
+static bool IsAssignment(const parser::ActionStmt *x) { + if (x == nullptr) { + return false; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + + return common::visit( + [](auto &&s) -> bool { + using BareS = llvm::remove_cvref_t; + return std::is_same_v || + std::is_same_v; + }, + x->u); +} + static std::optional AnalyzeConditionalStmt( const parser::ExecutionPartConstruct *x) { if (x == nullptr) { @@ -3588,8 +3614,10 @@ OmpStructureChecker::CheckUpdateCapture( auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; if (!maybeAssign1 || !maybeAssign2) { - context_.Say(source, - "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + if (!IsAssignment(act1.stmt) || !IsAssignment(act2.stmt)) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + } return std::make_pair(nullptr, nullptr); } @@ -3956,7 +3984,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( // The if-true statement must be present, and must be an assignment. 
auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; if (!maybeAssign) { - if (update.ift.stmt) { + if (update.ift.stmt && !IsAssignment(update.ift.stmt)) { context_.Say(update.ift.source, "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); } else { @@ -3992,7 +4020,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); } @@ -4094,17 +4122,11 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( } SourcedActionStmt uact{GetActionStmt(uec)}; SourcedActionStmt cact{GetActionStmt(cec)}; - auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; - auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; - - if (!maybeUpdate || !maybeCapture) { - context_.Say(source, - "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); - return; - } + // The "dereferences" of std::optional are guaranteed to be valid after + // CheckUpdateCapture. 
+ evaluate::Assignment update{*GetEvaluateAssignment(uact.stmt)}; + evaluate::Assignment capture{*GetEvaluateAssignment(cact.stmt)}; - const evaluate::Assignment &update{*maybeUpdate}; - const evaluate::Assignment &capture{*maybeCapture}; const SomeExpr &atom{update.lhs}; using Analysis = parser::OpenMPAtomicConstruct::Analysis; @@ -4242,13 +4264,17 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( return; } } else { - context_.Say(capture.source, - "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + if (!IsAssignment(capture.stmt)) { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + } return; } } else { - context_.Say(update.ift.source, - "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + if (!IsAssignment(update.ift.stmt)) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + } return; } @@ -4316,7 +4342,7 @@ void OmpStructureChecker::CheckAtomicRead( MakeAtomicAnalysisOp(Analysis::Read, maybeRead), MakeAtomicAnalysisOp(Analysis::None)); } - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); } @@ -4350,7 +4376,7 @@ void OmpStructureChecker::CheckAtomicWrite( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); } >From 6bc8c10c793ebac02c78daec33e7fb5e6becb8e0 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 12:47:00 -0500 Subject: [PATCH 24/28] Move common functions to tools.cpp --- flang/include/flang/Semantics/tools.h | 134 +++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 
flang/lib/Semantics/check-omp-structure.cpp | 506 ++------------------ flang/lib/Semantics/tools.cpp | 310 ++++++++++++ 4 files changed, 484 insertions(+), 468 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 821f1ae34fd5b..25fadceefceb0 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -778,11 +778,135 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, return false; } -/// If the top-level operation (ignoring parentheses) is either an -/// evaluate::FunctionRef, or a specialization of evaluate::Operation, -/// then return the list of arguments (wrapped in SomeExpr). Otherwise, -/// return the "expr" but with top-level parentheses stripped. -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); +namespace operation { + +enum class Operator { + Add, + And, + Associated, + Call, + Convert, + Div, + Eq, + Eqv, + False, + Ge, + Gt, + Identity, + Intrinsic, + Lt, + Max, + Min, + Mul, + Ne, + Neqv, + Not, + Or, + Pow, + Resize, // Convert within the same TypeCategory + Sub, + True, + Unknown, +}; + +std::string ToString(Operator op); + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unknown; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return 
Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unknown; +} + +template +Operator OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Add; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Sub; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Mul; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Div; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } +} + +template // +Operator OperationCode(const T &) { + return Operator::Unknown; +} + +Operator OperationCode(const evaluate::ProcedureDesignator &proc); + +} // namespace operation + +/// Return information about the top-level operation (ignoring parentheses): +/// the operation code and the list of arguments. +std::pair> +GetTopLevelOperation(const SomeExpr &expr); /// Check if expr is same as x, or a sequence of Convert operations on x. bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ad5eae4ae39a2..c74f7627c5e25 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2828,7 +2828,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, // This must exist by now. 
SomeExpr input = *semantics::GetConvertInput(assign.rhs); - std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; + std::vector args{semantics::GetTopLevelOperation(input).second}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrConvertOf(arg, atom)) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 89a3a407441a8..f29a56d5fd92a 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2891,290 +2891,6 @@ static std::pair SplitAssignmentSource( namespace atomic { -template static void MoveAppend(V &accum, V &&other) { - for (auto &&s : other) { - accum.push_back(std::move(s)); - } -} - -enum class Operator { - Unk, - // Operators that are officially allowed in the update operation - Add, - And, - Associated, - Div, - Eq, - Eqv, - Ge, // extension - Gt, - Identity, // extension: x = x is allowed (*), but we should never print - // "identity" as the name of the operator - Le, // extension - Lt, - Max, - Min, - Mul, - Ne, // extension - Neqv, - Or, - Sub, - // Operators that we recognize for technical reasons - True, - False, - Not, - Convert, - Resize, - Intrinsic, - Call, - Pow, - - // (*): "x = x + 0" is a valid update statement, but it will be folded - // to "x = x" by the time we look at it. Since the source statements - // "x = x" and "x = x + 0" will end up looking the same, accept the - // former as an extension. 
-}; - -std::string ToString(Operator op) { - switch (op) { - case Operator::Add: - return "+"; - case Operator::And: - return "AND"; - case Operator::Associated: - return "ASSOCIATED"; - case Operator::Div: - return "/"; - case Operator::Eq: - return "=="; - case Operator::Eqv: - return "EQV"; - case Operator::Ge: - return ">="; - case Operator::Gt: - return ">"; - case Operator::Identity: - return "identity"; - case Operator::Le: - return "<="; - case Operator::Lt: - return "<"; - case Operator::Max: - return "MAX"; - case Operator::Min: - return "MIN"; - case Operator::Mul: - return "*"; - case Operator::Neqv: - return "NEQV/EOR"; - case Operator::Ne: - return "/="; - case Operator::Or: - return "OR"; - case Operator::Sub: - return "-"; - case Operator::True: - return ".TRUE."; - case Operator::False: - return ".FALSE."; - case Operator::Not: - return "NOT"; - case Operator::Convert: - return "type-conversion"; - case Operator::Resize: - return "resize"; - case Operator::Intrinsic: - return "intrinsic"; - case Operator::Call: - return "function-call"; - case Operator::Pow: - return "**"; - default: - return "??"; - } -} - -template // -struct ArgumentExtractor - : public evaluate::Traverse, - std::pair>, false> { - using Arguments = std::vector; - using Result = std::pair; - using Base = evaluate::Traverse, - Result, false>; - static constexpr auto IgnoreResizes = IgnoreResizingConverts; - static constexpr auto Logical = common::TypeCategory::Logical; - ArgumentExtractor() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result operator()( - const evaluate::Constant> &x) const { - if (const auto &val{x.GetScalarValue()}) { - return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) - : std::make_pair(Operator::False, Arguments{}); - } - return Default(); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - Result result{OperationCode(x.proc()), {}}; - for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { - if (auto *e{x.UnwrapArgExpr(i)}) { - result.second.push_back(*e); - } - } - return result; - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. - return (*this)(x.template operand<0>()); - } - if constexpr (IgnoreResizes && - std::is_same_v>) { - // Ignore conversions within the same category. - // Atomic operations on int(kind=1) may be implicitly widened - // to int(kind=4) for example. - return (*this)(x.template operand<0>()); - } else { - return std::make_pair( - OperationCode(x), OperationArgs(x, std::index_sequence_for{})); - } - } - - template // - Result operator()(const evaluate::Designator &x) const { - evaluate::Designator copy{x}; - Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; - return result; - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - // There shouldn't be any combining needed, since we're stopping the - // traversal at the top-level operation, but implement one that picks - // the first non-empty result. 
- if constexpr (sizeof...(Rs) == 0) { - return std::move(result); - } else { - if (!result.second.empty()) { - return std::move(result); - } else { - return Combine(std::move(results)...); - } - } - } - -private: - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) - const { - switch (op.derived().logicalOperator) { - case common::LogicalOperator::And: - return Operator::And; - case common::LogicalOperator::Or: - return Operator::Or; - case common::LogicalOperator::Eqv: - return Operator::Eqv; - case common::LogicalOperator::Neqv: - return Operator::Neqv; - case common::LogicalOperator::Not: - return Operator::Not; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - switch (op.derived().opr) { - case common::RelationalOperator::LT: - return Operator::Lt; - case common::RelationalOperator::LE: - return Operator::Le; - case common::RelationalOperator::EQ: - return Operator::Eq; - case common::RelationalOperator::NE: - return Operator::Ne; - case common::RelationalOperator::GE: - return Operator::Ge; - case common::RelationalOperator::GT: - return Operator::Gt; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Add; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Sub; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Mul; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Div; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - if constexpr (C == 
T::category) { - return Operator::Resize; - } else { - return Operator::Convert; - } - } - Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { - Operator code = llvm::StringSwitch(proc.GetName()) - .Case("associated", Operator::Associated) - .Case("min", Operator::Min) - .Case("max", Operator::Max) - .Case("iand", Operator::And) - .Case("ior", Operator::Or) - .Case("ieor", Operator::Neqv) - .Default(Operator::Call); - if (code == Operator::Call && proc.GetSpecificIntrinsic()) { - return Operator::Intrinsic; - } - return code; - } - template // - Operator OperationCode(const T &) const { - return Operator::Unk; - } - - template - Arguments OperationArgs(const evaluate::Operation &x, - std::index_sequence) const { - return Arguments{SomeExpr(x.template operand())...}; - } -}; - struct DesignatorCollector : public evaluate::Traverse, false> { using Result = std::vector; @@ -3196,125 +2912,14 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (MoveAppend(v, std::move(results)), ...); - return v; - } -}; - -struct ConvertCollector - : public evaluate::Traverse>, false> { - using Result = std::pair>; - using Base = evaluate::Traverse; - ConvertCollector() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result asSomeExpr(const T &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; - } - - template // - Result operator()(const evaluate::Designator &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::Constant &x) const { - return asSomeExpr(x); - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore parentheses. 
- return (*this)(x.template operand<0>()); - } else if constexpr (is_convert_v) { - // Convert should always have a typed result, so it should be safe to - // dereference x.GetType(). - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else if constexpr (is_complex_constructor_v) { - // This is a conversion iff the imaginary operand is 0. - if (IsZero(x.template operand<1>())) { - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else { - return asSomeExpr(x.derived()); - } - } else { - return asSomeExpr(x.derived()); - } - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - Result v(std::move(result)); - auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { - assert((!x.has_value() || !y.has_value()) && "Multiple designators"); - if (!x.has_value()) { - x = std::move(y); + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); } }}; - (setValue(v.first, std::move(results).first), ...); - (MoveAppend(v.second, std::move(results).second), ...); + (moveAppend(v, std::move(results)), ...); return v; } - -private: - template // - static bool IsZero(const T &x) { - return false; - } - template // - static bool IsZero(const evaluate::Expr &x) { - return common::visit([](auto &&s) { return IsZero(s); }, x.u); - } - template // - static bool IsZero(const evaluate::Constant &x) { - if (auto &&maybeScalar{x.GetScalarValue()}) { - return maybeScalar->IsZero(); - } else { - return false; - } - } - - template // - struct is_convert { - static constexpr bool value{false}; - }; - template // - struct is_convert> { - static constexpr bool value{true}; - }; - template // - struct is_convert> { - // Conversion from complex to real. 
- static constexpr bool value{true}; - }; - template // - static constexpr bool is_convert_v = is_convert::value; - - template // - struct is_complex_constructor { - static constexpr bool value{false}; - }; - template // - struct is_complex_constructor> { - static constexpr bool value{true}; - }; - template // - static constexpr bool is_complex_constructor_v = - is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { @@ -3347,22 +2952,13 @@ static bool IsAllocatable(const SomeExpr &expr) { return !syms.empty() && IsAllocatable(syms.back()); } -static std::pair> GetTopLevelOperation( - const SomeExpr &expr) { - return atomic::ArgumentExtractor{}(expr); -} - -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { - return GetTopLevelOperation(expr).second; -} - static bool IsPointerAssignment(const evaluate::Assignment &x) { return std::holds_alternative(x.u) || std::holds_alternative(x.u); } static bool IsCheckForAssociated(const SomeExpr &cond) { - return GetTopLevelOperation(cond).first == atomic::Operator::Associated; + return GetTopLevelOperation(cond).first == operation::Operator::Associated; } static bool HasCommonDesignatorSymbols( @@ -3455,23 +3051,7 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -MaybeExpr GetConvertInput(const SomeExpr &x) { - // This returns SomeExpr(x) when x is a designator/functionref/constant. - return atomic::ConvertCollector{}(x).first; -} - -bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { - // Check if expr is same as x, or a sequence of Convert operations on x. 
- if (expr == x) { - return true; - } else if (auto maybe{GetConvertInput(expr)}) { - return *maybe == x; - } else { - return false; - } -} - -bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { +static bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3839,45 +3419,46 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - std::pair> top{ - atomic::Operator::Unk, {}}; + std::pair> top{ + operation::Operator::Unknown, {}}; if (auto &&maybeInput{GetConvertInput(update.rhs)}) { top = GetTopLevelOperation(*maybeInput); } switch (top.first) { - case atomic::Operator::Add: - case atomic::Operator::Sub: - case atomic::Operator::Mul: - case atomic::Operator::Div: - case atomic::Operator::And: - case atomic::Operator::Or: - case atomic::Operator::Eqv: - case atomic::Operator::Neqv: - case atomic::Operator::Min: - case atomic::Operator::Max: - case atomic::Operator::Identity: + case operation::Operator::Add: + case operation::Operator::Sub: + case operation::Operator::Mul: + case operation::Operator::Div: + case operation::Operator::And: + case operation::Operator::Or: + case operation::Operator::Eqv: + case operation::Operator::Neqv: + case operation::Operator::Min: + case operation::Operator::Max: + case operation::Operator::Identity: break; - case atomic::Operator::Call: + case operation::Operator::Call: context_.Say(source, "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Convert: + case operation::Operator::Convert: context_.Say(source, "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Intrinsic: + case operation::Operator::Intrinsic: context_.Say(source, "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Unk: + case operation::Operator::Unknown: 
context_.Say( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); return; } // Check if `atom` occurs exactly once in the argument list. @@ -3898,17 +3479,17 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( }()}; if (unique == top.second.end()) { - if (top.first == atomic::Operator::Identity) { + if (top.first == operation::Operator::Identity) { // This is "x = y". context_.Say(rsrc, "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, - atom.AsFortran(), atomic::ToString(top.first)); + atom.AsFortran(), operation::ToString(top.first)); } } else { CheckStorageOverlap(atom, nonAtom, source); @@ -3933,18 +3514,18 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( // Missing arguments to operations would have been diagnosed by now. 
switch (top.first) { - case atomic::Operator::Associated: + case operation::Operator::Associated: if (atom != top.second.front()) { context_.Say(assignSource, "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); } break; // x equalop e | e equalop x (allowing "e equalop x" is an extension) - case atomic::Operator::Eq: - case atomic::Operator::Eqv: + case operation::Operator::Eq: + case operation::Operator::Eqv: // x ordop expr | expr ordop x - case atomic::Operator::Lt: - case atomic::Operator::Gt: { + case operation::Operator::Lt: + case operation::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; if (IsSameOrConvertOf(arg0, atom)) { @@ -3952,23 +3533,24 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); } break; } - case atomic::Operator::Identity: - case atomic::Operator::True: - case atomic::Operator::False: + case operation::Operator::Identity: + case operation::Operator::True: + case operation::Operator::False: break; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); break; } } diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 1d1e3ac044166..fce930dcc1d02 100644 --- a/flang/lib/Semantics/tools.cpp +++ 
b/flang/lib/Semantics/tools.cpp @@ -17,6 +17,7 @@ #include "flang/Semantics/tools.h" #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringSwitch.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1756,4 +1757,313 @@ bool HadUseError( } } +namespace operation { +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() + ? std::make_pair(operation::Operator::True, Arguments{}) + : std::make_pair(operation::Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{operation::OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); + } + } + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair(operation::OperationCode(x), + OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{ + operation::Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; +} // namespace operation + +std::string operation::ToString(operation::Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + 
return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; + } +} + +operation::Operator operation::OperationCode( + const evaluate::ProcedureDesignator &proc) { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; +} + +std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return operation::ArgumentExtractor{}(expr); +} + +namespace operation { +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result asSomeExpr(const T &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::Constant &x) const { + return asSomeExpr(x); + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. 
+ if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } + } else { + return asSomeExpr(x.derived()); + } + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (moveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; +}; +} // namespace operation + +MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. 
+ return operation::ConvertCollector{}(x).first; +} + +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. + if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + } // namespace Fortran::semantics >From a83a1cf262eb9f01aafbcf099a8467aa9b861187 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 13:05:15 -0500 Subject: [PATCH 25/28] format --- flang/include/flang/Semantics/tools.h | 28 +++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 25fadceefceb0..9454f0b489192 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -830,8 +830,8 @@ Operator OperationCode( } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { switch (op.derived().opr) { case common::RelationalOperator::LT: return Operator::Lt; @@ -855,26 +855,26 @@ Operator OperationCode(const evaluate::Operation, Ts...> &op) { } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Sub; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Mul; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Div; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Pow; } @@ -885,8 +885,8 @@ Operator OperationCode( } template -Operator 
-OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { if constexpr (C == T::category) { return Operator::Resize; } else { @@ -905,8 +905,8 @@ Operator OperationCode(const evaluate::ProcedureDesignator &proc); /// Return information about the top-level operation (ignoring parentheses): /// the operation code and the list of arguments. -std::pair> -GetTopLevelOperation(const SomeExpr &expr); +std::pair> GetTopLevelOperation( + const SomeExpr &expr); /// Check if expr is same as x, or a sequence of Convert operations on x. bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); >From 9770e4d3c5b0a858f8b5864a7aada01946763450 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 14:45:18 -0500 Subject: [PATCH 26/28] Restore accidentally removed Le --- flang/include/flang/Semantics/tools.h | 1 + 1 file changed, 1 insertion(+) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 9454f0b489192..9766effba3ebe 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -794,6 +794,7 @@ enum class Operator { Gt, Identity, Intrinsic, + Le, Lt, Max, Min, >From 9b8aaa5586334b48fbb28c103eda1091168342e0 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 14:45:47 -0500 Subject: [PATCH 27/28] Recognize constants as "operations" This allows emitting slightly better diagnostic messages. 
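A rough illustration of why treating a constant as an "operation" changes the diagnostic for something like `x = 1`. This is a minimal sketch, not the flang implementation: `Op`, `Variable`, `Constant`, `Sum`, `topLevelOperation`, `diagnoseUpdate` and the `constantsAreOperations` switch are all hypothetical names, and the checks are heavily simplified. The message strings are the ones that appear in the test updates of this patch.

```cpp
#include <algorithm>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical toy model; none of these names come from flang.
enum class Op { Unknown, Identity, Add };

struct Variable { std::string name; };
struct Constant { long value; };
struct Sum { std::string lhs, rhs; };  // "lhs + rhs", operands given by name
using Expr = std::variant<Variable, Constant, Sum>;

// Operation code plus the list of top-level arguments (as text).
using TopLevel = std::pair<Op, std::vector<std::string>>;

TopLevel topLevelOperation(const Expr &expr, bool constantsAreOperations) {
  return std::visit(
      [&](const auto &x) -> TopLevel {
        using T = std::decay_t<decltype(x)>;
        if constexpr (std::is_same_v<T, Variable>) {
          return TopLevel{Op::Identity, {x.name}};
        } else if constexpr (std::is_same_v<T, Constant>) {
          if (constantsAreOperations) {
            // Assumed "after" behavior: the constant is its own argument.
            return TopLevel{Op::Identity, {std::to_string(x.value)}};
          }
          // Assumed "before" behavior: nothing is recognized.
          return TopLevel{Op::Unknown, {}};
        } else {
          return TopLevel{Op::Add, {x.lhs, x.rhs}};
        }
      },
      expr);
}

// Simplified: only checks how often the atomic variable occurs.
std::string diagnoseUpdate(const std::string &atom, const Expr &rhs,
                           bool constantsAreOperations) {
  auto [op, args] = topLevelOperation(rhs, constantsAreOperations);
  if (op == Op::Unknown) {
    return "This is not a valid ATOMIC UPDATE operation";
  }
  if (std::count(args.begin(), args.end(), atom) == 1) {
    return "";  // no diagnostic in this toy model
  }
  if (op == Op::Identity) {
    return "The atomic variable " + atom +
        " should appear as an argument in the update operation";
  }
  return "The atomic variable " + atom +
      " should occur exactly once among the arguments of the top-level + "
      "operator";
}
```

With the switch off, `diagnoseUpdate("x", Constant{1}, false)` yields the generic message; with it on, the same input yields the message that names the atomic variable, which mirrors the change visible in atomic04.f90 and atomic05.f90.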
--- flang/include/flang/Semantics/tools.h | 8 ++- flang/lib/Semantics/check-omp-structure.cpp | 1 + flang/lib/Semantics/tools.cpp | 71 +++++++++++---------- flang/test/Semantics/OpenMP/atomic04.f90 | 2 +- flang/test/Semantics/OpenMP/atomic05.f90 | 2 +- 5 files changed, 48 insertions(+), 36 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 9766effba3ebe..7a2be79f14a29 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -781,10 +781,12 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, namespace operation { enum class Operator { + Unknown, Add, And, Associated, Call, + Constant, Convert, Div, Eq, @@ -807,7 +809,6 @@ enum class Operator { Resize, // Convert within the same TypeCategory Sub, True, - Unknown, }; std::string ToString(Operator op); @@ -895,6 +896,11 @@ Operator OperationCode( } } +template +Operator OperationCode(const evaluate::Constant &x) { + return Operator::Constant; +} + template // Operator OperationCode(const T &) { return Operator::Unknown; diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 7f96e48a303fe..3c27a3968a7c9 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3459,6 +3459,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( context_.Say(source, "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); return; + case operation::Operator::Constant: case operation::Operator::Unknown: context_.Say( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index fce930dcc1d02..a8cd8a6ec2228 100644 --- a/flang/lib/Semantics/tools.cpp +++ b/flang/lib/Semantics/tools.cpp @@ -1758,6 +1758,12 @@ bool HadUseError( } namespace operation { +template // +SomeExpr asSomeExpr(const T &x) { + auto 
copy{x}; + return AsGenericExpr(std::move(copy)); +} + template // struct ArgumentExtractor : public evaluate::Traverse, @@ -1816,10 +1822,12 @@ struct ArgumentExtractor template // Result operator()(const evaluate::Designator &x) const { - evaluate::Designator copy{x}; - Result result{ - operation::Operator::Identity, {AsGenericExpr(std::move(copy))}}; - return result; + return {operation::Operator::Identity, {asSomeExpr(x)}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + return {operation::Operator::Identity, {asSomeExpr(x)}}; } template // @@ -1849,24 +1857,37 @@ struct ArgumentExtractor std::string operation::ToString(operation::Operator op) { switch (op) { + default: + case Operator::Unknown: + return "??"; case Operator::Add: return "+"; case Operator::And: return "AND"; case Operator::Associated: return "ASSOCIATED"; + case Operator::Call: + return "function-call"; + case Operator::Constant: + return "constant"; + case Operator::Convert: + return "type-conversion"; case Operator::Div: return "/"; case Operator::Eq: return "=="; case Operator::Eqv: return "EQV"; + case Operator::False: + return ".FALSE."; case Operator::Ge: return ">="; case Operator::Gt: return ">"; case Operator::Identity: return "identity"; + case Operator::Intrinsic: + return "intrinsic"; case Operator::Le: return "<="; case Operator::Lt: @@ -1877,32 +1898,22 @@ std::string operation::ToString(operation::Operator op) { return "MIN"; case Operator::Mul: return "*"; - case Operator::Neqv: - return "NEQV/EOR"; case Operator::Ne: return "/="; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Not: + return "NOT"; case Operator::Or: return "OR"; + case Operator::Pow: + return "**"; + case Operator::Resize: + return "resize"; case Operator::Sub: return "-"; case Operator::True: return ".TRUE."; - case Operator::False: - return ".FALSE."; - case Operator::Not: - return "NOT"; - case Operator::Convert: - return "type-conversion"; - case Operator::Resize: 
- return "resize"; - case Operator::Intrinsic: - return "intrinsic"; - case Operator::Call: - return "function-call"; - case Operator::Pow: - return "**"; - default: - return "??"; } } @@ -1939,25 +1950,19 @@ struct ConvertCollector using Base::operator(); - template // - Result asSomeExpr(const T &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; - } - template // Result operator()(const evaluate::Designator &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template // Result operator()(const evaluate::FunctionRef &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template // Result operator()(const evaluate::Constant &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template @@ -1976,10 +1981,10 @@ struct ConvertCollector return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); } else { - return asSomeExpr(x.derived()); + return {asSomeExpr(x.derived()), {}}; } } else { - return asSomeExpr(x.derived()); + return {asSomeExpr(x.derived()), {}}; } } diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index d603ba8b3937c..0f69befed1414 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -180,7 +180,7 @@ subroutine more_invalid_atomic_update_stmts() x = x !$omp atomic update - !ERROR: This is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1 !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index e0103be4cae4a..77ffc6e57f1a3 100644 --- a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -19,7 +19,7 @@ program OmpAtomic x = 2 * 4 !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: This is not a valid ATOMIC 
UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 10 !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst >From f7bc109276a7bb647b0c8f3d65af63fbfb3249dc Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 15:04:57 -0500 Subject: [PATCH 28/28] Add lit tests for dumping atomic analysis --- flang/lib/Lower/OpenMP/OpenMP.cpp | 8 +- .../Lower/OpenMP/dump-atomic-analysis.f90 | 82 +++++++++++++++++++ 2 files changed, 89 insertions(+), 1 deletion(-) create mode 100644 flang/test/Lower/OpenMP/dump-atomic-analysis.f90 diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 30acf8baba082..4c50717f8fde4 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -40,11 +40,14 @@ #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/Transforms/RegionUtils.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/Support/CommandLine.h" #include "llvm/Frontend/OpenMP/OMPConstants.h" using namespace Fortran::lower::omp; using namespace Fortran::common::openmp; +static llvm::cl::opt DumpAtomicAnalysis("fdebug-dump-atomic-analysis"); + //===----------------------------------------------------------------------===// // Code generation helper functions //===----------------------------------------------------------------------===// @@ -3790,7 +3793,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, //===----------------------------------------------------------------------===// [[maybe_unused]] static void -dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { +dumpAtomicAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { auto whatStr = [](int k) { std::string txt = "?"; switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { @@ -3869,6 +3872,9 @@ static void genOMP(lower::AbstractConverter 
&converter, lower::SymMap &symTable,
 lower::StatementContext stmtCtx;
 const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis;
+ if (DumpAtomicAnalysis)
+ dumpAtomicAnalysis(analysis);
+
 const semantics::SomeExpr &atom = *get(analysis.atom);
 mlir::Location loc = converter.genLocation(construct.source);
 mlir::Value atomAddr =
diff --git a/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 b/flang/test/Lower/OpenMP/dump-atomic-analysis.f90
new file mode 100644
index 0000000000000..55c49f98cd2e8
--- /dev/null
+++ b/flang/test/Lower/OpenMP/dump-atomic-analysis.f90
@@ -0,0 +1,82 @@
+!RUN: %flang_fc1 -fopenmp -fopenmp-version=60 -emit-hlfir -mmlir -fdebug-dump-atomic-analysis %s -o /dev/null |& FileCheck %s
+
+subroutine f00(x)
+ integer :: x, v
+ !$omp atomic read
+ v = x
+end
+
+!CHECK: Analysis {
+!CHECK-NEXT: atom: x
+!CHECK-NEXT: cond:
+!CHECK-NEXT: op0 {
+!CHECK-NEXT: what: Read
+!CHECK-NEXT: assign: v=x
+!CHECK-NEXT: }
+!CHECK-NEXT: op1 {
+!CHECK-NEXT: what: None
+!CHECK-NEXT: assign:
+!CHECK-NEXT: }
+!CHECK-NEXT: }
+
+
+subroutine f01(v)
+ integer :: x, v
+ !$omp atomic write
+ x = v
+end
+
+!CHECK: Analysis {
+!CHECK-NEXT: atom: x
+!CHECK-NEXT: cond:
+!CHECK-NEXT: op0 {
+!CHECK-NEXT: what: Write
+!CHECK-NEXT: assign: x=v
+!CHECK-NEXT: }
+!CHECK-NEXT: op1 {
+!CHECK-NEXT: what: None
+!CHECK-NEXT: assign:
+!CHECK-NEXT: }
+!CHECK-NEXT: }
+
+
+subroutine f02(x, v)
+ integer :: x, v
+ !$omp atomic update
+ x = x + v
+end
+
+!CHECK: Analysis {
+!CHECK-NEXT: atom: x
+!CHECK-NEXT: cond:
+!CHECK-NEXT: op0 {
+!CHECK-NEXT: what: Update
+!CHECK-NEXT: assign: x=x+v
+!CHECK-NEXT: }
+!CHECK-NEXT: op1 {
+!CHECK-NEXT: what: None
+!CHECK-NEXT: assign:
+!CHECK-NEXT: }
+!CHECK-NEXT: }
+
+
+subroutine f03(x, v)
+ integer :: x, v, t
+ !$omp atomic update capture
+ t = x
+ x = x + v
+ !$omp end atomic
+end
+
+!CHECK: Analysis {
+!CHECK-NEXT: atom: x
+!CHECK-NEXT: cond:
+!CHECK-NEXT: op0 {
+!CHECK-NEXT: what: Read
+!CHECK-NEXT: assign: t=x
+!CHECK-NEXT: }
+!CHECK-NEXT: op1 {
+!CHECK-NEXT: what: Update
+!CHECK-NEXT: assign: x=x+v
+!CHECK-NEXT: }
+!CHECK-NEXT: }

From flang-commits at lists.llvm.org Fri May 30 13:26:30 2025
From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits)
Date: Fri, 30 May 2025 13:26:30 -0700 (PDT)
Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852)
In-Reply-To:
Message-ID: <683a1476.170a0220.381612.9a13@mx.google.com>

https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852

>From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 07:58:29 -0500
Subject: [PATCH 01/29] [flang][OpenMP] Mark atomic clauses as unique

The current implementation of the ATOMIC construct handles these clauses
individually, and this change does not have an observable effect. At the
same time these clauses are unique as per the OpenMP spec, and this patch
reflects that in the OMP.td file.
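To make "unique" concrete: a clause listed under `allowedOnceClauses` may appear at most once on the directive. The following is only a sketch of that rule, assuming clauses are represented as plain strings; `repeatedOnceClauses` is a hypothetical helper, the clause names used with it are placeholders, and the real check is driven by the data generated from OMP.td rather than by code like this.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: returns each allowed-once clause that occurs more
// than once in `clauses`, in order of its first repeated occurrence.
std::vector<std::string> repeatedOnceClauses(
    const std::vector<std::string> &clauses,
    const std::set<std::string> &allowedOnce) {
  std::set<std::string> seen;      // allowed-once clauses seen so far
  std::set<std::string> reported;  // clauses already added to the result
  std::vector<std::string> result;
  for (const std::string &clause : clauses) {
    if (allowedOnce.count(clause) == 0) {
      continue;  // not subject to the uniqueness rule
    }
    bool firstTime = seen.insert(clause).second;
    if (!firstTime && reported.insert(clause).second) {
      result.push_back(clause);
    }
  }
  return result;
}
```

For example, with placeholder clauses `{"read", "read", "hint"}` and an allowed-once set of `{"read", "hint"}`, the helper reports only `"read"`.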
---
 llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index eff6d57995d2b..cdfd3e3223fa8 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> {
 ];
 }
 def OMP_Atomic : Directive<"atomic"> {
- let allowedClauses = [
- VersionedClause,
- VersionedClause,
- VersionedClause,
- VersionedClause,
- VersionedClause,
- ];
 let allowedOnceClauses = [
 VersionedClause,
 VersionedClause,
+ VersionedClause,
+ VersionedClause,
 VersionedClause,
 VersionedClause,
+ VersionedClause,
 VersionedClause,
 VersionedClause,
 VersionedClause,
+ VersionedClause,
 VersionedClause,
+ VersionedClause,
 ];
 let association = AS_Block;
 let category = CA_Executable;
@@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> {
 let category = CA_Executable;
 }
 def OMP_Critical : Directive<"critical"> {
- let allowedClauses = [
+ let allowedOnceClauses = [
 VersionedClause,
 ];
 let association = AS_Block;

>From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 10:08:58 -0500
Subject: [PATCH 02/29] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs

The OpenMP implementation of the ATOMIC construct will change in the
near future to accommodate OpenMP 6.0. This patch separates the shared
implementations to avoid interfering with OpenACC.
--- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t 
hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. - auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - } else { - atomicCaptureOp = firOpBuilder.create(loc); - } - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { - if (Fortran::semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicUpdateStatement( - converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genOmpAccAtomicCaptureStatement( - converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - genOmpAccAtomicWriteStatement( - converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(stmt2Expr); - mlir::Type elementType = 
converter.genType(fromExpr); - genOmpAccAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genOmpAccAtomicCaptureStatement( - converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, loc); - } - firOpBuilder.setInsertionPointToEnd(&block); - if constexpr (std::is_same()) { - firOpBuilder.create(loc); - } else { - firOpBuilder.create(loc); - } - firOpBuilder.setInsertionPointToStart(&block); -} - /// Create empty blocks for the current region. /// These blocks replace blocks parented to an enclosing region. template diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp index 418bf4ee3d15f..e6175ebda40b2 100644 --- a/flang/lib/Lower/OpenACC.cpp +++ b/flang/lib/Lower/OpenACC.cpp @@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) { llvm::report_fatal_error("Could not find symbol"); } +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. +static inline void +genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value fromAddress, mlir::Value toAddress, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + firOpBuilder.create( + loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. 
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var, + stmt2Expr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + elementType, loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var, + stmt1Expr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + template static void genDataOperandOperations(const Fortran::parser::AccObjectList &objectList, @@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter, Fortran::common::visit( Fortran::common::visitors{ [&](const Fortran::parser::AccAtomicRead &atomicRead) { - Fortran::lower::genOmpAccAtomicRead(converter, atomicRead, - loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const Fortran::parser::AccAtomicWrite &atomicWrite) { - Fortran::lower::genOmpAccAtomicWrite< - 
Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite, - loc); + genAtomicWrite(converter, atomicWrite, loc); }, [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) { - Fortran::lower::genOmpAccAtomicUpdate< - Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate, - loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const Fortran::parser::AccAtomicCapture &atomicCapture) { - Fortran::lower::genOmpAccAtomicCapture< - Fortran::parser::AccAtomicCapture, void>(converter, - atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, }, atomicConstruct.u); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 312557d5da07e..fdd85e94829f3 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable, queue, item, clauseOps); } +//===----------------------------------------------------------------------===// +// Code generation for atomic operations +//===----------------------------------------------------------------------===// + +/// Populates \p hint and \p memoryOrder with appropriate clause information +/// if present on atomic construct. 
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/29] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
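To illustrate the idea described above (an UPDATE clause whose argument may be absent on ATOMIC but present on DEPOBJ), here is a minimal, self-contained sketch. All type and function names in it (`ParsedUpdate`, `LoweredUpdate`, `lowerUpdate`, the two enums) are hypothetical stand-ins chosen for this example; they are not flang's parser or clause types, and the sketch only mirrors the general shape of an optional clause argument.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the two kinds of UPDATE argument.
enum class DependenceType { Sink, Source };
enum class TaskDependenceType { In, Out, Inout };

// Parsed form (assumed for illustration): the argument is absent for
// "ATOMIC UPDATE" and present for "DEPOBJ UPDATE(...)".
struct ParsedUpdate {
  std::optional<std::variant<DependenceType, TaskDependenceType>> v;
};

// Lowered form (assumed for illustration): one optional dependence name.
struct LoweredUpdate {
  std::optional<std::string> dependence;
};

static std::string name(DependenceType t) {
  return t == DependenceType::Sink ? "sink" : "source";
}

static std::string name(TaskDependenceType t) {
  switch (t) {
  case TaskDependenceType::In:
    return "in";
  case TaskDependenceType::Out:
    return "out";
  case TaskDependenceType::Inout:
    return "inout";
  }
  return "unknown"; // Not reached for the enumerators above.
}

// Convert the parsed clause: visit the variant only when an argument
// was actually written, otherwise produce an empty result.
LoweredUpdate lowerUpdate(const ParsedUpdate &inp) {
  if (inp.v) {
    return std::visit(
        [](auto kind) { return LoweredUpdate{name(kind)}; }, *inp.v);
  }
  return LoweredUpdate{std::nullopt};
}
```

With this shape, a consumer checks `dependence.has_value()` to tell the argument-less ATOMIC form apart from the DEPOBJ form, instead of asserting that exactly one variant alternative is present.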
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/29] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be a parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed.
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 | 
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. 
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expecting two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture. 
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expecting single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point. 
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ...)" is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+ auto &dirSpec{std::get(x.t)}; + auto &clauseList{std::get>(dirSpec.t)}; if (clauseList) { - atomicDirectiveDefaultOrderFound_ = true; - switch (*defaultMemOrder) { - case common::OmpMemoryOrderType::Acq_Rel: - clauseList->emplace_back(common::visit( - common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause { - return parser::OmpClause::Acquire{}; - }, - [](parser::OmpAtomicCapture &) -> parser::OmpClause { - return parser::OmpClause::AcqRel{}; - }, - [](auto &) -> parser::OmpClause { - // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite} - return parser::OmpClause::Release{}; - }}, - x.u)); - break; - case common::OmpMemoryOrderType::Relaxed: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::Relaxed{}}); - break; - case common::OmpMemoryOrderType::Seq_Cst: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::SeqCst{}}); - break; - default: - // FIXME: Don't process other values at the moment since their validity - // depends on the OpenMP version (which is unavailable here). - break; + if (findMemOrderClause(*clauseList)) { + return false; } + } else { + clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){}); + } + + // Add a memory order clause to the atomic directive. + atomicDirectiveDefaultOrderFound_ = true; + switch (*defaultMemOrder) { + case common::OmpMemoryOrderType::Acq_Rel: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}}); + break; + case common::OmpMemoryOrderType::Relaxed: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}}); + break; + case common::OmpMemoryOrderType::Seq_Cst: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}}); + break; + default: + // FIXME: Don't process other values at the moment since their validity + // depends on the OpenMP version (which is unavailable here). 
+ break; } return false; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 index b82bd13622764..6f58e0939a787 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 index 88ec6fe910b9e..6729be6e5cf8b 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b8422c8f3b769 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture() !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr !CHECK: omp.atomic.capture { !CHECK: omp.atomic.update 
%[[VAL_A_BOX_ADDR]] : !fir.ptr { !CHECK: ^bb0(%[[ARG:.*]]: i32): !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32 !CHECK: omp.yield(%[[TEMP]] : i32) !CHECK: } -!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 +!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 !CHECK: } !CHECK: return !CHECK: } diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90 index 13392ad76471f..6eded49b0b15d 100644 --- a/flang/test/Lower/OpenMP/atomic-write.f90 +++ b/flang/test/Lower/OpenMP/atomic-write.f90 @@ -44,9 +44,9 @@ end program OmpAtomicWrite !CHECK-LABEL: func.func @_QPatomic_write_pointer() { !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"} !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) -!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32 !CHECK: %[[C2:.*]] = arith.constant 2 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90 deleted file mode 100644 index 5cd02698ff482..0000000000000 --- a/flang/test/Parser/OpenMP/atomic-compare.f90 +++ /dev/null @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s -! OpenMP version for documentation purposes only - it isn't used until Sema. -! This is testing for Parser errors that bail out before Sema. 
-program main
- implicit none
- integer :: i, j = 10
- logical :: r
-
- !CHECK: error: expected OpenMP construct
- !$omp atomic compare write
- r = i .eq. j + 1
-
- !CHECK: error: expected end of line
- !$omp atomic compare num_threads(4)
- r = i .eq. j
-end program main
diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90
index 54492bf6a22a6..11e23e062bce7 100644
--- a/flang/test/Semantics/OpenMP/atomic-compare.f90
+++ b/flang/test/Semantics/OpenMP/atomic-compare.f90
@@ -44,46 +44,37 @@
  !$omp end atomic
  ! Check for error conditions:
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst seq_cst compare
  if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic compare seq_cst seq_cst
  if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive
+ !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive
  !$omp atomic seq_cst compare seq_cst
  if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
  !$omp atomic acquire acquire compare
  if (b .eq.
c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
  !$omp atomic compare acquire acquire
  if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive
+ !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive
  !$omp atomic acquire compare acquire
  if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed relaxed compare
  if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic compare relaxed relaxed
  if (b .eq. c) b = a
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
- !ERROR: At most one RELAXED clause can appear on the COMPARE directive
+ !ERROR: At most one RELAXED clause can appear on the ATOMIC directive
  !$omp atomic relaxed compare relaxed
  if (b .eq. c) b = a
- !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct
+ !ERROR: At most one FAIL clause can appear on the ATOMIC directive
  !$omp atomic fail(release) compare fail(release)
  if (c .eq.
a) a = b !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index c13a11a8dd5dc..deb67e7614659 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -16,20 +16,21 @@ program sample !$omp atomic read hint(2) y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(3) y = y + 10 !$omp atomic update hint(5) y = x + y - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement y = x x = y !$omp end atomic - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic update hint(x) y = y * 1 @@ -46,7 +47,7 @@ program sample !$omp atomic hint(omp_lock_hint_speculative) x = y + x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read y = x @@ -69,36 +70,36 @@ program sample !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative) x = y + x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = y * 9 - !ERROR: Hint clause must have non-negative 
constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp atomic hint(1.0) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp atomic hint(z + omp_sync_hint_nonspeculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(k + omp_sync_hint_speculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write x = 10 * y !$omp atomic write hint(a) - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and y+x access the same storage x = y + x !$omp atomic hint(abs(-1)) write diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 new file mode 100644 index 0000000000000..2619d235380f8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -0,0 +1,89 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC READ. Expect no diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v = x + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v => x +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! 
Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic read + v = p(i) +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic read + !ERROR: Atomic variable x should be a scalar + v = x + + !$omp atomic read + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + v = y(2:4) +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic read + !ERROR: Within atomic operation x and x access the same storage + x = x + + ! Accessing same array, but not the same storage. Expect no diagnostics. + !$omp atomic read + y(1) = y(2) +end + +subroutine f05 + integer :: x, v + + !$omp atomic read + !ERROR: Atomic expression x+1_4 should be a variable + v = x + 1 +end + +subroutine f06 + character :: x, v + + !$omp atomic read + !ERROR: Atomic variable x cannot have CHARACTER type + v = x +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic read + !ERROR: Atomic variable x cannot be ALLOCATABLE + v = x +end + diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 new file mode 100644 index 0000000000000..64e31a7974b55 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -0,0 +1,77 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !$omp atomic update capture + x = v + x = x + 1 + y = x + !$omp end atomic +end + +subroutine f01 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments + !$omp atomic update capture + x = v + block + x = x + 1 + y = x + end block + !$omp end atomic +end + +subroutine f02 + integer :: x, y + + ! The update and capture statements can be inside of a single BLOCK. + ! The end-directive is then optional. Expect no diagnostics. 
+ !$omp atomic update capture + block + x = x + 1 + y = x + end block +end + +subroutine f03 + integer :: x + + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !$omp atomic update capture + x = x + 1 + x = x + 2 + !$omp end atomic +end + +subroutine f04 + integer :: x, v + + !$omp atomic update capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + v = x + x = v + !$omp end atomic +end + +subroutine f05 + integer :: x, v, z + + !$omp atomic update capture + v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + !$omp end atomic +end + +subroutine f06 + integer :: x, v, z + + !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + v = x + !$omp end atomic +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 new file mode 100644 index 0000000000000..0aee02d429dc0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -0,0 +1,83 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y + + ! The x is a direct argument of the + operator. Expect no diagnostics. + !$omp atomic update + x = x + (y - 1) +end + +subroutine f01 + integer :: x + + ! x + 0 is unusual, but legal. Expect no diagnostics. + !$omp atomic update + x = x + 0 +end + +subroutine f02 + integer :: x + + ! This is formally not allowed by the syntax restrictions of the spec, + ! but it's equivalent to either x+0 or x*1, both of which are legal. + ! Allow this case. Expect no diagnostics. 
+ !$omp atomic update + x = x +end + +subroutine f03 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator + x = (x + y) + 1 +end + +subroutine f04 + integer :: x + real :: y + + !$omp atomic update + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation + x = floor(x + y) +end + +subroutine f05 + integer :: x + real :: y + + !$omp atomic update + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + x = int(x + y) +end + +subroutine f06 + integer :: x, y + interface + function f(i, j) + integer :: f, i, j + end + end interface + + !$omp atomic update + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation + x = f(x, y) +end + +subroutine f07 + real :: x + integer :: y + + !$omp atomic update + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation + x = x ** y +end + +subroutine f08 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should appear as an argument in the update operation + x = y +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 index 21a9b87d26345..3084376b4275d 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 @@ -22,10 +22,10 @@ program sample x = x / y !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. y !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. 
y end program diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 new file mode 100644 index 0000000000000..979568ebf7140 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -0,0 +1,81 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics. + !$omp atomic write + x = v + 1 + + !$omp atomic write + x = v + 3 + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic write + x = 2 * v + 3 + + !$omp atomic write + x => v +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic write + p(i) = v +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic write + !ERROR: Atomic variable x should be a scalar + x = v + + !$omp atomic write + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + y(2:4) = v +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic write + !ERROR: Within atomic operation x and x+1_4 access the same storage + x = x + 1 + + ! Accessing same array, but not the same storage. Expect no diagnostics. 
+ !$omp atomic write + y(1) = y(2) +end + +subroutine f06 + character :: x, v + + !$omp atomic write + !ERROR: Atomic variable x cannot have CHARACTER type + x = v +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic write + !ERROR: Atomic variable x cannot be ALLOCATABLE + x = v +end + diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0e100871ea9b4..0dc8251ae356e 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! Check OpenMP 2.13.6 atomic Construct @@ -11,9 +11,13 @@ a = b !$omp end atomic + !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED) a = b + !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write a = b @@ -22,39 +26,32 @@ a = a + 1 !$omp end atomic + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic hint(1) acq_rel capture b = a a = a + 1 !$omp end atomic - !ERROR: expected end of line + !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct !$omp atomic read write + !ERROR: Atomic expression a+1._4 should be a variable a = a + 1 !$omp atomic a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 
'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic num_threads(4) a = a + 1 - !ERROR: expected end of line + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic capture num_threads(4) a = a + 1 + !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic relaxed a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' - !$omp atomic num_threads write - a = a + 1 - !$omp end parallel end diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90 index 173effe86b69c..f700c381cadd0 100644 --- a/flang/test/Semantics/OpenMP/atomic01.f90 +++ b/flang/test/Semantics/OpenMP/atomic01.f90 @@ -14,322 +14,277 @@ ! At most one memory-order-clause may appear on the construct. 
!READ - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic read seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst read seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic read acquire acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire read acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the 
READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic read relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed read relaxed i = j !UPDATE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic update seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst update seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The 
atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic update release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release update release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic update relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic 
relaxed update relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !CAPTURE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic capture seq_cst seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst capture seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic capture release release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release capture release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not 
allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic capture relaxed relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed capture relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel acq_rel capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic capture acq_rel acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel capture acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire capture i = j j = k !$omp 
end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic capture acquire acquire i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire capture acquire i = j j = k !$omp end atomic !WRITE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic write seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst write seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic write release release i = j - !ERROR: More than one memory order clause not allowed on 
OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release write release i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic write relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed write relaxed i = j !No atomic-clause - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as 
an argument in the update operation i = j ! 2.17.7.3 ! At most one hint clause may appear on the construct. - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At 
most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint 
(omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative) i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j j = k @@ -337,34 +292,26 @@ ! 2.17.7.4 ! If atomic-clause is read then memory-order-clause must not be acq_rel or release. - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic acq_rel read i = j - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read acq_rel i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic release read i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read release i = j ! 2.17.7.5 ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire. 
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acq_rel write i = j - !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acq_rel i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acquire write i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acquire i = j @@ -372,33 +319,27 @@ ! 2.17.7.6 ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire. - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acq_rel update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acquire update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive !$omp atomic acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed on the 
ATOMIC directive !$omp atomic acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j end program diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90 index c66085d00f157..45e41f2552965 100644 --- a/flang/test/Semantics/OpenMP/atomic02.f90 +++ b/flang/test/Semantics/OpenMP/atomic02.f90 @@ -28,36 +28,29 @@ program OmpAtomic !$omp atomic a = a/(b + 1) !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: The atomic variable c should appear as an argument in the update operation c = d !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. 
b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The /= operator is not a valid ATOMIC UPDATE operation l = a .NE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic m = m .AND. n @@ -76,32 +69,26 @@ program OmpAtomic !$omp atomic update a = a/(b + 1) !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: This is not a valid ATOMIC UPDATE operation c = c//d !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. 
b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic update m = m .AND. n diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index 76367495b9861..f5c189fd05318 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -25,28 +25,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur 
exactly once among the arguments of the top-level MAX operator z = MAX(y, 7, b, c) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8, a, d) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -60,26 +58,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX 
operator z = MAX(y, 7) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = MOD(y, 9) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation x = ABS(x) end program OmpAtomic @@ -92,7 +90,7 @@ subroutine conflicting_types() type(simple) ::s z = 1 !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(s%z, 4) end subroutine @@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) :: s !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator a = min(a, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a) !$omp atomic - !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)` a = min(b, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a, b) 
!$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator y = min(z, x) !$omp atomic z = max(z, y) !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k' + !ERROR: Atomic variable k should be a scalar + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation k = max(x, y) - + !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement x = min(x, k) !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - z =z + s%m + z = z + s%m end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index a9644ad95aa30..5c91ab5dc37e4 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -20,12 +20,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x 
= 1 + y !$omp atomic @@ -33,12 +31,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic @@ -46,12 +42,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic @@ -59,12 +53,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or 
explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic @@ -72,8 +64,7 @@ program OmpAtomic !$omp atomic m = n .AND. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic @@ -81,8 +72,7 @@ program OmpAtomic !$omp atomic m = n .OR. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic @@ -90,8 +80,7 @@ program OmpAtomic !$omp atomic m = n .EQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic @@ -99,8 +88,7 @@ program OmpAtomic !$omp atomic m = n .NEQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. 
l !$omp atomic update @@ -108,12 +96,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 + y !$omp atomic update @@ -121,12 +107,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic update @@ -134,12 +118,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' 
expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic update @@ -147,12 +129,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic update @@ -160,8 +140,7 @@ program OmpAtomic !$omp atomic update m = n .AND. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic update @@ -169,8 +148,7 @@ program OmpAtomic !$omp atomic update m = n .OR. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic update @@ -178,8 +156,7 @@ program OmpAtomic !$omp atomic update m = n .EQV. 
m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic update @@ -187,8 +164,7 @@ program OmpAtomic !$omp atomic update m = n .NEQV. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. l end program OmpAtomic @@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) p !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement x = x !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 1 !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and a*b access the same storage a = a * b + a !$omp atomic - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator a = b * (a + 9) !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (a+b) access the same storage a = a * (a + b) !$omp atomic - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (b+a) access the same storage a = (b + a) * a !$omp atomic - !ERROR: Atomic update statement should 
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -8,20 +8,20 @@ program OmpAtomic use omp_lib integer :: g, x - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic relaxed, seq_cst x = x + 1 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic read seq_cst, relaxed x = g - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic write relaxed, release x = 2 * 4 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 10 - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst x = g g = x * 10 diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 index 7ca8c858239f7..e9cfa49bf934e 100644 --- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90 @@ -18,7 +18,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(3) y = 2 !$omp end critical (name) @@ -27,12 +27,12 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a 
valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(7) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(x) y = 2 @@ -54,7 +54,7 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint) y = 2 @@ -84,35 +84,35 @@ program sample y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended) y = 2 !$omp end critical (name) - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp critical (name) hint(1.0) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp critical (name) hint(z + omp_sync_hint_nonspeculative) y = 2 !$omp end critical (name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(k + omp_sync_hint_speculative) y = 2 !$omp end critical 
(name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended) y = 2 diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 505cbc48fef90..677b933932b44 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -20,70 +20,64 @@ program sample !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar v = y(1:3) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression x*(10_4+x) should be a variable v = x * (10 + x) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression 4_4 should be a variable v = 4 !$omp atomic read - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE v = k !$omp atomic write - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = x !$omp atomic update - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = k + x * (v * x) !$omp atomic - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = v * k !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'z%y' + !ERROR: Within atomic operation z%y and x+z%y access the same storage z%y = x + z%y !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + 
!ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage m = min(m, x, z%m) + k !$omp atomic read - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Atomic expression min(m,x,z%m)+k should be a variable m = min(m, x, z%m) + k !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar x = a - !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - a = x - !$omp atomic write !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement x = a !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar a = x !$omp atomic capture @@ -93,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = b + (x*1) !$omp end atomic @@ -103,60 +97,58 @@ program sample !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with 
CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct x = x + 10 v = b !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt] v = 1 x = 4 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct x = z%y + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 x = z%y !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct x = y(2) + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 x = y(2) !$omp end atomic !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable r cannot have CHARACTER type l = r !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable l cannot have CHARACTER type l = r end program diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90 index ae9fd086015dd..e8817c3f5ef61 100644 --- 
a/flang/test/Semantics/OpenMP/requires-atomic01.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> SeqCst !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> SeqCst !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> SeqCst !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> SeqCst !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> SeqCst !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90 index 4976a9667eb78..a3724a83456fd 100644 --- a/flang/test/Semantics/OpenMP/requires-atomic02.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> AcqRel !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! 
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> AcqRel !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> AcqRel !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! 
CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> AcqRel !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> AcqRel !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Capture + ! 
CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 >From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:41:16 -0500 Subject: [PATCH 05/29] Restart build >From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Tue, 29 Apr 2025 14:45:28 -0500 Subject: [PATCH 06/29] Restart build >From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 07:54:54 -0500 Subject: [PATCH 07/29] Restart build >From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 08:18:01 -0500 Subject: [PATCH 08/29] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed --- flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +- flang/test/Semantics/OpenMP/atomic.f90 | 2 ++ 5 files changed, 6 insertions(+), 4 deletions(-) diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 index 2619d235380f8..73899a9ff37f2 100644 --- a/flang/test/Semantics/OpenMP/atomic-read.f90 +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index 64e31a7974b55..c427ba07d43d8 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python 
%S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y, v diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 0aee02d429dc0..4595e02d01456 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, y diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 index 979568ebf7140..7965ad2dc7dbf 100644 --- a/flang/test/Semantics/OpenMP/atomic-write.f90 +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -1,4 +1,4 @@ -!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 +!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60 subroutine f00 integer :: x, v diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0dc8251ae356e..2caa161507d49 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,3 +1,5 @@ +! REQUIRES: openmp_runtime + ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! 
Check OpenMP 2.13.6 atomic Construct >From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 09:06:20 -0500 Subject: [PATCH 09/29] Fix examples --- flang/examples/FeatureList/FeatureList.cpp | 10 ------- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++-------------- 2 files changed, 7 insertions(+), 29 deletions(-) diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp index d1407cf0ef239..a36b8719e365d 100644 --- a/flang/examples/FeatureList/FeatureList.cpp +++ b/flang/examples/FeatureList/FeatureList.cpp @@ -445,13 +445,6 @@ struct NodeVisitor { READ_FEATURE(ObjectDecl) READ_FEATURE(OldParameterStmt) READ_FEATURE(OmpAlignedClause) - READ_FEATURE(OmpAtomic) - READ_FEATURE(OmpAtomicCapture) - READ_FEATURE(OmpAtomicCapture::Stmt1) - READ_FEATURE(OmpAtomicCapture::Stmt2) - READ_FEATURE(OmpAtomicRead) - READ_FEATURE(OmpAtomicUpdate) - READ_FEATURE(OmpAtomicWrite) READ_FEATURE(OmpBeginBlockDirective) READ_FEATURE(OmpBeginLoopDirective) READ_FEATURE(OmpBeginSectionsDirective) @@ -480,7 +473,6 @@ struct NodeVisitor { READ_FEATURE(OmpIterationOffset) READ_FEATURE(OmpIterationVector) READ_FEATURE(OmpEndAllocators) - READ_FEATURE(OmpEndAtomic) READ_FEATURE(OmpEndBlockDirective) READ_FEATURE(OmpEndCriticalDirective) READ_FEATURE(OmpEndLoopDirective) @@ -566,8 +558,6 @@ struct NodeVisitor { READ_FEATURE(OpenMPDeclareTargetConstruct) READ_FEATURE(OmpMemoryOrderType) READ_FEATURE(OmpMemoryOrderClause) - READ_FEATURE(OmpAtomicClause) - READ_FEATURE(OmpAtomicClauseList) READ_FEATURE(OmpAtomicDefaultMemOrderClause) READ_FEATURE(OpenMPFlushConstruct) READ_FEATURE(OpenMPLoopConstruct) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index dbbf86a6c6151..18a4107f9a9d1 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ 
b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) { // the directive field. [&](const auto &c) -> SourcePosition { const CharBlock &source{std::get<0>(c.t).source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPAtomicConstruct &c) -> SourcePosition { - return std::visit( - [&](const auto &o) -> SourcePosition { - const CharBlock &source{std::get(o.t).source}; - return parsing->allCooked() - .GetSourcePositionRange(source) - ->first; - }, - c.u); + const CharBlock &source{c.source}; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPSectionConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPUtilityConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, }, c.u); @@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - return std::visit( - [&](const auto &c) { - // Get source from the verbatim fields - const CharBlock &source{std::get(c.t).source}; - return "atomic-" + - normalize_construct_name(source.ToString()); - }, - c.u); + const CharBlock &source{c.source}; + return normalize_construct_name(source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; >From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:19 -0500 Subject: 
[PATCH 10/29] Fix example --- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++-- flang/test/Examples/omp-atomic.f90 | 16 +++++++++++----- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index 18a4107f9a9d1..b0964845d3b99 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - const CharBlock &source{c.source}; - return normalize_construct_name(source.ToString()); + auto &dirSpec = std::get(c.t); + auto &dirName = std::get(dirSpec.t); + return normalize_construct_name(dirName.source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90 index dcca34b633a3e..934f84f132484 100644 --- a/flang/test/Examples/omp-atomic.f90 +++ b/flang/test/Examples/omp-atomic.f90 @@ -26,25 +26,31 @@ ! CHECK:--- ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 9 -! CHECK-NEXT: construct: atomic-read +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: -! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: - clause: read ! CHECK-NEXT: details: '' +! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 12 -! CHECK-NEXT: construct: atomic-write +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: ! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' +! CHECK-NEXT: - clause: write ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 16 -! CHECK-NEXT: construct: atomic-capture +! 
CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: +! CHECK-NEXT: - clause: capture +! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;' ! CHECK-NEXT: - clause: seq_cst ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 21 -! CHECK-NEXT: construct: atomic-atomic +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: [] ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 8 >From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:47 -0500 Subject: [PATCH 11/29] Remove reference to *nullptr --- flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ab868df76d298..4d9b375553c1e 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); - mlir::Operation *secondOp = genAtomicOperation( - converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, - *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + mlir::Operation *secondOp = nullptr; + if (analysis.op1.what != analysis.None) { + secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, + atomAddr, atom, *get(analysis.op1.expr), + hint, memOrder, atomicAt, prepareAt); + } if (secondOp) { builder.setInsertionPointAfter(secondOp); >From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 2 May 2025 18:49:05 -0500 Subject: [PATCH 12/29] Updates and improvements --- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++-- flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++---- 
flang/lib/Semantics/check-omp-structure.h | 1 + .../Todo/atomic-capture-implicit-cast.f90 | 48 --- .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 - .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +- .../OpenMP/atomic-update-capture.f90 | 8 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +- 9 files changed, 381 insertions(+), 180 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 77f57b1cb85c7..8213fe33edbd0 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct { struct Op { int what; - TypedExpr expr; + AssignmentStmt::TypedAssignment assign; }; TypedExpr atom, cond; Op op0, op1; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6177b59199481..7b6c22095d723 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter, static mlir::Operation * // genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Value storeAddr = + 
fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Type storeType = fir::unwrapRefType(storeAddr.getType()); + + mlir::Value toAddr = [&]() { + if (atomType == storeType) + return storeAddr; + return builder.createTemporary(loc, atomType, ".tmp.atomval"); + }(); builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + + if (atomType != storeType) { + lower::ExprToValueMap overrides; + // The READ operation could be a part of UPDATE CAPTURE, so make sure + // we don't emit extra code into the body of the atomic op. + builder.restoreInsertionPoint(postAt); + mlir::Value load = builder.create(loc, toAddr); + overrides.try_emplace(&atom, load); + + converter.overrideExprValues(&overrides); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); + converter.resetExprOverrides(); + + builder.create(loc, value, storeAddr); + } builder.restoreInsertionPoint(saved); return op; } @@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, static mlir::Operation * // genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value value = 
fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); mlir::Value converted = builder.createConvert(loc, atomType, value); @@ -2719,19 +2746,20 @@ static mlir::Operation * genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrResizeOf(arg, atom)) { @@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, converter.overrideExprValues(&overrides); mlir::Value updated = - fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Value converted = builder.createConvert(loc, atomType, updated); builder.create(loc, converted); converter.resetExprOverrides(); @@ -2764,20 +2792,21 @@ static mlir::Operation * genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, 
lower::StatementContext &stmtCtx, int action, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: - return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Write: - return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Update: - return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign, + hint, memOrder, preAt, atomicAt, postAt); default: return nullptr; } @@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { } return ""s; }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; const SomeExpr &atom = *analysis.atom.get()->v; @@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; llvm::errs() << " op0 {\n"; llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; - 
llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << " op1 {\n"; llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; - llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << "}\n"; } @@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAtomicConstruct &construct) { - auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { - if (auto *maybe = expr.get(); maybe && maybe->v) { + auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) { + if (auto *maybe = typedWrapper.get(); maybe && maybe->v) { return &*maybe->v; } else { return nullptr; @@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, int action0 = analysis.op0.what & analysis.Action; int action1 = analysis.op1.what & analysis.Action; mlir::Operation *captureOp = nullptr; - fir::FirOpBuilder::InsertPoint atomicAt; - fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint atomicAt, postAt; if (construct.IsCapture()) { // Capturing operation. @@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, captureOp = builder.create(loc, hint, memOrder); // Set the non-atomic insertion point to before the atomic.capture. 
- prepareAt = getInsertionPointBefore(captureOp); + preAt = getInsertionPointBefore(captureOp); mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); builder.setInsertionPointToEnd(block); @@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, // atomic.capture. mlir::Operation *term = builder.create(loc); atomicAt = getInsertionPointBefore(term); + postAt = getInsertionPointAfter(captureOp); hint = nullptr; memOrder = nullptr; } else { @@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(action0 != analysis.None && action1 == analysis.None && "Expexcing single action"); assert(!(analysis.op0.what & analysis.Condition)); - atomicAt = prepareAt; + postAt = atomicAt = preAt; } mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, - *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); mlir::Operation *secondOp = nullptr; if (analysis.op1.what != analysis.None) { secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, - atomAddr, atom, *get(analysis.op1.expr), - hint, memOrder, atomicAt, prepareAt); + atomAddr, atom, *get(analysis.op1.assign), + hint, memOrder, preAt, atomicAt, postAt); } if (secondOp) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 201b38bd05ff3..f7753a5e5cc59 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } -static bool IsVarOrFunctionRef(const SomeExpr &expr) { - return evaluate::UnwrapProcedureRef(expr) != nullptr || - evaluate::IsVariable(expr); +static bool 
IsVarOrFunctionRef(const MaybeExpr &expr) { + if (expr) { + return evaluate::UnwrapProcedureRef(*expr) != nullptr || + evaluate::IsVariable(*expr); + } else { + return false; + } } static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { @@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource( namespace atomic { +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + enum class Operator { Unk, // Operators that are officially allowed in the update operation @@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (Append(v, std::move(results)), ...); + (MoveAppend(v, std::move(results)), ...); return v; } +}; -private: - static void Append(Result &acc, Result &&data) { - for (auto &&s : data) { - acc.push_back(std::move(s)); +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). 
+ return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + auto copy{x.derived()}; + return {evaluate::AsGenericExpr(std::move(copy)), {}}; } } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; }; } // namespace atomic @@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } +static MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+  if (expr == x) {
+    return true;
+  } else if (auto maybe{GetConvertInput(expr)}) {
+    return *maybe == x;
+  } else {
+    return false;
+  }
+}
+
 bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) {
   // Both expr and x have the form of SomeType(SomeKind(...)[1]).
   // Check if expr is
@@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) {
   }
 }

+bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) {
+  return atomic::VariableFinder{sub}(super);
+}
+
 static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) {
   if (value) {
     expr.Reset(new evaluate::GenericExprWrapper(std::move(value)),
@@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) {
   }
 }

+static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign,
+    std::optional value) {
+  if (value) {
+    assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)),
+        evaluate::GenericAssignmentWrapper::Deleter);
+  }
+}
+
 static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp(
-    int what, const MaybeExpr &maybeExpr = std::nullopt) {
+    int what,
+    const std::optional &maybeAssign = std::nullopt) {
   parser::OpenMPAtomicConstruct::Analysis::Op operation;
   operation.what = what;
-  SetExpr(operation.expr, maybeExpr);
+  SetAssignment(operation.assign, maybeAssign);
   return operation;
 }

@@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis(
   // };
   // struct Op {
   // int what;
-  // TypedExpr expr;
+  // TypedAssignment assign;
   // };
   // TypedExpr atom, cond;
   // Op op0, op1;
@@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base,
   }
 }

+void OmpStructureChecker::ErrorShouldBeVariable(
+    const MaybeExpr &expr, parser::CharBlock source) {
+  if (expr) {
+    context_.Say(source, "Atomic expression %s should be a variable"_err_en_US,
+        expr->AsFortran());
+  } else {
+    context_.Say(source, "Atomic expression should be a variable"_err_en_US);
+  }
+}
+
 /// Check
if `expr` satisfies the following conditions for x and v: /// /// [6.0:189:10-12] @@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture( // // The two allowed cases are: // x = ... atomic-var = ... - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // or - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // x = ... atomic-var = ... // // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture @@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture( // // If the two statements don't fit these criteria, return a pair of default- // constructed values. + using ReturnTy = std::pair; SourcedActionStmt act1{GetActionStmt(ec1)}; SourcedActionStmt act2{GetActionStmt(ec2)}; @@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture( auto isUpdateCapture{ [](const evaluate::Assignment &u, const evaluate::Assignment &c) { - return u.lhs == c.rhs; + return IsSameOrConvertOf(c.rhs, u.lhs); }}; // Do some checks that narrow down the possible choices for the update // and the capture statements. This will help to emit better diagnostics. - bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); - bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; + + auto errorCaptureShouldRead{[&](const parser::CharBlock &source, + const std::string &expr) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US, + expr); + }}; - if (couldBeCapture1) { - if (couldBeCapture2) { - if (isUpdateCapture(as2, as1)) { - if (isUpdateCapture(as1, as2)) { - // If both statements could be captures and both could be updates, - // emit a warning about the ambiguity. - context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); - } - return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); - } else if (isUpdateCapture(as1, as2)) { + auto errorNeitherWorks{[&]() { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US); + }}; + + auto makeSelectionFromDet{[&](int det) -> ReturnTy { + // If det != 0, then the checks unambiguously suggest a specific + // categorization. + // If det == 0, then this function should be called only if the + // checks haven't ruled out any possibility, i.e. when both assigments + // could still be either updates or captures. 
+ if (det > 0) { + // as1 is update, as2 is capture + if (isUpdateCapture(as1, as2)) { return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - context_.Say(source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, - as1.rhs.AsFortran(), as2.rhs.AsFortran()); + errorCaptureShouldRead(act2.source, as1.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } else { // !couldBeCapture2 + } else if (det < 0) { + // as2 is update, as1 is capture if (isUpdateCapture(as2, as1)) { return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } else { - context_.Say(act2.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as1.rhs.AsFortran()); + errorCaptureShouldRead(act1.source, as2.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } - } else { // !couldBeCapture1 - if (couldBeCapture2) { - if (isUpdateCapture(as1, as2)) { - return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); - } else { + } else { + bool updateFirst{isUpdateCapture(as1, as2)}; + bool captureFirst{isUpdateCapture(as2, as1)}; + if (updateFirst && captureFirst) { + // If both assignment could be the update and both could be the + // capture, emit a warning about the ambiguity. context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as2.rhs.AsFortran()); + "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US); + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } - } else { - context_.Say(source, - "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + if (updateFirst != captureFirst) { + const parser::ExecutionPartConstruct *upd{updateFirst ? 
ec1 : ec2}; + const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2}; + return std::make_pair(upd, cap); + } + assert(!updateFirst && !captureFirst); + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); } + }}; + + if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) { + return makeSelectionFromDet(det); } + assert(det == 0 && "Prior checks should have covered det != 0"); - return std::make_pair(nullptr, nullptr); + // If neither of the statements is an RMW update, it could still be a + // "write" update. Pretty much any assignment can be a write update, so + // recompute det with cbu1 = cbu2 = true. + if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) { + return makeSelectionFromDet(writeDet); + } + + // It's only errors from here on. + + if (!cbu1 && !cbu2 && !cbc1 && !cbc2) { + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); + } + + // The remaining cases are that + // - no candidate for update, or for capture, + // - one of the assigments cannot be anything. + + if (!cbu1 && !cbu2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US); + return std::make_pair(nullptr, nullptr); + } else if (!cbc1 && !cbc2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) { + auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source; + context_.Say(src, + "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + // All cases should have been covered. 
+ llvm_unreachable("Unchecked condition"); } void OmpStructureChecker::CheckAtomicCaptureAssignment( const evaluate::Assignment &capture, const SomeExpr &atom, parser::CharBlock source) { - const SomeExpr &cap{capture.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &cap{capture.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, rsrc); - // This part should have been checked prior to callig this function. - assert(capture.rhs == atom && "This canont be a capture assignment"); + // This part should have been checked prior to calling this function. + assert(*GetConvertInput(capture.rhs) == atom && + "This canont be a capture assignment"); CheckStorageOverlap(atom, {cap}, source); } } void OmpStructureChecker::CheckAtomicReadAssignment( const evaluate::Assignment &read, parser::CharBlock source) { - const SomeExpr &atom{read.rhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; - if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + if (auto maybe{GetConvertInput(read.rhs)}) { + const SomeExpr &atom{*maybe}; + + if (!IsVarOrFunctionRef(atom)) { + ErrorShouldBeVariable(atom, rsrc); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } } else { - CheckAtomicVariable(atom, rsrc); - CheckStorageOverlap(atom, {read.lhs}, source); + ErrorShouldBeVariable(read.rhs, rsrc); } } @@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment( // one of the following forms: // x = expr // x => expr - const SomeExpr &atom{write.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{write.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + 
ErrorShouldBeVariable(atom, rsrc);
   } else {
     CheckAtomicVariable(atom, lsrc);
     CheckStorageOverlap(atom, {write.rhs}, source);
@@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment(
   // x = intrinsic-procedure-name (x)
   // x = intrinsic-procedure-name (x, expr-list)
   // x = intrinsic-procedure-name (expr-list, x)
-  const SomeExpr &atom{update.lhs};
   auto [lsrc, rsrc]{SplitAssignmentSource(source)};
+  const SomeExpr &atom{update.lhs};

   if (!IsVarOrFunctionRef(atom)) {
-    context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US,
-        atom.AsFortran());
+    ErrorShouldBeVariable(atom, rsrc);
     // Skip other checks.
     return;
   }
@@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment(
 void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment(
     const SomeExpr &cond, parser::CharBlock condSource,
     const evaluate::Assignment &assign, parser::CharBlock assignSource) {
-  const SomeExpr &atom{assign.lhs};
   auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)};
+  const SomeExpr &atom{assign.lhs};

   if (!IsVarOrFunctionRef(atom)) {
-    context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US,
-        atom.AsFortran());
+    ErrorShouldBeVariable(atom, arsrc);
     // Skip other checks.
     return;
   }
@@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly(

       using Analysis = parser::OpenMPAtomicConstruct::Analysis;
       x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
-          MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs),
+          MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate),
           MakeAtomicAnalysisOp(Analysis::None));
     } else {
       context_.Say(
@@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate(

   using Analysis = parser::OpenMPAtomicConstruct::Analysis;
   x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond,
-      MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs),
+      MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign),
       MakeAtomicAnalysisOp(Analysis::None));
 }

@@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture(

   if (GetActionStmt(&body.front()).stmt == uact.stmt) {
     x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
-        MakeAtomicAnalysisOp(action, update.rhs),
-        MakeAtomicAnalysisOp(Analysis::Read, capture.lhs));
+        MakeAtomicAnalysisOp(action, update),
+        MakeAtomicAnalysisOp(Analysis::Read, capture));
   } else {
     x.analysis = MakeAtomicAnalysis(atom, std::nullopt,
-        MakeAtomicAnalysisOp(Analysis::Read, capture.lhs),
-        MakeAtomicAnalysisOp(action, update.rhs));
+        MakeAtomicAnalysisOp(Analysis::Read, capture),
+        MakeAtomicAnalysisOp(action, update));
   }
 }

@@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture(

   if (captureFirst) {
     x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond,
-        MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs),
-        MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs));
+        MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign),
+        MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign));
   } else {
     x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond,
-        MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs),
-        MakeAtomicAnalysisOp(Analysis::Read | captureWhen,
capAssign.lhs)); + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign)); } } @@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead( if (body.size() == 1) { SourcedActionStmt action{GetActionStmt(&body.front())}; if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { - const SomeExpr &atom{maybeRead->rhs}; CheckAtomicReadAssignment(*maybeRead, action.source); - using Analysis = parser::OpenMPAtomicConstruct::Analysis; - x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), - MakeAtomicAnalysisOp(Analysis::None)); + if (auto maybe{GetConvertInput(maybeRead->rhs)}) { + const SomeExpr &atom{*maybe}; + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead), + MakeAtomicAnalysisOp(Analysis::None)); + } } else { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); @@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index bf6fbf16d0646..835fbe45e1c0e 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -253,6 +253,7 @@ class OmpStructureChecker void CheckStorageOverlap(const evaluate::Expr &, llvm::ArrayRef>, parser::CharBlock); + void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source); void CheckAtomicVariable( const evaluate::Expr &, parser::CharBlock); std::pair&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture 
requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..aa9d2e0ac3ff7 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -1,5 +1,3 @@ -! REQUIRES : openmp_runtime - ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s ! 
CHECK: func.func @_QPatomic_implicit_cast_read() { diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index deb67e7614659..8adb0f1a67409 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -25,7 +25,7 @@ program sample !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement y = x x = y !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index c427ba07d43d8..f808ed916fb7e 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -39,7 +39,7 @@ subroutine f02 subroutine f03 integer :: x - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture !$omp atomic update capture x = x + 1 x = x + 2 @@ -50,7 +50,7 @@ subroutine f04 integer :: x, v !$omp atomic update capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement v = x x = v !$omp end atomic @@ -60,8 +60,8 @@ subroutine f05 integer :: x, v, z !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand 
side of the capture assignment should read z v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 !$omp end atomic end @@ -70,8 +70,8 @@ subroutine f06 integer :: x, v, z !$omp atomic update capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z v = x !$omp end atomic end diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 677b933932b44..5e180aa0bbe5b 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -97,50 +97,50 @@ program sample !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture x = x + 10 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x v = b !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture !$omp 
atomic capture v = 1 x = 4 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) !$omp end atomic >From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 24 Mar 2025 15:38:58 -0500 Subject: [PATCH 13/29] DumpEvExpr: show type --- flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 2f445429a10b5..1553dac3b6687 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,6 +16,7 @@ #include #include +#include #include #include @@ -38,6 +39,17 @@ class DumpEvaluateExpr { } private: + template + struct TypeOf { + static constexpr std::string_view name{TypeOf::get()}; + static constexpr std::string_view get() { + std::string_view v(__PRETTY_FUNCTION__); 
+ v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" + return v; + } + }; + template void Show(const common::Indirection &x) { Show(x.value()); } @@ -76,7 +88,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant"); + Indent("derived constant "s + std::string(TypeOf::name)); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -84,7 +96,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant"); + Print("constant "s + std::string(TypeOf::name)); } } void Show(const Symbol &symbol); @@ -102,7 +114,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator"); + Indent("designator "s + std::string(TypeOf::name)); Show(x.u); Outdent(); } @@ -117,7 +129,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref"); + Indent("function ref "s + std::string(TypeOf::name)); Show(x.proc()); Show(x.arguments()); Outdent(); @@ -127,14 +139,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value"); + Indent("array constructor value "s + std::string(TypeOf::name)); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do"); + Indent("implied do "s + std::string(TypeOf::name)); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -148,20 +160,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op"); + Indent("unary op "s + std::string(TypeOf::name)); Show(op.left()); Outdent(); } 
template void Show(const evaluate::Operation &op) { - Indent("binary op"); + Indent("binary op "s + std::string(TypeOf::name)); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr T"); + Indent("expr <" + std::string(TypeOf::name) + ">"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index aa0b4e0f03398..66cedab94bfb4 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("expr some type"); + Indent("relational some type"); Show(x.u); Outdent(); } >From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 15:14:45 -0500 Subject: [PATCH 14/29] Handle conversion from real to complex via complex constructor --- flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++--- 1 file changed, 47 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dada9c6c2bd6f..ae81dcb5ea150 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,36 +3183,46 @@ struct ConvertCollector using Base::operator(); template // - Result operator()(const evaluate::Designator &x) const { + Result asSomeExpr(const T &x) const { auto copy{x}; return {AsGenericExpr(std::move(copy)), {}}; } + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + template // Result operator()(const evaluate::FunctionRef &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template // Result operator()(const evaluate::Constant &x) const { - auto copy{x}; - return 
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/29] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: } >From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:14:47 -0500 Subject: [PATCH 16/29] Allow conversion in update operations --- flang/include/flang/Semantics/tools.h | 17 ++++----- flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++-- flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++----------- .../Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++-- flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++---------- .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +- 7 files changed, 44 insertions(+), 57 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7f1ec59b087a2..9be2feb8ae064 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -789,14 +789,15 @@ inline bool checkForSymbolMatch( /// return the "expr" but with top-level parentheses stripped. std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); -/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). -/// Check if "expr" is -/// SomeType(SomeKind(Type( -/// Convert -/// SomeKind(...)[2]))) -/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves -/// TypeCategory. -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// it returns std::nullopt. 
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic >From 341723713929507c59d528540d32bc2e4213e920 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:21:56 -0500 Subject: [PATCH 17/29] format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e8b1b..0f553541c5ef0 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } >From 2686207342bad511f6d51b20ed923c0d2cc9047b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:22:26 -0500 Subject: [PATCH 18/29] Revert "DumpEvExpr: show type" This reverts commit 40510a3068498d15257cc7d198bce9c8cd71a902. Debug changes accidentally pushed upstream. 
--- flang/include/flang/Semantics/dump-expr.h | 30 +++++++---------------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 10 insertions(+), 22 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b6687..2f445429a10b5 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,7 +16,6 @@ #include #include -#include #include #include @@ -39,17 +38,6 @@ class DumpEvaluateExpr { } private: - template - struct TypeOf { - static constexpr std::string_view name{TypeOf::get()}; - static constexpr std::string_view get() { - std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" - return v; - } - }; - template void Show(const common::Indirection &x) { Show(x.value()); } @@ -88,7 +76,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant "s + std::string(TypeOf::name)); + Indent("derived constant"); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -96,7 +84,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant "s + std::string(TypeOf::name)); + Print("constant"); } } void Show(const Symbol &symbol); @@ -114,7 +102,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator "s + std::string(TypeOf::name)); + Indent("designator"); Show(x.u); Outdent(); } @@ -129,7 +117,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref "s + std::string(TypeOf::name)); + Indent("function ref"); Show(x.proc()); Show(x.arguments()); Outdent(); @@ 
-139,14 +127,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value "s + std::string(TypeOf::name)); + Indent("array constructor value"); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do "s + std::string(TypeOf::name)); + Indent("implied do"); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -160,20 +148,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op "s + std::string(TypeOf::name)); + Indent("unary op"); Show(op.left()); Outdent(); } template void Show(const evaluate::Operation &op) { - Indent("binary op "s + std::string(TypeOf::name)); + Indent("binary op"); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr <" + std::string(TypeOf::name) + ">"); + Indent("expr T"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 66cedab94bfb4..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("relational some type"); + Indent("expr some type"); Show(x.u); Outdent(); } >From c00fc531bcf742c409fc974da94c5b362fa9132c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:37:19 -0500 Subject: [PATCH 19/29] Delete unnecessary static_assert --- flang/lib/Semantics/check-omp-structure.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 6005dda7c26fe..2e59553d5e130 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -21,8 +21,6 @@ namespace Fortran::semantics { -static_assert(std::is_same_v>); - template static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { return !(e == f); >From 45b012c16b77c757a0d09b2a229bad49fed8d26f Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:25 -0500 Subject: [PATCH 20/29] Add missing initializer for 'iff' --- flang/lib/Semantics/check-omp-structure.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 2e59553d5e130..aa1bd136b371f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2815,7 +2815,7 @@ static std::optional AnalyzeConditionalStmt( } } else { AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, - GetActionStmt(std::get(s.t))}; + GetActionStmt(std::get(s.t)), SourcedActionStmt{}}; if (result.ift.stmt) { return result; } >From daeac25991bf14fb08c3accabe068c074afa1eb7 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:47 -0500 Subject: [PATCH 21/29] Add asserts for printing "Identity" as top-level operator --- flang/lib/Semantics/check-omp-structure.cpp | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa1bd136b371f..062b45deac865 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3823,6 +3823,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, atomic::ToString(top.first)); @@ -3852,6 +3853,8 @@ void 
OmpStructureChecker::CheckAtomicUpdateAssignment( "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, atom.AsFortran(), atomic::ToString(top.first)); @@ -3898,16 +3901,20 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, atomic::ToString(top.first)); } break; } + case atomic::Operator::Identity: case atomic::Operator::True: case atomic::Operator::False: break; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, atomic::ToString(top.first)); >From ae121e5c37453af1a4aba7c77939c2c1c45b75fa Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 13:15:58 -0500 Subject: [PATCH 22/29] Explain the use of determinant --- flang/lib/Semantics/check-omp-structure.cpp | 31 ++++++++++++++++++--- 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 062b45deac865..bc6a09b9768ef 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3606,10 +3606,33 @@ OmpStructureChecker::CheckUpdateCapture( // subexpression of the right-hand side. // 2. An assignment could be a capture (cbc) if the right-hand side is // a variable (or a function ref), with potential type conversions. 
-  bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)};
-  bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)};
-  bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))};
-  bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))};
+  bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; // Can as1 be an update?
+  bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; // Can as2 be an update?
+  bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; // Can 1 be capture?
+  bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; // Can 2 be capture?
+
+  // We want to diagnose cases where both assignments cannot be an update,
+  // or both cannot be a capture, as well as cases where either assignment
+  // cannot be any of these two.
+  //
+  // If we organize these boolean values into a matrix
+  //   |cbu1 cbu2|
+  //   |cbc1 cbc2|
+  // then we want to diagnose cases where the matrix has a zero (i.e. "false")
+  // row or column, including the case where everything is zero. All these
+  // cases correspond to the determinant of the matrix being 0, which suggests
+  // that checking the det may be a convenient diagnostic check. There is only
+  // one additional case where the det is 0, which is when the matrix is all 1
+  // ("true"). The "all true" case represents the situation where both
+  // assignments could be an update as well as a capture. On the other hand,
+  // whenever det != 0, the roles of the update and the capture can be
+  // unambiguously assigned to as1 and as2 [1].
+  //
+  // [1] This can be easily verified by hand: there are 10 2x2 matrices with
+  //     det = 0, leaving 6 cases where det != 0:
+  //       0 1   0 1   1 0   1 0   1 1   1 1
+  //       1 0   1 1   0 1   1 1   0 1   1 0
+  //     In each case the classification is unambiguous.
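Reviewer aside, not part of the patch: the counting argument in footnote [1] can also be checked mechanically. The sketch below is a minimal, standalone illustration that assumes nothing from flang; the names `Det` and `CountUnambiguousCases` are hypothetical and were chosen only for this example. It enumerates all 16 boolean matrices and asserts the claims made in the comment.

```cpp
#include <cassert>

// Hypothetical helper (not from the patch): determinant of the matrix
//   |cbu1 cbu2|
//   |cbc1 cbc2|
// with false/true read as 0/1.
constexpr int Det(bool cbu1, bool cbu2, bool cbc1, bool cbc2) {
  return int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1);
}

// Enumerates all 16 boolean 2x2 matrices, checks the claims from the comment,
// and returns the number of matrices whose determinant is non-zero.
inline int CountUnambiguousCases() {
  int nonZero = 0;
  for (int bits = 0; bits < 16; ++bits) {
    const bool cbu1 = (bits & 8) != 0, cbu2 = (bits & 4) != 0;
    const bool cbc1 = (bits & 2) != 0, cbc2 = (bits & 1) != 0;
    const int det = Det(cbu1, cbu2, cbc1, cbc2);

    const bool zeroRow = (!cbu1 && !cbu2) || (!cbc1 && !cbc2);
    const bool zeroCol = (!cbu1 && !cbc1) || (!cbu2 && !cbc2);
    const bool allTrue = cbu1 && cbu2 && cbc1 && cbc2;
    // det == 0 exactly for a zero row, a zero column, or the all-true matrix.
    assert((det == 0) == (zeroRow || zeroCol || allTrue));

    if (det != 0) {
      ++nonZero;
      const bool as1UpdateAs2Capture = cbu1 && cbc2;
      const bool as2UpdateAs1Capture = cbu2 && cbc1;
      // Exactly one role assignment is feasible; the sign of det says which.
      assert(as1UpdateAs2Capture != as2UpdateAs1Capture);
      assert((det > 0) == as1UpdateAs2Capture);
    }
  }
  return nonZero;
}
```

Calling `CountUnambiguousCases()` from any test driver should yield 6, matching the six matrices listed in the footnote. A positive determinant corresponds to as1 being the update and as2 the capture, a negative one to the reverse.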
   //     |cbu1 cbu2|
   // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1

>From cae0e8fcd3f6b8c2bc3ad8f85599ef4765c6afc5 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 30 May 2025 11:48:18 -0500
Subject: [PATCH 23/29] Deal with assignments that failed Fortran semantic checks

Don't emit diagnostics for those.
---
 flang/lib/Semantics/check-omp-structure.cpp | 66 ++++++++++++++-------
 1 file changed, 46 insertions(+), 20 deletions(-)

diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp
index bc6a09b9768ef..89a3a407441a8 100644
--- a/flang/lib/Semantics/check-omp-structure.cpp
+++ b/flang/lib/Semantics/check-omp-structure.cpp
@@ -2726,6 +2726,9 @@ static SourcedActionStmt GetActionStmt(const parser::Block &block) {
 // Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption
 // is that the ActionStmt will be either an assignment or a pointer-assignment,
 // otherwise return std::nullopt.
+// Note: This function can return std::nullopt on [Pointer]AssignmentStmt where
+// the "typedAssignment" is unset. This can happen if there are semantic errors
+// in the purported assignment.
 static std::optional GetEvaluateAssignment(
     const parser::ActionStmt *x) {
   if (x == nullptr) {
@@ -2754,6 +2757,29 @@ static std::optional GetEvaluateAssignment(
       x->u);
 }

+// Check if the ActionStmt is actually a [Pointer]AssignmentStmt. This is
+// to separate cases where the source has something that looks like an
+// assignment, but is semantically wrong (diagnosed by general semantic
+// checks), and where the source has some other statement (which we want
+// to report as "should be an assignment").
+static bool IsAssignment(const parser::ActionStmt *x) { + if (x == nullptr) { + return false; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + + return common::visit( + [](auto &&s) -> bool { + using BareS = llvm::remove_cvref_t; + return std::is_same_v || + std::is_same_v; + }, + x->u); +} + static std::optional AnalyzeConditionalStmt( const parser::ExecutionPartConstruct *x) { if (x == nullptr) { @@ -3588,8 +3614,10 @@ OmpStructureChecker::CheckUpdateCapture( auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; if (!maybeAssign1 || !maybeAssign2) { - context_.Say(source, - "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + if (!IsAssignment(act1.stmt) || !IsAssignment(act2.stmt)) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + } return std::make_pair(nullptr, nullptr); } @@ -3956,7 +3984,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( // The if-true statement must be present, and must be an assignment. 
auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; if (!maybeAssign) { - if (update.ift.stmt) { + if (update.ift.stmt && !IsAssignment(update.ift.stmt)) { context_.Say(update.ift.source, "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); } else { @@ -3992,7 +4020,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); } @@ -4094,17 +4122,11 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( } SourcedActionStmt uact{GetActionStmt(uec)}; SourcedActionStmt cact{GetActionStmt(cec)}; - auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; - auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; - - if (!maybeUpdate || !maybeCapture) { - context_.Say(source, - "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); - return; - } + // The "dereferences" of std::optional are guaranteed to be valid after + // CheckUpdateCapture. 
+ evaluate::Assignment update{*GetEvaluateAssignment(uact.stmt)}; + evaluate::Assignment capture{*GetEvaluateAssignment(cact.stmt)}; - const evaluate::Assignment &update{*maybeUpdate}; - const evaluate::Assignment &capture{*maybeCapture}; const SomeExpr &atom{update.lhs}; using Analysis = parser::OpenMPAtomicConstruct::Analysis; @@ -4242,13 +4264,17 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( return; } } else { - context_.Say(capture.source, - "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + if (!IsAssignment(capture.stmt)) { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + } return; } } else { - context_.Say(update.ift.source, - "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + if (!IsAssignment(update.ift.stmt)) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + } return; } @@ -4316,7 +4342,7 @@ void OmpStructureChecker::CheckAtomicRead( MakeAtomicAnalysisOp(Analysis::Read, maybeRead), MakeAtomicAnalysisOp(Analysis::None)); } - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); } @@ -4350,7 +4376,7 @@ void OmpStructureChecker::CheckAtomicWrite( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); } >From 6bc8c10c793ebac02c78daec33e7fb5e6becb8e0 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 12:47:00 -0500 Subject: [PATCH 24/29] Move common functions to tools.cpp --- flang/include/flang/Semantics/tools.h | 134 +++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 
flang/lib/Semantics/check-omp-structure.cpp | 506 ++------------------ flang/lib/Semantics/tools.cpp | 310 ++++++++++++ 4 files changed, 484 insertions(+), 468 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 821f1ae34fd5b..25fadceefceb0 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -778,11 +778,135 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, return false; } -/// If the top-level operation (ignoring parentheses) is either an -/// evaluate::FunctionRef, or a specialization of evaluate::Operation, -/// then return the list of arguments (wrapped in SomeExpr). Otherwise, -/// return the "expr" but with top-level parentheses stripped. -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); +namespace operation { + +enum class Operator { + Add, + And, + Associated, + Call, + Convert, + Div, + Eq, + Eqv, + False, + Ge, + Gt, + Identity, + Intrinsic, + Lt, + Max, + Min, + Mul, + Ne, + Neqv, + Not, + Or, + Pow, + Resize, // Convert within the same TypeCategory + Sub, + True, + Unknown, +}; + +std::string ToString(Operator op); + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unknown; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return 
Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unknown; +} + +template +Operator OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Add; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Sub; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Mul; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Div; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } +} + +template // +Operator OperationCode(const T &) { + return Operator::Unknown; +} + +Operator OperationCode(const evaluate::ProcedureDesignator &proc); + +} // namespace operation + +/// Return information about the top-level operation (ignoring parentheses): +/// the operation code and the list of arguments. +std::pair> +GetTopLevelOperation(const SomeExpr &expr); /// Check if expr is same as x, or a sequence of Convert operations on x. bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ad5eae4ae39a2..c74f7627c5e25 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2828,7 +2828,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, // This must exist by now. 
SomeExpr input = *semantics::GetConvertInput(assign.rhs); - std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; + std::vector args{semantics::GetTopLevelOperation(input).second}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrConvertOf(arg, atom)) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 89a3a407441a8..f29a56d5fd92a 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2891,290 +2891,6 @@ static std::pair SplitAssignmentSource( namespace atomic { -template static void MoveAppend(V &accum, V &&other) { - for (auto &&s : other) { - accum.push_back(std::move(s)); - } -} - -enum class Operator { - Unk, - // Operators that are officially allowed in the update operation - Add, - And, - Associated, - Div, - Eq, - Eqv, - Ge, // extension - Gt, - Identity, // extension: x = x is allowed (*), but we should never print - // "identity" as the name of the operator - Le, // extension - Lt, - Max, - Min, - Mul, - Ne, // extension - Neqv, - Or, - Sub, - // Operators that we recognize for technical reasons - True, - False, - Not, - Convert, - Resize, - Intrinsic, - Call, - Pow, - - // (*): "x = x + 0" is a valid update statement, but it will be folded - // to "x = x" by the time we look at it. Since the source statements - // "x = x" and "x = x + 0" will end up looking the same, accept the - // former as an extension. 
-}; - -std::string ToString(Operator op) { - switch (op) { - case Operator::Add: - return "+"; - case Operator::And: - return "AND"; - case Operator::Associated: - return "ASSOCIATED"; - case Operator::Div: - return "/"; - case Operator::Eq: - return "=="; - case Operator::Eqv: - return "EQV"; - case Operator::Ge: - return ">="; - case Operator::Gt: - return ">"; - case Operator::Identity: - return "identity"; - case Operator::Le: - return "<="; - case Operator::Lt: - return "<"; - case Operator::Max: - return "MAX"; - case Operator::Min: - return "MIN"; - case Operator::Mul: - return "*"; - case Operator::Neqv: - return "NEQV/EOR"; - case Operator::Ne: - return "/="; - case Operator::Or: - return "OR"; - case Operator::Sub: - return "-"; - case Operator::True: - return ".TRUE."; - case Operator::False: - return ".FALSE."; - case Operator::Not: - return "NOT"; - case Operator::Convert: - return "type-conversion"; - case Operator::Resize: - return "resize"; - case Operator::Intrinsic: - return "intrinsic"; - case Operator::Call: - return "function-call"; - case Operator::Pow: - return "**"; - default: - return "??"; - } -} - -template // -struct ArgumentExtractor - : public evaluate::Traverse, - std::pair>, false> { - using Arguments = std::vector; - using Result = std::pair; - using Base = evaluate::Traverse, - Result, false>; - static constexpr auto IgnoreResizes = IgnoreResizingConverts; - static constexpr auto Logical = common::TypeCategory::Logical; - ArgumentExtractor() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result operator()( - const evaluate::Constant> &x) const { - if (const auto &val{x.GetScalarValue()}) { - return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) - : std::make_pair(Operator::False, Arguments{}); - } - return Default(); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - Result result{OperationCode(x.proc()), {}}; - for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { - if (auto *e{x.UnwrapArgExpr(i)}) { - result.second.push_back(*e); - } - } - return result; - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. - return (*this)(x.template operand<0>()); - } - if constexpr (IgnoreResizes && - std::is_same_v>) { - // Ignore conversions within the same category. - // Atomic operations on int(kind=1) may be implicitly widened - // to int(kind=4) for example. - return (*this)(x.template operand<0>()); - } else { - return std::make_pair( - OperationCode(x), OperationArgs(x, std::index_sequence_for{})); - } - } - - template // - Result operator()(const evaluate::Designator &x) const { - evaluate::Designator copy{x}; - Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; - return result; - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - // There shouldn't be any combining needed, since we're stopping the - // traversal at the top-level operation, but implement one that picks - // the first non-empty result. 
- if constexpr (sizeof...(Rs) == 0) { - return std::move(result); - } else { - if (!result.second.empty()) { - return std::move(result); - } else { - return Combine(std::move(results)...); - } - } - } - -private: - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) - const { - switch (op.derived().logicalOperator) { - case common::LogicalOperator::And: - return Operator::And; - case common::LogicalOperator::Or: - return Operator::Or; - case common::LogicalOperator::Eqv: - return Operator::Eqv; - case common::LogicalOperator::Neqv: - return Operator::Neqv; - case common::LogicalOperator::Not: - return Operator::Not; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - switch (op.derived().opr) { - case common::RelationalOperator::LT: - return Operator::Lt; - case common::RelationalOperator::LE: - return Operator::Le; - case common::RelationalOperator::EQ: - return Operator::Eq; - case common::RelationalOperator::NE: - return Operator::Ne; - case common::RelationalOperator::GE: - return Operator::Ge; - case common::RelationalOperator::GT: - return Operator::Gt; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Add; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Sub; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Mul; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Div; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - if constexpr (C == 
T::category) { - return Operator::Resize; - } else { - return Operator::Convert; - } - } - Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { - Operator code = llvm::StringSwitch(proc.GetName()) - .Case("associated", Operator::Associated) - .Case("min", Operator::Min) - .Case("max", Operator::Max) - .Case("iand", Operator::And) - .Case("ior", Operator::Or) - .Case("ieor", Operator::Neqv) - .Default(Operator::Call); - if (code == Operator::Call && proc.GetSpecificIntrinsic()) { - return Operator::Intrinsic; - } - return code; - } - template // - Operator OperationCode(const T &) const { - return Operator::Unk; - } - - template - Arguments OperationArgs(const evaluate::Operation &x, - std::index_sequence) const { - return Arguments{SomeExpr(x.template operand())...}; - } -}; - struct DesignatorCollector : public evaluate::Traverse, false> { using Result = std::vector; @@ -3196,125 +2912,14 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (MoveAppend(v, std::move(results)), ...); - return v; - } -}; - -struct ConvertCollector - : public evaluate::Traverse>, false> { - using Result = std::pair>; - using Base = evaluate::Traverse; - ConvertCollector() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result asSomeExpr(const T &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; - } - - template // - Result operator()(const evaluate::Designator &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::Constant &x) const { - return asSomeExpr(x); - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore parentheses. 
- return (*this)(x.template operand<0>()); - } else if constexpr (is_convert_v) { - // Convert should always have a typed result, so it should be safe to - // dereference x.GetType(). - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else if constexpr (is_complex_constructor_v) { - // This is a conversion iff the imaginary operand is 0. - if (IsZero(x.template operand<1>())) { - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else { - return asSomeExpr(x.derived()); - } - } else { - return asSomeExpr(x.derived()); - } - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - Result v(std::move(result)); - auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { - assert((!x.has_value() || !y.has_value()) && "Multiple designators"); - if (!x.has_value()) { - x = std::move(y); + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); } }}; - (setValue(v.first, std::move(results).first), ...); - (MoveAppend(v.second, std::move(results).second), ...); + (moveAppend(v, std::move(results)), ...); return v; } - -private: - template // - static bool IsZero(const T &x) { - return false; - } - template // - static bool IsZero(const evaluate::Expr &x) { - return common::visit([](auto &&s) { return IsZero(s); }, x.u); - } - template // - static bool IsZero(const evaluate::Constant &x) { - if (auto &&maybeScalar{x.GetScalarValue()}) { - return maybeScalar->IsZero(); - } else { - return false; - } - } - - template // - struct is_convert { - static constexpr bool value{false}; - }; - template // - struct is_convert> { - static constexpr bool value{true}; - }; - template // - struct is_convert> { - // Conversion from complex to real. 
- static constexpr bool value{true}; - }; - template // - static constexpr bool is_convert_v = is_convert::value; - - template // - struct is_complex_constructor { - static constexpr bool value{false}; - }; - template // - struct is_complex_constructor> { - static constexpr bool value{true}; - }; - template // - static constexpr bool is_complex_constructor_v = - is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { @@ -3347,22 +2952,13 @@ static bool IsAllocatable(const SomeExpr &expr) { return !syms.empty() && IsAllocatable(syms.back()); } -static std::pair> GetTopLevelOperation( - const SomeExpr &expr) { - return atomic::ArgumentExtractor{}(expr); -} - -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { - return GetTopLevelOperation(expr).second; -} - static bool IsPointerAssignment(const evaluate::Assignment &x) { return std::holds_alternative(x.u) || std::holds_alternative(x.u); } static bool IsCheckForAssociated(const SomeExpr &cond) { - return GetTopLevelOperation(cond).first == atomic::Operator::Associated; + return GetTopLevelOperation(cond).first == operation::Operator::Associated; } static bool HasCommonDesignatorSymbols( @@ -3455,23 +3051,7 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -MaybeExpr GetConvertInput(const SomeExpr &x) { - // This returns SomeExpr(x) when x is a designator/functionref/constant. - return atomic::ConvertCollector{}(x).first; -} - -bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { - // Check if expr is same as x, or a sequence of Convert operations on x. 
- if (expr == x) { - return true; - } else if (auto maybe{GetConvertInput(expr)}) { - return *maybe == x; - } else { - return false; - } -} - -bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { +static bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3839,45 +3419,46 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - std::pair> top{ - atomic::Operator::Unk, {}}; + std::pair> top{ + operation::Operator::Unknown, {}}; if (auto &&maybeInput{GetConvertInput(update.rhs)}) { top = GetTopLevelOperation(*maybeInput); } switch (top.first) { - case atomic::Operator::Add: - case atomic::Operator::Sub: - case atomic::Operator::Mul: - case atomic::Operator::Div: - case atomic::Operator::And: - case atomic::Operator::Or: - case atomic::Operator::Eqv: - case atomic::Operator::Neqv: - case atomic::Operator::Min: - case atomic::Operator::Max: - case atomic::Operator::Identity: + case operation::Operator::Add: + case operation::Operator::Sub: + case operation::Operator::Mul: + case operation::Operator::Div: + case operation::Operator::And: + case operation::Operator::Or: + case operation::Operator::Eqv: + case operation::Operator::Neqv: + case operation::Operator::Min: + case operation::Operator::Max: + case operation::Operator::Identity: break; - case atomic::Operator::Call: + case operation::Operator::Call: context_.Say(source, "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Convert: + case operation::Operator::Convert: context_.Say(source, "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Intrinsic: + case operation::Operator::Intrinsic: context_.Say(source, "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Unk: + case operation::Operator::Unknown: 
context_.Say( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); return; } // Check if `atom` occurs exactly once in the argument list. @@ -3898,17 +3479,17 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( }()}; if (unique == top.second.end()) { - if (top.first == atomic::Operator::Identity) { + if (top.first == operation::Operator::Identity) { // This is "x = y". context_.Say(rsrc, "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, - atom.AsFortran(), atomic::ToString(top.first)); + atom.AsFortran(), operation::ToString(top.first)); } } else { CheckStorageOverlap(atom, nonAtom, source); @@ -3933,18 +3514,18 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( // Missing arguments to operations would have been diagnosed by now. 
switch (top.first) { - case atomic::Operator::Associated: + case operation::Operator::Associated: if (atom != top.second.front()) { context_.Say(assignSource, "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); } break; // x equalop e | e equalop x (allowing "e equalop x" is an extension) - case atomic::Operator::Eq: - case atomic::Operator::Eqv: + case operation::Operator::Eq: + case operation::Operator::Eqv: // x ordop expr | expr ordop x - case atomic::Operator::Lt: - case atomic::Operator::Gt: { + case operation::Operator::Lt: + case operation::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; if (IsSameOrConvertOf(arg0, atom)) { @@ -3952,23 +3533,24 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); } break; } - case atomic::Operator::Identity: - case atomic::Operator::True: - case atomic::Operator::False: + case operation::Operator::Identity: + case operation::Operator::True: + case operation::Operator::False: break; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); break; } } diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 1d1e3ac044166..fce930dcc1d02 100644 --- a/flang/lib/Semantics/tools.cpp +++ 
b/flang/lib/Semantics/tools.cpp @@ -17,6 +17,7 @@ #include "flang/Semantics/tools.h" #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringSwitch.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1756,4 +1757,313 @@ bool HadUseError( } } +namespace operation { +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() + ? std::make_pair(operation::Operator::True, Arguments{}) + : std::make_pair(operation::Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{operation::OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); + } + } + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair(operation::OperationCode(x), + OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{ + operation::Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; +} // namespace operation + +std::string operation::ToString(operation::Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + 
return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; + } +} + +operation::Operator operation::OperationCode( + const evaluate::ProcedureDesignator &proc) { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; +} + +std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return operation::ArgumentExtractor{}(expr); +} + +namespace operation { +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result asSomeExpr(const T &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::Constant &x) const { + return asSomeExpr(x); + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. 
+ if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } + } else { + return asSomeExpr(x.derived()); + } + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (moveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; +}; +} // namespace operation + +MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. 
+ return operation::ConvertCollector{}(x).first; +} + +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. + if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + } // namespace Fortran::semantics >From a83a1cf262eb9f01aafbcf099a8467aa9b861187 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 13:05:15 -0500 Subject: [PATCH 25/29] format --- flang/include/flang/Semantics/tools.h | 28 +++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 25fadceefceb0..9454f0b489192 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -830,8 +830,8 @@ Operator OperationCode( } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { switch (op.derived().opr) { case common::RelationalOperator::LT: return Operator::Lt; @@ -855,26 +855,26 @@ Operator OperationCode(const evaluate::Operation, Ts...> &op) { } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Sub; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Mul; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Div; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Pow; } @@ -885,8 +885,8 @@ Operator OperationCode( } template -Operator 
-OperationCode(const evaluate::Operation, Ts...> &op) {
+Operator OperationCode(
+    const evaluate::Operation, Ts...> &op) {
   if constexpr (C == T::category) {
     return Operator::Resize;
   } else {
@@ -905,8 +905,8 @@ Operator OperationCode(const evaluate::ProcedureDesignator &proc);

 /// Return information about the top-level operation (ignoring parentheses):
 /// the operation code and the list of arguments.
-std::pair>
-GetTopLevelOperation(const SomeExpr &expr);
+std::pair> GetTopLevelOperation(
+    const SomeExpr &expr);

 /// Check if expr is same as x, or a sequence of Convert operations on x.
 bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x);

>From 9770e4d3c5b0a858f8b5864a7aada01946763450 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 30 May 2025 14:45:18 -0500
Subject: [PATCH 26/29] Restore accidentally removed Le

---
 flang/include/flang/Semantics/tools.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h
index 9454f0b489192..9766effba3ebe 100644
--- a/flang/include/flang/Semantics/tools.h
+++ b/flang/include/flang/Semantics/tools.h
@@ -794,6 +794,7 @@ enum class Operator {
   Gt,
   Identity,
   Intrinsic,
+  Le,
   Lt,
   Max,
   Min,

>From 9b8aaa5586334b48fbb28c103eda1091168342e0 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Fri, 30 May 2025 14:45:47 -0500
Subject: [PATCH 27/29] Recognize constants as "operations"

This allows emitting slightly better diagnostic messages.
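
As a rough illustration of the idea in this commit message, the following is a minimal, self-contained sketch. It is a toy model and not flang's code: `ToyExpr`, `Op`, `TopLevel`, `topLevelOperation` and `diagnoseUpdate` are hypothetical names, whereas the patch itself works on `evaluate::Expr` through `ArgumentExtractor`. The sketch assumes that a constant right-hand side is reported as an identity operation whose only argument is the constant, so the checker can say that the atomic variable is missing instead of rejecting the statement generically.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for flang's expression classes (hypothetical, for
// illustration only).
enum class Op { Unknown, Identity, Add };

struct ToyExpr {
  enum class Kind { Variable, Constant, Add } kind;
  std::string text;              // name or literal text for leaves
  std::vector<ToyExpr> operands; // used for Kind::Add only
};

using TopLevel = std::pair<Op, std::vector<std::string>>;

// Classify the top-level operation and list its arguments as text.
TopLevel topLevelOperation(const ToyExpr &e) {
  switch (e.kind) {
  case ToyExpr::Kind::Variable:
  case ToyExpr::Kind::Constant:
    // Assumption: leaves are wrapped as an identity operation with a
    // single argument, so the caller always gets an argument list.
    return {Op::Identity, {e.text}};
  case ToyExpr::Kind::Add: {
    std::vector<std::string> args;
    for (const ToyExpr &operand : e.operands) {
      args.push_back(operand.text);
    }
    return {Op::Add, args};
  }
  }
  return {Op::Unknown, {}};
}

// Return a diagnostic for "atom = rhs", or an empty string if it is fine.
std::string diagnoseUpdate(const std::string &atom, const ToyExpr &rhs) {
  assert(!atom.empty() && "atom name must not be empty");
  auto [op, args] = topLevelOperation(rhs);
  if (op == Op::Unknown) {
    return "This is not a valid ATOMIC UPDATE operation";
  }
  for (const std::string &arg : args) {
    if (arg == atom) {
      return "";
    }
  }
  return "The atomic variable " + atom +
      " should appear as an argument in the update operation";
}
```

In this toy model, `x = 1` produces the "should appear as an argument" message, which mirrors the expectation changes in atomic04.f90 and atomic05.f90 in this patch.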
--- flang/include/flang/Semantics/tools.h | 8 ++- flang/lib/Semantics/check-omp-structure.cpp | 1 + flang/lib/Semantics/tools.cpp | 71 +++++++++++---------- flang/test/Semantics/OpenMP/atomic04.f90 | 2 +- flang/test/Semantics/OpenMP/atomic05.f90 | 2 +- 5 files changed, 48 insertions(+), 36 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 9766effba3ebe..7a2be79f14a29 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -781,10 +781,12 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, namespace operation { enum class Operator { + Unknown, Add, And, Associated, Call, + Constant, Convert, Div, Eq, @@ -807,7 +809,6 @@ enum class Operator { Resize, // Convert within the same TypeCategory Sub, True, - Unknown, }; std::string ToString(Operator op); @@ -895,6 +896,11 @@ Operator OperationCode( } } +template +Operator OperationCode(const evaluate::Constant &x) { + return Operator::Constant; +} + template // Operator OperationCode(const T &) { return Operator::Unknown; diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 7f96e48a303fe..3c27a3968a7c9 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3459,6 +3459,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( context_.Say(source, "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); return; + case operation::Operator::Constant: case operation::Operator::Unknown: context_.Say( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index fce930dcc1d02..a8cd8a6ec2228 100644 --- a/flang/lib/Semantics/tools.cpp +++ b/flang/lib/Semantics/tools.cpp @@ -1758,6 +1758,12 @@ bool HadUseError( } namespace operation { +template // +SomeExpr asSomeExpr(const T &x) { + auto 
copy{x}; + return AsGenericExpr(std::move(copy)); +} + template // struct ArgumentExtractor : public evaluate::Traverse, @@ -1816,10 +1822,12 @@ struct ArgumentExtractor template // Result operator()(const evaluate::Designator &x) const { - evaluate::Designator copy{x}; - Result result{ - operation::Operator::Identity, {AsGenericExpr(std::move(copy))}}; - return result; + return {operation::Operator::Identity, {asSomeExpr(x)}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + return {operation::Operator::Identity, {asSomeExpr(x)}}; } template // @@ -1849,24 +1857,37 @@ struct ArgumentExtractor std::string operation::ToString(operation::Operator op) { switch (op) { + default: + case Operator::Unknown: + return "??"; case Operator::Add: return "+"; case Operator::And: return "AND"; case Operator::Associated: return "ASSOCIATED"; + case Operator::Call: + return "function-call"; + case Operator::Constant: + return "constant"; + case Operator::Convert: + return "type-conversion"; case Operator::Div: return "/"; case Operator::Eq: return "=="; case Operator::Eqv: return "EQV"; + case Operator::False: + return ".FALSE."; case Operator::Ge: return ">="; case Operator::Gt: return ">"; case Operator::Identity: return "identity"; + case Operator::Intrinsic: + return "intrinsic"; case Operator::Le: return "<="; case Operator::Lt: @@ -1877,32 +1898,22 @@ std::string operation::ToString(operation::Operator op) { return "MIN"; case Operator::Mul: return "*"; - case Operator::Neqv: - return "NEQV/EOR"; case Operator::Ne: return "/="; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Not: + return "NOT"; case Operator::Or: return "OR"; + case Operator::Pow: + return "**"; + case Operator::Resize: + return "resize"; case Operator::Sub: return "-"; case Operator::True: return ".TRUE."; - case Operator::False: - return ".FALSE."; - case Operator::Not: - return "NOT"; - case Operator::Convert: - return "type-conversion"; - case Operator::Resize: 
- return "resize"; - case Operator::Intrinsic: - return "intrinsic"; - case Operator::Call: - return "function-call"; - case Operator::Pow: - return "**"; - default: - return "??"; } } @@ -1939,25 +1950,19 @@ struct ConvertCollector using Base::operator(); - template // - Result asSomeExpr(const T &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; - } - template // Result operator()(const evaluate::Designator &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template // Result operator()(const evaluate::FunctionRef &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template // Result operator()(const evaluate::Constant &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template @@ -1976,10 +1981,10 @@ struct ConvertCollector return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); } else { - return asSomeExpr(x.derived()); + return {asSomeExpr(x.derived()), {}}; } } else { - return asSomeExpr(x.derived()); + return {asSomeExpr(x.derived()), {}}; } } diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index d603ba8b3937c..0f69befed1414 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -180,7 +180,7 @@ subroutine more_invalid_atomic_update_stmts() x = x !$omp atomic update - !ERROR: This is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1 !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index e0103be4cae4a..77ffc6e57f1a3 100644 --- a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -19,7 +19,7 @@ program OmpAtomic x = 2 * 4 !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: This is not a valid ATOMIC 
UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 10 !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst >From f7bc109276a7bb647b0c8f3d65af63fbfb3249dc Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 15:04:57 -0500 Subject: [PATCH 28/29] Add lit tests for dumping atomic analysis --- flang/lib/Lower/OpenMP/OpenMP.cpp | 8 +- .../Lower/OpenMP/dump-atomic-analysis.f90 | 82 +++++++++++++++++++ 2 files changed, 89 insertions(+), 1 deletion(-) create mode 100644 flang/test/Lower/OpenMP/dump-atomic-analysis.f90 diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 30acf8baba082..4c50717f8fde4 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -40,11 +40,14 @@ #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/Transforms/RegionUtils.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/Support/CommandLine.h" #include "llvm/Frontend/OpenMP/OMPConstants.h" using namespace Fortran::lower::omp; using namespace Fortran::common::openmp; +static llvm::cl::opt DumpAtomicAnalysis("fdebug-dump-atomic-analysis"); + //===----------------------------------------------------------------------===// // Code generation helper functions //===----------------------------------------------------------------------===// @@ -3790,7 +3793,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, //===----------------------------------------------------------------------===// [[maybe_unused]] static void -dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { +dumpAtomicAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { auto whatStr = [](int k) { std::string txt = "?"; switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { @@ -3869,6 +3872,9 @@ static void genOMP(lower::AbstractConverter 
&converter, lower::SymMap &symTable, lower::StatementContext stmtCtx; const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + if (DumpAtomicAnalysis) + dumpAtomicAnalysis(analysis); + const semantics::SomeExpr &atom = *get(analysis.atom); mlir::Location loc = converter.genLocation(construct.source); mlir::Value atomAddr = diff --git a/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 b/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 new file mode 100644 index 0000000000000..55c49f98cd2e8 --- /dev/null +++ b/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 @@ -0,0 +1,82 @@ +!RUN: %flang_fc1 -fopenmp -fopenmp-version=60 -emit-hlfir -mmlir -fdebug-dump-atomic-analysis %s -o /dev/null |& FileCheck %s + +subroutine f00(x) + integer :: x, v + !$omp atomic read + v = x +end + +!CHECK: Analysis { +!CHECK-NEXT: atom: x +!CHECK-NEXT: cond: +!CHECK-NEXT: op0 { +!CHECK-NEXT: what: Read +!CHECK-NEXT: assign: v=x +!CHECK-NEXT: } +!CHECK-NEXT: op1 { +!CHECK-NEXT: what: None +!CHECK-NEXT: assign: +!CHECK-NEXT: } +!CHECK-NEXT: } + + +subroutine f01(v) + integer :: x, v + !$omp atomic write + x = v +end + +!CHECK: Analysis { +!CHECK-NEXT: atom: x +!CHECK-NEXT: cond: +!CHECK-NEXT: op0 { +!CHECK-NEXT: what: Write +!CHECK-NEXT: assign: x=v +!CHECK-NEXT: } +!CHECK-NEXT: op1 { +!CHECK-NEXT: what: None +!CHECK-NEXT: assign: +!CHECK-NEXT: } +!CHECK-NEXT: } + + +subroutine f02(x, v) + integer :: x, v + !$omp atomic update + x = x + v +end + +!CHECK: Analysis { +!CHECK-NEXT: atom: x +!CHECK-NEXT: cond: +!CHECK-NEXT: op0 { +!CHECK-NEXT: what: Update +!CHECK-NEXT: assign: x=x+v +!CHECK-NEXT: } +!CHECK-NEXT: op1 { +!CHECK-NEXT: what: None +!CHECK-NEXT: assign: +!CHECK-NEXT: } +!CHECK-NEXT: } + + +subroutine f03(x, v) + integer :: x, v, t + !$omp atomic update capture + t = x + x = x + v + !$omp end atomic +end + +!CHECK: Analysis { +!CHECK-NEXT: atom: x +!CHECK-NEXT: cond: +!CHECK-NEXT: op0 { +!CHECK-NEXT: what: Read +!CHECK-NEXT: assign: t=x +!CHECK-NEXT: } 
+!CHECK-NEXT: op1 { +!CHECK-NEXT: what: Update +!CHECK-NEXT: assign: x=x+v +!CHECK-NEXT: } +!CHECK-NEXT: } >From 7355186f91e91088b58ba766974b82cbe45fb85a Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 15:26:04 -0500 Subject: [PATCH 29/29] format --- flang/include/flang/Semantics/tools.h | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7a2be79f14a29..1e30321269562 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -896,7 +896,7 @@ Operator OperationCode( } } -template +template // Operator OperationCode(const evaluate::Constant &x) { return Operator::Constant; } diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 4c50717f8fde4..f3f896dbf1ecc 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -40,8 +40,8 @@ #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/Transforms/RegionUtils.h" #include "llvm/ADT/STLExtras.h" -#include "llvm/Support/CommandLine.h" #include "llvm/Frontend/OpenMP/OMPConstants.h" +#include "llvm/Support/CommandLine.h" using namespace Fortran::lower::omp; using namespace Fortran::common::openmp; From flang-commits at lists.llvm.org Fri May 30 14:23:06 2025 From: flang-commits at lists.llvm.org (Krzysztof Parzyszek via flang-commits) Date: Fri, 30 May 2025 14:23:06 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <683a21ba.050a0220.1cafa9.9762@mx.google.com> https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/137852 >From 4aa88f8d04afcba35a1486e2661e5a29170694bf Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sat, 26 Apr 2025 07:58:29 -0500 Subject: [PATCH 01/30] [flang][OpenMP] Mark atomic clauses as unique The 
current implementation of the ATOMIC construct handles these clauses individually, and this change does not have an observable effect. At the same time these clauses are unique as per the OpenMP spec, and this patch reflects that in the OMP.td file.
---
 llvm/include/llvm/Frontend/OpenMP/OMP.td | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index eff6d57995d2b..cdfd3e3223fa8 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -602,22 +602,20 @@ def OMP_Assume : Directive<"assume"> {
   ];
 }
 def OMP_Atomic : Directive<"atomic"> {
-  let allowedClauses = [
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-    VersionedClause,
-  ];
   let allowedOnceClauses = [
     VersionedClause,
     VersionedClause,
+    VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
     VersionedClause,
     VersionedClause,
+    VersionedClause,
     VersionedClause,
+    VersionedClause,
   ];
   let association = AS_Block;
   let category = CA_Executable;
@@ -668,7 +666,7 @@ def OMP_CancellationPoint : Directive<"cancellation point"> {
   let category = CA_Executable;
 }
 def OMP_Critical : Directive<"critical"> {
-  let allowedClauses = [
+  let allowedOnceClauses = [
     VersionedClause,
   ];
   let association = AS_Block;

>From 69869a7673c62a5b47e20c532b6e438e929d212c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Sat, 26 Apr 2025 10:08:58 -0500
Subject: [PATCH 02/30] [flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs

The OpenMP implementation of the ATOMIC construct will change in the near future to accommodate OpenMP 6.0. This patch separates the shared implementations to avoid interfering with OpenACC.
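
The kind of separation described here can be pictured with a small sketch. This is not the code from DirectivesCommon.h. `OmpClauseList`, `AccClauseList` and the three functions are hypothetical stand-ins, and the returned strings merely imitate operation names. The sketch assumes the shared helpers were templates that selected dialect-specific behavior with `if constexpr` on the clause-list type, which is the shape of the code removed in the diff below.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical clause-list types standing in for the OpenMP and OpenACC
// parse-tree types.
struct OmpClauseList {
  std::string memoryOrder; // e.g. "seq_cst"
};
struct AccClauseList {};

// Shared form: one template serves both dialects and branches on the
// clause-list type. Any OpenMP-only change has to be made here, next to
// the OpenACC path.
template <typename ListT>
std::string genSharedAtomicWrite(const ListT &clauses) {
  if constexpr (std::is_same_v<ListT, OmpClauseList>) {
    return "omp.atomic.write memory_order(" + clauses.memoryOrder + ")";
  } else {
    return "acc.atomic.write";
  }
}

// Separated form: each dialect owns its function, so the OpenMP side can
// be reworked without touching the OpenACC side.
std::string genOmpAtomicWrite(const OmpClauseList &clauses) {
  assert(!clauses.memoryOrder.empty() && "toy model expects a memory order");
  return "omp.atomic.write memory_order(" + clauses.memoryOrder + ")";
}

std::string genAccAtomicWrite(const AccClauseList &) {
  return "acc.atomic.write";
}
```

Both forms produce the same result for the same input. The difference is only in where future OpenMP-specific changes would have to be made.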
--- flang/include/flang/Lower/DirectivesCommon.h | 514 ------------------- flang/lib/Lower/OpenACC.cpp | 320 +++++++++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 473 ++++++++++++++++- 3 files changed, 767 insertions(+), 540 deletions(-) diff --git a/flang/include/flang/Lower/DirectivesCommon.h b/flang/include/flang/Lower/DirectivesCommon.h index d1dbaefcd81d0..93ab2e350d035 100644 --- a/flang/include/flang/Lower/DirectivesCommon.h +++ b/flang/include/flang/Lower/DirectivesCommon.h @@ -46,520 +46,6 @@ namespace Fortran { namespace lower { -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static inline void genOmpAtomicHintAndMemoryOrderClauses( - Fortran::lower::AbstractConverter &converter, - const Fortran::parser::OmpAtomicClauseList &clauseList, - mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const Fortran::parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = Fortran::semantics::GetExpr(s.v); - uint64_t 
hintExprValue = *Fortran::evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); - } -} - -template -static void processOmpAtomicTODO(mlir::Type elementType, - [[maybe_unused]] mlir::Location loc) { - if (!elementType) - return; - if constexpr (std::is_same()) { - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); - } -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -template -static inline void genOmpAccAtomicCaptureStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - processOmpAtomicTODO(elementType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType), hint, - memoryOrder); - } else { - firOpBuilder.create( - loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType)); - } -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicWriteStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. - auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); - } else { - firOpBuilder.create(loc, lhsAddr, rhsExpr); - } -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. 
-template -static inline void genOmpAccAtomicUpdateStatement( - Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, - const Fortran::parser::Expr &assignmentStmtExpr, - [[maybe_unused]] const AtomicListT *leftHandClauseList, - [[maybe_unused]] const AtomicListT *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); - - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; - - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - Fortran::lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - Fortran::common::visit( - Fortran::common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of< - Fortran::parser::Expr::IntrinsicBinary, - T>::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back( - Fortran::semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); - } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); - } - - mlir::Operation *atomicUpdateOp = nullptr; - if constexpr (std::is_same()) { - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, - hint, memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, - hint, memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - } else { - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr); - } - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace( - Fortran::semantics::GetExpr(assignmentStmtVariable), val); - { - // statement context inside the atomic block. 
- converter.overrideExprValues(&exprValueOverrides); - Fortran::lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - if constexpr (std::is_same()) { - firOpBuilder.create(currentLocation, convertResult); - } else { - firOpBuilder.create(currentLocation, convertResult); - } - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -template -void genOmpAccAtomicWrite(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicWrite, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - } - - const Fortran::parser::AssignmentStmt &stmt = - std::get>( - atomicWrite.t) - .statement; - const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; - Fortran::lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genOmpAccAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -template -void genOmpAccAtomicRead(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicRead, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicRead.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - const Fortran::semantics::SomeExpr &fromExpr = - *Fortran::semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genOmpAccAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. -template -void genOmpAccAtomicUpdate(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicUpdate, mlir::Location loc) { - const AtomicListT *rightHandClauseList = nullptr; - const AtomicListT *leftHandClauseList = nullptr; - if constexpr (std::is_same()) { - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - } - - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicUpdate.t) - .statement.t); - - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - leftHandClauseList, rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause. -template -void genOmpAtomic(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicConstruct, mlir::Location loc) { - const AtomicListT &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>( - atomicConstruct.t) - .statement.t); - Fortran::lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genOmpAccAtomicUpdateStatement( - converter, lhsAddr, varType, assignmentStmtVariable, assignmentStmtExpr, - &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. 
-template -void genOmpAccAtomicCapture(Fortran::lower::AbstractConverter &converter, - const AtomicT &atomicCapture, mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - - const Fortran::parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const Fortran::parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // `atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - Fortran::lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - if constexpr (std::is_same()) { - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const AtomicListT &rightHandClauseList = std::get<2>(atomicCapture.t); - const AtomicListT &leftHandClauseList = std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, 
rightHandClauseList, hint,
-                                          memoryOrder);
-    atomicCaptureOp =
-        firOpBuilder.create(loc, hint, memoryOrder);
-  } else {
-    atomicCaptureOp = firOpBuilder.create(loc);
-  }
-
-  firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0)));
-  mlir::Block &block = atomicCaptureOp->getRegion(0).back();
-  firOpBuilder.setInsertionPointToStart(&block);
-  if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) {
-    if (Fortran::semantics::checkForSymbolMatch(stmt2)) {
-      // Atomic capture construct is of the form [capture-stmt, update-stmt]
-      const Fortran::semantics::SomeExpr &fromExpr =
-          *Fortran::semantics::GetExpr(stmt1Expr);
-      mlir::Type elementType = converter.genType(fromExpr);
-      genOmpAccAtomicCaptureStatement(
-          converter, stmt2LHSArg, stmt1LHSArg,
-          /*leftHandClauseList=*/nullptr,
-          /*rightHandClauseList=*/nullptr, elementType, loc);
-      genOmpAccAtomicUpdateStatement(
-          converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr,
-          /*leftHandClauseList=*/nullptr,
-          /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp);
-    } else {
-      // Atomic capture construct is of the form [capture-stmt, write-stmt]
-      firOpBuilder.setInsertionPoint(atomicCaptureOp);
-      mlir::Value stmt2RHSArg =
-          fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx));
-      firOpBuilder.setInsertionPointToStart(&block);
-      const Fortran::semantics::SomeExpr &fromExpr =
-          *Fortran::semantics::GetExpr(stmt1Expr);
-      mlir::Type elementType = converter.genType(fromExpr);
-      genOmpAccAtomicCaptureStatement(
-          converter, stmt2LHSArg, stmt1LHSArg,
-          /*leftHandClauseList=*/nullptr,
-          /*rightHandClauseList=*/nullptr, elementType, loc);
-      genOmpAccAtomicWriteStatement(
-          converter, stmt2LHSArg, stmt2RHSArg,
-          /*leftHandClauseList=*/nullptr,
-          /*rightHandClauseList=*/nullptr, loc);
-    }
-  } else {
-    // Atomic capture construct is of the form [update-stmt, capture-stmt]
-    const Fortran::semantics::SomeExpr &fromExpr =
-        *Fortran::semantics::GetExpr(stmt2Expr);
-    mlir::Type elementType = converter.genType(fromExpr);
-    genOmpAccAtomicUpdateStatement(
-        converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr,
-        /*leftHandClauseList=*/nullptr,
-        /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp);
-    genOmpAccAtomicCaptureStatement(
-        converter, stmt1LHSArg, stmt2LHSArg,
-        /*leftHandClauseList=*/nullptr,
-        /*rightHandClauseList=*/nullptr, elementType, loc);
-  }
-  firOpBuilder.setInsertionPointToEnd(&block);
-  if constexpr (std::is_same()) {
-    firOpBuilder.create(loc);
-  } else {
-    firOpBuilder.create(loc);
-  }
-  firOpBuilder.setInsertionPointToStart(&block);
-}
-
 /// Create empty blocks for the current region.
 /// These blocks replace blocks parented to an enclosing region.
 template
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 418bf4ee3d15f..e6175ebda40b2 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -375,6 +375,310 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) {
   llvm::report_fatal_error("Could not find symbol");
 }

+/// Used to generate atomic.read operation which is created in existing
+/// location set by builder.
+static inline void
+genAtomicCaptureStatement(Fortran::lower::AbstractConverter &converter,
+                          mlir::Value fromAddress, mlir::Value toAddress,
+                          mlir::Type elementType, mlir::Location loc) {
+  // Generate `atomic.read` operation for atomic assigment statements
+  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+
+  firOpBuilder.create(
+      loc, fromAddress, toAddress, mlir::TypeAttr::get(elementType));
+}
+
+/// Used to generate atomic.write operation which is created in existing
+/// location set by builder.
+static inline void +genAtomicWriteStatement(Fortran::lower::AbstractConverter &converter, + mlir::Value lhsAddr, mlir::Value rhsExpr, + mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. + auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + firOpBuilder.create(loc, lhsAddr, rhsExpr); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static inline void genAtomicUpdateStatement( + Fortran::lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const Fortran::parser::Variable &assignmentStmtVariable, + const Fortran::parser::Expr &assignmentStmtExpr, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto &arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>( + &arg.u)}; + return parserExpr; + }; + + // Lower any non atomic 
sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + Fortran::lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + Fortran::common::visit( + Fortran::common::visitors{ + [&](const Fortran::common::Indirection< + Fortran::parser::FunctionReference> &funcRef) -> void { + const auto &args{ + std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = + args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const Fortran::common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(Fortran::semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of< + Fortran::parser::Expr::IntrinsicBinary, + T>::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back( + Fortran::semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + Fortran::lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + atomicUpdateOp = + firOpBuilder.create(currentLocation, lhsAddr); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace( + Fortran::semantics::GetExpr(assignmentStmtVariable), val); + { + // statement context inside the atomic block. + converter.overrideExprValues(&exprValueOverrides); + Fortran::lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *Fortran::semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +void genAtomicWrite(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicWrite &atomicWrite, + mlir::Location loc) { + const Fortran::parser::AssignmentStmt &stmt = + std::get>( + atomicWrite.t) + .statement; + const Fortran::evaluate::Assignment &assign = *stmt.typedAssignment->v; + Fortran::lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. 
+ mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, loc); +} + +/// Processes an atomic construct with read clause. +void genAtomicRead(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicRead &atomicRead, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicRead.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + const Fortran::semantics::SomeExpr &fromExpr = + *Fortran::semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, elementType, + loc); +} + +/// Processes an atomic construct with update clause. +void genAtomicUpdate(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const auto &assignmentStmtExpr = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>( + atomicUpdate.t) + .statement.t); + + Fortran::lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *Fortran::semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, loc); +} + +/// Processes an atomic construct with capture clause. 
+void genAtomicCapture(Fortran::lower::AbstractConverter &converter, + const Fortran::parser::AccAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const Fortran::parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const Fortran::parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t) + .v.statement; + const Fortran::evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + Fortran::lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + atomicCaptureOp = firOpBuilder.create(loc); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (Fortran::semantics::checkForSingleVariableOnRHS(stmt1)) { + if (Fortran::semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, 
update-stmt]
+      const Fortran::semantics::SomeExpr &fromExpr =
+          *Fortran::semantics::GetExpr(stmt1Expr);
+      mlir::Type elementType = converter.genType(fromExpr);
+      genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg,
+                                elementType, loc);
+      genAtomicUpdateStatement(converter, stmt2LHSArg, stmt2VarType, stmt2Var,
+                               stmt2Expr, loc, atomicCaptureOp);
+    } else {
+      // Atomic capture construct is of the form [capture-stmt, write-stmt]
+      firOpBuilder.setInsertionPoint(atomicCaptureOp);
+      mlir::Value stmt2RHSArg =
+          fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx));
+      firOpBuilder.setInsertionPointToStart(&block);
+      const Fortran::semantics::SomeExpr &fromExpr =
+          *Fortran::semantics::GetExpr(stmt1Expr);
+      mlir::Type elementType = converter.genType(fromExpr);
+      genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg,
+                                elementType, loc);
+      genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, loc);
+    }
+  } else {
+    // Atomic capture construct is of the form [update-stmt, capture-stmt]
+    const Fortran::semantics::SomeExpr &fromExpr =
+        *Fortran::semantics::GetExpr(stmt2Expr);
+    mlir::Type elementType = converter.genType(fromExpr);
+    genAtomicUpdateStatement(converter, stmt1LHSArg, stmt1VarType, stmt1Var,
+                             stmt1Expr, loc, atomicCaptureOp);
+    genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, elementType,
+                              loc);
+  }
+  firOpBuilder.setInsertionPointToEnd(&block);
+  firOpBuilder.create(loc);
+  firOpBuilder.setInsertionPointToStart(&block);
+}
+
 template
 static void
 genDataOperandOperations(const Fortran::parser::AccObjectList &objectList,
@@ -4352,24 +4656,16 @@ genACC(Fortran::lower::AbstractConverter &converter,
   Fortran::common::visit(
       Fortran::common::visitors{
           [&](const Fortran::parser::AccAtomicRead &atomicRead) {
-            Fortran::lower::genOmpAccAtomicRead(converter, atomicRead,
-                                                loc);
+            genAtomicRead(converter, atomicRead, loc);
           },
           [&](const Fortran::parser::AccAtomicWrite &atomicWrite) {
-            Fortran::lower::genOmpAccAtomicWrite<
-                Fortran::parser::AccAtomicWrite, void>(converter, atomicWrite,
-                                                       loc);
+            genAtomicWrite(converter, atomicWrite, loc);
           },
           [&](const Fortran::parser::AccAtomicUpdate &atomicUpdate) {
-            Fortran::lower::genOmpAccAtomicUpdate<
-                Fortran::parser::AccAtomicUpdate, void>(converter, atomicUpdate,
-                                                        loc);
+            genAtomicUpdate(converter, atomicUpdate, loc);
           },
           [&](const Fortran::parser::AccAtomicCapture &atomicCapture) {
-            Fortran::lower::genOmpAccAtomicCapture<
-                Fortran::parser::AccAtomicCapture, void>(converter,
-                                                         atomicCapture, loc);
+            genAtomicCapture(converter, atomicCapture, loc);
           },
       },
       atomicConstruct.u);
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 312557d5da07e..fdd85e94829f3 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2585,6 +2585,460 @@ genTeamsOp(lower::AbstractConverter &converter, lower::SymMap &symTable,
       queue, item, clauseOps);
 }

+//===----------------------------------------------------------------------===//
+// Code generation for atomic operations
+//===----------------------------------------------------------------------===//
+
+/// Populates \p hint and \p memoryOrder with appropriate clause information
+/// if present on atomic construct.
+static void genOmpAtomicHintAndMemoryOrderClauses( + lower::AbstractConverter &converter, + const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, + mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const parser::OmpAtomicClause &clause : clauseList.v) { + common::visit( + common::visitors{ + [&](const parser::OmpMemoryOrderClause &s) { + auto kind = common::visit( + common::visitors{ + [&](const parser::OmpClause::AcqRel &) { + return mlir::omp::ClauseMemoryOrderKind::Acq_rel; + }, + [&](const parser::OmpClause::Acquire &) { + return mlir::omp::ClauseMemoryOrderKind::Acquire; + }, + [&](const parser::OmpClause::Relaxed &) { + return mlir::omp::ClauseMemoryOrderKind::Relaxed; + }, + [&](const parser::OmpClause::Release &) { + return mlir::omp::ClauseMemoryOrderKind::Release; + }, + [&](const parser::OmpClause::SeqCst &) { + return mlir::omp::ClauseMemoryOrderKind::Seq_cst; + }, + [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { + llvm_unreachable("Unexpected clause"); + }, + }, + s.v.u); + memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( + firOpBuilder.getContext(), kind); + }, + [&](const parser::OmpHintClause &s) { + const auto *expr = semantics::GetExpr(s.v); + uint64_t hintExprValue = *evaluate::ToInt64(*expr); + hint = firOpBuilder.getI64IntegerAttr(hintExprValue); + }, + [&](const parser::OmpFailClause &) {}, + }, + clause.u); + } +} + +static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { + if (!elementType) + return; + assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && + "is supported type for omp atomic"); +} + +/// Used to generate atomic.read operation which is created in existing +/// location set by builder. 
+static void genAtomicCaptureStatement( + lower::AbstractConverter &converter, mlir::Value fromAddress, + mlir::Value toAddress, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, + mlir::Type elementType, mlir::Location loc) { + // Generate `atomic.read` operation for atomic assigment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + processOmpAtomicTODO(elementType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, fromAddress, toAddress, + mlir::TypeAttr::get(elementType), + hint, memoryOrder); +} + +/// Used to generate atomic.write operation which is created in existing +/// location set by builder. +static void genAtomicWriteStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Value *evaluatedExprValue = nullptr) { + // Generate `atomic.write` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // Create a conversion outside the capture block. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); + rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); + firOpBuilder.restoreInsertionPoint(insertionPoint); + + processOmpAtomicTODO(varType, loc); + + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, + memoryOrder); +} + +/// Used to generate atomic.update operation which is created in existing +/// location set by builder. +static void genAtomicUpdateStatement( + lower::AbstractConverter &converter, mlir::Value lhsAddr, + mlir::Type varType, const parser::Variable &assignmentStmtVariable, + const parser::Expr &assignmentStmtExpr, + const parser::OmpAtomicClauseList *leftHandClauseList, + const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, + mlir::Operation *atomicCaptureOp = nullptr) { + // Generate `atomic.update` operation for atomic assignment statements + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + + // Create the omp.atomic.update or acc.atomic.update operation + // + // func.func @_QPsb() { + // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} + // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} + // %2 = fir.load %1 : !fir.ref + // omp.atomic.update %0 : !fir.ref { + // ^bb0(%arg0: i32): + // %3 = arith.addi %arg0, %2 : i32 + // omp.yield(%3 : i32) + // } + // return + // } + + auto getArgExpression = + [](std::list::const_iterator it) { + const auto 
&arg{std::get((*it).t)}; + const auto *parserExpr{ + std::get_if>(&arg.u)}; + return parserExpr; + }; + + // Lower any non atomic sub-expression before the atomic operation, and + // map its lowered value to the semantic representation. + lower::ExprToValueMap exprValueOverrides; + // Max and min intrinsics can have a list of Args. Hence we need a list + // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. + llvm::SmallVector nonAtomicSubExprs; + common::visit( + common::visitors{ + [&](const common::Indirection &funcRef) + -> void { + const auto &args{std::get>( + funcRef.value().v.t)}; + std::list::const_iterator beginIt = + args.begin(); + std::list::const_iterator endIt = args.end(); + const auto *exprFirst{getArgExpression(beginIt)}; + if (exprFirst && exprFirst->value().source == + assignmentStmtVariable.GetSource()) { + // Add everything except the first + beginIt++; + } else { + // Add everything except the last + endIt--; + } + std::list::const_iterator it; + for (it = beginIt; it != endIt; it++) { + const common::Indirection *expr = + getArgExpression(it); + if (expr) + nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); + } + }, + [&](const auto &op) -> void { + using T = std::decay_t; + if constexpr (std::is_base_of::value) { + const auto &exprLeft{std::get<0>(op.t)}; + const auto &exprRight{std::get<1>(op.t)}; + if (exprLeft.value().source == assignmentStmtVariable.GetSource()) + nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); + else + nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); + } + }, + }, + assignmentStmtExpr.u); + lower::StatementContext nonAtomicStmtCtx; + if (!nonAtomicSubExprs.empty()) { + // Generate non atomic part before all the atomic operations. 
+ auto insertionPoint = firOpBuilder.saveInsertionPoint(); + if (atomicCaptureOp) + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value nonAtomicVal; + for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { + nonAtomicVal = fir::getBase(converter.genExprValue( + currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); + exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); + } + if (atomicCaptureOp) + firOpBuilder.restoreInsertionPoint(insertionPoint); + } + + mlir::Operation *atomicUpdateOp = nullptr; + // If no hint clause is specified, the effect is as if + // hint(omp_sync_hint_none) had been specified. + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + if (leftHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, + memoryOrder); + if (rightHandClauseList) + genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, + memoryOrder); + atomicUpdateOp = firOpBuilder.create( + currentLocation, lhsAddr, hint, memoryOrder); + + processOmpAtomicTODO(varType, loc); + + llvm::SmallVector varTys = {varType}; + llvm::SmallVector locs = {currentLocation}; + firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); + mlir::Value val = + fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); + + exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), + val); + { + // statement context inside the atomic block. 
+ converter.overrideExprValues(&exprValueOverrides); + lower::StatementContext atomicStmtCtx; + mlir::Value rhsExpr = fir::getBase(converter.genExprValue( + *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); + mlir::Value convertResult = + firOpBuilder.createConvert(currentLocation, varType, rhsExpr); + firOpBuilder.create(currentLocation, convertResult); + converter.resetExprOverrides(); + } + firOpBuilder.setInsertionPointAfter(atomicUpdateOp); +} + +/// Processes an atomic construct with write clause. +static void genAtomicWrite(lower::AbstractConverter &converter, + const parser::OmpAtomicWrite &atomicWrite, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicWrite.t); + leftHandClauseList = &std::get<0>(atomicWrite.t); + + const parser::AssignmentStmt &stmt = + std::get>(atomicWrite.t) + .statement; + const evaluate::Assignment &assign = *stmt.typedAssignment->v; + lower::StatementContext stmtCtx; + // Get the value and address of atomic write operands. + mlir::Value rhsExpr = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); + mlir::Value lhsAddr = + fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); + genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with read clause. +static void genAtomicRead(lower::AbstractConverter &converter, + const parser::OmpAtomicRead &atomicRead, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. 
+ rightHandClauseList = &std::get<2>(atomicRead.t); + leftHandClauseList = &std::get<0>(atomicRead.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicRead.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicRead.t) + .statement.t); + + lower::StatementContext stmtCtx; + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); + mlir::Type elementType = converter.genType(fromExpr); + mlir::Value fromAddress = + fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); + mlir::Value toAddress = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + genAtomicCaptureStatement(converter, fromAddress, toAddress, + leftHandClauseList, rightHandClauseList, + elementType, loc); +} + +/// Processes an atomic construct with update clause. +static void genAtomicUpdate(lower::AbstractConverter &converter, + const parser::OmpAtomicUpdate &atomicUpdate, + mlir::Location loc) { + const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; + const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; + // Get the address of atomic read operands. + rightHandClauseList = &std::get<2>(atomicUpdate.t); + leftHandClauseList = &std::get<0>(atomicUpdate.t); + + const auto &assignmentStmtExpr = std::get( + std::get>(atomicUpdate.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicUpdate.t) + .statement.t); + + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, leftHandClauseList, + rightHandClauseList, loc); +} + +/// Processes an atomic construct with no clause - which implies update clause. 
+static void genOmpAtomic(lower::AbstractConverter &converter, + const parser::OmpAtomic &atomicConstruct, + mlir::Location loc) { + const parser::OmpAtomicClauseList &atomicClauseList = + std::get(atomicConstruct.t); + const auto &assignmentStmtExpr = std::get( + std::get>(atomicConstruct.t) + .statement.t); + const auto &assignmentStmtVariable = std::get( + std::get>(atomicConstruct.t) + .statement.t); + lower::StatementContext stmtCtx; + mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( + *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); + mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); + // If atomic-clause is not present on the construct, the behaviour is as if + // the update clause is specified (for both OpenMP and OpenACC). + genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, + assignmentStmtExpr, &atomicClauseList, nullptr, loc); +} + +/// Processes an atomic construct with capture clause. +static void genAtomicCapture(lower::AbstractConverter &converter, + const parser::OmpAtomicCapture &atomicCapture, + mlir::Location loc) { + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + + const parser::AssignmentStmt &stmt1 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; + const auto &stmt1Var{std::get(stmt1.t)}; + const auto &stmt1Expr{std::get(stmt1.t)}; + const parser::AssignmentStmt &stmt2 = + std::get(atomicCapture.t).v.statement; + const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; + const auto &stmt2Var{std::get(stmt2.t)}; + const auto &stmt2Expr{std::get(stmt2.t)}; + + // Pre-evaluate expressions to be used in the various operations inside + // `atomic.capture` since it is not desirable to have anything other than + // a `atomic.read`, `atomic.write`, or `atomic.update` operation + // inside `atomic.capture` + lower::StatementContext stmtCtx; + // LHS evaluations are common to all combinations of `atomic.capture` + 
mlir::Value stmt1LHSArg = + fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); + mlir::Value stmt2LHSArg = + fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); + + // Type information used in generation of `atomic.update` operation + mlir::Type stmt1VarType = + fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); + mlir::Type stmt2VarType = + fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); + + mlir::Operation *atomicCaptureOp = nullptr; + mlir::IntegerAttr hint = nullptr; + mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; + const parser::OmpAtomicClauseList &rightHandClauseList = + std::get<2>(atomicCapture.t); + const parser::OmpAtomicClauseList &leftHandClauseList = + std::get<0>(atomicCapture.t); + genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, + memoryOrder); + genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, + memoryOrder); + atomicCaptureOp = + firOpBuilder.create(loc, hint, memoryOrder); + + firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); + mlir::Block &block = atomicCaptureOp->getRegion(0).back(); + firOpBuilder.setInsertionPointToStart(&block); + if (semantics::checkForSingleVariableOnRHS(stmt1)) { + if (semantics::checkForSymbolMatch(stmt2)) { + // Atomic capture construct is of the form [capture-stmt, update-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicUpdateStatement( + converter, stmt2LHSArg, stmt2VarType, stmt2Var, stmt2Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + } else { + // Atomic capture construct is of the form [capture-stmt, write-stmt] + firOpBuilder.setInsertionPoint(atomicCaptureOp); + mlir::Value 
stmt2RHSArg = + fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); + firOpBuilder.setInsertionPointToStart(&block); + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc); + } + } else { + // Atomic capture construct is of the form [update-stmt, capture-stmt] + const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); + mlir::Type elementType = converter.genType(fromExpr); + genAtomicUpdateStatement( + converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); + genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, + /*leftHandClauseList=*/nullptr, + /*rightHandClauseList=*/nullptr, elementType, + loc); + } + firOpBuilder.setInsertionPointToEnd(&block); + firOpBuilder.create(loc); + firOpBuilder.setInsertionPointToStart(&block); +} + //===----------------------------------------------------------------------===// // Code generation functions for the standalone version of constructs that can // also be a leaf of a composite construct @@ -3476,32 +3930,23 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, common::visitors{ [&](const parser::OmpAtomicRead &atomicRead) { mlir::Location loc = converter.genLocation(atomicRead.source); - lower::genOmpAccAtomicRead( - converter, atomicRead, loc); + genAtomicRead(converter, atomicRead, loc); }, [&](const parser::OmpAtomicWrite &atomicWrite) { mlir::Location loc = converter.genLocation(atomicWrite.source); - lower::genOmpAccAtomicWrite( - converter, atomicWrite, loc); + genAtomicWrite(converter, atomicWrite, loc); 
}, [&](const parser::OmpAtomic &atomicConstruct) { mlir::Location loc = converter.genLocation(atomicConstruct.source); - lower::genOmpAtomic( - converter, atomicConstruct, loc); + genOmpAtomic(converter, atomicConstruct, loc); }, [&](const parser::OmpAtomicUpdate &atomicUpdate) { mlir::Location loc = converter.genLocation(atomicUpdate.source); - lower::genOmpAccAtomicUpdate( - converter, atomicUpdate, loc); + genAtomicUpdate(converter, atomicUpdate, loc); }, [&](const parser::OmpAtomicCapture &atomicCapture) { mlir::Location loc = converter.genLocation(atomicCapture.source); - lower::genOmpAccAtomicCapture( - converter, atomicCapture, loc); + genAtomicCapture(converter, atomicCapture, loc); }, [&](const parser::OmpAtomicCompare &atomicCompare) { mlir::Location loc = converter.genLocation(atomicCompare.source); >From 637d237b9d904cfea6dc40eefb303f59641f545e Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Sun, 27 Apr 2025 09:51:21 -0500 Subject: [PATCH 03/30] [flang][OpenMP] Allow UPDATE clause to not have any arguments The UPDATE clause can be specified on both ATOMIC and DEPOBJ directives. Currently, the ATOMIC directive has its own handling of it, and the definition of the UPDATE clause only supports its use in the DEPOBJ directive, where it takes a dependence-type as an argument. The UPDATE clause on the ATOMIC directive may not have any arguments. Since the implementation of the ATOMIC construct will be modified to use the standard handling of clauses, the definition of UPDATE should reflect that. 
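For readers following along, here is a minimal standalone sketch of the pattern this change moves to: the clause argument becomes optional, and the lowering side visits the variant only when an argument is present. All names below (`ParsedUpdate`, `UpdateArg`, `DepKind`, `makeDepKind`, `makeUpdate`) are hypothetical stand-ins chosen for illustration, not the actual flang types; this only assumes the general shape described in the message above (UPDATE with a dependence-type on DEPOBJ, UPDATE with no argument on ATOMIC).

```cpp
#include <optional>
#include <variant>

// Hypothetical stand-ins for the two argument kinds that UPDATE can carry
// on DEPOBJ. These are not the real flang parser classes.
enum class DependenceType { Sink, Source };
enum class TaskDependenceType { In, Out, Inout };

// Hypothetical unified kind stored in the lowered clause.
enum class DepKind { Sink, Source, In, Out, Inout };

using UpdateArg = std::variant<DependenceType, TaskDependenceType>;

// Parsed form: the argument is optional (absent for ATOMIC UPDATE).
struct ParsedUpdate {
  std::optional<UpdateArg> v;
};

// Lowered form: the dependence kind is optional as well.
struct Update {
  std::optional<DepKind> depType;
};

DepKind makeDepKind(DependenceType t) {
  return t == DependenceType::Sink ? DepKind::Sink : DepKind::Source;
}

DepKind makeDepKind(TaskDependenceType t) {
  switch (t) {
  case TaskDependenceType::In:
    return DepKind::In;
  case TaskDependenceType::Out:
    return DepKind::Out;
  case TaskDependenceType::Inout:
    return DepKind::Inout;
  }
  return DepKind::Inout; // Not reached for valid enumerators.
}

// Visit the variant only if an argument was given; otherwise produce a
// clause with no dependence kind.
Update makeUpdate(const ParsedUpdate &inp) {
  if (inp.v) {
    return std::visit([](auto t) { return Update{makeDepKind(t)}; }, *inp.v);
  }
  return Update{std::nullopt};
}
```

With this shape, a consumer that sees an empty `depType` can treat the clause as the argument-less form, while a populated value corresponds to the DEPOBJ form.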
--- flang/include/flang/Parser/parse-tree.h | 5 +++ flang/lib/Lower/OpenMP/Clauses.cpp | 10 +++-- flang/lib/Parser/openmp-parsers.cpp | 50 ++++++++++++--------- flang/lib/Semantics/check-omp-structure.cpp | 16 +++++-- llvm/include/llvm/Frontend/OpenMP/OMP.td | 1 + 5 files changed, 53 insertions(+), 29 deletions(-) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index ca8473c6f9674..e39ecc13f4eec 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4501,6 +4501,11 @@ struct OmpToClause { // Ref: [5.0:254-255], [5.1:287-288], [5.2:321-322] // +// In ATOMIC construct +// update-clause -> +// UPDATE // Since 4.5 +// +// In DEPOBJ construct // update-clause -> // UPDATE(dependence-type) // since 5.0, until 5.1 // update-clause -> diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index f1330b8d1909f..c258bef2e4427 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -1400,9 +1400,13 @@ Uniform make(const parser::OmpClause::Uniform &inp, Update make(const parser::OmpClause::Update &inp, semantics::SemanticsContext &semaCtx) { // inp.v -> parser::OmpUpdateClause - auto depType = - common::visit([](auto &&s) { return makeDepType(s); }, inp.v.u); - return Update{/*DependenceType=*/depType}; + if (inp.v) { + return common::visit( + [](auto &&s) { return Update{/*DependenceType=*/makeDepType(s)}; }, + inp.v->u); + } else { + return Update{/*DependenceType=*/std::nullopt}; + } } Use make(const parser::OmpClause::Use &inp, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index e631922a354c4..bfca4e3f1730a 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -836,9 +836,9 @@ TYPE_PARSER(construct( TYPE_PARSER(construct(Parser{}, maybe(":" >> nonemptyList(Parser{})))) -TYPE_PARSER(construct( - construct(Parser{}) || - 
construct(Parser{}))) +TYPE_PARSER( // + construct(parenthesized(Parser{})) || + construct(parenthesized(Parser{}))) TYPE_PARSER(construct( maybe(nonemptyList(Parser{}) / ":"), @@ -1079,7 +1079,7 @@ TYPE_PARSER( // parenthesized(nonemptyList(name)))) || "UNTIED" >> construct(construct()) || "UPDATE" >> construct(construct( - parenthesized(Parser{}))) || + maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || // Cancellable constructs @@ -1313,24 +1313,30 @@ TYPE_PARSER( endOfLine) // Directives enclosing structured-block -TYPE_PARSER(construct(first( - "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), - "MASTER" >> pure(llvm::omp::Directive::OMPD_master), - "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), - "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), - "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), - "PARALLEL WORKSHARE" >> pure(llvm::omp::Directive::OMPD_parallel_workshare), - "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), - "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), - "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), - "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), - "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), - "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), - "TARGET" >> pure(llvm::omp::Directive::OMPD_target), - "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), - "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), - "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), - "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) +TYPE_PARSER( + // In this context "TARGET UPDATE" can be parsed as a TARGET directive + // followed by an UPDATE clause. This is the only combination at the + // moment, exclude it explicitly. 
+ (!"TARGET UPDATE"_sptok) >= + construct(first( + "MASKED" >> pure(llvm::omp::Directive::OMPD_masked), + "MASTER" >> pure(llvm::omp::Directive::OMPD_master), + "ORDERED" >> pure(llvm::omp::Directive::OMPD_ordered), + "PARALLEL MASKED" >> pure(llvm::omp::Directive::OMPD_parallel_masked), + "PARALLEL MASTER" >> pure(llvm::omp::Directive::OMPD_parallel_master), + "PARALLEL WORKSHARE" >> + pure(llvm::omp::Directive::OMPD_parallel_workshare), + "PARALLEL" >> pure(llvm::omp::Directive::OMPD_parallel), + "SCOPE" >> pure(llvm::omp::Directive::OMPD_scope), + "SINGLE" >> pure(llvm::omp::Directive::OMPD_single), + "TARGET DATA" >> pure(llvm::omp::Directive::OMPD_target_data), + "TARGET PARALLEL" >> pure(llvm::omp::Directive::OMPD_target_parallel), + "TARGET TEAMS" >> pure(llvm::omp::Directive::OMPD_target_teams), + "TARGET" >> pure(llvm::omp::Directive::OMPD_target), + "TASK"_id >> pure(llvm::omp::Directive::OMPD_task), + "TASKGROUP" >> pure(llvm::omp::Directive::OMPD_taskgroup), + "TEAMS" >> pure(llvm::omp::Directive::OMPD_teams), + "WORKSHARE" >> pure(llvm::omp::Directive::OMPD_workshare)))) TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 987066313fee5..c582de6df0319 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -4572,10 +4572,18 @@ void OmpStructureChecker::Enter(const parser::OmpClause::Update &x) { llvm::omp::Directive dir{GetContext().directive}; unsigned version{context_.langOptions().OpenMPVersion}; - auto *depType{std::get_if(&x.v.u)}; - auto *taskType{std::get_if(&x.v.u)}; - assert(((depType == nullptr) != (taskType == nullptr)) && - "Unexpected alternative in update clause"); + const parser::OmpDependenceType *depType{nullptr}; + const parser::OmpTaskDependenceType *taskType{nullptr}; + if (auto &maybeUpdate{x.v}) { + depType = std::get_if(&maybeUpdate->u); + taskType = 
std::get_if(&maybeUpdate->u); + } + + if (!depType && !taskType) { + assert(dir == llvm::omp::Directive::OMPD_atomic && + "Unexpected alternative in update clause"); + return; + } if (depType) { CheckDependenceType(depType->v); diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td index cdfd3e3223fa8..f4e400b651c31 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMP.td +++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td @@ -525,6 +525,7 @@ def OMPC_Untied : Clause<"untied"> { def OMPC_Update : Clause<"update"> { let clangClass = "OMPUpdateClause"; let flangClass = "OmpUpdateClause"; + let isValueOptional = true; } def OMPC_Use : Clause<"use"> { let clangClass = "OMPUseClause"; >From 02b40809aff5b2ad315ef078f0ea8edb0f30a40c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 17 Mar 2025 15:53:27 -0500 Subject: [PATCH 04/30] [flang][OpenMP] Overhaul implementation of ATOMIC construct The parser will accept a wide variety of illegal attempts at forming an ATOMIC construct, leaving it to the semantic analysis to diagnose any issues. This consolidates the analysis into one place and allows us to produce more informative diagnostics. The parser's outcome will be a parser::OpenMPAtomicConstruct object holding the directive, parser::Body, and an optional end-directive. The prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause, have been removed. READ, WRITE, etc. are now proper clauses. The semantic analysis consistently operates on "evaluation" representations, mainly evaluate::Expr (as SomeExpr) and evaluate::Assignment. The results of the semantic analysis are stored in a mutable member of the OpenMPAtomicConstruct node. This follows a precedent of having a `typedExpr` member in parser::Expr, for example. This allows the lowering code to avoid duplicated handling of AST nodes. Using a BLOCK construct containing multiple statements for an ATOMIC construct that requires multiple statements is now allowed.
In fact, any nesting of such BLOCK constructs is allowed. This implementation will parse and perform semantic checks for both conditional-update and conditional-update-capture, although no MLIR will be generated for those. Instead, a TODO error will be issued prior to lowering. The allowed forms of the ATOMIC construct were based on the OpenMP 6.0 spec. --- flang/include/flang/Parser/dump-parse-tree.h | 12 - flang/include/flang/Parser/parse-tree.h | 111 +- flang/include/flang/Semantics/tools.h | 16 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 40 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 760 +++---- flang/lib/Parser/openmp-parsers.cpp | 233 +- flang/lib/Parser/parse-tree.cpp | 28 + flang/lib/Parser/unparse.cpp | 102 +- flang/lib/Semantics/check-omp-structure.cpp | 1998 +++++++++++++---- flang/lib/Semantics/check-omp-structure.h | 56 +- flang/lib/Semantics/resolve-names.cpp | 7 +- flang/lib/Semantics/rewrite-directives.cpp | 126 +- .../Lower/OpenMP/Todo/atomic-compare-fail.f90 | 2 +- .../test/Lower/OpenMP/Todo/atomic-compare.f90 | 2 +- flang/test/Lower/OpenMP/atomic-capture.f90 | 4 +- flang/test/Lower/OpenMP/atomic-write.f90 | 2 +- flang/test/Parser/OpenMP/atomic-compare.f90 | 16 - .../test/Semantics/OpenMP/atomic-compare.f90 | 29 +- .../Semantics/OpenMP/atomic-hint-clause.f90 | 23 +- flang/test/Semantics/OpenMP/atomic-read.f90 | 89 + .../OpenMP/atomic-update-capture.f90 | 77 + .../Semantics/OpenMP/atomic-update-only.f90 | 83 + .../OpenMP/atomic-update-overloaded-ops.f90 | 4 +- flang/test/Semantics/OpenMP/atomic-write.f90 | 81 + flang/test/Semantics/OpenMP/atomic.f90 | 29 +- flang/test/Semantics/OpenMP/atomic01.f90 | 221 +- flang/test/Semantics/OpenMP/atomic02.f90 | 47 +- flang/test/Semantics/OpenMP/atomic03.f90 | 51 +- flang/test/Semantics/OpenMP/atomic04.f90 | 96 +- flang/test/Semantics/OpenMP/atomic05.f90 | 12 +- .../Semantics/OpenMP/critical-hint-clause.f90 | 20 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 58 +- .../Semantics/OpenMP/requires-atomic01.f90 |
86 +- .../Semantics/OpenMP/requires-atomic02.f90 | 86 +- 34 files changed, 2838 insertions(+), 1769 deletions(-) delete mode 100644 flang/test/Parser/OpenMP/atomic-compare.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-read.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-capture.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-update-only.f90 create mode 100644 flang/test/Semantics/OpenMP/atomic-write.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index c0cf90c4696b6..a8c01ee2e7da3 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -526,15 +526,6 @@ class ParseTreeDumper { NODE(parser, OmpAtClause) NODE_ENUM(OmpAtClause, ActionTime) NODE_ENUM(OmpSeverityClause, Severity) - NODE(parser, OmpAtomic) - NODE(parser, OmpAtomicCapture) - NODE(OmpAtomicCapture, Stmt1) - NODE(OmpAtomicCapture, Stmt2) - NODE(parser, OmpAtomicCompare) - NODE(parser, OmpAtomicCompareIfStmt) - NODE(parser, OmpAtomicRead) - NODE(parser, OmpAtomicUpdate) - NODE(parser, OmpAtomicWrite) NODE(parser, OmpBeginBlockDirective) NODE(parser, OmpBeginLoopDirective) NODE(parser, OmpBeginSectionsDirective) @@ -581,7 +572,6 @@ class ParseTreeDumper { NODE(parser, OmpDoacrossClause) NODE(parser, OmpDestroyClause) NODE(parser, OmpEndAllocators) - NODE(parser, OmpEndAtomic) NODE(parser, OmpEndBlockDirective) NODE(parser, OmpEndCriticalDirective) NODE(parser, OmpEndLoopDirective) @@ -709,8 +699,6 @@ class ParseTreeDumper { NODE(parser, OpenMPDeclareMapperConstruct) NODE_ENUM(common, OmpMemoryOrderType) NODE(parser, OmpMemoryOrderClause) - NODE(parser, OmpAtomicClause) - NODE(parser, OmpAtomicClauseList) NODE(parser, OmpAtomicDefaultMemOrderClause) NODE(parser, OpenMPDepobjConstruct) NODE(parser, OpenMPUtilityConstruct) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index e39ecc13f4eec..77f57b1cb85c7 100644 
--- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4835,94 +4835,37 @@ struct OmpMemoryOrderClause { CharBlock source; }; -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) | -// FAIL(memory-order) -struct OmpAtomicClause { - UNION_CLASS_BOILERPLATE(OmpAtomicClause); - CharBlock source; - std::variant u; -}; - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] -struct OmpAtomicClauseList { - WRAPPER_CLASS_BOILERPLATE(OmpAtomicClauseList, std::list); - CharBlock source; -}; - -// END ATOMIC -EMPTY_CLASS(OmpEndAtomic); - -// ATOMIC READ -struct OmpAtomicRead { - TUPLE_CLASS_BOILERPLATE(OmpAtomicRead); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC WRITE -struct OmpAtomicWrite { - TUPLE_CLASS_BOILERPLATE(OmpAtomicWrite); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC UPDATE -struct OmpAtomicUpdate { - TUPLE_CLASS_BOILERPLATE(OmpAtomicUpdate); - CharBlock source; - std::tuple, std::optional> - t; -}; - -// ATOMIC CAPTURE -struct OmpAtomicCapture { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCapture); - CharBlock source; - WRAPPER_CLASS(Stmt1, Statement); - WRAPPER_CLASS(Stmt2, Statement); - std::tuple - t; -}; - -struct OmpAtomicCompareIfStmt { - UNION_CLASS_BOILERPLATE(OmpAtomicCompareIfStmt); - std::variant, common::Indirection> u; -}; - -// ATOMIC COMPARE (OpenMP 5.1, OPenMP 5.2 spec: 15.8.4) -struct OmpAtomicCompare { - TUPLE_CLASS_BOILERPLATE(OmpAtomicCompare); +struct OpenMPAtomicConstruct { + llvm::omp::Clause GetKind() const; + bool IsCapture() const; + bool IsCompare() const; + TUPLE_CLASS_BOILERPLATE(OpenMPAtomicConstruct); CharBlock source; - std::tuple> + std::tuple> t; -}; -// ATOMIC -struct OmpAtomic { - TUPLE_CLASS_BOILERPLATE(OmpAtomic); - CharBlock source; - std::tuple, - std::optional> - t; -}; + // Information filled out during semantic checks to avoid duplication + // of analyses. 
+ struct Analysis { + static constexpr int None = 0; + static constexpr int Read = 1; + static constexpr int Write = 2; + static constexpr int Update = Read | Write; + static constexpr int Action = 3; // Bitmask for None, Read, Write, Update + static constexpr int IfTrue = 4; + static constexpr int IfFalse = 8; + static constexpr int Condition = 12; // Bitmask for IfTrue, IfFalse + + struct Op { + int what; + TypedExpr expr; + }; + TypedExpr atom, cond; + Op op0, op1; + }; -// 2.17.7 atomic -> -// ATOMIC [atomic-clause-list] atomic-construct [atomic-clause-list] | -// ATOMIC [atomic-clause-list] -// atomic-construct -> READ | WRITE | UPDATE | CAPTURE | COMPARE -struct OpenMPAtomicConstruct { - UNION_CLASS_BOILERPLATE(OpenMPAtomicConstruct); - std::variant - u; + mutable Analysis analysis; }; // OpenMP directives that associate with loop(s) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index f25babb3c1f6d..7f1ec59b087a2 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -782,5 +782,21 @@ inline bool checkForSymbolMatch( } return false; } + +/// If the top-level operation (ignoring parentheses) is either an +/// evaluate::FunctionRef, or a specialization of evaluate::Operation, +/// then return the list of arguments (wrapped in SomeExpr). Otherwise, +/// return the "expr" but with top-level parentheses stripped. +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); + +/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). +/// Check if "expr" is +/// SomeType(SomeKind(Type( +/// Convert +/// SomeKind(...)[2]))) +/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves +/// TypeCategory. 
+bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); + } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index b88454c45da85..0d941bf39afc3 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -332,26 +332,26 @@ getSource(const semantics::SemanticsContext &semaCtx, const parser::CharBlock *source = nullptr; auto ompConsVisit = [&](const parser::OpenMPConstruct &x) { - std::visit(common::visitors{ - [&](const parser::OpenMPSectionsConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPLoopConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPBlockConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPCriticalConstruct &x) { - source = &std::get<0>(x.t).source; - }, - [&](const parser::OpenMPAtomicConstruct &x) { - std::visit([&](const auto &x) { source = &x.source; }, - x.u); - }, - [&](const auto &x) { source = &x.source; }, - }, - x.u); + std::visit( + common::visitors{ + [&](const parser::OpenMPSectionsConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPLoopConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPBlockConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPCriticalConstruct &x) { + source = &std::get<0>(x.t).source; + }, + [&](const parser::OpenMPAtomicConstruct &x) { + source = &std::get(x.t).source; + }, + [&](const auto &x) { source = &x.source; }, + }, + x.u); }; eval.visit(common::visitors{ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index fdd85e94829f3..ab868df76d298 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2588,455 +2588,183 @@ genTeamsOp(lower::AbstractConverter &converter, 
lower::SymMap &symTable, //===----------------------------------------------------------------------===// // Code generation for atomic operations //===----------------------------------------------------------------------===// +static fir::FirOpBuilder::InsertPoint +getInsertionPointBefore(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + mlir::Block::iterator(op)); +} -/// Populates \p hint and \p memoryOrder with appropriate clause information -/// if present on atomic construct. -static void genOmpAtomicHintAndMemoryOrderClauses( - lower::AbstractConverter &converter, - const parser::OmpAtomicClauseList &clauseList, mlir::IntegerAttr &hint, - mlir::omp::ClauseMemoryOrderKindAttr &memoryOrder) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - for (const parser::OmpAtomicClause &clause : clauseList.v) { - common::visit( - common::visitors{ - [&](const parser::OmpMemoryOrderClause &s) { - auto kind = common::visit( - common::visitors{ - [&](const parser::OmpClause::AcqRel &) { - return mlir::omp::ClauseMemoryOrderKind::Acq_rel; - }, - [&](const parser::OmpClause::Acquire &) { - return mlir::omp::ClauseMemoryOrderKind::Acquire; - }, - [&](const parser::OmpClause::Relaxed &) { - return mlir::omp::ClauseMemoryOrderKind::Relaxed; - }, - [&](const parser::OmpClause::Release &) { - return mlir::omp::ClauseMemoryOrderKind::Release; - }, - [&](const parser::OmpClause::SeqCst &) { - return mlir::omp::ClauseMemoryOrderKind::Seq_cst; - }, - [&](auto &&) -> mlir::omp::ClauseMemoryOrderKind { - llvm_unreachable("Unexpected clause"); - }, - }, - s.v.u); - memoryOrder = mlir::omp::ClauseMemoryOrderKindAttr::get( - firOpBuilder.getContext(), kind); - }, - [&](const parser::OmpHintClause &s) { - const auto *expr = semantics::GetExpr(s.v); - uint64_t hintExprValue = *evaluate::ToInt64(*expr); - hint = firOpBuilder.getI64IntegerAttr(hintExprValue); - }, - [&](const parser::OmpFailClause &) {}, - }, - clause.u); +static 
fir::FirOpBuilder::InsertPoint +getInsertionPointAfter(mlir::Operation *op) { + return fir::FirOpBuilder::InsertPoint(op->getBlock(), + ++mlir::Block::iterator(op)); +} + +static mlir::IntegerAttr getAtomicHint(lower::AbstractConverter &converter, + const List &clauses) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + if (clause.id != llvm::omp::Clause::OMPC_hint) + continue; + auto &hint = std::get(clause.u); + auto maybeVal = evaluate::ToInt64(hint.v); + CHECK(maybeVal); + return builder.getI64IntegerAttr(*maybeVal); } + return nullptr; } -static void processOmpAtomicTODO(mlir::Type elementType, mlir::Location loc) { - if (!elementType) - return; - assert(fir::isa_trivial(fir::unwrapRefType(elementType)) && - "is supported type for omp atomic"); -} - -/// Used to generate atomic.read operation which is created in existing -/// location set by builder. -static void genAtomicCaptureStatement( - lower::AbstractConverter &converter, mlir::Value fromAddress, - mlir::Value toAddress, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, - mlir::Type elementType, mlir::Location loc) { - // Generate `atomic.read` operation for atomic assigment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); +static mlir::omp::ClauseMemoryOrderKindAttr +getAtomicMemoryOrder(lower::AbstractConverter &converter, + semantics::SemanticsContext &semaCtx, + const List &clauses) { + std::optional kind; + unsigned version = semaCtx.langOptions().OpenMPVersion; - processOmpAtomicTODO(elementType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, fromAddress, toAddress, - mlir::TypeAttr::get(elementType), - hint, memoryOrder); -} - -/// Used to generate atomic.write operation which is created in existing -/// location set by builder. -static void genAtomicWriteStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Value rhsExpr, const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Value *evaluatedExprValue = nullptr) { - // Generate `atomic.write` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + for (const Clause &clause : clauses) { + switch (clause.id) { + case llvm::omp::Clause::OMPC_acq_rel: + kind = mlir::omp::ClauseMemoryOrderKind::Acq_rel; + break; + case llvm::omp::Clause::OMPC_acquire: + kind = mlir::omp::ClauseMemoryOrderKind::Acquire; + break; + case llvm::omp::Clause::OMPC_relaxed: + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + break; + case llvm::omp::Clause::OMPC_release: + kind = mlir::omp::ClauseMemoryOrderKind::Release; + break; + case llvm::omp::Clause::OMPC_seq_cst: + kind = mlir::omp::ClauseMemoryOrderKind::Seq_cst; + break; + default: + break; + } + } - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // Create a conversion outside the capture block. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - firOpBuilder.setInsertionPointAfter(rhsExpr.getDefiningOp()); - rhsExpr = firOpBuilder.createConvert(loc, varType, rhsExpr); - firOpBuilder.restoreInsertionPoint(insertionPoint); - - processOmpAtomicTODO(varType, loc); - - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - firOpBuilder.create(loc, lhsAddr, rhsExpr, hint, - memoryOrder); -} - -/// Used to generate atomic.update operation which is created in existing -/// location set by builder. -static void genAtomicUpdateStatement( - lower::AbstractConverter &converter, mlir::Value lhsAddr, - mlir::Type varType, const parser::Variable &assignmentStmtVariable, - const parser::Expr &assignmentStmtExpr, - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList, mlir::Location loc, - mlir::Operation *atomicCaptureOp = nullptr) { - // Generate `atomic.update` operation for atomic assignment statements - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); - mlir::Location currentLocation = converter.getCurrentLocation(); + // Starting with 5.1, if no memory-order clause is present, the effect + // is as if "relaxed" was present. 
+ if (!kind) { + if (version <= 50) + return nullptr; + kind = mlir::omp::ClauseMemoryOrderKind::Relaxed; + } + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + return mlir::omp::ClauseMemoryOrderKindAttr::get(builder.getContext(), *kind); +} + +static mlir::Operation * // +genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Create the omp.atomic.update or acc.atomic.update operation - // - // func.func @_QPsb() { - // %0 = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFsbEa"} - // %1 = fir.alloca i32 {bindc_name = "b", uniq_name = "_QFsbEb"} - // %2 = fir.load %1 : !fir.ref - // omp.atomic.update %0 : !fir.ref { - // ^bb0(%arg0: i32): - // %3 = arith.addi %arg0, %2 : i32 - // omp.yield(%3 : i32) - // } - // return - // } - - auto getArgExpression = - [](std::list::const_iterator it) { - const auto &arg{std::get((*it).t)}; - const auto *parserExpr{ - std::get_if>(&arg.u)}; - return parserExpr; - }; +static mlir::Operation * // +genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, const semantics::SomeExpr 
&expr, + mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Value value = fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Value converted = builder.createConvert(loc, atomType, value); + + builder.restoreInsertionPoint(atomicAt); + mlir::Operation *op = builder.create( + loc, atomAddr, converted, hint, memOrder); + builder.restoreInsertionPoint(saved); + return op; +} - // Lower any non atomic sub-expression before the atomic operation, and - // map its lowered value to the semantic representation. - lower::ExprToValueMap exprValueOverrides; - // Max and min intrinsics can have a list of Args. Hence we need a list - // of nonAtomicSubExprs to hoist. Currently, only the load is hoisted. 
- llvm::SmallVector nonAtomicSubExprs; - common::visit( - common::visitors{ - [&](const common::Indirection &funcRef) - -> void { - const auto &args{std::get>( - funcRef.value().v.t)}; - std::list::const_iterator beginIt = - args.begin(); - std::list::const_iterator endIt = args.end(); - const auto *exprFirst{getArgExpression(beginIt)}; - if (exprFirst && exprFirst->value().source == - assignmentStmtVariable.GetSource()) { - // Add everything except the first - beginIt++; - } else { - // Add everything except the last - endIt--; - } - std::list::const_iterator it; - for (it = beginIt; it != endIt; it++) { - const common::Indirection *expr = - getArgExpression(it); - if (expr) - nonAtomicSubExprs.push_back(semantics::GetExpr(*expr)); - } - }, - [&](const auto &op) -> void { - using T = std::decay_t; - if constexpr (std::is_base_of::value) { - const auto &exprLeft{std::get<0>(op.t)}; - const auto &exprRight{std::get<1>(op.t)}; - if (exprLeft.value().source == assignmentStmtVariable.GetSource()) - nonAtomicSubExprs.push_back(semantics::GetExpr(exprRight)); - else - nonAtomicSubExprs.push_back(semantics::GetExpr(exprLeft)); - } - }, - }, - assignmentStmtExpr.u); - lower::StatementContext nonAtomicStmtCtx; - if (!nonAtomicSubExprs.empty()) { - // Generate non atomic part before all the atomic operations. 
- auto insertionPoint = firOpBuilder.saveInsertionPoint(); - if (atomicCaptureOp) - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value nonAtomicVal; - for (auto *nonAtomicSubExpr : nonAtomicSubExprs) { - nonAtomicVal = fir::getBase(converter.genExprValue( - currentLocation, *nonAtomicSubExpr, nonAtomicStmtCtx)); - exprValueOverrides.try_emplace(nonAtomicSubExpr, nonAtomicVal); +static mlir::Operation * +genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, mlir::Value atomAddr, + const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + lower::ExprToValueMap overrides; + lower::StatementContext naCtx; + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); + builder.restoreInsertionPoint(prepareAt); + + mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + + std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + assert(!args.empty() && "Update operation without arguments"); + for (auto &arg : args) { + if (!semantics::IsSameOrResizeOf(arg, atom)) { + mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); + overrides.try_emplace(&arg, val); } - if (atomicCaptureOp) - firOpBuilder.restoreInsertionPoint(insertionPoint); } - mlir::Operation *atomicUpdateOp = nullptr; - // If no hint clause is specified, the effect is as if - // hint(omp_sync_hint_none) had been specified. 
- mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - if (leftHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *leftHandClauseList, hint, - memoryOrder); - if (rightHandClauseList) - genOmpAtomicHintAndMemoryOrderClauses(converter, *rightHandClauseList, hint, - memoryOrder); - atomicUpdateOp = firOpBuilder.create( - currentLocation, lhsAddr, hint, memoryOrder); - - processOmpAtomicTODO(varType, loc); - - llvm::SmallVector varTys = {varType}; - llvm::SmallVector locs = {currentLocation}; - firOpBuilder.createBlock(&atomicUpdateOp->getRegion(0), {}, varTys, locs); - mlir::Value val = - fir::getBase(atomicUpdateOp->getRegion(0).front().getArgument(0)); - - exprValueOverrides.try_emplace(semantics::GetExpr(assignmentStmtVariable), - val); - { - // statement context inside the atomic block. - converter.overrideExprValues(&exprValueOverrides); - lower::StatementContext atomicStmtCtx; - mlir::Value rhsExpr = fir::getBase(converter.genExprValue( - *semantics::GetExpr(assignmentStmtExpr), atomicStmtCtx)); - mlir::Value convertResult = - firOpBuilder.createConvert(currentLocation, varType, rhsExpr); - firOpBuilder.create(currentLocation, convertResult); - converter.resetExprOverrides(); - } - firOpBuilder.setInsertionPointAfter(atomicUpdateOp); -} - -/// Processes an atomic construct with write clause. -static void genAtomicWrite(lower::AbstractConverter &converter, - const parser::OmpAtomicWrite &atomicWrite, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. 
- rightHandClauseList = &std::get<2>(atomicWrite.t); - leftHandClauseList = &std::get<0>(atomicWrite.t); - - const parser::AssignmentStmt &stmt = - std::get>(atomicWrite.t) - .statement; - const evaluate::Assignment &assign = *stmt.typedAssignment->v; - lower::StatementContext stmtCtx; - // Get the value and address of atomic write operands. - mlir::Value rhsExpr = - fir::getBase(converter.genExprValue(assign.rhs, stmtCtx)); - mlir::Value lhsAddr = - fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx)); - genAtomicWriteStatement(converter, lhsAddr, rhsExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with read clause. -static void genAtomicRead(lower::AbstractConverter &converter, - const parser::OmpAtomicRead &atomicRead, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicRead.t); - leftHandClauseList = &std::get<0>(atomicRead.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicRead.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicRead.t) - .statement.t); + builder.restoreInsertionPoint(atomicAt); + auto updateOp = + builder.create(loc, atomAddr, hint, memOrder); - lower::StatementContext stmtCtx; - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(assignmentStmtExpr); - mlir::Type elementType = converter.genType(fromExpr); - mlir::Value fromAddress = - fir::getBase(converter.genExprAddr(fromExpr, stmtCtx)); - mlir::Value toAddress = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - genAtomicCaptureStatement(converter, fromAddress, toAddress, - leftHandClauseList, rightHandClauseList, - elementType, loc); -} - -/// Processes an atomic construct with update clause. 
-static void genAtomicUpdate(lower::AbstractConverter &converter, - const parser::OmpAtomicUpdate &atomicUpdate, - mlir::Location loc) { - const parser::OmpAtomicClauseList *rightHandClauseList = nullptr; - const parser::OmpAtomicClauseList *leftHandClauseList = nullptr; - // Get the address of atomic read operands. - rightHandClauseList = &std::get<2>(atomicUpdate.t); - leftHandClauseList = &std::get<0>(atomicUpdate.t); - - const auto &assignmentStmtExpr = std::get( - std::get>(atomicUpdate.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicUpdate.t) - .statement.t); + mlir::Region &region = updateOp->getRegion(0); + mlir::Block *block = builder.createBlock(&region, {}, {atomType}, {loc}); + mlir::Value localAtom = fir::getBase(block->getArgument(0)); + overrides.try_emplace(&atom, localAtom); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, leftHandClauseList, - rightHandClauseList, loc); -} - -/// Processes an atomic construct with no clause - which implies update clause.
-static void genOmpAtomic(lower::AbstractConverter &converter, - const parser::OmpAtomic &atomicConstruct, - mlir::Location loc) { - const parser::OmpAtomicClauseList &atomicClauseList = - std::get(atomicConstruct.t); - const auto &assignmentStmtExpr = std::get( - std::get>(atomicConstruct.t) - .statement.t); - const auto &assignmentStmtVariable = std::get( - std::get>(atomicConstruct.t) - .statement.t); - lower::StatementContext stmtCtx; - mlir::Value lhsAddr = fir::getBase(converter.genExprAddr( - *semantics::GetExpr(assignmentStmtVariable), stmtCtx)); - mlir::Type varType = fir::unwrapRefType(lhsAddr.getType()); - // If atomic-clause is not present on the construct, the behaviour is as if - // the update clause is specified (for both OpenMP and OpenACC). - genAtomicUpdateStatement(converter, lhsAddr, varType, assignmentStmtVariable, - assignmentStmtExpr, &atomicClauseList, nullptr, loc); -} - -/// Processes an atomic construct with capture clause. -static void genAtomicCapture(lower::AbstractConverter &converter, - const parser::OmpAtomicCapture &atomicCapture, - mlir::Location loc) { - fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + converter.overrideExprValues(&overrides); + mlir::Value updated = + fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value converted = builder.createConvert(loc, atomType, updated); + builder.create(loc, converted); + converter.resetExprOverrides(); - const parser::AssignmentStmt &stmt1 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign1 = *stmt1.typedAssignment->v; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const parser::AssignmentStmt &stmt2 = - std::get(atomicCapture.t).v.statement; - const evaluate::Assignment &assign2 = *stmt2.typedAssignment->v; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - // Pre-evaluate expressions to be used in the various operations inside - // 
`atomic.capture` since it is not desirable to have anything other than - // a `atomic.read`, `atomic.write`, or `atomic.update` operation - // inside `atomic.capture` - lower::StatementContext stmtCtx; - // LHS evaluations are common to all combinations of `atomic.capture` - mlir::Value stmt1LHSArg = - fir::getBase(converter.genExprAddr(assign1.lhs, stmtCtx)); - mlir::Value stmt2LHSArg = - fir::getBase(converter.genExprAddr(assign2.lhs, stmtCtx)); - - // Type information used in generation of `atomic.update` operation - mlir::Type stmt1VarType = - fir::getBase(converter.genExprValue(assign1.lhs, stmtCtx)).getType(); - mlir::Type stmt2VarType = - fir::getBase(converter.genExprValue(assign2.lhs, stmtCtx)).getType(); - - mlir::Operation *atomicCaptureOp = nullptr; - mlir::IntegerAttr hint = nullptr; - mlir::omp::ClauseMemoryOrderKindAttr memoryOrder = nullptr; - const parser::OmpAtomicClauseList &rightHandClauseList = - std::get<2>(atomicCapture.t); - const parser::OmpAtomicClauseList &leftHandClauseList = - std::get<0>(atomicCapture.t); - genOmpAtomicHintAndMemoryOrderClauses(converter, leftHandClauseList, hint, - memoryOrder); - genOmpAtomicHintAndMemoryOrderClauses(converter, rightHandClauseList, hint, - memoryOrder); - atomicCaptureOp = - firOpBuilder.create(loc, hint, memoryOrder); - - firOpBuilder.createBlock(&(atomicCaptureOp->getRegion(0))); - mlir::Block &block = atomicCaptureOp->getRegion(0).back(); - firOpBuilder.setInsertionPointToStart(&block); - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - if (semantics::checkForSymbolMatch(stmt2)) { - // Atomic capture construct is of the form [capture-stmt, update-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicUpdateStatement( - converter, stmt2LHSArg, 
stmt2VarType, stmt2Var, stmt2Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - } else { - // Atomic capture construct is of the form [capture-stmt, write-stmt] - firOpBuilder.setInsertionPoint(atomicCaptureOp); - mlir::Value stmt2RHSArg = - fir::getBase(converter.genExprValue(assign2.rhs, stmtCtx)); - firOpBuilder.setInsertionPointToStart(&block); - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt1Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicCaptureStatement(converter, stmt2LHSArg, stmt1LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); - genAtomicWriteStatement(converter, stmt2LHSArg, stmt2RHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc); - } - } else { - // Atomic capture construct is of the form [update-stmt, capture-stmt] - const semantics::SomeExpr &fromExpr = *semantics::GetExpr(stmt2Expr); - mlir::Type elementType = converter.genType(fromExpr); - genAtomicUpdateStatement( - converter, stmt1LHSArg, stmt1VarType, stmt1Var, stmt1Expr, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, loc, atomicCaptureOp); - genAtomicCaptureStatement(converter, stmt1LHSArg, stmt2LHSArg, - /*leftHandClauseList=*/nullptr, - /*rightHandClauseList=*/nullptr, elementType, - loc); + builder.restoreInsertionPoint(saved); + return updateOp; +} + +static mlir::Operation * +genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, + lower::StatementContext &stmtCtx, int action, + mlir::Value atomAddr, const semantics::SomeExpr &atom, + const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint atomicAt, + fir::FirOpBuilder::InsertPoint prepareAt) { + switch (action) { + case parser::OpenMPAtomicConstruct::Analysis::Read: + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + 
memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Write: + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + case parser::OpenMPAtomicConstruct::Analysis::Update: + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, + memOrder, atomicAt, prepareAt); + default: + return nullptr; } - firOpBuilder.setInsertionPointToEnd(&block); - firOpBuilder.create(loc); - firOpBuilder.setInsertionPointToStart(&block); } //===----------------------------------------------------------------------===// @@ -3911,10 +3639,6 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, standaloneConstruct.u); } -//===----------------------------------------------------------------------===// -// OpenMPConstruct visitors -//===----------------------------------------------------------------------===// - static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, @@ -3922,38 +3646,140 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, TODO(converter.getCurrentLocation(), "OpenMPAllocatorsConstruct"); } +//===----------------------------------------------------------------------===// +// OpenMPConstruct visitors +//===----------------------------------------------------------------------===// + +[[maybe_unused]] static void +dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { + auto whatStr = [](int k) { + std::string txt = "?"; + switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { + case parser::OpenMPAtomicConstruct::Analysis::None: + txt = "None"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Read: + txt = "Read"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Write: + txt = "Write"; + break; + case parser::OpenMPAtomicConstruct::Analysis::Update: + txt = "Update"; + break; + } + switch (k & 
parser::OpenMPAtomicConstruct::Analysis::Condition) { + case parser::OpenMPAtomicConstruct::Analysis::IfTrue: + txt += " | IfTrue"; + break; + case parser::OpenMPAtomicConstruct::Analysis::IfFalse: + txt += " | IfFalse"; + break; + } + return txt; + }; + + auto exprStr = [&](const parser::TypedExpr &expr) { + if (auto *maybe = expr.get()) { + if (maybe->v) + return maybe->v->AsFortran(); + } + return ""s; + }; + + const SomeExpr &atom = *analysis.atom.get()->v; + + llvm::errs() << "Analysis {\n"; + llvm::errs() << " atom: " << atom.AsFortran() << "\n"; + llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; + llvm::errs() << " op0 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << " op1 {\n"; + llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; + llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " }\n"; + llvm::errs() << "}\n"; +} + static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, - const parser::OpenMPAtomicConstruct &atomicConstruct) { - Fortran::common::visit( - common::visitors{ - [&](const parser::OmpAtomicRead &atomicRead) { - mlir::Location loc = converter.genLocation(atomicRead.source); - genAtomicRead(converter, atomicRead, loc); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - mlir::Location loc = converter.genLocation(atomicWrite.source); - genAtomicWrite(converter, atomicWrite, loc); - }, - [&](const parser::OmpAtomic &atomicConstruct) { - mlir::Location loc = converter.genLocation(atomicConstruct.source); - genOmpAtomic(converter, atomicConstruct, loc); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - mlir::Location loc = converter.genLocation(atomicUpdate.source); - genAtomicUpdate(converter, atomicUpdate, loc); - }, - [&](const 
parser::OmpAtomicCapture &atomicCapture) { - mlir::Location loc = converter.genLocation(atomicCapture.source); - genAtomicCapture(converter, atomicCapture, loc); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - mlir::Location loc = converter.genLocation(atomicCompare.source); - TODO(loc, "OpenMP atomic compare"); - }, - }, - atomicConstruct.u); + const parser::OpenMPAtomicConstruct &construct) { + auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { + if (auto *maybe = expr.get(); maybe && maybe->v) { + return &*maybe->v; + } else { + return nullptr; + } + }; + + fir::FirOpBuilder &builder = converter.getFirOpBuilder(); + auto &dirSpec = std::get(construct.t); + List clauses = makeClauses(dirSpec.Clauses(), semaCtx); + lower::StatementContext stmtCtx; + + const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + const semantics::SomeExpr &atom = *get(analysis.atom); + mlir::Location loc = converter.genLocation(construct.source); + mlir::Value atomAddr = + fir::getBase(converter.genExprAddr(atom, stmtCtx, &loc)); + mlir::IntegerAttr hint = getAtomicHint(converter, clauses); + mlir::omp::ClauseMemoryOrderKindAttr memOrder = + getAtomicMemoryOrder(converter, semaCtx, clauses); + + if (auto *cond = get(analysis.cond)) { + (void)cond; + TODO(loc, "OpenMP ATOMIC COMPARE"); + } else { + int action0 = analysis.op0.what & analysis.Action; + int action1 = analysis.op1.what & analysis.Action; + mlir::Operation *captureOp = nullptr; + fir::FirOpBuilder::InsertPoint atomicAt; + fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + + if (construct.IsCapture()) { + // Capturing operation. + assert(action0 != analysis.None && action1 != analysis.None && + "Expecting two actions"); + captureOp = + builder.create(loc, hint, memOrder); + // Set the non-atomic insertion point to before the atomic.capture.
+ prepareAt = getInsertionPointBefore(captureOp); + + mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); + builder.setInsertionPointToEnd(block); + // Set the atomic insertion point to before the terminator inside + // atomic.capture. + mlir::Operation *term = builder.create(loc); + atomicAt = getInsertionPointBefore(term); + hint = nullptr; + memOrder = nullptr; + } else { + // Non-capturing operation. + assert(action0 != analysis.None && action1 == analysis.None && + "Expecting single action"); + assert(!(analysis.op0.what & analysis.Condition)); + atomicAt = prepareAt; + } + + mlir::Operation *firstOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, + *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + assert(firstOp && "Should have created an atomic operation"); + atomicAt = getInsertionPointAfter(firstOp); + + mlir::Operation *secondOp = genAtomicOperation( + converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, + *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } + } } static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp index bfca4e3f1730a..b30263ad54fa4 100644 --- a/flang/lib/Parser/openmp-parsers.cpp +++ b/flang/lib/Parser/openmp-parsers.cpp @@ -24,6 +24,12 @@ // OpenMP Directives and Clauses namespace Fortran::parser { +// Helper function to print the buffer contents starting at the current point.
+[[maybe_unused]] static std::string ahead(const ParseState &state) { + return std::string( + state.GetLocation(), std::min(64, state.BytesRemaining())); +} + constexpr auto startOmpLine = skipStuffBeforeStatement >> "!$OMP "_sptok; constexpr auto endOmpLine = space >> endOfLine; @@ -918,8 +924,10 @@ TYPE_PARSER( // parenthesized(Parser{}))) || "BIND" >> construct(construct( parenthesized(Parser{}))) || + "CAPTURE" >> construct(construct()) || "COLLAPSE" >> construct(construct( parenthesized(scalarIntConstantExpr))) || + "COMPARE" >> construct(construct()) || "CONTAINS" >> construct(construct( parenthesized(Parser{}))) || "COPYIN" >> construct(construct( @@ -1039,6 +1047,7 @@ TYPE_PARSER( // "TASK_REDUCTION" >> construct(construct( parenthesized(Parser{}))) || + "READ" >> construct(construct()) || "RELAXED" >> construct(construct()) || "RELEASE" >> construct(construct()) || "REVERSE_OFFLOAD" >> @@ -1082,6 +1091,7 @@ TYPE_PARSER( // maybe(Parser{}))) || "WHEN" >> construct(construct( parenthesized(Parser{}))) || + "WRITE" >> construct(construct()) || // Cancellable constructs "DO"_id >= construct(construct( @@ -1210,6 +1220,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. 
The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. +// A problem can occur when atomic constructs without end-directive follow +// each other closely, e.g. +// !$omp atomic write +// x = v +// !$omp atomic update +// x = x + 1 +// ... +// The speculative parsing will become "recursive", and has the potential +// to take a (practically) infinite amount of time given a sufficiently +// large number of such constructs in a row. Since atomic constructs cannot +// contain other OpenMP constructs, guarding against recursive calls to the +// atomic construct parser solves the problem. 
+struct OmpAtomicConstructParser { + using resultType = OpenMPAtomicConstruct; + + static constexpr size_t BodyLimit{5}; + + std::optional Parse(ParseState &state) const { + if (recursing_) { + return std::nullopt; + } + recursing_ = true; + + auto dirSpec{Parser{}.Parse(state)}; + if (!dirSpec || dirSpec->DirId() != llvm::omp::Directive::OMPD_atomic) { + recursing_ = false; + return std::nullopt; + } + + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + if (ParseOne(exec, end, tail, state)) { + if (!tail.first.empty()) { + if (auto &&rest{attempt(LimitedTailParser(BodyLimit)).Parse(state)}) { + for (auto &&s : rest->first) { + tail.first.emplace_back(std::move(s)); + } + assert(!tail.second); + tail.second = std::move(rest->second); + } + } + recursing_ = false; + return OpenMPAtomicConstruct{ + std::move(*dirSpec), std::move(tail.first), std::move(tail.second)}; + } + + recursing_ = false; + return std::nullopt; + } + +private: + // Begin-directive + TailType = entire construct. + using TailType = std::pair>; + + // Parse either an ExecutionPartConstruct, or atomic end-directive. When + // successful, record the result in the "tail" provided, otherwise fail. 
+ static std::optional ParseOne( // + Parser &exec, OmpEndDirectiveParser &end, + TailType &tail, ParseState &state) { + auto isRecovery{[](const ExecutionPartConstruct &e) { + return std::holds_alternative(e.u); + }}; + if (auto &&stmt{attempt(exec).Parse(state)}; stmt && !isRecovery(*stmt)) { + tail.first.emplace_back(std::move(*stmt)); + } else if (auto &&dir{attempt(end).Parse(state)}) { + tail.second = std::move(*dir); + } else { + return std::nullopt; + } + return Success{}; + } + + struct LimitedTailParser { + using resultType = TailType; + + constexpr LimitedTailParser(size_t count) : count_(count) {} + + std::optional Parse(ParseState &state) const { + auto exec{Parser{}}; + auto end{OmpEndDirectiveParser{llvm::omp::Directive::OMPD_atomic}}; + TailType tail; + + for (size_t i{0}; i != count_; ++i) { + if (ParseOne(exec, end, tail, state)) { + if (tail.second) { + // Return when the end-directive was parsed. + return std::move(tail); + } + } else { + break; + } + } + return std::nullopt; + } + + private: + const size_t count_; + }; + + // The recursion guard should become thread_local if parsing is ever + // parallelized. + static bool recursing_; +}; + +bool OmpAtomicConstructParser::recursing_{false}; + +TYPE_PARSER(sourced( // + construct(OmpAtomicConstructParser{}))) + // 2.17.7 Atomic construct/2.17.8 Flush construct [OpenMP 5.0] // memory-order-clause -> // acq_rel @@ -1224,19 +1383,6 @@ TYPE_PARSER(sourced(construct( "RELEASE" >> construct(construct()) || "SEQ_CST" >> construct(construct()))))) -// 2.17.7 Atomic construct -// atomic-clause -> memory-order-clause | HINT(hint-expression) -TYPE_PARSER(sourced(construct( - construct(Parser{}) || - construct( - "FAIL" >> parenthesized(Parser{})) || - construct( - "HINT" >> parenthesized(Parser{}))))) - -// atomic-clause-list -> [atomic-clause, [atomic-clause], ...] 
-TYPE_PARSER(sourced(construct( - many(maybe(","_tok) >> sourced(Parser{}))))) - static bool IsSimpleStandalone(const OmpDirectiveName &name) { switch (name.v) { case llvm::omp::Directive::OMPD_barrier: @@ -1383,67 +1529,6 @@ TYPE_PARSER(sourced( TYPE_PARSER(construct(Parser{}) || construct(Parser{})) -// 2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] | -// ATOMIC [clause] -// clause -> memory-order-clause | HINT(hint-expression) -// memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED -// atomic-clause -> READ | WRITE | UPDATE | CAPTURE - -// OMP END ATOMIC -TYPE_PARSER(construct(startOmpLine >> "END ATOMIC"_tok)) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] READ [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("READ"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] CAPTURE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("CAPTURE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - statement(assignmentStmt), Parser{} / endOmpLine))) - -TYPE_PARSER(construct(indirect(Parser{})) || - construct(indirect(Parser{}))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] COMPARE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("COMPARE"_tok), - Parser{} / endOmpLine, - Parser{}, - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [MEMORY-ORDER-CLAUSE-LIST] UPDATE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("UPDATE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC [atomic-clause-list] -TYPE_PARSER(sourced(construct(verbatim("ATOMIC"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// OMP ATOMIC 
[MEMORY-ORDER-CLAUSE-LIST] WRITE [MEMORY-ORDER-CLAUSE-LIST] -TYPE_PARSER("ATOMIC" >> - sourced(construct( - Parser{} / maybe(","_tok), verbatim("WRITE"_tok), - Parser{} / endOmpLine, statement(assignmentStmt), - maybe(Parser{} / endOmpLine)))) - -// Atomic Construct -TYPE_PARSER(construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{}) || - construct(Parser{})) - // 2.13.2 OMP CRITICAL TYPE_PARSER(startOmpLine >> sourced(construct( diff --git a/flang/lib/Parser/parse-tree.cpp b/flang/lib/Parser/parse-tree.cpp index 5839e7862b38b..5983f54600c21 100644 --- a/flang/lib/Parser/parse-tree.cpp +++ b/flang/lib/Parser/parse-tree.cpp @@ -318,6 +318,34 @@ std::string OmpTraitSetSelectorName::ToString() const { return std::string(EnumToString(v)); } +llvm::omp::Clause OpenMPAtomicConstruct::GetKind() const { + auto &dirSpec{std::get(t)}; + for (auto &clause : dirSpec.Clauses().v) { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_read: + case llvm::omp::Clause::OMPC_write: + case llvm::omp::Clause::OMPC_update: + return clause.Id(); + default: + break; + } + } + return llvm::omp::Clause::OMPC_update; +} + +bool OpenMPAtomicConstruct::IsCapture() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_capture; + }); +} + +bool OpenMPAtomicConstruct::IsCompare() const { + auto &dirSpec{std::get(t)}; + return llvm::any_of(dirSpec.Clauses().v, [](auto &clause) { + return clause.Id() == llvm::omp::Clause::OMPC_compare; + }); +} } // namespace Fortran::parser template static llvm::omp::Clause getClauseIdForClass(C &&) { diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp index 5ac598265ec87..a2bc8b98088f1 100644 --- a/flang/lib/Parser/unparse.cpp +++ b/flang/lib/Parser/unparse.cpp @@ -2562,83 +2562,22 @@ class UnparseVisitor { Word(ToUpperCaseLetters(common::EnumToString(x))); } - void Unparse(const 
OmpAtomicClauseList &x) { Walk(" ", x.v, " "); } - - void Unparse(const OmpAtomic &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCapture &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" CAPTURE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - Put("\n"); - Walk(std::get(x.t)); - BeginOpenMP(); - Word("!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicCompare &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" COMPARE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get(x.t)); - } - void Unparse(const OmpAtomicRead &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" READ"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicUpdate &x) { + void Unparse(const OpenMPAtomicConstruct &x) { BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" UPDATE"); - Walk(std::get<2>(x.t)); - Put("\n"); - EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); - } - void Unparse(const OmpAtomicWrite &x) { - BeginOpenMP(); - Word("!$OMP ATOMIC"); - Walk(std::get<0>(x.t)); - Word(" WRITE"); - Walk(std::get<2>(x.t)); + Word("!$OMP "); + Walk(std::get(x.t)); Put("\n"); EndOpenMP(); - Walk(std::get>(x.t)); - BeginOpenMP(); - Walk(std::get>(x.t), "!$OMP END ATOMIC\n"); - EndOpenMP(); + Walk(std::get(x.t), ""); + if (auto &end{std::get>(x.t)}) { + BeginOpenMP(); + Word("!$OMP END "); + Walk(*end); + Put("\n"); + EndOpenMP(); + } } + void Unparse(const OpenMPExecutableAllocate &x) { const auto &fields = std::get>>( @@ -2889,23 
+2828,8 @@ class UnparseVisitor { Put("\n"); EndOpenMP(); } + void Unparse(const OmpFailClause &x) { Walk(x.v); } void Unparse(const OmpMemoryOrderClause &x) { Walk(x.v); } - void Unparse(const OmpAtomicClause &x) { - common::visit(common::visitors{ - [&](const OmpMemoryOrderClause &y) { Walk(y); }, - [&](const OmpFailClause &y) { - Word("FAIL("); - Walk(y.v); - Put(")"); - }, - [&](const OmpHintClause &y) { - Word("HINT("); - Walk(y.v); - Put(")"); - }, - }, - x.u); - } void Unparse(const OmpMetadirectiveDirective &x) { BeginOpenMP(); Word("!$OMP METADIRECTIVE "); diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index c582de6df0319..bf3dff8ddab28 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -16,10 +16,18 @@ #include "flang/Semantics/openmp-modifiers.h" #include "flang/Semantics/tools.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/StringSwitch.h" #include namespace Fortran::semantics { +static_assert(std::is_same_v>); + +template +static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { + return !(e == f); +} + // Use when clause falls under 'struct OmpClause' in 'parse-tree.h'. 
#define CHECK_SIMPLE_CLAUSE(X, Y) \ void OmpStructureChecker::Enter(const parser::OmpClause::X &) { \ @@ -78,6 +86,28 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } +static bool IsVarOrFunctionRef(const SomeExpr &expr) { + return evaluate::UnwrapProcedureRef(expr) != nullptr || + evaluate::IsVariable(expr); +} + +static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { + const parser::TypedExpr &typedExpr{parserExpr.typedExpr}; + // ForwardOwningPointer typedExpr + // `- GenericExprWrapper ^.get() + // `- std::optional ^->v + return typedExpr.get()->v; +} + +static std::optional GetDynamicType( + const parser::Expr &parserExpr) { + if (auto maybeExpr{GetEvaluateExpr(parserExpr)}) { + return maybeExpr->GetType(); + } else { + return std::nullopt; + } +} + // 'OmpWorkshareBlockChecker' is used to check the validity of the assignment // statements and the expressions enclosed in an OpenMP Workshare construct class OmpWorkshareBlockChecker { @@ -584,51 +614,26 @@ void OmpStructureChecker::CheckPredefinedAllocatorRestriction( } } -template -void OmpStructureChecker::CheckHintClause( - D *leftOmpClauseList, D *rightOmpClauseList, std::string_view dirName) { - bool foundHint{false}; +void OmpStructureChecker::Enter(const parser::OmpClause::Hint &x) { + CheckAllowedClause(llvm::omp::Clause::OMPC_hint); + auto &dirCtx{GetContext()}; - auto checkForValidHintClause = [&](const D *clauseList) { - for (const auto &clause : clauseList->v) { - const parser::OmpHintClause *ompHintClause = nullptr; - if constexpr (std::is_same_v) { - ompHintClause = std::get_if(&clause.u); - } else if constexpr (std::is_same_v) { - if (auto *hint{std::get_if(&clause.u)}) { - ompHintClause = &hint->v; - } - } - if (!ompHintClause) - continue; - if (foundHint) { - context_.Say(clause.source, - "At most one HINT clause can appear on the %s directive"_err_en_US, - parser::ToUpperCaseLetters(dirName)); - } - foundHint = true; - std::optional hintValue = 
GetIntValue(ompHintClause->v); - if (hintValue && *hintValue >= 0) { - /*`omp_sync_hint_nonspeculative` and `omp_lock_hint_speculative`*/ - if ((*hintValue & 0xC) == 0xC - /*`omp_sync_hint_uncontended` and omp_sync_hint_contended*/ - || (*hintValue & 0x3) == 0x3) - context_.Say(clause.source, - "Hint clause value " - "is not a valid OpenMP synchronization value"_err_en_US); - } else { - context_.Say(clause.source, - "Hint clause must have non-negative constant " - "integer expression"_err_en_US); + if (std::optional maybeVal{GetIntValue(x.v.v)}) { + int64_t val{*maybeVal}; + if (val >= 0) { + // Check contradictory values. + if ((val & 0xC) == 0xC || // omp_sync_hint_speculative and nonspeculative + (val & 0x3) == 0x3) { // omp_sync_hint_contended and uncontended + context_.Say(dirCtx.clauseSource, + "The synchronization hint is not valid"_err_en_US); } + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be non-negative"_err_en_US); } - }; - - if (leftOmpClauseList) { - checkForValidHintClause(leftOmpClauseList); - } - if (rightOmpClauseList) { - checkForValidHintClause(rightOmpClauseList); + } else { + context_.Say(dirCtx.clauseSource, + "Synchronization hint must be a constant integer value"_err_en_US); } } @@ -2370,8 +2375,9 @@ void OmpStructureChecker::Leave(const parser::OpenMPCancelConstruct &) { void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { const auto &dir{std::get(x.t)}; + const auto &dirSource{std::get(dir.t).source}; const auto &endDir{std::get(x.t)}; - PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_critical); + PushContextAndClauseSets(dirSource, llvm::omp::Directive::OMPD_critical); const auto &block{std::get(x.t)}; CheckNoBranching(block, llvm::omp::Directive::OMPD_critical, dir.source); const auto &dirName{std::get>(dir.t)}; @@ -2404,7 +2410,6 @@ void OmpStructureChecker::Enter(const parser::OpenMPCriticalConstruct &x) { "Hint clause other than omp_sync_hint_none cannot be 
specified for " "an unnamed CRITICAL directive"_err_en_US}); } - CheckHintClause(&ompClause, nullptr, "CRITICAL"); } void OmpStructureChecker::Leave(const parser::OpenMPCriticalConstruct &) { @@ -2632,420 +2637,1519 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. Repeat +/// this step as many times as possible. 
+static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. + auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. 
+ + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. + return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t))}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. 
have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - (exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. 
+ auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); + } + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); +} + +namespace atomic { + +enum class Operator { + Unk, + // Operators that are officially allowed in the update operation + Add, + And, + Associated, + Div, + Eq, + Eqv, + Ge, // extension + Gt, + Identity, // extension: x = x is allowed (*), but we should never print + // "identity" as the name of the operator + Le, // extension + Lt, + Max, + Min, + Mul, + Ne, // extension + Neqv, + Or, + Sub, + // Operators that we recognize for technical reasons + True, + False, + Not, + Convert, + Resize, + Intrinsic, + Call, + Pow, + + // (*): "x = x + 0" is a valid update statement, but it will be folded + // to "x = x" by the time we look at it. Since the source statements + // "x = x" and "x = x + 0" will end up looking the same, accept the + // former as an extension. 
+}; + +std::string ToString(Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; } - return false; } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) + : std::make_pair(Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); } } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair( + OperationCode(x), OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) + const { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unk; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Add; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) 
const { + return Operator::Sub; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Mul; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Div; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + return Operator::Pow; + } + template + Operator OperationCode( + const evaluate::Operation, Ts...> &op) const { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } + } + Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; + } + template // + Operator OperationCode(const T &) const { + return Operator::Unk; + } + + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + (Append(v, std::move(results)), ...); + return v; + } + +private: + static void Append(Result &acc, Result &&data) { + for (auto &&s : data) { + acc.push_back(std::move(s)); } } +}; +} // namespace atomic - ErrIfAllocatableVariable(var); +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return atomic::ArgumentExtractor{}(expr); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { + return GetTopLevelOperation(expr).second; +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(stmt2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); - } else { - // ATOMIC CAPTURE construct is of the form [capture-stmt, write-stmt] - CheckAtomicWriteStmt(stmt2); - } - auto *v{stmt2Var.typedExpr.get()}; - auto *e{stmt1Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Expr.source, - "Captured variable/array element/derived-type component %s expected to be assigned in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Expr.source); - } - } else if (semantics::checkForSymbolMatch(stmt1) && - semantics::checkForSingleVariableOnRHS(stmt2)) { - // ATOMIC CAPTURE construct is of the form [update-stmt, capture-stmt] - CheckAtomicUpdateStmt(stmt1); - CheckAtomicCaptureStmt(stmt2); - // Variable updated in stmt1 should be captured in stmt2 - auto *v{stmt1Var.typedExpr.get()}; - auto *e{stmt2Expr.typedExpr.get()}; - if (v && e && !(v->v == e->v)) { - context_.Say(stmt1Var.GetSource(), - "Updated variable/array element/derived-type component %s expected to be captured in the second statement of ATOMIC CAPTURE construct"_err_en_US, - stmt1Var.GetSource()); +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return 
GetTopLevelOperation(cond).first == atomic::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. + // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. 
+ [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. + std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; + } + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } + } + return nullptr; +} + +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. + return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; +} + +bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { + // Both expr and x have the form of SomeType(SomeKind(...)[1]). + // Check if expr is + // SomeType(SomeKind(Type( + // Convert + // SomeKind(...)[2]))) + // where SomeKind(...) [1] and [2] are equal, and the Convert preserves + // TypeCategory. 
+ + if (expr != x) { + auto top{atomic::ArgumentExtractor{}(expr)}; + return top.first == atomic::Operator::Resize && x == top.second.front(); } else { - context_.Say(stmt1Expr.source, - "Invalid ATOMIC CAPTURE construct statements. Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt]"_err_en_US); + return true; } } -void OmpStructureChecker::CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *leftHandClauseList, - const parser::OmpAtomicClauseList *rightHandClauseList) { - int numMemoryOrderClause{0}; - int numFailClause{0}; - auto checkForValidMemoryOrderClause = [&](const parser::OmpAtomicClauseList - *clauseList) { - for (const auto &clause : clauseList->v) { - if (std::get_if(&clause.u)) { - numFailClause++; - if (numFailClause > 1) { - context_.Say(clause.source, - "More than one FAIL clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, const MaybeExpr &maybeExpr = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetExpr(operation.expr, maybeExpr); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; 
+ // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedExpr expr; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. +void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. 
+ } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var + // or + // ... = x capture-var = atomic-var + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return u.lhs == c.rhs; + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); + bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + + if (couldBeCapture1) { + if (couldBeCapture2) { + if (isUpdateCapture(as2, as1)) { + if (isUpdateCapture(as1, as2)) { + // If both statements could be captures and both could be updates, + // emit a warning about the ambiguity. 
+ context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); } + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - if (std::get_if(&clause.u)) { - numMemoryOrderClause++; - if (numMemoryOrderClause > 1) { - context_.Say(clause.source, - "More than one memory order clause not allowed on OpenMP ATOMIC construct"_err_en_US); - return; - } + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, + as1.rhs.AsFortran(), as2.rhs.AsFortran()); + } + } else { // !couldBeCapture2 + if (isUpdateCapture(as2, as1)) { + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); + } else { + context_.Say(act2.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as1.rhs.AsFortran()); + } + } + } else { // !couldBeCapture1 + if (couldBeCapture2) { + if (isUpdateCapture(as1, as2)) { + return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); + } else { + context_.Say(act1.source, + "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, + as2.rhs.AsFortran()); + } + } else { + context_.Say(source, + "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + } + } + + return std::make_pair(nullptr, nullptr); +} + +void OmpStructureChecker::CheckAtomicCaptureAssignment( + const evaluate::Assignment &capture, const SomeExpr &atom, + parser::CharBlock source) { + const SomeExpr &cap{capture.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + 
atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + // This part should have been checked prior to calling this function. + assert(capture.rhs == atom && "This cannot be a capture assignment"); + CheckStorageOverlap(atom, {cap}, source); + } +} + +void OmpStructureChecker::CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source) { + const SomeExpr &atom{read.rhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source) { + // [6.0:190:13-15] + // A write structured block is write-statement, a write statement that has + // one of the following forms: + // x = expr + // x => expr + const SomeExpr &atom{write.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + } else { + CheckAtomicVariable(atom, lsrc); + CheckStorageOverlap(atom, {write.rhs}, source); + } +} + +void OmpStructureChecker::CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source) { + // [6.0:191:1-7] + // An update structured block is update-statement, an update statement + // that has one of the following forms: + // x = x operator expr + // x = expr operator x + // x = intrinsic-procedure-name (x) + // x = intrinsic-procedure-name (x, expr-list) + // x = intrinsic-procedure-name (expr-list, x) + const SomeExpr &atom{update.lhs}; + auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. 
+ return; + } + + CheckAtomicVariable(atom, lsrc); + + auto top{GetTopLevelOperation(update.rhs)}; + switch (top.first) { + case atomic::Operator::Add: + case atomic::Operator::Sub: + case atomic::Operator::Mul: + case atomic::Operator::Div: + case atomic::Operator::And: + case atomic::Operator::Or: + case atomic::Operator::Eqv: + case atomic::Operator::Neqv: + case atomic::Operator::Min: + case atomic::Operator::Max: + case atomic::Operator::Identity: + break; + case atomic::Operator::Call: + context_.Say(source, + "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Convert: + context_.Say(source, + "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Intrinsic: + context_.Say(source, + "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + case atomic::Operator::Unk: + context_.Say( + source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); + return; + default: + context_.Say(source, + "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, + atomic::ToString(top.first)); + return; + } + // Check if `atom` occurs exactly once in the argument list. + std::vector nonAtom; + auto unique{[&]() { // -> iterator + auto found{top.second.end()}; + for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { + if (IsSameOrResizeOf(*i, atom)) { + if (found != top.second.end()) { + return top.second.end(); } + found = i; + } else { + nonAtom.push_back(*i); } } - }; - if (leftHandClauseList) { - checkForValidMemoryOrderClause(leftHandClauseList); + return found; + }()}; + + if (unique == top.second.end()) { + if (top.first == atomic::Operator::Identity) { + // This is "x = y". 
+ context_.Say(rsrc, + "The atomic variable %s should appear as an argument in the update operation"_err_en_US, + atom.AsFortran()); + } else { + context_.Say(rsrc, + "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, + atom.AsFortran(), atomic::ToString(top.first)); + } + } else { + CheckStorageOverlap(atom, nonAtom, source); } - if (rightHandClauseList) { - checkForValidMemoryOrderClause(rightHandClauseList); +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( + const SomeExpr &cond, parser::CharBlock condSource, + const evaluate::Assignment &assign, parser::CharBlock assignSource) { + const SomeExpr &atom{assign.lhs}; + auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + + if (!IsVarOrFunctionRef(atom)) { + context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, + atom.AsFortran()); + // Skip other checks. + return; + } + + CheckAtomicVariable(atom, alsrc); + + auto top{GetTopLevelOperation(cond)}; + // Missing arguments to operations would have been diagnosed by now. 
+ + switch (top.first) { + case atomic::Operator::Associated: + if (atom != top.second.front()) { + context_.Say(assignSource, + "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); + } + break; + // x equalop e | e equalop x (allowing "e equalop x" is an extension) + case atomic::Operator::Eq: + case atomic::Operator::Eqv: + // x ordop expr | expr ordop x + case atomic::Operator::Lt: + case atomic::Operator::Gt: { + const SomeExpr &arg0{top.second[0]}; + const SomeExpr &arg1{top.second[1]}; + if (IsSameOrResizeOf(arg0, atom)) { + CheckStorageOverlap(atom, {arg1}, condSource); + } else if (IsSameOrResizeOf(arg1, atom)) { + CheckStorageOverlap(atom, {arg0}, condSource); + } else { + context_.Say(assignSource, + "An argument of the %s operator should be the target of the assignment"_err_en_US, + atomic::ToString(top.first)); + } + break; + } + case atomic::Operator::True: + case atomic::Operator::False: + break; + default: + context_.Say(condSource, + "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, + atomic::ToString(top.first)); + break; } } -void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { - common::visit( - common::visitors{ - [&](const parser::OmpAtomic &atomicConstruct) { - const auto &dir{std::get(atomicConstruct.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicConstruct.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get(atomicConstruct.t), - nullptr); - CheckHintClause( - &std::get(atomicConstruct.t), - nullptr, "ATOMIC"); - }, - [&](const parser::OmpAtomicUpdate &atomicUpdate) { - const auto &dir{std::get(atomicUpdate.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicUpdateStmt( - std::get>( - atomicUpdate.t) - .statement); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t)); - 
CheckHintClause( - &std::get<0>(atomicUpdate.t), &std::get<2>(atomicUpdate.t), - "UPDATE"); - }, - [&](const parser::OmpAtomicRead &atomicRead) { - const auto &dir{std::get(atomicRead.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t)); - CheckHintClause( - &std::get<0>(atomicRead.t), &std::get<2>(atomicRead.t), "READ"); - CheckAtomicCaptureStmt( - std::get>( - atomicRead.t) - .statement); - }, - [&](const parser::OmpAtomicWrite &atomicWrite) { - const auto &dir{std::get(atomicWrite.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t)); - CheckHintClause( - &std::get<0>(atomicWrite.t), &std::get<2>(atomicWrite.t), - "WRITE"); - CheckAtomicWriteStmt( - std::get>( - atomicWrite.t) - .statement); - }, - [&](const parser::OmpAtomicCapture &atomicCapture) { - const auto &dir{std::get(atomicCapture.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t)); - CheckHintClause( - &std::get<0>(atomicCapture.t), &std::get<2>(atomicCapture.t), - "CAPTURE"); - CheckAtomicCaptureConstruct(atomicCapture); - }, - [&](const parser::OmpAtomicCompare &atomicCompare) { - const auto &dir{std::get(atomicCompare.t)}; - PushContextAndClauseSets( - dir.source, llvm::omp::Directive::OMPD_atomic); - CheckAtomicMemoryOrderClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t)); - CheckHintClause( - &std::get<0>(atomicCompare.t), &std::get<2>(atomicCompare.t), - "CAPTURE"); - CheckAtomicCompareConstruct(atomicCompare); - }, +void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source) { + // The condition/statements must be: + // - cond: x equalop e ift: x = d 
iff: - + // - cond: x ordop expr ift: x = expr iff: - (+ commute ordop) + // - cond: associated(x) ift: x => expr iff: - + // - cond: associated(x, e) ift: x => expr iff: - + + // The if-true statement must be present, and must be an assignment. + auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; + if (!maybeAssign) { + if (update.ift.stmt) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); + } else { + context_.Say( + source, "Invalid body of ATOMIC UPDATE COMPARE operation"_err_en_US); + } + return; + } + const evaluate::Assignment assign{*maybeAssign}; + const SomeExpr &atom{assign.lhs}; + + CheckAtomicConditionalUpdateAssignment( + update.cond, update.source, assign, update.ift.source); + + CheckStorageOverlap(atom, {assign.rhs}, update.ift.source); + + if (update.iff) { + context_.Say(update.iff.source, + "In ATOMIC UPDATE COMPARE the update statement should not have an ELSE branch"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdateOnly( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeUpdate{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeUpdate->lhs}; + CheckAtomicUpdateAssignment(*maybeUpdate, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC UPDATE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdate( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // 
Allowable forms are (single-statement): + // - if ... + // - x = (... ? ... : x) + // and two-statement: + // - r = cond ; if (r) ... + + const parser::ExecutionPartConstruct *ust{nullptr}; // update + const parser::ExecutionPartConstruct *cst{nullptr}; // condition + + if (body.size() == 1) { + ust = &body.front(); + } else if (body.size() == 2) { + cst = &body.front(); + ust = &body.back(); + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE operation should contain one or two statements"_err_en_US); + return; + } + + // Flang doesn't support conditional-expr yet, so all update statements + // are if-statements. + + // IfStmt: if (...) ... + // IfConstruct: if (...) then ... endif + auto maybeUpdate{AnalyzeConditionalStmt(ust)}; + if (!maybeUpdate) { + context_.Say(source, + "In ATOMIC UPDATE COMPARE the update statement should be a conditional statement"_err_en_US); + return; + } + + AnalyzedCondStmt &update{*maybeUpdate}; + + if (SourcedActionStmt action{GetActionStmt(cst)}) { + // The "condition" statement must be `r = cond`. + if (auto maybeCond{GetEvaluateAssignment(action.stmt)}) { + if (maybeCond->lhs != update.cond) { + context_.Say(update.source, + "In ATOMIC UPDATE COMPARE the conditional statement must use %s as the condition"_err_en_US, + maybeCond->lhs.AsFortran()); + } else { + // If it's "r = ...; if (r) ..." then put the original condition + // in `update`. 
+ update.cond = maybeCond->rhs; + } + } else { + context_.Say(action.source, + "In ATOMIC UPDATE COMPARE with two statements the first statement should compute the condition"_err_en_US); + } + } + + evaluate::Assignment assign{*GetEvaluateAssignment(update.ift.stmt)}; + + CheckAtomicConditionalUpdateStmt(update, source); + if (IsCheckForAssociated(update.cond)) { + if (!IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment should be a pointer-assignment when the condition is ASSOCIATED"_err_en_US); + } + } else { + if (IsPointerAssignment(assign)) { + context_.Say(source, + "The assignment cannot be a pointer-assignment except when the condition is ASSOCIATED"_err_en_US); + } + } + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::None)); +} + +void OmpStructureChecker::CheckAtomicUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + if (body.size() != 2) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two statements"_err_en_US); + return; + } + + auto [uec, cec]{CheckUpdateCapture(&body.front(), &body.back(), source)}; + if (!uec || !cec) { + // Diagnostics already emitted. 
+ return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; + auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; + + if (!maybeUpdate || !maybeCapture) { + context_.Say(source, + "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); + return; + } + + const evaluate::Assignment &update{*maybeUpdate}; + const evaluate::Assignment &capture{*maybeCapture}; + const SomeExpr &atom{update.lhs}; + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + int action; + + if (IsMaybeAtomicWrite(update)) { + action = Analysis::Write; + CheckAtomicWriteAssignment(update, uact.source); + } else { + action = Analysis::Update; + CheckAtomicUpdateAssignment(update, uact.source); + } + CheckAtomicCaptureAssignment(capture, atom, cact.source); + + if (IsPointerAssignment(update) != IsPointerAssignment(capture)) { + context_.Say(cact.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + + if (GetActionStmt(&body.front()).stmt == uact.stmt) { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(action, update.rhs), + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + } else { + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), + MakeAtomicAnalysisOp(action, update.rhs)); + } +} + +void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source) { + // There are two different variants of this: + // (1) conditional-update and capture separately: + // This form only allows single-statement updates, i.e. the update + // form "r = cond; if (r) ...)" is not allowed. 
+ // (2) conditional-update combined with capture in a single statement: + // This form does allow the condition to be calculated separately, + // i.e. "r = cond; if (r) ...". + // Regardless of what form it is, the actual update assignment is a + // proper write, i.e. "x = d", where d does not depend on x. + + AnalyzedCondStmt update; + SourcedActionStmt capture; + bool captureAlways{true}, captureFirst{true}; + + auto extractCapture{[&]() { + capture = update.iff; + captureAlways = false; + update.iff = SourcedActionStmt{}; + }}; + + auto classifyNonUpdate{[&](const SourcedActionStmt &action) { + // The non-update statement is either "r = cond" or the capture. + if (auto maybeAssign{GetEvaluateAssignment(action.stmt)}) { + if (update.cond == maybeAssign->lhs) { + // If this is "r = cond; if (r) ...", then update the condition. + update.cond = maybeAssign->rhs; + update.source = action.source; + // In this form, the update and the capture are combined into + // an IF-THEN-ELSE statement. + extractCapture(); + } else { + // Assume this is the capture-statement. + capture = action; + } + } + }}; + + if (body.size() == 2) { + // This could be + // - capture; conditional-update (in any order), or + // - r = cond; if (r) capture-update + const parser::ExecutionPartConstruct *st1{&body.front()}; + const parser::ExecutionPartConstruct *st2{&body.back()}; + // In either case, the conditional statement can be analyzed by + // AnalyzeConditionalStmt, whereas the other statement cannot. + if (auto maybeUpdate1{AnalyzeConditionalStmt(st1)}) { + update = *maybeUpdate1; + classifyNonUpdate(GetActionStmt(st2)); + captureFirst = false; + } else if (auto maybeUpdate2{AnalyzeConditionalStmt(st2)}) { + update = *maybeUpdate2; + classifyNonUpdate(GetActionStmt(st1)); + } else { + // None of the statements are conditional, this rules out the + // "r = cond; if (r) ..." and the "capture + conditional-update" + // variants. 
This could still be capture + write (which is classified + // as conditional-update-capture in the spec). + auto [uec, cec]{CheckUpdateCapture(st1, st2, source)}; + if (!uec || !cec) { + // Diagnostics already emitted. + return; + } + SourcedActionStmt uact{GetActionStmt(uec)}; + SourcedActionStmt cact{GetActionStmt(cec)}; + update.ift = uact; + capture = cact; + if (uec == st1) { + captureFirst = false; + } + } + } else if (body.size() == 1) { + if (auto maybeUpdate{AnalyzeConditionalStmt(&body.front())}) { + update = *maybeUpdate; + // This is the form with update and capture combined into an IF-THEN-ELSE + // statement. The capture-statement is always the ELSE branch. + extractCapture(); + } else { + goto invalid; + } + } else { + context_.Say(source, + "ATOMIC UPDATE COMPARE CAPTURE operation should contain one or two statements"_err_en_US); + return; + invalid: + context_.Say(source, + "Invalid body of ATOMIC UPDATE COMPARE CAPTURE operation"_err_en_US); + return; + } + + // The update must have a form `x = d` or `x => d`. + if (auto maybeWrite{GetEvaluateAssignment(update.ift.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, update.ift.source); + if (auto maybeCapture{GetEvaluateAssignment(capture.stmt)}) { + CheckAtomicCaptureAssignment(*maybeCapture, atom, capture.source); + + if (IsPointerAssignment(*maybeWrite) != + IsPointerAssignment(*maybeCapture)) { + context_.Say(capture.source, + "The update and capture assignments should both be pointer-assignments or both be non-pointer-assignments"_err_en_US); + return; + } + } else { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + return; + } + } else { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + return; + } + + // update.iff should be empty here, the capture statement should be + // stored in "capture". 
+ + // Fill out the analysis in the AST node. + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + bool condUnused{std::visit( + [](auto &&s) { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v) { + return true; + } else { + return false; + } }, - x.u); + update.cond.u)}; + + int updateWhen{!condUnused ? Analysis::IfTrue : 0}; + int captureWhen{!captureAlways ? Analysis::IfFalse : 0}; + + evaluate::Assignment updAssign{*GetEvaluateAssignment(update.ift.stmt)}; + evaluate::Assignment capAssign{*GetEvaluateAssignment(capture.stmt)}; + + if (captureFirst) { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + } else { + x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs)); + } +} + +void OmpStructureChecker::CheckAtomicRead( + const parser::OpenMPAtomicConstruct &x) { + // [6.0:190:5-7] + // A read structured block is read-statement, a read statement that has one + // of the following forms: + // v = x + // v => x + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Read cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC READ cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeRead->rhs}; + CheckAtomicReadAssignment(*maybeRead, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC READ operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC READ operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicWrite( + const parser::OpenMPAtomicConstruct &x) { + auto &dirSpec{std::get(x.t)}; + auto &block{std::get(x.t)}; + + // Write cannot be conditional or have a capture statement. 
+ if (x.IsCompare() || x.IsCapture()) { + context_.Say(dirSpec.source, + "ATOMIC WRITE cannot have COMPARE or CAPTURE clauses"_err_en_US); + return; + } + + const parser::Block &body{GetInnermostExecPart(block)}; + + if (body.size() == 1) { + SourcedActionStmt action{GetActionStmt(&body.front())}; + if (auto maybeWrite{GetEvaluateAssignment(action.stmt)}) { + const SomeExpr &atom{maybeWrite->lhs}; + CheckAtomicWriteAssignment(*maybeWrite, action.source); + + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::None)); + } else { + context_.Say( + x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); + } + } else { + context_.Say(x.source, + "ATOMIC WRITE operation should have a single statement"_err_en_US); + } +} + +void OmpStructureChecker::CheckAtomicUpdate( + const parser::OpenMPAtomicConstruct &x) { + auto &block{std::get(x.t)}; + + bool isConditional{x.IsCompare()}; + bool isCapture{x.IsCapture()}; + const parser::Block &body{GetInnermostExecPart(block)}; + + if (isConditional && isCapture) { + CheckAtomicConditionalUpdateCapture(x, body, x.source); + } else if (isConditional) { + CheckAtomicConditionalUpdate(x, body, x.source); + } else if (isCapture) { + CheckAtomicUpdateCapture(x, body, x.source); + } else { // update-only + CheckAtomicUpdateOnly(x, body, x.source); + } +} + +void OmpStructureChecker::Enter(const parser::OpenMPAtomicConstruct &x) { + // All of the following groups have the "exclusive" property, i.e. at + // most one clause from each group is allowed. + // The exclusivity-checking code should eventually be unified for all + // clauses, with clause groups defined in OMP.td. 
+ std::array atomic{llvm::omp::Clause::OMPC_read, + llvm::omp::Clause::OMPC_update, llvm::omp::Clause::OMPC_write}; + std::array memoryOrder{llvm::omp::Clause::OMPC_acq_rel, + llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_relaxed, + llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_seq_cst}; + + auto checkExclusive{[&](llvm::ArrayRef group, + std::string_view name, + const parser::OmpClauseList &clauses) { + const parser::OmpClause *present{nullptr}; + for (const parser::OmpClause &clause : clauses.v) { + llvm::omp::Clause id{clause.Id()}; + if (!llvm::is_contained(group, id)) { + continue; + } + if (present == nullptr) { + present = &clause; + continue; + } else if (id == present->Id()) { + // Ignore repetitions of the same clause, those will be diagnosed + // separately. + continue; + } + parser::MessageFormattedText txt( + "At most one clause from the '%s' group is allowed on ATOMIC construct"_err_en_US, + name.data()); + parser::Message message(clause.source, txt); + message.Attach(present->source, + "Previous clause from this group provided here"_en_US); + context_.Say(std::move(message)); + return; + } + }}; + + auto &dirSpec{std::get(x.t)}; + auto &dir{std::get(dirSpec.t)}; + PushContextAndClauseSets(dir.source, llvm::omp::Directive::OMPD_atomic); + llvm::omp::Clause kind{x.GetKind()}; + + checkExclusive(atomic, "atomic", dirSpec.Clauses()); + checkExclusive(memoryOrder, "memory-order", dirSpec.Clauses()); + + switch (kind) { + case llvm::omp::Clause::OMPC_read: + CheckAtomicRead(x); + break; + case llvm::omp::Clause::OMPC_write: + CheckAtomicWrite(x); + break; + case llvm::omp::Clause::OMPC_update: + CheckAtomicUpdate(x); + break; + default: + break; + } } void OmpStructureChecker::Leave(const parser::OpenMPAtomicConstruct &) { @@ -3237,7 +4341,6 @@ CHECK_SIMPLE_CLAUSE(Final, OMPC_final) CHECK_SIMPLE_CLAUSE(Flush, OMPC_flush) CHECK_SIMPLE_CLAUSE(Full, OMPC_full) CHECK_SIMPLE_CLAUSE(Grainsize, OMPC_grainsize) -CHECK_SIMPLE_CLAUSE(Hint, 
OMPC_hint) CHECK_SIMPLE_CLAUSE(Holds, OMPC_holds) CHECK_SIMPLE_CLAUSE(Inclusive, OMPC_inclusive) CHECK_SIMPLE_CLAUSE(Initializer, OMPC_initializer) @@ -3867,40 +4970,6 @@ void OmpStructureChecker::CheckIsLoopIvPartOfClause( } } } -// Following clauses have a separate node in parse-tree.h. -// Atomic-clause -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicRead, OMPC_read) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicWrite, OMPC_write) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicUpdate, OMPC_update) -CHECK_SIMPLE_PARSER_CLAUSE(OmpAtomicCapture, OMPC_capture) - -void OmpStructureChecker::Leave(const parser::OmpAtomicRead &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_read, - {llvm::omp::Clause::OMPC_release, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicWrite &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_write, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -void OmpStructureChecker::Leave(const parser::OmpAtomicUpdate &) { - CheckNotAllowedIfClause(llvm::omp::Clause::OMPC_update, - {llvm::omp::Clause::OMPC_acquire, llvm::omp::Clause::OMPC_acq_rel}); -} - -// OmpAtomic node represents atomic directive without atomic-clause. -// atomic-clause - READ,WRITE,UPDATE,CAPTURE. -void OmpStructureChecker::Leave(const parser::OmpAtomic &) { - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acquire)}) { - context_.Say(clause->source, - "Clause ACQUIRE is not allowed on the ATOMIC directive"_err_en_US); - } - if (const auto *clause{FindClause(llvm::omp::Clause::OMPC_acq_rel)}) { - context_.Say(clause->source, - "Clause ACQ_REL is not allowed on the ATOMIC directive"_err_en_US); - } -} // Restrictions specific to each clause are implemented apart from the // generalized restrictions. 
@@ -4854,21 +5923,6 @@ void OmpStructureChecker::Leave(const parser::OmpContextSelector &) { ExitDirectiveNest(ContextSelectorNest); } -std::optional OmpStructureChecker::GetDynamicType( - const common::Indirection &parserExpr) { - // Indirection parserExpr - // `- parser::Expr ^.value() - const parser::TypedExpr &typedExpr{parserExpr.value().typedExpr}; - // ForwardOwningPointer typedExpr - // `- GenericExprWrapper ^.get() - // `- std::optional ^->v - if (auto maybeExpr{typedExpr.get()->v}) { - return maybeExpr->GetType(); - } else { - return std::nullopt; - } -} - const std::list & OmpStructureChecker::GetTraitPropertyList( const parser::OmpTraitSelector &trait) { @@ -5258,7 +6312,7 @@ void OmpStructureChecker::CheckTraitCondition( const parser::OmpTraitProperty &property{properties.front()}; auto &scalarExpr{std::get(property.u)}; - auto maybeType{GetDynamicType(scalarExpr.thing)}; + auto maybeType{GetDynamicType(scalarExpr.thing.value())}; if (!maybeType || maybeType->category() != TypeCategory::Logical) { context_.Say(property.source, "%s trait requires a single LOGICAL expression"_err_en_US, diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index 5ea2039a83c3f..bf6fbf16d0646 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -48,6 +48,7 @@ static const OmpDirectiveSet noWaitClauseNotAllowedSet{ } // namespace llvm namespace Fortran::semantics { +struct AnalyzedCondStmt; // Mapping from 'Symbol' to 'Source' to keep track of the variables // used in multiple clauses @@ -142,15 +143,6 @@ class OmpStructureChecker void Leave(const parser::OmpClauseList &); void Enter(const parser::OmpClause &); - void Enter(const parser::OmpAtomicRead &); - void Leave(const parser::OmpAtomicRead &); - void Enter(const parser::OmpAtomicWrite &); - void Leave(const parser::OmpAtomicWrite &); - void Enter(const parser::OmpAtomicUpdate &); - void Leave(const parser::OmpAtomicUpdate 
&); - void Enter(const parser::OmpAtomicCapture &); - void Leave(const parser::OmpAtomic &); - void Enter(const parser::DoConstruct &); void Leave(const parser::DoConstruct &); @@ -189,8 +181,6 @@ class OmpStructureChecker void CheckAllowedMapTypes(const parser::OmpMapType::Value &, const std::list &); - std::optional GetDynamicType( - const common::Indirection &); const std::list &GetTraitPropertyList( const parser::OmpTraitSelector &); std::optional GetClauseFromProperty( @@ -260,14 +250,41 @@ class OmpStructureChecker void CheckDoWhile(const parser::OpenMPLoopConstruct &x); void CheckAssociatedLoopConstraints(const parser::OpenMPLoopConstruct &x); template bool IsOperatorValid(const T &, const D &); - void CheckAtomicMemoryOrderClause( - const parser::OmpAtomicClauseList *, const parser::OmpAtomicClauseList *); - void CheckAtomicUpdateStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureStmt(const parser::AssignmentStmt &); - void CheckAtomicWriteStmt(const parser::AssignmentStmt &); - void CheckAtomicCaptureConstruct(const parser::OmpAtomicCapture &); - void CheckAtomicCompareConstruct(const parser::OmpAtomicCompare &); - void CheckAtomicConstructStructure(const parser::OpenMPAtomicConstruct &); + + void CheckStorageOverlap(const evaluate::Expr &, + llvm::ArrayRef>, parser::CharBlock); + void CheckAtomicVariable( + const evaluate::Expr &, parser::CharBlock); + std::pair + CheckUpdateCapture(const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source); + void CheckAtomicCaptureAssignment(const evaluate::Assignment &capture, + const SomeExpr &atom, parser::CharBlock source); + void CheckAtomicReadAssignment( + const evaluate::Assignment &read, parser::CharBlock source); + void CheckAtomicWriteAssignment( + const evaluate::Assignment &write, parser::CharBlock source); + void CheckAtomicUpdateAssignment( + const evaluate::Assignment &update, parser::CharBlock source); + void 
CheckAtomicConditionalUpdateAssignment(const SomeExpr &cond, + parser::CharBlock condSource, const evaluate::Assignment &assign, + parser::CharBlock assignSource); + void CheckAtomicConditionalUpdateStmt( + const AnalyzedCondStmt &update, parser::CharBlock source); + void CheckAtomicUpdateOnly(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdate(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicUpdateCapture(const parser::OpenMPAtomicConstruct &x, + const parser::Block &body, parser::CharBlock source); + void CheckAtomicConditionalUpdateCapture( + const parser::OpenMPAtomicConstruct &x, const parser::Block &body, + parser::CharBlock source); + void CheckAtomicRead(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicWrite(const parser::OpenMPAtomicConstruct &x); + void CheckAtomicUpdate(const parser::OpenMPAtomicConstruct &x); + void CheckDistLinear(const parser::OpenMPLoopConstruct &x); void CheckSIMDNest(const parser::OpenMPConstruct &x); void CheckTargetNest(const parser::OpenMPConstruct &x); @@ -319,7 +336,6 @@ class OmpStructureChecker void EnterDirectiveNest(const int index) { directiveNest_[index]++; } void ExitDirectiveNest(const int index) { directiveNest_[index]--; } int GetDirectiveNest(const int index) { return directiveNest_[index]; } - template void CheckHintClause(D *, D *, std::string_view); inline void ErrIfAllocatableVariable(const parser::Variable &); inline void ErrIfLHSAndRHSSymbolsMatch( const parser::Variable &, const parser::Expr &); diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp index f1d2ba4078236..dc395ecf211b3 100644 --- a/flang/lib/Semantics/resolve-names.cpp +++ b/flang/lib/Semantics/resolve-names.cpp @@ -1638,11 +1638,8 @@ class OmpVisitor : public virtual DeclarationVisitor { messageHandler().set_currStmtSource(std::nullopt); } bool Pre(const 
parser::OpenMPAtomicConstruct &x) { - return common::visit(common::visitors{[&](const auto &u) -> bool { - AddOmpSourceRange(u.source); - return true; - }}, - x.u); + AddOmpSourceRange(x.source); + return true; } void Post(const parser::OpenMPAtomicConstruct &) { messageHandler().set_currStmtSource(std::nullopt); diff --git a/flang/lib/Semantics/rewrite-directives.cpp b/flang/lib/Semantics/rewrite-directives.cpp index 104a77885d276..b4fef2c881b67 100644 --- a/flang/lib/Semantics/rewrite-directives.cpp +++ b/flang/lib/Semantics/rewrite-directives.cpp @@ -51,23 +51,21 @@ class OmpRewriteMutator : public DirectiveRewriteMutator { bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { // Find top-level parent of the operation. - Symbol *topLevelParent{common::visit( - [&](auto &atomic) { - Symbol *symbol{nullptr}; - Scope *scope{ - &context_.FindScope(std::get(atomic.t).source)}; - do { - if (Symbol * parent{scope->symbol()}) { - symbol = parent; - } - scope = &scope->parent(); - } while (!scope->IsGlobal()); - - assert(symbol && - "Atomic construct must be within a scope associated with a symbol"); - return symbol; - }, - x.u)}; + Symbol *topLevelParent{[&]() { + Symbol *symbol{nullptr}; + Scope *scope{&context_.FindScope( + std::get(x.t).source)}; + do { + if (Symbol * parent{scope->symbol()}) { + symbol = parent; + } + scope = &scope->parent(); + } while (!scope->IsGlobal()); + + assert(symbol && + "Atomic construct must be within a scope associated with a symbol"); + return symbol; + }()}; // Get the `atomic_default_mem_order` clause from the top-level parent. 
std::optional defaultMemOrder; @@ -86,66 +84,48 @@ bool OmpRewriteMutator::Pre(parser::OpenMPAtomicConstruct &x) { return false; } - auto findMemOrderClause = - [](const std::list &clauses) { - return llvm::any_of(clauses, [](const auto &clause) { - return std::get_if(&clause.u); + auto findMemOrderClause{[](const parser::OmpClauseList &clauses) { + return llvm::any_of( + clauses.v, [](auto &clause) -> const parser::OmpClause * { + switch (clause.Id()) { + case llvm::omp::Clause::OMPC_acq_rel: + case llvm::omp::Clause::OMPC_acquire: + case llvm::omp::Clause::OMPC_relaxed: + case llvm::omp::Clause::OMPC_release: + case llvm::omp::Clause::OMPC_seq_cst: + return &clause; + default: + return nullptr; + } }); - }; - - // Get the clause list to which the new memory order clause must be added, - // only if there are no other memory order clauses present for this atomic - // directive. - std::list *clauseList = common::visit( - common::visitors{[&](parser::OmpAtomic &atomicConstruct) { - // OmpAtomic only has a single list of clauses. - auto &clauses{std::get( - atomicConstruct.t)}; - return !findMemOrderClause(clauses.v) ? &clauses.v - : nullptr; - }, - [&](auto &atomicConstruct) { - // All other atomic constructs have two lists of clauses. - auto &clausesLhs{std::get<0>(atomicConstruct.t)}; - auto &clausesRhs{std::get<2>(atomicConstruct.t)}; - return !findMemOrderClause(clausesLhs.v) && - !findMemOrderClause(clausesRhs.v) - ? &clausesRhs.v - : nullptr; - }}, - x.u); + }}; - // Add a memory order clause to the atomic directive. 
+ auto &dirSpec{std::get(x.t)}; + auto &clauseList{std::get>(dirSpec.t)}; if (clauseList) { - atomicDirectiveDefaultOrderFound_ = true; - switch (*defaultMemOrder) { - case common::OmpMemoryOrderType::Acq_Rel: - clauseList->emplace_back(common::visit( - common::visitors{[](parser::OmpAtomicRead &) -> parser::OmpClause { - return parser::OmpClause::Acquire{}; - }, - [](parser::OmpAtomicCapture &) -> parser::OmpClause { - return parser::OmpClause::AcqRel{}; - }, - [](auto &) -> parser::OmpClause { - // parser::{OmpAtomic, OmpAtomicUpdate, OmpAtomicWrite} - return parser::OmpClause::Release{}; - }}, - x.u)); - break; - case common::OmpMemoryOrderType::Relaxed: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::Relaxed{}}); - break; - case common::OmpMemoryOrderType::Seq_Cst: - clauseList->emplace_back( - parser::OmpClause{parser::OmpClause::SeqCst{}}); - break; - default: - // FIXME: Don't process other values at the moment since their validity - // depends on the OpenMP version (which is unavailable here). - break; + if (findMemOrderClause(*clauseList)) { + return false; } + } else { + clauseList = parser::OmpClauseList(decltype(parser::OmpClauseList::v){}); + } + + // Add a memory order clause to the atomic directive. + atomicDirectiveDefaultOrderFound_ = true; + switch (*defaultMemOrder) { + case common::OmpMemoryOrderType::Acq_Rel: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::AcqRel{}}); + break; + case common::OmpMemoryOrderType::Relaxed: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::Relaxed{}}); + break; + case common::OmpMemoryOrderType::Seq_Cst: + clauseList->v.emplace_back(parser::OmpClause{parser::OmpClause::SeqCst{}}); + break; + default: + // FIXME: Don't process other values at the moment since their validity + // depends on the OpenMP version (which is unavailable here). 
+ break; } return false; diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 index b82bd13622764..6f58e0939a787 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare-fail.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 index 88ec6fe910b9e..6729be6e5cf8b 100644 --- a/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 +++ b/flang/test/Lower/OpenMP/Todo/atomic-compare.f90 @@ -1,6 +1,6 @@ ! RUN: %not_todo_cmd %flang_fc1 -emit-fir -fopenmp -fopenmp-version=51 -o - %s 2>&1 | FileCheck %s -! CHECK: not yet implemented: OpenMP atomic compare +! CHECK: not yet implemented: OpenMP ATOMIC COMPARE program p integer :: x logical :: r diff --git a/flang/test/Lower/OpenMP/atomic-capture.f90 b/flang/test/Lower/OpenMP/atomic-capture.f90 index bbb08220af9d9..b8422c8f3b769 100644 --- a/flang/test/Lower/OpenMP/atomic-capture.f90 +++ b/flang/test/Lower/OpenMP/atomic-capture.f90 @@ -79,16 +79,16 @@ subroutine pointers_in_atomic_capture() !CHECK: %[[VAL_A_BOX_ADDR:.*]] = fir.box_addr %[[VAL_A_LOADED]] : (!fir.box>) -> !fir.ptr !CHECK: %[[VAL_B_LOADED:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR:.*]] = fir.box_addr %[[VAL_B_LOADED]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR]] : !fir.ptr !CHECK: %[[VAL_B_LOADED_2:.*]] = fir.load %[[VAL_B_DECLARE]]#0 : !fir.ref>> !CHECK: %[[VAL_B_BOX_ADDR_2:.*]] = fir.box_addr %[[VAL_B_LOADED_2]] : (!fir.box>) -> !fir.ptr -!CHECK: %[[VAL_B:.*]] = fir.load %[[VAL_B_BOX_ADDR_2]] : !fir.ptr !CHECK: omp.atomic.capture { !CHECK: omp.atomic.update 
%[[VAL_A_BOX_ADDR]] : !fir.ptr { !CHECK: ^bb0(%[[ARG:.*]]: i32): !CHECK: %[[TEMP:.*]] = arith.addi %[[ARG]], %[[VAL_B]] : i32 !CHECK: omp.yield(%[[TEMP]] : i32) !CHECK: } -!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 +!CHECK: omp.atomic.read %[[VAL_B_BOX_ADDR_2]] = %[[VAL_A_BOX_ADDR]] : !fir.ptr, !fir.ptr, i32 !CHECK: } !CHECK: return !CHECK: } diff --git a/flang/test/Lower/OpenMP/atomic-write.f90 b/flang/test/Lower/OpenMP/atomic-write.f90 index 13392ad76471f..6eded49b0b15d 100644 --- a/flang/test/Lower/OpenMP/atomic-write.f90 +++ b/flang/test/Lower/OpenMP/atomic-write.f90 @@ -44,9 +44,9 @@ end program OmpAtomicWrite !CHECK-LABEL: func.func @_QPatomic_write_pointer() { !CHECK: %[[X_REF:.*]] = fir.alloca !fir.box> {bindc_name = "x", uniq_name = "_QFatomic_write_pointerEx"} !CHECK: %[[X_DECL:.*]]:2 = hlfir.declare %[[X_REF]] {fortran_attrs = #fir.var_attrs, uniq_name = "_QFatomic_write_pointerEx"} : (!fir.ref>>) -> (!fir.ref>>, !fir.ref>>) -!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> !CHECK: %[[X_POINTEE_ADDR:.*]] = fir.box_addr %[[X_ADDR_BOX]] : (!fir.box>) -> !fir.ptr +!CHECK: %[[C1:.*]] = arith.constant 1 : i32 !CHECK: omp.atomic.write %[[X_POINTEE_ADDR]] = %[[C1]] : !fir.ptr, i32 !CHECK: %[[C2:.*]] = arith.constant 2 : i32 !CHECK: %[[X_ADDR_BOX:.*]] = fir.load %[[X_DECL]]#0 : !fir.ref>> diff --git a/flang/test/Parser/OpenMP/atomic-compare.f90 b/flang/test/Parser/OpenMP/atomic-compare.f90 deleted file mode 100644 index 5cd02698ff482..0000000000000 --- a/flang/test/Parser/OpenMP/atomic-compare.f90 +++ /dev/null @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s -! OpenMP version for documentation purposes only - it isn't used until Sema. -! This is testing for Parser errors that bail out before Sema. 
-program main - implicit none - integer :: i, j = 10 - logical :: r - - !CHECK: error: expected OpenMP construct - !$omp atomic compare write - r = i .eq. j + 1 - - !CHECK: error: expected end of line - !$omp atomic compare num_threads(4) - r = i .eq. j -end program main diff --git a/flang/test/Semantics/OpenMP/atomic-compare.f90 b/flang/test/Semantics/OpenMP/atomic-compare.f90 index 54492bf6a22a6..11e23e062bce7 100644 --- a/flang/test/Semantics/OpenMP/atomic-compare.f90 +++ b/flang/test/Semantics/OpenMP/atomic-compare.f90 @@ -44,46 +44,37 @@ !$omp end atomic ! Check for error conditions: - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic compare seq_cst seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the COMPARE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst compare seq_cst if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire compare if (b .eq. 
c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic compare acquire acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the COMPARE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire compare acquire if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed compare if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic compare relaxed relaxed if (b .eq. c) b = a - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the COMPARE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed compare relaxed if (b .eq. c) b = a - !ERROR: More than one FAIL clause not allowed on OpenMP ATOMIC construct + !ERROR: At most one FAIL clause can appear on the ATOMIC directive !$omp atomic fail(release) compare fail(release) if (c .eq. 
a) a = b !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index c13a11a8dd5dc..deb67e7614659 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -16,20 +16,21 @@ program sample !$omp atomic read hint(2) y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(3) y = y + 10 !$omp atomic update hint(5) y = x + y - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement y = x x = y !$omp end atomic - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic update hint(x) y = y * 1 @@ -46,7 +47,7 @@ program sample !$omp atomic hint(omp_lock_hint_speculative) x = y + x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint) read y = x @@ -69,36 +70,36 @@ program sample !$omp atomic hint(omp_lock_hint_contended + omp_sync_hint_nonspeculative) x = y + x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_uncontended + omp_sync_hint_contended) read y = x - !ERROR: Hint clause value is not a valid OpenMP synchronization value + !ERROR: The synchronization hint is not valid !$omp atomic hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative) y = y * 9 - !ERROR: Hint clause must have non-negative 
constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must have INTEGER type, but is REAL(4) !$omp atomic hint(1.0) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4) !$omp atomic hint(z + omp_sync_hint_nonspeculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(k + omp_sync_hint_speculative) read y = x - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp atomic hint(p(1) + omp_sync_hint_uncontended) write x = 10 * y !$omp atomic write hint(a) - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and y+x access the same storage x = y + x !$omp atomic hint(abs(-1)) write diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90 new file mode 100644 index 0000000000000..2619d235380f8 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-read.f90 @@ -0,0 +1,89 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC READ. Expect no diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v = x + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic read + v = x + + !$omp atomic read + v => x +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! 
Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic read + v = p(i) +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic read + !ERROR: Atomic variable x should be a scalar + v = x + + !$omp atomic read + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + v = y(2:4) +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic read + !ERROR: Within atomic operation x and x access the same storage + x = x + + ! Accessing same array, but not the same storage. Expect no diagnostics. + !$omp atomic read + y(1) = y(2) +end + +subroutine f05 + integer :: x, v + + !$omp atomic read + !ERROR: Atomic expression x+1_4 should be a variable + v = x + 1 +end + +subroutine f06 + character :: x, v + + !$omp atomic read + !ERROR: Atomic variable x cannot have CHARACTER type + v = x +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic read + !ERROR: Atomic variable x cannot be ALLOCATABLE + v = x +end + diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 new file mode 100644 index 0000000000000..64e31a7974b55 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -0,0 +1,77 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !$omp atomic update capture + x = v + x = x + 1 + y = x + !$omp end atomic +end + +subroutine f01 + integer :: x, y, v + + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two assignments + !$omp atomic update capture + x = v + block + x = x + 1 + y = x + end block + !$omp end atomic +end + +subroutine f02 + integer :: x, y + + ! The update and capture statements can be inside of a single BLOCK. + ! The end-directive is then optional. Expect no diagnostics. 
+ !$omp atomic update capture + block + x = x + 1 + y = x + end block +end + +subroutine f03 + integer :: x + + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !$omp atomic update capture + x = x + 1 + x = x + 2 + !$omp end atomic +end + +subroutine f04 + integer :: x, v + + !$omp atomic update capture + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + v = x + x = v + !$omp end atomic +end + +subroutine f05 + integer :: x, v, z + + !$omp atomic update capture + v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + !$omp end atomic +end + +subroutine f06 + integer :: x, v, z + + !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x + z = x + 1 + v = x + !$omp end atomic +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 new file mode 100644 index 0000000000000..0aee02d429dc0 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -0,0 +1,83 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, y + + ! The x is a direct argument of the + operator. Expect no diagnostics. + !$omp atomic update + x = x + (y - 1) +end + +subroutine f01 + integer :: x + + ! x + 0 is unusual, but legal. Expect no diagnostics. + !$omp atomic update + x = x + 0 +end + +subroutine f02 + integer :: x + + ! This is formally not allowed by the syntax restrictions of the spec, + ! but it's equivalent to either x+0 or x*1, both of which are legal. + ! Allow this case. Expect no diagnostics. 
+ !$omp atomic update + x = x +end + +subroutine f03 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator + x = (x + y) + 1 +end + +subroutine f04 + integer :: x + real :: y + + !$omp atomic update + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation + x = floor(x + y) +end + +subroutine f05 + integer :: x + real :: y + + !$omp atomic update + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + x = int(x + y) +end + +subroutine f06 + integer :: x, y + interface + function f(i, j) + integer :: f, i, j + end + end interface + + !$omp atomic update + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation + x = f(x, y) +end + +subroutine f07 + real :: x + integer :: y + + !$omp atomic update + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation + x = x ** y +end + +subroutine f08 + integer :: x, y + + !$omp atomic update + !ERROR: The atomic variable x should appear as an argument in the update operation + x = y +end diff --git a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 index 21a9b87d26345..3084376b4275d 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-overloaded-ops.f90 @@ -22,10 +22,10 @@ program sample x = x / y !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. y !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: A call to this function is not a valid ATOMIC UPDATE operation x = x .MYOPERATOR. 
y end program diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90 new file mode 100644 index 0000000000000..979568ebf7140 --- /dev/null +++ b/flang/test/Semantics/OpenMP/atomic-write.f90 @@ -0,0 +1,81 @@ +!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60 + +subroutine f00 + integer :: x, v + ! The end-directive is optional in ATOMIC WRITE. Expect no diagnostics. + !$omp atomic write + x = v + 1 + + !$omp atomic write + x = v + 3 + !$omp end atomic +end + +subroutine f01 + integer, pointer :: x, v + ! Intrinsic assignment and pointer assignment are both ok. Expect no + ! diagnostics. + !$omp atomic write + x = 2 * v + 3 + + !$omp atomic write + x => v +end + +subroutine f02(i) + integer :: i, v + interface + function p(i) + integer, pointer :: p + integer :: i + end + end interface + + ! Atomic variable can be a function reference. Expect no diagostics. + !$omp atomic write + p(i) = v +end + +subroutine f03 + integer :: x(3), y(5), v(3) + + !$omp atomic write + !ERROR: Atomic variable x should be a scalar + x = v + + !$omp atomic write + !ERROR: Atomic variable y(2_8:4_8:1_8) should be a scalar + y(2:4) = v +end + +subroutine f04 + integer :: x, y(3), v + + !$omp atomic write + !ERROR: Within atomic operation x and x+1_4 access the same storage + x = x + 1 + + ! Accessing same array, but not the same storage. Expect no diagnostics. 
+ !$omp atomic write + y(1) = y(2) +end + +subroutine f06 + character :: x, v + + !$omp atomic write + !ERROR: Atomic variable x cannot have CHARACTER type + x = v +end + +subroutine f07 + integer, allocatable :: x + integer :: v + + allocate(x) + + !$omp atomic write + !ERROR: Atomic variable x cannot be ALLOCATABLE + x = v +end + diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90 index 0e100871ea9b4..0dc8251ae356e 100644 --- a/flang/test/Semantics/OpenMP/atomic.f90 +++ b/flang/test/Semantics/OpenMP/atomic.f90 @@ -1,4 +1,4 @@ -! RUN: %python %S/../test_errors.py %s %flang -fopenmp +! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src use omp_lib ! Check OpenMP 2.13.6 atomic Construct @@ -11,9 +11,13 @@ a = b !$omp end atomic + !ERROR: ACQUIRE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic read acquire hint(OMP_LOCK_HINT_CONTENDED) a = b + !ERROR: RELEASE clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic release hint(OMP_LOCK_HINT_UNCONTENDED) write a = b @@ -22,39 +26,32 @@ a = a + 1 !$omp end atomic + !ERROR: HINT clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 + !ERROR: ACQ_REL clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic hint(1) acq_rel capture b = a a = a + 1 !$omp end atomic - !ERROR: expected end of line + !ERROR: At most one clause from the 'atomic' group is allowed on ATOMIC construct !$omp atomic read write + !ERROR: Atomic expression a+1._4 should be a variable a = a + 1 !$omp atomic a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 
'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic num_threads(4) a = a + 1 - !ERROR: expected end of line + !ERROR: ATOMIC UPDATE operation with CAPTURE should contain two statements + !ERROR: NUM_THREADS clause is not allowed on the ATOMIC directive !$omp atomic capture num_threads(4) a = a + 1 + !ERROR: RELAXED clause is not allowed on directive ATOMIC in OpenMP v3.1, try -fopenmp-version=50 !$omp atomic relaxed a = a + 1 - !ERROR: expected 'UPDATE' - !ERROR: expected 'WRITE' - !ERROR: expected 'COMPARE' - !ERROR: expected 'CAPTURE' - !ERROR: expected 'READ' - !$omp atomic num_threads write - a = a + 1 - !$omp end parallel end diff --git a/flang/test/Semantics/OpenMP/atomic01.f90 b/flang/test/Semantics/OpenMP/atomic01.f90 index 173effe86b69c..f700c381cadd0 100644 --- a/flang/test/Semantics/OpenMP/atomic01.f90 +++ b/flang/test/Semantics/OpenMP/atomic01.f90 @@ -14,322 +14,277 @@ ! At most one memory-order-clause may appear on the construct. 
!READ - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic read seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the READ directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst read seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic read acquire acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the READ directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire read acquire i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed read i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the 
READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic read relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the READ directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed read relaxed i = j !UPDATE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic update seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the UPDATE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst update seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The 
atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic update release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the UPDATE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release update release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic update relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the UPDATE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic 
relaxed update relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !CAPTURE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic capture seq_cst seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the CAPTURE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst capture seq_cst i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic capture release release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the CAPTURE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release capture release i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not 
allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic capture relaxed relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the CAPTURE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed capture relaxed i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel acq_rel capture i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic capture acq_rel acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQ_REL clause can appear on the CAPTURE directive + !ERROR: At most one ACQ_REL clause can appear on the ATOMIC directive !$omp atomic acq_rel capture acq_rel i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire acquire capture i = j j = k !$omp 
end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic capture acquire acquire i = j j = k !$omp end atomic - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one ACQUIRE clause can appear on the CAPTURE directive + !ERROR: At most one ACQUIRE clause can appear on the ATOMIC directive !$omp atomic acquire capture acquire i = j j = k !$omp end atomic !WRITE - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic write seq_cst seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one SEQ_CST clause can appear on the WRITE directive + !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst write seq_cst i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic write release release i = j - !ERROR: More than one memory order clause not allowed on 
OpenMP ATOMIC construct - !ERROR: At most one RELEASE clause can appear on the WRITE directive + !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release write release i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed write i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic write relaxed relaxed i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct - !ERROR: At most one RELAXED clause can appear on the WRITE directive + !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed write relaxed i = j !No atomic-clause - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELAXED clause can appear on the ATOMIC directive !$omp atomic relaxed relaxed - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one SEQ_CST clause can appear on the ATOMIC directive !$omp atomic seq_cst seq_cst - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct !ERROR: At most one RELEASE clause can appear on the ATOMIC directive !$omp atomic release release - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as 
an argument in the update operation i = j ! 2.17.7.3 ! At most one hint clause may appear on the construct. - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_speculative) hint(omp_sync_hint_speculative) read i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) read hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the READ directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic read hint(omp_sync_hint_uncontended) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) write i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) write hint(omp_sync_hint_nonspeculative) i = j - !ERROR: At most one HINT clause can appear on the WRITE directive + !ERROR: At 
most one HINT clause can appear on the ATOMIC directive !$omp atomic write hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) update hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the UPDATE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic update hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint(omp_sync_hint_nonspeculative) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_none) hint 
(omp_sync_hint_uncontended) - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_contended) hint(omp_sync_hint_speculative) capture i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic hint(omp_sync_hint_nonspeculative) capture hint(omp_sync_hint_nonspeculative) i = j j = k !$omp end atomic - !ERROR: At most one HINT clause can appear on the CAPTURE directive + !ERROR: At most one HINT clause can appear on the ATOMIC directive !$omp atomic capture hint(omp_sync_hint_none) hint (omp_sync_hint_uncontended) i = j j = k @@ -337,34 +292,26 @@ ! 2.17.7.4 ! If atomic-clause is read then memory-order-clause must not be acq_rel or release. - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic acq_rel read i = j - !ERROR: Clause ACQ_REL is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read acq_rel i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic release read i = j - !ERROR: Clause RELEASE is not allowed if clause READ appears on the ATOMIC directive !$omp atomic read release i = j ! 2.17.7.5 ! If atomic-clause is write then memory-order-clause must not be acq_rel or acquire. 
- !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acq_rel write i = j - !ERROR: Clause ACQ_REL is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acq_rel i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic acquire write i = j - !ERROR: Clause ACQUIRE is not allowed if clause WRITE appears on the ATOMIC directive !$omp atomic write acquire i = j @@ -372,33 +319,27 @@ ! 2.17.7.6 ! If atomic-clause is update or not present then memory-order-clause must not be acq_rel or acquire. - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acq_rel update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic acquire update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed if clause UPDATE appears on the ATOMIC directive !$omp atomic update acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQ_REL is not allowed on the ATOMIC directive !$omp atomic acq_rel - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j - !ERROR: Clause ACQUIRE is not allowed on the 
ATOMIC directive !$omp atomic acquire - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The atomic variable i should appear as an argument in the update operation i = j end program diff --git a/flang/test/Semantics/OpenMP/atomic02.f90 b/flang/test/Semantics/OpenMP/atomic02.f90 index c66085d00f157..45e41f2552965 100644 --- a/flang/test/Semantics/OpenMP/atomic02.f90 +++ b/flang/test/Semantics/OpenMP/atomic02.f90 @@ -28,36 +28,29 @@ program OmpAtomic !$omp atomic a = a/(b + 1) !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: The atomic variable c should appear as an argument in the update operation c = d !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. 
b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The /= operator is not a valid ATOMIC UPDATE operation l = a .NE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic m = m .AND. n @@ -76,32 +69,26 @@ program OmpAtomic !$omp atomic update a = a/(b + 1) !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The ** operator is not a valid ATOMIC UPDATE operation a = a**4 !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Invalid or missing operator in atomic update statement - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: Atomic variable c cannot have CHARACTER type + !ERROR: This is not a valid ATOMIC UPDATE operation c = c//d !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The < operator is not a valid ATOMIC UPDATE operation l = a .LT. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The <= operator is not a valid ATOMIC UPDATE operation l = a .LE. 
b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The == operator is not a valid ATOMIC UPDATE operation l = a .EQ. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The >= operator is not a valid ATOMIC UPDATE operation l = a .GE. b !$omp atomic update - !ERROR: Atomic update statement should be of form `l = l operator expr` OR `l = expr operator l` - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: The > operator is not a valid ATOMIC UPDATE operation l = a .GT. b !$omp atomic update m = m .AND. n diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index 76367495b9861..f5c189fd05318 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -25,28 +25,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur 
exactly once among the arguments of the top-level MAX operator z = MAX(y, 7, b, c) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8, a, d) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -60,26 +58,26 @@ program OmpAtomic y = MIN(y, 8) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(y, 4) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level OR operator z = IOR(y, 5) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level NEQV/EOR operator z = IEOR(y, 6) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MAX 
operator z = MAX(y, 7) !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level MIN operator z = MIN(y, 8) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = MOD(y, 9) !$omp atomic update - !ERROR: Invalid intrinsic procedure name in OpenMP ATOMIC (UPDATE) statement + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation x = ABS(x) end program OmpAtomic @@ -92,7 +90,7 @@ subroutine conflicting_types() type(simple) ::s z = 1 !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'z' + !ERROR: The atomic variable z should occur exactly once among the arguments of the top-level AND operator z = IAND(s%z, 4) end subroutine @@ -105,40 +103,37 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) :: s !$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MIN operator a = min(a, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a) !$omp atomic - !ERROR: Atomic update statement should be of the form `a = intrinsic_procedure(a, expr_list)` OR `a = intrinsic_procedure(expr_list, a)` a = min(b, a, b) !$omp atomic - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'a' + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level MAX operator a = max(b, a, b, a, b) 
!$omp atomic update - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'y' + !ERROR: The atomic variable y should occur exactly once among the arguments of the top-level MIN operator y = min(z, x) !$omp atomic z = max(z, y) !$omp atomic update - !ERROR: Expected scalar variable on the LHS of atomic update assignment statement - !ERROR: Intrinsic procedure arguments in atomic update statement must have exactly one occurence of 'k' + !ERROR: Atomic variable k should be a scalar + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation k = max(x, y) - + !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement x = min(x, k) !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - z =z + s%m + z = z + s%m end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index a9644ad95aa30..5c91ab5dc37e4 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -20,12 +20,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x 
= 1 + y !$omp atomic @@ -33,12 +31,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic @@ -46,12 +42,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic @@ -59,12 +53,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or 
explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic @@ -72,8 +64,7 @@ program OmpAtomic !$omp atomic m = n .AND. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic @@ -81,8 +72,7 @@ program OmpAtomic !$omp atomic m = n .OR. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic @@ -90,8 +80,7 @@ program OmpAtomic !$omp atomic m = n .EQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic @@ -99,8 +88,7 @@ program OmpAtomic !$omp atomic m = n .NEQV. m !$omp atomic - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. 
l !$omp atomic update @@ -108,12 +96,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y + 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 + y !$omp atomic update @@ -121,12 +107,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y - 1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1 - y !$omp atomic update @@ -134,12 +118,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y*1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' 
expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1*y !$omp atomic update @@ -147,12 +129,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = y/1 !$omp atomic update - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Exactly one occurence of 'x' expected on the RHS of atomic update assignment statement + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = 1/y !$omp atomic update @@ -160,8 +140,7 @@ program OmpAtomic !$omp atomic update m = n .AND. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level AND operator m = n .AND. l !$omp atomic update @@ -169,8 +148,7 @@ program OmpAtomic !$omp atomic update m = n .OR. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level OR operator m = n .OR. l !$omp atomic update @@ -178,8 +156,7 @@ program OmpAtomic !$omp atomic update m = n .EQV. 
m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level EQV operator m = n .EQV. l !$omp atomic update @@ -187,8 +164,7 @@ program OmpAtomic !$omp atomic update m = n .NEQV. m !$omp atomic update - !ERROR: Atomic update statement should be of form `m = m operator expr` OR `m = expr operator m` - !ERROR: Exactly one occurence of 'm' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable m should occur exactly once among the arguments of the top-level NEQV/EOR operator m = n .NEQV. l end program OmpAtomic @@ -204,35 +180,34 @@ subroutine more_invalid_atomic_update_stmts() type(some_type) p !$omp atomic - !ERROR: Invalid or missing operator in atomic update statement x = x !$omp atomic update - !ERROR: Invalid or missing operator in atomic update statement + !ERROR: This is not a valid ATOMIC UPDATE operation x = 1 !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and a*b access the same storage a = a * b + a !$omp atomic - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level * operator a = b * (a + 9) !$omp atomic update - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (a+b) access the same storage a = a * (a + b) !$omp atomic - !ERROR: Exactly one occurence of 'a' expected on the RHS of atomic update assignment statement + !ERROR: Within atomic operation a and (b+a) access the same storage a = (b + a) * a !$omp atomic - !ERROR: Atomic update statement should 
be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a * b + c !$omp atomic update - !ERROR: Atomic update statement should be of form `a = a operator expr` OR `a = expr operator a` + !ERROR: The atomic variable a should occur exactly once among the arguments of the top-level + operator a = a + b + c !$omp atomic @@ -243,23 +218,18 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement a = a + d !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = x * y / z !$omp atomic - !ERROR: Atomic update statement should be of form `p%m = p%m operator expr` OR `p%m = expr operator p%m` - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement + !ERROR: The atomic variable p%m should occur exactly once among the arguments of the top-level + operator p%m = x + y !$omp atomic update !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar REAL(4) and rank 1 array of REAL(4) - !ERROR: Expected scalar expression on the RHS of atomic update assignment statement - !ERROR: Exactly one occurence of 'p%m' expected on the RHS of atomic update assignment statement p%m = p%m + p%n end subroutine diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index 266268a212440..e0103be4cae4a 100644 --- 
a/flang/test/Semantics/OpenMP/atomic05.f90
+++ b/flang/test/Semantics/OpenMP/atomic05.f90
@@ -8,20 +8,20 @@ program OmpAtomic
  use omp_lib
  integer :: g, x
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
+ !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct
  !$omp atomic relaxed, seq_cst
  x = x + 1
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
+ !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct
  !$omp atomic read seq_cst, relaxed
  x = g
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
+ !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct
  !$omp atomic write relaxed, release
  x = 2 * 4
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
+ !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct
  !$omp atomic update release, seq_cst
- !ERROR: Invalid or missing operator in atomic update statement
+ !ERROR: This is not a valid ATOMIC UPDATE operation
  x = 10
- !ERROR: More than one memory order clause not allowed on OpenMP ATOMIC construct
+ !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct
  !$omp atomic capture release, seq_cst
  x = g
  g = x * 10
diff --git a/flang/test/Semantics/OpenMP/critical-hint-clause.f90 b/flang/test/Semantics/OpenMP/critical-hint-clause.f90
index 7ca8c858239f7..e9cfa49bf934e 100644
--- a/flang/test/Semantics/OpenMP/critical-hint-clause.f90
+++ b/flang/test/Semantics/OpenMP/critical-hint-clause.f90
@@ -18,7 +18,7 @@ program sample
  y = 2
  !$omp end critical (name)
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
  !$omp critical (name) hint(3)
  y = 2
  !$omp end critical (name)
@@ -27,12 +27,12 @@ program sample
  y = 2
  !$omp end critical (name)
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
  !$omp critical (name) hint(7)
  y = 2
  !$omp end critical (name)
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
  !ERROR: Must be a constant value
  !$omp critical (name) hint(x)
  y = 2
@@ -54,7 +54,7 @@ program sample
  y = 2
  !$omp end critical (name)
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
  !ERROR: Must be a constant value
  !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint)
  y = 2
@@ -84,35 +84,35 @@ program sample
  y = 2
  !$omp end critical (name)
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
  !$omp critical (name) hint(omp_sync_hint_uncontended + omp_sync_hint_contended)
  y = 2
  !$omp end critical (name)
- !ERROR: Hint clause value is not a valid OpenMP synchronization value
+ !ERROR: The synchronization hint is not valid
  !$omp critical (name) hint(omp_sync_hint_nonspeculative + omp_lock_hint_speculative)
  y = 2
  !$omp end critical (name)
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
  !ERROR: Must have INTEGER type, but is REAL(4)
  !$omp critical (name) hint(1.0)
  y = 2
  !$omp end critical (name)
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
  !ERROR: Operands of + must be numeric; have LOGICAL(4) and INTEGER(4)
  !$omp critical (name) hint(z + omp_sync_hint_nonspeculative)
  y = 2
  !$omp end critical (name)
- !ERROR: Hint clause must have non-negative constant integer expression
+ !ERROR: Synchronization hint must be a constant integer value
  !ERROR: Must be a constant value
  !$omp critical (name) hint(k + omp_sync_hint_speculative)
  y = 2
  !$omp end critical
(name) - !ERROR: Hint clause must have non-negative constant integer expression + !ERROR: Synchronization hint must be a constant integer value !ERROR: Must be a constant value !$omp critical (name) hint(p(1) + omp_sync_hint_uncontended) y = 2 diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 505cbc48fef90..677b933932b44 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -20,70 +20,64 @@ program sample !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable y(1_8:3_8:1_8) should be a scalar v = y(1:3) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression x*(10_4+x) should be a variable v = x * (10 + x) !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement + !ERROR: Atomic expression 4_4 should be a variable v = 4 !$omp atomic read - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE v = k !$omp atomic write - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = x !$omp atomic update - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = k + x * (v * x) !$omp atomic - !ERROR: k must not have ALLOCATABLE attribute + !ERROR: Atomic variable k cannot be ALLOCATABLE k = v * k !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'z%y' + !ERROR: Within atomic operation z%y and x+z%y access the same storage z%y = x + z%y !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + 
!ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic write - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Within atomic operation m and min(m,x,z%m)+k access the same storage m = min(m, x, z%m) + k !$omp atomic read - !ERROR: RHS expression on atomic assignment statement cannot access 'x' + !ERROR: Within atomic operation x and x access the same storage x = x !$omp atomic read - !ERROR: Expected scalar variable of intrinsic type on RHS of atomic assignment statement - !ERROR: RHS expression on atomic assignment statement cannot access 'm' + !ERROR: Atomic expression min(m,x,z%m)+k should be a variable m = min(m, x, z%m) + k !$omp atomic read !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar x = a - !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - a = x - !$omp atomic write !ERROR: No intrinsic or user-defined ASSIGNMENT(=) matches scalar INTEGER(4) and rank 1 array of INTEGER(4) - !ERROR: Expected scalar expression on the RHS of atomic assignment statement x = a !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement + !ERROR: Atomic variable a should be a scalar a = x !$omp atomic capture @@ -93,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: Atomic update statement should be of form `x = x operator expr` OR `x = expr operator x` + !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = b + (x*1) !$omp end atomic @@ -103,60 +97,58 @@ program sample !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with 
CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component x expected to be assigned in the second statement of ATOMIC CAPTURE construct v = x + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component x expected to be captured in the second statement of ATOMIC CAPTURE construct x = x + 10 v = b !$omp end atomic + !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture - !ERROR: Invalid ATOMIC CAPTURE construct statements. 
Expected one of [update-stmt, capture-stmt], [capture-stmt, update-stmt], or [capture-stmt, write-stmt] v = 1 x = 4 !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component z%y expected to be assigned in the second statement of ATOMIC CAPTURE construct x = z%y + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component z%m expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 x = z%y !$omp end atomic !$omp atomic capture - !ERROR: Captured variable/array element/derived-type component y(2) expected to be assigned in the second statement of ATOMIC CAPTURE construct x = y(2) + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: Updated variable/array element/derived-type component y(1) expected to be captured in the second statement of ATOMIC CAPTURE construct + !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 x = y(2) !$omp end atomic !$omp atomic read - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable r cannot have CHARACTER type l = r !$omp atomic write - !ERROR: Expected scalar variable on the LHS of atomic assignment statement - !ERROR: Expected scalar expression on the RHS of atomic assignment statement + !ERROR: Atomic variable l cannot have CHARACTER type l = r end program diff --git a/flang/test/Semantics/OpenMP/requires-atomic01.f90 b/flang/test/Semantics/OpenMP/requires-atomic01.f90 index ae9fd086015dd..e8817c3f5ef61 100644 --- 
a/flang/test/Semantics/OpenMP/requires-atomic01.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic01.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> SeqCst !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> SeqCst !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> SeqCst !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> SeqCst !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! 
CHECK: OmpMemoryOrderClause -> OmpClause -> SeqCst + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> SeqCst !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> SeqCst - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK-NOT: OmpClause -> SeqCst + ! CHECK: OmpClause -> Relaxed !$omp atomic capture relaxed i = j j = j + 1 diff --git a/flang/test/Semantics/OpenMP/requires-atomic02.f90 b/flang/test/Semantics/OpenMP/requires-atomic02.f90 index 4976a9667eb78..a3724a83456fd 100644 --- a/flang/test/Semantics/OpenMP/requires-atomic02.f90 +++ b/flang/test/Semantics/OpenMP/requires-atomic02.f90 @@ -10,20 +10,23 @@ program requires ! READ ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Acquire + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK: OmpClause -> AcqRel !$omp atomic read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Read !$omp atomic relaxed read i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicRead - ! 
CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Acquire - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Read + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic read relaxed i = j @@ -31,20 +34,23 @@ program requires ! WRITE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK: OmpClause -> AcqRel !$omp atomic write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Write !$omp atomic relaxed write i = j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicWrite - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Write + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic write relaxed i = j @@ -52,31 +58,34 @@ program requires ! UPDATE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK: OmpClause -> AcqRel !$omp atomic update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! 
CHECK: OmpClause -> Update !$omp atomic relaxed update i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicUpdate - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Update + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic update relaxed i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Release + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> AcqRel !$omp atomic i = i + j - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomic - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> Release - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed !$omp atomic relaxed i = i + j @@ -84,24 +93,27 @@ program requires ! CAPTURE ! ---------------------------------------------------------------------------- - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK: OmpMemoryOrderClause -> OmpClause -> AcqRel + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK: OmpClause -> Capture + ! CHECK: OmpClause -> AcqRel !$omp atomic capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Relaxed + ! CHECK: OmpClause -> Capture !$omp atomic relaxed capture i = j j = j + 1 !$omp end atomic - ! CHECK-LABEL: OpenMPAtomicConstruct -> OmpAtomicCapture - ! CHECK-NOT: OmpMemoryOrderClause -> OmpClause -> AcqRel - ! CHECK: OmpMemoryOrderClause -> OmpClause -> Relaxed + ! CHECK-LABEL: OpenMPAtomicConstruct + ! CHECK-NOT: OmpClause -> AcqRel + ! CHECK: OmpClause -> Capture + ! 
CHECK: OmpClause -> Relaxed
 !$omp atomic capture relaxed
 i = j
 j = j + 1

>From 5928e6c450ed9d91a8130fd489a3fabab830140c Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:41:16 -0500
Subject: [PATCH 05/30] Restart build

>From 5a9cb4aaca6c7cd079f24036b233dbfd3c6278d4 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Tue, 29 Apr 2025 14:45:28 -0500
Subject: [PATCH 06/30] Restart build

>From 6273bb856c66fe110731e25293be0f9a1f02c64a Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 07:54:54 -0500
Subject: [PATCH 07/30] Restart build

>From 3594a0fcab499d7509c2bd4620b2c47feb74a481 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 08:18:01 -0500
Subject: [PATCH 08/30] Replace %openmp_flags with -fopenmp in tests, add REQUIRES where needed

---
 flang/test/Semantics/OpenMP/atomic-read.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-capture.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-update-only.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic-write.f90 | 2 +-
 flang/test/Semantics/OpenMP/atomic.f90 | 2 ++
 5 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/flang/test/Semantics/OpenMP/atomic-read.f90 b/flang/test/Semantics/OpenMP/atomic-read.f90
index 2619d235380f8..73899a9ff37f2 100644
--- a/flang/test/Semantics/OpenMP/atomic-read.f90
+++ b/flang/test/Semantics/OpenMP/atomic-read.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
index 64e31a7974b55..c427ba07d43d8 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, y, v
diff --git a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90
index 0aee02d429dc0..4595e02d01456 100644
--- a/flang/test/Semantics/OpenMP/atomic-update-only.f90
+++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, y
diff --git a/flang/test/Semantics/OpenMP/atomic-write.f90 b/flang/test/Semantics/OpenMP/atomic-write.f90
index 979568ebf7140..7965ad2dc7dbf 100644
--- a/flang/test/Semantics/OpenMP/atomic-write.f90
+++ b/flang/test/Semantics/OpenMP/atomic-write.f90
@@ -1,4 +1,4 @@
-!RUN: %python %S/../test_errors.py %s %flang %openmp_flags -fopenmp-version=60
+!RUN: %python %S/../test_errors.py %s %flang -fopenmp -fopenmp-version=60
 subroutine f00
 integer :: x, v
diff --git a/flang/test/Semantics/OpenMP/atomic.f90 b/flang/test/Semantics/OpenMP/atomic.f90
index 0dc8251ae356e..2caa161507d49 100644
--- a/flang/test/Semantics/OpenMP/atomic.f90
+++ b/flang/test/Semantics/OpenMP/atomic.f90
@@ -1,3 +1,5 @@
+! REQUIRES: openmp_runtime
+
 ! RUN: %python %S/../test_errors.py %s %flang -fopenmp -J /work2/kparzysz/git/llvm.org/b/x86/runtimes/runtimes-bins/openmp/runtime/src
 use omp_lib
 ! Check OpenMP 2.13.6 atomic Construct

>From 0142671339663ba09c169800f909c0d48c0ad38b Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek
Date: Wed, 30 Apr 2025 09:06:20 -0500
Subject: [PATCH 09/30] Fix examples

---
 flang/examples/FeatureList/FeatureList.cpp | 10 -------
 .../FlangOmpReport/FlangOmpReportVisitor.cpp | 26 +++++--------------
 2 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/flang/examples/FeatureList/FeatureList.cpp b/flang/examples/FeatureList/FeatureList.cpp
index d1407cf0ef239..a36b8719e365d 100644
--- a/flang/examples/FeatureList/FeatureList.cpp
+++ b/flang/examples/FeatureList/FeatureList.cpp
@@ -445,13 +445,6 @@ struct NodeVisitor {
 READ_FEATURE(ObjectDecl)
 READ_FEATURE(OldParameterStmt)
 READ_FEATURE(OmpAlignedClause)
- READ_FEATURE(OmpAtomic)
- READ_FEATURE(OmpAtomicCapture)
- READ_FEATURE(OmpAtomicCapture::Stmt1)
- READ_FEATURE(OmpAtomicCapture::Stmt2)
- READ_FEATURE(OmpAtomicRead)
- READ_FEATURE(OmpAtomicUpdate)
- READ_FEATURE(OmpAtomicWrite)
 READ_FEATURE(OmpBeginBlockDirective)
 READ_FEATURE(OmpBeginLoopDirective)
 READ_FEATURE(OmpBeginSectionsDirective)
@@ -480,7 +473,6 @@ struct NodeVisitor {
 READ_FEATURE(OmpIterationOffset)
 READ_FEATURE(OmpIterationVector)
 READ_FEATURE(OmpEndAllocators)
- READ_FEATURE(OmpEndAtomic)
 READ_FEATURE(OmpEndBlockDirective)
 READ_FEATURE(OmpEndCriticalDirective)
 READ_FEATURE(OmpEndLoopDirective)
@@ -566,8 +558,6 @@ struct NodeVisitor {
 READ_FEATURE(OpenMPDeclareTargetConstruct)
 READ_FEATURE(OmpMemoryOrderType)
 READ_FEATURE(OmpMemoryOrderClause)
- READ_FEATURE(OmpAtomicClause)
- READ_FEATURE(OmpAtomicClauseList)
 READ_FEATURE(OmpAtomicDefaultMemOrderClause)
 READ_FEATURE(OpenMPFlushConstruct)
 READ_FEATURE(OpenMPLoopConstruct)
diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
index dbbf86a6c6151..18a4107f9a9d1 100644
--- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp
+++ 
b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -74,25 +74,19 @@ SourcePosition OpenMPCounterVisitor::getLocation(const OpenMPConstruct &c) { // the directive field. [&](const auto &c) -> SourcePosition { const CharBlock &source{std::get<0>(c.t).source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPAtomicConstruct &c) -> SourcePosition { - return std::visit( - [&](const auto &o) -> SourcePosition { - const CharBlock &source{std::get(o.t).source}; - return parsing->allCooked() - .GetSourcePositionRange(source) - ->first; - }, - c.u); + const CharBlock &source{c.source}; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPSectionConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, [&](const OpenMPUtilityConstruct &c) -> SourcePosition { const CharBlock &source{c.source}; - return (parsing->allCooked().GetSourcePositionRange(source))->first; + return parsing->allCooked().GetSourcePositionRange(source)->first; }, }, c.u); @@ -157,14 +151,8 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - return std::visit( - [&](const auto &c) { - // Get source from the verbatim fields - const CharBlock &source{std::get(c.t).source}; - return "atomic-" + - normalize_construct_name(source.ToString()); - }, - c.u); + const CharBlock &source{c.source}; + return normalize_construct_name(source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; >From c158867527e0a5e2b67146784386675e0d414c71 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:19 -0500 Subject: 
[PATCH 10/30] Fix example --- .../FlangOmpReport/FlangOmpReportVisitor.cpp | 5 +++-- flang/test/Examples/omp-atomic.f90 | 16 +++++++++++----- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp index 18a4107f9a9d1..b0964845d3b99 100644 --- a/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp +++ b/flang/examples/FlangOmpReport/FlangOmpReportVisitor.cpp @@ -151,8 +151,9 @@ std::string OpenMPCounterVisitor::getName(const OpenMPConstruct &c) { return normalize_construct_name(source.ToString()); }, [&](const OpenMPAtomicConstruct &c) -> std::string { - const CharBlock &source{c.source}; - return normalize_construct_name(source.ToString()); + auto &dirSpec = std::get(c.t); + auto &dirName = std::get(dirSpec.t); + return normalize_construct_name(dirName.source.ToString()); }, [&](const OpenMPUtilityConstruct &c) -> std::string { const CharBlock &source{c.source}; diff --git a/flang/test/Examples/omp-atomic.f90 b/flang/test/Examples/omp-atomic.f90 index dcca34b633a3e..934f84f132484 100644 --- a/flang/test/Examples/omp-atomic.f90 +++ b/flang/test/Examples/omp-atomic.f90 @@ -26,25 +26,31 @@ ! CHECK:--- ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 9 -! CHECK-NEXT: construct: atomic-read +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: -! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: - clause: read ! CHECK-NEXT: details: '' +! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 12 -! CHECK-NEXT: construct: atomic-write +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: ! CHECK-NEXT: - clause: seq_cst +! CHECK-NEXT: details: 'name_modifier=atomic;' +! CHECK-NEXT: - clause: write ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 16 -! CHECK-NEXT: construct: atomic-capture +! 
CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: +! CHECK-NEXT: - clause: capture +! CHECK-NEXT: details: 'name_modifier=atomic;name_modifier=atomic;' ! CHECK-NEXT: - clause: seq_cst ! CHECK-NEXT: details: '' ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 21 -! CHECK-NEXT: construct: atomic-atomic +! CHECK-NEXT: construct: atomic ! CHECK-NEXT: clauses: [] ! CHECK-NEXT:- file: '{{[^"]*}}omp-atomic.f90' ! CHECK-NEXT: line: 8 >From ddacb71299d1fefd37b566e3caed45031584aac8 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Wed, 30 Apr 2025 12:01:47 -0500 Subject: [PATCH 11/30] Remove reference to *nullptr --- flang/lib/Lower/OpenMP/OpenMP.cpp | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ab868df76d298..4d9b375553c1e 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -3770,9 +3770,12 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); - mlir::Operation *secondOp = genAtomicOperation( - converter, loc, stmtCtx, analysis.op1.what, atomAddr, atom, - *get(analysis.op1.expr), hint, memOrder, atomicAt, prepareAt); + mlir::Operation *secondOp = nullptr; + if (analysis.op1.what != analysis.None) { + secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, + atomAddr, atom, *get(analysis.op1.expr), + hint, memOrder, atomicAt, prepareAt); + } if (secondOp) { builder.setInsertionPointAfter(secondOp); >From 4546997f82dfe32b79b2bd0e2b65974991ab55da Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 2 May 2025 18:49:05 -0500 Subject: [PATCH 12/30] Updates and improvements --- flang/include/flang/Parser/parse-tree.h | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 107 +++-- flang/lib/Semantics/check-omp-structure.cpp | 375 ++++++++++++++---- 
flang/lib/Semantics/check-omp-structure.h | 1 + .../Todo/atomic-capture-implicit-cast.f90 | 48 --- .../Lower/OpenMP/atomic-implicit-cast.f90 | 2 - .../Semantics/OpenMP/atomic-hint-clause.f90 | 2 +- .../OpenMP/atomic-update-capture.f90 | 8 +- .../OpenMP/omp-atomic-assignment-stmt.f90 | 16 +- 9 files changed, 381 insertions(+), 180 deletions(-) delete mode 100644 flang/test/Lower/OpenMP/Todo/atomic-capture-implicit-cast.f90 diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 77f57b1cb85c7..8213fe33edbd0 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -4859,7 +4859,7 @@ struct OpenMPAtomicConstruct { struct Op { int what; - TypedExpr expr; + AssignmentStmt::TypedAssignment assign; }; TypedExpr atom, cond; Op op0, op1; diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6177b59199481..7b6c22095d723 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2673,21 +2673,46 @@ getAtomicMemoryOrder(lower::AbstractConverter &converter, static mlir::Operation * // genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value toAddr = fir::getBase(converter.genExprAddr(expr, stmtCtx, &loc)); + mlir::Value storeAddr = + 
fir::getBase(converter.genExprAddr(assign.lhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); + mlir::Type storeType = fir::unwrapRefType(storeAddr.getType()); + + mlir::Value toAddr = [&]() { + if (atomType == storeType) + return storeAddr; + return builder.createTemporary(loc, atomType, ".tmp.atomval"); + }(); builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, toAddr, mlir::TypeAttr::get(atomType), hint, memOrder); + + if (atomType != storeType) { + lower::ExprToValueMap overrides; + // The READ operation could be a part of UPDATE CAPTURE, so make sure + // we don't emit extra code into the body of the atomic op. + builder.restoreInsertionPoint(postAt); + mlir::Value load = builder.create(loc, toAddr); + overrides.try_emplace(&atom, load); + + converter.overrideExprValues(&overrides); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); + converter.resetExprOverrides(); + + builder.create(loc, value, storeAddr); + } builder.restoreInsertionPoint(saved); return op; } @@ -2695,16 +2720,18 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, static mlir::Operation * // genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, - const semantics::SomeExpr &atom, const semantics::SomeExpr &expr, - mlir::IntegerAttr hint, + const semantics::SomeExpr &atom, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); - mlir::Value value = 
fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + mlir::Value value = + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); mlir::Value converted = builder.createConvert(loc, atomType, value); @@ -2719,19 +2746,20 @@ static mlir::Operation * genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::StatementContext &stmtCtx, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); - builder.restoreInsertionPoint(prepareAt); + builder.restoreInsertionPoint(preAt); mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(expr)}; + std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrResizeOf(arg, atom)) { @@ -2751,7 +2779,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, converter.overrideExprValues(&overrides); mlir::Value updated = - fir::getBase(converter.genExprValue(expr, stmtCtx, &loc)); + fir::getBase(converter.genExprValue(assign.rhs, stmtCtx, &loc)); mlir::Value converted = builder.createConvert(loc, atomType, updated); builder.create(loc, converted); converter.resetExprOverrides(); @@ -2764,20 +2792,21 @@ static mlir::Operation * genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, 
lower::StatementContext &stmtCtx, int action, mlir::Value atomAddr, const semantics::SomeExpr &atom, - const semantics::SomeExpr &expr, mlir::IntegerAttr hint, + const evaluate::Assignment &assign, mlir::IntegerAttr hint, mlir::omp::ClauseMemoryOrderKindAttr memOrder, + fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, - fir::FirOpBuilder::InsertPoint prepareAt) { + fir::FirOpBuilder::InsertPoint postAt) { switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: - return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Write: - return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicWrite(converter, loc, stmtCtx, atomAddr, atom, assign, hint, + memOrder, preAt, atomicAt, postAt); case parser::OpenMPAtomicConstruct::Analysis::Update: - return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, expr, hint, - memOrder, atomicAt, prepareAt); + return genAtomicUpdate(converter, loc, stmtCtx, atomAddr, atom, assign, + hint, memOrder, preAt, atomicAt, postAt); default: return nullptr; } @@ -3724,6 +3753,15 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { } return ""s; }; + auto assignStr = [&](const parser::AssignmentStmt::TypedAssignment &assign) { + if (auto *maybe = assign.get(); maybe && maybe->v) { + std::string str; + llvm::raw_string_ostream os(str); + maybe->v->AsFortran(os); + return str; + } + return ""s; + }; const SomeExpr &atom = *analysis.atom.get()->v; @@ -3732,11 +3770,11 @@ dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { llvm::errs() << " cond: " << exprStr(analysis.cond) << "\n"; llvm::errs() << " op0 {\n"; llvm::errs() << " what: " << whatStr(analysis.op0.what) << "\n"; - 
llvm::errs() << " expr: " << exprStr(analysis.op0.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op0.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << " op1 {\n"; llvm::errs() << " what: " << whatStr(analysis.op1.what) << "\n"; - llvm::errs() << " expr: " << exprStr(analysis.op1.expr) << "\n"; + llvm::errs() << " assign: " << assignStr(analysis.op1.assign) << "\n"; llvm::errs() << " }\n"; llvm::errs() << "}\n"; } @@ -3745,8 +3783,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, semantics::SemanticsContext &semaCtx, lower::pft::Evaluation &eval, const parser::OpenMPAtomicConstruct &construct) { - auto get = [](const parser::TypedExpr &expr) -> const semantics::SomeExpr * { - if (auto *maybe = expr.get(); maybe && maybe->v) { + auto get = [](auto &&typedWrapper) -> decltype(&*typedWrapper.get()->v) { + if (auto *maybe = typedWrapper.get(); maybe && maybe->v) { return &*maybe->v; } else { return nullptr; @@ -3774,8 +3812,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, int action0 = analysis.op0.what & analysis.Action; int action1 = analysis.op1.what & analysis.Action; mlir::Operation *captureOp = nullptr; - fir::FirOpBuilder::InsertPoint atomicAt; - fir::FirOpBuilder::InsertPoint prepareAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint preAt = builder.saveInsertionPoint(); + fir::FirOpBuilder::InsertPoint atomicAt, postAt; if (construct.IsCapture()) { // Capturing operation. @@ -3784,7 +3822,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, captureOp = builder.create(loc, hint, memOrder); // Set the non-atomic insertion point to before the atomic.capture. 
- prepareAt = getInsertionPointBefore(captureOp); + preAt = getInsertionPointBefore(captureOp); mlir::Block *block = builder.createBlock(&captureOp->getRegion(0)); builder.setInsertionPointToEnd(block); @@ -3792,6 +3830,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, // atomic.capture. mlir::Operation *term = builder.create(loc); atomicAt = getInsertionPointBefore(term); + postAt = getInsertionPointAfter(captureOp); hint = nullptr; memOrder = nullptr; } else { @@ -3799,20 +3838,20 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, assert(action0 != analysis.None && action1 == analysis.None && "Expexcing single action"); assert(!(analysis.op0.what & analysis.Condition)); - atomicAt = prepareAt; + postAt = atomicAt = preAt; } mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, - *get(analysis.op0.expr), hint, memOrder, atomicAt, prepareAt); + *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); assert(firstOp && "Should have created an atomic operation"); atomicAt = getInsertionPointAfter(firstOp); mlir::Operation *secondOp = nullptr; if (analysis.op1.what != analysis.None) { secondOp = genAtomicOperation(converter, loc, stmtCtx, analysis.op1.what, - atomAddr, atom, *get(analysis.op1.expr), - hint, memOrder, atomicAt, prepareAt); + atomAddr, atom, *get(analysis.op1.assign), + hint, memOrder, preAt, atomicAt, postAt); } if (secondOp) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 201b38bd05ff3..f7753a5e5cc59 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -86,9 +86,13 @@ static const parser::ArrayElement *GetArrayElementFromObj( return nullptr; } -static bool IsVarOrFunctionRef(const SomeExpr &expr) { - return evaluate::UnwrapProcedureRef(expr) != nullptr || - evaluate::IsVariable(expr); +static bool 
IsVarOrFunctionRef(const MaybeExpr &expr) { + if (expr) { + return evaluate::UnwrapProcedureRef(*expr) != nullptr || + evaluate::IsVariable(*expr); + } else { + return false; + } } static std::optional GetEvaluateExpr(const parser::Expr &parserExpr) { @@ -2838,6 +2842,12 @@ static std::pair SplitAssignmentSource( namespace atomic { +template static void MoveAppend(V &accum, V &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } +} + enum class Operator { Unk, // Operators that are officially allowed in the update operation @@ -3137,16 +3147,108 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (Append(v, std::move(results)), ...); + (MoveAppend(v, std::move(results)), ...); return v; } +}; -private: - static void Append(Result &acc, Result &&data) { - for (auto &&s : data) { - acc.push_back(std::move(s)); +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). 
+ return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + auto copy{x.derived()}; + return {evaluate::AsGenericExpr(std::move(copy)), {}}; } } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (MoveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; }; } // namespace atomic @@ -3265,6 +3367,22 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } +static MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. + return atomic::ConvertCollector{}(x).first; +} + +static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. 
+ if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { // Both expr and x have the form of SomeType(SomeKind(...)[1]). // Check if expr is @@ -3282,6 +3400,10 @@ bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { } } +bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} + static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { if (value) { expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), @@ -3289,11 +3411,20 @@ static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { } } +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( - int what, const MaybeExpr &maybeExpr = std::nullopt) { + int what, + const std::optional &maybeAssign = std::nullopt) { parser::OpenMPAtomicConstruct::Analysis::Op operation; operation.what = what; - SetExpr(operation.expr, maybeExpr); + SetAssignment(operation.assign, maybeAssign); return operation; } @@ -3316,7 +3447,7 @@ static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( // }; // struct Op { // int what; - // TypedExpr expr; + // TypedAssignment assign; // }; // TypedExpr atom, cond; // Op op0, op1; @@ -3340,6 +3471,16 @@ void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, } } +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + /// Check 
if `expr` satisfies the following conditions for x and v: /// /// [6.0:189:10-12] @@ -3383,9 +3524,9 @@ OmpStructureChecker::CheckUpdateCapture( // // The two allowed cases are: // x = ... atomic-var = ... - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // or - // ... = x capture-var = atomic-var + // ... = x capture-var = atomic-var (with optional converts) // x = ... atomic-var = ... // // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture @@ -3394,6 +3535,8 @@ OmpStructureChecker::CheckUpdateCapture( // // If the two statements don't fit these criteria, return a pair of default- // constructed values. + using ReturnTy = std::pair; SourcedActionStmt act1{GetActionStmt(ec1)}; SourcedActionStmt act2{GetActionStmt(ec2)}; @@ -3409,86 +3552,155 @@ OmpStructureChecker::CheckUpdateCapture( auto isUpdateCapture{ [](const evaluate::Assignment &u, const evaluate::Assignment &c) { - return u.lhs == c.rhs; + return IsSameOrConvertOf(c.rhs, u.lhs); }}; // Do some checks that narrow down the possible choices for the update // and the capture statements. This will help to emit better diagnostics. - bool couldBeCapture1 = IsVarOrFunctionRef(as1.rhs); - bool couldBeCapture2 = IsVarOrFunctionRef(as2.rhs); + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. 
+ bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + + // |cbu1 cbu2| + // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 + int det{int(cbu1) * int(cbc2) - int(cbu2) * int(cbc1)}; + + auto errorCaptureShouldRead{[&](const parser::CharBlock &source, + const std::string &expr) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read %s"_err_en_US, + expr); + }}; - if (couldBeCapture1) { - if (couldBeCapture2) { - if (isUpdateCapture(as2, as1)) { - if (isUpdateCapture(as1, as2)) { - // If both statements could be captures and both could be updates, - // emit a warning about the ambiguity. - context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement"_warn_en_US); - } - return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); - } else if (isUpdateCapture(as1, as2)) { + auto errorNeitherWorks{[&]() { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture"_err_en_US); + }}; + + auto makeSelectionFromDet{[&](int det) -> ReturnTy { + // If det != 0, then the checks unambiguously suggest a specific + // categorization. + // If det == 0, then this function should be called only if the + // checks haven't ruled out any possibility, i.e. when both assignments + // could still be either updates or captures. 
+ if (det > 0) { + // as1 is update, as2 is capture + if (isUpdateCapture(as1, as2)) { return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); } else { - context_.Say(source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s or %s"_err_en_US, - as1.rhs.AsFortran(), as2.rhs.AsFortran()); + errorCaptureShouldRead(act2.source, as1.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } else { // !couldBeCapture2 + } else if (det < 0) { + // as2 is update, as1 is capture if (isUpdateCapture(as2, as1)) { return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } else { - context_.Say(act2.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as1.rhs.AsFortran()); + errorCaptureShouldRead(act1.source, as2.lhs.AsFortran()); + return std::make_pair(nullptr, nullptr); } - } - } else { // !couldBeCapture1 - if (couldBeCapture2) { - if (isUpdateCapture(as1, as2)) { - return std::make_pair(/*Update=*/ec1, /*Capture=*/ec2); - } else { + } else { + bool updateFirst{isUpdateCapture(as1, as2)}; + bool captureFirst{isUpdateCapture(as2, as1)}; + if (updateFirst && captureFirst) { + // If both assignments could be the update and both could be the + // capture, emit a warning about the ambiguity. context_.Say(act1.source, - "In ATOMIC UPDATE operation with CAPTURE the update statement should assign to %s"_err_en_US, - as2.rhs.AsFortran()); + "In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement"_warn_en_US); + return std::make_pair(/*Update=*/ec2, /*Capture=*/ec1); } - } else { - context_.Say(source, - "Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target)"_err_en_US); + if (updateFirst != captureFirst) { + const parser::ExecutionPartConstruct *upd{updateFirst ? 
ec1 : ec2}; + const parser::ExecutionPartConstruct *cap{captureFirst ? ec1 : ec2}; + return std::make_pair(upd, cap); + } + assert(!updateFirst && !captureFirst); + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); } + }}; + + if (det != 0 || (cbu1 && cbu2 && cbc1 && cbc2)) { + return makeSelectionFromDet(det); } + assert(det == 0 && "Prior checks should have covered det != 0"); - return std::make_pair(nullptr, nullptr); + // If neither of the statements is an RMW update, it could still be a + // "write" update. Pretty much any assignment can be a write update, so + // recompute det with cbu1 = cbu2 = true. + if (int writeDet{int(cbc2) - int(cbc1)}; writeDet || (cbc1 && cbc2)) { + return makeSelectionFromDet(writeDet); + } + + // It's only errors from here on. + + if (!cbu1 && !cbu2 && !cbc1 && !cbc2) { + errorNeitherWorks(); + return std::make_pair(nullptr, nullptr); + } + + // The remaining cases are that + // - no candidate for update, or for capture, + // - one of the assignments cannot be anything. + + if (!cbu1 && !cbu2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the update"_err_en_US); + return std::make_pair(nullptr, nullptr); + } else if (!cbc1 && !cbc2) { + context_.Say(source, + "In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + if ((!cbu1 && !cbc1) || (!cbu2 && !cbc2)) { + auto &src = (!cbu1 && !cbc1) ? act1.source : act2.source; + context_.Say(src, + "In ATOMIC UPDATE operation with CAPTURE the statement could be neither the update nor the capture"_err_en_US); + return std::make_pair(nullptr, nullptr); + } + + // All cases should have been covered. 
+ llvm_unreachable("Unchecked condition"); } void OmpStructureChecker::CheckAtomicCaptureAssignment( const evaluate::Assignment &capture, const SomeExpr &atom, parser::CharBlock source) { - const SomeExpr &cap{capture.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &cap{capture.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, rsrc); - // This part should have been checked prior to callig this function. - assert(capture.rhs == atom && "This canont be a capture assignment"); + // This part should have been checked prior to calling this function. + assert(*GetConvertInput(capture.rhs) == atom && + "This canont be a capture assignment"); CheckStorageOverlap(atom, {cap}, source); } } void OmpStructureChecker::CheckAtomicReadAssignment( const evaluate::Assignment &read, parser::CharBlock source) { - const SomeExpr &atom{read.rhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; - if (!IsVarOrFunctionRef(atom)) { - context_.Say(rsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + if (auto maybe{GetConvertInput(read.rhs)}) { + const SomeExpr &atom{*maybe}; + + if (!IsVarOrFunctionRef(atom)) { + ErrorShouldBeVariable(atom, rsrc); + } else { + CheckAtomicVariable(atom, rsrc); + CheckStorageOverlap(atom, {read.lhs}, source); + } } else { - CheckAtomicVariable(atom, rsrc); - CheckStorageOverlap(atom, {read.lhs}, source); + ErrorShouldBeVariable(read.rhs, rsrc); } } @@ -3499,12 +3711,11 @@ void OmpStructureChecker::CheckAtomicWriteAssignment( // one of the following forms: // x = expr // x => expr - const SomeExpr &atom{write.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{write.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + 
ErrorShouldBeVariable(atom, rsrc); } else { CheckAtomicVariable(atom, lsrc); CheckStorageOverlap(atom, {write.rhs}, source); @@ -3521,12 +3732,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( // x = intrinsic-procedure-name (x) // x = intrinsic-procedure-name (x, expr-list) // x = intrinsic-procedure-name (expr-list, x) - const SomeExpr &atom{update.lhs}; auto [lsrc, rsrc]{SplitAssignmentSource(source)}; + const SomeExpr &atom{update.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(lsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, rsrc); // Skip other checks. return; } @@ -3605,12 +3815,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( const SomeExpr &cond, parser::CharBlock condSource, const evaluate::Assignment &assign, parser::CharBlock assignSource) { - const SomeExpr &atom{assign.lhs}; auto [alsrc, arsrc]{SplitAssignmentSource(assignSource)}; + const SomeExpr &atom{assign.lhs}; if (!IsVarOrFunctionRef(atom)) { - context_.Say(alsrc, "Atomic expression %s should be a variable"_err_en_US, - atom.AsFortran()); + ErrorShouldBeVariable(atom, arsrc); // Skip other checks. 
return; } @@ -3702,7 +3911,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate->rhs), + MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( @@ -3786,7 +3995,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdate( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(assign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign.rhs), + MakeAtomicAnalysisOp(Analysis::Update | Analysis::IfTrue, assign), MakeAtomicAnalysisOp(Analysis::None)); } @@ -3839,12 +4048,12 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( if (GetActionStmt(&body.front()).stmt == uact.stmt) { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(action, update.rhs), - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs)); + MakeAtomicAnalysisOp(action, update), + MakeAtomicAnalysisOp(Analysis::Read, capture)); } else { x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, capture.lhs), - MakeAtomicAnalysisOp(action, update.rhs)); + MakeAtomicAnalysisOp(Analysis::Read, capture), + MakeAtomicAnalysisOp(action, update)); } } @@ -3988,12 +4197,12 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( if (captureFirst) { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign.lhs), - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs)); + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign), + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign)); } else { x.analysis = MakeAtomicAnalysis(updAssign.lhs, update.cond, - MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign.rhs), - MakeAtomicAnalysisOp(Analysis::Read | captureWhen, 
capAssign.lhs)); + MakeAtomicAnalysisOp(Analysis::Write | updateWhen, updAssign), + MakeAtomicAnalysisOp(Analysis::Read | captureWhen, capAssign)); } } @@ -4019,13 +4228,15 @@ void OmpStructureChecker::CheckAtomicRead( if (body.size() == 1) { SourcedActionStmt action{GetActionStmt(&body.front())}; if (auto maybeRead{GetEvaluateAssignment(action.stmt)}) { - const SomeExpr &atom{maybeRead->rhs}; CheckAtomicReadAssignment(*maybeRead, action.source); - using Analysis = parser::OpenMPAtomicConstruct::Analysis; - x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Read, maybeRead->lhs), - MakeAtomicAnalysisOp(Analysis::None)); + if (auto maybe{GetConvertInput(maybeRead->rhs)}) { + const SomeExpr &atom{*maybe}; + using Analysis = parser::OpenMPAtomicConstruct::Analysis; + x.analysis = MakeAtomicAnalysis(atom, std::nullopt, + MakeAtomicAnalysisOp(Analysis::Read, maybeRead), + MakeAtomicAnalysisOp(Analysis::None)); + } } else { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); @@ -4058,7 +4269,7 @@ void OmpStructureChecker::CheckAtomicWrite( using Analysis = parser::OpenMPAtomicConstruct::Analysis; x.analysis = MakeAtomicAnalysis(atom, std::nullopt, - MakeAtomicAnalysisOp(Analysis::Write, maybeWrite->rhs), + MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); } else { context_.Say( diff --git a/flang/lib/Semantics/check-omp-structure.h b/flang/lib/Semantics/check-omp-structure.h index bf6fbf16d0646..835fbe45e1c0e 100644 --- a/flang/lib/Semantics/check-omp-structure.h +++ b/flang/lib/Semantics/check-omp-structure.h @@ -253,6 +253,7 @@ class OmpStructureChecker void CheckStorageOverlap(const evaluate::Expr &, llvm::ArrayRef>, parser::CharBlock); + void ErrorShouldBeVariable(const MaybeExpr &expr, parser::CharBlock source); void CheckAtomicVariable( const evaluate::Expr &, parser::CharBlock); std::pair&1 | FileCheck %s - -!CHECK: not yet implemented: atomic capture 
requiring implicit type casts -subroutine capture_with_convert_f32_to_i32() - implicit none - integer :: k, v, i - - k = 1 - v = 0 - - !$omp atomic capture - v = k - k = (i + 1) * 3.14 - !$omp end atomic -end subroutine - -subroutine capture_with_convert_i32_to_f64() - real(8) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f64 - -subroutine capture_with_convert_f64_to_i32() - integer :: x - real(8) :: v - x = 1 - v = 0 - !$omp atomic capture - x = v - v = x - !$omp end atomic -end subroutine capture_with_convert_f64_to_i32 - -subroutine capture_with_convert_i32_to_f32() - real(4) :: x - integer :: v - x = 1.0 - v = 0 - !$omp atomic capture - v = x - x = x + v - !$omp end atomic -end subroutine capture_with_convert_i32_to_f32 diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 75f1cbfc979b9..aa9d2e0ac3ff7 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -1,5 +1,3 @@ -! REQUIRES : openmp_runtime - ! RUN: %flang_fc1 -emit-hlfir -fopenmp %s -o - | FileCheck %s ! 
CHECK: func.func @_QPatomic_implicit_cast_read() { diff --git a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 index deb67e7614659..8adb0f1a67409 100644 --- a/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 +++ b/flang/test/Semantics/OpenMP/atomic-hint-clause.f90 @@ -25,7 +25,7 @@ program sample !ERROR: The synchronization hint is not valid !$omp atomic hint(7) capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement y = x x = y !$omp end atomic diff --git a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 index c427ba07d43d8..f808ed916fb7e 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-capture.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-capture.f90 @@ -39,7 +39,7 @@ subroutine f02 subroutine f03 integer :: x - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the capture !$omp atomic update capture x = x + 1 x = x + 2 @@ -50,7 +50,7 @@ subroutine f04 integer :: x, v !$omp atomic update capture - !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the capture and the update, assuming the first one is the capture statement + !WARNING: In ATOMIC UPDATE operation with CAPTURE either statement could be the update and the capture, assuming the first one is the capture statement v = x x = v !$omp end atomic @@ -60,8 +60,8 @@ subroutine f05 integer :: x, v, z !$omp atomic update capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand 
side of the capture assignment should read z v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 !$omp end atomic end @@ -70,8 +70,8 @@ subroutine f06 integer :: x, v, z !$omp atomic update capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x z = x + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z v = x !$omp end atomic end diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 677b933932b44..5e180aa0bbe5b 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -97,50 +97,50 @@ program sample !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = b + 1 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read b v = x - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to x b = 10 !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) !$omp atomic capture x = x + 10 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read x v = b !$omp end atomic - !ERROR: Unable to identify capture statement: in ATOMIC UPDATE operation with CAPTURE the source value in the capture statement should be a variable (with the same type as the target) + !ERROR: In ATOMIC UPDATE operation with CAPTURE neither statement could be the update or the capture !$omp 
atomic capture v = 1 x = 4 !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to z%y z%m = z%m + 1.0 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read z%m x = z%y !$omp end atomic !$omp atomic capture + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 !$omp end atomic !$omp atomic capture - !ERROR: In ATOMIC UPDATE operation with CAPTURE the update statement should assign to y(2_8) y(1) = y(1) + 1 + !ERROR: In ATOMIC UPDATE operation with CAPTURE the right-hand side of the capture assignment should read y(1_8) x = y(2) !$omp end atomic >From 40510a3068498d15257cc7d198bce9c8cd71a902 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Mon, 24 Mar 2025 15:38:58 -0500 Subject: [PATCH 13/30] DumpEvExpr: show type --- flang/include/flang/Semantics/dump-expr.h | 30 ++++++++++++++++------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 2f445429a10b5..1553dac3b6687 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,6 +16,7 @@ #include #include +#include #include #include @@ -38,6 +39,17 @@ class DumpEvaluateExpr { } private: + template + struct TypeOf { + static constexpr std::string_view name{TypeOf::get()}; + static constexpr std::string_view get() { + std::string_view v(__PRETTY_FUNCTION__); 
+ v.remove_prefix(99); // Strip the part "... [with T = " + v.remove_suffix(50); // Strip the ending "; string_view = ...]" + return v; + } + }; + template void Show(const common::Indirection &x) { Show(x.value()); } @@ -76,7 +88,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant"); + Indent("derived constant "s + std::string(TypeOf::name)); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -84,7 +96,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant"); + Print("constant "s + std::string(TypeOf::name)); } } void Show(const Symbol &symbol); @@ -102,7 +114,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator"); + Indent("designator "s + std::string(TypeOf::name)); Show(x.u); Outdent(); } @@ -117,7 +129,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref"); + Indent("function ref "s + std::string(TypeOf::name)); Show(x.proc()); Show(x.arguments()); Outdent(); @@ -127,14 +139,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value"); + Indent("array constructor value "s + std::string(TypeOf::name)); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do"); + Indent("implied do "s + std::string(TypeOf::name)); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -148,20 +160,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op"); + Indent("unary op "s + std::string(TypeOf::name)); Show(op.left()); Outdent(); } 
template void Show(const evaluate::Operation &op) { - Indent("binary op"); + Indent("binary op "s + std::string(TypeOf::name)); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr T"); + Indent("expr <" + std::string(TypeOf::name) + ">"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index aa0b4e0f03398..66cedab94bfb4 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("expr some type"); + Indent("relational some type"); Show(x.u); Outdent(); } >From b40ba0ed9270daf4f7d99190c1e100028a3e09c3 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 15:14:45 -0500 Subject: [PATCH 14/30] Handle conversion from real to complex via complex constructor --- flang/lib/Semantics/check-omp-structure.cpp | 55 ++++++++++++++++++--- 1 file changed, 47 insertions(+), 8 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index dada9c6c2bd6f..ae81dcb5ea150 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3183,36 +3183,46 @@ struct ConvertCollector using Base::operator(); template // - Result operator()(const evaluate::Designator &x) const { + Result asSomeExpr(const T &x) const { auto copy{x}; return {AsGenericExpr(std::move(copy)), {}}; } + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + template // Result operator()(const evaluate::FunctionRef &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template // Result operator()(const evaluate::Constant &x) const { - auto copy{x}; - return 
{AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x); } template Result operator()(const evaluate::Operation &x) const { if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. + // Ignore parentheses. return (*this)(x.template operand<0>()); } else if constexpr (is_convert_v) { // Convert should always have a typed result, so it should be safe to // dereference x.GetType(). return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. + if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } } else { - auto copy{x.derived()}; - return {evaluate::AsGenericExpr(std::move(copy)), {}}; + return asSomeExpr(x.derived()); } } @@ -3231,6 +3241,23 @@ struct ConvertCollector } private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + template // struct is_convert { static constexpr bool value{false}; @@ -3246,6 +3273,18 @@ struct ConvertCollector }; template // static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { >From 303aef7886243a6f7952e866cfb50d860ed98e61 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 15 May 2025 16:07:19 -0500 
Subject: [PATCH 15/30] Fix handling of insertion point --- flang/lib/Lower/OpenMP/OpenMP.cpp | 23 +++++++++++-------- .../Lower/OpenMP/atomic-implicit-cast.f90 | 8 +++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 1c5589b116ca7..60e559b326f7f 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2749,7 +2749,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value storeAddr = @@ -2782,7 +2781,6 @@ genAtomicRead(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, value, storeAddr); } - builder.restoreInsertionPoint(saved); return op; } @@ -2796,7 +2794,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Value value = @@ -2807,7 +2804,6 @@ genAtomicWrite(lower::AbstractConverter &converter, mlir::Location loc, builder.restoreInsertionPoint(atomicAt); mlir::Operation *op = builder.create( loc, atomAddr, converted, hint, memOrder); - builder.restoreInsertionPoint(saved); return op; } @@ -2823,7 +2819,6 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, lower::ExprToValueMap overrides; lower::StatementContext naCtx; fir::FirOpBuilder &builder = converter.getFirOpBuilder(); - fir::FirOpBuilder::InsertPoint saved = builder.saveInsertionPoint(); builder.restoreInsertionPoint(preAt); mlir::Type atomType = 
fir::unwrapRefType(atomAddr.getType()); @@ -2853,7 +2848,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(saved); + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } @@ -2866,6 +2861,8 @@ genAtomicOperation(lower::AbstractConverter &converter, mlir::Location loc, fir::FirOpBuilder::InsertPoint preAt, fir::FirOpBuilder::InsertPoint atomicAt, fir::FirOpBuilder::InsertPoint postAt) { + // This function and the functions called here do not preserve the + // builder's insertion point, or set it to anything specific. switch (action) { case parser::OpenMPAtomicConstruct::Analysis::Read: return genAtomicRead(converter, loc, stmtCtx, atomAddr, atom, assign, hint, @@ -3919,6 +3916,8 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, postAt = atomicAt = preAt; } + // The builder's insertion point needs to be specifically set before + // each call to `genAtomicOperation`. mlir::Operation *firstOp = genAtomicOperation( converter, loc, stmtCtx, analysis.op0.what, atomAddr, atom, *get(analysis.op0.assign), hint, memOrder, preAt, atomicAt, postAt); @@ -3932,10 +3931,16 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, hint, memOrder, preAt, atomicAt, postAt); } - if (secondOp) { - builder.setInsertionPointAfter(secondOp); + if (construct.IsCapture()) { + // If this is a capture operation, the first/second ops will be inside + // of it. Set the insertion point to past the capture op itself. 
+ builder.restoreInsertionPoint(postAt); } else { - builder.setInsertionPointAfter(firstOp); + if (secondOp) { + builder.setInsertionPointAfter(secondOp); + } else { + builder.setInsertionPointAfter(firstOp); + } } } } diff --git a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 index 6f9a481e4cf43..5e00235b85e74 100644 --- a/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 +++ b/flang/test/Lower/OpenMP/atomic-implicit-cast.f90 @@ -95,9 +95,9 @@ subroutine atomic_implicit_cast_read ! CHECK: } ! CHECK: omp.atomic.read %[[ALLOCA6]] = %[[X_DECL]]#0 : !fir.ref, !fir.ref, i32 ! CHECK: %[[LOAD:.*]] = fir.load %[[ALLOCA6]] : !fir.ref -! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[CVT:.*]] = fir.convert %[[LOAD]] : (i32) -> f32 ! CHECK: %[[CST:.*]] = arith.constant 0.000000e+00 : f32 +! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CVT]], [0 : index] : (complex, f32) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST]], [1 : index] : (complex, f32) -> complex ! CHECK: fir.store %[[IDX2]] to %[[W_DECL]]#0 : !fir.ref> @@ -107,14 +107,14 @@ subroutine atomic_implicit_cast_read !$omp end atomic -! CHECK: omp.atomic.capture { -! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { -! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[CST1:.*]] = arith.constant 1.000000e+00 : f64 ! CHECK: %[[CST2:.*]] = arith.constant 0.000000e+00 : f64 ! CHECK: %[[UNDEF:.*]] = fir.undefined complex ! CHECK: %[[IDX1:.*]] = fir.insert_value %[[UNDEF]], %[[CST1]], [0 : index] : (complex, f64) -> complex ! CHECK: %[[IDX2:.*]] = fir.insert_value %[[IDX1]], %[[CST2]], [1 : index] : (complex, f64) -> complex +! CHECK: omp.atomic.capture { +! CHECK: omp.atomic.update %[[M_DECL]]#0 : !fir.ref> { +! CHECK: ^bb0(%[[ARG:.*]]: complex): ! CHECK: %[[RESULT:.*]] = fir.addc %[[ARG]], %[[IDX2]] {fastmath = #arith.fastmath} : complex ! CHECK: omp.yield(%[[RESULT]] : complex) ! 
CHECK: } >From d788d87ebe69ec82c14a0eb0cbb95df38a216fde Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:14:47 -0500 Subject: [PATCH 16/30] Allow conversion in update operations --- flang/include/flang/Semantics/tools.h | 17 ++++----- flang/lib/Lower/OpenMP/OpenMP.cpp | 6 ++-- flang/lib/Semantics/check-omp-structure.cpp | 33 ++++++----------- .../Semantics/OpenMP/atomic-update-only.f90 | 2 +- flang/test/Semantics/OpenMP/atomic03.f90 | 6 ++-- flang/test/Semantics/OpenMP/atomic04.f90 | 35 +++++++++---------- .../OpenMP/omp-atomic-assignment-stmt.f90 | 2 +- 7 files changed, 44 insertions(+), 57 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7f1ec59b087a2..9be2feb8ae064 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -789,14 +789,15 @@ inline bool checkForSymbolMatch( /// return the "expr" but with top-level parentheses stripped. std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); -/// Both "expr" and "x" have the form of SomeType(SomeKind(...)[1]). -/// Check if "expr" is -/// SomeType(SomeKind(Type( -/// Convert -/// SomeKind(...)[2]))) -/// where SomeKind(...) [1] and [2] are equal, and the Convert preserves -/// TypeCategory. -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x); +/// Check if expr is same as x, or a sequence of Convert operations on x. +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); + +/// Strip away any top-level Convert operations (if any exist) and return +/// the input value. A ComplexConstructor(x, 0) is also considered as a +/// convert operation. +/// If the input is not Operation, Designator, FunctionRef or Constant, +/// it returns std::nullopt. 
+MaybeExpr GetConvertInput(const SomeExpr &x); } // namespace Fortran::semantics #endif // FORTRAN_SEMANTICS_TOOLS_H_ diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 60e559b326f7f..6977e209e8b1b 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2823,10 +2823,12 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, mlir::Type atomType = fir::unwrapRefType(atomAddr.getType()); - std::vector args{semantics::GetOpenMPTopLevelArguments(assign.rhs)}; + // This must exist by now. + SomeExpr input = *semantics::GetConvertInput(assign.rhs); + std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { - if (!semantics::IsSameOrResizeOf(arg, atom)) { + if (!semantics::IsSameOrConvertOf(arg, atom)) { mlir::Value val = fir::getBase(converter.genExprValue(arg, naCtx, &loc)); overrides.try_emplace(&arg, val); } diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index ae81dcb5ea150..edd8525c118bd 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3425,12 +3425,12 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -static MaybeExpr GetConvertInput(const SomeExpr &x) { +MaybeExpr GetConvertInput(const SomeExpr &x) { // This returns SomeExpr(x) when x is a designator/functionref/constant. return atomic::ConvertCollector{}(x).first; } -static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { // Check if expr is same as x, or a sequence of Convert operations on x. 
if (expr == x) { return true; @@ -3441,23 +3441,6 @@ static bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { } } -bool IsSameOrResizeOf(const SomeExpr &expr, const SomeExpr &x) { - // Both expr and x have the form of SomeType(SomeKind(...)[1]). - // Check if expr is - // SomeType(SomeKind(Type( - // Convert - // SomeKind(...)[2]))) - // where SomeKind(...) [1] and [2] are equal, and the Convert preserves - // TypeCategory. - - if (expr != x) { - auto top{atomic::ArgumentExtractor{}(expr)}; - return top.first == atomic::Operator::Resize && x == top.second.front(); - } else { - return true; - } -} - bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3801,7 +3784,11 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - auto top{GetTopLevelOperation(update.rhs)}; + std::pair> top{ + atomic::Operator::Unk, {}}; + if (auto &&maybeInput{GetConvertInput(update.rhs)}) { + top = GetTopLevelOperation(*maybeInput); + } switch (top.first) { case atomic::Operator::Add: case atomic::Operator::Sub: @@ -3842,7 +3829,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( auto unique{[&]() { // -> iterator auto found{top.second.end()}; for (auto i{top.second.begin()}, e{top.second.end()}; i != e; ++i) { - if (IsSameOrResizeOf(*i, atom)) { + if (IsSameOrConvertOf(*i, atom)) { if (found != top.second.end()) { return top.second.end(); } @@ -3902,9 +3889,9 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( case atomic::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; - if (IsSameOrResizeOf(arg0, atom)) { + if (IsSameOrConvertOf(arg0, atom)) { CheckStorageOverlap(atom, {arg1}, condSource); - } else if (IsSameOrResizeOf(arg1, atom)) { + } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { context_.Say(assignSource, diff --git 
a/flang/test/Semantics/OpenMP/atomic-update-only.f90 b/flang/test/Semantics/OpenMP/atomic-update-only.f90 index 4595e02d01456..28d0e264359cb 100644 --- a/flang/test/Semantics/OpenMP/atomic-update-only.f90 +++ b/flang/test/Semantics/OpenMP/atomic-update-only.f90 @@ -47,8 +47,8 @@ subroutine f05 integer :: x real :: y + ! An explicit conversion is accepted as an extension. !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation x = int(x + y) end diff --git a/flang/test/Semantics/OpenMP/atomic03.f90 b/flang/test/Semantics/OpenMP/atomic03.f90 index f5c189fd05318..b3a3c0d5e7a14 100644 --- a/flang/test/Semantics/OpenMP/atomic03.f90 +++ b/flang/test/Semantics/OpenMP/atomic03.f90 @@ -41,10 +41,10 @@ program OmpAtomic z = MIN(y, 8, a, d) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: This intrinsic function is not a valid ATOMIC UPDATE operation y = FRACTION(x) !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable y should appear as an argument in the update operation y = REAL(x) !$omp atomic update y = IAND(y, 4) @@ -126,7 +126,7 @@ subroutine more_invalid_atomic_update_stmts() !$omp atomic update !ERROR: Atomic variable k should be a scalar - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable k should occur exactly once among the arguments of the top-level MAX operator k = max(x, y) !$omp atomic diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index 5c91ab5dc37e4..d603ba8b3937c 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -1,5 +1,3 @@ -! REQUIRES: openmp_runtime - ! RUN: %python %S/../test_errors.py %s %flang_fc1 %openmp_flags ! OpenMP Atomic construct @@ -7,7 +5,6 @@ ! 
Update assignment must be 'var = var op expr' or 'var = expr op var' program OmpAtomic - use omp_lib real x integer y logical m, n, l @@ -20,10 +17,10 @@ program OmpAtomic !$omp atomic x = 1 + x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic @@ -31,10 +28,10 @@ program OmpAtomic !$omp atomic x = 1 - x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic @@ -42,10 +39,10 @@ program OmpAtomic !$omp atomic x = 1*x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic @@ -53,10 +50,10 @@ program OmpAtomic !$omp atomic x = 1/x !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + 
!ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic @@ -96,10 +93,10 @@ program OmpAtomic !$omp atomic update x = 1 + x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = y + 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level + operator x = 1 + y !$omp atomic update @@ -107,10 +104,10 @@ program OmpAtomic !$omp atomic update x = 1 - x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = y - 1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level - operator x = 1 - y !$omp atomic update @@ -118,10 +115,10 @@ program OmpAtomic !$omp atomic update x = 1*x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = y*1 !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1*y !$omp atomic update @@ -129,10 +126,10 @@ program OmpAtomic !$omp atomic update x = 1/x !$omp atomic update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = y/1 !$omp atomic 
update - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should occur exactly once among the arguments of the top-level / operator x = 1/y !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 index 5e180aa0bbe5b..8fdd2aed3ec1f 100644 --- a/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 +++ b/flang/test/Semantics/OpenMP/omp-atomic-assignment-stmt.f90 @@ -87,7 +87,7 @@ program sample !$omp atomic release capture v = x - !ERROR: An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation + ! This ends up being "x = b + x". x = b + (x*1) !$omp end atomic >From 341723713929507c59d528540d32bc2e4213e920 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:21:56 -0500 Subject: [PATCH 17/30] format --- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 6977e209e8b1b..0f553541c5ef0 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2850,7 +2850,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, builder.create(loc, converted); converter.resetExprOverrides(); - builder.restoreInsertionPoint(postAt); // For naCtx cleanups + builder.restoreInsertionPoint(postAt); // For naCtx cleanups return updateOp; } >From 2686207342bad511f6d51b20ed923c0d2cc9047b Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 16 May 2025 09:22:26 -0500 Subject: [PATCH 18/30] Revert "DumpEvExpr: show type" This reverts commit 40510a3068498d15257cc7d198bce9c8cd71a902. Debug changes accidentally pushed upstream. 
--- flang/include/flang/Semantics/dump-expr.h | 30 +++++++---------------- flang/lib/Semantics/dump-expr.cpp | 2 +- 2 files changed, 10 insertions(+), 22 deletions(-) diff --git a/flang/include/flang/Semantics/dump-expr.h b/flang/include/flang/Semantics/dump-expr.h index 1553dac3b6687..2f445429a10b5 100644 --- a/flang/include/flang/Semantics/dump-expr.h +++ b/flang/include/flang/Semantics/dump-expr.h @@ -16,7 +16,6 @@ #include #include -#include #include #include @@ -39,17 +38,6 @@ class DumpEvaluateExpr { } private: - template - struct TypeOf { - static constexpr std::string_view name{TypeOf::get()}; - static constexpr std::string_view get() { - std::string_view v(__PRETTY_FUNCTION__); - v.remove_prefix(99); // Strip the part "... [with T = " - v.remove_suffix(50); // Strip the ending "; string_view = ...]" - return v; - } - }; - template void Show(const common::Indirection &x) { Show(x.value()); } @@ -88,7 +76,7 @@ class DumpEvaluateExpr { void Show(const evaluate::NullPointer &); template void Show(const evaluate::Constant &x) { if constexpr (T::category == common::TypeCategory::Derived) { - Indent("derived constant "s + std::string(TypeOf::name)); + Indent("derived constant"); for (const auto &map : x.values()) { for (const auto &pair : map) { Show(pair.second.value()); @@ -96,7 +84,7 @@ class DumpEvaluateExpr { } Outdent(); } else { - Print("constant "s + std::string(TypeOf::name)); + Print("constant"); } } void Show(const Symbol &symbol); @@ -114,7 +102,7 @@ class DumpEvaluateExpr { void Show(const evaluate::Substring &x); void Show(const evaluate::ComplexPart &x); template void Show(const evaluate::Designator &x) { - Indent("designator "s + std::string(TypeOf::name)); + Indent("designator"); Show(x.u); Outdent(); } @@ -129,7 +117,7 @@ class DumpEvaluateExpr { Outdent(); } template void Show(const evaluate::FunctionRef &x) { - Indent("function ref "s + std::string(TypeOf::name)); + Indent("function ref"); Show(x.proc()); Show(x.arguments()); Outdent(); @@ 
-139,14 +127,14 @@ class DumpEvaluateExpr { } template void Show(const evaluate::ArrayConstructorValues &x) { - Indent("array constructor value "s + std::string(TypeOf::name)); + Indent("array constructor value"); for (auto &v : x) { Show(v); } Outdent(); } template void Show(const evaluate::ImpliedDo &x) { - Indent("implied do "s + std::string(TypeOf::name)); + Indent("implied do"); Show(x.lower()); Show(x.upper()); Show(x.stride()); @@ -160,20 +148,20 @@ class DumpEvaluateExpr { void Show(const evaluate::StructureConstructor &x); template void Show(const evaluate::Operation &op) { - Indent("unary op "s + std::string(TypeOf::name)); + Indent("unary op"); Show(op.left()); Outdent(); } template void Show(const evaluate::Operation &op) { - Indent("binary op "s + std::string(TypeOf::name)); + Indent("binary op"); Show(op.left()); Show(op.right()); Outdent(); } void Show(const evaluate::Relational &x); template void Show(const evaluate::Expr &x) { - Indent("expr <" + std::string(TypeOf::name) + ">"); + Indent("expr T"); Show(x.u); Outdent(); } diff --git a/flang/lib/Semantics/dump-expr.cpp b/flang/lib/Semantics/dump-expr.cpp index 66cedab94bfb4..aa0b4e0f03398 100644 --- a/flang/lib/Semantics/dump-expr.cpp +++ b/flang/lib/Semantics/dump-expr.cpp @@ -151,7 +151,7 @@ void DumpEvaluateExpr::Show(const evaluate::StructureConstructor &x) { } void DumpEvaluateExpr::Show(const evaluate::Relational &x) { - Indent("relational some type"); + Indent("expr some type"); Show(x.u); Outdent(); } >From c00fc531bcf742c409fc974da94c5b362fa9132c Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:37:19 -0500 Subject: [PATCH 19/30] Delete unnecessary static_assert --- flang/lib/Semantics/check-omp-structure.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 6005dda7c26fe..2e59553d5e130 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ 
b/flang/lib/Semantics/check-omp-structure.cpp @@ -21,8 +21,6 @@ namespace Fortran::semantics { -static_assert(std::is_same_v>); - template static bool operator!=(const evaluate::Expr &e, const evaluate::Expr &f) { return !(e == f); >From 45b012c16b77c757a0d09b2a229bad49fed8d26f Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:25 -0500 Subject: [PATCH 20/30] Add missing initializer for 'iff' --- flang/lib/Semantics/check-omp-structure.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 2e59553d5e130..aa1bd136b371f 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2815,7 +2815,7 @@ static std::optional AnalyzeConditionalStmt( } } else { AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, - GetActionStmt(std::get(s.t))}; + GetActionStmt(std::get(s.t)), SourcedActionStmt{}}; if (result.ift.stmt) { return result; } >From daeac25991bf14fb08c3accabe068c074afa1eb7 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 12:38:47 -0500 Subject: [PATCH 21/30] Add asserts for printing "Identity" as top-level operator --- flang/lib/Semantics/check-omp-structure.cpp | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index aa1bd136b371f..062b45deac865 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3823,6 +3823,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, atomic::ToString(top.first)); @@ -3852,6 +3853,8 @@ void 
OmpStructureChecker::CheckAtomicUpdateAssignment( "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, atom.AsFortran(), atomic::ToString(top.first)); @@ -3898,16 +3901,20 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { + assert( + top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, atomic::ToString(top.first)); } break; } + case atomic::Operator::Identity: case atomic::Operator::True: case atomic::Operator::False: break; default: + assert(top.first != atomic::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, atomic::ToString(top.first)); >From ae121e5c37453af1a4aba7c77939c2c1c45b75fa Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 29 May 2025 13:15:58 -0500 Subject: [PATCH 22/30] Explain the use of determinant --- flang/lib/Semantics/check-omp-structure.cpp | 31 ++++++++++++++++++--- 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 062b45deac865..bc6a09b9768ef 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3606,10 +3606,33 @@ OmpStructureChecker::CheckUpdateCapture( // subexpression of the right-hand side. // 2. An assignment could be a capture (cbc) if the right-hand side is // a variable (or a function ref), with potential type conversions. 
- bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; - bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; - bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; - bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; + bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; // Can as1 be an update? + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; // Can as2 be an update? + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; // Can 1 be capture? + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; // Can 2 be capture? + + // We want to diagnose cases where both assignments cannot be an update, + // or both cannot be a capture, as well as cases where either assignment + // cannot be any of these two. + // + // If we organize these boolean values into a matrix + // |cbu1 cbu2| + // |cbc1 cbc2| + // then we want to diagnose cases where the matrix has a zero (i.e. "false") + // row or column, including the case where everything is zero. All these + // cases correspond to the determinant of the matrix being 0, which suggests + // that checking the det may be a convenient diagnostic check. There is only + // one additional case where the det is 0, which is when the matrix is all 1 + // ("true"). The "all true" case represents the situation where both + // assignments could be an update as well as a capture. On the other hand, + // whenever det != 0, the roles of the update and the capture can be + // unambiguously assigned to as1 and as2 [1]. + // + // [1] This can be easily verified by hand: there are 10 2x2 matrices with + // det = 0, leaving 6 cases where det != 0: + // 0 1 0 1 1 0 1 0 1 1 1 1 + // 1 0 1 1 0 1 1 1 0 1 1 0 + // In each case the classification is unambiguous. 
// |cbu1 cbu2| // det |cbc1 cbc2| = cbu1*cbc2 - cbu2*cbc1 >From cae0e8fcd3f6b8c2bc3ad8f85599ef4765c6afc5 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 11:48:18 -0500 Subject: [PATCH 23/30] Deal with assignments that failed Fortran semantic checks Don't emit diagnostics for those. --- flang/lib/Semantics/check-omp-structure.cpp | 66 ++++++++++++++------- 1 file changed, 46 insertions(+), 20 deletions(-) diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index bc6a09b9768ef..89a3a407441a8 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2726,6 +2726,9 @@ static SourcedActionStmt GetActionStmt(const parser::Block &block) { // Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption // is that the ActionStmt will be either an assignment or a pointer-assignment, // otherwise return std::nullopt. +// Note: This function can return std::nullopt on [Pointer]AssignmentStmt where +// the "typedAssignment" is unset. This can happen if there are semantic errors +// in the purported assignment. static std::optional GetEvaluateAssignment( const parser::ActionStmt *x) { if (x == nullptr) { @@ -2754,6 +2757,29 @@ static std::optional GetEvaluateAssignment( x->u); } +// Check if the ActionStmt is actually a [Pointer]AssignmentStmt. This is +// to separate cases where the source has something that looks like an +// assignment, but is semantically wrong (diagnosed by general semantic +// checks), and where the source has some other statement (which we want +// to report as "should be an assignment"). 
+static bool IsAssignment(const parser::ActionStmt *x) { + if (x == nullptr) { + return false; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + + return common::visit( + [](auto &&s) -> bool { + using BareS = llvm::remove_cvref_t; + return std::is_same_v || + std::is_same_v; + }, + x->u); +} + static std::optional AnalyzeConditionalStmt( const parser::ExecutionPartConstruct *x) { if (x == nullptr) { @@ -3588,8 +3614,10 @@ OmpStructureChecker::CheckUpdateCapture( auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; if (!maybeAssign1 || !maybeAssign2) { - context_.Say(source, - "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + if (!IsAssignment(act1.stmt) || !IsAssignment(act2.stmt)) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + } return std::make_pair(nullptr, nullptr); } @@ -3956,7 +3984,7 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateStmt( // The if-true statement must be present, and must be an assignment. 
auto maybeAssign{GetEvaluateAssignment(update.ift.stmt)}; if (!maybeAssign) { - if (update.ift.stmt) { + if (update.ift.stmt && !IsAssignment(update.ift.stmt)) { context_.Say(update.ift.source, "In ATOMIC UPDATE COMPARE the update statement should be an assignment"_err_en_US); } else { @@ -3992,7 +4020,7 @@ void OmpStructureChecker::CheckAtomicUpdateOnly( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Update, maybeUpdate), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( source, "ATOMIC UPDATE operation should be an assignment"_err_en_US); } @@ -4094,17 +4122,11 @@ void OmpStructureChecker::CheckAtomicUpdateCapture( } SourcedActionStmt uact{GetActionStmt(uec)}; SourcedActionStmt cact{GetActionStmt(cec)}; - auto maybeUpdate{GetEvaluateAssignment(uact.stmt)}; - auto maybeCapture{GetEvaluateAssignment(cact.stmt)}; - - if (!maybeUpdate || !maybeCapture) { - context_.Say(source, - "ATOMIC UPDATE CAPTURE operation both statements should be assignments"_err_en_US); - return; - } + // The "dereferences" of std::optional are guaranteed to be valid after + // CheckUpdateCapture. 
+ evaluate::Assignment update{*GetEvaluateAssignment(uact.stmt)}; + evaluate::Assignment capture{*GetEvaluateAssignment(cact.stmt)}; - const evaluate::Assignment &update{*maybeUpdate}; - const evaluate::Assignment &capture{*maybeCapture}; const SomeExpr &atom{update.lhs}; using Analysis = parser::OpenMPAtomicConstruct::Analysis; @@ -4242,13 +4264,17 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateCapture( return; } } else { - context_.Say(capture.source, - "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + if (!IsAssignment(capture.stmt)) { + context_.Say(capture.source, + "In ATOMIC UPDATE COMPARE CAPTURE the capture statement should be an assignment"_err_en_US); + } return; } } else { - context_.Say(update.ift.source, - "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + if (!IsAssignment(update.ift.stmt)) { + context_.Say(update.ift.source, + "In ATOMIC UPDATE COMPARE CAPTURE the update statement should be an assignment"_err_en_US); + } return; } @@ -4316,7 +4342,7 @@ void OmpStructureChecker::CheckAtomicRead( MakeAtomicAnalysisOp(Analysis::Read, maybeRead), MakeAtomicAnalysisOp(Analysis::None)); } - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC READ operation should be an assignment"_err_en_US); } @@ -4350,7 +4376,7 @@ void OmpStructureChecker::CheckAtomicWrite( x.analysis = MakeAtomicAnalysis(atom, std::nullopt, MakeAtomicAnalysisOp(Analysis::Write, maybeWrite), MakeAtomicAnalysisOp(Analysis::None)); - } else { + } else if (!IsAssignment(action.stmt)) { context_.Say( x.source, "ATOMIC WRITE operation should be an assignment"_err_en_US); } >From 6bc8c10c793ebac02c78daec33e7fb5e6becb8e0 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 12:47:00 -0500 Subject: [PATCH 24/30] Move common functions to tools.cpp --- flang/include/flang/Semantics/tools.h | 134 +++++- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 
flang/lib/Semantics/check-omp-structure.cpp | 506 ++------------------ flang/lib/Semantics/tools.cpp | 310 ++++++++++++ 4 files changed, 484 insertions(+), 468 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 821f1ae34fd5b..25fadceefceb0 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -778,11 +778,135 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, return false; } -/// If the top-level operation (ignoring parentheses) is either an -/// evaluate::FunctionRef, or a specialization of evaluate::Operation, -/// then return the list of arguments (wrapped in SomeExpr). Otherwise, -/// return the "expr" but with top-level parentheses stripped. -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr); +namespace operation { + +enum class Operator { + Add, + And, + Associated, + Call, + Convert, + Div, + Eq, + Eqv, + False, + Ge, + Gt, + Identity, + Intrinsic, + Lt, + Max, + Min, + Mul, + Ne, + Neqv, + Not, + Or, + Pow, + Resize, // Convert within the same TypeCategory + Sub, + True, + Unknown, +}; + +std::string ToString(Operator op); + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + switch (op.derived().logicalOperator) { + case common::LogicalOperator::And: + return Operator::And; + case common::LogicalOperator::Or: + return Operator::Or; + case common::LogicalOperator::Eqv: + return Operator::Eqv; + case common::LogicalOperator::Neqv: + return Operator::Neqv; + case common::LogicalOperator::Not: + return Operator::Not; + } + return Operator::Unknown; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + switch (op.derived().opr) { + case common::RelationalOperator::LT: + return Operator::Lt; + case common::RelationalOperator::LE: + return Operator::Le; + case common::RelationalOperator::EQ: + return Operator::Eq; + case common::RelationalOperator::NE: + return 
Operator::Ne; + case common::RelationalOperator::GE: + return Operator::Ge; + case common::RelationalOperator::GT: + return Operator::Gt; + } + return Operator::Unknown; +} + +template +Operator OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Add; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Sub; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Mul; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Div; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { + return Operator::Pow; +} + +template +Operator +OperationCode(const evaluate::Operation, Ts...> &op) { + if constexpr (C == T::category) { + return Operator::Resize; + } else { + return Operator::Convert; + } +} + +template // +Operator OperationCode(const T &) { + return Operator::Unknown; +} + +Operator OperationCode(const evaluate::ProcedureDesignator &proc); + +} // namespace operation + +/// Return information about the top-level operation (ignoring parentheses): +/// the operation code and the list of arguments. +std::pair> +GetTopLevelOperation(const SomeExpr &expr); /// Check if expr is same as x, or a sequence of Convert operations on x. bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index ad5eae4ae39a2..c74f7627c5e25 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -2828,7 +2828,7 @@ genAtomicUpdate(lower::AbstractConverter &converter, mlir::Location loc, // This must exist by now. 
SomeExpr input = *semantics::GetConvertInput(assign.rhs); - std::vector args{semantics::GetOpenMPTopLevelArguments(input)}; + std::vector args{semantics::GetTopLevelOperation(input).second}; assert(!args.empty() && "Update operation without arguments"); for (auto &arg : args) { if (!semantics::IsSameOrConvertOf(arg, atom)) { diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 89a3a407441a8..f29a56d5fd92a 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -2891,290 +2891,6 @@ static std::pair SplitAssignmentSource( namespace atomic { -template static void MoveAppend(V &accum, V &&other) { - for (auto &&s : other) { - accum.push_back(std::move(s)); - } -} - -enum class Operator { - Unk, - // Operators that are officially allowed in the update operation - Add, - And, - Associated, - Div, - Eq, - Eqv, - Ge, // extension - Gt, - Identity, // extension: x = x is allowed (*), but we should never print - // "identity" as the name of the operator - Le, // extension - Lt, - Max, - Min, - Mul, - Ne, // extension - Neqv, - Or, - Sub, - // Operators that we recognize for technical reasons - True, - False, - Not, - Convert, - Resize, - Intrinsic, - Call, - Pow, - - // (*): "x = x + 0" is a valid update statement, but it will be folded - // to "x = x" by the time we look at it. Since the source statements - // "x = x" and "x = x + 0" will end up looking the same, accept the - // former as an extension. 
-}; - -std::string ToString(Operator op) { - switch (op) { - case Operator::Add: - return "+"; - case Operator::And: - return "AND"; - case Operator::Associated: - return "ASSOCIATED"; - case Operator::Div: - return "/"; - case Operator::Eq: - return "=="; - case Operator::Eqv: - return "EQV"; - case Operator::Ge: - return ">="; - case Operator::Gt: - return ">"; - case Operator::Identity: - return "identity"; - case Operator::Le: - return "<="; - case Operator::Lt: - return "<"; - case Operator::Max: - return "MAX"; - case Operator::Min: - return "MIN"; - case Operator::Mul: - return "*"; - case Operator::Neqv: - return "NEQV/EOR"; - case Operator::Ne: - return "/="; - case Operator::Or: - return "OR"; - case Operator::Sub: - return "-"; - case Operator::True: - return ".TRUE."; - case Operator::False: - return ".FALSE."; - case Operator::Not: - return "NOT"; - case Operator::Convert: - return "type-conversion"; - case Operator::Resize: - return "resize"; - case Operator::Intrinsic: - return "intrinsic"; - case Operator::Call: - return "function-call"; - case Operator::Pow: - return "**"; - default: - return "??"; - } -} - -template // -struct ArgumentExtractor - : public evaluate::Traverse, - std::pair>, false> { - using Arguments = std::vector; - using Result = std::pair; - using Base = evaluate::Traverse, - Result, false>; - static constexpr auto IgnoreResizes = IgnoreResizingConverts; - static constexpr auto Logical = common::TypeCategory::Logical; - ArgumentExtractor() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result operator()( - const evaluate::Constant> &x) const { - if (const auto &val{x.GetScalarValue()}) { - return val->IsTrue() ? 
std::make_pair(Operator::True, Arguments{}) - : std::make_pair(Operator::False, Arguments{}); - } - return Default(); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - Result result{OperationCode(x.proc()), {}}; - for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { - if (auto *e{x.UnwrapArgExpr(i)}) { - result.second.push_back(*e); - } - } - return result; - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore top-level parentheses. - return (*this)(x.template operand<0>()); - } - if constexpr (IgnoreResizes && - std::is_same_v>) { - // Ignore conversions within the same category. - // Atomic operations on int(kind=1) may be implicitly widened - // to int(kind=4) for example. - return (*this)(x.template operand<0>()); - } else { - return std::make_pair( - OperationCode(x), OperationArgs(x, std::index_sequence_for{})); - } - } - - template // - Result operator()(const evaluate::Designator &x) const { - evaluate::Designator copy{x}; - Result result{Operator::Identity, {AsGenericExpr(std::move(copy))}}; - return result; - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - // There shouldn't be any combining needed, since we're stopping the - // traversal at the top-level operation, but implement one that picks - // the first non-empty result. 
- if constexpr (sizeof...(Rs) == 0) { - return std::move(result); - } else { - if (!result.second.empty()) { - return std::move(result); - } else { - return Combine(std::move(results)...); - } - } - } - -private: - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) - const { - switch (op.derived().logicalOperator) { - case common::LogicalOperator::And: - return Operator::And; - case common::LogicalOperator::Or: - return Operator::Or; - case common::LogicalOperator::Eqv: - return Operator::Eqv; - case common::LogicalOperator::Neqv: - return Operator::Neqv; - case common::LogicalOperator::Not: - return Operator::Not; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - switch (op.derived().opr) { - case common::RelationalOperator::LT: - return Operator::Lt; - case common::RelationalOperator::LE: - return Operator::Le; - case common::RelationalOperator::EQ: - return Operator::Eq; - case common::RelationalOperator::NE: - return Operator::Ne; - case common::RelationalOperator::GE: - return Operator::Ge; - case common::RelationalOperator::GT: - return Operator::Gt; - } - return Operator::Unk; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Add; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Sub; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Mul; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Div; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - return Operator::Pow; - } - template - Operator OperationCode( - const evaluate::Operation, Ts...> &op) const { - if constexpr (C == 
T::category) { - return Operator::Resize; - } else { - return Operator::Convert; - } - } - Operator OperationCode(const evaluate::ProcedureDesignator &proc) const { - Operator code = llvm::StringSwitch(proc.GetName()) - .Case("associated", Operator::Associated) - .Case("min", Operator::Min) - .Case("max", Operator::Max) - .Case("iand", Operator::And) - .Case("ior", Operator::Or) - .Case("ieor", Operator::Neqv) - .Default(Operator::Call); - if (code == Operator::Call && proc.GetSpecificIntrinsic()) { - return Operator::Intrinsic; - } - return code; - } - template // - Operator OperationCode(const T &) const { - return Operator::Unk; - } - - template - Arguments OperationArgs(const evaluate::Operation &x, - std::index_sequence) const { - return Arguments{SomeExpr(x.template operand())...}; - } -}; - struct DesignatorCollector : public evaluate::Traverse, false> { using Result = std::vector; @@ -3196,125 +2912,14 @@ struct DesignatorCollector : public evaluate::Traverse // Result Combine(Result &&result, Rs &&...results) const { Result v(std::move(result)); - (MoveAppend(v, std::move(results)), ...); - return v; - } -}; - -struct ConvertCollector - : public evaluate::Traverse>, false> { - using Result = std::pair>; - using Base = evaluate::Traverse; - ConvertCollector() : Base(*this) {} - - Result Default() const { return {}; } - - using Base::operator(); - - template // - Result asSomeExpr(const T &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; - } - - template // - Result operator()(const evaluate::Designator &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::FunctionRef &x) const { - return asSomeExpr(x); - } - - template // - Result operator()(const evaluate::Constant &x) const { - return asSomeExpr(x); - } - - template - Result operator()(const evaluate::Operation &x) const { - if constexpr (std::is_same_v>) { - // Ignore parentheses. 
- return (*this)(x.template operand<0>()); - } else if constexpr (is_convert_v) { - // Convert should always have a typed result, so it should be safe to - // dereference x.GetType(). - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else if constexpr (is_complex_constructor_v) { - // This is a conversion iff the imaginary operand is 0. - if (IsZero(x.template operand<1>())) { - return Combine( - {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); - } else { - return asSomeExpr(x.derived()); - } - } else { - return asSomeExpr(x.derived()); - } - } - - template // - Result Combine(Result &&result, Rs &&...results) const { - Result v(std::move(result)); - auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { - assert((!x.has_value() || !y.has_value()) && "Multiple designators"); - if (!x.has_value()) { - x = std::move(y); + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); } }}; - (setValue(v.first, std::move(results).first), ...); - (MoveAppend(v.second, std::move(results).second), ...); + (moveAppend(v, std::move(results)), ...); return v; } - -private: - template // - static bool IsZero(const T &x) { - return false; - } - template // - static bool IsZero(const evaluate::Expr &x) { - return common::visit([](auto &&s) { return IsZero(s); }, x.u); - } - template // - static bool IsZero(const evaluate::Constant &x) { - if (auto &&maybeScalar{x.GetScalarValue()}) { - return maybeScalar->IsZero(); - } else { - return false; - } - } - - template // - struct is_convert { - static constexpr bool value{false}; - }; - template // - struct is_convert> { - static constexpr bool value{true}; - }; - template // - struct is_convert> { - // Conversion from complex to real. 
- static constexpr bool value{true}; - }; - template // - static constexpr bool is_convert_v = is_convert::value; - - template // - struct is_complex_constructor { - static constexpr bool value{false}; - }; - template // - struct is_complex_constructor> { - static constexpr bool value{true}; - }; - template // - static constexpr bool is_complex_constructor_v = - is_complex_constructor::value; }; struct VariableFinder : public evaluate::AnyTraverse { @@ -3347,22 +2952,13 @@ static bool IsAllocatable(const SomeExpr &expr) { return !syms.empty() && IsAllocatable(syms.back()); } -static std::pair> GetTopLevelOperation( - const SomeExpr &expr) { - return atomic::ArgumentExtractor{}(expr); -} - -std::vector GetOpenMPTopLevelArguments(const SomeExpr &expr) { - return GetTopLevelOperation(expr).second; -} - static bool IsPointerAssignment(const evaluate::Assignment &x) { return std::holds_alternative(x.u) || std::holds_alternative(x.u); } static bool IsCheckForAssociated(const SomeExpr &cond) { - return GetTopLevelOperation(cond).first == atomic::Operator::Associated; + return GetTopLevelOperation(cond).first == operation::Operator::Associated; } static bool HasCommonDesignatorSymbols( @@ -3455,23 +3051,7 @@ static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -MaybeExpr GetConvertInput(const SomeExpr &x) { - // This returns SomeExpr(x) when x is a designator/functionref/constant. - return atomic::ConvertCollector{}(x).first; -} - -bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { - // Check if expr is same as x, or a sequence of Convert operations on x. 
- if (expr == x) { - return true; - } else if (auto maybe{GetConvertInput(expr)}) { - return *maybe == x; - } else { - return false; - } -} - -bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { +static bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { return atomic::VariableFinder{sub}(super); } @@ -3839,45 +3419,46 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( CheckAtomicVariable(atom, lsrc); - std::pair> top{ - atomic::Operator::Unk, {}}; + std::pair> top{ + operation::Operator::Unknown, {}}; if (auto &&maybeInput{GetConvertInput(update.rhs)}) { top = GetTopLevelOperation(*maybeInput); } switch (top.first) { - case atomic::Operator::Add: - case atomic::Operator::Sub: - case atomic::Operator::Mul: - case atomic::Operator::Div: - case atomic::Operator::And: - case atomic::Operator::Or: - case atomic::Operator::Eqv: - case atomic::Operator::Neqv: - case atomic::Operator::Min: - case atomic::Operator::Max: - case atomic::Operator::Identity: + case operation::Operator::Add: + case operation::Operator::Sub: + case operation::Operator::Mul: + case operation::Operator::Div: + case operation::Operator::And: + case operation::Operator::Or: + case operation::Operator::Eqv: + case operation::Operator::Neqv: + case operation::Operator::Min: + case operation::Operator::Max: + case operation::Operator::Identity: break; - case atomic::Operator::Call: + case operation::Operator::Call: context_.Say(source, "A call to this function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Convert: + case operation::Operator::Convert: context_.Say(source, "An implicit or explicit type conversion is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Intrinsic: + case operation::Operator::Intrinsic: context_.Say(source, "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); return; - case atomic::Operator::Unk: + case operation::Operator::Unknown: 
context_.Say( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); return; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(source, "The %s operator is not a valid ATOMIC UPDATE operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); return; } // Check if `atom` occurs exactly once in the argument list. @@ -3898,17 +3479,17 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( }()}; if (unique == top.second.end()) { - if (top.first == atomic::Operator::Identity) { + if (top.first == operation::Operator::Identity) { // This is "x = y". context_.Say(rsrc, "The atomic variable %s should appear as an argument in the update operation"_err_en_US, atom.AsFortran()); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(rsrc, "The atomic variable %s should occur exactly once among the arguments of the top-level %s operator"_err_en_US, - atom.AsFortran(), atomic::ToString(top.first)); + atom.AsFortran(), operation::ToString(top.first)); } } else { CheckStorageOverlap(atom, nonAtom, source); @@ -3933,18 +3514,18 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( // Missing arguments to operations would have been diagnosed by now. 
switch (top.first) { - case atomic::Operator::Associated: + case operation::Operator::Associated: if (atom != top.second.front()) { context_.Say(assignSource, "The pointer argument to ASSOCIATED must be same as the target of the assignment"_err_en_US); } break; // x equalop e | e equalop x (allowing "e equalop x" is an extension) - case atomic::Operator::Eq: - case atomic::Operator::Eqv: + case operation::Operator::Eq: + case operation::Operator::Eqv: // x ordop expr | expr ordop x - case atomic::Operator::Lt: - case atomic::Operator::Gt: { + case operation::Operator::Lt: + case operation::Operator::Gt: { const SomeExpr &arg0{top.second[0]}; const SomeExpr &arg1{top.second[1]}; if (IsSameOrConvertOf(arg0, atom)) { @@ -3952,23 +3533,24 @@ void OmpStructureChecker::CheckAtomicConditionalUpdateAssignment( } else if (IsSameOrConvertOf(arg1, atom)) { CheckStorageOverlap(atom, {arg0}, condSource); } else { - assert( - top.first != atomic::Operator::Identity && "Handle this separately"); + assert(top.first != operation::Operator::Identity && + "Handle this separately"); context_.Say(assignSource, "An argument of the %s operator should be the target of the assignment"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); } break; } - case atomic::Operator::Identity: - case atomic::Operator::True: - case atomic::Operator::False: + case operation::Operator::Identity: + case operation::Operator::True: + case operation::Operator::False: break; default: - assert(top.first != atomic::Operator::Identity && "Handle this separately"); + assert( + top.first != operation::Operator::Identity && "Handle this separately"); context_.Say(condSource, "The %s operator is not a valid condition for ATOMIC operation"_err_en_US, - atomic::ToString(top.first)); + operation::ToString(top.first)); break; } } diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 1d1e3ac044166..fce930dcc1d02 100644 --- a/flang/lib/Semantics/tools.cpp +++ 
b/flang/lib/Semantics/tools.cpp @@ -17,6 +17,7 @@ #include "flang/Semantics/tools.h" #include "flang/Semantics/type.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringSwitch.h" #include "llvm/Support/raw_ostream.h" #include #include @@ -1756,4 +1757,313 @@ bool HadUseError( } } +namespace operation { +template // +struct ArgumentExtractor + : public evaluate::Traverse, + std::pair>, false> { + using Arguments = std::vector; + using Result = std::pair; + using Base = evaluate::Traverse, + Result, false>; + static constexpr auto IgnoreResizes = IgnoreResizingConverts; + static constexpr auto Logical = common::TypeCategory::Logical; + ArgumentExtractor() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()( + const evaluate::Constant> &x) const { + if (const auto &val{x.GetScalarValue()}) { + return val->IsTrue() + ? std::make_pair(operation::Operator::True, Arguments{}) + : std::make_pair(operation::Operator::False, Arguments{}); + } + return Default(); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + Result result{operation::OperationCode(x.proc()), {}}; + for (size_t i{0}, e{x.arguments().size()}; i != e; ++i) { + if (auto *e{x.UnwrapArgExpr(i)}) { + result.second.push_back(*e); + } + } + return result; + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore top-level parentheses. + return (*this)(x.template operand<0>()); + } + if constexpr (IgnoreResizes && + std::is_same_v>) { + // Ignore conversions within the same category. + // Atomic operations on int(kind=1) may be implicitly widened + // to int(kind=4) for example. 
+ return (*this)(x.template operand<0>()); + } else { + return std::make_pair(operation::OperationCode(x), + OperationArgs(x, std::index_sequence_for{})); + } + } + + template // + Result operator()(const evaluate::Designator &x) const { + evaluate::Designator copy{x}; + Result result{ + operation::Operator::Identity, {AsGenericExpr(std::move(copy))}}; + return result; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + // There shouldn't be any combining needed, since we're stopping the + // traversal at the top-level operation, but implement one that picks + // the first non-empty result. + if constexpr (sizeof...(Rs) == 0) { + return std::move(result); + } else { + if (!result.second.empty()) { + return std::move(result); + } else { + return Combine(std::move(results)...); + } + } + } + +private: + template + Arguments OperationArgs(const evaluate::Operation &x, + std::index_sequence) const { + return Arguments{SomeExpr(x.template operand())...}; + } +}; +} // namespace operation + +std::string operation::ToString(operation::Operator op) { + switch (op) { + case Operator::Add: + return "+"; + case Operator::And: + return "AND"; + case Operator::Associated: + return "ASSOCIATED"; + case Operator::Div: + return "/"; + case Operator::Eq: + return "=="; + case Operator::Eqv: + return "EQV"; + case Operator::Ge: + return ">="; + case Operator::Gt: + return ">"; + case Operator::Identity: + return "identity"; + case Operator::Le: + return "<="; + case Operator::Lt: + return "<"; + case Operator::Max: + return "MAX"; + case Operator::Min: + return "MIN"; + case Operator::Mul: + return "*"; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Ne: + return "/="; + case Operator::Or: + return "OR"; + case Operator::Sub: + return "-"; + case Operator::True: + return ".TRUE."; + case Operator::False: + return ".FALSE."; + case Operator::Not: + return "NOT"; + case Operator::Convert: + return "type-conversion"; + case Operator::Resize: + 
return "resize"; + case Operator::Intrinsic: + return "intrinsic"; + case Operator::Call: + return "function-call"; + case Operator::Pow: + return "**"; + default: + return "??"; + } +} + +operation::Operator operation::OperationCode( + const evaluate::ProcedureDesignator &proc) { + Operator code = llvm::StringSwitch(proc.GetName()) + .Case("associated", Operator::Associated) + .Case("min", Operator::Min) + .Case("max", Operator::Max) + .Case("iand", Operator::And) + .Case("ior", Operator::Or) + .Case("ieor", Operator::Neqv) + .Default(Operator::Call); + if (code == Operator::Call && proc.GetSpecificIntrinsic()) { + return Operator::Intrinsic; + } + return code; +} + +std::pair> GetTopLevelOperation( + const SomeExpr &expr) { + return operation::ArgumentExtractor{}(expr); +} + +namespace operation { +struct ConvertCollector + : public evaluate::Traverse>, false> { + using Result = std::pair>; + using Base = evaluate::Traverse; + ConvertCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result asSomeExpr(const T &x) const { + auto copy{x}; + return {AsGenericExpr(std::move(copy)), {}}; + } + + template // + Result operator()(const evaluate::Designator &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::FunctionRef &x) const { + return asSomeExpr(x); + } + + template // + Result operator()(const evaluate::Constant &x) const { + return asSomeExpr(x); + } + + template + Result operator()(const evaluate::Operation &x) const { + if constexpr (std::is_same_v>) { + // Ignore parentheses. + return (*this)(x.template operand<0>()); + } else if constexpr (is_convert_v) { + // Convert should always have a typed result, so it should be safe to + // dereference x.GetType(). + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else if constexpr (is_complex_constructor_v) { + // This is a conversion iff the imaginary operand is 0. 
+ if (IsZero(x.template operand<1>())) { + return Combine( + {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); + } else { + return asSomeExpr(x.derived()); + } + } else { + return asSomeExpr(x.derived()); + } + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto setValue{[](MaybeExpr &x, MaybeExpr &&y) { + assert((!x.has_value() || !y.has_value()) && "Multiple designators"); + if (!x.has_value()) { + x = std::move(y); + } + }}; + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); + } + }}; + (setValue(v.first, std::move(results).first), ...); + (moveAppend(v.second, std::move(results).second), ...); + return v; + } + +private: + template // + static bool IsZero(const T &x) { + return false; + } + template // + static bool IsZero(const evaluate::Expr &x) { + return common::visit([](auto &&s) { return IsZero(s); }, x.u); + } + template // + static bool IsZero(const evaluate::Constant &x) { + if (auto &&maybeScalar{x.GetScalarValue()}) { + return maybeScalar->IsZero(); + } else { + return false; + } + } + + template // + struct is_convert { + static constexpr bool value{false}; + }; + template // + struct is_convert> { + static constexpr bool value{true}; + }; + template // + struct is_convert> { + // Conversion from complex to real. + static constexpr bool value{true}; + }; + template // + static constexpr bool is_convert_v = is_convert::value; + + template // + struct is_complex_constructor { + static constexpr bool value{false}; + }; + template // + struct is_complex_constructor> { + static constexpr bool value{true}; + }; + template // + static constexpr bool is_complex_constructor_v = + is_complex_constructor::value; +}; +} // namespace operation + +MaybeExpr GetConvertInput(const SomeExpr &x) { + // This returns SomeExpr(x) when x is a designator/functionref/constant. 
+ return operation::ConvertCollector{}(x).first; +} + +bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x) { + // Check if expr is same as x, or a sequence of Convert operations on x. + if (expr == x) { + return true; + } else if (auto maybe{GetConvertInput(expr)}) { + return *maybe == x; + } else { + return false; + } +} + } // namespace Fortran::semantics >From a83a1cf262eb9f01aafbcf099a8467aa9b861187 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 13:05:15 -0500 Subject: [PATCH 25/30] format --- flang/include/flang/Semantics/tools.h | 28 +++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 25fadceefceb0..9454f0b489192 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -830,8 +830,8 @@ Operator OperationCode( } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { switch (op.derived().opr) { case common::RelationalOperator::LT: return Operator::Lt; @@ -855,26 +855,26 @@ Operator OperationCode(const evaluate::Operation, Ts...> &op) { } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Sub; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Mul; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Div; } template -Operator -OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { return Operator::Pow; } @@ -885,8 +885,8 @@ Operator OperationCode( } template -Operator 
-OperationCode(const evaluate::Operation, Ts...> &op) { +Operator OperationCode( + const evaluate::Operation, Ts...> &op) { if constexpr (C == T::category) { return Operator::Resize; } else { @@ -905,8 +905,8 @@ Operator OperationCode(const evaluate::ProcedureDesignator &proc); /// Return information about the top-level operation (ignoring parentheses): /// the operation code and the list of arguments. -std::pair> -GetTopLevelOperation(const SomeExpr &expr); +std::pair> GetTopLevelOperation( + const SomeExpr &expr); /// Check if expr is same as x, or a sequence of Convert operations on x. bool IsSameOrConvertOf(const SomeExpr &expr, const SomeExpr &x); >From 9770e4d3c5b0a858f8b5864a7aada01946763450 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 14:45:18 -0500 Subject: [PATCH 26/30] Restore accidentally removed Le --- flang/include/flang/Semantics/tools.h | 1 + 1 file changed, 1 insertion(+) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 9454f0b489192..9766effba3ebe 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -794,6 +794,7 @@ enum class Operator { Gt, Identity, Intrinsic, + Le, Lt, Max, Min, >From 9b8aaa5586334b48fbb28c103eda1091168342e0 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 14:45:47 -0500 Subject: [PATCH 27/30] Recognize constants as "operations" This allows emitting slightly better diagnostic messages. 
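For readers skimming this series, the idea of classifying the top-level node of an expression, with a bare constant getting its own code, can be sketched outside of flang. This is only an illustration under stated assumptions: `Expr`, `Constant`, `Variable`, `Add` and `Make` below are hypothetical stand-ins and not flang's `evaluate::` types, and the reduced `Operator` enum keeps just four of the enumerators from the patch. The sketch reports `Constant` for a bare constant, which mirrors the `OperationCode(const evaluate::Constant &)` overload in this patch; it makes no claim about what the `ArgumentExtractor` traversal itself returns in this revision.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical, reduced version of the Operator enum from the patch.
enum class Operator { Unknown, Add, Constant, Identity };

// Hypothetical expression tree; flang's evaluate::Expr is far richer.
struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

struct Constant { int value; };
struct Variable { std::string name; };
struct Add { ExprPtr lhs, rhs; };

struct Expr {
  std::variant<Constant, Variable, Add> u;
};

// Hypothetical helper for building nodes.
ExprPtr Make(Expr e) { return std::make_shared<Expr>(std::move(e)); }

std::string ToString(Operator op) {
  switch (op) {
  case Operator::Add:
    return "+";
  case Operator::Constant:
    return "constant";
  case Operator::Identity:
    return "identity";
  case Operator::Unknown:
    break;
  }
  return "??";
}

using Result = std::pair<Operator, std::vector<ExprPtr>>;

// Classify only the top-level node and return its operation code together
// with its immediate arguments; nothing below the top level is visited.
Result GetTopLevelOperation(const ExprPtr &expr) {
  assert(expr && "expression must not be null");
  struct Visitor {
    const ExprPtr &self;
    Result operator()(const Constant &) const {
      return {Operator::Constant, {self}};
    }
    Result operator()(const Variable &) const {
      return {Operator::Identity, {self}};
    }
    Result operator()(const Add &a) const {
      return {Operator::Add, {a.lhs, a.rhs}};
    }
  };
  return std::visit(Visitor{expr}, expr->u);
}
```

With a distinct code for constants, a caller can tell "x = 1" apart from an expression it could not classify at all, which is the kind of distinction a more specific diagnostic would need.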
--- flang/include/flang/Semantics/tools.h | 8 ++- flang/lib/Semantics/check-omp-structure.cpp | 1 + flang/lib/Semantics/tools.cpp | 71 +++++++++++---------- flang/test/Semantics/OpenMP/atomic04.f90 | 2 +- flang/test/Semantics/OpenMP/atomic05.f90 | 2 +- 5 files changed, 48 insertions(+), 36 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 9766effba3ebe..7a2be79f14a29 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -781,10 +781,12 @@ inline bool checkForSymbolMatch(const Fortran::semantics::SomeExpr *lhs, namespace operation { enum class Operator { + Unknown, Add, And, Associated, Call, + Constant, Convert, Div, Eq, @@ -807,7 +809,6 @@ enum class Operator { Resize, // Convert within the same TypeCategory Sub, True, - Unknown, }; std::string ToString(Operator op); @@ -895,6 +896,11 @@ Operator OperationCode( } } +template +Operator OperationCode(const evaluate::Constant &x) { + return Operator::Constant; +} + template // Operator OperationCode(const T &) { return Operator::Unknown; diff --git a/flang/lib/Semantics/check-omp-structure.cpp b/flang/lib/Semantics/check-omp-structure.cpp index 7f96e48a303fe..3c27a3968a7c9 100644 --- a/flang/lib/Semantics/check-omp-structure.cpp +++ b/flang/lib/Semantics/check-omp-structure.cpp @@ -3459,6 +3459,7 @@ void OmpStructureChecker::CheckAtomicUpdateAssignment( context_.Say(source, "This intrinsic function is not a valid ATOMIC UPDATE operation"_err_en_US); return; + case operation::Operator::Constant: case operation::Operator::Unknown: context_.Say( source, "This is not a valid ATOMIC UPDATE operation"_err_en_US); diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index fce930dcc1d02..a8cd8a6ec2228 100644 --- a/flang/lib/Semantics/tools.cpp +++ b/flang/lib/Semantics/tools.cpp @@ -1758,6 +1758,12 @@ bool HadUseError( } namespace operation { +template // +SomeExpr asSomeExpr(const T &x) { + auto 
copy{x}; + return AsGenericExpr(std::move(copy)); +} + template // struct ArgumentExtractor : public evaluate::Traverse, @@ -1816,10 +1822,12 @@ struct ArgumentExtractor template // Result operator()(const evaluate::Designator &x) const { - evaluate::Designator copy{x}; - Result result{ - operation::Operator::Identity, {AsGenericExpr(std::move(copy))}}; - return result; + return {operation::Operator::Identity, {asSomeExpr(x)}}; + } + + template // + Result operator()(const evaluate::Constant &x) const { + return {operation::Operator::Identity, {asSomeExpr(x)}}; } template // @@ -1849,24 +1857,37 @@ struct ArgumentExtractor std::string operation::ToString(operation::Operator op) { switch (op) { + default: + case Operator::Unknown: + return "??"; case Operator::Add: return "+"; case Operator::And: return "AND"; case Operator::Associated: return "ASSOCIATED"; + case Operator::Call: + return "function-call"; + case Operator::Constant: + return "constant"; + case Operator::Convert: + return "type-conversion"; case Operator::Div: return "/"; case Operator::Eq: return "=="; case Operator::Eqv: return "EQV"; + case Operator::False: + return ".FALSE."; case Operator::Ge: return ">="; case Operator::Gt: return ">"; case Operator::Identity: return "identity"; + case Operator::Intrinsic: + return "intrinsic"; case Operator::Le: return "<="; case Operator::Lt: @@ -1877,32 +1898,22 @@ std::string operation::ToString(operation::Operator op) { return "MIN"; case Operator::Mul: return "*"; - case Operator::Neqv: - return "NEQV/EOR"; case Operator::Ne: return "/="; + case Operator::Neqv: + return "NEQV/EOR"; + case Operator::Not: + return "NOT"; case Operator::Or: return "OR"; + case Operator::Pow: + return "**"; + case Operator::Resize: + return "resize"; case Operator::Sub: return "-"; case Operator::True: return ".TRUE."; - case Operator::False: - return ".FALSE."; - case Operator::Not: - return "NOT"; - case Operator::Convert: - return "type-conversion"; - case Operator::Resize: 
- return "resize"; - case Operator::Intrinsic: - return "intrinsic"; - case Operator::Call: - return "function-call"; - case Operator::Pow: - return "**"; - default: - return "??"; } } @@ -1939,25 +1950,19 @@ struct ConvertCollector using Base::operator(); - template // - Result asSomeExpr(const T &x) const { - auto copy{x}; - return {AsGenericExpr(std::move(copy)), {}}; - } - template // Result operator()(const evaluate::Designator &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template // Result operator()(const evaluate::FunctionRef &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template // Result operator()(const evaluate::Constant &x) const { - return asSomeExpr(x); + return {asSomeExpr(x), {}}; } template @@ -1976,10 +1981,10 @@ struct ConvertCollector return Combine( {std::nullopt, {*x.GetType()}}, (*this)(x.template operand<0>())); } else { - return asSomeExpr(x.derived()); + return {asSomeExpr(x.derived()), {}}; } } else { - return asSomeExpr(x.derived()); + return {asSomeExpr(x.derived()), {}}; } } diff --git a/flang/test/Semantics/OpenMP/atomic04.f90 b/flang/test/Semantics/OpenMP/atomic04.f90 index d603ba8b3937c..0f69befed1414 100644 --- a/flang/test/Semantics/OpenMP/atomic04.f90 +++ b/flang/test/Semantics/OpenMP/atomic04.f90 @@ -180,7 +180,7 @@ subroutine more_invalid_atomic_update_stmts() x = x !$omp atomic update - !ERROR: This is not a valid ATOMIC UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 1 !$omp atomic update diff --git a/flang/test/Semantics/OpenMP/atomic05.f90 b/flang/test/Semantics/OpenMP/atomic05.f90 index e0103be4cae4a..77ffc6e57f1a3 100644 --- a/flang/test/Semantics/OpenMP/atomic05.f90 +++ b/flang/test/Semantics/OpenMP/atomic05.f90 @@ -19,7 +19,7 @@ program OmpAtomic x = 2 * 4 !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic update release, seq_cst - !ERROR: This is not a valid ATOMIC 
UPDATE operation + !ERROR: The atomic variable x should appear as an argument in the update operation x = 10 !ERROR: At most one clause from the 'memory-order' group is allowed on ATOMIC construct !$omp atomic capture release, seq_cst >From f7bc109276a7bb647b0c8f3d65af63fbfb3249dc Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 15:04:57 -0500 Subject: [PATCH 28/30] Add lit tests for dumping atomic analysis --- flang/lib/Lower/OpenMP/OpenMP.cpp | 8 +- .../Lower/OpenMP/dump-atomic-analysis.f90 | 82 +++++++++++++++++++ 2 files changed, 89 insertions(+), 1 deletion(-) create mode 100644 flang/test/Lower/OpenMP/dump-atomic-analysis.f90 diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 30acf8baba082..4c50717f8fde4 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -40,11 +40,14 @@ #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/Transforms/RegionUtils.h" #include "llvm/ADT/STLExtras.h" +#include "llvm/Support/CommandLine.h" #include "llvm/Frontend/OpenMP/OMPConstants.h" using namespace Fortran::lower::omp; using namespace Fortran::common::openmp; +static llvm::cl::opt DumpAtomicAnalysis("fdebug-dump-atomic-analysis"); + //===----------------------------------------------------------------------===// // Code generation helper functions //===----------------------------------------------------------------------===// @@ -3790,7 +3793,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable, //===----------------------------------------------------------------------===// [[maybe_unused]] static void -dumpAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { +dumpAtomicAnalysis(const parser::OpenMPAtomicConstruct::Analysis &analysis) { auto whatStr = [](int k) { std::string txt = "?"; switch (k & parser::OpenMPAtomicConstruct::Analysis::Action) { @@ -3869,6 +3872,9 @@ static void genOMP(lower::AbstractConverter 
&converter, lower::SymMap &symTable, lower::StatementContext stmtCtx; const parser::OpenMPAtomicConstruct::Analysis &analysis = construct.analysis; + if (DumpAtomicAnalysis) + dumpAtomicAnalysis(analysis); + const semantics::SomeExpr &atom = *get(analysis.atom); mlir::Location loc = converter.genLocation(construct.source); mlir::Value atomAddr = diff --git a/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 b/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 new file mode 100644 index 0000000000000..55c49f98cd2e8 --- /dev/null +++ b/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 @@ -0,0 +1,82 @@ +!RUN: %flang_fc1 -fopenmp -fopenmp-version=60 -emit-hlfir -mmlir -fdebug-dump-atomic-analysis %s -o /dev/null |& FileCheck %s + +subroutine f00(x) + integer :: x, v + !$omp atomic read + v = x +end + +!CHECK: Analysis { +!CHECK-NEXT: atom: x +!CHECK-NEXT: cond: +!CHECK-NEXT: op0 { +!CHECK-NEXT: what: Read +!CHECK-NEXT: assign: v=x +!CHECK-NEXT: } +!CHECK-NEXT: op1 { +!CHECK-NEXT: what: None +!CHECK-NEXT: assign: +!CHECK-NEXT: } +!CHECK-NEXT: } + + +subroutine f01(v) + integer :: x, v + !$omp atomic write + x = v +end + +!CHECK: Analysis { +!CHECK-NEXT: atom: x +!CHECK-NEXT: cond: +!CHECK-NEXT: op0 { +!CHECK-NEXT: what: Write +!CHECK-NEXT: assign: x=v +!CHECK-NEXT: } +!CHECK-NEXT: op1 { +!CHECK-NEXT: what: None +!CHECK-NEXT: assign: +!CHECK-NEXT: } +!CHECK-NEXT: } + + +subroutine f02(x, v) + integer :: x, v + !$omp atomic update + x = x + v +end + +!CHECK: Analysis { +!CHECK-NEXT: atom: x +!CHECK-NEXT: cond: +!CHECK-NEXT: op0 { +!CHECK-NEXT: what: Update +!CHECK-NEXT: assign: x=x+v +!CHECK-NEXT: } +!CHECK-NEXT: op1 { +!CHECK-NEXT: what: None +!CHECK-NEXT: assign: +!CHECK-NEXT: } +!CHECK-NEXT: } + + +subroutine f03(x, v) + integer :: x, v, t + !$omp atomic update capture + t = x + x = x + v + !$omp end atomic +end + +!CHECK: Analysis { +!CHECK-NEXT: atom: x +!CHECK-NEXT: cond: +!CHECK-NEXT: op0 { +!CHECK-NEXT: what: Read +!CHECK-NEXT: assign: t=x +!CHECK-NEXT: } 
+!CHECK-NEXT: op1 { +!CHECK-NEXT: what: Update +!CHECK-NEXT: assign: x=x+v +!CHECK-NEXT: } +!CHECK-NEXT: } >From 7355186f91e91088b58ba766974b82cbe45fb85a Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 15:26:04 -0500 Subject: [PATCH 29/30] format --- flang/include/flang/Semantics/tools.h | 2 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h index 7a2be79f14a29..1e30321269562 100644 --- a/flang/include/flang/Semantics/tools.h +++ b/flang/include/flang/Semantics/tools.h @@ -896,7 +896,7 @@ Operator OperationCode( } } -template +template // Operator OperationCode(const evaluate::Constant &x) { return Operator::Constant; } diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 4c50717f8fde4..f3f896dbf1ecc 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -40,8 +40,8 @@ #include "mlir/Dialect/OpenMP/OpenMPDialect.h" #include "mlir/Transforms/RegionUtils.h" #include "llvm/ADT/STLExtras.h" -#include "llvm/Support/CommandLine.h" #include "llvm/Frontend/OpenMP/OMPConstants.h" +#include "llvm/Support/CommandLine.h" using namespace Fortran::lower::omp; using namespace Fortran::common::openmp; >From dbe7a5272c9233bba766fbf5afce669254fc4da1 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Fri, 30 May 2025 16:22:32 -0500 Subject: [PATCH 30/30] Fix test maybe --- flang/test/Lower/OpenMP/dump-atomic-analysis.f90 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 b/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 index 55c49f98cd2e8..cbaf7bc9f2d8a 100644 --- a/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 +++ b/flang/test/Lower/OpenMP/dump-atomic-analysis.f90 @@ -1,4 +1,4 @@ -!RUN: %flang_fc1 -fopenmp -fopenmp-version=60 -emit-hlfir -mmlir -fdebug-dump-atomic-analysis %s -o 
/dev/null |& FileCheck %s +!RUN: %flang_fc1 -fopenmp -fopenmp-version=60 -emit-hlfir -mmlir -fdebug-dump-atomic-analysis %s -o /dev/null 2>&1 | FileCheck %s subroutine f00(x) integer :: x, v From flang-commits at lists.llvm.org Fri May 30 15:24:21 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 30 May 2025 15:24:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683a3015.170a0220.257460.d215@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/142022 >From 8f3fd2daab46f477e87043c66b3049dff4a5b20e Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:11:04 -0700 Subject: [PATCH 1/4] initial commit --- flang/include/flang/Common/enum-class.h | 47 ++++- .../include/flang/Support/Fortran-features.h | 51 ++++-- flang/lib/Frontend/CompilerInvocation.cpp | 62 ++++--- flang/lib/Support/CMakeLists.txt | 1 + flang/lib/Support/Fortran-features.cpp | 168 ++++++++++++++---- flang/lib/Support/enum-class.cpp | 24 +++ flang/test/Driver/disable-diagnostic.f90 | 19 ++ flang/test/Driver/werror-wrong.f90 | 7 +- flang/test/Driver/wextra-ok.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 3 + flang/unittests/Common/EnumClassTests.cpp | 45 +++++ .../unittests/Common/FortranFeaturesTest.cpp | 142 +++++++++++++++ 12 files changed, 483 insertions(+), 88 deletions(-) create mode 100644 flang/lib/Support/enum-class.cpp create mode 100644 flang/test/Driver/disable-diagnostic.f90 create mode 100644 flang/unittests/Common/EnumClassTests.cpp create mode 100644 flang/unittests/Common/FortranFeaturesTest.cpp diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index 41575d45091a8..baf9fe418141d 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -18,8 +18,9 @@ #define FORTRAN_COMMON_ENUM_CLASS_H_ #include -#include - 
+#include +#include +#include namespace Fortran::common { constexpr std::size_t CountEnumNames(const char *p) { @@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) { return result; } +template +std::optional inline fmap(std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +std::optional FindEnumIndex( + Predicate pred, int size, const std::string_view *names); + +using FindEnumIndexType = std::optional( + Predicate, int, const std::string_view *); + +template +std::optional inline FindEnum( + Predicate pred, std::function(Predicate)> find) { + std::function f = [](int x) { return static_cast(x); }; + return fmap(find(pred), f); +} + #define ENUM_CLASS(NAME, ...) \ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ ::Fortran::common::CountEnumNames(#__VA_ARGS__)}; \ + [[maybe_unused]] static constexpr std::array NAME##_names{ \ + ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ [[maybe_unused]] static inline std::string_view EnumToString(NAME e) { \ - static const constexpr auto names{ \ - ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ - return names[static_cast(e)]; \ + return NAME##_names[static_cast(e)]; \ } +#define ENUM_CLASS_EXTRA(NAME) \ + [[maybe_unused]] inline std::optional Find##NAME##Index( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnumIndex( \ + p, NAME##_enumSize, NAME##_names.data()); \ + } \ + [[maybe_unused]] inline std::optional Find##NAME( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + } \ + [[maybe_unused]] inline std::optional StringTo##NAME( \ + const std::string_view name) { \ + return Find##NAME( \ + [name](const std::string_view s) -> bool { return name == s; }); \ + } } // namespace Fortran::common #endif // FORTRAN_COMMON_ENUM_CLASS_H_ diff --git 
a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index e696da9042480..d5aa7357ffea0 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -12,6 +12,8 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" #include "flang/Common/idioms.h" +#include "llvm/Support/Error.h" +#include "llvm/Support/raw_ostream.h" #include #include @@ -79,12 +81,13 @@ ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, NullActualForDefaultIntentAllocatable, UseAssociationIntoSameNameSubprogram, HostAssociatedIntentOutInSpecExpr, NonVolatilePointerToVolatile) +// Generate default String -> Enum mapping. +ENUM_CLASS_EXTRA(LanguageFeature) +ENUM_CLASS_EXTRA(UsageWarning) + using LanguageFeatures = EnumSet; using UsageWarnings = EnumSet; -std::optional FindLanguageFeature(const char *); -std::optional FindUsageWarning(const char *); - class LanguageFeatureControl { public: LanguageFeatureControl(); @@ -97,8 +100,10 @@ class LanguageFeatureControl { void EnableWarning(UsageWarning w, bool yes = true) { warnUsage_.set(w, yes); } - void WarnOnAllNonstandard(bool yes = true) { warnAllLanguage_ = yes; } - void WarnOnAllUsage(bool yes = true) { warnAllUsage_ = yes; } + void WarnOnAllNonstandard(bool yes = true); + bool IsWarnOnAllNonstandard() const { return warnAllLanguage_; } + void WarnOnAllUsage(bool yes = true); + bool IsWarnOnAllUsage() const { return warnAllUsage_; } void DisableAllNonstandardWarnings() { warnAllLanguage_ = false; warnLanguage_.clear(); @@ -107,16 +112,16 @@ class LanguageFeatureControl { warnAllUsage_ = false; warnUsage_.clear(); } - - bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } - bool ShouldWarn(LanguageFeature f) const { - return (warnAllLanguage_ && f != LanguageFeature::OpenMP && - f != LanguageFeature::OpenACC && f != LanguageFeature::CUDA) || - warnLanguage_.test(f); - } - bool ShouldWarn(UsageWarning w) const { - return 
warnAllUsage_ || warnUsage_.test(w); + void DisableAllWarnings() { + disableAllWarnings_ = true; + DisableAllNonstandardWarnings(); + DisableAllUsageWarnings(); } + bool applyCLIOption(llvm::StringRef input); + bool AreWarningsDisabled() const { return disableAllWarnings_; } + bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } + bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); } + bool ShouldWarn(UsageWarning w) const { return warnUsage_.test(w); } // Return all spellings of operators names, depending on features enabled std::vector GetNames(LogicalOperator) const; std::vector GetNames(RelationalOperator) const; @@ -127,6 +132,24 @@ class LanguageFeatureControl { bool warnAllLanguage_{false}; UsageWarnings warnUsage_; bool warnAllUsage_{false}; + bool disableAllWarnings_{false}; }; + +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). Just exposed for the template below. +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find); + +template +std::optional> parseCLIEnum( + llvm::StringRef input, std::function(Predicate)> find) { + using To = std::pair; + using From = std::pair; + static std::function cast = [](From x) { + return std::pair{x.first, static_cast(x.second)}; + }; + return fmap(parseCLIEnumIndex(input, find), cast); +} + } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..9ea568549bd6c 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -34,6 +34,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" +#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" @@ -45,6 +46,7 @@ #include 
#include #include +#include using namespace Fortran::frontend; @@ -971,10 +973,23 @@ static bool parseSemaArgs(CompilerInvocation &res, llvm::opt::ArgList &args, /// Parses all diagnostics related arguments and populates the variables /// options accordingly. Returns false if new errors are generated. +/// FC1 driver entry point for parsing diagnostic arguments. static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, clang::DiagnosticsEngine &diags) { unsigned numErrorsBefore = diags.getNumErrors(); + auto &features = res.getFrontendOpts().features; + // The order of these flags (-pedantic -W -w) is important and is + // chosen to match clang's behavior. + + // -pedantic + if (args.hasArg(clang::driver::options::OPT_pedantic)) { + features.WarnOnAllNonstandard(); + features.WarnOnAllUsage(); + res.setEnableConformanceChecks(); + res.setEnableUsageChecks(); + } + // -Werror option // TODO: Currently throws a Diagnostic for anything other than -W, // this has to change when other -W's are supported. @@ -984,21 +999,27 @@ static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, for (const auto &wArg : wArgs) { if (wArg == "error") { res.setWarnAsErr(true); - } else { - const unsigned diagID = - diags.getCustomDiagID(clang::DiagnosticsEngine::Error, - "Only `-Werror` is supported currently."); - diags.Report(diagID); + // -W(no-) + } else if (!features.applyCLIOption(wArg)) { + const unsigned diagID = diags.getCustomDiagID( + clang::DiagnosticsEngine::Error, "Unknown diagnostic option: -W%0"); + diags.Report(diagID) << wArg; } } } + // -w + if (args.hasArg(clang::driver::options::OPT_w)) { + features.DisableAllWarnings(); + res.setDisableWarnings(); + } + // Default to off for `flang -fc1`. - res.getFrontendOpts().showColors = - parseShowColorsArgs(args, /*defaultDiagColor=*/false); + bool showColors = parseShowColorsArgs(args, false); - // Honor color diagnostics. 
- res.getDiagnosticOpts().ShowColors = res.getFrontendOpts().showColors; + diags.getDiagnosticOptions().ShowColors = showColors; + res.getDiagnosticOpts().ShowColors = showColors; + res.getFrontendOpts().showColors = showColors; return diags.getNumErrors() == numErrorsBefore; } @@ -1074,16 +1095,6 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, Fortran::common::LanguageFeature::OpenACC); } - // -pedantic - if (args.hasArg(clang::driver::options::OPT_pedantic)) { - res.setEnableConformanceChecks(); - res.setEnableUsageChecks(); - } - - // -w - if (args.hasArg(clang::driver::options::OPT_w)) - res.setDisableWarnings(); - // -std=f2018 // TODO: Set proper options when more fortran standards // are supported. @@ -1092,6 +1103,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, // We only allow f2018 as the given standard if (standard == "f2018") { res.setEnableConformanceChecks(); + res.getFrontendOpts().features.WarnOnAllNonstandard(); } else { const unsigned diagID = diags.getCustomDiagID(clang::DiagnosticsEngine::Error, @@ -1099,6 +1111,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, diags.Report(diagID); } } + return diags.getNumErrors() == numErrorsBefore; } @@ -1694,16 +1707,7 @@ void CompilerInvocation::setFortranOpts() { if (frontendOptions.needProvenanceRangeToCharBlockMappings) fortranOptions.needProvenanceRangeToCharBlockMappings = true; - if (getEnableConformanceChecks()) - fortranOptions.features.WarnOnAllNonstandard(); - - if (getEnableUsageChecks()) - fortranOptions.features.WarnOnAllUsage(); - - if (getDisableWarnings()) { - fortranOptions.features.DisableAllNonstandardWarnings(); - fortranOptions.features.DisableAllUsageWarnings(); - } + fortranOptions.features = frontendOptions.features; } std::unique_ptr diff --git a/flang/lib/Support/CMakeLists.txt b/flang/lib/Support/CMakeLists.txt index 363f57ce97dae..9ef31a2a6dcc7 100644 --- 
a/flang/lib/Support/CMakeLists.txt +++ b/flang/lib/Support/CMakeLists.txt @@ -44,6 +44,7 @@ endif() add_flang_library(FortranSupport default-kinds.cpp + enum-class.cpp Flags.cpp Fortran.cpp Fortran-features.cpp diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index bee8984102b82..55abf0385d185 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -9,6 +9,8 @@ #include "flang/Support/Fortran-features.h" #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/Support/raw_ostream.h" namespace Fortran::common { @@ -94,57 +96,123 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Ignore case and any inserted punctuation (like '-'/'_') -static std::optional GetWarningChar(char ch) { - if (ch >= 'a' && ch <= 'z') { - return ch; - } else if (ch >= 'A' && ch <= 'Z') { - return ch - 'A' + 'a'; - } else if (ch >= '0' && ch <= '9') { - return ch; - } else { - return std::nullopt; +// Split a string with camel case into the individual words. +// Note, the small vector is just an array of a few pointers and lengths +// into the original input string. So all this allocation should be pretty +// cheap. 
+llvm::SmallVector splitCamelCase(llvm::StringRef input) { + using namespace llvm; + if (input.empty()) { + return {}; } + SmallVector parts{}; + parts.reserve(input.size()); + auto check = [&input](size_t j, function_ref predicate) { + return j < input.size() && predicate(input[j]); + }; + size_t i{0}; + size_t startWord = i; + for (; i < input.size(); i++) { + if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || + ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { + parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); + startWord = i + 1; + } + } + parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); + return parts; } -static bool WarningNameMatch(const char *a, const char *b) { - while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); - } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); +// Split a string whith hyphens into the individual words. +llvm::SmallVector splitHyphenated(llvm::StringRef input) { + auto parts = llvm::SmallVector{}; + llvm::SplitString(input, parts, "-"); + return parts; +} + +// Check if two strings are equal while normalizing case for the +// right word which is assumed to be a single word in camel case. +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { + size_t ls = l.size(); + if (ls != r.size()) + return false; + size_t j{0}; + // Process the upper case characters. + for (; j < ls; j++) { + char rc = r[j]; + char rc2l = llvm::toLower(rc); + if (rc == rc2l) { + // Past run of Uppers Case; + break; } - if (!ach && !bch) { - return true; - } else if (!ach || !bch || *ach != *bch) { + if (l[j] != rc2l) + return false; + } + // Process the lower case characters. 
+ for (; j < ls; j++) { + if (l[j] != r[j]) { return false; } - ++a, ++b; } + return true; } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find) { + auto parts = splitHyphenated(input); + bool negated = false; + if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { + negated = true; + // Remove the "no" part + parts = llvm::SmallVector(parts.begin() + 1, parts.end()); + } + size_t chars = 0; + for (auto p : parts) { + chars += p.size(); + } + auto pred = [&](auto s) { + if (chars != s.size()) { + return false; + } + auto ccParts = splitCamelCase(s); + auto num_ccParts = ccParts.size(); + if (parts.size() != num_ccParts) { + return false; + } + for (size_t i{0}; i < num_ccParts; i++) { + if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { + return false; } } - } - return std::nullopt; + return true; + }; + auto cast = [negated](int x) { return std::pair{!negated, x}; }; + return fmap>(find(pred), cast); } -std::optional FindLanguageFeature(const char *name) { - return ScanEnum(name); +std::optional> parseCLILanguageFeature( + llvm::StringRef input) { + return parseCLIEnum(input, FindLanguageFeatureIndex); } -std::optional FindUsageWarning(const char *name) { - return ScanEnum(name); +std::optional> parseCLIUsageWarning( + llvm::StringRef input) { + return parseCLIEnum(input, FindUsageWarningIndex); +} + +// Take a string from the CLI and apply it to the LanguageFeatureControl. +// Return true if the option was applied recognized. 
+bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { + if (auto result = parseCLILanguageFeature(input)) { + EnableWarning(result->second, result->first); + return true; + } else if (auto result = parseCLIUsageWarning(input)) { + EnableWarning(result->second, result->first); + return true; + } + return false; } std::vector LanguageFeatureControl::GetNames( @@ -201,4 +269,32 @@ std::vector LanguageFeatureControl::GetNames( } } +template +void ForEachEnum(std::function f) { + for (size_t j{0}; j < N; ++j) { + f(static_cast(j)); + } +} + +void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { + warnAllLanguage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + // should be equivalent to: reset().flip() set ... + ForEachEnum( + [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + if (yes) { + // These three features do not need to be warned about, + // but we do want their feature flags. + warnLanguage_.set(LanguageFeature::OpenMP, false); + warnLanguage_.set(LanguageFeature::OpenACC, false); + warnLanguage_.set(LanguageFeature::CUDA, false); + } +} + +void LanguageFeatureControl::WarnOnAllUsage(bool yes) { + warnAllUsage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + ForEachEnum( + [&](UsageWarning w) { warnUsage_.set(w, yes); }); +} } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp new file mode 100644 index 0000000000000..ed11318382b35 --- /dev/null +++ b/flang/lib/Support/enum-class.cpp @@ -0,0 +1,24 @@ +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include +#include +namespace Fortran::common { + +std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; + } + } + return std::nullopt; +} + + +} // namespace Fortran::common \ No newline at end of file diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 new file mode 100644 index 0000000000000..8a58e63cfa3ac --- /dev/null +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -0,0 +1,19 @@ +! RUN: %flang -Wknown-bad-implicit-interface %s -c 2>&1 | FileCheck %s --check-prefix=WARN +! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty +! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 +! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 +! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface +! ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface + +program disable_diagnostic + REAL :: x + INTEGER :: y + ! CHECK-NOT: warning + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(x) + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(y) +end program disable_diagnostic + +subroutine sub() +end subroutine sub \ No newline at end of file diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 58adf6f745d5e..33f0aff8a1739 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -1,6 +1,7 @@ ! Ensure that only argument -Werror is supported. -! 
RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG -! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG +! RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG1 +! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 -! WRONG: Only `-Werror` is supported currently. +! WRONG1: error: Unknown diagnostic option: -Wall +! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file diff --git a/flang/test/Driver/wextra-ok.f90 b/flang/test/Driver/wextra-ok.f90 index 441029aa0af27..db15c7f14aa35 100644 --- a/flang/test/Driver/wextra-ok.f90 +++ b/flang/test/Driver/wextra-ok.f90 @@ -5,7 +5,7 @@ ! RUN: not %flang -std=f2018 -Wblah -Wextra %s -c 2>&1 | FileCheck %s --check-prefix=WRONG ! CHECK-OK: the warning option '-Wextra' is not supported -! WRONG: Only `-Werror` is supported currently. +! WRONG: Unknown diagnostic option: -Wblah program wextra_ok end program wextra_ok diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index bda02ed29a5ef..19cc5a20fecf4 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -1,3 +1,6 @@ add_flang_unittest(FlangCommonTests + EnumClassTests.cpp FastIntSetTest.cpp + FortranFeaturesTest.cpp ) +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp new file mode 100644 index 0000000000000..f67c453cfad15 --- /dev/null +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -0,0 +1,45 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Common/template.h" +#include "gtest/gtest.h" + +using namespace Fortran::common; +using namespace std; + +ENUM_CLASS(TestEnum, One, Two, + Three) +ENUM_CLASS_EXTRA(TestEnum) + +TEST(EnumClassTest, EnumToString) { + ASSERT_EQ(EnumToString(TestEnum::One), "One"); + ASSERT_EQ(EnumToString(TestEnum::Two), "Two"); + ASSERT_EQ(EnumToString(TestEnum::Three), "Three"); +} + +TEST(EnumClassTest, EnumToStringData) { + ASSERT_STREQ(EnumToString(TestEnum::One).data(), "One, Two, Three"); +} + +TEST(EnumClassTest, StringToEnum) { + ASSERT_EQ(StringToTestEnum("One"), std::optional{TestEnum::One}); + ASSERT_EQ(StringToTestEnum("Two"), std::optional{TestEnum::Two}); + ASSERT_EQ(StringToTestEnum("Three"), std::optional{TestEnum::Three}); + ASSERT_EQ(StringToTestEnum("Four"), std::nullopt); + ASSERT_EQ(StringToTestEnum(""), std::nullopt); + ASSERT_EQ(StringToTestEnum("One, Two, Three"), std::nullopt); +} + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, FindNameNormal) { + auto p1 = [](auto s) { return s == "TwentyOne"; }; + ASSERT_EQ(FindTestEnumExtra(p1), std::optional{TestEnumExtra::TwentyOne}); +} diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp new file mode 100644 index 0000000000000..7ec7054f14f6e --- /dev/null +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -0,0 +1,142 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Support/Fortran-features.h" +#include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/ErrorHandling.h" +#include "gtest/gtest.h" + +namespace Fortran::common { + +// Not currently exported from Fortran-features.h +llvm::SmallVector splitCamelCase(llvm::StringRef input); +llvm::SmallVector splitHyphenated(llvm::StringRef input); +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, SplitCamelCase) { + + auto parts = splitCamelCase("oP"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("o", 1))) { + ADD_FAILURE() << "First part is not OP"; + } + if (parts[1].compare(llvm::StringRef("P", 1))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("OPName"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("OP", 2))) { + ADD_FAILURE() << "First part is not OP"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("OpName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("Op", 2))) { + ADD_FAILURE() << "First part is not Op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("opName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("op", 2))) { + ADD_FAILURE() << "First part is not op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("FlangTestProgram123"); + ASSERT_EQ(parts.size(), (size_t)3); + if 
(parts[0].compare(llvm::StringRef("Flang", 5))) { + ADD_FAILURE() << "First part is not Flang"; + } + if (parts[1].compare(llvm::StringRef("Test", 4))) { + ADD_FAILURE() << "Second part is not Test"; + } + if (parts[2].compare(llvm::StringRef("Program123", 10))) { + ADD_FAILURE() << "Third part is not Program123"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, SplitHyphenated) { + auto parts = splitHyphenated("no-twenty-one"); + ASSERT_EQ(parts.size(), (size_t)3); + if (parts[0].compare(llvm::StringRef("no", 2))) { + ADD_FAILURE() << "First part is not twenty"; + } + if (parts[1].compare(llvm::StringRef("twenty", 6))) { + ADD_FAILURE() << "Second part is not one"; + } + if (parts[2].compare(llvm::StringRef("one", 3))) { + ADD_FAILURE() << "Third part is not one"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); + + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); +} + +std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); +} + 
+TEST(EnumClassTest, parseCLIEnumOption) { + auto result = parseCLITestEnumExtraOption("no-twenty-one"); + auto expected = std::pair(false, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("twenty-one"); + expected = std::pair(true, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-forty-two"); + expected = std::pair(false, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("forty-two"); + expected = std::pair(true, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-seven-seven-seven"); + expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("seven-seven-seven"); + expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); +} + +} // namespace Fortran::common >From 49a0579f9477936b72f0580823b4dd6824697512 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:56:14 -0700 Subject: [PATCH 2/4] adjust headers --- flang/include/flang/Support/Fortran-features.h | 4 +--- flang/lib/Frontend/CompilerInvocation.cpp | 5 ----- flang/lib/Support/Fortran-features.cpp | 1 - 3 files changed, 1 insertion(+), 9 deletions(-) diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index d5aa7357ffea0..4a8b0da4c0d4d 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -11,9 +11,7 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" -#include "flang/Common/idioms.h" -#include "llvm/Support/Error.h" -#include "llvm/Support/raw_ostream.h" +#include "llvm/ADT/StringRef.h" #include #include diff --git a/flang/lib/Frontend/CompilerInvocation.cpp 
b/flang/lib/Frontend/CompilerInvocation.cpp index 9ea568549bd6c..d8bf601d0171d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -20,11 +20,9 @@ #include "flang/Support/Version.h" #include "flang/Tools/TargetSetup.h" #include "flang/Version.inc" -#include "clang/Basic/AllDiagnostics.h" #include "clang/Basic/DiagnosticDriver.h" #include "clang/Basic/DiagnosticOptions.h" #include "clang/Driver/Driver.h" -#include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" #include "llvm/ADT/StringRef.h" @@ -34,9 +32,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" -#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" -#include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" #include "llvm/Support/Process.h" #include "llvm/Support/raw_ostream.h" @@ -46,7 +42,6 @@ #include #include #include -#include using namespace Fortran::frontend; diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 55abf0385d185..0e394162ef577 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -10,7 +10,6 @@ #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" -#include "llvm/Support/raw_ostream.h" namespace Fortran::common { >From fa2db7090c6d374ce1a835ad26d19a1d7bd42262 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:57:22 -0700 Subject: [PATCH 3/4] reformat --- flang/lib/Support/enum-class.cpp | 20 ++++++++++--------- flang/unittests/Common/EnumClassTests.cpp | 5 ++--- .../unittests/Common/FortranFeaturesTest.cpp | 18 ++++++++++------- 3 files changed, 24 insertions(+), 19 deletions(-) diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ed11318382b35..ac57f27ef1c9e 100644 --- 
a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -1,4 +1,5 @@ -//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ +//-*-===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -7,18 +8,19 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" -#include #include +#include namespace Fortran::common { -std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { - if (pred(names[i])) { - return i; - } +std::optional FindEnumIndex( + std::function pred, int size, + const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; } - return std::nullopt; + } + return std::nullopt; } - } // namespace Fortran::common \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp index f67c453cfad15..c9224a8ceba54 100644 --- a/flang/unittests/Common/EnumClassTests.cpp +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -6,15 +6,14 @@ // //===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Common/template.h" -#include "gtest/gtest.h" using namespace Fortran::common; using namespace std; -ENUM_CLASS(TestEnum, One, Two, - Three) +ENUM_CLASS(TestEnum, One, Two, Three) ENUM_CLASS_EXTRA(TestEnum) TEST(EnumClassTest, EnumToString) { diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 7ec7054f14f6e..597928e7fe56e 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -6,12 +6,12 @@ // 
//===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Support/Fortran-features.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" -#include "gtest/gtest.h" namespace Fortran::common { @@ -34,7 +34,7 @@ TEST(EnumClassTest, SplitCamelCase) { if (parts[1].compare(llvm::StringRef("P", 1))) { ADD_FAILURE() << "Second part is not Name"; } - + parts = splitCamelCase("OPName"); ASSERT_EQ(parts.size(), (size_t)2); @@ -114,13 +114,15 @@ TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); } -std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); +std::optional> parseCLITestEnumExtraOption( + llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); } TEST(EnumClassTest, parseCLIEnumOption) { auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = std::pair(false, TestEnumExtra::TwentyOne); + auto expected = + std::pair(false, TestEnumExtra::TwentyOne); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("twenty-one"); expected = std::pair(true, TestEnumExtra::TwentyOne); @@ -132,10 +134,12 @@ TEST(EnumClassTest, parseCLIEnumOption) { expected = std::pair(true, TestEnumExtra::FortyTwo); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(false, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(true, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); } >From 
5f3feb64c1a97500e2808114d44bb07aa4ccb00c Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 15:58:43 -0700 Subject: [PATCH 4/4] addressing feedback --- flang/include/flang/Common/enum-class.h | 53 +++--- flang/include/flang/Common/optional.h | 7 + .../include/flang/Support/Fortran-features.h | 16 -- flang/lib/Support/Fortran-features.cpp | 175 ++++++++---------- flang/lib/Support/enum-class.cpp | 15 +- flang/test/Driver/disable-diagnostic.f90 | 3 +- flang/test/Driver/werror-wrong.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 2 +- .../unittests/Common/FortranFeaturesTest.cpp | 159 +++------------- 9 files changed, 153 insertions(+), 279 deletions(-) diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index baf9fe418141d..3dbd11bb4057c 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -17,9 +17,9 @@ #ifndef FORTRAN_COMMON_ENUM_CLASS_H_ #define FORTRAN_COMMON_ENUM_CLASS_H_ +#include "optional.h" #include #include -#include #include namespace Fortran::common { @@ -59,26 +59,6 @@ constexpr std::array EnumNames(const char *p) { return result; } -template -std::optional inline fmap(std::optional x, std::function f) { - return x ? std::optional{f(*x)} : std::nullopt; -} - -using Predicate = std::function; -// Finds the first index for which the predicate returns true. -std::optional FindEnumIndex( - Predicate pred, int size, const std::string_view *names); - -using FindEnumIndexType = std::optional( - Predicate, int, const std::string_view *); - -template -std::optional inline FindEnum( - Predicate pred, std::function(Predicate)> find) { - std::function f = [](int x) { return static_cast(x); }; - return fmap(find(pred), f); -} - #define ENUM_CLASS(NAME, ...) 
\ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ @@ -90,17 +70,34 @@ std::optional inline FindEnum( return NAME##_names[static_cast(e)]; \ } +namespace EnumClass { + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +optional FindIndex( + Predicate pred, std::size_t size, const std::string_view *names); + +using FindIndexType = std::function(Predicate)>; + +template +optional inline Find(Predicate pred, FindIndexType findIndex) { + return MapOption( + findIndex(pred), [](int x) { return static_cast(x); }); +} + +} // namespace EnumClass + #define ENUM_CLASS_EXTRA(NAME) \ - [[maybe_unused]] inline std::optional Find##NAME##Index( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnumIndex( \ + [[maybe_unused]] inline optional Find##NAME##Index( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::FindIndex( \ p, NAME##_enumSize, NAME##_names.data()); \ } \ - [[maybe_unused]] inline std::optional Find##NAME( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + [[maybe_unused]] inline optional Find##NAME( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::Find(p, Find##NAME##Index); \ } \ - [[maybe_unused]] inline std::optional StringTo##NAME( \ + [[maybe_unused]] inline optional StringTo##NAME( \ const std::string_view name) { \ return Find##NAME( \ [name](const std::string_view s) -> bool { return name == s; }); \ diff --git a/flang/include/flang/Common/optional.h b/flang/include/flang/Common/optional.h index c7c81f40cc8c8..5b623f01e828d 100644 --- a/flang/include/flang/Common/optional.h +++ b/flang/include/flang/Common/optional.h @@ -27,6 +27,7 @@ #define FORTRAN_COMMON_OPTIONAL_H #include "api-attrs.h" +#include #include #include @@ -238,6 +239,12 @@ using std::nullopt_t; using std::optional; #endif // 
!STD_OPTIONAL_UNSUPPORTED +template +std::optional inline MapOption( + std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + } // namespace Fortran::common #endif // FORTRAN_COMMON_OPTIONAL_H diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 4a8b0da4c0d4d..fd6a9139b7ea7 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -133,21 +133,5 @@ class LanguageFeatureControl { bool disableAllWarnings_{false}; }; -// Parse a CLI enum option return the enum index and whether it should be -// enabled (true) or disabled (false). Just exposed for the template below. -std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find); - -template -std::optional> parseCLIEnum( - llvm::StringRef input, std::function(Predicate)> find) { - using To = std::pair; - using From = std::pair; - static std::function cast = [](From x) { - return std::pair{x.first, static_cast(x.second)}; - }; - return fmap(parseCLIEnumIndex(input, find), cast); -} - } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 0e394162ef577..72ea6639adf51 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -11,6 +11,10 @@ #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" +// Debugging +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/raw_ostream.h" + namespace Fortran::common { LanguageFeatureControl::LanguageFeatureControl() { @@ -95,119 +99,99 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Split a string with camel case into the individual words. -// Note, the small vector is just an array of a few pointers and lengths -// into the original input string. 
So all this allocation should be pretty -// cheap. -llvm::SmallVector splitCamelCase(llvm::StringRef input) { - using namespace llvm; - if (input.empty()) { - return {}; +// Namespace for helper functions for parsing CLI options +// used instead of static so that there can be unit tests for these +// functions. +namespace FortranFeaturesHelpers { +// Check if Lower Case Hyphenated words are equal to Camel Case words. +// Because of our use case we know that 'r' the camel case string is +// well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. +// This is checked in the enum-class.h file. +bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { + size_t ls{l.size()}, rs{r.size()}; + if (ls < rs) { + return false; } - SmallVector parts{}; - parts.reserve(input.size()); - auto check = [&input](size_t j, function_ref predicate) { - return j < input.size() && predicate(input[j]); - }; - size_t i{0}; - size_t startWord = i; - for (; i < input.size(); i++) { - if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || - ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { - parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); - startWord = i + 1; + bool atStartOfWord{true}; + size_t wordCount{0}, j{0}; // j is the number of word characters checked in r. + for (; j < rs; j++) { + if (wordCount + j >= ls) { + // `l` was shorter once the hyphens were removed. + // If r is null terminated, then we are good. + return r[j] == '\0'; } - } - parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); - return parts; -} - -// Split a string whith hyphens into the individual words. -llvm::SmallVector splitHyphenated(llvm::StringRef input) { - auto parts = llvm::SmallVector{}; - llvm::SplitString(input, parts, "-"); - return parts; -} - -// Check if two strings are equal while normalizing case for the -// right word which is assumed to be a single word in camel case. 
-bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { - size_t ls = l.size(); - if (ls != r.size()) - return false; - size_t j{0}; - // Process the upper case characters. - for (; j < ls; j++) { - char rc = r[j]; - char rc2l = llvm::toLower(rc); - if (rc == rc2l) { - // Past run of Uppers Case; - break; + if (atStartOfWord) { + if (llvm::isUpper(r[j])) { + // Upper Case Run + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else { + atStartOfWord = false; + if (l[wordCount + j] != r[j]) { + return false; + } + } + } else { + if (llvm::isUpper(r[j])) { + atStartOfWord = true; + if (l[wordCount + j] != '-') { + return false; + } + ++wordCount; + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else if (l[wordCount + j] != r[j]) { + return false; + } } - if (l[j] != rc2l) - return false; } - // Process the lower case characters. - for (; j < ls; j++) { - if (l[j] != r[j]) { - return false; - } + // If there are more characters in l after processing all the characters in r. + // then fail unless the string is null terminated. + if (ls > wordCount + j) { + return l[wordCount + j] == '\0'; } return true; } // Parse a CLI enum option return the enum index and whether it should be // enabled (true) or disabled (false). 
-std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find) { - auto parts = splitHyphenated(input); - bool negated = false; - if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { +template +optional> ParseCLIEnum( + llvm::StringRef input, EnumClass::FindIndexType findIndex) { + bool negated{false}; + if (input.starts_with("no-")) { negated = true; - // Remove the "no" part - parts = llvm::SmallVector(parts.begin() + 1, parts.end()); - } - size_t chars = 0; - for (auto p : parts) { - chars += p.size(); + input = input.drop_front(3); } - auto pred = [&](auto s) { - if (chars != s.size()) { - return false; - } - auto ccParts = splitCamelCase(s); - auto num_ccParts = ccParts.size(); - if (parts.size() != num_ccParts) { - return false; - } - for (size_t i{0}; i < num_ccParts; i++) { - if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { - return false; - } - } - return true; - }; - auto cast = [negated](int x) { return std::pair{!negated, x}; }; - return fmap>(find(pred), cast); + EnumClass::Predicate predicate{ + [input](llvm::StringRef r) { return LowerHyphEqualCamelCase(input, r); }}; + optional x = EnumClass::Find(predicate, findIndex); + return MapOption>( + x, [negated](T x) { return std::pair{!negated, x}; }); } -std::optional> parseCLILanguageFeature( +optional> parseCLIUsageWarning( llvm::StringRef input) { - return parseCLIEnum(input, FindLanguageFeatureIndex); + return ParseCLIEnum(input, FindUsageWarningIndex); } -std::optional> parseCLIUsageWarning( +optional> parseCLILanguageFeature( llvm::StringRef input) { - return parseCLIEnum(input, FindUsageWarningIndex); + return ParseCLIEnum(input, FindLanguageFeatureIndex); } +} // namespace FortranFeaturesHelpers + // Take a string from the CLI and apply it to the LanguageFeatureControl. // Return true if the option was applied recognized. 
bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { - if (auto result = parseCLILanguageFeature(input)) { + if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(input)) { EnableWarning(result->second, result->first); return true; - } else if (auto result = parseCLIUsageWarning(input)) { + } else if (auto result = + FortranFeaturesHelpers::parseCLIUsageWarning(input)) { EnableWarning(result->second, result->first); return true; } @@ -277,11 +261,10 @@ void ForEachEnum(std::function f) { void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { warnAllLanguage_ = yes; - disableAllWarnings_ = yes ? false : disableAllWarnings_; - // should be equivalent to: reset().flip() set ... - ForEachEnum( - [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + warnLanguage_.reset(); if (yes) { + disableAllWarnings_ = false; + warnLanguage_.flip(); // These three features do not need to be warned about, // but we do want their feature flags. warnLanguage_.set(LanguageFeature::OpenMP, false); @@ -292,8 +275,10 @@ void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { void LanguageFeatureControl::WarnOnAllUsage(bool yes) { warnAllUsage_ = yes; - disableAllWarnings_ = yes ? 
false : disableAllWarnings_; - ForEachEnum( - [&](UsageWarning w) { warnUsage_.set(w, yes); }); + warnUsage_.reset(); + if (yes) { + disableAllWarnings_ = false; + warnUsage_.flip(); + } } } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ac57f27ef1c9e..d6d0ee758175b 100644 --- a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -8,19 +8,20 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" +#include "flang/Common/optional.h" #include -#include -namespace Fortran::common { -std::optional FindEnumIndex( - std::function pred, int size, +namespace Fortran::common::EnumClass { + +optional FindIndex( + std::function pred, size_t size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { + for (size_t i = 0; i < size; ++i) { if (pred(names[i])) { return i; } } - return std::nullopt; + return nullopt; } -} // namespace Fortran::common \ No newline at end of file +} // namespace Fortran::common::EnumClass diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 index 8a58e63cfa3ac..849489377da12 100644 --- a/flang/test/Driver/disable-diagnostic.f90 +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -2,6 +2,7 @@ ! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty ! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 ! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 + ! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface ! 
ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface @@ -16,4 +17,4 @@ program disable_diagnostic end program disable_diagnostic subroutine sub() -end subroutine sub \ No newline at end of file +end subroutine sub diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 33f0aff8a1739..6e3c7cca15bc7 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -4,4 +4,4 @@ ! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 ! WRONG1: error: Unknown diagnostic option: -Wall -! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file +! WRONG2: error: Unknown diagnostic option: -WX diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index 19cc5a20fecf4..3149cb9f7bc47 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -3,4 +3,4 @@ add_flang_unittest(FlangCommonTests FastIntSetTest.cpp FortranFeaturesTest.cpp ) -target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 597928e7fe56e..e12aff9f7b735 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -12,135 +12,34 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" - -namespace Fortran::common { - -// Not currently exported from Fortran-features.h -llvm::SmallVector splitCamelCase(llvm::StringRef input); -llvm::SmallVector splitHyphenated(llvm::StringRef input); -bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); - -ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) -ENUM_CLASS_EXTRA(TestEnumExtra) - -TEST(EnumClassTest, SplitCamelCase) { - 
- auto parts = splitCamelCase("oP"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("o", 1))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("P", 1))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OPName"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("OP", 2))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OpName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("Op", 2))) { - ADD_FAILURE() << "First part is not Op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("opName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("op", 2))) { - ADD_FAILURE() << "First part is not op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("FlangTestProgram123"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("Flang", 5))) { - ADD_FAILURE() << "First part is not Flang"; - } - if (parts[1].compare(llvm::StringRef("Test", 4))) { - ADD_FAILURE() << "Second part is not Test"; - } - if (parts[2].compare(llvm::StringRef("Program123", 10))) { - ADD_FAILURE() << "Third part is not Program123"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, SplitHyphenated) { - auto parts = splitHyphenated("no-twenty-one"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("no", 2))) { - ADD_FAILURE() << "First part is not twenty"; - } - if (parts[1].compare(llvm::StringRef("twenty", 6))) { - ADD_FAILURE() << "Second part is not one"; - } - if (parts[2].compare(llvm::StringRef("one", 3))) { 
- ADD_FAILURE() << "Third part is not one"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); - - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); -} - -std::optional> parseCLITestEnumExtraOption( - llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); -} - -TEST(EnumClassTest, parseCLIEnumOption) { - auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = - std::pair(false, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("twenty-one"); - expected = std::pair(true, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-forty-two"); - expected = std::pair(false, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("forty-two"); - expected = std::pair(true, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = - std::pair(false, 
TestEnumExtra::SevenSevenSeven);
- ASSERT_EQ(result, std::optional{expected});
- result = parseCLITestEnumExtraOption("seven-seven-seven");
- expected =
- std::pair(true, TestEnumExtra::SevenSevenSeven);
- ASSERT_EQ(result, std::optional{expected});
+#include
+
+namespace Fortran::common::FortranFeaturesHelpers {
+
+optional> parseCLIUsageWarning(
+ llvm::StringRef input);
+TEST(EnumClassTest, ParseCLIUsageWarning) {
+ EXPECT_EQ((parseCLIUsageWarning("no-twenty-one")), std::nullopt);
+ EXPECT_EQ((parseCLIUsageWarning("twenty-one")), std::nullopt);
+ EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven")), std::nullopt);
+ EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven")), std::nullopt);
+ EXPECT_EQ((parseCLIUsageWarning("no")), std::nullopt);
+ EXPECT_EQ((parseCLIUsageWarning("")), std::nullopt);
+ EXPECT_EQ((parseCLIUsageWarning("no-")), std::nullopt);
+ EXPECT_EQ(parseCLIUsageWarning("Portability"), std::nullopt);
+ auto expect{std::pair{false, UsageWarning::Portability}};
+ ASSERT_EQ(parseCLIUsageWarning("no-portability"), expect);
+ expect.first = true;
+ ASSERT_EQ((parseCLIUsageWarning("portability")), expect);
+ expect =
+ std::pair{false, Fortran::common::UsageWarning::PointerToUndefinable};
+ ASSERT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable")), expect);
+ expect.first = true;
+ ASSERT_EQ((parseCLIUsageWarning("pointer-to-undefinable")), expect);
+ EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable"), std::nullopt);
+ EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable"), std::nullopt);
+ EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable"), std::nullopt);
+ EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable"), std::nullopt);
 }
-} // namespace Fortran::common
+} // namespace Fortran::common::FortranFeaturesHelpers

From flang-commits at lists.llvm.org Fri May 30 15:42:44 2025
From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits)
Date: Fri, 30 May 2025 15:42:44 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <683a3464.170a0220.3c7461.91b4@mx.google.com>

https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/142022

>From 8f3fd2daab46f477e87043c66b3049dff4a5b20e Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Thu, 29 May 2025 12:11:04 -0700
Subject: [PATCH 1/5] initial commit

---
 flang/include/flang/Common/enum-class.h      |  47 ++++-
 .../include/flang/Support/Fortran-features.h |  51 ++++--
 flang/lib/Frontend/CompilerInvocation.cpp    |  62 ++++---
 flang/lib/Support/CMakeLists.txt             |   1 +
 flang/lib/Support/Fortran-features.cpp       | 168 ++++++++++++++----
 flang/lib/Support/enum-class.cpp             |  24 +++
 flang/test/Driver/disable-diagnostic.f90     |  19 ++
 flang/test/Driver/werror-wrong.f90           |   7 +-
 flang/test/Driver/wextra-ok.f90              |   2 +-
 flang/unittests/Common/CMakeLists.txt        |   3 +
 flang/unittests/Common/EnumClassTests.cpp    |  45 +++++
 .../unittests/Common/FortranFeaturesTest.cpp | 142 +++++++++++++++
 12 files changed, 483 insertions(+), 88 deletions(-)
 create mode 100644 flang/lib/Support/enum-class.cpp
 create mode 100644 flang/test/Driver/disable-diagnostic.f90
 create mode 100644 flang/unittests/Common/EnumClassTests.cpp
 create mode 100644 flang/unittests/Common/FortranFeaturesTest.cpp

diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h
index 41575d45091a8..baf9fe418141d 100644
--- a/flang/include/flang/Common/enum-class.h
+++ b/flang/include/flang/Common/enum-class.h
@@ -18,8 +18,9 @@
 #define FORTRAN_COMMON_ENUM_CLASS_H_
 #include
-#include
-
+#include
+#include
+#include
 namespace Fortran::common {
 constexpr std::size_t CountEnumNames(const char *p) {
@@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) {
 return result;
 }
+template
+std::optional inline fmap(std::optional x, std::function f) {
+ return x ?
std::optional{f(*x)} : std::nullopt; +} + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +std::optional FindEnumIndex( + Predicate pred, int size, const std::string_view *names); + +using FindEnumIndexType = std::optional( + Predicate, int, const std::string_view *); + +template +std::optional inline FindEnum( + Predicate pred, std::function(Predicate)> find) { + std::function f = [](int x) { return static_cast(x); }; + return fmap(find(pred), f); +} + #define ENUM_CLASS(NAME, ...) \ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ ::Fortran::common::CountEnumNames(#__VA_ARGS__)}; \ + [[maybe_unused]] static constexpr std::array NAME##_names{ \ + ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ [[maybe_unused]] static inline std::string_view EnumToString(NAME e) { \ - static const constexpr auto names{ \ - ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ - return names[static_cast(e)]; \ + return NAME##_names[static_cast(e)]; \ } +#define ENUM_CLASS_EXTRA(NAME) \ + [[maybe_unused]] inline std::optional Find##NAME##Index( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnumIndex( \ + p, NAME##_enumSize, NAME##_names.data()); \ + } \ + [[maybe_unused]] inline std::optional Find##NAME( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + } \ + [[maybe_unused]] inline std::optional StringTo##NAME( \ + const std::string_view name) { \ + return Find##NAME( \ + [name](const std::string_view s) -> bool { return name == s; }); \ + } } // namespace Fortran::common #endif // FORTRAN_COMMON_ENUM_CLASS_H_ diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index e696da9042480..d5aa7357ffea0 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -12,6 +12,8 @@ #include "Fortran.h" #include 
"flang/Common/enum-set.h" #include "flang/Common/idioms.h" +#include "llvm/Support/Error.h" +#include "llvm/Support/raw_ostream.h" #include #include @@ -79,12 +81,13 @@ ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, NullActualForDefaultIntentAllocatable, UseAssociationIntoSameNameSubprogram, HostAssociatedIntentOutInSpecExpr, NonVolatilePointerToVolatile) +// Generate default String -> Enum mapping. +ENUM_CLASS_EXTRA(LanguageFeature) +ENUM_CLASS_EXTRA(UsageWarning) + using LanguageFeatures = EnumSet; using UsageWarnings = EnumSet; -std::optional FindLanguageFeature(const char *); -std::optional FindUsageWarning(const char *); - class LanguageFeatureControl { public: LanguageFeatureControl(); @@ -97,8 +100,10 @@ class LanguageFeatureControl { void EnableWarning(UsageWarning w, bool yes = true) { warnUsage_.set(w, yes); } - void WarnOnAllNonstandard(bool yes = true) { warnAllLanguage_ = yes; } - void WarnOnAllUsage(bool yes = true) { warnAllUsage_ = yes; } + void WarnOnAllNonstandard(bool yes = true); + bool IsWarnOnAllNonstandard() const { return warnAllLanguage_; } + void WarnOnAllUsage(bool yes = true); + bool IsWarnOnAllUsage() const { return warnAllUsage_; } void DisableAllNonstandardWarnings() { warnAllLanguage_ = false; warnLanguage_.clear(); @@ -107,16 +112,16 @@ class LanguageFeatureControl { warnAllUsage_ = false; warnUsage_.clear(); } - - bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } - bool ShouldWarn(LanguageFeature f) const { - return (warnAllLanguage_ && f != LanguageFeature::OpenMP && - f != LanguageFeature::OpenACC && f != LanguageFeature::CUDA) || - warnLanguage_.test(f); - } - bool ShouldWarn(UsageWarning w) const { - return warnAllUsage_ || warnUsage_.test(w); + void DisableAllWarnings() { + disableAllWarnings_ = true; + DisableAllNonstandardWarnings(); + DisableAllUsageWarnings(); } + bool applyCLIOption(llvm::StringRef input); + bool AreWarningsDisabled() const { return disableAllWarnings_; } + bool 
IsEnabled(LanguageFeature f) const { return !disable_.test(f); } + bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); } + bool ShouldWarn(UsageWarning w) const { return warnUsage_.test(w); } // Return all spellings of operators names, depending on features enabled std::vector GetNames(LogicalOperator) const; std::vector GetNames(RelationalOperator) const; @@ -127,6 +132,24 @@ class LanguageFeatureControl { bool warnAllLanguage_{false}; UsageWarnings warnUsage_; bool warnAllUsage_{false}; + bool disableAllWarnings_{false}; }; + +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). Just exposed for the template below. +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find); + +template +std::optional> parseCLIEnum( + llvm::StringRef input, std::function(Predicate)> find) { + using To = std::pair; + using From = std::pair; + static std::function cast = [](From x) { + return std::pair{x.first, static_cast(x.second)}; + }; + return fmap(parseCLIEnumIndex(input, find), cast); +} + } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..9ea568549bd6c 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -34,6 +34,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" +#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" @@ -45,6 +46,7 @@ #include #include #include +#include using namespace Fortran::frontend; @@ -971,10 +973,23 @@ static bool parseSemaArgs(CompilerInvocation &res, llvm::opt::ArgList &args, /// Parses all diagnostics related arguments and populates the variables /// options accordingly. 
Returns false if new errors are generated. +/// FC1 driver entry point for parsing diagnostic arguments. static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, clang::DiagnosticsEngine &diags) { unsigned numErrorsBefore = diags.getNumErrors(); + auto &features = res.getFrontendOpts().features; + // The order of these flags (-pedantic -W -w) is important and is + // chosen to match clang's behavior. + + // -pedantic + if (args.hasArg(clang::driver::options::OPT_pedantic)) { + features.WarnOnAllNonstandard(); + features.WarnOnAllUsage(); + res.setEnableConformanceChecks(); + res.setEnableUsageChecks(); + } + // -Werror option // TODO: Currently throws a Diagnostic for anything other than -W, // this has to change when other -W's are supported. @@ -984,21 +999,27 @@ static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, for (const auto &wArg : wArgs) { if (wArg == "error") { res.setWarnAsErr(true); - } else { - const unsigned diagID = - diags.getCustomDiagID(clang::DiagnosticsEngine::Error, - "Only `-Werror` is supported currently."); - diags.Report(diagID); + // -W(no-) + } else if (!features.applyCLIOption(wArg)) { + const unsigned diagID = diags.getCustomDiagID( + clang::DiagnosticsEngine::Error, "Unknown diagnostic option: -W%0"); + diags.Report(diagID) << wArg; } } } + // -w + if (args.hasArg(clang::driver::options::OPT_w)) { + features.DisableAllWarnings(); + res.setDisableWarnings(); + } + // Default to off for `flang -fc1`. - res.getFrontendOpts().showColors = - parseShowColorsArgs(args, /*defaultDiagColor=*/false); + bool showColors = parseShowColorsArgs(args, false); - // Honor color diagnostics. 
- res.getDiagnosticOpts().ShowColors = res.getFrontendOpts().showColors; + diags.getDiagnosticOptions().ShowColors = showColors; + res.getDiagnosticOpts().ShowColors = showColors; + res.getFrontendOpts().showColors = showColors; return diags.getNumErrors() == numErrorsBefore; } @@ -1074,16 +1095,6 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, Fortran::common::LanguageFeature::OpenACC); } - // -pedantic - if (args.hasArg(clang::driver::options::OPT_pedantic)) { - res.setEnableConformanceChecks(); - res.setEnableUsageChecks(); - } - - // -w - if (args.hasArg(clang::driver::options::OPT_w)) - res.setDisableWarnings(); - // -std=f2018 // TODO: Set proper options when more fortran standards // are supported. @@ -1092,6 +1103,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, // We only allow f2018 as the given standard if (standard == "f2018") { res.setEnableConformanceChecks(); + res.getFrontendOpts().features.WarnOnAllNonstandard(); } else { const unsigned diagID = diags.getCustomDiagID(clang::DiagnosticsEngine::Error, @@ -1099,6 +1111,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, diags.Report(diagID); } } + return diags.getNumErrors() == numErrorsBefore; } @@ -1694,16 +1707,7 @@ void CompilerInvocation::setFortranOpts() { if (frontendOptions.needProvenanceRangeToCharBlockMappings) fortranOptions.needProvenanceRangeToCharBlockMappings = true; - if (getEnableConformanceChecks()) - fortranOptions.features.WarnOnAllNonstandard(); - - if (getEnableUsageChecks()) - fortranOptions.features.WarnOnAllUsage(); - - if (getDisableWarnings()) { - fortranOptions.features.DisableAllNonstandardWarnings(); - fortranOptions.features.DisableAllUsageWarnings(); - } + fortranOptions.features = frontendOptions.features; } std::unique_ptr diff --git a/flang/lib/Support/CMakeLists.txt b/flang/lib/Support/CMakeLists.txt index 363f57ce97dae..9ef31a2a6dcc7 100644 --- 
a/flang/lib/Support/CMakeLists.txt +++ b/flang/lib/Support/CMakeLists.txt @@ -44,6 +44,7 @@ endif() add_flang_library(FortranSupport default-kinds.cpp + enum-class.cpp Flags.cpp Fortran.cpp Fortran-features.cpp diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index bee8984102b82..55abf0385d185 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -9,6 +9,8 @@ #include "flang/Support/Fortran-features.h" #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/Support/raw_ostream.h" namespace Fortran::common { @@ -94,57 +96,123 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Ignore case and any inserted punctuation (like '-'/'_') -static std::optional GetWarningChar(char ch) { - if (ch >= 'a' && ch <= 'z') { - return ch; - } else if (ch >= 'A' && ch <= 'Z') { - return ch - 'A' + 'a'; - } else if (ch >= '0' && ch <= '9') { - return ch; - } else { - return std::nullopt; +// Split a string with camel case into the individual words. +// Note, the small vector is just an array of a few pointers and lengths +// into the original input string. So all this allocation should be pretty +// cheap. 
+llvm::SmallVector splitCamelCase(llvm::StringRef input) { + using namespace llvm; + if (input.empty()) { + return {}; } + SmallVector parts{}; + parts.reserve(input.size()); + auto check = [&input](size_t j, function_ref predicate) { + return j < input.size() && predicate(input[j]); + }; + size_t i{0}; + size_t startWord = i; + for (; i < input.size(); i++) { + if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || + ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { + parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); + startWord = i + 1; + } + } + parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); + return parts; } -static bool WarningNameMatch(const char *a, const char *b) { - while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); - } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); +// Split a string whith hyphens into the individual words. +llvm::SmallVector splitHyphenated(llvm::StringRef input) { + auto parts = llvm::SmallVector{}; + llvm::SplitString(input, parts, "-"); + return parts; +} + +// Check if two strings are equal while normalizing case for the +// right word which is assumed to be a single word in camel case. +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { + size_t ls = l.size(); + if (ls != r.size()) + return false; + size_t j{0}; + // Process the upper case characters. + for (; j < ls; j++) { + char rc = r[j]; + char rc2l = llvm::toLower(rc); + if (rc == rc2l) { + // Past run of Uppers Case; + break; } - if (!ach && !bch) { - return true; - } else if (!ach || !bch || *ach != *bch) { + if (l[j] != rc2l) + return false; + } + // Process the lower case characters. 
+ for (; j < ls; j++) { + if (l[j] != r[j]) { return false; } - ++a, ++b; } + return true; } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find) { + auto parts = splitHyphenated(input); + bool negated = false; + if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { + negated = true; + // Remove the "no" part + parts = llvm::SmallVector(parts.begin() + 1, parts.end()); + } + size_t chars = 0; + for (auto p : parts) { + chars += p.size(); + } + auto pred = [&](auto s) { + if (chars != s.size()) { + return false; + } + auto ccParts = splitCamelCase(s); + auto num_ccParts = ccParts.size(); + if (parts.size() != num_ccParts) { + return false; + } + for (size_t i{0}; i < num_ccParts; i++) { + if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { + return false; } } - } - return std::nullopt; + return true; + }; + auto cast = [negated](int x) { return std::pair{!negated, x}; }; + return fmap>(find(pred), cast); } -std::optional FindLanguageFeature(const char *name) { - return ScanEnum(name); +std::optional> parseCLILanguageFeature( + llvm::StringRef input) { + return parseCLIEnum(input, FindLanguageFeatureIndex); } -std::optional FindUsageWarning(const char *name) { - return ScanEnum(name); +std::optional> parseCLIUsageWarning( + llvm::StringRef input) { + return parseCLIEnum(input, FindUsageWarningIndex); +} + +// Take a string from the CLI and apply it to the LanguageFeatureControl. +// Return true if the option was applied recognized. 
+bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { + if (auto result = parseCLILanguageFeature(input)) { + EnableWarning(result->second, result->first); + return true; + } else if (auto result = parseCLIUsageWarning(input)) { + EnableWarning(result->second, result->first); + return true; + } + return false; } std::vector LanguageFeatureControl::GetNames( @@ -201,4 +269,32 @@ std::vector LanguageFeatureControl::GetNames( } } +template +void ForEachEnum(std::function f) { + for (size_t j{0}; j < N; ++j) { + f(static_cast(j)); + } +} + +void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { + warnAllLanguage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + // should be equivalent to: reset().flip() set ... + ForEachEnum( + [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + if (yes) { + // These three features do not need to be warned about, + // but we do want their feature flags. + warnLanguage_.set(LanguageFeature::OpenMP, false); + warnLanguage_.set(LanguageFeature::OpenACC, false); + warnLanguage_.set(LanguageFeature::CUDA, false); + } +} + +void LanguageFeatureControl::WarnOnAllUsage(bool yes) { + warnAllUsage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + ForEachEnum( + [&](UsageWarning w) { warnUsage_.set(w, yes); }); +} } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp new file mode 100644 index 0000000000000..ed11318382b35 --- /dev/null +++ b/flang/lib/Support/enum-class.cpp @@ -0,0 +1,24 @@ +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "flang/Common/enum-class.h"
+#include
+#include
+namespace Fortran::common {
+
+std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) {
+ for (int i = 0; i < size; ++i) {
+ if (pred(names[i])) {
+ return i;
+ }
+ }
+ return std::nullopt;
+}
+
+
+} // namespace Fortran::common
\ No newline at end of file
diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90
new file mode 100644
index 0000000000000..8a58e63cfa3ac
--- /dev/null
+++ b/flang/test/Driver/disable-diagnostic.f90
@@ -0,0 +1,19 @@
+! RUN: %flang -Wknown-bad-implicit-interface %s -c 2>&1 | FileCheck %s --check-prefix=WARN
+! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty
+! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1
+! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2
+! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface
+! ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface
+
+program disable_diagnostic
+ REAL :: x
+ INTEGER :: y
+ ! CHECK-NOT: warning
+ ! WARN: warning: If the procedure's interface were explicit, this reference would be in error
+ call sub(x)
+ ! WARN: warning: If the procedure's interface were explicit, this reference would be in error
+ call sub(y)
+end program disable_diagnostic
+
+subroutine sub()
+end subroutine sub
\ No newline at end of file
diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90
index 58adf6f745d5e..33f0aff8a1739 100644
--- a/flang/test/Driver/werror-wrong.f90
+++ b/flang/test/Driver/werror-wrong.f90
@@ -1,6 +1,7 @@
 ! Ensure that only argument -Werror is supported.
-! RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG
-! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG
+! RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG1
+! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2
-! WRONG: Only `-Werror` is supported currently.
+! WRONG1: error: Unknown diagnostic option: -Wall
+! WRONG2: error: Unknown diagnostic option: -WX
\ No newline at end of file
diff --git a/flang/test/Driver/wextra-ok.f90 b/flang/test/Driver/wextra-ok.f90
index 441029aa0af27..db15c7f14aa35 100644
--- a/flang/test/Driver/wextra-ok.f90
+++ b/flang/test/Driver/wextra-ok.f90
@@ -5,7 +5,7 @@
 ! RUN: not %flang -std=f2018 -Wblah -Wextra %s -c 2>&1 | FileCheck %s --check-prefix=WRONG
 ! CHECK-OK: the warning option '-Wextra' is not supported
-! WRONG: Only `-Werror` is supported currently.
+! WRONG: Unknown diagnostic option: -Wblah
 program wextra_ok
 end program wextra_ok
diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt
index bda02ed29a5ef..19cc5a20fecf4 100644
--- a/flang/unittests/Common/CMakeLists.txt
+++ b/flang/unittests/Common/CMakeLists.txt
@@ -1,3 +1,6 @@
 add_flang_unittest(FlangCommonTests
+ EnumClassTests.cpp
 FastIntSetTest.cpp
+ FortranFeaturesTest.cpp
 )
+target_link_libraries(FlangCommonTests PRIVATE FortranSupport)
\ No newline at end of file
diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp
new file mode 100644
index 0000000000000..f67c453cfad15
--- /dev/null
+++ b/flang/unittests/Common/EnumClassTests.cpp
@@ -0,0 +1,45 @@
+//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "flang/Common/enum-class.h"
+#include "flang/Common/template.h"
+#include "gtest/gtest.h"
+
+using namespace Fortran::common;
+using namespace std;
+
+ENUM_CLASS(TestEnum, One, Two,
+ Three)
+ENUM_CLASS_EXTRA(TestEnum)
+
+TEST(EnumClassTest, EnumToString) {
+ ASSERT_EQ(EnumToString(TestEnum::One), "One");
+ ASSERT_EQ(EnumToString(TestEnum::Two), "Two");
+ ASSERT_EQ(EnumToString(TestEnum::Three), "Three");
+}
+
+TEST(EnumClassTest, EnumToStringData) {
+ ASSERT_STREQ(EnumToString(TestEnum::One).data(), "One, Two, Three");
+}
+
+TEST(EnumClassTest, StringToEnum) {
+ ASSERT_EQ(StringToTestEnum("One"), std::optional{TestEnum::One});
+ ASSERT_EQ(StringToTestEnum("Two"), std::optional{TestEnum::Two});
+ ASSERT_EQ(StringToTestEnum("Three"), std::optional{TestEnum::Three});
+ ASSERT_EQ(StringToTestEnum("Four"), std::nullopt);
+ ASSERT_EQ(StringToTestEnum(""), std::nullopt);
+ ASSERT_EQ(StringToTestEnum("One, Two, Three"), std::nullopt);
+}
+
+ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven)
+ENUM_CLASS_EXTRA(TestEnumExtra)
+
+TEST(EnumClassTest, FindNameNormal) {
+ auto p1 = [](auto s) { return s == "TwentyOne"; };
+ ASSERT_EQ(FindTestEnumExtra(p1), std::optional{TestEnumExtra::TwentyOne});
+}
diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp
new file mode 100644
index 0000000000000..7ec7054f14f6e
--- /dev/null
+++ b/flang/unittests/Common/FortranFeaturesTest.cpp
@@ -0,0 +1,142 @@
+//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "flang/Common/enum-class.h"
+#include "flang/Support/Fortran-features.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "gtest/gtest.h"
+
+namespace Fortran::common {
+
+// Not currently exported from Fortran-features.h
+llvm::SmallVector splitCamelCase(llvm::StringRef input);
+llvm::SmallVector splitHyphenated(llvm::StringRef input);
+bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r);
+
+ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven)
+ENUM_CLASS_EXTRA(TestEnumExtra)
+
+TEST(EnumClassTest, SplitCamelCase) {
+
+ auto parts = splitCamelCase("oP");
+ ASSERT_EQ(parts.size(), (size_t)2);
+
+ if (parts[0].compare(llvm::StringRef("o", 1))) {
+ ADD_FAILURE() << "First part is not OP";
+ }
+ if (parts[1].compare(llvm::StringRef("P", 1))) {
+ ADD_FAILURE() << "Second part is not Name";
+ }
+
+ parts = splitCamelCase("OPName");
+ ASSERT_EQ(parts.size(), (size_t)2);
+
+ if (parts[0].compare(llvm::StringRef("OP", 2))) {
+ ADD_FAILURE() << "First part is not OP";
+ }
+ if (parts[1].compare(llvm::StringRef("Name", 4))) {
+ ADD_FAILURE() << "Second part is not Name";
+ }
+
+ parts = splitCamelCase("OpName");
+ ASSERT_EQ(parts.size(), (size_t)2);
+ if (parts[0].compare(llvm::StringRef("Op", 2))) {
+ ADD_FAILURE() << "First part is not Op";
+ }
+ if (parts[1].compare(llvm::StringRef("Name", 4))) {
+ ADD_FAILURE() << "Second part is not Name";
+ }
+
+ parts = splitCamelCase("opName");
+ ASSERT_EQ(parts.size(), (size_t)2);
+ if (parts[0].compare(llvm::StringRef("op", 2))) {
+ ADD_FAILURE() << "First part is not op";
+ }
+ if (parts[1].compare(llvm::StringRef("Name", 4))) {
+ ADD_FAILURE() << "Second part is not Name";
+ }
+
+ parts = splitCamelCase("FlangTestProgram123");
+ ASSERT_EQ(parts.size(), (size_t)3);
+ if (parts[0].compare(llvm::StringRef("Flang", 5))) {
+ ADD_FAILURE() << "First part is not Flang";
+ }
+ if (parts[1].compare(llvm::StringRef("Test", 4))) {
+ ADD_FAILURE() << "Second part is not Test";
+ }
+ if (parts[2].compare(llvm::StringRef("Program123", 10))) {
+ ADD_FAILURE() << "Third part is not Program123";
+ }
+ for (auto p : parts) {
+ llvm::errs() << p << " " << p.size() << "\n";
+ }
+}
+
+TEST(EnumClassTest, SplitHyphenated) {
+ auto parts = splitHyphenated("no-twenty-one");
+ ASSERT_EQ(parts.size(), (size_t)3);
+ if (parts[0].compare(llvm::StringRef("no", 2))) {
+ ADD_FAILURE() << "First part is not twenty";
+ }
+ if (parts[1].compare(llvm::StringRef("twenty", 6))) {
+ ADD_FAILURE() << "Second part is not one";
+ }
+ if (parts[2].compare(llvm::StringRef("one", 3))) {
+ ADD_FAILURE() << "Third part is not one";
+ }
+ for (auto p : parts) {
+ llvm::errs() << p << " " << p.size() << "\n";
+ }
+}
+
+TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) {
+ EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O"));
+ EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p"));
+ EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P"));
+ EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2"));
+ EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op"));
+ EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss"));
+ EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS"));
+ EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss"));
+ EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS"));
+
+ EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O"));
+ EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS"));
+ EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss"));
+ EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555"));
+ EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555"));
+}
+
+std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) {
+ return parseCLIEnum(input, FindTestEnumExtraIndex);
+}
+
+TEST(EnumClassTest, parseCLIEnumOption) {
+ auto result = parseCLITestEnumExtraOption("no-twenty-one");
+ auto expected = std::pair(false, TestEnumExtra::TwentyOne);
+ ASSERT_EQ(result, std::optional{expected});
+ result = parseCLITestEnumExtraOption("twenty-one");
+ expected = std::pair(true, TestEnumExtra::TwentyOne);
+ ASSERT_EQ(result, std::optional{expected});
+ result = parseCLITestEnumExtraOption("no-forty-two");
+ expected = std::pair(false, TestEnumExtra::FortyTwo);
+ ASSERT_EQ(result, std::optional{expected});
+ result = parseCLITestEnumExtraOption("forty-two");
+ expected = std::pair(true, TestEnumExtra::FortyTwo);
+ ASSERT_EQ(result, std::optional{expected});
+ result = parseCLITestEnumExtraOption("no-seven-seven-seven");
+ expected = std::pair(false, TestEnumExtra::SevenSevenSeven);
+ ASSERT_EQ(result, std::optional{expected});
+ result = parseCLITestEnumExtraOption("seven-seven-seven");
+ expected = std::pair(true, TestEnumExtra::SevenSevenSeven);
+ ASSERT_EQ(result, std::optional{expected});
+}
+
+} // namespace Fortran::common

>From 49a0579f9477936b72f0580823b4dd6824697512 Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Thu, 29 May 2025 12:56:14 -0700
Subject: [PATCH 2/5] adjust headers

---
 flang/include/flang/Support/Fortran-features.h | 4 +---
 flang/lib/Frontend/CompilerInvocation.cpp      | 5 -----
 flang/lib/Support/Fortran-features.cpp         | 1 -
 3 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h
index d5aa7357ffea0..4a8b0da4c0d4d 100644
--- a/flang/include/flang/Support/Fortran-features.h
+++ b/flang/include/flang/Support/Fortran-features.h
@@ -11,9 +11,7 @@
 #include "Fortran.h"
 #include "flang/Common/enum-set.h"
-#include "flang/Common/idioms.h"
-#include "llvm/Support/Error.h"
-#include "llvm/Support/raw_ostream.h"
+#include "llvm/ADT/StringRef.h"
 #include
 #include
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 9ea568549bd6c..d8bf601d0171d 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -20,11 +20,9 @@
 #include "flang/Support/Version.h"
 #include "flang/Tools/TargetSetup.h"
 #include "flang/Version.inc"
-#include "clang/Basic/AllDiagnostics.h"
 #include "clang/Basic/DiagnosticDriver.h"
 #include "clang/Basic/DiagnosticOptions.h"
 #include "clang/Driver/Driver.h"
-#include "clang/Driver/DriverDiagnostic.h"
 #include "clang/Driver/OptionUtils.h"
 #include "clang/Driver/Options.h"
 #include "llvm/ADT/StringRef.h"
@@ -34,9 +32,7 @@
 #include "llvm/Option/ArgList.h"
 #include "llvm/Option/OptTable.h"
 #include "llvm/Support/CodeGen.h"
-#include "llvm/Support/Error.h"
 #include "llvm/Support/FileSystem.h"
-#include "llvm/Support/FileUtilities.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/Process.h"
 #include "llvm/Support/raw_ostream.h"
@@ -46,7 +42,6 @@
 #include
 #include
 #include
-#include
 using namespace Fortran::frontend;
diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp
index 55abf0385d185..0e394162ef577 100644
--- a/flang/lib/Support/Fortran-features.cpp
+++ b/flang/lib/Support/Fortran-features.cpp
@@ -10,7 +10,6 @@
 #include "flang/Common/idioms.h"
 #include "flang/Support/Fortran.h"
 #include "llvm/ADT/StringExtras.h"
-#include "llvm/Support/raw_ostream.h"
 namespace Fortran::common {

>From fa2db7090c6d374ce1a835ad26d19a1d7bd42262 Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Thu, 29 May 2025 12:57:22 -0700
Subject: [PATCH 3/5] reformat

---
 flang/lib/Support/enum-class.cpp             | 20 ++++++++++---------
 flang/unittests/Common/EnumClassTests.cpp    |  5 ++---
 .../unittests/Common/FortranFeaturesTest.cpp | 18 ++++++++++-------
 3 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp
index ed11318382b35..ac57f27ef1c9e 100644
--- a/flang/lib/Support/enum-class.cpp
+++ b/flang/lib/Support/enum-class.cpp
@@ -1,4 +1,5 @@
-//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===//
+//===-- lib/Support/enum-class.cpp -------------------------------*- C++
+//-*-===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -7,18 +8,19 @@
 //===----------------------------------------------------------------------===//
 #include "flang/Common/enum-class.h"
-#include
 #include
+#include
 namespace Fortran::common {
-std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) {
- for (int i = 0; i < size; ++i) {
- if (pred(names[i])) {
- return i;
- }
+std::optional FindEnumIndex(
+ std::function pred, int size,
+ const std::string_view *names) {
+ for (int i = 0; i < size; ++i) {
+ if (pred(names[i])) {
+ return i;
 }
- return std::nullopt;
+ }
+ return std::nullopt;
 }
-
 } // namespace Fortran::common
\ No newline at end of file
diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp
index f67c453cfad15..c9224a8ceba54 100644
--- a/flang/unittests/Common/EnumClassTests.cpp
+++ b/flang/unittests/Common/EnumClassTests.cpp
@@ -6,15 +6,14 @@
 //
 //===----------------------------------------------------------------------===//
+#include "gtest/gtest.h"
 #include "flang/Common/enum-class.h"
 #include "flang/Common/template.h"
-#include "gtest/gtest.h"
 using namespace Fortran::common;
 using namespace std;
-ENUM_CLASS(TestEnum, One, Two,
- Three)
+ENUM_CLASS(TestEnum, One, Two, Three)
 ENUM_CLASS_EXTRA(TestEnum)
 TEST(EnumClassTest, EnumToString) {
diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp
index 7ec7054f14f6e..597928e7fe56e 100644
--- a/flang/unittests/Common/FortranFeaturesTest.cpp
+++ b/flang/unittests/Common/FortranFeaturesTest.cpp
@@ -6,12 +6,12 @@
 //
//===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Support/Fortran-features.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" -#include "gtest/gtest.h" namespace Fortran::common { @@ -34,7 +34,7 @@ TEST(EnumClassTest, SplitCamelCase) { if (parts[1].compare(llvm::StringRef("P", 1))) { ADD_FAILURE() << "Second part is not Name"; } - + parts = splitCamelCase("OPName"); ASSERT_EQ(parts.size(), (size_t)2); @@ -114,13 +114,15 @@ TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); } -std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); +std::optional> parseCLITestEnumExtraOption( + llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); } TEST(EnumClassTest, parseCLIEnumOption) { auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = std::pair(false, TestEnumExtra::TwentyOne); + auto expected = + std::pair(false, TestEnumExtra::TwentyOne); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("twenty-one"); expected = std::pair(true, TestEnumExtra::TwentyOne); @@ -132,10 +134,12 @@ TEST(EnumClassTest, parseCLIEnumOption) { expected = std::pair(true, TestEnumExtra::FortyTwo); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(false, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(true, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); } >From 
5f3feb64c1a97500e2808114d44bb07aa4ccb00c Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 15:58:43 -0700 Subject: [PATCH 4/5] addressing feedback --- flang/include/flang/Common/enum-class.h | 53 +++--- flang/include/flang/Common/optional.h | 7 + .../include/flang/Support/Fortran-features.h | 16 -- flang/lib/Support/Fortran-features.cpp | 175 ++++++++---------- flang/lib/Support/enum-class.cpp | 15 +- flang/test/Driver/disable-diagnostic.f90 | 3 +- flang/test/Driver/werror-wrong.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 2 +- .../unittests/Common/FortranFeaturesTest.cpp | 159 +++------------- 9 files changed, 153 insertions(+), 279 deletions(-) diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index baf9fe418141d..3dbd11bb4057c 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -17,9 +17,9 @@ #ifndef FORTRAN_COMMON_ENUM_CLASS_H_ #define FORTRAN_COMMON_ENUM_CLASS_H_ +#include "optional.h" #include #include -#include #include namespace Fortran::common { @@ -59,26 +59,6 @@ constexpr std::array EnumNames(const char *p) { return result; } -template -std::optional inline fmap(std::optional x, std::function f) { - return x ? std::optional{f(*x)} : std::nullopt; -} - -using Predicate = std::function; -// Finds the first index for which the predicate returns true. -std::optional FindEnumIndex( - Predicate pred, int size, const std::string_view *names); - -using FindEnumIndexType = std::optional( - Predicate, int, const std::string_view *); - -template -std::optional inline FindEnum( - Predicate pred, std::function(Predicate)> find) { - std::function f = [](int x) { return static_cast(x); }; - return fmap(find(pred), f); -} - #define ENUM_CLASS(NAME, ...) 
\ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ @@ -90,17 +70,34 @@ std::optional inline FindEnum( return NAME##_names[static_cast(e)]; \ } +namespace EnumClass { + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +optional FindIndex( + Predicate pred, std::size_t size, const std::string_view *names); + +using FindIndexType = std::function(Predicate)>; + +template +optional inline Find(Predicate pred, FindIndexType findIndex) { + return MapOption( + findIndex(pred), [](int x) { return static_cast(x); }); +} + +} // namespace EnumClass + #define ENUM_CLASS_EXTRA(NAME) \ - [[maybe_unused]] inline std::optional Find##NAME##Index( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnumIndex( \ + [[maybe_unused]] inline optional Find##NAME##Index( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::FindIndex( \ p, NAME##_enumSize, NAME##_names.data()); \ } \ - [[maybe_unused]] inline std::optional Find##NAME( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + [[maybe_unused]] inline optional Find##NAME( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::Find(p, Find##NAME##Index); \ } \ - [[maybe_unused]] inline std::optional StringTo##NAME( \ + [[maybe_unused]] inline optional StringTo##NAME( \ const std::string_view name) { \ return Find##NAME( \ [name](const std::string_view s) -> bool { return name == s; }); \ diff --git a/flang/include/flang/Common/optional.h b/flang/include/flang/Common/optional.h index c7c81f40cc8c8..5b623f01e828d 100644 --- a/flang/include/flang/Common/optional.h +++ b/flang/include/flang/Common/optional.h @@ -27,6 +27,7 @@ #define FORTRAN_COMMON_OPTIONAL_H #include "api-attrs.h" +#include #include #include @@ -238,6 +239,12 @@ using std::nullopt_t; using std::optional; #endif // 
!STD_OPTIONAL_UNSUPPORTED +template +std::optional inline MapOption( + std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + } // namespace Fortran::common #endif // FORTRAN_COMMON_OPTIONAL_H diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 4a8b0da4c0d4d..fd6a9139b7ea7 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -133,21 +133,5 @@ class LanguageFeatureControl { bool disableAllWarnings_{false}; }; -// Parse a CLI enum option return the enum index and whether it should be -// enabled (true) or disabled (false). Just exposed for the template below. -std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find); - -template -std::optional> parseCLIEnum( - llvm::StringRef input, std::function(Predicate)> find) { - using To = std::pair; - using From = std::pair; - static std::function cast = [](From x) { - return std::pair{x.first, static_cast(x.second)}; - }; - return fmap(parseCLIEnumIndex(input, find), cast); -} - } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 0e394162ef577..72ea6639adf51 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -11,6 +11,10 @@ #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" +// Debugging +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/raw_ostream.h" + namespace Fortran::common { LanguageFeatureControl::LanguageFeatureControl() { @@ -95,119 +99,99 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Split a string with camel case into the individual words. -// Note, the small vector is just an array of a few pointers and lengths -// into the original input string. 
So all this allocation should be pretty -// cheap. -llvm::SmallVector splitCamelCase(llvm::StringRef input) { - using namespace llvm; - if (input.empty()) { - return {}; +// Namespace for helper functions for parsing CLI options +// used instead of static so that there can be unit tests for these +// functions. +namespace FortranFeaturesHelpers { +// Check if Lower Case Hyphenated words are equal to Camel Case words. +// Because of out use case we know that 'r' the camel case string is +// well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. +// This is checked in the enum-class.h file. +bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { + size_t ls{l.size()}, rs{r.size()}; + if (ls < rs) { + return false; } - SmallVector parts{}; - parts.reserve(input.size()); - auto check = [&input](size_t j, function_ref predicate) { - return j < input.size() && predicate(input[j]); - }; - size_t i{0}; - size_t startWord = i; - for (; i < input.size(); i++) { - if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || - ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { - parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); - startWord = i + 1; + bool atStartOfWord{true}; + size_t wordCount{0}, j; // j is the number of word characters checked in r. + for (; j < rs; j++) { + if (wordCount + j >= ls) { + // `l` was shorter once the hiphens were removed. + // If r is null terminated, then we are good. + return r[j] == '\0'; } - } - parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); - return parts; -} - -// Split a string whith hyphens into the individual words. -llvm::SmallVector splitHyphenated(llvm::StringRef input) { - auto parts = llvm::SmallVector{}; - llvm::SplitString(input, parts, "-"); - return parts; -} - -// Check if two strings are equal while normalizing case for the -// right word which is assumed to be a single word in camel case. 
-bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { - size_t ls = l.size(); - if (ls != r.size()) - return false; - size_t j{0}; - // Process the upper case characters. - for (; j < ls; j++) { - char rc = r[j]; - char rc2l = llvm::toLower(rc); - if (rc == rc2l) { - // Past run of Uppers Case; - break; + if (atStartOfWord) { + if (llvm::isUpper(r[j])) { + // Upper Case Run + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else { + atStartOfWord = false; + if (l[wordCount + j] != r[j]) { + return false; + } + } + } else { + if (llvm::isUpper(r[j])) { + atStartOfWord = true; + if (l[wordCount + j] != '-') { + return false; + } + ++wordCount; + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else if (l[wordCount + j] != r[j]) { + return false; + } } - if (l[j] != rc2l) - return false; } - // Process the lower case characters. - for (; j < ls; j++) { - if (l[j] != r[j]) { - return false; - } + // If there are more characters in l after processing all the characters in r. + // then fail unless the string is null terminated. + if (ls > wordCount + j) { + return l[wordCount + j] == '\0'; } return true; } // Parse a CLI enum option return the enum index and whether it should be // enabled (true) or disabled (false). 
-std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find) { - auto parts = splitHyphenated(input); - bool negated = false; - if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { +template +optional> ParseCLIEnum( + llvm::StringRef input, EnumClass::FindIndexType findIndex) { + bool negated{false}; + if (input.starts_with("no-")) { negated = true; - // Remove the "no" part - parts = llvm::SmallVector(parts.begin() + 1, parts.end()); - } - size_t chars = 0; - for (auto p : parts) { - chars += p.size(); + input = input.drop_front(3); } - auto pred = [&](auto s) { - if (chars != s.size()) { - return false; - } - auto ccParts = splitCamelCase(s); - auto num_ccParts = ccParts.size(); - if (parts.size() != num_ccParts) { - return false; - } - for (size_t i{0}; i < num_ccParts; i++) { - if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { - return false; - } - } - return true; - }; - auto cast = [negated](int x) { return std::pair{!negated, x}; }; - return fmap>(find(pred), cast); + EnumClass::Predicate predicate{ + [input](llvm::StringRef r) { return LowerHyphEqualCamelCase(input, r); }}; + optional x = EnumClass::Find(predicate, findIndex); + return MapOption>( + x, [negated](T x) { return std::pair{!negated, x}; }); } -std::optional> parseCLILanguageFeature( +optional> parseCLIUsageWarning( llvm::StringRef input) { - return parseCLIEnum(input, FindLanguageFeatureIndex); + return ParseCLIEnum(input, FindUsageWarningIndex); } -std::optional> parseCLIUsageWarning( +optional> parseCLILanguageFeature( llvm::StringRef input) { - return parseCLIEnum(input, FindUsageWarningIndex); + return ParseCLIEnum(input, FindLanguageFeatureIndex); } +} // namespace FortranFeaturesHelpers + // Take a string from the CLI and apply it to the LanguageFeatureControl. // Return true if the option was applied recognized. 
bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { - if (auto result = parseCLILanguageFeature(input)) { + if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(input)) { EnableWarning(result->second, result->first); return true; - } else if (auto result = parseCLIUsageWarning(input)) { + } else if (auto result = + FortranFeaturesHelpers::parseCLIUsageWarning(input)) { EnableWarning(result->second, result->first); return true; } @@ -277,11 +261,10 @@ void ForEachEnum(std::function f) { void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { warnAllLanguage_ = yes; - disableAllWarnings_ = yes ? false : disableAllWarnings_; - // should be equivalent to: reset().flip() set ... - ForEachEnum( - [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + warnLanguage_.reset(); if (yes) { + disableAllWarnings_ = false; + warnLanguage_.flip(); // These three features do not need to be warned about, // but we do want their feature flags. warnLanguage_.set(LanguageFeature::OpenMP, false); @@ -292,8 +275,10 @@ void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { void LanguageFeatureControl::WarnOnAllUsage(bool yes) { warnAllUsage_ = yes; - disableAllWarnings_ = yes ? 
false : disableAllWarnings_; - ForEachEnum( - [&](UsageWarning w) { warnUsage_.set(w, yes); }); + warnUsage_.reset(); + if (yes) { + disableAllWarnings_ = false; + warnUsage_.flip(); + } } } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ac57f27ef1c9e..d6d0ee758175b 100644 --- a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -8,19 +8,20 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" +#include "flang/Common/optional.h" #include -#include -namespace Fortran::common { -std::optional FindEnumIndex( - std::function pred, int size, +namespace Fortran::common::EnumClass { + +optional FindIndex( + std::function pred, size_t size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { + for (size_t i = 0; i < size; ++i) { if (pred(names[i])) { return i; } } - return std::nullopt; + return nullopt; } -} // namespace Fortran::common \ No newline at end of file +} // namespace Fortran::common::EnumClass diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 index 8a58e63cfa3ac..849489377da12 100644 --- a/flang/test/Driver/disable-diagnostic.f90 +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -2,6 +2,7 @@ ! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty ! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 ! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 + ! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface ! 
ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface @@ -16,4 +17,4 @@ program disable_diagnostic end program disable_diagnostic subroutine sub() -end subroutine sub \ No newline at end of file +end subroutine sub diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 33f0aff8a1739..6e3c7cca15bc7 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -4,4 +4,4 @@ ! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 ! WRONG1: error: Unknown diagnostic option: -Wall -! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file +! WRONG2: error: Unknown diagnostic option: -WX diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index 19cc5a20fecf4..3149cb9f7bc47 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -3,4 +3,4 @@ add_flang_unittest(FlangCommonTests FastIntSetTest.cpp FortranFeaturesTest.cpp ) -target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 597928e7fe56e..e12aff9f7b735 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -12,135 +12,34 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" - -namespace Fortran::common { - -// Not currently exported from Fortran-features.h -llvm::SmallVector splitCamelCase(llvm::StringRef input); -llvm::SmallVector splitHyphenated(llvm::StringRef input); -bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); - -ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) -ENUM_CLASS_EXTRA(TestEnumExtra) - -TEST(EnumClassTest, SplitCamelCase) { - 
- auto parts = splitCamelCase("oP"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("o", 1))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("P", 1))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OPName"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("OP", 2))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OpName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("Op", 2))) { - ADD_FAILURE() << "First part is not Op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("opName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("op", 2))) { - ADD_FAILURE() << "First part is not op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("FlangTestProgram123"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("Flang", 5))) { - ADD_FAILURE() << "First part is not Flang"; - } - if (parts[1].compare(llvm::StringRef("Test", 4))) { - ADD_FAILURE() << "Second part is not Test"; - } - if (parts[2].compare(llvm::StringRef("Program123", 10))) { - ADD_FAILURE() << "Third part is not Program123"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, SplitHyphenated) { - auto parts = splitHyphenated("no-twenty-one"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("no", 2))) { - ADD_FAILURE() << "First part is not twenty"; - } - if (parts[1].compare(llvm::StringRef("twenty", 6))) { - ADD_FAILURE() << "Second part is not one"; - } - if (parts[2].compare(llvm::StringRef("one", 3))) { 
- ADD_FAILURE() << "Third part is not one"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); - - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); -} - -std::optional> parseCLITestEnumExtraOption( - llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); -} - -TEST(EnumClassTest, parseCLIEnumOption) { - auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = - std::pair(false, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("twenty-one"); - expected = std::pair(true, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-forty-two"); - expected = std::pair(false, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("forty-two"); - expected = std::pair(true, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = - std::pair(false, 
TestEnumExtra::SevenSevenSeven); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = - std::pair(true, TestEnumExtra::SevenSevenSeven); - ASSERT_EQ(result, std::optional{expected}); +#include + +namespace Fortran::common::FortranFeaturesHelpers { + +optional> parseCLIUsageWarning( + llvm::StringRef input); +TEST(EnumClassTest, ParseCLIUsageWarning) { + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("twenty-one")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-")), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("Portability"), std::nullopt); + auto expect{std::pair{false, UsageWarning::Portability}}; + ASSERT_EQ(parseCLIUsageWarning("no-portability"), expect); + expect.first = true; + ASSERT_EQ((parseCLIUsageWarning("portability")), expect); + expect = + std::pair{false, Fortran::common::UsageWarning::PointerToUndefinable}; + ASSERT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable")), expect); + expect.first = true; + ASSERT_EQ((parseCLIUsageWarning("pointer-to-undefinable")), expect); + EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable"), std::nullopt); } -} // namespace Fortran::common +} // namespace Fortran::common::FortranFeaturesHelpers >From 79303b42f7cfd3806c22bd34e5eced5f27d27f32 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 30 May 2025 15:42:27 -0700 Subject: [PATCH 5/5] removing debugging statement --- 
flang/lib/Support/Fortran-features.cpp | 4 ---- 1 file changed, 4 deletions(-) diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 72ea6639adf51..75baa0b096af0 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -11,10 +11,6 @@ #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" -// Debugging -#include "llvm/ADT/StringRef.h" -#include "llvm/Support/raw_ostream.h" - namespace Fortran::common { LanguageFeatureControl::LanguageFeatureControl() { From flang-commits at lists.llvm.org Fri May 30 15:59:58 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 30 May 2025 15:59:58 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683a386e.170a0220.3577ee.8dcc@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/142022 >From 8f3fd2daab46f477e87043c66b3049dff4a5b20e Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:11:04 -0700 Subject: [PATCH 1/6] initial commit --- flang/include/flang/Common/enum-class.h | 47 ++++- .../include/flang/Support/Fortran-features.h | 51 ++++-- flang/lib/Frontend/CompilerInvocation.cpp | 62 ++++--- flang/lib/Support/CMakeLists.txt | 1 + flang/lib/Support/Fortran-features.cpp | 168 ++++++++++++++---- flang/lib/Support/enum-class.cpp | 24 +++ flang/test/Driver/disable-diagnostic.f90 | 19 ++ flang/test/Driver/werror-wrong.f90 | 7 +- flang/test/Driver/wextra-ok.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 3 + flang/unittests/Common/EnumClassTests.cpp | 45 +++++ .../unittests/Common/FortranFeaturesTest.cpp | 142 +++++++++++++++ 12 files changed, 483 insertions(+), 88 deletions(-) create mode 100644 flang/lib/Support/enum-class.cpp create mode 100644 flang/test/Driver/disable-diagnostic.f90 create mode 100644 flang/unittests/Common/EnumClassTests.cpp 
create mode 100644 flang/unittests/Common/FortranFeaturesTest.cpp diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index 41575d45091a8..baf9fe418141d 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -18,8 +18,9 @@ #define FORTRAN_COMMON_ENUM_CLASS_H_ #include -#include - +#include +#include +#include namespace Fortran::common { constexpr std::size_t CountEnumNames(const char *p) { @@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) { return result; } +template +std::optional inline fmap(std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +std::optional FindEnumIndex( + Predicate pred, int size, const std::string_view *names); + +using FindEnumIndexType = std::optional( + Predicate, int, const std::string_view *); + +template +std::optional inline FindEnum( + Predicate pred, std::function(Predicate)> find) { + std::function f = [](int x) { return static_cast(x); }; + return fmap(find(pred), f); +} + #define ENUM_CLASS(NAME, ...) 
\ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ ::Fortran::common::CountEnumNames(#__VA_ARGS__)}; \ + [[maybe_unused]] static constexpr std::array NAME##_names{ \ + ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ [[maybe_unused]] static inline std::string_view EnumToString(NAME e) { \ - static const constexpr auto names{ \ - ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ - return names[static_cast(e)]; \ + return NAME##_names[static_cast(e)]; \ } +#define ENUM_CLASS_EXTRA(NAME) \ + [[maybe_unused]] inline std::optional Find##NAME##Index( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnumIndex( \ + p, NAME##_enumSize, NAME##_names.data()); \ + } \ + [[maybe_unused]] inline std::optional Find##NAME( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + } \ + [[maybe_unused]] inline std::optional StringTo##NAME( \ + const std::string_view name) { \ + return Find##NAME( \ + [name](const std::string_view s) -> bool { return name == s; }); \ + } } // namespace Fortran::common #endif // FORTRAN_COMMON_ENUM_CLASS_H_ diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index e696da9042480..d5aa7357ffea0 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -12,6 +12,8 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" #include "flang/Common/idioms.h" +#include "llvm/Support/Error.h" +#include "llvm/Support/raw_ostream.h" #include #include @@ -79,12 +81,13 @@ ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, NullActualForDefaultIntentAllocatable, UseAssociationIntoSameNameSubprogram, HostAssociatedIntentOutInSpecExpr, NonVolatilePointerToVolatile) +// Generate default String -> Enum mapping. 
+ENUM_CLASS_EXTRA(LanguageFeature) +ENUM_CLASS_EXTRA(UsageWarning) + using LanguageFeatures = EnumSet; using UsageWarnings = EnumSet; -std::optional FindLanguageFeature(const char *); -std::optional FindUsageWarning(const char *); - class LanguageFeatureControl { public: LanguageFeatureControl(); @@ -97,8 +100,10 @@ class LanguageFeatureControl { void EnableWarning(UsageWarning w, bool yes = true) { warnUsage_.set(w, yes); } - void WarnOnAllNonstandard(bool yes = true) { warnAllLanguage_ = yes; } - void WarnOnAllUsage(bool yes = true) { warnAllUsage_ = yes; } + void WarnOnAllNonstandard(bool yes = true); + bool IsWarnOnAllNonstandard() const { return warnAllLanguage_; } + void WarnOnAllUsage(bool yes = true); + bool IsWarnOnAllUsage() const { return warnAllUsage_; } void DisableAllNonstandardWarnings() { warnAllLanguage_ = false; warnLanguage_.clear(); @@ -107,16 +112,16 @@ class LanguageFeatureControl { warnAllUsage_ = false; warnUsage_.clear(); } - - bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } - bool ShouldWarn(LanguageFeature f) const { - return (warnAllLanguage_ && f != LanguageFeature::OpenMP && - f != LanguageFeature::OpenACC && f != LanguageFeature::CUDA) || - warnLanguage_.test(f); - } - bool ShouldWarn(UsageWarning w) const { - return warnAllUsage_ || warnUsage_.test(w); + void DisableAllWarnings() { + disableAllWarnings_ = true; + DisableAllNonstandardWarnings(); + DisableAllUsageWarnings(); } + bool applyCLIOption(llvm::StringRef input); + bool AreWarningsDisabled() const { return disableAllWarnings_; } + bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } + bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); } + bool ShouldWarn(UsageWarning w) const { return warnUsage_.test(w); } // Return all spellings of operators names, depending on features enabled std::vector GetNames(LogicalOperator) const; std::vector GetNames(RelationalOperator) const; @@ -127,6 +132,24 @@ class 
LanguageFeatureControl { bool warnAllLanguage_{false}; UsageWarnings warnUsage_; bool warnAllUsage_{false}; + bool disableAllWarnings_{false}; }; + +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). Just exposed for the template below. +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find); + +template +std::optional> parseCLIEnum( + llvm::StringRef input, std::function(Predicate)> find) { + using To = std::pair; + using From = std::pair; + static std::function cast = [](From x) { + return std::pair{x.first, static_cast(x.second)}; + }; + return fmap(parseCLIEnumIndex(input, find), cast); +} + } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..9ea568549bd6c 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -34,6 +34,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" +#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" @@ -45,6 +46,7 @@ #include #include #include +#include using namespace Fortran::frontend; @@ -971,10 +973,23 @@ static bool parseSemaArgs(CompilerInvocation &res, llvm::opt::ArgList &args, /// Parses all diagnostics related arguments and populates the variables /// options accordingly. Returns false if new errors are generated. +/// FC1 driver entry point for parsing diagnostic arguments. static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, clang::DiagnosticsEngine &diags) { unsigned numErrorsBefore = diags.getNumErrors(); + auto &features = res.getFrontendOpts().features; + // The order of these flags (-pedantic -W -w) is important and is + // chosen to match clang's behavior. 
+ + // -pedantic + if (args.hasArg(clang::driver::options::OPT_pedantic)) { + features.WarnOnAllNonstandard(); + features.WarnOnAllUsage(); + res.setEnableConformanceChecks(); + res.setEnableUsageChecks(); + } + // -Werror option // TODO: Currently throws a Diagnostic for anything other than -W, // this has to change when other -W's are supported. @@ -984,21 +999,27 @@ static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, for (const auto &wArg : wArgs) { if (wArg == "error") { res.setWarnAsErr(true); - } else { - const unsigned diagID = - diags.getCustomDiagID(clang::DiagnosticsEngine::Error, - "Only `-Werror` is supported currently."); - diags.Report(diagID); + // -W(no-) + } else if (!features.applyCLIOption(wArg)) { + const unsigned diagID = diags.getCustomDiagID( + clang::DiagnosticsEngine::Error, "Unknown diagnostic option: -W%0"); + diags.Report(diagID) << wArg; } } } + // -w + if (args.hasArg(clang::driver::options::OPT_w)) { + features.DisableAllWarnings(); + res.setDisableWarnings(); + } + // Default to off for `flang -fc1`. - res.getFrontendOpts().showColors = - parseShowColorsArgs(args, /*defaultDiagColor=*/false); + bool showColors = parseShowColorsArgs(args, false); - // Honor color diagnostics. 
- res.getDiagnosticOpts().ShowColors = res.getFrontendOpts().showColors; + diags.getDiagnosticOptions().ShowColors = showColors; + res.getDiagnosticOpts().ShowColors = showColors; + res.getFrontendOpts().showColors = showColors; return diags.getNumErrors() == numErrorsBefore; } @@ -1074,16 +1095,6 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, Fortran::common::LanguageFeature::OpenACC); } - // -pedantic - if (args.hasArg(clang::driver::options::OPT_pedantic)) { - res.setEnableConformanceChecks(); - res.setEnableUsageChecks(); - } - - // -w - if (args.hasArg(clang::driver::options::OPT_w)) - res.setDisableWarnings(); - // -std=f2018 // TODO: Set proper options when more fortran standards // are supported. @@ -1092,6 +1103,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, // We only allow f2018 as the given standard if (standard == "f2018") { res.setEnableConformanceChecks(); + res.getFrontendOpts().features.WarnOnAllNonstandard(); } else { const unsigned diagID = diags.getCustomDiagID(clang::DiagnosticsEngine::Error, @@ -1099,6 +1111,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, diags.Report(diagID); } } + return diags.getNumErrors() == numErrorsBefore; } @@ -1694,16 +1707,7 @@ void CompilerInvocation::setFortranOpts() { if (frontendOptions.needProvenanceRangeToCharBlockMappings) fortranOptions.needProvenanceRangeToCharBlockMappings = true; - if (getEnableConformanceChecks()) - fortranOptions.features.WarnOnAllNonstandard(); - - if (getEnableUsageChecks()) - fortranOptions.features.WarnOnAllUsage(); - - if (getDisableWarnings()) { - fortranOptions.features.DisableAllNonstandardWarnings(); - fortranOptions.features.DisableAllUsageWarnings(); - } + fortranOptions.features = frontendOptions.features; } std::unique_ptr diff --git a/flang/lib/Support/CMakeLists.txt b/flang/lib/Support/CMakeLists.txt index 363f57ce97dae..9ef31a2a6dcc7 100644 --- 
a/flang/lib/Support/CMakeLists.txt +++ b/flang/lib/Support/CMakeLists.txt @@ -44,6 +44,7 @@ endif() add_flang_library(FortranSupport default-kinds.cpp + enum-class.cpp Flags.cpp Fortran.cpp Fortran-features.cpp diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index bee8984102b82..55abf0385d185 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -9,6 +9,8 @@ #include "flang/Support/Fortran-features.h" #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/Support/raw_ostream.h" namespace Fortran::common { @@ -94,57 +96,123 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Ignore case and any inserted punctuation (like '-'/'_') -static std::optional GetWarningChar(char ch) { - if (ch >= 'a' && ch <= 'z') { - return ch; - } else if (ch >= 'A' && ch <= 'Z') { - return ch - 'A' + 'a'; - } else if (ch >= '0' && ch <= '9') { - return ch; - } else { - return std::nullopt; +// Split a string with camel case into the individual words. +// Note, the small vector is just an array of a few pointers and lengths +// into the original input string. So all this allocation should be pretty +// cheap. 
+llvm::SmallVector splitCamelCase(llvm::StringRef input) { + using namespace llvm; + if (input.empty()) { + return {}; } + SmallVector parts{}; + parts.reserve(input.size()); + auto check = [&input](size_t j, function_ref predicate) { + return j < input.size() && predicate(input[j]); + }; + size_t i{0}; + size_t startWord = i; + for (; i < input.size(); i++) { + if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || + ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { + parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); + startWord = i + 1; + } + } + parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); + return parts; } -static bool WarningNameMatch(const char *a, const char *b) { - while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); - } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); +// Split a string whith hyphens into the individual words. +llvm::SmallVector splitHyphenated(llvm::StringRef input) { + auto parts = llvm::SmallVector{}; + llvm::SplitString(input, parts, "-"); + return parts; +} + +// Check if two strings are equal while normalizing case for the +// right word which is assumed to be a single word in camel case. +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { + size_t ls = l.size(); + if (ls != r.size()) + return false; + size_t j{0}; + // Process the upper case characters. + for (; j < ls; j++) { + char rc = r[j]; + char rc2l = llvm::toLower(rc); + if (rc == rc2l) { + // Past run of Uppers Case; + break; } - if (!ach && !bch) { - return true; - } else if (!ach || !bch || *ach != *bch) { + if (l[j] != rc2l) + return false; + } + // Process the lower case characters. 
+ for (; j < ls; j++) { + if (l[j] != r[j]) { return false; } - ++a, ++b; } + return true; } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find) { + auto parts = splitHyphenated(input); + bool negated = false; + if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { + negated = true; + // Remove the "no" part + parts = llvm::SmallVector(parts.begin() + 1, parts.end()); + } + size_t chars = 0; + for (auto p : parts) { + chars += p.size(); + } + auto pred = [&](auto s) { + if (chars != s.size()) { + return false; + } + auto ccParts = splitCamelCase(s); + auto num_ccParts = ccParts.size(); + if (parts.size() != num_ccParts) { + return false; + } + for (size_t i{0}; i < num_ccParts; i++) { + if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { + return false; } } - } - return std::nullopt; + return true; + }; + auto cast = [negated](int x) { return std::pair{!negated, x}; }; + return fmap>(find(pred), cast); } -std::optional FindLanguageFeature(const char *name) { - return ScanEnum(name); +std::optional> parseCLILanguageFeature( + llvm::StringRef input) { + return parseCLIEnum(input, FindLanguageFeatureIndex); } -std::optional FindUsageWarning(const char *name) { - return ScanEnum(name); +std::optional> parseCLIUsageWarning( + llvm::StringRef input) { + return parseCLIEnum(input, FindUsageWarningIndex); +} + +// Take a string from the CLI and apply it to the LanguageFeatureControl. +// Return true if the option was applied recognized. 
+bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { + if (auto result = parseCLILanguageFeature(input)) { + EnableWarning(result->second, result->first); + return true; + } else if (auto result = parseCLIUsageWarning(input)) { + EnableWarning(result->second, result->first); + return true; + } + return false; } std::vector LanguageFeatureControl::GetNames( @@ -201,4 +269,32 @@ std::vector LanguageFeatureControl::GetNames( } } +template +void ForEachEnum(std::function f) { + for (size_t j{0}; j < N; ++j) { + f(static_cast(j)); + } +} + +void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { + warnAllLanguage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + // should be equivalent to: reset().flip() set ... + ForEachEnum( + [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + if (yes) { + // These three features do not need to be warned about, + // but we do want their feature flags. + warnLanguage_.set(LanguageFeature::OpenMP, false); + warnLanguage_.set(LanguageFeature::OpenACC, false); + warnLanguage_.set(LanguageFeature::CUDA, false); + } +} + +void LanguageFeatureControl::WarnOnAllUsage(bool yes) { + warnAllUsage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + ForEachEnum( + [&](UsageWarning w) { warnUsage_.set(w, yes); }); +} } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp new file mode 100644 index 0000000000000..ed11318382b35 --- /dev/null +++ b/flang/lib/Support/enum-class.cpp @@ -0,0 +1,24 @@ +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include +#include +namespace Fortran::common { + +std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; + } + } + return std::nullopt; +} + + +} // namespace Fortran::common \ No newline at end of file diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 new file mode 100644 index 0000000000000..8a58e63cfa3ac --- /dev/null +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -0,0 +1,19 @@ +! RUN: %flang -Wknown-bad-implicit-interface %s -c 2>&1 | FileCheck %s --check-prefix=WARN +! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty +! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 +! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 +! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface +! ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface + +program disable_diagnostic + REAL :: x + INTEGER :: y + ! CHECK-NOT: warning + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(x) + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(y) +end program disable_diagnostic + +subroutine sub() +end subroutine sub \ No newline at end of file diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 58adf6f745d5e..33f0aff8a1739 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -1,6 +1,7 @@ ! Ensure that only argument -Werror is supported. -! 
RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG -! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG +! RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG1 +! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 -! WRONG: Only `-Werror` is supported currently. +! WRONG1: error: Unknown diagnostic option: -Wall +! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file diff --git a/flang/test/Driver/wextra-ok.f90 b/flang/test/Driver/wextra-ok.f90 index 441029aa0af27..db15c7f14aa35 100644 --- a/flang/test/Driver/wextra-ok.f90 +++ b/flang/test/Driver/wextra-ok.f90 @@ -5,7 +5,7 @@ ! RUN: not %flang -std=f2018 -Wblah -Wextra %s -c 2>&1 | FileCheck %s --check-prefix=WRONG ! CHECK-OK: the warning option '-Wextra' is not supported -! WRONG: Only `-Werror` is supported currently. +! WRONG: Unknown diagnostic option: -Wblah program wextra_ok end program wextra_ok diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index bda02ed29a5ef..19cc5a20fecf4 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -1,3 +1,6 @@ add_flang_unittest(FlangCommonTests + EnumClassTests.cpp FastIntSetTest.cpp + FortranFeaturesTest.cpp ) +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp new file mode 100644 index 0000000000000..f67c453cfad15 --- /dev/null +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -0,0 +1,45 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Common/template.h" +#include "gtest/gtest.h" + +using namespace Fortran::common; +using namespace std; + +ENUM_CLASS(TestEnum, One, Two, + Three) +ENUM_CLASS_EXTRA(TestEnum) + +TEST(EnumClassTest, EnumToString) { + ASSERT_EQ(EnumToString(TestEnum::One), "One"); + ASSERT_EQ(EnumToString(TestEnum::Two), "Two"); + ASSERT_EQ(EnumToString(TestEnum::Three), "Three"); +} + +TEST(EnumClassTest, EnumToStringData) { + ASSERT_STREQ(EnumToString(TestEnum::One).data(), "One, Two, Three"); +} + +TEST(EnumClassTest, StringToEnum) { + ASSERT_EQ(StringToTestEnum("One"), std::optional{TestEnum::One}); + ASSERT_EQ(StringToTestEnum("Two"), std::optional{TestEnum::Two}); + ASSERT_EQ(StringToTestEnum("Three"), std::optional{TestEnum::Three}); + ASSERT_EQ(StringToTestEnum("Four"), std::nullopt); + ASSERT_EQ(StringToTestEnum(""), std::nullopt); + ASSERT_EQ(StringToTestEnum("One, Two, Three"), std::nullopt); +} + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, FindNameNormal) { + auto p1 = [](auto s) { return s == "TwentyOne"; }; + ASSERT_EQ(FindTestEnumExtra(p1), std::optional{TestEnumExtra::TwentyOne}); +} diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp new file mode 100644 index 0000000000000..7ec7054f14f6e --- /dev/null +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -0,0 +1,142 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Support/Fortran-features.h" +#include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/ErrorHandling.h" +#include "gtest/gtest.h" + +namespace Fortran::common { + +// Not currently exported from Fortran-features.h +llvm::SmallVector splitCamelCase(llvm::StringRef input); +llvm::SmallVector splitHyphenated(llvm::StringRef input); +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, SplitCamelCase) { + + auto parts = splitCamelCase("oP"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("o", 1))) { + ADD_FAILURE() << "First part is not OP"; + } + if (parts[1].compare(llvm::StringRef("P", 1))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("OPName"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("OP", 2))) { + ADD_FAILURE() << "First part is not OP"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("OpName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("Op", 2))) { + ADD_FAILURE() << "First part is not Op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("opName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("op", 2))) { + ADD_FAILURE() << "First part is not op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("FlangTestProgram123"); + ASSERT_EQ(parts.size(), (size_t)3); + if 
(parts[0].compare(llvm::StringRef("Flang", 5))) { + ADD_FAILURE() << "First part is not Flang"; + } + if (parts[1].compare(llvm::StringRef("Test", 4))) { + ADD_FAILURE() << "Second part is not Test"; + } + if (parts[2].compare(llvm::StringRef("Program123", 10))) { + ADD_FAILURE() << "Third part is not Program123"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, SplitHyphenated) { + auto parts = splitHyphenated("no-twenty-one"); + ASSERT_EQ(parts.size(), (size_t)3); + if (parts[0].compare(llvm::StringRef("no", 2))) { + ADD_FAILURE() << "First part is not twenty"; + } + if (parts[1].compare(llvm::StringRef("twenty", 6))) { + ADD_FAILURE() << "Second part is not one"; + } + if (parts[2].compare(llvm::StringRef("one", 3))) { + ADD_FAILURE() << "Third part is not one"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); + + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); +} + +std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); +} + 
+TEST(EnumClassTest, parseCLIEnumOption) { + auto result = parseCLITestEnumExtraOption("no-twenty-one"); + auto expected = std::pair(false, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("twenty-one"); + expected = std::pair(true, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-forty-two"); + expected = std::pair(false, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("forty-two"); + expected = std::pair(true, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-seven-seven-seven"); + expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("seven-seven-seven"); + expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); +} + +} // namespace Fortran::common >From 49a0579f9477936b72f0580823b4dd6824697512 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:56:14 -0700 Subject: [PATCH 2/6] adjust headers --- flang/include/flang/Support/Fortran-features.h | 4 +--- flang/lib/Frontend/CompilerInvocation.cpp | 5 ----- flang/lib/Support/Fortran-features.cpp | 1 - 3 files changed, 1 insertion(+), 9 deletions(-) diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index d5aa7357ffea0..4a8b0da4c0d4d 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -11,9 +11,7 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" -#include "flang/Common/idioms.h" -#include "llvm/Support/Error.h" -#include "llvm/Support/raw_ostream.h" +#include "llvm/ADT/StringRef.h" #include #include diff --git a/flang/lib/Frontend/CompilerInvocation.cpp 
b/flang/lib/Frontend/CompilerInvocation.cpp index 9ea568549bd6c..d8bf601d0171d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -20,11 +20,9 @@ #include "flang/Support/Version.h" #include "flang/Tools/TargetSetup.h" #include "flang/Version.inc" -#include "clang/Basic/AllDiagnostics.h" #include "clang/Basic/DiagnosticDriver.h" #include "clang/Basic/DiagnosticOptions.h" #include "clang/Driver/Driver.h" -#include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" #include "llvm/ADT/StringRef.h" @@ -34,9 +32,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" -#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" -#include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" #include "llvm/Support/Process.h" #include "llvm/Support/raw_ostream.h" @@ -46,7 +42,6 @@ #include #include #include -#include using namespace Fortran::frontend; diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 55abf0385d185..0e394162ef577 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -10,7 +10,6 @@ #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" -#include "llvm/Support/raw_ostream.h" namespace Fortran::common { >From fa2db7090c6d374ce1a835ad26d19a1d7bd42262 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:57:22 -0700 Subject: [PATCH 3/6] reformat --- flang/lib/Support/enum-class.cpp | 20 ++++++++++--------- flang/unittests/Common/EnumClassTests.cpp | 5 ++--- .../unittests/Common/FortranFeaturesTest.cpp | 18 ++++++++++------- 3 files changed, 24 insertions(+), 19 deletions(-) diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ed11318382b35..ac57f27ef1c9e 100644 --- 
a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -1,4 +1,5 @@ -//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ +//-*-===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -7,18 +8,19 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" -#include #include +#include namespace Fortran::common { -std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { - if (pred(names[i])) { - return i; - } +std::optional FindEnumIndex( + std::function pred, int size, + const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; } - return std::nullopt; + } + return std::nullopt; } - } // namespace Fortran::common \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp index f67c453cfad15..c9224a8ceba54 100644 --- a/flang/unittests/Common/EnumClassTests.cpp +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -6,15 +6,14 @@ // //===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Common/template.h" -#include "gtest/gtest.h" using namespace Fortran::common; using namespace std; -ENUM_CLASS(TestEnum, One, Two, - Three) +ENUM_CLASS(TestEnum, One, Two, Three) ENUM_CLASS_EXTRA(TestEnum) TEST(EnumClassTest, EnumToString) { diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 7ec7054f14f6e..597928e7fe56e 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -6,12 +6,12 @@ // 
//===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Support/Fortran-features.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" -#include "gtest/gtest.h" namespace Fortran::common { @@ -34,7 +34,7 @@ TEST(EnumClassTest, SplitCamelCase) { if (parts[1].compare(llvm::StringRef("P", 1))) { ADD_FAILURE() << "Second part is not Name"; } - + parts = splitCamelCase("OPName"); ASSERT_EQ(parts.size(), (size_t)2); @@ -114,13 +114,15 @@ TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); } -std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); +std::optional> parseCLITestEnumExtraOption( + llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); } TEST(EnumClassTest, parseCLIEnumOption) { auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = std::pair(false, TestEnumExtra::TwentyOne); + auto expected = + std::pair(false, TestEnumExtra::TwentyOne); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("twenty-one"); expected = std::pair(true, TestEnumExtra::TwentyOne); @@ -132,10 +134,12 @@ TEST(EnumClassTest, parseCLIEnumOption) { expected = std::pair(true, TestEnumExtra::FortyTwo); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(false, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(true, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); } >From 
5f3feb64c1a97500e2808114d44bb07aa4ccb00c Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 15:58:43 -0700 Subject: [PATCH 4/6] addressing feedback --- flang/include/flang/Common/enum-class.h | 53 +++--- flang/include/flang/Common/optional.h | 7 + .../include/flang/Support/Fortran-features.h | 16 -- flang/lib/Support/Fortran-features.cpp | 175 ++++++++---------- flang/lib/Support/enum-class.cpp | 15 +- flang/test/Driver/disable-diagnostic.f90 | 3 +- flang/test/Driver/werror-wrong.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 2 +- .../unittests/Common/FortranFeaturesTest.cpp | 159 +++------------- 9 files changed, 153 insertions(+), 279 deletions(-) diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index baf9fe418141d..3dbd11bb4057c 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -17,9 +17,9 @@ #ifndef FORTRAN_COMMON_ENUM_CLASS_H_ #define FORTRAN_COMMON_ENUM_CLASS_H_ +#include "optional.h" #include #include -#include #include namespace Fortran::common { @@ -59,26 +59,6 @@ constexpr std::array EnumNames(const char *p) { return result; } -template -std::optional inline fmap(std::optional x, std::function f) { - return x ? std::optional{f(*x)} : std::nullopt; -} - -using Predicate = std::function; -// Finds the first index for which the predicate returns true. -std::optional FindEnumIndex( - Predicate pred, int size, const std::string_view *names); - -using FindEnumIndexType = std::optional( - Predicate, int, const std::string_view *); - -template -std::optional inline FindEnum( - Predicate pred, std::function(Predicate)> find) { - std::function f = [](int x) { return static_cast(x); }; - return fmap(find(pred), f); -} - #define ENUM_CLASS(NAME, ...) 
\ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ @@ -90,17 +70,34 @@ std::optional inline FindEnum( return NAME##_names[static_cast(e)]; \ } +namespace EnumClass { + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +optional FindIndex( + Predicate pred, std::size_t size, const std::string_view *names); + +using FindIndexType = std::function(Predicate)>; + +template +optional inline Find(Predicate pred, FindIndexType findIndex) { + return MapOption( + findIndex(pred), [](int x) { return static_cast(x); }); +} + +} // namespace EnumClass + #define ENUM_CLASS_EXTRA(NAME) \ - [[maybe_unused]] inline std::optional Find##NAME##Index( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnumIndex( \ + [[maybe_unused]] inline optional Find##NAME##Index( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::FindIndex( \ p, NAME##_enumSize, NAME##_names.data()); \ } \ - [[maybe_unused]] inline std::optional Find##NAME( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + [[maybe_unused]] inline optional Find##NAME( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::Find(p, Find##NAME##Index); \ } \ - [[maybe_unused]] inline std::optional StringTo##NAME( \ + [[maybe_unused]] inline optional StringTo##NAME( \ const std::string_view name) { \ return Find##NAME( \ [name](const std::string_view s) -> bool { return name == s; }); \ diff --git a/flang/include/flang/Common/optional.h b/flang/include/flang/Common/optional.h index c7c81f40cc8c8..5b623f01e828d 100644 --- a/flang/include/flang/Common/optional.h +++ b/flang/include/flang/Common/optional.h @@ -27,6 +27,7 @@ #define FORTRAN_COMMON_OPTIONAL_H #include "api-attrs.h" +#include #include #include @@ -238,6 +239,12 @@ using std::nullopt_t; using std::optional; #endif // 
!STD_OPTIONAL_UNSUPPORTED +template +std::optional inline MapOption( + std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + } // namespace Fortran::common #endif // FORTRAN_COMMON_OPTIONAL_H diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 4a8b0da4c0d4d..fd6a9139b7ea7 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -133,21 +133,5 @@ class LanguageFeatureControl { bool disableAllWarnings_{false}; }; -// Parse a CLI enum option return the enum index and whether it should be -// enabled (true) or disabled (false). Just exposed for the template below. -std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find); - -template -std::optional> parseCLIEnum( - llvm::StringRef input, std::function(Predicate)> find) { - using To = std::pair; - using From = std::pair; - static std::function cast = [](From x) { - return std::pair{x.first, static_cast(x.second)}; - }; - return fmap(parseCLIEnumIndex(input, find), cast); -} - } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 0e394162ef577..72ea6639adf51 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -11,6 +11,10 @@ #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" +// Debugging +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/raw_ostream.h" + namespace Fortran::common { LanguageFeatureControl::LanguageFeatureControl() { @@ -95,119 +99,99 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Split a string with camel case into the individual words. -// Note, the small vector is just an array of a few pointers and lengths -// into the original input string. 
So all this allocation should be pretty -// cheap. -llvm::SmallVector splitCamelCase(llvm::StringRef input) { - using namespace llvm; - if (input.empty()) { - return {}; +// Namespace for helper functions for parsing CLI options +// used instead of static so that there can be unit tests for these +// functions. +namespace FortranFeaturesHelpers { +// Check if Lower Case Hyphenated words are equal to Camel Case words. +// Because of out use case we know that 'r' the camel case string is +// well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. +// This is checked in the enum-class.h file. +bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { + size_t ls{l.size()}, rs{r.size()}; + if (ls < rs) { + return false; } - SmallVector parts{}; - parts.reserve(input.size()); - auto check = [&input](size_t j, function_ref predicate) { - return j < input.size() && predicate(input[j]); - }; - size_t i{0}; - size_t startWord = i; - for (; i < input.size(); i++) { - if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || - ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { - parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); - startWord = i + 1; + bool atStartOfWord{true}; + size_t wordCount{0}, j; // j is the number of word characters checked in r. + for (; j < rs; j++) { + if (wordCount + j >= ls) { + // `l` was shorter once the hiphens were removed. + // If r is null terminated, then we are good. + return r[j] == '\0'; } - } - parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); - return parts; -} - -// Split a string whith hyphens into the individual words. -llvm::SmallVector splitHyphenated(llvm::StringRef input) { - auto parts = llvm::SmallVector{}; - llvm::SplitString(input, parts, "-"); - return parts; -} - -// Check if two strings are equal while normalizing case for the -// right word which is assumed to be a single word in camel case. 
-bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { - size_t ls = l.size(); - if (ls != r.size()) - return false; - size_t j{0}; - // Process the upper case characters. - for (; j < ls; j++) { - char rc = r[j]; - char rc2l = llvm::toLower(rc); - if (rc == rc2l) { - // Past run of Uppers Case; - break; + if (atStartOfWord) { + if (llvm::isUpper(r[j])) { + // Upper Case Run + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else { + atStartOfWord = false; + if (l[wordCount + j] != r[j]) { + return false; + } + } + } else { + if (llvm::isUpper(r[j])) { + atStartOfWord = true; + if (l[wordCount + j] != '-') { + return false; + } + ++wordCount; + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else if (l[wordCount + j] != r[j]) { + return false; + } } - if (l[j] != rc2l) - return false; } - // Process the lower case characters. - for (; j < ls; j++) { - if (l[j] != r[j]) { - return false; - } + // If there are more characters in l after processing all the characters in r. + // then fail unless the string is null terminated. + if (ls > wordCount + j) { + return l[wordCount + j] == '\0'; } return true; } // Parse a CLI enum option return the enum index and whether it should be // enabled (true) or disabled (false). 
-std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find) { - auto parts = splitHyphenated(input); - bool negated = false; - if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { +template +optional> ParseCLIEnum( + llvm::StringRef input, EnumClass::FindIndexType findIndex) { + bool negated{false}; + if (input.starts_with("no-")) { negated = true; - // Remove the "no" part - parts = llvm::SmallVector(parts.begin() + 1, parts.end()); - } - size_t chars = 0; - for (auto p : parts) { - chars += p.size(); + input = input.drop_front(3); } - auto pred = [&](auto s) { - if (chars != s.size()) { - return false; - } - auto ccParts = splitCamelCase(s); - auto num_ccParts = ccParts.size(); - if (parts.size() != num_ccParts) { - return false; - } - for (size_t i{0}; i < num_ccParts; i++) { - if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { - return false; - } - } - return true; - }; - auto cast = [negated](int x) { return std::pair{!negated, x}; }; - return fmap>(find(pred), cast); + EnumClass::Predicate predicate{ + [input](llvm::StringRef r) { return LowerHyphEqualCamelCase(input, r); }}; + optional x = EnumClass::Find(predicate, findIndex); + return MapOption>( + x, [negated](T x) { return std::pair{!negated, x}; }); } -std::optional> parseCLILanguageFeature( +optional> parseCLIUsageWarning( llvm::StringRef input) { - return parseCLIEnum(input, FindLanguageFeatureIndex); + return ParseCLIEnum(input, FindUsageWarningIndex); } -std::optional> parseCLIUsageWarning( +optional> parseCLILanguageFeature( llvm::StringRef input) { - return parseCLIEnum(input, FindUsageWarningIndex); + return ParseCLIEnum(input, FindLanguageFeatureIndex); } +} // namespace FortranFeaturesHelpers + // Take a string from the CLI and apply it to the LanguageFeatureControl. // Return true if the option was applied recognized. 
bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { - if (auto result = parseCLILanguageFeature(input)) { + if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(input)) { EnableWarning(result->second, result->first); return true; - } else if (auto result = parseCLIUsageWarning(input)) { + } else if (auto result = + FortranFeaturesHelpers::parseCLIUsageWarning(input)) { EnableWarning(result->second, result->first); return true; } @@ -277,11 +261,10 @@ void ForEachEnum(std::function f) { void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { warnAllLanguage_ = yes; - disableAllWarnings_ = yes ? false : disableAllWarnings_; - // should be equivalent to: reset().flip() set ... - ForEachEnum( - [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + warnLanguage_.reset(); if (yes) { + disableAllWarnings_ = false; + warnLanguage_.flip(); // These three features do not need to be warned about, // but we do want their feature flags. warnLanguage_.set(LanguageFeature::OpenMP, false); @@ -292,8 +275,10 @@ void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { void LanguageFeatureControl::WarnOnAllUsage(bool yes) { warnAllUsage_ = yes; - disableAllWarnings_ = yes ? 
false : disableAllWarnings_; - ForEachEnum( - [&](UsageWarning w) { warnUsage_.set(w, yes); }); + warnUsage_.reset(); + if (yes) { + disableAllWarnings_ = false; + warnUsage_.flip(); + } } } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ac57f27ef1c9e..d6d0ee758175b 100644 --- a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -8,19 +8,20 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" +#include "flang/Common/optional.h" #include -#include -namespace Fortran::common { -std::optional FindEnumIndex( - std::function pred, int size, +namespace Fortran::common::EnumClass { + +optional FindIndex( + std::function pred, size_t size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { + for (size_t i = 0; i < size; ++i) { if (pred(names[i])) { return i; } } - return std::nullopt; + return nullopt; } -} // namespace Fortran::common \ No newline at end of file +} // namespace Fortran::common::EnumClass diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 index 8a58e63cfa3ac..849489377da12 100644 --- a/flang/test/Driver/disable-diagnostic.f90 +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -2,6 +2,7 @@ ! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty ! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 ! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 + ! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface ! 
ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface @@ -16,4 +17,4 @@ program disable_diagnostic end program disable_diagnostic subroutine sub() -end subroutine sub \ No newline at end of file +end subroutine sub diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 33f0aff8a1739..6e3c7cca15bc7 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -4,4 +4,4 @@ ! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 ! WRONG1: error: Unknown diagnostic option: -Wall -! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file +! WRONG2: error: Unknown diagnostic option: -WX diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index 19cc5a20fecf4..3149cb9f7bc47 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -3,4 +3,4 @@ add_flang_unittest(FlangCommonTests FastIntSetTest.cpp FortranFeaturesTest.cpp ) -target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 597928e7fe56e..e12aff9f7b735 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -12,135 +12,34 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" - -namespace Fortran::common { - -// Not currently exported from Fortran-features.h -llvm::SmallVector splitCamelCase(llvm::StringRef input); -llvm::SmallVector splitHyphenated(llvm::StringRef input); -bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); - -ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) -ENUM_CLASS_EXTRA(TestEnumExtra) - -TEST(EnumClassTest, SplitCamelCase) { - 
- auto parts = splitCamelCase("oP"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("o", 1))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("P", 1))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OPName"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("OP", 2))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OpName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("Op", 2))) { - ADD_FAILURE() << "First part is not Op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("opName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("op", 2))) { - ADD_FAILURE() << "First part is not op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("FlangTestProgram123"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("Flang", 5))) { - ADD_FAILURE() << "First part is not Flang"; - } - if (parts[1].compare(llvm::StringRef("Test", 4))) { - ADD_FAILURE() << "Second part is not Test"; - } - if (parts[2].compare(llvm::StringRef("Program123", 10))) { - ADD_FAILURE() << "Third part is not Program123"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, SplitHyphenated) { - auto parts = splitHyphenated("no-twenty-one"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("no", 2))) { - ADD_FAILURE() << "First part is not twenty"; - } - if (parts[1].compare(llvm::StringRef("twenty", 6))) { - ADD_FAILURE() << "Second part is not one"; - } - if (parts[2].compare(llvm::StringRef("one", 3))) { 
- ADD_FAILURE() << "Third part is not one"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); - - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); -} - -std::optional> parseCLITestEnumExtraOption( - llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); -} - -TEST(EnumClassTest, parseCLIEnumOption) { - auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = - std::pair(false, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("twenty-one"); - expected = std::pair(true, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-forty-two"); - expected = std::pair(false, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("forty-two"); - expected = std::pair(true, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = - std::pair(false, 
TestEnumExtra::SevenSevenSeven); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = - std::pair(true, TestEnumExtra::SevenSevenSeven); - ASSERT_EQ(result, std::optional{expected}); +#include + +namespace Fortran::common::FortranFeaturesHelpers { + +optional> parseCLIUsageWarning( + llvm::StringRef input); +TEST(EnumClassTest, ParseCLIUsageWarning) { + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("twenty-one")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-")), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("Portability"), std::nullopt); + auto expect{std::pair{false, UsageWarning::Portability}}; + ASSERT_EQ(parseCLIUsageWarning("no-portability"), expect); + expect.first = true; + ASSERT_EQ((parseCLIUsageWarning("portability")), expect); + expect = + std::pair{false, Fortran::common::UsageWarning::PointerToUndefinable}; + ASSERT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable")), expect); + expect.first = true; + ASSERT_EQ((parseCLIUsageWarning("pointer-to-undefinable")), expect); + EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable"), std::nullopt); } -} // namespace Fortran::common +} // namespace Fortran::common::FortranFeaturesHelpers >From 79303b42f7cfd3806c22bd34e5eced5f27d27f32 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 30 May 2025 15:42:27 -0700 Subject: [PATCH 5/6] removing debugging statement --- 
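Note on the matching rule that the `ParseCLIUsageWarning` test above exercises: an option is accepted only when it is the lower-case, hyphen-separated spelling of a CamelCase enumerator, optionally prefixed with `no-`. The following is a minimal sketch of that rule, not the patch's implementation. The names `CamelToHyphenated` and `ParseOption` and the plain name table are hypothetical stand-ins for illustration, and the sketch builds the hyphenated spelling up front instead of comparing in a single pass.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the patch): spell a CamelCase enumerator
// name the way it would be written on the command line. A hyphen is emitted
// before an upper-case letter that follows a non-upper-case character, so a
// run of capitals stays in one word.
std::string CamelToHyphenated(std::string_view name) {
  std::string out;
  for (std::size_t i = 0; i < name.size(); ++i) {
    const auto c = static_cast<unsigned char>(name[i]);
    const bool upper = std::isupper(c) != 0;
    const bool prevUpper =
        i > 0 && std::isupper(static_cast<unsigned char>(name[i - 1])) != 0;
    if (upper && i > 0 && !prevUpper) {
      out.push_back('-');
    }
    out.push_back(static_cast<char>(std::tolower(c)));
  }
  return out;
}

// Hypothetical stand-in for the option parser: returns {enabled, index into
// names}, or nullopt when the spelling does not match any name exactly.
std::optional<std::pair<bool, std::size_t>> ParseOption(
    std::string_view input, const std::vector<std::string_view> &names) {
  bool negated = false;
  if (input.substr(0, 3) == "no-") {
    negated = true;
    input.remove_prefix(3); // drop the "no-" prefix before matching
  }
  for (std::size_t i = 0; i < names.size(); ++i) {
    const std::string spelled = CamelToHyphenated(names[i]);
    if (std::string_view{spelled} == input) {
      return std::make_pair(!negated, i);
    }
  }
  return std::nullopt;
}
```

With a table holding `Portability` and `PointerToUndefinable`, `ParseOption("no-pointer-to-undefinable", names)` yields a disabled entry for index 1, while `Portability` and `pointertoundefinable` are rejected because the comparison is case-sensitive and requires the hyphens.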
 flang/lib/Support/Fortran-features.cpp | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp
index 72ea6639adf51..75baa0b096af0 100644
--- a/flang/lib/Support/Fortran-features.cpp
+++ b/flang/lib/Support/Fortran-features.cpp
@@ -11,10 +11,6 @@
 #include "flang/Support/Fortran.h"
 #include "llvm/ADT/StringExtras.h"

-// Debugging
-#include "llvm/ADT/StringRef.h"
-#include "llvm/Support/raw_ostream.h"
-
 namespace Fortran::common {

 LanguageFeatureControl::LanguageFeatureControl() {

>From 8f0aa22125528a755ec61af2bd45b6c314cfe45c Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 30 May 2025 15:59:18 -0700
Subject: [PATCH 6/6] more feedback

---
 flang/include/flang/Support/Fortran-features.h | 4 +---
 flang/lib/Support/Fortran-features.cpp | 16 +++++++++-------
 flang/unittests/Common/FortranFeaturesTest.cpp | 4 ----
 3 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h
index fd6a9139b7ea7..501b183cceeec 100644
--- a/flang/include/flang/Support/Fortran-features.h
+++ b/flang/include/flang/Support/Fortran-features.h
@@ -11,8 +11,6 @@

 #include "Fortran.h"
 #include "flang/Common/enum-set.h"
-#include "llvm/ADT/StringRef.h"
-#include 
 #include 

 namespace Fortran::common {
@@ -115,7 +113,7 @@ class LanguageFeatureControl {
     DisableAllNonstandardWarnings();
     DisableAllUsageWarnings();
   }
-  bool applyCLIOption(llvm::StringRef input);
+  bool applyCLIOption(std::string_view input);
   bool AreWarningsDisabled() const { return disableAllWarnings_; }
   bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); }
   bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); }
diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp
index 75baa0b096af0..d140ecdff7f24 100644
--- a/flang/lib/Support/Fortran-features.cpp
+++ b/flang/lib/Support/Fortran-features.cpp
@@ -99,11 +99,11 @@ LanguageFeatureControl::LanguageFeatureControl() {
 // used instead of static so that there can be unit tests for these
 // functions.
 namespace FortranFeaturesHelpers {
-// Check if Lower Case Hyphenated words are equal to Camel Case words.
+// Check if lower case hyphenated words are equal to camel case words.
 // Because of out use case we know that 'r' the camel case string is
 // well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*.
 // This is checked in the enum-class.h file.
-bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) {
+static bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) {
   size_t ls{l.size()}, rs{r.size()};
   if (ls < rs) {
     return false;
@@ -161,8 +161,9 @@ optional> ParseCLIEnum(
     negated = true;
     input = input.drop_front(3);
   }
-  EnumClass::Predicate predicate{
-      [input](llvm::StringRef r) { return LowerHyphEqualCamelCase(input, r); }};
+  EnumClass::Predicate predicate{[input](std::string_view r) {
+    return LowerHyphEqualCamelCase(input, r);
+  }};
   optional x = EnumClass::Find(predicate, findIndex);
   return MapOption>(
       x, [negated](T x) { return std::pair{!negated, x}; });
@@ -182,12 +183,13 @@ optional> parseCLILanguageFeature(

 // Take a string from the CLI and apply it to the LanguageFeatureControl.
 // Return true if the option was applied recognized.
-bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) {
-  if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(input)) {
+bool LanguageFeatureControl::applyCLIOption(std::string_view input) {
+  llvm::StringRef inputRef{input};
+  if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(inputRef)) {
     EnableWarning(result->second, result->first);
     return true;
   } else if (auto result =
-                 FortranFeaturesHelpers::parseCLIUsageWarning(input)) {
+                 FortranFeaturesHelpers::parseCLIUsageWarning(inputRef)) {
     EnableWarning(result->second, result->first);
     return true;
   }
diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp
index e12aff9f7b735..b3f0c31a57025 100644
--- a/flang/unittests/Common/FortranFeaturesTest.cpp
+++ b/flang/unittests/Common/FortranFeaturesTest.cpp
@@ -7,11 +7,7 @@
 //===----------------------------------------------------------------------===//

 #include "gtest/gtest.h"
-#include "flang/Common/enum-class.h"
 #include "flang/Support/Fortran-features.h"
-#include "llvm/ADT/SmallVector.h"
-#include "llvm/ADT/StringRef.h"
-#include "llvm/Support/ErrorHandling.h"
 #include 

 namespace Fortran::common::FortranFeaturesHelpers {

From flang-commits at lists.llvm.org Fri May 30 17:07:19 2025
From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits)
Date: Fri, 30 May 2025 17:07:19 -0700 (PDT)
Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022)
In-Reply-To:
Message-ID: <683a4837.170a0220.18d289.9408@mx.google.com>

================
@@ -107,16 +110,16 @@ class LanguageFeatureControl {
     warnAllUsage_ = false;
     warnUsage_.clear();
   }
-
-  bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); }
-  bool ShouldWarn(LanguageFeature f) const {
-    return (warnAllLanguage_ && f != LanguageFeature::OpenMP &&
-               f != LanguageFeature::OpenACC && f != LanguageFeature::CUDA) ||
-        warnLanguage_.test(f);
-  }
-  bool ShouldWarn(UsageWarning w) const {
-    return warnAllUsage_ || warnUsage_.test(w);
+  void DisableAllWarnings() {
+    disableAllWarnings_ = true;
+    DisableAllNonstandardWarnings();
+    DisableAllUsageWarnings();
   }
+  bool applyCLIOption(llvm::StringRef input);
----------------
akuhlens wrote:

I didn't know that. Do you know the reason behind avoiding having dependencies on llvm?

https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Fri May 30 17:08:14 2025
From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits)
Date: Fri, 30 May 2025 17:08:14 -0700 (PDT)
Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882)
In-Reply-To:
Message-ID: <683a486e.630a0220.11754d.6077@mx.google.com>

https://github.com/snarang181 updated https://github.com/llvm/llvm-project/pull/141882

>From 7ef55e467fd2eaacc66c36c6e6e4df33b86ada62 Mon Sep 17 00:00:00 2001
From: Samarth Narang
Date: Wed, 28 May 2025 20:21:16 -0400
Subject: [PATCH 1/5] [Flang][Docs] Add Sphinx man page support for Flang

This patch enables building Flang man pages by:
- Adding a `man_pages` entry in flang/docs/conf.py for Sphinx man builder.
- Adding a minimal `index.rst` as the master document.
- Adding placeholder `.rst` files for FIRLangRef and FlangCommandLineReference
  to fix toctree references.

These changes unblock builds using `-DLLVM_BUILD_MANPAGES=ON` and allow `ninja docs-flang-man` to generate `flang.1`.
Fixes #141757 --- flang/docs/FIRLangRef.rst | 4 ++++ flang/docs/FlangCommandLineReference.rst | 4 ++++ flang/docs/conf.py | 4 +++- flang/docs/index.rst | 10 ++++++++++ 4 files changed, 21 insertions(+), 1 deletion(-) create mode 100644 flang/docs/FIRLangRef.rst create mode 100644 flang/docs/FlangCommandLineReference.rst create mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst new file mode 100644 index 0000000000000..91edd67fdcad8 --- /dev/null +++ b/flang/docs/FIRLangRef.rst @@ -0,0 +1,4 @@ +FIR Language Reference +====================== + +(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst new file mode 100644 index 0000000000000..71f77f28ba72c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.rst @@ -0,0 +1,4 @@ +Flang Command Line Reference +============================ + +(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 48f7b69f5d750..46907f144e25a 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -227,7 +227,9 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [] +man_pages = [ + ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) +] # If true, show URL addresses after external links. # man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst new file mode 100644 index 0000000000000..09677eb87704f --- /dev/null +++ b/flang/docs/index.rst @@ -0,0 +1,10 @@ +Flang Documentation +==================== + +Welcome to the Flang documentation. + +.. 
toctree:: + :maxdepth: 1 + + FIRLangRef + FlangCommandLineReference >From c21b5a0b7e404fc38f51f718e3f3c390d79af6a5 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 06:53:34 -0400 Subject: [PATCH 2/5] Remove .rst files and point conf.py to pick up .md --- flang/docs/FIRLangRef.rst | 4 ---- flang/docs/FlangCommandLineReference.rst | 4 ---- flang/docs/conf.py | 5 ++--- flang/docs/index.rst | 10 ---------- 4 files changed, 2 insertions(+), 21 deletions(-) delete mode 100644 flang/docs/FIRLangRef.rst delete mode 100644 flang/docs/FlangCommandLineReference.rst delete mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst deleted file mode 100644 index 91edd67fdcad8..0000000000000 --- a/flang/docs/FIRLangRef.rst +++ /dev/null @@ -1,4 +0,0 @@ -FIR Language Reference -====================== - -(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst deleted file mode 100644 index 71f77f28ba72c..0000000000000 --- a/flang/docs/FlangCommandLineReference.rst +++ /dev/null @@ -1,4 +0,0 @@ -Flang Command Line Reference -============================ - -(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 46907f144e25a..4fd81440c8176 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -42,6 +42,7 @@ # Add any paths that contain templates here, relative to this directory. templates_path = ["_templates"] +source_suffix = [".md"] myst_heading_anchors = 6 import sphinx @@ -227,9 +228,7 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [ - ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) -] +man_pages = [("index", "flang", "Flang Documentation", ["Flang Contributors"], 1)] # If true, show URL addresses after external links. 
# man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst deleted file mode 100644 index 09677eb87704f..0000000000000 --- a/flang/docs/index.rst +++ /dev/null @@ -1,10 +0,0 @@ -Flang Documentation -==================== - -Welcome to the Flang documentation. - -.. toctree:: - :maxdepth: 1 - - FIRLangRef - FlangCommandLineReference >From 925c666bd60163e4943a074689a5bbbcbe29614a Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 07:03:35 -0400 Subject: [PATCH 3/5] While building man pages, the .md files were being used. Due to that, the myst_parser was explictly imported. Adding Placeholder .md files which are required by index.md --- flang/docs/FIRLangRef.md | 3 +++ flang/docs/FlangCommandLineReference.md | 3 +++ flang/docs/conf.py | 10 +++++----- 3 files changed, 11 insertions(+), 5 deletions(-) create mode 100644 flang/docs/FIRLangRef.md create mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md new file mode 100644 index 0000000000000..8e4052f14fc7c --- /dev/null +++ b/flang/docs/FIRLangRef.md @@ -0,0 +1,3 @@ +# FIR Language Reference + +_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md new file mode 100644 index 0000000000000..ee8d7b83dc50c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.md @@ -0,0 +1,3 @@ +# Flang Command Line Reference + +_TODO: Add Flang CLI documentation._ diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 4fd81440c8176..7223661625689 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -10,6 +10,7 @@ # serve to show the default. from datetime import date + # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. 
@@ -28,16 +29,15 @@ "sphinx.ext.autodoc", ] -# When building man pages, we do not use the markdown pages, -# So, we can continue without the myst_parser dependencies. -# Doing so reduces dependencies of some packaged llvm distributions. + try: import myst_parser extensions.append("myst_parser") except ImportError: - if not tags.has("builder-man"): - raise + raise ImportError( + "myst_parser is required to build documentation, including man pages." + ) # Add any paths that contain templates here, relative to this directory. >From 56340f2b9b2f4f084033db919e9f1a727b621856 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 09:01:22 -0400 Subject: [PATCH 4/5] Remove placeholder .md files --- flang/docs/FIRLangRef.md | 3 --- flang/docs/FlangCommandLineReference.md | 3 --- 2 files changed, 6 deletions(-) delete mode 100644 flang/docs/FIRLangRef.md delete mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md deleted file mode 100644 index 8e4052f14fc7c..0000000000000 --- a/flang/docs/FIRLangRef.md +++ /dev/null @@ -1,3 +0,0 @@ -# FIR Language Reference - -_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md deleted file mode 100644 index ee8d7b83dc50c..0000000000000 --- a/flang/docs/FlangCommandLineReference.md +++ /dev/null @@ -1,3 +0,0 @@ -# Flang Command Line Reference - -_TODO: Add Flang CLI documentation._ >From cd886e1f1bc5a8fd46733a6f352faef4f1070dc0 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Fri, 30 May 2025 14:10:36 -0400 Subject: [PATCH 5/5] Enable docs-flang-html to build --- flang/docs/conf.py | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 7223661625689..03f5973392d65 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -42,7 +42,10 @@ # Add any paths that contain templates here, relative to this directory. 
templates_path = ["_templates"] -source_suffix = [".md"] +source_suffix = { + ".rst": "restructuredtext", + ".md": "markdown", +} myst_heading_anchors = 6 import sphinx From flang-commits at lists.llvm.org Fri May 30 17:02:33 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 30 May 2025 17:02:33 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683a4719.170a0220.d588a.93a7@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/142022 >From 8f3fd2daab46f477e87043c66b3049dff4a5b20e Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:11:04 -0700 Subject: [PATCH 1/7] initial commit --- flang/include/flang/Common/enum-class.h | 47 ++++- .../include/flang/Support/Fortran-features.h | 51 ++++-- flang/lib/Frontend/CompilerInvocation.cpp | 62 ++++--- flang/lib/Support/CMakeLists.txt | 1 + flang/lib/Support/Fortran-features.cpp | 168 ++++++++++++++---- flang/lib/Support/enum-class.cpp | 24 +++ flang/test/Driver/disable-diagnostic.f90 | 19 ++ flang/test/Driver/werror-wrong.f90 | 7 +- flang/test/Driver/wextra-ok.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 3 + flang/unittests/Common/EnumClassTests.cpp | 45 +++++ .../unittests/Common/FortranFeaturesTest.cpp | 142 +++++++++++++++ 12 files changed, 483 insertions(+), 88 deletions(-) create mode 100644 flang/lib/Support/enum-class.cpp create mode 100644 flang/test/Driver/disable-diagnostic.f90 create mode 100644 flang/unittests/Common/EnumClassTests.cpp create mode 100644 flang/unittests/Common/FortranFeaturesTest.cpp diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index 41575d45091a8..baf9fe418141d 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -18,8 +18,9 @@ #define FORTRAN_COMMON_ENUM_CLASS_H_ #include -#include - +#include +#include 
+#include namespace Fortran::common { constexpr std::size_t CountEnumNames(const char *p) { @@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) { return result; } +template +std::optional inline fmap(std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +std::optional FindEnumIndex( + Predicate pred, int size, const std::string_view *names); + +using FindEnumIndexType = std::optional( + Predicate, int, const std::string_view *); + +template +std::optional inline FindEnum( + Predicate pred, std::function(Predicate)> find) { + std::function f = [](int x) { return static_cast(x); }; + return fmap(find(pred), f); +} + #define ENUM_CLASS(NAME, ...) \ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ ::Fortran::common::CountEnumNames(#__VA_ARGS__)}; \ + [[maybe_unused]] static constexpr std::array NAME##_names{ \ + ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ [[maybe_unused]] static inline std::string_view EnumToString(NAME e) { \ - static const constexpr auto names{ \ - ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ - return names[static_cast(e)]; \ + return NAME##_names[static_cast(e)]; \ } +#define ENUM_CLASS_EXTRA(NAME) \ + [[maybe_unused]] inline std::optional Find##NAME##Index( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnumIndex( \ + p, NAME##_enumSize, NAME##_names.data()); \ + } \ + [[maybe_unused]] inline std::optional Find##NAME( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + } \ + [[maybe_unused]] inline std::optional StringTo##NAME( \ + const std::string_view name) { \ + return Find##NAME( \ + [name](const std::string_view s) -> bool { return name == s; }); \ + } } // namespace Fortran::common #endif // FORTRAN_COMMON_ENUM_CLASS_H_ diff --git 
a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index e696da9042480..d5aa7357ffea0 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -12,6 +12,8 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" #include "flang/Common/idioms.h" +#include "llvm/Support/Error.h" +#include "llvm/Support/raw_ostream.h" #include #include @@ -79,12 +81,13 @@ ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, NullActualForDefaultIntentAllocatable, UseAssociationIntoSameNameSubprogram, HostAssociatedIntentOutInSpecExpr, NonVolatilePointerToVolatile) +// Generate default String -> Enum mapping. +ENUM_CLASS_EXTRA(LanguageFeature) +ENUM_CLASS_EXTRA(UsageWarning) + using LanguageFeatures = EnumSet; using UsageWarnings = EnumSet; -std::optional FindLanguageFeature(const char *); -std::optional FindUsageWarning(const char *); - class LanguageFeatureControl { public: LanguageFeatureControl(); @@ -97,8 +100,10 @@ class LanguageFeatureControl { void EnableWarning(UsageWarning w, bool yes = true) { warnUsage_.set(w, yes); } - void WarnOnAllNonstandard(bool yes = true) { warnAllLanguage_ = yes; } - void WarnOnAllUsage(bool yes = true) { warnAllUsage_ = yes; } + void WarnOnAllNonstandard(bool yes = true); + bool IsWarnOnAllNonstandard() const { return warnAllLanguage_; } + void WarnOnAllUsage(bool yes = true); + bool IsWarnOnAllUsage() const { return warnAllUsage_; } void DisableAllNonstandardWarnings() { warnAllLanguage_ = false; warnLanguage_.clear(); @@ -107,16 +112,16 @@ class LanguageFeatureControl { warnAllUsage_ = false; warnUsage_.clear(); } - - bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } - bool ShouldWarn(LanguageFeature f) const { - return (warnAllLanguage_ && f != LanguageFeature::OpenMP && - f != LanguageFeature::OpenACC && f != LanguageFeature::CUDA) || - warnLanguage_.test(f); - } - bool ShouldWarn(UsageWarning w) const { - return 
warnAllUsage_ || warnUsage_.test(w); + void DisableAllWarnings() { + disableAllWarnings_ = true; + DisableAllNonstandardWarnings(); + DisableAllUsageWarnings(); } + bool applyCLIOption(llvm::StringRef input); + bool AreWarningsDisabled() const { return disableAllWarnings_; } + bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } + bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); } + bool ShouldWarn(UsageWarning w) const { return warnUsage_.test(w); } // Return all spellings of operators names, depending on features enabled std::vector GetNames(LogicalOperator) const; std::vector GetNames(RelationalOperator) const; @@ -127,6 +132,24 @@ class LanguageFeatureControl { bool warnAllLanguage_{false}; UsageWarnings warnUsage_; bool warnAllUsage_{false}; + bool disableAllWarnings_{false}; }; + +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). Just exposed for the template below. +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find); + +template +std::optional> parseCLIEnum( + llvm::StringRef input, std::function(Predicate)> find) { + using To = std::pair; + using From = std::pair; + static std::function cast = [](From x) { + return std::pair{x.first, static_cast(x.second)}; + }; + return fmap(parseCLIEnumIndex(input, find), cast); +} + } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..9ea568549bd6c 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -34,6 +34,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" +#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" @@ -45,6 +46,7 @@ #include 
#include #include +#include using namespace Fortran::frontend; @@ -971,10 +973,23 @@ static bool parseSemaArgs(CompilerInvocation &res, llvm::opt::ArgList &args, /// Parses all diagnostics related arguments and populates the variables /// options accordingly. Returns false if new errors are generated. +/// FC1 driver entry point for parsing diagnostic arguments. static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, clang::DiagnosticsEngine &diags) { unsigned numErrorsBefore = diags.getNumErrors(); + auto &features = res.getFrontendOpts().features; + // The order of these flags (-pedantic -W -w) is important and is + // chosen to match clang's behavior. + + // -pedantic + if (args.hasArg(clang::driver::options::OPT_pedantic)) { + features.WarnOnAllNonstandard(); + features.WarnOnAllUsage(); + res.setEnableConformanceChecks(); + res.setEnableUsageChecks(); + } + // -Werror option // TODO: Currently throws a Diagnostic for anything other than -W, // this has to change when other -W's are supported. @@ -984,21 +999,27 @@ static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, for (const auto &wArg : wArgs) { if (wArg == "error") { res.setWarnAsErr(true); - } else { - const unsigned diagID = - diags.getCustomDiagID(clang::DiagnosticsEngine::Error, - "Only `-Werror` is supported currently."); - diags.Report(diagID); + // -W(no-) + } else if (!features.applyCLIOption(wArg)) { + const unsigned diagID = diags.getCustomDiagID( + clang::DiagnosticsEngine::Error, "Unknown diagnostic option: -W%0"); + diags.Report(diagID) << wArg; } } } + // -w + if (args.hasArg(clang::driver::options::OPT_w)) { + features.DisableAllWarnings(); + res.setDisableWarnings(); + } + // Default to off for `flang -fc1`. - res.getFrontendOpts().showColors = - parseShowColorsArgs(args, /*defaultDiagColor=*/false); + bool showColors = parseShowColorsArgs(args, false); - // Honor color diagnostics. 
- res.getDiagnosticOpts().ShowColors = res.getFrontendOpts().showColors; + diags.getDiagnosticOptions().ShowColors = showColors; + res.getDiagnosticOpts().ShowColors = showColors; + res.getFrontendOpts().showColors = showColors; return diags.getNumErrors() == numErrorsBefore; } @@ -1074,16 +1095,6 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, Fortran::common::LanguageFeature::OpenACC); } - // -pedantic - if (args.hasArg(clang::driver::options::OPT_pedantic)) { - res.setEnableConformanceChecks(); - res.setEnableUsageChecks(); - } - - // -w - if (args.hasArg(clang::driver::options::OPT_w)) - res.setDisableWarnings(); - // -std=f2018 // TODO: Set proper options when more fortran standards // are supported. @@ -1092,6 +1103,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, // We only allow f2018 as the given standard if (standard == "f2018") { res.setEnableConformanceChecks(); + res.getFrontendOpts().features.WarnOnAllNonstandard(); } else { const unsigned diagID = diags.getCustomDiagID(clang::DiagnosticsEngine::Error, @@ -1099,6 +1111,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, diags.Report(diagID); } } + return diags.getNumErrors() == numErrorsBefore; } @@ -1694,16 +1707,7 @@ void CompilerInvocation::setFortranOpts() { if (frontendOptions.needProvenanceRangeToCharBlockMappings) fortranOptions.needProvenanceRangeToCharBlockMappings = true; - if (getEnableConformanceChecks()) - fortranOptions.features.WarnOnAllNonstandard(); - - if (getEnableUsageChecks()) - fortranOptions.features.WarnOnAllUsage(); - - if (getDisableWarnings()) { - fortranOptions.features.DisableAllNonstandardWarnings(); - fortranOptions.features.DisableAllUsageWarnings(); - } + fortranOptions.features = frontendOptions.features; } std::unique_ptr diff --git a/flang/lib/Support/CMakeLists.txt b/flang/lib/Support/CMakeLists.txt index 363f57ce97dae..9ef31a2a6dcc7 100644 --- 
a/flang/lib/Support/CMakeLists.txt +++ b/flang/lib/Support/CMakeLists.txt @@ -44,6 +44,7 @@ endif() add_flang_library(FortranSupport default-kinds.cpp + enum-class.cpp Flags.cpp Fortran.cpp Fortran-features.cpp diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index bee8984102b82..55abf0385d185 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -9,6 +9,8 @@ #include "flang/Support/Fortran-features.h" #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/Support/raw_ostream.h" namespace Fortran::common { @@ -94,57 +96,123 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Ignore case and any inserted punctuation (like '-'/'_') -static std::optional GetWarningChar(char ch) { - if (ch >= 'a' && ch <= 'z') { - return ch; - } else if (ch >= 'A' && ch <= 'Z') { - return ch - 'A' + 'a'; - } else if (ch >= '0' && ch <= '9') { - return ch; - } else { - return std::nullopt; +// Split a string with camel case into the individual words. +// Note, the small vector is just an array of a few pointers and lengths +// into the original input string. So all this allocation should be pretty +// cheap. 
+llvm::SmallVector splitCamelCase(llvm::StringRef input) { + using namespace llvm; + if (input.empty()) { + return {}; } + SmallVector parts{}; + parts.reserve(input.size()); + auto check = [&input](size_t j, function_ref predicate) { + return j < input.size() && predicate(input[j]); + }; + size_t i{0}; + size_t startWord = i; + for (; i < input.size(); i++) { + if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || + ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { + parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); + startWord = i + 1; + } + } + parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); + return parts; } -static bool WarningNameMatch(const char *a, const char *b) { - while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); - } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); +// Split a string with hyphens into the individual words. +llvm::SmallVector splitHyphenated(llvm::StringRef input) { + auto parts = llvm::SmallVector{}; + llvm::SplitString(input, parts, "-"); + return parts; +} + +// Check if two strings are equal while normalizing case for the +// right word which is assumed to be a single word in camel case. +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { + size_t ls = l.size(); + if (ls != r.size()) + return false; + size_t j{0}; + // Process the upper case characters. + for (; j < ls; j++) { + char rc = r[j]; + char rc2l = llvm::toLower(rc); + if (rc == rc2l) { + // Past the run of upper case characters. + break; } - if (!ach && !bch) { - return true; - } else if (!ach || !bch || *ach != *bch) { + if (l[j] != rc2l) + return false; + } + // Process the lower case characters. 
+ for (; j < ls; j++) { + if (l[j] != r[j]) { return false; } - ++a, ++b; } + return true; } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find) { + auto parts = splitHyphenated(input); + bool negated = false; + if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { + negated = true; + // Remove the "no" part + parts = llvm::SmallVector(parts.begin() + 1, parts.end()); + } + size_t chars = 0; + for (auto p : parts) { + chars += p.size(); + } + auto pred = [&](auto s) { + if (chars != s.size()) { + return false; + } + auto ccParts = splitCamelCase(s); + auto num_ccParts = ccParts.size(); + if (parts.size() != num_ccParts) { + return false; + } + for (size_t i{0}; i < num_ccParts; i++) { + if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { + return false; } } - } - return std::nullopt; + return true; + }; + auto cast = [negated](int x) { return std::pair{!negated, x}; }; + return fmap>(find(pred), cast); } -std::optional FindLanguageFeature(const char *name) { - return ScanEnum(name); +std::optional> parseCLILanguageFeature( + llvm::StringRef input) { + return parseCLIEnum(input, FindLanguageFeatureIndex); } -std::optional FindUsageWarning(const char *name) { - return ScanEnum(name); +std::optional> parseCLIUsageWarning( + llvm::StringRef input) { + return parseCLIEnum(input, FindUsageWarningIndex); +} + +// Take a string from the CLI and apply it to the LanguageFeatureControl. +// Return true if the option was recognized and applied. 
+bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { + if (auto result = parseCLILanguageFeature(input)) { + EnableWarning(result->second, result->first); + return true; + } else if (auto result = parseCLIUsageWarning(input)) { + EnableWarning(result->second, result->first); + return true; + } + return false; } std::vector LanguageFeatureControl::GetNames( @@ -201,4 +269,32 @@ std::vector LanguageFeatureControl::GetNames( } } +template +void ForEachEnum(std::function f) { + for (size_t j{0}; j < N; ++j) { + f(static_cast(j)); + } +} + +void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { + warnAllLanguage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + // should be equivalent to: reset().flip() set ... + ForEachEnum( + [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + if (yes) { + // These three features do not need to be warned about, + // but we do want their feature flags. + warnLanguage_.set(LanguageFeature::OpenMP, false); + warnLanguage_.set(LanguageFeature::OpenACC, false); + warnLanguage_.set(LanguageFeature::CUDA, false); + } +} + +void LanguageFeatureControl::WarnOnAllUsage(bool yes) { + warnAllUsage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + ForEachEnum( + [&](UsageWarning w) { warnUsage_.set(w, yes); }); +} } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp new file mode 100644 index 0000000000000..ed11318382b35 --- /dev/null +++ b/flang/lib/Support/enum-class.cpp @@ -0,0 +1,24 @@ +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include +#include +namespace Fortran::common { + +std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; + } + } + return std::nullopt; +} + + +} // namespace Fortran::common \ No newline at end of file diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 new file mode 100644 index 0000000000000..8a58e63cfa3ac --- /dev/null +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -0,0 +1,19 @@ +! RUN: %flang -Wknown-bad-implicit-interface %s -c 2>&1 | FileCheck %s --check-prefix=WARN +! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty +! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 +! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 +! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface +! ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface + +program disable_diagnostic + REAL :: x + INTEGER :: y + ! CHECK-NOT: warning + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(x) + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(y) +end program disable_diagnostic + +subroutine sub() +end subroutine sub \ No newline at end of file diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 58adf6f745d5e..33f0aff8a1739 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -1,6 +1,7 @@ ! Ensure that only argument -Werror is supported. -! 
RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG -! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG +! RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG1 +! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 -! WRONG: Only `-Werror` is supported currently. +! WRONG1: error: Unknown diagnostic option: -Wall +! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file diff --git a/flang/test/Driver/wextra-ok.f90 b/flang/test/Driver/wextra-ok.f90 index 441029aa0af27..db15c7f14aa35 100644 --- a/flang/test/Driver/wextra-ok.f90 +++ b/flang/test/Driver/wextra-ok.f90 @@ -5,7 +5,7 @@ ! RUN: not %flang -std=f2018 -Wblah -Wextra %s -c 2>&1 | FileCheck %s --check-prefix=WRONG ! CHECK-OK: the warning option '-Wextra' is not supported -! WRONG: Only `-Werror` is supported currently. +! WRONG: Unknown diagnostic option: -Wblah program wextra_ok end program wextra_ok diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index bda02ed29a5ef..19cc5a20fecf4 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -1,3 +1,6 @@ add_flang_unittest(FlangCommonTests + EnumClassTests.cpp FastIntSetTest.cpp + FortranFeaturesTest.cpp ) +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp new file mode 100644 index 0000000000000..f67c453cfad15 --- /dev/null +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -0,0 +1,45 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Common/template.h" +#include "gtest/gtest.h" + +using namespace Fortran::common; +using namespace std; + +ENUM_CLASS(TestEnum, One, Two, + Three) +ENUM_CLASS_EXTRA(TestEnum) + +TEST(EnumClassTest, EnumToString) { + ASSERT_EQ(EnumToString(TestEnum::One), "One"); + ASSERT_EQ(EnumToString(TestEnum::Two), "Two"); + ASSERT_EQ(EnumToString(TestEnum::Three), "Three"); +} + +TEST(EnumClassTest, EnumToStringData) { + ASSERT_STREQ(EnumToString(TestEnum::One).data(), "One, Two, Three"); +} + +TEST(EnumClassTest, StringToEnum) { + ASSERT_EQ(StringToTestEnum("One"), std::optional{TestEnum::One}); + ASSERT_EQ(StringToTestEnum("Two"), std::optional{TestEnum::Two}); + ASSERT_EQ(StringToTestEnum("Three"), std::optional{TestEnum::Three}); + ASSERT_EQ(StringToTestEnum("Four"), std::nullopt); + ASSERT_EQ(StringToTestEnum(""), std::nullopt); + ASSERT_EQ(StringToTestEnum("One, Two, Three"), std::nullopt); +} + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, FindNameNormal) { + auto p1 = [](auto s) { return s == "TwentyOne"; }; + ASSERT_EQ(FindTestEnumExtra(p1), std::optional{TestEnumExtra::TwentyOne}); +} diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp new file mode 100644 index 0000000000000..7ec7054f14f6e --- /dev/null +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -0,0 +1,142 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Support/Fortran-features.h" +#include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/ErrorHandling.h" +#include "gtest/gtest.h" + +namespace Fortran::common { + +// Not currently exported from Fortran-features.h +llvm::SmallVector splitCamelCase(llvm::StringRef input); +llvm::SmallVector splitHyphenated(llvm::StringRef input); +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, SplitCamelCase) { + + auto parts = splitCamelCase("oP"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("o", 1))) { + ADD_FAILURE() << "First part is not o"; + } + if (parts[1].compare(llvm::StringRef("P", 1))) { + ADD_FAILURE() << "Second part is not P"; + } + + parts = splitCamelCase("OPName"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("OP", 2))) { + ADD_FAILURE() << "First part is not OP"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("OpName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("Op", 2))) { + ADD_FAILURE() << "First part is not Op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("opName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("op", 2))) { + ADD_FAILURE() << "First part is not op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("FlangTestProgram123"); + ASSERT_EQ(parts.size(), (size_t)3); + if 
(parts[0].compare(llvm::StringRef("Flang", 5))) { + ADD_FAILURE() << "First part is not Flang"; + } + if (parts[1].compare(llvm::StringRef("Test", 4))) { + ADD_FAILURE() << "Second part is not Test"; + } + if (parts[2].compare(llvm::StringRef("Program123", 10))) { + ADD_FAILURE() << "Third part is not Program123"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, SplitHyphenated) { + auto parts = splitHyphenated("no-twenty-one"); + ASSERT_EQ(parts.size(), (size_t)3); + if (parts[0].compare(llvm::StringRef("no", 2))) { + ADD_FAILURE() << "First part is not no"; + } + if (parts[1].compare(llvm::StringRef("twenty", 6))) { + ADD_FAILURE() << "Second part is not twenty"; + } + if (parts[2].compare(llvm::StringRef("one", 3))) { + ADD_FAILURE() << "Third part is not one"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); + + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); +} + +std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); +} + 
+TEST(EnumClassTest, parseCLIEnumOption) { + auto result = parseCLITestEnumExtraOption("no-twenty-one"); + auto expected = std::pair(false, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("twenty-one"); + expected = std::pair(true, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-forty-two"); + expected = std::pair(false, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("forty-two"); + expected = std::pair(true, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-seven-seven-seven"); + expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("seven-seven-seven"); + expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); +} + +} // namespace Fortran::common >From 49a0579f9477936b72f0580823b4dd6824697512 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:56:14 -0700 Subject: [PATCH 2/7] adjust headers --- flang/include/flang/Support/Fortran-features.h | 4 +--- flang/lib/Frontend/CompilerInvocation.cpp | 5 ----- flang/lib/Support/Fortran-features.cpp | 1 - 3 files changed, 1 insertion(+), 9 deletions(-) diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index d5aa7357ffea0..4a8b0da4c0d4d 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -11,9 +11,7 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" -#include "flang/Common/idioms.h" -#include "llvm/Support/Error.h" -#include "llvm/Support/raw_ostream.h" +#include "llvm/ADT/StringRef.h" #include #include diff --git a/flang/lib/Frontend/CompilerInvocation.cpp 
b/flang/lib/Frontend/CompilerInvocation.cpp index 9ea568549bd6c..d8bf601d0171d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -20,11 +20,9 @@ #include "flang/Support/Version.h" #include "flang/Tools/TargetSetup.h" #include "flang/Version.inc" -#include "clang/Basic/AllDiagnostics.h" #include "clang/Basic/DiagnosticDriver.h" #include "clang/Basic/DiagnosticOptions.h" #include "clang/Driver/Driver.h" -#include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" #include "llvm/ADT/StringRef.h" @@ -34,9 +32,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" -#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" -#include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" #include "llvm/Support/Process.h" #include "llvm/Support/raw_ostream.h" @@ -46,7 +42,6 @@ #include #include #include -#include using namespace Fortran::frontend; diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 55abf0385d185..0e394162ef577 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -10,7 +10,6 @@ #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" -#include "llvm/Support/raw_ostream.h" namespace Fortran::common { >From fa2db7090c6d374ce1a835ad26d19a1d7bd42262 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:57:22 -0700 Subject: [PATCH 3/7] reformat --- flang/lib/Support/enum-class.cpp | 20 ++++++++++--------- flang/unittests/Common/EnumClassTests.cpp | 5 ++--- .../unittests/Common/FortranFeaturesTest.cpp | 18 ++++++++++------- 3 files changed, 24 insertions(+), 19 deletions(-) diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ed11318382b35..ac57f27ef1c9e 100644 --- 
a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -1,4 +1,5 @@ -//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ +//-*-===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -7,18 +8,19 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" -#include #include +#include namespace Fortran::common { -std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { - if (pred(names[i])) { - return i; - } +std::optional FindEnumIndex( + std::function pred, int size, + const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; } - return std::nullopt; + } + return std::nullopt; } - } // namespace Fortran::common \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp index f67c453cfad15..c9224a8ceba54 100644 --- a/flang/unittests/Common/EnumClassTests.cpp +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -6,15 +6,14 @@ // //===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Common/template.h" -#include "gtest/gtest.h" using namespace Fortran::common; using namespace std; -ENUM_CLASS(TestEnum, One, Two, - Three) +ENUM_CLASS(TestEnum, One, Two, Three) ENUM_CLASS_EXTRA(TestEnum) TEST(EnumClassTest, EnumToString) { diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 7ec7054f14f6e..597928e7fe56e 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -6,12 +6,12 @@ // 
//===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Support/Fortran-features.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" -#include "gtest/gtest.h" namespace Fortran::common { @@ -34,7 +34,7 @@ TEST(EnumClassTest, SplitCamelCase) { if (parts[1].compare(llvm::StringRef("P", 1))) { ADD_FAILURE() << "Second part is not Name"; } - + parts = splitCamelCase("OPName"); ASSERT_EQ(parts.size(), (size_t)2); @@ -114,13 +114,15 @@ TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); } -std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); +std::optional> parseCLITestEnumExtraOption( + llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); } TEST(EnumClassTest, parseCLIEnumOption) { auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = std::pair(false, TestEnumExtra::TwentyOne); + auto expected = + std::pair(false, TestEnumExtra::TwentyOne); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("twenty-one"); expected = std::pair(true, TestEnumExtra::TwentyOne); @@ -132,10 +134,12 @@ TEST(EnumClassTest, parseCLIEnumOption) { expected = std::pair(true, TestEnumExtra::FortyTwo); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(false, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(true, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); } >From 
5f3feb64c1a97500e2808114d44bb07aa4ccb00c Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 15:58:43 -0700 Subject: [PATCH 4/7] addressing feedback --- flang/include/flang/Common/enum-class.h | 53 +++--- flang/include/flang/Common/optional.h | 7 + .../include/flang/Support/Fortran-features.h | 16 -- flang/lib/Support/Fortran-features.cpp | 175 ++++++++---------- flang/lib/Support/enum-class.cpp | 15 +- flang/test/Driver/disable-diagnostic.f90 | 3 +- flang/test/Driver/werror-wrong.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 2 +- .../unittests/Common/FortranFeaturesTest.cpp | 159 +++------------- 9 files changed, 153 insertions(+), 279 deletions(-) diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index baf9fe418141d..3dbd11bb4057c 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -17,9 +17,9 @@ #ifndef FORTRAN_COMMON_ENUM_CLASS_H_ #define FORTRAN_COMMON_ENUM_CLASS_H_ +#include "optional.h" #include #include -#include #include namespace Fortran::common { @@ -59,26 +59,6 @@ constexpr std::array EnumNames(const char *p) { return result; } -template -std::optional inline fmap(std::optional x, std::function f) { - return x ? std::optional{f(*x)} : std::nullopt; -} - -using Predicate = std::function; -// Finds the first index for which the predicate returns true. -std::optional FindEnumIndex( - Predicate pred, int size, const std::string_view *names); - -using FindEnumIndexType = std::optional( - Predicate, int, const std::string_view *); - -template -std::optional inline FindEnum( - Predicate pred, std::function(Predicate)> find) { - std::function f = [](int x) { return static_cast(x); }; - return fmap(find(pred), f); -} - #define ENUM_CLASS(NAME, ...) 
\ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ @@ -90,17 +70,34 @@ std::optional inline FindEnum( return NAME##_names[static_cast(e)]; \ } +namespace EnumClass { + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +optional FindIndex( + Predicate pred, std::size_t size, const std::string_view *names); + +using FindIndexType = std::function(Predicate)>; + +template +optional inline Find(Predicate pred, FindIndexType findIndex) { + return MapOption( + findIndex(pred), [](int x) { return static_cast(x); }); +} + +} // namespace EnumClass + #define ENUM_CLASS_EXTRA(NAME) \ - [[maybe_unused]] inline std::optional Find##NAME##Index( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnumIndex( \ + [[maybe_unused]] inline optional Find##NAME##Index( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::FindIndex( \ p, NAME##_enumSize, NAME##_names.data()); \ } \ - [[maybe_unused]] inline std::optional Find##NAME( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + [[maybe_unused]] inline optional Find##NAME( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::Find(p, Find##NAME##Index); \ } \ - [[maybe_unused]] inline std::optional StringTo##NAME( \ + [[maybe_unused]] inline optional StringTo##NAME( \ const std::string_view name) { \ return Find##NAME( \ [name](const std::string_view s) -> bool { return name == s; }); \ diff --git a/flang/include/flang/Common/optional.h b/flang/include/flang/Common/optional.h index c7c81f40cc8c8..5b623f01e828d 100644 --- a/flang/include/flang/Common/optional.h +++ b/flang/include/flang/Common/optional.h @@ -27,6 +27,7 @@ #define FORTRAN_COMMON_OPTIONAL_H #include "api-attrs.h" +#include #include #include @@ -238,6 +239,12 @@ using std::nullopt_t; using std::optional; #endif // 
!STD_OPTIONAL_UNSUPPORTED +template +std::optional inline MapOption( + std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + } // namespace Fortran::common #endif // FORTRAN_COMMON_OPTIONAL_H diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 4a8b0da4c0d4d..fd6a9139b7ea7 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -133,21 +133,5 @@ class LanguageFeatureControl { bool disableAllWarnings_{false}; }; -// Parse a CLI enum option return the enum index and whether it should be -// enabled (true) or disabled (false). Just exposed for the template below. -std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find); - -template -std::optional> parseCLIEnum( - llvm::StringRef input, std::function(Predicate)> find) { - using To = std::pair; - using From = std::pair; - static std::function cast = [](From x) { - return std::pair{x.first, static_cast(x.second)}; - }; - return fmap(parseCLIEnumIndex(input, find), cast); -} - } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 0e394162ef577..72ea6639adf51 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -11,6 +11,10 @@ #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" +// Debugging +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/raw_ostream.h" + namespace Fortran::common { LanguageFeatureControl::LanguageFeatureControl() { @@ -95,119 +99,99 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Split a string with camel case into the individual words. -// Note, the small vector is just an array of a few pointers and lengths -// into the original input string. 
So all this allocation should be pretty -// cheap. -llvm::SmallVector splitCamelCase(llvm::StringRef input) { - using namespace llvm; - if (input.empty()) { - return {}; +// Namespace for helper functions for parsing CLI options +// used instead of static so that there can be unit tests for these +// functions. +namespace FortranFeaturesHelpers { +// Check if Lower Case Hyphenated words are equal to Camel Case words. +// Because of out use case we know that 'r' the camel case string is +// well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. +// This is checked in the enum-class.h file. +bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { + size_t ls{l.size()}, rs{r.size()}; + if (ls < rs) { + return false; } - SmallVector parts{}; - parts.reserve(input.size()); - auto check = [&input](size_t j, function_ref predicate) { - return j < input.size() && predicate(input[j]); - }; - size_t i{0}; - size_t startWord = i; - for (; i < input.size(); i++) { - if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || - ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { - parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); - startWord = i + 1; + bool atStartOfWord{true}; + size_t wordCount{0}, j; // j is the number of word characters checked in r. + for (; j < rs; j++) { + if (wordCount + j >= ls) { + // `l` was shorter once the hiphens were removed. + // If r is null terminated, then we are good. + return r[j] == '\0'; } - } - parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); - return parts; -} - -// Split a string whith hyphens into the individual words. -llvm::SmallVector splitHyphenated(llvm::StringRef input) { - auto parts = llvm::SmallVector{}; - llvm::SplitString(input, parts, "-"); - return parts; -} - -// Check if two strings are equal while normalizing case for the -// right word which is assumed to be a single word in camel case. 
-bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { - size_t ls = l.size(); - if (ls != r.size()) - return false; - size_t j{0}; - // Process the upper case characters. - for (; j < ls; j++) { - char rc = r[j]; - char rc2l = llvm::toLower(rc); - if (rc == rc2l) { - // Past run of Uppers Case; - break; + if (atStartOfWord) { + if (llvm::isUpper(r[j])) { + // Upper Case Run + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else { + atStartOfWord = false; + if (l[wordCount + j] != r[j]) { + return false; + } + } + } else { + if (llvm::isUpper(r[j])) { + atStartOfWord = true; + if (l[wordCount + j] != '-') { + return false; + } + ++wordCount; + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else if (l[wordCount + j] != r[j]) { + return false; + } } - if (l[j] != rc2l) - return false; } - // Process the lower case characters. - for (; j < ls; j++) { - if (l[j] != r[j]) { - return false; - } + // If there are more characters in l after processing all the characters in r. + // then fail unless the string is null terminated. + if (ls > wordCount + j) { + return l[wordCount + j] == '\0'; } return true; } // Parse a CLI enum option return the enum index and whether it should be // enabled (true) or disabled (false). 
-std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find) { - auto parts = splitHyphenated(input); - bool negated = false; - if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { +template +optional> ParseCLIEnum( + llvm::StringRef input, EnumClass::FindIndexType findIndex) { + bool negated{false}; + if (input.starts_with("no-")) { negated = true; - // Remove the "no" part - parts = llvm::SmallVector(parts.begin() + 1, parts.end()); - } - size_t chars = 0; - for (auto p : parts) { - chars += p.size(); + input = input.drop_front(3); } - auto pred = [&](auto s) { - if (chars != s.size()) { - return false; - } - auto ccParts = splitCamelCase(s); - auto num_ccParts = ccParts.size(); - if (parts.size() != num_ccParts) { - return false; - } - for (size_t i{0}; i < num_ccParts; i++) { - if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { - return false; - } - } - return true; - }; - auto cast = [negated](int x) { return std::pair{!negated, x}; }; - return fmap>(find(pred), cast); + EnumClass::Predicate predicate{ + [input](llvm::StringRef r) { return LowerHyphEqualCamelCase(input, r); }}; + optional x = EnumClass::Find(predicate, findIndex); + return MapOption>( + x, [negated](T x) { return std::pair{!negated, x}; }); } -std::optional> parseCLILanguageFeature( +optional> parseCLIUsageWarning( llvm::StringRef input) { - return parseCLIEnum(input, FindLanguageFeatureIndex); + return ParseCLIEnum(input, FindUsageWarningIndex); } -std::optional> parseCLIUsageWarning( +optional> parseCLILanguageFeature( llvm::StringRef input) { - return parseCLIEnum(input, FindUsageWarningIndex); + return ParseCLIEnum(input, FindLanguageFeatureIndex); } +} // namespace FortranFeaturesHelpers + // Take a string from the CLI and apply it to the LanguageFeatureControl. // Return true if the option was applied recognized. 
bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { - if (auto result = parseCLILanguageFeature(input)) { + if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(input)) { EnableWarning(result->second, result->first); return true; - } else if (auto result = parseCLIUsageWarning(input)) { + } else if (auto result = + FortranFeaturesHelpers::parseCLIUsageWarning(input)) { EnableWarning(result->second, result->first); return true; } @@ -277,11 +261,10 @@ void ForEachEnum(std::function f) { void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { warnAllLanguage_ = yes; - disableAllWarnings_ = yes ? false : disableAllWarnings_; - // should be equivalent to: reset().flip() set ... - ForEachEnum( - [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + warnLanguage_.reset(); if (yes) { + disableAllWarnings_ = false; + warnLanguage_.flip(); // These three features do not need to be warned about, // but we do want their feature flags. warnLanguage_.set(LanguageFeature::OpenMP, false); @@ -292,8 +275,10 @@ void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { void LanguageFeatureControl::WarnOnAllUsage(bool yes) { warnAllUsage_ = yes; - disableAllWarnings_ = yes ? 
false : disableAllWarnings_; - ForEachEnum( - [&](UsageWarning w) { warnUsage_.set(w, yes); }); + warnUsage_.reset(); + if (yes) { + disableAllWarnings_ = false; + warnUsage_.flip(); + } } } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ac57f27ef1c9e..d6d0ee758175b 100644 --- a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -8,19 +8,20 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" +#include "flang/Common/optional.h" #include -#include -namespace Fortran::common { -std::optional FindEnumIndex( - std::function pred, int size, +namespace Fortran::common::EnumClass { + +optional FindIndex( + std::function pred, size_t size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { + for (size_t i = 0; i < size; ++i) { if (pred(names[i])) { return i; } } - return std::nullopt; + return nullopt; } -} // namespace Fortran::common \ No newline at end of file +} // namespace Fortran::common::EnumClass diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 index 8a58e63cfa3ac..849489377da12 100644 --- a/flang/test/Driver/disable-diagnostic.f90 +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -2,6 +2,7 @@ ! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty ! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 ! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 + ! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface ! 
ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface @@ -16,4 +17,4 @@ program disable_diagnostic end program disable_diagnostic subroutine sub() -end subroutine sub \ No newline at end of file +end subroutine sub diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 33f0aff8a1739..6e3c7cca15bc7 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -4,4 +4,4 @@ ! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 ! WRONG1: error: Unknown diagnostic option: -Wall -! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file +! WRONG2: error: Unknown diagnostic option: -WX diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index 19cc5a20fecf4..3149cb9f7bc47 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -3,4 +3,4 @@ add_flang_unittest(FlangCommonTests FastIntSetTest.cpp FortranFeaturesTest.cpp ) -target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 597928e7fe56e..e12aff9f7b735 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -12,135 +12,34 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" - -namespace Fortran::common { - -// Not currently exported from Fortran-features.h -llvm::SmallVector splitCamelCase(llvm::StringRef input); -llvm::SmallVector splitHyphenated(llvm::StringRef input); -bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); - -ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) -ENUM_CLASS_EXTRA(TestEnumExtra) - -TEST(EnumClassTest, SplitCamelCase) { - 
- auto parts = splitCamelCase("oP"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("o", 1))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("P", 1))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OPName"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("OP", 2))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OpName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("Op", 2))) { - ADD_FAILURE() << "First part is not Op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("opName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("op", 2))) { - ADD_FAILURE() << "First part is not op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("FlangTestProgram123"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("Flang", 5))) { - ADD_FAILURE() << "First part is not Flang"; - } - if (parts[1].compare(llvm::StringRef("Test", 4))) { - ADD_FAILURE() << "Second part is not Test"; - } - if (parts[2].compare(llvm::StringRef("Program123", 10))) { - ADD_FAILURE() << "Third part is not Program123"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, SplitHyphenated) { - auto parts = splitHyphenated("no-twenty-one"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("no", 2))) { - ADD_FAILURE() << "First part is not twenty"; - } - if (parts[1].compare(llvm::StringRef("twenty", 6))) { - ADD_FAILURE() << "Second part is not one"; - } - if (parts[2].compare(llvm::StringRef("one", 3))) { 
- ADD_FAILURE() << "Third part is not one"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); - - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); -} - -std::optional> parseCLITestEnumExtraOption( - llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); -} - -TEST(EnumClassTest, parseCLIEnumOption) { - auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = - std::pair(false, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("twenty-one"); - expected = std::pair(true, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-forty-two"); - expected = std::pair(false, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("forty-two"); - expected = std::pair(true, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = - std::pair(false, 
TestEnumExtra::SevenSevenSeven); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = - std::pair(true, TestEnumExtra::SevenSevenSeven); - ASSERT_EQ(result, std::optional{expected}); +#include + +namespace Fortran::common::FortranFeaturesHelpers { + +optional> parseCLIUsageWarning( + llvm::StringRef input); +TEST(EnumClassTest, ParseCLIUsageWarning) { + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("twenty-one")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-")), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("Portability"), std::nullopt); + auto expect{std::pair{false, UsageWarning::Portability}}; + ASSERT_EQ(parseCLIUsageWarning("no-portability"), expect); + expect.first = true; + ASSERT_EQ((parseCLIUsageWarning("portability")), expect); + expect = + std::pair{false, Fortran::common::UsageWarning::PointerToUndefinable}; + ASSERT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable")), expect); + expect.first = true; + ASSERT_EQ((parseCLIUsageWarning("pointer-to-undefinable")), expect); + EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable"), std::nullopt); } -} // namespace Fortran::common +} // namespace Fortran::common::FortranFeaturesHelpers >From 79303b42f7cfd3806c22bd34e5eced5f27d27f32 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 30 May 2025 15:42:27 -0700 Subject: [PATCH 5/7] removing debugging statement --- 
 flang/lib/Support/Fortran-features.cpp | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp
index 72ea6639adf51..75baa0b096af0 100644
--- a/flang/lib/Support/Fortran-features.cpp
+++ b/flang/lib/Support/Fortran-features.cpp
@@ -11,10 +11,6 @@
 #include "flang/Support/Fortran.h"
 #include "llvm/ADT/StringExtras.h"
 
-// Debugging
-#include "llvm/ADT/StringRef.h"
-#include "llvm/Support/raw_ostream.h"
-
 namespace Fortran::common {
 
 LanguageFeatureControl::LanguageFeatureControl() {

>From 8f0aa22125528a755ec61af2bd45b6c314cfe45c Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 30 May 2025 15:59:18 -0700
Subject: [PATCH 6/7] more feedback

---
 flang/include/flang/Support/Fortran-features.h |  4 +---
 flang/lib/Support/Fortran-features.cpp         | 16 +++++++++-------
 flang/unittests/Common/FortranFeaturesTest.cpp |  4 ----
 3 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h
index fd6a9139b7ea7..501b183cceeec 100644
--- a/flang/include/flang/Support/Fortran-features.h
+++ b/flang/include/flang/Support/Fortran-features.h
@@ -11,8 +11,6 @@
 
 #include "Fortran.h"
 #include "flang/Common/enum-set.h"
-#include "llvm/ADT/StringRef.h"
-#include
 #include
 
 namespace Fortran::common {
@@ -115,7 +113,7 @@ class LanguageFeatureControl {
     DisableAllNonstandardWarnings();
     DisableAllUsageWarnings();
   }
-  bool applyCLIOption(llvm::StringRef input);
+  bool applyCLIOption(std::string_view input);
   bool AreWarningsDisabled() const { return disableAllWarnings_; }
   bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); }
   bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); }
diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp
index 75baa0b096af0..d140ecdff7f24 100644
--- a/flang/lib/Support/Fortran-features.cpp
+++ b/flang/lib/Support/Fortran-features.cpp
@@ -99,11 +99,11 @@ LanguageFeatureControl::LanguageFeatureControl() {
 // used instead of static so that there can be unit tests for these
 // functions.
 namespace FortranFeaturesHelpers {
-// Check if Lower Case Hyphenated words are equal to Camel Case words.
+// Check if lower case hyphenated words are equal to camel case words.
 // Because of out use case we know that 'r' the camel case string is
 // well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*.
 // This is checked in the enum-class.h file.
-bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) {
+static bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) {
   size_t ls{l.size()}, rs{r.size()};
   if (ls < rs) {
     return false;
@@ -161,8 +161,9 @@ optional> ParseCLIEnum(
     negated = true;
     input = input.drop_front(3);
   }
-  EnumClass::Predicate predicate{
-      [input](llvm::StringRef r) { return LowerHyphEqualCamelCase(input, r); }};
+  EnumClass::Predicate predicate{[input](std::string_view r) {
+    return LowerHyphEqualCamelCase(input, r);
+  }};
   optional x = EnumClass::Find(predicate, findIndex);
   return MapOption>(
       x, [negated](T x) { return std::pair{!negated, x}; });
@@ -182,12 +183,13 @@ optional> parseCLILanguageFeature(
 
 // Take a string from the CLI and apply it to the LanguageFeatureControl.
 // Return true if the option was applied recognized.
-bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) {
-  if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(input)) {
+bool LanguageFeatureControl::applyCLIOption(std::string_view input) {
+  llvm::StringRef inputRef{input};
+  if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(inputRef)) {
     EnableWarning(result->second, result->first);
     return true;
   } else if (auto result =
-                 FortranFeaturesHelpers::parseCLIUsageWarning(input)) {
+                 FortranFeaturesHelpers::parseCLIUsageWarning(inputRef)) {
     EnableWarning(result->second, result->first);
     return true;
   }
diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp
index e12aff9f7b735..b3f0c31a57025 100644
--- a/flang/unittests/Common/FortranFeaturesTest.cpp
+++ b/flang/unittests/Common/FortranFeaturesTest.cpp
@@ -7,11 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "gtest/gtest.h"
-#include "flang/Common/enum-class.h"
 #include "flang/Support/Fortran-features.h"
-#include "llvm/ADT/SmallVector.h"
-#include "llvm/ADT/StringRef.h"
-#include "llvm/Support/ErrorHandling.h"
 #include
 
 namespace Fortran::common::FortranFeaturesHelpers {

>From a0317745bca77a1134e116fd570b4ecca60e4d95 Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 30 May 2025 17:02:17 -0700
Subject: [PATCH 7/7] adding insensitive match back

---
 .../include/flang/Support/Fortran-features.h |  2 +-
 flang/lib/Support/Fortran-features.cpp       | 86 +++++++++++++++----
 .../unittests/Common/FortranFeaturesTest.cpp | 75 +++++++++++-----
 3 files changed, 123 insertions(+), 40 deletions(-)

diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h
index 501b183cceeec..0b55a3175580a 100644
--- a/flang/include/flang/Support/Fortran-features.h
+++ b/flang/include/flang/Support/Fortran-features.h
@@ -113,7 +113,7 @@ class LanguageFeatureControl {
     DisableAllNonstandardWarnings();
DisableAllUsageWarnings(); } - bool applyCLIOption(std::string_view input); + bool applyCLIOption(std::string_view input, bool insensitive = false); bool AreWarningsDisabled() const { return disableAllWarnings_; } bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); } diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index d140ecdff7f24..80e87615697df 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -8,6 +8,7 @@ #include "flang/Support/Fortran-features.h" #include "flang/Common/idioms.h" +#include "flang/Common/optional.h" #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" @@ -99,11 +100,48 @@ LanguageFeatureControl::LanguageFeatureControl() { // used instead of static so that there can be unit tests for these // functions. namespace FortranFeaturesHelpers { + +// Ignore case and any inserted punctuation (like '-'/'_') +static std::optional GetWarningChar(char ch) { + if (ch >= 'a' && ch <= 'z') { + return ch; + } else if (ch >= 'A' && ch <= 'Z') { + return ch - 'A' + 'a'; + } else if (ch >= '0' && ch <= '9') { + return ch; + } else { + return std::nullopt; + } +} + +// Check for case and punctuation insensitive string equality. +// NB, b is probably not null terminated, so don't treat is like a C string. +static bool InsensitiveWarningNameMatch( + std::string_view a, std::string_view b) { + size_t j{0}, aSize{a.size()}, k{0}, bSize{b.size()}; + while (true) { + optional ach{nullopt}; + while (!ach && j < aSize) { + ach = GetWarningChar(a[j++]); + } + optional bch{}; + while (!bch && k < bSize) { + bch = GetWarningChar(b[k++]); + } + if (!ach && !bch) { + return true; + } else if (!ach || !bch || *ach != *bch) { + return false; + } + ach = bch = nullopt; + } +} + // Check if lower case hyphenated words are equal to camel case words. 
// Because of out use case we know that 'r' the camel case string is // well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. // This is checked in the enum-class.h file. -static bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { +static bool SensitiveWarningNameMatch(llvm::StringRef l, llvm::StringRef r) { size_t ls{l.size()}, rs{r.size()}; if (ls < rs) { return false; @@ -154,42 +192,56 @@ static bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { // Parse a CLI enum option return the enum index and whether it should be // enabled (true) or disabled (false). template -optional> ParseCLIEnum( - llvm::StringRef input, EnumClass::FindIndexType findIndex) { +optional> ParseCLIEnum(llvm::StringRef input, + EnumClass::FindIndexType findIndex, bool insensitive) { bool negated{false}; - if (input.starts_with("no-")) { - negated = true; - input = input.drop_front(3); + EnumClass::Predicate predicate; + if (insensitive) { + if (input.starts_with_insensitive("no")) { + negated = true; + input = input.drop_front(2); + } + predicate = [input](std::string_view r) { + return InsensitiveWarningNameMatch(input, r); + }; + } else { + if (input.starts_with("no-")) { + negated = true; + input = input.drop_front(3); + } + predicate = [input](std::string_view r) { + return SensitiveWarningNameMatch(input, r); + }; } - EnumClass::Predicate predicate{[input](std::string_view r) { - return LowerHyphEqualCamelCase(input, r); - }}; optional x = EnumClass::Find(predicate, findIndex); return MapOption>( x, [negated](T x) { return std::pair{!negated, x}; }); } optional> parseCLIUsageWarning( - llvm::StringRef input) { - return ParseCLIEnum(input, FindUsageWarningIndex); + llvm::StringRef input, bool insensitive) { + return ParseCLIEnum(input, FindUsageWarningIndex, insensitive); } optional> parseCLILanguageFeature( - llvm::StringRef input) { - return ParseCLIEnum(input, FindLanguageFeatureIndex); + llvm::StringRef input, bool insensitive) { + 
return ParseCLIEnum( + input, FindLanguageFeatureIndex, insensitive); } } // namespace FortranFeaturesHelpers // Take a string from the CLI and apply it to the LanguageFeatureControl. // Return true if the option was applied recognized. -bool LanguageFeatureControl::applyCLIOption(std::string_view input) { +bool LanguageFeatureControl::applyCLIOption( + std::string_view input, bool insensitive) { llvm::StringRef inputRef{input}; - if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(inputRef)) { + if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature( + inputRef, insensitive)) { EnableWarning(result->second, result->first); return true; - } else if (auto result = - FortranFeaturesHelpers::parseCLIUsageWarning(inputRef)) { + } else if (auto result = FortranFeaturesHelpers::parseCLIUsageWarning( + inputRef, insensitive)) { EnableWarning(result->second, result->first); return true; } diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index b3f0c31a57025..4e9529d633ad9 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -13,29 +13,60 @@ namespace Fortran::common::FortranFeaturesHelpers { optional> parseCLIUsageWarning( - llvm::StringRef input); + llvm::StringRef input, bool insensitive); TEST(EnumClassTest, ParseCLIUsageWarning) { - EXPECT_EQ((parseCLIUsageWarning("no-twenty-one")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("twenty-one")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("no")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("no-")), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("Portability"), std::nullopt); - auto expect{std::pair{false, UsageWarning::Portability}}; - 
ASSERT_EQ(parseCLIUsageWarning("no-portability"), expect); - expect.first = true; - ASSERT_EQ((parseCLIUsageWarning("portability")), expect); - expect = - std::pair{false, Fortran::common::UsageWarning::PointerToUndefinable}; - ASSERT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable")), expect); - expect.first = true; - ASSERT_EQ((parseCLIUsageWarning("pointer-to-undefinable")), expect); - EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable"), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable"), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable"), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable"), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("twenty-one", false)), std::nullopt); + EXPECT_EQ( + (parseCLIUsageWarning("no-seven-seven-seven", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-", false)), std::nullopt); + + EXPECT_EQ(parseCLIUsageWarning("Portability", false), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-portability", false)), + (std::optional{std::pair{false, UsageWarning::Portability}})); + EXPECT_EQ((parseCLIUsageWarning("portability", false)), + (std::optional{std::pair{true, UsageWarning::Portability}})); + EXPECT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable", false)), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ((parseCLIUsageWarning("pointer-to-undefinable", false)), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable", false), std::nullopt); + EXPECT_EQ( + parseCLIUsageWarning("NoPointerToUndefinable", false), std::nullopt); + 
EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable", false), std::nullopt); + EXPECT_EQ( + parseCLIUsageWarning("nopointertoundefinable", false), std::nullopt); + + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("twenty-one", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-", true)), std::nullopt); + + EXPECT_EQ(parseCLIUsageWarning("Portability", true), + (std::optional{std::pair{true, UsageWarning::Portability}})); + EXPECT_EQ(parseCLIUsageWarning("no-portability", true), + (std::optional{std::pair{false, UsageWarning::Portability}})); + + EXPECT_EQ((parseCLIUsageWarning("portability", true)), + (std::optional{std::pair{true, UsageWarning::Portability}})); + EXPECT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable", true)), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ((parseCLIUsageWarning("pointer-to-undefinable", true)), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable", true), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable", true), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable", true), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable", true), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); } } // namespace Fortran::common::FortranFeaturesHelpers From flang-commits at lists.llvm.org Fri May 30 17:33:03 2025 From: 
flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Fri, 30 May 2025 17:33:03 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page support for Flang (PR #141882) In-Reply-To: Message-ID: <683a4e3f.050a0220.10345c.b907@mx.google.com> https://github.com/snarang181 updated https://github.com/llvm/llvm-project/pull/141882 >From 7ef55e467fd2eaacc66c36c6e6e4df33b86ada62 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Wed, 28 May 2025 20:21:16 -0400 Subject: [PATCH 1/6] [Flang][Docs] Add Sphinx man page support for Flang This patch enables building Flang man pages by: - Adding a `man_pages` entry in flang/docs/conf.py for Sphinx man builder. - Adding a minimal `index.rst` as the master document. - Adding placeholder `.rst` files for FIRLangRef and FlangCommandLineReference to fix toctree references. These changes unblock builds using `-DLLVM_BUILD_MANPAGES=ON` and allow `ninja docs-flang-man` to generate `flang.1`. Fixes #141757 --- flang/docs/FIRLangRef.rst | 4 ++++ flang/docs/FlangCommandLineReference.rst | 4 ++++ flang/docs/conf.py | 4 +++- flang/docs/index.rst | 10 ++++++++++ 4 files changed, 21 insertions(+), 1 deletion(-) create mode 100644 flang/docs/FIRLangRef.rst create mode 100644 flang/docs/FlangCommandLineReference.rst create mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst new file mode 100644 index 0000000000000..91edd67fdcad8 --- /dev/null +++ b/flang/docs/FIRLangRef.rst @@ -0,0 +1,4 @@ +FIR Language Reference +====================== + +(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst new file mode 100644 index 0000000000000..71f77f28ba72c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.rst @@ -0,0 +1,4 @@ +Flang Command Line Reference +============================ + +(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py 
index 48f7b69f5d750..46907f144e25a 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -227,7 +227,9 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [] +man_pages = [ + ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) +] # If true, show URL addresses after external links. # man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst new file mode 100644 index 0000000000000..09677eb87704f --- /dev/null +++ b/flang/docs/index.rst @@ -0,0 +1,10 @@ +Flang Documentation +==================== + +Welcome to the Flang documentation. + +.. toctree:: + :maxdepth: 1 + + FIRLangRef + FlangCommandLineReference >From c21b5a0b7e404fc38f51f718e3f3c390d79af6a5 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 06:53:34 -0400 Subject: [PATCH 2/6] Remove .rst files and point conf.py to pick up .md --- flang/docs/FIRLangRef.rst | 4 ---- flang/docs/FlangCommandLineReference.rst | 4 ---- flang/docs/conf.py | 5 ++--- flang/docs/index.rst | 10 ---------- 4 files changed, 2 insertions(+), 21 deletions(-) delete mode 100644 flang/docs/FIRLangRef.rst delete mode 100644 flang/docs/FlangCommandLineReference.rst delete mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst deleted file mode 100644 index 91edd67fdcad8..0000000000000 --- a/flang/docs/FIRLangRef.rst +++ /dev/null @@ -1,4 +0,0 @@ -FIR Language Reference -====================== - -(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst deleted file mode 100644 index 71f77f28ba72c..0000000000000 --- a/flang/docs/FlangCommandLineReference.rst +++ /dev/null @@ -1,4 +0,0 @@ -Flang Command Line Reference -============================ - -(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 46907f144e25a..4fd81440c8176 
100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -42,6 +42,7 @@ # Add any paths that contain templates here, relative to this directory. templates_path = ["_templates"] +source_suffix = [".md"] myst_heading_anchors = 6 import sphinx @@ -227,9 +228,7 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [ - ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) -] +man_pages = [("index", "flang", "Flang Documentation", ["Flang Contributors"], 1)] # If true, show URL addresses after external links. # man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst deleted file mode 100644 index 09677eb87704f..0000000000000 --- a/flang/docs/index.rst +++ /dev/null @@ -1,10 +0,0 @@ -Flang Documentation -==================== - -Welcome to the Flang documentation. - -.. toctree:: - :maxdepth: 1 - - FIRLangRef - FlangCommandLineReference >From 925c666bd60163e4943a074689a5bbbcbe29614a Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 07:03:35 -0400 Subject: [PATCH 3/6] While building man pages, the .md files were being used. Due to that, the myst_parser was explictly imported. 
Adding Placeholder .md files which are required by index.md --- flang/docs/FIRLangRef.md | 3 +++ flang/docs/FlangCommandLineReference.md | 3 +++ flang/docs/conf.py | 10 +++++----- 3 files changed, 11 insertions(+), 5 deletions(-) create mode 100644 flang/docs/FIRLangRef.md create mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md new file mode 100644 index 0000000000000..8e4052f14fc7c --- /dev/null +++ b/flang/docs/FIRLangRef.md @@ -0,0 +1,3 @@ +# FIR Language Reference + +_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md new file mode 100644 index 0000000000000..ee8d7b83dc50c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.md @@ -0,0 +1,3 @@ +# Flang Command Line Reference + +_TODO: Add Flang CLI documentation._ diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 4fd81440c8176..7223661625689 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -10,6 +10,7 @@ # serve to show the default. from datetime import date + # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. @@ -28,16 +29,15 @@ "sphinx.ext.autodoc", ] -# When building man pages, we do not use the markdown pages, -# So, we can continue without the myst_parser dependencies. -# Doing so reduces dependencies of some packaged llvm distributions. + try: import myst_parser extensions.append("myst_parser") except ImportError: - if not tags.has("builder-man"): - raise + raise ImportError( + "myst_parser is required to build documentation, including man pages." + ) # Add any paths that contain templates here, relative to this directory. 
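The conf.py hunk above makes the `myst_parser` import mandatory instead of tolerating its absence for the man builder. Both behaviours can be sketched side by side as below. This is an illustration only: `register_extension` and its `required` flag are hypothetical names and are not part of flang/docs/conf.py, which does the import inline. The error text is copied from the patch.

```python
import importlib


def register_extension(extensions, module_name, required=True):
    """Append module_name to the Sphinx ``extensions`` list if it imports.

    Hypothetical helper, not taken from flang/docs/conf.py.
    required=True  mirrors the patched behaviour: a missing module is fatal.
    required=False mirrors the earlier tolerant behaviour: skip the module.
    """
    try:
        importlib.import_module(module_name)
    except ImportError as err:
        if required:
            raise ImportError(
                f"{module_name} is required to build documentation, "
                "including man pages."
            ) from err
        # Optional extension: leave the list untouched and report the skip.
        return False
    extensions.append(module_name)
    return True


# Usage as it might appear in a conf.py (assumed layout):
extensions = ["sphinx.ext.autodoc"]
# register_extension(extensions, "myst_parser")  # raises if not installed
```

With `required=False` a build that does not need Markdown sources could still proceed without the package. The patch drops that option because the man page is now built from the `.md` files.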
>From 56340f2b9b2f4f084033db919e9f1a727b621856 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 09:01:22 -0400 Subject: [PATCH 4/6] Remove placeholder .md files --- flang/docs/FIRLangRef.md | 3 --- flang/docs/FlangCommandLineReference.md | 3 --- 2 files changed, 6 deletions(-) delete mode 100644 flang/docs/FIRLangRef.md delete mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md deleted file mode 100644 index 8e4052f14fc7c..0000000000000 --- a/flang/docs/FIRLangRef.md +++ /dev/null @@ -1,3 +0,0 @@ -# FIR Language Reference - -_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md deleted file mode 100644 index ee8d7b83dc50c..0000000000000 --- a/flang/docs/FlangCommandLineReference.md +++ /dev/null @@ -1,3 +0,0 @@ -# Flang Command Line Reference - -_TODO: Add Flang CLI documentation._ >From cd886e1f1bc5a8fd46733a6f352faef4f1070dc0 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Fri, 30 May 2025 14:10:36 -0400 Subject: [PATCH 5/6] Enable docs-flang-html to build --- flang/docs/conf.py | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 7223661625689..03f5973392d65 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -42,7 +42,10 @@ # Add any paths that contain templates here, relative to this directory. 
templates_path = ["_templates"] -source_suffix = [".md"] +source_suffix = { + ".rst": "restructuredtext", + ".md": "markdown", +} myst_heading_anchors = 6 import sphinx >From 9a420432b4fd6441d87d4fa575434a8616d5c82d Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Fri, 30 May 2025 20:32:33 -0400 Subject: [PATCH 6/6] Modify CMake to build man without warnings --- flang/docs/CMakeLists.txt | 50 +++++++++++++++++++-------------------- 1 file changed, 25 insertions(+), 25 deletions(-) diff --git a/flang/docs/CMakeLists.txt b/flang/docs/CMakeLists.txt index 92feb059d4caa..e60f89569dad3 100644 --- a/flang/docs/CMakeLists.txt +++ b/flang/docs/CMakeLists.txt @@ -82,7 +82,7 @@ if (LLVM_ENABLE_DOXYGEN) endif() endif() -function (gen_rst_file_from_td output_file td_option source docs_target) +function (gen_rst_file_from_td output_file td_option source) if (NOT EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/${source}") message(FATAL_ERROR "Cannot find source file: ${source} in ${CMAKE_CURRENT_SOURCE_DIR}") endif() @@ -90,8 +90,6 @@ function (gen_rst_file_from_td output_file td_option source docs_target) list(APPEND LLVM_TABLEGEN_FLAGS "-I${TABLEGEN_INCLUDE_DIR}") list(APPEND LLVM_TABLEGEN_FLAGS "-I${CMAKE_CURRENT_SOURCE_DIR}/../../clang/include/clang/Driver/") clang_tablegen(Source/${output_file} ${td_option} SOURCE ${source} TARGET "gen-${output_file}") - add_dependencies(${docs_target} "gen-${output_file}") - # clang_tablegen() does not create the output directory automatically, # so we have to create it explicitly. 
Note that copy-flang-src-docs below # does create the output directory, but it is not necessarily run @@ -105,32 +103,34 @@ endfunction() if (LLVM_ENABLE_SPHINX) include(AddSphinxTarget) if (SPHINX_FOUND) + + # CLANG_TABLEGEN_EXE variable needs to be set for clang_tablegen to run without error + find_program(CLANG_TABLEGEN_EXE "clang-tblgen" ${LLVM_TOOLS_BINARY_DIR} NO_DEFAULT_PATH) + + # Generate the RST file from TableGen (shared by both HTML and MAN builds) + gen_rst_file_from_td(FlangCommandLineReference.rst -gen-opt-docs FlangOptionsDocs.td) + + # Copy the flang/docs directory and the generated FIRLangRef.md file to a place in the binary directory. + # Having all the files in a single directory makes it possible for Sphinx to process them together. + # Add a dependency to the flang-doc target to ensure that the FIRLangRef.md file is generated before the copying happens. + add_custom_target(copy-flang-src-docs + COMMAND "${CMAKE_COMMAND}" -E copy_directory + "${CMAKE_CURRENT_SOURCE_DIR}" + "${CMAKE_CURRENT_BINARY_DIR}/Source" + DEPENDS flang-doc gen-FlangCommandLineReference.rst) + + # Run Python preprocessing to prepend header to FIRLangRef.md + add_custom_command(TARGET copy-flang-src-docs + COMMAND "${Python3_EXECUTABLE}" + ARGS ${CMAKE_CURRENT_BINARY_DIR}/Source/FIR/CreateFIRLangRef.py) + if (${SPHINX_OUTPUT_HTML}) add_sphinx_target(html flang SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/Source") - - add_dependencies(docs-flang-html copy-flang-src-docs) - - # Copy the flang/docs directory and the generated FIRLangRef.md file to a place in the binary directory. - # Having all the files in a single directory makes it possible for Sphinx to process them together. - # Add a dependency to the flang-doc target to ensure that the FIRLangRef.md file is generated before the copying happens. 
- add_custom_target(copy-flang-src-docs - COMMAND "${CMAKE_COMMAND}" -E copy_directory - "${CMAKE_CURRENT_SOURCE_DIR}" - "${CMAKE_CURRENT_BINARY_DIR}/Source" - DEPENDS flang-doc) - - # Runs a python script prior to HTML generation to prepend a header to FIRLangRef, - # Without the header, the page is incorrectly formatted, as it assumes the first entry is the page title. - add_custom_command(TARGET copy-flang-src-docs - COMMAND "${Python3_EXECUTABLE}" - ARGS ${CMAKE_CURRENT_BINARY_DIR}/Source/FIR/CreateFIRLangRef.py) - - # CLANG_TABLEGEN_EXE variable needs to be set for clang_tablegen to run without error - find_program(CLANG_TABLEGEN_EXE "clang-tblgen" ${LLVM_TOOLS_BINARY_DIR} NO_DEFAULT_PATH) - gen_rst_file_from_td(FlangCommandLineReference.rst -gen-opt-docs FlangOptionsDocs.td docs-flang-html) endif() if (${SPHINX_OUTPUT_MAN}) - add_sphinx_target(man flang) + add_sphinx_target(man flang SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/Source") + add_dependencies(docs-flang-man gen-FlangCommandLineReference.rst) + add_dependencies(docs-flang-man copy-flang-src-docs) endif() endif() endif() From flang-commits at lists.llvm.org Fri May 30 17:35:31 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Fri, 30 May 2025 17:35:31 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page and html support for Flang (PR #141882) In-Reply-To: Message-ID: <683a4ed3.a70a0220.26b257.a0e3@mx.google.com> https://github.com/snarang181 edited https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Fri May 30 17:46:23 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Fri, 30 May 2025 17:46:23 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page and html support for Flang (PR #141882) In-Reply-To: Message-ID: <683a515f.a70a0220.79a88.a055@mx.google.com> https://github.com/snarang181 updated https://github.com/llvm/llvm-project/pull/141882 >From 
7ef55e467fd2eaacc66c36c6e6e4df33b86ada62 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Wed, 28 May 2025 20:21:16 -0400 Subject: [PATCH 1/7] [Flang][Docs] Add Sphinx man page support for Flang This patch enables building Flang man pages by: - Adding a `man_pages` entry in flang/docs/conf.py for Sphinx man builder. - Adding a minimal `index.rst` as the master document. - Adding placeholder `.rst` files for FIRLangRef and FlangCommandLineReference to fix toctree references. These changes unblock builds using `-DLLVM_BUILD_MANPAGES=ON` and allow `ninja docs-flang-man` to generate `flang.1`. Fixes #141757 --- flang/docs/FIRLangRef.rst | 4 ++++ flang/docs/FlangCommandLineReference.rst | 4 ++++ flang/docs/conf.py | 4 +++- flang/docs/index.rst | 10 ++++++++++ 4 files changed, 21 insertions(+), 1 deletion(-) create mode 100644 flang/docs/FIRLangRef.rst create mode 100644 flang/docs/FlangCommandLineReference.rst create mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst new file mode 100644 index 0000000000000..91edd67fdcad8 --- /dev/null +++ b/flang/docs/FIRLangRef.rst @@ -0,0 +1,4 @@ +FIR Language Reference +====================== + +(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst new file mode 100644 index 0000000000000..71f77f28ba72c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.rst @@ -0,0 +1,4 @@ +Flang Command Line Reference +============================ + +(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 48f7b69f5d750..46907f144e25a 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -227,7 +227,9 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). 
-man_pages = [] +man_pages = [ + ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) +] # If true, show URL addresses after external links. # man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst new file mode 100644 index 0000000000000..09677eb87704f --- /dev/null +++ b/flang/docs/index.rst @@ -0,0 +1,10 @@ +Flang Documentation +==================== + +Welcome to the Flang documentation. + +.. toctree:: + :maxdepth: 1 + + FIRLangRef + FlangCommandLineReference >From c21b5a0b7e404fc38f51f718e3f3c390d79af6a5 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 06:53:34 -0400 Subject: [PATCH 2/7] Remove .rst files and point conf.py to pick up .md --- flang/docs/FIRLangRef.rst | 4 ---- flang/docs/FlangCommandLineReference.rst | 4 ---- flang/docs/conf.py | 5 ++--- flang/docs/index.rst | 10 ---------- 4 files changed, 2 insertions(+), 21 deletions(-) delete mode 100644 flang/docs/FIRLangRef.rst delete mode 100644 flang/docs/FlangCommandLineReference.rst delete mode 100644 flang/docs/index.rst diff --git a/flang/docs/FIRLangRef.rst b/flang/docs/FIRLangRef.rst deleted file mode 100644 index 91edd67fdcad8..0000000000000 --- a/flang/docs/FIRLangRef.rst +++ /dev/null @@ -1,4 +0,0 @@ -FIR Language Reference -====================== - -(TODO: Add FIR language reference documentation) diff --git a/flang/docs/FlangCommandLineReference.rst b/flang/docs/FlangCommandLineReference.rst deleted file mode 100644 index 71f77f28ba72c..0000000000000 --- a/flang/docs/FlangCommandLineReference.rst +++ /dev/null @@ -1,4 +0,0 @@ -Flang Command Line Reference -============================ - -(TODO: Add Flang CLI documentation) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 46907f144e25a..4fd81440c8176 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -42,6 +42,7 @@ # Add any paths that contain templates here, relative to this directory. 
templates_path = ["_templates"] +source_suffix = [".md"] myst_heading_anchors = 6 import sphinx @@ -227,9 +228,7 @@ # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). -man_pages = [ - ('index', 'flang', 'Flang Documentation', ['Flang Contributors'], 1) -] +man_pages = [("index", "flang", "Flang Documentation", ["Flang Contributors"], 1)] # If true, show URL addresses after external links. # man_show_urls = False diff --git a/flang/docs/index.rst b/flang/docs/index.rst deleted file mode 100644 index 09677eb87704f..0000000000000 --- a/flang/docs/index.rst +++ /dev/null @@ -1,10 +0,0 @@ -Flang Documentation -==================== - -Welcome to the Flang documentation. - -.. toctree:: - :maxdepth: 1 - - FIRLangRef - FlangCommandLineReference >From 925c666bd60163e4943a074689a5bbbcbe29614a Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 07:03:35 -0400 Subject: [PATCH 3/7] While building man pages, the .md files were being used. Due to that, the myst_parser was explictly imported. 
Adding Placeholder .md files which are required by index.md --- flang/docs/FIRLangRef.md | 3 +++ flang/docs/FlangCommandLineReference.md | 3 +++ flang/docs/conf.py | 10 +++++----- 3 files changed, 11 insertions(+), 5 deletions(-) create mode 100644 flang/docs/FIRLangRef.md create mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md new file mode 100644 index 0000000000000..8e4052f14fc7c --- /dev/null +++ b/flang/docs/FIRLangRef.md @@ -0,0 +1,3 @@ +# FIR Language Reference + +_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md new file mode 100644 index 0000000000000..ee8d7b83dc50c --- /dev/null +++ b/flang/docs/FlangCommandLineReference.md @@ -0,0 +1,3 @@ +# Flang Command Line Reference + +_TODO: Add Flang CLI documentation._ diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 4fd81440c8176..7223661625689 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -10,6 +10,7 @@ # serve to show the default. from datetime import date + # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. @@ -28,16 +29,15 @@ "sphinx.ext.autodoc", ] -# When building man pages, we do not use the markdown pages, -# So, we can continue without the myst_parser dependencies. -# Doing so reduces dependencies of some packaged llvm distributions. + try: import myst_parser extensions.append("myst_parser") except ImportError: - if not tags.has("builder-man"): - raise + raise ImportError( + "myst_parser is required to build documentation, including man pages." + ) # Add any paths that contain templates here, relative to this directory. 
>From 56340f2b9b2f4f084033db919e9f1a727b621856 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Thu, 29 May 2025 09:01:22 -0400 Subject: [PATCH 4/7] Remove placeholder .md files --- flang/docs/FIRLangRef.md | 3 --- flang/docs/FlangCommandLineReference.md | 3 --- 2 files changed, 6 deletions(-) delete mode 100644 flang/docs/FIRLangRef.md delete mode 100644 flang/docs/FlangCommandLineReference.md diff --git a/flang/docs/FIRLangRef.md b/flang/docs/FIRLangRef.md deleted file mode 100644 index 8e4052f14fc7c..0000000000000 --- a/flang/docs/FIRLangRef.md +++ /dev/null @@ -1,3 +0,0 @@ -# FIR Language Reference - -_TODO: Add FIR language reference documentation._ diff --git a/flang/docs/FlangCommandLineReference.md b/flang/docs/FlangCommandLineReference.md deleted file mode 100644 index ee8d7b83dc50c..0000000000000 --- a/flang/docs/FlangCommandLineReference.md +++ /dev/null @@ -1,3 +0,0 @@ -# Flang Command Line Reference - -_TODO: Add Flang CLI documentation._ >From cd886e1f1bc5a8fd46733a6f352faef4f1070dc0 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Fri, 30 May 2025 14:10:36 -0400 Subject: [PATCH 5/7] Enable docs-flang-html to build --- flang/docs/conf.py | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/flang/docs/conf.py b/flang/docs/conf.py index 7223661625689..03f5973392d65 100644 --- a/flang/docs/conf.py +++ b/flang/docs/conf.py @@ -42,7 +42,10 @@ # Add any paths that contain templates here, relative to this directory. 
templates_path = ["_templates"] -source_suffix = [".md"] +source_suffix = { + ".rst": "restructuredtext", + ".md": "markdown", +} myst_heading_anchors = 6 import sphinx >From 9a420432b4fd6441d87d4fa575434a8616d5c82d Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Fri, 30 May 2025 20:32:33 -0400 Subject: [PATCH 6/7] Modify CMake to build man without warnings --- flang/docs/CMakeLists.txt | 50 +++++++++++++++++++-------------------- 1 file changed, 25 insertions(+), 25 deletions(-) diff --git a/flang/docs/CMakeLists.txt b/flang/docs/CMakeLists.txt index 92feb059d4caa..e60f89569dad3 100644 --- a/flang/docs/CMakeLists.txt +++ b/flang/docs/CMakeLists.txt @@ -82,7 +82,7 @@ if (LLVM_ENABLE_DOXYGEN) endif() endif() -function (gen_rst_file_from_td output_file td_option source docs_target) +function (gen_rst_file_from_td output_file td_option source) if (NOT EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/${source}") message(FATAL_ERROR "Cannot find source file: ${source} in ${CMAKE_CURRENT_SOURCE_DIR}") endif() @@ -90,8 +90,6 @@ function (gen_rst_file_from_td output_file td_option source docs_target) list(APPEND LLVM_TABLEGEN_FLAGS "-I${TABLEGEN_INCLUDE_DIR}") list(APPEND LLVM_TABLEGEN_FLAGS "-I${CMAKE_CURRENT_SOURCE_DIR}/../../clang/include/clang/Driver/") clang_tablegen(Source/${output_file} ${td_option} SOURCE ${source} TARGET "gen-${output_file}") - add_dependencies(${docs_target} "gen-${output_file}") - # clang_tablegen() does not create the output directory automatically, # so we have to create it explicitly. 
Note that copy-flang-src-docs below # does create the output directory, but it is not necessarily run @@ -105,32 +103,34 @@ endfunction() if (LLVM_ENABLE_SPHINX) include(AddSphinxTarget) if (SPHINX_FOUND) + + # CLANG_TABLEGEN_EXE variable needs to be set for clang_tablegen to run without error + find_program(CLANG_TABLEGEN_EXE "clang-tblgen" ${LLVM_TOOLS_BINARY_DIR} NO_DEFAULT_PATH) + + # Generate the RST file from TableGen (shared by both HTML and MAN builds) + gen_rst_file_from_td(FlangCommandLineReference.rst -gen-opt-docs FlangOptionsDocs.td) + + # Copy the flang/docs directory and the generated FIRLangRef.md file to a place in the binary directory. + # Having all the files in a single directory makes it possible for Sphinx to process them together. + # Add a dependency to the flang-doc target to ensure that the FIRLangRef.md file is generated before the copying happens. + add_custom_target(copy-flang-src-docs + COMMAND "${CMAKE_COMMAND}" -E copy_directory + "${CMAKE_CURRENT_SOURCE_DIR}" + "${CMAKE_CURRENT_BINARY_DIR}/Source" + DEPENDS flang-doc gen-FlangCommandLineReference.rst) + + # Run Python preprocessing to prepend header to FIRLangRef.md + add_custom_command(TARGET copy-flang-src-docs + COMMAND "${Python3_EXECUTABLE}" + ARGS ${CMAKE_CURRENT_BINARY_DIR}/Source/FIR/CreateFIRLangRef.py) + if (${SPHINX_OUTPUT_HTML}) add_sphinx_target(html flang SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/Source") - - add_dependencies(docs-flang-html copy-flang-src-docs) - - # Copy the flang/docs directory and the generated FIRLangRef.md file to a place in the binary directory. - # Having all the files in a single directory makes it possible for Sphinx to process them together. - # Add a dependency to the flang-doc target to ensure that the FIRLangRef.md file is generated before the copying happens. 
- add_custom_target(copy-flang-src-docs - COMMAND "${CMAKE_COMMAND}" -E copy_directory - "${CMAKE_CURRENT_SOURCE_DIR}" - "${CMAKE_CURRENT_BINARY_DIR}/Source" - DEPENDS flang-doc) - - # Runs a python script prior to HTML generation to prepend a header to FIRLangRef, - # Without the header, the page is incorrectly formatted, as it assumes the first entry is the page title. - add_custom_command(TARGET copy-flang-src-docs - COMMAND "${Python3_EXECUTABLE}" - ARGS ${CMAKE_CURRENT_BINARY_DIR}/Source/FIR/CreateFIRLangRef.py) - - # CLANG_TABLEGEN_EXE variable needs to be set for clang_tablegen to run without error - find_program(CLANG_TABLEGEN_EXE "clang-tblgen" ${LLVM_TOOLS_BINARY_DIR} NO_DEFAULT_PATH) - gen_rst_file_from_td(FlangCommandLineReference.rst -gen-opt-docs FlangOptionsDocs.td docs-flang-html) endif() if (${SPHINX_OUTPUT_MAN}) - add_sphinx_target(man flang) + add_sphinx_target(man flang SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/Source") + add_dependencies(docs-flang-man gen-FlangCommandLineReference.rst) + add_dependencies(docs-flang-man copy-flang-src-docs) endif() endif() endif() >From 477e8249c13d6575c42a7ed6068ecbf00a733b56 Mon Sep 17 00:00:00 2001 From: Samarth Narang Date: Fri, 30 May 2025 20:44:17 -0400 Subject: [PATCH 7/7] Minor fix for building html version copy-flang-src-docs builds Source/ directory and in fixing, did not include it in the HTML target deps --- flang/docs/CMakeLists.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/flang/docs/CMakeLists.txt b/flang/docs/CMakeLists.txt index e60f89569dad3..2737d08c83196 100644 --- a/flang/docs/CMakeLists.txt +++ b/flang/docs/CMakeLists.txt @@ -126,6 +126,7 @@ if (LLVM_ENABLE_SPHINX) if (${SPHINX_OUTPUT_HTML}) add_sphinx_target(html flang SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/Source") + add_dependencies(docs-flang-html copy-flang-src-docs) endif() if (${SPHINX_OUTPUT_MAN}) add_sphinx_target(man flang SOURCE_DIR "${CMAKE_CURRENT_BINARY_DIR}/Source") From flang-commits at lists.llvm.org Fri May 30 
17:53:10 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Fri, 30 May 2025 17:53:10 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page and html support for Flang (PR #141882) In-Reply-To: Message-ID: <683a52f6.170a0220.922b5.99da@mx.google.com> snarang181 wrote: @kiranchandramohan, @tarunprabhu, @pawosm-arm Hi guys, I think this PR is ready for review. We can successfully build both HTML and man targets with these changes. Let me know how it looks. Thanks. https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Fri May 30 18:00:38 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Fri, 30 May 2025 18:00:38 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page and html support for Flang (PR #141882) In-Reply-To: Message-ID: <683a54b6.170a0220.9053e.e8bb@mx.google.com> https://github.com/snarang181 edited https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Fri May 30 18:02:20 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 30 May 2025 18:02:20 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683a551c.050a0220.233dc.4919@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/142022 >From 8f3fd2daab46f477e87043c66b3049dff4a5b20e Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:11:04 -0700 Subject: [PATCH 1/8] initial commit --- flang/include/flang/Common/enum-class.h | 47 ++++- .../include/flang/Support/Fortran-features.h | 51 ++++-- flang/lib/Frontend/CompilerInvocation.cpp | 62 ++++--- flang/lib/Support/CMakeLists.txt | 1 + flang/lib/Support/Fortran-features.cpp | 168 ++++++++++++++---- flang/lib/Support/enum-class.cpp | 24 +++ flang/test/Driver/disable-diagnostic.f90 | 19 ++ 
flang/test/Driver/werror-wrong.f90 | 7 +- flang/test/Driver/wextra-ok.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 3 + flang/unittests/Common/EnumClassTests.cpp | 45 +++++ .../unittests/Common/FortranFeaturesTest.cpp | 142 +++++++++++++++ 12 files changed, 483 insertions(+), 88 deletions(-) create mode 100644 flang/lib/Support/enum-class.cpp create mode 100644 flang/test/Driver/disable-diagnostic.f90 create mode 100644 flang/unittests/Common/EnumClassTests.cpp create mode 100644 flang/unittests/Common/FortranFeaturesTest.cpp diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index 41575d45091a8..baf9fe418141d 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -18,8 +18,9 @@ #define FORTRAN_COMMON_ENUM_CLASS_H_ #include -#include - +#include +#include +#include namespace Fortran::common { constexpr std::size_t CountEnumNames(const char *p) { @@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) { return result; } +template +std::optional inline fmap(std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +std::optional FindEnumIndex( + Predicate pred, int size, const std::string_view *names); + +using FindEnumIndexType = std::optional( + Predicate, int, const std::string_view *); + +template +std::optional inline FindEnum( + Predicate pred, std::function(Predicate)> find) { + std::function f = [](int x) { return static_cast(x); }; + return fmap(find(pred), f); +} + #define ENUM_CLASS(NAME, ...) 
\ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ ::Fortran::common::CountEnumNames(#__VA_ARGS__)}; \ + [[maybe_unused]] static constexpr std::array NAME##_names{ \ + ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ [[maybe_unused]] static inline std::string_view EnumToString(NAME e) { \ - static const constexpr auto names{ \ - ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ - return names[static_cast(e)]; \ + return NAME##_names[static_cast(e)]; \ } +#define ENUM_CLASS_EXTRA(NAME) \ + [[maybe_unused]] inline std::optional Find##NAME##Index( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnumIndex( \ + p, NAME##_enumSize, NAME##_names.data()); \ + } \ + [[maybe_unused]] inline std::optional Find##NAME( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + } \ + [[maybe_unused]] inline std::optional StringTo##NAME( \ + const std::string_view name) { \ + return Find##NAME( \ + [name](const std::string_view s) -> bool { return name == s; }); \ + } } // namespace Fortran::common #endif // FORTRAN_COMMON_ENUM_CLASS_H_ diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index e696da9042480..d5aa7357ffea0 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -12,6 +12,8 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" #include "flang/Common/idioms.h" +#include "llvm/Support/Error.h" +#include "llvm/Support/raw_ostream.h" #include #include @@ -79,12 +81,13 @@ ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, NullActualForDefaultIntentAllocatable, UseAssociationIntoSameNameSubprogram, HostAssociatedIntentOutInSpecExpr, NonVolatilePointerToVolatile) +// Generate default String -> Enum mapping. 
+ENUM_CLASS_EXTRA(LanguageFeature) +ENUM_CLASS_EXTRA(UsageWarning) + using LanguageFeatures = EnumSet; using UsageWarnings = EnumSet; -std::optional FindLanguageFeature(const char *); -std::optional FindUsageWarning(const char *); - class LanguageFeatureControl { public: LanguageFeatureControl(); @@ -97,8 +100,10 @@ class LanguageFeatureControl { void EnableWarning(UsageWarning w, bool yes = true) { warnUsage_.set(w, yes); } - void WarnOnAllNonstandard(bool yes = true) { warnAllLanguage_ = yes; } - void WarnOnAllUsage(bool yes = true) { warnAllUsage_ = yes; } + void WarnOnAllNonstandard(bool yes = true); + bool IsWarnOnAllNonstandard() const { return warnAllLanguage_; } + void WarnOnAllUsage(bool yes = true); + bool IsWarnOnAllUsage() const { return warnAllUsage_; } void DisableAllNonstandardWarnings() { warnAllLanguage_ = false; warnLanguage_.clear(); @@ -107,16 +112,16 @@ class LanguageFeatureControl { warnAllUsage_ = false; warnUsage_.clear(); } - - bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } - bool ShouldWarn(LanguageFeature f) const { - return (warnAllLanguage_ && f != LanguageFeature::OpenMP && - f != LanguageFeature::OpenACC && f != LanguageFeature::CUDA) || - warnLanguage_.test(f); - } - bool ShouldWarn(UsageWarning w) const { - return warnAllUsage_ || warnUsage_.test(w); + void DisableAllWarnings() { + disableAllWarnings_ = true; + DisableAllNonstandardWarnings(); + DisableAllUsageWarnings(); } + bool applyCLIOption(llvm::StringRef input); + bool AreWarningsDisabled() const { return disableAllWarnings_; } + bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } + bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); } + bool ShouldWarn(UsageWarning w) const { return warnUsage_.test(w); } // Return all spellings of operators names, depending on features enabled std::vector GetNames(LogicalOperator) const; std::vector GetNames(RelationalOperator) const; @@ -127,6 +132,24 @@ class 
LanguageFeatureControl { bool warnAllLanguage_{false}; UsageWarnings warnUsage_; bool warnAllUsage_{false}; + bool disableAllWarnings_{false}; }; + +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). Just exposed for the template below. +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find); + +template +std::optional> parseCLIEnum( + llvm::StringRef input, std::function(Predicate)> find) { + using To = std::pair; + using From = std::pair; + static std::function cast = [](From x) { + return std::pair{x.first, static_cast(x.second)}; + }; + return fmap(parseCLIEnumIndex(input, find), cast); +} + } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..9ea568549bd6c 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -34,6 +34,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" +#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" @@ -45,6 +46,7 @@ #include #include #include +#include using namespace Fortran::frontend; @@ -971,10 +973,23 @@ static bool parseSemaArgs(CompilerInvocation &res, llvm::opt::ArgList &args, /// Parses all diagnostics related arguments and populates the variables /// options accordingly. Returns false if new errors are generated. +/// FC1 driver entry point for parsing diagnostic arguments. static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, clang::DiagnosticsEngine &diags) { unsigned numErrorsBefore = diags.getNumErrors(); + auto &features = res.getFrontendOpts().features; + // The order of these flags (-pedantic -W -w) is important and is + // chosen to match clang's behavior. 
+ + // -pedantic + if (args.hasArg(clang::driver::options::OPT_pedantic)) { + features.WarnOnAllNonstandard(); + features.WarnOnAllUsage(); + res.setEnableConformanceChecks(); + res.setEnableUsageChecks(); + } + // -Werror option // TODO: Currently throws a Diagnostic for anything other than -W, // this has to change when other -W's are supported. @@ -984,21 +999,27 @@ static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, for (const auto &wArg : wArgs) { if (wArg == "error") { res.setWarnAsErr(true); - } else { - const unsigned diagID = - diags.getCustomDiagID(clang::DiagnosticsEngine::Error, - "Only `-Werror` is supported currently."); - diags.Report(diagID); + // -W(no-) + } else if (!features.applyCLIOption(wArg)) { + const unsigned diagID = diags.getCustomDiagID( + clang::DiagnosticsEngine::Error, "Unknown diagnostic option: -W%0"); + diags.Report(diagID) << wArg; } } } + // -w + if (args.hasArg(clang::driver::options::OPT_w)) { + features.DisableAllWarnings(); + res.setDisableWarnings(); + } + // Default to off for `flang -fc1`. - res.getFrontendOpts().showColors = - parseShowColorsArgs(args, /*defaultDiagColor=*/false); + bool showColors = parseShowColorsArgs(args, false); - // Honor color diagnostics. 
- res.getDiagnosticOpts().ShowColors = res.getFrontendOpts().showColors; + diags.getDiagnosticOptions().ShowColors = showColors; + res.getDiagnosticOpts().ShowColors = showColors; + res.getFrontendOpts().showColors = showColors; return diags.getNumErrors() == numErrorsBefore; } @@ -1074,16 +1095,6 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, Fortran::common::LanguageFeature::OpenACC); } - // -pedantic - if (args.hasArg(clang::driver::options::OPT_pedantic)) { - res.setEnableConformanceChecks(); - res.setEnableUsageChecks(); - } - - // -w - if (args.hasArg(clang::driver::options::OPT_w)) - res.setDisableWarnings(); - // -std=f2018 // TODO: Set proper options when more fortran standards // are supported. @@ -1092,6 +1103,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, // We only allow f2018 as the given standard if (standard == "f2018") { res.setEnableConformanceChecks(); + res.getFrontendOpts().features.WarnOnAllNonstandard(); } else { const unsigned diagID = diags.getCustomDiagID(clang::DiagnosticsEngine::Error, @@ -1099,6 +1111,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, diags.Report(diagID); } } + return diags.getNumErrors() == numErrorsBefore; } @@ -1694,16 +1707,7 @@ void CompilerInvocation::setFortranOpts() { if (frontendOptions.needProvenanceRangeToCharBlockMappings) fortranOptions.needProvenanceRangeToCharBlockMappings = true; - if (getEnableConformanceChecks()) - fortranOptions.features.WarnOnAllNonstandard(); - - if (getEnableUsageChecks()) - fortranOptions.features.WarnOnAllUsage(); - - if (getDisableWarnings()) { - fortranOptions.features.DisableAllNonstandardWarnings(); - fortranOptions.features.DisableAllUsageWarnings(); - } + fortranOptions.features = frontendOptions.features; } std::unique_ptr diff --git a/flang/lib/Support/CMakeLists.txt b/flang/lib/Support/CMakeLists.txt index 363f57ce97dae..9ef31a2a6dcc7 100644 --- 
a/flang/lib/Support/CMakeLists.txt +++ b/flang/lib/Support/CMakeLists.txt @@ -44,6 +44,7 @@ endif() add_flang_library(FortranSupport default-kinds.cpp + enum-class.cpp Flags.cpp Fortran.cpp Fortran-features.cpp diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index bee8984102b82..55abf0385d185 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -9,6 +9,8 @@ #include "flang/Support/Fortran-features.h" #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/Support/raw_ostream.h" namespace Fortran::common { @@ -94,57 +96,123 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Ignore case and any inserted punctuation (like '-'/'_') -static std::optional GetWarningChar(char ch) { - if (ch >= 'a' && ch <= 'z') { - return ch; - } else if (ch >= 'A' && ch <= 'Z') { - return ch - 'A' + 'a'; - } else if (ch >= '0' && ch <= '9') { - return ch; - } else { - return std::nullopt; +// Split a string with camel case into the individual words. +// Note, the small vector is just an array of a few pointers and lengths +// into the original input string. So all this allocation should be pretty +// cheap. 
+llvm::SmallVector splitCamelCase(llvm::StringRef input) { + using namespace llvm; + if (input.empty()) { + return {}; + } + SmallVector parts{}; + parts.reserve(input.size()); + auto check = [&input](size_t j, function_ref predicate) { + return j < input.size() && predicate(input[j]); + }; + size_t i{0}; + size_t startWord = i; + for (; i < input.size(); i++) { + if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || + ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { + parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); + startWord = i + 1; + } + } + parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); + return parts; } -static bool WarningNameMatch(const char *a, const char *b) { - while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); - } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); +// Split a string with hyphens into the individual words. +llvm::SmallVector splitHyphenated(llvm::StringRef input) { + auto parts = llvm::SmallVector{}; + llvm::SplitString(input, parts, "-"); + return parts; +} + +// Check if two strings are equal while normalizing case for the +// right word which is assumed to be a single word in camel case. +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { + size_t ls = l.size(); + if (ls != r.size()) + return false; + size_t j{0}; + // Process the upper case characters. + for (; j < ls; j++) { + char rc = r[j]; + char rc2l = llvm::toLower(rc); + if (rc == rc2l) { + // Past the run of upper-case characters. + break; } - if (!ach && !bch) { - return true; - } else if (!ach || !bch || *ach != *bch) { + if (l[j] != rc2l) + return false; + } + // Process the lower case characters. 
+ for (; j < ls; j++) { + if (l[j] != r[j]) { return false; } - ++a, ++b; } + return true; } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Parse a CLI enum option and return the enum index and whether it should be +// enabled (true) or disabled (false). +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find) { + auto parts = splitHyphenated(input); + bool negated = false; + if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { + negated = true; + // Remove the "no" part + parts = llvm::SmallVector(parts.begin() + 1, parts.end()); + } + size_t chars = 0; + for (auto p : parts) { + chars += p.size(); + } + auto pred = [&](auto s) { + if (chars != s.size()) { + return false; + } + auto ccParts = splitCamelCase(s); + auto num_ccParts = ccParts.size(); + if (parts.size() != num_ccParts) { + return false; + } + for (size_t i{0}; i < num_ccParts; i++) { + if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { + return false; } } - } - return std::nullopt; + return true; + }; + auto cast = [negated](int x) { return std::pair{!negated, x}; }; + return fmap>(find(pred), cast); } -std::optional FindLanguageFeature(const char *name) { - return ScanEnum(name); +std::optional> parseCLILanguageFeature( + llvm::StringRef input) { + return parseCLIEnum(input, FindLanguageFeatureIndex); } -std::optional FindUsageWarning(const char *name) { - return ScanEnum(name); +std::optional> parseCLIUsageWarning( + llvm::StringRef input) { + return parseCLIEnum(input, FindUsageWarningIndex); +} + +// Take a string from the CLI and apply it to the LanguageFeatureControl. +// Return true if the option was recognized and applied. 
+bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { + if (auto result = parseCLILanguageFeature(input)) { + EnableWarning(result->second, result->first); + return true; + } else if (auto result = parseCLIUsageWarning(input)) { + EnableWarning(result->second, result->first); + return true; + } + return false; } std::vector LanguageFeatureControl::GetNames( @@ -201,4 +269,32 @@ std::vector LanguageFeatureControl::GetNames( } } +template +void ForEachEnum(std::function f) { + for (size_t j{0}; j < N; ++j) { + f(static_cast(j)); + } +} + +void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { + warnAllLanguage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + // should be equivalent to: reset().flip() set ... + ForEachEnum( + [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + if (yes) { + // These three features do not need to be warned about, + // but we do want their feature flags. + warnLanguage_.set(LanguageFeature::OpenMP, false); + warnLanguage_.set(LanguageFeature::OpenACC, false); + warnLanguage_.set(LanguageFeature::CUDA, false); + } +} + +void LanguageFeatureControl::WarnOnAllUsage(bool yes) { + warnAllUsage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + ForEachEnum( + [&](UsageWarning w) { warnUsage_.set(w, yes); }); +} } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp new file mode 100644 index 0000000000000..ed11318382b35 --- /dev/null +++ b/flang/lib/Support/enum-class.cpp @@ -0,0 +1,24 @@ +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include +#include +namespace Fortran::common { + +std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; + } + } + return std::nullopt; +} + + +} // namespace Fortran::common \ No newline at end of file diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 new file mode 100644 index 0000000000000..8a58e63cfa3ac --- /dev/null +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -0,0 +1,19 @@ +! RUN: %flang -Wknown-bad-implicit-interface %s -c 2>&1 | FileCheck %s --check-prefix=WARN +! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty +! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 +! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 +! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface +! ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface + +program disable_diagnostic + REAL :: x + INTEGER :: y + ! CHECK-NOT: warning + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(x) + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(y) +end program disable_diagnostic + +subroutine sub() +end subroutine sub \ No newline at end of file diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 58adf6f745d5e..33f0aff8a1739 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -1,6 +1,7 @@ ! Ensure that only argument -Werror is supported. -! 
RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG -! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG +! RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG1 +! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 -! WRONG: Only `-Werror` is supported currently. +! WRONG1: error: Unknown diagnostic option: -Wall +! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file diff --git a/flang/test/Driver/wextra-ok.f90 b/flang/test/Driver/wextra-ok.f90 index 441029aa0af27..db15c7f14aa35 100644 --- a/flang/test/Driver/wextra-ok.f90 +++ b/flang/test/Driver/wextra-ok.f90 @@ -5,7 +5,7 @@ ! RUN: not %flang -std=f2018 -Wblah -Wextra %s -c 2>&1 | FileCheck %s --check-prefix=WRONG ! CHECK-OK: the warning option '-Wextra' is not supported -! WRONG: Only `-Werror` is supported currently. +! WRONG: Unknown diagnostic option: -Wblah program wextra_ok end program wextra_ok diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index bda02ed29a5ef..19cc5a20fecf4 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -1,3 +1,6 @@ add_flang_unittest(FlangCommonTests + EnumClassTests.cpp FastIntSetTest.cpp + FortranFeaturesTest.cpp ) +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp new file mode 100644 index 0000000000000..f67c453cfad15 --- /dev/null +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -0,0 +1,45 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Common/template.h" +#include "gtest/gtest.h" + +using namespace Fortran::common; +using namespace std; + +ENUM_CLASS(TestEnum, One, Two, + Three) +ENUM_CLASS_EXTRA(TestEnum) + +TEST(EnumClassTest, EnumToString) { + ASSERT_EQ(EnumToString(TestEnum::One), "One"); + ASSERT_EQ(EnumToString(TestEnum::Two), "Two"); + ASSERT_EQ(EnumToString(TestEnum::Three), "Three"); +} + +TEST(EnumClassTest, EnumToStringData) { + ASSERT_STREQ(EnumToString(TestEnum::One).data(), "One, Two, Three"); +} + +TEST(EnumClassTest, StringToEnum) { + ASSERT_EQ(StringToTestEnum("One"), std::optional{TestEnum::One}); + ASSERT_EQ(StringToTestEnum("Two"), std::optional{TestEnum::Two}); + ASSERT_EQ(StringToTestEnum("Three"), std::optional{TestEnum::Three}); + ASSERT_EQ(StringToTestEnum("Four"), std::nullopt); + ASSERT_EQ(StringToTestEnum(""), std::nullopt); + ASSERT_EQ(StringToTestEnum("One, Two, Three"), std::nullopt); +} + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, FindNameNormal) { + auto p1 = [](auto s) { return s == "TwentyOne"; }; + ASSERT_EQ(FindTestEnumExtra(p1), std::optional{TestEnumExtra::TwentyOne}); +} diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp new file mode 100644 index 0000000000000..7ec7054f14f6e --- /dev/null +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -0,0 +1,142 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Support/Fortran-features.h" +#include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/ErrorHandling.h" +#include "gtest/gtest.h" + +namespace Fortran::common { + +// Not currently exported from Fortran-features.h +llvm::SmallVector splitCamelCase(llvm::StringRef input); +llvm::SmallVector splitHyphenated(llvm::StringRef input); +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, SplitCamelCase) { + + auto parts = splitCamelCase("oP"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("o", 1))) { + ADD_FAILURE() << "First part is not o"; + } + if (parts[1].compare(llvm::StringRef("P", 1))) { + ADD_FAILURE() << "Second part is not P"; + } + + parts = splitCamelCase("OPName"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("OP", 2))) { + ADD_FAILURE() << "First part is not OP"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("OpName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("Op", 2))) { + ADD_FAILURE() << "First part is not Op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("opName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("op", 2))) { + ADD_FAILURE() << "First part is not op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("FlangTestProgram123"); + ASSERT_EQ(parts.size(), (size_t)3); + if 
(parts[0].compare(llvm::StringRef("Flang", 5))) { + ADD_FAILURE() << "First part is not Flang"; + } + if (parts[1].compare(llvm::StringRef("Test", 4))) { + ADD_FAILURE() << "Second part is not Test"; + } + if (parts[2].compare(llvm::StringRef("Program123", 10))) { + ADD_FAILURE() << "Third part is not Program123"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, SplitHyphenated) { + auto parts = splitHyphenated("no-twenty-one"); + ASSERT_EQ(parts.size(), (size_t)3); + if (parts[0].compare(llvm::StringRef("no", 2))) { + ADD_FAILURE() << "First part is not no"; + } + if (parts[1].compare(llvm::StringRef("twenty", 6))) { + ADD_FAILURE() << "Second part is not twenty"; + } + if (parts[2].compare(llvm::StringRef("one", 3))) { + ADD_FAILURE() << "Third part is not one"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); + + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); +} + +std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); +} + 
+TEST(EnumClassTest, parseCLIEnumOption) { + auto result = parseCLITestEnumExtraOption("no-twenty-one"); + auto expected = std::pair(false, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("twenty-one"); + expected = std::pair(true, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-forty-two"); + expected = std::pair(false, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("forty-two"); + expected = std::pair(true, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-seven-seven-seven"); + expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("seven-seven-seven"); + expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); +} + +} // namespace Fortran::common >From 49a0579f9477936b72f0580823b4dd6824697512 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:56:14 -0700 Subject: [PATCH 2/8] adjust headers --- flang/include/flang/Support/Fortran-features.h | 4 +--- flang/lib/Frontend/CompilerInvocation.cpp | 5 ----- flang/lib/Support/Fortran-features.cpp | 1 - 3 files changed, 1 insertion(+), 9 deletions(-) diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index d5aa7357ffea0..4a8b0da4c0d4d 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -11,9 +11,7 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" -#include "flang/Common/idioms.h" -#include "llvm/Support/Error.h" -#include "llvm/Support/raw_ostream.h" +#include "llvm/ADT/StringRef.h" #include #include diff --git a/flang/lib/Frontend/CompilerInvocation.cpp 
b/flang/lib/Frontend/CompilerInvocation.cpp index 9ea568549bd6c..d8bf601d0171d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -20,11 +20,9 @@ #include "flang/Support/Version.h" #include "flang/Tools/TargetSetup.h" #include "flang/Version.inc" -#include "clang/Basic/AllDiagnostics.h" #include "clang/Basic/DiagnosticDriver.h" #include "clang/Basic/DiagnosticOptions.h" #include "clang/Driver/Driver.h" -#include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" #include "llvm/ADT/StringRef.h" @@ -34,9 +32,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" -#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" -#include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" #include "llvm/Support/Process.h" #include "llvm/Support/raw_ostream.h" @@ -46,7 +42,6 @@ #include #include #include -#include using namespace Fortran::frontend; diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 55abf0385d185..0e394162ef577 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -10,7 +10,6 @@ #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" -#include "llvm/Support/raw_ostream.h" namespace Fortran::common { >From fa2db7090c6d374ce1a835ad26d19a1d7bd42262 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:57:22 -0700 Subject: [PATCH 3/8] reformat --- flang/lib/Support/enum-class.cpp | 20 ++++++++++--------- flang/unittests/Common/EnumClassTests.cpp | 5 ++--- .../unittests/Common/FortranFeaturesTest.cpp | 18 ++++++++++------- 3 files changed, 24 insertions(+), 19 deletions(-) diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ed11318382b35..ac57f27ef1c9e 100644 --- 
a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -1,4 +1,5 @@ -//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ +//-*-===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -7,18 +8,19 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" -#include #include +#include namespace Fortran::common { -std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { - if (pred(names[i])) { - return i; - } +std::optional FindEnumIndex( + std::function pred, int size, + const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; } - return std::nullopt; + } + return std::nullopt; } - } // namespace Fortran::common \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp index f67c453cfad15..c9224a8ceba54 100644 --- a/flang/unittests/Common/EnumClassTests.cpp +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -6,15 +6,14 @@ // //===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Common/template.h" -#include "gtest/gtest.h" using namespace Fortran::common; using namespace std; -ENUM_CLASS(TestEnum, One, Two, - Three) +ENUM_CLASS(TestEnum, One, Two, Three) ENUM_CLASS_EXTRA(TestEnum) TEST(EnumClassTest, EnumToString) { diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 7ec7054f14f6e..597928e7fe56e 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -6,12 +6,12 @@ // 
//===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Support/Fortran-features.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" -#include "gtest/gtest.h" namespace Fortran::common { @@ -34,7 +34,7 @@ TEST(EnumClassTest, SplitCamelCase) { if (parts[1].compare(llvm::StringRef("P", 1))) { ADD_FAILURE() << "Second part is not Name"; } - + parts = splitCamelCase("OPName"); ASSERT_EQ(parts.size(), (size_t)2); @@ -114,13 +114,15 @@ TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); } -std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); +std::optional> parseCLITestEnumExtraOption( + llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); } TEST(EnumClassTest, parseCLIEnumOption) { auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = std::pair(false, TestEnumExtra::TwentyOne); + auto expected = + std::pair(false, TestEnumExtra::TwentyOne); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("twenty-one"); expected = std::pair(true, TestEnumExtra::TwentyOne); @@ -132,10 +134,12 @@ TEST(EnumClassTest, parseCLIEnumOption) { expected = std::pair(true, TestEnumExtra::FortyTwo); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(false, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(true, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); } >From 
5f3feb64c1a97500e2808114d44bb07aa4ccb00c Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 15:58:43 -0700 Subject: [PATCH 4/8] addressing feedback --- flang/include/flang/Common/enum-class.h | 53 +++--- flang/include/flang/Common/optional.h | 7 + .../include/flang/Support/Fortran-features.h | 16 -- flang/lib/Support/Fortran-features.cpp | 175 ++++++++---------- flang/lib/Support/enum-class.cpp | 15 +- flang/test/Driver/disable-diagnostic.f90 | 3 +- flang/test/Driver/werror-wrong.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 2 +- .../unittests/Common/FortranFeaturesTest.cpp | 159 +++------------- 9 files changed, 153 insertions(+), 279 deletions(-) diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index baf9fe418141d..3dbd11bb4057c 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -17,9 +17,9 @@ #ifndef FORTRAN_COMMON_ENUM_CLASS_H_ #define FORTRAN_COMMON_ENUM_CLASS_H_ +#include "optional.h" #include #include -#include #include namespace Fortran::common { @@ -59,26 +59,6 @@ constexpr std::array EnumNames(const char *p) { return result; } -template -std::optional inline fmap(std::optional x, std::function f) { - return x ? std::optional{f(*x)} : std::nullopt; -} - -using Predicate = std::function; -// Finds the first index for which the predicate returns true. -std::optional FindEnumIndex( - Predicate pred, int size, const std::string_view *names); - -using FindEnumIndexType = std::optional( - Predicate, int, const std::string_view *); - -template -std::optional inline FindEnum( - Predicate pred, std::function(Predicate)> find) { - std::function f = [](int x) { return static_cast(x); }; - return fmap(find(pred), f); -} - #define ENUM_CLASS(NAME, ...) 
\ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ @@ -90,17 +70,34 @@ std::optional inline FindEnum( return NAME##_names[static_cast(e)]; \ } +namespace EnumClass { + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +optional FindIndex( + Predicate pred, std::size_t size, const std::string_view *names); + +using FindIndexType = std::function(Predicate)>; + +template +optional inline Find(Predicate pred, FindIndexType findIndex) { + return MapOption( + findIndex(pred), [](int x) { return static_cast(x); }); +} + +} // namespace EnumClass + #define ENUM_CLASS_EXTRA(NAME) \ - [[maybe_unused]] inline std::optional Find##NAME##Index( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnumIndex( \ + [[maybe_unused]] inline optional Find##NAME##Index( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::FindIndex( \ p, NAME##_enumSize, NAME##_names.data()); \ } \ - [[maybe_unused]] inline std::optional Find##NAME( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + [[maybe_unused]] inline optional Find##NAME( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::Find(p, Find##NAME##Index); \ } \ - [[maybe_unused]] inline std::optional StringTo##NAME( \ + [[maybe_unused]] inline optional StringTo##NAME( \ const std::string_view name) { \ return Find##NAME( \ [name](const std::string_view s) -> bool { return name == s; }); \ diff --git a/flang/include/flang/Common/optional.h b/flang/include/flang/Common/optional.h index c7c81f40cc8c8..5b623f01e828d 100644 --- a/flang/include/flang/Common/optional.h +++ b/flang/include/flang/Common/optional.h @@ -27,6 +27,7 @@ #define FORTRAN_COMMON_OPTIONAL_H #include "api-attrs.h" +#include #include #include @@ -238,6 +239,12 @@ using std::nullopt_t; using std::optional; #endif // 
!STD_OPTIONAL_UNSUPPORTED +template +std::optional inline MapOption( + std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + } // namespace Fortran::common #endif // FORTRAN_COMMON_OPTIONAL_H diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 4a8b0da4c0d4d..fd6a9139b7ea7 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -133,21 +133,5 @@ class LanguageFeatureControl { bool disableAllWarnings_{false}; }; -// Parse a CLI enum option return the enum index and whether it should be -// enabled (true) or disabled (false). Just exposed for the template below. -std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find); - -template -std::optional> parseCLIEnum( - llvm::StringRef input, std::function(Predicate)> find) { - using To = std::pair; - using From = std::pair; - static std::function cast = [](From x) { - return std::pair{x.first, static_cast(x.second)}; - }; - return fmap(parseCLIEnumIndex(input, find), cast); -} - } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 0e394162ef577..72ea6639adf51 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -11,6 +11,10 @@ #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" +// Debugging +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/raw_ostream.h" + namespace Fortran::common { LanguageFeatureControl::LanguageFeatureControl() { @@ -95,119 +99,99 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Split a string with camel case into the individual words. -// Note, the small vector is just an array of a few pointers and lengths -// into the original input string. 
So all this allocation should be pretty -// cheap. -llvm::SmallVector splitCamelCase(llvm::StringRef input) { - using namespace llvm; - if (input.empty()) { - return {}; +// Namespace for helper functions for parsing CLI options +// used instead of static so that there can be unit tests for these +// functions. +namespace FortranFeaturesHelpers { +// Check if Lower Case Hyphenated words are equal to Camel Case words. +// Because of out use case we know that 'r' the camel case string is +// well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. +// This is checked in the enum-class.h file. +bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { + size_t ls{l.size()}, rs{r.size()}; + if (ls < rs) { + return false; } - SmallVector parts{}; - parts.reserve(input.size()); - auto check = [&input](size_t j, function_ref predicate) { - return j < input.size() && predicate(input[j]); - }; - size_t i{0}; - size_t startWord = i; - for (; i < input.size(); i++) { - if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || - ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { - parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); - startWord = i + 1; + bool atStartOfWord{true}; + size_t wordCount{0}, j; // j is the number of word characters checked in r. + for (; j < rs; j++) { + if (wordCount + j >= ls) { + // `l` was shorter once the hiphens were removed. + // If r is null terminated, then we are good. + return r[j] == '\0'; } - } - parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); - return parts; -} - -// Split a string whith hyphens into the individual words. -llvm::SmallVector splitHyphenated(llvm::StringRef input) { - auto parts = llvm::SmallVector{}; - llvm::SplitString(input, parts, "-"); - return parts; -} - -// Check if two strings are equal while normalizing case for the -// right word which is assumed to be a single word in camel case. 
-bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { - size_t ls = l.size(); - if (ls != r.size()) - return false; - size_t j{0}; - // Process the upper case characters. - for (; j < ls; j++) { - char rc = r[j]; - char rc2l = llvm::toLower(rc); - if (rc == rc2l) { - // Past run of Uppers Case; - break; + if (atStartOfWord) { + if (llvm::isUpper(r[j])) { + // Upper Case Run + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else { + atStartOfWord = false; + if (l[wordCount + j] != r[j]) { + return false; + } + } + } else { + if (llvm::isUpper(r[j])) { + atStartOfWord = true; + if (l[wordCount + j] != '-') { + return false; + } + ++wordCount; + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else if (l[wordCount + j] != r[j]) { + return false; + } } - if (l[j] != rc2l) - return false; } - // Process the lower case characters. - for (; j < ls; j++) { - if (l[j] != r[j]) { - return false; - } + // If there are more characters in l after processing all the characters in r. + // then fail unless the string is null terminated. + if (ls > wordCount + j) { + return l[wordCount + j] == '\0'; } return true; } // Parse a CLI enum option return the enum index and whether it should be // enabled (true) or disabled (false). 
-std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find) { - auto parts = splitHyphenated(input); - bool negated = false; - if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { +template +optional> ParseCLIEnum( + llvm::StringRef input, EnumClass::FindIndexType findIndex) { + bool negated{false}; + if (input.starts_with("no-")) { negated = true; - // Remove the "no" part - parts = llvm::SmallVector(parts.begin() + 1, parts.end()); - } - size_t chars = 0; - for (auto p : parts) { - chars += p.size(); + input = input.drop_front(3); } - auto pred = [&](auto s) { - if (chars != s.size()) { - return false; - } - auto ccParts = splitCamelCase(s); - auto num_ccParts = ccParts.size(); - if (parts.size() != num_ccParts) { - return false; - } - for (size_t i{0}; i < num_ccParts; i++) { - if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { - return false; - } - } - return true; - }; - auto cast = [negated](int x) { return std::pair{!negated, x}; }; - return fmap>(find(pred), cast); + EnumClass::Predicate predicate{ + [input](llvm::StringRef r) { return LowerHyphEqualCamelCase(input, r); }}; + optional x = EnumClass::Find(predicate, findIndex); + return MapOption>( + x, [negated](T x) { return std::pair{!negated, x}; }); } -std::optional> parseCLILanguageFeature( +optional> parseCLIUsageWarning( llvm::StringRef input) { - return parseCLIEnum(input, FindLanguageFeatureIndex); + return ParseCLIEnum(input, FindUsageWarningIndex); } -std::optional> parseCLIUsageWarning( +optional> parseCLILanguageFeature( llvm::StringRef input) { - return parseCLIEnum(input, FindUsageWarningIndex); + return ParseCLIEnum(input, FindLanguageFeatureIndex); } +} // namespace FortranFeaturesHelpers + // Take a string from the CLI and apply it to the LanguageFeatureControl. // Return true if the option was applied recognized. 
bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { - if (auto result = parseCLILanguageFeature(input)) { + if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(input)) { EnableWarning(result->second, result->first); return true; - } else if (auto result = parseCLIUsageWarning(input)) { + } else if (auto result = + FortranFeaturesHelpers::parseCLIUsageWarning(input)) { EnableWarning(result->second, result->first); return true; } @@ -277,11 +261,10 @@ void ForEachEnum(std::function f) { void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { warnAllLanguage_ = yes; - disableAllWarnings_ = yes ? false : disableAllWarnings_; - // should be equivalent to: reset().flip() set ... - ForEachEnum( - [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + warnLanguage_.reset(); if (yes) { + disableAllWarnings_ = false; + warnLanguage_.flip(); // These three features do not need to be warned about, // but we do want their feature flags. warnLanguage_.set(LanguageFeature::OpenMP, false); @@ -292,8 +275,10 @@ void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { void LanguageFeatureControl::WarnOnAllUsage(bool yes) { warnAllUsage_ = yes; - disableAllWarnings_ = yes ? 
false : disableAllWarnings_; - ForEachEnum( - [&](UsageWarning w) { warnUsage_.set(w, yes); }); + warnUsage_.reset(); + if (yes) { + disableAllWarnings_ = false; + warnUsage_.flip(); + } } } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ac57f27ef1c9e..d6d0ee758175b 100644 --- a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -8,19 +8,20 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" +#include "flang/Common/optional.h" #include -#include -namespace Fortran::common { -std::optional FindEnumIndex( - std::function pred, int size, +namespace Fortran::common::EnumClass { + +optional FindIndex( + std::function pred, size_t size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { + for (size_t i = 0; i < size; ++i) { if (pred(names[i])) { return i; } } - return std::nullopt; + return nullopt; } -} // namespace Fortran::common \ No newline at end of file +} // namespace Fortran::common::EnumClass diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 index 8a58e63cfa3ac..849489377da12 100644 --- a/flang/test/Driver/disable-diagnostic.f90 +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -2,6 +2,7 @@ ! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty ! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 ! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 + ! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface ! 
ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface @@ -16,4 +17,4 @@ program disable_diagnostic end program disable_diagnostic subroutine sub() -end subroutine sub \ No newline at end of file +end subroutine sub diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 33f0aff8a1739..6e3c7cca15bc7 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -4,4 +4,4 @@ ! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 ! WRONG1: error: Unknown diagnostic option: -Wall -! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file +! WRONG2: error: Unknown diagnostic option: -WX diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index 19cc5a20fecf4..3149cb9f7bc47 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -3,4 +3,4 @@ add_flang_unittest(FlangCommonTests FastIntSetTest.cpp FortranFeaturesTest.cpp ) -target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 597928e7fe56e..e12aff9f7b735 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -12,135 +12,34 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" - -namespace Fortran::common { - -// Not currently exported from Fortran-features.h -llvm::SmallVector splitCamelCase(llvm::StringRef input); -llvm::SmallVector splitHyphenated(llvm::StringRef input); -bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); - -ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) -ENUM_CLASS_EXTRA(TestEnumExtra) - -TEST(EnumClassTest, SplitCamelCase) { - 
- auto parts = splitCamelCase("oP"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("o", 1))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("P", 1))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OPName"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("OP", 2))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OpName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("Op", 2))) { - ADD_FAILURE() << "First part is not Op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("opName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("op", 2))) { - ADD_FAILURE() << "First part is not op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("FlangTestProgram123"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("Flang", 5))) { - ADD_FAILURE() << "First part is not Flang"; - } - if (parts[1].compare(llvm::StringRef("Test", 4))) { - ADD_FAILURE() << "Second part is not Test"; - } - if (parts[2].compare(llvm::StringRef("Program123", 10))) { - ADD_FAILURE() << "Third part is not Program123"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, SplitHyphenated) { - auto parts = splitHyphenated("no-twenty-one"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("no", 2))) { - ADD_FAILURE() << "First part is not twenty"; - } - if (parts[1].compare(llvm::StringRef("twenty", 6))) { - ADD_FAILURE() << "Second part is not one"; - } - if (parts[2].compare(llvm::StringRef("one", 3))) { 
- ADD_FAILURE() << "Third part is not one"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); - - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); -} - -std::optional> parseCLITestEnumExtraOption( - llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); -} - -TEST(EnumClassTest, parseCLIEnumOption) { - auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = - std::pair(false, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("twenty-one"); - expected = std::pair(true, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-forty-two"); - expected = std::pair(false, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("forty-two"); - expected = std::pair(true, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = - std::pair(false, 
TestEnumExtra::SevenSevenSeven); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = - std::pair(true, TestEnumExtra::SevenSevenSeven); - ASSERT_EQ(result, std::optional{expected}); +#include + +namespace Fortran::common::FortranFeaturesHelpers { + +optional> parseCLIUsageWarning( + llvm::StringRef input); +TEST(EnumClassTest, ParseCLIUsageWarning) { + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("twenty-one")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-")), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("Portability"), std::nullopt); + auto expect{std::pair{false, UsageWarning::Portability}}; + ASSERT_EQ(parseCLIUsageWarning("no-portability"), expect); + expect.first = true; + ASSERT_EQ((parseCLIUsageWarning("portability")), expect); + expect = + std::pair{false, Fortran::common::UsageWarning::PointerToUndefinable}; + ASSERT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable")), expect); + expect.first = true; + ASSERT_EQ((parseCLIUsageWarning("pointer-to-undefinable")), expect); + EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable"), std::nullopt); } -} // namespace Fortran::common +} // namespace Fortran::common::FortranFeaturesHelpers >From 79303b42f7cfd3806c22bd34e5eced5f27d27f32 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 30 May 2025 15:42:27 -0700 Subject: [PATCH 5/8] removing debugging statement --- 
 flang/lib/Support/Fortran-features.cpp | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp
index 72ea6639adf51..75baa0b096af0 100644
--- a/flang/lib/Support/Fortran-features.cpp
+++ b/flang/lib/Support/Fortran-features.cpp
@@ -11,10 +11,6 @@
 #include "flang/Support/Fortran.h"
 #include "llvm/ADT/StringExtras.h"
 
-// Debugging
-#include "llvm/ADT/StringRef.h"
-#include "llvm/Support/raw_ostream.h"
-
 namespace Fortran::common {
 
 LanguageFeatureControl::LanguageFeatureControl() {

>From 8f0aa22125528a755ec61af2bd45b6c314cfe45c Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 30 May 2025 15:59:18 -0700
Subject: [PATCH 6/8] more feedback

---
 flang/include/flang/Support/Fortran-features.h | 4 +---
 flang/lib/Support/Fortran-features.cpp | 16 +++++++++-------
 flang/unittests/Common/FortranFeaturesTest.cpp | 4 ----
 3 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h
index fd6a9139b7ea7..501b183cceeec 100644
--- a/flang/include/flang/Support/Fortran-features.h
+++ b/flang/include/flang/Support/Fortran-features.h
@@ -11,8 +11,6 @@
 #include "Fortran.h"
 #include "flang/Common/enum-set.h"
-#include "llvm/ADT/StringRef.h"
-#include
 #include
 
 namespace Fortran::common {
 
@@ -115,7 +113,7 @@ class LanguageFeatureControl {
     DisableAllNonstandardWarnings();
     DisableAllUsageWarnings();
   }
-  bool applyCLIOption(llvm::StringRef input);
+  bool applyCLIOption(std::string_view input);
   bool AreWarningsDisabled() const { return disableAllWarnings_; }
   bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); }
   bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); }
diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp
index 75baa0b096af0..d140ecdff7f24 100644
--- a/flang/lib/Support/Fortran-features.cpp
+++ b/flang/lib/Support/Fortran-features.cpp
@@ -99,11 +99,11 @@ LanguageFeatureControl::LanguageFeatureControl() {
 // used instead of static so that there can be unit tests for these
 // functions.
 namespace FortranFeaturesHelpers {
-// Check if Lower Case Hyphenated words are equal to Camel Case words.
+// Check if lower case hyphenated words are equal to camel case words.
 // Because of out use case we know that 'r' the camel case string is
 // well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*.
 // This is checked in the enum-class.h file.
-bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) {
+static bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) {
   size_t ls{l.size()}, rs{r.size()};
   if (ls < rs) {
     return false;
@@ -161,8 +161,9 @@ optional> ParseCLIEnum(
     negated = true;
     input = input.drop_front(3);
   }
-  EnumClass::Predicate predicate{
-      [input](llvm::StringRef r) { return LowerHyphEqualCamelCase(input, r); }};
+  EnumClass::Predicate predicate{[input](std::string_view r) {
+    return LowerHyphEqualCamelCase(input, r);
+  }};
   optional x = EnumClass::Find(predicate, findIndex);
   return MapOption>(
       x, [negated](T x) { return std::pair{!negated, x}; });
@@ -182,12 +183,13 @@ optional> parseCLILanguageFeature(
 
 // Take a string from the CLI and apply it to the LanguageFeatureControl.
 // Return true if the option was applied recognized.
-bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) {
-  if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(input)) {
+bool LanguageFeatureControl::applyCLIOption(std::string_view input) {
+  llvm::StringRef inputRef{input};
+  if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(inputRef)) {
     EnableWarning(result->second, result->first);
     return true;
   } else if (auto result =
-                 FortranFeaturesHelpers::parseCLIUsageWarning(input)) {
+                 FortranFeaturesHelpers::parseCLIUsageWarning(inputRef)) {
     EnableWarning(result->second, result->first);
     return true;
   }
diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp
index e12aff9f7b735..b3f0c31a57025 100644
--- a/flang/unittests/Common/FortranFeaturesTest.cpp
+++ b/flang/unittests/Common/FortranFeaturesTest.cpp
@@ -7,11 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "gtest/gtest.h"
-#include "flang/Common/enum-class.h"
 #include "flang/Support/Fortran-features.h"
-#include "llvm/ADT/SmallVector.h"
-#include "llvm/ADT/StringRef.h"
-#include "llvm/Support/ErrorHandling.h"
 #include
 
 namespace Fortran::common::FortranFeaturesHelpers {

>From a0317745bca77a1134e116fd570b4ecca60e4d95 Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 30 May 2025 17:02:17 -0700
Subject: [PATCH 7/8] adding insensitive match back

---
 .../include/flang/Support/Fortran-features.h | 2 +-
 flang/lib/Support/Fortran-features.cpp | 86 +++++++++++++++----
 .../unittests/Common/FortranFeaturesTest.cpp | 75 +++++++++++-----
 3 files changed, 123 insertions(+), 40 deletions(-)

diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h
index 501b183cceeec..0b55a3175580a 100644
--- a/flang/include/flang/Support/Fortran-features.h
+++ b/flang/include/flang/Support/Fortran-features.h
@@ -113,7 +113,7 @@ class LanguageFeatureControl {
     DisableAllNonstandardWarnings();
     DisableAllUsageWarnings();
   }
-  bool applyCLIOption(std::string_view input);
+  bool applyCLIOption(std::string_view input, bool insensitive = false);
   bool AreWarningsDisabled() const { return disableAllWarnings_; }
   bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); }
   bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); }
diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp
index d140ecdff7f24..80e87615697df 100644
--- a/flang/lib/Support/Fortran-features.cpp
+++ b/flang/lib/Support/Fortran-features.cpp
@@ -8,6 +8,7 @@
 
 #include "flang/Support/Fortran-features.h"
 #include "flang/Common/idioms.h"
+#include "flang/Common/optional.h"
 #include "flang/Support/Fortran.h"
 #include "llvm/ADT/StringExtras.h"
 
@@ -99,11 +100,48 @@ LanguageFeatureControl::LanguageFeatureControl() {
 // used instead of static so that there can be unit tests for these
 // functions.
 namespace FortranFeaturesHelpers {
+
+// Ignore case and any inserted punctuation (like '-'/'_')
+static std::optional GetWarningChar(char ch) {
+  if (ch >= 'a' && ch <= 'z') {
+    return ch;
+  } else if (ch >= 'A' && ch <= 'Z') {
+    return ch - 'A' + 'a';
+  } else if (ch >= '0' && ch <= '9') {
+    return ch;
+  } else {
+    return std::nullopt;
+  }
+}
+
+// Check for case and punctuation insensitive string equality.
+// NB, b is probably not null terminated, so don't treat is like a C string.
+static bool InsensitiveWarningNameMatch(
+    std::string_view a, std::string_view b) {
+  size_t j{0}, aSize{a.size()}, k{0}, bSize{b.size()};
+  while (true) {
+    optional ach{nullopt};
+    while (!ach && j < aSize) {
+      ach = GetWarningChar(a[j++]);
+    }
+    optional bch{};
+    while (!bch && k < bSize) {
+      bch = GetWarningChar(b[k++]);
+    }
+    if (!ach && !bch) {
+      return true;
+    } else if (!ach || !bch || *ach != *bch) {
+      return false;
+    }
+    ach = bch = nullopt;
+  }
+}
+
 // Check if lower case hyphenated words are equal to camel case words.
// Because of out use case we know that 'r' the camel case string is // well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. // This is checked in the enum-class.h file. -static bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { +static bool SensitiveWarningNameMatch(llvm::StringRef l, llvm::StringRef r) { size_t ls{l.size()}, rs{r.size()}; if (ls < rs) { return false; @@ -154,42 +192,56 @@ static bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { // Parse a CLI enum option return the enum index and whether it should be // enabled (true) or disabled (false). template -optional> ParseCLIEnum( - llvm::StringRef input, EnumClass::FindIndexType findIndex) { +optional> ParseCLIEnum(llvm::StringRef input, + EnumClass::FindIndexType findIndex, bool insensitive) { bool negated{false}; - if (input.starts_with("no-")) { - negated = true; - input = input.drop_front(3); + EnumClass::Predicate predicate; + if (insensitive) { + if (input.starts_with_insensitive("no")) { + negated = true; + input = input.drop_front(2); + } + predicate = [input](std::string_view r) { + return InsensitiveWarningNameMatch(input, r); + }; + } else { + if (input.starts_with("no-")) { + negated = true; + input = input.drop_front(3); + } + predicate = [input](std::string_view r) { + return SensitiveWarningNameMatch(input, r); + }; } - EnumClass::Predicate predicate{[input](std::string_view r) { - return LowerHyphEqualCamelCase(input, r); - }}; optional x = EnumClass::Find(predicate, findIndex); return MapOption>( x, [negated](T x) { return std::pair{!negated, x}; }); } optional> parseCLIUsageWarning( - llvm::StringRef input) { - return ParseCLIEnum(input, FindUsageWarningIndex); + llvm::StringRef input, bool insensitive) { + return ParseCLIEnum(input, FindUsageWarningIndex, insensitive); } optional> parseCLILanguageFeature( - llvm::StringRef input) { - return ParseCLIEnum(input, FindLanguageFeatureIndex); + llvm::StringRef input, bool insensitive) { + 
return ParseCLIEnum( + input, FindLanguageFeatureIndex, insensitive); } } // namespace FortranFeaturesHelpers // Take a string from the CLI and apply it to the LanguageFeatureControl. // Return true if the option was applied recognized. -bool LanguageFeatureControl::applyCLIOption(std::string_view input) { +bool LanguageFeatureControl::applyCLIOption( + std::string_view input, bool insensitive) { llvm::StringRef inputRef{input}; - if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(inputRef)) { + if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature( + inputRef, insensitive)) { EnableWarning(result->second, result->first); return true; - } else if (auto result = - FortranFeaturesHelpers::parseCLIUsageWarning(inputRef)) { + } else if (auto result = FortranFeaturesHelpers::parseCLIUsageWarning( + inputRef, insensitive)) { EnableWarning(result->second, result->first); return true; } diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index b3f0c31a57025..4e9529d633ad9 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -13,29 +13,60 @@ namespace Fortran::common::FortranFeaturesHelpers { optional> parseCLIUsageWarning( - llvm::StringRef input); + llvm::StringRef input, bool insensitive); TEST(EnumClassTest, ParseCLIUsageWarning) { - EXPECT_EQ((parseCLIUsageWarning("no-twenty-one")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("twenty-one")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("no")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("no-")), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("Portability"), std::nullopt); - auto expect{std::pair{false, UsageWarning::Portability}}; - 
ASSERT_EQ(parseCLIUsageWarning("no-portability"), expect); - expect.first = true; - ASSERT_EQ((parseCLIUsageWarning("portability")), expect); - expect = - std::pair{false, Fortran::common::UsageWarning::PointerToUndefinable}; - ASSERT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable")), expect); - expect.first = true; - ASSERT_EQ((parseCLIUsageWarning("pointer-to-undefinable")), expect); - EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable"), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable"), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable"), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable"), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("twenty-one", false)), std::nullopt); + EXPECT_EQ( + (parseCLIUsageWarning("no-seven-seven-seven", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-", false)), std::nullopt); + + EXPECT_EQ(parseCLIUsageWarning("Portability", false), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-portability", false)), + (std::optional{std::pair{false, UsageWarning::Portability}})); + EXPECT_EQ((parseCLIUsageWarning("portability", false)), + (std::optional{std::pair{true, UsageWarning::Portability}})); + EXPECT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable", false)), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ((parseCLIUsageWarning("pointer-to-undefinable", false)), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable", false), std::nullopt); + EXPECT_EQ( + parseCLIUsageWarning("NoPointerToUndefinable", false), std::nullopt); + 
EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable", false), std::nullopt); + EXPECT_EQ( + parseCLIUsageWarning("nopointertoundefinable", false), std::nullopt); + + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("twenty-one", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-", true)), std::nullopt); + + EXPECT_EQ(parseCLIUsageWarning("Portability", true), + (std::optional{std::pair{true, UsageWarning::Portability}})); + EXPECT_EQ(parseCLIUsageWarning("no-portability", true), + (std::optional{std::pair{false, UsageWarning::Portability}})); + + EXPECT_EQ((parseCLIUsageWarning("portability", true)), + (std::optional{std::pair{true, UsageWarning::Portability}})); + EXPECT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable", true)), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ((parseCLIUsageWarning("pointer-to-undefinable", true)), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable", true), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable", true), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable", true), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable", true), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); } } // namespace Fortran::common::FortranFeaturesHelpers >From d37976f4167677294227c976b4aea03fdb130462 Mon Sep 17 00:00:00 2001 From: 
Andre Kuhlenschmidt Date: Fri, 30 May 2025 18:01:35 -0700 Subject: [PATCH 8/8] fixing enum name --- flang/include/flang/Support/Fortran-features.h | 2 +- flang/lib/Semantics/tools.cpp | 2 +- flang/unittests/Common/FortranFeaturesTest.cpp | 5 ++++- 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 0b55a3175580a..899959ad8a435 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -60,7 +60,7 @@ ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, NonTargetPassedToTarget, PointerToPossibleNoncontiguous, ShortCharacterActual, ShortArrayActual, ImplicitInterfaceActual, PolymorphicTransferArg, PointerComponentTransferArg, TransferSizePresence, - F202XAllocatableBreakingChange, OptionalMustBePresent, CommonBlockPadding, + F202xAllocatableBreakingChange, OptionalMustBePresent, CommonBlockPadding, LogicalVsCBool, BindCCharLength, ProcDummyArgShapes, ExternalNameConflict, FoldingException, FoldingAvoidsRuntimeCrash, FoldingValueChecks, FoldingFailure, FoldingLimit, Interoperability, CharacterInteroperability, diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp index 1d1e3ac044166..e8da757416cc6 100644 --- a/flang/lib/Semantics/tools.cpp +++ b/flang/lib/Semantics/tools.cpp @@ -1672,7 +1672,7 @@ std::forward_list GetAllNames( void WarnOnDeferredLengthCharacterScalar(SemanticsContext &context, const SomeExpr *expr, parser::CharBlock at, const char *what) { if (context.languageFeatures().ShouldWarn( - common::UsageWarning::F202XAllocatableBreakingChange)) { + common::UsageWarning::F202xAllocatableBreakingChange)) { if (const Symbol * symbol{evaluate::UnwrapWholeSymbolOrComponentDataRef(expr)}) { const Symbol &ultimate{ResolveAssociations(*symbol)}; diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 
4e9529d633ad9..0e48697182ff5 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -52,7 +52,6 @@ TEST(EnumClassTest, ParseCLIUsageWarning) { (std::optional{std::pair{true, UsageWarning::Portability}})); EXPECT_EQ(parseCLIUsageWarning("no-portability", true), (std::optional{std::pair{false, UsageWarning::Portability}})); - EXPECT_EQ((parseCLIUsageWarning("portability", true)), (std::optional{std::pair{true, UsageWarning::Portability}})); EXPECT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable", true)), @@ -67,6 +66,10 @@ TEST(EnumClassTest, ParseCLIUsageWarning) { (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable", true), (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); + + EXPECT_EQ(parseCLIUsageWarning("f202x-allocatable-breaking-change", false), + (std::optional{ + std::pair{true, UsageWarning::F202xAllocatableBreakingChange}})); } } // namespace Fortran::common::FortranFeaturesHelpers From flang-commits at lists.llvm.org Fri May 30 18:12:59 2025 From: flang-commits at lists.llvm.org (Andre Kuhlenschmidt via flang-commits) Date: Fri, 30 May 2025 18:12:59 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683a579b.170a0220.15079f.97f8@mx.google.com> https://github.com/akuhlens updated https://github.com/llvm/llvm-project/pull/142022 >From 8f3fd2daab46f477e87043c66b3049dff4a5b20e Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:11:04 -0700 Subject: [PATCH 1/9] initial commit --- flang/include/flang/Common/enum-class.h | 47 ++++- .../include/flang/Support/Fortran-features.h | 51 ++++-- flang/lib/Frontend/CompilerInvocation.cpp | 62 ++++--- flang/lib/Support/CMakeLists.txt | 1 + flang/lib/Support/Fortran-features.cpp | 168 ++++++++++++++---- flang/lib/Support/enum-class.cpp | 24 +++ 
flang/test/Driver/disable-diagnostic.f90 | 19 ++ flang/test/Driver/werror-wrong.f90 | 7 +- flang/test/Driver/wextra-ok.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 3 + flang/unittests/Common/EnumClassTests.cpp | 45 +++++ .../unittests/Common/FortranFeaturesTest.cpp | 142 +++++++++++++++ 12 files changed, 483 insertions(+), 88 deletions(-) create mode 100644 flang/lib/Support/enum-class.cpp create mode 100644 flang/test/Driver/disable-diagnostic.f90 create mode 100644 flang/unittests/Common/EnumClassTests.cpp create mode 100644 flang/unittests/Common/FortranFeaturesTest.cpp diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index 41575d45091a8..baf9fe418141d 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -18,8 +18,9 @@ #define FORTRAN_COMMON_ENUM_CLASS_H_ #include -#include - +#include +#include +#include namespace Fortran::common { constexpr std::size_t CountEnumNames(const char *p) { @@ -58,15 +59,51 @@ constexpr std::array EnumNames(const char *p) { return result; } +template +std::optional inline fmap(std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +std::optional FindEnumIndex( + Predicate pred, int size, const std::string_view *names); + +using FindEnumIndexType = std::optional( + Predicate, int, const std::string_view *); + +template +std::optional inline FindEnum( + Predicate pred, std::function(Predicate)> find) { + std::function f = [](int x) { return static_cast(x); }; + return fmap(find(pred), f); +} + #define ENUM_CLASS(NAME, ...) 
\ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ ::Fortran::common::CountEnumNames(#__VA_ARGS__)}; \ + [[maybe_unused]] static constexpr std::array NAME##_names{ \ + ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ [[maybe_unused]] static inline std::string_view EnumToString(NAME e) { \ - static const constexpr auto names{ \ - ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ - return names[static_cast(e)]; \ + return NAME##_names[static_cast(e)]; \ } +#define ENUM_CLASS_EXTRA(NAME) \ + [[maybe_unused]] inline std::optional Find##NAME##Index( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnumIndex( \ + p, NAME##_enumSize, NAME##_names.data()); \ + } \ + [[maybe_unused]] inline std::optional Find##NAME( \ + ::Fortran::common::Predicate p) { \ + return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + } \ + [[maybe_unused]] inline std::optional StringTo##NAME( \ + const std::string_view name) { \ + return Find##NAME( \ + [name](const std::string_view s) -> bool { return name == s; }); \ + } } // namespace Fortran::common #endif // FORTRAN_COMMON_ENUM_CLASS_H_ diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index e696da9042480..d5aa7357ffea0 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -12,6 +12,8 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" #include "flang/Common/idioms.h" +#include "llvm/Support/Error.h" +#include "llvm/Support/raw_ostream.h" #include #include @@ -79,12 +81,13 @@ ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable, NullActualForDefaultIntentAllocatable, UseAssociationIntoSameNameSubprogram, HostAssociatedIntentOutInSpecExpr, NonVolatilePointerToVolatile) +// Generate default String -> Enum mapping. 
+ENUM_CLASS_EXTRA(LanguageFeature) +ENUM_CLASS_EXTRA(UsageWarning) + using LanguageFeatures = EnumSet; using UsageWarnings = EnumSet; -std::optional FindLanguageFeature(const char *); -std::optional FindUsageWarning(const char *); - class LanguageFeatureControl { public: LanguageFeatureControl(); @@ -97,8 +100,10 @@ class LanguageFeatureControl { void EnableWarning(UsageWarning w, bool yes = true) { warnUsage_.set(w, yes); } - void WarnOnAllNonstandard(bool yes = true) { warnAllLanguage_ = yes; } - void WarnOnAllUsage(bool yes = true) { warnAllUsage_ = yes; } + void WarnOnAllNonstandard(bool yes = true); + bool IsWarnOnAllNonstandard() const { return warnAllLanguage_; } + void WarnOnAllUsage(bool yes = true); + bool IsWarnOnAllUsage() const { return warnAllUsage_; } void DisableAllNonstandardWarnings() { warnAllLanguage_ = false; warnLanguage_.clear(); @@ -107,16 +112,16 @@ class LanguageFeatureControl { warnAllUsage_ = false; warnUsage_.clear(); } - - bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } - bool ShouldWarn(LanguageFeature f) const { - return (warnAllLanguage_ && f != LanguageFeature::OpenMP && - f != LanguageFeature::OpenACC && f != LanguageFeature::CUDA) || - warnLanguage_.test(f); - } - bool ShouldWarn(UsageWarning w) const { - return warnAllUsage_ || warnUsage_.test(w); + void DisableAllWarnings() { + disableAllWarnings_ = true; + DisableAllNonstandardWarnings(); + DisableAllUsageWarnings(); } + bool applyCLIOption(llvm::StringRef input); + bool AreWarningsDisabled() const { return disableAllWarnings_; } + bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } + bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); } + bool ShouldWarn(UsageWarning w) const { return warnUsage_.test(w); } // Return all spellings of operators names, depending on features enabled std::vector GetNames(LogicalOperator) const; std::vector GetNames(RelationalOperator) const; @@ -127,6 +132,24 @@ class 
LanguageFeatureControl { bool warnAllLanguage_{false}; UsageWarnings warnUsage_; bool warnAllUsage_{false}; + bool disableAllWarnings_{false}; }; + +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). Just exposed for the template below. +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find); + +template +std::optional> parseCLIEnum( + llvm::StringRef input, std::function(Predicate)> find) { + using To = std::pair; + using From = std::pair; + static std::function cast = [](From x) { + return std::pair{x.first, static_cast(x.second)}; + }; + return fmap(parseCLIEnumIndex(input, find), cast); +} + } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp index ba2531819ee5e..9ea568549bd6c 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -34,6 +34,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" +#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" @@ -45,6 +46,7 @@ #include #include #include +#include using namespace Fortran::frontend; @@ -971,10 +973,23 @@ static bool parseSemaArgs(CompilerInvocation &res, llvm::opt::ArgList &args, /// Parses all diagnostics related arguments and populates the variables /// options accordingly. Returns false if new errors are generated. +/// FC1 driver entry point for parsing diagnostic arguments. static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, clang::DiagnosticsEngine &diags) { unsigned numErrorsBefore = diags.getNumErrors(); + auto &features = res.getFrontendOpts().features; + // The order of these flags (-pedantic -W -w) is important and is + // chosen to match clang's behavior. 
+ + // -pedantic + if (args.hasArg(clang::driver::options::OPT_pedantic)) { + features.WarnOnAllNonstandard(); + features.WarnOnAllUsage(); + res.setEnableConformanceChecks(); + res.setEnableUsageChecks(); + } + // -Werror option // TODO: Currently throws a Diagnostic for anything other than -W, // this has to change when other -W's are supported. @@ -984,21 +999,27 @@ static bool parseDiagArgs(CompilerInvocation &res, llvm::opt::ArgList &args, for (const auto &wArg : wArgs) { if (wArg == "error") { res.setWarnAsErr(true); - } else { - const unsigned diagID = - diags.getCustomDiagID(clang::DiagnosticsEngine::Error, - "Only `-Werror` is supported currently."); - diags.Report(diagID); + // -W(no-) + } else if (!features.applyCLIOption(wArg)) { + const unsigned diagID = diags.getCustomDiagID( + clang::DiagnosticsEngine::Error, "Unknown diagnostic option: -W%0"); + diags.Report(diagID) << wArg; } } } + // -w + if (args.hasArg(clang::driver::options::OPT_w)) { + features.DisableAllWarnings(); + res.setDisableWarnings(); + } + // Default to off for `flang -fc1`. - res.getFrontendOpts().showColors = - parseShowColorsArgs(args, /*defaultDiagColor=*/false); + bool showColors = parseShowColorsArgs(args, false); - // Honor color diagnostics. 
- res.getDiagnosticOpts().ShowColors = res.getFrontendOpts().showColors; + diags.getDiagnosticOptions().ShowColors = showColors; + res.getDiagnosticOpts().ShowColors = showColors; + res.getFrontendOpts().showColors = showColors; return diags.getNumErrors() == numErrorsBefore; } @@ -1074,16 +1095,6 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, Fortran::common::LanguageFeature::OpenACC); } - // -pedantic - if (args.hasArg(clang::driver::options::OPT_pedantic)) { - res.setEnableConformanceChecks(); - res.setEnableUsageChecks(); - } - - // -w - if (args.hasArg(clang::driver::options::OPT_w)) - res.setDisableWarnings(); - // -std=f2018 // TODO: Set proper options when more fortran standards // are supported. @@ -1092,6 +1103,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, // We only allow f2018 as the given standard if (standard == "f2018") { res.setEnableConformanceChecks(); + res.getFrontendOpts().features.WarnOnAllNonstandard(); } else { const unsigned diagID = diags.getCustomDiagID(clang::DiagnosticsEngine::Error, @@ -1099,6 +1111,7 @@ static bool parseDialectArgs(CompilerInvocation &res, llvm::opt::ArgList &args, diags.Report(diagID); } } + return diags.getNumErrors() == numErrorsBefore; } @@ -1694,16 +1707,7 @@ void CompilerInvocation::setFortranOpts() { if (frontendOptions.needProvenanceRangeToCharBlockMappings) fortranOptions.needProvenanceRangeToCharBlockMappings = true; - if (getEnableConformanceChecks()) - fortranOptions.features.WarnOnAllNonstandard(); - - if (getEnableUsageChecks()) - fortranOptions.features.WarnOnAllUsage(); - - if (getDisableWarnings()) { - fortranOptions.features.DisableAllNonstandardWarnings(); - fortranOptions.features.DisableAllUsageWarnings(); - } + fortranOptions.features = frontendOptions.features; } std::unique_ptr diff --git a/flang/lib/Support/CMakeLists.txt b/flang/lib/Support/CMakeLists.txt index 363f57ce97dae..9ef31a2a6dcc7 100644 --- 
a/flang/lib/Support/CMakeLists.txt +++ b/flang/lib/Support/CMakeLists.txt @@ -44,6 +44,7 @@ endif() add_flang_library(FortranSupport default-kinds.cpp + enum-class.cpp Flags.cpp Fortran.cpp Fortran-features.cpp diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index bee8984102b82..55abf0385d185 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -9,6 +9,8 @@ #include "flang/Support/Fortran-features.h" #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/Support/raw_ostream.h" namespace Fortran::common { @@ -94,57 +96,123 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Ignore case and any inserted punctuation (like '-'/'_') -static std::optional GetWarningChar(char ch) { - if (ch >= 'a' && ch <= 'z') { - return ch; - } else if (ch >= 'A' && ch <= 'Z') { - return ch - 'A' + 'a'; - } else if (ch >= '0' && ch <= '9') { - return ch; - } else { - return std::nullopt; +// Split a string with camel case into the individual words. +// Note, the small vector is just an array of a few pointers and lengths +// into the original input string. So all this allocation should be pretty +// cheap. 
+llvm::SmallVector splitCamelCase(llvm::StringRef input) { + using namespace llvm; + if (input.empty()) { + return {}; } + SmallVector parts{}; + parts.reserve(input.size()); + auto check = [&input](size_t j, function_ref predicate) { + return j < input.size() && predicate(input[j]); + }; + size_t i{0}; + size_t startWord = i; + for (; i < input.size(); i++) { + if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || + ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { + parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); + startWord = i + 1; + } + } + parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); + return parts; } -static bool WarningNameMatch(const char *a, const char *b) { - while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); - } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); +// Split a string whith hyphens into the individual words. +llvm::SmallVector splitHyphenated(llvm::StringRef input) { + auto parts = llvm::SmallVector{}; + llvm::SplitString(input, parts, "-"); + return parts; +} + +// Check if two strings are equal while normalizing case for the +// right word which is assumed to be a single word in camel case. +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { + size_t ls = l.size(); + if (ls != r.size()) + return false; + size_t j{0}; + // Process the upper case characters. + for (; j < ls; j++) { + char rc = r[j]; + char rc2l = llvm::toLower(rc); + if (rc == rc2l) { + // Past run of Uppers Case; + break; } - if (!ach && !bch) { - return true; - } else if (!ach || !bch || *ach != *bch) { + if (l[j] != rc2l) + return false; + } + // Process the lower case characters. 
+ for (; j < ls; j++) { + if (l[j] != r[j]) { return false; } - ++a, ++b; } + return true; } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). +std::optional> parseCLIEnumIndex( + llvm::StringRef input, std::function(Predicate)> find) { + auto parts = splitHyphenated(input); + bool negated = false; + if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { + negated = true; + // Remove the "no" part + parts = llvm::SmallVector(parts.begin() + 1, parts.end()); + } + size_t chars = 0; + for (auto p : parts) { + chars += p.size(); + } + auto pred = [&](auto s) { + if (chars != s.size()) { + return false; + } + auto ccParts = splitCamelCase(s); + auto num_ccParts = ccParts.size(); + if (parts.size() != num_ccParts) { + return false; + } + for (size_t i{0}; i < num_ccParts; i++) { + if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { + return false; } } - } - return std::nullopt; + return true; + }; + auto cast = [negated](int x) { return std::pair{!negated, x}; }; + return fmap>(find(pred), cast); } -std::optional FindLanguageFeature(const char *name) { - return ScanEnum(name); +std::optional> parseCLILanguageFeature( + llvm::StringRef input) { + return parseCLIEnum(input, FindLanguageFeatureIndex); } -std::optional FindUsageWarning(const char *name) { - return ScanEnum(name); +std::optional> parseCLIUsageWarning( + llvm::StringRef input) { + return parseCLIEnum(input, FindUsageWarningIndex); +} + +// Take a string from the CLI and apply it to the LanguageFeatureControl. +// Return true if the option was applied recognized. 
+bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { + if (auto result = parseCLILanguageFeature(input)) { + EnableWarning(result->second, result->first); + return true; + } else if (auto result = parseCLIUsageWarning(input)) { + EnableWarning(result->second, result->first); + return true; + } + return false; } std::vector LanguageFeatureControl::GetNames( @@ -201,4 +269,32 @@ std::vector LanguageFeatureControl::GetNames( } } +template +void ForEachEnum(std::function f) { + for (size_t j{0}; j < N; ++j) { + f(static_cast(j)); + } +} + +void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { + warnAllLanguage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + // should be equivalent to: reset().flip() set ... + ForEachEnum( + [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + if (yes) { + // These three features do not need to be warned about, + // but we do want their feature flags. + warnLanguage_.set(LanguageFeature::OpenMP, false); + warnLanguage_.set(LanguageFeature::OpenACC, false); + warnLanguage_.set(LanguageFeature::CUDA, false); + } +} + +void LanguageFeatureControl::WarnOnAllUsage(bool yes) { + warnAllUsage_ = yes; + disableAllWarnings_ = yes ? false : disableAllWarnings_; + ForEachEnum( + [&](UsageWarning w) { warnUsage_.set(w, yes); }); +} } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp new file mode 100644 index 0000000000000..ed11318382b35 --- /dev/null +++ b/flang/lib/Support/enum-class.cpp @@ -0,0 +1,24 @@ +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include +#include +namespace Fortran::common { + +std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; + } + } + return std::nullopt; +} + + +} // namespace Fortran::common \ No newline at end of file diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 new file mode 100644 index 0000000000000..8a58e63cfa3ac --- /dev/null +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -0,0 +1,19 @@ +! RUN: %flang -Wknown-bad-implicit-interface %s -c 2>&1 | FileCheck %s --check-prefix=WARN +! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty +! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 +! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 +! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface +! ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface + +program disable_diagnostic + REAL :: x + INTEGER :: y + ! CHECK-NOT: warning + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(x) + ! WARN: warning: If the procedure's interface were explicit, this reference would be in error + call sub(y) +end program disable_diagnostic + +subroutine sub() +end subroutine sub \ No newline at end of file diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 58adf6f745d5e..33f0aff8a1739 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -1,6 +1,7 @@ ! Ensure that only argument -Werror is supported. -! 
RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG -! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG +! RUN: not %flang_fc1 -fsyntax-only -Wall %s 2>&1 | FileCheck %s --check-prefix=WRONG1 +! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 -! WRONG: Only `-Werror` is supported currently. +! WRONG1: error: Unknown diagnostic option: -Wall +! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file diff --git a/flang/test/Driver/wextra-ok.f90 b/flang/test/Driver/wextra-ok.f90 index 441029aa0af27..db15c7f14aa35 100644 --- a/flang/test/Driver/wextra-ok.f90 +++ b/flang/test/Driver/wextra-ok.f90 @@ -5,7 +5,7 @@ ! RUN: not %flang -std=f2018 -Wblah -Wextra %s -c 2>&1 | FileCheck %s --check-prefix=WRONG ! CHECK-OK: the warning option '-Wextra' is not supported -! WRONG: Only `-Werror` is supported currently. +! WRONG: Unknown diagnostic option: -Wblah program wextra_ok end program wextra_ok diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index bda02ed29a5ef..19cc5a20fecf4 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -1,3 +1,6 @@ add_flang_unittest(FlangCommonTests + EnumClassTests.cpp FastIntSetTest.cpp + FortranFeaturesTest.cpp ) +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp new file mode 100644 index 0000000000000..f67c453cfad15 --- /dev/null +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -0,0 +1,45 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Common/template.h" +#include "gtest/gtest.h" + +using namespace Fortran::common; +using namespace std; + +ENUM_CLASS(TestEnum, One, Two, + Three) +ENUM_CLASS_EXTRA(TestEnum) + +TEST(EnumClassTest, EnumToString) { + ASSERT_EQ(EnumToString(TestEnum::One), "One"); + ASSERT_EQ(EnumToString(TestEnum::Two), "Two"); + ASSERT_EQ(EnumToString(TestEnum::Three), "Three"); +} + +TEST(EnumClassTest, EnumToStringData) { + ASSERT_STREQ(EnumToString(TestEnum::One).data(), "One, Two, Three"); +} + +TEST(EnumClassTest, StringToEnum) { + ASSERT_EQ(StringToTestEnum("One"), std::optional{TestEnum::One}); + ASSERT_EQ(StringToTestEnum("Two"), std::optional{TestEnum::Two}); + ASSERT_EQ(StringToTestEnum("Three"), std::optional{TestEnum::Three}); + ASSERT_EQ(StringToTestEnum("Four"), std::nullopt); + ASSERT_EQ(StringToTestEnum(""), std::nullopt); + ASSERT_EQ(StringToTestEnum("One, Two, Three"), std::nullopt); +} + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, FindNameNormal) { + auto p1 = [](auto s) { return s == "TwentyOne"; }; + ASSERT_EQ(FindTestEnumExtra(p1), std::optional{TestEnumExtra::TwentyOne}); +} diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp new file mode 100644 index 0000000000000..7ec7054f14f6e --- /dev/null +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -0,0 +1,142 @@ +//===-- flang/unittests/Common/FastIntSetTest.cpp ---------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "flang/Common/enum-class.h" +#include "flang/Support/Fortran-features.h" +#include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/ErrorHandling.h" +#include "gtest/gtest.h" + +namespace Fortran::common { + +// Not currently exported from Fortran-features.h +llvm::SmallVector splitCamelCase(llvm::StringRef input); +llvm::SmallVector splitHyphenated(llvm::StringRef input); +bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); + +ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) +ENUM_CLASS_EXTRA(TestEnumExtra) + +TEST(EnumClassTest, SplitCamelCase) { + + auto parts = splitCamelCase("oP"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("o", 1))) { + ADD_FAILURE() << "First part is not OP"; + } + if (parts[1].compare(llvm::StringRef("P", 1))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("OPName"); + ASSERT_EQ(parts.size(), (size_t)2); + + if (parts[0].compare(llvm::StringRef("OP", 2))) { + ADD_FAILURE() << "First part is not OP"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("OpName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("Op", 2))) { + ADD_FAILURE() << "First part is not Op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("opName"); + ASSERT_EQ(parts.size(), (size_t)2); + if (parts[0].compare(llvm::StringRef("op", 2))) { + ADD_FAILURE() << "First part is not op"; + } + if (parts[1].compare(llvm::StringRef("Name", 4))) { + ADD_FAILURE() << "Second part is not Name"; + } + + parts = splitCamelCase("FlangTestProgram123"); + ASSERT_EQ(parts.size(), (size_t)3); + if 
(parts[0].compare(llvm::StringRef("Flang", 5))) { + ADD_FAILURE() << "First part is not Flang"; + } + if (parts[1].compare(llvm::StringRef("Test", 4))) { + ADD_FAILURE() << "Second part is not Test"; + } + if (parts[2].compare(llvm::StringRef("Program123", 10))) { + ADD_FAILURE() << "Third part is not Program123"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, SplitHyphenated) { + auto parts = splitHyphenated("no-twenty-one"); + ASSERT_EQ(parts.size(), (size_t)3); + if (parts[0].compare(llvm::StringRef("no", 2))) { + ADD_FAILURE() << "First part is not twenty"; + } + if (parts[1].compare(llvm::StringRef("twenty", 6))) { + ADD_FAILURE() << "Second part is not one"; + } + if (parts[2].compare(llvm::StringRef("one", 3))) { + ADD_FAILURE() << "Third part is not one"; + } + for (auto p : parts) { + llvm::errs() << p << " " << p.size() << "\n"; + } +} + +TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); + EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); + + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); + EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); +} + +std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); +} + 
+TEST(EnumClassTest, parseCLIEnumOption) { + auto result = parseCLITestEnumExtraOption("no-twenty-one"); + auto expected = std::pair(false, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("twenty-one"); + expected = std::pair(true, TestEnumExtra::TwentyOne); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-forty-two"); + expected = std::pair(false, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("forty-two"); + expected = std::pair(true, TestEnumExtra::FortyTwo); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("no-seven-seven-seven"); + expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); + result = parseCLITestEnumExtraOption("seven-seven-seven"); + expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + ASSERT_EQ(result, std::optional{expected}); +} + +} // namespace Fortran::common >From 49a0579f9477936b72f0580823b4dd6824697512 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:56:14 -0700 Subject: [PATCH 2/9] adjust headers --- flang/include/flang/Support/Fortran-features.h | 4 +--- flang/lib/Frontend/CompilerInvocation.cpp | 5 ----- flang/lib/Support/Fortran-features.cpp | 1 - 3 files changed, 1 insertion(+), 9 deletions(-) diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index d5aa7357ffea0..4a8b0da4c0d4d 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -11,9 +11,7 @@ #include "Fortran.h" #include "flang/Common/enum-set.h" -#include "flang/Common/idioms.h" -#include "llvm/Support/Error.h" -#include "llvm/Support/raw_ostream.h" +#include "llvm/ADT/StringRef.h" #include #include diff --git a/flang/lib/Frontend/CompilerInvocation.cpp 
b/flang/lib/Frontend/CompilerInvocation.cpp index 9ea568549bd6c..d8bf601d0171d 100644 --- a/flang/lib/Frontend/CompilerInvocation.cpp +++ b/flang/lib/Frontend/CompilerInvocation.cpp @@ -20,11 +20,9 @@ #include "flang/Support/Version.h" #include "flang/Tools/TargetSetup.h" #include "flang/Version.inc" -#include "clang/Basic/AllDiagnostics.h" #include "clang/Basic/DiagnosticDriver.h" #include "clang/Basic/DiagnosticOptions.h" #include "clang/Driver/Driver.h" -#include "clang/Driver/DriverDiagnostic.h" #include "clang/Driver/OptionUtils.h" #include "clang/Driver/Options.h" #include "llvm/ADT/StringRef.h" @@ -34,9 +32,7 @@ #include "llvm/Option/ArgList.h" #include "llvm/Option/OptTable.h" #include "llvm/Support/CodeGen.h" -#include "llvm/Support/Error.h" #include "llvm/Support/FileSystem.h" -#include "llvm/Support/FileUtilities.h" #include "llvm/Support/Path.h" #include "llvm/Support/Process.h" #include "llvm/Support/raw_ostream.h" @@ -46,7 +42,6 @@ #include #include #include -#include using namespace Fortran::frontend; diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 55abf0385d185..0e394162ef577 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -10,7 +10,6 @@ #include "flang/Common/idioms.h" #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" -#include "llvm/Support/raw_ostream.h" namespace Fortran::common { >From fa2db7090c6d374ce1a835ad26d19a1d7bd42262 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 12:57:22 -0700 Subject: [PATCH 3/9] reformat --- flang/lib/Support/enum-class.cpp | 20 ++++++++++--------- flang/unittests/Common/EnumClassTests.cpp | 5 ++--- .../unittests/Common/FortranFeaturesTest.cpp | 18 ++++++++++------- 3 files changed, 24 insertions(+), 19 deletions(-) diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ed11318382b35..ac57f27ef1c9e 100644 --- 
a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -1,4 +1,5 @@ -//===-- lib/Support/enum-class.cpp -------------------------------*- C++ -*-===// +//===-- lib/Support/enum-class.cpp -------------------------------*- C++ +//-*-===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. @@ -7,18 +8,19 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" -#include #include +#include namespace Fortran::common { -std::optional FindEnumIndex(std::function pred, int size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { - if (pred(names[i])) { - return i; - } +std::optional FindEnumIndex( + std::function pred, int size, + const std::string_view *names) { + for (int i = 0; i < size; ++i) { + if (pred(names[i])) { + return i; } - return std::nullopt; + } + return std::nullopt; } - } // namespace Fortran::common \ No newline at end of file diff --git a/flang/unittests/Common/EnumClassTests.cpp b/flang/unittests/Common/EnumClassTests.cpp index f67c453cfad15..c9224a8ceba54 100644 --- a/flang/unittests/Common/EnumClassTests.cpp +++ b/flang/unittests/Common/EnumClassTests.cpp @@ -6,15 +6,14 @@ // //===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Common/template.h" -#include "gtest/gtest.h" using namespace Fortran::common; using namespace std; -ENUM_CLASS(TestEnum, One, Two, - Three) +ENUM_CLASS(TestEnum, One, Two, Three) ENUM_CLASS_EXTRA(TestEnum) TEST(EnumClassTest, EnumToString) { diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 7ec7054f14f6e..597928e7fe56e 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -6,12 +6,12 @@ // 
//===----------------------------------------------------------------------===// +#include "gtest/gtest.h" #include "flang/Common/enum-class.h" #include "flang/Support/Fortran-features.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" -#include "gtest/gtest.h" namespace Fortran::common { @@ -34,7 +34,7 @@ TEST(EnumClassTest, SplitCamelCase) { if (parts[1].compare(llvm::StringRef("P", 1))) { ADD_FAILURE() << "Second part is not Name"; } - + parts = splitCamelCase("OPName"); ASSERT_EQ(parts.size(), (size_t)2); @@ -114,13 +114,15 @@ TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); } -std::optional> parseCLITestEnumExtraOption(llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); +std::optional> parseCLITestEnumExtraOption( + llvm::StringRef input) { + return parseCLIEnum(input, FindTestEnumExtraIndex); } TEST(EnumClassTest, parseCLIEnumOption) { auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = std::pair(false, TestEnumExtra::TwentyOne); + auto expected = + std::pair(false, TestEnumExtra::TwentyOne); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("twenty-one"); expected = std::pair(true, TestEnumExtra::TwentyOne); @@ -132,10 +134,12 @@ TEST(EnumClassTest, parseCLIEnumOption) { expected = std::pair(true, TestEnumExtra::FortyTwo); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = std::pair(false, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(false, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = std::pair(true, TestEnumExtra::SevenSevenSeven); + expected = + std::pair(true, TestEnumExtra::SevenSevenSeven); ASSERT_EQ(result, std::optional{expected}); } >From 
5f3feb64c1a97500e2808114d44bb07aa4ccb00c Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Thu, 29 May 2025 15:58:43 -0700 Subject: [PATCH 4/9] addressing feedback --- flang/include/flang/Common/enum-class.h | 53 +++--- flang/include/flang/Common/optional.h | 7 + .../include/flang/Support/Fortran-features.h | 16 -- flang/lib/Support/Fortran-features.cpp | 175 ++++++++---------- flang/lib/Support/enum-class.cpp | 15 +- flang/test/Driver/disable-diagnostic.f90 | 3 +- flang/test/Driver/werror-wrong.f90 | 2 +- flang/unittests/Common/CMakeLists.txt | 2 +- .../unittests/Common/FortranFeaturesTest.cpp | 159 +++------------- 9 files changed, 153 insertions(+), 279 deletions(-) diff --git a/flang/include/flang/Common/enum-class.h b/flang/include/flang/Common/enum-class.h index baf9fe418141d..3dbd11bb4057c 100644 --- a/flang/include/flang/Common/enum-class.h +++ b/flang/include/flang/Common/enum-class.h @@ -17,9 +17,9 @@ #ifndef FORTRAN_COMMON_ENUM_CLASS_H_ #define FORTRAN_COMMON_ENUM_CLASS_H_ +#include "optional.h" #include #include -#include #include namespace Fortran::common { @@ -59,26 +59,6 @@ constexpr std::array EnumNames(const char *p) { return result; } -template -std::optional inline fmap(std::optional x, std::function f) { - return x ? std::optional{f(*x)} : std::nullopt; -} - -using Predicate = std::function; -// Finds the first index for which the predicate returns true. -std::optional FindEnumIndex( - Predicate pred, int size, const std::string_view *names); - -using FindEnumIndexType = std::optional( - Predicate, int, const std::string_view *); - -template -std::optional inline FindEnum( - Predicate pred, std::function(Predicate)> find) { - std::function f = [](int x) { return static_cast(x); }; - return fmap(find(pred), f); -} - #define ENUM_CLASS(NAME, ...) 
\ enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ @@ -90,17 +70,34 @@ std::optional inline FindEnum( return NAME##_names[static_cast(e)]; \ } +namespace EnumClass { + +using Predicate = std::function; +// Finds the first index for which the predicate returns true. +optional FindIndex( + Predicate pred, std::size_t size, const std::string_view *names); + +using FindIndexType = std::function(Predicate)>; + +template +optional inline Find(Predicate pred, FindIndexType findIndex) { + return MapOption( + findIndex(pred), [](int x) { return static_cast(x); }); +} + +} // namespace EnumClass + #define ENUM_CLASS_EXTRA(NAME) \ - [[maybe_unused]] inline std::optional Find##NAME##Index( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnumIndex( \ + [[maybe_unused]] inline optional Find##NAME##Index( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::FindIndex( \ p, NAME##_enumSize, NAME##_names.data()); \ } \ - [[maybe_unused]] inline std::optional Find##NAME( \ - ::Fortran::common::Predicate p) { \ - return ::Fortran::common::FindEnum(p, Find##NAME##Index); \ + [[maybe_unused]] inline optional Find##NAME( \ + ::Fortran::common::EnumClass::Predicate p) { \ + return ::Fortran::common::EnumClass::Find(p, Find##NAME##Index); \ } \ - [[maybe_unused]] inline std::optional StringTo##NAME( \ + [[maybe_unused]] inline optional StringTo##NAME( \ const std::string_view name) { \ return Find##NAME( \ [name](const std::string_view s) -> bool { return name == s; }); \ diff --git a/flang/include/flang/Common/optional.h b/flang/include/flang/Common/optional.h index c7c81f40cc8c8..5b623f01e828d 100644 --- a/flang/include/flang/Common/optional.h +++ b/flang/include/flang/Common/optional.h @@ -27,6 +27,7 @@ #define FORTRAN_COMMON_OPTIONAL_H #include "api-attrs.h" +#include #include #include @@ -238,6 +239,12 @@ using std::nullopt_t; using std::optional; #endif // 
!STD_OPTIONAL_UNSUPPORTED +template +std::optional inline MapOption( + std::optional x, std::function f) { + return x ? std::optional{f(*x)} : std::nullopt; +} + } // namespace Fortran::common #endif // FORTRAN_COMMON_OPTIONAL_H diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 4a8b0da4c0d4d..fd6a9139b7ea7 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -133,21 +133,5 @@ class LanguageFeatureControl { bool disableAllWarnings_{false}; }; -// Parse a CLI enum option return the enum index and whether it should be -// enabled (true) or disabled (false). Just exposed for the template below. -std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find); - -template -std::optional> parseCLIEnum( - llvm::StringRef input, std::function(Predicate)> find) { - using To = std::pair; - using From = std::pair; - static std::function cast = [](From x) { - return std::pair{x.first, static_cast(x.second)}; - }; - return fmap(parseCLIEnumIndex(input, find), cast); -} - } // namespace Fortran::common #endif // FORTRAN_SUPPORT_FORTRAN_FEATURES_H_ diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index 0e394162ef577..72ea6639adf51 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -11,6 +11,10 @@ #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" +// Debugging +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/raw_ostream.h" + namespace Fortran::common { LanguageFeatureControl::LanguageFeatureControl() { @@ -95,119 +99,99 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } -// Split a string with camel case into the individual words. -// Note, the small vector is just an array of a few pointers and lengths -// into the original input string. 
So all this allocation should be pretty -// cheap. -llvm::SmallVector splitCamelCase(llvm::StringRef input) { - using namespace llvm; - if (input.empty()) { - return {}; +// Namespace for helper functions for parsing CLI options +// used instead of static so that there can be unit tests for these +// functions. +namespace FortranFeaturesHelpers { +// Check if Lower Case Hyphenated words are equal to Camel Case words. +// Because of out use case we know that 'r' the camel case string is +// well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. +// This is checked in the enum-class.h file. +bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { + size_t ls{l.size()}, rs{r.size()}; + if (ls < rs) { + return false; } - SmallVector parts{}; - parts.reserve(input.size()); - auto check = [&input](size_t j, function_ref predicate) { - return j < input.size() && predicate(input[j]); - }; - size_t i{0}; - size_t startWord = i; - for (; i < input.size(); i++) { - if ((check(i, isUpper) && check(i + 1, isUpper) && check(i + 2, isLower)) || - ((check(i, isLower) || check(i, isDigit)) && check(i + 1, isUpper))) { - parts.push_back(StringRef(input.data() + startWord, i - startWord + 1)); - startWord = i + 1; + bool atStartOfWord{true}; + size_t wordCount{0}, j; // j is the number of word characters checked in r. + for (; j < rs; j++) { + if (wordCount + j >= ls) { + // `l` was shorter once the hiphens were removed. + // If r is null terminated, then we are good. + return r[j] == '\0'; } - } - parts.push_back(llvm::StringRef(input.data() + startWord, i - startWord)); - return parts; -} - -// Split a string whith hyphens into the individual words. -llvm::SmallVector splitHyphenated(llvm::StringRef input) { - auto parts = llvm::SmallVector{}; - llvm::SplitString(input, parts, "-"); - return parts; -} - -// Check if two strings are equal while normalizing case for the -// right word which is assumed to be a single word in camel case. 
-bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r) { - size_t ls = l.size(); - if (ls != r.size()) - return false; - size_t j{0}; - // Process the upper case characters. - for (; j < ls; j++) { - char rc = r[j]; - char rc2l = llvm::toLower(rc); - if (rc == rc2l) { - // Past run of Uppers Case; - break; + if (atStartOfWord) { + if (llvm::isUpper(r[j])) { + // Upper Case Run + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else { + atStartOfWord = false; + if (l[wordCount + j] != r[j]) { + return false; + } + } + } else { + if (llvm::isUpper(r[j])) { + atStartOfWord = true; + if (l[wordCount + j] != '-') { + return false; + } + ++wordCount; + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else if (l[wordCount + j] != r[j]) { + return false; + } } - if (l[j] != rc2l) - return false; } - // Process the lower case characters. - for (; j < ls; j++) { - if (l[j] != r[j]) { - return false; - } + // If there are more characters in l after processing all the characters in r. + // then fail unless the string is null terminated. + if (ls > wordCount + j) { + return l[wordCount + j] == '\0'; } return true; } // Parse a CLI enum option return the enum index and whether it should be // enabled (true) or disabled (false). 
-std::optional> parseCLIEnumIndex( - llvm::StringRef input, std::function(Predicate)> find) { - auto parts = splitHyphenated(input); - bool negated = false; - if (parts.size() >= 1 && !parts[0].compare(llvm::StringRef("no", 2))) { +template +optional> ParseCLIEnum( + llvm::StringRef input, EnumClass::FindIndexType findIndex) { + bool negated{false}; + if (input.starts_with("no-")) { negated = true; - // Remove the "no" part - parts = llvm::SmallVector(parts.begin() + 1, parts.end()); - } - size_t chars = 0; - for (auto p : parts) { - chars += p.size(); + input = input.drop_front(3); } - auto pred = [&](auto s) { - if (chars != s.size()) { - return false; - } - auto ccParts = splitCamelCase(s); - auto num_ccParts = ccParts.size(); - if (parts.size() != num_ccParts) { - return false; - } - for (size_t i{0}; i < num_ccParts; i++) { - if (!equalLowerCaseWithCamelCaseWord(parts[i], ccParts[i])) { - return false; - } - } - return true; - }; - auto cast = [negated](int x) { return std::pair{!negated, x}; }; - return fmap>(find(pred), cast); + EnumClass::Predicate predicate{ + [input](llvm::StringRef r) { return LowerHyphEqualCamelCase(input, r); }}; + optional x = EnumClass::Find(predicate, findIndex); + return MapOption>( + x, [negated](T x) { return std::pair{!negated, x}; }); } -std::optional> parseCLILanguageFeature( +optional> parseCLIUsageWarning( llvm::StringRef input) { - return parseCLIEnum(input, FindLanguageFeatureIndex); + return ParseCLIEnum(input, FindUsageWarningIndex); } -std::optional> parseCLIUsageWarning( +optional> parseCLILanguageFeature( llvm::StringRef input) { - return parseCLIEnum(input, FindUsageWarningIndex); + return ParseCLIEnum(input, FindLanguageFeatureIndex); } +} // namespace FortranFeaturesHelpers + // Take a string from the CLI and apply it to the LanguageFeatureControl. // Return true if the option was applied recognized. 
bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { - if (auto result = parseCLILanguageFeature(input)) { + if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(input)) { EnableWarning(result->second, result->first); return true; - } else if (auto result = parseCLIUsageWarning(input)) { + } else if (auto result = + FortranFeaturesHelpers::parseCLIUsageWarning(input)) { EnableWarning(result->second, result->first); return true; } @@ -277,11 +261,10 @@ void ForEachEnum(std::function f) { void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { warnAllLanguage_ = yes; - disableAllWarnings_ = yes ? false : disableAllWarnings_; - // should be equivalent to: reset().flip() set ... - ForEachEnum( - [&](LanguageFeature f) { warnLanguage_.set(f, yes); }); + warnLanguage_.reset(); if (yes) { + disableAllWarnings_ = false; + warnLanguage_.flip(); // These three features do not need to be warned about, // but we do want their feature flags. warnLanguage_.set(LanguageFeature::OpenMP, false); @@ -292,8 +275,10 @@ void LanguageFeatureControl::WarnOnAllNonstandard(bool yes) { void LanguageFeatureControl::WarnOnAllUsage(bool yes) { warnAllUsage_ = yes; - disableAllWarnings_ = yes ? 
false : disableAllWarnings_; - ForEachEnum( - [&](UsageWarning w) { warnUsage_.set(w, yes); }); + warnUsage_.reset(); + if (yes) { + disableAllWarnings_ = false; + warnUsage_.flip(); + } } } // namespace Fortran::common diff --git a/flang/lib/Support/enum-class.cpp b/flang/lib/Support/enum-class.cpp index ac57f27ef1c9e..d6d0ee758175b 100644 --- a/flang/lib/Support/enum-class.cpp +++ b/flang/lib/Support/enum-class.cpp @@ -8,19 +8,20 @@ //===----------------------------------------------------------------------===// #include "flang/Common/enum-class.h" +#include "flang/Common/optional.h" #include -#include -namespace Fortran::common { -std::optional FindEnumIndex( - std::function pred, int size, +namespace Fortran::common::EnumClass { + +optional FindIndex( + std::function pred, size_t size, const std::string_view *names) { - for (int i = 0; i < size; ++i) { + for (size_t i = 0; i < size; ++i) { if (pred(names[i])) { return i; } } - return std::nullopt; + return nullopt; } -} // namespace Fortran::common \ No newline at end of file +} // namespace Fortran::common::EnumClass diff --git a/flang/test/Driver/disable-diagnostic.f90 b/flang/test/Driver/disable-diagnostic.f90 index 8a58e63cfa3ac..849489377da12 100644 --- a/flang/test/Driver/disable-diagnostic.f90 +++ b/flang/test/Driver/disable-diagnostic.f90 @@ -2,6 +2,7 @@ ! RUN: %flang -pedantic -Wno-known-bad-implicit-interface %s -c 2>&1 | FileCheck %s --allow-empty ! RUN: not %flang -WKnownBadImplicitInterface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR1 ! RUN: not %flang -WKnown-Bad-Implicit-Interface %s -c 2>&1 | FileCheck %s --check-prefix=ERROR2 + ! ERROR1: error: Unknown diagnostic option: -WKnownBadImplicitInterface ! 
ERROR2: error: Unknown diagnostic option: -WKnown-Bad-Implicit-Interface @@ -16,4 +17,4 @@ program disable_diagnostic end program disable_diagnostic subroutine sub() -end subroutine sub \ No newline at end of file +end subroutine sub diff --git a/flang/test/Driver/werror-wrong.f90 b/flang/test/Driver/werror-wrong.f90 index 33f0aff8a1739..6e3c7cca15bc7 100644 --- a/flang/test/Driver/werror-wrong.f90 +++ b/flang/test/Driver/werror-wrong.f90 @@ -4,4 +4,4 @@ ! RUN: not %flang_fc1 -fsyntax-only -WX %s 2>&1 | FileCheck %s --check-prefix=WRONG2 ! WRONG1: error: Unknown diagnostic option: -Wall -! WRONG2: error: Unknown diagnostic option: -WX \ No newline at end of file +! WRONG2: error: Unknown diagnostic option: -WX diff --git a/flang/unittests/Common/CMakeLists.txt b/flang/unittests/Common/CMakeLists.txt index 19cc5a20fecf4..3149cb9f7bc47 100644 --- a/flang/unittests/Common/CMakeLists.txt +++ b/flang/unittests/Common/CMakeLists.txt @@ -3,4 +3,4 @@ add_flang_unittest(FlangCommonTests FastIntSetTest.cpp FortranFeaturesTest.cpp ) -target_link_libraries(FlangCommonTests PRIVATE FortranSupport) \ No newline at end of file +target_link_libraries(FlangCommonTests PRIVATE FortranSupport) diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 597928e7fe56e..e12aff9f7b735 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -12,135 +12,34 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/StringRef.h" #include "llvm/Support/ErrorHandling.h" - -namespace Fortran::common { - -// Not currently exported from Fortran-features.h -llvm::SmallVector splitCamelCase(llvm::StringRef input); -llvm::SmallVector splitHyphenated(llvm::StringRef input); -bool equalLowerCaseWithCamelCaseWord(llvm::StringRef l, llvm::StringRef r); - -ENUM_CLASS(TestEnumExtra, TwentyOne, FortyTwo, SevenSevenSeven) -ENUM_CLASS_EXTRA(TestEnumExtra) - -TEST(EnumClassTest, SplitCamelCase) { - 
- auto parts = splitCamelCase("oP"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("o", 1))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("P", 1))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OPName"); - ASSERT_EQ(parts.size(), (size_t)2); - - if (parts[0].compare(llvm::StringRef("OP", 2))) { - ADD_FAILURE() << "First part is not OP"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("OpName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("Op", 2))) { - ADD_FAILURE() << "First part is not Op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("opName"); - ASSERT_EQ(parts.size(), (size_t)2); - if (parts[0].compare(llvm::StringRef("op", 2))) { - ADD_FAILURE() << "First part is not op"; - } - if (parts[1].compare(llvm::StringRef("Name", 4))) { - ADD_FAILURE() << "Second part is not Name"; - } - - parts = splitCamelCase("FlangTestProgram123"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("Flang", 5))) { - ADD_FAILURE() << "First part is not Flang"; - } - if (parts[1].compare(llvm::StringRef("Test", 4))) { - ADD_FAILURE() << "Second part is not Test"; - } - if (parts[2].compare(llvm::StringRef("Program123", 10))) { - ADD_FAILURE() << "Third part is not Program123"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, SplitHyphenated) { - auto parts = splitHyphenated("no-twenty-one"); - ASSERT_EQ(parts.size(), (size_t)3); - if (parts[0].compare(llvm::StringRef("no", 2))) { - ADD_FAILURE() << "First part is not twenty"; - } - if (parts[1].compare(llvm::StringRef("twenty", 6))) { - ADD_FAILURE() << "Second part is not one"; - } - if (parts[2].compare(llvm::StringRef("one", 3))) { 
- ADD_FAILURE() << "Third part is not one"; - } - for (auto p : parts) { - llvm::errs() << p << " " << p.size() << "\n"; - } -} - -TEST(EnumClassTest, equalLowerCaseWithCamelCaseWord) { - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("O", "O")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "p")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("o", "P")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("1", "2")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("Op", "op")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("op", "Oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("oplss", "OplSS")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "oplss")); - EXPECT_FALSE(equalLowerCaseWithCamelCaseWord("OPLSS", "OPLSS")); - - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("o", "O")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "OPLSS")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("oplss", "oplss")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "OP555")); - EXPECT_TRUE(equalLowerCaseWithCamelCaseWord("op555", "op555")); -} - -std::optional> parseCLITestEnumExtraOption( - llvm::StringRef input) { - return parseCLIEnum(input, FindTestEnumExtraIndex); -} - -TEST(EnumClassTest, parseCLIEnumOption) { - auto result = parseCLITestEnumExtraOption("no-twenty-one"); - auto expected = - std::pair(false, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("twenty-one"); - expected = std::pair(true, TestEnumExtra::TwentyOne); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-forty-two"); - expected = std::pair(false, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("forty-two"); - expected = std::pair(true, TestEnumExtra::FortyTwo); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("no-seven-seven-seven"); - expected = - std::pair(false, 
TestEnumExtra::SevenSevenSeven); - ASSERT_EQ(result, std::optional{expected}); - result = parseCLITestEnumExtraOption("seven-seven-seven"); - expected = - std::pair(true, TestEnumExtra::SevenSevenSeven); - ASSERT_EQ(result, std::optional{expected}); +#include + +namespace Fortran::common::FortranFeaturesHelpers { + +optional> parseCLIUsageWarning( + llvm::StringRef input); +TEST(EnumClassTest, ParseCLIUsageWarning) { + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("twenty-one")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("")), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-")), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("Portability"), std::nullopt); + auto expect{std::pair{false, UsageWarning::Portability}}; + ASSERT_EQ(parseCLIUsageWarning("no-portability"), expect); + expect.first = true; + ASSERT_EQ((parseCLIUsageWarning("portability")), expect); + expect = + std::pair{false, Fortran::common::UsageWarning::PointerToUndefinable}; + ASSERT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable")), expect); + expect.first = true; + ASSERT_EQ((parseCLIUsageWarning("pointer-to-undefinable")), expect); + EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable"), std::nullopt); + EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable"), std::nullopt); } -} // namespace Fortran::common +} // namespace Fortran::common::FortranFeaturesHelpers >From 79303b42f7cfd3806c22bd34e5eced5f27d27f32 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 30 May 2025 15:42:27 -0700 Subject: [PATCH 5/9] removing debugging statement --- 
flang/lib/Support/Fortran-features.cpp | 4 ----
 1 file changed, 4 deletions(-)
diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp
index 72ea6639adf51..75baa0b096af0 100644
--- a/flang/lib/Support/Fortran-features.cpp
+++ b/flang/lib/Support/Fortran-features.cpp
@@ -11,10 +11,6 @@
 #include "flang/Support/Fortran.h"
 #include "llvm/ADT/StringExtras.h"
-// Debugging
-#include "llvm/ADT/StringRef.h"
-#include "llvm/Support/raw_ostream.h"
-
 namespace Fortran::common {
 LanguageFeatureControl::LanguageFeatureControl() {
>From 8f0aa22125528a755ec61af2bd45b6c314cfe45c Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 30 May 2025 15:59:18 -0700
Subject: [PATCH 6/9] more feedback
---
 flang/include/flang/Support/Fortran-features.h | 4 +---
 flang/lib/Support/Fortran-features.cpp | 16 +++++++++-------
 flang/unittests/Common/FortranFeaturesTest.cpp | 4 ----
 3 files changed, 10 insertions(+), 14 deletions(-)
diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h
index fd6a9139b7ea7..501b183cceeec 100644
--- a/flang/include/flang/Support/Fortran-features.h
+++ b/flang/include/flang/Support/Fortran-features.h
@@ -11,8 +11,6 @@
 #include "Fortran.h"
 #include "flang/Common/enum-set.h"
-#include "llvm/ADT/StringRef.h"
-#include
 #include
 namespace Fortran::common {
@@ -115,7 +113,7 @@ class LanguageFeatureControl {
     DisableAllNonstandardWarnings();
     DisableAllUsageWarnings();
   }
-  bool applyCLIOption(llvm::StringRef input);
+  bool applyCLIOption(std::string_view input);
   bool AreWarningsDisabled() const { return disableAllWarnings_; }
   bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); }
   bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); }
diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp
index 75baa0b096af0..d140ecdff7f24 100644
--- a/flang/lib/Support/Fortran-features.cpp
+++
b/flang/lib/Support/Fortran-features.cpp @@ -99,11 +99,11 @@ LanguageFeatureControl::LanguageFeatureControl() { // used instead of static so that there can be unit tests for these // functions. namespace FortranFeaturesHelpers { -// Check if Lower Case Hyphenated words are equal to Camel Case words. +// Check if lower case hyphenated words are equal to camel case words. // Because of out use case we know that 'r' the camel case string is // well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. // This is checked in the enum-class.h file. -bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { +static bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { size_t ls{l.size()}, rs{r.size()}; if (ls < rs) { return false; @@ -161,8 +161,9 @@ optional> ParseCLIEnum( negated = true; input = input.drop_front(3); } - EnumClass::Predicate predicate{ - [input](llvm::StringRef r) { return LowerHyphEqualCamelCase(input, r); }}; + EnumClass::Predicate predicate{[input](std::string_view r) { + return LowerHyphEqualCamelCase(input, r); + }}; optional x = EnumClass::Find(predicate, findIndex); return MapOption>( x, [negated](T x) { return std::pair{!negated, x}; }); @@ -182,12 +183,13 @@ optional> parseCLILanguageFeature( // Take a string from the CLI and apply it to the LanguageFeatureControl. // Return true if the option was applied recognized. 
-bool LanguageFeatureControl::applyCLIOption(llvm::StringRef input) { - if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(input)) { +bool LanguageFeatureControl::applyCLIOption(std::string_view input) { + llvm::StringRef inputRef{input}; + if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(inputRef)) { EnableWarning(result->second, result->first); return true; } else if (auto result = - FortranFeaturesHelpers::parseCLIUsageWarning(input)) { + FortranFeaturesHelpers::parseCLIUsageWarning(inputRef)) { EnableWarning(result->second, result->first); return true; } diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index e12aff9f7b735..b3f0c31a57025 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -7,11 +7,7 @@ //===----------------------------------------------------------------------===// #include "gtest/gtest.h" -#include "flang/Common/enum-class.h" #include "flang/Support/Fortran-features.h" -#include "llvm/ADT/SmallVector.h" -#include "llvm/ADT/StringRef.h" -#include "llvm/Support/ErrorHandling.h" #include namespace Fortran::common::FortranFeaturesHelpers { >From a0317745bca77a1134e116fd570b4ecca60e4d95 Mon Sep 17 00:00:00 2001 From: Andre Kuhlenschmidt Date: Fri, 30 May 2025 17:02:17 -0700 Subject: [PATCH 7/9] adding insensitive match back --- .../include/flang/Support/Fortran-features.h | 2 +- flang/lib/Support/Fortran-features.cpp | 86 +++++++++++++++---- .../unittests/Common/FortranFeaturesTest.cpp | 75 +++++++++++----- 3 files changed, 123 insertions(+), 40 deletions(-) diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h index 501b183cceeec..0b55a3175580a 100644 --- a/flang/include/flang/Support/Fortran-features.h +++ b/flang/include/flang/Support/Fortran-features.h @@ -113,7 +113,7 @@ class LanguageFeatureControl { DisableAllNonstandardWarnings(); 
DisableAllUsageWarnings(); } - bool applyCLIOption(std::string_view input); + bool applyCLIOption(std::string_view input, bool insensitive = false); bool AreWarningsDisabled() const { return disableAllWarnings_; } bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); } bool ShouldWarn(LanguageFeature f) const { return warnLanguage_.test(f); } diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp index d140ecdff7f24..80e87615697df 100644 --- a/flang/lib/Support/Fortran-features.cpp +++ b/flang/lib/Support/Fortran-features.cpp @@ -8,6 +8,7 @@ #include "flang/Support/Fortran-features.h" #include "flang/Common/idioms.h" +#include "flang/Common/optional.h" #include "flang/Support/Fortran.h" #include "llvm/ADT/StringExtras.h" @@ -99,11 +100,48 @@ LanguageFeatureControl::LanguageFeatureControl() { // used instead of static so that there can be unit tests for these // functions. namespace FortranFeaturesHelpers { + +// Ignore case and any inserted punctuation (like '-'/'_') +static std::optional GetWarningChar(char ch) { + if (ch >= 'a' && ch <= 'z') { + return ch; + } else if (ch >= 'A' && ch <= 'Z') { + return ch - 'A' + 'a'; + } else if (ch >= '0' && ch <= '9') { + return ch; + } else { + return std::nullopt; + } +} + +// Check for case and punctuation insensitive string equality. +// NB, b is probably not null terminated, so don't treat is like a C string. +static bool InsensitiveWarningNameMatch( + std::string_view a, std::string_view b) { + size_t j{0}, aSize{a.size()}, k{0}, bSize{b.size()}; + while (true) { + optional ach{nullopt}; + while (!ach && j < aSize) { + ach = GetWarningChar(a[j++]); + } + optional bch{}; + while (!bch && k < bSize) { + bch = GetWarningChar(b[k++]); + } + if (!ach && !bch) { + return true; + } else if (!ach || !bch || *ach != *bch) { + return false; + } + ach = bch = nullopt; + } +} + // Check if lower case hyphenated words are equal to camel case words. 
// Because of out use case we know that 'r' the camel case string is // well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. // This is checked in the enum-class.h file. -static bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { +static bool SensitiveWarningNameMatch(llvm::StringRef l, llvm::StringRef r) { size_t ls{l.size()}, rs{r.size()}; if (ls < rs) { return false; @@ -154,42 +192,56 @@ static bool LowerHyphEqualCamelCase(llvm::StringRef l, llvm::StringRef r) { // Parse a CLI enum option return the enum index and whether it should be // enabled (true) or disabled (false). template -optional> ParseCLIEnum( - llvm::StringRef input, EnumClass::FindIndexType findIndex) { +optional> ParseCLIEnum(llvm::StringRef input, + EnumClass::FindIndexType findIndex, bool insensitive) { bool negated{false}; - if (input.starts_with("no-")) { - negated = true; - input = input.drop_front(3); + EnumClass::Predicate predicate; + if (insensitive) { + if (input.starts_with_insensitive("no")) { + negated = true; + input = input.drop_front(2); + } + predicate = [input](std::string_view r) { + return InsensitiveWarningNameMatch(input, r); + }; + } else { + if (input.starts_with("no-")) { + negated = true; + input = input.drop_front(3); + } + predicate = [input](std::string_view r) { + return SensitiveWarningNameMatch(input, r); + }; } - EnumClass::Predicate predicate{[input](std::string_view r) { - return LowerHyphEqualCamelCase(input, r); - }}; optional x = EnumClass::Find(predicate, findIndex); return MapOption>( x, [negated](T x) { return std::pair{!negated, x}; }); } optional> parseCLIUsageWarning( - llvm::StringRef input) { - return ParseCLIEnum(input, FindUsageWarningIndex); + llvm::StringRef input, bool insensitive) { + return ParseCLIEnum(input, FindUsageWarningIndex, insensitive); } optional> parseCLILanguageFeature( - llvm::StringRef input) { - return ParseCLIEnum(input, FindLanguageFeatureIndex); + llvm::StringRef input, bool insensitive) { + 
return ParseCLIEnum( + input, FindLanguageFeatureIndex, insensitive); } } // namespace FortranFeaturesHelpers // Take a string from the CLI and apply it to the LanguageFeatureControl. // Return true if the option was applied recognized. -bool LanguageFeatureControl::applyCLIOption(std::string_view input) { +bool LanguageFeatureControl::applyCLIOption( + std::string_view input, bool insensitive) { llvm::StringRef inputRef{input}; - if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature(inputRef)) { + if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature( + inputRef, insensitive)) { EnableWarning(result->second, result->first); return true; - } else if (auto result = - FortranFeaturesHelpers::parseCLIUsageWarning(inputRef)) { + } else if (auto result = FortranFeaturesHelpers::parseCLIUsageWarning( + inputRef, insensitive)) { EnableWarning(result->second, result->first); return true; } diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index b3f0c31a57025..4e9529d633ad9 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -13,29 +13,60 @@ namespace Fortran::common::FortranFeaturesHelpers { optional> parseCLIUsageWarning( - llvm::StringRef input); + llvm::StringRef input, bool insensitive); TEST(EnumClassTest, ParseCLIUsageWarning) { - EXPECT_EQ((parseCLIUsageWarning("no-twenty-one")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("twenty-one")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("no")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("")), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("no-")), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("Portability"), std::nullopt); - auto expect{std::pair{false, UsageWarning::Portability}}; - 
ASSERT_EQ(parseCLIUsageWarning("no-portability"), expect); - expect.first = true; - ASSERT_EQ((parseCLIUsageWarning("portability")), expect); - expect = - std::pair{false, Fortran::common::UsageWarning::PointerToUndefinable}; - ASSERT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable")), expect); - expect.first = true; - ASSERT_EQ((parseCLIUsageWarning("pointer-to-undefinable")), expect); - EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable"), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable"), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable"), std::nullopt); - EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable"), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("twenty-one", false)), std::nullopt); + EXPECT_EQ( + (parseCLIUsageWarning("no-seven-seven-seven", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-", false)), std::nullopt); + + EXPECT_EQ(parseCLIUsageWarning("Portability", false), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-portability", false)), + (std::optional{std::pair{false, UsageWarning::Portability}})); + EXPECT_EQ((parseCLIUsageWarning("portability", false)), + (std::optional{std::pair{true, UsageWarning::Portability}})); + EXPECT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable", false)), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ((parseCLIUsageWarning("pointer-to-undefinable", false)), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable", false), std::nullopt); + EXPECT_EQ( + parseCLIUsageWarning("NoPointerToUndefinable", false), std::nullopt); + 
EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable", false), std::nullopt); + EXPECT_EQ( + parseCLIUsageWarning("nopointertoundefinable", false), std::nullopt); + + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("twenty-one", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("", true)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-", true)), std::nullopt); + + EXPECT_EQ(parseCLIUsageWarning("Portability", true), + (std::optional{std::pair{true, UsageWarning::Portability}})); + EXPECT_EQ(parseCLIUsageWarning("no-portability", true), + (std::optional{std::pair{false, UsageWarning::Portability}})); + + EXPECT_EQ((parseCLIUsageWarning("portability", true)), + (std::optional{std::pair{true, UsageWarning::Portability}})); + EXPECT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable", true)), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ((parseCLIUsageWarning("pointer-to-undefinable", true)), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("PointerToUndefinable", true), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("NoPointerToUndefinable", true), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("pointertoundefinable", true), + (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}})); + EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable", true), + (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}})); } } // namespace Fortran::common::FortranFeaturesHelpers >From d37976f4167677294227c976b4aea03fdb130462 Mon Sep 17 00:00:00 2001 From: 
Andre Kuhlenschmidt
Date: Fri, 30 May 2025 18:01:35 -0700
Subject: [PATCH 8/9] fixing enum name
---
 flang/include/flang/Support/Fortran-features.h | 2 +-
 flang/lib/Semantics/tools.cpp | 2 +-
 flang/unittests/Common/FortranFeaturesTest.cpp | 5 ++++-
 3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/flang/include/flang/Support/Fortran-features.h b/flang/include/flang/Support/Fortran-features.h
index 0b55a3175580a..899959ad8a435 100644
--- a/flang/include/flang/Support/Fortran-features.h
+++ b/flang/include/flang/Support/Fortran-features.h
@@ -60,7 +60,7 @@ ENUM_CLASS(UsageWarning, Portability, PointerToUndefinable,
     NonTargetPassedToTarget, PointerToPossibleNoncontiguous,
     ShortCharacterActual, ShortArrayActual, ImplicitInterfaceActual,
     PolymorphicTransferArg, PointerComponentTransferArg, TransferSizePresence,
-    F202XAllocatableBreakingChange, OptionalMustBePresent, CommonBlockPadding,
+    F202xAllocatableBreakingChange, OptionalMustBePresent, CommonBlockPadding,
     LogicalVsCBool, BindCCharLength, ProcDummyArgShapes, ExternalNameConflict,
     FoldingException, FoldingAvoidsRuntimeCrash, FoldingValueChecks,
     FoldingFailure, FoldingLimit, Interoperability, CharacterInteroperability,
diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp
index 1d1e3ac044166..e8da757416cc6 100644
--- a/flang/lib/Semantics/tools.cpp
+++ b/flang/lib/Semantics/tools.cpp
@@ -1672,7 +1672,7 @@ std::forward_list GetAllNames(
 void WarnOnDeferredLengthCharacterScalar(SemanticsContext &context,
     const SomeExpr *expr, parser::CharBlock at, const char *what) {
   if (context.languageFeatures().ShouldWarn(
-          common::UsageWarning::F202XAllocatableBreakingChange)) {
+          common::UsageWarning::F202xAllocatableBreakingChange)) {
     if (const Symbol *
             symbol{evaluate::UnwrapWholeSymbolOrComponentDataRef(expr)}) {
       const Symbol &ultimate{ResolveAssociations(*symbol)};
diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp
index
4e9529d633ad9..0e48697182ff5 100644
--- a/flang/unittests/Common/FortranFeaturesTest.cpp
+++ b/flang/unittests/Common/FortranFeaturesTest.cpp
@@ -52,7 +52,6 @@ TEST(EnumClassTest, ParseCLIUsageWarning) {
       (std::optional{std::pair{true, UsageWarning::Portability}}));
   EXPECT_EQ(parseCLIUsageWarning("no-portability", true),
       (std::optional{std::pair{false, UsageWarning::Portability}}));
-
   EXPECT_EQ((parseCLIUsageWarning("portability", true)),
       (std::optional{std::pair{true, UsageWarning::Portability}}));
   EXPECT_EQ((parseCLIUsageWarning("no-pointer-to-undefinable", true)),
@@ -67,6 +66,10 @@ TEST(EnumClassTest, ParseCLIUsageWarning) {
       (std::optional{std::pair{true, UsageWarning::PointerToUndefinable}}));
   EXPECT_EQ(parseCLIUsageWarning("nopointertoundefinable", true),
       (std::optional{std::pair{false, UsageWarning::PointerToUndefinable}}));
+
+  EXPECT_EQ(parseCLIUsageWarning("f202x-allocatable-breaking-change", false),
+      (std::optional{
+          std::pair{true, UsageWarning::F202xAllocatableBreakingChange}}));
 }
 } // namespace Fortran::common::FortranFeaturesHelpers
>From 9ae60472c9e005f0754f195ce1801b970e7244ea Mon Sep 17 00:00:00 2001
From: Andre Kuhlenschmidt
Date: Fri, 30 May 2025 18:12:42 -0700
Subject: [PATCH 9/9] fixing uninitialized
---
 flang/lib/Support/Fortran-features.cpp | 2 +-
 flang/unittests/Common/FortranFeaturesTest.cpp | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/flang/lib/Support/Fortran-features.cpp b/flang/lib/Support/Fortran-features.cpp
index 80e87615697df..dd1798e90051c 100644
--- a/flang/lib/Support/Fortran-features.cpp
+++ b/flang/lib/Support/Fortran-features.cpp
@@ -147,7 +147,7 @@ static bool SensitiveWarningNameMatch(llvm::StringRef l, llvm::StringRef r) {
     return false;
   }
   bool atStartOfWord{true};
-  size_t wordCount{0}, j; // j is the number of word characters checked in r.
+  size_t wordCount{0}, j{0}; // j is the number of word characters checked in r.
for (; j < rs; j++) { if (wordCount + j >= ls) { // `l` was shorter once the hiphens were removed. diff --git a/flang/unittests/Common/FortranFeaturesTest.cpp b/flang/unittests/Common/FortranFeaturesTest.cpp index 0e48697182ff5..367500e46af20 100644 --- a/flang/unittests/Common/FortranFeaturesTest.cpp +++ b/flang/unittests/Common/FortranFeaturesTest.cpp @@ -40,7 +40,7 @@ TEST(EnumClassTest, ParseCLIUsageWarning) { EXPECT_EQ( parseCLIUsageWarning("nopointertoundefinable", false), std::nullopt); - EXPECT_EQ((parseCLIUsageWarning("no-twenty-one", false)), std::nullopt); + EXPECT_EQ((parseCLIUsageWarning("no-twenty-one", true)), std::nullopt); EXPECT_EQ((parseCLIUsageWarning("twenty-one", true)), std::nullopt); EXPECT_EQ((parseCLIUsageWarning("no-seven-seven-seven", true)), std::nullopt); EXPECT_EQ((parseCLIUsageWarning("seven-seven-seven", true)), std::nullopt); From flang-commits at lists.llvm.org Fri May 30 21:14:26 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 21:14:26 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <683a8222.170a0220.6f32.dad4@mx.google.com> https://github.com/NimishMishra edited https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Fri May 30 21:14:26 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 21:14:26 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <683a8222.170a0220.2f6183.e2c3@mx.google.com> ================ @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. 
Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. + */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + linearSteps.push_back(moduleTranslation.lookupValue(linearStep)); + } + + // Emit IR for initialization of linear variables + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + initLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::BasicBlock *loopPreHeader) { + builder.SetInsertPoint(loopPreHeader->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarLoad = builder.CreateLoad( + linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]); + builder.CreateStore(linearVarLoad, 
linearPreconditionVars[index]); + } + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + return afterBarrierIP; + } + + // Emit IR for updating Linear variables + void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody, + llvm::Value *loopInductionVar) { + builder.SetInsertPoint(loopBody->getTerminator()); + for (size_t index = 0; index < linearPreconditionVars.size(); index++) { + // Emit increments for linear vars + llvm::LoadInst *linearVarStart = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + + linearPreconditionVars[index]); + auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]); + auto addInst = builder.CreateAdd(linearVarStart, mulInst); + builder.CreateStore(addInst, linearLoopBodyTemps[index]); + } + } + + // Linear variable finalization is conditional on the last logical iteration. + // Create BB splits to manage the same. + void outlineLinearFinalizationBB(llvm::IRBuilderBase &builder, ---------------- NimishMishra wrote: `split` is fine; I will change. Thanks https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Fri May 30 21:14:26 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 21:14:26 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <683a8222.170a0220.37aa97.a5ce@mx.google.com> https://github.com/NimishMishra commented: Thanks @tblah for the review and apologies for taking a while to circle back on this. I will wait for your response on the floating point support of linear steps; rest changes look fine to me, I'll address them. 
https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Fri May 30 21:14:27 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 21:14:27 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <683a8223.050a0220.f9f50.417f@mx.google.com> ================ @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. + */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + 
linearSteps.push_back(moduleTranslation.lookupValue(linearStep));
+  }
+
+  // Emit IR for initialization of linear variables
+  llvm::OpenMPIRBuilder::InsertPointOrErrorTy
+  initLinearVar(llvm::IRBuilderBase &builder,
+                LLVM::ModuleTranslation &moduleTranslation,
+                llvm::BasicBlock *loopPreHeader) {
+    builder.SetInsertPoint(loopPreHeader->getTerminator());
+    for (size_t index = 0; index < linearOrigVars.size(); index++) {
+      llvm::LoadInst *linearVarLoad = builder.CreateLoad(
+          linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]);
+      builder.CreateStore(linearVarLoad, linearPreconditionVars[index]);
+    }
+    llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP =
+        moduleTranslation.getOpenMPBuilder()->createBarrier(
+            builder.saveIP(), llvm::omp::OMPD_barrier);
+    return afterBarrierIP;
+  }
+
+  // Emit IR for updating Linear variables
+  void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody,
+                       llvm::Value *loopInductionVar) {
+    builder.SetInsertPoint(loopBody->getTerminator());
+    for (size_t index = 0; index < linearPreconditionVars.size(); index++) {
+      // Emit increments for linear vars
+      llvm::LoadInst *linearVarStart =
+          builder.CreateLoad(linearOrigVars[index]->getAllocatedType(),
+
+                             linearPreconditionVars[index]);
+      auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]);
----------------
NimishMishra wrote:

True, but I am referring to section 5.4.6 of the 5.2 standard. It is mentioned that the `step-simple-modifier` and `step-complex-modifier` are integer expressions. Do you need floating point support thereafter?
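
For reference, the integer-only update discussed here can be sketched in plain C++, without LLVM or an OpenMP runtime. This is only an illustration under stated assumptions, not the patch's implementation: `linearValueAt` and `simulateLinearLoop` are hypothetical names, the loop is run sequentially, and the write-back on the last iteration mirrors what the quoted `updateLinearVar`/`finalizeLinearVar` code appears to emit (a `mul` of the induction variable by the step, an `add` to the saved start value, and a store to the original variable only when the last-iteration flag is set).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: value of a linear variable in logical iteration `iv`,
// given its value before the loop (`start`) and an integer `step`.
// Corresponds to the CreateMul + CreateAdd pair in the quoted hunk.
int32_t linearValueAt(int32_t start, int32_t iv, int32_t step) {
  return start + iv * step;
}

// Hypothetical sequential model of the whole loop: every iteration computes
// its own temporary (like `.linear_result`); only the last logical iteration
// copies that temporary back to the original variable.
int32_t simulateLinearLoop(int32_t start, int32_t step, int32_t tripCount) {
  assert(tripCount >= 0 && "trip count must be non-negative");
  int32_t x = start; // original variable
  for (int32_t iv = 0; iv < tripCount; ++iv) {
    int32_t linearResult = linearValueAt(start, iv, step);
    bool isLastIter = (iv == tripCount - 1); // stands in for the lastiter flag
    if (isLastIter)
      x = linearResult;
  }
  return x;
}
```

With `start = 5`, `step = 4` and ten iterations, this model leaves `x` at `5 + 9 * 4`. Because both operands of the multiplication are integers, a floating-point step would need a different instruction pair (or a conversion), which is the open question in this thread.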
https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Fri May 30 21:14:27 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 21:14:27 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <683a8223.170a0220.29c3f4.a1b4@mx.google.com> ================ @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. + */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + 
linearSteps.push_back(moduleTranslation.lookupValue(linearStep)); + } + + // Emit IR for initialization of linear variables + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + initLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::BasicBlock *loopPreHeader) { + builder.SetInsertPoint(loopPreHeader->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarLoad = builder.CreateLoad( + linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]); + builder.CreateStore(linearVarLoad, linearPreconditionVars[index]); + } + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + return afterBarrierIP; + } + + // Emit IR for updating Linear variables + void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody, + llvm::Value *loopInductionVar) { + builder.SetInsertPoint(loopBody->getTerminator()); + for (size_t index = 0; index < linearPreconditionVars.size(); index++) { + // Emit increments for linear vars + llvm::LoadInst *linearVarStart = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + + linearPreconditionVars[index]); + auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]); + auto addInst = builder.CreateAdd(linearVarStart, mulInst); + builder.CreateStore(addInst, linearLoopBodyTemps[index]); + } + } + + // Linear variable finalization is conditional on the last logical iteration. + // Create BB splits to manage the same. 
+  void outlineLinearFinalizationBB(llvm::IRBuilderBase &builder,
+                                   llvm::BasicBlock *loopExit) {
+    linearFinalizationBB = loopExit->splitBasicBlock(
+        loopExit->getTerminator(), "omp_loop.linear_finalization");
+    linearExitBB = linearFinalizationBB->splitBasicBlock(
+        linearFinalizationBB->getTerminator(), "omp_loop.linear_exit");
+    linearLastIterExitBB = linearFinalizationBB->splitBasicBlock(
+        linearFinalizationBB->getTerminator(), "omp_loop.linear_lastiter_exit");
+  }
+
+  // Finalize the linear vars
+  llvm::OpenMPIRBuilder::InsertPointOrErrorTy
+  finalizeLinearVar(llvm::IRBuilderBase &builder,
+                    LLVM::ModuleTranslation &moduleTranslation,
+                    llvm::Value *lastIter) {
+    // Emit condition to check whether last logical iteration is being executed
+    builder.SetInsertPoint(linearFinalizationBB->getTerminator());
+    llvm::Value *loopLastIterLoad = builder.CreateLoad(
+        llvm::Type::getInt32Ty(builder.getContext()), lastIter);
----------------
NimishMishra wrote:

I am thinking of this from the perspective of checking whether the current iteration is the last iteration of the loop or not. Now that you mention it, I am not sure whether `p.lastiter` in canonical loop bodygen is a `bool` (i.e. a flag denoting whether this is the last iteration or not) or an integer (i.e. holding `end` - 1, where `end` is the loop end bound).
It is better to have this load match the datatype of its counterpart in canonical loop bodygen https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Fri May 30 21:14:27 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 21:14:27 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <683a8223.170a0220.8e6a6.df20@mx.google.com> ================ @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. + */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } ---------------- NimishMishra wrote: 
Understood. I'll go over the privatisation/reduction patches to see how they manage such a case. I am assuming they hit an assertion failure, but will check and accordingly modify here. https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Fri May 30 21:14:38 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 21:14:38 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <683a822e.170a0220.10f0b.9f24@mx.google.com> https://github.com/NimishMishra updated https://github.com/llvm/llvm-project/pull/139386 >From e4f3cb2553f8ef03a3ad347cf14a187e31064153 Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Sat, 10 May 2025 19:34:16 +0530 Subject: [PATCH 1/3] [flang][OpenMP] Support MLIR lowering of linear clause for omp.wsloop --- flang/lib/Lower/OpenMP/ClauseProcessor.cpp | 34 +++++++++++ flang/lib/Lower/OpenMP/ClauseProcessor.h | 1 + .../lib/Lower/OpenMP/DataSharingProcessor.cpp | 5 +- flang/lib/Lower/OpenMP/OpenMP.cpp | 4 +- flang/test/Lower/OpenMP/wsloop-linear.f90 | 57 +++++++++++++++++++ 5 files changed, 97 insertions(+), 4 deletions(-) create mode 100644 flang/test/Lower/OpenMP/wsloop-linear.f90 diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp index 79b5087e4da68..8ba2f604df80a 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp @@ -1060,6 +1060,40 @@ bool ClauseProcessor::processIsDevicePtr( }); } +bool ClauseProcessor::processLinear(mlir::omp::LinearClauseOps &result) const { + lower::StatementContext stmtCtx; + return findRepeatableClause< + omp::clause::Linear>([&](const omp::clause::Linear &clause, + const parser::CharBlock &) { + auto &objects = std::get(clause.t); + for (const omp::Object &object : objects) { + semantics::Symbol *sym = object.sym(); + const mlir::Value 
variable = converter.getSymbolAddress(*sym); + result.linearVars.push_back(variable); + } + if (objects.size()) { + if (auto &mod = + std::get>( + clause.t)) { + mlir::Value operand = + fir::getBase(converter.genExprValue(toEvExpr(*mod), stmtCtx)); + result.linearStepVars.append(objects.size(), operand); + } else if (std::get>( + clause.t)) { + mlir::Location currentLocation = converter.getCurrentLocation(); + TODO(currentLocation, "Linear modifiers not yet implemented"); + } else { + // If nothing is present, add the default step of 1. + fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder(); + mlir::Location currentLocation = converter.getCurrentLocation(); + mlir::Value operand = firOpBuilder.createIntegerConstant( + currentLocation, firOpBuilder.getI32Type(), 1); + result.linearStepVars.append(objects.size(), operand); + } + } + }); +} + bool ClauseProcessor::processLink( llvm::SmallVectorImpl &result) const { return findRepeatableClause( diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h index 7857ba3fd0845..0ec41bdd33256 100644 --- a/flang/lib/Lower/OpenMP/ClauseProcessor.h +++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h @@ -122,6 +122,7 @@ class ClauseProcessor { bool processIsDevicePtr( mlir::omp::IsDevicePtrClauseOps &result, llvm::SmallVectorImpl &isDeviceSyms) const; + bool processLinear(mlir::omp::LinearClauseOps &result) const; bool processLink(llvm::SmallVectorImpl &result) const; diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp index 7eec598645eac..2a1c94407e1c8 100644 --- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp +++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp @@ -213,14 +213,15 @@ void DataSharingProcessor::collectSymbolsForPrivatization() { // so, we won't need to explicitely handle block objects (or forget to do // so). 
for (auto *sym : explicitlyPrivatizedSymbols) - allPrivatizedSymbols.insert(sym); + if (!sym->test(Fortran::semantics::Symbol::Flag::OmpLinear)) + allPrivatizedSymbols.insert(sym); } bool DataSharingProcessor::needBarrier() { // Emit implicit barrier to synchronize threads and avoid data races on // initialization of firstprivate variables and post-update of lastprivate // variables. - // Emit implicit barrier for linear clause. Maybe on somewhere else. + // Emit implicit barrier for linear clause in the OpenMPIRBuilder. for (const semantics::Symbol *sym : allPrivatizedSymbols) { if (sym->test(semantics::Symbol::Flag::OmpLastPrivate) && (sym->test(semantics::Symbol::Flag::OmpFirstPrivate) || diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp index 54560729eb4af..6fa915b4364f9 100644 --- a/flang/lib/Lower/OpenMP/OpenMP.cpp +++ b/flang/lib/Lower/OpenMP/OpenMP.cpp @@ -1841,13 +1841,13 @@ static void genWsloopClauses( llvm::SmallVectorImpl &reductionSyms) { ClauseProcessor cp(converter, semaCtx, clauses); cp.processNowait(clauseOps); + cp.processLinear(clauseOps); cp.processOrder(clauseOps); cp.processOrdered(clauseOps); cp.processReduction(loc, clauseOps, reductionSyms); cp.processSchedule(stmtCtx, clauseOps); - cp.processTODO( - loc, llvm::omp::Directive::OMPD_do); + cp.processTODO(loc, llvm::omp::Directive::OMPD_do); } //===----------------------------------------------------------------------===// diff --git a/flang/test/Lower/OpenMP/wsloop-linear.f90 b/flang/test/Lower/OpenMP/wsloop-linear.f90 new file mode 100644 index 0000000000000..b99677108be2f --- /dev/null +++ b/flang/test/Lower/OpenMP/wsloop-linear.f90 @@ -0,0 +1,57 @@ +! This test checks lowering of OpenMP DO Directive (Worksharing) +! with linear clause + +! 
RUN: %flang_fc1 -fopenmp -emit-hlfir %s -o - 2>&1 | FileCheck %s + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFsimple_linearEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFsimple_linearEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[const:.*]] = arith.constant 1 : i32 +subroutine simple_linear + implicit none + integer :: x, y, i + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + + +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_stepEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_stepEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_step + implicit none + integer :: x, y, i + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[const]] : !fir.ref) {{.*}} + !$omp do linear(x:4) + !CHECK: %[[LOAD:.*]] = fir.load %[[X]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 2 : i32 + !CHECK: %[[RESULT:.*]] = arith.addi %[[LOAD]], %[[const]] : i32 + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine + +!CHECK: %[[A_alloca:.*]] = fir.alloca i32 {bindc_name = "a", uniq_name = "_QFlinear_exprEa"} +!CHECK: %[[A:.*]]:2 = hlfir.declare %[[A_alloca]] {uniq_name = "_QFlinear_exprEa"} : (!fir.ref) -> (!fir.ref, !fir.ref) +!CHECK: %[[X_alloca:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlinear_exprEx"} +!CHECK: %[[X:.*]]:2 = hlfir.declare %[[X_alloca]] {uniq_name = "_QFlinear_exprEx"} : (!fir.ref) -> (!fir.ref, !fir.ref) +subroutine linear_expr + implicit none + integer :: x, y, i, a + !CHECK: %[[LOAD_A:.*]] = fir.load %[[A]]#0 : !fir.ref + !CHECK: %[[const:.*]] = arith.constant 4 : i32 + !CHECK: 
%[[LINEAR_EXPR:.*]] = arith.addi %[[LOAD_A]], %[[const]] : i32 + !CHECK: omp.wsloop linear(%[[X]]#0 = %[[LINEAR_EXPR]] : !fir.ref) {{.*}} + !$omp do linear(x:a+4) + do i = 1, 10 + y = x + 2 + end do + !$omp end do +end subroutine >From 616d6377bbb94bdc742023d71c63ba89df293e3c Mon Sep 17 00:00:00 2001 From: Nimish Mishra Date: Sat, 10 May 2025 20:51:39 +0530 Subject: [PATCH 2/3] [mlir][llvm][OpenMP] Support translation for linear clause in omp.wsloop --- .../llvm/Frontend/OpenMP/OMPIRBuilder.h | 15 ++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 3 + .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 185 +++++++++++++++++- mlir/test/Target/LLVMIR/openmp-llvm.mlir | 88 +++++++++ mlir/test/Target/LLVMIR/openmp-todo.mlir | 13 -- 5 files changed, 289 insertions(+), 15 deletions(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h index ffc0fd0a0bdac..68f15d5c7d41e 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h +++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h @@ -3580,6 +3580,9 @@ class CanonicalLoopInfo { BasicBlock *Latch = nullptr; BasicBlock *Exit = nullptr; + // Hold the MLIR value for the `lastiter` of the canonical loop. + Value *LastIter = nullptr; + /// Add the control blocks of this loop to \p BBs. /// /// This does not include any block from the body, including the one returned @@ -3612,6 +3615,18 @@ class CanonicalLoopInfo { void mapIndVar(llvm::function_ref Updater); public: + /// Sets the last iteration variable for this loop. + void setLastIter(Value *IterVar) { LastIter = std::move(IterVar); } + + /// Returns the last iteration variable for this loop. + /// Certain use-cases (like translation of linear clause) may access + /// this variable even after a loop transformation. Hence, do not guard + /// this getter function by `isValid`. 
It is the responsibility of the + /// callee to ensure this functionality is not invoked by a non-outlined + /// CanonicalLoopInfo object (in which case, `setLastIter` will never be + /// invoked and `LastIter` will be by default `nullptr`). + Value *getLastIter() { return LastIter; } + /// Returns whether this object currently represents the IR of a loop. If /// returning false, it may have been consumed by a loop transformation or not /// been intialized. Do not use in this case; diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp index a1268ca76b2d5..991cdb7b6b416 100644 --- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp +++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp @@ -4254,6 +4254,7 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::applyStaticWorkshareLoop( Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); Value *PUpperBound = Builder.CreateAlloca(IVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(IVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // At the end of the preheader, prepare for calling the "init" function by // storing the current loop bounds into the allocated space. A canonical loop @@ -4361,6 +4362,7 @@ OpenMPIRBuilder::applyStaticChunkedWorkshareLoop(DebugLoc DL, Value *PUpperBound = Builder.CreateAlloca(InternalIVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(InternalIVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // Set up the source location value for the OpenMP runtime. 
Builder.restoreIP(CLI->getPreheaderIP()); @@ -4844,6 +4846,7 @@ OpenMPIRBuilder::applyDynamicWorkshareLoop(DebugLoc DL, CanonicalLoopInfo *CLI, Value *PLowerBound = Builder.CreateAlloca(IVTy, nullptr, "p.lowerbound"); Value *PUpperBound = Builder.CreateAlloca(IVTy, nullptr, "p.upperbound"); Value *PStride = Builder.CreateAlloca(IVTy, nullptr, "p.stride"); + CLI->setLastIter(PLastIter); // At the end of the preheader, prepare for calling the "init" function by // storing the current loop bounds into the allocated space. A canonical loop diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 9f7b5605556e6..571505ab9b9aa 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. 
+ */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variabes + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + linearSteps.push_back(moduleTranslation.lookupValue(linearStep)); + } + + // Emit IR for initialization of linear variables + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + initLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::BasicBlock *loopPreHeader) { + builder.SetInsertPoint(loopPreHeader->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarLoad = builder.CreateLoad( + linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]); + builder.CreateStore(linearVarLoad, linearPreconditionVars[index]); + } + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + return afterBarrierIP; + } + + // Emit IR for 
updating Linear variables + void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody, + llvm::Value *loopInductionVar) { + builder.SetInsertPoint(loopBody->getTerminator()); + for (size_t index = 0; index < linearPreconditionVars.size(); index++) { + // Emit increments for linear vars + llvm::LoadInst *linearVarStart = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + + linearPreconditionVars[index]); + auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]); + auto addInst = builder.CreateAdd(linearVarStart, mulInst); + builder.CreateStore(addInst, linearLoopBodyTemps[index]); + } + } + + // Linear variable finalization is conditional on the last logical iteration. + // Create BB splits to manage the same. + void outlineLinearFinalizationBB(llvm::IRBuilderBase &builder, + llvm::BasicBlock *loopExit) { + linearFinalizationBB = loopExit->splitBasicBlock( + loopExit->getTerminator(), "omp_loop.linear_finalization"); + linearExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_exit"); + linearLastIterExitBB = linearFinalizationBB->splitBasicBlock( + linearFinalizationBB->getTerminator(), "omp_loop.linear_lastiter_exit"); + } + + // Finalize the linear vars + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + finalizeLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::Value *lastIter) { + // Emit condition to check whether last logical iteration is being executed + builder.SetInsertPoint(linearFinalizationBB->getTerminator()); + llvm::Value *loopLastIterLoad = builder.CreateLoad( + llvm::Type::getInt32Ty(builder.getContext()), lastIter); + llvm::Value *isLast = + builder.CreateCmp(llvm::CmpInst::ICMP_NE, loopLastIterLoad, + llvm::ConstantInt::get( + llvm::Type::getInt32Ty(builder.getContext()), 0)); + // Store the linear variable values to original variables. 
+ builder.SetInsertPoint(linearLastIterExitBB->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarTemp = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + linearLoopBodyTemps[index]); + builder.CreateStore(linearVarTemp, linearOrigVars[index]); + } + + // Create conditional branch such that the linear variable + // values are stored to original variables only at the + // last logical iteration + builder.SetInsertPoint(linearFinalizationBB->getTerminator()); + builder.CreateCondBr(isLast, linearLastIterExitBB, linearExitBB); + linearFinalizationBB->getTerminator()->eraseFromParent(); + // Emit barrier + builder.SetInsertPoint(linearExitBB->getTerminator()); + return moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + } + + // Rewrite all uses of the original variable in `BBName` + // with the linear variable in-place + void rewriteInPlace(llvm::IRBuilderBase &builder, std::string BBName, + size_t varIndex) { + llvm::SmallVector users; + for (llvm::User *user : linearOrigVal[varIndex]->users()) + users.push_back(user); + for (auto *user : users) { + if (auto *userInst = dyn_cast(user)) { + if (userInst->getParent()->getName().str() == BBName) + user->replaceUsesOfWith(linearOrigVal[varIndex], + linearLoopBodyTemps[varIndex]); + } + } + } +}; + } // namespace /// Looks up from the operation from and returns the PrivateClauseOp with @@ -292,7 +432,6 @@ static LogicalResult checkImplementationStatus(Operation &op) { }) .Case([&](omp::WsloopOp op) { checkAllocate(op, result); - checkLinear(op, result); checkOrder(op, result); checkReduction(op, result); }) @@ -2423,15 +2562,40 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, llvm::omp::Directive::OMPD_for); llvm::OpenMPIRBuilder::LocationDescription ompLoc(builder); + + // Initialize linear variables and linear step + LinearClauseProcessor linearClauseProcessor; + if 
(wsloopOp.getLinearVars().size()) { + for (mlir::Value linearVar : wsloopOp.getLinearVars()) + linearClauseProcessor.createLinearVar(builder, moduleTranslation, + linearVar); + for (mlir::Value linearStep : wsloopOp.getLinearStepVars()) + linearClauseProcessor.initLinearStep(moduleTranslation, linearStep); + } + llvm::Expected regionBlock = convertOmpOpRegions( wsloopOp.getRegion(), "omp.wsloop.region", builder, moduleTranslation); if (failed(handleError(regionBlock, opInst))) return failure(); - builder.SetInsertPoint(*regionBlock, (*regionBlock)->begin()); llvm::CanonicalLoopInfo *loopInfo = findCurrentLoopInfo(moduleTranslation); + // Emit Initialization and Update IR for linear variables + if (wsloopOp.getLinearVars().size()) { + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + linearClauseProcessor.initLinearVar(builder, moduleTranslation, + loopInfo->getPreheader()); + if (failed(handleError(afterBarrierIP, *loopOp))) + return failure(); + builder.restoreIP(*afterBarrierIP); + linearClauseProcessor.updateLinearVar(builder, loopInfo->getBody(), + loopInfo->getIndVar()); + linearClauseProcessor.outlineLinearFinalizationBB(builder, + loopInfo->getExit()); + } + + builder.SetInsertPoint(*regionBlock, (*regionBlock)->begin()); llvm::OpenMPIRBuilder::InsertPointOrErrorTy wsloopIP = ompBuilder->applyWorkshareLoop( ompLoc.DL, loopInfo, allocaIP, loopNeedsBarrier, @@ -2443,6 +2607,23 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder, if (failed(handleError(wsloopIP, opInst))) return failure(); + // Emit finalization and in-place rewrites for linear vars. 
+ if (wsloopOp.getLinearVars().size()) { + llvm::OpenMPIRBuilder::InsertPointTy oldIP = builder.saveIP(); + assert(loopInfo->getLastIter() && + "`lastiter` in CanonicalLoopInfo is nullptr"); + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + linearClauseProcessor.finalizeLinearVar(builder, moduleTranslation, + loopInfo->getLastIter()); + if (failed(handleError(afterBarrierIP, *loopOp))) + return failure(); + builder.restoreIP(*afterBarrierIP); + for (size_t index = 0; index < wsloopOp.getLinearVars().size(); index++) + linearClauseProcessor.rewriteInPlace(builder, "omp.loop_nest.region", + index); + builder.restoreIP(oldIP); + } + // Set the correct branch target for task cancellation popCancelFinalizationCB(cancelTerminators, *ompBuilder, wsloopIP.get()); diff --git a/mlir/test/Target/LLVMIR/openmp-llvm.mlir b/mlir/test/Target/LLVMIR/openmp-llvm.mlir index 32f0ba5b105ff..9ad9e93301239 100644 --- a/mlir/test/Target/LLVMIR/openmp-llvm.mlir +++ b/mlir/test/Target/LLVMIR/openmp-llvm.mlir @@ -358,6 +358,94 @@ llvm.func @wsloop_simple(%arg0: !llvm.ptr) { // ----- +// CHECK-LABEL: wsloop_linear + +// CHECK: {{.*}} = alloca i32, i64 1, align 4 +// CHECK: %[[Y:.*]] = alloca i32, i64 1, align 4 +// CHECK: %[[X:.*]] = alloca i32, i64 1, align 4 + +// CHECK: entry: +// CHECK: %[[LINEAR_VAR:.*]] = alloca i32, align 4 +// CHECK: %[[LINEAR_RESULT:.*]] = alloca i32, align 4 +// CHECK: br label %omp_loop.preheader + +// CHECK: omp_loop.preheader: +// CHECK: %[[LOAD:.*]] = load i32, ptr %[[X]], align 4 +// CHECK: store i32 %[[LOAD]], ptr %[[LINEAR_VAR]], align 4 +// CHECK: %omp_global_thread_num = call i32 @__kmpc_global_thread_num(ptr @2) +// CHECK: call void @__kmpc_barrier(ptr @1, i32 %omp_global_thread_num) + +// CHECK: omp_loop.body: +// CHECK: %[[LOOP_IV:.*]] = add i32 %omp_loop.iv, {{.*}} +// CHECK: %[[LINEAR_LOAD:.*]] = load i32, ptr %[[LINEAR_VAR]], align 4 +// CHECK: %[[MUL:.*]] = mul i32 %[[LOOP_IV]], 1 +// CHECK: %[[ADD:.*]] = add i32 %[[LINEAR_LOAD]], 
%[[MUL]] +// CHECK: store i32 %[[ADD]], ptr %[[LINEAR_RESULT]], align 4 +// CHECK: br label %omp.loop_nest.region + +// CHECK: omp.loop_nest.region: +// CHECK: %[[LINEAR_LOAD:.*]] = load i32, ptr %[[LINEAR_RESULT]], align 4 +// CHECK: %[[ADD:.*]] = add i32 %[[LINEAR_LOAD]], 2 +// CHECK: store i32 %[[ADD]], ptr %[[Y]], align 4 + +// CHECK: omp_loop.exit: +// CHECK: call void @__kmpc_for_static_fini(ptr @2, i32 %omp_global_thread_num4) +// CHECK: %omp_global_thread_num5 = call i32 @__kmpc_global_thread_num(ptr @2) +// CHECK: call void @__kmpc_barrier(ptr @3, i32 %omp_global_thread_num5) +// CHECK: br label %omp_loop.linear_finalization + +// CHECK: omp_loop.linear_finalization: +// CHECK: %[[LAST_ITER:.*]] = load i32, ptr %p.lastiter, align 4 +// CHECK: %[[CMP:.*]] = icmp ne i32 %[[LAST_ITER]], 0 +// CHECK: br i1 %[[CMP]], label %omp_loop.linear_lastiter_exit, label %omp_loop.linear_exit + +// CHECK: omp_loop.linear_lastiter_exit: +// CHECK: %[[LINEAR_RESULT_LOAD:.*]] = load i32, ptr %[[LINEAR_RESULT]], align 4 +// CHECK: store i32 %[[LINEAR_RESULT_LOAD]], ptr %[[X]], align 4 +// CHECK: br label %omp_loop.linear_exit + +// CHECK: omp_loop.linear_exit: +// CHECK: %omp_global_thread_num6 = call i32 @__kmpc_global_thread_num(ptr @2) +// CHECK: call void @__kmpc_barrier(ptr @1, i32 %omp_global_thread_num6) +// CHECK: br label %omp_loop.after + +llvm.func @wsloop_linear() { + %0 = llvm.mlir.constant(1 : i64) : i64 + %1 = llvm.alloca %0 x i32 {bindc_name = "i", pinned} : (i64) -> !llvm.ptr + %2 = llvm.mlir.constant(1 : i64) : i64 + %3 = llvm.alloca %2 x i32 {bindc_name = "y"} : (i64) -> !llvm.ptr + %4 = llvm.mlir.constant(1 : i64) : i64 + %5 = llvm.alloca %4 x i32 {bindc_name = "x"} : (i64) -> !llvm.ptr + %6 = llvm.mlir.constant(1 : i64) : i64 + %7 = llvm.alloca %6 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr + %8 = llvm.mlir.constant(2 : i32) : i32 + %9 = llvm.mlir.constant(10 : i32) : i32 + %10 = llvm.mlir.constant(1 : i32) : i32 + %11 = llvm.mlir.constant(1 : i64) : 
i64 + %12 = llvm.mlir.constant(1 : i64) : i64 + %13 = llvm.mlir.constant(1 : i64) : i64 + %14 = llvm.mlir.constant(1 : i64) : i64 + omp.wsloop linear(%5 = %10 : !llvm.ptr) { + omp.loop_nest (%arg0) : i32 = (%10) to (%9) inclusive step (%10) { + llvm.store %arg0, %1 : i32, !llvm.ptr + %15 = llvm.load %5 : !llvm.ptr -> i32 + %16 = llvm.add %15, %8 : i32 + llvm.store %16, %3 : i32, !llvm.ptr + %17 = llvm.add %arg0, %10 : i32 + %18 = llvm.icmp "sgt" %17, %9 : i32 + llvm.cond_br %18, ^bb1, ^bb2 + ^bb1: // pred: ^bb0 + llvm.store %17, %1 : i32, !llvm.ptr + llvm.br ^bb2 + ^bb2: // 2 preds: ^bb0, ^bb1 + omp.yield + } + } + llvm.return +} + +// ----- + // CHECK-LABEL: @wsloop_inclusive_1 llvm.func @wsloop_inclusive_1(%arg0: !llvm.ptr) { %0 = llvm.mlir.constant(42 : index) : i64 diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir index 9a83b46efddca..98fccb1a80f67 100644 --- a/mlir/test/Target/LLVMIR/openmp-todo.mlir +++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir @@ -511,19 +511,6 @@ llvm.func @wsloop_allocate(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) { // ----- -llvm.func @wsloop_linear(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) { - // expected-error@below {{not yet implemented: Unhandled clause linear in omp.wsloop operation}} - // expected-error@below {{LLVM Translation failed for operation: omp.wsloop}} - omp.wsloop linear(%x = %step : !llvm.ptr) { - omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) { - omp.yield - } - } - llvm.return -} - -// ----- - llvm.func @wsloop_order(%lb : i32, %ub : i32, %step : i32) { // expected-error@below {{not yet implemented: Unhandled clause order in omp.wsloop operation}} // expected-error@below {{LLVM Translation failed for operation: omp.wsloop}} >From b783aa265096b68aacc33816ff28e97cf2406fd5 Mon Sep 17 00:00:00 2001 From: NimishMishra <42909663+NimishMishra at users.noreply.github.com> Date: Fri, 30 May 2025 21:14:30 -0700 Subject: [PATCH 3/3] Update 
llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h Co-authored-by: Tom Eccles --- llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h index 68f15d5c7d41e..8fb60ac86f952 100644 --- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h +++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h @@ -3580,7 +3580,7 @@ class CanonicalLoopInfo { BasicBlock *Latch = nullptr; BasicBlock *Exit = nullptr; - // Hold the MLIR value for the `lastiter` of the canonical loop. + // Hold the LLVM value for the `lastiter` of the canonical loop. Value *LastIter = nullptr; /// Add the control blocks of this loop to \p BBs. From flang-commits at lists.llvm.org Fri May 30 21:28:41 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 21:28:41 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [mlir] [llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop (PR #139386) In-Reply-To: Message-ID: <683a8579.050a0220.122908.3c0f@mx.google.com> ================ @@ -124,6 +124,146 @@ class PreviouslyReportedError char PreviouslyReportedError::ID = 0; +/* + * Custom class for processing linear clause for omp.wsloop + * and omp.simd. Linear clause translation requires setup, + * initialization, update, and finalization at varying + * basic blocks in the IR. This class helps maintain + * internal state to allow consistent translation in + * each of these stages. 
+ */ + +class LinearClauseProcessor { + +private: + SmallVector linearPreconditionVars; + SmallVector linearLoopBodyTemps; + SmallVector linearOrigVars; + SmallVector linearOrigVal; + SmallVector linearSteps; + llvm::BasicBlock *linearFinalizationBB; + llvm::BasicBlock *linearExitBB; + llvm::BasicBlock *linearLastIterExitBB; + +public: + // Allocate space for linear variables + void createLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearVar) { + if (llvm::AllocaInst *linearVarAlloca = dyn_cast( + moduleTranslation.lookupValue(linearVar))) { + linearPreconditionVars.push_back(builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_var")); + llvm::Value *linearLoopBodyTemp = builder.CreateAlloca( + linearVarAlloca->getAllocatedType(), nullptr, ".linear_result"); + linearOrigVal.push_back(moduleTranslation.lookupValue(linearVar)); + linearLoopBodyTemps.push_back(linearLoopBodyTemp); + linearOrigVars.push_back(linearVarAlloca); + } + } + + // Initialize linear step + inline void initLinearStep(LLVM::ModuleTranslation &moduleTranslation, + mlir::Value &linearStep) { + linearSteps.push_back(moduleTranslation.lookupValue(linearStep)); + } + + // Emit IR for initialization of linear variables + llvm::OpenMPIRBuilder::InsertPointOrErrorTy + initLinearVar(llvm::IRBuilderBase &builder, + LLVM::ModuleTranslation &moduleTranslation, + llvm::BasicBlock *loopPreHeader) { + builder.SetInsertPoint(loopPreHeader->getTerminator()); + for (size_t index = 0; index < linearOrigVars.size(); index++) { + llvm::LoadInst *linearVarLoad = builder.CreateLoad( + linearOrigVars[index]->getAllocatedType(), linearOrigVars[index]); + builder.CreateStore(linearVarLoad, linearPreconditionVars[index]); + } + llvm::OpenMPIRBuilder::InsertPointOrErrorTy afterBarrierIP = + moduleTranslation.getOpenMPBuilder()->createBarrier( + builder.saveIP(), llvm::omp::OMPD_barrier); + return afterBarrierIP; + } + + // Emit IR for 
updating Linear variables + void updateLinearVar(llvm::IRBuilderBase &builder, llvm::BasicBlock *loopBody, + llvm::Value *loopInductionVar) { + builder.SetInsertPoint(loopBody->getTerminator()); + for (size_t index = 0; index < linearPreconditionVars.size(); index++) { + // Emit increments for linear vars + llvm::LoadInst *linearVarStart = + builder.CreateLoad(linearOrigVars[index]->getAllocatedType(), + + linearPreconditionVars[index]); + auto mulInst = builder.CreateMul(loopInductionVar, linearSteps[index]); ---------------- NimishMishra wrote: Okay I understand. The linear variable itself might be `real` (or compatible type). So assuming the `loopInductionVar` and `linearStep` are integers, we can create `mulInst` and then emit a cast to `linearVarStart`'s type, and finally emit the addition. Would that be okay? https://github.com/llvm/llvm-project/pull/139386 From flang-commits at lists.llvm.org Fri May 30 22:53:16 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 22:53:16 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <683a994c.050a0220.392818.b983@mx.google.com> https://github.com/NimishMishra edited https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Fri May 30 22:53:16 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 22:53:16 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <683a994c.170a0220.28e55e.98c7@mx.google.com> ================ @@ -1,16 +0,0 @@ -! RUN: not %flang_fc1 -fopenmp-version=51 -fopenmp %s 2>&1 | FileCheck %s ---------------- NimishMishra wrote: I think adding tests for unparsing would be good, since there are substantial changes to the parsing. 
https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Fri May 30 22:53:17 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 22:53:17 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <683a994d.170a0220.e7414.a95d@mx.google.com> https://github.com/NimishMishra commented: Thanks for the massive work on this patch. I took a look at the parser level changes for now, and tried running a few examples locally. Looks fine overall. https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Fri May 30 22:53:17 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 22:53:17 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <683a994d.170a0220.24ef7f.e233@mx.google.com> ================ @@ -0,0 +1,82 @@ +!RUN: %flang_fc1 -fopenmp -fopenmp-version=60 -emit-hlfir -mmlir -fdebug-dump-atomic-analysis %s -o /dev/null 2>&1 | FileCheck %s + +subroutine f00(x) + integer :: x, v + !$omp atomic read + v = x +end + +!CHECK: Analysis { +!CHECK-NEXT: atom: x +!CHECK-NEXT: cond: +!CHECK-NEXT: op0 { +!CHECK-NEXT: what: Read +!CHECK-NEXT: assign: v=x +!CHECK-NEXT: } +!CHECK-NEXT: op1 { ---------------- NimishMishra wrote: Nit: In cases where op1 is NULL, is it better to simply skip dumping it? 
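One way to read the nit above is that the dump could leave out any operation whose kind is `None`. The following is only a minimal sketch under stated assumptions: `Op`, `Analysis`, `kindName`, `dumpOp` and `dumpAnalysis` are hypothetical, simplified stand-ins (with the assignment held as plain text), not the PR's actual types or its `-fdebug-dump-atomic-analysis` code, and the output layout is illustrative.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical, simplified stand-ins for the analysis types in the PR.
namespace sketch {
enum Kind : int { None = 0, Read = 1, Write = 2, Update = Read | Write };

struct Op {
  int what{None};
  std::string assign; // Textual form of the assignment; empty if absent.
};

struct Analysis {
  std::string atom;
  std::string cond;
  Op op0, op1;
};

inline const char *kindName(int what) {
  switch (what) {
  case Read:
    return "Read";
  case Write:
    return "Write";
  case Update:
    return "Update";
  default:
    return "None";
  }
}

// Print one operation, or nothing at all when the operation is unused.
inline void dumpOp(std::ostream &os, const char *name, const Op &op) {
  if (op.what == None) {
    return;
  }
  os << "  " << name << " {\n"
     << "    what: " << kindName(op.what) << "\n"
     << "    assign: " << op.assign << "\n"
     << "  }\n";
}

// Build the whole dump as a string so it is easy to inspect.
inline std::string dumpAnalysis(const Analysis &an) {
  std::ostringstream os;
  os << "Analysis {\n"
     << "  atom: " << an.atom << "\n"
     << "  cond: " << an.cond << "\n";
  dumpOp(os, "op0", an.op0);
  dumpOp(os, "op1", an.op1);
  os << "}\n";
  return os.str();
}
} // namespace sketch
```

With this shape, an `atomic read` analysis that only fills `op0` would print a single `op0 { ... }` entry. The trade-off is that FileCheck patterns could no longer rely on an `op1` block always being present.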
https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Fri May 30 22:53:17 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 22:53:17 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <683a994d.170a0220.2bfeed.db91@mx.google.com> ================ @@ -1223,6 +1233,155 @@ TYPE_PARSER(sourced(construct(first( TYPE_PARSER(sourced(construct( sourced(Parser{}), Parser{}))) +struct OmpEndDirectiveParser { + using resultType = OmpDirectiveSpecification; + + constexpr OmpEndDirectiveParser(llvm::omp::Directive dir) : dir_(dir) {} + + std::optional Parse(ParseState &state) const { + if ((startOmpLine >> "END"_id).Parse(state)) { + auto &&dirSpec{Parser{}.Parse(state)}; + if (dirSpec && dirSpec->DirId() == dir_) { + return std::move(dirSpec); + } + } + return std::nullopt; + } + +private: + llvm::omp::Directive dir_; +}; + +// Parser for an arbitrary OpenMP ATOMIC construct. +// +// Depending on circumstances, an ATOMIC construct applies to one or more +// following statements. In certain cases when a single statement is +// expected, the end-directive is optional. The specifics depend on both +// the clauses used, and the form of the executable statement. To emit +// more meaningful messages in case of errors, the exact analysis of the +// structure of the construct will be delayed until semantic checks. +// +// The parser will first try the case when the end-directive is present, +// and will parse at most "BodyLimit" (and potentially zero) constructs +// while looking for the end-directive before it gives up. +// Then it will assume that no end-directive is present, and will try to +// parse a single executable construct as the body of the construct. +// +// The limit on the number of constructs is there to reduce the amount of +// unnecessary parsing when the end-directive is absent. 
It's higher than +// the maximum number of statements in any valid construct to accept cases +// when extra statements are present by mistake. ---------------- NimishMishra wrote: Is there a specific reason to allow a `BodyLimit` up to 5? Atomic structured blocks still have at most two statements (in case of capture construct). If we are speculating, wouldn't the end atomic construct (if it exists) occur at most 3 constructs later? https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Fri May 30 22:53:17 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 22:53:17 -0700 (PDT) Subject: [flang-commits] [flang] [llvm] [flang][OpenMP] Overhaul implementation of ATOMIC construct (PR #137852) In-Reply-To: Message-ID: <683a994d.170a0220.3f3ad.a07e@mx.google.com> ================ @@ -2666,422 +2673,1391 @@ void OmpStructureChecker::Leave(const parser::OmpEndBlockDirective &x) { } } -inline void OmpStructureChecker::ErrIfAllocatableVariable( - const parser::Variable &var) { - // Err out if the given symbol has - // ALLOCATABLE attribute - if (const auto *e{GetExpr(context_, var)}) - for (const Symbol &symbol : evaluate::CollectSymbols(*e)) - if (IsAllocatable(symbol)) { - const auto &designator = - std::get>(var.u); - const auto *dataRef = - std::get_if(&designator.value().u); - const parser::Name *name = - dataRef ? std::get_if(&dataRef->u) : nullptr; - if (name) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); +/// parser::Block is a list of executable constructs, parser::BlockConstruct +/// is Fortran's BLOCK/ENDBLOCK construct. +/// Strip the outermost BlockConstructs, return the reference to the Block +/// in the executable part of the innermost of the stripped constructs. +/// Specifically, if the given `block` has a single entry (it's a list), and +/// the entry is a BlockConstruct, get the Block contained within. 
Repeat +/// this step as many times as possible. +static const parser::Block &GetInnermostExecPart(const parser::Block &block) { + const parser::Block *iter{&block}; + while (iter->size() == 1) { + const parser::ExecutionPartConstruct &ep{iter->front()}; + if (auto *exec{std::get_if(&ep.u)}) { + using BlockConstruct = common::Indirection; + if (auto *bc{std::get_if(&exec->u)}) { + iter = &std::get(bc->value().t); + continue; } + } + break; + } + return *iter; } -inline void OmpStructureChecker::ErrIfLHSAndRHSSymbolsMatch( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if the symbol on the LHS is also used on the RHS of the assignment - // statement - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - for (const Symbol &symbol : evaluate::GetSymbolVector(*e)) { - if (varSymbol == symbol) { - const common::Indirection *designator = - std::get_if>(&expr.u); - if (designator) { - auto *z{var.typedExpr.get()}; - auto *c{expr.typedExpr.get()}; - if (z->v == c->v) { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); - } +// There is no consistent way to get the source of a given ActionStmt, so +// extract the source information from Statement when we can, +// and keep it around for error reporting in further analyses. 
+struct SourcedActionStmt { + const parser::ActionStmt *stmt{nullptr}; + parser::CharBlock source; + + operator bool() const { return stmt != nullptr; } +}; + +struct AnalyzedCondStmt { + SomeExpr cond{evaluate::NullPointer{}}; // Default ctor is deleted + parser::CharBlock source; + SourcedActionStmt ift, iff; +}; + +static SourcedActionStmt GetActionStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return SourcedActionStmt{}; + } + if (auto *exec{std::get_if(&x->u)}) { + using ActionStmt = parser::Statement; + if (auto *stmt{std::get_if(&exec->u)}) { + return SourcedActionStmt{&stmt->statement, stmt->source}; + } + } + return SourcedActionStmt{}; +} + +static SourcedActionStmt GetActionStmt(const parser::Block &block) { + if (block.size() == 1) { + return GetActionStmt(&block.front()); + } + return SourcedActionStmt{}; +} + +// Compute the `evaluate::Assignment` from parser::ActionStmt. The assumption +// is that the ActionStmt will be either an assignment or a pointer-assignment, +// otherwise return std::nullopt. +// Note: This function can return std::nullopt on [Pointer]AssignmentStmt where +// the "typedAssignment" is unset. This can happen if there are semantic errors +// in the purported assignment. 
+static std::optional GetEvaluateAssignment( + const parser::ActionStmt *x) { + if (x == nullptr) { + return std::nullopt; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + using TypedAssignment = parser::AssignmentStmt::TypedAssignment; + + return common::visit( + [](auto &&s) -> std::optional { + using BareS = llvm::remove_cvref_t; + if constexpr (std::is_same_v || + std::is_same_v) { + const TypedAssignment &typed{s.value().typedAssignment}; + // ForwardOwningPointer typedAssignment + // `- GenericAssignmentWrapper ^.get() + // `- std::optional ^->v + return typed.get()->v; } else { - context_.Say(expr.source, - "RHS expression on atomic assignment statement cannot access '%s'"_err_en_US, - var.GetSource()); + return std::nullopt; } + }, + x->u); +} + +// Check if the ActionStmt is actually a [Pointer]AssignmentStmt. This is +// to separate cases where the source has something that looks like an +// assignment, but is semantically wrong (diagnosed by general semantic +// checks), and where the source has some other statement (which we want +// to report as "should be an assignment"). +static bool IsAssignment(const parser::ActionStmt *x) { + if (x == nullptr) { + return false; + } + + using AssignmentStmt = common::Indirection; + using PointerAssignmentStmt = + common::Indirection; + + return common::visit( + [](auto &&s) -> bool { + using BareS = llvm::remove_cvref_t; + return std::is_same_v || + std::is_same_v; + }, + x->u); +} + +static std::optional AnalyzeConditionalStmt( + const parser::ExecutionPartConstruct *x) { + if (x == nullptr) { + return std::nullopt; + } + + // Extract the evaluate::Expr from ScalarLogicalExpr. 
+ auto getFromLogical{[](const parser::ScalarLogicalExpr &logical) { + // ScalarLogicalExpr is Scalar>> + const parser::Expr &expr{logical.thing.thing.value()}; + return GetEvaluateExpr(expr); + }}; + + // Recognize either + // ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> IfStmt, or + // ExecutionPartConstruct -> ExecutableConstruct -> IfConstruct. + + if (auto &&action{GetActionStmt(x)}) { + if (auto *ifs{std::get_if>( + &action.stmt->u)}) { + const parser::IfStmt &s{ifs->value()}; + auto &&maybeCond{ + getFromLogical(std::get(s.t))}; + auto &thenStmt{ + std::get>(s.t)}; + if (maybeCond) { + return AnalyzedCondStmt{std::move(*maybeCond), action.source, + SourcedActionStmt{&thenStmt.statement, thenStmt.source}, + SourcedActionStmt{}}; } } + return std::nullopt; } + + if (auto *exec{std::get_if(&x->u)}) { + if (auto *ifc{ + std::get_if>(&exec->u)}) { + using ElseBlock = parser::IfConstruct::ElseBlock; + using ElseIfBlock = parser::IfConstruct::ElseIfBlock; + const parser::IfConstruct &s{ifc->value()}; + + if (!std::get>(s.t).empty()) { + // Not expecting any else-if statements. 
+ return std::nullopt; + } + auto &stmt{std::get>(s.t)}; + auto &&maybeCond{getFromLogical( + std::get(stmt.statement.t))}; + if (!maybeCond) { + return std::nullopt; + } + + if (auto &maybeElse{std::get>(s.t)}) { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), + GetActionStmt(std::get(maybeElse->t))}; + if (result.ift.stmt && result.iff.stmt) { + return result; + } + } else { + AnalyzedCondStmt result{std::move(*maybeCond), stmt.source, + GetActionStmt(std::get(s.t)), SourcedActionStmt{}}; + if (result.ift.stmt) { + return result; + } + } + } + return std::nullopt; + } + + return std::nullopt; } -inline void OmpStructureChecker::ErrIfNonScalarAssignmentStmt( - const parser::Variable &var, const parser::Expr &expr) { - // Err out if either the variable on the LHS or the expression on the RHS of - // the assignment statement are non-scalar (i.e. have rank > 0 or is of - // CHARACTER type) - const auto *e{GetExpr(context_, expr)}; - const auto *v{GetExpr(context_, var)}; - if (e && v) { - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic assignment " - "statement"_err_en_US); - } -} - -template -bool OmpStructureChecker::IsOperatorValid(const T &node, const D &variable) { - using AllowedBinaryOperators = - std::variant; - using BinaryOperators = std::variant; - - if constexpr (common::HasMember) { - const auto &variableName{variable.GetSource().ToString()}; - const auto &exprLeft{std::get<0>(node.t)}; - const auto &exprRight{std::get<1>(node.t)}; - if ((exprLeft.value().source.ToString() != variableName) && - 
(exprRight.value().source.ToString() != variableName)) { - context_.Say(variable.GetSource(), - "Atomic update statement should be of form " - "`%s = %s operator expr` OR `%s = expr operator %s`"_err_en_US, - variableName, variableName, variableName, variableName); - } - return common::HasMember; +static std::pair SplitAssignmentSource( + parser::CharBlock source) { + // Find => in the range, if not found, find = that is not a part of + // <=, >=, ==, or /=. + auto trim{[](std::string_view v) { + const char *begin{v.data()}; + const char *end{begin + v.size()}; + while (*begin == ' ' && begin != end) { + ++begin; + } + while (begin != end && end[-1] == ' ') { + --end; + } + assert(begin != end && "Source should not be empty"); + return parser::CharBlock(begin, end - begin); + }}; + + std::string_view sv(source.begin(), source.size()); + + if (auto where{sv.find("=>")}; where != sv.npos) { + std::string_view lhs(sv.data(), where); + std::string_view rhs(sv.data() + where + 2, sv.size() - where - 2); + return std::make_pair(trim(lhs), trim(rhs)); } - return false; + + // Go backwards, since all the exclusions above end with a '='. + for (size_t next{source.size()}; next > 1; --next) { + if (sv[next - 1] == '=' && !llvm::is_contained("<>=/", sv[next - 2])) { + std::string_view lhs(sv.data(), next - 1); + std::string_view rhs(sv.data() + next, sv.size() - next); + return std::make_pair(trim(lhs), trim(rhs)); + } + } + llvm_unreachable("Could not find assignment operator"); } -void OmpStructureChecker::CheckAtomicCaptureStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - common::visit( - common::visitors{ - [&](const common::Indirection &designator) { - const auto *dataRef = - std::get_if(&designator.value().u); - const auto *name = - dataRef ? 
std::get_if(&dataRef->u) : nullptr; - if (name && IsAllocatable(*name->symbol)) - context_.Say(name->source, - "%s must not have ALLOCATABLE " - "attribute"_err_en_US, - name->ToString()); - }, - [&](const auto &) { - // Anything other than a `parser::Designator` is not allowed - context_.Say(expr.source, - "Expected scalar variable " - "of intrinsic type on RHS of atomic " - "assignment statement"_err_en_US); - }}, - expr.u); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicWriteStmt( - const parser::AssignmentStmt &assignmentStmt) { - const auto &var{std::get(assignmentStmt.t)}; - const auto &expr{std::get(assignmentStmt.t)}; - ErrIfAllocatableVariable(var); - ErrIfLHSAndRHSSymbolsMatch(var, expr); - ErrIfNonScalarAssignmentStmt(var, expr); -} - -void OmpStructureChecker::CheckAtomicUpdateStmt( - const parser::AssignmentStmt &assignment) { - const auto &expr{std::get(assignment.t)}; - const auto &var{std::get(assignment.t)}; - bool isIntrinsicProcedure{false}; - bool isValidOperator{false}; - common::visit( - common::visitors{ - [&](const common::Indirection &x) { - isIntrinsicProcedure = true; - const auto &procedureDesignator{ - std::get(x.value().v.t)}; - const parser::Name *name{ - std::get_if(&procedureDesignator.u)}; - if (name && - !(name->source == "max" || name->source == "min" || - name->source == "iand" || name->source == "ior" || - name->source == "ieor")) { - context_.Say(expr.source, - "Invalid intrinsic procedure name in " - "OpenMP ATOMIC (UPDATE) statement"_err_en_US); - } - }, - [&](const auto &x) { - if (!IsOperatorValid(x, var)) { - context_.Say(expr.source, - "Invalid or missing operator in atomic update " - "statement"_err_en_US); - } else - isValidOperator = true; - }, - }, - expr.u); - if (const auto *e{GetExpr(context_, expr)}) { - const auto *v{GetExpr(context_, var)}; - if (e->Rank() != 0 || - (e->GetType().has_value() && - e->GetType().value().category() == 
common::TypeCategory::Character)) - context_.Say(expr.source, - "Expected scalar expression " - "on the RHS of atomic update assignment " - "statement"_err_en_US); - if (v->Rank() != 0 || - (v->GetType().has_value() && - v->GetType()->category() == common::TypeCategory::Character)) - context_.Say(var.GetSource(), - "Expected scalar variable " - "on the LHS of atomic update assignment " - "statement"_err_en_US); - auto vSyms{evaluate::GetSymbolVector(*v)}; - const Symbol &varSymbol = vSyms.front(); - int numOfSymbolMatches{0}; - SymbolVector exprSymbols{evaluate::GetSymbolVector(*e)}; - for (const Symbol &symbol : exprSymbols) { - if (varSymbol == symbol) { - numOfSymbolMatches++; +namespace atomic { + +struct DesignatorCollector : public evaluate::Traverse, false> { + using Result = std::vector; + using Base = evaluate::Traverse; + DesignatorCollector() : Base(*this) {} + + Result Default() const { return {}; } + + using Base::operator(); + + template // + Result operator()(const evaluate::Designator &x) const { + // Once in a designator, don't traverse it any further (i.e. only + // collect top-level designators). 
+ auto copy{x}; + return Result{AsGenericExpr(std::move(copy))}; + } + + template // + Result Combine(Result &&result, Rs &&...results) const { + Result v(std::move(result)); + auto moveAppend{[](auto &accum, auto &&other) { + for (auto &&s : other) { + accum.push_back(std::move(s)); } + }}; + (moveAppend(v, std::move(results)), ...); + return v; + } +}; + +struct VariableFinder : public evaluate::AnyTraverse { + using Base = evaluate::AnyTraverse; + VariableFinder(const SomeExpr &v) : Base(*this), var(v) {} + + using Base::operator(); + + template + bool operator()(const evaluate::Designator &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + + template + bool operator()(const evaluate::FunctionRef &x) const { + auto copy{x}; + return evaluate::AsGenericExpr(std::move(copy)) == var; + } + +private: + const SomeExpr &var; +}; +} // namespace atomic + +static bool IsAllocatable(const SomeExpr &expr) { + std::vector dsgs{atomic::DesignatorCollector{}(expr)}; + assert(dsgs.size() == 1 && "Should have a single top-level designator"); + evaluate::SymbolVector syms{evaluate::GetSymbolVector(dsgs.front())}; + return !syms.empty() && IsAllocatable(syms.back()); +} + +static bool IsPointerAssignment(const evaluate::Assignment &x) { + return std::holds_alternative(x.u) || + std::holds_alternative(x.u); +} + +static bool IsCheckForAssociated(const SomeExpr &cond) { + return GetTopLevelOperation(cond).first == operation::Operator::Associated; +} + +static bool HasCommonDesignatorSymbols( + const evaluate::SymbolVector &baseSyms, const SomeExpr &other) { + // Compare the designators used in "other" with the designators whose + // symbols are given in baseSyms. + // This is a part of the check if these two expressions can access the same + // storage: if the designators used in them are different enough, then they + // will be assumed not to access the same memory. 
+ // + // Consider an (array element) expression x%y(w%z), the corresponding symbol + // vector will be {x, y, w, z} (i.e. the symbols for these names). + // Check whether this exact sequence appears anywhere in the symbol + // vector for "other". This will be true for x(y) and x(y+1), so this is + // not a sufficient condition, but can be used to eliminate candidates + // before doing more exhaustive checks. + // + // If any of the symbols in this sequence are function names, assume that + // there is no storage overlap, mostly because it would be impossible in + // general to determine what storage the function will access. + // Note: if f is pure, then two calls to f will access the same storage + // when called with the same arguments. This check is not done yet. + + if (llvm::any_of( + baseSyms, [](const SymbolRef &s) { return s->IsSubprogram(); })) { + // If there is a function symbol in the chain then we can't infer much + // about the accessed storage. + return false; + } + + auto isSubsequence{// Is u a subsequence of v. + [](const evaluate::SymbolVector &u, const evaluate::SymbolVector &v) { + size_t us{u.size()}, vs{v.size()}; + if (us > vs) { + return false; + } + for (size_t off{0}; off != vs - us + 1; ++off) { + bool same{true}; + for (size_t i{0}; i != us; ++i) { + if (u[i] != v[off + i]) { + same = false; + break; + } + } + if (same) { + return true; + } + } + return false; + }}; + + evaluate::SymbolVector otherSyms{evaluate::GetSymbolVector(other)}; + return isSubsequence(baseSyms, otherSyms); +} + +static bool HasCommonTopLevelDesignators( + const std::vector &baseDsgs, const SomeExpr &other) { + // Compare designators directly as expressions. This will ensure + // that x(y) and x(y+1) are not flagged as overlapping, whereas + // the symbol vectors for both of these would be identical. 
+ std::vector otherDsgs{atomic::DesignatorCollector{}(other)}; + + for (auto &s : baseDsgs) { + if (llvm::any_of(otherDsgs, [&](auto &&t) { return s == t; })) { + return true; } - if (isIntrinsicProcedure) { - std::string varName = var.GetSource().ToString(); - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Intrinsic procedure" - " arguments in atomic update statement" - " must have exactly one occurence of '%s'"_err_en_US, - varName); - else if (varSymbol != exprSymbols.front() && - varSymbol != exprSymbols.back()) - context_.Say(expr.source, - "Atomic update statement " - "should be of the form `%s = intrinsic_procedure(%s, expr_list)` " - "OR `%s = intrinsic_procedure(expr_list, %s)`"_err_en_US, - varName, varName, varName, varName); - } else if (isValidOperator) { - if (numOfSymbolMatches != 1) - context_.Say(expr.source, - "Exactly one occurence of '%s' " - "expected on the RHS of atomic update assignment statement"_err_en_US, - var.GetSource().ToString()); + } + return false; +} + +static const SomeExpr *HasStorageOverlap( + const SomeExpr &base, llvm::ArrayRef exprs) { + evaluate::SymbolVector baseSyms{evaluate::GetSymbolVector(base)}; + std::vector baseDsgs{atomic::DesignatorCollector{}(base)}; + + for (const SomeExpr &expr : exprs) { + if (!HasCommonDesignatorSymbols(baseSyms, expr)) { + continue; + } + if (HasCommonTopLevelDesignators(baseDsgs, expr)) { + return &expr; } } + return nullptr; +} - ErrIfAllocatableVariable(var); +static bool IsMaybeAtomicWrite(const evaluate::Assignment &assign) { + // This ignores function calls, so it will accept "f(x) = f(x) + 1" + // for example. 
+ return HasStorageOverlap(assign.lhs, assign.rhs) == nullptr; } -void OmpStructureChecker::CheckAtomicCompareConstruct( - const parser::OmpAtomicCompare &atomicCompareConstruct) { +static bool IsSubexpressionOf(const SomeExpr &sub, const SomeExpr &super) { + return atomic::VariableFinder{sub}(super); +} - // TODO: Check that the if-stmt is `if (var == expr) var = new` - // [with or without then/end-do] +static void SetExpr(parser::TypedExpr &expr, MaybeExpr value) { + if (value) { + expr.Reset(new evaluate::GenericExprWrapper(std::move(value)), + evaluate::GenericExprWrapper::Deleter); + } +} - unsigned version{context_.langOptions().OpenMPVersion}; - if (version < 51) { - context_.Say(atomicCompareConstruct.source, - "%s construct not allowed in %s, %s"_err_en_US, - atomicCompareConstruct.source, ThisVersion(version), TryVersion(51)); - } - - // TODO: More work needed here. Some of the Update restrictions need to - // be added, but Update isn't the same either. -} - -// TODO: Allow cond-update-stmt once compare clause is supported. 
-void OmpStructureChecker::CheckAtomicCaptureConstruct( - const parser::OmpAtomicCapture &atomicCaptureConstruct) { - const parser::AssignmentStmt &stmt1 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt1Var{std::get(stmt1.t)}; - const auto &stmt1Expr{std::get(stmt1.t)}; - const auto *v1 = GetExpr(context_, stmt1Var); - const auto *e1 = GetExpr(context_, stmt1Expr); - - const parser::AssignmentStmt &stmt2 = - std::get(atomicCaptureConstruct.t) - .v.statement; - const auto &stmt2Var{std::get(stmt2.t)}; - const auto &stmt2Expr{std::get(stmt2.t)}; - const auto *v2 = GetExpr(context_, stmt2Var); - const auto *e2 = GetExpr(context_, stmt2Expr); - - if (e1 && v1 && e2 && v2) { - if (semantics::checkForSingleVariableOnRHS(stmt1)) { - CheckAtomicCaptureStmt(stmt1); - if (semantics::checkForSymbolMatch(v2, e2)) { - // ATOMIC CAPTURE construct is of the form [capture-stmt, update-stmt] - CheckAtomicUpdateStmt(stmt2); +static void SetAssignment(parser::AssignmentStmt::TypedAssignment &assign, + std::optional value) { + if (value) { + assign.Reset(new evaluate::GenericAssignmentWrapper(std::move(value)), + evaluate::GenericAssignmentWrapper::Deleter); + } +} + +static parser::OpenMPAtomicConstruct::Analysis::Op MakeAtomicAnalysisOp( + int what, + const std::optional &maybeAssign = std::nullopt) { + parser::OpenMPAtomicConstruct::Analysis::Op operation; + operation.what = what; + SetAssignment(operation.assign, maybeAssign); + return operation; +} + +static parser::OpenMPAtomicConstruct::Analysis MakeAtomicAnalysis( + const SomeExpr &atom, const MaybeExpr &cond, + parser::OpenMPAtomicConstruct::Analysis::Op &&op0, + parser::OpenMPAtomicConstruct::Analysis::Op &&op1) { + // Defined in flang/include/flang/Parser/parse-tree.h + // + // struct Analysis { + // struct Kind { + // static constexpr int None = 0; + // static constexpr int Read = 1; + // static constexpr int Write = 2; + // static constexpr int Update = Read | Write; + // static constexpr int 
Action = 3; // Bits containing N, R, W, U + // static constexpr int IfTrue = 4; + // static constexpr int IfFalse = 8; + // static constexpr int Condition = 12; // Bits containing IfTrue, IfFalse + // }; + // struct Op { + // int what; + // TypedAssignment assign; + // }; + // TypedExpr atom, cond; + // Op op0, op1; + // }; + + parser::OpenMPAtomicConstruct::Analysis an; + SetExpr(an.atom, atom); + SetExpr(an.cond, cond); + an.op0 = std::move(op0); + an.op1 = std::move(op1); + return an; +} + +void OmpStructureChecker::CheckStorageOverlap(const SomeExpr &base, + llvm::ArrayRef> exprs, + parser::CharBlock source) { + if (auto *expr{HasStorageOverlap(base, exprs)}) { + context_.Say(source, + "Within atomic operation %s and %s access the same storage"_warn_en_US, + base.AsFortran(), expr->AsFortran()); + } +} + +void OmpStructureChecker::ErrorShouldBeVariable( + const MaybeExpr &expr, parser::CharBlock source) { + if (expr) { + context_.Say(source, "Atomic expression %s should be a variable"_err_en_US, + expr->AsFortran()); + } else { + context_.Say(source, "Atomic expression should be a variable"_err_en_US); + } +} + +/// Check if `expr` satisfies the following conditions for x and v: +/// +/// [6.0:189:10-12] +/// - x and v (as applicable) are either scalar variables or +/// function references with scalar data pointer result of non-character +/// intrinsic type or variables that are non-polymorphic scalar pointers +/// and any length type parameter must be constant. 
+void OmpStructureChecker::CheckAtomicVariable( + const SomeExpr &atom, parser::CharBlock source) { + if (atom.Rank() != 0) { + context_.Say(source, "Atomic variable %s should be a scalar"_err_en_US, + atom.AsFortran()); + } + + if (std::optional dtype{atom.GetType()}) { + if (dtype->category() == TypeCategory::Character) { + context_.Say(source, + "Atomic variable %s cannot have CHARACTER type"_err_en_US, + atom.AsFortran()); + } else if (dtype->IsPolymorphic()) { + context_.Say(source, + "Atomic variable %s cannot have a polymorphic type"_err_en_US, + atom.AsFortran()); + } + // TODO: Check non-constant type parameters for non-character types. + // At the moment there don't seem to be any. + } + + if (IsAllocatable(atom)) { + context_.Say(source, "Atomic variable %s cannot be ALLOCATABLE"_err_en_US, + atom.AsFortran()); + } +} + +std::pair +OmpStructureChecker::CheckUpdateCapture( + const parser::ExecutionPartConstruct *ec1, + const parser::ExecutionPartConstruct *ec2, parser::CharBlock source) { + // Decide which statement is the atomic update and which is the capture. + // + // The two allowed cases are: + // x = ... atomic-var = ... + // ... = x capture-var = atomic-var (with optional converts) + // or + // ... = x capture-var = atomic-var (with optional converts) + // x = ... atomic-var = ... + // + // The case of 'a = b; b = a' is ambiguous, so pick the first one as capture + // (which makes more sense, as it captures the original value of the atomic + // variable). + // + // If the two statements don't fit these criteria, return a pair of default- + // constructed values. 
+ using ReturnTy = std::pair; + + SourcedActionStmt act1{GetActionStmt(ec1)}; + SourcedActionStmt act2{GetActionStmt(ec2)}; + auto maybeAssign1{GetEvaluateAssignment(act1.stmt)}; + auto maybeAssign2{GetEvaluateAssignment(act2.stmt)}; + if (!maybeAssign1 || !maybeAssign2) { + if (!IsAssignment(act1.stmt) || !IsAssignment(act2.stmt)) { + context_.Say(source, + "ATOMIC UPDATE operation with CAPTURE should contain two assignments"_err_en_US); + } + return std::make_pair(nullptr, nullptr); + } + + auto as1{*maybeAssign1}, as2{*maybeAssign2}; + + auto isUpdateCapture{ + [](const evaluate::Assignment &u, const evaluate::Assignment &c) { + return IsSameOrConvertOf(c.rhs, u.lhs); + }}; + + // Do some checks that narrow down the possible choices for the update + // and the capture statements. This will help to emit better diagnostics. + // 1. An assignment could be an update (cbu) if the left-hand side is a + // subexpression of the right-hand side. + // 2. An assignment could be a capture (cbc) if the right-hand side is + // a variable (or a function ref), with potential type conversions. + bool cbu1{IsSubexpressionOf(as1.lhs, as1.rhs)}; // Can as1 be an update? + bool cbu2{IsSubexpressionOf(as2.lhs, as2.rhs)}; // Can as2 be an update? + bool cbc1{IsVarOrFunctionRef(GetConvertInput(as1.rhs))}; // Can 1 be capture? + bool cbc2{IsVarOrFunctionRef(GetConvertInput(as2.rhs))}; // Can 2 be capture? + + // We want to diagnose cases where both assignments cannot be an update, + // or both cannot be a capture, as well as cases where either assignment + // cannot be any of these two. + // + // If we organize these boolean values into a matrix + // |cbu1 cbu2| + // |cbc1 cbc2| + // then we want to diagnose cases where the matrix has a zero (i.e. "false") + // row or column, including the case where everything is zero. All these + // cases correspond to the determinant of the matrix being 0, which suggests + // that checking the det may be a convenient diagnostic check. 
There is only + // one additional case where the det is 0, which is when the matrx is all 1 ---------------- NimishMishra wrote: Nit: "matrx" -> "matrix" https://github.com/llvm/llvm-project/pull/137852 From flang-commits at lists.llvm.org Fri May 30 23:57:28 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Fri, 30 May 2025 23:57:28 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [llvm] Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (PR #136098) In-Reply-To: Message-ID: <683aa858.170a0220.65338.a5bd@mx.google.com> fanju110 wrote: > @fanju110 > > The PR caused some buildbot failures, so I have [reverted](https://github.com/llvm/llvm-project/commit/597340b5b666bdee2887f56c111407b6737cbf34) it. You can see the error [here](https://lab.llvm.org/buildbot/#/builders/203/builds/12089/steps/7/logs/stdio). I'm very sorry for this trouble. I'll check carefully and fix it before submitting again. https://github.com/llvm/llvm-project/pull/136098 From flang-commits at lists.llvm.org Sat May 31 05:17:46 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Sat, 31 May 2025 05:17:46 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix ppc64le-flang-rhel-test buildbot failure (PR #142269) Message-ID: https://github.com/mcinally created https://github.com/llvm/llvm-project/pull/142269 PR#142073 introduced a new test that checks the prefer-vector-width function attribute. This test was not accounting for target triples that include default function attributes. This patch updates prefer-vector-width.f90 to ignore extra function attributes. From 5520865bf718b0d195b02f7a35008c92bb3999a2 Mon Sep 17 00:00:00 2001 From: Cameron McInally Date: Sat, 31 May 2025 05:07:01 -0700 Subject: [PATCH] [flang] Fix ppc64le-flang-rhel-test buildbot failure PR#142073 introduced a new test that checks the prefer-vector-width function attribute.
This test was not accounting for target triples that include default function attributes. This patch updates prefer-vector-width.f90 to ignore extra function attributes. --- flang/test/Driver/prefer-vector-width.f90 | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90 index d0f5fd28db826..0e334f5f3e66e 100644 --- a/flang/test/Driver/prefer-vector-width.f90 +++ b/flang/test/Driver/prefer-vector-width.f90 @@ -9,8 +9,8 @@ subroutine func end subroutine func -! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} } -! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } -! CHECK-128: attributes #0 = { "prefer-vector-width"="128" } -! CHECK-256: attributes #0 = { "prefer-vector-width"="256" } +! CHECK-DEF-NOT: attributes #0 = { {{.*}}"prefer-vector-width"={{.*}} } +! CHECK-NONE: attributes #0 = { {{.*}}"prefer-vector-width"="none"{{.*}} } +! CHECK-128: attributes #0 = { {{.*}}"prefer-vector-width"="128"{{.*}} } +! CHECK-256: attributes #0 = { {{.*}}"prefer-vector-width"="256"{{.*}} } ! CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx' From flang-commits at lists.llvm.org Sat May 31 05:18:21 2025 From: flang-commits at lists.llvm.org (via flang-commits) Date: Sat, 31 May 2025 05:18:21 -0700 (PDT) Subject: [flang-commits] [flang] [flang] Fix ppc64le-flang-rhel-test buildbot failure (PR #142269) In-Reply-To: Message-ID: <683af38d.a70a0220.33a52b.c6c7@mx.google.com> llvmbot wrote: @llvm/pr-subscribers-flang-driver Author: Cameron McInally (mcinally)
Changes PR#142073 introduced a new test that checks the prefer-vector-width function attribute. This test was not accounting for target triples that include default function attributes. This patch updates prefer-vector-width.f90 to ignore extra function attributes. --- Full diff: https://github.com/llvm/llvm-project/pull/142269.diff 1 Files Affected: - (modified) flang/test/Driver/prefer-vector-width.f90 (+4-4) ``````````diff diff --git a/flang/test/Driver/prefer-vector-width.f90 b/flang/test/Driver/prefer-vector-width.f90 index d0f5fd28db826..0e334f5f3e66e 100644 --- a/flang/test/Driver/prefer-vector-width.f90 +++ b/flang/test/Driver/prefer-vector-width.f90 @@ -9,8 +9,8 @@ subroutine func end subroutine func -! CHECK-DEF-NOT: attributes #0 = { "prefer-vector-width"={{.*}} } -! CHECK-NONE: attributes #0 = { "prefer-vector-width"="none" } -! CHECK-128: attributes #0 = { "prefer-vector-width"="128" } -! CHECK-256: attributes #0 = { "prefer-vector-width"="256" } +! CHECK-DEF-NOT: attributes #0 = { {{.*}}"prefer-vector-width"={{.*}} } +! CHECK-NONE: attributes #0 = { {{.*}}"prefer-vector-width"="none"{{.*}} } +! CHECK-128: attributes #0 = { {{.*}}"prefer-vector-width"="128"{{.*}} } +! CHECK-256: attributes #0 = { {{.*}}"prefer-vector-width"="256"{{.*}} } ! CHECK-INVALID:error: invalid value 'xxx' in '-mprefer-vector-width=xxx' ``````````
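As a rough illustration of why the loosened patterns help (this is not part of the patch): the `{{...}}` blocks in a FileCheck directive are regular expressions, so the change can be approximated with `std::regex`. The helper `lineMatches`, the sample attribute lines, and the `"hypothetical-default"` attribute below are all invented for this sketch and are not actual compiler output.

```cpp
#include <regex>
#include <string>

// Rough stand-in for a FileCheck directive: the pattern may match anywhere in
// the line, and the regex plays the role of the {{...}} blocks.
// (Hypothetical helper, not an LLVM API.)
inline bool lineMatches(const std::string &line, const std::string &pattern) {
  return std::regex_search(line, std::regex(pattern));
}

// Invented sample lines. The second one carries an extra attribute, standing
// in for a default function attribute that some target triples add.
const std::string plainLine =
    "attributes #0 = { \"prefer-vector-width\"=\"128\" }";
const std::string lineWithDefaults =
    "attributes #0 = { \"hypothetical-default\"=\"x\" "
    "\"prefer-vector-width\"=\"128\" }";

// Old CHECK-128: nothing may sit between the braces and the attribute.
const std::string oldCheck =
    R"(attributes #0 = \{ "prefer-vector-width"="128" \})";
// New CHECK-128: `.*` on both sides, mirroring the {{.*}} added by the patch.
const std::string newCheck =
    R"(attributes #0 = \{ .*"prefer-vector-width"="128".* \})";
```

Under these assumptions, `lineMatches(lineWithDefaults, oldCheck)` is false while `lineMatches(lineWithDefaults, newCheck)` is true, and both patterns still accept `plainLine`. Real FileCheck also canonicalizes whitespace, which this sketch ignores.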
https://github.com/llvm/llvm-project/pull/142269 From flang-commits at lists.llvm.org Sat May 31 05:18:56 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Sat, 31 May 2025 05:18:56 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mprefer-vector-width= (PR #142073) In-Reply-To: Message-ID: <683af3b0.170a0220.330f38.f329@mx.google.com> mcinally wrote: PR #142269 for PPC buildbot fix. https://github.com/llvm/llvm-project/pull/142073 From flang-commits at lists.llvm.org Sat May 31 05:30:01 2025 From: flang-commits at lists.llvm.org (Cameron McInally via flang-commits) Date: Sat, 31 May 2025 05:30:01 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mrecip[=] (PR #142172) In-Reply-To: Message-ID: <683af649.170a0220.38a0ea.f279@mx.google.com> mcinally wrote: > LGTM, though, I think we'd better reuse the code from `Clang.cpp`. `flangFrontend` already depends on `clangDriver`, so we just need to export `ParseMRecip` and `getRefinementStep` from `clangDriver` (and probably replace their `Driver` argument with a `DiagnosticEngine` argument, so that it works for both clang and flang). This seems reasonable. I'll look into whether it's possible to do. > (and probably replace their `Driver` argument with a `DiagnosticEngine` argument, so that it works for both clang and flang) It will be interesting to see if Clang allows us to change this without changing it everywhere. Changing it everywhere seems like a heavy lift. 
https://github.com/llvm/llvm-project/pull/142172 From flang-commits at lists.llvm.org Sat May 31 05:31:17 2025 From: flang-commits at lists.llvm.org (Paul Osmialowski via flang-commits) Date: Sat, 31 May 2025 05:31:17 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page and html support for Flang (PR #141882) In-Reply-To: Message-ID: <683af695.050a0220.7e110.d323@mx.google.com> pawosm-arm wrote: Not sure what is the ultimate scope of this PR, the flang manpage is indeed being generated. There are some observations though, but I'm not sure if this is the PR which should address them. Namely: - At the very beginning we can see this: ``` FLANG |VERSION| (IN-PROGRESS) RELEASE NOTES warning ``` - There are some mentions of the problems with building on Ubuntu 18. This is a slightly old Linux distribution these days, I doubt many care about those issues, and it doesn't feel like a manpage is a good place for going into the details of them - The man page is way too big, way too extensive and way too detailed. It feels like entire Flang documentation went into it. Was this intentional? I like the fact that this page finally generates easily with no need for extensive CMake configuring. https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Sat May 31 05:33:42 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Sat, 31 May 2025 05:33:42 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page and html support for Flang (PR #141882) In-Reply-To: Message-ID: <683af726.050a0220.189303.c597@mx.google.com> snarang181 wrote: > * The man page is way too big, way too extensive and way too detailed. It feels like entire Flang documentation went into it. Was this intentional? I think the man page is intended to be that way since all the documentation is built into `index.md`. If we wanted more separate man pages, we can add more config entries but that did not seem to be the intended use case.
https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Sat May 31 05:36:23 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Sat, 31 May 2025 05:36:23 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page and html support for Flang (PR #141882) In-Reply-To: Message-ID: <683af7c7.170a0220.b373.ae45@mx.google.com> snarang181 wrote: > Not sure what is the ultimate scope of this PR, the flang manpage is indeed being generated. There are some observations though, but I'm not sure if this is the PR which should address them. Namely: I agree that there might be some more things to iron out but I think this patch has already touched quite a bit of CMake config and conf.py. It might be suitable to create another issue and address the issues in another PR as it is out of scope for the issue this is linked to. https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Sat May 31 05:37:32 2025 From: flang-commits at lists.llvm.org (Paul Osmialowski via flang-commits) Date: Sat, 31 May 2025 05:37:32 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page and html support for Flang (PR #141882) In-Reply-To: Message-ID: <683af80c.170a0220.13035a.aa83@mx.google.com> pawosm-arm wrote: > > * The man page is way too big, way too extensive and way too detailed. It feels like entire Flang documentation went into it. Was this intentional? > > I think the man page is intended to be that way since all the documentation is built into `index.md`. If we wanted more separate man pages, we can add more config entries but that did not seem to be the intended use case. What I meant is this seems unusual.
Compare these: ``` $ ls -la *lang* -rw-r--r-- 1 user01 users 28437 May 31 10:26 clang.1 -rw-r--r-- 1 user01 users 1450 May 31 10:20 clang-tblgen.1 -rw-r--r-- 1 user01 users 1199536 May 31 10:26 flang.1 ``` https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Sat May 31 05:40:21 2025 From: flang-commits at lists.llvm.org (Samarth Narang via flang-commits) Date: Sat, 31 May 2025 05:40:21 -0700 (PDT) Subject: [flang-commits] [flang] [Flang] Add Sphinx man page and html support for Flang (PR #141882) In-Reply-To: Message-ID: <683af8b5.170a0220.111914.b167@mx.google.com> snarang181 wrote: > > > * The man page is way too big, way too extensive and way too detailed. It feels like entire Flang documentation went into it. Was this intentional? > > > > > > I think the man page is intended to be that way since all the documentation is built into `index.md`. If we wanted more separate man pages, we can add more config entries but that did not seem to be the intended use case. > > What I meant is this seems unusual. Compare these: > > ``` > $ ls -la *lang* > -rw-r--r-- 1 user01 users 28437 May 31 10:26 clang.1 > -rw-r--r-- 1 user01 users 1450 May 31 10:20 clang-tblgen.1 > -rw-r--r-- 1 user01 users 1199536 May 31 10:26 flang.1 > ``` Ah ok, I see your point. I will defer this to someone who knows the expected size or if it really is meant to be that big. https://github.com/llvm/llvm-project/pull/141882 From flang-commits at lists.llvm.org Sat May 31 11:42:39 2025 From: flang-commits at lists.llvm.org (Fangrui Song via flang-commits) Date: Sat, 31 May 2025 11:42:39 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <683b4d9f.050a0220.175b01.d5a7@mx.google.com> https://github.com/MaskRay approved this pull request.
https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Sat May 31 11:42:40 2025 From: flang-commits at lists.llvm.org (Fangrui Song via flang-commits) Date: Sat, 31 May 2025 11:42:40 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <683b4da0.630a0220.383d91.7e55@mx.google.com> ================ @@ -614,3 +614,30 @@ nvfortran defines `-fast` as - `-Mcache_align`: there is no equivalent flag in Flang or Clang. - `-Mflushz`: flush-to-zero mode - when `-ffast-math` is specified, Flang will link to `crtfastmath.o` to ensure denormal numbers are flushed to zero. + + +## FCC_OVERRIDE_OPTIONS + +The environment variable `FCC_OVERRIDE_OPTIONS` can be used to edit flang's +command line arguments. The value of this variable is a space-separated list of +edits to perform. The edits are applied in the order in which they appear in +`FCC_OVERRIDE_OPTIONS`. Each edit should be one of the following forms: + +- `#`: Silence information about the changes to the command line arguments. + +- `^FOO`: Add `FOO` as a new argument at the beginning of the command line. ---------------- MaskRay wrote: For `clang a.cc`, this actually adds `FOO` after `clang`. The comment in clang/lib/Driver/Driver.cpp and this description probably should be made more accurate. https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Sat May 31 11:42:40 2025 From: flang-commits at lists.llvm.org (Fangrui Song via flang-commits) Date: Sat, 31 May 2025 11:42:40 -0700 (PDT) Subject: [flang-commits] [clang] [flang] [flang][driver] Introduce FCC_OVERRIDE_OPTIONS. (PR #140556) In-Reply-To: Message-ID: <683b4da0.620a0220.1276a3.ca80@mx.google.com> ================ @@ -614,3 +614,30 @@ nvfortran defines `-fast` as - `-Mcache_align`: there is no equivalent flag in Flang or Clang. 
- `-Mflushz`: flush-to-zero mode - when `-ffast-math` is specified, Flang will link to `crtfastmath.o` to ensure denormal numbers are flushed to zero. + + +## FCC_OVERRIDE_OPTIONS ---------------- MaskRay wrote: I noticed that clang/docs doesn't have the documentation. Perhaps we should document `clang/docs/UsersManual.rst` as well and reference it from FlangDriver.md? (`"Each edit should be one of the following forms:" list ` should probably reference the clang doc) https://github.com/llvm/llvm-project/pull/140556 From flang-commits at lists.llvm.org Sat May 31 13:17:36 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:36 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e0.170a0220.2567c9.062b@mx.google.com> https://github.com/klausler requested changes to this pull request. https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:37 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e1.170a0220.28504e.024e@mx.google.com> ================ @@ -107,44 +114,138 @@ static std::optional GetWarningChar(char ch) { } } -static bool WarningNameMatch(const char *a, const char *b) { +// Check for case and punctuation insensitive string equality. +// NB, b is probably not null terminated, so don't treat is like a C string. +static bool InsensitiveWarningNameMatch( + std::string_view a, std::string_view b) { + size_t j{0}, aSize{a.size()}, k{0}, bSize{b.size()}; while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); + optional ach{nullopt}; ---------------- klausler wrote: It's `std::optional` outside of the runtime. 
You don't need to initialize a `std::optional<>` to `std::nullopt` -- it's implied. https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:37 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e1.170a0220.386e1.090c@mx.google.com> ================ @@ -62,11 +63,44 @@ constexpr std::array EnumNames(const char *p) { enum class NAME { __VA_ARGS__ }; \ [[maybe_unused]] static constexpr std::size_t NAME##_enumSize{ \ ::Fortran::common::CountEnumNames(#__VA_ARGS__)}; \ + [[maybe_unused]] static constexpr std::array NAME##_names{ \ + ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ [[maybe_unused]] static inline std::string_view EnumToString(NAME e) { \ - static const constexpr auto names{ \ - ::Fortran::common::EnumNames(#__VA_ARGS__)}; \ - return names[static_cast(e)]; \ + return NAME##_names[static_cast(e)]; \ } +namespace EnumClass { ---------------- klausler wrote: namespaces are lower-case in our C++17 style https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:37 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:37 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e1.170a0220.e7aaa.c844@mx.google.com> ================ @@ -238,6 +239,12 @@ using std::nullopt_t; using std::optional; #endif // !STD_OPTIONAL_UNSUPPORTED +template +std::optional inline MapOption( ---------------- klausler wrote: `MapOptional` might be a better name https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:38 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:38 
-0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e2.050a0220.1e3100.d0ac@mx.google.com> ================ @@ -8,7 +8,9 @@ #include "flang/Support/Fortran-features.h" #include "flang/Common/idioms.h" +#include "flang/Common/optional.h" ---------------- klausler wrote: You can just use `<optional>` unless you're in the runtime. https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:38 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e2.170a0220.1a4710.fd05@mx.google.com> ================ @@ -94,6 +96,11 @@ LanguageFeatureControl::LanguageFeatureControl() { warnLanguage_.set(LanguageFeature::NullActualForAllocatable); } +// Namespace for helper functions for parsing CLI options +// used instead of static so that there can be unit tests for these +// functions. +namespace FortranFeaturesHelpers { ---------------- klausler wrote: namespaces are lower case in our C++ style https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:38 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e2.a70a0220.3ab12c.dd3c@mx.google.com> ================ @@ -107,44 +114,138 @@ static std::optional GetWarningChar(char ch) { } } -static bool WarningNameMatch(const char *a, const char *b) { +// Check for case and punctuation insensitive string equality. +// NB, b is probably not null terminated, so don't treat is like a C string. 
+static bool InsensitiveWarningNameMatch( + std::string_view a, std::string_view b) { + size_t j{0}, aSize{a.size()}, k{0}, bSize{b.size()}; while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); + optional ach{nullopt}; + while (!ach && j < aSize) { + ach = GetWarningChar(a[j++]); } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); + optional bch{}; + while (!bch && k < bSize) { + bch = GetWarningChar(b[k++]); } if (!ach && !bch) { return true; } else if (!ach || !bch || *ach != *bch) { return false; } - ++a, ++b; + ach = bch = nullopt; ---------------- klausler wrote: `ach.reset(), bch.reset();` https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:38 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e2.170a0220.a679b.bd2e@mx.google.com> ================ @@ -107,44 +114,138 @@ static std::optional GetWarningChar(char ch) { } } -static bool WarningNameMatch(const char *a, const char *b) { +// Check for case and punctuation insensitive string equality. +// NB, b is probably not null terminated, so don't treat is like a C string. 
+static bool InsensitiveWarningNameMatch( + std::string_view a, std::string_view b) { + size_t j{0}, aSize{a.size()}, k{0}, bSize{b.size()}; while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); + optional ach{nullopt}; + while (!ach && j < aSize) { + ach = GetWarningChar(a[j++]); } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); + optional bch{}; + while (!bch && k < bSize) { + bch = GetWarningChar(b[k++]); } if (!ach && !bch) { return true; } else if (!ach || !bch || *ach != *bch) { return false; } - ++a, ++b; + ach = bch = nullopt; } } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Check if lower case hyphenated words are equal to camel case words. +// Because of out use case we know that 'r' the camel case string is +// well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. +// This is checked in the enum-class.h file. +static bool SensitiveWarningNameMatch(llvm::StringRef l, llvm::StringRef r) { + size_t ls{l.size()}, rs{r.size()}; + if (ls < rs) { + return false; + } + bool atStartOfWord{true}; + size_t wordCount{0}, j{0}; // j is the number of word characters checked in r. + for (; j < rs; j++) { + if (wordCount + j >= ls) { + // `l` was shorter once the hiphens were removed. + // If r is null terminated, then we are good. 
+ return r[j] == '\0'; + } + if (atStartOfWord) { + if (llvm::isUpper(r[j])) { + // Upper Case Run + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else { + atStartOfWord = false; + if (l[wordCount + j] != r[j]) { + return false; + } } + } else { + if (llvm::isUpper(r[j])) { + atStartOfWord = true; + if (l[wordCount + j] != '-') { + return false; + } + ++wordCount; + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else if (l[wordCount + j] != r[j]) { + return false; + } + } + } + // If there are more characters in l after processing all the characters in r. + // then fail unless the string is null terminated. + if (ls > wordCount + j) { + return l[wordCount + j] == '\0'; + } + return true; +} + +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). +template +optional> ParseCLIEnum(llvm::StringRef input, + EnumClass::FindIndexType findIndex, bool insensitive) { + bool negated{false}; + EnumClass::Predicate predicate; + if (insensitive) { + if (input.starts_with_insensitive("no")) { + negated = true; + input = input.drop_front(2); + } + predicate = [input](std::string_view r) { + return InsensitiveWarningNameMatch(input, r); + }; + } else { + if (input.starts_with("no-")) { + negated = true; + input = input.drop_front(3); } + predicate = [input](std::string_view r) { + return SensitiveWarningNameMatch(input, r); + }; } - return std::nullopt; + optional x = EnumClass::Find(predicate, findIndex); + return MapOption>( + x, [negated](T x) { return std::pair{!negated, x}; }); } -std::optional FindLanguageFeature(const char *name) { - return ScanEnum(name); +optional> parseCLIUsageWarning( + llvm::StringRef input, bool insensitive) { + return ParseCLIEnum(input, FindUsageWarningIndex, insensitive); } -std::optional FindUsageWarning(const char *name) { - return ScanEnum(name); +optional> parseCLILanguageFeature( + llvm::StringRef input, bool insensitive) { + return 
ParseCLIEnum( + input, FindLanguageFeatureIndex, insensitive); +} + +} // namespace FortranFeaturesHelpers + +// Take a string from the CLI and apply it to the LanguageFeatureControl. +// Return true if the option was applied recognized. +bool LanguageFeatureControl::applyCLIOption( + std::string_view input, bool insensitive) { + llvm::StringRef inputRef{input}; + if (auto result = FortranFeaturesHelpers::parseCLILanguageFeature( ---------------- klausler wrote: braced initialization only, please https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:38 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e2.050a0220.2d672f.d9fa@mx.google.com> ================ @@ -107,44 +114,138 @@ static std::optional GetWarningChar(char ch) { } } -static bool WarningNameMatch(const char *a, const char *b) { +// Check for case and punctuation insensitive string equality. +// NB, b is probably not null terminated, so don't treat is like a C string. 
+static bool InsensitiveWarningNameMatch( + std::string_view a, std::string_view b) { + size_t j{0}, aSize{a.size()}, k{0}, bSize{b.size()}; while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); + optional ach{nullopt}; + while (!ach && j < aSize) { + ach = GetWarningChar(a[j++]); } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); + optional bch{}; + while (!bch && k < bSize) { + bch = GetWarningChar(b[k++]); } if (!ach && !bch) { return true; } else if (!ach || !bch || *ach != *bch) { return false; } - ++a, ++b; + ach = bch = nullopt; } } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Check if lower case hyphenated words are equal to camel case words. +// Because of out use case we know that 'r' the camel case string is +// well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. +// This is checked in the enum-class.h file. +static bool SensitiveWarningNameMatch(llvm::StringRef l, llvm::StringRef r) { + size_t ls{l.size()}, rs{r.size()}; + if (ls < rs) { + return false; + } + bool atStartOfWord{true}; + size_t wordCount{0}, j{0}; // j is the number of word characters checked in r. + for (; j < rs; j++) { + if (wordCount + j >= ls) { + // `l` was shorter once the hiphens were removed. 
---------------- klausler wrote: "hyphens" https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:38 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e2.170a0220.23238a.bb09@mx.google.com> ================ @@ -107,44 +114,138 @@ static std::optional GetWarningChar(char ch) { } } -static bool WarningNameMatch(const char *a, const char *b) { +// Check for case and punctuation insensitive string equality. +// NB, b is probably not null terminated, so don't treat is like a C string. +static bool InsensitiveWarningNameMatch( + std::string_view a, std::string_view b) { + size_t j{0}, aSize{a.size()}, k{0}, bSize{b.size()}; while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); + optional ach{nullopt}; + while (!ach && j < aSize) { + ach = GetWarningChar(a[j++]); } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); + optional bch{}; + while (!bch && k < bSize) { + bch = GetWarningChar(b[k++]); } if (!ach && !bch) { return true; } else if (!ach || !bch || *ach != *bch) { return false; } - ++a, ++b; + ach = bch = nullopt; } } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Check if lower case hyphenated words are equal to camel case words. +// Because of out use case we know that 'r' the camel case string is +// well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. +// This is checked in the enum-class.h file. 
+static bool SensitiveWarningNameMatch(llvm::StringRef l, llvm::StringRef r) { ---------------- klausler wrote: Would this algorithm be more clear if you essentially converted the optional name to CamelCase by using hyphens to indicate capitalization of the following letter? https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:38 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e2.170a0220.2b8458.0102@mx.google.com> ================ @@ -107,44 +114,138 @@ static std::optional GetWarningChar(char ch) { } } -static bool WarningNameMatch(const char *a, const char *b) { +// Check for case and punctuation insensitive string equality. +// NB, b is probably not null terminated, so don't treat is like a C string. +static bool InsensitiveWarningNameMatch( + std::string_view a, std::string_view b) { + size_t j{0}, aSize{a.size()}, k{0}, bSize{b.size()}; while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); + optional ach{nullopt}; + while (!ach && j < aSize) { + ach = GetWarningChar(a[j++]); } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); + optional bch{}; + while (!bch && k < bSize) { + bch = GetWarningChar(b[k++]); } if (!ach && !bch) { return true; } else if (!ach || !bch || *ach != *bch) { return false; } - ++a, ++b; + ach = bch = nullopt; } } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Check if lower case hyphenated words are equal to camel case words. 
+// Because of out use case we know that 'r' the camel case string is +// well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. +// This is checked in the enum-class.h file. +static bool SensitiveWarningNameMatch(llvm::StringRef l, llvm::StringRef r) { + size_t ls{l.size()}, rs{r.size()}; + if (ls < rs) { + return false; + } + bool atStartOfWord{true}; + size_t wordCount{0}, j{0}; // j is the number of word characters checked in r. + for (; j < rs; j++) { + if (wordCount + j >= ls) { + // `l` was shorter once the hiphens were removed. + // If r is null terminated, then we are good. + return r[j] == '\0'; + } + if (atStartOfWord) { + if (llvm::isUpper(r[j])) { + // Upper Case Run + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else { + atStartOfWord = false; + if (l[wordCount + j] != r[j]) { + return false; + } } + } else { + if (llvm::isUpper(r[j])) { + atStartOfWord = true; + if (l[wordCount + j] != '-') { + return false; + } + ++wordCount; + if (l[wordCount + j] != llvm::toLower(r[j])) { + return false; + } + } else if (l[wordCount + j] != r[j]) { + return false; + } + } + } + // If there are more characters in l after processing all the characters in r. + // then fail unless the string is null terminated. + if (ls > wordCount + j) { + return l[wordCount + j] == '\0'; + } + return true; +} + +// Parse a CLI enum option return the enum index and whether it should be +// enabled (true) or disabled (false). 
+template +optional> ParseCLIEnum(llvm::StringRef input, + EnumClass::FindIndexType findIndex, bool insensitive) { + bool negated{false}; + EnumClass::Predicate predicate; + if (insensitive) { + if (input.starts_with_insensitive("no")) { + negated = true; + input = input.drop_front(2); + } + predicate = [input](std::string_view r) { + return InsensitiveWarningNameMatch(input, r); + }; + } else { + if (input.starts_with("no-")) { + negated = true; + input = input.drop_front(3); } + predicate = [input](std::string_view r) { + return SensitiveWarningNameMatch(input, r); + }; } - return std::nullopt; + optional x = EnumClass::Find(predicate, findIndex); + return MapOption>( + x, [negated](T x) { return std::pair{!negated, x}; }); } -std::optional FindLanguageFeature(const char *name) { - return ScanEnum(name); +optional> parseCLIUsageWarning( + llvm::StringRef input, bool insensitive) { + return ParseCLIEnum(input, FindUsageWarningIndex, insensitive); } -std::optional FindUsageWarning(const char *name) { - return ScanEnum(name); +optional> parseCLILanguageFeature( + llvm::StringRef input, bool insensitive) { + return ParseCLIEnum( + input, FindLanguageFeatureIndex, insensitive); +} + +} // namespace FortranFeaturesHelpers + +// Take a string from the CLI and apply it to the LanguageFeatureControl. +// Return true if the option was applied recognized. 
---------------- klausler wrote: "if the option was recognized" https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:38 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e2.170a0220.37110c.c102@mx.google.com> ================ @@ -107,44 +114,138 @@ static std::optional GetWarningChar(char ch) { } } -static bool WarningNameMatch(const char *a, const char *b) { +// Check for case and punctuation insensitive string equality. +// NB, b is probably not null terminated, so don't treat is like a C string. +static bool InsensitiveWarningNameMatch( + std::string_view a, std::string_view b) { + size_t j{0}, aSize{a.size()}, k{0}, bSize{b.size()}; while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); + optional ach{nullopt}; + while (!ach && j < aSize) { + ach = GetWarningChar(a[j++]); } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); + optional bch{}; + while (!bch && k < bSize) { + bch = GetWarningChar(b[k++]); } if (!ach && !bch) { return true; } else if (!ach || !bch || *ach != *bch) { return false; } - ++a, ++b; + ach = bch = nullopt; } } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Check if lower case hyphenated words are equal to camel case words. +// Because of out use case we know that 'r' the camel case string is +// well formed in the sense that it is a sequence [a-zA-Z]+[a-zA-Z0-9]*. ---------------- klausler wrote: Is there an enumerator that starts with a lower-case letter? 
https://github.com/llvm/llvm-project/pull/142022 From flang-commits at lists.llvm.org Sat May 31 13:17:38 2025 From: flang-commits at lists.llvm.org (Peter Klausler via flang-commits) Date: Sat, 31 May 2025 13:17:38 -0700 (PDT) Subject: [flang-commits] [flang] [flang][cli] Add diagnostic flags to the CLI (PR #142022) In-Reply-To: Message-ID: <683b63e2.170a0220.128232.bf32@mx.google.com> ================ @@ -107,44 +114,138 @@ static std::optional GetWarningChar(char ch) { } } -static bool WarningNameMatch(const char *a, const char *b) { +// Check for case and punctuation insensitive string equality. +// NB, b is probably not null terminated, so don't treat is like a C string. +static bool InsensitiveWarningNameMatch( + std::string_view a, std::string_view b) { + size_t j{0}, aSize{a.size()}, k{0}, bSize{b.size()}; while (true) { - auto ach{GetWarningChar(*a)}; - while (!ach && *a) { - ach = GetWarningChar(*++a); + optional ach{nullopt}; + while (!ach && j < aSize) { + ach = GetWarningChar(a[j++]); } - auto bch{GetWarningChar(*b)}; - while (!bch && *b) { - bch = GetWarningChar(*++b); + optional bch{}; + while (!bch && k < bSize) { + bch = GetWarningChar(b[k++]); } if (!ach && !bch) { return true; } else if (!ach || !bch || *ach != *bch) { return false; } - ++a, ++b; + ach = bch = nullopt; } } -template -std::optional ScanEnum(const char *name) { - if (name) { - for (std::size_t j{0}; j < N; ++j) { - auto feature{static_cast(j)}; - if (WarningNameMatch(name, EnumToString(feature).data())) { - return feature; +// Check if lower case hyphenated words are equal to camel case words. 
+// Because of out use case we know that 'r' the camel case string is
----------------
klausler wrote:

"our"

https://github.com/llvm/llvm-project/pull/142022

From flang-commits at lists.llvm.org Sat May 31 18:04:46 2025
From: flang-commits at lists.llvm.org (Tarun Prabhu via flang-commits)
Date: Sat, 31 May 2025 18:04:46 -0700 (PDT)
Subject: [flang-commits] [clang] [flang] [mlir] [flang] Add support for -mrecip[=] (PR #142172)
In-Reply-To:
Message-ID: <683ba72e.050a0220.3410ba.d9fe@mx.google.com>

tarunprabhu wrote:

> > LGTM, though, I think we'd better reuse the code from `Clang.cpp`. `flangFrontend` already depends on `clangDriver`, so we just need to export `ParseMRecip` and `getRefinementStep` from `clangDriver` (and probably replace their `Driver` argument with a `DiagnosticEngine` argument, so that it works for both clang and flang).
> > This seems reasonable. I'll look into whether it's possible to do.
> > > (and probably replace their `Driver` argument with a `DiagnosticEngine` argument, so that it works for both clang and flang)
> > It will be interesting to see if Clang allows us to change this without changing it everywhere. Changing it everywhere seems like a heavy lift.

If this option should be handled exactly the way it is in clang, the approach we have been using is to share the code between the two by copying it into `clang/lib/Driver/ToolChains/CommonArgs.cpp`. See for instance, [this](https://github.com/llvm/llvm-project/commit/8ea2b417419344182053c0726cfff184d7917498)

https://github.com/llvm/llvm-project/pull/142172
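
Editorial note on the PR #142022 review above: klausler asked whether `SensitiveWarningNameMatch` would be clearer if the option name were first converted to CamelCase, with each hyphen marking capitalization of the following letter. A minimal sketch of that idea follows. It is not code from the PR: `HyphenatedToCamelCase` and `MatchesEnumerator` are hypothetical names, and the sketch assumes the enumerator is plain CamelCase with no runs of capitals.

```cpp
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the PR): convert a lower-case hyphenated
// option name into CamelCase, treating each hyphen as "capitalize the next
// character". Returns nullopt for malformed input: empty names, leading,
// trailing, or doubled hyphens, or upper-case letters in the option.
std::optional<std::string> HyphenatedToCamelCase(std::string_view name) {
  std::string result;
  result.reserve(name.size());
  bool capitalizeNext{true}; // the first character starts a word
  for (char ch : name) {
    auto uch{static_cast<unsigned char>(ch)};
    if (ch == '-') {
      if (capitalizeNext) {
        return std::nullopt; // leading or doubled hyphen
      }
      capitalizeNext = true;
    } else if (std::isupper(uch)) {
      return std::nullopt; // the case-sensitive spelling is lower case only
    } else if (capitalizeNext) {
      result.push_back(static_cast<char>(std::toupper(uch)));
      capitalizeNext = false;
    } else {
      result.push_back(ch);
    }
  }
  if (capitalizeNext) {
    return std::nullopt; // empty input or trailing hyphen
  }
  return result;
}

// Hypothetical wrapper: after conversion, the match is plain string equality.
bool MatchesEnumerator(std::string_view option, std::string_view enumerator) {
  auto camel{HyphenatedToCamelCase(option)};
  return camel && std::string_view{*camel} == enumerator;
}
```

With this shape, an option spelled `backslash-escapes` would compare equal to an enumerator named `BackslashEscapes` (used here only as an illustration). The sketch deliberately leaves out the "Upper Case Run" case that the patch handles: an enumerator containing consecutive capitals would need extra treatment, for example lower-casing such runs in the enumerator before comparing.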